Commit Graph

178 Commits

Author SHA1 Message Date
9132734a35 Use Dr.CI GitHub checkrun summary when querying its API fails (#111628)
This will allow internal SandCastle job to access Dr.CI classification results via GitHub checkrun summary and correctly ignore unrelated failures.

### Testing

Adding `TestBypassFailuresOnSandCastle` where Dr.CI API returns nothing.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111628
Approved by: https://github.com/clee2000
2023-10-24 01:32:30 +00:00
4ec777e9a5 [BE] Clean up trymerge code handling broken trunk failures (#111520)
This is the final part of https://github.com/pytorch/pytorch/pull/110054.  The broken trunk classification has been done on Dr.CI, so we can just check for that in trymerge for consistency when ghstack is used.

* [x] https://github.com/pytorch/pytorch/pull/110054
* [x] https://github.com/pytorch/pytorch/pull/110133
* [x] This PR to clean up the broken trunk logic.

One important change is that `get_classifications` doesn't need to query the jobs from Rockset for the head and merge base SHA anymore, saving a query there.  The function looks a lot simpler now.

### Testing

https://github.com/pytorch/pytorch/pull/111253 had 1 broken trunk failure as detected by Dr.CI from the base commit 3eb5cae3af (valid) while trymerge didn't detect that because ghstack base commit be8e517174 didn't have the same failure (miss).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/111520
Approved by: https://github.com/clee2000
2023-10-19 02:30:56 +00:00
f952551963 Handle invalid cancellation signals in trymerge (#110690)
This change is needed after https://github.com/pytorch/test-infra/pull/4579 and https://github.com/pytorch/test-infra/pull/4610.  All invalid cancelled signals have been removed from Dr.CI and HUD.  So trymerge should ignore them accordingly for a consistent experience.

### Testing

https://github.com/pytorch/pytorch/pull/110367#issuecomment-1750099960 is the PR where a bunch of invalid cancelled signals showed up and blocked merges

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110690
Approved by: https://github.com/clee2000, https://github.com/ZainRizvi
2023-10-06 22:43:33 +00:00
26bfb0fc21 Check for both workflow and job names from Dr.CI (#110661)
In https://github.com/pytorch/pytorch/pull/110362, the failure was flaky but merge bot treated it as an actual failure. This is a regression after https://github.com/pytorch/test-infra/pull/4604 where the name returned by Dr.CI now includes workflow name.  For example, the name is `trunk / macos-12-py3-arm64 / test (default, 2, 3, macos-m1-12)` in the JSON response:

```
{"FAILED": [], "FLAKY": [{"workflowId": 6372581477, "id": 17297638807, "name": "trunk / macos-12-py3-arm64 / test (default, 2, 3, macos-m1-12)", "jobName": "macos-12-py3-arm64 / test (default, 2, 3, macos-m1-12)", "conclusion": "failure", "completed_at": "2023-10-01T22:18:28Z", "html_url": "https://github.com/pytorch/pytorch/actions/runs/6372581477/job/17297638807", "head_branch": "ciflow/trunk/110362", "pr_number": 110362, "head_sha": "03f51e36dedf234931006d1db61677b229c9a119", "failure_captures": ["Failure: There is only 4671284KB free space left in /, which is less than the minimum requirement of"], "failure_line": "Failure: There is only 4671284KB free space left in /, which is less than the minimum requirement of 6291456KB for macOS", "time": "2023-10-01T22:17:53.847751Z"}], "BROKEN_TRUNK": [], "UNSTABLE": []}
```

I update merge bot to handle this better by considering both workflow name, job name, and the combination full name.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110661
Approved by: https://github.com/clee2000
2023-10-06 04:36:52 +00:00
81a74457ca [BE] Clean up trymerge code handling flaky failures (#110133)
This is the 2nd part of https://github.com/pytorch/pytorch/pull/110054.  The flaky classification has been done on Dr.CI.  There is no need to download flaky rule files and do the check anymore.  Some tests are also updated with new examples because we mocked the list of flaky rules there.  Similar tests have been done on Dr.CI.

* [x] https://github.com/pytorch/pytorch/pull/110054
* [x] Clean up the flaky rules logic because it has already been implemented on Dr. CI
* [ ] Clean up the broken trunk logic for the same reason

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110133
Approved by: https://github.com/clee2000
2023-09-30 08:01:00 +00:00
955298bc40 Use Dr.CI results to classify flaky failures in trymerge (#110054)
After https://github.com/pytorch/test-infra/pull/4589, we can now query Dr.CI to get the list of flaky failures there.  This change queries Dr.CI API endpoint and check if the failure is a flaky one using `is_flaky` function.

Because the change is relatively large, I'm breaking it down to several smaller PRs in this order:

* [x] This PR queries Dr.CI and adds `is_flaky` check
* [ ] Clean up the flaky rules logic because it has already been implemented on Dr. CI
* [ ] Clean up the broken trunk logic for the same reason

### Testing

* Create a new `drci_mocks.json` file to catch the JSON response from Dr.CI API endpoint. The API requires `DRCI_BOT_KEY`.
*  `pytest -v test_trymerge.py`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110054
Approved by: https://github.com/clee2000
2023-09-27 21:21:29 +00:00
063d2572da Revert "Use Dr.CI results to classify flaky failures in trymerge (#110054)"
This reverts commit d0f82cd082fad7243226e0ab68fd995873ea7d76.

Reverted https://github.com/pytorch/pytorch/pull/110054 on behalf of https://github.com/huydhn due to The mock gql_mocks.json file is not bigger than the file size limit on fbcode ([comment](https://github.com/pytorch/pytorch/pull/110054#issuecomment-1737727552))
2023-09-27 16:33:10 +00:00
d0f82cd082 Use Dr.CI results to classify flaky failures in trymerge (#110054)
After https://github.com/pytorch/test-infra/pull/4589, we can now query Dr.CI to get the list of flaky failures there.  This change queries Dr.CI API endpoint and check if the failure is a flaky one using `is_flaky` function.

Because the change is relatively large, I'm breaking it down to several smaller PRs in this order:

* [x] This PR queries Dr.CI and adds `is_flaky` check
* [ ] Clean up the flaky rules logic because it has already been implemented on Dr. CI
* [ ] Clean up the broken trunk logic for the same reason

### Testing

* Create a new `drci_mocks.json` file to catch the JSON response from Dr.CI API endpoint. The API requires `DRCI_BOT_KEY`.
*  `pytest -v test_trymerge.py`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110054
Approved by: https://github.com/clee2000
2023-09-26 21:24:21 +00:00
d7f943ec82 [mergebot] Flaky and broken trunk should take precedence over ic (#107761)
I notice a curious case on https://github.com/pytorch/pytorch/pull/107508 where there was one broken trunk failure and the PR was merged with `merge -ic`.  Because the failure had been classified as unrelated, I expected to see a no-op force merge here.  However, it showed up as a force merge with failure.

![Screenshot 2023-08-22 at 20 01 10](https://github.com/pytorch/pytorch/assets/475357/b9c93e24-8da8-4fc6-9b3d-61b6bd0a8937)

The record on Rockset reveals https://github.com/pytorch/pytorch/pull/107508 has:

* 0 broken trunk check (unexpected, this should be 1 as Dr. CI clearly say so)
* 1 ignore current check (unexpected, this should be 0 and the failure should be counted as broken trunk instead)
* 3 unstable ROCm jobs (expected)

It turns out that ignore current takes precedence over flaky and broken trunk classification.  This might have been the expectation in the past but I think that's not the case now.  The bot should be consistent with what is shown on Dr. CI.  The change here is to make flaky, unstable, and broken trunk classification to take precedence over ignore current.  Basically, we only need to ignore new or unrecognized failures that have yet been classified.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107761
Approved by: https://github.com/clee2000
2023-08-23 21:22:56 +00:00
b9c86c521d Make mergebot work with review comments (#107390)
Fixes https://github.com/pytorch/pytorch/issues/100406

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107390
Approved by: https://github.com/clee2000
ghstack dependencies: #107385
2023-08-17 21:31:41 +00:00
4874b02379 [BE] Remove deprecated github gql param and disable inconsistent test (#107385)
Two fixes:
- Stop querying `pushDate`, which [has been deprecated ](https://docs.github.com/en/graphql/reference/objects) and now always returns null
- Disables the test `test_merge_ghstack_into` which was recently added in https://github.com/pytorch/pytorch/pull/105251. This test used the results of another person's ghstack PR for testing, but as the dev submitted chunks of their PR this test's assumptions have been broken. cc @izaitsevfb for a long term fix here

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107385
Approved by: https://github.com/clee2000
2023-08-17 21:31:41 +00:00
4979a1b8f9 Fix trymerge broken trunk detection when the merge base job was retried (successfully) (#107333)
This fixes a discrepancy bug between Dr.CI and trymerge when detecting broken trunk failures.
 Take https://github.com/pytorch/pytorch/pull/107160 as an example:

* Dr.CI correctly identifies the broken trunk failure
* while trymerge records it as a new failure

The issue is that the merge base [failure](https://github.com/pytorch/pytorch/actions/runs/5833057579/job/15820504498) was flaky.  It was retried successfully and its conclusion went from a failure to a success.  The Rockset query returns all run attempts and while Dr.CI correctly records the failure, trymerge overwrites it with the successful retry.   Thus, the latter saw a new failure.

This change makes trymerge keep the merge base failure similar to what Dr.CI does https://github.com/pytorch/test-infra/blob/main/torchci/pages/api/drci/drci.ts#L158-L168

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107333
Approved by: https://github.com/clee2000
2023-08-17 02:09:31 +00:00
d2aa3f5fa9 [GHF][mergebot] record ghstack dependencies in the commit message (#105251)
Currently all information about the dependencies of ghstack PRs (e.g. #105010) is stripped away:
c984885809/.github/scripts/trymerge.py (L1077-L1078)

This PR adds this information back in a more compact form. All dependencies (PR numbers) of each PR in ghstack are recorded.

The resulting commit message will look like this (the last line is new):

> Mock title (#123)
>
> Mock body text
> Pull Request resolved: https://github.com/pytorch/pytorch/pull/123
> Approved by: https://github.com/Approver1, https://github.com/Approver2
> ghstack dependencies: #1, #2

---

### Testing

Unit tests.

---

### Note Re: `# type: ignore[assignment]` in unit tests.

I did my due diligence to find alternatives. Unfortunately mypy [doesn't](https://github.com/python/mypy/issues/6713) support this [way of patching methods](https://docs.python.org/3/library/unittest.mock-examples.html#mock-patching-methods), and the alternatives are either extremely verbose or don't work for this case. I decided it's not worth the effort (since the problem is limited only to the unit test).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105251
Approved by: https://github.com/huydhn
2023-07-29 20:32:10 +00:00
4fe407ad73 Add details about ic, broken, flaky, and unstable checks to merge records (#106162)
At the moment, we only record the list of pending and failed check on Rockset merge records. This is enough to compute the force merge KPI(s), but isn't enough for more in-depth analysis on what happened at the time of the merge:

* If the number of `ok_failed_checks` is less than `ok_failed_checks_threshold`, the list of `failed_checks` would be empty (expectedly).  So Rockset would only record an empty list.
* We support retry in PR, so the classifications on Dr.CI could be different than what dev observed at the time of the merge if retry completed successfully

### Testing

`python .github/scripts/trymerge.py --comment-id 1654010315 106095 --dry-run` (need to comment out some of the code to actually write a test record to Rockset), then manually verify it with

```
SELECT
    *
FROM
    commons.merges
WHERE
    pr_num = 106095
```

to see that `ignore_current_checks`, `broken_trunk_checks`, `flaky_checks`, and `unstable_checks` shows up correctly
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106162
Approved by: https://github.com/clee2000
2023-07-28 09:41:02 +00:00
32175d794a No need to wait for pending unstable jobs when merging (#106095)
No need to wait if the job classification is unstable as it would be ignored anyway. This is useful to not need to wait for scarce resources like ROCm, which is also frequently in unstable mode (There is a ROCm queue atm)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106095
Approved by: https://github.com/clee2000
2023-07-28 07:08:23 +00:00
3b5fb7c0d4 Support regex flaky rules in trymerge (#106103)
This goes together with https://github.com/pytorch/test-infra/pull/4423 to support regex flaky rules in `trymerge`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106103
Approved by: https://github.com/clee2000
2023-07-28 07:05:12 +00:00
c05eb77f09 Increase ignorable failures limit (#105998)
Given the number of unstable job atm (rocm, distributed), having the limit of 3 for ignorable failures is too low.  When I manually look into force merges, I could find many examples like like https://github.com/pytorch/pytorch/pull/105848 where there are 3+ unrelated failures.  As the classification is getting more accurate, we can aim to ignore all flaky and broken trunk failures.

* Default `ok_failed_checks_threshold` to `-1` to ignore all unrelated failures
* Increase the `IGNORABLE_FAILED_CHECKS_THESHOLD` to 10.  The only concern I have before setting it to `-1` is the fog of war situation when a sev occurs.  So 10 is a good middle ground before we agree to set it to `-1`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105998
Approved by: https://github.com/clee2000
2023-07-27 00:14:37 +00:00
6d43c89f37 [BE]: Update Ruff to 0.0.280 (#105724)
Removes unusued loop values in python dictionary iteration. Automated fix from Ruff master

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105724
Approved by: https://github.com/ezyang, https://github.com/janeyx99
2023-07-22 23:03:34 +00:00
14d87bb5ff [BE] Enable ruff's UP rules and autoformat tools and scripts (#105428)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105428
Approved by: https://github.com/albanD, https://github.com/soulitzer, https://github.com/malfet
2023-07-19 01:24:44 +00:00
893983e59f Use GitHub REST API to get the merge base commit SHA (#105098)
Fixes https://github.com/pytorch/pytorch/issues/104713

### Testing
Manual testing locally using #104121 and confirm that the correct merge base commit is returned [803c14490b189f9b755ecb9f2a969876088ea243](1cb87771c1) instead of the wrong value provided by `baseRefOid` (de7b6e55eb0f87f8d822f69bad6b4189a857b460).  Here is the JSON output of the GraphQL query for PR info https://paste.sh/TJ-QQWz4#fvE3Y6qoJ8vDkRBZ3vowkZ3m

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105098
Approved by: https://github.com/malfet
2023-07-14 04:25:45 +00:00
d77a2d8fe3 Remove shard ID and unstable suffix when comparing failed job names with the base commit (#104821)
Fixes https://github.com/pytorch/test-infra/issues/4328

This goes together with https://github.com/pytorch/test-infra/pull/4353 and it updates `trymerge` to remove shard ID and the `unstable` suffix when comparing failed job names with the base commit.

### Testing

Add unit tests with the reported issue as a test case https://github.com/pytorch/pytorch/pull/104214 to make sure that the failure there is reported as ignorable broken trunk instead of a new failure.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104821
Approved by: https://github.com/malfet
2023-07-12 19:51:18 +00:00
8464a6a165 [GHF] Better check for internal diffs (#104344)
During revert, use title of "Meta Internal-Only Changes Check" to determine whether or not internal diff is associated with the PR. When PR is merged/closed, "Meta Internal-Only Changes Check" status is always success, but title message can differ:
- "There is no internal Diff connected, this can be merged now" means that there are no internal change associated with PR (or it was landed via GitHub First workflow)
- "The internal Diff has landed, this can be merged now" meaning that PR has associated internal DIFF, and OSS and internal reverts must happen in sync using internal tooling. (Or a revert PR can be authored in OSS)

Add regression test for https://github.com/pytorch/pytorch/pull/100652 that was originated from the internal diff, but was merged as OSS PR.

Fixes https://github.com/pytorch/pytorch/issues/104232

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104344
Approved by: https://github.com/bigfootjon, https://github.com/huydhn
2023-06-28 19:22:45 +00:00
85efacee07 Add a new UNSTABLE category in trymerge (#102784)
Per title, after https://github.com/pytorch/pytorch/pull/102426 landed, it makes sense to have a new category for UNSTABLE jobs and handle them accordingly in trymerge.

* The simple approach is to check for `unstable` in the check (job) name.  I plan to roll this out first and then see if we need to cover the more complicated, but less popular case, of unstable build job.  Specifically, an unstable build job has no `unstable` in its name
* An unstable job is ignored by trymerge.  This is the same behavior we have atm when a job is moved to unstable.  It's completely ignored
* The update to Dr. CI will come later, so that unstable failures would also be hidden like broken trunk or flaky

### Testing

Leverage the broken trunk Windows CPU job atm and mark Windows CPU jobs as unstable https://github.com/pytorch/pytorch/issues/102297
Pull Request resolved: https://github.com/pytorch/pytorch/pull/102784
Approved by: https://github.com/clee2000
2023-06-04 00:40:27 +00:00
124d812f38 [BE] Fix rule not found error message (#101745)
Prevent error message from becoming of single column of characters

Thanks @clee200 for explaining how it worked before

<!--
copilot:poem
-->
### <samp>🤖 Generated by Copilot at fef1e25</samp>

> _`reject_reason` fixed_
> _Syntax error caused trouble_
> _Autumn of bugs ends_

Pull Request resolved: https://github.com/pytorch/pytorch/pull/101745
Approved by: https://github.com/kit1980, https://github.com/osalpekar
2023-05-17 23:57:36 +00:00
cb734123e2 [GHF] Ignore flaky classification for pin updates (#101587)
Also add regression test to validate it

Fixes https://github.com/pytorch/test-infra/issues/4126

Pull Request resolved: https://github.com/pytorch/pytorch/pull/101587
Approved by: https://github.com/atalman
2023-05-16 20:48:30 +00:00
cc0a271935 [GHF] Use baseRefOid to get PRs merge base (#101232)
In general, `GitHubRepo` class must not contain any methods that incur local IO, nor modify git repo checkout its invoked from.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101232
Approved by: https://github.com/kit1980
2023-05-13 00:27:10 +00:00
568bac7961 [BE][GHF] Add retries_decorator (#101227)
I've noticed that 3-4 functions in trymerge are trying to implement similar tail recursion for flaky network retries.

Unify them using single wrapper in `gitutils.py`

<!--
copilot:poem
-->
### <samp>🤖 Generated by Copilot at 8d40631</samp>

> _`retries_decorator`_
> _adds resilience to GitHub scripts_
> _autumn of errors_
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101227
Approved by: https://github.com/kit1980
2023-05-12 20:29:06 +00:00
ce76670c6f [GHF][BE] Add __repr__ to FlakyRule (#101234)
<!--
copilot:poem
-->
### <samp>🤖 Generated by Copilot at 9d9f7a4</samp>

> _`FlakyRule` class_
> _Defines rules for flaky tests_
> _Autumn leaves falling_
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101234
Approved by: https://github.com/kit1980
2023-05-12 02:08:20 +00:00
f0cc535c28 [GHF][BE] Memoize `read_flaky_rule (#101239)
Added caching to `read_flaky_rules`, as it's called several times during the merge process and every call incurs network access. Also, one should not expect flaky rules to change while PR is landed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101239
Approved by: https://github.com/huydhn, https://github.com/kit1980
2023-05-12 02:07:37 +00:00
1a6f613b8f Check uppercase when checking for merge blocking SEVs (#100559)
Otherwise it's triggered by phrases like "removing merge blocking" in the details.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/100559
Approved by: https://github.com/seemethere, https://github.com/malfet
2023-05-03 23:47:06 +00:00
5daef13883 Fix merging label removal (#100433)
During regular merge process, when `GitHubPR` object is created, it does not have `merging` label and when label is added it does not update existing `GitHubPR` object either

To fix the problem, call REST API wrapper `gh_remove_label` directly. Worst case that can happen, if label is already removed at this point, is that it will be printed to the stderr, which is not rendered on HUD anyway

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100433
Approved by: https://github.com/PaliC, https://github.com/kit1980
2023-05-02 00:30:13 +00:00
7b684310c8 [BE][GHF] Do not call GraphQL twice (#100434)
During regular merge process, `GitHubPR` and `GitHubRepo` objects are first created in main() and than re-created in `merge()` instead of being passed by reference, which results in making the same GraphQL requests to the repo twice

<!--
copilot:poem
-->
### <samp>🤖 Generated by Copilot at ee4e23e</samp>

> _Sing, O Muse, of the skillful coder who refactored_
> _The `merge` function, to accept a `GitHubPR` object,_
> _And thus reduced the calls to the divine API_
> _And the duplication of code, that source of errors._
Pull Request resolved: https://github.com/pytorch/pytorch/pull/100434
Approved by: https://github.com/kit1980, https://github.com/PaliC, https://github.com/huydhn, https://github.com/ZainRizvi
2023-05-02 00:26:49 +00:00
331ed5bee7 Add comment link to revert message (#100276)
* add comment url/link to revert message for easier tracking
* update gql mocks accordingly, not sure why databaseid in checkruns got updated as well
Pull Request resolved: https://github.com/pytorch/pytorch/pull/100276
Approved by: https://github.com/huydhn
2023-04-28 22:41:29 +00:00
e2a3817dfd [BE] Enable C419 rule for any all shortcircuiting (#99890)
Apparently https://github.com/pytorch/pytorch/pull/78142 made torch.JIT allow for simple generator expressions which allows us to enable rules that replace unnecessary list comprehensions with generators in any/all. This was originally part of #99280 but I split it off into this PR so that it can be easily reverted should anything break.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/99890
Approved by: https://github.com/justinchuby, https://github.com/kit1980, https://github.com/malfet
2023-04-25 15:02:13 +00:00
2418b94576 Rename default branch to main (#99210)
Mostly `s/@master/@main` in numerous `.yml` files.

Keep `master` in `weekly.yml` as it refers to `xla` repo and in `test_trymerge.py` as it refers to a branch PR originates from.
2023-04-16 18:48:14 -07:00
9667f261c6 Remove MERGE_IN_PROGRESS when exiting merge (#98611)
I.e. do it in finally section
This should take care of cases like https://github.com/pytorch/pytorch/pull/97645#issuecomment-1500490754

Pull Request resolved: https://github.com/pytorch/pytorch/pull/98611
Approved by: https://github.com/PaliC
2023-04-07 19:59:08 +00:00
46d765c15e [devX] make labels only count their own occurences (#98551)
Small QoL improvement such that add_numbered_label now works more intuitively. Now if we push different labels instead of having `[reverted, mergedX2, revertX3, mergedX4, revertedX5, mergedX6]` we have `[reverted, merged, revertX2, mergedX2, revertedX3, mergedX3]`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/98551
Approved by: https://github.com/huydhn
2023-04-07 08:30:46 +00:00
d06662fb57 Add ephemeral merging label (#98543)
Addresses https://github.com/pytorch/test-infra/issues/3950

Test Plan: Ran a dry run on this pr. The label showed up while trying to merge
<img width="354" alt="Screenshot 2023-04-06 at 4 57 48 PM" src="https://user-images.githubusercontent.com/13758638/230514276-1ac70b58-d2d1-4e4b-892b-a957bf156063.png">

And then disappeared after failing
<img width="373" alt="Screenshot 2023-04-06 at 5 00 11 PM" src="https://user-images.githubusercontent.com/13758638/230514470-38b15ec7-cfd9-4efe-b6e8-0f9af5577c62.png">

There's also the trail of adding and removing the "merging" label at the bottom

Notes: This is slightly buggy sometimes. For example when the merge failed when I was editing this textbox, the label did not disappear.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98543
Approved by: https://github.com/malfet
2023-04-07 08:24:54 +00:00
196acc84b1 [UX] Advise users to rebase-and-merge for stale PRs (#97808)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97808
Approved by: https://github.com/clee2000, https://github.com/kit1980
2023-03-28 19:03:16 +00:00
6c450c7880 Allow -ic when no pending jobs (#97707)
fixes https://github.com/pytorch/test-infra/issues/3933

https://github.com/pytorch/test-infra/pull/3937

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97707
Approved by: https://github.com/kit1980
2023-03-27 22:14:44 +00:00
345714e372 Upload merge records to Rockset (#97471)
This upload a record to a new Rockset `merges` collection in `commons` workspace in the following format:

```
{
    "id": comment_id,
    "pr_num": pr_num,
    "owner": owner,
    "project": project,
    "pending_checks": pending_checks,  # At the time of the merge
    "failed_checks": failed_checks,  # At the time of the merge
    "is_failed": is_failed,  # This is set to True if the merge fails to get through for whatever reason
    "dry_run": dry_run,
    "skip_mandatory_checks": skip_mandatory_checks,
    "ignore_current": ignore_current,
    "error": error,  # The same Exception message that will be shown on PR
}
```

To achieve this, I need to tweak `find_matching_merge_rule` a bit to return the list of pending and failed checks in addition to the matching merge rule.  As this function is also used internally, I have confirmed that the internal call doesn't need the return values.  Thus, the change is safe to land.

### Testing

* Unit testing
* Dry-run locally `python3 .github/scripts/trymerge.py --comment-id 1478678477 --dry-run 97293` using an older PR.  The merge obviously failed, but the record was created successfully on Rockset
```
{
  "_id": "52d3152b-ec35-4b5a-91fc-0e7298fc54b5-1",
  "_event_time": "2023-03-23T21:10:32.754368Z",
  "_meta": null,
  "owner": "pytorch",
  "is_failed": true,
  "id": 1478678477,
  "failed_checks": [],
  "dry_run": true,
  "error": "Command `git -C pytorch cherry-pick -x cc0d2e0fba648bb5deda34a9056f2c4192b22314` returned non-zero exit code 1...",
  "ignore_current": false,
  "project": "pytorch",
  "pr_num": 97293,
  "skip_mandatory_checks": false,
  "pending_checks": []
}
```

* Dry-run locally with this PR `python3 .github/scripts/trymerge.py --comment-id 1481949104 --dry-run --force 97471` with `--force`
```
{
  "_id": "dd7d2580-f6e5-47e7-9441-17df86056c14-1",
  "_event_time": "2023-03-23T21:43:53.915911Z",
  "_meta": null,
  "owner": "pytorch",
  "is_failed": true,
  "id": 1481949104,
  "failed_checks": [],
  "dry_run": true,
  "error": "PR #97471 has not been reviewed yet",
  "ignore_current": false,
  "project": "pytorch",
  "pr_num": 97471,
  "skip_mandatory_checks": true,
  "pending_checks": []
}
```

* Dry-run locally with this PR `python3 .github/scripts/trymerge.py --comment-id 1481949104 --dry-run 97471` again with approval rule commented out

```
{
  "_id": "5d7de4e3-1af1-4869-a3b7-d1a9dbced6ce-1",
  "_event_time": "2023-03-24T00:10:41.914111Z",
  "_meta": null,
  "is_failed": false,
  "id": 1481949104,
  "failed_checks": [],
  "error": "",
  "last_commit_sha": "4657400513f0360a0a4f73d46e1aff0882221687",
  "merge_commit_sha": "416bac5b813a181753afade781ae30f4f0843586",
  "ignore_current": false,
  "pending_checks": [
    [
      "pull / linux-focal-py3.8-gcc7 / test (default, 1, 3, linux.2xlarge)",
      "https://github.com/pytorch/pytorch/actions/runs/4506464828/jobs/7933518379",
      12239935788
    ],
    ...
    [
      "trunk / linux-bionic-cuda11.8-py3.10-gcc7 / test (default, 5, 5, linux.4xlarge.nvidia.gpu)",
      "https://github.com/pytorch/pytorch/actions/runs/4506465633/jobs/7933621958",
      12240067113
    ],
    ...
  ],
  "owner": "pytorch",
  "skip_mandatory_checks": true,
  "author": "Huy Do <huydhn@gmail.com>",
  "project": "pytorch",
  "merge_base_sha": "a3b30c5025e3381022fa00b127b0d881f4ef66d4",
  "pr_num": 97471
}
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97471
Approved by: https://github.com/clee2000
2023-03-27 18:42:00 +00:00
4c0dce50fd [BE] Apply ufmt to run_test and GitHub Python util scripts (#97588)
This has been bugging me for a while as I'm working on these Python scripts and they are not tracked by ufmt linter.  So I add these script into that linter.

```
[[linter]]
code = 'UFMT'
include_patterns = [
    '.github/**/*.py',
    'test/run_test.py',
```

This change should just work and not break anything as ufmt (black + usort) linter is very safe to use for standalone util scripts.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97588
Approved by: https://github.com/kit1980
2023-03-26 04:52:55 +00:00
a37b4fa03a [mergebot] An ignore current flag (#96756)
with https://github.com/pytorch/test-infra/pull/3882

Add -ic/--ignore-current flag for merge.  It ignores the currently failing checks but will stop when there is a new failure.  If there are no pending checks, it fails and tells you to use -f/--force.

Doesn't work on ghstacks with more than 1 PR.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96756
Approved by: https://github.com/huydhn
2023-03-20 19:07:01 +00:00
481582eb11 Remove land checks in trymerge (#96401)
Remove all references to land checks (rebase on viable strict in a different branch) since its no longer used.  Adding ciflow/trunk on merge and/or rebasing the entire pr is preferred.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96401
Approved by: https://github.com/huydhn
2023-03-10 18:11:05 +00:00
6894bb0a85 Remove on_green and mandatory_only (#96400)
Our default behavior is on green, and currently the two main modes are on green and force.
Surprisingly, both these flags are pretty much not used anywhere.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96400
Approved by: https://github.com/huydhn
2023-03-10 18:11:05 +00:00
88e554b513 Move label check failure to mergebot (#94707)
Fixes https://github.com/pytorch/pytorch/issues/88098

This is a mirror of the same PR (https://github.com/Goldspear/pytorch/pull/3) that has been reviewed in my fork (due to it's a stacked PR).

==============
## Context
This the 3rd of the 3 PRs to address the issue 88098.

## What Changed
1. check_labels.py no longer fails, but only leaving a comment
2. trymerge.py now would fail if no required labels provided

## Tests
* dummy-repo trymerge run [fails without required label](https://github.com/Goldspear/pytorch-dummy/actions/runs/4162819216) and resulted in [a label error comment](https://github.com/Goldspear/pytorch-dummy/pull/3#issuecomment-1427756769)
* the above pr was [correctly merged](https://github.com/Goldspear/pytorch-dummy/pull/3) after label is added.

## Note to Reviewers
1st PR: https://github.com/pytorch/pytorch/pull/94179
2nd PR: https://github.com/pytorch/pytorch/pull/94899
3rd PR: this one
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94707
Approved by: https://github.com/ZainRizvi
2023-03-03 15:09:14 +00:00
61fa43a1f2 [GHF] Add submodule updates check (#95885)
Originally planned to integrate it somehow into the `lintrunner`, but this poses too many challenges, one of them is that it deliberately ignores submodule updates.

On the other hand, almost all the information, other than list of the submodules is already present in the GitHubPR info.

Incorporate small BE change into `test_trymerge.py`, that moves `@mock.patch` from individual test to the class definition.

Fixes https://github.com/pytorch/pytorch/issues/74326 and https://github.com/pytorch/test-infra/issues/1521
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95885
Approved by: https://github.com/ZainRizvi, https://github.com/huydhn
2023-03-02 18:05:26 +00:00
5ba4dafccd Retry Merge: extract utils from check labels ptr (#94899)
Fixes #88098

This is the rebased and retry merging branch of the reverted PR: https://github.com/pytorch/pytorch/pull/94597

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94899
Approved by: https://github.com/kit1980
2023-03-01 20:40:30 +00:00
b55d0d2aef Fix trymerge changed files count (#95720)
The value from the PR info includes only unique files != The number of files changed (both are technically correct, depending on how you view it)

I'm trying to merge this PR https://github.com/pytorch/pytorch/pull/95233 which makes `.github/ci_commit_pins/triton.txt` a softlink.  So the PR includes 2 changes to that file 1) to delete the file and 2) to add it as a symlink.

```
[
  ".ci/docker/build.sh",
  ".ci/docker/ci_commit_pins/triton.txt",
  ".ci/docker/common/common_utils.sh",
  ".ci/docker/common/install_triton.sh",
  ".ci/docker/requirements-ci.txt",
  ".ci/docker/ubuntu-cuda/Dockerfile",
  ".ci/docker/ubuntu/Dockerfile",
  ".github/ci_commit_pins/triton.txt", <--
  ".github/ci_commit_pins/triton.txt", <--
  ".github/workflows/build-triton-wheel.yml"
]
```

Trymerge doesn't like that and rejects the merge due to `Changed file count mismatch` https://github.com/pytorch/pytorch/actions/runs/4295438799/jobs/7485853815 . This is because the PRInfo GraphQL result from GitHub only counts 9 of them https://paste.sh/zVsOnWoT#p_3RKX_VMjj-e71vwsTeA01W (search for `changedFiles`).  It means that the name are dedup, so that only unique file names are counted.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/95720
Approved by: https://github.com/kit1980, https://github.com/malfet, https://github.com/ZainRizvi
2023-02-28 21:55:21 +00:00
11f293a74e Comment about Meta-internal usage of trymerge.py (#95536)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95536
Approved by: https://github.com/malfet
2023-02-27 14:16:04 +00:00