Fix bug where a GitHub API failure would prevent the check from failing even if we had already seen that labels were needed.
Also adds more debugging info to the rate limit exceeded error, since it's weird to see an error claiming the rate limit has been exceeded when the "Used" amount is way below the limit. I suspect these happen when the request arrives just before the rate reset time but the response is generated right after it, hence the apparently tiny "used" amounts.
Example run where the check should have failed, but passed instead:
https://github.com/pytorch/pytorch/actions/runs/4200205209/jobs/7285979824
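A minimal sketch of the intended behavior, not the actual script: once missing labels have been observed, a later API error should only be logged and must not turn the result back into a pass. The names `check_labels`, `post_comment`, and `failing_post_comment`, and the "at least one label" rule, are assumptions made up for illustration.

```python
from typing import Callable, List


def check_labels(labels: List[str], post_comment: Callable[[str], None]) -> int:
    """Return a process exit code: 0 if labels are fine, 1 otherwise."""
    exit_code = 0
    # Hypothetical rule for illustration: at least one label is required.
    if not labels:
        # Record the failure before talking to the API again.
        exit_code = 1
        try:
            post_comment("Please add the required labels to this PR.")
        except Exception as e:
            # An API failure is only logged; it must not reset exit_code.
            print(f"Could not post comment: {e}")
    return exit_code


def failing_post_comment(message: str) -> None:
    """Simulates a GitHub API failure such as a rate limit error."""
    raise RuntimeError("API rate limit exceeded")
```

With this shape, `check_labels([], failing_post_comment)` still reports a failure even though the comment could not be posted.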
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95098
Approved by: https://github.com/huydhn
Fixes #88098
### What Changed
* Moved `check_label.py` logic into `trymerge.py`
* Refactored relevant unittests
* ~~Dropped~~ Refactored `check_label.py` ci job
### Tests
`python .github/scripts/test_trymerge.py`
`python .github/scripts/test_check_labels.py`
`make lint & lintrunner -a`
### Notes to reviewers
This PR replaces the [original PR](https://github.com/pytorch/pytorch/pull/92225) to work around the sticky EasyCLA failure mark on its first commit.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92309
Approved by: https://github.com/ZainRizvi
* flatten the workflows into just jobs in order to give more specific links (link to the specific job that failed instead of just pull); this should make it easier to implement bypassing certain failures in the future
* try/catch of `MandatoryChecksMissingError` from `find_matching_merge_rule`, which should fix the error where merge loops instead of raising a runtime error when a trunk job fails
* remove usage of on_green and mandatory_only flags just in case. on_green and force are the only two behaviors we currently use
* fail if a ghstack PR has a non-ghstack change; tested locally with #92177, but unsure how to write tests because it requires use of `repo._run_git`
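One plausible reading of the try/catch bullet above, sketched under assumptions: the wait loop keeps polling only while mandatory checks are still missing, and any other error (for example a failed trunk job) propagates immediately instead of looping. `MandatoryChecksMissingError` is named in this PR; the class body, `wait_and_merge`, and its parameters are hypothetical stand-ins.

```python
import time
from typing import Callable


class MandatoryChecksMissingError(Exception):
    """Stand-in for the exception named above: checks are still pending."""


def wait_and_merge(
    find_rule: Callable[[], None],
    do_merge: Callable[[], None],
    timeout_s: float = 60.0,
    poll_s: float = 0.0,
) -> None:
    start = time.monotonic()
    while time.monotonic() - start < timeout_s:
        try:
            find_rule()
        except MandatoryChecksMissingError:
            # Checks have not finished yet: wait and try again.
            time.sleep(poll_s)
            continue
        # Any other exception (e.g. RuntimeError for a failed job) is not
        # caught here, so it ends the loop right away.
        do_merge()
        return
    raise RuntimeError("Timed out waiting for mandatory checks")
```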
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92097
Approved by: https://github.com/huydhn, https://github.com/ZainRizvi
This line is added by autoCCBot, but is not really meaningful as part of the commit message.
Test Plan:
```
>>> from trymerge import GitHubPR, RE_PR_CC_LINE
>>> import re
>>> pr=GitHubPR("pytorch", "pytorch", 87809)
>>> re.sub(RE_PR_CC_LINE, "", pr.get_body())
'Fixes #ISSUE_NUMBER\r\n\n\n'
>>> pr=GitHubPR("pytorch", "pytorch", 87913)
>>> re.sub(RE_PR_CC_LINE, "", pr.get_body())
'Parallel compilation warms the Threadpool when we call `torch._dynamo.optimize()`. In current benchmarks, we were setting up the TRITON_CACHE_DIR much later. Because of this parallel compilation artifacts were not used and compilation latency improvements were not visible in dashboard. This PR just prepones the setup of TRITON_CACHE_DIR.\n\n'
>>> pr=GitHubPR("pytorch", "pytorch", 85692)
>>> re.sub(RE_PR_CC_LINE, "", pr.get_body())
'This PR sets CUDA_MODULE_LOADING if it\'s not set by the user. By default, it sets it to "LAZY".\r\n\r\nIt was tested using the following commands:\r\n```\r\npython -c "import torch; tensor=torch.randn(20, 16, 50, 100).cuda(); free, total = torch.cuda.cudart().cudaMemGetInfo(0); print(total-free)"\r\n```\r\nwhich shows a memory usage of: 287,047,680 bytes\r\n\r\nvs\r\n\r\n```\r\nCUDA_MODULE_LOADING="DEFAULT" python -c "import torch; tensor=torch.randn(20, 16, 50, 100).cuda(); free, total = torch.cuda.cudart().cudaMemGetInfo(0); print(total-free)"\r\n```\r\nwhich shows 666,632,192 bytes. \r\n\r\nC++ implementation is needed for the libtorch users (otherwise it could have been a pure python functionality).\r\n\r\n'
```
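For readers without the repo checked out, here is a rough sketch of the kind of substitution exercised above. The pattern below is an assumption written for illustration and may not match the real `RE_PR_CC_LINE` in `trymerge.py`; `strip_cc_line` is a hypothetical helper.

```python
import re

# Assumed pattern: a line starting with "cc" (optionally "cc:") followed by
# at least one @mention. The real RE_PR_CC_LINE may differ.
CC_LINE = re.compile(r"^cc:? @\w+.*\r?\n?$", re.MULTILINE)


def strip_cc_line(body: str) -> str:
    """Remove the auto-added cc line from a PR body."""
    return CC_LINE.sub("", body)
```

Bodies that merely contain the letters "cc" somewhere inside a line are left untouched, because the pattern is anchored to the start of a line.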
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88252
Approved by: https://github.com/xuzhao9, https://github.com/izaitsevfb
### Context
When a dev submits a PR against the repo, we want to validate that they applied two labels to the PR, corresponding to the module they edited and the kind of change they're making.
### Change
Extended the open source workflow CI to add a validation to ensure that the PR being checked has the required labels on it. If it doesn't, the check fails and a bot will post a message on the PR with instructions on what labels the developer needs to add (https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work).
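The validation described above could look roughly like the sketch below. The label prefixes `release notes:` and `topic:` and the function name `has_required_labels` are assumptions for illustration, not a statement of what the workflow actually checks.

```python
from typing import Iterable

# Assumed prefixes: one label for the module, one for the kind of change.
MODULE_PREFIX = "release notes:"
KIND_PREFIX = "topic:"


def has_required_labels(labels: Iterable[str]) -> bool:
    """True if the PR carries both a module label and a change-kind label."""
    cleaned = [label.strip() for label in labels]
    has_module = any(label.startswith(MODULE_PREFIX) for label in cleaned)
    has_kind = any(label.startswith(KIND_PREFIX) for label in cleaned)
    return has_module and has_kind
```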
### Impact
Every time a new version of PyTorch is released, we want to compile all the changes made to each module. However, when devs forget to tag their PR, compiling the changes to write the release notes becomes a burdensome process (only ~20% of PRs are currently labeled appropriately, which means it can take up to 40 hours to compile release notes). With this new validation, the hope is that most PRs are labeled accordingly for more timely release notes compilation.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86829
Approved by: https://github.com/ZainRizvi
Per title, don't start land checks if the PR hasn't been approved yet. This is very important to make sure that we don't start CI jobs for unknown devs, e.g. first-time contributors.
Also rename force to `skip_mandatory_checks` to make it clearer what this flag does.
### Testing
```
python .github/scripts/test_trymerge.py
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84239
Approved by: https://github.com/zengk95, https://github.com/ZainRizvi
# The problem
When a dev forks their branch from a red master build, their branch can fail CI checks for reasons unrelated to their changes, even though the same checks would pass in the land validation commit (which is rebased off of viable/strict).
Today, in the above scenario the `merge -l` command fails because mergebot sees the failing checks in the PR, which is not helpful when that same check passes in land validation.
# The solution
This PR changes the behavior so that:
1. If both the PR and land validation ran a workflow, only look at the results from land validation
2. If only the PR ran a specific workflow (e.g. for CLA Check or a nightly run) then continue to look at the result from the PR (which matches existing behavior)
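The two rules above amount to a dictionary merge in which land validation wins on conflicts. A minimal sketch, assuming results are keyed by workflow name; `merge_check_results` and the status strings are hypothetical:

```python
from typing import Dict


def merge_check_results(
    pr_checks: Dict[str, str], land_checks: Dict[str, str]
) -> Dict[str, str]:
    """Combine results, preferring land validation when both ran a workflow."""
    merged = dict(pr_checks)    # rule 2: PR-only workflows are kept as-is
    merged.update(land_checks)  # rule 1: land validation overrides the PR
    return merged
```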
### Bonus fixes
It also includes a few extra BE fixes:
- Replaces the tuple we used to pass workflow check results around with a named tuple so that it's easier to tell what data is being used
- Reduces the number of API calls to github by ~50% during merges. Before, we were pulling results from github every time and then filtering it down to the relevant category of checks (e.g. failed/pending/startup_failed). Now, our filters share the check results
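To illustrate both BE fixes together, here is a sketch in which check results are fetched once, stored as named tuples, and then bucketed locally. The type name `CheckResult`, its fields, the bucket names, and the status strings are placeholders, not the names used in `trymerge.py`.

```python
from typing import Dict, Iterable, List, NamedTuple, Optional


class CheckResult(NamedTuple):
    name: str
    url: str
    status: Optional[str]  # None means the check has not finished yet


def categorize(checks: Iterable[CheckResult]) -> Dict[str, List[CheckResult]]:
    """Bucket already-fetched results without calling the API again."""
    buckets: Dict[str, List[CheckResult]] = {"passed": [], "pending": [], "failed": []}
    for check in checks:
        if check.status is None:
            buckets["pending"].append(check)
        elif check.status == "SUCCESS":
            buckets["passed"].append(check)
        else:
            buckets["failed"].append(check)
    return buckets
```

Accessing `check.status` instead of a positional index is what makes it easier to tell which piece of data is being used.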
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83715
Approved by: https://github.com/zengk95
## FORCE
@pytorchbot successfully started a merge job. Check the current status [here](None).
The merge job was triggered with the force (-f) flag. This means your change will be merged **immediately**, bypassing any CI checks (ETA: 1-5 minutes). If this is not the intended behavior, feel free to use some of the other merge options in the [wiki](https://github.com/pytorch/pytorch/wiki/Bot-commands).
Please reach out to the [PyTorch DevX Team](https://github.com/pytorch/pytorch/wiki/Dev-Infra-Office-Hours) with feedback or questions!
## GREEN
@pytorchbot successfully started a merge job. Check the current status [here](None).
The merge job was triggered with the green (-g) flag. This means that your change will be merged once all checks on your PR have passed (ETA: 0-4 Hours). If this is not the intended behavior, feel free to use some of the other merge options in the [wiki](https://github.com/pytorch/pytorch/wiki/Bot-commands).
Please reach out to the [PyTorch DevX Team](https://github.com/pytorch/pytorch/wiki/Dev-Infra-Office-Hours) with feedback or questions!
## LAND CHECKS
@pytorchbot successfully started a merge job. Check the current status [here](None).
The merge job was triggered with the land checks (-l) flag. If you did not specify this flag yourself, you are likely enrolled in the [land checks rollout](https://github.com/pytorch/test-infra/blob/main/torchci/lib/bot/rolloutUtils.ts#L1-L34). This means that your change will be merged once all checks on your PR and the land checks have passed (**ETA 4 Hours**). If you need to coordinate lands between different changes and cannot risk a land race, please add the `ciflow/trunk` label to your PR and wait for signal to complete, and then land your changes in proper order. Having `trunk`, `pull`, and `Lint` pre-run on a PR will bypass land checks and the ETA should be immediate. If this is not the intended behavior, feel free to use some of the other merge options in the [wiki](https://github.com/pytorch/pytorch/wiki/Bot-commands).
Please reach out to the [PyTorch DevX Team](https://github.com/pytorch/pytorch/wiki/Dev-Infra-Office-Hours) with feedback or questions!
## LAND CHECKS
@pytorchbot successfully started a merge job. Check the current status [here](None).
The merge job was triggered with the land checks (-l) flag. If you did not specify this flag yourself, you are likely enrolled in the [land checks rollout](https://github.com/pytorch/test-infra/blob/main/torchci/lib/bot/rolloutUtils.ts#L1-L34). This means that your change will be merged once all checks on your PR have passed since you have added the `ciflow/trunk` label to your PR (ETA 0-4 Hours). If this is not the intended behavior, feel free to use some of the other merge options in the [wiki](https://github.com/pytorch/pytorch/wiki/Bot-commands).
Please reach out to the [PyTorch DevX Team](https://github.com/pytorch/pytorch/wiki/Dev-Infra-Office-Hours) with feedback or questions!
## NORMAL
@pytorchbot successfully started a merge job. Check the current status [here](None).
The merge job was triggered without a flag. This means that your change will be merged once all checks on your PR have passed (ETA: 0-4 Hours). If this is not the intended behavior, feel free to use some of the other merge options in the [wiki](https://github.com/pytorch/pytorch/wiki/Bot-commands).
Please reach out to the [PyTorch DevX Team](https://github.com/pytorch/pytorch/wiki/Dev-Infra-Office-Hours) with feedback or questions!
## Revert Message
@pytorchbot successfully started a revert job. Check the current status [here](None).
Please reach out to the [PyTorch DevX Team](https://github.com/pytorch/pytorch/wiki/Dev-Infra-Office-Hours) with feedback or questions!
## TROUBLESHOOTING
If you believe this is an error, you can use the old behavior with `@pytorchbot merge -g` (optionally with the `ciflow/trunk` to get land checks) or use `@pytorchbot merge -f "some reason here"`. For more information, see the [bot wiki](https://github.com/pytorch/pytorch/wiki/Bot-commands).
Please reach out to the [PyTorch DevX Team](https://github.com/pytorch/pytorch/wiki/Dev-Infra-Office-Hours) with feedback or questions!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82907
Approved by: https://github.com/huydhn, https://github.com/janeyx99
### Description
We forgot that the < was for comments in markdown. Also added a link to the wiki to the start land checks message so users can see why their PR is taking extra time to land.
### Issue
n/a
### Testing
n/a
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82649
Approved by: https://github.com/janeyx99, https://github.com/ZainRizvi
### Description
Land checks are new and people can get confused about how to use them. Here, we make it a bit clearer to the user that they can use pytorchbot merge -g or -f to merge if they believe that there are infra issues with the land check.
### Issue
N/a
### Testing
N/a. Lint should be enough.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82580
Approved by: https://github.com/huydhn
### Description
There were two very similar functions in trymerge, so I refactored them to reduce duplicated code.
### Issue
### Testing
Ran `python test_trymerge.py` to test
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82371
Approved by: https://github.com/huydhn
### Description
This enables COLLABORATORS for a repository to revert PRs if they discover a problem.
The main difference between a COLLABORATOR and a MEMBER is that a MEMBER has access to the entire pytorch organization, while a COLLABORATOR's access is limited to a specific repository. But within that repository, granting them both revert rights seems quite reasonable
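A sketch of the permission check this implies, assuming the decision is made from the comment author's association with the repository. `can_revert` is a hypothetical helper, and including `OWNER` in the allowed set is an assumption not stated in this description.

```python
# Associations allowed to trigger a revert. COLLABORATOR and MEMBER come from
# the description above; OWNER is an assumption added for illustration.
ALLOWED_REVERT_ASSOCIATIONS = frozenset({"COLLABORATOR", "MEMBER", "OWNER"})


def can_revert(author_association: str) -> bool:
    """True if the commenter's association grants revert rights."""
    return author_association.upper() in ALLOWED_REVERT_ASSOCIATIONS
```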
### Issue
This request originated from pytorchbot's refusal to revert this pr: https://github.com/pytorch/pytorch/pull/79694#issuecomment-1189460942
### Testing
CI
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82360
Approved by: https://github.com/malfet, https://github.com/janeyx99
### Description
As we're adding accept2run, we don't want to do the typical land validation if ciflow/trunk is already on the PR. In this case, we default to the on_green checks to make sure all the PR checks and trunk checks are green before merging it in.
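The decision described above can be sketched as follows. `choose_merge_mode` and its parameters are hypothetical; only the `ciflow/trunk` label and the land checks / on_green distinction come from this description.

```python
from typing import Iterable, Tuple


def choose_merge_mode(
    labels: Iterable[str], land_checks_requested: bool
) -> Tuple[bool, bool]:
    """Return (land_checks, on_green) for a merge request."""
    has_trunk = "ciflow/trunk" in set(labels)
    # If trunk jobs already run on the PR, skip land validation and
    # wait for the PR's own checks to go green instead.
    land_checks = land_checks_requested and not has_trunk
    on_green = land_checks_requested and has_trunk
    return land_checks, on_green
```

For a PR that already has `ciflow/trunk`, this yields land checks off and on_green on, which is the combination shown in the local test output under Testing.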
### Issue
https://github.com/pytorch/test-infra/issues/448
### Testing
Tested it locally and made sure land checks was false and on_green was true
```
LANDCHECKS False ONGREEN True
Attempting merge of https://github.com/pytorch/pytorch/pull/82338 (0.003352566560109456 minutes elapsed)
Traceback (most recent call last):
  File "/Users/kerryz/pytorch/.github/scripts/trymerge.py", line 1247, in main
    merge(args.pr_num, repo,
  File "/Users/kerryz/pytorch/.github/scripts/trymerge.py", line 1178, in merge
    find_matching_merge_rule(pr, repo)
  File "/Users/kerryz/pytorch/.github/scripts/trymerge.py", line 979, in find_matching_merge_rule
    raise RuntimeError(reject_reason)
RuntimeError: Matched rule OSS CI, but PR #82338 has not been reviewed yet
(pytorch) kerryz@kerryz-mbp pytorch %
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82338
Approved by: https://github.com/clee2000, https://github.com/huydhn
### Description
We've had reports of people adding ciflow labels and then merging with the expectation that it wouldn't merge until everything was passing. This wasn't how merge worked because it only works for mandatory jobs, so now based on that feedback, we're waiting for all jobs if there's a ciflow/* label on the PR.
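A minimal sketch of the new rule, assuming labels are available as plain strings. `should_wait_for_all_checks` and `on_green_flag` are hypothetical names.

```python
from typing import Iterable


def should_wait_for_all_checks(
    labels: Iterable[str], on_green_flag: bool = False
) -> bool:
    """Wait for every job if requested explicitly or if any ciflow/* label is set."""
    return on_green_flag or any(label.startswith("ciflow/") for label in labels)
```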
### Issue
n/a
### Testing
let tests run
Ran trymerge locally and checked that on_green was true/false depending on whether the ciflow label was added
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82110
Approved by: https://github.com/ZainRizvi, https://github.com/malfet, https://github.com/huydhn
Addresses https://github.com/pytorch/pytorch/issues/80923
The alternative to this is editing the start comment to include the land checks details, but I think that might be a bit misleading, because the land check branch might not have been created yet when the person first clicks the link after getting a notification. That could be annoying, because the user then won't be able to look at their progress.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80965
Approved by: https://github.com/malfet