pytorch

mirror of https://github.com/pytorch/pytorch.git synced 2025-10-21 05:34:18 +08:00

Author	SHA1	Message	Date
Aaron Gokaslan	beb4d7816d	[BE]: ruff PLC0207 - use maxsplit kwarg (#160107 ) Automatically replaces split with rsplit when relevant and only performs the split up to the first (or last) value. This allows the split function to return early and improves efficiency. Pull Request resolved: https://github.com/pytorch/pytorch/pull/160107 Approved by: https://github.com/albanD	2025-08-08 03:14:59 +00:00
PyTorch MergeBot	3443627e07	Revert "[BE]: Enable RUFF TRY400 rule - log.exception (#153473 )" This reverts commit 4f4ecc583e0f48ad2d062a53bf91c61ab40b4948. Reverted https://github.com/pytorch/pytorch/pull/153473 on behalf of https://github.com/jeanschmidt due to seems to have broken internal signals, @albanD may I count on you to help the author merge his PR? D74837988 ([comment](https://github.com/pytorch/pytorch/pull/153473#issuecomment-2886017075))	2025-05-16 08:29:26 +00:00
Aaron Gokaslan	4f4ecc583e	[BE]: Enable RUFF TRY400 rule - log.exception (#153473 ) Change logging.error to logging.exception to log additional information when relevant. A few places have slipped in logging.errors in try except since I last did a clean up here and the rule is stabilized so I am enabling it codebase wide. I have NOQA'd much of our custom exception stack trace handling for RPC calls and distributed and tried to fix a few errors based on whether we immediately reraised it or if we didn't print any exception handling where it could be useful. Pull Request resolved: https://github.com/pytorch/pytorch/pull/153473 Approved by: https://github.com/albanD, https://github.com/cyyever	2025-05-15 13:36:59 +00:00
Thanh Ha	50657120a0	Allow workflows to opt-out of experiments (#153085 ) This change adds support to allow workflows to opt-out of experiments. Pull Request resolved: https://github.com/pytorch/pytorch/pull/153085 Approved by: https://github.com/ZainRizvi Co-authored-by: Zain Rizvi <ZainRizvi@users.noreply.github.com>	2025-05-09 16:34:46 +00:00
Aaron Orenstein	60f98262f1	PEP585: .github (#145707 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/145707 Approved by: https://github.com/huydhn	2025-01-27 21:21:01 +00:00
Tom Ritchford	498a7808ff	Fix unused Python variables outside torch/ and test/ (#136359 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/136359 Approved by: https://github.com/albanD	2024-12-11 17:10:23 +00:00
Zain Rizvi	b69282c98c	Enable opting out of experiments even when they're being rolled out (#140433 ) Enables opting out of specific experiments in the runner determinator. To opt out: 1. Go to the tracking issue: https://github.com/pytorch/test-infra/issues/5132 2. In the entry by your name, enter the experiment name, prefixed with a `-`. For example, to opt out of the LF fleet you could enter `@ZainRIzvi,-lf` This lets you simultaneously be opted into some experiments and opted out of others. While the `disable-runner-experiments` label offers an option to disable all experiments on a given PR, this one lets you disable a selected set of experiments across all your PRs. Fixes https://github.com/pytorch/pytorch/issues/138099 Pull Request resolved: https://github.com/pytorch/pytorch/pull/140433 Approved by: https://github.com/zxiiro, https://github.com/jeanschmidt	2024-11-14 19:18:24 +00:00
Huy Do	09ba38c4b7	Add an opt-out label to runner determinator on PR (#140054 ) My sales pitch: I need to ssh into the runner from time to time on my PR to debug issues, but it's well-known that LF runners don't support SSH login anymore. So, the proposed fix here is to introduce a new label called ~no-runner-determinator~ `no-runner-experiments` that can be attached to the PR. Whenever `.github/scripts/runner_determinator.py` runs on a PR and sees this label, it will not apply any logic and just straight up use an empty prefix. ### Testing With the label: ``` python3 runner_determinator.py \ --github-token "MY_TOKEN" \ --github-issue "5132" \ --github-branch "install-torchao-torchtune-et" \ --github-actor "huydhn" \ --github-issue-owner "huydhn" \ --github-ref-type "branch" \ --github-repo "pytorch/pytorch" \ --eligible-experiments "" \ --pr-number "139947" INFO : Opt-out runner determinator because #139947 has no-runner-determinator label WARNING : No env var found for GITHUB_OUTPUT, you must be running this code locally. Falling back to the deprecated print method. ::set-output name=label-type:: ``` Without the label: ``` python3 runner_determinator.py \ --github-token "MY_TOKEN" \ --github-issue "5132" \ --github-branch "install-torchao-torchtune-et" \ --github-actor "huydhn" \ --github-issue-owner "huydhn" \ --github-ref-type "branch" \ --github-repo "pytorch/pytorch" \ --eligible-experiments "" \ --pr-number "139947" INFO : Based on rollout percentage of 95%, enabling experiment lf. INFO : Skipping experiment 'awsa100', as it is not a default experiment WARNING : No env var found for GITHUB_OUTPUT, you must be running this code locally. Falling back to the deprecated print method. ::set-output name=label-type::lf. ``` Running in trunk commit without a PR number will use the regular logic: ``` python3 runner_determinator.py \ --github-token "MY_TOKEN" \ --github-issue "5132" \ --github-branch "install-torchao-torchtune-et" \ --github-actor "huydhn" \ --github-issue-owner "huydhn" \ --github-ref-type "branch" \ --github-repo "pytorch/pytorch" \ --eligible-experiments "" \ --pr-number "" INFO : Based on rollout percentage of 95%, enabling experiment lf. INFO : Skipping experiment 'awsa100', as it is not a default experiment WARNING : No env var found for GITHUB_OUTPUT, you must be running this code locally. Falling back to the deprecated print method. ::set-output name=label-type::lf. ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/140054 Approved by: https://github.com/malfet, https://github.com/ZainRizvi	2024-11-07 22:55:27 +00:00
Jean Schmidt	2cb983ab97	[CI] Adds support for selecting experiments for workflows on runner determinator (#137614 ) Adds a `default` tag to experiment configurations, allowing some experiments to be excluded by default from the random draw: ``` experiments: lf: rollout_perc: 25 otherExp: rollout_perc: 25 default: false --- ``` and includes the configuration to filter which experiments are of interest for a particular workflow (comma separated): ``` get-test-label-type: name: get-test-label-type uses: ./.github/workflows/_runner-determinator.yml with: ... check_experiments: "awsa100" ``` The end goal is to enable us to run multiple experiments that are independent of one another. For example, while we still run the LF infra experiment, we want to migrate other runners leveraging the current solution. An immediate use case is the A100 instances, which we want to migrate to AWS. During the migration period, those new instances will be labeled both `awsa100.linux.gcp.a100` and `linux.aws.a100`. Once the experiment ends, we will remove the first, confusing one. ``` jobs: get-build-label-type: name: get-build-label-type uses: ./.github/workflows/_runner-determinator.yml with: ... get-test-label-type: name: get-test-label-type uses: ./.github/workflows/_runner-determinator.yml with: ... check_experiments: "awsa100" linux-focal-cuda12_1-py3_10-gcc9-inductor-build: name: cuda12.1-py3.10-gcc9-sm80 uses: ./.github/workflows/_linux-build.yml needs: - get-build-label-type - get-test-label-type with: runner_prefix: "${{ needs.get-build-label-type.outputs.label-type }}" ... test-matrix: | { include: [ { config: "inductor_huggingface_perf_compare", shard: 1, num_shards: 1, runner: "${{ needs.get-test-label-type.outputs.label-type }}linux.gcp.a100" }, ... ]} ... ``` ``` experiments: lf: rollout_perc: 50 awsa100: rollout_perc: 50 default: false ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/137614 Approved by: https://github.com/malfet	2024-10-11 19:20:02 +00:00
Zain Rizvi	f0fa460c60	[BE] Add script to keep the runner-determinator scripts in sync (#136794 ) Whenever we update runner_determinator.py it needs to be copied over into _runner-determinator.yml. This is a quick utility script to make that process less tedious. Pull Request resolved: https://github.com/pytorch/pytorch/pull/136794 Approved by: https://github.com/zxiiro, https://github.com/jeanschmidt	2024-10-01 22:26:28 +00:00
Zain Rizvi	d46ebcb31b	Enable experiments for protected branches (#136785 ) This is to allow the protected branches (like `main` and `nightly`) to also run on the LF fleet, now that we've migrated over. Pull Request resolved: https://github.com/pytorch/pytorch/pull/136785 Approved by: https://github.com/jeanschmidt	2024-09-30 20:58:28 +00:00
Zain Rizvi	09519eb195	Support rolling over a percentage of workflows (#134816 ) In order to support adding a rollover percentage, this ended up being a complete rewrite of runner_determinator.py. Details of the new format are in the comments up top. On the plus side, this now includes some unit tests. Pull Request resolved: https://github.com/pytorch/pytorch/pull/134816 Approved by: https://github.com/PaliC, https://github.com/zxiiro	2024-09-11 18:01:26 +00:00
PyTorch MergeBot	8f66995459	Revert "Support rolling over a percentage of workflows (#134816 )" This reverts commit fc890b55b51098437b6149abf1026a8b2aaee389. Reverted https://github.com/pytorch/pytorch/pull/134816 on behalf of https://github.com/malfet due to Causes lint to intermittently fail ([comment](https://github.com/pytorch/pytorch/pull/134816#issuecomment-2332902609))	2024-09-05 23:39:41 +00:00
Zain Rizvi	fc890b55b5	Support rolling over a percentage of workflows (#134816 ) In order to support adding a rollover percentage, this ended up being a complete rewrite of runner_determinator.py. Details of the new format are in the comments up top. On the plus side, this now includes some unit tests. Pull Request resolved: https://github.com/pytorch/pytorch/pull/134816 Approved by: https://github.com/PaliC, https://github.com/zxiiro	2024-09-05 22:21:45 +00:00
Zain Rizvi	469429b959	Refactor runner determinator (#134796 ) Some minor refactorings to make the code easier to parse and easier to add unit tests for. Keeping this as a separate PR for ease of review, since it should have zero functional behavior changes. Pull Request resolved: https://github.com/pytorch/pytorch/pull/134796 Approved by: https://github.com/zxiiro, https://github.com/PaliC	2024-09-03 23:29:04 +00:00
Thanh Ha	c6582f11cd	Add get_optin_feature() to allow opt-in to amz2023 (#131792 ) This extends the runner determinator to be able to opt-in to keywords to provide additional options when determining which systems to run jobs on. This enables us to support opt-in users to Amazon Linux 2023. This change creates a generic get_optin_feature() which hopefully will be useful to handle additional future features that we might want to experiment with. This change has kept backwards compatibility with the existing issue userlist format and adds support for the comma-separated list of users in a backwards compatible way. The user list has the following rules: - Users are GitHub usernames with the @ prefix - If the first line is a "*" then all users will use the new runners - If the first line is a "!" then all users will use the old runners - Each user is also a comma-separated list of features/experiments to enable - A "#" prefix indicates the user is opted out of the new runners but is opting into features/experiments. Example user list: ``` @User1 @User2,amz2023 #@UserOptOutOfNewRunner,amz2023 ``` This closes pytorch/ci-infra#249. Pull Request resolved: https://github.com/pytorch/pytorch/pull/131792 Approved by: https://github.com/jeanschmidt, https://github.com/ZainRizvi	2024-08-06 17:54:20 +00:00
Thanh Ha	3eb9fa5d58	Add support for using LF Canary runners (#131188 ) The script is updated such that if a canary build is detected and the label_type is LF runner it will run on an LF Canary runner. Closes pytorch/ci-infra#245. Pull Request resolved: https://github.com/pytorch/pytorch/pull/131188 Approved by: https://github.com/ZainRizvi	2024-07-22 13:26:46 +00:00
Zain Rizvi	9645eaaaec	[BE] Improve logging for runner-determinator (#129679 ) This lets us be more flexible about what data we output and about throwing exceptions. It's also less likely to break when others make changes (e.g. any print statement would have broken this code before since the printed output was expected to only be a json). Pull Request resolved: https://github.com/pytorch/pytorch/pull/129679 Approved by: https://github.com/zxiiro, https://github.com/jeanschmidt, https://github.com/Skylion007	2024-07-01 22:31:35 +00:00
Zain Rizvi	389492e264	Fix runner determinator bug (#129612 ) Currently the runner determinator is buggy and doesn't let anyone's workflows run against the LF runners (it prefixes a "@" to the user names in the issue instead of either stripping it or prefixing it to the incoming names). This PR fixes the bug so that people opted in to using LF runners can actually use them. It also puts the python code back into the repo. Even though the code isn't directly invoked, having it there makes testing and linting easier/possible. Also includes lint fixes. Note: if you just review the .yml file you'll see all the relevant diffs. ### Testing: #### Before ``` python .github/scripts/runner_determinator.py --github-token $GH_KEY --github-issue 5132 --github-actor ZainRizvi --github-issue-owner ZainRizvi --github-branch foo {"label_type": "", "message": "LF Workflows are disabled for ZainRizvi, ZainRizvi. Using meta runners."} ``` #### After ``` python .github/scripts/runner_determinator.py --github-token $GH_KEY --github-issue 5132 --github-actor ZainRizvi --github-issue-owner ZainRizvi --github-branch foo {"label_type": "lf.", "message": "LF Workflows are enabled for ZainRizvi, ZainRizvi. Using LF runners."} ``` Aside: updated test case after rebase: ``` python .github/scripts/runner_determinator.py --github-token $GH_KEY --github-issue 5132 --github-actor ZainRizvi --github-issue-owner ZainRizvi2 --github-branch foo --github-repo python/pythonss --github-ref-type branch {"label_type": "lf.", "message": "LF Workflows are enabled for ZainRizvi. Using LF runners."} ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/129612 Approved by: https://github.com/zxiiro, https://github.com/jeanschmidt	2024-06-27 17:51:09 +00:00
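
Several of the commits above describe how experiments are selected: per-user entries such as `@User2,amz2023`, a `-` prefix to opt out (#140433), a `rollout_perc` percentage (#134816), and a `default: false` flag combined with a per-workflow experiment filter (#137614). The following is a minimal sketch of how those rules might combine. It is not the code in `runner_determinator.py`; the names `Experiment`, `parse_user_entry`, and `pick_experiments` are hypothetical, and the precedence between the rules is an assumption.

```python
from __future__ import annotations

import random
from typing import Callable, Iterable, NamedTuple


class Experiment(NamedTuple):
    """Hypothetical mirror of one entry under `experiments:` in the settings."""

    rollout_perc: float = 0.0  # percentage of workflows enrolled at random
    default: bool = True  # False: only considered when a workflow asks for it


def parse_user_entry(line: str) -> tuple[str, set[str], set[str]]:
    """Split '@user,exp1,-exp2' into (user, opted_in, opted_out)."""
    user, *flags = (part.strip() for part in line.split(","))
    opted_out = {f[1:] for f in flags if f.startswith("-")}
    opted_in = {f for f in flags if f and not f.startswith("-")}
    return user.lstrip("@"), opted_in, opted_out


def pick_experiments(
    user: str,
    entries: Iterable[str],
    experiments: dict[str, Experiment],
    eligible: Iterable[str] = (),
    rng: Callable[[], float] = random.random,
) -> list[str]:
    """Return the experiments enabled for `user` (assumed precedence rules)."""
    opted_in: set[str] = set()
    opted_out: set[str] = set()
    for line in entries:
        name, ins, outs = parse_user_entry(line)
        if name == user:
            opted_in, opted_out = ins, outs
            break

    wanted = set(eligible)
    enabled = []
    for name, cfg in experiments.items():
        if wanted:
            # A workflow that lists experiments only considers those.
            if name not in wanted:
                continue
        elif not cfg.default:
            # Non-default experiments are skipped unless asked for.
            continue
        if name in opted_out:
            continue  # an explicit opt-out beats the random rollout
        if name in opted_in or rng() * 100 < cfg.rollout_perc:
            enabled.append(name)
    return enabled
```

With `entries = ["@alice,lf", "@bob,-lf"]` and an `lf` experiment at any rollout percentage, this sketch always enables `lf` for `alice` and never for `bob`; users without an entry fall through to the random draw.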

19 Commits
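
The most recent commit (#160107) applies the `maxsplit` keyword so that a split stops once the needed piece has been found. As a small illustration of that pattern (the helper names `first_field` and `last_field` are hypothetical and do not come from the repository):

```python
def first_field(text: str, sep: str = ",") -> str:
    """Return the text before the first separator, splitting at most once."""
    # Instead of text.split(sep)[0], which splits the entire string.
    return text.split(sep, maxsplit=1)[0]


def last_field(text: str, sep: str = ".") -> str:
    """Return the text after the last separator, splitting at most once."""
    # Instead of text.split(sep)[-1]; rsplit scans from the right.
    return text.rsplit(sep, maxsplit=1)[-1]


print(first_field("lf,awsa100,amz2023"))  # lf
print(last_field("archive.tar.gz"))  # gz
```

Both helpers return the same value as the unbounded `split` they replace; the difference is only that no more than one split is performed.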