pytorch

mirror of https://github.com/pytorch/pytorch.git synced 2025-10-21 13:44:15 +08:00

Author	SHA1	Message	Date
Zain Rizvi	b69282c98c	Enable opting out of experiments even when they're being rolled out (#140433). Enables opting out of specific experiments in the runner determinator. To opt out: 1. Go to the tracking issue: https://github.com/pytorch/test-infra/issues/5132 2. In the entry by your name, enter the experiment name, prefixed with a `-`. For example, to opt out of the LF fleet you could enter `@ZainRIzvi,-lf`. This lets you simultaneously be opted into some experiments and opted out of others. While the `disable-runner-experiments` label offers an option to disable all experiments on a given PR, this one lets you disable a selected set of experiments across all your PRs. Fixes https://github.com/pytorch/pytorch/issues/138099 Pull Request resolved: https://github.com/pytorch/pytorch/pull/140433 Approved by: https://github.com/zxiiro, https://github.com/jeanschmidt	2024-11-14 19:18:24 +00:00
Jean Schmidt	2cb983ab97	[CI] Adds support for selecting experiments for workflows on runner determinator (#137614). Adds a `default` tag to experiment configurations, which allows some experiments to be excluded from the random draw by default: ``` experiments: lf: rollout_perc: 25 otherExp: rollout_perc: 25 default: false --- ``` It also adds a setting to filter which experiments are of interest for a particular workflow (comma separated): ``` get-test-label-type: name: get-test-label-type uses: ./.github/workflows/_runner-determinator.yml with: ... check_experiments: "awsa100" ``` The end goal is to enable us to run multiple experiments that are independent of one another. For example, while we still run the LF infra experiment, we want to migrate other runners using the current solution. An immediate use case is the A100 instances, which we want to migrate to AWS. During the migration period, those new instances will be labeled both `awsa100.linux.gcp.a100` and `linux.aws.a100`. Once the experiment ends, we will remove the first, confusing one. ``` jobs: get-build-label-type: name: get-build-label-type uses: ./.github/workflows/_runner-determinator.yml with: ... get-test-label-type: name: get-test-label-type uses: ./.github/workflows/_runner-determinator.yml with: ... check_experiments: "awsa100" linux-focal-cuda12_1-py3_10-gcc9-inductor-build: name: cuda12.1-py3.10-gcc9-sm80 uses: ./.github/workflows/_linux-build.yml needs: - get-build-label-type - get-test-label-type with: runner_prefix: "${{ needs.get-build-label-type.outputs.label-type }}" ... test-matrix: | { include: [ { config: "inductor_huggingface_perf_compare", shard: 1, num_shards: 1, runner: "${{ needs.get-test-label-type.outputs.label-type }}linux.gcp.a100" }, ... ]} ... ``` ``` experiments: lf: rollout_perc: 50 awsa100: rollout_perc: 50 default: false ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/137614 Approved by: https://github.com/malfet	2024-10-11 19:20:02 +00:00
Zain Rizvi	d46ebcb31b	Enable experiments for protected branches (#136785). This allows protected branches (like `main` and `nightly`) to also run on the LF fleet, now that we've migrated over. Pull Request resolved: https://github.com/pytorch/pytorch/pull/136785 Approved by: https://github.com/jeanschmidt	2024-09-30 20:58:28 +00:00
Zain Rizvi	09519eb195	Support rolling over a percentage of workflows (#134816). In order to support adding a rollover percentage, this ended up being a complete rewrite of runner_determinator.py. Details of the new format are in the comments up top. On the plus side, this now includes some unit tests. Pull Request resolved: https://github.com/pytorch/pytorch/pull/134816 Approved by: https://github.com/PaliC, https://github.com/zxiiro	2024-09-11 18:01:26 +00:00
PyTorch MergeBot	8f66995459	Revert "Support rolling over a percentage of workflows (#134816)". This reverts commit fc890b55b51098437b6149abf1026a8b2aaee389. Reverted https://github.com/pytorch/pytorch/pull/134816 on behalf of https://github.com/malfet due to Causes lint to intermittently fail ([comment](https://github.com/pytorch/pytorch/pull/134816#issuecomment-2332902609))	2024-09-05 23:39:41 +00:00
Zain Rizvi	fc890b55b5	Support rolling over a percentage of workflows (#134816). In order to support adding a rollover percentage, this ended up being a complete rewrite of runner_determinator.py. Details of the new format are in the comments up top. On the plus side, this now includes some unit tests. Pull Request resolved: https://github.com/pytorch/pytorch/pull/134816 Approved by: https://github.com/PaliC, https://github.com/zxiiro	2024-09-05 22:21:45 +00:00

6 Commits
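
The commit messages above describe three selection rules: a per-experiment `rollout_perc`, a `default: false` flag that keeps an experiment out of the random draw unless a workflow asks for it via `check_experiments`, and per-user opt-outs written as `-name`. The following is a minimal sketch of how those rules could combine, not the actual contents of runner_determinator.py. The names `Experiment`, `parse_user_entry`, and `select_experiments` are hypothetical, and the precedence (opt-out beats opt-in, which beats the random rollout) is an assumption made for illustration.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence, Set, Tuple


@dataclass
class Experiment:
    # Hypothetical mirror of one entry under `experiments:` in the config.
    rollout_perc: float = 0.0  # percentage of runs enrolled at random
    default: bool = True       # False: skipped unless explicitly requested


def parse_user_entry(entry: str) -> Tuple[str, Set[str], Set[str]]:
    """Split an entry such as '@user,lf,-awsa100' into
    (user, opted_in, opted_out). A leading '-' marks an opt-out."""
    parts = [p.strip() for p in entry.split(",") if p.strip()]
    user = parts[0].lstrip("@")
    opted_out = {p[1:] for p in parts[1:] if p.startswith("-")}
    opted_in = {p for p in parts[1:] if not p.startswith("-")}
    return user, opted_in, opted_out


def select_experiments(
    experiments: Dict[str, Experiment],
    user_entry: str,
    check_experiments: Optional[Sequence[str]] = None,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Return the names of the experiments enabled for this run."""
    rng = rng or random.Random()
    _, opted_in, opted_out = parse_user_entry(user_entry)
    selected = []
    for name, exp in experiments.items():
        if check_experiments:
            # The workflow listed the experiments it cares about.
            if name not in check_experiments:
                continue
        elif not exp.default:
            # Non-default experiments need an explicit request.
            continue
        if name in opted_out:
            # Assumed precedence: an opt-out always wins.
            continue
        # Explicit opt-in, otherwise a random draw against rollout_perc.
        if name in opted_in or rng.random() * 100 < exp.rollout_perc:
            selected.append(name)
    return selected
```

Under these assumptions, a user entry of `@octocat,-lf` never selects `lf` regardless of its `rollout_perc`, and an experiment marked `default: false` is only considered when the calling workflow names it in `check_experiments`.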