Reason:
rocm binary builds should not block viable strict upgrade. It is queuing/canceled so viable strict is 1.2 days old
Tested by mangling the workflow file to get to the actual call of the python script `python ../test-infra/tools/scripts/fetch_latest_green_commit.py --required-checks '["pull", "trunk", "lint", "^linux-binary-manywheel$", "^linux-binary-libtorch-release$", "linux-aarch64"]' --viable-strict-branch viable/strict --main-branch master`, which I then ran locally where I have credentials. It returned d64718503728001a1e78168fd7f2d4ff23e57285 which is green. Without this change, it returns 5e5870e858f60ff4bf87d03f3592097e934a9580, which is pretty old
The other solution would have been to mark it as unstable I think
Side note, why is it master and how is it working like that
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162100
Approved by: https://github.com/huydhn
### Motivation
* MI250 Cirrascale runners are currently having network timeout leading to huge queueing of binary smoke test jobs:
<img width="483" height="133" alt="image" src="https://github.com/user-attachments/assets/17293002-78ad-4fc9-954f-ddd518bf0a43" />
* MI210 Hollywood runners (with runner names such as `pytorch-rocm-hw-*`) are not suitable for these jobs, because they seem to take much longer to download artifacts: https://github.com/pytorch/pytorch/pull/153287#issuecomment-2918420345 (this is why these jobs were specifically targeting Cirrascale runners). However, it doesn't seem like Cirrascale runners are necessarily doing much better either e.g. [this recent build](https://github.com/pytorch/pytorch/actions/runs/17332256791/job/49231006755).
* Moving to MI325 runners should address the stability part at least, while also reducing load on limited MI2xx runner capacity.
* However, I'm not sure if the MI325 runners will do any better on the artifact download part (this may need to be investigated more) cc @amdfaa
* Also removing `ciflow/binaries` and `ciflow/binaries_wheel` label/tag triggers for `generated-linux-binary-manywheel-rocm-main.yml` because we already trigger ROCm binary build/test jobs via these labels/tags in `generated-linux-binary-manywheel-nightly.yml`. And for developers who want to trigger ROCm binary build/test jobs on their PRs, they can use the `ciflow/rocm-mi300` label/tag as per this PR.
### TODOs (cc @amdfaa):
* Check that the workflow runs successfully on the MI325 runners in this PR. Note how long the test jobs take esp. the "Download Build Artifacts" step
* Once this PR is merged, clear the queue of jobs targeting `linux.rocm.gpu.mi250`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162044
Approved by: https://github.com/jeffdaily
Co-authored-by: Jeff Daily <jeff.daily@amd.com>
We have 4 different version of inductor benchmark Docker images used in CI at the moment:
1. `pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks` is used by almost all inductor jobs including nightly benchmark
2. `pytorch-linux-jammy-cuda12.8-cudnn9-py3.12-gcc9-inductor-benchmarks` runs inductor unit tests with python 3.12
3. `pytorch-linux-jammy-cuda12.8-cudnn9-py3.13-gcc9-inductor-benchmarks` runs inductor unit tests with python 3.13
4. `pytorch-linux-jammy-py3-gcc11-inductor-benchmarks` runs inductor unit tests on CPU
My proposal here is to clean up (2) and (3) and to keep (1) under the same setup from https://ghcr.io/pytorch/torchbench. Simplicity is the key here as inductor workflows are getting more and more complex:
1. Unit tests for Python variant like 3.12 and 3.13 were useful when they were first added to CI. They are much less useful now. [Flambeau](https://hud.pytorch.org/flambeau/s/3876ec7b-43f0-42c6-bfbf-899035e5bb77) shows a 0.97 correlation between them. And we are also moving to 3.14 nowadays. I want to choose 3.12 for (1), but will do this separately. This is also what TorchBench and vLLM are using on CI.
1. We are gradually cleaning up 3.9 on CI https://github.com/pytorch/pytorch/issues/161167
Another BE change here is to rename the jobs various inductor workflows because I think names like `linux-jammy-cuda12_8-py3_10-gcc9-inductor-build` is too long and confusing to look at, better just use human-friendly names like `inductor-build`. Other information is already spelled out in the build environment.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/161536
Approved by: https://github.com/zou3519
Fixes https://github.com/pytorch/pytorch/issues/161510
Test plan:
```
% cd third_party/kineto
% git checkout fe80f9319479265f7a208e615e16a363b993d50c; git submodule update --init --recursive
M libkineto/third_party/dynolog
M libkineto/third_party/fmt
M libkineto/third_party/googletest
Previous HEAD position was 5e75018 Fix Local Time on Windows Builds (#1104)
HEAD is now at fe80f93 Fix MSVC Error (#1134)
Submodule path 'libkineto/third_party/dynolog': checked out 'd2ffe0a4e3acace628db49974246b66fc3e85fb1'
Submodule path 'libkineto/third_party/dynolog/third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723'
Submodule path 'libkineto/third_party/dynolog/third_party/prometheus-cpp': checked out 'b1234816facfdda29845c46696a02998a4af115a'
Submodule path 'libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb': checked out 'd7ba35bbb649209c66e582d5a0244ba988a15159'
Submodule path 'libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest': checked out 'e2239ee6043f73722e7aa812a459f54a28552929'
Submodule path 'libkineto/third_party/fmt': checked out '40626af88bd7df9a5fb80be7b25ac85b122d6c21'
Submodule path 'libkineto/third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723'
% git checkout 5e75018; git submodule update --init --recursive
M libkineto/third_party/dynolog
M libkineto/third_party/fmt
M libkineto/third_party/googletest
Previous HEAD position was fe80f93 Fix MSVC Error (#1134)
HEAD is now at 5e75018 Fix Local Time on Windows Builds (#1104)
warning: unable to rmdir 'third_party/prometheus-cpp': Directory not empty
Submodule path 'libkineto/third_party/dynolog': checked out '7d04a0053a845370ae06ce317a22a48e9edcc74e'
Submodule path 'libkineto/third_party/dynolog/third_party/googletest': checked out '58d77fa8070e8cec2dc1ed015d66b454c8d78850'
Submodule path 'libkineto/third_party/fmt': checked out '0041a40c1350ba702d475b9c4ad62da77caea164'
Submodule path 'libkineto/third_party/googletest': checked out '7aca84427f224eeed3144123d5230d5871e93347'
% cd ../..
% git status
HEAD detached from 649e397c6de
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
(commit or discard the untracked or modified content in submodules)
modified: third_party/kineto (untracked content)
% time git submodule foreach --recursive git clean -ffdx
...
git submodule foreach --recursive git clean -ffdx 0.47s user 0.96s system 88% cpu 1.625 total
% git status
HEAD detached from 649e397c6de
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/161748
Approved by: https://github.com/atalman
Fixes https://github.com/pytorch/pytorch/issues/161510
Test plan:
```
% cd third_party/kineto
% git checkout fe80f9319479265f7a208e615e16a363b993d50c; git submodule update --init --recursive
M libkineto/third_party/dynolog
M libkineto/third_party/fmt
M libkineto/third_party/googletest
Previous HEAD position was 5e75018 Fix Local Time on Windows Builds (#1104)
HEAD is now at fe80f93 Fix MSVC Error (#1134)
Submodule path 'libkineto/third_party/dynolog': checked out 'd2ffe0a4e3acace628db49974246b66fc3e85fb1'
Submodule path 'libkineto/third_party/dynolog/third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723'
Submodule path 'libkineto/third_party/dynolog/third_party/prometheus-cpp': checked out 'b1234816facfdda29845c46696a02998a4af115a'
Submodule path 'libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb': checked out 'd7ba35bbb649209c66e582d5a0244ba988a15159'
Submodule path 'libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest': checked out 'e2239ee6043f73722e7aa812a459f54a28552929'
Submodule path 'libkineto/third_party/fmt': checked out '40626af88bd7df9a5fb80be7b25ac85b122d6c21'
Submodule path 'libkineto/third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723'
% git checkout 5e75018; git submodule update --init --recursive
M libkineto/third_party/dynolog
M libkineto/third_party/fmt
M libkineto/third_party/googletest
Previous HEAD position was fe80f93 Fix MSVC Error (#1134)
HEAD is now at 5e75018 Fix Local Time on Windows Builds (#1104)
warning: unable to rmdir 'third_party/prometheus-cpp': Directory not empty
Submodule path 'libkineto/third_party/dynolog': checked out '7d04a0053a845370ae06ce317a22a48e9edcc74e'
Submodule path 'libkineto/third_party/dynolog/third_party/googletest': checked out '58d77fa8070e8cec2dc1ed015d66b454c8d78850'
Submodule path 'libkineto/third_party/fmt': checked out '0041a40c1350ba702d475b9c4ad62da77caea164'
Submodule path 'libkineto/third_party/googletest': checked out '7aca84427f224eeed3144123d5230d5871e93347'
% cd ../..
% git status
HEAD detached from 649e397c6de
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
(commit or discard the untracked or modified content in submodules)
modified: third_party/kineto (untracked content)
% time git submodule foreach --recursive git clean -ffdx
...
git submodule foreach --recursive git clean -ffdx 0.47s user 0.96s system 88% cpu 1.625 total
% git status
HEAD detached from 649e397c6de
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/161748
Approved by: https://github.com/atalman
Fixes https://github.com/pytorch/pytorch/issues/161510
Test plan:
```
% cd third_party/kineto
% git checkout fe80f9319479265f7a208e615e16a363b993d50c; git submodule update --init --recursive
M libkineto/third_party/dynolog
M libkineto/third_party/fmt
M libkineto/third_party/googletest
Previous HEAD position was 5e75018 Fix Local Time on Windows Builds (#1104)
HEAD is now at fe80f93 Fix MSVC Error (#1134)
Submodule path 'libkineto/third_party/dynolog': checked out 'd2ffe0a4e3acace628db49974246b66fc3e85fb1'
Submodule path 'libkineto/third_party/dynolog/third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723'
Submodule path 'libkineto/third_party/dynolog/third_party/prometheus-cpp': checked out 'b1234816facfdda29845c46696a02998a4af115a'
Submodule path 'libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb': checked out 'd7ba35bbb649209c66e582d5a0244ba988a15159'
Submodule path 'libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest': checked out 'e2239ee6043f73722e7aa812a459f54a28552929'
Submodule path 'libkineto/third_party/fmt': checked out '40626af88bd7df9a5fb80be7b25ac85b122d6c21'
Submodule path 'libkineto/third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723'
% git checkout 5e75018; git submodule update --init --recursive
M libkineto/third_party/dynolog
M libkineto/third_party/fmt
M libkineto/third_party/googletest
Previous HEAD position was fe80f93 Fix MSVC Error (#1134)
HEAD is now at 5e75018 Fix Local Time on Windows Builds (#1104)
warning: unable to rmdir 'third_party/prometheus-cpp': Directory not empty
Submodule path 'libkineto/third_party/dynolog': checked out '7d04a0053a845370ae06ce317a22a48e9edcc74e'
Submodule path 'libkineto/third_party/dynolog/third_party/googletest': checked out '58d77fa8070e8cec2dc1ed015d66b454c8d78850'
Submodule path 'libkineto/third_party/fmt': checked out '0041a40c1350ba702d475b9c4ad62da77caea164'
Submodule path 'libkineto/third_party/googletest': checked out '7aca84427f224eeed3144123d5230d5871e93347'
% cd ../..
% git status
HEAD detached from 649e397c6de
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
(commit or discard the untracked or modified content in submodules)
modified: third_party/kineto (untracked content)
% time git submodule foreach --recursive git clean -ffdx
...
git submodule foreach --recursive git clean -ffdx 0.47s user 0.96s system 88% cpu 1.625 total
% git status
HEAD detached from 649e397c6de
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/161748
Approved by: https://github.com/atalman
Including below changes,
- Add XPU support package 2025.2 build and test in CI for both Linux and Windows
- Keep XPU support package 2025.1 build in CI to ensure no break issue until PyTorch 2.9 release
- Upgrade XPU support package from 2025.1 to 2025.2 in CD for both Linux and Windows
- Rename Linux CI job name & image name to n & n-1
- Update XPU runtime pypi packages dependencies of CD wheels
- Remove deprecated support package version docker image build
Pull Request resolved: https://github.com/pytorch/pytorch/pull/158733
Approved by: https://github.com/EikanWang, https://github.com/atalman