4481 Commits

Author SHA1 Message Date
6539537a59 [ROCm][CD] create ROCm 7.0 images for binary builds (#163860)
Adds gfx950.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/163860
Approved by: https://github.com/jeffdaily

Co-authored-by: Jeff Daily <jeff.daily@amd.com>
2025-09-25 17:26:40 +00:00
29cbcbac42 [BE] Make PyObjectSlot use a global PyInterpreter (#162659)
This pr gets rid of the pyobj_interpreter_ variable from PyObjectSlot and saves a word in the process

Gonna ask for review from @huydhn as there are some changes to CI.

Testing: imported internally and the failed android build seems to work now!

Pull Request resolved: https://github.com/pytorch/pytorch/pull/162659
Approved by: https://github.com/albanD, https://github.com/huydhn
2025-09-25 08:53:19 +00:00
eb7f4e0004 Add PEP 517 compliant Python source distribution to release process (#157815)
This adds the actual creation of a standards compliant sdist along with its upload to s3 to the create release workflow.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/157815
Approved by: https://github.com/malfet, https://github.com/atalman
ghstack dependencies: #157814, #160315
2025-09-25 07:15:52 +00:00
cc660d38ac [CI] Install libuv for Win testing (#163797)
Current working theory why f0078941cf caused a regression, are because Windows CI no longer could be build with distributed, as it could not find libuv
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163797
Approved by: https://github.com/wdvr
2025-09-25 01:10:14 +00:00
00f96dd84d [CI] Run CUDA-13 binary builds on trunk (#163787)
There are numerous other workflows that could be used to catch CUDA-12
build regression (our CI builds are almost identical to CD ones), but not many CUDA-13 builds around, so https://github.com/pytorch/pytorch/issues/163342 are really hard to detect in CI
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163787
Approved by: https://github.com/atalman, https://github.com/huydhn
2025-09-25 00:58:17 +00:00
77b9aac6c2 Add rule for typechecking maintainers (#161307)
Allow the following people merge rights on type checking configs:
  - @lolpack
  - @maggiemoss
  - @ndmitchell
  - @kinto0

Pull Request resolved: https://github.com/pytorch/pytorch/pull/161307
Approved by: https://github.com/albanD, https://github.com/ezyang
2025-09-25 00:14:31 +00:00
0ec946a052 [ROCm] Increase binary build timeout to 5 hours (300 minutes) (#163776)
Despite narrowing down the [FBGEMM_GENAI build to gfx942](https://github.com/pytorch/pytorch/pull/162648), the nightly builds still timed out because they [didn't get enough time to finish the post-PyTorch-build steps](https://github.com/pytorch/pytorch/actions/runs/17969771026/job/51109432897).

This PR increases timeout for ROCm builds for both [libtorch ](https://github.com/pytorch/pytorch/actions/runs/17969771026)and [manywheel](https://github.com/pytorch/pytorch/actions/runs/17969771041), because both of those are close to the 4hr mark currently.

This PR is a more ROCm-targeted version of https://github.com/pytorch/pytorch/pull/162880 (which is for release/2.9 branch).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/163776
Approved by: https://github.com/jeffdaily

Co-authored-by: Jeff Daily <jeff.daily@amd.com>
2025-09-24 23:02:08 +00:00
1495b35d29 Remove Python 3.9 for Triton builds (#163778)
Related to https://github.com/pytorch/pytorch/issues/161167

Pull Request resolved: https://github.com/pytorch/pytorch/pull/163778
Approved by: https://github.com/malfet
2025-09-24 20:19:43 +00:00
3b73841f43 update test_quantization tests to run weekly (#163077)
Fixes #162854

Pull Request resolved: https://github.com/pytorch/pytorch/pull/163077
Approved by: https://github.com/huydhn
2025-09-24 11:31:11 +00:00
6f1d962d5b [vllm hash update] update the pinned vllm hash (#163711)
This PR is auto-generated nightly by [this action](https://github.com/pytorch/pytorch/blob/main/.github/workflows/nightly.yml).
Update the pinned vllm hash.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163711
Approved by: https://github.com/pytorchbot
2025-09-24 04:31:37 +00:00
42e9902a0f cd: Move arm64 to linux.arm64.r7g.12xlarge.memory (#163681)
This should reduce the amount of build time we have by a lot by just
throwing more hardware at the problem.

Signed-off-by: Eli Uriegas <eliuriegas@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163681
Approved by: https://github.com/huydhn, https://github.com/atalman, https://github.com/malfet
2025-09-24 04:06:09 +00:00
5f0c7cb4aa Add B200 smoke test (#159494)
Okay running test_max_autotune locally on B200is horrible read, for now to get something landed I am focusing on test_matmul_cuda.py and test_fp8

Pull Request resolved: https://github.com/pytorch/pytorch/pull/159494
Approved by: https://github.com/nWEIdia, https://github.com/huydhn
ghstack dependencies: #163460, #163537, #163552
2025-09-23 15:45:05 +00:00
fcd79d5228 [vllm hash update] update the pinned vllm hash (#163590)
This PR is auto-generated nightly by [this action](https://github.com/pytorch/pytorch/blob/main/.github/workflows/nightly.yml).
Update the pinned vllm hash.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163590
Approved by: https://github.com/pytorchbot
2025-09-23 04:44:15 +00:00
fa15fb01ab [EZ] Remove XLA from unstable.yml (#163564)
It runs for 30 min on linux.12xlarge and then fails and it has been like
that since Aug 7th

Besides, there are no more python-3.9 builds left.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163564
Approved by: https://github.com/seemethere, https://github.com/atalman, https://github.com/huydhn
2025-09-22 22:11:50 +00:00
e558f7a222 [vllm hash update] update the pinned vllm hash (#163463)
This PR is auto-generated nightly by [this action](https://github.com/pytorch/pytorch/blob/main/.github/workflows/nightly.yml).
Update the pinned vllm hash.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163463
Approved by: https://github.com/pytorchbot

Co-authored-by: Huy Do <huydhn@gmail.com>
2025-09-22 21:24:56 +00:00
d279a6a6f1 ci: Add a way to lint all files in a PR from label (#163525)
Signed-off-by: Eli Uriegas <eliuriegas@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163525
Approved by: https://github.com/ZainRizvi
2025-09-22 18:06:39 +00:00
5e7be98800 [BE] Update Python min version to 3.10 (#162310)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162310
Approved by: https://github.com/atalman, https://github.com/Skylion007, https://github.com/ZainRizvi
2025-09-22 17:04:21 +00:00
10adeb9044 Revert "[BE] Update Python min version to 3.10 (#162310)"
This reverts commit 9f5a644f0768258bc81f8b38492754d297399f74.

Reverted https://github.com/pytorch/pytorch/pull/162310 on behalf of https://github.com/malfet due to Broke lint, but to the best of my knowledge it's no longer possible to run lint for all files on PRs ([comment](https://github.com/pytorch/pytorch/pull/162310#issuecomment-3319289031))
2025-09-22 14:13:59 +00:00
9f5a644f07 [BE] Update Python min version to 3.10 (#162310)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162310
Approved by: https://github.com/atalman, https://github.com/Skylion007, https://github.com/ZainRizvi
2025-09-22 13:37:02 +00:00
edafc902d7 Revert "[BE] Make PyObjectSlot use a global PyInterpreter (#162659)"
This reverts commit d1993c27ae59842c887d549a3f8936fbcd769498.

Reverted https://github.com/pytorch/pytorch/pull/162659 on behalf of https://github.com/wdvr due to reverted internally, please see D82771705 @PaliC ([comment](https://github.com/pytorch/pytorch/pull/162659#issuecomment-3317110247))
2025-09-22 06:22:37 +00:00
5b386ee16e [vllm hash update] update the pinned vllm hash (#163392)
This PR is auto-generated nightly by [this action](https://github.com/pytorch/pytorch/blob/main/.github/workflows/nightly.yml).
Update the pinned vllm hash.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163392
Approved by: https://github.com/pytorchbot
2025-09-21 04:34:14 +00:00
0098e5636d [CI] Move Windows build/tests to Python-3.10 (#162862)
What supposed to be a very simple change end up being quite involved, as current Windows CI framework is quite inflexible, i.e. it takes a lots of argument, but later on ignores them, namely:
 - `PYTHON_VERSION` used to be a no-op that is simply ignored by the scripts
 - With this change, `setup-win` action will create an environment called `py_tmp` with specific python version + intel-openmp (that is hard runtime requirement, but for some reason not packaged into the wheel nor marked as such)
 - Copied test type dependencies from be01a40157/aws/ami/windows/scripts/Installers/Install-Pip-Dependencies.ps1 (L16) into `win-test.sh`, but made some adjustments to be compatible with 3.10 runtime (scipy version update) and just make rerun-tests compatible with the rest of the deps

I think in the long run, one needs to update 4432e2cacd/aws/ami/windows/scripts/Installers/Install-Miniconda3.ps1 that currently pins Miniconda python to 3.9, but also figure out how CI can still create a new environment without having to download all the dependencies all the time
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162862
Approved by: https://github.com/wdvr, https://github.com/huydhn
ghstack dependencies: #163339, #163341
2025-09-19 22:51:38 +00:00
52dd7a898c Move ROCM trunk wheel builds to 3.10 (#163339)
This code is a delicious spaghetti: Sometimes python version is defined in jinja template (see https://github.com/pytorch/pytorch/pull/162297 ) sometimes in shell script (see https://github.com/pytorch/pytorch/pull/162877 ), but this time around it's in a python file (and there is another one called `generate_binary_build_matrix.py` that defines `FULL_PYTHON_VERSIONS`)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/163339
Approved by: https://github.com/clee2000
2025-09-19 18:52:00 +00:00
b8c5ec582f [CD] Simplify NVIDIA driver installation step (#163349)
Undo changes introduced in https://github.com/pytorch/pytorch/pull/160956 as driver has been updated to 580 for both fleets

Fixes https://github.com/pytorch/pytorch/issues/163342
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163349
Approved by: https://github.com/seemethere
2025-09-19 18:50:47 +00:00
2984bfe3da [ez][CI] Run vllm workflow on vllm pin updates (#163353)
As in title

The auto pin update was merged without running vllm workflow
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163353
Approved by: https://github.com/malfet, https://github.com/wdvr
2025-09-19 17:32:49 +00:00
17081209e5 Revert "[CI] Move Windows build/tests to Python-3.10 (#162862)"
This reverts commit 2dcd153342d27b0981ff79eb2ccb8d8962e79c48.

Reverted https://github.com/pytorch/pytorch/pull/162862 on behalf of https://github.com/malfet due to Breaks some windows tests ([comment](https://github.com/pytorch/pytorch/pull/162862#issuecomment-3310606135))
2025-09-19 05:16:49 +00:00
46c647d1ee [vllm hash update] update the pinned vllm hash (#163304)
This PR is auto-generated nightly by [this action](https://github.com/pytorch/pytorch/blob/main/.github/workflows/nightly.yml).
Update the pinned vllm hash.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163304
Approved by: https://github.com/pytorchbot
2025-09-19 04:25:43 +00:00
2dcd153342 [CI] Move Windows build/tests to Python-3.10 (#162862)
What supposed to be a very simple change end up being quite involved, as current Windows CI framework is quite inflexible, i.e. it takes a lots of argument, but later on ignores them, namely:
 - `PYTHON_VERSION` used to be a no-op that is simply ignored by the scripts
 - With this change, `setup-win` action will create an environment called `py_tmp` with specific python version + intel-openmp (that is hard runtime requirement, but for some reason not packaged into the wheel nor marked as such)
 - Introduced `CONDA_ROOT_DIR` env variable in `activate_miniconda3.bat` to avoid `%CONDA_PARENT_DIR%\Miniconda3` invocations throughout the codebase
 - Copied test type dependencies from be01a40157/aws/ami/windows/scripts/Installers/Install-Pip-Dependencies.ps1 (L16) into `win-test.sh`, but made some adjustments to be compatible with 3.10 runtime (scipy version update) and just make rerun-tests compatible with the rest of the deps

I think in the long run, one needs to update 4432e2cacd/aws/ami/windows/scripts/Installers/Install-Miniconda3.ps1 that currently pins Miniconda python to 3.9, but also figure out how CI can still create a new environment without having to download all the dependencies all the time
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162862
Approved by: https://github.com/wdvr, https://github.com/huydhn
2025-09-19 00:33:03 +00:00
f4eca0e3b3 Try updating ET pin in PT/PT (#159664)
Looking into resolving this: https://github.com/pytorch/pytorch/issues/159599

Test Plan: Wait for executorch CI
Pull Request resolved: https://github.com/pytorch/pytorch/pull/159664
Approved by: https://github.com/malfet
2025-09-18 21:55:16 +00:00
af8c232b75 [CI] reuse old whl: fix metadata file not getting version replaced (#163214)
In the .dist-info/METADATA file, the version was not being written with the new sha.

On python <3.11 (I think), the glob `**` will only match directories, so change this to `*`, which I checked that it will match both files and directories on py3.9 and py3.13

There's probably also a bunch of mismatches in RECORD but thats a problem for later
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163214
Approved by: https://github.com/huydhn
2025-09-18 16:08:29 +00:00
d734b26141 [vllm hash update] update the pinned vllm hash (#163218)
This PR is auto-generated nightly by [this action](https://github.com/pytorch/pytorch/blob/main/.github/workflows/nightly.yml).
Update the pinned vllm hash.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163218
Approved by: https://github.com/pytorchbot
2025-09-18 04:31:47 +00:00
8e48d1ba25 Skip reuse PyTorch wheel when building vLLM (#163232)
This issues starts surfacing in [trunk](b26d4c9a7a/1).  When building vLLM, uv doesn't like that we rename CI wheel without changing its metadata to match it.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/163232
Approved by: https://github.com/izaitsevfb
2025-09-18 01:42:32 +00:00
8dbac62edb [CI] Update NVIDIA driver to 580.82.07 (#163111)
To make CI machines capable of running CUDA-13 tests. Unfortunately, this upgrade regresses NUMBA integration, so live patch it with 6e08c9d08e

This fix was suggested in https://github.com/pytorch/pytorch/issues/162878#issuecomment-3288635745

Pull Request resolved: https://github.com/pytorch/pytorch/pull/163111
Approved by: https://github.com/huydhn
2025-09-17 17:37:06 +00:00
d1993c27ae [BE] Make PyObjectSlot use a global PyInterpreter (#162659)
This pr gets rid of the pyobj_interpreter_ variable from PyObjectSlot and saves a word in the process

Gonna ask for review from @huydhn as there are some changes to CI.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162659
Approved by: https://github.com/albanD, https://github.com/huydhn
2025-09-17 16:40:55 +00:00
4ca3f435fb Revert "[CI] Update NVIDIA driver to 580.82.07 (#163111)"
This reverts commit 16475a829f7fe3b1dc3c74573740df09ffaec650.

Reverted https://github.com/pytorch/pytorch/pull/163111 on behalf of https://github.com/malfet due to It started to fail now, but worked just fine in PR CI ([comment](https://github.com/pytorch/pytorch/pull/163111#issuecomment-3303707671))
2025-09-17 16:20:31 +00:00
16475a829f [CI] Update NVIDIA driver to 580.82.07 (#163111)
To make CI machines capable of running CUDA-13 tests. Unfortunately, this upgrade regresses NUMBA integration, so live patch it with 6e08c9d08e

This fix was suggested in https://github.com/pytorch/pytorch/issues/162878#issuecomment-3288635745

Pull Request resolved: https://github.com/pytorch/pytorch/pull/163111
Approved by: https://github.com/huydhn
2025-09-17 14:44:06 +00:00
bb635a11f8 [vllm hash update] update the pinned vllm hash (#163128)
This PR is auto-generated nightly by [this action](https://github.com/pytorch/pytorch/blob/main/.github/workflows/nightly.yml).
Update the pinned vllm hash.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163128
Approved by: https://github.com/pytorchbot
2025-09-17 04:26:07 +00:00
814338826e Set the credential to upload vLLM nightly wheels on schedule and workflow_dispatch (#163018)
The build is ok, but uploading is failing at the moment https://github.com/pytorch/pytorch/actions/runs/17734972779/job/50416387786

Pull Request resolved: https://github.com/pytorch/pytorch/pull/163018
Approved by: https://github.com/wdvr, https://github.com/malfet
2025-09-16 22:26:22 +00:00
c527292c43 [CI] Remove functorch doc build jobs (#163101)
As repo has been archived, there couldn't be any doc updates

Pull Request resolved: https://github.com/pytorch/pytorch/pull/163101
Approved by: https://github.com/svekars, https://github.com/zou3519, https://github.com/ZainRizvi
2025-09-16 22:25:59 +00:00
d4554bc284 Revert "Set the credential to upload vLLM nightly wheels on schedule and workflow_dispatch (#163018)"
This reverts commit 61be0f1c11ef59ff8cf39138b594efe3672816c0.

Reverted https://github.com/pytorch/pytorch/pull/163018 on behalf of https://github.com/huydhn due to Missed another update on the environment ([comment](https://github.com/pytorch/pytorch/pull/163018#issuecomment-3300444271))
2025-09-16 21:44:11 +00:00
4db203f875 Revert "[BE] Make PyObjectSlot use a global PyInterpreter (#162659)"
This reverts commit 05ee8114f818a95745c812c3cd7aa8e784e61a9a.

Reverted https://github.com/pytorch/pytorch/pull/162659 on behalf of https://github.com/jeanschmidt due to seems to have introduced errors in linting see https://github.com/pytorch/pytorch/actions/runs/17750689989/job/50444910643 ([comment](https://github.com/pytorch/pytorch/pull/162659#issuecomment-3298626136))
2025-09-16 12:52:57 +00:00
6c0fd747af [vllm hash update] update the pinned vllm hash (#162928)
This PR is auto-generated nightly by [this action](https://github.com/pytorch/pytorch/blob/main/.github/workflows/nightly.yml).
Update the pinned vllm hash.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162928
Approved by: https://github.com/pytorchbot
2025-09-16 04:25:04 +00:00
c7fa16a05c [ROCm][CI] update _rocm-test.yml based on _linux-test.yml (#163014)
Fixes missing huggingface secrets and aligns _rocm-test.yml with other updates from _linux-test.yml that it was initially based on.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/163014
Approved by: https://github.com/huydhn
2025-09-16 02:14:38 +00:00
61be0f1c11 Set the credential to upload vLLM nightly wheels on schedule and workflow_dispatch (#163018)
The build is ok, but uploading is failing at the moment https://github.com/pytorch/pytorch/actions/runs/17734972779/job/50416387786

Pull Request resolved: https://github.com/pytorch/pytorch/pull/163018
Approved by: https://github.com/wdvr, https://github.com/malfet
2025-09-16 01:46:59 +00:00
05ee8114f8 [BE] Make PyObjectSlot use a global PyInterpreter (#162659)
This pr gets rid of the pyobj_interpreter_ variable from PyObjectSlot and saves a word in the process

Gonna ask for review from @huydhn as there are some changes to CI.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162659
Approved by: https://github.com/albanD, https://github.com/huydhn
2025-09-16 00:37:09 +00:00
de3a863cd8 AMD CPU CI - Add freezing + fix label trigger (#162176)
Added the following changes:

1. Added freezing by default for AMD CPU based CI (to follow pattern introduced by https://github.com/pytorch/pytorch/pull/152298 )
2. Fixed issue with label based CI triggers

Addresses code review comment in https://github.com/pytorch/pytorch/pull/161155

Pull Request resolved: https://github.com/pytorch/pytorch/pull/162176
Approved by: https://github.com/malfet, https://github.com/jeffdaily
2025-09-15 19:29:35 +00:00
fa919feab6 Revert "[lint][CI] Don't checkout submodules for lintrunner-noclang (#162844)"
This reverts commit 6b231af23d63ee543a81c32952138090bebcf61d.

Reverted https://github.com/pytorch/pytorch/pull/162844 on behalf of https://github.com/wdvr due to seems to be needed after all - failing lint ([comment](https://github.com/pytorch/pytorch/pull/162844#issuecomment-3293465058))
2025-09-15 18:43:53 +00:00
6b231af23d [lint][CI] Don't checkout submodules for lintrunner-noclang (#162844)
Shouldn't be needed?

Pull Request resolved: https://github.com/pytorch/pytorch/pull/162844
Approved by: https://github.com/huydhn
2025-09-15 17:29:31 +00:00
456fbeaa6d [xla hash update] update the pinned xla hash (#162947)
This PR is auto-generated nightly by [this action](https://github.com/pytorch/pytorch/blob/main/.github/workflows/nightly.yml).
Update the pinned xla hash.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162947
Approved by: https://github.com/pytorchbot
2025-09-15 11:42:02 +00:00
972140b7e9 [benchmark] Add HF LLM benchmarks (#156967)
Results in https://docs.google.com/spreadsheets/d/1xXOPg9JjEmPx0zc5QBNdyXQq8-K2_r4ybHaiS-q7pZ0/edit?gid=88695043#gid=88695043

Pull Request resolved: https://github.com/pytorch/pytorch/pull/156967
Approved by: https://github.com/huydhn

Co-authored-by: Huy Do <huydhn@gmail.com>
2025-09-14 07:41:06 +00:00