pytorch

mirror of https://github.com/pytorch/pytorch.git synced 2025-10-20 21:14:14 +08:00

Author	SHA1	Message	Date
Huy Do	4400c5d31e	Continue to build nightly CUDA 12.9 for internal (#163029 ) Revert part of https://github.com/pytorch/pytorch/pull/161916 to continue building CUDA 12.9 nightly Pull Request resolved: https://github.com/pytorch/pytorch/pull/163029 Approved by: https://github.com/malfet	2025-10-11 08:26:47 +00:00
Jithun Nair	0ec0120b19	Move aws OIDC credentials steps into setup-rocm.yml (#164769 ) The AWS ECR login step needs `id-token: write` permissions. We move the steps to get OIDC-based credentials from `_rocm-test.yml` to `setup-rocm.yml`. This lays the groundwork to enable access to AWS ECR in workflows in other repos such as torchtitan that use [linux_job_v2.yml](https://github.com/pytorch/test-infra/blob/main/.github/workflows/linux_job_v2.yml), which also uses [setup-rocm.yml](`335f4f80a0/.github/workflows/linux_job_v2.yml (L168)`). Any caller workflows that eventually execute `setup-rocm` action will thus need to provide the `id-token: write` permission. Pull Request resolved: https://github.com/pytorch/pytorch/pull/164769 Approved by: https://github.com/huydhn	2025-10-10 21:24:29 +00:00
Wei Wang	773c6762b8	[CD][CUDA13][NCCL] Fix nccl version typo for cu13 (#164383 ) https://pypi.org/project/nvidia-nccl-cu13/#history does not have 2.27.5 but 2.27.7+. Companion PR: https://github.com/pytorch/pytorch/pull/164352 Fixes a potential binary breakage due to non-existence of referenced NCCL cu13 version. Pull Request resolved: https://github.com/pytorch/pytorch/pull/164383 Approved by: https://github.com/tinglvv, https://github.com/Skylion007, https://github.com/atalman	2025-10-01 21:32:25 +00:00
albanD	2610746375	Revert nccl upgrade back to 2.27.5 (#164352 ) Revert https://github.com/pytorch/pytorch/pull/162351 as it breaks H100 Pull Request resolved: https://github.com/pytorch/pytorch/pull/164352 Approved by: https://github.com/atalman, https://github.com/malfet	2025-10-01 15:27:40 +00:00
Aaron Gokaslan	5504a06e01	[BE]: Update NCCL to 2.28.3 (#162351 ) @eqy New NCCL has a bunch of bugfixes for features, including reducing the number of SMs needed by NVLINK collectives, as well as some very useful new APIs for SymmetricMemory. Also allows FP8 support for non-reductive operations on pre-sm90 devices. Pull Request resolved: https://github.com/pytorch/pytorch/pull/162351 Approved by: https://github.com/ezyang, https://github.com/malfet, https://github.com/atalman	2025-09-28 01:38:59 +00:00
Jeff Daily	f1260c9b9a	[ROCm][CI/CD] upgrade nightly wheels to ROCm 7.0 (#163937 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/163937 Approved by: https://github.com/jeffdaily Co-authored-by: Jeff Daily <jeff.daily@amd.com>	2025-09-26 21:42:09 +00:00
Jithun Nair	0ec946a052	[ROCm] Increase binary build timeout to 5 hours (300 minutes) (#163776 ) Despite narrowing down the [FBGEMM_GENAI build to gfx942](https://github.com/pytorch/pytorch/pull/162648), the nightly builds still timed out because they [didn't get enough time to finish the post-PyTorch-build steps](https://github.com/pytorch/pytorch/actions/runs/17969771026/job/51109432897). This PR increases timeout for ROCm builds for both [libtorch ](https://github.com/pytorch/pytorch/actions/runs/17969771026)and [manywheel](https://github.com/pytorch/pytorch/actions/runs/17969771041), because both of those are close to the 4hr mark currently. This PR is a more ROCm-targeted version of https://github.com/pytorch/pytorch/pull/162880 (which is for release/2.9 branch). Pull Request resolved: https://github.com/pytorch/pytorch/pull/163776 Approved by: https://github.com/jeffdaily Co-authored-by: Jeff Daily <jeff.daily@amd.com>	2025-09-24 23:02:08 +00:00
Ting Lu	bb1d53bc47	[CD] CUDA 13 specific followup changes (#162455 ) Follow up for CUDA 13 bring up https://github.com/pytorch/pytorch/issues/159779 sm50-70 should not be added to sbsa build arch list, as previous archs had no support for arm. remove platform_machine from PYTORCH_EXTRA_INSTALL_REQUIREMENTS Pull Request resolved: https://github.com/pytorch/pytorch/pull/162455 Approved by: https://github.com/atalman	2025-09-11 00:03:47 +00:00
Ke Wen	8922bbcaab	Use same NVSHMEM version across CUDA builds (#162206 ) #161321 bumped NVSHMEM version to 3.3.24 for CUDA 13, leaving CUDA 12 with 3.3.20. This PR bumps the NVSHMEM version to 3.3.24 for CUDA 12 as well. Pull Request resolved: https://github.com/pytorch/pytorch/pull/162206 Approved by: https://github.com/tinglvv, https://github.com/Skylion007	2025-09-09 20:59:50 +00:00
PyTorch MergeBot	5ccf3ca3ec	Revert "Use same NVSHMEM version across CUDA builds (#162206 )" This reverts commit 0d9c95cd7ee299e2e8c09df26d395be8775b506b. Reverted https://github.com/pytorch/pytorch/pull/162206 on behalf of https://github.com/malfet due to Broke lint, see `4dd73e659a/1` ([comment](https://github.com/pytorch/pytorch/pull/162206#issuecomment-3271040521))	2025-09-09 14:40:45 +00:00
Ke Wen	0d9c95cd7e	Use same NVSHMEM version across CUDA builds (#162206 ) #161321 bumped NVSHMEM version to 3.3.24 for CUDA 13, leaving CUDA 12 with 3.3.20. This PR bumps the NVSHMEM version to 3.3.24 for CUDA 12 as well. Pull Request resolved: https://github.com/pytorch/pytorch/pull/162206 Approved by: https://github.com/tinglvv, https://github.com/Skylion007	2025-09-09 08:52:27 +00:00
Eddie Yan	145a3a7bda	[CUDA 13][cuDNN] Bump CUDA 13 to cuDNN 9.13.0 (#162268 ) Fixes some `d_qk` != `d_v` cases on Hopper that are broken by cuDNN 9.11-9.12 Pull Request resolved: https://github.com/pytorch/pytorch/pull/162268 Approved by: https://github.com/drisspg, https://github.com/Skylion007	2025-09-06 01:59:03 +00:00
atalman	bffc7dd1f3	[CD] Add cuda 13.0 libtorch builds, remove CUDA 12.9 builds (#161916 ) Related to https://github.com/pytorch/pytorch/issues/159779 Adding CUDA 13.0 libtorch builds, followup after https://github.com/pytorch/pytorch/pull/160956 Removing CUDA 12.9 builds, See https://github.com/pytorch/pytorch/issues/159980 Pull Request resolved: https://github.com/pytorch/pytorch/pull/161916 Approved by: https://github.com/jeanschmidt, https://github.com/Skylion007 Co-authored-by: Ting Lu <tingl@nvidia.com>	2025-09-05 07:47:54 +00:00
PyTorch MergeBot	6b8b3ac440	Revert "[ROCm] Use MI325 (gfx942) runners for binary smoke testing (#162044 )" This reverts commit cd529b686d54bbaa443f5b310140de48422d96c7. Reverted https://github.com/pytorch/pytorch/pull/162044 on behalf of https://github.com/jeffdaily due to mi200 backlog is purged, and mi300 runners are failing in GHA download ([comment](https://github.com/pytorch/pytorch/pull/162044#issuecomment-3254427869))	2025-09-04 16:06:30 +00:00
Jithun Nair	cd529b686d	[ROCm] Use MI325 (gfx942) runners for binary smoke testing (#162044 ) ### Motivation * MI250 Cirrascale runners are currently having network timeout leading to huge queueing of binary smoke test jobs: <img width="483" height="133" alt="image" src="https://github.com/user-attachments/assets/17293002-78ad-4fc9-954f-ddd518bf0a43" /> * MI210 Hollywood runners (with runner names such as `pytorch-rocm-hw-`) are not suitable for these jobs, because they seem to take much longer to download artifacts: https://github.com/pytorch/pytorch/pull/153287#issuecomment-2918420345 (this is why these jobs were specifically targeting Cirrascale runners). However, it doesn't seem like Cirrascale runners are necessarily doing much better either e.g. [this recent build](https://github.com/pytorch/pytorch/actions/runs/17332256791/job/49231006755). Moving to MI325 runners should address the stability part at least, while also reducing load on limited MI2xx runner capacity. * However, I'm not sure if the MI325 runners will do any better on the artifact download part (this may need to be investigated more) cc @amdfaa * Also removing `ciflow/binaries` and `ciflow/binaries_wheel` label/tag triggers for `generated-linux-binary-manywheel-rocm-main.yml` because we already trigger ROCm binary build/test jobs via these labels/tags in `generated-linux-binary-manywheel-nightly.yml`. And for developers who want to trigger ROCm binary build/test jobs on their PRs, they can use the `ciflow/rocm-mi300` label/tag as per this PR. ### TODOs (cc @amdfaa): * Check that the workflow runs successfully on the MI325 runners in this PR. Note how long the test jobs take esp. the "Download Build Artifacts" step * Once this PR is merged, clear the queue of jobs targeting `linux.rocm.gpu.mi250` Pull Request resolved: https://github.com/pytorch/pytorch/pull/162044 Approved by: https://github.com/jeffdaily Co-authored-by: Jeff Daily <jeff.daily@amd.com>	2025-09-03 18:34:07 +00:00
Wang, Chuanqi	793fc12aff	[CD] Fix setup-xpu action issue (#161934 ) Fix XPU CD test failure, refer https://github.com/pytorch/pytorch/actions/runs/17370923627/job/49315624191 Pull Request resolved: https://github.com/pytorch/pytorch/pull/161934 Approved by: https://github.com/atalman	2025-09-02 16:03:44 +00:00
Wang, Chuanqi	06c7516994	[BE] Upgrade XPU support package to 2025.2 (#158733 ) Including below changes, - Add XPU support package 2025.2 build and test in CI for both Linux and Windows - Keep XPU support package 2025.1 build in CI to ensure no break issue until PyTorch 2.9 release - Upgrade XPU support package from 2025.1 to 2025.2 in CD for both Linux and Windows - Rename Linux CI job name & image name to n & n-1 - Update XPU runtime pypi packages dependencies of CD wheels - Remove deprecated support package version docker image build Pull Request resolved: https://github.com/pytorch/pytorch/pull/158733 Approved by: https://github.com/EikanWang, https://github.com/atalman	2025-08-27 19:33:38 +00:00
Ting Lu	ae8d319fd4	Update NVSHMEM to 3.3.24 and fix download link (#161321 ) https://github.com/pytorch/pytorch/issues/159779 Update NVSHMEM 3.3.24 for [PyTorch CUDA13 Binary Cannot Be Built with SM_75 with NVSHMEM](https://github.com/pytorch/pytorch/issues/160980) Enabled back sm_75 for NVSHMEM Fixed the NVSHMEM download link for the issue with 3.3.20 download in issue - [[CD] nvshem-3.3.9 wheels for aarch64 is not manylinux2_28 compliant](https://github.com/pytorch/pytorch/issues/160425) Todo: Should also enable back build ARM with NVSHMEM since it is compatible with manylinux2_28 Pull Request resolved: https://github.com/pytorch/pytorch/pull/161321 Approved by: https://github.com/Skylion007, https://github.com/atalman	2025-08-26 13:26:18 +00:00
atalman	1a566c4909	Remove Python 3.9 nightly builds (#161427 ) Please see https://github.com/pytorch/pytorch/issues/161167 Pull Request resolved: https://github.com/pytorch/pytorch/pull/161427 Approved by: https://github.com/huydhn	2025-08-25 22:05:40 +00:00
Ting Lu	49ff884b1e	Add CUDA 13.0 x86 builds (#160956 ) https://github.com/pytorch/pytorch/issues/159779 CUDA 13.0.0 NVSHMEM 3.3.20 CUDNN 9.12.0.46 Adding x86 linux builds for CUDA 13. Adding libtorch docker. Package naming changed for CUDA 13 (removed postfix -cu13 for some packages). Preparation checklist: 1. Update index https://download.pytorch.org/whl/nightly/cu130 with pypi packages 2. Update packaging name based on https://pypi.org/project/cuda-toolkit/ metadata Pull Request resolved: https://github.com/pytorch/pytorch/pull/160956 Approved by: https://github.com/atalman Co-authored-by: atalman <atalman@fb.com>	2025-08-22 11:31:09 +00:00
Nikita Shulga	e1a64b75ff	[CD] Delete full builds (#161075 ) As they are no longer needed for Colab, see https://github.com/googlecolab/colabtools/issues/5508#issuecomment-3200871941 and [<img width="896" height="128" alt="image" src="https://github.com/user-attachments/assets/a287393c-bde7-4e10-99bf-2e0d66346efe" /> ](https://colab.research.google.com/drive/1YJ5Y0xsApXSewM1cQwWQ_AS3A77vytgq) Fixes https://github.com/pytorch/pytorch/issues/160972 Pull Request resolved: https://github.com/pytorch/pytorch/pull/161075 Approved by: https://github.com/atalman	2025-08-20 19:40:15 +00:00
Nikita Shulga	7bd4cfaef4	[BE] Update nvshem dependency to 3.3.20 (#160458 ) Which is manylinux2_28 compatible, even on the aarch64 platform. Archive contents and URL pattern changed quite drastically between 3.3.9 and 3.3.20, but hopefully it still works. Package `libnvshmem_host.so.3` into gigantic aarch64+CUDA wheel. Should fix https://github.com/pytorch/pytorch/issues/160425 Pull Request resolved: https://github.com/pytorch/pytorch/pull/160458 Approved by: https://github.com/Skylion007, https://github.com/kwen2501, https://github.com/nWEIdia, https://github.com/atalman, https://github.com/tinglvv	2025-08-16 02:00:57 +00:00
PyTorch MergeBot	c015e53d37	Revert "[BE] Update nvshem dependency to 3.3.20 (#160458 )" This reverts commit e0488d9f00865fb56c931580c80e099771c6285e. Reverted https://github.com/pytorch/pytorch/pull/160458 on behalf of https://github.com/wdvr due to need to rerun workflow generation (failing workflow-checks) ([comment](https://github.com/pytorch/pytorch/pull/160458#issuecomment-3193133706))	2025-08-16 01:47:42 +00:00
Nikita Shulga	e0488d9f00	[BE] Update nvshem dependency to 3.3.20 (#160458 ) Which is manylinux2_28 compatible, even on the aarch64 platform. Archive contents and URL pattern changed quite drastically between 3.3.9 and 3.3.20, but hopefully it still works. Package `libnvshmem_host.so.3` into gigantic aarch64+CUDA wheel. Should fix https://github.com/pytorch/pytorch/issues/160425 Pull Request resolved: https://github.com/pytorch/pytorch/pull/160458 Approved by: https://github.com/Skylion007, https://github.com/kwen2501, https://github.com/nWEIdia, https://github.com/atalman, https://github.com/tinglvv	2025-08-16 00:50:13 +00:00
Nikita Shulga	d0226719a9	[BE][EZ] Delete remains of split-build logic (#159990 ) Hopefully last piece of https://github.com/pytorch/pytorch/issues/138750 Pull Request resolved: https://github.com/pytorch/pytorch/pull/159990 Approved by: https://github.com/atalman ghstack dependencies: #159986	2025-08-07 01:59:30 +00:00
atalman	26d045bb60	Linux py 3.14 wheel builds (#157559 ) Related to https://github.com/pytorch/pytorch/issues/156856 Pull Request resolved: https://github.com/pytorch/pytorch/pull/157559 Approved by: https://github.com/malfet, https://github.com/albanD	2025-08-04 20:55:19 +00:00
Aaron Gokaslan	476874b37f	[BE]: Update NCCL to 2.27.5 (#157108 ) Update NCCL to 2.27.5. Minor version, improves Blackwell, Symmem FP8 support, and fixes a bug with MNVVL. Pull Request resolved: https://github.com/pytorch/pytorch/pull/157108 Approved by: https://github.com/atalman	2025-07-08 15:40:54 +00:00
Andrey Talman	7275f28045	Fix cuda 12.9 aarch64 GPU builds. Update CUDA_STABLE variable. (#157630 ) This contains 2 fixes that are required in main and will need to be cherry-picked to the Release 2.8 branch: 1. The PR https://github.com/pytorch/pytorch/pull/155819 did not include the triton change. 2. The CUDA STABLE variable needs to be set to 12.8. Updating CUDA stable updates the full static build. Pull Request resolved: https://github.com/pytorch/pytorch/pull/157630 Approved by: https://github.com/Skylion007, https://github.com/jeanschmidt	2025-07-04 18:08:31 +00:00
Aaron Gokaslan	a6fab82b16	[BE]: Fix NVSHMEM builds, add missing 12.9 dependency and update to latest for 2.8RC (#157453 ) Fixed our bad builds of nvshmem (we were not building or testing before) and also updates to the latest version. The newest version has critical support for things that would actually make it useful, like bfloat16 and float16 support. This is a proper fix for: https://github.com/pytorch/pytorch/pull/157411 Pull Request resolved: https://github.com/pytorch/pytorch/pull/157453 Approved by: https://github.com/kwen2501, https://github.com/atalman	2025-07-03 22:55:18 +00:00
Aaron Gokaslan	a317c63d1b	[BE]: Update NCCL to 2.27.3 (#155233 ) Fixes: https://github.com/pytorch/pytorch/issues/155052 and https://github.com/pytorch/pytorch/issues/153517 This upgrade is needed to effectively use those symmetric memory kernels anyway. Also fixes some nasty NCCL bugs. Pull Request resolved: https://github.com/pytorch/pytorch/pull/155233 Approved by: https://github.com/nWEIdia, https://github.com/kwen2501, https://github.com/atalman, https://github.com/eqy	2025-06-14 19:20:31 +00:00
Jithun Nair	794ef6c9b8	Enable manywheel build and smoke test on main branch for ROCm (#153287 ) Fixes issue of not discovering breakage of ROCm wheel builds until the nightly job runs e.g. https://github.com/pytorch/pytorch/pull/153253 Pull Request resolved: https://github.com/pytorch/pytorch/pull/153287 Approved by: https://github.com/jeffdaily	2025-06-14 19:14:31 +00:00
PyTorch MergeBot	4574b39aa4	Revert "[BE]: Sync cusparselt 12.9 with static build and other cuda 12 (#155709 )" This reverts commit bbbced94a43cf764ddfe719e7d4c161a3992830c. Reverted https://github.com/pytorch/pytorch/pull/155709 on behalf of https://github.com/clee2000 due to broke lint [GH job link](https://github.com/pytorch/pytorch/actions/runs/15645591737/job/44082402642) [HUD commit link](`bbbced94a4`) landrace with 155819? easy forward fix but its the end of the week so idk when id get a review ([comment](https://github.com/pytorch/pytorch/pull/155709#issuecomment-2972094849))	2025-06-14 01:43:16 +00:00
PyTorch MergeBot	d7e3c9ce82	Revert "Enable manywheel build and smoke test on main branch for ROCm (#153287 )" This reverts commit 3b6569b1ef4b9ff25f5b75fe0a216d6d084d573f. Reverted https://github.com/pytorch/pytorch/pull/153287 on behalf of https://github.com/clee2000 due to broke lint [GH job link](https://github.com/pytorch/pytorch/actions/runs/15646152483/job/44083912145) [HUD commit link](`3b6569b1ef`) ([comment](https://github.com/pytorch/pytorch/pull/153287#issuecomment-2972088294))	2025-06-14 01:32:27 +00:00
Jithun Nair	3b6569b1ef	Enable manywheel build and smoke test on main branch for ROCm (#153287 ) Fixes issue of not discovering breakage of ROCm wheel builds until the nightly job runs e.g. https://github.com/pytorch/pytorch/pull/153253 Pull Request resolved: https://github.com/pytorch/pytorch/pull/153287 Approved by: https://github.com/jeffdaily	2025-06-14 00:05:57 +00:00
Aaron Gokaslan	bbbced94a4	[BE]: Sync cusparselt 12.9 with static build and other cuda 12 (#155709 ) followup for https://github.com/pytorch/pytorch/pull/154980 Pull Request resolved: https://github.com/pytorch/pytorch/pull/155709 Approved by: https://github.com/tinglvv, https://github.com/atalman, https://github.com/nWEIdia, https://github.com/cyyever	2025-06-13 23:10:01 +00:00
Aaron Gokaslan	9cced33c7c	[BE]: Update cudnn to 9.10.2.21 (#155576 ) Update to CUDNN 9.10.2.21 Pull Request resolved: https://github.com/pytorch/pytorch/pull/155576 Approved by: https://github.com/eqy, https://github.com/atalman	2025-06-12 12:50:36 +00:00
PyTorch MergeBot	f59c76b549	Revert "[BE]: Update cudnn to 9.10.2.21 (#155576 )" This reverts commit 2d3615f577894c7a117a55e85bb8371bb598ec50. Reverted https://github.com/pytorch/pytorch/pull/155576 on behalf of https://github.com/malfet due to breaks the same test again (I remember there were a version that adjusted tolerances), see `bc3972b80a/1` ([comment](https://github.com/pytorch/pytorch/pull/155576#issuecomment-2964404710))	2025-06-11 22:03:45 +00:00
Aaron Gokaslan	2d3615f577	[BE]: Update cudnn to 9.10.2.21 (#155576 ) Update to CUDNN 9.10.2.21 Pull Request resolved: https://github.com/pytorch/pytorch/pull/155576 Approved by: https://github.com/eqy, https://github.com/atalman	2025-06-11 20:32:07 +00:00
Ting Lu	4c3da611c2	Add CUDA 12.9.1 x86 nightly binaries (#154980 ) Adding CUDA 12.9.1 to nightly binaries matrix for linux (x86) builds. Add sbsa and libtorch build docker images, builds addition will be follow-up PRs. Pull Request resolved: https://github.com/pytorch/pytorch/pull/154980 Approved by: https://github.com/eqy, https://github.com/atalman	2025-06-11 13:43:17 +00:00
Wang, Chuanqi	eaceb243df	[BE] Update the XPU support package to 2025.1.3 (#154346 ) Fixes #153632 Pull Request resolved: https://github.com/pytorch/pytorch/pull/154346 Approved by: https://github.com/EikanWang, https://github.com/atalman	2025-06-11 09:46:18 +00:00
atalman	8153340d10	[CI/CD] Remove CUDA 11.8 builds (#155509 ) This removes CUDA 11.8 from CI/CD Please see: https://github.com/pytorch/pytorch/issues/147383 TODO: Will followup of cleaning CUDA 11.8 config from scripts Pull Request resolved: https://github.com/pytorch/pytorch/pull/155509 Approved by: https://github.com/cyyever, https://github.com/huydhn, https://github.com/malfet	2025-06-10 05:16:41 +00:00
Aaron Gokaslan	3863bbb55b	[BE]: Update cusparselt to 0.7.1 (#155232 ) Needed to support sparse operations on Blackwell, and implements new features for the library. Also optimizes library sizes vs 0.7 Pull Request resolved: https://github.com/pytorch/pytorch/pull/155232 Approved by: https://github.com/nWEIdia, https://github.com/malfet	2025-06-09 18:01:23 +00:00
PyTorch MergeBot	9656251bb1	Revert "[BE] Update cudnn to 9.10.1.4 (#155122 )" This reverts commit a14f427db68e54500ef4cd9ed34cb9537263bb74. Reverted https://github.com/pytorch/pytorch/pull/155122 on behalf of https://github.com/malfet due to Looks like it breaks a bunch of tests, see `36a722e20d/1` ([comment](https://github.com/pytorch/pytorch/pull/155122#issuecomment-2949209801))	2025-06-06 13:03:49 +00:00
Aaron Gokaslan	a14f427db6	[BE] Update cudnn to 9.10.1.4 (#155122 ) Follow up to #152782 Pull Request resolved: https://github.com/pytorch/pytorch/pull/155122 Approved by: https://github.com/malfet, https://github.com/atalman	2025-06-05 16:07:25 +00:00
Ke Wen	34c6371d24	Add NVSHMEM to PYTORCH_EXTRA_INSTALL_REQUIREMENTS (#154568 ) NVSHMEM 3.2.5 (released Mar 2025) has both cu11 and cu12 builds. See: https://pypi.nvidia.com/nvidia-nvshmem-cu12/ https://pypi.nvidia.com/nvidia-nvshmem-cu11/ Pull Request resolved: https://github.com/pytorch/pytorch/pull/154568 Approved by: https://github.com/atalman ghstack dependencies: #154538	2025-06-04 17:43:24 +00:00
Ting Lu	bab59d3c28	Upgrade to CUDA 12.8.1 for nightly binaries (#152923 ) Upgrade current CUDA 12.8 builds to 12.8.1 Pull Request resolved: https://github.com/pytorch/pytorch/pull/152923 Approved by: https://github.com/atalman	2025-05-23 22:37:05 +00:00
Wang, Chuanqi	c92ea3bc98	[BE] Upgrade XPU support package to 2025.1 in CICD (#151899 ) Address #151097. Including below changes, - Add XPU support package 2025.1 build and test in CI for both Linux and Windows - Keep XPU support package 2025.0 build in CI to ensure no break issue until PyTorch 2.8 release - Upgrade XPU support package from 2025.0 to 2025.1 in CD for both Linux and Windows - Enable XCCL in Linux CD wheel and oneMKL integration in both Linux and Windows - Update XPU runtime pypi packages of CD wheels - Remove deprecated support package version docker image build Pull Request resolved: https://github.com/pytorch/pytorch/pull/151899 Approved by: https://github.com/EikanWang, https://github.com/atalman	2025-05-14 20:21:09 +00:00
Ting Lu	7f79222992	Upgrade to NCCL 2.26.5 for CUDA 12 (#152810 ) Upgrade NCCL to latest 2.26.5 Pull Request resolved: https://github.com/pytorch/pytorch/pull/152810 Approved by: https://github.com/eqy, https://github.com/albanD, https://github.com/nWEIdia, https://github.com/atalman, https://github.com/cyyever	2025-05-14 00:52:50 +00:00
Catherine Lee	e05ac9b794	Use folder tagged docker images for binary builds (#151706 ) Should be the last part of https://github.com/pytorch/pytorch/pull/150558, except for maybe s390x stuff, which I'm still not sure what's going on there For binary builds, do the thing like we do in CI where we tag each image with a hash of the .ci/docker folder to ensure a docker image built from that commit gets used. Previously it would use imagename:arch-main, which could be a version of the image based on an older commit After this, changing a docker image and then tagging with ciflow/binaries on the same PR should use the new docker images Release and main builds should still pull from docker io Cons: * if someone rebuilds the image from main or a PR where the hash is the same (ex folder is unchanged, but retrigger docker build for some reason), the release would use that image instead of one built on the release branch * spin wait for docker build to finish Pull Request resolved: https://github.com/pytorch/pytorch/pull/151706 Approved by: https://github.com/atalman	2025-04-22 21:50:10 +00:00
Jithun Nair	b4550541ea	[ROCm] upgrade nightly wheels to rocm6.4 (#151355 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/151355 Approved by: https://github.com/jeffdaily Co-authored-by: Jeff Daily <jeff.daily@amd.com>	2025-04-17 17:29:07 +00:00


196 Commits