Enable CUDA 12.4.1 (#132202)

Keeping a record of the steps before I lose track of them.

- 1st commit: similar to https://github.com/pytorch/builder/pull/1720
- 2nd commit: update the CUDA 12.4 CI versions from 12.4.0 to 12.4.1, mirroring the changes in https://github.com/pytorch/pytorch/pull/125944/files
- 3rd commit: update install_cuda_aarch64.sh for the aarch64 docker step
- 4th commit (aaa456e3e6): related to https://github.com/pytorch/pytorch/pull/121684
- Synchronization point: Meta helps upload the PyPI CUDA dependencies specified in .github/scripts/generate_binary_build_matrix.py
- The PyPI upload above is done (thanks Andrey!); restarted jobs such as https://github.com/pytorch/pytorch/actions/runs/10188203670/job/28369471321
- 5th commit (77532344e3): use temporary docker containers (generated from a previous successful container build). These containers would be rebuilt on merge, so they are being tested now.
- 6th commit (5f93c625b5): revert the 5th commit. Update: done, but seemingly unrelated failures (rocm/xpu/mps) still need debugging.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132202
Approved by: https://github.com/Skylion007, https://github.com/eqy, https://github.com/atalman
Authored by Wei Wang on 2024-08-20 17:52:50 +00:00
Committed by PyTorch MergeBot
parent e41b520ee3
commit 333890b701
7 changed files with 41 additions and 42 deletions

@@ -18,7 +18,7 @@ from typing import Dict, List, Optional, Tuple
 CUDA_ARCHES = ["11.8", "12.1", "12.4"]
-CUDA_ARCHES_FULL_VERSION = {"11.8": "11.8.0", "12.1": "12.1.1", "12.4": "12.4.0"}
+CUDA_ARCHES_FULL_VERSION = {"11.8": "11.8.0", "12.1": "12.1.1", "12.4": "12.4.1"}
 CUDA_ARCHES_CUDNN_VERSION = {"11.8": "9", "12.1": "9", "12.4": "9"}
@@ -68,18 +68,18 @@ PYTORCH_EXTRA_INSTALL_REQUIREMENTS = {
 "nvidia-nvtx-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64'"
 ),
 "12.4": (
-"nvidia-cuda-nvrtc-cu12==12.4.99; platform_system == 'Linux' and platform_machine == 'x86_64' | "
-"nvidia-cuda-runtime-cu12==12.4.99; platform_system == 'Linux' and platform_machine == 'x86_64' | "
-"nvidia-cuda-cupti-cu12==12.4.99; platform_system == 'Linux' and platform_machine == 'x86_64' | "
+"nvidia-cuda-nvrtc-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | "
+"nvidia-cuda-runtime-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | "
+"nvidia-cuda-cupti-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | "
 "nvidia-cudnn-cu12==9.1.0.70; platform_system == 'Linux' and platform_machine == 'x86_64' | "
-"nvidia-cublas-cu12==12.4.2.65; platform_system == 'Linux' and platform_machine == 'x86_64' | "
-"nvidia-cufft-cu12==11.2.0.44; platform_system == 'Linux' and platform_machine == 'x86_64' | "
-"nvidia-curand-cu12==10.3.5.119; platform_system == 'Linux' and platform_machine == 'x86_64' | "
-"nvidia-cusolver-cu12==11.6.0.99; platform_system == 'Linux' and platform_machine == 'x86_64' | "
-"nvidia-cusparse-cu12==12.3.0.142; platform_system == 'Linux' and platform_machine == 'x86_64' | "
+"nvidia-cublas-cu12==12.4.5.8; platform_system == 'Linux' and platform_machine == 'x86_64' | "
+"nvidia-cufft-cu12==11.2.1.3; platform_system == 'Linux' and platform_machine == 'x86_64' | "
+"nvidia-curand-cu12==10.3.5.147; platform_system == 'Linux' and platform_machine == 'x86_64' | "
+"nvidia-cusolver-cu12==11.6.1.9; platform_system == 'Linux' and platform_machine == 'x86_64' | "
+"nvidia-cusparse-cu12==12.3.1.170; platform_system == 'Linux' and platform_machine == 'x86_64' | "
 "nvidia-nccl-cu12==2.21.5; platform_system == 'Linux' and platform_machine == 'x86_64' | "
-"nvidia-nvtx-cu12==12.4.99; platform_system == 'Linux' and platform_machine == 'x86_64' | "
-"nvidia-nvjitlink-cu12==12.4.99; platform_system == 'Linux' and platform_machine == 'x86_64'"
+"nvidia-nvtx-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | "
+"nvidia-nvjitlink-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64'"
 ),
 }
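
Each value in PYTORCH_EXTRA_INSTALL_REQUIREMENTS is one long string of pinned requirements joined by " | ". When reviewing a version bump like this one, it can help to split that string back into individual pins. The following is a minimal sketch, not code from generate_binary_build_matrix.py: the helper name `split_requirements` is hypothetical, and `sample` copies only two of the 12.4 entries for brevity.

```python
from typing import List, Tuple


def split_requirements(joined: str) -> List[Tuple[str, str, str]]:
    """Split a ' | '-joined requirement string into (name, version, marker).

    Hypothetical helper for illustration only. It assumes every entry has
    the form 'name==version; marker', as the entries in this diff do.
    """
    pins = []
    for entry in joined.split("|"):
        # Everything before the first ';' is the pin, the rest is the marker.
        spec, _, marker = entry.partition(";")
        # The pin itself is 'name==version'.
        name, _, version = spec.strip().partition("==")
        pins.append((name.strip(), version.strip(), marker.strip()))
    return pins


# Two of the updated 12.4 entries, copied from the diff above.
sample = (
    "nvidia-cuda-nvrtc-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | "
    "nvidia-nvjitlink-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64'"
)

for name, version, marker in split_requirements(sample):
    print(f"{name} -> {version} [{marker}]")
```

Splitting on "|" is safe for the entries shown here because none of the environment markers contain that character. A marker that did would need a real requirement parser.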