Compare commits

..

6 Commits

Author SHA1 Message Date
1eba9b3aa3 change the test wheel to release wheel when release wheel available (#145884)
change the test wheel to release wheel when release wheel available (#145252)

change the test wheel to release wheel when release wheel available
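
In practice the change swaps the test index for the release index once release wheels are published. For example, with the XPU wheels used elsewhere in this compare:

```
# Before: release-candidate (test) wheels
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/test/xpu
# After: release wheels, once available
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/xpu
```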

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145252
Approved by: https://github.com/seemethere, https://github.com/atalman

Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
(cherry picked from commit 9003d81144fcda2d96814cf9126dbe2b9deb7de7)

Co-authored-by: Zheng, Zhaoqiong <zhaoqiong.zheng@intel.com>
2025-01-28 16:09:34 -08:00
2236df1770 [CUDA] Change slim-wheel libraries load order (#145662)
[CUDA] Change slim-wheel libraries load order (#145638)

There is no libnvjitlink in CUDA-11.x, so attempting to load it first aborts execution and prevents the script from preloading nvrtc.

Fixes issues reported in https://github.com/pytorch/pytorch/pull/145614#issuecomment-2613107072
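
A minimal sketch of the fix's idea (illustrative only, not the actual torch code): preload nvrtc before nvjitlink and guard each load separately, so a missing libnvjitlink on CUDA 11.x cannot abort the whole preload pass:

```
import ctypes

# Illustrative sonames; the real code globs for version-suffixed names.
for soname in ("libnvrtc.so.11.2", "libnvJitLink.so.12"):
    try:
        ctypes.CDLL(soname, mode=ctypes.RTLD_GLOBAL)
    except OSError:
        pass  # library absent for this CUDA version, e.g. no nvjitlink on 11.x
```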

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145638
Approved by: https://github.com/atalman, https://github.com/kit1980, https://github.com/malfet

Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
(cherry picked from commit 2a70de7e9257e3f8c2874a10e3612c8939b79867)

Co-authored-by: Wei Wang <weiwan@nvidia.com>
2025-01-24 14:54:25 -08:00
3207040966 [CD] Fix slim-wheel cuda_nvrtc import problem (#145614)
[CD] Fix slim-wheel cuda_nvrtc import problem (#145582)

Similar fix as: https://github.com/pytorch/pytorch/pull/144816

Fixes: https://github.com/pytorch/pytorch/issues/145580

Found during testing of https://github.com/pytorch/pytorch/issues/138340

Please note that both nvrtc and nvjitlink exist for CUDA 11.8, 12.4, and 12.6, hence we can safely remove the if statement: preloading applies to all supported CUDA versions.

CUDA 11.8 path:
```
(.venv) root@b4ffe5c8ac8c:/pytorch/.ci/pytorch/smoke_test# ls /.venv/lib/python3.12/site-packages/torch/lib/../../nvidia/cuda_nvrtc/lib
__init__.py  __pycache__  libnvrtc-builtins.so.11.8  libnvrtc-builtins.so.12.4  libnvrtc.so.11.2  libnvrtc.so.12
(.venv) root@b4ffe5c8ac8c:/pytorch/.ci/pytorch/smoke_test# ls /.venv/lib/python3.12/site-packages/torch/lib/../../nvidia/nvjitlink/lib
__init__.py  __pycache__  libnvJitLink.so.12
```
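
Given that layout, preloading reduces to a glob over the wheel-installed ``nvidia/<package>/lib`` directories. A rough sketch of the pattern (function name and structure are illustrative, not the exact torch implementation):

```
import ctypes
import glob
import os
import sys

def preload_wheel_lib(lib_folder: str, lib_pattern: str) -> None:
    # Search every site-packages entry for nvidia/<lib_folder>/lib/<pattern>
    # and load the first match eagerly, so later dlopen calls resolve to the
    # wheel-provided copy rather than whatever LD_LIBRARY_PATH points at.
    for entry in sys.path:
        matches = glob.glob(os.path.join(entry, "nvidia", lib_folder, "lib", lib_pattern))
        if matches:
            ctypes.CDLL(matches[0], mode=ctypes.RTLD_GLOBAL)
            return

# Same order as the final fix: nvrtc first, then nvjitlink.
preload_wheel_lib("cuda_nvrtc", "libnvrtc.so.*[0-9]")
preload_wheel_lib("nvjitlink", "libnvJitLink.so.*[0-9]")
```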

Test with rc 2.6 and CUDA 11.8:
```
python cudnn_test.py
2.6.0+cu118
---------------------------------------------SDPA-Flash---------------------------------------------
ALL GOOD
---------------------------------------------SDPA-CuDNN---------------------------------------------
ALL GOOD
```
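
The content of ``cudnn_test.py`` is not shown in this compare; a hypothetical stand-in that exercises the same two SDPA backends could look like this:

```
import torch
from torch.nn.attention import SDPBackend, sdpa_kernel

print(torch.__version__)
# Small half-precision attention inputs on the GPU.
q, k, v = (torch.randn(2, 8, 128, 64, device="cuda", dtype=torch.half) for _ in range(3))
for backend in (SDPBackend.FLASH_ATTENTION, SDPBackend.CUDNN_ATTENTION):
    with sdpa_kernel(backend):
        out = torch.nn.functional.scaled_dot_product_attention(q, k, v)
    print(backend.name, "ALL GOOD" if out.isfinite().all() else "FAILED")
```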

Thank you @nWEIdia for discovering this issue

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145582
Approved by: https://github.com/nWEIdia, https://github.com/eqy, https://github.com/kit1980, https://github.com/malfet

Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
(cherry picked from commit 9752c7c1c819ce9027806c20492adc235dddecd6)

Co-authored-by: atalman <atalman@fb.com>
2025-01-24 08:40:13 -08:00
ca3c3a63b8 [Release-Only] Remove ptx from Linux CUDA 12.6 binary builds (#145616)
CUDA 12.6: remove +PTX
2025-01-24 08:39:52 -08:00
7be6b5db47 Fix IndentationError of code example (#145525)
Fix IndentationError of code example (#145251)

I found there is an IndentationError when trying to copy-paste the example of inference with torch.compile.
This PR fixes the formatting.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145251
Approved by: https://github.com/mikaylagawarecki

Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
(cherry picked from commit fef92c9447c6786b095fdbada6cfe7280c510e59)

Co-authored-by: Zheng, Zhaoqiong <zhaoqiong.zheng@intel.com>
2025-01-24 09:16:57 -05:00
dcb8ad070f update get start xpu (#145286)
update get start xpu (#143183)

- Support new Intel client GPU on Windows [Intel® Arc™ B-Series graphics](https://www.intel.com/content/www/us/en/products/docs/discrete-gpus/arc/desktop/b-series/overview.html) and [Intel® Core™ Ultra Series 2 with Intel® Arc™ Graphics](https://www.intel.com/content/www/us/en/products/details/processors/core-ultra.html)
- Support vision/audio prebuilt wheels on Windows
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143183
Approved by: https://github.com/EikanWang, https://github.com/leslie-fang-intel, https://github.com/atalman, https://github.com/malfet

Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
(cherry picked from commit 465a1cfe2e8a49cb72df3bb33e78bf1572e13e51)

Co-authored-by: ZhaoqiongZ <106125927+ZhaoqiongZ@users.noreply.github.com>
2025-01-24 09:15:54 -05:00
3 changed files with 42 additions and 61 deletions


@@ -63,7 +63,7 @@ case ${CUDA_VERSION} in
         if [[ "$GPU_ARCH_TYPE" = "cuda-aarch64" ]]; then
             TORCH_CUDA_ARCH_LIST="9.0"
         else
-            TORCH_CUDA_ARCH_LIST="${TORCH_CUDA_ARCH_LIST};9.0+PTX"
+            TORCH_CUDA_ARCH_LIST="${TORCH_CUDA_ARCH_LIST};9.0"
         fi
         EXTRA_CAFFE2_CMAKE_FLAGS+=("-DATEN_NO_TEST=ON")
         ;;
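
For context on this hunk: the ``+PTX`` suffix additionally embeds forward-compatible PTX for that architecture alongside the native SASS, which enlarges the binary; dropping it keeps SASS only. A hypothetical build invocation showing the difference:

```
# With "+PTX": SASS for sm_90 plus PTX for compute_90 (larger binary,
# runnable on future architectures via JIT compilation).
TORCH_CUDA_ARCH_LIST="9.0+PTX" python setup.py bdist_wheel

# Without: SASS for sm_90 only (smaller binary, as in this release-only change).
TORCH_CUDA_ARCH_LIST="9.0" python setup.py bdist_wheel
```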


@@ -8,23 +8,24 @@ Hardware Prerequisite
    :widths: 50 50
    :header-rows: 1

-   * - Validated Hardware
-     - Supported OS
-   * - Intel® Data Center GPU Max Series
-     - Linux
-   * - Intel Client GPU
-     - Windows/Linux
+   * - Supported OS
+     - Validated Hardware
+   * - Linux
+     - Intel® Client GPUs / Intel® Data Center GPU Max Series
+   * - Windows
+     - Intel® Client GPUs
+   * - WSL2 (experimental feature)
+     - Intel® Client GPUs

-Intel GPUs support (Prototype) is ready in PyTorch* 2.5 for Intel® Data Center GPU Max Series and Intel® Client GPUs on both Linux and Windows, which brings Intel GPUs and the SYCL* software stack into the official PyTorch stack with consistent user experience to embrace more AI application scenarios.
+Intel GPUs support (Prototype) is ready in PyTorch* 2.6 for Intel® Client GPUs and Intel® Data Center GPU Max Series on both Linux and Windows, which brings Intel GPUs and the SYCL* software stack into the official PyTorch stack with consistent user experience to embrace more AI application scenarios.

 Software Prerequisite
 ---------------------

-Visit `PyTorch Installation Prerequisites for Intel GPUs <https://www.intel.com/content/www/us/en/developer/articles/tool/pytorch-prerequisites-for-intel-gpus.html>`_ for more detailed information regarding:
+To use PyTorch on Intel GPUs, you need to install the Intel GPUs driver first. For installation guide, visit `Intel GPUs Driver Installation <https://www.intel.com/content/www/us/en/developer/articles/tool/pytorch-prerequisites-for-intel-gpu/2-6.html#driver-installation>`_.
+Intel GPU Drivers are sufficient for binary installation, while building from source requires both Intel GPU Drivers and Intel® Deep Learning Essentials. Please refer to `PyTorch Installation Prerequisites for Intel GPUs <https://www.intel.com/content/www/us/en/developer/articles/tool/pytorch-prerequisites-for-intel-gpu/2-6.html>`_ for more information.

-#. Intel GPU driver installation
-#. Intel support package installation
-#. Environment setup

 Installation
 ------------
@@ -32,17 +33,13 @@ Installation
 Binaries
 ^^^^^^^^

-Platform Linux
-""""""""""""""
-Now that we have `Intel GPU Driver <https://www.intel.com/content/www/us/en/developer/articles/tool/pytorch-prerequisites-for-intel-gpu/2-6.html#driver-installation>`_ installed, use the following commands to install ``pytorch``, ``torchvision``, ``torchaudio`` on Linux.
+Now we have all the required packages installed and environment activated. Use the following commands to install ``pytorch``, ``torchvision``, ``torchaudio`` on Linux.

-For preview wheels
+For release wheels

 .. code-block::

-    pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/test/xpu
+    pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/xpu

 For nightly wheels
@@ -50,26 +47,13 @@ For nightly wheels

     pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/xpu

-Platform Windows
-""""""""""""""""
-Now we have all the required packages installed and environment activated. Use the following commands to install ``pytorch`` on Windows, build from source for ``torchvision`` and ``torchaudio``.
-
-For preview wheels
-.. code-block::
-
-    pip3 install torch --index-url https://download.pytorch.org/whl/test/xpu
-
-For nightly wheels
-.. code-block::
-
-    pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/xpu

 From Source
 ^^^^^^^^^^^

 Now that we have `Intel GPU Driver and Intel® Deep Learning Essentials <https://www.intel.com/content/www/us/en/developer/articles/tool/pytorch-prerequisites-for-intel-gpu/2-6.html>`_ installed. Follow guides to build ``pytorch``, ``torchvision``, ``torchaudio`` from source.

 Build from source for ``torch`` refer to `PyTorch Installation Build from source <https://github.com/pytorch/pytorch?tab=readme-ov-file#from-source>`_.
 Build from source for ``torchvision`` refer to `Torchvision Installation Build from source <https://github.com/pytorch/vision/blob/main/CONTRIBUTING.md#development-installation>`_.
@@ -86,11 +70,7 @@ To check if your Intel GPU is available, you would typically use the following code:

    import torch
    torch.xpu.is_available() # torch.xpu is the API for Intel GPU support

-If the output is ``False``, double check following steps below.
-
-#. Intel GPU driver installation
-#. Intel support package installation
-#. Environment setup
+If the output is ``False``, double check driver installation for Intel GPUs.

 Minimum Code Change
 -------------------
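
As a usage sketch of the availability check above (assuming a PyTorch build with XPU support), a script can fall back to CPU when no Intel GPU is detected:

```
import torch

device = torch.device("xpu" if torch.xpu.is_available() else "cpu")
x = torch.randn(8, 8, device=device)
print(device, x.sum().item())
```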
@@ -183,22 +163,22 @@ Inference with ``torch.compile``
    model = model.to("xpu")
    data = data.to("xpu")

-   for i in range(ITERS):
-   start = time.time()
-   with torch.no_grad():
-   model(data)
-   torch.xpu.synchronize()
-   end = time.time()
-   print(f"Inference time before torch.compile for iteration {i}: {(end-start)*1000} ms")
+   for i in range(ITERS):
+       start = time.time()
+       with torch.no_grad():
+           model(data)
+           torch.xpu.synchronize()
+       end = time.time()
+       print(f"Inference time before torch.compile for iteration {i}: {(end-start)*1000} ms")

-   model = torch.compile(model)
-   for i in range(ITERS):
-   start = time.time()
-   with torch.no_grad():
-   model(data)
-   torch.xpu.synchronize()
-   end = time.time()
-   print(f"Inference time after torch.compile for iteration {i}: {(end-start)*1000} ms")
+   model = torch.compile(model)
+   for i in range(ITERS):
+       start = time.time()
+       with torch.no_grad():
+           model(data)
+           torch.xpu.synchronize()
+       end = time.time()
+       print(f"Inference time after torch.compile for iteration {i}: {(end-start)*1000} ms")

    print("Execution finished")


@@ -316,20 +316,21 @@ def _load_global_deps() -> None:
     try:
         ctypes.CDLL(global_deps_lib_path, mode=ctypes.RTLD_GLOBAL)
-        # Workaround slim-wheel CUDA-12.4+ dependency bug in libcusparse by preloading nvjitlink
-        # In those versions of cuda cusparse depends on nvjitlink, but does not have rpath when
+        # Workaround slim-wheel CUDA dependency bugs in cusparse and cudnn by preloading nvjitlink
+        # and nvrtc. In CUDA-12.4+ cusparse depends on nvjitlink, but does not have rpath when
         # shipped as wheel, which results in OS picking wrong/older version of nvjitlink library
-        # if `LD_LIBRARY_PATH` is defined
-        # See https://github.com/pytorch/pytorch/issues/138460
-        if version.cuda not in ["12.4", "12.6"]:  # type: ignore[name-defined]
-            return
+        # if `LD_LIBRARY_PATH` is defined, see https://github.com/pytorch/pytorch/issues/138460
+        # Similar issue exist in cudnn that dynamically loads nvrtc, unaware of its relative path.
+        # See https://github.com/pytorch/pytorch/issues/145580
         try:
             with open("/proc/self/maps") as f:
                 _maps = f.read()
             # libtorch_global_deps.so always depends in cudart, check if its installed via wheel
             if "nvidia/cuda_runtime/lib/libcudart.so" not in _maps:
                 return
-            # If all abovementioned conditions are met, preload nvjitlink
+            # If all above-mentioned conditions are met, preload nvrtc and nvjitlink
+            # Please note that order are important for CUDA-11.8 , as nvjitlink does not exist there
+            _preload_cuda_deps("cuda_nvrtc", "libnvrtc.so.*[0-9]")
             _preload_cuda_deps("nvjitlink", "libnvJitLink.so.*[0-9]")
         except Exception:
             pass
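
One hedged way to verify which copies were actually mapped after the preload, reading ``/proc/self/maps`` (Linux only) just as the code above does:

```
import torch  # importing torch triggers _load_global_deps()

seen = set()
with open("/proc/self/maps") as f:
    for line in f:
        path = line.rstrip("\n").split(maxsplit=5)[-1]
        if ("nvrtc" in path or "nvJitLink" in path) and path not in seen:
            seen.add(path)
            print(path)
```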