[aarch64] Add back NCCL lib to cuda arm wheel (#156888)

We discovered that the latest CUDA 12.9 arm nightly wheel is missing the NCCL library, so `import torch` fails. Because the wheel is built with USE_SYSTEM_NCCL=1, libnccl.so must be copied into the big wheel so that it can be dynamically linked at runtime.

https://github.com/pytorch/pytorch/pull/152835 enabled USE_SYSTEM_NCCL=1, so the build now uses the system NCCL by default instead of the one built into libtorch_cuda.so. This PR adds libnccl.so back to the wheel so that it is available at runtime. It also gives users the flexibility to use a different NCCL version from the one that came with the original PyTorch build.
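As a rough illustration of the copy step described above, here is a minimal sketch. The helper name `copy_libs_into_wheel` and the destination layout are assumptions made for illustration only; the actual packaging script may handle this differently (for example, it may also patch RPATHs).

```python
import shutil
from pathlib import Path


def copy_libs_into_wheel(lib_paths, wheel_lib_dir):
    """Copy shared libraries into a wheel's lib directory (hypothetical helper).

    Returns the file names of any libraries that were not found, so the
    caller can fail the build instead of shipping an incomplete wheel.
    """
    dest = Path(wheel_lib_dir)
    dest.mkdir(parents=True, exist_ok=True)
    missing = []
    for lib in map(Path, lib_paths):
        if lib.is_file():
            # copy2 follows symlinks by default, so a versioned symlink such as
            # libnccl.so.2 is copied as a regular file with the target's contents.
            shutil.copy2(lib, dest / lib.name)
        else:
            missing.append(lib.name)
    return missing
```

A caller could pass a list such as `["/usr/local/cuda/lib64/libnccl.so.2"]` and treat a non-empty return value as a build error.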

Related: https://github.com/pytorch/pytorch/issues/144768

```
Python 3.12.3 (main, Jun 18 2025, 17:59:45) [GCC 13.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.12/dist-packages/torch/__init__.py", line 417, in <module>
    from torch._C import *  # noqa: F403
    ^^^^^^^^^^^^^^^^^^^^^^
ImportError: libnccl.so.2: cannot open shared object file: No such file or directory
```
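A quick way to check for this failure mode without importing torch is to ask the dynamic loader directly. This is a diagnostic sketch rather than part of the PR; `can_load` is a hypothetical helper name, and it relies only on the standard-library `ctypes` module.

```python
import ctypes


def can_load(lib_name):
    """Return True if the dynamic loader can open lib_name, else False."""
    try:
        ctypes.CDLL(lib_name)
        return True
    except OSError:
        # Raised when the library cannot be found or cannot be loaded.
        return False


print("libnccl.so.2 loadable:", can_load("libnccl.so.2"))
```

If this reports that the library is not loadable in the same environment where `import torch` fails, the wheel (or the loader search path) is missing libnccl.so.2.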

Pull Request resolved: https://github.com/pytorch/pytorch/pull/156888
Approved by: https://github.com/atalman
Ting Lu
2025-06-26 10:24:12 +00:00
committed by PyTorch MergeBot
parent 18b01afa9e
commit de45c5f673

@@ -88,6 +88,7 @@ def package_cuda_wheel(wheel_path, desired_cuda) -> None:
 "/usr/local/cuda/lib64/libcusparseLt.so.0",
 "/usr/local/cuda/lib64/libcusolver.so.11",
 "/usr/local/cuda/lib64/libcurand.so.10",
+"/usr/local/cuda/lib64/libnccl.so.2",
 "/usr/local/cuda/lib64/libnvJitLink.so.12",
 "/usr/local/cuda/lib64/libnvrtc.so.12",
 "/usr/local/cuda/lib64/libcudnn_adv.so.9",