mirror of
https://github.com/pytorch/pytorch.git
synced 2025-10-20 12:54:11 +08:00
[SymmMem] Avoid library mismatch in CMake search (#157836)
Before, if NVSHMEM is installed at *BOTH* system location (e.g. `/usr/local`) and conda location (e.g. `/path/to/conda/lib/python3.10/site-packages/nvidia/nvshmem`, there can be a mismatch in where host lib and device lib are found: ``` -- NVSHMEM_HOME set to: '' -- NVSHMEM wheel installed at: '.conda/envs/pytorch-3.10/lib/python3.10/site-packages/nvidia/nvshmem' -- NVSHMEM_HOST_LIB: '/usr/local/lib/libnvshmem_host.so' -- NVSHMEM_DEVICE_LIB: '.conda/envs/pytorch-3.10/lib/python3.10/site-packages/nvidia/nvshmem/lib/libnvshmem_device.a' -- NVSHMEM_INCLUDE_DIR: '.conda/envs/pytorch-3.10/lib/python3.10/site-packages/nvidia/nvshmem/include' ``` The reason is that CMake prioritize name search over dir search. In the script below, CMake will search all locations for `libnvshmem_host.so` first, before it searches for `.so.3`. ``` find_library(NVSHMEM_HOST_LIB # In pip install case, the lib suffix is `.so.3` instead of `.so` NAMES nvshmem_host nvshmem_host.so.3 HINTS $ENV{NVSHMEM_HOME} ${NVSHMEM_PY_DIR} PATH_SUFFIXES lib lib64 cuda/lib cuda/lib64 lib/x64) ``` This PR adds the `NAMES_PER_DIR` flag, according to CMake's doc: > The NAMES_PER_DIR option tells this command to consider one directory at a time and search for all names in it. After this PR: ``` -- NVSHMEM_HOME set to: '' -- NVSHMEM wheel installed at: '.conda/envs/pytorch-3.10/lib/python3.10/site-packages/nvidia/nvshmem' -- NVSHMEM_HOST_LIB: '.conda/envs/pytorch-3.10/lib/python3.10/site-packages/nvidia/nvshmem/lib/libnvshmem_host.so.3' -- NVSHMEM_DEVICE_LIB: '.conda/envs/pytorch-3.10/lib/python3.10/site-packages/nvidia/nvshmem/lib/libnvshmem_device.a' -- NVSHMEM_INCLUDE_DIR: '.conda/envs/pytorch-3.10/lib/python3.10/site-packages/nvidia/nvshmem/include' ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/157836 Approved by: https://github.com/fegin, https://github.com/fduwjj ghstack dependencies: #157513, #157695
This commit is contained in:
@ -1001,7 +1001,7 @@ elseif(USE_CUDA)
|
||||
# 3. Let CMake find it in the default system paths, e.g. /usr/local.
|
||||
find_library(NVSHMEM_HOST_LIB
|
||||
# In pip install case, the lib suffix is `.so.3` instead of `.so`
|
||||
NAMES nvshmem_host nvshmem_host.so.3
|
||||
NAMES nvshmem_host libnvshmem_host.so.3 NAMES_PER_DIR
|
||||
HINTS $ENV{NVSHMEM_HOME} ${NVSHMEM_PY_DIR}
|
||||
PATH_SUFFIXES lib lib64 cuda/lib cuda/lib64 lib/x64
|
||||
DOC "The location of NVSHMEM host library.")
|
||||
|
Reference in New Issue
Block a user