Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16911
I think the Thrust package has want we want for /opt/rocm/include/thrust. We probably can stop patching it now.
Reviewed By: bddppq
Differential Revision: D14015177
fbshipit-source-id: 8d9128783a790c39083a1b8b4771c2c18bd67d46
Summary:
* we do not need EAP packages any longer as the antistatic feature is now in the release
* consistently install the rccl package
* Skip one unit test that has regressed with 2.1
* Follow-up PRs will use 2.1 features once deployed on CI
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16808
Differential Revision: D13992645
Pulled By: bddppq
fbshipit-source-id: 37ca9a1f104bb140bd2b56d403e32f04c4fbf4f0
Summary:
Drop custom hcc/hip as the 1.9.2 release should contain the relevant patches therein.
Most notable feature in 1.9.2 is mixed precision support in rocBLAS and MIOpen. These features will be enabled by subsequent PRs.
bddppq ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14216
Differential Revision: D13354294
Pulled By: bddppq
fbshipit-source-id: 2541d4a196af21c9432c1aff7f6e65b572628028
Summary:
1) Use the hip-thrust version of Thrust as opposed to the GH master. (ROCm 267)
2) CentOS 7.5 docker (ROCm 279)
* Always install the libraries at docker creation for ubuntu.
* Add Dockerfile for CentOS ROCm
* Enable the centos build
* Source devtoolset in bashrc
* Set locales correctly depending on whether we are on Ubuntu or CentOS
* Install a newer cmake for CentOS
* Checkout thrust as there is no package for CentOS yet.
PyTorch/Caffe2 on ROCm passed tests: https://github.com/ROCmSoftwarePlatform/pytorch/pull/280
For attention: bddppq ezyang
Docker rebuild for Ubuntu not urgent (getting rid of Thrust checkout and package install is mainly cosmetic). If docker for CentOS 7.5 is wanted, build is necessary. Build of PyTorch tested by me in CentOS docker. PyTorch unit tests work mostly, however, a test in test_jit causes a python recursion error that seems to be due to the python2 on CentOS as we haven't ever seen this on Ubuntu - hence please do not enable unit tests.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12899
Differential Revision: D13029424
Pulled By: bddppq
fbshipit-source-id: 1ca8f4337ec6a603f2742fc81046d5b8f8717c76
Summary:
* switches docker files over to white rabbit release - removed custom package installs
* skips five tests that regressed in that release
* fixes some case-sensitivity issues in ROCm supplied cmake files by sed'ing them in the docker
* includes first changes to the infrastructure to support upcoming hip-clang compiler
* prints ROCm library versions as part of the build (as discussed w/ ezyang )
* explicitly searches for miopengemm
* installs the new hip-thrust package to be able to remove the explicit Thrust checkout in a future revision
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12577
Differential Revision: D10350165
Pulled By: bddppq
fbshipit-source-id: 60f9c9caf04a48cfa90f4c37e242d944a175ab31
Summary:
* purge hcSPARSE now that rocSPARSE is available
* integrate a custom hcc and HIP
* hcc brings two important compiler fixes (fixes hundreds of unit tests)
* HIP brings a smart dispatcher that allows us to avoid a lot of static_casts (we haven't yet removed the automatic static_casts but this catches some occurrences the script did not catch)
* mark 5 unit tests skipping that have regressed w/ the new hcc (we don't know yet what is at fault)
* optimize bitonic sort - the comparator is always an empty struct - therefore passing it by value saves at least 3 bytes. It also removes an ambiguity around passing references to `__global__` functions
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11198
Differential Revision: D9652340
Pulled By: ezyang
fbshipit-source-id: f5af1d891189da820e3d13b7bed91a7a43154690
Summary:
* improve docker packages (install OpenBLAS to have at-compile-time LAPACK functionality w/ optimizations for both Intel and AMD CPUs)
* integrate rocFFT (i.e., enable Fourier functionality)
* fix bugs in ROCm caused by wrong warp size
* enable more test sets, skip the tests that don't work on ROCm yet
* don't disable asserts any longer in hipification
* small improvements
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10893
Differential Revision: D9615053
Pulled By: ezyang
fbshipit-source-id: 864b4d27bf089421f7dfd8065e5017f9ea2f7b3b
Summary:
Set the build environment before installing sccache in order to make sure the docker images have the links set up.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10640
Reviewed By: yf225
Differential Revision: D9399593
Pulled By: Jorghi12
fbshipit-source-id: a062fed8b7e83460fe9d50a7a27c0f20bcd766c4
Summary:
* some small leftovers from the last PR review
* enable more unit test sets for CI
* replace use of hcRNG w/ rocRAND (docker image was already updated w/ newer rocRAND)
* use rocBLAS instead of hipBLAS to allow convergence w/ Caffe2
* use strided_batched gemm interface also from the batched internal interface
* re-enable Dropout.cu as we now have philox w/ rocRAND
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10406
Reviewed By: Jorghi12
Differential Revision: D9277093
Pulled By: ezyang
fbshipit-source-id: 7ef2f6fe4ead77e501ed7aea5c3743afe2466ca2
Summary:
Current Dockerfile builds pytorch using default python within miniconda, which happens to be Python 3.6
This patch allows users to specify which python should be installed in the default miniconda environment used by the pytorch dockerfile. I have tested the build for python 2.7, 3.5, 3.6 and 3.7. Python 2.7 required typing and cython
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10317
Differential Revision: D9204401
Pulled By: ezyang
fbshipit-source-id: 11355cab3bf448bbe8369a2ed1de0d409c9a2d6e
Summary:
In this changeset:
* improvements to `hipify-python.py`
* marking unit tests broken for ROCm
* reducing the number of jobs for the built to avoid out of memory issues
* switch to Thrust/cub-hip master for the CI
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9653
Differential Revision: D9117791
Pulled By: ezyang
fbshipit-source-id: a6c3c7b81f2bda9825974bf9bf89a97767244352
Summary:
This was used to build Caffe2 Docker version 170.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9828
Differential Revision: D8997808
Pulled By: ezyang
fbshipit-source-id: f48938b2b71bc86578c9d9b46c281ed05478724e
* Revert "Stop pinning nccl version. (#8421)"
This reverts commit 3cb45bafc8b9b023049e5f979a2bcb75e3f7009d.
* Allow downgrades from libnccl2 install.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Billing of changes:
- New Jenkins script for building on rocm. For now it is a bit hacked together, but we can improve it once CI is running
- New ROCM docker image for nightly HIP, and also some legacy packages that we need temporarily
- New enabled config py2-clang3.8-rocmnightly-ubuntu16.04-build based off of the existing Caffe2 image (not built yet)
- A big pile of cmake fixes, mostly to turn bits on/off when ROCM build is involved
- Switch from hiprng to hcrng
- Apply some patches directly in code, eliminating the patches
- Use __hdiv instead of hdiv, it's more portable
- THCNumerics<T>::gt doesn't work in HIP, so simulate it with sub
- Add a few more overloads HIP needs
- Turn off use of hcc to link (we plan to turn this back on to get tests running)
- Search for hiprand, hiprng, hipblas, hipsparse
- Better Python 2 portability
* Changes without centos changes
* Changes for protobuf 3.5 and gcc 4.8
* Changing 3.4.1 back to 3.5.1
* Preventing installing two versions of setuptools
* Fixing setuptools bug
* Adding openmpi to all conda builds
* Typo and turning off quiet
* Removing openmpi from non_cuda conda build
* Actually openmpi is already in the images