Summary:
Recent versions of GCC split unaligned load and store intrinsics into
two 128-bit instructions. On old processors (Sandy Bridge) this was a
bit faster for unaligned data, but a bit slower for aligned data. On new
processors (Intel Haswell+, recent AMD) splitting loads is slower for
both aligned and unaligned data.
Clang, MSVC, and ICC do not split unaligned load and store intrinsics.
There's a good explanation here:
https://stackoverflow.com/questions/52626726/why-doesnt-gcc-resolve-mm256-loadu-pd-as-single-vmovupd#tab-top
Splitting load and store intrinsics makes no sense in our AVX2
configuration because the CPUs that support AVX2 instructions are the
same CPUs on which splitting is disadvantageous for all data alignments.
Note that this doesn't change the AVX configuration (used by CPUs that
support AVX but not AVX2). It's possible this would be beneficial for
that configuration too (our data is usually 32-byte aligned), but I'd
prefer the conservative change for now.
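As a concrete illustration (the function is mine, not from this diff), a single unaligned 256-bit copy shows the difference; GCC's -mavx256-split-unaligned-load / -mavx256-split-unaligned-store options control the splitting:
```
#include <immintrin.h>

/* Minimal sketch (not from the diff): with splitting on, GCC compiles
 * these intrinsics into two 128-bit halves (vmovups xmm plus
 * vinsertf128/vextractf128); with -mno-avx256-split-unaligned-load and
 * -mno-avx256-split-unaligned-store it emits single 256-bit vmovups. */
void copy_8_floats(const float *src, float *dst) {
  __m256 v = _mm256_loadu_ps(src); /* unaligned 256-bit load  */
  _mm256_storeu_ps(dst, v);        /* unaligned 256-bit store */
}
```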
torch.add generated assembly (hot loop, GCC 7.3.0):
before:
https://gist.github.com/colesbury/066376537bccd514daf8fe4ab54d8295
after:
https://gist.github.com/colesbury/8b4b948145001d44b225c51d2428bb91
Timing of `torch.add(x, y, out=z)` for size 10240 (1 thread, Broadwell,
no turbo):
before: 7.35 us
after: 6.39 us
(Take the torch.add timings with a grain of salt. The difference in timings
is much larger than I would expect.)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20609
Differential Revision: D15385800
Pulled By: colesbury
fbshipit-source-id: 66415b148a3b19360b9de9881af594ab46547b6f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19285
The either type is a tagged union with two members.
This is going to be used in a diff stacked on top to allow a function to return one of two types.
Also, generally, `either<Error, Result>` is a great pattern for returning a value-or-error result from a function without using exceptions, and we could use this class for that later.
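For illustration only (this sketch uses std::variant rather than the either class added here, and all names are mine), the value-or-error shape looks like:
```
#include <iostream>
#include <string>
#include <variant>

// A tagged union with two members holds exactly one of them at a time:
// here the "left" member carries an error, the "right" a result.
using Error = std::string;

std::variant<Error, int> parse_port(const std::string& s) {
  if (s.empty()) return Error("empty string");
  int value = 0;
  for (char c : s) {
    if (c < '0' || c > '9') return Error("not a number: " + s);
    value = value * 10 + (c - '0');
  }
  return value;  // the "right" member: a successful result
}

int main() {
  auto r = parse_port("8080");
  if (auto* port = std::get_if<int>(&r)) {
    std::cout << "port " << *port << "\n";
  } else {
    std::cout << "error: " << std::get<Error>(r) << "\n";
  }
}
```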
Reviewed By: dzhulgakov
Differential Revision: D14931923
fbshipit-source-id: 7d1dd77b3e5b655f331444394dcdeab24772ab3a
Summary:
Dear All,
The proposed patch fixes the test code snippets used in the CMake infrastructure and the resulting silent failure to set the ```CAFFE2_COMPILER_SUPPORTS_AVX2_EXTENSIONS``` flag properly. Without the fix, libcaffe2.so ends up with unresolved (```UND```) AVX2-related references, rendering it unusable.
* With GCC 9, the test code from the CMake build infra always fails to compile:
```
$ gcc -O2 -g -pipe -Wall -m64 -mtune=generic -fopenmp -DCXX_HAS_AVX_1 -fPIE -o test.o -c test.c -mavx2
test.c: In function ‘main’:
test.c:11:26: error: incompatible type for argument 1 of ‘_mm256_extract_epi64’
11 | _mm256_extract_epi64(x, 0); // we rely on this in our AVX2 code
| ^
| |
| __m256 {aka __vector(8) float}
In file included from /usr/lib/gcc/x86_64-redhat-linux/9/include/immintrin.h:51,
from test.c:4:
/usr/lib/gcc/x86_64-redhat-linux/9/include/avxintrin.h:550:31: note: expected ‘__m256i’ {aka ‘__vector(4) long long int’} but argument is of type ‘__m256’ {aka ‘__vector(8) float’}
550 | _mm256_extract_epi64 (__m256i __X, const int __N)
|
$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/9/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-redhat-linux
Configured with: ../configure --enable-bootstrap --enable-languages=c,c++,fortran,objc,obj-c++,ada,go,d,lto --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --enable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-gcc-major-version-only --with-linker-hash-style=gnu --enable-plugin --enable-initfini-array --with-isl --enable-offload-targets=nvptx-none --without-cuda-driver --enable-gnu-indirect-function --enable-cet --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
Thread model: posix
gcc version 9.0.1 20190328 (Red Hat 9.0.1-0.12) (GCC)
```
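The probe only needs an integer vector; a minimal corrected sketch (the actual patch may differ) looks like:
```
#include <immintrin.h>

/* Sketch of a corrected probe: _mm256_extract_epi64 takes a __m256i
 * integer vector, so the test value must not be a __m256 float vector. */
int main() {
  __m256i x = _mm256_set1_epi64x(1);
  return (int)_mm256_extract_epi64(x, 0); /* we rely on this in our AVX2 code */
}
```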
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18991
Differential Revision: D14821838
Pulled By: ezyang
fbshipit-source-id: 7eb3a854a1a831f6fda8ed7ad089746230b529d7
Summary:
Multiple configurations are the default on Windows (e.g. Release;Debug), and this check always broke that setup because CMAKE_BUILD_TYPE was not set. The workaround was to always set CMAKE_BUILD_TYPE to Debug or Release, which was very unfortunate.
The correct method is to use generator expressions that expand depending on the current CONFIG being processed.
Side note: Anywhere else CMAKE_BUILD_TYPE is checked should probably be fixed too.
Note that the CMakeLists.txt forces it into Release mode. However, I came across this error when importing the prebuilt Config into another project, where CMAKE_BUILD_TYPE was not set.
> 3>CMake Error at pre_built/pytorch-1.0.1/share/cmake/Caffe2/public/cuda.cmake:380 (message):
> 3> Unknown cmake build type:
Proper support for configurations would mean we can build debug and release at the same time and as you can see, it is less CMake code.
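For illustration (the target and definition are hypothetical, not from this PR), the pattern replaces a build-type branch with a per-configuration expression:
```
# Instead of branching on CMAKE_BUILD_TYPE, which is empty for
# multi-config generators such as Visual Studio:
if(CMAKE_BUILD_TYPE STREQUAL "Debug")
  target_compile_definitions(my_target PRIVATE MY_DEBUG_CHECKS=1)
endif()

# use a generator expression evaluated for the configuration being built:
target_compile_definitions(my_target PRIVATE $<$<CONFIG:Debug>:MY_DEBUG_CHECKS=1>)
```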
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18548
Differential Revision: D14730790
Pulled By: ezyang
fbshipit-source-id: 70ae16832870d742c577c34a50ec7564c3da0afb
Summary:
Previously the build would look for the Config file even if it was not written.
Fixes #18419
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18421
Differential Revision: D14597139
Pulled By: ezyang
fbshipit-source-id: c212cbf5dc91564c12d9d07e507c8285e11c6bdf
Summary:
Our AVX2 routines use functions such as _mm256_extract_epi64
that do not exist on 32 bit systems even when they have AVX2.
This disables AVX2 when _mm256_extract_epi64 does not exist.
This fixes the "local" part of #17901 (except for disabling FBGEMM),
but SLEEF still needs to be updated and NNPACK fixed; see the bug
report for further discussion.
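The actual change is a CMake compile check, but the idea, expressed as a preprocessor sketch (macro name hypothetical), is:
```
/* Sketch (macro name hypothetical): _mm256_extract_epi64 is only
 * declared for 64-bit targets, so AVX2 kernels are gated on a 64-bit
 * check in addition to AVX2 support. */
#if defined(__AVX2__) && (defined(__x86_64__) || defined(_M_X64))
#define CAFFE2_HAS_USABLE_AVX2 1
#endif
```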
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17915
Differential Revision: D14437338
Pulled By: soumith
fbshipit-source-id: d4ef7e0801b5d1222a855a38ec207dd88b4680da
Summary:
This changes the libnvToolsExt dependency to go through CMake find_library.
I have a machine where the CUDA libs, and libnvToolsExt in particular, are in the "usual library locations". It would be neat if we could find libnvToolsExt there, falling back to the currently hardcoded path as the default.
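A sketch of the find_library approach (variable name and hint paths hypothetical, not the exact hunk from this PR):
```
# Let CMake locate libnvToolsExt instead of hardcoding one path.
find_library(CUDA_NVTX_LIBRARY
  NAMES nvToolsExt
  HINTS ${CUDA_TOOLKIT_ROOT_DIR}/lib64 ${CUDA_TOOLKIT_ROOT_DIR}/lib)
```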
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16714
Differential Revision: D14020315
Pulled By: ezyang
fbshipit-source-id: 00be27be10b1863ca92fd585f273d50bded850f8
Summary:
Rehash of previous attempts. This tries a different approach where we accept the install as specified in cmake (leaving bin/, include/, and lib/ alone), and then try to adjust the rest of the files to this more standard layout.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16414
Differential Revision: D13863635
Pulled By: zdevito
fbshipit-source-id: 23725f5c64d7509bf3ca8f472dcdcad074de9828
Summary:
Initial enabling of the upcoming hip-clang compiler for the PyTorch source base.
Changes:
* update the Eigen submodule to a version that includes our upstreamed hip-clang enablement
* modify a few ifdef guards to work with the `__HIP__` macro defined by hip-clang (see the sketch after this list)
* use `__lane_id` instead of `hc::__lane_id`
* add Debug flags for ROCm to the cmake infrastructure
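A minimal sketch of such a guard (placement and function name hypothetical; `__HCC__` is the macro the older hcc toolchain defines):
```
// hip-clang defines __HIP__ while the older hcc toolchain defines
// __HCC__, so device code guards accept either; the free function
// __lane_id() replaces hc::__lane_id().
#if defined(__HCC__) || defined(__HIP__)
__device__ unsigned int current_lane() {
  return __lane_id();
}
#endif
```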
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16085
Differential Revision: D13709459
Pulled By: ezyang
fbshipit-source-id: 1b7b33fe810a0434766180580d4443ea177eb7c7
Summary:
This tests the water for adding back NNPACK in PyTorch; it's a lot better than the fallback THNN versions.
In #6151, we (ezyang and soumith) removed NNPACK support from PyTorch. Of course Maratyszcza might have advice, too. (Or an opinion on the CMake changes.)
The only functional changes are to use NNPACK more aggressively on mobile and to add a .contiguous() to match NNPACK's assumption (I stumbled over that while using NNPACK for style transfer).
The CMake changes try to use the NNPack we already have in git.
In terms of lines of code this is a large part of the diff of https://lernapparat.de/pytorch-jit-android/ . As far as I can tell, we don't have MKLDNN on mobile and the native THNN implementations are prohibitively expensive in terms of both CPU and memory.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15924
Differential Revision: D13709576
Pulled By: ezyang
fbshipit-source-id: f2e287739909451c173abf046588209a7450ca2c
Summary: This PR aims to remove support for cuDNN 6.
Differential Revision: D13709595
Pulled By: ezyang
fbshipit-source-id: 853624db1cf66b0534d7028654c38c2806fb4107
Summary:
1. Add some gloo communication operators to the related fallback lists;
2. Work around compile errors when using a fallback operator whose CPU operator inherits directly from 'OperatorBase', like PrefetchOperator;
3. Add new CPU context support for some Python module files and the resnet50 training example file.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11330
Reviewed By: yinghai
Differential Revision: D13624519
Pulled By: wesolwsk
fbshipit-source-id: ce39d57ddb8cd7786db2e873bfe954069d972f4f
Summary:
bypass-lint
- Change all Caffe2 builds to use setup.py instead of cmake
- Add a -cmake- Caffe2 build configuration that uses cmake and only builds cpp
- Move skipIfCI logic from onnx test scripts to the rest of CI logic
- Removal of old PYTHONPATH/LD_LIBRARY_PATH/etc. env management
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15917
Reviewed By: orionr
Differential Revision: D13637583
Pulled By: pjh5
fbshipit-source-id: c5c5639db0251ba12b6e4b51b2ac3b26a8953153