Commit Graph

405 Commits

82903dda9b Fixes for some Windows compiler warnings (#14490)
Summary:
Implement some simple fixes to clean up the Windows build by addressing compiler warnings. Three main types of warnings were fixed:

1. GCC-specific pragmas were changed so they are not used on Windows.
2. CMake flags that don't exist on Windows were removed from the Windows build (see the sketch below).
3. A macro that was defined multiple times on Windows is now defined only once.
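
For item 2, a minimal sketch of how compiler-specific flags are typically gated (the flags shown are illustrative, not the ones this PR touched):

```cmake
# Gate GCC/Clang-style flags so they never reach MSVC, and vice versa.
if(MSVC)
  add_compile_options(/W3)                         # MSVC warning level
else()
  add_compile_options(-Wall -Wno-unknown-pragmas)  # GCC/Clang flags
endif()
```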
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14490

Differential Revision: D13241988

Pulled By: ezyang

fbshipit-source-id: 38da8354f0e3a3b9c97e33309cdda9fd23c08247
2018-12-05 21:27:07 -08:00
8901935ad4 Update OpenMP cmake setting for xcode 9 compiler(AppleClang 9.0) (#14473)
Summary:
Original PR: https://github.com/pytorch/pytorch/pull/11563
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14473

Differential Revision: D13234208

Pulled By: ezyang

fbshipit-source-id: 7d874c63659e93728af239ecdfb85547613e52ad
2018-11-28 09:28:26 -08:00
48099c23b4 Move AT_CUDA_CHECK to c10
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13910

Reviewed By: smessmer

Differential Revision: D13046201

fbshipit-source-id: 8d360a0e4d6c2edf070d130e600c6b04f0ee0058
2018-11-19 08:20:10 -08:00
928687bb24 Add c10 cuda library. (#13900)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13900

Add c10 cuda library.

Right now, this is not used by anything, and it only tests that the CUDA
headers are available (not, e.g., that linking works).

Extra changes:
- cmake/public/cuda.cmake is now correctly include-guarded, so you
  can include it multiple times without trouble (sketched below).
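
A minimal sketch of such an include guard; the guard variable name here is hypothetical:

```cmake
# cuda.cmake -- return early if this file has already been included.
if(PYTORCH_PUBLIC_CUDA_CMAKE_INCLUDED)
  return()
endif()
set(PYTORCH_PUBLIC_CUDA_CMAKE_INCLUDED TRUE)
```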

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Reviewed By: smessmer

Differential Revision: D13025313

fbshipit-source-id: fda85b4c35783ffb48ddd6bbb98dbd9154119d86
2018-11-19 08:20:07 -08:00
8e91da4cb3 Windows shared build (#13550)
Summary:
Hi guys,

I'd like to build Caffe2 with more supported options on Windows with Microsoft Visual Studio.
This is the first pull request.
Running scripts/build_windows_shared.bat builds Caffe2 with both CMAKE_BUILD_TYPE=Debug and CMAKE_BUILD_TYPE=Release under Visual Studio 14 2015.
CUDA is 9.0, cuDNN is 7.0.5; glog, gflags, and lmdb are supported on my system.
Python is 3.5, and Detectron works from the Python interface as well.
It was even possible to debug Detectron code and step into caffe2_gpu.dll with the generated PDBs.

One disappointment is that the c10/experimental ops don't build with this Visual Studio generator, so I added a dedicated option, INCLUDE_EXPERIMENTAL_C10_OPS (default ON), to deal with it in build_windows_shared.bat (see the sketch below).

After this pull request, the next step is to add Visual Studio 2017 support to the script.
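
A hedged sketch of what the CMake side of such a switch typically looks like; the guarded subdirectory path is hypothetical:

```cmake
option(INCLUDE_EXPERIMENTAL_C10_OPS "Build the experimental c10 operators" ON)
if(INCLUDE_EXPERIMENTAL_C10_OPS)
  add_subdirectory(caffe2/experimental)  # hypothetical location of the c10 ops
endif()
```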
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13550

Reviewed By: ezyang

Differential Revision: D13042597

Pulled By: orionr

fbshipit-source-id: f313f909f599cd582a1d000eff766eef3a9fc4fc
2018-11-16 12:16:28 -08:00
0d7a986da1 Change hip filename extension to .hip (#14036)
Summary:
xw285cornell

- To give HIP files a unique filename extension, we change them from _hip.cc to .hip (the only blessed option other than .cu in hipcc 3d51a1fb01/bin/hipcc (L552)).
- Change to use the host compiler to compile .cc/.cpp files. Previously we used hcc to compile them, which is unnecessary.
- Change the hipify script to no longer replace "gpu" with "hip" in the filenames of the generated hipified files. We previously did this because hcc has a bug when linking files that have the same filename; since we now use the host linker, the workaround is no longer necessary (see the sketch below).
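
A sketch of the resulting split, assuming the FindHIP.cmake helpers that ROCm ships (hip_add_library and the HIP_SOURCE_PROPERTY_FORMAT source property):

```cmake
file(GLOB hip_srcs "*.hip")          # compiled by hipcc
file(GLOB cpp_srcs "*.cc" "*.cpp")   # compiled by the host compiler
set_source_files_properties(${hip_srcs} PROPERTIES HIP_SOURCE_PROPERTY_FORMAT 1)
hip_add_library(caffe2_hip ${hip_srcs} ${cpp_srcs})
```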
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14036

Reviewed By: xw285cornell

Differential Revision: D13091813

Pulled By: bddppq

fbshipit-source-id: ea3d887751d8abb39d75f5d5104aa66ce66b9ee0
2018-11-16 11:55:59 -08:00
e3bb6ff334 Move c10 dispatcher prototype to c10/
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13690

Reviewed By: dzhulgakov

Differential Revision: D12912235

fbshipit-source-id: 974b85790c23335be8130a50aa4692e3ddcd2bf9
2018-11-14 18:04:36 -08:00
53a3c46950 Switch to packaged Thrust on Ubuntu, enable CentOS 7.5 as a CI target (#12899)
Summary:
1) Use the hip-thrust version of Thrust as opposed to the GH master. (ROCm 267)

2) CentOS 7.5 docker (ROCm 279)

* Always install the libraries at docker creation for ubuntu.
* Add Dockerfile for CentOS ROCm
* Enable the centos build
* Source devtoolset in bashrc
* Set locales correctly depending on whether we are on Ubuntu or CentOS
* Install a newer cmake for CentOS
* Checkout thrust as there is no package for CentOS yet.

PyTorch/Caffe2 on ROCm passed tests: https://github.com/ROCmSoftwarePlatform/pytorch/pull/280

For attention: bddppq ezyang

A Docker rebuild for Ubuntu is not urgent (getting rid of the Thrust checkout and package install is mainly cosmetic). If a docker image for CentOS 7.5 is wanted, a rebuild is necessary. I tested the PyTorch build in the CentOS docker. The PyTorch unit tests mostly pass; however, a test in test_jit causes a Python recursion error that seems to be due to the python2 on CentOS (we have never seen this on Ubuntu) - hence please do not enable the unit tests.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12899

Differential Revision: D13029424

Pulled By: bddppq

fbshipit-source-id: 1ca8f4337ec6a603f2742fc81046d5b8f8717c76
2018-11-12 14:39:54 -08:00
26751ce300 Fix the improper use of windows-native slashes (#13220)
Summary:
Trying to fix #12510.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13220

Differential Revision: D12994483

Pulled By: soumith

fbshipit-source-id: adbaf7e7a0a7cd1fc3ec947ddb209b55a9cda2a6
2018-11-08 21:09:44 -08:00
fd9aaa6b79 Fix linking errors on Windows (#13100)
Summary:
1. Remove the flag "/FORCE:UNRESOLVED", which shouldn't be used.
2. Fix the code logic for ONNX_BUILD_MAIN_LIBS on Windows.
3. Add a patch for protobuf using CMake.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13100

Differential Revision: D12978950

Pulled By: orionr

fbshipit-source-id: db9eb8136acf5712cfb5a24ed228b7934d873331
2018-11-08 09:54:09 -08:00
90ea61800f operators/quantized/server -> quantization/server (#13660)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13660

Any change to a server-side quantized operator was triggering ios-sanity-check, with more than 5 hours of testing time. I suspect this was because the operator code was synced with the xplat directory. This diff moves the server-side quantized operators to caffe2/caffe2/quantization/server to avoid this issue.

Reviewed By: hx89

Differential Revision: D12955420

fbshipit-source-id: b6c824b9de5e2a696f8c748e1b2c77d81d46746b
2018-11-07 22:54:13 -08:00
18de330e86 CMake integration for int8 server operators
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13558

Reviewed By: Maratyszcza

Differential Revision: D12945460

Pulled By: dskhudia

fbshipit-source-id: 1a91027b305fd6af77eebd9a4fad092a12f54712
2018-11-06 15:45:15 -08:00
9f9f06c937 Improve inline container and add some test (#12993)
Summary:
Added getNextRecord/hasNextRecord methods. Even though the model data is stored at the end, we can still read the file from the beginning.

Added a gtest to cover the reader's and writer's code.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12993

Reviewed By: yinghai

Differential Revision: D10860086

Pulled By: houseroad

fbshipit-source-id: 01b1380f8f50f5e853fe48a8136e3176eb3b0c29
2018-10-26 12:06:47 -07:00
5e73b828bd CMake integration for Int8 ops
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13145

Differential Revision: D10860849

Pulled By: Maratyszcza

fbshipit-source-id: fdbcc23ff9beaeaedfd561176df6cfe87685c1f5
2018-10-25 22:25:10 -07:00
99d24aefc3 Move a number of ATen checks out of Dependencies.cmake (#12990)
Summary:
cc Yangqing mingzhe09088 anderspapitto mingzhe09088
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12990

Differential Revision: D10862301

Pulled By: orionr

fbshipit-source-id: 62ba09cf0725f29692fac71bc30173469283390b
2018-10-25 17:26:25 -07:00
963b012bd8 nomnigraph - HEFT scheduler (#12788)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12788

Static task scheduling algorithm

- Input/Output for static scheduler
- HEFT static scheduling algorithm
- Theoretical critical path analyzer

Reviewed By: bwasti

Differential Revision: D10436418

fbshipit-source-id: 074bc587b9a2c7cb2d9e64291981ff1c160f02b2
2018-10-18 08:40:46 -07:00
0c6ab0e8f4 Delete caffe2/mkl, and references. (#12625)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12625

It's obsoleted by ideep

Reviewed By: Yangqing

Differential Revision: D10372230

fbshipit-source-id: 2d6475ae72389dd654ba0bcbb57766530eb4ac1a
2018-10-13 22:02:32 -07:00
957142a4fe switch ROCm CI targets to white rabbit release (#12577)
Summary:
* switches docker files over to the white rabbit release - removed custom package installs
* skips five tests that regressed in that release
* fixes some case-sensitivity issues in ROCm-supplied cmake files by sed'ing them in the docker
* includes first changes to the infrastructure to support the upcoming hip-clang compiler
* prints ROCm library versions as part of the build (as discussed w/ ezyang)
* explicitly searches for miopengemm
* installs the new hip-thrust package to be able to remove the explicit Thrust checkout in a future revision
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12577

Differential Revision: D10350165

Pulled By: bddppq

fbshipit-source-id: 60f9c9caf04a48cfa90f4c37e242d944a175ab31
2018-10-11 18:03:11 -07:00
0d50c117db Introduce BUILD_ATEN_ONLY cmake option (#12443)
Summary:
Following up #11488 conversation with orionr
And our brief conversation at PTDC about ATen with soumith and apaszke

This PR enables a very slim build focused on ATen, in particular without caffe2 and protobuf, among other dependencies.
With this PR the NimTorch tests pass fully, including AD, convolutions, wasm, etc.
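
A hedged sketch of the shape of such an option; which other switches it flips is illustrative, not taken from the PR:

```cmake
option(BUILD_ATEN_ONLY "Build only a slim ATen, without Caffe2 or protobuf" OFF)
if(BUILD_ATEN_ONLY)
  set(BUILD_PYTHON OFF)     # illustrative: drop the Python bindings
  set(USE_DISTRIBUTED OFF)  # illustrative: drop distributed support
endif()
```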
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12443

Reviewed By: mingzhe09088

Differential Revision: D10249313

Pulled By: orionr

fbshipit-source-id: 4f50503f08b79f59e7717fca2b4a1f420d908707
2018-10-10 12:54:19 -07:00
8fa7de35f2 Enable ROCM clang-7 build
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12223

Differential Revision: D10133697

Pulled By: bddppq

fbshipit-source-id: c1de99afccdad415ac1beb85d3b8ab44f9b58738
2018-10-01 15:11:40 -07:00
a6f1ae7f20 set up c10 scaffolding. Move macros proper first.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11939

Reviewed By: orionr, dzhulgakov

Differential Revision: D10004629

Pulled By: Yangqing

fbshipit-source-id: ba50a96820d35c7922d81c78c4cbe849c85c251c
2018-09-24 11:09:59 -07:00
aa8cd7319a Enable build_test on windows (#11802)
Summary:
This PR enables BUILD_TEST for Caffe2 on Windows.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11802

Reviewed By: orionr

Differential Revision: D9951223

Pulled By: mingzhe09088

fbshipit-source-id: 7cdc1626b999daadeae482bd569eebdbd53eb6d4
2018-09-19 20:17:49 -07:00
a7cbcb1bb9 Enable build_python on windows (#11385)
Summary:
This PR aims to resolve issues related to BUILD_PYTHON and BUILD_TEST after the removal of FULL_CAFFE2 on Windows.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11385

Reviewed By: orionr

Differential Revision: D9884906

Pulled By: mingzhe09088

fbshipit-source-id: fc114c0cbff6223f1ec261161e4caecc1fef5dd6
2018-09-17 21:40:03 -07:00
07fd4450ab Revert D9831398: [pytorch][PR] Update OpenMP cmake setting for xcode 9 compiler(AppleClang 9.0)
Differential Revision:
D9831398

Original commit changeset: db119d3f9c26

fbshipit-source-id: 4f183c9c178c159473bdaaa6299d4d5eb8afe549
2018-09-17 09:39:23 -07:00
f5bc2aef07 Update OpenMP cmake setting for xcode 9 compiler(AppleClang 9.0) (#11563)
Summary:
Fix the OpenMP link error for the AppleClang 9.0 compiler.

Built with the following command:
python setup.py build develop

The error message:

```
Undefined symbols for architecture x86_64:
  "___kmpc_critical", referenced from:
      _THFloatTensor_addmm in THTensorMath.cpp.o
      _THDoubleTensor_addmm in THTensorMath.cpp.o
      _THByteTensor_addmm in THTensorMath.cpp.o
      _THCharTensor_addmm in THTensorMath.cpp.o
      _THShortTensor_addmm in THTensorMath.cpp.o
      _THIntTensor_addmm in THTensorMath.cpp.o
      _THLongTensor_addmm in THTensorMath.cpp.o
      ...
  "___kmpc_end_critical", referenced from:
      _THFloatTensor_addmm in THTensorMath.cpp.o
      _THDoubleTensor_addmm in THTensorMath.cpp.o
      _THByteTensor_addmm in THTensorMath.cpp.o
      _THCharTensor_addmm in THTensorMath.cpp.o
      _THShortTensor_addmm in THTensorMath.cpp.o
      _THIntTensor_addmm in THTensorMath.cpp.o
      _THLongTensor_addmm in THTensorMath.cpp.o
      ...
  "___kmpc_end_reduce_nowait", referenced from:
      _.omp_outlined..270 in THTensorMoreMath.cpp.o
      _.omp_outlined..271 in THTensorMoreMath.cpp.o
      _.omp_outlined..273 in THTensorMoreMath.cpp.o
      _.omp_outlined..275 in THTensorMoreMath.cpp.o
      _.omp_outlined..43 in THTensorEvenMoreMath.cpp.o
      _.omp_outlined..44 in THTensorEvenMoreMath.cpp.o
      _.omp_outlined..46 in THTensorEvenMoreMath.cpp.o
      ...
  "___kmpc_end_serialized_parallel", referenced from:
      at::native::embedding_renorm_cpu_(at::Tensor&, at::Tensor const&, double, double) in Embedding.cpp.o
      at::native::_embedding_bag_dense_backward_cpu(at::Tensor const&, at::Tensor const&, at::Tensor const&, at::Tensor const&, at::Tensor const&, at::Tensor const&, long long, bool, long long) in EmbeddingBag.cpp.o
      at::native::softmax_cpu(at::Tensor const&, long long) in SoftMax.cpp.o
      at::native::log_softmax_cpu(at::Tensor const&, long long) in SoftMax.cpp.o
      at::native::softmax_backward_cpu(at::Tensor const&, at::Tensor const&, long long, at::Tensor const&) in SoftMax.cpp.o
      at::native::log_softmax_backward_cpu(at::Tensor const&, at::Tensor const&, long long, at::Tensor const&) in SoftMax.cpp.o
      at::TensorIterator::for_each(std::__1::function<void (int, char**, long long const*, long long)> const&) in TensorIterator.cpp.o
      ...
  "___kmpc_for_static_fini", referenced from:
      _.omp_outlined..9 in Embedding.cpp.o
      _.omp_outlined. in EmbeddingBag.cpp.o
      _.omp_outlined. in GridSampler.cpp.o
      _.omp_outlined..42 in GridSampler.cpp.o
      _.omp_outlined..44 in GridSampler.cpp.o
      _.omp_outlined..45 in GridSampler.cpp.o
      _.omp_outlined..47 in GridSampler.cpp.o
      ...
  "___kmpc_for_static_init_4", referenced from:
      _.omp_outlined. in init.cpp.o
      _.omp_outlined..35 in init.cpp.o
      _.omp_outlined..36 in init.cpp.o
      _.omp_outlined..37 in init.cpp.o
      _.omp_outlined..49 in init.cpp.o
      _.omp_outlined..52 in init.cpp.o
      _.omp_outlined..220 in init.cpp.o
      ...
  "___kmpc_for_static_init_8", referenced from:
      _.omp_outlined..9 in Embedding.cpp.o
      _.omp_outlined. in EmbeddingBag.cpp.o
      _.omp_outlined. in GridSampler.cpp.o
      _.omp_outlined..42 in GridSampler.cpp.o
      _.omp_outlined..44 in GridSampler.cpp.o
      _.omp_outlined..45 in GridSampler.cpp.o
      _.omp_outlined..47 in GridSampler.cpp.o
      ...
  "___kmpc_for_static_init_8u", referenced from:
      _.omp_outlined..203 in init.cpp.o
      _.omp_outlined..207 in init.cpp.o
      _.omp_outlined..209 in init.cpp.o
      _.omp_outlined..210 in init.cpp.o
  "___kmpc_fork_call", referenced from:
      at::native::embedding_dense_backward_cpu(at::Tensor const&, at::Tensor const&, long long, long long, bool) in Embedding.cpp.o
      at::native::embedding_renorm_cpu_(at::Tensor&, at::Tensor const&, double, double) in Embedding.cpp.o
      at::native::_embedding_bag_dense_backward_cpu(at::Tensor const&, at::Tensor const&, at::Tensor const&, at::Tensor const&, at::Tensor const&, at::Tensor const&, long long, bool, long long) in EmbeddingBag.cpp.o
      at::native::grid_sampler_2d_cpu(at::Tensor const&, at::Tensor const&, long long, long long) in GridSampler.cpp.o
      at::native::grid_sampler_3d_cpu(at::Tensor const&, at::Tensor const&, long long, long long) in GridSampler.cpp.o
      at::native::grid_sampler_2d_backward_cpu(at::Tensor const&, at::Tensor const&, at::Tensor const&, long long, long long) in GridSampler.cpp.o
      at::native::grid_sampler_3d_backward_cpu(at::Tensor const&, at::Tensor const&, at::Tensor const&, long long, long long) in GridSampler.cpp.o
      ...
  "___kmpc_global_thread_num", referenced from:
      at::native::embedding_renorm_cpu_(at::Tensor&, at::Tensor const&, double, double) in Embedding.cpp.o
      at::native::_embedding_bag_dense_backward_cpu(at::Tensor const&, at::Tensor const&, at::Tensor const&, at::Tensor const&, at::Tensor const&, at::Tensor const&, long long, bool, long long) in EmbeddingBag.cpp.o
      at::native::softmax_cpu(at::Tensor const&, long long) in SoftMax.cpp.o
      at::native::log_softmax_cpu(at::Tensor const&, long long) in SoftMax.cpp.o
      at::native::softmax_backward_cpu(at::Tensor const&, at::Tensor const&, long long, at::Tensor const&) in SoftMax.cpp.o
      at::native::log_softmax_backward_cpu(at::Tensor const&, at::Tensor const&, long long, at::Tensor const&) in SoftMax.cpp.o
      at::TensorIterator::for_each(std::__1::function<void (int, char**, long long const*, long long)> const&) in TensorIterator.cpp.o
      ...
  "___kmpc_push_num_threads", referenced from:
      void Eigen::internal::parallelize_gemm<true, Eigen::internal::gemm_functor<float, long, Eigen::internal::general_matrix_matrix_product<long, float, 0, false, float, 0, false, 0>, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::Stride<0, 0> >, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::Stride<0, 0> >, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1>, 0, Eigen::Stride<0, 0> >, Eigen::internal::gemm_blocking_space<0, float, float, -1, -1, -1, 1, false> >, long>(Eigen::internal::gemm_functor<float, long, Eigen::internal::general_matrix_matrix_product<long, float, 0, false, float, 0, false, 0>, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::Stride<0, 0> >, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::Stride<0, 0> >, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1>, 0, Eigen::Stride<0, 0> >, Eigen::internal::gemm_blocking_space<0, float, float, -1, -1, -1, 1, false> > const&, long, long, long, bool) in math_cpu.cc.o
      void Eigen::internal::parallelize_gemm<true, Eigen::internal::gemm_functor<float, long, Eigen::internal::general_matrix_matrix_product<long, float, 1, false, float, 0, false, 0>, Eigen::Transpose<Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::Stride<0, 0> > const>, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::Stride<0, 0> >, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1>, 0, Eigen::Stride<0, 0> >, Eigen::internal::gemm_blocking_space<0, float, float, -1, -1, -1, 1, false> >, long>(Eigen::internal::gemm_functor<float, long, Eigen::internal::general_matrix_matrix_product<long, float, 1, false, float, 0, false, 0>, Eigen::Transpose<Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::Stride<0, 0> > const>, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::Stride<0, 0> >, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1>, 0, Eigen::Stride<0, 0> >, Eigen::internal::gemm_blocking_space<0, float, float, -1, -1, -1, 1, false> > const&, long, long, long, bool) in math_cpu.cc.o
      void Eigen::internal::parallelize_gemm<true, Eigen::internal::gemm_functor<float, long, Eigen::internal::general_matrix_matrix_product<long, float, 0, false, float, 1, false, 0>, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::Stride<0, 0> >, Eigen::Transpose<Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::Stride<0, 0> > const>, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1>, 0, Eigen::Stride<0, 0> >, Eigen::internal::gemm_blocking_space<0, float, float, -1, -1, -1, 1, false> >, long>(Eigen::internal::gemm_functor<float, long, Eigen::internal::general_matrix_matrix_product<long, float, 0, false, float, 1, false, 0>, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::Stride<0, 0> >, Eigen::Transpose<Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::Stride<0, 0> > const>, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1>, 0, Eigen::Stride<0, 0> >, Eigen::internal::gemm_blocking_space<0, float, float, -1, -1, -1, 1, false> > const&, long, long, long, bool) in math_cpu.cc.o
      void Eigen::internal::parallelize_gemm<true, Eigen::internal::gemm_functor<float, long, Eigen::internal::general_matrix_matrix_product<long, float, 1, false, float, 1, false, 0>, Eigen::Transpose<Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::Stride<0, 0> > const>, Eigen::Transpose<Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::Stride<0, 0> > const>, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1>, 0, Eigen::Stride<0, 0> >, Eigen::internal::gemm_blocking_space<0, float, float, -1, -1, -1, 1, false> >, long>(Eigen::internal::gemm_functor<float, long, Eigen::internal::general_matrix_matrix_product<long, float, 1, false, float, 1, false, 0>, Eigen::Transpose<Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::Stride<0, 0> > const>, Eigen::Transpose<Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::Stride<0, 0> > const>, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1>, 0, Eigen::Stride<0, 0> >, Eigen::internal::gemm_blocking_space<0, float, float, -1, -1, -1, 1, false> > const&, long, long, long, bool) in math_cpu.cc.o
      void Eigen::internal::parallelize_gemm<true, Eigen::internal::gemm_functor<float, long, Eigen::internal::general_matrix_matrix_product<long, float, 0, false, float, 0, false, 0>, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::OuterStride<-1> >, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::OuterStride<-1> >, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1>, 0, Eigen::OuterStride<-1> >, Eigen::internal::gemm_blocking_space<0, float, float, -1, -1, -1, 1, false> >, long>(Eigen::internal::gemm_functor<float, long, Eigen::internal::general_matrix_matrix_product<long, float, 0, false, float, 0, false, 0>, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::OuterStride<-1> >, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::OuterStride<-1> >, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1>, 0, Eigen::OuterStride<-1> >, Eigen::internal::gemm_blocking_space<0, float, float, -1, -1, -1, 1, false> > const&, long, long, long, bool) in math_cpu.cc.o
      void Eigen::internal::parallelize_gemm<true, Eigen::internal::gemm_functor<float, long, Eigen::internal::general_matrix_matrix_product<long, float, 1, false, float, 0, false, 0>, Eigen::Transpose<Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::OuterStride<-1> > const>, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::OuterStride<-1> >, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1>, 0, Eigen::OuterStride<-1> >, Eigen::internal::gemm_blocking_space<0, float, float, -1, -1, -1, 1, false> >, long>(Eigen::internal::gemm_functor<float, long, Eigen::internal::general_matrix_matrix_product<long, float, 1, false, float, 0, false, 0>, Eigen::Transpose<Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::OuterStride<-1> > const>, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::OuterStride<-1> >, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1>, 0, Eigen::OuterStride<-1> >, Eigen::internal::gemm_blocking_space<0, float, float, -1, -1, -1, 1, false> > const&, long, long, long, bool) in math_cpu.cc.o
      void Eigen::internal::parallelize_gemm<true, Eigen::internal::gemm_functor<float, long, Eigen::internal::general_matrix_matrix_product<long, float, 0, false, float, 1, false, 0>, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::OuterStride<-1> >, Eigen::Transpose<Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::OuterStride<-1> > const>, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1>, 0, Eigen::OuterStride<-1> >, Eigen::internal::gemm_blocking_space<0, float, float, -1, -1, -1, 1, false> >, long>(Eigen::internal::gemm_functor<float, long, Eigen::internal::general_matrix_matrix_product<long, float, 0, false, float, 1, false, 0>, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::OuterStride<-1> >, Eigen::Transpose<Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::OuterStride<-1> > const>, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1>, 0, Eigen::OuterStride<-1> >, Eigen::internal::gemm_blocking_space<0, float, float, -1, -1, -1, 1, false> > const&, long, long, long, bool) in math_cpu.cc.o
      ...
  "___kmpc_reduce_nowait", referenced from:
      _.omp_outlined..270 in THTensorMoreMath.cpp.o
      _.omp_outlined..271 in THTensorMoreMath.cpp.o
      _.omp_outlined..273 in THTensorMoreMath.cpp.o
      _.omp_outlined..275 in THTensorMoreMath.cpp.o
      _.omp_outlined..43 in THTensorEvenMoreMath.cpp.o
      _.omp_outlined..44 in THTensorEvenMoreMath.cpp.o
      _.omp_outlined..46 in THTensorEvenMoreMath.cpp.o
      ...
  "___kmpc_serialized_parallel", referenced from:
      at::native::embedding_renorm_cpu_(at::Tensor&, at::Tensor const&, double, double) in Embedding.cpp.o
      at::native::_embedding_bag_dense_backward_cpu(at::Tensor const&, at::Tensor const&, at::Tensor const&, at::Tensor const&, at::Tensor const&, at::Tensor const&, long long, bool, long long) in EmbeddingBag.cpp.o
      at::native::softmax_cpu(at::Tensor const&, long long) in SoftMax.cpp.o
      at::native::log_softmax_cpu(at::Tensor const&, long long) in SoftMax.cpp.o
      at::native::softmax_backward_cpu(at::Tensor const&, at::Tensor const&, long long, at::Tensor const&) in SoftMax.cpp.o
      at::native::log_softmax_backward_cpu(at::Tensor const&, at::Tensor const&, long long, at::Tensor const&) in SoftMax.cpp.o
      at::TensorIterator::for_each(std::__1::function<void (int, char**, long long const*, long long)> const&) in TensorIterator.cpp.o
      ...
  "_omp_get_max_threads", referenced from:
      _THGetNumThreads in THGeneral.cpp.o
      caffe2::Caffe2SetOpenMPThreads(int*, char***) in init_omp.cc.o
      void Eigen::internal::parallelize_gemm<true, Eigen::internal::gemm_functor<float, long, Eigen::internal::general_matrix_matrix_product<long, float, 0, false, float, 0, false, 0>, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::Stride<0, 0> >, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::Stride<0, 0> >, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1>, 0, Eigen::Stride<0, 0> >, Eigen::internal::gemm_blocking_space<0, float, float, -1, -1, -1, 1, false> >, long>(Eigen::internal::gemm_functor<float, long, Eigen::internal::general_matrix_matrix_product<long, float, 0, false, float, 0, false, 0>, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::Stride<0, 0> >, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::Stride<0, 0> >, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1>, 0, Eigen::Stride<0, 0> >, Eigen::internal::gemm_blocking_space<0, float, float, -1, -1, -1, 1, false> > const&, long, long, long, bool) in math_cpu.cc.o
      void Eigen::internal::parallelize_gemm<true, Eigen::internal::gemm_functor<float, long, Eigen::internal::general_matrix_matrix_product<long, float, 1, false, float, 0, false, 0>, Eigen::Transpose<Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::Stride<0, 0> > const>, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::Stride<0, 0> >, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1>, 0, Eigen::Stride<0, 0> >, Eigen::internal::gemm_blocking_space<0, float, float, -1, -1, -1, 1, false> >, long>(Eigen::internal::gemm_functor<float, long, Eigen::internal::general_matrix_matrix_product<long, float, 1, false, float, 0, false, 0>, Eigen::Transpose<Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::Stride<0, 0> > const>, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::Stride<0, 0> >, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1>, 0, Eigen::Stride<0, 0> >, Eigen::internal::gemm_blocking_space<0, float, float, -1, -1, -1, 1, false> > const&, long, long, long, bool) in math_cpu.cc.o
      void Eigen::internal::parallelize_gemm<true, Eigen::internal::gemm_functor<float, long, Eigen::internal::general_matrix_matrix_product<long, float, 0, false, float, 1, false, 0>, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::Stride<0, 0> >, Eigen::Transpose<Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::Stride<0, 0> > const>, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1>, 0, Eigen::Stride<0, 0> >, Eigen::internal::gemm_blocking_space<0, float, float, -1, -1, -1, 1, false> >, long>(Eigen::internal::gemm_functor<float, long, Eigen::internal::general_matrix_matrix_product<long, float, 0, false, float, 1, false, 0>, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::Stride<0, 0> >, Eigen::Transpose<Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::Stride<0, 0> > const>, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1>, 0, Eigen::Stride<0, 0> >, Eigen::internal::gemm_blocking_space<0, float, float, -1, -1, -1, 1, false> > const&, long, long, long, bool) in math_cpu.cc.o
      void Eigen::internal::parallelize_gemm<true, Eigen::internal::gemm_functor<float, long, Eigen::internal::general_matrix_matrix_product<long, float, 1, false, float, 1, false, 0>, Eigen::Transpose<Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::Stride<0, 0> > const>, Eigen::Transpose<Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::Stride<0, 0> > const>, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1>, 0, Eigen::Stride<0, 0> >, Eigen::internal::gemm_blocking_space<0, float, float, -1, -1, -1, 1, false> >, long>(Eigen::internal::gemm_functor<float, long, Eigen::internal::general_matrix_matrix_product<long, float, 1, false, float, 1, false, 0>, Eigen::Transpose<Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::Stride<0, 0> > const>, Eigen::Transpose<Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::Stride<0, 0> > const>, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1>, 0, Eigen::Stride<0, 0> >, Eigen::internal::gemm_blocking_space<0, float, float, -1, -1, -1, 1, false> > const&, long, long, long, bool) in math_cpu.cc.o
      void Eigen::internal::parallelize_gemm<true, Eigen::internal::gemm_functor<float, long, Eigen::internal::general_matrix_matrix_product<long, float, 0, false, float, 0, false, 0>, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::OuterStride<-1> >, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::OuterStride<-1> >, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1>, 0, Eigen::OuterStride<-1> >, Eigen::internal::gemm_blocking_space<0, float, float, -1, -1, -1, 1, false> >, long>(Eigen::internal::gemm_functor<float, long, Eigen::internal::general_matrix_matrix_product<long, float, 0, false, float, 0, false, 0>, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::OuterStride<-1> >, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1> const, 0, Eigen::OuterStride<-1> >, Eigen::Map<Eigen::Matrix<float, -1, -1, 0, -1, -1>, 0, Eigen::OuterStride<-1> >, Eigen::internal::gemm_blocking_space<0, float, float, -1, -1, -1, 1, false> > const&, long, long, long, bool) in math_cpu.cc.o
      ...
  "_omp_get_num_procs", referenced from:
      _THGetNumCores in THGeneral.cpp.o
  "_omp_get_num_threads", referenced from:
      _.omp_outlined. in Embedding.cpp.o
      _.omp_outlined. in SoftMax.cpp.o
      _.omp_outlined..35 in SoftMax.cpp.o
      _.omp_outlined..37 in SoftMax.cpp.o
      _.omp_outlined..38 in SoftMax.cpp.o
      _.omp_outlined..46 in SoftMax.cpp.o
      _.omp_outlined..47 in SoftMax.cpp.o
      ...
  "_omp_get_thread_num", referenced from:
      _.omp_outlined. in Embedding.cpp.o
      _.omp_outlined. in SoftMax.cpp.o
      _.omp_outlined..35 in SoftMax.cpp.o
      _.omp_outlined..37 in SoftMax.cpp.o
      _.omp_outlined..38 in SoftMax.cpp.o
      _.omp_outlined..46 in SoftMax.cpp.o
      _.omp_outlined..47 in SoftMax.cpp.o
      ...
  "_omp_in_parallel", referenced from:
      _THFloatTensor_copy in THTensorCopy.cpp.o
      _THDoubleTensor_copy in THTensorCopy.cpp.o
      _THByteTensor_copy in THTensorCopy.cpp.o
      _THCharTensor_copy in THTensorCopy.cpp.o
      _THShortTensor_copy in THTensorCopy.cpp.o
      _THIntTensor_copy in THTensorCopy.cpp.o
      _THLongTensor_copy in THTensorCopy.cpp.o
      ...
  "_omp_set_num_threads", referenced from:
      _THSetNumThreads in THGeneral.cpp.o
      caffe2::Caffe2SetOpenMPThreads(int*, char***) in init_omp.cc.o
ld: symbol(s) not found for architecture x86_64
```
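
The usual CMake pattern for wiring up OpenMP compile and link flags, as a generic sketch (not necessarily the exact AppleClang-specific change in this PR):

```cmake
find_package(OpenMP)
if(OPENMP_FOUND)
  set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${OpenMP_C_FLAGS}")
  set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${OpenMP_CXX_FLAGS}")
  # Without the linker flags, the ___kmpc_* runtime symbols above stay unresolved.
  set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} ${OpenMP_EXE_LINKER_FLAGS}")
endif()
```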
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11563

Differential Revision: D9831398

Pulled By: ezyang

fbshipit-source-id: db119d3f9c26a71180335ad955f2f62c5369f9ed
2018-09-17 08:24:20 -07:00
35d52dbb0e re-enable USE_MPI (#11416)
Summary:
The previous error was caused by mpi_test not depending on MPI_CXX_LIBRARIES. This might solve the problem.

Not tested locally - waiting for CI test.
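
The kind of fix described, as a minimal sketch using the standard FindMPI variables:

```cmake
find_package(MPI REQUIRED)
add_executable(mpi_test mpi_test.cc)
target_link_libraries(mpi_test ${MPI_CXX_LIBRARIES})  # the missing dependency
```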
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11416

Reviewed By: mingzhe09088

Differential Revision: D9771694

Pulled By: Yangqing

fbshipit-source-id: 53e7b4f64eadc88313bc4dd9b8e3f7931cda6e91
2018-09-11 18:26:12 -07:00
17776db2ee Add gtest dependency on aten tests. (#11429)
Summary:
ezyang delivering my promise to you :)

Basically, now aten tests can use gtest as part of our test harness unification effort. I also converted one test (atest.cpp) to show how one can do this.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11429

Reviewed By: ezyang

Differential Revision: D9762934

Pulled By: Yangqing

fbshipit-source-id: 68ec3a748403c6bd88399b1e756200985a4e07e3
2018-09-11 13:39:51 -07:00
a175282776 Flags for LMDB, LevelDB, and Caffe2 ops (#11462)
Summary:
Add flags for LMDB and LevelDB, default `OFF`. These can be enabled with

```
USE_LMDB=1 USE_LEVELDB=1 python setup.py build_deps
```

Also add a flag to build Caffe2 ops, which is default `ON`. Disable with

```
NO_CAFFE2_OPS=1 python setup.py build_deps
```
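
On the CMake side these presumably map onto options along the following lines (a sketch; the option names are inferred from the environment flags above):

```cmake
option(USE_LMDB "Build with LMDB support" OFF)
option(USE_LEVELDB "Build with LevelDB support" OFF)
option(BUILD_CAFFE2_OPS "Build the Caffe2 operators" ON)
```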

cc Yangqing soumith pjh5 mingzhe09088
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11462

Reviewed By: soumith

Differential Revision: D9758156

Pulled By: orionr

fbshipit-source-id: 95fd206d72fdf44df54fc5d0aeab598bff900c63
2018-09-10 17:27:50 -07:00
802d21c8f4 Remove FULL_CAFFE2 flag (#11321)
Summary:
Continuing pjh5's work to remove the FULL_CAFFE2 flag completely.

With these changes you'll be able to also do something like

```
NO_TEST=1 python setup.py build_deps
```
and this will skip building tests in caffe2, aten, and c10d. By default the tests are built.

cc mingzhe09088 Yangqing
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11321

Reviewed By: mingzhe09088

Differential Revision: D9694950

Pulled By: orionr

fbshipit-source-id: ff5c4937a23d1a263378a196a5eda0cba98af0a8
2018-09-07 15:09:44 -07:00
0f419abf40 Roll nomnigraph build into caffe2 (#11303)
Summary:
We need to remove nomnigraph from the list of public libraries in order to support libtorch extensions. The easiest way to do this is to include it in the Caffe2 source like all other caffe2/core/ code.

However, because the headers are in a different place, we need to include them for linked libraries (pybind, tests, etc.).

On the upside, this means that nomnigraph now has default hidden visibility too.
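
A sketch of the include plumbing this implies (the path matches the tree layout; the exact CMake form is an assumption):

```cmake
# nomnigraph sources are compiled straight into the caffe2 target; its
# headers are re-exported to linked consumers (pybind modules, tests).
target_include_directories(caffe2 PUBLIC
  $<BUILD_INTERFACE:${CMAKE_CURRENT_SOURCE_DIR}/caffe2/core/nomnigraph/include>)
```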

FYI peterjc123 xkszltl goldsborough bwasti Yangqing
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11303

Reviewed By: pjh5

Differential Revision: D9694932

Pulled By: orionr

fbshipit-source-id: 5db3eb20bc5ddc873ce9151236b74663fbb33ed8
2018-09-06 19:38:09 -07:00
68613cf5a2 Windows DLL build with Caffe2 code (#11266)
Summary:
This is an experimental build on top of what orionr and mingzhe09088 built.

Essentially, the idea is that we will need separate *_API versions for different shared libraries. If this theory is right, I'll try to clean up the design a bit and document it properly.
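
A hedged sketch of the per-DLL export-macro scheme; the define names below are illustrative:

```cmake
# Each shared library gets its own BUILD define; the corresponding
# FOO_API macro checks it to pick dllexport vs. dllimport.
add_library(caffe2 SHARED ${caffe2_srcs})
target_compile_definitions(caffe2 PRIVATE CAFFE2_BUILD_MAIN_LIB)

add_library(caffe2_gpu SHARED ${caffe2_gpu_srcs})
target_compile_definitions(caffe2_gpu PRIVATE CAFFE2_GPU_BUILD_MAIN_LIB)  # illustrative
```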
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11266

Reviewed By: orionr

Differential Revision: D9682942

Pulled By: Yangqing

fbshipit-source-id: c79653199e67a1500c9174f39f8b0357324763f3
2018-09-06 15:12:20 -07:00
6508db7421 Remove BUILD_CAFFE2 and build everything (#8338)
Summary:
This completely removes BUILD_CAFFE2 from CMake. There is still a little bit of "full build" stuff in setup.py that enables USE_CUDNN and BUILD_PYTHON, but otherwise everything should be enabled for PyTorch as well as Caffe2. This gets us a lot closer to full unification.

cc mingzhe09088, pjh5, ezyang, smessmer, Yangqing
Pull Request resolved: https://github.com/pytorch/pytorch/pull/8338

Reviewed By: mingzhe09088

Differential Revision: D9600513

Pulled By: orionr

fbshipit-source-id: 9f6ca49df35b920d3439dcec56e7b26ad4768b7d
2018-08-31 13:10:24 -07:00
4cb968fb77 Default hidden visibility (#10752)
Summary:
Flipping to hidden visibility one more time. Let's see what fails.
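
For reference, the standard CMake knobs that flip a build to hidden symbol visibility:

```cmake
set(CMAKE_CXX_VISIBILITY_PRESET hidden)  # -fvisibility=hidden on GCC/Clang
set(CMAKE_VISIBILITY_INLINES_HIDDEN 1)   # -fvisibility-inlines-hidden
```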

cc mingzhe09088 pjh5 Yangqing
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10752

Reviewed By: ezyang

Differential Revision: D9526343

Pulled By: orionr

fbshipit-source-id: c0e9c29270e95e1b2e21c598095f720c199e1e52
2018-08-28 15:25:43 -07:00
f0d8a36e70 Completely remove build_aten and use_aten (#10469)
Summary:
Breaking out of #8338 to completely remove build_aten and use_aten.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10469

Reviewed By: orionr

Differential Revision: D9413639

Pulled By: mingzhe09088

fbshipit-source-id: b7203aa4f5f2bb95c504c8dc187a3167f2570183
2018-08-20 20:26:42 -07:00
0a809fc8b1 build changes to make cpu unified build working. (#10504)
Summary:
Properly annotated all APIs for the CPU front end. Checked with CMake using

cmake -DUSE_ATEN=ON -DUSE_CUDA=OFF -DBUILD_ATEN=ON

and the resulting libcaffe2.so has about 11k symbols.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10504

Reviewed By: ezyang

Differential Revision: D9316491

Pulled By: Yangqing

fbshipit-source-id: 215659abf350af7032e9a4b0f28a856babab2454
2018-08-15 17:22:36 -07:00
75651d5b58 improve use of ROCm libraries, enable more tests, small fixes (#10406)
Summary:
* some small leftovers from the last PR review
* enable more unit test sets for CI
* replace use of hcRNG w/ rocRAND (docker image was already updated w/ newer rocRAND)
* use rocBLAS instead of hipBLAS to allow convergence w/ Caffe2
* use strided_batched gemm interface also from the batched internal interface
* re-enable Dropout.cu as we now have philox w/ rocRAND
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10406

Reviewed By: Jorghi12

Differential Revision: D9277093

Pulled By: ezyang

fbshipit-source-id: 7ef2f6fe4ead77e501ed7aea5c3743afe2466ca2
2018-08-13 11:39:43 -07:00
145eb330ad Back out "Back out "Move typeid.h to move to ATen/core"" (#10465)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10465

Original commit changeset: 7050fe845e65

Reviewed By: jerryzh168

Differential Revision: D9296375

fbshipit-source-id: cb8161440ba809dcec5027858a29cd026d537fc3
2018-08-13 11:20:01 -07:00
64a6f17177 Fix ATen/core header installation. (#10463)
Summary:
Fixes #10353 and fixes #10397.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10463

Differential Revision: D9296491

Pulled By: ezyang

fbshipit-source-id: f825c2a21a113e44a6f5c1c5ec17814d9deac366
2018-08-13 09:25:49 -07:00
82d11b847e Use CUDA_LINK_LIBRARIES_KEYWORD instead of hacking. (#10437)
Summary:
There's no need to hack.
Using `CUDA_LINK_LIBRARIES_KEYWORD` is the normal way.
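
A minimal sketch of that usage (target and source names are illustrative):

```cmake
# FindCUDA reads this variable when it generates link rules, so plain
# signatures are no longer mixed with keyword signatures.
set(CUDA_LINK_LIBRARIES_KEYWORD PRIVATE)
cuda_add_library(caffe2_gpu ${caffe2_gpu_srcs})
target_link_libraries(caffe2_gpu PRIVATE caffe2)
```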
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10437

Differential Revision: D9287579

Pulled By: Yangqing

fbshipit-source-id: d3d575ea8c3235576ba971e4b7493ddb435f92f3
2018-08-12 18:09:20 -07:00
1dbdc5a93d Back out "Move typeid.h to move to ATen/core"
Summary: Original commit changeset: 21f2c89e58ca

Reviewed By: yinghai

Differential Revision: D9282171

fbshipit-source-id: 7050fe845e6524b965bdd45794a6fa1665b83e34
2018-08-10 21:39:25 -07:00
def3715e82 Minor changes for nicer pip packages (#9544)
Summary:
I am using this to test a CI job to upload pip packages, and so am using the Caffe2 namespace to avoid affecting the existing pytorch packages.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9544

Reviewed By: orionr

Differential Revision: D9267111

Pulled By: pjh5

fbshipit-source-id: a68162ed29d2eb9ce353d8435ccb5f16c3b0b894
2018-08-10 12:09:46 -07:00
40109b16d0 Remove caffe1 specific proto (#10380)
Summary:
This was used as a convenient way for us to convert c1 models. Now that conversion is more or less done, we should probably require any users who need to convert c1 models to explicitly install c1. This PR removes the explicit c1 proto (which was copied from c1) in favor of explicit installation.

Note that caffe_translator will still work properly; the only difference is that users now need to install c1 separately.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10380

Differential Revision: D9267981

Pulled By: Yangqing

fbshipit-source-id: a6ce5d9463e6567976da83f2d08b2c3d94d14390
2018-08-10 11:10:26 -07:00
3667d029b4 Move typeid.h to move to ATen/core (#10163)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10163

- Remove dependency on caffe2/core/common.h for ATen/core/typeid.h
  Unfortunately, Windows seems to rely on typeid.h including this
  header, so it is still included from the forwarding header
  caffe2/core/typeid.h
- Deduplicate Demangle/DemangleType with their ATen equivalents

Reviewed By: smessmer

Differential Revision: D9132432

fbshipit-source-id: 21f2c89e58ca1e795f1b2caa316361b729a5231b
2018-08-10 08:45:44 -07:00
25b2e88750 Stop propagating std flags to downstream gcc/nvcc (#10098)
Summary:
When we directly use -std=c++11, it propagates to downstream applications.

Problems:
1. GCC flags propagating to nvcc.
2. nvcc flags propagating to nvcc (which throws an error such as a redeclared std flag).

This PR fixes these propagation issues!

Similar problems:
https://github.com/FloopCZ/tensorflow_cc/pull/92
https://github.com/CGAL/cgal/issues/2775

Requires: CMake 3.12
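
One common way to express the requirement without leaking raw -std flags (a sketch, not necessarily the exact change in this PR):

```cmake
# Per-language standard settings; CMake translates these into the right
# flag for each compiler instead of propagating a literal -std=c++11.
set(CMAKE_CXX_STANDARD 11)
set(CMAKE_CXX_STANDARD_REQUIRED ON)
set(CMAKE_CXX_EXTENSIONS OFF)
set(CMAKE_CUDA_STANDARD 11)
```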
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10098

Differential Revision: D9187110

Pulled By: ezyang

fbshipit-source-id: 0e00e6aa3119c77a5b3ea56992ef3bbfecd71d80
2018-08-06 15:30:27 -07:00
a38b572de3 enable unit tests and other changes (#10266)
Summary:
This PR for the ROCm target does the following:
* enable some unit tests on ROCm
* fix a missing static_cast that breaks the BatchNorm call on ROCm
* fix BatchNorm to work on ROCm w/ ROCm warp sizes etc.
* improve the pyhipify script by introducing kernel scope to some transpilations, among other improvements
* fix a linking issue on ROCm
* for more unit test sets: mark currently broken tests as broken (to be fixed)
* enable THINLTO (phase one) to parallelize linking
* address the first failure of the elementwise kernel by removing a non-working ROCm specialization
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10266

Differential Revision: D9184178

Pulled By: ezyang

fbshipit-source-id: 03bcd1fe4ca4dd3241f09634dbd42b6a4c350297
2018-08-06 14:54:01 -07:00
c7c6e93312 Use target_compile_definitions for AT_CORE_STATIC_WINDOWS (#10213)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10213

nvcc only respects definitions, not options.
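
Hence the switch, sketched below (the target name is illustrative):

```cmake
# nvcc forwards -D definitions to device compilation, while generic
# compile options may be dropped -- so this must travel as a definition.
target_compile_definitions(aten_core PRIVATE AT_CORE_STATIC_WINDOWS=1)
```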

Reviewed By: dzhulgakov

Differential Revision: D9154388

fbshipit-source-id: 04c4809154df1c61108b65f1115fccdeb336952e
2018-08-03 19:25:02 -07:00
cfa05706ef ROCm contributions week 29 (#9653)
Summary:
In this changeset:
* improvements to `hipify-python.py`
* marking unit tests broken for ROCm
* reducing the number of jobs for the build to avoid out-of-memory issues
* switch to Thrust/cub-hip master for the CI
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9653

Differential Revision: D9117791

Pulled By: ezyang

fbshipit-source-id: a6c3c7b81f2bda9825974bf9bf89a97767244352
2018-08-02 09:09:00 -07:00
5a44be50ab Minor nit in comment in CMakeLists.txt
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/10125

Reviewed By: smessmer

Differential Revision: D9119766

fbshipit-source-id: 290b804bc552b1c3f68e5129ff60ef7f34307714
2018-08-01 12:39:38 -07:00
fa6b28bf40 Move ArrayRef, Backtrace, Error, SmallVector, optional to ATen/core; add CoreAPI (#10092)
Summary:
This also makes Backtrace more portable, by disabling its functionality for
mobile builds as well.

It also handles Caffe2 static Windows builds by introducing a new variable,
AT_CORE_STATIC_WINDOWS, which must be set if you're building
ATen on Windows as part of a static library.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10092

Reviewed By: gchanan, smessmer

Differential Revision: D9094393

Pulled By: ezyang

fbshipit-source-id: 93281f9302bd378605a26589ae308faf1dac7df4
2018-08-01 08:39:22 -07:00
58fd6e1dd6 Also add ATen/core tests to oss CI (#10029)
Summary:
-
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10029

Reviewed By: ezyang

Differential Revision: D9070030

Pulled By: smessmer

fbshipit-source-id: b5ae79a383dc14e7d79e6a82c5d70e951c9f5168
2018-07-31 13:54:39 -07:00