Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13232
DeviceGuard should be device agnostic, which means that it shouldn't
assume that int64_t means select the CUDA device.
Reviewed By: gchanan
Differential Revision: D10858024
fbshipit-source-id: b40e8337e4046906fd8f83a95e6206367fb29dbe
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12768
Note: DefaultTensorOptions no longer fits in 64-bits.
I kept functions that take ScalarType as input to minimize changes for now.
Reviewed By: ezyang
Differential Revision: D10419671
fbshipit-source-id: 9cc8c5982fde9ff243e03d55c0c52c2aa2c7efd8
Summary:
This PR is a large codemod to rewrite all C++ API tests with GoogleTest (gtest) instead of Catch.
You can largely trust me to have correctly code-modded the tests, so it's not required to review every of the 2000+ changed lines. However, additional things I changed were:
1. Moved the cmake parts for these tests into their own `CMakeLists.txt` under `test/cpp/api` and calling `add_subdirectory` from `torch/CMakeLists.txt`
2. Fixing DataParallel tests which weren't being compiled because `USE_CUDA` wasn't correctly being set at all.
3. Updated README
ezyang ebetica
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11953
Differential Revision: D9998883
Pulled By: goldsborough
fbshipit-source-id: affe3f320b0ca63e7e0019926a59076bb943db80
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11189
Replaces it with an operator TensorOptions() method on
Type, reestablishing the implicit conversion. I originally
wanted to get rid of the implicit conversion entirely, but
there were a *lot* of use-sites, so I added it back to avoid
a huge codemod. In this patch, I only had to fix sites that
used the optional device_index API.
Reviewed By: cpuhrsch
Differential Revision: D9628281
fbshipit-source-id: 5fe2a68eefb77a3c9bb446f03a94ad723ef90210
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11101
I'd like to invert the dependency between Tensor and TensorOptions
(such that Tensor includes TensorOptions); to do this, I'd prefer
there to not be a Tensor constructor. Eventually, all references
of Tensor will disappear from TensorOptions.h
Reviewed By: cpuhrsch
Differential Revision: D9585627
fbshipit-source-id: dd4a28b2c06b1e55f629762915f03c2b6c34d840
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11096
To discourage willy-nilly use, and make it clearer that it
is not a Variable
Reviewed By: cpuhrsch
Differential Revision: D9583699
fbshipit-source-id: 4fbde0c01ae3deb2c7ef8c125a9028f089b203ae
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10478
- Removed Backend constructor from Device, and fixed all
use-sites to use DeviceType::CPU instead of kCPU, or
use a new function backendToDeviceType to perform
the conversion.
- New method device_type() on Type; it gives you the
underlying device type, e.g., CPU for SparseCPU.
- We add backward compatibility for kCPU/kCUDA uses,
by introducing a new special type which is implicitly
convertible to both DeviceType and Backend. As long as
you don't define a function that's overloaded on both
DeviceType and Backend (but not on BackendOrDeviceType),
the implicit conversions will ensure that uses
of at::Device(at::kCPU) keep working. We fixed use-sites in
the library, but did NOT fix sites in the test code, so that
we can exercise this BC code.
Reviewed By: Yangqing
Differential Revision: D9301861
fbshipit-source-id: 9a9d88620500715c7b37e655b4fd761f6dd72716
Summary:
This PR adds the functional version of `DataParallel` (i.e. `data_parallel`) to the C++ frontend.
For this, I had to:
1. Add "differentiable" versions of scatter and gather, which perform their inverse operation in the backward pass, to C++. I've added them under `torch/csrc/autograd/functions/comm.{h,cpp}`. I had to move some utilities from `VariableType.cpp` into `torch/csrc/autograd/functions/utils.h`, and changed them a bit to fix the `const_cast`s for which there were `TODO`s,
2. Implement the `replicate`, `parallel_apply` and the combining `data_parallel` functions in C++.
`replicate` is implemented based on our existing `clone()` interface, along with the ability to set the current device via `at::OptionsGuard` (so nice).
`parallel_apply` is implemented using `at::parallel_for` (CC cpuhrsch) and [follows the code from PyTorch](https://github.com/pytorch/pytorch/blob/master/torch/nn/parallel/parallel_apply.py).
Added lots of tests for these things.
apaszke ezyang ebetica colesbury
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9234
Differential Revision: D8865182
Pulled By: goldsborough
fbshipit-source-id: 4f1fecf2b3f3bc1540c071dfb2d23dd45de433e4
Summary:
THCStream was recently moved to ATen by mruberry: https://github.com/pytorch/pytorch/pull/8997. This PR now introduces a guard class that replaces `AutoStream` from `torch/csrc/` and also uses this new stream interface.
I had to extend the `CUDAStream` interface with unchecked calls, so that we can reset the stream without throwing an exception in the guard's destructor.
colesbury apaszke ezyang
Fixes https://github.com/pytorch/pytorch/issues/7800
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9277
Differential Revision: D8865183
Pulled By: goldsborough
fbshipit-source-id: 67c9bc09629d92fa5660286b5eec08fde9108cd7
* Created DefaultTensorOptions
* Fix TensorOptions() call which was interpreted as function decl
* Fix empty OptionsGuard
* Make options_ and mutex_ in DefaultTensorOptions class static because of dynamic linker issues
* Make DefaultOptions thread local
* Created TensorOptions
Storing the type in TensorOptions to solve the Variable problem
Created convenience creation functions for TensorOptions and added tests
Converted zeros to TensorOptions
Converted rand to TensorOptions
Fix codegen for TensorOptions and multiple arguments
Put TensorOptions convenience functions into torch namespace too
All factory functions except *_like support TensorOptions
Integrated with recent JIT changes
Support *_like functions
Fix in place modification
Some cleanups and fixes
Support sparse_coo_tensor
Fix bug in Type.cpp
Fix .empty calls in C++ API
Fix bug in Type.cpp
Trying to fix device placement
Make AutoGPU CPU compatible
Remove some auto_gpu.h uses
Fixing some headers
Fix some remaining CUDA/AutoGPU issues
Fix some AutoGPU uses
Fixes to dispatch_tensor_conversion
Reset version of new variables to zero
Implemented parsing device strings
Random fixes to tests
Self review cleanups
flake8
Undo changes to variable.{h,cpp} because they fail on gcc7.2
Add [cuda] tag to tensor_options_cuda.cpp
Move AutoGPU::set_index_from into .cpp file because Windows is stupid and sucks
Fix linker error in AutoGPU.cpp
Fix bad merge conflict in native_functions.yaml
Fixed caffe2/contrib/aten
Fix new window functions added to TensorFactories.cpp
* Removed torch::TensorOptions
Added code to generate wrapper functions for factory methods
Add implicit constructor from Backend to TensorOptions
Remove Var() from C++ API and use torch:: functions
Use torch:: functions more subtly in C++ API
Make AutoGPU::set_device more exception safe
Check status directly in DynamicCUDAHooksInterface
Rename AutoGPU to DeviceGuard
Removed set_requires_grad from python_variables.h and warn appropriately in Variable::set_requires_grad
remove python_default_init: self.type()
Add back original factory functions, but with deprecation warnings
Disable DeviceGuard for a couple functions in ATen
Remove print statement
Fix DeviceGuard construction from undefined tensor
Fixing CUDA device compiler issues
Moved as many methods as possible into header files
Dont generate python functions for deprecated factories
Remove merge conflict artefact
Fix tensor_options_cuda.cpp
Fix set_requires_grad not being checked
Fix tensor_new.h
TEMPORARILY put some methods in .cpp files to see if it solves issues on windows and mac
Fix bug in DeviceGuard.h
Missing includes
TEMPORARILY moving a few more methods into .cpp to see if it fixes windows
Fixing linker errors
* Fix up SummaryOps to use new factories
Undo device agnostic behavior of DeviceGuard
Use -1 instead of optional for default device index
Also move DeviceGuard methods into header
Fixes around device index after optional -> int32_t switch
Fix use of DeviceGuard in new_with_tensor_copy
Fix tensor_options.cpp
* Fix Type::copy(
* Remove test_non_float_params from ONNX tests
* Set requires_grad=False in ONNX tests that use ints
* Put layout/dtype/device on Tensor
* Post merge fixes
* Change behavior of DeviceGuard to match AutoGPU
* Fix C++ API integration tests
* Fix flip functions