Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18181
ghimport-source-id: 9c23551584a1a1b0b7ac246367f3a7ae1c50b315
Stack from [ghstack](https://github.com/ezyang/ghstack):
* #18184 Fix B903 lint: save memory for data classes with slots/namedtuple
* **#18181 Fix B902 lint error: invalid first argument.**
* #18178 Fix B006 lint errors: using mutable structure in default argument.
* #18177 Fix lstrip bug revealed by B005 lint
A variety of sins were committed:
- Some code was dead
- Some code was actually a staticmethod
- Some code just named the first argument the wrong way
- Some code was purposely testing the omitted case
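For context, a minimal sketch of the B902 pattern and its typical fixes (hypothetical class and method names):

```python
class Foo:
    # B902: the first argument of an instance method should be `self`.
    def bad(cls, x):
        return x


class Fixed:
    def good(self, x):       # renamed the first argument
        return x

    @staticmethod            # or mark it static if no instance state is needed
    def helper(x):
        return x
```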
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Differential Revision: D14530876
fbshipit-source-id: 292a371d9a76ddc7bfcfd38b6f0da9165290a58e
This changes type(tensor) to return `torch.Tensor` instead of
`torch.autograd.Variable`.
This requires a few implementation changes:
- torch.Tensor is now a regular Python class instead of a
pseudo-factory like torch.FloatTensor/torch.DoubleTensor
- torch.autograd.Variable is just a shell with a __new__ function.
Since no instances are constructed, it doesn't have any methods.
- Adds torch.get_default_dtype() since torch.Tensor.dtype returns
<attribute 'dtype' of 'torch._C._TensorBase' objects>
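A minimal sketch of the resulting behavior (current API names; outputs shown as comments):

```python
import torch

t = torch.ones(2, 2)
print(type(t))                       # <class 'torch.Tensor'>, no longer Variable
print(isinstance(t, torch.Tensor))   # True: torch.Tensor is a real class now

print(torch.get_default_dtype())     # torch.float32 unless changed
```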
Questions/possible future work:
- How to template-ize to extend support beyond LongTensor?
- How to check if autograd works (and if not, how to add an explicit gradient)?
- CUDA support?
Testing command:
DEBUG=1 NO_CUDA=1 MACOSX_DEPLOYMENT_TARGET=10.9 CC=clang CXX=clang++ python setup.py build && DEBUG=1 NO_CUDA=1 MACOSX_DEPLOYMENT_TARGET=10.9 CC=clang CXX=clang++ python setup.py develop && python3 test/test_torch.py
Partially fixes #2031
* Initial commit for unique op
* Working unique with test
* Make inverse indices shape conform to input
* flake8 whitespace removal
* address review comment nits
* Expose fn and add docs. Explicitly declare no gradients
* Trial generic dispatch implementation
* Add tests for generics
* flake8 whitespace
* Add basic CUDA error throwing and templateize set
* Explicit contiguous and AT_DISPATCH_ALL_TYPES return
* Remove extraneous numpy conversion
* Refactor out .data calls
* Refactor to a variable-return-length API with a wrapper fn, rather than returning a 0-length tensor, per offline reviewer comments
* Remove A
* Don't use hidden torch._unique() in test
* Fix documentation
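For illustration, a usage sketch of the exposed op, assuming the present-day torch.unique signature (sorted output, optional inverse indices):

```python
import torch

x = torch.tensor([1, 3, 2, 3, 1])

# Inverse indices have the same shape as the input and map each element
# back into `values`.
values, inverse = torch.unique(x, return_inverse=True)
print(values)    # tensor([1, 2, 3])
print(inverse)   # tensor([0, 2, 1, 2, 0])
```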
This replaces the torch.Tensor constructors with factories that produce
Variables. Similarly, functions on the torch module (e.g. torch.randn)
now return Variables.
To keep the PR to a reasonable size, I've left most of the unused tensor
code in place. Subsequent PRs will remove the dead code, clean up calls to
torch.autograd.Variable, and rename Variable to Tensor everywhere.
There are some breaking changes because Variable and Tensor had
slightly different semantics. There's a list of those changes here:
https://github.com/pytorch/pytorch/wiki/Breaking-Changes-from-Variable-and-Tensor-merge
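A rough sketch of the user-facing change (using today's factory signature; the requires_grad keyword is assumed here, not part of this PR's description):

```python
import torch

x = torch.randn(2, 3)                 # factory function, returns a Variable/Tensor
print(isinstance(x, torch.Tensor))    # True
print(x.requires_grad)                # False by default

y = torch.randn(2, 3, requires_grad=True)   # opt into autograd explicitly
```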
This adds overrides in VariableType for the xxx_out ATen functions and
implements Python bindings. There is no support for automatic
differentiation. If any of the inputs (or outputs) requires grad, then the
function will throw an exception unless it's running in "no-grad" mode.
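A minimal sketch of the intended behavior, assuming the current out= spelling of these bindings:

```python
import torch

a = torch.randn(3, requires_grad=True)
b = torch.randn(3)
out = torch.empty(3)

try:
    torch.add(a, b, out=out)      # an input requires grad -> expected to raise
except RuntimeError as e:
    print("raised:", e)

with torch.no_grad():             # the same call is allowed in "no-grad" mode
    torch.add(a, b, out=out)
```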
The bindings for calling torch.xxx functions on Variables are moved to a
different object. Previously, they were static methods on VariableBase.
This change prevents users from accidentally calling static methods as if
they were instance methods.
This removes volatile from Variable. The functionality is mostly
replaced by a global (thread-local) flag, which is controlled by
torch.set_grad_enabled() and the context manager torch.no_grad().
In C++, the flag is exposed through GradMode::is_enabled() and GradMode::set_enabled().
Fixes #3627
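A minimal sketch of the replacement APIs described above:

```python
import torch

x = torch.ones(2, 2, requires_grad=True)

with torch.no_grad():            # context-manager form
    y = x * 2
print(y.requires_grad)           # False: nothing was recorded

torch.set_grad_enabled(False)    # global (thread-local) switch
z = x * 2
print(z.requires_grad)           # False
torch.set_grad_enabled(True)
```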
* Implement Variable.cuda using ATen
This adds an optional async flag to Tensor::copy_, which attempts to do
a non-blocking copy if one of the tensors is in pinned memory and
the other is a CUDA tensor (see the sketch after this list).
* Perform cross-device copy in CopyBackwards
Also call torch.cuda._lazy_init() from Variable.cuda()
* Implement Variable.type via ATen
* Changes from review:
- remove copy_out
- remove unnecessary include
- fix default device for .cuda()
* Combine if statements in dispatch_type
* Implement remaining random methods through ATen
* Change test_bernoulli on Tensor to avoid broadcasting
The new ATen-dispatched bernoulli_ supports broadcasting. The old
Tensor.bernoulli_ bindings instead require the tensors to have the same
number of elements. I haven't changed the old code because it will be
deleted soon.
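The non-blocking copy mentioned at the top of this list can be sketched as follows (the flag was originally called async; it is spelled non_blocking in current releases):

```python
import torch

if torch.cuda.is_available():
    host = torch.randn(1024, 1024).pin_memory()   # page-locked host tensor
    dev = torch.empty(1024, 1024, device="cuda")
    dev.copy_(host, non_blocking=True)            # asynchronous host-to-device copy
    torch.cuda.synchronize()                      # wait before relying on `dev`
```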
Implements from_numpy using ATen tensors. Variable.from_numpy is a
convenient placeholder for the variant that returns Variables until we
merge Tensor and Variable.
The behavior is slightly changed:
- from_numpy() on an empty array now returns an empty tensor instead of
throwing an exception. The shape may not be preserved.
- CharTensor(ndarray) used to throw an exception. It now copies the
ndarray. Copying is implemented via ATen toType.
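A small sketch of the described behavior (current API; the exact dtype handling of the empty case is an assumption):

```python
import numpy as np
import torch

arr = np.arange(6, dtype=np.float32).reshape(2, 3)
t = torch.from_numpy(arr)        # shares memory with `arr`, no copy
arr[0, 0] = 42.0
print(t[0, 0])                   # tensor(42.)

empty = torch.from_numpy(np.array([]))   # empty array -> empty tensor, no exception
print(empty.numel())                     # 0
```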
* Implement matmul as a native function; use it for Variable impl.
This also includes an (inefficient) version of allclose, which was necessary for testing.
A more efficient version would use some apply logic to fuse the ops and exit early (coming in future PR).
On small tensors [(2, 5, 5) @ (5,5)], this yields ~2.5x speedup over the Python implementation (see the sketch after this list).
* Make maybeSqueeze static.
* Have localScalar work with all 1-element tensors, not just scalars.
Also have toCFloat, etc. call localScalar so 1-element tensors work as well.
* Implement python number conversions.
* Implement __bool__, __nonzero__ as ATen functions.
* Remove merge artifacts.
* Simplify by dispatching to toCDouble.
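A usage sketch of the batched matmul and the accompanying allclose, using a plain mm loop as the reference (current API names assumed):

```python
import torch

a = torch.randn(2, 5, 5)
b = torch.randn(5, 5)

batched = torch.matmul(a, b)     # b is broadcast over the batch dimension
reference = torch.stack([torch.mm(a[i], b) for i in range(a.size(0))])

print(torch.allclose(batched, reference))   # True, up to default tolerances
```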
Implements basic and advanced indexing using ATen tensors/variables.
Basic indexing is translated at the Python-binding level
(python_variable_indexing.cpp) to slice/squeeze/unsqueeze/select calls.
Advanced indexing is implemented in ATen in terms of take() and put()
calls.
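A minimal sketch distinguishing the two indexing paths described above:

```python
import torch

x = torch.arange(12).reshape(3, 4)

# Basic indexing: handled at the binding level via slice/select/squeeze/unsqueeze.
row = x[1]            # select
block = x[:2, 1:3]    # slices

# Advanced indexing: index tensors and boolean masks, built on take()/put().
idx = torch.tensor([0, 2])
picked = x[idx]       # rows 0 and 2
masked = x[x > 5]     # flattened elements greater than 5
```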
* Move Variable conversion methods to ATen.
* Add a test to ensure type conversions work through backwards.
* Fix VariableType copy for type conversions.
* Add comment about needing to handle device movement.
* Move back to opposite order for copy function params -- inplace views depend on it.
* Use is_available() rather than is_available.
* Use aten version of is_signed.
* Define is_cuda native function and use it for variable.
* Use ATen dim for Variable dim/ndimension.
* Get rid of dim, ndimension fallthroughs in variable.py.
* Move size/stride Variable methods to use ATen.
* Implement shape property on Variable via ATen.
* Remove the __getattr__ function from Variable.
* Get rid of dispatch functions and avoid cast.
* Add THPUtils_packInt64Array.
* Throw python errors.
* Use fallthrough and fix fallthrough generation for native functions.
* is_cuda is a property, not a method.
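A small sketch of the user-visible surface these commits settle on (current API):

```python
import torch

t = torch.zeros(2, 3)
print(t.shape)        # torch.Size([2, 3]) -- the `shape` property
print(t.dim())        # 2, dispatched through ATen
print(t.is_cuda)      # False -- a property, not a method
print(t.is_signed())  # True for floating-point dtypes
```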