Summary:
In TorchScript and C++ extensions we currently advocate a mix of `torch::` and `at::` namespace usage. In the C++ frontend I had instead exported all symbols from `at::` and some from `c10::` into the `torch::` namespace. This is far, far easier for users to understand, and it also avoids bugs around creating tensors vs. variables. The same should from now on be true for the TorchScript C++ API (for running and loading models) and all C++ extensions.
Note that since we're just talking about typedefs, this change does not break any existing code.
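For illustration, here is a minimal sketch (not code from this PR; `example_op` is a made-up function name) of what extension or TorchScript C++ user code looks like once the typedefs are in place:
``` cpp
#include <torch/torch.h>

// Tensor creation and ATen ops can both be spelled with `torch::`;
// previously the ops below would have been written as `at::ones_like`
// and `at::add`.
torch::Tensor example_op(const torch::Tensor& input) {
  torch::Tensor scale = torch::ones_like(input);
  return torch::add(input, scale);
}
```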
Once this lands I will update the code in `pytorch/tutorials` too.
zdevito ezyang gchanan
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13523
Differential Revision: D12942787
Pulled By: goldsborough
fbshipit-source-id: 76058936bd8707b33d9e5bbc2d0705fc3d820763
Summary:
Tensors cannot be created at global scope because of static initialization order issues, so the tensors for the optim_baseline test must be created lazily instead. This is fine because each of these functions is only called once (in its respective test).
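As a sketch of the pattern (the function name and values here are made up, not the actual baselines), a baseline tensor now lives behind a function instead of at namespace scope:
``` cpp
#include <torch/torch.h>

// A namespace-scope `torch::Tensor` would be constructed before ATen is
// guaranteed to be initialized, so the tensor is built on first use instead.
torch::Tensor expected_baseline() {
  return torch::ones({3}) * 0.5;  // constructed only when the test calls this
}
```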
ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12301
Differential Revision: D10201008
Pulled By: goldsborough
fbshipit-source-id: 59a041f437354e7c6600e5655b3e2d0647dbde9e
Summary:
After talking to users of the C++ API we found that having the tensor type be `autograd::Variable` causes more complications than having it be `at::Tensor`. It used to be a problem because `at::Tensor` didn't have the "autograd API" of variable (e.g. `detach()` or `grad()` methods), but those methods are now on `at::Tensor`. As such, we want to make a last big breaking change to have the tensor type be `at::Tensor`, while factory methods like `torch::ones` will return `Variable`s disguised as `at::Tensor`. This will make many things easier, like calling functions in ATen that take vectors of tensors.
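As a rough sketch of that end state (not code from this PR; the exact option spelling is an assumption), the autograd API is used directly on the tensor type:
``` cpp
#include <torch/torch.h>

void autograd_on_tensor() {
  // The factory returns a variable behind the scenes, but the user-facing
  // type is simply the tensor type.
  torch::Tensor w = torch::ones({2, 2}, torch::requires_grad());
  torch::Tensor loss = (w * w).sum();
  loss.backward();
  torch::Tensor gradient = w.grad();  // grad() available on the tensor
  torch::Tensor frozen = w.detach();  // detach() available on the tensor
}
```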
This PR makes a small step in this direction by updating the optimizer classes to not use `.data()` on `Variable` to access the underlying `at::Tensor`. Using `.data()` is effectively a hack to work around our modification rules for tensors that require grad. The proper way is to use `torch.no_grad()` in Python, or equivalently `NoGradGuard` in C++, to guard in-place operations.
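A minimal sketch of the guarded update pattern (the helper below is hypothetical, not one of the actual optimizer implementations):
``` cpp
#include <torch/torch.h>

// Instead of `param.data().add_(...)`, the in-place update is wrapped in a
// NoGradGuard so autograd does not record or reject the modification.
void sgd_like_step(torch::Tensor& param, const torch::Tensor& grad, double lr) {
  torch::NoGradGuard no_grad;
  param.add_(grad, -lr);  // param -= lr * grad, applied in place
}
```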
The next step can then simply redefine `torch::Tensor` to be `at::Tensor`. This transition should be smooth, since all methods available on `Variable` are at this point available on `at::Tensor`.
For this PR I:
1. Modified the implementations of the optimizers to not use `.data()`. This means the implementations now differ from the Python versions in PyTorch, which still use the legacy `.data` approach.
2. To properly verify (1), I added more fine-grained test cases to our optimizer tests, e.g. `SGD` with and without `weight_decay`, then with `nesterov`, etc. (a sketch of such a case follows this list). Generally, more tests = more happy!
3. Minor cleanup of the optimizer codebase
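For illustration, here is a sketch of what such a fine-grained case can look like with the C++ frontend's SGD options (the values are placeholders, not the actual test baselines, and the exact option spellings may differ):
``` cpp
#include <torch/torch.h>
#include <vector>

void sgd_with_options_case() {
  std::vector<torch::Tensor> params = {
      torch::randn({3, 3}, torch::requires_grad())};

  // One of the fine-grained configurations: weight decay plus Nesterov momentum.
  torch::optim::SGD optimizer(
      params,
      torch::optim::SGDOptions(0.1).momentum(0.9).weight_decay(1e-4).nesterov(true));

  torch::Tensor loss = params[0].pow(2).sum();
  loss.backward();
  optimizer.step();       // values after stepping are what the baselines check
  optimizer.zero_grad();
}
```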
ebetica apaszke
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10490
Differential Revision: D9318229
Pulled By: goldsborough
fbshipit-source-id: fb386700f37840542bc5d323f308ea88fe5ea5c5
Summary:
This PR is the final step toward making `torch::` the only namespace that users of the C++ API ever see. Basically, I did:
``` cpp
namespace torch {
using namespace at;
}
```
I then changed `at::` to `torch::` almost everywhere, which worked surprisingly well out of the box. Users can now write `torch::relu`, `torch::log_softmax`, and `torch::conv2d` instead of having to know when to use `at::` and when to use `torch::`. This is happy!
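For example (shapes and arguments here are illustrative only), the functions mentioned above can now all be spelled in one namespace:
``` cpp
#include <torch/torch.h>

void unified_namespace_example() {
  torch::Tensor input = torch::randn({1, 3, 8, 8});
  torch::Tensor weight = torch::randn({6, 3, 3, 3});

  // Previously at::conv2d / at::relu / at::log_softmax.
  torch::Tensor conv = torch::conv2d(input, weight);
  torch::Tensor activated = torch::relu(conv);
  torch::Tensor out = torch::log_softmax(activated.view({1, -1}), /*dim=*/1);
}
```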
I also added `using Dtype = at::ScalarType`, since `Dtype` will be the eventual name anyway.
ebetica ezyang apaszke zdevito
Closes https://github.com/pytorch/pytorch/pull/8911
Reviewed By: ezyang
Differential Revision: D8668230
Pulled By: goldsborough
fbshipit-source-id: a72ccb70fca763c396c4b0997d3c4767c8cf4fd3
* Rework optim folder
* Removed TORCH_OPTIMIZER_CLASS macro
* Got rid of CRTP/Impl
* Removed TORCH_AUTOGRAD_KWARG
* Differentiate between Optimizer and LossClosureOptimizer
* Make optimizers parameter-based instead of model-based
* Allow construction of an optimizer from an arbitrary parameter vector (see the sketch after this list)
* Added test for zero grad
* Added test for external parameter vectors
* Now comparing against baseline values
* Documentation
* Post-rebase fixes
* Different strategy for creating and accessing buffers in optimizers
* Fix member ordering
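As a sketch of the parameter-based construction and the zero-grad behavior exercised by those tests (the optimizer choice and values are illustrative, and the exact option spellings may differ):
``` cpp
#include <torch/torch.h>
#include <vector>

void optimizer_from_parameter_vector() {
  // The optimizer is constructed from an arbitrary parameter vector,
  // not from a model object.
  std::vector<torch::Tensor> params = {
      torch::randn({4, 4}, torch::requires_grad()),
      torch::randn({4}, torch::requires_grad())};
  torch::optim::Adam optimizer(params, torch::optim::AdamOptions(1e-3));

  torch::Tensor loss = params[0].matmul(params[1]).sum();
  loss.backward();
  optimizer.step();
  optimizer.zero_grad();  // clears the gradients of every registered parameter
}
```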