mirror of
https://github.com/pytorch/pytorch.git
synced 2025-10-20 21:14:14 +08:00
Summary: As part of the Variable/Tensor merge work: https://github.com/pytorch/pytorch/issues/13638, we make the following changes in this PR: 1. Remove the `Variable::Impl` class and the `DifferentiableViewImpl` class 2. Change all `Variable.data()` call sites to either use `Variable` directly, or use `Variable.tensor_data()` 3. Remove `Variable.data()` API 3. Add `Variable.variable_data()` that matches `tensor.data` in Python API, which creates a new `Variable` that shares the same storage and tensor metadata with the original `Variable`, but with a completely new autograd history. After this PR, Variable doesn't wrap a Tensor internally anymore, and both Variable and Tensor use the same TensorImpl class as its `impl_`. The only difference is that Variable always has AutogradMeta in its TensorImpl, but Tensor doesn't. **Note that this PR is BC-breaking in the following use cases:** **Use Case 1:** Previously, `x.data = y` works even if `x` and `y` are of different TensorImpl type (e.g. `x` is a CPU dense tensor whose impl is of type TensorImpl, while `y` is a CPU sparse tensor whose impl is of type SparseTensorImpl). However, after this PR, `x.data = y` doesn't work anymore if `x` and `y` are of different TensorImpl type, because the underlying implementation `variable.set_data(tensor)` no longer works if `variable` and `tensor` have different TensorImpl type. **Use Case 2:** If a tensor `x`'s `grad` is sparse, accumulating dense gradients to `x` will change the tensor that `x.grad` is pointing to. This is better illustrated with the following example: ```python params = torch.tensor([1.5, 1.5]).requires_grad_() with torch.no_grad(): # Change gradient to a sparse tensor params.grad = torch.sparse_coo_tensor(torch.tensor([[1, 1]]).long(), torch.tensor([1., 1.])) grad_saved = params.grad params.backward(torch.tensor([1.5, 1.5])) assert id(grad_saved) == id(params.grad) # This will fail after this PR ``` The assertion in the last line will fail after this PR, because adding dense gradients to sparse gradients will change the `params.grad` tensor reference. Pull Request resolved: https://github.com/pytorch/pytorch/pull/17072 Differential Revision: D14075257 Pulled By: yf225 fbshipit-source-id: 0e681df641270dea586042dd26db59f2e76b5957
31 lines
843 B
C++
31 lines
843 B
C++
#include <torch/csrc/autograd/functions/basic_ops.h>
|
|
|
|
#include <torch/csrc/autograd/function.h>
|
|
#include <torch/csrc/autograd/variable.h>
|
|
#include <torch/csrc/autograd/functions/utils.h>
|
|
|
|
#include <ATen/ATen.h>
|
|
|
|
#include <memory>
|
|
#include <utility>
|
|
|
|
namespace torch { namespace autograd {
|
|
|
|
auto Error::apply(variable_list&& inputs) -> variable_list {
|
|
throw std::runtime_error(msg);
|
|
}
|
|
|
|
auto DelayedError::apply(variable_list&& inputs) -> variable_list {
|
|
tensor_list outputs;
|
|
outputs.reserve(inputs.size());
|
|
for (auto& var : inputs) {
|
|
// FIXME: share version counters
|
|
outputs.emplace_back(var.defined() ? var.tensor_data() : at::Tensor());
|
|
}
|
|
return wrap_outputs(inputs, std::move(outputs), [&](edge_list&& next_edges) {
|
|
return std::make_shared<Error>(msg, std::move(next_edges));
|
|
});
|
|
}
|
|
|
|
}} // namespace torch::autograd
|