Commit Graph

229 Commits

Author SHA1 Message Date
335bfa24e0 Add an AutogradMeta factory. (#28593)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28593

When I turn on Variable everywhere, I will need to be able to construct
AutogradMetas from TensorImpl. But I cannot call the constructor directly,
as it lives in another dynamic library, so I need a virtual factory interface
to be able to do this.

I also adjust the AutogradMeta constructor so that the TensorImpl argument is
optional. This argument is only needed if `requires_grad == true`, as we use it
to test whether the variable is valid (only floating-point tensors can have
requires_grad set to true).
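
As a rough illustration of the shape of such a factory (the names below are placeholders, not the PR's actual interface):

```c++
#include <memory>

// Placeholder type standing in for the real TensorImpl/AutogradMeta split.
struct AutogradMetaInterface {
  virtual ~AutogradMetaInterface() = default;
};

// A virtual factory lets the core library ask the autograd library (built as
// a separate dynamic library) to construct the concrete AutogradMeta, instead
// of calling its constructor directly across the library boundary.
struct AutogradMetaFactory {
  virtual ~AutogradMetaFactory() = default;
  virtual std::unique_ptr<AutogradMetaInterface> make() const = 0;
};
```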

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Differential Revision: D18171161

Pulled By: ezyang

fbshipit-source-id: 3f2e86720899b3bda36ddd90244c2624645cc519
2019-10-31 11:45:03 -07:00
18f2efa997 Unfriend Variable factory functions. (#28601)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28601

In the process, I moved AutogradMeta out of the Variable class. The
intent is that I'm going to delete the Variable class entirely,
so I had better not be putting stuff in it!

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Differential Revision: D18171160

Pulled By: ezyang

fbshipit-source-id: 9c0bcdc82797eca0577d1b0745b4a2ae962f3010
2019-10-31 11:44:58 -07:00
4f4c69b1de Make set_grad_accumulator private (friend class SavedVariable) (#27666)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27666

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Differential Revision: D17886544

Pulled By: ezyang

fbshipit-source-id: b9ff845cb1e5ec6f7cb4f2fa171403d555014248
2019-10-16 09:57:27 -07:00
e1f58b7c4c Make AutogradMeta a private struct in Variable. (#27654)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27654

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Differential Revision: D17886547

Pulled By: ezyang

fbshipit-source-id: ea0c5b40a5f34bc37657ed5d3bce9140063ddcbb
2019-10-16 09:57:23 -07:00
34522c212a Add trailing underscore to member variable. (#27651)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27651

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Differential Revision: D17886546

Pulled By: ezyang

fbshipit-source-id: b8f7c74b1004d35690a815b0c7671a07ca612e94
2019-10-16 09:57:19 -07:00
60343a82e9 Named tensor support for: atan2, output_nr, detach{_}, requires_grad_ (#26543)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26543

Also adds a test for logical_xor (it already had named tensor support
but there was no test)

Test Plan: - [namedtensor ci]

Differential Revision: D17501403

Pulled By: zou3519

fbshipit-source-id: 49be15580be9fb520e25a8020164e5a599d22d40
2019-09-25 05:23:57 -07:00
mal
6b656565ab Hooks for C++ API (#24393)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24393

Adds the ability to register a hook on a variable, similar to the Python autograd API. register_hook takes a function as an argument and creates a CppFunctionPreHook, similar to PyFunctionPreHook.
It returns the index of the hook, which can be passed to remove_hook to disable the hook.
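
A hedged usage sketch of the resulting API (assuming the libtorch spellings `Tensor::register_hook` and `Tensor::remove_hook`):

```c++
#include <torch/torch.h>

void hook_example() {
  torch::Tensor x = torch::ones({2, 2}, torch::requires_grad());

  // Register a hook that observes (and here rewrites) the gradient flowing into x.
  auto idx = x.register_hook([](torch::Tensor grad) {
    return grad * 2;  // returning a tensor replaces the incoming gradient
  });

  x.sum().backward();

  // The returned index can be passed back later to disable the hook.
  x.remove_hook(idx);
}
```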

Test Plan: Added tests.

Differential Revision: D16861722

fbshipit-source-id: d08047f932e38c7bde04283a18b2d0311c8ad604
2019-08-16 12:44:20 -07:00
mal
e7a9b0d62f Rename torch::autograd::Function to torch::autograd::Node
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23269

Test Plan: Imported from OSS

Differential Revision: D16454878

fbshipit-source-id: b1e840fc2d3901955280d141e5ad6efd5e9d66af
2019-07-23 20:52:22 -07:00
3b1c3996e1 remove RTTI check for TensorImpl shadow copy (#22773)
Summary:
We introduced RTTI in a recent change: https://github.com/pytorch/pytorch/pull/21613

For the internal mobile build we don't enable '-frtti' yet. This diff replaces the
RTTI check with an alternative approach.

According to dzhulgakov, we can compare the two tensors' type_ids directly in most
cases. This is stricter than comparing the TensorImpl subclass types, since the
TensorImpl -> type_id mapping is 1-to-n, but it is more appropriate for this use case.

The only two cases where we relax the direct type comparison (for legacy reasons) are:
1. CPUTensor <-> CUDATensor;
2. SparseCPUTensor <-> SparseCUDATensor;
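
An illustrative sketch of the strict comparison described above (not the PR's exact code; the two legacy pairs listed above would be allowed on top of this):

```c++
#include <ATen/ATen.h>

// Compare the TensorImpls' type ids directly; unlike an RTTI check
// (typeid/dynamic_cast), this works in builds compiled without -frtti.
bool same_impl_type(const at::Tensor& dst, const at::Tensor& src) {
  return dst.unsafeGetTensorImpl()->type_id() ==
         src.unsafeGetTensorImpl()->type_id();
}
```
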
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22773

Differential Revision: D16277696

Pulled By: ljk53

fbshipit-source-id: 043e264fbacc37b7a11af2046983c70ddb62a599
2019-07-15 23:21:57 -07:00
815e73bc20 make_variable consumes the Tensor if it only has one reference
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/22705

Test Plan: Imported from OSS

Differential Revision: D16192220

Pulled By: jamesr66a

fbshipit-source-id: 9c42bb759077b74a1370d3a2d7114ed3593f333b
2019-07-14 18:36:20 -07:00
6c454ff14c Stop using Type in Python bindings (#21963)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21963
ghimport-source-id: 4d9d66ba2c8587503d892b67f535cc2a62e2d19e

Test Plan: Imported from OSS

Differential Revision: D15897423

Pulled By: li-roy

fbshipit-source-id: 2dd55ceb80971df7c86545b7bfff733387f13572
2019-06-30 04:11:32 -07:00
6b972795e4 Add torch.__future__._overwrite_module_params_on_conversion global flag, and check it in nn.Module._apply() (#21613)
Summary:
https://github.com/pytorch/pytorch/pull/17072 breaks `model.to(xla_device)`, because moving `model` to XLA device involves changing its parameters' TensorImpl type, and the current implementation of `nn.Module.to()` doesn't support changing module parameters' TensorImpl type:
```python
# 6dc445e1a8/torch/nn/modules/module.py (L192-L208)
def _apply(self, fn):
    ...
    for param in self._parameters.values():
        if param is not None:
            # Tensors stored in modules are graph leaves, and we don't
            # want to create copy nodes, so we have to unpack the data.
            param.data = fn(param.data)  # NOTE: this doesn't allow changing `param.data`'s TensorImpl type
            if param._grad is not None:
                param._grad.data = fn(param._grad.data)  # NOTE: this doesn't allow changing `param._grad.data`'s TensorImpl type
   ...
```

yf225 TODO: fix the description here when we finish the implementation

To fix this problem, we introduce a new API `model.to_()` that always assigns new tensors to the parameters (thus supporting changing the parameters to any TensorImpl type), and also bumps the version counter of the original parameters correctly so that they are invalidated in any autograd graph they participate in.

We also add a warning to the current `model.to()` API to inform users about the upcoming behavior change of `model.to()`: in future releases, it will create and return a new model instead of updating the current model in place.

This unblocks adding XLA to our CI test suite, which also allows XLA to catch up with other changes in our codebase, notably the c10 dispatcher.

[xla ci]

cc. resistor ailzhang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21613

Differential Revision: D15895387

Pulled By: yf225

fbshipit-source-id: b79f230fb06019122a37fdf0711bf2130a016fe6
2019-06-19 10:30:02 -07:00
8cde4c4d22 Remove Variable::Impl and DifferentiableViewImpl (#17072)
Summary:
As part of the Variable/Tensor merge work: https://github.com/pytorch/pytorch/issues/13638, we make the following changes in this PR:
1. Remove the `Variable::Impl` class and the `DifferentiableViewImpl` class
2. Change all `Variable.data()` call sites to either use `Variable` directly, or use `Variable.tensor_data()`
3. Remove the `Variable.data()` API
4. Add `Variable.variable_data()` that matches `tensor.data` in the Python API, which creates a new `Variable` that shares the same storage and tensor metadata with the original `Variable`, but with a completely new autograd history.

After this PR, Variable no longer wraps a Tensor internally, and both Variable and Tensor use the same TensorImpl class as their `impl_`. The only difference is that Variable always has AutogradMeta in its TensorImpl, but Tensor doesn't.

**Note that this PR is BC-breaking in the following use cases:**

**Use Case 1:**
Previously, `x.data = y` works even if `x` and `y` are of different TensorImpl type (e.g. `x` is a CPU dense tensor whose impl is of type TensorImpl, while `y` is a CPU sparse tensor whose impl is of type SparseTensorImpl). However, after this PR, `x.data = y` doesn't work anymore if `x` and `y` are of different TensorImpl type, because the underlying implementation `variable.set_data(tensor)` no longer works if `variable` and `tensor` have different TensorImpl type.

**Use Case 2:**
If a tensor `x`'s `grad` is sparse, accumulating dense gradients to `x` will change the tensor that `x.grad` is pointing to. This is better illustrated with the following example:
```python
params = torch.tensor([1.5, 1.5]).requires_grad_()
with torch.no_grad():
    # Change gradient to a sparse tensor
    params.grad = torch.sparse_coo_tensor(torch.tensor([[1, 1]]).long(), torch.tensor([1., 1.]))

grad_saved = params.grad
params.backward(torch.tensor([1.5, 1.5]))
assert id(grad_saved) == id(params.grad)  # This will fail after this PR
```
The assertion in the last line will fail after this PR, because adding dense gradients to sparse gradients will change the `params.grad` tensor reference.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17072

Differential Revision: D14075257

Pulled By: yf225

fbshipit-source-id: 0e681df641270dea586042dd26db59f2e76b5957
2019-05-23 21:09:04 -07:00
5b78a5eadb Memory format support for contiguous and is_contiguous (#20455)
Summary:
#19975 was split into 2 PRs.

This one:

Introduce a MemoryFormat argument to the `x.is_contiguous(memory_format=torch.channels_last)` and `y = x.contiguous(memory_format=torch.channels_last)` functions.

At the moment both functions just operate on strides and don't store any tensor state.

(Original RFC #19092)

-----

Expands the functionality of two tensor functions, `.is_contiguous` and `.contiguous` (both the Python and C++ APIs).

Note: We had several complaints about the `.to(memory_format)` function, and decided not to support it.

1. `.contiguous` now supports an optional keyword-only argument, `memory_format`, which can be either `torch.contiguous_format` or `torch.channels_last`.

    - Using `torch.contiguous_format` preserves the existing `.contiguous()` behavior.

    - Calling `x.contiguous(memory_format=torch.channels_last)` returns a new tensor which maintains the same semantic layout (NCHW) but has a different memory allocation pattern.

        `x.contiguous(memory_format=torch.channels_last)` expects the input tensor to be 3d, 4d or 5d, and fails otherwise.

2. `.is_contiguous` now supports an optional keyword-only argument, `memory_format`, which can be either `torch.contiguous_format` or `torch.channels_last`.

    - `x.is_contiguous(memory_format=torch.contiguous_format)` preserves the same functionality as `x.is_contiguous()` and remains unchanged.

    - `x.is_contiguous(memory_format=torch.channels_last)` returns true if A) the input tensor is contiguous in memory AND B) it is allocated in NHWC (or the analogous 3d/5d) format.

Note: By the end of phase one, `x.is_contiguous(memory_format=torch.channels_last)` will compute the state of the Tensor on every call. This functionality is going to be updated later.
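
A hedged example of the same calls through the C++ API (assuming the `at::MemoryFormat` enum spelling):

```c++
#include <ATen/ATen.h>

void channels_last_example() {
  at::Tensor x = at::randn({8, 3, 32, 32});                    // NCHW, default layout
  at::Tensor y = x.contiguous(at::MemoryFormat::ChannelsLast); // same semantics, NHWC strides

  bool a = x.is_contiguous();                                  // unchanged behavior: true
  bool b = y.is_contiguous(at::MemoryFormat::ChannelsLast);    // true
  bool c = y.is_contiguous();                                  // false for this layout
  (void)a; (void)b; (void)c;
}
```
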
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20455

Differential Revision: D15341577

Pulled By: VitalyFedyunin

fbshipit-source-id: bbb6b4159a8a49149110ad321109a3742383185d
2019-05-16 07:18:24 -07:00
456b889353 Require passing version_counter and allow_tensor_metadata_change to shallow_copy_and_detach() (#20496)
Summary:
Previously, the caller of `shallow_copy_and_detach()` was responsible for deciding whether the shallow copy should share the source TensorImpl's version counter or have its own new version counter. However, since this decision is crucial for ensuring the correctness of the shallow copy's version counter, we now require users of `shallow_copy_and_detach()` to pass a version counter to the function call, so that they make the decision at the time of API usage, not as an afterthought.

For similar reasons, we now require users of `shallow_copy_and_detach()` to pass `allow_tensor_metadata_change` to the function call, so that they decide whether the TensorImpl shallow copy should allow tensor metadata changes at the time of API usage, not as an afterthought.
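
A hedged sketch of what a call site looks like after this change, with both decisions spelled out explicitly at the point of use:

```c++
#include <ATen/ATen.h>

c10::intrusive_ptr<at::TensorImpl> detach_impl(const at::Tensor& self) {
  // The caller now explicitly chooses which version counter the copy shares
  // and whether tensor metadata changes are allowed on it.
  return self.unsafeGetTensorImpl()->shallow_copy_and_detach(
      /*version_counter=*/self.unsafeGetTensorImpl()->version_counter(),
      /*allow_tensor_metadata_change=*/false);
}
```
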
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20496

Differential Revision: D15363620

Pulled By: yf225

fbshipit-source-id: a65e74738b10452668d6dc644b43aad5b3d8c9e6
2019-05-15 21:02:48 -07:00
97e1f07ffc Replace AT_CHECK with TORCH_CHECK [shard 10/10]
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20436

Reviewed By: jerryzh168

Differential Revision: D15318926

fbshipit-source-id: 71a43070cc50cc174f703ebc595f1d87c6fc1e91
2019-05-15 07:35:37 -07:00
6ca38d9840 Cleanup includes in torch/csrc/autograd/* (#19923)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19923
ghimport-source-id: 54debdd21ca0f4230b1915905673de274807a2e5

Differential Revision: D15125016

Pulled By: ZolotukhinM

fbshipit-source-id: 8d54f436e4508067089a1d05ce192093220aa1bb
2019-05-06 13:48:42 -07:00
4ae59e4744 Move version_counter_ to TensorImpl (#18223)
Summary:
According to https://github.com/pytorch/pytorch/issues/13638#issuecomment-468055428, after the Variable/Tensor merge, we may capture variables without autograd metadata inside an autograd function, and we need a working version counter in these cases. This PR makes it possible by moving `version_counter_` out of autograd metadata and into TensorImpl, so that variables without autograd metadata still have version counters.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18223

Differential Revision: D14735123

Pulled By: yf225

fbshipit-source-id: 15f690311393ffd5a53522a226da82f5abb6c65b
2019-04-11 15:12:45 -07:00
043e363c6c Cache device on TensorImpl; clean up TensorImpl constructors. (#18833)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18833
ghimport-source-id: 6f2be25fcc5e6be3ffe20582e604bd2c1fbab66b

Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18833 [STACK] Cache device on TensorImpl; clean up TensorImpl constructors.**
* #18832 [STACK] Disallow changing the device of a tensor via set_.
* #18831 [STACK] Stop swapping in Storages of the wrong device for Tensors.

1) We cache the device on TensorImpl. This means we can access the device without a virtual function (sketched below), and it allows us to more easily extend TensorImpls (because they don't need to figure out how to store the Device for themselves).

2) Clean up the TensorImpl APIs. We had a constructor that took a TensorTypeId and an allocator and would allocate a Storage based on the recognized types of TensorTypeIds. Instead, we just have two different constructors: one for types with a storage, one without.
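
An illustrative sketch of point 1) (a placeholder class, not the actual TensorImpl code):

```c++
#include <c10/core/Device.h>

// The device is stored as a plain cached member, so reading it does not
// require a virtual call into whatever TensorImpl subclass sits behind it.
class CachedDeviceImpl {
 public:
  explicit CachedDeviceImpl(c10::Device device) : device_(device) {}
  c10::Device device() const { return device_; }  // non-virtual accessor
 private:
  c10::Device device_;  // cached at construction time
};
```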

Reviewed By: dzhulgakov

Differential Revision: D14766230

fbshipit-source-id: 745b8db84dcd6cb58f1a8675ad3ff8d033bc50df
2019-04-05 07:21:39 -07:00
32d0e7e339 Move pyobj_ to TensorImpl (#18225)
Summary:
Currently, `THPVariable_Wrap(…)` and `THPVariable_NewWithVar(…)` depend on the existence of `pyobj_` in the autograd metadata of a Variable to convert the Variable to a Python tensor. However, after the Variable/Tensor merge, there will be Variables that don't contain autograd metadata, and to allow the conversion from a non-autograd-meta Variable to a Python tensor we need to store `pyobj_` outside of the autograd metadata, in a place where it will always be available.

This PR makes it possible by moving `pyobj_` into TensorImpl, so that `THPVariable_Wrap(…)` and `THPVariable_NewWithVar(…)` can always access a Variable's `pyobj_` and convert the Variable to a Python tensor.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18225

Differential Revision: D14562616

Pulled By: yf225

fbshipit-source-id: 18d4aaace70eee6120abaf9276036d1f8f51b18d
2019-03-23 12:50:38 -07:00
0f7e6f293b Make Variable::set_data non-const; cosmetic fixes.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17761

Differential Revision: D14406603

Pulled By: ezyang

fbshipit-source-id: bc8bba73352eb4b3e21196b36522e9cec70f6676
2019-03-12 12:41:57 -07:00
6a297b8675 Don't make factory methods create a tensor and then immediately copy it (#17565)
Summary:
Create a `make_variable` override that moves out of a tensor instead of going through `shallow_copy_and_detach`. Call this override from factory methods like `empty` that create a brand new tensor, do nothing with it, and then copy it into a variable.

Will update this with actual numbers, but it seems to get rid of around 20-40% of the overhead of calling `torch.empty(0)`
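
A hedged sketch of the call-site pattern this describes (`my_empty` is a placeholder name, not the PR's code):

```c++
#include <torch/torch.h>

torch::autograd::Variable my_empty(at::IntArrayRef size,
                                   const at::TensorOptions& options) {
  // The freshly created tensor has a single owner, so make_variable can move
  // out of it instead of going through shallow_copy_and_detach.
  at::Tensor result = at::empty(size, options);
  return torch::autograd::make_variable(std::move(result), /*requires_grad=*/false);
}
```
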
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17565

Differential Revision: D14266130

Pulled By: umanwizard

fbshipit-source-id: f57d5f2ca3f80ee8ee96d50f905e852fd10db941
2019-03-03 22:16:21 -08:00
e2a5b203fc Enforce same input tensor storage in VariableType functions (#16305)
Summary:
In VariableType.cpp, when a function modifies its input tensors, it should only change the input tensors' storage data in-place, and should never change the input tensors' storage pointers. This PR adds checks for this, and also fixes functions that fail this test.

This is part of the Variable/Tensor merge work (https://github.com/pytorch/pytorch/issues/13638).
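
An illustrative sketch of the invariant being checked (not the PR's exact code):

```c++
#include <ATen/ATen.h>

// After an in-place op, the tensor must still point at the very same
// StorageImpl; only the bytes inside that storage may have changed.
void check_storage_unchanged(const at::Tensor& t,
                             const at::StorageImpl* storage_before) {
  TORCH_CHECK(t.storage().unsafeGetStorageImpl() == storage_before,
              "function modified its input tensor's storage pointer");
}
```
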
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16305

Differential Revision: D13897855

Pulled By: yf225

fbshipit-source-id: 0c4fc7eb530d30db88037b1f0981f6f8454d3b79
2019-02-11 13:33:12 -08:00
4404762d7d Rename IntList to IntArrayRef. (#16751)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16751

This was made more complicated by the fact that ivalue::IntList
is a thing. So I had to fix all of the sites where we were referring
to IValue post facto.

The following codemods were run, in this order:

```
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in IntList IntArrayRef
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in IntArrayRef::create IntList::create
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in ivalue::IntArrayRef ivalue::IntList
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in Tag::IntArrayRef Tag::IntList
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in isIntArrayRef isIntList
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in toIntArrayRef toIntList
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in 'Shared<IntArrayRef>' 'Shared<IntList>'
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in 'intrusive_ptr<IntArrayRef>' 'intrusive_ptr<IntList>'
```

Some manual fixups were done afterwards; they can be reviewed separately
at https://github.com/pytorch/pytorch/pull/16752

Reviewed By: dzhulgakov

Differential Revision: D13954363

fbshipit-source-id: b5c40aacba042402155a2f5a229fa6db7992ac64
2019-02-05 14:54:34 -08:00
9bf7eb914d Move VariableImpl functions to AutogradMeta and Variable (#15487)
Summary:
In this PR, we are moving all functions away from `Variable::Impl`, in order to get rid of `Variable::Impl` (and the `data_` Tensor in it) in the next PR. Some of the functions (such as `set_requires_grad` / `requires_grad` / `grad`) will be living in `AutogradMeta` class, while others (such as `backward()` / `rebase_history()` / `grad_accumulator()` / `grad_fn()`) will be living in `Variable` class.

This is the 2nd PR mentioned in https://github.com/pytorch/pytorch/issues/13638.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15487

Differential Revision: D13553173

Pulled By: yf225

fbshipit-source-id: 691f9432d0cd0640af380c757f3e3a2f64f8851c
2018-12-27 17:16:31 -08:00
7b87ecae37 Move autograd metadata from VariableImpl to TensorImpl (#13827)
Summary:
Changes originally in this PR:
1. Move Variable::Impl data members into TensorImpl as `AutogradMeta` struct
2. Change Variable::Impl functions to use data members in `AutogradMeta` struct
3. Add `shallow_copy_and_detach()` function to each subclass of TensorImpl
4. Do shallow copy when the user calls `make_variable(tensor)` / `make_variable_view(tensor)` / `variable.set_data(tensor)` / `variable.detach()`

Changes moved from https://github.com/pytorch/pytorch/pull/13645:
1. Add a flag to Variable to disallow size/stride/storage_ptr changes from in-place operations such as `resize_` / `resize_as_` / `set_` / `transpose_`, and set this flag to true when people call `tensor.data` in Python.
2. Write text in the docs to actively discourage changing the shape or storage of `tensor_detached` and expecting `tensor` to also be updated.

This is the 1st+2nd PR mentioned in https://github.com/pytorch/pytorch/issues/13638.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13827

Differential Revision: D13507173

Pulled By: yf225

fbshipit-source-id: b177b08438d534a8197e34e1ad4a837e2db0ed6a
2018-12-26 16:34:24 -08:00
1e9c384afb Enable performance-unnecessary-value-param in .clang-tidy (#15026)
Summary:
This PR fixes around 250 places in the codebase where we were making unnecessary copies of objects (some large, some small).

ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15026

Differential Revision: D13458784

Pulled By: goldsborough

fbshipit-source-id: be5148b2ce09493588d70952e6f6d6ff5ec5199b
2018-12-13 16:15:35 -08:00
517c7c9861 Canonicalize all includes in PyTorch. (#14849)
Summary:
Anywhere we used #include "foo.h", we now say #include <foo.h>.
Paths are adjusted to be rooted out of aten/src, torch/lib, or
the root-level directory.

I modified CMakeLists.txt by hand to remove TH and THC from
the include paths.

I used the following script to do the canonicalization:

```
  import subprocess
  import re
  import os.path

  files = subprocess.check_output(['git', 'ls-files']).decode('utf-8').rstrip().split('\n')
  for fn in files:
      if not any(fn.endswith(suff) for suff in ['.cu', '.cpp', '.in', '.h', '.hpp', '.cu', '.cuh', '.cc']):
          continue
      if not any(fn.startswith(pref) for pref in ["aten/", "torch/"]):
          continue
      with open(fn, 'r') as f:
          c = f.read()
      def fmt(p):
          return "#include <{}>".format(p)
      def repl(m):
          p = m.group(1)
          if p in ["dlfcn.h", "unistd.h", "nvrtc.h", "cuda.h", "cuda_runtime.h", "cstdint", "cudnn.h", "Python.h", "cusparse.h", "cuda_runtime_api.h", "cuda_fp16.h", "cublas_v2.h", "stdint.h", "curand_kernel.h"]:
              return fmt(p)
          if any(p.startswith(pref) for pref in ["torch/csrc", "c10/", "ATen/", "caffe2/", "TH/", "THC/", "Eigen/", "gtest/", "zdl/", "gloo/", "onnx/", "miopen/"]):
              return fmt(p)
          for root in ["aten/src", "torch/lib", ""]:
              for bad_root in [os.path.dirname(fn), "aten/src/TH", "aten/src/THC", "torch/csrc"]:
                  new_p = os.path.relpath(os.path.join(bad_root, p), root)
                  if not new_p.startswith("../") and (os.path.exists(os.path.join(root, new_p)) or os.path.exists(os.path.join(root, new_p + ".in"))):
                      return fmt(new_p)
          print("ERROR: ", fn, p)
          return m.group(0)
      new_c = re.sub(r'#include "([^"]+)"', repl, c)
      if new_c != c:
          print(fn)
          with open(fn, 'w') as f:
              f.write(new_c)
```

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14849

Reviewed By: dzhulgakov

Differential Revision: D13363445

Pulled By: ezyang

fbshipit-source-id: 52361f878a672785f9306c9e9ab2513128092b68
2018-12-08 19:38:30 -08:00
e70321ed9e Remove unnecessary type dispatches from Variable::Impl ctor (#13630)
Summary:
This should improve the performance of wrapping a tensor in a Variable
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13630

Reviewed By: ezyang

Differential Revision: D12944960

Pulled By: zou3519

fbshipit-source-id: 89fa78a563e46a747d851a90ffd1b5cf3cd2d0d7
2018-11-07 07:27:40 -08:00
27ccc8787f Implement data_ptr as a native function.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13367

Reviewed By: ezyang

Differential Revision: D12855339

Pulled By: gchanan

fbshipit-source-id: da5d75ab38e01365717eed9a676dcbb22ac89fe7
2018-10-31 09:51:04 -07:00
8c2d0c831f Speed up tensor.storage_offset (#13267)
Summary:
This PR special cases tensor.storage_offset to avoid dispatches in the
common case. tensor.storage_offset is important for torch.as_strided
performance, because as_strided(sizes, strides) shares an implementation
with as_strided(sizes, strides, storage_offset) and it might not be the
best if there were two separate implementations (including backward
implementations).

This PR reduces times on a tensor.storage_offset
microbenchmark from 22ns to 2ns (these numbers are pretty stable). For
a torch.as_strided benchmark, this PR reduces numbers from 1042ns to
928ns, a 100ns improvement, but this number is noisy and goes up and
down.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13267

Reviewed By: ezyang

Differential Revision: D12829828

Pulled By: zou3519

fbshipit-source-id: df907731e2398ce2baf1c8b1860a561ccc456f78
2018-10-30 07:36:21 -07:00
efab8e8fdf Speed up tensor.get_device(), is_cuda(), is_sparse() by avoiding dispatches (#12841)
Summary:
`tensor.get_device()` went through two dispatches: once to the native function `get_device()`, and another when `get_device` calls `_th_get_device()`. This PR avoids the dispatch by directly implementing the `get_device` function as a method on Tensor.

Future Work:
- Investigate caching Device on TensorImpl. This will probably bring the
  tensor.get_device down to 2ns, but I'm not sure it's worth it.

before:
```
------------------------------------------------------------------------
Benchmark                                 Time           CPU Iterations
------------------------------------------------------------------------
BM_TensorTypeId                           0 ns          0 ns 1000000000
BM_TensorType                             8 ns          8 ns   89407911
BM_TensorIsCuda                          24 ns         24 ns   29313017
BM_TensorIsSparse                        27 ns         27 ns   26083160
BM_TensorTypeIsCuda                      11 ns         11 ns   65128120
BM_TensorNumel                           11 ns         11 ns   68314492
BM_TensorGetDevice                       71 ns         71 ns    9633125
BM_DeviceGuardCtor                      173 ns        173 ns    4067173
BM_DeviceGuard                          232 ns        232 ns    3009690
```

after:
```
------------------------------------------------------------------------
Benchmark                                 Time           CPU Iterations
------------------------------------------------------------------------
BM_TensorTypeId                           0 ns          0 ns 1000000000
BM_TensorType                            10 ns         10 ns   69803872
BM_TensorIsCuda                           2 ns          2 ns  321626683
BM_TensorIsSparse                         6 ns          6 ns  177045382
BM_TensorNumel                           12 ns         12 ns   58770533
BM_TensorGetDevice                        4 ns          4 ns  128113396
BM_DeviceGuardCtor                       52 ns         52 ns   14997278
BM_DeviceGuard                          158 ns        158 ns    5767248

```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12841

Differential Revision: D10489353

Pulled By: zou3519

fbshipit-source-id: a596bc77352f21d5d35433c6de02c2f65aab5f9e
2018-10-25 19:57:52 -07:00
46162ccdb9 Autograd indices/values and sparse_coo ctor (#13001)
Summary:
Reopen of #11253 after fixing a bug in index_select
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13001

Differential Revision: D10514987

Pulled By: SsnL

fbshipit-source-id: 399a83a1d3246877a3523baf99aaf1ce8066f33f
2018-10-24 10:00:22 -07:00
08aab4dfdd remove ATen/Error.h and ATen/core/Error.h (#12792)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12792

This is a follow-up diff after D10238910.

The only non-codemod change is the removal of ATen/Error.h and ATen/core/Error.h. Other files basically just change the inclusion path, plus clang-format for inclusion order.

Reviewed By: bddppq

Differential Revision: D10437824

fbshipit-source-id: 7f885f80ab5827468d1351cfb2765d0e3f555a69
2018-10-17 17:25:42 -07:00
713e706618 Move exception to C10 (#12354)
Summary:
There is still some work to be done:

- Move logging and unify AT_WARN with LOG(ERROR).
- A few header files are still being plumbed through and need cleaning.
- caffe2::EnforceNotMet aliasing is not done yet.
- Need to unify the macros. See c10/util/Exception.h

This is mainly a codemod and does not cause functional changes. If you find your job failing and trace it back to this diff, it can usually be fixed by one of the following approaches:

(1) add //caffe2/c10:c10 to your dependency (or transitive dependency).
(2) change objects such as at::Error and at::Optional to the c10 namespace.
(3) change functions to the c10 namespace. In particular, caffe2::MakeString is not overridden by the unified c10::str function. Nothing else changes.

Please kindly consider not reverting this diff - it involves multiple rounds of rebasing and the fix is usually simple. Contact jiayq@ or AI Platform Dev for details.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/12354

Reviewed By: orionr

Differential Revision: D10238910

Pulled By: Yangqing

fbshipit-source-id: 7794d5bf2797ab0ca6ebaccaa2f7ebbd50ff8f32
2018-10-15 13:33:18 -07:00
d4ce41c4de Rename tensor_impl_ to impl_ in Tensor (#12035)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12035

This brings it in line with Caffe2's naming

Reviewed By: mingzhe09088

Differential Revision: D10024485

fbshipit-source-id: a6feef82a56b5eb3043b0821ea802ba746e542a0
2018-09-25 09:11:39 -07:00
198ade74f9 Remove manual refcounting from Tensor class (#11294)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11294

The Tensor(ptr, retain) constructor is error-prone and circumvents the intrusive_ptr safety guarantees.

This diff removes it and pushes the responsibility onto callers.
Step by step, manual refcounting can be pushed further back and possibly eliminated in the end.

Reviewed By: ezyang

Differential Revision: D9663476

fbshipit-source-id: 7f010e5e47b137a9575960201c5bf5d552c5c2f5
2018-09-10 12:40:21 -07:00
b0c1397271 Fix intrusive_ptr move/copy for different NullType's (#11260)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11260

This is needed to make something like this work:

    intrusive_ptr<TensorImpl, UndefinedTensorImpl> a = make_intrusive<SparseTensorImpl>(...);

Reviewed By: ezyang

Differential Revision: D9652089

fbshipit-source-id: 19c65e98460ccb27bc69e36d7e558cb9d6e67615
2018-09-10 12:40:20 -07:00
cee743f639 Move backward/set_data to Type-based dispatch.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11440

Differential Revision: D9736565

Pulled By: gchanan

fbshipit-source-id: 1e66f54f1c87084f37c0b014030f0d6d2f8dfaee
2018-09-10 08:40:29 -07:00
93da5a21c9 Update variable view note
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11393

Differential Revision: D9725444

Pulled By: SsnL

fbshipit-source-id: b1607d986ab93e64b0b0ff9e8f10d9e3f6e2160e
2018-09-07 15:09:43 -07:00
110191e5c7 Remove detach from TensorImpl, handle via Type. (#11337)
Summary:
This is so that TensorImpl does not have to depend on Tensor.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11337

Differential Revision: D9684421

Pulled By: gchanan

fbshipit-source-id: d2af93420ca6d493429c251cfe5a34e9289c4484
2018-09-07 08:55:59 -07:00
9ca63c5e63 Reorganize methods in Type, add CPUTypeDefault/CUDATypeDefault (#11205)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11205

Our short term plan for supporting out of tree complex development requires an
external library to add a custom subclass of Type without access to the
code generation facilities in ATen.  This commit reorganizes Type so
as to minimize the amount of boilerplate you have to write when making
a subclass of Type.

In particular, it:
- Creates new CPUTypeDefault/CUDATypeDefault classes, which you are
  intended to inherit from; they provide default CPU/CUDA implementations
  that are layout/dtype agnostic.
- Adds new getCPUAllocator() and getCUDAAllocator() functions, as
  a more public API to get your hands on an Allocator.
- Adds allocator() and getDeviceFromPtr(), abstracting the device-specific
  parts of the storage() methods; these methods are now
  implemented in the base TypeDefault.
- Deletes the static typeString() method, which is now dead.
- Moves is_cuda/is_sparse/is_distributed to TypeDefault.

Reviewed By: SsnL

Differential Revision: D9631619

fbshipit-source-id: 40b600d99691230e36e03eb56434c351cbc2aa3a
2018-09-04 20:26:20 -07:00
4cb968fb77 Default hidden visibility (#10752)
Summary:
Flipping to hidden visibility one more time. Let's see what fails.

cc mingzhe09088 pjh5 Yangqing
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10752

Reviewed By: ezyang

Differential Revision: D9526343

Pulled By: orionr

fbshipit-source-id: c0e9c29270e95e1b2e21c598095f720c199e1e52
2018-08-28 15:25:43 -07:00
f7b02b3a68 Change Tensor/TensorImpl to use c10::intrusive_ptr (#10824)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10824

API additions:
- Tensor(c10::intrusive_ptr<TensorImpl,UndefinedTensor>&&)
- Tensor(const c10::intrusive_ptr<TensorImpl,UndefinedTensor>&)
- Tensor::operator=(Tensor&&) && (for completeness sake)
- TensorBase::unsafeGetTensorImpl()
- TensorBase::unsafeReleaseTensorImpl()
- TensorBase::getIntrusivePtr()
- TensorImpl::type_id()
- Tensor::set_data()
- Tensor::is_same(Tensor)
- Tensor::use_count()
- Tensor::type_id()
- Tensor::scalar_type()
- WeakTensor::is_same(WeakTensor)
- intrusive_ptr::weak_use_count()
- weak_intrusive_ptr::weak_use_count()
- c10::raw::intrusive_ptr::{incref,decref,make_weak}
- c10::raw::weak_intrusive_ptr::{incref,decref,lock}

API changes:
- Tensor::pImpl is no longer public (and now named tensor_impl_)
    - Most methods accessed this way are now accessible on Tensor
      maybe_zero_dim() and set_wrapped_number() being prominent exceptions
      (they are now accessed through unsafeGetTensorImpl())
- Type is no longer friend of Tensor
- TensorBase::reset(TensorImpl*) is deleted
- TensorBase::reset(TensorImpl*, bool should_retain) is deleted
- TensorBase::swap(TensorBaseImpl&) is deleted; use std::swap instead
- TensorBase::get() is deleted; use unsafeGetTensorImpl() instead
- TensorBase::detach() is deleted; use unsafeReleaseTensorImpl() instead
- TensorBase::retain() is deleted; use _raw_incref() instead
- TensorBase::release() is deleted; use _raw_decref() instead
- WeakTensor lost most of its methods (it no longer inherits from
  TensorBase)
- TensorImpl::storage() is now a const method
- Tensor(TensorBase) constructor removed, instead
  we go through getIntrusivePtr().  I'm not sure about
  this change; I happened to have accidentally removed the
  TensorBase constructor and decided to fix call sites,
  but I could go the other way.
- detail::set_data() is deleted; use Tensor::set_data() instead
- c10::raw_intrusive_ptr_target removed; use the functions in c10::raw instead.
  (The reason for this change, is that it is invalid to cast an intrusive_ptr_target*
  to a raw_intrusive_ptr_target* to take advantage of the methods. But there is
  no reason the incref/decref methods shouldn't also work on intrusive_ptr_target;
  it is primarily an API consideration. We can be more standards compliant by
  keeping them as functions, which are universally applicable.)
- intrusive_ptr::reclaim() and weak_intrusive_ptr::reclaim() now work on
  pointers of the NullType. (This counts as a bug fix, because the documentation
  specified that pointers produced by release() are valid to reclaim(), and
  a release() on a null intrusive_ptr produces the NullType::singleton())

Bug fixes:
- Dispatch code for mutable references incorrectly returned
  a reference to a value argument (which would immediately
  go out of scope).  They now correctly return a tensor by
  value.
- intrusive_ptr copy/move assignment did not work correctly when
  an object was assigned to itself. We now check for this case and
  no-op if so. (This bug manifested itself as a Tensor mysteriously
  becoming an UndefinedTensor after lines of code like
  'x = x.mul_(y)')

Other changes:
- The checked cast functions in Utils.h have now been
  renamed and detemplatized into checked unwrap functions.
- Added type_id() and scalar_type() methods to Tensor
- pImpl is no longer public
- Documented what the && overloads are doing
- All occurrences of 'new TensorImpl' (and similar spellings, like 'new THTensor')
  have been expunged. This is NO LONGER a valid way to create a new
  tensor, and if you do this, upon your first incref, you will catch an ASSERT
  failure saying that only tensors created by intrusive_ptr::release() are valid
  to reclaim(). Use c10::make_intrusive instead in this situation.
- IValue is adjusted to use intrusive_ptr instead of Retainable, and all
  other sub-classes of Retainable were modified to use intrusive_ptr.
  When doing this, I had to make the constructors of sub-classes like
  ConstantList public, so that c10::make_intrusive could invoke them.  Fortunately,
  if you incorrectly stack allocate a ConstantList, and then try to get an
  intrusive_ptr to it, it will fail, as stack allocated ConstantLists have refcount 0.
- IValue very narrowly sidesteps the problem of handling NullType, as it
  considers intrusive_ptr<TensorImpl> identical to intrusive_ptr<TensorImpl, UndefinedTensor>
  which is not always true. This was always the case, but there's now a comment
  explaining what's going on.

Some MSVC bugs were uncovered during the preparation of this patch.
They are documented as comments in the code.

Reviewed By: gchanan

Differential Revision: D9481140

fbshipit-source-id: 14a8ea0c231ed88b5715fb86d92730926f9f92fc
2018-08-27 16:11:01 -07:00
d632ccd2c1 Cache isContiguous and numel
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/10696

Differential Revision: D9437963

Pulled By: cpuhrsch

fbshipit-source-id: 7217682f5e4b69c73d943411d738e4892bb465f5
2018-08-24 22:40:39 -07:00
19031c68dc Use intrusive_ptr in Storage; replace unique_ptr<Storage> with Storage (#10488)
Summary:
```
Use intrusive_ptr in Storage; replace unique_ptr<Storage> with Storage

This patch does two major changes:

- It replaces the use of Retainable in Storage with a new implementation
  based on intrusive_ptr.  This will be necessary because Caffe2 will
  be using this class to implement intrusive_ptrs, and we need to
  line these up for the merge.  One good thing about the new implementation is
  that the default copy/move constructors/assignment operators and destructor
  work automatically, instead of needing to be hardcoded into Storage/Tensor.

- It replaces all places where we returned std::unique_ptr<Storage> with
  Storage, collapsing an unnecessary double indirection that is no longer
  necessary now that we have correctly working copy/move constructors.

I didn't initially want to do step (2), but it was very important to
eliminate all bare uses of new Storage and new StorageImpl, and making
this API change was the most straightforward way to do this.

HOW TO FIX YOUR CODE IN THE NEW API

- You no longer need to dereference the result of tensor.storage() to pass
  it to set.  So, instead of:

      x.set_(*y.storage());

  just write:

      x.set_(y.storage());

- If you were accessing methods on StorageImpl via the pImpl() method, you
  must use the dot operator to run pImpl().  Even better; just drop pImpl,
  we now have method forwarding.  So, instead of:

      storage->pImpl()->data();

  just do:

      storage->data();
      // storage.pImpl()->data() works too but is not as recommended

- storage->getDevice() is no more; instead use storage->device().index()

MISC CODE UPDATES

- retain, release, weak_retain, weak_release and weak_lock are now
  reimplemented using the "blessed API", and renamed to make it
  clearer that their use is discouraged.

- nvcc OS X and general OS X portability improvements to intrusive_ptr

- A new comment in intrusive_ptr describing how stack allocated
  intrusive_ptr_targets work differently than heap allocated ones
  from c10::make_intrusive

CAVEAT EMPTOR

- THStorage_weakRetain used to work on strong pointers, but it NO LONGER
  works with intrusive_ptr.  You must reclaim the strong pointer into a
  real strong pointer, construct a weak pointer from it, and then release
  the strong and weak pointers.  See StorageSharing.cpp for an example.
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10488

Reviewed By: gchanan

Differential Revision: D9306134

Pulled By: ezyang

fbshipit-source-id: 02d58ef62dab8e4da6131e1a24834a65c21048e2
2018-08-21 21:39:55 -07:00
00f2731112 Merge THTensor into TensorImpl
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/10479

Differential Revision: D9315800

Pulled By: gchanan

fbshipit-source-id: b13ef0de3342600b02b54e0700eb02021a9d1a9e
2018-08-16 08:10:06 -07:00
b8530dc1f0 A few additions (#9837)
Summary:
This PR provides 4 fixes / features:

1. torch::nn::Cloneable inherits virtually from torch::nn::Module. We want to pass around a module with new functions, and the best way to do this is with a diamond inheritance pattern, i.e.

```c++
struct MySuperModuleImpl : virtual public torch::nn::Module {
  virtual void myFunction() = 0;
};

template <typename Derived>
struct MySuperModule : public torch::nn::Cloneable<Derived>, public MySuperModuleImpl {};

struct MyModule : public MySuperModule<MyModule> {
  void myFunction() override;
};
```

This way, we can simply pass around MySuperModuleImpl instead of torch::nn::Module.

2. Optimizer options are public now, since there's otherwise no way to decay the LR or modify it during training.
3. Serialization functions create autograd history and call copy_! Bad!
4. Optimizers did not create buffers after add_parameters was called.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9837

Reviewed By: goldsborough

Differential Revision: D9199746

Pulled By: ebetica

fbshipit-source-id: 76d6b22e589a42637b7cc0b5bcd3c6b6662fb299
2018-08-13 10:24:58 -07:00
41dce17e22 Delete TensorImpl::type_, replace with backend_/scalar_type_/is_variable_ (#10210)
Summary:
The basic game plan is to stop accessing the type_ field directly,
and instead use the stored backend_, scalar_type_ and
is_variable_ to look up the appropriate Type from the Context.
Storage of backend_ and scalar_type_ is new.

At some future point in time, I'd like to look at this code
carefully to see if I can get everything in this codepath inlining.
I didn't do it in this patch because there are circular include
problems making things difficult.

Some other details:

- Added Device::backend() which does what it says on the tin

- SparseTensorImpl is temporarily hard-coded to root in at::Context
  for the appropriate context.  If/when we put this in shared code,
  we'll have to break this dep too, but for now it should be OK.

- There's a stupid problem with globalContext() deadlocking if
  you didn't actually initialize it before loading libtorch.so
  (which brings along the variable hooks). I fixed this by
  reordering the static initializers. Fixes #9784

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10210

Differential Revision: D9150697

Pulled By: ezyang

fbshipit-source-id: 89e2006c88688bcfab0dcee82dc369127c198c35
2018-08-03 18:25:19 -07:00
f51f15bb27 Update include paths for ATen/core (#10130)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10130

Update some include paths to make them internally consistent

Reviewed By: ezyang

Differential Revision: D9119906

fbshipit-source-id: b44e5cab8e8e795ee18afe9ffc6caf1f2b413467
2018-08-03 11:57:02 -07:00