Commit Graph

391 Commits

0239284313 Relax dtype restrictions on torch.Tensor (#73850)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73850

Previously, torch.Tensor was treated as if it were torch.FloatTensor
(where Float is whatever the default dtype is).  This is not good
behavior for tensor subclasses, which inherit from torch.Tensor,
want to super() call into its constructor, and only notice later
that float is the only dtype that works.  So in this PR I relax the
behavior in this case to make the torch.Tensor constructor more
useful for subclasses.
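
A minimal sketch of what this enables (a hedged illustration, assuming the legacy constructor path is what subclasses reach via super()):

```
import torch

# The constructor no longer insists on the default (float) dtype when it
# is handed an existing tensor, so subclasses can forward other dtypes.
class MyTensor(torch.Tensor):
    pass

t = MyTensor(torch.arange(3))   # int64 payload, previously rejected
assert t.dtype == torch.int64
```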

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D34707396

Pulled By: ezyang

fbshipit-source-id: a995d601007b6fcd0317d89f66ca7e08c4d6053e
(cherry picked from commit e8d0d7b3e8b17681b931cbe4f5729de2e80cf3de)
2022-03-09 15:45:24 +00:00
37e0d2e361 Fix segfault when real and imaginary attributes are set to a number (#73867)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73867

Fixes https://github.com/pytorch/pytorch/issues/72947
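
A hedged repro sketch (the linked issue has the authoritative one):

```
import torch

x = torch.zeros(2, dtype=torch.complex64)
x.real = 1.0   # assigning a plain Python number here segfaulted before this fix
x.imag = -1.0
```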

Test Plan: Imported from OSS

Reviewed By: davidberard98

Differential Revision: D34695956

Pulled By: anjali411

fbshipit-source-id: 2f3eda272a5214335eae506bd387ce8da4d81b8c
(cherry picked from commit fdb07354cac22c30aa047e65fbac9840608db811)
2022-03-08 18:58:26 +00:00
086645ad77 Update __torch_dispatch__ to return op overload instead of the opoverload packet function (#72673)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72673
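
A hedged sketch of the new contract, assuming `_make_subclass` and the `overloadpacket` attribute on overloads:

```
import torch

class Inspect(torch.Tensor):
    # per the original __torch_dispatch__ PR, default __torch_function__
    # handling must be disabled explicitly for subclasses like this
    __torch_function__ = torch._C._disabled_torch_function_impl

    @classmethod
    def __torch_dispatch__(cls, func, types, args=(), kwargs=None):
        print(func)                 # now an OpOverload, e.g. aten.add.Tensor
        print(func.overloadpacket)  # the packet it belongs to, e.g. aten.add
        return super().__torch_dispatch__(func, types, args, kwargs)

x = torch.Tensor._make_subclass(Inspect, torch.ones(2))
y = x + x
```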

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D34627164

Pulled By: anjali411

fbshipit-source-id: 3cb6406a392d530bf9da36b4d8e0a62b30e6497e
(cherry picked from commit 65b85a0a67df4d0f16ac8964e2b685d478a610fb)
2022-03-07 22:38:42 +00:00
35cfa74f97 Add a default implementation of __torch_dispatch__
I was working on an explanation of how to call into the "super"
implementation of some given ATen operation inside of __torch_dispatch__
(https://github.com/albanD/subclass_zoo/blob/main/trivial_tensors.py)
and I kept thinking to myself "Why doesn't just calling super() on
__torch_dispatch__ work?"  Well, after this patch, it does!  The idea
is if you don't actually unwrap the input tensors, you can call
super().__torch_dispatch__ to get at the original behavior.
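
Concretely, a trivial wrapper in the spirit of the linked subclass_zoo example (a hedged sketch):

```
import torch

class TrivialTensor(torch.Tensor):
    __torch_function__ = torch._C._disabled_torch_function_impl

    @classmethod
    def __torch_dispatch__(cls, func, types, args=(), kwargs=None):
        # no unwrapping here, so deferring to super() gives stock behavior
        return super().__torch_dispatch__(func, types, args, kwargs)
```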

Internally, this is implemented by disabling PythonKey and then
redispatching.  This implementation of disabled_torch_dispatch is
not /quite/ right, and some reasons why are commented in the code.
There is then some extra work I have to do to make sure we recognize
disabled_torch_dispatch as the "default" implementation (so we don't
start slapping PythonKey on all tensors, including base Tensors),
which is modeled the same way as how disabled_torch_function is done.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/73684

Approved by: albanD
2022-03-03 20:19:33 +00:00
717d8c6224 [BE] Fix pybind deprecation warnings (#72376)
Summary:
Fixes:
```
../torch/csrc/autograd/python_variable.cpp:1798:33: warning: ‘bool pybind11::handle::operator==(const pybind11::handle&) const’ is deprecated: Use obj1.is(obj2) instead [-Wdeprecated-declarations]
     TORCH_CHECK(out == py::none(), "Expected __torch_dispatch__ for ", op.operator_name(),
```
and
```
../torch/csrc/jit/python/python_list.cpp:254:57: warning: ‘pybind11::object::object(pybind11::handle, bool)’ is deprecated: Use reinterpret_borrow<object>() or reinterpret_steal<object>() [-Wdeprecated-declarations]
                     py::object(obj, /*is_borrowed*/ true),
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72376

Reviewed By: albanD

Differential Revision: D34021328

Pulled By: malfet

fbshipit-source-id: 72906077db9031311c6b0ae4c65eb79df9c514d4
(cherry picked from commit e1877ca268b2a48d27adee9cb13f1616fb2ec487)
2022-02-07 18:33:32 +00:00
f3ebf06e98 Release GIL when assigning to real or imag components (#71747)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71747

The getter is trivial as it's just creating a view tensor, but the
setter is actually copying data so does call into kernel code.

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33770046

Pulled By: albanD

fbshipit-source-id: f0a70acaef790ae1e5b2f68ac4ce046e850c9624
(cherry picked from commit 36a0109400b256b32a185fcd05f21f302197c081)
2022-01-25 22:30:48 +00:00
a9f44b22c0 Fix composite compliance problems for linalg.{matrix_power, inv, cholesky} (#69437)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69437

linalg.{inv, cholesky} are problematic because they call .data_ptr().
This makes them not composite compliant (e.g. meta tensors will not run
on them correctly). This PR makes them composite compliant by adding a
new native_functions operator that does error checking,
`_linalg_check_errors(Tensor info, str api_name, bool is_matrix)`
that is a primitive with respect to autograd.
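
For illustration, a hedged sketch of the underlying problem with meta tensors:

```
import torch

# Meta tensors carry no storage, so any code path that reaches for
# .data_ptr() cannot run under the meta device.
x = torch.eye(3, device="meta")
try:
    x.data_ptr()
except RuntimeError:
    pass  # there is no data pointer to hand out
```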

This PR modifies linalg.inv and linalg.cholesky to call the new error
check function. I also needed to refactor singleCheckErrors and
batchCheckErrors to accept a c10::string_view instead of a
`const char*`; you can convert `const char*` to c10::string_view but not
the other way around because `string_view` does not require null
terminated buffers.

Finally, there is a bugfix in `__torch_dispatch__` for this PR for
the composite compliant testing mechanism. Previously,
`__torch_dispatch__` could not handle operators with no returns; this PR
fixes that. No returns in C++ is equivalent to a single None return in
Python.

Test Plan: - composite compliant tests

Reviewed By: albanD

Differential Revision: D32883666

Pulled By: zou3519

fbshipit-source-id: d5a3f52ebab116c93e1a54a203eacc8f787de7e2
(cherry picked from commit 9e24c9599a043877ab4f289469be55550c996a79)
2022-01-20 16:14:34 +00:00
748790588c Upgrading the loop to use irange (#70326)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70326

See D24145988 for context: it allows loops such as for(int i=0;i<10;i++) to be expressed as for(const auto i : c10::irange(10)). This is nice because it auto-types the loops and adds const-safety to the iteration variable.

Test Plan: buck run //caffe2/torch/fb/sparsenn:test

Reviewed By: r-barnes

Differential Revision: D33243400

fbshipit-source-id: b1f1b4163f4bf662031baea9e5268459b40c69a3
2022-01-06 07:06:53 -08:00
7c90bd77ec Test functionalization pass in python (#66101)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66101

Updated description:

This PR tests the functionalization pass in python in two ways. For each of the test programs that I have in `test_functionalization.py`, it:
- runs the program with and without functionalization, and asserts the outputs and (potentially mutated) inputs are equal in both cases (sketched below)
- runs the program with `LoggingTensor`, and uses expecttests on the resulting graph. I manually confirm that the graphs look reasonable and only contain functional ops.
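
A minimal sketch of that first check; `run_functionalized` is a hypothetical stand-in for the private hooks this PR adds:

```
import torch

def check_functionalization(f, make_input):
    ref_inpt, inpt = make_input(), make_input()
    ref_out = f(ref_inpt)                  # eager: mutations happen in place
    out = run_functionalized(f, inpt)      # hypothetical: run f under the pass
    assert torch.allclose(out, ref_out)    # same outputs...
    assert torch.allclose(inpt, ref_inpt)  # ...and same (mutated) inputs
```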

Mechanically, the changes include:
- factoring out `LoggingTensor` into a testing util so it can be re-used in multiple tests
- adding some private python api's in the `torch` namespace as hooks that I can use during testing

In the original version of this PR, I also added some fixes to the `_make_subclass()` function in python: allowing you to pass in strides and storage_offset. I kept them in mainly because the changes were already there.

Test Plan: Imported from OSS

Reviewed By: zou3519

Differential Revision: D31942095

Pulled By: bdhirsh

fbshipit-source-id: 90ff4c88d461089704922e779571eee09c21d707
2021-11-09 14:34:05 -08:00
1f55dd83ac [WIP] wrap XLATensors into Python XLA wrapper class (#65841)
Summary:
**Improbably** fixes https://github.com/pytorch/pytorch/issues/65130

ezyang I'm super n00b in Python extensions, is this what we want to do?

Pull Request resolved: https://github.com/pytorch/pytorch/pull/65841

Reviewed By: navahgar

Differential Revision: D31889790

Pulled By: Krovatkin

fbshipit-source-id: c7f077b89f6f02df1962ab83d9e13fcc348a227d
2021-10-25 16:11:03 -07:00
e0643fa3fc use irange for loops 5 (#66744)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66744

Modified loops in files under fbsource/fbcode/caffe2/ from the format

`for(TYPE var=x0;var<x_max;x++)`

to the format

`for(const auto var: irange(xmax))`

This was achieved by running r-barnes's loop upgrader script (D28874212), with some modifications to exclude all files under /torch/jit, plus a number of reversions and unused-variable suppression warnings added by hand.

Test Plan: Sandcastle

Reviewed By: ngimel

Differential Revision: D31705358

fbshipit-source-id: d6ea350cbaa8f452fc78f238160e5374be637a48
2021-10-18 21:59:50 -07:00
2f099c7555 Revert D30652629: use irange for loops
Test Plan: revert-hammer

Differential Revision:
D30652629 (687c2267d4)

Original commit changeset: 0ae6c4bbbb55

fbshipit-source-id: 5c4f067b584a021c8c9656454d1ee60999600fb3
2021-10-15 15:23:10 -07:00
687c2267d4 use irange for loops (#66234)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66234

Modified loops in files under fbsource/fbcode/caffe2/ from the format

`for(TYPE var=x0;var<x_max;x++)`

to the format

`for(const auto var: irange(xmax))`

This was achieved by running r-barnes's loop upgrader script (D28874212), with some modifications to exclude all files under /torch/jit, plus a number of reversions and unused-variable suppression warnings added by hand.

bypass_size_limit
allow-large-files

Test Plan: Sandcastle

Reviewed By: ngimel

Differential Revision: D30652629

fbshipit-source-id: 0ae6c4bbbb554bad42e372792a6430e1acf15e3e
2021-10-15 13:50:33 -07:00
82a216c45b Add tensor.{adjoint(),H,mT,mH} methods and properties (#64179)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64179

This PR follows the discussion in https://github.com/pytorch/pytorch/issues/45063#issuecomment-904431478

Fixes https://github.com/pytorch/pytorch/issues/45063

cc ezyang anjali411 dylanbespalko mruberry Lezcano nikitaved rgommers pmeier asmeurer leofang AnirudhDagar asi1024 emcastillo kmaehashi heitorschueroff

Test Plan: Imported from OSS

Reviewed By: bertmaher

Differential Revision: D30730483

Pulled By: anjali411

fbshipit-source-id: 821d25083f5f682450f6812bf852dc96a1cdf9f2
2021-10-13 07:44:43 -07:00
70a545b21e Add Tensor._make_wrapper_subclass (#65340)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65340

I thought about a few possible ways of doing this.  The main hazard is
that if I create a CPU tensor that doesn't have any real storage, the
moment I actually try to access the data on the tensor I will segfault.
So I don't want to use _make_subclass on a "cpu meta tensor" because
the CPU meta tensor (with no subclass) is radioactive: printing it
will immediately cause a segfault.  So instead, I have to create
the CPU meta tensor AND subclass all in one go, and that means I need
another function for it.  One downside to doing it this way is
I need another overload for explicit strides, and in general it is
difficult to get the view relationships to all work out properly;
tracked at https://github.com/pytorch/pytorch/issues/65339

Fixes https://github.com/pytorch/pytorch/issues/62972
Fixes https://github.com/pytorch/pytorch/issues/62730
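
A hedged sketch of the resulting API (exact keyword arguments may differ slightly):

```
import torch

class WrapperTensor(torch.Tensor):
    @staticmethod
    def __new__(cls, elem):
        # Creates the subclass with size/stride/dtype metadata but no real
        # storage, avoiding the "radioactive" CPU meta tensor described above.
        r = torch.Tensor._make_wrapper_subclass(
            cls, elem.size(), strides=elem.stride(), dtype=elem.dtype,
            device=elem.device, requires_grad=elem.requires_grad)
        r.elem = elem
        return r
```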

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D31057231

Pulled By: ezyang

fbshipit-source-id: 73522769e093ae8a1bf0c7f7e594659bfb827b28
2021-09-22 11:10:47 -07:00
67bd2a31b5 [Reland] Add python mode (#64360)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64360

This PR adds a (private) enable_python_mode context manager.
(see torch/utils/_python_dispatch.py).
enable_python_mode accepts the type of a __torch_dispatch__ object
as its argument. Whenever an operator gets called inside of the
context manager, it dispatches to the __torch_dispatch__ of
the passed-in type.

Example usage:
```
with enable_python_mode(LoggingTensor):
    z = torch.empty([])
    assert isinstance(z, LoggingTensor)
```

There are quite a few changes that were made to support this.

First, we added TorchDispatchTypeObject, a C++ struct that represents the
type of a `__torch_dispatch__` object (e.g. LoggingTensor).
It holds both the PyObject* representing the class and a PyInterpreter*
so we know which Python interpreter it came from.

Next, we updated the concrete_dispatch_fn in python_variable.cpp to accept
a `const std::shared_ptr<TorchDispatchTypeObject>&` argument. When this
is null, dispatching happens as usual. When it is non-null, we prepend
the TorchDispatchTypeObject's PyObject* to the overloaded args list so that
it is considered first for dispatch.

To get that to work, we changed how `handle_torch_dispatch_no_python_arg_parser`
works. The "overloaded args list" previously only consisted of Tensor PyObjects,
but now it can have types in addition to Tensors!
- We renamed `append_overloaded_arg` to `append_overloaded_tensor`
- We added a new `append_overloaded_type` that appends a type to
overloaded_args
- We added special handling in `handle_torch_dispatch_no_python_arg_parser`
and `append_overloaded_arg` to handle types in addition to Tensors.

Then, there is PythonMode and PythonModeTLS.
- We reuse the DispatchKey::Python dispatch key as a mode key
- We use PythonMode::enter and PythonMode::exit to enable/disable
DispatchKey::Python and set the PythonModeTLS.
- PythonModeTLS stores a TorchDispatchTypeObject as metadata.
- PythonMode is in libtorch_python, and PythonModeTLS is in ATen.
This split is due to the libtorch_python library boundary (because we need
to save TLS in ATen/ThreadLocalState)
- We modify the PythonFallbackKernel to look up
the relevant TorchDispatchTypeObject (if Python Mode is active) and
dispatch using it.

There are two more miscellaneous changes:
- internal_new_from_data (torch/csrc/utils/tensor_new.cpp) gets an
exclude guard. enable_python_mode currently does not handle
torch.tensor and the exclude guard is to prevent a bug.

Future:
- This PR does not allow for the nesting of Python modes. In the future we
should be able to enable this with a more sane no_dispatch API and by changing
the TLS to a stack. For now I did not need this for CompositeImplicitAutograd testing.

Test Plan: - new tests

Reviewed By: ezyang

Differential Revision: D30698082

Pulled By: zou3519

fbshipit-source-id: 7094a90eee6aa51f8b71bc4d91cfb6f49e9691f8
2021-09-16 09:02:30 -07:00
8131bc85d0 Raise TypeError on assigned grad with wrong type (#64876)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/64813

Raises a TypeError when the value assigned to grad is not a Tensor or
None.

Adds tests.
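
For example:

```
import torch

x = torch.ones(2, requires_grad=True)
x.grad = torch.zeros(2)   # fine: a Tensor
x.grad = None             # fine: clearing the gradient
try:
    x.grad = 1.0          # neither a Tensor nor None
except TypeError:
    pass                  # now raised eagerly, per this PR
```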

cc ezyang gchanan

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64876

Reviewed By: anjali411

Differential Revision: D30901678

Pulled By: soulitzer

fbshipit-source-id: dbb3cb5fd0bbac6918e0b2e2f51d340daa43dee0
2021-09-13 16:41:45 -07:00
d8ae3cc318 Add more error checking in subclass creation (#64746)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64746

This extracts the error checking that used to be in the PR above.
We are not going to land the proposed fix there, but I think we want this error checking in right now as these would lead to respectively a memory leak and arbitrary memory read/write.

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D30867569

Pulled By: albanD

fbshipit-source-id: bf468033fb8b49fcb26eed423f5fad82b4a46c56
2021-09-10 16:49:10 -07:00
89f94fc15f Move THPVariable_NewWithVar around (#64550)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64550

Just moves a function around to make the next PR easier to read.

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D30867570

Pulled By: albanD

fbshipit-source-id: 99ae925568ed29ca7fdea059762c21d430d4a204
2021-09-10 16:49:08 -07:00
d701357d92 Factor out TensorBase that doesn't depend on native operators (#63612)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63612

This makes Tensor inherit from a new class TensorBase that provides a subset of Tensor that doesn't
directly depend on native_functions.yaml. Code that only includes TensorBase.h will thus not need to
be rebuilt every time someone changes an operator signature.

Making `Tensor` inherit from this class means that `const TensorBase&` parameters will be callable
with an ordinary `Tensor`. I've also made `Tensor` constructible and assignable from `TensorBase` to
minimize friction in code mixing the two types.

To help enforce that `Tensor.h` and `Functions.h` aren't accidentally included, I've added an error
into `Operators.h` if `TORCH_ASSERT_NO_OPERATORS` is defined. We can either set this in the build
system for certain folders, or just define it at the top of any file.

I've also included an example of manually special-casing the commonly used `contiguous` operator.
The inline function's slow path defers to `TensorBase::__dispatch_contiguous` which is defined in
`Tensor.cpp`. I've made it so `OptionalTensorRef` is constructible from `TensorBase`, so I can
materialize a `Tensor` for use in dispatch without actually increasing its refcount.

Test Plan: Imported from OSS

Reviewed By: gchanan

Differential Revision: D30728580

Pulled By: ezyang

fbshipit-source-id: 2cbc8eee08043382ee6904ea8e743b1286921c03
2021-09-08 13:28:54 -07:00
0457a85d45 Revert D30543236: Add python mode
Test Plan: revert-hammer

Differential Revision:
D30543236 (4bd03b0242)

Original commit changeset: ef5444d96a5a

fbshipit-source-id: b0042ac2c22765fa11d6d00bf751f6a4489eb6d8
2021-08-31 15:28:33 -07:00
4bd03b0242 Add python mode (#63496)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63496

This PR adds a (private) enable_python_mode context manager.
(see torch/utils/_python_dispatch.py).
enable_python_mode accepts the type of a __torch_dispatch__ object
as its argument. Whenever an operator gets called inside of the
context manager, it dispatches to the __torch_dispatch__ of
the passed-in type.

Example usage:
```
with enable_python_mode(LoggingTensor):
    z = torch.empty([])
    assert isinstance(z, LoggingTensor)
```

There are quite a few changes that were made to support this.

First, we added TorchDispatchTypeObject, a C++ struct that represents the
type of a `__torch_dispatch__` object (e.g. LoggingTensor).
It holds both the PyObject* representing the class and a PyInterpreter*
so we know which Python interpreter it came from.

Next, we updated the concrete_dispatch_fn in python_variable.cpp to accept
a `const std::shared_ptr<TorchDispatchTypeObject>&` argument. When this
is null, dispatching happens as usual. When it is non-null, we prepend
the TorchDispatchTypeObject's PyObject* to the overloaded args list so that
it is considered first for dispatch.

To get that to work, we changed how `handle_torch_dispatch_no_python_arg_parser`
works. The "overloaded args list" previously only consisted of Tensor PyObjects,
but now it can have types in addition to Tensors!
- We renamed `append_overloaded_arg` to `append_overloaded_tensor`
- We added a new `append_overloaded_type` that appends a type to
overloaded_args
- We added special handling in `handle_torch_dispatch_no_python_arg_parser`
and `append_overloaded_arg` to handle types in addition to Tensors.

Then, there is PythonMode and PythonModeTLS.
- We reuse the DispatchKey::Python dispatch key as a mode key
- We use PythonMode::enter and PythonMode::exit to enable/disable
DispatchKey::Python and set the PythonModeTLS.
- PythonModeTLS stores a TorchDispatchTypeObject as metadata.
- PythonMode is in libtorch_python, and PythonModeTLS is in ATen.
This split is due to the libtorch_python library boundary (because we need
to save TLS in ATen/ThreadLocalState)
- We modify the PythonFallbackKernel to look up
the relevant TorchDispatchTypeObject (if Python Mode is active) and
dispatch using it.

There are two more miscellaneous changes:
- internal_new_from_data (torch/csrc/utils/tensor_new.cpp) gets an
exclude guard. enable_python_mode currently does not handle
torch.tensor and the exclude guard is to prevent a bug.

Future:
- This PR does not allow for the nesting of Python modes. In the future we
should be able to enable this with a more sane no_dispatch API and by changing
the TLS to a stack. For now I did not need this for CompositeImplicitAutograd testing.

Test Plan: - new tests

Reviewed By: malfet, albanD

Differential Revision: D30543236

Pulled By: zou3519

fbshipit-source-id: ef5444d96a5a957d1657b7e37dce80f9a497d452
2021-08-30 18:44:35 -07:00
c78ab28441 Add support for the ONNX Runtime Eager Mode backend (#58248)
Summary:
This PR implements the necessary hooks/stubs/enums/etc for complete ONNX Runtime (ORT) Eager Mode integration. The actual extension will live out of tree at https://github.com/pytorch/ort.

We have been [working on this at Microsoft](https://github.com/microsoft/onnxruntime-pytorch/tree/eager-ort/torch_onnxruntime) for the last few months, and are finally ready to contribute the PyTorch core changes upstream (nothing major or exciting, just the usual boilerplate for adding new backends).

The ORT backend will allow us to ferry [almost] all torch ops into granular ONNX kernels that ORT will eagerly execute against any devices it supports (therefore, we only need a single ORT backend from a PyTorch perspective).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/58248

Reviewed By: astaff

Differential Revision: D30344992

Pulled By: albanD

fbshipit-source-id: 69082b32121246340d686e16653626114b7714b2
2021-08-20 11:17:13 -07:00
c508433617 Implement subclass priority for __torch_dispatch__ (#63411)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63411

In order to get this behavior, you have to use append_overloaded,
which I forgot to use in the previous implementation.  I exposed
an internal helper function which is more appropriate for dispatch
to Python where we know that an argument is definitely a Tensor (and
this test no longer needs to be done).

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: zou3519

Differential Revision: D30374489

Pulled By: ezyang

fbshipit-source-id: 43b08c00d1958c9b26d82a025d19f0b67bb85590
2021-08-18 07:49:03 -07:00
da9958c899 irange-ify 1 (#62193)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/62193

Test Plan: Sandcastle

Reviewed By: malfet

Differential Revision: D29879504

fbshipit-source-id: adc86adcd1e7dcdfa2d7adf4d576f081430d52ec
2021-08-09 15:30:43 -07:00
e55f271859 __torch_dispatch__: Populate kwargs dictionary with keyword-only arguments (#62822)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62822

This is BC breaking for people who were using the old integration,
although only if you had been writing bindings for functions with
keyword-only arguments (that includes functorch).  Other than that,
the patch was pretty straightforward.
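
A hedged sketch, assuming `dtype` on `sum.dim_IntList` as the keyword-only argument:

```
import torch

class KwInspect(torch.Tensor):
    __torch_function__ = torch._C._disabled_torch_function_impl

    @classmethod
    def __torch_dispatch__(cls, func, types, args=(), kwargs=None):
        print(kwargs)   # keyword-only arguments now land here
        return super().__torch_dispatch__(func, types, args, kwargs)

x = torch.Tensor._make_subclass(KwInspect, torch.ones(3))
x.sum(0, dtype=torch.float64)   # prints e.g. {'dtype': torch.float64}
```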

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D30134552

Pulled By: ezyang

fbshipit-source-id: a47f536fb030994a07c9386069b8f800ac86d731
2021-08-09 10:02:54 -07:00
5e5de75f4d Add getPyInterpreter() API (#62659)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62659

It turns out that it is occasionally useful to be able to access the
PyInterpreter object from other Python bindings (see next diff in the
stack).  Make it publicly available.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D30074926

Pulled By: ezyang

fbshipit-source-id: 2f745ab7c7a672ed7215231fdf9eef6af9705511
2021-08-06 08:23:24 -07:00
e42360d56f Remove default arguments before calling to __torch_dispatch__ (#61123)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61123

This applies the design pattern of removing explicit arguments when they
coincide with the default arguments.  This simplifies argument patterns
that dispatch kernels receive and make it easier for us to maintain BC
(as addition of a new default argument isn't immediately BC-breaking
for dispatch implementors).
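
A hedged illustration of the normalization:

```
import torch

class DefInspect(torch.Tensor):
    __torch_function__ = torch._C._disabled_torch_function_impl

    @classmethod
    def __torch_dispatch__(cls, func, types, args=(), kwargs=None):
        print(func, args, kwargs)
        return super().__torch_dispatch__(func, types, args, kwargs)

a = torch.Tensor._make_subclass(DefInspect, torch.ones(2))
torch.add(a, a)             # alpha matches its default, so it is omitted
torch.add(a, a, alpha=2)    # a non-default alpha is passed through
```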

There is an important extra API which I haven't implemented here yet,
which is to take an incomplete sequence of arguments and fill out their
defaults (in case the user did want normalization).  I plan on adding
that in a future PR.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: saketh-are

Differential Revision: D29853616

Pulled By: ezyang

fbshipit-source-id: 71c672cb3a7d4d01f838a1c7fcdb75a8ce7d058e
2021-07-23 10:41:35 -07:00
a9b0a921d5 Disable avoid-non-const-global-variables lint check (#62008)
Summary:
As GoogleTest `TEST` macro is non-compliant with it as well as `DEFINE_DISPATCH`

All changes but the ones to `.clang-tidy` are generated using following script:
```
for i in `find . -type f -iname "*.c*" -or -iname "*.h"|xargs grep cppcoreguidelines-avoid-non-const-global-variables|cut -f1 -d:|sort|uniq`;  do sed -i "/\/\/ NOLINTNEXTLINE(cppcoreguidelines-avoid-non-const-global-variables)/d" $i; done
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62008

Reviewed By: driazati, r-barnes

Differential Revision: D29838584

Pulled By: malfet

fbshipit-source-id: 1b2f8602c945bd4ce50a9bfdd204755556e31d13
2021-07-22 18:04:40 -07:00
349f2f767c Modernize to default constructor and nullptr in torch (#61735)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/61735

Test Plan: Sandcastle

Reviewed By: malfet

Differential Revision: D29716659

fbshipit-source-id: ec2a0a0b7e55d2e50b1d35f0b651bd40675ae7e8
2021-07-16 10:51:13 -07:00
6ecc1a4c4f Make pytorch clang-tidy clean (#60649)
Summary:
This PR suppresses clang-tidy warnings in the codebase (for now) so that we can re-enable clang-tidy checks on master.

I ran this script to add the `NOLINTNEXTLINE` comments (on a devserver):
```bash
python3 setup.py develop

# Uses same script that's run on CI and adds the -j (parallel), -s (add comments), -k (continue if diagnostic errors are found) options
python3 tools/clang_tidy.py \
  -j \
  -s \
  -k \
  -v \
  --paths torch/csrc/ \
  -g"-torch/csrc/jit/passes/onnx/helper.cpp" \
  -g"-torch/csrc/jit/passes/onnx/shape_type_inference.cpp" \
  -g"-torch/csrc/jit/serialization/onnx.cpp" \
  -g"-torch/csrc/jit/serialization/export.cpp" \
  -g"-torch/csrc/jit/serialization/import.cpp" \
  -g"-torch/csrc/jit/serialization/import_legacy.cpp" \
  -g"-torch/csrc/onnx/init.cpp" \
  -g"-torch/csrc/cuda/nccl.*" \
  -g"-torch/csrc/cuda/python_nccl.cpp" \
  -g"-torch/csrc/autograd/FunctionsManual.cpp" \
  -g"-torch/csrc/generic/*.cpp" \
  -g"-torch/csrc/jit/codegen/cuda/runtime/*" \
  -g"-torch/csrc/deploy/interpreter/interpreter.cpp" \
  -g"-torch/csrc/deploy/interpreter/interpreter.h" \
  -g"-torch/csrc/deploy/interpreter/interpreter_impl.h" \
  -g"-torch/csrc/deploy/interpreter/test_main.cpp"
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/60649

Test Plan: Verified changes by re-running the script (without the `-s` option) and seeing no warnings/errors.

Reviewed By: walterddr, janeyx99

Differential Revision: D29504258

Pulled By: 1ntEgr8

fbshipit-source-id: 78310b30ee8213b73ddb4771ad874665323e7a4e
2021-07-01 12:21:07 -07:00
aacc722aec Dispatch to Python via __torch_dispatch__ (#59760)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59760

See https://github.com/pytorch/pytorch/issues/59049

There are some moving parts to this PR, I'll structure this explanation so the straightforward parts go first, and then the less straightforward parts.

**The actual dispatch to Python.** The core logic of dispatch to Python lives in `concrete_dispatch_fn` in `torch/csrc/autograd/python_variable.cpp`. It takes the input IValue stack, scans all the arguments for Tensor arguments, and defers most of the heavy lifting to `handle_torch_function_no_python_arg_parser` which actually does all of the logic for calling out to torch dispatch (in particular, this function handles multiple dispatch situations for you). Because we have a different function name than regular `__torch_function__` handling, `handle_torch_function_no_python_arg_parser` is generalized to accept a magic method name to look for when testing if Tensors have custom handling or not. Unlike `__torch_function__`, by default there is no `__torch_dispatch__` on Tensor classes.

**Maintaining the Python dispatch key.** In order to get to the dispatch-to-Python logic, we must tag Tensors that have the `__torch_dispatch__` magic method with the newly added Python dispatch key (separated from PythonFuncTorch to allow for a transitional period while they migrate to this mechanism). We expose a new private property `_is_python_dispatch` that assists in debugging whether a Tensor is participating in Python dispatch or not. We apply the Python dispatch key the first time a PyObject for a Tensor is constructed (THPVariable_NewWithVar), testing if `__torch_dispatch__` exists with the newly added `check_has_torch_dispatch`.

**Shallow copy and detach.** For the simple examples tested in this PR, most creations of Tensor route through the dispatcher. The exception to this is `shallow_copy_and_detach`, which bypasses the dispatcher and is used when saving tensors for backwards. When a Tensor participates in Python dispatch, we override the behavior of `shallow_copy_and_detach` to instead directly call into `__torch_dispatch__` to perform a `detach` operation (in the same way it would be invoked if you called `detach` directly). Because this Python call is triggered directly from c10::TensorImpl, it must be indirected through `PyInterpreter::detach`, which is the general mechanism for dynamic dispatching to the Python interpreter associated with a TensorImpl.

**torchdeploy compatibility.** The dispatch to Python logic cannot be directly registered to the dispatcher as it is compiled in the Python library, which will get loaded multiple times per torchdeploy interpreter. Thus, we must employ a two phase process. First, we register a fallback inside a non-Python library (aten/src/ATen/core/PythonFallbackKernel.cpp). Its job is to determine the appropriate PyInterpreter to handle the Python dispatch by going through all of the arguments and finding the first argument that has a PyObject/PyInterpreter. With this PyInterpreter, it makes another dynamic dispatch via "dispatch" which will go to the correct torchdeploy interpreter to handle dispatching to actual Python.

**Testing.** We provide a simple example of a LoggingTensor for testing, which can be used to generate TorchScript-like traces to observe what operations are being called when a Tensor is invoked. Although a LoggingTensor would be better implemented via an is-a relationship rather than a has-a relationship (as is done in the test), we've done it this way to show that arbitrarily complex compositions of tensors inside a tensor work properly.
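
For reference, a hedged sketch of such a LoggingTensor in the has-a style (assuming `_make_subclass` and `torch.utils._pytree`):

```
import torch
from torch.utils._pytree import tree_map

class LoggingTensor(torch.Tensor):
    @staticmethod
    def __new__(cls, elem):
        r = torch.Tensor._make_subclass(cls, elem)
        r.elem = elem   # has-a: the real data lives on the inside
        return r

    __torch_function__ = torch._C._disabled_torch_function_impl

    @classmethod
    def __torch_dispatch__(cls, func, types, args=(), kwargs=None):
        unwrap = lambda t: t.elem if isinstance(t, LoggingTensor) else t
        wrap = lambda t: LoggingTensor(t) if isinstance(t, torch.Tensor) else t
        print(func)     # the trace: one line per dispatched op
        out = func(*tree_map(unwrap, args), **tree_map(unwrap, kwargs or {}))
        return tree_map(wrap, out)
```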

**Known limitations.**

* We haven't adjusted any operator code, so some patterns may not work (as they lose the Python subclass in an unrecoverable way)
* `__torch_function__` must be explicitly disabled with `_disabled_torch_function_impl` otherwise things don't work quite correctly (in particular, what is being disabled is default subclass preservation behavior.)
* We don't ever populate kwargs, even when an argument is kwarg-only

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Differential Revision:
D29017912

Test Plan: Imported from OSS

Reviewed By: bdhirsh

Pulled By: ezyang

fbshipit-source-id: a67714d9e541d09203a8cfc85345b8967db86238
2021-06-25 11:50:32 -07:00
204da12592 Reduce number of CEX when passing Tensors to Python (#60546)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60546

Before, we assumed conservatively that any Tensor passed to
THPVariable_Wrap could be aliased in another thread and therefore race.
However, THPVariable_Wrap takes in Variable by value, and so if
use_count() <= 1, it is impossible for another thread to have a
reference to it.  So we can conclude that it is definitely uninitialized
if the quick test fails!

Thanks bdhirsh for pointing out the optimization opportunity here.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D29331718

Pulled By: ezyang

fbshipit-source-id: e100796fbc55a0af2c6565c6fbc9ddc8ae7ceb42
2021-06-24 07:40:39 -07:00
e3d75b8475 irange for PyTorch sans jit (#59481)
Summary:
Switches most of the simple for loops outside of `jit` directories to use `c10::irange`.

Generated with D28874212.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/59481

Test Plan: Sandcastle

Reviewed By: ngimel

Differential Revision: D28909681

fbshipit-source-id: ec9ab1bd602933238d9d0f73d4d8d027b75d9d85
2021-06-09 14:46:11 -07:00
2693b0bef3 Fix compile error when debugging (#59616)
Summary:
Signed-off-by: caozhong <zhong.z.cao@intel.com>

I probably triggered this because of my full-debug build of Python. ezyang

Pull Request resolved: https://github.com/pytorch/pytorch/pull/59616

Reviewed By: jbschlosser

Differential Revision: D28958685

Pulled By: albanD

fbshipit-source-id: fdab622c4d1be93eb27e9006dcf3db7c5b44a04b
2021-06-09 06:34:06 -07:00
f52e202840 Add warning when accessing Tensor::grad() in the C++ API (#59362)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/35379

 - Adds a `retains_grad` attribute backed by C++ as a native function (see the sketch below). The Python bindings for the function are skipped to be consistent with `is_leaf`.
   - Tried writing it without a native function, but the jit test `test_tensor_properties` seems to require that it be one (alternatively, it might also work if we manually added a prim implementation).
 - The Python API now uses the `retain_grad` implementation from C++
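
A quick illustration of the new attribute:

```
import torch

x = torch.ones(2, requires_grad=True)
y = (x * 2).sum()
assert x.retains_grad is False   # leaves report False; .grad is populated anyway
y.retain_grad()
assert y.retains_grad            # non-leaf that will keep its .grad after backward()
```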

Pull Request resolved: https://github.com/pytorch/pytorch/pull/59362

Reviewed By: jbschlosser

Differential Revision: D28969298

Pulled By: soulitzer

fbshipit-source-id: 335f2be50b9fb870cd35dc72f7dadd6c8666cc02
2021-06-08 19:43:21 -07:00
f05d5bec48 Preserve PyObject even when it goes dead (#56017)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56017

Fixes #55686

This patch is seemingly straightforward but some of the changes are very
subtle.  For the general algorithmic approach, please first read the
quoted issue.  Based on the algorithm, there are some fairly
straightforward changes:

- New boolean on TensorImpl tracking if we own the pyobj or not
- PythonHooks virtual interface for requesting deallocation of pyobj
  when TensorImpl is being released and we own its pyobj, and
  implementation of the hooks in python_tensor.cpp
- Modification of THPVariable to MaybeOwned its C++ tensor, directly
  using swolchok's nice new class

And then, there is python_variable.cpp.  Some of the changes follow the
general algorithmic approach:

- THPVariable_NewWithVar is simply adjusted to handle MaybeOwned and
  initializes as owned (like before)
- THPVariable_Wrap adds the logic for reverting ownership back to
  PyObject when we take out an owning reference to the Python object
- THPVariable_dealloc attempts to resurrect the Python object if
  the C++ tensor is live, and otherwise does the same old implementation
  as before
- THPVariable_tryResurrect implements the resurrection logic.  It is
  modeled after CPython code so read the cited logic and see if
  it is faithfully replicated
- THPVariable_clear is slightly updated for MaybeOwned and also to
  preserve the invariant that if owns_pyobj, then pyobj_ is not null.
  This change is slightly dodgy: the previous implementation has a
  comment mentioning that the pyobj nulling is required to ensure we
  don't try to reuse the dead pyobj.  I don't think, in this new world,
  this is possible, because the invariant says that the pyobj only
  dies if the C++ object is dead too.  But I still unset the field
  for safety.

And then... there is THPVariableMetaType.  colesbury explained in the
issue why this is necessary: when destructing an object in Python, you
start off by running the tp_dealloc of the subclass before moving up
to the parent class (much in the same way C++ destructors work).  The
deallocation process for a vanilla Python-defined class does irreparable
harm to the PyObject instance (e.g., the finalizers get run), making it
no longer valid to attempt resurrection later in the tp_dealloc chain.
(BTW, the fact that objects can resurrect but in an invalid state is
one of the reasons why it's so frickin' hard to write correct __del__
implementations).  So we need to make sure that we actually override
the tp_dealloc of the bottom most *subclass* of Tensor to make sure
we attempt a resurrection before we start finalizing.  To do this,
we need to define a metaclass for Tensor that can override tp_dealloc
whenever we create a new subclass of Tensor.  By the way, it was totally
not documented how to create metaclasses in the C++ API, and it took
a good bit of trial error to figure it out (and the answer is now
immortalized in https://stackoverflow.com/q/67077317/23845 -- the things
that I got wrong in earlier versions of the PR included setting
tp_basicsize incorrectly, incorrectly setting Py_TPFLAGS_HAVE_GC on
the metaclass--you want to leave it unset so that it inherits, and
determining that tp_init is what actually gets called when you construct
a class, not tp_call as another not-to-be-named StackOverflow question
suggests).

Aside: Ordinarily, adding a metaclass to a class is a user visible
change, as it means that it is no longer valid to mixin another class
with a different metaclass.  However, because _C._TensorBase is a C
extension object, it will typically conflict with most other
metaclasses, so this is not BC breaking.

The desired new behavior of a subclass tp_dealloc is to first test if
we should resurrect, and otherwise do the same old behavior.  In an
initial implementation of this patch, I implemented this by saving the
original tp_dealloc (which references subtype_dealloc, the "standard"
dealloc for all Python defined classes) and invoking it.  However, this
results in an infinite loop, as it attempts to call the dealloc function
of the base type, but incorrectly chooses subclass type (because it is
not a subtype_dealloc, as we have overridden it; see
b38601d496/Objects/typeobject.c (L1261) )
So, with great reluctance, I must duplicate the behavior of
subtype_dealloc in our implementation.  Note that this is not entirely
unheard of in Python binding code; for example, Cython
c25c3ccc4b/Cython/Compiler/ModuleNode.py (L1560)
also does similar things.  This logic makes up the bulk of
THPVariable_subclass_dealloc

To review this, you should pull up the CPython copy of subtype_dealloc
b38601d496/Objects/typeobject.c (L1230)
and verify that I have specialized the implementation for our case
appropriately.  Among the simplifications I made:

- I assume PyType_IS_GC, because I assume that Tensor subclasses are
  only ever done in Python and those classes are always subject to GC.
  (BTW, yes!  This means I have broken anyone who has extended PyTorch
  tensor from C API directly.  I'm going to guess no one has actually
  done this.)

- I don't bother walking up the type bases to find the parent dealloc;
  I know it is always THPVariable_dealloc.  Similarly, I can get rid
  of some parent type tests based on knowledge of how
  THPVariable_dealloc is defined

- The CPython version calls some private APIs which I can't call, so
  I use the public PyObject_GC_UnTrack APIs.

- I don't allow the finalizer of a Tensor to change its type (but
  more on this shortly)

One alternative I discussed with colesbury was instead of copy pasting
the subtype_dealloc, we could transmute the type of the object that was
dying to turn it into a different object whose tp_dealloc is
subtype_dealloc, so the stock subtype_dealloc would then be applicable.
We decided this would be kind of weird and didn't do it that way.

TODO:

- More code comments

- Figure out how not to increase the size of TensorImpl with the new
  bool field

- Add some torture tests for the THPVariable_subclass_dealloc, e.g.,
  involving subclasses of Tensors that do strange things with finalizers

- Benchmark the impact of taking the GIL to release C++ side tensors
  (e.g., from autograd)

- Benchmark the impact of adding a new metaclass to Tensor (probably
  will be done by separating out the metaclass change into its own
  change)

- Benchmark the impact of changing THPVariable to conditionally own
  Tensor (as opposed to unconditionally owning it, as before)

- Add tests that this actually indeed preserves the Python object

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D27765125

Pulled By: ezyang

fbshipit-source-id: 857f14bdcca2900727412aff4c2e2d7f0af1415a
2021-06-03 10:50:36 -07:00
e9e5588588 Improve Tensor traverse to traverse its grad_fn when possible (#58271)
Summary:
There are two main changes here:
- THPVariable will now actually visit its grad_fn if there are no other references to the C++ Tensor and no other references to the grad_fn (see the sketch below). The critical observation compared to the existing comment (thanks Ed!) is that if we also check that the C++ Tensor object is not referenced anywhere else, we can be sure that no one can change the grad_fn refcount between the traverse and the clear.
- THPVariable doesn't need a special clear for this new case, as we're the only owner of the C++ Tensor, so the cdata.reset() will necessarily free the Tensor and all its resources.
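
A hedged sketch of the kind of cycle this makes collectible:

```
import gc
import weakref

import torch

x = torch.ones(2, requires_grad=True)
y = x * 2
y.grad_fn.metadata["owner"] = y   # cycle: y -> grad_fn -> metadata -> y
ref = weakref.ref(y)
del y
gc.collect()
assert ref() is None              # the cycle is found now that traverse visits grad_fn
```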

The two tests are to ensure:
- That the cycles are indeed collectible by the gc

Pull Request resolved: https://github.com/pytorch/pytorch/pull/58271

Reviewed By: ngimel

Differential Revision: D28796461

Pulled By: albanD

fbshipit-source-id: 62c05930ddd0c48422c79b03118db41a73c1355d
2021-06-01 10:27:52 -07:00
773cfae93b Tag PyObject on TensorImpl per torchdeploy interpreter (#57985)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57985

Fixes https://github.com/pytorch/pytorch/issues/57756

This PR introduces a new `pyobj_interpreter_` field on TensorImpl which tracks what Python interpreter (if any) owns the TensorImpl. This makes it illegal to bind a TensorImpl from multiple Python interpreters, and means that we can now directly store PyObject pointer on TensorImpl even in the presence of multiple Python interpreters, as is the case in torchdeploy. This is a necessary step for PyObject preservation, which cannot be easily implemented when there are multiple Python interpreters.

Although the PR is not that long, there is a very subtle portion of the implementation devoted to ensuring that the tagging process is thread safe, since multiple threads can concurrently try to tag a PyObject. Check Note [Python interpreter tag] and Note [Memory ordering on Python interpreter tag] for detailed discussion of how this is handled. You will have to check this code carefully in code review; I did not torture test the multithreaded paths in any meaningful way.

In a follow up PR, I will pack the interpreter and PyObject fields into single atomic word on 64-bit.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: wconstab

Differential Revision: D28390242

Pulled By: ezyang

fbshipit-source-id: a6d9b244ee6b9c7209e1ed185e336297848e3017
2021-05-20 18:18:39 -07:00
727c1d69d7 Remove unnecessary indirection through torch::autograd::impl::pyobj/set_pyobj (#57733)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57733

I'm going to be modifying the APIs here, so the less API surface
covering these functions the better.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D28289082

Pulled By: ezyang

fbshipit-source-id: 4b71270bb82e0d6baa4dfed2f2e4ee8831f590b5
2021-05-10 08:18:33 -07:00
da8cc355a3 Relax tp_new so that it is OK to call (#57544)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57544

Instead of removing tp_new from the superclass (which causes
super().__new__ to not work), I now still install tp_new on the
superclass, but verify that you are not trying to directly
construct _TensorBase.

Fixes https://github.com/pytorch/pytorch/issues/57421
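
A hedged sketch of the behavior after this patch:

```
import torch

class MyTensor(torch.Tensor):
    def __new__(cls, *args):
        return super().__new__(cls, *args)   # works again after this patch

t = MyTensor()            # fine: goes through the shared tp_new

try:
    torch._C._TensorBase()
except Exception:
    pass                  # direct construction of _TensorBase is still rejected
```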

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D28189475

Pulled By: ezyang

fbshipit-source-id: 9397a3842a77f5428d182dd62244b42425bca827
2021-05-05 09:04:39 -07:00
e845158b1a Assert that GIL is not held in blocking destructors (#57030)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57030

PR #57029 is not perfect; there are still obscure situations in which
we might allocate a shared_ptr to an RpcAgent that doesn't have a
no-GIL constructor, so this PR adds the other half of the equation:
assert that we don't hold the GIL when running a blocking destructor.
This makes it possible to detect potential deadlocks even if the
code doesn't deadlock in practice (because you got lucky and none
of the threads you blocked on tried to also take out the GIL).

I considered whether or not to make this DEBUG_ONLY.  For now it's
not, so I can get better CI coverage, and because this test only
happens in destructors of objects that die rarely.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: zou3519

Differential Revision: D28030582

Pulled By: ezyang

fbshipit-source-id: a7d7f6545223c4823c7f6036dfe29bd2edaf60a5
2021-05-02 22:06:02 -07:00
4cb534f92e Make PyTorch code-base clang-tidy compliant (#56892)
Summary:
This is an automatic change generated by the following script:
```
#!/usr/bin/env python3
from subprocess import check_output, check_call
import os

def get_compiled_files_list():
    import json
    with open("build/compile_commands.json") as f:
        data = json.load(f)
    files = [os.path.relpath(node['file']) for node in data]
    for idx, fname in enumerate(files):
        if fname.startswith('build/') and fname.endswith('.DEFAULT.cpp'):
            files[idx] = fname[len('build/'):-len('.DEFAULT.cpp')]
    return files

def run_clang_tidy(fname):
    check_call(["python3", "tools/clang_tidy.py", "-c", "build", "-x", fname,"-s"])
    changes = check_output(["git", "ls-files", "-m"])
    if len(changes) == 0:
        return
    check_call(["git", "commit","--all", "-m", f"NOLINT stubs for {fname}"])

def main():
    git_files = check_output(["git", "ls-files"]).decode("ascii").split("\n")
    compiled_files = get_compiled_files_list()
    for idx, fname in enumerate(git_files):
        if fname not in compiled_files:
            continue
        if fname.startswith("caffe2/contrib/aten/"):
            continue
        print(f"[{idx}/{len(git_files)}] Processing {fname}")
        run_clang_tidy(fname)

if __name__ == "__main__":
    main()
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/56892

Reviewed By: H-Huang

Differential Revision: D27991944

Pulled By: malfet

fbshipit-source-id: 5415e1eb2c1b34319a4f03024bfaa087007d7179
2021-04-28 14:10:25 -07:00
e362ee6f8a Make it illegal to directly construct _TensorBase (#56150)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56150

See #56017 for full context; the short story is that by making
it illegal to directly construct _TensorBase, we need only
write a *single* tp_dealloc function which will work universally
for all _TensorBase subclasses, rather than having to write two
versions, one for _TensorBase itself, and others for Python subclasses
of _TensorBase.  This means simpler code.

The subtlety here is that we only install our custom `tp_new` for direct subclasses of TensorBase.  This is important, because overriding the `tp_new` also overrides any user defined constructor.  Fortunately class Tensor(_TensorBase) has no nontrivial constructors and doesn't mind, but other subclasses like Parameter definitely mind!

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: H-Huang

Differential Revision: D28028746

Pulled By: ezyang

fbshipit-source-id: 3c03a14666ad1ded1145fe676afb0a7623cdb9bb
2021-04-28 09:25:25 -07:00
4d72538f80 Give Tensor a trivial (for now) metaclass _TensorMeta (#56147)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56147

This is in support of #55686; you can see the broader context of the metaclass in
a more complete PR #56017.  The short story is that in the future I want to
give Tensor a non-trivial metaclass, so to derisk the change first I give it a
trivial metaclass to shake out any bugs that might be caused by it.  The
metaclass shouldn't have any performance impact on Tensor as it only gets
invoked upon subclass creation.

By the way, it was totally not documented how to create metaclasses in the Python
C API, and it took a good bit of trial error to figure it out (and the answer is
now immortalized in https://stackoverflow.com/q/67077317/23845 -- the things
that I got wrong in earlier versions of the PR included setting tp_basicsize
incorrectly, incorrectly setting Py_TPFLAGS_HAVE_GC on the metaclass--you want
to leave it unset so that it inherits, and determining that tp_init is what
actually gets called when you construct a class, not tp_call as another
not-to-be-named StackOverflow question suggests).

Aside: Ordinarily, adding a metaclass to a class is a user visible change, as
it means that it is no longer valid to mixin another class with a different
metaclass. However, because _C._TensorBase is a C extension object, it will
typically conflict with most other metaclasses, so this is not BC breaking.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: H-Huang

Differential Revision: D28028747

Pulled By: ezyang

fbshipit-source-id: c1e35a986aeb3db540c73d188f53dce951eeed33
2021-04-28 09:24:21 -07:00
6ec71ed4f9 Replace all direct cdata access with THPVariable_Unpack (#55799)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55799

I'm going to change the implementation of cdata soon so I need to
abstract over cdata access with a function.  Additionally, many
users are manually casting to THPVariable to access
the member, so I can remove these unsafe casts in the client code
(the implementation, of course, is still doing an unsafe cast.)

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D27712130

Pulled By: ezyang

fbshipit-source-id: 95fcc013bf3913d67f2c634068eb5b3aab144cb3
2021-04-15 08:57:04 -07:00
5fb1142702 Add CSR (compressed sparse row) layout for sparse tensors (#50937)
Summary:
Implement compressed sparse row format. Derived from the GCS implementation at https://github.com/pytorch/pytorch/pull/44190
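
A hedged sketch of the new layout in use, assuming the `sparse_csr_tensor` constructor:

```
import torch

crow_indices = torch.tensor([0, 2, 4])     # row i spans values[crow[i]:crow[i+1]]
col_indices = torch.tensor([0, 1, 0, 1])
values = torch.tensor([1., 2., 3., 4.])
a = torch.sparse_csr_tensor(crow_indices, col_indices, values, size=(2, 2))
assert a.layout == torch.sparse_csr
dense = a.to_dense()                       # [[1., 2.], [3., 4.]]
```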

Pull Request resolved: https://github.com/pytorch/pytorch/pull/50937

Reviewed By: mrshenli

Differential Revision: D27439865

Pulled By: ezyang

fbshipit-source-id: 3ba3dcb9679505b980ff6a5f513e913bbae2fb1d
2021-04-12 10:09:12 -07:00
30cb6ac53c Introduce mlc device (ML Compute device) to PyTorch's device list (#50634)
Summary:
Apple recently announced ML Compute, a new framework available in macOS Big Sur, which enables users to accelerate the training of neural networks on Mac hardware. This PR is the first in a series of PRs that will enable the integration with ML Compute. Most of the integration code will live on a separate subrepo named `mlc`.
The integration with `mlc` (ML Compute) will be very similar to that of xla. We rely on registering our ops through:

```
TORCH_LIBRARY_IMPL(aten, PrivateUse1, m) {
  m.impl_UNBOXED(<op_schema_name>, &customized_op_kernel)
  ...
}
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/50634

Reviewed By: malfet

Differential Revision: D26614213

Pulled By: smessmer

fbshipit-source-id: 3b492b346c61cc3950ac880ac01a82fbdddbc07b
2021-02-24 22:39:11 -08:00
60518d10f6 [deploy] torch::deploy API (#51754)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51754

This API allows you to manage multiple python interpreters in a single
process to deploy PyTorch models packaged with torch.package.

torch/csrc/deploy/deploy.h contains the API definition
torch/csrc/deploy/test_deploy.cpp has some examples.

Notes:
* mutex is added to PyTorchStreamReader to make it safe to use from multiple threads at once.
* USE_DEPLOY is only true for the special libtorch_deployinterpreter.so library, when enabled
  we use a hash table to maintain the PyObject <> at::Tensor mapping rather than the internal pointer
  in Tensor, since >1 interpreter may have a reference to the tensor.
* serialization.py has some additional functions for creating pickle objects
  but keeping storages in memory, for use in transferring tensors between interpreters

Test Plan: Imported from OSS

Reviewed By: wconstab

Differential Revision: D26329468

Pulled By: zdevito

fbshipit-source-id: d75f4ebb9a27f1d911179d9996041bcb3ca04a07
2021-02-18 02:30:08 -08:00
4a8ef4525e Add new backend type for Intel heterogeneous computation platform. (#49786)
Summary:
Add a new device type 'XPU' ('xpu' for lower case) to PyTorch. Changes are needed for code related to device model and kernel dispatch, e.g. DeviceType, Backend and DispatchKey etc.

https://github.com/pytorch/pytorch/issues/48246
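
Once registered, the new type behaves like any other device string; a minimal sketch:

```
import torch

d = torch.device("xpu", 0)   # 'xpu' is now a recognized device type
assert d.type == "xpu" and d.index == 0
```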

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49786

Reviewed By: mrshenli

Differential Revision: D25893962

Pulled By: ezyang

fbshipit-source-id: 7ff0a316ee34cf0ed6fc7ead08ecdeb7df4b0052
2021-01-20 08:15:18 -08:00