Commit Graph

359 Commits

Author SHA1 Message Date
432fce4e0d Revert "Add tensor post accumulate grad hook API (#107063)"
This reverts commit 3f655277d44909e0770e77e1b4fe1c9b0f39d7b9.

Reverted https://github.com/pytorch/pytorch/pull/107063 on behalf of https://github.com/ZainRizvi due to Diff train weirdness. Need to temporarily revert this PR and will re-land it soon afterwards ([comment](https://github.com/pytorch/pytorch/pull/107063#issuecomment-1690799057))
2023-08-24 00:12:34 +00:00
bc0790559b Revert "Remove unnecessary import in python_variable.cpp (#107794)"
This reverts commit 9d23b8b3eabe2cd38eb5a11cc49cda6970675595.

Reverted https://github.com/pytorch/pytorch/pull/107794 on behalf of https://github.com/ZainRizvi due to Diff train weirdness. Need to temporarily revert this PR and will re-land it soon afterwards ([comment](https://github.com/pytorch/pytorch/pull/107794#issuecomment-1690798855))
2023-08-24 00:10:18 +00:00
9d23b8b3ea Remove unnecessary import in python_variable.cpp (#107794)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107794
Approved by: https://github.com/albanD, https://github.com/ZainRizvi
2023-08-23 19:43:39 +00:00
02d41b7afd allow result of at::for_blob to advertise as resizeable (for tensor subclasses) (#107416)
Previously, the first overload of `_make_wrapper_subclass` returned a tensor that **always** advertised as having a non-resizeable storage. Eventually, we'll need it to advertise as resizeable for functionalization to work (since functionalization occasionally needs to resize storages).

Not directly tested in this PR (it is exercised more heavily later in AOT dispatch); if someone wants me to write a more direct test I can add one.
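
For context, a minimal sketch of how a tensor subclass typically goes through this path (the class and attribute names here are illustrative, not part of this PR):

```python
import torch

# Hypothetical wrapper subclass built via _make_wrapper_subclass; with this
# change, the wrapper's storage can be advertised as resizeable when
# functionalization needs to resize it. A real subclass would also define
# __torch_dispatch__ to route ops to the wrapped tensor.
class WrapperTensor(torch.Tensor):
    @staticmethod
    def __new__(cls, elem):
        return torch.Tensor._make_wrapper_subclass(
            cls,
            elem.size(),
            dtype=elem.dtype,
            device=elem.device,
            requires_grad=elem.requires_grad,
        )

    def __init__(self, elem):
        self.elem = elem  # the underlying tensor being wrapped
```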

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107416
Approved by: https://github.com/ezyang, https://github.com/albanD
ghstack dependencies: #107417
2023-08-22 15:25:31 +00:00
3f655277d4 Add tensor post accumulate grad hook API (#107063)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107063
Approved by: https://github.com/albanD, https://github.com/soulitzer
2023-08-22 15:15:57 +00:00
1a6619a830 Added missing whitespace when reporting invalid gradient type (#104992)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104992
Approved by: https://github.com/albanD, https://github.com/soulitzer, https://github.com/Skylion007
2023-07-11 22:24:02 +00:00
4e204ff87b Added is_xla (#103100)
This change adds `is_xla`, which is congruent with `is_cuda` and `is_cpu`. It is useful in situations like https://github.com/pytorch/pytorch/pull/102858:

```
>>> x = torch.tensor([1], device=xm.xla_device())
>>> x.is_xla
True
>>> x.is_cpu
False
>>> x = torch.tensor([1])
>>> x.is_cpu
True
>>> x.is_xla
False
```

Attn: @albanD
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103100
Approved by: https://github.com/albanD
2023-06-22 23:31:04 +00:00
685505353a Back out "Add PyObject preservation for UntypedStorage (#97470)" (#102553)
Summary:
Original commit changeset: c24708d18ccb

Original Phabricator Diff: D46159983

Test Plan: SL tests and CI

Differential Revision: D46284986

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102553
Approved by: https://github.com/DanilBaibak
2023-06-01 17:23:43 +00:00
5fe629e314 Add PyObject preservation for UntypedStorage (#97470)
Part of #91395

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97470
Approved by: https://github.com/ezyang
2023-05-23 01:27:30 +00:00
47ec9cc26d Improve error messages in THPVariable_set_grad (#100683)
Fixes #100174

I'm not sure if there's another direction that we had in mind for this issue, but if so I'm happy to make the changes 🙂
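
For reference, a small snippet that exercises this path (the exact exception text is version-dependent and only sketched in the comment):

```python
import torch

# Assigning an invalid value to .grad goes through THPVariable_set_grad;
# the improved message should say what was expected instead.
x = torch.randn(3, requires_grad=True)
try:
    x.grad = "not a tensor"
except TypeError as e:
    print(e)  # expected to mention that .grad must be a Tensor or None
```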

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100683
Approved by: https://github.com/soulitzer
2023-05-12 01:54:20 +00:00
555ab310dc Add itemsize and nbytes properties to Tensor (#98322)
Adds itemsize and nbytes properties to Tensor, matching the corresponding properties in NumPy.

Fixes https://github.com/pytorch/pytorch/issues/12728
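
A quick illustration of the added properties:

```python
import torch

x = torch.zeros(2, 3, dtype=torch.float32)
print(x.itemsize)  # 4  -- bytes per element for float32
print(x.nbytes)    # 24 -- 6 elements * 4 bytes
```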

Pull Request resolved: https://github.com/pytorch/pytorch/pull/98322
Approved by: https://github.com/ezyang
2023-04-05 12:11:55 +00:00
a4b02a15d3 Support registering op returning symint in python (#95240)
Running an operator registered in python returning a symint will result in the following error:
```
RuntimeError: Unable to cast Python instance of type <class 'torch.SymInt'> to C++ type 'long'
```

The interaction of 2 things triggers the issue:
- We use a boxed kernel here. For boxed kernels, we need to convert py::object to IValue in pushPyOutToStack in torch/csrc/autograd/python_variable.cpp.
- In the schema parsing code in SchemaTypeParser::parseFakeAndRealType in torch/csrc/jit/frontend/schema_type_parser.cpp, if a SymInt is found, we register an Int type instead (not sure why we do this) and register SymInt as the real type.

The result is that we convert a SymInt to an int in pushPyOutToStack, which causes the issue.

The fix is to use real type when we convert py::object to IValue.

BTW, registering the same op using the C++ API does not trigger the issue.
```
TORCH_LIBRARY(clib, m) {
  m.def("sqsum(SymInt a, SymInt b) -> SymInt", [](SymInt a, SymInt b) -> SymInt {
    return a * a + b * b;
  });
}
```
The reason is that the kernel registered in C++ is an unboxed kernel, so it does not hit the code path above that converts a py::object to IValue.
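
For comparison, a rough sketch of the Python-side registration (which does go through the boxed path); the exact `torch.library` calls and dispatch key used here are assumptions, not taken from this PR:

```python
import torch
from torch.library import Library

# Hypothetical Python registration of the same op. Python-registered kernels
# are boxed, so returning a SymInt exercises pushPyOutToStack, where the
# SymInt -> int conversion used to fail.
lib = Library("clib", "DEF")
lib.define("sqsum(SymInt a, SymInt b) -> SymInt")

def sqsum(a, b):
    return a * a + b * b

lib.impl("sqsum", sqsum, "CompositeExplicitAutograd")
```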

Pull Request resolved: https://github.com/pytorch/pytorch/pull/95240
Approved by: https://github.com/larryliu0820, https://github.com/ezyang
2023-02-22 04:56:37 +00:00
0247ed27cc Apply Clang-Tidy readability-container-size-empty (#93236)
Not only is this change usually shorter and more readable, it can also yield better performance. size() is not always a constant-time operation (e.g. on some linked lists), but empty() always is.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/93236
Approved by: https://github.com/malfet
2023-01-29 23:28:19 +00:00
f3266015a4 Add _StorageMeta metaclass for StorageBase (#92648)
Part of #91395

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92648
Approved by: https://github.com/ezyang, https://github.com/albanD
2023-01-24 23:08:23 +00:00
4d9920fa9c Move PyInterpreter code in python_variable.cpp to its own files (#92647)
Part of #91395

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92647
Approved by: https://github.com/ezyang, https://github.com/albanD
2023-01-24 23:08:23 +00:00
397b1a3da0 Remove unnecessary includes from python_variable.cpp (#92839)
Follow-up from #92647

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92839
Approved by: https://github.com/Skylion007
2023-01-24 02:59:08 +00:00
8c8cd9539d Add missing moves to torch autograd (#92772)
Applies std::move in torch/csrc/autograd at additional opportunities found via static analysis.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92772
Approved by: https://github.com/ezyang
2023-01-24 02:01:52 +00:00
1237cf6b6c Allow direct Tensor constructor to return preexisting PyObject (#92754)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92754
Approved by: https://github.com/albanD, https://github.com/voznesenskym
2023-01-23 20:20:43 +00:00
97342ae04b Fix python tensor hooks behavior on inplace (#92734)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92734
Approved by: https://github.com/albanD
2023-01-21 21:32:37 +00:00
1bc60c6b31 [reland] Improve hooks ordering behavior (#92559)
This reverts commit e525f433e15de1f16966901604a8c4c662828a8a.

Original PR:  #85849

In addition to reverting the revert, this PR:
- defines the virtual destructor of FunctionPreHook in the header. Why? Presumably the internal build imports the header from somewhere, but does not have function_hooks.cpp (where the virtual destructor was previously defined) in the same compilation unit.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92559
Approved by: https://github.com/albanD
2023-01-19 08:17:32 +00:00
e525f433e1 Revert "Improve hooks ordering behavior (#85849)"
This reverts commit 049838f2496bd1d29e4e8292714acb0042cc706e.

Reverted https://github.com/pytorch/pytorch/pull/85849 on behalf of https://github.com/albanD due to fails internal build
2023-01-18 15:27:22 +00:00
049838f249 Improve hooks ordering behavior (#85849)
Addresses: https://github.com/pytorch/pytorch/issues/35802

Design doc: https://docs.google.com/document/d/19xSib7FFknRQ5f3ptGFUmiOt3BrgXSUlTQH2xMcZJYg/edit#

### Changes in this PR

#### Implementation
- We now have 3 fields: pre_hooks, retains_grad_hooks, and tensor_pre_hooks, so that we can more precisely define their ordering and when they are executed.
- Since retains_grad uses an entirely new field, we cannot reuse the old retains_grad logic. We refactor retains_grad to call directly into the variable.cpp logic. Other logic in variable.cpp that handles cpp hooks must also be updated.

#### Hooks ordering and execution:
- Pre-hooks registered on a tensor run before pre-hooks registered on its grad_fn (a sketch follows at the end of this description)
- Pre-hooks registered on a tensor always run, even if they are the inputs= to .grad()
- Post-hooks (and pre-hooks) can now observe modifications to the gradient made by tensor pre-hooks

#### Retains grad hooks
- retains grad hooks always execute last, even if there are other tensor pre-hooks registered

#### Unchanged:
- pre_hooks registered to grad_fn aren't expected to execute if they are the inputs= to .grad()

Follow ups:
- simplify retains_grad field to not be a vector, since it always holds a single hook
- potentially merge capture hooks with tensor pre hooks, this would involve some additional refactoring since
- python hooks registered to tensor behavior on in-place is still wrong
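
A minimal sketch of the tensor-hook vs. grad_fn-hook ordering described above (this assumes `Node.register_prehook` is available; the order shown in the comment is the expected behavior, not a captured log):

```python
import torch

order = []

x = torch.randn(3, requires_grad=True)
y = x * 2

def tensor_pre_hook(grad):
    order.append("tensor pre-hook")
    return grad * 10  # later hooks observe this modified gradient

def node_pre_hook(grad_outputs):
    order.append("grad_fn pre-hook")

y.register_hook(tensor_pre_hook)           # pre-hook registered on the tensor
y.grad_fn.register_prehook(node_pre_hook)  # pre-hook registered on its grad_fn

y.sum().backward()
print(order)  # expected: ['tensor pre-hook', 'grad_fn pre-hook']
```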

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85849
Approved by: https://github.com/albanD
2023-01-17 16:23:21 +00:00
9b716a0682 Clean up more clang-tidy suppressions (#92203)
1. remove unused NOLINTNEXTLINE(performance-move-const-arg)
2. add more std::move

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92203
Approved by: https://github.com/Skylion007
2023-01-17 05:43:08 +00:00
3a0053abd6 Move PyObject code out of TensorImpl into new PyObjectSlot class (#92169)
Redo of PR #92099

Part of #91395

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92169
Approved by: https://github.com/albanD
2023-01-14 02:55:32 +00:00
7078ad5b8c Reland "AOT Autograd refactor + cleanup, handle intermediate views of bases, use view replay, fix non-tensor input handling" (#92076)
Original PR: https://github.com/pytorch/pytorch/pull/89532

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92076
Approved by: https://github.com/janeyx99, https://github.com/albanD
2023-01-12 21:32:05 +00:00
db466ae057 Revert "[Modes] Add assert that the mode isn't already on the stack (#90770)"
This reverts commit 702838637d63936460ea2bf00b64ffec86ed6687.

Reverted https://github.com/pytorch/pytorch/pull/90770 on behalf of https://github.com/DanilBaibak due to Break internal build
2023-01-12 16:44:29 +00:00
702838637d [Modes] Add assert that the mode isn't already on the stack (#90770)
Redo of #89726 on a clean PR, thanks @voznesenskym for the first draft!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90770
Approved by: https://github.com/ezyang
2023-01-11 15:19:43 +00:00
b3603f8129 Revert "Deduplicate c10 error and PyTorchError hierarchy (#87855)"
This reverts commit 34f2d3e6ae56744c20c2f859f97101dff291bbbc.

Reverted https://github.com/pytorch/pytorch/pull/87855 on behalf of https://github.com/osalpekar due to perf regression in quantization tests
2023-01-06 19:56:35 +00:00
34f2d3e6ae Deduplicate c10 error and PyTorchError hierarchy (#87855)
Fixes #53370

Pull Request resolved: https://github.com/pytorch/pytorch/pull/87855
Approved by: https://github.com/albanD
2023-01-02 15:53:36 +00:00
9f91e94080 Workaround for NumPy builds that ship with a broken Dlpack deleter (#89759)
NumPy versions 1.22 and 1.23 (including their respective bugfix releases) have a buggy implementation of the DLPack deleter that doesn't account for no-GIL contexts. Since we now release the GIL when deallocating tensors in `THPVariable_clear`, this leads to a failure of internal consistency checks when freeing a DLPack-backed tensor from NumPy.

This PR adds a check for the buggy NumPy versions and overrides the `DLManagedTensor` deleter to reacquire the GIL before deallocation.
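
A small illustration of the interop path in question (the zero-copy DLPack exchange between NumPy and torch):

```python
import numpy as np
import torch

# A NumPy array consumed via DLPack shares memory with the torch tensor;
# when the tensor is freed, NumPy's DLPack deleter runs. On NumPy 1.22/1.23
# that deleter assumed the GIL was held, which this workaround restores.
a = np.arange(6, dtype=np.float32)
t = torch.from_dlpack(a)  # zero-copy: t aliases a's buffer
del t                     # NumPy's DLPack deleter is invoked here
```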

### Rationale for this implementation
The version check was added to `tensor_numpy.h/cpp` as it seemed like a more logical location for it than creating a new translation unit. Overriding the deleter was originally attempted by directly modifying `at::fromDLPack`, but the lack of a build dependency on the Python C API in ATen prevented that. So, I extended the ATen DLPack API instead to additionally accept a custom deleter functor.

Fixes #88082

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89759
Approved by: https://github.com/albanD
2022-12-28 13:23:29 +00:00
283cf718ed Fix _fix_weakref memory leak (#90823)
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90823
Approved by: https://github.com/eellison, https://github.com/albanD
2022-12-15 01:07:29 +00:00
c79489c8e6 Expose to python the backward AD view_func (#89586)
This will be useful for other systems (AOTAutograd) that want to replay autograd views.
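
A rough sketch of what this enables on the Python side; the `_view_func` attribute name is an assumption about the binding this PR adds:

```python
import torch

# Replay an autograd view on a new base tensor.
base = torch.randn(4, requires_grad=True)
view = base.view(2, 2)                 # an autograd view of `base`

new_base = torch.randn(4, requires_grad=True)
replayed = view._view_func(new_base)   # re-applies the same view to new_base
print(replayed.shape)                  # torch.Size([2, 2])
```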

FYI @bdhirsh
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89586
Approved by: https://github.com/soulitzer
2022-11-24 03:39:58 +00:00
e0c194f10b Fix typos in messages under torch (#88961)
This PR fixes typos of messages and parms in c++ source and head files under `torch` directory.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88961
Approved by: https://github.com/albanD
2022-11-14 19:06:41 +00:00
2b166532f7 Remove incorrect assert about hermetic state. (#88885)
I'm not sure why I thought this assert was valid in the first
place, and there's no comment about it.

The assert is tantamount to saying, "no tensor objects should
become dead via SafePyObject when hermetic mode is on."  But
suppose we run a Python GC while we're inside hermetic mode.
This could result in us disposing non-hermetic tensors, which
would hit decref.  So the assert seems invalid.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88885
Approved by: https://github.com/anjali411, https://github.com/malfet
2022-11-12 02:19:45 +00:00
53eac1d482 Revert "Revert "Put Python Dispatcher cache in dict, clear it on new registrations. (#88329)"" (#88489)
The bug was that I was accidentally caching at the wrong key name, so
we were never actually hitting the cache.  I've renamed the resolved
key to final_key to avoid shadowing in this way.

This reverts commit 410ce96a23a3496a45478e0b25ffac53aa3c116f.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88489
Approved by: https://github.com/albanD
2022-11-04 19:23:04 +00:00
410ce96a23 Revert "Put Python Dispatcher cache in dict, clear it on new registrations. (#88329)"
This reverts commit 86c7cd287caeb23c227d97d283e58bc123294746.

Reverted https://github.com/pytorch/pytorch/pull/88329 on behalf of https://github.com/clee2000 due to test_decomp takes an extra 2 hours in some jobs, windows takes so long it times out
2022-11-03 21:57:19 +00:00
f884e817d4 Make Python op registration work with torchdeploy/multipy (#87162)
See strategy at PythonOpRegistrationTrampoline.cpp for the
big picture.

Along the way, I made OperatorHandle support == and hashing,
and slightly changed the low-level python_dispatch impl API
to disallow empty strings for the dispatch key, which had the
knock-on effect of requiring us to explicitly pass in
CompositeImplicitAutograd where we would have passed in "" (I didn't
apply this to the rest of the file because I'm lazy.)

Test strategy is we delete the logic for preventing Python op
registrations in torch from being skipped in a torchdeploy context
and show CI still works.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87162
Approved by: https://github.com/anjali411, https://github.com/bdhirsh
2022-11-03 12:56:44 +00:00
86c7cd287c Put Python Dispatcher cache in dict, clear it on new registrations. (#88329)
The motivation is that I am going to add the ability to temporarily
install entries into the Python dispatcher, and to do that, I need
an easier way to clear the cache.  Putting the cache in a dict
centralizes cache clearing in one place.  I then add some easy
cache clearing.
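
A generic, hypothetical sketch of the pattern (the names here do not correspond to PyTorch internals): keeping the per-op cache in a single dict makes invalidation on new registrations a one-liner.

```python
registrations = {}   # (op, dispatch_key) -> handler
dispatch_cache = {}  # memoized lookup results

def resolve(op, key):
    if (op, key) not in dispatch_cache:
        # slow path: consult the registration table, fall back to a stub
        dispatch_cache[(op, key)] = registrations.get((op, key), NotImplemented)
    return dispatch_cache[(op, key)]

def register(op, key, handler):
    registrations[(op, key)] = handler
    dispatch_cache.clear()  # centralized cache: clearing it is one call
```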

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88329
Approved by: https://github.com/albanD
2022-11-03 12:53:51 +00:00
59fe272c1e Fix: prefer .is_none() over .is(py::none()) for pybind11 (#88051)
Fixes a minor perf regression I saw in #85688 and applies the replacement throughout the code base. `obj == Py_None` is directly equivalent to `is_none()`. Constructing a temporary py::none() object needlessly increfs/decrefs py::none; `is_none()` avoids that and is therefore more efficient.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88051
Approved by: https://github.com/albanD
2022-10-31 16:41:27 +00:00
1ff52225f1 Unify SymIntNode and SymFloatNode into SymNode (#87817)
This refactor was prompted by challenges handling mixed int/float
operations in C++.  A previous version of this patch
added overloads for each permutation of int/float and was unwieldy
(https://github.com/pytorch/pytorch/pull/87722/).  This PR takes a
different approach.

The general outline of the patch is to combine the C++ types SymIntNode
and SymFloatNode into a single type, SymNode.  This is type erased; we
no longer know statically in C++ whether we have an int or a float and
have to test it with the is_int()/is_float() virtual methods.  This has
a number of knock-on effects.

- We no longer have C++ classes to bind to Python.  Instead, we take an
  entirely new approach to our Python API, where we have a SymInt/SymFloat
  class defined entirely in Python, which hold a SymNode (which corresponds
  to the C++ SymNode).  However, SymNode is not pybind11-bound; instead,
  it lives as-is in Python, and is wrapped into C++ SymNode using PythonSymNode
  when it goes into C++.  This implies a userland rename.

  In principle, it is also possible for the canonical implementation of SymNode
  to be written in C++, and then bound to Python with pybind11 (we have
  this code, although it is commented out.)  However, I did not implement
  this as we currently have no C++ implementations of SymNode.

  Because we do return SymInt/SymFloat from C++ bindings, the C++ binding
  code needs to know how to find these classes.  Currently, this is done
  just by manually importing torch and getting the attributes.

- Because SymInt/SymFloat are easy Python wrappers, __sym_dispatch__ now
  takes SymInt/SymFloat, rather than SymNode, bringing it in line with how
  __torch_dispatch__ works.

Some miscellaneous improvements:

- SymInt now has a constructor that takes SymNode.  Note that this
  constructor is ambiguous if you pass in a subclass of SymNode,
  so an explicit downcast is necessary.  This means toSymFloat/toSymInt
  are no more.  This is a mild optimization as it means rvalue reference
  works automatically.

- We uniformly use the caster for c10::SymInt/SymFloat, rather than
  going the long way via the SymIntNode/SymFloatNode.

- Removed some unnecessary toSymInt/toSymFloat calls in normalize_*
  functions; pretty sure this doesn't change anything.

- guard_int is now a free function, since to guard on an int you cannot
  assume the method exists.  A function can handle both int and SymInt
  inputs.

- We clean up the magic method definition code for SymInt/SymFloat/SymNode.
  ONLY the user classes (SymInt/SymFloat) get magic methods; SymNode gets
  plain methods; this is to help avoid confusion between the two types.
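
A small sketch of the user-facing classes described above: under symbolic tracing, sizes come back as `torch.SymInt` (a pure-Python class holding a SymNode) rather than plain ints. The printed type in the comment is an expectation, not a captured output.

```python
import torch
from functorch import make_fx

def f(x):
    s = x.shape[0]
    print(type(s))  # expected: <class 'torch.SymInt'> under symbolic tracing
    return x * s

gm = make_fx(f, tracing_mode="symbolic")(torch.randn(3, 4))
```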

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

cc @jansel @mlazos @soumith @voznesenskym @yanboliang @penguinwu @anijain2305
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87817
Approved by: https://github.com/albanD, https://github.com/anjali411
2022-10-27 20:56:02 +00:00
169ec120ef [Modes] refactor modes to only use a stack in cpp (#86458)
Refactors the mode code to only have the C++ mode stack and not the "C++ mode" that we originally had. This also simplifies the mode logic in a number of places.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86458
Approved by: https://github.com/zou3519
2022-10-21 19:18:23 +00:00
936e93058b Delete torch::deploy from pytorch core (#85953)
As we have migrated torch::deploy over to https://github.com/pytorch/multipy, we can now delete it from pytorch core as ongoing development will happen there.

This PR was created due to syncing issues with https://github.com/pytorch/pytorch/pull/85443 which is where the review history can be found.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85953
Approved by: https://github.com/seemethere, https://github.com/malfet
2022-10-06 07:20:16 +00:00
0a75c42f36 Workaround MSVC ICE due to constexpr char* template argument (#86288)
Test Plan:
Lease a Windows sandcastle https://www.internalfb.com/intern/wiki/Windows_Platform_Engineering/Leasable_VM_-_User_Guide/
and run:

```
buck build arvr/mode/win/opt //xplat/caffe2:_C_impl
```

Differential Revision: D40109191

Pull Request resolved: https://github.com/pytorch/pytorch/pull/86288
Approved by: https://github.com/albanD, https://github.com/malfet
2022-10-06 04:11:05 +00:00
3b6588ab74 Consistent compute numel/contiguous strategy with SymInts (#85858)
Previously, our handling for contiguity was inconsistent in the following ways:

- is_strides_like 2d/3d and is_non_overlapping_and_dense were always computed
  based on sizes_and_strides_, even if you had symbolic ints
- Furthermore, even if you set a custom policy for strides, these quantities were
  not overridable by subclasses
- Furthermore, we didn't even store these fields on ExtraMeta
- We duplicated implementations of compute_contiguous (plain, channels last,
  channels last 3d)
- We inconsistently called refresh_numel()/refresh_contiguous(), versus
  recomputing them ourselves

This patch adopts a consistent strategy for all of the boolean fields and
for numel computation.  After this refactor:

- All layout boolean fields are interposable via strides policy
  and can be overridden from Python; you will never access a garbage field
- All layout boolean fields are on ExtraMeta
- You can always call refresh_numel/contiguous, no matter if your Tensor is
  contiguous or not
- The numel/layout boolean fields are always populated consistently with
  the sizes/strides fields (either on Tensor or ExtraMeta), even if you
  have a custom policy
- There is only one implementation of the actual computation logic

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Differential Revision: [D39907696](https://our.internmc.facebook.com/intern/diff/D39907696)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85858
Approved by: https://github.com/albanD
2022-09-30 21:26:34 +00:00
5b476e68af Slightly beefed up dynamic shapes tests for storage_offset (#85806)
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85806
Approved by: https://github.com/albanD
2022-09-28 19:25:22 +00:00
614d6f19e3 Fix Use obj1.is(obj2) warnings (#85688)
Fixes:
```
#    define PYBIND11_DEPRECATED(reason) [[deprecated(reason)]]
                                          ^
/dev/shm/rbarnes/tempfs/pytorch/torch/csrc/autograd/python_variable.cpp:2603:11: warning: 'operator==' is deprecated: Use obj1.is(obj2) instead [-Wdeprecated-declarations]
  if (out == Py_None) {
          ^
/dev/shm/rbarnes/tempfs/pytorch/cmake/../third_party/pybind11/include/pybind11/detail/../pytypes.h:276:5: note: 'operator==' has been explicitly marked deprecated here
    PYBIND11_DEPRECATED("Use obj1.is(obj2) instead")
    ^
/dev/shm/rbarnes/tempfs/pytorch/cmake/../third_party/pybind11/include/pybind11/detail/common.h:136:43: note: expanded from macro 'PYBIND11_DEPRECATED'
#    define PYBIND11_DEPRECATED(reason) [[deprecated(reason)]]
                                          ^
/dev/shm/rbarnes/tempfs/pytorch/torch/csrc/autograd/python_variable.cpp:2627:11: warning: 'operator==' is deprecated: Use obj1.is(obj2) instead [-Wdeprecated-declarations]
  if (out == Py_None) {
          ^
/dev/shm/rbarnes/tempfs/pytorch/cmake/../third_party/pybind11/include/pybind11/detail/../pytypes.h:276:5: note: 'operator==' has been explicitly marked deprecated here
    PYBIND11_DEPRECATED("Use obj1.is(obj2) instead")
    ^
/dev/shm/rbarnes/tempfs/pytorch/cmake/../third_party/pybind11/include/pybind11/detail/common.h:136:43: note: expanded from macro 'PYBIND11_DEPRECATED'
#    define PYBIND11_DEPRECATED(reason) [[deprecated(reason)]]
                                          ^
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85688
Approved by: https://github.com/albanD, https://github.com/ezyang
2022-09-28 04:53:19 +00:00
24a268143d Directly access has_symbolic_sizes_strides, avoid expensive test (#85754)
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85754
Approved by: https://github.com/albanD
2022-09-28 00:26:11 +00:00
490727a35f New calling convention for Python dispatcher (#85133)
Instead of calling into the Python dispatcher for EVERY dispatcher
call, we now have a two-step process.  First, we
getattr(op: OpOverload, dispatch_key) to "load" the handler for the
function.  This can either be a conventional function (in which
case we will call it, in the same way the old Python dispatcher
worked), or it can be a DispatchKey, in which case we will directly
call that DispatchKey in C++, bypassing marshalling between Python
and C++ entirely.  OpOverload.__getattr__ is carefully written so
that it will cache the handler after the first lookup.

A further optimization would be to define __slots__ on OpOverload,
and to ensure that the DispatchKey strings are interned.

The resulting Python dispatcher is less flexible: after the first
lookup, the handler is cached and we won't recompute it.  Furthermore,
dispatches will not go into Python by default, and so you won't
get stack frames for the Python dispatcher.  But we get
a huge performance improvement: on the following microbenchmark
we go from 2.5s to 1.9s.

```
import time
import torch
from functorch import make_fx

def f(x):
    for i in range(1000):
        x = x * x
    return x

begin = time.time()
res = make_fx(f, tracing_mode="symbolic")(torch.randn(10, 20))
print(time.time()-begin)
```

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85133
Approved by: https://github.com/wconstab
2022-09-16 20:38:21 +00:00
e5fac7f5dc Optimize torch.ops.ns.opname.overload accessor in torch dispatch (#85132)
This doesn't actually seem to help all that much.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85132
Approved by: https://github.com/wconstab
2022-09-16 20:21:03 +00:00
8ca1839d32 Python Dispatcher integration with C++ dispatcher (#85050)
#84826 but without ghstack
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85050
Approved by: https://github.com/malfet
2022-09-15 00:43:36 +00:00