pytorch

mirror of https://github.com/pytorch/pytorch.git synced 2025-10-20 21:14:14 +08:00

Author	SHA1	Message	Date
soulitzer	97342ae04b	Fix python tensor hooks behavior on inplace (#92734 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/92734 Approved by: https://github.com/albanD	2023-01-21 21:32:37 +00:00
soulitzer	1bc60c6b31	[reland] Improve hooks ordering behavior (#92559 ) This reverts commit e525f433e15de1f16966901604a8c4c662828a8a. Original PR: #85849 Fixes #ISSUE_NUMBER In addition to reverting the revert, this PR: - defines the virtual destructor of FunctionPreHook in the header. Why? Presumably the internal build imports the header from somewhere, but does not have function_hooks.cpp (where the virtual destructor was previously defined) in the same compilation unit. Pull Request resolved: https://github.com/pytorch/pytorch/pull/92559 Approved by: https://github.com/albanD	2023-01-19 08:17:32 +00:00
PyTorch MergeBot	e525f433e1	Revert "Improve hooks ordering behavior (#85849 )" This reverts commit 049838f2496bd1d29e4e8292714acb0042cc706e. Reverted https://github.com/pytorch/pytorch/pull/85849 on behalf of https://github.com/albanD due to fails internal build	2023-01-18 15:27:22 +00:00
soulitzer	049838f249	Improve hooks ordering behavior (#85849 ) Addresses: https://github.com/pytorch/pytorch/issues/35802 Design doc: https://docs.google.com/document/d/19xSib7FFknRQ5f3ptGFUmiOt3BrgXSUlTQH2xMcZJYg/edit# ### Changes in this PR #### Implementation - We have now have 3 fields: pre_hooks, retains_grad_hooks, and tensor_pre_hooks so that we can more precisely define their ordering and when they are executed. - Since retains grad uses an entirely new field, we cannot reuse the old retains grad, logic. We refactor retains grad to call directly into the variable.cpp logic. Other logic in variable.cpp that handle cpp hooks must also be updated. #### Hooks ordering and execution: - Defines pre-hooks registered on tensor to run before pre-hooks registered on grad_fn - Updates pre-hooks registered on tensor to always run, even if they are the inputs= to .grad() - Post hooks (and pre hooks) can now observe the modifications to gradient by the tensor pre hook #### Retains grad hooks - retains grad hooks always execute last, even if there are other tensor pre-hooks registered #### Unchanged: - pre_hooks registered to grad_fn aren't expected to execute if they are the inputs= to .grad() Follow ups: - simplify retains_grad field to not be a vector, since it always holds a single hook - potentially merge capture hooks with tensor pre hooks, this would involve some additional refactoring since - python hooks registered to tensor behavior on in-place is still wrong Pull Request resolved: https://github.com/pytorch/pytorch/pull/85849 Approved by: https://github.com/albanD	2023-01-17 16:23:21 +00:00
cyy	9b716a0682	Clean up more clang-tidy supression (#92203 ) 1. remove unused NOLINTNEXTLINE(performance-move-const-arg) 2. add more std::move Pull Request resolved: https://github.com/pytorch/pytorch/pull/92203 Approved by: https://github.com/Skylion007	2023-01-17 05:43:08 +00:00
Kurt Mohler	3a0053abd6	Move `PyObject` code out of `TensorImpl` into new `PyObjectSlot` class (#92169 ) Redo of PR #92099 Part of #91395 Pull Request resolved: https://github.com/pytorch/pytorch/pull/92169 Approved by: https://github.com/albanD	2023-01-14 02:55:32 +00:00
Edward Z. Yang	7078ad5b8c	Reland "AOT Autograd refactor + cleanup, handle intermediate views of bases, use view replay, fix non-tensor input handling" (#92076 ) Original PR: https://github.com/pytorch/pytorch/pull/89532 Pull Request resolved: https://github.com/pytorch/pytorch/pull/92076 Approved by: https://github.com/janeyx99, https://github.com/albanD	2023-01-12 21:32:05 +00:00
PyTorch MergeBot	db466ae057	Revert "[Modes] Add assert that the mode isn't already on the stack (#90770 )" This reverts commit 702838637d63936460ea2bf00b64ffec86ed6687. Reverted https://github.com/pytorch/pytorch/pull/90770 on behalf of https://github.com/DanilBaibak due to Break internal build	2023-01-12 16:44:29 +00:00
samdow	702838637d	[Modes] Add assert that the mode isn't already on the stack (#90770 ) Redo of #89726 on a clean PR, thanks @voznesenskym for the first draft! Pull Request resolved: https://github.com/pytorch/pytorch/pull/90770 Approved by: https://github.com/ezyang	2023-01-11 15:19:43 +00:00
PyTorch MergeBot	b3603f8129	Revert "Deduplicate c10 error and PyTorchError hierarchy (#87855 )" This reverts commit 34f2d3e6ae56744c20c2f859f97101dff291bbbc. Reverted https://github.com/pytorch/pytorch/pull/87855 on behalf of https://github.com/osalpekar due to perf regression in quantization tests	2023-01-06 19:56:35 +00:00
William Phetsinorath	34f2d3e6ae	Deduplicate c10 error and PyTorchError hierarchy (#87855 ) Fixes #53370 Pull Request resolved: https://github.com/pytorch/pytorch/pull/87855 Approved by: https://github.com/albanD	2023-01-02 15:53:36 +00:00
shademe	9f91e94080	Workaround for NumPy builds that ship with a broken Dlpack deleter (#89759 ) NumPy versions 1.22 and 1.23 (and their respective bugfix releases included) have a buggy implementation of the Dlpack deleter that doesn't account for no-GIL contexts. Since we now release the GIL when deallocating tensors in `THPVariable_clear`, this leads to a failure of internal consistency checks when freeing a Dlpack-backed tensor from NumPy. This PR adds a check for the buggy NumPy versions and overrides the `DlManagedTensor` deleter to reacquire the GIL before deallocation. ### Rationale for this implementation The version check was added to `tensor_numpy.h/cpp` as it seemed like a more logical location for it than creating a new translation unit. The overriding of the deleter was originally attempted by directly modifying `at::fromDlpack`, but the lack of a build dependency on the Python C API in A10 prevented that. So, I extended the A10 Dlpack API instead to additionally accept a custom deleter functor. Fixes #88082 Pull Request resolved: https://github.com/pytorch/pytorch/pull/89759 Approved by: https://github.com/albanD	2022-12-28 13:23:29 +00:00
Edward Z. Yang	283cf718ed	Fix _fix_weakref memory leak (#90823 ) Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/90823 Approved by: https://github.com/eellison, https://github.com/albanD	2022-12-15 01:07:29 +00:00
albanD	c79489c8e6	Expose to python the backward AD view_func (#89586 ) This will be useful for other systems (AOTAutograd) that want to replay autograd views. FYI @bdhirsh Pull Request resolved: https://github.com/pytorch/pytorch/pull/89586 Approved by: https://github.com/soulitzer	2022-11-24 03:39:58 +00:00
Kazuaki Ishizaki	e0c194f10b	Fix typos in messages under torch (#88961 ) This PR fixes typos of messages and parms in c++ source and head files under `torch` directory. Pull Request resolved: https://github.com/pytorch/pytorch/pull/88961 Approved by: https://github.com/albanD	2022-11-14 19:06:41 +00:00
Edward Z. Yang	2b166532f7	Remove incorrect assert about hermetic state. (#88885 ) I'm not sure why I thought this assert was valid in the first place, and there's no comment about it. The assert is tantamount to saying, "no tensor objects should become dead via SafePyObject when hermetic mode is on." But suppose we run a Python GC while we're inside hermetic mode. This could result in us disposing non-hermetic tensors, which would hit decref. So the assert seems invalid. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/88885 Approved by: https://github.com/anjali411, https://github.com/malfet	2022-11-12 02:19:45 +00:00
Edward Z. Yang	53eac1d482	Revert "Revert "Put Python Dispatcher cache in dict, clear it on new registrations. (#88329 )"" (#88489 ) The bug was that I was accidentally caching at the wrong key name, so we were never actually hitting the cache. I've renamed the resolved key to final_key to avoid shadowing in this way. This reverts commit 410ce96a23a3496a45478e0b25ffac53aa3c116f. Pull Request resolved: https://github.com/pytorch/pytorch/pull/88489 Approved by: https://github.com/albanD	2022-11-04 19:23:04 +00:00
PyTorch MergeBot	410ce96a23	Revert "Put Python Dispatcher cache in dict, clear it on new registrations. (#88329 )" This reverts commit 86c7cd287caeb23c227d97d283e58bc123294746. Reverted https://github.com/pytorch/pytorch/pull/88329 on behalf of https://github.com/clee2000 due to test_decomp takes an extra 2 hours in some jobs, windows takes so long it times out	2022-11-03 21:57:19 +00:00
Edward Z. Yang	f884e817d4	Make Python op registration work with torchdeploy/multipy (#87162 ) See strategy at PythonOpRegistrationTrampoline.cpp for the big picture. Along the way, I made OperatorHandle support == and hashing, and slightly changed the low level python_dispatch impl API to disallow empty strings for dispatch key, which had the knock on effect of requiring us to explicitly make sure we pass in CompositeImplicitAutograd if we would have passed in "" (I didn't apply this to the rest of the file because I'm lazy.) Test strategy is we delete the logic for preventing Python op registrations in torch from being skipped in a torchdeploy context and show CI still works. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/87162 Approved by: https://github.com/anjali411, https://github.com/bdhirsh	2022-11-03 12:56:44 +00:00
Edward Z. Yang	86c7cd287c	Put Python Dispatcher cache in dict, clear it on new registrations. (#88329 ) The motivation is that I am going to add the ability to temporarily install entries to the python dispatcher, and to do that, I need an easier way to clear the cache. Putting the cache in a dict centralizes cache clearing in one place. I then add some easy cache clearing. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/88329 Approved by: https://github.com/albanD	2022-11-03 12:53:51 +00:00
Aaron Gokaslan	59fe272c1e	Fix: prefer .is_none() over .is(py::none()) for pybind11 (#88051 ) Fixes minor perf regression I saw in #85688 and replaced throughout the code base. `obj == Py_None` is directly equivalent to is_none(). Constructing a temporary py::none() object needlessly incref/decref the refcount of py::none, this method avoids that and therefore is more efficient. Pull Request resolved: https://github.com/pytorch/pytorch/pull/88051 Approved by: https://github.com/albanD	2022-10-31 16:41:27 +00:00
Edward Z. Yang	1ff52225f1	Unify SymIntNode and SymFloatNode into SymNode (#87817 ) This refactor was prompted by challenges handling mixed int/float operations in C++. A previous version of this patch added overloads for each permutation of int/float and was unwieldy https://github.com/pytorch/pytorch/pull/87722/ This PR takes a different approach. The general outline of the patch is to combine the C++ types SymIntNode and SymFloatNode into a single type, SymNode. This is type erased; we no longer know statically at C++ if we have an int/float and have to test it with the is_int()/is_float() virtual methods. This has a number of knock on effects. - We no longer have C++ classes to bind to Python. Instead, we take an entirely new approach to our Python API, where we have a SymInt/SymFloat class defined entirely in Python, which hold a SymNode (which corresponds to the C++ SymNode). However, SymNode is not pybind11-bound; instead, it lives as-is in Python, and is wrapped into C++ SymNode using PythonSymNode when it goes into C++. This implies a userland rename. In principle, it is also possible for the canonical implementation of SymNode to be written in C++, and then bound to Python with pybind11 (we have this code, although it is commented out.) However, I did not implement this as we currently have no C++ implementations of SymNode. Because we do return SymInt/SymFloat from C++ bindings, the C++ binding code needs to know how to find these classes. Currently, this is done just by manually importing torch and getting the attributes. - Because SymInt/SymFloat are easy Python wrappers, __sym_dispatch__ now takes SymInt/SymFloat, rather than SymNode, bringing it in line with how __torch_dispatch__ works. Some miscellaneous improvements: - SymInt now has a constructor that takes SymNode. Note that this constructor is ambiguous if you pass in a subclass of SymNode, so an explicit downcast is necessary. This means toSymFloat/toSymInt are no more. This is a mild optimization as it means rvalue reference works automatically. - We uniformly use the caster for c10::SymInt/SymFloat, rather than going the long way via the SymIntNode/SymFloatNode. - Removed some unnecessary toSymInt/toSymFloat calls in normalize_* functions, pretty sure this doesn't do anything. - guard_int is now a free function, since to guard on an int you cannot assume the method exists. A function can handle both int and SymInt inputs. - We clean up the magic method definition code for SymInt/SymFloat/SymNode. ONLY the user classes (SymInt/SymFloat) get magic methods; SymNode gets plain methods; this is to help avoid confusion between the two types. Signed-off-by: Edward Z. Yang <ezyang@fb.com> cc @jansel @mlazos @soumith @voznesenskym @yanboliang @penguinwu @anijain2305 Pull Request resolved: https://github.com/pytorch/pytorch/pull/87817 Approved by: https://github.com/albanD, https://github.com/anjali411	2022-10-27 20:56:02 +00:00
samdow	169ec120ef	[Modes] refactor modes to only use a stack in cpp (#86458 ) Refactors the mode code to only have the C++ mode stack and not the "C++ mode" like we originally had. This also simplifies the mode logic in a number of places Pull Request resolved: https://github.com/pytorch/pytorch/pull/86458 Approved by: https://github.com/zou3519	2022-10-21 19:18:23 +00:00
Sahan Paliskara	936e93058b	Delete torch::deploy from pytorch core (#85953 ) As we have migrated torch::deploy over to https://github.com/pytorch/multipy, we can now delete it from pytorch core as ongoing development will happen there. This PR was created due to syncing issues with https://github.com/pytorch/pytorch/pull/85443 which is where the review history can be found. Pull Request resolved: https://github.com/pytorch/pytorch/pull/85953 Approved by: https://github.com/seemethere, https://github.com/malfet	2022-10-06 07:20:16 +00:00
Edward Yang	0a75c42f36	Workaround MSVC ICE due to constexpr char* template argument (#86288 ) Test Plan: Lease a Windows sandcastle https://www.internalfb.com/intern/wiki/Windows_Platform_Engineering/Leasable_VM_-_User_Guide/ and run: ``` buck build arvr/mode/win/opt //xplat/caffe2:_C_impl ``` Differential Revision: D40109191 Pull Request resolved: https://github.com/pytorch/pytorch/pull/86288 Approved by: https://github.com/albanD, https://github.com/malfet	2022-10-06 04:11:05 +00:00
Edward Z. Yang	3b6588ab74	Consistent compute numel/contiguous strategy with SymInts (#85858 ) Previously, our handling for contiguity was inconsistent in the following ways: - is_strides_like 2d/3d and is_non_overlapping_and_dense always were computed based on sizes_and_strides_, even if you had symbolic ints - Furthermore, even if you set custom policy for strides, these quantities were not overridable by subclasses - Furthermore, we didn't even store these fields on ExtraMeta - We duplicate implementations of compute_contiguous (plain, channels last, channels last 3d) - We inconsistently called refresh_numel()/refresh_contiguous(), versus recomputing it ourselves This factor makes a consistent strategy for all of the boolean fields, and for numel computation. After this refactor: - All layout boolean fields are interposable via strides policy and can be overridden from Python; you will never access a garbage field - All layout boolean fields are on ExtraMeta - You can always call refresh_numel/contiguous, no matter if your Tensor is contiguous or not - The numel/layout boolean fields are always populated consistently with the sizes strides fields (either on Tensor or ExtraMeta), even if you have custom policy - There is only one implementation of the actual computation logic Signed-off-by: Edward Z. Yang <ezyang@fb.com> Differential Revision: [D39907696](https://our.internmc.facebook.com/intern/diff/D39907696) Pull Request resolved: https://github.com/pytorch/pytorch/pull/85858 Approved by: https://github.com/albanD	2022-09-30 21:26:34 +00:00
Edward Z. Yang	5b476e68af	Slightly beefed up dynamic shapes tests for storage_offset (#85806 ) Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/85806 Approved by: https://github.com/albanD	2022-09-28 19:25:22 +00:00
Richard Barnes	614d6f19e3	Fix Use obj1.is(obj2) warnings (#85688 ) Fixes: ``` # define PYBIND11_DEPRECATED(reason) [[deprecated(reason)]] ^ /dev/shm/rbarnes/tempfs/pytorch/torch/csrc/autograd/python_variable.cpp:2603:11: warning: 'operator==' is deprecated: Use obj1.is(obj2) instead [-Wdeprecated-declarations] if (out == Py_None) { ^ /dev/shm/rbarnes/tempfs/pytorch/cmake/../third_party/pybind11/include/pybind11/detail/../pytypes.h:276:5: note: 'operator==' has been explicitly marked deprecated here PYBIND11_DEPRECATED("Use obj1.is(obj2) instead") ^ /dev/shm/rbarnes/tempfs/pytorch/cmake/../third_party/pybind11/include/pybind11/detail/common.h:136:43: note: expanded from macro 'PYBIND11_DEPRECATED' # define PYBIND11_DEPRECATED(reason) [[deprecated(reason)]] ^ /dev/shm/rbarnes/tempfs/pytorch/torch/csrc/autograd/python_variable.cpp:2627:11: warning: 'operator==' is deprecated: Use obj1.is(obj2) instead [-Wdeprecated-declarations] if (out == Py_None) { ^ /dev/shm/rbarnes/tempfs/pytorch/cmake/../third_party/pybind11/include/pybind11/detail/../pytypes.h:276:5: note: 'operator==' has been explicitly marked deprecated here PYBIND11_DEPRECATED("Use obj1.is(obj2) instead") ^ /dev/shm/rbarnes/tempfs/pytorch/cmake/../third_party/pybind11/include/pybind11/detail/common.h:136:43: note: expanded from macro 'PYBIND11_DEPRECATED' # define PYBIND11_DEPRECATED(reason) [[deprecated(reason)]] ^ ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/85688 Approved by: https://github.com/albanD, https://github.com/ezyang	2022-09-28 04:53:19 +00:00
Edward Z. Yang	24a268143d	Directly access has_symbolic_sizes_strides, avoid expensive test (#85754 ) Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/85754 Approved by: https://github.com/albanD	2022-09-28 00:26:11 +00:00
Edward Z. Yang	490727a35f	New calling convention for Python dispatcher (#85133 ) Instead of calling into the Python dispatcher for EVERY dispatcher call, we now have a two step process. First, we getattr(op: OpOverload, dispatch_key) to "load" the handler for the function. This can either be a conventional function (in which case we will call it, in the same way the old Python dispatcher worked), or it can be a DispatchKey, in which case we will directly call that DispatchKey in C++, bypassing marshalling between Python and C++ entirely. OpOverload.__getattr__ is carefully written so that it will cache the A further optimization would be to define __slots__ on OpOverload, and ensuring that the DispatchKey strings are interned. The resulting Python dispatcher is less flexible: after the first lookup, the handler is cached and we won't recompute it. Furthermore, by default, dispatches will not go into Python, and so you won't get stack frames for the Python dispatcher by default. But we get a huge performance improvement: on the following microbenchmark we go from 2.5s to 1.9s. ``` import time import torch from functorch import make_fx def f(x): for i in range(1000): x = x * x return x begin = time.time() res = make_fx(f, tracing_mode="symbolic")(torch.randn(10, 20)) print(time.time()-begin) ``` Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/85133 Approved by: https://github.com/wconstab	2022-09-16 20:38:21 +00:00
Edward Z. Yang	e5fac7f5dc	Optimize torch.ops.ns.opname.overload accessor in torch dispatch (#85132 ) This doesn't actually seem to help all that much. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/85132 Approved by: https://github.com/wconstab	2022-09-16 20:21:03 +00:00
Michael Voznesensky	8ca1839d32	Python Dispatcher integration with C++ dispatcher (#85050 ) #84826 but without ghstack Pull Request resolved: https://github.com/pytorch/pytorch/pull/85050 Approved by: https://github.com/malfet	2022-09-15 00:43:36 +00:00
PyTorch MergeBot	706b990306	Revert "Python Dispatcher integration with C++ dispatcher (#84826 )" This reverts commit 35f6a69191ef762cf22b6cbfe94b8d9406e16674. Reverted https://github.com/pytorch/pytorch/pull/84826 on behalf of https://github.com/malfet due to Broke dynamo, see `35f6a69191`	2022-09-14 14:07:58 +00:00
Michael Voznesensky	35f6a69191	Python Dispatcher integration with C++ dispatcher (#84826 ) Signed-off-by: Edward Z. Yang <ezyangfb.com> From @ezyang's original PR: There are a number of situations where we have non-backend kernels (e.g., CompositeImplicitAutograd, batching rules) which we would like to port to Python, but we have no way to integrate these ports with the overall system while using preexisting C++ registrations otherwise. This PR changes that by introducing a Python dispatcher (which can have its own kernels directly in Python), which can be interpose over ordinary C++ dispatch. The ingredients: We introduce a new PythonDispatcher dispatch key, that has the same tenor as FuncTorchDynamicLayerFrontMode: it works by getting triggered before every other dispatch key in the dispatch key, and shunting to a Python implementation The Python dispatcher is a per-interpreter global object that is enabled/disabled via the guard EnablePythonDispatcher/DisablePythonDispatcher. We don't make it compositional as I have no idea what a compositional version of this feature would look like. Because it is global, we don't need to memory manage it and so I use a simpler SafePyHandle (newly added) to control access to this pointer from non-Python C++. Like __torch_dispatch__, we use PyInterpreter to get to the Python interpreter to handle the dispatch. I need to reimplement dispatch table computation logic in Python. To do this, I expose a lot more helper functions for doing computations on alias dispatch keys and similar. I also improve the pybind11 handling for DispatchKey so that you can either accept the pybind11 bound enum or a string; this simplifies our binding code. See https://github.com/pybind/pybind11/issues/483#issuecomment-1237418106 for how this works; the technique is generally useful. I need to be able to call backend fallbacks. I do this by permitting you to call at a dispatch key which doesn't have a kernel for the operator; if the kernel doesn't exist, we check the backend fallback table instead. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/84826 Approved by: https://github.com/ezyang	2022-09-14 06:57:19 +00:00
Edward Z. Yang	c5a8946e40	Revert "Revert "Redo how custom/python_custom methods on TensorImpl work (#84796 )" (#84806 ) This reverts commit ca3b2bfbe3945c756a67a784aaa7d9891698c59b. Pull Request resolved: https://github.com/pytorch/pytorch/pull/84806 Approved by: https://github.com/Chillee	2022-09-10 06:17:35 +00:00
Eli Uriegas	ca3b2bfbe3	Revert "Redo how custom/python_custom methods on TensorImpl work (#84796 ) This reverts commit 591b75bf98b92acd4f3d0a1dc934198afeaa6fc1. Manual revert of https://github.com/pytorch/pytorch/pull/84641 Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/84796 Approved by: https://github.com/izaitsevfb	2022-09-10 00:18:13 +00:00
Mateusz Sypniewski	67d6f7160c	Add synchronize hooks (#84427 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/84427 Approved by: https://github.com/ngimel, https://github.com/lw	2022-09-09 13:56:59 +00:00
Edward Z. Yang	591b75bf98	Redo how custom/python_custom methods on TensorImpl work (#84641 ) A longstanding confusion in the implementation of fake tensor and proxy tensor is what to do about torch.ops.aten.sym_sizes and related calls. In particular, when you have a tensor that (1) has symbolic shapes and (2) has a `__torch_dispatch__` call, previously, you would always get `__torch_dispatch__` calls for sizes/strides query, even if you didn't request it via the dispatch kwargs in `make_wrapper_subclass`. The reason for this is because we were previously mixing several concepts: "I want to dispatch to Python", "I want to call a virtual method" and "I have dynamic shapes". A single boolean variable controlled all of these things, and so it was not possible to understand inside TensorImpl what the user had actually originally requested. In this PR, we track each of these concepts individually so that we can preserve user intent. Then, we combine these into a single "policy" variable that controls whether or not we can use the fastpath or not. For the policy to trigger, we only need one of the exceptional cases to be true. Billing of changes: * Rename `set_sizes_strides_policy` to `set_custom_sizes_strides`; in general, you cannot DIRECTLY set policy; you have to indirectly set it by the public functions. * Some helpers for sizes and strides, since it's more complicated (as it is an enum, rather than just bools as is the case for device and layout). `matches_python_custom` is used to test the Python dispatch user ask. `matches_policy` does the policy test (only used in the user facing functions.) * I reorged the accessor methods so that they are more logical. This makes the diff bad, so I recommend reading the final code directly. * The default custom implementations now more reliably call their default() implementations * As bonus refactor, I devirtualized some functions that don't need to be virtual * `set_sym_sizes_and_strides` is renamed to `set_sizes_and_strides` to make it easier to use in template contexts; it optionally takes a storage offset now so you can set all three values at the same time. If you use the SymInt overload but there are no symbolic integers, we give you a normal resize. * This adds `sym_storage_offset` since we had that in the symbolic shapes branch and there's no reason not to put it in (and it reduces merge conflicts) Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/84641 Approved by: https://github.com/wconstab	2022-09-09 13:41:13 +00:00
Edward Z. Yang	93359bf9b3	Convert ConcretePyInterpreterVTable into Meyer singleton (#84657 ) Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/84657 Approved by: https://github.com/wconstab	2022-09-08 01:03:00 +00:00
Edward Z. Yang	f6ce2a442e	Refactor PyInterpreter to use normal vtables (#84388 ) I realized that we can deal with the dead vtable problem by... introducing another indirection! The resulting code is worse (you have to do one more dereference to get to the vtable), but the reduction in boilerplate is, IMO, worth it. I did this refactor because I'm about to add a lot more methods to PyInterpreter to handle expunging SymInt from TensorImpl. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/84388 Approved by: https://github.com/albanD	2022-09-02 00:06:43 +00:00
Nikolay Korovaiko	eda217ab67	Reland symint_numel (#84281 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/84281 Approved by: https://github.com/ezyang	2022-08-30 21:53:34 +00:00
Nikolay Korovaiko	44a975335e	Revert "Re-land sym_numel (#82374 ) (#82726 ) (#82731 ) (#82855 )" (#84207 ) This reverts commit bfebf254dd92f3ed35154597166e7e71fb04f31b. Differential Revision: [D39104562](https://our.internmc.facebook.com/intern/diff/D39104562) Pull Request resolved: https://github.com/pytorch/pytorch/pull/84207 Approved by: https://github.com/robieta	2022-08-30 13:22:58 +00:00
Clive Chan	abcf01196c	Release the GIL when munmap'ing tensors - fixes #77139 (#83623 ) Fixes #77139, where deallocating large tensors with munmap takes a significant amount of time while holding the GIL. This causes the pin_memory thread to interfere with the main thread = performance sadness. Thanks @igozali @zhengwy888 @colesbury as well. Pull Request resolved: https://github.com/pytorch/pytorch/pull/83623 Approved by: https://github.com/albanD	2022-08-18 15:24:18 +00:00
Edward Z. Yang	a3907ca92d	Respect TorchDispatchMode for shallow_copy_and_detach (#83372 ) I noticed I was missing tensor creations with modes when I tried to delete proxy tensor. This was the cause. Hypothetically, all PyInterpreter calls could get this treatment. But I think it only matters for detach; the rest do not return Tensors and most modes will not be interested in them. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/83372 Approved by: https://github.com/zou3519	2022-08-16 14:32:27 +00:00
Brian Hirsh	1665715cb0	add sym_strides() function, use in fake/proxy tensors (#81300 ) Add `TensorImpl::sym_strides`, bind it to python with `torch.ops.aten.sym_strides`, and use it in `ProxyTensor` and `FakeTensor`. Before, `ProxyTensor` was generating `ProxySymInt`'s for the sizes, but not for the strides. Internally we still represent strides with a `SymIntArrayRef` though, so I ran into some weird issues where sizes were showing up as `ProxySymInt`, but strides were `PySymInt`'s. Differential Revision: [D38594558](https://our.internmc.facebook.com/intern/diff/D38594558) Pull Request resolved: https://github.com/pytorch/pytorch/pull/81300 Approved by: https://github.com/ezyang	2022-08-16 14:31:27 +00:00
soulitzer	ccb7d56a18	Rename PyFunctionPreHook to PyFunctionTensorPreHook (#83225 ) Now that there will be two types of Python function prehooks, I prefer have the PyFunction hook taking all grad_outputs and returning all grad_inputs as the more "canonical" one Pull Request resolved: https://github.com/pytorch/pytorch/pull/83225 Approved by: https://github.com/albanD	2022-08-12 22:14:32 +00:00
Nikolay Korovaiko	5b621205f4	Revert "Revert "adding a custom caster for c10::SymInt (#82692 )"" (#83223 ) This should fix the MacOS build errors and reland #82692 Pull Request resolved: https://github.com/pytorch/pytorch/pull/83223 Approved by: https://github.com/albanD	2022-08-12 00:46:50 +00:00
Mateusz Sypniewski	916def84d4	CUDA trace Python hooks (#82824 ) ### Description This adds Python hooks into PyTorch that allow the user to register their own callbacks for events such as tensor allocation, stream allocation, event record / wait etc. Pull Request resolved: https://github.com/pytorch/pytorch/pull/82824 Approved by: https://github.com/lw, https://github.com/ezyang, https://github.com/malfet	2022-08-11 10:21:40 +00:00
PyTorch MergeBot	daeea7d2c3	Revert "adding a custom caster for c10::SymInt (#82692 )" This reverts commit dee63f4f7bb35559fc34539422229d9ab375dfb9. Reverted https://github.com/pytorch/pytorch/pull/82692 on behalf of https://github.com/seemethere due to Broke internal builds, see [logs](https://www.internalfb.com/intern/sandcastle/job/4503600373141339/insights)	2022-08-09 22:17:41 +00:00
Nikolay Korovaiko	dee63f4f7b	adding a custom caster for c10::SymInt (#82692 ) ### Description Adding a custom caster for `c10::SymInt`. This simplifies handling of c10::SymInt on C++/Pytorch boundary. Namely, removing if statements to handle the union nature (e.g. SymIntNode, int) of c10::SymInt. ### Issue <!-- Link to Issue ticket or RFP --> ### Testing <!-- How did you test your change? --> Pull Request resolved: https://github.com/pytorch/pytorch/pull/82692 Approved by: https://github.com/ezyang	2022-08-08 21:40:53 +00:00

1 2 3 4 5 ...

391 Commits