pytorch

mirror of https://github.com/pytorch/pytorch.git synced 2025-10-20 21:14:14 +08:00

Author	SHA1	Message	Date
Xuehai Pan	fc0376e8b1	[BE][2/6] fix typos in test/ (test/test_*.py) (#157636 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/157636 Approved by: https://github.com/yewentao256, https://github.com/mlazos ghstack dependencies: #156311, #156609	2025-07-09 11:02:23 +00:00
Tom Ritchford	d8c8ba2440	Fix unused Python variables in test/[e-z]* (#136964 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/136964 Approved by: https://github.com/justinchuby, https://github.com/albanD	2024-12-18 23:02:30 +00:00
Xuehai Pan	4226ed1585	[BE] Format uncategorized Python files with `ruff format` (#132576 ) Remove patterns ``, `test/`, and `torch/**` in `tools/linter/adapters/pyfmt_linter.py` and run `lintrunner`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/132576 Approved by: https://github.com/ezyang, https://github.com/Skylion007 ghstack dependencies: #132574	2024-08-04 17:13:31 +00:00
YangQun1	589aef4bb0	Fix py codegen to delete values that don't have any users (#131028 ) Fixes #131025 Pull Request resolved: https://github.com/pytorch/pytorch/pull/131028 Approved by: https://github.com/ezyang	2024-08-01 03:18:37 +00:00
PyTorch MergeBot	c3679bed35	Revert "Fix py codegen to delete values that don't have any users (#131028 )" This reverts commit 91aba7baac3d2a079c0b13db25588842260c98cc. Reverted https://github.com/pytorch/pytorch/pull/131028 on behalf of https://github.com/clee2000 due to broke inductor/test_triton_kernels inductor/test_triton_kernels.py::KernelTests::test_triton_kernel_functionalize [GH job link](https://github.com/pytorch/pytorch/actions/runs/10094659640/job/27915271250) [HUD commit link](`91aba7baac`) ([comment](https://github.com/pytorch/pytorch/pull/131028#issuecomment-2251058374))	2024-07-25 17:42:18 +00:00
YangQun1	91aba7baac	Fix py codegen to delete values that don't have any users (#131028 ) Fixes #131025 Pull Request resolved: https://github.com/pytorch/pytorch/pull/131028 Approved by: https://github.com/ezyang	2024-07-25 13:04:23 +00:00
PyTorch MergeBot	8ffd109a00	Revert "Fix py codegen to delete values that don't have any users (#131028 )" This reverts commit 466c167b71e6021f8eadcfbae1d9156a375663ce. Reverted https://github.com/pytorch/pytorch/pull/131028 on behalf of https://github.com/atalman due to breaks CI ([comment](https://github.com/pytorch/pytorch/pull/131028#issuecomment-2247771530))	2024-07-24 12:21:43 +00:00
YangQun1	466c167b71	Fix py codegen to delete values that don't have any users (#131028 ) Fixes #131025 Pull Request resolved: https://github.com/pytorch/pytorch/pull/131028 Approved by: https://github.com/ezyang	2024-07-24 01:03:56 +00:00
hun	518ab48e85	Enable UFMT on test/test_functionalization.py (#123926 ) Part of #123062 Pull Request resolved: https://github.com/pytorch/pytorch/pull/123926 Approved by: https://github.com/ezyang, https://github.com/statelesshz	2024-04-28 17:02:34 +00:00
Joel Schlosser	7956ca16e6	Enable reverse view_funcs by default for python subclasses (#116512 ) Part 3 of implementation for general [subclass view fake-ification](https://docs.google.com/document/d/1C5taWiplmX7nKiURXDOAZG2W5VNJ2iV0fQFq92H0Cxw). Changes codegen to generate `view_func()` / `rev_view_func()` by default for python subclasses. With `view_func()` existing more often now, the lazy view rebase logic [here](`f10c3f4184/torch/csrc/autograd/variable.cpp (L665-L695)`) causes some slight behavior changes for in-place ops on views: * Additional view nodes are inserted into output graphs, changing their string representation, although they are functionally the same. The extra nodes are removed in AOTAutograd's DCE pass. * When `t` is a `FunctionalTensor`, calling `t.grad_fn` will now invoke `view_func()`; we need to make sure we're operating in a `FunctionalTensorMode` so the view op calls succeed. Pull Request resolved: https://github.com/pytorch/pytorch/pull/116512 Approved by: https://github.com/bdhirsh, https://github.com/soulitzer ghstack dependencies: #115894	2024-01-05 16:48:12 +00:00
Tugsbayasgalan Manlaibaatar	76b1d44d57	pre_dispatch aot_export (#115188 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/115188 Approved by: https://github.com/bdhirsh	2023-12-25 04:51:21 +00:00
Joel Schlosser	52f0457d7d	Support view returns for functional inverses on narrowing views (#115893 ) Part 1 of implementation for general [subclass view fake-ification](https://docs.google.com/document/d/1C5taWiplmX7nKiURXDOAZG2W5VNJ2iV0fQFq92H0Cxw). The following functional inverses are currently implemented scatter-style and thus never return views: * `as_strided_copy_inverse()` * `diagonal_copy_inverse()` * `expand_copy_inverse()` * `select_copy_int_inverse()` * `slice_copy_Tensor_inverse()` * `split_copy_Tensor_inverse()` * `split_with_sizes_copy_inverse()` * `unbind_copy_int_inverse()` * `unfold_copy_inverse()` We need to get actual views for the introduction of reverse view funcs coming next. Details: * Use `as_strided()` to implement actual view inverses for the above * Assumes we're given a mutated_view that is actually part of a bigger storage; this isn't really the case for functionalization * Introduce `InverseReturnMode` enum for customization of functional inverses * `AlwaysView` - always return an actual view; needed for reverse view_funcs() * `NeverView` - always do a copy; useful for certain functionalization use cases (e.g. XLA, executorch) * `ViewOrScatterInverse` - return an actual view in most cases, but prefer scatter inverses when they exist. this avoids the need to implement `as_strided()` for subclasses, which can be difficult or impossible * Make sure functionalization works as before * Use `ViewOrScatterInverse` when reapply_views TLS is True or `NeverView` otherwise * Adds tests to ensure old behavior for above inverses in functionalization Pull Request resolved: https://github.com/pytorch/pytorch/pull/115893 Approved by: https://github.com/bdhirsh	2023-12-21 21:39:22 +00:00
PyTorch MergeBot	0567f71ac6	Revert " pre_dispatch aot_export (#115188 )" This reverts commit a267d6735051a4714fa2ac1c163315b650118744. Reverted https://github.com/pytorch/pytorch/pull/115188 on behalf of https://github.com/jeanschmidt due to sadly, it is required to revert this commit in order to revert https://github.com/pytorch/pytorch/pull/115454 ([comment](https://github.com/pytorch/pytorch/pull/115188#issuecomment-1866310014))	2023-12-21 14:03:18 +00:00
Tugsbayasgalan Manlaibaatar	a267d67350	pre_dispatch aot_export (#115188 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/115188 Approved by: https://github.com/bdhirsh	2023-12-20 21:36:25 +00:00
Yukio Siraichi	132cb57e47	Skip aliasing correction for `lift_fresh`. (#112202 ) Fix: #111506 This PR skips aliasing correction on `lift_fresh` calls. Reasoning is: although unlifted and lifted tensors are technically aliases, they are from different levels of abstraction (`FunctionalTensorWrapper` and `XLATensor`). Pull Request resolved: https://github.com/pytorch/pytorch/pull/112202 Approved by: https://github.com/bdhirsh	2023-11-03 20:46:30 +00:00
Peter Bell	bbd5b935e4	Use `pytree.tree_leaves` everywhere (#112324 ) This changes all the instances I could find of `tree_flatten(...)[0]` or `x, _ = tree_flatten` to use `tree_leaves`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/112324 Approved by: https://github.com/lezcano ghstack dependencies: #112327, #112323	2023-10-30 03:39:04 +00:00
Brian Hirsh	63526a63f5	Make FunctionalTensor subclass to be more like functorch (interaction with ZeroTensor + Conjugate key) (#109023 ) I added some tests for Conj, Neg and ZeroTensor for both python and C++ functionalization. This also fixes a nasty segfault when running a functorch `jacfwd` test with `torch.compile`, once AOTAutograd is using `FunctionalTensor`. Changes: (1) I use Jeffrey's `make_wrapper_subclass(extra_dispatch_keys)` kwarg to plumb extra dispatch keys onto the wrapper, mirroring what C++ functionalization does (C++ functionalization will mirror all dispatch keys from the inner tensor to the wrapper, except for python and functorch keys). (2) FunctionalTensorMode will decompose CompositeImplicitAutograd ops, since (for example) ZeroTensor kernels can send ops like `.to()` directly to the Python key. We'll need a way to toggle this later for pre-dispatch functionalization (3) Bound `_ForceDispatchKeyGuard` and BatchedTensorImpl's dispatch keyset to python Pull Request resolved: https://github.com/pytorch/pytorch/pull/109023 Approved by: https://github.com/zou3519 ghstack dependencies: #108654, #109662, #109632	2023-09-22 07:09:04 +00:00
Brian Hirsh	f22b303f65	Add TorchDispatch version of functionalization (#106404 ) This PR adds a new `FunctionalTensor` subclass, and `FunctionalTensorMode` torch dispatch mode. Together, this class/mode are a lightweight wrapper around our existing C++ functionalization logic. This idea came from Ed - later in the stack, I want to be able to run functionalization underneath torch_dispatch, when performing tracing in AOTAutograd. I can't do this easily with vanilla C++ functionalization, because it has a dedicated dispatch key that always runs before TorchDispatch. However, by adding a torch_dispatch mode shim around functionalization, we can use functionalization as a torch_dispatch mode, which will make it easier to run underneath other modes later. This PR provides the basic new classes, and some light testing. Pull Request resolved: https://github.com/pytorch/pytorch/pull/106404 Approved by: https://github.com/ezyang	2023-09-15 20:19:25 +00:00
Brian Hirsh	da54f3c519	reorder proxy / fake modes so they always run last (#104482 ) Update: Made refactor of the original PR. See the original description below, but here I'll describe the updates: (1) TLS changes in `TorchDispatchModeTLS.h/cpp`. I added a `TorchDispatchModeKey` enum, that (for now) just contains PROXY and FAKE. The ModeTLS used to just contain a `std::vector<std::shared_ptr<c10::SafePyObject>>` corresponding to the mode stack. It now also contains a separate array of "infra modes", indexed by mode key (PROXY and FAKE, with a new addition, FUNCTIONAL, coming later in the stack). `TorchDispatchModeTLS::push_onto_stack` and `TorchDispatchModeTLS::pop_stack` are now a bit more complicated. Pushing accepts an optional mode_key, which if set, tells us to add the given mode directly to our "infra_modes" array. Popping will first check the "user mode" stack, before trying to pop anything from the infra mode stack. It also optionally returns the mode key of the mode we popped if there was one - that way if we push that same mode back onto the TLS later, we know where it goes. `TorchDispatchModeTLS::dispatch_mode_enabled()` now accepts an optional `skip_infra_modes` param, so you can separately query if there are "any modes at all", or if there are "any user modes". `TorchDispatchModeTLS::get/set/unset_mode()` all take in a mode key, and get/set/unset the mode at that particular mode key (meaning they are only meant to be used for infra modes). There were also some mild codegen changes to support the new enum (2) `fake_tensor.py/proxy_tensor.py/_python_dispatch.py` The way I tell the infra that certain subclasses/modes are "infra" is through the enum: I gave `FakeTensor` and `FakeTensorMode` a `self._mode_key = torch._C.TorchDispatchModeKey.FAKE`. 
`TorchDispatchMode.__enter/exit__()` (in `_python_dispatch.py`) now check if the current mode has a mode key, and if so they plumb it into any `push_onto_stack()` calls (which eventually instructs `TorchDispatchModeTLS` where to put the mode). Same thing for `ProxyTorchDispatchMode`. I also had to change both of these modes' enter/exit, to handle the fact that there can no longer be multiple proxy/fake modes on the mode stack at once. I updated them both to have a `self.enter_stack: List[Optional[TorchDispatchMode]]` - whenever we push a given mode in `__enter__`, we remove the current ambient fake/proxy mode from the mode stack, and save it in `enter_stack`, so that on exit we can reset the state properly. (3) dispatching logic in `python_arg_parser.cpp` This is where the core dispatching logic changes are. I added two helpers, `dispatch_on_subclass()` and `dispatch_on_mode()`. The overall dispatching order is now: ``` (a) dispatch_on_mode() # try user modes first (where the mode stack automatically considers infra modes last) (b) dispatch_on_subclass() # try user subclasses next (skipping infra subclasses) (c) dispatch_on_subclass() # try infra subclasses next (skipping user subclasses) ``` Note that we still want "user subclasses" to run before "infra modes". As Ed helped me realize, this will work today: If proxy/fake modes run in step (a), they'll return NotImplemented if they see a user subclass, allowing us to redispatch to the user subclass. How do (b) and (c) distinguish between user and infra subclasses? Infra subclasses (FakeTensor, and later FunctionalTensor) are required to have a `_mode_key` hidden on the subclass - so we filter via arguments that do/don't have the _mode_key. (4) I also changed `DoubleTensor` to `TwoTensor` to minimize confusion (@albanD pointed out that DoubleTensor would be easily confused with `torch.FloatTensor` and friends). 
----- original description below ----- The main purpose of this PR is to fix the "ordering problem" between torch_dispatch modes, where we want to ensure that our Fake and Proxy dispatch modes always run after any dispatch modes created by the user, regardless of where they are in the stack. See this doc for more details: https://docs.google.com/document/d/1COQ291nOZvtFnzGTQMJqoYZ3sttEYFw_7HbfSyL8gcA/edit Full set of changes below. I ended up including a few semi-related changes in this PR that I documented - but if folks would rather I separate them out, happy to try to do that. (1) Add dedicated TLS slots for FakeTensorMode and ProxyTensorMode This is the main component of this PR. There are two new slots, `TorchDispatchModeTLS.fake_mode_` and `TorchDispatchModeTLS.proxy_mode_`, which correspond to a single "global" fake and proxy mode. There is now an invariant that `torchDispatchModeState.stack_` can never contain either of these modes. I also added a `TorchDispatchModeTLS::maybe_highest_mode()` helper that consults the `stack_` as well as both the proxy and fake slots, and returns the highest priority mode - this is because there are a few places in the codebase where we legitimately want to get the highest priority mode, including fake or proxy, if one is set. This also made the implementations of the existing `disable_proxy_modes_tracing()` and `get_innermost_proxy_mode()` marginally simpler. (2) Updated the dispatching logic in handle_torch_function_no_python_arg_parser() This is the function that actually figures out which torch_dispatch implementation to call, given the current mode stack and tensor subclass inputs. This function got marginally more complicated as part of the refactor: First we inspect the mode stack and any non-fake subclass inputs. Then we check for the proxy mode slot. Then we check for the Fake mode slot, before finally checking for any fake subclass inputs. 
(3) new python `_get_fake_tensor_mode()` and `_get_proxy_tensor_mode()` API's Before, if you wanted to see if proxy or fake modes were active in python, you would have to consult the mode stack. Since these two modes are no longer part of the actual mode stack, I added two new API's to directly check if either proxy or fake modes are active. (4) Allow traceable tensor subclasses to access storages from python This is convenient later in the stack, where AOTAutograd needs to detect aliasing of inputs and outputs, where those inputs and outputs might be tensor subclasses. Previously, `x.untyped_storage()` would raise an error if `x` was a subclass. In this PR, I tried to relax this constraint as little as possible: `THPVariable_storage()` will only try to return a storage to python if the tensor subclass that you are passing in is "traceable" (5) Fixed subclass fakeification @wanchaol recently added support to be able to fakeify tensor subclasses. That fakeification logic works in most cases, but there is one case it doesn't handle: autograd metadata. In particular, since autograd sees our tensor subclasses and not their desugared tensors, we need to make sure that our fakeified subclass has the same autograd metadata as the original subclass. I updated `meta_utils.py` to make sure that the autograd metadata is correct. (6) make tensor subclasses resizeable Previously we didn't allow tensor subclasses to be resizeable. I ran into an issue where fakeifying a tensor subclass occasionally requires swapping out its storage, which can involve resizing the tensor. Mechanically, this required updating `at::for_blob()` to expose a way to request that the tensor that you create has resizeable storage, and then using this new API in `_make_wrapper_tensor()`. (7) Added a basic DoubleTensor subclass for testing I use this subclass more later in this stack in my AOTAutograd tests - but it serves as a simple subclass example to test the dispatch ordering in this PR. 
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104482 Approved by: https://github.com/ezyang ghstack dependencies: #107415	2023-08-29 02:36:48 +00:00
soulitzer	3912b722f3	Upgrade LoggingTensor mode and add traceback collection (#103734 ) Parts borrowed from: https://github.com/albanD/subclass_zoo/blob/main/logging_mode.py Pull Request resolved: https://github.com/pytorch/pytorch/pull/103734 Approved by: https://github.com/albanD	2023-06-21 22:04:30 +00:00
Brian Hirsh	bb4e9e9124	functionalization: error during mutations on mem overlap (#99919 ) Fixes https://github.com/pytorch/pytorch/issues/98143. If a user mutates a tensor that has overlapping memory, this can cause silent correctness issues with torch.compile. This PR adds a few checks to detect that situation and error. Unfortunately `at::has_internal_overlap()` wasn't smart enough to detect the one linked in the issue, so I added a (simple) check that only runs in functionalization, that can catch the overlapping memory. We might need to revisit and add more complex checks later though (luckily, functionalization runs during compilation time so we can afford more expensive checks). Pull Request resolved: https://github.com/pytorch/pytorch/pull/99919 Approved by: https://github.com/ezyang, https://github.com/albanD	2023-04-26 16:39:40 +00:00
Brian Hirsh	35c9ea89fa	dont bake in defaults when tracing *_like factories (#97564 ) quick fix for https://github.com/pytorch/pytorch/issues/97541. letting CI run to see if there's any fallout Pull Request resolved: https://github.com/pytorch/pytorch/pull/97564 Approved by: https://github.com/ezyang	2023-03-27 22:53:44 +00:00
Brian Hirsh	68600fc7c6	avoid extra copies in batchnorm inference by introducing a new op, _native_batch_norm_legit_no_training (#94946 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/94946 Approved by: https://github.com/ezyang	2023-02-16 11:41:20 +00:00
Brian Hirsh	abfd293c39	functionalization: fix x.is_contiguous(channels_last) (#94195 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/94195 Approved by: https://github.com/ezyang	2023-02-11 21:07:08 +00:00
Brian Hirsh	aba4fb9a16	fix functionalization resize stride compute (#94018 ) uncovered from an OpInfo in inductor, when I turned on functionalization Pull Request resolved: https://github.com/pytorch/pytorch/pull/94018 Approved by: https://github.com/ezyang	2023-02-11 21:07:08 +00:00
Sherlock Huang	36fe31f537	[Reland] Refactor stack_trace preservation for node meta preservation (#90803 ) (#92400 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/90803 Approved by: https://github.com/jerryzh168, https://github.com/albanD ghstack-source-id: 5848cca08ef5d6f8868f4f79d8bc29711e9a52c2 Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/92400 Approved by: https://github.com/jerryzh168	2023-01-30 23:30:43 +00:00
PyTorch MergeBot	498be7ed25	Revert "Refactor stack_trace preservation for node meta preservation (#90803 )" This reverts commit 0f1302eeaed3b10ab6db493c1c33797a6ec46866. Reverted https://github.com/pytorch/pytorch/pull/90803 on behalf of https://github.com/DanilBaibak due to Break internal build	2023-01-10 10:44:28 +00:00
Sherlock Huang	0f1302eeae	Refactor stack_trace preservation for node meta preservation (#90803 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/90803 Approved by: https://github.com/jerryzh168, https://github.com/albanD	2023-01-09 23:23:27 +00:00
Brian Hirsh	c47bdd7522	_scatter ops should preserve input stride/storage_offset (#91029 ) It turns out that we do need to update *_scatter ops to return the exact same strides as their inputs. I added a test to `test/test_functionalization.py`, which now trips thanks to Ed's functionalization stride debugging check. It only actually ends up tripping silent correctness if you try to .backward() on that function. Pull Request resolved: https://github.com/pytorch/pytorch/pull/91029 Approved by: https://github.com/ezyang	2022-12-22 19:41:53 +00:00
Brian Hirsh	d6efd25d1e	functionalization: check for undefined tensors in advanced indexing (#90791 ) It looks like running code like `a[:, tensor_idx] = b` can result in: (1) calling `index_put_()` (2) passing (potentially undefined) tensors as the indices to index_put_(). Pull Request resolved: https://github.com/pytorch/pytorch/pull/90791 Approved by: https://github.com/ezyang	2022-12-19 16:11:06 +00:00
Brian Hirsh	440a3f2398	fix set_() with functionalization (#90722 ) This should fix https://github.com/pytorch/pytorch/issues/90573 Pull Request resolved: https://github.com/pytorch/pytorch/pull/90722 Approved by: https://github.com/ezyang	2022-12-19 16:11:06 +00:00
Richard Zou	4068c5467d	[Reland] Move functorch/_src to torch/_functorch (#88756 ) (#90091 ) This will be the last disruptive functorch internals change. Why are we moving these files? - As a part of rationalizing functorch we are moving the code in functorch/_src to torch/_functorch - This is so that we can offer the functorch APIs as native PyTorch APIs (coming soon) and resolve some internal build issues. Why are we moving all of these files at once? - It's better to break developers all at once rather than many times Test Plan: - wait for tests Pull Request resolved: https://github.com/pytorch/pytorch/pull/90091 Approved by: https://github.com/anijain2305, https://github.com/ezyang	2022-12-03 14:17:15 +00:00
Jane Xu	76e869c911	[BE] Beef up test_functionalization to test functionalizing multi-parameter functions (#89798 ) Previously, `assert_functionalization` only took in uni-Tensor-parameter functions. This PR beefs up the check to allow for functions that take multiple parameters. This PR also changes the test_instance_norm test to check that the multiparam change works. ## Test plan Locally tested, CI should also pass. Pull Request resolved: https://github.com/pytorch/pytorch/pull/89798 Approved by: https://github.com/samdow	2022-11-30 20:46:16 +00:00
Jane Xu	fcb5d6e771	Enable instance norm running mean test (#89793 ) Followup action to https://github.com/pytorch/pytorch/pull/88697 Pull Request resolved: https://github.com/pytorch/pytorch/pull/89793 Approved by: https://github.com/bdhirsh	2022-11-29 23:45:56 +00:00
PyTorch MergeBot	218d9c6e09	Revert "Move functorch/_src to torch/_functorch (#88756 )" This reverts commit 52bc5c1cfe098fd4b4b13902b4fea83b455b9773. Reverted https://github.com/pytorch/pytorch/pull/88756 on behalf of https://github.com/clee2000 due to broke imports in tests `52bc5c1cfe` https://github.com/pytorch/pytorch/actions/runs/3574742513/jobs/6010814968 probably a landrace	2022-11-29 17:17:11 +00:00
Richard Zou	52bc5c1cfe	Move functorch/_src to torch/_functorch (#88756 ) This will be the last disruptive functorch internals change. Why are we moving these files? - As a part of rationalizing functorch we are moving the code in functorch/_src to torch/_functorch - This is so that we can offer the functorch APIs as native PyTorch APIs (coming soon) and resolve some internal build issues. Why are we moving all of these files at once? - It's better to break developers all at once rather than many times Test Plan: - wait for tests Pull Request resolved: https://github.com/pytorch/pytorch/pull/88756 Approved by: https://github.com/ezyang	2022-11-29 13:55:42 +00:00
Jane Xu	8695f0cced	Rectify `native_batch_norm` schema by splitting it into two legit schemas (#88697 ) Using the same repro from the issue (but with BatchNorm2D) Rectifies native_batch_norm schema by splitting the schema into 2: 1. one will have NON-optional alias-able running_mean and running_var inputs 2. the other will just not have those parameters at all (no_stats variation) Calling for name suggestions! ## test plan I've added tests in test_functionalization.py as well as an entry in common_method_invocations.py for `native_batch_norm_legit` CI should pass. ## next steps Because of bc/fc reasons, we reroute native_batch_norm to call our new schemas ONLY through the python dispatcher, but in 2 weeks or so, we should make `native_batch_norm_legit` the official batch_norm. Pull Request resolved: https://github.com/pytorch/pytorch/pull/88697 Approved by: https://github.com/albanD	2022-11-23 23:23:17 +00:00
Edward Z. Yang	5266953443	Add crossref debug mode for functionalization, catches stride errors (#89498 ) The idea is to add a custom handler to Functionalize key in Python dispatcher that runs the functionalized version along side a non functionalized version, and checks that their outputs agree in the end. (Technically, for metadata mutation we should also check the inputs, but for now we're relying on those functions returning self.) I turned this on for test_functionalize.py (new TestCrossRefFunctionalize) and found a bunch of failures that look legit. This probably doesn't interact that nicely if you're also tracing at the same time, probably need more special logic for that (directly, just disabling tracing for when we create the nested fake tensor mode, but IDK if there's a more principled way to organize this.) There are some misc fixups which I can split if people really want. - xfail_inherited_tests moved to test common_utils - Bindings for _dispatch_tls_set_dispatch_key_included, _dispatch_tls_is_dispatch_key_included and _functionalization_reapply_views_tls - Type stubs for _enable_functionalization, _disable_functionalization - all_known_overloads utility to let you iterate over all OpOverloads in all namespaces. Iterator support on all torch._ops objects to let you iterate over their members. - suspend_functionalization lets you temporarily disable functionalization mode in a context - check_metadata_matches for easily comparing outputs of functions and see if they match (TODO: there are a few copies of this logic, consolidate!) - _fmt for easily printing the metadata of a tensor without its data - _uncache_dispatch for removing a particular dispatch key from the cache, so that we force it to regenerate - check_significant_strides new kwarg only_cuda to let you also do stride test even when inputs are not CUDA - Functionalize in torch._C.DispatchKey Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/89498 Approved by: https://github.com/malfet	2022-11-23 04:18:25 +00:00
Edward Z. Yang	d9cbe7764e	Make aten.copy preserve strides (hf_Longformer) (#89464 ) Fixes https://github.com/pytorch/torchdynamo/issues/1888 Signed-off-by: Edward Z. Yang <ezyang@fb.com> Differential Revision: [D41460986](https://our.internmc.facebook.com/intern/diff/D41460986) Pull Request resolved: https://github.com/pytorch/pytorch/pull/89464 Approved by: https://github.com/bdhirsh	2022-11-22 13:06:43 +00:00
Brian Hirsh	ec4eadac5b	reland "Do not use unsafe restriding for subclasses (#87610 )" (#88343 ) This reverts commit 5b75b19f51837e162cc0e5e5757dfd9bef437c67. Pull Request resolved: https://github.com/pytorch/pytorch/pull/88343 Approved by: https://github.com/ezyang	2022-11-14 13:42:51 +00:00
Edward Z. Yang	0e3031f7e7	Functionalize and compute joint simultaneously. (#88063 ) This also comes with some bug fixes that were uncovered from doing this: - Forward device calls to inner tensor in FunctionalTensorWrapper - Make legacyExtractDispatchKey exclude Functionalize, so that it can get at the real device type key. This is noncontroversial. - Stop stripping dense from key set. The reason for this is FunctionalWrapperTensor may be used in contexts where people query if it is dense or not. If it doesn't report this correctly (from the dispatch key), it will cause errors. This caused some torchbench models to fail when I did one-pass tracing. - Save and restore reapply views TLS correctly Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/88063 Approved by: https://github.com/bdhirsh	2022-11-05 03:52:40 +00:00
PyTorch MergeBot	5b75b19f51	Revert "Do not use unsafe restriding for subclasses (#87610 )" This reverts commit 73379acaf3865379aed0a1bab1320616772152f3. Reverted https://github.com/pytorch/pytorch/pull/87610 on behalf of https://github.com/mehtanirav due to [Internal breakages](https://www.internalfb.com/intern/sandcastle/job/36028797828925790/insights)	2022-11-02 16:59:02 +00:00
Edward Z. Yang	bb7e6254e4	Add ability to freeze storages inside functionalization (#88141 ) Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/88141 Approved by: https://github.com/albanD, https://github.com/bdhirsh	2022-11-01 16:00:33 +00:00
Brian Hirsh	73379acaf3	Do not use unsafe restriding for subclasses (#87610 ) This helps convert some accuracy errors into runtime errors, which makes it easier to debug. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/87610 Approved by: https://github.com/albanD	2022-10-31 20:49:15 +00:00
Brian Hirsh	23ff47ccc5	functionalization: fix detach() (#87750 ) `.detach()` worked in basic cases previously, but didn't properly preserve view relationships between the base and the output. This wasn't heavily tested, because autograd doesn't normally encounter `FunctionalTensorWrapper` directly, but could become more common if we fuse functionalization and autograd into a single tracing pass. This will also be a bug fix for LTC (and XLA when they use functionalization) Pull Request resolved: https://github.com/pytorch/pytorch/pull/87750 Approved by: https://github.com/ezyang	2022-10-27 15:47:56 +00:00
Brian Hirsh	9ad1659b17	functionalization: make view_copy outputs always contiguous (#85747 ) This fixes an issue with mobile: The output of view_copy ops should always be contiguous. Later, we can consider adding optional arguments to the `view_copy()` functions to let you explicitly say what the contiguity of the output can be (e.g. channels_last) Pull Request resolved: https://github.com/pytorch/pytorch/pull/85747 Approved by: https://github.com/ezyang	2022-10-21 17:42:02 +00:00
Horace He	a27a4a02fe	Refactored proxytensor to clean up separate branches (#84325 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/84325 Approved by: https://github.com/ezyang	2022-08-31 09:37:55 +00:00
Brian Hirsh	e9e7363854	reinplacing pass fixes for torchbench + huggingface (#83626 ) I'm testing out turning on re-inplacing + functionalization by default with the AOTAutograd + eager backend on torchbench + huggingface models. This PR contains a few bug fixes from turning re-inplacing on: (1) Handle more gracefully when FakeTensorMode is already turned on when you call reinplace (2) More robust detection for when an inplace variant of an op exists (the dumb bug was that `pow.Scalar` doesn't have an inplace variant, even though there are several overloads of `pow_`. None of them are eligible though) (3) Avoid re-inplacing when it would require resizing the input buffer. This isn't allowed, because inplace ops aren't allowed to resize their inputs. For the last one, I gave the two main examples in more detail in the comments. Important cases are: ``` # This should not be re-inplaced at all; the op broadcasts, so this would require resizing the self tensor torch.add(tensor[1, 4], tensor[4, 4]) # This should not be re-inplaced, because the inplace and out-of-place variants of the op return different dtypes torch.ge(a, b) # However, this means that today when functionalization functionalizes a `torch.ge_(a, b)` call, reinplacing won't properly de-functionalize it. I mentioned that optimization is worth adding later in the comments ``` (4) There's some logic around keeping `storage_to_nodes` up to date when we see a view op: if we re-inplace `out = a.add(...)`, and later in the program we encounter a "later_node",`out.view(..)`, and need to replace it with `a.view(...)`, then we need to update some metadata structures. I had to fix that logic: specifically, if "later_node" isn't a dispatcher op, (e.g. if it's an FX output node), I wasn't properly handling the case where the node's fake_meta info was not a tensor. Pull Request resolved: https://github.com/pytorch/pytorch/pull/83626 Approved by: https://github.com/ezyang	2022-08-19 23:30:45 +00:00
Edward Z. Yang	988bd0173c	Add OpOverload.decompose API (#83075 ) This allows you to directly call into the CompositeImplicitAutograd implementation of an operator, without changing any aspects of the dispatcher state. In particular, you can use this to recursively call into a decomposition, dispatching back to your tensor subclass/mode as desired. Hypothetically, we should also make these available in the decompositions dictionary, but I'm leaving this as future work as enumerating these decompositions is annoying (as operators are lazily registered.) Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/83075 Approved by: https://github.com/albanD	2022-08-09 18:53:19 +00:00
Peter Bell	4f255dbfb3	Remove manual bindings for arange (#81380 ) The functional variant of one of the `arange` overloads has a schema mismatch with the out variant. The functional one has `Scalar step`, but the corresponding out variant has `Scalar step=1`. This isn't allowed, so it had to be special-cased in the python codegen and manually bound. This adds the default `step` value to the functional overload and removes the special-casing. Pull Request resolved: https://github.com/pytorch/pytorch/pull/81380 Approved by: https://github.com/ngimel	2022-08-07 00:10:27 +00:00
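
The push/pop behaviour that the #104482 entry above describes for `TorchDispatchModeTLS` (a user mode stack plus separate "infra mode" slots indexed by mode key) can be illustrated with a small pure-Python sketch. This is not PyTorch code: the `ModeKey` and `ModeTLS` classes are hypothetical stand-ins, the order in which infra slots are popped is an assumption, and only the rules stated in the commit message are modelled (user modes are popped before infra modes, popping reports the mode key if there was one, and `skip_infra_modes` restricts the query to user modes).

```python
from enum import Enum
from typing import Optional, Tuple


class ModeKey(Enum):
    # Hypothetical stand-in for the mode-key enum; only the two keys
    # named in the commit message are listed, in an arbitrary order.
    PROXY = 0
    FAKE = 1


class ModeTLS:
    """Toy model: a user mode stack plus one slot per infra mode key."""

    def __init__(self) -> None:
        self.user_stack: list = []
        self.infra_modes: dict = {key: None for key in ModeKey}

    def push_onto_stack(self, mode, mode_key: Optional[ModeKey] = None) -> None:
        # Without a key the mode is a user mode and goes on the stack.
        if mode_key is None:
            self.user_stack.append(mode)
            return
        # With a key the mode goes directly into its infra slot.
        if self.infra_modes[mode_key] is not None:
            raise RuntimeError(f"infra mode already set for {mode_key.name}")
        self.infra_modes[mode_key] = mode

    def pop_stack(self) -> Tuple[object, Optional[ModeKey]]:
        # User modes are popped first; infra modes only once the stack is empty.
        if self.user_stack:
            return self.user_stack.pop(), None
        for key in ModeKey:  # assumption: slot order is illustrative only
            mode = self.infra_modes[key]
            if mode is not None:
                self.infra_modes[key] = None
                return mode, key  # report the key so the mode can be re-pushed
        raise RuntimeError("no modes to pop")

    def dispatch_mode_enabled(self, skip_infra_modes: bool = False) -> bool:
        # "Any user modes" vs. "any modes at all".
        if self.user_stack:
            return True
        if skip_infra_modes:
            return False
        return any(mode is not None for mode in self.infra_modes.values())
```

Because `pop_stack()` returns the key alongside the mode, a caller that temporarily pops an infra mode can pass the same key back to `push_onto_stack()` and the mode returns to its slot rather than landing on the user stack.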


106 Commits