pytorch

mirror of https://github.com/pytorch/pytorch.git synced 2025-10-20 21:14:14 +08:00

Author	SHA1	Message	Date
Shangdi Yu	cfab04d01b	Fix aten.div type promotion for FakeTensor (#150874 ) Summary: When we divide a FakeTensor by an integer using the fast op implementation, the type promotion should be `ELEMENTWISE_TYPE_PROMOTION_KIND.INT_TO_FLOAT` so we get a float when dividing an int FakeTensor by an integer.
```
FAST = get_fast_op_impls()
fast_div = FAST[torch.ops.aten.div.Tensor]
fast_div(fake_tensor, some_int)
```
Test Plan:
```
python test/test_fake_tensor.py -k test_fast_div
```
Differential Revision: D72667430 Pull Request resolved: https://github.com/pytorch/pytorch/pull/150874 Approved by: https://github.com/angelayi	2025-04-09 18:52:01 +00:00
Animesh Jain	999fa15ba8	[invoke_subgraph][fake tensor cache] Add a finalizer for id hashed objects (#149667 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/149667 Approved by: https://github.com/zou3519 ghstack dependencies: #149087	2025-03-27 00:01:39 +00:00
Animesh Jain	a7596b4b34	[invoke_subgraph] Fake tensor prop caching (#149087 ) Redoing https://github.com/pytorch/pytorch/pull/137808 Pull Request resolved: https://github.com/pytorch/pytorch/pull/149087 Approved by: https://github.com/zou3519	2025-03-27 00:01:39 +00:00
Mikayla Gawarecki	209977e6e5	Add information about checkpoint offset to untyped storages when torch.load under FakeTensorMode (#147787 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/147787 Approved by: https://github.com/albanD ghstack dependencies: #147786	2025-03-06 12:04:39 +00:00
Mikayla Gawarecki	bdcc1b579b	Allow torch.load under FakeTensorMode to load FakeTensors with correct devices (for plain Tensors) (#147786 ) This only fixes _rebuild_tensor_v2 and _rebuild_tensor_v3 Pull Request resolved: https://github.com/pytorch/pytorch/pull/147786 Approved by: https://github.com/albanD	2025-03-06 12:04:32 +00:00
Brian Hirsh	5cda021cac	support meta_tensor.to(device='cpu') under fake_mode (#146729 ) Fixing this is actually a bit annoying: (1) FakeTensorMode sees a function where all of its inputs are real tensors, so it tries to run the real compute before converting the output to a FakeTensor (2) we don't actually want this, because the "real compute" is supposed to error normally, when you do `meta_tensor.to(device='cpu')`. Instead, we want FakeTensor to actually skip constant prop and run the normal FakeTensor implementation, which will not error Pull Request resolved: https://github.com/pytorch/pytorch/pull/146729 Approved by: https://github.com/zou3519, https://github.com/SherlockNoMad, https://github.com/albanD ghstack dependencies: #146642	2025-02-12 20:57:10 +00:00
Brian Hirsh	ec0b318ddb	[poc] force UntypedStorage.from_buffer(buf) to return meta storage under FakeTensorMode (#146642 ) context here: https://fb.workplace.com/groups/326136610199609/permalink/495389539940981/ This PR is an attempt to make it such that if you create a tensor from an external buffer (using `UntypedStorage.from_buffer(buf)`), we can generate a proper fake tensor for you out of the box. The annoying bit is that there are not any dispatcher ops to interpose on and change behavior. So instead, I took the manual C binding and tweaked the storage device to be "meta" if we see an active fake mode. Put "poc" in the title since I... think this is hopefully reasonable, but I can be convinced that it's not :)
```
from torch._subclasses.fake_tensor import FakeTensorMode
import pickle
import io
import torch
from contextlib import nullcontext

use_fake_tensor = True
with FakeTensorMode() if use_fake_tensor else nullcontext():
    obj = [1, 2]
    f = io.BytesIO()
    pickle.Pickler(f).dump(obj)
    byte_storage = torch.ByteStorage._from_buffer(f.getvalue())  # type: ignore[attr-defined]
    t = torch.ByteTensor(byte_storage)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/146642 Approved by: https://github.com/zou3519	2025-02-12 20:57:10 +00:00
Edward Z. Yang	90448f0128	Output of nonzero is transposed, fix fake tensor (#144695 ) Needs this companion executorch PR: https://github.com/pytorch/executorch/pull/7657 Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/144695 Approved by: https://github.com/bobrenjc93, https://github.com/albanD	2025-01-26 01:07:22 +00:00
Edward Z. Yang	bc62930765	Work around buggy use_const_ref_for_mutable_tensors (#145530 ) See https://github.com/pytorch/pytorch/issues/145522 for context This doesn't fix the problem with use_const_ref_for_mutable_tensors and the boxed wrapper, instead it just gets all of our out kernels off of this flag so that the mutable matching pattern works correctly. I also add a check in torchgen to prevent people from making this mistake in the future. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/145530 Approved by: https://github.com/albanD, https://github.com/bdhirsh	2025-01-24 14:38:49 +00:00
PyTorch MergeBot	f0a210bf5d	Revert "Output of nonzero is transposed, fix fake tensor (#144695 )" This reverts commit 693d8c7e945cc494bd31ad694ae4f4b6ff13b82a. Reverted https://github.com/pytorch/pytorch/pull/144695 on behalf of https://github.com/izaitsevfb due to breaking internal tests, see D68461259 ([comment](https://github.com/pytorch/pytorch/pull/144695#issuecomment-2608443589))	2025-01-22 23:04:50 +00:00
Edward Z. Yang	693d8c7e94	Output of nonzero is transposed, fix fake tensor (#144695 ) Needs this companion executorch PR: https://github.com/pytorch/executorch/pull/7657 Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/144695 Approved by: https://github.com/bobrenjc93, https://github.com/albanD	2025-01-21 20:50:09 +00:00
Daniel Vega-Myhre	d02c396fbb	add fp8 support to index_cuda (#144747 ) Fixes #133605 Summary This PR adds support for FP8 data types to the `index_cuda` op. It uses `AT_DISPATCH_V2` which is a new macro that can handle arbitrary number of dtypes, as opposed to the old implementations which had a separate macro for each possible number of dtype arguments (e.g. `AT_DISPATCH_ALL_TYPES_AND_COMPLEX_AND{2,3,4,5...}`). Test plan Updated test `index_cuda_with_cpu` in `test/test_fake_tensor.py` to have cases for all dtypes handled by `index_cuda`, including fp8 dtypes. Pull Request resolved: https://github.com/pytorch/pytorch/pull/144747 Approved by: https://github.com/vkuzo	2025-01-17 22:53:23 +00:00
Brian Hirsh	d7f45fc575	dynamic shape support for interpolate(antialias=True) backward (#141198 ) Fixes https://github.com/pytorch/pytorch/issues/141187 Pull Request resolved: https://github.com/pytorch/pytorch/pull/141198 Approved by: https://github.com/ezyang, https://github.com/Chillee ghstack dependencies: #141161	2025-01-16 00:08:25 +00:00
Yanan Cao (PyTorch)	ba5cacbc17	[Codemod][AddExplicitStrictExportArg] caffe2/test (#143688 ) Reviewed By: avikchaudhuri Differential Revision: D67530154 Pull Request resolved: https://github.com/pytorch/pytorch/pull/143688 Approved by: https://github.com/tugsbayasgalan	2024-12-27 07:58:44 +00:00
Tom Ritchford	d8c8ba2440	Fix unused Python variables in test/[e-z]* (#136964 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/136964 Approved by: https://github.com/justinchuby, https://github.com/albanD	2024-12-18 23:02:30 +00:00
soulitzer	e41a0b33ec	Allow Fakified subclass to have different device for inner and outer tensor (#141839 ) Previously if a wrapper tensor subclass is fakified, the inner tensors would end up having the same device as the outer tensor. This PR makes it so that inner and outer tensors can have different devices. See OffloadTensor PR https://github.com/pytorch/pytorch/pull/141840/files#diff-3bc0cf540b694f4ec0a3749f78b047456657a53a5657e495ffb68e5970c5fdaaR1955 for an application. A simpler test has been added in this PR. This is technically bc-breaking because now the callback passed to MetaConverter needs to accept an extra argument, but no one external should be using this anyway? Pull Request resolved: https://github.com/pytorch/pytorch/pull/141839 Approved by: https://github.com/bdhirsh ghstack dependencies: #141166	2024-12-03 00:09:41 +00:00
eellison	fb7148d05d	Fix split decomp returning self (#140065 ) Previously the split decomp would return the input when there were no splits. This errors in torch.compile (or FakeTensorMode) with: > RuntimeError: View operation returned a tensor that is the same as the input base tensor. This is no longer allowed; you must explicitly create a new tensor (e.g., using .detach()). As a user, you could have made a mistake implementing __torch_dispatch__ or a Python operator decomposition or meta registration; if that's not the case, please report a bug to PyTorch or the backend you are using. Fix for https://github.com/pytorch/pytorch/issues/133394 Differential Revision: [D65635070](https://our.internmc.facebook.com/intern/diff/D65635070) Pull Request resolved: https://github.com/pytorch/pytorch/pull/140065 Approved by: https://github.com/bdhirsh	2024-11-13 01:58:02 +00:00
PyTorch MergeBot	7eb66173e2	Revert "Fix split decomp returning self (#140065 )" This reverts commit 9d99dceb53884387665a2c273beca99a157193a5. Reverted https://github.com/pytorch/pytorch/pull/140065 on behalf of https://github.com/ZainRizvi due to Diff been imported internally, but merged externally. And the internal diff has been updated so the diff and PR are now mismatched. Reverting this PR to get things back into a consistent state. See D65635070 ([comment](https://github.com/pytorch/pytorch/pull/140065#issuecomment-2465928027))	2024-11-09 00:16:26 +00:00
eellison	9d99dceb53	Fix split decomp returning self (#140065 ) Previously the split decomp would return the input when there were no splits. This errors in torch.compile (or FakeTensorMode) with: > RuntimeError: View operation returned a tensor that is the same as the input base tensor. This is no longer allowed; you must explicitly create a new tensor (e.g., using .detach()). As a user, you could have made a mistake implementing __torch_dispatch__ or a Python operator decomposition or meta registration; if that's not the case, please report a bug to PyTorch or the backend you are using. Fix for https://github.com/pytorch/pytorch/issues/133394 Differential Revision: [D65635070](https://our.internmc.facebook.com/intern/diff/D65635070) Pull Request resolved: https://github.com/pytorch/pytorch/pull/140065 Approved by: https://github.com/bdhirsh	2024-11-08 16:53:18 +00:00
Pian Pawakapan	a678eaf1ad	check fake/real mismatches during real tensor prop (#137747 ) Summary: While testing exportability for PT2 Inference models, we found various cases of invalid op inputs during tracing, for example errors like: `a and b must have same reduction dim`, `expected scalar type Long but found Int`, etc. Looking more closely, these happened due to the same few meta kernels & eager kernels producing mismatched outputs upstream (e.g. different output tensor dtype, int output). Adding checks to catch mismatched outputs in real tensor prop upstream, so errors are raised at the mismatched op, instead of the downstream ops taking them as inputs. Relies a lot on utils from [CrossRefFakeMode](`929797dedb/torch/_subclasses/fake_utils.py (L78)`) Follow ups: could add more checks, and maybe have a flag to only enable these for cases like draft mode, so perf doesn't suffer? Test Plan: test_export, test_fake_tensor Differential Revision: D64210055 Pull Request resolved: https://github.com/pytorch/pytorch/pull/137747 Approved by: https://github.com/zou3519	2024-11-04 23:39:48 +00:00
eellison	d90717e4e2	Add option to save real tensors in TORCH_COMPILE_DEBUG repro (#138110 ) This PR adds a utility to try to construct the corresponding real tensor values of fake tensors by seeing if their meta storage is contained in the meta converter. Then, we are able to save real tensor values for fx_graph_runnable if `TORCH_COMPILE_DEBUG_SAVE_REAL=1` is set. Differential Revision: [D64502744](https://our.internmc.facebook.com/intern/diff/D64502744) Pull Request resolved: https://github.com/pytorch/pytorch/pull/138110 Approved by: https://github.com/ezyang	2024-10-28 16:18:22 +00:00
William Wen	92fdea8a39	remove skips due to https://github.com/pytorch/torchdynamo/issues/1991 (#138133 ) Closes https://github.com/pytorch/pytorch/issues/93479. A bunch of other dynamo-wrapped tests also exhibit "torch.* returned non-Tensor output unimplemented" making the issue seem less relevant to me. Some tests are marked as xfail as they fail for other reasons. If these tests are indeed important, we should create a new issue to track them. Pull Request resolved: https://github.com/pytorch/pytorch/pull/138133 Approved by: https://github.com/ezyang	2024-10-17 17:42:46 +00:00
Yidi Wu	819d6b139c	[scan] flatten subgraph output and make subgraph inputs to be a slice (#135601 ) This PR introduces two changes: 1. Before this PR, the subgraph's output is ([], []); in this PR, we change it to a flattened list for easier codegen and consistency with other control flow operators. 2. Before the PR, the combine_fn of scan takes a sliced input but keeps the sliced dimension. For example, suppose xs = torch.randn(3, 4, 5) and we scan over dim 0, the combine_fn looks like:
```
# x.shape = (1, 4, 5) instead of (4, 5)
def combine_fn(carry, x):
    ...
```
In this PR, we fix this and also simplify some of the slicing logic. 3. This diff also makes sure we always stack ys on the first dimension. Pull Request resolved: https://github.com/pytorch/pytorch/pull/135601 Approved by: https://github.com/zou3519 ghstack dependencies: #135600	2024-10-16 22:28:03 +00:00
Animesh Jain	19665f4619	[fake_tensor][cache] Supports ops with tuple of output tensors (#137935 ) This is needed for invoke_subgraph work. Pull Request resolved: https://github.com/pytorch/pytorch/pull/137935 Approved by: https://github.com/masnesral	2024-10-15 22:15:07 +00:00
Thomas Bohnstingl	e889252493	Implementation of scan (#134102 ) This operation is supposed to be the counterpart to `associative_scan`, but can operate with non-associative functions. @ydwu4 Pull Request resolved: https://github.com/pytorch/pytorch/pull/134102 Approved by: https://github.com/ydwu4	2024-09-10 04:51:16 +00:00
xinan.lin	5707c6e952	[Fake tensor] Align the appearance of `device_put` op in fx_graph generated for CUDA and XPU, which is exposed in the issue #130823 (#132479 ) [Fake tensor] Align the appearance of device_put op in fx_graph generated for CUDA and XPU, which is exposed in the issue #130823 Pull Request resolved: https://github.com/pytorch/pytorch/pull/132479 Approved by: https://github.com/EikanWang, https://github.com/zou3519, https://github.com/eellison	2024-08-09 05:31:00 +00:00
Oguz Ulgen	221350e3a4	Add None return type to init -- tests (#132352 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/132352 Approved by: https://github.com/ezyang ghstack dependencies: #132335, #132351	2024-08-01 15:44:51 +00:00
Aaron Orenstein	b193894b94	FakeTensor cache SymInt support (#127596 ) Adds support for SymInts in the FakeTensor cache. A couple notes: 1. When a SymInt is present in the input key for a FakeTensor operation we cache on the ShapeEnv instead of using the FakeTensorMode cache. This is necessary so we don't have to remember and check the guards. It reduces the cache hits but there's diminishing return on how much work we can do before the cache becomes more of a burden than a gain. 2. We need to be careful that when we cache an output SymInt that is a direct copy from the input that when we have a cache-hit we copy the SymNode from the input to the output. This is important because the fx-graph building code actually uses SymNode ids in the process of building the graph so constructing a same-content-but-different-id SymNode will fail. 3. In the cache key we store SymInts as a _PySymInputStub. These represent SymInt (and friends) but support `__hash__` and `__eq__` (which SymInt do not). 4. In the cache entry we store SymInts as a _SymIntOutputStub. 
Perf example: ``` python benchmarks/dynamo/timm_models.py --ci --accuracy --timing --explain --inductor --dynamic-shapes --dynamic-batch-only --device cuda --training --amp --total-partitions 2 --partition-id 0 --output /tmp/training_timm_models.csv --filter crossvit_9_240 ``` fake tensor cache before: ``` INFO: FakeTensor cache stats: INFO: cache_hits: 68137 INFO: cache_misses: 837 INFO: cache_bypasses: INFO: symbolic shape: 48224 INFO: CompositeImplicitAutograd: 917 INFO: non-fake tensor: 70 INFO: non-FakeTensor output: 62 INFO: non-builtin: 8 INFO: dynamic output shape: 1 ``` and after: ``` INFO: FakeTensor cache stats: INFO: cache_hits: 88187 INFO: cache_misses: 14233 INFO: cache_bypasses: INFO: CompositeImplicitAutograd: 1037 INFO: non-FakeTensor output: 602 INFO: non-fake tensor: 70 INFO: unsafe view: 36 INFO: non-builtin: 8 INFO: dynamic output shape: 1 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/127596 Approved by: https://github.com/eellison ghstack dependencies: #131014, #129780	2024-07-21 19:26:38 +00:00
Xuehai Pan	93a33bf3ac	[BE] update type annotations for basic utilities in `torch/__init__.py` (#129001 ) Changes: 1. Make some arguments positional-only as we only support Python 3.8+ 2. Clean up `torch.typename(obj)` implementation. 3. Update type annotations, especially `is_tensor()` and `is_masked_tensor()` using `TypeGuard`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/129001 Approved by: https://github.com/malfet	2024-06-24 18:04:38 +00:00
PyTorch MergeBot	cb4919344a	Revert "[BE] update type annotations for basic utilities in `torch/__init__.py` (#129001 )" This reverts commit e53d9590287cbf97521f96d055910394f6e9a849. Reverted https://github.com/pytorch/pytorch/pull/129001 on behalf of https://github.com/XuehaiPan due to lint failure ([comment](https://github.com/pytorch/pytorch/pull/129001#issuecomment-2186944549))	2024-06-24 16:18:43 +00:00
Xuehai Pan	e53d959028	[BE] update type annotations for basic utilities in `torch/__init__.py` (#129001 ) Changes: 1. Make some arguments positional-only as we only support Python 3.8+ 2. Clean up `torch.typename(obj)` implementation. 3. Update type annotations, especially `is_tensor()` and `is_masked_tensor()` using `TypeGuard`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/129001 Approved by: https://github.com/malfet	2024-06-24 14:35:41 +00:00
Animesh Jain	8184cd85fc	[fake tensor] Set _is_param for base fake tensors for views (#127823 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/127823 Approved by: https://github.com/eellison, https://github.com/ezyang ghstack dependencies: #127972	2024-06-05 20:26:52 +00:00
Edward Z. Yang	3ae118204e	Make propagate_real_tensor more safe (#126281 ) Internal xref: https://fb.workplace.com/groups/6829516587176185/posts/7228787720582401/ There are a few improvements here, which luckily fix some xfails: * In general, it can be unsafe to call operations on Tensors under a `no_dispatch()` mode that is purely trying to disable ambient modes, because this ALSO disables tensor subclass handling. So we test to see if there is a tensor subclass and don't propagate real tensors if that's the case. Another acceptable outcome might be to try to only disable the ambient fake tensor mode, this would help us propagate real tensors through more exotic tensor types, but I'm not going to do it until someone asks for it. * We're graph breaking for wrapped tensors too late. Pull it up earlier so we do it before we try to muck around with the real tensor. * I noticed that occasionally when I do `storage.copy_(real_storage)`, the sizes mismatch. Careful code reading suggests that I should just copy in the real data when the tensor was initially allocated, so that's what I do now, eliminating the need for a storage copy. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/126281 Approved by: https://github.com/Skylion007	2024-05-15 23:57:02 +00:00
Catherine Lee	719a8f42bf	Forward fix lint after #125747 (#126295 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/126295 Approved by: https://github.com/atalman	2024-05-15 16:37:48 +00:00
Yuanhao Ji	ba3cd6e463	Enable UFMT on `test/test_fake_tensor.py`, `test/test_flop_counter.py` and some files (#125747 ) Part of: #123062 Ran lintrunner on: - test/test_fake_tensor.py - test/test_flop_counter.py - test/test_function_schema.py - test/test_functional_autograd_benchmark.py - test/test_functional_optim.py - test/test_functionalization_of_rng_ops.py Detail: ```bash $ lintrunner -a --take UFMT --all-files ok No lint issues. Successfully applied all patches. ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/125747 Approved by: https://github.com/malfet	2024-05-15 14:50:14 +00:00
Edward Z. Yang	41fabbd93f	Fanatically correct real tensor cloning for propagate_real_tensors (#126175 ) Internal xref: https://fb.workplace.com/groups/6829516587176185/posts/7211398545654652/ Previously I did it in a crappy way using clone_input in the callback, but this results in tensors that don't have quite the same size/stride/storage offset and there was an internal test case where not having completely accurate information was causing a downstream problem in propagation. So now I make real tensors as similar to their fake equivalents as much as possible. Though... I don't bother with autograd lol. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/126175 Approved by: https://github.com/albanD	2024-05-14 23:14:17 +00:00
angelayi	aac215a824	SymInt-ify unsqueeze_copy (#125976 ) Fixes https://github.com/pytorch/pytorch/issues/125853 I only half-know how to code c++ so please lmk if I did templating incorrectly 🙈 The reason I used a template is because the `InferUnsqueezeGeometryResult` struct gets used in a couple of other places, like for unsqueeze_quantized, but I wasn't sure if I should symint-ify those too. Pull Request resolved: https://github.com/pytorch/pytorch/pull/125976 Approved by: https://github.com/larryliu0820, https://github.com/ezyang	2024-05-14 15:58:52 +00:00
Edward Z. Yang	e93b57a570	Add propagate_real_tensors mode for unbacked (#125115 ) A common complaint when working with data-dependent code in PyTorch is that it's hard to tell how far you are from the finish line: every time a GuardOnDataDependentSymNode error is hit, you have to somehow fix or workaround it to see the next one. This PR adds a new mode `torch._functorch.config.fake_tensor_propagate_real_tensors` which modifies fake tensors to also propagate real tensors. This means that when we try to guard on a data-dependent SymNode, we can actually produce a real result. We also produce a warning which you should consult to figure out what the crux points are. I ran this on vision_maskrcnn. In the baseline (without this mode), the model has 27 graph breaks, resulting in 40 graphs. With this mode on, the model has only 11 graph breaks, resulting in 15 graphs (the remaining graph breaks are due to missing functionality for item() on float tensor and some other Dynamo missing features.) You get a list of things that would have errored like this: ``` WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Max(1, u1) < 2) -> True WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Eq(Max(1, u1), 1)) -> True WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Eq(Max(1, u1), 1)) -> True WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Ne(Max(1, u1), 1)) -> False WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Max(1, u0) < 2) -> True WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Eq(Max(1, u0), 1)) -> True WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Eq(Max(1, u0), 1)) -> True WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Ne(Max(1, u0), 1)) -> False 
WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Max(1, u1) < 2) -> True WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Eq(Max(1, u1), 1)) -> True WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Eq(Max(1, u1), 1)) -> True WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Ne(Max(1, u1), 1)) -> False WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Max(1, u0) < 2) -> True WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Eq(Max(1, u0), 1)) -> True WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Eq(Max(1, u0), 1)) -> True WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Ne(Max(1, u0), 1)) -> False WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Max(1, u1) < 2) -> False WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Eq(Max(1, u1), 1)) -> False WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Ne(Max(1, u1), 1)) -> True WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Max(1, u0) < 2) -> False WARNING:torch.fx.experimental.symbolic_shapes:propagate_real_tensors evaluate_expr(Eq(Max(1, u0), 1)) -> False ``` Potential later follow ups: * Improve the warning messages (in particular, should provide user frames) * GC real tensors when they are no longer needed by tracing. Right now, this will use A LOT of memory, equal to as if your GC was broken and every intermediate tensor was kept live Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/125115 Approved by: https://github.com/IvanKobzarev	2024-05-02 15:28:26 +00:00
Guilherme Leobas	761a7b84ba	[Dynamo] Fix alias issue with respect to wrapped numbers (#124731 ) (#124774 ) This PR fixes an issue where calling `aten.alias(int)` raises a TypeError.
```python
import torch
import torch.autograd.forward_ad as fwAD

def f(x):
    return 4312491 * x

device = "cpu"
with torch._subclasses.fake_tensor.FakeTensorMode():
    with fwAD.dual_level():
        x = torch.randn(3, device=device)
        y = torch.ones_like(x)
        dual = fwAD.make_dual(x, y)
        f(dual)
```
The test case above illustrates this bug. 1) `4312491` turns into a tensor that is a wrapped number 2) Forward mode AD calls `aten::alias` internally 3) The wrapped number (`4312491`) becomes a python integer 4) `aten.alias(int)` raises a `TypeError` Pull Request resolved: https://github.com/pytorch/pytorch/pull/124774 Approved by: https://github.com/albanD, https://github.com/zou3519	2024-04-30 14:11:46 +00:00
rzou	889e3eeed3	Avoid cuda init to FakeTensorMode (#124413 ) Also partially fixes #122109 This PR: - We add a C++ flag (only_lift_cpu_tensors) to toggle the torch.tensor(1, device='cuda') ctor strategy. When false (default), it does the current PyTorch behavior of unconditionally constructing a concrete CUDA tensor then calling lift_fresh on it. When true, we instead construct a concrete CPU tensor, call lift_fresh, and then call Tensor.to(device) (under any ambient modes). - FakeTensorMode flips this flag depending on if CUDA is available or not. We don't unconditionally set the flag to True because that is likely BC-breaking. Test Plan: - existing tests Pull Request resolved: https://github.com/pytorch/pytorch/pull/124413 Approved by: https://github.com/eellison	2024-04-19 02:39:35 +00:00
Edward Z. Yang	cebf65126c	FakeTensorProp assert consistency of sizes when metadata previously existed (#124059 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/124059 Approved by: https://github.com/bdhirsh, https://github.com/thiagocrepaldi ghstack dependencies: #124105	2024-04-16 23:28:42 +00:00
William Wen	cbde0f048b	[dynamo, 3.12] enable tests disabled due to missing dynamo 3.12 support (#123300 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/123300 Approved by: https://github.com/jansel, https://github.com/malfet, https://github.com/zou3519	2024-04-05 20:13:17 +00:00
Edward Z. Yang	85845a29db	Refactor ShapeEnvSettings so it's directly on ShapeEnv (#122310 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/122310 Approved by: https://github.com/masnesral, https://github.com/lezcano	2024-03-26 14:16:33 +00:00
Edward Z. Yang	268b0cc714	Do not run CUDA lazy init if it is triggered with fake mode on. (#122636 ) Partially fixes https://github.com/pytorch/pytorch/issues/122109 Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/122636 Approved by: https://github.com/zou3519	2024-03-26 05:43:59 +00:00
Edward Z. Yang	49b81af45f	Delete dead memoized_only kwarg in FakeTensor (#122271 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/122271 Approved by: https://github.com/eellison ghstack dependencies: #122044, #122270	2024-03-25 13:16:21 +00:00
Edward Z. Yang	f32ce4e28e	Delete FakeTensorConverter.__call__ in favor of from_real_tensor (#122270 ) It's annoying grepping for `__call__` call-sites so they're all explicit now. I'd do this to MetaConverter too but that one is way more public, a lot more sites. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/122270 Approved by: https://github.com/eellison ghstack dependencies: #122044	2024-03-25 13:16:13 +00:00
Edward Z. Yang	5891c5b3a6	Factor meta conversion through serializable MetaTensorDesc (#122044 ) Fixes https://github.com/pytorch/pytorch/issues/121085 This PR is pretty involved so pay attention to this description. At a high level, the refactor is intended to be mechanical: anywhere in MetaConverter where previously we took a Tensor as argument, we now take a MetaTensorDesc, which contains all of the information that we would have queried off of the Tensor, but placed into a separate data structure which we can serialize or use to recreate a fake tensor in a separate fake tensor mode in exact fidelity to the original. However, this transformation is not always entirely mechanical. Here is what you need to pay attention to: - The memo table from real Tensor -> meta/fake Tensor is now broken into two memo tables: real Tensor -> stable int id -> meta/fake Tensor. The stable int id is needed so that when we do serialization, we know when tensors/storages alias each other and can ensure we preserve this aliasing upon deserialization. The way I have implemented it changes the weak reference behavior. Previously, when either the real Tensor OR the meta/fake Tensor went dead, we would remove the entry from the memo table. Now, this only removes entries from one of the two memo tables. This semantically makes sense, because the user may have held on to the stable int id out of band, and may expect a real Tensor to continue to be numbered consistently / expect to be able to lookup a meta/fake tensor from this id. If this is unacceptable, it may be possible to rejigger the memo tables so that we have real Tensor -> stable int id and real Tensor -> meta/fake Tensor, but TBH I find the new implementation a lot simpler, and arranging the memo tables in this way means that I have to muck around with the real tensor to save to the memo table; in the current implementation, I never pass the Tensor to meta_tensor function AT ALL, which means it is impossible to accidentally depend on it. 
- When I fill in the fields of MetaTensorDesc in describe_tensor, I need to be careful not to poke fields when they are not valid. Previously, preconditions were implicitly checked via the conditional structure ("is this sparse? is this nested?") that is tested before we start reading attributes. This structure has to be replicated in describe_tensor, and I have almost assuredly gotten it wrong on my first try (I'll be grinding through it on CI; a careful audit will help too, by auditing that I've tested all the same conditionals that the original access was guarded by.) - I originally submitted https://github.com/pytorch/pytorch/pull/121821 for the symbolic shapes change, but it turned out the way I did it there didn't actually work so well for this PR. I ended up just inlining the symbolic shapes allocation logic into MetaConverter (look for calls to maybe_specialize_sym_int_with_hint), maybe there is a better way to structure it, but what I really want is to just read sizes/strides/offset directly off of MetaTensorDesc; I don't want another intermediate data structure. - Some fields aren't serializable. These are documented as "NOT serializable". ctx/type should morally be serializable and I just need to setup a contract with subclasses to let them be serialized. The fake_mode is used solely to test if we are refakefying with a pre-existing ShapeEnv and we want to reuse the SymInt directly--serializing this case is hopeless but I am kind of hoping after this refactor we do not need this at all. view_func is not serializable because it's a bound C implemented method. Joel has promised me that this is not too difficult to actually expose as a true data structure, but this is the edgiest of edge cases and there is no reason to deal with it right now. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/122044 Approved by: https://github.com/eellison	2024-03-25 06:21:17 +00:00
PyTorch MergeBot	f65373e278	Revert "Factor meta conversion through serializable MetaTensorDesc (#122044)" This reverts commit e2d89e970480d7e5b10a77928442d8caf94e0e84. Reverted https://github.com/pytorch/pytorch/pull/122044 on behalf of https://github.com/jeanschmidt due to: Seems that some land race caused this PR to break lint ([comment](https://github.com/pytorch/pytorch/pull/122044#issuecomment-2015025490))	2024-03-22 12:46:21 +00:00
Edward Z. Yang	e2d89e9704	Factor meta conversion through serializable MetaTensorDesc (#122044) Fixes https://github.com/pytorch/pytorch/issues/121085 This PR is pretty involved, so pay attention to this description. At a high level, the refactor is intended to be mechanical: anywhere in MetaConverter where previously we took a Tensor as argument, we now take a MetaTensorDesc, which contains all of the information that we would have queried off of the Tensor, but placed into a separate data structure which we can serialize or use to recreate a fake tensor in a separate fake tensor mode in exact fidelity to the original. However, this transformation is not always entirely mechanical. Here is what you need to pay attention to: - The memo table from real Tensor -> meta/fake Tensor is now broken into two memo tables: real Tensor -> stable int id -> meta/fake Tensor. The stable int id is needed so that when we do serialization, we know when tensors/storages alias each other and can ensure we preserve this aliasing upon deserialization. The way I have implemented this changes the weak reference behavior. Previously, when either the real Tensor OR the meta/fake Tensor went dead, we would remove the entry from the memo table. Now, this only removes entries from one of the two memo tables. This semantically makes sense, because the user may have held on to the stable int id out of band, and may expect a real Tensor to continue to be numbered consistently / expect to be able to look up a meta/fake tensor from this id. If this is unacceptable, it may be possible to rejigger the memo tables so that we have real Tensor -> stable int id and real Tensor -> meta/fake Tensor, but TBH I find the new implementation a lot simpler, and arranging the memo tables in this way means that I have to muck around with the real tensor to save to the memo table; in the current implementation, I never pass the Tensor to meta_tensor function AT ALL, which means it is impossible to accidentally depend on it.
- When I fill in the fields of MetaTensorDesc in describe_tensor, I need to be careful not to poke fields when they are not valid. Previously, preconditions were implicitly checked via the conditional structure ("is this sparse? is this nested?") that is tested before we start reading attributes. This structure has to be replicated in describe_tensor, and I have almost assuredly gotten it wrong on my first try (I'll be grinding through it on CI; a careful audit will help too, by auditing that I've tested all the same conditionals that the original access was guarded by.) - I originally submitted https://github.com/pytorch/pytorch/pull/121821 for the symbolic shapes change, but it turned out the way I did it there didn't actually work so well for this PR. I ended up just inlining the symbolic shapes allocation logic into MetaConverter (look for calls to maybe_specialize_sym_int_with_hint); maybe there is a better way to structure it, but what I really want is to just read sizes/strides/offset directly off of MetaTensorDesc; I don't want another intermediate data structure. - Some fields aren't serializable. These are documented as "NOT serializable". ctx/type should morally be serializable and I just need to set up a contract with subclasses to let them be serialized. The fake_mode is used solely to test if we are refakefying with a pre-existing ShapeEnv and we want to reuse the SymInt directly--serializing this case is hopeless but I am kind of hoping after this refactor we do not need this at all. view_func is not serializable because it's a bound C implemented method. Joel has promised me that this is not too difficult to actually expose as a true data structure, but this is the edgiest of edge cases and there is no reason to deal with it right now. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/122044 Approved by: https://github.com/eellison ghstack dependencies: #122018	2024-03-22 03:56:34 +00:00
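The two-level memo table described in the commit message above (real Tensor -> stable int id -> meta/fake Tensor) can be illustrated with a small standard-library sketch. This is not the PyTorch implementation: `Obj`, `TwoLevelMemo`, and all method names are hypothetical stand-ins, and plain Python objects take the place of tensors. It only shows the weak-reference behavior the message describes, where the death of the real object clears one table but leaves the id -> fake mapping usable.

```python
import gc
import weakref


class Obj:
    """Hypothetical stand-in for a tensor (plain objects are weak-referenceable)."""

    def __init__(self, name):
        self.name = name


class TwoLevelMemo:
    """Hypothetical sketch: real object -> stable int id -> fake object."""

    def __init__(self):
        self._next_id = 0
        # Level 1: weak on keys, so an entry vanishes when the real object dies.
        self._real_to_id = weakref.WeakKeyDictionary()
        # Level 2: weak on values, so an entry vanishes when the fake object dies.
        self._id_to_fake = weakref.WeakValueDictionary()

    def get_id(self, real):
        # The same live object always maps to the same id.
        if real not in self._real_to_id:
            self._real_to_id[real] = self._next_id
            self._next_id += 1
        return self._real_to_id[real]

    def set_fake(self, real, fake):
        self._id_to_fake[self.get_id(real)] = fake

    def lookup_by_id(self, stable_id):
        return self._id_to_fake.get(stable_id)


def demo():
    memo = TwoLevelMemo()
    real, fake = Obj("real"), Obj("fake")
    memo.set_fake(real, fake)
    stable_id = memo.get_id(real)
    del real          # the real object dies; only the level-1 entry is removed
    gc.collect()
    return len(memo._real_to_id), memo.lookup_by_id(stable_id) is fake
```

Because the id is an ordinary integer, a caller that kept it out of band can still reach the fake object after the real one is gone, which is the behavior the message argues is semantically reasonable.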
Edward Z. Yang	74c09a757b	Simplify Storage meta conversion with PyObject preservation (#122018) Thanks to https://github.com/pytorch/pytorch/pull/109039 we can rely on finalizers on Storage PyObject to handle removal from dict. Irritatingly, we still have to attach a finalizer, because we don't have a weak key AND value dict (only one or the other). Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/122018 Approved by: https://github.com/eellison, https://github.com/kurtamohler	2024-03-18 18:55:58 +00:00
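The remark above about lacking a dict that is weak on both keys and values can be made concrete with a minimal sketch. It assumes only the Python standard library; `WeakKeyAndValueMemo` and `Obj` are hypothetical names, not PyTorch APIs. A `WeakKeyDictionary` covers the key side, and a `weakref.finalize` callback attached to the value covers the value side.

```python
import gc
import weakref


class Obj:
    """Hypothetical stand-in for a storage or tensor object."""


class WeakKeyAndValueMemo:
    """Hypothetical sketch: an entry disappears when EITHER key or value dies."""

    def __init__(self):
        # Weak on keys only; values are handled by the finalizer below.
        self._table = weakref.WeakKeyDictionary()

    def set(self, key, value):
        # Store a weak reference so the table does not keep the value alive.
        value_ref = weakref.ref(value)
        self._table[key] = value_ref
        key_ref = weakref.ref(key)
        table = self._table

        def _drop():
            # Runs when `value` is collected; remove the entry only if it
            # still points at this value (the key may have been re-set).
            k = key_ref()
            if k is not None and table.get(k) is value_ref:
                del table[k]

        weakref.finalize(value, _drop)

    def get(self, key):
        ref = self._table.get(key)
        return None if ref is None else ref()

    def __len__(self):
        return len(self._table)


def demo():
    memo = WeakKeyAndValueMemo()
    k1, v1 = Obj(), Obj()
    k2, v2 = Obj(), Obj()
    memo.set(k1, v1)
    memo.set(k2, v2)
    del v1            # value dies: the finalizer removes the k1 entry
    gc.collect()
    after_value_death = len(memo)
    del k2            # key dies: WeakKeyDictionary removes the k2 entry
    gc.collect()
    after_key_death = len(memo)
    return after_value_death, after_key_death
```

The finalizer is the extra step the commit message calls irritating: without it, a dead value would leave a stale entry behind for as long as its key stayed alive.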


188 Commits