pytorch

mirror of https://github.com/pytorch/pytorch.git synced 2025-10-21 05:30:26 +08:00

Author	SHA1	Message	Date
Pian Pawakapan	745324e487	[export] turn on hybrid symints by default (#130775 ) Sets `prefer_deferred_runtime_asserts_over_guards=True` for export, so any guards emitted from `SymNode.expect_true` (for example, guards that are implicitly required to be true for an op to succeed) won't lead to constraint violations. Instead these should appear in the graph as runtime asserts, or potentially as replacement expressions for placeholder shapes. For example, this reshape op should emit s0 * s1 = s2, deferred as a runtime assert. ``` x = torch.randn(4, 8) # [s0, s1] y = torch.randn(32) # [s2] out = x.reshape(-1) + y # this emits Eq(s0 * s1, s2), and we represent y's shape as [s0*s1] in the graph. ``` However, other complex guards can still cause export to fail, for instance guards emitted from `SymNode.guard_bool/guard_size_oblivious` (e.g. explicit if-else conditions in user code or lower-level op implementations hit during tracing) can still raise constraint violations. These can be deferred with `allow_complex_guards_as_runtime_asserts=True`. We don't yet make this default, because while this makes export more likely to succeed, it results in non-trivial asserts being emitted that often represent specialization to a variant of the op, or checks related to 0/1 specialization. We also remove forced specializations for export and kill the `_disable_forced_specializations` flag - now any guard we can't express with Dims/DerivedDims is either handled with Hybrid SymInts or should be resolved by rewriting or deferring. Follow up: Currently, `ShapeEnv._set_replacement()` is called for complex equality expressions (e.g. s2 -> s0*s1 in the example above), and the ExportedProgram stores `s0*s1` in the input placeholder. This isn't checked for validity when the program is run, so one option is to avoid the replacement and/or add a runtime assert on the equality. 
Pull Request resolved: https://github.com/pytorch/pytorch/pull/130775 Approved by: https://github.com/avikchaudhuri	2024-07-18 17:40:58 +00:00
Zhengxu Chen	5484c86021	[export] Fully support extension op in serialization/deserialization. (#130851 ) Summary: Finishing up the mechanism to "register" certain types of operators to a registry so that the serializer can handle them correctly. This is expected to be firstly used by executorch. Test Plan: buck run mode/opt caffe2/test:test_export -- -r test_export_with_extension_op_serialization Differential Revision: D59825148 Pull Request resolved: https://github.com/pytorch/pytorch/pull/130851 Approved by: https://github.com/angelayi	2024-07-18 16:47:53 +00:00
angelayi	6c2c8ee15b	[export] Remove preserved ops from decomp list (#130970 ) Fixes https://fb.workplace.com/groups/1075192433118967/permalink/1466016147369925/ Pull Request resolved: https://github.com/pytorch/pytorch/pull/130970 Approved by: https://github.com/bdhirsh	2024-07-18 05:15:22 +00:00
Pian Pawakapan	d96c80649f	[export] constants & non-persistent buffers for training IR (#130864 ) Summary: Uses original ExportedProgram constants and graph signature to inform decompositions, so that constant tensors and non-persistent buffers are respected for training IR. Removes 7 test failures for training IR. Test Plan: test_export Differential Revision: D59820909 Pull Request resolved: https://github.com/pytorch/pytorch/pull/130864 Approved by: https://github.com/angelayi	2024-07-17 18:27:53 +00:00
Pian Pawakapan	18b7633bfb	[export] fix kwargs in run_decompositions() for training IR (#130553 ) Re-exporting GraphModule expects all inputs to be in args, though not in pytree-flattened format. This avoids failing when we run with a fx.Interpreter subclass in [AOTAutograd tracing](`973037be6a/torch/_functorch/_aot_autograd/traced_function_transforms.py (L760-L762)`). Removes 7 test failures for training IR export. Pull Request resolved: https://github.com/pytorch/pytorch/pull/130553 Approved by: https://github.com/zhxchen17, https://github.com/ydwu4	2024-07-11 22:53:18 +00:00
Zhengxu Chen	726a287271	[export] Expand verifier to be multiple on ExportedProgram (#130364 ) Summary: This diff updates the ExportedProgram class in PyTorch to allow for multiple verifiers to be attached to it. This is done by adding a new field to the ExportedProgram schema called "verifiers" which is a list of strings representing the names of the verifiers to be attached to the program. The verifiers are loaded using the "load_verifier" function which is defined in the "torch._export.serde.serialize" module. The "exported_program.dialect" field is also deprecated in favor of the "verifiers" field. Test Plan: CI Differential Revision: D59408546 Pull Request resolved: https://github.com/pytorch/pytorch/pull/130364 Approved by: https://github.com/angelayi, https://github.com/ydwu4	2024-07-11 20:34:49 +00:00
Pian Pawakapan	1b3b4c2fb9	[runtime asserts] deduplicate runtime asserts & CSE (#128599 ) (#130380 ) original PR: https://github.com/pytorch/pytorch/pull/128599 (re-created after revert + poisoned diff train) Summary: This PR adds deduplication and CSE for runtime asserts. Existing size computation in the graph is CSE'd along with added runtime asserts, and redundant asserts are removed. Shape calls on intermediate tensors are also turned into compute on input sizes if possible, allowing intermediate tensors to be freed earlier. For example: ``` z = torch.cat([x, x], dim=0) # 2*s0 w = z.repeat(y.shape[0]) # 2*s0*s1 _w = w.shape[0] # something with _w ... # turns into -> s0 = x.shape[0] s1 = y.shape[0] _w0 = 2 * s0 _w = _w0 * s1 ``` Additionally, constrain_range calls are deduplicated. Single-symbol bound checks for unbacked symbols (e.g. u0 >= 0, u0 <= 5) and sym_constrain_range.default calls are also removed, since they accumulate range info in the ShapeEnv, and are replaced with two _assert_scalar.default calls that check the min/max bounds. For example: ``` torch.sym_constrain_range_for_size(n, min=2, max=16) torch.sym_constrain_range(n, min=4, max=20) torch._check(n >= 0) torch._check(n >= 3) torch._check(n <= 14) # turns into torch.sym_constrain_range_for_size(n) torch._check(n >= 4) torch._check(n <= 14) ``` Test Plan: contbuild & OSS CI, see `940e4477ab` Original Phabricator Test Plan: Imported from GitHub, without a `Test Plan:` line. Differential Revision: D59543603 Pull Request resolved: https://github.com/pytorch/pytorch/pull/130380 Approved by: https://github.com/izaitsevfb	2024-07-10 19:23:37 +00:00
PyTorch MergeBot	9c9744c3ac	Revert "[runtime asserts] deduplicate runtime asserts & CSE (#128599 )" This reverts commit 940e4477ab0b81eea25051447cf5f599080c903f. Reverted https://github.com/pytorch/pytorch/pull/128599 on behalf of https://github.com/izaitsevfb due to breaking internal APS tests, see D59498864 ([comment](https://github.com/pytorch/pytorch/pull/128599#issuecomment-2218724762))	2024-07-09 21:03:49 +00:00
Pian Pawakapan	940e4477ab	[runtime asserts] deduplicate runtime asserts & CSE (#128599 ) This PR adds deduplication and CSE for runtime asserts. Existing size computation in the graph is CSE'd along with added runtime asserts, and redundant asserts are removed. Shape calls on intermediate tensors are also turned into compute on input sizes if possible, allowing intermediate tensors to be freed earlier. For example: ``` z = torch.cat([x, x], dim=0) # 2*s0 w = z.repeat(y.shape[0]) # 2*s0*s1 _w = w.shape[0] # something with _w ... # turns into -> s0 = x.shape[0] s1 = y.shape[0] _w0 = 2 * s0 _w = _w0 * s1 ``` Additionally, constrain_range calls are deduplicated. Single-symbol bound checks for unbacked symbols (e.g. u0 >= 0, u0 <= 5) and sym_constrain_range.default calls are also removed, since they accumulate range info in the ShapeEnv, and are replaced with two _assert_scalar.default calls that check the min/max bounds. For example: ``` torch.sym_constrain_range_for_size(n, min=2, max=16) torch.sym_constrain_range(n, min=4, max=20) torch._check(n >= 0) torch._check(n >= 3) torch._check(n <= 14) # turns into torch.sym_constrain_range_for_size(n) torch._check(n >= 4) torch._check(n <= 14) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/128599 Approved by: https://github.com/ezyang	2024-07-07 20:10:14 +00:00
PyTorch MergeBot	963f430d13	Revert "[runtime asserts] deduplicate runtime asserts & CSE (#128599 )" This reverts commit 0267b2ddcb58aa66b2b62336216da7df4f9939d8. Reverted https://github.com/pytorch/pytorch/pull/128599 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it seems to cause a landrace and fails inductor/test_cudagraph_trees in trunk `0267b2ddcb` ([comment](https://github.com/pytorch/pytorch/pull/128599#issuecomment-2211690518))	2024-07-06 07:20:05 +00:00
Pian Pawakapan	0267b2ddcb	[runtime asserts] deduplicate runtime asserts & CSE (#128599 ) This PR adds deduplication and CSE for runtime asserts. Existing size computation in the graph is CSE'd along with added runtime asserts, and redundant asserts are removed. Shape calls on intermediate tensors are also turned into compute on input sizes if possible, allowing intermediate tensors to be freed earlier. For example: ``` z = torch.cat([x, x], dim=0) # 2*s0 w = z.repeat(y.shape[0]) # 2*s0*s1 _w = w.shape[0] # something with _w ... # turns into -> s0 = x.shape[0] s1 = y.shape[0] _w0 = 2 * s0 _w = _w0 * s1 ``` Additionally, constrain_range calls are deduplicated. Single-symbol bound checks for unbacked symbols (e.g. u0 >= 0, u0 <= 5) and sym_constrain_range.default calls are also removed, since they accumulate range info in the ShapeEnv, and are replaced with two _assert_scalar.default calls that check the min/max bounds. For example: ``` torch.sym_constrain_range_for_size(n, min=2, max=16) torch.sym_constrain_range(n, min=4, max=20) torch._check(n >= 0) torch._check(n >= 3) torch._check(n <= 14) # turns into torch.sym_constrain_range_for_size(n) torch._check(n >= 4) torch._check(n <= 14) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/128599 Approved by: https://github.com/ezyang	2024-07-06 03:44:49 +00:00
Tugsbayasgalan Manlaibaatar	dabaebd339	Make run_decomp work (#129249 ) In this PR, we implement the first version of training_ir.run_decomp functionality. Since we don't return the modified buffers as extra output in training IR, our previous strategy of reusing the graph signature won't work. In fact, this run_decomp is more similar to retracing, so I reuse some of the export steps here. After this PR: export_for_training().run_decomp({}, _preserve_ops=[all 183 ops]) == export_for_predispatch() - autograd_manipulating_ops. Differential Revision: [D59069090](https://our.internmc.facebook.com/intern/diff/D59069090) Pull Request resolved: https://github.com/pytorch/pytorch/pull/129249 Approved by: https://github.com/zhxchen17 ghstack dependencies: #128077, #129092	2024-06-27 19:16:07 +00:00
Tugsbayasgalan Manlaibaatar	90f6043368	Don't decompose functional composite ops in export inference IR (#128077 ) Recently we decided to split export IR into two different IRs (training vs inference). In the inference IR, one major change we decided to introduce was we wanted to keep the composite ops that user specified in the IR. This PR does that by overriding the CompositeImplicitAutograd decomp in export inference path. Differential Revision: [D58701607](https://our.internmc.facebook.com/intern/diff/D58701607) Pull Request resolved: https://github.com/pytorch/pytorch/pull/128077 Approved by: https://github.com/bdhirsh	2024-06-26 23:07:55 +00:00
Pian Pawakapan	d02bba519c	[export] match fake mode for _decompose_exported_program() (#129421 ) Summary: _decompose_exported_program() ran into an issue with trace_joint, where trace_joint() produces values with mismatching FakeModes. Adding fake mode context to aot_export_module() so this doesn't happen. #thanks to tugsbayasgalan for the fix! Test Plan: test_experimental Differential Revision: D58977694 Pull Request resolved: https://github.com/pytorch/pytorch/pull/129421 Approved by: https://github.com/tugsbayasgalan, https://github.com/zhxchen17	2024-06-26 05:52:31 +00:00
Zhengxu Chen	65286883d4	[export] reland "experimental joint graph API." (#129081 ) Summary: the previous diff was reverted even though CI was green. Test Plan: CI Differential Revision: D58790048 Pull Request resolved: https://github.com/pytorch/pytorch/pull/129081 Approved by: https://github.com/tugsbayasgalan	2024-06-20 16:50:53 +00:00
PyTorch MergeBot	df94d57c0a	Revert "[export] experimental joint graph API. (#128847 )" This reverts commit 0707811286d1846209676435f4f86f2b4b3d1a17. Reverted https://github.com/pytorch/pytorch/pull/128847 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/128847#issuecomment-2179326891))	2024-06-19 19:04:36 +00:00
Zhengxu Chen	0707811286	[export] experimental joint graph API. (#128847 ) Summary: WARNING: This API is highly unstable and will be subject to change in the future. Add a prototype to "decompose" an ExportedProgram into a joint graph form, so that we can compute the gradients on this graph. Test Plan: buck test mode/opt caffe2/torch/fb/export:test_experimental Differential Revision: D55657917 Pull Request resolved: https://github.com/pytorch/pytorch/pull/128847 Approved by: https://github.com/tugsbayasgalan	2024-06-19 16:45:27 +00:00
Zhengxu Chen	be0eec9031	[export] Improve static typing in tracer. (#128552 ) Summary: as title. Test Plan: CI Differential Revision: D58485487 Pull Request resolved: https://github.com/pytorch/pytorch/pull/128552 Approved by: https://github.com/angelayi	2024-06-14 17:57:37 +00:00
chilli	c486e2ab64	Add coloring to fx graph print out (#128476 ) Note: Won't land immediately, at least I'll need to add a color option to the field. But curious if any tests fail. Old: <img width="1294" alt="image" src="https://github.com/pytorch/pytorch/assets/6355099/c3a750ed-5e54-4621-b2e4-be5481be15b6"> New: <img width="1303" alt="image" src="https://github.com/pytorch/pytorch/assets/6355099/3a1f1adc-6f3a-413e-8b87-ee53da9bf4ed"> Pull Request resolved: https://github.com/pytorch/pytorch/pull/128476 Approved by: https://github.com/ezyang	2024-06-13 23:39:04 +00:00
Zhengxu Chen	0444e89931	[export] Remove replace_sym_size_ops_pass (#128443 ) Summary: Not needed anymore. Test Plan: CI Differential Revision: D58429458 Pull Request resolved: https://github.com/pytorch/pytorch/pull/128443 Approved by: https://github.com/angelayi	2024-06-12 21:03:06 +00:00
Aaron Orenstein	038b927590	Flip default value for mypy disallow_untyped_defs [7/11] (#127844 ) See #127836 for details. Pull Request resolved: https://github.com/pytorch/pytorch/pull/127844 Approved by: https://github.com/oulgen ghstack dependencies: #127842, #127843	2024-06-08 18:49:45 +00:00
Pian Pawakapan	e505132797	[export] track TORCH_DYNAMO_DO_NOT_EMIT_RUNTIME_ASSERTS for export runtime asserts (#127554 ) Track TORCH_DYNAMO_DO_NOT_EMIT_RUNTIME_ASSERTS=1 in export so it doesn't omit runtime asserts. Differential Revision: D57978699 Pull Request resolved: https://github.com/pytorch/pytorch/pull/127554 Approved by: https://github.com/tugsbayasgalan	2024-06-05 04:16:54 +00:00
Tugsbayasgalan (Tugsuu) Manlaibaatar	b5e85b8ecc	Add deferred_runtime_assertion pass after run_decompositions (#127305 ) Summary: We also want to reinsert the deferred_runtime passes after run_decompositions as well Test Plan: CI Reviewed By: zhxchen17 Differential Revision: D57802237 Pull Request resolved: https://github.com/pytorch/pytorch/pull/127305 Approved by: https://github.com/BoyuanFeng	2024-05-31 05:45:28 +00:00
Aaron Gokaslan	3cb16ebf08	[BE]: Update ruff to 0.4.5 (#126979 ) Update ruff to 0.4.5 and addresses some false negatives that have been found in the newer version. Pull Request resolved: https://github.com/pytorch/pytorch/pull/126979 Approved by: https://github.com/ezyang	2024-05-24 18:38:35 +00:00
Matthew Hoffman	81277baa0c	Remove removed ruff rule TRY200 (#126256 ) My TOML linter is complaining that "TRY200" is not acceptable for the `tool.ruff.lint` schema. From the ruff docs: https://docs.astral.sh/ruff/rules/reraise-no-cause/ > This rule has been removed and its documentation is only available for historical reasons. > > This rule is identical to [B904](https://docs.astral.sh/ruff/rules/raise-without-from-inside-except/) which should be used instead. and we are currently explicitly ignoring B904. Pull Request resolved: https://github.com/pytorch/pytorch/pull/126256 Approved by: https://github.com/Skylion007	2024-05-17 16:31:05 +00:00
Ze Sheng	51e9bb8783	[Export] Allow ExportedProgram to take empty decomp table (#126142 ) As title. Still, `ep.run_decompositions()` will use `core_aten_decompositions()` by default. Cases like `ep.run_decompositions(get_decompositions([]))` will use empty table, and go with [`aot_autograd_decompositions`](`04877dc430/torch/_functorch/aot_autograd.py (L456-459)`) only. Motivation We didn't have a clean way to pass in an empty decomp table. Since we've made `pre_dispatch` export as default and `ep.run_decompositions` remains with `aot_export_module(..., pre_dispatch=False)`, allowing empty table would help make blank control easier. Testing CI Also looked through all the references in fbcode. The only concern I have is whether we should update [this example](`04877dc430/torch/onnx/_internal/exporter.py (L817)`) or not. Pull Request resolved: https://github.com/pytorch/pytorch/pull/126142 Approved by: https://github.com/angelayi	2024-05-16 00:31:23 +00:00
Tugsbayasgalan Manlaibaatar	0e419b9146	Fix graph partitioner and make runtime assertion work with submodules in export (#125793 ) Summary: This fix does three things: 1. When we add inputs from the partitioner to the top-level graph module, we insert them in the partitioner's order, which is not guaranteed to be the same as the original graph inputs. This PR fixes that. 2. When we replace autograd ops with HOP, we create new submodules and access their outputs via getitem calls. As a result, previous node names associated with getitem get updated, resulting in the graph being different from the produced graph signature. So I just update the graph signature accordingly. 3. We run the runtime_assertion pass before the autograd HOP pass because the constraints won't be populated correctly otherwise. Differential Revision: [D57130314](https://our.internmc.facebook.com/intern/diff/D57130314) Pull Request resolved: https://github.com/pytorch/pytorch/pull/125793 Approved by: https://github.com/zhxchen17	2024-05-09 18:13:46 +00:00
angelayi	8be4c1bc2f	[export] Add metadata for nodes insert_deferred_runtime_asserts (#125414 ) Fixes [internal error](https://fb.workplace.com/groups/1075192433118967/permalink/1416709435633930/). The issue is that the asserting nodes added in the `insert_deferred_runtime_assertion` pass do not contain metadata that the ExportedProgram requires the graph to have. One solution to fix this is to retrace the entire module, or another solution is to manually add back this metadata. This diff implements the latter solution (manually add back the metadata) through hooking into fx.graph's `create_node` function, and adding export-specific metadata for every node that is created. The reason I did this is so that the `insert_deferred_runtime_assertion` does not have to know about what metadata export wants. Pull Request resolved: https://github.com/pytorch/pytorch/pull/125414 Approved by: https://github.com/zhxchen17, https://github.com/BoyuanFeng	2024-05-07 23:15:21 +00:00
Tugsbayasgalan (Tugsuu) Manlaibaatar	3946fa1c12	Fix bug in get_update_constraint (#125194 ) Summary: Title Test Plan: CI Differential Revision: D56726321 Pull Request resolved: https://github.com/pytorch/pytorch/pull/125194 Approved by: https://github.com/pianpwk	2024-04-30 18:21:29 +00:00
Pian Pawakapan	93a319a4fc	[export] kill _process_constraints() (#123985 ) The process for populating range_constraints follows separate methods for non-strict (`make_constraints`), and strict (`_process_constraints`). The strict method is somewhat more convoluted, and the analysis that Dynamo performs for strict is already present as part of the non-strict process in make_constraints (produce_guards(), running the export constraint solver). This PR kills _process_constraints() and replaces calls with make_constraints, without duplicating the work that Dynamo already does. Pull Request resolved: https://github.com/pytorch/pytorch/pull/123985 Approved by: https://github.com/avikchaudhuri	2024-04-25 16:58:57 +00:00
Pian Pawakapan	e112792a69	[export] refactor _AddRuntimeAssertionsForInlineConstraintsPass (#124503 ) Summary: The current _AddRuntimeAssertionsForInlineConstraintsPass has 2 known issues caused by its use of torch.fx.Interpreter: 1. SymInt-related ops (e.g. item()) are executed, causing new Unbacked SymInts to appear in the graph during the pass. 2. The graph is reconstructed, and node names/indices can be different from before, causing mismatches with `module_call_graph`, and leading to issues during unflattening. This refactors the pass to use PassBase instead of _ExportPassBaseDeprecatedDoNotUse, only constructing new nodes for assertions. Test Plan: This pass is called on all strict-mode export calls with range_constraints, test that behavior remains unchanged. Differential Revision: D56360137 Pull Request resolved: https://github.com/pytorch/pytorch/pull/124503 Approved by: https://github.com/zhxchen17	2024-04-23 20:07:49 +00:00
Pian Pawakapan	10b9d4d19c	[export] handle Dim.lower = 0, 1 for ep.run_decompositions() (#123602 ) Summary: With pre-dispatch export and ep.run_decompositions(), range constraints are updated by looking at ShapeEnv.var_to_range. However, the lower bounds on these may be incorrect - analysis on un-specialized symbols is done with a lower bound of 2, which can mismatch user-specified bounds (which may be 0 or 1). This updates `_get_updated_range_constraints()` to use the old range constraints if possible. Test Plan: Existing pre-dispatch/dynamic shapes test case. Differential Revision: D55899872 Pull Request resolved: https://github.com/pytorch/pytorch/pull/123602 Approved by: https://github.com/tugsbayasgalan	2024-04-19 21:29:36 +00:00
Tugsbayasgalan Manlaibaatar	dd3cea3291	Fix derived dim bugs in ep.run_decomp (#123326 ) Differential Revision: [D55730289](https://our.internmc.facebook.com/intern/diff/D55730289) Pull Request resolved: https://github.com/pytorch/pytorch/pull/123326 Approved by: https://github.com/avikchaudhuri	2024-04-17 04:00:55 +00:00
Pian Pawakapan	d0ccf599cc	[export] Restore original placeholder names (part 2: higher-order-op subgraph naming) (#123587 ) Summary: note: breaking the original diff [D55225818](https://www.internalfb.com/diff/D55225818) into 3 parts (top-level renaming, higher-order-op subgraphs, constant input de/serialization) because of its size. Stacked PR to restore original names to placeholder nodes, replacing the default names arg0_1, arg1_1, ... This PR propagates node names to higher-order-op subgraph placeholders, retaining the top-level names and handling naming collisions by suffixing other non-placeholder nodes in the subgraph with an index. This is the same handling as in fx.Graph/fx.Node, but implemented separately as a pass. Since the input schemas of HOO subgraphs are very different, they are enumerated in _name_hoo_subgraph_placeholders(). Currently cond, map_impl, and wrap_with_set_grad_enabled are handled, but other ops can be easily added. Test Plan: verification checks on placeholder names for all export() calls, unit test in test/export/test_export.py Differential Revision: D55456749 Pull Request resolved: https://github.com/pytorch/pytorch/pull/123587 Approved by: https://github.com/angelayi	2024-04-11 22:40:46 +00:00
PyTorch MergeBot	cf8139b956	Revert "Fix derived dim bugs in ep.run_decomp (#123326 )" This reverts commit 43228742820d8045a3980826f3ef85158dc9032c. Reverted https://github.com/pytorch/pytorch/pull/123326 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/123326#issuecomment-2048389042))	2024-04-10 20:35:01 +00:00
Tugsbayasgalan Manlaibaatar	4322874282	Fix derived dim bugs in ep.run_decomp (#123326 ) Differential Revision: [D55730289](https://our.internmc.facebook.com/intern/diff/D55730289) Pull Request resolved: https://github.com/pytorch/pytorch/pull/123326 Approved by: https://github.com/avikchaudhuri	2024-04-10 18:54:03 +00:00
Pian Pawakapan	d7f23f6826	[export] Restore original placeholder names (part 1: top-level renaming) (#122904 ) Summary: This PR restores original names to placeholder nodes, replacing the default names arg0_1, arg1_1, and so on. User inputs now follow the signature of mod.forward(), for example forward(x, y) produces nodes x, y. If the tensors are nested in dictionaries, lists, tuples, or dataclasses, the names are a concatenation of the path to the tensor, e.g. x = {'a': torch.randn(4), 'b': [torch.randn(4), torch.randn(4)]} produces nodes x_a, x_b_0, x_b_1. Parameters, buffers, constants, and custom objects follow the FQN of the object, prefixed by "p", "b", "c", and "obj" respectively. For example, self.bar.l0.weight gets you p_bar_l0_weight. Effect tokens are named token_1, token_2, and so on, since they are not grounded in model inputs or named attributes. note: breaking the original diff into 3 parts (top-level renaming, higher-order-op subgraphs, constant input de/serialization) because of its size. 
Examples: ```python # params, buffers, constants, inputs, torch.cond ExportedProgram: class GraphModule(torch.nn.Module): def forward(self, p_l0_weight: "f32[4, 4]", p_l0_bias: "f32[4]", c_alpha: "f32[4]", b_beta: "f32[4]", x_0_a: "f32[4, 4]", y: "f32[4, 4]"): # No stacktrace found for following nodes mul: "f32[4, 4]" = torch.ops.aten.mul.Tensor(x_0_a, x_0_a) t: "f32[4, 4]" = torch.ops.aten.t.default(p_l0_weight); p_l0_weight = None addmm: "f32[4, 4]" = torch.ops.aten.addmm.default(p_l0_bias, y, t); p_l0_bias = y = t = None return addmm # model code class Bar(torch.nn.Module): def forward(self, x): return x * x class Foo(torch.nn.Module): def __init__(self): super().__init__() self.bar = Bar() self.l0 = torch.nn.Linear(4, 4) self.alpha = torch.randn(4) self.register_buffer('beta', torch.randn(4)) def forward(self, x, y): x = x[0]['a'] mul = self.bar(x) z1 = self.l0(y) return z1 # custom objects, dataclasses, tokens, constant inputs ExportedProgram: class GraphModule(torch.nn.Module): def forward(self, token_1: "f32[0]", obj_attr, data_x: "f32[4, 4]", data_y: "f32[4, 4]", mode): # No stacktrace found for following nodes mul: "f32[4, 4]" = torch.ops.aten.mul.Scalar(data_x, 30); data_x = None div: "f32[4, 4]" = torch.ops.aten.div.Tensor_mode(data_y, 1.0, rounding_mode = 'floor'); data_y = None add: "f32[4, 4]" = torch.ops.aten.add.Tensor(mul, div); mul = div = None with_effects = torch._higher_order_ops.effects.with_effects(token_1, torch.ops._TorchScriptTesting.takes_foo.default, obj_attr, add); token_1 = obj_attr = add = None getitem: "f32[0]" = with_effects[0] getitem_1: "f32[4, 4]" = with_effects[1]; with_effects = None return (getitem, getitem_1) # model code class Foo(torch.nn.Module): def __init__(self): super().__init__() self.attr = torch.classes._TorchScriptTesting._Foo(10, 20) def forward(self, data, a=1.0, mode="floor"): x = self.attr.add_tensor(data.x) + torch.div(data.y, a, rounding_mode=mode) x = torch.ops._TorchScriptTesting.takes_foo(self.attr, x) 
return x @dataclass class DataClass: x: Tensor y: Tensor register_dataclass_as_pytree_node( DataClass, serialized_type_name="test.DataClass" ) args = (DataClass(x=torch.randn(4, 4), y=torch.randn(4, 4)), ) kwargs = {'mode': 'floor'} ep = torch.export.export(Foo(), args, kwargs, strict=False) ``` Test Plan: verification checks on placeholder names for all export() calls, unit test in test/export/test_export.py Differential Revision: D55456418 Pull Request resolved: https://github.com/pytorch/pytorch/pull/122904 Approved by: https://github.com/angelayi, https://github.com/thiagocrepaldi	2024-04-05 18:56:00 +00:00
Tugsbayasgalan Manlaibaatar	1ea6d3a9b4	Fix conv decomp when running to core-aten (#123283 ) Differential Revision: [D55709374](https://our.internmc.facebook.com/intern/diff/D55709374) Pull Request resolved: https://github.com/pytorch/pytorch/pull/123283 Approved by: https://github.com/angelayi	2024-04-04 01:14:09 +00:00
Pian Pawakapan	3f99306452	[export] Remove from_export flag (#122500 ) Summary: The flag from_export was incorrectly included in a previous diff (https://www.internalfb.com/diff/D54314379) - it was intended for helping with ExportedProgram verification, but was no longer needed in the final implementation. Test Plan: Changes no functionality, test/export already covers everything Differential Revision: D55205857 Pull Request resolved: https://github.com/pytorch/pytorch/pull/122500 Approved by: https://github.com/avikchaudhuri, https://github.com/zhxchen17	2024-03-22 22:55:14 +00:00
Pian Pawakapan	3bd38928ba	[export] Improve consistency for nn_module_stack metadata, add checks to _trace.py (#120661 ) We would like to improve consistency for nn_module_stack metadata in torch.export. This PR ensures that all tests in test/export/test_export.py satisfy the following constraints: - Remove nn_module_stack for all placeholder & output nodes, for all modules and submodules - Ensure nn_module_stack is present for all other node types for the top-level module (there is still an issue with torch.cond submodules having empty fields) - Add these checks to _export() in _trace.py (we would add this in the Verifier, but downstream apps construct ExportedPrograms separate from _export(), and metadata may not be maintained there) Pull Request resolved: https://github.com/pytorch/pytorch/pull/120661 Approved by: https://github.com/avikchaudhuri	2024-03-16 21:44:52 +00:00
angelayi	ef25d83a62	[export] Add serialization support for tokens (#121552 ) Differential Revision: [D54906766](https://our.internmc.facebook.com/intern/diff/D54906766) Pull Request resolved: https://github.com/pytorch/pytorch/pull/121552 Approved by: https://github.com/zhxchen17	2024-03-15 16:15:11 +00:00
Angela Yi	4b49bc19e8	[export][reland] Disable exported_program.__call__ (#120019 ) Summary: Reland of D53075378 / https://github.com/pytorch/pytorch/pull/119466 Test Plan: CI Differential Revision: D53827930 Pull Request resolved: https://github.com/pytorch/pytorch/pull/120019 Approved by: https://github.com/ydwu4	2024-03-05 05:29:46 +00:00
Michael Suo	12f724c779	[export] preserve constant fqn (#120664 ) Summary: Previously we were renaming constants to `lifted_constant_tensor0` or equivalent. This PR changes things so that the constants retain the same FQN as in the original eager module. Actually, `symbolic_trace` already is supposed to do this, but the code path is not triggered when used from `make_fx`, since we don't pass an actual `nn.Module` instance to `trace()`, but rather a multiply-wrapped-functionalized-lambda-thing. So, I reproduced the essential logic outside of make_fx, at the export layer. Test Plan: added a unit test Differential Revision: D54221616 Pull Request resolved: https://github.com/pytorch/pytorch/pull/120664 Approved by: https://github.com/SherlockNoMad	2024-02-27 06:35:51 +00:00
PyTorch MergeBot	65fd8b6730	Revert "[export] Disable exported_program.__call__ (#119466 )" This reverts commit c26884f06345bf61e0843d13db84e76236ff6142. Reverted https://github.com/pytorch/pytorch/pull/119466 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/119466#issuecomment-1947384298))	2024-02-15 21:42:32 +00:00
Angela Yi	c26884f063	[export] Disable exported_program.__call__ (#119466 ) Summary: `ExportedProgram` is an artifact produced by torch.export, containing the graph that is exported, along with other attributes about the original program such as the graph signature, state dict, and constants. One slightly confusing thing that users run into is that they treat the `ExportedProgram` as a `torch.nn.Module`, since the object is callable. However, as we do not plan to support all features that `torch.nn.Module`s have, like hooks, we want to create a distinction between it and the `ExportedProgram` by removing the `__call__` method. Instead users can create a proper `torch.nn.Module` through `exported_program.module()` and use that as a callable. Test Plan: CI Differential Revision: D53075378 Pull Request resolved: https://github.com/pytorch/pytorch/pull/119466 Approved by: https://github.com/zhxchen17, https://github.com/thiagocrepaldi	2024-02-15 08:49:34 +00:00
Han Qi	757201c213	Refactor ExportedProgram to expose the functions for pre and postprocessing (#119513 ) Reason: Consumers of ExportedProgram might choose to further lower exported_program.graph_module to something else. They will then need to set up the calling convention to call it. This refactor concentrates the calling-convention logic in one place so that it can be reused. Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/119513 Approved by: https://github.com/zhxchen17	2024-02-12 17:22:27 +00:00
Angela Yi	c3e0836084	[export] Remove CallSpec (#117671 ) Summary: This is not really being used anywhere Test Plan: CI Differential Revision: D52842563 Pull Request resolved: https://github.com/pytorch/pytorch/pull/117671 Approved by: https://github.com/avikchaudhuri, https://github.com/zhxchen17	2024-02-08 17:19:03 +00:00
Edward Z. Yang	3f0fd36835	Introduce size oblivious guards (#118579 ) Fixes https://github.com/pytorch/pytorch/issues/117361 The implementation here slightly diverges from what was proposed in the issue, so I will recap what this PR is doing here. Today, when doing computations involving size-like unbacked SymInts, we assume for all operations that the compile time range of the integer is `[2, inf]`, even though at runtime we also accept zero and one. This PR removes the carte blanche assumption, and instead does the analysis in a much more limited and controlled fashion: only for guards which we have designated as "size oblivious" are we willing to do the analysis under the assumption that the range of all size-like unbacked SymInts is `[2, inf]`; otherwise, we will faithfully only do analysis with `[0, inf]` (or whatever the user provided) bounds. The infra pieces of this PR are: * Remove runtime_var_to_range from torch/fx/experimental/symbolic_shapes.py; modify `_constrain_range_for_size` to refine the range without clamping min to 2, and instead add the symbol to a `size_like` set in the ShapeEnv * When evaluating an expression, if the expression is requested to be evaluated in a `size_oblivious` way, we attempt to statically compute the value of the expression with the assumption that all symbols in `size_like` are updated to assume that they are `>= 2`. * Add Python and C++ APIs for guarding on a SymBool in a size-oblivious way. In C++, I also need to add some helpers for performing symbolic comparisons, since the stock comparisons immediately specialize in the "normal" way. The rest of the changes of the PR are marking various spots in PyTorch framework code as size oblivious, based on what our current test suite exercises. 
As you review the places where we have marked things as size oblivious, it may become clear why I ended up not opting for the "designate a branch as the default branch when it's not statically obvious which way to go": for some of the conditions, this answer is rather non-obvious. I think potentially there is another refinement on top of this PR, which is something like "I don't care if you can't figure it out with ValueRange analysis, go down this path anyway if there are unbacked sizes involved." But even if we add this API, I think we are obligated to attempt the ValueRange analysis first, since it can lead to better outcomes sometimes (e.g., we are able to figure out that something is contiguous no matter what the unbacked size is.) When is it permissible to mark something as size oblivious? Heuristically, it is OK anywhere in framework code if it gets you past a guard on unbacked SymInt problem. It is somewhat difficult to provide a true semantic answer, however. In particular, these annotations don't have any observational equivalence guarantee; for example, if I have `torch.empty(u0, 1).squeeze()`, we will always produce a `[u0]` size tensor, even though if `u0 == 1` PyTorch will actually produce a `[]` size tensor. The argument that I gave to Lezcano is that we are in fact defining an alternate semantics for a "special" size = 0, 1, for which we have these alternate eager mode semantics. In particular, suppose that we have a constant `special1` which semantically denotes 1, but triggers alternate handling rules. We would define `torch.empty(special1, 1).squeeze()` to always produce a `[special1]` size tensor, making its semantics coincide with unbacked SymInt semantics. In this model, the decision to designate guards as size oblivious is simply a user API question: you put them wherever you need some handling for special1!
As we conservatively error out whenever it is not obvious what `special1` semantics should be, it is always valid to expand these semantics to cover more cases (although you can always choose the wrong semantics!) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/118579 Approved by: https://github.com/eellison, https://github.com/lezcano	2024-02-06 19:45:32 +00:00
Michael Suo	bf4e171539	[export] support non-persistent buffers (#118969 ) Summary: X-link: https://github.com/pytorch/executorch/pull/1817 Basic support for non-persistent buffers, which are buffers that do not show up in the state dict. One weird twist is that most of our other systems (FX, aot_export, dynamo) have completely buggy handling of non-persistent buffers. I tried to go on a wild goose chase to fix them all, but it got to be too much. So I introduced some sad rewrite passes in `_export` to make the final state dict correctly align with the original module's state dict. This exposed some bugs/ambiguous handling of parameters/buffers in existing test code. For example, `TestSaveLoad.test_save_buffer` traced over a module that was not in the root module hierarchy and caused some weird behavior. I think we should error explicitly on use cases like this: https://github.com/pytorch/pytorch/issues/118410. For now I just rewrote the tests or skipped them. As a side effect, this diff tightened up quite a few sloppy behaviors around state dict handling: - Tensor attributes were getting promoted to be buffers (bad!) - Tracing through a module not in the children of the root module would add its parameters/buffers to the state dict (bad!) This behavior is unlikely to show up in user code since the model would be totally broken, but did show up in a bunch of tests. #buildmore Test Plan: unit tests sandcastle Differential Revision: D53340041 Pull Request resolved: https://github.com/pytorch/pytorch/pull/118969 Approved by: https://github.com/guangy10, https://github.com/huydhn, https://github.com/titaiwangms	2024-02-02 19:16:08 +00:00
Angela Yi	53da422582	[export] Move _create_graph_module_for_export to torch/export (#118893 ) Summary: I have to keep the torch/_export one to not break executorch... Test Plan: CI Reviewed By: avikchaudhuri Differential Revision: D52842750 Pull Request resolved: https://github.com/pytorch/pytorch/pull/118893 Approved by: https://github.com/zhxchen17	2024-02-02 16:40:01 +00:00


113 Commits