pytorch

mirror of https://github.com/pytorch/pytorch.git synced 2025-10-21 13:44:15 +08:00

Author	SHA1	Message	Date
PyTorch MergeBot	e50dc40d28	Revert "Update gm.print_readable to include Annotation (#165397 )" This reverts commit 7a657700131f31577544e93587eb339618677e97. Reverted https://github.com/pytorch/pytorch/pull/165397 on behalf of https://github.com/malfet due to I don't know how/why, but it breaks windows tests, see `2e22b1a61e/1` ([comment](https://github.com/pytorch/pytorch/pull/165397#issuecomment-3417428128))	2025-10-17 22:35:50 +00:00
Sherlock Huang	7a65770013	Update gm.print_readable to include Annotation (#165397 ) Sample output ``` [rank0]: # Annotation: {'compile_with_inductor': 'flex_attention'} File: /data/users/bahuang/pytorch/torch/nn/attention/flex_attention.py:1490 in flex_attention, code: out, lse, max_scores = flex_attention_hop( [rank0]: score_mod_2 = self.score_mod_2 [rank0]: mask_fn_2 = self.mask_fn_2 [rank0]: flex_attention_1 = torch.ops.higher_order.flex_attention(xq_5, xk_5, xv_3, score_mod_2, (2048, 2048, g____import_torchtitan_dot_models_dot_attention___flex_attention_block_masks___block_causal___none___kv_num_blocks, g____import_torchtitan_dot_models_dot_attention___flex_attention_block_masks___block_causal___none___kv_indices, g____import_torchtitan_dot_models_dot_attention___flex_attention_block_masks___block_causal___none___full_kv_num_blocks, g____import_torchtitan_dot_models_dot_attention___flex_attention_block_masks___block_causal___none___full_kv_indices, g____import_torchtitan_dot_models_dot_attention___flex_attention_block_masks___block_causal___none___q_num_blocks, g____import_torchtitan_dot_models_dot_attention___flex_attention_block_masks___block_causal___none___q_indices, g____import_torchtitan_dot_models_dot_attention___flex_attention_block_masks___block_causal___none___full_q_num_blocks, g____import_torchtitan_dot_models_dot_attention___flex_attention_block_masks___block_causal___none___full_q_indices, 128, 128, mask_fn_2), 0.25, {'PRESCALE_QK': False, 'ROWS_GUARANTEED_SAFE': False, 'BLOCKS_ARE_CONTIGUOUS': False, 'WRITE_DQ': True, 'OUTPUT_LOGSUMEXP': True, 'OUTPUT_MAX': False}, (), (g____import_torchtitan_dot_models_dot_attention___flex_attention_block_masks___block_causal___none___mask_mod___closure___0_cell_contents,)); xq_5 = xk_5 = xv_3 = score_mod_2 = mask_fn_2 = None [rank0]: out_2: "bf16[8, 4, 2048, 16]" = flex_attention_1[0]; flex_attention_1 = None ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/165397 Approved by: 
https://github.com/yushangdi, https://github.com/anijain2305	2025-10-17 18:35:18 +00:00
Animesh Jain	f3683453ae	[compile] Regional inductor compilation with fx.annotate (#164776 ) This PR introduces a way to compile a region of FX graph using `fx.traceback.annotate`. ### UX 1) In the user code, mark the region that you want to be compiled with inductor using `with fx_traceback.annotate({"compile_with_inductor": 0})`. As of now, we just rely on the string `compile_with_inductor` and ignore the integer. As the needs arise, we can update the logic. Example ``` def fn(x, y): sin = torch.sin(x) with fx_traceback.annotate({"compile_with_inductor": 0}): mul = sin * y add = mul + 1 return torch.sin(add) ``` 2) You have to instruct the compiler to use the annotations with `compile_fx_annotated_nodes_with_inductor` transformation. This is somewhat controversial, and a user might expect that just setting annotation is enough. But for now to control the blast radius, we need to explicitly do this. One such example is ``` # Set the fw and bw compiler of aot_autograd to `compile_fx_annotated_nodes_with_inductor` def aot_eager_regional_inductor(): return aot_autograd( fw_compiler=compile_fx_annotated_nodes_with_inductor, bw_compiler=compile_fx_annotated_nodes_with_inductor, ) ``` 3) Fixable in short-term - You have to wrap the user code in `torch.fx.traceback.preserve_node_meta` to ensure that annotations are propagated to the compiler. This is fixable, just need to make CI happy. ### Implementation 1) Relies on `CapabilityBasedPartitioner` to "scoop" out regions based on annotations, and then create subgraphs in the main graph. 
2) Call `torch._inductor.standalone_compile` on these subgraphs, and jam the returned callable into the FX graph at the place of call_module Resulting graph looks something like this - search for `torch__inductor_standalone_compile_inner` Forward graph ``` class GraphModule(torch.nn.Module): def forward(self, primals_1: "f32[10]", primals_2: "f32[10]"): # File: /data/users/anijain/pytorch2/test/dynamo/test_regional_inductor.py:64 in fn, code: sin = torch.sin(x) sin: "f32[10]" = torch.ops.aten.sin.default(primals_1) # No stacktrace found for following nodes inner = torch__inductor_standalone_compile_inner(sin, primals_2) # File: /data/users/anijain/pytorch2/test/dynamo/test_regional_inductor.py:68 in fn, code: add = mul + 1 getitem: "f32[10]" = inner[0]; inner = None # File: /data/users/anijain/pytorch2/test/dynamo/test_regional_inductor.py:70 in fn, code: return torch.sin(add) sin_1: "f32[10]" = torch.ops.aten.sin.default(getitem) return (sin_1, primals_1, primals_2, sin, getitem) ``` Backward graph ``` class GraphModule(torch.nn.Module): def forward(self, primals_1: "f32[10]", primals_2: "f32[10]", sin: "f32[10]", add: "f32[10]", tangents_1: "f32[10]"): # File: /data/users/anijain/pytorch2/test/dynamo/test_regional_inductor.py:64 in fn, code: sin = torch.sin(x) cos_1: "f32[10]" = torch.ops.aten.cos.default(primals_1); primals_1 = None # File: /data/users/anijain/pytorch2/test/dynamo/test_regional_inductor.py:70 in fn, code: return torch.sin(add) cos: "f32[10]" = torch.ops.aten.cos.default(add); add = None mul_1: "f32[10]" = torch.ops.aten.mul.Tensor(tangents_1, cos); tangents_1 = cos = None # No stacktrace found for following nodes inner = torch__inductor_standalone_compile_inner(mul_1, sin, primals_2); mul_1 = sin = primals_2 = None # File: /data/users/anijain/pytorch2/test/dynamo/test_regional_inductor.py:67 in fn, code: mul = sin * y getitem: "f32[10]" = inner[0] getitem_1: "f32[10]" = inner[1]; inner = None # File: 
/data/users/anijain/pytorch2/test/dynamo/test_regional_inductor.py:64 in fn, code: sin = torch.sin(x) mul_4: "f32[10]" = torch.ops.aten.mul.Tensor(getitem_1, cos_1); getitem_1 = cos_1 = None return (mul_4, getitem) ``` ### Some issues raised in the HOP meeting 1) CSE will not differentiate nodes with different custom meta and will do the wrong thing. 2) SAC - The recomputed forward will be smaller than the forward. Will we compile a smaller region then? 3) What happens if you have an op in the middle which does not disturb the topology; is it still 1 subgraph? 4) What happens with the nesting of `fx_traceback.annotate`? Are there any ordering requirements? 5) What are we going to use the annotations for? a) compile flex b) streams c) nn.Module info to organize MoE components for pipelining d) PP stages e) Rename graph nodes for more debugging f) No nested regional compile Pull Request resolved: https://github.com/pytorch/pytorch/pull/164776 Approved by: https://github.com/SherlockNoMad ghstack dependencies: #165188	2025-10-13 22:22:20 +00:00
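The partitioning idea in #164776 (group annotated nodes into regions, then compile only those regions) can be pictured with a toy model. The sketch below is not `CapabilityBasedPartitioner` and does not use PyTorch: it assumes a graph is simply an ordered list of nodes with a metadata dict, and `Node` and `scoop_regions` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field

ANNOTATION_KEY = "compile_with_inductor"  # the key named in the commit message


@dataclass
class Node:
    """Hypothetical stand-in for a graph node: a name plus a metadata dict."""
    name: str
    meta: dict = field(default_factory=dict)


def scoop_regions(nodes):
    """Group consecutive nodes by whether they carry the annotation key.

    Returns (is_annotated, [node names]) pairs in program order. Only the
    presence of the key is checked; its value is ignored, as the commit
    message says is currently the case.
    """
    regions = []
    for node in nodes:
        annotated = ANNOTATION_KEY in node.meta
        if regions and regions[-1][0] == annotated:
            regions[-1][1].append(node.name)
        else:
            regions.append((annotated, [node.name]))
    return regions


# Mirrors the `fn` example: sin, then mul and add inside the annotated block, then sin again.
graph = [
    Node("sin"),
    Node("mul", {ANNOTATION_KEY: 0}),
    Node("add", {ANNOTATION_KEY: 0}),
    Node("sin_1"),
]
regions = scoop_regions(graph)
print(regions)
```

In this toy model the `mul` and `add` nodes end up in one annotated region, which corresponds loosely to the single `torch__inductor_standalone_compile_inner` call in the forward graph shown in the commit message.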
Laith Sakka	7f2a902ea2	more sizelike deprecation (#164889 ) Remove expect_size C++ bindings and usages Pull Request resolved: https://github.com/pytorch/pytorch/pull/164889 Approved by: https://github.com/mlazos ghstack dependencies: #164884, #164885, #164886, #164887, #164888	2025-10-10 03:45:06 +00:00
Laith Sakka	2035f6b2e6	use check_size instead of check_is_size in ops.py (#164668 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/164668 Approved by: https://github.com/angelayi ghstack dependencies: #164664, #164665, #164667	2025-10-08 14:23:38 +00:00
Animesh Jain	6b1900c22f	[dynamo][hops] Remove const outputs from the speculated subgraph (#161355 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/161355 Approved by: https://github.com/zou3519	2025-09-04 18:52:01 +00:00
Animesh Jain	5805c4210b	[invoke_subgraph][inductor] Thread graphsafe rng input states for hops (#160713 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/160713 Approved by: https://github.com/eellison	2025-08-21 20:41:29 +00:00
Yidi Wu	fc25c68f20	[hop][exc] make UncapturedHigherOrderOpError print user code and avoid re-raise (#159296 ) After the change, the error stack trace includes the user code stack and is collapsed into a single traceback (without the scroll-up message). For example: ```python class Test(torch.nn.Module): def forward(self, c, x): def cond_fn(c, x): return c > 0 and x.size(0) < 20 def body_fn(c, x): return c - 1, x.sin() return torch._higher_order_ops.while_loop(cond_fn, body_fn, (c, x)) ``` Now gives the following error message: ```python Traceback (most recent call last): File "/home/yidi/local/pytorch/test/inductor/test_control_flow.py", line 1705, in test_while_loop_size_mismatch_tensor_expansion self._run_test( ~~~~~~~~~~~~~~^ model=WhileLoopModels.SizeMismatchTensorExpansion(), ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ...<2 lines>... dynamic=dynamic, ^^^^^^^^^^^^^^^^ ) ^ File "/home/yidi/local/pytorch/test/inductor/test_control_flow.py", line 1417, in _run_test result = model(inputs_with_counters) File "/home/yidi/local/pytorch/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl return self._call_impl(args, *kwargs) ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^ File "/home/yidi/local/pytorch/torch/nn/modules/module.py", line 1784, in _call_impl return forward_call(args, *kwargs) File "/home/yidi/local/pytorch/test/inductor/test_control_flow.py", line 1053, in forward return torch._higher_order_ops.while_loop(cond_fn, body_fn, (c, x)) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/yidi/local/pytorch/torch/_higher_order_ops/while_loop.py", line 176, in while_loop return torch.compile( ~~~~~~~~~~~~~~ _while_loop_op_wrapper, backend=backend, fullgraph=True ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ )(flat_cond_fn, flat_body_fn, tuple(flat_inputs), tuple()) ~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/yidi/local/pytorch/torch/_dynamo/eval_frame.py", line 804, in compile_wrapper return fn(args, *kwargs) 
File "/home/yidi/local/pytorch/torch/_dynamo/convert_frame.py", line 1595, in __call__ result = self._torchdynamo_orig_backend( frame, cache_entry, self.hooks, frame_state, skip=1 ) File "/home/yidi/local/pytorch/torch/_dynamo/convert_frame.py", line 1353, in __call__ result = self._inner_convert( frame, cache_entry, hooks, frame_state, skip=skip + 1 ) File "/home/yidi/local/pytorch/torch/_dynamo/convert_frame.py", line 682, in __call__ result = _compile( frame.f_code, ...<16 lines>... convert_frame_box=self._box, ) File "/home/yidi/local/pytorch/torch/_dynamo/convert_frame.py", line 1172, in _compile guarded_code = compile_inner(code, one_graph, hooks, transform) File "/home/yidi/local/pytorch/torch/_utils_internal.py", line 98, in wrapper_function return function(args, *kwargs) File "/home/yidi/local/pytorch/torch/_dynamo/convert_frame.py", line 858, in compile_inner return _compile_inner(code, one_graph, hooks, transform) File "/home/yidi/local/pytorch/torch/_dynamo/convert_frame.py", line 897, in _compile_inner out_code = transform_code_object(code, transform) File "/home/yidi/local/pytorch/torch/_dynamo/bytecode_transformation.py", line 1461, in transform_code_object transformations(instructions, code_options) ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/yidi/local/pytorch/torch/_dynamo/convert_frame.py", line 300, in _fn return fn(args, *kwargs) File "/home/yidi/local/pytorch/torch/_dynamo/convert_frame.py", line 818, in transform tracer.run() ~~~~~~~~~~^^ File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 3528, in run super().run() ~~~~~~~~~~~^^ File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 1372, in run while self.step(): ~~~~~~~~~^^ File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 1276, in step self.dispatch_table[inst.opcode](self, inst) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^ File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 852, in wrapper return 
inner_fn(self, inst) File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 2240, in CALL_FUNCTION_EX self.call_function(fn, argsvars.items, kwargsvars) ~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 1200, in call_function self.push(fn.call_function(self, args, kwargs)) # type: ignore[arg-type] ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^ File "/home/yidi/local/pytorch/torch/_dynamo/variables/lazy.py", line 212, in realize_and_forward return getattr(self.realize(), name)(args, *kwargs) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^ File "/home/yidi/local/pytorch/torch/_dynamo/variables/higher_order_ops.py", line 91, in graph_break_as_hard_error raise exc.with_traceback(sys.exc_info()[2]) from None File "/home/yidi/local/pytorch/torch/_dynamo/variables/higher_order_ops.py", line 77, in graph_break_as_hard_error return fn(args, *kwargs) File "/home/yidi/local/pytorch/torch/_dynamo/variables/higher_order_ops.py", line 1287, in call_function ) = speculate_subgraph( ~~~~~~~~~~~~~~~~~~^ tx, ^^^ ...<33 lines>... 
supports_aliasing=self.supports_aliasing, ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ) ^ File "/home/yidi/local/pytorch/torch/_dynamo/variables/higher_order_ops.py", line 877, in speculate_subgraph raise ex File "/home/yidi/local/pytorch/torch/_dynamo/variables/higher_order_ops.py", line 718, in speculate_subgraph output = f.call_function(tx, args, sub_kwargs) File "/home/yidi/local/pytorch/torch/_dynamo/variables/functions.py", line 580, in call_function return super().call_function(tx, args, kwargs) ~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^ File "/home/yidi/local/pytorch/torch/_dynamo/variables/functions.py", line 334, in call_function return tx.inline_user_function_return(self, [self.self_args(), args], kwargs) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 1217, in inline_user_function_return return InliningInstructionTranslator.inline_call(self, fn, args, kwargs) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 3733, in inline_call return tracer.inline_call_() ~~~~~~~~~~~~~~~~~~~^^ File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 3936, in inline_call_ self.run() ~~~~~~~~^^ File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 1372, in run while self.step(): ~~~~~~~~~^^ File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 1276, in step self.dispatch_table[inst.opcode](self, inst) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^ File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 852, in wrapper return inner_fn(self, inst) File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 2240, in CALL_FUNCTION_EX self.call_function(fn, argsvars.items, kwargsvars) ~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 1200, in 
call_function self.push(fn.call_function(self, args, kwargs)) # type: ignore[arg-type] ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^ File "/home/yidi/local/pytorch/torch/_dynamo/variables/lazy.py", line 212, in realize_and_forward return getattr(self.realize(), name)(args, *kwargs) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^ File "/home/yidi/local/pytorch/torch/_dynamo/variables/functions.py", line 580, in call_function return super().call_function(tx, args, kwargs) ~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^ File "/home/yidi/local/pytorch/torch/_dynamo/variables/functions.py", line 334, in call_function return tx.inline_user_function_return(self, [self.self_args(), args], kwargs) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 1217, in inline_user_function_return return InliningInstructionTranslator.inline_call(self, fn, args, kwargs) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 3733, in inline_call return tracer.inline_call_() ~~~~~~~~~~~~~~~~~~~^^ File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 3936, in inline_call_ self.run() ~~~~~~~~^^ File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 1372, in run while self.step(): ~~~~~~~~~^^ File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 1276, in step self.dispatch_table[inst.opcode](self, inst) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^ File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 830, in inner unimplemented_v2( ~~~~~~~~~~~~~~~~^ gb_type="Data-dependent branching", ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ...<5 lines>... 
], ^^ ) ^ File "/home/yidi/local/pytorch/torch/_dynamo/exc.py", line 580, in unimplemented_v2 raise Unsupported(msg) torch._dynamo.exc.UncapturedHigherOrderOpError: while_loop doesn't work unless it is captured completely with torch.compile. Got Data-dependent branching Explanation: Detected data-dependent branching (e.g. `if my_tensor.sum() > 0:`). Dynamo does not support tracing dynamic control flow. Hint: This graph break is fundamental - it is unlikely that Dynamo will ever be able to trace through your code. Consider finding a workaround. Hint: Use `torch.cond` to express dynamic control flow. Developer debug context: attempted to jump with TensorVariable() For more details about this graph break, please visit: https://pytorch-labs.github.io/compile-graph-break-site/gb/gb0170.html from user code: File "/home/yidi/local/pytorch/torch/_higher_order_ops/while_loop.py", line 167, in _while_loop_op_wrapper return while_loop_op(args, *kwargs) File "/home/yidi/local/pytorch/torch/_higher_order_ops/while_loop.py", line 137, in flat_cond_fn return cond_fn(carried, *additional) File "/home/yidi/local/pytorch/test/inductor/test_control_flow.py", line 1047, in cond_fn return c > 0 and x.size(0) < 20 Set TORCHDYNAMO_VERBOSE=1 for the internal stack trace (please do this especially if you're reporting a bug to PyTorch). For even more developer context, set TORCH_LOGS="+dynamo" To execute this test, run the following from the base repo dir: python test/inductor/test_control_flow.py WhileLoopTests.test_while_loop_size_mismatch_tensor_expansion_device_cpu_dynamic_False This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/159296 Approved by: https://github.com/zou3519	2025-08-11 22:48:10 +00:00
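The "Data-dependent branching" error above is triggered by the `and` in `cond_fn`. A minimal sketch of why, using plain Python rather than Dynamo: `and` has to decide which operand to return, so it calls `bool()` on its left operand, whereas an operator such as `&` can stay symbolic. The `Traced` class is a hypothetical stand-in for a value known only at trace time; it is not a PyTorch type.

```python
class Traced:
    """Hypothetical stand-in for a value that is only known symbolically."""

    def __init__(self, expr):
        self.expr = expr

    def __gt__(self, other):
        return Traced(f"({self.expr} > {other!r})")

    def __and__(self, other):
        other_expr = other.expr if isinstance(other, Traced) else repr(other)
        return Traced(f"({self.expr} & {other_expr})")

    def __bool__(self):
        # Python's `and`, `or`, `if` and `while` all call bool() on their operand.
        raise TypeError(f"cannot branch on symbolic value {self.expr}")


c = Traced("c")

# `&` builds a new symbolic expression without asking for a concrete truth value.
combined = (c > 0) & True
print(combined.expr)

# `and` must pick one operand to return, so it calls bool() and fails here.
try:
    (c > 0) and True
    branched = True
except TypeError as err:
    branched = False
    print(err)
```

This only illustrates the Python semantics behind the message "attempted to jump with TensorVariable()"; whether a given rewrite of `cond_fn` is accepted by `while_loop` depends on PyTorch itself.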
ghostspiders	af10f1f86c	Fix requires_cuda to requires_cuda_and_triton (#160222 ) Fixes #159399 Pull Request resolved: https://github.com/pytorch/pytorch/pull/160222 Approved by: https://github.com/janeyx99	2025-08-10 07:05:52 +00:00
Shangdi Yu	21c97bd565	[reland] Transfer "stack_trace" in post_grad passes (#158752 ) Summary: We transfer stack trace in post_grad passes. We shouldn't add "stack_trace" to _COPY_META_FIELDS because _COPY_META_FIELDS is used in proxy.py where stack_trace is explicitly set. Since the stack_trace is being used by more and more debugging tools, we should also start testing it more rigorously. This PR starts by adding a first test that checks the stack trace is preserved through post_grad_passes. Test Plan: ``` buck run mode/dev-nosan fbcode//caffe2/test/inductor:provenance_tracing -- -r test_pattern_matcher_transfer_meta buck run mode/dev-nosan fbcode//caffe2/test/inductor:auto_functionalize -- --rcaffe2/test/inductor:auto_functionalize_old ``` Rollback Plan: Differential Revision: D78669729 Pull Request resolved: https://github.com/pytorch/pytorch/pull/158752 Approved by: https://github.com/jingsh	2025-07-22 03:49:13 +00:00
Xuehai Pan	c8d43cbc6e	[BE][3/6] fix typos in test/ (#157637 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/157637 Approved by: https://github.com/yewentao256, https://github.com/albanD ghstack dependencies: #156605	2025-07-17 12:08:33 +00:00
Animesh Jain	22edb457c9	[invoke_subgraph][partitioner] Add meta val on run_and_save_rng ops (#157319 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/157319 Approved by: https://github.com/zou3519	2025-07-01 21:02:08 +00:00
Yidi Wu	064a7db7fc	[invoke_subgraph] turn on supports_input_mutation by default (#157177 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/157177 Approved by: https://github.com/anijain2305	2025-06-28 18:14:47 +00:00
Yidi Wu	f5f4beaf56	[invoke_subgraph] make collect_meta_analysis fake prop cachable (#156347 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/156347 Approved by: https://github.com/anijain2305, https://github.com/zou3519 ghstack dependencies: #156260	2025-06-25 04:29:22 +00:00
Yidi Wu	558d7f7db0	[invoke_subgraph] make same subgraph share get_attr target (#156260 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/156260 Approved by: https://github.com/anijain2305, https://github.com/zou3519	2025-06-25 04:29:22 +00:00
Xuehai Pan	6d5c789ad5	[BE][PYFMT] migrate PYFMT for `test/[a-h]*/` to `ruff format` (#144555 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/144555 Approved by: https://github.com/ezyang ghstack dependencies: #144551, #144554	2025-06-24 04:53:54 +00:00
PyTorch MergeBot	d061a02e6e	Revert "[invoke_subgraph] make same subgraph share get_attr target (#156260 )" This reverts commit 39dd2f4d7defc63164a7969bfac0d0c62ffac900. Reverted https://github.com/pytorch/pytorch/pull/156260 on behalf of https://github.com/ydwu4 due to no signal, it breaks linter tests. ([comment](https://github.com/pytorch/pytorch/pull/156260#issuecomment-2997478798))	2025-06-23 18:24:10 +00:00
PyTorch MergeBot	35d03398e5	Revert "[invoke_subgraph] make collect_meta_analysis fake prop cachable (#156347 )" This reverts commit f179b7198522e6d93bd103efba1a1ebd5a2cf891. Reverted https://github.com/pytorch/pytorch/pull/156347 on behalf of https://github.com/ydwu4 due to no signal, it breaks linter tests. ([comment](https://github.com/pytorch/pytorch/pull/156347#issuecomment-2997453729))	2025-06-23 18:19:29 +00:00
Yidi Wu	f179b71985	[invoke_subgraph] make collect_meta_analysis fake prop cachable (#156347 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/156347 Approved by: https://github.com/anijain2305, https://github.com/zou3519 ghstack dependencies: #156260	2025-06-23 17:10:07 +00:00
Yidi Wu	39dd2f4d7d	[invoke_subgraph] make same subgraph share get_attr target (#156260 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/156260 Approved by: https://github.com/anijain2305, https://github.com/zou3519	2025-06-23 17:10:07 +00:00
Animesh Jain	fab85fc5f9	[compile][hierarchical compilation] Release nested_compile_region API (#156449 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/156449 Approved by: https://github.com/zou3519, https://github.com/jansel	2025-06-21 15:14:59 +00:00
Animesh Jain	7b0118884e	[invoke_subgraph][inductor] Dont fallback on complex dtype (#155885 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/155885 Approved by: https://github.com/jansel, https://github.com/zou3519 ghstack dependencies: #155828	2025-06-17 04:47:12 +00:00
Animesh Jain	ffcc6fea7b	[invoke_subgraph] Ignore redundantly nested invoke_subgraph (#155828 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/155828 Approved by: https://github.com/zou3519	2025-06-17 04:47:12 +00:00
Animesh Jain	c9e9a0c823	[inductor][invoke_subgraph] Mark invoke_subgraph outputs as user_visible to constrain output strides (#155395 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/155395 Approved by: https://github.com/zou3519	2025-06-12 03:58:16 +00:00
Yidi Wu	6ded656aee	[hop] auto functionalize invoke_subgraph (#154072 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/154072 Approved by: https://github.com/zou3519 ghstack dependencies: #155261	2025-06-11 22:52:28 +00:00
Animesh Jain	e25ce0f928	[invoke_subgraph] Use eager input vals to constrain input strides (#155291 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/155291 Approved by: https://github.com/ezyang, https://github.com/zou3519	2025-06-10 04:06:09 +00:00
Animesh Jain	0f3f59784d	[invoke_subgraph] Throw assertion on uncaptured speculate_subgraph (#155270 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/155270 Approved by: https://github.com/zou3519	2025-06-07 11:31:53 +00:00
Animesh Jain	bc5a11b581	[easy][invoke_subgraph] Remove skip from already fixed test (#155286 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/155286 Approved by: https://github.com/zou3519	2025-06-06 21:16:22 +00:00
Sidharth	54f1f29fed	[dynamo] dynamic gb_type -> static gb_type (#154435 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/154435 Approved by: https://github.com/williamwen42	2025-05-28 03:14:26 +00:00
Yidi Wu	8e6e79fc1b	[hop_schema] support gen_schema for invoke_subgraph (#152984 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/152984 Approved by: https://github.com/zou3519 ghstack dependencies: #151067, #152974	2025-05-21 18:55:46 +00:00
rzou	2926dd4d8e	Stop proxy-ing autograd.Function.ctx into the graph (#152621 ) The reason we did this before is that it's how our older autograd.Function x Dynamo interaction worked, but we've since adopted newer designs that don't actually need the autograd.Function.ctx proxied into the graph. We still need an fx.Proxy for the autograd.Function.ctx object, so whenever we need one, we create it via discard_graph_changes. Test Plan: - existing tests Pull Request resolved: https://github.com/pytorch/pytorch/pull/152621 Approved by: https://github.com/oulgen	2025-05-08 13:32:54 +00:00
Animesh Jain	97dfd8dd53	[invoke_subgraph] Run missing graph passes recursively (#152675 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/152675 Approved by: https://github.com/bdhirsh, https://github.com/zou3519 ghstack dependencies: #152772, #152770	2025-05-06 02:55:34 +00:00
Animesh Jain	cc254eaa7c	[inductor][refactor] Refactor the fetching of subgraph names (#152770 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/152770 Approved by: https://github.com/jansel, https://github.com/zou3519 ghstack dependencies: #152772	2025-05-06 02:55:34 +00:00
Animesh Jain	b1d34acac5	[fx] Recursive DCE on subgraphs (#152772 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/152772 Approved by: https://github.com/bdhirsh, https://github.com/zou3519	2025-05-06 02:55:34 +00:00
Animesh Jain	9e3fc41060	[invoke_subgraph] rename identifiers to prevent python mangling (#152581 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/152581 Approved by: https://github.com/BoyuanFeng, https://github.com/zou3519 ghstack dependencies: #152547	2025-05-02 06:46:05 +00:00
Animesh Jain	4649fd17b0	[invoke_subgraph] Unpacked operands (#152547 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/152547 Approved by: https://github.com/ydwu4, https://github.com/zou3519	2025-05-02 05:44:46 +00:00
Animesh Jain	f2a89b802d	[invoke_subgraph] Cache on tangent metadata and retrace if needed (#152357 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/152357 Approved by: https://github.com/zou3519, https://github.com/bdhirsh	2025-04-30 23:49:17 +00:00
Animesh Jain	d620fefb2c	[invoke_subgraph] Use backward identifier for min-cut partitioning (#152207 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/152207 Approved by: https://github.com/zou3519, https://github.com/bdhirsh	2025-04-30 14:34:56 +00:00
xinan.lin	75c71ab371	[Break XPU] generalize newly introduced device bias code in Inductor UT. (#151926 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/151926 Approved by: https://github.com/EikanWang, https://github.com/jansel	2025-04-25 00:03:23 +00:00
Animesh Jain	d743a7bd85	[invoke_subgraph] Cache fake tensor if no unbacked symint in the output (#151957 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/151957 Approved by: https://github.com/zou3519, https://github.com/bdhirsh ghstack dependencies: #151409, #151633, #151477	2025-04-24 14:17:22 +00:00
Animesh Jain	508b882513	[dynamo][invoke_subgraph] Use FxGraphModule comparison instead of hashing (#150911 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/150911 Approved by: https://github.com/zou3519	2025-04-14 23:34:26 +00:00
Animesh Jain	173f126068	[invoke_subgraph] Preserve node meta (#150782 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/150782 Approved by: https://github.com/bdhirsh ghstack dependencies: #150666	2025-04-08 16:57:39 +00:00
Animesh Jain	0bacb90a9c	[invoke_subgraph][min-cut partitioner] Fix bug to use the correct root module (#150556 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/150556 Approved by: https://github.com/bdhirsh, https://github.com/zou3519 ghstack dependencies: #150082, #150450, #150486	2025-04-02 22:35:00 +00:00
Animesh Jain	42c7c7f15f	[invoke_subgraph] Filter out grad_out where fw_out requires_grad is False (#150486 ) I am not sure if this is the right way. Pull Request resolved: https://github.com/pytorch/pytorch/pull/150486 Approved by: https://github.com/zou3519 ghstack dependencies: #150082, #150450	2025-04-02 14:40:08 +00:00
Animesh Jain	61ebe999cc	[invoke_subgraph] Do not cache fake tensors for AOTDispatcher first pass (#150450 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/150450 Approved by: https://github.com/zou3519 ghstack dependencies: #150082	2025-04-02 02:31:54 +00:00
Animesh Jain	b060fedfa8	[invoke_subgraph] Support None in the fwd output (#150082 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/150082 Approved by: https://github.com/zou3519	2025-04-02 02:31:54 +00:00
angelayi	5e34758cef	[invoke_subgraph] Support unbacked (#149298 ) Differential Revision: [D71420641](https://our.internmc.facebook.com/intern/diff/D71420641) Pull Request resolved: https://github.com/pytorch/pytorch/pull/149298 Approved by: https://github.com/zou3519	2025-03-31 17:25:09 +00:00
Animesh Jain	c9ebf517c2	[dynamo][invoke_subgraph] Input aliasing and mutation check in Dynamo (#148953 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/148953 Approved by: https://github.com/zou3519 ghstack dependencies: #149087, #149667, #150036	2025-03-28 03:50:07 +00:00
Animesh Jain	731b559f54	[easy] Use config patch to toggle capture_scalar_output (#150036 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/150036 Approved by: https://github.com/angelayi ghstack dependencies: #149087, #149667	2025-03-27 00:01:39 +00:00
angelayi	84ae056d82	[invoke_subgraph] Support pending unbacked symint (#149297 ) The "PendingUnbackedSymbolNotFound" error is raised when an unbacked symbol is created within a piece of code, but this symbol never appears in any of the outputs. I believe the original intention is to help catch incorrectly written meta kernels, where users might've unintentionally created an unbacked symbol but never used it anywhere, but in our case this is intentional. An example is the following test case: ```python def test_pending_unbacked(self): class M(torch.nn.Module): @mark_compile_region def gn(self, x): u = x[0].item() return x * u def forward(self, x): for _ in range(4): x = self.gn(x) return x torch._dynamo.config.capture_scalar_outputs = True torch.compile(M())(torch.randn(8)) ``` This fails with the error: ``` torch._dynamo.exc.InternalTorchDynamoError: PendingUnbackedSymbolNotFound: Pending unbacked symbols {zuf1} not in returned outputs (FakeTensor(..., size=(8,)),) . ``` In this case, creating the unbacked symbol is intentional, so we can bypass this using `fake_mode.shape_env.ignore_fresh_unbacked_symbols()`. Differential Revision: [D71298926](https://our.internmc.facebook.com/intern/diff/D71298926) Pull Request resolved: https://github.com/pytorch/pytorch/pull/149297 Approved by: https://github.com/zou3519 ghstack dependencies: #149296	2025-03-25 16:42:58 +00:00
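The bookkeeping behind the "pending unbacked symbol" check in #149297 can be sketched with a toy model. This is an assumption-laden illustration in plain Python, not PyTorch's `ShapeEnv`: `ToyShapeEnv`, `create_unbacked_symbol`, `ignore_fresh_symbols` and `check_outputs` are hypothetical names. The toy records every freshly created symbol as pending, complains if a pending symbol is missing from the outputs, and skips recording while an "ignore" context is active.

```python
from contextlib import contextmanager


class ToyShapeEnv:
    """Hypothetical, heavily simplified model of pending-symbol bookkeeping."""

    def __init__(self):
        self.pending = set()
        self._ignoring = False
        self._counter = 0

    def create_unbacked_symbol(self):
        name = f"u{self._counter}"
        self._counter += 1
        if not self._ignoring:
            # Remember the symbol so we can later check that it reached an output.
            self.pending.add(name)
        return name

    @contextmanager
    def ignore_fresh_symbols(self):
        previous = self._ignoring
        self._ignoring = True
        try:
            yield
        finally:
            self._ignoring = previous

    def check_outputs(self, output_symbols):
        missing = self.pending - set(output_symbols)
        self.pending.clear()
        if missing:
            raise RuntimeError(
                f"Pending unbacked symbols {sorted(missing)} not in returned outputs"
            )


env = ToyShapeEnv()

# Strict mode: a symbol is created but never shows up in the outputs.
env.create_unbacked_symbol()
try:
    env.check_outputs([])
    strict_ok = True
except RuntimeError as err:
    strict_ok = False
    print(err)

# Relaxed mode: the symbol is created inside the ignore context, so nothing is pending.
with env.ignore_fresh_symbols():
    env.create_unbacked_symbol()
env.check_outputs([])
relaxed_ok = True
```

The second half corresponds to the situation in the test case from the commit message, where `u = x[0].item()` creates a symbol that intentionally never appears in the subgraph's output shapes.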


65 Commits