Summary: Since `allow_complex_guards_as_runtime_asserts` is now sync'd with `prefer_deferred_runtime_asserts_over_guards`, we can kill the former (especially since it was an export-only concept).
Test Plan:
updated tests
Rollback Plan:
Differential Revision: D79903317
Pull Request resolved: https://github.com/pytorch/pytorch/pull/160198
Approved by: https://github.com/ezyang
Summary: When performing constant folding, we must skip over operators that have a symbolic `fill_value`.
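A minimal sketch of the pattern being skipped (hypothetical user code, not the folding pass itself): when the fill value comes from a data-dependent scalar, the `full` call cannot be evaluated at compile time and must stay in the graph.
```
import torch

def f(x, s):
    v = s.item()                      # data-dependent (symbolic) scalar
    # With a symbolic fill_value, this op must not be constant-folded;
    # its value is only known at runtime.
    return x + torch.full((4,), v)
```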
Test Plan:
CI
Rollback Plan:
Reviewed By: kalpit-meta-1
Differential Revision: D80965936
Pull Request resolved: https://github.com/pytorch/pytorch/pull/161437
Approved by: https://github.com/StellarrZ
Summary: This change updates `getattr_recursive` to handle qualnames that contain ModuleList digit indices, for example `op_instances.1.value_model.feature_weights`.
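A minimal sketch of the digit-aware lookup (illustrative, not the exact FX helper): numeric path components are treated as container indices so that names like `op_instances.1.value_model.feature_weights` resolve correctly.
```
import torch.nn as nn

def getattr_recursive(obj, qualname: str):
    for atom in qualname.split("."):
        if atom.isdigit() and isinstance(obj, (nn.ModuleList, nn.Sequential, nn.ParameterList)):
            # ModuleList/Sequential children live under string keys like "1",
            # so index into the container instead of using a plain getattr.
            obj = obj[int(atom)]
        else:
            obj = getattr(obj, atom)
    return obj
```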
Test Plan:
TBA
Rollback Plan:
Reviewed By: jiayisuse
Differential Revision: D80503985
Pull Request resolved: https://github.com/pytorch/pytorch/pull/161204
Approved by: https://github.com/jiayisuse
The motivation for this change can be seen through the following example:
```
import torch
GPU_TYPE = "cuda"
@torch.compile
def no_override(x):
    return x.sum(dim=0)

@torch.compile
def override(x):
    return x.sum(dim=0)
x_small = torch.randn(4096, 512, device=GPU_TYPE)
no_override(x_small)
torch._dynamo.decorators.mark_dynamic(x_small, 0, hint_override=4096 * 1000)
override(x_small)
```
Previously, when reductions were split, codegen relied only on the first observed shape. With a small input, this resulted in a small split size:
```
def triton_red_fused_sum_0(in_ptr0, out_ptr0, ks0, xnumel, r0_numel, XBLOCK : tl.constexpr, R0_BLOCK : tl.constexpr):
xnumel = 16384
rnumel = r0_numel
```
With the new scheme, Inductor honors `hint_override` during codegen, producing larger and more appropriate split sizes:
```
def triton_red_fused_sum_0(in_ptr0, out_ptr0, ks0, xnumel, r0_numel, XBLOCK : tl.constexpr, R0_BLOCK : tl.constexpr):
xnumel = 1024000
rnumel = r0_numel
```
This addresses a broader problem with dynamism: performance and numerics previously depended on whichever shape was seen first. For example:
```
f(s0) -> f(s2)
f(s1) -> f(s2)
```
could generate different kernels. With the new approach, an explicit override pins the chosen configuration:
```
f(s0, hint_override=s0) -> f(s2)
f(s1, hint_override=s0) -> f(s2)
```
ensuring consistent kernel generation regardless of input order.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/161007
Approved by: https://github.com/jansel
Summary: The ONNX team and a recent transformers upgrade ran into this error, and we also hit it during our export benchmarking. This diff makes it possible to trace through the vmap implementation in pre-dispatch IR. Note that we don't support serializing functorch ops in pre-dispatch IR; in the future, we should desugar them to post-grad ops.
The implementation strategy is:
1. We add Python wrappers around the vmap APIs so that we can attach a custom torch_function handler that is only active during non-strict export. We don't want to add this to the default torch_function handler because it would break BC.
2. Some Dynamo changes to make sure it picks up the new Python wrapper APIs. When we do strict export, we need to re-materialize these APIs in pre-dispatch IR from Torch IR. We could avoid this by special-casing export in Dynamo to proxy different API calls, but that would be too chaotic because you would need to proxy two different variants of the same vmap API.
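A minimal, heavily simplified sketch of the wrapper idea from (1); names such as `_nonstrict_export_handler` are hypothetical, not the real internals. The Python-level wrapper defers to an export-only handler when one is installed and otherwise behaves exactly like `torch.vmap`.
```
import torch

_nonstrict_export_handler = None  # installed by non-strict export while tracing

def vmap(func, in_dims=0, out_dims=0):
    if _nonstrict_export_handler is not None:
        # While non-strict export is tracing, let the handler proxy the call
        # into the pre-dispatch graph instead of running functorch eagerly.
        return _nonstrict_export_handler(torch.vmap, func, in_dims, out_dims)
    return torch.vmap(func, in_dims=in_dims, out_dims=out_dims)

# Eager behavior is unchanged:
out = vmap(torch.sin)(torch.randn(3, 4))
```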
Test Plan: CI
Differential Revision: D75623875
Pull Request resolved: https://github.com/pytorch/pytorch/pull/154650
Approved by: https://github.com/ezyang, https://github.com/zou3519
The motivation for this change can be seen through the following example:
```
import torch
GPU_TYPE = "cuda"
@torch.compile
def no_override(x):
    return x.sum(dim=0)

@torch.compile
def override(x):
    return x.sum(dim=0)
x_small = torch.randn(4096, 512, device=GPU_TYPE)
no_override(x_small)
torch._dynamo.decorators.mark_dynamic(x_small, 0, hint_override=4096 * 1000)
override(x_small)
```
Previously, when reductions were split, codegen relied only on the first observed shape. With a small input, this resulted in a small split size:
```
def triton_per_fused_sum_1(in_ptr0, out_ptr0, xnumel, r0_numel, XBLOCK : tl.constexpr):
xnumel = 512
r0_numel = 32
```
With the new scheme, Inductor honors `hint_override` during codegen, producing larger and more appropriate split sizes:
```
def triton_red_fused_sum_0(in_ptr0, out_ptr0, xnumel, r0_numel, XBLOCK : tl.constexpr, R0_BLOCK : tl.constexpr):
xnumel = 16384
r0_numel = 128
```
This addresses a broader problem with dynamism: performance and numerics previously depended on whichever shape was seen first. For example:
```
f(s0) -> f(s2)
f(s1) -> f(s2)
```
could generate different kernels. With the new approach, an explicit override pins the chosen configuration:
```
f(s0, hint_override=s0) -> f(s2)
f(s1, hint_override=s0) -> f(s2)
```
ensuring consistent kernel generation regardless of input order.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/161007
Approved by: https://github.com/jansel
This might cause some new DDEs on call sites that do not use is_contiguous_or_false() or sym_is_contiguous(),
but we want to find those call sites and handle them properly by calling is_contiguous_or_false() instead of is_contiguous() explicitly when appropriate.
I had to fix one issue after removing the implicit size-oblivious reasoning. Here is the context:
In https://github.com/pytorch/pytorch/pull/157472 we defined sym_is_contiguous to be the function that computes contiguity for dynamic shapes in C++. It returns a symbolic expression that represents contiguity and is guaranteed not to throw a DDE.
When callers use is_contiguous(), we do sym_is_contiguous().guard_bool().
When callers use is_contiguous_or_false(), we do sym_is_contiguous().guard_or_false().
One path that was not handled well:
```
c10::SymBool TensorImpl::sym_is_contiguous_custom(
at::MemoryFormat memory_format) const {
if (C10_UNLIKELY(matches_python_custom(SizesStridesPolicy::CustomStrides))) {
return pyobj_slot_.load_pyobj_interpreter()->is_contiguous(
this, memory_format);
}
return sym_is_contiguous_default(memory_format);
}
```
Namely, if we call sym_is_contiguous_custom while matches_python_custom(SizesStridesPolicy::CustomStrides) returns true, we used to call is_contiguous(this, memory_format).
This went through load_pyobj_interpreter and ended up calling the Python is_contiguous, which used the implicit size-oblivious reasoning.
Once we removed that implicit size-oblivious reasoning, the right thing is to call
return pyobj_slot_.load_pyobj_interpreter()->sym_is_contiguous(this, memory_format);
otherwise we would get a DDE even if the caller is using sym_is_contiguous.
So I had to define it for the pyinterpreter, and then override it for nested tensors.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/159197
Approved by: https://github.com/ezyang
Summary: We've received requests from various parties interested in turning on provenance tracking by default. In this PR, we prepare to turn on the part of provenance tracking that doesn't have too much overhead by default.
- Change `provenance_tracking` config to `provenance_tracking_level`
- turn on the following provenance tracking by default when `basic_provenance_tracking`=True:
- `set_kernel_post_grad_provenance_tracing` for kernels; this adds a mapping between Triton kernels and post_grad nodes
- `dump_inductor_provenance_info` if we're dumping a tlparse log
- `get_graph_provenance_json` and dump `create_mapping_pre_post_grad_nodes`. This creates a mapping between pre_grad and post_grad nodes. Since we're not turning on provenance tracking in GraphTransformObserver by default, the mapping here may be incomplete/limited.
- add stack traces from post_grad nodes to Inductor IR nodes
- add exception swallowing for all the functions above
Test Plan:
CI
Rollback Plan:
Differential Revision: D80031559
Pull Request resolved: https://github.com/pytorch/pytorch/pull/160383
Approved by: https://github.com/angelayi
Summary:
In the HF model RWKV, we have parameter mutation under inference mode, which should be safe. This PR does multiple things to make sure it works:
1. We execute global autograd mutation while tracing so that we can actually trace through parameter inplace mutation
2. Add support for parameter mutation under inference mode in AOTAutograd
3. Add support for parameter mutation under inference mode in export.
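A hedged sketch of the kind of program this enables (an illustrative module, not the RWKV model): a parameter updated in place inside `forward`, exported under `inference_mode`.
```
import torch

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.w = torch.nn.Parameter(torch.ones(3))

    def forward(self, x):
        self.w.add_(1.0)      # in-place parameter mutation
        return x * self.w

with torch.inference_mode():
    ep = torch.export.export(M(), (torch.randn(3),))
```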
Test Plan:
test
Rollback Plan:
Differential Revision: D79460136
Pull Request resolved: https://github.com/pytorch/pytorch/pull/159661
Approved by: https://github.com/ydwu4
Summary:
Add support for torch._check() in TorchScript jit.script frontend.
* It is special-cased to behave like torch._assert, i.e. turned into an if + raise of an exception.
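A small illustration of the eager-mode behavior that the TorchScript lowering mirrors (an if + raise); the helper name here is just for the example.
```
import torch

def check_nonempty(x: torch.Tensor) -> torch.Tensor:
    # Roughly equivalent to: if not cond, raise an error with the given message.
    torch._check(x.numel() > 0, lambda: "expected a non-empty tensor")
    return x.clamp(min=0)

check_nonempty(torch.ones(3))        # passes
# check_nonempty(torch.empty(0))     # would raise
```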
Test Plan:
Unit tests
Rollback Plan:
Differential Revision: D79744604
Pull Request resolved: https://github.com/pytorch/pytorch/pull/159988
Approved by: https://github.com/davidberard98
Automatically replaces split with rsplit when relevant, and only performs the split up to the first (or last) value. This allows the split to return early and improves efficiency.
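An illustration of the pattern (a generic example, not a specific call site from the PR): limiting the split to one separator lets the scan stop early instead of splitting the whole string.
```
qualname = "outer.inner.block.weight"

# Only the leading component is needed: stop after the first separator.
prefix = qualname.split(".", 1)[0]      # instead of qualname.split(".")[0]

# Only the trailing component is needed: scan once from the right.
leaf = qualname.rsplit(".", 1)[-1]      # instead of qualname.split(".")[-1]
```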
Pull Request resolved: https://github.com/pytorch/pytorch/pull/160107
Approved by: https://github.com/albanD
Summary:
The `replace_hook` is called once for each user of the replaced node. This fix avoids adding duplicated node sources.
This also means that if there are two nested passes like:
```
with GraphTransformObserver(gm, "outer"):
with GraphTransformObserver(gm, "inner"):
.....
```
We'll only see the outer pass's pass name recorded for the replaced node in the "from_node" node meta. I think this is fine. In practice, the outer pass usually contains a more meaningful name, e.g. `decompose_auto_functionalized`, and the inner pass name is just a default pass name like `pattern_matcher`.
Test Plan:
```
buck2 run @mode/dev-nosan fbcode//caffe2/test:fx -- -r test_graph_transform_observer_replace
```
Rollback Plan:
Differential Revision: D79203058
Pull Request resolved: https://github.com/pytorch/pytorch/pull/159484
Approved by: https://github.com/angelayi
This PR removes the integration point torch.fx had with torch::deploy (and another minor change).
Note: This PR surfaces some mypy errors, but I believe those existed in the code base beforehand and should be fixed in a separate PR.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/158291
Approved by: https://github.com/albanD
ghstack dependencies: #158290
----
- First, we add a new expanded_def to FX, which will expand the
definitions of variables into multiple lines, one per variable
definition. This makes extremely long args/return lists much
more readable.
- Next, we extend this mechanism to also print out descriptors on
placeholders and return values, as comments, if available. This
is how we will test descriptors.
- We update tlparse for AOTAutograd to use this format.
- We update expect tests to use this format and update their expected outputs,
so you can inspect what it looks like. There may be other tests
I should update; open to suggestions.
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/158708
Approved by: https://github.com/wconstab
ghstack dependencies: #158624
Summary: The subclass can override the filtering logic to customize which frames to keep or drop.
Test Plan:
```
buck run caffe2/test:test_export -- -r test_stack_trace
buck2 run 'fbcode//mode/dev-nosan' fbcode//caffe2/test:others -- -r test_constant_random
buck2 run 'fbcode//mode/dev-nosan' fbcode//caffe2/test:test_export -- -r test_custom_obj_list_out
buck2 run 'fbcode//mode/dev-nosan' fbcode//caffe2/test:fx -- -r class_member_back_compat
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/158266
Approved by: https://github.com/ezyang, https://github.com/yushangdi
When select has a data-dependent index, we can't tell whether the actual index should be index + size or index.
To avoid throwing a DDE, we allocate a new unbacked symbol to represent the storage offset of the
output view and compute its value dynamically at runtime during Inductor lowering.
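A hedged illustration of the situation described above (assumes `capture_scalar_outputs` so that the `.item()` value becomes an unbacked symbol; not a snippet from the PR):
```
import torch

torch._dynamo.config.capture_scalar_outputs = True

@torch.compile
def f(x, idx):
    i = idx.item()            # unbacked symbol: sign unknown at compile time
    # The storage offset of the view depends on whether i is negative
    # (i + size) or not (i), which cannot be decided without a DDE.
    return x.select(0, i) + 1

f(torch.randn(8, 4), torch.tensor(-3))
```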
Pull Request resolved: https://github.com/pytorch/pytorch/pull/157605
Approved by: https://github.com/ColinPeppler
As part of Better Engineering week, we would like to improve our typing coverage to improve the dev experience in Dynamo.
This PR adds strict typing support to a critical set of files for Dynamo: `source.py` and the base `_guards.py`.
Running
```
mypy torch/_dynamo/source.py torch/_guards.py --linecount-report /tmp/coverage_log
```
|          | Lines Annotated | Lines Total | % lines covered | Funcs Annotated | Funcs Total | % funcs covered |
| -------- | --------------- | ----------- | --------------- | --------------- | ----------- | --------------- |
| Main     | 1227 | 2208 | 55.57% | 207 | 362 | 57.18% |
| This PR  | 2217 | 2217 | 100.00% | 362 | 362 | 100.00% |
| Delta    | +990 | +9 | +44.43% | +155 | 0 | +42.82% |
Pull Request resolved: https://github.com/pytorch/pytorch/pull/158397
Approved by: https://github.com/anijain2305
This PR removes the integration point torch.fx had with torch::deploy (and another minor change).
Note: This PR surfaces some mypy errors, but I believe those existed in the code base beforehand and should be fixed in a separate PR.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/158291
Approved by: https://github.com/albanD
ghstack dependencies: #158288, #158290
Summary:
For this particular instance, we're doing `from torch._inductor.config import trace` and accessing `trace.provenance_tracking`,
but for all other call sites, we're doing `from torch._inductor import config` and accessing `config.trace.provenance_tracking`.
Test Plan:
CI
Rollback Plan:
Differential Revision: D78699876
Pull Request resolved: https://github.com/pytorch/pytorch/pull/158796
Approved by: https://github.com/c00w
Include both the error stack trace and the GraphModule in a new
structured trace artifact. Log the shortened version to the console,
and also log a hint to look at the tlparse output for more.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/158469
Approved by: https://github.com/ezyang