pytorch

mirror of https://github.com/pytorch/pytorch.git synced 2025-10-21 05:34:18 +08:00

Author	SHA1	Message	Date
PyTorch MergeBot	3443627e07	Revert "[BE]: Enable RUFF TRY400 rule - log.exception (#153473 )" This reverts commit 4f4ecc583e0f48ad2d062a53bf91c61ab40b4948. Reverted https://github.com/pytorch/pytorch/pull/153473 on behalf of https://github.com/jeanschmidt due to seems to have broken internal signals, @albanD may I count on you to help the author merge his PR? D74837988 ([comment](https://github.com/pytorch/pytorch/pull/153473#issuecomment-2886017075))	2025-05-16 08:29:26 +00:00
Aaron Gokaslan	4f4ecc583e	[BE]: Enable RUFF TRY400 rule - log.exception (#153473 ) Change logging.error to logging.exception to log additional information when relevant. A few places have slipped in logging.errors in try except since I last did a clean up here and the rule is stabilized so I am enabling it codebase wide. I have NOQA'd much of our custom exception stack trace handling for RPC calls and distributed and tried to fix a few errors based on whether we immediately reraised it or if we didn't print any exception handling where it could be useful. Pull Request resolved: https://github.com/pytorch/pytorch/pull/153473 Approved by: https://github.com/albanD, https://github.com/cyyever	2025-05-15 13:36:59 +00:00
Vlad K	6a84fe65ec	Fix code portability when looking for Dot (#153259 ) When trying to plot a trace graph, Inductor checks if "dot" is installed. Currently, the code runs a "which dot" command. By default, Windows doesn't have the "which" command. This patch replaces it with the more portable alternative. Pull Request resolved: https://github.com/pytorch/pytorch/pull/153259 Approved by: https://github.com/Skylion007	2025-05-10 16:12:44 +00:00
Benjamin Glass	01cbf5a30a	[AOTInductor] Add wrapper and kernel code to debug code logging (#153181 ) This is a simple PR to make the AOTInductor wrapper and kernel code get output by `TORCH_COMPILE_DEBUG=1`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/153181 Approved by: https://github.com/desertfire	2025-05-10 15:31:18 +00:00
eellison	2295efa1b3	Fix only logging ir_post_fusion with torch_compile_debug enabled (#148499 ) Because we were invoking the logs through `V.debug`, it was not running if TORCH_COMPILE_DEBUG was not set. This is because there is some magic in the debug [getattr](`d789c22712/torch/_inductor/debug.py (L468-L480)`). Pull Request resolved: https://github.com/pytorch/pytorch/pull/148499 Approved by: https://github.com/shunting314	2025-03-05 05:35:09 +00:00
Xuehai Pan	1cb4e2df65	[BE][PYFMT] migrate PYFMT for `torch._inductor` to `ruff format` (#144550 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/144550 Approved by: https://github.com/jansel	2025-02-28 13:33:19 +00:00
Riley Dulin	20295c017e	Fix import of getArtifactLogger for ir_pre_fusion and ir_post_fusion (#147560 ) Fixes #147002 There was an issue with the previous PR https://github.com/pytorch/pytorch/pull/147248 that didn't show up in CI, where a logging import was not complete in torch/_inductor/debug.py before importing it. This only happened if someone directly imported the file without doing any other imports before. Also set to off_by_default by request to reduce log spew. Pull Request resolved: https://github.com/pytorch/pytorch/pull/147560 Approved by: https://github.com/Skylion007	2025-02-25 03:36:08 +00:00
Riley Dulin	93316cfe94	Move ir_pre_fusion.txt and ir_post_fusion.txt to TORCH_LOGS (#147248 ) Fixes #147002 Moves ir_{pre, post}_fusion.txt to be controlled by TORCH_LOGS instead of TORCH_COMPILE_DEBUG. Updated tests of these logs as well. Pull Request resolved: https://github.com/pytorch/pytorch/pull/147248 Approved by: https://github.com/eellison	2025-02-20 00:26:17 +00:00
Shangdi Yu	a4e4368157	add node mapping processing (#146103 ) Summary: Add `node_mapping = create_node_mapping(pre_grad_graph_id, inductor_post_to_pre_grad_nodes, debug_info)`, to produce a `inductor_provenance_tracking_node_mappings.json` file. This file will be used by the provenance tracking highlighter tool to create provenance visualization. `inductor_triton_kernel_to_post_grad_nodes.json` and `inductor_provenance_tracking_node_mappings.json` files are not dumped if they are both empty. So it's removed from some of the `test_structured_trace` tests. Test Plan: CI ``` buck run mode/dev-nosan fbcode//caffe2/test:fx -- -r graph_provenance buck run mode/dev-nosan fbcode//caffe2/test/inductor:provenance_tracing python test/dynamo/test_structured_trace.py ``` Differential Revision: D68190173 Pull Request resolved: https://github.com/pytorch/pytorch/pull/146103 Approved by: https://github.com/chenyang78	2025-02-01 08:29:29 +00:00
shangdiy	6bd19e65b1	add inductor_triton_kernel_mapping_post_grad.json to tlparseadd changes (#145954 ) Landing D67612181 here. The original exported PR somehow fails OSS CI, but this one doesn't (though the PR content is the same). Add debug trace artifact to inductor_triton_kernel_mapping_post_grad.json (debug artifact for provenance tracking) to tlparse. Pull Request resolved: https://github.com/pytorch/pytorch/pull/145954 Approved by: https://github.com/YUNQIUGUO	2025-01-30 06:18:48 +00:00
Randolf Scholz	835e770bad	Use `typing.IO[bytes]` instead of `io.BytesIO` in annotations (#144994 ) Fixes #144976 Using approach ① `IO[bytes]`, but could also try with a protocol. ## Notes: - moved `torch.serialization.FILE_LIKE` to `torch.types.FileLike` - Use `FileLike` annotation where it makes sense - made sure those functions also support `os.PathLike` - Replaced `isinstance(x, io.BytesIO)` with `isinstance(x, (io.IOBase, IO))` where appropriate. - Replaced `BinaryIO` with `IO[bytes]` (the two ABCs are almost identical, the only difference is that `BinaryIO` allows `bytearray` input to `write`, whereas `IO[bytes]` only `bytes`) - needed to make `torch.serialization._opener` generic to avoid LSP violations. - skipped `torch/onnx/verification` for now (functions use `BytesIO.getvalue` which is not part of the `IO[bytes]` ABC, but it kind of seems that this is redundant, as e.g. `onnx.load` supports `str \| PathLike[str] \| IO[bytes]` directly... Pull Request resolved: https://github.com/pytorch/pytorch/pull/144994 Approved by: https://github.com/ezyang, https://github.com/Skylion007	2025-01-27 18:08:07 +00:00
Shangdi Yu	4cc5e880f9	Add accuracy issue support in AOTI Minifier (#145539 ) Summary: Add three more repro levels for AOTI minifier (level 2 already exists). They are the same as the existing dynamo minifier repro levels. Now AOTI minifier can minify and repro programs that have numerical accuracy issues as well. 1: Dumps the original graph out to repro.py if compilation fails 2: Dumps a minifier_launcher.py if aoti fails. 3: Always dumps a minifier_launcher.py. Good for segfaults. 4: Dumps a minifier_launcher.py if the accuracy fails. Refactor AOTI minifier unit tests to be cleaner and better re-use the existing minifier testing code. We do not need to manually patch {"aot_inductor.dump_aoti_minifier": True} to each test now, this config is generated in the test code. Differential Revision: D68294638 Pull Request resolved: https://github.com/pytorch/pytorch/pull/145539 Approved by: https://github.com/desertfire	2025-01-24 23:07:19 +00:00
Aaron Orenstein	893ca1dfe1	PEP585 update - torch/_inductor/[_-i]* (#145137 ) See #145101 for details. Pull Request resolved: https://github.com/pytorch/pytorch/pull/145137 Approved by: https://github.com/bobrenjc93	2025-01-19 01:22:47 +00:00
Rachel Guo	9275091d6e	[provenance_tracking] Dump inductor_triton_kernel_to_post_grad_nodes.json info in debug_trace (#143055 ) Summary: This diff mainly adds code changes to dump `inductor_triton_kernel_to_post_grad_nodes.json` artifact which contains mapping info from post_grad -> inductor kernel code: `{"inductor_triton_kernel_name": [post_grad_node_0, post_grad_node_1, ..., ], "..."}.` Example paste: P1695235000 verified on the test model. See "Test Plan": We use this artifact to demonstrate provenance tracking in the frontend 3-tab highlighter tool: https://github.com/YUNQIUGUO/compiler_explorer (copy/pasted the input files for demo purpose for now and will integrate with Shangdi's tool to 4-tab) https://pxl.cl/66BzK Note: Currently only supports mapping for inductor's`TritonKernel` type. TODO for enhancing more support for `ExternKernel` and other inductor generated kernel type, etc. Test Plan: test_model_coverage.sh: ``` #!/bin/sh MODEL_ENTITY_ID=644688112 SNAPSHOT_ID=32 MODULE=merge # buck2 build --show-output mode/opt -c=python.package_style=inplace -c fbcode.enable_gpu_sections=true -c fbcode.platform=platform010 -c fbcode.split-dwarf=true -c fbcode.nvcc_arch=a100,h100 caffe2/torch/fb/model_transform/experimental/benchmark:mts_gpu_benchmark TORCH_COMPILE_DEBUG=1 CUDA_VISIBLE_DEVICES=0 TORCHINDUCTOR_FORCE_DISABLE_CACHES=1 TORCH_LOGS="+inductor, schedule, fusion, output_code" TORCH_TRACE="tmp/guorachel_tt" TORCHINDUCTOR_MAX_AUTOTUNE=1 TORCHINDUCTOR_UNIQUE_KERNEL_NAMES=1 ../buck-out/v2/gen/fbcode/d29ee94b913014f1/caffe2/torch/fb/model_transform/experimental/benchmark/__mts_gpu_benchmark__/mts_gpu_benchmark.par --model-path manifold://ads_storage_fblearner/tree/user/facebook/fblearner/predictor/${MODEL_ENTITY_ID}/${SNAPSHOT_ID}/gpu_lowering/input.predictor.disagg.gpu.merge --lower-backend AOT_INDUCTOR_EP --gpu-trace --aot-inductor-config="{'max_autotune': True}" 2>&1 \| tee output.txt ``` {F1973765026} ``` buck2 test 'fbcode//mode/opt' 
fbcode//caffe2/test/inductor:provenance_tracing -- --exact 'caffe2/test/inductor:provenance_tracing - test_triton_kernel_post_grad_mapping_aot_inductor (caffe2.test.inductor.test_provenance_tracing.TestProvenanceTracingArtifact)' ``` ``` TORCH_LOGS="+inductor, output_code" buck2 run -c fbcode.enable_gpu_sections=true -c fbcode.nvcc_arch=h100 @//mode/opt fbcode//caffe2/test/inductor:provenance_tracing -- -r test_triton_kernel_post_grad_mapping_aot_inductor ``` Differential Revision: D66967510 Pull Request resolved: https://github.com/pytorch/pytorch/pull/143055 Approved by: https://github.com/chenyang78	2024-12-18 06:51:50 +00:00
Tom Ritchford	da67a6a7bb	[inductor] Replace set by OrderedSet (#138466 ) Uses the set_linter from https://github.com/pytorch/pytorch/pull/138454 and considerable manual editing Pull Request resolved: https://github.com/pytorch/pytorch/pull/138466 Approved by: https://github.com/eellison	2024-12-13 16:08:45 +00:00
Tom Ritchford	dc23f1944a	Remove unused Python variables in torch/[_-a]* (#133492 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/133492 Approved by: https://github.com/albanD	2024-12-12 17:39:14 +00:00
PyTorch MergeBot	5c97ac9721	Revert "Remove unused Python variables in torch/[_-a]* (#133492 )" This reverts commit fda975a7b3071a20dab8fc2c4e453479e1bb7cf2. Reverted https://github.com/pytorch/pytorch/pull/133492 on behalf of https://github.com/clee2000 due to Sorry, I need to revert this in order to revert something else. The only thing you need to do is rebase and remerge ([comment](https://github.com/pytorch/pytorch/pull/133492#issuecomment-2536635516))	2024-12-11 17:29:12 +00:00
Tom Ritchford	fda975a7b3	Remove unused Python variables in torch/[_-a]* (#133492 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/133492 Approved by: https://github.com/albanD	2024-12-10 21:48:44 +00:00
Shangdi Yu	02c509669a	Aoti minifier flatten (#141156 ) Flatten the inputs to minifier so AOTI Minifier can handle unflattened inputs and kwargs. - flatten the inputs in minifier - changed the "load_and_run" part of the minifier verification to run on the flattened inputs. - refactored code to keep `torch._inductor.__init__.py` clean - update doc `python test/inductor/test_minifier.py` Pull Request resolved: https://github.com/pytorch/pytorch/pull/141156 Approved by: https://github.com/desertfire	2024-12-06 07:12:45 +00:00
Jason Ansel	6eca0aee76	[inductor] Refactor ir.Layout into ir.OutputSpec (#140910 ) This separates the concepts of a Layout (size/stride/etc) and an OutputSpec (which includes multiple outputs), which should make typing easier. Pull Request resolved: https://github.com/pytorch/pytorch/pull/140910 Approved by: https://github.com/ezyang ghstack dependencies: #140895	2024-11-21 20:01:57 +00:00
Aaron Orenstein	06f619d999	typing ir.py - part 2 (#131846 ) See #131852 Pull Request resolved: https://github.com/pytorch/pytorch/pull/131846 Approved by: https://github.com/eellison ghstack dependencies: #139238	2024-11-06 00:01:15 +00:00
PyTorch MergeBot	6dada2136a	Revert "Refactor FxGraphDrawer to use HTML-like labels (#137726 )" This reverts commit 1e738420296a84406cd0a1626074ea6447a6603a. Reverted https://github.com/pytorch/pytorch/pull/137726 on behalf of https://github.com/huydhn due to Sorry for reverting your change, but it looks like some internal components are failing after this change and need to be updated ([comment](https://github.com/pytorch/pytorch/pull/137726#issuecomment-2455332612))	2024-11-04 17:44:44 +00:00
Gabriel Ferns	1e73842029	Refactor FxGraphDrawer to use HTML-like labels (#137726 ) Fixes https://github.com/pytorch/pytorch/issues/137499 Testing: Added a new unit test to make sure that the regression case succeeds. I'm debating about whether to make the borders visible. I'm partial to no borders, but it might make it harder for some people to read? ![68a2b0e3-orig_fx_graph_diagram](https://github.com/user-attachments/assets/fbc2fd98-9e76-488e-8ebe-c64fbf206932) Vs. ![2bfe1c4f-orig_fx_graph_diagram](https://github.com/user-attachments/assets/b6bc88ba-dda2-4cf7-84ac-a615e1e03a74) Pull Request resolved: https://github.com/pytorch/pytorch/pull/137726 Approved by: https://github.com/eellison, https://github.com/malfet	2024-11-01 23:19:50 +00:00
eellison	d90717e4e2	Add option to save real tensors in TORCH_COMPILE_DEBUG repro (#138110 ) This PR adds a utility to try to construct the corresponding real tensor values of fake tensors by seeing if their meta storage is contained in the meta converter. Then, we are able to save real tensor values for fx_graph_runnable if `TORCH_COMPILE_DEBUG_SAVE_REAL=1` is set. Differential Revision: [D64502744](https://our.internmc.facebook.com/intern/diff/D64502744) Pull Request resolved: https://github.com/pytorch/pytorch/pull/138110 Approved by: https://github.com/ezyang	2024-10-28 16:18:22 +00:00
Xuan Zhang	c05a7adb36	[inductor][debug] fix draw_buffers (#135266 ) Before: ![image](https://github.com/user-attachments/assets/aac756f3-1349-4647-9da3-87cf105cf647) After: <img width="791" alt="image" src="https://github.com/user-attachments/assets/d72c663c-e598-42fa-ac40-9e58956f1ec1"> Pull Request resolved: https://github.com/pytorch/pytorch/pull/135266 Approved by: https://github.com/yf225	2024-09-06 04:12:41 +00:00
Aaron Orenstein	d95aedf5fd	[BE] typing for decorators - fx/_compatibility (part 1) (#134202 ) Part of #134054. This corresponds to the pytorch mypy changes from D61493706. Updating takes so long and touches so many files that it's impossible to land as a whole without conflicting with some other intermediate change. So landing these 'type: ignore' for pytorch in advance of them actually being needed. Pull Request resolved: https://github.com/pytorch/pytorch/pull/134202 Approved by: https://github.com/Skylion007	2024-08-22 17:07:33 +00:00
Edward Z. Yang	5e4d8eb831	Don't generate stack entry for DebugContext.wrap (#132802 ) See https://github.com/pytorch/pytorch/pull/132073 for motivation Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/132802 Approved by: https://github.com/albanD ghstack dependencies: #132801	2024-08-07 23:59:38 +00:00
Adnan Akhundov	8927fc209f	[inductor] Add type hints to functions in debug.py (#131836 ) Summary: ATT Test Plan: lintrunner Reviewers: Subscribers: Tasks: Tags: Pull Request resolved: https://github.com/pytorch/pytorch/pull/131836 Approved by: https://github.com/eellison	2024-07-28 04:54:22 +00:00
PyTorch MergeBot	945bf78894	Revert "[BE] typing for decorators - fx/_compatibility (#131568 )" This reverts commit 193f62fde91ee20deb5ddcd9ff4593cd78d74c64. Reverted https://github.com/pytorch/pytorch/pull/131568 on behalf of https://github.com/clee2000 due to same as https://github.com/pytorch/pytorch/pull/131572#issuecomment-2254328359 but I clicked the wrong link by accident. This is where it actually starts ([comment](https://github.com/pytorch/pytorch/pull/131568#issuecomment-2254330781))	2024-07-28 03:43:39 +00:00
Xuan Zhang	3d7c424a75	[inductor] update users to buffers instead of scheduler nodes (#131796 ) After a recent refactoring of inductor, `.users` are now associated with buffers instead of scheduler nodes. In `debug.py`, one such usage of `.users` is not updated accordingly, and the change here fixes that. Pull Request resolved: https://github.com/pytorch/pytorch/pull/131796 Approved by: https://github.com/yf225	2024-07-26 03:34:26 +00:00
Aaron Orenstein	193f62fde9	[BE] typing for decorators - fx/_compatibility (#131568 ) See #131429 Pull Request resolved: https://github.com/pytorch/pytorch/pull/131568 Approved by: https://github.com/justinchuby, https://github.com/oulgen, https://github.com/zou3519	2024-07-25 22:24:19 +00:00
Xuehai Pan	b6d477fd56	[BE][Easy][16/19] enforce style for empty lines in import segments in `torch/_i*/` (#129768 ) See https://github.com/pytorch/pytorch/pull/129751#issue-2380881501. Most changes are auto-generated by linter. You can review these PRs via: ```bash git diff --ignore-all-space --ignore-blank-lines HEAD~1 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/129768 Approved by: https://github.com/jansel	2024-07-20 16:20:58 +00:00
Aaron Orenstein	ea614fb2b1	Flip default value for mypy disallow_untyped_defs [2/11] (#127839 ) See #127836 for details. Pull Request resolved: https://github.com/pytorch/pytorch/pull/127839 Approved by: https://github.com/oulgen	2024-06-08 18:23:08 +00:00
_daohang_	0a6df4fca6	delete inductor config.trace.compile_profile (#127143 ) Fixes #ISSUE_NUMBER https://fb.workplace.com/groups/257735836456307/posts/687858786777341/?comment_id=687861123443774&reply_comment_id=687865486776671 Pull Request resolved: https://github.com/pytorch/pytorch/pull/127143 Approved by: https://github.com/Chillee	2024-06-07 18:05:50 +00:00
Xuehai Pan	a28bfb5ed5	[4/N][Easy] fix typo for `usort` config in `pyproject.toml` (`kown` -> `known`): sort functorch (#127125 ) The `usort` config in `pyproject.toml` has no effect due to a typo. Fixing the typo makes `usort` do more and generates the changes in the PR. Except `pyproject.toml`, all changes are generated by `lintrunner -a --take UFMT --all-files`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/127125 Approved by: https://github.com/Skylion007 ghstack dependencies: #127122, #127123, #127124	2024-05-25 22:45:38 +00:00
Jason Ansel	235f24fc66	[inductor] Add FileLock around V.debug.copy (#122665 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/122665 Approved by: https://github.com/ezyang	2024-03-28 03:17:33 +00:00
eellison	1d13c82559	Precompile in background (#121997 ) Precompile benchmarking choices in parallel, and then wait on those choices prior to benchmarking. In the case of deferred templates, we wait only on those choices in the scheduler to allow multiple separate lowerings to compile in parallel. Pull Request resolved: https://github.com/pytorch/pytorch/pull/121997 Approved by: https://github.com/jansel ghstack dependencies: #121996, #120275	2024-03-20 18:34:12 +00:00
Kai Londenberg	96eff4ef70	[inductor max autotune] Detailed autotuning result logs ( machine-readable ) (#119004 ) This diff introduces a new separate logging of autotuning results, with the intention of making the results analyzable, specifically those for the new experimental Cutlass backend. Results are logged as text files with one JSON document corresponding to a single benchmark result per line. Pull Request resolved: https://github.com/pytorch/pytorch/pull/119004 Approved by: https://github.com/jansel ghstack dependencies: #120620	2024-02-29 18:24:13 +00:00
Edward Z. Yang	d03173e88c	Unify MYPYINDUCTOR and MYPY (#118432 ) The original motivation for MYPYINDUCTOR was a faster type checking configuration that only checked a subset of files. With the removal of `follow_imports = ignore`, we are now able to use dmypy to do fast incremental typechecking, eliminating the need for this. Perhaps erroneously, when I tee'ed up this PR I elected to delete the `follow_imports = skip` designations in the mypy-inductor.ini. This led to a number of extra type error suppressions that I manually edited. You will need to review. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/118432 Approved by: https://github.com/Skylion007 ghstack dependencies: #118414, #118418	2024-01-27 17:23:20 +00:00
Aaron Gokaslan	86cd6655a1	[BE]: Use exist_ok arg for os.makedirs calls (#116561 ) Optimize os.makedirs calls to use exist_ok parameter when possible to avoid unnecessary checks. Pull Request resolved: https://github.com/pytorch/pytorch/pull/116561 Approved by: https://github.com/malfet	2023-12-30 21:12:53 +00:00
Yang Chen	5c3f03e2dd	[inductor] add a config to specify the shape attribute for the generated svg graphs (#114811 ) We draw our fx graphs with the "record" shape attribute by default. Sometimes, when the graph is very complex, we may hit dot errors like below: "flat edge between adjacent nodes one of which has a record shape - replace records with HTML-like labels" and thus fail to generate a graph. So, let's give the user an option to specify the shape attribute for the dot graph. For example, passing INDUCTOR_DOT_GRAPH_SHAPE_SVG = "none" would let us generate HTML-like labels to work around the above failure. Pull Request resolved: https://github.com/pytorch/pytorch/pull/114811 Approved by: https://github.com/weifengpy	2023-11-30 06:10:37 +00:00
Jez Ng	dc63248b76	Make dynamo configs more amenable to static type checking (#112130 ) `install_config_module` makes a regular module into a ConfigModule with extra methods defined on it. mypy thinks those extra methods (or module functions) are undefined since it cannot analyze something so dynamic. As a workaround, I've created a fake module that defines these extra functions, which I import into the config modules during type checking. As part of this change, I've also added more types to config_utils.py and enabled typechecking for torch/_dynamo/config.py. Pull Request resolved: https://github.com/pytorch/pytorch/pull/112130 Approved by: https://github.com/jansel	2023-11-08 21:17:45 +00:00
Sherlock Huang	a126bbfea3	[AOTInductor] Include AOTI debug folder in package (#112514 ) Summary: Allow user to set debug dir for Inductor Include AOTInductor debug folder in the package. ``` zipinfo package.zip Archive: package.zip Zip file size: 1325264 bytes, number of entries: 46 -rw---- 0.0 fat 212 bl stor 80-000-00 00:00 package/data/aotinductor/merge-a100/aotinductor_pickle_data.json -rw---- 0.0 fat 6024 bl stor 80-000-00 00:00 package/data/aotinductor/merge-a100/debug/torchinductor/model___9.0/fx_graph_runnable.py -rw---- 0.0 fat 9031 bl stor 80-000-00 00:00 package/data/aotinductor/merge-a100/debug/torchinductor/model___9.0/fx_graph_readable.py -rw---- 0.0 fat 9202 bl stor 80-000-00 00:00 package/data/aotinductor/merge-a100/debug/torchinductor/model___9.0/fx_graph_transformed.py -rw---- 0.0 fat 10865 bl stor 80-000-00 00:00 package/data/aotinductor/merge-a100/debug/torchinductor/model___9.0/ir_pre_fusion.txt -rw---- 0.0 fat 10865 bl stor 80-000-00 00:00 package/data/aotinductor/merge-a100/debug/torchinductor/model___9.0/ir_post_fusion.txt -rw---- 0.0 fat 13553 bl stor 80-000-00 00:00 package/data/aotinductor/merge-a100/debug/torchinductor/model___9.0/output_code.py -rw---- 0.0 fat 5822 bl stor 80-000-00 00:00 package/data/aotinductor/merge-a100/debug/torchinductor/model___9.1/fx_graph_runnable.py -rw---- 0.0 fat 8817 bl stor 80-000-00 00:00 package/data/aotinductor/merge-a100/debug/torchinductor/model___9.1/fx_graph_readable.py -rw---- 0.0 fat 8988 bl stor 80-000-00 00:00 package/data/aotinductor/merge-a100/debug/torchinductor/model___9.1/fx_graph_transformed.py -rw---- 0.0 fat 10858 bl stor 80-000-00 00:00 package/data/aotinductor/merge-a100/debug/torchinductor/model___9.1/ir_pre_fusion.txt -rw---- 0.0 fat 10858 bl stor 80-000-00 00:00 package/data/aotinductor/merge-a100/debug/torchinductor/model___9.1/ir_post_fusion.txt ``` Test Plan: CIs Reviewed By: chenyang78 Differential Revision: D50815320 Pull Request resolved: 
https://github.com/pytorch/pytorch/pull/112514 Approved by: https://github.com/chenyang78, https://github.com/desertfire	2023-11-01 08:25:11 +00:00
Jon Chuang	a21851c69d	fix(inductor): `ForeachKernelSchedulerNode` group shape should be opaque for graph debug (#110336 ) ~~Shape is assumed by `TensorMetadata` to be torch.Shape/tuple, however, some of the scheduler node groups utilize `int`, so convert to tuple.~~ Root cause is actually `foreach` scheduler node having silent-error group of int, when in fact it ought to be opaque `foreach`. Previously: silent error / confusing shape of (0,) ![image](https://github.com/pytorch/pytorch/assets/9093549/5bc2a3c7-151f-4433-bbf8-044c7b03e989) Now: clear that it is foreach which does not have well-defined shape: ![image](https://github.com/pytorch/pytorch/assets/9093549/8373080d-4519-4e74-8a3b-da463e9968da) ~~Alternate might be to create list of shapes for each of its subnodes. Actually, for debuggability sake, I may prefer this. We can ensure that the recursive generation of this string is only done dynamically in a debug code path. Else, incrementally computing it on initialization of ForeachKernel may also be feasible.~~ This is quite infeasible for 100s of params. Pull Request resolved: https://github.com/pytorch/pytorch/pull/110336 Approved by: https://github.com/mlazos	2023-10-31 18:44:08 +00:00
Jon Chuang	79212430df	feat(inductor): fx graph debug should display device (#110346 ) Device mismatch issues are root cause of: https://github.com/pytorch/pytorch/issues/107006, hence make device-related scheduling issues easier to diagnose. Also format single-kwarg graphs to be more concise Example rendering: ![image](https://github.com/pytorch/pytorch/assets/9093549/1b59a994-f2df-45c9-8cb7-37eb3ba12654) CC code owners: @ngimel @jansel @shunting314 @mlazos @peterbell10 Pull Request resolved: https://github.com/pytorch/pytorch/pull/110346 Approved by: https://github.com/eellison	2023-10-11 00:34:55 +00:00
eellison	c5f06b9753	Re-enable test_copy_transpose_math_view, neg_view/dce fix (#110651 ) - neg view can just be lowered to neg() post functionalization - we were treating all fallback kernels as not having side effects. we shouldn't dce mutating fallback kernels - either mutations induced by the reinplacing pass or clone_ with unsupported arguments (complex) Pull Request resolved: https://github.com/pytorch/pytorch/pull/110651 Approved by: https://github.com/Chillee, https://github.com/jansel, https://github.com/malfet, https://github.com/Skylion007	2023-10-10 16:34:01 +00:00
willfengg	772e104dfd	[inductor] visualize fused ops in svg graph (#107752 ) example usage * `TORCH_COMPILE_DEBUG=1 INDUCTOR_ORIG_FX_SVG=1 INDUCTOR_POST_FUSION_SVG=1 python trig.py`: show original fx node name, file, and code. see snapshot 2 where we have origin_0, 1, 2 * trig.py can be found in P816304818 Implementation * keep original fx graph in GraphLowering, ```self.orig_gm: torch.fx.GraphModule = gm.__copy__()``` * draw original fx graph with origins ir_post_fusion ```V.debug.draw_orig_fx_graph(self.orig_gm, self.scheduler.nodes)```. node.meta["buff_meta"] tracks buf_name <img width="350" alt="Screenshot 2023-08-29 at 12 40 24 PM" src="https://github.com/pytorch/pytorch/assets/134637289/c4e197cb-ab3b-4a09-a584-c1356376accb"> Pull Request resolved: https://github.com/pytorch/pytorch/pull/107752 Approved by: https://github.com/mlazos	2023-09-21 08:03:05 +00:00
Jez Ng	fe452108fb	Enable typechecking for _inductor/debug.py (#109335 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/109335 Approved by: https://github.com/eellison ghstack dependencies: #109269, #109347	2023-09-18 18:12:23 +00:00
Shunting Zhang	91778ada87	[inductor] graph replayer (#106952 ) Recently I feel it's a bit painful to run benchmark scripts on my dev environment. E.g., the command below ``` python benchmarks/dynamo/huggingface.py --backend inductor --amp --performance --only YituTechConvBert --training ``` took about 2 minutes to run. It may take even longer for some other models. The command is slow since it - needs to do dynamo work - verifies the model on CPU - runs perf tests - compiles all the graphs However, oftentimes I only need to debug inductor-specific logic like loop ordering and fusion. A lot of the things the script does are useless for me. Also, I only need to test one graph at a time (e.g. check the fwd graph first and, when I'm done, continue to check the bwd graph) rather than compiling all the graphs. The graph replayer adds a `@save_args` decorator to the compile_fx_inner function. When `config.save_args` is true, it will pickle all the arguments to `compile_fx_inner` to the file system. Later on, we can call `load_args_and_run_compile_fx_inner("/tmp/inductor_saved_args/compile_fx_inner_0.pkl")` to replay the graph and compile it with inductor. Replaying the fwd graph took around 60 seconds (maybe this can be further reduced but this is already 2x speedup for dev efficiency), and it only took around 20 seconds to reach `Scheduler.__init__` method. I also checked `TORCH_COMPILE_DEBUG` flag that already exists. The most similar part of `TORCH_COMPILE_DEBUG` is it can save a graph and its arguments and later on rerun it. But the difference here is, rather than run the model, we want to call inductor API to compile the model (without even going thru dynamo or aot-autograd). Pull Request resolved: https://github.com/pytorch/pytorch/pull/106952 Approved by: https://github.com/jansel ghstack dependencies: #106990	2023-08-11 22:28:20 +00:00
Elias Ellison	8f4d8b3773	More descriptive graph diagram names in svg (#106146 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/106146 Approved by: https://github.com/jansel, https://github.com/Chillee	2023-07-28 17:34:09 +00:00
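
Note on commit 4f4ecc583e (RUFF TRY400): the message says that switching from `logging.error` to `logging.exception` inside an `except` block logs additional information. The sketch below illustrates that difference using only the standard library; it is not code from the PyTorch tree, and the helper `capture` and the logger names are hypothetical, chosen for this example.

```python
import io
import logging


def capture(log_call_name: str) -> str:
    """Log a caught ZeroDivisionError with the named Logger method
    ("error" or "exception") and return the text that was emitted."""
    stream = io.StringIO()
    # Hypothetical logger name; one logger per call keeps handlers isolated.
    logger = logging.getLogger(f"try400_demo.{log_call_name}")
    logger.setLevel(logging.ERROR)
    logger.propagate = False  # keep output out of the root logger
    handler = logging.StreamHandler(stream)
    logger.addHandler(handler)
    try:
        1 / 0
    except ZeroDivisionError:
        # logger.error records only the message;
        # logger.exception also appends the active traceback.
        getattr(logger, log_call_name)("division failed")
    finally:
        logger.removeHandler(handler)
    return stream.getvalue()


error_text = capture("error")
exception_text = capture("exception")
```

With `logger.error`, only the message is emitted; with `logger.exception`, the traceback of the exception being handled follows the message, which is the extra context the TRY400 rule is meant to preserve.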


75 Commits
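
Note on commit 6a84fe65ec (portable lookup of `dot`): the message says that shelling out to `which dot` fails on Windows, where `which` is not available by default, but it does not name the replacement. One commonly used portable option in Python is `shutil.which`, sketched below as an assumption rather than as the code that actually landed; the helper name `has_executable` is hypothetical.

```python
import shutil


def has_executable(name: str) -> bool:
    """Return True if `name` resolves to an executable on PATH.

    shutil.which searches PATH in-process, so no external `which`
    command is needed and the check behaves the same on Windows.
    """
    return shutil.which(name) is not None


# Example: decide whether a Graphviz-based plot can be attempted.
dot_available = has_executable("dot")
```

Because the lookup runs in-process, it avoids both the subprocess call and the dependency on a platform-specific shell utility.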