pytorch

mirror of https://github.com/pytorch/pytorch.git synced 2025-10-21 13:44:15 +08:00

Author	SHA1	Message	Date
bobrenjc93	db00e1699a	[pc] introduce ProgressiveCompilationState and clear callback (#157619) Follow-up to https://github.com/pytorch/pytorch/pull/157305, where @aorenste correctly suggested clearing the callback. This refactor introduces a new dataclass so we don't need to check nullability for each field. Pull Request resolved: https://github.com/pytorch/pytorch/pull/157619 Approved by: https://github.com/aorenste ghstack dependencies: #157305, #157614	2025-07-05 07:55:11 +00:00
bobrenjc93	5ea832e5f6	[pc] migrate progression futures from list to deque (#157614) Pull Request resolved: https://github.com/pytorch/pytorch/pull/157614 Approved by: https://github.com/aorenste ghstack dependencies: #157305	2025-07-05 07:55:03 +00:00
bobrenjc93	d58ed04d89	[async-compile] add progressive compile mode (#157305) Pull Request resolved: https://github.com/pytorch/pytorch/pull/157305 Approved by: https://github.com/aorenste	2025-07-04 04:18:50 +00:00
bobrenjc93	9d677389cb	[async compile] make it more obvious that we support backwards (#157204)
Currently failing with:
```
(/home/bobren/local/a/pytorch-env) [13:02] devgpu009:/home/bobren/local/a/pytorch python test/inductor/test_compile_subprocess.py -k GPUTests.test_async
/home/bobren/local/a/pytorch/torch/backends/cudnn/__init__.py:115: UserWarning: PyTorch was compiled without cuDNN/MIOpen support. To use cuDNN/MIOpen, rebuild PyTorch making sure the library is visible to the build system.
warnings.warn(
/home/bobren/local/a/pytorch/torch/_inductor/ops_handler.py:741: UserWarning: undefined OpHandler.__getstate__, please add missing op schema
warnings.warn(f"undefined OpHandler.{name}, please add missing op schema")
/home/bobren/local/a/pytorch/torch/_inductor/ops_handler.py:741: UserWarning: undefined OpHandler.__getstate__, please add missing op schema
warnings.warn(f"undefined OpHandler.{name}, please add missing op schema")
W0628 13:02:30.666000 3610483 torch/_inductor/compile_fx_ext.py:491] [0/0] Unable to pickle input graph or example inputs
W0628 13:02:30.666000 3610483 torch/_inductor/compile_fx_ext.py:491] [0/0] Traceback (most recent call last):
W0628 13:02:30.666000 3610483 torch/_inductor/compile_fx_ext.py:491] [0/0] File "/home/bobren/local/a/pytorch/torch/_inductor/compile_fx_ext.py", line 484, in serialize_compile
W0628 13:02:30.666000 3610483 torch/_inductor/compile_fx_ext.py:491] [0/0] ).serialize()
W0628 13:02:30.666000 3610483 torch/_inductor/compile_fx_ext.py:491] [0/0] File "/home/bobren/local/a/pytorch/torch/_inductor/compile_fx_ext.py", line 210, in serialize
W0628 13:02:30.666000 3610483 torch/_inductor/compile_fx_ext.py:491] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self))
W0628 13:02:30.666000 3610483 torch/_inductor/compile_fx_ext.py:491] [0/0] File "/home/bobren/local/a/pytorch/torch/fx/_graph_pickler.py", line 124, in dumps
W0628 13:02:30.666000 3610483 torch/_inductor/compile_fx_ext.py:491] [0/0] pickler.dump(obj)
W0628 13:02:30.666000 3610483 torch/_inductor/compile_fx_ext.py:491] [0/0] AttributeError: Can't pickle local object 'make_opaque_bitwise_fn.<locals>.BitwiseFn'
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/157204 Approved by: https://github.com/aorenste	2025-06-29 05:38:54 +00:00
James Wu	fe954cdcbf	Use correct boxed_forward_device_index when running `CompiledFxGraph.post_compile` (#148130) This PR threads the correct boxed_forward_device_index from graph_kwargs through to CompiledFxGraph.post_compile. This allows us to correctly update BoxedDeviceIndex on cache hits. We don't actually need to save `boxed_forward_device_index` in CompiledFxGraph because its value is in the cache key, so it always matches the ambient one anyway. On forward with cudagraphs enabled, derive `boxed_forward_device_index`'s value from `device_idxs`. Testing: ``` python benchmarks/dynamo/cachebench.py --mode training --benchmark torchbench --model BERT_pytorch --device cuda --repeat 1 --dynamic --output="dynamic.json" ``` FXGraphCache now hits properly. AOTAutogradCache has a guard failure; will look into that as a follow-up. Pull Request resolved: https://github.com/pytorch/pytorch/pull/148130 Approved by: https://github.com/eellison	2025-03-23 02:57:58 +00:00
Aaron Orenstein	2fcfae72b4	async fx compile (#146135) Adds the ability to run the selected out-of-process fx compile scheme in async mode, where we kick off the compile and then run eagerly until the compile is finished. Added a test which runs a tiny model in a loop, making sure that we execute it first eagerly and then compiled. Differential Revision: [D71135546](https://our.internmc.facebook.com/intern/diff/D71135546) Pull Request resolved: https://github.com/pytorch/pytorch/pull/146135 Approved by: https://github.com/jamesjwu, https://github.com/jansel	2025-03-19 14:07:51 +00:00
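
Commit 2fcfae72b4 describes async mode as "kick off the compile and then run eagerly until the compile is finished". The general pattern can be sketched as below. This is only an illustration, not PyTorch's implementation: every name here (`AsyncCompiled`, `make_gated_compile`, `eager_fn`, `run_demo`) is hypothetical, and a thread pool stands in for the out-of-process compile that the commit refers to.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def eager_fn(x):
    """Stand-in for running the model eagerly."""
    return x * 2


def make_gated_compile(gate):
    """Build a fake compiler that blocks until `gate` is set (simulates a slow compile)."""
    def compile_fn(fn):
        gate.wait()

        def compiled(x):
            # A real compiler would return optimized code; here it is just a wrapper.
            return fn(x)

        return compiled

    return compile_fn


class AsyncCompiled:
    """Runs eagerly until the background compile finishes, then switches over."""

    def __init__(self, fn, compile_fn, executor):
        self._eager = fn
        self._compiled = None
        # Kick off the compile immediately; calls do not wait for it.
        self._future = executor.submit(compile_fn, fn)

    @property
    def using_compiled(self):
        return self._compiled is not None

    def wait_for_compile(self):
        """Demo-only helper: block until the background compile is done."""
        if self._future is not None:
            self._future.result()

    def __call__(self, x):
        # Swap in the compiled callable once it is ready, and drop the future.
        if self._future is not None and self._future.done():
            self._compiled = self._future.result()
            self._future = None
        target = self._compiled if self._compiled is not None else self._eager
        return target(x)


def run_demo():
    gate = threading.Event()
    with ThreadPoolExecutor(max_workers=1) as pool:
        model = AsyncCompiled(eager_fn, make_gated_compile(gate), pool)
        first = model(3)                      # compile still pending: eager path
        compiled_on_first = model.using_compiled
        gate.set()                            # let the fake compile finish
        model.wait_for_compile()
        second = model(3)                     # compiled path is now used
        compiled_on_second = model.using_compiled
    return first, compiled_on_first, second, compiled_on_second
```

Dropping the future once its result has been consumed mirrors, loosely, the idea in db00e1699a of clearing state that is no longer needed; how PyTorch actually structures this is not shown in this log.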

6 Commits
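
The traceback quoted in commit 9d677389cb ends with `Can't pickle local object 'make_opaque_bitwise_fn.<locals>.BitwiseFn'`. A minimal sketch of why that class of error occurs, using only the standard library: `pickle` serializes classes by importable name, so a class defined inside a function cannot be looked up again. The helper names (`make_local_class`, `can_pickle`) are hypothetical and unrelated to PyTorch's code.

```python
import pickle
from fractions import Fraction


def make_local_class():
    class Local:  # defined inside a function, so its qualified name contains '<locals>'
        pass

    return Local


def can_pickle(obj):
    """Return True if `obj` pickles, False if pickle rejects it."""
    try:
        pickle.dumps(obj)
        return True
    except (AttributeError, pickle.PicklingError):
        # The exception type for local objects differs between Python versions.
        return False


local_ok = can_pickle(make_local_class())   # class is not reachable by name
module_level_ok = can_pickle(Fraction)      # importable as fractions.Fraction
```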