pytorch

mirror of https://github.com/pytorch/pytorch.git synced 2025-10-20 12:54:11 +08:00

Author	SHA1	Message	Date
James Wu	b54e466fd0	Megacache integration (#163533 ) This diff adds megacache integration for DynamoCache. Because DynamoCache requires lazy serialization, i.e. it can only be serialized once all relevant backends have been compiled and we're ready for a save, we actually do the DynamoCache saving only on a call to `torch.compiler.save_cache_artifacts`. Differential Revision: [D82735763](https://our.internmc.facebook.com/intern/diff/D82735763/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/163533 Approved by: https://github.com/oulgen, https://github.com/zhxchen17	2025-10-15 22:49:15 +00:00
sekyonda	c467e59cb0	dynamo configs to torch.compiler (#163517 ) Moving some dynamo configs to torch.compiler Pull Request resolved: https://github.com/pytorch/pytorch/pull/163517 Approved by: https://github.com/williamwen42, https://github.com/anijain2305 Co-authored-by: Svetlana Karslioglu <svekars@meta.com>	2025-10-14 22:44:53 +00:00
Maggie Moss	086dec3235	Pyrefly suppressions 6/n (#164877 ) Adds suppressions to pyrefly will typecheck clean: https://github.com/pytorch/pytorch/issues/163283 Almost there! Test plan: dmypy restart && python3 scripts/lintrunner.py -a pyrefly check step 1: delete lines in the pyrefly.toml file from the project-excludes field step 2: run pyrefly check step 3: add suppressions, clean up unused suppressions before: https://gist.github.com/maggiemoss/4b3bf2037014e116bc00706a16aef199 after: INFO 0 errors (5,064 ignored) Only four directories left to enable Pull Request resolved: https://github.com/pytorch/pytorch/pull/164877 Approved by: https://github.com/oulgen	2025-10-08 02:30:57 +00:00
Yuanyuan Chen	a43c4c3972	[5/N] Apply ruff UP035 rule (#164423 ) Continued code migration to enable ruff `UP035`. Most changes are about moving `Callable` from `typing` to `from collections.abc`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/164423 Approved by: https://github.com/ezyang	2025-10-02 07:31:11 +00:00
James Wu	bfe9e60ffb	Simplify PrecompileContext to no longer be a CacheArtifactManager (#162886 ) Summary: This diff does a big refactor of PrecompileContext to make it considerably simpler: instead of being a CacheArtifactManager and managing a bunch of bytes, it simply stores two things: dynamo cache entries and backend cache entries. When asked, it stitches them together into PrecompileCacheEntries, which are stored by DynamoCache. This structure then allows us to register DynamoCache to the regular Megacache API, instead of having two separate APIs that are confusing. It also lets us remove the autotune cache integration, since MegaCache API will automatically store autotune cache entries. The intent here is that users who want to use caching precompile will simply be able to use torch.compiler.save_cache_artifacts as before, just with `torch.dynamo.config.caching_precompile` set to True. They can also directly interact with PrecompileContext if they wish to specifically only load Precompile entries, using PrecompileContext.create_cache_entries(). Saving single entries and such with DynamoCache still works normally. Test Plan: All existing unit tests pass. Rollback Plan: Differential Revision: D82380307 Pull Request resolved: https://github.com/pytorch/pytorch/pull/162886 Approved by: https://github.com/zhxchen17	2025-09-20 01:24:37 +00:00
James Wu	eb9073a6b7	[easy] [precompile] Convert CompileArtifacts to callable (#162169 ) The goal of this PR stack is to be able to implement `aot_compile_module`, which AOT precompiles a torch.nn.Module. Step 1 is a simple refactor to make CompileArtifacts itself the callable, which makes it easier to use directly. Pull Request resolved: https://github.com/pytorch/pytorch/pull/162169 Approved by: https://github.com/zhxchen17	2025-09-07 23:37:31 +00:00
zhxchen17	c36d18d7e8	[rfc] aot precompile with custom backend api (#161383 ) Adding a new feature to torch.compile(fullgraph=True) which "aot_compile" a function with given example inputs. On user side it should look like: ``` def foo(x, y): return x + y compiled_fn = torch.compile(fullgraph=True).aot_compile(((torch.randn(3, 4), torch.randn(3, 4)), {})) ``` This is different from the traditional `torch.compile` workflow where compiled object will be a drop-in replacement for the original eager model: ``` tensor input -> torch.compile() -> tensor output (and populates the cache entry) ``` `aot_compile` will instead return a compiled function as result, and it's purely functional and doesn't populate the compile cache entry in dynamo: ``` tensor input -> aot_compile() -> compiled function ``` The aot compiled function will be savable and loadable on disk as well: ``` torch.compile(fullgraph=True).aot_compile(...).save_compiled_function('my/path') compiled_fn = torch.compiler.load_compiled_function("my/path") ``` Right now we treat compiler backend as a blackbox and it needs to implement the following interface to make compile artifacts serialzable: ``` class SerializableCallable: def save_compile_artifacts(): .... def load_compile_artifacts(): .... ``` We haven't implemented this for inductor yet, but this shouldn't be an issue since we gate this feature through `torch._dynamo.config.aot_compile` (which defaults to False), and this will be left as follow up PR to the current PR. Differential Revision: [D80914270](https://our.internmc.facebook.com/intern/diff/D80914270/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/161383 Approved by: https://github.com/tugsbayasgalan	2025-08-27 21:26:25 +00:00
Pian Pawakapan	075a2e6967	[PGO] add extra read/write keys (#160715 ) Differential Revision: D80321215 Pull Request resolved: https://github.com/pytorch/pytorch/pull/160715 Approved by: https://github.com/bobrenjc93	2025-08-18 01:41:08 +00:00
Oguz Ulgen	a29ed5e1ac	Add torch compile force disable caches alias (#158072 ) Bunch of people keep thinking current alias only disables inductor cache because it has the name inductor in it. lets globalize the name Pull Request resolved: https://github.com/pytorch/pytorch/pull/158072 Approved by: https://github.com/ezyang	2025-08-02 23:23:17 +00:00
Aaron Orenstein	b794e77b7b	Disable cudagraph GCs by default (#158649 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/158649 Approved by: https://github.com/eellison ghstack dependencies: #158193	2025-07-29 19:56:11 +00:00
Aaron Orenstein	e20736bf1d	Dont't GC as often when collecting cudagraphs (#158193 ) TL;DR: Cuts vLLM cudagraph collection from 80s -> 24s Stop garbage collecting by default on every cudagraph recording. The old behavior can be re-enabled by setting `TORCH_CUDAGRAPH_GC=1` or the config `force_cudagraph_gc`. We were previously garbage collecting at the beginning of each cudagraph capture. vLLM collects 5427 graphs and most of those garbage collections weren't actually collecting any memory (CPU or GPU). This changes it to not collect more than every 10s so if we're capturing in a loop we don't burn all our cycles looking for garbage. (These number have a lot of variance from run to run but give the correct general scale) ``` \| calls \| total \| synchronize \| gcs \| collect \| empty cache \| sys freed \| cuda freed \| -------+-------+-------+-------------+------+---------+-------------+-----------+------------+ before \| 5427 \| 78s \| 1.48s \| 5427 \| 53.22s \| 1.21s \| 145855 \| 1539309568 \| -------+-------+-------+-------------+------+---------+-------------+-----------+------------+ after \| 5427 \| 24s \| 0s \| 3 \| 1.53s \| 0.84s \| 592 \| 1539309568 \| -------+-------+-------+-------------+------+---------+-------------+-----------+------------+ ``` total - this is the total time reported by vLLM's "Graph capturing finished" log. The rest of these are measured in torch.cuda.graphs.graph.__enter__(): calls - number of times torch.cuda.graphs.graph.__enter__ was called synchronize - this is the duration taken by the cuda.synchronize call gcs - number of times gc.collect was called collect - this is the duration taken by the gc.collect call empty cache - this is the duration taken by the torch.cuda.empty_cache call sys freed - the number of bytes reported freed by gc.collect cuda freed - the number of bytes reported freed by torch.cuda.memory_reserved So it seems like the heavy lifting is done by torch.cuda.empty_cache() which is fairly quick. Cudagraph results from the TorchInductor Performance DashBoard (this is from the original version using the GC clock so the real results will be slightly better than this): <img width="1494" height="382" alt="image" src="https://github.com/user-attachments/assets/69b705ef-47ce-4b6e-9733-1ec941cad93d" /> Pull Request resolved: https://github.com/pytorch/pytorch/pull/158193 Approved by: https://github.com/ngimel	2025-07-24 21:37:11 +00:00
PyTorch MergeBot	9a7c2f1f64	Revert "Add torch compile force disable caches alias (#158072 )" This reverts commit 2ecf083b7247f265a03ec296ba9d7b795f035118. Reverted https://github.com/pytorch/pytorch/pull/158072 on behalf of https://github.com/jeffdaily due to fails on rocm, signal ignored while rocm was unstable ([comment](https://github.com/pytorch/pytorch/pull/158072#issuecomment-3086740829))	2025-07-18 04:58:24 +00:00
Lucas Kabela	b0e325c2c8	[Dynamo][Better Engineering] Add type coverage to decorators (#158509 ) As part of better engineering week, we would like to improve out type support to improve dev experience in dynamo This PR adds strict typing support to an important file in dynamo, `decorators.py` NOTE: Untyped fns are because there is a conflict with `__init__.py` in compiler so we can't type these at this time Running ``` mypy torch/_dynamo/decorators.py --linecount-report /tmp/coverage_log ``` \| -------- \| Lines Unannotated \| Lines Total \| % lines covered \| Funcs Unannotated \| Funcs Total \| % funcs covered \| \| -------- \| ------- \| -------- \| ------- \| ------- \| ------- \| ------- \| \| Main \| 209 \| 908 \| 23.02% \| 9 \| 39 \| 23.08% \| \| This PR \| 870 \| 943 \| 100.00% \| 36 \| 39 \| 100.00% \| \| Delta \| +661 \| +35 \| +76.98% \| +27 \| 0 \| +76.92% \| Pull Request resolved: https://github.com/pytorch/pytorch/pull/158509 Approved by: https://github.com/williamwen42	2025-07-17 23:31:26 +00:00
Oguz Ulgen	2ecf083b72	Add torch compile force disable caches alias (#158072 ) Bunch of people keep thinking current alias only disables inductor cache because it has the name inductor in it. lets globalize the name Pull Request resolved: https://github.com/pytorch/pytorch/pull/158072 Approved by: https://github.com/ezyang	2025-07-17 15:40:36 +00:00
Lucas Kabela	a4d753295e	[Dynamo][Better Engineering] Add enhanced typing support to `_dynamo/eval_frame.py` (#158276 ) As part of better engineering week, we would like to improve out type support to improve dev experience in dynamo This PR adds strict typing support to the main entrypoint for dynamo, `eval_frame.py` Running ``` mypy torch/_dynamo/eval_frame.py --linecount-report /tmp/coverage_log ``` \| -------- \| Lines Unannotated \| Lines Total \| % lines covered \| Funcs Unannotated \| Funcs Total \| % funcs covered \| \| -------- \| ------- \| -------- \| ------- \| ------- \| ------- \| ------- \| \| Main \| 623 \| 2232 \| 27.91% \| 19 \| 68 \| 27.94% \| \| This PR \| 2285 \| 2285 \| 100.00% \| 68 \| 68 \| 100.00% \| \| Delta \| +1662 \| +63 \| +72.09% \| +49 \| 0 \| +72.06% \| Pull Request resolved: https://github.com/pytorch/pytorch/pull/158276 Approved by: https://github.com/williamwen42 Co-authored-by: William Wen <williamwen@meta.com>	2025-07-16 23:31:10 +00:00
Xuehai Pan	4cc8b60d1b	[BE][1/16] fix typos in torch/ (#156311 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/156311 Approved by: https://github.com/albanD	2025-07-09 11:02:22 +00:00
James Wu	be56a8d7ac	Automatically load and save dynamo entries via caching_precompile (#155913 ) This PR adds a new config option, `caching_precompile`, and a `DynamoCache`, which loads and saves Dynamo Cache entries automatically. It also hooks up DynamoCache to PrecompileContext, so that we can save multiple cache entries. When this configuration is turned on, we: - Automatically create and initialize a CompilePackage on every torch.compile - Automatically use BundledAutogradcache - Automatically save the CompilePackage entry to DynamoCache after every compile You can also use PrecompileContext.serialize() to manually serialize a full object. I've added unit tests to exhibit this behavior. Pull Request resolved: https://github.com/pytorch/pytorch/pull/155913 Approved by: https://github.com/zhxchen17	2025-07-07 23:57:17 +00:00
PyTorch MergeBot	ae1094b72b	Revert "[WIP] Automatically load and save dynamo entries via caching_precompile (#155913 )" This reverts commit e466dab164d9236bfe5817ec8e4d24c7b9d3e392. Reverted https://github.com/pytorch/pytorch/pull/155913 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it seems to fail a test in trunk ([comment](https://github.com/pytorch/pytorch/pull/155913#issuecomment-3045914878))	2025-07-07 16:53:35 +00:00
James Wu	e466dab164	[WIP] Automatically load and save dynamo entries via caching_precompile (#155913 ) This PR adds a new config option, `caching_precompile`, and a `DynamoCache`, which loads and saves Dynamo Cache entries automatically. It also hooks up DynamoCache to PrecompileContext, so that we can save multiple cache entries. When this configuration is turned on, we: - Automatically create and initialize a CompilePackage on every torch.compile - Automatically use BundledAutogradcache - Automatically save the CompilePackage entry to DynamoCache after every compile You can also use PrecompileContext.serialize() to manually serialize a full object. I've added unit tests to exhibit this behavior. Pull Request resolved: https://github.com/pytorch/pytorch/pull/155913 Approved by: https://github.com/zhxchen17	2025-07-07 11:56:30 +00:00
Xuehai Pan	3fd84a8592	[BE][PYFMT] migrate PYFMT for `torch/[a-c]*/` to `ruff format` (#144554 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/144554 Approved by: https://github.com/soulitzer	2025-07-03 18:56:07 +00:00
Edward Z. Yang	17eb649d55	Implement guard collectives (optimized version) (#156562 ) This is a remix of https://github.com/pytorch/pytorch/pull/155558 Instead of mediating guard collective via a config option, in this one it's done via a `set_stance` like API. The motivation is that checking for the config value on entry on torch.compile is apparently quite expensive, according to functorch_maml_omniglot. So this makes it a bit cheaper. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/156562 Approved by: https://github.com/Microve	2025-06-24 04:59:49 +00:00
Animesh Jain	fab85fc5f9	[compile][hierarchical compilation] Release nested_compile_region API (#156449 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/156449 Approved by: https://github.com/zou3519, https://github.com/jansel	2025-06-21 15:14:59 +00:00
Animesh Jain	54976bca10	[dynamo] Provide helper functions for guard filter hook (#155083 ) Collection of ready-made guard filters. One issue is that they are not composable - `filter1(filter2(guard))`. On the other hand, they are easy to use. Pull Request resolved: https://github.com/pytorch/pytorch/pull/155083 Approved by: https://github.com/zhxchen17, https://github.com/jansel	2025-06-15 17:49:36 +00:00
James Wu	3819584f12	[precompile] Implement PrecompileContext for recording precompile artifacts, integrate with CompilePackage (#154415 ) This PR implements a basic interface and test for PrecompileContext, a special CacheArtifactManager specifically designed for precompile. The job of a PrecompileContext is to record things precompile needs as torch is compiling, dump it all into bytes, and then stitch it back together into a cache of callables. ## Why use CacheArtifactManager? Precompile needs a way to record various serializable data as torch is compiling. CacheArtifactManager already does this today pretty well, handling a lot of serialization and cache information. So we're reusing a bunch of that infrastructure directly. ## How is it different from CacheArtifactManager? Unlike regular CacheArtifactManager, PrecompileContext needs to be able to take the recorded artifacts and stitch them together after deserialization, to create a single working callable. Since PrecompileContext doesn't need the cache keys, the "key" field of PrecompileArtifacts can be used for metadata relating to how to stitch the individual functions being compiled together into a full callable. For example, on a given dynamo compile, if there are multiple functions (via graph breaks or recompiles) being compiled, MegaCache would organize it like so: ![image](https://github.com/user-attachments/assets/49a0a75b-1e7f-4d96-8d81-6769fe5a53ca) Whereas we'd visualize PrecompileContext's result like so: ![image](https://github.com/user-attachments/assets/fcc0dd4e-dfbf-4b13-9c08-2e99b373180b) For now, we just handle eager mode; in the diff above, I'll hook up the other backend artifacts from PrecompileContext. After this PR, precompile consists of three main interfaces: ### CompilePackage - Everything needed to run one torch.compile'd function (including graph breaks) - `__init__(fn, cache_entry)` Initializes with a DynamoCacheEntry - `install(backends)` load precompile artifacts into function's dynamo state with a dictionary of backends - `cache_entry()` return a serializable cache entry to save ### DynamoStore - Responsible for tracking CompilePackages on disk (and/or in memory) - `load_package(path)`: load a package given a torch compiled function and a path to the cache artifact - `save_package(package, path): Save a CompiledPackage to a path. Calls PrecompileContext to grab backend data - `record_package(package)`: Record a package to PrecompileContext (for global serialization/deserialization) ### PrecompileContext - Overarching context for serializing and deserializing precompile artifacts. Supports global and local setups. - `serialize()`: (Global) serializes all artifacts in PrecompileContext into bytes - `populate_caches(bytes)`: (Global) takes serialized bytes and puts them into DynamoStore (TODO) - `serialize_artifact_by_key(key)`: (Local) serialize a single artifact by its cache key <img width="1455" alt="image" src="https://github.com/user-attachments/assets/99b61330-7607-4763-bdbc-85b366e82cdd" /> Pull Request resolved: https://github.com/pytorch/pytorch/pull/154415 Approved by: https://github.com/zhxchen17 ghstack dependencies: #155118	2025-06-13 14:11:24 +00:00
bobrenjc93	ea5b9eca74	Combine sticky pgo key with job id (#154863 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/154863 Approved by: https://github.com/Mingming-Ding	2025-06-03 07:58:38 +00:00
bobrenjc93	984b1a80e3	[ez] add docs for *eager_then_compile stances (#154818 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/154818 Approved by: https://github.com/williamwen42 ghstack dependencies: #154802, #154826, #154822, #154823, #154805	2025-06-02 19:04:35 +00:00
bobrenjc93	d865b784e4	Support unbacked whitelist (#154295 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/154295 Approved by: https://github.com/angelayi	2025-05-28 23:01:22 +00:00
bobrenjc93	2560c1f3f0	add sticky cache pgo (#154418 ) It's a reland of https://github.com/pytorch/pytorch/pull/154394 that hit some mergebot bug Pull Request resolved: https://github.com/pytorch/pytorch/pull/154418 Approved by: https://github.com/malfet	2025-05-27 16:40:18 +00:00
Tomasz Bohutyn	bb7e30c165	[MegaCache] Make MegaCache generic to allow external plugins registration (#152977 ) Implements #152976 Pull Request resolved: https://github.com/pytorch/pytorch/pull/152977 Approved by: https://github.com/oulgen	2025-05-21 18:18:47 +00:00
Oguz Ulgen	f9bdfe90ae	[MegaCache] Return None on no compilation (#151921 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/151921 Approved by: https://github.com/jamesjwu	2025-04-23 04:32:06 +00:00
Oguz Ulgen	8404c09b15	[MegaCache] Rename the PGO artifact when used between different jobs (#151482 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/151482 Approved by: https://github.com/bobrenjc93, https://github.com/jamesjwu	2025-04-17 17:09:29 +00:00
Oguz Ulgen	3cf0e2d8ec	Add inductor standalone_compile API (#150670 ) This PR adds standalone_compile API that does precompilation via caching to support vLLM use case in the short term while we work on the longer term precompilation solution. ``` standalone_compile(gm, example_inputs, options) -> CompiledArtifact CompiledArtifact.save(path, format: binary\|unpacked = binary) CompiledArtifact.load(path, format: binary\|unpacked = binary) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/150670 Approved by: https://github.com/jamesjwu, https://github.com/zou3519	2025-04-15 23:38:15 +00:00
PyTorch MergeBot	74f6bc28a7	Revert "Add inductor standalone_compile API (#150670 )" This reverts commit c9aef508984a31f03821eaad381468673ef29c0a. Reverted https://github.com/pytorch/pytorch/pull/150670 on behalf of https://github.com/Camyll due to breaking internal builds with torch module not found error ([comment](https://github.com/pytorch/pytorch/pull/150670#issuecomment-2806975267))	2025-04-15 17:35:59 +00:00
Oguz Ulgen	c9aef50898	Add inductor standalone_compile API (#150670 ) This PR adds standalone_compile API that does precompilation via caching to support vLLM use case in the short term while we work on the longer term precompilation solution. ``` standalone_compile(gm, example_inputs, options) -> CompiledArtifact CompiledArtifact.save(path, format: binary\|unpacked = binary) CompiledArtifact.load(path, format: binary\|unpacked = binary) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/150670 Approved by: https://github.com/jamesjwu, https://github.com/zou3519	2025-04-14 22:00:09 +00:00
PyTorch MergeBot	24b3ab9255	Revert "Add inductor standalone_compile API (#150670 )" This reverts commit bbc5fe850454df6860814ab77a1f3a4ca3698157. Reverted https://github.com/pytorch/pytorch/pull/150670 on behalf of https://github.com/albanD due to Broke profiler test ([comment](https://github.com/pytorch/pytorch/pull/150670#issuecomment-2802067144))	2025-04-14 15:22:33 +00:00
Oguz Ulgen	bbc5fe8504	Add inductor standalone_compile API (#150670 ) This PR adds standalone_compile API that does precompilation via caching to support vLLM use case in the short term while we work on the longer term precompilation solution. ``` standalone_compile(gm, example_inputs, options) -> CompiledArtifact CompiledArtifact.save(path, format: binary\|unpacked = binary) CompiledArtifact.load(path, format: binary\|unpacked = binary) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/150670 Approved by: https://github.com/jamesjwu, https://github.com/zou3519	2025-04-14 07:07:10 +00:00
William Wen	25eff6e991	[dynamo] add reason field to torch.compiler.disable (#150341 ) Implements https://github.com/pytorch/pytorch/issues/146445 Pull Request resolved: https://github.com/pytorch/pytorch/pull/150341 Approved by: https://github.com/zou3519, https://github.com/jansel	2025-04-02 04:26:48 +00:00
bobrenjc93	2dcdb4ba78	[ez] include config as part of __all__ in torch.compiler (#148978 ) Right now we are susceptive to a race condition where if the torch.compiler.config is not implicitly import via dynamo/builder.py, we will throw an error when trying to set compiler configs. This fixes it by including config in `__all__`. Previous ``` >>> import torch >>> torch.compiler.config.dynamic_sources = "L['kwargs']['float_features']" Traceback (most recent call last): File "<stdin>", line 1, in <module> AttributeError: module 'torch.compiler' has no attribute 'config' >>> torch.compiler.config.dynamic_sources = "L['kwargs']['float_features']" Traceback (most recent call last): File "<stdin>", line 1, in <module> AttributeError: module 'torch.compiler' has no attribute 'config' ``` Now ``` >>> import torch >>> torch.compiler.config.dynamic_sources = "L['kwargs']['float_features']" ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/148978 Approved by: https://github.com/bdhirsh, https://github.com/laithsakka	2025-03-11 21:58:38 +00:00
Oguz Ulgen	57addfcd58	Significantly speed up save_cache_artifacts (#148227 ) While using save_cache_artifacts on internal workloads, we have noticed that repeatedly calling this function after every batch is incredibly expensive. This PR significantly speeds up this function call by opting out of pickle and redesigning serialization algorithm. Essentially what we want is to be able to call serialize many times without incurring costs from scratch. Pull Request resolved: https://github.com/pytorch/pytorch/pull/148227 Approved by: https://github.com/jamesjwu ghstack dependencies: #148226	2025-03-03 17:28:41 +00:00
bobrenjc93	4708cfdbd9	Support whitelist of dynamic sources (#147979 ) This PR introduces the ability to whitelist sources as dynamic. This is particularly useful for large models with graph breaks, as you can keep the dynamism across graph breaks since source names stay consistent. Additionally you can use this to mark ints as dynamic. NB: I intentionally didn't complicate the interface by supporting specification of per dimension dynamism. There is virtue in keeping true to the standard way of representing sources (eg. L['x']). If we find in practice that we need more more fine grained control, we can explore further affordances at that time. Pull Request resolved: https://github.com/pytorch/pytorch/pull/147979 Approved by: https://github.com/Mingming-Ding	2025-02-28 15:43:14 +00:00
Aaron Orenstein	db4ce78d46	PEP585: More UP006 fixes (#146392 ) This should be the final PR before we can enable RUFF UP006. Pull Request resolved: https://github.com/pytorch/pytorch/pull/146392 Approved by: https://github.com/justinchuby, https://github.com/albanD, https://github.com/Skylion007	2025-02-20 06:18:13 +00:00
Nikita Shulga	95ff9f0340	[Doc] Add period at the end of the sentence (#145384 ) Test plan: https://docs-preview.pytorch.org/pytorch/pytorch/145384/generated/torch.compiler.disable.html#torch-compiler-disable Fixes https://github.com/pytorch/pytorch/issues/145365 Pull Request resolved: https://github.com/pytorch/pytorch/pull/145384 Approved by: https://github.com/huydhn, https://github.com/svekars, https://github.com/kit1980	2025-01-22 19:56:31 +00:00
Aaron Orenstein	805c4b597a	PEP585 update - torch/_higher_order_ops torch/_subclasses torch/backends torch/compiler torch/cuda torch/masked torch/mtia torch/nested (#145202 ) See #145101 for details. Pull Request resolved: https://github.com/pytorch/pytorch/pull/145202 Approved by: https://github.com/bobrenjc93	2025-01-20 22:37:26 +00:00
Aaron Orenstein	a79100ab11	PEP585 update - torch/_dynamo (#145105 ) See #145101 for details. Pull Request resolved: https://github.com/pytorch/pytorch/pull/145105 Approved by: https://github.com/bobrenjc93	2025-01-18 20:47:11 +00:00
James Wu	6e77d7cac5	Add AOTAutogradCache support for cache hot loading APIs (#144499 ) This diff adds AOTAutogradCache support to the mega cache. Differential Revision: [D67991059](https://our.internmc.facebook.com/intern/diff/D67991059/) NOTE FOR REVIEWERS: This PR has internal Meta-specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D67991059/)! Pull Request resolved: https://github.com/pytorch/pytorch/pull/144499 Approved by: https://github.com/oulgen	2025-01-13 07:07:18 +00:00
Oguz Ulgen	9ee242213b	[RFC] Introduce cache hot loading APIs (a.k.a. "Mega-cache") (#143341 ) This PR essentially introduces two new APIs * torch.compiler.save_cache_artifacts * torch.compiler.load_cache_artifacts which aim to create a mega cache experience where the user can start collecting cache artifacts, and later call the save API to fetch them. In the next attempt, the user can "hot load" the cache artifacts via the load function. This bundling approach reduces the need to rely on porting individual files one by one, or relying on many network requests. Note that these APIs CANNOT log to structured logging as these functions will be called before and after compilation, as opposed to during compilation. Due to this limitation, the API returns a struct that the user can log with. Pull Request resolved: https://github.com/pytorch/pytorch/pull/143341 Approved by: https://github.com/jansel	2025-01-07 23:13:24 +00:00
yijun-lee	d4609af1ca	Propagate callable parameter types using ParamSpec (#142306 ) (#144047 ) Fixes #142306 This PR includes typing improvements and refactoring for the following files: - __init__.py - decorators.py - _ops.py Pull Request resolved: https://github.com/pytorch/pytorch/pull/144047 Approved by: https://github.com/XuehaiPan, https://github.com/Skylion007 Co-authored-by: Aaron Gokaslan <aaronGokaslan@gmail.com> Co-authored-by: Xuehai Pan <XuehaiPan@pku.edu.cn>	2025-01-06 16:16:18 +00:00
Yidi Wu	1e201422ed	[export] add is_exporting flag (#142425 ) We added an is_export flag under torch.compiler.is_exporting. This comes handy when we try to do some special logic in user-level and system-level (e.g. in upper of the stack). In increasing-scope: - `_is_fx_tracing` is set to True when we use under symbolic_trace or make_fx. - `is_exporting` is set to True when we're doing strict or non-strict export, which internally has a step that calls make_fx and set _is_fx_tracing to be True. - `is_compiling` is set to True when we're either doing strict, non-strict export or torch.compile. Pull Request resolved: https://github.com/pytorch/pytorch/pull/142425 Approved by: https://github.com/avikchaudhuri	2024-12-18 21:36:28 +00:00
Oguz Ulgen	28d8297712	Migrate compiler config to Config (#143152 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/143152 Approved by: https://github.com/ezyang ghstack dependencies: #143229	2024-12-14 07:38:25 +00:00
PyTorch MergeBot	e87f07d3b8	Revert "Migrate compiler config to Config (#143152 )" This reverts commit 1ebdfd56053dafa8880a0dedf535fff70aa92e09. Reverted https://github.com/pytorch/pytorch/pull/143152 on behalf of https://github.com/oulgen due to lint failure ([comment](https://github.com/pytorch/pytorch/pull/143152#issuecomment-2542342073))	2024-12-13 20:55:14 +00:00

1 2

79 Commits