Commit Graph

14 Commits

Author SHA1 Message Date
3fd84a8592 [BE][PYFMT] migrate PYFMT for torch/[a-c]*/ to ruff format (#144554)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144554
Approved by: https://github.com/soulitzer
2025-07-03 18:56:07 +00:00
3819584f12 [precompile] Implement PrecompileContext for recording precompile artifacts, integrate with CompilePackage (#154415)
This PR implements a basic interface and test for PrecompileContext, a special CacheArtifactManager specifically designed for precompile. The job of a PrecompileContext is to record things precompile needs as torch is compiling,  dump it all into bytes, and then stitch it back together into a cache of callables.

## Why use CacheArtifactManager?
Precompile needs a way to record various serializable data as torch is compiling. CacheArtifactManager already does this today pretty well, handling a lot of serialization and cache information. So we're reusing a bunch of that infrastructure directly.

## How is it different from CacheArtifactManager?
Unlike regular CacheArtifactManager, PrecompileContext needs to be able to take the recorded artifacts and stitch them together after deserialization, to create a single working callable.
Since PrecompileContext doesn't need the cache keys, the "key" field of PrecompileArtifacts can be used for metadata relating to how to stitch the individual functions being compiled together into a full callable. For example, on a given dynamo compile, if there are multiple functions (via graph breaks or recompiles) being compiled, MegaCache would organize it like so:

![image](https://github.com/user-attachments/assets/49a0a75b-1e7f-4d96-8d81-6769fe5a53ca)

Whereas we'd visualize PrecompileContext's result like so:

![image](https://github.com/user-attachments/assets/fcc0dd4e-dfbf-4b13-9c08-2e99b373180b)

For now, we just handle eager mode; in the diff above, I'll hook up the other backend artifacts from PrecompileContext.

After this PR, precompile consists of three main interfaces:

### CompilePackage
- Everything needed to run one torch.compile'd function (including graph breaks)
- `__init__(fn, cache_entry)` Initializes with a DynamoCacheEntry
- `install(backends)` load precompile artifacts into function's dynamo state with a dictionary of backends
- `cache_entry()` return a serializable cache entry to save

### DynamoStore
- Responsible for tracking CompilePackages on disk (and/or in memory)
- `load_package(path)`: load a package given a torch compiled function and a path to the cache artifact
- `save_package(package, path): Save a CompiledPackage to a path. Calls PrecompileContext to grab backend data
- `record_package(package)`: Record a package to PrecompileContext (for global serialization/deserialization)

### PrecompileContext
- Overarching context for serializing and deserializing precompile artifacts. Supports **global** and **local** setups.
- `serialize()`: (Global) serializes all artifacts in PrecompileContext into bytes
- `populate_caches(bytes)`: (Global) takes serialized bytes and puts them into DynamoStore (TODO)
- `serialize_artifact_by_key(key)`: (Local) serialize a single artifact by its cache key

<img width="1455" alt="image" src="https://github.com/user-attachments/assets/99b61330-7607-4763-bdbc-85b366e82cdd" />

Pull Request resolved: https://github.com/pytorch/pytorch/pull/154415
Approved by: https://github.com/zhxchen17
ghstack dependencies: #155118
2025-06-13 14:11:24 +00:00
bb7e30c165 [MegaCache] Make MegaCache generic to allow external plugins registration (#152977)
Implements #152976

Pull Request resolved: https://github.com/pytorch/pytorch/pull/152977
Approved by: https://github.com/oulgen
2025-05-21 18:18:47 +00:00
f9bdfe90ae [MegaCache] Return None on no compilation (#151921)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/151921
Approved by: https://github.com/jamesjwu
2025-04-23 04:32:06 +00:00
8404c09b15 [MegaCache] Rename the PGO artifact when used between different jobs (#151482)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/151482
Approved by: https://github.com/bobrenjc93, https://github.com/jamesjwu
2025-04-17 17:09:29 +00:00
3cf0e2d8ec Add inductor standalone_compile API (#150670)
This PR adds standalone_compile API that does precompilation via caching to support vLLM use case in the short term while we work on the longer term precompilation solution.

```
standalone_compile(gm, example_inputs, options) -> CompiledArtifact
CompiledArtifact.save(path, format: binary|unpacked = binary)
CompiledArtifact.load(path, format: binary|unpacked = binary)
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/150670
Approved by: https://github.com/jamesjwu, https://github.com/zou3519
2025-04-15 23:38:15 +00:00
74f6bc28a7 Revert "Add inductor standalone_compile API (#150670)"
This reverts commit c9aef508984a31f03821eaad381468673ef29c0a.

Reverted https://github.com/pytorch/pytorch/pull/150670 on behalf of https://github.com/Camyll due to breaking internal builds with torch module not found error ([comment](https://github.com/pytorch/pytorch/pull/150670#issuecomment-2806975267))
2025-04-15 17:35:59 +00:00
c9aef50898 Add inductor standalone_compile API (#150670)
This PR adds standalone_compile API that does precompilation via caching to support vLLM use case in the short term while we work on the longer term precompilation solution.

```
standalone_compile(gm, example_inputs, options) -> CompiledArtifact
CompiledArtifact.save(path, format: binary|unpacked = binary)
CompiledArtifact.load(path, format: binary|unpacked = binary)
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/150670
Approved by: https://github.com/jamesjwu, https://github.com/zou3519
2025-04-14 22:00:09 +00:00
24b3ab9255 Revert "Add inductor standalone_compile API (#150670)"
This reverts commit bbc5fe850454df6860814ab77a1f3a4ca3698157.

Reverted https://github.com/pytorch/pytorch/pull/150670 on behalf of https://github.com/albanD due to Broke profiler test ([comment](https://github.com/pytorch/pytorch/pull/150670#issuecomment-2802067144))
2025-04-14 15:22:33 +00:00
bbc5fe8504 Add inductor standalone_compile API (#150670)
This PR adds standalone_compile API that does precompilation via caching to support vLLM use case in the short term while we work on the longer term precompilation solution.

```
standalone_compile(gm, example_inputs, options) -> CompiledArtifact
CompiledArtifact.save(path, format: binary|unpacked = binary)
CompiledArtifact.load(path, format: binary|unpacked = binary)
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/150670
Approved by: https://github.com/jamesjwu, https://github.com/zou3519
2025-04-14 07:07:10 +00:00
57addfcd58 Significantly speed up save_cache_artifacts (#148227)
While using save_cache_artifacts on internal workloads, we have noticed that repeatedly calling this function after every batch is incredibly expensive. This PR significantly speeds up this function call by opting out of pickle and redesigning serialization algorithm.

Essentially what we want is to be able to call serialize many times without incurring costs from scratch.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148227
Approved by: https://github.com/jamesjwu
ghstack dependencies: #148226
2025-03-03 17:28:41 +00:00
805c4b597a PEP585 update - torch/_higher_order_ops torch/_subclasses torch/backends torch/compiler torch/cuda torch/masked torch/mtia torch/nested (#145202)
See #145101 for details.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145202
Approved by: https://github.com/bobrenjc93
2025-01-20 22:37:26 +00:00
6e77d7cac5 Add AOTAutogradCache support for cache hot loading APIs (#144499)
This diff adds AOTAutogradCache support to the mega cache.

Differential Revision: [D67991059](https://our.internmc.facebook.com/intern/diff/D67991059/)

**NOTE FOR REVIEWERS**: This PR has internal Meta-specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D67991059/)!

Pull Request resolved: https://github.com/pytorch/pytorch/pull/144499
Approved by: https://github.com/oulgen
2025-01-13 07:07:18 +00:00
9ee242213b [RFC] Introduce cache hot loading APIs (a.k.a. "Mega-cache") (#143341)
This PR essentially introduces two new APIs
* torch.compiler.save_cache_artifacts
* torch.compiler.load_cache_artifacts

which aim to create a mega cache experience where the user can start collecting cache artifacts, and later call the save API to fetch them. In the next attempt, the user can "hot load" the cache artifacts via the load function.

This bundling approach reduces the need to rely on porting individual files one by one, or relying on many network requests.

Note that these APIs CANNOT log to structured logging as these functions will be called before and after compilation, as opposed to during compilation. Due to this limitation, the API returns a struct that the user can log with.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143341
Approved by: https://github.com/jansel
2025-01-07 23:13:24 +00:00