Avoid merges from the extra PGO key if the same source has a different rank. This is unlikely to happen (it requires a code hash match while the source variable type changes), but we're being safe.
Differential Revision: D81299840
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162097
Approved by: https://github.com/bobrenjc93
Summary:
This feature is Meta internal only
Add a util function to put dynamic shape-related suggestions into MLHubDebugInsightService, which will then be surfaced to users in MLHub.
The rollout will be controlled by JK.
Test Plan:
MAST job aps-omnifmv3_dev_baseline_test-a34fdccf21
{F1980593060}
* If you're not able to see the insight, please add yourself to this gk: 'mlhub_debugging_insights_dev_visibility'.
* The URL should route to a new Job Inspector page that will provide details and straightforward instructions on how to configure the dynamic shapes. That page is still in development, so for now we use the general PT2 compile JI page.
* The test fails because of the export checks. I'll export after addressing all the reviewer comments.
Rollback Plan:
Reviewed By: pianpwk
Differential Revision: D78526522
Pull Request resolved: https://github.com/pytorch/pytorch/pull/159391
Approved by: https://github.com/jingsh
Summary:
In D75617963, we started logging dynamic whitelist suggestions to PT2 Compile Events. The whitelists were aggregated across all frames, intending to avoid manual work for the user (e.g. if frame 0/1 saw L['x'] turn dynamic, and later 1/1 saw L['y'], we'd log "L['x'],L['y']" on frame 1/1).
This switches to frame-specific whitelists, as attributing dynamism changes to certain frames was difficult, and suggestions are sometimes polluted by problematic frames (e.g. optimizer states).
The globally aggregated whitelist is still available in tlparse, by looking at the final `put_local_code_state_*` entry.
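For reference, a minimal sketch of acting on a suggested whitelist; it assumes the whitelist is consumed via `torch.compiler.config.dynamic_sources` (the comma-separated "L['x'],L['y']" format above), which may differ by PyTorch version:

```python
import torch

# Assumed knob: torch.compiler.config.dynamic_sources, fed the same
# comma-separated source list that the whitelist suggestion logs.
torch.compiler.config.dynamic_sources = "L['x'],L['y']"

@torch.compile
def fn(x, y):
    # x and y are marked dynamic up front, so later size changes
    # should not trigger extra recompiles for these inputs.
    return x + y

fn(torch.randn(4), torch.randn(4))
```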
Test Plan:
loggercli codegen GeneratedPt2CompileEventsLoggerConfig
Rollback Plan:
Differential Revision: D76628834
Pull Request resolved: https://github.com/pytorch/pytorch/pull/155959
Approved by: https://github.com/bobrenjc93
Summary:
When tensor size changes are detected with `dynamic=False`, we now overwrite the PGO state with the newest static shapes to reflect the latest frame state, instead of updating automatic dynamic.
A longer term solution, if we move to shared PGO state between multiple jobs, would be to update automatic dynamic, but avoid suggesting/logging the whitelist (compiling with `dynamic=False` should already override any dynamic PGO that's read, so we're fine there). This way if any particular job runs with `dynamic=False`, it won't statically overwrite the entire PGO state if it's shared with many other jobs.
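For illustration, a minimal repro of the scenario described above (the comments describe the expected behavior, not assertions from this PR's tests):

```python
import torch

@torch.compile(dynamic=False)
def fn(x):
    return x * 2

fn(torch.randn(4))
# Size change below: with dynamic=False this recompiles with the new static
# shape, and (after this PR) the PGO state is overwritten with the newest
# static shapes instead of being marked automatic dynamic.
fn(torch.randn(8))
```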
Test Plan:
test/dynamo/test_pgo.py
Rollback Plan:
Differential Revision: D76630499
Pull Request resolved: https://github.com/pytorch/pytorch/pull/155961
Approved by: https://github.com/bobrenjc93
This word appears often in class descriptions and is not consistently spelled. Update comments and some function names to use the correct spelling consistently. Facilitates searching the codebase.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/155944
Approved by: https://github.com/Skylion007
In some internal frameworks, on second attempts the actual code is copied to a different path than in previous attempts, but it is still the same code. PGO did not work for those jobs for the following reason: code state entries before this PR were identified by (filepath, function name, line number); after this PR they are identified by (a hash of the file's contents, function name, line number). This way PGO will work for those jobs on future attempts, and re-compilations of static versions will be avoided.
Sometimes we do not have access to the source code (the file does not exist). This seems to happen mostly when we re-trace a compiled function, but generally it can happen.
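For illustration only, a sketch of what a content-hashed code-state key with a fallback could look like; the helper name, hash choice, and fallback are assumptions, not the exact code in this PR:

```python
import hashlib

# Hypothetical helper: key code state by a hash of the file's contents so the
# key survives the code being copied to a new path between attempts.
def code_state_key(filepath: str, func_name: str, firstlineno: int) -> tuple[str, str, int]:
    try:
        with open(filepath, "rb") as f:
            file_id = hashlib.sha256(f.read()).hexdigest()
    except OSError:
        # Source file unavailable (e.g. when re-tracing a compiled function):
        # fall back to the raw path so we still produce some stable key.
        file_id = filepath
    return (file_id, func_name, firstlineno)
```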
Pull Request resolved: https://github.com/pytorch/pytorch/pull/152628
Approved by: https://github.com/oulgen
This PR fixes an invalid path on Windows: `C:\\Users\\sdp\\AppData\\Local\\Temp\\tmp0wugz2qm\\dynamo\\code_state___main__.TestFxGraphCache.test_cache_hot_load_pgo:None:.pkl.lock`
Windows does not allow chars `\ / : * ? " < > |` in a path.
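For illustration, a small sketch of stripping those characters from a generated file name; the helper name and replacement character are assumptions, not the actual pgo.py code:

```python
import re

# Hypothetical helper: map characters that Windows forbids in file names to '_'.
_WINDOWS_FORBIDDEN = re.compile(r'[\\/:*?"<>|]')

def sanitize_filename(name: str) -> str:
    # e.g. 'code_state___main__.TestFxGraphCache.test_cache_hot_load_pgo:None:.pkl'
    # contains ':', which is not allowed in a Windows file name.
    return _WINDOWS_FORBIDDEN.sub("_", name)
```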
This PR also replaces `os.rename` with `os.replace` in torch/_dynamo/pgo.py, because `os.replace` allows the target file to exist on Windows, while `os.rename` does not.
| Function | `os.rename()` | `os.replace()` |
|--------------------------------|----------------------------|----------------------------|
| Rename a file | ✅ | ✅ |
| Move a file | ✅ | ✅ |
| Overwrite an existing file | ❌ (Error on Windows) | ✅ (Will overwrite) |
| Overwrite an existing directory | ❌ (Error on Windows) | ❌ (Error on Windows) |
| Move across disks | ❌ | ❌ |
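A minimal sketch of why this matters when swapping a new PGO state file into place (illustrative only, not the code in torch/_dynamo/pgo.py):

```python
import os
import tempfile

# Write to a temp file in the same directory, then swap it into place.
# os.rename(tmp, dst) raises FileExistsError on Windows when dst already
# exists; os.replace(tmp, dst) overwrites it on every platform.
def atomic_write(dst: str, data: bytes) -> None:
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(dst) or ".")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        os.replace(tmp, dst)
    except BaseException:
        os.unlink(tmp)
        raise
```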
Pull Request resolved: https://github.com/pytorch/pytorch/pull/147708
Approved by: https://github.com/jansel
This PR essentially introduces two new APIs:
* torch.compiler.save_cache_artifacts
* torch.compiler.load_cache_artifacts
which aim to create a mega cache experience where the user can start collecting cache artifacts, and later call the save API to fetch them. In the next attempt, the user can "hot load" the cache artifacts via the load function.
This bundling approach reduces the need to port individual files one by one or to make many network requests.
Note that these APIs CANNOT log to structured logging as these functions will be called before and after compilation, as opposed to during compilation. Due to this limitation, the API returns a struct that the user can log with.
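A rough usage sketch, assuming `save_cache_artifacts()` returns an optional `(bytes, info)` pair and `load_cache_artifacts()` accepts those bytes; check the `torch.compiler` docs for the exact signatures:

```python
import torch

def fn(x):
    return (x * 2).relu()

# First attempt: compile as usual, then bundle everything the caches produced.
torch.compile(fn)(torch.randn(8))
artifacts = torch.compiler.save_cache_artifacts()
if artifacts is not None:
    artifact_bytes, info = artifacts
    # Persist artifact_bytes somewhere durable and log `info` through your own
    # pipeline; as noted above, these calls cannot use structured logging.

    # Next attempt (after restoring artifact_bytes): hot-load the bundle before
    # compiling again so the caches are warm from the start.
    torch.compiler.load_cache_artifacts(artifact_bytes)
```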
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143341
Approved by: https://github.com/jansel
**Problem statement**: I want to be able to centralize and simplify the process by which people add columns/data to existing spans. We have MetricsContext and ChromiumEventLogger, and there's various choices you can make to decide where and when to log different levels of observability for your events. To resolve this, I want a central API for "adding to events under dynamo_timed".
**CompileEventLogger** is intended as a frontend for MetricsContext and ChromiumEventLogger so we can use the same class for handling everything.
CompileEventLogger is intended to be used within a `dynamo_timed()` context. Its purpose is to 1. log to existing events that are in progress (i.e. within dynamo_timed), and 2. log instant events to chromium that are independent of any specific span.
CompileEventLogger has three log levels:
- CHROMIUM: Log only to chromium events, visible via tlparse.
- PT2_COMPILE: Log to chromium_events + pt2_compile_events
- COMPILATION_METRIC: Log to compilation metrics in addition to the toplevel chromium and pt2_compile_event.
In addition, we have a function CompileEventLogger.add() that automagically chooses the correct log level. For now it is conservative and will never automagically choose to log CompilationMetrics (though I could imagine it checking whether the metadata keys are all fields of CompilationMetrics and therefore loggable there).
The goal here is to make one single interface to log stuff for observability reasons, and make it as easy as possible.
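An illustrative sketch of the intended usage; the import path and the exact signature of `add()` are assumptions based on the description above, not a settled API:

```python
# Assumed import path; CompileEventLogger is described above as a frontend for
# MetricsContext and ChromiumEventLogger used inside dynamo_timed().
from torch._dynamo.utils import CompileEventLogger, dynamo_timed

with dynamo_timed("my_custom_pass"):
    # add() attaches metadata to the in-progress event and picks the log level
    # itself; per the description above it never silently promotes anything to
    # CompilationMetrics. The keyword-argument form here is an assumption.
    CompileEventLogger.add(num_widgets=3)
```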
Not included in this diff:
- V1 of this diff will not have implementations of `increment` and `add_to_set` which MetricsContext has, so those usages are not replaced yet. But I'll add those in a followup.
- We don't handle `RuntimeMetricsContext`. It's unclear if I want that to be part of this, because under RuntimeMetricsContext there might not be a toplevel event to log to, so chromium events doesn't make sense in that context. So I might leave that separate for now.
Differential Revision: [D67346203](https://our.internmc.facebook.com/intern/diff/D67346203/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143420
Approved by: https://github.com/aorenste
Summary:
I realized I wanted to check "are my cache entries/IO unreasonably large?" and there was no easy way to do it. This change lets me do it.
Test Plan: servicelab
Differential Revision: D65390363
Pull Request resolved: https://github.com/pytorch/pytorch/pull/139627
Approved by: https://github.com/c00w
Previously: https://github.com/pytorch/pytorch/pull/138052, but the implementation is done from scratch, so I opened a new PR.
This implements the ability to save and load profiles of automatic dynamic decisions, so on subsequent runs we can directly make something automatically dynamic. Unlike the previous implementation, this cache is never enabled by default; instead, you have to specify a "job id" that says it's OK to share results. We will be able to populate this id automatically for internal MAST jobs, but generic OSS users will have to explicitly opt in.
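A sketch of opting in from OSS; the knob name is an assumption (recent PyTorch exposes `torch.compiler.config.job_id` for this), so check your version before relying on it:

```python
import torch

# Assumed knob: torch.compiler.config.job_id marks it OK to share PGO results
# across runs of the same job.
torch.compiler.config.job_id = "my-training-job"

@torch.compile
def step(x):
    return x.sin() + 1

step(torch.randn(4))
step(torch.randn(8))  # observed size change is recorded in the job's profile
# A later run with the same job_id reads the profile and marks the input
# dynamic up front, skipping the extra static recompile.
```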
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/139001
Approved by: https://github.com/oulgen
Previously: https://github.com/pytorch/pytorch/pull/138052, but the implementation is done from scratch, so I opened a new PR.
This implements the ability to save and load profiles of automatic dynamic decisions, so on subsequent runs we can directly make something automatically dynamic. Unlike the previous implementation, this cache is never enabled by default; instead, you have to specify a "job id" that says it's OK to share results. We will be able to populate this id automatically for internal MAST jobs, but generic OSS users will have to explicitly opt in.
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Differential Revision: [D65065497](https://our.internmc.facebook.com/intern/diff/D65065497)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/139001
Approved by: https://github.com/oulgen