Working with fullgraph=False
While fullgraph=False is the default torch.compile setting, the semantics of resuming compilation after a graph break are more complicated than the all-or-nothing behavior of fullgraph=True.
You can find details on the fullgraph=False semantics in the subsections below.
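As a rough illustration of this resumption behavior, the sketch below uses a made-up function containing a print call, which Dynamo cannot trace. Under fullgraph=False the function is split into two graphs around the graph break and still runs end to end, while fullgraph=True raises an error at the same point.

```python
# Minimal sketch of fullgraph=False resumption semantics (illustrative function).
import torch

def fn(x):
    y = x.sin()           # traced into the first graph
    print("graph break")  # untraceable side effect -> graph break
    return y.cos()        # traced into a second graph after resuming

# Default fullgraph=False: torch.compile splits fn around the graph break
# and the call still succeeds.
compiled = torch.compile(fn)
compiled(torch.randn(4))

# fullgraph=True would instead raise an error at the print() graph break:
# torch.compile(fn, fullgraph=True)(torch.randn(4))
```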
The strategy for using torch.compile(fullgraph=False) is as follows:
1. Determine the ideal location to place torch.compile. Normally, it is the highest-level function that doesn't result in excessive graph breaks. Functions that do a lot of preprocessing or I/O operations are examples of functions that result in many graph breaks and do not significantly benefit from torch.compile.
   a. You can isolate issues by first compiling individual functions/modules before compiling entire models.
2. Apply torch.compiler.disable to functions in the compiled region that result in a lot of graph breaks and do not benefit from compilation (see the sketch after this list). In this case, one graph break is better than potentially tens or hundreds.
3. Use TORCH_LOGS="graph_breaks" or tlparse to investigate remaining graph breaks. Work around these graph breaks using the same approaches as working around graph breaks under the fullgraph=True programming model. Not all graph breaks need to be removed - some may impact performance more than others. The general rule is to focus on graph breaks that happen during model computation.
   a. We recommend using torch.compile(backend='eager') when debugging graph breaks, for faster debugging iteration times.
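The sketch below ties these steps together using hypothetical preprocess and Model names: the model is compiled at the top level, a preprocessing helper that would graph-break heavily is excluded with torch.compiler.disable, and the comments show how TORCH_LOGS and backend='eager' can be used while investigating remaining graph breaks.

```python
# Illustrative sketch of the strategy above; preprocess and Model are made up.
import torch

@torch.compiler.disable  # step 2: skip a function that would cause many graph breaks
def preprocess(x):
    # imagine I/O or Python-heavy logic here; disabling it costs one graph break
    return x.float()

class Model(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(8, 8)

    def forward(self, x):
        x = preprocess(x)  # single graph break at the disabled call
        return torch.relu(self.linear(x))

model = Model()

# Step 1: compile at the highest level that doesn't graph-break excessively.
compiled_model = torch.compile(model)
compiled_model(torch.randn(2, 8))

# Step 3: investigate remaining graph breaks by running with
#   TORCH_LOGS="graph_breaks" python this_script.py
# and iterate faster while debugging by compiling with the eager backend:
debug_model = torch.compile(model, backend="eager")
debug_model(torch.randn(2, 8))
```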
programming_model.where_to_apply_compile
programming_model.compiler_disable
programming_model.error_on_graph_break
programming_model.nested_graph_breaks
programming_model.skipped_functions