TorchDynamo-based ONNX Exporter
===============================

.. automodule:: torch.onnx
  :noindex:

.. contents:: :local:
    :depth: 3

.. warning::
  The ONNX exporter for TorchDynamo is a rapidly evolving beta technology.

Overview
--------

The ONNX exporter leverages the TorchDynamo engine to hook into Python's frame evaluation API
and dynamically rewrite its bytecode into an FX graph.
The resulting FX graph is then polished before it is finally translated into an ONNX graph.

The main advantage of this approach is that the `FX graph <https://pytorch.org/docs/stable/fx.html>`_ is captured using
bytecode analysis that preserves the dynamic nature of the model instead of using traditional static tracing techniques.

In addition, during the export process, memory usage is significantly reduced compared to the TorchScript-enabled exporter.
See the :doc:`documentation <onnx_dynamo_memory_usage>` for more information.

The exporter is designed to be modular and extensible. It is composed of the following components:

- **ONNX Exporter**: :class:`Exporter` main class that orchestrates the export process.
- **ONNX Export Options**: :class:`ExportOptions` has a set of options that control the export process.
- **ONNX Registry**: :class:`OnnxRegistry` is the registry of ONNX operators and functions.
- **FX Graph Extractor**: :class:`FXGraphExtractor` extracts the FX graph from the PyTorch model.
- **Fake Mode**: :class:`ONNXFakeContext` is a context manager that enables fake mode for large scale models (see the sketch after this list).
- **ONNX Program**: :class:`ONNXProgram` is the output of the exporter that contains the exported ONNX graph and diagnostics.
- **ONNX Diagnostic Options**: :class:`DiagnosticOptions` has a set of options that control the diagnostics emitted by the exporter.

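Fake mode is particularly useful for models that are too large to instantiate in memory.
Below is a minimal sketch of how :func:`torch.onnx.enable_fake_mode` can be combined with
:func:`torch.onnx.dynamo_export` (both appear in the API reference at the end of this page);
the model and its sizes are hypothetical, and the exact options may differ across PyTorch versions.

.. code-block:: python

    import torch

    # Hypothetical model standing in for a network too large to instantiate normally.
    class LargeModel(torch.nn.Module):
        def __init__(self):
            super().__init__()
            self.linear = torch.nn.Linear(1024, 1024)

        def forward(self, x):
            return self.linear(x)

    # Inside fake mode, parameters are created as "fake" tensors without real storage,
    # so the model structure can be captured with minimal memory usage.
    with torch.onnx.enable_fake_mode() as fake_context:
        model = LargeModel()
        x = torch.rand(8, 1024)

    export_options = torch.onnx.ExportOptions(fake_context=fake_context)
    onnx_program = torch.onnx.dynamo_export(model, x, export_options=export_options)
    # Real weights (e.g. from a checkpoint) still need to be attached to the
    # ONNXProgram before the saved model can actually be executed.
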
Dependencies
------------

The ONNX exporter depends on extra Python packages:

- `ONNX <https://onnx.ai>`_
- `ONNX Script <https://onnxscript.ai>`_

They can be installed through `pip <https://pypi.org/project/pip/>`_:

.. code-block:: bash

    pip install --upgrade onnx onnxscript

`onnxruntime <https://onnxruntime.ai>`_ can then be used to execute the model
on a large variety of processors.

A simple example
----------------

See below a demonstration of the exporter API in action with a simple Multilayer Perceptron (MLP) as an example:

.. code-block:: python

    import torch
    import torch.nn as nn

    class MLPModel(nn.Module):
        def __init__(self):
            super().__init__()
            self.fc0 = nn.Linear(8, 8, bias=True)
            self.fc1 = nn.Linear(8, 4, bias=True)
            self.fc2 = nn.Linear(4, 2, bias=True)
            self.fc3 = nn.Linear(2, 2, bias=True)

        def forward(self, tensor_x: torch.Tensor):
            tensor_x = self.fc0(tensor_x)
            tensor_x = torch.sigmoid(tensor_x)
            tensor_x = self.fc1(tensor_x)
            tensor_x = torch.sigmoid(tensor_x)
            tensor_x = self.fc2(tensor_x)
            tensor_x = torch.sigmoid(tensor_x)
            output = self.fc3(tensor_x)
            return output

    model = MLPModel()
    tensor_x = torch.rand((97, 8), dtype=torch.float32)
    onnx_program = torch.onnx.export(model, (tensor_x,), dynamo=True)

As the code above shows, all you need to do is provide :func:`torch.onnx.export` with an instance of the model and its input.
The exporter will then return an instance of :class:`torch.onnx.ONNXProgram` that contains the exported ONNX graph along with extra information.

The in-memory model available through ``onnx_program.model_proto`` is an ``onnx.ModelProto`` object in compliance with the `ONNX IR spec <https://github.com/onnx/onnx/blob/main/docs/IR.md>`_.

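For instance, the in-memory model can be validated and summarized with the ``onnx`` package; a minimal sketch, assuming the ``onnx_program`` produced above:

.. code-block:: python

    import onnx

    model_proto = onnx_program.model_proto
    # Validate the model against the ONNX IR spec.
    onnx.checker.check_model(model_proto)
    # Print a human-readable summary of the exported graph.
    print(onnx.helper.printable_graph(model_proto.graph))
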
The ONNX model may then be serialized into a `Protobuf file <https://protobuf.dev/>`_ using the :meth:`torch.onnx.ONNXProgram.save` API.

.. code-block:: python

    onnx_program.save("mlp.onnx")

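As mentioned in the Dependencies section, `onnxruntime <https://onnxruntime.ai>`_ can then be used to run the saved model.
A minimal sketch, assuming ONNX Runtime is installed and reusing ``tensor_x`` from the example above:

.. code-block:: python

    import onnxruntime

    ort_session = onnxruntime.InferenceSession("mlp.onnx", providers=["CPUExecutionProvider"])
    # ONNX Runtime consumes numpy arrays keyed by the graph input names.
    ort_inputs = {ort_session.get_inputs()[0].name: tensor_x.numpy()}
    ort_outputs = ort_session.run(None, ort_inputs)
    print(ort_outputs[0].shape)  # (97, 2)
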
Two functions exist to export the model to ONNX based on the TorchDynamo engine.
They differ slightly in the way they produce the :class:`ExportedProgram`.
:func:`torch.onnx.dynamo_export` was introduced with PyTorch 2.1, and
:func:`torch.onnx.export` was extended with PyTorch 2.5 to make it easy to switch
from TorchScript to TorchDynamo. To call the former function,
the last line of the previous example can be replaced by the following one.

.. code-block:: python

    onnx_program = torch.onnx.dynamo_export(model, tensor_x)

Inspecting the ONNX model using a GUI
-------------------------------------

You can view the exported model using `Netron <https://netron.app/>`__.

.. image:: _static/img/onnx/onnx_dynamo_mlp_model.png
    :width: 40%
    :alt: MLP model as viewed using Netron

Note that each layer is represented in a rectangular box with an *f* icon in the top right corner.

.. image:: _static/img/onnx/onnx_dynamo_mlp_model_function_highlight.png
    :width: 40%
    :alt: ONNX function highlighted on MLP model

By expanding it, the function body is shown.

.. image:: _static/img/onnx/onnx_dynamo_mlp_model_function_body.png
    :width: 50%
    :alt: ONNX function body

The function body is a sequence of ONNX operators or other functions.

When the conversion fails
-------------------------

Function :func:`torch.onnx.export` should be called a second time with the
parameter ``report=True``. A Markdown report is generated to help the user
resolve the issue.

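For example, a failing export can be retried with reporting enabled; a minimal sketch using the names from the MLP example above (where the report is written depends on the PyTorch version):

.. code-block:: python

    onnx_program = torch.onnx.export(
        model,
        (tensor_x,),
        dynamo=True,
        report=True,  # emit a Markdown report describing the conversion and any failures
    )
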
Function :func:`torch.onnx.dynamo_export` generates a report using the SARIF format.
ONNX diagnostics go beyond regular logs through the adoption of
`Static Analysis Results Interchange Format (aka SARIF) <https://docs.oasis-open.org/sarif/sarif/v2.1.0/sarif-v2.1.0.html>`__
to help users debug and improve their model using a GUI, such as
Visual Studio Code's `SARIF Viewer <https://marketplace.visualstudio.com/items?itemName=MS-SarifVSCode.sarif-viewer>`_.

The main advantages are:

- The diagnostics are emitted in the machine-parseable `Static Analysis Results Interchange Format (SARIF) <https://docs.oasis-open.org/sarif/sarif/v2.1.0/sarif-v2.1.0.html>`__.
- A clearer, structured way to add and keep track of new diagnostic rules.
- A foundation for future improvements that consume the diagnostics.

.. toctree::
   :maxdepth: 1
   :caption: ONNX Diagnostic SARIF Rules
   :glob:

   generated/onnx_dynamo_diagnostics_rules/*

.. toctree::
   :hidden:

   onnx_dynamo_memory_usage

API Reference
-------------

.. autofunction:: torch.onnx.dynamo_export

.. autoclass:: torch.onnx.ExportOptions
    :members:

.. autofunction:: torch.onnx.enable_fake_mode

.. autoclass:: torch.onnx.ONNXProgram
    :members:

.. autoclass:: torch.onnx.ONNXRuntimeOptions
    :members:

.. autoclass:: torch.onnx.OnnxExporterError
    :members:

.. autoclass:: torch.onnx.OnnxRegistry
    :members:

.. autoclass:: torch.onnx.DiagnosticOptions
    :members: