mirror of
https://github.com/pytorch/pytorch.git
synced 2025-10-20 21:14:14 +08:00
[aoti] Add cpp loader (#135374)
* Added a cpp loader, `AOTIModelPackageLoader`, which can load the .pt2, build the .so, and create a runner. In Python, users call the `run` function directly; in cpp, users can also access `runner_` directly if they are more familiar with that. I couldn't figure out how to bind the `get_runner()` function to Python.
* Added a new config, `aot_inductor.package_cpp_only`, which will **not** package the .so. This means that whenever the package is loaded, the .so has to be built. It is turned off by default so that new environments do not need to rebuild their .so. `package_cpp_only` is a feature that torchchat intends to use to provide flexibility to users.
* Added a new config, `aot_inductor.metadata`, which stores user-provided metadata, serialized to the pt2 as a JSON file. It also stores the device used when exporting, "cuda" or "cpu", so that at load time we can use it to determine which AOTIModelContainerRunner to use. The metadata can be accessed through `loader.get_metadata()`. TODO: move this metadata to the top-level `package_aoti` function so that we can remove the metadata as a config.
* Separated out `package_aoti` as a standalone function, instead of having inductor call it automatically. This prepares for the case where users compile multiple models and want to bundle them in one package. The specific use case is in torchchat, where we want to package the separately exported encoder and decoder layers. An example of how to use this is in `test_multiple_methods`.
* `load_package` will load a single model, given the model name.
* The loader doesn't support Windows for now; I think I need to add some more casing to make the build commands work on Windows.

Differential Revision: [D62329906](https://our.internmc.facebook.com/intern/diff/D62329906)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135374
Approved by: https://github.com/desertfire, https://github.com/malfet
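
The metadata flow described in this message (user metadata plus the export device written as JSON into the package, then read back at load time to choose a runner) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the code added by this PR: the archive member name `metadata.json`, the `"device"` key, and the helpers `write_package`, `read_metadata`, and `pick_runner` are all hypothetical, and the real .pt2 layout may differ.

```python
import io
import json
import zipfile

# Hypothetical archive member name; the real .pt2 layout may differ.
METADATA_FILE = "metadata.json"

# Illustrative mapping from the export device to a runner class name.
RUNNERS = {
    "cpu": "AOTIModelContainerRunnerCpu",
    "cuda": "AOTIModelContainerRunnerCuda",
}


def write_package(buffer, user_metadata, device):
    """Store user metadata plus the export device as JSON in a zip archive."""
    metadata = dict(user_metadata)
    metadata["device"] = device  # hypothetical key name
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(METADATA_FILE, json.dumps(metadata))


def read_metadata(buffer):
    """Read the JSON metadata back out of the archive."""
    with zipfile.ZipFile(buffer, "r") as archive:
        return json.loads(archive.read(METADATA_FILE))


def pick_runner(metadata):
    """Choose a runner name from the device recorded at export time."""
    device = metadata["device"]
    if device not in RUNNERS:
        raise ValueError(f"unsupported device: {device!r}")
    return RUNNERS[device]


if __name__ == "__main__":
    package = io.BytesIO()
    write_package(package, {"model": "decoder"}, "cuda")
    print(pick_runner(read_metadata(package)))
```

Recording the device inside the package means the caller does not have to say at load time which runner the model was compiled for.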
committed by PyTorch MergeBot
parent 26e5572dd2
commit cd9ee49a69
@@ -466,6 +466,7 @@ lazy_tensor_core_python_sources = [
 ]

 inductor_core_resources = [
+    "torch/csrc/inductor/aoti_package/model_package_loader.cpp",
     "torch/csrc/inductor/aoti_runner/model_container_runner.cpp",
     "torch/csrc/inductor/aoti_runner/model_container_runner_cpu.cpp",
     "torch/csrc/inductor/aoti_torch/shim_common.cpp",
@@ -841,6 +842,7 @@ libtorch_python_core_sources = [
     "torch/csrc/fx/node.cpp",
     "torch/csrc/mps/Module.cpp",
     "torch/csrc/mtia/Module.cpp",
+    "torch/csrc/inductor/aoti_package/pybind.cpp",
     "torch/csrc/inductor/aoti_runner/pybind.cpp",
     "torch/csrc/inductor/aoti_eager/kernel_holder.cpp",
     "torch/csrc/inductor/aoti_eager/kernel_meta_info.cpp",