[aoti] Add cpp loader (#135374)

* Added a cpp loader, `AOTIModelPackageLoader`, which can load the .pt2, build the .so, and create a runner. From Python, users call the `run` function directly; in cpp, users who are more familiar with the runner can access `runner_` directly. I couldn't figure out how to bind the `get_runner()` function to Python...
* Added a new config, `aot_inductor.package_cpp_only`, which will **not** package the .so. This means the .so has to be built every time the package is loaded. The config is turned off by default so that new environments do not need to rebuild their .so. `package_cpp_only` is a feature that torchchat intends to use to give users flexibility.
* Added a new config, `aot_inductor.metadata`, which stores user-provided metadata, serialized to the .pt2 as a JSON file. It also stores the device used when exporting, "cuda" or "cpu", so that at load time we can use it to determine which AOTIModelContainerRunner to use. The metadata can be accessed through `loader.get_metadata()`. The TODO is to move this metadata to the top-level `package_aoti` function so that we can remove the metadata as a config.
* Separated out `package_aoti` as a standalone function, instead of having inductor call it automatically. This prepares for the case where users compile multiple models and want to bundle them in one package. The specific use case is in torchchat, where we want to package the separately exported encoder and decoder layers. An example of how to use this is in `test_multiple_methods`.
* `load_package` will load a single model, given the model name.
* The loader doesn't support Windows for now; I think I need to add some more casing to make the build commands work on Windows?

Differential Revision: [D62329906](https://our.internmc.facebook.com/intern/diff/D62329906)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135374
Approved by: https://github.com/desertfire, https://github.com/malfet
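The metadata flow described above (user metadata plus the export device written as JSON into the package, then read back at load time to pick a runner) can be illustrated with a standard-library-only sketch. This is not the loader's actual code: the entry name `metadata.json`, the helper names, and the string stand-ins for the CPU and CUDA runners are all assumptions made for illustration, and it only assumes the package behaves like a zip archive.

```python
import io
import json
import zipfile

# Hypothetical entry name; the real archive layout is not shown in this commit message.
METADATA_ENTRY = "metadata.json"

# Hypothetical stand-ins for the CPU and CUDA AOTIModelContainerRunner variants.
RUNNERS = {"cpu": "cpu_runner", "cuda": "cuda_runner"}


def write_package(user_metadata: dict, device: str) -> bytes:
    """Build an in-memory zip holding the user metadata plus the export device."""
    metadata = dict(user_metadata)
    metadata["device"] = device
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(METADATA_ENTRY, json.dumps(metadata))
    return buffer.getvalue()


def read_metadata(package: bytes) -> dict:
    """Read the JSON metadata entry back out of the archive."""
    with zipfile.ZipFile(io.BytesIO(package)) as archive:
        return json.loads(archive.read(METADATA_ENTRY))


def pick_runner(metadata: dict) -> str:
    """Choose a runner from the device recorded at export time."""
    device = metadata["device"]
    if device not in RUNNERS:
        raise ValueError(f"unsupported device: {device!r}")
    return RUNNERS[device]


package = write_package({"model": "decoder"}, device="cuda")
print(pick_runner(read_metadata(package)))
```

Recording the device next to the user metadata means the load side needs no extra argument to choose between runners, which matches the motivation given for storing it.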
2025-10-20 21:14:14 +08:00 · 2024-09-10 16:08:07 -07:00
parent 26e5572dd2
commit cd9ee49a69
22 changed files with 890 additions and 246 deletions
--- a/build_variables.bzl
+++ b/build_variables.bzl
@@ -466,6 +466,7 @@ lazy_tensor_core_python_sources = [
 ]

 inductor_core_resources = [
+    "torch/csrc/inductor/aoti_package/model_package_loader.cpp",
    "torch/csrc/inductor/aoti_runner/model_container_runner.cpp",
    "torch/csrc/inductor/aoti_runner/model_container_runner_cpu.cpp",
    "torch/csrc/inductor/aoti_torch/shim_common.cpp",
@@ -841,6 +842,7 @@ libtorch_python_core_sources = [
    "torch/csrc/fx/node.cpp",
    "torch/csrc/mps/Module.cpp",
    "torch/csrc/mtia/Module.cpp",
+    "torch/csrc/inductor/aoti_package/pybind.cpp",
    "torch/csrc/inductor/aoti_runner/pybind.cpp",
    "torch/csrc/inductor/aoti_eager/kernel_holder.cpp",
    "torch/csrc/inductor/aoti_eager/kernel_meta_info.cpp",