13 Commits

Author SHA1 Message Date
058814794b [Code Clean] Replace std::runtime_error with TORCH_CHECK (#163437)
Replace the runtime_error of the vallina C++ exceptions with TORCH_CEHCK
Including:
- torch/csrc/export
- torch/csrc/cuda

Fixes #148114
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163437
Approved by: https://github.com/Skylion007, https://github.com/cyyever
2025-10-12 01:23:02 +00:00
22df9332da [serialization] Add pte file to archive (#162520)
Summary:
Add _package_executorch_files to archive apis. Allow us to package a PTE file into the archive.

I don't think there's a use-case to have more than one PTE file at the moment, but left it as `EXECUTORCH_FILES` just in case.

Test Plan:
Tested in D81992612

Rollback Plan:

Differential Revision: D81977483

Pull Request resolved: https://github.com/pytorch/pytorch/pull/162520
Approved by: https://github.com/angelayi
2025-09-11 07:59:11 +00:00
089ad1d88b [1/n][export] Refactor PT2 Archive weight saving and loading (#160394)
Summary:

We split the refactoring in two parts for forward compatibility concerns
First, we land the deserialization (loading part)
Then, we land the serialization (saving part)

Save weights and constants as individual files in PT2 archive. Each weight/constant will be saved as raw bytes, unless it is a custom object (TorchBind object) or a non-fake tensor subclass, for these two special cases we still save them using pickle.

The metadata of saved tensors along with the file name will be saved as `PayloadMeta`.
The mapping from FQN to `PayloadMeta` will be saved as `PayloadConfig` under `WEIGHTS_CONFIG_FORMAT` and `CONTANTS_CONFIG_FORMAT`

This changes the serialization in python side when calling `torch.export.save()`.

For deserialization in python `torch.export.load()`, we make it BC-safe by allowing loading legacy format weights/constants.

For deserialization in C++ `torch/nativert/ModelRunner.cpp`, we make this a BC breaking change as currently the OSS ModelRunner API is not being used.

The file structure

```
├── archive_format
├── archive_version
├── byteorder
├── .data
│   ├── serialization_id
│   └── version
├── data
│   ├── sample_inputs
│   │   └── model.pt
│   ├── constants
│   │   ├── tensor_0
│   │   ├── tensor_1
│   │   └── model_constants_config.json
│   └── weights
│       ├── weight_0
│       ├── weight_1
│       ├── weight_2
│       ├── weight_3
│       └── model_weights_config.json
└── models
    └── model.json
```

Test Plan:
CI

Rollback Plan:

Differential Revision: D80035490

Pull Request resolved: https://github.com/pytorch/pytorch/pull/160394
Approved by: https://github.com/SherlockNoMad
2025-08-26 01:15:42 +00:00
aeffb68d34 [schema_upgrader] add C++ upgrader for json based upgrading (#156761)
Differential Revision: [D77459912](https://our.internmc.facebook.com/intern/diff/D77459912)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156761
Approved by: https://github.com/angelayi
2025-06-28 18:15:06 +00:00
f810480dbe Revert "[schema_upgrader] add C++ upgrader for json based upgrading (#156761)"
This reverts commit 61712e6f2ba58cce354a742d918934ec7293ee43.

Reverted https://github.com/pytorch/pytorch/pull/156761 on behalf of https://github.com/ydwu4 due to break linter test, which doesn't show up in the pr ([comment](https://github.com/pytorch/pytorch/pull/156761#issuecomment-3014918800))
2025-06-28 03:58:25 +00:00
61712e6f2b [schema_upgrader] add C++ upgrader for json based upgrading (#156761)
Differential Revision: [D77459912](https://our.internmc.facebook.com/intern/diff/D77459912)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156761
Approved by: https://github.com/angelayi
2025-06-27 23:50:19 +00:00
502486d946 [PT2]Add weight and constant config path template (#156359)
Summary: At title.

Test Plan:
N/A

Rollback Plan:

Differential Revision: D76925510

Pull Request resolved: https://github.com/pytorch/pytorch/pull/156359
Approved by: https://github.com/SherlockNoMad
2025-06-20 18:46:01 +00:00
728cf6721e Revert "[PT2]load dense delta by trimming prefixes (#155872)"
This reverts commit c74fd35050a7241f0c439501ef735aa6cdde751f.

Reverted https://github.com/pytorch/pytorch/pull/155872 on behalf of https://github.com/malfet due to Broke lint, internal has been backed out ([comment](https://github.com/pytorch/pytorch/pull/155872#issuecomment-2985542895))
2025-06-18 20:05:56 +00:00
c74fd35050 [PT2]load dense delta by trimming prefixes (#155872)
Summary:
In PT2 with GPU with AOTI, weight names are like
```merge.submod_0._run_on_acc_0.main_module.user_embedding_arch.relevance_pmas.ig_feed.pos_emb```

but when publishing delta snapshots, lowering is skipped so weights are like
```merge.main_module.user_embedding_arch.relevance_pmas.ig_feed.pos_emb```

so when loading delta weights in original model runner, we need to:
1. Redo tensorName -> weight idx look up, because the weight ordering may be different.
2. use trimmed tensorName to find the correct weight path.

Note that with this diff, delta snapshot loading still does NOT use xl weights. This should be fine for now as we are still publishing full model with non-xl weights.

Test Plan:
Merge only:
```
MODEL_TYPE=mtml_ctr_instagram_model
MODULE=merge
MODEL_ENTITY_ID=900234243
SNAPSHOT_ID=7
DENSE_DELTA_SNAPSHOT_ID=13

CUDA_VISIBLE_DEVICES=2,3 buck2 run mode/dev-nosan -c fbcode.nvcc_arch=a100,h100 -c fbcode.enable_gpu_sections=true caffe2/torch/fb/model_transform/fx2trt/packaging:load_net_predictor -- --loadMode=DenseOnly --baseNetFile=/data/users/$USER/models/${MODEL_ENTITY_ID}/${SNAPSHOT_ID}/${MODEL_ENTITY_ID}_${SNAPSHOT_ID}.predictor.disagg.gpu.${MODULE}  --moduleName=${MODULE} --predictor_hardware_type 1 --submodToDevice "" --deltaNetFile /data/users/$USER/models/${MODEL_ENTITY_ID}/${SNAPSHOT_ID}/delta_${DENSE_DELTA_SNAPSHOT_ID}/${MODEL_ENTITY_ID}_${SNAPSHOT_ID}.predictor.disagg.gpu.${MODULE}
```

Local replayer:
```
MODEL_TYPE=mtml_ctr_instagram_model
MODEL_ENTITY_ID=900234243
SNAPSHOT_ID=7
DENSE_DELTA_SNAPSHOT_ID=13

USE_SERVABLE=0 HARDWARE_TYPE=0 DENSE_DELTA_IDS=${DENSE_DELTA_SNAPSHOT_ID} ENABLE_REALTIME_UPDATE=1 CUDA_VISIBLE_DEVICES=6,7 sh ./sigrid/predictor/scripts/start_gpu_with_gif.sh ${MODEL_ENTITY_ID}_${SNAPSHOT_ID} /data/users/$USER/models/${MODEL_ENTITY_ID}/${SNAPSHOT_ID} 7455

USE_SERVABLE=0 sh sigrid/predictor/scripts/start_gpu_replayer_localhost_with_gif.sh ${MODEL_ENTITY_ID}_${SNAPSHOT_ID} 10 ${MODEL_TYPE} /data/users/$USER/requests/filter_requests_mtml_ctr_instagram_model_500 localhost /data/users/$USER/models/${MODEL_ENTITY_ID}/${SNAPSHOT_ID} true 7455
```

Rollback Plan:

Differential Revision: D76520301

Pull Request resolved: https://github.com/pytorch/pytorch/pull/155872
Approved by: https://github.com/SherlockNoMad
2025-06-18 19:13:22 +00:00
b4fb801b2d [export] Move PT2 constants to torch::_export (#153206)
Test Plan:
`buck2 test //sigmoid/...`
https://www.internalfb.com/intern/testinfra/testrun/1970325119807758

Differential Revision: D74417085

Pull Request resolved: https://github.com/pytorch/pytorch/pull/153206
Approved by: https://github.com/zhxchen17, https://github.com/dolpm
2025-05-17 08:21:59 +00:00
cyy
8fa81a6066 Enable misc-use-internal-linkage check and apply fixes (#148948)
Enables clang-tidy rule [`misc-use-internal-linkage`](https://clang.llvm.org/extra/clang-tidy/checks/misc/use-internal-linkage.html). This new check was introduced in Clang-Tidy 18 and is available due to recent update of Clang-Tidy 19.

The check marks functions and variables used only in the translation unit as static. Therefore undesired symbols are not leaked into other units, more link time optimisations are possible and the resulting binaries may be smaller.

The detected violations were mostly fixed by using static. In other cases, the symbols were indeed consumed by others files, then their declaring headers were included. Still some declarations were wrong and have been fixed.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148948
Approved by: https://github.com/Skylion007
2025-03-12 14:22:56 +00:00
cyy
2f082e1e56 [13/N] Fix extra warnings brought by clang-tidy-17 (#140897)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140897
Approved by: https://github.com/ezyang
2024-11-27 00:35:19 +00:00
3ef2dfc1ba [export] Implement cpp deserializer. (#136398)
Differential Revision: D63206258

This diff introduces a mechanism to generate a json-compatible deserializer in cpp using nlohmann json (already being used by AOTI).

Why we need this? Because there will be a lot of cases where people don't want to use Python to load the graph (e.g. cpp runtime), and instead they can use this header to deserialize the JSON graph.

Every time we call update_schema.py to update the schema, the header will be auto generated and included into the source files.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136398
Approved by: https://github.com/angelayi
2024-11-14 16:34:59 +00:00