This PR applies clang-tidy readability checks to jit sources and all headers in the code base.
`readability-redundant-inline-specifier` is suppressed because enabling it would require too many changes. The check detects redundant `inline` specifiers on function and variable declarations, and many in-class method definitions in the code base are marked `inline`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/164652
Approved by: https://github.com/Skylion007
Summary: Add an experimental feature to defer PyTorch library initialization cost until after startup. As noted, this feature is not thread safe; the client must maintain thread safety at library load time.
Reviewed By: zou3519
Differential Revision: D71917841
Pull Request resolved: https://github.com/pytorch/pytorch/pull/150537
Approved by: https://github.com/zou3519
This PR replaces c10::guts::to_string with std::to_string. The bulk of the change is switching the optimizer state key to void*: the string form is needed only for serialization, and hashing a pointer is more efficient than hashing a string.
Some other guts functions in the affected source files are also replaced.
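A minimal sketch of the pointer-keyed lookup described above. `Param`, `ParamState`, `StateMap`, `state_for`, and `serialized_key` are hypothetical names for illustration, not the optimizer's actual types, and using the parameter index as the serialized key is an assumption.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical stand-ins for a parameter and its per-parameter optimizer state.
struct Param { double value = 0.0; };
struct ParamState { std::int64_t step = 0; };

// In-memory lookup keyed by the parameter's address: hashing a pointer is a
// single integer hash, with no string construction or per-character hashing.
using StateMap = std::unordered_map<const void*, ParamState>;

ParamState& state_for(StateMap& states, const Param& p) {
  return states[static_cast<const void*>(&p)];
}

// A string key is produced only when the state is written out. Using the
// parameter's index here is an assumption about the serialized layout.
std::string serialized_key(std::size_t index) {
  return std::to_string(index);
}
```

Pointer keys are only meaningful while the parameters stay at the same address, which is why a separate string key is still needed for anything that leaves the process.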
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108480
Approved by: https://github.com/Skylion007
Not only is this change usually shorter and more readable, it also can yield better performance. size() is not always a constant time operation (such as on LinkedLists), but empty() always is.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/93236
Approved by: https://github.com/malfet
Summary: D38998858 (3fae89d4a4) used the wrong version of `_load_for_mobile` that kept the "load everything in memory then parse" technique. This fixes it to call the `_load_for_mobile_impl` version which for non-flatbuffer models will stream parse. See D38998858 (3fae89d4a4) for the expected memory optimization gains.
Test Plan: CI Signals.
Reviewed By: qihqi
Differential Revision: D39138280
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84296
Approved by: https://github.com/qihqi
Summary:
Remove code duplication in import.cpp / export_modules.cpp such that
1. There is only one copy of the switching logic (detect flatbuffer / is_flatbuffer);
2. Detection of whether flatbuffer support is included moves to runtime (so no more macros).
This also reverses the dependency from import.cpp -> flatbuffer_loader.cpp to flatbuffer_loader.cpp -> import.cpp.
Differential Revision: D36926217
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79184
Approved by: https://github.com/zhxchen17
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76350
fix memory issue
Test Plan:
User reported ASAN errors when running `//xplat/langtech/mobile:giga5_bin`
Verified with the user that the errors are gone after rebasing onto this diff.
Reviewed By: pavithranrao
Differential Revision: D35911894
fbshipit-source-id: 41eb88fe1501d1bb7dd9ce3d36c224c3300a41e8
(cherry picked from commit 3a13aae20f698bc5bcc4d5cb686bb36d433a2f03)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57775
The minimum supported bytecode version is updated from 3 to 4. We no longer support version 3 bytecode models.
Why?
* There is hacky code in operator loading that treats one operator differently under the global bytecode version 3. Operator-related metadata should be passed instead (for example, in #56845). To allow future development, we remove the hack first.
* The bytecode version was bumped from 3 to 4 more than half a year ago. Since all production models have been bumped to version 4, it's not practical to keep maintaining version 3. The risk of deprecating version 3 is low.
Test Plan: Imported from OSS
Reviewed By: raziel
Differential Revision: D28270791
Pulled By: cccclai
fbshipit-source-id: 70b1bd6352fdaae5f8d2173b81578d77018c8e44
(cherry picked from commit 3e930fa381cd01f3705116795c6426df992372fc)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74594
Extend `_save_for_mobile` and `_load_for_mobile` to support the flatbuffer format via an additional optional argument, which defaults to pickle.
Add a new binary target with the suffix `_pickle_and_flatbuffer` to help migration.
The size test in D34909502 shows that the size regressed by ~40K. After removing pickle and comparing lite_predictors, however, we measure a ~120K size reduction, which we will achieve when deprecating pickle and moving to flatbuffer.
**BEFORE:**
```lang=mermaid
graph TD;
torch_core-->torch_mobile_deserialize;
torch_mobile_core-->torch_mobile_deserialize;
jit_module_saving-->torch_core;
jit_module_saving-->torch_mobile_core;
torch_mobile_deserialize-->caffe2_serialize;
torch_mobile_deserialize-->torch_mobile_module;
caffe2_serialize-->miniz;
flatbuffer_loader-->mobile_bytecode;
flatbuffer_serializer-->mobile_bytecode;
mobile_bytecode-->flatbuffer_2.0;
flatbuffer_loader-->torch_mobile_module;
flatbuffer_serializer-->torch_mobile_module;
```
**AFTER:**
```lang=mermaid
graph TD;
torch_core-->torch_mobile_deserialize;
torch_mobile_core-->torch_mobile_deserialize;
jit_module_saving-->torch_core;
jit_module_saving-->torch_mobile_core;
torch_mobile_deserialize-->caffe2_serialize;
torch_mobile_deserialize-->torch_mobile_module;
caffe2_serialize-->miniz;
flatbuffer_loader-->mobile_bytecode;
flatbuffer_serializer-->mobile_bytecode;
mobile_bytecode-->flatbuffer_2.0;
torch_mobile_deserialize_pickle_and_flatbuffer-->|new| flatbuffer_loader;
torch_mobile_deserialize_pickle_and_flatbuffer-->|new| torch_mobile_deserialize;
torch_mobile_core_pickle_and_flatbuffer-->|new| torch_mobile_deserialize_pickle_and_flatbuffer;
torch_core_pickle_and_flatbuffer-->|new| torch_mobile_deserialize_pickle_and_flatbuffer;
jit_module_saving_pickle_and_flatbuffer-->|new| torch_core_pickle_and_flatbuffer;
jit_module_saving_pickle_and_flatbuffer-->|new| torch_mobile_core_pickle_and_flatbuffer;
flatbuffer_serializer-->torch_mobile_module;
jit_module_saving_pickle_and_flatbuffer-->|new|jit_module_saving;
jit_module_saving_pickle_and_flatbuffer-->|new|flatbuffer_serializer;
flatbuffer_loader-->torch_mobile_module;
```
Original commit changeset: 780dfb6fd6ba
Original Phabricator Diff: D34805092 (284b2b7135)
ghstack-source-id: 152044801
(Note: this ignores all push blocking failures!)
Test Plan:
CI
```
~/fbsource/fbcode] cd ~/fbsource/fbcode/ && buck test -c fbcode.caffe2_enable_flatbuffer=1 //caffe2/test/cpp/jit:jit -- FlatbufferTest.ExtraFiles
Parsing buck files: finished in 0.9 sec
Building: finished in 5.3 sec (100%) 12992/54304 jobs, 0/54304 updated
Total time: 6.2 sec
More details at https://www.internalfb.com/intern/buck/build/2b387fff-f813-4cfa-b53f-eb2378630d4e
BUILD SUCCEEDED
Tpx test run coordinator for Facebook. See https://fburl.com/tpx for details.
Running with tpx session id: f93a84d6-e7ce-41a0-a97f-0ef3fa6d199d
Trace available for this run at /tmp/tpx-20220323-134108.766518-f93a84d6-e7ce-41a0-a97f-0ef3fa6d199d/trace.log
RemoteExecution session id: reSessionID-f93a84d6-e7ce-41a0-a97f-0ef3fa6d199d-tpx
Started reporting to test run: https://www.internalfb.com/intern/testinfra/testrun/4503599723101693
✓ ListingSuccess: caffe2/test/cpp/jit:jit : 486 tests discovered (19.122)
✓ Pass: caffe2/test/cpp/jit:jit - FlatbufferTest.ExtraFiles (0.187)
Summary
Pass: 1
ListingSuccess: 1
If you need help understanding your runs, please follow the wiki: https://fburl.com/posting_in_tpx_users
Finished test run: https://www.internalfb.com/intern/testinfra/testrun/4503599723101693
```
Similar Build Deps Dags
```
[pavithran@devvm5216.vll0 /data/users/pavithran/fbsource] buck query 'allpaths(//xplat/caffe2:torch_mobile_all_ops_pickle_and_flatbuffer, //xplat/caffe2:torch_mobile_deserialize_pickle_and_flatbuffer)' --output-format dot-compact | pastry
P486770901: https://www.internalfb.com/intern/paste/P486770901/
[pavithran@devvm5216.vll0 /data/users/pavithran/fbsource] buck query 'allpaths(//xplat/caffe2:torch_mobile_all_ops, //xplat/caffe2:torch_mobile_deserialize)' --output-format dot-compact | pastry
P486771278: https://www.internalfb.com/intern/paste/P486771278/
```
pickle_and_flatbuffer: https://www.internalfb.com/intern/dgw/graph/?build_id=P486770901
pickle: https://www.internalfb.com/intern/dgw/graph/?build_id=P486771278
Reviewed By: iseeyuan
Differential Revision: D35067157
fbshipit-source-id: 9044259c17a2e0da79bd6aedb28efbdfd57e23e0
(cherry picked from commit f738069ec3a72e79da56172741d027de514e9e5f)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72523
`toTuple()` returns a new intrusive pointer, which bumps the underlying ref count, whereas `toTupleRef()` returns a reference. Using the latter saves an unnecessary ref count bump.
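The difference can be observed with an analogous sketch built on `std::shared_ptr` instead of the intrusive pointer. `Value`, `to_tuple`, and `to_tuple_ref` are hypothetical stand-ins, not the IValue API.

```cpp
#include <memory>
#include <utility>
#include <vector>

struct Tuple {
  std::vector<int> elements;
};

// Hypothetical holder that mimics the two accessor styles.
class Value {
 public:
  explicit Value(std::shared_ptr<Tuple> t) : tuple_(std::move(t)) {}

  // Like toTuple(): hands out a new owning pointer, incrementing the count.
  std::shared_ptr<Tuple> to_tuple() const { return tuple_; }

  // Like toTupleRef(): borrows the payload, leaving the count untouched.
  const Tuple& to_tuple_ref() const { return *tuple_; }

  long use_count() const { return tuple_.use_count(); }

 private:
  std::shared_ptr<Tuple> tuple_;
};
```

The borrowed reference is only valid while the holder keeps the tuple alive, so the by-reference accessor suits short-lived reads such as iterating over elements during parsing.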
ghstack-source-id: 149173308
Test Plan: Sandcastle CI
Reviewed By: swolchok
Differential Revision: D34047666
fbshipit-source-id: 8c821e45f7af4f3f1d098871926b9df288e329fb
(cherry picked from commit 34797e508d533c578a40f74ffc82b34e1c3ea40e)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70338
Today Unpickler is used by both server and mobile for deserializing models, and it always falls back to the mobile parser when the user provides no type resolver. This is not intended, because the server and mobile type parsers support different things. In this diff we provide a default fallback that uses the script parser, and opt out of it for all mobile cases.
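The shape of that fallback might look like the sketch below. Everything here is hypothetical: `TypeParser`, `script_parse`, `mobile_parse`, and the simplified `Unpickler` class only illustrate a default argument that mobile callers override explicitly.

```cpp
#include <functional>
#include <string>
#include <utility>

using TypeParser = std::function<std::string(const std::string&)>;

// Placeholder parsers that tag their output so the choice is visible.
std::string script_parse(const std::string& s) { return "script:" + s; }
std::string mobile_parse(const std::string& s) { return "mobile:" + s; }

// Hypothetical unpickler: when the caller supplies no parser, it defaults
// to the script parser instead of silently using the mobile one.
class Unpickler {
 public:
  explicit Unpickler(TypeParser parser = script_parse)
      : parser_(std::move(parser)) {}

  std::string resolve(const std::string& type_name) const {
    return parser_(type_name);
  }

 private:
  TypeParser parser_;
};
```

Mobile call sites would then pass `mobile_parse` explicitly, which keeps the two parsers from being mixed up by accident.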
ghstack-source-id: 146727330
(Note: this ignores all push blocking failures!)
Test Plan: CI
Reviewed By: iseeyuan
Differential Revision: D33284352
fbshipit-source-id: 997c4f110b36eee6596e8f23f6a87bf91a4197ed
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70225
Thanks to zhxchen17 for the suggestion. This PR moves the operator initialization logic to `upgrader_mobile.cpp`, so that we can rely on a static variable to ensure the operator initialization happens only once.
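The static-variable technique can be sketched as follows. The function name, the counter, and the operator names are hypothetical; the part that matters is that C++11 guarantees a function-local static is initialized exactly once, even when several threads reach it concurrently.

```cpp
#include <string>
#include <vector>

// Counts how often the initializer ran (for demonstration only).
int& init_count() {
  static int count = 0;
  return count;
}

// Hypothetical table of upgrader operators, built on first use.
const std::vector<std::string>& upgrader_operators() {
  static const std::vector<std::string> ops = [] {
    ++init_count();  // runs only during the one-time initialization
    return std::vector<std::string>{"aten::div", "aten::add"};
  }();
  return ops;
}
```

Later calls return the already-built table without re-running the lambda.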
ghstack-source-id: 146103229
Test Plan:
```
buck test mode/opt //papaya/integration/service/test/analytics/histogram:generic_histogram_system_test -- --exact 'papaya/integration/service/test/analytics/histogram:generic_histogram_system_test - SumHistogramSystemTest.test' --run-disabled
buck test mode/opt //caffe2/test/cpp/jit:jit
buck test mode/dev //papaya/integration/service/test/mnist:mnist_system_test -- --exact 'papaya/integration/service/test/mnist:mnist_system_test - MnistFederatedSystemTest.test'
```
Reviewed By: zhxchen17
Differential Revision: D33247543
fbshipit-source-id: 6c3a87fe909a1be01452fa79649065845b26d805
Summary:
The upgrader should be initialized only once, when the runtime loads the first module. It does not need to be initialized again afterwards.
Previously, instead of using an atomic variable, the upgrader was initialized depending on whether byteCodeFunctionWithOperator.function.get_code().operators_ was empty. If it was empty, the operators from the upgrader had not been initialized yet. However, that check is not thread safe: when multiple threads load modules at the same time, they may all conclude that theirs is the first module. Use an atomic variable here to make it thread safe.
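A minimal sketch of the atomic-flag idea, with hypothetical names (`g_upgrader_initialized`, `try_claim_initialization`, `load_module`); it is not the code from this diff. `exchange` atomically sets the flag and returns the previous value, so exactly one caller observes `false`.

```cpp
#include <atomic>

std::atomic<bool> g_upgrader_initialized{false};
int g_init_runs = 0;  // stands in for the real initialization work

// Returns true only for the single caller that flips the flag.
bool try_claim_initialization() {
  return !g_upgrader_initialized.exchange(true);
}

void load_module() {
  if (try_claim_initialization()) {
    ++g_init_runs;  // only the winning caller initializes the upgrader
  }
  // ... continue loading the module ...
}
```

One limitation of a bare flag is that callers who lose the exchange may continue before the winner has finished initializing, so code that depends on the initialized state needs additional synchronization.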
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70161
ghstack-source-id: 146012642
Test Plan:
```
buck test mode/opt //papaya/integration/service/test/analytics/histogram:generic_histogram_system_test -- --exact 'papaya/integration/service/test/analytics/histogram:generic_histogram_system_test - SumHistogramSystemTest.test' --run-disabled
buck test mode/opt //caffe2/test/cpp/jit:jit
```
Reviewed By: iseeyuan
Differential Revision: D33220320
fbshipit-source-id: 10f2397c3b358d5a1d39a2ce25457e3fdb640d2c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68037
Right now mobile::Code doesn't outlive its enclosing Function, and all accesses to Code happen inside the interpreter loop, which doesn't outlive the module, so we don't need std::shared_ptr here. This should also save 1-2 KB of binary size, because shared_ptr seems to bloat on arm64 Android.
ghstack-source-id: 145818696
Test Plan: eyes.
Reviewed By: qihqi, tugsbayasgalan
Differential Revision: D32264616
fbshipit-source-id: d83f538d6604cf75fd7728a25127b4849ce7ab2a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67729
1. The operator version is needed to decide whether to apply an upgrader. This PR makes it available at the loading stage.
2. Swap the order of parsing instructions and operators, because instruction parsing needs to know the operators first in order to decide whether to apply an upgrader (that is, whether to change `OP` to `CALL`).
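The ordering constraint can be illustrated with a simplified sketch. All names here (`OpCode`, `Instruction`, `UpgraderTable`, `parse_operators`, `parse_instructions`) and the version rule are hypothetical, and a real runtime would also retarget the operand to the upgrader function, which is omitted.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

enum class OpCode { OP, CALL };

struct Instruction {
  OpCode op;
  std::size_t operand;  // index into the operator table
};

// Hypothetical table: operator name -> version at which its behavior changed.
using UpgraderTable = std::unordered_map<std::string, std::int64_t>;

// Step 1: parse operators first, recording which ones need an upgrader
// for a model saved with the given operator version.
std::vector<bool> parse_operators(const std::vector<std::string>& names,
                                  std::int64_t model_operator_version,
                                  const UpgraderTable& changed_at) {
  std::vector<bool> needs_upgrader;
  needs_upgrader.reserve(names.size());
  for (const auto& name : names) {
    const auto it = changed_at.find(name);
    needs_upgrader.push_back(it != changed_at.end() &&
                             model_operator_version < it->second);
  }
  return needs_upgrader;
}

// Step 2: parse instructions, rewriting OP to CALL where step 1 flagged
// the referenced operator. This is why operators must be parsed first.
std::vector<Instruction> parse_instructions(
    std::vector<Instruction> raw, const std::vector<bool>& needs_upgrader) {
  for (auto& ins : raw) {
    if (ins.op == OpCode::OP && needs_upgrader.at(ins.operand)) {
      ins.op = OpCode::CALL;
    }
  }
  return raw;
}
```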
ghstack-source-id: 145082390
Test Plan:
```
buck test //caffe2/test/cpp/jit:jit
```
Reviewed By: iseeyuan
Differential Revision: D32092516
fbshipit-source-id: 853a68effaf95dca86ae46b7f7f4ee0d8e8767da
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65971
ghstack-source-id: 141842335
We should be able to load methods into their ClassTypes. Right now the mobile runtime loads only data members into ClassTypes, not methods. To support interface calls, we inject methods into ClassTypes when the methods are loaded.
Test Plan: existing tests should all pass.
Reviewed By: qihqi
Differential Revision: D31326146
fbshipit-source-id: fb1dbea619910ef1f8fa26146da3ebab348fe902
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64066
I noticed a bunch of time being spent heap-allocating Tuples
in the unpickler. 1-, 2-, and 3-element Tuples are apparently common
enough that they get their own bytecode instructions, so I decided to
try also giving them their own representation. We store up to 3
IValues inline in `Tuple` rather than doing a second heap allocation
for a `std::vector<IValue>`.
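The inline-storage idea can be sketched in a simplified form. `SmallElements` is a hypothetical class that stores `int` instead of `IValue` and keeps both storages side by side rather than in a union, so it shows the branching but not the actual memory layout.

```cpp
#include <array>
#include <cstddef>
#include <initializer_list>
#include <vector>

constexpr std::size_t kInlineCapacity = 3;

// Up to three elements live inside the object itself; only larger
// sequences pay for a heap-allocated vector.
class SmallElements {
 public:
  SmallElements(std::initializer_list<int> values) : size_(values.size()) {
    if (size_ <= kInlineCapacity) {
      std::size_t i = 0;
      for (int v : values) {
        inline_[i++] = v;  // no heap allocation on this path
      }
    } else {
      heap_.assign(values.begin(), values.end());
    }
  }

  bool is_inline() const { return size_ <= kInlineCapacity; }
  std::size_t size() const { return size_; }

  int operator[](std::size_t i) const {
    return is_inline() ? inline_[i] : heap_[i];
  }

 private:
  std::size_t size_ = 0;
  std::array<int, kInlineCapacity> inline_{};
  std::vector<int> heap_;
};
```

The trade-off is a branch on every access in exchange for skipping the second allocation in the common small case.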
ghstack-source-id: 140695395
Test Plan:
Added automated tests for TupleElements.
Pixel 3 before: https://www.internalfb.com/intern/aibench/details/761596366576284
Pixel 3 after: https://www.internalfb.com/intern/aibench/details/591414145082422
We went from 347 ms to 302 ms.
Reviewed By: dhruvbird
Differential Revision: D30592622
fbshipit-source-id: 93625c54c9dca5f765ef6d5c191944179cb281a8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65865
`operator_str` is not used in `import.cpp` and is also defined in `parse_operators.cpp`, so remove it from `import.cpp`.
Test Plan: CI passing
Reviewed By: iseeyuan
Differential Revision: D31293008
fbshipit-source-id: 1c857cbd63c57b8f79c1a068789fc8605605b642