Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/42133
Test Plan:
We save a module with module debugging information as follows.
```
import torch
m = torch.jit.load('./detect.pt')
# Save module without debug info
m._save_for_lite_interpreter('./detect.bc')
# Save module with debug info
m._save_for_lite_interpreter('./detect.bc', _save_debug_info_in_bytecode=True)
```
Size of the file without module debugging information: 4.508 MB
Size of the file with module debugging information: 4.512 MB
Reviewed By: kimishpatel
Differential Revision: D22803740
Pulled By: taivu1998
fbshipit-source-id: c82ea62498fde36a1cfc5b073e2cea510d3b7edb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41376
torch::jit::mobile::Module does not currently support accessing parameters via their attribute names, but torch::jit::Module does. This diff adds an equivalent functionality to mobile::Module.
Test Plan: Imported from OSS
Reviewed By: iseeyuan
Differential Revision: D22609142
Pulled By: ann-ss
fbshipit-source-id: 1a5272ff336f99a3c0bb6194c6a6384754f47846
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39607
add overload name for strcmp macro to prevent duplicated op names in lite interpreter
also reformatted some other files
Test Plan:
verified these op schema are changed
```
-aten::eq(str a, str b) -> (bool)
+aten::eq.str(str a, str b) -> (bool)
-aten::ne(str a, str b) -> (bool)
+aten::ne.str(str a, str b) -> (bool)
-aten::lt(str a, str b) -> (bool)
+aten::lt.str(str a, str b) -> (bool)
-aten::gt(str a, str b) -> (bool)
+aten::gt.str(str a, str b) -> (bool)
-aten::le(str a, str b) -> (bool)
+aten::le.str(str a, str b) -> (bool)
-aten::ge(str a, str b) -> (bool)
+aten::ge.str(str a, str b) -> (bool)
```
Reviewed By: iseeyuan
Differential Revision: D21913049
fbshipit-source-id: 518db068c8c5b0efd19223f0bd94fc3351335dc4
Summary: Fix operator perf observer index issue.
Test Plan:
make sure that the operator index is populated correctly, ran benchmarking for pytext_mobile_inference, see result:
https://www.internalfb.com/intern/aibench/details/598900068317693
Reviewed By: linbinyu
Differential Revision: D21779222
fbshipit-source-id: 0fc3561d83d10cfabd73e1e6b6ee240ce0bafd80
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38489
Remove module and operator observer macros.
ghstack-source-id: 104290763
Test Plan:
a. Verify that QPL is being sent while testing FB4A BI Cloaking:
{F236982877}
b. Verify that AI Benchmark is working on both module and operator level:
https://our.intern.facebook.com/intern/aibench/details/808056762618979
c. Verify that macosx segmentation effect by running buck run xplat/arfx/tracking/segmentation/tools:person_segmentation_demoAppleMac#macosx-x86_64:
{F236982853}
Reviewed By: ljk53
Differential Revision: D21540838
fbshipit-source-id: 516f84ef5673d4ceed38ae152440a5cbacc6ddaa
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37962
Temporarily re-enable RecordFunction in lite interpreter when profiler key is not set,
this allows the profiler to work without profiled wrappers in the build
Test Plan: CI
Reviewed By: smessmer, linbinyu
Differential Revision: D21409120
fbshipit-source-id: 6f0311c8eb55537a03b8bdac69def18a496ec672
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37548
Moving RecordFunction from torch::autograd::profiler into at namespace
Test Plan:
CI
Imported from OSS
Differential Revision: D21315852
fbshipit-source-id: 4a4dbabf116c162f9aef0da8606590ec3f3847aa
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37382
After adding c10::DispatchKey::Profiler the behavior of RecordFunction
observers is also controlled by the dispatch key,
this PR moves the logic outside of the profiler into the record function
Reviewed By: jamesr66a
Differential Revision: D21268320
fbshipit-source-id: 93207e3b55325d20dcc5b1e8f448ab86933321da
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37292
After adding c10::DispatchKey::Profiler the behavior of RecordFunction
observers is also controlled by the dispatch key,
this PR moves the logic outside of the profiler into the record function
Reviewed By: jamesr66a
Differential Revision: D21245094
fbshipit-source-id: 595e41b18206d2ba4cf639cb320f630907868b3f
Summary: This diff fixes the issues with current handling of debug information passed along the execution of the model. (For example, it is possible that multiple calls to the debug guard may override each other)
Test Plan: CI test/cpp/jit
Reviewed By: dzhulgakov
Differential Revision: D20602775
fbshipit-source-id: 4683957954028af81a1a0f1f12b243650230c9bb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35115
This commit runs the newly added tools/clang_format.py on the JIT
codebase and includes all of the formatting changes thus produced.
Testing:
Ran the script, CI.
Test Plan: Imported from OSS
Reviewed By: eellison
Differential Revision: D20568523
Pulled By: SplitInfinity
fbshipit-source-id: e09bdb982ccf090eecfb7c7b461b8d0681eef82b
Summary: Last diff enabled operator stats for non-production build including AIBench. But the operator latency is off: https://our.intern.facebook.com/intern/aibench/details/414567479798816 as it is representing operator execution end time, and as the threadLocalDebugInfo was not set, the start time is 0. So this diff is fixing it by creating a new ThreadLocalDebugInfo object when op starts to run and store the model information for logging.
Test Plan:
```buck run mode/mac aibench:run_bench_macos -- -b aibench/specifications/models/pytorch/pytext/pytext_mobile_inference.json --platform android --framework pytorch --remote --devices SM-G960F-8.0.0-26```
https://our.intern.facebook.com/intern/aibench/details/922804117425407
```buck run mode/mac aibench:run_bench_macos -- -b aibench/specifications/models/pytorch/fbnet/fbnet_mobile_inference.json --platform android --framework pytorch --remote --devices SM-G960F-8.0.0-26```
https://our.intern.facebook.com/intern/aibench/details/593403202250750
Reviewed By: xta0
Differential Revision: D20436388
fbshipit-source-id: 740bc94c3f51daef6af9b45c1ed7a708f5fc8836
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33294
1. Serialize bytecode of __setstate__ and run it when loading the model.
2. One use case is quantization. To test this use case a few operators are registered temporarily for lite interpreter. The "_" prefix registration will be removed when the operators are all migrated to mobile.
Test Plan: Imported from OSS
Differential Revision: D20162898
Pulled By: iseeyuan
fbshipit-source-id: 7a3180807bf38fbce594d86993896861f12bb58c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30906
Add mobile module observer to measure performance of each method run.
ghstack-source-id: 95120194
Test Plan:
Run pytext model through BI cloaking flow on lite-interpreter and verify logs are sent:
1. buck install -r fb4a
2. Go to internal setting and find MobileConfig, search for android_bi_infra_cloaking_iab_models and set the following params:
a. sample_rate: 1.0
b. enabled: true
c. use_bytedoc_pytorch_model: true
d. use_bytedoc_caffe2_model: false
e. use_full_jit: false
3. Go back to new feed and scroll down until find an ads which will direct you to offsite webpage;
4. Click on the ads, wait for the offsite ads loads;
5. Click back to news feed;
6. Go to scuba table: https://fburl.com/scuba/4fghwp0b and see all the operator runs have been logged:
{F223456981}
Reviewed By: ljk53
Differential Revision: D18702116
fbshipit-source-id: a9f07eee684e3022cef5ba3c5934f30f20192a85
Summary: Add mobile operator observer to measure performance of each operator run, the result will also log into QPL event: [MOBILE_OPERATOR_STATS ](https://fburl.com/quicklog/8773a00a).
Test Plan:
Run pytext model through BI cloaking flow on lite-interpreter and verify logs are sent:
1. buck install -r fb4a
2. Go to internal setting and find MobileConfig, search for android_bi_infra_cloaking_iab_models and set the following params:
a. sample_rate: 1.0
b. enabled: true
c. use_bytedoc_pytorch_model: true
d. use_bytedoc_caffe2_model: false
e. use_full_jit: false
3. Go back to new feed and scroll down until find an ads which will direct you to offsite webpage;
4. Click on the ads, wait for the offsite ads loads;
5. Click back to news feed;
6. Go to scuba table: https://fburl.com/scuba/er7t4g9u and see all the operator runs have been logged:
{F223250762}
Reviewed By: ljk53
Differential Revision: D18131224
fbshipit-source-id: 23e2f6e2a9851c04b29511b45dc53f3cce03e8a0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30060
Mobile forward() passed inputs by reference, which is different from JIT's script::module. To make it consistent, change it pass by value.
Test Plan: Imported from OSS
Differential Revision: D18587786
Pulled By: iseeyuan
fbshipit-source-id: fa398124fd0a5168f708733ff88f0ba327726f43
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25187
The bytecode export flow: dump the bytecode format for the light weighted interpreter.
* The bytecode is generated without input spec optimization. It would be more generic (input independent) with no obvious performance degradation (to be tested).
* Main API: torch::jit::script::Module::save(filename, extra_files, bool *bytecode_format* = false).
* Both bytecode and module object are exported in pickle format.
* The module object (in data.pkl) is the same as the original JIT model.
* The serializer is dependent on pickle only (no protobuf or Json).
* The major functionality is forked in ScriptModuleSerializer2::serialize().
* The test loader is test_bc_export.cpp.
* Simple APIs are added in Code and its implementation to get necessary information (instructions, operators and constants).
* Since there's no dependency on graph/node, GetAttr is promoted from an operator to first-class instruction (https://github.com/pytorch/pytorch/pull/25151) .
* Some definitions (instructions, writeArchive, etc) that are shared by full JIT and bytecode are pulled out of the local namespace (https://github.com/pytorch/pytorch/pull/25148).
The output layout looks like:
* folders of methods.
* In each method folder (for example, forward/):
* bytecode.pkl: instructions and operators
* constants{.pkl,/}: constant list in constants.pkl. If there are tensors in constants, the binary tensor files in constants/ folder.
* data{.pkl,/}: the module object, with binary tensor files in data/ folder. The same as in torchscript.
Test Plan: Imported from OSS
Differential Revision: D17076411
fbshipit-source-id: 46eb298e7320d1e585b0101effc0fcfd09219046