Summary:
## Original commit message:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73368
debug_pkl file inside of pytorch's .pt file consists of a list of SourceRanges. Each SourceRange points to a Source which is a stack track, filename, and start, end numbers. Those are emitted in debug_pkl file as strings.
Since many SourceRange shares the same source, the string for trace can be deduped.
The newer format saves a set of unique traces in a tuple, then each SourceRange will save the offset of it's trace w.r.t. position in that tuple. (i.e. manually applying dictionary compression).
The above helps with smaller file size. On loading, if we copy each trace to Source as string the runtime memory would still blowup.
To mitigate this, we use SourceView directly instead of source which will take the reference of string inside of Deserializer and make that into string_view. This is safe because Deserializer is hold by Unpickler by shared_ptr, and Unpickler is also hold by shared_ptr by another Source object. That Source object will be alive during the model construction.
Test Plan:
## Original Test plan
unit test
Took original file (312271638_930.predictor.disagg.local); loaded with `torch.jit.load` save again with `torch.jit.save`. Unzip both, look at contents:
```
[qihan@devvm5585.vll0 ~]$ du archive -h
4.0K archive/xl_model_weights
3.7M archive/extra
8.0K archive/code/__torch__/caffe2/torch/fb/model_transform/splitting
8.0K archive/code/__torch__/caffe2/torch/fb/model_transform
8.0K archive/code/__torch__/caffe2/torch/fb
8.0K archive/code/__torch__/caffe2/torch
8.0K archive/code/__torch__/caffe2
20M archive/code/__torch__/torch/fx/graph_module
20M archive/code/__torch__/torch/fx
8.0K archive/code/__torch__/torch/classes
20M archive/code/__torch__/torch
20M archive/code/__torch__
20M archive/code
2.7M archive/constants
35M archive
[qihan@devvm5585.vll0 ~]$ du resaved -h
4.0K resaved/extra
8.0K resaved/code/__torch__/caffe2/torch/fb/model_transform/splitting
8.0K resaved/code/__torch__/caffe2/torch/fb/model_transform
8.0K resaved/code/__torch__/caffe2/torch/fb
8.0K resaved/code/__torch__/caffe2/torch
8.0K resaved/code/__torch__/caffe2
1.3M resaved/code/__torch__/torch/fx/graph_module
1.3M resaved/code/__torch__/torch/fx
8.0K resaved/code/__torch__/torch/classes
1.4M resaved/code/__torch__/torch
1.4M resaved/code/__torch__
1.4M resaved/code
2.7M resaved/constants
13M resaved
[qihan@devvm5585.vll0 ~]$
```
## Additional test:
`buck test mode/dev-tsan //caffe2/benchmarks/static_runtime:static_runtime_cpptest -- --exact 'caffe2/benchmarks/static_runtime:static_runtime_cpptest - StaticRuntime.to'` passes
test jest.fbios.startup_cold_start.local.simulator f333356873 -
Differential Revision: D35196883
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74869
Approved by: https://github.com/gmagogsfm
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75244
Original commit changeset: d653a5af662a
Original Phabricator Diff: D35060736 (d9d34922a0)
Test Plan: Model loading test, verified that D35060736 (d9d34922a0) will cause the torch::save => torch::load failure.
Reviewed By: yinghai, jianyuh
Differential Revision: D35387009
fbshipit-source-id: 9d176992d402d57779e2af3d905b3c1538335298
(cherry picked from commit 6c8cc0d3b8a88b15e35702d70e18bbae8aa4628a)
Summary:
This diff is reverting D34455360 (61d6c43864)
D34455360 (61d6c43864) is making the following tests to fail and this revert diff is either the revert of the blame diff or the revert of the stack of diffs that need to be reverted to revert the blame diff
Tests affected:
- https://www.internalfb.com/intern/test/562950004334605/
Multisect link:
https://www.internalfb.com/intern/testinfra/multisect/756170
Test Plan: NA
Reviewed By: zhxchen17
Differential Revision: D34596156
fbshipit-source-id: a465bca0094db3caf6130c80f1ed49eea981359b
(cherry picked from commit ef5e5578c64ce9827570757fb016aafa9c782c6a)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73368
debug_pkl file inside of pytorch's .pt file consists of a list of SourceRanges. Each SourceRange points to a Source which is a stack track, filename, and start, end numbers. Those are emitted in debug_pkl file as strings.
Since many SourceRange shares the same source, the string for trace can be deduped.
The newer format saves a set of unique traces in a tuple, then each SourceRange will save the offset of it's trace w.r.t. position in that tuple. (i.e. manually applying dictionary compression).
The above helps with smaller file size. On loading, if we copy each trace to Source as string the runtime memory would still blowup.
To mitigate this, we use SourceView directly instead of source which will take the reference of string inside of Deserializer and make that into string_view. This is safe because Deserializer is hold by Unpickler by shared_ptr, and Unpickler is also hold by shared_ptr by another Source object. That Source object will be alive during the model construction.
Test Plan:
unit test
Took original file (312271638_930.predictor.disagg.local); loaded with `torch.jit.load` save again with `torch.jit.save`. Unzip both, look at contents:
```
[qihan@devvm5585.vll0 ~]$ du archive -h
4.0K archive/xl_model_weights
3.7M archive/extra
8.0K archive/code/__torch__/caffe2/torch/fb/model_transform/splitting
8.0K archive/code/__torch__/caffe2/torch/fb/model_transform
8.0K archive/code/__torch__/caffe2/torch/fb
8.0K archive/code/__torch__/caffe2/torch
8.0K archive/code/__torch__/caffe2
20M archive/code/__torch__/torch/fx/graph_module
20M archive/code/__torch__/torch/fx
8.0K archive/code/__torch__/torch/classes
20M archive/code/__torch__/torch
20M archive/code/__torch__
20M archive/code
2.7M archive/constants
35M archive
[qihan@devvm5585.vll0 ~]$ du resaved -h
4.0K resaved/extra
8.0K resaved/code/__torch__/caffe2/torch/fb/model_transform/splitting
8.0K resaved/code/__torch__/caffe2/torch/fb/model_transform
8.0K resaved/code/__torch__/caffe2/torch/fb
8.0K resaved/code/__torch__/caffe2/torch
8.0K resaved/code/__torch__/caffe2
1.3M resaved/code/__torch__/torch/fx/graph_module
1.3M resaved/code/__torch__/torch/fx
8.0K resaved/code/__torch__/torch/classes
1.4M resaved/code/__torch__/torch
1.4M resaved/code/__torch__
1.4M resaved/code
2.7M resaved/constants
13M resaved
[qihan@devvm5585.vll0 ~]$
```
Reviewed By: gmagogsfm
Differential Revision: D34455360
fbshipit-source-id: 8cc716f9bba7183746b1b4ecc33a2de34ac503b9
(cherry picked from commit f1a04730fc9ac8fdab6c8e4c44cb5529e42090e4)
Summary: Reland for D33282878 (911d527b87) . Land backend change first to maintain FC. Will wait for 2 weeks after this diff is in. And than land the front-end change in next diff.
Test Plan:
test in next diff
time buck test mode/dev-nosan fblearner/flow/projects/langtech/translation:tests -- test_e2e_base_training
Reviewed By: gmagogsfm
Differential Revision: D33342547
fbshipit-source-id: b3dee9a4bdfd78103848c12629e5fccafdd621e3
(cherry picked from commit ae1935f1af755180e5607e870ff365dc17061e4a)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71597
Problem: _jit_to_backend overrides get/set state. This means any attributes added to the module after lowering will not be preserved after serialization. For edge workflows the biggest problem here is it breaks bundled_inputs.
Solution?:
Real quick and easy way to handle issues with to_backend overriding get/set state. Wraps the lowered module in another module and has forwarding functions for the api specified in 'method_compile_spec'.
The tradeoff with this approach is now the actual workhorse of the module is 1 layer deep which might make debugging slightly grosser/more difficult/confusing. The other approach Martin David and I talked about would be to only lower the portions that require custom get/set state logic. This leaves the top level the same, and only specific backened internals are changed. Personally I'm not sure how much that really addresses the debugging concern all that well. It seems like if you cracked the model open you'd still run into similar amounts of confusion with a lot of the variables and logic referenced coming from another module.
The other concern with this approach is whether or not 'compile_spec' specifies the public api of the module (since thats our source of truth for this wrapper). While it may not be enforced, it certainly seems to be true by convention and the to_backend api already uses it as a source of truth for all functions that get generated in the resulting module. I say we just formally commit to this (compile spec keys being functions) being the contract of the api instead of just assuming it to be the case and then having weird behavior if its not.
Test Plan:
New Unit Test
CI to check for existing behavior and contracts.
manually tested in a notebook with bundled inputs.
{P475790313}
Reviewed By: raziel
Differential Revision: D33694257
fbshipit-source-id: 9ff27db421eba41bac083dff11a22e9e40a36970
(cherry picked from commit 91ef49977ef0bf18242df381a3ee805c24d6f68d)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64066
I noticed a bunch of time being spent heap-allocating Tuples
in the unpickler. 1-, 2-, and 3-element Tuples are apparently common
enough that they get their own bytecode instructions, so I decided to
try also giving them their own representation. We store up to 3
IValues inline in `Tuple` rather than doing a second heap allocation
for a `std::vector<IValue>`.
ghstack-source-id: 140695395
Test Plan:
Added automated tests for TupleElements.
Pixel 3 before: https://www.internalfb.com/intern/aibench/details/761596366576284
Pixel 3 after: https://www.internalfb.com/intern/aibench/details/591414145082422
We went from 347 ms to 302 ms.
Reviewed By: dhruvbird
Differential Revision: D30592622
fbshipit-source-id: 93625c54c9dca5f765ef6d5c191944179cb281a8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62419
This diff adds support for cpu only kineto profiler on mobile. Thus
enabling chrome trace generation on mobile. This bring cpp API for
mobile profiling on part with Torchscript.
This is done via:
1. Utilizating debug handle annotations in KinetoEvent.
2. Adding post processing capability, via callbacks, to
KinetoThreadLocalState
3. Creating new RAII stype profiler, KinetoEdgeCPUProfiler, which can be
used in surrounding scope of model execution. This will write chrome
trace to the location specified in profiler constructor.
Test Plan:
MobileProfiler.ModuleHierarchy
Imported from OSS
Reviewed By: raziel
Differential Revision: D29993660
fbshipit-source-id: 0b44f52f9e9c5f5aff81ebbd9273c254c3c03299
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61933
### Issue:
SubModules with same name are not serialized correctly in bytecode format while using `_save_for_mobile`. These submodules are not distinguished as different modules even though they have different foward, setstate etc if they have the same name.
### Fix:
Mangler creates unique names so that modules and submodules that have same names can be uniquely identified while saving the module. iseeyuan rightly pointed out the underlying issue that mangler is not used in the process of saving bytecode and hence unique references for the submodules are not created. Please refer to the notebook to repro the issue: N777224
### Diff:
The above idea of fix is implemented. The mangled names are used in bytecode thereby the files in `code/` directory now have right reference to the `bytecode.pkl`
Will this have backward compatibility?
iseeyuan please feel free to correct or update this.
Yes. This fix impacts only modules with same name sub modules which were not serialized correctly before. Existing modules should have correct references and `_load_for_mobile` must not see any change. To confirm this the existing test cases need to pass for the diff to be approved and shipped.
ghstack-source-id: 134242696
Test Plan:
```
~/fbsource/fbcode > buck test caffe2/test/cpp/jit:jit -- BackendTest.TestCompositeWithSetStates
Downloaded 0/5 artifacts, 0.00 bytes, 100.0% cache miss (for updated rules)
Building: finished in 19.2 sec (100%) 17619/17619 jobs, 3/17619 updated
Total time: 19.5 sec
More details at https://www.internalfb.com/intern/buck/build/91542d50-25f2-434d-9e1a-b93117f4efe1
Tpx test run coordinator for Facebook. See https://fburl.com/tpx for details.
Running with tpx session id: de9e27cf-4c6c-4980-8bc5-b830b7c9c534
Trace available for this run at /tmp/tpx-20210719-161607.659665/trace.log
Started reporting to test run: https://www.internalfb.com/intern/testinfra/testrun/844425127206388
✓ ListingSuccess: caffe2/test/cpp/jit:jit - main (8.140)
✓ Pass: caffe2/test/cpp/jit:jit - BackendTest.TestCompositeWithSetStates (0.528)
Summary
Pass: 1
ListingSuccess: 1
If you need help understanding your runs, please follow the wiki: https://fburl.com/posting_in_tpx_users
Finished test run: https://www.internalfb.com/intern/testinfra/testrun/844425127206388
```
```
~/fbsource/fbcode > buck test caffe2/test/cpp/jit:jit -- BackendTest.TestConsistencyOfCompositeWithSetStates
Building: finished in 4.7 sec (100%) 6787/6787 jobs, 0/6787 updated
Total time: 5.0 sec
More details at https://www.internalfb.com/intern/buck/build/63d6d871-1dd9-4c72-a63b-ed91900c4dc9
Tpx test run coordinator for Facebook. See https://fburl.com/tpx for details.
Running with tpx session id: 81023cd2-c1a2-498b-81b8-86383d73d23b
Trace available for this run at /tmp/tpx-20210722-160818.436635/trace.log
Started reporting to test run: https://www.internalfb.com/intern/testinfra/testrun/8725724325952153
✓ ListingSuccess: caffe2/test/cpp/jit:jit - main (7.867)
✓ Pass: caffe2/test/cpp/jit:jit - BackendTest.TestConsistencyOfCompositeWithSetStates (0.607)
Summary
Pass: 1
ListingSuccess: 1
If you need help understanding your runs, please follow the wiki: https://fburl.com/posting_in_tpx_users
Finished test run: https://www.internalfb.com/intern/testinfra/testrun/8725724325952153
```
To check the `bytecode.pkl` using module inspector please check:
N1007089
Reviewed By: iseeyuan
Differential Revision: D29669831
fbshipit-source-id: 504dfcb5f7446be5e1c9bd31f0bd9c986ce1a647
Summary:
As GoogleTest `TEST` macro is non-compliant with it as well as `DEFINE_DISPATCH`
All changes but the ones to `.clang-tidy` are generated using following script:
```
for i in `find . -type f -iname "*.c*" -or -iname "*.h"|xargs grep cppcoreguidelines-avoid-non-const-global-variables|cut -f1 -d:|sort|uniq`; do sed -i "/\/\/ NOLINTNEXTLINE(cppcoreguidelines-avoid-non-const-global-variables)/d" $i; done
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62008
Reviewed By: driazati, r-barnes
Differential Revision: D29838584
Pulled By: malfet
fbshipit-source-id: 1b2f8602c945bd4ce50a9bfdd204755556e31d13
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57481
This diff introduces function name to InlinedCallStack.
Since we are using InlinedCallStack for debug information in lite
interpreter as well as delegate backends, where InlinedCallStack cannot
be constructed from model source code, we need to save function name.
In the absence of function name Function* is used to get name of the
function. This is when JIT compiles code at runtime.
When that is not possible, this diff introduces a way to obtain function
name.
Test Plan:
test_backend
test_cs_debug_info_serialization
test_backend
test_cs_debug_info_serialization
Imported from OSS
Differential Revision:
D28159097
D28159097
Reviewed By: raziel, ZolotukhinM
Pulled By: kimishpatel
fbshipit-source-id: deacaea3325e27273f92ae96cf0cd0789bbd6e72
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57441
debug info
Previous diffs did not save operator name in debug info. For delegated
backends that only idenfity op for profiling with debug handle, operator
name should be stores as well.
Furthermore to complete debug informaton also serialize function name.
Test Plan:
Existing lite interpreter and backend tests
Existing lite interpreter and backend tests
Imported from OSS
Differential Revision:
D28144581
D28144581
Reviewed By: raziel
Pulled By: kimishpatel
fbshipit-source-id: 415210f147530a53b444b07f1d6ee699a3570d99
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55462
handles and symbolicate exception callstack thrown from backend.
Objective of this diff is to achieve improve error reporting when
exceptions are raised from lowered backend. We would effectively like to
get the same model level stack trace that you would get without having
lowered some module to backend.
For example:
```
class AA(nn.Module):
def forward(self, x, y):
return x + y
class A(nn.Module):
def __init__(...):
self.AA0 = AA()
def forward(self, x, y):
return self.AA0.forward(x, y) + 3
class B(nn.Module):
def forward(self, x):
return x + 2
class C(nn.Module):
def __init__(...):
self.A0 = A()
self.B0 = B()
def forward(self, x, y):
return self.A0.forward(x, y) + self.B0.forward(x)
```
If the we then do C().forward(torch.rand((2,3)), torch.rand(14,2))) we
will likely see error stack like:
```
C++ exception with description "The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
File "<string>", line 3, in forward
def forward(self, x, y):
return self.A0.forward(x, y) + self.B0.forward(x)
~~~~~~~~~~~~~~~ <--- HERE
File "<string>", line 3, in forward
def forward(self, x, y):
return self.AA0.forward(x, y) + 3
~~~~~~~~~~~~~~~~ <--- HERE
File "<string>", line 3, in forward
def forward(self, x, y):
return x + y
~~~~~ <--- HERE
```
We would like to see the same error stack if we lowered C.A0 to some
backend.
With this diff we get something like:
```
Module hierarchy:top(C).A0(backend_with_compiler_demoLoweredModule).AA0(AA)
Traceback of TorchScript (most recent call last):
File "<string>", line 3, in FunctionName_UNKNOWN
def forward(self, x, y):
return self.A0.forward(x, y) + self.B0.forward(x)
~~~~~~~~~~~~~~~ <--- HERE
File "<string>", line 5, in FunctionName_UNKNOWN
typed_inputs: List[Any] = [x, y, ]
if self.__backend.is_available() :
_0, = self.__backend.execute(self.__handles["forward"], typed_inputs)
~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
assert isinstance(_0, Tensor)
return _0
File "<string>", line 3, in FunctionName_UNKNOWN
def forward(self, x, y):
return self.AA0.forward(x, y) + 3
~~~~~~~~~~~~~~~~ <--- HERE
File "<string>", line 3, in FunctionName_UNKNOWN
def forward(self, x, y):
return x + y
~~~~~ <--- HERE
```
This is achieved in 3 parts:
Part 1:
A. BackendDebugInfoRecorder:
During backend lowering, in `to_backend`, before calling the preprocess
function corresponding to the backend. This will facilitate recording of
debug info (such as source range + inlined callstack) for the lowered module.
B. Instantiate WithBackendDebugInfoRecorder with BackendDebugInfoRecorder.
This initializes thread local pointer to BackendDebugInfoRecorder.
C. generate_debug_handles:
In preprocess function, the backend will call generate_debug_handles
for each method being lowered separately. generate_debug_handles
takes `Graph` of the method being lowered and returns a map
of Node*-to-debug_handles. Backend is responsible for storing debug
handles appropriately so as to raise exception (and later profiling)
using debug handles when the exception being raised corresponds to
particular Node that was lowered.
Inside generate_debug_handles, we will query the current
BackendDebugHandleInfoRecorder, that is issuing debug handles. This debug
handle manager will issue debug handles as well as record
debug_handles-to-<source range, inlined callstack> map.
D. Back in `to_backend`, once the preprocess function is has finished
lowering the module, we will call `stopRecord` on
BackendDebugInfoRecorder. This will return the debug info map. This
debug info is then stored inside the lowered module.
Part 2:
Serialization:
During serialization for bytecode (lite interpreter), we will do two
things:
1. Extract all the source ranges that are contained inside
debug_handles-to-<source range, inlined callstack> map for lowered
module. This will be source range corresponding to debug handles,
including what is there is inlined callstack. Since we replaced original
module with lowered module, we wont be serializing code for the original
module and thus no source range. That is why the source range will have
to be stored separately. We will lump all the source ranges for all the
lowered modules in one single debug_pkl file.
2. Then we will serialize debug_handles-to-<source range, inlined
callstack> map.
Now during deserialization we will be able to reconstruct
debug_handles-to-<source range, inlined callstack> map. Given all
debug_handles are unique we would not need any module information.
Test Plan:
Tests are added in test_backend.cpp
Tests are added in test_backend.cpp
Imported from OSS
Differential Revision:
D27621330
D27621330
Reviewed By: raziel
Pulled By: kimishpatel
fbshipit-source-id: 0650ec68cda0df0a945864658cab226a97ba1890
Summary:
This is an automatic change generated by the following script:
```
#!/usr/bin/env python3
from subprocess import check_output, check_call
import os
def get_compiled_files_list():
import json
with open("build/compile_commands.json") as f:
data = json.load(f)
files = [os.path.relpath(node['file']) for node in data]
for idx, fname in enumerate(files):
if fname.startswith('build/') and fname.endswith('.DEFAULT.cpp'):
files[idx] = fname[len('build/'):-len('.DEFAULT.cpp')]
return files
def run_clang_tidy(fname):
check_call(["python3", "tools/clang_tidy.py", "-c", "build", "-x", fname,"-s"])
changes = check_output(["git", "ls-files", "-m"])
if len(changes) == 0:
return
check_call(["git", "commit","--all", "-m", f"NOLINT stubs for {fname}"])
def main():
git_files = check_output(["git", "ls-files"]).decode("ascii").split("\n")
compiled_files = get_compiled_files_list()
for idx, fname in enumerate(git_files):
if fname not in compiled_files:
continue
if fname.startswith("caffe2/contrib/aten/"):
continue
print(f"[{idx}/{len(git_files)}] Processing {fname}")
run_clang_tidy(fname)
if __name__ == "__main__":
main()
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56892
Reviewed By: H-Huang
Differential Revision: D27991944
Pulled By: malfet
fbshipit-source-id: 5415e1eb2c1b34319a4f03024bfaa087007d7179
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53068
Adds a ```bool is_available()``` method to the backend contract: it returns ```true``` if ```compile()``` and ```execute()``` can be called; ```false``` otherwise.
It is used to implement the following changes in the ```LoweredModule```:
* ```compile()``` in ```__setstate__``` will run if ```is_available()```, else ```__setstate__``` throws an exception (“Backend not available.”).
* ```compile()``` at ```LoweredModule``` creation will run if ```is_available()```, else a WARNING will be thrown.
* ```execute()``` will only be executed if ```is_available()``` returns true; else throws an exception (“Backend not available.”).
The goal of these changes is to ensure we have a well defined behaviour for the different combinations of backend availability on-host and on-target.
More specifically, backends may have different capabilities to compile and/or execute the Module, depending whether this happens on-host (i.e. where the program is being written) or on-target (where the program is being executed).
First of all, we know that "preprocess" always takes place, and that only happens on-host at creation time. So, we can assume that any compilation is needed/possible on-host then all of it could be pushed here.
Overall, we want to ensure the following:
**On host**
| compile | execute | Outcome |
| -- | -- | -- |
| No | No | On module creation, LoweredModule is generated, with a warning (since compilation and execution can still take place on-target). On module load, throws an exception (since execution is not possible). |
| No | Yes | This configuration should not be possible. This assumes the full compiler is not available, even if some work was done in preprocess the program cannot be finalized for execution. |
| Yes | No | In this case, the expectation would be for is_available() to return false, and compilation logic to move into preprocess. |
| Yes | Yes | All good. This is the only case that is_available() should return true. |
**On target**
| compile | execute | Outcome |
| -- | -- | -- |
| No | No | Loading the LoweredModule throws an exception. Since execution is not possible. |
| No | Yes | Basically this is another instance of Yes/Yes: compilation per se may not be possible on device, which means compile() can be called without issue but it is a no-op, and thus is_available should return true. Consequently, loading the LoweredModule: Succeeds, if the preprocessed module is ready for execution. Fails with exception otherwise. |
| Yes | No | This configuration should not be possible. Just putting here for completeness. |
| Yes | Yes | All good. This, along with No/Yes case (because compilation is assumed to have happened on-host, so it's just another instance of Yes/Yes), are the cases where is_available() should return true. |
**Refactoring existing code**
This change also updates other backends (Glow) code, to implement the is_available() method to have the same behaviour as before this change (i.e. always available).
This should not cause backward incompatibilities with already saved models since we're adding a new method to the PyTorchBackendInterface.
Models saved with the old interface that didn't have is_available() will still find the other 2 methods in the bound object (i.e. compile and execute), and the saved LoweredModule logic will be the old one.
**Future**
We plan to use is_available() to implement support for fallback to the PyTorch interpreter.
ghstack-source-id: 123498571
Test Plan: Added C++ (test_backend.cpp) and Python (test_backends.py) tests to validate the exceptions.
Reviewed By: jackm321, spaugh, iseeyuan
Differential Revision: D26615833
fbshipit-source-id: 562e8b11db25784348b5f86bbc4179aedf15e0d3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52870
Add the missing parts to support to_backend modules by lite interpreter.
1. Add ISINSTANCE instruction support, which is used in to_backend for output type check.
2. Bypass lite interpreter's type parser by checking the qualified name. If it starts with "torch.jit", use the same type resolver as nn module (starting with "__torch__").
Tests
Mobile module is serialized and loaded in ```BackendTest.TestCompiler```. The results are compared to those from original torchscript module.
Test Plan: Imported from OSS
Reviewed By: raziel
Differential Revision: D26715351
Pulled By: iseeyuan
fbshipit-source-id: ad9d74ee81c6aa692ab9e5dd7a9003bae5d4f01f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52603
This PR introduced a backend with minimum compilation capability to the to_<backend> flow. The targets are:
- Demonstrate the end-to-end flow with adding a backend -> compilation -> runtime
- How the backend compilation errors be surfaced to the user, with the original model's source code information. (C++ only in this PR. Python APIs will be demonstrated in a following PR.)
Changes:
- Compilation
1. A backend with minimum compilation features, "backend_with_compiler_demo" is added.
2. The compilation happens AOT in the ```pre_process``` function registered to this backend.
3. Compiled results are stored in a string blob for each method. They are serialized to the lowered module with ```__get_state__``` function.
4. Error message with model source code is thrown, for features not handled by the backend compiler.
- Runtime
1. The compiled blob is loaded in ```__set_state__``` method.
2. The ```compile``` function of the backend pass through the AOT compiled blob. (TODO: parsing the blob to the format that the backend can understand can happen here.)
3. The ```execute``` function of the backend executes the specified method (handle).
Test Plan:
- ```BackendTest.TestCompiler```: the C++ end-to-end demonstration on a supported model. After compilation and running, the lowered model produces the same result as the original torchscript model.
- ```BackendTest.TestCompilerNotSupport```: Demonstrate the error message from the AOT compilation for a feature not supported from the input module. The error message looks like:
```
"The node of aten::mul is not supported in this compiler. Source code: File "<string>", line 3
def forward(self, x, h):
return x * h
~~~~~ <--- HERE
```
Reviewed By: raziel
Differential Revision: D26593968
Pulled By: iseeyuan
fbshipit-source-id: 8f264f60a0470e9f07e36fdeccbf17da6c1d7cd7
Summary:
This is a re-land off https://github.com/pytorch/pytorch/pull/51797 with fix for spurious libcuda dependency
Fix limits the scope of `no-as-needed` linker flag to just `jitbackend_test`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52340
Reviewed By: agolynski, iseeyuan
Differential Revision: D26476168
Pulled By: malfet
fbshipit-source-id: f909428af82182b3bffd020ca18cca7a9b5846b6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51797
The C++ API, ```codegen_backend_module``` is added to ```to_<backend>```. Python related stuffs are decoupled in this function. It can be used from both C++ and python.
* Tests
Python: The existing ```test_backends.py```, which calls the C++ API under the hood.
C++: The end-to-end test of ```jit.BackendTest.ToBackend``` is added in ```test_backend.cpp```. The original class definitions in this file is moved to ```test_backend_lib.cpp```
ghstack-source-id: 121687464
(Note: this ignores all push blocking failures!)
Test Plan: CI
Reviewed By: raziel
Differential Revision: D26280518
fbshipit-source-id: fd466e4b448847ce64010a3297fff0b5760c5280
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51757
Enables backend preprocessing to take place outside of the backend interface.
What's new:
* A new definition for backend preprocessing (i.e. BackendPreprocessFunction).
* Registration of the backend's PyTorchBackendInterface interface implementation is augmented to take the BackendPreprocessFunction.
* A new registry is created to handle the BackendPreprocessFunction functions, using the backend's name as key.
* When a BackendPreprocessFunction is used, the PyTorchBackendInterface's "preprocess" method is not added to the LoweredModule. Instead, the BackendPreprocessFunction is called and its output used to set the LoweredModule's __processed_module.
Why?:
These changes are needed to avoid forcing backend preprocessing to be part of the LoweredModule, and in the future be able to eliminate "preprocess" from the PyTorchBackendInterface.
This is important for Mobile use cases where "preprocess" can take the bulk of the compilation process, and thus contain code dependencies that we do not want to bring (or cannot bring) to the Mobile binary.
What didn't change:
* Everything is backwards compatible:
** The existing "preprocess" method in PyTorchBackendInterface is still there.
** When backend registration is done without the BackendPreprocessFunction, as before, things work the same way: "preprocess" is added to LoweredModule, and invoked through the module's instance of the backend interface.
Longer term, the plan is to refactor existing users to move to the new backend registration.
ghstack-source-id: 121190883
Test Plan:
Updated existing tests (test_backend.py) to use the new registration mechanism.
Verified test ran and passed (in my OSS build).
Reviewed By: iseeyuan
Differential Revision: D26261042
fbshipit-source-id: 0dc378acd5f2ab60fcdc01f7373616d1db961e61
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41145
**Summary**
This commit adds out-of-source-tree tests for `to_backend`. These tests check
that a Module can be lowered to a backend, exported, loaded (in both
Python and C++) and executed.
**Fixes**
This commit fixes#40067.
Test Plan: Imported from OSS
Reviewed By: jamesr66a
Differential Revision: D22510076
Pulled By: SplitInfinity
fbshipit-source-id: f65964ef3092a095740f06636ed5b1eb0884492d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40842
**Summary**
This commit adds out-of-source-tree tests for `to_backend`. These tests check
that a Module can be lowered to a backend, exported, loaded (in both
Python and C++) and executed.
**Fixes**
This commit fixes#40067.
Test Plan: Imported from OSS
Differential Revision: D22418731
Pulled By: SplitInfinity
fbshipit-source-id: 621ba4efc1b121fa76c9c7ca377792ac7440d250
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40840
**Summary**
This commit moves the TestBackend used for the JIT backend
extension to the tests directory. It was temporarily placed
in the source directory while figuring out some details of
the user experience for this feature.
**Test Plan**
`python test/test_jit.py TestBackends`
**Fixes**
This commit fixes#40067.
Test Plan: Imported from OSS
Differential Revision: D22418682
Pulled By: SplitInfinity
fbshipit-source-id: 9356af1341ec4d552a41c2a8929b327bc8b56057