Commit Graph

73 Commits

Author SHA1 Message Date
ccd9f3244b Get, save, and load module information for each operator (#42133)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/42133

Test Plan:
We save a module with module debugging information as follows.
```
import torch
m = torch.jit.load('./detect.pt')
# Save module without debug info
m._save_for_lite_interpreter('./detect.bc')
# Save module with debug info
m._save_for_lite_interpreter('./detect.bc', _save_debug_info_in_bytecode=True)
```
Size of the file without module debugging information: 4.508 MB
Size of the file with module debugging information: 4.512 MB

Reviewed By: kimishpatel

Differential Revision: D22803740

Pulled By: taivu1998

fbshipit-source-id: c82ea62498fde36a1cfc5b073e2cea510d3b7edb
2020-08-14 01:25:27 -07:00
33e26656fa list workaround for CREATE_OBJECT failure (#41129)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/41129

Test Plan: Imported from OSS

Differential Revision: D22436064

Pulled By: ann-ss

fbshipit-source-id: 7cfc38eb953410edfe3d21346c6e377c3b3bfc1f
2020-07-08 18:36:04 -07:00
53af9df557 Unify boxed function signature between jit and c10 (#37034)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37034

c10 takes a Stack* in boxed functions while JIT took Stack&.
c10 doesn't return anything while JIT returns an int which is always zero.

This changes JIT to follow the c10 behavior.
ghstack-source-id: 106834069

Test Plan: unit tests

Differential Revision: D20567950

fbshipit-source-id: 1a7aea291023afc52ae706957e9a5ca576fbb53b
2020-06-29 19:24:26 -07:00
4e976b9334 Remove callBoxedWorkaround (#36850)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36850

Since now all unboxing happens after dispatch, which means that all c10 ops support unboxing, we can now use op.callBoxed() for all ops and don't need callBoxedWorkaround (which was going through the JIT registry) anymore.
ghstack-source-id: 102879558

Test Plan: waitforsandcastle

Differential Revision: D21102375

fbshipit-source-id: d1e041116563a9650d5a86b07eb96d217d8756f3
2020-04-24 23:13:31 -07:00
3880f14b64 Canonicalize includes in torch, and add tests for it (#36303)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/36303

Test Plan: Imported from OSS

Differential Revision: D20943003

Pulled By: ezyang

fbshipit-source-id: 81fcbaccc1a7eec422bd8347d196bb66a5467884
2020-04-23 08:09:21 -07:00
a894fff265 Back out "Revert D21089648: Put TORCH_LIBRARY in torch/library.h; add custom class API"
Summary: Original commit changeset: 636e8a11afc6

Test Plan: export to OSS

Reviewed By: malfet

Differential Revision: D21170502

fbshipit-source-id: e8f35f103c4924aedbcaaf868475008d24bdeeab
2020-04-22 09:18:23 -07:00
2ccdc39dce Revert D21089648: Put TORCH_LIBRARY in torch/library.h; add custom class API
Test Plan: revert-hammer

Differential Revision:
D21089648

Original commit changeset: 8d54329c1252

fbshipit-source-id: 636e8a11afc628a4cdae9d44824985c10c70555e
2020-04-21 12:21:45 -07:00
01100cb477 Put TORCH_LIBRARY in torch/library.h; add custom class API (#36742)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36742

Now, you can define a custom class inside a TORCH_LIBRARY block.
It looks very similar to what you did before.  Instead of

```
static auto m = torch::class_<Class>("Namespace", "Class").def("foo", foo);
```

you write

```
TORCH_LIBRARY(Namespace, m) {
  m.class_<Class>("Class")
    .def("foo", foo);
}
```

All the old usages still work, but at some point we should start
updating the tutorials when we're ready to go 100% live with the
new pybind11 style API.

custom class API previously lived in torch/ folder and in torch
namespace, so for consistency, the new TORCH_LIBRARY also got
moved to torch/library.h The definition of Library::class_ is in the
bottom of that header because I need all of the class_ constructors
available, but there is a circular dependency between the two headers.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Differential Revision: D21089648

Test Plan: Imported from OSS

Pulled By: ezyang

fbshipit-source-id: 8d54329c125242605336c22fa1642aae6940b507
2020-04-21 10:05:21 -07:00
a91097bdfb Revert D20964368: Revert D20408831: [Lite Interpreter] Operator registration migrate from manual to selective build
Test Plan: revert-hammer

Differential Revision:
D20964368

Original commit changeset: f1874088a597

fbshipit-source-id: d9317ed97a98e2b04c190785b5564536b1096282
2020-04-10 08:19:36 -07:00
586481a6e2 Revert D20408831: [Lite Interpreter] Operator registration migrate from manual to selective build
Test Plan: revert-hammer

Differential Revision:
D20408831

Original commit changeset: ec75dd762c46

fbshipit-source-id: f1874088a5970dd220cc027d0020ab6223b9bd93
2020-04-10 08:03:38 -07:00
7fcf8b0a3b [Lite Interpreter] Operator registration migrate from manual to selective build (#35426)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35426

Use selective build with the full set of operators (vs. manually register each used op with "_" prefix).

Lite interpreter relies on JIT operator dispatch. In future we still need JIT operator dispatch dispatch ops that are not registered in c10.
Currently the selective build is for c10/aten dispatch in BUCK. There is JIT selective code-gen in OSS but not ported to BUCK yet.
This diff is also porting the selective code-gen in BUCK.
* The selected op list is passed to gen_jit_dispatch.py.
* The list passed to gen_jit_dispatch is the top-level ops (USED_PT_OPS) only, because the selective c10/aten dispatch already registered other ops that are called from the top-level ops.

ghstack-source-id: 101885215

(Note: this ignores all push blocking failures!)

Test Plan:
1. In Python, run torch.jit.export_opnames(scripted_M_mod)
2. Append the operator names into fbcode/caffe2/pt_ops.bzl and the BUCK target.
3. Run
```
buck run xplat/caffe2/fb/lite_predictor:lite_predictor_bi -- --model=/home/myuan/temp/bi_pytext_0315.bc --input_dims "1,4" --input_type int64 --pytext_len=4
```
Should provide expected results.
In addition, the size of the generated code for JIT registration, for example, ```register_aten_ops_0.cpp```, should be significantly reduced (from ~250 KB to ~80KB). The non-selected op registration schema are still kept, but the registration functor is replaced by ```DUMMY_OPERATION```

Reviewed By: ljk53

Differential Revision: D20408831

fbshipit-source-id: ec75dd762c4613aeda3b2094f5dad11804dc9492
2020-04-10 02:31:32 -07:00
361eed6a6e Use JIT op registration directly for lite interpreter. (#34070)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34070

The first step to make all operators available for lite interpreter. The original code used manual registration for lite interpreter ops with a "_" prefix, for two reasons:
1. To minimize the build size.
2. To avoid duplicate registration in OSS (majorly feature testing and unit tests).

Now since we have more and more models to support, the manual registration way is not practical. To make this process automatic while keeping the binary size under control, we plan to:
1. Make all necessary ops callable from lite interpreter.
2. The binary size would be increased because of step 1. Use ljk53 's custom build to selectively build the binary with ops used in specific models. The ops will be automatically collected using get_opnames.
3. The temporary "register_mobile_ops.cpp" can be removed.

Test Plan: Imported from OSS

Differential Revision: D20291596

Pulled By: iseeyuan

fbshipit-source-id: 553b4699619cd71fea20658f3bc8c2d48852ef5c
2020-03-25 07:21:51 -07:00
ab76a8206f [JIT][mobile] Support built-in Function call in lite interpreter (#34676)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34676

Test Plan: Imported from OSS

Differential Revision: D20427938

Pulled By: jamesr66a

fbshipit-source-id: 79eebfa858776f26da55ffd49d3f78fa7ae0df9b
2020-03-13 18:24:18 -07:00
02478984d6 Add support to dump unsupported ops. Add lite_interpter_load test. (#34278)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34278

This diff helps check all the ops not supported by lite_interpreter.
Helpful mainly to find all the ops that need to be added instead of adding them
one by one.

Test Plan:
buck run caffe2/binaries:lite_interpreter_model_load --
--model=<bytecode-model-path>

Reviewed By: iseeyuan

Differential Revision: D20266341

fbshipit-source-id: 5a6c7a5bc52f910cea82a72045870da8105ccb87
2020-03-05 18:31:31 -08:00
d59e036f4d Revert D20194092: Add support to dump unsupported ops. Add lite_interpter_load test.
Test Plan: revert-hammer

Differential Revision:
D20194092

Original commit changeset: 0d596cd02043

fbshipit-source-id: 17b4bae27543f231bd6c12d90368d399ca55ebdf
2020-03-04 13:53:58 -08:00
17a5c67796 Add support to dump unsupported ops. Add lite_interpter_load test. (#34072)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34072

This diff helps check all the ops not supported by lite_interpreter.
Helpful mainly to find all the ops that need to be added instead of adding them
one by one.

Test Plan:
buck run caffe2/binaries:lite_interpreter_model_load --
--model=<bytecode-model-path>

Reviewed By: iseeyuan

Differential Revision: D20194092

fbshipit-source-id: 0d596cd0204308027194af7ed738551d0c32a374
2020-03-04 13:18:12 -08:00
dbe850af5b [jit] do the code reorg (#33851)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33851

Rationale and context described in #33828.

Script to reproduce the move:
https://gist.github.com/suo/16cbefaaeb67ca5a7c6caffd49b7f6e9
ghstack-source-id: 99079645

Test Plan: Make sure CI passes

Reviewed By: jamesr66a

Differential Revision: D20133869

fbshipit-source-id: 390e9241a9c85366d9005c492ac31f10aa96488e
2020-02-27 13:02:51 -08:00
f1b73799d5 Clean up isinstance flags (#33265)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33265

This removes the need for isinstance to keep trace of list and tuple
separately by introducing AnyListType and AnyTupleType into the JIT
type system to be the common supertype of any lists or tuples.

This allows us to remove the weird flags from the interpreter for
the isinstance operator.

Test Plan: Imported from OSS

Differential Revision: D19883933

Pulled By: zdevito

fbshipit-source-id: f998041b42d8b4554c5b99f4d95d1d42553c4d81
2020-02-18 15:07:06 -08:00
7f2c25b6fa Move special ops into interpreter (#32889)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32889

Common primitive ops that have special inputs make it very hard to
serialize the bytecode for mobile because information about how the
op behaves is hidden in the Node*. This changes how we handle the following
ops so that they are encoded as their own interpreter bytecodes.

```
    USES NODE: prim::TupleUnpack(...) -> (...)
    USES NODE: prim::TupleSlice(...) -> (...)
    USES NODE: prim::TupleConstruct(...) -> (...)
    USES NODE: prim::ListUnpack(...) -> (...)
    USES NODE: prim::ListConstruct(...) -> (...)
    USES NODE: prim::DictConstruct(...) -> (...)
    USES NODE: prim::Constant() -> (...)
    USES NODE: prim::isinstance(...) -> (...)
    USES NODE: prim::CreateObject(...) -> (...)
    USES NODE: prim::fork(...) -> (...)
    USES NODE: aten::warn(str message, *, int stacklevel=2) -> () # need stack level information, so ideally in interpreter so it can look at the stack
```

This leaves a state where the _only_ remaining Node*-consuming builtins
are things that are only introduced during JIT optimization and will
not appear in mobile code.

Serialization of bytecode can now be made to directly write the CodeImpl
object without modification.

Test Plan: Imported from OSS

Differential Revision: D19673157

Pulled By: zdevito

fbshipit-source-id: 7b8c633d38a4c783b250fbdb222705e71a83ad26
2020-02-18 15:07:01 -08:00
f362cd510d Move prim ops from JIT registration to C10 (#30612)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30612

The first version to move prim ops to c10 registration. After the reviewers are fine with the initial changes, more operators will be moved in the same style.

Test Plan: Imported from OSS

Differential Revision: D19237648

Pulled By: iseeyuan

fbshipit-source-id: c5a519604efffb80564a556536f17d829f71d9f9
2020-01-04 13:47:44 -08:00
3003c5f91b OPN ops TupleConstruct/Unpack and format. (#29635)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29635

TupleConstruct/Unpack as OPN ops.

Test Plan: Imported from OSS

Differential Revision: D18499602

fbshipit-source-id: 389b21d3ea532ef6fa729d67ce34214d86700cd2
2019-11-15 16:22:42 -08:00
19ab5381c3 Add OPN instruction and vararg operator table (#27104)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27104

* The use case here is to replace prim::ListConstruct, which requires Node, but Node is not available in mobile lite interpreter.
* (OPN, X, N), X is the index to the vararg operator-name and operator tables. N is number of inputs. For ListConstruct example, operator name can be "aten::listconstruct" and the overloaded name is the output type ("int", "float", "bool", "tensor" and "generic").
* A vararg operator table is built with void(int input_size, Stack& stack) functions.
## Unit test
LiteInterpreterConv covers OPN instruction and conv operator.

Test Plan: Imported from OSS

Differential Revision: D17762853

fbshipit-source-id: 475aa0c6678e3760cec805862a78510913a89c83
2019-10-04 09:35:53 -07:00
7fc06ea541 Bytecode export flow (#25187)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25187

The bytecode export flow: dump the bytecode format for the light weighted interpreter.
* The bytecode is generated without input spec optimization. It would be more generic (input independent) with no obvious performance degradation (to be tested).
* Main API: torch::jit::script::Module::save(filename, extra_files, bool *bytecode_format* = false).
* Both bytecode and module object are exported in pickle format.
    * The module object (in data.pkl) is the same as the original JIT model.
    * The serializer is dependent on pickle only (no protobuf or Json).
    * The major functionality is forked in ScriptModuleSerializer2::serialize().
    * The test loader is test_bc_export.cpp.
* Simple APIs are added in Code and its implementation to get necessary information (instructions, operators and constants).
* Since there's no dependency on graph/node, GetAttr is promoted from an operator to first-class instruction (https://github.com/pytorch/pytorch/pull/25151) .
* Some definitions (instructions, writeArchive, etc) that are shared by full JIT and bytecode are pulled out of the local namespace (https://github.com/pytorch/pytorch/pull/25148).

The output layout looks like:

* folders of methods.
    * In each method folder (for example, forward/):
        * bytecode.pkl: instructions and operators
        * constants{.pkl,/}: constant list in constants.pkl. If there are tensors in constants, the binary tensor files in constants/ folder.
* data{.pkl,/}: the module object, with binary tensor files in data/ folder. The same as in torchscript.

Test Plan: Imported from OSS

Differential Revision: D17076411

fbshipit-source-id: 46eb298e7320d1e585b0101effc0fcfd09219046
2019-09-25 16:35:45 -07:00