Commit Graph

134 Commits

Author SHA1 Message Date
c25e33789e Lightweight at-most-once logging for API usage (#20745)
Summary:
Resubmit #20698 which got messed up.

Idea is that when PyTorch is used in a custom build environment (e.g. Facebook), it's useful to track usage of various APIs centrally. This PR introduces a simple very lightweight mechanism to do so - only first invocation of a trigger point would be logged. This is significantly more lightweight than #18235 and thus we can allow to put logging in e.g. TensorImpl.

Also adds an initial list of trigger points. Trigger points are added in such a way that no static initialization triggers them, i.e. just linking with libtorch.so will not cause any logging. Further suggestions of what to log are welcomed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20745

Differential Revision: D15429196

Pulled By: dzhulgakov

fbshipit-source-id: a5e41a709a65b7ebccc6b95f93854e583cf20aca
2019-05-23 23:17:59 -07:00
9b1dbffba5 Re-sync with internal repository (#20702) 2019-05-20 09:22:57 -04:00
d3059b9c49 Lightweight logging for once-only API usage 2019-05-19 23:04:40 -07:00
cd28ff5395 Add support for __getstate__/__setstate__ on module (#20242)
Summary:
Adds support for `__getstate__` and `__setstate__` on modules that are called as part of export (`torch.save()`) and import (`torch.jit.load`).
* `__getstate__` and `__setstate__` must be TorchScript functions with the signatures `() -> T` and `(T) -> None` respectively
* The results of `__getstate__` are stored using the pickler in `states.pkl` with one for each module in definition order (`__getstate__` returns `None` by default if an imlpementation is not provided)
    * This prevents sharing between `__getstate__` and attributes, but this should be fine since their use is mostly unrelated (attributes are for storing values to be used in script methods, `__getstate__` for running arbitrary computations during import)

Follow up
* Somehow replacing `__getstate__`/`__setstate__` with a `ScriptMethodStub` makes `MyScriptModule().__getstate__()` call `ScriptModule.__getstate__()` when used in Python. This should be fixed so semantics in Python are preserved, but it doesn't affect the typical usage.
](https://our.intern.facebook.com/intern/diff/15287161/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20242

Pulled By: driazati

Differential Revision: D15287161

fbshipit-source-id: b3f5f33ab74a21a89e6d15460af63aff75cab2d8
2019-05-17 14:43:14 -07:00
3afd99680c Remove SourceLocation (respin) (#20333)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20333
ghimport-source-id: e64075bb82067224463e9955d10bd13967d1975d

Differential Revision: D15284081

Pulled By: zdevito

fbshipit-source-id: ac26ae48392b9daff08f460529c06af8f4e4722a
2019-05-09 16:17:33 -07:00
e870b11ae6 Revert D15275731: Remote SourceLocation
Differential Revision:
D15275731

Original commit changeset: f4da178c3137

fbshipit-source-id: 830b79735eb2dadc4795b5aae407826bf20ef121
2019-05-09 13:07:11 -07:00
eca91de5d2 Remote SourceLocation (#20300)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20300
ghimport-source-id: 06f606c4db3b70b1d2ed9f6ed4542c3f703c4e17

Differential Revision: D15275731

Pulled By: zdevito

fbshipit-source-id: f4da178c31372c2264feb9f99476b9c9aa66c1f2
2019-05-09 11:48:29 -07:00
1102f4c56e Bump op_version_set (#19812)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19812
ghimport-source-id: 7cdb24c6a7501c6ec5f0eae07325512746f5abb9

Differential Revision: D15102803

Pulled By: zdevito

fbshipit-source-id: bedf0bd6e1170fa294c65c87df75b82d8694f89c
2019-05-08 23:21:22 -07:00
8ebb86dd3a Support torch.save for saving values during execution (#18154)
Summary:
This PR makes `torch.save` call out to the pickler which saves a tensor in the same format that `torch.save()` does, the file looks like `| pickle archive 1 (includes sizes, strides, requires_grad, etc...) | pickle archive 2 (list of tensor keys) | tensor binary data |` and can be read back in with `torch.load(my_file, pickle_module=torch.jit._pickle)`

Fixes #18003

Unpickling in the JIT for things such as model parallelism will be a follow up PR
](https://our.intern.facebook.com/intern/diff/15015160/)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/18154

Pulled By: driazati

Differential Revision: D15015160

fbshipit-source-id: ef76a44b8c243f4794cd7e245ec8305e965bc59f
2019-05-08 16:52:53 -07:00
8b46938355 Cleanup includes in torch/csrc/jit/* (#19922)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19922
ghimport-source-id: 0434c46bf75621ff79ea27a18a2475e7f13e2487

Differential Revision: D15125015

Pulled By: ZolotukhinM

fbshipit-source-id: 5685edfc94067f62e363a85e9badb7f757b1d321
2019-05-06 13:40:26 -07:00
a25b79531c use fully qualified name for ScriptClasses (#19239)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19239
ghimport-source-id: 830aad6dc11d2a7247760a9c7c9fc8556f70a706

Differential Revision: D14928293

Reviewed By: eellison

Pulled By: suo

fbshipit-source-id: d2efa5d7f7397526083278d6650b9cee8d967b1a
2019-04-26 19:17:21 -07:00
330990d878 Serialize first-class version of functions (#19723)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19723
ghimport-source-id: 7f7ec6200c3b42d19046a3e228a3d82212697f14

Reviewed By: jamesr66a

Differential Revision: D15078533

Pulled By: zdevito

fbshipit-source-id: fe421afab9607ee942f6d200f04bb6335fc0aa97
2019-04-25 15:53:07 -07:00
9983c24cfc Strip doc_string from exported ONNX models (#18882)
Summary:
Strip the doc_string by default from the exported ONNX models (this string has the stack trace and information about the local repos and folders, which can be confidential).

The users can still generate the doc_string by specifying add_doc_string=True in torch.onnx.export().
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18882

Differential Revision: D14889684

Pulled By: houseroad

fbshipit-source-id: 26d2c23c8dc3f484544aa854b507ada429adb9b8
2019-04-18 22:30:00 -07:00
443a58e03d Export C10 operator in PyTorch Model (#18210)
Summary:
Almost there, feel free to review.

these c10 operators are exported to _caffe2 domain.

TODO:

- [x] let the onnx checker pass
- [x] test tensor list as argument
- [x] test caffe2 backend and converter
- [x] check the c10 schema can be exported to onnx
- [x] refactor the test case to share some code
- [x] fix the problem in ONNX_ATEN_FALLBACK
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18210

Reviewed By: zrphercule

Differential Revision: D14600916

Pulled By: houseroad

fbshipit-source-id: 2592a75f21098fb6ceb38c5d00ee40e9e01cd144
2019-04-08 16:06:00 -07:00
13f03a42d2 Create Object that represents a Module (#18469)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18469
ghimport-source-id: 73cb8b58f43f10b1dcfca805fd5b25c4fa977632

Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18469 Create Object that represents a Module**
* #18468 slots with explicit value/setValue make more sense in future patches
* #18467 Make Object hold its ClassType
* #18379 Enforce single parent for script submodules
* #18378 Unify namespace of script::Module
* #18314 Add ability to specialize class types to ArgumentSpec
* #18226 Add Slot type to abstract the raw pointers being used for slots.

This changes the underlying storage for script::Module to hold
a ivalue::Object which has slots for all the parameters and attributes.

NamedIValue and Slot are now merged together into one class Slot that stores
the tuple (ivalue::Object, offset) and can be used to read the name, type,
or value of the slot and also to set the value. This cleans up a bunch
of client uses.

This PR does not actually use the module object in any generated code.
A future PR will switch how code is generated to treat modules as
first class.

Differential Revision: D14613508

fbshipit-source-id: d853a7559f58d244de2ef54a781427fcd1060ed0
2019-04-05 18:58:52 -07:00
f6f34b3f4c slots with explicit value/setValue make more sense in future patches (#18468)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18468
ghimport-source-id: d4b41c521f2269a695e03c8e7d05d5542731ee48

Stack from [ghstack](https://github.com/ezyang/ghstack):
* #18469 Create Object that represents a Module
* **#18468 slots with explicit value/setValue make more sense in future patches**
* #18467 Make Object hold its ClassType
* #18379 Enforce single parent for script submodules
* #18378 Unify namespace of script::Module
* #18314 Add ability to specialize class types to ArgumentSpec
* #18226 Add Slot type to abstract the raw pointers being used for slots.

Reviewed By: suo

Differential Revision: D14613509

fbshipit-source-id: 9f2208d0efd01465c78cebdc3e8365a9e0adf9ff
2019-04-05 13:41:02 -07:00
53458c97dd Enforce single parent for script submodules (#18379) (#18860)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18860
ghimport-source-id: 96305349bf3db564f43df2263b1e5bddcc9e9dae

Reviewed By: suo

Differential Revision: D14780421

Pulled By: zdevito

fbshipit-source-id: 2bdd89b35866ba035ebea0adab037e441c1006e2
2019-04-05 13:40:56 -07:00
f97eb8d9e4 Revert D14603722: Enforce single parent for script submodules
Differential Revision:
D14603722

Original commit changeset: 63ab5d0cccf7

fbshipit-source-id: 2c4174def102eda4589e08c4dbd67ce8af975199
2019-04-04 10:32:36 -07:00
7e59c60454 Enforce single parent for script submodules (#18379)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18379
ghimport-source-id: 9895ecc1ff7897e98853dc00675341f36726e7c7

Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18379 Enforce single parent for script submodules**
* #18378 Unify namespace of script::Module
* #18314 Add ability to specialize class types to ArgumentSpec
* #18226 Add Slot type to abstract the raw pointers being used for slots.

The assumption that a ScriptModule has a single parent is present in
our serialization format, and likely a few other places. It is not
enforced on creation of script module hierarchies though, meaning that
problems associated with (e.g. replicating a module twice in the output
format) will not be caught until much later in the development cycle.

This patch enforces the property when a submodule is registered.
It also removes NamedModule since it is no longer necessary in this regime.
This will also allow the easy discover of a modules fully-qualified name
without needing to traverse the Module hierarchy.

Differential Revision: D14603722

fbshipit-source-id: 63ab5d0cccf7d66c7833e0adf9023024ca9607cb
2019-04-03 20:26:58 -07:00
0512e4e323 Unify namespace of script::Module (#18378)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18378
ghimport-source-id: 55c29bb436a2153d29ff2f4488d99d8863c187b1

Stack from [ghstack](https://github.com/ezyang/ghstack):
* #18379 Enforce single parent for script submodules
* **#18378 Unify namespace of script::Module**
* #18314 Add ability to specialize class types to ArgumentSpec
* #18226 Add Slot type to abstract the raw pointers being used for slots.

This removes individual OrderedDicts in favor of a single unified
namespace for all things in a script::Module. This removes a whole
class of bugs where both a method and an parameter could get the
same name, for instance.

Since we no longer have to expose OrderedDict::Item objects, a lot of
downstream code can be simplified.

We no longer now double-store names (both in the key of the dictionary,
and in the object itself).

Differential Revision: D14603723

fbshipit-source-id: b5f7551b3074679623edd6ea70269830353b4d4c
2019-04-03 16:04:17 -07:00
3b71f2e1f2 Pin onnx ir_version to 4 (#18768)
Summary:
to make test_operators.py more stable. in future, we will bump this up manually, and I think it's acceptable, since ir_version should be bumped too often.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18768

Reviewed By: zrphercule

Differential Revision: D14741514

Pulled By: houseroad

fbshipit-source-id: 0369dbc55424e345a113e49fc104a441ea290d58
2019-04-03 13:16:59 -07:00
1240327c5c Refactoring serialization of ONNX initializers to be name-based (Resubmission) (#17830)
Summary:
houseroad - this is the resubmission of https://github.com/pytorch/pytorch/pull/17420, as suggested.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17830

Reviewed By: zrphercule

Differential Revision: D14398714

Pulled By: houseroad

fbshipit-source-id: bda475f1ae8a5273ebdb0f6883fc66036c29d326
2019-03-29 15:23:29 -07:00
aa20591baa Add Slot type to abstract the raw pointers being used for slots. (#18226)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18226
ghimport-source-id: b9ec8651212875b30971cc6859d2ddec6559ae3a

If modules become first-class IValues, then the slots will no longer be raw pointers but (IValue, index) pairs. This commit inserts the Slot abstraction so that this change can be made in later patches.

Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18226 Add Slot type to abstract the raw pointers being used for slots.**

Differential Revision: D14542022

fbshipit-source-id: b81d7f4334c983d663e7551bda82df43680d7c5f
2019-03-28 10:35:36 -07:00
3d44305e9d Attribute serialization (#17423)
Summary:
Allows serialization/loading of attributes (`IValue`s of any type).
* metadata (attribute name, type) is stored in the `model.json`
* The binary format is a subset of the `pickle` module that supports the operations necessary for `IValue`s
    * Attributes are serialized in the order they are defined on a module to a list in a single `attributes` file, with submodule attributes coming first. This order directly matches the order attributes are listed in `model.json`
    * This can be inspected in Python with `pickle.load()` or with `pickletools` (PyTorch need not be installed for this to work)
        * A class is used to store a tensor's index into the tensor table of the model, so to unpickle the file you have to use a custom Unpickler:
        ```python
        class TensorID(object):
            def __setstate__(self, id):
                self.id = id

        class JitUnpickler(pickle.Unpickler):
            def find_class(self, module, name):
                if module == '__main__' and name == 'TensorID':
                    return TensorID

        JitUnpickler(open("my_model/attributes.pkl", "rb")).load()
        ```
    * pickle format: https://svn.python.org/projects/python/trunk/Lib/pickletools.py
* It currently does not support/guarantee that anything saved out with `pickle` (i.e. if you edit `attributes` with `pickle` directly) instead of our tools will be imported correctly

Also will fix #17683 and fix #16367

Followup Work:
* document format / choice of pickle: #17951
* create an example
* list specializations
* int size specializations, large binputs
* do a first pass over attributes to output only necessary `BINPUT` ops
* attribute reassignment (e.g `self.my_attribute = new_value`)
* `tensor.save("some_checkpoint.pkl")` support with tensors embedded in Pickle file
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17423

Differential Revision: D14470965

Pulled By: driazati

fbshipit-source-id: 6a21a9939efdbe59b4bc57fd31d6d630bab5297e
2019-03-18 18:18:22 -07:00
80a7eac79e Remove Type::elementSizeInBytes
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17785

Reviewed By: ezyang

Differential Revision: D14379074

fbshipit-source-id: 60727f187d61eb571b144bd6eed4dd4908da0b51
2019-03-15 12:56:02 -07:00
18f721fb9a support serialization of classes (#17856)
Summary:
Stack:
      **#17856 [jit] support serialization of classes**  [💛](https://our.intern.facebook.com/intern/diff/D14402599/)

Add support for saving/loading TorchScript modules that depend on user-defned classes.

We track class dependencies the same we track tensor constants, then write them
all out such that we can just compile them in order before compiling the module
hierarchy.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17856

Reviewed By: shannonzhu

Differential Revision: D14461599

Pulled By: suo

fbshipit-source-id: 7115f87e069fd00dc8381d7de9997864fef7ea9f
2019-03-15 12:06:23 -07:00
cc07f968f8 Revert D14361993: [pytorch][PR] [Onnx] - refactoring serialization of ONNX initializers to be name-based
Differential Revision:
D14361993

Original commit changeset: da93e945d557

fbshipit-source-id: 15eea001fbcd059ac13903405aeb9ea182c6ee8b
2019-03-08 16:31:14 -08:00
7aae51cded Replace tensor.type().scalarType() calls with tensor.scalar_type()
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17515

Reviewed By: ezyang

Differential Revision: D14233250

fbshipit-source-id: 6c7af8d2291c0c2b148001b30cf03834f34366c0
2019-03-08 14:08:18 -08:00
a2381fa346 Add module attributes (#17309)
Summary:
Similar to `nn.Parameter`s, this PR lets you store any `IValue` on a module as an attribute on a `ScriptModule` (only from the Python front-end currently). To mark something as an attribute, it should wrapped in `jit.Attribute(value, type)` (ex. `self.table = torch.jit.Attribute(table, Dict[str, torch.Tensor])`)

Followup Work:
* (de)serializing for use in C++
* change `self.training` to be a `bool` attribute instead of a buffer
* mutable attributes
* string frontend support
* documentation
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17309

Differential Revision: D14354316

Pulled By: driazati

fbshipit-source-id: 67e08ab5229366b67fbc837e67b58831a4fb3318
2019-03-07 10:44:10 -08:00
e4c9d75008 - refactoring serialization of ONNX initializers to be name-based (#17420)
Summary:
Currently, serialization of model parameters in ONNX export depends on the order in which they are stored in a container (`list` on Python side and `std::vector` on C++ side). This has worked fine till now, but if we need to do any pass on that graph that mutates the parameter list, then strictly order-based serialization may not work.

This PR is the first in a set to bring in more passes (such as constant folding) related to ONNX export. This PR lays the groundwork by moving the serialization in ONNX export from order-based to name based approach, which is more amenable to some of the passes.

houseroad - As discussed this change uses a map for export, and removes the code from `export.cpp` that relies on the order to compute initializer names.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17420

Differential Revision: D14361993

Pulled By: houseroad

fbshipit-source-id: da93e945d55755c126de06641f35df87d1648cc4
2019-03-07 10:25:00 -08:00
e422b27f17 Bump up the producer version in ONNX exporter
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17410

Reviewed By: zrphercule

Differential Revision: D14187821

Pulled By: houseroad

fbshipit-source-id: a8c1d2f7b6ef63e7e92cba638e90922ef98b8702
2019-02-22 14:28:59 -08:00
82aa511146 move prim::None to prim::Constant (again) (#17186)
Summary:
Trying to land again, make prim::None into a case of prim::Constant. Reverted the previous landing because it broke an important onnx export test.

https://github.com/pytorch/pytorch/pull/16160
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17186

Differential Revision: D14115304

Pulled By: eellison

fbshipit-source-id: 161435fc30460b4e116cdd62c7b2e5b94581dcb7
2019-02-19 11:45:50 -08:00
91c1d728ac Revert D14109636: [pytorch][PR] move prim::None to a case in prim::Constant
Differential Revision:
D14109636

Original commit changeset: d26fd3839761

fbshipit-source-id: c8c8113e2bff49ea93235732603e6ebc89356533
2019-02-15 16:38:12 -08:00
7caa21f5ca move prim::None to a case in prim::Constant (#16160)
Summary:
This change simplifies analysis done on constants since prim::None does not need to be handled separately now.  To check if a constant node is None, use node->isNone().

Next step will be to remove prim::Undefined.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16160

Differential Revision: D14109636

Pulled By: eellison

fbshipit-source-id: d26fd383976163a2ddd4c24984bd672a541cc876
2019-02-15 16:27:57 -08:00
c3f5ba9460 PyTorch model metadata. (#16275)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16275

Adding a generic string `metadata` field as part of the model to capture additional metadata with the model.

Reviewed By: dzhulgakov

Differential Revision: D13579029

fbshipit-source-id: 7456ef2edbe73bb70bbb31889cecd94e0db329a2
2019-02-13 19:48:11 -08:00
ac00e85e36 Remove undefined tensor in jit script (#16379)
Summary:
This PR is a follow up of #15460, it did the following things:

* remove the undefined tensor semantic in jit script/tracing mode
* change ATen/JIT schema for at::index and other index related ops with `Tensor?[]` to align with what at::index is really doing and to adopt `optional[tensor]` in JIT
* change python_print to correctly print the exported script
* register both TensorList and ListOfOptionalTensor in JIT ATen ops to support both
* Backward compatibility for `torch.jit.annotate(Tensor, None)`

List of follow ups:

* remove the undefined tensor semantic in jit autograd, autodiff and grad_of
* remove prim::Undefined fully

For easy reviews, please turn on `hide white space changes` in diff settings.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16379

Differential Revision: D13855677

Pulled By: wanchaol

fbshipit-source-id: 0e21c14d7de250c62731227c81bfbfb7b7da20ab
2019-02-07 11:02:14 -08:00
1905bbb01d Include ATen/core/functional.h directly instead of torch/csrc/utils/functional.h. (#16377)
Summary:
One more shim removed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16377

Differential Revision: D13821816

Pulled By: ZolotukhinM

fbshipit-source-id: 007f014d404de51841437db7eef28367a2f6e46b
2019-01-30 14:02:34 -08:00
cc2d49deb7 Remove generic_if.h. (#16354)
Summary:
The current uses of `IR_IF` are mostly trivial, so there is not much value in having special macros for it.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16354

Differential Revision: D13821823

Pulled By: ZolotukhinM

fbshipit-source-id: 1ca73111f5b4868fa38a1f29c9230540773e5de6
2019-01-28 17:02:23 -08:00
47bf30661f Directly include headers from ATen.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16287

Differential Revision: D13792949

Pulled By: ZolotukhinM

fbshipit-source-id: d627d8dc469df048063c70d0b5b8d33fede809a3
2019-01-24 11:22:27 -08:00
a918f1d9af Adding a hook (wrapper) for non-std stream reader in PyTorchStreamReader (#15551)
Summary:
To implement a stream is very annoying, since it is closely defined with the underlying storage streambuffer.

So in this PR, we add ReadAdapterInterface and PyTorchStreamReader will use it. We implement IStreamAdapter as a wrapper of std::istream. And keep the user interface unchanged.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15551

Reviewed By: zrphercule

Differential Revision: D13568907

Pulled By: houseroad

fbshipit-source-id: 93708cb801248a6c101f35cb14d1631029365c3c
2019-01-04 22:50:07 -08:00
f636dc9276 clang format world (#15524)
Summary:
The PR clang-formats everything in `torch/csrc/jit/` and adds it to the pre-commit hook.

Here is a list of non-mechanical changes:
- I went over each file and fixed up whenever I could tell that clang-format was clobbering comment formatting.
- Made the macros in register_prim_ops a little more clang-format friendly by omitting trailing commas
- Refactored autodiff.cpp to use a helper class with explicit state rather than a bunch of capturing lambdas
- Small improvements to the precommit hook clang-format
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15524

Differential Revision: D13547989

Pulled By: suo

fbshipit-source-id: 3ff1541bb06433ccfe6de6e33f29227a2b5bb493
2018-12-26 06:55:01 -08:00
700271d0e9 Adding ONNX export for torch.expand and torch.ne (#15050)
Summary:
`torch.expand` and `torch.ne` are used often in models and this PR adds ONNX export support for them. ArmenAg has created issue https://github.com/pytorch/pytorch/issues/10882 for this.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15050

Differential Revision: D13453036

Pulled By: houseroad

fbshipit-source-id: 4724b4ffcebda6cd6b2acac51d6733cb27318daf
2018-12-17 13:48:14 -08:00
517c7c9861 Canonicalize all includes in PyTorch. (#14849)
Summary:
Anywhere we used #include "foo.h", we now say #include <foo.h>
Paths are adjusted to be rooted out of aten/src, torch/lib, or
the root level directory.

I modified CMakeLists.txt by hand to remove TH and THC from
the include paths.

I used the following script to do the canonicalization:

```
  import subprocess
  import re
  import os.path

  files = subprocess.check_output(['git', 'ls-files']).decode('utf-8').rstrip().split('\n')
  for fn in files:
      if not any(fn.endswith(suff) for suff in ['.cu', '.cpp', '.in', '.h', '.hpp', '.cu', '.cuh', '.cc']):
          continue
      if not any(fn.startswith(pref) for pref in ["aten/", "torch/"]):
          continue
      with open(fn, 'r') as f:
          c = f.read()
      def fmt(p):
          return "#include <{}>".format(p)
      def repl(m):
          p = m.group(1)
          if p in ["dlfcn.h", "unistd.h", "nvrtc.h", "cuda.h", "cuda_runtime.h", "cstdint", "cudnn.h", "Python.h", "cusparse.h", "cuda_runtime_api.h", "cuda_fp16.h", "cublas_v2.h", "stdint.h", "curand_kernel.h"]:
              return fmt(p)
          if any(p.startswith(pref) for pref in ["torch/csrc", "c10/", "ATen/", "caffe2/", "TH/", "THC/", "Eigen/", "gtest/", "zdl/", "gloo/", "onnx/", "miopen/"]):
              return fmt(p)
          for root in ["aten/src", "torch/lib", ""]:
              for bad_root in [os.path.dirname(fn), "aten/src/TH", "aten/src/THC", "torch/csrc"]:
                  new_p = os.path.relpath(os.path.join(bad_root, p), root)
                  if not new_p.startswith("../") and (os.path.exists(os.path.join(root, new_p)) or os.path.exists(os.path.join(root, new_p + ".in"))):
                      return fmt(new_p)
          print("ERROR: ", fn, p)
          return m.group(0)
      new_c = re.sub(r'#include "([^"]+)"', repl, c)
      if new_c != c:
          print(fn)
          with open(fn, 'w') as f:
              f.write(new_c)
```

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14849

Reviewed By: dzhulgakov

Differential Revision: D13363445

Pulled By: ezyang

fbshipit-source-id: 52361f878a672785f9306c9e9ab2513128092b68
2018-12-08 19:38:30 -08:00
efc5e9f71a Replace calls of Type::_th_tensor. (#14877)
Summary:
_th_tensor is moving off Type, so these calls need to be replaced.

Unfortunately, replacing these with a full-fledged solution [e.g. from_storage(..., TensorOptions)] is a bit complicated because the storage itself fully defines the Type (modulo variable).  It's simpler to just wait for the Variable/Tensor merge rather than to solve this now, so instead I changed the call sites to: at::empty({0}, type.options()).set_(storage...).

This isn't great because we are also trying to get rid of Type::options, but this seems to be the lesser-of-two-evils.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14877

Differential Revision: D13374310

Pulled By: gchanan

fbshipit-source-id: eb953ed041507e6190d6f32e383912e5a08311cd
2018-12-07 13:04:48 -08:00
e0f68671bd Restore device when import jit script module (#14454)
Summary:
We align the restore logic to `torch.load`, we try to restore to the right device, and if the device is not available, an exception is raised. We allow user to remap the device through a parameter `map_location`, it can be 1) a string like 'cuda:0`, `cpu`, 2) a device, torch.device('cpu'), 3) a dict, {'cuda:1', 'cuda:0'}, and a function, and its signature looks like string map_location(tensor, saved_device_string).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14454

Reviewed By: zrphercule

Differential Revision: D13271956

Pulled By: houseroad

fbshipit-source-id: dfd6b6049b0dc07549ddeddf2dea03ac53ba6d49
2018-12-03 14:10:30 -08:00
b768db0810 Allow DCE to clean up some mutable ops (#14601)
Summary:
This PR makes DCE a little smarter in the presence of mutable ops. Previously mutable ops could never be cleaned up, now they can be cleaned up if we can prove there are no live uses of any alias sets that the op writes to.

This behavior is optional; if you pass DCE a block instead of a graph, it will do the same thing as before. Also changed `InlineAutographSubgraph` to use the common subgraph utils.

Tested on traced ResNet, and it gets rid of the dead code.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14601

Differential Revision: D13309118

Pulled By: suo

fbshipit-source-id: dac2791e7d2ecf219ae717a2759b83c1e927f254
2018-12-03 13:31:08 -08:00
170ff7764f Use a zip archive as our container format (#14521)
Summary:
After consulting with Owen, who pointed out the existence of the miniz library, I decided to take one last shot at using zip as our container format.
miniz makes this surprisingly feasible and I think the benefits of using zip are large enough that we should do it.

This replaces our custom container format with a zip archive, preserving all of the
desirable features of our custom format, such as append-oriented writing, and
mmap'able tensor data while adding a bunch of debugging advantages:

1. You can unzip and explore the container to debug what is going on with a model.
2. You can edit the model using a text editor (e.g. change the definition of a method,
   or editing the json-serialized meta-data), re-zip the file use OSX's native 'Compress'
   option, and re-load the result into pytorch. Note: this enables you to, e.g., print-debug
   serialized models.
3. We can easily enable features like compression in the future.
4. Stock python , without pytorch installed, and other programming languages
   can reasonably consume this format,using json  and zipfile packages, which enables
   people to build tools like visualizers without those visualizers depending on pytorch.
   This will be especially useful if you want to, for instance, write a visualizer in javascript.

Notes:

*  This add miniz (https://github.com/richgel999/miniz) as a dependency. miniz is a self-contained
   library for reading/writing zipfiles that unlike other zip libraries also includes libz
   compatible compress/decompress support. It is a single header and a single C file without
   any other dependencies. Note that the instructions for miniz explicitly state:

   > Please use the files from the releases page in your projects. Do not use the git checkout directly!

   So we have checked in the 'release' source. Miniz supports zip64, and its API is amenable
   to doing zip-align style things to align data.

*  Removes 'size' from RecordRef. This allows you to edit files in the zip archive without
   editing the meta-data file. Very important if you want to print-debug serialized models.

*  PyTorchStreamReader/PyTorchStreamWriter keep mostly the same API (though keys become strings)
   However, their implementation is completely swapped out to use miniz.

*  Code exists to check for the old magic number to give a decent warning to our preview users
   after we change the format.

*  Container version information is now put in a stand-alone 'version' file in the archive
   and serves a similar purpose to the other container version info.

*  All files in the zip archive start at 64-byte boundaries, using an approach similar to
   zip-align. Tests check that this property remains true. While the writer does this,
   the reader doesn't depend on it, allowing user-created archives that can use compression,
   and do not have to align data.

*  Added test to check for > 4GB files and archives. Disabled by default because it takes
   almost 2 minutes to run.

*  torchscript files are now optional: if a submodule does not have methods, it will
   not be written.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14521

Reviewed By: jamesr66a

Differential Revision: D13252945

Pulled By: zdevito

fbshipit-source-id: 01209294c0f6543d0fd716f85a38532249c52f8c
2018-11-30 19:19:29 -08:00
fd31eae9ad Switch import/export to python printing (#14400)
Summary:
Stacked on https://github.com/pytorch/pytorch/pull/14378, only look at the last commit.

This changes the way methods are defined in TorchScript archives to use
PythonPrint rather than ONNX protobufs.

It also updates torch.proto to directly document the tensor data
structure actually being serialized.

Notes:
* because PythonPrint prints all the methods at once per module, this
  removes MethodDef in favor of a single torchscript_area and a separate
  caffe2_graphs entry. Note that NetDef's already have method names,
  so there is no need or a separate method name entry.
* This switches cpp/pickle area to RecordRef (references to a file in
  the container format) since it is possible the data in these arenas
  may be large and not suited to json ouput.
* Removes 'annotations' -- annotations should be re-added on the first
  commit that actually has a practical use for them. In the current state
  it is unlikely they are representing the right information.
* Some expect files have changed because PythonPrint is preserving more
  debug name information for parameter names.
* MethodEncoder (the ONNX output format) has been deleted. There is still
  some cleanup possible combining EncoderBase and GraphEncode now that there
  is only a single pathway using EncoderBase.
* This incorporates the changes from #14397
  to define TensorDef
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14400

Reviewed By: suo

Differential Revision: D13231800

Pulled By: zdevito

fbshipit-source-id: af5c1152d0bd6bca8b06c4703f59b161bb19f571
2018-11-29 17:53:49 -08:00
2cacb39a21 Fix ONNX_ATEN mode (#14239)
Summary:
Fix ONNX_ATEN mode by adding it to the validateBlock method.
Before this pr, validateBlock will throw an exception when using this mode.

I will add related test cases for ONNX_ATEN mode in a different pr once this is merged, since we dont have any currently.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14239

Differential Revision: D13145443

Pulled By: zrphercule

fbshipit-source-id: 60e7942aa126acfe67bdb428ef231ac3066234b1
2018-11-21 13:15:23 -08:00
7a654617eb Add tensor table in ModelDef and use it for jit script serialization and deserialization (#13861)
Summary:
As we discussed, the tensors in the torch script will be associated with the tensor data in the serialized file. So let's add a table of tensor (actually it's a repeated TensorProto filed) in the ModelDef. TensorProto.name will be the id.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/13861

Reviewed By: dzhulgakov

Differential Revision: D13036940

Pulled By: zrphercule

fbshipit-source-id: ecb91b062ac4bc26af2a8d6d12c91d5614efd559
2018-11-20 23:37:50 -08:00