Commit Graph

46 Commits

Author SHA1 Message Date
118bd82dde detect mocked module on saving pass (#70641)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70641

Raises a not-implemented error if we attempt to pickle an object that uses a mocked module. We no longer have to load the object to get this check; it now happens right on the saving path.
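
A minimal sketch of one way such a save-time check could work, using only the standard library. It scans the pickled bytes for `GLOBAL` opcodes and refuses the pickle if any referenced module matches a mock pattern. The names `find_mocked_globals` and `dumps_checked` are hypothetical, `fractions` merely stands in for a mocked module, and this is not claimed to be the code in the PR.

```python
import pickle
import pickletools
from fnmatch import fnmatchcase
from fractions import Fraction


def find_mocked_globals(payload, mock_patterns):
    """Return (module, name) pairs in a pickle that come from mocked modules."""
    hits = []
    for opcode, arg, _pos in pickletools.genops(payload):
        # With protocol <= 3, class references are written as GLOBAL "module name".
        if opcode.name != "GLOBAL":
            continue
        module, name = arg.split(" ", 1)
        if any(fnmatchcase(module, pattern) for pattern in mock_patterns):
            hits.append((module, name))
    return hits


def dumps_checked(obj, mock_patterns):
    """Pickle obj, refusing if it references anything from a mocked module."""
    payload = pickle.dumps(obj, protocol=3)
    hits = find_mocked_globals(payload, mock_patterns)
    if hits:
        raise NotImplementedError(f"object uses mocked modules: {hits}")
    return payload


# A plain list references no modules, so it pickles fine.
safe_payload = dumps_checked([1, 2, 3], ["fractions"])
```

Calling `dumps_checked(Fraction(1, 2), ["fractions"])` raises `NotImplementedError` at save time, without ever loading the pickle.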

Review history is on https://github.com/pytorch/pytorch/pull/69793. The PR was moved to a different branch because the original branch got corrupted.

Test Plan: Imported from OSS

Reviewed By: suo

Differential Revision: D33414365

Pulled By: PaliC

fbshipit-source-id: 6d72ddb05c47a3d060e9622ec0b6e5cd6c6c71c8
2022-01-10 11:11:55 -08:00
cbc29acca3 [Codemod][FBSourceBlackLinter] Daily arc lint --take BLACK
Reviewed By: zertosh

Differential Revision: D31423202

fbshipit-source-id: 08d249e8546c0bfe6f1145c0571141b90aad03eb
2021-10-05 20:55:56 -07:00
5883523c1d Remove dtype from torch.Storage and use only torch.ByteStorage (#62030)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62030

Remove dtype tracking from Python Storage interface, remove all the different `<type>Storage` classes except for `ByteStorage`, and update serialization accordingly, while maintaining as much FC/BC as possible

Fixes https://github.com/pytorch/pytorch/issues/47442

* **THE SERIALIZATION FORMAT IS FULLY FC/BC.** We worked very hard to make sure this is the case. We will probably want to break FC at some point to make the serialization structure of tensors make more sense, but not today.
* There is now only a single torch.ByteStorage class. Methods like `Tensor.set_` no longer check that the dtype of storage is appropriate.
* As we no longer know what the dtype of a storage is, we've **removed** the size method from Storage, replacing it with nbytes. This is to help catch otherwise silent errors where you confuse number of elements with number of bytes.
* `Storage._new_shared` takes a `nbytes` kwarg and will reject previous positional only calls.  `Storage._new_with_file` and `_set_from_file` require explicit element size arguments.
* It's no longer possible to convert storages to different types using the float/double/etc methods. Instead, do the conversion using a tensor.
* It's no longer possible to allocate a typed storage directly using FloatStorage/DoubleStorage/etc constructors. Instead, construct a tensor and extract its storage. The classes still exist but they are used purely for unpickling.
* The preexisting serialization format stores dtype with storage, and in fact this dtype is used to determine the dtype of the tensor overall.
 To accommodate this case, we introduce a new TypedStorage concept that exists only during unpickling time which is used to temporarily store the dtype so we can construct a tensor. **If you overrode the handling of pickling/unpickling, you MUST add handling for TypedStorage** or your serialization code will degrade to standard file-based serialization.

Original pull request: https://github.com/pytorch/pytorch/pull/59671

Reviewed By: soulitzer, ngimel

Differential Revision: D29466819

Pulled By: ezyang

fbshipit-source-id: 4a14e5d3c2b08e06e558683d97f7378a3180b00e
2021-10-05 13:50:34 -07:00
4fe66d962d [Codemod][FBSourceBlackLinter] Daily arc lint --take BLACK
Reviewed By: zertosh

Differential Revision: D31192084

fbshipit-source-id: 25d490783b876253ddd1ad0a70832766ebd33f51
2021-09-25 06:42:19 -07:00
146817c9d0 Add all_paths utility function (#65602)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/65602

Test Plan: Imported from OSS

Reviewed By: suo

Differential Revision: D31163681

Pulled By: tugsbayasgalan

fbshipit-source-id: fa0b28b1d3b73efcc7671698a613e695a01cc103
2021-09-25 01:11:20 -07:00
afa25c77f1 [package] Make it possible to re-save a PackageImporter module (#65101)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65101

As title. Previously this was guarded against for implementation
simplicity, as we didn't really think there was a use case for saving a
mangled module name directly.

But people started doing stuff like:
```
exporter.save_module(my_imported_obj.__module__)
```
which implicitly passes along the mangled module name.

This PR makes it so that a given `PackageImporter` instance can always
import modules that it created, and changes `PackageExporter` to
properly demangle the resulting module name when writing the package to
the export archive.
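
A minimal sketch of the demangling step, assuming mangled names carry a prefix of the form `<torch_package_N>.`; that prefix shape and the helper names are assumptions for illustration, not taken from this message.

```python
import re

# Assumed shape of a mangled prefix, e.g. "<torch_package_0>.".
_MANGLED_PREFIX = re.compile(r"^<torch_package_\d+>\.")


def is_mangled(name):
    """True if the module name starts with the assumed mangle prefix."""
    return _MANGLED_PREFIX.match(name) is not None


def demangle(name):
    """Strip the mangle prefix so the original module name is written out."""
    return _MANGLED_PREFIX.sub("", name, count=1)


original = demangle("<torch_package_0>.my_pkg.model")
```

Names that were never mangled pass through unchanged, so the exporter could apply `demangle` unconditionally.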

Differential Revision:
D30975712

Test Plan: Imported from OSS

Pulled By: suo

fbshipit-source-id: d9e849bf651713890e72dccdcef74fa52d377149
2021-09-17 16:25:11 -07:00
9b2b45919a Revert D29639797: [package] error if we try to mock a module in 3.6
Test Plan: revert-hammer

Differential Revision:
D29639797

Original commit changeset: 775ed78638fb

fbshipit-source-id: 9d2f6dae7ee35c6b37338e36ec7ade9d9e2ccbc2
2021-07-09 19:31:04 -07:00
54ea7d33ba [package] error if we try to mock a module in 3.6 (#61469)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61469

This feature is not supported, error out early.

Differential Revision:
D29639797

Test Plan: Imported from OSS

Reviewed By: Lilyjjo

Pulled By: suo

fbshipit-source-id: 775ed78638fb6da8f830b632726b00c0533ed176
2021-07-09 16:26:38 -07:00
12772c8dd8 [package] PackageExporter visualization methods (#61147)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61147

Basic tooling to enable users to see what is inside of a PackageExporter. Added methods:
- `externed/interned/mocked/denied_list()`: returns list of modules which are currently in the specified category
- `relied_on_by(module_name)`: returns list of modules which rely on `module_name`
- `dependency_graph_str()`: returns string format of graph for users. Example of output:
```
digraph G {
rankdir = LR;
node [shape=box];
"<res.foo.pkl>" -> "foo";
"foo" -> "torch.package";
"foo" -> "time";
"foo" -> "sentencepiece";
"foo" -> "package_top";
}
```
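
For illustration, a minimal sketch of how output in this shape can be produced from a list of dependency edges. The edge-list representation and both function bodies are assumptions; only the method names and the output format come from the description above.

```python
def dependency_graph_str(edges):
    """Render (source, target) dependency edges as a DOT digraph string."""
    lines = ["digraph G {", "rankdir = LR;", "node [shape=box];"]
    for src, dst in edges:
        lines.append(f'"{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)


def relied_on_by(edges, module_name):
    """Return the sorted names of modules that depend on module_name."""
    return sorted({src for src, dst in edges if dst == module_name})


edges = [
    ("<res.foo.pkl>", "foo"),
    ("foo", "torch.package"),
    ("foo", "time"),
]
dot_text = dependency_graph_str(edges)
```

The resulting string can be handed to any Graphviz-compatible viewer.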

Test Plan: Imported from OSS

Reviewed By: suo

Differential Revision: D29559683

Pulled By: Lilyjjo

fbshipit-source-id: 5dff4d04af911a9c9fdd0d100420f1382eaef46e
2021-07-09 15:27:06 -07:00
35b950ea98 [package] properly handle case where we are re-packaging mocked modules (#61434)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61434

Mocking is the only time we introduce a "special" module to a
torch.package of our own creation. This interacts poorly with
re-packaging, since if we treat `_mock` as a regular module and try to
package it normally we will produce a broken package.

This PR teaches PackageExporter to recognize `_mock` modules and treat
them specially during the dependency walking process, thus avoiding the
issue.

Test Plan: Imported from OSS

Reviewed By: jdonald, Lilyjjo

Differential Revision: D29638283

Pulled By: suo

fbshipit-source-id: 37a7ffa34da8bb665f679fbd72aa3d71154b2209
2021-07-09 14:27:49 -07:00
6a3170dba1 [package] minor cleanups to internal APIs (#61428)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61428

I was reading this code again after a while and didn't understand as
quickly as I would have liked. Some of the function names are no longer
accurate, etc.

This PR renames these functions to be in the same language of
"dependencies" that the rest of the API uses. I think the resulting
usage of the APIs is clearer than before.

Test Plan: Imported from OSS

Reviewed By: Chillee

Differential Revision: D29620946

Pulled By: suo

fbshipit-source-id: 7df640a7ffbd43998063b9ee3955c9dfcbc42cfb
2021-07-09 01:28:24 -07:00
5fbc853c5f [package] PackageExporter remove verbose mode (#61145)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61145

Remove 'verbose' mode from PackageExporter as people have complained that it is not useful.

Test Plan: Imported from OSS

Reviewed By: suo

Differential Revision: D29559681

Pulled By: Lilyjjo

fbshipit-source-id: eadb1a3a25fadc64119334a09bf1fa4b355b1edd
2021-07-08 18:26:43 -07:00
426c42ba45 [package] ensure we don't write files twice to the archive. (#61371)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61371

The ZIP format allows for writing multiple files with the same name. But
this is handled poorly by most tooling (including our own), so doing so
produces weird behavior depending on the implementation of the ZIP
reader.

Since we have no valid use case for writing multiple files with the same
name to a `torch.package`, just ban it.
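
A minimal sketch of such a guard on top of the standard `zipfile` module. The `UniqueZipWriter` class is hypothetical and only illustrates tracking written names; it is not the writer used by `torch.package`.

```python
import io
import zipfile


class UniqueZipWriter:
    """Wrap a ZipFile and refuse to write the same archive name twice."""

    def __init__(self, file):
        self._zip = zipfile.ZipFile(file, mode="w")
        self._written = set()

    def write(self, name, data):
        if name in self._written:
            raise ValueError(f"file {name!r} was already written to the archive")
        self._written.add(name)
        self._zip.writestr(name, data)

    def close(self):
        self._zip.close()


buffer = io.BytesIO()
writer = UniqueZipWriter(buffer)
writer.write("a.txt", b"one")
```

A second `writer.write("a.txt", ...)` raises `ValueError` instead of producing an archive whose contents depend on the reader.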

Differential Revision:
D29595518

Test Plan: Imported from OSS

Reviewed By: Lilyjjo

Pulled By: suo

fbshipit-source-id: b9f5263ab47572abde233745c102af3d6143946e
2021-07-07 18:28:42 -07:00
0dd90cceaf [package] track storages across lifetime of PackageExporter (#59735)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59735

1. Fixes ABA storage identity problem during serialization for `torch.package` by keeping reference of serialized storages through lifetime of `PackageExporter` to prevent reuse of memory address. Achieved by extending logic used in solution to mobile's same issue.
2. Adds determinism to the naming scheme of serialized storages in export code paths which utilize `tensor_cdata_naming_scheme` (introduced a 2nd mapping in `StorageContext`, which now maps `storage cdata ptr` -> `unique id` and `unique id` -> `c10::Storage`)
3. Additionally uses presence of a storage in the `StorageContext` instance as marker for if a storage has been serialized or not, removing the need to scan the `PythonStreamWriter` for presence of the storage's serialization file
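
The three points above can be sketched in pure Python, with `id()` standing in for the storage's cdata pointer. This `StorageContext` is a hypothetical stand-in written for illustration, not the C++ class.

```python
class StorageContext:
    """Track serialized storages and keep them alive to avoid address reuse."""

    def __init__(self):
        self._unique_ids = {}  # address (here: id()) -> unique id
        self._storages = []    # unique id -> storage, held as a strong reference

    def get_or_add(self, storage):
        """Return a deterministic unique id, registering the storage if new."""
        address = id(storage)
        if address not in self._unique_ids:
            self._unique_ids[address] = len(self._storages)
            self._storages.append(storage)
        return self._unique_ids[address]

    def has(self, storage):
        """Presence in the context marks the storage as already serialized."""
        return id(storage) in self._unique_ids


context = StorageContext()
first = bytearray(b"weights")
second = bytearray(b"bias")
first_id = context.get_or_add(first)
second_id = context.get_or_add(second)
```

Because the context holds a reference to every storage it has seen, an address cannot be freed and handed to a different storage while the exporter is alive, which is what rules out the ABA case.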

Test Plan: Imported from OSS

Reviewed By: suo

Differential Revision: D29075276

Pulled By: Lilyjjo

fbshipit-source-id: 15a5c30b1de99c5bd7079388f2db9b6ece2eca12
2021-06-29 14:16:54 -07:00
5f010c066f [package] Bring back save_source_file (#59962)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59962

This reverts commit 44b021d21b5681c105529881bdbaefb6d3e335f6.

Test Plan: Imported from OSS

Reviewed By: H-Huang

Differential Revision: D29113224

Pulled By: zhxchen17

fbshipit-source-id: 55d42acc421c5f4abbbad9d9ed4d32b615939463
2021-06-18 11:13:35 -07:00
c7890b4a8e [package] doc string cleanup extravaganza (#59843)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/59843

Test Plan: Imported from OSS

Reviewed By: SplitInfinity

Differential Revision: D29049342

Pulled By: Lilyjjo

fbshipit-source-id: 3330fb439f28dda0cafef5797ff61311f4afbf76
2021-06-10 21:21:48 -07:00
04986b909f [package] Add docstring for PackageExporter.intern (#59602)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59602

**Summary**
This commit adds a docstring for `PackageExporter.intern`.

**Test Plan**
Continuous integration.

Test Plan: Imported from OSS

Reviewed By: jbschlosser

Differential Revision: D28972939

Pulled By: SplitInfinity

fbshipit-source-id: 1765541aa2ed88e01beb48c08b90f56df3a591b7
2021-06-08 19:53:36 -07:00
ed4cda0183 [pkg] opt into autoformat
Summary: woooo

Test Plan: arc lint --apply-patches --take BLACK --paths-cmd 'hg files -I "caffe2/**/*.py"'

Reviewed By: SplitInfinity

Differential Revision: D28608934

fbshipit-source-id: 7768fed50a87883a95319376c0a6d73a9492bdcc
2021-05-21 15:03:52 -07:00
5caccbe39e [pkg] Catch exceptions where dependency resolution gets invalid imports (#58573)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58573

Users can create invalid imports, like:
```
# in a top-level package
if False:
  from .. import foo
```

Since this code is never executed, it will not cause the module to fail to
load. But our dependency analysis walks every `import` statement in the AST,
and will attempt to resolve the (incorrectly formed) import, throwing an exception.
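
A minimal sketch of the failure mode and the fix, using `ast` and `importlib.util.resolve_name` from the standard library. The function `find_imports` is hypothetical; the real dependency analysis may resolve names differently.

```python
import ast
from importlib.util import resolve_name


def find_imports(source, package):
    """Walk every import in the AST, collecting resolvable and broken ones."""
    resolved, broken = [], []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            resolved.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            relative = "." * node.level + (node.module or "")
            try:
                if node.level:
                    resolved.append(resolve_name(relative, package))
                else:
                    resolved.append(relative)
            except (ImportError, ValueError):
                # Import reaches beyond the top-level package; record, don't crash.
                broken.append(relative)
    return resolved, broken


source = """
import os
from . import sibling
if False:
    from .. import foo
"""
resolved, broken = find_imports(source, "top")
```

Catching both `ImportError` and `ValueError` covers the different exception types that Python versions raise for an import beyond the top-level package.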

For posterity, the code that triggered this: https://git.io/JsCgM

Differential Revision: D28543980

Test Plan: Added a unit test

Reviewed By: Chillee

Pulled By: suo

fbshipit-source-id: 03b7e274633945b186500fab6f974973ef8c7c7d
2021-05-19 23:04:21 -07:00
703f24397b [pkg] simplifications to broken dependency handling (#58572)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58572

Right now, we have three categories of error (broken, denied, unhandled). This
PR unifies them into a single "error" field in the node, with optional context.
It also generalizes how formatting of the error in PackagingError occurs.

Differential Revision: D28543982

Test Plan: sandcastle

Reviewed By: Chillee

Pulled By: suo

fbshipit-source-id: d99d37699ec2e172e3798763e60aafe9a66ed6f4
2021-05-19 23:03:12 -07:00
76d2cb3b8e [torch.package/TorchScript] flag to gate allowance of TS serialization in torch.package (#57678)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/57678

Test Plan: Imported from OSS

Reviewed By: suo

Differential Revision: D28232891

Pulled By: Lilyjjo

fbshipit-source-id: f6b2f4557cb98c4e811b7e3b665e0ffe88115555
2021-05-14 08:21:46 -07:00
307375a88e [torch.Package/TorchScript] torch.Package python logic to save TorchScript (#54893)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54893

Adds logic to torch.Package's `PackageExporter` and `PackageImporter` to handle TorchScript objects. Also adds necessary `__reduce_package__` methods to `ScriptModule` and `RecursiveScriptModule` to enable this

API:
```
# create scripted objects
scripted_mod = torch.jit.script(Mod1("initial_1"))
scripted_mod2 = torch.jit.script(Mod2("initial_2"))

# save objects into package
with PackageExporter(filename, verbose=False) as e:
    e.save_pickle("res", "mod.pkl", scripted_mod)
    e.save_pickle("res", "mod2.pkl", scripted_mod2)

# load scripted objects from package
importer = PackageImporter(filename)
scripted_mod_loaded = importer.load_pickle("res", "mod.pkl")
scripted_mod2_loaded = importer.load_pickle("res", "mod2.pkl")
```

Test Plan: Imported from OSS

Reviewed By: suo

Differential Revision: D27832547

Pulled By: Lilyjjo

fbshipit-source-id: 73bf254c311fee2a2b21a9a7861d6cdc53709bd1
2021-05-14 08:21:41 -07:00
f1ac9b6598 fix lint (#58203)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/58203

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D28401974

Pulled By: suo

fbshipit-source-id: cc244e0fc81c5f699ff4bd30754a3f6467f232c4
2021-05-12 18:01:50 -07:00
01d0eb9dac [package] Add an intern keyword (#57341)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57341

Require that users be explicit about what they are going to be
interning. There are a lot of changes that are enabled by this. The new
overall scheme is:

PackageExporter maintains a dependency graph. Users can add to it,
either explicitly (by issuing a `save_*` call) or implicitly (through
dependency resolution). Users can also specify what action to take when
PackageExporter encounters a module (deny, intern, mock, extern).

Nothing (except pickles, though that can be changed with a small amount
of work) is written to the zip archive until we are finalizing the
package. At that point, we consult the dependency graph and write out
the package exactly as it tells us to.

This accomplishes two things:
1. We can gather up *all* packaging errors instead of showing them one at a time.
2. We require that users be explicit about what's going in packages, which is a common request.
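
A minimal sketch of the scheme, assuming `fnmatch`-style patterns and a first-match-wins rule order. `MiniExporter` and this local `PackagingError` are hypothetical stand-ins, not the real classes.

```python
from fnmatch import fnmatchcase


class PackagingError(Exception):
    """Stand-in error that carries every problem found at close time."""


class MiniExporter:
    ACTIONS = ("intern", "extern", "mock", "deny")

    def __init__(self):
        self.rules = []    # (pattern, action); the first matching rule wins
        self.graph = {}    # module name -> chosen action, or None if unhandled
        self.written = []  # modules that would be written to the archive

    def add_rule(self, action, pattern):
        if action not in self.ACTIONS:
            raise ValueError(f"unknown action: {action}")
        self.rules.append((pattern, action))

    def add_dependency(self, module_name):
        # Only record the node; nothing is written until close().
        action = next(
            (a for p, a in self.rules if fnmatchcase(module_name, p)), None
        )
        self.graph[module_name] = action

    def close(self):
        errors = []
        for module_name, action in self.graph.items():
            if action is None:
                errors.append(f"{module_name}: no action specified")
            elif action == "deny":
                errors.append(f"{module_name}: denied")
        if errors:
            # Report every problem together instead of one at a time.
            raise PackagingError("\n".join(errors))
        self.written = [m for m, a in self.graph.items() if a == "intern"]


exporter = MiniExporter()
exporter.add_rule("intern", "my_pkg.*")
exporter.add_rule("extern", "numpy")
exporter.add_dependency("my_pkg.model")
exporter.add_dependency("numpy")
exporter.close()
```

Deferring all writes to `close()` is what makes it possible to collect the full list of errors before touching the archive.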

Differential Revision: D28114185

Test Plan: Imported from OSS

Reviewed By: SplitInfinity

Pulled By: suo

fbshipit-source-id: fa1abf1c26be42b14c7e7cf3403ecf336ad4fc12
2021-05-12 16:22:43 -07:00
29cfcf70be [package] add mock/extern hooks (#58000)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58000

Directly overriding save_extern and save_mock may mess with our
invariants in weird ways. This is less pronounced now, but once we
switch to graph-based dependency management things will get broken
subtly if people fail to call `super()`.

Better to add hook support to reflect that really you can only do a side
effect. Also has the bonus that people are likely familiar with it from
`nn.Module` hooks.
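
A minimal sketch of the hook pattern, in which the exporter keeps its own bookkeeping and hooks can only observe. `HookedExporter`, the integer handles, and `remove_hook` are illustrative assumptions; the hook API added by this PR may differ.

```python
class HookedExporter:
    """Exporter stand-in whose extern() invariants cannot be overridden by hooks."""

    def __init__(self):
        self.externed = []
        self._extern_hooks = {}
        self._next_handle = 0

    def register_extern_hook(self, hook):
        """Register hook(exporter, module_name); returns a handle for removal."""
        handle = self._next_handle
        self._next_handle += 1
        self._extern_hooks[handle] = hook
        return handle

    def remove_hook(self, handle):
        self._extern_hooks.pop(handle, None)

    def extern(self, module_name):
        # Core bookkeeping always runs, regardless of what hooks do.
        self.externed.append(module_name)
        for hook in list(self._extern_hooks.values()):
            hook(self, module_name)


seen = []
exporter = HookedExporter()
handle = exporter.register_extern_hook(lambda exp, name: seen.append(name))
exporter.extern("numpy")
exporter.remove_hook(handle)
exporter.extern("scipy")
```

Unlike overriding `save_extern`, a hook has no way to skip the exporter's own work by forgetting to call `super()`.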

Differential Revision: D28339191

Test Plan: Imported from OSS

Reviewed By: SplitInfinity

Pulled By: suo

fbshipit-source-id: 63ffd39d2dcb1a7524f3c2c6a23bd399e754cc44
2021-05-11 16:46:54 -07:00
44b021d21b [package] remove save_source_file API (#57340)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57340

This API was only used within our own implementation. I couldn't find
any uses anywhere else. Removing it to reduce our overall surface area,
and also because the semantics are unclear in a world where
serialization is deferred to close() time.

Differential Revision: D28114188

Test Plan: Imported from OSS

Reviewed By: anjali411

Pulled By: suo

fbshipit-source-id: 6da53f20518885c7f4359e00e174f5e911906389
2021-05-05 17:57:05 -07:00
a3cba770b5 [package] remove PackageExporter.file_structure (#57339)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57339

After the `intern` changes, we will no longer eagerly write to the package
archive so `file_structure` as written doesn't make much sense.

Differential Revision: D28114187

Test Plan: Imported from OSS

Reviewed By: anjali411

Pulled By: suo

fbshipit-source-id: 875595db933e9d1b2fdde907b086889cc977e92f
2021-05-05 17:57:04 -07:00
f326f7dda8 [package] use digraph to back dependency visualization (#57338)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/57338

Differential Revision: D28114190

Test Plan: Imported from OSS

Reviewed By: astaff

Pulled By: suo

fbshipit-source-id: 78b15edae3b991307fd3656ac7b374d4d218b460
2021-05-05 17:57:02 -07:00
a39c685ace [package] make extern a dict (#57336)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57336

Avoid a small n^2

Differential Revision: D28114189

Test Plan: Imported from OSS

Reviewed By: astaff

Pulled By: suo

fbshipit-source-id: 2672669ad0e23169d70c92f9d5ed61f66081f248
2021-05-05 17:56:59 -07:00
dedf9fbe81 [package] factor out PackageExporter._get_dependencies (#57335)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57335

Mostly refactoring. The only behavioral change is that I have eliminated
the `orig_source_file` argument to `save_source_string`. I think it
doesn't provide enough marginal value (since if you have the module name
you can get the source file anyway).

Differential Revision: D28114184

Test Plan: Imported from OSS

Reviewed By: VitalyFedyunin

Pulled By: suo

fbshipit-source-id: b5e9eb4250dc84552befeef2dcf9e591b32899ae
2021-05-05 17:55:48 -07:00
31e59c3869 torch.package change Folder to Directory and add doc strings (#56925)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/56925

Test Plan: Imported from OSS

Reviewed By: suo

Differential Revision: D28002145

Pulled By: Lilyjjo

fbshipit-source-id: 6265970202d1530c4fb7ea10011b0e09094037d5
2021-04-28 13:03:12 -07:00
eac082891f [package] Massage exporter docstrings (#56547)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56547

**Summary**
This commit tweaks the docstrings of `PackageExporter` so that they look
nicer on the docs website.

**Test Plan**
Continuous integration.

Test Plan: Imported from OSS

Reviewed By: ailzhang

Differential Revision: D27912965

Pulled By: SplitInfinity

fbshipit-source-id: 38c0a715365b8cfb9eecdd1b38ba525fa226a453
2021-04-21 14:06:54 -07:00
8d4e6c9570 [package] make GlobGroup a public concept (#56238)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56238

It's already functionally public due to `extern` and `mock`, but
exposing the underlying implementation makes extending PackageExporter
easier.

Changed the underscores, expose on `torch.package`, add docs, etc.

Differential Revision: D27817013

Test Plan: Imported from OSS

Reviewed By: Lilyjjo

Pulled By: suo

fbshipit-source-id: e39199e7cb5242a8bfb815777e4bb82462864027
2021-04-16 13:31:48 -07:00
8f68396462 [package] fix error handling with allow_empty (#56190)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56190

Previously, if we had some code that did the following:
```
- pattern A, allow_empty=False
- save module B, but throws an exception for whatever reason
- save module that causes match against A
```

Then the resulting behavior would be:
1. exception thrown, which triggers `__exit__` on `PackageExporter`
2. `PackageExporter` checks that all patterns are matched against, and sees that A was not matched.
3. Error is raised that we didn't match against pattern A.

This is confusing, since the *real* error that caused packaging to fail
occurred when trying to package module B, but it's being hidden by the
error about module A (even though if packaging module B had succeeded,
there would be no error).

Change it so that the behavior looks like:
1. exception thrown, which triggers `__exit__` on `PackageExporter`
2. `PackageExporter` recognizes that an exception is happening and
immediately just returns control flow to the caller to handle the "real"
exception.
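
The new behavior can be sketched with a plain context manager. `PatternChecker` is a hypothetical stand-in that only models the "all patterns must match" check.

```python
class PatternChecker:
    """Validate required patterns on exit, unless an exception is in flight."""

    def __init__(self):
        self._matched = {}  # pattern -> whether anything matched it

    def require(self, pattern):
        self._matched.setdefault(pattern, False)

    def mark_matched(self, pattern):
        self._matched[pattern] = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        if exc_type is not None:
            # A "real" exception is propagating; do not mask it.
            return False
        unmatched = [p for p, hit in self._matched.items() if not hit]
        if unmatched:
            raise ValueError(f"patterns never matched: {unmatched}")
        return False


with PatternChecker() as checker:
    checker.require("A")
    checker.mark_matched("A")
```

If the body of the `with` block raises before `mark_matched("A")` runs, the caller sees that original exception and not the unmatched-pattern error.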

Differential Revision: D27803988

Test Plan: Imported from OSS

Reviewed By: guangyuwang

Pulled By: suo

fbshipit-source-id: f67b2e96165a0547c194a8bef1af1c185452173e
2021-04-15 20:16:43 -07:00
669a8acc54 [package] Allow save_module to accept module as arg (#55996)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55996

**Summary**
This commit modifies `PackageExporter.save_module` so that the `module`
argument can be either a string (`str`) or a module
(`types.ModuleType`).
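
A minimal sketch of accepting either form, assuming the argument is normalized to a module name up front. The helper `module_name_of` is hypothetical.

```python
import json
import types


def module_name_of(module):
    """Accept a module object or a module name and return the name."""
    if isinstance(module, types.ModuleType):
        return module.__name__
    if isinstance(module, str):
        return module
    raise TypeError(f"expected str or module, got {type(module).__name__}")


from_object = module_name_of(json)
from_string = module_name_of("json")
```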

**Test Plan**
This commit adds a unit test similar to `TestSaveLoad.test_save_module`
that tests that calling `save_module` with a module object works.

**Fixes**
This commit fixes #55939.

Test Plan: Imported from OSS

Reviewed By: jamesr66a, huiguoo

Differential Revision: D27771781

Pulled By: SplitInfinity

fbshipit-source-id: 57c8cf45575bb8dcfca711759fadfff72efb35e7
2021-04-14 15:52:55 -07:00
fc6985eceb [package] Minor fixes to PackageExporter docstrings (#55817)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55817

**Summary**
This commit makes minor edits to the docstrings of `PackageExporter` so
that they render properly in the `torch.package` API reference.

**Test Plan**
Continuous integration (especially the docs tests).

Test Plan: Imported from OSS

Reviewed By: gmagogsfm

Differential Revision: D27726817

Pulled By: SplitInfinity

fbshipit-source-id: b81276d7278f586fceded83d23cb4d0532f7c629
2021-04-13 10:00:38 -07:00
524dbe1fa1 [Easy] Fix typo in package_exporter.py (#55551)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55551

Simple typo, it should be `OrderedImporter`

Test Plan: ci

Differential Revision: D27629463

fbshipit-source-id: 745527a8339f03a8fd38d0a4491811b3c9ca9b1e
2021-04-07 16:30:07 -07:00
911b8b1bfc [package] rename PackageExporter.external to PackageExporter.extern_modules (#54601)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54601

This makes it consistent with PackageImporter and the on-disk format.

Test Plan: Imported from OSS

Reviewed By: Lilyjjo

Differential Revision: D27296915

Pulled By: suo

fbshipit-source-id: a9bc615b1952b6cc4dcba31d4a33932b1fa1a2aa
2021-03-25 11:50:07 -07:00
8c2c9450cc [package] autoformat (#53783)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53783

Use isort + black on torch/package/

Test Plan: Imported from OSS

Reviewed By: SplitInfinity

Differential Revision: D26969020

Pulled By: suo

fbshipit-source-id: e2c0738e79bf41b6342355eb7025998178c35dc9
2021-03-15 17:18:43 -07:00
51592a9e0a [package] Add deny method to PackageExporter (#53233)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53233

**Summary**
This commit adds a `deny` method to `PackageExporter` that allows
modules to be prohibited during the packaging process. A dependency on a
module matching the names or globs that `deny` was called with will
cause an exception to be raised.

**Test Plan**
This commit adds unit tests to `PackagingTest` for this new method:
`test_deny` and `test_deny_glob`.

**Fixes**
This commit fixes #53217.

Test Plan: Imported from OSS

Reviewed By: suo

Differential Revision: D26834010

Pulled By: SplitInfinity

fbshipit-source-id: 469b5c6741bcc6dab77e352f41db38fa1e0dae12
2021-03-04 20:37:41 -08:00
f1eedfa2c8 [package] Add allow_empty flag to mock and extern (#53232)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53232

**Summary**
This commit adds an optional `allow_empty` argument to
`PackageExporter.mock` and `PackageExporter.extern` that allows certain
patterns for mocked modules and extern modules to be marked as ones that
*must* be matched during the packaging process. If a mock or extern
module with `allow_empty=False` is not matched while packaging, an error
is thrown.

**Test Plan**
This commit adds two new test cases to `PackagingTest`,
`test_extern_glob_allow_empty` and `test_mock_glob_allow_empty` that
test this new flag. Existing tests already test `allow_empty=True`.

**Fixes**
This commit fixes #53217.

Test Plan: Imported from OSS

Reviewed By: suo

Differential Revision: D26834011

Pulled By: SplitInfinity

fbshipit-source-id: 9cf4ea56079ae210d6cfa8604218849eb5cde5f4
2021-03-04 20:35:06 -08:00
ec128eadea [package] _custom_import_pickler -> _package_pickler (#53048)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53048

I am planning to use the custom pickler and unpickler as
semi-public interfaces for `torch.rpc` to consume. Some preparatory
moves here.

Test Plan: Imported from OSS

Reviewed By: Lilyjjo

Differential Revision: D26734594

Pulled By: suo

fbshipit-source-id: 105ae1161d90f24efc7070a8d80c6ac3d2111bea
2021-03-01 18:38:43 -08:00
958d9a8364 [fx/package] make GraphModules packageable (#51976)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51976

FX serializes things by serializing Python code as a string and exec'ing
it on load. This accomplishes one goal (we don't have to pickle the
graph object directly) but breaks the pickle abstraction in ways that
are not composable with `torch.package`.

In particular:
1. `forward` is serialized by saving Python code. On load, it's
installed by `exec`ing that code. This `exec` call needs to have the
right importer installed, otherwise it will not import modules from the
`torch.package` but instead import from the Python environment.
2. Any types/functions used are emitted as `import` statements in the
generated Python code. These are effectively dynamic dependencies of the
`GraphModule` being saved, and need to be registered as such so that the
`PackageExporter` will package them.

To address these, this PR introduces a new protocol for the
importer/exporter: `__reduce_package__`.

A class can implement `__reduce_package__` to customize how it is placed
in the importer/exporter. It functions very similarly to `__reduce__`,
except:
- `__reduce_package__` takes one argument, which is the
`PackageExporter` instance. Users can use this instance to save stuff to
the package to implement their serialization. `__reduce__` takes no args.
- Only the 2-element tuple version of the return value for `__reduce__`
is supported (this could be extended if necessary).
- When the reduction function is called on load, an additional argument
is added to the beginning of the args tuple. This is the
`PackageImporter` instance doing the loading.

The `__reduce_package__` protocol is defined using `persistent_id` and
`persistent_load`, which ensures that we can still use the cpickle
implementation of the pickler by default.
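
A minimal standard-library sketch of how such a protocol can ride on `persistent_id` and `persistent_load`. `FakeExporter`, `FakeImporter`, `CodeHolder`, and the `LOADERS` registry are all hypothetical. To keep the sketch self-contained, loaders are looked up by name in a registry, whereas the protocol described above returns the reduction function itself.

```python
import io
import pickle

LOADERS = {}  # hypothetical registry: loader name -> load function


class FakeExporter:
    def __init__(self):
        self.files = {}

    def save_text(self, name, text):
        self.files[name] = text


class FakeImporter:
    def __init__(self, files):
        self.files = files

    def load_text(self, name):
        return self.files[name]


class CodeHolder:
    def __init__(self, source):
        self.source = source

    def __reduce_package__(self, exporter):
        # Save our payload into the package; return (loader, args) for load time.
        name = f"code_{len(exporter.files)}.py"
        exporter.save_text(name, self.source)
        return ("load_code_holder", (name,))


def load_code_holder(importer, name):
    # The importer is prepended to the args tuple on load.
    return CodeHolder(importer.load_text(name))


LOADERS["load_code_holder"] = load_code_holder


class PackagePickler(pickle.Pickler):
    def __init__(self, file, exporter):
        super().__init__(file, protocol=3)
        self.exporter = exporter

    def persistent_id(self, obj):
        hook = getattr(obj, "__reduce_package__", None)
        if hook is None or isinstance(obj, type):
            return None  # pickle normally
        loader_name, args = hook(self.exporter)
        return ("reduce_package", loader_name, args)


class PackageUnpickler(pickle.Unpickler):
    def __init__(self, file, importer):
        super().__init__(file)
        self.importer = importer

    def persistent_load(self, pid):
        tag, loader_name, args = pid
        if tag != "reduce_package":
            raise pickle.UnpicklingError(f"unknown persistent id: {tag}")
        return LOADERS[loader_name](self.importer, *args)


exporter = FakeExporter()
buffer = io.BytesIO()
PackagePickler(buffer, exporter).dump({"model": CodeHolder("x = 1")})
buffer.seek(0)
restored = PackageUnpickler(buffer, FakeImporter(exporter.files)).load()
```

Since only `persistent_id` and `persistent_load` are overridden, the rest of the pickling still goes through the default pickler implementation.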

Pull Request resolved: #51971

Test Plan: Imported from OSS

Reviewed By: zdevito

Differential Revision: D26340591

Pulled By: suo

fbshipit-source-id: 5872a7d22e832056399a7372bae8a57807717882
2021-02-23 22:43:00 -08:00
0bc57f47f0 torch.Package zipfile debugging printer (#52176)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52176

Added tooling to print out zipfile structure for PackageExporter and PackageImporter.

API looks like:
```
# only include files with "sss" in the path
exporter.print_file_structure("sss")
# don't print storage; only include files with "sss" in the path
importer3.print_file_structure(False, "sss")
```

The output looks like this with the storage hidden by default:
```
─── resnet.zip
    ├── .data
    │   ├── extern_modules
    │   └── version
    ├── models
    │   └── models1.pkl
    └── torchvision
        └── models
            ├── resnet.py
            └── utils.py
```
The output looks like this with the storage being printed out:
```
─── resnet_added_attr_test.zip
    ├── .data
    │   ├── 94574437434544.storage
    │   ├── 94574468343696.storage
    │   ├── 94574470147744.storage
    │   ├── 94574470198784.storage
    │   ├── 94574470267968.storage
    │   ├── 94574474917984.storage
    │   ├── extern_modules
    │   └── version
    ├── models
    │   └── models1.pkl
    └── torchvision
        └── models
            ├── resnet.py
            └── utils.py
```

If the output is filtered with the string 'utils', it would look like this:
```
─── resnet_added_attr_test.zip
    └── torchvision
        └── models
            └── utils.py
```
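
A minimal sketch of how a listing in this style can be rendered from archive paths, including the substring filter. The functions `build_tree` and `render` are hypothetical and independent of the real `print_file_structure` implementation; the sketch omits the root line that names the zip file.

```python
def build_tree(paths, include=None):
    """Nest '/'-separated paths into dicts, keeping only paths containing include."""
    tree = {}
    for path in sorted(paths):
        if include is not None and include not in path:
            continue
        node = tree
        for part in path.split("/"):
            node = node.setdefault(part, {})
    return tree


def render(tree, prefix=""):
    """Return the tree as a list of lines drawn with box characters."""
    lines = []
    names = list(tree)
    for index, name in enumerate(names):
        is_last = index == len(names) - 1
        lines.append(prefix + ("└── " if is_last else "├── ") + name)
        child_prefix = prefix + ("    " if is_last else "│   ")
        lines.extend(render(tree[name], child_prefix))
    return lines


paths = [
    ".data/extern_modules",
    ".data/version",
    "models/models1.pkl",
    "torchvision/models/resnet.py",
    "torchvision/models/utils.py",
]
filtered_lines = render(build_tree(paths, include="utils"))
```

With a real archive, `paths` could come from `zipfile.ZipFile(...).namelist()`.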

Test Plan: Imported from OSS

Reviewed By: suo

Differential Revision: D26429795

Pulled By: Lilyjjo

fbshipit-source-id: 4fa25b0426912f939c7b52cedd6e217672891f21
2021-02-22 15:04:56 -08:00
d5ac929b62 [package] Introduce Importer to manage module namespace collisions. (#51975)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51975

See comments in code.

Test Plan: Imported from OSS

Reviewed By: zdevito

Differential Revision: D26340592

Pulled By: suo

fbshipit-source-id: 61b16bafad15e19060710ad2d8487c776d672847
2021-02-19 10:06:04 -08:00
76e8324370 [package] rename ex/importer.py to package_ex/importer.py (#52320)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52320

as title

Test Plan: Imported from OSS

Reviewed By: zdevito

Differential Revision: D26468416

Pulled By: suo

fbshipit-source-id: 890eecea76426918daff900402fbcbc149e48535
2021-02-19 10:04:14 -08:00