Summary:
To better track models after serialization, this change writes a serialization_id, a UUID, to the inline container. The ID makes a model traceable across save and load events.
A new serialization_id UUID is generated every time serialization takes place, so it can be thought of as a model snapshot identifier at the time of serialization.
Test Plan:
```
buck2 test @//mode/dev //caffe2/caffe2/serialize:inline_container_test
```
Local tests:
```
buck2 run @//mode/opt //scripts/atannous:example_pytorch_package
buck2 run @//mode/opt //scripts/atannous:example_pytorch
buck2 run @//mode/opt //scripts/atannous:example_pytorch_script
```
```
$ unzip -l output.pt
Archive:  output.pt
  Length      Date    Time    Name
---------  ---------- -----   ----
       36  00-00-1980 00:00   output/.data/serialization_id
      358  00-00-1980 00:00   output/extra/producer_info.json
       58  00-00-1980 00:00   output/data.pkl
      261  00-00-1980 00:00   output/code/__torch__.py
      326  00-00-1980 00:00   output/code/__torch__.py.debug_pkl
        4  00-00-1980 00:00   output/constants.pkl
        2  00-00-1980 00:00   output/version
---------                     -------
     1045                     7 files
```
```
unzip -p output.pt "output/.data/serialization_id"
a9f903df-cbf6-40e3-8068-68086167ec60
```
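The same idea can be sketched in pure Python with the standard `zipfile` and `uuid` modules. This is only an illustration of "a fresh UUID per save, stored as an archive record"; it is not the inline container's code, and the helper names are hypothetical. The record path is copied from the listing above.

```python
import io
import uuid
import zipfile

# Record name taken from the `unzip -l` listing; helper names are hypothetical.
SERIALIZATION_ID_RECORD = "output/.data/serialization_id"


def write_archive(buffer):
    """Write a toy archive and return the freshly generated serialization id."""
    serialization_id = str(uuid.uuid4())  # a new UUID on every save
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(SERIALIZATION_ID_RECORD, serialization_id)
    return serialization_id


def read_serialization_id(buffer):
    """Read the id back, much like `unzip -p` does in the shell example."""
    with zipfile.ZipFile(buffer) as archive:
        return archive.read(SERIALIZATION_ID_RECORD).decode("ascii")


first, second = io.BytesIO(), io.BytesIO()
first_id = write_archive(first)
second_id = write_archive(second)
```

Saving twice yields two different ids, which is what makes the id usable as a snapshot identifier.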
Differential Revision: D45683657
Pull Request resolved: https://github.com/pytorch/pytorch/pull/100994
Approved by: https://github.com/davidberard98
Summary: To get source for a particular module, the "correct" thing to do is to check the module's spec and use `get_source` if it's a SourceFileLoader, since subclasses may look elsewhere than the `__file__`, and the spec will give the source of truth. For torch packager, however, we prefer to use linecache, but the loader could still change the file, so we figure out the file for the module using the spec's loader rather than using `module.__file__`, if possible.
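A minimal sketch of that lookup, using only the standard library. The helper names are hypothetical and this is not the packager's actual code: it asks the spec's loader for the file name, falls back to `module.__file__`, and then reads the source through `linecache`.

```python
import importlib
import linecache


def filename_for_module(module):
    """Prefer the file reported by the spec's loader over `module.__file__`."""
    spec = getattr(module, "__spec__", None)
    loader = getattr(spec, "loader", None)
    get_filename = getattr(loader, "get_filename", None)
    if get_filename is not None:
        try:
            return get_filename(module.__name__)
        except ImportError:
            pass  # the loader cannot resolve this name; fall through
    return getattr(module, "__file__", None)


def source_lines_for_module(module):
    """Read source through linecache, keyed by the loader-resolved file."""
    filename = filename_for_module(module)
    if filename is None:
        return []
    return linecache.getlines(filename, module.__dict__)


json_module = importlib.import_module("json")
json_file = filename_for_module(json_module)
json_source = source_lines_for_module(json_module)
```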
Test Plan: This code path will get exercised by CI. Also added a test for remapped files.
Differential Revision: D41412983
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90258
Approved by: https://github.com/PaliC
Summary:
Applies new import merging and sorting from µsort v1.0.
When merging imports, µsort will make a best-effort to move associated
comments to match merged elements, but there are known limitations due to
the dynamic nature of Python and developer tooling. These changes should
not produce any dangerous runtime changes, but may require touch-ups to
satisfy linters and other tooling.
Note that µsort uses case-insensitive, lexicographical sorting, which
results in a different ordering compared to isort. This provides a more
consistent sorting order, matching the case-insensitive order used when
sorting import statements by module name, and ensures that "frog", "FROG",
and "Frog" always sort next to each other.
For details on µsort's sorting and merging semantics, see the user guide:
https://usort.readthedocs.io/en/stable/guide.html#sorting
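The difference between case-sensitive and case-insensitive ordering can be seen with plain `sorted`. This sketch only illustrates that difference; it does not reproduce µsort's full sort key, and the sample names are made up.

```python
names = ["frog", "Zebra", "FROG", "apple", "Frog"]

# Default string ordering compares code points, so uppercase sorts first.
case_sensitive = sorted(names)

# Case-insensitive ordering keeps the spellings of "frog" adjacent.
# sorted() is stable, so entries with equal keys keep their input order.
case_insensitive = sorted(names, key=str.casefold)

print(case_sensitive)    # ['FROG', 'Frog', 'Zebra', 'apple', 'frog']
print(case_insensitive)  # ['apple', 'frog', 'FROG', 'Frog', 'Zebra']
```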
Test Plan: S271899
Reviewed By: lisroach
Differential Revision: D36402110
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78973
Approved by: https://github.com/osalpekar
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72237
Add a generic zip file reader/writer to torch.package to remove the dependency on torch for uses of package that involve neither TorchScript nor tensors. This also lets users derive their own classes from the zip file reader/writer classes to customize serialization/deserialization when performance requires it.
https://www.internalfb.com/intern/diff/D35423079/ was reverted because this refactor changed where most of the implementation components of PackageExporter/PackageImporter, such as ModuleActionType_, are imported from.
This diff also updates the import paths for these components so that they point to the correct file, unlike D35423079.
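To illustrate the shape of such a torch-free reader/writer pair and a derived class, here is a sketch built on the standard `zipfile` module. All class and method names are hypothetical; they are not the classes added by this change.

```python
import io
import zipfile


class SimpleZipWriter:
    """Hypothetical torch-free writer; not the class added by this PR."""

    def __init__(self, file):
        self._zip = zipfile.ZipFile(file, "w")

    def write_record(self, name, data):
        self._zip.writestr(name, data)

    def close(self):
        self._zip.close()


class SimpleZipReader:
    """Hypothetical torch-free reader matching SimpleZipWriter."""

    def __init__(self, file):
        self._zip = zipfile.ZipFile(file)

    def get_record(self, name):
        return self._zip.read(name)

    def has_record(self, name):
        return name in self._zip.namelist()


class CompressingZipWriter(SimpleZipWriter):
    """Derived writer that changes how each record is stored."""

    def write_record(self, name, data):
        self._zip.writestr(name, data, compress_type=zipfile.ZIP_DEFLATED)


buffer = io.BytesIO()
writer = CompressingZipWriter(buffer)
writer.write_record("pkg/resource.txt", b"hello " * 100)
writer.close()
reader = SimpleZipReader(buffer)
```

The derived writer overrides a single method, which is the kind of customization the summary describes.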
Test Plan: Imported from OSS
Reviewed By: malfet
Differential Revision: D35423079
Pulled By: PaliC
fbshipit-source-id: 31abc4364d5fd007911cfb67cf36ebfac5d786f4
(cherry picked from commit 023b0d1445e0b1e1bb7a03c660cd62eb9d26d2a6)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74610
Add the Python version to the exported package and read it on import, as requested in this GitHub issue: https://github.com/pytorch/pytorch/issues/74068
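A rough sketch of the write-on-export, read-on-import round trip, using only the standard library. The record name `.data/python_version` and the helper names are assumptions made for this example; the commit message does not state them.

```python
import io
import platform
import zipfile

# Assumed record name for this sketch; the commit message does not state it.
PYTHON_VERSION_RECORD = ".data/python_version"


def export_package(buffer):
    """Write a toy package that records the exporting interpreter's version."""
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(PYTHON_VERSION_RECORD, platform.python_version())


def read_python_version(buffer):
    """Return the recorded version, or None for packages without the record."""
    with zipfile.ZipFile(buffer) as archive:
        if PYTHON_VERSION_RECORD not in archive.namelist():
            return None
        return archive.read(PYTHON_VERSION_RECORD).decode("utf-8")


package = io.BytesIO()
export_package(package)
recorded_version = read_python_version(package)
```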
ghstack-source-id: 152003088
Test Plan: CI Tests
Reviewed By: PaliC
Differential Revision: D35062709
fbshipit-source-id: 04091a1255a09b96255112a60d31df127c424193
(cherry picked from commit ed39fd54b8b20918dac89a2873ecccf06aafd724)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61147
Basic tooling to enable users to see what is inside of a PackageExporter. Added methods:
- `externed/interned/mocked/denied_list()`: returns list of modules which are currently in the specified category
- `relied_on_by(module_name)`: returns list of modules which rely on `module_name`
- `dependency_graph_str()`: returns string format of graph for users. Example of output:
```
digraph G {
rankdir = LR;
node [shape=box];
"<res.foo.pkl>" -> "foo";
"foo" -> "torch.package";
"foo" -> "time";
"foo" -> "sentencepiece";
"foo" -> "package_top";
}
```
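For illustration, output in this format can be produced from a plain list of edges. The two functions below are hypothetical stand-ins operating on an edge list; they are not the PackageExporter methods themselves.

```python
def dependency_graph_to_dot(edges):
    """Render (source, target) pairs in the DOT layout shown above."""
    lines = ["digraph G {", "rankdir = LR;", "node [shape=box];"]
    for source, target in edges:
        lines.append(f'"{source}" -> "{target}";')
    lines.append("}")
    return "\n".join(lines)


def relied_on_by(edges, module_name):
    """Return the nodes that have an edge pointing at `module_name`."""
    return [source for source, target in edges if target == module_name]


edges = [
    ("<res.foo.pkl>", "foo"),
    ("foo", "torch.package"),
    ("foo", "time"),
    ("foo", "sentencepiece"),
    ("foo", "package_top"),
]
dot_text = dependency_graph_to_dot(edges)
```

The resulting string can be passed to Graphviz for rendering.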
Test Plan: Imported from OSS
Reviewed By: suo
Differential Revision: D29559683
Pulled By: Lilyjjo
fbshipit-source-id: 5dff4d04af911a9c9fdd0d100420f1382eaef46e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61145
Remove 'verbose' mode from PackageExporter as people have complained that it is not useful.
Test Plan: Imported from OSS
Reviewed By: suo
Differential Revision: D29559681
Pulled By: Lilyjjo
fbshipit-source-id: eadb1a3a25fadc64119334a09bf1fa4b355b1edd
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57341
Require that users be explicit about what they are going to be
interning. There are a lot of changes that are enabled by this. The new
overall scheme is:
PackageExporter maintains a dependency graph. Users can add to it,
either explicitly (by issuing a `save_*` call) or implicitly (through
dependency resolution). Users can also specify what action to take when
PackageExporter encounters a module (deny, intern, mock, extern).
Nothing (except pickles, though that can be changed with a small amount
of work) is written to the zip archive until we are finalizing the
package. At that point, we consult the dependency graph and write out
the package exactly as it tells us to.
This accomplishes two things:
1. We can gather up *all* packaging errors instead of showing them one at a time.
2. We require that users be explicit about what's going in packages, which is a common request.
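The scheme can be modeled with a toy exporter that records actions and dependencies, defers all writing to a finalize step, and reports every problem at once. `ToyExporter`, `Action`, and `PackagingError` are hypothetical names for this sketch; this is not PackageExporter's implementation.

```python
from enum import Enum


class Action(Enum):
    INTERN = "intern"
    EXTERN = "extern"
    MOCK = "mock"
    DENY = "deny"


class PackagingError(Exception):
    pass


class ToyExporter:
    """Hypothetical model of the scheme; not PackageExporter itself."""

    def __init__(self):
        self.actions = {}       # module name -> Action chosen by the user
        self.dependencies = {}  # module name -> modules it depends on
        self.written = []       # stands in for the zip archive

    def set_action(self, module_name, action):
        self.actions[module_name] = action

    def add_module(self, module_name, depends_on=()):
        self.dependencies.setdefault(module_name, set()).update(depends_on)
        for dep in depends_on:
            self.dependencies.setdefault(dep, set())

    def finalize(self):
        # Collect every problem first, so the user sees them all at once.
        errors = []
        for name in sorted(self.dependencies):
            action = self.actions.get(name)
            if action is None:
                errors.append(f"{name}: no action specified")
            elif action is Action.DENY:
                errors.append(f"{name}: module is denied")
        if errors:
            raise PackagingError("\n".join(errors))
        # Only now is anything "written", exactly as the graph dictates.
        for name in sorted(self.dependencies):
            self.written.append((name, self.actions[name].value))


exporter = ToyExporter()
exporter.add_module("foo", depends_on=["time", "secrets"])
exporter.set_action("foo", Action.INTERN)
exporter.set_action("secrets", Action.DENY)
report = ""
try:
    exporter.finalize()  # "time" has no action and "secrets" is denied
except PackagingError as error:
    report = str(error)
```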
Differential Revision: D28114185
Test Plan: Imported from OSS
Reviewed By: SplitInfinity
Pulled By: suo
fbshipit-source-id: fa1abf1c26be42b14c7e7cf3403ecf336ad4fc12
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58000
Directly overriding save_extern and save_mock may mess with our
invariants in weird ways. This is less pronounced now, but once we
switch to graph-based dependency management things will get broken
subtly if people fail to call `super()`.
It is better to add hook support, which reflects that a user callback can
really only perform a side effect. As a bonus, people are likely familiar
with the pattern from `nn.Module` hooks.
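A sketch of the hook pattern, loosely modeled on how `nn.Module` hooks return a removable handle. `HookedExporter`, `register_extern_hook`, and the hook signature are assumptions for illustration only.

```python
class RemovableHandle:
    """Lets the caller unregister a hook it added earlier."""

    def __init__(self, hooks, hook_id):
        self._hooks = hooks
        self._hook_id = hook_id

    def remove(self):
        self._hooks.pop(self._hook_id, None)


class HookedExporter:
    """Hypothetical exporter; only the hook mechanism is modeled."""

    def __init__(self):
        self._extern_hooks = {}
        self._next_hook_id = 0
        self.externed = []

    def register_extern_hook(self, hook):
        hook_id = self._next_hook_id
        self._next_hook_id += 1
        self._extern_hooks[hook_id] = hook
        return RemovableHandle(self._extern_hooks, hook_id)

    def extern(self, module_name):
        # The exporter does its own bookkeeping first, so a hook cannot
        # skip it the way an override that forgets super() could.
        self.externed.append(module_name)
        for hook in list(self._extern_hooks.values()):
            hook(self, module_name)


seen = []
exporter = HookedExporter()
handle = exporter.register_extern_hook(lambda exp, name: seen.append(name))
exporter.extern("numpy")
handle.remove()
exporter.extern("scipy")
```

Because the hook runs after the exporter's own bookkeeping, it can observe the event but cannot break the invariant.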
Differential Revision: D28339191
Test Plan: Imported from OSS
Reviewed By: SplitInfinity
Pulled By: suo
fbshipit-source-id: 63ffd39d2dcb1a7524f3c2c6a23bd399e754cc44
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57339
After the `intern` changes, we will no longer eagerly write to the package
archive so `file_structure` as written doesn't make much sense.
Differential Revision: D28114187
Test Plan: Imported from OSS
Reviewed By: anjali411
Pulled By: suo
fbshipit-source-id: 875595db933e9d1b2fdde907b086889cc977e92f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55255
This allows packaged code to detect whether or not it is used in a
packaged context, and do different things depending on that. An example
where this might be useful is to control dynamic dependency loading
depending on whether or not something is packaged.
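A sketch of how packaged code might branch on this. It assumes the marker is a module-level attribute named `__torch_package__`, which this commit message does not name; the helper functions are hypothetical.

```python
def is_packaged(module_globals):
    """Return True if the module namespace carries the assumed marker."""
    return "__torch_package__" in module_globals


def dependency_loading_mode(module_globals):
    """Only attempt dynamic dependency loading outside of a package."""
    if is_packaged(module_globals):
        return "static"
    return "dynamic"


# A regular module namespace has no marker.
unpackaged_mode = dependency_loading_mode({"__name__": "my_module"})
# Simulate the namespace of a module that was loaded from a package.
packaged_mode = dependency_loading_mode({"__torch_package__": True})
```

In real module code the argument would be `globals()`.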
Test Plan: Imported from OSS
Reviewed By: Lilyjjo
Differential Revision: D27544245
Pulled By: suo
fbshipit-source-id: 55d44ef57281524b8d9ab890bd387de97f20bd9f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54303
**Summary**
Creating temporary files can cause problems in fbcode. This commit
updates the packaging tests so that exporters write to a memory
buffer when tests run in fbcode.
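The switch between a temporary file and a memory buffer can be sketched with the standard library. The `in_memory` flag stands in for whatever check the tests use to detect fbcode, and the helper names are hypothetical.

```python
import io
import os
import tempfile
import zipfile
from contextlib import contextmanager


@contextmanager
def package_target(in_memory):
    """Yield a BytesIO buffer or a path inside a temporary directory."""
    if in_memory:
        yield io.BytesIO()
    else:
        with tempfile.TemporaryDirectory() as tmp:
            yield os.path.join(tmp, "package.zip")


def round_trip(in_memory):
    """Write one record to the chosen target and read it back."""
    with package_target(in_memory) as target:
        with zipfile.ZipFile(target, "w") as archive:
            archive.writestr("resource.txt", "hello")
        with zipfile.ZipFile(target) as archive:
            return archive.read("resource.txt")


from_buffer = round_trip(in_memory=True)
from_file = round_trip(in_memory=False)
```

Both targets go through the same write and read code, so a test body does not need to know which one it received.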
**Test Plan**
Continuous integration.
Test Plan: Imported from OSS
Reviewed By: suo
Differential Revision: D27180839
Pulled By: SplitInfinity
fbshipit-source-id: 75689d59448de2cd1595ef0ecec69e1bbcf9a96f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53749
Split up tests into cases that cover specific functionality. Goals:
1. Avoid the omnibus test file mess (see: test_jit.py) by imposing early
structure and deliberately avoiding a generic TestPackage test case.
2. Encourage testing of individual APIs and components by example.
3. Hide the fake modules we created for these tests in their own folder.
You can either run the test files individually, or still use
test/test_package.py like before.
This also formats all the tests with isort and black.
Test Plan: Imported from OSS
Reviewed By: SplitInfinity
Differential Revision: D26958535
Pulled By: suo
fbshipit-source-id: 8a63048b95ca71f4f1aa94e53c48442686076034