Commit Graph

131 Commits

8f1c3c68d3 [BE] Use nested namespaces in .cpp/.cu files (#92100)
Since we now live in a C++17 world.

This is a functional no-op, just:
- `s/namespace at { namespace native {/namespace at::native {/`
- `s/namespace torch { namespace jit {/namespace torch::jit {/`
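
For illustration, a hedged before/after of the substitution (the function name is made up):

```cpp
// Before (pre-C++17 nesting):
namespace at { namespace native {
void example_kernel();
}} // namespace at::native

// After (C++17 nested namespace definition):
namespace at::native {
void example_kernel();
} // namespace at::native
```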

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92100
Approved by: https://github.com/izaitsevfb
2023-01-13 16:32:34 +00:00
3ac6106523 Add out of bounds checks inside irparser.cpp and unpickler.cpp (#91401)
Hi!

I've been fuzzing different PyTorch modules and found a few crashes.

Inside unpickler.cpp and irparser.cpp there are a few places where `.at()` and `.pop_back()` are called without first checking the target container's size. The missing checks lead to an out-of-bounds element access in the case of `.at()`, and to undefined behavior when `.pop_back()`/`.pop()` is called on an empty `stack_`.

Crash-files:

1. Crash location: `unpickler.cpp:439` (Call to `.at(idx)` with idx that exceeds `memo_table_` size).
    - Reproduce the crash: `/message_deserialize_fuzz /homedir/crash-5695ad5b2921127775d4137ee02e23834a0bedc4`
    - Crash file: [crash-5695ad5b2921127775d4137ee02e23834a0bedc4.zip](https://github.com/pytorch/pytorch/files/10308463/crash-5695ad5b2921127775d4137ee02e23834a0bedc4.zip)
    - ASAN report: [asan-report-crash-5695ad5b2921127775d4137ee02e23834a0bedc4.log](https://github.com/pytorch/pytorch/files/10308612/asan-report-crash-5695ad5b2921127775d4137ee02e23834a0bedc4.log)

2. Crash location: `irparser.cpp:504` (Call to `.at(idx)` with idx that exceeds `schema->returns()` size).
    - Reproduce the crash: `/irparser_fuzz /homedir/crash-779ecab3d637c8c87de21e23dddb9def82a26792`
    - Crash file: [crash-779ecab3d637c8c87de21e23dddb9def82a26792.zip](https://github.com/pytorch/pytorch/files/10308475/crash-779ecab3d637c8c87de21e23dddb9def82a26792.zip)
    - ASAN report: [asan-report-crash-779ecab3d637c8c87de21e23dddb9def82a26792.log](https://github.com/pytorch/pytorch/files/10308611/asan-report-crash-779ecab3d637c8c87de21e23dddb9def82a26792.log)

3. Crash location: `unpickler.cpp:451` (Call to `.pop_back()` with empty `stack_`).
    - Reproduce the crash: `/message_deserialize_fuzz /homedir/crash-735acc19c9f39b9bbb5667878af995c9167da37f`
    - Crash file: [crash-735acc19c9f39b9bbb5667878af995c9167da37f.zip](https://github.com/pytorch/pytorch/files/10308565/crash-735acc19c9f39b9bbb5667878af995c9167da37f.zip)
    - ASAN report: [asan-report-crash-735acc19c9f39b9bbb5667878af995c9167da37f.log](https://github.com/pytorch/pytorch/files/10308558/asan-report-crash-735acc19c9f39b9bbb5667878af995c9167da37f.log)

4. Crash location: `unpickler.cpp:469` (Call to `.pop()` with empty `stack_`).
    - Reproduce the crash: `/message_deserialize_fuzz /homedir/crash-b552f1a2bbba5eab0f6aeba58475175b18e5b1b9`
    - Crash file: [crash-b552f1a2bbba5eab0f6aeba58475175b18e5b1b9.zip](https://github.com/pytorch/pytorch/files/10308568/crash-b552f1a2bbba5eab0f6aeba58475175b18e5b1b9.zip)
    - ASAN report: [asan-report-crash-b552f1a2bbba5eab0f6aeba58475175b18e5b1b9.log](https://github.com/pytorch/pytorch/files/10308555/asan-report-crash-b552f1a2bbba5eab0f6aeba58475175b18e5b1b9.log)

The provided patch adds missing size checks.
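
For illustration, a hedged sketch of the kind of guard such a patch adds (names are made up, not the exact diff):

```cpp
#include <c10/util/Exception.h>
#include <vector>

// Validate the container size before calling .at()/.pop_back(), so
// malformed input fails with an error instead of undefined behavior.
void pop_two(std::vector<int>& stack) {
  TORCH_CHECK(
      stack.size() >= 2,
      "Parsing error: expected at least 2 elements on the stack");
  stack.pop_back();
  stack.pop_back();
}
```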

### How to reproduce

1. To reproduce the crashes, use the provided Docker setup: [Dockerfile](https://github.com/ispras/oss-sydr-fuzz/blob/master/projects/pytorch/Dockerfile)

2. Build the container: `docker build -t oss-sydr-fuzz-pytorch-reproduce .`

3. Copy the crash file to the current directory

4. Run the container: ``docker run --privileged --network host -v `pwd`:/homedir --rm -it oss-sydr-fuzz-pytorch-reproduce /bin/bash``

5. Execute the fuzz targets with the given arguments

After execution completes you will see the ASAN reports.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91401
Approved by: https://github.com/davidberard98
2022-12-29 19:58:29 +00:00
3916d7a575 Apply modernize-use-emplace to aten, c10, torch (#91077)
Apply the clang-tidy check modernize-use-emplace. Constructing elements in place is slightly more efficient and is the recommended style in the parts of the codebase covered by clang-tidy; this change manually applies the check to the rest of the codebase. Pinging @ezyang as this is related to my other PRs he reviewed, like #89000
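
For illustration, a minimal standalone before/after (not taken from the diff):

```cpp
#include <string>
#include <vector>

void demo() {
  std::vector<std::string> names;
  // Before: constructs a temporary string, then moves it into the vector.
  names.push_back(std::string("alpha"));
  // After (modernize-use-emplace): constructs the element in place.
  names.emplace_back("alpha");
}
```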

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91077
Approved by: https://github.com/ezyang
2022-12-19 07:49:56 +00:00
f74946324e [fix] allow saving python attr on Tensor and Parameter via torch.save (#81616)
Fixes: https://github.com/pytorch/pytorch/issues/72129

TODO:
* [x] Fix for Parameter

Benchmark
(Measurable diff for small tensors)
```
[-------------- Save and Load --------------]
                    |  After PR  |  Before PR
1 threads: ----------------------------------
      ()            |    111.7   |     106.9
      (4, 4)        |    114.4   |     109.2
      (128, 128)    |    135.2   |     128.3
      (1024, 1024)  |   1431.9   |    1431.3

Times are in microseconds (us).
```

<details>

<summary> Benchmark Script </summary>

```python
import torch
from torch.testing._internal.common_utils import BytesIOContext
from torch.utils import benchmark
import pickle

shapes = ((), (4, 4), (128, 128), (1024, 1024))

sizes = [1, 64, 1024, 10000]
results = []

def save_load_fn(t):
    with BytesIOContext() as f:
        torch.save(t, f)
        f.seek(0)
        torch.load(f)

for shape in shapes:
    t = torch.randn(shape)
    label = 'Save and Load'
    sub_label = f'{shape}'
    results.append(benchmark.Timer(
        stmt='save_load_fn(t)',
        globals={'t': t, 'save_load_fn':save_load_fn},
        label=label,
        sub_label=sub_label,
        description='Before PR',
    ).blocked_autorange(min_run_time=2))

compare = benchmark.Compare(results)
compare.print()

with open('before_pr.pkl', 'wb') as f:
    pickle.dump(results, f)

# with open('after_pr.pkl', 'rb') as f:
#     after_pr = pickle.load(f)

# with open('before_pr.pkl', 'rb') as f:
#     before_pr = pickle.load(f)

# compare = benchmark.Compare(after_pr + before_pr)
# compare.print()
```

</details>

NOTE: **BC-Breaking**: After this PR, all tensors (including regular tensors) will be serialized using `_rebuild_from_type_v2`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/81616
Approved by: https://github.com/albanD, https://github.com/kurtamohler
2022-11-11 21:11:12 +00:00
eb9b156019 [fix] MathBits: serialization (#88182)
Fixes #81690

TODO:

* [x] C++ Unpickler Fix (locally tested pickled in Python and unpickled in C++)
* [x] C++ Pickler Fix (locally tested pickled in C++ and unpickled in Python)
* [x] Do quant_tensor, sparse_tensor, etc. require similar changes? (Sparse and Quant don't need this)
* [x] Add Comments
* [x] How to make sure C++ and Python are in sync? (Functions in `pickler.h` help in getting and setting Tensor Metadata (math-bits for now) on a tensor. They are the only place which should handle this.)

Notes:
Quantized tensors don't support complex dtypes, and for float they segfault with `_neg_view`: https://github.com/pytorch/pytorch/issues/88484

Sparse Tensor:
```python
>>> a = torch.tensor([[0, 2.], [3j, 0]]).to_sparse()
>>> a.conj().is_conj()
False
>>> a._neg_view()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NotImplementedError: Cannot access storage of SparseTensorImpl
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88182
Approved by: https://github.com/ezyang, https://github.com/anjali411
2022-11-09 17:15:12 +00:00
78a0ca29d9 Revert "[fix] allow saving python attr on Tensor and Parameter via torch.save (#81616)"
This reverts commit 54b6188cc6dee45b775d688223b847dc8ea85bff.

Reverted https://github.com/pytorch/pytorch/pull/81616 on behalf of https://github.com/mehtanirav due to Internal publishing is broken
2022-11-07 18:51:16 +00:00
54b6188cc6 [fix] allow saving python attr on Tensor and Parameter via torch.save (#81616)
Fixes: https://github.com/pytorch/pytorch/issues/72129

TODO:
* [x] Fix for Parameter

Benchmark
(Measurable diff for small tensors)
```
[-------------- Save and Load --------------]
                    |  After PR  |  Before PR
1 threads: ----------------------------------
      ()            |    111.7   |     106.9
      (4, 4)        |    114.4   |     109.2
      (128, 128)    |    135.2   |     128.3
      (1024, 1024)  |   1431.9   |    1431.3

Times are in microseconds (us).
```

<details>

<summary> Benchmark Script </summary>

```python
import torch
from torch.testing._internal.common_utils import BytesIOContext
from torch.utils import benchmark
import pickle

shapes = ((), (4, 4), (128, 128), (1024, 1024))

sizes = [1, 64, 1024, 10000]
results = []

def save_load_fn(t):
    with BytesIOContext() as f:
        torch.save(t, f)
        f.seek(0)
        torch.load(f)

for shape in shapes:
    t = torch.randn(shape)
    label = 'Save and Load'
    sub_label = f'{shape}'
    results.append(benchmark.Timer(
        stmt='save_load_fn(t)',
        globals={'t': t, 'save_load_fn':save_load_fn},
        label=label,
        sub_label=sub_label,
        description='Before PR',
    ).blocked_autorange(min_run_time=2))

compare = benchmark.Compare(results)
compare.print()

with open('before_pr.pkl', 'wb') as f:
    pickle.dump(results, f)

# with open('after_pr.pkl', 'rb') as f:
#     after_pr = pickle.load(f)

# with open('before_pr.pkl', 'rb') as f:
#     before_pr = pickle.load(f)

# compare = benchmark.Compare(after_pr + before_pr)
# compare.print()
```

</details>

NOTE: **BC-Breaking**: After this PR, all tensors (including regular tensors) will be serialized using `_rebuild_from_type_v2`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/81616
Approved by: https://github.com/albanD, https://github.com/kurtamohler
2022-11-03 09:57:47 +00:00
c141f28b64 Fix compilation warning and spurious print (#87297)
Fixes a compilation warning, makes this warning an error, and removes a spurious print.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87297
Approved by: https://github.com/malfet
2022-10-19 20:56:37 +00:00
61b4e8a7bf More SymFloat support (#85411)
- Support storing SymFloat in IValue
- Add SymFloat to JIT type system (erases to float)
- Printing support for SymFloat
- add/sub/mul/truediv operator support for SymFloat (see the sketch after this list)
- Support truediv on integers; it returns a SymFloat
- Support parsing SymFloat from Python object
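
A minimal sketch of the plain (non-symbolic) arithmetic from the list above, assuming only the public `c10::SymFloat` header:

```cpp
#include <c10/core/SymFloat.h>
#include <iostream>

int main() {
  // Plain doubles wrapped as SymFloat; no symbolic node involved.
  c10::SymFloat a(3.0);
  c10::SymFloat b(2.0);
  c10::SymFloat c = a * b + a / b;  // operator support from this PR
  std::cout << c << "\n";          // printing support from this PR
  return 0;
}
```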

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85411
Approved by: https://github.com/albanD
2022-09-22 08:07:22 +00:00
d438e86719 Add assertions to fix torch::jit::load bugs (#79192)
Fixes #77561, #77563, #77573 and #77575

Pull Request resolved: https://github.com/pytorch/pytorch/pull/79192
Approved by: https://github.com/Gamrix
2022-08-11 18:03:00 +00:00
6592259ea5 [HPU] Enable torch.jit.load for HPU (#81759)
Per the torch.jit.load documentation, all previously saved modules,
irrespective of their device, are first loaded onto the CPU and then
moved to the devices they were saved from. So far, the supported
devices included only CPU and CUDA. To enable torch.jit.load for
HPU, an additional check for HPU is introduced.
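
For illustration, a hedged C++ sketch (`model.pt` is a placeholder path):

```cpp
#include <torch/script.h>

int main() {
  // Modules are first deserialized onto CPU; passing an explicit device
  // pins the load there, while omitting it moves each tensor back to
  // the device it was saved from (now including HPU).
  torch::jit::Module module = torch::jit::load("model.pt", torch::kCPU);
  return 0;
}
```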

Pull Request resolved: https://github.com/pytorch/pytorch/pull/81759
Approved by: https://github.com/eellison
2022-08-01 09:28:44 +00:00
0b9eb93fe9 Make type_resolver_ null error have more useful info (#81466)
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81466
Approved by: https://github.com/yinghai
2022-07-15 05:58:37 +00:00
94eba341f8 Revert RPC Meta device support
This reverts commit 058be5f16293357b4bd2bc087f1f54cd8c17f468 and 2e2200d76c611eed8d0aed2ff93e0adc344407d2.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77875

Approved by: https://github.com/mrshenli
2022-05-19 23:47:47 +00:00
0a14a4c280 Register prims as operators.
This makes prims look as if they were defined in native_functions.yaml
but they're still all written in Python.  You now need to give a full
schema string for your prims.  The returned prim object is now a
torch.ops.prim overload (prims are not allowed to be overloaded,
so we return the overload, not the overload packet, for speed).

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77117

Approved by: https://github.com/mruberry, https://github.com/albanD
2022-05-11 16:38:14 +00:00
2e2200d76c RPC Meta device support
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76882

Approved by: https://github.com/jamesr66a, https://github.com/mrshenli
2022-05-10 01:26:59 +00:00
5177f95d21 Introducing SymInt to Pytorch (for tracing size arithmetic) (master rebase) (#74861)
Summary:
This PR introduces the `SymInt` type to PyTorch, which will be used by LTC and AOTAutograd for tracing size arithmetic, along with tests.
`SymInt` is a C++ union structure [int64_t, SymbolicIntNode*]: it wraps an int64_t field whose value is either a real int or an index into a list of `shared_ptr<SymbolicIntNode>`.
This PR doesn't add any support for actually tracing symbolic ints, i.e. `data_` for now can only contain real ints.
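
For illustration only, a toy sketch of the int-or-index idea (this is not the real `SymInt` class):

```cpp
#include <cstdint>

// A value is either a plain int64_t or an index into a side table of
// shared_ptr<SymbolicIntNode>, distinguished here by a high tag bit.
class ToySymInt {
 public:
  explicit ToySymInt(int64_t v) : data_(v) {}
  static ToySymInt fromNodeIndex(int64_t idx) { return ToySymInt(idx | kTag); }

  bool is_symbolic() const { return (data_ & kTag) != 0; }
  int64_t as_int() const { return data_; }              // valid when !is_symbolic()
  int64_t node_index() const { return data_ & ~kTag; }  // valid when is_symbolic()

 private:
  static constexpr int64_t kTag = int64_t(1) << 62;
  int64_t data_;
};
```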

```
Goal 1: just to show we can add a type to PyTorch core. (wraps int) LANDEABLE
Finalize the naming - symint
Want the name to be short
Does invoke “size” - NO
SInt/SymInt/SymbolicInt
SInt could mean signed int
sym_int or symint or SymInt (originally it was “int”; capitalized implies object semantics, whereas lowercase implies value semantics)
JIT schema - symint
C++ - symint
```

See more details here: https://docs.google.com/document/d/1iiLNwR5ohAsw_ymfnOpDsyF6L9RTUaHMpD8d843f63f2aYLw-jxEw

Pull Request resolved: https://github.com/pytorch/pytorch/pull/74861

Reviewed By: qihqi, ngimel

Differential Revision: D35226230

Pulled By: Krovatkin

fbshipit-source-id: 34acf342bd50fcaa4d8d5dd49c2fd6a98823a5b3
(cherry picked from commit 218643f63ef181cabb92d13a6e837eb64f2dda3c)
2022-03-31 21:59:59 +00:00
99db53eaa7 Jit save/load meta tensors (#73435)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73435

Add support for torch.jit.save and torch.jit.load for meta tensors, for use in meta-tensor-based XL weights.

Test Plan:
```
buck test //caffe2/test:jit -- -r .*save_load_meta_tensors.*
```

Reviewed By: houseroad

Differential Revision: D34479511

fbshipit-source-id: 117ccb12e9e427290a17297204508ec85495e3be
(cherry picked from commit ee9aaaf8208d6c9530c828a4b9f28cf2cca05630)
2022-03-10 19:48:29 +00:00
fe277b8717 [jit][edge] Migrate to TypeFactory for jit types on mobile (#71516)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71516

Mobile should be able to construct dynamic types by default.
ghstack-source-id: 147498365

Test Plan:
CI.

**-48KB** binary size reduction for igios BSB.
UMBEX link: https://www.internalfb.com/intern/unigraph/explorer/?jsgq_traversal_spec=%7B%22builds%22%3A[%22bsb%3A422553426218394%5Cu0040base%22%2C%22bsb%3A422553426218394%5Cu0040diff%22]%7D&unigraph_project=UnigraphProjectMbex&is_mbex_redirected

Reviewed By: iseeyuan

Differential Revision: D33673958

fbshipit-source-id: 8600c04ae929283681971aae264d3774188df9cd
(cherry picked from commit 64ebcec09e69d2eff64fdbf926fb43d3b67f99b2)
2022-01-26 07:32:04 +00:00
4f35b9144c [jit][edge] Migrate ListType to DynamicType on mobile. (#70212)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70212

Use DynamicType instead of ListType all over the place in Lite Interpreter. Namely we need to modify the following places:
1. Type parser which produces the Type constants.
2. IValue::type() which returns reflected Type from IValues.
3. Helper functions to construct the container value.
4. Typechecks which test whether a type instance is a particular container type.
ghstack-source-id: 146818619

Test Plan: CI

Reviewed By: iseeyuan

Differential Revision: D33176931

fbshipit-source-id: 9144787f5fc4778538e5c665946974eb6171a2e6
2022-01-11 10:57:53 -08:00
b12ca69179 [jit][edge] Migrate DictType to DynamicType on mobile. (#70202)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70202

Use DynamicType instead of DictType all over the place in Lite Interpreter. Namely we need to modify the following places:
1. Type parser which produces the Type constants.
2. IValue::type() which returns reflected Type from IValues.
3. Helper functions to construct the container value.
4. Typechecks which test whether a type instance is a particular container type.
ghstack-source-id: 146735648

Test Plan: no behavior change.

Reviewed By: iseeyuan

Differential Revision: D33137257

fbshipit-source-id: 971bf431658c422ea9353cc32cdab66e98876e9d
2022-01-10 15:55:29 -08:00
30699cbfd5 Reland D33284352: [jit][edge] Do not reuse mobile type parser for all unpicklers. (#71048)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71048

reland D33284352 (0a921ba0d0)
ghstack-source-id: 146735646

Test Plan: All Github CI: ciflow rerun -l ciflow/all

Reviewed By: gmagogsfm

Differential Revision: D33489731

fbshipit-source-id: 3e160209a1abb193ad3eed3018054aa7d331025e
2022-01-10 12:42:23 -08:00
9762aa0fdc Revert D33284352: [jit][edge] Do not reuse mobile type parser for all unpicklers.
Test Plan: revert-hammer

Differential Revision:
D33284352 (0a921ba0d0)

Original commit changeset: 997c4f110b36

Original Phabricator Diff: D33284352 (0a921ba0d0)

fbshipit-source-id: af316727442a64f1ae40d53d7a9d26ec550d634e
2022-01-07 19:58:03 -08:00
0a921ba0d0 [jit][edge] Do not reuse mobile type parser for all unpicklers. (#70338)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70338

Today Unpickler is used by both server and mobile for deserializing models, and it always falls back to the mobile type parser when the user provides no type resolver. However, this is not intended, as the server and mobile type parsers support different things. In this diff we provide a default fallback using the script parser and opt out of it for all mobile cases.
ghstack-source-id: 146727330

(Note: this ignores all push blocking failures!)

Test Plan: CI

Reviewed By: iseeyuan

Differential Revision: D33284352

fbshipit-source-id: 997c4f110b36eee6596e8f23f6a87bf91a4197ed
2022-01-07 18:35:32 -08:00
649dda9fee [jit] Implement DynamicType for TorchScript runtime. (#68136)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68136

DynamicType is an extension to existing server JIT types. Today using normal server types on Edge is a bit problematic because in embedded environments we don't need the full spectrum of types but we still build with these unneeded dependencies.

Is it possible to just get rid of unneeded JIT types from Edge builds? It's not easy to do at this moment. For example, on Edge we don't support Union type, but we have to pull in the dependency of Union type because Optional type, which inherits from Union type, is supported, so Union type has to be included in the build. Although we could split Union type and Optional type, it could be argued that the root cause is that every time we use anything inheriting from `c10::Type`, we don't have direct evidence of how many dependencies we pull in, because we make virtual calls and don't know what exactly we're calling with server JIT types. If we don't know, it's highly possible that the linker doesn't know either, so it cannot effectively strip unused methods.

To address this problem, one option is to implement a separate `DynamicType` which has simpler behavior and doesn't store different types as different symbols in binary but rather raw data (or "tag"). This could increase the binary size by several KBs, so I included several binary size reductions in the same stack, hoping at least we don't regress the binary size.

Currently `DynamicType` inherits from `c10::Type` because I want to reduce the migration cost of `DynamicType` by making it interfacing with existing server JIT types. In the future `DynamicType` should be implemented as a separate class without relying on `c10::Type` to make things both simpler and leaner.
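
For illustration only, a toy sketch of the tag-plus-data idea (not the real `DynamicType`):

```cpp
#include <cstdint>

// One concrete class holds a tag plus contained types as plain data,
// instead of one virtual subclass per JIT type, so the dependency
// surface is a single type the linker can reason about.
enum class ToyTag : uint8_t { Int, Float, List, Optional };

struct ToyDynamicType {
  ToyTag tag;
  const ToyDynamicType* contained = nullptr;  // element type for List/Optional

  bool equals(const ToyDynamicType& other) const {
    if (tag != other.tag) return false;
    if (contained == nullptr || other.contained == nullptr)
      return contained == other.contained;
    return contained->equals(*other.contained);
  }
};
```
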
ghstack-source-id: 146670522

Test Plan: in the next diff.

Reviewed By: VitalyFedyunin

Differential Revision: D32264615

fbshipit-source-id: 180eb0998a14eacc1d8b28db39870d84fcc17d5b
2022-01-07 11:23:07 -08:00
41959ce77f [JIT] scripting, freezing, serialization for sparse csr (#69555)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69555

1. Implement pickling/unpickling
2. Add `test_freeze_sparse_csr, tests_serialize_sparse_csr` tests

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D33181367

Pulled By: davidberard98

fbshipit-source-id: a15d5193a7b1b1625a27e4af003cec33cdbc8071
2021-12-20 11:13:34 -08:00
b0817e19e0 [PyTorch] Avoid reading file from stream for 0 byte Tensor storage (#67787)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67787

First noticed in https://fb.workplace.com/groups/pytorch.edge.team/posts/952737705280969/ - basically, one of the speech models has ~400 zero-byte tensor files, so we're paying the cost of looking each one up in the archive and reading nothing from it.

Turns out that there's a fairly simple fix to avoid reading a 0 byte tensor. Once we notice that it's 0 bytes, just use the default `DataPtr` instead of initializing it with 0 bytes read in from the input file stream.
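
A hedged sketch of the shape of the fix (the function name is made up, not the actual patch):

```cpp
#include <c10/core/Allocator.h>
#include <c10/core/CPUAllocator.h>
#include <c10/core/Device.h>
#include <cstddef>

// For a 0-byte storage, skip the archive lookup/read entirely and hand
// back a default DataPtr.
c10::DataPtr read_storage(size_t nbytes, const c10::Device& device) {
  if (nbytes == 0) {
    return c10::DataPtr(nullptr, device);  // nothing to read
  }
  c10::DataPtr ptr = c10::GetCPUAllocator()->allocate(nbytes);
  // ... in the real reader, the bytes are then copied in from the archive ...
  return ptr;
}
```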

ghstack-source-id: 142025211

Test Plan: CI, and manually ran a couple of production mobile models with bundled inputs. CI will run all production mobile models with bundled inputs.

Reviewed By: swolchok

Differential Revision: D32054983

fbshipit-source-id: 919b0cdbc44bccb8f6cfe0da10ff5474af37fd99
2021-11-09 21:45:05 -08:00
82f7f8d471 [PyTorch] Adopt IValue::toTupleRef() where obvious (#65505)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65505

Generated with

`fastmod -m 'toTuple\(\)(\s*)->' 'toTupleRef()${1}.'`

, followed by

`fastmod '(std::move\(.*)toTupleRef\(\).' '${1}toTuple()->'`

to unbreak 2 callsites.
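
For illustration, a hedged example of the adopted pattern:

```cpp
#include <ATen/core/ivalue.h>
#include <cstddef>

size_t tuple_size(const c10::IValue& iv) {
  // Before: iv.toTuple() materializes a new intrusive_ptr (refcount bump)
  // that is immediately dereferenced and discarded.
  // After: toTupleRef() borrows the existing Tuple, taking no ownership.
  auto& tuple = iv.toTupleRef();
  return tuple.elements().size();
}
```
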
ghstack-source-id: 142065835

Test Plan: CI

Reviewed By: gchanan

Differential Revision: D31131025

fbshipit-source-id: 54457ae5bbeb38db9c7f196d469b98521c3d3f34
2021-11-02 10:22:18 -07:00
f65b4b7a4c [PyTorch] Avoid refcount bump in UnionType::canHoldType (#66693)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66693

Passing a `TypePtr` by value causes an unnecessary refcount
bump. We don't need to take ownership, so `const Type&` is all we
need.

I considered providing a compatibility shim that takes `const
TypePtr&`, but doing so is dangerous because a
copy is required to convert from a more specific pointer like
`NoneTypePtr`.
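
A hedged example of a call under the new signature:

```cpp
#include <ATen/core/jit_type.h>

bool holds_none(const c10::UnionType& u) {
  // Dereferencing the singleton NoneTypePtr costs nothing; passing a
  // TypePtr by value would have bumped (and then dropped) its refcount.
  return u.canHoldType(*c10::NoneType::get());
}
```
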
ghstack-source-id: 140737081

Test Plan: CI

Reviewed By: suo

Differential Revision: D31691869

fbshipit-source-id: f766ce3234a28771c2a9ca4c284eb3f96993a3d0
2021-10-18 17:39:59 -07:00
e88d1c4f10 [PyTorch] Add tuple inline storage (#64066)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64066

I noticed a bunch of time being spent heap-allocating Tuples
in the unpickler. 1-, 2-, and 3-element Tuples are apparently common
enough that they get their own bytecode instructions, so I decided to
try also giving them their own representation. We store up to 3
IValues inline in `Tuple` rather than doing a second heap allocation
for a `std::vector<IValue>`.
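
For illustration only, a toy small-buffer container in the same spirit (not the real `TupleElements`):

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Up to kInline elements live in an inline array; larger tuples fall
// back to a heap-allocated vector, mirroring the two-allocation case.
template <typename T>
class SmallTuple {
 public:
  explicit SmallTuple(std::vector<T> v) {
    if (v.size() <= kInline) {
      size_ = v.size();
      for (size_t i = 0; i < size_; ++i) inline_[i] = std::move(v[i]);
    } else {
      overflow_ = std::move(v);
      size_ = overflow_.size();
    }
  }
  const T& operator[](size_t i) const {
    return size_ <= kInline ? inline_[i] : overflow_[i];
  }
  size_t size() const { return size_; }

 private:
  static constexpr size_t kInline = 3;
  T inline_[kInline]{};
  std::vector<T> overflow_;
  size_t size_ = 0;
};
```
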
ghstack-source-id: 140695395

Test Plan:
Added automated tests for TupleElements.

Pixel 3 before: https://www.internalfb.com/intern/aibench/details/761596366576284
Pixel 3 after: https://www.internalfb.com/intern/aibench/details/591414145082422
We went from 347 ms to 302 ms.

Reviewed By: dhruvbird

Differential Revision: D30592622

fbshipit-source-id: 93625c54c9dca5f765ef6d5c191944179cb281a8
2021-10-15 12:16:51 -07:00
176d3c6fb4 [PyTorch] Fix many Tuple::elements() callsites (#64065)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64065

It is only safe to mutate Tuple elements if you are the sole owner
of the tuple. The most efficient way to do this, then, is
`std::move(*std::move(tupleIValue).toTuple()).elements()` (the
innermost move allows `IValue::toTuple()` to avoid a refcount bump and
the outermost move allows the element vector to be moved out of the
tuple), but many callsites write simply
`tupleIValue.toTuple().elements()`, which incurs many extra refcount
bumps.
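
For illustration, the cheap pattern as a hedged standalone snippet:

```cpp
#include <ATen/core/ivalue.h>

void consume(c10::IValue tuple_ivalue) {
  // Sole-owner fast path: the inner move lets toTuple() skip a refcount
  // bump, and the outer move lets the element storage be moved out of
  // the tuple instead of copied.
  auto elems = std::move(*std::move(tuple_ivalue).toTuple()).elements();
  (void)elems;
}
```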

ghstack-source-id: 139468088

Test Plan: CI

Reviewed By: ezyang

Differential Revision: D30592621

fbshipit-source-id: e8312de866de09b9ea2a62e5128cbf403ee16f09
2021-10-01 11:36:05 -07:00
6831d8e379 Support Union in TorchScript (#64234)
Summary:
This PR was created to replace the https://github.com/pytorch/pytorch/pull/53180 PR stack, which has all the review discussions. The replacement was needed because of a messy Sandcastle issue.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64234

Reviewed By: gmagogsfm

Differential Revision: D30656444

Pulled By: ansley

fbshipit-source-id: 77536c8bcc88162e2c72636026ca3c16891d669a
2021-09-03 06:12:24 -07:00
16ecdbbaa2 [PyTorch] Fix missing move in unpickler (#63974)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63974

Saw some time spent in this during model loading; no reason not to use a move here.
ghstack-source-id: 136760979

Test Plan: Re-profile model loading on devserver; IValue copy ctor time has gone down

Reviewed By: dhruvbird

Differential Revision: D30548923

fbshipit-source-id: 42000f2e18582762b43353cca10ae094833de3b3
2021-08-30 09:38:55 -07:00
7ebdbf82dc add support for sending cpu sparse tensors over rpc (#62794)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62794

This PR updates JIT serialization to support pickling sparse COO tensors.
It also updates message.cpp to support sparse COO tensors.
A bug was filed about this a few years ago: https://github.com/pytorch/pytorch/issues/30807.

I tested the fix by adding sparse tensor tests to rpc_test.py and dist_autograd_test.py.

cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse agolynski SciPioneer H-Huang mrzzd cbalioglu gcramer23 gmagogsfm

Test Plan: Imported from OSS

Reviewed By: soulitzer

Differential Revision: D30608848

Pulled By: gcramer23

fbshipit-source-id: 629ba8e4a3d8365875a709c9b87447c7a71204fb
2021-08-29 11:35:00 -07:00
0dd90cceaf [package] track storages across lifetime of PackageExporter (#59735)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59735

1. Fixes the ABA storage-identity problem during serialization for `torch.package` by keeping references to serialized storages for the lifetime of `PackageExporter`, preventing reuse of memory addresses. Achieved by extending the logic used to solve the same issue on mobile.
2. Adds determinism to naming scheme of serialized storages in export code paths which utilize `tensor_cdata_naming_scheme`(introduced 2nd mapping in `StorageContext`, now maps `storage cdata ptr` -> `unique id`, `unique id` -> `c10::Storage`)
3. Additionally uses presence of a storage in the `StorageContext` instance as marker for if a storage has been serialized or not, removing the need to scan the `PythonStreamWriter` for presence of the storage's serialization file

Test Plan: Imported from OSS

Reviewed By: suo

Differential Revision: D29075276

Pulled By: Lilyjjo

fbshipit-source-id: 15a5c30b1de99c5bd7079388f2db9b6ece2eca12
2021-06-29 14:16:54 -07:00
9403fe17ce [torch.package/TorchScript] logic to enable sharing of tensors on load (#57573)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/57573

Test Plan: Imported from OSS

Reviewed By: suo

Differential Revision: D28226975

Pulled By: Lilyjjo

fbshipit-source-id: bc8cb3e8052fa18336c437e0601d8b0028fd1895
2021-05-14 08:21:43 -07:00
4cb534f92e Make PyTorch code-base clang-tidy compliant (#56892)
Summary:
This is an automatic change generated by the following script:
```
#!/usr/bin/env python3
from subprocess import check_output, check_call
import os

def get_compiled_files_list():
    import json
    with open("build/compile_commands.json") as f:
        data = json.load(f)
    files = [os.path.relpath(node['file']) for node in data]
    for idx, fname in enumerate(files):
        if fname.startswith('build/') and fname.endswith('.DEFAULT.cpp'):
            files[idx] = fname[len('build/'):-len('.DEFAULT.cpp')]
    return files

def run_clang_tidy(fname):
    check_call(["python3", "tools/clang_tidy.py", "-c", "build", "-x", fname,"-s"])
    changes = check_output(["git", "ls-files", "-m"])
    if len(changes) == 0:
        return
    check_call(["git", "commit","--all", "-m", f"NOLINT stubs for {fname}"])

def main():
    git_files = check_output(["git", "ls-files"]).decode("ascii").split("\n")
    compiled_files = get_compiled_files_list()
    for idx, fname in enumerate(git_files):
        if fname not in compiled_files:
            continue
        if fname.startswith("caffe2/contrib/aten/"):
            continue
        print(f"[{idx}/{len(git_files)}] Processing {fname}")
        run_clang_tidy(fname)

if __name__ == "__main__":
    main()
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/56892

Reviewed By: H-Huang

Differential Revision: D27991944

Pulled By: malfet

fbshipit-source-id: 5415e1eb2c1b34319a4f03024bfaa087007d7179
2021-04-28 14:10:25 -07:00
2130f4ccc4 Use c10::ArrayRef instead of std::vector for the jit::unpickle's tensor_table. (#54428)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54428

Using c10::ArrayRef as the parameter type makes the API more flexible and allows the caller to leverage small-buffer optimizations (e.g. c10::SmallVector, std::array) for performance-critical cases.
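
For illustration, a hedged example of the flexibility this buys (`sum` is a made-up function):

```cpp
#include <c10/util/ArrayRef.h>
#include <c10/util/SmallVector.h>
#include <array>
#include <cstdint>
#include <vector>

// ArrayRef is a non-owning view, so one signature accepts many containers.
int64_t sum(c10::ArrayRef<int64_t> xs) {
  int64_t total = 0;
  for (auto x : xs) total += x;
  return total;
}

void demo() {
  std::vector<int64_t> v{1, 2, 3};
  std::array<int64_t, 2> a{4, 5};
  c10::SmallVector<int64_t, 4> s{6, 7};
  sum(v);
  sum(a);
  sum(s);
  sum({8, 9});  // even a braced list works
}
```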

Test Plan: No behavioral changes. Run the existing unit and integration tests.

Reviewed By: suo

Differential Revision: D27232222

fbshipit-source-id: 7b13bc6bd02257097ca119077028fbccc68cc925
2021-03-22 15:31:47 -07:00
d8730194e7 use device methods (#52899)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/52899

Reviewed By: zou3519

Differential Revision: D26752203

Pulled By: albanD

fbshipit-source-id: eaef89377999b20655fe85d5a38ca7a2c5882de7
2021-03-02 20:14:23 -08:00
0e2520baae [PyTorch] Don't read 1 char per iteration in Unpickler::readString (#51901)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51901

It's much more efficient to read multiple chars with 1 memcpy than to call `read<char>` multiple times.
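
For illustration only, a toy contrast of the two strategies (not the actual Unpickler code):

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// One bulk memcpy out of a buffered input.
std::string read_string_fast(const char* buf, size_t len) {
  std::string result(len, '\0');
  std::memcpy(&result[0], buf, len);
  return result;
}

// One "read" per character: same result, far more per-byte overhead.
std::string read_string_slow(const char* buf, size_t len) {
  std::string result;
  for (size_t i = 0; i < len; ++i) {
    result.push_back(buf[i]);
  }
  return result;
}
```
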
ghstack-source-id: 121278774

Test Plan:
Run WireSerializerBench before/after for small tensors:

```
/tmp/WireSerializerBench.Reader --real_data /mnt/homedir/hwwang/test_serialized_api_request --real_pytorch_api_request --bm_regex '[Ss]mall'
```

Before:
```
DeSerializeWire(Small)                                       7.65us  130.65K
DeSerializeWire(small_Zstd)                      100.49%     7.62us  131.29K
DeSerializeWire(small_Snappy)                    100.49%     7.62us  131.29K
DeSerializeWireIValue(Small)                      82.89%     9.23us  108.30K
DeSerializeWireIValue(small_Zstd)                 82.87%     9.24us  108.27K
DeSerializeWireIValue(small_Snappy)               82.33%     9.30us  107.57K
DeSerializeC2ToBlob(small_NoCompress)           1150.28%   665.39ns    1.50M
DeSerializeC2ToBlob(small_Zstd)                 1149.70%   665.72ns    1.50M
DeSerializeC2ToBlob(small_Zstd_Fast)            1150.94%   665.00ns    1.50M
DeSerializeC2ToBlob(Small_Snappy)               1151.70%   664.57ns    1.50M
DeSerializeC2ToString(small)                    9297.81%    82.32ns   12.15M
```

After:
```
DeSerializeWire(Small)                                       6.86us  145.84K
DeSerializeWire(small_Zstd)                      100.52%     6.82us  146.60K
DeSerializeWire(small_Snappy)                    100.13%     6.85us  146.03K
DeSerializeWireIValue(Small)                      83.94%     8.17us  122.42K
DeSerializeWireIValue(small_Zstd)                 84.00%     8.16us  122.50K
DeSerializeWireIValue(small_Snappy)               84.53%     8.11us  123.28K
DeSerializeC2ToBlob(small_NoCompress)           1019.48%   672.58ns    1.49M
DeSerializeC2ToBlob(small_Zstd)                 1020.03%   672.23ns    1.49M
DeSerializeC2ToBlob(small_Zstd_Fast)            1020.59%   671.85ns    1.49M
DeSerializeC2ToBlob(Small_Snappy)               1020.30%   672.05ns    1.49M
DeSerializeC2ToString(small)                    7709.63%    88.94ns   11.24M
```

Second run after to demonstrate it wasn't just variance:

```
DeSerializeWire(Small)                                       6.92us  144.57K
DeSerializeWire(small_Zstd)                       99.24%     6.97us  143.47K
DeSerializeWire(small_Snappy)                     99.58%     6.95us  143.97K
DeSerializeWireIValue(Small)                      84.83%     8.15us  122.63K
DeSerializeWireIValue(small_Zstd)                 84.72%     8.16us  122.49K
DeSerializeWireIValue(small_Snappy)               84.59%     8.18us  122.29K
DeSerializeC2ToBlob(small_NoCompress)           1031.03%   670.89ns    1.49M
DeSerializeC2ToBlob(small_Zstd)                 1030.64%   671.14ns    1.49M
DeSerializeC2ToBlob(small_Zstd_Fast)            1013.39%   682.57ns    1.47M
DeSerializeC2ToBlob(Small_Snappy)               1013.95%   682.19ns    1.47M
DeSerializeC2ToString(small)                    8155.98%    84.81ns   11.79M
```

By the way, this gets us closer to deserialization parity for the real data sample included in D26049387:

baseline:
```
DeSerializeWire(RealData)                                    7.34ms   136.24
DeSerializeWire(RealData_Zstd)                    99.95%     7.34ms   136.17
DeSerializeWire(RealData_Snappy)                 100.09%     7.33ms   136.36
DeSerializeWireIValue(RealData)                   82.69%     8.88ms   112.65
DeSerializeWireIValue(RealData_Zstd)              82.76%     8.87ms   112.75
DeSerializeWireIValue(RealData_Snappy)            82.68%     8.88ms   112.64
DeSerializeC2ToBlob(RealData_NoCompress)         116.87%     6.28ms   159.23
DeSerializeC2ToBlob(RealData_Zstd)               117.33%     6.26ms   159.85
DeSerializeC2ToBlob(RealData_Zstd_Fast)          117.38%     6.25ms   159.91
DeSerializeC2ToBlob(RealData_Snappy)             117.61%     6.24ms   160.23
DeSerializeC2ToString(RealData)                 4571.81%   160.55us    6.23K
```

with this diff:
```
DeSerializeWire(RealData)                                    6.57ms   152.17
DeSerializeWire(RealData_Zstd)                   100.17%     6.56ms   152.43
DeSerializeWire(RealData_Snappy)                 100.09%     6.57ms   152.31
DeSerializeWireIValue(RealData)                   83.06%     7.91ms   126.40
DeSerializeWireIValue(RealData_Zstd)              83.16%     7.90ms   126.54
DeSerializeWireIValue(RealData_Snappy)            83.22%     7.90ms   126.64
DeSerializeC2ToBlob(RealData_NoCompress)         104.02%     6.32ms   158.29
DeSerializeC2ToBlob(RealData_Zstd)               103.46%     6.35ms   157.43
DeSerializeC2ToBlob(RealData_Zstd_Fast)          104.64%     6.28ms   159.23
DeSerializeC2ToBlob(RealData_Snappy)             104.65%     6.28ms   159.25
DeSerializeC2ToString(RealData)                 4051.03%   162.22us    6.16K
```

Reviewed By: qizzzh

Differential Revision: D26321083

fbshipit-source-id: 92d45e760580bb290078ddac84128174daef0e55
2021-02-17 11:00:48 -08:00
680c4ce1dd [PyTorch] Avoid some extra intrusive_ptr<Tuple> copies in Unpickler (#51902)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51902

These seem like straightforward improvements. (I don't have measurements; feel free to reject if you're skeptical)
ghstack-source-id: 121278775

Test Plan: CI

Reviewed By: qizzzh

Differential Revision: D26322438

fbshipit-source-id: d393a32cc34bb68bc4f804f4b1cc5a8af27763c9
2021-02-17 07:31:58 -08:00
18a7ec7d7d Update the JIT complex type name to be consistent with Python (#51476)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/51476

Test Plan: Imported from OSS

Reviewed By: ezyang

Differential Revision: D26179237

Pulled By: anjali411

fbshipit-source-id: 6a5c60c8545eb42416583836b8038ceffd3f3244
2021-02-03 09:59:08 -08:00
7328710cbc [PyTorch][codemod] Replace immediately-dereferenced cast calls w/castRaw (#50229)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50229

`fastmod -m 'cast(<((at|c10)::)?\w+Type>\(\)\s*)->' 'castRaw${1}->'`

Presuming it builds, this is a safe change: the result of `cast()`
wasn't being saved anywhere, so we didn't need it and can use a raw
pointer instead of a new `shared_ptr`.
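
A hedged example of the pattern (the function is made up):

```cpp
#include <ATen/core/jit_type.h>

bool is_tensor_type(const c10::TypePtr& t) {
  // Before: t->cast<c10::TensorType>() allocates a shared_ptr that is
  // immediately checked and thrown away.
  // After: castRaw returns a plain pointer (or nullptr), no refcounting.
  return t->castRaw<c10::TensorType>() != nullptr;
}
```
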
ghstack-source-id: 120769170

Test Plan: CI

Reviewed By: SplitInfinity

Differential Revision: D25837494

fbshipit-source-id: 46319100dc0dfc78f6d2b45148207f83481f2ada
2021-02-01 23:12:07 -08:00
f9f22c8b5c Add serialization logic for complex numbers (#51287)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51287

This reverts commit dfdb1547b9c1934904bfd137b4007d6a46a6f597.

Test Plan: Imported from OSS

Reviewed By: SplitInfinity

Differential Revision: D26131165

Pulled By: anjali411

fbshipit-source-id: 047167fac594ddb670c5e169446e90e74991679a
2021-01-28 17:25:35 -08:00
dfdb1547b9 Revert D26094906: Add serialization logic for complex numbers
Test Plan: revert-hammer

Differential Revision:
D26094906 (2de4ecd4eb)

Original commit changeset: 7b2614f3ee4a

fbshipit-source-id: 6f32a9fc6bb2a904ca1a282bbc6b2df0aee50068
2021-01-27 19:44:26 -08:00
2de4ecd4eb Add serialization logic for complex numbers (#50885)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/50885

Test Plan: Imported from OSS

Reviewed By: SplitInfinity

Differential Revision: D26094906

Pulled By: anjali411

fbshipit-source-id: 7b2614f3ee4a30c4b4cf04aaa3432988b38a0721
2021-01-27 15:19:36 -08:00
5a5bca8ef0 [AutoAccept][Codemod][FBSourceClangFormatLinter] Daily arc lint --take CLANGFORMAT
Reviewed By: zertosh

Differential Revision: D26043955

fbshipit-source-id: 0a5740a82bdd3ac7bd1665a325ff7fe79488ccea
2021-01-25 04:20:03 -08:00
9ac30d96aa Add complex IValues (#50883)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/50883

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D26003682

Pulled By: anjali411

fbshipit-source-id: f02967d2d236d740cd8647891f732f1d63098d3e
2021-01-22 09:44:40 -08:00
4a8ef4525e Add new backend type for Intel heterogeneous computation platform. (#49786)
Summary:
Add a new device type, 'XPU' ('xpu' in lowercase), to PyTorch. Changes are needed in code related to the device model and kernel dispatch, e.g. DeviceType, Backend, and DispatchKey.

https://github.com/pytorch/pytorch/issues/48246
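
For illustration, a hedged example of addressing the new device type:

```cpp
#include <c10/core/Device.h>
#include <iostream>

int main() {
  c10::Device dev(c10::DeviceType::XPU, 0);
  std::cout << dev << "\n";  // prints "xpu:0"
  return 0;
}
```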

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49786

Reviewed By: mrshenli

Differential Revision: D25893962

Pulled By: ezyang

fbshipit-source-id: 7ff0a316ee34cf0ed6fc7ead08ecdeb7df4b0052
2021-01-20 08:15:18 -08:00
480a756194 [PyTorch] IValue::toTensor can now return const Tensor& (#48868)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48868

Building on the previous diff, we can make `toTensor()` return a
`const Tensor&`, which should make it easier to avoid reference
counting.
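
A hedged example of the borrow this enables (the function is made up):

```cpp
#include <ATen/core/ivalue.h>

int64_t first_dim(const c10::IValue& iv) {
  // Binding to const Tensor& uses the new const-lvalue overload:
  // no Tensor copy, no refcount bump.
  const at::Tensor& t = iv.toTensor();
  return t.dim() > 0 ? t.size(0) : 0;
}
```
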
ghstack-source-id: 119327372

Test Plan: internal benchmarks.

Reviewed By: bwasti

Differential Revision: D25325379

fbshipit-source-id: ca699632901691bcee432f595f75b0a4416d55dd
2021-01-06 08:40:50 -08:00
c619892482 Fix errata (#49903)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49903

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D25718411

Pulled By: ansley

fbshipit-source-id: 0cc365c5a53077752dc1c5a5c4a65b873baa3604
2020-12-28 20:40:41 -08:00