Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18832
ghimport-source-id: fde4ad90541ba52dfa02bdd83466f17e6541e535
Stack from [ghstack](https://github.com/ezyang/ghstack):
* #18833 [STACK] Cache device on TensorImpl; clean up TensorImpl constructors.
* **#18832 [STACK] Disallow changing the device of a tensor via set_.**
* #18831 [STACK] Stop swapping in Storages of the wrong device for Tensors.
This is necessary to cache the device on a TensorImpl.
Differential Revision: D14766231
fbshipit-source-id: bba61634b2d6252ac0697b96033c9eea680956e8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18831
ghimport-source-id: 2741e0d70ebe2c2217572c3af54ddd9d2047e342
Stack from [ghstack](https://github.com/ezyang/ghstack):
* #18833 [STACK] Cache device on TensorImpl; clean up TensorImpl constructors.
* #18832 [STACK] Disallow changing the device of a tensor via set_.
* **#18831 [STACK] Stop swapping in Storages of the wrong device for Tensors.**
This is necessary to support device caching, see https://github.com/pytorch/pytorch/pull/18751 and https://github.com/pytorch/pytorch/pull/18578.
In library code, we potentially swap in Storages with the wrong device when device_guard is False. This happens as follows with "view-like" operations.
1) We allocate a tensor on the 'wrong' device (because device_guard is false).
2) We swap out the 'wrong' storage with the 'right' storage using e.g. THCTensor_setStorage.
Instead, we can just construct the Tensor with the correct Storage from the beginning. This is what we do with 'view'.
Note there are two other "view-like" cases where this happens:
1) unfold
2) set_()
Because these aren't performance critical, I just added the device_guard instead of applying the above correction.
For completeness, this also includes a test that all `device_guard: false` functions behave properly under these conditions.
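A minimal sketch of the kind of check such a test performs (not the actual test in the PR; it assumes a machine with at least two CUDA devices):
```python
import torch

# Sketch: a "view-like" op invoked while a *different* device is current must
# still return a tensor on the source tensor's device, even though the op is
# declared with `device_guard: false`.
if torch.cuda.device_count() >= 2:
    a = torch.randn(4, 4, device='cuda:1')
    with torch.cuda.device(0):        # make cuda:0 the current device
        b = a.view(16)
        c = a.unfold(0, 2, 2)
    assert b.device == a.device
    assert c.device == a.device
```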
Reviewed By: dzhulgakov
Differential Revision: D14766232
fbshipit-source-id: 0865c3ddae3f415df5da7a9869b1ea9f210e81bc
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17991
Changes:
- Breaks BC: `Tensor::type()` now returns `DeprecatedTypeProperties&` rather than `Type&`.
- Added `DeprecatedTypeProperties`; it serves as a temporary replacement for `Type` as the return value of `Tensor::type()`. This contributes to making `Type` just for dispatch purposes so that we can make it dtype agnostic.
- `Tensor::dispatch_type()` now returns `Type&`, like `Tensor::type()` used to.
- Changed call sites of `Tensor::type()` appropriately.
Reviewed By: ezyang
Differential Revision: D14443117
fbshipit-source-id: 239ccb7a09626279a71d1a37f8f82e7f57bf7d9e
Summary:
DLPack can have non-strided tensors, which are represented by a nullptr in place of dl_tensor.strides.
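For context, a minimal round-trip sketch (not the PR's test) of the path this fix covers: a contiguous tensor exported through DLPack may carry `strides == nullptr` on the C side, and `from_dlpack` must still reconstruct it.
```python
import torch
from torch.utils.dlpack import to_dlpack, from_dlpack

t = torch.arange(6).reshape(2, 3)      # contiguous, so strides may be omitted
capsule = to_dlpack(t)                 # export as a DLPack capsule
u = from_dlpack(capsule)               # import; must handle missing strides
assert torch.equal(t, u)
```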
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18510
Differential Revision: D14647328
Pulled By: bwasti
fbshipit-source-id: 5364282810a5772cfc2319fc8133fe86fdd84dd1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18838
It turns out that we don't have a shape inference function for the `Split` op at all. This diff adds one.
Reviewed By: bertmaher
Differential Revision: D14766871
fbshipit-source-id: 535cb4f24bdada603c76579e00e7a39aee93e19f
Summary:
Since parameter.data creates a new torch.Tensor each time it is accessed, we currently get duplicate tensors when calling _unique_state_dict. Deduplicate before creating the new tensors.
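A rough sketch of the deduplication idea (the helper name and exact mechanics are illustrative, not the code in the PR): deduplicate on the shared parameter object itself rather than on the fresh Tensor returned by `.data`.
```python
import torch

def unique_state_dict_sketch(module):
    # keep_vars=True returns the Parameters themselves, so shared parameters
    # are the *same* object and can be deduplicated by identity.
    seen = set()
    result = {}
    for name, param in module.state_dict(keep_vars=True).items():
        if id(param) in seen:
            continue
        seen.add(id(param))
        result[name] = param.detach()   # create the new tensor only once
    return result
```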
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18139
Reviewed By: dzhulgakov
Differential Revision: D14511262
Pulled By: houseroad
fbshipit-source-id: cb69795d0b6509721220650bbb19edeb3459a503
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17026
D14013931 was for FC. This diff makes similar optimizations for Conv.
A subtle difference is that in FC, once we fold col_offset into the bias during the pre-processing step, we can treat everything as if A_zero_offset == 0 (symmetric quantization of A).
In Conv, we can't do this because padding still needs to use the original A_zero_offset.
From the requantization point of view, once col_offset is folded into the bias, we can treat it as if we're doing symmetric A quantization.
But for steps involving padding, like im2col, im2col fused with packing, and direct conv for depth-wise/group convolution, we still need to pass the original A_zero_offset.
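A hedged numeric sketch of the folding argument above (notation is mine, not fbgemm's): the term involving A_zero_offset and the column sums of the weights depends only on the weights, so it can be folded into the bias ahead of time, after which requantization can proceed as if A were symmetrically quantized. Padding-related steps still need the true A_zero_offset.
```python
import numpy as np

K = 8
A = np.random.randint(0, 255, size=K).astype(np.int64)      # activations
B = np.random.randint(-127, 128, size=K).astype(np.int64)   # weights
A_zero, B_zero, bias = 3, 5, 7

# Reference accumulation with explicit zero points.
ref = ((A - A_zero) * (B - B_zero)).sum() + bias

# Fold the activation-zero-point term into the bias at pre-processing time.
folded_bias = bias - A_zero * B.sum() + K * A_zero * B_zero
acc = (A * (B - B_zero)).sum() + folded_bias                 # as if A_zero == 0
assert acc == ref
```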
Reviewed By: jianyuh
Differential Revision: D14020276
fbshipit-source-id: c29caefd1127bbc6aff0e9d535939bb0c1ecb66c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18826
ghimport-source-id: 7ffa3bc7ef7402a6d6eb6ba5849e197019d77bf8
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18826 [jit] run cpp tests for non-cuda builds in test_jit.py**
We did all the work of nicely separating our cpp tests that don't require
CUDA, but they aren't run from test_jit.py if CUDA is missing.
Reviewed By: ZolotukhinM
Differential Revision: D14766287
fbshipit-source-id: 9326b3a5c90f6c20fc8cfaf1a1885a363b91f30a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18379
ghimport-source-id: 9895ecc1ff7897e98853dc00675341f36726e7c7
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18379 Enforce single parent for script submodules**
* #18378 Unify namespace of script::Module
* #18314 Add ability to specialize class types to ArgumentSpec
* #18226 Add Slot type to abstract the raw pointers being used for slots.
The assumption that a ScriptModule has a single parent is present in
our serialization format, and likely a few other places. It is not
enforced on creation of script module hierarchies, though, meaning that
problems it causes (e.g. replicating a module twice in the output
format) will not be caught until much later in the development cycle.
This patch enforces the property when a submodule is registered.
It also removes NamedModule since it is no longer necessary in this regime.
This will also allow easy discovery of a module's fully-qualified name
without needing to traverse the Module hierarchy.
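A hypothetical illustration of the property being enforced, from the user's side (whether and how an error is raised is an assumption, not taken from the patch): registering an already-parented submodule under a second parent should be rejected at registration time rather than surfacing later during serialization.
```python
import torch

class Child(torch.jit.ScriptModule):
    def __init__(self):
        super(Child, self).__init__()

class Parent(torch.jit.ScriptModule):
    def __init__(self, child):
        super(Parent, self).__init__()
        self.child = child   # registers `child` as a submodule

shared = Child()
p1 = Parent(shared)
try:
    p2 = Parent(shared)      # a second parent for the same submodule
except RuntimeError as e:
    print("rejected:", e)
```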
Differential Revision: D14603722
fbshipit-source-id: 63ab5d0cccf7d66c7833e0adf9023024ca9607cb
Summary:
Per our offline discussion, allow Tensors, ints, and floats to be cast to bool when used in a conditional.
Fix for https://github.com/pytorch/pytorch/issues/18381
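A small sketch of what this enables inside TorchScript (the function here is made up for illustration):
```python
import torch

@torch.jit.script
def pick(x, flag):
    # type: (Tensor, int) -> Tensor
    if flag:           # int implicitly cast to bool
        return x
    if x.sum():        # zero-dim Tensor implicitly cast to bool
        return x + 1
    return x - 1

print(pick(torch.zeros(2), 0))
```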
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18755
Reviewed By: driazati
Differential Revision: D14752476
Pulled By: eellison
fbshipit-source-id: 149960c92afcf7e4cc4997bccc57f4e911118ff1
Summary:
Fix the layernorm formula when weight and bias are passed in.
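For reference, a quick sketch of the intended formula (normalize first, then apply the affine parameters), checked against `F.layer_norm`:
```python
import torch
import torch.nn.functional as F

x = torch.randn(4, 8)
weight, bias, eps = torch.randn(8), torch.randn(8), 1e-5

mean = x.mean(dim=-1, keepdim=True)
var = x.var(dim=-1, unbiased=False, keepdim=True)
y_ref = (x - mean) / torch.sqrt(var + eps) * weight + bias

assert torch.allclose(F.layer_norm(x, (8,), weight, bias, eps), y_ref, atol=1e-5)
```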
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18233
Differential Revision: D14760375
Pulled By: wanchaol
fbshipit-source-id: d6bd3b137bc04c391aa5c24d021d1f811ba2a877
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18378
ghimport-source-id: 55c29bb436a2153d29ff2f4488d99d8863c187b1
Stack from [ghstack](https://github.com/ezyang/ghstack):
* #18379 Enforce single parent for script submodules
* **#18378 Unify namespace of script::Module**
* #18314 Add ability to specialize class types to ArgumentSpec
* #18226 Add Slot type to abstract the raw pointers being used for slots.
This removes individual OrderedDicts in favor of a single unified
namespace for all things in a script::Module. This removes a whole
class of bugs where, for instance, both a method and a parameter could get
the same name.
Since we no longer have to expose OrderedDict::Item objects, a lot of
downstream code can be simplified.
We also no longer double-store names (both in the key of the dictionary
and in the object itself).
Differential Revision: D14603723
fbshipit-source-id: b5f7551b3074679623edd6ea70269830353b4d4c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18648
ghimport-source-id: 1cf4a8fe91492621e02217f38cae5d7e0699fb05
Stack from [ghstack](https://github.com/ezyang/ghstack):
* #18661 Step 7: remove _unique
* #18655 Step 6: Rename _unique2 to unique and add int? dim
* #18654 Step 5: remove _unque_dim in favor of unique_dim
* #18651 Step 4: add support for unique with dim=None
* #18650 Step 3: Add support for return_counts to torch.unique for dim not None
* #18649 Step 2: Rename _unique_dim2_temporary_will_remove_soon to unique_dim
* **#18648 Step 1: Secretly add return_counts to unique, and refactor unique_dim for performance**
`unique` is fragile. I previously tried to change it in #18391 and #17097; both passed OSS tests but were eventually reverted due to internal failures. My earlier refactoring of unique in #18459 is based on #18391, and after #18391 was reverted, I could not continue on #18459. To continue working on #18459, #18391, and #17097 without worrying about internal failures, I am suggesting the following steps for improving `unique` and `unique_dim`. soumith Please take this; there is no need to put #18391 back.
The motivation is basically to move forward as much as possible without causing any internal failures, so I will try to divide the work into steps, sorted from low to high probability of internal failure. (I don't know what the internal failure is, so I have to guess.) Let's merge this PR stack one by one until we encounter an internal failure; a sketch of the target API follows the step list below.
Step 1: Create two new ATen operators, `_unique2_temporary_will_remove_soon` and `_unique_dim2_temporary_will_remove_soon`, and keep `_unique` and `_unique_dim` unchanged. The backends of these two operators and of `_unique` and `_unique_dim` are all the same; the only difference is that the temporary ones support `return_counts` while `_unique` and `_unique_dim` do not. Step one is mostly #18391 + #18459. The cuda8 errors have been fixed. At this point there is no user-visible API change, so no docs are updated. `torch.unique` does not support `return_counts` yet, and `return_counts` is tested through the newly added temporary operators. This step just adds two new ATen operators, so there shouldn't be any internal failure.
Step 2: Rename `_unique_dim2_temporary_will_remove_soon` to `unique_dim`. This should cause no internal failure either, because there is no change to existing operators. The only thing to worry about is deleting `unique_dim` from the Python side, because we don't want users to use it. At this point, C++ users have `return_counts` support for `unique_dim`.
Step 3: Update the docs of `torch.unique` and use `unique_dim` inside `torch.unique` to support `return_counts`. In the docs, we should say that `torch.unique` with `dim=None` does not support `return_counts` yet. This might cause internal failure.
Step 4: Rename `_unique2_temporary_will_remove_soon` to `_unique2` and use `_unique2` inside `torch.unique` to support `return_counts`. Update the docs to say that `torch.unique` with `dim=None` now supports `return_counts`. This might cause internal failure.
Step 5: Remove `_unique_dim`. This might cause internal failure.
Step 6: Rename `_unique2` to `unique` and add an optional `dim` argument to make it look like the signature of Python's `torch.unique`. Inside `torch.unique`, use `unique` and get rid of `unique_dim`. Unbind `unique_dim` totally from Python at codegen. This is likely to cause internal failure.
Step 7: Remove `_unique`. This is very likely to cause internal failure.
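For orientation, here is a small sketch of the end state the steps above work toward, i.e. the target `torch.unique` API after steps 6 and 7 (not what this PR itself exposes):
```python
import torch

x = torch.tensor([1, 2, 2, 3, 3, 3])
values, inverse, counts = torch.unique(
    x, sorted=True, return_inverse=True, return_counts=True)
# values:  tensor([1, 2, 3])
# inverse: tensor([0, 1, 1, 2, 2, 2])
# counts:  tensor([1, 2, 3])
```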
This PR
======
This PR is for step 1. It creates two new ATen operators, `_unique2_temporary_will_remove_soon` and `_unique_dim2_temporary_will_remove_soon`, implements `return_counts` inside them, and refactors for performance improvements.
Please review, ngimel VitalyFedyunin. They are mostly copied from #18391 and #18459, so the review should be easy.
Below is a benchmark on a tensor of shape `torch.Size([15320, 2])`:
Before
---------
```python
print(torch.__version__)
%timeit a.unique(dim=0, sorted=True, return_inverse=False); torch.cuda.synchronize()
%timeit a.unique(dim=0, sorted=True, return_inverse=True); torch.cuda.synchronize()
```
```
1.0.1
192 µs ± 1.61 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
548 ms ± 3.39 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
```
```python
print(torch.__version__)
%timeit a.unique(sorted=True, return_inverse=False); torch.cuda.synchronize()
%timeit a.unique(sorted=True, return_inverse=True); torch.cuda.synchronize()
```
```
1.0.1
226 µs ± 929 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)
302 µs ± 7.06 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
```
After
-------
```python
print(torch.__version__)
%timeit a.unique(dim=0, sorted=True, return_inverse=False); torch.cuda.synchronize()
%timeit a.unique(dim=0, sorted=True, return_inverse=True); torch.cuda.synchronize()
%timeit torch._unique_dim2_temporary_will_remove_soon(a, dim=0, sorted=True, return_inverse=False, return_counts=True); torch.cuda.synchronize()
%timeit torch._unique_dim2_temporary_will_remove_soon(a, dim=0, sorted=True, return_inverse=True, return_counts=True); torch.cuda.synchronize()
```
```
1.1.0a0+83ab8ac
190 µs ± 2.14 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
237 µs ± 1.23 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
219 µs ± 2.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
263 µs ± 1.15 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
```
```python
print(torch.__version__)
%timeit a.unique(sorted=True, return_inverse=False); torch.cuda.synchronize()
%timeit a.unique(sorted=True, return_inverse=True); torch.cuda.synchronize()
%timeit torch._unique2_temporary_will_remove_soon(a, sorted=True, return_inverse=False, return_counts=True); torch.cuda.synchronize()
%timeit torch._unique2_temporary_will_remove_soon(a, sorted=True, return_inverse=True, return_counts=True); torch.cuda.synchronize()
```
```
1.1.0a0+83ab8ac
232 µs ± 2.21 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
301 µs ± 1.65 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
264 µs ± 7.67 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
339 µs ± 9.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
```
Differential Revision: D14730905
fbshipit-source-id: 10026b4b98628a8565cc28a13317d29adf1225cc
Summary:
If the input `network` resides on multiple GPUs, `devices` must be a 2D list with `devices[0]` matching `network`'s devices. See #18591
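A hedged illustration of the required shape (device ids are made up):
```python
# `network` itself lives on cuda:0 and cuda:1, so devices[0] must list exactly
# those devices; each subsequent row gives the devices for one replica.
devices = [
    [0, 1],   # devices[0]: matches the devices `network` already resides on
    [2, 3],   # devices for the next replica
]
```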
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18687
Differential Revision: D14706162
Pulled By: mrshenli
fbshipit-source-id: dca630d3308f2dbcf8b75629c452d7a64092ba42
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18230
Implementing a minimal QTensor API to unblock other workstreams in quantization.
Changes:
- Added Quantizer which represents different quantization schemes
- Added qint8 as a data type for QTensor
- Added a new ScalarType QInt8
- Added QTensorImpl for QTensor
- Added the following user-facing APIs:
  - quantize_linear(scale, zero_point)
  - dequantize()
  - q_scale()
  - q_zero_point()
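A hypothetical usage sketch built from the API names listed above (the exact signatures and whether these are tensor methods are assumptions, and they have since changed):
```python
import torch

x = torch.rand(2, 3)
q = x.quantize_linear(scale=0.1, zero_point=10)   # float Tensor -> QTensor (qint8); hypothetical spelling
print(q.q_scale(), q.q_zero_point())              # per-tensor quantization parameters
y = q.dequantize()                                # QTensor -> float Tensor
```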
Reviewed By: dzhulgakov
Differential Revision: D14524641
fbshipit-source-id: c1c0ae0978fb500d47cdb23fb15b747773429e6c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18560
We have to import the Python protobuf bindings here **before** we load the C++
extension. Otherwise it breaks under certain build conditions when the C++
implementation of protobuf is used. Presumably there is some registry in the
protobuf library and the Python side has to initialize it first, before static
initialization in the Python extension does so. Otherwise, duplicated protobuf
descriptors are created, which can lead to obscure errors like:
Parameter to MergeFrom() must be instance of same class: expected caffe2.NetDef got caffe2.NetDef.
I think it also fixes https://github.com/facebookarchive/caffe2/issues/1573
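A small sketch of the import ordering this change pins down (module names are the usual caffe2 ones; treat the exact spelling as illustrative):
```python
# Importing the pure-Python protobuf definitions first initializes the
# descriptor registry on the Python side...
from caffe2.proto import caffe2_pb2  # noqa: F401

# ...before any module backed by the C++ extension is loaded, so the extension
# does not register duplicate descriptors for the same messages.
from caffe2.python import core, workspace  # noqa: F401
```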
Reviewed By: ezyang, iroot900
Differential Revision: D14622054
fbshipit-source-id: 2499eb88ecdee85ff8d845859048f7ae5da2a480
Summary:
To make test_operators.py more stable. In the future, we will bump this up manually, and I think that's acceptable, since ir_version shouldn't be bumped too often.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18768
Reviewed By: zrphercule
Differential Revision: D14741514
Pulled By: houseroad
fbshipit-source-id: 0369dbc55424e345a113e49fc104a441ea290d58
Summary:
Introduce this check to see whether it will break any existing workflow
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18145
Reviewed By: dzhulgakov
Differential Revision: D14511711
Pulled By: houseroad
fbshipit-source-id: a7bb6ac84c9133fe94d3fe2f1a8566faed14a136
Summary:
The mkldnn-bridge is upgraded in this PR to support DNNLOWP operators.
Meanwhile, APIs in caffe2 have been updated to use the latest version.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16308
Differential Revision: D14697018
Pulled By: yinghai
fbshipit-source-id: ca952589098accb08295fd5aa92924c61e74d69c
Summary:
Fixes: #6469
1. `ATen/native/native_functions.yml` had [dispatch](03e7953a98/aten/src/ATen/native/native_functions.yaml (L451-L455)) variants for `embedding_dense_backward`, however `embedding_backward` explicitly made a [call](03e7953a98/aten/src/ATen/native/Embedding.cpp (L35-L45)) to it, thus leading to an error.
2. In the case of a CUDA tensor, the function used to crash on dereferencing the indices' data [pointer](03e7953a98/aten/src/ATen/native/Embedding.cpp (L93)).
Both have been fixed and checked (on CUDA and CPU) against:
1. As mentioned in the issue
```
import torch
class Test(torch.nn.Module):
    def __init__(self):
        super(Test, self).__init__()
        self.embd = torch.nn.Embedding(1000, 100)
        self.dense = torch.nn.Linear(100, 1)

    def forward(self, inp):
        inp = self.embd(inp)
        return self.dense(inp)

test = Test()
inp = torch.tensor([0,1,2,1,1])
out = test(inp)
raw_loss = out.mean(dim=0)
loss_grad = torch.autograd.grad(outputs=raw_loss,
                                inputs=list(test.parameters()),
                                retain_graph=True, create_graph=True, only_inputs=True)
norm = sum([param.norm()**2 for param in loss_grad])
loss = raw_loss + norm
loss.backward(retain_graph=True)
print(test.embd.weight.grad)
```
2. Test Script
```
import torch
import time
start = time.time()
l = [1,1]*100
input = torch.tensor([[1,0],[1,0]],device='cpu')
embedding_matrix = torch.tensor([[1.0,3.0],[2.0,4]],requires_grad=True,device='cpu')
sq = embedding_matrix * embedding_matrix
emb = torch.nn.functional.embedding(input, sq,scale_grad_by_freq=False)
print('Embedding Matrix')
print(embedding_matrix)
print('-----------------')
sum_ = emb.sum()#prod.sum()
loss_grad, = torch.autograd.grad(outputs=sum_,inputs=embedding_matrix,create_graph=True)
print('Gradient')
print(loss_grad)
print('-----------------')
sum2_ = sum_ + loss_grad.sum()
print(sum2_)
sum2_.backward()
print(embedding_matrix.grad)
print(time.time() - start)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9078
Reviewed By: ezyang
Differential Revision: D14691901
Pulled By: soumith
fbshipit-source-id: 78e2612ba39080be564c876311671eb5a0119a0f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18749
ghimport-source-id: 9026a037f5e11cdb9ccd386f4b6b5768b9c3259b
Stack from [ghstack](https://github.com/ezyang/ghstack):
* #18751 Disallow changing the device of a tensor via set_.
* #18750 Use non-legacy constructors for tensor deserialization.
* **#18749 Add device and dtype to storage.**
The goal here is to fix our serialization, which currently depends on the legacy constructors. Having dtype and device on Storage allows us to use the non-legacy constructors.
This fits somewhat with our goal of removing Storage, by having Storage act like a Tensor.
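A rough sketch of how this helps (attribute spellings here are assumptions): once a Storage knows its own dtype and device, a tensor can be rebuilt from it with the generic constructors instead of the legacy typed ones.
```python
import torch

t = torch.arange(4, dtype=torch.float32)
s = t.storage()
# With dtype/device tracked on the Storage, deserialization can do roughly:
u = torch.tensor([], dtype=s.dtype, device=s.device).set_(s)
assert torch.equal(t, u)
```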
Differential Revision: D14729516
fbshipit-source-id: bf4a3e8669ad4859931f4a3fa56df605cbc08dcb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18750
ghimport-source-id: f1475cfb67841c41d9867d4429ba9125d5c7dd07
Stack from [ghstack](https://github.com/ezyang/ghstack):
* #18751 Disallow changing the device of a tensor via set_.
* **#18750 Use non-legacy constructors for tensor deserialization.**
* #18749 Add device and dtype to storage.
Deserialization currently uses legacy constructors. This is bad because we need to maintain them, but there is a more immediate problem:
1) We are trying to implement device caching on TensorImpl to get rid of a virtual dispatch
2) This doesn't work if one is able to change the device of a Tensor underlying a Variable.
3) Deserialization does 2)
So the plan is to change deserialization, then enforce that we don't change the device out from underneath a Variable.
Differential Revision: D14729513
fbshipit-source-id: 090d6cdb375b94dc1bf4f554b2df243952b8cdc6
Summary:
It's not used and unfold's use of `device_guard: False` is scary.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18773
Differential Revision: D14736526
Pulled By: gchanan
fbshipit-source-id: 6281a284bee45fa5038783e4c1ed4d1ed7ca81ab