1. Update "Package Reference" to "Python API"
2. Add in torchaudio and torchtext reference links so they show up across all docs, not just the main page
3. Add "Other Languages" section and add in C++ docs
Changelog:
- When the number of batches is 1, dispatch to trsm instead of trsm_batched in MAGMA
Test Plan:
- All triangular_solve tests should pass to ensure that the change is valid
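As an illustrative sanity check (not part of the test suite), a single-batch call exercises the case that now dispatches to trsm; this sketch uses the modern `torch.linalg.solve_triangular` spelling of the same operation, and the MAGMA path itself only runs on CUDA:

```python
import torch

# Upper-triangular, well-conditioned system (illustrative sizes).
A = torch.triu(torch.randn(3, 3)) + 3 * torch.eye(3)
b = torch.randn(3, 2)

# Batch size 1 — the case that now dispatches to trsm rather than
# trsm_batched on the MAGMA backend:
x = torch.linalg.solve_triangular(A.unsqueeze(0), b.unsqueeze(0), upper=True)
print(torch.allclose(A @ x[0], b, atol=1e-4))
```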
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23376
This uses master version of sphinxcontrib-katex as it only
recently got prerender support.
Fixes #20984
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Differential Revision: D16582064
Pulled By: ezyang
fbshipit-source-id: 9ef24c5788c19572515ded2db2e8ebfb7a5ed44d
This is temporary, won't be needed with the new serialization format.
But for now, since the main module gets its name from the archive name,
we need this for safety; otherwise something like
`torch.jit.save("torch.pt")` will break things.
ghstack-source-id: f36febe1025ff04e7f79617e548819d4876dc7fa
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23630
Now when initializing a ScriptModule during the torch.jit.load()
process, there is already a C++ module backing it. That means
that setting `training` would overwrite whatever value the initialized
ScriptModule already had.
This PR splits apart the common "set up internal state" part of the
Module __init__ and calls that from ScriptModule.__init__ and
Module.__init__, leaving the "nn.Module-specific" part (setting
`self.training`) for the nn.Module __init__.
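A minimal Python sketch of the split described above (class and helper names here are illustrative, not the actual torch.nn source):

```python
# Common "set up internal state" goes in a helper both __init__s call;
# only nn.Module sets `training`.
class Module:
    def __init__(self):
        self._construct()      # common internal-state setup
        self.training = True   # nn.Module-specific part

    def _construct(self):
        self._parameters = {}
        self._buffers = {}
        self._modules = {}

class ScriptModule(Module):
    def __init__(self):
        # Only the common setup: the C++ module backing a loaded
        # ScriptModule keeps whatever `training` value it already had.
        self._construct()
```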
ghstack-source-id: 9b2ba8a15c43cf230363e4cd10ba4ad3ac4931f7
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23680
* [jit] Support nn.GRU and Make nn.LSTM accept PackedPaddedSequence
* fix
* add link to comments
In max_pool2d, max_pool3d, avg_pool2d, avg_pool3d.
There is only one substantive change: when stride.size() == 1,
we expand it to size 2. However, I also took the opportunity
to give a better error message.
Testing here is the bare minimum, because I'm in a hurry: just make
sure the C++ API works with all size-1 inputs.
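A hypothetical Python sketch of the C++ change: a stride (or kernel-size) argument of length 1 is expanded to the expected number of spatial dims, with a clearer error for other mismatches (the helper name is illustrative):

```python
# Expand a length-1 parameter list to `dims` entries, mirroring the
# stride.size() == 1 handling described above.
def expand_param(param, dims, name):
    if len(param) == 1:
        return list(param) * dims
    if len(param) != dims:
        raise ValueError(
            "%s must be a single value or %d values, got %d"
            % (name, dims, len(param)))
    return list(param)

print(expand_param([3], 2, "stride"))     # [3, 3]
print(expand_param([2, 1], 2, "stride"))  # [2, 1]
```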
This is a squash of four commits.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Summary:
Simplifying https://github.com/pytorch/pytorch/issues/23793: The dependency relationship between
{INSTALL,BUILD}_TEST is already properly handled in CMakeLists.txt. All
we need to do is to pass down INSTALL_TEST.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23806
Differential Revision: D16691833
Pulled By: soumith
fbshipit-source-id: 7607492b2d82db3f79b174373a92e2810a854a61
Summary:
Fixes: https://github.com/pytorch/pytorch/issues/23480.
I only verified that the schedule reaches the restart at the expected step as specified in the issue; it would be good to have someone else verify correctness here.
Script:
```
scheduler = torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(
    torch.optim.SGD([torch.randn(1, requires_grad=True)], lr=0.5), T_0=1, T_mult=2)
for i in range(9):
    print(i)
    print(scheduler.get_lr())
    scheduler.step()
```
Output:
```
0
[0.5]
1
[0.5]
2
[0.25]
3
[0.5]
4
[0.42677669529663687]
5
[0.25]
6
[0.07322330470336313]
7
[0.5]
8
[0.4809698831278217]
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23833
Differential Revision: D16657251
Pulled By: gchanan
fbshipit-source-id: 713973cb7cbfc85dc333641cbe9feaf917718eb9
Summary:
This allows `INSTALL_*` to pass through to cmake.
An additional fix: if `INSTALL_TEST` is specified, it won't use `BUILD_TEST` as the default value for `INSTALL_TEST`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23793
Differential Revision: D16648668
Pulled By: soumith
fbshipit-source-id: 52c2a0d8033bc556355b87a6731a577940de9859
Summary:
Changelog:
- Add batching for det / logdet / slogdet operations
- Update derivative computation to support batched inputs (and consequently batched outputs)
- Update docs
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22909
Test Plan:
- Add a `test_det_logdet_slogdet_batched` method in `test_torch.py` to test `torch.det`, `torch.logdet` and `torch.slogdet` on batched inputs. This relies on the correctness of `torch.det` on single matrices (tested by `test_det_logdet_slogdet`). A port of this test is added to `test_cuda.py`
- Add autograd tests for batched inputs
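A quick illustration of the batched behavior this adds (shapes only, random values):

```python
import torch

A = torch.randn(2, 4, 3, 3)   # a 2x4 batch of 3x3 matrices
print(torch.det(A).shape)     # one determinant per matrix in the batch
sign, logabsdet = torch.slogdet(A)
print(sign.shape, logabsdet.shape)
```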
Differential Revision: D16580988
Pulled By: ezyang
fbshipit-source-id: b76c87212fbe621f42a847e3b809b5e60cfcdb7a
* [jit] Recursive compilation error hot fixes
This is a combination of #23454 and #23682, which are needed for
error reporting on recursively compiled code
* #23682
Summary:
Changelog:
- Use narrow instead of narrow_copy while returning
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23591
Test Plan:
- All tests should pass to ensure that the change is correct
Fixes https://github.com/pytorch/pytorch/issues/23580
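For context, `narrow` returns a view that shares storage with its input, so returning it (rather than a copy) avoids an allocation; a small illustration:

```python
import torch

x = torch.arange(6)
v = x.narrow(0, 1, 3)   # view of x[1:4], no copy
v[0] = 99               # writes through to x
print(x[1].item())
```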
Differential Revision: D16581174
Pulled By: ezyang
fbshipit-source-id: 1b6bf7d338ddd138ea4c6aa6901834dd202ec79c
Summary:
This accidentally calls clone; what we want is to create an empty tensor and set its storage.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23452
ghstack-source-id: 87438096
Differential Revision: D16442756
fbshipit-source-id: 6d5663f82c9bd4e9de8fc846c52992477843af6a
Summary:
Changelog:
- Rename `gels` to `lstsq`
- Fix all callsites
- Rename all tests
- Create a tentative alias for `lstsq` under the name `gels` and add a deprecation warning to not promote usage.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23460
Test Plan: - All tests should pass to confirm that the patch is correct
Differential Revision: D16547834
Pulled By: colesbury
fbshipit-source-id: b3bdb8f4c5d14c7716c3d9528e40324cc544e496
Summary:
Only check for cmake dependencies we directly depend on (e.g., hipsparse but not rocsparse)
Use cmake targets for ROCm where possible.
While there, update the docker CI build infrastructure to only pull in packages by name we directly depend on (anticipating the demise of, e.g., miopengemm). I do not anticipate a docker rebuild to be necessary at this stage as the changes are somewhat cosmetic.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23527
Differential Revision: D16561010
Pulled By: ezyang
fbshipit-source-id: 87cd9d8a15a74caf9baca85a3e840e9d19ad5d9f
Summary:
Syncing worker requirement mismatches to improve remote build time.
Created actions:
LARGE: 66
MEDIUM: 649
XLARGE: 1
Updated actions:
From LARGE to MEDIUM: 18
From LARGE to XLARGE: 2
From MEDIUM to LARGE: 20
From XLARGE to LARGE: 1
Differential Revision: D16559356
fbshipit-source-id: a51ef034265649314661ab0e283089a069a20437
Summary:
When a user tries to change metadata of a tensor created from `.data` or `.detach()`, we currently shows an error message "<function_name> is not allowed on Tensor created from .data or .detach()". However, this error message doesn't suggest what the right fix should look like. This PR improves the error message.
This PR improves the error message. Closes https://github.com/pytorch/pytorch/issues/23393.
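Background for why the restriction exists (illustrative, not the PR's code): a tensor from `.detach()` or `.data` shares storage with the original, so changing its metadata in place would corrupt autograd's view of the original tensor:

```python
import torch

x = torch.randn(3, requires_grad=True)
d = x.detach()
print(d.data_ptr() == x.data_ptr())  # same storage
# Metadata changes such as d.resize_(6) raise an error; the improved
# message now suggests a fix (e.g. clone first, or operate on x itself).
```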
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23504
Differential Revision: D16547415
Pulled By: yf225
fbshipit-source-id: 37f4a0385442e2b0966386fb14d3d938ecf4230c
Summary:
Previously these were left out which would lead to confusing messages,
now it looks something like:
```
torch.jit.frontend.UnsupportedNodeError: import statements aren't supported
:
at ../test.py:13:9
    def bad_fn(self):
        import pdb
        ~~~~~~ <--- HERE
'__torch__.X' is being compiled since it was called from 'fn'
at ../test.py:16:12
    def fn(x):
        return X(10)
               ~~~~ <--- HERE
```
Fixes #23453
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23454
Pulled By: driazati
Differential Revision: D16526027
fbshipit-source-id: 109f2968430dbf51ee91b1b3409badfd557d19a4
Summary:
Use the recursive script API in the existing docs
TODO:
* Migration guide for 1.1 -> 1.2
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21612
Pulled By: driazati
Differential Revision: D16553734
fbshipit-source-id: fb6be81a950224390bd5d19b9b3de2d97b3dc515
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23521
non-fbgemm path should have the same arguments as fbgemm path.
Reviewed By: jianyuh
Differential Revision: D16547637
fbshipit-source-id: bb00d725fb968cbee32defb8facd2799a7e79bb4
Summary:
This resolves two issues in one shot:
- sub shouldn't be available for bool type.
- When sub is applied to an unsupported type, the current error message
shows "add_cpu/add_cuda is not implemented for [type]". It should be
"sub_cpu/sub_cuda" instead.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23519
Differential Revision: D16548770
Pulled By: izdeby
fbshipit-source-id: fe404a2a97b8d11bd180ec41364bf8e68414fb15
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23417
Test Plan:
cd docs; make html
Imported from OSS
Differential Revision: D16523781
Pulled By: ilia-cher
fbshipit-source-id: d6c09e8a85d39e6185bbdc4b312fea44fcdfff06
Summary:
No real change on the CI since currently the default latest is 0.4.0. houseroad bddppq
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23517
Differential Revision: D16550375
Pulled By: bddppq
fbshipit-source-id: a669b8af678c79c4d6909300b28458fe6b7cd30c
Summary:
There is an internal fbcode assert that fails if I do not add these checks.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23511
Differential Revision: D16545606
Pulled By: eellison
fbshipit-source-id: cd3a799850bae8f052f9d81c1e4a2678fda19317
Summary:
The PyTorch test suite sets a policy() method to assertLeaksNoCudaTensors.
Whenever a test is run, assertLeaksNoCudaTensors is called,
which in turn calls CudaMemoryLeakCheck, which in turn calls
initialize_cuda_context_rng, which executes torch.randn
on each device, launching a kernel on each device.
Since the kernel may not have finished on device 1, the assertion
self.assertTrue(s1.query()) fails.
The fix is to insert
```
torch.cuda.synchronize(d0)
torch.cuda.synchronize(d1)
```
at the beginning of the test so that previously launched kernels finish before the real
test begins.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23520
Differential Revision: D16547701
Pulled By: soumith
fbshipit-source-id: 42ad369f909d534e15555493d08e9bb99dd64b6a
Summary:
Add a sorting policy to ChunkDataset.
This is an advanced parameter for developers who want to apply a 'sorting policy' to the chunk data before it is sampled into minibatches.
Unlike the collate method, this policy is applied at the chunk level rather than the minibatch level. When a chunk of data is loaded (multiple chunks if cross_chunk_shuffle_count_ is greater than 1), the policy applies to the full loaded data. It is useful if developers want to perform some pre-processing (like bucketing) on the chunk data before the example sampler samples it.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23053
Differential Revision: D16537692
Pulled By: colesbury
fbshipit-source-id: cd21ed40ab787a18b8c6dd304e5b806a7a45e6ba
Summary:
Thanks adefazio for the feedback, adding a note to the Contribution guide so that folks don't start working on code without checking with the maintainers.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23513
Differential Revision: D16546685
Pulled By: soumith
fbshipit-source-id: 1ee8ade963703c88374aedecb8c9e5ed39d7722d
Summary:
This modernizes distributions code by replacing a few uses of `.contiguous().view()` with `.reshape()`, fixing a sampling bug in the `Categorical` distribution.
The bug is exercised by the following test:
```py
batch_shape = (1, 2, 1, 3, 1)
sample_shape = (4,)
cardinality = 2
logits = torch.randn(batch_shape + (cardinality,))
dist.Categorical(logits=logits).sample(sample_shape)
# RuntimeError: invalid argument 2: view size is not compatible with
# input tensor's size and stride (at least one dimension spans across
# two contiguous subspaces). Call .contiguous() before .view().
# at ../aten/src/TH/generic/THTensor.cpp:203
```
I have verified this works locally, but I have not added this as a regression test because it is unlikely to regress (the code is now simpler).
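The underlying distinction, in a standalone illustration: `.reshape()` handles non-contiguous inputs that `.view()` rejects (here the non-contiguity comes from `expand`):

```python
import torch

x = torch.randn(1, 2, 1, 3, 1).expand(4, 2, 5, 3, 2)  # non-contiguous
y = x.reshape(-1)          # copies when needed, always succeeds
print(y.numel())           # 4 * 2 * 5 * 3 * 2 = 240
try:
    x.view(-1)
except RuntimeError:
    print("view fails on non-contiguous input")
```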
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23328
Differential Revision: D16510678
Pulled By: colesbury
fbshipit-source-id: c125c1a37d21d185132e8e8b65241c86ad8ad04b
Summary:
Currently there is no way to build MKLDNN with optimizations beyond sse4. This commit makes the MKLDNN build respect USE_NATIVE_ARCH.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23445
Differential Revision: D16542275
Pulled By: ezyang
fbshipit-source-id: 550976531d6a52db9128c0e3d4589a33715feee2
Summary:
- MSVC_Z7_OVERRIDE has already been handled in CMakeLists.txt. No need to process it once more in the Python scripts.
- Option MSVC_Z7_OVERRIDE should be visible to the user only if MSVC is used.
- Move the setting of "/EHa" flag to CMakeLists.txt, where other MSVC-specific flags are processed. This also further prepares the removal of redundant cflags setup in Python build scripts.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23455
Differential Revision: D16542274
Pulled By: ezyang
fbshipit-source-id: 4d3b8b07161478bbba8a21feb6ea24c9024e21ac
Summary:
Closes gh-16955.
Closes https://github.com/pytorch/vision/issues/977
On Linux both `lib64` and `lib` may be present (symlinked). The reports
seem to all be about macOS, but it seems like this is also possibly more
robust on Linux and can't hurt. So not treating platforms differently.
Note that Eigen has a similar check in its CMake:
```
if(CUDA_64_BIT_DEVICE_CODE AND (EXISTS "${CUDA_TOOLKIT_ROOT_DIR}/lib64"))
link_directories("${CUDA_TOOLKIT_ROOT_DIR}/lib64")
else()
link_directories("${CUDA_TOOLKIT_ROOT_DIR}/lib")
endif()
```
There may be other issues for building from source on macOS, can't test.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23491
Differential Revision: D16538973
Pulled By: soumith
fbshipit-source-id: cc309347b7d16e718e06878d3824d0a6e40b1019
Summary:
Currently set_rng_state and get_rng_state do not accept strings as their device argument. This commit lets them accept strings.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23448
Differential Revision: D16527172
Pulled By: soumith
fbshipit-source-id: 8f9a2129979706e16877cc110f104770fbbe952c
Summary:
Syncing worker requirement mismatches to improve remote build time.
Created actions:
MEDIUM: 981
LARGE: 56
Updated actions:
From MEDIUM to LARGE: 10
From LARGE to MEDIUM: 3
From LARGE to XLARGE: 1
Differential Revision: D16532427
fbshipit-source-id: c58bf59e6c571627b3994f8cdfa79758fb85892b
Summary:
(1) Add `COMMON_MSVC_FLAGS` to the flags in the ninja codepath
(2) Add `/EHsc` to `COMMON_MSVC_FLAGS`
(3) Remove `-fPIC` and `-std=c++11` from the flags in the windows codepath
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23472
Differential Revision: D16532993
Pulled By: soumith
fbshipit-source-id: bc2d983f5f8b4eae9c7385bf170f155679e92e87
Summary:
Add `sorted` keyword to JIT for lists and dicts. This desugars to a list copy and a call to `list.sort()`. Since we don't have interfaces yet I implement it in terms of `list.sort()`. When we do we can re-visit implementing this op in a different manner.
The test fails bc of a fix to specialized lists which is landing here: https://github.com/pytorch/pytorch/pull/23267
Ignore the first commit because it is formatting, plz use clang_format ppl :'(
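A minimal illustration of the new builtin (the desugaring — a list copy followed by `list.sort()` — is described above):

```python
from typing import List
import torch

@torch.jit.script
def sorted_copy(xs: List[int]) -> List[int]:
    # `sorted` desugars to a copy of the list plus a call to list.sort()
    return sorted(xs)

print(sorted_copy([3, 1, 2]))  # [1, 2, 3]
```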
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23274
Differential Revision: D16527323
Pulled By: eellison
fbshipit-source-id: aed8faef23cb790b9af036cd6c1b9b1d7066345d
Summary:
Scatter is unnecessary if only one device is used, and it breaks on some custom data structures like namedtuple, so we would like to avoid it :)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22384
Differential Revision: D16428208
Pulled By: soumith
fbshipit-source-id: eaa3876b2b95c1006ccaaacdb62f54c5280e730c
Summary:
This is part of the effort to shrink OSS libtorch mobile build size.
We shouldn't need Module::save function on mobile - it depends on
csrc/jit/export.cpp which then depends on ONNX. By gating these two
methods we can avoid these dependencies for libtorch mobile.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23415
ghstack-source-id: 87288228
Reviewed By: dreiss
Differential Revision: D16511143
fbshipit-source-id: fd031f91fcf9b7be54cbe1436506965af94ab537
Summary:
Add early returns to JIT with minimal changes to compiler.cpp and an IR->IR pass that will transform the graph so that there is only one return value.
In compiler.cpp, record when a block will exit so that in the following example will work:
```
if cond:
    a = torch.zeros([2])
else:
    return 2
a += 2
...
```
To match block outputs with values that will not be used, like in the above example with `a`, I add a Bottom Type that subtypes everything else. This allows shape propagation to continue to work, and makes it so that we don't need many extra nodes filling up the graph.
The IR transform currently doesn't work on Loops, I didn't add that to this PR to avoid too much complexity, but will add it as a stack (and it should be very little extra code). the IR transform is commented at the top of the file.
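A runnable variant of the example above (using ints so both paths share one return type):

```python
import torch

@torch.jit.script
def f(cond: bool) -> int:
    if cond:
        a = 2
    else:
        return 2   # early return from one branch
    a += 2         # code after the if still sees `a` on the other path
    return a

print(f(True), f(False))  # 4 2
```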
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19179
Differential Revision: D16519819
Pulled By: eellison
fbshipit-source-id: 322a27f69966d1fd074ebe723c3e948b458b0e68
Summary:
Adds qtensor specific fields to the proto file so that they get serialized into the model.json
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23356
ghstack-source-id: 87263428
Differential Revision: D16473237
fbshipit-source-id: bf5b51d0863d036d30a1644a3c3b74516468224b
Summary:
As pointed out by SsnL in https://github.com/pytorch/pytorch/issues/20910, when clone destination is different from the module's device,
`Cloneable` currently calls `clone()` and then `to()` on every parameter and buffer, where the first clone is unnecessary.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20995
Differential Revision: D15517353
Pulled By: mrshenli
fbshipit-source-id: 6b6dc01560540a63845663f863dea0a948021fa5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23442
Replace the argument name from `operator` to `operators` which can take a list of operators to test.
Reviewed By: hl475
Differential Revision: D16520779
fbshipit-source-id: 94284a87c64471793e319f5bd3143f89b9a192bb
Summary:
When an exception occurs in one of the modules passed to `parallel_apply()`, it is caught and re-raised in the main thread. This preserves the original exception type and message, but has the traceback point at the position where it's re-raised, rather than the original point of failure.
This PR saves the exception information required to generate the traceback, and includes the original traceback in the message of the exception raised in the main thread.
Before:
```
...
File ".../torch/nn/parallel/data_parallel.py", line 153, in parallel_apply
return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
File ".../torch/nn/parallel/parallel_apply.py", line 84, in parallel_apply
raise output
RuntimeError: expected type torch.FloatTensor but got torch.cuda.FloatTensor
```
After:
```
...
File ".../torch/nn/parallel/data_parallel.py", line 153, in parallel_apply
return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
File ".../torch/nn/parallel/parallel_apply.py", line 88, in parallel_apply
''.join(traceback.format_exception(*exc_info)))
RuntimeError: Caught exception in replica 0. Original traceback and message:
Traceback (most recent call last):
...
File "../models/foo.py", line 319, in bar
baz = asdf / ghij[:, np.newaxis]
RuntimeError: expected type torch.FloatTensor but got torch.cuda.FloatTensor
```
I took care to raise an exception of the original type (in case the main code checks for that), but replaced the message. It helped me find a bug that did not occur outside `data_parallel()`.
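The pattern can be sketched in isolation (illustrative, not the actual `parallel_apply` code): the worker stores `sys.exc_info()`, and the main thread re-raises an exception of the original type with the original traceback embedded in the message:

```python
import sys
import traceback

def worker(results, i, fn, *args):
    try:
        results[i] = fn(*args)
    except Exception:
        results[i] = sys.exc_info()   # keep type, value, and traceback

def reraise(exc_info, replica):
    exc_type, exc_value, tb = exc_info
    msg = "Caught exception in replica {}. Original traceback and message:\n{}".format(
        replica, "".join(traceback.format_exception(exc_type, exc_value, tb)))
    raise exc_type(msg)               # original type, augmented message

results = {}
worker(results, 0, lambda: 1 / 0)
try:
    reraise(results[0], 0)
except ZeroDivisionError as e:        # original type is preserved
    print("replica 0" in str(e) and "ZeroDivisionError" in str(e))
```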
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18055
Differential Revision: D16444972
Pulled By: zhangguanheng66
fbshipit-source-id: ec436c9d4677fad18106a8046cfa835a20a101ce
Summary:
Don't automatically unwrap the top-layer DataParallel for users. Instead, we provide useful error information and tell users what action to take.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23365
Reviewed By: zrphercule
Differential Revision: D16514273
Pulled By: houseroad
fbshipit-source-id: f552de5c53fb44807e9d9ad62126c98873ed106e
Summary:
The conda compiler are gcc/c++ 7.3.0, but have custom version strings
for clarity:
x86_64-conda_cos6-linux-gnu-cc
x86_64-conda_cos6-linux-gnu-c++
Using these compilers to build a C++ or CUDA extension now gives this warning (unnecessarily):
```
!! WARNING !!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Your compiler (/home/rgommers/anaconda3/envs/pytorch-nightly/bin/x86_64-conda_cos6-linux-gnu-c++) is not compatible with the compiler Pytorch was
built with for this platform, which is g++ on linux.
...
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23396
Differential Revision: D16500637
Pulled By: soumith
fbshipit-source-id: 5b2fc3593e22e9a7d07dc2c0456dbb4934ffddb2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23104
ghstack-source-id: 87247148
As suggested in https://github.com/pytorch/pytorch/pull/22891, we will add an overload for `torch.fbgemm_linear_int8_weight` (the dynamically quantized version of the linear function) that takes PackedLinearWeight as input and is pretty much the same in signature as regular aten::linear.
Differential Revision: D16381552
fbshipit-source-id: 1ccc4174fd02c546eee328940ac4b0da48fc85e8
Summary:
adding qconv+relu and qlinear+relu modules in nn/_intrinsic/quantized
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23410
Test Plan:
Extended tests to test these new modules as well
buck test mode/dev caffe2/test:quantized -- 'test_linear_api' --print-passing-details
```
Running 1 tests
Started new test run: https://our.intern.facebook.com/intern/testinfra/testrun/2251799820197379
✓ caffe2/test:quantized - test_linear_api (test_nn_quantized.ModuleAPITest) 4.055 1/1 (passed)
Test output:
> test_linear_api (test_nn_quantized.ModuleAPITest)
> test API functionality for nn.quantized.linear and nn._intrinsic.quantized.linear_relu ... ok
>
> ----------------------------------------------------------------------
> Ran 1 test in 4.056s
>
> OK
Finished test run: https://our.intern.facebook.com/intern/testinfra/testrun/2251799820197379
Summary (total time 10.66s):
PASS: 1
FAIL: 0
SKIP: 0
FATAL: 0
TIMEOUT: 0
OMIT: 0
```
buck test mode/dev caffe2/test:quantized -- 'test_conv_api' --print-passing-details
```
Running 2 tests
Started new test run: https://our.intern.facebook.com/intern/testinfra/testrun/4785074607089664
✓ caffe2/test:quantized - test_conv_api (test_quantized_conv.QuantizedConvTest) 5.195 1/2 (passed)
Test output:
> test_conv_api (test_quantized_conv.QuantizedConvTest)
> Tests the correctness of the conv functional. ... ok
>
> ----------------------------------------------------------------------
> Ran 1 test in 5.195s
>
> OK
✓ caffe2/test:quantized - test_conv_api (test_nn_quantized.ModuleAPITest) 10.616 2/2 (passed)
Test output:
> test_conv_api (test_nn_quantized.ModuleAPITest)
> Tests the correctness of the conv module. ... ok
>
> ----------------------------------------------------------------------
> Ran 1 test in 10.616s
>
> OK
Finished test run: https://our.intern.facebook.com/intern/testinfra/testrun/4785074607089664
Summary (total time 17.31s):
PASS: 2
FAIL: 0
SKIP: 0
FATAL: 0
TIMEOUT: 0
OMIT: 0
```
Differential Revision: D16505333
Pulled By: dskhudia
fbshipit-source-id: 04f45cd0e76dc55f4694d558b913ab2958b7d727
Summary:
This is still a work in progress.
There are several more items to add to complete this doc, including
- [x] LHS indexing, index assignments.
- [x] Tensor List.
- [x] ~Shape/Type propagation.~
- [x] FAQs
Please review and share your thoughts, feel free to add anything that you think should be included as well. houseroad spandantiwari lara-hdr neginraoof
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23185
Differential Revision: D16459647
Pulled By: houseroad
fbshipit-source-id: b401c005f848d957541ba3b00e00c93ac2f4609b
Summary:
They should be forwarded as their actual type, not as an rvalue reference.
This looked like perfect forwarding but actually wasn't.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23412
ghstack-source-id: 87214575
Reviewed By: dzhulgakov
Differential Revision: D16507872
fbshipit-source-id: 2b20a37df83067dd53e917fe87407ad687bb147c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21211
There are cases where the `init` method used to create inputs can exit with error. When this happens, that specific input should be skipped.
Reviewed By: zheng-xq
Differential Revision: D15466410
fbshipit-source-id: 55e86764b2ec56f7730349ff1df6e50efc0239d7
Summary:
Align the Argument's operator<< with parser,
additional support:
1) List size
2) real default value
3) Alias information
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23203
ghstack-source-id: 87118985
Reviewed By: zrphercule
Differential Revision: D16433188
fbshipit-source-id: aea5711f93feacd94d1732e2f0d61218a31a0c5c
Summary:
The builder pattern doesn't seem to work well with return-value-optimization.
This saves ~100 ns in the construction of TensorIterator::binary_op.
```
import torch
x = torch.rand(1)
y = torch.rand(1)
z = torch.rand(1)
%timeit torch.add(x, y, out=z) # ~1.76 us vs ~1.88 us on my machine
```
cc resistor zheng-xq
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23329
Differential Revision: D16495070
Pulled By: VitalyFedyunin
fbshipit-source-id: 8ce116075fa4c7149dabfcdfa25885c1187c8e2f
Summary:
The legacy iOS build script (`build_ios.sh`) still works, but the output is caffe2, not Pytorch. To enable the Pytorch iOS build, we can set the value of `BUILD_CAFFE2_MOBILE` to `NO` and turn on another cmake arg, `INTERN_BUILD_MOBILE`, which ljk53 created for Android.
There is a trivial issue in `used_kernel.cpp` that causes a compile error when running `build_ios.sh`, as it uses a `system` API that has been deprecated since iOS 11. The fix below bypasses this file since it's not needed on mobile.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23293
Test Plan:
The `build_ios.sh` completed successfully, and all the generated static libraries can be compiled and linked successfully on iOS devices.
### Build script
```shell
./scripts/build_ios.sh \
-DBUILD_CAFFE2_MOBILE=OFF \
-DCMAKE_PREFIX_PATH=$(python -c 'from distutils.sysconfig import get_python_lib; print(get_python_lib())') \
-DPYTHON_EXECUTABLE=$(python -c 'import sys; print(sys.executable)')
```
Differential Revision: D16456100
Pulled By: xta0
fbshipit-source-id: 38c73e1e3a0c219a38ddc28b31acc181690f34e8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22175
- Rename AliasAnalysisKind::DEFAULT to AliasAnalysisKind::CONSERVATIVE
- Introduce AliasAnalysisKind::FROM_SCHEMA that means the alias annotations of the schema should be honored
- Introduce AliasAnalysisKind::INTERNAL_SPECIAL_CASE to be able to run assertions that internal special cased ops are treated correctly
- aten:: and prim:: ops are not treated as special cases anymore, but just use AliasAnalysisKind::FROM_SCHEMA
- There's a set of assertions to ensure that aten:: and prim:: ops are all correctly set up to use AliasAnalysisKind::FROM_SCHEMA. Once this PR lands and passes all tests, we will remove those assertions and open up for the possibility of different AliasAnalysisKind settings for aten:: and prim:: ops
Differential Revision: D15929595
fbshipit-source-id: 7c6a9d4d29e13b8c9a856062cd6fb3f8a46a2e0d
Summary:
torch::List recently received some polishing that is now also done for Dict. This should be done before the PyTorch 1.2 release because of backwards compatibility.
- Dict is just a reference type, so "const Dict" should have the same capabilities as "Dict", constness is not guaranteed in any way.
- DictIterator gets comparison operators <, <=, >, >=
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23344
ghstack-source-id: 87170304
Differential Revision: D16468800
fbshipit-source-id: 2978c3b9cdcfb2cfb3f26516b15bd455d9a48ba9
Summary:
This check is not needed. Even if it were, the assignment is clobbered anyway.
Closes #23300.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23370
ghstack-source-id: 87157671
Differential Revision: D16485329
fbshipit-source-id: 8ccac79e81f5e0d0d20099d550411c161f58c233
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22808
- Use `size_to_dim_`.
- `mod` is not in scope; it should be `module`.
Reviewed By: mingzhe09088
Differential Revision: D16225799
fbshipit-source-id: 9a263227d2d508eefdfddfee15fd0822819de946
Summary:
All cases should be prim ops, but let's support it. It will expect the variadic return schema to be prim::PythonOp(...) -> ...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23199
ghstack-source-id: 87113845
Differential Revision: D16431635
fbshipit-source-id: 798b6957ce5d800f7fcf981c86fdcb009cd77a78
Summary:
Fixes: https://github.com/pytorch/pytorch/issues/22833
grad_sum_to_size does not commute with AutogradAdd after all, because it turns the broadcasting AutogradAdd into a broadcasting add.
Chillee actually did most of the work tracking this down to the fusion of grad_sum_to_size, pinging me when he had found the cause. Thank you!
About the choice of removing the fusion completely instead of being more precise:
- We do have grad_sum_to_size elimination which works for cases where broadcasting does not actually happen in the forward, so the cases where the fusing of grad_sum_to_size is actually beneficial is much smaller than when initially proposed.
- There will be less fusion, in terms of the tests, IOU stops being fully fused. I vaguely think that it is a case we could handle with refined logic.
- Keeping it would add complexity in checking when to merge fusion groups to the complexities that this PR removes.
- The future of fusion probably lies more in more complete solutions including reductions (TVM or KeOps or our own or ...).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23372
Differential Revision: D16489930
Pulled By: soumith
fbshipit-source-id: bc0431b0d3eda264c401b634675872c4ce46f0f4
Summary:
Instead, defer its default value to CMakeLists.txt
NO_FBGEMM has already been handled in tools/setup_helpers/env.py
(although deprecated)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23314
Differential Revision: D16493580
Pulled By: ezyang
fbshipit-source-id: 7255eb1df5e8a6dd0362507d68da0986a9ed46e2
Summary:
This is a small fix on top of gh-23348, which fixed the libtorch
nightly build timeouts.
For the latest nighly build (25 July), see
https://circleci.com/workflow-run/33d0a24a-b77c-4a8f-9ecd-5646146ce684
The only failures are these uploads, which is because `aws s3 cp`
can only deal with one file at a time. The only way to make it do
multiple files at once is:
```
aws s3 cp . "$s3_dir" --exclude "*" --include "libtorch-*.zip" --recursive --acl public-read
```
which is much more verbose. Executing one `cp` per file should be fine,
and this is also what's done in `binary_macos_upload.sh`.
Closes gh-23039
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23368
Differential Revision: D16488853
Pulled By: soumith
fbshipit-source-id: 6dc04b4de2f6cd2de5ae9ad57a6e980f56896498
Summary:
With this change you can now list multiple interfaces separated by
comma. ProcessGroupGloo creates a single Gloo context for every device
in the list (a context represents a connection to every other
rank). For every collective that is called, it will select the context
in a round robin fashion. The number of worker threads responsible for
executing the collectives is set to be twice the number of devices.
If you have a single physical interface, and wish to employ increased
parallelism, you can also specify
`GLOO_SOCKET_IFNAME=eth0,eth0,eth0,eth0`. This makes ProcessGroupGloo
use 4 connections per rank, 4 I/O threads, and 8 worker threads
responsible for executing the collectives.
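The round-robin context selection described above can be sketched as follows (the names are illustrative stand-ins, not the actual ProcessGroupGloo internals):

```python
import itertools

# One Gloo context per entry in GLOO_SOCKET_IFNAME; listing eth0 four times
# gives four connections per rank over the same physical interface.
ifnames = "eth0,eth0,eth0,eth0".split(",")
contexts = [f"context[{i}]@{name}" for i, name in enumerate(ifnames)]

# Each collective picks the next context in round-robin order.
picker = itertools.cycle(contexts)
first_four = [next(picker) for _ in range(4)]
fifth = next(picker)  # wraps back around to the first context
```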
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22978
ghstack-source-id: 87006270
Differential Revision: D16339962
fbshipit-source-id: 9aa1dc93d8e131c1714db349b0cbe57e9e7266f1
Summary:
An illegal instruction is encountered in MKL-DNN in the pre-built package. https://github.com/pytorch/pytorch/issues/23231
To avoid such binary-compatibility issues, the HostOpts option in MKL-DNN is disabled in order to build MKL-DNN for a generic arch.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23292
Differential Revision: D16488773
Pulled By: soumith
fbshipit-source-id: 9e13c76fb9cb9338103cb767d7463c10891d294a
Summary:
This is step 1 in trying to get rid of constants that are set prior to
executing the test runner. All setup logic should be concentrated in
the setupClass() function of the TestCase subclass.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23223
ghstack-source-id: 87005260
Reviewed By: zhaojuanmao
Differential Revision: D16439147
fbshipit-source-id: 7a929ad4b1c8e368e33d1165becbd4d91220882c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23347
This diff replaces uint8 with int8 to match with the underlying kernel implementation. When we do int8 quantization, we are computing with uint8 (input activation) * int8 (weight) -> uint8 (output activation). The weight is quantized into int8.
Reviewed By: jianyuh
Differential Revision: D16469435
fbshipit-source-id: a697655b0e97833fc601e5980970aec4dba53c39
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23354
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Differential Revision: D16474254
Pulled By: ezyang
fbshipit-source-id: 0dd7ce02e1aa1a42a24d2af066ebd0ac5206c9a0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23325
Fixes #19990
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Differential Revision: D16473826
Pulled By: ezyang
fbshipit-source-id: 466db2c22fabd7b574f0a08aec67a18318ddb431
Summary:
Proposed PR for
https://github.com/pytorch/pytorch/issues/23342
Disables execution of QNNpack tests if IS_PPC.
This parallels the skipping of these tests for IS_WINDOWS, which is already present.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23343
Differential Revision: D16469218
Pulled By: soumith
fbshipit-source-id: 80b651d00e5d413e359cf418f79e20d74cd9c8e1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23317
Print out the kind type when fail to export
Reviewed By: zrphercule
Differential Revision: D16462641
fbshipit-source-id: 27157c0bd597362f90ac8cfb33e1808bac0ec48b
Summary:
fix https://github.com/pytorch/pytorch/issues/21044
Bicubic interpolation can cause overshoot.
OpenCV keeps the result dtype aligned with the input dtype:
- If the input is uint8, the result is clamped to [0, 255]
- If the input is float, the result is unclamped.
In PyTorch's case, we only accept float input, so we keep the result unclamped, and add some notes so that users can explicitly call `torch.clamp()` when necessary.
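A minimal sketch of the recommended explicit clamp, shown here with NumPy's `clip` (which mirrors `torch.clamp`); the values are made up to illustrate overshoot:

```python
import numpy as np

# Bicubic interpolation of float input can overshoot the input range;
# clamp explicitly when the result must stay within [0, 255].
upsampled = np.array([-3.7, 12.0, 255.0, 261.4])  # illustrative overshoot
clamped = np.clip(upsampled, 0.0, 255.0)
```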
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23321
Differential Revision: D16464796
Pulled By: ailzhang
fbshipit-source-id: 177915e525d1f54c2209e277cf73e40699ed1acd
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23257
Overall context: open-source BlackBoxPredictor as the entry
point for inference in Caffe2 (thread safe abstraction for Caffe2
inference). This should be used in ThroughputBenchmark for the purpose
of framework comparison
This specific diff:
There should be no harm in moving transformation code to
OSS. On the advantages side we will be able to compare production
Caffe2 setup with PyTorch in the most fair way via
ThroughputBenchmark. This approach avoids any complicated
transformation registries. Building those properly would be a significant
engineering effort as well as production risk. In the past we had SEVs
related to transforms being turned off due to various refactors. Given
that we don't plan to build any other significant investments into
transformation logic except existing ones (like TVM and Glow), and
those also relate to open-source technologies, I came up to the
conclusion of moving to OSS the whole thing.
Reviewed By: zrphercule
Differential Revision: D16428124
fbshipit-source-id: b35deada5c015cd97b91ae12a7ea4aac53bd14b8
Summary:
Covering fleet-wide profiling, api logging, etc.
It's my first time writing rst, so suggestions are definitely welcomed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23010
Differential Revision: D16456721
Pulled By: dzhulgakov
fbshipit-source-id: 3d3018f41499d04db0dca865bb3a9652d8cdf90a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23291
This diff implements LSTM with FP16 weights based on FBGEMM.
At a high level, here are the steps:
1. Quantize and pack weight in every layer of LSTM
2. Pass weights from step 1 to the ATen `quantized_lstm` function which does matrix multiplication with FP16 weight. The following code shows the dtype of each variable used in MM:
Y = X * W + B
(fp32, fp32, fp16, fp32)
Reviewed By: jianyuh
Differential Revision: D16389595
fbshipit-source-id: c26ae4e153c667a941f4af64e9d07fc251403cee
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22733
This refactor changes the conv module to avoid the usage of the functional ops.
Reviewed By: jerryzh168
Differential Revision: D15835572
fbshipit-source-id: f2294cd708fbe8372eb3a15cc60d83777d4f7029
Summary:
It used to be run with comm_size=8, which caused flaky results in a
stress run. The flakiness was caused by too many listening sockets
being created by Gloo context initialization (8 processes times 7
sockets times 20-way concurrency, plus TIME_WAIT).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23221
ghstack-source-id: 86995596
Reviewed By: d4l3k
Differential Revision: D16437834
fbshipit-source-id: 998d0e2b087c0ab15eca64e308059c35e1b51e7b
Summary:
I manually went through all functions in `torch.*` and corrected any mismatch between the arguments mentioned in doc and the ones actually taken by the function. This fixes https://github.com/pytorch/pytorch/issues/8698.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22973
Differential Revision: D16419602
Pulled By: yf225
fbshipit-source-id: 5562c9b0b95a0759abee41f967c45efacf2267c2
Summary:
Currently the build type is decided by the environment variable DEBUG
and REL_WITH_DEB_INFO. This commit also lets CMAKE_BUILD_TYPE be
effective. This makes the interface more consistent with CMake. This
also prepares https://github.com/pytorch/pytorch/issues/22776.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22875
Differential Revision: D16281663
Pulled By: ezyang
fbshipit-source-id: 952f92aad85ff59f1c7abe8256eca8a4a0936026
Summary:
Rehash of https://github.com/pytorch/pytorch/issues/22322 .
Given that python 2.7 will be EOL'd on Jan 1, 2020 and we have models depending on python3.5+, we'd like to update the ROCm CI across the board to python3.6.
This PR adds the skip tests and some semantic changes for PyTorch.
Added a pattern-match skip for anything but the ROCm CI (compared to #22322) for the Python find step in the PyTorch build.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23088
Differential Revision: D16448261
Pulled By: bddppq
fbshipit-source-id: 69ece1a213418d9abf1444c496dce1c190ee07c8
Summary:
There are a lot of formatting changes, which make other diffs to these PRs noisy and hard to read.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23283
Differential Revision: D16453590
Pulled By: eellison
fbshipit-source-id: 97b4bf1dbbbfb09c44c57402f61ea27287060044
Summary:
In Python, `register_module` / `register_parameter` / `register_buffer` method in `nn.Module` is public. This PR makes those APIs public for C++ `nn::Module` as well. Closes https://github.com/pytorch/pytorch/issues/23140.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23196
Differential Revision: D16440239
Pulled By: yf225
fbshipit-source-id: e0eff6e1db592961fba891ec417dc74fa765e968
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23272
We see significant performance improvements by limiting concurrency
at caffe2 level on mobile. This diff enables setting the number of caffe2
workspaces used during rnn inference.
Reviewed By: akyrola
Differential Revision: D16448611
fbshipit-source-id: 28abaddb4ea60bacb084ceb28cb7a4d1e67ccc17
Summary:
Support exporting
* Standard tensor indexing like
```
x = torch.ones(4, 5)
ind = torch.tensor([0, 1])
return x[ind]
```
* [Advanced indexing](https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html#advanced-indexing) like
```
x = torch.ones(4,5,6,7,8)
ind1 = torch.tensor([0, 1])
ind2 = torch.tensor([[3], [2]])
ind3 = torch.tensor([[2, 2], [4, 5]])
return x[2:4, ind1, None, ind2, ind3, :]
```
It would be ideal if ONNX could natively support indexing in future opsets, but for opset <= 10 it will always need this kind of workaround.
There are still various limitations, such as not supporting advanced indexing with negative indices, or mask indices of rank > 1. My feeling is that these are less common cases that require great effort to support with the current opset, and it's better not to make the index export more cumbersome than it already is.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21716
Reviewed By: zrphercule
Differential Revision: D15902199
Pulled By: houseroad
fbshipit-source-id: 5f1cc687fc9f97da18732f6a2c9dfe8f6fdb34a6
Summary:
Previously we weren't specializing the list returned from `dict.keys()`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23267
Differential Revision: D16448512
Pulled By: eellison
fbshipit-source-id: fcd2a37ac680bdf90219b099a94aa36a80f4067c
Summary:
Overall context: open-source BlackBoxPredictor as the entry
point for inference in Caffe2 (thread safe abstraction for Caffe2
inference). This should be used in ThroughputBenchmark for the purpose
of framework comparison
This specific diff:
There should be no harm in moving transformation code to
OSS. On the advantages side we will be able to compare production
Caffe2 setup with PyTorch in the most fair way via
ThroughputBenchmark. This approach avoids any complicated
transformation registries. Building those properly would be a significant
engineering effort as well as production risk. In the past we had SEVs
related to transforms being turned off due to various refactors. Given
that we don't plan to build any other significant investments into
transformation logic except existing ones (like TVM and Glow), and
those also relate to open-source technologies, I came up to the
conclusion of moving to OSS the whole thing.
Reviewed By: bertmaher
Differential Revision: D16367134
fbshipit-source-id: fc6bacc1be3ff6336beb57cdad58168d3a2b8c28
Summary:
Per https://github.com/pytorch/pytorch/issues/22260, the default number of OpenMP threads spawned equals the number of cores available. For multiprocessing data-parallel cases, too many threads may be spawned, which can overload the CPU and cause a performance regression.
So we default OMP_NUM_THREADS to (number of CPU processors) / (number of processes), to neither overload nor waste CPU threads.
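The default described above amounts to the following (a sketch; `num_processes` would come from the launcher arguments such as `--nproc_per_node`):

```python
import os

num_cpus = os.cpu_count() or 1
num_processes = 2  # e.g. --nproc_per_node=2

# Default OMP_NUM_THREADS per worker: max(1, num_cpus // num_processes),
# so the workers neither oversubscribe nor waste CPU cores.
omp_num_threads = max(1, num_cpus // num_processes)
os.environ.setdefault("OMP_NUM_THREADS", str(omp_num_threads))
```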
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22501
Test Plan:
1. without and with this change, example codes result in same result
python ~/local/fbsource-fbcode/fbcode/caffe2/torch/distributed/launch.py --nproc_per_node=2 pytorch/examples/yanlizhao/distributed_launch_example.py
Setting OMP_NUM_THREADS environment variable for each process to be: 24, which
is max(1, num_cpus / num_processes), you can further tune the variable for optimal performance in your application if needed.
final loss = tensor(0.5211, device='cuda:0', grad_fn=<MseLossBackward>)
Differential Revision: D16092225
Pulled By: zhaojuanmao
fbshipit-source-id: b792a4c27a7ffae40e4a59e96669209c6a85e27f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23003
torch.quantization.fuse_module and torch.nn._intrinsic convRelu and LinearRelu
Fusion function to combine specific modules: (conv,bn) and (conv,bn,relu).
In all cases, replace modules in place. The first module is replaced with the _intrinsic fused module and the remaining modules are replaced by nn.Identity.
Support both training and eval. For training, the modules are "fused" with a sequential container. This is to allow for further module swaps for quantization aware training.
Also add: torch.nn._intrinsic for convRelu and LinearRelu.
TODO: Add tests for _intrinsic modules.
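The in-place replacement pattern can be sketched without the real modules (the `Identity` class and the tuple stand in for the actual `nn.Identity` and `_intrinsic` fused modules):

```python
# Hedged sketch: replace the first module of a (conv, bn) pair with a fused
# stand-in and the second with an identity, mirroring the in-place fusion.
class Identity:
    def __call__(self, x):
        return x

def fuse_pair(modules, first, second):
    fused = ("fused", modules[first], modules[second])  # stand-in for the _intrinsic module
    modules[first] = fused
    modules[second] = Identity()
    return modules

mods = {"conv": "Conv2d", "bn": "BatchNorm2d"}
fuse_pair(mods, "conv", "bn")
```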
Conv BN fusion code is based on DsKhudia's implementation
Differential Revision: D16199720
fbshipit-source-id: 95fb9ffe72b361d280313b2ec57de2acd4f9dda2
Summary:
This adds a replace_module method to the C++ API, which is needed to be able to replace modules.
The primary use case I am aware of is to enable finetuning of models.
Given that finetuning is fairly popular these days, I think it would be good to facilitate this in the C++ api as well.
This has been reported by Jean-Christophe Lombardo on the [forums](https://discuss.pytorch.org/t/finetuning-a-model-on-multiple-gpu-in-c/49195).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22546
Differential Revision: D16440289
Pulled By: yf225
fbshipit-source-id: c136f914b8fc5c0f1975d877ea817fda5c851cda
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23022
will be tested in later diffs.
Added LinearReLU module for qat, allows conversion from torch.nn._intrisic.LinearReLU to torch.nn._intrinsic.qat.LinearReLU
Reviewed By: zafartahirov
Differential Revision: D16286800
fbshipit-source-id: 84cce3551d46e649781b9b6107d4076e10e51018
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23181
We can't run dead code elimination after erasing number types because dce relies on graph invariants that erase_number_types breaks.
Reviewed By: houseroad
Differential Revision: D16427819
fbshipit-source-id: d1b98a74d2558b14d4be692219691149689a93d8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23180
This pass needs to be run later because it breaks jit graph invariants and the lower_all_tuples pass still needs a valid jit graph.
Reviewed By: houseroad
Differential Revision: D16427680
fbshipit-source-id: 427c7e74c59a3d7d62f2855ed626cf6258107509
Summary:
Creating an untyped generic list is deprecated, we always want type information to be present.
This fixes test cases and removes one that used lists with ambiguous types.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23192
ghstack-source-id: 86972891
Differential Revision: D16431482
fbshipit-source-id: 4ca5cd142118a3f0a4dcb8cd77383127c54abb29
Summary:
---
How does the current code subsume all detections in the deleted `nccl.py`?
- The dependency of `USE_NCCL` on the OS and `USE_CUDA` is handled as dependency options in `CMakeLists.txt`.
- The main NCCL detection happens in [FindNCCL.cmake](8377d4b32c/cmake/Modules/FindNCCL.cmake), which is called by [nccl.cmake](8377d4b32c/cmake/External/nccl.cmake). When `USE_SYSTEM_NCCL` is false, the previous Python code deferred the detection to `find_package(NCCL)`. The change in `nccl.cmake` retains this.
- `USE_STATIC_NCCL` in the previous Python code simply changes the name of the detected library. This is done in `IF (USE_STATIC_NCCL)`.
- Now we only need to look at how the lines below line 20 in `nccl.cmake` are subsumed. These lines list paths to header and library directories that NCCL headers and libraries may reside in and try to search these directories for the key header and library files in turn. These are done by `find_path` for headers and `find_library` for the library files in `FindNCCL.cmake`.
* The call of [find_path](https://cmake.org/cmake/help/v3.8/command/find_path.html) (Search for `NO_DEFAULT_PATH` in the link) by default searches for headers in `<prefix>/include` for each `<prefix>` in `CMAKE_PREFIX_PATH` and `CMAKE_SYSTEM_PREFIX_PATH`. Like the Python code, this commit sets `CMAKE_PREFIX_PATH` to search for `<prefix>` in `NCCL_ROOT_DIR` and home to CUDA. `CMAKE_SYSTEM_PREFIX_PATH` includes the standard directories such as `/usr/local` and `/usr`. `NCCL_INCLUDE_DIR` is also specifically handled.
* Similarly, the call of [find_library](https://cmake.org/cmake/help/v3.8/command/find_library.html) (Search for `NO_DEFAULT_PATH` in the link) by default searches for libraries in directories including `<prefix>/lib` for each `<prefix>` in `CMAKE_PREFIX_PATH` and `CMAKE_SYSTEM_PREFIX_PATH`. But it also handles the edge cases intended to be solved in the Python code more properly:
- It only searches for `<prefix>/lib64` (and `<prefix>/lib32`) if it is appropriate on the system.
- It only searches for `<prefix>/lib/<arch>` for the right `<arch>`, unlike the Python code searches for `lib/<arch>` in a generic way (e.g., the Python code searches for `/usr/lib/x86_64-linux-gnu` but in reality systems have `/usr/lib/x86_64-some-customized-name-linux-gnu`, see https://unix.stackexchange.com/a/226180/38242 ).
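The prefix-based search being relied on can be sketched in a few lines (a simplification of CMake's `find_path` default behavior; the directory layout is illustrative):

```python
import os
import tempfile

def find_path(header, prefixes):
    """Search <prefix>/include for each prefix, like CMake's find_path default."""
    for prefix in prefixes:
        candidate = os.path.join(prefix, "include", header)
        if os.path.exists(candidate):
            return candidate
    return None

# Illustrative layout: an NCCL_ROOT_DIR-style prefix takes precedence over
# the system prefixes that come later in the search list.
nccl_root = os.path.join(tempfile.mkdtemp(), "nccl")
os.makedirs(os.path.join(nccl_root, "include"))
open(os.path.join(nccl_root, "include", "nccl.h"), "w").close()

found = find_path("nccl.h", [nccl_root, "/usr/local", "/usr"])
```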
---
Regarding for relevant issues:
- https://github.com/pytorch/pytorch/issues/12063 and https://github.com/pytorch/pytorch/issues/2877: These are properly handled, as explained in the updated comment.
- https://github.com/pytorch/pytorch/issues/2941 did not change NCCL detection specifically for Windows (it changed CUDA detection).
- b7e258f81ef61d19b884194cdbcd6c7089636d46 A versioned library detection is added, but the order is reversed: the unversioned library becomes preferred. This is because normally unversioned libraries are linked to versioned libraries and preferred by users, and local installations by users are often unversioned. As the documentation of [find_library](https://cmake.org/cmake/help/v3.8/command/find_library.html) suggests:
> When using this to specify names with and without a version suffix, we recommend specifying the unversioned name first so that locally-built packages can be found before those provided by distributions.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22930
Differential Revision: D16440275
Pulled By: ezyang
fbshipit-source-id: 11fe80743d4fe89b1ed6f96d5d996496e8ec01aa
Summary:
Some overlap with https://github.com/pytorch/pytorch/pull/21716 regarding caffe2 nonzero. Will rebase the other one accordingly whichever gets merged first.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22601
Reviewed By: zrphercule
Differential Revision: D16224660
Pulled By: houseroad
fbshipit-source-id: dbfd1b8776cb626601e0bf83b3fcca291806e653
Summary:
https://github.com/pytorch/pytorch/issues/20153
I believe you need 2 passes for this. Take this example
```python
@torch.jit.script
def f():
x = torch.ones(10, 9, 8, 7, 6)
return x[..., None, None].shape
```
which results in `[10, 9, 8, 7, 6, 1, 1]`
vs
```python
@torch.jit.script
def f():
x = torch.ones(10, 9, 8, 7, 6)
return x[..., None, None, :].shape
```
which results in `[10, 9, 8, 7, 1, 1, 6]`
After processing only `x[..., None, None`, we don't know whether we should be creating a new dimension at the end of the dimension list or somewhere in the middle. What we do depends on the elements to the right of it.
Thus, I do 2 passes - one to collect all the dimensions that the index operations operate on, and another that executes the index operations.
This still doesn't work for an ellipsis index followed by a tensor index, but it wasn't working previously either.
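The eager-mode behavior the pass must reproduce can be checked with NumPy, whose basic indexing rules match here:

```python
import numpy as np

x = np.ones((10, 9, 8, 7, 6))

# Trailing ellipsis with nothing after the Nones: new dims land at the end.
shape_a = x[..., None, None].shape      # (10, 9, 8, 7, 6, 1, 1)

# A trailing `:` pins the last dim, so the new dims land before it.
shape_b = x[..., None, None, :].shape   # (10, 9, 8, 7, 1, 1, 6)
```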
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22905
Differential Revision: D16433558
Pulled By: Chillee
fbshipit-source-id: c1b303cb97b1af8b6e405bad33495ef3b4c27c4a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23182
This fixes the issue seen in D16390551
Changing the load op to take in a shapes vector needs changes in lots of places (almost all usages of the load op).
Instead this is a small and safe change where the behavior is unchanged if we are loading multiple blobs and when loading a single blob without shape information.
If you are loading just one blob and the shape information is provided, then this returns the right shape info back.
For all other cases, behavior is unchanged as before we introduced the issue.
This fixes the issue reported by Andrey in D16229465
Reviewed By: boryiingsu
Differential Revision: D16428140
fbshipit-source-id: 8ef6705ab2efb346819489e1f166e23269f7ef8a
Summary:
fbgemm requires AVX512, which requires a more recent compiler, so this also switches all the nightlies from devtoolset3 to devtoolset7. Since CUDA 9.0 doesn't support devtoolset7, we also switch from CUDA 9.0 to CUDA 9.2.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22784
Differential Revision: D16428165
Pulled By: pjh5
fbshipit-source-id: c1af3729d8edce88a96fa9069d4c5a1808c25f99
Summary:
We need a way to get a complete list of features that are used in training a model. One way to do this is to make it possible to get the list of features used in each model layer. Then once the model is complete we can go through the layers and aggregate the features.
I've introduced a function to expose that information here, get_accessed_features, and implemented it in the FeatureSparseToDense layer to start with.
I've tried to include the minimum amount of information to make this useful, while making it easy to integrate into the variety of model layers. This is, for example, why AccessedFeatures does not contain feature_names which is not always present in a model layer. I debated whether or not to include feature_type, but I think that's useful enough, and easy enough to figure out in a model layer, that it's worth including.
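The aggregation step can be sketched as follows (the `AccessedFeatures` shape and the layer class here are illustrative stand-ins, not the real Caffe2 classes):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AccessedFeatures:  # illustrative stand-in for the real record
    feature_ids: frozenset
    feature_type: str

class FakeLayer:
    def __init__(self, ids, ftype):
        self._af = AccessedFeatures(frozenset(ids), ftype)

    def get_accessed_features(self):
        return [self._af]

# Once the model is complete, walk the layers and union the feature ids.
layers = [FakeLayer({1, 2}, "float"), FakeLayer({2, 3}, "id_list")]
all_ids = set()
for layer in layers:
    for af in layer.get_accessed_features():
        all_ids |= af.feature_ids
```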
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23036
Test Plan:
Added a unit test to verify the behavior of get_accessed_features in FeatureSparseToDense.
aml_dper2-fblearner-flow-integration-tests failed due to a known issue D16355865
aml_dper3-fblearner-flow-integration-tests failed due to a known issue T47197113
I verified no tests in the integration tests failed to issues other than those known ones.
DPER2 canaries: https://fburl.com/fblearner/1217voga
Reviewed By: volkhin
Differential Revision: D16365380
Pulled By: kevinwilfong
fbshipit-source-id: 2dbb4d832628180336533f29f7d917cbad171950
Summary:
I ran into the following error when trying to pass a Python int as an arg to `torch::jit::createStackForSchema`, and I think it is due to the missing support for `NumberType` in [toIValue](https://github.com/pytorch/pytorch/blob/master/torch/csrc/jit/pybind_utils.h#L448).
> RuntimeError: Missing cases in toIValue for type: Scalar! File a bug report. (toIValue at ../torch/csrc/jit/pybind_utils.h:449)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22817
Differential Revision: D16276006
Pulled By: mrshenli
fbshipit-source-id: 7f63519bb37219445e836ec1f51ca4f98bf52c44
Summary:
Bumping up the producer_version in exported ONNX models in view of the next release. Updating tests.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23120
Reviewed By: zrphercule
Differential Revision: D16420917
Pulled By: houseroad
fbshipit-source-id: 6686b10523c102e924ecaf96fd3231240b4219a9
Summary:
`pickle` supports this and a lot of the quantized use cases for get/set
state follow this pattern
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23119
Pulled By: driazati
Differential Revision: D16391234
fbshipit-source-id: 9f63e0a1679daa61b17aa64b5995e2be23b07b50
Summary:
Previously we looked at the stack frame of the function that called
`script` to resolve variables. This doesn't work if someone calls script
with a function defined somewhere else that references captured
variables. We already have a mechanism to look at the closed over
variables for a function, so this changes the `rcb` to use that.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22270
Pulled By: driazati
Differential Revision: D16391346
fbshipit-source-id: ad9b314ae86c249251b106079e76a5d7cf6c04c2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23166
Changing the load op to take in a shapes vector needs changes in lots of places (almost all usages of the load op).
Instead this is a small and safe change where the behavior is unchanged if we are loading multiple blobs and when loading a single blob without shape information.
If you are loading just one blob and the shape information is provided, then this returns the right shape info back.
For all other cases, behavior is unchanged as before we introduced the issue.
This fixes the issue reported by Andrey in D16229465
Reviewed By: boryiingsu
Differential Revision: D16390551
fbshipit-source-id: 1055b481a7a9e83021209e59f38a7cc0b49003cf
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23077
Although the difference between running from Python and this is small if the
forward method's loop is long enough (like 1000 iterations in this case).
Reviewed By: mingzhe09088
Differential Revision: D16122343
fbshipit-source-id: 5c1d1b98ae82c996baf9d42bcd04995e2ba60c78
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23076
Tracing based and non tracing based added
Reviewed By: mingzhe09088
Differential Revision: D16097280
fbshipit-source-id: 3a137092f7ccc3dd2d29d95e10178ec89d3ce892
Summary:
Update the ScatterWeightedSum op for the case where only one weighted X updates a slice of Y, which is usually the case when the op is used for gradient updates. The change removes the copy overhead, and we see significant operator performance improvements:
- 25-50% improvement on CUDA, depending on input configuration
- ~50% improvement on ROCm
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23087
Differential Revision: D16385194
Pulled By: bddppq
fbshipit-source-id: 3189e892940fb9c26305269eb0d47479b9b71af0
Summary:
This is a small patch to not overwrite unchanged files to help a bit with building.
It is not as incremental as one might like, given that one has to pass `--out-of-place-only` to not run into the patching and things.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23112
Differential Revision: D16402623
Pulled By: bddppq
fbshipit-source-id: 531ce0078bc716ae31bd92c5248080ef02a065b9
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22765
the pooling signature is the same as the non-quantized one. Adding it to the native_functions.yaml
Reviewed By: jerryzh168
Differential Revision: D16102608
fbshipit-source-id: 7627ad8f02a231f488b74d1a245b853f89d9c419
Summary:
USE_{C11,MSC,GCC}_ATOMICS are not used in PyTorch or submodules. Now we remove their underlying detection code.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23089
Differential Revision: D16402750
Pulled By: ezyang
fbshipit-source-id: fde84b958eb0b5b4d3f0406acefa92ab30ea43be
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21749
This is the first version without "requantization"
Reviewed By: jerryzh168
Differential Revision: D15807940
fbshipit-source-id: 19bb0482abed8ed9d1521a3fa1f15bda8e6a6a7c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23096
Nets can have state that depends on the rest of the state in the Workspace. Hence, they should be destructed first.
Reviewed By: ajyu
Differential Revision: D16382987
fbshipit-source-id: 3fd030ba206e2d0e897abb9e31c95bdaeb9482b7
Summary:
Add support for quantization aware training in eager mode
Modifications to Post training flow:
## Prepare
* Fusion: e.g. (Conv, Bn) → ConvBn (float)
* Swapping: To insert fake_quant into the weight, we need to swap the float modules that have weights with the corresponding qat modules, e.g. Conv → torch.nn.qat.Conv, ConvBn → torch.nn._intrinsic.qat.ConvBn
* Previously we were thinking about modifying the weight in a forward_pre hook and changing it back in a forward hook:
```python
def forward_pre_hook(self, input):
    self.float_weight = self.weight
    self.weight = self.fake_quantize(self.float_weight)

def forward_hook(self, input):
    self.weight = self.float_weight
```
* Assignments to self.weight are needed because we can't change the forward function, and the forward function uses self.weight.
* But we will need to keep two copies of weight in this case, so it’s probably better to just swap the module
* So we want to just swap Conv to torch.nn.qat.Conv and Linear to torch.nn.qat.Linear
* qat modules will have fake_quant for output and weights inserted in forward function
## Convert
* flow should be identical to ptq, but the swapping dictionary is slightly different since modules are changed in prepare step.
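The swap step amounts to applying a class-to-class mapping over the model's submodules (a sketch; the classes here stand in for e.g. Conv → torch.nn.qat.Conv):

```python
# Hedged sketch of the module-swapping dictionary used in the convert step.
class Conv: pass
class Linear: pass
class QATConv: pass       # stand-in for torch.nn.qat.Conv
class QATLinear: pass     # stand-in for torch.nn.qat.Linear

SWAP_MAPPING = {Conv: QATConv, Linear: QATLinear}

def swap_modules(named_modules, mapping):
    """Replace each module whose type appears in the mapping, in place."""
    for name, mod in list(named_modules.items()):
        target = mapping.get(type(mod))
        if target is not None:
            named_modules[name] = target()
    return named_modules

model = {"features.0": Conv(), "classifier": Linear()}
swap_modules(model, SWAP_MAPPING)
```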
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23082
ghstack-source-id: 86824650
Differential Revision: D16379374
fbshipit-source-id: 7d16d1acd87025065a24942ff92abf18e9fc8070
Summary:
Overall context: open-source BlackBoxPredictor as the entry
point for inference in Caffe2 (thread safe abstraction for Caffe2
inference). This should be used in ThroughputBenchmark for the purpose
of framework comparison
This specific diff:
There should be no harm in moving transformation code to
OSS. On the advantages side we will be able to compare production
Caffe2 setup with PyTorch in the most fair way via
ThroughputBenchmark. This approach avoids any complicated
transformation registries. Building those properly would be a significant
engineering effort as well as production risk. In the past we had SEVs
related to transforms being turned off due to various refactors. Given
that we don't plan to build any other significant investments into
transformation logic except existing ones (like TVM and Glow), and
those also relate to open-source technologies, I came up to the
conclusion of moving to OSS the whole thing.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22877
Test Plan:
did a bunch of unit tests locally and now
waitforsandcastle
AdFinder canary:
https://our.intern.facebook.com/intern/ads/canary/419623727275650390
adindexer:
https://our.intern.facebook.com/intern/ads/canary/419623750891549182
prospector:
https://our.intern.facebook.com/intern/ads/canary/419644899887610977
https://our.intern.facebook.com/intern/ads/canary/419645123742738405
Differential Revision: D16267765
Pulled By: salexspb
fbshipit-source-id: 776a1cd5415e0695eae28254b3f155e7a9bd8c2b
Summary:
1. Fix out of range memory access for reduction on all dimensions for non-packed
tensor.
2. Enabling launch config that maps block width to reduction on fastest striding
dimension. This mapping was previously only active when reducing on fastest
striding dimension of packed tensor, which is not necessary.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22827
Differential Revision: D16271897
Pulled By: zdevito
fbshipit-source-id: 20763f6cf9a58e44ffc0e7ec27724dfec8fe2c5d
Summary:
Fixes https://github.com/pytorch/pytorch/issues/22389
In most cases we only import `PIL` methods when we need them, but we missed a spot.
cc lanpa natalialunova sanekmelnikov
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23023
Reviewed By: sanekmelnikov
Differential Revision: D16373492
Pulled By: orionr
fbshipit-source-id: b08bf8a9b5a861390eadf62eda21ac055777180f
Summary:
This PR fixes the invalid None return when calling get_all_math_dtype(device='cuda').
The issue came from `list.append`, which returns None, so `return dtypes.append(...)` returned None instead of the list.
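A minimal reproduction of this bug class (the function names here are hypothetical, for illustration only):

```python
def get_dtypes_buggy():
    dtypes = ["float32", "float64"]
    # Bug: list.append mutates in place and returns None,
    # so this returns None instead of the list.
    return dtypes.append("float16")

def get_dtypes_fixed():
    dtypes = ["float32", "float64"]
    dtypes.append("float16")  # mutate first...
    return dtypes             # ...then return the list

assert get_dtypes_buggy() is None
assert get_dtypes_fixed() == ["float32", "float64", "float16"]
```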
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23028
Differential Revision: D16362732
Pulled By: colesbury
fbshipit-source-id: 0bbc30a0c663749d768159f1bc37b99f7263297b
Summary:
This PR aims at improving BERT performance on CPU by using `mkldnn` inner product for `nn.Linear()`.
The current logic is to use `mkldnn` only when `input` tensor is of mkldnn layout. This PR loosens this condition, `mkldnn` will be used for `nn.Linear()` when `input` tensor is of dense layout. The aten tensor is viewed inplace in `mkldnn` without additional memory copy.
1. when `input.dim() >= 3` , it is viewed as 2d tensor. e.g. `[T, N, C]` is treated as `[TN, C]`;
2. when `input` is not contiguous, it is copied so as to be contiguous. `mkldnn` inner product can't handle non-contiguous memory.
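The two conditions above can be sketched as follows. This is a hypothetical numpy stand-in for the mkldnn path, not the actual implementation:

```python
import numpy as np

def linear_2d_view(x, weight, bias):
    # Illustration of the dispatch conditions above: collapse leading
    # dimensions to a 2-D view, and force contiguity, since the inner
    # product needs contiguous memory.
    orig_shape = x.shape
    if x.ndim >= 3:
        x = x.reshape(-1, orig_shape[-1])   # e.g. [T, N, C] -> [T*N, C]
    if not x.flags["C_CONTIGUOUS"]:
        x = np.ascontiguousarray(x)
    y = x @ weight.T + bias
    return y.reshape(*orig_shape[:-1], weight.shape[0])

x = np.random.randn(4, 2, 8)   # [T, N, C]
w = np.random.randn(16, 8)     # [out_features, in_features]
b = np.zeros(16)
assert linear_2d_view(x, w, b).shape == (4, 2, 16)
```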
With this PR, BERT on `glue/MRPC` inference (batch size = 1) on Xeon 6148 single socket (20 cores@2.5GHz) improves by `44%`:
1. before (unit: iterations/sec):
```bash
408/408 [00:24<00:00, 16.69it/s]
```
2. after (unit: iterations/sec):
```bash
408/408 [00:16<00:00, 24.06it/s]
```
The latency reduces from `59.92 ms` to `41.56ms` correspondingly.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21851
Differential Revision: D16056334
Pulled By: dzhulgakov
fbshipit-source-id: 9b70ed58323b5e2f3f4e3ebacc766a74a8b68a8a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22732
Add support for quantization aware training in eager mode
Modifications to Post training flow:
## Prepare
* Fusion: e.g. (Conv, Bn) → ConvBn (float)
* Swapping: To insert fake_quant for the weight, we need to swap the float modules that have weights with the corresponding qat modules, e.g. Conv → torch.nn.qat.Conv, ConvBn → torch.nn._intrinsic.qat.ConvBn
```
# Previously we considered modifying the weight in a forward_pre_hook
# and changing it back in a forward_hook:
def forward_pre_hook(self, input):
    self.float_weight = self.weight
    self.weight = self.fake_quantize(self.float_weight)

def forward_hook(self, input):
    self.weight = self.float_weight
```
* Assignments to self.weight are needed because we can’t change the forward function, and the forward function uses self.weight.
* But we will need to keep two copies of weight in this case, so it’s probably better to just swap the module
* So we want to just swap Conv to torch.nn.qat.Conv and Linear to torch.nn.qat.Linear
* qat modules will have fake_quant for output and weights inserted in forward function
## Convert
* flow should be identical to ptq, but the swapping dictionary is slightly different since modules are changed in prepare step.
Reviewed By: zafartahirov
Differential Revision: D16199356
fbshipit-source-id: 62aeaf47c12c62a87d9cac208f25f7592e245d6c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22714
We need this module for add fake_quant for weight
Reviewed By: zafartahirov
Differential Revision: D16193585
fbshipit-source-id: ed6c04ecf574ca1fe1dcded22c225da05976f7a3
Summary:
When working on https://github.com/pytorch/pytorch/pull/22762, we discovered that we haven't actually deprecated legacy autograd function. This PR puts up the deprecation warning for 1.2, with the goal to remove legacy function support completely in the near future.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22922
Differential Revision: D16363916
Pulled By: yf225
fbshipit-source-id: 4b554010a3d1f87a3fa45cc1aa29d019c8f1033c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22950
Print quantized tensor by first dequantizing it and then printing. Also print the scale, zero_point, size, and type of the tensor.
Reviewed By: jerryzh168
Differential Revision: D16286397
fbshipit-source-id: 2d6fb1796e5b329a77c022b18af0a39f6edde0d7
Summary:
We are planning to put up a deprecation warning for legacy autograd function in 1.2: https://github.com/pytorch/pytorch/pull/22922. This PR removes all usage of legacy function in PyTorch core and test suite, to prepare for the eventual removal of legacy function.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22925
Differential Revision: D16344834
Pulled By: yf225
fbshipit-source-id: 8bf4cca740398835a08b7a290f3058c3e46781ba
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22316
Adding the quantized ReLU to native_functions.yaml, as it has the same signature as the non-quantized relu
Reviewed By: jerryzh168
Differential Revision: D16038441
fbshipit-source-id: 1cfbb594eb9bca1b7ec49ca486defcf1908b0d26
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22966
We want to implement "trimmed lasso" for feature selection with learnable and regularizable weights. Trimmed lasso is a simple yet powerful improved version from traditional lasso. More reference can be found at https://arxiv.org/abs/1708.04527 and http://proceedings.mlr.press/v97/yun19a.html. For quick and necessary intro, please refer to P1-3 of the paper at https://arxiv.org/abs/1708.04527.
Given n weights, traditional lasso sums up all weights' l1 norms. The trimmed lasso takes an input integer k (how many weights you want to select from n) and only sums over the smallest n - k weights. Given lambda as the regularization constant, the penalty term is only on the smallest n - k weights, but not other larger weights. If lambda becomes larger than certain threshold, the smallest n - k weights are shrunk to zero. That means we have those weights "dropped". With this property, the number k is the number of weights left after lasso, which we can easily control.
Meanwhile, we further support all available regularization in a single interface. Current supported regularizers on weights include no reg, l1, l2, elastic, trimmed l1, elastic with trimmed l1, group l1, and logbarrier.
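The trimmed lasso penalty described above can be sketched in a few lines (an illustrative sketch, not the actual Caffe2 operator):

```python
import numpy as np

def trimmed_lasso_penalty(weights, k, lam):
    # Trimmed lasso: penalize only the n - k smallest |w_i|,
    # leaving the k largest-magnitude weights unregularized.
    abs_w = np.sort(np.abs(weights))   # ascending order
    n = abs_w.size
    return lam * abs_w[: n - k].sum()

w = np.array([0.01, -0.02, 3.0, -4.0, 0.05])
# k = 2: only the three smallest magnitudes (0.01, 0.02, 0.05) are penalized,
# so with a large enough lambda those three weights shrink to zero.
assert np.isclose(trimmed_lasso_penalty(w, k=2, lam=1.0), 0.08)
```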
Differential Revision: D16326492
fbshipit-source-id: 6e1fd75606005d9bc09d6650435c96a7984ba69c
Summary:
Given that python 2.7 will be EOL'd on Jan 1, 2020 and we have models depending on python3.5+, we'd like to update the ROCm CI across the board to python3.6.
This PR adds the skip tests and some semantic changes for PyTorch.
Open tasks/questions:
* RoiAlignTest.CheckCPUGPUEqual fails in the Caffe2 unit tests. Is this expected / can it be skipped?
* for testing, I've used update-alternatives on CentOS/Ubuntu to select python == python 3.6. Is this the preferred way?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22322
Differential Revision: D16199862
Pulled By: ezyang
fbshipit-source-id: 46ca6029a232f7d23f3fdb5efc33ae39a379fca8
Summary:
Fixes https://github.com/pytorch/pytorch/issues/21935 by using the integer floor division that was introduced for convolution shapes in https://github.com/pytorch/pytorch/issues/9640. Without this fix, the pooling operators can produce a 1-element output in cases they shouldn't.
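The shape computation in question can be sketched with a hypothetical helper mirroring the standard pooling output-size formula; without integer floor division, rounding could yield a spurious extra output element:

```python
import math

def pool_output_size(input_size, kernel, stride, padding, ceil_mode=False):
    # out = floor((in + 2*pad - kernel) / stride) + 1, or ceil in ceil_mode.
    numer = input_size + 2 * padding - kernel
    if ceil_mode:
        out = math.ceil(numer / stride) + 1
        # ensure the last pooling window starts inside the (padded) input
        if (out - 1) * stride >= input_size + padding:
            out -= 1
    else:
        out = numer // stride + 1   # integer floor division
    return out

assert pool_output_size(7, kernel=2, stride=2, padding=0) == 3
assert pool_output_size(7, kernel=2, stride=2, padding=0, ceil_mode=True) == 4
```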
Disclaimer: I couldn't properly test it locally (it's not picking up the modified version for some reason). I'm marking this WIP until I've checked what the CI tools say...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22304
Differential Revision: D16181955
Pulled By: ezyang
fbshipit-source-id: a2405372753572548b40616d1206848b527c8121
Summary:
This cleans up the `torch.utils.tensorboard` API to remove all kwargs usage (which isn't clear to the user) and removes the "experimental" warning in prep for our 1.2 release.
We also don't need the additional PyTorch version checks now that we are in the codebase itself.
cc ezyang lanpa natalialunova
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21786
Reviewed By: natalialunova
Differential Revision: D15854892
Pulled By: orionr
fbshipit-source-id: 06b8498826946e578824d4b15c910edb3c2c20c6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22958
When we use `extension_loader.DlopenGuard()` to dyndep or import modules, it sets the `RTLD_GLOBAL` flag and restores the original flags after the `yield`. However, if the module is not there, the yield will fail and the flags won't be restored, creating all kinds of symbol conflict problems.
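The fix pattern is to restore the flags in a finally block. A minimal sketch on Unix (hypothetical, not the actual extension_loader code):

```python
import contextlib
import os
import sys

@contextlib.contextmanager
def dlopen_guard():
    # Restore the dlopen flags in a finally block, so a failing import
    # inside the `with` body cannot leave RTLD_GLOBAL set.
    old_flags = sys.getdlopenflags()
    sys.setdlopenflags(old_flags | os.RTLD_GLOBAL)
    try:
        yield
    finally:
        sys.setdlopenflags(old_flags)

before = sys.getdlopenflags()
try:
    with dlopen_guard():
        import module_that_does_not_exist  # noqa: F401
except ImportError:
    pass
assert sys.getdlopenflags() == before  # flags restored despite the failure
```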
Reviewed By: bddppq
Differential Revision: D16311949
fbshipit-source-id: 7b9ec6d60423ec5e78cae694b66c2f17493840b0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22830
Separating the tensor generation and the generation of the quantization parameters
- Introducing hypothesis filter `assume_not_overflowing`, which makes sure that the generated tensor and qparams play well with each other. **Note: This is an expensive filter!**
- `qtensor` -> Renamed to `tensor`
- `qtensor_conv` -> Renamed to `tensor_conv2d`
- The tensors don't return the quantization parameters anymore, use `qparams` for it
- The `dtypes` argument is just a quantized dtype now.
- The enforcement for zero_point is predefined as before: if set to `None`, the zero_point will be sampled. However, sampling can be overridden with `zero_point_min` and `zero_point_max`
- Scale sampling can also be overridden using `scale_min` and `scale_max`
Reviewed By: jerryzh168
Differential Revision: D16234314
fbshipit-source-id: 5b538a5aa9772b7add4f2ce5eff6fd0decd48f8e
Summary:
ONNX uses virtualenv and PyTorch doesn't, so the --user flag is causing problems in the ONNX CI.
Fix it by moving the flag to PyTorch-only scripts; ninja will be installed separately in the ONNX CI.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22946
Reviewed By: bddppq
Differential Revision: D16297781
Pulled By: houseroad
fbshipit-source-id: 52991abac61beaf3cfbcc99af5bb1cd27b790485
Summary:
…te argument in macro
Changelog:
- Update note about tensors on CPU for the following MAGMA functions
- magma_(d/s)getrf_gpu and magma_getrf_nopiv_gpu require tensors on CPU for pivots
- magma_(d/s)geqrf2_gpu requires tensors on CPU for elementary reflectors
- magma_(d/s)syevd_gpu requires tensors on CPU for eigenvalues
- Remove dummy tensor in ALLOCATE_ARRAY MACRO
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22618
Test Plan:
- All existing tests should pass to verify that the patch is correct
This PR has been proposed to eliminate confusion due to the previous comments, as indicated in https://github.com/pytorch/pytorch/issues/22573
Differential Revision: D16286198
Pulled By: zou3519
fbshipit-source-id: a5a6ec829084bdb752ca6006b8795227cbaf63b1
Summary:
This fixes up the test suite (mostly just adding `ignore` decorations
to tests that need to call Python function) so that it all passes with
recursive script enabled.
The main user-facing result of this change is that Python functions are
compiled without any decorators, so non-TorchScriptable code must be
decorated with `torch.jit.ignore` (or
`torch.jit.ignore(drop_on_export=True)` to maintain the functionality of
the current `ignore`)
Details can be found in #20939
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22887
Pulled By: driazati
Differential Revision: D16277608
fbshipit-source-id: 0abd0dc4291cf40651a1719bff813abb2b559640
Summary:
Motivation:
The forward method of MultiheadAttention has a kwarg key_padding_mask. This mask is of shape (N,S) where N is batch and S is sequence length. It is applied prior to the attention softmax, where positions that are True in the mask have their scores set to float('-inf'). This allows you to mask position j from attention for all positions i in the input sequence. It's typically used to mask padded inputs, so for a sample in a batch we can make sure no encoder outputs depend on padding inputs. Currently the Transformer, TransformerEncoder, and TransformerEncoderLayer do not have this kwarg, and only have options for (S,S), (T,T), and (S,T) masks, which are applied equally across the batch for source input, target output, and target-source memory respectively. These masks can't be used for padding and are instead used for things like subsequent masking in language modeling, by masking the attention of position i to position j.
This diff exposes the key_padding_mask to Transformer, TransformerEncoder, and TransformerEncoderLayer forward methods which is ultimately passed to MultiheadAttention forward.
Open question: should we also allow a key_padding_mask for the decoder layer? As padding is usually at the end of each sentence in a batch and sentences are usually decoding from left to right, usually people deal with padding on decoded outputs by just masking those outputs at the loss layer. There might be some scenarios where it's needed though I don't think it would be common. People can also still just subclass and override the layers. We could also pass the input key_padding_mask to the memory <> decoder attention layer. Not sure if that's necessary though because the output of position i from each attention encoder layer won't depend on any masked positions in the input (even if position i is a masked position itself) so there's not really any point in masking position i again.
Adds the key_padding_mask kwarg to Transformer, TransformerEncoder, and TransformerEncoderLayer forward methods.
The standard TransformerEncoderLayer uses a MultiheadAttention layer as self_attn. MultiheadAttention forward method has a key_padding_mask kwarg that allows for masking of values such as padding per sequence in a batch, in contrast to the attn_mask kwarg which is usually of shape (S,S) and applied equally across the batch.
MultiheadAttention calls functional.multi_head_attention_forward, which has the same key_padding_mask kwarg of shape (N,S). Masked (True) values are set to float('-inf').
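The masking semantics can be illustrated in numpy (an illustrative sketch of what happens before the softmax, not the torch implementation; `masked_attention_weights` is a hypothetical helper):

```python
import numpy as np

def masked_attention_weights(scores, key_padding_mask):
    # scores: (N, S, S) raw attention scores; key_padding_mask: (N, S) bool
    # where True marks a padding key. Masked key positions are set to -inf
    # before softmax, so no query position can attend to padding.
    scores = scores.copy()
    expanded = np.repeat(key_padding_mask[:, None, :], scores.shape[1], axis=1)
    scores[expanded] = -np.inf
    e = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

scores = np.zeros((1, 3, 3))
mask = np.array([[False, False, True]])  # last token in the sequence is padding
attn = masked_attention_weights(scores, mask)
assert np.allclose(attn[0, :, 2], 0.0)      # nothing attends to the padding key
assert np.allclose(attn.sum(axis=-1), 1.0)  # rows still normalize to 1
```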
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22588
Test Plan:
buck test mode/dev caffe2/test:nn -- 'test_transformerencoderlayer \(test_nn\.TestNN\)'
buck test mode/dev caffe2/test:nn -- 'test_Transformer_cell \(test_nn\.TestNN\)'
buck test mode/dev caffe2/test:nn -- 'test_transformer_args_check \(test_nn\.TestNN\)'
Differential Revision: D16112263
Pulled By: lucasgadams
fbshipit-source-id: dc4147dd1f89b55a4c94e8c701f16f0ffdc1d1a2
Summary:
Asterisks start emphases in rst. We should either escape them or put them as interpreted text.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22896
Differential Revision: D16282869
Pulled By: zou3519
fbshipit-source-id: 15ec4286434db55fb8357b1a12e6f70ef54f8c66
Summary:
The sccache wrapping strategy causes problems for at-runtime kernel
compilation of MIOpen kernels. We therefore - after the builds of
caffe2/pytorch are complete - unwrap sccache again by moving the clang-9
actual binary back into its original place.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22743
Differential Revision: D16283329
Pulled By: bddppq
fbshipit-source-id: 4fcdc92be295d5ea9aba75c30e39af1a18a80c13
Summary:
This is achieved by using `cuDevicePrimaryCtxGetState` as a way to check whether a primary context exists on a device. It is not too slow, from this benchmark of a single call to it on CUDA 10.1, Titan Xp, driver 415.27:
```
---------------------------------------------------------------------
Benchmark Time CPU Iterations
---------------------------------------------------------------------
BM_cuDevicePrimaryCtxGetState 301 ns 301 ns 2319746
```
Commits:
1. Add `CUDAHooks::getDeviceWithPrimaryContext` which returns a device index with primary context (if exists).
Link `c10/cuda` against `libcuda` for device API calls.
2. Use `getDeviceWithPrimaryContext` to check primary context in `pin_memory`.
Fix `OptionalDeviceGuard` doc.
3. Refactor `test_cuda_primary_ctx.py` to support multiple tests.
Add test for this in that file.
Fixes https://github.com/pytorch/pytorch/issues/21081.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22229
Differential Revision: D16170194
Pulled By: zou3519
fbshipit-source-id: 485a45f211b7844c9e69c63f3b3b75194a796c5d
Summary:
…te argument in macro
Changelog:
- Update note about tensors on CPU for the following MAGMA functions
- magma_(d/s)getrf_gpu and magma_getrf_nopiv_gpu require tensors on CPU for pivots
- magma_(d/s)geqrf2_gpu requires tensors on CPU for elementary reflectors
- magma_(d/s)syevd_gpu requires tensors on CPU for eigenvalues
- Remove dummy tensor in ALLOCATE_ARRAY MACRO
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22618
Test Plan:
- All existing tests should pass to verify that the patch is correct
This PR has been proposed to eliminate confusion due to the previous comments, as indicated in https://github.com/pytorch/pytorch/issues/22573
Differential Revision: D16227440
Pulled By: zou3519
fbshipit-source-id: 97d5537c5da98c0ed3edc4668a09294794fc426b
Summary:
…rides
Changelog:
- Fix behavior of `torch.triu` / `torch.tril` on certain unsqueezed tensors that lead to uninitialized values on CPU
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22730
Test Plan:
- Add tests for these cases in test_triu_tril in test_torch
Fixes https://github.com/pytorch/pytorch/issues/22581
Differential Revision: D16222897
Pulled By: zou3519
fbshipit-source-id: b86b060187797e5cd2a7731421dff1ba2b5c9596
Summary:
Align the behavior of `torch.utils.cpp_extension.CUDA_HOME` with that of `tools.setup_helpers.cuda.CUDA_HOME`.
Specifically, I swapped the position of guess 2 and guess 3 in `torch.utils.cpp_extension.CUDA_HOME`.
Fixing issue https://github.com/pytorch/pytorch/issues/22844
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22845
Differential Revision: D16276241
Pulled By: zou3519
fbshipit-source-id: 3b62b439b2f794a6f3637a5fee58991f430985fe
Summary:
We introduced RTTI in recent change: https://github.com/pytorch/pytorch/pull/21613
For the internal mobile build we don't enable '-frtti' yet. This diff is trying to replace
RTTI with an alternative approach.
According to dzhulgakov we could compare two tensors' type_id directly in most cases -
this is stricter than comparing the TensorImpl subclass type, as the TensorImpl -> type_id
mapping is 1-to-n, but it's more proper for this use case.
The only two cases where we can relax direct type comparison (for legacy reason) are:
1. CPUTensor <-> CUDATensor;
2. SparseCPUTensor <-> SparseCUDATensor;
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22773
Differential Revision: D16277696
Pulled By: ljk53
fbshipit-source-id: 043e264fbacc37b7a11af2046983c70ddb62a599
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22892
Think of num_runs as manually running the binary <num_runs> times. Each run runs the operator for many iterations.
Reviewed By: hl475
Differential Revision: D16271597
fbshipit-source-id: b6f509ee0332c70f85bec0d447b84940c5c0cecd
Summary:
Since recursive script creates a ScriptModule from an `nn.Module`,
there's no ties to the original module to pull a type name from, so we
have to explicitly pass it in.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22873
Pulled By: driazati
Differential Revision: D16268547
fbshipit-source-id: 902a30e6e36427c6ba7033ded027a29d9dcbc1ee
Summary:
Changelog:
- Port SVD TH implementation to ATen/native/BatchLinearAlgebra.cpp
- Port SVD THC implementation to ATen/native/cuda/BatchLinearAlgebra.cu
- Allow batches of matrices as arguments to `torch.svd`
- Remove existing implementations in TH and THC
- Update doc string
- Update derivatives to support batching
- Modify nuclear norm implementation to use at::svd instead of _batch_svd
- Remove _batch_svd as it is redundant
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21588
Test Plan:
- Add new test suite for SVD in test_torch.py with port to test_cuda.py
- Add tests in common_methods_invocations.py for derivative testing
Differential Revision: D16266115
Pulled By: nairbv
fbshipit-source-id: e89bb0dbd8f2d58bd758b7830d2389c477aa61fb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22517
Force anybody creating an untyped Dict to call c10::impl::deprecatedUntypedDict().
This should hopefully make it clear that this is not public API and prevent people from using it.
Reviewed By: dzhulgakov
Differential Revision: D16115214
fbshipit-source-id: 2c8d0e4e375339c699d583995f79c05c59693c3e
Summary:
Introduce Azure Pipelines for the linting checks. This is meant to be equivalent to the existing Travis linting phase.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22839
Differential Revision: D16260376
Pulled By: ezyang
fbshipit-source-id: 1e535c3096358be67a0dad4cd920a92082b2d18e
Summary:
As part of the Variable/Tensor merge, `variable.tensor_data()` should be removed in favor of `variable.detach()`. This PR removes `tensor_data()` call sites in Python `Variable()` and `nn.Parameter()` constructor paths.
Note that this PR is BC-breaking in the following way:
- For Python `Variable()` constructor:
Previously, in-place updating a tensor after it's been used to create a Variable does not bump the Variable's version counter, which causes the following problem:
```python
t = torch.ones(2, 3)
v = torch.autograd.Variable(t).requires_grad_()
y = v * v
t.add_(1) # This bumps version counter of `t`
y.sum().backward() # This computes `v`'s gradient incorrectly before this patch, and throws error after this patch
```
After this patch, in-place updating a tensor after it's been used to create a Variable will also bump the Variable's version counter, thus preserving the correctness of the Variable's version counter.
- For Python `nn.Parameter()` constructor:
Previously, in-place updating a tensor after it's been used to create an nn.Parameter does not bump the nn.Parameter's version counter, which causes the following problem:
```python
t = torch.ones(2, 3)
v = torch.nn.Parameter(t)
y = v * v
t.add_(1) # This bumps version counter of `t`
y.sum().backward() # This computes `v`'s gradient incorrectly before this patch, and throws error after this patch
```
After this patch, in-place updating a tensor after it's been used to create an nn.Parameter will also bump the nn.Parameter's version counter, thus preserving the correctness of the nn.Parameter's version counter.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22821
Differential Revision: D16258030
Pulled By: yf225
fbshipit-source-id: 9a6d68cea1864893193dbefbb6ef0c1d5ca12d78
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22829
Sending out the caffe2 load op changes separately since we want to pick it to open source.
This change is needed because the shape information of the blobs is determined from the load operator and that shape information is needed in our download_group.
Reviewed By: boryiingsu
Differential Revision: D16229465
fbshipit-source-id: f78b2df9a7f26968d70eca68dde75cd11ab6f7a2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22323
This diff adds an interface to use quantized Linear op in JIT.
Reviewed By: jamesr66a
Differential Revision: D16040724
fbshipit-source-id: 90e90aff9973c96ea076ed6a21ae02c349ee2bcf
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22023
This diff implements a Linear operation with fp16 weights based on FBGEMM. At a high level, we want to perform the following operation:
Y = X * W + B with dtypes:
(fp32, fp32, fp16, fp32)
To do that, three steps are needed:
1. Quantize weights from fp32 to fp16, this is done using `PackedGemmMatrixFP16` in the `fbgemm_pack_gemm_matrix_fp16`
2. Conduct matrix multiplication with quantized weights using `cblas_gemm_compute` in `fbgemm_linear_fp16_weight`
3. Add bias to the result from step2 and return the final Y
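The three steps can be sketched with numpy (a minimal sketch; `linear_fp16_weight` here is a hypothetical stand-in for the FBGEMM-backed op, not its real signature):

```python
import numpy as np

def linear_fp16_weight(x, w_fp32, b):
    # Step 1: quantize/pack the weights fp32 -> fp16 (done once at pack time)
    w_fp16 = w_fp32.astype(np.float16)
    # Step 2: matrix multiplication in fp32 with the half-precision weights
    y = x @ w_fp16.astype(np.float32).T
    # Step 3: add the fp32 bias and return the final Y
    return y + b

x = np.random.randn(2, 8).astype(np.float32)
w = np.random.randn(4, 8).astype(np.float32)
b = np.zeros(4, dtype=np.float32)
y = linear_fp16_weight(x, w, b)
assert y.shape == (2, 4)
# fp16 weights lose some precision, but results stay close to full fp32.
assert np.allclose(y, x @ w.T + b, atol=1e-1)
```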
Reviewed By: jianyuh
Differential Revision: D15921768
fbshipit-source-id: dc4e5b366f846ce9d58975876940a9b3372b8b8d
Summary:
Add support for breaks and continues in the jit. We do this with a Graph transform pre-SSA.
A graph of the form
```
def test():
while i < 5:
if i == 3:
break
i += 1
print(i)
```
has the body of the loop transformed to
```
if i == 3:
did_break = True
else:
did_break = False
if did_break:
loop_exit = True
else:
i += 1
print(i)
loop_exit = i < 5
```
I am going to add more tests but I think it is ready for review now.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21692
Differential Revision: D16215807
Pulled By: eellison
fbshipit-source-id: 365102f42de4861d9323caaeb39a96de7619a667
Summary:
This is an extension to the original PR https://github.com/pytorch/pytorch/pull/21765
1. Increase the coverage of different opsets support, comments, and blacklisting.
2. Adding backend tests for both caffe2 and onnxruntime on opset 7 and opset 8.
3. Reusing onnx model tests in caffe2 for onnxruntime.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22421
Reviewed By: zrphercule
Differential Revision: D16225518
Pulled By: houseroad
fbshipit-source-id: 01ae3eed85111a83a0124e9e95512b80109d6aee
Summary:
Using PMCTest (https://www.agner.org/optimize/) to measure
TensorIterator construction, this results in ~600 fewer instructions
retired (~300 fewer cycles) for constructing TensorIterator on a 1D
tensor. (Should be roughly ~100 ns, but it's hard to measure that
precisely end-to-end).
```
Before:
Clock Core cyc Instruct Uops L1D Miss
5082 2768 5690 7644 3
After:
Clock Core cyc Instruct Uops L1D Miss
4518 2437 5109 6992 0
```
Note that Instruct is reliable, Core cyc is a little noisy, and Clock
is a little more noisy.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22756
Differential Revision: D16207777
Pulled By: VitalyFedyunin
fbshipit-source-id: bcc453a90472d9951a1c123bcb1b7a243fde70ac
Summary:
Speeds up the common case where Tensor is a torch.Tensor (not a
subclass). This reduces the number of executed instructions for a
torch.add(tensor1, tensor2) by ~328 (should be ~65 ns faster).
Note that most of the PythonArgs accessors are too large to be inlined.
We should move most of them to the cpp file.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22782
Differential Revision: D16223592
Pulled By: colesbury
fbshipit-source-id: cc20f8989944389d5a5e3fab033cdd70d581ffb1
Summary:
This PR aims at improving `topk()` performance on CPU. This is useful when computing **beam search** during `Transformer` and `BERT`.
Given a tensor x of size `[N, C]` to which we want to apply `x.topk(K)`, the current logic is to **sequentially** loop over the dimension `N` and do **quick select** over the dimension `C` to find the top K elements.
Performance can be further improved in two ways:
- The loop over `N` can be parallelized
- A faster selection algorithm can be used for `topk` (after a bunch of experimenting, `std::partial_sort` seems to be the most promising)
So i compared 3 versions:
1. vanilla: sequential + quick select
2. reference PR https://github.com/pytorch/pytorch/issues/19737: parallel + quick select
3. this PR: parallel + partial sort
with the following benchmark, on `Xeon 8180, 2*28 cores@2.5 GHz`:
```python
import torch
from time import time
num_iters = 1000
def bench_topk(N=8, C=168560, k=10):
    a = torch.randn(N, C)
    # warm up
    for i in range(100):
        torch.topk(a, k)
    t = 0
    for i in range(num_iters):
        a = torch.randn(N, C)
        start = time()
        value, indice = torch.topk(a, k)
        t += time() - start
    print("#[%d, %d] times: %f ms" % (N, C, t / num_iters * 1000))

Ns = [10, 20, 30]
Cs = [10000, 20000, 40000, 80000, 160000, 320000]
for n in Ns:
    for c in Cs:
        bench_topk(N=n, C=c)
```
### vanilla: sequential + quick select
```
#[10, 10000] times: 0.746740 ms
#[10, 20000] times: 1.437399 ms
#[10, 40000] times: 2.832455 ms
#[10, 80000] times: 5.649426 ms
#[10, 160000] times: 11.309466 ms
#[10, 320000] times: 22.798765 ms
#[20, 10000] times: 1.511303 ms
#[20, 20000] times: 2.822024 ms
#[20, 40000] times: 5.564770 ms
#[20, 80000] times: 11.443044 ms
#[20, 160000] times: 22.747731 ms
#[20, 320000] times: 46.234449 ms
#[30, 10000] times: 2.214045 ms
#[30, 20000] times: 4.236179 ms
#[30, 40000] times: 8.418577 ms
#[30, 80000] times: 17.067578 ms
#[30, 160000] times: 33.826214 ms
#[30, 320000] times: 68.109420 ms
```
### reference PR: parallel + quick select
```
#[10, 10000] times: 0.271649 ms
#[10, 20000] times: 0.593016 ms
#[10, 40000] times: 1.133518 ms
#[10, 80000] times: 2.082355 ms
#[10, 160000] times: 4.049928 ms
#[10, 320000] times: 7.321285 ms
#[20, 10000] times: 0.315255 ms
#[20, 20000] times: 0.539054 ms
#[20, 40000] times: 1.000675 ms
#[20, 80000] times: 1.914586 ms
#[20, 160000] times: 4.437122 ms
#[20, 320000] times: 8.822445 ms
#[30, 10000] times: 0.347209 ms
#[30, 20000] times: 0.589947 ms
#[30, 40000] times: 1.102814 ms
#[30, 80000] times: 2.112201 ms
#[30, 160000] times: 5.186837 ms
#[30, 320000] times: 10.523023 ms
```
### this PR: parallel + partial sort
```
#[10, 10000] times: 0.150284 ms
#[10, 20000] times: 0.220089 ms
#[10, 40000] times: 0.521875 ms
#[10, 80000] times: 0.965593 ms
#[10, 160000] times: 2.312356 ms
#[10, 320000] times: 4.759422 ms
#[20, 10000] times: 0.167630 ms
#[20, 20000] times: 0.265607 ms
#[20, 40000] times: 0.471477 ms
#[20, 80000] times: 0.974572 ms
#[20, 160000] times: 3.269645 ms
#[20, 320000] times: 6.538608 ms
#[30, 10000] times: 0.204976 ms
#[30, 20000] times: 0.342833 ms
#[30, 40000] times: 0.589381 ms
#[30, 80000] times: 1.398579 ms
#[30, 160000] times: 3.904077 ms
#[30, 320000] times: 9.681224 ms
```
In summary, `2` is **5x** faster than `vanilla` on average and `3` is **8.6x** faster than `vanilla`.
On `Fairseq Transformer`, the default parameter on dataset `wmt14` would have a `topk` size of `[8, 168560]`, and this operator gets `3x` faster with this PR.
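The "parallel + partial sort" idea can be sketched in Python, with `heapq.nlargest` standing in for `std::partial_sort` and a thread pool standing in for the parallel loop over `N` (an illustrative sketch, not the C++ implementation):

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

def topk_rows(x, k):
    # Each row is processed independently (parallel over N); within a row
    # only the top k elements are selected and kept in sorted order,
    # analogous to std::partial_sort over the C dimension.
    def row_topk(row):
        return heapq.nlargest(k, range(len(row)), key=row.__getitem__)
    with ThreadPoolExecutor() as pool:
        indices = list(pool.map(row_topk, x))
    values = [[row[i] for i in idx] for row, idx in zip(x, indices)]
    return values, indices

vals, idxs = topk_rows([[3, 1, 4, 1, 5], [9, 2, 6, 5, 3]], k=2)
assert vals == [[5, 4], [9, 6]]
assert idxs == [[4, 2], [0, 2]]
```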
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19736
Differential Revision: D16204820
Pulled By: VitalyFedyunin
fbshipit-source-id: ea70562c9149a0d832cf5872a891042ebd74fc63
Summary:
For three 1-D operands, compute_strides now takes 298 instructions instead
of 480. (Saves ~36 ns). We'll want to make Tensor::sizes(), strides(), and
element_size() trivially inlinable to speed this up more.
(Using PMCTest from https://www.agner.org/optimize/ to measure instructions retired)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22779
Differential Revision: D16223595
Pulled By: colesbury
fbshipit-source-id: e4730755f29a0aea9cbc82c2d376a8e6a0c7bce8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22781
The custom op is required to make the op benchmark work with JIT. Run `python setup.py install` in the pt_extension directory to install it; it is required.
Reviewed By: hl475
Differential Revision: D16214430
fbshipit-source-id: c9221c532011f9cf0d5453ac8535a6cde65e8376
Summary:
Currently ONNX constant folding (`do_constant_folding=True` arg in `torch.onnx.export` API) supports only opset 9 of ONNX. For opset 10, it is a no-op. This change enables ONNX constant folding for opset 10. Specifically there are three main changes:
1) Turn on constant folding ONNX pass for opset 10.
2) Update support for opset 10 version of `onnx::Slice` op for backend computation during constant folding.
3) Enable constant folding tests in `test/onnx/test_utility_funs.py` for multiple opsets (9 and 10).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22515
Reviewed By: zrphercule
Differential Revision: D16189336
Pulled By: houseroad
fbshipit-source-id: 3e2e748a06e4228b69a18c5458ca71491bd13875
Summary:
1. Update on restricting block.z <= 64, compliant with the CUDA maximum z-dimension of
a block;
2. clang-format
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22602
Differential Revision: D16203857
Pulled By: ezyang
fbshipit-source-id: 567719ae175681a48eb0f818ca0aba409dca2550
Summary:
Some other environment variables can be added to speed things up for development.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22736
Differential Revision: D16200904
Pulled By: soumith
fbshipit-source-id: 797ef91a863a244a6c96e0adf64d9f9b4c9a9582
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22706
Moved the models used for quantization test from the test_quantization.py file to common_quantization.py
Reviewed By: jerryzh168
Differential Revision: D16189865
fbshipit-source-id: 409b43454b6b3fe278ac16b1affb9085d6ed6835
Summary:
Previously in tracing when we called a script function we would inline the graph and set the graph inputs equal to the types the graph was invoked with.
This breaks for optional arguments invoked with None since we rely on None being set to Optional[T] in schema matching.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22686
Differential Revision: D16186372
Pulled By: eellison
fbshipit-source-id: e25c807c63527bf442eb8b31122d50689c7822f5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22694
Move quantization and quantized utility functions for testing to common_quantized.py and common_quantization.py. Additionally, add a quantized test case base class which contains common methods for checking the results of quantization on modules. As a consequence of the move, fixed the imports at the top of test_quantized.py and test_quantization to use the new utilities.
Reviewed By: jerryzh168
Differential Revision: D16172012
fbshipit-source-id: 329166af5555fc829f26bf1383d682c25c01a7d9
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/22631
Test Plan:
test suite
Imported from OSS
Differential Revision: D16185040
fbshipit-source-id: 9b83749f6c9cd05d13f54a3bb4801e263293252b
Summary:
After converting BN layers to SyncBN layers, the function will set all `requires_grad = True` regardless of the original requires_grad states. I think it is a bug and have fixed it in this PR.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22569
Differential Revision: D16151647
Pulled By: zou3519
fbshipit-source-id: e2ad1886c94d8882485e7fb8be51ad76469ecc67
Summary:
Addressing potential dependency issue by adding forward declaration for OutputArchive/InputArchive.
This change follows the same pattern in base.h in 'torch/csrc/api/include/torch/data/samplers/base.h'
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22562
Differential Revision: D16161524
Pulled By: soumith
fbshipit-source-id: d03f8a2ece5629762f9fa8a27b15b0d037e8f07b
Summary:
Also revert the change of cmake.py in
c97829d7011bd59d662f6af9c3a0ec302e7e75fc. Comments are added to
prevent similar incidents in the future (which have occurred a couple of times in the past).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22641
Differential Revision: D16171763
Pulled By: ezyang
fbshipit-source-id: 5a65f9fbb3c1c798ebd25521932bfde0ad3d16fc
Summary:
No need to `clone` if the expanded size matches original size.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22634
Differential Revision: D16171091
Pulled By: ezyang
fbshipit-source-id: 3d8f116398f02952488e321c0ee0ff2868768a0c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21209
This diff introduces a new interface to add a list of operators. Here are the steps to add ops using this interface:
- create op_list:
```
unary_ops_list = op_bench.op_list(
attr_names=["op_name", "op_function"],
attrs=[
["abs", torch.abs],
["abs_", torch.abs_],
],
)
```
- create a bench class:
```
class UnaryOpBenchmark(op_bench.TorchBenchmarkBase):
def init(self, M, N, op_function):
self.input_one = torch.rand(M, N)
self.op_func = op_function
def forward(self):
return self.op_func(self.input_one)
```
- register those ops:
```
op_bench.generate_pt_tests_from_list(unary_ops_list, unary_ops_configs, UnaryOpBenchmark)
```
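A minimal sketch of what the list-based generation step might do (a hypothetical helper, not the actual op_bench implementation): cross each (name, function) pair with each config and emit a named test case.

```python
import itertools

def generate_test_names(op_list, configs):
    # Hypothetical sketch of list-based test generation: cross every
    # (name, fn) pair with every config dict and yield a named case.
    for (name, fn), cfg in itertools.product(op_list, configs):
        case = name + "_" + "_".join(f"{k}{v}" for k, v in sorted(cfg.items()))
        yield case, fn, cfg
```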
Reviewed By: zheng-xq
Differential Revision: D15514188
fbshipit-source-id: f09b359cab8175eeb8d51b3ad7bbbcfbc9f6430f
Summary:
The error for `test_error_stack_module`:
```
Traceback (most recent call last):
File "../test.py", line 35, in <module>
scripted = torch.jit.script(M())
File "/home/davidriazati/other/pytorch/torch/jit/__init__.py", line 1119, in script
return _convert_to_script_module(obj)
File "/home/davidriazati/other/pytorch/torch/jit/__init__.py", line 1825, in _convert_to_script_module
raise e
RuntimeError:
d(int x) -> int:
Expected a value of type 'int' for argument 'x' but instead found type 'str'.
:
at ../test.py:11:12
def c(x):
return d("hello") + d(x)
~ <--- HERE
'c' is being compiled since it was called from 'b'
at ../test.py:14:12
def b(x):
return c(x)
~~~ <--- HERE
'b' is being compiled since it was called from 'forward'
at ../test.py:22:16
def forward(self, x):
return b(x)
~~~ <--- HERE
'forward' is being compiled since it was called from 'forward'
at ../test.py:31:20
def forward(self, x):
return x + self.submodule(x)
~~~~~~~~~~~~~~~~ <--- HERE
```
This also unifies our error reporting in the front end with `ErrorReport`
TODO
* Include module names in message, #22207 should make this easy
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22280
Pulled By: driazati
Differential Revision: D16060781
fbshipit-source-id: c42968b53aaddb774ac69d5abbf7e60c23df8eed
Summary:
Some of my qpth users have told me that updating to the latest version of PyTorch and replacing the btrifact/btrisolve calls with the LU ones wasn't working and I didn't believe them until I tried it myself :)
These updates have broken unpivoted LU factorizations/solves on CUDA. The LU factorization code used to return the identity permutation when pivoting wasn't used but now returns all zeros as the pivots. This PR reverts it back to return the identity permutation. I've not yet tested this code as I'm having some trouble compiling PyTorch with this and am hitting https://github.com/pytorch/pytorch/issues/21700 and am not sure how to disable that option.
Here's a MWE to reproduce the broken behavior, and my fix.
```python
torch.manual_seed(0)
n = 4
L = torch.randn(n,n)
A = L.mm(L.t()).unsqueeze(0)
b = torch.randn(1, n)
A_lu_cpu = torch.lu(A)
A_lu_cuda_nopivot = torch.lu(A.cuda(), pivot=False)
A_lu_cuda_pivot = torch.lu(A.cuda(), pivot=True)
print('A_lu_cuda_nopivot\n', A_lu_cuda_nopivot)
print('-----\nA_lu_cuda_pivot\n', A_lu_cuda_nopivot)
x_cpu = b.lu_solve(*A_lu_cpu)
x_cuda_nopivot = b.cuda().lu_solve(*A_lu_cuda_nopivot)
x_cuda_nopivot_fixed = b.cuda().lu_solve(
A_lu_cuda_nopivot[0], torch.arange(1, n+1, device='cuda:0').int())
x_cuda_pivot = b.cuda().lu_solve(*A_lu_cuda_pivot)
print(x_cpu, x_cuda_nopivot, x_cuda_nopivot_fixed, x_cuda_pivot)
```
Output:
```
A_lu_cuda_nopivot
(tensor([[[ 2.8465, -0.7560, 0.8716, -1.7337],
[-0.2656, 5.5724, -1.1316, 0.6678],
[ 0.3062, -0.2031, 1.4206, -0.5438],
[-0.6091, 0.1198, -0.3828, 1.5103]]], device='cuda:0'), tensor([[0, 0, 0, 0]], device='cuda:0', dtype=torch.int32))
-----
A_lu_cuda_pivot
(tensor([[[ 2.8465, -0.7560, 0.8716, -1.7337],
[-0.2656, 5.5724, -1.1316, 0.6678],
[ 0.3062, -0.2031, 1.4206, -0.5438],
[-0.6091, 0.1198, -0.3828, 1.5103]]], device='cuda:0'), tensor([[0, 0, 0, 0]], device='cuda:0', dtype=torch.int32))
(tensor([[-0.3121, -0.1673, -0.4450, -0.2483]]),
tensor([[-0.1661, -0.1875, -0.5694, -0.4772]], device='cuda:0'),
tensor([[-0.3121, -0.1673, -0.4450, -0.2483]], device='cuda:0'),
tensor([[-0.3121, -0.1673, -0.4450, -0.2483]], device='cuda:0'))
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22242
Differential Revision: D16049334
Pulled By: ezyang
fbshipit-source-id: 7eacae810d87ffbdf8e07159bbbc03866dd9979d
Summary:
This PR activates faster depthwise convolution kernels for Volta and Turing GPUs using cudnn >= 7600.
The script to benchmark the current PyTorch master branch and this PR branch can be found [here](https://gist.github.com/ptrblck/4590cf20721d8f43296c9903abd4a774).
(50 warmup iterations, 1000 iterations for timing)
I've used https://github.com/pytorch/pytorch/issues/3265 to create a similar benchmark and added a few additional setups.
Since the results are quite long, I've uploaded them in a spreadsheet [here](https://docs.google.com/spreadsheets/d/13ByXcqg7LQUr3DVG3XpLwnJ-CXg3GUZJ3puyTMw9n2I/edit?usp=sharing).
Times are given in ms per iteration.
We've benchmarked this PR on a DGX1 using V100 GPUs.
The current workload check in `check_cudnn_depthwise_workload` is quite long and can be moved to another file, if wanted.
CC ngimel (Thanks for the support while benchmarking it ;) )
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22302
Differential Revision: D16115057
Pulled By: ezyang
fbshipit-source-id: bad184658518e73b4d6b849d77e408f5a7a757de
Summary:
Having the NVRTC stub in ATen is necessary to call driver APIs in ATen. This is currently blocking https://github.com/pytorch/pytorch/pull/22229.
`DynamicLibrary` is also moved as it is used in the stub code, and seems general enough.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22362
Differential Revision: D16131787
Pulled By: ezyang
fbshipit-source-id: add2ee8a8865229578aa00001a00d5a6671e0e73
Summary:
Syncing worker requirement mismatches to improve remote build time.
Created actions:
MEDIUM: 488
LARGE: 29
XXLARGE: 2
Updated actions:
From MEDIUM to LARGE: 227
From XLARGE to MEDIUM: 1
From XLARGE to LARGE: 1
From XLARGE to XXLARGE: 1
From LARGE to MEDIUM: 2
From LARGE to XLARGE: 2
Differential Revision: D16161669
fbshipit-source-id: 67a4e0d883ca3f1ca3185a8285903c0961537757
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22143
Like the Conv DNNLOWP operator, allow FC to run the slow path to debug numerical issues caused by Intel's int8 instruction that does horizontal addition of two int8 multiplication results in 16 bits
Reviewed By: hx89
Differential Revision: D15966885
fbshipit-source-id: c6726376a3e39d341fd8aeb0e54e0450d2af8920
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22174
This is a preliminary change outlining the approach we plan to follow to integrate QNNPACK operators into the pytorch backend. The operators will not be made visible to the user in the python world, so ultimately we will have a function that calls qnnpack backend based on the environment being run on.
The goal of the project is to integrate QNNPACK library with PyTorch to achieve good performance for quantized mobile models.
Reviewed By: ljk53
Differential Revision: D15806325
fbshipit-source-id: c14e1d864ac94570333a7b14031ea231d095c2ae
Summary:
Some duplicated code is removed. It also becomes clear that there is only one special case `div_kernel_cuda` is handling.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22555
Differential Revision: D16152091
Pulled By: zou3519
fbshipit-source-id: bb875370077c1f84efe4b766b3e1acc461e73e6c
Summary:
Fix a grammatical error of the comment in line 233.
change from " Returns an `OrderedDict` of he submodules of this `Module`"
to " Returns an `OrderedDict` of the submodules of this `Module`"
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22548
Differential Revision: D16134534
Pulled By: zou3519
fbshipit-source-id: 33b1dd0fbc3a24bef99b6e0192566e2839292842
Summary:
As part of the Variable/Tensor merge, we want to be able to pass Variables into Caffe2 without doing extra shallow copy, to improve performance and also allow for in-place mutations in Caffe2 ops. There are a few approaches outlined in https://github.com/pytorch/pytorch/pull/22418, and this PR is the chosen approach.
Specifically, we can have the assumption that we won't be connecting autograd to C2 gradients at any point (as it's too tricky and not that useful). Therefore, we can pass Variable into Caffe2 ops by requiring that all Variables in Caffe2 don't require grad. For code paths in Caffe2 that might potentially track gradients (e.g. `ScriptModuleOp` and `call_caffe2_op_from_c10`), we use the `torch::NoGradGuard` to make sure gradients are not tracked.
This supersedes https://github.com/pytorch/pytorch/pull/22418.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22473
Differential Revision: D16099042
Pulled By: yf225
fbshipit-source-id: 57efc3c7cfb3048d9abe90e63759acc14ebd2972
Summary:
Forgot to mirror the `nn/ __init__.py` semantics in the new `nn` type stub.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22411
Differential Revision: D16149798
Pulled By: ezyang
fbshipit-source-id: 0ffa256fbdc5e5383a7b9c9c3ae61acd11de1dba
Summary:
`addcmul_out` overwrote the samples, which led to constant values being output by `torch.normal`.
Changelog:
- Replace the `addcmul_out` calls with a combination of in-place `mul` and `add`, with justification for this change.
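The replacement can be sketched, torch-free, on a plain list (illustrative only; the real change uses in-place tensor `mul`/`add`):

```python
def rescale_normal_(samples, mean, std):
    # Sketch of the fix: the equivalent of `out.mul_(std).add_(mean)`.
    # Unlike the broken `addcmul_out` path, this scales and shifts the
    # fresh standard-normal draws instead of overwriting them.
    for i in range(len(samples)):
        samples[i] = samples[i] * std + mean
    return samples
```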
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22533
Test Plan:
- Enable tests for test_normal on all devices
Fixes https://github.com/pytorch/pytorch/issues/22529
Differential Revision: D16141337
Pulled By: ezyang
fbshipit-source-id: 567a399042e0adcd154582f362318ce95a244c62
Summary:
Currently, specifying the "USE_"-family build options is in disarray.
Many build options accept three variants: USE_OPTION, WITH_OPTION, and
NO_OPTION; some accept only the USE_ and NO_ variants; some accept only
USE_. This inconsistency is confusing and hard to maintain.
To resolve this inconsistency, we can either let all these build options
support all three variants, or we only support the USE_ variant.
This commit makes a step to the latter choice, i.e., deprecates and sets
a date for removing the NO_ and WITH_ variants and keeps only the
USE_ variant. This is likely better than the former solution because:
- NO_ and WITH_ variants are not documented.
- CMakeLists.txt only has the USE_ variants for relevant build options
defined. It would be surprising for users to pass these variables to
CMake during a rebuild and find them ineffective.
- Multiple variants are difficult to maintain.
- The behavior is confusing if more than one variant is passed. For
example, what to be expected if one sets "NO_CUDA=1 USE_CUDA=1"?
The downside is that this will break backward compatibility for existing
build scripts in the future (if they used the undocumented build
options).
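The intended end state can be sketched as follows (a hypothetical helper, not the actual build script): only the USE_ spelling is consulted, and NO_/WITH_ spellings are ignored.

```python
import os

def use_flag(name, default=False):
    # Hypothetical sketch of the post-deprecation behavior: only the
    # USE_<NAME> variant is read; NO_<NAME> and WITH_<NAME> are ignored.
    val = os.environ.get("USE_" + name)
    if val is None:
        return default
    return val.lower() not in ("0", "off", "no", "false", "")
```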
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22474
Differential Revision: D16149396
Pulled By: ezyang
fbshipit-source-id: 7145b88ad195db2051772b9665dd708dfcf50b7d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22477
There is actually no use of an uninitialized variable, but some compilers are not smart enough to reason that the two if branches are always taken together.
Reviewed By: hx89
Differential Revision: D16100211
fbshipit-source-id: 25f01d668063603d7aaa776451afe8a10415d2ea
Summary:
After the Variable/Tensor merge, code paths in ATen need to be able to check whether a tensor requires gradient, and throw errors in places where a `requires_grad=true` tensor is not allowed (such as https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/Utils.h#L76-L78 and https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/SparseTensorImpl.cpp#L86). Since the `GradMode` thread-local variable controls whether a tensor should accumulate gradients, we need to be able to check this variable from ATen when we determine whether a tensor requires gradient, hence the PR to move `GradMode` / `AutoGradMode` / `NoGradGuard` to ATen.
Note that we intentionally don't merge `at::GradMode` and `at::NonVariableTypeMode`, with the following reasoning:
Semantically, `at::GradMode` and `at::NonVariableTypeMode` actually mean different things: `at::GradMode` controls whether a tensor should accumulate gradients, and `at::NonVariableTypeMode` controls whether a Variable should be treated as a non-Variable tensor in type dispatches. There are places whether we *don't* want the tensor to accumulate gradients, but *still* want the Variable to be treated as a Variable. Here is one example:
```python
# torch/tensor.py
with torch.no_grad():
...
new_tensor = self.new() # `at::GradMode` is false at this point
...
```
```cpp
// tools/autograd/templates/python_variable_methods.cpp
static PyObject * THPVariable_new(PyObject* self, PyObject* args, PyObject* kwargs)
{
...
// if we merge `at::GradMode` and `at::NonVariableTypeMode`, since `at::GradMode` is false and `self_.type()` checks `at::GradMode` to decide whether to return non-Variable type, it will return a non-Variable type here, which is not what we want (and throws a "Tensor that was converted to Variable was not actually a Variable" error)
return THPVariable_Wrap(torch::utils::legacy_tensor_new(self_.type(), args, kwargs));
...
}
```
For the above reason, we cannot merge `at::GradMode` and `at::NonVariableTypeMode`, as they have different purposes.
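The two modes can be sketched as independent thread-local flags (a Python sketch of the C++ thread-locals, not the actual implementation):

```python
import threading
from contextlib import contextmanager

_modes = threading.local()

def mode(name, default=True):
    return getattr(_modes, name, default)

@contextmanager
def set_mode(name, value):
    # Two independent thread-local flags: toggling grad_mode leaves
    # non_variable_type_mode untouched, mirroring why at::GradMode and
    # at::NonVariableTypeMode are kept separate.
    prev = mode(name)
    setattr(_modes, name, value)
    try:
        yield
    finally:
        setattr(_modes, name, prev)
```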
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18573
Differential Revision: D16134413
Pulled By: yf225
fbshipit-source-id: 6140347e78bc54206506499c264818eb693cdb8a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22479
In some cases, for example when training on CTR data, we would like to start training from old samples and finish on recent ones.
This diff adds the option to disable shuffling in DistributedSampler to accommodate this use case.
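The behavior can be sketched with a toy sampler (illustrative only, not the real torch.utils.data.DistributedSampler):

```python
import random

class ShardSampler:
    """Toy sketch of a distributed sampler: shards indices across
    replicas, with shuffling optional so training can walk samples in
    their original (e.g. chronological) order."""

    def __init__(self, dataset_len, num_replicas, rank, shuffle=True, seed=0):
        self.dataset_len = dataset_len
        self.num_replicas = num_replicas
        self.rank = rank
        self.shuffle = shuffle
        self.seed = seed

    def indices(self, epoch=0):
        idx = list(range(self.dataset_len))
        if self.shuffle:
            # deterministic per-epoch shuffle shared by all replicas
            random.Random(self.seed + epoch).shuffle(idx)
        return idx[self.rank::self.num_replicas]
```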
Reviewed By: soumith
Differential Revision: D16100388
fbshipit-source-id: 35566581f5250040b2db5ec408a63037b47a9f5d
Summary:
Replaces https://github.com/pytorch/pytorch/pull/21501 because ghimport had errors I couldn't figure out when I tried to import the stack :'(
has the two commits that were previously accepted and the merge commit
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22561
Differential Revision: D16135743
Pulled By: eellison
fbshipit-source-id: f0a98842ccb334c7ceab04d1437e09dc76be0eb1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22516
Force anybody creating an untyped Dict to call c10::impl::deprecatedUntypedDict().
This should hopefully make it clear that this is not public API and prevent people from using it.
Differential Revision: D16115215
fbshipit-source-id: 2ef4cb443da1cdf4ebf5b99851f69de0be730b97
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22005
When a Dict or List is created with type information, it will remember that.
If at any point later, this list is instantiated to a List<T> with a concrete type, it will assert that T is the correct type.
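The checked behavior can be sketched in Python (illustrative only, not the c10::List/Dict API):

```python
class TypedList:
    # Sketch: the list remembers its element type at creation and, once
    # used with a concrete type, asserts every element matches it.
    def __init__(self, elem_type, items=()):
        self.elem_type = elem_type
        self._items = []
        for x in items:
            self.append(x)

    def append(self, x):
        if not isinstance(x, self.elem_type):
            raise TypeError(
                f"expected {self.elem_type.__name__}, got {type(x).__name__}")
        self._items.append(x)
```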
Differential Revision: D15914462
fbshipit-source-id: a8c3d91cb6d28d0c1ac0b57a4c4c6ac137153ff7
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/22551
Test Plan:
ran test locally
Imported from OSS
Differential Revision: D16132182
fbshipit-source-id: 5b9efbf883efa66c4d8b7c400bdb804ac668a631
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22510
Added a new function to implement the clone operation on quantized tensors, along with a test case (see test plan).
This change is required to be able to call torch.jit.trace on quantized models.
Clone implementation calls copy_ on QTensor internally.
Differential Revision: D16059576
fbshipit-source-id: 226918cd475521b664ed72ee336a3da8212ddcdc
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/22397
Test Plan:
Added test for reentrant backwards with checkpoint and a test for a recursive backwards function (which should fail if we run all the reentrant tasks recursively in the same thread) and for testing priority of reentrant tasks.
~~Will add a test for priority of reentrant tasks in future pr.~~
Imported from OSS
Differential Revision: D16131955
fbshipit-source-id: 18301d45c1ec9fbeb566b1016dbaf7a84a09c7ac
Summary:
Currently, the **stream** parameter is not set when launching these two kernels: softmax_warp_forward() and softmax_warp_backward(), i.e. the kernels are always put on the default stream, which may fail to respect the stream that was set previously. Add **at::cuda::getCurrentCUDAStream()** as a launch argument to fix this issue.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22470
Differential Revision: D16115051
Pulled By: izdeby
fbshipit-source-id: 38b27e768bb5fcecc1a06143ab5d63b0e68a279e
Summary:
re-apply changes reverted in:
https://github.com/pytorch/pytorch/pull/22412
Also change log_softmax to take positional arguments. Long-term we do want the kwarg-only interface, but it currently seems to be incompatible with JIT serialization.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22456
Differential Revision: D16097159
Pulled By: nairbv
fbshipit-source-id: 8cb73e9ca18fc66b35b873cf4a574b167a578b3d
Summary:
* Deletes all weak script decorators / associated data structures / methods
* In order to keep supporting the standard library in script, this enables recursive script on any function defined in `torch.nn`
* Most changes in `torch/nn` are the result of `ag -Q "weak" torch/nn/ -l | xargs sed -i '/weak/d'`; only `rnn.py` needed manual editing, using `ignore` and `export` to continue supporting the overloaded `forward` methods
* `Sequential`/`ModuleList` no longer need to be added to constants since they are compiled on demand
This should also fix https://github.com/pytorch/pytorch/issues/22212
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22212
Differential Revision: D15988346
Pulled By: driazati
fbshipit-source-id: af223e3ad0580be895377312949997a70e988e4f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22309
This diff enables PT operators to run with JIT mode. Users can control eager and JIT mode using the `use_jit` flag.
In this diff, we are putting operators in a loop and passing it to JIT. One extra step which wraps the operator with the `_consume` op is introduced to avoid JIT's dead code elimination optimization. With that, the reported time includes the real operator execution time plus the `_consume` op (which directly returns its input; nothing else happens inside).
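A minimal sketch of the wrapping (Python stand-ins; `_consume` here is illustrative, not the real JIT op):

```python
def _consume(x):
    # Stand-in for the `_consume` op: returns its input unchanged, so
    # the benchmarked op's result is "used" and cannot be removed as
    # dead code.
    return x

def bench_loop(op, x, iters):
    # Run the op in a loop, consuming each result.
    for _ in range(iters):
        x = _consume(op(x))
    return x
```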
Reviewed By: zheng-xq
Differential Revision: D16033082
fbshipit-source-id: e03be89fd5a505e44e81015dfc63db9cd76fb8a1
Summary:
- Fix typo in ```torch/onnx/utils.py``` when looking up registered custom ops.
- Add a simple test case
1. Register custom op with ```TorchScript``` using ```cpp_extension.load_inline```.
2. Register custom op with ```torch.onnx.symbolic``` using ```register_custom_op_symbolic```.
3. Export model with custom op, and verify with Caffe2 backend.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21321
Differential Revision: D16101097
Pulled By: houseroad
fbshipit-source-id: 084f8b55e230e1cb6e9bd7bd52d7946cefda8e33
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21432
This diff introduce a new interface to generate tests based on the metadata of operators.
Reviewed By: ajauhri
Differential Revision: D15675542
fbshipit-source-id: ba60e803ea553d8b9eb6cb2bcdc6a0368ef62b1c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22499
Another place where onnx export is running dead code elimination after making the jit graph invalid. Fixing it.
Reviewed By: houseroad
Differential Revision: D16111969
fbshipit-source-id: 5ba80340c06d091988858077f142ea4e3da0638c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22348
This is the last step of LRU hash eviction weight re-init. This diff checks if there are evicted values in sparse_lookup; if so, it calls the op created in D15709866 to re-init the values for the indices in evicted_values. Also created a gradient op for the operator; the gradient op just passes the output gradient through as the input gradient.
Reviewed By: itomatik
Differential Revision: D16044736
fbshipit-source-id: 9afb85209b0de1038c5153bcb7dfc5f52e0b2abb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22476
Dead code elimination assumes a valid jit graph because it checks if operators have side effects.
The onnx export path destroys the jit graph right before calling dead code elimination, but it actually doesn't care about side effects.
We can just call dead code elimination and disable side effect lookup and things should work.
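The idea can be sketched with a toy DCE pass (illustrative, not the real jit pass): a node is kept if any of its outputs is live, or, only when checking is enabled, if it has side effects.

```python
def eliminate_dead_code(nodes, graph_outputs, check_side_effects=True):
    # Toy DCE sketch: walk nodes in reverse, keep a node if any of its
    # outputs is live, or (when enabled) if it is marked side-effecting.
    # Each node is a dict with "inputs", "outputs", "side_effect".
    live = set(graph_outputs)
    kept = []
    for node in reversed(nodes):
        if any(o in live for o in node["outputs"]) or (
                check_side_effects and node["side_effect"]):
            kept.append(node)
            live.update(node["inputs"])
    kept.reverse()
    return kept
```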
Reviewed By: houseroad
Differential Revision: D16100172
fbshipit-source-id: 8c790055e0d76c4227394cafa93b07d1310f2cea
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22441
This include doesn't seem to be needed. Remove it to simplify mobile build dependency.
Reviewed By: dreiss
Differential Revision: D16088224
fbshipit-source-id: f6aec21655e259726412e26a006d785912436c2a
Summary:
This has been requested in https://github.com/pytorch/pytorch/issues/20323
(It is still not exactly the same as NumPy, which allows you to pass tensors at mean/std and broadcast them with size, but the present PR is extremely simple and does the main thing people are asking for)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20545
Differential Revision: D15358736
Pulled By: zhangguanheng66
fbshipit-source-id: 762ea5eab5b8667afbac2df0137df017ba6e413c
Summary:
The changes include:
1. Allow key/value to have a different number of features from query. This supports the case where key and value have different feature dimensions.
2. Support three separate proj_weights, in addition to a single in_proj_weight. The proj_weight of key and value may have different dimensions from that of query, so three separate proj_weights are necessary. In case key and value have the same dimension as query, it is preferred to use a single large proj_weight for performance reasons. However, it should be noted that using a single large weight or three separate weights is a size-dependent decision.
3. Give an option to use static k and v in the multihead_attn operator (see saved_k and saved_v). Those static key/value tensors can now be re-used when training the model.
4. Add more test cases to cover the arguments.
Note: current users should not be affected by the changes.
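The weight layout described in point 2 can be sketched with a shape helper (`in_projection_shapes` is a hypothetical illustration, not part of the API):

```python
def in_projection_shapes(embed_dim, kdim=None, vdim=None):
    # When key/value dims match the query's embed_dim, a single packed
    # in_proj_weight suffices; otherwise three separate proj_weights
    # with per-tensor input dims are required.
    kdim = embed_dim if kdim is None else kdim
    vdim = embed_dim if vdim is None else vdim
    if kdim == embed_dim and vdim == embed_dim:
        return {"in_proj_weight": (3 * embed_dim, embed_dim)}
    return {
        "q_proj_weight": (embed_dim, embed_dim),
        "k_proj_weight": (embed_dim, kdim),
        "v_proj_weight": (embed_dim, vdim),
    }
```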
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21288
Differential Revision: D15738808
Pulled By: zhangguanheng66
fbshipit-source-id: 288b995787ad55fba374184b3d15b5c6fe9abb5c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21927
Add `OUTPUT_PROB` output to CTCBeamSearchDecoderOp to return a probability for each sequence.
Add argument to output top-k instead of top-1 decoded sequences.
Reviewed By: SuperIRabbit
Differential Revision: D15797371
fbshipit-source-id: 737ca5cc4f90a0bcc3660ac9f58519a175977b69
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22461
We shouldn't call dead code elimination after EraseNumberTypes because dead code elimination assumes a valid jit graph which EraseNumberTypes just broke.
Let's have it clean up after itself instead.
Reviewed By: houseroad
Differential Revision: D16094656
fbshipit-source-id: f2752277d764e78ab276c57d56b2724b872b136f
Summary:
It's always set to equal USE_NCCL; we made Gloo depend on the Caffe2 NCCL
build. See 30da84fbe1614138d6d9968c1475cb7dc459cd4b
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22467
Differential Revision: D16098581
Pulled By: ezyang
fbshipit-source-id: f706ec7cebc2e6315bafca013b669f5a72e04815
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22279
This new operator is used for embedding table weight re-init. After we get the evicted indices, they identify the rows that need resetting in the embedding table. We can then create a 1-D tensor with default values and apply this operator to copy the tensor to all evicted rows in the embedding table.
Will add gradient op in next diff
Reviewed By: itomatik
Differential Revision: D15709866
fbshipit-source-id: 2297b70a7326591524d0be09c73a588da245cc08
Summary:
The sgemm in cuBLAS 9.0 has some issues with sizes above 2M on Maxwell and Pascal architectures. Warn in this case.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22034
Differential Revision: D15949930
Pulled By: zhangguanheng66
fbshipit-source-id: 0af977ec7900c76328d23898071de9c23778ff8b
Summary:
ROCm is already detected in cmake/public/LoadHIP.cmake. No need to
detect twice. Plus, the Python script read environment variable
ROCM_HOME, but what is really used in CMake scripts is ROCM_PATH -- a
user must specify both environment variables correctly. Since ROCM_HOME is
undocumented, this commit completely eradicates it.
---
ezyang A remake of https://github.com/pytorch/pytorch/issues/22228 because its dependency has been dismissed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22464
Differential Revision: D16096833
Pulled By: bddppq
fbshipit-source-id: fea461e80ee61ec77fa3a7b476f7aec4fc453d5d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22425
Currently, in bound_shape_inference.cc: InferBoundShapeAndType, we first infer ops in order and then infer the inputs of concat in reverse order. In the tiny version of ctr_instagram_model, concat is right before FC, so we can infer the inputs for concat. But in the production version, we found there are some ops between concat and FC (or other ops whose shapes we know), so the shapes of these ops cannot be inferred.
This diff is a temporary solution for this problem: infer shapes in order and in reverse order repeatedly until there are no more changes.
Reviewed By: yinghai, ipiszy
Differential Revision: D16082521
fbshipit-source-id: d5066509368029c6736dce156030adf5c38653d7
Summary:
MKL-DNN is the main library for computation when we use ideep device. It can use kernels implemented by different algorithms (including JIT, CBLAS, etc.) for computation. We add the "USE_MKLDNN_CBLAS" (default OFF) build option so that users can decide whether to use CBLAS computation methods or not.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19014
Differential Revision: D16094090
Pulled By: ezyang
fbshipit-source-id: 3f0b1d1a59a327ea0d1456e2752f2edd78d96ccc
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22004
In future, we want all dicts/lists to store information about the types they contain.
This is only possible if the creation API doesn't allow creating lists/dicts without type information.
This diff updates some call sites that don't specify type information so that they do.
Reviewed By: dzhulgakov
Differential Revision: D15906387
fbshipit-source-id: 64766a2534b52c221e8a5501a85eaad13812e7bd
Summary:
Currently the build system accepts USE_NAMEDTENSOR from the environment
variable and turns it into NAMEDTENSOR_ENABLED when passing to CMake.
This discrepancy does not seem necessary and complicates the build
system. The naming of this build option is also semantically incorrect
("BUILD_" vis-a-vis "USE_"). This commit eradicate this issue before it
is made into a stable release.
The support of NO_NAMEDTENSOR is also removed, since PyTorch has been
quite inconsistent about "NO_*" build options.
---
Note: All environment variables with their names starting with `BUILD_` are currently automatically passed to CMake with no need of an additional wrapper.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22360
Differential Revision: D16074509
Pulled By: zou3519
fbshipit-source-id: dc316287e26192118f3c99b945454bc50535b2ae
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21389
As titled. To do weight re-init on evicted rows in the embedding table, we need to pass the info about the evicted hashed values to SparseLookup, which is the layer model responsible for constructing the embedding table and doing pooling.
To pass evicted values, we need to adjust the output record of lru_sparse_hash to include the evicted values, and add an optional input to all processors that need to take in a sparse segment. For SparseLookup to get the evicted values, its input record needs to be adjusted. Now the input record can have type IdList/IdScoreList, or a struct of feature + evicted values.
Reviewed By: itomatik
Differential Revision: D15590307
fbshipit-source-id: e493881909830d5ca5806a743a2a713198c100c2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22241
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20387
glibc has a non-standard function, feenableexcept, that triggers a floating-point exception handler. Compared to feclearexcept + fetestexcept, this approach allows us to see precisely where the exception is raised from the stack trace.
Reviewed By: jspark1105
Differential Revision: D15301095
fbshipit-source-id: 94f6e72456b2280f78d7d01c2ee069ae46d609bb
Summary:
empty_like uses the tensor options of `self`, rather than the passed in tensor options. This means it messes up variable/tensor types, and ignores specifications like different dtypes.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21978
Differential Revision: D15903948
Pulled By: gchanan
fbshipit-source-id: f29946be01c543f888daef2e99fe928e7b7d9d74
Summary:
# What is this?
This is an implementation of the AdamW optimizer as implemented in [the fastai library](803894051b/fastai/callback.py) and as initially introduced in the paper [Decoupled Weight Decay Regularization](https://arxiv.org/abs/1711.05101). It decouples the weight decay regularization step from the optimization step during training.
There have already been several abortive attempts to push this into pytorch in some form or fashion: https://github.com/pytorch/pytorch/pull/17468, https://github.com/pytorch/pytorch/pull/10866, https://github.com/pytorch/pytorch/pull/3740, https://github.com/pytorch/pytorch/pull/4429. Hopefully this one goes through.
# Why is this important?
Via a simple reparameterization, it can be shown that L2 regularization has a weight decay effect in the case of SGD optimization. Because of this, L2 regularization became synonymous with the concept of weight decay. However, it can be shown that the equivalence of L2 regularization and weight decay breaks down for more complex adaptive optimization schemes. It was shown in the paper [Decoupled Weight Decay Regularization](https://arxiv.org/abs/1711.05101) that this is the reason why models trained with SGD achieve better generalization than those trained with Adam. Weight decay is a very effective regularizer. L2 regularization, in and of itself, is much less effective. By explicitly decaying the weights, we can achieve state-of-the-art results while also taking advantage of the quick convergence properties that adaptive optimization schemes have.
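The distinction can be written out. In L2-regularized Adam the penalty gradient enters the adaptive moment estimates, while AdamW applies the decay directly to the weights (a sketch in the paper's notation, with learning rate $\eta$ and decay $\lambda$):

```latex
% Adam + L2 regularization: the penalty flows through the moment estimates
g_t = \nabla f_t(\theta_{t-1}) + \lambda \theta_{t-1}, \qquad
\theta_t = \theta_{t-1} - \eta \, \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon}

% AdamW: moments are built from \nabla f_t(\theta_{t-1}) alone,
% and the decay is applied as a separate, decoupled step
\theta_t = \theta_{t-1} - \eta \left( \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon} + \lambda \theta_{t-1} \right)
```

Because $\hat{v}_t$ normalizes the penalty term in the first form, large-gradient weights are decayed less than intended; the decoupled form decays all weights at the same rate.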
# How was this tested?
There were test cases added to `test_optim.py` and I also ran a [little experiment](https://gist.github.com/mjacar/0c9809b96513daff84fe3d9938f08638) to validate that this implementation is equivalent to the fastai implementation.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21250
Differential Revision: D16060339
Pulled By: vincentqb
fbshipit-source-id: ded7cc9cfd3fde81f655b9ffb3e3d6b3543a4709
Summary:
Address the issue raised in https://github.com/pytorch/pytorch/issues/22377.
The PR https://github.com/pytorch/pytorch/issues/22016 introduces a temporary tensor of weights `grad_weight_per_segment` of the same dtype as the end result, which can be a problem when using `float16`.
In this PR, it now uses a `float32` temporary tensor when the input is `float16`.
ngimel, can I get you to review? I think I have fixed the issues you have pointed out.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22401
Differential Revision: D16077319
Pulled By: mrshenli
fbshipit-source-id: 7cfad7f40b4d41a244052baa2982ab51bbbd7309
Summary:
The CMake modifications include removal of some unnecessary paths
(e.g. find_package(CUDA) and friends) that are no longer used since
c10d is always part of the larger torch build. The macro
`C10D_USE_...` was ambiguous and is now removed in favor of only
having top level `USE_...`. The c10d test suite is changed to include
skip annotations for the tests that depend on Gloo as well.
Now, if you compile with `USE_DISTRIBUTED=1` and `USE_GLOO=0` you get
a functioning build for which the tests actually pass.
Closes https://github.com/pytorch/pytorch/issues/18851.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22257
Differential Revision: D16087993
Pulled By: pietern
fbshipit-source-id: 0cea66bd5cbd9736b06fa1d45ee13a18cab88adb
Summary:
The `assert False` lint error has been causing CI to fail:
./torch/utils/throughput_benchmark.py:14:13: B011 Do not call assert False since python -O removes these calls. Instead callers should raise AssertionError().
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22424
Differential Revision: D16083464
Pulled By: bddppq
fbshipit-source-id: 6d96e36c8fcbb391d071b75fe79c22d526c1ba3c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22429
Android NDK r20 removes the guard `(__ANDROID_API__ <= __ANDROID_API_O_MR1__)`, so we do it here also. There is insufficient reason to keep these decls undefined for earlier API levels. NDK r15 and earlier don't even define `__ANDROID_API_O_MR1__`, so the preprocessor defaults it to 0 and the guard evaluates as TRUE.
Reviewed By: smeenai, hlu1
Differential Revision: D16084105
fbshipit-source-id: f0857b3eb0573fe219f0d6c5e6583f89e2b5518f
Summary:
This change adds advanced support for cross-chunk shuffling.
For training with a static dataset, the default configuration suffices. However, in some use cases, new data is added to the dataset over each epoch, so the dataset's size is dynamically increasing. In order to mix the new data and the old data for better random sampling, one approach is to shuffle examples from more than one chunk. This feature is supported with this change: by specifying `cross_chunk_shuffle_count_` on construction, advanced users can specify how many chunks to shuffle examples from.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22347
Differential Revision: D16081378
Pulled By: zhangguanheng66
fbshipit-source-id: fd001dfb9e66947839adecfb9893156fbbce80d0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22413
_jit_pass_erase_number_types invalidates the jit graph but parts of _jit_pass_onnx rely on having a valid jit graph.
This splits _jit_pass_onnx into _jit_pass_onnx_remove_print and _jit_pass_onnx_preprocess_caffe2 (which rely on the valid jit graph), runs these before _jit_pass_erase_number_types,
and then runs the rest of _jit_pass_onnx after _jit_pass_erase_number_types
Reviewed By: houseroad
Differential Revision: D16079890
fbshipit-source-id: ae68b87dced077f76cbf1335ef3bf89984413224
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22334
Improve the function signatures of save_to_db and load_from_db in predictor_exporter.
Reviewed By: akyrola
Differential Revision: D16047208
fbshipit-source-id: a4e947f86e00ef3b3dd32c57efe58f76a38fcec7
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22293
Just wrapping the C class with a nicer Python interface; now just print it
directly to get all the data. Later we can add various
visualizations there.
Differential Revision: D16023999
fbshipit-source-id: 8436e37e36965821a690035617784dcdc352dcd1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22292
Since we do an atomic fetch_add to decide whether a thread should
finish, we should not take the last iteration into account. As a
result, the total number of iterations should be exactly the same as the user
sets via config.num_iters.
Now when running a unit test I see the exact number of iterations reported.
Differential Revision: D16023963
fbshipit-source-id: 3b12ee17276628ecd7b0979f28cd6deb777a1543
Summary:
As part of the Variable/Tensor merge, one invariant for tensor libraries such as ATen / Caffe2 / XLA is that they should only deal with Tensors, not Variables. However, currently in `variable_factories.h` we are potentially passing Variables into those tensor libraries without the `at::AutoNonVariableTypeMode` guard, which will cause those libraries to treat those Variables as Variables (i.e. their `is_variable()` is true), not Tensors.
Consider the following example for `full_like`:
```cpp
inline at::Tensor full_like(const at::Tensor & self, at::Scalar fill_value) {
...
// Both ATen and XLA rely on `at::full_like` to dispatch to library specific implementations.
//
// When `self` is a Variable, since we are not using `at::AutoNonVariableTypeMode`,
// `at::full_like` will also use `self` as a Variable (and it will see that `self.is_variable()` is true),
// which breaks the invariant that ATen / XLA should never deal with Variables.
at::Tensor tensor = at::full_like(self, fill_value, self.options().is_variable(false));
at::Tensor result =
autograd::make_variable_consuming(std::move(tensor), /*requires_grad=*/false);
...
return result;
}
```
Instead, the invariant-preserving implementation would be:
```cpp
inline at::Tensor full_like(const at::Tensor & self, at::Scalar fill_value) {
...
at::Tensor tensor = ([&]() {
at::AutoNonVariableTypeMode non_var_type_mode(true);
// Both ATen and XLA rely on `at::full_like` to dispatch to library specific implementations.
//
// When `self` is a Variable, since we have `at::AutoNonVariableTypeMode` in the scope,
// `at::full_like` will use `self` as a Tensor (and it will see that `self.is_variable()` is false),
// which preserves the invariant that ATen / XLA should only deal with Tensors.
return at::full_like(self, fill_value, self.options().is_variable(false));
})();
at::Tensor result =
autograd::make_variable_consuming(std::move(tensor), /*requires_grad=*/false);
...
return result;
}
```
This PR makes the suggested change for all variable factory functions.
cc. ailzhang This should allow us to remove all `tensor_data()` calls in the XLA codebase.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22364
Differential Revision: D16074862
Pulled By: yf225
fbshipit-source-id: 3deba94b90bec92a757041ec05d604401a30c353
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22285
Previously, forward hooks were expected to return None. This PR adds support for overwriting the input and output in `forward_pre_hook` and `forward_hook`; this is used to implement inserting quant/dequant function calls around forward functions.
Differential Revision: D16022491
fbshipit-source-id: 02340080745f22c8ea8a2f80c2c08e3a88e37253
Summary:
As per attached tasks, these are noops and are being deprecated/removed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22113
Reviewed By: philipjameson
Differential Revision: D15901131
fbshipit-source-id: 3acf12208f692548afe4844be13717a49d74af32
Summary:
Saying `I` in an error message is too subjective to be used in a framework.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22369
Differential Revision: D16067712
Pulled By: soumith
fbshipit-source-id: 2a390646bd5b15674c99f65e3c460a7272f508b6
Summary:
`setup.py` recommends setting `USE_QNNPACK=0` and `USE_NNPACK=0` to disable building QNNPACK and NNPACK respectively. However this wasn't reflected correctly because we were looking for `NO_QNNPACK` and `NO_NNPACK`. This PR fixes it.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22367
Differential Revision: D16067393
Pulled By: soumith
fbshipit-source-id: 6491865ade9a6d41b7a79d68fd586a7854051f28
Summary:
Say the user inputs reduction=False. Of course, we can't add a bool and a string, so the ValueError itself will error, which is more confusing to the user. Instead, we should use string formatting. I would use `f"{reduction} is not..."` but am unsure whether we are ok with using f"" strings.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22160
Differential Revision: D15981826
Pulled By: soumith
fbshipit-source-id: 279f34bb64a72578c36bdbabe2da83d2fa4b93d8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22319
The onnx pass replacing ints with Tensors produces an invalid JIT graph. It should only be called right before the onnx pass.
Also, it should only be called if we actually export to onnx.
Reviewed By: houseroad
Differential Revision: D16040374
fbshipit-source-id: e78849ee07850acd897fd9eba60b6401fdc4965b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22317
About to add an observer that is also statically initialized in a different
file, so we need to enforce initialization order.
Reviewed By: ilia-cher
Differential Revision: D16012275
fbshipit-source-id: f26e57149a5e326fd34cb51bde93ee99e65403c4
Summary:
Syncing worker requirement mismatches to improve remote build time.
Created actions:
MEDIUM: 445
LARGE: 354
Updated actions:
From MEDIUM to LARGE: 21
From LARGE to XLARGE: 34
From LARGE to MEDIUM: 9
From XLARGE to MEDIUM: 1
Differential Revision: D16047893
fbshipit-source-id: 7afab2ef879277f114d67fd1da9f5102ec04ed7f
Summary:
This does not occur in CUDA code.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22271
Differential Revision: D16024605
Pulled By: bddppq
fbshipit-source-id: bb4f16bacbdc040faa59751fba97958f4c2d33cd
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22307
MSVC-specific pragma doesn't silence the warning about throwing constructor and therefore `clang-cl` fails to compile this file. This diff fixes the problem by adding additional check for `clang` compiler.
Reviewed By: smessmer
Differential Revision: D16032324
fbshipit-source-id: 6dbce0ebf0a533d3e42b476294720590b43a8448
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21921
Call FBGEMM kernels to implement quantized linear operator. This operator is used only for inference.
Differential Revision: D15375695
fbshipit-source-id: b9ca6c156fd60481fea83e55603b2897f7bfc3eb
Summary:
Reduction of gradients for unused parameters should happen as soon as
possible, because they potentially block reduction of gradients for
used parameters. This used to happen instantly when
`prepare_for_backward` was called and it found parameters that didn't
contribute. This meant that if you have a model with unused
parameters, and you want to discard the model output (i.e. not call
backward on some loss), reduction of the gradients of those unused
parameters would have been kicked off, and you'd see an error the next
time you called `forward`.
In this commit, this original approach is slightly changed to delay
reduction of the gradients of those unused parameters until the first
autograd hook is called. This means that you can now discard the model
output regardless of the model having unused parameters or not.
This is a prerequisite for making the `find_unused_parameters`
argument to DDP default to `True`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22219
Differential Revision: D16028698
Pulled By: pietern
fbshipit-source-id: c6aec2cd39c4a77746495d9cb1c9fb9c5ac61983
Summary:
This adds the rest of the `dict.???` methods that were missing
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21979
Pulled By: driazati
Differential Revision: D16023573
fbshipit-source-id: 3ea9bd905090e2a176af654a8ca98c7d965ea679
Summary:
In talks with smessmer, we decided that it'd be better to put the logic in `list`, as optimal behavior requires knowing `.capacity()`
Results on my cpu (for the benchmark here: https://twitter.com/VahidK/status/1138674536679821312) now look like this:
```
Pytorch batch_gather took 0.018311 seconds.
Pytorch batch_gather jit took 0.013921 seconds.
Pytorch vectorized batch_gather took 0.001384 seconds.
```
Previously, `batch_gather jit` took 3x as long as `batch_gather`.
Some logic taken from https://github.com/pytorch/pytorch/pull/21690. Note that these two PR's are somewhat orthogonal. That PR handles this benchmark by looking at the alias analysis, while this PR specializes for `+=`.
Note that we can't jit the vectorized version as we think `torch.arange` returns a float tensor.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21896
Differential Revision: D15998628
Pulled By: Chillee
fbshipit-source-id: b0085960da4613578b94deb98ac62c0a4532a8c3
Summary:
This is yet another step to disentangle Python build scripts and CMake
and improve their integration (Let CMake handle more build environment
detections, and less by our handcrafted Python scripts).
The processor detection logic also changed a bit: Instead of detecting
whether the system processor is PPC or ARM, this PR changes to detect
Intel CPUs, because this is more precise as MKL only supports Intel
CPUs. The build option `USE_MKLDNN` will also not be presented to
users on non-Intel processors.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22215
Differential Revision: D16005953
Pulled By: ezyang
fbshipit-source-id: bf3f74d53609b3f835e280f63a872ff3c9352763
Summary:
When dealing with a large-scale dataset, it is handy if we can save the dataset status and resume later, especially in cases where an unexpected crash happens: users don't need to start over the whole dataset from the beginning. Instead, they can reload it from the last checkpoint.
This change adds support for checkpoint save/load logic in ChunkDataset.
On ChunkDataset construction, the user can specify a file name from which to load the checkpoint. If it is empty, the dataset defaults to starting fresh; otherwise the ChunkDataset will 'fast forward' the chunk sampler to the corresponding checkpoint.
The user can also call ChunkDataset::save() to serialize current status to a file, which can be used later.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21889
Differential Revision: D16024582
Pulled By: ailzhang
fbshipit-source-id: 1862ab5116f94c9d29da174ce04a91041d06cad5
Summary:
`cmake/public/LoadHIP.cmake` calls `find_package(miopen)`, which uses the CMake module in MIOpen installation (It includes the line `set(miopen_DIR ${MIOPEN_PATH}/lib/cmake/miopen)`). `cmake/Modules/FindMIOpen.cmake` is not used.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22244
Differential Revision: D16000771
Pulled By: bddppq
fbshipit-source-id: 07bb40fdf033521e8427fc351715d47e6e30ed34
Summary:
The original name `copy_tensor_data` could be confusing because users are not sure whether it deep-copies data in the tensor's storage or just copies the tensor's metadata. The renaming makes it more clear.
cc. ailzhang This might break XLA build, but I think the renaming makes it more clear why we use `copy_tensor_data` in XLATensorImpl's shallow-copy functions.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22266
Differential Revision: D16014724
Pulled By: yf225
fbshipit-source-id: f6ee966927d4d65d828b68264b3253b2f8fd768d
Summary:
This adds the rest of the `dict.???` methods that were missing
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21979
Pulled By: driazati
Differential Revision: D15999938
fbshipit-source-id: 7bc2a55e3f791015a0ff2e3731703075cf0770ee
Summary:
I learned from https://github.com/pytorch/pytorch/pull/22058 that `worker_kill` is just flaky, regardless of `hold_iter_reference`. So let's disable it altogether for now.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22208
Differential Revision: D15990307
Pulled By: soumith
fbshipit-source-id: d7d3f4fe7eaac4987f240cb8fd032c73a84157d7
Summary:
As part of the Variable/Tensor merge, we want to gradually remove call sites of `tensor_data()` and the API itself, and instead uses `variable_data()`. This PR removes the `tensor_data()` call in the tensor_to_numpy conversion path.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22214
Differential Revision: D15997397
Pulled By: yf225
fbshipit-source-id: 6fcab7b14e138824fc2adb5434512bcf868ca375
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22077
ghimport-source-id: 39cf0a2e66e7fa2b6866af72782a22a4bd025e4c
Test Plan:
- Compared the build/aten/src folder before and after this change
locally and verified they are identical (`diff -r`).
- Wait for CI + Also, [namedtensor ci]
Imported from OSS
Differential Revision: D15941967
Pulled By: zou3519
fbshipit-source-id: d8607df78f48325fba37e0d00fce0ecfbb78cb36
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20729
Currently there is no way to specify what scalar types each nn function will support.
This change allows specifying the supported scalar types for each function/backward function and device. By default each function supports Float, Double, and Half.
If you want to specify extra supported scalar types beyond the default, you will need to change nn.yaml:
```
- name: _some_func(Tensor self)
  cname: SomeFunction
  CPU:
    forward_scalar_types: ['Float', 'Double', 'Long']
    backward_scalar_types: ['Float', 'Double']
```
Differential Revision: D15423752
fbshipit-source-id: b3c157316d6e629bc39c1b377a3b23c71b1656cf
Summary:
In `torch/csrc/autograd/function.h` we define `torch::autograd::Function`, a (the?) central autograd record-holding class. `Function` is declared public API (`TORCH_API`).
We also define a custom deleter `deleteFunction` which we use throughout PyTorch's own use of `Function`. This trivial PR declares the deleter public API as well.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22236
Differential Revision: D16001335
Pulled By: yf225
fbshipit-source-id: 6ef0a3630e8f82f277a0e6e26cc64455ef7ee43e
Summary:
we used to not print the device when a tensor is on XLA. It's sometimes confusing as it looks the same as a CPU tensor...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22094
Differential Revision: D15975405
Pulled By: ailzhang
fbshipit-source-id: f19ceb9e26f5f2f6e7d659de12716f0dfe065f42
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22084
For DictPtr/ListPtr, default construction was disallowed because it was ambiguous whether it was supposed to create an empty list or a nullptr.
But since we renamed them to Dict/List, we can now allow default construction without ambiguity.
Differential Revision: D15948098
fbshipit-source-id: 942a9235b51608d1870ee4a2f2f0a5d0d45ec6e6
Summary:
This cleans up the `checkScript` API and some old tests that were hardcoding outputs. It also now runs the Python function when a string is passed in to verify the outputs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22002
Differential Revision: D15924485
Pulled By: driazati
fbshipit-source-id: ee870c942d804596913601cb411adc31bd988558
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22157
This header uses `std::swap_ranges` function which is defined in `<algorithm>` header (https://en.cppreference.com/w/cpp/algorithm/swap_ranges). Therefore this file isn't guaranteed to compile on all platforms.
This diff fixes the problem by adding the missing header.
Reviewed By: smessmer
Differential Revision: D15971425
fbshipit-source-id: e3edcec131f72d729161f5644ee152f66489201a
Summary:
Changelog:
- Port `symeig` from TH/THC to ATen
- Enable batching of matrix inputs for `symeig`
- Modify derivative computation based on batching
- Update docs to reflect the change
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21858
Test Plan: - Added additional tests in `test_torch.py` (with a port to `test_cuda.py`) and `common_methods_invocations.py` to test if both the port and batching work.
Differential Revision: D15981789
Pulled By: soumith
fbshipit-source-id: ab9af8361f8608db42318aabc8421bd99a1ca7ae
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21885
If a kernel is defined as a stateful lambda:
```cpp
static auto registry = torch::RegisterOperators().op("my::op", [some_closure] (Tensor a) {...});
```
this can have very unexpected behavior when kernels are instantiated: there is no guarantee that the state is kept.
In the options-based API, state is already disallowed:
```cpp
// this is a compiler error
static auto registry = torch::RegisterOperators().op("my::op", torch::RegisterOperators::options().kernel([some_closure] (Tensor a) {...}));
```
but we can't disallow it in the non-options-based API for backwards compatibility reasons.
We can, however, show a deprecation warning, which is what this diff introduces.
Differential Revision: D15867089
fbshipit-source-id: 300fa4772fad8e7d177eb7cb910063d360537a4a
Summary:
Re-implementation of the `embedding_dense_backward_cuda()` and the `embedding_bag_backward_cuda_sum_avg()` functions.
#### Performance
Running a [Mortgage Workflow](https://github.com/EvenOldridge/MortgageWorkflowA) with a block size of 100K on a DXG-2 (single GPU), we see a 270% speedup:
```
Original version: 370,168 example/s
Optimized version: 1,034,228 example/s
```
The original version is bounded by the `EmbeddingBag_accGradParametersKernel_sum_avg`, which takes 70% of the CUDA execution time. In the optimized version, the optimized kernel now takes only 17% of the time.
#### Greater Numerical Stability
An added benefit is greater numerical stability. Instead of doing a flat sum where a single variable is used to accumulate the weights, this code uses two steps: each GPU thread computes a sub-result defined by `NROWS_PER_THREAD` before the final result is accumulated.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22016
Differential Revision: D15944339
Pulled By: mrshenli
fbshipit-source-id: 398d5f48826a017fc4b31c24c3f8b56d01830bf0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22130
Optimize InstanceNormOp forward
For InstanceNormOp on CPU with order = NHWC, N = 128, C = 256, H = 56, W = 56: 183ms -> 115ms.
For InstanceNormOp on GPU with N = 256, C = 256, H = 112, W = 112:
NCHW: 1475ms -> 45ms
NHWC: 1597ms -> 79ms
Reviewed By: houseroad
Differential Revision: D15963711
fbshipit-source-id: 3fa03109326456b9f301514fecbefa7809438d3e
Summary:
In order to select the more important features in a dot product among a list of candidate sparse features, we can assign one learnable weight to each feature, reweighting each feature by multiplying the weight onto its embedding before the dot product. We finally select features based on the weight magnitude after training.
We can perform L1 and/or L2 regularization on the weights. To summarize, the weights tend to shrink their values (avoiding overfitting) due to L2 regularization, and some weights will vanish to zero under L1. To avoid sparse feature embeddings being ignored due to early collapse of the weights, a piecewise lr warm-up policy is used in optimizing the regularization term, such that regularization is weak in the first stage and gets stronger afterwards (a small lr constant for iters below threshold 1, a medium lr constant in stage 2, and a final, reasonably large lr constant for all iters after threshold 2). The features with nonzero and relatively large weights (in absolute value) will be selected for the module.
We can also apply softmax on the original weights to make them sum to 1. We can even boost the softmaxed weights by multiplying by the number of softmax components, which essentially makes them sum to the number of components and average to 1. In this scheme, all the weights are positive and sum to a constant. Regularization is not a must, since we can count on the competition between the softmax weights themselves to achieve reasonable re-weighting. We expect these weights to be denser, compared with the sparse ones from L1 regularization, and we can select features based on the top K weights.
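In symbols (our notation, not from the diff), with $K$ softmax components and raw weights $a_1, \ldots, a_K$, the boosted weights are:

```latex
w_i = K \cdot \frac{e^{a_i}}{\sum_{j=1}^{K} e^{a_j}}, \qquad
\sum_{i=1}^{K} w_i = K, \qquad
\frac{1}{K} \sum_{i=1}^{K} w_i = 1
```

So the weights are positive, sum to the constant $K$, and average to 1, and features can be ranked by $w_i$ without any explicit regularization.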
Overall, we aim to demonstrate the selected feature set outperform current v0 feature set in experiments. Special acknowledgement goes to Shouyuan Chen, who initiated the work of regularizable weighting.
---
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22176
The diff will export updates to Github repository, as stated below.
{F162787228}
Basically, the updates on the files are summarized as below:
- adding logger messages
`caffe2/python/layer_model_helper.py`
- add ElasticNet regularizer, which combines both L1 and L2 regularization
`caffe2/python/regularizer.py`
- implement piecewarmup, specifically warm up with three constant pieces
`caffe2/sgd/learning_rate_functors.h, caffe2/sgd/learning_rate_op.cc, caffe2/sgd/learning_rate_op.h`
Differential Revision: D15923430
fbshipit-source-id: ee18902cb88c23b1b7b367cc727d690a21e4cda9
Summary:
- PyCQA/flake8-bugbear#53 has been fixed (but not yet closed on their side) and a new version of flake8-bugbear has been released on Mar 28, 2019. Switch CI to use the latest stable version.
- Fix the new B011 errors that flake8-bugbear catches in the current codebase.
---
B011: Do not call assert False since python -O removes these calls. Instead callers should raise AssertionError().
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21944
Differential Revision: D15974842
Pulled By: soumith
fbshipit-source-id: de5c2c07015f7f1c50cb3904c651914b8c83bf5c
Summary:
Returning the result of an inplace `squeeze_` in `einsum` (which itself is traced) interacts badly with `autograd.Function`.
I must admit that I'm not 100% certain whether it should be necessary to change this, but I consider this a good change overall.
Fixes: https://github.com/pytorch/pytorch/issues/22072
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22111
Differential Revision: D15974990
Pulled By: soumith
fbshipit-source-id: 477e7f23833f02999085f665c175d062e7d32acd
Summary:
The current error message displays as:
`RuntimeError: index koccurs twice in output`
A space is missing between the index and 'occurs'.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21904
Differential Revision: D15878941
Pulled By: colesbury
fbshipit-source-id: 163dda1829bf4956978cd01fd0e751673580722d
Summary:
The bug is that when target_length == 0, there is no preceding BLANK state and the original implementation will lead to an out-of-bounds pointer access.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21910
Differential Revision: D15960239
Pulled By: ezyang
fbshipit-source-id: 7bbbecb7bf91842735c14265612c7e5049c4d9b3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22088
This diff is similar to D14163001. We need to handle the edge case when add_axis=1.
Reviewed By: jspark1105
Differential Revision: D15949003
fbshipit-source-id: 328d1e07b78b69bde81eee78c9ff5a8fb81f629b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22037
This adds support for sparse gradients to the reducer as well as to
the DistributedDataParallel wrapper. Note that an out of band signal
is needed whether or not a dense parameter (e.g. an embedding) is
expected to receive a sparse gradient or not. This information is
passed to the bucket assignment computation routine and the reducer as
a vector of booleans. Every parameter for which we expect a sparse
gradient is assigned its own bucket, as we cannot easily group
multiple unrelated sparse tensors.
Reviewed By: mrshenli
Differential Revision: D15926383
fbshipit-source-id: 39c0d5dbd95bf0534314fdf4d44b2385d5321aaf
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22036
Implemented only on ProcessGroupGloo, as an allgather of metadata
(sparse_dim, dense_dim, and nnz), followed by an allgather of indices,
followed by an allgather of values. Once these operations have
finished, all ranks locally compute a reduction over these sparse
tensors. Works for both CPU and CUDA tensors.
This surfaced a problem with the existing assumption of only modifying
tensors that are passed at the call site, because for sparse tensors
we don't know the dimensions of the output tensors before we run the
collective. To deal with this unknown, this commit adds a `result`
function to the `c10d::ProcessGroup::Work` class that returns a vector
of tensors.
It's a bit odd to have to retrieve the result through this function
only for operations on sparse tensors. To make this work irrespective
of tensor layout, we can create a follow-up commit to make all in
place operations make their results accessible through this function
as well. This doesn't break any existing contracts but does have the
potential to add interface ambiguity.
This is a resubmission of #19146.
Reviewed By: mrshenli
Differential Revision: D15926384
fbshipit-source-id: b6ee5d81606bfa8ed63c3d63a9e307613491e0ae
Summary:
This change is backwards incompatible in *C++ only* on mean(), sum(), and prod() interfaces that accepted either of:
```
Tensor sum(IntArrayRef dim, bool keepdim=false) const;
Tensor sum(IntArrayRef dim, ScalarType dtype) const;
```
but now to specify both the dim and dtype will require the keepdim parameter:
```
Tensor sum(IntArrayRef dim, bool keepdim=false, c10::optional<ScalarType> dtype=c10::nullopt) const;
```
[xla ci]
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21088
Reviewed By: ailzhang
Differential Revision: D15944971
Pulled By: nairbv
fbshipit-source-id: 53473c370813d9470b190aa82764d0aea767ed74
Summary:
Currently many build options are explicitly passed from Python build scripts to CMake. But this is unnecessary, at least for many of them. This commit removes the build options that have the same name in CMakeLists.txt and environment variables (e.g., `USE_REDIS`). Additionally, many build options that are not explicitly passed to CMake currently get lost.
For `ONNX_ML`, `ONNX_NAMESPACE`, and `BUILDING_WITH_TORCH_LIBS`, I changed their default values in CMake scripts (as consistent with what the `CMake.defines` call meant), to avoid their default values being redundantly set in the Python build scripts.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21877
Differential Revision: D15964996
Pulled By: ezyang
fbshipit-source-id: 127a46af7e2964885ffddce24e1a62995e0c5007
Summary:
This PR tackles issue https://github.com/pytorch/pytorch/issues/18352 .
Progress:
- [x] conv_dilated2d CPU
- [x] conv_dilated3d CPU
- [x] conv_dilated2d CUDA
- [x] conv_dilated3d CUDA
- [x] RocM port
- [x] Port of CUDA gemm and gemv
- [x] Refactored 2d and 3d functions as well as output and gradient computations into a single C++ template function
- [x] Cleanup
+ [x] eliminate forward functions
+ [x] eliminate buffers `columns` and `ones` from functions API
+ [x] eliminate out functions
+ [x] eliminate using `ones`
Note that col2im, im2col, col2vol, vol2col implementations are exposed in `ATen/native/im2col.h` and `ATen/native/vol2col.h`. The corresponding operators (not ported in this PR) should use these.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20983
Differential Revision: D15958088
Pulled By: ezyang
fbshipit-source-id: 1897f6e15abbf5710e9413cd1e443c2e1dc7d705
Summary:
This is useful for measuring inference performance of your
models. This is a very basic benchmark for now. We don't support
batching on the benchmark side; no inter- and intra-op parallelism is
supported yet, just caller-based parallelism.
The main philosophy here is that the user should be able to provide inputs
from Python and just stack them within the benchmark. The API should be
exactly the same as passing inputs to module.forward.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20766
Test Plan: Added a new unit test
Differential Revision: D15435461
Pulled By: salexspb
fbshipit-source-id: db08829dc3f4398bb1d8aa16cc4a58b6c72f16c6
Summary:
Previously any assert failures would leave the updated setting, making
the test suite semantics dependent on the order in which the tests are run.
The diff is large only due to the indentation change (might be good to review without whitespace changes).
cc yf225
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22115
Differential Revision: D15960875
Pulled By: soumith
fbshipit-source-id: 9313695277fc2d968786f13371719e03fff18519
Summary:
Apply launch bounds annotations for ROCm as the maximum threads per
block (1024) is higher than the ROCm internal default (256).
Reduce the minBlocksPerMultiprocessor for ROCm to 8 from 16 as this
improves performance in some microbenchmarks by (statistically
significant) 4%.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22081
Differential Revision: D15947426
Pulled By: bddppq
fbshipit-source-id: b4b7015417f99e14dfdedb62639e4d837c38e4fd
Summary:
We can't really test these until we get Python 3.8 in the CI, but these all work locally and won't be invoked at all for Python 3.7 and lower so this should be pretty safe.
Fixes#21710
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22007
Pulled By: driazati
Differential Revision: D15914735
fbshipit-source-id: 83833cebe7e38b162719a4f53cbe52c3fc638edd
Summary:
This was originally introduced because at::Half overloaded a number of operators; since this isn't necessary anymore, get rid of it.
Note in many cases, these files still need THCNumerics.cuh (which was included by THCHalfAutoNumerics); I was not careful about isolating these usages.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21878
Differential Revision: D15941236
Pulled By: gchanan
fbshipit-source-id: 65f30a20089fcd618e8f3e9646cf03147a15ccba
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21753
- it accidentally didn't move non-IValue-based lists before. This is fixed now.
- it only needs to recreate a T() for IValue-based lists
Reviewed By: resistor
Differential Revision: D15809220
fbshipit-source-id: 944badf1920ee05f0969fff0d03284a641dae4a9
Summary:
Get benefit from the compile time vectorization and multi-threading.
Before:
```python
In [1]: import torch
In [2]: x = torch.randn(1000000)
In [3]: y = torch.randn(1000000)
In [4]: w = 0.7
In [5]: timeit torch.lerp(x, y, w)
2.29 ms ± 23.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
```
After:
```python
In [1]: import torch
In [2]: x = torch.randn(1000000)
In [3]: y = torch.randn(1000000)
In [4]: w = 0.7
In [5]: timeit torch.lerp(x, y, w)
452 µs ± 1.81 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
```
After, with multi-threading:
```python
In [1]: import torch
In [2]: x = torch.randn(1000000)
In [3]: y = torch.randn(1000000)
In [4]: w = 0.7
In [5]: timeit torch.lerp(x, y, w)
167 µs ± 48.8 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22038
Differential Revision: D15941468
Pulled By: VitalyFedyunin
fbshipit-source-id: fa8a5126187df4e6c849452e035b00b22be25739
Summary:
# Motivation
We allow to override JIT module serialization with `__getstate__/__setstate__` in order to cover cases where parameters are not serializable. Use cases include: MKLDNN integration: a388c78350/torch/utils/mkldnn.py (L18-L26)
and also fbgemm prepacked format integration for quantized tensors.
However, many Eager scripts use the `torch.save(module.state_dict())` form of serialization. There are several ways to make it work:
* make packed_weight itself pickleable (e.g. by binding `__getstate__/__setstate__` on the C++ UDT level)
  * change: we'd need to allow module buffers to be of arbitrary, non-Tensor types
  * pro: no change to state_dict behavior
  * cons: might not be directly inspectable by a user calling .state_dict(), especially if packed weights represent several tensors fused together
* make packed_weight a proper Tensor layout
  * pro: no change to state_dict or buffers behavior
  * cons: adding new tensor layouts is pretty costly today
  * cons: doesn't work if multiple tensors are packed in one interleaved representation
* *[this approach]* allow Modules to override state_dict and return regular tensors
  * pro: most flexible and hackable
  * pro: maintains the semantic meaning of state_dict as all data necessary to represent the module's state
  * cons: complicates state_dict logic
  * cons: potential code duplication between `__getstate__/__setstate__`
Based on discussions with zdevito and gchanan we decided to pick the latter approach. Rationale: this behavior is fully opt-in and will impact only modules that need it. For those modules the requirement listed above won't be true. But we do preserve the requirement that all elements of state_dict are tensors. (https://fburl.com/qgybrug4 for internal discussion)
In the future we might also implement one of the approaches above but those are more involved.
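The chosen approach can be sketched in plain Python; `PackedLinear` and its packing helpers are made-up stand-ins for the real fbgemm/MKLDNN packed formats:

```python
class PackedLinear:
    """Sketch: weights live in a packed, non-serializable form internally,
    but state_dict() exposes them as regular (unpacked) values."""
    def __init__(self, weight):
        self._packed = self._pack(weight)   # opaque packed format

    @staticmethod
    def _pack(weight):
        return bytes(weight)                # stand-in for real weight packing

    @staticmethod
    def _unpack(packed):
        return list(packed)

    def state_dict(self):
        # override: return regular values so torch.save(module.state_dict())
        # keeps working even though the internal buffer isn't a plain tensor
        return {"weight": self._unpack(self._packed)}

    def load_state_dict(self, state):
        self._packed = self._pack(state["weight"])

m = PackedLinear([1, 2, 3])
print(m.state_dict())   # → {'weight': [1, 2, 3]}
```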
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21933
Differential Revision: D15937678
Pulled By: dzhulgakov
fbshipit-source-id: 3cb5d1a8304d04def7aabc0969d0a2e7be182367
Summary:
This pull request adds the necessary Windows DLL code to be able to support JIT fusion for CUDA. CPU JIT Fusion isn't supported. This also adds all the non-CPU JIT tests back in on Windows.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21861
Differential Revision: D15940939
Pulled By: soumith
fbshipit-source-id: e11f6af1ac258fcfd3a077e6e2f2e6fa38be4ef1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22015
Previous fusion logic only works for operators back-to-back in the linear order of protobuf file.
This diff generalizes to work for any predecessor-successor operators in the graph without any "interfering" use/def of the related blobs.
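The generalized fusion check can be sketched as follows; the `(name, inputs, outputs)` tuple representation and the `can_fuse` helper are hypothetical, not the actual Caffe2 graph API:

```python
def can_fuse(ops, i, j):
    """Ops are (name, inputs, outputs) tuples in topological order.
    Op j may be fused into op i if j consumes an output of i and no op
    between them uses or redefines that blob (no interfering use/def)."""
    blobs = set(ops[i][2]) & set(ops[j][1])
    if not blobs:
        return False
    for k in range(i + 1, j):
        _, ins, outs = ops[k]
        if blobs & (set(ins) | set(outs)):
            return False
    return True

ops = [
    ("Conv",    ["x"], ["y"]),
    ("Dropout", ["z"], ["z2"]),   # unrelated op sitting between the pair
    ("Relu",    ["y"], ["w"]),
]
print(can_fuse(ops, 0, 2))  # → True: Dropout never touches "y"
```

Previously only index-adjacent (back-to-back) pairs would have been considered, so the Conv/Relu pair above would have been missed.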
Reviewed By: csummersea
Differential Revision: D15916709
fbshipit-source-id: 82fe4911a8250845a8bea3427d1b77ce2442c495
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21709
Change the return type from Scalar to double/int64_t so we don't need to do conversion when we call other quantize related aten functions
Differential Revision: D15793003
fbshipit-source-id: 510936c69fa17a4d67340a31ebb03415647feb04
Summary:
This is a modified version of https://github.com/pytorch/pytorch/pull/14705 since commit structure for that PR is quite messy.
1. Add `IterableDataset`.
2. So we have two data-loading modes: `Iterable` and `Map`.
   1. `Iterable` if the `dataset` is an instance of `IterableDataset`
   2. `Map` otherwise.
3. Add better support for non-batch loading (i.e., `batch_size=None` and `batch_sampler=None`). This is useful for doing things like bulk loading.
4. Refactor `DataLoaderIter` into two classes, `_SingleProcessDataLoaderIter` and `_MultiProcessingDataLoaderIter`. Rename some methods to be more generic, e.g., `get_batch` -> `get_data`.
5. Add `torch.utils.data.get_worker_info`, which returns worker information in a worker process (e.g., worker id, dataset obj copy, etc.) and can be used in `IterableDataset.__iter__` and `worker_init_fn` to do per-worker configuration.
6. Add `ChainDataset`, which is the analog of `ConcatDataset` for `IterableDataset`.
7. Import torch.utils.data in `torch/__init__.py`.
8. Add data loader examples and documentation.
9. Use `get_worker_info` to detect whether we are in a worker process in `default_collate`.
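The per-worker sharding pattern enabled by `get_worker_info` can be sketched in plain Python; `RangeStream` and `shard` are hypothetical stand-ins for an `IterableDataset.__iter__` that calls `torch.utils.data.get_worker_info()`:

```python
class RangeStream:
    """Sketch of per-worker sharding for an iterable-style dataset:
    each worker skips to its own offset and strides by the worker count,
    so the workers jointly cover the stream exactly once."""
    def __init__(self, start, end):
        self.start, self.end = start, end

    def shard(self, worker_id, num_workers):
        # real code would read worker_id/num_workers from
        # torch.utils.data.get_worker_info() inside __iter__
        return iter(range(self.start + worker_id, self.end, num_workers))

ds = RangeStream(0, 10)
# two workers together yield every element once, with no duplicates
print(sorted(list(ds.shard(0, 2)) + list(ds.shard(1, 2))))
# → [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```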
Closes https://github.com/pytorch/pytorch/issues/17909, https://github.com/pytorch/pytorch/issues/18096, https://github.com/pytorch/pytorch/issues/19946, and some of https://github.com/pytorch/pytorch/issues/13023
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19228
Reviewed By: bddppq
Differential Revision: D15058152
fbshipit-source-id: 9e081a901a071d7e4502b88054a34b450ab5ddde
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20673
Add option to bucket-weighted pooling to hash the bucket so that any cardinality score can be used.
Reviewed By: huginhuangfb
Differential Revision: D15003509
fbshipit-source-id: 575a149de395f18fd7759f3edb485619f8aa5363
Summary:
The first attempt and more discussions are available in https://github.com/pytorch/pytorch/issues/19577
#### Goal
Allow toggling DDP gradient synchronization across iterations. With this feature, users may accumulate grads in module variables, and only kick off an expensive grad synchronization every few iterations.
#### Concerns
Our first attempt in https://github.com/pytorch/pytorch/issues/19577 tries to do it using a variable or a function. But apaszke made a good point that it would be error-prone, and favors a context manager instead.
#### Proposed Solution
Instead of providing a `accumulate_grads` variable/function/context, we provide a `DistributedDataParallel.no_sync()` context manager. And it does exactly what the name suggests, i.e., disable DDP grad synchronization within the context. Note that `accumulate_grads` means `no_sync` + no optimizer step, where the latter is not controlled by DDP.
It is true that users need to call another `model(input).backward()` after exiting the context, and this is indeed more verbose. But I think it is OK, as one major concern in the previous discussion was to prevent users from running into errors without knowing it. This API should reaffirm the expected behavior, and does not interfere with other use cases where accumulating grads is not required.
The application would then look like:
```python
with ddp.no_sync():
    for input in inputs:
        ddp(input).backward()
ddp(one_more_input).backward()
optimizer.step()
```
chenyangyu1988 myleott
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21736
Differential Revision: D15805215
Pulled By: mrshenli
fbshipit-source-id: 73405797d1e39965c52016af5cf45b15525ce21c
Summary:
There aren't any substantive changes aside from some test renames (e.g. `TestScript.test_dict_membership` -> `TestDict.test_membership`) and the addition of `TestDict.dict()`.
Adding the rest of the dict ops was making the tests a mess and `TestScript` is already > 10000 lines by itself, so breaking them up should make things cleaner
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22000
Pulled By: driazati
Differential Revision: D15911383
fbshipit-source-id: 614428e03fbc14252f0e9cde74ab9a707169a860
Summary:
The cppdocs build job (originally run on Chronos as a cron job) was frequently broken because it was not run on every PR. This PR moves it to CircleCI and enables it on every PR, so that we can get the build failure signal much earlier.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19768
Differential Revision: D15922289
Pulled By: yf225
fbshipit-source-id: e36ef59a2e42f78b7d759ee02f2d94dc90f88fff
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19443
This adds support for sparse gradients to the reducer as well as to
the DistributedDataParallel wrapper. Note that an out-of-band signal
is needed to indicate whether a dense parameter (e.g. an embedding) is
expected to receive a sparse gradient. This information is
passed to the bucket assignment computation routine and the reducer as
a vector of booleans. Every parameter for which we expect a sparse
gradient is assigned its own bucket, as we cannot easily group
multiple unrelated sparse tensors.
Reviewed By: mrshenli
Differential Revision: D15007365
fbshipit-source-id: f298e83fd3ca828fae9e80739e1db89d045c99ac
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19146
Implemented only on ProcessGroupGloo, as an allgather of metadata
(sparse_dim, dense_dim, and nnz), followed by an allgather of indices,
followed by an allgather of values. Once these operations have
finished, all ranks locally compute a reduction over these sparse
tensors. Works for both CPU and CUDA tensors.
This surfaced a problem with the existing assumption of only modifying
tensors that are passed at the call site, because for sparse tensors
we don't know the dimensions of the output tensors before we run the
collective. To deal with this unknown, this commit adds a `result`
function to the `c10d::ProcessGroup::Work` class that returns a vector
of tensors.
It's a bit odd to have to retrieve the result through this function
only for operations on sparse tensors. To make this work irrespective
of tensor layout, we can create a follow-up commit to make all in
place operations make their results accessible through this function
as well. This doesn't break any existing contracts but does have the
potential to add interface ambiguity.
Reviewed By: mrshenli
Differential Revision: D14889547
fbshipit-source-id: 34f3de4d6a2e09c9eba368df47daad0dc11b333e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21938
After having changed all call sites, we can now remove the old naming scheme.
Reviewed By: zdevito
Differential Revision: D15892402
fbshipit-source-id: 1f5b53a12fa657f6307811e8657c2e14f6285d2f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21937
This changes call sites to use the new naming scheme
Reviewed By: zdevito
Differential Revision: D15892404
fbshipit-source-id: 8d32aa90a0ead1066688166478f299fde9c2c133
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21936
This introduces torch::List and torch::Dict as aliases to ListPtr/DictPtr.
After this lands, we can step by step change the call sites to the new naming
and finally remove the old spellings.
Reviewed By: zdevito
Differential Revision: D15892405
fbshipit-source-id: 67b38a6253c42364ff349a0d4049f90f03ca0d44
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21806
Dispatcher::findSchema(op_name) now uses a lookup table instead of iterating through the list of operators to find it.
This speeds up op lookup (as in finding the operator handle from the name, not as in finding a kernel when you already have the operator handle)
and it also speeds up op registration, since registration needs to check whether an op with the same name already exists.
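The change can be sketched in Python; `Dispatcher`, `register`, and `find_schema` here are illustrative stand-ins for the C++ dispatcher, not its real interface:

```python
class Dispatcher:
    """Sketch: op-handle lookup by name via a hash table instead of a scan."""
    def __init__(self):
        self._ops = []        # handles in registration order
        self._by_name = {}    # new: name -> handle (here, a list index)

    def register(self, name):
        # the duplicate-name check at registration time is now O(1) too
        if name in self._by_name:
            raise RuntimeError(f"op {name} already exists")
        self._by_name[name] = len(self._ops)
        self._ops.append(name)

    def find_schema(self, name):
        # before: linear scan -- for i, op in enumerate(self._ops): ...
        return self._by_name.get(name)

d = Dispatcher()
d.register("aten::add")
print(d.find_schema("aten::add"))  # → 0
```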
Differential Revision: D15834256
fbshipit-source-id: c3639d7b567e4ed5e3627c3ebfd01b7d08b55ac1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21809
Many error messages show dispatch keys, for example when the dispatcher didn't find a kernel to dispatch to.
Previously, this was a string like "CPU" or "CUDA" for known backends and just an arbitrary number for other backends.
Now, tensor type id registration also registers a name for the dispatch key and shows that in the error messages.
There is no API change, just the error messages are better now.
Differential Revision: D15835809
fbshipit-source-id: 4f0c9d0925c6708b02d79c653a2fae75b6623bb9
Summary:
https://github.com/pytorch/pytorch/pull/17072 breaks `model.to(xla_device)`, because moving `model` to XLA device involves changing its parameters' TensorImpl type, and the current implementation of `nn.Module.to()` doesn't support changing module parameters' TensorImpl type:
```python
# 6dc445e1a8/torch/nn/modules/module.py (L192-L208)
def _apply(self, fn):
...
for param in self._parameters.values():
if param is not None:
# Tensors stored in modules are graph leaves, and we don't
# want to create copy nodes, so we have to unpack the data.
param.data = fn(param.data) # NOTE: this doesn't allow changing `param.data`'s TensorImpl type
if param._grad is not None:
param._grad.data = fn(param._grad.data) # NOTE: this doesn't allow changing `param._grad.data`'s TensorImpl type
...
```
yf225 TODO: fix the description here when we finish the implementation
To fix this problem, we introduce a new API `model.to_()` that always assigns new tensors to the parameters (thus supporting changing the parameters to any TensorImpl type), and also bumps the version counter of the original parameters correctly so that they are invalidated in any autograd graph they participate in.
We also add warning to the current `model.to()` API to inform users about the upcoming behavior change of `model.to()`: in future releases, it would create and return a new model instead of in-place updating the current model.
This unblocks adding XLA to our CI test suite, which also allows XLA to catch up with other changes in our codebase, notably the c10 dispatcher.
[xla ci]
cc. resistor ailzhang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21613
Differential Revision: D15895387
Pulled By: yf225
fbshipit-source-id: b79f230fb06019122a37fdf0711bf2130a016fe6
Summary:
When we pass `fn` to `nn.Module._apply()` and `fn` is an in-place operation, the correct behavior should also include bumping the parameters' and their gradients' version counters. This PR fixes the old incorrect behavior and makes sure the new behavior is right.
Note that this PR is BC-breaking in the following way:
Previously, passing an in-place operation to `nn.Module._apply()` does not bump the module's parameters' and their gradients' version counters. After this PR, the module's parameters' and their gradients' version counters will be correctly bumped by the in-place operation, which will invalidate them in any autograd graph they previously participate in.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21865
Differential Revision: D15881952
Pulled By: yf225
fbshipit-source-id: 62f9244a4283a110147e9f20145ff232a5579fbd
Summary:
Added some extra tests for std_mean and var_mean for multiple dims.
Some refactoring of previously created tests based on PR comments: https://github.com/pytorch/pytorch/pull/18731
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20650
Differential Revision: D15396101
Pulled By: ifedan
fbshipit-source-id: d15c3c2c7084a24d6cfea4018173552fcc9c03a9
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21852
To enable change of q_scale and q_zero_point in `copy_`
Differential Revision: D15793427
fbshipit-source-id: a7040b5b956d161fd6af6176287f4a4aa877c9be
Summary:
The code in `python_sugared_value.cpp` to recursively compile methods
was not being tested, so this adds a test for it and fixes some errors
in it
It was necessary to disable any hooks that were set, since (at least in
our tests) they would try to export a half-finished graph when called on
recursively compiled methods.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21862
Differential Revision: D15860314
Pulled By: driazati
fbshipit-source-id: e8afe9d4c75c345b6e1471072d67c5e335b61337
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21914
https://github.com/pytorch/pytorch/pull/21591 added a needed feature to clean up grad accumulator post hooks when the DistributedDataParallel model object is cleaned up. There's a minor typo that causes it to loop infinitely over the first element.
Differential Revision: D15878884
fbshipit-source-id: b7fd0bbd51eb187579d639b1709c6f7b62b85e7a
Summary:
This PR adds support for `in` checks like `key in my_dict`
For now it leaves lists as a follow up due to the changes around `IValue` lists and it needing an `IValue` equality op.
For objects it uses the magic method `__contains__(self, key)`
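In plain Python, the object case works the same way the check is lowered: `key in obj` dispatches to `obj.__contains__(key)`. A minimal illustration:

```python
class SparseSet:
    """`key in obj` calls obj.__contains__(key) -- the same magic method
    the compiler emits a call to for `in` checks on objects."""
    def __init__(self, items):
        self._items = set(items)

    def __contains__(self, key):
        return key in self._items

s = SparseSet([1, 2, 3])
print(2 in s, 9 in s)  # → True False
```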
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21527
Pulled By: driazati
Differential Revision: D15811203
fbshipit-source-id: 95745060394f8a9450efaaf8ab09d9af83bea01e
Summary:
This adds support for inferred attributes (everything except empty lists, dicts, and tuples) as well as using the PEP 526 style annotations on a class, so this eliminates the need for `torch.jit.Attribute`
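A plain-Python sketch of the PEP 526 style: class-body annotations land in `__annotations__`, which is where type information can be read from without explicit `torch.jit.Attribute` wrappers (the `Config` class and its fields are a made-up example):

```python
from typing import Dict, List

class Config:
    # PEP 526 class-body annotations: a declared type source readable by
    # tooling, replacing explicit torch.jit.Attribute(value, type) wrappers
    sizes: List[int]
    names_to_ids: Dict[str, int]

    def __init__(self):
        self.sizes = [1, 2]
        self.names_to_ids = {"a": 0}

print(sorted(Config.__annotations__))  # → ['names_to_ids', 'sizes']
```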
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21379
Differential Revision: D15718537
Pulled By: driazati
fbshipit-source-id: b7481ae3d7ee421613e931b7dc3427ef2a99757f
Summary:
This is a fix for https://github.com/pytorch/pytorch/issues/21469
Currently there is no way to tell whether a backward function has released its saved variables when those variables were stored in a vector. This change sets a flag when a function has saved variables and they have been released, so we can raise an error if somebody calls the function again with already-released variables.
Functions that do not have saved variables can still be called multiple times, for backward compatibility.
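The guard can be sketched in plain Python; `BackwardNode` and `apply` are hypothetical stand-ins for the autograd C++ machinery:

```python
class BackwardNode:
    """Sketch: error out when a backward function whose saved variables
    were already released is applied a second time."""
    def __init__(self, saved):
        self._saved = saved
        self._released = False

    def apply(self):
        if self._released:
            raise RuntimeError(
                "Trying to backward through the graph a second time after "
                "saved variables have been freed")
        out = sum(self._saved)
        if self._saved:        # only nodes that had saved variables set the flag
            self._saved = None
            self._released = True
        return out

n = BackwardNode([1, 2, 3])
print(n.apply())  # → 6; a second n.apply() raises RuntimeError
```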
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21533
Differential Revision: D15810481
Pulled By: ifedan
fbshipit-source-id: 5663e0c14f1b65727abc0d078aef348078d6a543
Summary:
This will need a conflict resolution once avg_pool2d() has been merged.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21732
Differential Revision: D15824923
Pulled By: ezyang
fbshipit-source-id: 83341e0209b660aecf788272079d8135d78b6ff1
Summary:
This was some code I added :^)
Time for me to remove it.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21897
Differential Revision: D15873213
Pulled By: Chillee
fbshipit-source-id: 769c3bd71c542be4afddc02dc2f65aa5c751b10d
Summary:
What's the point of having warnings if we never fix them :^)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21898
Differential Revision: D15873280
Pulled By: Chillee
fbshipit-source-id: a8274bab2badd840d36a9d2e1354677a6114ae1d
Summary:
cosine_similarity has two non-tensor parameters and needs some special handling. This diff adds support for its export.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21884
Reviewed By: zrphercule
Differential Revision: D15866807
Pulled By: houseroad
fbshipit-source-id: a165fbc00c65c44b276df89ae705ca8960349d48
Summary:
```
This replaces the kernel helpers in Loops.h/cuh with the following:
cpu_kernel
cpu_kernel_vec
gpu_kernel
gpu_kernel_with_scalars
These work with functions with any number of input arguments, with the
exception of 'gpu_kernel_with_scalars' which is limited to binary
operations. Previously, we only supported functions of 0, 1, or 2 input
arguments. Adding support for 3 or 4 input argument functions required
significant amount of additional code.
This makes a few other changes:
Remove 'ntensors' from the for_each/serial_for_each loop. Most loops
assume a fixed number of tensors, and the value is accessible from
TensorIterator::ntensors()
Only lift CPU scalars to parameters in 'gpu_kernel_with_scalars'.
Previously, we performed this recursively in gpu_unary_kernel and
gpu_binary_kernel, so something like `torch.add(3, 4, out=cuda_tensor)`
would specialize to a "nullary" kernel. Now, only the first
scalar input is lifted to a kernel parameter. Any additional scalar
inputs are copied to CUDA tensors. Note that operations like `x + 5`
and `5 + x` still work efficiently. This avoids generating an exponential
number of specializations in the number of input arguments.
```
**Performance measurements**
Timing numbers are unchanged for basic elementwise operations. Linked below is a script to measure torch.add perf on PR vs. master CPU+GPU (GCC 7.3):
[miniperf.py](https://gist.github.com/colesbury/4a61893a22809cb0931f08cd37127be4)
**Generated assembly**
cpu_kernel and cpu_kernel_vec still generate good vectorized code with
both GCC 7.3 and GCC 4.8.5. Below is the assembly for the "hot" inner loop of
torch.add as well as an auto-vectorized torch.mul implementation using cpu_kernel/
binary_kernel. (The real torch.mul uses cpu_kernel_vec but I wanted to check that
auto vectorization still works well):
[torch.add GCC 7.3](https://gist.github.com/colesbury/927ddbc71dc46899602589e85aef1331)
[torch.add GCC 4.8](https://gist.github.com/colesbury/f00e0aafd3d1c54e874e9718253dae16)
[torch.mul auto vectorized GCC 7.3](https://gist.github.com/colesbury/3077bfc65db9b4be4532c447bc0f8628)
[torch.mul auto vectorized GCC 4.8](https://gist.github.com/colesbury/1b38e158b3f0aaf8aad3a76963fcde86)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21475
Differential Revision: D15745116
Pulled By: colesbury
fbshipit-source-id: 914277d7930dc16e94f15bf87484a4ef82890f91
Summary:
PR https://github.com/pytorch/pytorch/issues/20685 incorrectly only enabled P2P access for non-contiguous copies.
This can make cudaMemcpy slow for inter-gpu copies, especially on ROCm
devices. I didn't notice a difference on CUDA 10, but ngimel says it's
important for CUDA too.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21872
Differential Revision: D15863965
Pulled By: colesbury
fbshipit-source-id: 0a858f3c338fa2a5d05949d7f65fc05a70a9dfe1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21080
Add Huber loss as a new option for regression training (refer to TensorFlow implementation: https://fburl.com/9va71wwo)
# huber loss
def huber(true, pred, delta):
    error = abs(true - pred)
    loss = 0.5 * min(error, delta)^2 + delta * max(error - delta, 0)
    return mean(loss)
As a combination of MSE loss (`x < delta`) and MAE loss (`x >= delta`), the advantage of Huber loss is to reduce the training dependence on outlier.
One thing worth noting is that the Huber loss is not twice differentiable at `x = delta`. To further address this, one could consider adopting the loss `log(cosh(x))`.
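As a sanity check, the min/max form above matches the textbook piecewise definition; a plain-Python sketch on scalar inputs (`error = |true - pred|` already computed):

```python
def huber(error, delta):
    # min/max form from the summary above
    return 0.5 * min(error, delta) ** 2 + delta * max(error - delta, 0.0)

def huber_piecewise(error, delta):
    # textbook piecewise form: quadratic below delta, linear above
    if error < delta:
        return 0.5 * error ** 2
    return delta * (error - 0.5 * delta)

for e in (0.0, 0.5, 1.0, 3.0):
    assert abs(huber(e, 1.0) - huber_piecewise(e, 1.0)) < 1e-12
print(huber(3.0, 1.0))  # → 2.5
```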
Reviewed By: chintak
Differential Revision: D15524377
fbshipit-source-id: 73acbe2728ce160c075f9acc65a1c21e3eb64e84
Summary:
After fixing https://github.com/pytorch/pytorch/issues/20774 the TRT build was broken
Because of missing annotations, pybind_state_gpu.so was missing symbols, but pybind_state.so was not. This caused a weird failure mode: trying to import pybind_state_gpu first left the system in a semi-initialized state and led to a segfault.
Minimal repro:
```
>>> import ctypes
>>> ctypes.CDLL('/var/lib/jenkins/.local/lib/python2.7/site-packages/caffe2/python/caffe2_pybind11_state_gpu.so')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.7/ctypes/__init__.py", line 362, in __init__
self._handle = _dlopen(self._name, mode)
OSError: /var/lib/jenkins/.local/lib/python2.7/site-packages/caffe2/python/caffe2_pybind11_state_gpu.so: undefined symbol: _ZN6caffe219TensorRTTransformer9TransformEPNS_9WorkspaceEPNS_6NetDefERKSt13unordered_mapINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS_11TensorShapeESt4hashISB_ESt8equal_toISB_ESaISt4pairIKSB_SC_EEE
>>> ctypes.CDLL('/var/lib/jenkins/.local/lib/python2.7/site-packages/caffe2/python/caffe2_pybind11_state.so')
Segmentation fault (core dumped)
```
Too lazy to repro locally, let's see if CI passes
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21775
Differential Revision: D15829605
Pulled By: dzhulgakov
fbshipit-source-id: 1adb2bde56b0cd68f84cfca67bc050adcf787cd9
Summary:
Following up b811b6d5c03596d789a33d7891b606842e01f7d2
* Use property instead of __setattr__ in CMake.
* Add a comment clarifying when build_ext.run is called.
---
cc ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21792
Differential Revision: D15860606
Pulled By: umanwizard
fbshipit-source-id: ba1fa07f58d4eac81ac27fa9dc7115d1cdd3dec0
Summary:
https://github.com/pytorch/pytorch/issues/11866 has corrected this issue in function `host_softmax` (aten/src/ATen/native/SoftMax.cpp). But I tried the example proposed in https://github.com/pytorch/pytorch/issues/11752. `log_softmax` is still not working for big logits.
I have looked into the source code, found that example had called `vec_host_softmax_lastdim`, not `host_softmax`.
This code fixes the issue in `_vec_log_softmax_lastdim` and has a test for `log_softmax`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21672
Differential Revision: D15856327
Pulled By: VitalyFedyunin
fbshipit-source-id: 7a1fd3c0a03d366c99eb873e235361e4fcfa7567
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21735
ghimport-source-id: 4a4289693e372880e3d36e579c83d9e8745e70ed
Test Plan:
- I'm not sure how to test this other than making sure it compiles.
- [namedtensor ci]
gh-metadata: pytorch pytorch 21735 gh/zou3519/49/head
Imported from OSS
Differential Revision: D15833456
Pulled By: zou3519
fbshipit-source-id: ea2fa6d5c5f1eb2d7970d47189d6e4fcd947146d
Summary:
kuttas pointed out that the DDP Reducer only needs to remember `uintptr, Function` pairs, and hence does not need an unordered map as added by https://github.com/pytorch/pytorch/issues/21591. Using a vector should speed it up a bit.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21783
Differential Revision: D15854312
Pulled By: mrshenli
fbshipit-source-id: 153ba035b8d658c7878a613f16a42de977d89c43
Summary:
After https://github.com/pytorch/pytorch/pull/17072, we are allowed to pass Variables into ATen ops, thus there is no need to unwrap input variables in the c10 call path.
Note that since Caffe2 still expects inputs to be pure Tensors, we moved the unwrapping logic to the Caffe2 wrapper.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21620
Differential Revision: D15763560
Pulled By: yf225
fbshipit-source-id: 5375f0e51eb320f380ae599ebf98e6b259f0bff8
Summary:
This refactors pybind_utils so we can have all our type-inferring stuff in
1 place (e.g. for #21379)
There is some follow up work to make the error messages better, but I think that's fine to save for another PR.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21550
Pulled By: driazati
Differential Revision: D15727002
fbshipit-source-id: a6974f2e1e5879f0503a18efc138da31cda7afa2
Summary:
Resolves https://github.com/pytorch/lockdown/issues/18
This implements NamedTuple by taking advantage of the existing `names` field in `TupleType`.
TODO: This currently doesn't retain the NamedTuple-ness through serialization. Discussed with suo offline; we can probably add a way to define an anonymous NamedTuple in script (e.g. `NamedTuple('Foo', [('a', int), ('b', float), ('c', List[float])])`) and serialize that.
TODO: implement support for calling the constructor with kwargs
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21428
Differential Revision: D15741564
Pulled By: jamesr66a
fbshipit-source-id: c077cbcea1880675ca6deb340a9ec78f824a136c
Summary:
When enabling this flag, there were a lot of warnings. This PR focuses on the warnings where the comparison could affect array indices, since those are the ones most prone to fail.
The good news is that I didn't find anything obviously concerning.
One degenerate case: if the matrices we work with are too skinny (dim1=1, dim2 needs to hold a big number), we could run into issues.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18187
Differential Revision: D14527182
Pulled By: hyuen
fbshipit-source-id: b9f46b6f68ab912c55368961758a7a5af1805555
Summary:
We plan on generating python bindings for C++ ChunkDataset API using the current Pytorch Dataloader class, which must call get_batch() instead of get_batch(size)
This change doesn't break the current API; it just adds one more method that will make future extensions easier (WIP).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21797
Differential Revision: D15830522
Pulled By: soumith
fbshipit-source-id: 7208f305b48bf65d2783eaff43ff57a05e62c255
Summary:
Originally, the tests for tensorboard writer are smoke tests only. This PR lets CI compare the output with expected results at low level. The randomness of the tensors in the test are also removed.
P.S. I found that protobuf serializes data differently across Python environments. One way to solve this is to write the data and then read it back immediately, comparing the data at a higher level.
For `add_custom_scalars`, the data to be written is a dictionary, and the serialized result might differ (it is not an `OrderedDict`), so only a smoke test is kept for that.
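The "write then read back, compare at a higher level" idea can be sketched like this (using `json` as a stand-in for protobuf serialization; the point is comparing parsed structures, not raw bytes):

```python
import json

def serialized_equal(expected, actual):
    # Compare at a higher level: round-trip both objects through the
    # serializer and compare the parsed results, so byte-level differences
    # (e.g. dict key ordering across environments) don't cause false failures.
    return json.loads(json.dumps(expected)) == json.loads(json.dumps(actual))
```

Two dicts with different insertion order still compare equal this way, which is exactly the property byte-for-byte comparison lacks.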
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20987
Reviewed By: NarineK, lanpa
Differential Revision: D15804871
Pulled By: orionr
fbshipit-source-id: 69324c11ff823b19960d50def73adff36eb4a2ac
Summary:
Try to fix a sporadic failure on some CIs.
I've run this test hundreds of times on my machine (GeForce 1060, MAGMA) but I cannot reproduce this.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21638
Differential Revision: D15827779
Pulled By: ezyang
fbshipit-source-id: 3586075e48907b3b84a101c560a34cc733514a02
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21712
Warn when people use unordered_map or vector with IValues. These APIs are deprecated.
The unordered_map API is slow because it requires copying the whole map.
The vector API is slow for some types (e.g. std::string) because for them it also requires copying the whole vector.
Also, the vector API would get slow for all types if we decide to switch to SmallVector.
Differential Revision: D15792428
fbshipit-source-id: 1b72406b3a8d56521c862858c9f0ed01e56f2757
Summary:
When kwargs are specified in a test defined via common_method_invocations, it doesn't work if there isn't also a positional argument (`{'foo':'foo'}` without a positional arg generates a python call like: `self.method(, foo=foo)`, erroring on the `,`). I wanted to test something in a different PR and noticed I couldn't.
Also fixed some flake8 warnings I was seeing locally.
I replaced `lambda x: x` with `ident` since it seems a bit cleaner to me, but happy to revert that if others don't agree?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21499
Differential Revision: D15826974
Pulled By: nairbv
fbshipit-source-id: a3f37c80ba2303c7d9ae06241df06c7475b64e36
Summary:
So far, we only have py2 CI for ONNX. I think py3 support is important, and we plan to add onnxruntime backend tests, which only support py3.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21715
Reviewed By: bddppq
Differential Revision: D15796885
Pulled By: houseroad
fbshipit-source-id: 8554dbb75d13c57b67ca054446a13a016983326c
Summary:
Some data loader tests are flaky on py 2 with the following error
```
Jun 12 22:17:31 Traceback (most recent call last):
Jun 12 22:17:31 File "test_dataloader.py", line 798, in test_iterable_dataset
Jun 12 22:17:31 fetched = sorted([d.item() for d in dataloader_iter])
Jun 12 22:17:31 File "/opt/python/2.7.9/lib/python2.7/site-packages/torch/utils/data/dataloader.py", line 697, in __next__
Jun 12 22:17:31 idx, data = self._get_data()
Jun 12 22:17:31 File "/opt/python/2.7.9/lib/python2.7/site-packages/torch/utils/data/dataloader.py", line 664, in _get_data
Jun 12 22:17:31 success, data = self._try_get_data()
Jun 12 22:17:31 File "/opt/python/2.7.9/lib/python2.7/site-packages/torch/utils/data/dataloader.py", line 617, in _try_get_data
Jun 12 22:17:31 data = self.data_queue.get(timeout=timeout)
Jun 12 22:17:31 File "/opt/python/2.7.9/lib/python2.7/multiprocessing/queues.py", line 135, in get
Jun 12 22:17:31 res = self._recv()
Jun 12 22:17:31 File "/opt/python/2.7.9/lib/python2.7/site-packages/torch/multiprocessing/queue.py", line 22, in recv
Jun 12 22:17:31 return pickle.loads(buf)
Jun 12 22:17:31 File "/opt/python/2.7.9/lib/python2.7/pickle.py", line 1382, in loads
Jun 12 22:17:31 return Unpickler(file).load()
Jun 12 22:17:31 File "/opt/python/2.7.9/lib/python2.7/pickle.py", line 858, in load
Jun 12 22:17:31 dispatch[key](self)
Jun 12 22:17:31 File "/opt/python/2.7.9/lib/python2.7/pickle.py", line 1133, in load_reduce
Jun 12 22:17:31 value = func(*args)
Jun 12 22:17:31 File "/opt/python/2.7.9/lib/python2.7/site-packages/torch/multiprocessing/reductions.py", line 274, in rebuild_storage_fd
Jun 12 22:17:31 fd = multiprocessing.reduction.rebuild_handle(df)
Jun 12 22:17:31 File "/opt/python/2.7.9/lib/python2.7/multiprocessing/reduction.py", line 157, in rebuild_handle
Jun 12 22:17:31 new_handle = recv_handle(conn)
Jun 12 22:17:31 File "/opt/python/2.7.9/lib/python2.7/multiprocessing/reduction.py", line 83, in recv_handle
Jun 12 22:17:31 return _multiprocessing.recvfd(conn.fileno())
Jun 12 22:17:31 OSError: [Errno 4] Interrupted system call
```
Apparently, Python 2.7's `recvfd` calls `recvmsg` without EINTR retry: https://github.com/python/cpython/blob/2.7/Modules/_multiprocessing/multiprocessing.c#L174
So we should wrap the call in an outer retry loop that catches EINTR.
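The retry pattern can be sketched as follows (`retry_on_eintr` is an illustrative helper, not the actual dataloader code):

```python
import errno

def retry_on_eintr(call):
    # Keep retrying a system call that was interrupted by a signal (EINTR),
    # mirroring the outer retry loop needed around Python 2.7's recvfd,
    # which does not retry internally.
    while True:
        try:
            return call()
        except (OSError, IOError) as e:
            if getattr(e, "errno", None) != errno.EINTR:
                raise
```

Any other error still propagates; only `Errno 4: Interrupted system call` triggers a retry.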
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21723
Differential Revision: D15806247
Pulled By: ezyang
fbshipit-source-id: 16cb661cc0fb418fd37353a1fef7ceeb634f02b7
Summary:
Currently when building extensions, variables such as USE_CUDA, USE_CUDNN are used to determine what libraries should be linked. But we should use what CMake has detected, because:
1. If CMake found them unavailable but the variables say some libraries should be linked, the build would fail.
2. If the first build is made using a set of non-default build options, rebuild must have these option passed to setup.py again, otherwise the extension build process is inconsistent with CMake. For example,
```bash
# First build
USE_CUDA=0 python setup.py install
# Subsequent builds like this would fail, unless "build/" is deleted
python setup.py install
```
This commit addresses the above issues by using variables from CMakeCache.txt when building the extensions.
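Reading build options back out of `CMakeCache.txt` amounts to parsing `KEY:TYPE=VALUE` lines; a minimal sketch (my own helper, not the actual `setup.py` code) might look like:

```python
import re

def parse_cmake_cache(text):
    # Parse "KEY:TYPE=VALUE" lines (as found in CMakeCache.txt) into a dict.
    # Comment lines start with '#' or '//' and are skipped.
    cache = {}
    pattern = re.compile(r"^(.+?):([A-Za-z]+)=(.*)$")
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "//")):
            continue
        m = pattern.match(line)
        if m:
            key, _type, value = m.groups()
            cache[key] = value
    return cache
```

With this, a rebuild can consult `cache['USE_CUDA']` as detected by CMake instead of re-reading environment variables.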
---
The changes in `setup.py` may look lengthy, but the biggest changed block is mostly moving them into a function `configure_extension_build` (along with some variable names changed to `cmake_cache_vars['variable name']` and other minor changes), because it must be called after CMake has been called (and thus the options used and system environment detected by CMake become available).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21653
Differential Revision: D15824506
Pulled By: ezyang
fbshipit-source-id: 1e1eb7eec7debba30738f65472ccad966ee74028
Summary:
This makes the error thrown in aten_to_numpy_dtype consistent with that in numpy_dtype_to_aten.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21608
Differential Revision: D15816035
Pulled By: gchanan
fbshipit-source-id: 392e8b9ea37003a859e7ed459911a1700fcbd695
Summary:
This PR is intended as a fix for https://github.com/pytorch/pytorch/issues/21644.
It allows the `with emit_nvtx` context manager to take an additional `record_shapes` argument. `record_shapes` is False by default, but if True, the nvtx ranges generated for each autograd op will append additional information about the sizes of Tensors received by that op.
The format of shape information is equivalent to what the CPU-side profiler spits out. For example,
```
M = torch.randn(2, 3)
mat1 = torch.randn(2, 3)
mat2 = torch.randn(3, 3)
with torch.cuda.profiler.profile():
with torch.autograd.profiler.emit_nvtx(record_shapes=True):
torch.addmm(M, mat1, mat2)
```
produces the following nvtx range label for addmm:

(cf the "Input Shapes" shown in 864cfbc216 (diff-115b6d48fa8c0ff33fa94b8fce8877b6))
I also took the opportunity to do some minor docstring cleanup.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21691
Differential Revision: D15816226
Pulled By: gchanan
fbshipit-source-id: b2b01ea10fea61a6409a32b41e85b6c8b4851bed
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20924
I found a python3 bug for deserializing caffe2 code. The exception thrown is Unicode related error instead of just decode error, and we need to catch that as well
Reviewed By: ipiszy
Differential Revision: D15293221
fbshipit-source-id: 29820800d1b4cbe5bf3f5a189fe2023e655d0508
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21763
Custom __getattr__ functions can only raise AttributeError. This code threw NotImplementedError, which caused trouble upstream when hasattr() was called.
Differential Revision: D15815176
fbshipit-source-id: 0982e2382de4578d3fc05c5d2a63f624d6b4765e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21446
This is used for easier tracing of the iteration id when looking at the trace diagram.
Reviewed By: ilia-cher
Differential Revision: D15628950
fbshipit-source-id: ee75b3bdb14a36abc18c7bddc49d8ec9789b724d
Summary:
```
The stride calculation using OffsetCalculator performs poorly with
MAX_DIMS=25. This reduces MAX_DIMS (after coalescing) to 16 on ROCm.
I think it's unlikely that anyone will exceed this limit. If they do,
we can add additional specializations for ROCm with more dimensions.
```
I'm not sure about the underlying cause. With MAX_DIM=25, the add kernel's params
is ~648 bytes vs. ~424 bytes with MAX_DIM=16. The kernel instruction footprint is
bigger too, but most of these instructions are never executed and most kernel parameters
are never loaded because the typical dimensionality is much smaller.
Mini benchmark here:
https://gist.github.com/colesbury/1e917ae6a0ca9d24712121b92fed4c8f
(broadcasting operations are much faster)
cc iotamudelta
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21754
Reviewed By: bddppq
Differential Revision: D15811906
Pulled By: colesbury
fbshipit-source-id: 063f92c083d26e2ef2edc98df7ff0400f9432b9d
Summary:
Currently multihead attention for half type is broken
```
File "/home/ngimel/pytorch/torch/nn/functional.py", line 3279, in multi_head_attention_forward
attn_output = torch.bmm(attn_output_weights, v)
RuntimeError: Expected object of scalar type Float but got scalar type Half for argument #2 'mat2'
```
because softmax converts half inputs into fp32 inputs. This is unnecessary - all the computations in softmax will be done in fp32 anyway, and the results need to be converted into fp16 for the subsequent batch matrix multiply, so nothing is gained by writing them out in fp32. This PR gets rid of type casting in softmax, so that half works.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21658
Differential Revision: D15807487
Pulled By: zhangguanheng66
fbshipit-source-id: 4709ec71a36383d0d35a8f01021e12e22b94992d
Summary:
In this PR, we use `expect` to fill in the token for pytorchbot when doing `git push`, so that we don't need to save the token in the git remote URL.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20459
Differential Revision: D15811676
Pulled By: yf225
fbshipit-source-id: cd3b780da05d202305f76878e55c3435590f15a8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21742
Add error message to NotImplementedError so we know which function it is about.
Reviewed By: bddppq
Differential Revision: D15806379
fbshipit-source-id: 14eab9d03aa5b44ab95c5caeadc0e01d51f22188
Summary:
When converting pixel_shuffle to reshape + transpose + reshape, the first reshape should
be:
[N, C * r^2, H, W] => [N, C, r, r, H, W]
in order to match pytorch's implementation (see ATen PixelShuffle.cpp).
This previously wasn't caught by the test case, since it uses C = r = 4. Updated test case to
have C = 2, r = 4.
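The corrected decomposition can be sketched in NumPy (a sketch mirroring the reshape/transpose/reshape order described above; `pixel_shuffle` here is my own helper, not the torch API):

```python
import numpy as np

def pixel_shuffle(x, r):
    # [N, C*r^2, H, W] -> [N, C, r, r, H, W] -> [N, C, H, r, W, r] -> [N, C, H*r, W*r]
    n, c_rr, h, w = x.shape
    c = c_rr // (r * r)
    x = x.reshape(n, c, r, r, h, w)       # the first reshape the fix corrects
    x = x.transpose(0, 1, 4, 2, 5, 3)     # interleave spatial blocks
    return x.reshape(n, c, h * r, w * r)
```

Note that a test with C = r = 4 cannot distinguish `[N, C, r, r, H, W]` from `[N, r, r, C, H, W]`, which is why the test case was changed to C = 2, r = 4.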
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21486
Reviewed By: houseroad
Differential Revision: D15700945
Pulled By: houseroad
fbshipit-source-id: 47019691fdc20e152e867c7f6fd57da104a12948
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21718
adding a detection method on whether the package is built for AMD.
Reviewed By: bddppq
Differential Revision: D15795893
fbshipit-source-id: 91a21ee76b2273b1032507bdebe57e016717181d
Summary:
**Closes:** Confusing documentation with distributions.Categorical about logits https://github.com/pytorch/pytorch/issues/16291
**Solution**: Changes the documentation of the Categorical distribution from `log probabilities` to `event log-odds`. This should reduce the confusion raised in that issue, and is consistent with other distributions such as `torch.Binomial`.
More than happy to make any other changes if they fit :).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21707
Differential Revision: D15799181
Pulled By: soumith
fbshipit-source-id: f11acca7a5c130102a3ff6674640235ee5aa69bf
Summary:
- [x] Add tests after https://github.com/pytorch/pytorch/pull/20256 is merged
- Support exporting ScriptModule with inputs/outputs of arbitrarily constructed tuples.
- Moved the assigning of output shapes to after graph conversion to ONNX is completed. By then all tuples in the IR have already been lowered by the pass ```_jit_pass_lower_all_tuples```. If assigning output shapes were required to happen before that, we'd need to hand-parse the tuple structures in the graph and repeat the same logic as ```_jit_pass_lower_all_tuples```. Handling inputs is easier because all tuple information is encoded within the input tensor type.
- Swap the order of ```_jit_pass_lower_all_tuples``` and ```_jit_pass_erase_number_types```. Ops like ```prim::TupleIndex``` rely on the index being a scalar. ```_jit_pass_erase_number_types``` will convert these kinds of scalars to tensors.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20784
Reviewed By: zrphercule
Differential Revision: D15484171
Pulled By: houseroad
fbshipit-source-id: 4767a84038244c929f5662758047af6cb92228d3
Summary:
This renames the CMake `caffe2` target to `torch`, as well as renaming `caffe2_gpu` to `torch_gpu` (and likewise for other gpu target variants). Many intermediate variables that don't manifest as artifacts of the build remain for now with the "caffe2" name; a complete purge of `caffe2` from CMake variable names is beyond the scope of this PR.
The shell `libtorch` library that had been introduced as a stopgap in https://github.com/pytorch/pytorch/issues/17783 is again flattened in this PR.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20774
Differential Revision: D15769965
Pulled By: kostmo
fbshipit-source-id: b86e8c410099f90be0468e30176207d3ad40c821
Summary:
Class member annotations can be marked with `Final[T]` instead of adding them to `__constants__`. `Final` comes from the `typing_extensions` module (which will be used if it is present). If not, the polyfill from `_jit_internal` is exposed as `torch.jit.Final` for users that don't want to install `typing_extensions`.
This keeps around `__constants__` since a lot of code is still using it, but in documentation follow ups we should change the examples to all to use `Final`.
TODO: install typing_extensions on CI, move tests to a Python3 only file when #21489 lands
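The import-fallback pattern described above can be sketched like this (using stdlib `typing.Final` as the fallback for illustration; the PR's actual fallback is the `torch.jit.Final` polyfill):

```python
try:
    from typing_extensions import Final  # preferred if installed
except ImportError:
    from typing import Final  # stdlib Final, available on Python 3.8+

class Config:
    # Marked Final instead of being listed in __constants__.
    num_layers: Final[int] = 4
```

Either import gives the same annotation semantics, so user code doesn't need to care which module provided `Final`.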
](https://our.intern.facebook.com/intern/diff/15746274/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21603
Pulled By: driazati
Differential Revision: D15746274
fbshipit-source-id: d2c9b5643b4abba069b130c26fd42714c906ffac
Summary:
This adds support for PEP 526 style annotations on assignments in place of
`torch.jit.annotate()`, so
```python
a = torch.jit.annotate(List[int], [])
```
turns into
```python
a : List[int] = []
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21390
Differential Revision: D15790937
Pulled By: driazati
fbshipit-source-id: 0cc204f7209a79839d330663cc6ba8320d3a4120
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21177
- Integrate c10::ListPtr into IValue and the c10 dispatcher.
- Streamline conversion to/from IValue. Before, we had IValue::to<> and kernel_functor.h had its own ivalue_to_arg_type and return_type_to_ivalue. They are now unified. Also, this means that nested types like Dicts of Lists of Optional of Dict of ... do work as expected now
Differential Revision: D15476433
fbshipit-source-id: bde9df80df20091aa8e6ae17ba7e90abd149b954
Summary:
Accidentally rebased the old PR and make it too messy. Find it here (https://github.com/pytorch/pytorch/pull/19274)
Create a PR for comments. The model is still WIP but I want to get some feedback before moving too far. The transformer model depends on several modules, like MultiheadAttention (landed).
Transformer is implemented based on the paper (https://papers.nips.cc/paper/7181-attention-is-all-you-need.pdf). Users have the flexibility to build a transformer with self-defined and/or built-in components (i.e encoder, decoder, encoder_layer, decoder_layer). Users could use Transformer class to build a standard transformer model and modify sub-layers as needed.
Add a few unit tests for the transformer module, as follow:
TestNN.test_Transformer_cell
TestNN.test_transformerencoderlayer
TestNN.test_transformerdecoderlayer
TestNN.test_transformer_args_check
TestScript.test_scriptmodule_transformer_cuda
There is another demonstration example for applying transformer module on the word language problem. https://github.com/pytorch/examples/pull/555
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20170
Differential Revision: D15417983
Pulled By: zhangguanheng66
fbshipit-source-id: 7ce771a7e27715acd9a23d60bf44917a90d1d572
Summary:
Currently we don't have any Linux libtorch binary build in the PR CI, which led to nightly build failure such as https://circleci.com/gh/pytorch/pytorch/1939687. This PR adds Linux libtorch CPU binary build to prevent such breakage from happening in the future.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21671
Differential Revision: D15785003
Pulled By: yf225
fbshipit-source-id: d1f2e4235e48296ddecb3367f8e5a0df16f4ea49
Summary:
Fix https://github.com/pytorch/pytorch/issues/20421
`ProcessGroupGloo` only requires input/output tensors to be contiguous. Contiguous tensors might not start from the beginning of the underlying storage, e.g., `chunk(..., dim=0)[1]`. The current implementation passes `tensor.storage().data()` ptr to gloo buffer. This leads to wrong results if the tensor has a non-zero storage offset.
The proposed solution is to use `tensor.data_ptr()` instead. Let's see if this breaks any tests.
cc qijianan777
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21490
Differential Revision: D15768907
Pulled By: mrshenli
fbshipit-source-id: 9d7d1e9baf0461b31187c7d21a4a53b1fbb07397
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21592
We now support groupwise convolutions for qconv2d
Reviewed By: zafartahirov
Differential Revision: D15739239
fbshipit-source-id: 80b9b4fef5b9ee3d22ebecbaf205b970ab3d4250
Summary:
Closes https://github.com/pytorch/pytorch/issues/21344
DDP assigns the original module to the first module replica instead of creating a new one. Then, it creates a new Reducer to add post hooks to sync gradients. However, because every reconstructed DDP instance wraps the same original module, all their reducers will add hooks to the same set of variables. This PR deletes DDP hooks from variables when destructing Reducer, trying to make DDP failure recoverable.
pietern kuttas and I discussed the following solutions:
#### Solution 1
Keep `add_post_hook` API intact, and do a `dynamic_cast` in `del_post_hook` to check hook type. If the type matches Reducer's hook, delete it. As pietern mentioned, this will not work if we create multiple DDP instances from the same original model.
#### Solution 2
Use a counter to generate a unique key for every hook in `Function`, and keep them in a map. return the key to the caller of `add_post_hook`, and ask the caller to provide key if it needs to delete the hook.
Con: this would add extra overhead to `add_post_hook` and every `Function` object.
#### Solution 3 [Current implementation]
kuttas suggests that, instead of generating a unique key, directly using the address of the pointer would be better. In order to avoid messing up dereferencing, let `add_post_hook` return a `uintptr_t`.
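Solution 3 can be sketched in Python, with `id()` playing the role of the C++ pointer address cast to `uintptr_t` (an illustrative analogy, not the actual autograd `Function` code):

```python
class Function:
    """Sketch: post-hooks keyed by their own object address."""

    def __init__(self):
        self._post_hooks = {}

    def add_post_hook(self, hook):
        # id() stands in for reinterpret_cast<uintptr_t>(hook_ptr): the hook's
        # address is its key, so no counter or extra state is needed.
        key = id(hook)
        self._post_hooks[key] = hook
        return key

    def del_post_hook(self, key):
        # The Reducer's destructor hands the key back to remove only its hook,
        # leaving hooks installed by other DDP instances untouched.
        self._post_hooks.pop(key, None)
```

Because the key identifies one specific hook object, two DDP instances wrapping the same module can each remove their own hook safely.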
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21591
Differential Revision: D15745706
Pulled By: mrshenli
fbshipit-source-id: e56d2d48de0c65f6667790ab16337eac7f7d8b76
Summary:
This makes it so we can see the output of prim::Print in environments like IPython notebooks which override sys.stdout.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21625
Differential Revision: D15756793
Pulled By: jamesr66a
fbshipit-source-id: 7d9a14b2e229ed358e784318e9d862677db2c461
Summary:
Emit loop condition as a separate block in loops, then inline them before conversion to SSA. This is needed for breaks & continues where we will inline the condition block after the continue pass and before the break pass.
I also considered emitting a prim::For and a prim::While, but I think it's easier to just have one pathway.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21611
Differential Revision: D15775820
Pulled By: eellison
fbshipit-source-id: de17c5e65f6e4a0256a660948b1eb630e41b04fb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21606
StoreMatrixInMatrixMarketFormat could only dump quantized tensors, but sometimes we want to dump float tensors.
Reviewed By: csummersea
Differential Revision: D15741611
fbshipit-source-id: 95b03c2fdf1bd8407f7d925171d9dc9f25677464
Summary:
Stream is not respected on range/linspace/logspace functions, which contributes to https://github.com/pytorch/pytorch/issues/21589 (this is not a complete solution for that issue).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21619
Differential Revision: D15769666
Pulled By: ezyang
fbshipit-source-id: 7c036f7aecb3119430c4d432775cad98a5028fa8
Summary:
Resolves issue https://github.com/pytorch/pytorch/issues/19003
The author of this issue also asked that `cycle_momentum` default to `False` if the optimizer does not have a momentum parameter, but I'm not sure what the best way to do this would be. Silently changing the value based on the optimizer may confuse the user in some cases (say the user explicitly set `cycle_momentum=True` but doesn't know that the Adam optimizer doesn't use momentum).
Maybe printing a warning when switching this argument's value would suffice?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20401
Differential Revision: D15765463
Pulled By: ezyang
fbshipit-source-id: 88ddabd9e960c46f3471f37ea46013e6b4137eaf
Summary:
This adds support for PEP 526 style annotations on assignments in place of
`torch.jit.annotate()`, so
```python
a = torch.jit.annotate(List[int], [])
```
turns into
```python
a : List[int] = []
```
](https://our.intern.facebook.com/intern/diff/15706021/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21390
Pulled By: driazati
Differential Revision: D15706021
fbshipit-source-id: 8bf1459f229d5fd0e16e59953b9656e85a2207fb
Summary:
Ops on a Process Group (pg) instance will hit an error when input/output tensors are created on a different process, because, pg calls `recordStream` on `CUDACachingAllocator` which only knows tensors created within the same process.
The proposed solution is to add a `suppressError` arg (suggestions for better names?) to `recordStream`. See comments in code for arguments.
CC pichuang1984
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21449
Differential Revision: D15689736
Pulled By: mrshenli
fbshipit-source-id: e7fc81b167868f8666536067eaa7ae2c8584d88e
Summary:
1. reduce the overhead of mkldnn-bridge itself
2. remove redundant code and useless APIs
3. provide new operators, including int8 inner_product, ND permute/transpose, elem_add/mul, etc.
4. improve inner_product to support io format weights without implicit reorder
5. add SoftMax support
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20569
Reviewed By: houseroad
Differential Revision: D15558663
Pulled By: bddppq
fbshipit-source-id: 79a63aa139037924e9ffb1069f7e7f1d334efe3a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21207
This diff adds 80 PT pointwise unary ops to the benchmark suite. Most of the ops are added using the generate_pt_tests_from_list interface. The rest are handled separately.
Reviewed By: zheng-xq
Differential Revision: D15471597
fbshipit-source-id: 8ea36e292a38b1dc50f064a48c8cd07dbf78ae56
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21210
This diff introduces a new path to run op with JIT. There are two steps involved here:
1. Users need to script the op. This should happen in the `init` method.
2. The generated graph from step1 is passed to `jit_forward` which will be executed by the benchmark backend
Reviewed By: zheng-xq
Differential Revision: D15460831
fbshipit-source-id: 48441d9cd4be5d0acebab901f45544616e6ed2ee
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20723
These classes already existed but only as c10::Dict and c10::OperatorKernel.
Since they're now part of torch::RegisterOperators(), they should also live in the torch namespace.
Differential Revision: D15421575
fbshipit-source-id: d64ebd8664fadc264bbbae7eca1faa182529a32b
Summary:
yf225 helped me discovered that our CI does not run multi-gpu tests in `test_c10d.py`. There are quite a few multi-gpu c10d tests. This PR tries to enable those tests.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21598
Differential Revision: D15744256
Pulled By: mrshenli
fbshipit-source-id: 0a1524a862946128321f66fc8b7f331eff10e52a
Summary:
Create an uninitialized ivalue. This will be needed for breaks & continues, to match up if-block outputs for values that are guaranteed not to be used but need to escape the block scope. It is not exposed to users.
Was previously part of final returns but I was asked to make a separate PR for it.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21387
Differential Revision: D15745124
Pulled By: eellison
fbshipit-source-id: ae6a6f766b4a70a71b9033987a630cfbf044e296
Summary:
For consistency, derivatives.yaml now uses the same schema specification as native_functions.yaml.
Note that there are some small downsides, e.g. changing the default values or return parameter names in native_functions.yaml also now requires updating derivatives.yaml as well. But this has a few nice properties:
1) Able to copy-paste definitions from native_functions to derivatives.
2) Makes it impossible to write derivatives for operators without schemas (e.g. old TH operators).
3) Moves us closer to the ideal situation of co-locating forward and backwards declarations.
Note that this doesn't change any generated code; in particular, this has the same behavior of mapping in-place and out-of-place definitions together.
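For illustration, a derivatives.yaml entry under the unified schema might look like the following (sketched from memory; treat the exact formula names as approximate):

```yaml
# The `name` line is copy-pasteable from native_functions.yaml,
# including default values and return names.
- name: add(Tensor self, Tensor other, *, Scalar alpha=1) -> Tensor
  self: grad
  other: maybe_multiply(grad, alpha)
```

The key point is that the schema line is now identical to the one in native_functions.yaml, rather than an abbreviated variant.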
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20916
Differential Revision: D15497800
Pulled By: gchanan
fbshipit-source-id: baee5caf56b675ce78dda4aaf6ce6a34575a6432
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21599
We prevented this because c10 ops can't have a backwards yet and calling them with requires_grad=True would do the wrong thing
if the c10 op is not purely implemented by calling other autograd-able ops.
However, it is a valid use case to have c10 ops that just call other autograd-aware ops, and these ops should be callable with requires_grad=True.
This should fix https://github.com/pytorch/pytorch/issues/21584.
Differential Revision: D15744692
fbshipit-source-id: ba665365c850ef63fc9c51498fd69afe49e5d7ec
Summary:
An incorrect increment / decrement caused the samples to not be generated from a multinomial distribution
Changelog:
- Remove the incorrect increment / decrement operation
Fixes #21257, fixes #21508
cc: LeviViana neerajprad
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21324
Differential Revision: D15717575
Pulled By: ezyang
fbshipit-source-id: b1154e226d426c0d412d360c15f7c64aec95d101
Summary:
Fix a test that wasn't run on the CI, but is tested internally.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21594
Differential Revision: D15742157
Pulled By: eellison
fbshipit-source-id: 11fc82d1fc0281ffedd674ed96100e0c783c0599
Summary:
This PR addresses some numerical issues of Sigmoid/StickBreakingTransform, where these transforms give +-inf when the unconstrained values move to +-20 areas.
For example, with
```
t = torch.distributions.SigmoidTransform()
x = torch.tensor(20.)
t.inv(t(x)), t.log_abs_det_jacobian(x, t(x))
```
currently the inverse returns `inf` and logdet returns `-inf`, while this PR makes them `15.9424` and `-15.9424`.
And for
```
t = torch.distributions.StickBreakingTransform()
x = torch.tensor([20., 20.])
t.inv(t(x)), t.log_abs_det_jacobian(x, t(x))
```
current value is `(inf, nan)` and `-inf` for logdet, while this PR makes it `[16.6355, 71.3942]` and `-47.8272` for logdet.
Although these finite values are wrong and seems unavoidable, it is better than returning `inf` or `nan` in my opinion. This is useful in HMC where despite that the grad will be zero when the unconstrained parameter moves to unstable area (due to clipping), velocity variable will force the parameter move to another area which by chance can move the parameter out of unstable area. But inf/nan can be useful to stop doing inference early. So the changes in this PR might be inappropriate.
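The clamping idea behind the stabilized inverse can be sketched in plain Python (`sigmoid_inv` is my own illustrative helper, not the transform's actual implementation, and `1e-7` is an arbitrary eps for the sketch):

```python
import math

def sigmoid_inv(y, eps=1e-7):
    # Clamp y away from 0 and 1 so the inverse stays finite even when the
    # forward sigmoid saturated (e.g. sigmoid(20.) rounds to exactly 1.0
    # in low precision, whose true inverse would be inf).
    y = min(max(y, eps), 1.0 - eps)
    return math.log(y) - math.log1p(-y)
```

The recovered value is wrong in magnitude (bounded by the clamp) but finite, which is the trade-off the PR discussion weighs for HMC.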
I also fix some small issues of `_Simplex` and `_RealVector` constraints where batch shape of the input is not respected when checking validation.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20288
Differential Revision: D15742047
Pulled By: ezyang
fbshipit-source-id: b427ed1752c41327abb3957f98d4b289307a7d17
Summary:
This changes our compiler so it first emits Loads & Stores, and then transforms the graph to SSA in a follow up pass. When a variable is set, we emit a prim::Store, and when a variable is referenced, we emit a prim::Load.
```
a = 1
print(a)
```
becomes:
```
%a.1 : int = prim::Constant[value=1]()
prim::Store[name="a"](%a.1)
%a : int = prim::Load[name="a"]()
prim::Print(%a)
```
In the follow up pass, convertToSSA, the values are turned into SSA form with the Loads & Stores removed. This change will enable breaks and continues because you can transform the graph with the variable naming information still intact.
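A toy sketch of the idea for straight-line code (hypothetical helper; the real convertToSSA pass also handles control flow, which is the whole point of keeping the names around):

```python
def to_ssa(instructions):
    """Replace each ("load", name) with the value of the most recent
    ("store", name, value), dropping the loads and stores themselves."""
    env = {}   # variable name -> SSA value most recently stored
    out = []
    for inst in instructions:
        if inst[0] == "store":
            _, name, value = inst
            env[name] = value               # remember the latest definition
        elif inst[0] == "load":
            _, name = inst
            out.append(("use", env[name]))  # rewire the use to the stored value
        else:
            out.append(inst)
    return out

# mirrors the example above: store %a.1 into "a", then load "a" for the print
prog = [("store", "a", "%a.1"), ("load", "a")]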
There are still some remaining jitter and edge-case issues that I have to look through, but I think it is still ready for review.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21101
Differential Revision: D15723353
Pulled By: eellison
fbshipit-source-id: 3269934d4bc24ddaf3a87fdd20620b0f954d83d0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21382
The Concat tensor inference function was not correctly handling the case where the axis argument points to the last dimension, in which case input tensors don't need to have the same number of dimensions.
The Split tensor inference function was not correctly handling the case where split information is provided as the second input tensor rather than as an argument.
Reviewed By: mdschatz
Differential Revision: D15633148
fbshipit-source-id: d566af44dc882457ee9efe83d2461b28408c2c5d
Summary:
Should be self-explanatory. This `int` variable is overflowing.
Reported in #21526
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21530
Differential Revision: D15719275
Pulled By: umanwizard
fbshipit-source-id: 24e917a00a5b78bc3af29ef3b8b72eea7e89d5d5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21556
Optimize the batch mm op when broadcasting the second input
Reviewed By: houseroad
Differential Revision: D15728914
fbshipit-source-id: c60441d69d4997dd32a3566780496c7ccda5e67a
Summary:
This was looking at the number of elements in the memo table, not the total capacity, and was thus calling reserve() a lot more than it should have
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21542
Reviewed By: driazati
Differential Revision: D15723132
Pulled By: jamesr66a
fbshipit-source-id: 20e1f9099b6a51a33994ea9dbc3f22eb3bc0c8f9
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21195
The motivation is that, while we shouldn't break USER code for using
deprecated declarations, we should keep our internal code base
deprecation clean.
Differential Revision: D15576968
fbshipit-source-id: fb73a8986a5b60bf49ee18260653100319bb1030
Summary:
namedtensor build + test should run on PRs only if the commit message
includes [namedtensor ci].
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21520
Differential Revision: D15718404
Pulled By: zou3519
fbshipit-source-id: ce8b5df2682e795e64958a9d49e2e3c091599b33
Summary:
This should further reduce noise by only clang-formatting the lines you actually touched in the precommit hook.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15657
Differential Revision: D15717337
Pulled By: suo
fbshipit-source-id: 57e65a679a8fdee5c3ff28e241c74ced9398eb0c
Summary:
The new implementation of tracing supports more modules, so a lot of error-handling code can be removed by replacing the old one (LegacyTracedModule).
cc orionr
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21339
Reviewed By: natalialunova
Differential Revision: D15695154
Pulled By: orionr
fbshipit-source-id: af7d35754e9f34bd1a0ad7b72a9ebe276ff8ab98
Summary:
Fixes#12259, needs to make sure tests (see #13766) don't break due to numerical precision issues. Not sure what would need to be adjusted here...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13774
Differential Revision: D15715021
Pulled By: ezyang
fbshipit-source-id: 20ce2beee1b39ebe9f023c5f2b25be53acccb5f3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21492
If one async operator failed, async_scheduling net currently only marks all scheduled async operators as finished without cancelling the callbacks.
The new behavior is to cancel the callbacks first, then set event status to finished.
Reviewed By: ilia-cher
Differential Revision: D15702475
fbshipit-source-id: 55a1774d768b2e238bab859b83332f1877a001ca
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21502
In BenchResult, we keep name, avg_fwd, std_fwd, avg_bwd, and std_bwd. There is no information about the number of each iteration. In this diff, I am adding more info to BenchResult to include the number reported from each iteration.
Reviewed By: wanchaol
Differential Revision: D15706306
fbshipit-source-id: 3f14be4ba91f1f6da473995783bd7af1d067938d
Summary:
This moves `JitTestCase` to its own file so that we can have other jit
test files (ex. `test_jit_py3.py`)
There aren't any code changes, just a move and cleaning up the imports
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21491
Pulled By: driazati
Differential Revision: D15703060
fbshipit-source-id: 6082e8b482100bb7b0cd9ae69738f1273e626171
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21230
tsia; this diff adds empty-tensor support to the reshape operator
Reviewed By: jerryzh168
Differential Revision: D15583356
fbshipit-source-id: 6d44c04e95ca3546509bfb12102e29c878f9a7c7
Summary:
Modify MKLDNN pooling operation to support ceil mode by adjusting the right/bottom padding accordingly. This is done similarly as in Caffe (see discussion https://github.com/pytorch/pytorch/pull/19205#discussion_r276903751).
To make this possible, I split the padding to left and right (top / bottom). This naming is confusing but actually follows mkldnn's own naming for pooling::compute(). We increase the r paddings so that it matches the ceiling mode expected output size.
Strengthened the test case.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21310
Reviewed By: bddppq
Differential Revision: D15611664
Pulled By: akyrola
fbshipit-source-id: 46b40015dafef69a8fd5e7b2c261d8dbf448cd20
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21393
Result of splitting the base diff. We moved a header from src/* to include/fbgemm/*
Reviewed By: jianyuh
Differential Revision: D15635188
fbshipit-source-id: ad7d0ddba964ff1cb8b2e33f5f98e457a4d2eac9
Summary:
changed `UpsampleBilinearKernel` s.t. the throughput increased by 40~50%.
I tested locally with my own test code -- **not pytorch's provided test code** -- because I am having a build problem (which I made an issue about [here](https://github.com/pytorch/pytorch/issues/19184)). I tested with various tensor sizes, and across all the sizes it showed a significant increase in throughput.
1. added `__restrict__`
2. instead of launching as many threads as there are output elements, I launched only `output_height * output_width` threads and had each thread iterate through the channel and batch dimensions.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19306
Differential Revision: D15701840
Pulled By: ezyang
fbshipit-source-id: 53c54d4f4e4a28b58ecc7d7ae6b864cbfc760e27
Summary:
Currently, when the input of MVN is precision matrix, we take inverse to convert the result to covariance matrix. This, however, will easily make the covariance matrix not positive definite, hence will trigger a cholesky error.
For example,
```
import torch
torch.manual_seed(0)
x = torch.randn(10)
P = torch.exp(-(x - x.unsqueeze(-1)) ** 2)
torch.distributions.MultivariateNormal(loc=torch.ones(10), precision_matrix=P)
```
will trigger `RuntimeError: cholesky_cpu: U(8,8) is zero, singular U.`
This PR uses some math tricks ([ref](https://nbviewer.jupyter.org/gist/fehiepsi/5ef8e09e61604f10607380467eb82006#Precision-to-scale_tril)) to only take inverse of a triangular matrix, hence increase the stability.
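A self-contained NumPy sketch of the triangular trick (variable names are mine; the actual implementation uses torch ops inside torch.distributions). Reversing the precision matrix before the Cholesky yields an upper-triangular factor `U` with `P = U @ U.T`, so the scale_tril is the inverse of a triangular matrix rather than of `P` itself:

```python
import numpy as np

def precision_to_scale_tril(P):
    # Cholesky of the reversed precision matrix; un-reversing the factor
    # gives an upper-triangular U with P = U @ U.T, so inverting U.T (a
    # triangular solve) yields a lower-triangular L with L @ L.T == inv(P),
    # without ever inverting P directly.
    Lf = np.linalg.cholesky(P[::-1, ::-1])
    L_inv = Lf[::-1, ::-1].T                     # lower triangular == U.T
    return np.linalg.solve(L_inv, np.eye(P.shape[0]))

# example: a well-conditioned SPD precision matrix
A = np.array([[2.0, 0.5], [0.5, 1.0]])
P = A @ A.T + np.eye(2)
L = precision_to_scale_tril(P)
```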
cc fritzo, neerajprad , SsnL
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21366
Differential Revision: D15696972
Pulled By: ezyang
fbshipit-source-id: cec13f7dfdbd06dee94b8bed8ff0b3e720c7a188
Summary:
This PR addresses the problem described in the comment: https://github.com/pytorch/pytorch/pull/20203#issuecomment-499231276
and fixes the previously coded bad behaviour:
- a warning was raised every time lr scheduling was initialized
Now the code checks that:
- on the second call of `lr_scheduler.step`, ensure that `optimizer.step` has already been called, otherwise raise a warning (as was done in #20203)
- if the optimizer's step is overridden -> raise another warning (once) to make the user aware of the new pattern:
`opt.step()` -> `lrs.step()`, since we cannot check this.
Now tests check that
- at initialization (`lrs = StepLR(...)`) there are no warnings
- if we replace `optimizer.step` by something else (similarly to the [code of nvidia/apex](https://github.com/NVIDIA/apex/blob/master/apex/amp/_process_optimizer.py#L287)) there is another warning raised.
cc ezyang
PS. honestly I would say that there is a lot of overhead introduced for simple warnings. I hope all these checks will be removed in future `1.2.0` or other versions...
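A stripped-down sketch of the check (class names are hypothetical; the real logic lives in torch.optim.lr_scheduler and tracks `optimizer.step` via a wrapper):

```python
import warnings

class _Optimizer:
    def __init__(self):
        self._step_count = 0
    def step(self):
        self._step_count += 1

class _Scheduler:
    def __init__(self, optimizer):
        self.optimizer = optimizer
        self._step_count = 0          # constructing the scheduler never warns
    def step(self):
        self._step_count += 1
        # only from the second scheduler step onward, check that the
        # optimizer has actually stepped at least once
        if self._step_count >= 2 and self.optimizer._step_count == 0:
            warnings.warn("Detected lr_scheduler.step() before optimizer.step(); "
                          "use the pattern opt.step() -> lrs.step()")

opt = _Optimizer()
sched = _Scheduler(opt)   # no warning here, matching the tests above
```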
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21460
Differential Revision: D15701776
Pulled By: ezyang
fbshipit-source-id: eac5712b9146d9d3392a30f6339cd33d90c497c7
Summary:
Fixes#21026.
1. Improve build docs for Windows
2. Change `BUILD_SHARED_LIBS=ON` for Caffe2 local builds
3. Change to out-source builds for LibTorch and Caffe2 (transferred to #21452)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21190
Differential Revision: D15695223
Pulled By: ezyang
fbshipit-source-id: 0ad69d7553a40fe627582c8e0dcf655f6f63bfdf
Summary:
Another simple bit of syntax that NumPy supports and we don't.
Support int, float, and bool.
```python
>>> torch.randn((2,3), dtype=float)
tensor([[-0.1752, -0.3240, -0.6148],
[ 0.1861, 1.6472, 0.1687]], dtype=torch.float64)
```
A bit confusingly, Python's "float" actually means double, but nothing we can do about that.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21215
Differential Revision: D15697012
Pulled By: umanwizard
fbshipit-source-id: 9a38d960a610b8e67023486b0c9265edd3c22246
Summary:
Adds support for recursively compiling `nn.Sequential` and
`nn.ModuleList`. When either is used, it is converted to a
`jit._ConstModuleList` or `jit._ConstSequential` as necessary. Due to
this, we don't need to add it to `__constants__` since it's made
constant on demand.
This PR also moves the recursive script tests out to their own class
`TestRecursiveScript` (the added test is called `test_iterable_modules`)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21306
Pulled By: driazati
Differential Revision: D15611738
fbshipit-source-id: fac52993990bd2dfad71d044c463a58a3759932a
Summary:
Enable bool tensors for these index methods:
- index_select
- index_copy
- put
- take
- index_fill
Tested via unit tests
TODO:
Enable index_add in a separate PR as it requires more "side" changes.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21435
Differential Revision: D15684964
Pulled By: izdeby
fbshipit-source-id: 48440e4d44873d70c4577e017dd0d8977e0fa15a
Summary:
`torch.tensor([True, False, True], dtype=torch.bool).sum()` should return **2** instead of **True** as it does now.
Tested via unit tests
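The intended semantics mirror plain Python, where summing booleans yields an integer count rather than a bool:

```python
# plain-Python analogue of the fixed behaviour: True counts as 1
values = [True, False, True]
total = sum(values)   # an int, not a bool
```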
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21421
Differential Revision: D15674203
Pulled By: izdeby
fbshipit-source-id: b00e3d0ca809c9b92b750adc05632522dad50c74
Summary:
Fixes#19540
CC nmerrill67
C++ data parallel was using Module.clone() to create module replicas on every destination device. However, clone() does not set up gradient edges to point from replicas to the original module. As a result, the gradient will not be aggregated into the original module. This commit fixes the problem by manually setting gradient edges from every parameter X in every replica to the same parameter X in the original module.
## Failed Attempt
Initially I tried implementing what we did in `replicate.py`, which
1. create module replicas
2. use Python `Broadcast` autograd function to broadcast every parameter in the original module to all destination devices.
3. assign the broadcast result params to module replicas' `_parameters` dict.
This works in Python because derived module member field params (e.g., `Linear.weight`) and base module `_parameters` (e.g., `Linear._parameters['weight']`) are referencing the same parameter instance. Assigning one of them will apply to both. However, in C++, even though I can modify Module's `parameters_ `values and gradient edges to point to the broadcast source, I cannot touch the weight and bias member fields in Linear, because replicate cannot (and should not) add special-case handlers to every different module. (See `Linear` [.h](https://github.com/pytorch/pytorch/blob/master/torch/csrc/api/include/torch/nn/modules/linear.h), [.cpp](https://github.com/pytorch/pytorch/blob/master/torch/csrc/api/src/nn/modules/linear.cpp)) Although they initially point to the same `TensorImpl` instance, after assigning to `Module.parameters_['weight']`, it will be different from `Linear.weight`.
## Solution Options
gchanan and I had several discussions on this issue and figured two solutions to this problem.
### Option One [implemented in this PR]
Replicate the module in two steps:
1. call `Module.clone()` to create a module replica on every destination device.
2. manually setting gradient edges from every parameter in every replica to the same parameter in the original module.
* Pro: Does not need to change any existing module, and relatively easier to implement
* Con: It is a little hackish.
### Options Two
Implement a `Replicatable` class (similar to `Cloneable`), and make it a friend class of `Module`. For more details see `Note [Replicating Modules]` in the code change.
* Pro: Maybe this aligns more with our existing approach implemented in `Cloneable`?
* Con: Require changes to every existing module.
I am inclined to go with option one, because `replicate` will only be used on data parallel. I feel it is too big an overkill if we have to change all existing module implementations due to a data parallel requirement.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20910
Differential Revision: D15556426
Pulled By: mrshenli
fbshipit-source-id: aa836290ec657b32742e2bea80bd0ac2404ef3b0
Summary:
Fixed an issue where models can not be loaded in a 32-bit environment like Raspbian.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20900
Differential Revision: D15696709
Pulled By: ezyang
fbshipit-source-id: 37a81f05f235d3b9fc6244e12d3320ced3d1465e
Summary:
Current versions of NVRTC incorrectly map error code 7 to the error string "NVRTC unknown error." This update maps error code 7 to the correct string explicitly in PyTorch. See the documentation at: https://docs.nvidia.com/cuda/nvrtc/index.html#group__error.
This may give us a better idea of the source of NVRTC errors that some community members, like Uber, have reported.
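A sketch of the explicit mapping, in Python for illustration (the actual fix is in C++; the enum name for code 7 is taken from the NVRTC documentation linked above and is worth double-checking there):

```python
# nvrtcResult codes per the NVRTC documentation; code 7 is the one that
# needs an explicit entry, since the driver's own string for it is wrong.
NVRTC_ERROR_STRINGS = {
    0: "NVRTC_SUCCESS",
    6: "NVRTC_ERROR_COMPILATION",
    7: "NVRTC_ERROR_BUILTIN_OPERATION_FAILURE",
}

def nvrtc_error_string(code):
    # fall back to the driver's generic string for unmapped codes
    return NVRTC_ERROR_STRINGS.get(code, "NVRTC unknown error")
```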
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21174
Differential Revision: D15696593
Pulled By: ezyang
fbshipit-source-id: f5c7b5876c07b311ab5f2d7c8e375e93273912c6
Summary:
Fixed #21269 by removing the expected `ValueError` when converting a tensor to a NumPy `int8` array in the Numba interoperability test.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21458
Differential Revision: D15696363
Pulled By: ezyang
fbshipit-source-id: f4ee9910173aab0b90a757e75c35925b026d1cc4
Summary:
I added default `weight` and `reduction` params to the `binary_cross_entropy_with_logits` function. These defaults already exist in Python and in the C++ `binary_cross_entropy` function.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21336
Differential Revision: D15628917
Pulled By: ezyang
fbshipit-source-id: 38e5f53851125238842df1bd71cb6149c8603be1
Summary:
This could serve as an alternative solution for exporting ```torch.gather``` before something similar goes into the ONNX spec. The exported model is verified to be correct against the onnxruntime backend. We weren't able to test against the Caffe2 backend because it doesn't seem to support OneHot in opset 9.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21235
Differential Revision: D15613039
Pulled By: houseroad
fbshipit-source-id: 7fc097f85235c071474730233ede7d83074c347f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21440
This diff modifies the output format when ai_pep_format is enabled.
Reviewed By: hl475
Differential Revision: D15681042
fbshipit-source-id: df5f2dbb38d1bd866ca7f74ef4e63459d480be6e
Summary:
We have encountered a `std::bad_cast` error when running a PyTorch binary built with the cxx11 ABI on CentOS7; stack trace:
```
#0 0x00007fec10160207 in raise () from /lib64/libc.so.6
#1 0x00007fec101618f8 in abort () from /lib64/libc.so.6
#2 0x00007fec015767d5 in __gnu_cxx::__verbose_terminate_handler() () from /lib64/libstdc++.so.6
#3 0x00007fec01574746 in ?? () from /lib64/libstdc++.so.6
#4 0x00007fec01574773 in std::terminate() () from /lib64/libstdc++.so.6
#5 0x00007fec01574993 in __cxa_throw () from /lib64/libstdc++.so.6
#6 0x00007fec015c94d2 in std::__throw_bad_cast() () from /lib64/libstdc++.so.6
#7 0x00007feb2ab3c2d7 in std::__cxx11::numpunct<char> const& std::use_facet<std::__cxx11::numpunct<char> >(std::locale const&) ()
from /root/.local/lib/python2.7/site-packages/torch/lib/libcaffe2.so
#8 0x00007feb28643d62 in torch::jit::script::strtod_c(char const*, char**) () from /root/.local/lib/python2.7/site-packages/torch/lib/libcaffe2.so
```
We suspect this line gets compiled into a GCC-ABI-dependent symbol:
```
char decimal_point = std::use_facet<std::numpunct<char>>(std::locale()).decimal_point();
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21293
Differential Revision: D15609910
Pulled By: bddppq
fbshipit-source-id: e247059729863868e4b36d6fec4fcbc36fbc4bb1
Summary:
Fixing an incorrect implementation of the CELU activation function. The existing implementation works by a chance combination of errors that seem to cancel each other out. This change makes the code more readable, aligns the parameter names correctly, and is consistent with the cuda implementation.
I came across this issue while working on version counters... I attempted to specify a gradient in derivatives.yaml for CELU due to a failed test, but the derivative couldn't be specified correctly without fixing the celu implementation.
https://github.com/pytorch/pytorch/pull/20612
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21213
Differential Revision: D15678823
Pulled By: nairbv
fbshipit-source-id: 29fa76b173a66c2c44ed2e0b7959e77f95d19c43
Summary:
This PR is a continuation of #15310, which itself is a continuation of #14845, #14941, & #15293. It should be synced up with the pytorch/master branch as of yesterday.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19465
Differential Revision: D15632268
Pulled By: ezyang
fbshipit-source-id: 8e337e8dc17ac31439935ccb530a7caf77f960e6
Summary:
We want to be able to call stft from a torchscript which requires that stft have a type annotation
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21302
Differential Revision: D15607973
Pulled By: cpuhrsch
fbshipit-source-id: c4a5c09cdaafe7e81cf487a3ad216d1b03464a21
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21392
as discussed at https://github.com/pytorch/pytorch/pull/21244, we
found that some values in log_beta were not properly initialized. This diff will 1)
initialize all log_beta to -inf; 2) fix a tricky compare condition; and 3) zero out all
the gradient elements corresponding to padding.
Offline experiments show that this diff can fix previous seen NaN loss.
Differential Revision: D15637977
fbshipit-source-id: 477008a5e11aae946bd2aa401ab7e0c513421af0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21398
The Module::forward method calls the find_method() function, potentially from multiple threads.
Internally it calls the find_offset() method and reads the dict_ object.
If the corresponding name is not in the dictionary, the thread calls the insert() method and modifies the dict_ object.
While the first thread is modifying the dict_ object, another thread can enter the forward()->find_method()->find_offset() path
and access the dict_ object for reading while it is being modified -> crash.
Moved the mutex protection up to protect both the find_offset() and insert() calls.
Consider using a C++17 shared_mutex locking object instead of a recursive_mutex object.
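A Python analogue of the widened lock scope (names hypothetical; the fix itself is in C++):

```python
import threading

class MethodTable:
    """One lock must cover both the lookup and the insert; locking only the
    insert lets a concurrent reader observe the dict mid-mutation."""
    def __init__(self):
        self._lock = threading.Lock()
        self._methods = {}

    def find_or_insert(self, name, make_method):
        with self._lock:                 # protects lookup + insert together
            if name not in self._methods:
                self._methods[name] = make_method()
            return self._methods[name]
```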
Reviewed By: bddppq
Differential Revision: D15638942
fbshipit-source-id: ca6a453448302a0b3666c87724755fa4e9ce242f
Summary:
Something flaky is going on with `test_inplace_view_saved_output` on Windows.
With my PR #20598 applied, the test fails, even though there is no obvious reason it should be related, so the PR was reverted.
Based on commenting out various parts of my change and re-building, I think the problem is with the name -- renaming everything from `T` to `asdf` seems to make the test stop failing. I can't be sure that this is actually the case though, since I could just be seeing patterns in non-deterministic build output...
I spoke with colesbury offline and we agreed that it is okay to just disable this test on Windows for now and not block landing the main change. He will look into why it is failing.
**Test Plan:** I will wait to make sure the Windows CI suite passes before landing this.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21175
Differential Revision: D15566970
Pulled By: umanwizard
fbshipit-source-id: edf223375d41faaab0a3a14dca50841f08030da3
Summary:
Currently tools/build_pytorch_libs.py looks quite convoluted. This commit reorganizes the cmake-related functions into a separate file to make the code clearer.
---
This is hopefully helpful for further contribution for better integration with cmake.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21367
Differential Revision: D15636991
Pulled By: soumith
fbshipit-source-id: 44d76e4e77aec0ce33cb32962b6a79a7f82785da
Summary:
This default was incorrect and made printing in python not print file:line:col
This wasn't caught because FileCheck internally uses operator<< to print the graph, which has `true` hardcoded as the value. I've added more comprehensive tests to catch this
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21370
Differential Revision: D15631135
Pulled By: jamesr66a
fbshipit-source-id: c809e06fff4f0174eefeb89062024384b4944ef7
Summary:
I found this significantly speeds up incremental builds.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21334
Differential Revision: D15632994
Pulled By: suo
fbshipit-source-id: bb4af90f4400bffa90d168d82ff30fece5e3835c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21365
This diff adds new operators to benchmark_all_test so all the supported ops can be built as one binary
Reviewed By: hl475
Differential Revision: D15627328
fbshipit-source-id: b7ca550a279f485102a6a6bd47e4032c7beb9940
Summary:
The original PR (#16071) is not working anymore after `caffe2` and `torch` is unified. What's more, It is making the binary big since the optimizing flag is disabled on a very big project(the `torch` library used to be small, but it now applies on the whole `caffe2` and `caffe2_gpu` library). We need to get it reverted.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21335
Differential Revision: D15622163
Pulled By: soumith
fbshipit-source-id: 900bd400106d27a1512eed1e9f2288114f5f41bb
Summary:
This adds a regression test for the bug fix in #21236. Operations
involving CUDA tensors and CPU scalars should not copy the CPU scalar to
the device (because that is slow). They should instead "lift" the scalar
to a kernel parameter.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21253
Reviewed By: bddppq
Differential Revision: D15604080
Pulled By: colesbury
fbshipit-source-id: c14ded5d584499eaa5ea83337ffc50278205f3d6
Summary:
This solves the situation where, for example, someone instantiates LSTM with `dropout=0`, a Python integer. This works fine in Python, but JIT throws a type error because it expected float but got int
Resolves https://github.com/pytorch/lockdown/issues/65
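A simplified sketch of the promotion rule (hypothetical helper; the real change lives in the JIT's schema matching):

```python
def coerce_arg(expected, value):
    # accept a Python int where a float is expected (but not bool, which is
    # a subclass of int), mirroring Python's own implicit conversion
    if expected is float and type(value) is int:
        return float(value)
    if not isinstance(value, expected):
        raise TypeError(
            f"expected {expected.__name__}, got {type(value).__name__}")
    return value

# e.g. LSTM(dropout=0) passes an int where the schema expects float
dropout = coerce_arg(float, 0)
```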
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21304
Differential Revision: D15613153
Pulled By: jamesr66a
fbshipit-source-id: eabff76e3af3de0612583b37dbc5f7eab7e248a4
Summary:
This PR adds support for torch.rand export in the PyTorch ONNX exporter. There are other generator ops that need to be supported for export and they will added in subsequent PRs. This op is needed with priority for a model on our end.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20559
Differential Revision: D15379653
Pulled By: houseroad
fbshipit-source-id: d590db04a4cbb256c966f4010a9361ab8eb3ade3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20915
Clean up the unary processor code. Some questions are added in the comments to seek suggestions.
Reviewed By: pjh5
Differential Revision: D15448502
fbshipit-source-id: ef0c45718c1a06187e3fe2e4e59b7f20c641d9c5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21206
This diff changes the default test_name to be a globally unique value across tests. With that, users can list all the tests and choose to run a specific one.
Reviewed By: zheng-xq
Differential Revision: D15543508
fbshipit-source-id: 0814ef6a60d41637fed5245e30c282497cf21bb8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21149
The diff modifies the interface for PyTorch operators in the benchmark suite
Reviewed By: zheng-xq
Differential Revision: D15433897
fbshipit-source-id: e858183431eb37d90313356716c2de8709372b58
Summary:
This doesn't affect anything because we run constant pooling, and in the case of Closures and Forks creates unnecessary closures over constants.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21229
Differential Revision: D15587764
Pulled By: eellison
fbshipit-source-id: d5609b0a5697071fab5050eb9e03876ab9ebb27a
Summary:
~~This is work in progress due to its dependency on multiple pending PRs.~~
- [x] ONNX: Relax constraint on subgraph input/output type & shape check. https://github.com/onnx/onnx/pull/2009
- [x] PyTorch: Add infra to test_pytorch_onnx_caffe2.py to test ScriptModule models. https://github.com/pytorch/pytorch/pull/20256
This PR should partially resolve https://github.com/pytorch/pytorch/issues/17531. However, ideally we shouldn't need to insert cast (and reshape) nodes to help convert the loop condition.
- Added cast node for condition values before entering loop node. The ONNX spec only accepts Bool type, while in PyTorch if the condition value is an output from other node it could potentially have any integral type.
- Tidying up the exported ONNX loop subgraph input type & shape. According to ONNX spec, input "M" is exported as 0-d scalar tensor with type int64. input "Cond" is exported as incomplete tensor of type Bool without shape information. This is because through out the iteration, the rank of condition value is dynamic, either 0-d or 1-d, as long as it holds a single value.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20445
Differential Revision: D15534188
Pulled By: houseroad
fbshipit-source-id: d174e778529def05ee666afeee4b8fb27786e320
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21267
Replace AT_ASSERTM with TORCH_CHECK: AT_ASSERTM is deprecated.
Not sure whether ```AT_ASSERT``` is deprecated in favour of some new TORCH assert function.
Reviewed By: zafartahirov
Differential Revision: D15599242
fbshipit-source-id: 23f21a9a23dc3c147dc817e6d278066d0832e08d
Summary:
This PR improves performance of advanced indexing backward, partially solving #15245 (performance is still worse than gather, but not by such outrageous margins). Before, using benchmarking harness from #15245, cuda 10/V100:
```
Indexing is faster by at most -270.61607820767887 us on N: 16 D: 256 K: 1
Indexing is slower by at most 11127.466280784833 us on N: 16 D: 4096 K: 4096
```
after:
```
Indexing is faster by at most 23.524456737696028 us on N: 512 D: 4096 K: 4096
Indexing is slower by at most 186.24056029472553 us on N: 16 D: 1024 K: 4096
```
Strategy is to reuse embedding backward kernel, adapting it to handle unindexed dimensions in the beginning by launching additional threadblocks, and also allowing it to handle slices that are bigger than `65K*128`, that is hardly ever a problem for embedding. Still, integer indexing is baked in the kernel, and is important for performance, so for now bigger than 2G element tensors are not supported.
The main savings come from not having to expand index to all unindexed dimensions, and not sorting expanded index with incoming gradient values, but rather only sorting unexpanded index.
There are ways to make sorting overhead smaller (thanks mcarilli for suggestions) but I'll get to it when it becomes a real problem, or rather, when cuda graphs will force us to get rid of thrust::sort calls.
I've also added tests for indexing backward, before tests for index_put_ and indexing backward were non-existent.
This PR also fixes#20457 by casting indices to `self` backend.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20557
Differential Revision: D15582434
Pulled By: ezyang
fbshipit-source-id: 91e8f2769580588ec7d18823d99a26f1c0da8e2a
Summary:
Stacked on https://github.com/pytorch/pytorch/pull/21217
This adds support for recording file and line information during tracing, by extracting the top Python interpreter frame
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21247
Reviewed By: suo, driazati
Differential Revision: D15594553
Pulled By: jamesr66a
fbshipit-source-id: 72e1b3a46f1dabe3e83a608ec1a7d083bd1720f9
Summary:
Remove Dropout from the opset 10 blacklist.
ONNX Dropout was modified in opset 10, but only the output "mask" was modified, which is not exported in pytorch opset 9. So we can still fallback on the opset 9 op.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20710
Differential Revision: D15571248
Pulled By: houseroad
fbshipit-source-id: 15267eb63308a29a435261034b2f07324db1dea6
Summary:
We're not getting much from checking the export strings, and they are noisy and slow down development. Didn't realize they existed until now.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21238
Differential Revision: D15604256
Pulled By: eellison
fbshipit-source-id: 488e9401231228cffe132dab99d519563fa63afc
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21100
Added multifile flag to write scalar data into separate files. This can slow down dashboard loading.
Reviewed By: orionr
Differential Revision: D15548913
fbshipit-source-id: dd39a7f76f93025d28f14babbf933e39860e6910
Summary:
Loops.h contains specializations for cases where all the inputs are
contiguous as well as cases where one input is a scalar and all other
inputs are contiguous.
Previously, there were separate checks for each function that takes
zero, one, or two input arguments. This is getting unwieldy, especially
once we add support for functions that take three inputs (#21025).
This requires the use of recursive templates (which have their own
downsides), but this seems better than the alternative.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21106
Differential Revision: D15562430
Pulled By: colesbury
fbshipit-source-id: 5f19ab2212e16e29552887f4585c2b4a70309772
Summary:
Instead of attempting to hardcode calls to "ninja" or "make", we should always let cmake do it. This better integrates build configurations (DEBUG or REL_WITH_DEB_INFO) and better handles the case in which the native build tool is not in PATH (cmake has some capacity to find them and has options for users to specify their locations).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21105
Differential Revision: D15602883
Pulled By: soumith
fbshipit-source-id: 32ac46d438af00e791defde6ae5ac21c437d0bb0
Summary:
Retry #21197
The previous one failed because it uses some Python3 only syntax.
ezyang Do we still have multi-GPU py2 tests? I am curious why the CI tests did not catch this error.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21262
Differential Revision: D15598941
Pulled By: mrshenli
fbshipit-source-id: 95f416589448c443685d6d236d205b011998a715
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20883
Add autograd for layer_norm on CPU. After this diff, both PyTorch and JIT models can automatically benefit from the performance improvement of nn.functional.layer_norm
Reviewed By: zheng-xq
Differential Revision: D15483790
fbshipit-source-id: 94ed3b16ab6d83ca6c254dbcfb224ff7d88837f3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20665
Add gelu activation forward on CPU in pytorch
Compared to the current Python-implemented version of gelu in the BERT model:
    def gelu(self, x):
        return x * 0.5 * (1.0 + torch.erf(x / self.sqrt_two))
The torch.nn.functional.gelu function can reduce the forward time from 333ms to 109ms (with MKL) / 112ms (without MKL) for input size = [64, 128, 56, 56] on a devvm.
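For reference, the Python-side computation above can be written as a self-contained scalar function using `math.erf` (a sketch; `self.sqrt_two` in the snippet is assumed to be `sqrt(2)`):

```python
import math

def gelu(x: float) -> float:
    # GELU via the error function: x * 0.5 * (1 + erf(x / sqrt(2)))
    return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
```

The new `torch.nn.functional.gelu` computes the same thing elementwise on tensors, but in fused C++ code rather than as a chain of Python-level tensor ops.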
Reviewed By: zheng-xq
Differential Revision: D15400974
fbshipit-source-id: f606b43d1dd64e3c42a12c4991411d47551a8121
Summary:
cc ezyang this is meant to fix the fuser failures on master
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21252
Differential Revision: D15594283
Pulled By: jamesr66a
fbshipit-source-id: 85f37e78b2de051c92ade3fe4c44c7530b4542e5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21233
It is possible that OnnxifiOp is created in a thread where weights have been cleaned from the workspace, which is a legit use case, as we can create the backend once and lower all the weights. So we need to extract the weight shape info the first time we create the backend and save it.
Reviewed By: bertmaher, rdzhabarov
Differential Revision: D15587237
fbshipit-source-id: 1f264dc32c0398c42b618e9c41c119eb13e1c9f1
Summary:
Fixes#21108
When grad is disabled, Python autograd function outputs are [wrapped as detached aliases](8cde4c4d22/torch/csrc/autograd/python_function.cpp (L395-L399)), which prevents calling `Tensor.set_()` on them after recent changes in Tensors and Variables. This will hit a problem when users would like to call `rnn.flatten_parameters()` in the forward pass, as the function [calls `set_()`](9d09f5df6c/aten/src/ATen/native/cudnn/RNN.cpp (L669)).
The proposed solution is to avoid using an autograd Broadcast if in no_grad mode.
apsdehal
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21197
Differential Revision: D15577342
Pulled By: mrshenli
fbshipit-source-id: 1a024c572171a3f2daca9454fd3ee6450d112f7c
Summary:
I think there was a typo in #20690 here https://github.com/pytorch/pytorch/pull/20690/files#diff-b47a50873394e38a005b4c1acd151957R130.
Original conditional was ` common_backend == Backend::CUDA && op.tensor.type().backend() == Backend::CPU)`, now it is `op.device.is_cuda() && op.tensor.device().is_cpu()`. It seems that `op.device` and `op.tensor.device()` should be the same, so this conditional is never true. This leads to spurious h2d copies for operations between cuda tensors and cpu scalars, because cpu scalars are now sent to gpu, instead of being passed to lambdas directly.
Unfortunately, I don't know how to test this change, because functionally everything was fine after #20690, it was just a performance regression.
cc colesbury
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21236
Differential Revision: D15592754
Pulled By: soumith
fbshipit-source-id: 105bfecc61c222cfdb7294a03c9ecae3cc7f5817
Summary:
`Tensor.is_cuda` and `Tensor.is_leaf` are not predicate functions but `bool` attributes. This patch fixes the type hints in `torch/__init__.pyi` for those attributes.
```diff
- def is_cuda(self) -> bool: ...
+ is_cuda: bool
- def is_leaf(self) -> bool: ...
+ is_leaf: bool
```
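A minimal illustration of why the stub change matters, using a pure-Python stand-in (not the real `torch.Tensor`): as an attribute, `is_cuda` evaluates directly to a `bool`, whereas the old stub typed it as a method that had to be called:

```python
class FakeTensor:
    # mirrors the corrected stub: `is_cuda: bool` is a plain attribute
    def __init__(self, on_cuda: bool = False) -> None:
        self.is_cuda = on_cuda

t = FakeTensor()
assert isinstance(t.is_cuda, bool)  # attribute access, no call needed
assert t.is_cuda is False
```

With the old stub, type checkers would flag `if t.is_cuda:` as always-truthy, since a bound method is never falsy.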
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21192
Differential Revision: D15592766
Pulled By: soumith
fbshipit-source-id: 8c4ecd6939df8b8a8a19e1c9db6d40193bca7e4a
Summary:
This makes file-line reporting also work for things loaded using `torch.jit.load()` as well as the string frontend (via `CompilationUnit`)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21217
Differential Revision: D15590838
Pulled By: jamesr66a
fbshipit-source-id: 6b6a12574bf9eca0b83f24f0b50535fda5863243
Summary:
Studied why sparse tensor coalesce was slow: issue #10757.
Using nvprof and a simple benchmark, I determined the bulk of the time was spent in ``kernelTransformReduceInnermostDimIndex``, which is called when a sparse tensor is constructed with sparse_coo_tensor and sanity-checks the minimum and maximum indices. However, we do not need this sanity check here, because after coalescing the tensor these min/max values won't change.
On my benchmark with 1 million non-zeros, the runtime of coalesce dropped from 0.52 s to 0.005 s, roughly a 100x speedup.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21214
Reviewed By: bddppq
Differential Revision: D15584338
Pulled By: akyrola
fbshipit-source-id: a08378baa018dbd0b45d7aba661fc9aefd3791e0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21163
These two backend transformation share some common traits. Therefore we want to reuse the data struct/code as much as possible.
Reviewed By: hlu1
Differential Revision: D15561177
fbshipit-source-id: 35f5d63b2b5b3657f4ba099634fd27c3af545f1b
Summary:
Some of the functions are only used in this file - mark them `static`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21140
Differential Revision: D15578076
Pulled By: Krovatkin
fbshipit-source-id: 71ae67baabebd40c38ecb9292b5b8202ad2b9fc1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21152
Migrate existing add benchmark to use the new op front-end
Reviewed By: zheng-xq
Differential Revision: D15325524
fbshipit-source-id: 34e969e1bd289913d881c476711bce9f8ac18a29
Summary:
I will do loops in a follow-up after some other changes I am working on have landed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20911
Differential Revision: D15497205
Pulled By: eellison
fbshipit-source-id: 8cac197c6a6045b27b552cbb39e6fc86ca747b18
Summary:
Following on #19747, this implements most of the `torch.jit.script()` changes laid out in #20939.
Still to do:
* Accessing a method from Python does not add it as a `ScriptMethod` (so only `export`ed methods and `forward` are compiled)
* Calling a method other than `forward` on a submodule doesn't work
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20708
Pulled By: driazati
Differential Revision: D15560490
fbshipit-source-id: cc7ef3a1c2772eff9beba5f3e66546d2b7d7198a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21085
Now that torch::jit::RegisterOperators() always passes through to torch::RegisterOperators() (see diffs stacked below this), we can remove the old custom op implementation.
Reviewed By: dzhulgakov
Differential Revision: D15542261
fbshipit-source-id: ef437e6c71950e58fdd237d6abd035826753c2e4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21084
- Now AliasAnalysisKind can be set using the torch::RegisterOperators() API
- This also allows us to remove the last place in torch::jit::RegisterOperators that didn't use c10 yet.
Reviewed By: dzhulgakov
Differential Revision: D15542097
fbshipit-source-id: ea127ecf051a5c1e567e035692deed44e04faa9e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21181
Implement c10::OperatorOptions as a class to store metadata about operators.
This is meant to replace torch::jit::OperatorOptions.
Reviewed By: dzhulgakov
Differential Revision: D15569897
fbshipit-source-id: 95bf0bf917c1ef2bdf32702405844e1a116d9a64
Summary:
This reduces DenseNet load time by about 25% (down to 5.3s on my laptop) and gets AliasAnalysis out of the profile top hits entirely.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21203
Differential Revision: D15578155
fbshipit-source-id: ddbb1ad25c9540b5214702830084aa51cc6fd3cb
Summary:
Adds persistent cuda kernels that speed up SoftMax applied over the fast dimension, i.e. torch.nn.Softmax(dim=-1) and torch.nn.LogSoftmax(dim=-1). When the size is <= 1024, this code is 2-10x faster than the current code, speedup is higher for smaller sizes. This code works for half, float and double tensors with 1024 or fewer elements in the fast dimension. Numerical accuracy is on par with the current code, i.e. relative error is ~1e-8 for float tensors and ~1e-17 for double tensors. Relative error was computed against the CPU code.
The attached image shows kernel time in us for torch.nn.Softmax(dim=-1) applied to a half precision tensor of shape [16384,n], n is plotted along the horizontal axis. Similar uplifts can be seen for the backward pass and for LogSoftmax.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/20827
Differential Revision: D15582509
Pulled By: ezyang
fbshipit-source-id: 65805db37487cebbc4ceefb1a1bd486d24745f80
Summary:
This is a follow up on Jame's PR: https://github.com/pytorch/pytorch/pull/19041. The idea is to replace the legacy `sinh` / `cosh` ops that are being dispatched to TH with the operations defined in `Vec256` for better performance.
benchmark(from Jame's script):
```python
import torch, time

ops = ['sinh', 'cosh']
x = torch.rand(1024, 1024)
NITER = 10000

print('op', 'time per iter (ms)', 'gops/s', 'GB/s', sep='\t')
for op in ops:
    s = time.time()
    for i in range(NITER):
        getattr(x, op)()
    elapsed_sec = ((time.time() - s) / NITER)
    print(op, elapsed_sec * 1000, (1024*1024/elapsed_sec)/1e9, (1024*1024*4*2) / elapsed_sec / 1e9, sep='\t')
```
code on master:
```
op time per iter (ms) gops/s GB/s
sinh 3.37614369392395 0.3105839369002935 2.484671495202348
cosh 3.480502033233643 0.3012714803748572 2.4101718429988574
```
after change (on Macbook pro 2018):
```
op time per iter (ms) gops/s GB/s
sinh 0.8956503868103027 1.1707425301677301 9.365940241341841
cosh 0.9392147302627564 1.1164390487217428 8.931512389773943
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21115
Reviewed By: ljk53
Differential Revision: D15574580
Pulled By: xta0
fbshipit-source-id: 392546a0df11ed4f0945f2bc84bf5dea2750b60e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21196
we'll add `quantize(quantizer)` as a tensor method later when we expose `quantizer` in Python frontend
Python
```
torch.quantize_linear(t, ...)
```
C++
```
at::quantize_linear(t, ...)
```
Differential Revision: D15577123
fbshipit-source-id: d0abeea488418fa9ab212f84b0b97ee237124240
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21169
We should minimize dependency from perfkernels (we were including eigen header files only in cc files not compiled with avx or avx2 options but better to be very strict because it's easy to introduce illegal instruction errors in perfkernels)
Reviewed By: salexspb
Differential Revision: D15563839
fbshipit-source-id: d4b1bca22d7f2e6f20f23664d4b99498e5984586
Summary:
Most important fix: Correct "tensor.rst" to "tensors.rst"
Secondary fix: some minor English spelling/grammar fixes.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21029
Differential Revision: D15523230
Pulled By: umanwizard
fbshipit-source-id: 6052d8609c86efa41a4289cd3a099b2f1037c810
Summary:
Dynamically creating a type at runtime was messing up the MRO and has been causing many other problems. I think it's best to delete it. This causes a regression, since
```python
self.linear = nn.Linear(10, 10)
isinstance(self.linear, nn.Linear)
```
will now be `False` again, but this will be fixed once recursive script mode is the default (#20939)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21107
Pulled By: driazati
Differential Revision: D15560549
fbshipit-source-id: 7bd6b958acb4f353d427d66196bb4ee577ecb1a6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21148
The diff modifies the interface for Caffe2 operators in the benchmark suite
Reviewed By: zheng-xq
Differential Revision: D15433888
fbshipit-source-id: c264a95906422d7a26c10b1f9836ba8b35e36b53
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21147
This diff introduces a new interface to add PT/C2 operators to the benchmark suite.
The following steps are needed to add a new operator:
1. Specify the input shapes, args to an operator in configs
2. Create a PT/C2 benchmark class which includes ```init``` (create tensors), ```forward``` (specify the operator to be tested.), and ```backward```(gradient of an op.) methods
3. call generate_pt_test/generate_c2_test to create test cases based on configs
Reviewed By: zheng-xq
Differential Revision: D15250380
fbshipit-source-id: 1025a7cf60d2427baa0f3f716455946d3d3e6a27
Summary:
This should pass once https://github.com/pytorch/vision/pull/971 is merged.
To remove torchvision as a baseline, we just compare to the sum of all param.sum() in the pretrained resnet18 model, which means we need to manually update the number only when the pretrained weights change, which is generally rare.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21132
Differential Revision: D15563078
Pulled By: ailzhang
fbshipit-source-id: f28c6874149a1e6bd9894402f6847fd18f38b2b7
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21164
Write a List type to be used in operator kernels. This abstracts away from the concrete list type used (e.g. std::vector vs SmallVector)
and allows us to change these implementation details without breaking the kernel API.
Also, this class allows for handling List<bool>, which would not work with ArrayRef because vector<bool> is a bitset and can't be converted to ArrayRef<bool>.
Reviewed By: ezyang
Differential Revision: D15476434
fbshipit-source-id: 5855ae36b45b70437f996c81580f34a4c91ed18c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21156
we'll add `quantize(quantizer)` as a tensor method later when we expose `quantizer` in Python frontend
Python
```
torch.quantize_linear(t, ...)
```
C++
```
at::quantize_linear(t, ...)
```
Differential Revision: D15558784
fbshipit-source-id: 0b194750c423f51ad1ad5e9387a12b4d58d969a9
Summary:
In the previous implementation of triu / tril, we passed the batch size in the 2nd dimension of a grid. This is limited to 65535, which means that performing triu / tril on a tensor with batch size > 65535 will throw an error. This PR removes the dependence on the 2nd dimension, and corresponding non-contiguity constraints.
Changelog:
- Compute offset, row and col in the kernel
- Use 1st dimension of grid alone
- Remove unnecessary contiguity checks on tensors as a result of this change
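The index arithmetic described in the changelog can be sketched in pure Python (an illustration of the mapping, not the actual CUDA kernel):

```python
def decompose_index(linear_index: int, rows: int, cols: int):
    """Recover (batch, row, col) from a single flattened index, so the
    batch no longer needs its own (65535-limited) grid dimension."""
    per_matrix = rows * cols
    batch, offset = divmod(linear_index, per_matrix)
    row, col = divmod(offset, cols)
    return batch, row, col
```

Each CUDA thread can then decide whether its `(row, col)` lies above or below the diagonal, regardless of how many matrices are in the batch.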
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21067
Differential Revision: D15572501
Pulled By: ezyang
fbshipit-source-id: 93851cb661918ce794d43eeb12c8a38762e1358c
Summary:
Resolves https://github.com/pytorch/lockdown/issues/51
This adds support for converting simple f-string literals to calls to `string.format()`. It does not support conversion specifiers or format specs.
This also does not support the string parser frontend, since that implementation would be more involved and likely would require modifying our TorchScript AST
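Conceptually, the conversion turns an f-string literal into an equivalent `str.format` call; a hypothetical illustration in plain Python (not the compiler's actual output):

```python
name, count = "jit", 3

# what the user writes
literal = f"{name}: {count}"
# what the simple cases lower to
lowered = "{}: {}".format(name, count)

assert literal == lowered == "jit: 3"
```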
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21037
Reviewed By: zdevito
Differential Revision: D15541183
Pulled By: jamesr66a
fbshipit-source-id: ae9df85e73f646d7219c1349f5b7683becbcef20
Summary:
# Overall Improvements
1. Switched from using `unordered_set` to sparse bitset.
1. Prevent some excessive memory allocations (thanks to resistor)
1. Take advantage of the sparse bitset operations
1. Switch to `flat_hash_map` instead of `unordered_map` in some places.
# Benchmarks (somewhat approximate, best of a couple runs)
1. InceptionNet (load + one forward pass): 19.8 s -> 13.3 s
1. GoogleNet (load + one forward pass): 10.0 s -> 7.24 s
1. DenseNet (only load): 7.3 s -> 5.3 s
I use the `sparse bitset` taken from https://llvm.org/doxygen/SparseBitVector_8h_source.html. I had to make some modifications to use `__builtin_popcountl` and instructions like that instead of other transitive clang dependencies.
## Some notes on our graph topologies
In general, our graphs are very sparse, and most of the components aren't connected. For GoogleNet, we have 200k nodes, we do 2k `mayAlias` queries, and the sum of magnitudes of sets at each node is 500k (ie: every node, on average, reaches 2.5 leaves).
PS: Holy crap macbooks throttle an insane amount with the default fan settings.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20899
Differential Revision: D15564612
Pulled By: Chillee
fbshipit-source-id: 2a293a21a9be25f942ca888c8f225cab32bbfcd0
Summary:
Now you can run `python test/run_tests --jit` to run all jit tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21161
Differential Revision: D15563912
Pulled By: eellison
fbshipit-source-id: 4bb0285cda4168b72a3dc4bba471485566a59873
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21051
In net transforms, we perform an SSARewrite where we update the 'net_pos' for all the ops in the net. The transform function also takes an unordered set of net positions for blacklisting. It's possible that SSARewrite will change the indices of the ops, so the blacklist is applied to the wrong ops. We fix this issue by having SSARewrite only assign a new net_pos if the op doesn't already have one.
Reviewed By: yinghai
Differential Revision: D15532795
fbshipit-source-id: e020492a7b5196a91cdc39d0eda761b1ca612cdb
Summary:
These do not work. We'll save time and CPU until someone has the time to fix them.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21153
Differential Revision: D15558601
Pulled By: pjh5
fbshipit-source-id: f9bfe580aa7962a88506f9af0032647f553637a4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21027
Previously, we were only able to adjust batch size when the output shape had batch size conditioned on its first dim. Although not common, there are cases where we want to slice back an output whose batch size is conditioned on a non-first dim, or whose output shape doesn't really have batch size in it but rather is an expression of it. Examples are shapes at the output of `Transpose` or `Tile`. This diff redesigns how we handle the output size. The key is that when we run OnnxifiOp, the input shapes are given, and we can actually do a shape inference to derive the real output shapes, no matter how they got transformed. We then compare the real output shape with the max-batch-sized output shape, dim by dim, and use a `Slice` op to cut the max output back to the real output shape.
Notice that the general `Slice` op is slow, and in most cases we still prefer adjusting batch size by shrinking its first dim, which is just an operation on meta info without data allocation/manipulation. Therefore, we add a flag `fast_path` to detect this situation and operate accordingly.
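The general (non-fast-path) shrink amounts to slicing each dim of the max-batch output back to the inferred real shape. A NumPy sketch with a hypothetical helper name (assuming NumPy is available; the real code operates on Caffe2 tensors):

```python
import numpy as np

def shrink_output(out_max: np.ndarray, real_shape: tuple) -> np.ndarray:
    # Slice dim by dim; the fast path instead just shrinks the first dim,
    # which touches only metadata.
    return out_max[tuple(slice(0, d) for d in real_shape)]
```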
Reviewed By: tracelogfb
Differential Revision: D15515189
fbshipit-source-id: 9c1fff161f82d0bc20eeac07ca4a2756e964e9fd
Summary:
Resolves https://github.com/pytorch/lockdown/issues/29
Examples:
```
import torch

@torch.jit.script
def foobar(x):
    return torch.blargh(xyz)
==
RuntimeError:
object has no attribute blargh:
at compile.py:5:12
@torch.jit.script
def foo(x):
    return torch.blargh(x)
           ~~~~~~~~~~~~ <--- HERE
```
It also gets the correct column number in the case where the original source file has common leading whitespace in front of the callable:
```
import torch

with torch.no_grad():
    @torch.jit.script
    def foo(x):
        return torch.blargh(x)
==
RuntimeError:
object has no attribute blargh:
at compile_leading.py:6:24
    @torch.jit.script
    def foo(x):
        return torch.blargh(x)
               ~~~~~~~~~~~~ <--- HERE
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20898
Differential Revision: D15552424
Pulled By: jamesr66a
fbshipit-source-id: 78d0f0de03f7ccbf3e7ea193a1b4eced57ea5d69
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20874
A criterion for what should go in a Tensor method is whether NumPy has it; NumPy does not have this one,
so we are removing it as a Tensor method. We can still call it as a function.
Python
```
torch.quantize_linear(t, ...), torch.dequantize(t)
```
C++
```
at::quantize_linear(t, ...), at::dequantize(t)
```
Reviewed By: dzhulgakov
Differential Revision: D15477933
fbshipit-source-id: c8aa81f681e02f038d72e44f0c700632f1af8437
Summary:
Following on #19747, this implements most of the `torch.jit.script()` changes laid out in #20939.
Still to do:
* Accessing a method from Python does not add it as a `ScriptMethod` (so only `export`ed methods and `forward` are compiled)
* Calling a method other than `forward` on a submodule doesn't work
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20708
Pulled By: driazati
Differential Revision: D15546045
fbshipit-source-id: c2c8fe179088ffbdad47198e799a456560655b86
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20869
Adding support for the functions listed in the title, by implementing the copy kernel.
Differential Revision: D15474060
fbshipit-source-id: 9264df6e442cca1cc5d952e3e5dcc9f4a426f317
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20876
Tell the compiler that assertions are likely to succeed.
This allows the compiler to generate better code and optimize for the success case.
Differential Revision: D15480066
fbshipit-source-id: 4485154d66b2ee0ef8a401718712dbd61d811aee
Summary:
Thanks Jonas1312 for validating this workaround.
Fixes#20635.
However, I don't know exactly why this one is needed.
The following are my guesses:
1. It is a CUDA bug. Static linking against `cudart` is the default now, so they didn't run enough tests for dynamic ones.
2. It is related to the UCRT. But (1) according to MSDN, shared DLLs should share the same CRT. (2) The CUDA-related objects like `CUDevice` passed to `cudart` are stored on the stack, not the heap. (3) If this were the case, it should always fail, not sometimes. https://docs.microsoft.com/en-us/cpp/c-runtime-library/potential-errors-passing-crt-objects-across-dll-boundaries?view=vs-2019
3. It is a bug of our side. However, I was unable to find it.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21062
Differential Revision: D15543557
Pulled By: ezyang
fbshipit-source-id: c23af45ebf582fad93ce5f029af6e1f06cf1d49d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20887
Switch AT_xxx assertion macros to the TORCH_ variants and make sure the separation between TORCH_CHECK and TORCH_INTERNAL_ASSERT makes sense.
Differential Revision: D15484658
fbshipit-source-id: 490ae64cc36946756c30971f1b685048bc5f77da
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20940
- `torch.nn._intrinsic` will contain normal(unquantized) fused modules like Conv2DRelu, Conv2DBnRelu, FakeQuantize ops etc.
- `torch.nn._intrinsic.quantized` will contain fused and quantized modules like Quantized Conv2DRelu, Quantized LinearRelu etc.
Right now I only added FakeQuantize op in `torch.nn._intrinsic` namespace, we'll have more later
Differential Revision: D15505228
fbshipit-source-id: d380929e38af7a5bcfbea27474d5b80f95d43b03
Summary:
A bunch of modules were missing entries for `__constants__` which was making their `__repr__`s not work. Others had `__constants__` that were not necessary since it was provided by some parent class instead.
Fixes#20978
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21071
Pulled By: driazati
Differential Revision: D15539518
fbshipit-source-id: 24bdd1ef41ef636eefd5d2bad4ab2d79646ed4f0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17946
Some of these are probably implementable for exported operators,
but aren't implemented yet and for now it's better to assert than to just return wrong results.
Reviewed By: ezyang
Differential Revision: D14430749
fbshipit-source-id: 2b0037a9ed227a22aa7376a90e6d3d09d3e04707
Summary:
Fixes#18440
I calculate a derived index from `start,stop,step` as `start + step*index`. When `start=0` and `step=1` (the defaults/`range(n)`), this is the same behavior as before.
Unfortunately, it seems that we do not optimize out operations like `x*1` or `x+0`. That means that we're doing lots of redundant operations when we don't need to. EDIT: More specifically, it seems like we only do this optimization for (tensor, scalar): https://github.com/pytorch/pytorch/blob/master/torch/csrc/jit/passes/peephole.cpp#L128
The most annoying part of this code is calculating the number of iterations, given `start, stop, step`. I ended up going with the formula `(abs(stop-start) + abs(step)-1)//abs(step)`. Other intuitively appealing formulas like `(stop-start + step -1)//step` don't work for negative numbers.
I tried using `SymbolicVariable` for the calculations, but it seems that `symbolicvariable` only outputs ops for `tensors`, not the integers we have.
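The iteration-count formula from the summary can be checked directly against `len(range(...))`; a sketch with an explicit guard for empty ranges, which the bare formula assumes away:

```python
def trip_count(start: int, stop: int, step: int) -> int:
    """Number of iterations of range(start, stop, step), via the formula
    (abs(stop-start) + abs(step)-1) // abs(step) from the summary."""
    assert step != 0
    if (stop - start) * step <= 0:
        return 0  # step points away from stop (or start == stop): empty range
    return (abs(stop - start) + abs(step) - 1) // abs(step)

# agrees with Python's own range semantics, including negative steps
for args in [(0, 10, 1), (0, 10, 3), (10, 0, -3), (7, 7, 2), (-5, 5, 2)]:
    assert trip_count(*args) == len(range(*args))
```

The derived loop index is then `start + step * i` for `i` in `[0, trip_count)`, which collapses back to plain `i` when `start=0, step=1`.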
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20795
Differential Revision: D15446869
Pulled By: Chillee
fbshipit-source-id: 6085545ace04e25985c6ac870226f7a651f670d5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21035
Fix the dtype error in `dequantize_linear`, it should accept the same dtype argument as `quantize_linear`
Differential Revision: D15521931
fbshipit-source-id: 0114c046a3f1046e42fca49c74c85e487fee8616
Summary:
This PR adds a check that prints a warning if a type annotation prefix isn't what mypy expects.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20884
Differential Revision: D15511043
Pulled By: Krovatkin
fbshipit-source-id: 9038e074807832931faaa5f4e69628f94f51fd72
Summary:
I accidentally added a TF dependency in #20413 by using `from tensorboard.plugins.mesh.summary import _get_json_config`.
I'm removing it at the cost of some code duplication.
orionr, Please review.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21066
Reviewed By: natalialunova
Differential Revision: D15538746
Pulled By: orionr
fbshipit-source-id: 8a822719a4a9f5d67f1badb474e3a73cefce507f
Summary:
In larger system environments, there's usually a need to store some information about how the model was created (e.g. from which process, workflow, by which user, etc). It's almost like the JPEG metadata written by a camera.
This PR adds a low-level C++ hook to allow population of additional files in the zip container based on the environment. The reason to have it as a low-level hook instead of a top-level API wrapper (e.g. `m.save_with_metadata`) is to capture all usages of the saving API transparently for the user.
Let me know if there are concerns.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20863
Differential Revision: D15487941
Pulled By: dzhulgakov
fbshipit-source-id: 120c5a4c9758aa82846bb51a1207f923e3da1333
Summary:
This doesn't have `strace` yet, but it still has `faulthandler` to print stack traces when hanging. Also part of an attempt to isolate changes from #19228.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20166
Differential Revision: D15536504
Pulled By: ezyang
fbshipit-source-id: fe6e6e2e9899f30d8167436d7bc62b42883a3356
Summary:
Previously, this didn't work when 2d target tensors had extra columns at the end. Now we just ignore those.
Also fix the confusion in the doc example regarding the number of classes.
Thank you, ypw-rich for the report with reproducing example.
Fixes: #20522
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20971
Differential Revision: D15535481
Pulled By: ezyang
fbshipit-source-id: 397e44e20165fc4fa2547bee9390d4c0b688df93
Summary:
https://github.com/pytorch/pytorch/pull/17783 made ninja and makefile builds print out build commands unconditionally, which has made the build log very verbose, e.g. the ROCm CI build log grows beyond 13 MB. A large build log makes searching for the real error hard.
https://github.com/pytorch/pytorch/pull/20508 has reverted the ninja change, and this one reverts the makefile change.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21053
Differential Revision: D15533412
Pulled By: bddppq
fbshipit-source-id: ad89b617d06acc670d75d4cf25111a4081e9c95e
Summary:
I've reported inconsistency between `checkpoint_sequential` and `nn.Sequential` at https://github.com/pytorch/pytorch/issues/19260. Both should provide the same input signature but they don't. I think the consistency is important and I agree with apaszke that `nn.Sequential`'s semantics should be kept instead of `checkpoint_sequential`.
I hope `checkpoint_sequential` will raise a `TypeError` on variadic arguments starting with PyTorch 1.2.0. But for now, it's okay just to warn with a `DeprecationWarning`. I've talked about this approach with soumith.
Please review this pull request. Any comment will be my pleasure.
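A sketch of the warning path being proposed, with a hypothetical simplified signature (not the real `torch.utils.checkpoint` code, which also does the actual checkpointing):

```python
import warnings

def checkpoint_sequential_like(functions, segments, *inputs):
    # nn.Sequential.forward takes a single input, so variadic
    # inputs are now deprecated rather than silently accepted
    if len(inputs) > 1:
        warnings.warn(
            "Passing multiple inputs is deprecated; pass a single input "
            "like nn.Sequential does.",
            DeprecationWarning,
        )
    x = inputs[0] if len(inputs) == 1 else inputs
    for fn in functions:
        x = fn(x)
    return x
```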
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21006
Differential Revision: D15530801
Pulled By: soumith
fbshipit-source-id: 0ceb2cc6a17dcc547d0d00ebaf9df8603be53183
Summary:
gradcheck currently includes a determinism check (although it only tries twice and sees if the results match).
This can lead to flaky tests, e.g. in #20971, but also #13818.
This adds nondet_tol for both gradcheck and gradgradcheck. It does not change / reenable any tests yet.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20980
Differential Revision: D15530129
Pulled By: soumith
fbshipit-source-id: 04d7f85b5b59cd62867820c74b064ba14f4fa7f8
Summary:
Fixes a typo in the CyclicLR docs by adding the lr_scheduler directory and puts in other required arguments.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21021
Differential Revision: D15530109
Pulled By: soumith
fbshipit-source-id: 98781bdab8d82465257229e50fa3bd0015da1286
Summary:
Just an annoying warning that's been popping up a lot.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20964
Differential Revision: D15531064
Pulled By: Chillee
fbshipit-source-id: 9580115676c5e246481054bbfc749a551a3cca5e
Summary:
This PR covers two important points with respect to the QR decomposition:
- batching of input matrices (#7500)
- adding `some` as an option in `torch.qr` akin to NumPy's `mode` option (#10538)
Changelog:
- Enable batching for inputs to `torch.qr`
- Move QR decomposition implementation to ATen (CPU and CUDA)
- Remove existing implementations in TH/THC
- Add a `some` option to `torch.qr` that will enable users to switch between complete and reduced decomposition
- Modify doc strings
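NumPy's analogous `mode` option illustrates the two shapes involved (assuming NumPy is available; `some=True` corresponds to NumPy's reduced mode):

```python
import numpy as np

a = np.random.randn(5, 3)
q_red, r_red = np.linalg.qr(a, mode="reduced")    # like torch.qr(a, some=True)
q_com, r_com = np.linalg.qr(a, mode="complete")   # like torch.qr(a, some=False)

assert q_red.shape == (5, 3) and r_red.shape == (3, 3)
assert q_com.shape == (5, 5) and r_com.shape == (5, 3)
assert np.allclose(q_red @ r_red, a)  # both modes reconstruct the input
```

The reduced form is cheaper and sufficient for most least-squares uses; the complete form gives a full orthonormal basis of the column space and its complement.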
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20689
Differential Revision: D15529230
Pulled By: soumith
fbshipit-source-id: 16af82b1d2db8a3a758fa8a5f798d83f5f950efb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20603
When we use intra_op_parallel operators, Caffe2 tracing was generating a trace only for the master task, giving a false impression that a lot of threads are underutilized.
This diff also traces child tasks.
Reviewed By: ilia-cher
Differential Revision: D14820008
fbshipit-source-id: ff4ed203804d86d9231c21c99d869f1ddf1d1ef9
Summary:
Add an option to setup.py to stop the build process once cmake terminates. This gives users a chance to fine-tune build options. Also update the README accordingly.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21034
Differential Revision: D15530096
Pulled By: soumith
fbshipit-source-id: 71ac6ff8483c3ee77c38d88f0d059db53a7d3901
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20647
The initial assumption was that `qint8` would be unsigned. After introduction of `quint8` and `qint8`, some tests break.
Reviewed By: jerryzh168
Differential Revision: D15332106
fbshipit-source-id: 6ed18da428915aea918a363c5f38754a3c75d06b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20493
This helps distinguish if the op was a quantized op or not.
Reviewed By: salexspb
Differential Revision: D15337854
fbshipit-source-id: 43c7aef143085cfaeb4ec2102a7f36cc454e0e94
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20173
Enabled op profiling even when net type is not dag or prof dag. Also added
engine type info to summary.
Reviewed By: salexspb, ilia-cher
Differential Revision: D15177813
fbshipit-source-id: 5be0efeaabc9a961cf1d73b0703749c08bb1adbb
Summary:
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#19587 [jit] Make ScriptModule.training an attribute instead of a parameter**
Remove the hack we had previously where `training` was a buffer
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19587
Differential Revision: D15502768
Pulled By: driazati
fbshipit-source-id: 3022f2d57ec6849868f9225d9bc2bfb7828cb318
Summary:
Before we look into supporting `deepcopy` we could at least improve an error msg.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20885
Differential Revision: D15511023
Pulled By: Krovatkin
fbshipit-source-id: 93b8730a2cc663eee0147f14d3341d0606748eaf
Summary:
This is #20919 without the changes to aten/src/THC/THCIntegerDivider.cuh
that broke the ROCm build.
cc bddppq
Original summary:
This fixes advanced indexing in cases where there's more than 2^31-1
bytes in the output. The `gpu_index_kernel` was missing the
`can_use_32bit_indexing`/`with_32bit_indexing` check.
This also adds a number of TORCH_INTERNAL_ASSERTS in Loops.cuh,
OffsetCalculator, and IntDivider checking that sizes fit in a signed 32-bit
integer.
More comprehensive tests that require a 32 GB GPU are here:
https://gist.github.com/colesbury/e29387f5851521256dff562be07b981e
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21019
Differential Revision: D15518477
Pulled By: colesbury
fbshipit-source-id: 4db5626fda76eb58250793e8aa7d4f2832db3a34
Summary:
Fixes#20495 .
Now for
```python
class A(torch.jit.ScriptModule):
    def __init__(self):
        super(A, self).__init__()

    @torch.jit.script_method
    def forward(self, x):
        return x + self.whatisgoingon


class B(A):
    def __init__(self):
        super(B, self).__init__()

    @torch.jit.script_method
    def bar(self, x):
        return x * x


A()
```
it does
```
RuntimeError:
attribute 'whatisgoingon' does not exist:
@torch.jit.script_method
def forward(self, x):
    return x + self.whatisgoingon
               ~~~~~~~~~~~~~~~~~~ <--- HERE
```
I added a test in `test_jit.py` as well.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20779
Differential Revision: D15441138
Pulled By: Chillee
fbshipit-source-id: 88f458c36b5e32a1ffc467b27bbc28a3c5c07321
Summary:
As a part of https://github.com/pytorch/pytorch/pull/20580 I noticed that we had some unusual variable naming in `summary.py`. This cleans it up and also removes some variables that weren't being used.
I'll wait until we have an `add_custom_scalars` test to land this.
cc lanpa natalialunova
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20861
Differential Revision: D15503420
Pulled By: orionr
fbshipit-source-id: 86d105a346198a1ca543d1c5d297804402ab5a0c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20880
This clarifies how the momentum parameters should be used.
Reviewed By: soumith
Differential Revision: D15482450
fbshipit-source-id: e3649a38876c5912cb101d8e404abca7c3431766
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20610
- Change InferLengthsRangeFill
- Add InferGatherRanges
- Add tests from ClipRangesGatherSigridHash all the way to SparseLengthsWeightedSum
- Add tests from SigridTransforms all the way to SparseLengthsWeightedSum

An e2e test will be added in the following diff.
Reviewed By: ipiszy
Differential Revision: D15382730
fbshipit-source-id: a611cd129007a273dfc43955cd99af1c4ed04efd
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20938
Dequantize_linear need not be exposed to the front end users.
It will only be used for the jit passes for q-dq insertion and op
substitution.
Differential Revision: D15446097
fbshipit-source-id: a5fbcf2bb72115122c9653e5089d014e2a2e891d
Summary:
Remove the internal functions in multi_head_attention_forward. Those internal functions cause 10-15% performance regression and there is possibly a JIT issue.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20653
Differential Revision: D15398888
Pulled By: cpuhrsch
fbshipit-source-id: 0a3f053a4ade5009e73d3974fa6733c2bff9d929
Summary:
Changes:
- protobuf has been moved to protocolbuffers/protobuf a while ago.
- cpuinfo has been moved to pytorch/cpuinfo and updated in FBGEMM recently.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20973
Differential Revision: D15511926
Pulled By: soumith
fbshipit-source-id: 2c50373c9b245524f839bd1059870dd2b84e3b81
Summary:
Sometimes users forget to use the "--recursive" option when they update submodules. The check added here should help expose this issue.
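A minimal sketch of such a check (a hypothetical helper, not the actual setup.py code): an empty checkout directory is the telltale sign that `git submodule update --init --recursive` was never run.

```python
import os
import tempfile

def find_uninitialized_submodules(root, submodules):
    # An empty directory means the submodule contents were never fetched.
    return [s for s in submodules
            if not os.listdir(os.path.join(root, s))]

# Demo with a fake repo layout.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, 'third_party', 'good'))
open(os.path.join(root, 'third_party', 'good', 'CMakeLists.txt'), 'w').close()
os.makedirs(os.path.join(root, 'third_party', 'empty'))
missing = find_uninitialized_submodules(
    root,
    [os.path.join('third_party', 'good'), os.path.join('third_party', 'empty')])
```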
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20937
Differential Revision: D15502846
Pulled By: mrshenli
fbshipit-source-id: 34c28a2c71ee6442d16b8b741ea44a18733b1536
Summary:
This changes the progress bars in `_download_url_to_file` from saying things like `49773343.40it/s` to `47.5MB/s`.
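The scaling effect can be illustrated with a small formatter (an illustrative sketch, not the actual progress-bar code, which configures the bar's unit options instead):

```python
def human_rate(bytes_per_sec):
    # Scale through binary unit prefixes, the way a byte-aware
    # progress bar renders transfer rates.
    for unit in ('B', 'kB', 'MB', 'GB', 'TB'):
        if bytes_per_sec < 1024.0:
            return '%.1f%s/s' % (bytes_per_sec, unit)
        bytes_per_sec /= 1024.0
    return '%.1fPB/s' % bytes_per_sec

print(human_rate(49773343.40))  # -> 47.5MB/s
```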
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20908
Differential Revision: D15511223
Pulled By: soumith
fbshipit-source-id: 2422eb5fb486f9ef4bd69c556c4ed1775b8b2860
Summary:
I believe the `True` and `False` in the doc are reversed :)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20961
Differential Revision: D15510806
Pulled By: soumith
fbshipit-source-id: 62566bb595e187506b23dedc24892e48f35b1147
Summary:
Fixes#20630
Haven't tested it yet. Let's see if it passes all CI tests.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20882
Reviewed By: pietern
Differential Revision: D15483561
Pulled By: mrshenli
fbshipit-source-id: 5f0730a04d92906af077b2fe2170b674ca371e6c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20868
When `input_boxes_include_bg_cls` is false (which means `input_scores_fg_cls_starting_id` is 0), it doesn't map the class index of the score correctly when sorting and limiting the detections over all classes after NMS.
Reviewed By: newstzpz
Differential Revision: D15472706
fbshipit-source-id: dc1e808b63ad09fb4bd95acf866771bb3fa92d69
Summary:
This fixes advanced indexing in cases where there's more than 2^31-1
bytes in the output. The `gpu_index_kernel` was missing the
`can_use_32bit_indexing`/`with_32bit_indexing` check.
This also adds a number of TORCH_INTERNAL_ASSERTS in Loops.cuh,
OffsetCalculator, and IntDivider checking that sizes fit in a signed 32-bit
integer.
More comprehensive tests that require a 32 GB GPU are here:
https://gist.github.com/colesbury/e29387f5851521256dff562be07b981e
Fixes #20888
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20919
Differential Revision: D15501945
Pulled By: colesbury
fbshipit-source-id: e876e678e866d2efda8ee92c47a1d2d1310671f0
Summary:
Previously, this used `crepr` after the decref of `repr`. This is not
allowed because `repr` owns the cached copy of `crepr`.
Let's see if this fixes the contbuild.
See #20926
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20931
Differential Revision: D15501929
Pulled By: colesbury
fbshipit-source-id: 24141ba62df8758d2a3998cf7c2054be09088b6a
Summary:
Bug reported internally at FB:
```python
>>> t=torch.from_numpy(np.empty((0,4)))
>>> t[:,1::2]*=1
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
RuntimeError: Trying to resize storage that is not resizable at ../aten/src/TH/THStorageFunctions.cpp:76
```
This happens because the storage offset of `t[:, 1::2]` is 1, and it has 0 elements. We can fix this by avoiding resizing the storage for no-element arrays.
(We could *also* have avoided it by not modifying the storage offset in this case, but I felt this way was more semantically correct -- in general, we should not be assuming it's okay to do anything to the storage when it has zero elements).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20914
Differential Revision: D15497860
Pulled By: umanwizard
fbshipit-source-id: 6af61d73a05edfc5c07ce8be9e530f15bf72e6a9
Summary:
I started adding support for the new **[mesh/point cloud](https://github.com/tensorflow/graphics/blob/master/tensorflow_graphics/g3doc/tensorboard.md)** data type introduced to TensorBoard recently.
I created the functions to add the data, created the appropriate summaries.
This new data type however requires a **Merged** summary containing the data for the vertices, colors and faces.
I got stuck at this stage. Maybe someone can help. lanpa?
I converted the example code by Google to PyTorch:
```python
import numpy as np
import trimesh
import torch
from torch.utils.tensorboard import SummaryWriter
sample_mesh = 'https://storage.googleapis.com/tensorflow-graphics/tensorboard/test_data/ShortDance07_a175_00001.ply'
log_dir = 'runs/torch'
batch_size = 1
# Camera and scene configuration.
config_dict = {
'camera': {'cls': 'PerspectiveCamera', 'fov': 75},
'lights': [
{
'cls': 'AmbientLight',
'color': '#ffffff',
'intensity': 0.75,
}, {
'cls': 'DirectionalLight',
'color': '#ffffff',
'intensity': 0.75,
'position': [0, -1, 2],
}],
'material': {
'cls': 'MeshStandardMaterial',
'roughness': 1,
'metalness': 0
}
}
# Read all sample PLY files.
mesh = trimesh.load_remote(sample_mesh)
vertices = np.array(mesh.vertices)
# Currently only supports RGB colors.
colors = np.array(mesh.visual.vertex_colors[:, :3])
faces = np.array(mesh.faces)
# Add batch dimension, so our data will be of shape BxNxC.
vertices = np.expand_dims(vertices, 0)
colors = np.expand_dims(colors, 0)
faces = np.expand_dims(faces, 0)
# Create data placeholders of the same shape as data itself.
vertices_tensor = torch.as_tensor(vertices)
faces_tensor = torch.as_tensor(faces)
colors_tensor = torch.as_tensor(colors)
writer = SummaryWriter(log_dir)
writer.add_mesh('mesh_color_tensor', vertices=vertices_tensor, faces=faces_tensor,
colors=colors_tensor, config_dict=config_dict)
writer.close()
```
I tried adding only the vertex summary, hence the others are supposed to be optional.
I got the following error from TensorBoard and it also didn't display the points:
```
Traceback (most recent call last):
File "/home/dawars/workspace/pytorch/venv/lib/python3.6/site-packages/werkzeug/serving.py", line 302, in run_wsgi
execute(self.server.app)
File "/home/dawars/workspace/pytorch/venv/lib/python3.6/site-packages/werkzeug/serving.py", line 290, in execute
application_iter = app(environ, start_response)
File "/home/dawars/workspace/pytorch/venv/lib/python3.6/site-packages/tensorboard/backend/application.py", line 309, in __call__
return self.data_applications[clean_path](environ, start_response)
File "/home/dawars/workspace/pytorch/venv/lib/python3.6/site-packages/werkzeug/wrappers/base_request.py", line 235, in application
resp = f(*args[:-2] + (request,))
File "/home/dawars/workspace/pytorch/venv/lib/python3.6/site-packages/tensorboard/plugins/mesh/mesh_plugin.py", line 252, in _serve_mesh_metadata
tensor_events = self._collect_tensor_events(request)
File "/home/dawars/workspace/pytorch/venv/lib/python3.6/site-packages/tensorboard/plugins/mesh/mesh_plugin.py", line 188, in _collect_tensor_events
tensors = self._multiplexer.Tensors(run, instance_tag)
File "/home/dawars/workspace/pytorch/venv/lib/python3.6/site-packages/tensorboard/backend/event_processing/plugin_event_multiplexer.py", line 400, in Tensors
return accumulator.Tensors(tag)
File "/home/dawars/workspace/pytorch/venv/lib/python3.6/site-packages/tensorboard/backend/event_processing/plugin_event_accumulator.py", line 437, in Tensors
return self.tensors_by_tag[tag].Items(_TENSOR_RESERVOIR_KEY)
KeyError: 'mesh_color_tensor_COLOR'
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20413
Differential Revision: D15500737
Pulled By: orionr
fbshipit-source-id: 426e8b966037d08c065bce5198fd485fd80a2b67
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20821
Change registration API. Instead of
static auto registry = torch::RegisterOperators()
.op("my::op", torch::RegisterOperators::options()
.kernel<Kernel>()
.dispatchKey(CPUTensorId()));
it is now
static auto registry = torch::RegisterOperators()
.op("my::op", torch::RegisterOperators::options()
.kernel<Kernel>(CPUTensorId()));
This binds kernel and dispatch key together, allowing them to be separate from other future configuration options like alias analysis or autograd wrappers.
The semantic problem behind this is that the dispatch key is a *kernel config parameter* and not an *operator config parameter* while things like autograd wrappers, alias info, and actually the kernel itself are *operator config parameters*. And while previously, the different kind of config parameters have been mixed, this diff now separates them.
Before this change, it wouldn't have been well defined if you specified a dispatchKey together with an autogradWrapper or aliasInfo for example.
// what is this supposed to do?
static auto registry = torch::RegisterOperators()
.op("my::op", torch::RegisterOperators::options()
.aliasInfo(DEFAULT)
.dispatchKey(CPUTensorId()));
If we get more kernel config parameters in the future, we could introduce something like this
static auto registry = torch::RegisterOperators()
.op("my::op", torch::RegisterOperators::options()
.kernel<Kernel>(torch::RegisterOperators::kernelOptions()
.dispatchKey(CPUTensorId())
.otherConfig());
but that's overkill as long as dispatch keys are the only kernel config parameter, and we can introduce that later without breaking backwards compatibility.
A nice side effect of this is that people can register multiple kernels to the same operator in the same `.op()` call:
static auto registry = torch::RegisterOperators()
.op("my::op", torch::RegisterOperators::options()
.kernel<Kernel1>(CPUTensorId())
.kernel<Kernel2>(CUDATensorId()));
Reviewed By: dzhulgakov
Differential Revision: D15455790
fbshipit-source-id: 1c46bfe676dcacf74cf36bd3f5df3d2c32b8fb11
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17818
Some of these are probably implementable for exported operators,
but aren't implemented yet and for now it's better to assert than to just return wrong results.
Reviewed By: ezyang
Differential Revision: D14392459
fbshipit-source-id: bf86e6cb0a7cfefd112a65dc85cc243e57a5ad52
Summary:
This PR also moves Device::validate into the header file, which makes
statements like `Device d = kCPU` effectively free.
Device includes the device's index, so TensorIterator::compute_types
now implicitly checks that all CUDA inputs are on the same GPU.
Previously, this was done ad-hoc in places like TensorIterator::binary_op.
Note that zero-dim Tensors (scalars) are NOT required to be on the
same device as other inputs because they behave almost like Python numbers.
TensorIterator handles copying zero-dim Tensors to the common device.
Prior to this PR, TensorIterator would copy zero-dim Tensors between CPU
and GPU, but not between different GPUs (because Backend didn't encode
the GPU index). This removes that restriction.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20690
Differential Revision: D15414826
Pulled By: colesbury
fbshipit-source-id: 1d0ad1f7d663252af36dd4590bcda418c2f7a09f
Summary:
This PR is a eliminates unneeded grad_sum_to_size and in particular speeds up the LSTM backward by allowing better fusion.
It consists of two parts:
- In AutoDiff, record broadcasting sizes only if the broadcast output size is different from the input size, otherwise record None.
- The specialization of Optional arguments (#18407) allows us to then eliminate ` _grad_sum_to_size(t, None)` in the peephole optimization step.
Thus, in the LSTM case, no SumToSize remain in the crucial fusion group. The trick here is that we can specialize on the runtime information from the forward.
I'm testing that different broadcasting situations lead to different graphs.
I didn't move all symbolic_script _grad_sum_to_size to the new logic, but it might be better to do this incrementally, anyway.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18697
Differential Revision: D15482076
Pulled By: wanchaol
fbshipit-source-id: 7f89367e35b8729910077c95c02bccefc8678afb
Summary:
To say that we don't do refinement on module attributes
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20912
Differential Revision: D15496453
Pulled By: eellison
fbshipit-source-id: a1ab9fb0157a30fa1bb71d0793fcc9b1670c4926
Summary:
Earlier, the workspace size query and allocation was placed inside the loop.
However, since we have batches of matrices with the same number of rows and columns, the workspace size query and allocation for every matrix in the batch is redundant.
This PR moves the workspace size query and allocation outside the loop, effectively saving (batch_size - 1) number of queries and allocation (and consequently the deallocation).
There is a tremendous speedup in inverse computation as a result of this change.
Changelog:
- Move workspace query and allocation outside the batch loop
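The shape of the optimization, as a sketch (illustrative Python, not the actual ATen/MAGMA code):

```python
class Workspace:
    allocations = 0  # track how many allocations occur

    def __init__(self, size):
        Workspace.allocations += 1
        self.buf = bytearray(size)

def batched_op(batch, workspace_size):
    # After this PR: one size query/allocation for the whole batch, since
    # every matrix shares the same dimensions. Previously this line lived
    # inside the loop, costing (batch_size - 1) extra allocations.
    ws = Workspace(workspace_size)
    return [m for m in batch]  # stand-in for the per-matrix LAPACK call

batched_op([[1.0], [2.0], [3.0]], 64)
```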
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20904
Differential Revision: D15495505
Pulled By: ezyang
fbshipit-source-id: 226729734465fcaf896f86e1b1a548a81440e082
Summary:
- Do not install unnecessary packages in the Docker image.
- In the Docker image, use conda to install ninja (saving one layer)
- When workdir is set, use "." to refer to it to reduce redundancy.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20881
Differential Revision: D15495769
Pulled By: ezyang
fbshipit-source-id: dab7df71ac107c85fb1447697e25978daffc7e0b
Summary:
Currently PyTorch forces color output due to #20662. But users should be left an option to turn it off, because output redirected to a file would be garbled if color output is forced.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20771
Differential Revision: D15495677
Pulled By: ezyang
fbshipit-source-id: 9d89bbed40d0b67368554305394763a54c5ff6f5
Summary:
Currently when the argument to isinf and isfinite is not tensor, a ValueError is raised. This, however, should be a TypeError, because the error is a type mismatch.
In the error message, "str(tensor)" is replaced by "repr(tensor)" because, when an error occurs, a printable representation of the object is likely more useful than the "informal" string version of the object.
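The pattern being described, sketched with a hypothetical checker (not the actual source):

```python
def check_tensor_arg(obj, expected_type, fn_name):
    # A type mismatch raises TypeError (not ValueError), and the message
    # uses repr() so the offending object is shown unambiguously.
    if not isinstance(obj, expected_type):
        raise TypeError('%s expected a %s, but got %s'
                        % (fn_name, expected_type.__name__, repr(obj)))

class FakeTensor:  # stand-in type so the demo is self-contained
    pass

try:
    check_tensor_arg('not a tensor', FakeTensor, 'isfinite')
except TypeError as e:
    message = str(e)
```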
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20817
Differential Revision: D15495624
Pulled By: ezyang
fbshipit-source-id: 514198dcd723a7031818e50a87e187b22d51af73
Summary:
Attention mask should be of shape `(L, S)` since it is added to `attn_output_weights`.
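A small shape check makes the convention concrete (illustrative only; `L` is the target length, `S` the source length, and a `(L, S)` mask broadcasts over the leading batch-times-heads dimension):

```python
import torch

L, S, num_heads = 3, 5, 2
attn_output_weights = torch.randn(num_heads, L, S)  # per-head attention logits
attn_mask = torch.zeros(L, S)
attn_mask[:, -1] = float('-inf')          # e.g. hide the last source position
masked = attn_output_weights + attn_mask  # (L, S) broadcasts over heads
```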
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20850
Differential Revision: D15495587
Pulled By: ezyang
fbshipit-source-id: 61d6801da5291df960daab273e874df28aedbf6e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20892
FBGEMM uses 64-bit values, so we need to change our implementation to match.
Reviewed By: jerryzh168
Differential Revision: D15487664
fbshipit-source-id: 29cba26093c6f9aeafce14982c1ae12149e63562
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20773
This removes the feature to register fallback kernels that are called when no other kernel matches.
Instead, we introduce the concept of catchall kernels that are always called independent of inputs.
If you only have a fallback/catchall kernel and no kernels with concrete dispatch keys, then both concepts behave in the same way.
The difference is that we now disallow operators to have both, a catchall kernel and kernels with concrete dispatch keys.
This was possible before when they have been fallback kernels.
The reason for this change is that we anticipate needing a method_missing feature in backends, i.e. a backend-wide fallback to call when the backend doesn't specify a kernel for an operator.
We are not clear on precedence between this backend-wide fallback and an operator-level fallback. Disallow fallbacks for now so we are free to choose later without breaking backwards compatibility.
Reviewed By: dzhulgakov
Differential Revision: D15438977
fbshipit-source-id: cb3aa764a1659d909ee21a7bd8ec3d32438aafaa
Summary:
Resubmit #20698 which got messed up.
Idea is that when PyTorch is used in a custom build environment (e.g. Facebook), it's useful to track usage of various APIs centrally. This PR introduces a simple very lightweight mechanism to do so - only first invocation of a trigger point would be logged. This is significantly more lightweight than #18235 and thus we can allow to put logging in e.g. TensorImpl.
Also adds an initial list of trigger points. Trigger points are added in such a way that no static initialization triggers them, i.e. just linking with libtorch.so will not cause any logging. Further suggestions of what to log are welcomed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20745
Differential Revision: D15429196
Pulled By: dzhulgakov
fbshipit-source-id: a5e41a709a65b7ebccc6b95f93854e583cf20aca
Summary:
As part of the Variable/Tensor merge work: https://github.com/pytorch/pytorch/issues/13638, we make the following changes in this PR:
1. Remove the `Variable::Impl` class and the `DifferentiableViewImpl` class
2. Change all `Variable.data()` call sites to either use `Variable` directly, or use `Variable.tensor_data()`
3. Remove the `Variable.data()` API
4. Add `Variable.variable_data()` that matches `tensor.data` in the Python API, which creates a new `Variable` that shares the same storage and tensor metadata with the original `Variable`, but with a completely new autograd history.
After this PR, Variable doesn't wrap a Tensor internally anymore, and both Variable and Tensor use the same TensorImpl class as its `impl_`. The only difference is that Variable always has AutogradMeta in its TensorImpl, but Tensor doesn't.
**Note that this PR is BC-breaking in the following use cases:**
**Use Case 1:**
Previously, `x.data = y` works even if `x` and `y` are of different TensorImpl type (e.g. `x` is a CPU dense tensor whose impl is of type TensorImpl, while `y` is a CPU sparse tensor whose impl is of type SparseTensorImpl). However, after this PR, `x.data = y` doesn't work anymore if `x` and `y` are of different TensorImpl type, because the underlying implementation `variable.set_data(tensor)` no longer works if `variable` and `tensor` have different TensorImpl type.
**Use Case 2:**
If a tensor `x`'s `grad` is sparse, accumulating dense gradients to `x` will change the tensor that `x.grad` is pointing to. This is better illustrated with the following example:
```python
params = torch.tensor([1.5, 1.5]).requires_grad_()
with torch.no_grad():
# Change gradient to a sparse tensor
params.grad = torch.sparse_coo_tensor(torch.tensor([[1, 1]]).long(), torch.tensor([1., 1.]))
grad_saved = params.grad
params.backward(torch.tensor([1.5, 1.5]))
assert id(grad_saved) == id(params.grad) # This will fail after this PR
```
The assertion in the last line will fail after this PR, because adding dense gradients to sparse gradients will change the `params.grad` tensor reference.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17072
Differential Revision: D14075257
Pulled By: yf225
fbshipit-source-id: 0e681df641270dea586042dd26db59f2e76b5957
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20669
Before, Dict was a value type, i.e. copying it did a deep copy.
Unfortunately, this doesn't work well with storing and passing Dicts around in IValues because IValues are reference types.
This diff changes Dict to be a reference type.
Reviewed By: dzhulgakov
Differential Revision: D15404911
fbshipit-source-id: dc990d3eb7cae044b74dd0253f8b704dde6a6c86
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20833
Att. The algorithm is still "horrendously inefficient". But since we are sunsetting Nomnigraph, I just did the minimal fix here.
Reviewed By: tracelogfb
Differential Revision: D15463880
fbshipit-source-id: 413a1280a92c1923ba49031177816a2d5f888575
Summary:
This tries to fix the following error on current master:
```
May 23 16:18:47 Traceback (most recent call last):
May 23 16:18:47 File "main.py", line 7, in <module>
May 23 16:18:47 from torchvision import datasets, transforms
May 23 16:18:47 File "/opt/conda/lib/python3.6/site-packages/torchvision/__init__.py", line 1, in <module>
May 23 16:18:47 from torchvision import models
May 23 16:18:47 File "/opt/conda/lib/python3.6/site-packages/torchvision/models/__init__.py", line 11, in <module>
May 23 16:18:47 from . import detection
May 23 16:18:47 File "/opt/conda/lib/python3.6/site-packages/torchvision/models/detection/__init__.py", line 1, in <module>
May 23 16:18:47 from .faster_rcnn import *
May 23 16:18:47 File "/opt/conda/lib/python3.6/site-packages/torchvision/models/detection/faster_rcnn.py", line 7, in <module>
May 23 16:18:47 from torchvision.ops import misc as misc_nn_ops
May 23 16:18:47 File "/opt/conda/lib/python3.6/site-packages/torchvision/ops/__init__.py", line 1, in <module>
May 23 16:18:47 from .boxes import nms, box_iou
May 23 16:18:47 File "/opt/conda/lib/python3.6/site-packages/torchvision/ops/boxes.py", line 2, in <module>
May 23 16:18:47 from torchvision import _C
May 23 16:18:47 ImportError: /opt/conda/lib/python3.6/site-packages/torchvision/_C.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZN2at19NonVariableTypeMode10is_enabledEv
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20865
Differential Revision: D15481736
Pulled By: yf225
fbshipit-source-id: 67d4fd70652ccc709b44cb15392d6e44a8fe9235
Summary:
This PR changes CPU implementation of `AdaptiveAveragePool2D` by
- move dispatch to outside the OpenMP loop
- support fp16
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20366
Differential Revision: D15456069
Pulled By: ezyang
fbshipit-source-id: 00fa2916f8b136af9f5c8b5db0eca4619f9f5bac
Summary:
When adding custom scalars like this
```python
from torch.utils.tensorboard import SummaryWriter
with SummaryWriter() as writer:
writer.add_custom_scalars({'Stuff': {
'Losses': ['MultiLine', ['loss/(one|two)']],
'Metrics': ['MultiLine', ['metric/(three|four)']],
}})
```
This error is raised:
```
TypeError: Parameter to MergeFrom() must be instance of same class: expected tensorboard.SummaryMetadata.PluginData got list.
```
Removing the square brackets around `SummaryMetadata.PluginData(plugin_name='custom_scalars')` should be enough to fix it.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20580
Differential Revision: D15469700
Pulled By: orionr
fbshipit-source-id: 7ce58034bc2a74ab149fee6419319db68d8abafe
Summary:
fix#20781#20757
I don't know an easy way to add a test to make sure it runs against a package installed as an .egg, but I tested it locally with torchvision.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20782
Differential Revision: D15443600
Pulled By: ailzhang
fbshipit-source-id: 285eb0d9a44d6edb8e93618fa293f4feb431d2ae
Summary:
XLA needs a way to override CPUTensor.copy_(XLATensor), but we only
dispatch on the "self" argument. This inverts the dispatch order when
"src" is an unhandled type.
Note that things like XLATensor.copy_(CPUTensor) never enter this
implementation.
cc dlibenzi
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20783
Differential Revision: D15443187
Pulled By: colesbury
fbshipit-source-id: 4ee93ba598ef0fed2a99c0683aae30cb50a1f99c
Summary:
was reading the README on github and came across a couple of typos.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20819
Differential Revision: D15469603
Pulled By: nairbv
fbshipit-source-id: 0ed7868de2d4e6d82557a8c170783966f8a1afd7
Summary:
The duplicated code of `_optimize_trace` in _pytorch_graph.py is used to bypass some optimization step which causes missing scope.
It seems that most of the problematic steps have been fixed recently. Standard models implemented in torchvision are visually inspected before the commit. However, the `+=` in 50d54a82d1/torchvision/models/resnet.py (L63) will let f4d9bfaa4d/torch/onnx/utils.py (L159) produce a bad result. It can be fixed by replacing it with `out = out + identity`. This also implies that `+=` has non-intuitive behavior.
cc orionr ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20394
Reviewed By: NarineK
Differential Revision: D15452204
Pulled By: orionr
fbshipit-source-id: eaa4c13f16551c78dc6419f1e22eb2c560af4cc5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20772
Copy of D15178352
A conflicting commit landed at the same time as D15178352 that removed registering kernels using IntArrayRef; hence, D15178352 was reverted. Using std::vector instead.
Reviewed By: zafartahirov
Differential Revision: D15437237
fbshipit-source-id: cd2f1caebcc720352b48ce25d716cb1ca49a5197
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20816
Previously, the c10 dispatcher expected ops to be called with Variables and unwrapped them to Tensors before calling into the kernel.
The kernel was expected to return Tensors that were re-wrapped into Variables before passing them on into the system.
However, that doesn't work with kernels that call other operators. One recent example was a kernel that returned the result of `torch::ones()` as output.
Now, with this diff, the c10 dispatcher still passes Tensors to the kernel and Variables back into the system, but it allows ops to be called with either Tensors or Variables,
and kernels are also allowed to return either.
After https://github.com/pytorch/pytorch/pull/17072 , we should be able to get rid of the whole wrapping/unwrapping logic.
Reviewed By: hl475
Differential Revision: D15453963
fbshipit-source-id: 7602b7f2bc43e8ceb8a8c0e97aafcc53d4c47b6c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20740
Provide a way to assemble quantized Tensor from int8 Tensor, scale and zero point.
Differential Revision: D15232416
fbshipit-source-id: c3a3d9d7214b1dc569214c019440c2779fbd063b
Summary:
This is the first part of the planned changes to change the comparison operations result tensor dtype from Byte to Bool. You can see the whole list of changes (not cleaned up) [here](https://github.com/pytorch/pytorch/pull/19332). As the PR is too big for a single review im breaking it into pieces.
**Changes in this PR:**
1. Enable these methods for bool tensors:
- maskedSelect
- maskedSelectBool
- bitand
- cbitand
- bitor
- cbitor
- bitxor
- cbitxor
- sign
- equal
- neg
2. Add bool clause for the TH version of sign method.
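For example, the bitwise operations on bool tensors (a quick sketch exercising bitand, bitor, and bitxor via their operator forms):

```python
import torch

a = torch.tensor([True, False, True])
b = torch.tensor([True, True, False])
print((a & b).tolist())    # bitand
print((a | b).tolist())    # bitor
print((a ^ b).tolist())    # bitxor
print(torch.equal(a, b))   # equal
```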
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20767
Differential Revision: D15436446
Pulled By: izdeby
fbshipit-source-id: 8d2494b5f4873cd79c7f1a40d2cb045cadfad51a
Summary:
I didn't update the Windows references because I wasn't sure if they apply to CUDA 9. peterjc123 what should the Windows section say?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20718
Differential Revision: D15459276
Pulled By: colesbury
fbshipit-source-id: 917e22f8ac75378d88c962c226b5a42b6799c79a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20802
Need this for sequence model
Reviewed By: dzhulgakov
Differential Revision: D15448529
fbshipit-source-id: cd5abe3b689fc0e02feff10faf8cd61c99369f4f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20786
Add a method to LayerModelHelper to filter metrics_schema. A general model builder may add metric schema that is not needed in some situations. This change adds the ability to skip those unneeded fields.
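A sketch of what such a filter looks like (hypothetical names; the real helper operates on caffe2 schema structs):

```python
def filter_metrics_schema(schema, keep):
    # drop every metric field the caller did not ask for
    return {name: field for name, field in schema.items() if name in keep}
```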
Reviewed By: alex1o1o7cloud
Differential Revision: D15418140
fbshipit-source-id: 520f5dffd9938cf206cb1352e2953a4d4d2b6ab1
Summary:
When detecting the presence of NumPy using import, move numpy-related variable assignments outside the try block (i.e., to an else block) to improve readability.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20739
Differential Revision: D15453916
Pulled By: ezyang
fbshipit-source-id: d3c37f2b290846be3c6a1462251cbb3e95d493be
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20787
Set requires_grad=False for bias: this will block the jit tracing.
The as_type fix: The input tensor shape and output tensor shape will be different, which will trigger the assertion failure at https://fburl.com/0m8xy7tc.
Reviewed By: jamesr66a
Differential Revision: D15445092
fbshipit-source-id: 22da41a56ecb9ac092585d0cc1ff0658fb9d631b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20045
This pass adds quant-dequant nodes for bias. It requires the
quant-dequant pass for activations and weights to have run first, since
their qparams are needed to compute the qparams for bias.
Differential Revision: D15179141
fbshipit-source-id: 3aab9fceefcadc3fa42a4e802d9b1e18addad78a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20770
Add dict type since it's part of the pytorch built-in system, and sparse features and text features will be converted to Dict
Reviewed By: pritamdamania87
Differential Revision: D15436255
fbshipit-source-id: 239adbd6a8f68be29020fe656d790f6872f1f0e9
Summary:
as title. We were using AT_ASSERT, which is newly deprecated. In this case, we do in fact want an internal assertion since this is used in testing code to describe expected behavior.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20555
Differential Revision: D15362964
Pulled By: suo
fbshipit-source-id: 984bfe71a774571611f3bbd81767d3cdb878a6fd
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20345
Separate from D15194600
Optimize pytorch layer_norm op, part 1:
optimize layer_norm_forward_cpu
use Eigen Maps to improve reduction performance
Reviewed By: zheng-xq
Differential Revision: D15290608
fbshipit-source-id: cf2c208dfd6fbcbc4c69db3ed60278d9bee156b5
Summary:
Previous implementation of magic methods extended from BuiltinOperators, but it should be able to work with other sugared values, such as casts.
I was also considering making CastValue and BuiltinOperators extend from a MagicMethod superclass, and having them try to call into the superclass before their own call. However, not all builtin operators have corresponding magic methods, so I did it this way instead (although there are workarounds for that).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20654
Differential Revision: D15434469
Pulled By: eellison
fbshipit-source-id: 813fa00bf8b5b9ada46505075ebf984d8eee6aef
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20711
For uint8_t, ```std::numeric_limits::digits``` returns 8;
For int8_t, ```std::numeric_limits::digits``` returns 7.
FBGEMM wants to get the ```qparams.precision``` to be always 8 for both int8_t and uint8_t.
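The discrepancy comes from `digits` counting value bits only, which excludes the sign bit for signed types; the fix adds that bit back. A sketch of the adjustment (hypothetical helper names):

```python
def value_bits(storage_bits, signed):
    # mirrors std::numeric_limits<T>::digits: value bits, excluding the sign bit
    return storage_bits - 1 if signed else storage_bits

def qparams_precision(storage_bits, signed):
    # add the sign bit back so both int8_t and uint8_t report a width of 8
    return value_bits(storage_bits, signed) + (1 if signed else 0)
```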
Reviewed By: jerryzh168
Differential Revision: D15410695
fbshipit-source-id: 17dc3842d7c426947454c201bcb167b87b7301ce
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20726
Edward says it doesn't actually provide compilers,
but it does provide dependencies, so let's mention that instead.
Reviewed By: ezyang
Differential Revision: D15423316
fbshipit-source-id: 9b384f88e5bf7a3d2c132508620c276b49e1569f
Summary:
This PR implements auto-conversion of GPU arrays that support the `__cuda_array_interface__` protocol (fixes #15601).
If an object exposes the `__cuda_array_interface__` attribute, `torch.as_tensor()` and `torch.tensor()` will use the exposed device memory.
#### Zero-copy
When using `torch.as_tensor(...,device=D)` where `D` is the same device as the one used in `__cuda_array_interface__`.
#### Implicit copy
When using `torch.as_tensor(...,device=D)` where `D` is the CPU or another non-CUDA device.
#### Explicit copy
When using `torch.tensor()`.
#### Exception
When using `torch.as_tensor(...,device=D)` where `D` is a CUDA device not used in `__cuda_array_interface__`.
#### Lifetime
`torch.as_tensor(obj)` grabs a reference to `obj` so that the lifetime of `obj` exceeds the lifetime of the tensor.
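The lifetime guarantee boils down to the tensor holding a strong reference to the exporting object; a pure-Python sketch (`Exporter` and `FakeTensor` are stand-ins, not real classes):

```python
import weakref

class Exporter:
    # stands in for an object exposing __cuda_array_interface__
    pass

class FakeTensor:
    def __init__(self, base):
        self._base = base  # strong reference keeps the exporter's memory valid

obj = Exporter()
alive = weakref.ref(obj)
t = FakeTensor(obj)
del obj
still_alive_with_tensor = alive() is not None  # tensor keeps obj alive
del t
alive_after_tensor = alive() is not None       # obj freed with the tensor
```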
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20584
Differential Revision: D15435610
Pulled By: ezyang
fbshipit-source-id: c423776ba2f2c073b902e0a0ce272d54e9005286
Summary:
Appending `arch` to the generator name is not supported for VS starting from VS 2019.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20752
Differential Revision: D15436740
Pulled By: ezyang
fbshipit-source-id: 20057aae8f708d82619927bf2cb87dd1bc2df312
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20737
If someone tries to register multiple kernels in the same .op() call, we're now throwing an error.
Differential Revision: D15425660
fbshipit-source-id: 6d2f1444da3e16a6a98863d847965c2aa211e046
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20674
A few targets in caffe2/caffe2/distribute need to be split too; otherwise they won't compile. Also some cleanups, and rename select_gpu_type to gpu_library_selector.
Differential Revision: D15406019
fbshipit-source-id: 6455ab885b248502b48d4c7565597e00fecfd547
Summary:
Let there be color!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20662
Differential Revision: D15434110
Pulled By: suo
fbshipit-source-id: a317ae72ad72e0b8249f55c9c8d31f420c78c040
Summary:
building with cuda and gcc 4.8.5-28, we see many warnings like:
[893/1645] Building NVCC (Device) object caffe2/CMakeFiles/caffe2_gpu.dir/__/aten/src/THCUNN/caffe2_gpu_generated_ELU.cu.o
/home/bvaughan/repos/pytorch/c10/util/ArrayRef.h:277:48: warning: ‘deprecated’ attribute directive ignored [-Wattributes]
using IntList C10_DEPRECATED_USING = ArrayRef<int64_t>;
This change prevents those warnings on the older compiler.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20587
Differential Revision: D15432749
Pulled By: nairbv
fbshipit-source-id: fd707afcbd6564f96617378d7cd6d62d941a052b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20468
ScalarType node is mandatory for activations and parameters now.
This change inserts ScalarType node for all the quant-dequant nodes. For the activations, currently the default value is at::ScalarType::Undefined. Remove this and explicitly pass the at::ScalarType::QUint8 dtype
Differential Revision: D15331600
fbshipit-source-id: 5b51e0b42e694bf409026af4783a12da6d7e234b
Summary:
Copy.cu goes from 308 to 190 lines of code. In general it uses the same
copy strategy: cudaMemcpyAsync, a pointwise kernel, or a copy
using temporary buffers. The pointwise kernel has slightly improved
performance when broadcasting due to faster index calculation.
This deletes "`s_copy_`", "`_s_copy_from`", and "`_copy_same_type_`". The only
entry-point now is "`copy_`".
A mini-benchmark is here:
https://gist.github.com/colesbury/706de1d4e8260afe046020988410b992
Before:
https://gist.github.com/colesbury/ab454b6fe3791bff420d7bcf8c041f18
After:
https://gist.github.com/colesbury/9024d242b56ab09a9ec985fa6d1620bc
Results were measured on 2.2 GHz Broadwell; no-turbo; one thread;
compiled with GCC 7.3.0. (Results are slower than typical usage due to
turbo being off.)
The only significant difference is in the CUDA [1024] -> [1024, 1024]
broadcasting copy which is ~25% faster. I don't expect a noticeable
difference in real programs.
CPU copy overhead is a tiny bit (~200 ns) faster, but I don't expect
anyone to notice that.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20685
Differential Revision: D15414819
Pulled By: colesbury
fbshipit-source-id: d3c6e04a5020470e3bef15b1fc09503cae5df440
Summary:
Symbols are given hidden visibility by default on Linux to emulate the behavior on Windows. This helps developers catch visibility issues in their streamlined Linux dev environment before being surprised, late in the process, by Windows errors.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20461
Reviewed By: kostmo
Differential Revision: D15410410
Pulled By: dzhulgakov
fbshipit-source-id: 1d684b5a9a80b692966a775c3f1c56b7c72ffc95
Summary:
Fixes#20017
This wraps the `torch._C.Function` currently returned from `torch.jit.script` and `torch.jit.trace` in a `ScriptFunction` and `TracedFunction` respectively, both of which are just wrappers to hold the function.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20386
Pulled By: driazati
Differential Revision: D15403161
fbshipit-source-id: 94fb9f32929e62a00be6cf7512ea144ec9b91e0b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20677
With new changes in IR, it is possible to insert nodes after param
nodes in graph. Thus we do not need to have two methods for inserting q-dq
nodes to input or output to quantizable nodes.
Differential Revision: D15406354
fbshipit-source-id: 1963762f434fd82877fa76a272e8520c342b6069
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20709
- Remove ArrayRef based API. This is neither the old nor the planned new API.
- De-deprecate kernels based on std::vector and std::unordered_map. We don't have the Dict/List based API figured out entirely yet, so we shouldn't push people towards using them.
std::vector and std::unordered_map will get deprecated again once we've figured out List/Dict.
Reviewed By: dzhulgakov
Differential Revision: D15417025
fbshipit-source-id: bfbb33c762e43487bb499bc8cc36d515e678f8fc
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20667
Compilation errors:
```
xplat/caffe2/caffe2/utils/signal_handler.h:31:10: error: private field 'SIGINT_action_' is not used [-Werror,-Wunused-private-field]
Action SIGINT_action_;
^
xplat/caffe2/caffe2/utils/signal_handler.h:32:10: error: private field 'SIGHUP_action_' is not used [-Werror,-Wunused-private-field]
Action SIGHUP_action_;
^
xplat/caffe2/caffe2/utils/signal_handler.h:33:17: error: private field 'my_sigint_count_' is not used [-Werror,-Wunused-private-field]
unsigned long my_sigint_count_;
^
xplat/caffe2/caffe2/utils/signal_handler.h:34:17: error: private field 'my_sighup_count_' is not used [-Werror,-Wunused-private-field]
unsigned long my_sighup_count_;
^
4 errors generated.
xplat/caffe2/caffe2/share/fb/stylizer/median_blur_ops.cc:593:14: error: private field 'ws_' is not used [-Werror,-Wunused-private-field]
Workspace* ws_;
^
1 error generated.
```
Reviewed By: bwasti
Differential Revision: D15402928
fbshipit-source-id: 5b98499850aa659fd37ab8e7f2e75166787b8129
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20040
Add the support of feature store example in fblearner pytorch predictor, end to end
Reviewed By: dzhulgakov
Differential Revision: D15177897
fbshipit-source-id: 0f6df8b064eb9844fc9ddae61e978d6574c22916
Summary:
load_state_dict includes a recursive inner function `load` that captures
Tensors through the close-over variable `state_dict`. Because it's
recursive, it also captures itself leading to a reference cycle.
This breaks the reference cycle so that any Tensors in state_dict can be
collected immediately instead of waiting until the next GC cycle.
Alternatively, we could have passed `state_dict` and `metadata` as
arguments to load to prevent capture of Tensors. (That would still
result in cyclic garbage, but not any cyclic garbage of Tensors).
See:
https://github.com/pytorch/pytorch/issues/20199#issuecomment-491089004
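The cycle and the fix can be reproduced in a few lines: a recursive closure's cell holds the function and the function holds the cell, so everything it captures (here a stand-in for `state_dict`) stays reachable until the cycle collector runs, while rebinding the name breaks the cycle:

```python
import gc
import weakref

class Payload:
    # stands in for a Tensor held by state_dict
    pass

def load_leaky(state_dict):
    def load(depth):
        _ = state_dict          # closure captures state_dict...
        if depth:
            load(depth - 1)     # ...and itself, forming a reference cycle
    load(2)

def load_fixed(state_dict):
    def load(depth):
        _ = state_dict
        if depth:
            load(depth - 1)
    load(2)
    load = None  # break the cycle so refcounting frees state_dict promptly

p = Payload()
ref = weakref.ref(p)
load_leaky({"w": p})
del p
leaked = ref() is not None      # cycle keeps the payload alive
gc.collect()                    # only the cycle collector frees it
collected = ref() is None

p = Payload()
ref2 = weakref.ref(p)
load_fixed({"w": p})
del p
freed_immediately = ref2() is None
```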
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20397
Differential Revision: D15414834
Pulled By: colesbury
fbshipit-source-id: 4c2275a08b2d8043deb3779db28be03bda15872d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20607
Add a new method SummaryWriter.flush() that iterates through all of the FileWriters and flushes them
Reviewed By: orionr
Differential Revision: D15380124
fbshipit-source-id: 1975f3f61c5ae3754552bfdb23f2cd78f687d19f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20649
I went through every occurrence of AT_ASSERT in this file and
thought about whether or not it should be TORCH_INTERNAL_ASSERT
or TORCH_CHECK. I think I did a good job at it. Some thoughts:
- In order to decide if a check is "internal" or not, we must
think about where the separation between userspace and our internals
are. I think any code that utilizes the PyTorch or Caffe2 C++ frontends
count as userspace. An important corollary is that the majority of operator
code "counts" as userspace, even though it lives in our repository. This
is in line with TCB (trusted computing base) thinking: you want the TCB to
be as small as possible, and because we have a *lot* of operator
implementations, they should not count as TCB.
- The primary test I applied when considering an AT_ASSERT was whether or
not I could trigger this error by just making method calls on caffe2::Tensor
or at::Tensor. If I could, that made it a TORCH_CHECK. This covers most
of the misapplications of TORCH_INTERNAL_ASSERT. One place I didn't
do this was the "is variable" checks; I think you have to work a bit
harder to trigger this case, and userspace code is not mixing up
Variables and Tensors.
- I updated the docs for device_opt_, explaining when it could be nullopt.
(The nullopt checks here are TORCH_CHECK, because you can trigger them
by taking an undefined tensor and poking the methods.)
Differential Revision: D15395576
fbshipit-source-id: 1c51b396012e7d949fbb4258092cf80e5e6f851b
Summary:
Fixes#20651
Communication collectives in `torch.distributed` call `CUDACachingAllocator::recordStream()` on input and output tensors to prevent their memory blocks being freed too early. `CUDACachingAllocator` uses tensor's data pointer to track memory blocks, which does not accept null pointers. However, empty tensor's `storage().data()` might be null. In this case, as there is no associated memory block for the empty tensor, it should be fine to make `recordStream()` a no-op.
Tests only cover `broadcast` empty tensors for GLOO backend, because GLOO does not support empty inputs (facebookincubator/gloo/issues/179). It can be addressed in either `ProcessGroupGloo` or GLOO itself. Will add more tests when that gap is filled.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20658
Differential Revision: D15399371
Pulled By: mrshenli
fbshipit-source-id: d29ebd1c72fddae49531f32695f81b89e42e5a4d
Summary:
According to https://pytorch.org/docs/stable/notes/broadcasting.html, in-place operations do not allow the in-place tensor to change shape as a result of the broadcast. Therefore our shape analysis could keep the shape information on inputs.
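A sketch of the rule the shape analysis relies on (hypothetical helper names): broadcast the two shapes right-aligned, and reject any in-place op whose broadcast result would differ from the in-place tensor's own shape:

```python
from itertools import zip_longest

def broadcast_shape(a, b):
    # right-aligned broadcasting, as described in the linked docs
    out = []
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if x != 1 and y != 1 and x != y:
            raise ValueError(f"shapes {a} and {b} do not broadcast")
        out.append(max(x, y))
    return tuple(reversed(out))

def inplace_result_shape(self_shape, other_shape):
    # an in-place op must not resize self, so the result keeps self's shape
    if broadcast_shape(self_shape, other_shape) != tuple(self_shape):
        raise ValueError("in-place broadcast would change self's shape")
    return tuple(self_shape)
```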
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20661
Differential Revision: D15406477
Pulled By: wanchaol
fbshipit-source-id: 8ab60e783292f2fe26e5fdecfb64bec43bca6826
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20561
We previously planned to deprecate the direct passing of a kernel function or lambda to the op() call, e.g.
static auto registry = RegisterOperators().op("my::op", &func);
and push users towards the options based API:
static auto registry = RegisterOperators().op("my::op", RegisterOperators::options().kernel<decltype(func), &func>());
because that has a slightly lower performance overhead when calling the kernel.
However, that overhead is negligible for all but exotic use cases, so there's no reason to push users towards a more verbose API.
This diff removes the deprecation warning from that API.
However, if you use the API together with deprecated types like std::unordered_map, you will now get a deprecation warning there.
Reviewed By: zdevito
Differential Revision: D15364271
fbshipit-source-id: 56dae0c5870bbab16ad19ba5178f4bea9eafed9f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20514
Change API from
static auto registry = c10::RegisterOperators()
.op("my::op",
c10::kernel(...),
c10::dispatchKey(...)
);
to
static auto registry = c10::RegisterOperators()
.op("my::op", c10::RegisterOperators::options()
.kernel(...)
.dispatchKey(...)
);
because this allows better discoverability. People looking for which options are available will easier find it and IDE autocompletion will work better.
Reviewed By: zdevito
Differential Revision: D15346348
fbshipit-source-id: 4b74a33b75c2b9cda4a903639fb7abd2c7cff167
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20672
Current test case looks for q->int_repr->dq pattern and constant nodes also.
The prim::Constant nodes are not guaranteed to be present at same point in graph.
So we modify the test case to only look for the q->int_repr->dq nodes.
Differential Revision: D15405606
fbshipit-source-id: 2086ffb5bbd328d2a9a55f4c2a2de342575194d3
Summary:
Otherwise users see something like (Tensor, Tensor)? and don't know what the ? means.
First commit is formatting.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20657
Differential Revision: D15400225
Pulled By: eellison
fbshipit-source-id: cf826790bf2ddafd34f6d5c144526cad9904770b
Summary:
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#19578 [jit] Try to script all Python functions**
This adds the `torch.jit._enable_recursive_script` context manager, which will try to compile any Python functions it sees. It's hidden behind an internal context manager for now since it's incomplete (doesn't work for script_methods/Python submodules). If it can't compile the Python function it outputs an error.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19578
Pulled By: driazati
Differential Revision: D15386727
fbshipit-source-id: 4e308f67677b8e9fccfc525a91bb2f4585062048
Summary:
Adds support for `__getstate__` and `__setstate__` on modules that are called as part of export (`torch.save()`) and import (`torch.jit.load`).
* `__getstate__` and `__setstate__` must be TorchScript functions with the signatures `() -> T` and `(T) -> None` respectively
* The results of `__getstate__` are stored using the pickler in `states.pkl` with one for each module in definition order (`__getstate__` returns `None` by default if an implementation is not provided)
* This prevents sharing between `__getstate__` and attributes, but this should be fine since their use is mostly unrelated (attributes are for storing values to be used in script methods, `__getstate__` for running arbitrary computations during import)
Follow up
* Somehow replacing `__getstate__`/`__setstate__` with a `ScriptMethodStub` makes `MyScriptModule().__getstate__()` call `ScriptModule.__getstate__()` when used in Python. This should be fixed so semantics in Python are preserved, but it doesn't affect the typical usage.
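The protocol mirrors what `pickle` does for plain Python objects; a minimal sketch of the contract (`() -> T` paired with `(T) -> None`, class names are illustrative):

```python
import pickle

class Counter:
    def __init__(self):
        self.n = 0
        self._scratch = lambda: None  # unpicklable state we don't want saved

    def __getstate__(self):           # () -> T
        return self.n

    def __setstate__(self, state):    # (T) -> None
        self.n = state
        self._scratch = lambda: None  # rebuilt on load, never serialized

c = Counter()
c.n = 5
restored = pickle.loads(pickle.dumps(c))  # works despite the lambda attribute
```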
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20242
Pulled By: driazati
Differential Revision: D15287161
fbshipit-source-id: b3f5f33ab74a21a89e6d15460af63aff75cab2d8
Summary:
- Earlier, we had to use the legacy implementation of `getri` for single matrix inverse from TH and THC
- Now, this has been moved to ATen
Changelog:
- Move single matrix inverse implementation to ATen
- Remove unused code in TH and THC resulting from the change
- Minor modifications made to single matrix CPU function implementations in ATen to avoid redundancy
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20534
Differential Revision: D15393383
Pulled By: ezyang
fbshipit-source-id: 81972111cd9757d15f1d634f294c93fd0f35636c
Summary:
Recent versions of GCC split unaligned load and store intrinsics into
two 128-bit instructions. On old processors (Sandy Bridge) this was a
bit faster for unaligned data, but bit slower for aligned data. On new
processors (Intel Haswell+, recent AMD) splitting loads is slower on
both aligned and unaligned data.
Clang, MSVC, and ICC do not split unaligned load and store intrinsics.
There's a good explanation here:
https://stackoverflow.com/questions/52626726/why-doesnt-gcc-resolve-mm256-loadu-pd-as-single-vmovupd#tab-top
Splitting load and store intrinsics makes no sense in our AVX2
configuration because the CPUs that support AVX2 instructions are the
same CPUs where splitting is disadvantageous on all data alignments.
Note that this doesn't change the AVX configuration (used by CPUs that
support AVX but not AVX2). It's possible this would be beneficial for
that configuration too (our data is usually 32-byte aligned), but I'd
prefer the conservative change for now.
torch.add generated assembly (hot loop) (GCC 7.3.0)
before:
https://gist.github.com/colesbury/066376537bccd514daf8fe4ab54d8295
after:
https://gist.github.com/colesbury/8b4b948145001d44b225c51d2428bb91
Timing of `torch.add(x, y, out=z)` for size 10240 (1 thread, Broadwell,
no turbo):
before: 7.35 us after: 6.39 us
(Take the torch.add timings with a grain of salt. The difference in timings
is much larger than I would expect.)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20609
Differential Revision: D15385800
Pulled By: colesbury
fbshipit-source-id: 66415b148a3b19360b9de9881af594ab46547b6f
Summary:
Change `Inputs` to `Shape` to unify the format of CTCLoss `class`, and add the type of `Output` in `Shape`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20422
Differential Revision: D15393484
Pulled By: ezyang
fbshipit-source-id: 5b49647f9740de77db49a566fa2de74fcecd9110
Summary:
CUDA 8 is no longer supported and removed from CI, so these checks are irrelevant
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20482
Differential Revision: D15393438
Pulled By: ezyang
fbshipit-source-id: ac0979bf660b3314eec502c745e34ce4940bda0e
Summary:
Fixes#20568.
Looks like CMake is passing `/MD` when we call `add_library`. We need to fix these with C source files too.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20574
Differential Revision: D15392682
Pulled By: ezyang
fbshipit-source-id: c92034d8725fcec48fd7db6cf5322868e956dc6b
Summary:
Fixes#20523 .
nn.Upsample was unable to accept tuple inputs for the scale_factor argument due to direct casting to float, which was done in #17732.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20581
Differential Revision: D15392622
Pulled By: ezyang
fbshipit-source-id: b56ba8197a5bbf8891bc7e1bebf5cad63dcab04d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20441
This op is fairly complex and the fact that it isn't formatted
correctly makes things that much harder to reason about. Clean it up.
Reviewed By: dreiss
Differential Revision: D15220006
fbshipit-source-id: 30632d8bdbf15f96e73d8b6c96c5f29c052e6e7c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20502
Following D15307410 removing more floating point exceptions in unit tests
Reviewed By: hx89
Differential Revision: D15340930
fbshipit-source-id: 269fc75e0800bc9d39126767a0f3ca15cd8b0cad
Summary:
(Reopens https://github.com/pytorch/pytorch/pull/20330 and fixes test error.)
After the Variable/Tensor merge, there is no guarantee that `indices` and `values` passed into the sparse tensor constructor don't contain AutogradMeta. However, we want to maintain the existing invariant that `indices_` and `values_` of a sparse tensor don't contain AutogradMeta, and to achieve this we need do shallow-copy in the sparse tensor constructor.
Note that this is BC-breaking for code that changes the sizes / strides of the indices or values tensor after it's used to create a sparse tensor. In current master, such changes will be reflected in the sparse tensor and break sparse tensor invariants. After this PR, those changes will not be reflected in the sparse tensor, and thus the sparse tensor invariants are always preserved. Specifically, running in-place size/stride-changing ops such as `resize_` / `resize_as_` / `as_strided_` / `set_` / `transpose_` on the original values tensor will not update the sparse tensor's `values_`. For example:
```python
# Calling resize_ on non-requires-grad value tensor
i2 = torch.zeros([1, 1])
v2 = torch.ones([1, 2, 3])
t2 = torch.sparse_coo_tensor(i2, v2, torch.Size([2, 2, 3]))
v2.resize_(4, 5)
t2.coalesce().values().size()
# On current master, this throws "indices and values must have same nnz, but got nnz from indices: 1, nnz from values: 4", because resizing the original value tensor affects `values_` of the sparse tensor.
# After this PR, this prints "torch.Size([1, 2, 3])", which means resizing the original value tensor doesn't affect `values_` of the sparse tensor.
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20614
Differential Revision: D15385811
Pulled By: yf225
fbshipit-source-id: e963fcf5e4097f8c881b56145f408565d97cf5c1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19932
In preparation to add int8_t data type for QTensor
Reviewed By: zafartahirov
Differential Revision: D15137838
fbshipit-source-id: 59462c36d6fc5982986d4196bf3f32f49bb294d7
Summary:
After the Variable/Tensor merge, there is no guarantee that `indices` and `values` passed into the sparse tensor constructor don't contain AutogradMeta. However, we want to maintain the existing invariant that `indices_` and `values_` of a sparse tensor don't contain AutogradMeta, and to achieve this we need do shallow-copy in the sparse tensor constructor.
Note that this is BC-breaking for code that changes the sizes / strides of the indices or values tensor after it's used to create a sparse tensor. In current master, such changes will be reflected in the sparse tensor and break sparse tensor invariants. After this PR, those changes will not be reflected in the sparse tensor, and thus the sparse tensor invariants are always preserved. Specifically, running in-place size/stride-changing ops such as `resize_` / `resize_as_` / `as_strided_` / `set_` / `transpose_` on the original values tensor will not update the sparse tensor's `values_`. For example:
```python
# Calling resize_ on non-requires-grad value tensor
i2 = torch.zeros([1, 1])
v2 = torch.ones([1, 2, 3])
t2 = torch.sparse_coo_tensor(i2, v2, torch.Size([2, 2, 3]))
v2.resize_(4, 5)
t2.coalesce().values().size()
# On current master, this throws "indices and values must have same nnz, but got nnz from indices: 1, nnz from values: 4", because resizing the original value tensor affects `values_` of the sparse tensor.
# After this PR, this prints "torch.Size([1, 2, 3])", which means resizing the original value tensor doesn't affect `values_` of the sparse tensor.
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20330
Differential Revision: D15373683
Pulled By: yf225
fbshipit-source-id: 32e7275d7121e17937c7cc258e8a60bb0848ff25
Summary:
Currently `bmm()` has very heavy performance overhead on CPU due to construction/deconstruction of `TensorImpl`. Applying `TensorAccessor` when indexing tensor data can greatly improve the performance.
I tested this on `fairseq` Transformer model. Results on Xeon 6148 (20*2 cores 2.5GHz) indicate this PR improves Transformer training performance by approximately **10%** (seconds per iteration reduced from **3.60** to **3.21**). Considering the fact that `bmm()` takes only **14%** of the total time, 10% overall improvement indicates `bmm()` itself improves by roughly **3x**.
Before:
```
| epoch 001: 0%| | 43/25337 [02:34<25:17:11, 3.60s/it, loss=16.179, nll_loss=16.137, ppl=72045.59, wps=1320, ups=0, wpb=4758.767, bsz=136.558, num_updates=43, lr=6.45e-06, gnorm=6.88
```
After:
```
| epoch 001: 0%| | 23/25337 [01:13<22:32:48, 3.21s/it, loss=17.072, nll_loss=17.068, ppl=137419.42, wps=1478, ups=0, wpb=4746.870, bsz=128.348, num_updates=23, lr=3.45e-06, gnorm=10.
```
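For reference, the access pattern `bmm()` performs is the plain triple loop below (pure-Python sketch); the PR's change is to drive these indices through a raw `TensorAccessor` rather than constructing tensor objects per index:

```python
def bmm(a, b):
    # batched matmul over nested lists: out[i] = a[i] @ b[i]
    B, N, M = len(a), len(a[0]), len(a[0][0])
    P = len(b[0][0])
    out = [[[0.0] * P for _ in range(N)] for _ in range(B)]
    for i in range(B):
        for n in range(N):
            for m in range(M):
                v = a[i][n][m]
                for p in range(P):
                    out[i][n][p] += v * b[i][m][p]
    return out
```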
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20266
Differential Revision: D15262201
Pulled By: cpuhrsch
fbshipit-source-id: c2e4e406c06714b04cc7534f3da71e986eddca35
Summary:
In onnx spec, the supported input/output type for `And` and `Or` is `Bool` only.
Thus in exporting, cast to/from `Bool` is inserted for input/output.
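The exported pattern is Cast-to-Bool, the logical op, then Cast back to the original dtype; sketched on plain lists (illustrative names, not the exporter's API):

```python
def export_logical_and(x, y, out_type=int):
    # Cast -> And -> Cast, mirroring the inserted ONNX nodes
    return [out_type(bool(a) and bool(b)) for a, b in zip(x, y)]
```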
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17894
Reviewed By: zrphercule
Differential Revision: D15103148
Pulled By: houseroad
fbshipit-source-id: 3e1068ea236c743260d42882fb11f0e3a21707e6
Summary:
First time this was merged it broke master and was reverted. This time I do not add ```set -u``` to the .circleci/scripts/setup* scripts. There's still a chance that ```set -u``` breaks the binary builds on master, but at least those can be fixed in parallel and don't completely eliminate signal from all merges.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20540
Differential Revision: D15373444
Pulled By: pjh5
fbshipit-source-id: 0203c20865827366ecd8fa07b2db74d255549ed1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20501
Fixing unit tests related to optimizer related operators and tests
Reviewed By: hx89
Differential Revision: D15307410
fbshipit-source-id: e5400c26e08f26191ee542fe6b02e0a69bc4e1ae
Summary:
#19975 was separated by 2 PRs.
This one:
Introduce MemoryFormat argument to the `x.is_contiguous(memory_format=torch.channels_last)` and to the `y = x.contiguous(memory_format=torch.channels_last)` functions.
At this moment both functions just operate on strides and don't store any tensor state.
(Original RFC #19092)
-----
Expands functionality of two tensor functions `.is_contiguous` and `.contiguous` (both python and c++ api).
Note: We had several complaints about `.to(memory_format)` function, and decided not to support it.
1. `.contiguous` now support optional keyword-only argument - `memory_format`, which can be either `torch.contiguous_format` or `torch.channels_last`.
- Using `torch.contiguous_format` will preserve existing `.contiguous()` behavior.
- Calling `x.contiguous(memory_format=torch.channels_last)` returns new tensor which maintain same semantical layout (NCHW), but have different memory allocation pattern.
`x.contiguous(memory_format=torch.channels_last)` expects input tensor to be 3d, 4d or 5d; and fails otherwise.
2. `.is_contiguous` now support optional keyword-only argument - `memory_format`, which can be either `torch.contiguous_format` or `torch.channels_last`.
- `x.is_contiguous(memory_format=torch.contiguous_format)` preserves same functionality as `x.is_contiguous()` and remains unchanged.
- `x.is_contiguous(memory_format=torch.channels_last)` returns true if A) input tensor is contiguous in memory AND B) allocated in memory in NHWC (or similar for 3d,5d) format.
Note: By the end of the phase one `x.is_contiguous(memory_format=torch.channels_last)` will calculate state of the Tensor on every call. This functionality going to be updated later.
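In stride terms, the distinction the two memory formats make is (a sketch; helper names are illustrative):

```python
def contiguous_strides(shape):
    # standard row-major (NCHW) strides
    strides, step = [], 1
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))

def channels_last_strides(shape):
    # NCHW sizes with NHWC memory layout: channels become the fastest-moving dim
    n, c, h, w = shape
    return (c * h * w, 1, w * c, c)
```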
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20455
Differential Revision: D15341577
Pulled By: VitalyFedyunin
fbshipit-source-id: bbb6b4159a8a49149110ad321109a3742383185d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20513
They've been using an old API, switch them to the new one instead.
Reviewed By: li-roy
Differential Revision: D15346349
fbshipit-source-id: 538eb460897ec6addebeebf88b316eb0d6b1dd6f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20379
The legacy custom op API allowed nesting of std::unordered_map and std::vector. While we haven't figured out yet how to do that with the new API,
we at least have to keep backwards compatibility. This diff adds the feature so we can switch to the new API without breaking third party code.
Reviewed By: li-roy
Differential Revision: D15287693
fbshipit-source-id: bb5b8429fddf6298719cbf567b584ed371f8fc81
Summary:
Previously, the caller of `shallow_copy_and_detach()` is responsible for deciding whether the shallow-copy should share the source TensorImpl's version counter, or have its own new version counter. However, since this decision is crucial for ensuring the correctness of the shallow-copy's version counter, we want to enforce users of `shallow_copy_and_detach()` to pass a version counter to the function call, so that they are required to make the decision at the time of API usage, not as an afterthought.
For similar reasons, we want to enforce users of `shallow_copy_and_detach()` to pass `allow_tensor_metadata_change` to the function call, so that they are required to decide "whether the TensorImpl shallow-copy should allow tensor metadata change" at the time of API usage, not as an afterthought.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20496
Differential Revision: D15363620
Pulled By: yf225
fbshipit-source-id: a65e74738b10452668d6dc644b43aad5b3d8c9e6
Summary:
Remove weak_script. After recently splitting the forward() function in the MultiheadAttention module, we noticed a memory leak on GPU. Fix the problem by removing those "weak_script" decorators.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20563
Differential Revision: D15368262
Pulled By: zhangguanheng66
fbshipit-source-id: 475db93c9ee0dbaea8fb914c004e7d1e0d419bc2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20543
All of that code for concatenating strings together adds up. Just discard it all for mobile builds.
Reviewed By: ljk53
Differential Revision: D15353447
fbshipit-source-id: a82dd0b884335d662605aabf7dd3d09dfcc1478b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20439
This is the QTensorProto workflow for multi-group quantization on the C2 side.
No DNNLOWP tensor related things are included in this PR, so once we finish the Glow side, we should be able to test this PR using ResNet50.
Reviewed By: yinghai
Differential Revision: D15096919
fbshipit-source-id: 741eecd59eb79d24d9fe2b035f6246d42422d25c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19816
We need this to support quantization for bias.
Add a third argument of ScalarType to `quantize_linear`.
Differential Revision: D15094174
fbshipit-source-id: f19ec8f4716cf5fe0aa21b38d45af6d27c9ab377
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20512
Fixing typos in the description of schema for one of the inputs for BatchMatMul operator.
Reviewed By: jianyuh, BIT-silence
Differential Revision: D15343879
fbshipit-source-id: 06354e8e6b0d79fea937ed2703bb457b2d04f859
Summary:
Fix a typo in the doc.
Add an AssertError check back to MultiheadAttention module
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20492
Differential Revision: D15349008
Pulled By: cpuhrsch
fbshipit-source-id: 2d898345f03787c713e537673613a748ad826b34
Summary:
The current variance kernels compute mean at the same time. Many times we want both statistics together, so it seems reasonable to have a kwarg/function that allows us to get both values without launching an extra kernel.
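Assuming the `var_mean` spelling that eventually landed for this feature, a quick sketch of getting both statistics from one call:

```python
import torch

x = torch.arange(6, dtype=torch.float32)

# One call returns both statistics instead of two separate kernel launches.
var, mean = torch.var_mean(x)
assert torch.isclose(mean, x.mean())
assert torch.isclose(var, x.var())
```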
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18731
Differential Revision: D14726082
Pulled By: ifedan
fbshipit-source-id: 473cba0227b69eb2240dca5e61a8f4366df0e029
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20374
This test case now also tests that the argument type works correctly in kernels that
- don't return outputs
- return multiple outputs
Reviewed By: li-roy
Differential Revision: D15298233
fbshipit-source-id: 82ab9d81b55b4f9fb34d66a155cc426af8592e25
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20373
- Add support for Dict<Key, Value> arguments and returns to c10 operators
- Add support for std::unordered_map<Key, Value> to the legacy API (but not to c10 kernels)
Reviewed By: li-roy
Differential Revision: D15298235
fbshipit-source-id: 6d9793db1f12bea377f508a9b33a495ebe0bec18
Summary:
Add automatic translations for a few argument names that commonly differ between PyTorch and NumPy.
For now, they are as follows:
* `keepdim` -> `keepdims`
* `dim` -> `axis`
* `input` -> (any of `a`, `x`, `x1`)
* `other` -> `x2`
Basic examples:
```python
>>> t = torch.randn(10, 10)
>>> torch.sum(x=t, axis=1)
tensor([ 0.5199, -0.3768, 4.3619, -0.9105, 1.1804, 1.0837, -0.9036, 0.2365,
1.1171, -0.0999])
```
```python
>>> torch.add(x1=5, x2=6)
tensor(11)
```
The additional overhead is zero when using traditional PyTorch argument names, and a few (usually 1) extra PyDict lookups when using NumPy argument names.
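The translation itself amounts to a per-keyword dictionary lookup; a hypothetical Python-level sketch of the idea (the real lookup lives in the C++ argument parser, and these helper names are illustrative):

```python
# Hypothetical sketch: the real implementation is in the C++ argument parser.
NUMPY_TO_TORCH = {
    "keepdims": "keepdim",
    "axis": "dim",
    "a": "input",
    "x": "input",
    "x1": "input",
    "x2": "other",
}

def translate_kwargs(kwargs):
    # One dict lookup per NumPy-style keyword; zero cost for positional args.
    return {NUMPY_TO_TORCH.get(name, name): value
            for name, value in kwargs.items()}
```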
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20451
Differential Revision: D15337521
Pulled By: umanwizard
fbshipit-source-id: 7a7d389786f4ccf5c86a14ecb2002c61730c51b5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20020
Add shape inference for the LearningRate op. The output (lr) should have a similar shape to the input (iteration), but not the same type (float vs. int).
Reviewed By: un-disclosed
Differential Revision: D15112300
fbshipit-source-id: 09969aefa15172a6f3c70cd9b2548e3020da5d7a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20372
Implement a Dict type that allows us to abstract away from the concrete implementation used.
The API is similar to std::unordered_map, but behind the scenes we can switch to any map implementation we like. ska::flat_hash_map, google dense map, or any future map implementation with better performance.
Switching such an implementation choice does not have to break backwards compatibility of kernel code using the Dict type.
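c10::Dict itself is C++, but the abstraction idea can be sketched in Python: callers program against a small, stable surface while the backing map stays swappable (the class and parameter names here are illustrative, not the actual API):

```python
class StableDict:
    """Illustrative sketch: a dict-like wrapper whose backing store can be
    swapped (e.g. for a flat hash map) without breaking caller code."""

    def __init__(self, backing_factory=dict):
        # Only this line changes when the implementation is swapped.
        self._impl = backing_factory()

    def __setitem__(self, key, value):
        self._impl[key] = value

    def __getitem__(self, key):
        return self._impl[key]

    def __contains__(self, key):
        return key in self._impl

    def __len__(self):
        return len(self._impl)
```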
Reviewed By: zdevito
Differential Revision: D15298234
fbshipit-source-id: b5ad368a9e9516030805cd8f5f1b02e3986933c0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20463
Source file changes mostly involve ifdef'ing-out references to JIT code
from files that are part of Caffe2Go. Update Internal build scripts to
remove those files from our globs.
After this, changes to most of the JIT files should not trigger mobile CI.
Reviewed By: dzhulgakov
Differential Revision: D15329407
fbshipit-source-id: 48f614c6b028eef0a03ce5161d083a3e078b0412
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20021
Add shape inference for the AtomicIter operator. The operator takes two blobs, iteration and iter_mutex, as input and outputs iteration, which should have the same type and shape as the input.
Reviewed By: un-disclosed
Differential Revision: D15111643
fbshipit-source-id: 0d06413305cc4c6257c0cfabf62fb874970803bc
Summary:
Moving functions from torch/nn/modules/activation.py to torch/nn/functional.py. For functions not implemented (_get_input_buffer and _set_input_buffer), a TODO is added.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20415
Differential Revision: D15318078
Pulled By: jamarshon
fbshipit-source-id: 5ca698e2913821442cf8609cc61ac8190496a3c6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20390
duc0 Ngo implemented observing floating point exceptions, but there were a couple of places where we had "benign" floating point exceptions leading to false positives. This diff eliminates one source of such false positives, namely using _mm256_cvtph_ps and _mm256_cvtps_ph on a partially uninitialized array in the remainder loop.
Reviewed By: hx89
Differential Revision: D15307358
fbshipit-source-id: 38f57dfdd90c70bc693292d2f9c33c7ba558e2c9
Summary:
Tagging along to changes in #20191 which added more support for types in the pickler
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20444
Pulled By: driazati
Differential Revision: D15321463
fbshipit-source-id: 985061bf5070a7d7bad58ea8db11d531f3d13e74
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20108
Add cpp runs for c2, hooked up via pybinds. Print output to terminal. This is not hooked up with the pep output yet because I'd like to verify the numbers first.
Note that this isn't quite the same mechanism as the pytorch cpp hookup, which uses cpp_python_extensions. If I can use the same mechanism to pull all the inputs for c2 through cpp and do FeedBlobs in cpp, then I'll switch to that.
Reviewed By: zheng-xq
Differential Revision: D15155976
fbshipit-source-id: 708079dacd3e19aacfe43d70c5e5bc54da2cf9e3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20321
First part of https://github.com/pytorch/pytorch/issues/20287
- Rename `AT_ASSERT` to `TORCH_INTERNAL_ASSERT`
- Make `TORCH_INTERNAL_ASSERT` work with variadic inputs
- Deprecated `AT_ASSERT` and `AT_ASSERTM`
- Rename `AT_CHECK` to `TORCH_CHECK`
- Make `TORCH_CHECK` give a better error message when no arguments are
provided
- Deprecate `AT_ERROR` in favor of `TORCH_CHECK(false, ...)`
- Deprecate `AT_INDEX_ERROR` in favor of `TORCH_CHECK_INDEX(false, ...)`
- Rename `AT_WARN` to `TORCH_WARN`
No use sites are changed; I'll work on that in follow up patches
(or disable the deprecation, if necessary.)
Differential Revision: D15278439
fbshipit-source-id: 7e0ed489d4e89e5f56b8ad7eafa72cb9a06065ee
Summary:
In https://github.com/pytorch/pytorch/pull/18223/files#diff-77a6f3462f2233b921d3042412fed6d3R178, we used `auto saved_version_ = data_.unsafeGetTensorImpl()->version_counter().current_version()` and then `new_data_impl_copy->set_version_counter(saved_version_)`, which actually doesn't preserve the original semantics that `var.set_data(tensor)` should keep `var`'s version counter object intact. This PR fixes the bug and adds test to make sure it doesn't happen again.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20391
Differential Revision: D15323430
Pulled By: yf225
fbshipit-source-id: e3ba49b51ec8ccecd51c80cb182387f74cfd2b2b
Summary:
As part of the Variable/Tensor merge, we allow passing Tensor with AutogradMeta into ATen ops, but we want to make sure they are not treated as Variables (i.e. their `is_variable()` is false). This PR makes the necessary change to make this work.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20392
Differential Revision: D15321899
Pulled By: yf225
fbshipit-source-id: c2ab09db73c63bd71ba2d8391095f4d6b4240a9a
Summary:
"then the output would also has k tensors" -> "then the output would also have k tensors"
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20425
Differential Revision: D15320152
Pulled By: zou3519
fbshipit-source-id: b04e2ccd29c6a3e33ad1040d0ea975a01a7bd9b5
Summary:
As a first step for this plan: https://github.com/pytorch/pytorch/issues/19508#issuecomment-485178192, this PR moves `THCTensor_(uniform)` to ATen. Major changes are:
- `uniform_` cuda kernel now utilizes a philox generator.
- the kernel also utilizes TensorIterator
- the kernel uses a grid-stride loop to achieve peak effective bandwidth
- Since the engine has changed from `curandStateMTGP32` to `curandStatePhilox4_32_10`, the randoms generated now will be different.
- Here is the diff showing codegen changes: https://gist.github.com/syed-ahmed/4af9ae0d42b6c7dbaa13b9dd0d1dd1e8 (BC breaking change if any)
- Philox4_32_10 is known to pass the standard TestU01 Big Crush test (https://www.thesalmons.org/john/random123/papers/random123sc11.pdf) and hence the quality of random numbers generated isn't an issue when compared to the previously used `curandStateMTGP32`.
- I have added a test case in `aten/src/ATen/test/cuda_distributions_test.cu` which verifies that philox offset is incremented properly
The benchmark was done on a DGX station with 4 V100s.
I modified the script from jcjohnson's [multinomial benchmark](https://github.com/jcjohnson/pytorch-multinomial-benchmark) to produce this notebook, which shows that there is a general speedup with this PR and that no regression has been introduced: https://gist.github.com/syed-ahmed/9d26d4e96308aed274d0f2c7be5218ef
To reproduce the notebook:
- Run https://gist.github.com/syed-ahmed/4208c22c541f1d30ad6a9b1efc1d728f in a container with the current pytorch top of tree with the command: `python uniform_benchmark.py --stats_json before.json`
- Apply this diff to the current pytorch top of tree and run the same script in a container with the command: `python uniform_benchmark.py --stats_json after.json`
- Run the notebook attached above with the `after.json` and `before.json` in the same directory
The effected bandwidth was calculated using the script (thanks to ngimel ): https://gist.github.com/syed-ahmed/f8b7384d642f4bce484228b508b4bc68
Following are the numbers before and after.
```
uniform, size, elements 65536 forward 5.168914794921875e-06 bandwidth (GB/s) 50.71548098597786
uniform, size, elements 131072 forward 5.056858062744141e-06 bandwidth (GB/s) 103.67860705101367
uniform, size, elements 262144 forward 7.164478302001953e-06 bandwidth (GB/s) 146.357621001797
uniform, size, elements 524288 forward 1.1217594146728515e-05 bandwidth (GB/s) 186.9520302275877
uniform, size, elements 1048576 forward 1.923084259033203e-05 bandwidth (GB/s) 218.10297600317384
uniform, size, elements 2097152 forward 3.640890121459961e-05 bandwidth (GB/s) 230.39992200138826
uniform, size, elements 4194304 forward 6.778717041015625e-05 bandwidth (GB/s) 247.49839679819922
uniform, size, elements 8388608 forward 0.00012810707092285157 bandwidth (GB/s) 261.92490202361347
uniform, size, elements 16777216 forward 0.00025241613388061524 bandwidth (GB/s) 265.86598474620627
uniform, size, elements 33554432 forward 0.000497891902923584 bandwidth (GB/s) 269.5720239913193
```
```
uniform, size, elements 65536 forward 5.550384521484375e-06 bandwidth (GB/s) 47.22988091821306
uniform, size, elements 131072 forward 5.581378936767578e-06 bandwidth (GB/s) 93.93520954942333
uniform, size, elements 262144 forward 6.165504455566406e-06 bandwidth (GB/s) 170.071404141686
uniform, size, elements 524288 forward 6.3276290893554685e-06 bandwidth (GB/s) 331.4277702414469
uniform, size, elements 1048576 forward 8.509159088134765e-06 bandwidth (GB/s) 492.91639239047356
uniform, size, elements 2097152 forward 1.2989044189453124e-05 bandwidth (GB/s) 645.8218077979443
uniform, size, elements 4194304 forward 2.347707748413086e-05 bandwidth (GB/s) 714.6211452997259
uniform, size, elements 8388608 forward 4.4286251068115234e-05 bandwidth (GB/s) 757.6715389250498
uniform, size, elements 16777216 forward 8.672237396240235e-05 bandwidth (GB/s) 773.8356427961071
uniform, size, elements 33554432 forward 0.00016920566558837892 bandwidth (GB/s) 793.2224227438523
```
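The effective-bandwidth figures above are bytes written divided by kernel time; a rough sketch of that computation (the linked gist is authoritative; this version also runs on CPU):

```python
import time
import torch

def uniform_bandwidth_gbs(num_elements, device="cpu", iters=10):
    # Effective bandwidth in GB/s: bytes written by uniform_ per second.
    x = torch.empty(num_elements, device=device)
    if device == "cuda":
        torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(iters):
        x.uniform_()
    if device == "cuda":
        torch.cuda.synchronize()
    seconds = (time.perf_counter() - start) / iters
    return num_elements * x.element_size() / seconds / 1e9
```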
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20292
Differential Revision: D15277761
Pulled By: ezyang
fbshipit-source-id: 8bfe31a01eeed77f0ed6e7ec4d2dda4c6472ecaa
Summary:
To fully support the incremental_state function, several additional utils available in fairseq are required. However, we lack a proper unit test for it. Therefore, the incremental_state function will be disabled for now. If it is needed in the future, a feature request could be created. Fixes #20132
Add some unit tests to cover the arguments of MultiheadAttention module, including bias, add_bias_kv, add_zero_attn, key_padding_mask, need_weights, attn_mask.
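A sketch of exercising those arguments (shapes follow the (seq_len, batch, embed_dim) convention; the sizes here are arbitrary):

```python
import torch
import torch.nn as nn

mha = nn.MultiheadAttention(embed_dim=8, num_heads=2,
                            add_bias_kv=True, add_zero_attn=True)
query = torch.randn(4, 2, 8)          # (seq_len, batch, embed_dim)
key_padding_mask = torch.zeros(2, 4, dtype=torch.bool)  # nothing masked

out, attn_weights = mha(query, query, query,
                        key_padding_mask=key_padding_mask,
                        need_weights=True)
assert out.shape == (4, 2, 8)
assert attn_weights is not None        # returned because need_weights=True
```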
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20177
Differential Revision: D15304575
Pulled By: cpuhrsch
fbshipit-source-id: ebd8cc0f11a4da0c0998bf0c7e4e341585e5685a
Summary:
We don't need to overlay the VC env when not using Ninja; CMake will deal with it automatically. Overlaying is a no-op when the env is the same as the specified generator, but it generates the error "Cannot find CMAKE_CXX_COMPILER" when they are different.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20417
Differential Revision: D15317081
Pulled By: ezyang
fbshipit-source-id: 5d9100321ecd593e810c31158f22c67d3e34973b
Summary:
This is an attempt to isolate unrelated changes from #19228 for easier review.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20150
Differential Revision: D15314891
Pulled By: ezyang
fbshipit-source-id: 8c429747ba83ad5aca4cdd8f8086bcf65a326921
Summary:
* Constructs a new type at runtime so that `isinstance` checks work for
weak modules assigned to `ScriptModule`s
* Fix some extraneous names in `__constants__`
* Add `in_features` and `out_features` to `nn.Linear` `__constants__`
Fixes #19363
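The isinstance trick can be sketched in plain Python: build a new class at runtime that subclasses the original module class (the names below are illustrative, not the actual JIT internals):

```python
class Linear:            # stand-in for nn.Linear
    def __init__(self, in_features, out_features):
        self.in_features = in_features
        self.out_features = out_features

def make_weak_type(cls):
    # Construct a new type at runtime; instances still pass
    # isinstance checks against the original class.
    return type("Weak" + cls.__name__, (cls,), {"_is_weak": True})

WeakLinear = make_weak_type(Linear)
layer = WeakLinear(3, 4)
assert isinstance(layer, Linear)
assert layer._is_weak
```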
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20190
Pulled By: driazati
Differential Revision: D15302350
fbshipit-source-id: 1d4d21ed44ab9578a4bc2a72396a82e9bbcd387c
Summary:
TensorList, DoubleList, and BoolList were missing from the pickler, so
this adds them.
As a follow up a lot of the code for these could be templated and cut
down
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20191
Pulled By: driazati
Differential Revision: D15299106
fbshipit-source-id: f10c0c9af9d60a6b7fb8d93cea9f550b1a7e2415
Summary:
Given that tensorboardX and our PyTorch 1.1 release had `log_dir` as the argument for SummaryWriter initialization and member variable (which some users access), we need to preserve this name. However, we might deprecate this in the future and I've added a `get_logdir` method that can be used in the future.
cc natalialunova, lanpa
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20382
Reviewed By: NarineK
Differential Revision: D15300941
Pulled By: orionr
fbshipit-source-id: a29a70fcbc614a32ebfa6c655962fdff081af1af
Summary:
This code is unused and has been superseded by TensorIterators.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20207
Differential Revision: D15240832
Pulled By: cpuhrsch
fbshipit-source-id: 4f600bb8645f9b28a137e2cefb099978f5152d05
Summary:
This PR add Poisson NLL loss to aten and substitute the python implementation with a call to the c++.
Fixes #19186.
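A quick sketch of the resulting op via `torch.nn.functional.poisson_nll_loss` (with the default `log_input=True`, the per-element loss is `exp(input) - target * input`):

```python
import torch
import torch.nn.functional as F

log_rate = torch.randn(5)
target = torch.tensor([0., 1., 2., 3., 4.])

# Defaults: log_input=True, full=False, reduction='mean'.
loss = F.poisson_nll_loss(log_rate, target)
expected = (log_rate.exp() - target * log_rate).mean()
assert torch.isclose(loss, expected)
```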
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19316
Differential Revision: D15012957
Pulled By: ezyang
fbshipit-source-id: 0a3f56e8307969c2f9cc321b5357a496c3d1784e
Summary:
This PR is an intermediate step toward the ultimate goal of eliminating "caffe2" in favor of "torch". This PR moves all of the files that had constituted "libtorch.so" into the "libcaffe2.so" library, and wraps "libcaffe2.so" with a shell library named "libtorch.so". This means that, for now, `caffe2/CMakeLists.txt` becomes a lot bigger, and `torch/CMakeLists.txt` becomes smaller.
The torch Python bindings (`torch_python.so`) still remain in `torch/CMakeLists.txt`.
The follow-up to this PR will rename references to `caffe2` to `torch`, and flatten the shell into one library.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17783
Differential Revision: D15284178
Pulled By: kostmo
fbshipit-source-id: a08387d735ae20652527ced4e69fd75b8ff88b05
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19803
There is no reason to set a specific logging level for this module. Removing it to just use the default logging level.
Differential Revision: D15098834
fbshipit-source-id: 1654c04500c19690ddde03343f2e84b04bb0f1ef
Summary:
Fixes #20250
Not sure if there's any specific design reason to use `add_dependencies()` and manually add a few include dirs, instead of linking the target.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20319
Differential Revision: D15294584
Pulled By: ezyang
fbshipit-source-id: 97f813a6b1829dad49958e0f880b33eb95747607
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20351
This was broken because of a merge race between #20282 and the stack in #20236.
Cleaned up the test and comments a bit as well.
Differential Revision: D15292786
fbshipit-source-id: a4379ea700cad959d3a6921fc5ddf9384fb8f228
Summary:
The trick here is that creating a mapping from const values to
const values means that downstream clients that want to mutate
the output of the mapping are stuck. However, a mapping from
const values to non-const values is just fine and doesn't put
constraints on downstream clients.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20303
Differential Revision: D15284076
fbshipit-source-id: 16206fd910dd5f83218525ca301b1889df0586cb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20236
Use the new version of broadcast_coalesced that deals with both CPU
and CUDA models. Add tests that evaluate correctness of
DistributedDataParallel for CPU models.
Closes #17757.
Reviewed By: mrshenli
Differential Revision: D15245428
fbshipit-source-id: d2fa09f68593b3cd1b72efeb13f5af23ebd5c80a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20235
The tests expected to only run for CUDA models. In a future commit we
need to update this to work for CPU models as well. Therefore, we can
no longer rely on only integers being passed for device identifiers.
With this change we pass both the materialized list of devices to use
(as `torch.Device` objects), as well as an optional list of integers.
The latter is specified to exercise the code in the
DistributedDataParallel constructor that turns a list of integers into
CUDA devices, IFF it is used to wrap a single-device CUDA module.
This commit also groups together the 'str' and non-'str' tests. These
used to test passing the list of devices as integers or as
`torch.Device` instances. These are now executed from the same test.
Reviewed By: mrshenli
Differential Revision: D15245429
fbshipit-source-id: 5797ba9db33d2c26db8e7493c91bb52f694285ac
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20234
The differences with the existing function _dist_broadcast_coalesced
is that this one works for both CPU and CUDA tensors and that it has a
maximum number of in flight operations.
This should be the final change needed to have only a single version
of DistributedDataParallel that both supports CPU and CUDA models, or
even a mix of both.
See #17757 for more information.
Reviewed By: mrshenli
Differential Revision: D15228099
fbshipit-source-id: a2113ba6b09b68cb5328f49f4c1960031eb43c93
Summary:
isConvFusion(...) is only valid for Conv ops.
Calling it on a non-Conv op causes a crash.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20139
Differential Revision: D15280604
Pulled By: yinghai
fbshipit-source-id: eb45be11990b3bf7c5b45f02ebb6018444ab5357
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20281
This is how it looks like now:
```
<FunctionEventAvg key=mm self_cpu_time=11.404s cpu_time=2.895ms
cuda_time=0.000us input_shapes=[[26, 4096], [4096, 1024]]>
```
Previously I forgot to update the repr for these when I updated it for
non-averaged events.
Differential Revision: D15262862
fbshipit-source-id: a9e5b32c347b31118f98b4b5bf2bf46c1cc6d0d2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20185
This test case now also tests that the argument type works correctly in kernels that
- don't return outputs
- return multiple outputs
Reviewed By: li-roy
Differential Revision: D15227621
fbshipit-source-id: 83db7536e9065e0f8c5032d6b96e970bbaf718b3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20184
- Add support for Dict<Key, Value> arguments and returns to c10 operators
- Add support for std::unordered_map<Key, Value> to the legacy API (but not to c10 kernels)
Reviewed By: li-roy
Differential Revision: D15227620
fbshipit-source-id: c1ea6c12165e07b74272cb48c6021bdb5c2d7922
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19976
Implement a Dict type that allows us to abstract away from the concrete implementation used.
The API is similar to std::unordered_map, but behind the scenes we can switch to any map implementation we like. ska::flat_hash_map, google dense map, or any future map implementation with better performance.
Switching such an implementation choice does not have to break backwards compatibility of kernel code using the Dict type.
Reviewed By: li-roy
Differential Revision: D15156384
fbshipit-source-id: b9313ec4dd9acb3b6a0035345b6ba4f2a437d1e5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20282
Add a unit test to ensure no gradient sync happens when calling ddp.module(input), i.e., not invoking prepare_for_backward.
PyText depends on DDP for data-parallel distributed training. To support accumulating gradients locally before gradient sync, we call orig_model.forward instead of ddp_model.forward. Add a unit test to prevent future changes from breaking this assumption.
Reviewed By: pietern, mrshenli
Differential Revision: D15263155
fbshipit-source-id: 7734e174f507690fb23ea6c52dffff4a93f9b151
Summary:
Fixes #20215
The confusing behavior was caused by typos in type annotation :(
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20306
Differential Revision: D15276216
Pulled By: ailzhang
fbshipit-source-id: 1b0c9635a72a05c9b537f80d85b117b5077fbec7
Summary:
This addresses #18436
The logic replicates the essence of closing file descriptors in numpy:
bf20e30340/numpy/core/include/numpy/npy_3kcompat.h (L278)
This stores the position of the file descriptor before resetting it to the Python handle offset, then resets to the original position before exit. The Python-side handle is then updated to reflect the new position. Also added somewhat more demanding tests to cover this.
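The save/seek/restore dance can be sketched in Python (the helper name is illustrative; the real code is in C++ and driven by the serialization reader):

```python
import os
import tempfile

def read_at_python_offset(f, nbytes):
    # Seek the OS-level descriptor to the Python handle's offset, read,
    # then restore the descriptor's original position; finally update
    # the Python-side handle to reflect the new position.
    fd = f.fileno()
    saved = os.lseek(fd, 0, os.SEEK_CUR)      # remember original position
    os.lseek(fd, f.tell(), os.SEEK_SET)       # jump to Python-side offset
    try:
        data = os.read(fd, nbytes)
        new_offset = os.lseek(fd, 0, os.SEEK_CUR)
    finally:
        os.lseek(fd, saved, os.SEEK_SET)      # restore before returning
    f.seek(new_offset)                        # sync the Python handle
    return data

with tempfile.TemporaryFile("w+b") as f:
    f.write(b"abcdef")
    f.seek(2)                                 # Python-side offset
    assert read_at_python_offset(f, 3) == b"cde"
    assert f.tell() == 5                      # handle reflects new position
```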
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20270
Differential Revision: D15275902
Pulled By: soumith
fbshipit-source-id: 5ca8a52b61c7718d2e69571f72f80b1350b0acdb
Summary:
See Issue #20301
Specifying dim in docstring example to prevent UserWarning.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20310
Differential Revision: D15277734
Pulled By: ezyang
fbshipit-source-id: 2e8b748dbe743675a5a538ccbe97713aad02e8ac
Summary:
Schema matching for sort is a little tricky because you have to check whether the class defines the __lt__ method correctly, and you have to mark whatever effects happen in __lt__ to the sorting op as well.
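The contract can be illustrated in plain Python, since the TorchScript sort op mirrors how `sorted()` drives `__lt__` on user-defined classes:

```python
class Pair:
    def __init__(self, key, label):
        self.key = key
        self.label = label

    def __lt__(self, other):
        # sorted() (and the TorchScript sort op) only needs __lt__.
        return self.key < other.key

pairs = [Pair(3, "c"), Pair(1, "a"), Pair(2, "b")]
assert [p.label for p in sorted(pairs)] == ["a", "b", "c"]
```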
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19706
Differential Revision: D15244366
Pulled By: eellison
fbshipit-source-id: 73b3e36462c6cc40f9d8cb235b44499a67d3149e
Summary:
This PR restricts the BatchType template argument of ChunkDataReader to STL
vectors only. Internally, ChunkDataReader was assuming BatchType was a
vector, but the user could pass any type to the template argument,
leading to compilation issues when building C++ extensions.
In addition to the proposed API change, this PR adds missing include
headers to chunk.h. The current implementation works, but if
users try to create C++ extensions that implement new ChunkDataReaders
alongside the existing ChunkDataset, the build will fail due to
the missing headers.
In terms of functionality, nothing has changed. This PR simply makes the
implementation slightly more robust for future extensions.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19485
Differential Revision: D15261725
Pulled By: soumith
fbshipit-source-id: 38c9465d665392ae6a2d12c5a520a4f501e1a6ca
Summary:
C++ `Scatter` and `Gather` always set autograd history for input data tensors regardless whether they require grad. This hits assertion failure in `set_history(Tensor, shared_ptr<Function> grad_fn)`
where `grad_fn` cannot be nullptr. After this PR, C++ `Scatter` and `Gather` only record `grad_fn` when required.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20286
Differential Revision: D15266610
Pulled By: mrshenli
fbshipit-source-id: 641df0ea36e7c922b5820c8dc3f83e2a050412b5
Summary:
Previously we only had a Python wrapper for `torch.quantized_lstm_cell`. We had the op `torch.quantized_lstm`, but it didn't have a wrapper. This PR adds that wrapper.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20249
Reviewed By: driazati
Differential Revision: D15250023
Pulled By: jamesr66a
fbshipit-source-id: f05ad784d903e0ef3a62633c8bf80bad79de48ae
Summary:
This is a step towards enabling the ONNX constant folding pass by default in the PT->ONNX export. In this change we have enabled test points in `test/onnx/test_pytorch_onnx_caffe2.py` to run with constant folding pass enabled.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20290
Reviewed By: zrphercule
Differential Revision: D15271674
Pulled By: houseroad
fbshipit-source-id: 9e59ab46ae74b4ad8dea1a2200ecc1f3eb8aad75
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19660
Implementation of aggregated Scale operator.
The operator takes a list of tensors as input and scales all of them by the float argument value.
The tensor sizes can differ, so bookkeeping of the sizes and pointers to the tensors is
necessary for the GPU version of the kernel.
Reviewed By: BIT-silence
Differential Revision: D14984233
fbshipit-source-id: 37cc97159a4f2c38cd6fff4f5710ab7d3a773611
Summary:
Don't make an alias value for a value that is known to be None. This was preventing constant propagation from running the `out is None` check in nn.functional.normalize, and thus preventing the if statement from being inlined.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20112
Differential Revision: D15267328
Pulled By: eellison
fbshipit-source-id: 5b878b0dc50944c2e7a2f583ea483dad9d6bbec3
Summary:
This PR makes `torch.save` call out to the pickler which saves a tensor in the same format that `torch.save()` does, the file looks like `| pickle archive 1 (includes sizes, strides, requires_grad, etc...) | pickle archive 2 (list of tensor keys) | tensor binary data |` and can be read back in with `torch.load(my_file, pickle_module=torch.jit._pickle)`
Fixes #18003
Unpickling in the JIT for things such as model parallelism will be a follow up PR
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18154
Pulled By: driazati
Differential Revision: D15015160
fbshipit-source-id: ef76a44b8c243f4794cd7e245ec8305e965bc59f
Summary:
Also, the current MKL-DNN primitive code recreates the computation every time, causing tiny convolutions to spend a significant portion of their time on the repeated codegen. ideep has implemented an LRU cache to save the computation, so this change will help improve performance for tiny convolutions.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19963
Differential Revision: D15156527
Pulled By: bddppq
fbshipit-source-id: 6a8fbd10a213ec22cdeaff1a2bdb0d09905d1fcd
Summary:
Canonicalize the ordering of outputs of if and loop nodes based on their first usage. Previously we were able to canonicalize output order by sorting on variable name, but this breaks down with outputs added in an early return pass.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20015
Differential Revision: D15266066
Pulled By: eellison
fbshipit-source-id: ba5340c068a68b1ffc73f056db194b92d3274dc4
Summary:
cc nairbv
All failures I have seen are of this combination, so let's just disable it for all cases. After #20063 I found it failing for py3 once.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20172
Differential Revision: D15266527
Pulled By: nairbv
fbshipit-source-id: afb9389dfc54a0878d52975ffa37a0fd2aa3a735
Summary:
As part of supporting writing data into a TensorBoard-readable format, we show more examples of how to use the function in addition to the API docs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20008
Reviewed By: natalialunova
Differential Revision: D15261502
Pulled By: orionr
fbshipit-source-id: 16611695a27e74bfcdf311e7cad40196e0947038
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19541
For the quantized FC operator, replace the tuple (Tensor, scale, zero_point) with QTensor.
Differential Revision: D14900407
fbshipit-source-id: 164df38f3564e0a68af21b9fedaba98a44ca1453
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19497
Implements a basic quantized FC (uint8 * int8 -> uint8) with FBGEMM APIs.
Related document:
https://fb.quip.com/rP5tAx56ApMM
https://fb.quip.com/MU7aAbzGDesu
Work Item List:
1. [DONE] currently we use prepack routines inside Quantized FC operator. Will separate it as a standalone operator soon.
2. [DONE] rebase to D14817809 and D14994781 (cpp custom types).
3. [DONE] correctness unit test.
4. [To Do] rebase to QTensor. Similar to D14565413, this will be implemented in the next Diff.
Differential Revision: D14761865
fbshipit-source-id: 031a39915fecd947afb4dd2719112b4ddc1082d3
Summary:
Fixes https://github.com/pyro-ppl/pyro/issues/1853
This fixes a memory leak in `torch._dirichlet_grad()`. This function is used for reparametrized gradients for the `Dirichlet` and `Beta` distributions.
- [x] Could a reviewer please confirm that `freeCopyTo()` is being used correctly and doesn't need an additional `decref()`? The author is unfamiliar with PyTorch C++ memory utilities. Help appreciated.
- ran locally and confirmed leak is fixed
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20244
Differential Revision: D15259008
Pulled By: ezyang
fbshipit-source-id: 222ec7d80ddd97bcdd7d54549f3e756575e8402e
Summary:
This just updates the `JIT` comments with the issue number #20215. Hopefully this will stop the proliferation of the workaround. :)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20222
Differential Revision: D15259007
Pulled By: ezyang
fbshipit-source-id: 5060a351aa618c6dae49d0b7a6ac9b0f57f2490a
Summary:
The earlier fix to extract scripts missed an attach_workspace which was used to make the built binaries available to the nightly build upload jobs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20265
Differential Revision: D15259080
Pulled By: soumith
fbshipit-source-id: bf835c2cd76976b4563798ee348f7db83c7a79c1
Summary:
The current pytorch config.yml is causing some backend performance
problems on CircleCI, due to the size of the file when all of the YAML
anchors have been expanded. You can view the "processed" config as our
internal systems deal with it, by running `circleci config process`:
circleci config process .circleci/config.yml | wc -c
Before: 2833769 bytes
After: 558252 bytes (~80% less)
Create a new job, `setup`, that has 2 functions:
- Assert that config.yml is up to date
- Put the .circleci/scripts directory into a workspace, so that
downstream jobs can easily access it.
The `setup` job becomes the parent of all jobs in the workflow. This
allows us to fail fast if config is invalid. It might be a good place to
add other, quick, lint checks to help fail the build faster.
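The workflow shape described above can be sketched as a config fragment; the `pytorch_linux_build` job name and the details here are placeholders for illustration, not the actual config:

```yaml
version: 2
workflows:
  build:
    jobs:
      - setup                  # asserts config.yml freshness, persists .circleci/scripts
      - pytorch_linux_build:   # placeholder job name
          requires:
            - setup            # every downstream job fails fast if setup fails
```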
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19674
Differential Revision: D15252864
Pulled By: pjh5
fbshipit-source-id: 0778c7b8f95e7f3f33ac92fbb8862377fc9fb0ac
Summary:
This ensures that custom operators registered through c10::RegisterOperators are recorded in autograd profile traces.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20175
Differential Revision: D15221311
Pulled By: jamesr66a
fbshipit-source-id: 9452b24272c2399c20a49af85b62d34cabe6e27a
Summary:
Do tests with common models from torchvision.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20007
Differential Revision: D15251754
Pulled By: orionr
fbshipit-source-id: 9dc09bd407b3ccaaa310d2f4a8d53d5a7d12469d
Summary:
We now can build libtorch for Android.
This patch aims to provide two improvements to the build
- Make the architecture overridable by providing an environment variable `ANDROID_ABI`.
- Use `--target install` when calling cmake to actually get the header files nicely in one place.
I ran the script without options to see if the caffe2 builds are affected (in particular by the install), but they seem to run OK and probably only produce a few files in build_android/install.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20152
Differential Revision: D15249020
Pulled By: pjh5
fbshipit-source-id: bc89f1dcadce36f63dc93f9249cba90a7fc9e93d
Summary:
Support operator overloading for User Defined Types, which includes desugaring `a + b` and python builtin functions which call into a method if it is defined like `len(x)`.
See https://rszalski.github.io/magicmethods/ for list of magic methods.
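As a plain-Python illustration of the desugaring involved (the `Vec` class is invented for this example, not taken from the PR), `a + b` dispatches to `__add__` and `len(x)` to `__len__`:

```python
class Vec:
    def __init__(self, xs):
        self.xs = xs

    def __add__(self, other):
        # `a + b` desugars to a.__add__(b)
        return Vec([a + b for a, b in zip(self.xs, other.xs)])

    def __len__(self):
        # `len(x)` desugars to x.__len__()
        return len(self.xs)

v = Vec([1, 2]) + Vec([3, 4])
print(v.xs, len(v))  # [4, 6] 2
```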
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20033
Reviewed By: driazati
Differential Revision: D15246573
Pulled By: eellison
fbshipit-source-id: 03d45dd524ea2a3b40db36843d6067bede27b30d
Summary:
Similar to the "too few blank lines" rule, I feel this is not important enough to warrant breaking signal for all linters when it's violated.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20225
Differential Revision: D15243480
Pulled By: suo
fbshipit-source-id: 37cdc18daf09e07081e42b69c72d331d81660217
Summary:
This is useful when you would like to understand performance
bottlenecks of your model. One can use the shape analysis in order to
fit model to a roofline model of their hardware.
Please note that this feature can potentially skew profiling
results. Timing for non-nested events will also become wrong; one
should only use timing for the bottom-most events when shape analysis
is used. For the case where people don't need shapes, profiling
should not be affected, as in this case we don't collect shapes, which
is the default behavior and this diff doesn't change it.
One of the next steps
could be, for example, choosing the best candidates for quantization. In
the scope of this diff I am just adding optional shape collection
to the Event class. On top of that, there is minor functionality in Python
for grouping by shapes.
In the output tables shapes are truncated, but for grouping the full
shape string is used as the key.
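A rough Python sketch of the grouping idea (the event tuples and field names here are invented for illustration; the real profiler stores richer `Event` records):

```python
from collections import defaultdict

def group_by_shapes(events):
    """Aggregate (name, input_shapes, cpu_time_ms) records, using the
    full shape string together with the op name as the key."""
    groups = defaultdict(lambda: {"count": 0, "cpu_total_ms": 0.0})
    for name, shapes, cpu_time_ms in events:
        key = (name, str(shapes))  # full shape string, not truncated
        groups[key]["count"] += 1
        groups[key]["cpu_total_ms"] += cpu_time_ms
    return dict(groups)

events = [
    ("addmm", [[30], [128, 20], [20, 30]], 9.199),
    ("addmm", [[30], [128, 20], [20, 30]], 9.250),
    ("addmm", [[40], [128, 30], [30, 40]], 3.621),
]
grouped = group_by_shapes(events)
```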
Here is an example output:
test_profiler_shapes (test_autograd.TestAutograd) ...
```
------------------ --------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- -----------------------------------
Name Self CPU total % Self CPU total CPU total % CPU total CPU time avg CUDA total % CUDA total CUDA time avg Number of Calls Input Shapes
------------------ --------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- -----------------------------------
unsigned short 2.30% 305.031us 2.30% 305.031us 305.031us NaN 0.000us 0.000us 1 [[30, 20]]
addmm 69.40% 9.199ms 69.40% 9.199ms 9.199ms NaN 0.000us 0.000us 1 [[30], [128, 20], [20, 30], [], []]
unsigned short 0.98% 129.326us 0.98% 129.326us 129.326us NaN 0.000us 0.000us 1 [[40, 30]]
addmm 27.32% 3.621ms 27.32% 3.621ms 3.621ms NaN 0.000us 0.000us 1 [[40], [128, 30], [30, 40], [], []]
------------------ --------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- -----------------------------------
Self CPU time total: 13.255ms
CUDA time total: 0.000us
------------------ --------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- -----------------------------------
Name Self CPU total % Self CPU total CPU total % CPU total CPU time avg CUDA total % CUDA total CUDA time avg Number of Calls Input Shapes
------------------ --------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- -----------------------------------
unsigned short 2.30% 305.031us 2.30% 305.031us 305.031us NaN 0.000us 0.000us 1 [[30, 20]]
addmm 69.40% 9.199ms 69.40% 9.199ms 9.199ms NaN 0.000us 0.000us 1 [[30], [128, 20], [20, 30], [], []]
unsigned short 0.98% 129.326us 0.98% 129.326us 129.326us NaN 0.000us 0.000us 1 [[40, 30]]
addmm 27.32% 3.621ms 27.32% 3.621ms 3.621ms NaN 0.000us 0.000us 1 [[40], [128, 30], [30, 40], [], []]
------------------ --------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- -----------------------------------
Self CPU time total: 13.255ms
CUDA time total: 0.000us
```
Also added this for older aggregation test:
```
test_profiler_aggregation_lstm (test_autograd.TestAutograd) ...
======================================================================================================================================================================================================
TEST
----------------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- -----------------------------------
Name Self CPU total % Self CPU total CPU total % CPU total CPU time avg CUDA total % CUDA total CUDA time avg Number of Calls Input Shapes
----------------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- -----------------------------------
lstm 0.69% 4.606ms 5.30% 35.507ms 35.507ms NaN 0.000us 0.000us 1 [[5, 3, 10]]
lstm 0.67% 4.521ms 5.27% 35.340ms 35.340ms NaN 0.000us 0.000us 1 [[5, 3, 10]]
lstm 0.66% 4.399ms 5.02% 33.638ms 33.638ms NaN 0.000us 0.000us 1 [[5, 3, 10]]
lstm 0.65% 4.354ms 4.92% 32.958ms 32.958ms NaN 0.000us 0.000us 1 [[5, 3, 10]]
lstm 0.65% 4.351ms 4.96% 33.241ms 33.241ms NaN 0.000us 0.000us 1 [[5, 3, 10]]
lstm 0.65% 4.323ms 5.10% 34.163ms 34.163ms NaN 0.000us 0.000us 1 [[5, 3, 10]]
lstm 0.64% 4.304ms 4.92% 32.938ms 32.938ms NaN 0.000us 0.000us 1 [[5, 3, 10]]
lstm 0.64% 4.300ms 5.10% 34.172ms 34.172ms NaN 0.000us 0.000us 1 [[5, 3, 10]]
lstm 0.64% 4.292ms 5.05% 33.828ms 33.828ms NaN 0.000us 0.000us 1 [[5, 3, 10]]
lstm 0.64% 4.263ms 4.98% 33.357ms 33.357ms NaN 0.000us 0.000us 1 [[5, 3, 10]]
----------------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- -----------------------------------
Self CPU time total: 670.120ms
CUDA time total: 0.000us
----------------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- -----------------------------------
Name Self CPU total % Self CPU total CPU total % CPU total CPU time avg CUDA total % CUDA total CUDA time avg Number of Calls Input Shapes
----------------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- -----------------------------------
sigmoid 15.32% 102.647ms 15.32% 102.647ms 171.078us NaN 0.000us 0.000us 600 [[3, 20]]
mul 15.20% 101.854ms 15.20% 101.854ms 169.757us NaN 0.000us 0.000us 600 [[3, 20], [3, 20]]
lstm 12.74% 85.355ms 100.00% 670.120ms 33.506ms NaN 0.000us 0.000us 20 [[5, 3, 10]]
addmm 11.16% 74.808ms 11.16% 74.808ms 249.361us NaN 0.000us 0.000us 300 [[80], [3, 20], [20, 80], [], []]
tanh 9.89% 66.247ms 9.89% 66.247ms 165.617us NaN 0.000us 0.000us 400 [[3, 20]]
split 6.42% 43.019ms 6.42% 43.019ms 215.095us NaN 0.000us 0.000us 200 [[3, 80]]
add 5.67% 38.020ms 5.67% 38.020ms 190.101us NaN 0.000us 0.000us 200 [[3, 80], [3, 80], []]
add 4.81% 32.225ms 4.81% 32.225ms 161.124us NaN 0.000us 0.000us 200 [[3, 20], [3, 20], []]
addmm 3.79% 25.380ms 3.79% 25.380ms 253.796us NaN 0.000us 0.000us 100 [[80], [3, 10], [10, 80], [], []]
unsigned short 3.72% 24.925ms 3.72% 24.925ms 83.083us NaN 0.000us 0.000us 300 [[80, 20]]
----------------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- --------------- -----------------------------------
Self CPU time total: 670.120ms
CUDA time total: 0.000us
Total time based on python measurements: 691.366ms
CPU time measurement python side overhead: 3.17%
ok
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20035
Differential Revision: D15174987
Pulled By: salexspb
fbshipit-source-id: 9600c5d1d1a4c2cba08b320fed9da155d8284ab9
Summary:
In line 508.
convert_sync_batchnorm is called recursively to convert BatchNorm to SyncBatchNorm, so the process_group should also be passed to the recursive call.
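A pure-Python analogue of the fix (the `Module`, `BatchNorm`, and `SyncBatchNorm` classes here are simplified stand-ins for their `torch.nn` counterparts, not the real implementation): the recursive call must forward `process_group`, otherwise nested BatchNorm layers would be converted with the default group.

```python
class Module:
    def __init__(self, children=()):
        self.children = list(children)

class BatchNorm(Module):
    pass

class SyncBatchNorm(Module):
    def __init__(self, process_group=None):
        super().__init__()
        self.process_group = process_group

def convert(module, process_group=None):
    if isinstance(module, BatchNorm):
        return SyncBatchNorm(process_group)
    # The fix: forward process_group on the recursive call.
    module.children = [convert(child, process_group) for child in module.children]
    return module

net = Module([Module([BatchNorm()])])
net = convert(net, process_group="my_group")
```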
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19240
Differential Revision: D15240318
Pulled By: ezyang
fbshipit-source-id: 0fc9e856392824814991e5e9e8f9513d57f311af
Summary: This can be used for problems where the action vector must sum to 1
Reviewed By: kittipatv
Differential Revision: D15206348
fbshipit-source-id: 665fbed893d8c52d451a12d3bb2e73b2638b7963
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20186
We're missing two USE_CUDA macro for GPU-related code in THD's DataChannelGloo.
Also adding GlooCache back to compilation.
Differential Revision: D15227502
fbshipit-source-id: f260e1cb294d662ba0c170931913b64287d62344
Summary:
Currently, the constant folding pass during ONNX conversion removes all onnx::Constant nodes that are parents of folded nodes. In situations where the parent onnx::Constant node has other subscribers downstream, this could be a problem. This change updates the removal logic to remove only those onnx::Constant nodes that do not have other subscribers downstream.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20109
Reviewed By: zrphercule
Differential Revision: D15220392
Pulled By: houseroad
fbshipit-source-id: 150788654ea1c84262becaffd6de152114bf76c0
Summary:
Add logging import and a failed MLP model that confirms that we don't fail `add_graph` when graph optimization fails.
This addresses part of https://github.com/pytorch/pytorch/issues/18903
cc lanpa ezyang natalialunova
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20115
Reviewed By: natalialunova
Differential Revision: D15206765
Pulled By: orionr
fbshipit-source-id: c40b7e2671ef845a1529a2910ba030159f53f393
Summary:
This patch specializes `Optional[Tensor]` graph inputs to either a `DimensionedTensorType` (if a Tensor is passed) or `NoneType`. Other `Optional[T]` are specialized to `T` or `None`.
- For unwrapping (checked and unchecked) we need to keep the output type, as IR code that follows unwrapping may not work with NoneType (just as it doesn't deal with Optional). While it would not be hit during execution, it will run against the (legitimate) assumptions of the analysis passes.
- Function lookup currently will not match NoneType when it expects optional (I'm not entirely sure why this doesn't lead to unhappiness currently, but hey), I amend this at the level of the function matching code (`operator.cpp`), but see Adam's comments. We would run into trouble if we needed to select between functions whose signatures only differ in Optional types with different subtypes, but we would have the same problem when calling them directly, so I would think this is OK.
- It would enable throwing away branches we can't hit. This also reduces the "blockiness" of the graph, so it may be easier to apply optimizations (e.g. fuse things inside `if t is None: ...` and outside the `if`).
- Arguments passed into `Optional[Tensor]` arguments will get shape information, which is very handy.
- It gets rid of the problem that tensors passed into Optional arguments get requires_grad set erroneously #18270 (though that also affects lists, which aren't fixed here).
- `Optional[List[int]]` is needed for #18697.
- We're changing typing in a more subtle way than the `TensorType`->`DimensionedTensorType`.
- In particular, specializing to NoneType loses the Type information captured in the `OptionalType` element type.
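A plain-Python sketch of why the specialization helps (the `scale` function is invented for illustration, not from the PR): once an `Optional[float]` input is specialized to `float` or `None`, one arm of the `is None` check becomes dead and could be pruned:

```python
from typing import Optional

def scale(x: float, factor: Optional[float]) -> float:
    if factor is None:       # dead branch once factor is known to be float
        return x
    return x * factor        # dead branch once factor is known to be None

# Specialized to float: only the multiply branch can run.
assert scale(2.0, 3.0) == 6.0
# Specialized to NoneType: only the passthrough branch can run.
assert scale(2.0, None) == 2.0
```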
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18407
Reviewed By: zdevito
Differential Revision: D15216808
Pulled By: eellison
fbshipit-source-id: 01f1a7643deaf4962c3f55eff2070d54b0e54b69
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20062
Previously, the batch counter is incremented even if none of the readers has data. In this diff,
1) Limiter is applied to the last reader so that the batch counter is not incremented unless the first N-1 readers have data
2) The stop blob of the last reader is used as the stop blob of the task, so that it's checked before the counter is incremented
Reviewed By: xianjiec
Differential Revision: D15099761
fbshipit-source-id: 47ed6c728118fe453cf57ac3457085867939485b
Summary:
As a work around for dynamic shape case.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20093
Reviewed By: zrphercule
Differential Revision: D15220661
Pulled By: houseroad
fbshipit-source-id: de271fce542be380bd49a3c74032c61f9aed3b67
Summary:
Fix for https://github.com/pytorch/pytorch/issues/16962
This needs fixing because we turn lists into tuples when constantifying a module, so indexing into a Tuple of one element type with a non-constant integer is quite common.
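In plain Python the pattern looks like this (a hypothetical example, not from the PR): indexing a tuple whose elements all share one type with an index computed at runtime:

```python
def pick(ts, i):
    # ts behaves like a constantified list: a tuple of one element type.
    # i is not a compile-time constant, which is the case this fix supports.
    return ts[i % len(ts)]

sizes = (10, 20, 30)
assert pick(sizes, 4) == 20   # 4 % 3 == 1
```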
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20081
Differential Revision: D15205893
Pulled By: eellison
fbshipit-source-id: 61d74ee071ad0aad98e46fe807d6f6cc5f6abd2f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19513
Add support for printing a QTensor in python frontend
Differential Revision: D15017168
fbshipit-source-id: 312d1f18e6ca3c9eb4a5b8bb1c64f7cc8bc1dcf5
Summary:
Eigen was updated with the commit needed to get rid of this warning that plagued the CI. This PR bumps third_party/eigen to that commit head.
```
warning: #warning "host_defines.h is an internal header file and must not be used directly. This file will be removed in a future CUDA release. Please use cuda_runtime_api.h or cuda_runtime.h instead." [-Wcpp]
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19917
Differential Revision: D15218183
Pulled By: ezyang
fbshipit-source-id: 653c7d61ea401a7d4469c2009612dc43cc70122d
Summary:
pytorch failed to build with the following error, complaining about the first regex match
It may be caused by a bug in python 2.7.5
This change proposed is a workaround for building pytorch with python 2.7.5
Since the '*' notation is greedy in Python regexes, the new expression produces results identical to the old one.
```
Traceback (most recent call last):
File "/data2/nihuini/pytorch/cmake/../aten/src/ATen/gen.py", line 14, in <module>
import preprocess_declarations
File "/data2/nihuini/pytorch/aten/src/ATen/preprocess_declarations.py", line 3, in <module>
from function_wrapper import TYPE_FORMAL_GENERIC
File "/data2/nihuini/pytorch/aten/src/ATen/function_wrapper.py", line 5, in <module>
from code_template import CodeTemplate
File "/data2/nihuini/pytorch/aten/src/ATen/code_template.py", line 13, in <module>
class CodeTemplate(object):
File "/data2/nihuini/pytorch/aten/src/ATen/code_template.py", line 23, in CodeTemplate
subtitution = re.compile(substitution_str, re.MULTILINE)
File "/usr/lib64/python2.7/re.py", line 190, in compile
return _compile(pattern, flags)
File "/usr/lib64/python2.7/re.py", line 242, in _compile
raise error, v # invalid expression
sre_constants.error: nothing to repeat
--
CMake Error at cmake/Codegen.cmake:162 (message):
Failed to get generated_cpp list
Call Stack (most recent call first):
caffe2/CMakeLists.txt:2 (include)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20137
Differential Revision: D15218122
Pulled By: ezyang
fbshipit-source-id: 10b618ff92a04e9074f5d83e31411fc2341e0cf8
Summary:
Some functions were not decorated with `CAFFE2_API`, makes them unusable when creating unit tests for custom ops outside Caffe2 repo.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20114
Differential Revision: D15217490
Pulled By: ezyang
fbshipit-source-id: dda3910ad24e566567607deaac705a34ec8e7b8d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20044
We do not have a gating functor; this diff adds it. I'm leveraging the existing learning rate op because there are other policies I'll need to use together as a union.
* Since there are other policies in LearningRateOp which will be used as a union, I chose to add it as a LearningRateOp.
* constantwarmup cannot express a step function that is nonzero first and zero later.
* There are multiple uses for it,
* e.g. as a gating blob generator that is useful for turning off.
* e.g. as a learning rate switcher at certain iteration.
* For generalizability, no regulation or constraint is applied on the range of the values
* see figure below for illustration
{F157366621}
Reviewed By: ccheng16
Differential Revision: D15178229
fbshipit-source-id: 1e66e9a4bc1bfb946a57f8aefc97d8170f6be731
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19777
When used from iPython/Jupyter/Bento, the cell doing kernel registration might be executed multiple times.
Also, there might be a kernel library that wants to overwrite one of the existing kernels.
Let's allow this.
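A minimal Python analogue of the "last registration wins" behavior described above (`KernelRegistry` is a hypothetical stand-in, not the c10 API):

```python
class KernelRegistry:
    def __init__(self):
        self._kernels = {}

    def register(self, op_name, kernel):
        # Re-running a registration cell simply overwrites the previous
        # kernel instead of raising a duplicate-registration error.
        self._kernels[op_name] = kernel

    def lookup(self, op_name):
        return self._kernels[op_name]

reg = KernelRegistry()
reg.register("my::relu", lambda x: max(x, 0))
reg.register("my::relu", lambda x: x if x > 0 else 0)  # same cell, run twice
```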
Reviewed By: dzhulgakov
Differential Revision: D15090318
fbshipit-source-id: 09f842e8fd36646053c5c2f11325de4d31105b0c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19776
The diff stacked on top of this enables overwriting of kernels, but before that, we need to do some refactorings.
This diff:
- hides deregistration logic behind a RAII class so we can keep more information about what exactly to deregister without the API user knowing about it.
- Split KernelRegistration from SchemaRegistration by taking Dispatcher::OperatorDef and moving it to a different file. This is better readable, especially since kernel registration will become more complex in the next diff.
- Move LeftRight synchronization out of DispatchTable to Operator because there will be a mutex added to Operator in the next diff and related synchronization primitives shouldn't live on different abstraction levels.
Reviewed By: dzhulgakov
Differential Revision: D15090322
fbshipit-source-id: 2e51a192075163f0d496956d9e54b9aaf26b2369
Summary:
This PR adds a new trace API `trace_module` that will allow us to trace multiple methods as a part of a single `ScriptModule`
See the example below.
```python
class Net(nn.Module):
def __init__(self):
super(Net, self).__init__()
self.conv = nn.Conv2d(1, 1, 3)
def forward(self, x):
return self.conv(x)
def weighted_kernel_sum(self, weight):
return weight * self.conv.weight
example_weight = torch.rand(1, 1, 3, 3)
example_forward_input = torch.rand(1, 1, 3, 3)
n = Net()
inputs = {'forward' : example_forward_input, 'weighted_kernel_sum' : example_weight}
module = torch.jit.trace_module(n, inputs)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19905
Differential Revision: D15200007
Pulled By: Krovatkin
fbshipit-source-id: 0354d973fe40cb6e58b395bd866df14e0fc29d5b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19784
For backwards compatibility, we allow vector arguments if a kernel is registered with the deprecated API.
Reviewed By: dzhulgakov
Differential Revision: D15091972
fbshipit-source-id: 4db3e3a262e605504b05c42d40046011408501d2
Summary:
Class attributes should preferably be explicitly initialized within
the __init__() call. Otherwise, overriding step() is
prone to bugs.
This patch partially reverts #7889
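A small illustration of the pattern the patch restores (`Scheduler` here is a hypothetical stand-in for the scheduler classes, not the actual torch.optim code): state used by `step()` is set explicitly in `__init__`, so a subclass that overrides `step()` finds it already there:

```python
class Scheduler:
    def __init__(self, base_lr):
        self.base_lr = base_lr
        # Initialized up front, not lazily inside step(): an overriding
        # subclass can rely on the attribute existing.
        self.last_epoch = -1

    def step(self):
        self.last_epoch += 1
        return self.get_lr()

    def get_lr(self):
        return self.base_lr * (0.9 ** self.last_epoch)

class WarmScheduler(Scheduler):
    def step(self):
        # Safe: last_epoch exists even though this override runs first.
        if self.last_epoch < 0:
            self.last_epoch = 0
            return self.base_lr
        return super().step()

s = WarmScheduler(1.0)
first = s.step()   # warm-up step at base_lr
second = s.step()  # decayed step
```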
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20059
Differential Revision: D15195747
Pulled By: soumith
fbshipit-source-id: 3d1a51d8c725d6f14e3e91ee94c7bc7a7d6c1713
Summary:
This takes care of some outstanding review comments for https://github.com/pytorch/pytorch/pull/16196/
Specifically:
1. Add comment about kind
2. Add comment about GraphPy
3. Remove ONNX version comment
4. Remove scalar_dict from SummaryWriter and all history functions
cc lanpa ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20038
Reviewed By: natalialunova
Differential Revision: D15177257
Pulled By: orionr
fbshipit-source-id: 218aa799d8b7dbb58f422a331236bba4959347de
Summary:
Sometimes people need to checkout an older version and build PyTorch. In that case, they need to do `git submodule sync` and maybe `git submodule update --init` as mentioned [here](https://github.com/pytorch/pytorch/issues/20074).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20088
Differential Revision: D15195729
Pulled By: soumith
fbshipit-source-id: 73232b801e5524cdba462dd504fb973d95d0498c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20027
Before this change, any pointer was converted to bool, i.e.
IValue("string") == IValue(true)
After this change, it does the right thing and creates a string.
Reviewed By: dzhulgakov
Differential Revision: D15172409
fbshipit-source-id: 8167dd780005f9bceef4fe3c751f752e42ceeb20
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19910
This change modifies the quant-dequant node pattern from
qparam->q->dq to qparam->q->int_repr->qparam->dq. The motivation for
this change is to make the qparams required for op substitution available one
level up, at the dequant node, instead of multiple levels up.
Differential Revision: D15120146
fbshipit-source-id: 74b0fd5cb50a338f562740a9cc727a7791c718c3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19680
This was broken for quite some time because of an operator schema
check that went into effect at some point in time.
Reviewed By: manojkris
Differential Revision: D15055082
fbshipit-source-id: 7f730f9b810bdaffd69bab7ac4d02c5b2e40645b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19402
This pass propagate the qparams calculated after calibration to the
quant nodes which will be used later for quantization
Differential Revision: D14995230
fbshipit-source-id: 5709153ea1c039c4ab4470ddb689a303b0bcc6fd
Summary:
From the comment: "don't use TYPE again in case it is an expensive or side-effect op"
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19968
Differential Revision: D15151567
Pulled By: gchanan
fbshipit-source-id: 4d42c081ac1472b71f1cea5172cb42a7c83a7043
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19232
Add observer nodes to collect stats for input data nodes excluding params
which are constant at inference and need not be observed. This information
is required to compute quantization params.
Differential Revision: D14885485
fbshipit-source-id: 8762cc2a4e510e1553b3dbd1d1aecd55b4bdb89f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19783
Previously, the IValues were copied into the kernel arguments, which caused a refcount bump if Tensor was taken by value.
Now, a kernel can take Tensor by value without any refcount bump because it is moved in.
Reviewed By: dzhulgakov
Differential Revision: D15091973
fbshipit-source-id: 4c5ff2e3ee86f5934cc84191697f7dbc9c3ee345
Summary:
The second commit will be removed before landing.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19965
Differential Revision: D15153243
Pulled By: pjh5
fbshipit-source-id: 70eae38d0cb07dc732c0cf044d36ec36d0a4472d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19945
This logging was removed in D6888977, but I think it could be useful for debugging upstream data issues to check the violating index
Reviewed By: xianjiec
Differential Revision: D15127887
fbshipit-source-id: 4ad7eceefcd063bf45bc190a4c0d458a089c918a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19681
For accelerators, we need to lower just the quantized weight data without layout transformation. This diff provides that option.
Reviewed By: jerryzh168, zrphercule
Differential Revision: D15066568
fbshipit-source-id: 133d749e087c2ad4a899bee5e96f597f70b2443c
Summary:
log_normal_ and geometric_ were disabled for CPU by mistake in [this PR](bc53805f2e), this PR fixes it.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19938
Differential Revision: D15143404
Pulled By: izdeby
fbshipit-source-id: 41c7bd29f046b5a3ac6d601de8c64ab553771d19
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19811
This does not immediately take effect for the custom op API but will break backwards compatibility once we switch to the new operator registration.
Reviewed By: dzhulgakov
Differential Revision: D15101924
fbshipit-source-id: 8890a5a3e163d3263dc1837be0b4851984771917
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19779
This macro wasn't set correctly because the target macros weren't included from Apple's header.
Reviewed By: dzhulgakov
Differential Revision: D15090427
fbshipit-source-id: 43ca44f0f409e11718b7f60c3fdcd2aa02d7018e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19773
const members can't be moved, so whenever somebody moved a function schema, it was copied instead.
This diff fixes this.
Reviewed By: dzhulgakov
Differential Revision: D15090323
fbshipit-source-id: 123a1d6b96ac46cb237966c0b072edebcdafe54c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19817
A lot of files were depending on the JIT's typesystem
because operator.h depends on function_schema.h. However,
this isn't fundamental to the design. This diff tries to
remove the direct depenency and only includes the c10
wrapper helpers in files where it is required.
Reviewed By: smessmer
Differential Revision: D15112247
fbshipit-source-id: 2c53d83e542c32d9a398c8b60dbf40ab7a1cb0f6
Summary:
This adds method details and corrects an example on the page that didn't run properly. I've now confirmed that it runs in colab with nightly.
For those with internal access the rendered result can be seen at https://home.fburl.com/~orionr/pytorch-docs/tensorboard.html
cc lanpa, soumith, ezyang, brianjo
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19915
Differential Revision: D15137430
Pulled By: orionr
fbshipit-source-id: 833368fb90f9d75231b8243b43de594b475b2cb1
Summary:
Trying to get this in before 1.1
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19918
Reviewed By: driazati
Differential Revision: D15124430
Pulled By: eellison
fbshipit-source-id: 549cdcbaff91218657e94ce08c0f4e69b576d809
Summary:
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#19638 [jit] Serialize attribute module as torch.jit._pickle**
* use `torch.jit._pickle` as the module for globals in the pickle program. Pickle will try to resolve these to the actual functions in `torch.jit._pickle.py` automatically (I believe this can also be overridden to point to whatever functions you want). This means that `pickle.load("my_model/attributes.pkl")` will work instead of having to use a custom `pickle.Unpickler`
* use `REDUCE` opcodes instead of `BUILD` to make use of the last bullet
* use a union in the unpickler to support globals better (+ any future metadata we might need that can't be stored in an `IValue`), this makes some of the code around `IntList`s clearer and lets us get rid of any lookbehind for opcodes
* pickle things as a tuple instead of a list (an immutable result is more semantically correct)](https://our.intern.facebook.com/intern/diff/15111203/)
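The `REDUCE`-based mechanism can be sketched in plain Python (`build_intlist` is a hypothetical helper standing in for the functions in `torch.jit._pickle`): a global referenced from the pickle program resolves to a real module-level function at load time, so a plain `pickle.load` works without a custom `Unpickler`:

```python
import pickle

def build_intlist(values):
    # Hypothetical stand-in for a torch.jit._pickle helper.
    return list(values)

class IntListHolder:
    def __reduce__(self):
        # Emits a GLOBAL opcode for build_intlist plus a REDUCE opcode,
        # so plain pickle.loads() reconstructs the value by calling it.
        return (build_intlist, ((1, 2, 3),))

payload = pickle.dumps(IntListHolder())
restored = pickle.loads(payload)
```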
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19638
Pulled By: driazati
Differential Revision: D15111203
fbshipit-source-id: 526c6c2b63a48eb1cba1c658045a7809730070dd
Summary:
Added deprecation warnings for the masked methods and enabled them for a bool tensor.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19140
Differential Revision: D14888021
Pulled By: izdeby
fbshipit-source-id: 0e42daf8f3732ca29f36d10485402bfc502716ad
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19688
Minor changes to hipify script to take extra folders.
Reviewed By: bddppq
Differential Revision: D15068427
fbshipit-source-id: e2e792c8227cbd0e15fd2564f87d740a62c477da
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19901
The existing code used `expect_autograd_hooks_` as a proxy for the
situation where finalization of the previous iteration is needed. This
is not correct, however, since you may decide to completely ignore the
output of a DDP wrapped module. If this is the case, and no gradients
have been passed to the reducer, it is fine to keep going. This commit
adds a new variable `require_finalize_` that tracks whether the
finalization is really needed.
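The gist of the new flag can be sketched in a few lines of Python (a toy model of the reducer, not the actual C++ implementation):

```python
class Reducer:
    """Toy sketch: only finalize when gradients actually reached the reducer."""

    def __init__(self):
        self.require_finalize = False

    def autograd_hook(self):
        # Called when autograd hands a gradient to the reducer.
        self.require_finalize = True

    def prepare_for_backward(self):
        # If the DDP output was ignored and no gradients arrived,
        # there is nothing to finalize and we can keep going.
        if self.require_finalize:
            self.require_finalize = False
            return "finalized"
        return "skipped"
```

Under this model, skipping an iteration's output no longer trips the finalization logic.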
Reviewed By: mrshenli
Differential Revision: D15118871
fbshipit-source-id: 25938eaf1fe13e2940feae1312892b9d3da8a67d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19897
During validation, gradient reduction is not needed, and autograd is
never called. The model output will always be a detached tensor. After
the new reducer was merged, this meant that it would find all model
parameters unused and kick off reduction for them. With #19799, this is
treated as an output where no parameters are used, and the reducer tries
to kick off reduction of zeroed gradients. Test for `torch.is_grad_enabled()`
and `self.training` before calling into the reducer.
Reviewed By: mrshenli
Differential Revision: D15118726
fbshipit-source-id: b0208f632a61cbe8110fa626fa427937b7f05924
Summary:
As DDP in previous releases does not support unused params, turning off `find_unused_parameters` by default to derisk new reducer.
CC pietern soumith
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19895
Reviewed By: pietern
Differential Revision: D15118563
Pulled By: mrshenli
fbshipit-source-id: 6215c486e1dae3387b36011d8e64a2721ac85f58
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19821
It is possible that not a single parameter is used during an
iteration. If this is the case, the `prepare_for_backward` function
marks all parameters as unused, kicks off reduction of all buckets,
*and* finalizes the reduction.
This is different from the prior implementation where we assumed that
autograd would produce a gradient for at least a single parameter.
We then used the autograd callback mechanism to queue a finalizer
callback. Now, this finalizer may be executed in line.
Reviewed By: mrshenli
Differential Revision: D15113272
fbshipit-source-id: dc91458b569cd8c106ddaeea558464b515683550
Summary:
When output blob names are specified with load_all=1, the output blob names are ignored. However, this behavior is not documented. In this diff, we simply disallow users from providing blob names when load_all=1.
See discussion at https://fb.workplace.com/groups/1405155842844877/permalink/2714909788536136/
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19133
Reviewed By: dzhulgakov
Differential Revision: D14883698
Pulled By: chandlerzuo
fbshipit-source-id: 6e4171e36c4ccc4f857e79da98b858a06b7d8ad6
Summary:
Previously, in type unification, when we encountered an Optional[T] and a None, we would unify them to Optional[Optional[T]]. If you think about Optionals as a union of [T, None], then a union of [Optional[T], None] is still [T, None]. We should just never create an Optional of an Optional.
The other fix would be to change unify_types directly, but I think this is the more general fix, and it plays more nicely with our optional type refinement, which also assumes we never encounter an Optional[Optional[T]].
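A toy sketch of the corrected rule, with made-up tuple type descriptors rather than the actual JIT type objects:

```python
def is_optional(t):
    return isinstance(t, tuple) and t[0] == "Optional"

def unify(a, b):
    """Unify two toy type descriptors, never producing Optional[Optional[T]]."""
    if a == b:
        return a
    # None unifies with Optional[T] to Optional[T] itself.
    if a == "None" and is_optional(b):
        return b
    if b == "None" and is_optional(a):
        return a
    # None unifies with a plain T to Optional[T].
    if a == "None":
        return ("Optional", b)
    if b == "None":
        return ("Optional", a)
    return None  # no unification

# Union of [Optional[int], None] stays Optional[int], not Optional[Optional[int]].
assert unify(("Optional", "int"), "None") == ("Optional", "int")
```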
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19813
Reviewed By: suo
Differential Revision: D15103083
Pulled By: eellison
fbshipit-source-id: db803db10d6934eaa5458e7c1746546b0d0c0a6c
Summary:
It's been hard to understand how workers are launched and what code runs in the worker vs. the main process, especially on Windows, which leads to many of our samples failing. This explains when workers run and how to make code work on Windows as well.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18091
Differential Revision: D15083766
Pulled By: soumith
fbshipit-source-id: 8a7e60defc8a72ec63874f657d7d5267d951dccf
Summary:
One more fix for https://github.com/pytorch/pytorch/pull/19810
We now know that we are running with python3, so no need to check python version. The quotes were probably causing problems here.
cc ezyang soumith zou3519
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19814
Differential Revision: D15106459
Pulled By: orionr
fbshipit-source-id: 0443b9b54d17fead9c8c2c9d8d2f373e1f95a28b
Summary:
This is a follow up PR for #19547.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19748
Differential Revision: D15103230
Pulled By: ezyang
fbshipit-source-id: e7ce925faeadea502f77ed42d52e247c8c6571d8
Summary:
This is a follow up PR for #19409.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19738
Differential Revision: D15103231
Pulled By: ezyang
fbshipit-source-id: 11c9fec641b389906b8accd22504a683331fa6ec
Summary:
Eigen was updated with the commit needed to get rid of this warning that plagued the CI. This PR bumps third_party/eigen to that commit head.
```
warning: #warning "host_defines.h is an internal header file and must not be used directly. This file will be removed in a future CUDA release. Please use cuda_runtime_api.h or cuda_runtime.h instead." [-Wcpp]
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19789
Differential Revision: D15103223
Pulled By: ezyang
fbshipit-source-id: 5b56c4dd9cc41ff1794570ba2f6abfbe23f6ab68
Summary:
Tested locally that this fixes #19039; did not add a test since there's no way to create a script module in the C++ world.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19700
Differential Revision: D15094195
Pulled By: wanchaol
fbshipit-source-id: fcc2c1e5efbc160d976ae485ba2457442f62f065
Summary:
Currently, the Python API doesn't serialize layers that don't have weights (such as `nn.ReLU` and `nn.MaxPool2d`, e.g. in https://github.com/pytorch/vision/blob/master/torchvision/models/densenet.py#L80-L81). If one saves a model that contains weight-less layers in Python and tries to load it into C++, the C++ module loading code (`torch::load(...)`) will throw an error complaining that the expected layers are not found in the serialized file (e.g. https://github.com/pytorch/vision/pull/728#issuecomment-480974175). This PR solves the problem by ignoring layers that are not serializable (which currently only includes `nn::Functional`) in the C++ module serialization code (`torch::save(...)` and `torch::load(...)`); the user is expected to wrap weight-less layers in `nn::Functional` so that they can be ignored when serializing / deserializing.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19740
Differential Revision: D15100575
Pulled By: yf225
fbshipit-source-id: 956481a2355d1de45341585abedda05e35d2ee8b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18820
ghimport-source-id: 220b2a3dd9d4d6d2e557e1802851f082c2dc6452
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18820 Refactor ProcessGroupNCCL collective primitives**
Planning to add reduce-scatter, but no room in my stomach for more
copypasta.
Also rewrote the tensor list validation logic. The existing validation
was ill-suited for all the cases it was being used for; it took a vector
of input tensors and a vector of output tensors, but only ever received
either two references to the same vector, or a bespoke singleton vector
and a vector of outputs (for which it would ignore all but the first
output). In the first case, it performed unnecessary checks, and in the
second, it skipped necessary ones.
Reviewed By: mrshenli
Differential Revision: D14762369
fbshipit-source-id: dcf882ce1c5854333a9eb4424bfc18d9f4648ddf
Summary:
In order to have `torch.utils.tensorboard.SummaryWriter` rendered in the documentation at the bottom of https://pytorch.org/docs/master/tensorboard.html we need to have TensorBoard installed.
This change makes it so our pinned version of `tb-nightly` is used for doc generation same as it is used for running tests at https://github.com/pytorch/pytorch/blob/master/.jenkins/pytorch/test.sh#L45-L52
Eventually we'll use a pinned version of `pip install tensorboard`, but it's not on the release channel yet.
cc kostmo soumith ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19810
Differential Revision: D15101730
Pulled By: orionr
fbshipit-source-id: c41678c4f9ef3d56a168f2b96a1ab05f351bdc56
Summary:
Input argument `f` in `_model_to_graph()` method in `torch/onnx/utils.py` is unused. This PR removes it. If there's a reason to keep it around, please let me know.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19647
Reviewed By: dzhulgakov
Differential Revision: D15071720
Pulled By: houseroad
fbshipit-source-id: 59e0dd7a4d5ebd64d0e30f274b3892a4d218c496
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19799
A module that returns multiple outputs and whose caller may end up
doing multiple calls to torch.autograd.backward did not work with
DistributedDataParallel. It expected the first call to
torch.autograd.backward to provide gradients for ALL parameters that
expect gradients and were used in computing the module output. If you
have outputs with disjoint autograd graphs it is fine to call
torch.autograd.backward on both and fill in the module's parameter
gradients in separate chunks.
With this change we delay queuing the finalizer callback until we have
marked all buckets as ready, instead of queueing it the first time we
receive an autograd hook. This returns the current implementation to
be functionally equivalent to the DistributedDataParallel
implementation before #18953 was merged.
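The change can be illustrated with a toy bucket-tracking sketch (illustrative only; the real reducer lives in C++):

```python
class Reducer:
    """Queue the finalizer only once every bucket is ready, not on the
    first autograd hook, so disjoint backward() calls can each fill in
    their own chunk of the parameter gradients."""

    def __init__(self, num_buckets):
        self.num_buckets = num_buckets
        self.ready = set()
        self.finalized = False

    def mark_bucket_ready(self, idx):
        self.ready.add(idx)
        if len(self.ready) == self.num_buckets:
            self.finalized = True  # finalizer runs here, possibly inline

reducer = Reducer(num_buckets=2)
reducer.mark_bucket_ready(0)   # first backward() call: not finalized yet
assert not reducer.finalized
reducer.mark_bucket_ready(1)   # second backward() call completes the set
assert reducer.finalized
```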
Reviewed By: mrshenli
Differential Revision: D15097045
fbshipit-source-id: 2df023319713bc31e29a8b45108c78e6593fccd4
Summary:
* adds TORCH_API and AT_CUDA_API in places
* refactor code generation Python logic to separate
caffe2/torch outputs
* fix hip and asan
* remove profiler_cuda from hip
* fix gcc warnings for enums
* Fix PythonOp::Kind
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19554
Differential Revision: D15082727
Pulled By: kostmo
fbshipit-source-id: 83a8a99717f025ab44b29608848928d76b3147a4
Summary:
This PR adds TensorBoard logging support natively within PyTorch. It is based on the tensorboardX code developed by lanpa and relies on changes inside the tensorflow/tensorboard repo landing at https://github.com/tensorflow/tensorboard/pull/2065.
With these changes users can simply `pip install tensorboard; pip install torch` and then log PyTorch data directly to the TensorBoard protobuf format using
```
import torch
from torch.utils.tensorboard import SummaryWriter
writer = SummaryWriter()
s1 = torch.rand(1)
writer.add_scalar('data/scalar1', s1[0], 0)
writer.close()
```
Design:
- `EventFileWriter` and `RecordWriter` from tensorboardX now live in tensorflow/tensorboard
- `SummaryWriter` and PyTorch-specific conversion from tensors, nn modules, etc. now live in pytorch/pytorch. We also support Caffe2 blobs and nets.
Action items:
- [x] `from torch.utils.tensorboard import SummaryWriter`
- [x] rename functions
- [x] unittests
- [x] move actual writing function to tensorflow/tensorboard in https://github.com/tensorflow/tensorboard/pull/2065
Review:
- Please review for PyTorch standard formatting, code usage, etc.
- Please verify unittest usage is correct and executing in CI
Any significant changes made here will likely be synced back to github.com/lanpa/tensorboardX/ in the future.
cc orionr, ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16196
Differential Revision: D15062901
Pulled By: orionr
fbshipit-source-id: 3812eb6aa07a2811979c5c7b70810261f9ea169e
Summary:
disable_autodiff_subgraph_inlining should always be on to check for AD regressions.
Thanks eellison for spotting the test regression!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19787
Differential Revision: D15093104
Pulled By: ailzhang
fbshipit-source-id: 82a75a7dd7097d5f93a2e4074023da2105341c1b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19378
Function profile events are typically nested. In this diff I
add parent child relationship to the intervals. This way we can
attribute self time easily. As a result, user printing a table from a
profiler trace gets self cpu time.
This diff doesn't try to address CUDA self time as CUDA kernels are
already getting special care in the profiler.
There are also some other minor improvements, like reporting the total
CPU time spent, reversed sorting, aggregated data after the table, etc.
A new unit test is added which tests more functionality than the
previous profiler test.
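With parent/child links on the intervals, self CPU time falls out of a simple subtraction; a minimal sketch (the event layout here is made up for illustration, not the profiler's actual data structures):

```python
def self_times(events):
    """events: list of (name, start, end, parent_index_or_None).
    Self time = own duration minus the durations of direct children."""
    total = {i: end - start for i, (_, start, end, _) in enumerate(events)}
    self_t = dict(total)
    for i, (_, _, _, parent) in enumerate(events):
        if parent is not None:
            self_t[parent] -= total[i]
    return {events[i][0]: t for i, t in self_t.items()}

events = [("outer", 0, 10, None), ("inner", 2, 6, 0)]
assert self_times(events) == {"outer": 6, "inner": 4}
```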
Reviewed By: zheng-xq
Differential Revision: D14988612
fbshipit-source-id: 2ee6f64f0a4d0b659c6b23c0510bf13aa46f07dc
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19319
Quantized SUM + ReLU (Fused). The implementation is the same as the one in the DNNLOWP.
Reviewed By: jianyuh
Differential Revision: D14866442
fbshipit-source-id: c8c737a37e35b6ce3c1c2077c07546aba16e0612
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19312
Replaces the tuple hack with QTensor. Please note this can be landed ONLY after #18960 (D14810261) is landed.
Reviewed By: raghuramank100
Differential Revision: D14819460
fbshipit-source-id: 75ca649304b1619cb3cfe845962c9f226b8f884a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19714
We had rounding in the quantizer set as `round(x/scale) + zp`. To make it consistent, converting it to `round(x/scale + zp)`.
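With round-half-to-even (as in Python's built-in `round`), the two formulas are not interchangeable, which is why consistency matters; a small illustration:

```python
def quantize_old(x, scale, zp):
    return round(x / scale) + zp   # round first, then shift by the zero point

def quantize_new(x, scale, zp):
    return round(x / scale + zp)   # shift first, then round

# With round-half-to-even, ties land differently depending on the zero point:
assert quantize_old(0.5, 1.0, 1) == 1   # round(0.5) -> 0, then + 1
assert quantize_new(0.5, 1.0, 1) == 2   # round(1.5) -> 2
```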
Reviewed By: raghuramank100
Differential Revision: D15077095
fbshipit-source-id: 5d20a90391fe8c2e11b338c05631fcf7770320c3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19751
This has probably never been tested on Windows, but destruction of WorkersPool crashes because it uses _aligned_malloc to allocate and `free` to deallocate, which is not symmetric. The fix is to use _aligned_free for deallocation.
Reviewed By: hlu1
Differential Revision: D15083472
fbshipit-source-id: 42243fce8f2dfea7554b52e6b289d9fea81d7681
Summary:
The new pip package is more restricted. We need to add an extra flag to make the installation work.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19725
Differential Revision: D15078698
Pulled By: houseroad
fbshipit-source-id: bbd782a0c913b5a1db3e9333de1ca7d88dc312f1
Summary:
Fixes #19650
When driazati started the bicubic implementation we used the TF result as ground truth. It turns out OpenCV's bicubic resize is more commonly used.
This PR does two things:
- Fix a bug where we didn't use area mode to compute the source index
- Follow the OpenCV logic to handle computed negative source indices (we used to bound them by 0)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19703
Differential Revision: D15078159
Pulled By: ailzhang
fbshipit-source-id: 06a32baf2fbc93b90a156b863b4f9fab326d3242
Summary:
I want to use libtorch in a C++/CUDA project but as soon as I include `<torch/torch.h>`, ".cu" files fail to compile:
`torch/csrc/jit/script/tree.h(64): error C3520: 'args': parameter pack must be expanded in this context`
This PR makes it build on my machine (don't know if it breaks anything though).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19615
Differential Revision: D15063712
Pulled By: ezyang
fbshipit-source-id: 7561e705f8f5b42b8e6a23430710b36508fee1ee
Summary:
Because of merge error with master in #15042, open a new PR for ezyang.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17226
Differential Revision: D14418145
Pulled By: mrshenli
fbshipit-source-id: 099ba225b28e6aba71760b81b2153ad1c40fbaae
Summary:
Replace cpu_apply functions with the TensorIterator.
Vectorize copy and clone functions.
Move big pieces of the code to cpu kernels folder to be able to use AVX2.
Add fast path for copy_ function if tensor types matches.
A slowdown was observed on smaller tensors (up to 10%, or about 1us per op), which might be explained by the bigger CPU footprint of TensorIterator compared to the simpler cpu_apply. Conversely, on bigger tensors we see a 2x-3x performance improvement (single threaded; multithreading gives an even better performance boost).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18618
Differential Revision: D14954118
Pulled By: VitalyFedyunin
fbshipit-source-id: 9d9bdf3fd9d5e539a03071cced50d0a47bac1615
Summary:
Sometimes at::cat gets transposed inputs and goes on a slow path. Also, make jit_premul lstm benchmark add bias to the whole input tensor to avoid separate reduction kernels in the backward pass.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18816
Differential Revision: D15013576
Pulled By: wanchaol
fbshipit-source-id: bcfa1cf44180b11b05b0f55f034707012f66281a
Summary:
Move the insert_guard all the way up to the beginning of the decomposition. This fixes the case where we lose the insertion-point context after decomposeCommonNormalization while we still need to modify the graph.
Fixes #19502
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19646
Differential Revision: D15058040
Pulled By: wanchaol
fbshipit-source-id: ebdbf8623ebfe4556c461e1b650e94b905791adb
Summary:
We had a few hard-to-repro cases where very occasionally libthnvrtc failed to load due to what looked like a garbled dladdr return in the `info.dli_fname` field. We could not root-cause why this happens, but this workaround avoids the problem altogether. $ORIGIN is already added to RPATH as the first search location, so dlopen("libthnnvrtc.so") will look for the library in the caller's (`libtorch.so.1`) directory, which was the purpose of the previous code that obtained the `libtorch.so.1` directory using dladdr.
```
root@4ec0aab027a0:/opt/conda/lib/python3.6/site-packages/torch/lib# readelf -d ./libtorch.so.1 | grep RPATH
0x000000000000000f (RPATH) Library rpath: [$ORIGIN:/usr/local/cuda/lib64:/opt/conda/lib]
```
Hopefully the same happens on Mac.
cc zdevito ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19690
Differential Revision: D15076990
Pulled By: soumith
fbshipit-source-id: a4d2992ccf26953f1fc73f17c4e752d69c58e2fc
Summary:
In the distributed training development work, we need to be able to serialize a `std::vector` of `torch::Tensor`s. This PR adds support for serializing `std::vector<torch::Tensor>`.
cc. mrshenli
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19677
Differential Revision: D15069860
Pulled By: yf225
fbshipit-source-id: 505147e5f5fea78be1bf60fb8418bc187dbc2a98
Summary:
Print out the tensor value when throwing the "cannot insert tensor with grad" error.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19645
Differential Revision: D15057809
Pulled By: eellison
fbshipit-source-id: 3f622ef1322a75c965e780275f1fb447e9acf38d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19676
Make copy work with QTensor, enable assignment of QTensor in pytorch frontend.
Differential Revision: D15064710
fbshipit-source-id: 04f2dc02a825695d41fa1114bfca49e92108fef3
Summary:
The input shape checker in the conv/int8_conv operators aims to avoid an issue when running with MKL-DNN Winograd: the weights have to be reordered each time the input shape changes.
However, the checker results in a big performance regression due to frequent reorders.
Meanwhile, in mkldnn-bridge, this case has already been fixed by correcting the prop_kind.
Therefore, we remove the now-useless checker to fix the performance regression.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19608
Differential Revision: D15061169
Pulled By: yinghai
fbshipit-source-id: 649a43ae6fce989e84939210f6dffb143ec3d350
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19585
Originally we would unroll every If op into many different subnets.
Now we no longer unroll; we just add all external inputs of its subnets to the If op and SSA-rewrite all external inputs/outputs. That is enough.
Reviewed By: yinghai
Differential Revision: D15038139
fbshipit-source-id: 8532216d8749068acd5558ad0d8cb1d98463a063
Summary:
Also
1. Bump the multiprocessing test timeout following python core tests.
2. Fix one type of flakiness in `test_proper_exit`.
3. Add trace reporting when the loader process hangs in `test_proper_exit` using `faulthandler`.
4. Give `test_proper_exit` another try.
I'll heavily retest this.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19421
Differential Revision: D15063728
Pulled By: ezyang
fbshipit-source-id: 4e0d992622e11053c44a9ec237b88b9a28a4472c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19530
Make copy work with QTensor, enable assignment of QTensor in pytorch frontend.
Differential Revision: D15008160
fbshipit-source-id: 5f1166246d768b23f009cde1fa03e8952368a332
Summary:
We can't introduce aliasing to a graph output, since it may be mutated afterwards.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19576
Differential Revision: D15057734
Pulled By: eellison
fbshipit-source-id: 33594c05d985a0c58edebd6252e1ee2c0efb6f0e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19607
Explicit is better than implicit - it's pretty hard to debug where a particular file is if it's not greppable.
As a follow-up step, we should look at whether we can just include build_variables.py in CMake directly to share the setup between the two build systems.
Reviewed By: ezyang
Differential Revision: D15023348
fbshipit-source-id: 600ef2d1871bc28530c6a02681b284f7499904df
Summary:
We would previously have statements like
```
set_history(flatten_tensor_args( result ), grad_fn);
```
Internally, {set,rebase}_history would check grad_fn and short circuit if it is nullptr. However, this means that we are executing the expression `flatten_tensor_args( result )` and immediately throwing away the results. This was causing unnecessary allocations + overhead.
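The issue is plain eager argument evaluation; a Python analogue of the before/after (names mirror the C++, but the code is illustrative):

```python
calls = {"flatten": 0}

def flatten_tensor_args(result):
    calls["flatten"] += 1          # stands in for the wasted allocation work
    return [result]

def set_history(flattened, grad_fn):
    if grad_fn is None:            # short-circuits inside the callee...
        return

# Before: the argument expression runs even though the callee ignores it.
set_history(flatten_tensor_args(42), None)
assert calls["flatten"] == 1       # work done, result thrown away

# After: guard at the call site so the expression is never evaluated.
grad_fn = None
if grad_fn is not None:
    set_history(flatten_tensor_args(42), grad_fn)
assert calls["flatten"] == 1       # no extra work
```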
My JIT overhead benchmark script (with custom benchmark method):
```
import torch, time
@torch.jit.script
def add(x, y):
    return x + y

a = torch.rand([])
b = torch.rand([])
niter = 1000000
with torch.no_grad():
    s = time.time()
    add.__getattr__('forward').benchmark(niter, a, b)
    e = time.time() - s
print('overhead per call (us)', e / niter * 1e6)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19623
Differential Revision: D15053399
Pulled By: jamesr66a
fbshipit-source-id: 8777e1a2b5c5a5bbd3a035b7247c8154c5fc4aa6
Summary:
Add base support for torch.logspace. See #19220 for details.
SsnL can you give feedback? Thanks a lot.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19542
Differential Revision: D15028484
Pulled By: soumith
fbshipit-source-id: fe5a58a203b279103abbc192c754c25d5031498e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19621
The comments for group_norm_op are not accurate (i.e., the math part); this diff fixes them.
Reviewed By: BIT-silence
Differential Revision: D15048695
fbshipit-source-id: 27d41d3ae21054257967815254134849944d56ca
Summary:
This is the second part of #18064.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19547
Differential Revision: D15046630
Pulled By: ezyang
fbshipit-source-id: 03f80602b94d47bca66bfd0dcab1b7bb99e5b7f1
Summary:
Add setting requires_grad = True within torchscript to torch.Tensor
Within constant propagation, we can't insert any constants that require grad.
Also added shape analysis and requires grad analysis to torch.tensor
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19445
Differential Revision: D15046211
Pulled By: eellison
fbshipit-source-id: b4ef7a6b4b6b8dc03e1fa49f87dc415874cd1998
Summary:
The current code initializes the `state` in the `__init__` method, but this initialization is not invoked from `add_param_group`.
I followed the same approach as the other Optimizers to init the `state`.
```python
import torch
emb = torch.nn.Embedding(10,10)
emb2 = torch.nn.Embedding(10,10)
optim = torch.optim.Adagrad(emb.parameters())
print(optim.state[emb.weight]) # already initialized
optim.add_param_group({'params': emb2.parameters()})
print(optim.state[emb2.weight]) # empty dict
loss = emb2.weight.sum() + emb.weight.sum()
loss.backward()
optim.step() # raised KeyError
```
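One way to sidestep the KeyError is lazy per-parameter state creation; the actual fix instead mirrors the other optimizers and initializes `state` for the new group's parameters as well. A toy sketch of the lazy variant (plain strings stand in for parameter tensors):

```python
class ToyOptimizer:
    def __init__(self, params):
        self.param_groups = [{"params": list(params)}]
        self.state = {}

    def add_param_group(self, group):
        self.param_groups.append(group)   # state not touched here

    def step(self):
        for group in self.param_groups:
            for p in group["params"]:
                # Create the per-parameter state on first use: no KeyError.
                st = self.state.setdefault(p, {"sum": 0.0})
                st["sum"] += 1.0

opt = ToyOptimizer(["w1"])
opt.add_param_group({"params": ["w2"]})
opt.step()                         # would raise KeyError with an eager lookup
assert opt.state["w2"]["sum"] == 1.0
```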
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17679
Differential Revision: D14577575
Pulled By: ezyang
fbshipit-source-id: 12440079ac964b9eedad48e393d47f558babe300
Summary:
Often, we want to experiment with the loss per element (image, etc.). This changeset allows getting the per-element loss as well. This output is optional.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19579
Reviewed By: jerryzh168
Differential Revision: D15035797
Pulled By: prigoyal
fbshipit-source-id: 562dea514f49c1f2f1cbbc083a1938dc019a75c4
Summary:
In the functional interfaces we do boolean dispatch, but always to max_pool*d_with_indices. This changes it to emit the max_pool*d op instead when it's not necessary to expose the with_indices ops to different backends (for the JIT).
It also binds max_pool*d to the torch namespace, which matches the behavior of avg_pool*d.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19449
Differential Revision: D15016839
Pulled By: wanchaol
fbshipit-source-id: f77cd5f0bcd6d8534c1296d89b061023a8288a2c
Summary:
Adding a fakequant op so that we can use it in PyTorch models; the exact implementation might change.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19387
Differential Revision: D13739657
fbshipit-source-id: d5cb084e843d236bb1da9827ac1ba3900ed99786
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18499
If the init op is not fp16 compatible, it should throw.
However, in the special case where the original init op is UniformFill,
we replace it with Float16UniformFill
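A hypothetical sketch of the replacement rule (the op names besides UniformFill/Float16UniformFill, and the helper itself, are invented for illustration; the real logic lives in Caffe2's fp16 layer code):

```python
# Toy tables: which init ops are fp16 compatible, and which get rewritten.
FP16_COMPATIBLE = {"Float16UniformFill"}
REPLACEMENTS = {"UniformFill": "Float16UniformFill"}

def fp16_init_op(op_type):
    """Return an fp16-compatible init op, rewriting UniformFill; else throw."""
    if op_type in FP16_COMPATIBLE:
        return op_type
    if op_type in REPLACEMENTS:
        return REPLACEMENTS[op_type]
    raise ValueError(f"init op {op_type} is not fp16 compatible")

assert fp16_init_op("UniformFill") == "Float16UniformFill"
```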
Reviewed By: kennyhorror
Differential Revision: D14627209
fbshipit-source-id: eb427772874a732ca8b3a25d06670d119ce8ac14
Summary:
Added the formula for the corner case. Updated unit tests.
Fixes #17913
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19180
Differential Revision: D14942023
Pulled By: ezyang
fbshipit-source-id: 167c109b97a7830d5b24541dc91e4788d531feec
Summary:
I fixed a mistake in the explanation of `pos_weight` argument in `BCEWithLogitsLoss` and added an example.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19212
Differential Revision: D14923431
Pulled By: ezyang
fbshipit-source-id: 15696c67d56789102ac72afbe9bdd7b667eae5a0
Summary:
Fix:
- the order of `Arguments` in the `RandomSampler` doc
- the meaningless check of `replacement`'s type
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19113
Differential Revision: D15013081
Pulled By: ezyang
fbshipit-source-id: 39e367f42841de6814b1214eb9df7b75f14f747e
Summary:
A future version of cusparse will define "cusparseGetErrorString." This PR simply updates PyTorch's name for this function to "getCusparseErrorString" to avoid the collision.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19591
Differential Revision: D15046871
Pulled By: ezyang
fbshipit-source-id: 821304f75fe84c68a26680a93809a18cfdbd540b
Summary:
Hello everyone :) !!
I've found that lr_scheduler is initialized with last_epoch as -1.
This causes the learning rate of the scheduler's optimizer to remain at its previous value even after the first step (not the one in init, but an explicit step of the scheduler).
```python
>>> import torch
>>> cc = torch.nn.Conv2d(10,10,3)
>>> myinitial_lr = 0.1
>>> myoptimizer = torch.optim.Adam(cc.parameters(), lr=myinitial_lr)
>>> mylrdecay = 0.5
>>> myscheduler = torch.optim.lr_scheduler.ExponentialLR(myoptimizer,mylrdecay)
>>> myscheduler.get_lr()
[0.2] # this is because of get_lr calculates lr by 0.1 * 0.5^-1
>>> myscheduler.optimizer.param_groups[0]["lr"]
0.1 # this is not consistent with get_lr value
>>> myscheduler.last_epoch
-1
>>> myscheduler.step()
>>> myscheduler.get_lr()
[0.1] # this should be the value right after the init, not after first step
>>> myscheduler.optimizer.param_groups[0]["lr"]
0.1 # since this is after first step, it should have been decayed as 0.05
>>> myscheduler.last_epoch
0
>>> myscheduler.step()
>>> myscheduler.last_epoch
1
>>> myscheduler.get_lr()
[0.05]
>>> myscheduler.optimizer.param_groups[0]["lr"]
0.05
>>> myscheduler.last_epoch
1
```
The first problem is that even after the init of lr_scheduler, you get inconsistent parameter values.
The second problem is that you are stuck with the same learning rate for the first 2 epochs if the step function of lr_scheduler is not called at the beginning of the epoch loop.
Of course, you can avoid this by calling lr_scheduler's step at the beginning, but I don't think this is proper use since, in the case of the optimizer, step is called at the end of the iteration loop.
I've simply avoided all the above issues by setting last_epoch to 0 after the initialization.
This also makes sense when you init with some value of last_epoch which is not -1.
For example, if you want to init with last epoch 10, lr should not be decayed one step further, which is what happens in the previous code where last_epoch effectively gets +1 in
base_lr * self.gamma ** self.last_epoch
Instead, it should be set to the exact step-10 value.
I hope this fix finds its way in with all your help :)
I'm really looking forward & excited to become a contributor for pytorch!
Pytorch Rocks!!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/7889
Differential Revision: D15012769
Pulled By: ezyang
fbshipit-source-id: 258fc3009ea7b7390a3cf2e8a3682eafb506b08b
Summary:
n was set to self.in_channels, but not used within the scope of the function.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19194
Differential Revision: D14937764
Pulled By: ezyang
fbshipit-source-id: 55cb599109309503fee897f77d798fd454fcc02d
Summary:
This is the first part of #18064.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19409
Differential Revision: D15037390
Pulled By: ezyang
fbshipit-source-id: 16a3feed2fd9cc66033696da224a7d5fb7208534
Summary:
Add setting requires_grad = True within torchscript to torch.Tensor
Within constant propagation, we can't insert any constants that require grad.
Also added shape analysis and requires grad analysis to torch.tensor
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19445
Differential Revision: D15039713
Pulled By: eellison
fbshipit-source-id: 47f1931b6fc4a1137c13d80110cc404465bfdf06
Summary:
I believe the existing check in FuseGraph was only `false` if PyTorch was built with NO_CUDA=1. Otherwise, we would create fusion groups even if we're on a CPU-only machine running CPU code. This is confusing. Instead I've made it so that the decision to fuse or not is dependent on if the producer Value is a known CPU tensor. If it is, we skip fusion.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19342
Differential Revision: D15038351
Pulled By: jamesr66a
fbshipit-source-id: fce9d83929309a7bf14346833f84b996f3e7f6db
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19516
Explicitly define types that are supported in kernel inputs and outputs.
Also, this allows us to show much nicer error messages if a user writes kernels with wrong argument types.
Reviewed By: ezyang
Differential Revision: D15020306
fbshipit-source-id: 55ebec81e075e874777acd59aa29a5578fc19ef7
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19442
For cases like CV, some ops like transpose and tile will mangle the batch size so that we don't know how to adjust the output batch size. In this case, the current solution is to just fix the input batch size statically and not adjust the output batch size.
Reviewed By: zrphercule
Differential Revision: D15007237
fbshipit-source-id: a21b943a52ee5462d9d7804dfae44360f579f8cf
Summary:
Changelog:
- Rename `potri` to `cholesky_inverse` to remain consistent with names of `cholesky` methods (`cholesky`, `cholesky_solve`)
- Fix all callsites
- Rename all tests
- Create a tentative alias for `cholesky_inverse` under the name `potri` and add a deprecation warning to not promote usage
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19498
Differential Revision: D15029901
Pulled By: ezyang
fbshipit-source-id: 2074286dc93d8744cdc9a45d54644fe57df3a57a
Summary:
This just plugs into the existing mechanism to do a direct translation to TensorOptions in the backend, so no codegen changes.
After this lands, all native_functions will match the JIT signature.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19484
Differential Revision: D15013051
Pulled By: gchanan
fbshipit-source-id: 6818f868d2f765ca3e56e7e6f75fe4f68492466c
Summary:
Remove a useless format checker in the mkldnn-bridge to fix the crash issue in the DNNLOWP dequantize op.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19159
Differential Revision: D15027670
Pulled By: yinghai
fbshipit-source-id: ac97d6ff94de013105108b9596b1bd7621c5aa75
Summary:
In this PR, the fusion algorithms are improved to support DNNLOWP.
1. Enabled conv fusions for DNNLOWP
2. Fused the order switch op into the following quantize op
3. Improved conv+sum fusion to parse a larger scope/window
4. Re-organized the fusion code to fix a random crash caused by mutating the graph
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18843
Differential Revision: D15021030
Pulled By: yinghai
fbshipit-source-id: 88d2199d9fc69f392de9bfbe1f291e0ebf78ab08
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19418
This change makes the Observer class a template which always
takes an observer function as an argument. The second test case becomes redundant, so it is
removed.
Reviewed By: jerryzh168
Differential Revision: D15000594
fbshipit-source-id: 9555fe98a5f2054b8fd01e64e9ac2db72c043bfa
Summary:
There are two corrections in this pull request.
The first is specific to gcc-7.4.0.
When compiled with -std=c++14, gcc-7.4.0 has __cplusplus = 201402L.
This does not meet the check set in Deprecated.h, which asks for a value strictly greater than 201402L.
The compiler then falls through to the __GNUC__ check, which passes and sets C10_DEPRECATED_MESSAGE to a value that C++14 does not appear to support or even recognize, leading to a compile-time error.
My recommended solution, which worked for my case, was to change the > into a >=.
The second correction comes in response to this error:
caffe2/operators/crash_op.cc: In member function ‘virtual bool caffe2::CrashOp::RunOnDevice()’:
caffe2/operators/crash_op.cc:14:11: error: ‘SIGABRT’ was not declared in this scope
I am merely committing to the repository the solution suggested here (which worked for me)
https://discuss.pytorch.org/t/building-pytorch-from-source-in-conda-fails-in-pytorch-caffe2-operators-crash-op-cc/42859
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19470
Differential Revision: D15019529
Pulled By: ailzhang
fbshipit-source-id: 9ce9d713c860ee5fd4266e5c2a7f336a97d7a90d
Summary:
This was actually getting pretty poor throughput with respect to memory bandwidth. I used this test to measure the memory bandwidth specifically for the AXPY call: https://gist.github.com/jamesr66a/b27ff9ecbe036eed5ec310c0a3cc53c5
And I got ~8 GB/s before this change, but ~14 GB/s after this change.
This seems to speed up the operator overall by around 1.3x (benchmark: https://gist.github.com/jamesr66a/c533817c334d0be432720ef5e54a4166):
== Before ==
time_per_iter 0.0001298875093460083
GB/s 3.082544287868467
== After ==
time_per_iter 0.00010104801654815674
GB/s 3.9623142905451076
The large difference between the local BW increase and the full-op BW increase likely indicates significant time is being spent elsewhere in the op, so I will investigate that.
EDIT: I updated this PR to include a call into caffe2/perfkernels. This is the progression:
Before:
time_per_iter 8.983819484710693e-05
GB/s 4.456723564864611
After (no axpy):
time_per_iter 7.19951868057251e-05
GB/s 5.56126065872172
After (perfkernels):
time_per_iter 5.6699180603027346e-05
GB/s 7.061548257694262
After (perfkernels, no grad):
time_per_iter 4.388842582702637e-05
GB/s 9.122769670026413
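As a rough sanity check (not part of the original benchmark), GB/s is just bytes moved per iteration divided by time per iteration; backing the payload out of one row and applying it to another confirms the rows describe the same workload:

```python
def gb_per_s(bytes_per_iter, time_per_iter):
    # bandwidth in GB/s, with GB = 1e9 bytes
    return bytes_per_iter / time_per_iter / 1e9

# back out the per-iteration payload from the "After (perfkernels)" row
bytes_per_iter = 7.061548257694262 * 5.6699180603027346e-05 * 1e9  # ~400 KB

# the "Before" row implies (nearly) the same payload
before_bw = gb_per_s(bytes_per_iter, 8.983819484710693e-05)
```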
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19329
Reviewed By: dzhulgakov
Differential Revision: D14969630
Pulled By: jamesr66a
fbshipit-source-id: 42d1015772c87bedd119e33c0aa2c8105160a738
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19515
This is still done by default, but can now be disabled by specifying
`find_unused_parameters=False`. There are use cases where finding
unused parameters results in erroneous behavior, because a subset of
model parameters is used *outside* the `forward` function. One can
argue that doing this is not a good idea, but we should not break
existing use cases without an escape hatch. This configuration
parameter is that escape hatch.
Reviewed By: bddppq
Differential Revision: D15016381
fbshipit-source-id: f2f86b60771b3801ab52776e62b5fd6748ddeed0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19511
In the c10 operator registration API, disallow std::vector arguments and show a nice error message
pointing users towards using ArrayRef instead.
Reviewed By: ezyang
Differential Revision: D15017423
fbshipit-source-id: 157ecc1298bbc598d2e310a16041edf195aaeff5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19503
After reading the arguments from the stack, the c10 kernel wrapper accidentally popped them again, causing a vector to be allocated.
Instead, it should just drop them because they have already been read.
Reviewed By: ezyang
Differential Revision: D15016023
fbshipit-source-id: b694a2929f97fa77cebe247ec2e49820a3c818d5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19518
The previous design required running the op benchmarks from the PyTorch root directory, which could lead to a `module not found` error in the OSS environment. This diff fixes that issue by making the benchmarks launchable from the `benchmarks` folder.
Reviewed By: ilia-cher
Differential Revision: D15020787
fbshipit-source-id: eb09814a33432a66cc857702bc86538cd17bea3b
Summary:
First step at allowing container types within alias analysis.
Since the current implementation hides the concept of Wildcards within alias analysis and does not expose it to memory dag, we cannot represent whether a container type holds a wildcard. As a result, only handle TupleConstruct, where we can directly inspect if any input values are wildcards, and don't handle nested containers.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18710
Differential Revision: D15017068
Pulled By: eellison
fbshipit-source-id: 3ee76a5482cef1cc4a10f034593ca21019161c18
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19458
The algorithm in https://fburl.com/ggh9iyvc fails to really ensure topological ordering of nodes. The fix is ugly but effective. I think we need a real topological sort to fix this issue more nicely. Mikhail Zolotukhin, Bram Wasti.
Differential Revision: D15011893
fbshipit-source-id: 130c3aa442f5d578adfb14fbe5f16aa722434942
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19381
Expose QScheme enum in frontend so that people can use it in
quantization configs in modules.
Differential Revision: D14922992
fbshipit-source-id: ab07b8a7ec42c1c1f5fe84a4a0c805adbcad408d
Summary:
Add the defaults field to the copied object.
Prior to this patch, optimizer.__getattr__ has excluded the defaults
attribute of optimizer source object, required by some LR schedulers. (e.g. CyclicLR with momentum)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19308
Differential Revision: D15012801
Pulled By: soumith
fbshipit-source-id: 95801b269f6f9d78d531d4fed95c973b280cc96f
Summary:
This is a simple yet useful addition to the torch.nn modules: an identity module. This is a first draft - please let me know what you think and I will edit my PR.
There is no identity module; nn.Sequential() can be used, but it is argument-sensitive, so it can't be used interchangeably with any other module. This adds nn.Identity(...), which can be swapped with any module because it accepts dummy arguments. It's also more understandable than seeing an empty Sequential inside a model.
See discussion on #9160. The current solution is to use nn.Sequential(). However this won't work as follows:
```python
batch_norm = nn.BatchNorm2d
if dont_use_batch_norm:
    batch_norm = Identity
```
Then in your network, you have:
```python
nn.Sequential(
    ...
    batch_norm(N, momentum=0.05),
    ...
)
```
If you try to simply set `Identity = nn.Sequential`, this will fail since `nn.Sequential` expects modules as arguments. Of course there are many ways to get around this, including:
- Conditionally adding modules to an existing Sequential module
- Not using Sequential but writing the usual `forward` function with an if statement
- ...
**However, I think that an identity module is the most pythonic strategy,** assuming you want to use nn.Sequential.
Using the very simple class (this isn't the same as the one in my commit):
```python
class Identity(nn.Module):
    def __init__(self, *args, **kwargs):
        super().__init__()

    def forward(self, x):
        return x
```
we can get around using nn.Sequential, and `batch_norm(N, momentum=0.05)` will work. There are of course other situations this would be useful.
Thank you.
Best,
Miles
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19249
Differential Revision: D15012969
Pulled By: ezyang
fbshipit-source-id: 9f47e252137a1679e306fd4c169dca832eb82c0c
Summary:
This should have been fixed in newest ROCm version.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19436
Reviewed By: ezyang
Differential Revision: D15004685
Pulled By: bddppq
fbshipit-source-id: 19fd4cca94c914dc54aabfbb4e62b328aa348a35
Summary:
This is a continuation of efforts into packed accessor awareness.
A very simple example is added, along with the mention that the template can hold more arguments.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19464
Differential Revision: D15012564
Pulled By: soumith
fbshipit-source-id: a19ed536e016fae519b062d847cc58aef01b1b92
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18092
Previously, tracing required all inputs to be either tensors,
or tuples of tensor. Now, we allow users to pass dicts as well.
Differential Revision: D14491795
fbshipit-source-id: 7a2df218e5d00f898d01fa5b9669f9d674280be3
Summary:
Strip the doc_string by default from the exported ONNX models (this string has the stack trace and information about the local repos and folders, which can be confidential).
Users can still generate the doc_string by specifying add_doc_string=True in torch.onnx.export().
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18882
Differential Revision: D14889684
Pulled By: houseroad
fbshipit-source-id: 26d2c23c8dc3f484544aa854b507ada429adb9b8
Summary:
We are already using custom comparators for sorting (for a good reason), but are still making 2 sorting passes - global sort and stable sorting to bring values into their slices. Using a custom comparator to sort within a slice allows us to avoid second sorting pass and brings up to 50% perf improvement.
t-vi I know you are moving sort to ATen, and changing THC is discouraged, but #18350 seems dormant. I'm fine with #18350 landing first, and then I can put in these changes.
cc umanwizard for review.
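The same idea can be sketched in Python (purely illustrative; the actual change is a CUDA comparator in THC): a single sort with a composite key of (slice id, value) leaves every slice internally ordered, replacing a global sort plus a stable regrouping pass.

```python
def sort_within_slices(values, slice_ids, descending=False):
    """Order values within each slice in one sorting pass.

    The composite key (slice id, value) groups elements by slice and
    sorts them inside the slice at the same time.
    """
    order = sorted(
        range(len(values)),
        key=lambda i: (slice_ids[i], -values[i] if descending else values[i]),
    )
    return [values[i] for i in order]
```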
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19379
Differential Revision: D15011019
Pulled By: soumith
fbshipit-source-id: 48e5f5aef51789b166bb72c75b393707a9aed57c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19450
We want to make each operator benchmark a separate binary. Previously, the benchmark collected all operators into a single binary, which is unnecessary when we want to filter for a specific operator. This diff resolves that issue.
Reviewed By: ilia-cher
Differential Revision: D14808159
fbshipit-source-id: 43cd25b219c6e358d0cd2a61463b34596bf3bfac
Summary:
Fixes: #19253
Fixing pow(Tensor, float) is straightforward.
The breakage for pow(float, Tensor) is a bit more subtle to trigger, and fixing needs `torch.log` (`math.log` didn't work) from the newly merged #19115 (Thanks ngimel for pointing out this has landed.)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19324
Differential Revision: D15003531
Pulled By: ailzhang
fbshipit-source-id: 8b22138fa27a43806b82886fb3a7b557bbb5a865
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19290
Add test cases for the supported argument types
And TODOs for some unsupported ones that we might want to support.
Reviewed By: dzhulgakov
Differential Revision: D14931920
fbshipit-source-id: c47bbb295a54ac9dc62569bf5c273368c834392c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19433
For the operator benchmark project, we need to cover a lot of operators, so the interface for adding operators needs to be very clean and simple. This diff implements a new interface for adding ops.
Here is the logic to add new operator to the benchmark:
```
long_config = {}
short_config = {}
map_func
add_test(
[long_config, short_config],
map_func,
[caffe2 op]
[pt op]
)
```
Reviewed By: zheng-xq
Differential Revision: D14791191
fbshipit-source-id: ac6738507cf1b9d6013dc8e546a2022a9b177f05
Summary:
My bad - it might be called in both variable and non-variable contexts, so it's better to just inherit variable-ness from the caller.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19400
Reviewed By: ezyang
Differential Revision: D14994781
Pulled By: dzhulgakov
fbshipit-source-id: cb9d055b44a2e1d7bbf2e937d558e6bc75037f5b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19360
We'll return the output object verbatim since it is a freeform object.
We need to find any tensors in this object, though, because we need to
figure out which parameters were used during this forward pass, to
ensure we short circuit reduction for any unused parameters.
Before this commit only lists were handled and the functionality went
untested. This commit adds support for dicts and recursive structures,
and also adds a test case.
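The recursive traversal can be sketched as follows (a torch-free illustration; `Leaf` is a hypothetical stand-in for `torch.Tensor`, and the real reducer does this on the C++ side):

```python
class Leaf:
    """Stand-in for torch.Tensor, for illustration only."""

def find_leaves(obj):
    """Collect all Leaf instances in obj, descending into lists, tuples, and dicts."""
    if isinstance(obj, Leaf):
        return [obj]
    if isinstance(obj, (list, tuple)):
        return [leaf for item in obj for leaf in find_leaves(item)]
    if isinstance(obj, dict):
        return [leaf for value in obj.values() for leaf in find_leaves(value)]
    return []  # any other object contributes no tensors
```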
Closes #19354.
Reviewed By: mrshenli
Differential Revision: D14978016
fbshipit-source-id: 4bb6999520871fb6a9e4561608afa64d55f4f3a8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19388
The old implementation forced a refcount bump when converting at::Tensor to caffe2::Tensor.
Now, it is possible to move it without a refcount bump.
Reviewed By: dzhulgakov
Differential Revision: D14986815
fbshipit-source-id: 92b4b0a6f323ed38376ffad75f960cad250ecd9b
Summary:
Attempt fix for #14057 . This PR fixes the example script in the issue.
The old behavior is a bit confusing here. What happened is that Python 2 failed to recognize that `torch.float32` is in module `torch`, so it went looking for `torch.float32` in module `__main__`. Python 3 is smart enough to handle this.
According to the doc [here](https://docs.python.org/2/library/pickle.html#object.__reduce__), `__reduce__` should return `float32` instead of the old name `torch.float32`. This way, Python 2 is able to find `float32` in the `torch` module.
> If a string is returned, it names a global variable whose contents are pickled as normal. The string returned by __reduce__() should be the object’s local name relative to its module
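A torch-free sketch of this rule (the `_DType` class and its module-level `float32` singleton below are hypothetical stand-ins for the real dtype objects):

```python
import pickle

class _DType:
    """Stand-in for a dtype singleton such as torch.float32."""
    def __init__(self, name):
        self.name = name

    def __reduce__(self):
        # Return the *local* name ("float32"), not a dotted path
        # ("torch.float32"); pickle then resolves it as a module-level
        # global, which works on both Python 2 and Python 3.
        return self.name

float32 = _DType("float32")
```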
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18045
Differential Revision: D14990638
Pulled By: ailzhang
fbshipit-source-id: 816b97d63a934a5dda1a910312ad69f120b0b4de
Summary:
Previously, to get a list of parameters, this code just put them in the reverse of the order in which they were defined, which is not always right. This PR allows parameter lists to define the order themselves. To do this, parameter lists need a corresponding function that provides the names of the parameters.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18198
Differential Revision: D14966270
Pulled By: driazati
fbshipit-source-id: 59331aa59408660069785906304b2088c19534b2
Summary:
This PR paves the way for support more iterator types in for-in loops.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19341
Differential Revision: D14992749
Pulled By: Krovatkin
fbshipit-source-id: e2d4c9465c8ec3fc74fbf23006dcb6783d91795f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19033
torch.distributed.init_process_group() has had many parameters added, but the contract isn't clear. Adding documentation, asserts, and explicit args should make this clearer to callers and more strictly enforced.
Reviewed By: mrshenli
Differential Revision: D14813070
fbshipit-source-id: 80e4e7123087745bed436eb390887db9d1876042
Summary:
Added the ">>>" Python interpreter prompt (three greater-than symbols) so that the edited lines appear as code, not comments/output, in the documentation. Normally the interpreter would display "..." when expecting a block, but I'm not sure how that would render on the PyTorch docs website. Other code examples use the ">>>" prompt as well, so I used it here too.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19347
Differential Revision: D14986154
Pulled By: soumith
fbshipit-source-id: 8f4d07d71ff7777b46c459837f350eb0a1f17e84
Summary:
This PR aims to improve Transformer performance on CPU, `bmm()` is one of the major bottlenecks now.
Current logic of `bmm()` on CPU only uses MKL batch gemm when the inputs `A` and `B` are contiguous or transposed. So when `A` or `B` is a slice of a larger tensor, it falls to a slower path.
`A` and `B` are both 3D tensors. MKL is able to handle the batch matrix multiplication when `A.stride(1) == 1 || A.stride(2) == 1` and `B.stride(1) == 1 || B.stride(2) == 1`.
From [fairseq](https://github.com/pytorch/fairseq) implementation of Transformer, multi-head attention has two places to call bmm(), [here](https://github.com/pytorch/fairseq/blob/master/fairseq/modules/multihead_attention.py#L167) and [here](https://github.com/pytorch/fairseq/blob/master/fairseq/modules/multihead_attention.py#L197), `q`, `k`, `v` are all slices from larger tensor. So the `bmm()` falls to slow path at the moment.
Results on Xeon 6148 (20*2 cores 2.5GHz) indicate this PR improves Transformer training performance by **48%** (seconds per iteration reduced from **5.48** to **3.70**), the inference performance should also be boosted.
Before:
```
| epoch 001: 0%| | 27/25337 [02:27<38:31:26, 5.48s/it, loss=16.871, nll_loss=16.862, ppl=119099.70, wps=865, ups=0, wpb=4715.778, bsz=129.481, num_updates=27, lr=4.05e-06, gnorm=9.133,
```
After:
```
| epoch 001: 0%| | 97/25337 [05:58<25:55:49, 3.70s/it, loss=14.736, nll_loss=14.571, ppl=24339.38, wps=1280, ups=0, wpb=4735.299, bsz=131.134, num_updates=97, lr=1.455e-05, gnorm=3.908,
```
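The dispatch condition above can be sketched as a pure-Python predicate over the two operands' stride tuples (illustrative only; the real check lives in the C++ `bmm` implementation):

```python
def mkl_batch_gemm_ok(a_strides, b_strides):
    """True if 3-D operands A and B can go to MKL batch gemm directly.

    Each operand needs one of its two non-batch dimensions to be
    contiguous (stride 1); the batch stride is unconstrained.
    """
    return (a_strides[1] == 1 or a_strides[2] == 1) and \
           (b_strides[1] == 1 or b_strides[2] == 1)
```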
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19338
Differential Revision: D14986346
Pulled By: soumith
fbshipit-source-id: 827106245af908b8a4fda69ed0288d322b028f08
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19287
Since we now have a string-schema-based op registration API, we can also use it when exposing caffe2 operators.
Reviewed By: dzhulgakov
Differential Revision: D14931925
fbshipit-source-id: ec162469d2d94965e8c99d431c801ae7c43849c8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19286
The operator registration API now allows registering an operator by only giving the operator name and not the full operator schema,
as long as the operator schema can be inferred from the kernel function.
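As a loose analogy in Python (the real API infers the schema from the C++ kernel's signature; the operator name, kernel, and type table below are made up for illustration):

```python
import inspect

_TYPE_NAMES = {int: "int", float: "float", bool: "bool", str: "str"}

def infer_schema(op_name, kernel):
    """Derive an operator schema string from a kernel's type annotations."""
    sig = inspect.signature(kernel)
    args = ", ".join(
        f"{_TYPE_NAMES[param.annotation]} {name}"
        for name, param in sig.parameters.items()
    )
    return f"{op_name}({args}) -> {_TYPE_NAMES[sig.return_annotation]}"

def my_kernel(x: int, scale: float) -> float:
    return x * scale
```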
Reviewed By: dzhulgakov
Differential Revision: D14931921
fbshipit-source-id: 3776ce43d4ce67bb5a3ea3d07c37de96eebe08ba
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19285
The either type is a tagged union with two members.
This is going to be used in a diff stacked on top to allow a function to return one of two types.
Also, generally, either<Error, Result> is a great pattern for returning value_or_error from a function without using exceptions and we could use this class for that later.
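In Python terms, the pattern looks roughly like this (a sketch, not the C++ class):

```python
class Either:
    """Tagged union holding either an error (left) or a result (right)."""

    def __init__(self, is_right, value):
        self._is_right = is_right
        self._value = value

    @classmethod
    def left(cls, error):
        return cls(False, error)

    @classmethod
    def right(cls, result):
        return cls(True, result)

    def map(self, fn):
        # Transform the result if present; errors pass through untouched.
        return Either.right(fn(self._value)) if self._is_right else self

    def value_or(self, default):
        return self._value if self._is_right else default
```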
Reviewed By: dzhulgakov
Differential Revision: D14931923
fbshipit-source-id: 7d1dd77b3e5b655f331444394dcdeab24772ab3a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19284
Instantiating a dispatch table previously only worked when the op had a tensor argument we could dispatch on.
However, the legacy API for custom operators didn't have dispatch and also worked for operators without tensor arguments, so we need to continue supporting that.
It probably generally makes sense to support this as long as there's only a fallback kernel and no dispatched kernel registered.
This diff adds that functionality.
Reviewed By: dzhulgakov
Differential Revision: D14931926
fbshipit-source-id: 38fadcba07e5577a7329466313c89842d50424f9
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19283
Now that the function schema parser is available in ATen/core, we can use it from the operator registration API to register ops based on string schemas.
This does not allow registering operators based on only the name yet - the full schema string needs to be defined.
A diff stacked on top will add name based registration.
Reviewed By: dzhulgakov
Differential Revision: D14931919
fbshipit-source-id: 71e490dc65be67d513adc63170dc3f1ce78396cc
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19282
This is largely a hack because we need to use the function schema parser from ATen/core
but aren't clear yet on how the final software architecture should look like.
- Add function schema parser files from jit to ATen/core build target.
- Also move ATen/core build target one directory up to allow this.
We only change the build targets and don't move the files yet because this is likely
not the final build setup and we want to avoid repeated interruptions
for other developers. cc zdevito
Reviewed By: dzhulgakov
Differential Revision: D14931922
fbshipit-source-id: 26462e2e7aec9e0964706138edd3d87a83b964e3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19281
String<->Number conversions aren't available in the STL used in our Android environment.
This diff adds workarounds for that so that the function schema parser can be compiled for android
Reviewed By: dzhulgakov
Differential Revision: D14931649
fbshipit-source-id: d5d386f2c474d3742ed89e52dff751513142efad
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19280
We want to use the function schema parser from ATen/core, but with as little dependencies as possible.
This diff moves the function schema parser into its own file and removes some of its dependencies.
Reviewed By: dzhulgakov
Differential Revision: D14931651
fbshipit-source-id: c2d787202795ff034da8cba255b9f007e69b4aea
Summary:
A few improvements while doing bert model
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19247
Differential Revision: D14989345
Pulled By: ailzhang
fbshipit-source-id: f4846813f62b6d497fbe74e8552c9714bd8dc3c7
Summary:
Op was improperly schematized previously. Evidently checkScript does not test if the outputs are the same type.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19370
Differential Revision: D14985159
Pulled By: eellison
fbshipit-source-id: feb60552afa2a6956d71f64801f15e5fe19c3a91
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19082
When you have just one line of deletions, just as with additions, there is no count printed.
Without this fix, we ignore all hunks with single-line deletions when selecting which lines were changed.
When all the changes in the file were single-line, this meant no line-filtering at all!
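The relevant corner of the unified-diff format can be illustrated with a small parser (a sketch for illustration, not the actual line-filtering script):

```python
import re

# In a hunk header "@@ -start[,count] +start[,count] @@", the count is
# omitted when it equals 1, so a parser must default the missing count
# to 1 rather than skip the hunk.
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")

def parse_hunk(header):
    match = HUNK_RE.match(header)
    if match is None:
        return None
    old_start, old_count, new_start, new_count = match.groups()
    return (int(old_start), int(old_count) if old_count else 1,
            int(new_start), int(new_count) if new_count else 1)
```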
Differential Revision: D14860426
fbshipit-source-id: c60e9d84f9520871fc0c08fa8c772c227d06fa27
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19362
The `float` type is never used in OnnxifiOp.
Reviewed By: bddppq
Differential Revision: D14977970
fbshipit-source-id: 8fee02659dbe408e5a3e0ff95d74c04836c5c281
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18960
empty_affine_quantized creates an empty affine quantized Tensor from scratch.
We might need this when we implement quantized operators.
Differential Revision: D14810261
fbshipit-source-id: f07d8bf89822d02a202ee81c78a17aa4b3e571cc
Summary:
As part of implicitly casting condition statements, we should be casting not expressions as well.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19361
Differential Revision: D14984275
Pulled By: eellison
fbshipit-source-id: f8dae64f74777154c25f7a6bcdac03cf44cbb60b
Summary:
It turns out that copying bytes is the same no matter what type
they're interpreted as, and memcpy is already vectorized on every
platform of note. Paring this down to the simplest implementation
saves just over 4KB off libtorch.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19198
Differential Revision: D14922656
Pulled By: resistor
fbshipit-source-id: bb03899dd8f6b857847b822061e7aeb18c19e7b4
Summary:
This adds checks for `mul_`, `add_`, `sub_`, `div_`, the most common
binops. See #17935 for more details.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19317
Differential Revision: D14972399
Pulled By: zou3519
fbshipit-source-id: b9de331dbdb2544ee859ded725a5b5659bfd11d2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19273
Some CI jobs fail if protobuf is not installed. Protobuf is imported as part of `caffe2.python.core`, so this adds a skip decorator to avoid running tests that depend on `caffe2.python.core`.
Reviewed By: jianyuh
Differential Revision: D14936387
fbshipit-source-id: e508a1858727bbd52c951d3018e2328e14f126be
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19083
As we have discussed, there are too many AdjustBatch ops; they incur reallocation overhead and hurt performance. We will eliminate these ops by
- inlining the input adjust-batch op into Glow
- inlining the output adjust-batch op into OnnxifiOp, and doing so only conditionally.
This is the C2 part of the change and requires a change on the Glow side to work e2e.
Reviewed By: rdzhabarov
Differential Revision: D14860582
fbshipit-source-id: ac2588b894bac25735babb62b1924acc559face6
Summary:
I tried first to convert the `.bat` script to a Bash `.sh` script, but I got this error:
```
[...]/build/win_tmp/ci_scripts/test_python_nn.sh: line 3: fg: no job control
```
Line 3 was where `%TMP_DIR%/ci_scripts/setup_pytorch_env.bat` was invoked.
I found a potential workaround on Stack Overflow of adding the `monitor` (`-m`) flag to the script, but that didn't work either:
```
00:58:00 /bin/bash: cannot set terminal process group (3568): Inappropriate ioctl for device
00:58:00 /bin/bash: no job control in this shell
00:58:00 + %TMP_DIR%/ci_scripts/setup_pytorch_env.bat
00:58:00 /c/Jenkins/workspace/pytorch-builds/pytorch-win-ws2016-cuda9-cudnn7-py3-test1/build/win_tmp/ci_scripts/test_python_nn.sh: line 3: fg: no job control
```
So instead I decided to use Python to replace the `.bat` script. I believe this is an improvement in that it's both "table-driven" now and cross-platform.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18756
Differential Revision: D14957570
Pulled By: kostmo
fbshipit-source-id: 87794e64b56ffacbde4fd44938045f9f68f7bc2a
Summary:
Make it possible to construct a pinned-memory tensor without creating a storage first and without calling the pin_memory() function. It is also faster, as the copy operation is unnecessary.
Supported functions:
```python
torch.rand_like(t, pin_memory=True)
torch.randn_like(t, pin_memory=True)
torch.empty_like(t, pin_memory=True)
torch.full_like(t, 4, pin_memory=True)
torch.zeros_like(t, pin_memory=True)
torch.ones_like(t, pin_memory=True)
torch.tensor([10,11], pin_memory=True)
torch.randn(3, 5, pin_memory=True)
torch.rand(3, pin_memory=True)
torch.zeros(3, pin_memory=True)
torch.randperm(3, pin_memory=True)
torch.empty(6, pin_memory=True)
torch.ones(6, pin_memory=True)
torch.eye(6, pin_memory=True)
torch.arange(3, 5, pin_memory=True)
```
Part of the bigger: `Remove Storage` plan.
Now compatible with both TorchScript forms:
`_1 = torch.zeros([10], dtype=6, layout=0, device=torch.device("cpu"), pin_memory=False)`
and
`_1 = torch.zeros([10], dtype=6, layout=0, device=torch.device("cpu"))`
The same was checked for all similar functions (`rand_like`, `empty_like`, and others).
It is fixed version of #18455
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18952
Differential Revision: D14801792
Pulled By: VitalyFedyunin
fbshipit-source-id: 8dbc61078ff7a637d0ecdb95d4e98f704d5450ba
Summary:
Unit tests that hang on clock64() calls are now fixed.
test_gamma_gpu_sample is now fixed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19307
Differential Revision: D14953420
Pulled By: bddppq
fbshipit-source-id: efe807b54e047578415eb1b1e03f8ad44ea27c13
Summary:
I audited the relevant kernel and saw it accumulates a good deal into float
so it should be fine.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19293
Differential Revision: D14942274
Pulled By: zou3519
fbshipit-source-id: 36996ba0fbb29fbfb12b27bfe9c0ad1eb012ba3c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19299
I saw larger than 5% performance variation with small operators; this diff aims to reduce the variation by avoiding Python overhead. Previously, the benchmark ran the main loop for 100 iterations and then looked at the time. If the result was not significant, we doubled the number of iterations and reran, continuing until it became significant. We calculated the time as total_time / number_of_iterations. The issue is that this folds multiple rounds of Python launch overhead into the measurement.
Now, I change the logic to calculate the execution time based on the last run instead of all runs; the equation is time_in_last_run / number_of_iterations.
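The revised measurement loop can be sketched as follows (illustrative, not the benchmark harness's actual code):

```python
import time

def measure(op, min_runtime=0.1, start_iters=100):
    """Double the iteration count until a run is long enough to be
    significant, then report per-iteration time from the last run only,
    so earlier (shorter) runs don't fold their Python launch overhead
    into the reported number."""
    iters = start_iters
    while True:
        begin = time.perf_counter()
        for _ in range(iters):
            op()
        elapsed = time.perf_counter() - begin
        if elapsed >= min_runtime:
            return elapsed / iters  # based on the last run only
        iters *= 2
```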
Reviewed By: hl475
Differential Revision: D14925287
fbshipit-source-id: cb646298c08a651e27b99a5547350da367ffff47
Summary:
Add input information into generated RecordFunction calls in
VariableType wrappers, JIT operators and a few more locations
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18717
Differential Revision: D14729156
Pulled By: ilia-cher
fbshipit-source-id: 811ac4cbfd85af5c389ef030a7e82ef454afadec
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19170
As title
The quantized resnext3d model in production got the following failures without the fix:
```
Caffe2 operator Int8ConvRelu logging error: [enforce fail at conv_pool_op_base.h:463] order == StorageOrder::NCHW. 1 vs 2. Conv3D only supports NCHW on the production quantized model
```
Reviewed By: jspark1105
Differential Revision: D14894276
fbshipit-source-id: ef97772277f322ed45215e382c3b4a3702e47e59
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19118
A bug introduced by D14700576 and reported by Yufei (fixed by D14778810 and D14785256) was not detected by our unit tests.
This diff improves unit tests to catch such errors (with this diff and without D14778810, we can reproduce the bug Yufei reported).
This improvement also revealed a bug that affects accuracy when we pre-pack weight and bias together and the pre-packed weight/bias are used by multiple nets. We were modifying the pre-packed bias in place, even though it was supposed to be constant.
Reviewed By: csummersea
Differential Revision: D14806077
fbshipit-source-id: aa9049c74b6ea98d21fbd097de306447a662a46d
Summary:
closes #18873
Doesn't fail the build on warnings yet.
Also fix most severe shellcheck warnings
Limited to `.jenkins/pytorch/` at this time
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18874
Differential Revision: D14936165
Pulled By: kostmo
fbshipit-source-id: 1ee335695e54fe6c387ef0f6606ea7011dad0fd4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18953
This removes Python side bucketing code from DistributedDataParallel
and replaces it with calls to the new C++ based bucketing and reducing
code. To confirm this is working well, we ran a test with both the
previous implementation and the new implementation, and confirmed they
are numerically equivalent.
Performance is improved by a couple percent or more, including the
single machine multiple GPU runs.
Closes #13273.
Reviewed By: mrshenli
Differential Revision: D14580911
fbshipit-source-id: 44e76f8b0b7e58dd6c91644e3df4660ca2ee4ae2
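The reducer packs gradients into capacity-limited buckets before all-reducing them. A greatly simplified, hypothetical sketch of such greedy bucketing (the function and parameter names here are illustrative; the real C++ bucket assignment logic differs):

```python
def bucket_grads(grad_numels, bucket_cap_bytes=25 * 1024 * 1024, elem_size=4):
    """Greedily group gradient indices so each bucket covers roughly
    bucket_cap_bytes of data per all-reduce call."""
    buckets, current, current_bytes = [], [], 0
    for i, numel in enumerate(grad_numels):
        nbytes = numel * elem_size
        if current and current_bytes + nbytes > bucket_cap_bytes:
            buckets.append(current)       # flush the full bucket
            current, current_bytes = [], 0
        current.append(i)
        current_bytes += nbytes
    if current:
        buckets.append(current)
    return buckets
```

With a tiny 80-byte cap, three 10-element float32 gradients (40 bytes each) split into two buckets: `[[0, 1], [2]]`.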
Summary:
The derivative of the Cholesky decomposition was previously a triangular matrix.
Changelog:
- Modify the derivative of Cholesky from a triangular matrix to symmetric matrix
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19116
Differential Revision: D14935470
Pulled By: ezyang
fbshipit-source-id: 1c1c76b478c6b99e4e16624682842cb632e8e8b9
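The symmetry of the new Cholesky gradient can be checked directly (this sketch uses the modern `torch.linalg.cholesky` name; the original change landed on the older `torch.cholesky` API):

```python
import torch

A = torch.randn(3, 3, dtype=torch.double)
A = A @ A.t() + 3 * torch.eye(3, dtype=torch.double)  # make A SPD
A.requires_grad_(True)
L = torch.linalg.cholesky(A)
L.sum().backward()
# After this change the gradient is a symmetric matrix, not triangular:
symmetric = torch.allclose(A.grad, A.grad.t())
```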
Summary:
This PR splits the configuration tree data from the logic used to construct the tree, for both `pytorch` and `caffe2` build configs.
Caffe2 configs are also now illustrated in a diagram.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18517
Differential Revision: D14936170
Pulled By: kostmo
fbshipit-source-id: 7b40a88512627377c5ea0f24765dabfef76ca279
Summary:
The caching allocator tries to free all blocks on an out-of-memory
error. Previously, it did not free blocks that still had outstanding
stream uses. This change synchronizes on the outstanding events and
frees those blocks.
See #19219
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19222
Differential Revision: D14925071
Pulled By: colesbury
fbshipit-source-id: a2e9fe957ec11b00ea8e6c0468436c519667c558
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19218
Sync some contents between fbcode/caffe2 and xplat/caffe2 to move closer towards a world where they are identical.
Reviewed By: dzhulgakov
Differential Revision: D14919916
fbshipit-source-id: 29c6b6d89ac556d58ae3cd02619aca88c79591c1
Summary:
Enable multi-GPU tests that work with ROCm 2.2. Have been run three times on CI to ensure stability.
While there, remove skipIfRocm annotations for tests that depend on MAGMA. They still skip but now for the correct reason (no MAGMA) to improve our diagnostics.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19169
Differential Revision: D14924812
Pulled By: bddppq
fbshipit-source-id: 8b88f58bba58a08ddcd439e899a0abc6198fef64
Summary:
Partially fuse layer_norm by decomposing it into the batchnorm kernel that computes the stats, then fusing the affine operations after the reduce operations. This is similar to the batchnorm fusion that apaszke did; it also only works in inference mode for now.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18266
Differential Revision: D14879877
Pulled By: wanchaol
fbshipit-source-id: 0197d8f2a17ec438d3e53f4c411d759c1ae81efe
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18546
We'll expose all combinations of various ways of quantization in the top level dispatch key, that is we have AffineCPUTensor, PerChannelAffineCUDATensor, etc.
QTensor method added:
- is_quantized()
- item()
Differential Revision: D14637671
fbshipit-source-id: 346bc6ef404a570f0efd34e8793056ad3c7855f5
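The QTensor methods above can be exercised as follows (this sketch uses the modern `torch.quantize_per_tensor` name; the API at the time of this diff used earlier names):

```python
import torch

x = torch.tensor([0.0, 0.5, 1.0])
# Per-tensor affine quantization to uint8
q = torch.quantize_per_tensor(x, scale=0.01, zero_point=0, dtype=torch.quint8)
is_q = q.is_quantized             # True for quantized tensors
val = q.dequantize()[1].item()    # round-trips through the quantized encoding
```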
Summary:
If JIT constant propagation doesn't work, we have to handle the ListConstructor in symbolic.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19102
Reviewed By: zrphercule
Differential Revision: D14875588
Pulled By: houseroad
fbshipit-source-id: d25c847d224d2d32db50aae1751100080e115022
Summary:
This PR is to fix the CI error:
```
nvidia-docker2 : Depends: nvidia-container-runtime (= 2.0.0+docker18.09.4-1) but 2.0.0+docker18.09.5-1 is to be installed
E: Unable to correct problems, you have held broken packages.
Exited with code 100
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19195
Differential Revision: D14913104
Pulled By: yf225
fbshipit-source-id: d151205f5ffe9cac7320ded3c25baa7e051c3623
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19182
This is a bug discovered by zafartahirov, right now if one of the tensor is QInt
type we'll return undefined, but actually we want to allow ops that accepts
Tensors of the same QInt type to work.
Reviewed By: zafartahirov
Differential Revision: D14909172
fbshipit-source-id: 492fd6403da8c56e180efe9d632a3b7fc879aecf
Summary:
According to https://github.com/pytorch/pytorch/issues/13638#issuecomment-468055428, after the Variable/Tensor merge, we may capture variables without autograd metadata inside an autograd function, and we need a working version counter in these cases. This PR makes it possible by moving `version_counter_` out of autograd metadata and into TensorImpl, so that variables without autograd metadata still have version counters.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18223
Differential Revision: D14735123
Pulled By: yf225
fbshipit-source-id: 15f690311393ffd5a53522a226da82f5abb6c65b
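The effect is observable through the internal `_version` attribute: even a plain tensor without autograd metadata now has a working version counter that in-place ops bump.

```python
import torch

x = torch.ones(3)     # plain tensor, no autograd metadata required
v0 = x._version       # version counter now lives on TensorImpl
x.add_(1)             # an in-place op bumps the counter
v1 = x._version
```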
Summary:
Currently, a TensorImpl's `is_variable_` is true if and only if the TensorImpl has AutogradMeta. This PR unifies these two concepts by removing `is_variable_` and change `is_variable()` to check existence of AutogradMeta instead.
Removing `is_variable_` is part of the work in Variable/Tensor merge.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19139
Differential Revision: D14893339
Pulled By: yf225
fbshipit-source-id: ceb5e22c3c01f79b5d21d5bdbf4a7d1bc397796a
Summary:
This PR propagates the use of first-class module objects into the compiler. This creates a transitionary state where:
* compiler.cpp creates Graphs where `self` is a Module class and attributes/parameters/buffers/submodules are looked up with `prim::GetAttr`
* GraphExecutor still runs "lowered graphs" where the self object has been removed by a compiler pass `lower_first_class_method`.
* Tracing still creates "lowered graphs", and a pass "lift_lowered_method" creates a first-class method graph for things.
* This PR separates out Method and Function. A script::Function is a pure Graph with no `self` bound. Similar to Python, a script::Method is just a bound `self` and its underlying `script::Function`.
* This PR also separates CompilationUnit from Module. A CompilationUnit is just a list of named script::Functions. Classes have a CompilationUnit holding the class methods, and Modules also have a CompilationUnit holding their Methods. This avoids the weird circular case Module --has a-> Class -> has a -> Module ...
Details:
* In this transitionary state, we maintain two copies of a Graph, first-class module and lowered. The first-class one has a self argument that is the module's class type. The lowered one is the lowered graph that uses the initial_ivalues inputs.
* When defining lowered methods using `_defined_lowered` we immediately create the first-class equivalent. The reverse is done lazily, creating lowered_methods on demand from the class.
* The two-way conversions will be deleted in a future PR when the executor itself runs first-class objects. However, this requires more changes to (1) the tracer, (2) the python bindings, and (3) the onnx export pass, and would make this PR way too large.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19167
Differential Revision: D14891966
Pulled By: zdevito
fbshipit-source-id: 0b5f03118aa65448a15c7a7818e64089ec93d7ea
Summary:
It's not intended that Storages have 'default' CUDA devices, but this is allowable via the Storage::create_legacy codepath.
This also messes with device caching, because the initial cache is obtained from the Storage, which may have a 'default' device.
Instead, we materialize a device by allocating 0 bytes via the allocator.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18605
Differential Revision: D14680620
Pulled By: gchanan
fbshipit-source-id: 6d43383d836e90beaf12bfe37c3f0506843f5432
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19154
I recently saw some weird workflow errors due to an empty but set net_type. Maybe we should just fall back to simple net in this case.
Reviewed By: dzhulgakov
Differential Revision: D14890072
fbshipit-source-id: 4e9edf8232298000713bebb0bfdec61e9c5df17d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19147
After #14809 was merged there is no longer a need for getGroupRank.
Every ProcessGroup object has its own rank and size fields which are
accurate for the global group as well as subgroups.
Strictly speaking removing a function in a minor version bump is a big
no-no, but I highly doubt this was ever used outside of
`torch.distributed` itself. This will result in a compile error for
folks who have subclassed the ProcessGroup class though.
If this is a concern we can delay merging until a later point in time,
but eventually this will need to be cleaned up.
Differential Revision: D14889736
fbshipit-source-id: 3846fe118b3265b50a10ab8b1c75425dad06932d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19091
Implements a basic quantized ReLU (uint8). This is a temporary solution before using the `QTensor` type instead of the tuple.
Reviewed By: dzhulgakov
Differential Revision: D14565413
fbshipit-source-id: 7d53cf5628cf9ec135603d6a1fb7c79cd9383019
Summary:
Import MultiheadAttention into the core pytorch framework.
Users now can import MultiheadAttention directly from torch.nn.
See "Attention Is All You Need" for more details related to MultiheadAttention function.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18334
Differential Revision: D14577966
Pulled By: zhangguanheng66
fbshipit-source-id: 756c0deff623f3780651d9f9a70ce84516c806d3
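With this change, a minimal self-attention usage looks like the following (shapes follow the default `(seq_len, batch, embed_dim)` layout):

```python
import torch
import torch.nn as nn

mha = nn.MultiheadAttention(embed_dim=16, num_heads=4)
q = torch.randn(5, 2, 16)   # (seq_len, batch, embed_dim)
out, attn = mha(q, q, q)    # self-attention: query == key == value
```

`out` keeps the input shape, and the (head-averaged) attention weights have shape `(batch, tgt_len, src_len)`.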
Summary:
Remove pointer to nonexistent Note.
It is already removed in "Remove support for CUDNN 6 (#15851)"
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19148
Differential Revision: D14891514
Pulled By: soumith
fbshipit-source-id: dd33cfefa3a21e18afae5b3992dea085adaabda8
Summary:
"""
This will verify that the func syntax follows the JIT signature schema. If you are a developer outside the core team, set this to False first to help us track unification. After your tests pass try setting this to True once and leave it set to True if it doesn't trigger any asserts. This means that your signature happens to be compliant. In general, it serves as a means of tracking an ongoing schema unification with the goal of aligning func syntax with other components of PyTorch in order to reduce overall complexity and assert coverage of all functions by each component.
"""
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18956
Differential Revision: D14807952
Pulled By: cpuhrsch
fbshipit-source-id: 42dac49269fb3cd96dc62e0b10820d0c32c7fb0e
Summary:
* `torch.hub.list('pytorch/vision')` - show all available hub models in `pytorch/vision`
* `torch.hub.show('pytorch/vision', 'resnet18')` - show docstring & example for `resnet18` in `pytorch/vision`
* Moved `torch.utils.model_zoo.load_url` to `torch.hub.load_state_dict_from_url` and deprecate `torch.utils.model_zoo`
* We have too many env variables controlling where the cache dir is; it's not really necessary. I actually want to unify `TORCH_HUB_DIR`, `TORCH_HOME` and `TORCH_MODEL_ZOO`, but haven't done it yet. (more suggestions are welcome!)
* Simplify the `pytorch/vision` example in the doc; it was used to show how a hub entrypoint can be written, so it had some confusing unnecessary args.
An example of hub usage is shown below
```
In [1]: import torch
In [2]: torch.hub.list('pytorch/vision', force_reload=True)
Downloading: "https://github.com/pytorch/vision/archive/master.zip" to /private/home/ailzhang/.torch/hub/master.zip
Out[2]: ['resnet18', 'resnet50']
In [3]: torch.hub.show('pytorch/vision', 'resnet18')
Using cache found in /private/home/ailzhang/.torch/hub/vision_master
Resnet18 model
pretrained (bool): a recommended kwargs for all entrypoints
args & kwargs are arguments for the function
In [4]: model = torch.hub.load('pytorch/vision', 'resnet18', pretrained=True)
Using cache found in /private/home/ailzhang/.torch/hub/vision_master
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18758
Differential Revision: D14883651
Pulled By: ailzhang
fbshipit-source-id: 6db6ab708a74121782a9154c44b0e190b23e8309
Summary:
Previously, MPI process groups were created for all processes, even if
they were not part of the created group. Their MPI_Comm member field
would be MPI_COMM_NULL and they would ignore any calls. Their rank and
size were identical to that of the global process group and they had a
special groupRank and groupSize field to capture the _real_ rank.
This also meant asymmetry with other process group types, where creating
a new group would either return the process group OR
GroupMember.NON_GROUP_MEMBER. For the MPI process group, it would always
return a process group and an additional check was needed to verify
whether or not a process was indeed part of a process group or not.
This commit changes this such that every MPI process group is a valid
process group, and by extension that we no longer have to special case
MPI to determine whether or not a process is part of a group. Now, if
the value returned by `new_group` is GroupMember.NON_GROUP_MEMBER, the
process is not a member, otherwise it is.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14809
Differential Revision: D14887937
Pulled By: pietern
fbshipit-source-id: c5bf86d3b33e524cc5004ee68e30103178fa491d
Summary:
~Sometimes, `init_process_group()`, `store.get()`, and `destroy_process_group()` can take more than a few seconds. Hence, removing the thread join timeout.~
The error was due to `Address already in use` when starting TPC backend. The solution is to catch the error and report it to the `retry_on_address_already_in_use_error` decorator.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19114
Reviewed By: ezyang
Differential Revision: D14872680
Pulled By: mrshenli
fbshipit-source-id: fc504d02853ca73f76288c0ade564ab20bc01f7e
Summary:
It is done by flattening all tensor lists that are inputs/outputs to the
graph into the inputs/outputs list in the autograd graph.
This is less desirable than simply allowing IValues to exist in the
inputs/outputs of autograd::Function but it is substantially less
intrusive.
CaptureList describes the variables captured for backward in a single class.
UnpackInstructs describes how the flattened inputs to backwards are re-packed into lists.
ailzhang
This PR is also part 2 of covering maskrcnn & bert AD formulas, following #16689.
Ops added in this PR:
```
cat
index
meshgrid
reshape
split
split_with_sizes
stack
unbind
```
I will also add a few perf numbers here.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16784
Differential Revision: D14104063
Pulled By: ailzhang
fbshipit-source-id: 5ceadadfd67ccaac60c5fd6740786c5354e252b9
Summary:
Previously the error message would look like:
```
Attempted to set the storage of a tensor on device cuda:0 to a storage on different device cuda. This is no longer allowed; the devices must match.
```
Now it looks like:
```
Attempted to set the storage of a tensor on device "cuda:0" to a storage on different device "cuda". This is no longer allowed; the devices must match.
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19068
Reviewed By: dzhulgakov
Differential Revision: D14854257
Pulled By: gchanan
fbshipit-source-id: deb1ef73c2fcbf9338e7d67f2856282db2befac8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19085
This is a bug where input_shapes_ and output_shapes_ will grow indefinitely. Fix it here.
Reviewed By: bertmaher, rdzhabarov
Differential Revision: D14861695
fbshipit-source-id: d59116f27c3b54f5cc5a33533de4b9222dbb7afc
Summary:
Bug fix for https://github.com/pytorch/pytorch/issues/15043, where a large fusion in JIT with a large number of kernel arguments, which exceeds the limit allowed by nvrtc on a cuda device.
The fix is to check the number of arguments before a cuda kernel is generated. If the number exceeds the limit, take the runFallBack() path.
Add a reduced test from the original issue to keep the test time low. The test would fail without this fix.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18063
Differential Revision: D14691401
Pulled By: soumith
fbshipit-source-id: b98829bc89ed7724e91eda82ae3a5a1151af721a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18902
The fix in D14778810 had an issue: when we fall back to acc32 because the outlier density is too high, W_quantized_ has already been modified. In this diff we first just count the number of outliers (without modifying W_quantized_), and only when the density is low enough and no fallback is needed do we modify W_quantized_ and construct an outlier matrix.
Reviewed By: jspark1105
Differential Revision: D14785256
fbshipit-source-id: 03933110a4ca7409686a06b18a9bb921f8657950
Summary:
I've been messing around with vectorizing the fusion compiler in JIT, and noticed that these ops were pathologically slow. I moved them to use TensorIterator + Vec256<> and got some speed wins.
Benchmark script:
```
import torch, time
ops = ['abs', 'neg', 'reciprocal', 'frac']
x = torch.rand(1024, 1024)
NITER = 10000
print('op', 'time per iter (ms)', 'gops/s', 'GB/s', sep='\t')
for op in ops:
s = time.time()
for i in range(NITER):
getattr(x, op)()
elapsed_sec = ((time.time() - s) / NITER)
print(op, elapsed_sec * 1000, (1024*1024/elapsed_sec)/1e9, (1024*1024*4*2) / elapsed_sec / 1e9, sep='\t')
```
Before this change (on my mac with a skylake):
```
op time per iter (ms) gops/s GB/s
abs 0.9730974197387695 1.0775652866097343 8.620522292877874
neg 1.0723679780960083 0.9778136063534356 7.822508850827485
reciprocal 1.2610594034194946 0.8315040490215421 6.6520323921723366
frac 1.1681334018707275 0.8976509004200546 7.181207203360437
```
After this change:
```
op time per iter (ms) gops/s GB/s
abs 0.5031076192855835 2.084198210889721 16.673585687117768
neg 0.4433974027633667 2.3648672578256087 18.91893806260487
reciprocal 0.47145988941192624 2.2241043693195985 17.79283495455679
frac 0.5036592721939087 2.0819154096627024 16.65532327730162
```
So, after this change it looks like we are hitting machine peak for bandwidth and are bandwidth bound.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19041
Differential Revision: D14862037
Pulled By: jamesr66a
fbshipit-source-id: e2032ac0ca962dbf4120bb36812277c260e22912
Summary:
Fixes the problem of #18391
The issue is that when we codegen the ATenOp, we always generate a static number of outputs for each operator. E.g., if there's an operator from an old model that only requires two outputs, its createOperator will only allocate two output blobs, while the newer version of the operator (`unique` in this case) requires more output blobs to be allocated.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18581
Differential Revision: D14865647
Pulled By: wanchaol
fbshipit-source-id: 85f63fe16d6fe408a09eca84798c7e8cab3070e9
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19080
OSS: add a tiny unit test utility function to create tensors given shape and data outside of any workspace. I use it in an internal test
Reviewed By: dzhulgakov
Differential Revision: D14814194
fbshipit-source-id: 6d53b235d99a97da812215f5c7f11fecad363c8c
Summary:
Changelog:
- Rename `btrisolve` to `lu_solve` to remain consistent with names of solve methods (`cholesky_solve`, `triangular_solve`, `solve`)
- Fix all callsites
- Rename all tests
- Create a tentative alias for `lu_solve` under the name `btrisolve` and add a deprecation warning to not promote usage
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18726
Differential Revision: D14726237
Pulled By: zou3519
fbshipit-source-id: bf25f6c79062183a4153015e0ec7ebab2c8b986b
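For reference, the factor-then-solve pattern behind `lu_solve` looks like this (the op was later moved again into `torch.linalg`, so this sketch uses the current `lu_factor`/`lu_solve` names rather than the ones introduced in this diff):

```python
import torch

A = torch.tensor([[4., 1., 0.],
                  [1., 3., 1.],
                  [0., 1., 2.]], dtype=torch.double)
b = torch.ones(3, 2, dtype=torch.double)

LU, pivots = torch.linalg.lu_factor(A)      # factor once
x = torch.linalg.lu_solve(LU, pivots, b)    # reuse the factorization
ok = torch.allclose(A @ x, b)
```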
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19062
As a temporary demonstration on how to extend this hack further until custom C types are ready.
Reviewed By: ezyang
Differential Revision: D14817809
fbshipit-source-id: 6eaf731e9135313eb858e178abcd9f25380ab8fe
Summary:
Closes #16520
Hi pietern, I am not sure if this is the expected way to pass timeout to `Store`, could you please help take a look? Thanks!
Questions:
1. How do I write tests for this? I wanted to do something like `test_barrier_timeout_global`, but it seems I need to set the pg's timeout larger than the `Store`'s default timeout (3 min) to see a difference, which is too long for a unit test. And I do not want to change the `Store`'s default timeout either. Any suggestion?
2. Should I also propagate timeout configuration down to `PrefixStore` in `_new_process_group_helper`?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16571
Differential Revision: D13954527
Pulled By: mrshenli
fbshipit-source-id: 77f2653903f24255207233eb298f7c0321119a87
Summary:
Fixes #18518
I changed the C++ API torch::nn::init::orthogonal_ implementation to match the Python implementation.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18915
Differential Revision: D14851833
Pulled By: ezyang
fbshipit-source-id: 45b5e9741582777c203e9ebed564ab3ac1f94baf
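The Python behavior the C++ API was aligned with can be checked directly: for a tall matrix, `orthogonal_` produces orthonormal columns.

```python
import torch
import torch.nn as nn

w = torch.empty(5, 3)
nn.init.orthogonal_(w)        # semi-orthogonal init
gram = w.t() @ w              # should be the 3x3 identity
ok = torch.allclose(gram, torch.eye(3), atol=1e-5)
```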
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19042
show the model saving step in the log.
Reviewed By: kennyhorror
Differential Revision: D14809385
fbshipit-source-id: c7a1e50ff92bb45b16b1c501d9325b304b07fbd3
Summary:
Partial fix of: https://github.com/pytorch/pytorch/issues/394
- `gels` and `triangular_solve` now returns namedtuple
- refactor test for namedtuple API for better coverage and maintainability
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17195
Differential Revision: D14851875
Pulled By: ezyang
fbshipit-source-id: 9b2cba95564269d2c3a15324ba48751d68ed623c
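With the namedtuple return, results can be accessed by field name instead of positional index (sketch; `triangular_solve` was later deprecated in favor of `torch.linalg.solve_triangular`, which returns a plain tensor):

```python
import torch

A = torch.triu(torch.ones(3, 3, dtype=torch.double)) \
    + 2 * torch.eye(3, dtype=torch.double)
b = torch.ones(3, 1, dtype=torch.double)

res = torch.triangular_solve(b, A)  # returns a namedtuple
sol = res.solution                  # field access instead of res[0]
ok = torch.allclose(A @ sol, b)
```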
Summary:
DDP does not support replicating BN layers within a process. Existing BN tests fail if the test environment has more than 8 GPUs. This is fixed by explicitly setting each process to use a single replica.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19049
Differential Revision: D14845286
Pulled By: mrshenli
fbshipit-source-id: 937dda5081d415ece48b21f2781b6b4e008dd42f
Summary:
Define `AT_CPP14_CONSTEXPR` from `constexpr` to empty on Windows with CUDA >= 9.2 as workaround.
Discussed in #18425.
When using CUDA 10.1 on Windows, I faced following errors:
~~~
D:/data/source/pytorch\c10/util/ArrayRef.h(144): error: variable in constexpr function does not have automatic storage duration
detected during instantiation of "const T &c10::ArrayRef<T>::front() const [with T=at::Tensor]"
D:/data/source/pytorch/aten/src\ATen/DeviceGuard.h(30): here
~~~
From documentation of CUDA Toolkit v10.1.105, compiler supports `constexpr` and relaxing requirements (in C++14), but compilation failed.
I suppose this could be compiler bug and require this workaround.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18986
Differential Revision: D14821836
Pulled By: ezyang
fbshipit-source-id: 9800da2fe7291e7c09e8e5e882adebab08d83ae3
Summary:
This is a minimalist PR to add MKL-DNN tensor per discussion from Github issue: https://github.com/pytorch/pytorch/issues/16038
Ops with MKL-DNN tensor will be supported in following-up PRs to speed up imperative path.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17748
Reviewed By: dzhulgakov
Differential Revision: D14614640
Pulled By: bddppq
fbshipit-source-id: c58de98e244b0c63ae11e10d752a8e8ed920c533
Summary:
Previously, when a user built PyTorch from source, but set the version string manually to be binary-formatted, it would've simply used CXX11_ABI=0 incorrectly.
We have this information available at runtime with `torch._C._GLIBCXX_USE_CXX11_ABI`, so this PR improves the situation by simply using that information.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18994
Differential Revision: D14839393
Pulled By: soumith
fbshipit-source-id: ca92e0810b29ffe688be82326e02a64a5649a3ad
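The runtime query mentioned above is a one-liner:

```python
import torch

# Ask the installed libtorch which C++ ABI it was built with,
# instead of guessing from the version string:
uses_cxx11_abi = torch._C._GLIBCXX_USE_CXX11_ABI
```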
Summary:
This is a fix for issue https://github.com/pytorch/pytorch/issues/18525. The issue is related not only to ONNX export, but can manifest in other scenarios.
An existing test point in test/onnx/test_operators.py has been updated to cover this scenario as well.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18764
Reviewed By: zrphercule
Differential Revision: D14735166
Pulled By: houseroad
fbshipit-source-id: 5a737c648f64355929ff31eb12bd4869e744768d
Summary:
Almost there, feel free to review.
these c10 operators are exported to _caffe2 domain.
TODO:
- [x] let the onnx checker pass
- [x] test tensor list as argument
- [x] test caffe2 backend and converter
- [x] check the c10 schema can be exported to onnx
- [x] refactor the test case to share some code
- [x] fix the problem in ONNX_ATEN_FALLBACK
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18210
Reviewed By: zrphercule
Differential Revision: D14600916
Pulled By: houseroad
fbshipit-source-id: 2592a75f21098fb6ceb38c5d00ee40e9e01cd144
Summary:
I haven't had a chance to rigorously try these out yet so don't merge yet.
Closes #18725.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18963
Differential Revision: D14832897
Pulled By: ezyang
fbshipit-source-id: 4780e7a34126bc66ddbfd9d808dfc9e0edd77e68
Summary:
The ROCm compiler cannot and will not satisfy them, causing compile-time warnings.
Reason being a runtime loop trip count.
Some warnings remain arising from other parts of the ROCm stack - tickets are filed and they will be resolved within these components.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19018
Differential Revision: D14832859
Pulled By: ezyang
fbshipit-source-id: 0d66e4aebe4e56af14dd5e2967d3c374a82be25c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19004
Handle the exceptional case where the data has min 3.40282e+38 and max -3.40282e+38
Reviewed By: jspark1105
Differential Revision: D14822193
fbshipit-source-id: b9771d1584fdf8317f5b8c7f5806be5d27314386
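The min/max pair above is the classic sentinel for "no data observed" (min initialized to +FLT_MAX, max to -FLT_MAX). A hypothetical sketch of the check (names are illustrative, not from the diff):

```python
FLT_MAX = 3.40282e+38

def sanitize_minmax(mn, mx):
    # An untouched histogram reports min=+FLT_MAX, max=-FLT_MAX;
    # min > max signals that no data was ever observed.
    if mn > mx:
        return 0.0, 0.0
    return mn, mx
```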
Summary:
1) sparse_dispatches in native_parse was not used anymore, got rid of it.
2) got rid of overloaded sizes_ in SparseTensorImpl, which just uses the base implementation.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18962
Differential Revision: D14811545
Pulled By: gchanan
fbshipit-source-id: 2fa60ef50456b5f605caa63beae1d8d2542fd527
Summary:
* Annotate also two pass reduction with launch bounds
* ifdef some shortcomings of ROCm w.r.t. short-circuit returns - internal tickets filed
* while there, plug memory leak by destroying matrix descriptor after the sparse call (applicable to cuSPARSE)
* while there, fix types for cusparseXcoo2csr as per cuSPARSE documentation
* enable test_dsmm in test_sparse which now passes
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18985
Differential Revision: D14822009
Pulled By: bddppq
fbshipit-source-id: 757267a47a63ee56ef396c33059f7eca099f4833
Summary:
Make sure to check if profiler is disabled in push/pop and mark event
functions
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18908
Differential Revision: D14791931
Pulled By: ilia-cher
fbshipit-source-id: e4f5149e69999ee2b9238c21cccad6d27c6a714a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18848
The Observer Module is based on eager mode compute qparam implementation.
Goal is to validate QParam result for EagerMode and Script Mode for simple
model
Observer stats are collected and qparam computed only for activations only at this point
Reviewed By: zafartahirov
Differential Revision: D14720805
fbshipit-source-id: cb2f321b4b9927b37905fdb8eb55c5610d41b351
Summary:
Dear All,
The proposed patch fixes the test code snippets used in the cmake infrastructure and the resulting failure to properly set the ```CAFFE2_COMPILER_SUPPORTS_AVX2_EXTENSIONS``` flag. Without it, libcaffe2.so will have some ```UND``` avx2-related references, rendering it unusable.
* Using GCC 9 test code from cmake build infra always fails:
```
$ gcc -O2 -g -pipe -Wall -m64 -mtune=generic -fopenmp -DCXX_HAS_AVX_1 -fPIE -o test.o -c test.c -mavx2
test.c: In function ‘main’:
test.c:11:26: error: incompatible type for argument 1 of ‘_mm256_extract_epi64’
11 | _mm256_extract_epi64(x, 0); // we rely on this in our AVX2 code
| ^
| |
| __m256 {aka __vector(8) float}
In file included from /usr/lib/gcc/x86_64-redhat-linux/9/include/immintrin.h:51,
from test.c:4:
/usr/lib/gcc/x86_64-redhat-linux/9/include/avxintrin.h:550:31: note: expected ‘__m256i’ {aka ‘__vector(4) long long int’} but argument is of type ‘__m256’ {aka ‘__vector(8) float’}
550 | _mm256_extract_epi64 (__m256i __X, const int __N)
|
$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/9/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-redhat-linux
Configured with: ../configure --enable-bootstrap --enable-languages=c,c++,fortran,objc,obj-c++,ada,go,d,lto --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --enable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-gcc-major-version-only --with-linker-hash-style=gnu --enable-plugin --enable-initfini-array --with-isl --enable-offload-targets=nvptx-none --without-cuda-driver --enable-gnu-indirect-function --enable-cet --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
Thread model: posix
gcc version 9.0.1 20190328 (Red Hat 9.0.1-0.12) (GCC)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18991
Differential Revision: D14821838
Pulled By: ezyang
fbshipit-source-id: 7eb3a854a1a831f6fda8ed7ad089746230b529d7
Summary:
Stacked on https://github.com/pytorch/pytorch/pull/18815 and https://github.com/pytorch/pytorch/pull/18811.
This makes it so that we emit a higher-precision literal for float values in the fusion kernel, as well as assign that to a `double` variable. This prevents us from losing precision for values such as `pi`, but with the previous fixes this will also get downcasted to `float` if downstream operations require it. Therefore, we should not lose performance because of implicit promotions
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18817
Differential Revision: D14820842
Pulled By: jamesr66a
fbshipit-source-id: 519671c6ca5e7adac746a4c4c72760a6d91e332f
Summary:
Closes#18382
Please let me know if any changes are required.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18787
Differential Revision: D14821147
Pulled By: soumith
fbshipit-source-id: edd98eab1b3f4151c4ae5148146435ddb2ae678d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18974
When the weight is prepacked and it doesn't contain a prepacked weight for acc32, we shouldn't fall back to acc32.
Reviewed By: bddppq
Differential Revision: D14814067
fbshipit-source-id: aec917322de695e283f0aca1e930c5603d196404
Summary:
Move these 2 ops back to autodiff to unblock xla CI.
I will leave them for my next PR to cleanup symbolic_variable.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18969
Differential Revision: D14816811
Pulled By: ailzhang
fbshipit-source-id: dd8a7e133dcad29560d3d1d25691883960117299
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18713
- Quantizer observer node output is hooked up to following node
which mutates the naming for input/output. This is not desired and
required because observer op can be a sync node
- Quantizer is aimed for quantizing tensors so we should insert observer
op for Values that are type tensor
Reviewed By: zafartahirov
Differential Revision: D14715916
fbshipit-source-id: feca04c65a43103b46084d3548998498b19ee599
Summary:
Stacked on https://github.com/pytorch/pytorch/pull/18811
This makes it so that we only emit the *f variants of math functions if the output value's type is FloatTensor, otherwise we call the double variants to prevent loss of precision. This fixes more numerical issues
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18815
Differential Revision: D14816965
Pulled By: jamesr66a
fbshipit-source-id: 464be644168875ede987142281fb2168f4041e81
Summary:
Thanks to dusty-nv , we now have Stable and Weekly wheels provided for the NVIDIA Jetson Platform. They require JetPack 4.2.
He's also maintaining source build instructions.
This PR adds links to the binaries and source build instructions to the README.
The links are dynamic, so when new stable / weekly wheels are available, Dustin will update the same URL to point to the new files.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18990
Differential Revision: D14820158
Pulled By: soumith
fbshipit-source-id: 761a56557decb72ad9c1b9f8a2745667f558eec3
Summary:
- Quantizer pass to mutate IR by inserting quant-dequant nodes
before and after nodes which support quantized ops. This information
will be used by jit compiler to substitute with quantized ops
- This currently covers simple model. It will be expanded later
for subgraph pattern matching to cover more complex patterns
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18446
Differential Revision: D14592265
Pulled By: nishantpdce
fbshipit-source-id: c9ba6c12aa96cb9c117826e386721eec83a55ea6
Summary:
Trainer was removed a long time ago.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18980
Differential Revision: D14819855
Pulled By: ezyang
fbshipit-source-id: f62020e688ebf6663416aec7435bf1f531607941
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18469
ghimport-source-id: 73cb8b58f43f10b1dcfca805fd5b25c4fa977632
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18469 Create Object that represents a Module**
* #18468 slots with explicit value/setValue make more sense in future patches
* #18467 Make Object hold its ClassType
* #18379 Enforce single parent for script submodules
* #18378 Unify namespace of script::Module
* #18314 Add ability to specialize class types to ArgumentSpec
* #18226 Add Slot type to abstract the raw pointers being used for slots.
This changes the underlying storage for script::Module to hold
a ivalue::Object which has slots for all the parameters and attributes.
NamedIValue and Slot are now merged together into one class Slot that stores
the tuple (ivalue::Object, offset) and can be used to read the name, type,
or value of the slot and also to set the value. This cleans up a bunch
of client uses.
This PR does not actually use the module object in any generated code.
A future PR will switch how code is generated to treat modules as
first class.
Differential Revision: D14613508
fbshipit-source-id: d853a7559f58d244de2ef54a781427fcd1060ed0
Summary:
Fixes https://github.com/pytorch/pytorch/issues/10654
The issue is that in tracing `.size` returns an int tensor, and when an int tensor is multiplied by a scalar, the int type dominates and the scalar gets cast to 0.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18875
Differential Revision: D14814441
Pulled By: eellison
fbshipit-source-id: a4e96a2698f2fcbf3ec4b2bb4c43a30250f30ad9
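The failure mode can be mimicked in plain Python (a hypothetical sketch of the old promotion rule, not the tracer's actual code): when the scalar is cast to the tensor's dtype before the multiply, a fractional scalar truncates to 0.

```python
def mul_with_old_promotion(tensor_val, tensor_dtype, scalar):
    # Old behaviour: the int tensor's dtype dominates, so the
    # scalar is cast to int first -- 0.5 truncates to 0.
    if tensor_dtype == "int":
        scalar = int(scalar)
    return tensor_val * scalar

print(mul_with_old_promotion(4, "int", 0.5))    # 0 (wrong)
print(mul_with_old_promotion(4, "float", 0.5))  # 2.0
```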
Summary:
This adds a C++ function `debugGetFusedKernelCode` as well as a Python binding `_jit_fuser_get_fused_kernel_code` that will, given a FusionGroup graph and a set of specified inputs, return the compiled kernel source code. We can then check the contents of this source code for verification of the fuser codegen backend.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18884
Differential Revision: D14795508
Pulled By: jamesr66a
fbshipit-source-id: 8f6e9dd13ebbb517737d893b0b5f5e9aa06af124
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18468
ghimport-source-id: d4b41c521f2269a695e03c8e7d05d5542731ee48
Stack from [ghstack](https://github.com/ezyang/ghstack):
* #18469 Create Object that represents a Module
* **#18468 slots with explicit value/setValue make more sense in future patches**
* #18467 Make Object hold its ClassType
* #18379 Enforce single parent for script submodules
* #18378 Unify namespace of script::Module
* #18314 Add ability to specialize class types to ArgumentSpec
* #18226 Add Slot type to abstract the raw pointers being used for slots.
Reviewed By: suo
Differential Revision: D14613509
fbshipit-source-id: 9f2208d0efd01465c78cebdc3e8365a9e0adf9ff
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18467
ghimport-source-id: d51bdd64d2529d08c634c58df1a0870b54ad49fb
Stack from [ghstack](https://github.com/ezyang/ghstack):
* #18469 Create Object that represents a Module
* #18468 slots with explicit value/setValue make more sense in future patches
* **#18467 Make Object hold its ClassType**
* #18379 Enforce single parent for script submodules
* #18378 Unify namespace of script::Module
* #18314 Add ability to specialize class types to ArgumentSpec
* #18226 Add Slot type to abstract the raw pointers being used for slots.
Currently it holds a symbol whose unqualified name is the name of the
class. This will get confusing when there are multiple possible registries,
and it makes getting the class type from the object difficult.
The pointer to the class is only 4 more bytes so this patch just puts
it in the object.
Reviewed By: suo
Differential Revision: D14613510
fbshipit-source-id: b35175ba4be83d2522deaa6dad5070d6ec691fed
Summary:
I have experienced that sometimes both were in `__dict__`, but it chose to copy `probs`, which loses precision compared to `logits`. This is especially important when training (Bayesian) neural networks or doing other types of optimization, since the loss is heavily affected.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18614
Differential Revision: D14793486
Pulled By: ezyang
fbshipit-source-id: d4ff5e34fbb4021ea9de9f58af09a7de00d80a63
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18881
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18878
When the weight is prepacked and it doesn't contain a prepacked weight for acc32, we shouldn't fall back to acc32.
TODO: add unit tests with better coverage
Reviewed By: feiyu1990
Differential Revision: D14778810
fbshipit-source-id: d49a8c4b7c815ab29b77feb53ee730ad63780488
Summary:
The C++ and CUDA implementations of the lerp are not numerically stable. This is discussed on Wikipedia [here](https://en.wikipedia.org/wiki/Linear_interpolation#Programming_language_support). I checked the GPU SASS output and there's no overhead from using the more precise implementation, from Kepler all the way to Turing. I haven't looked at CPU ASM though.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18871
Differential Revision: D14793438
Pulled By: ezyang
fbshipit-source-id: 2ddc2e026c5285466cae7d1b4101174253100445
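The stability issue is the one described on the Wikipedia page linked above: the one-multiply form `a + t*(b - a)` is not exact at `t == 1`, while the two-multiply form `(1 - t)*a + t*b` is. A minimal sketch with scalar floats:

```python
def lerp_naive(a, b, t):
    # One fewer multiply, but not guaranteed to return b at t == 1.
    return a + t * (b - a)

def lerp_precise(a, b, t):
    # Returns b exactly when t == 1 (the numerically stable form).
    return (1 - t) * a + t * b

a, b = 1e16, 1.0
print(lerp_naive(a, b, 1.0))    # 0.0 -- catastrophic cancellation
print(lerp_precise(a, b, 1.0))  # 1.0
```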
Summary:
As discussed with gchanan we should deduplicate symbolic_variable and symbolic_script to prepare for the future merge with derivatives.yaml.
This PR moves most easy formulas to symbolic_script.
TODO: run benchmarks to make sure no perf regression
cc: apaszke zdevito wanchaol
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17986
Differential Revision: D14766412
Pulled By: ailzhang
fbshipit-source-id: d95a3f876e256c0f505779a71587c985571d3b8f
Summary:
This PR:
* pulls four distinct installation steps out of `build_pytorch.bat` and into their own scripts.
* eliminates the copy step for helper scripts called by `win-build.sh` and `win-test.sh`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18917
Differential Revision: D14807236
Pulled By: kostmo
fbshipit-source-id: 03e91a5834dfd6d68903ad9725eacc099bbf6d53
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18791
As a temporary demonstration on how to extend this hack further until custom C types are ready.
Reviewed By: jamesr66a
Differential Revision: D14742020
fbshipit-source-id: 0f2fd83ae56ab2abe16977a1829ed421e6abe74b
Summary:
Looks like the issue of using `std::` functions is fixed in the new ROCm version.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18905
Differential Revision: D14792943
Pulled By: bddppq
fbshipit-source-id: af11acbb85872943f23b6e55415db1f0699e7b8f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18711
ghimport-source-id: c9caedc0660b2b7ba3730cd0e1a2e0e9c3cf422b
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18711 [jit] fix side-effects and aliasing for custom ops**
Previously we didn't track aliasing, mutation, or side effects for
custom ops. This PR adds in guards with the most conservative
assumptions possible: the op will
1) have side effects,
2) write to everything
3) produce a wildcard.
In order to tell whether a given operator is a custom op, this PR introduces
the concept of a "reserved" namespace (basically all our builtin namespaces).
Custom ops live in non-reserved namespaces, so a check on the namespace
is sufficient to tell whether a schema/node is "custom" or not.
This is just to get things correct for now. Follow-ups to this:
- Users should be able to specify aliasing/mutability without having to learn
the whole alias annotation schema.
- Relax assumptions a bit. In particular outputs can only alias input tensors,
they don't have to be wildcards.
Fixes #18490
Differential Revision: D14730978
fbshipit-source-id: 540b47a24ccf24145051609bdcc99c97e46e0fe0
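The namespace check can be sketched in a few lines (the reserved set below is illustrative, not the exact list used in the PR):

```python
RESERVED_NAMESPACES = {"aten", "prim", "onnx"}  # illustrative set

def is_custom_op(qualified_name):
    # Builtin ops live in reserved namespaces such as aten::;
    # anything else is treated as a custom op and gets the
    # conservative aliasing/side-effect annotations.
    namespace = qualified_name.split("::", 1)[0]
    return namespace not in RESERVED_NAMESPACES

print(is_custom_op("aten::add"))     # False
print(is_custom_op("my_ops::warp"))  # True
```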
Summary:
Expand the list of ops that resize an input in-place to include broadcasting ops and other ops that affect shape. Whoever is reviewing this PR: could you please look through PyTorch's in-place ops and see if I missed any?
Expanding the PR from: https://github.com/pytorch/pytorch/pull/17518
This is already being tested in test_resize_input_ops.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18812
Differential Revision: D14793410
Pulled By: eellison
fbshipit-source-id: 125f4f5375ac1036fb96fabc9da2aaccc9adc778
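For the broadcasting ops added here, the output shape an in-place resize must account for follows NumPy-style broadcasting rules; a minimal sketch of the shape computation:

```python
from itertools import zip_longest

def broadcast_shape(a, b):
    # Align trailing dimensions; a size-1 dim stretches to match.
    out = []
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if x != 1 and y != 1 and x != y:
            raise ValueError(f"shapes {a} and {b} are not broadcastable")
        out.append(max(x, y))
    return tuple(reversed(out))

print(broadcast_shape((3, 1), (1, 4)))  # (3, 4)
print(broadcast_shape((5,), (2, 5)))    # (2, 5)
```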
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18845
This adds a few CPU only test cases for the reducer class.
Reviewed By: mrshenli
Differential Revision: D14768432
fbshipit-source-id: c008a52206826304e634a95bc14167ed94c97662
Summary:
Fix a few instances of notifying on a condition variable while holding the lock, so that the lock is released before notifying. This avoids an extra thread suspension when the notified thread tries to grab the lock.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18857
Differential Revision: D14779132
Pulled By: resistor
fbshipit-source-id: b18a05c4c15be1426ebfdffac1c8f002b771cfd7
Summary:
`scripts/build_windows.bat` is the original way to build Caffe2 on Windows, but since Caffe2 is merged into libtorch, the build scripts should be unified; they actually do the same thing apart from a few different flags.
The follow-up is to add the tests. Looks like the CI job for caffe2 windows is defined [here](https://github.com/pytorch/ossci-job-dsl/blob/master/src/jobs/caffe2.groovy#L906). Could we make them a separate file, just like what we've done in `.jenkins/pytorch/win-build.sh`? There's a bunch of things we can do there, like using ninja and sccache to accelerate build.
cc orionr yf225
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18683
Differential Revision: D14730188
Pulled By: ezyang
fbshipit-source-id: ea287d7f213d66c49faac307250c31f9abeb0ebe
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18833
ghimport-source-id: 6f2be25fcc5e6be3ffe20582e604bd2c1fbab66b
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18833 [STACK] Cache device on TensorImpl; clean up TensorImpl constructors.**
* #18832 [STACK] Disallow changing the device of a tensor via set_.
* #18831 [STACK] Stop swapping in Storages of the wrong device for Tensors.
1) We cache device on TensorImpl. This means we can access the device without a virtual function and allows us to more easily extend TensorImpls (because they don't need to figure out how to store the Device for themselves).
2) Clean up TensorImpl APIs. We had a constructor that took a TensorTypeId and an allocator and would allocate a Storage based on the recognized types of TensorTypeIds. Instead, we just have two different constructors: one for types with a storage, one without.
Reviewed By: dzhulgakov
Differential Revision: D14766230
fbshipit-source-id: 745b8db84dcd6cb58f1a8675ad3ff8d033bc50df
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18912
We intentionally test a deprecated API, no need to show the warnings here.
Reviewed By: dzhulgakov
Differential Revision: D14792617
fbshipit-source-id: 9ea2a4106d566064283726eed2c274b98f49a2e5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18716
Might be useful as an intermediate stage for some systems that currently use Caffe2 nets as an execution mechanism.
Not sure it's a good idea altogether, please comment.
Limitations:
- only Tensor types as inputs/outputs
- the entire module is serialized as a zip archive inside a proto in Caffe2 db, it'd be subject to 4Gb limit and is likely very slow. For small models it'd work though.
- no autograd, though it can be attached in principle
- no way to retrieve parameters inside the script module from C2 runtime perspective (though they potentially can be alias-fetched and stored as individual blobs)
- after deserialization, the Python wrappers returned don't have the correct type (as we don't do the module_lookup trick)
Build-wise, I had to add dependency from pybind_state to libtorch.so. I don't think we build Caffe2 python frontend independently anymore, so it should be fine.
Reviewed By: amirshim, houseroad
Differential Revision: D14339599
fbshipit-source-id: 88a37a8abd1f1c4703e5ef937031f222535d4080
Summary:
This refactor lets us track the types of initial values added onto a `Method`. The main motivation for this is the change in `module.cpp`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18252
Differential Revision: D14673459
Pulled By: driazati
fbshipit-source-id: 21200180c47f25bb70898771adfb569856e6c34a
Summary:
Replace link to a file in a private repo with a gist
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18852
Reviewed By: ezyang
Differential Revision: D14778481
Pulled By: izdeby
fbshipit-source-id: 8389aa4bf115ddcfd85079cc2c861404efa678e7
Summary:
Return `missing_keys` and `unexpected_keys` from `load_state_dict` so the user can handle them when strict mode is off; also removed an unused variable.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18668
Differential Revision: D14782073
Pulled By: ezyang
fbshipit-source-id: ab3b855eb77bb7422594d971988067e86eef20f2
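The semantics of the returned keys can be illustrated with a plain-dict analogue (a sketch of the behavior, not PyTorch's implementation):

```python
def load_state_dict_like(model_keys, incoming, strict=True):
    # Keys the model expects but the checkpoint lacks:
    missing = [k for k in model_keys if k not in incoming]
    # Keys the checkpoint carries that the model doesn't know:
    unexpected = [k for k in incoming if k not in model_keys]
    if strict and (missing or unexpected):
        raise RuntimeError(f"missing: {missing}, unexpected: {unexpected}")
    return missing, unexpected

missing, unexpected = load_state_dict_like(
    ["conv.weight", "fc.bias"],
    {"conv.weight": 1, "extra.bias": 2},
    strict=False)
print(missing)     # ['fc.bias']
print(unexpected)  # ['extra.bias']
```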
Summary:
Tested by running the script in #16562 , and there was no error.
Then:
```
>>> print(mat.grad)
tensor([[1., 2., 3.],
        [1., 2., 3.],
        [1., 2., 3.]])
```
which is correct.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18737
Differential Revision: D14773078
Pulled By: umanwizard
fbshipit-source-id: 8aa36eb6f6aa104263a467d9ac91d61b3bfd05f5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18595
There is no need to force the backend to be the same as the global
process group, as long as the backend is "nccl" or "gloo".
Reviewed By: mrshenli
Differential Revision: D14657204
fbshipit-source-id: 868817b9f219e3be8db0761a487f0027ed46663b
Summary:
This was causing some numerical issues in the fuser
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18811
Differential Revision: D14767390
Pulled By: jamesr66a
fbshipit-source-id: f1123d1aab5501abad850d2edc996f8aa8dafe04
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18832
ghimport-source-id: fde4ad90541ba52dfa02bdd83466f17e6541e535
Stack from [ghstack](https://github.com/ezyang/ghstack):
* #18833 [STACK] Cache device on TensorImpl; clean up TensorImpl constructors.
* **#18832 [STACK] Disallow changing the device of a tensor via set_.**
* #18831 [STACK] Stop swapping in Storages of the wrong device for Tensors.
This is necessary to cache the device on a TensorImpl.
Differential Revision: D14766231
fbshipit-source-id: bba61634b2d6252ac0697b96033c9eea680956e8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18831
ghimport-source-id: 2741e0d70ebe2c2217572c3af54ddd9d2047e342
Stack from [ghstack](https://github.com/ezyang/ghstack):
* #18833 [STACK] Cache device on TensorImpl; clean up TensorImpl constructors.
* #18832 [STACK] Disallow changing the device of a tensor via set_.
* **#18831 [STACK] Stop swapping in Storages of the wrong device for Tensors.**
This is necessary to support device caching, see https://github.com/pytorch/pytorch/pull/18751 and https://github.com/pytorch/pytorch/pull/18578.
In library code, we potentially swap in Storages with the wrong device when device_guard is False. This happens as follows with "view-like" operations.
1) We allocate a tensor on the 'wrong' device (because device_guard is false).
2) We swap out the 'wrong' storage with the 'right' storage using e.g. THCTensor_setStorage.
Instead, we can just construct the Tensor with the correct Storage from the beginning. This is what we do with 'view'.
Note there are two other "view-like" cases where this happens:
1) unfold
2) set_()
Because these aren't performance critical, I just added the device_guard instead of applying the above correction.
For completeness, this also includes a test that all `device_guard: false` functions behave properly under these conditions.
Reviewed By: dzhulgakov
Differential Revision: D14766232
fbshipit-source-id: 0865c3ddae3f415df5da7a9869b1ea9f210e81bc
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17991
changes:
- Breaks BC: `Tensor::type()` now returns `DeprecatedTypeProperties&` rather than `Type&`.
- Added `DeprecatedTypeProperties`; it serves as a temporary replacement for `Type` as the return value of `Tensor::type()`. This contributes to making `Type` just for dispatch purposes so that we can make it dtype agnostic.
- `Tensor::dispatch_type()` now returns `Type&` like `Tensor::type()` used to.
- Changed call sites of `Tensor::type()` appropriately.
Reviewed By: ezyang
Differential Revision: D14443117
fbshipit-source-id: 239ccb7a09626279a71d1a37f8f82e7f57bf7d9e
Summary:
DLPack can have non-strided tensors, which is represented by a nullptr in the place of dl_tensor.strides.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18510
Differential Revision: D14647328
Pulled By: bwasti
fbshipit-source-id: 5364282810a5772cfc2319fc8133fe86fdd84dd1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18838
It turns out that we don't have shape inference function of `Split` op at all. This diff adds that.
Reviewed By: bertmaher
Differential Revision: D14766871
fbshipit-source-id: 535cb4f24bdada603c76579e00e7a39aee93e19f
Summary:
Since `parameter.data` creates a new `torch.Tensor` each time, `_unique_state_dict` currently returns duplicate tensors. Try to deduplicate before creating the new tensor.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18139
Reviewed By: dzhulgakov
Differential Revision: D14511262
Pulled By: houseroad
fbshipit-source-id: cb69795d0b6509721220650bbb19edeb3459a503
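Identity-based deduplication can be sketched as follows (a hypothetical helper, not the PR's actual code): two names that alias the same underlying object should yield the object only once.

```python
def unique_state_dict(state):
    # Deduplicate by object identity, keeping the first name seen,
    # so aliased parameters are not serialized twice.
    seen = set()
    out = {}
    for name, tensor in state.items():
        if id(tensor) in seen:
            continue
        seen.add(id(tensor))
        out[name] = tensor
    return out

shared = [1.0, 2.0]  # stands in for a tied weight tensor
state = {"embed.weight": shared, "decoder.weight": shared, "bias": [0.0]}
print(list(unique_state_dict(state)))  # ['embed.weight', 'bias']
```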
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17026
D14013931 was for FC. This diff is similar optimizations for Conv.
A subtle difference is that in FC, once we fold col_offset into bias during pre-processing step, we can treat everything as if A_zero_offset == 0 (symmetric quantization of A).
In Conv, we can't do this because padding still needs to use the original A_zero_offset.
From requantization point of view, once col_offset folded into bias, we can treat as if we're doing symmetric A quantization.
But, for steps involving padding like im2col, im2col fused with packing, and direct conv for depth-wise/group convolution we still need to pass the original A_zero_offset.
Reviewed By: jianyuh
Differential Revision: D14020276
fbshipit-source-id: c29caefd1127bbc6aff0e9d535939bb0c1ecb66c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18826
ghimport-source-id: 7ffa3bc7ef7402a6d6eb6ba5849e197019d77bf8
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18826 [jit] run cpp tests for non-cuda builds in test_jit.py**
We did all the work of nicely separating our cpp tests that don't require
CUDA, but they aren't run from test_jit.py if CUDA is missing.
Reviewed By: ZolotukhinM
Differential Revision: D14766287
fbshipit-source-id: 9326b3a5c90f6c20fc8cfaf1a1885a363b91f30a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18379
ghimport-source-id: 9895ecc1ff7897e98853dc00675341f36726e7c7
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18379 Enforce single parent for script submodules**
* #18378 Unify namespace of script::Module
* #18314 Add ability to specialize class types to ArgumentSpec
* #18226 Add Slot type to abstract the raw pointers being used for slots.
The assumption that a ScriptModule has a single parent is present in
our serialization format, and likely a few other places. It is not
enforced on creation of script module hierarchies though, meaning that
problems (e.g. replicating a module twice in the output format)
will not be caught until much later in the development cycle.
This patch enforces the property when a submodule is registered.
It also removes NamedModule since it is no longer necessary in this regime.
This will also allow easy discovery of a module's fully-qualified name
without needing to traverse the Module hierarchy.
Differential Revision: D14603722
fbshipit-source-id: 63ab5d0cccf7d66c7833e0adf9023024ca9607cb
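Enforcing the invariant at registration time can be sketched like this (an illustrative toy, not the script::Module implementation):

```python
class Module:
    def __init__(self):
        self._parent = None
        self._submodules = {}

    def register_module(self, name, child):
        # Enforce the single-parent invariant at registration time,
        # so duplicate placement fails early rather than at save time.
        if child._parent is not None:
            raise RuntimeError(f"{name!r} already has a parent module")
        child._parent = self
        self._submodules[name] = child

root_a, root_b, leaf = Module(), Module(), Module()
root_a.register_module("leaf", leaf)
try:
    root_b.register_module("leaf", leaf)  # second parent -> rejected
except RuntimeError as e:
    print(e)
```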
Summary:
Per our offline discussion, allow Tensors, ints, and floats to be casted to be bool when used in a conditional
Fix for https://github.com/pytorch/pytorch/issues/18381
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18755
Reviewed By: driazati
Differential Revision: D14752476
Pulled By: eellison
fbshipit-source-id: 149960c92afcf7e4cc4997bccc57f4e911118ff1
Summary:
Fix the layernorm formula when weight and bias passed in.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18233
Differential Revision: D14760375
Pulled By: wanchaol
fbshipit-source-id: d6bd3b137bc04c391aa5c24d021d1f811ba2a877
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18378
ghimport-source-id: 55c29bb436a2153d29ff2f4488d99d8863c187b1
Stack from [ghstack](https://github.com/ezyang/ghstack):
* #18379 Enforce single parent for script submodules
* **#18378 Unify namespace of script::Module**
* #18314 Add ability to specialize class types to ArgumentSpec
* #18226 Add Slot type to abstract the raw pointers being used for slots.
This removes individual OrderedDicts in favor of a single unified
namespace for all things in a script::Module. This removes a whole
class of bugs where both a method and a parameter could get the
same name, for instance.
Since we no longer have to expose OrderedDict::Item objects, a lot of
downstream code can be simplified.
We no longer now double-store names (both in the key of the dictionary,
and in the object itself).
Differential Revision: D14603723
fbshipit-source-id: b5f7551b3074679623edd6ea70269830353b4d4c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18648
ghimport-source-id: 1cf4a8fe91492621e02217f38cae5d7e0699fb05
Stack from [ghstack](https://github.com/ezyang/ghstack):
* #18661 Step 7: remove _unique
* #18655 Step 6: Rename _unique2 to unique and add int? dim
* #18654 Step 5: remove _unque_dim in favor of unique_dim
* #18651 Step 4: add support for unique with dim=None
* #18650 Step 3: Add support for return_counts to torch.unique for dim not None
* #18649 Step 2: Rename _unique_dim2_temporary_will_remove_soon to unique_dim
* **#18648 Step 1: Secretly add return_counts to unique, and refactor unique_dim for performance**
`unique` is fragile. Previously I tried to change it in #18391 and #17097; they all passed OSS tests but were finally reverted due to internal failures. My previous work of refactoring unique, #18459, is based on #18391, and after #18391 got reverted, I could not work on #18459. To continue working on #18459, #18391, and #17097 without worrying about internal failures, I am suggesting the following steps for the improvements of `unique` and `unique_dim`. soumith Please take this and there is no need to put #18391 back.
The motivation is basically to move forward as much as possible without causing any internal failures. So I will try to divide it into steps and sort from low probability of internal failure to high probability. (I don't know what the internal failure is, so I have to guess.) Let's merge this PR stack one by one until we encounter an internal failure.
Step 1: Create two new ATen operators, `_unique2_temporary_will_remove_soon` and `_unique_dim2_temporary_will_remove_soon`, and keep `_unique` and `_unique_dim` unchanged. The backends of these two functions and of `_unique` and `_unique_dim` are all the same; the only difference is that the temporary ones support `return_counts` while `_unique` and `_unique_dim` do not. Step one is mostly #18391 + #18459. The cuda8 errors have been fixed. At this point, there is no user visible API change, so no docs are updated. `torch.unique` does not support `return_counts` yet, and `return_counts` is tested through the newly added temporary operators. This step just added two new ATen operators, so there shouldn't be any internal failure.
Step 2: Rename `_unique_dim2_temporary_will_remove_soon` to `unique_dim`. This should cause no internal failure either, because there is no change to existing operators. The only thing to worry about is deleting `unique_dim` from the Python side, because we don't want users to use it. At this point, C++ users now have `return_counts` support for `unique_dim`.
Step 3: Update the docs of `torch.unique` and use `unique_dim` inside `torch.unique` to support `return_counts`. In the docs, we should say that `torch.unique` with None dim does not support `return_counts` yet. This might cause internal failure.
Step 4: Rename `_unique2_temporary_will_remove_soon` to `_unique2` and use `_unique2` inside `torch.unique` to support `return_counts`. Update the docs saying that `torch.unique` with None dim now support `return_counts`. This might cause internal failure.
Step 5: Remove `_unique_dim`. This might cause internal failure.
Step 6: Rename `_unique2` to `unique`, add an optional `dim` argument to make it look like the signature of Python's `torch.unique`. Inside `torch.unique`, use `unique` and get rid of `unique_dim`. Unbind `unique_dim` totally from Python at codegen. This is likely to cause internal failure.
Step 7: Remove `_unique`. This is very likely to cause internal failure.
This PR
======
This PR is for step 1. This create two new ATen operators, `_unique2_temporary_will_remove_soon` and `_unique_dim2_temporary_will_remove_soon` and implement `return_counts` inside them and do refactor for performance improvements.
Please review ngimel VitalyFedyunin. They are mostly copied from #18391 and #18459, so the review should be easy.
Below is a benchmark on a tensor of shape `torch.Size([15320, 2])`:
Before
---------
```python
print(torch.__version__)
%timeit a.unique(dim=0, sorted=True, return_inverse=False); torch.cuda.synchronize()
%timeit a.unique(dim=0, sorted=True, return_inverse=True); torch.cuda.synchronize()
```
```
1.0.1
192 µs ± 1.61 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
548 ms ± 3.39 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
```
```python
print(torch.__version__)
%timeit a.unique(sorted=True, return_inverse=False); torch.cuda.synchronize()
%timeit a.unique(sorted=True, return_inverse=True); torch.cuda.synchronize()
```
```
1.0.1
226 µs ± 929 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)
302 µs ± 7.06 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
```
After
-------
```python
print(torch.__version__)
%timeit a.unique(dim=0, sorted=True, return_inverse=False); torch.cuda.synchronize()
%timeit a.unique(dim=0, sorted=True, return_inverse=True); torch.cuda.synchronize()
%timeit torch._unique_dim2_temporary_will_remove_soon(a, dim=0, sorted=True, return_inverse=False, return_counts=True); torch.cuda.synchronize()
%timeit torch._unique_dim2_temporary_will_remove_soon(a, dim=0, sorted=True, return_inverse=True, return_counts=True); torch.cuda.synchronize()
```
```
1.1.0a0+83ab8ac
190 µs ± 2.14 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
237 µs ± 1.23 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
219 µs ± 2.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
263 µs ± 1.15 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
```
```python
print(torch.__version__)
%timeit a.unique(sorted=True, return_inverse=False); torch.cuda.synchronize()
%timeit a.unique(sorted=True, return_inverse=True); torch.cuda.synchronize()
%timeit torch._unique2_temporary_will_remove_soon(a, sorted=True, return_inverse=False, return_counts=True); torch.cuda.synchronize()
%timeit torch._unique2_temporary_will_remove_soon(a, sorted=True, return_inverse=True, return_counts=True); torch.cuda.synchronize()
```
```
1.1.0a0+83ab8ac
232 µs ± 2.21 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
301 µs ± 1.65 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
264 µs ± 7.67 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
339 µs ± 9.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
```
Differential Revision: D14730905
fbshipit-source-id: 10026b4b98628a8565cc28a13317d29adf1225cc
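The `return_counts` semantics being added can be mirrored for a flat Python sequence with the standard library (an illustrative sketch, not the ATen kernel):

```python
from collections import Counter

def unique_with_counts(values):
    # Mirrors torch.unique(..., sorted=True, return_counts=True)
    # for a flat sequence: sorted unique values plus their counts.
    counts = Counter(values)
    uniq = sorted(counts)
    return uniq, [counts[v] for v in uniq]

values = [2, 1, 2, 3, 1, 2]
uniq, counts = unique_with_counts(values)
print(uniq)    # [1, 2, 3]
print(counts)  # [2, 3, 1]
```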
Summary:
If the input `network` resides on multiple GPUs, `devices` must be a 2D list with `devices[0]` matching `network`'s devices. See #18591
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18687
Differential Revision: D14706162
Pulled By: mrshenli
fbshipit-source-id: dca630d3308f2dbcf8b75629c452d7a64092ba42
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18230
Implementing minimum qtensor API to unblock other workstreams in quantization
Changes:
- Added Quantizer which represents different quantization schemes
- Added qint8 as a data type for QTensor
- Added a new ScalarType QInt8
- Added QTensorImpl for QTensor
- Added following user facing APIs
- quantize_linear(scale, zero_point)
- dequantize()
- q_scale()
- q_zero_point()
Reviewed By: dzhulgakov
Differential Revision: D14524641
fbshipit-source-id: c1c0ae0978fb500d47cdb23fb15b747773429e6c
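The affine scheme behind `quantize_linear(scale, zero_point)` / `dequantize()` can be sketched for a single value (a simplified model of the API, with an assumed qint8 clamp range):

```python
def quantize_linear(x, scale, zero_point, qmin=-128, qmax=127):
    # q = clamp(round(x / scale) + zero_point) -- affine quantization.
    q = int(round(x / scale)) + zero_point
    return max(qmin, min(qmax, q))

def dequantize(q, scale, zero_point):
    # Inverse map; exact up to one quantization step of error.
    return (q - zero_point) * scale

q = quantize_linear(0.5, scale=0.1, zero_point=10)
print(q)                       # 15
print(dequantize(q, 0.1, 10))  # 0.5
```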
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18560
We have to import python protobuf here **before** we load cpp extension.
Otherwise it breaks under certain build conditions if cpp implementation of
protobuf is used. Presumably there's some registry in protobuf library and
python side has to initialize the dictionary first, before static
initialization in python extension does so. Otherwise, duplicated protobuf
descriptors will be created and it can lead to obscure errors like
Parameter to MergeFrom() must be instance of same class: expected caffe2.NetDef got caffe2.NetDef.
I think it also fixes https://github.com/facebookarchive/caffe2/issues/1573
Reviewed By: ezyang, iroot900
Differential Revision: D14622054
fbshipit-source-id: 2499eb88ecdee85ff8d845859048f7ae5da2a480
Summary:
To make test_operators.py more stable. In the future, we will bump this up manually, and I think that's acceptable, since ir_version shouldn't be bumped too often.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18768
Reviewed By: zrphercule
Differential Revision: D14741514
Pulled By: houseroad
fbshipit-source-id: 0369dbc55424e345a113e49fc104a441ea290d58
Summary:
Introduce this check to see whether it will break any existing workflow
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18145
Reviewed By: dzhulgakov
Differential Revision: D14511711
Pulled By: houseroad
fbshipit-source-id: a7bb6ac84c9133fe94d3fe2f1a8566faed14a136
Summary:
The mkldnn-bridge is upgraded in this PR to support DNNLOWP operators.
Meanwhile, APIs in caffe2 have been updated to use the latest version.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16308
Differential Revision: D14697018
Pulled By: yinghai
fbshipit-source-id: ca952589098accb08295fd5aa92924c61e74d69c
Summary:
Fixes: #6469
1. `ATen/native/native_functions.yml` had [dispatch](03e7953a98/aten/src/ATen/native/native_functions.yaml (L451-L455)) variants for `embedding_dense_backward`; however, `embedding_backward` explicitly made a [call](03e7953a98/aten/src/ATen/native/Embedding.cpp (L35-L45)) to it, leading to an error.
2. In the case of a CUDA tensor, the function used to crash when dereferencing the indices' data [pointer](03e7953a98/aten/src/ATen/native/Embedding.cpp (L93)).
Both issues have been fixed and checked (on CUDA and CPU) against:
1. The script mentioned in the issue:
```
import torch

class Test(torch.nn.Module):
    def __init__(self):
        super(Test, self).__init__()
        self.embd = torch.nn.Embedding(1000, 100)
        self.dense = torch.nn.Linear(100, 1)

    def forward(self, inp):
        inp = self.embd(inp)
        return self.dense(inp)

test = Test()
inp = torch.tensor([0, 1, 2, 1, 1])
out = test(inp)
raw_loss = out.mean(dim=0)
loss_grad = torch.autograd.grad(outputs=raw_loss,
                                inputs=list(test.parameters()),
                                retain_graph=True, create_graph=True, only_inputs=True)
norm = sum([param.norm()**2 for param in loss_grad])
loss = raw_loss + norm
loss.backward(retain_graph=True)
print(test.embd.weight.grad)
```
2. Test Script
```
import torch
import time

start = time.time()
l = [1, 1] * 100
input = torch.tensor([[1, 0], [1, 0]], device='cpu')
embedding_matrix = torch.tensor([[1.0, 3.0], [2.0, 4.0]], requires_grad=True, device='cpu')
sq = embedding_matrix * embedding_matrix
emb = torch.nn.functional.embedding(input, sq, scale_grad_by_freq=False)
print('Embedding Matrix')
print(embedding_matrix)
print('-----------------')
sum_ = emb.sum()  # prod.sum()
loss_grad, = torch.autograd.grad(outputs=sum_, inputs=embedding_matrix, create_graph=True)
print('Gradient')
print(loss_grad)
print('-----------------')
sum2_ = sum_ + loss_grad.sum()
print(sum2_)
sum2_.backward()
print(embedding_matrix.grad)
print(time.time() - start)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9078
Reviewed By: ezyang
Differential Revision: D14691901
Pulled By: soumith
fbshipit-source-id: 78e2612ba39080be564c876311671eb5a0119a0f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18749
ghimport-source-id: 9026a037f5e11cdb9ccd386f4b6b5768b9c3259b
Stack from [ghstack](https://github.com/ezyang/ghstack):
* #18751 Disallow changing the device of a tensor via set_.
* #18750 Use non-legacy constructors for tensor deserialization.
* **#18749 Add device and dtype to storage.**
The goal here is to fix our serialization, which currently depends on the legacy constructors. Having dtype and device on Storage allows us to use the non-legacy constructors.
This fits somewhat with our goal of removing Storage, by having Storage act like a Tensor.
Differential Revision: D14729516
fbshipit-source-id: bf4a3e8669ad4859931f4a3fa56df605cbc08dcb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18750
ghimport-source-id: f1475cfb67841c41d9867d4429ba9125d5c7dd07
Stack from [ghstack](https://github.com/ezyang/ghstack):
* #18751 Disallow changing the device of a tensor via set_.
* **#18750 Use non-legacy constructors for tensor deserialization.**
* #18749 Add device and dtype to storage.
Deserialization currently uses legacy constructors. This is bad because we need to maintain them, but there is a more immediate problem:
1) We are trying to implement device caching on TensorImpl to get rid of a virtual dispatch
2) This doesn't work if one is able to change the device of a Tensor underlying a Variable.
3) Deserialization does 2)
So the plan is to change deserialization, then enforce that we don't change the device out from underneath a Variable.
Differential Revision: D14729513
fbshipit-source-id: 090d6cdb375b94dc1bf4f554b2df243952b8cdc6
Summary:
It's not used and unfold's use of `device_guard: False` is scary.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18773
Differential Revision: D14736526
Pulled By: gchanan
fbshipit-source-id: 6281a284bee45fa5038783e4c1ed4d1ed7ca81ab
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18531
Currently we use C10_LOG_EVERY_MS to log the data type change, but it pollutes the logs of some services,
so we would like to change it to C10_LOG_FIRST_N to prevent that.
Reviewed By: dzhulgakov
Differential Revision: D14647704
fbshipit-source-id: b84e4002bd4aa94d616133cd1049c3d4ab05386e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18284
ghimport-source-id: 5a92c03fda19072ffb6afd40e0f56806716c7be6
Stack from [ghstack](https://github.com/ezyang/ghstack):
* #18296 [jit] Add namespacing for ScriptClasses
* **#18284 [jit] make test module hook use save/load**
* #18211 [jit] Turn script_type_parser into a class
* #18148 [jit] python interop for script classes
Instead of python-printing and comparing strings (which does not capture
dependency information, etc.), use save/load on in-memory buffers and
compare the main module contents inside the buffer.
Reviewed By: ailzhang
Differential Revision: D14581129
fbshipit-source-id: 52264ae9ce076775ab3fd1a0c32c8d6f6677a903
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18314
ghimport-source-id: 8cecb768d476ab19c9460f39c8f94a764e4cb052
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18314 Add ability to specialize class types to ArgumentSpec**
* #18226 Add Slot type to abstract the raw pointers being used for slots.
Differential Revision: D14574395
fbshipit-source-id: cc3af6e56e9ae52990f4a1ad56ecceaa2d493577
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18740
Test utilities for writing Caffe2/PyTorch performance microbenchmarks. Brief description of the file structure:
* benchmark_core.py : core utilities for running microbenchmark tests
* benchmark_caffe2.py : Caffe2-specific benchmark utilities
* benchmark_pytorch.py : PyTorch-specific benchmark utilities
* benchmark_runner.py : main entry point. Currently it can run the microbenchmark tests in a stand-alone mode. The next step is to have this integrate with AI-PEP.
The utilities are located at https://github.com/pytorch/pytorch/tree/master/test to have access to both the Caffe2 and PyTorch Python frontends.
Include two operator microbenchmarks; support both Caffe2/PyTorch:
* MatMul
* Add
Reference: PyTorch benchmarks: https://github.com/pytorch/benchmark/tree/master/timing/python. In this work, we start with two example binary operators, MatMul and Add, but eventually we should also cover unary operators like in the PyTorch benchmark repo.
Reviewed By: zheng-xq
Differential Revision: D13887111
fbshipit-source-id: b7a56b95448c9ec3e674b0de0ffb96af4439bfce
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18166
ghimport-source-id: a8e2ba2d966e49747a55701c4f6863c5e24d6f14
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18166 Bool Tensor for CUDA**
* #18165 Resolved comments from Bool Tensor for CPU PR
------
This PR enables bool tensor creation and some basic operations for the CUDA backend. This is a part of the Bool Tensor feature implementation work. The whole plan looks like this:
1. Storage Implementation [Done]
2. Tensor Creation.
a) CPU [Done]
b) CUDA [This PR]
3. Tensor Conversions.
4. Tensor Indexing.
5. Tensor Operations.
6. Back compatibility related changes.
Change:
Enable bool tensor in CUDA with the following operations:
torch.zeros
torch.tensor
torch.ones
torch.rand/rand_like/randint/randint_like
torch.full
torch.full_like
torch.empty
torch.empty_like
Tested via unit tests and local scripts.
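A minimal sketch of the enabled creation ops (the CPU path from the earlier PR is shown unconditionally; the CUDA path this PR enables is guarded on availability):

```python
import torch

# Bool tensor creation via the ops listed above.
t = torch.zeros(2, 3, dtype=torch.bool)
u = torch.ones(2, 3, dtype=torch.bool)
v = torch.tensor([True, False, True])      # dtype inferred as torch.bool

# The CUDA path (this PR) only runs where a CUDA runtime is present.
if torch.cuda.is_available():
    t = torch.zeros(2, 3, dtype=torch.bool, device='cuda')
```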
Differential Revision: D14605104
fbshipit-source-id: b7d7340a7d70edd03a109222d271e68becba762c
Summary:
To debug a `one of the variables needed for gradient computation has been modified by an inplace operation` error, I wanted to know *which* variable has been modified, so I extended the error message with what information is easily available at this point.
Before:
```
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation
```
After:
```
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [80, 1]], which is output 0 of UnsqueezeBackward0, is at version 1, not expected version 0. Hint: enable anomaly detection to find the forward pass operation which modified it.
```
The hint to enable anomaly detection is only shown when it is not enabled. It's meant to save people some googling. I'd even go further and reference `torch.autograd.set_detect_anomaly(True)`, but maybe we're not running Python?
Disclaimer: I haven't looked at other parts of the code to check if using `std::stringstream` is acceptable practice, let me know if it isn't. Similarly, I haven't checked about indentation practices.
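For illustration, a short script that triggers the extended message (a sketch; the exact wording of the message may differ by version):

```python
import torch

torch.autograd.set_detect_anomaly(True)  # opt in to the hint's suggestion

a = torch.randn(3, requires_grad=True)
b = a.sigmoid()      # sigmoid's backward needs its *output*...
b.add_(1)            # ...but this in-place op bumps b's version counter
try:
    b.sum().backward()
except RuntimeError as e:
    # With this change, the message names the tensor type/shape, the producing
    # node, and the version mismatch.
    msg = str(e)
```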
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18523
Differential Revision: D14683249
Pulled By: soumith
fbshipit-source-id: f97a99d4aabea7461df766d66cd72300b48e2350
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18763
Without the `link_whole` flag, in opt builds some of the files are not linked into the `_C_impl` library, which causes some static initializers not to run (namely, registering a custom Python operation from python_interpreter.cpp). This diff fixes it.
Differential Revision: D14732471
fbshipit-source-id: 57cff6b4b6d479ad7ab7fd29f677746d91d6ff45
Summary:
Fix the bug introduced by #18681 where an undefined variable was being used to limit max cpu count when building for Windows without Ninja.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18748
Differential Revision: D14733209
Pulled By: soumith
fbshipit-source-id: 52fc0dd4dde99da75a6956b63f02da2e647eed4f
Summary:
The argument dim=-1 didn't work for torch.cross. The signature of torch.cross has been changed to take c10::optional<int64_t> dim instead of int64_t. So, based on the documentation ("If dim is not given, it defaults to the first dimension found with the size 3."), if dim is specified (even a negative one), the corresponding dimension is used.
Fixes #17229
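A minimal sketch of the fixed behavior (assuming a current torch build, where a negative dim resolves like any other indexing op):

```python
import torch

a = torch.randn(4, 3)
b = torch.randn(4, 3)

c_neg = torch.cross(a, b, dim=-1)  # negative dim now resolves to the last dimension
c_pos = torch.cross(a, b, dim=1)   # equivalent explicit positive dim
```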
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17582
Differential Revision: D14483063
Pulled By: ifedan
fbshipit-source-id: f9699093ec401cb185fd33ca4563c8a46cdcd746
Summary:
Multiple configurations are the default (e.g. Release;Debug) on Windows, and this check always broke that configuration because CMAKE_BUILD_TYPE was not set. The workaround was to always set CMAKE_BUILD_TYPE to Debug or Release, which was very unfortunate.
The correct method is to use generator expressions that expand depending on the current CONFIG being processed.
Side note: anywhere else CMAKE_BUILD_TYPE is checked should probably be fixed too.
Note that the CMakeLists.txt forces it into Release mode. However, I came across this error when importing the prebuilt config into another project, where CMAKE_BUILD_TYPE was not set.
> 3>CMake Error at pre_built/pytorch-1.0.1/share/cmake/Caffe2/public/cuda.cmake:380 (message):
> 3> Unknown cmake build type:
Proper support for configurations would mean we can build debug and release at the same time and as you can see, it is less CMake code.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18548
Differential Revision: D14730790
Pulled By: ezyang
fbshipit-source-id: 70ae16832870d742c577c34a50ec7564c3da0afb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18577
This is also part of the legacy API and we need to support it if we want to replace it.
Reviewed By: dzhulgakov
Differential Revision: D14671432
fbshipit-source-id: 007abf4ab816647a509fc08e35d79b6c1aa55b03
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18551
This is helpful for defining a set of operators as an interface but not adding concrete kernels just yet.
The registration logic will ensure that any other libraries that add kernels for these schemas exactly match the schema defined here.
Reviewed By: dzhulgakov
Differential Revision: D14660208
fbshipit-source-id: 7adb5a4876cff5a0ad21d92d8c450cb889f00cc3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18550
When the operator registration API is used wrongly, in most cases we should now get a nice compiler error
instead of weird template error messages.
This is done by making the enable_if conditions more broad so they also match error cases,
but then having static_asserts against these error cases inside the function.
Before that, since the function didn't match, the error message said something like "no function found to match your call";
now it will show the error message specified in the static_asserts.
Reviewed By: dzhulgakov
Differential Revision: D14659178
fbshipit-source-id: 7ca4fb72d9051eadf0a7e2717b962bf1213a52b2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18547
- Argument indices in the error messages are 1-indexed not 0-indexed.
- Add test cases that a mismatching signature actually shows the correct error messages
Reviewed By: dzhulgakov
Differential Revision: D14656695
fbshipit-source-id: 55e45634baa3117e18b8687ea6b2a2f83715bdf6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18703
`zeroPtr` is sometimes a `std::string` tensor, so `memset` to 0 is undefined behavior.
This might be accidentally safe with `std::string` implementations that use SSO (Small String Optimization), but will crash otherwise.
Reviewed By: zheng-xq
Differential Revision: D14714458
fbshipit-source-id: 012a18464e6514d38ff791509b88ddc3fc55b2b1
Summary:
Make it possible to construct a pinned-memory tensor without creating a storage first and without calling the pin_memory() function. It is also faster, as the copy operation is unnecessary.
Supported functions:
```python
torch.rand_like(t, pin_memory=True)
torch.randn_like(t, pin_memory=True)
torch.empty_like(t, pin_memory=True)
torch.full_like(t, 4, pin_memory=True)
torch.zeros_like(t, pin_memory=True)
torch.ones_like(t, pin_memory=True)
torch.tensor([10,11], pin_memory=True)
torch.randn(3, 5, pin_memory=True)
torch.rand(3, pin_memory=True)
torch.zeros(3, pin_memory=True)
torch.randperm(3, pin_memory=True)
torch.empty(6, pin_memory=True)
torch.ones(6, pin_memory=True)
torch.eye(6, pin_memory=True)
torch.arange(3, 5, pin_memory=True)
```
Part of the bigger: `Remove Storage` plan.
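A hedged sketch that uses the new keyword but falls back when no CUDA runtime is present (the helper name is illustrative):

```python
import torch

def empty_pinned(*shape):
    # pin_memory=True needs a CUDA runtime; fall back to pageable memory otherwise.
    pin = torch.cuda.is_available()
    return torch.empty(*shape, pin_memory=pin)

t = empty_pinned(3, 5)
```

Pinned (page-locked) host memory enables faster, asynchronous host-to-device copies, which is why constructing tensors pinned from the start beats allocating and then calling `.pin_memory()` (which copies).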
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18455
Reviewed By: ezyang
Differential Revision: D14672084
Pulled By: VitalyFedyunin
fbshipit-source-id: 9d0997ec00f59500ee018f8b851934d334012124
Summary:
Hi. It seems that when building C++ extensions with CUDA for Windows, the `extra_cuda_cflags` options are not properly forwarded to `nvcc`.
Use of extra CUDA options is necessary to build, for instance, InplaceABN (https://github.com/mapillary/inplace_abn), which requires the `--expt-extended-lambda` option.
This PR adds one line that correctly appends `extra_cuda_cflags`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18638
Differential Revision: D14704270
Pulled By: ezyang
fbshipit-source-id: e1e330d193d9afd5707a5437a74c0499460d2b90
Summary:
At some point, we needed these functions to deal with autograd dispatching to the sparse or TH version of a backward. But we rewrote all backwards definitions in terms of native functions, so this is no longer necessary.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18696
Differential Revision: D14710834
Pulled By: gchanan
fbshipit-source-id: b22568c58eefc79d672555bd8832398ccd965cb7
Summary:
Added stubs for:
* The `device` module
* The `cuda` module
* Parts of the `optim` module
* Began adding stubs for the `autograd` module. I'll annotate more later but `no_grad` and friends are probably the most used exports from it so it seemed like a good place to start.
This would close #16996, although comments on that issue reference other missing stubs, so maybe it's worth keeping open as an umbrella issue.
The big remaining missing package is `nn`.
Also added a `py.typed` file so mypy will pick up on the type stubs. That closes #17639.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18511
Differential Revision: D14715053
Pulled By: ezyang
fbshipit-source-id: 9e4882ac997063650e6ce47604b3eaf1232c61c9
Summary:
Peephole optimize ops that just require Dimensioned Tensor Type, which is what we specialize graphs on.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18549
Differential Revision: D14690827
Pulled By: eellison
fbshipit-source-id: 9d7439eb584f0a5b877f5aa53cf80150f00e7e5f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18542
This adds the deprecated API for defining kernels as lambdas. The new API for defining kernels as lambdas was introduced in D14653005.
Reviewed By: dzhulgakov
Differential Revision: D14653551
fbshipit-source-id: 99900f1436716c69e52c83b68333b642ec2c8558
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18444
This adds the deprecated function based API to c10::RegisterOperators().
This is the API currently exposed under jit::RegisterOperators() and we need to support it for backwards compatibility.
Reviewed By: dzhulgakov
Differential Revision: D14514218
fbshipit-source-id: c77676851cfd431d66f18fd8038cf153a3a7d7cc
Summary:
This commit adds the `c10d::Reducer` class that hooks into autograd
and performs gradient bucketing and reduction. These are the core
parts of `nn.parallel.DistributedDataParallel` that up to now were
only usable for CUDA models.
This should enable the following:
* Distributed data parallelism for models defined using the C++ frontend.
* Allow overlap of gradient computation and reduction for non-CUDA models.
* Enable distributed data parallelism for models with some unused parameters.
This does not include any logic for computing bucket assignment, which
can be done separately; either by observing autograd execution order
(this is what Apex does), or by assigning buckets based on some
maximum byte size, or both.
Also see #17757 and #13273.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18251
Reviewed By: mrshenli
Differential Revision: D14571899
Pulled By: pietern
fbshipit-source-id: 20f95eefd288dfe8cfffe0a28ca22fa7c9c3cd4c
Summary:
If none of the outputs require grad, we don't actually check gradgrad; instead we check that their numerical gradients are 0.
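For context, a minimal gradcheck invocation (double precision, as gradcheck's finite-difference estimates require):

```python
import torch

# gradcheck compares the analytical Jacobian from autograd against a
# numerical Jacobian computed via finite differences.
x = torch.randn(4, dtype=torch.double, requires_grad=True)
ok = torch.autograd.gradcheck(torch.sigmoid, (x,))
```

When an output does not require grad, there is no analytical second derivative to compare, hence the fallback described above of checking that the numerical gradients are 0.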
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18190
Differential Revision: D14563388
Pulled By: ifedan
fbshipit-source-id: a4eb94c9eb60f14dbe6986cd8cef1fe78a7bc839
Summary:
The last time I tried to land it there was a merge race with the docs coverage test lol. Re-landing with the fix.
Re-land of https://github.com/pytorch/pytorch/pull/18304
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18570
Reviewed By: driazati
Differential Revision: D14707285
Pulled By: eellison
fbshipit-source-id: 3a0265928aa8cad78961723d8bf0fbf871fdb71d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18672
On Skylake, when n < 128 or k < 128, acc16 is slower.
Reviewed By: jianyuh
Differential Revision: D14700576
fbshipit-source-id: 80ca9f1af4626637eed9c5ca49f95ae744811189
Summary:
Since we are going to add ideep to ATen, and ATen is always compiled, it makes sense to have the registration in ATen rather than C2.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18335
Reviewed By: bddppq
Differential Revision: D14578652
Pulled By: gchanan
fbshipit-source-id: 4d77fcfc21a362b21d5291a127498aa722548873
Summary:
`python setup.py develop` fails with the following messages.
~~~
...
-- Building with NumPy bindings
-- Not using cuDNN
-- Not using MIOpen
-- Not using CUDA
-- Using MKLDNN
-- Not using NCCL
-- Building without distributed package
Copying extension caffe2.python.caffe2_pybind11_state
Copying caffe2.python.caffe2_pybind11_state from torch\Lib\site-packages\caffe2\python\caffe2_pybind11_state.cp37-win_amd64.pyd to C:\data\source\pytorch\build\lib.win-amd64-3.7\caffe2\python\caffe2_pybind11_state.cp37-win_amd64.pyd
copying torch\Lib\site-packages\caffe2\python\caffe2_pybind11_state.cp37-win_amd64.pyd -> C:\data\source\pytorch\build\lib.win-amd64-3.7\caffe2\python
building 'torch._C' extension
creating build\temp.win-amd64-3.7
creating build\temp.win-amd64-3.7\Release
creating build\temp.win-amd64-3.7\Release\torch
creating build\temp.win-amd64-3.7\Release\torch\csrc
...
creating C:\data\source\pytorch\build\lib.win-amd64-3.7\torch
C:\Program Files (x86)\Microsoft Visual Studio\2017\Professional\VC\Tools\MSVC\14.16.27023\bin\HostX64\x64\link.exe /nologo /INCREMENTAL:NO /LTCG /nodefaultlib:libucrt.lib ucrt.lib /DLL /MANIFEST:EMBED,ID=2 /MANIFESTUAC:NO /LIBPATH:C:\data\source\pytorch\torch\lib /LIBPATH:C:\data\dlenv\libs /LIBPATH:C:\data\dlenv\PCbuild\amd64 "/LIBPATH:C:\Program Files (x86)\Microsoft Visual Studio\2017\Professional\VC\Tools\MSVC\14.16.27023\ATLMFC\lib\x64" "/LIBPATH:C:\Program Files (x86)\Microsoft Visual Studio\2017\Professional\VC\Tools\MSVC\14.16.27023\lib\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\NETFXSDK\4.6.1\lib\um\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.17763.0\ucrt\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.17763.0\um\x64" shm.lib torch_python.lib /EXPORT:PyInit__C build\temp.win-amd64-3.7\Release\torch/csrc/stub.obj /OUT:build\lib.win-amd64-3.7\torch\_C.cp37-win_amd64.pyd /IMPLIB:build\temp.win-amd64-3.7\Release\torch/csrc\_C.cp37-win_amd64.lib /NODEFAULTLIB:LIBCMT.LIB
Creating library build\temp.win-amd64-3.7\Release\torch/csrc\_C.cp37-win_amd64.lib and object build\temp.win-amd64-3.7\Release\torch/csrc\_C.cp37-win_amd64.exp
Generating code.
Finished generating code.
copying build\lib.win-amd64-3.7\torch\_C.cp37-win_amd64.pyd -> torch
copying build\lib.win-amd64-3.7\caffe2\python\caffe2_pybind11_state.cp37-win_amd64.pyd -> caffe2\python
copying build/temp.win-amd64-3.7/Release/torch/csrc/_C.cp37-win_amd64.lib -> build/lib.win-amd64-3.7/torch/lib/_C.lib
error: could not create 'build/lib.win-amd64-3.7/torch/lib/_C.lib': No such file or directory
~~~
When `python setup.py install` is executed, `torch/lib` has been created by a previous process (copying many files) and this copy succeeds. But in develop mode, that process is not executed and this copy fails.
This patch creates the `torch/lib` directory if it does not exist.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18666
Differential Revision: D14704269
Pulled By: ezyang
fbshipit-source-id: b2d7c698a906b945bf34bb78f17b91b4fdfd3294
Summary:
MSVC errors on these flags as they are not supported
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18686
Differential Revision: D14704254
Pulled By: ezyang
fbshipit-source-id: 936d33ed6b7474d7774a49505cdac50dbe8dd99a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18239
When min is inf or nan, we get UBSAN errors
Reviewed By: csummersea
Differential Revision: D14537668
fbshipit-source-id: e70ffb5ecd2b10793356070c69fdabf8f25b203e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18628
ghimport-source-id: d94b81a6f303883d97beaae25344fd591e13ce52
Stack from [ghstack](https://github.com/ezyang/ghstack):
* #18629 Provide flake8 install instructions.
* **#18628 Delete duplicated technical content from contribution_guide.rst**
There's useful guide in contributing_guide.rst, but the
technical bits were straight up copy-pasted from CONTRIBUTING.md,
and I don't think it makes sense to break the CONTRIBUTING.md
link. Instead, I deleted the duplicate bits and added a cross
reference to the rst document.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Differential Revision: D14701003
fbshipit-source-id: 3bbb102fae225cbda27628a59138bba769bfa288
Summary:
If a test triggers autodiff, it must have a `DifferentiableGraph` in its differentiated forward graph, and this subgraph must have either the original aten node, or the corresponding nodes used in AD formula.
Typically a forward differentiable graph looks like this:
```
graph(%i0 : Float(),
%i1 : Float()):
%3 : Float() = prim::DifferentiableGraph_0(%i0, %i1)
return (%3)
with prim::DifferentiableGraph_0 = graph(%0 : Float(),
%1 : Float()):
%2 : Float() = aten::max(%0, %1)
return (%2)
```
which tells us `aten::max(Tensor self, Tensor other) -> Tensor` is symbolically differentiable.
Update: there are a lot of cases (fusions/ConstantChunk/python implementations) that break it, so I decided to make the check optionally take node names if they differ from the function name.
~~[OLD]Theoretically I could also check if `aten::max` is in the differentiable block or not to be more precise, but there're also cases like `chunk` where in a differentiable block it's replaced with a prim node (ConstantChunk) and we will have to special case them. Any suggestions here (to be more precise or no) is very welcome!~~
We used to have a list of nn tests that should be run against AD; I moved it to a field set when constructing our tests (either torch or nn). I think it's cleaner this way, and it matches the fact that for the same op we may support one schema but not all of them; this way we can just turn on the corresponding test that triggers the supported schema.
cc: apaszke zdevito wanchaol ngimel for a review
[Done] :
- Going through a manual second pass of all tests to check if they should enable AD test or not....
- Add a readme about how to add AD for an op and how to add/enable its test in test_jit.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18509
Differential Revision: D14696811
Pulled By: ailzhang
fbshipit-source-id: c5e693277baac585cd3aed5ab2c0e7faa5e6f29f
Summary:
Problem:
```cpp
// This function expects a `Variable` as input
inline PyObject* wrap(at::Tensor tensor) {
return THPVariable_Wrap(Variable(std::move(tensor)));
}
inline PyObject* wrap(at::Scalar scalar) {
// This function calls `wrap(at::Tensor tensor)` (the function above), but since
// `scalar_to_tensor(...)` returns a `Tensor` and not a `Variable`, the call to
// `wrap(at::Tensor tensor)` will fail with "Tensor that was converted to Variable
// was not actually a Variable", which is not what we want.
return wrap(scalar_to_tensor(scalar));
}
```
The right fix is to call `make_variable(...)` with the tensor returned from `scalar_to_tensor(scalar)`.
This unblocks https://github.com/pytorch/pytorch/pull/18230 as it is the only patch that hits this code path now. All other native functions that return Scalar (such as `item()` or `_local_scalar_dense()`) either has custom-defined implementation that doesn't go through this path, or is not exposed to Python at all.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18632
Differential Revision: D14689293
Pulled By: yf225
fbshipit-source-id: be7ba5d3de83a69533a2997de97ad92989ff78ee
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18598
ghimport-source-id: c74597e5e7437e94a43c163cee0639b20d0d0c6a
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18598 Turn on F401: Unused import warning.**
This was requested by someone at Facebook; this lint is turned
on for Facebook by default. "Sure, why not."
I had to noqa a number of imports in __init__. Hypothetically
we're supposed to use __all__ in this case, but I was too lazy
to fix it. Left for future work.
Be careful! flake8-2 and flake8-3 behave differently with
respect to import resolution for # type: comments. flake8-3 will
report an import unused; flake8-2 will not. For now, I just
noqa'd all these sites.
All the changes were done by hand.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Differential Revision: D14687478
fbshipit-source-id: 30d532381e914091aadfa0d2a5a89404819663e3
Summary:
This is meant to resolve #18249, where I pointed out a few things that could improve the CTCLoss docs.
My main goal was to clarify:
- Target sequences are sequences of class indices, excluding the blank index
- Lengths of `target` and `input` are needed for masking unequal-length sequences, and do not necessarily equal S, which is the length of the longest sequence in the batch.
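To illustrate the two points above, a minimal CTCLoss sketch (the shapes, the choice of blank index 0, and `zero_infinity=True` are assumptions of this example):

```python
import torch

T, N, C = 50, 4, 20                      # input length, batch, classes (index 0 = blank)
log_probs = torch.randn(T, N, C).log_softmax(2)

# Targets hold class indices 1..C-1: the blank index 0 is excluded.
targets = torch.randint(1, C, (N, 30), dtype=torch.long)

# Per-sample lengths mask the padded tails; each S_n <= 30, and 30 need not
# equal the length of the longest sequence in some other batch.
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.randint(10, 31, (N,), dtype=torch.long)

loss = torch.nn.CTCLoss(zero_infinity=True)(log_probs, targets,
                                            input_lengths, target_lengths)
```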
I thought about Thomas's suggestion to link the distill.pub article, but I'm not sure about it. I think that should be up to y'all to decide.
I have no experience with .rst, so it might not render as expected :)
t-vi ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18415
Differential Revision: D14691969
Pulled By: soumith
fbshipit-source-id: 381a2d52307174661c58053ae9dfae6e40cbfd46
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18443
Allow registering a kernel without a dispatch key. In this case, the kernel becomes a fallback kernel that is called whenever no other kernel matches.
This is also useful for the legacy function based API (since that API doesn't know about dispatch keys) or any other custom ops that don't care about dispatch
and just want one kernel to be called no matter the dispatch key.
Reviewed By: dzhulgakov
Differential Revision: D14603258
fbshipit-source-id: 242dc8871dad2989ca25079854d0cc97429e7199
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18302
These might be use cases we want to support in the future, but they don't work yet.
Let's at least report an error instead of doing segfaults or worse.
Reviewed By: dzhulgakov
Differential Revision: D14572346
fbshipit-source-id: 49262ce131493bc887defe2978d8b22f202cd8cc
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18301
Move code out of headers and templates into source files and non-templates.
Reviewed By: dzhulgakov
Differential Revision: D14572347
fbshipit-source-id: 9fd5d62d54000a95e93076cd73f591ba2c5c2653
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18256
This diff infers the function schema from the kernel function/functor and checks that it matches the specified function schema.
This diff does not allow (yet) to omit specifying the function schema in the registration API. That will come in a future diff.
Reviewed By: dzhulgakov
Differential Revision: D14552738
fbshipit-source-id: 00202b489ede19f26ae686c97416b38c72c11532
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18162
- Adds the API to register a functor- and function-based kernel.
- Change the experimental c10 ops to use this new API instead of the old one
- Deletes the old APIs in KernelRegistration.h and OpSchemaRegistration.h
Reviewed By: dzhulgakov
Differential Revision: D14514239
fbshipit-source-id: 35b2f6e8f62964e54886450a6a5fac812ed20f26
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18161
This introduces version 0 for the new operator registration.
For now, it only works with kernels that are defined as stack-based functions.
This is actually not the intended public API for defining kernels, but it's the basis which is going to be used to define the public APIs (see diffs on top for them),
and it's also the API used for exposing caffe2 operators.
This diff also switches the mechanism for exposing caffe2 operators to the new mechanism.
Reviewed By: dzhulgakov
Differential Revision: D14514231
fbshipit-source-id: 454ab7b5b46a10203aa27b175400d23f818dd1df
Summary:
caffe2_py2_cuda9_0_cudnn7_ubuntu16_04_build is failing
```
...
Mar 29 04:44:46 Need to get 174 MB of archives.
Mar 29 04:44:46 After this operation, 576 MB of additional disk space will be used.
Mar 29 04:44:46 Do you want to continue? [Y/n] Abort.
Exited with code 1
...
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18609
Differential Revision: D14694990
Pulled By: bddppq
fbshipit-source-id: 260446a8650f660a2baf123a3f17efdf0a8d6c64
Summary:
* adds attributes to `ScriptModule.__getattr__` so they can be accessed in Python after re-importing
* full support for all the possible values for an `int64_t`
* this necessitated a bunch more `pushWhatever` functions, so re-introduced a templated version to cut down on duplicate code
* tests to validate references / value sharing works
* adds `torch.jit.Unpickler` which people can use to de-serialize the pickle files into Python / have a quick reference on how to do this without PyTorch
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18188
Differential Revision: D14527490
Pulled By: driazati
fbshipit-source-id: efd15579cc04aa2e28c4b2c9490d82d849dee559
Summary:
For MKL-DNN, the filter data will be reordered to the primitive format, which takes a lot of time.
So this patch provides a method to convert the filter format before training.
"OptimizeForIdeep" will also be renamed to "OptimizeForMkldnn" in this patch.
This patch depends on https://github.com/pytorch/pytorch/pull/12866
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15171
Differential Revision: D14590741
Pulled By: yinghai
fbshipit-source-id: 07971c9977edac3c8eec08ca2c39cda639683492
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16373
motivation: https://github.com/pytorch/pytorch/pull/12407
This is a manual diff.
most of the fixes should be:
```
auto* Y = Output(0);
Y->Resize(dims);
Y->raw_mutable_data(dtype);
```
-->
```
auto* Y = Output(0, dims, at::dtype(dtype));
```
But there might be other cases.
Reviewed By: dzhulgakov
Differential Revision: D13725460
fbshipit-source-id: 649a4b0e42f62cda1a60171dd9fa3e440dc9dca1
Summary:
This adds `hash()` which supports `int`, `str`, and `float`. It relies on `std::hash` which is implementation defined, so the result of `hash()` in TorchScript is not the same as in Python, but should satisfy the same properties.
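As a rough illustration (plain Python, not TorchScript), the property the change relies on is that equal values hash equal and hashing is deterministic within a run, even though the concrete hash values are implementation defined:

```python
# Equal values must produce equal hashes; the concrete values may differ
# between implementations (e.g. CPython's hash() vs. std::hash in TorchScript).
def check_hash_properties(values):
    for v in values:
        assert hash(v) == hash(v)  # deterministic within one run
    # distinct representations of the same number agree
    assert hash(1) == hash(1.0)

check_hash_properties([3, 2.5, "hello"])
print("hash properties hold")
```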
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18258
Differential Revision: D14692317
Pulled By: driazati
fbshipit-source-id: 909df5d024bb3feea157d5a203b7de53c72261c9
Summary:
Start of breaking up test_jit.py
New files will have the format test_jit_* so they are easily grepable but remain in the same directory so we don't have to go through multiple sources for imports.
I am adding a test that's expected to fail to be sure it's running.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18590
Reviewed By: wanchaol
Differential Revision: D14677094
Pulled By: eellison
fbshipit-source-id: 9782c6aa9525bb6f332fc75cfff004c83a417522
Summary:
This defines a generic counters API that users can utilize to provide monitoring functionality in e.g. a production service. We expose both counters for runtime internals as well as a TorchScript API to create user-defined counters. Synopsis of the API:
- `torch/csrc/jit/script/logging.h` specifies the externally-facing API in C++
- `torch/jit/_logging.py` specifies the Python API
We use an interface, `LoggerBase`, to define the interactions between users and a logging backend. Implementing a subclass of `LoggerBase` allows the user to handle these events in a custom way, such as logging into a DB or calling into an infra-specific counters API.
From the frontend perspective, we can create log events in two ways:
1. We provide an `add_stat_value(name, val)` function. This calls into the Logger backend with a key/value pair. For example, we might call `add_stat_value('foo', 1)` to bump an event counter.
2. We provide a `time_point()` function to record a timestamp in nanoseconds. This can be used in conjunction with `add_stat_value` to record runtime wall clock durations.
Examples of frontend usage can be found in `test_jit.py TestLogging`.
We provide a trivial `LockingLogger` implementation as an example and for testing purposes. It is likely not ready for production usage. It demonstrates that a backend implementing the API can do things like specify aggregation types and report these aggregate stats via the `get_counters()` API.
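A rough pure-Python sketch of the counter-aggregation idea behind `LockingLogger` (the names and aggregation kinds here are illustrative, not the actual C++ API):

```python
import threading
from collections import defaultdict

class LockingLogger:
    """Toy logger: aggregates key/value stats under a lock (SUM or AVG)."""
    def __init__(self):
        self._lock = threading.Lock()
        self._sums = defaultdict(int)
        self._counts = defaultdict(int)
        self._agg = {}  # key -> "sum" (default) or "avg"

    def set_aggregation(self, key, kind):
        self._agg[key] = kind

    def add_stat_value(self, key, value):
        # Thread-safe increment, mirroring add_stat_value(name, val).
        with self._lock:
            self._sums[key] += value
            self._counts[key] += 1

    def get_counters(self):
        # Report aggregates: integer average for "avg", raw sum otherwise.
        with self._lock:
            return {k: (s // self._counts[k] if self._agg.get(k) == "avg" else s)
                    for k, s in self._sums.items()}

logger = LockingLogger()
logger.set_aggregation("latency_ns", "avg")
for v in (10, 20, 30):
    logger.add_stat_value("latency_ns", v)
logger.add_stat_value("requests", 1)
print(logger.get_counters())  # {'latency_ns': 20, 'requests': 1}
```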
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18235
Differential Revision: D14545060
Pulled By: jamesr66a
fbshipit-source-id: 04099543a1898cfdd411511e46e03d5dce9b4881
Summary:
They are called as (outputs, inputs) but were named (inputs, outputs).
A possible follow-up fix is to make the outputs argument an lvalue to allow calling multiple post hooks without ever copying the outputs vector. It looks like the copy is currently forced because the hook takes a const reference as input and returns a value. This would change the prototype of the function, so it needs further discussion.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18140
Differential Revision: D14684498
Pulled By: pietern
fbshipit-source-id: 1bd3ddbdd1ff7fe0a18241de5a9ec745a4e7ef07
Summary:
The last time I tried to land it there was a merge race with the docs coverage test lol. Re-landing with the fix.
Re-land of https://github.com/pytorch/pytorch/pull/18304
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18570
Differential Revision: D14668859
Pulled By: eellison
fbshipit-source-id: 3825a35ddc6179a0d433d70d22b5c1a96c20b21a
Summary:
In blob feeder for ideep device, the wrong device option is given and led to a crash issue.
This patch aims to correct the device option to fix this bug.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18552
Differential Revision: D14679838
Pulled By: yinghai
fbshipit-source-id: bde11e6a6fe44822166881dcb7c9bd0b34b4ecf3
Summary:
Previously, we were not able to assign names to `nn::Sequential`'s submodules. This PR adds this feature to match the Python API. Example use:
```cpp
Sequential sequential(named_submodule({
{"linear", Linear(10, 3)},
{"conv2d", Conv2d(1, 2, 3)},
{"dropout", Dropout(0.5)},
{"batchnorm", BatchNorm(5)},
{"embedding", Embedding(4, 10)},
{"lstm", LSTM(4, 5)}
}));
```
It also enables loading parameters of Python `nn.Sequential` module with custom submodules names into C++ frontend, unblocking https://github.com/pytorch/vision/pull/728#issuecomment-466661344.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17552
Differential Revision: D14246834
Pulled By: yf225
fbshipit-source-id: 3030b5c5d68f6dd5d3e37ac4b4f98dc6d6d9ba72
Summary:
Changelog:
- Renames `btriunpack` to `lu_unpack` to remain consistent with the `lu` function interface.
- Rename all relevant tests, fix callsites
- Create a tentative alias for `lu_unpack` under the name `btriunpack` and add a deprecation warning to not promote usage.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18529
Differential Revision: D14683161
Pulled By: soumith
fbshipit-source-id: 994287eaa15c50fd74c2f1c7646edfc61e8099b1
Summary:
Kindly let me know if it's okay and whether there are any places where I need to make a fix. Closes #18534
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18604
Differential Revision: D14680712
Pulled By: soumith
fbshipit-source-id: 030e4a3d8f7839cbe2b8a3ef386323f0d39eb81a
Summary:
Changelog:
- Renames `btrifact` and `btrifact_with_info` to `lu`to remain consistent with other factorization methods (`qr` and `svd`).
- Now, we will only have one function and method named `lu`, which performs the `lu` decomposition. This function takes a `get_infos` kwarg which, when set to True, includes an infos tensor in the returned tuple.
- Rename all tests, fix callsites
- Create a tentative alias for `lu` under the name `btrifact` and `btrifact_with_info`, and add a deprecation warning to not promote usage.
- Add the single batch version for `lu` so that users don't have to unsqueeze and squeeze for a single square matrix (see changes in determinant computation in `LinearAlgebra.cpp`)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18435
Differential Revision: D14680352
Pulled By: soumith
fbshipit-source-id: af58dfc11fa53d9e8e0318c720beaf5502978cd8
Summary:
Deleting batch tensor, since we are no longer maintaining the project and keeping it functional is blocking other improvements.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18575
Differential Revision: D14671126
Pulled By: eellison
fbshipit-source-id: b42d5b699c4d12171ed95e6d3a977532167f0d2c
Summary:
This allows a pathlib.Path object to be passed to torch.load as an input argument.
Fixes #16607
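A minimal stdlib-only sketch of the idea (the `load` helper below is hypothetical, not the actual torch.serialization code): accept both open file objects and anything path-like by coercing through `os.fspath`:

```python
import os
import pickle
import pathlib
import tempfile

def load(f):
    # Accept str / pathlib.Path (anything os.PathLike) as well as file objects.
    if isinstance(f, (str, os.PathLike)):
        with open(os.fspath(f), "rb") as fh:
            return pickle.load(fh)
    return pickle.load(f)

with tempfile.TemporaryDirectory() as d:
    p = pathlib.Path(d) / "ckpt.pkl"
    with open(p, "wb") as fh:
        pickle.dump({"step": 7}, fh)
    print(load(p))       # path-like input
    print(load(str(p)))  # plain string still works
```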
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18562
Differential Revision: D14668255
Pulled By: soumith
fbshipit-source-id: 0ae4f7c210918582912f2d1ef2a98f1ab288c540
Summary:
Adding the same warning message already present in the mse_loss function to the L1 losses when the input and target sizes are different.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18565
Differential Revision: D14671415
Pulled By: soumith
fbshipit-source-id: 01f5e1fb1ea119dbb2aecf1d94d0cb462f284982
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18226
ghimport-source-id: b9ec8651212875b30971cc6859d2ddec6559ae3a
If modules become first-class IValues, then the slots will no longer be raw pointers but (IValue, index) pairs. This commit inserts the Slot abstraction so that this change can be made in later patches.
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18226 Add Slot type to abstract the raw pointers being used for slots.**
Differential Revision: D14542022
fbshipit-source-id: b81d7f4334c983d663e7551bda82df43680d7c5f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18512
Ceil and Floor have been supported since version 6 of ONNX: export them using the native onnx ops instead of an Aten op.
Similarly, support for the Where op has been added in version 9, so we don't need to wrap these op in an Aten op.
Reviewed By: houseroad
Differential Revision: D14635130
fbshipit-source-id: d54a2b6e295074a6214b5939b21051a6735c9958
Summary:
While benchmarking a kernel with broadcasted inputs, I noticed
that it was much slower than a hand-coded kernel for the same task.
The kernel in question computed a * b + c for a of shape
32 x 32 x 10240 and b and c of shape 1 x 32 x 1.
This patch accelerates said kernel from 450us to 250us on my GTX1080Ti.
I didn't change half because there doesn't seem to be __ldg for
half.
An alternative could be to sprinkle const and restrict.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18540
Differential Revision: D14657840
Pulled By: soumith
fbshipit-source-id: 408847346ec12d1d1d9b119ac50bbc70f0d9ed33
Summary:
This implements a cyclical learning rate (CLR) schedule with an optional inverse cyclical momentum. More info about CLR: https://github.com/bckenstler/CLR
This is finishing what #2016 started. Resolves #1909.
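The triangular policy at the core of CLR (following the description in the bckenstler/CLR repo; this is a sketch, not the scheduler's actual code) can be written as:

```python
import math

def triangular_clr(iteration, step_size, base_lr, max_lr):
    """Learning rate oscillates linearly between base_lr and max_lr,
    completing one full cycle every 2 * step_size iterations."""
    cycle = math.floor(1 + iteration / (2 * step_size))
    x = abs(iteration / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1 - x)

print(triangular_clr(0, 100, 0.001, 0.006))    # base_lr at the cycle start
print(triangular_clr(100, 100, 0.001, 0.006))  # max_lr at the peak (up to rounding)
print(triangular_clr(200, 100, 0.001, 0.006))  # back at base_lr
```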
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18001
Differential Revision: D14451845
Pulled By: sampepose
fbshipit-source-id: 8f682e0c3dee3a73bd2b14cc93fcf5f0e836b8c9
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18538
ghimport-source-id: 665b09f158d1c5dd94686d4212792504b55b7f73
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18538 Completely synchronize behavior of Facebook flake8 and public flake8.**
Previously, developers at Facebook had the very funny experience
wherein /usr/local/bin/flake8 behaved differently than a freshly
installed flake8 from pip. In this commit, I add enough ignores to
.flake8 and install enough plugins to make the Facebook flake8
and public flake8 line up exactly. This means you don't have
to care which flake8 you use; they will all report accurate information
on your Python files.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Differential Revision: D14652336
fbshipit-source-id: ba7776eaa139cf2e3df2e65349da6fd7c99acca4
Summary:
This allows you to embed checks in IR, making the test more readable.
E.g.
```
graph_str = '''graph(%0 : Double(5, 5)):
  # CHECK: aten::relu
  %1 : Double(5, 5) = aten::relu(%0)
  return (%1)'''
FileCheck().run(graph_str, parseIR(graph_str))
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18304
Differential Revision: D14652372
Pulled By: eellison
fbshipit-source-id: 7430b9d1dc2b7584704375aac02d7392ecec76a0
Summary:
Previously we were moving nodes with writers into differentiable subgraphs, without necessarily preserving whether or not they were written to. This can lead to bugs with CSE, which needs that context.
I'm not completely sure if there's anything else we can do to be more aggressive here - inline these nodes and not run CSE and just run constant pooling, or possibly something else - but I think we should land this correctness condition first and then think further.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18491
Differential Revision: D14648562
Pulled By: eellison
fbshipit-source-id: bc1e444774ccdb708e22f0e06a477a221a231f9e
Summary:
`isTensor` has been brought up as misleading a couple of times; rename it to `isCompleteTensor` for clarity.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18437
Differential Revision: D14605223
Pulled By: eellison
fbshipit-source-id: 189f67f12cbecd76516a04e67d8145c260c79036
Summary:
Enable unit tests working with ROCm 2.3. In particular, these are unit tests that we previously skipped for double data types, plus some tests for multi-GPU setups.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18537
Differential Revision: D14651822
Pulled By: ezyang
fbshipit-source-id: 7dd575504ebe235a91489866c91000e9754b1235
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18494
Today, some C2 end-to-end test runs require reading model data from external filesystems (for example, Gluster and AWS). This can be a source of flaky tests when the external filesystems are not reachable during the tests.
In this diff, we add try/catch logic around where we download models and open model files from external systems. In case such an attempt fails, we catch the exception and let the unittest skip the current test instead of failing.
I also refactor the code a little bit by removing some duplicated logic for downloading and building the C2 model data. It had been duplicated in two classes and a few functions...
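The pattern described above can be sketched in plain Python (the downloader and exception types here are illustrative, not the actual C2 test utilities):

```python
import unittest

def _download_model(url):
    # Placeholder for the real Gluster/AWS fetch; here it always fails.
    raise OSError("external filesystem unreachable: %s" % url)

class ModelTest(unittest.TestCase):
    def _fetch_or_skip(self, url):
        try:
            return _download_model(url)
        except OSError as e:
            # Environment problem, not a product bug: skip instead of failing.
            self.skipTest("could not fetch model data: %s" % e)

    def test_end2end(self):
        model = self._fetch_or_skip("gluster://models/resnet50")
        self.assertIsNotNone(model)

suite = unittest.defaultTestLoader.loadTestsFromTestCase(ModelTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print("skipped:", len(result.skipped), "failures:", len(result.failures))
```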
Reviewed By: yinghai
Differential Revision: D14442241
fbshipit-source-id: da8bf56c8d096efa34ca2070de5cd10a18aad70c
Summary:
We are about to merge onnxifi quantization support soon. Before that, I would like to merge this diff separately to make sure it doesn't break anything.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18486
Reviewed By: bddppq, houseroad
Differential Revision: D14626419
Pulled By: yinghai
fbshipit-source-id: 504c1eae60be1e629203267b59defb8b69d82c0a
Summary:
There are a number of pages in the docs that serve insecure content. AFAICT this is the sole source of that.
I wasn't sure if docs get regenerated for old versions as part of the automation, or if those would need to be manually done.
cf. https://github.com/pytorch/pytorch.github.io/pull/177
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18508
Differential Revision: D14645665
Pulled By: zpao
fbshipit-source-id: 003563b06048485d4f539feb1675fc80bab47c1b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18507
ghimport-source-id: 1c3642befad2da78a7e5f39d6d58732b85c76267
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18507 Upgrade flake8-bugbear to master, fix the new lints.**
It turns out Facebook is internally using the unreleased master
flake8-bugbear, so upgrading it grabs a few more lints that Phabricator
was complaining about but that we didn't get in open source.
A few of the getattr sites that I fixed look very suspicious (they're
written as if Python were a lazy language), but I didn't look more
closely into the matter.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Differential Revision: D14633682
fbshipit-source-id: fc3f97c87dca40bbda943a1d1061953490dbacf8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18385
By moving the weight offload into the backend initialization function, we can instantiate the backend once by creating the OnnxifiOp once and then clean up the parameter workspace. And we need to keep hold of that instantiated net (OnnxifiOp) without cleaning it. Subsequent ctor of OnnxifiOp of the same model will hit the cached backend and they will not look into weight offloading, which is safe as the weight is already gone.
Reviewed By: ipiszy
Differential Revision: D14590379
fbshipit-source-id: f7f34016e09777ad3df0af487885cd14658e1044
Summary:
Added full instructions for how to use the `ccache` package. Thanks.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18495
Differential Revision: D14635351
Pulled By: ezyang
fbshipit-source-id: 158e1052bae580e95f73644252fdbddcc0213128
Summary:
This depends on https://github.com/pytorch/pytorch/pull/16039
This prevents people (reviewers, PR authors) from forgetting to add things to `tensors.rst`.
When something new is added to `_tensor_doc.py` or `tensor.py` but intentionally not to `tensors.rst`, people should manually whitelist it in `test_docs_coverage.py`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16057
Differential Revision: D14619550
Pulled By: ezyang
fbshipit-source-id: e1c6dd6761142e2e48ec499e118df399e3949fcc
Summary:
It is okay for the argument order to be different.
ajyu
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18466
Differential Revision: D14627258
Pulled By: bddppq
fbshipit-source-id: 430e1fb1bea2c5639a547ae7c1652368788c86b9
Summary:
Set value as tensor of 1 element instead of scalar, according to ONNX spec.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18199
Reviewed By: dzhulgakov
Differential Revision: D14542588
Pulled By: houseroad
fbshipit-source-id: 70dc978d870ebe6ef37c519ba4a20061c3f07372
Summary:
More ops for https://github.com/pytorch/pytorch/issues/394. ~~Also need to rebase after landing #16186, because we need to update the whitelist of the new unit test added in #16186.~~
cc: ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17093
Differential Revision: D14620068
Pulled By: ezyang
fbshipit-source-id: deec5ffc9bf7624e0350c85392ee59789bad4237
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18160
When exposing a c10 operator to the caffe2 frontend, don't use the operator schema but use the operator name instead.
This allows us to get rid of the existing mechanism for operator schema registration in a diff stacked on top.
Reviewed By: dzhulgakov
Differential Revision: D14513420
fbshipit-source-id: 6b08a9c6d9497eaf18b62361dd44bc07c7b4b76b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18242
ghimport-source-id: b949d312a48226a34f90304162e910acee7c95cd
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18242 Test running a CUDA build on CPU machine.**
* #18362 Add ability to query if built with CUDA and MKL-DNN.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Differential Revision: D14584429
fbshipit-source-id: b54de5b33f0c795a7d9605d30576cdf9b74050fd
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18485
I don't know (1) how we landed the wrong version of the patch or (2) how
this passed the push-blocking test
Reviewed By: pjh5
Differential Revision: D14621961
fbshipit-source-id: 0a3953d7adcdc79727a61c2acff65f436dcafe55
Summary:
This PR adds a Global Site Tag to the site.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17690
Differential Revision: D14620816
Pulled By: zou3519
fbshipit-source-id: c02407881ce08340289123f5508f92381744e8e3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18165
ghimport-source-id: 55cb3fb63a25c2faab1725b4ec14c688bf45bd38
Stack from [ghstack](https://github.com/ezyang/ghstack):
* #18166 Bool Tensor for CUDA
* **#18165 Resolved comments from Bool Tensor for CPU PR**
-------
This is a follow up PR that resolves some additional feedback on one the of previous Bool Tensor PRs.
gchanan, here is a list of almost all the comments from the original PR with respective fixes and replies:
**[utils/python_scalars.h]** why is this converting from uint8_t and not bool? (comment?)
When I was adding this, I was testing by creating a tensor and then calling its .tolist(). It worked equally well for bool and uint8_t, so I left uint8_t as I thought it made more sense since we are calling PyBool_FromLong. Changing it to bool.
**[ATen/Dispatch.h]** better name?
Fixed.
**[test/test_torch.py]** what about other factories, such as full? (and more).
There is a test that goes through the factory methods - test_tensor_factories_empty. I added some bool cases above it and added a comment that once CUDA is done, I will unite them and it will iterate not just between CUDA and CPU but over all types. Adding all bool cases now. Will unite in the CUDA PR.
**[generic/THTensorMath.h]** any changes in this file actually needed?
Bad merge. Fixed.
**[TH/THTensor.h]** this generates code for random, clampedRandom, and cappedRandom -- do we have tests for all of these with bool?
Added
**[c10/core/ScalarType.h]** I'm not very confident about the lack of Bool here -- can you look at the call sites and see what makes sense to do here?
Added bool to the macro and created a similar one without it for a single case, which fails the build with errors:
_./torch/csrc/jit/symbolic_variable.h:79:20: error: ambiguous overload for ‘operator*’ (operand types are ‘const torch::jit::SymbolicVariable’ and ‘torch::jit::Value*’)
return (*this) * insertConstant(rhs);_
Differential Revision: D14605105
fbshipit-source-id: abf82d50e8f8c50b386545ac068268651b28496d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18445
ghimport-source-id: 30d018737bf6989bc68b7e3676f44e0ca6141fde
Stack from [ghstack](https://github.com/ezyang/ghstack):
* #18242 Test running a CUDA build on CPU machine.
* **#18445 Unify cudaGetDeviceCount implementations.**
I went about doing this by searching for calls to cudaGetDeviceCount,
and then methodically replacing them with references to c10::cuda::device_count()
or at::cuda::device_count().
There is a point to doing this: the various implementations wildly differed
in their handling of what to do when cudaGetDeviceCount returns an error.
The final standardized behavior is that **all errors are swallowed** and
we return device count of zero. This indirectly fixes running CUDA builds
on CPU, which was broken in #17847.
I added 'noexcept' to the 'deviceCount' virtual method on DeviceGuardImpl.
This is a BC-breaking change for anyone inheriting from DeviceGuardImpl
but all you need to do is put 'noexcept' on your method and it is backwards
compatible with older libtorch.
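The standardized behavior - swallow any error from the driver call and report zero devices - can be sketched in Python (the raw query below is a stand-in for `cudaGetDeviceCount`, not a real binding):

```python
def make_device_count(raw_query):
    """raw_query() returns the device count or raises on driver failure."""
    def device_count():
        try:
            n = raw_query()
            return n if n >= 0 else 0
        except Exception:
            # All errors are swallowed: a CUDA build on a CPU-only
            # machine simply reports zero devices instead of crashing.
            return 0
    return device_count

def broken_driver():
    raise RuntimeError("CUDA driver version is insufficient")

count = make_device_count(broken_driver)
print(count())  # 0
count_ok = make_device_count(lambda: 2)
print(count_ok())  # 2
```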
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Differential Revision: D14612189
fbshipit-source-id: 3c8d186e3dd623c0e27625212c7ce30f75d943cb
Summary:
`SobolEngine` is a quasi-random sampler used to sample points evenly between [0,1]. Here we use direction numbers to generate these samples. The maximum supported dimension for the sampler is 1111.
Documentation has been added, and tests have been added based on Balandat's references. The implementation is an optimized / tensor-ized version of Balandat's Cython implementation as provided in #9332.
This closes #9332.
cc: soumith Balandat
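For intuition, the first dimension of a Sobol sequence coincides with the base-2 van der Corput sequence (the radical inverse); a tiny stdlib-only sketch:

```python
def van_der_corput(n):
    """Radical inverse of n in base 2 -- the first Sobol dimension."""
    q, bk = 0.0, 0.5
    while n:
        q += (n & 1) * bk  # each low-order bit contributes a halved weight
        n >>= 1
        bk /= 2
    return q

# Points fill [0, 1) evenly: 0.5, 0.25, 0.75, 0.125, 0.625, ...
print([van_der_corput(i) for i in range(1, 6)])
```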
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10505
Reviewed By: zou3519
Differential Revision: D9330179
Pulled By: ezyang
fbshipit-source-id: 01d5588e765b33b06febe99348f14d1e7fe8e55d
Summary:
Simplify or eliminate boolean and/or expressions, optimize unwrapping a value that cannot be None, and optimize using `is` with a None and a non-None value
Since peephole optimize now introduces constants, I added another constant propagation pass after running it.
Previously I had a PR that did this & optimized shape ops - I will add the shape optimizations in a separate PR.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18259
Differential Revision: D14602749
Pulled By: eellison
fbshipit-source-id: 1c3f5a67067d8dfdf55d7b78dcb616472ea8a267
Summary:
Fixes: https://github.com/pytorch/pytorch/issues/12598
This PR was originally authored by ptrblck at https://github.com/pytorch/pytorch/pull/15495, but since there was no update for months after the requested changes, I cloned that branch and resolved the code reviews here. Hope everything is good now. In particular, the implementation of count has been changed from ptrblck's original algorithm to the one ngimel suggested, i.e. using `unique_by_key` and `adjacent_difference`.
The current implementation of `_unique_dim` is VERY slow for computing the inverse index and counts, see https://github.com/pytorch/pytorch/issues/18405. I will refactor `_unique_dim` in a later PR. For this PR, please allow me to keep the implementation as is.
cc: ptrblck ezyang ngimel colesbury
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18391
Reviewed By: soumith
Differential Revision: D14605905
Pulled By: VitalyFedyunin
fbshipit-source-id: 555f5a12a8e28c38b10dfccf1b6bb16c030bfdce
Summary:
Dropout is now eligible for fusion, and generated fused kernels are just as fast as dropout in ATen. Change its lowering in symbolic script so that it can actually be fused. Still special-cased for cuda, because without fusion this lowering is less efficient than current (bernoulli_ * input). Testing is covered by the test case that ailzhang added (test_dropout_cuda).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18375
Differential Revision: D14611938
Pulled By: soumith
fbshipit-source-id: 11b18f4784e6c9265e382a8f8deca7add8df3b37
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18155
- Make a python decorator caffe2_flaky for caffe2 operator unit tests.
- The environment variable CAFFE2_RUN_FLAKY_TESTS is now used to mark flaky-test mode
During a test run:
- If flaky-test mode is on, only flaky tests are run
- If flaky-test mode is off, only non-flaky tests are run
Mark ctc_beam_search_decoder_op_test as flaky.
Reviewed By: ezyang, salexspb
Differential Revision: D14468816
fbshipit-source-id: dceb4a48daeb5437ad9cc714bef3343e9761f3a4
Summary:
This PR did two things:
1. Enable scalar->float specialization in symbolic script, so AD formulas that contain a scalar in the schema should write `float` instead.
2. add addcmul, lerp to AD and fuser.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18081
Differential Revision: D14490493
Pulled By: wanchaol
fbshipit-source-id: b3b86d960d5f051b30733bc908b19786111cdaa4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18362
ghimport-source-id: 374b7ab97e2d6a894368007133201f510539296f
Stack from [ghstack](https://github.com/ezyang/ghstack):
* #18242 Test running a CUDA build on CPU machine.
* **#18362 Add ability to query if built with CUDA and MKL-DNN.**
Fixes#18108.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Differential Revision: D14584430
fbshipit-source-id: 7605a1ac4e8f2a7c70d52e5a43ad7f03f0457473
Summary:
This is to fix#16141 and similar issues.
The idea is to track a reference to every shared CUDA Storage and deallocate memory only after a consumer process deallocates received Storage.
ezyang Done with cleanup. Same (insignificantly better) performance as in file-per-share solution, but handles millions of shared tensors easily. Note [ ] documentation in progress.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16854
Differential Revision: D13994490
Pulled By: VitalyFedyunin
fbshipit-source-id: 565148ec3ac4fafb32d37fde0486b325bed6fbd1
Summary:
Also asserts in storage_initialized that there is a storage.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18347
Differential Revision: D14582028
Pulled By: gchanan
fbshipit-source-id: df3f5d181188f39e361839169fd054539c3b2839
Summary:
There's no reason we can't check this, but I'm punting on implementing it for now. It currently segfaults, so this is an improvement.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18338
Differential Revision: D14580308
Pulled By: gchanan
fbshipit-source-id: 44d4cafeab12e1beeb3453a2d4068d221c2e9c4f
Summary:
Previously it would look for the Config even if it was not written.
Fixed #18419
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18421
Differential Revision: D14597139
Pulled By: ezyang
fbshipit-source-id: c212cbf5dc91564c12d9d07e507c8285e11c6bdf
Summary:
This reverts commit 7cc7ed1322405ba3c627b9c5661a330f92c4183d.
I think it's better to sort out the issues raised in #18407 first. I'm sorry for not stopping it earlier.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18411
Differential Revision: D14594937
Pulled By: soumith
fbshipit-source-id: 3c90b7fa7694e2f59e55607acecde4a47af801ea
Summary:
Sorry for not sending these fixes in a single PR. I found this compiler warning while I was working on something else, and I just went to GitHub and modified the file directly for convenience...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18406
Differential Revision: D14594180
Pulled By: soumith
fbshipit-source-id: 92f48513bc62fbe2c67c759d68830a973296e43b
Summary:
To address the issue of broadcasting giving the wrong result in `nn.MSELoss()` as mentioned here https://github.com/pytorch/pytorch/issues/16045 . In particular, the issue often arises when computing the loss between tensors with shapes (n, 1) and (n,)
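A stdlib-only illustration of why the (n, 1) vs (n,) case is a trap: under NumPy-style broadcasting rules the two shapes silently combine to (n, n), so the loss averages over n^2 element pairs instead of n. The helper below is hypothetical and merely mirrors the broadcasting rules PyTorch follows:

```python
def broadcast_shape(a, b):
    """NumPy/PyTorch-style broadcasting of two shapes (tuples of ints)."""
    result = []
    # Right-align the shapes, padding the shorter one with 1s on the left.
    for x, y in zip(reversed((1,) * (len(b) - len(a)) + a),
                    reversed((1,) * (len(a) - len(b)) + b)):
        if x != y and 1 not in (x, y):
            raise ValueError("incompatible shapes")
        result.append(max(x, y))
    return tuple(reversed(result))

print(broadcast_shape((5, 1), (5,)))  # (5, 5) -- not the (5,) the user expects
print(broadcast_shape((5,), (5,)))    # (5,)
```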
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18349
Differential Revision: D14594176
Pulled By: soumith
fbshipit-source-id: f23ae68a4bf42f3554ad7678a314ba2c7532a6db
Summary:
Previously, we would continue to re-run requires-grad analysis on a loop body even when the outputs and inputs disagreed. This adds a check so that we don't continue running if the results haven't changed since the last run.
Fix for https://github.com/pytorch/pytorch/issues/18320
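The fix amounts to a standard fixed-point check: stop re-running the analysis once an iteration produces no change. A schematic version (the loop body and state here are hypothetical, not the actual JIT analysis):

```python
def analyze_to_fixed_point(initial_state, step, max_iters=100):
    """Re-run `step` until its output stops changing (a fixed point)."""
    state = initial_state
    for _ in range(max_iters):
        new_state = step(state)
        if new_state == state:  # nothing changed since the last run: done
            return state
        state = new_state
    raise RuntimeError("analysis did not converge")

# Toy analysis: propagate requires_grad=True through a chain of 3 values.
def step(flags):
    # each value requires grad if it already did, or if its predecessor does
    return (flags[0],) + tuple(flags[i] or flags[i - 1]
                               for i in range(1, len(flags)))

print(analyze_to_fixed_point((True, False, False), step))  # (True, True, True)
```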
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18361
Differential Revision: D14584332
Pulled By: eellison
fbshipit-source-id: 696b225f80a2036318540946428b525985a9e735
Summary:
This specializes optional tensor inputs to either a DimensionedTensorType or, when None is passed,
UndefinedTensor (aka AutogradZeroTensorType).
This works because we already have different specs and thus separate plans for the two cases.
It enhances the shape analysis - because now unwrapped optional tensors will have DimensionedTensorType with appropriate shape and required grad etc.
Also, when combined with "if-pruning" (which I understand #18259 works towards), we actually get much nicer concrete graphs, too.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18360
Differential Revision: D14590577
Pulled By: soumith
fbshipit-source-id: cac204a506d1d38b15703cbcc67a6b75fd4979f4
Summary:
Currently, `THPVariable_Wrap(…)` and `THPVariable_NewWithVar(…)` depend on the existence of `pyobj_` in the autograd metadata of a Variable to convert the Variable to a Python tensor. However, after the Variable/Tensor merge, there will be Variables that don't contain autograd metadata, and to allow the conversion from non-autograd-meta Variable to a Python tensor we need to store the `pyobj_` outside of autograd metadata and in a place where it will always be available.
This PR makes it possible by moving `pyobj_` into TensorImpl, so that `THPVariable_Wrap(…)` and `THPVariable_NewWithVar(…)` can always access a Variable's `pyobj_` and convert the Variable to a Python tensor.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18225
Differential Revision: D14562616
Pulled By: yf225
fbshipit-source-id: 18d4aaace70eee6120abaf9276036d1f8f51b18d
Summary:
Adds a suggestion to add to __constants__ when a torch.nn.Module attr is accessed
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18164
Differential Revision: D14580060
Pulled By: eellison
fbshipit-source-id: 0c5adc21d7341a5691d4b45930947cb1ba84c8e8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18211
ghimport-source-id: 73b81e9ec631937b14db1da10991831788a6894b
Stack from [ghstack](https://github.com/ezyang/ghstack):
* #18296 [jit] Add namespacing for ScriptClasses
* #18284 [jit] make test module hook use save/load
* **#18211 [jit] Turn script_type_parser into a class**
* #18148 [jit] python interop for script classes
If we are namespacing classes, the type parser will need to carry around
some state about which namespaces to look in. This PR just wraps it in a
class in preparation.
Also, subscriptToType can no longer be static, since parseTypeFromExpr
may give different results depending on the namespaces available, so
it's been made a regular function instead of a static map lookup.
Reviewed By: eellison
Differential Revision: D14581128
fbshipit-source-id: 711315472ccde1920abf9fdb5a871ac27fb86787
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18148
ghimport-source-id: 40a9d745dc9aeba53d098743323fcbd50ca65137
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18148 py interop**
Support for converting classes across the Python–TorchScript boundary. Like other TorchScript values, ScriptClasses are native Python values when used in Python and IValues when used in TorchScript.
Notably, there is a copy across this boundary, which will be surprising to users who will expect standard Python reference semantics. I have some ideas for fixing that, but it's a more involved process.
Reviewed By: jamesr66a
Differential Revision: D14526259
fbshipit-source-id: 5916e3032488a42dc7da756c1826d7c040a21ebd
Summary:
Fix for https://github.com/pytorch/pytorch/issues/17583
There's an unrelated issue right now causing a segfault when printing tensor so that might have to fixed first for this to land
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18298
Differential Revision: D14584266
Pulled By: eellison
fbshipit-source-id: 4e7850dadc78ef1e98ad40b9d8adc0fef42acf48
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18345
ghimport-source-id: 9649d76bb194866859d62e6ba2a3a265c96ebba5
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18345 Make it possible to trigger XLA/slow tests via commit message.**
Four variants are supported: `[xla ci] [ci xla] [xla test] [test xla]`; substitute
xla with slow for slow tests.
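A hypothetical sketch of how such trigger tags could be parsed from a commit message (the function name and exact matching are illustrative, not the actual CI script):

```python
import re

def ci_triggers(commit_message):
    """Hypothetical sketch of parsing the trigger tags; the real check
    lives in the CI scripts, not in a function with this name."""
    found = set()
    for kind in ("xla", "slow"):
        # matches [xla ci], [ci xla], [xla test], [test xla] (or slow)
        pattern = r"\[(?:{0} (?:ci|test)|(?:ci|test) {0})\]".format(kind)
        if re.search(pattern, commit_message):
            found.add(kind)
    return found

assert sorted(ci_triggers("Fix bug [ci xla] and also [test slow]")) == ["slow", "xla"]
assert ci_triggers("no tags here") == set()
```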
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Differential Revision: D14584557
fbshipit-source-id: fcbfdfb28246823135bb3d3910baae073d16e81d
Summary:
so that functions like `def fn(x, p: float)` can be fused. Fixes #9940 and #11186. Fuses only float (not integer) arguments to simplify assembling arguments for fusion launch.
CPU fusion is disabled in CI and this won't be tested, but I tested it locally.
cc t-vi, apaszke
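Why restricting to floats simplifies argument assembly can be sketched with `struct` (a hypothetical illustration, not the actual fuser code):

```python
import struct

def pack_fusion_args(args):
    """Hypothetical sketch: with only float scalars allowed, every
    non-tensor argument packs with one uniform 32-bit-float format."""
    if not all(isinstance(a, float) for a in args):
        raise TypeError("only float scalar arguments are fused")
    return struct.pack("f" * len(args), *args)

assert len(pack_fusion_args([0.5, 2.0])) == 8  # two 32-bit floats
```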
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18087
Differential Revision: D14581206
Pulled By: wanchaol
fbshipit-source-id: ccb0cf79b1751706f9b2cdf1715115eae5a39fb6
Summary:
Two functions were not directed at NVRTC.
It's a bit hard to test this, as the fuser usually produces correct code - unless I try to hack on it. :)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18327
Differential Revision: D14579285
Pulled By: soumith
fbshipit-source-id: 1be7ba461cc473d514ba619507742a47d4d7c97e
Summary:
This is handy when testing various core-dump-related
things. If in the future we want to unit test our future gdb debugger
extensions, we can use this op to generate a core dump for us within a
unit test.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18207
Differential Revision: D14482186
Pulled By: salexspb
fbshipit-source-id: 39a9fffbdd4bd083597f544d1c783a82cf023a89
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18194
Add a util method to clean up external inputs and outputs from a NetDef
The following conditions will be met after the modification
- No duplicate external inputs
- No duplicate external outputs
- Going through list of ops in order, all op inputs must be outputs
from other ops, or registered as external inputs.
- All external outputs must be outputs of some operators.
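The conditions above can be sketched as a small, hypothetical Python routine, with a NetDef modeled as a list of (inputs, outputs) pairs (illustrative only; the real util operates on protobufs):

```python
def cleanup_external_io(ops, external_outputs):
    """Hypothetical sketch of the rules above, with a NetDef modeled as
    a list of (inputs, outputs) pairs rather than a protobuf."""
    produced = set()
    external_inputs, seen = [], set()
    for op_inputs, op_outputs in ops:
        for name in op_inputs:
            # op inputs not produced by an earlier op must be external
            if name not in produced and name not in seen:
                seen.add(name)
                external_inputs.append(name)
        produced.update(op_outputs)
    # deduplicate and drop external outputs that no operator produces
    external_outputs_clean = []
    for name in external_outputs:
        if name in produced and name not in external_outputs_clean:
            external_outputs_clean.append(name)
    return external_inputs, external_outputs_clean

ops = [(["x", "w"], ["h"]), (["h", "b"], ["y"])]
assert cleanup_external_io(ops, ["y", "y", "z"]) == (["x", "w", "b"], ["y"])
```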
Reviewed By: ZolotukhinM
Differential Revision: D14528589
fbshipit-source-id: c8d82fda1946aa3696abcbec869a4a8bb22f09b6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18267
Motivation: we don't actually want to use it for real under any circumstances. This is an idea to unblock our internal progress and parallelize workstreams. We can easily define schemas for all ops in question and implement forwarding to C2 ops which is NOT going to be performant. Then several things can be happening in parallel:
* move code of ops outside of C2 ops that depend on protobuf into c10
* development of optimization/fusion passes
* building python-level wrappers with clean API
* improving perf
This demonstrates Relu, quant, dequant. It seems to cover all use cases necessary (maybe except weights prepacking). Ideally I'd demonstrate Conv, but will get to it later in a separate PR (contributions welcome)
Reviewed By: ezyang
Differential Revision: D14531232
fbshipit-source-id: 4cd4a71ae0cb373c6c0e81f965c442b82a1b4069
Summary: Removing the maximum number of blocks limit from the operator and making the nesterov parameter templated to remove branching.
Reviewed By: BIT-silence
Differential Revision: D14567003
fbshipit-source-id: 394c2039ee214adc6ccd2e562e4e9563d307131f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18291
ghimport-source-id: d6e95e899bd320407967df41435801e54864ba62
Stack from [ghstack](https://github.com/ezyang/ghstack):
* #18292 Add test for #17271 (torch.exp incorrect for 2**31 size tensor)
* **#18291 Correctly call superclass setUp in TestCase subclasses.**
This makes PYTORCH_TEST_SKIP_FAST work correctly for more
tests, reducing the wasted testing effort on our slow_test job.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Differential Revision: D14567643
fbshipit-source-id: 40cf1d6556e0dd0a0550ff3d9ffed8b6000f8191
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18129
A lot of tensor inference functions assume the operator passes the schema.
So call Verify to make sure this is actually the case.
Created a diff before to add checking in Concat (https://github.com/pytorch/pytorch/pull/17110), but I encountered a lot more places where this is assumed (for example ElementwiseOpShapeInference)
Reviewed By: mdschatz
Differential Revision: D14503933
fbshipit-source-id: cf0097b8c3e4beb1cded6b61e092a6adee4b8fcb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18246
Simplifies histogram collection and quantization process.
Histogram collection before this diff was something like this
```
from caffe2.quantization.server import dnnlowp_pybind11
...
dnnlowp_pybind11.ObserveHistogramOfOutput(hist_file)
for ...:
    workspace.RunNet(predict_net)
dnnlowp_pybind11.ClearNetObservers() # This is to trigger Stop function in the observer to dump out histogram file but this can have unintended consequence of also clearing all the other useful observers we attached
```
After this diff we can
```
workspace.CreateNet(predict_net) # Note we need to create net to have a net to attach observer
histogram_observer = dnnlowp_pybind11.AddHistogramObserver(predict_net, hist_file)
for ...:
    workspace.RunNet(predict_net)
predict_net.RemoveObserver(histogram_observer)
```
Choosing quantization parameters of weights before this diff was something like this
```
dnnlowp_pybind11.ObserveHistogramOfOutput(weight_hist_file)
workspace.RunNetOnce(init_net)
dnnlowp_pybind11.ClearNetObservers() # Has same issue as the histogram collection example above
dnnlowp_pybind11.RegisterQuantizationParamsWithHistogram(
weight_hist_file, is_weight=True, qparams_output_file_name=qparams_file
)
workspace.CreateNet(init_net, overwrite=True)
dnnlowp_pybind11.ClearNetObservers()
logger.info("Loading quantization params from {}".format(qparams_file))
blobs_to_qparams = {}
with open(qparams_file) as f:
    lines = f.readlines()
for line in lines:
    op_id, op_type, output_id, tensor_name, mini, maxi, scale, zero_point, precision = (
        line.split()
    )
    op_id = int(op_id)
    output_id = int(output_id)
    op = net.Proto().op[op_id]
    if op_type != op.type or op.output[output_id] != tensor_name:
        print(
            "Corrupt qparams file {} {} {} {} {}".format(
                qparams_file, op_type, op.type, op.output[output_id], tensor_name
            )
        )
    blobs_to_qparams[tensor_name] = QuantizationParam(float(scale), int(zero_point))
```
After this diff this can be simplified to
```
blobs_to_qparams = {}
for op in init_net.Proto().op:
    for output in op.output:
        scale, zero_point = dnnlowp_pybind11.ChooseQuantizationParams(output)
        blobs_to_qparams[output] = QuantizationParam(scale, zero_point)
```
Reviewed By: dskhudia
Differential Revision: D14544694
fbshipit-source-id: 4fd06cd63256201e2e9d15c39f503138d1be53c2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18257
Support adding ops in global_init_net, because pred_init_net is per-thread and just doesn't cut it.
Reviewed By: jspark1105
Differential Revision: D14552695
fbshipit-source-id: 53dd44c84ad019019ab9f35fc04d076b7f941ddc
Summary:
* Adds more headers for easier scanning
* Adds some line breaks so things are displayed correctly
* Minor copy/spelling stuff
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18234
Reviewed By: ezyang
Differential Revision: D14567737
Pulled By: driazati
fbshipit-source-id: 046d991f7aab8e00e9887edb745968cb79a29441
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18290
Some ops, such as `Tile`, will mess up our tracking of batch size, and for now it makes sense to stop the shape inference on these ops so that we don't lower them and downstream ops without proper batch info.
Reviewed By: zrphercule
Differential Revision: D14463550
fbshipit-source-id: 2792481efa540f2a7dd310e677c213860c3053ca
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18159
In some instances, the call to forward could clash with std::forward. Fully qualify it to make sure it gets the right one
Reviewed By: ezyang
Differential Revision: D14512189
fbshipit-source-id: 6242607dbe54fcdb93229c1a4aaee8b84a88caa1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18158
They didn't work when called from other namespaces before because they didn't fully specify the c10 namespace.
Reviewed By: ezyang
Differential Revision: D14512187
fbshipit-source-id: a496b89a1bbe2b56137cfae03ab94a60f38d7068
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18090
This schema inference is needed by the c10 operator registration mechanism. Move it to c10.
It is going to be used by diffs stacked on top.
Reviewed By: ezyang
Differential Revision: D14491454
fbshipit-source-id: 0f8ddcdbd91467c8347d315dd443a1ca8b216481
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18038
Now that we have named overloads, we can allow registering the same function schema multiple times and just check it's identical.
This is going to be used in custom op registration since they register the schema every time a kernel is registered.
Reviewed By: dzhulgakov
Differential Revision: D14467494
fbshipit-source-id: 2c26cf72a64b65f120afe05e989302ec42597515
Summary:
Changelog:
- Renames `trtrs` to `triangular_solve` to remain consistent with `cholesky_solve` and `solve`.
- Rename all tests, fix callsites
- Create a tentative alias for `triangular_solve` under the name `trtrs`, and add a deprecation warning to not promote usage.
- Move `isnan` to _torch_docs.py
- Remove unnecessary imports
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18213
Differential Revision: D14566902
Pulled By: ezyang
fbshipit-source-id: 544f57c29477df391bacd5de700bed1add456d3f
Summary:
Fixes Typo and a Link in the `docs/source/community/contribution_guide.rst`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18237
Differential Revision: D14566907
Pulled By: ezyang
fbshipit-source-id: 3a75797ab6b27d28dd5566d9b189d80395024eaf
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18236
ghimport-source-id: 2bb80d017c2ea833669a2d55b340a922b2d44685
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18236 Enable running of slow tests in CI.**
* #18231 Add a decorator for marking slow tests.
These tests only run on master, as they are slow.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Differential Revision: D14563115
fbshipit-source-id: f54ddef4abedc7e872e58657fc9ac537952773d0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18278
ghimport-source-id: 3c35f6e7229c3c2b3a27d96370d7c05fad58365e
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18278 Shut up compiler about unused this_type.**
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Differential Revision: D14563050
fbshipit-source-id: 4b516f6c9ef3784d1430f793f304066c351b1a93
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18231
ghimport-source-id: 78c230f60c41877fe91b89c8c979b160f36f856b
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18231 Add a decorator for marking slow tests.**
The general strategy:
- It's a normal skip decorator, which triggers a skip if
PYTORCH_TEST_WITH_SLOW is not set.
- It also annotates the method in question that says it's
slow. We use this to implement a catch-all skipper in
setUp that skips all non-slow tests when
PYTORCH_TEST_SKIP_FAST is set.
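The strategy above can be sketched with stdlib `unittest` (a hypothetical approximation; names and details differ from the actual decorator):

```python
import os
import unittest

def slowTest(fn):
    """Hypothetical sketch: skip unless PYTORCH_TEST_WITH_SLOW is set,
    and annotate the method so a catch-all skipper can recognize it."""
    wrapped = unittest.skipUnless(
        os.environ.get("PYTORCH_TEST_WITH_SLOW") == "1", "test is slow"
    )(fn)
    wrapped.__slow_test__ = True
    return wrapped

class ExampleTest(unittest.TestCase):
    def setUp(self):
        # catch-all skipper: with PYTORCH_TEST_SKIP_FAST set, skip
        # every test that is not annotated as slow
        fn = getattr(self, self._testMethodName)
        if (os.environ.get("PYTORCH_TEST_SKIP_FAST") == "1"
                and not getattr(fn, "__slow_test__", False)):
            self.skipTest("skipping fast test")

    @slowTest
    def test_expensive(self):
        pass
```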
I added a little smoketest to test_torch and showed that I get:
```
Ran 432 tests in 0.017s
OK (skipped=431)
```
when running with PYTORCH_TEST_WITH_SLOW=1 and PYTORCH_TEST_SKIP_FAST=1
CI integration coming in later patch, as well as nontrivial uses of
this decorator.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Differential Revision: D14544441
fbshipit-source-id: 54435ce4ec827193e019887178c09ebeae3ae2c9
Summary:
This moves median to ATen.
- median with dimension reduces to kthvalue
- median without dimension (aka medianall) is implemented in parallel to kthvalue because we would not want to reshape (copying for non-contiguous) and then copy again in kthvalue. We can use the helper functions we moved from kthvalue.
- `median_cuda` was accidentally already put into ATen in #17544.
- The quickselect algorithm without indices for CPU in TH is now obsolete and removed.
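The reduction of median-with-dimension to kthvalue can be illustrated in plain Python (a sketch; for even-length inputs the median here picks the lower of the two middle elements, i.e. kthvalue with k = (n + 1) // 2):

```python
def kthvalue(values, k):
    """Plain-Python stand-in for torch.kthvalue: k-th smallest, 1-indexed."""
    return sorted(values)[k - 1]

def median(values):
    # median along a dimension reduces to kthvalue at (n + 1) // 2,
    # returning the lower of the two middle elements for even n
    n = len(values)
    return kthvalue(values, (n + 1) // 2)

assert median([5, 1, 4, 2]) == 2   # lower middle of [1, 2, 4, 5]
assert median([3, 1, 2]) == 2
```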
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17637
Differential Revision: D14346510
Pulled By: ezyang
fbshipit-source-id: c07ad144efbd6b4194179bb1c02635862521d8cb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18181
ghimport-source-id: 9c23551584a1a1b0b7ac246367f3a7ae1c50b315
Stack from [ghstack](https://github.com/ezyang/ghstack):
* #18184 Fix B903 lint: save memory for data classes with slots/namedtuple
* **#18181 Fix B902 lint error: invalid first argument.**
* #18178 Fix B006 lint errors: using mutable structure in default argument.
* #18177 Fix lstrip bug revealed by B005 lint
A variety of sins were committed:
- Some code was dead
- Some code was actually a staticmethod
- Some code just named it the wrong way
- Some code was purposely testing the omitted case
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Differential Revision: D14530876
fbshipit-source-id: 292a371d9a76ddc7bfcfd38b6f0da9165290a58e
Summary:
Two small refinements to the shape analysis:
- `detach` can set requires grad to false for dimensioned tensors (not sure if I would also need to deal with Complete?).
- add `batch_norm_stats`.
I noticed these while looking at what's going on when trying to code batch norm manually. (Hi wanchaol )
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18271
Differential Revision: D14561303
Pulled By: ezyang
fbshipit-source-id: 64a6879392e77403c44f2ed82f84b6397754d0ea
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18177
ghimport-source-id: fbbf915b66762fc88bc5b541464e71ba27500958
Stack from [ghstack](https://github.com/ezyang/ghstack):
* #18184 Fix B903 lint: save memory for data classes with slots/namedtuple
* #18181 Fix B902 lint error: invalid first argument.
* #18178 Fix B006 lint errors: using mutable structure in default argument.
* **#18177 Fix lstrip bug revealed by B005 lint**
lstrip() doesn't strip a prefix; it strips all of the characters
in the passed in string. B005 lint revealed this. Replaced with
substring operation.
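For illustration, the difference between `lstrip()` and prefix removal (the `strip_prefix` helper is a hypothetical stand-in for the substring operation):

```python
# lstrip() removes any leading characters drawn from the given set,
# not a prefix -- the bug class B005 flags:
assert "libliberty".lstrip("lib") == "erty"   # not "liberty"!

# Substring-based fix, as a sketch:
def strip_prefix(s, prefix):
    return s[len(prefix):] if s.startswith(prefix) else s

assert strip_prefix("libliberty", "lib") == "liberty"
```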
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Differential Revision: D14530873
fbshipit-source-id: 13b3438fcc3cce13b5110730dc3d0b528a52930f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18240
For rare cases when dst_bin_width == 0 we should just put all numbers into an arbitrary bin.
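A minimal sketch of the guard, assuming a simple histogram-remapping routine (names and the remapping itself are illustrative, not the actual DNNLOWP code):

```python
def remap_histogram(src_counts, src_bin_width, dst_nbins, dst_bin_width):
    """Sketch of the guard: when dst_bin_width == 0 every observed value
    is identical, so dividing by the bin width would fail; put all
    counts into one arbitrary bin instead."""
    dst = [0] * dst_nbins
    if dst_bin_width == 0:
        dst[0] = sum(src_counts)
        return dst
    for i, count in enumerate(src_counts):
        j = min(int(i * src_bin_width / dst_bin_width), dst_nbins - 1)
        dst[j] += count
    return dst

assert remap_histogram([1, 2, 3], 1.0, 4, 0.0) == [6, 0, 0, 0]
```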
Reviewed By: csummersea
Differential Revision: D14544685
fbshipit-source-id: 02d04ff8bd1555d6cf7e7eeb1196a4ab3325a9e5
Summary:
Why do we need this workaround? `PythonArgParser` handles these two cases well.
The discussion started at https://github.com/pytorch/pytorch/pull/6201#issuecomment-378724406. The conclusion at that time by goldsborough was:
> Because we wanted to allow `dim=None` in Python and route to a different function. Essentially the problem was wanting to wrap the C++ function in Python. AFAIK there is no way of translating `dim=None` behavior into C++? So Richard and I came up with this strategy
Maybe at that time `PythonArgParser` was not powerful enough to handle the routing of two functions with the same name but different C++ signatures.
Will keep an eye on the CI.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17103
Differential Revision: D14523503
Pulled By: VitalyFedyunin
fbshipit-source-id: cae3e2678062da2eccd93b51d4050578c7a9ab80
Summary:
Fix #17801 to add an exception regarding `ignore_index` in the documentation for `torch.nn.CrossEntropyLoss` and `torch.nn.NLLLoss`
If any other files/functions are hit, I'd be glad to incorporate the changes there too! 😊
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18117
Differential Revision: D14542079
Pulled By: ezyang
fbshipit-source-id: 7b918ac61f441dde7d3d6782d080c500cf2097f1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17943
Together with xw285cornell came up with a solution for static destruction
order fiasco that caused the NCCL context to be destroyed **after**
the CUDA context was already destroyed. In this commit we destroy all
cached NCCL contexts as soon as the last NCCL related Caffe2 operator
instance is destructed, thereby avoiding a dependency on static
variable destruction.
Reviewed By: xw285cornell
Differential Revision: D14429724
fbshipit-source-id: fe5ce4b02b1002af8d9f57f6fa089b7a80e316ce
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17905
Support adding ops in global_init_net, because pred_init_net is per-thread and just doesn't cut it.
Reviewed By: jspark1105
Differential Revision: D14114134
fbshipit-source-id: 112bb2ceb9d3d5e663dd430585567f4eaa2db35f
Summary:
So, we will keep the names of ONNX initializers the same as the names in PyTorch state dict.
Later, we will make this as the default behavior.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17551
Reviewed By: dzhulgakov
Differential Revision: D14491920
Pulled By: houseroad
fbshipit-source-id: f355c02e1b90d7ebbebf4be7c0fb6ae208ec795f
Summary:
- Remove single batch TH/THC implementations
- Remove `_batch_trtrs_lower` from `multivariate_normal`
- Add tests for batched behavior
- Modify trtrs_backward to accommodate for batched case
- Modify docs
In a future PR, this will be renamed to `triangular_solve`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18025
Differential Revision: D14523004
Pulled By: ifedan
fbshipit-source-id: 11c6a967d107f969b60e5a5c73ce6bb8099ebbe1
Summary:
I don't know if we actually want to expose this or not, but it's useful for debugging.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18197
Reviewed By: ezyang
Differential Revision: D14530712
Pulled By: gchanan
fbshipit-source-id: 98fdba9cf113738f0db3a198c49365de536b9919
Summary:
Loop analysis indicates that there is a runtime trip count and hence
unrolling cannot take place.
This will silence compile-time warnings we have been observing with recent ROCm releases.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18204
Differential Revision: D14539875
Pulled By: ezyang
fbshipit-source-id: a7ea7f2a95603754296b76a6b62a154f56f4ad4d
Summary:
Further breakup test_misc.h. The remaining tests don't directly map to a jit file so I left them in test_misc.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18191
Differential Revision: D14533442
Pulled By: eellison
fbshipit-source-id: 7f538ce0aea208b6b55a4716dfcf039548305041
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18193
ghimport-source-id: 540859cf0b238a9832f45b3f4c2351e3343fc1a2
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18193 Turn on Travis builds for ghstack PRs.**
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Differential Revision: D14529945
fbshipit-source-id: 4476e996e311a04f2a997ca9b7c4cf2157dd6286
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18195
ghimport-source-id: 05102cb115c6bd6d141f51905e20155bcd79a908
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18195 [build] do not throw when unicode is seen in pull request info**
Differential Revision: D14529707
fbshipit-source-id: 2f6a31b01b3a9b044fd24be466cc5325b70929ad
Summary:
The type of each `initial_ivalue` is completely known at some point but that information is discarded by the time a call to it is emitted. This PR is kind of a hack, as a better (longer) solution, the method should know about the type of each initial value.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18156
Differential Revision: D14525768
Pulled By: driazati
fbshipit-source-id: 52d53e9711a07a4551c988bd95fe997e654aa465
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18138
ghimport-source-id: be62a71ef98714e6f168a00f84120f612363528e
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18138 Enable flake8-bugbear line length checking.**
This enables flake8-bugbear's line length checker (B950), which permits violations
of up to 10% but reports the "true" limit when you go over.
I had to ignore a bunch of flake8-bugbear's other checks when I
turned this on. They're good checks though (they're turned on
in fbcode) and we should fix them eventually.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Reviewed By: salexspb
Differential Revision: D14508678
fbshipit-source-id: 2610ecc0dd43cc0788d77f4d024ebd85b26b8d41
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18146
ghimport-source-id: 4b061c27c5c44ef0d06066490ed16cab3d0c7a64
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18146 [jit] fix bug in alias analysis**
We handled hasWriters() incorrectly in the case of wildcards. There's
even a comment describing the correct behavior. Sad!
Much thanks to t-vi for tracking this down and suggesting the fix!
Differential Revision: D14524208
fbshipit-source-id: 8010b54257241bd64013a0d0a8b6e7d22d8c70af
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18123
The motivation of this fix is to resolve things like:
`for (auto i = 0; i < N; i++)` where N is bigger than int32
These instances of comparison were found by enabling -Wsign-compare
There are way too many things to fix, so issuing this as a series of fixes
The plan is to fix all these issues and then enable this flag into Caffe2 to catch future instances
Reviewed By: ZolotukhinM
Differential Revision: D14497094
fbshipit-source-id: bca3927a2188bd33a508fa503ba221c220cdaefe
Summary:
The momentum buffer is initialized to the value of
d_p, but the current code takes the long way to do this:
1. Create a buffer of zeros
2. Multiply the buffer by the momentum coefficient
3. Add d_p to the buffer
All of these can be collapsed into a single step:
1. Create a clone of d_p
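The collapse can be illustrated with plain Python lists standing in for tensors (a sketch; the real optimizer code uses tensor ops such as `d_p.clone()`):

```python
import copy

d_p = [0.5, -1.0, 2.0]                     # stand-in for a gradient tensor
momentum = 0.9

# Before: three steps
buf = [0.0] * len(d_p)                     # 1. buffer of zeros
buf = [momentum * b for b in buf]          # 2. multiply by momentum (still zeros)
buf = [b + g for b, g in zip(buf, d_p)]    # 3. add d_p

# After: a single step (torch equivalent: d_p.clone())
buf_fast = copy.deepcopy(d_p)

assert buf == buf_fast
```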
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18114
Differential Revision: D14509122
Pulled By: ezyang
fbshipit-source-id: 4a79b896201d5ff20770b7ae790c244ba744edb8
Summary:
In aten we have a _fused_dropout implementation for the CUDA case. As ngimel suggested, if we discard it in JIT AD, it hurts performance.
It doesn't seem ideal to include backend specific implementation in AD, but this is helpful to prevent performance regression atm.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17756
Differential Revision: D14368999
Pulled By: ailzhang
fbshipit-source-id: 9a371c5020f630e8f6e496849ec9772b6f196169
Summary:
Addresses #15738, using fritzo's suggestion. This adds a `torch._sample_dirichlet` method in `Distributions.cpp` and `Distributions.cu`.
- For CPU, this leads to no perf hit since all we do is to promote the `alpha` to double when getting the gamma samples (the gamma sampler anyways uses `accscalar_t`(double for CPU)) and cast it back to float32 on return.
- I have added an analogous method for CUDA as well, but the default sampler for CUDA uses scalar_t for efficiency, so I have kept it as that. With this, I do not see the bias towards 1 as reported in #15738 with `float32`, but there is a spurious mode at 0.5, as would be expected. Users would need to explicitly use `float64` for GPU to not see the spurious mode at 0.5. (EDIT: see note below, it appears that the bias issue is still there for certain builds).
Added some tests and checked that there is no perf regression. My experience with C++ is very limited, so apologies in advance if I missed something basic. cc. ailzhang, fritzo, fmassa
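The standard gamma-based Dirichlet sampler that this method implements can be sketched with the stdlib (illustrative only; the real implementation draws gamma samples in the ATen kernels, promoting `alpha` to double on CPU):

```python
import random

def sample_dirichlet(alphas):
    """Sketch of the standard gamma-based Dirichlet sampler: draw
    Gamma(alpha_i, 1) variates and normalize to get a point on the
    probability simplex."""
    gammas = [random.gammavariate(a, 1.0) for a in alphas]
    total = sum(gammas)
    return [g / total for g in gammas]

sample = sample_dirichlet([0.5, 0.5, 0.5])
assert abs(sum(sample) - 1.0) < 1e-9
assert all(0.0 <= s <= 1.0 for s in sample)
```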
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17488
Differential Revision: D14410301
Pulled By: ezyang
fbshipit-source-id: 62b2f694b4642685eab06db96d74ce28e05c3992
Summary:
It's wrong and unused. Use one of the many other constructors instead :).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18137
Differential Revision: D14508364
Pulled By: gchanan
fbshipit-source-id: 19c6ff78ad9d9221d0874425edd02b78627c4ca7
Summary:
There are multiple backends for a device type, so we just kill this function.
Also, kill an getNonVariableType instance which was also underspecified.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18135
Differential Revision: D14507474
Pulled By: gchanan
fbshipit-source-id: fc791a76d4b851b23d09a070725f3838621eb13d
Summary:
This gets rid of 'aten_sparse' which was used at one time with legacy THS code, but is now only overloaded in native_parse.py.
The way that 'aten_sparse' worked was wonky -- it extended all backends (default [CPU, CUDA]) to include sparse.
But this is totally unnecessary; we already have the backends we need to generate for from type_method_definition_dispatch.
codegen changes: fc37c8e171/diff.txt
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18144
Reviewed By: ezyang
Differential Revision: D14511324
Pulled By: gchanan
fbshipit-source-id: 8bb4ac4cf0985f8756790779a22bc229e18e8e7f
Summary:
Fix #16428 by correcting type of 'swap' from `float` to `bool`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18115
Differential Revision: D14516615
Pulled By: ezyang
fbshipit-source-id: c61a45d533f3a443edf3c31c1ef3d9742bf46d2b
Summary:
Allows serialization/loading of attributes (`IValue`s of any type).
* metadata (attribute name, type) is stored in the `model.json`
* The binary format is a subset of the `pickle` module that supports the operations necessary for `IValue`s
* Attributes are serialized in the order they are defined on a module to a list in a single `attributes` file, with submodule attributes coming first. This order directly matches the order attributes are listed in `model.json`
* This can be inspected in Python with `pickle.load()` or with `pickletools` (PyTorch need not be installed for this to work)
* A class is used to store a tensor's index into the tensor table of the model, so to unpickle the file you have to use a custom Unpickler:
```python
import pickle

class TensorID(object):
    def __setstate__(self, id):
        self.id = id

class JitUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        if module == '__main__' and name == 'TensorID':
            return TensorID
        return super(JitUnpickler, self).find_class(module, name)

JitUnpickler(open("my_model/attributes.pkl", "rb")).load()
```
* pickle format: https://svn.python.org/projects/python/trunk/Lib/pickletools.py
* It currently does not support/guarantee that anything saved out with `pickle` (i.e. if you edit `attributes` with `pickle` directly) instead of our tools will be imported correctly
Also will fix #17683 and fix #16367
Followup Work:
* document format / choice of pickle: #17951
* create an example
* list specializations
* int size specializations, large binputs
* do a first pass over attributes to output only necessary `BINPUT` ops
* attribute reassignment (e.g `self.my_attribute = new_value`)
* `tensor.save("some_checkpoint.pkl")` support with tensors embedded in Pickle file
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17423
Differential Revision: D14470965
Pulled By: driazati
fbshipit-source-id: 6a21a9939efdbe59b4bc57fd31d6d630bab5297e
Summary:
Changelog:
- Renames `gesv` to `solve` to remain consistent with `cholesky_solve`.
- Rename all tests, fix callsites
- Create a tentative alias for `solve` under the name `gesv`, and add a deprecation warning to not promote usage.
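A tentative alias of this kind can be sketched in plain Python (hypothetical stand-ins; the real alias lives in the generated torch API):

```python
import warnings

def solve(b, A):
    return "solution"          # hypothetical stand-in for the new op

def gesv(b, A):
    """Sketch of the tentative alias: forward to the new name and warn."""
    warnings.warn(
        "torch.gesv is deprecated in favour of torch.solve",
        DeprecationWarning, stacklevel=2,
    )
    return solve(b, A)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = gesv(None, None)
assert result == "solution"
assert caught[0].category is DeprecationWarning
```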
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18060
Differential Revision: D14503117
Pulled By: zou3519
fbshipit-source-id: 99c16d94e5970a19d7584b5915f051c030d49ff5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18107
Pull Request resolved: https://github.com/pytorch/translate/pull/396
also:
1. fix issues with OptionalType not having a createWithContainedType (PyTorch diff)
2. Delete tests for ONNX full beam search export (nobody is using it and it just makes things harder. Currently ONNX doesn't support `_unwrap_optional`)
Reviewed By: jmp84
Differential Revision: D14483771
fbshipit-source-id: 0e37ef1cb5a16d03a535eef808b0488b98802128
Summary:
...because gcc will have failures with very strange error messages
if you do.
This affects people with Debian/Ubuntu-provided NVCC, the PR should
not change anything for anyone else.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18127
Differential Revision: D14504386
Pulled By: soumith
fbshipit-source-id: 1aea168723cdc71cdcfffb3193ee116108ae755e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18121
ghimport-source-id: 70c273bfbcb68f7b25cf87f5614c662960864758
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18121 [jit] fix double free in test_jit**
These definitions used to be in anonymous namespace so they weren't exported from the translation unit. #18071 put those in a `test` namespace so I guess they were getting their destructors called twice on exit somehow. Making them static again fixes the problem.
Reviewed By: ezyang
Differential Revision: D14498349
fbshipit-source-id: f969781695dcbebdfcfce667fce5b986222a373e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18059
Replace resize_dim() with set_sizes_and_strides() in `THTensor_(squeeze)` in aten/src/TH/generic/THTensor.cpp and `THCTensor_(squeeze)` in aten/src/THC/generic/THCTensor.cpp
Reviewed By: ezyang
Differential Revision: D14471066
fbshipit-source-id: 1c8c412ff09246c4df6843736e3bf0279bfadea8
Summary:
Sphinx doesn't understand the hyphen; it does not merge the two halves together in HTML.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18118
Differential Revision: D14498012
Pulled By: mrshenli
fbshipit-source-id: d6f4cfddc0a8e3a8f91578da43c26ca9c6fff3ce
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18072
ghimport-source-id: 9653731602c72f299e095dd50e3afe6bcc8b01d6
Stack:
* **#18072 properly device_guard IndexTensor and BoolTensor.**
* #18073 Change one_hot from IndexTensor to Tensor.
Currently IndexTensor and BoolTensors do not have device_guards applied to them.
This is bad in the case where the only tensor(s) are IndexTensors or BoolTensors, because no device guard is present.
The only case where this currently happens is one_hot, which ends up not mattering because of the way its implementation is written. But I wanted to make sure we are covered here.
Reviewed By: ezyang
Differential Revision: D14485249
fbshipit-source-id: e57b28086fa1ad2fdd248bb1220e8a2e42da03e1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18093
ghimport-source-id: 021adc52aa7bfe5fff74531c76a8cd28cab30b2a
Stack:
* **#18093 [jit] fix corner case for optional aliasing**
Occasionally the compiler can insert constant Nones to make types line
up. In that case, don't try to make a pointer from the optional type to
None, since we know statically that None won't be mutated.
Reviewed By: shannonzhu
Differential Revision: D14493004
fbshipit-source-id: 6564065f39d99ee5af664f3a0fe235892973d9be
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18040
Add a flag that fails if a floating point exception is detected during an operator run.
Sample exception
Exception [enforce fail at operator.h:837] !std::fetestexcept(FE_DIVBYZERO). Division by zero floating point exception (FE_DIVBYZERO) reported.
Error from operator:
input: "1" input: "0" output: "out" name: "" type: "Div"
Reviewed By: jspark1105
Differential Revision: D14467731
fbshipit-source-id: fad030b1d619a5a661ff2114edb947e4562cecdd
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18084
The data_strategy parameter was not used in some of the unit tests for optimizers.
Reviewed By: hyuen
Differential Revision: D14487830
fbshipit-source-id: d757cd06aa2965f4c0570a4a18ba090b98820ef4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18037
The FunctionSchema can now store an overload name and the parser knows how to parse it. Specify like this:
my_func.overload1(arg1: Tensor) -> Tensor
my_func.overload2(arg1: Tensor, arg2: Tensor) -> Tensor
Reviewed By: zdevito
Differential Revision: D14467497
fbshipit-source-id: 8832b32f07351bb61090357b17b77a6a2fed3650
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18036
- Add macros to export c10 cuda operators to caffe2 frontend
- Instead of having a separate caffe2 registry for the c10 operator wrappers, use the existing caffe2 registries
Reviewed By: ezyang
Differential Revision: D14467495
fbshipit-source-id: 7715ed2e38d2bbe16f1446ae82c17193a3fabcb9
Summary:
Changes:
1) https://github.com/pytorch/pytorch/pull/17527 changed dispatch macros to be ScalarType based instead of at::Type based. This broke cpp extensions that relied on dispatch macros. Since IMO these should be ScalarType based (and some extensions have already updated), we allow either at::Type or at::ScalarType to be passed, but passing at::Type will result in a deprecated warning.
2) Reintroduce macros that were deleted (AT_DISPATCH_ALL_TYPES_AND_HALF, AT_DISPATCH_COMPLEX_TYPES, AT_DISPATCH_ALL_TYPES_AND_HALF_AND_COMPLEX, AT_DISPATCH_ALL_TYPES_AND_COMPLEX); the AND_HALF ones now give a deprecated warning because there are more extensible macros that were introduced in their place.
3) Makes AT_DISPATCH_ALL_TYPES_AND_COMPLEX_AND into a ScalarType based macro (and updates usages). This was the result of a logical merge conflict.
4) Adds a new macro, C10_DEPRECATED_MESSAGE for passing a deprecated message to the compiler. I didn't spend much time seeing if this can be enabled for versions before C++14.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17996
Reviewed By: ezyang
Differential Revision: D14446203
Pulled By: gchanan
fbshipit-source-id: 1da56e2e9c15aa8f913ebbf6bf1110c5b6dc375e
Summary:
Break up test_misc so that the tests for a given file live in test_filename. I think we might want to wait on moving test files into the source directory, since that would involve moving some tests over to the C10 folder, and this goes 99% of the way toward test discoverability IMO anyway.
I added a file test_utils for common functions invoked in the tests.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18071
Differential Revision: D14485787
Pulled By: eellison
fbshipit-source-id: dcb20d1978d490999d435ea20c1d0503413a5c80
Summary:
Stack:
⚫ **#17856 [jit] support serialization of classes** [💛](https://our.intern.facebook.com/intern/diff/D14402599/)
Add support for saving/loading TorchScript modules that depend on user-defined classes.
We track class dependencies the same way we track tensor constants, then write them
all out such that we can just compile them in order before compiling the module
hierarchy.
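The write-them-out-so-they-compile-in-order step can be sketched in plain Python — a toy `compile_order` helper over a hypothetical dependency map, not the actual serializer:

```python
def compile_order(deps):
    # deps: {class_name: [names it depends on]}
    # Depth-first post-order: a class is emitted only after everything
    # it depends on has been emitted, so compilation can proceed in order.
    order, seen = [], set()

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        for dep in deps.get(name, []):
            visit(dep)
        order.append(name)  # all dependencies already emitted

    for name in deps:
        visit(name)
    return order
```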
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17856
Reviewed By: shannonzhu
Differential Revision: D14461599
Pulled By: suo
fbshipit-source-id: 7115f87e069fd00dc8381d7de9997864fef7ea9f
Summary:
The C10 ops are not registered as custom ops in PyTorch, so we have to add explicit support for them here, too.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17899
Reviewed By: dzhulgakov
Differential Revision: D14436999
Pulled By: houseroad
fbshipit-source-id: a31fdf13a5c84f9b156a7288e0ffa57deb23b83f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17993
ghimport-source-id: 5427773f6306bdeddffd9a3ae032acc3f253f458
Stack:
* #17926 Implement at::has_internal_overlap helper function
* #17927 Error out on in-place (unary) ops on tensors that have internal overlap
* **#17993 [easy] Delete dead code in THTensorMoreMath.cpp**
We seem to have new implementations already for these in ATen.
Reviewed By: ezyang
Differential Revision: D14457838
fbshipit-source-id: 8481aad74b2127bd28c0f3e09740889fc0488a31
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17927
ghimport-source-id: 626d321e430b6b5c0ea3aa1eb9df8c1e2d058bf8
Stack:
* #17926 Implement at::has_internal_overlap helper function
* **#17927 Error out on in-place (unary) ops on tensors that have internal overlap**
On the way to #17935.
Works for CPU and CUDA on the following ops:
- abs_, acos_, asin_, atan_, ceil_, cos_, erf_, erfc_, exp_, expm1_
- floor_, log_, log10_, log1p_, log2_, round_, rsqrt_,
- sin_, sqrt_, tan_, tanh_, trunc_
This PR adds a check to see if the out/result tensor has internal
overlap. If it does, then we error out because the result **may** be
incorrect.
This is overly conservative; there are some cases where if the result is
the same as the input, the inplace operation is OK (such as floor_,
round_, and trunc_). However, the current code isn't organized in such a
way that this is easy to check, so enabling those will come in the future.
Reviewed By: ezyang
Differential Revision: D14438871
fbshipit-source-id: 15e12bf1fdb2ab7f74bb806e22bc74840bd6abd1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17926
ghimport-source-id: 9f7572b5d43e474492363fa17dcb86a6c27ca13c
Stack:
* **#17926 Implement at::has_internal_overlap helper function**
* #17927 Error out on in-place (unary) ops on tensors that have internal overlap
On the way to #17935.
Checks whether a tensor's sizes/strides indicate that multiple elements share
the same memory location. The general problem is hard, so
at::has_internal_overlap implements two heuristics and avoids solving
the general problem:
- if a tensor is contiguous, it cannot have internal overlap
- if a tensor has any zero strides, it does have internal overlap
- otherwise, return MemOverlap::kTooHard to indicate that there might be
overlap, but we don't know.
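The two heuristics can be sketched in plain Python over explicit size/stride tuples (a toy model, not the ATen implementation):

```python
from enum import Enum

class MemOverlap(Enum):
    NO = 0
    YES = 1
    TOO_HARD = 2

def has_internal_overlap(sizes, strides):
    # Heuristic 1: a contiguous tensor cannot overlap itself.
    expected = 1
    contiguous = True
    for size, stride in zip(reversed(sizes), reversed(strides)):
        if size != 1 and stride != expected:
            contiguous = False
        expected *= size
    if contiguous:
        return MemOverlap.NO
    # Heuristic 2: a zero stride on a dim with size > 1 means the same
    # memory location is visited repeatedly.
    if any(stride == 0 and size > 1 for size, stride in zip(sizes, strides)):
        return MemOverlap.YES
    # Anything else is the hard general problem; report "don't know".
    return MemOverlap.TOO_HARD
```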
Reviewed By: ezyang
Differential Revision: D14438858
fbshipit-source-id: 607ab31771315921ab6165b2a1f072ac3e75925a
Summary:
In Python 2, float values get truncated. We are storing default float values as floats (not 100% sure why?), which results in the defaults being truncated in the JIT and not matching the (specified) native function signatures.
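The kind of precision loss involved can be demonstrated by round-tripping a Python float (a C double) through 4-byte float storage — illustrative only; the actual bug is in how the JIT stored default values:

```python
import struct

def as_float32(x):
    # Pack as a 32-bit float and unpack back to a Python double.
    # Values not exactly representable in float32 come back changed.
    return struct.unpack('f', struct.pack('f', x))[0]
```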
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18044
Reviewed By: ezyang
Differential Revision: D14469868
Pulled By: gchanan
fbshipit-source-id: a456de599e8dab106966bcac7a6033f02ce3cdd2
Summary:
Currently, we cannot run a checkpointed function with None argument.
```python
out = torch.utils.checkpoint.checkpoint(run_fn, input_var, None)
```
```
File "/home/tunz/anaconda3/envs/torchdev/lib/python3.7/site-packages/torch/utils/checkpoint.py", line 14, in detach_variable
x = inp.detach()
AttributeError: 'NoneType' object has no attribute 'detach'
```
This PR makes the checkpoint function handle None arguments safely.
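The guard can be sketched with a toy tensor class (not the actual torch.utils.checkpoint code — `FakeTensor` and this `detach_variable` are illustrative stand-ins):

```python
class FakeTensor:
    # Minimal stand-in for a tensor with .detach()
    def __init__(self, value, requires_grad=False):
        self.value = value
        self.requires_grad = requires_grad

    def detach(self):
        return FakeTensor(self.value, requires_grad=False)

def detach_variable(inputs):
    out = []
    for inp in inputs:
        if not isinstance(inp, FakeTensor):
            # None (or any non-tensor) is forwarded as-is instead of
            # raising AttributeError on .detach()
            out.append(inp)
            continue
        out.append(inp.detach())
    return tuple(out)
```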
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17969
Differential Revision: D14475148
Pulled By: ezyang
fbshipit-source-id: 9afe9e9aac511a6df1e1620e9ac341536890d451
Summary:
According to https://docs.python.org/3/tutorial/inputoutput.html, it is good practice to use the "with" keyword when dealing with file objects; otherwise you must call f.close() to close the file and immediately free any system resources it uses. This PR therefore changes the file-opening code to use "with open() as f".
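The pattern in question, runnable as-is:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "example.txt")

# Preferred: the context manager closes the file even if the body raises.
with open(path, "w") as f:
    f.write("hello")

with open(path) as f:
    data = f.read()

assert f.closed  # released as soon as the block exits
```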
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18017
Differential Revision: D14475112
Pulled By: ezyang
fbshipit-source-id: d1c0821e39cb8a09f86d6d08b437b4a99746416c
Summary:
This is now working in rocm 2.2
cc xw285cornell
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18043
Differential Revision: D14477493
Pulled By: bddppq
fbshipit-source-id: 4d2dab1d5dbdbd4d6189162c074b19c4e9882c7d
Summary:
The output format of NonZero in ONNX (which follows NumPy, https://docs.scipy.org/doc/numpy/reference/generated/numpy.nonzero.html) differs from that in PyTorch:
In ONNX: `[rank_of_input, num_of_nonzeros]`, whereas in PyTorch: `[num_of_nonzeros, rank_of_input]`.
To resolve the difference, the exporter adds a Transpose op after the nonzero output.
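The layout difference amounts to a transpose, sketched here in plain Python with toy helpers (not the exporter code):

```python
def nonzero_torch_layout(matrix):
    # PyTorch-style result: [num_of_nonzeros, rank_of_input]
    # (one row of coordinates per nonzero element)
    return [[i, j]
            for i, row in enumerate(matrix)
            for j, v in enumerate(row) if v != 0]

def to_onnx_layout(indices):
    # ONNX-style result: [rank_of_input, num_of_nonzeros] — a transpose
    return [list(col) for col in zip(*indices)]
```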
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18047
Differential Revision: D14475081
Pulled By: ezyang
fbshipit-source-id: 7a3e4899f3419766b6145d3e9261e92859e81dc4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17958
In some places, we need 64-bit for corner cases even though it's going to be rare.
In some places, we were using 64-bit unnecessarily.
Reviewed By: hyuen
Differential Revision: D14435523
fbshipit-source-id: e01ab73029ff780133af7ff4bbbe2e17926ed5a2
Summary:
Fix a very common typo in my name.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17949
Differential Revision: D14475162
Pulled By: ezyang
fbshipit-source-id: 91c2c364c56ecbbda0bd530e806a821107881480
Summary:
ROCm 2.2 was released today, if we respin the CI docker images with the attached, PyTorch/Caffe2 will support ROCm 2.2
Changes necessary:
* for the Ubuntu target, HIP PR 934 needs to be applied to fix the forceinline definition. ROCm 2.3 will contain this.
* two unit tests proved flaky on different platforms; disable them defensively.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18007
Differential Revision: D14473903
Pulled By: bddppq
fbshipit-source-id: b1939f11d1c765a3bf71bb244b15f6ceb0e816d3
Summary:
Fixes#17558
The flattened tuple `Optional[Tuple[int, int]]` could either result in 1 (`None`) or 2 (`int` and `int`) values, so allow this case in `ArgumentSpec`
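The two accepted arities can be illustrated with a toy helper (hypothetical name; the actual ArgumentSpec logic is C++):

```python
def flatten_optional_pair(x):
    # Flattening Optional[Tuple[int, int]]:
    # None flattens to a single value, a pair flattens to two values —
    # the ArgumentSpec fix accepts either arity.
    if x is None:
        return [None]
    a, b = x
    return [a, b]
```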
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17826
Differential Revision: D14415290
Pulled By: driazati
fbshipit-source-id: 971bfa39502cfb8f08a991f16ffed6d138e48dc9
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18039
We basically flatten the whole net in order to ease the ONNXIFI transform. An alternative is to ONNXIFI the internal net of the If op, which could be done by adding interface inputs/outputs that map the internal then_net or else_net to the inputs/outputs of the If op. This is left as a TODO option.
Reviewed By: zrphercule
Differential Revision: D14452132
fbshipit-source-id: 00ad48d40da6fb8eabf9cca36701bcf61cbe4edc
Summary:
Modified the Tensor Iterator GPU reduction kernel.
Creating multiple accumulators during thread reduce removes the data dependency
between unrolled loops and exposes instruction-level parallelism, which benefits
latency-bound kernels (e.g. the Welford kernel used by `torch.std`).
This approach increases register usage, so we need to tune the unrolling
factors to prevent register spilling.
The current implementation tunes the unrolling factor down to 2 for Welford (a
register-heavy kernel), while keeping it unchanged (4) for the rest of the reduction kernels.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17667
Differential Revision: D14368325
Pulled By: umanwizard
fbshipit-source-id: 9d64c0dccabdb1b7c3922a6557224af704a1974e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17965
ghimport-source-id: 0d3d6340141d8413ce524a8d8ed0d308854ee7ef
Stack:
* (to be filled)
Also added it to the python bindings. Not for any particular reason,
just because otherwise the function gets elided (even in debug mode!)
and thus can't be called from the debugger
Reviewed By: eellison
Differential Revision: D14442654
fbshipit-source-id: 2868bb32ccb80b04f9483883faa702f63a7948bf
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17781
The wrapper for calling a c10 operator from caffe2 is now based on a runtime FunctionSchema instead of compile time information. This way, it can be created for any c10 operator schema with just one invocation to a simple macro instead of having to define arguments and more as compile time structures.
Furthermore, previously, the wrapper assumed there's an argument present for preallocated outputs, but that was only true for caffe2 operators exported to c10. So the wrapper only worked correctly for calling caffe2->c10->caffe2. Now with the new implementation, it works for any c10 operator.
Also, binary size for this should be much smaller.
Reviewed By: ezyang
Differential Revision: D14375054
fbshipit-source-id: bac7ab8e63929e6e2a148eacac41ed092009aa86
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17743
- caffe2::Operator::SetOutputTensor() can now be used in operators that are called from c10/PyTorch.
- If the operator uses SetOutputTensor() instead of XOutput(), the wrapper doesn't preallocate an empty tensor for the operator anymore. Only outputs accessed in XOutput() will get an output tensor preallocated.
- Remove the copying of the vector with output tensors into a vector with pointer to output tensors.
- Preallocated outputs are now passed in as one TensorList argument on the stack. This TensorList argument has a well-defined name so other wrappers (i.e. the wrapper calling from c2 into c10) can recognize and use it.
- Macros for exporting caffe2 operators to c10 are simplified. Instead of having `c10_op_handle_for_c2_op`, we now pass in the operator handle as a template argument.
- `SetOutputTensor` and `OutputTensorOrUndefined` now work with operators exported to c10
Reviewed By: ezyang
Differential Revision: D14362434
fbshipit-source-id: 44a5e717204f21ea8e9728437429d9b84906f9f5
Summary:
Raise an error in conv1d when:
1. the kernel size is larger than the input
2. the expected output size would be less than zero
Test case added:
- invalid_conv1d
- Relevant test cases for conv2d and conv3d exists
Fixes#17247
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17436
Reviewed By: mrshenli
Differential Revision: D14354272
Pulled By: fmassa
fbshipit-source-id: 94b98621aa03b1f60d151ef9399ed3da55d41b42
Summary: The CI run on https://github.com/pytorch/pytorch/pull/17995 has verified that this should fix the CI.
Reviewed By: bddppq
Differential Revision: D14447674
fbshipit-source-id: 50085db9ae7421b5be216ed0a2216234babfdf6c
Summary:
```
In file included from /var/lib/jenkins/workspace/aten/src/ATen/native/hip/BatchLinearAlgebra.hip:3:
In file included from /var/lib/jenkins/workspace/aten/src/ATen/hip/HIPContext.h:5:
/var/lib/jenkins/workspace/aten/src/ATen/hip/impl/HIPStreamMasqueradingAsCUDA.h:107:1: warning: control reaches end of non-void function [-Wreturn-type]
}
^
1 warning generated.
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17961
Reviewed By: houseroad
Differential Revision: D14436421
Pulled By: bddppq
fbshipit-source-id: 962665602178699d7c7b55f4ca7ff1eb72ee0349
Summary:
Our AVX2 routines use functions such as _mm256_extract_epi64
that do not exist on 32 bit systems even when they have AVX2.
This disables AVX2 when _mm256_extract_epi64 does not exist.
This fixes the "local" part of #17901 (except disabling FBGEMM),
but there also is sleef to be updated and NNPACK to be fixed,
see the bug report for further discussion.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17915
Differential Revision: D14437338
Pulled By: soumith
fbshipit-source-id: d4ef7e0801b5d1222a855a38ec207dd88b4680da
Summary:
FBGEMM doesn't work on x86 32bit and prior to this patch, it will
generate x86_64 objects in a build that is supposed to be x86 32bit.
FBGEMM actually relies on registers not available on x86_32, so
we disable it.
This takes of one element of #17901. There are more dependencies
and a separate PR (#17915) regarding AVX detection for the code in the
main repository.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17922
Differential Revision: D14437340
Pulled By: soumith
fbshipit-source-id: bd9fc98cf607d9b0bc28127fbbc8b04fa10eecbe
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17925
There's no need for OpKernel to keep the cache creator around if we initialize cache on construction.
This basically means, kernel caches are now constructed when the kernel is looked up from the dispatcher, and not delayed to the first call anymore.
This gives us the benefit of cheaper calling because now kernel calling doesn't have to check if the cache is already initialized.
Also, this improves thread-safety. Now, OpKernel is thread-safe if the kernel is thread-safe.
Reviewed By: ezyang
Differential Revision: D14424907
fbshipit-source-id: a0d09a3a560dfe78aab53d558c9ebb91b57722df
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17957
So developer knows what action should be taken when model contains nondeterministic node
Reviewed By: dzhulgakov
Differential Revision: D14435923
fbshipit-source-id: 12d930185852f78c54efc8e90c51aa7c7c7faab5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17947
Instead of having a gtest and a no-gtest file that you have to remember to register tests in, add a single registration point and use some macro magic to make it work for both gtest and non-gtest builds
Reviewed By: eellison
Differential Revision: D14431302
fbshipit-source-id: e1abac135992577a943eaa7abcc81a6ed31fa6e5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17917
D14375995 introduced instantiation of the following templates with `bool` type (more specifically `To` is `int64_t`, `From` is `bool`):
```
template <typename To, typename From>
typename std::enable_if<std::is_integral<From>::value, bool>::type overflows(
From f) {
using limit = std::numeric_limits<typename scalar_value_type<To>::type>;
if (!limit::is_signed && std::numeric_limits<From>::is_signed) {
// allow for negative numbers to wrap using two's complement arithmetic.
// For example, with uint8, this allows for `a - b` to be treated as
// `a + 255 * b`.
return f > limit::max() ||
(f < 0 && -static_cast<uint64_t>(f) > limit::max());
} else {
return f < limit::lowest() || f > limit::max();
}
}
template <typename To, typename From>
typename std::enable_if<std::is_floating_point<From>::value, bool>::type
overflows(From f) {
using limit = std::numeric_limits<typename scalar_value_type<To>::type>;
if (limit::has_infinity && std::isinf(static_cast<double>(f))) {
return false;
}
if (!limit::has_quiet_NaN && (f != f)) {
return true;
}
return f < limit::lowest() || f > limit::max();
}
```
MSVC gives a C4804 warning, and because "treat warnings as errors" is on, the build fails on Windows. This disables that warning for those 2 templates.
Reviewed By: mingzhe09088
Differential Revision: D14421157
fbshipit-source-id: e72ba34406628c84da48518b32a46f851819bad1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17939
Instead of just asserting min <= 0 and max >= 0, we adjust the histogram to include 0 in the range.
We need to include 0 in the range during norm error minimization to correctly represent our quantization method that includes 0.
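The range adjustment can be sketched as follows (hypothetical helper name; the real code adjusts the histogram bins, not just the endpoints):

```python
def include_zero(lo, hi):
    # Widen the observed [lo, hi] range so that 0 is always inside it,
    # mirroring a quantization scheme that must encode 0 exactly.
    return min(lo, 0.0), max(hi, 0.0)
```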
Reviewed By: csummersea
Differential Revision: D14428732
fbshipit-source-id: 6669a9d2c7d409ec3b31aee0afe48071986b9b71
Summary:
This improves locality and affinity by keeping work on the same
threads preferentially to starting work on new ones, and reduces
contention on the threadpool lock more generally.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17808
Differential Revision: D14391282
Pulled By: resistor
fbshipit-source-id: 3aec81656a50460a725aa4187c61864295d4f46e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17931
When converting from NetDef to IR and back, the prefix string should be removed so the operator types are preserved in caffe2.
Reviewed By: ZolotukhinM
Differential Revision: D14425954
fbshipit-source-id: 2807e7337b0f804f126970768b1250a4a8c5f35c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17618
Based on the code, we only add keys to `missing_keys` and `unexpected_keys` if `$strict` is `True`; the documentation is confusing on this point.
This diff also fixes one FLAKE8 warning.
Reviewed By: ailzhang
Differential Revision: D14280593
fbshipit-source-id: d368f5596bdf74ff62ee4d28d79120f5af91e0a3
Summary:
This PR removes dead code from THTensorMath.h
I found these unused methods while working on a PR where I plan to move fill and zero methods from TH/THC to ATen.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17873
Differential Revision: D14407013
Pulled By: izdeby
fbshipit-source-id: a3551c5d91e7b380931a8b3bd4b3ae972d16911d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17807
Lint also detected a bug in test_linspace where we weren't
actually testing the CUDA case.
Differential Revision: D14388241
fbshipit-source-id: e219e46400f4952c6b384bca3baa0724ef94acde
Summary:
Stack:
⚫ **#17804 Eliminate the use of Type.** [💛](https://our.intern.facebook.com/intern/diff/D14382165/)
at::CPU produces Type object which is then casted into TensorOptions, instead directly using TensorOptions.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17804
Differential Revision: D14407851
Pulled By: ezyang
fbshipit-source-id: 6462d698305b7c24382c1bfd440d3227bd28d9e4
Summary:
IIRC we decided to remove warning in code in #11568. This got reverted accidentally in #14123.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17921
Differential Revision: D14422811
Pulled By: ailzhang
fbshipit-source-id: 7067264bd1d3e3b7861d29e18ade2969ed705ca1
Summary:
This PR allows Scalars to be castable with `int()` and `float()`, allows scalars to match with float arguments, and prints out a better error message if `x.item()` is used as an int.
Scalars are a very uncommon case, and I don't think we want to add the maintenance burden of building out op coverage for it. It's more maintainable to better handle converting it to int/float.
Fix https://github.com/pytorch/pytorch/issues/17652
Also note: https://github.com/pytorch/pytorch/issues/16849
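The castability being added can be illustrated with a plain Python class (a hypothetical stand-in, not the JIT's actual Scalar type):

```python
class Scalar:
    # Toy Scalar: rather than implementing every op on Scalar itself,
    # make it convertible so users write int(s) / float(s) explicitly.
    def __init__(self, value):
        self._value = value

    def __int__(self):
        return int(self._value)

    def __float__(self):
        return float(self._value)
```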
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17875
Differential Revision: D14411138
Pulled By: eellison
fbshipit-source-id: a4e957cefb0ffd10ddb234d92f6d1558cfce8751
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17810
Partially addresses #12728. Also, switch the element_size bindings
to use the new function, rather than the method on Type.
We don't add Python bindings yet, as they need to be special
(they will be properties.)
Differential Revision: D14388790
fbshipit-source-id: 294183d0c8a59b0c13f2bf21d6f1cd557333e83b
Summary:
These changes add the following new Python bindings:
- Values have a 'type' property now that allows getting to the 'type' object
- Blocks have now inputs and outputs as well as returnNode and paramNode properties
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17822
Differential Revision: D14410123
Pulled By: ezyang
fbshipit-source-id: 64ef79f85a7a43b83e4b127b1d39efcaa64b74dc
Summary:
This PR causes kthvalue to be consistent with sort
(i.e. treat NaN as larger than any number), so that
`a.kthvalue(n) == a.sort()[n - 1]`.
One drawback is that median with a NaN argument does not return NaN,
which is a deviation from NumPy.
Thank you, ngimel, for raising this.
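A Python sketch of the intended semantics — a toy kthvalue over a list, not the actual kernel:

```python
import math

def kthvalue(values, k):
    # Treat NaN as larger than any number, matching sort semantics,
    # so that kthvalue(a, n) == sorted(a)[n - 1].
    ordered = sorted(values, key=lambda x: (math.isnan(x), x))
    return ordered[k - 1]
```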
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17824
Differential Revision: D14410092
Pulled By: ezyang
fbshipit-source-id: bdec2d8272dc4c65bcf2f9b8995e237774c44c02
Summary:
Fixes some minor grammatical mistakes in the doc of `loss.py`.
I think in the doc:
> Note that for some losses, there multiple elements per sample.
the "are" is lost between "there" and "multiple".
This mistake takes place in all the descriptions of parameter `size_average` and there are 17 of them.
It's minor but perfects the doc I think. 😁
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17892
Differential Revision: D14418177
Pulled By: ezyang
fbshipit-source-id: 412759f2f9b215819463bf8452ab0e0513218cd6
Summary:
This PR resolves two concurrent issues discovered when running the test in windows. Details about the windows test can be found here: https://github.com/pytorch/pytorch/issues/17609
The change covers two fixes:
1. update running_preloaders_ upfront before creating worker thread to prevent underflow.
2. add a lock when updating stop_ to prevent a deadlock on the condition variable cv_write_.
The fix has been tested on both Windows and Linux. With --gtest_repeat=1000, the tests run smoothly without issues.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17778
Differential Revision: D14404910
Pulled By: soumith
fbshipit-source-id: 2fbb8007e4b0bce4613e9a9fd31b8aace1bbfa8d
Summary:
Motivation:
- Earlier, `torch.btrifact` could not handle tensors with greater than 3 dimensions. This is because of the check:
> AT_CHECK(THTensor_(nDimension)(a) == 3, "expected 3D tensor, got size: ", a->sizes());
What is in this PR?:
- Move `btrifact` to ATen
- Remove relation to TH/THC.
- Handle tensors with more than three dimensions
- Tests
- Docs modifications: added a note about the non-pivoting variant.
[blocked due to old magma-cuda binaries]
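The usual trick for supporting arbitrary leading dimensions is to collapse them into a single batch dimension; a toy shape-level sketch (hypothetical helper, not the actual ATen code):

```python
def to_batched(shape):
    # Collapse all leading dims into one batch dim:
    # (*, m, n) -> (b, m, n), where b is the product of the leading dims.
    *batch, m, n = shape
    b = 1
    for d in batch:
        b *= d
    return b, m, n
```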
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14964
Differential Revision: D14405106
Pulled By: soumith
fbshipit-source-id: f051f5d6aaa45f85836a2867176c065733563184
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17628
This is not hooked up anywhere yet, just adding support.
This shares the same restrictions as the python frontend—namely, that the only exprs allowed right now are method defs.
Reviewed By: shannonzhu
Differential Revision: D14291654
fbshipit-source-id: 7798e5ff412a52ef8803c7bae8f439e50968a73a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17624
Just to make sure this path works
Reviewed By: shannonzhu
Differential Revision: D14288056
fbshipit-source-id: b719c0e90252b6821b1f9b22d3d98982985a6cb3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17585
Create a sugared value that represents a class during initialization. This is
so that assignments to attributes correctly define attributes in __init__ but
raise an error elsewhere.
Reviewed By: shannonzhu
Differential Revision: D14263403
fbshipit-source-id: 09b2feeb272302f00a79c2a0302fbdf5483aed6a
Summary:
This is not used anywhere and wasn't cleaned up prior to 1.0.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17718
Reviewed By: janewangfb
Differential Revision: D14355154
Pulled By: pietern
fbshipit-source-id: f8ff3c8f50cd6365b369a5c5b85d72d8940df048
Summary:
Last batch of IR expect files removed. Includes some removal of expect files that are no longer used.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17886
Differential Revision: D14414435
Pulled By: eellison
fbshipit-source-id: 0bfd7ce66ac2f72a57f15f45ebd60b95e80b6c16
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17859
This has been fixed by improvements in shape analysis.
Reviewed By: driazati
Differential Revision: D14402781
fbshipit-source-id: 4ef2722ffedd9c8ac1eff55c244b421d7d3715ed
Summary:
Currently the following code gives an error on python 2 because `ret` is a structseq which is not a tuple
```python
ret = a.max(dim=0)
ret1 = torch.max(a, dim=0, out=ret)
```
This PR modifies the tuple check in the Python arg parser to allow a structseq as input to operators where a tuple is expected, which makes the above code work.
Depend on: https://github.com/pytorch/pytorch/pull/17136
Partially fixes: https://github.com/pytorch/pytorch/issues/16813
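A relaxed check can be sketched in plain Python (hypothetical helper; the actual arg parser is C++, and the point is that tuple-like results that are not tuple subclasses — as torch's structseq returns were on Python 2 — should still be accepted):

```python
from collections.abc import Sequence

def accept_tuple_like(value):
    # Accept real tuples, and any non-string sequence that behaves
    # like one, instead of a strict isinstance(value, tuple) check.
    if isinstance(value, tuple):
        return tuple(value)
    if isinstance(value, Sequence) and not isinstance(value, (str, bytes)):
        return tuple(value)
    raise TypeError("expected a tuple-like value")
```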
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17208
Differential Revision: D14280198
Pulled By: VitalyFedyunin
fbshipit-source-id: beffebfd3951c4f5c7c8fe99a5847616a89491f3
Summary: Adding new documents to the PyTorch website to describe how PyTorch is governed, how to contribute to the project, and lists persons of interest.
Reviewed By: orionr
Differential Revision: D14394573
fbshipit-source-id: ad98b807850c51de0b741e3acbbc3c699e97b27f
Summary:
This PR removes dead code from THTensorMath.h
I found these unused methods while working on a PR where i plan to move **fill** and **zero** methods from TH/THC to Aten.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17769
Differential Revision: D14372732
Pulled By: izdeby
fbshipit-source-id: 94fd3b52c691ebc89d2bdc8905452e7498038bf5
Summary:
In the loss doc descriptions, replace the deprecated 'reduce' and 'size_average' parameters with the 'reduction' parameter.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17300
Differential Revision: D14195789
Pulled By: soumith
fbshipit-source-id: 625e650ec20f13b2d22153a4a535656cf9c8f0eb
Summary:
Indices in Subset were previously stored as tensors;
they are now passed as a list in random_split to ensure integer indexing.
fixes: #17466
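The idea of the fix — keep the permutation as a plain Python list so that subset indexing yields ints, not 0-dim tensors — can be sketched with a toy helper (not the actual torch.utils.data code):

```python
import random

def random_split(seq, lengths, seed=0):
    # Permutation kept as a plain Python list, so each index is an int.
    perm = list(range(len(seq)))
    random.Random(seed).shuffle(perm)
    out, offset = [], 0
    for length in lengths:
        out.append([seq[i] for i in perm[offset:offset + length]])
        offset += length
    return out
```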
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17649
Differential Revision: D14400250
Pulled By: soumith
fbshipit-source-id: cd20a959f33773c4babf8e861ea37ec61c2713a0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17640
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17311
I've extended our model metadata framework in this diff to support
traced modules as well. Re-used a lot of components from the previous
implementation of ScriptModule metadata.
Tracing is a little different from Scripting since you can't just create a
subclass of TopLevelTraceModule (type returned by torch.jit.trace) and attach
metadata the way we did for ScriptModule. As a result, I've introduced a
separate API torch.fb.jit_trace which returns an instance of
TracedModuleWithMetadata which is a subclass of TopLevelTracedModule. As a
result, we can now attach metadata to this instance.
Reviewed By: dzhulgakov
Differential Revision: D14117966
fbshipit-source-id: 3eee5eef733cb8d6a219c02e2f41d08698eca326
Summary:
1. Move ATen threadpool & open registration mechanism to C10
2. Move the `global_work_queue` to use this open registration mechanism, to allow users to substitute in their own
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17788
Reviewed By: zdevito
Differential Revision: D14379707
Pulled By: jamesr66a
fbshipit-source-id: 949662d0024875abf09907d97db927f160c54d45
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17813
We have a lot of manually written out dict() constructors,
and (1) I don't think use of curly brace syntax is much
of an improvement and (2) it seems like a waste of time to
fix them all.
Reviewed By: eellison
Differential Revision: D14390136
fbshipit-source-id: 6199bef4dea75b6079bcb9d9e8acf20a2e1a86e1
Summary:
CreateDB actually returns nullptr when db type is unknown and throws when the file is missing
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17795
Reviewed By: ezyang
Differential Revision: D14383226
Pulled By: dzhulgakov
fbshipit-source-id: 1dcf75a6b4ba8b64a24d4e5daf02db3189d56b7b
Summary:
This causes the tracer to record the select / cast to int operation instead of just an int constant
Fixes #15319 but relies on a fix for #17583 first
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17727
Differential Revision: D14377886
Pulled By: driazati
fbshipit-source-id: 59453def54ba72756303f723993844dbeb5d2f8b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17742
This path isn't used anymore, and is incompatible with the changes stacked on top of this diff.
Removing it.
cc bwasti to check and confirm these can really be deleted
Reviewed By: ezyang
Differential Revision: D14362426
fbshipit-source-id: 32cdc19f28c2a981ae1e204901420998367ee588
Summary:
We used to have different ATen Tensor types, but we don't anymore. This was just being maintained by a codegen'ed comment.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17782
Reviewed By: ezyang
Differential Revision: D14378004
Pulled By: gchanan
fbshipit-source-id: 1bbf276393a391252d372cc385230c784bd78588
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17764
Original commit changeset: f1923fdca4a1
Reverting the int8 ops fixes the original runtime regression.
We'll ignore the memory regression since it is flaky, see D14228484
Reviewed By: dzhulgakov
Differential Revision: D13885233
fbshipit-source-id: ccbe4b94acb44b7b4cb3ae4d73e3f6091e1e1195
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17548
Expose half-float operators to OSS.
common/math/Float16.h is the original implementation;
it is substituted by caffe2/c10/util/Half.h.
From the comments, it seems that neither implementation handles denormals.
Reviewed By: jspark1105
Differential Revision: D14244200
fbshipit-source-id: f90ba28c5bf6a2b451b429cc4925b8cc376ac651
Summary:
1) The changes in the new opset won't affect the internal pipeline.
2) The CI won't be affected by the ONNX changes.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17736
Reviewed By: zrphercule
Differential Revision: D14358710
Pulled By: houseroad
fbshipit-source-id: 4ef15d2246b50f6875ee215ce37ecf92d555ca6a
Summary:
Similar to `nn.Parameter`s, this PR lets you store any `IValue` on a module as an attribute on a `ScriptModule` (only from the Python front-end currently). To mark something as an attribute, it should wrapped in `jit.Attribute(value, type)` (ex. `self.table = torch.jit.Attribute(table, Dict[str, torch.Tensor])`)
Followup Work:
* (de)serializing for use in C++
* change `self.training` to be a `bool` attribute instead of a buffer
* mutable attributes
* string frontend support
* documentation
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17309
Differential Revision: D14354316
Pulled By: driazati
fbshipit-source-id: 67e08ab5229366b67fbc837e67b58831a4fb3318
Summary:
Currently, serialization of model parameters in ONNX export depends on the order in which they are stored in a container (`list` on Python side and `std::vector` on C++ side). This has worked fine till now, but if we need to do any pass on that graph that mutates the parameter list, then strictly order-based serialization may not work.
This PR is the first in a set to bring in more passes (such as constant folding) related to ONNX export. This PR lays the groundwork by moving the serialization in ONNX export from order-based to name based approach, which is more amenable to some of the passes.
houseroad - As discussed this change uses a map for export, and removes the code from `export.cpp` that relies on the order to compute initializer names.
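The idea can be sketched in plain Python (the names and values here are illustrative, not the actual export code): with name-based lookup, a pass that reorders or mutates the parameter container no longer corrupts which initializer goes with which name.

```python
# Illustrative parameter table; in the real exporter these are
# graph initializers keyed by parameter name.
params = {"fc.weight": [1.0, 2.0], "fc.bias": [0.5]}

# Order-based serialization depends on container order, so a pass
# that reorders or removes entries silently mispairs values.
order_based = list(params.values())

# Name-based serialization survives any reordering of the container.
def initializer_for(name, table=params):
    return table[name]

assert initializer_for("fc.bias") == [0.5]
```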
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17420
Differential Revision: D14361993
Pulled By: houseroad
fbshipit-source-id: da93e945d55755c126de06641f35df87d1648cc4
Summary:
Use flake8 installed with mypy checks so that our linter matches fbcode. Mypy type errors also provide valuable signal
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17721
Differential Revision: D14357778
Pulled By: eellison
fbshipit-source-id: d8c9ea3fe3b5f550c3b70fe259e0eabf95e4c92d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17726
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17725
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17461
Implementing a standalone LSTM operator in Caffe2, adapted from the ATen implementation in diffusion/FBS/browse/master/fbcode/caffe2/aten/src/ATen/native/RNN.cpp. The trickiest part of this exercise was that caffe2::Tensor has no copy constructor, which made it necessary to implement a custom templated copy constructor for the different Tensor containers used in the code. There was also no easy way to use off-the-shelf C2 operators, so I had to copy some code that does basic matmul, cat, split, transpose and linear as utility functions.
Two things missing:
- Profiling this implementation against the current ONNXified LSTM op
- Make this operator available to use in PyTorch
Reviewed By: dzhulgakov
Differential Revision: D14351575
fbshipit-source-id: 3b99b53212cf593c7a49e45580b5a07b90809e64
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17729
When doing "import torch" in fbcode, previously the caffe2 cuda kernels weren't loaded because libcaffe2_gpu.so wasn't loaded.
Once you also did "from caffe2.python import workspace", then the cuda kernels were loaded because that triggered a runtime mechanism for loading libcaffe2_gpu.so.
We want the cuda kernels to always be available, so this diff adds a dependency from caffe2:libtorch_cuda to caffe2:caffe2_gpu.
Reviewed By: ezyang
Differential Revision: D14353498
fbshipit-source-id: 76a9fe69f231b308ab40eac393bb216c6fad3658
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17734
If an input is not BATCH, we will skip adjusting its batch size during onnxifi transformation. So when we take hints, we take it as CONSTANT, but later need to change it to BATCH if possible.
Reviewed By: jackm321
Differential Revision: D14355983
fbshipit-source-id: 63eb54a44afb1565c71486fdd73db07ca0ac4fd4
Summary:
xxtemp, colesbury, bhushan23, zou3519: convert GPU round behavior to half-to-even, consistent with the torch CPU version and numpy. Your feedback is welcome.
See #16498
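Round-half-to-even ("banker's rounding") is the behavior the GPU kernel is being aligned to; Python's built-in round() and numpy.round already follow this rule, so it can be illustrated without torch:

```python
# Ties (.5 cases) round toward the nearest even integer, avoiding the
# upward bias of always rounding .5 away from zero.
vals = [0.5, 1.5, 2.5, -0.5, -1.5]
assert [round(v) for v in vals] == [0, 2, 2, 0, -2]
```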
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17443
Differential Revision: D14261786
Pulled By: VitalyFedyunin
fbshipit-source-id: 98156436b545d72769831a89e2775d43ad913ebc
Summary:
Eventually we should remove these when we're certain that all our ops
handle memory overlaps correctly.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17576
Differential Revision: D14349990
Pulled By: zou3519
fbshipit-source-id: c3a09f6113b9b1bf93e7f13c0b426c45b2cdf21f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17545
This diff avoids renaming boundary inputs of the net during onnxifi transform.
It also removes adding mappings for the initializer during onnxifi op creation,
and thus gets rid of the mapped workspace creation during onnxifi op creation.
Reviewed By: zrphercule
Differential Revision: D14243161
fbshipit-source-id: 6eafa920c45f6a6bfacbbb443e8e84cf9778644c
Summary:
Another batch of removing expect files.
One note - I removed the Batched expect files without adding equivalent tests since they are already being tested in other ways, and we are no longer actively maintaining that project.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17581
Differential Revision: D14343578
Pulled By: eellison
fbshipit-source-id: ce0b1fd2b5b4ec80ad9003bab1b58f41645d3da6
Summary:
- Summary:
Added synchronized batch normalization, allows synchronization of stats across mini-batches between processes within a process group.
Current implementation uses a mixture of extended ATen native functions (cpp cuda extension) + torch.nn.modules (c10d python API)
- User-facing api:
1. torch.nn.utils.convert_sync_batchnorm(modules, process_group=None)
2. torch.nn.SyncBatchNorm(num_features, eps=1e-5, momentum=0.1, affine=True, track_running_stats=True, ***process_group=None***)
- supported use case:
DistributedDataParallel with ***single-gpu multi-process***
a. User creates model containing `torch.nn.SyncBatchNorm` layers through one of the ways listed below:
1. use layers directly:
torch.nn.SyncBatchNorm(...)
similar API as with torch.nn.BatchNormXd(...)
with added argument `process_group` which is used to limit the scope of
synchronization within each process group. Default value is None, which
implies synchronization across all GPUs
2. use torch.nn.utils.convert_sync_batchnorm(modules, process_group)
recursively convert all `torch.nn.BatchNormXd` into `torch.nn.SyncBatchNorm`
preserving values of parameters/buffers.
the utility function also allows user to specify process_group value to all
converted layers.
b. user wraps their model with
`torch.nn.parallel.DistributedDataParallel`; from this point, the user
should follow the general guidelines in the DDP usage guide
- Error checking
For use cases not supported, we error out:
1. Application launched without ddp:
> import torch
> sbn = torch.nn.SyncBatchNorm(10).cuda()
> inp = torch.randn(5, 10, 3, 3).cuda()
> sbn(inp) --> Error!
> AttributeError: SyncBatchNorm is only supported within torch.nn.parallel.DistributedDataParallel
2. Application launched using DDP with multi-GPU per-process:
> ddp_module = nn.parallel.DistributedDataParallel(module, device_ids=device_ids, output_device=args.local_rank)
> ValueError: SyncBatchNorm is only supported for DDP with single GPU per process
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14267
Differential Revision: D14270035
Pulled By: ezyang
fbshipit-source-id: 4956d8fa565c32e9df5408d53719ff9f945f4d6d
Summary:
teng-li is passing the baton to mrshenli. Thanks for all your work on distributed teng-li!! 🎉
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17720
Differential Revision: D14350120
Pulled By: pietern
fbshipit-source-id: edfe784520c54630203cc8fbb296455d3dbf341b
Summary:
Observed the test `TestGroupConvolution.test_group_convolution` to fail with the following error:
```
Falsifying example: test_group_convolution(self=<caffe2.python.operator_test.group_conv_test.TestGroupConvolution testMethod=test_group_convolution>, stride=3, pad=0, kernel=5, size=8, group=4, input_channels_per_group=7, output_channels_per_group=8, batch_size=2, order='NHWC', engine='', use_bias=False, gc=, dc=[, device_type: 1])
You can reproduce this example by temporarily adding reproduce_failure('3.59.1', b'AAAA') as a decorator on your test case
```
This example generated by hypothesis has `group=2, order='NHWC' and dc=[, device_type: 1])`.
I think this example should be skipped.
I have mimicked the change corresponding to [PR#13554](https://github.com/pytorch/pytorch/pull/13554) to skip this example.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17715
Differential Revision: D14346642
Pulled By: ezyang
fbshipit-source-id: b1f1fef09f625fdb43d31c7213854e61a96381ba
Summary:
Check for tuple matching in isSubvalueOf, since tuples may contain container types that need to be recursed into within isSubvalueOf
Fix for https://github.com/pytorch/pytorch/issues/17650
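A pure-Python sketch of the recursive check (not the actual C++ isSubvalueOf, just the shape of the fix): tuple types must be recursed into element-wise rather than compared shallowly.

```python
def is_subvalue_of(value, type_):
    # A tuple "type" is matched element-wise, recursing so that
    # nested container types inside the tuple are also checked.
    if isinstance(type_, tuple):
        return (isinstance(value, tuple)
                and len(value) == len(type_)
                and all(is_subvalue_of(v, t) for v, t in zip(value, type_)))
    return isinstance(value, type_)

assert is_subvalue_of((1, (2.0, "x")), (int, (float, str)))
assert not is_subvalue_of((1, 2), (int, str))
```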
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17687
Differential Revision: D14324642
Pulled By: eellison
fbshipit-source-id: 7f1e019875286b2640a3b9c003d1635dda8cf543
Summary:
In discussion with houseroad, because Upsample op is being updated in ONNX https://github.com/onnx/onnx/pull/1773 and these tests are blocking it. These tests will be updated once the ONNX PR goes in.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17696
Differential Revision: D14338845
Pulled By: houseroad
fbshipit-source-id: cfaf8cf1ab578ae69dd3bf21b1c0681b572b9b6f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17623
Despite its generic-sounding name, caffe2::DeviceGuard actually
only worked on CUDA devices. Rename it to something that more
clearly spells out its applicability.
I'm not sure if it's the right call, but in this patch I added
'using CUDAGuard = c10::cuda::CUDAGuard', as this seems to be more
in-line with how the Caffe2 codebase is currently written. More
idiomatic c10 namespace style would be to say cuda::CUDAGuard.
Willing to change this if people shout.
This is a respin of D13156470 (#14284)
Reviewed By: dzhulgakov
Differential Revision: D14285504
fbshipit-source-id: 93b8ab938b064572b3b010c307e1261fde0fff3d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17461
Implementing a standalone LSTM operator in Caffe2, adapted from the ATen implementation in diffusion/FBS/browse/master/fbcode/caffe2/aten/src/ATen/native/RNN.cpp. The trickiest part of this exercise was that caffe2::Tensor has no copy constructor, which made it necessary to implement a custom templated copy constructor for the different Tensor containers used in the code. There was also no easy way to use off-the-shelf C2 operators, so I had to copy some code that does basic matmul, cat, split, transpose and linear as utility functions.
Two things missing:
- Profiling this implementation against the current ONNXified LSTM op
- Make this operator available to use in PyTorch
Reviewed By: dzhulgakov
Differential Revision: D14160172
fbshipit-source-id: c33e3f9e8aeae578b64d97593cb031a251216029
Summary:
hip-clang uses triple chevron kernel dispatch syntax. Add an option to the hipification script to skip translating triple chevron to hipLaunchKernelGGL.
Once we switch to hip-clang, this option will be default and subsequently removed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17686
Differential Revision: D14327810
Pulled By: bddppq
fbshipit-source-id: 5e1512325077dd3ebb8fb9b5bf35fd1f8d9a4dc3
Summary:
```
NVIDIA changed the CUDA allocation behavior on Pascal GPUs. The
page size increased from 1MB to 2MB and allocations larger than 1MB
are now always page-aligned. Previously, allocations larger than 1MB
were aligned to 128KB boundaries.
This interacted poorly with the caching allocator. The remaining
memory in a page could only be filled by small cudaMalloc calls, but
the caching allocator never cudaMalloc's a chunk smaller than 1MB.
This behavior could also cause a large discrepancy between the memory
usage reported by nvidia-smi and the memory usage reported by
PyTorch, because nvidia-smi counts a partially used page as "full",
while PyTorch only counts the actual memory requested.
This PR makes a few changes to the caching allocator to better support
Pascal and Volta GPUs:
- All cudaMalloc calls are now multiples of 2MB (the page size)
- Requests between 1-10MB allocate (and split) a 20MB block to
reduce wasted space due to rounding
- Small requests are now packed into 2MB blocks (instead of 1MB)
This improves Mask R-CNN memory usage by 10-20% in internal tests on
Volta GPUs. Maxwell performance seems to be largely unchanged, but
it's possible that some use cases suffer slightly.
```
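The sizing policy described above can be sketched as follows (a simplification with assumed thresholds, not the actual caching-allocator code):

```python
MB = 1024 * 1024

def alloc_size(request):
    """Map a requested size (bytes) to the cudaMalloc size, per the
    policy sketched in the commit message above."""
    if request <= 1 * MB:
        return 2 * MB   # small requests are packed into 2 MB blocks
    if request <= 10 * MB:
        return 20 * MB  # allocate (and split) a 20 MB block
    # otherwise round up to a multiple of the 2 MB page size
    return ((request + 2 * MB - 1) // (2 * MB)) * (2 * MB)

assert alloc_size(512 * 1024) == 2 * MB   # small request
assert alloc_size(3 * MB) == 20 * MB      # mid-size request
assert alloc_size(11 * MB) == 12 * MB     # page-aligned large request
```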
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17120
Differential Revision: D14301536
Pulled By: colesbury
fbshipit-source-id: a8282315ea8f7b8ca149b5066fdeaecd0d404edf
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17579
These methods previously just returned 0 when the operator was not a legacy operator,
making it impossible to convert some operators.
Reviewed By: dzhulgakov
Differential Revision: D14253094
fbshipit-source-id: 72bfdcf6da291a4ab80d1e0ceb20984b86edc408
Summary:
Fixes #17449
Context: before #17186, we did not fuse `clamp` when the `min`/`max` inputs were missing, because they were `prim::None` nodes. After #17186, None became a `prim::Constant` node, which enables fusion for `clamp`. But codegen.cpp did not handle the case where a `prim::Constant` is not a Double/Int/Bool. This PR handles missing inputs correctly, in the following way:
1. emit nothing when you see `type? = prim::Constant()`
2. when emitting the RHS, special-case aten::clamp
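A pure-Python sketch of the intended clamp semantics (illustrative only, not the fuser codegen): a missing bound emits nothing, and only the bounds that are present are applied.

```python
def clamp(x, min=None, max=None):
    """Clamp x to [min, max]; either bound may be absent (None)."""
    if min is not None and x < min:
        x = min
    if max is not None and x > max:
        x = max
    return x

assert clamp(5, min=0, max=3) == 3
assert clamp(-2, min=0) == 0     # max missing: only lower bound applies
assert clamp(7, max=10) == 7     # min missing: only upper bound applies
```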
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17533
Differential Revision: D14238450
Pulled By: wanchaol
fbshipit-source-id: 61a272154754b13e89021bb86002927f02cde19c
Summary:
1. Enabling int32 indexing for cases where TI cannot accumulate in output due to
incompatible data types (e.g. Welford).
2. Updating Welford kernel to use int32 instead of int64 indexing on GPU.
This change improves performance for torch.var / torch.std
Implementation:
1. Allocated extra buffer to handle accumulation between sub Tensor Iterators.
2. Removed int64 indexing in gpu_reduce_kernel
3. WelfordOps now supports the index type / combination type as a template parameter.
While the GPU uses int32_t and float, the CPU implementation uses int64_t and double.
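The kernel implements Welford's online algorithm for variance; a pure-Python sketch (the GPU/CPU difference above is essentially which integer type holds the running count):

```python
def welford_var(xs):
    """Single-pass sample variance via Welford's online update."""
    count, mean, m2 = 0, 0.0, 0.0
    for x in xs:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
    return m2 / (count - 1)  # unbiased, matching torch.var's default

assert abs(welford_var([1.0, 2.0, 3.0, 4.0]) - 5.0 / 3.0) < 1e-12
```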
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17428
Differential Revision: D14264608
Pulled By: umanwizard
fbshipit-source-id: 3eb54451de925b469dbc1127e5ea7443c4431036
Summary:
TH_Index_Base is hard coded to 0 and can be removed from the code base.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17591
Differential Revision: D14269273
Pulled By: izdeby
fbshipit-source-id: d844e261f4af7297bad8a81e7d6dcf0a391b94e6
Summary:
Because of two separate python extensions with different pybind
instances I have to go through void* conversion. Since it's hidden from
user, it's fine.
New APIs added on C2 side:
- workspace.FetchTorch('blob')
- workspace.Workspace.current.blobs['blob'].to_torch()
- workspace.FeedBlob('blob', pytorch_tensor)
Works on CPU and GPU.
The only glitches are with resizing, because of the variable/tensor split,
but data sharing works properly.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17190
Reviewed By: ezyang
Differential Revision: D14163882
Pulled By: dzhulgakov
fbshipit-source-id: d18e5b8fcae026f393c842a1149e972515732de2
Summary:
Hi, there.
There is a typo in aten/src/ATen/native_parse.py, and I fix it.
`std::aray` -> `std::array`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17641
Differential Revision: D14301981
Pulled By: ezyang
fbshipit-source-id: a37859cdedcbf6c29333b954486dfa086d6c2176
Summary:
Create a `make_variable` override that moves out of a tensor instead of going through `shallow_copy_and_detach`. Call this override from factory methods like `empty` that create a brand new tensor, do nothing with it, and then copy it into a variable.
Will update this with actual numbers, but it seems to get rid of around 20-40% of the overhead of calling `torch.empty(0)`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17565
Differential Revision: D14266130
Pulled By: umanwizard
fbshipit-source-id: f57d5f2ca3f80ee8ee96d50f905e852fd10db941
Summary:
Currently, the fake tqdm implementation requires an input (whereas real tqdm does not).
This caused a problem in torchvision (https://github.com/pytorch/vision/pull/770), and seems likely to cause minor irritations elsewhere.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17636
Differential Revision: D14296530
Pulled By: ezyang
fbshipit-source-id: bc077d898773c93dab34c985a7b30525a43e558a
Summary:
Various functions aren't used by the JIT, so they're jit-compliant w.r.t. their schema by default.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17631
Differential Revision: D14295559
Pulled By: cpuhrsch
fbshipit-source-id: a2ecdcb5df47eb67c54ec642d88d42e985515142
Summary:
The CPU version is based on the TH version.
The GPU version is based on #8406 by Pararth Shah (thank you).
CPU quickselect based on that in TH's THTensorMoreMath.cpp, but with C++ (quickselectnoindex will be achieved by a different swap)
CPU kthvalue is based on the THTensor function in the same file.
The dim_apply function is a C++ replacement for TH_TENSOR_DIM_APPLYx macros.
The CUDA kernel uses functions adapted from the THCTensorSortK implementation.
In particular radixSelect is from THCTensorTopK.cuh.
The CUDA launcher code replaces a bunch of macros with C++. It will be re-used in one of the following patches.
Plan for further PRs:
- This
- Sort
- TopK + Mode + Median in any order
- Rip out THC stuff.
There may be utility functions / structs in the SortingCommon.cuh that come into
relevance only with sort.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17544
Differential Revision: D14286934
Pulled By: ezyang
fbshipit-source-id: 35dbea050b097e88777ac5fa5c0f499d5e23c738
Summary:
Fixing MSVC errors
```
D:\pytorch-scripts\caffe2_builders\v141\pytorch\aten\src\THC/THCReduce.cuh(144): error C4002: too many actual paramet
ers for macro 'C10_LAUNCH_BOUNDS_1' [D:\pytorch-scripts\caffe2_builders\v141\pytorch\build\Debug\caffe2\caffe2_gpu.vcxp
roj]
D:\pytorch-scripts\caffe2_builders\v141\pytorch\aten\src\THC/THCReduce.cuh(259): error C4002: too many actual paramet
ers for macro 'C10_LAUNCH_BOUNDS_1' [D:\pytorch-scripts\caffe2_builders\v141\pytorch\build\Debug\caffe2\caffe2_gpu.vcxp
roj]
D:/pytorch-scripts/caffe2_builders/v141/pytorch/aten/src/THCUNN/SpatialDilatedMaxPooling.cu(51): error C4002: too man
y actual parameters for macro 'C10_LAUNCH_BOUNDS_1' [D:\pytorch-scripts\caffe2_builders\v141\pytorch\build\Debug\caffe2
\caffe2_gpu.vcxproj]
```
on variadic C10_LAUNCH_BOUNDS as well as Debug linking issues with at::Half in pool_op_cudnn.cc like this one
```
pool_op_cudnn.obj : error LNK2019: unresolved external symbol "public: bool __cdecl caffe2::MaxPoolFunctor<class caff
e2::CUDAContext>::GlobalPoolingBackward<struct c10::Half,2>(int,int,int,struct c10::Half const *,struct c10::Half const
,struct c10::Half const ,struct c10::Half ,class caffe2::CUDAContext )const " (??$GlobalPoolingBackward@UHalf@c10@
@$01@?$MaxPoolFunctor@VCUDAContext@caffe2@@caffe2@QEBA_NHHHPEBUHalf@c10@00PEAU23@PEAVCUDAContext@1@Z) referenced in
function "public: bool __cdecl caffe2::`anonymous namespace'::CuDNNMaxPoolFunctor::GlobalPoolingBackward<struct c10::H
alf,2>(int,int,int,struct c10::Half const ,struct c10::Half const ,struct c10::Half const ,struct c10::Half ,class
caffe2::CUDAContext *)const " (??$GlobalPoolingBackward@UHalf@c10@@$01@CuDNNMaxPoolFunctor@?A0xb936404a@caffe2@QEBA_NH
HHPEBUHalf@c10@00PEAU34@PEAVCUDAContext@2@Z) [D:\pytorch-scripts\caffe2_builders\v141\pytorch\build\Debug\caffe2\caff
e2_gpu.vcxproj]
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17201
Differential Revision: D14165732
Pulled By: ezyang
fbshipit-source-id: 875fd9a5b2db6f83fc483f6d750d2c011260eb8b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17619
--filler hive --iter -1 will let the debugger exhaust all batches from a hive partition before exiting.
Add a README that summarizes command line options and usage.
Reviewed By: yinghai
Differential Revision: D14220166
fbshipit-source-id: daa23b7e8a9184481c6d7b67acf1599e5c99d74a
Summary:
Sparse Linear in TH(CU)NN implements sparse linear layers without
using sparse matrices.
It is currently not documented in PyTorch and there is no functional or
module interface. This means it is unused from a PyTorch point of view.
The reason for removing it is twofold:
- The module uses sort, which I would like to move to ATen.
- When we implement a SparseLinear layer, we would want to do it
using sparse tensors, so it's not all that useful, anyway.
I checked this on slack with soumith, I hope the above is an accurate
representation. All bad ideas are my own.
This is part of the ongoing work to move
sort/topk/mode/median/kthvalue to ATen.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17610
Differential Revision: D14280663
Pulled By: gchanan
fbshipit-source-id: 289231d2c20626855ce2ceecd4f204b460c32378
Summary:
They were previously merged to resolve #17051. However, since it was resolved upstream, and they were causing some issues like https://github.com/abjer/tsds/issues/8, I think it's time to revert these changes.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17567
Differential Revision: D14265241
Pulled By: kostmo
fbshipit-source-id: 7fa2b7dd4ebc5148681acb439cf82d983898694e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17528
as title. register_prim_ops is messy because someone ruined clang-format, but I figured it's okay to include here since this is such a mechanical change
Reviewed By: driazati
Differential Revision: D14236943
fbshipit-source-id: c2b22845837b7f830015510e48ec2ee5202fa407
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17594
The original version of this broke things because a concurrent change raced with it in CI.
Reviewed By: ezyang
Differential Revision: D14266663
fbshipit-source-id: e8ac5dfcb7349b4f2c425d9f0eabbfc964314063
Summary:
It would be better to split the CPU job on CI, but unfortunately we are out of Windows machines.
cc, davidbrownellWork yf225
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17608
Differential Revision: D14281393
Pulled By: soumith
fbshipit-source-id: ae9a6140b7207ce56cfb2da3d812bc3fe060764a
Summary:
- Test updates
1. test_torch: added 0-d test case and t_() test cases
2. test_jit : updated error message for TestAsync.test_async_script_error
- Updating documentation for torch.t()
Adding information regarding new support for 0-D and 1-D tensors
Fixes #17520
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17535
Differential Revision: D14269984
Pulled By: gchanan
fbshipit-source-id: 38b723f31484be939261c88edb33575d242eca65
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17549
Currently Dropout is only enabled during training; we add the option of enabling dropout in eval mode.
This is to follow [1]. This functionality would be used for uncertainty estimation in exploration project.
[1] Gal, Yarin, and Zoubin Ghahramani. "Dropout as a bayesian approximation: Representing model uncertainty in deep learning." international conference on machine learning. 2016.
Reviewed By: Wakeupbuddy
Differential Revision: D14216216
fbshipit-source-id: 87c8c9cc522a82df467b685805f0775c86923d8b
Summary:
The max pooling backwards kernel is currently annotated with launch bounds (256,8).
Adjust the number of waves to 4 (4 times 64 is 256) for ROCm. This improves training performance for torchvision models by up to 15% (AlexNet) on a gfx906 GPU.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17555
Differential Revision: D14277744
Pulled By: bddppq
fbshipit-source-id: 2a62088f7b8a87d1e350c432bf655288967c7883
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17561
The push at the top of the file was missing a corresponding pop
Reviewed By: ezyang
Differential Revision: D14254500
fbshipit-source-id: ff20359b563d6d6dcc68273dc754ab31aa8fad12
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17522
Dispatch is still based on the first tensor arg, but that first "tensor arg" is now allowed to be a tensor list.
That is, the first argument that is either Tensor or TensorList will be the deciding factor for dispatch.
If it is a TensorList, then that TensorList must not be empty or dispatch will fail.
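The rule can be sketched in plain Python (illustrative stand-ins, not the actual dispatcher):

```python
class Tensor:
    """Stand-in carrying only the dispatch-relevant device."""
    def __init__(self, device):
        self.device = device

def dispatch_device(args):
    # The first Tensor or TensorList argument decides dispatch; an
    # empty TensorList in that position is an error.
    for a in args:
        if isinstance(a, Tensor):
            return a.device
        if isinstance(a, list):  # TensorList
            if not a:
                raise RuntimeError("cannot dispatch on an empty TensorList")
            return a[0].device
    raise RuntimeError("no Tensor or TensorList argument")

assert dispatch_device([3, [Tensor("cuda")], Tensor("cpu")]) == "cuda"
```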
Reviewed By: ezyang
Differential Revision: D14235840
fbshipit-source-id: 266c18912d56ce77aa84306c5605c4191f3d882b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17491
Before, there was no way to expose a caffe2 operator that had a variable number of inputs.
Now, this is allowed by giving the operator one tensor list input.
Note that the tensor list must be the first input, and that any other tensor inputs will be ignored and inaccessible in this case.
Reviewed By: ezyang
Differential Revision: D14220705
fbshipit-source-id: 7f921bfb581caf46b229888c409bbcc40f7dda80
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17088
clangr codemod
also manually moved the constructor of a class from the .cpp file to the .h file.
Reviewed By: ezyang
Differential Revision: D14078531
fbshipit-source-id: 2adb4ac0ce523742da6cce3bc3b6c177b816c299
Summary:
HIPGuard interfaces that interacted with HIPStream were previously
totally busted (because the streams had the wrong device type).
This fixes it, following along the same lines as MasqueradingAsCUDA.
Along the way I beefed up the explanatory comment.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
cc jithunnair-amd iotamudelta bddppq
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17469
Differential Revision: D14243396
Pulled By: ezyang
fbshipit-source-id: 972455753a62f8584ba9ab194f9c785db7bb9bde
Summary:
As discussed here #16952, this PR aims at improving the __repr__ for distribution when the provided parameters are torch.Tensor with only one element.
Currently, __repr__() relies on dim() == 0, leading to the following behaviour:
```
>>> torch.distributions.Normal(torch.tensor([1.0]), torch.tensor([0.1]))
Normal(loc: torch.Size([1]), scale: torch.Size([1]))
```
With this PR, the output looks like the following:
```
>>> torch.distributions.Normal(torch.tensor([1.0]), torch.tensor([0.1]))
Normal(loc: 1.0, scale: 0.10000000149011612)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17503
Differential Revision: D14245439
Pulled By: soumith
fbshipit-source-id: a440998905fd60cf2ac9a94f75706021dd9ce5bf
Summary:
See comment inside of code. This fixes a bug where sometimes we would try to avoid printing long lines but would inadvertently reorder the expressions, which can change the semantics of the program
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17557
Differential Revision: D14250608
Pulled By: zdevito
fbshipit-source-id: d44996af4e90fe9ab9508d13cd04adbfc7bb5d1c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17511
AliasTracker was doing bookkeeping for three concepts: the points-to graph,
writes, and wildcards.
This PR makes AliasTracker's job clearer: it keeps track of the points-to
graph. Thus it has been renamed MemoryDAG. Write and wildcard information were
pulled back into AliasDb as part of this—I may decide to pull them into their
own little modules since I don't want the alias analysis stuff to get too
bloated.
This refactor is necessary because we want to start tracking information for
aliasing elements that _aren't_ first-class IR Values (e.g. the "stuff" inside
a list). So MemoryDAG can't know too much about Values
Reviewed By: houseroad
Differential Revision: D14231251
fbshipit-source-id: 6cd98ae6fced8d6c1522c2454da77c3c1b2b0504
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17480
This was always part of our "spec" but not implemented
Reviewed By: houseroad
Differential Revision: D14214301
fbshipit-source-id: 118db320b43ec099dc3e730c67d39487474c23ea
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17478
Enable onnxifi_ext in glow and build an e2e test in caffe2.
Reviewed By: yinghai
Differential Revision: D14190136
fbshipit-source-id: 26245278b487b551623109b14432f675279b17b5
Summary:
+ All quotes for ENV VARS are erroneous;
+ The toolset hasn't been specified;
+ Provide paths for all 3 Visual Studio 2017 products: Community/Professional/Enterprise.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17475
Differential Revision: D14262968
Pulled By: soumith
fbshipit-source-id: c0504e0a6be9c697ead83b06b0c5cf569b5c8625
Summary:
The generated comments are wrong in the generated files below:
```bash
./torch/csrc/autograd/generated/VariableType_0.cpp:3:// generated from tools/autograd/templates/VariableType_0.cpp
./torch/csrc/autograd/generated/VariableType_1.cpp:3:// generated from tools/autograd/templates/VariableType_1.cpp
./torch/csrc/autograd/generated/VariableType_2.cpp:3:// generated from tools/autograd/templates/VariableType_2.cpp
./torch/csrc/autograd/generated/VariableType_3.cpp:3:// generated from tools/autograd/templates/VariableType_3.cpp
./torch/csrc/autograd/generated/VariableType_4.cpp:3:// generated from tools/autograd/templates/VariableType_4.cpp
./torch/csrc/autograd/generated/VariableTypeEverything.cpp:3:// generated from tools/autograd/templates/VariableTypeEverything.cpp
./torch/csrc/jit/generated/register_aten_ops_0.cpp:23:// generated from tools/autograd/templates/register_aten_ops_0.cpp
./torch/csrc/jit/generated/register_aten_ops_1.cpp:23:// generated from tools/autograd/templates/register_aten_ops_1.cpp
./torch/csrc/jit/generated/register_aten_ops_2.cpp:23:// generated from tools/autograd/templates/register_aten_ops_2.cpp
```
These generated files were split to speed up compilation; however, the template files were not.
After this fix, the comments will look like below:
```bash
./torch/csrc/autograd/generated/VariableType_0.cpp:3:// generated from tools/autograd/templates/VariableType.cpp
./torch/csrc/autograd/generated/VariableType_1.cpp:3:// generated from tools/autograd/templates/VariableType.cpp
......
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17563
Differential Revision: D14260992
Pulled By: soumith
fbshipit-source-id: 038181367fa43bee87837e4170704ddff7f4d6f2
Summary:
`resize_` and `resize_as_` resize the input tensor. Because our shape analysis
is flow-invariant, we don't do shape analysis on any op that relies on a Tensor that can alias a resized Tensor.
E.g., in the following graph, by the time `x += 10` runs, `x` may have been resized:
```
@torch.jit.script
def test(x, y):
    for i in range(10):
        x += 10
        x.resize_as_(y)
    return x
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17518
Differential Revision: D14249835
Pulled By: eellison
fbshipit-source-id: f281b468ccb8c29eeb0f68ca5458cc7246a166d9
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17481
Usually, feature macros are either defined or undefined and checked accordingly.
C10_MOBILE was a weird special case that was always defined, but defined to either 1 or 0.
This caused a lot of confusion for me when trying to disable something from the mobile build: it also disabled it
in the server build (because I was using `#ifdef`). Also, I found a place in the existing code base that made
that wrong assumption and used the macro incorrectly, see https://fburl.com/y4icohts
Reviewed By: dzhulgakov
Differential Revision: D14214825
fbshipit-source-id: f3a155b6d43d334e8839e2b2e3c40ed2c773eab6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17078
This prevents caffe2 operators from being exposed to c10 on mobile,
which in turn causes the whole c10 dispatcher to be stripped away
and saves binary size.
We probably want to re-enable the c10 dispatcher for mobile,
but for now this is ok.
Reviewed By: ezyang
Differential Revision: D14077972
fbshipit-source-id: e4dd3e3b60cdfbde91fe0d24102c1d9708d3e5c4
Summary:
And adding timestamps to linux build jobs
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17516
Differential Revision: D14244533
Pulled By: pjh5
fbshipit-source-id: 26c38f59e0284c99f987d69ce6a2c2af9116c3c2
Summary:
This PR allows `gather` to optionally return sparse gradients, as requested in #16329. It also allows to autograd engine to accumulate sparse gradients in place when it is safe to do so.
I've commented out the size.size() check in `SparseTensor.cpp` that also caused #17152; it does not seem to me that the check serves a useful purpose, but please correct me if I'm wrong and a better fix is required.
Motivating example:
For this commonly used label smoothing loss function
```
def label_smoothing_opt(x, target):
    padding_idx = 0
    smoothing = 0.1
    logprobs = torch.nn.functional.log_softmax(x, dim=-1, dtype=torch.float32)
    pad_mask = (target == padding_idx)
    ll_loss = logprobs.gather(dim=-1, index=target.unsqueeze(1), sparse_grad=True).squeeze(1)
    smooth_loss = logprobs.mean(dim=-1)
    loss = (smoothing - 1.0) * ll_loss - smoothing * smooth_loss
    loss.masked_fill_(pad_mask, 0)
    return loss.sum()
```
backward goes from 12.6 ms with dense gather gradients to 7.3 ms with sparse gradients, for 9K tokens x 30K vocab, which is a low-single-digit percent end-to-end improvement, and also an improvement in peak memory required.
Shout-out to core devs: adding python-exposed functions with keyword arguments through native_functions.yaml is very easy now!
cc gchanan apaszke
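A minimal sketch of the sparse-gradient path this PR enables; it assumes the keyword ended up as `sparse_grad`, as in current PyTorch releases (the exact spelling at merge time may have differed):

```python
import torch

# Request a sparse gradient from gather; the autograd engine then
# accumulates the leaf's gradient as a sparse tensor.
x = torch.randn(4, 5, requires_grad=True)
idx = torch.tensor([[0], [2], [1], [4]])
out = torch.gather(x, 1, idx, sparse_grad=True)
out.sum().backward()
assert x.grad.is_sparse  # gradient comes back sparse, not dense
```

For a large vocabulary, this avoids materializing a mostly-zero dense gradient of the full `(tokens, vocab)` shape.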
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17182
Differential Revision: D14158431
Pulled By: gchanan
fbshipit-source-id: c8b654611534198025daaf7a634482b3151fbade
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16746
As titled. We use a special URL scheme, elasticzeus, for elastic Zeus so that we don't need to change the public interface of init_process_group.
Reviewed By: aazzolini, soumith
Differential Revision: D13948151
fbshipit-source-id: 88939dcfa0ad93467dabedad6905ec32e6ec60e6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17456
Using an instruction sequence similar to the function in fbgemm/src/QuantUtilAvx2.cc.
Added elementwise_sum_benchmark.
Reviewed By: protonu
Differential Revision: D14205695
fbshipit-source-id: 84939c9d3551f123deec3baf7086c8d31fbc873e
Summary:
For some additional context on this change, please see this [PR](https://github.com/pytorch/pytorch/pull/17376).
As part of the work on Bool Tensor, we will need to add support for a bool type to the _fill() and _zero() methods that are currently located in THTensorMath. As we don't need anything else and those methods are not really math related, we are moving them out into a separate THTensorFill for simplicity.
Changes:
- moved _fill() and _zero() from THTensorMath.h to THTensorFill
- enabled _fill() and _zero() for the HALF type.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17536
Differential Revision: D14242130
Pulled By: izdeby
fbshipit-source-id: 1d8bd806f0f5510723b9299d360b70cc4ab96afb
Summary:
Causing a problem with spectral norm, although SN won't use that anymore after #13350 .
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13352
Differential Revision: D14209562
Pulled By: ezyang
fbshipit-source-id: f5e3183e1e7050ac5a66d203de6f8cf56e775134
Summary:
As of MIOpen 1.7.1, as shipped in ROCm 2.1, this works correctly, so we can use MIOpen and do not need to fall back.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17472
Differential Revision: D14210323
Pulled By: ezyang
fbshipit-source-id: 4c08d0d4623e732eda304fe04cb722c835ec70e4
Summary:
This only deals with four functions, but is an important first step towards removing BoolTensor and IndexTensor entirely.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17193
Differential Revision: D14157829
Pulled By: cpuhrsch
fbshipit-source-id: a36f16d1d88171036c44cc7de60ac9dfed9d14f2
Summary:
PyTorch's tensor.t() is now equivalent to NumPy's ndarray.T for 1-D tensors,
i.e. tensor.t() == tensor
Test case added:
- test_t
fixes #9687
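A quick sketch of the new 1-D behavior (2-D transpose is unchanged):

```python
import torch

v = torch.arange(3.0)          # 1-D tensor
assert torch.equal(v.t(), v)   # t() is a no-op for 1-D, like numpy's ndarray.T

m = torch.zeros(2, 3)
assert m.t().shape == (3, 2)   # 2-D behavior is unaffected
```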
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17462
Differential Revision: D14214838
Pulled By: soumith
fbshipit-source-id: c5df1ecc8837be22478e3a82ce4854ccabb35765
Summary:
This code is a bit intricate, so I refactored it.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16995
Differential Revision: D14050667
Pulled By: ifedan
fbshipit-source-id: 55452339c6518166f3d4bc9898b1fe2f28601dc4
Summary:
First pass at user defined types. The following is contained in this PR:
- `UserType` type, which contains a reference to a module with all methods for the type, and a separate namespace for data attributes (map of name -> TypePtr).
- `UserTypeRegistry`, similar to the operator registry
- `UserObject` which is the runtime representation of the user type (just a map of names -> IValues)
- `UserTypeValue` SugaredValue, to manage getattr and setattr while generating IR, plus compiler.cpp changes to make that work.
- Frontend changes to get `torch.jit.script` to work as a class decorator
- `ClassDef` node in our AST.
- primitive ops for object creation, setattr, and getattr, plus alias analysis changes to make mutation safe.
Things that definitely need to get done:
- Import/export, python_print support
- String frontend doesn't understand class definitions yet
- Python interop (using a user-defined type outside TorchScript) is completely broken
- Static methods (without `self`) don't work
Things that are nice but not essential:
- Method definition shouldn't matter (right now you can only reference a method that's already been defined)
- Class definitions can only contain defs, no other expressions are supported.
Things I definitely won't do initially:
- Polymorphism/inheritance
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17314
Differential Revision: D14194065
Pulled By: suo
fbshipit-source-id: c5434afdb9b39f84b7c85a9fdc2891f8250b5025
Summary:
Not sure the best way to integrate this…I wrote something that focuses on mutability "vertically" through the stack. Should I split it up and distribute it into the various sections, or keep it all together?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17454
Differential Revision: D14222883
Pulled By: suo
fbshipit-source-id: 3c83f6d53bba9186c32ee443aa9c32901a0951c0
Summary:
as title
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17476
Differential Revision: D14218312
Pulled By: suo
fbshipit-source-id: 64df096a3431a6f25cd2373f0959d415591fed15
Summary:
Temporarily disable them for perf consideration. Will figure out a way to do `torch.zeros(sizes, grad.options())` in torchscript before enabling these.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17470
Differential Revision: D14210313
Pulled By: ailzhang
fbshipit-source-id: efaf44df1192ae42f4fe75998ff0073234bb4204
Summary: Update the docs to include the value parameter that was missing in the `scatter_` function.
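A hedged sketch of the scalar `value` overload the docs now cover, i.e. `scatter_(dim, index, value)` with a Python number instead of a source tensor:

```python
import torch

x = torch.zeros(2, 4)
index = torch.tensor([[2], [3]])
# Scalar overload: writes the number `value` at the indexed positions
# along `dim`, instead of gathering from a source tensor.
x.scatter_(1, index, 0.5)
assert x[0, 2].item() == 0.5 and x[1, 3].item() == 0.5
```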
Differential Revision: D14209225
Pulled By: soumith
fbshipit-source-id: 5c65e4d8fbd93fcd11a0a47605bce6d57570f248
Summary:
Add check and provide useful warning/error information to user if foxi is not checked out.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17477
Reviewed By: zrphercule
Differential Revision: D14212896
Pulled By: houseroad
fbshipit-source-id: 557247d5d8fdc016b1c24c2a21503e59f874ad09
Summary:
Stack:
⚫ **#17453 [jit] simplify aliasdb interface** [💛](https://our.intern.facebook.com/intern/diff/D14205209/)
The previous "getWrites" API relies on the user to do alias checking, which is confusing and inconsistent with the rest of the interface. So replace it with a higher-level call.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17453
Differential Revision: D14209942
Pulled By: suo
fbshipit-source-id: d4aff2af6062ab8465ee006fc6dc603296bcb7ab
Summary:
Previously we were unifying the types of lists across if block outputs. This now fails with Optional subtyping because two types which can be unified have different runtime representations.
```
@torch.jit.script
def list_optional_fails(x):
    # type: (bool) -> Optional[int]
    if x:
        y = [1]
    else:
        y = [None]
    return y[0]
```
The indexing op will expect `y` to be a generic list, but it will find an int list.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17424
Differential Revision: D14210903
Pulled By: eellison
fbshipit-source-id: 4b8b26ba2e7e5bebf617e40316475f91e9109cc2
Summary:
When switching back to `d0` from a stream on a different device `d1`, we need to restore the current streams on both `d0` and `d1`. The current implementation only does that for `d0`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17439
Differential Revision: D14208919
Pulled By: mrshenli
fbshipit-source-id: 89f2565b9977206256efbec42adbd789329ccad8
Summary:
I originally set out to fix to_sparse for scalars, which had an overly restrictive check (sparse_dim > 0, which is impossible for a scalar).
This fix uncovered an issue with nonzero: it didn't properly return a size (z, 0) tensor for an input scalar, where z is the number of nonzero elements (i.e., 0 or 1).
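A sketch of the fixed behavior, assuming current PyTorch semantics for 0-dim inputs:

```python
import torch

s = torch.tensor(2.0)                              # 0-dim "scalar" tensor
assert s.nonzero().shape == (1, 0)                 # z = 1 nonzero element, 0 dims
assert torch.tensor(0.0).nonzero().shape == (0, 0) # z = 0 for a zero scalar
assert s.to_sparse().to_dense().item() == 2.0      # to_sparse accepts scalars
```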
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17406
Differential Revision: D14185393
Pulled By: gchanan
fbshipit-source-id: f37a6e1e3773fd9cbf69eeca7fdebb3caa192a19
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17308
In some cases there is still no RVO/NRVO and std::move is still needed. Latest
Clang gained -Wreturn-std-move warning to detect cases like this (see
https://reviews.llvm.org/D43322).
Reviewed By: igorsugak
Differential Revision: D14150915
fbshipit-source-id: 0df158f0b2874f1e16f45ba9cf91c56e9cb25066
Summary:
as title. These were already added to the tutorials, but I didn't add them to the cpp docs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17452
Differential Revision: D14206501
Pulled By: suo
fbshipit-source-id: 89b5c8aaac22d05381bc4a7ab60d0bb35e43f6f5
Summary:
" ProTip! Great commit summaries contain fewer than 50 characters. Place extra information in the extended description."
lol
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17450
Differential Revision: D14206500
Pulled By: suo
fbshipit-source-id: af7ffe299f8c8f04fa8e720847a1f6d576ebafc1
Summary:
The chunk buffer could hang when no data is read and the buffer size is lower than the chunk size. We detected this while running with a larger dataset, hence the fix. I added a test to mimic the situation and validated that the fix works. Thank you, Xueyun, for finding this issue.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17409
Differential Revision: D14198546
Pulled By: soumith
fbshipit-source-id: b8ca43b0400deaae2ebb6601fdc65b47f32b0554
Summary:
The CI is broken right now; this diff should fix it.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17430
Differential Revision: D14198045
Pulled By: houseroad
fbshipit-source-id: a1c8cb5ccff66f32488702bf72997f634360eb5b
Summary:
This involves another purely cosmetic (ordering) change to the `config.yml` to facilitate simpler logic.
Other changes:
* add some review feedback as comments
* exit with nonzero status on config.yml mismatch
* produce a diagram for pytorch builds
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17427
Differential Revision: D14197618
Pulled By: kostmo
fbshipit-source-id: 267439d3aa4c0a80801adcde2fa714268865900e
Summary:
Previously we only generate one class for each extension backend. This caused issues with scalarType() calls and mapping from variable Types to non-variable types. With this change we generate one Type for each scalar type.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17278
Reviewed By: ezyang
Differential Revision: D14161489
Pulled By: li-roy
fbshipit-source-id: 91e6a8f73d19a45946c43153ea1d7bc9d8fb2409
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17384
Better handling of possible net run errors in prof_dag counters.
Reviewed By: yinghai
Differential Revision: D14177619
fbshipit-source-id: 51bc952c684c53136ce97e22281b1af5706f871e
Summary:
Batch of removing expect files, and some tests that no longer test anything.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17414
Differential Revision: D14196342
Pulled By: eellison
fbshipit-source-id: 75c45649d1dd1ce39958fb02f5b7a2622c1d1d01
Summary:
This will evolve into complete technical docs for the JIT. Posting what I have so far so people can start reading it and offering suggestions. Go to Files Changed and click 'View File' to see the markdown formatted.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16887
Differential Revision: D14191219
Pulled By: zdevito
fbshipit-source-id: 071a0e7db05e4f2eb657fbb99bcd903e4f46d84a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17375
Previously we created the onnxGraph first and took it to the onnx manager for registration. That doesn't work well in practice. This diff takes a "bring your own constructor" approach to reduce the resources spent doing backend compilation.
Reviewed By: kimishpatel, rdzhabarov
Differential Revision: D14173793
fbshipit-source-id: cbc4fe99fc522f017466b2fce88ffc67ae6757cf
Summary:
The benchmarks are now running on gpu cards with more memory
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17416
Differential Revision: D14190493
Pulled By: bddppq
fbshipit-source-id: 66db1ca1fa693d24c24b9bc0185a6dd8a3337103
Summary:
This PR removes a few sizes of `self` that were passed from the forward pass to the backward pass when `self` is already required in the backward pass. This could be the reason for the potential slowdown in #16689. I will attach a few perf numbers (still a bit volatile among runs, though) that I got in the comment.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17187
Differential Revision: D14179512
Pulled By: ailzhang
fbshipit-source-id: 5f3b1f6f26a3fef6dec15623b940380cc13656fa
Summary:
This fell through the cracks during the migration from pytorch/builder to CircleCI. It's technically still racy, but much less likely to race now.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17381
Differential Revision: D14190137
Pulled By: pjh5
fbshipit-source-id: 2d4cd04ee874cacce47d1d50b87a054b0503bb82
Summary:
Creates a new shared type parser to be shared between the IR parser and the schema parser.
Also adds parsing of CompleteTensorType and DimensionedTensorType, and feature-gates that for the IRParser.
Renames the existing type parser for Python annotations to python_type_parser, and names the new one jit_type_parser.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17383
Differential Revision: D14186438
Pulled By: eellison
fbshipit-source-id: bbd5e337917d8862c7c6fa0a0006efa101c76afe
Summary:
Still WIP; needs more tests and correct handling for opset 8 in symbolics.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16068
Reviewed By: zrphercule
Differential Revision: D14185855
Pulled By: houseroad
fbshipit-source-id: 55200be810c88317c6e80a46bdbeb22e0b6e5f9e
Summary:
reorder some envars for consistency
add readme and notice at the top of config.yml
generate more yaml from Python
closes #17322
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17323
Differential Revision: D14186734
Pulled By: kostmo
fbshipit-source-id: 23b2b2c1960df6f387f1730c8df1ec24a30433fd
Summary:
Fall back operators to CPU for ONNX support.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15270
Differential Revision: D14099496
Pulled By: yinghai
fbshipit-source-id: 52b744aa5917700a802bdf19f7007cdcaa6e640a
Summary:
As of right now, the script will produce a new generated file which will be inconsistent with the rest.
Test Result:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17370
Differential Revision: D14184943
Pulled By: izdeby
fbshipit-source-id: 5d3b956867bee661256cb4f38f086f33974a1c8b
Summary:
Currently there is a mismatch in naming between Python BatchNorm's `running_var` and C++ BatchNorm's `running_variance`, which causes JIT model parameter loading to fail (https://github.com/pytorch/vision/pull/728#issuecomment-466067138):
```
terminate called after throwing an instance of 'c10::Error'
what(): No such serialized tensor 'running_variance' (read at /home/shahriar/Build/pytorch/torch/csrc/api/src/serialize/input-archive.cpp:27)
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0x85 (0x7f2d92d32f95 in /usr/local/lib/libc10.so)
frame #1: torch::serialize::InputArchive::read(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, at::Tensor&, bool) + 0xdeb (0x7f2d938551ab in /usr/local/lib/libtorch.so.1)
frame #2: torch::nn::Module::load(torch::serialize::InputArchive&) + 0x98 (0x7f2d9381cd08 in /usr/local/lib/libtorch.so.1)
frame #3: torch::nn::Module::load(torch::serialize::InputArchive&) + 0xf9 (0x7f2d9381cd69 in /usr/local/lib/libtorch.so.1)
frame #4: torch::nn::Module::load(torch::serialize::InputArchive&) + 0xf9 (0x7f2d9381cd69 in /usr/local/lib/libtorch.so.1)
frame #5: torch::nn::operator>>(torch::serialize::InputArchive&, std::shared_ptr<torch::nn::Module> const&) + 0x32 (0x7f2d9381c7b2 in /usr/local/lib/libtorch.so.1)
frame #6: <unknown function> + 0x2b16c (0x5645f4d1916c in /home/shahriar/Projects/CXX/build-TorchVisionTest-Desktop_Qt_5_12_1_GCC_64bit-Debug/TorchVisionTest)
frame #7: <unknown function> + 0x27a3c (0x5645f4d15a3c in /home/shahriar/Projects/CXX/build-TorchVisionTest-Desktop_Qt_5_12_1_GCC_64bit-Debug/TorchVisionTest)
frame #8: <unknown function> + 0x2165c (0x5645f4d0f65c in /home/shahriar/Projects/CXX/build-TorchVisionTest-Desktop_Qt_5_12_1_GCC_64bit-Debug/TorchVisionTest)
frame #9: <unknown function> + 0x1540b (0x5645f4d0340b in /home/shahriar/Projects/CXX/build-TorchVisionTest-Desktop_Qt_5_12_1_GCC_64bit-Debug/TorchVisionTest)
frame #10: __libc_start_main + 0xf3 (0x7f2d051dd223 in /usr/lib/libc.so.6)
frame #11: <unknown function> + 0x1381e (0x5645f4d0181e in /home/shahriar/Projects/CXX/build-TorchVisionTest-Desktop_Qt_5_12_1_GCC_64bit-Debug/TorchVisionTest)
```
Renaming C++ BatchNorm `running_variance` to `running_var` should fix this problem.
This is a BC-breaking change, but it should be easy for end user to rename `running_variance` to `running_var` in their call sites.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17371
Reviewed By: goldsborough
Differential Revision: D14172775
Pulled By: yf225
fbshipit-source-id: b9d3729ec79272a8084269756f28a8f7c4dd16b6
Summary:
**WIP**
Attempt 2 at #14831
This adds `nn.LSTM` to the jit standard library. Necessary changes to the module itself are detailed in comments. The main limitation is the lack of a true `PackedSequence`, instead this PR uses an ordinary `tuple` to stand in for `PackedSequence`.
Most of the new code in `rnn.py` is copied to `nn.LSTM` from `nn.RNNBase` to specialize it for LSTM since `hx` is a `Tuple[Tensor, Tensor]` (rather than just a `Tensor` as in the other RNN modules) for LSTM.
As a hack it adds an internal annotation `@_parameter_list` to mark that a function returns all the parameters of a module. The weights for `RNN` modules are passed to the corresponding op as a `List[Tensor]`. In Python this has to be gathered dynamically since Parameters could be moved from CPU to GPU or be deleted and replaced (i.e. if someone calls `weight_norm` on their module, #15766), but in the JIT parameter lists are immutable, hence a builtin to handle this differently in Python/JIT.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15744
Differential Revision: D14173198
Pulled By: driazati
fbshipit-source-id: 4ee8113159b3a8f29a9f56fe661cfbb6b30dffcd
Summary:
The test I added was failing lint because a constant was being created that wasn't being destroyed.
It was being inserted into all_nodes, then failing the check
` AT_ASSERT(std::includes(ALL_OF(sum_set), ALL_OF(all_nodes_set)));`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17316
Differential Revision: D14172548
Pulled By: eellison
fbshipit-source-id: 0922db21b7660e0c568c0811ebf09b22081991a4
Summary:
This provides the minimum necessary to allow derivative formulas for things that have a kwarg only specifier in their schema. Support for non-parser frontend default arguments for kwargs is not completed.
Fixes#16921
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17339
Differential Revision: D14160923
Pulled By: zdevito
fbshipit-source-id: 822e964c5a3fe2806509cf24d9f51c6dc01711c3
Summary:
Fix for #17261. SsnL, do you have tests for it in your other PR? If not, I'll add them to this one. The example from #17261 now does not error out (and the same goes for log_softmax).
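A sketch of the now-working pattern, using a float64 upcast as a stand-in example for the `dtype` argument:

```python
import torch

x = torch.randn(3, 4)
# Passing `dtype` upcasts the computation and output in one call.
out = torch.nn.functional.softmax(x, dim=-1, dtype=torch.float64)
assert out.dtype == torch.float64
assert torch.allclose(out.sum(dim=-1), torch.ones(3, dtype=torch.float64))
```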
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17330
Differential Revision: D14171529
Pulled By: soumith
fbshipit-source-id: ee925233feb1b44ef9f1d757db59ca3601aadef2
Summary:
Adds about 30 matches due to new functions / misuse of double.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17340
Differential Revision: D14161109
Pulled By: cpuhrsch
fbshipit-source-id: bb3333446b32551f7469206509b480db290f28ee
Summary:
The method will be used in IRParser and in NetDef converter.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17372
Differential Revision: D14172494
Pulled By: ZolotukhinM
fbshipit-source-id: 96cae8422bc73c3c2eb27524f44ec1ee8cae92f3
Summary:
This PR switches from `OperationCreator` to `Operation` to simplify the logic.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17183
Differential Revision: D14169829
Pulled By: Krovatkin
fbshipit-source-id: 27f40a30c92e29651cea23f08b5b1f13d7eced8c
Summary:
Our sparse tests still almost exclusively use legacy constructors. This means you can't, for example, easily test scalars (because the legacy constructors don't allow them), and not surprisingly, many operations are broken with sparse scalars.
Note: this doesn't address the SparseTensor constructor itself, because there is a separate incompatibility there that I will address in a follow-on commit, namely, that torch.sparse.FloatTensor() is supported, but torch.sparse_coo_tensor() is not (because the size is ambiguous).
The follow-on PR will explicitly set the size for sparse tensor constructors and add a test for the legacy behavior, so we don't lose it.
Included in this PR are changes to the constituent sparse tensor pieces (indices, values):
1) IndexTensor becomes index_tensor
2) ValueTensor becomes value_tensor if it is a data-based construction, else value_empty.
3) Small changes around using the legacy tensor type directly, e.g. torch.FloatTensor.dtype exists, but torch.tensor isn't a type.
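As a sketch of the constructor difference noted above: `torch.sparse_coo_tensor` takes an explicit size, which resolves the ambiguity that the legacy `torch.sparse.FloatTensor()` constructor leaves open:

```python
import torch

i = torch.tensor([[0, 1, 1],
                  [2, 0, 2]])          # 2 x nnz index matrix
v = torch.tensor([3.0, 4.0, 5.0])      # one value per index column
# Size is given explicitly; it cannot be inferred unambiguously from indices.
s = torch.sparse_coo_tensor(i, v, (2, 3))
assert s.to_dense()[1, 2].item() == 5.0
```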
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17324
Differential Revision: D14159270
Pulled By: gchanan
fbshipit-source-id: 71ee63e1ea6a4bc98f50be41d138c9c72f5ca651
Summary:
Apparently, before this, the only way we enforced it was size >= 0 in alloc_cpu. So empty((5, -5)) would fail, but empty((-5, -5)) would hang :)
Please suggest a better place to enforce it, if any.
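A quick sketch of the intended behavior after this change: both shapes with a negative dimension should raise rather than one of them hanging:

```python
import torch

for size in [(5, -5), (-5, -5)]:
    try:
        torch.empty(*size)
        raise AssertionError("negative size should have been rejected")
    except RuntimeError:
        pass  # both shapes now raise consistently
```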
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17077
Differential Revision: D14077930
Pulled By: dzhulgakov
fbshipit-source-id: 1120513300fd5448e06fa15c2d72f9b0ee5734e4
Summary:
This PR addresses the slowness of MVN's log_prob as reported in #17206.
t-vi I find it complicated to handle permutation dimensions if we squeeze singleton dimensions of bL, so I leave it as-is and keep the old approach. What do you think?
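For reference, a small check of MVN's log_prob (the op this PR speeds up), evaluated at the mean with identity covariance, where the density reduces to -d/2 * log(2*pi):

```python
import math
import torch
from torch.distributions import MultivariateNormal

mvn = MultivariateNormal(torch.zeros(3), covariance_matrix=torch.eye(3))
lp = mvn.log_prob(torch.zeros(3))
# At the mean with identity covariance the quadratic term and log-det vanish.
assert torch.isclose(lp, torch.tensor(-1.5 * math.log(2 * math.pi)))
```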
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17294
Differential Revision: D14157292
Pulled By: ezyang
fbshipit-source-id: f32590b89bf18c9c99b39501dbee0eeb61e130d0
Summary:
Fix#16650.
Headers such as `ATen/cpu/vml.h` contain `#include <ATen/cpu/vec256/vec256.h>`
for example, but these vec256 headers aren't included, due to commit e4c0bb1.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17220
Differential Revision: D14165695
Pulled By: ezyang
fbshipit-source-id: 27b2aa2a734b3719ca4af0565f79623b64b2620f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17297
When `torch.load` needs to load a tensor, no matter which device it will be end up being loaded on, it first creates a CPU storage for it of the necessary size. This storage is allocated but it's not "set" yet, hence no data is written to it: it exists in the kernel's memory map, but it's not resident and doesn't take up physical pages. Then, this storage is passed to the `map_location` function (if the parameter is a string, a device or a map, PyTorch builds that function automatically). The default map for CUDA consists effectively in `lambda storage, _: storage.cuda()` (I omitted the code needed to pick the correct device). This creates a GPU storage and copies over the data of the CPU storage. *This step is unnecessary as we're copying uninitialized memory*. (Surprisingly enough, though, it appears the kernel is smart enough that reading from the unpaged CPU memory doesn't cause it to become paged.) Once `map_location` returns a storage residing on the correct target device, `torch.load` resumes reading the file and copying the tensor's content over into the storage. This will overwrite the content that had previously been written to it, which confirms that the above copy was pointless.
A way to avoid this useless copy is to just create and return a new empty storage on the target GPU, instead of "transforming" the original one.
This does indeed increase the performance:
```
In [5]: torch.save(torch.rand(100, 100, 100), "/tmp/tensor")
In [6]: %timeit torch.load("/tmp/tensor", map_location="cuda")
1.55 ms ± 111 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [7]: %timeit torch.load("/tmp/tensor", map_location=lambda storage, _: torch.cuda.FloatStorage(storage.size()))
1.03 ms ± 44 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
```
Credit for this diff is shared with adamlerer and fmassa.
Differential Revision: D14147673
fbshipit-source-id: a58d4bc0d894ca03a008499334fc2cdd4cc91e9f
Summary:
If something is a TensorList, it should be a list of `TensorType`, not a list of some specialized type.
Fixes#17140, #15642
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17321
Differential Revision: D14158192
Pulled By: suo
fbshipit-source-id: ba8fe6ae8d618c73b23cd00cbcb3111c390c5514
Summary:
Bunch of random stuff I came across while doing UDT stuff. Putting in a separate PR to avoid noise
- fix up the alias analysis list ops to include fork/wait
- improve dump() for aliasDb to print writes
- Move BuiltinFunction::call() to sugaredvalue with the rest of the methods
- formatting and includes
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17288
Differential Revision: D14147105
Pulled By: suo
fbshipit-source-id: 62e2a922a1726b684347365dc42c72188f154e9c
Summary:
MKL-DNN supports multi-node mode but not multi-device mode; this commit adds multi-device support for MKL-DNN. This commit depends on https://github.com/pytorch/pytorch/pull/11330
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12856
Differential Revision: D13735075
Pulled By: ezyang
fbshipit-source-id: b63f92b7c792051f5cb22e3dda948013676e109b
Summary:
Add the missing `std` introduced by #16689. Investigating why this wasn't caught in CI (nor in my local dev environment).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17263
Reviewed By: ezyang
Differential Revision: D14134556
Pulled By: ailzhang
fbshipit-source-id: 6f0753fa858d3997e654924779646228d6d49838
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15034
Rethrow exceptions that happened during RunAsync; ensure that pending tasks
are not executed after the run is marked as finished.
Reviewed By: andrewwdye
Differential Revision: D13409649
fbshipit-source-id: 3fd12b3dcf32af4752f8b6e55eb7a92812a5c057
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17132
The schedule() function is not supposed to throw exceptions and is supposed
to succeed in scheduling the full graph of tasks; potential errors (e.g. errors
from the underlying thread pool, out-of-memory exceptions, etc.) are considered not
recoverable.
The invariant: the graph of tasks is either not executed or
executed in full before the call to finishRun().
Reviewed By: andrewwdye
Differential Revision: D14092457
fbshipit-source-id: a3e5d65dfee5ff5e5e71ec72bb9e576180019698
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17275
Previous implementation used a memcpy inside the kernel. It is more efficient to reduce the data fetched per thread to a single word from memory. This exposes more concurrency and takes advantage of GPU memory coalescing support.
Reviewed By: takatosp1
Differential Revision: D14120147
fbshipit-source-id: c4734003d4342e55147c5b858f232a006af60b68
Summary:
With this patch you can use USE_DISTRIBUTED=OFF (possibly in combination with USE_NCCL=OFF (?)).
This is significant partly because NCCL doesn't build with CUDA 8.
This is written under the assumption that NCCL is required for distributed; if not, the USE_DISTRIBUTED check in nccl.py should be replaced by a check for the USE_NCCL environment variable.
Fixes: #17274
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17295
Differential Revision: D14155080
Pulled By: ezyang
fbshipit-source-id: 0d133f7c5b4d118849f041bd4d4cbbd7ffc3c7b4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16723
Removed obsolete argument correct_transform_coords in bbox_transform op.
* It was only for backward compatibility. We should not have models using it now.
Differential Revision: D13937430
fbshipit-source-id: 504bb066137ce408c12dc9dcc2e0a513bad9b7ee
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17194
we found that there is a per row absolute error due to int8 quant
and a relative error table-wide in case fp16 is used
Reviewed By: csummersea
Differential Revision: D14113353
fbshipit-source-id: c7065aa9d15c453c2e5609f421ad0155145af889
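The per-row absolute error behavior described above can be sketched in plain Python (a hedged illustration of rowwise asymmetric int8 quantization, not the fbgemm implementation): each row gets its own scale and zero point, and the reconstruction error of any entry is bounded by half that row's quantization step.

```python
def quantize_row(row, num_bits=8):
    """Rowwise asymmetric quantization: a per-row scale and zero point."""
    qmin, qmax = 0, 2 ** num_bits - 1
    lo, hi = min(row), max(row)
    scale = (hi - lo) / (qmax - qmin) or 1.0   # guard against constant rows
    zero_point = round(qmin - lo / scale)
    q = [max(qmin, min(qmax, round(x / scale) + zero_point)) for x in row]
    deq = [(v - zero_point) * scale for v in q]
    return deq, scale

row = [0.1, -0.4, 2.0, 0.7]
deq, scale = quantize_row(row)
# Per-row absolute error is bounded by half the quantization step.
max_err = max(abs(a - b) for a, b in zip(row, deq))
assert max_err <= scale / 2 + 1e-9
```

Because the step size scales with the row's value range, the absolute error is per-row, matching the observation in the commit; the fp16 case then adds a table-wide relative error on top.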
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17198
We have come to the point where we need to apply rules that bind certain ops together to avoid un-inferrable intermediate shapes: we either lower them together to the backend or lower neither. This diff adds a pass that lets us add rules like this. The first one binds `Gather` with `SparseLengthsWeighted*`.
Reviewed By: ipiszy
Differential Revision: D14118326
fbshipit-source-id: 14bc62e1feddae02a3dd8eae93b8f553d52ac951
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17272
after Windows-specific fixes were applied, a new file was left out of CMakeLists
Reviewed By: orionr
Differential Revision: D14140419
fbshipit-source-id: 6a6c652048ed196ec20241bc2a1d08cbe2a4e155
Summary:
This commit makes the following enhancements:
1. Add docs for build_android.sh;
2. Add an install step for build_android.sh, so that the headers and libraries can be conveniently collected together for further usage;
3. Change the default INSTALL_PREFIX from $PYTORCH_ROOT/install to $PYTORCH_ROOT/build_android/install to keep the project directory clean.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17298
Differential Revision: D14149709
Pulled By: soumith
fbshipit-source-id: a3a38cb41f26377e21aa89e49e57e8f21c9c1a39
Summary:
The particular use case reported is Jetson TX2 and maskrcnn.
Fixes #17144
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17296
Differential Revision: D14147886
Pulled By: soumith
fbshipit-source-id: 44d5a89aaeb4cc07d1b53dd90121013be93c419c
Summary:
`TestNN.test_variable_sequence_cuda` sometimes breaks due to a reported CUDA leak.
The cause appears to be too small a tolerance breaking the float16 sub-test of the test above.
When it breaks, it calls abort, disrupting the correct tear-down of the test
and falsely alarming about the leak.
~~Also, removed annoying **Upsample** module warning.
IMHO this warning is wrong because the module **Upsample** is not deprecated. Seems like it's been mixed
with `nn.functional.upsample` function which is indeed deprecated in favor of `nn.functional.interpolate`, see `torch/nn/functional.py:2387` for details (this replacement is also performed in `test_nn.py`).~~
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17242
Differential Revision: D14141686
Pulled By: soumith
fbshipit-source-id: faa8f87440d94bdc6ab0ff00be6dad82353115c4
Summary:
Currently, when the input tensor `self` is not contiguous, `tril_` and `triu_` call `self = self.contiguous()`, which allocates a new contiguous tensor and assigns it to `self`. This effectively changes the input tensor `self`'s pointer and will break downstream code after the Variable/Tensor merge.
This PR fixes it so that `tril_` and `triu_` always update the input tensor in-place and preserve the input tensor's TensorImpl.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17031
Differential Revision: D14069592
Pulled By: yf225
fbshipit-source-id: d188218f426446a44ccc1d33fc28ac3f828c6a05
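The contract the PR enforces can be illustrated with a plain-Python sketch (lists standing in for tensors, not the actual TH/ATen code): the in-place variant must mutate the existing object rather than rebinding the name to a new allocation, so the caller's object identity (the TensorImpl, in PyTorch terms) survives.

```python
def triu_(matrix, diagonal=0):
    """Zero the entries below the given diagonal, in place.

    The matrix object itself is mutated -- its identity (and hence any
    alias of it held elsewhere) is preserved, mirroring what the PR
    guarantees for the input tensor's TensorImpl.
    """
    for i, row in enumerate(matrix):
        for j in range(len(row)):
            if j - i < diagonal:
                row[j] = 0
    return matrix

m = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
out = triu_(m)
assert out is m                                  # same object: no reallocation
assert m == [[1, 2, 3], [0, 5, 6], [0, 0, 9]]
```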
Summary:
Some values are copied when they could have been moved.
Detected by the compiler flag -Wreturn-std-move.
Reviewed By: igorsugak
Differential Revision: D14134303
fbshipit-source-id: 8fc3bb2017108b3d65097cb8447e33f5b6c743b4
Summary:
Lightweight implementation of the LLVM FileCheck utility. It currently only handles string matching; regexes and saving a regex to a variable name can be added as needed.
The current intended usage is through the FileCheckBuilder python handle, as shown in the tests.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16858
Differential Revision: D14096244
Pulled By: eellison
fbshipit-source-id: c7c8d1457691c105e6ccbb3c1a378d96baac2569
Summary:
Trying to land again: make prim::None into a case of prim::Constant. The previous landing was reverted because it broke an important onnx export test.
https://github.com/pytorch/pytorch/pull/16160
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17186
Differential Revision: D14115304
Pulled By: eellison
fbshipit-source-id: 161435fc30460b4e116cdd62c7b2e5b94581dcb7
Summary: Clarified in the docs that `tensor` is used as `end`.
Differential Revision: D14132212
Pulled By: ezyang
fbshipit-source-id: e9bca14d5079e5f7adfc18afcb1eec832ef86e9e
Summary:
Reenables rand_like fusion if no tensor is broadcasted in the fusion group. This is a sufficient but not necessary condition for fused rand_like to produce correct results, and it has an unpleasant side effect of falling back to non-fused path if rand_like was optimistically included in the fusion group, but there is a broadcast in the fusion group not necessarily related to rand_like. E.g. before this PR, if the network had (biasAdd -> relu -> dropout), fuser could fuse biasAdd and relu, now it will try fusing the whole thing (if dropout is expressed via rand_like) and fall back every time.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16087
Differential Revision: D13720232
Pulled By: zou3519
fbshipit-source-id: 1e19203bec4a59257bfc7078b054a19f00fab4ad
Summary:
This is the first commit from a series of planned changes in order to add boolean tensors to PyTorch. The whole plan looks like this:
0. Storage Implementation (this change)
1. Tensor Creation.
2. Tensor Conversions.
3. Tensor Indexing.
4. Tensor Operations.
5. Back compatibility related changes.
This feature was requested by the community:
https://github.com/pytorch/pytorch/issues/4764
https://github.com/pytorch/pytorch/issues/4219
https://github.com/pytorch/pytorch/issues/4288
**Change**:
Added boolean type to the Storage class for CPU and CUDA backends.
**Tested via**:
1. unit tests
2. running this:
-> import torch
-> torch.BoolStorage
<class 'torch.BoolStorage'>
-> torch.cuda.BoolStorage
<class 'torch.cuda.BoolStorage'>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16810
Reviewed By: gchanan
Differential Revision: D14087246
Pulled By: izdeby
fbshipit-source-id: 042642ced1cb0fd1bb6bff05f9ca871a5c54ee5e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17188
Using the flag "-Wreturn-std-move", the compiler can identify cases where a copy
operation is performed when a move operation would have been available.
Wrapped the return statements with std::move to fix.
For some reason, these files are not automatically modded. With D14115372
we should be able to turn on the compile flag.
Reviewed By: soumith
Differential Revision: D14115786
fbshipit-source-id: e763b92eecbe4468027fc141d029618d1e9f280b
Summary:
Adding two distributed samplers, Random and Sequential, to the mix. Similar to the Python counterpart, DistributedSampler introduces a new method `set_epoch(size_t epoch)` which can be used to shuffle data deterministically between distributed processes.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16910
Differential Revision: D14130980
Pulled By: soumith
fbshipit-source-id: ec08b7130c01e2fc6dc3693f7ac622a0a6d60f10
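A minimal plain-Python sketch of the idea (class and method names are illustrative, not the actual C++ API): seeding the shuffle with the epoch makes every replica compute the same permutation, and each replica then takes a disjoint strided shard, so `set_epoch` reshuffles deterministically across processes.

```python
import random

class DistributedRandomSampler:
    """Sketch: each of num_replicas processes sees a disjoint shard, and
    set_epoch() reseeds the shuffle identically everywhere."""

    def __init__(self, dataset_size, num_replicas, rank):
        self.size = dataset_size
        self.num_replicas = num_replicas
        self.rank = rank
        self.epoch = 0

    def set_epoch(self, epoch):
        self.epoch = epoch

    def __iter__(self):
        indices = list(range(self.size))
        # Same seed on every replica => same global permutation.
        random.Random(self.epoch).shuffle(indices)
        # Each replica takes a strided shard of the permutation.
        return iter(indices[self.rank::self.num_replicas])

s0 = DistributedRandomSampler(8, num_replicas=2, rank=0)
s1 = DistributedRandomSampler(8, num_replicas=2, rank=1)
s0.set_epoch(3); s1.set_epoch(3)
shard0, shard1 = list(s0), list(s1)
assert sorted(shard0 + shard1) == list(range(8))  # shards partition the data
```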
Summary:
`del Tensor.grad` sets the PyObject to nullptr,
while `Tensor.grad = None` sets the PyObject to Py_None.
Handling both cases now.
Fixes #16471
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16525
Differential Revision: D14130800
Pulled By: soumith
fbshipit-source-id: ed85c38305bba94d5047311cb58e4e4cedd09832
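The two cases can be sketched in plain Python with a descriptor (a hedged illustration of the semantics, not the actual C++ binding code): `del t.grad` clears the slot entirely (the nullptr case), while `t.grad = None` stores an explicit None (the Py_None case), and both must leave the accessor returning None.

```python
class GradSlot:
    """Sketch of the two cases the fix must handle uniformly."""

    def __set_name__(self, owner, name):
        self.name = "_" + name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        # A missing attribute (the "nullptr" case) reads back as None too.
        return getattr(obj, self.name, None)

    def __set__(self, obj, value):
        setattr(obj, self.name, value)

    def __delete__(self, obj):
        # Remove the slot entirely rather than storing None.
        obj.__dict__.pop(self.name, None)

class Tensor:
    grad = GradSlot()

t = Tensor()
t.grad = "some-grad"
del t.grad                 # nullptr-style clear
assert t.grad is None
t.grad = None              # explicit Py_None assignment
assert t.grad is None
```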
Summary:
It might need some cleaning up and might be missing some features, but it should be already working for most cases.
This PR is based on top of PR16986 (so please review only the last commit here).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16987
Differential Revision: D14074577
Pulled By: ZolotukhinM
fbshipit-source-id: 712b598f423265655f574bb9903e2066628eaad3
Summary:
Similar to softmax, there are issues with randomly getting NaN.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17170
Differential Revision: D14110515
Pulled By: bddppq
fbshipit-source-id: 5c97661184d45a02122fd69d35a839fdf4520c8c
Summary:
Currently the converters are very straightforward, i.e. there is no code for trying to
preserve semantics; we purely perform conversion from one format to another.
Two things that we might want to add/change:
1. Add semantic conversion as well (but probably it would be a good idea to keep
it separate as a temporary thing).
2. Make sure we don't mess with value names, as they are crucial for current
uses of NetDefs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17123
Differential Revision: D14090244
Pulled By: ZolotukhinM
fbshipit-source-id: 07175fa9235582e1d1da5f10a42a5c1280b1b394
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17062
from jiyan's training jobs it seems like we found a quantization bug
fp32 is fine
fp32 -> rowwise int8 is fine
fp16 is fine
fp16 -> rowwise int8 is not fine
we are preconverting everything to fp32 and using the existing code, so there is no need to change the epsilon in the case of fp16 since at the time of converting, everything is a float
Reviewed By: jspark1105
Differential Revision: D14063271
fbshipit-source-id: 747297d64ed8c6fdf4be5bb10ac584e1d21a85e6
Summary:
This change simplifies analysis done on constants since prim::None does not need to be handled separately now. To check if a constant node is None, use node->isNone().
Next step will be to remove prim::Undefined.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16160
Differential Revision: D14109636
Pulled By: eellison
fbshipit-source-id: d26fd383976163a2ddd4c24984bd672a541cc876
Summary:
Based on https://github.com/pytorch/pytorch/pull/12413, with the following additional changes:
- Inside `native_functions.yml` move those outplace operators right next to everyone's corresponding inplace operators for convenience of checking if they match when reviewing
- `matches_jit_signature: True` for them
- Add missing `scatter` with Scalar source
- Add missing `masked_fill` and `index_fill` with Tensor source.
- Add missing test for `scatter` with Scalar source
- Add missing test for `masked_fill` and `index_fill` with Tensor source by checking the gradient w.r.t source
- Add missing docs to `tensor.rst`
Differential Revision: D14069925
Pulled By: ezyang
fbshipit-source-id: bb3f0cb51cf6b756788dc4955667fead6e8796e5
Summary:
The main problem there is with differentiating batch norm statically
is that we make a lot of complex run-time decisions about the backend
we choose. Then, the autograd derivatives are implemented for every
backend separately, which makes sense, because they might be saving
buffers containing different values. To resolve the issue, the forward
op returns an index of the chosen backend, and the backward function
takes it as an argument, such that it knows how to interpret the buffers.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15403
Differential Revision: D14098815
Pulled By: ailzhang
fbshipit-source-id: 7fcd3e6e0566433e81fe8286fb441c1ecaf198ad
Summary:
/cc goldsborough
Working on #14582
The corresponding python implementations are at: [pytorch/torch/nn/init.py](6302e4001a/torch/nn/init.py (L261-L327))
Here is my initial implementation of Kaiming Initialization. I have not been able to figure out how to successfully run tests locally so I haven't added any yet.
A couple questions:
- Are the enums defined in the right place? I copied their names from Python, but do you prefer different naming conventions for C++?
- To run tests locally do I use `python setup.py test`? Can I run just a subset of the tests somehow?
- Should I add my tests at [test/cpp/api/misc.cpp](https://github.com/pytorch/pytorch/blob/master/test/cpp/api/misc.cpp#L47-L54)?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14718
Differential Revision: D14049159
Pulled By: goldsborough
fbshipit-source-id: 966ac5126875936e69b185b5041f16476ed4cf70
Summary:
In `torch.distributed.launch.py`, it passes `local_rank` as argument and requires user's program to parse it. However, it would be more flexible for users and consistent with other variables, e.g. `RANK`, `MASTER_PORT`, `WORLD_SIZE`, if passing through environment variables.
265ed8ff45/torch/distributed/launch.py (L200-L212)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16360
Differential Revision: D14070372
Pulled By: ezyang
fbshipit-source-id: c3f6a8e55ab513918cad09d1326eccdedb4d98c9
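From the user program's side, the environment-variable style looks like the sketch below (hedged: the variable name `LOCAL_RANK` follows the convention of `RANK`, `WORLD_SIZE`, `MASTER_PORT`; the launcher that exports it is not shown).

```python
import os

# Normally the launcher exports LOCAL_RANK; default it here so the sketch
# is runnable standalone.
os.environ.setdefault("LOCAL_RANK", "0")

# The training script reads the environment instead of parsing --local_rank.
local_rank = int(os.environ["LOCAL_RANK"])
world_size = int(os.environ.get("WORLD_SIZE", "1"))
```

This keeps the user's argument parser free of launcher-specific flags.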
Summary:
The binary ops that are using TensorIterator do a trick in order to only write the code once for out and non-out variants:
1) Have the non-out variant call the out variant with an undefined tensor.
2) the out variant then reassigns the result tensor to the output of the TensorIterator; this is a no-op in the case where a valid tensor was passed and it correctly propagates the result back to the non-out variant, which is legal because it's just reassigning an undefined tensor.
I believe other solutions to this problem would require an unnecessary reference bump, e.g. defining another out variant that returns a Tensor rather than a reference.
Unfortunately, this doesn't work with const-references, which we want to move our output arguments to be (because const doesn't actually provide const correctness here, and writers mistakenly reassign the parameter in the case it isn't an out variant).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17059
Differential Revision: D14068402
Pulled By: gchanan
fbshipit-source-id: 89fef177a1e174dbe2858e2eae0f6d85460b07d1
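The trick described above can be sketched in plain Python, with `None` standing in for the undefined tensor (a hedged illustration of the pattern, not the actual TensorIterator code): the out variant reassigns its `out` parameter when it is undefined, which is a no-op from the caller's perspective, so one body serves both variants.

```python
def add_out(a, b, out=None):
    """Out variant: if out is None (the "undefined tensor"), allocate the
    result and rebind the local name; otherwise write into the caller's
    buffer. The rebinding is invisible to the caller, which is why the
    trick is legal."""
    if out is None:
        out = [0] * len(a)
    for i in range(len(a)):
        out[i] = a[i] + b[i]
    return out

def add(a, b):
    # Non-out variant simply calls the out variant with an undefined output.
    return add_out(a, b, out=None)

buf = [0, 0]
assert add_out([1, 2], [3, 4], out=buf) is buf   # result written in place
assert add([1, 2], [3, 4]) == [4, 6]
```

With C++ const-reference output arguments, the rebinding step is no longer allowed, which is the conflict the commit describes.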
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17158
Because of the Reshape op, the batch size can change. This diff addresses the first-order issue raised by a multiple-batch-size system. We need to export a different real_batch_size for each max_batch_size input and attach it to the right output.
It also fixes a false exception.
Reviewed By: ipiszy
Differential Revision: D14099541
fbshipit-source-id: 0fa9e86826f417a11d2b5dd2ee60dff64a7ce8c4
Summary:
This initial PR splits the `.circleci/config.yml` file into several smaller files that are stitched verbatim back into the original. A proof of concept of dynamically generating yaml for the job configuration list is also introduced.
Since the `config.yml` file must exist in the repo in its final form, there must exist a manual update and check-in step to regenerate `config.yml` from its constituent parts.
Consistency between the checked-in `config.yml` file and the authoritative source data is enforced at build time through TravisCI.
closes #17038
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17039
Reviewed By: yf225
Differential Revision: D14109059
Pulled By: kostmo
fbshipit-source-id: bc04a73145290358854f5a5e552a45e559118fc3
Summary:
This PR add supports for simpler for-in-list loops such as the example below:
```python
@torch.jit.script
def sum_list(a):
    # type: (List[int]) -> int
    sum = 0
    for i in a:
        sum += i
    return sum
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16726
Differential Revision: D14070007
Pulled By: ezyang
fbshipit-source-id: b4d971ee647729a6caa3099ceac34ec5c4f143de
Summary:
one_hot docs are missing [here](https://pytorch.org/docs/master/nn.html#one-hot).
I dug around and could not find a way to get this working properly.
Differential Revision: D14104414
Pulled By: zou3519
fbshipit-source-id: 3f45c8a0878409d218da167f13b253772f5cc963
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17105
To make FC with rowwise quantization faster, reduce code duplication, and make code consistent with Convolution
Reviewed By: csummersea
Differential Revision: D14080461
fbshipit-source-id: 2b0e67b86e7e3029c90751a8824bf80ae1223680
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17145
Prepacked weight contains both weight and bias, so the bias should be obtained from input index 1, not from 2
Reviewed By: jianyuh
Differential Revision: D14097281
fbshipit-source-id: b8b836b85a7b240e2fd1734377c46d9bf2ce3390
Summary:
In the NUMA case, PinnedCPUAllocator's allocate() would return a
DataPtr constructed by DefaultCPUAllocator, which would reference
the Default... Delete() rather than the Pinned... Delete(). That
meant Pinned... Delete() would never run, so cudaHostUnregister()
would never be called when regions were freed.
See: https://github.com/pytorch/pytorch/issues/16280
This change adds a 'naked_allocate()' method to the Default allocator
that just returns a pointer to the allocated memory rather than
wrapping it in a DataPtr. Pinned allocator uses that then constructs
a DataPtr with reference to its own Delete().
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16340
Reviewed By: dzhulgakov
Differential Revision: D13843206
Pulled By: ezyang
fbshipit-source-id: 9efb572e5a01b49ef2a4aceeccc13cd0b1066528
Summary:
This prevents people (reviewers, PR authors) from forgetting to add things to `torch.rst`.
When something new is added to `_torch_doc.py` or `functional.py` but intentionally not in `torch.rst`, people should manually whitelist it in `test_docs_coverage.py`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16039
Differential Revision: D14070903
Pulled By: ezyang
fbshipit-source-id: 60f2a42eb5efe81be073ed64e54525d143eb643e
Summary:
setting the correct math type for cudnn rnn, which is enforced starting from cudnn 7.5+
1. Updating persistent rnn check with input data type instead of rnn math type;
2. Updating rnn type promotion to set correct math type for accumulation;
3. Replace datatype check for filter descriptor from rnn.datatype to input.datatype;
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16825
Differential Revision: D14071190
Pulled By: ezyang
fbshipit-source-id: 1c9a1531ccf510cb0619e830be444c20c5e72f3f
Summary:
In light of the antistatic feature being a part of the released ROCm 2.1, remove
the feature in pyHIPIFY for extraction of kernel arguments and insertion of
static_casts.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17055
Differential Revision: D14068478
Pulled By: bddppq
fbshipit-source-id: 6895f490c78247a129aa18c520ff8d4d1a3d3642
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16722
Updated bbox_transform and nms unit test for caffe2 ops.
Differential Revision: D13937416
fbshipit-source-id: 034743d29671c6e73d323a935e2d734ecc071bff
Summary:
Support data parallel for ScriptModule.
See the unit tests for the testing done for this PR. I also tried a traced version of resnet18 from torchvision.
I have yet to try complete end-to-end data-parallel training; that will be the next step.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16891
Differential Revision: D14002222
Pulled By: gqchen
fbshipit-source-id: fce3598169113215599815c6978e66d3c3a8c282
Summary:
Follow-up of #14533: add more test coverage for `emitIf` metaprogramming conditions. Also delete some unwrap optional usage.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16794
Differential Revision: D14096868
Pulled By: wanchaol
fbshipit-source-id: ee1cec609c58d0dd65211249a90207be06649e71
Summary:
When adaptive pooling has to produce a single pixel feature map, it is faster to do so by calling .mean(). Backward calls a pretty inefficient cuda kernel with atomics, which becomes ridiculously slow for halfs. For half this PR provides approx 30x speed-up for adaptive average pooling, which results in 30% end-to-end speed-up on senet. Improvements are smaller for float, but still significant (approx 5x).
Also this PR unifies handling of 3d (no batch dimension) and 4d tensors, using negative dimension indices.
cc ezyang for review.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17011
Reviewed By: ailzhang
Differential Revision: D14078747
Pulled By: soumith
fbshipit-source-id: 0eb9255da2351190a6bcaf68c30e2ae2402a2dd9
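The equivalence being exploited can be shown with a tiny plain-Python sketch (lists standing in for tensors): adaptive average pooling to a single output pixel is exactly the mean over the spatial dimensions, so the fast path can dispatch to a plain mean reduction.

```python
def adaptive_avg_pool_1x1(feature_map):
    """When the target output is 1x1, adaptive average pooling over an
    HxW map reduces to the mean of all spatial entries."""
    h, w = len(feature_map), len(feature_map[0])
    return sum(sum(row) for row in feature_map) / (h * w)

fm = [[1.0, 2.0], [3.0, 6.0]]
assert adaptive_avg_pool_1x1(fm) == 3.0
```

The backward of a mean is a uniform scatter of the gradient, which avoids the atomics in the generic adaptive-pooling backward kernel.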
Summary:
This updates the example for `torch.mode` to show a case where there is a mode.
Also add a bit of a description to the explanation as well as being a bit more precise about "a" mode rather than "the" mode.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17069
Differential Revision: D14078722
Pulled By: soumith
fbshipit-source-id: 837a238d53a9b8e868511acbdc258633975bea48
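The "a" mode versus "the" mode distinction can be sketched in plain Python (an illustration of the semantics, not PyTorch's kernel): when several values are tied for the highest count, any one of them is a valid mode, and an implementation must pick a tie-breaking rule.

```python
from collections import Counter

def mode(values):
    """Return *a* mode: with ties, any most-frequent value is valid
    (here, the smallest is chosen for determinism)."""
    counts = Counter(values)
    best = max(counts.values())
    return min(v for v, c in counts.items() if c == best)

assert mode([1, 2, 2, 3]) == 2
assert mode([1, 1, 2, 2]) == 1   # tie: both 1 and 2 are modes
```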
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17074
There are some common functionalities in backend lowering. This diff creates a base class which hosts these common stuff.
Reviewed By: ipiszy
Differential Revision: D14073192
fbshipit-source-id: 9617603d0e73db6f7fcc5572756b9dbab506dae5
Summary:
I noticed that we were sinking a lot of time into `cat` operations in machine translation on CPU, and drilled down to us doing the cat element-by-element, even though all the inputs were contiguous. The reason was we were doing the cat along a dimension that was not 0, and that caused us to not use the fast `memcpy` branch. This PR generalizes that branch.
Quick benchmark script:
```
import torch, time
tensors = [torch.rand(6, 2, 1024) for i in range(5)]
NITER = 1000
s = time.time()
for i in range(NITER):
    torch.cat(tensors, dim=1)
print('time per iter ', (time.time() - s) / NITER)
```
Before:
```
time per iter 8.089399337768554e-05
```
After:
```
time per iter 2.183413505554199e-05
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17032
Differential Revision: D14090038
Pulled By: jamesr66a
fbshipit-source-id: 2c733a84915896008ac95f2233f44894bd2573de
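The generalized fast path can be sketched in plain Python, with flat lists standing in for contiguous buffers (a hedged illustration of the copy pattern, not the actual C++ kernel): for contiguous inputs concatenated along dim 1, each outer index contributes one contiguous slab per input, so the work is a handful of bulk slice copies (memcpy in the kernel) instead of an element-by-element loop.

```python
def cat_dim1_flat(buffers, shape):
    """Concatenate flat contiguous buffers of common shape (d0, d1, d2)
    along dim 1, copying one contiguous slab per (outer index, input)."""
    d0, d1, d2 = shape
    slab = d1 * d2                                   # contiguous chunk size
    out = []
    for i in range(d0):
        for buf in buffers:
            out.extend(buf[i * slab:(i + 1) * slab])  # one bulk copy
    return out

a = list(range(12))            # shape (2, 2, 3), contiguous
b = [x + 100 for x in a]
out = cat_dim1_flat([a, b], (2, 2, 3))
assert out[:6] == [0, 1, 2, 3, 4, 5]
assert out[6:12] == [100, 101, 102, 103, 104, 105]
```

The number of copies is `d0 * len(buffers)` regardless of `d1 * d2`, which is why large inner slabs see the biggest speedup.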
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17080
This changes all operators using this macro to the new format
Reviewed By: dzhulgakov
Differential Revision: D14078628
fbshipit-source-id: 67048e485e326765fd49567cc008633d3d500d5c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17076
OSS: slightly change tools/amd_build/build_amd.py to add the output_directory option for internal use. Also modify the renaming convention in the hipify script to reflect the updated rules.
Reviewed By: bddppq
Differential Revision: D13767218
fbshipit-source-id: cbcadc51daab42197d545f204840dcc18176bb3d
Summary:
- Moved a few functions from the `autograd` namespace to the `aten` namespace to make them visible from the JIT nativeResolver.
- Added a hack to look up keyword-only arguments. Will add proper support for kwarg-only arguments later.
- Simulate function overloading in aten using `_<number>` as a function-name suffix.
- Even when `forward` returns multiple outputs, as in `kthvalue`, there's at most one that requires grad that we currently support.
- Removed the `TensorList`-related ops here since partial `TensorList` support is prone to bugs. Our symbolic diff for `cat` was never tested with autodiff, and it seems broken. Need to find another proper way to support these ops (either by properly supporting `TensorList` or something like `prim::ConstantChunk`) and leave them for the next PR.
Ops supported in this PR:
```
erf
expand_as
index
kthvalue
mean
permute
pow
rsub
select
sqrt
squeeze
t
to
topk
transpose
view
var
embedding
logsumexp
// grad is None
_dim_arange
contiguous
nonzero
ones_like
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16689
Differential Revision: D14020806
Pulled By: ailzhang
fbshipit-source-id: a5e2c144a7be5a0d39d7ac5f93cb402ec12503a5
Summary:
update:
1. global_reduce now checks for should_block_y_reduce first.
This avoids enabling global_reduce without block_y_reduce, which led to
accessing shared memory during global reduce without allocation.
2. Updated the block_y_reduce heuristics. Improves perf on tiny tensors.
3. Added a test case covering old cases where illegal memory access might occur.
TensorIterator cuda launch configs update (#16224)
Update launch configs for TensorIterator gpu_reduce_kernel. Enable flexible
block dimension to improve efficiency for reduction cases with small fast
dimension.
Previously TensorIterator launches blocks with fixed 32x16 threads.
For cases like:
import torch
torch.randn(2**20, 4, device='cuda').sum(0)
The fixed launch config does not handle coalesced memory access efficiently for such shapes.
The updated launch config enables a flexible block dimension. Combined with the
improved reduction scheme (using flexible vertical / horizontal reduction
instead of the limited warp / block reduction in the old code), it ensures an optimal
memory access pattern even when reducing over a dimension with a small stride.
Possible future improvements:
1. Precise dynamic shared memory allocation.
2. Using warp shuffle for vertical (block_y) reduction.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16224
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17040
Differential Revision: D14078295
Pulled By: umanwizard
fbshipit-source-id: ecc55054a5a4035e731f0196d633412225c3b06c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17046
As we are moving to use bound shape inference, we can remove the awkward fake inference run path and make the code cleaner.
Reviewed By: ipiszy
Differential Revision: D14061501
fbshipit-source-id: b3ace98b3dabef3c3359086a0bb1410518cefa26
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17096
LengthsRangeFill takes a batch-size lengths input and expands it into a sequence. Later ops should follow this type until they hit another batch-type-moderating op, e.g. SparseLengthsSum.
Reviewed By: ipiszy
Differential Revision: D14079422
fbshipit-source-id: 1a26925d502c32875ea95c160268bf6a256cc955
Summary:
Currently, when you pass a negative index to a `Dataset` created with `ConcatDataset`, it simply passes that index to the first dataset in the list. So if, for example, we take `concatenated_dataset[-1]`, this gives us the last entry of the *first* dataset, rather than the last entry of the *last* dataset, as we would expect.
This is a simple fix to support the expected behavior for negative indices.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15756
Reviewed By: ezyang
Differential Revision: D14081811
Pulled By: fmassa
fbshipit-source-id: a7783fd3fd9e1a8c00fd076c4978ca39ad5a8a2a
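A minimal plain-Python sketch of the fix (names are illustrative, not the exact torch.utils.data implementation): negative indices are normalized against the total length before the usual dispatch to the per-dataset offsets.

```python
import bisect

class ConcatDataset:
    """Sketch: cumulative lengths pick the sub-dataset; negative indices
    wrap against the total length first."""

    def __init__(self, datasets):
        self.datasets = list(datasets)
        self.cumulative = []
        total = 0
        for d in self.datasets:
            total += len(d)
            self.cumulative.append(total)

    def __len__(self):
        return self.cumulative[-1]

    def __getitem__(self, idx):
        if idx < 0:
            idx += len(self)                 # the fix: wrap before dispatch
            if idx < 0:
                raise IndexError("index out of range")
        ds = bisect.bisect_right(self.cumulative, idx)
        prev = self.cumulative[ds - 1] if ds > 0 else 0
        return self.datasets[ds][idx - prev]

cat = ConcatDataset([[1, 2, 3], [4, 5]])
assert cat[-1] == 5          # last entry of the *last* dataset
assert cat[0] == 1 and cat[4] == 5
```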
Summary:
Add an end-to-end test for AveragePool with count_include_pad.
We can export AveragePool from PyTorch with the count_include_pad attribute; however, we don't directly support it in Caffe2's ONNX backend.
We also want to check whether the end-to-end test for the average pool operator with the count_include_pad attribute passes (pytorch => onnx => caffe2).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17034
Reviewed By: houseroad
Differential Revision: D14060186
Pulled By: dwarakrajagopal
fbshipit-source-id: 10dae532611c71f8c8cfc3fa701cc7c1c1c02695
Summary:
Since we don't do tmp_install any more it's better to include all necessary headers.
cc kostmo for better suggestions of how to list all headers here
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16890
Differential Revision: D14079848
Pulled By: dzhulgakov
fbshipit-source-id: 4522c80d05e5d91f99f6700cde46cac559330d28
Summary:
Some legacy TH code was relying on alloc to throw when called with a negative number!!! E.g. `torch.linspace(0, 1, -1)`. And it breaks the ASAN build. I still believe alloc should receive size_t, but I added a safety enforce inside.
It should fix ASAN. I'll follow up with a proper fix for empty_cpu (which is probably the right place to do it) separately
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17071
Differential Revision: D14074157
Pulled By: dzhulgakov
fbshipit-source-id: 3ed3bdb873e446edecb558e1df491310fd7179e3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16275
Adding a generic string `metadata` field as part of the model to capture additional metadata with the model.
Reviewed By: dzhulgakov
Differential Revision: D13579029
fbshipit-source-id: 7456ef2edbe73bb70bbb31889cecd94e0db329a2
Summary:
libshm_manager doesn't need to depend on all of libtorch. It only uses tiny tempfile.h which can be moved to c10. I could just duplicate the file too, but it's not worth it as c10 is small enough.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17019
Differential Revision: D14052688
Pulled By: dzhulgakov
fbshipit-source-id: 8797d15f8c7c49c49d40b7ab2f43aa3bf6becb0c
Summary:
Currently the converters are very straightforward, i.e. there is no code for trying to
preserve semantics; we purely perform conversion from one format to another.
Two things that we might want to add/change:
1. Add semantic conversion as well (but probably it would be a good idea to keep
it separate as a temporary thing).
2. Make sure we don't mess with value names, as they are crucial for current
uses of NetDefs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16967
Differential Revision: D14062537
Pulled By: ZolotukhinM
fbshipit-source-id: 88b184ee7276779e5e9152b149d69857515ad98a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16929
Separate CPU reduce functions from math
i-am-not-moving-c2-to-c10
Reviewed By: houseroad
Differential Revision: D13999469
fbshipit-source-id: bd628b15a6e3c1f04cc62aefffb0110690e1c0d1
Summary:
For >2D input, the code previously used the static shape captured during tracing and reshaped before/after `Gemm`.
Now we add `-1` to the first `Reshape`, and use `Shape(X) => Slice(outer) => Concat(with -1 for inner) => Reshape` for the second.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16184
Differential Revision: D14070754
Pulled By: ezyang
fbshipit-source-id: 86c69e9b254945b3406c07e122e57a00dfeba3df
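Why emitting `-1` removes the traced batch size can be sketched with a small shape-inference helper (a hedged illustration, not the ONNX runtime's code): one target dimension may be `-1`, and it is inferred from the total element count, so the inner dims stay fixed while the batch dim is free.

```python
from functools import reduce
import operator

def infer_reshape(shape, target):
    """Resolve a single -1 in `target` from the total element count of
    `shape`, as Reshape semantics allow."""
    numel = reduce(operator.mul, shape, 1)
    known = reduce(operator.mul, (d for d in target if d != -1), 1)
    return tuple(numel // known if d == -1 else d for d in target)

# A (batch, 4, 5) input flattened for Gemm: inner dims fixed, batch free.
assert infer_reshape((2, 4, 5), (-1, 20)) == (2, 20)
assert infer_reshape((7, 4, 5), (-1, 20)) == (7, 20)   # any batch size works
```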
Summary:
This PR fixes following issue: https://github.com/pytorch/pytorch/issues/16828
It is a combination of two things:
1) MKLDNN streams are not thread-safe but are currently shared between different threads. This change makes them thread_local
2) By default MKLDNN primitives can share global memory and can't be invoked from multiple threads. This PR enables the MKLDNN_ENABLE_CONCURRENT_EXEC cmake configuration option that makes them thread-safe.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17022
Differential Revision: D14069052
Pulled By: ezyang
fbshipit-source-id: f8f7fcb86c40f5d751fb35dfccc2f802b6e137c6
Summary:
This removes curly braces from the outputs (we have indentation to indicate scopes), also adds ':' after graph and blocks declaration and removes ';' from the return line. ".expect" tests are updated to keep up with it.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16986
Differential Revision: D14062540
Pulled By: ZolotukhinM
fbshipit-source-id: 7f8e2d11619152a21ef7f1f7f8579c49392c3eca
Summary:
Previously, the ChunkBuffer depended on the remaining chunk count to signal the end of data loading. This does not work with distributed samplers, where each sampler only loads a subset of chunks. This refactor removes the ChunkBuffer's dependency on the remaining chunk count.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16868
Differential Revision: D14066517
Pulled By: goldsborough
fbshipit-source-id: 293dfe282ceff326dff0876c2f75c2ee4f4463e2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16824
There was a big wooly yak getting the deprecated macros to work.
Gory details are in Deprecated.h
Reviewed By: smessmer
Differential Revision: D13978429
fbshipit-source-id: f148e5935ac36eacc481789d22c7a9443164fe95
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17027
Glow doesn't support the second output of Reshape right now, and it's unused. For correctness, we do make sure that the second output of Reshape is of Constant type during bound shape inference.
Reviewed By: ipiszy
Differential Revision: D14056555
fbshipit-source-id: f39cca7ba941bf5a5cc3adc96e2b1f943cc0be93
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16942
We can fold col offsets into bias if zero point of activation is constant.
fbgemm still needs to provide an option to pass col offsets in case the zero point of the activation keeps changing (e.g., dynamic quantization).
A trick to optimize the static quantization case is setting the A zero point to 0 after folding it into the bias.
This diff also optimizes the case when weights use symmetric quantization: when the B zero point is 0, we use PackAMatrix instead of PackAWithRowOffset.
TODO:
Ideally, PackAWithRowOffset should perform as fast as PackAMatrix when B_zero_point is 0 to make client code simpler
Same in PackAWithIm2Col and depth-wise convolution (group convolution is already doing this)
Reviewed By: csummersea
Differential Revision: D14013931
fbshipit-source-id: e4d313343e2a16a451eb910beed30e35de02a40c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16691
Previous diffs already introduced a macro that registers caffe2 CPU kernels with c10.
This now also registers the CUDA kernels with it.
Reviewed By: bwasti
Differential Revision: D13901619
fbshipit-source-id: c15e5b7081ff10e5219af460779b88d6e091a6a6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17025
Extract ShapeInfo and some util functions into a separate file.
Reviewed By: yinghai
Differential Revision: D14017432
fbshipit-source-id: 201db46bce6d52d9355a1a86925aa6206d0336bf
Summary:
Fix issue #12174 for Mac OSX.
PS: This is a duplicate of PR #16968 that got messed up. Sorry for the confusion.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16999
Differential Revision: D14050669
Pulled By: zou3519
fbshipit-source-id: a4594c03ae8e0ca91a4836408b6c588720162c9f
Summary:
This fixes the segfault.
Changelog:
- Modify the function calls in LegacyDefinitions for `geqrf_out` and `ormqr_out`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16964
Differential Revision: D14025985
Pulled By: gchanan
fbshipit-source-id: aa50e2c1694cbf3642273ee14b09ba12625c7d33
Summary:
The second input (`lengths`) is not supported.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16727
Differential Revision: D14054105
Pulled By: houseroad
fbshipit-source-id: 36b8d00460f9623696439e1bd2a6bc60b7bb263c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16985
These statements were causing some redundant allocations + copying, so I cleaned
them up
Reviewed By: zdevito, wanchaol
Differential Revision: D14031067
fbshipit-source-id: f760fb29a2561894d52a2663f557b3e9ab1653de
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16997
1. Don't create multiple AdjustBatch ops for the same input name. We create it once and hook input to abc_post_adjust_batch.
2. Dangling tensor. The problem for such an error is still with AttachAdjustBatchOp. Considering such as net
```
op {
type : "Relu"
input: "X"
output: "Y"
}
op {
type : "Relu"
input: "Y"
output: "Y2"
}
external_output: "Y"
external_output: "Y2"
```
In this net, the output of the first Relu is used both as an internal node and as an external output. We cannot simply rename Y into Y_pre_batch_adjust. Basically, we need another pass to check all the inputs of the ops in the net and rename Y into Y_pre_batch_adjust.
Reviewed By: bertmaher
Differential Revision: D14041446
fbshipit-source-id: f6553e287a8dfb14e4044cc20afaf3f290e5151b
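The renaming pass described above can be sketched in Python (a toy version operating on ops as dicts; `rewrite_shared_output` and the `_pre_batch_adjust` suffix handling are simplified assumptions, not the Caffe2 implementation):

```python
def rewrite_shared_output(ops, name):
    """Toy pass: `name` is both an external output and an input to later ops.

    Rename the tensor everywhere inside the net, then append an op that
    re-materializes the original (external) name from the renamed tensor,
    so consumers and external outputs both stay valid.
    """
    internal = name + "_pre_batch_adjust"
    for op in ops:
        op["inputs"] = [internal if t == name else t for t in op["inputs"]]
        op["outputs"] = [internal if t == name else t for t in op["outputs"]]
    # Re-expose the original name as an output of the adjust op.
    ops.append({"type": "AdjustBatch", "inputs": [internal], "outputs": [name]})
    return ops
```

Applied to the two-Relu net above, both the producer's output and the second Relu's input become `Y_pre_batch_adjust`, and the appended op restores `Y`.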
Summary:
Closes #16983
Remove backticks that are being interpreted by the shell. Add the `-e` option to the bash script to avoid such failures in the future.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16984
Reviewed By: yf225
Differential Revision: D14039128
Pulled By: kostmo
fbshipit-source-id: c31a1895377ca86c1b59e79351843cc8c4fd7de3
Summary:
The use case for making this PR is the following bug :
(with F = torch.nn.functional)
`F.max_pool2d.__module__` is `torch._jit_internal`
`F.max_pool2d.__name__` is `fn`
With this PR you get:
`F.max_pool2d.__module__` is `torch.nn.functional`
`F.max_pool2d.__name__` is `max_pool2d`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16922
Differential Revision: D14020053
Pulled By: driazati
fbshipit-source-id: c109c1f04640f3b2b69bc4790b16fef7714025dd
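The usual way to preserve `__module__`/`__name__` through a wrapper is `functools.wraps`; a minimal sketch (the `weak_script` decorator here is a stand-in, not the real one in `torch._jit_internal`):

```python
import functools


def weak_script(fn):
    """Wrap a function while keeping its metadata intact.

    functools.wraps copies __module__, __name__, __doc__, etc. from the
    wrapped function onto the wrapper, so introspection sees the original.
    """
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        return fn(*args, **kwargs)
    return wrapper


@weak_script
def max_pool2d(x):
    return x
```

Without `functools.wraps`, `max_pool2d.__name__` would report the wrapper's name (`wrapper`) instead of `max_pool2d`.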
Summary:
gchanan pointed out in https://github.com/pytorch/pytorch/pull/16389 that `allow_inf` is treating `-inf` and `inf` as equal. This fixes it.
Also fixing #16448 since it's nearby and 2.1 has been released.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16959
Differential Revision: D14025297
Pulled By: gchanan
fbshipit-source-id: 95348309492e7ab65aa4d7aabb5a1800de66c5d6
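A sign-aware comparison along the lines of the fix can be sketched as (`close_allowing_inf` is a hypothetical helper, not the actual test utility):

```python
import math


def close_allowing_inf(a, b, tol=1e-5):
    """Compare two floats, allowing infinities.

    Infinities only match when the signs agree (so inf != -inf);
    finite values are compared with an absolute tolerance.
    """
    if math.isinf(a) or math.isinf(b):
        return a == b  # inf == inf, -inf == -inf, but inf != -inf
    return abs(a - b) <= tol
```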
Summary:
Previously, we used the templated class directly to provide
implementations. However, there is a subtle difference
between this, and CUDAStreamGuard: CUDAStreamGuard has refined types
for the Streams it returns. This led to a compilation failure
of the HIPified ddp.cpp. This commit lines them up more closely,
at the cost of copy-paste.
A possible alternate strategy would have been to extend the
InlineDeviceGuard templates to optionally accept refinements
for Stream. I leave this for future work.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16978
Differential Revision: D14045346
Pulled By: ezyang
fbshipit-source-id: 2b101606e62e4db588027c57902ea739a2119410
Summary:
This is needed to check for wrong arguments or --help options
before `build_deps()` is executed. Otherwise command line arguments
are not parsed and checked until `setup()` is run.
Fixes: #16707
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16914
Differential Revision: D14041236
Pulled By: soumith
fbshipit-source-id: 41f635772ccf47f05114775d5a19ae04c495ab3b
Summary:
Fixes the bug for when tensor is created on Caffe2 side, then passed to PT and resized. Now we just initialize allocator correctly.
Note that the code in raw_mutable_data() is still necessary because of non-resizable tensors.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16857
Reviewed By: houseroad
Differential Revision: D14019469
Pulled By: dzhulgakov
fbshipit-source-id: 14d3a3b946d718bbab747ea376903646b885706a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16932
During the onnxifi transformation, the net's SSA is rewritten. At the last step the weight
names are changed back to what they were before. This diff keeps the weight
names unchanged throughout the process.
Reviewed By: yinghai
Differential Revision: D13972597
fbshipit-source-id: 7c29857f788a674edf625c073b345f2b44267b33
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16965
Instead of having one large templated function to wrap the caffe2 op, minimize the amount of templated code.
Non-templated code can be reused between different operators and decreases binary size.
Reviewed By: orionr
Differential Revision: D14018806
fbshipit-source-id: bedd4152eec21dd8c5778446963826316d210543
Summary:
Fixes #16577.
This greatly improves the memory efficiency of certain ops like Dropout2d. Previously, they were implemented as `input * mask` where mask never requires_grad, but we didn't use that knowledge in forward, and (in case of an in-place dropout) kept input.clone() for the backward, where it would simply get ignored.
This patch tries to address this situation by emitting some guards for stores like this, but only if they are as simple as checking whether a single value requires_grad.
Interestingly, the same optimizations apply to methods like bmm, baddbmm, etc., but _not to mm nor addmm_, because of how their derivatives are defined. Apparently they unnecessarily use `mat1` to compute the derivative of `mat1` just to improve the error message in case `mat1` was sparse. I'd like to apply this optimization to that case too, but I don't want to lose the nicer error message, so if anyone has any ideas for solutions, please let me know...
Full list of operators affected by this patch:
* _nnpack_spatial_convolution
* addbmm
* addcdiv
* addcmul
* addmv
* addr
* baddbmm
* bmm
* cross
* div
* dot
* fmod
* ger
* index_add_
* mul
* mv
* scatter_add_
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16583
Differential Revision: D13900881
Pulled By: gchanan
fbshipit-source-id: dd0aeb2ab58c4b6aa95b37b46d3255b3e014291c
Summary:
In VariableType.cpp, when a function modifies its input tensors, it should only change the input tensors' storage data in-place, and should never change the input tensors' storage pointers. This PR adds checks for this, and also fixes functions that fail this test.
This is part of the Variable/Tensor merge work (https://github.com/pytorch/pytorch/issues/13638).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16305
Differential Revision: D13897855
Pulled By: yf225
fbshipit-source-id: 0c4fc7eb530d30db88037b1f0981f6f8454d3b79
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16906
In C++11, constexpr implies const, so these methods actually wouldn't be rvalue overloads as intended but const rvalue overloads.
Let's only apply the constexpr flag in C++14 to be safe.
Reviewed By: bddppq
Differential Revision: D13998486
fbshipit-source-id: a04d17ef0cc8f45e3d0a1ca9843d194f4f0f6f7f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16399
Catching cudaError_t return values in a few places, because it's nodiscard in rocm. Unless we add -Wno-unused-result, it'll end up with a compilation error.
Also in c10/cuda/test, check whether a host has GPU or not. We were silently throwing out the error before (so not really testing the cuda api).
Reviewed By: bddppq
Differential Revision: D13828281
fbshipit-source-id: 587d1cc31c20b836ce9594e3c18f067d322b2934
Summary:
Here is a stab at implementing an option to zero out infinite losses (and NaN gradients).
It might be nicer to move the zeroing to the respective kernels.
The default is currently `False` to mimic the old behaviour, but I'd be half inclined to set the default to `True`, because the behaviour wasn't consistent between CuDNN and Native anyways and the NaN gradients aren't terribly useful.
This topic seems to come up regularly, e.g. in #14335
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16199
Differential Revision: D14020462
Pulled By: ezyang
fbshipit-source-id: 5ba8936c66ec6e61530aaf01175dc49f389ae428
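The zeroing behavior can be illustrated with a toy helper (plain Python, not the CUDA/native kernels; `zero_infinite_losses` is hypothetical):

```python
import math


def zero_infinite_losses(losses):
    """Replace non-finite per-sample losses with 0.

    Infinite losses (e.g. from impossible CTC alignments) and any NaNs
    are zeroed so they don't poison the batch mean or its gradients.
    """
    return [l if math.isfinite(l) else 0.0 for l in losses]
```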
Summary:
Merge binaries "convert_image_to_tensor" and "caffe2_benchmark" to remove the overhead of writing to/reading from Tensor file.
*TODO next: TensorProtos is another overhead. No need for de-serialization.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16875
Reviewed By: sf-wind
Differential Revision: D13997726
Pulled By: ZhizhenQin
fbshipit-source-id: 4dec17f0ebb59cf1438b9aba5421db2b41c47a9f
Summary:
I'm seeing a bunch of apt gpg key errors on CI with the following message:
```
An error occurred during the signature verification. The repository is not
updated and the previous index files will be used. GPG error:
https://packagecloud.io trusty InRelease: The following signatures couldn't
be verified because the public key is not available:
NO_PUBKEY 4E6910DFCB68C9CD
```
Most of the times apt will reuse the old cached version, but sometimes this results in a build failure: https://circleci.com/gh/pytorch/pytorch/758366?utm_campaign=vcs-integration-link&utm_medium=referral&utm_source=github-build-link.
This should hopefully fix it.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16961
Differential Revision: D14028151
Pulled By: ezyang
fbshipit-source-id: 7648a0a58ece38d8d04916937a9fa17f34f8833e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16911
I think the Thrust package has what we want for /opt/rocm/include/thrust. We can probably stop patching it now.
Reviewed By: bddppq
Differential Revision: D14015177
fbshipit-source-id: 8d9128783a790c39083a1b8b4771c2c18bd67d46
Summary:
Hi,
caffe2/operators/quantized/int8_given_tensor_fill_op.cc expects the value array to be named "values", but the operator schema describes "value" (no s). I guess it is a small typo, but it made me lose a bit of time before understanding why I got this error when passing "value" instead of "values":
```
[F int8_given_tensor_fill_op.h:95] Check failed: output->t.numel() == values_.numel() output size: 3 given size: 0
Aborted (core dumped)
```
Thanks,
Eyyüb Sari
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16204
Differential Revision: D14020476
Pulled By: ezyang
fbshipit-source-id: a8a46bfc44ec125e7925ce4b7c79fdf99c890a50
Summary:
Instead of converting the sparse matrix from COO to CSR format as in the original implementation, my revision directly uses the COO format for sparse-dense matrix multiplication.
On my Linux machine it is about 5 times faster than the original code:
```
(original code)
SIZE: 15000 DENSITY: 0.01 DEVICE: cpu
torch: 0.39403 seconds
np: 0.00496674 seconds
torch/np: 79.3338
----------------------------------------
(my update)
SIZE: 15000 DENSITY: 0.01 DEVICE: cpu
torch: 0.0812583 seconds
np: 0.00501871 seconds
torch/np: 16.1911
```
Further code feedback and running-time tests are very welcome. I will keep revising my code as needed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16905
Differential Revision: D14020095
Pulled By: ezyang
fbshipit-source-id: 4ab94075344a55b375f22421e97a690e682baed5
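The COO-direct multiplication idea can be sketched in pure Python (a naive reference implementation, not the optimized C++ kernel):

```python
def coo_matmul(rows, cols, vals, dense, out_rows, out_cols):
    """Sparse (COO) x dense matmul by iterating the nonzeros directly.

    rows/cols/vals are the parallel COO arrays of the sparse matrix;
    `dense` is a list of rows. Each nonzero (r, c, v) contributes
    v * dense[c] to output row r — no CSR conversion needed.
    """
    out = [[0.0] * out_cols for _ in range(out_rows)]
    for r, c, v in zip(rows, cols, vals):
        drow = dense[c]
        orow = out[r]
        for j in range(out_cols):
            orow[j] += v * drow[j]
    return out
```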
Summary:
Renewed attempt at https://github.com/pytorch/pytorch/pull/14171
From the original PR:
> Currently, the pin_memory_batch function in the dataloader will return a batch comprised of any unrecognized type without pinning the data, because it doesn't know how.
>
>This behavior was preventing us from overlapping data prefetching in Mask-RCNN, whose custom collate_fn returns a custom batch type.
The old PR allowed the user to implement batch pinning for custom batch and data types by passing a custom pin function to the dataloader. slayton58 suggested a cleaner approach: allow the user to define a `pin_memory` method on their custom types, and have `pin_memory_batch` [check for the presence of that method](https://github.com/pytorch/pytorch/pull/16743/files#diff-9f154cbd884fe654066b1621fad654f3R56) in the incoming batch as a fallback. I've updated the test and docstrings accordingly.
The old PR was merged but then reverted due to weird cuda OOM errors on windows that may or may not have been related. I have no idea why my changes would cause such errors (then or now) but it's something to keep an eye out for.
fmassa and yf225 who were my POCs on the old PR.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16743
Differential Revision: D13991745
Pulled By: ezyang
fbshipit-source-id: 74e71f62a03be453b4caa9f5524e9bc53467fa17
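The duck-typed fallback can be sketched as follows (a simplified, hypothetical version of the dataloader helper — the real one also handles tensors, dicts, and strings):

```python
def pin_memory_batch(batch):
    """Recurse into known containers; for an unrecognized type, use its
    own pin_memory() method if it defines one, else return it untouched."""
    if isinstance(batch, (list, tuple)):
        return type(batch)(pin_memory_batch(b) for b in batch)
    if hasattr(batch, "pin_memory"):
        return batch.pin_memory()
    return batch


class CustomBatch:
    """Example custom batch type opting into pinning via a pin_memory method."""

    def __init__(self):
        self.pinned = False

    def pin_memory(self):
        self.pinned = True
        return self
```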
Summary:
Fixes: https://github.com/pytorch/pytorch/issues/16233
The following changes are made:
- Modify `TupleType` to store optional field names
- Modify schema matching to fill in those field names when creating `TupleType` as the return type.
- Modify codegen of JIT to copy field names to schema string
- Modify `SchemaParser` to set field names of returned schema.
- Modify `SimpleValue::attr` to emit tuple indexing for named tuple.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16253
Reviewed By: ezyang
Differential Revision: D13954298
Pulled By: zdevito
fbshipit-source-id: 247d483d78a0c9c12d1ba36e1f1ec6c3f1a3007b
Summary:
I found a few sentences in DataParallel docstring confusing, so I suggest this enhancement.
- Arbitrary arguments are allowed to be passed .... *INCLUDING* tensors (Not *EXCLUDING*)
- The original author said that "other types" are shallow-copied but I think actually only some builtin types are (effectively) shallow-copied. And "other types" are shared. Here is an example.
```python
import torch
from torch.nn import Module, DataParallel
from collections import deque
class MyModel(Module):
def forward(self, x):
x.append(None)
model = MyModel(); model.cuda()
model = DataParallel(model)
d = deque()
model.forward(d)
print(d)
```
This is a side note.
As far as I know, copying objects is not an especially frequent operation in Python, unlike in some other languages. Notably, no copying is involved in assignment or function parameter passing. They are only name bindings, and that is the whole point of the "everything is an object" Python philosophy, I guess. If one keeps this in mind, it may help when dealing with things like multithreading.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15993
Differential Revision: D14020404
Pulled By: ezyang
fbshipit-source-id: a38689c94d0b8f77be70447f34962d3a7cd25e2e
Summary:
This PR is a simple fix for the mistake in the "tensor" and "torch.Tensor" doc.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16842
Differential Revision: D14020300
Pulled By: ezyang
fbshipit-source-id: 3ab04f1223d6e60f8da578d04d759e385d23acbb
Summary:
This changes the libnvToolsExt dependency to go through CMake find_library.
I have a machine where cuda libs, and libnvToolsExt in particular, are in the "usual library locations". It would be neat if we could find libnvToolsExt and use the path currently hardcoded as default.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16714
Differential Revision: D14020315
Pulled By: ezyang
fbshipit-source-id: 00be27be10b1863ca92fd585f273d50bded850f8
Summary:
The documentation for LogSigmoid says:
> Applies the element-wise function:
> \<blank\>
Now the documentation properly displays the math string.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16900
Differential Revision: D14020097
Pulled By: ezyang
fbshipit-source-id: 41e229d0fcc6b9bb53367be548bf85286dc13546
Summary:
When Variable and Tensor are merged, the dynamic type of the tensors passed to certain functions will become variables, and expecting `type()` on those variables to still return non-Variable types will cause type mismatch error.
One way to fix this problem is to use the thread-local guard `at::AutoNonVariableTypeMode` to force `type()` to return non-Variable type, but ideally we want to limit the use of `at::AutoNonVariableTypeMode` to be only in VariableType.cpp. Another way to fix the problem is to use `at::globalContext().getNonVariableType()` instead to get the non-Variable type of the tensor, which is what this PR is trying to achieve.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16325
Differential Revision: D14012022
Pulled By: yf225
fbshipit-source-id: 77ef1d2a02f78bff0063bdd72596e34046f1e00d
Summary:
This PR is a simple fix for the mistake in the first note for `torch.device` in the "tensor attributes" doc.

```
>>> # You can substitute the torch.device with a string
>>> torch.randn((2,3), 'cuda:1')
```
Above code will cause error like below:
```
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-53-abdfafb67ab1> in <module>()
----> 1 torch.randn((2,3), 'cuda:1')
TypeError: randn() received an invalid combination of arguments - got (tuple, str), but expected one of:
* (tuple of ints size, torch.Generator generator, Tensor out, torch.dtype dtype, torch.layout layout, torch.device device, bool requires_grad)
* (tuple of ints size, Tensor out, torch.dtype dtype, torch.layout layout, torch.device device, bool requires_grad)
```
Simply adding the argument name `device` solves the problem: `torch.randn((2,3), device='cuda:1')`.
However, another concern is that this note seems redundant as **there is already another note covering this usage**:

So maybe it's better to just remove this note?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16839
Reviewed By: ezyang
Differential Revision: D13989209
Pulled By: gchanan
fbshipit-source-id: ac255d52528da053ebfed18125ee6b857865ccaf
Summary:
Post 2.1 release, packing is fixed and alignas works as expected.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16920
Differential Revision: D14018539
Pulled By: bddppq
fbshipit-source-id: 0ed4d9e9f36afb9b970812c3870082fd7f905455
Summary:
It now works post ROCm 2.1 release.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16919
Differential Revision: D14018538
Pulled By: bddppq
fbshipit-source-id: c4e1bafb53204a6d718b2d5054647d5715f23243
Summary:
This is the first round of enabling unit tests that work on ROCm 2.1 in my tests.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16871
Differential Revision: D13997662
Pulled By: bddppq
fbshipit-source-id: d909a3f7dd5fc8f85f126bf0613751c8e4ef949f
Summary:
This was serializing all calls to `addmm` (and any op that used it, in my case `bmm`) in the entire process, and led to downright atrocious performance in the TorchScript threaded runtime. Removing this gives a 2x throughput boost for high-load machine translation inference.
The original justification for this is dubious: there are other `gemm` callsites in the codebase that are not protected by critical sections. And in caffe2 land we never had any issues with nonreentrant BLAS libraries
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16889
Differential Revision: D14008928
Pulled By: jamesr66a
fbshipit-source-id: 498e2133bd6564dba539a2d9751f4e61afbce608
Summary:
Implement the ExpandDims op and fall back to CPU if needed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15264
Differential Revision: D13808797
Pulled By: yinghai
fbshipit-source-id: 7795ec303a46e85f84e5490273db0ec76e8b9374
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16721
The very key line is we have to set the stream to the default
stream before calling the allocator. This is very interesting.
It shouldn't be necessary, but seemingly is!
Reviewed By: dzhulgakov
Differential Revision: D13943193
fbshipit-source-id: c21014917d9fe504fab0ad8abbc025787f559287
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16720
I'm taking the deduplication slowly because there is something here
that is causing problems, and I want to figure out what it is.
Reviewed By: dzhulgakov
Differential Revision: D13943194
fbshipit-source-id: cbc08fee5862fdcb393b9dd5b1d2ac7250f77c4b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16615
This is another go at landing https://github.com/pytorch/pytorch/pull/16226
Now that the caching allocator is moved to c10_cuda, we can
delete the duplicate copy from Caffe2.
The difference between this and the previous PR is that this
version faithfully maintains the binding code; in particular,
we end up with a SECOND copy of the caching allocator in
this patch. I verified that this code does NOT cause a crash
in the workflow we canaried last time.
In further diffs, I plan to eliminate the second copy, and then
adjust the binding code.
Reviewed By: dzhulgakov
Differential Revision: D13901067
fbshipit-source-id: 66331fd4eadffd0a5defb3cea532d5cd07287872
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16867
Some caffe2 operators (for example, BBoxTransform) don't have just one template parameter (the context) but might have multiple template parameters.
Because of this, we can't handle the context parameter inside the macro.
Reviewed By: bwasti
Differential Revision: D13995696
fbshipit-source-id: f55c3be913c8b125445a8d486846fc2fab587a63
Summary:
This PR implements:
1. a fix to issue #12174 - determine the location of cudnn library using `ldconfig`
2. a fix to determine the installed conda packages (in recent versions of conda, the command `conda` is a Bash function that cannot be called from within a python script, so we use the CONDA_EXE environment variable instead)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16859
Differential Revision: D14000399
Pulled By: soumith
fbshipit-source-id: 905658ecacb0ca0587a162fade436de9582d32ab
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16855
Save the histogram of each net to a separate file
Reviewed By: jspark1105
Differential Revision: D13991610
fbshipit-source-id: a5be4e37a5e63567dcd7fdf99f451ee31bb350a5
Summary:
Some batched updates:
1. bool is a type now
2. Early returns are allowed now
3. The beginning of an FAQ section with some guidance on the best way to do GPU training + CPU inference
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16866
Differential Revision: D13996729
Pulled By: suo
fbshipit-source-id: 3b884fd3a4c9632c9697d8f1a5a0e768fc918916
Summary:
During tracing, we record `aten::_convolution` rather than `aten::convolution`. The schema for the former was not present in the shape analysis pass, which resulted in some missing shape information.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16837
Differential Revision: D13993831
Pulled By: jamesr66a
fbshipit-source-id: ebb63bf628d81613258caf773a3af5930303ce5a
Summary:
* we do not need EAP packages any longer as the antistatic feature is now in the release
* consistently install the rccl package
* Skip one unit test that has regressed with 2.1
* Follow-up PRs will use 2.1 features once deployed on CI
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16808
Differential Revision: D13992645
Pulled By: bddppq
fbshipit-source-id: 37ca9a1f104bb140bd2b56d403e32f04c4fbf4f0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16834
Inserting AdjustBatch ops will possibly change the names of the input/output, so we need to create a mapping and use the renamed names for external_inputs/outputs and input_shape_info for the onnxifi_net.
Reviewed By: ipiszy
Differential Revision: D13982731
fbshipit-source-id: c18b8a03d01490162929b2ca30c182d166001626
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16643
The test was disabled in D13908117 because it conflicted with another diff that was about to land.
Now fixed the merge conflict and re-landing it.
Reviewed By: ezyang
Differential Revision: D13911775
fbshipit-source-id: b790f1c3a3f207916eea41ac93bc104d011f629b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16548
With this macro, a caffe2 operator can now directly be registered with c10.
No need to write custom wrapper kernels anymore.
Differential Revision: D13877076
fbshipit-source-id: e56846238c5bb4b1989b79855fd44d5ecf089c9c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16835
We were using the label type `multi_label_dense` to denote both 1) a dense representation of integer labels and 2) embedding labels with floating-point values.
This causes some issues since the two cases have different assumptions; for example, for integer labels we check whether the label value is in [0, number_class - 1], but such a check should be skipped for embedding labels.
Reviewed By: BIT-silence
Differential Revision: D13985048
fbshipit-source-id: 1202cdfeea806eb47647e3f4a1ed9c104f72ad2c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16739
Some ATen code locations seemed to use int, etc. incorrectly where either
int64_t or size_t was required. Update them to use int64_t for dimension indexing where necessary.
Reviewed By: ezyang
Differential Revision: D13950124
fbshipit-source-id: aaf1cef783bf3c657aa03490f2616c35c816679f
Summary:
Discussed with zdevito and we want to use Variable (with `set_requires_grad(false)`) instead of Tensor in all parts of JIT, to eliminate the distinction and the conceptual overhead when trying to figure out which one to use.
This also helps with the Variable/Tensor merge work tracked at https://github.com/pytorch/pytorch/issues/13638, which will make common functions (such as `numel()` / `sizes()` / `dim()`) on Variable much faster when finished.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16596
Differential Revision: D13979971
Pulled By: yf225
fbshipit-source-id: c69119deec5bce0c22809081115f1012fdbb7d5a
Summary:
List of changes:
- Always push the final state of the doc build docker for debugging purposes.
- Adds code for the stable doc build. This code is never actually run on master, only the v1.0.1 branch. There is a big note for this behavior.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16503
Differential Revision: D13972469
Pulled By: zou3519
fbshipit-source-id: 68f459650ef0de200a34edd43fc1372143923972
Summary:
This PR is a follow up of #15460, it did the following things:
* remove the undefined tensor semantic in jit script/tracing mode
* change ATen/JIT schema for at::index and other index related ops with `Tensor?[]` to align with what at::index is really doing and to adopt `optional[tensor]` in JIT
* change python_print to correctly print the exported script
* register both TensorList and ListOfOptionalTensor in JIT ATen ops to support both
* Backward compatibility for `torch.jit.annotate(Tensor, None)`
List of follow ups:
* remove the undefined tensor semantic in jit autograd, autodiff and grad_of
* remove prim::Undefined fully
For easy reviews, please turn on `hide white space changes` in diff settings.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16379
Differential Revision: D13855677
Pulled By: wanchaol
fbshipit-source-id: 0e21c14d7de250c62731227c81bfbfb7b7da20ab
Summary:
This adds calls to `super().__init__()` in three classes in torch.distributions.
This is needed when `Distribution` and `Transform` objects are used with multiple inheritance, as e.g. combined with `torch.nn.Module`s. For example
```py
class MyModule(torch.distributions.Transform, torch.nn.Module):
...
```
cc martinjankowiak esling who have wanted to use this pattern, e.g. in #16756
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16772
Differential Revision: D13978633
Pulled By: soumith
fbshipit-source-id: 8bc6cca1747cd74d32135ee2fe588bba2ea796f1
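The cooperative-init pattern the fix enables looks like this with toy classes (stand-ins for `Transform` and `nn.Module`, not the real torch classes):

```python
class Transform:
    """Stand-in base; super().__init__() keeps the MRO chain going."""

    def __init__(self):
        self.cache = {}
        super().__init__()  # without this call, Module.__init__ is skipped


class Module:
    """Stand-in for the other base in a multiple-inheritance setup."""

    def __init__(self):
        self.training = True
        super().__init__()


class MyTransformModule(Transform, Module):
    pass
```

With the `super().__init__()` calls in place, constructing `MyTransformModule` runs both bases' initializers in MRO order; dropping the call in `Transform` would leave `training` unset.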
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16811
As the title. The AdjustBatch ops will be inserted before and after the Onnxifi op to:
1) adjust batch/seq sizes to the ideal batch/seq size before these tensors are processed by the Onnxifi op;
2) adjust batch size to the original batch size for batches generated by the Onnxifi op.
Reviewed By: yinghai
Differential Revision: D13967711
fbshipit-source-id: 471b25ae6a60bf5b7ebee1de6449e0389b6cafff
Summary:
Emphasize that the docker build should be triggered from the pytorch repo directory.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16812
Differential Revision: D13985531
Pulled By: soumith
fbshipit-source-id: c6511d1e81476eb795b37fb0ad23e8951dbca617
Summary:
Update launch configs for TensorIterator gpu_reduce_kernel. Enable flexible
block dimension to improve efficiency for reduction cases with small fast
dimension.
Previously TensorIterator launches blocks with fixed 32x16 threads.
For cases like:
import torch
torch.randn(2**20, 4, device='cuda').sum(0)
The fixed launch config does not handle coalesced memory access efficiently.
The updated launch config enables flexible block dimensions. Combined with the
improved reduction scheme (using flexible vertical / horizontal reduction
instead of limited warp / block reduction in the old code), it ensures optimal
memory access pattern even with reduction on dimension with small stride.
Possible future improvements:
1. Precise dynamic shared memory allocation.
2. Using warp shuffle for vertical (block_y) reduction.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16224
Differential Revision: D13806753
Pulled By: soumith
fbshipit-source-id: 37e45c7767b5748cf9ecf894fad306e040e2f79f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16535
There is now no need anymore to define the layer norm schema in a central location.
It can just be defined in caffe2 next to the kernel implementation.
Reviewed By: ezyang
Differential Revision: D13869503
fbshipit-source-id: c478153f8fd712ff6d507c794500286eb3583149
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16534
All c10 ops from the c10 dispatcher are now automatically registered with JIT
Reviewed By: dzhulgakov
Differential Revision: D13869275
fbshipit-source-id: 5ab5dec5b983fe661f977f9d29d8036768cdcab6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16676
This op is used for changing batch size (first dimension) of the tensor.
Reviewed By: bertmaher, ipiszy
Differential Revision: D13929200
fbshipit-source-id: 4f2c3faec072d468be8301bf00c80d33adb3b5b3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16820
Sometimes parsing the histogram was not working correctly due to changes in D13633256.
We need to call istringstream::clear() after str().
Reviewed By: csummersea
Differential Revision: D13977509
fbshipit-source-id: ce3e8cb390641d8f0b5c9a7d6d6daadffeddbe11
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16785
There's no EIGEN engine implemented for DeformConv but unit test was checking it.
Reviewed By: BIT-silence
Differential Revision: D13967306
fbshipit-source-id: e29c19f59f5700fc0501c59f45d60443b87ffedc
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16597
This diff fixes some bugs in shape inference for `SparseLengthsSumFused8BitRowwise`. It also adds input shape inference for `Concat` when `add_axis=1`.
Reviewed By: bertmaher
Differential Revision: D13892452
fbshipit-source-id: 6cd95697a6fabe6d78a5ce3cb749a3a1e51c68e7
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16765
Code changes required to build caffe2 for windows with toolchain used by FB.
Reviewed By: orionr
Differential Revision: D13953258
fbshipit-source-id: 651823ec9d81ac70e32d4cce5bc2472434104733
Summary:
Adding torch/lib64 in .gitignore so that a git status --porcelain
check during CI build and test passes for ppc64le. During build
torch/lib64 is created and contains third-party libraries. This
should be ignored by the porcelain check
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16782
Differential Revision: D13972794
Pulled By: ezyang
fbshipit-source-id: 5459c524eca42d396ac46e756a327980b4b1fa53
Summary:
Since pip 18.0 (2018-07-22), `legacy` is no longer a valid choice for `pip list --format` as can be seen in the [Release Notes](https://pip.pypa.io/en/stable/news/#id62). Therefore, the options now are: `columns`, `freeze` and `json`. With `legacy`, this is how it looked:
```
[...]
Versions of relevant libraries:
[pip3] numpy (1.16.1)
[pip3] torch (1.0.1)
[pip3] torchvision (0.2.1)
[...]
```
Changing to `freeze`, this is how it looks:
```
[...]
Versions of relevant libraries:
[pip3] numpy==1.16.1
[pip3] torch==1.0.1
[pip3] torchvision==0.2.1
[...]
```
Currently, this is what happens:
```
[...]
Versions of relevant libraries:
[pip] Could not collect
[...]
```
The `freeze` option is also available in old pip, so this change is backwards compatible. Also, if we would like to keep the old style, which I think is not necessary, I could easily change that.
---
In case anyone wants to know how `columns` looks (I prefer `freeze`):
```
[...]
Versions of relevant libraries:
[pip3] numpy 1.16.1
[pip3] torch 1.0.1
[pip3] torchvision 0.2.1
[...]
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16798
Differential Revision: D13971793
Pulled By: soumith
fbshipit-source-id: 3721d9079a2afa245e1185f725598901185ea4cd
Summary:
(review top commit only).
As expected, fork/wait introduces some corner cases into the alias analysis. The comments inline should describe the changes.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16671
Differential Revision: D13963219
Pulled By: suo
fbshipit-source-id: 2bec6fc03a4989cf309fbb9473f3f2ffe2c31431
Summary:
Doubling the sccache timeout from default of 600.
the asan build of #16645 will fail without this change.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16728
Differential Revision: D13963727
Pulled By: li-roy
fbshipit-source-id: 3614d75c1b46d663fa05b84f99d8a099283a8e64
Summary:
In #16085 , we introduced initial hip-clang bring-up code. Document the use of the __HIP__ macro now.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16771
Differential Revision: D13961538
Pulled By: ezyang
fbshipit-source-id: 67f6226abcbe62e2f4efc291c84652199c464ca6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16751
This was made more complicated by the fact that ivalue::IntList
is a thing, so I had to fix all of the sites where we were referring
to IValue post facto.
The following codemods were run, in this order:
```
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in IntList IntArrayRef
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in IntArrayRef::create IntList::create
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in ivalue::IntArrayRef ivalue::IntList
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in Tag::IntArrayRef Tag::IntList
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in isIntArrayRef isIntList
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in toIntArrayRef toIntList
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in 'Shared<IntArrayRef>' 'Shared<IntList>'
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in 'intrusive_ptr<IntArrayRef>' 'intrusive_ptr<IntList>'
```
Some manual fixups were done afterwards; they can be reviewed separately
at https://github.com/pytorch/pytorch/pull/16752
Reviewed By: dzhulgakov
Differential Revision: D13954363
fbshipit-source-id: b5c40aacba042402155a2f5a229fa6db7992ac64
Summary:
Adds some operations for dicts to match Python and tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16629
Differential Revision: D13961144
Pulled By: driazati
fbshipit-source-id: b31f27a4320ff62cd118b508fb0a13056535dc7c
Summary:
There is no way to test this until it is merged.
On master jobs that run after a PR is merged, there is no CIRCLE_PR_NUMBER so the binary builds clone pytorch/pytorch/master, which races.
Based off of https://circleci.com/docs/2.0/env-vars/ and the circleci checkout code
```
git config --global url."ssh://git@github.com".insteadOf "https://github.com" || true
git config --global gc.auto 0 || true
if [ -e /home/circleci/project/.git ]
then
cd /home/circleci/project
git remote set-url origin "$CIRCLE_REPOSITORY_URL" || true
else
mkdir -p /home/circleci/project
cd /home/circleci/project
git clone "$CIRCLE_REPOSITORY_URL" .
fi
if [ -n "$CIRCLE_TAG" ]
then
git fetch --force origin "refs/tags/${CIRCLE_TAG}"
else
git fetch --force origin "master:remotes/origin/master"
fi
if [ -n "$CIRCLE_TAG" ]
then
git reset --hard "$CIRCLE_SHA1"
git checkout -q "$CIRCLE_TAG"
elif [ -n "$CIRCLE_BRANCH" ]
then
git reset --hard "$CIRCLE_SHA1"
git checkout -q -B "$CIRCLE_BRANCH"
fi
git reset --hard "$CIRCLE_SHA1"
```
I believe we do not use git tags.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16773
Differential Revision: D13962132
Pulled By: pjh5
fbshipit-source-id: c62d2139f38ff39ecda1509b0bcd8bd102828e40
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16478
This diff includes an example registration of a caffe2 op in torch. A previous attempt ran into a static initialization order bug.
Reviewed By: smessmer
Differential Revision: D13854304
fbshipit-source-id: ec463ce2272126d08a5163d1599361ee5b718bbc
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16730
With Jerry's new updates, Tensor must be defined; as a result, I've needed to update the shim for caffe2 ops being used in PyTorch.
Reviewed By: smessmer
Differential Revision: D13946950
fbshipit-source-id: 6f77877c61a743f82bdfc2ad04d6ab583000cc18
Summary:
Fixes#16591
This uses uniqueBaseName so that parameters do not end up with suffixes. It changes next_id to be per-base-name rather than global to fix jittering issues when re-importing a re-numbered graph.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16750
Differential Revision: D13960282
Pulled By: zdevito
fbshipit-source-id: 2156f581d9b95d77bf1f1252074e800b19116555
Summary:
This should enable xla tests thus let master xla tests pass.
As usual, I will add the branch filters back before landing.
Thanks ezyang !
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16761
Differential Revision: D13959746
Pulled By: ailzhang
fbshipit-source-id: 7384da281d093d16edccb4283c74e47ac659eeff
Summary:
I'll test with this really long summary.
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Fusce risus sem, mattis vitae commodo vitae, mattis vel ex. Integer nec consectetur ligula, sit amet ultricies risus. Suspendisse potenti. Donec aliquet quam ante. Donec porttitor justo ligula, ut vestibulum erat facilisis a. Nullam eget lobortis nisi. Aenean quis sem id ante eleifend condimentum nec a lacus. Sed sed dolor augue. Proin feugiat, tellus in eleifend cursus, libero nulla lacinia erat, et efficitur dui odio ut ex. In et sem purus. Proin dictum scelerisque magna, nec feugiat dolor lobortis id. Proin ante urna, ultrices in semper et, pulvinar et dui. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia Curae; Mauris ullamcorper neque a pharetra rhoncus.
Aliquam vel semper felis. Integer id massa erat. Morbi leo eros, varius sed viverra eu, dictum nec purus. Fusce vitae mollis sem, non fringilla nulla. Donec tincidunt luctus dolor. Morbi lobortis, magna quis viverra bibendum, lacus tortor pulvinar risus, eu porta tellus nulla vitae dolor. Sed tincidunt, turpis quis facilisis malesuada, nulla eros lobortis lorem, a fermentum mi nisl non quam. Pellentesque vehicula, nisl non eleifend viverra, tellus neque accumsan tellus, id ultricies lacus mi sed sapien. Proin rutrum ultrices quam sit amet euismod. Maecenas vel faucibus libero, nec efficitur mi. Proin felis augue, elementum eget vestibulum non, euismod sed urna. Curabitur purus nisi, interdum nec rutrum id, faucibus nec sapien. Integer consectetur interdum elit, volutpat vulputate velit. Integer et ultricies magna. Fusce blandit lorem urna, quis sodales sapien porttitor in. Nulla nec sodales sem.
Morbi consequat massa sit amet fringilla pretium. Nunc maximus vitae neque auctor pharetra. Morbi gravida feugiat urna, eu sagittis est pulvinar eget. Maecenas ut fermentum ante, eget malesuada neque. In ut maximus magna. Donec nec finibus sapien. Quisque viverra erat lobortis, rhoncus augue sed, hendrerit dui. Donec in feugiat augue, a ultrices justo. Pellentesque rutrum augue sed nulla auctor, a venenatis risus aliquam. Nullam ipsum justo, dictum sit amet elementum eu, eleifend a turpis. Proin ut tellus ut urna volutpat fermentum ac aliquam tellus.
Quisque ultricies est id eros dictum ultrices. Cras eu urna interdum, eleifend felis vitae, vulputate nulla. Cras tincidunt, mi sodales imperdiet tristique, diam odio convallis ligula, ac vulputate enim sapien eu tellus. Phasellus eleifend finibus sapien id ullamcorper. Donec aliquet eleifend consectetur. Proin in nulla venenatis, egestas neque quis, blandit sem. Suspendisse pellentesque arcu vel ligula fermentum maximus. Aliquam non ipsum ut ante pharetra finibus.
Nunc rhoncus purus sit amet risus congue venenatis. Integer id vestibulum neque, et fermentum elit. Nunc sit amet tortor quis mi aliquam vestibulum et in mauris. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Maecenas mollis hendrerit nulla, non tempus neque pharetra ac. Proin commodo bibendum velit, consectetur pretium metus sollicitudin eget. Aliquam malesuada semper tempor. Ut vel vulputate dolor, eu faucibus mauris. Nam commodo quis dolor sit amet eleifend. Phasellus eget massa odio. Donec tempor est at ante finibus lobortis. Suspendisse porttitor imperdiet ultrices. Orci varius natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus.
Nullam id dignissim magna, non suscipit odio. Vestibulum vel maximus erat, suscipit ullamcorper tellus. Fusce egestas augue lorem, in ultricies est vehicula ac. Integer pretium, ex in elementum varius, nisi turpis posuere lectus, nec posuere ligula mi ac ligula. Donec vehicula dolor ut ex elementum, quis scelerisque tellus molestie. Mauris euismod magna ac ornare cursus. Vivamus dapibus quam nec tellus aliquam elementum.
Phasellus ultricies quis augue ut fringilla. Suspendisse eu molestie eros. Suspendisse potenti. Curabitur varius sodales maximus. Etiam nec rutrum est. Sed vulputate suscipit elit, eu condimentum mauris pretium eget. Curabitur convallis commodo dui. Aenean lectus orci, pretium non mi sit amet, commodo imperdiet dui. In hac habitasse platea dictumst. In et ex nisl. Duis justo tortor, finibus at augue vitae, fermentum hendrerit tellus. Donec malesuada justo a molestie posuere. Morbi nisl leo, feugiat ut faucibus ut, mattis id purus.
Vestibulum hendrerit lorem ligula, et ullamcorper nisl lacinia sed. Integer vitae lacinia nunc, sed interdum enim. Aliquam aliquet ipsum vitae eros ornare accumsan. Phasellus venenatis laoreet est, sed feugiat neque lobortis id. Proin pulvinar placerat leo lacinia vehicula. Duis accumsan semper lobortis. Donec elementum nunc non quam aliquam, rutrum fringilla justo interdum. Morbi pulvinar pellentesque massa vitae maximus. Cras condimentum aliquam massa, et pellentesque lorem dictum a. Vivamus at dignissim justo. Donec ligula dui, tempus vestibulum est vel, rutrum blandit arcu. Vivamus iaculis molestie neque in elementum. Sed convallis tempus quam non elementum. Nulla euismod lobortis ligula. Etiam ac mauris eget magna posuere ornare id vitae felis. Nunc efficitur lorem et euismod porttitor.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16766
Differential Revision: D13959962
Pulled By: pjh5
fbshipit-source-id: 9b71bdf981d4fda9d8951e2d183db81f349b7f81
Summary:
-In the case where an operator does not support a given data type,
an error message is emitted to alert the user; this message was
incorrectly structured. This commit adds to and rearranges the
error message to make it a little clearer.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16537
Differential Revision: D13958859
Pulled By: zou3519
fbshipit-source-id: 935fc3adcef2f969042b1db902c9ec004488ea9c
Summary:
As the comment indicates, the issue is only present in some versions of
Python 2, so we should be able to use heavily optimized PyTuple_Check in
most cases, and skip allocation of the strings, and unnecessary lookups
on object's type.
cc ezyang zasdfgbnm
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16657
Differential Revision: D13957854
Pulled By: ezyang
fbshipit-source-id: be32eb473ad77a0805e8247d8d583d673d4bdf25
Summary:
This PR updates the logic for using cudnnGet* and cudnnFind*. Current version of cudnn find and get (v7) returns a pair of best algorithm and the convDesc mathType. While we were using the returned algorithm, we didn't update the mathType. As a result, we ended up with a slow choice of algorithm and math type. Without this patch, we are seeing a 10x regression in group convolutions.
Changelist:
- Changed the template arguments to be `perf_t` instead of `algo_t` to unify cudnnFind and cudnnGet. Both cudnnFind and cudnnGet have the same purpose and hence, it made sense to unify them and get rid of `getAlgorithm`.
- Used cudnnGet*_v7 everywhere cudnnGet* was being used.
- Removed all cudnn6 paths (This PR depends on https://github.com/pytorch/pytorch/pull/15851)
Differential Revision: D13957944
Pulled By: ezyang
fbshipit-source-id: a88c39d80ae37f2d686665622302b62b50fab404
Summary:
Move `logsumexp` and `max_values` to `TensorIterator` and use it to make `logsumexp` work for multiple dimensions.
Timings on a tensor of shape `(10,1000000,10)`, for each combination of (cpu, single-threaded cpu, gpu) and dimension:
**before**
208 ms ± 2.72 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
279 ms ± 5.07 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
199 ms ± 2.64 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
1.11 s ± 33.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
1.25 s ± 25.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
1.11 s ± 6.83 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
15.4 ms ± 1.02 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
132 ms ± 30.1 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
39.6 ms ± 19.1 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
**after**
199 ms ± 8.23 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
307 ms ± 8.73 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
207 ms ± 7.62 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
1.16 s ± 8.92 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
1.26 s ± 47.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
1.13 s ± 13.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
15.4 ms ± 868 ns per loop (mean ± std. dev. of 7 runs, 100 loops each)
132 ms ± 27.6 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
39.6 ms ± 21.8 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
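The reduction being moved is, mathematically, the numerically stable log-sum-exp; a plain-Python sketch (stdlib only — not the TensorIterator implementation, just the math the kernel computes):

```python
import math

def logsumexp(xs):
    """Stable log(sum(exp(x))): shift by the max before exponentiating."""
    m = max(xs)
    if math.isinf(m):            # all -inf: avoid nan from (-inf) - (-inf)
        return m
    return m + math.log(sum(math.exp(x - m) for x in xs))

# Reducing over multiple dimensions is just flattening those dims first:
data = [[1000.0, 1000.0], [1000.0, 1000.0]]
flat = [x for row in data for x in row]
print(logsumexp(flat))   # ~1001.386; naive exp(1000) would overflow
```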
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16475
Differential Revision: D13855746
Pulled By: umanwizard
fbshipit-source-id: aaacc0b967c3f89073487e1952ae6f76b7bd7ad3
Summary:
Pull Request resolved: https://github.com/pytorch/translate/pull/309
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16481
This gives us a boolean flag `quantize` on the `BeamSearch` module that allows us to apply FBGEMM quantization to a pretrained PyTorch model and export this to PyTorch native runtime.
Reviewed By: jmp84
Differential Revision: D13514776
fbshipit-source-id: 3f7cbff0782aae54c9623ad1ea7e66d7f49e2b32
Summary:
Pull Request resolved: https://github.com/pytorch/translate/pull/310
This adds fork/join parallelism to the EncoderEnsemble and DecoderBatchedStepEnsemble models. Note that when run in Python, these calls are no-ops, and similarly we remove these calls before exporting to ONNX. But when we run in the PyTorch native runtime, we will now have the opportunity to run these sections in parallel.
Benchmark validation is pending me slogging through FBLearner Flow issues, as usual
Reviewed By: jmp84
Differential Revision: D13827861
fbshipit-source-id: 0cb9df6e10c0ba64a6b81fa374e077bce90f1d5b
Summary:
This PR reworks the mutability API to be simpler (updates passes to use "mayAlias" calls) and improves the caching logic.
The difference is that we now directly express the idea of a "memory location." Leaves in the alias tracker's points-to graph are considered unique memory locations, and mayAlias questions boil down to whether two values share a leaf.
To speed up queries, some basic path compression has been added.
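A toy sketch of the "shared leaf memory location" idea, with a leaf-set cache standing in for the speedup described above (all names are hypothetical, not the JIT's actual AliasTracker API):

```python
class AliasTracker:
    """Toy points-to graph: leaves are unique memory locations, and two
    values may alias iff they can reach a common leaf. Sketch only."""

    def __init__(self):
        self.points_to = {}      # value -> set of values it may point to
        self._leaf_cache = {}    # memoized leaf sets to speed up queries

    def add_edge(self, frm, to):
        self.points_to.setdefault(frm, set()).add(to)
        self.points_to.setdefault(to, set())
        self._leaf_cache.clear()          # graph changed: invalidate cache

    def leaves(self, v):
        if v in self._leaf_cache:
            return self._leaf_cache[v]
        tos = self.points_to.get(v, set())
        # A value with no outgoing edges is its own memory location.
        result = {v} if not tos else set().union(*(self.leaves(t) for t in tos))
        self._leaf_cache[v] = result
        return result

    def may_alias(self, a, b):
        return bool(self.leaves(a) & self.leaves(b))

t = AliasTracker()
t.add_edge("view", "tensor")     # a view points into tensor's storage
t.add_edge("other", "other_buf")
print(t.may_alias("view", "tensor"))   # shared leaf -> True
print(t.may_alias("view", "other"))    # disjoint leaves -> False
```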
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16605
Differential Revision: D13952738
Pulled By: suo
fbshipit-source-id: cfc7fb2b23369f1dc425d1d8ca2c753c193d95dd
Summary:
The idea is to unify the environment variables `JOB_BASE_NAME` and `BUILD_ENVIRONMENT`, which controlled the PyTorch and Caffe2 jobs respectively. In this commit, we have converted all the `JOB_BASE_NAME` references in _.jenkins/pytorch/*_ files to `BUILD_ENVIRONMENT`, then did the same thing in _.circleci/config.yml_. One thing we needed to be careful about was when both `BUILD_ENVIRONMENT` and `JOB_BASE_NAME` were present under the same declaration in the _config.yml_ file (e.g., for "caffe2-" stuffs). To ensure that all "==" checks work as expected, we also had to add "*" in some if conditions in the _.jenkins/caffe2/build.sh_ file. Finally, removed the "-build", "-test", etc. suffixes from the `COMPACT_JOB_NAME` variable assignment in the bash script files in the _.jenkins/pytorch_ folder, e.g., modified `COMPACT_JOB_NAME="${BUILD_ENVIRONMENT}-build"` to `COMPACT_JOB_NAME="${BUILD_ENVIRONMENT}"`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16649
Differential Revision: D13946392
Pulled By: mmh683
fbshipit-source-id: 790de6abf96de184758e395c9098a50998e05bc5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16673
Replace resize_dim() with set_sizes_and_strides() in THTensor_(unsqueeze1d) in aten/src/TH/generic/THTensor.cpp, as described in T38058642.
Reviewed By: ezyang
Differential Revision: D13928879
fbshipit-source-id: d593cebcc82589cd362ac78884d4e367d0da0ce6
Summary:
Just noticed while building on a machine without cudnn present - it was building but the runtime failed since some methods weren't bound
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16701
Differential Revision: D13937247
Pulled By: dzhulgakov
fbshipit-source-id: c81f05be7a9e64a1a8591036dcf8692c0ed4064e
Summary:
The op was implicitly relying on pos_to_output being zero-initialized after extending. We're removing this functionality from the allocator, so we fix it here. For some reason it wasn't spotted by junk-initialization but was reliably reproducible with standard malloc() if both junk_fill and zero_fill flags are turned off.
cc kittipatv jerryzh168
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16702
Reviewed By: kittipatv
Differential Revision: D13937257
Pulled By: dzhulgakov
fbshipit-source-id: 3ee520b05467108e6c3e64eb3e6c60589bdf3d87
Summary:
This bump includes:
* Memory leak fix where the Gloo transport would hold on to auxiliary
structures for send/recv pairs after they finished.
* Fix write-after-free from Gloo thread during stack unwinding on error.
* Removal of the PATENTS file.
Fixes#16144.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16638
Differential Revision: D13937950
Pulled By: pietern
fbshipit-source-id: 3cfecaf13ee0f214c06681386557a4b1c3e1d6b9
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16524
- Make it exception safe. When an exception happens during write, the old state is recovered.
- Use RAII instead of try/catch to increment counters in readers. This is more readable, and it also makes it work with reader closures that return void, which previously didn't work because the reader return value was stored on the stack.
- Assert there's no reads or writes happening when it's destructed to avoid destruction race conditions
- Explain the algorithm in detail in comments
- Add test cases
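The RAII pattern described in the second bullet maps naturally onto a context manager in Python; a hedged sketch (illustrative only, not the actual C++ reader/writer code):

```python
import threading
from contextlib import contextmanager

class ReadCounter:
    """Sketch of RAII-style reader counting: the counter is always
    decremented on scope exit, even if the reader body throws, and
    the reader's return value never has to pass through the guard."""

    def __init__(self):
        self._lock = threading.Lock()
        self.readers = 0

    @contextmanager
    def reading(self):
        with self._lock:
            self.readers += 1
        try:
            yield
        finally:                      # runs on normal exit AND on exception
            with self._lock:
                self.readers -= 1

c = ReadCounter()
try:
    with c.reading():
        raise RuntimeError("reader failed")
except RuntimeError:
    pass
print(c.readers)   # 0: the count was restored despite the exception
```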
Reviewed By: ezyang
Differential Revision: D13866609
fbshipit-source-id: 01306a282a3f555569caa13d8041486f960d00e2
Summary:
When trying to get a test to pass I was missing an exclamation mark. Instead now I just use a different function in the conditional
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16686
Differential Revision: D13935182
Pulled By: jamesr66a
fbshipit-source-id: 7525a1a829276641dbafe06734f03f6202df6b22
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16648
We added onnxGraph sharing keyed on model id and net seq number, but we forgot to supply this info to Onnxifi. Therefore, we would only ever create ONE onnxGraph... This diff adds the necessary info to the OnnxifiOp to prevent this from happening.
Reviewed By: bertmaher, rdzhabarov
Differential Revision: D13912356
fbshipit-source-id: fe8982327287a35f32fe3b125d94b617d18c0ab5
Summary:
Adds a decorator `torch.jit.ignore` for Python functions that tells the compiler to skip over these Python values, putting a `prim::Error` in their place which always throws an exception when run.
This lets you have Python-only code in your model in an explicit way, which is useful for debugging, and still be able to save/load the model.
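The stub-that-throws behavior can be sketched in plain Python (a hypothetical decorator, not the real `torch.jit.ignore` implementation):

```python
def ignore(fn):
    """Sketch of the semantics described above: the compiled model keeps
    a stub that always raises, so the model can still be saved/loaded."""
    def stub(*args, **kwargs):
        raise RuntimeError(
            f"{fn.__name__} was ignored during scripting and cannot be run")
    stub.__name__ = fn.__name__
    return stub

@ignore
def debug_print(x):
    print("debugging:", x)     # Python-only code, skipped by the compiler

try:
    debug_print(42)
    raised = False
except RuntimeError as e:
    raised = True
    print(e)
```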
Fixes#15815
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16055
Differential Revision: D13797286
Pulled By: driazati
fbshipit-source-id: 29d36776608ec101649a702952fc6ff3c27655b1
Summary:
Add winograd conv method. Users can select the direct conv or winograd conv in the model file.
We close the origin pr https://github.com/pytorch/pytorch/pull/12154 and create this new one for better rebasing.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15196
Differential Revision: D13463721
Pulled By: yinghai
fbshipit-source-id: c5cd5c8aa7622ae7e52aeabd3dbb8ffb99b9b4ee
Summary:
Previously this would fail with the error message:
```
ValueError: Auto nesting doesn't know how to process an input object of type dict. Accepted types: Tensors, or lists/tuples of them
```
Turns out we're not using the line that causes this error (or a side effect of that line), so removing it fixes the issue. Also cleaned up some related dead code (cc apaszke to make sure the code isn't useful in some way)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16616
Differential Revision: D13908352
Pulled By: suo
fbshipit-source-id: 27094f1f4ea0af215b901f7ed3520e94fbc587b3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16625
This is a squash of multiple PRs that refactored the old c10 dispatcher into a new one that follows the c10 dispatcher design doc.
It is now unboxed and follows the Stack semantics from JIT. It also uses the runtime JIT schema instead of its own compile time schema definitions.
Reviewed By: ezyang
Differential Revision: D13907069
fbshipit-source-id: edcc4806ccd21474fdfb5a98516219b1956db13d
Summary:
There is a regression in cudnnGet*_v7 that causes slowdown in resnet50 training. I am opening a bug with cuDNN team about this. This reverts commit 38374468832e307ca741901870914857a836dd5d.
ezyang 😿
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16484
Differential Revision: D13924755
Pulled By: soumith
fbshipit-source-id: 8c719345fc443f1289539bfae630eea9224ba4a5
Summary:
Adds better bounds checks for target lengths in CTC loss, checks for integral types for target and prediction lengths, and adds tests for each, according to #15946
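A hedged sketch of the kind of checks described (function and parameter names are hypothetical, not the C++ kernel's API):

```python
def check_ctc_lengths(target_lengths, max_target_length):
    """Lengths must be integral and must fit inside the padded target."""
    for i, length in enumerate(target_lengths):
        if not isinstance(length, int):
            raise TypeError(f"target_lengths[{i}] must be integral, "
                            f"got {type(length).__name__}")
        if not 0 <= length <= max_target_length:
            raise ValueError(f"target_lengths[{i}]={length} is out of "
                             f"range [0, {max_target_length}]")

check_ctc_lengths([3, 5], max_target_length=5)       # passes
try:
    check_ctc_lengths([3, 6], max_target_length=5)   # 6 exceeds padding
    in_bounds = True
except ValueError as e:
    in_bounds = False
    print(e)
```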
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16269
Differential Revision: D13847567
Pulled By: ezyang
fbshipit-source-id: 5d7a975565e02baf78fe388813a1d1ef56dfb212
Summary:
-Skip the test due to flaky behavior on AMD/Rocm
-The fix is expected in Rocm 2.2 (HSA runtime)
bddppq
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16639
Differential Revision: D13915231
Pulled By: bddppq
fbshipit-source-id: 66e1d275836337170b15ceb9d60cfdd3242d4df8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16620
LogfiledbNetLoader loads all external input blobs into a workspace instance; we pack a shared pointer to this loaded workspace into the SingleLoadedNetSupplier.
SingleLoadedNetSupplier will pass this workspace to BlackBoxPredictor to be executed. (D13891759 is a WIP of how it all comes together)
Reviewed By: pjh5
Differential Revision: D13901467
fbshipit-source-id: 20589f898922f5f1aec50be131dad17a8c38e9b2
Summary:
Resolves#15863
Changed the documentation for MultiLabelSoftMarginLoss and MultiLabelMarginLoss to be more explicit about the `target` format.
More than happy to change the messaging based on discussion.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16644
Differential Revision: D13912395
Pulled By: soumith
fbshipit-source-id: 24a3c214c5f6f9d043e25b13ac758c1c1211b641
Summary:
We inadvertently switched the OSX build over to ninja on CI. It then fails to respect MAX_JOBS and hits the same sccache deadlock bug; this makes the ninja build respect MAX_JOBS.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16641
Differential Revision: D13910751
Pulled By: zdevito
fbshipit-source-id: 61bec500539519b019b74421a13cd87fc1d86090
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16513
compare_exchange_deleter makes it easier to replace a
deleter on a DataPtr with a new one, without requiring
allocating another closure to hold the old deleter.
See comment for details.
This diff was originally landed as part of D13762540
(#16226) but we are reverting that diff D13863610 (#16510)
Reviewed By: smessmer
Differential Revision: D13864245
fbshipit-source-id: 56eda4748238dd3a5130ba6434fda463fe7c690e
Summary:
So that things like below can be JITable, and available in C++ API:
```python
import torch
@torch.jit.script
def f(x, y, z):
    return x.index_add(0, y, z)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12413
Differential Revision: D13899948
Pulled By: suo
fbshipit-source-id: b0006b4bee2d1085c813733e1037e2dcde4ce626
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16630
Two PRs landed concurrently: enforcing tensor constraints and refactoring c10. Since it's not prod code, disable the test and I'll let Sebastian fix it properly.
Reviewed By: ezyang
Differential Revision: D13908117
fbshipit-source-id: 381c5626078b794afa1fc7a95cb1ea529650424c
Summary:
I went through my build log and did what I thought were reasonable fixes to all the C++ compilation warnings that came up
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16411
Differential Revision: D13901006
Pulled By: jamesr66a
fbshipit-source-id: 02df4e3e5a5c8dd9e69ac9f065cd3f2a80645033
Summary:
This PR adds basic support (creation and indexing) for immutable dictionaries in Script. This includes Python/string frontend support and a `IValue::GenericDict` type backed by a `std::unordered_map`. Only `str`, `int`, and `float` are supported as keys, any type can be a value. Structure is pretty similar to list.
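The key-type restriction can be sketched as a small validator (a hypothetical helper, not the compiler's actual type checking):

```python
ALLOWED_KEY_TYPES = (str, int, float)   # keys a Script dict may use

def check_script_dict(d):
    """Reject dicts whose keys a Script GenericDict could not hold;
    values can be of any type."""
    for k in d:
        if not isinstance(k, ALLOWED_KEY_TYPES):
            raise TypeError(f"unsupported Script dict key type: "
                            f"{type(k).__name__}")
    return d

ok = check_script_dict({"a": [1, 2], 3: "x"})   # str and int keys: fine
try:
    check_script_dict({(1, 2): "bad"})          # tuple keys: rejected
    accepted = True
except TypeError as e:
    accepted = False
    print(e)
```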
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16208
Differential Revision: D13881686
Pulled By: driazati
fbshipit-source-id: 29ce9835b953c3456f57bcc2bbdf7fe0cbf941c0
Summary:
so that it's included in the hashed key that decides whether to call Find or not. This is required to ensure that Find is run for all devices.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16613
Differential Revision: D13901769
Pulled By: bddppq
fbshipit-source-id: 7d29ea9e40231cd4eef80847afa1307efeb0945c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16514
Original commit changeset: dc371697f14b
Relanding https://github.com/pytorch/pytorch/pull/15860 - the problem was that layer_norm was using at::empty which is not yet on mobile
Reviewed By: ezyang
Differential Revision: D13861480
fbshipit-source-id: e2116da32bc117175c96b9151b1beba9b31eff36
Summary:
This simplifies the process for building on windows, since users no longer have to find and run the vcvarsall.bat file.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16540
Differential Revision: D13893596
Pulled By: zdevito
fbshipit-source-id: 79b7ad55c3251b3f573fd8464931138f8a52dd1d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16473
This resolves the issues associated with caffe2 initialization (specifically the REGISTER_FUNCTION_SCHEMA_OPERATOR calls) being run after Torch's static op registration calls.
The fix employs a meyer's singleton wrapped by the constructor of a type. Everything is placed inside a macro to make it easier for users to use.
Reviewed By: smessmer
Differential Revision: D13854306
fbshipit-source-id: ecf60861f229532826fae254974e9af4389055df
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16576
allows instantiation of operator with arguments passed by move rather than explicit copies
per Sebastian's suggestion
Reviewed By: smessmer
Differential Revision: D13882416
fbshipit-source-id: bc8d50e73f5a1ae87155b0cf96799b8573a7a8fa
Summary:
Here is a fresh attempt at getting some fusion back in autodiff-generated graphs in the presence of SumToSize.
- The sum to size operator is now `aten::_grad_sum_to_size` to allow symbolic script differentiation (and that in turn would need to use this in place of sum_to_size to signal that it strictly operates on gradients). This is also used in the autodiff code, replacing `prim::SumToSize`.
- `_grad_sum_to_size` is now fusable, `cat`s - which are fused afterwards thanks to Adam's simplification of the code - are only fused if there is no `_grad_sum_to_size` in the fusion group.
- I push the `_grad_sum_to_size` out of the fusion group when compiling and record the desired summations in the KernelSpec. The reasoning is the following:
- As the autodiff is a repeated application of the chain rule, we always have the pattern `grad_in = mm(A, grad_out)`, with A often diagonal for cases interesting to the fuser, whence it is `grad_in = a * grad_out` (a pointwise multiplication). We know that only `grad_out` may have AutodiffGradSumToSize applied, so we can commute AutodiffGradSumToSize with the `mul` (and `div` and `neg` are of similar origin).
- For `type_as` the gradient might be giving the type, so just skip SumToSize,
- `add` (which was inserted as `prim::AutogradAdd`) adding gradients when the forward used the same value in several places. This is non-broadcasting, so we know that the two arguments would have the same sizes as inputs - which is good so we don't have to do bookkeeping of the two parts.
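The shape bookkeeping behind a grad-sum-to-size step can be sketched in plain Python (a hypothetical helper following standard broadcasting rules, not the actual implementation):

```python
def sum_to_size_dims(grad_shape, target_shape):
    """Return the dimensions of grad_shape that must be summed so the
    result has target_shape. Extra leading dimensions (absent from the
    target) are summed away entirely; dimensions that were broadcast
    from size 1 are summed with keepdim semantics."""
    leading = len(grad_shape) - len(target_shape)
    dims = list(range(leading))  # extra leading dims: sum and drop
    for i, t in enumerate(target_shape):
        if t == 1 and grad_shape[leading + i] != 1:
            dims.append(leading + i)  # broadcast dim: sum, keep dim
    return dims
```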
Details:
- During fusion, the Tensor arguments are always kept as the first parameters of the fusion group to accommodate indexing assumptions in the fuser.
- The rewriting of the fusion group to record the necessary output transformation and eliminate `_grad_sum_to_size` from the fusion group is now in the fuser compile step.
- In the execution step, the arguments are split into Tensor / Non-Tensor and the non-tensor args are mostly forgotten about except for doing `sum_to_size` at the end. This would want to be improved if/when we fuse nonconstant scalar arguments.
- In a number of places in the fuser, the non-Tensor arguments to the fusion group needed to be ignored.
Thank you, apaszke for the insightful discussion. All bad ideas and errors are my own.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14957
Differential Revision: D13888173
Pulled By: zou3519
fbshipit-source-id: 071992c876e8b845f2b3e6329ae03a835d39a0ea
Summary:
In the warning box on https://pytorch.org/docs/stable/tensors.html#torch.Tensor.new_tensor it says:
> new_tensor() always copies data. [...] If you have a numpy array and want to avoid a copy, use **torch.from_numpy()**.
But then further up the page we have another warning box with the message:
> torch.tensor() always copies data. [...] If you have a numpy array and want to avoid a copy, use **torch.as_tensor()**.
Now I believe this is just a small oversight, since from_numpy is to be deprecated in favour of as_tensor. See for example https://github.com/pytorch/pytorch/issues/6885 and https://github.com/pytorch/pytorch/issues/8611. I suggest to just use **torch.as_tensor()** in both of the warning boxes.
cc gchanan
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16587
Differential Revision: D13897038
Pulled By: gchanan
fbshipit-source-id: 2eb3cd47d2c0b5bf4350f980de3be9fe59b4a846
Summary:
applySelect modifies the tensor and removes the topmost dimension, which makes it complicated to track the original dimension using `dim` alone; we need another parameter, `real_dim`, to signify the original dimension.
Fixes #16192
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16495
Differential Revision: D13897182
Pulled By: gchanan
fbshipit-source-id: 105581dbbff6b431cc8e2539a07e0058161e53a1
Summary:
We don't use this in the lambda body anymore. Remove it to fix a warning.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16526
Differential Revision: D13867043
Pulled By: umanwizard
fbshipit-source-id: 4c9a9d194fdfcb63fde16823517d2c6c8e2ae93d
Summary:
This just moves thing around to make AliasTracker independently testable and keep things a little more separate. Follow-on PRs will change the interfaces of AliasDb and AliasTracker to be more clearly distinct.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16588
Differential Revision: D13891894
Pulled By: suo
fbshipit-source-id: c5b590b5fdd462afefe743e499034068bf35784a
Summary:
Doc doesn't need to be changed. Also clarifies two inaccurate comments.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16384
Differential Revision: D13886637
Pulled By: soumith
fbshipit-source-id: 227385008211a6f3ad9135c54fd2d3754cc9daaf
Summary:
- Add libtorch upload jobs
- Unify checkout and env code for binary jobs (sans binary test jobs)
- Compress variables passed into binary jobs
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16511
Differential Revision: D13893714
Pulled By: pjh5
fbshipit-source-id: b8bd72e1397dec569a8ec3e859e319178c7c6f8b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15388
This is another pass to make perfkernels code safer from illegal instruction errors.
Removed the dependency on c10/util/Logging.h
We err on the safe side at the expense of some verbosity.
Reviewed By: dskhudia
Differential Revision: D13502902
fbshipit-source-id: 4f833115df885c5b4f8c1ca83b9badea1553f944
Summary:
The current implementation of the `torch.utils.model_zoo.load_url`
function is prone to a race condition when creating the directory in
which it saves the loaded models, since it checks whether the
directory exists and then creates it in two separate steps. The
directory can be created after the check was made but before we
attempt to create the directory, resulting in an unhandled exception.
Instead, try to create the directory directly, and do nothing if it
already exists.
Note: for Python versions ≥ 3.2, we could simply use the
`exist_ok=True` flag on `os.makedirs`, but this is unavailable in
Python 2.7.
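The race-free pattern described above can be sketched as a small helper (illustrative; the actual `model_zoo` code differs):

```python
import errno
import os

def makedirs_exist_ok(path):
    """Create `path`, tolerating a concurrent creation: attempt the
    creation directly and ignore only the 'already exists' failure.
    Equivalent to os.makedirs(path, exist_ok=True) on Python >= 3.2,
    but also works on Python 2.7."""
    try:
        os.makedirs(path)
    except OSError as e:
        if e.errno != errno.EEXIST:
            raise
```

Checking `os.path.exists` first and then creating the directory would reintroduce the window between check and creation; trying the creation and handling `EEXIST` closes it.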
Signed-off-by: Antoine Busque <antoine.busque@elementai.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16578
Differential Revision: D13886470
Pulled By: soumith
fbshipit-source-id: 88815c8a65eec96caea32d6e9a7f83802502fdb9
Summary:
As there are no checks that all the functions are actually being used, we can end up with stale entries. This diff removes unused entries from Declarations.cwrap
Testing:
Successful build via "python setup.py develop"
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16463
Differential Revision: D13885815
Pulled By: izdeby
fbshipit-source-id: 4e35c2ac9196167af74dff3d4f971210721285f8
Summary:
Start splitting up these tests so we don't have a massive test file. Doesn't change how you run them, since `gtest.cpp` and `no-gtest.cpp` will still collect everything.
Renamed `tests.h` to `test_misc.h` to vaguely discourage people from adding yet more stuff to it.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16536
Reviewed By: zdevito, eellison
Differential Revision: D13882215
Pulled By: suo
fbshipit-source-id: 61cf97f3c2c50703dcf6a3a34da01415ecb7e7d6
Summary:
Fixes https://github.com/pytorch/pytorch/issues/16326
Previously we didn't handle module inputs which included Generic Lists. When checking whether a generic list is a subvalue of the input arg type, I currently recurse on every element of the list. This shouldn't be too slow since the innermost list will be specialized and we won't have to check its elements.
E.g. Tensor[][] -> GenericList [TensorList ].
The error message could be improved, but extracting the complete type of nested lists would have to deal with unifying types across lists / empty lists & typevars so I'm going to save that for a follow up PR.
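The per-element recursion can be sketched like so (a simplified Python stand-in for the C++ subvalue check, using tuples to model list types):

```python
def is_subvalue(value, typ):
    """Check whether `value` conforms to `typ`, where `typ` is either a
    concrete Python type or ("List", elem_type). Lists recurse on every
    element, mirroring the generic-list check described above."""
    if isinstance(typ, tuple) and typ[0] == "List":
        return isinstance(value, list) and all(
            is_subvalue(v, typ[1]) for v in value)
    return isinstance(value, typ)
```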
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16482
Differential Revision: D13882582
Pulled By: eellison
fbshipit-source-id: 3609bc572f0ee9ebf20a77ea5ebc8fa3b165e24b
Summary:
Absolutely no idea why this is needed. This should be a valid argument.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16575
Differential Revision: D13884796
Pulled By: pjh5
fbshipit-source-id: 6011e721e2870499f6b5e627d5ad00ece08b530b
Summary:
This PR changes the way we store aliasing information from a "set" approach to a "points-to" analysis. Set-based approaches lose information in ways that make it difficult to do "live" updates to the alias DB as one is mutating the graph.
The tradeoff is that simple queries get more expensive, since they require traversing the points-to graph to answer most questions. In practice, this is unlikely to be that costly since we don't have massive aliasing chains, but we could create an approximation/caching layer if this becomes a problem.
My rough plan is:
1. This PR, switching to a points-to graph
2. Make it "live": analyzing a node should record all the edges the node added, so that we can rollback when the node is destroyed.
3. Reduce wildcard scope: we can make the wildcard a special vertex that points to anything that we're not "sure" about; namely, things that have been put inside lists, or graph inputs.
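A toy version of the points-to query described above (two values may alias if some vertex is reachable from both; illustrative only, the real AliasDb tracks JIT Values, wildcards, and containers):

```python
from collections import defaultdict

class PointsToGraph:
    """Minimal points-to graph with a reachability-based alias query."""

    def __init__(self):
        self._points_to = defaultdict(set)

    def add_edge(self, value, target):
        self._points_to[value].add(target)

    def _reachable(self, value):
        # Depth-first traversal of the points-to edges from `value`.
        seen, stack = set(), [value]
        while stack:
            v = stack.pop()
            if v in seen:
                continue
            seen.add(v)
            stack.extend(self._points_to[v])
        return seen

    def may_alias(self, a, b):
        # Simple queries traverse the graph, as noted above.
        return bool(self._reachable(a) & self._reachable(b))
```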
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16386
Differential Revision: D13855117
Pulled By: suo
fbshipit-source-id: f009f58143173c275501624eb105d07ab60fe5e1
Summary:
Changelog:
- Modify concatenation of [1] to a tuple by using cases for list and non-list types.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16489
Differential Revision: D13875838
Pulled By: soumith
fbshipit-source-id: fade65cc47385986b773b9bde9b4601ab93fe1cf
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16462
This file was moved, now we change the includes to the new location and remove the proxy header.
Reviewed By: ezyang
Differential Revision: D13847279
fbshipit-source-id: 4617d52fdcfe785cb7b2154460a6686c437abd8b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16282
This changes the core kernel abstraction to be a function taking a stack, popping its arguments from the stack and pushing results to the stack,
instead of getting arguments as ArrayRef<IValue> and returning an output IValue.
Caffe2 operators need to have a way to pass in preallocated output tensors.
The convention for them is to get all inputs *and* outputs on the stack and also return all of them, i.e. a caffe2 op will always have inputs == outputs.
This will probably change in later diffs towards making the outputs in-arguments optional in the JIT schema.
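The stack-based calling convention can be sketched in Python (illustrative; the real kernels operate on a stack of IValues in C++):

```python
def add_kernel(stack):
    # A kernel takes the whole stack, pops its arguments,
    # and pushes its results back onto the same stack.
    b = stack.pop()
    a = stack.pop()
    stack.append(a + b)

def call_op(kernel, *args):
    # The caller materializes the arguments on a stack, invokes the
    # kernel, and whatever remains on the stack is the result.
    stack = list(args)
    kernel(stack)
    return stack
```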
Reviewed By: ezyang
Differential Revision: D13792335
fbshipit-source-id: e9cc2b5e438cc4653e1f701633a154b92b604932
Summary:
This PR contains the implementation of chunk dataset, with the API proposed in PR https://github.com/pytorch/pytorch/pull/15562
A chunk dataset is derived from StatefulDataset. It utilizes worker threads to prefetch chunk data, split it into batches, and cache them in a queue. When get_batch is called from the dataloader, batch data is retrieved from the queue, and data from new chunks is pushed for later batches.
Chunk dataset uses two samplers (chunk_sampler and example_sampler) to perform sampling. The chunk_sampler decides which chunk to load, and example_sampler shuffles the examples inside a specific chunk. More detail of this sampling approach can be found here: http://martin.zinkevich.org/publications/nips2010.pdf
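The two-level sampling can be sketched as a generator (a simplified single-threaded illustration, not the actual prefetching implementation):

```python
import random

def chunked_batches(chunks, batch_size, seed=0):
    """Yield batches using two samplers: a chunk sampler decides which
    chunk to load next, and an example sampler shuffles the examples
    within that chunk before batching."""
    rng = random.Random(seed)
    chunk_order = list(range(len(chunks)))
    rng.shuffle(chunk_order)              # chunk_sampler
    for ci in chunk_order:
        examples = list(chunks[ci])
        rng.shuffle(examples)             # example_sampler
        for i in range(0, len(examples), batch_size):
            yield examples[i:i + batch_size]
```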
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15932
Differential Revision: D13868688
Pulled By: soumith
fbshipit-source-id: a43000c478ca2a3c64cc84b3626d6b8b1ad9a07e
Summary:
Rehash of previous attempts. This tries a different approach where we accept the install as specified in cmake (leaving bin/ include/ and lib/ alone), and then try to adjust the rest of the files to this more standard layout.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16414
Differential Revision: D13863635
Pulled By: zdevito
fbshipit-source-id: 23725f5c64d7509bf3ca8f472dcdcad074de9828
Summary:
This is particularly useful when using a c10d::Store from tests.
cc jgehring
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16278
Reviewed By: janewangfb
Differential Revision: D13866271
Pulled By: pietern
fbshipit-source-id: c8670b5f4ebd5cd009f2cabbe46cc17a9237d775
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16510
This diff was supposed to be memory usage neutral, but based on
some internal flows involving cuDNN, it was not. Reverting pending
further investigation.
Original commit changeset: 03f1ebf7f11c
Reviewed By: xw285cornell
Differential Revision: D13863610
fbshipit-source-id: 15517e255fd6b0c064b65fb99f0ef19742236cfd
Summary:
In the case of spurious failure, refcount is not incremented -- which leads to underflow once all references are released.
This was discovered when exercising multiprocessing on ppc64le.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16302
Differential Revision: D13845435
Pulled By: ezyang
fbshipit-source-id: 8e264fff9dca8152cb12617e3216d5e48acd9557
Summary:
We have:
- This is an initial stab at creating a type stub `torch/__init__.pyi` .
- This is only tested on Python 3, since that's the only Python version mypy
works on.
- So far, we only aim at doing this for torch functions and torch.Tensor.
- Quite a few methods and functions have to be typed manually. These are
done in `torch/__init__.pyi.in`
For me, PyCharm (the non-paid one) didn't seem to indicate errors in the .pyi when opening and seemed to be able to get the type hint for the few functions I tried, but I don't use PyCharm for my usual PyTorch activities, so I didn't extensively try this out.
An example of a generated PYI is at [this gist](https://gist.github.com/ezyang/bf9b6a5fa8827c52152858169bcb61b1).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12500
Differential Revision: D13695553
Pulled By: ezyang
fbshipit-source-id: 4566c71913ede4e4c23ebc4a72c17151f94e8e21
Summary:
Some HTTP servers don't return Content-Length; account for that.
Fixes: https://github.com/pytorch/pytorch/issues/16152
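A sketch of the download loop that tolerates a missing Content-Length (hypothetical helper, not the actual code): with a known length the caller could report progress, without one we simply read until EOF.

```python
import io

def read_body(stream, content_length=None, chunk_size=8192):
    """Read a response body whether or not the server sent
    Content-Length. `content_length` is None when the header is absent,
    in which case we read chunks until the stream is exhausted."""
    data = bytearray()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        data.extend(chunk)
        if content_length is not None and len(data) >= content_length:
            break
    return bytes(data)
```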
Differential Revision: D13858882
Pulled By: soumith
fbshipit-source-id: e4293e9368ed4c87548d22adec1ce0c25ea4bd8f
Summary:
It looks like `WithInsertionPoint` and `WithCurrentScope` can be easily implemented without
`ResourceGuard` - that helps readability and removes one more dependency. Is there anything I'm missing?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16351
Differential Revision: D13821826
Pulled By: ZolotukhinM
fbshipit-source-id: b203200b345fb5508a97dc8656e6f51cde4cc21f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15860
Few changes (which are harder to split in separate diffs, so together):
- make conversion explicit (as they can throw to avoid surprises)
- fix tensor legacy dispatch not initialized when tensor is created on C2 side
- add a bunch of invariants to enforce
Reviewed By: ezyang
Differential Revision: D13596031
fbshipit-source-id: d20b601e06ba47aeff2f6e8e15769840e2d46108
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16247
Stack is going to be used by the c10 dispatcher.
This just moves the file, also changing the namespace turned out to be more complicated than I thought, I'll leave the namespace for now.
Reviewed By: ezyang
Differential Revision: D13774189
fbshipit-source-id: 66aeee36425e0ea2b3a4f8159604f38572306d57
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16180
Only the kernel knows about its state, the caller doesn't see it anymore.
Reviewed By: ezyang
Differential Revision: D13744071
fbshipit-source-id: cb00ff1a881508c1b36ac4123bee1f68ca02ca9c
Summary:
The current uses of `IR_IF` are mostly trivial, so there is not much value in having special macros for it.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16354
Differential Revision: D13821823
Pulled By: ZolotukhinM
fbshipit-source-id: 1ca73111f5b4868fa38a1f29c9230540773e5de6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16273
Previously we have SetOutputSize which accept a partially initialized Output Tensor and set it to the correct size,
the diff change this to GetOutputSize that returns the correct size instead.
e.g.
```
auto* Y = Output(0);
ConvPoolOp<Context>::SetOutputSize(X, Y, channels);
...
Y->mutable_data<T>...
```
-->
```
auto sizes = ConvPoolOp<Context>::GetOutputSize(X, channels);
auto* Y = Output(0, sizes, at::dtype<T>());
```
Reviewed By: dzhulgakov
Differential Revision: D13736281
fbshipit-source-id: 64abce3dbaed0b375098463333dfd0ea5a3b1945
Summary:
Working on the tracer was really annoying because a lot of the implementations were in `tracer.h` and editing that file caused us to rebuild almost the whole world. So this moves all the implementations into tracer.cpp
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16410
Differential Revision: D13847776
Pulled By: jamesr66a
fbshipit-source-id: ec8500da32b2d4cd990f293a0a96101d3e82f158
Summary:
Fix alias annotations for ops that may return a fresh tensor. The previous version was overly conservative.
Currently there is no actual behavior change in the alias analysis, but we may use the information in the future.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16460
Differential Revision: D13849086
Pulled By: suo
fbshipit-source-id: cd23b314a800e5e077d866e74456d37a321439d5
Summary:
Adds Tensor alias annotations.
This isn't a full implementation of alias annotations, but that isn't required to increase compliance with the JIT signature schema. There are some sanity checks within native_parse.py for their usage, which can also help overall correctness. Otherwise, this exists solely for further alignment between the JIT signature schema and the native_functions.yaml func schema.
This gets us to ~85% matches.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16239
Differential Revision: D13804133
Pulled By: cpuhrsch
fbshipit-source-id: aa5750f2c7e0f08b8c35d6d8f38cb148e9629855
Summary:
Also, sometimes we have a `CMakeCache.txt` even though cmake errored out, so I'm adding the existence of `build.ninja` as another criterion for rerunning cmake.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16426
Differential Revision: D13843801
Pulled By: ezyang
fbshipit-source-id: ea1efb201062f23b7608f8d061997d8a8e293445
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16177
Change the API for calling operators so that it can store state in an OpKernel object.
This diff doesn't store the state there yet, that comes in a follow up diff.
Reviewed By: ezyang
Differential Revision: D13742889
fbshipit-source-id: 20511a9a1b9f850074e50634d4b4acf87f8c6ecd
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16246
The op schema says it returns multiple values, so let's actually return multiple values instead of one tuple.
For some reason, this did work when called from python (probably some auto-unpacking),
but once called from JIT, it segfaulted. This diff fixes that.
Reviewed By: dzhulgakov
Differential Revision: D13780147
fbshipit-source-id: fe94f82f4c53b7454f77c4484fca4ac9dc444475
Summary:
swapBytes64 used to use SwapByteOrder_32 and value, both of which don't exist. This commit rewrites that part from scratch.
This happened in a Debug build with the Microsoft compiler. For that case " && !defined(_DEBUG)" is also removed, because _byteswap_uint64 works fine in debug mode (if it is necessary, it should be commented why).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16418
Differential Revision: D13843306
Pulled By: ezyang
fbshipit-source-id: dde1c7baeccec3aaa750d4b7200b3f4ccb4a00cb
Summary:
This flag is useful in identifying if a test is taking way too long like the ones in the following snippet when running the test suite with pytest. 9757ad35b0/test/common_utils.py (L814-L835)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16423
Differential Revision: D13843507
Pulled By: ezyang
fbshipit-source-id: 643e1766a85905b3b112ea5ca562135a17896a72
Summary:
cdist is used for calculating distances between collections of observations.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16168
Differential Revision: D13739147
Pulled By: ifedan
fbshipit-source-id: 9419c2c166891ac7db40672c72f17848f0b446f9
Summary:
Before this diff, we execute `std::vector<optional<acc_t>> buffer((unsigned)max_threads, optional<acc_t> {});` in every iteration of `foreach_reduced_elt`. Change the code to only execute that line if we need it; i.e., we are actually about to parallelize.
This overhead is quite significant when we are doing a lot of small reductions in single-threaded code.
```
x=torch.randn((1024,10,1024),dtype=torch.float64)
torch.set_num_threads(1)
%timeit x.std(1)
```
Before (with #15845 applied): 708.25 ms
After: 508 ms
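The hoisting idea can be sketched in plain Python (illustrative, single-process; the real code allocates a per-thread buffer of optionals in C++):

```python
def reduce_sum(values, num_threads=1):
    """Allocate the per-thread buffer only when we actually
    parallelize, so the serial path (the common case for many small
    reductions) pays no allocation cost."""
    if num_threads <= 1 or len(values) < 2:
        return sum(values)             # serial path: no buffer needed
    buffer = [0] * num_threads         # allocated only when parallelizing
    step = (len(values) + num_threads - 1) // num_threads
    for t in range(num_threads):       # stand-in for the parallel region
        buffer[t] = sum(values[t * step:(t + 1) * step])
    return sum(buffer)
```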
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15850
Differential Revision: D13612960
Pulled By: umanwizard
fbshipit-source-id: f5e61abfe0027775c97ed81ac09c997fbee741df
Summary:
Made the change requested in #15555
The PR was failing the build due to a timeout error while fetching packages with pip.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16416
Differential Revision: D13833873
Pulled By: soumith
fbshipit-source-id: e2200e9e8015558fcd359dfa3d025b25802d62b5
Summary:
This one needs to be merged ASAP because the CUDA build for Windows is skipped at this time.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16412
Differential Revision: D13833889
Pulled By: soumith
fbshipit-source-id: 95a401a01fb0f9c1045df0bfd72d8206b8a6f3fd
Summary:
The real fix for https://github.com/pytorch/pytorch/issues/15605.
This is sort of BC breaking because now
```py
In [1]: import torch
In [2]: a = torch.randn(3, 3, requires_grad=True)
In [3]: a.slogdet()
Out[3]: (tensor(1.), tensor(0.1356, grad_fn=<SlogdetBackward>))
In [4]: a.slogdet()[0].requires_grad
Out[4]: False
```
while before this patch `a.slogdet()[0]` required grad with `grad_fn=<SlogdetBackward>`. But any attempt to backprop through this value would hit the error in #15605, so I don't think this is a problem.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16337
Differential Revision: D13832644
Pulled By: soumith
fbshipit-source-id: f96c477e99edcbdbd966888e5c5ea7fd058429a8
Summary:
The documentation stated that operands to einsum should be a list of Tensors, not individual arguments. The function, however, now accepts individual arguments for each Tensor operand *and* a single argument consisting of a list of Tensors. The documentation was updated to reflect this change.
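The argument normalization described above can be sketched as (hypothetical helper mirroring the documented behavior, not the actual `torch.einsum` code):

```python
def normalize_operands(*operands):
    """Accept either the list form einsum('ij,jk', [a, b]) or the
    variadic form einsum('ij,jk', a, b) and return a flat list of
    operands."""
    if len(operands) == 1 and isinstance(operands[0], (list, tuple)):
        return list(operands[0])
    return list(operands)
```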
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16323
Differential Revision: D13832647
Pulled By: soumith
fbshipit-source-id: c01c2b350f47674d3170337f493b0ee2ea381b3f
Summary:
These were really annoying to see in the phabricator UI when trying to land PRs that touched test_jit.py, so this fixes them.
One remaining item is the T484 error. Locally, flake8 still chokes on that line even though I put the noqa comment there (and tried varying whitespaces around it etc). Not sure why it still persists...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16409
Differential Revision: D13832658
Pulled By: jamesr66a
fbshipit-source-id: 46356ba6444ae5ee1a141c28489bdcc7c99e39c0
Summary:
Changelog:
- Append a condition that switches to the native CUDA implementation for affine_grid
Fixes #16365
Differential Revision: D13832192
Pulled By: soumith
fbshipit-source-id: 3f484e6673d71e3ba7627b170cb8f1611e12b9b2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16374
this fixes the original attempt in OSS (adds to CMake and python build files)
Reviewed By: smessmer
Differential Revision: D13821061
fbshipit-source-id: 82f0dade0145fd04bdf8e3cb3954b5790e918162
Summary:
This commit removes the dependency on `build_pytorch_libs.sh` by moving the remaining functionality that is not expressible in cmake into python. Removing the indirection through bash also removes over 300 lines of environment munging code that is incredibly hard to understand because it passes a lot of secret parameters through `os.env`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16289
Reviewed By: ezyang
Differential Revision: D13821662
Pulled By: zdevito
fbshipit-source-id: d658d26925e3b1169ac1e3d44a159cf8a1f0d9b1
Summary:
1. Improve error message for better debugging info
2. Increase timeout
3. Also apply the windows worker failure detection mechanism on non-Windows platforms, for better robustness
Attempt to fix #14501
cc ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16249
Differential Revision: D13784702
Pulled By: ezyang
fbshipit-source-id: 09a7cff83ab9edce561ed69f9fb555ab35d1275f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16191
logdevice related modifications for generic feature type
we directly convert the generic feature structures to json strings, which corresponds to the column input in offline and dper
Reviewed By: itomatik
Differential Revision: D13551909
fbshipit-source-id: 807830c50bee569de202530bc3700374757793a2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16350
Example usage of the new caffe2 integration
Reviewed By: smessmer
Differential Revision: D13408546
fbshipit-source-id: 87240ca7f48d653a70241d243aa0eb25efa67611
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16173
Helper to make it easy to run ops in caffe2
Reviewed By: smessmer
Differential Revision: D13468240
fbshipit-source-id: 2276c7870af6dcdf829957f005fd16ac1ef319b5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16048
This enables full shimming of the operator (previously it was only
Output() shimmed).
Reviewed By: smessmer
Differential Revision: D13468241
fbshipit-source-id: c853b775ab5cdcd968f4a6cc4766e91c3c6b1c45
Summary:
Some cleanups in ir.{h,cpp}. I plan to continue cleaning it up, so this is a first step.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16342
Differential Revision: D13808897
Pulled By: ZolotukhinM
fbshipit-source-id: 2dedb414576c3efbf8e36434145d7f14a66b1ee7
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16335
group conv is not implemented with EIGEN engine so this diff disables related tests
Reviewed By: jamesr66a
Differential Revision: D13807204
fbshipit-source-id: 41f6de43da40882f57e64474520e185733caefb7
Summary:
Remove calls to torch.jit._unwrap_optional that are no longer needed.
The remaining instances would require control flow logic for exceptions.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16245
Differential Revision: D13804292
Pulled By: eellison
fbshipit-source-id: 08c5cbe4b956519be2333de5cf4e202488aff626
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16341
as in the title
Reviewed By: intermilan
Differential Revision: D13808679
fbshipit-source-id: 0d12d3253f380bec66bc9be899be565861b8163a
Summary:
This PR adds thread-local guard (`at::AutoNonVariableTypeMode`) to make sure that in VariableType.cpp the operations on baseType still dispatch to non-Variable type, even if the parameters will become Variables after the Tensor/Variable merge. We achieve this by making `legacyTensorType()` and `getType()` check the `at::AutoNonVariableTypeMode` guard to decide whether to return non-Variable type for a variable.
This is part of the VariableImpl/TensorImpl merge work: https://github.com/pytorch/pytorch/issues/13638.
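The thread-local guard pattern can be sketched in Python (the real guard is a C++ RAII object; names here are illustrative):

```python
import threading

_tls = threading.local()

class NonVariableTypeModeGuard:
    """While active on the current thread, dispatch would choose the
    non-Variable path; the previous value is restored on exit, so
    guards nest correctly and other threads are unaffected."""

    def __enter__(self):
        self._prev = getattr(_tls, "non_variable", False)
        _tls.non_variable = True
        return self

    def __exit__(self, exc_type, exc, tb):
        _tls.non_variable = self._prev
        return False

def non_variable_mode_enabled():
    return getattr(_tls, "non_variable", False)
```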
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15939
Reviewed By: ezyang
Differential Revision: D13640980
Pulled By: yf225
fbshipit-source-id: d12c2543822958558d7d70d36c50999a5eb8783f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16226
Now that the caching allocator is moved to c10_cuda, we can
delete the duplicate copy from Caffe2.
Reviewed By: dzhulgakov, smessmer
Differential Revision: D13762540
fbshipit-source-id: 03f1ebf7f11c68c19aa0d66110156fe228da6138
Summary:
Some renaming and renamespacing also took place. I was originally planning not to do anything, but it turns out that it was easier to make HIPify work by using a namespace CUDACachingAllocator:: rather than THCCachingAllocator_, since :: is a word boundary but _ is not.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16119
Reviewed By: smessmer
Differential Revision: D13718768
fbshipit-source-id: 884a481d99027fd3e34471c020f826aa12225656
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16117
This means I can move it to c10_cuda with minimal fuss.
Reviewed By: smessmer
Differential Revision: D13717836
fbshipit-source-id: a94c7dc649af64542480fc1c226b289588886c00
Summary:
In preparation for setting up a doc build job for stable docs, I wanted
to refactor the workflow so that future changes will be easier.
This PR the following changes:
- Refactor the doc push script into a reusable command
- Add command line options for the doc push script.
These don't matter too much for now but will be useful
for setting up future jobs for building different versions of the
docs.
- Instead of checking out pytorch/pytorch:master, we re-use the pytorch
installation inside the docker image.
- Change the sed in the script to a perl command. sed is annoyingly
different across platforms; the perl command is more stable
- Run the script in dry run mode (without pushing the doc build)
whenever a PR is opened. This lets us test changes to the doc build workflow.
Test Plan
- I tested the doc build script locally with my own credentials and it
worked fine.
- Wait for the pytorch_doc_push CI.
- After merging this PR, keep an eye on the pytorch_doc_push CI status.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16265
Differential Revision: D13803511
Pulled By: zou3519
fbshipit-source-id: 4564bca3e74d490f89a1d1da9fb8b98eb44bdbb1
Summary:
pdist was recently patched to remove buggy batch support and fix issues
with large tensors. That fix missed a few spots and didn't handle a
few recommendations, which this commit addresses.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16210
Differential Revision: D13791914
Pulled By: gchanan
fbshipit-source-id: 0595841be1b298f7268fd4c02a6628acfec918f2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16294
In `ReinitializeTensor`, we compare `tensor->GetDevice()` and `options.device()`, but at the callsite we actually just provide an option with `device_type`, which means the `device_id` will always be the default (-1) for `options`. For the tensor, although it is passed a `device` with the default `device_id`, when we allocate the data the `device` of the `tensor` is the `device` of its `Storage`, which is the `device` of the underlying `DataPtr`, which is the same as the `device` of the operator's `Context`, which has a non-default `device_id`.
Therefore every time we call `ReinitializeTensor`, we find that the `device` does not match, and after the call it still does not match. That's why we allocate a new Tensor every time, causing perf regressions for ops that use `ReinitializeTensor` on multiple GPUs.
Reviewed By: BIT-silence
Differential Revision: D13795635
fbshipit-source-id: 24d6afa1a0196a32eb0134ee08b4280244cdb0c3
Summary: Some automation to fix uninitialized members for caffe2 code. Ran canary to make sure I don't have any regression in prod, but not sure how to test comprehensively for caffe2
Reviewed By: ezyang
Differential Revision: D13776185
fbshipit-source-id: fb2a479971cc0276d8784be1c44f01252410bd24
Summary:
This PR adds support for overloaded functions as a step toward adding rnn modules to the JIT standard library.
Possible overloads must be manually specified, and when resolving the overload it chooses by the first one that passes the schema matching logic. The structure is very similar to boolean dispatch in #14425. The overload will only work on weak modules.
In order to avoid supporting overloaded methods in Python to match the JIT execution, the current setup offloads that work to the user. In the test added in `test_jit.py`, two methods are used to overload the `forward` method. In order to call `forward` outside the JIT, a Python-only `forward` that does the right argument type switching must also be provided.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15556
Differential Revision: D13576348
Pulled By: driazati
fbshipit-source-id: 7d3bdd4ee5a6088cc20c92f26a696d1ee5b9204b
Summary:
- remove loop node that is guaranteed not to execute
- remove extra loop outputs that are no longer needed
- if we are inlining an if node, only run constant propagation on the block that will execute
- remove the recurse argument since we only expose the Graph Constant Propagation and it's not used
This also includes a few extra hooks to python_ir that I think make it a little be easier to test graph conditions from python.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16244
Differential Revision: D13791635
Pulled By: eellison
fbshipit-source-id: d16351fffcfc8013b02015db200f8fde002e0577
Summary:
- Fix environment variable used to guard binary uploads
- Move common MacOS brew setup-code into a common function to decrease code duplication and also to move that noisy console output into its own CircleCI step
- Split Mac builds into separate build-test and upload jobs. Add one of these jobs to PR runs; add upload jobs to nightly binarybuilds workflow
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16016
Differential Revision: D13791084
Pulled By: pjh5
fbshipit-source-id: 8eeb8e1963d46eab84f0f6dad9f0265163d5bf73
Summary:
Otherwise, it won't work if we sync on this event.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/8219
Reviewed By: pietern
Differential Revision: D13788657
Pulled By: teng-li
fbshipit-source-id: 8c96e9691ed2441d7a685fb7ae8fece906f58daf
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16176
This makes PyTorch and Caffe2's data() method line up.
Historically, PyTorch made no distinction between tensors
with const or non-const data, and thus provided a
non-const pointer with data() member. Changing the API to
return a const-pointer would break all mutable code, whereas
changing the Caffe2 API to change a pointer doesn't break
any code, *except* for code which required an exact match
on const-ness (e.g., in template arguments). Since the latter
is less disruptive, we've opted for it here.
The few places downstream that broke due to this are fixed
in this patch.
Reviewed By: smessmer
Differential Revision: D13742916
fbshipit-source-id: baa4b4544cfdf7c1f369f4d69a1e0d5953c1bd99
Summary:
This PR applies a few minor modifications leading to 100s of additional matches
Modifications to native_functions.yaml
1) double to float
2) int64_t to int
3) IntList[\d*] to int[\d*]
4) {} to []
5) Tensor? x=[] to Tensor? x=None
6) TensorList to Tensor[]
7) 1e-x to 1e-0x
8) Generator* x = nullptr to Generator? x = None
9) `{.*}` to `[.*]`
Overall this adds about 300 new matches and brings us to about 1/2 compliance of native_functions func with their JIT signature equivalent
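A hypothetical before/after combining several of the rewrites above (the `foo` signature is illustrative, not taken from the actual yaml):

```yaml
# before
- func: foo(Tensor self, int64_t dim, double eps=1e-5, TensorList tensors, Generator* generator=nullptr) -> Tensor
# after
- func: foo(Tensor self, int dim, float eps=1e-05, Tensor[] tensors, Generator? generator=None) -> Tensor
```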
While this is still a draft, "tools/jit/gen_jit_dispatch.py" contains code to aid in finding close signatures.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16111
Reviewed By: ezyang
Differential Revision: D13738123
Pulled By: cpuhrsch
fbshipit-source-id: d1ec1e089bdb26ec155f6f31ccf768270acb76c7
Summary:
This PR updates the logic for using cudnnGet* and cudnnFind*. Current version of cudnn find and get (v7) returns a pair of best algorithm and the convDesc mathType. While we were using the returned algorithm, we didn't update the mathType. As a result, we ended up with a slow choice of algorithm and math type. Without this patch, we are seeing a 10x regression in group convolutions.
Changelist:
- Changed the template arguments to be `perf_t` instead of `algo_t` to unify cudnnFind and cudnnGet. Both cudnnFind and cudnnGet have the same purpose and hence, it made sense to unify them and get rid of `getAlgorithm`.
- Used cudnnGet*_v7 everywhere cudnnGet* was being used.
- Removed all cudnn6 paths (This PR depends on https://github.com/pytorch/pytorch/pull/15851)
Differential Revision: D13787601
Pulled By: ezyang
fbshipit-source-id: 81fe86727673d021306fe1c99c3e528b7c9ad17f
Summary:
Tune elementwise kernel for AMD architectures by increasing the work group sizes and launch bounds. This change improves training throughput for torchvision models by up to 11% in our tests while exhibiting no significant performance regression.
No functional/performance change for CUDA - just shifting numbers into constexpr.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16217
Differential Revision: D13776684
Pulled By: bddppq
fbshipit-source-id: edbaebe904598b2de66a9e9a68a1aa219ebc01e9
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16166
Since we now don't use std::function anymore, we can make kernel registration constexpr again.
Reviewed By: ezyang
Differential Revision: D13738630
fbshipit-source-id: 918fa3a3c8c6f0ddbd0f08b3b143cdf066265387
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16165
Store kernels as direct function pointers instead of std::function.
Using direct function pointers avoids a performance risk std::function would introduce.
Reviewed By: ezyang
Differential Revision: D13738627
fbshipit-source-id: a348906c8a201436699681980a82ca95065a06a0
Summary:
Partially fixes: https://github.com/pytorch/pytorch/issues/394
Implementation detail:
Codegen is modified to generate codes that looks like below:
```C++
static PyObject * THPVariable_svd(PyObject* self_, PyObject* args, PyObject* kwargs)
{
  HANDLE_TH_ERRORS
  static PythonArgParser parser({
    "svd(Tensor input, bool some=True, bool compute_uv=True, *, TensorList[3] out=None)",
  }, /*traceable=*/true);
  ParsedArgs<6> parsed_args;
  auto r = parser.parse(args, kwargs, parsed_args);
  static PyStructSequence_Field fields0[] = {
    {"U", ""}, {"S", ""}, {"V", ""}, {nullptr}
  };
  static PyStructSequence_Desc desc0 = {
    "torch.return_types.svd_out", nullptr,
    fields0, 3
  };
  static PyTypeObject type0;
  static bool namedtuple_type_initialized0 = false;
  if (!namedtuple_type_initialized0) {
    PyStructSequence_InitType(&type0, &desc0);
    namedtuple_type_initialized0 = true;
  }
  static PyStructSequence_Field fields1[] = {
    {"U", ""}, {"S", ""}, {"V", ""}, {nullptr}
  };
  static PyStructSequence_Desc desc1 = {
    "torch.return_types.svd", nullptr,
    fields1, 3
  };
  static PyTypeObject type1;
  static bool namedtuple_type_initialized1 = false;
  if (!namedtuple_type_initialized1) {
    PyStructSequence_InitType(&type1, &desc1);
    namedtuple_type_initialized1 = true;
  }
  if (r.idx == 0) {
    if (r.isNone(3)) {
      return wrap(&type1, dispatch_svd(r.tensor(0), r.toBool(1), r.toBool(2)));
    } else {
      auto results = r.tensorlist_n<3>(3);
      return wrap(&type0, dispatch_svd(r.tensor(0), r.toBool(1), r.toBool(2), results[0], results[1], results[2]));
    }
  }
  Py_RETURN_NONE;
  END_HANDLE_TH_ERRORS
}
```
Types are defined as static member of `THPVariable_${op_name}` functions, and initialized at the first time the function is called.
When parsing function prototypes in `native_functions.yaml`, the parser will set the specified name as `field_name` when see things like `-> (Tensor t1, ...)`. These field names will be the field names of namedtuple. The class of namedtuples will be named `torch.return_types.${op_name}`.
In some Python 2 builds, `PyStructSequence` is not a subtype of tuple, so we had to create helper functions that check whether an object is a tuple or a namedtuple, for compatibility.
Operators in `native_functions.yaml` are changed such that only `max` and `svd` are generated as namedtuple. Tests are added for these two operators to see if the return value works as expected. Docs for these two ops are also updated to explicitly mention the return value is a namedtuple. More ops will be added in later PRs.
There is an issue with the Windows build where the linker is unable to resolve `PyStructSequence_UnnamedField`; a workaround is added to deal with this case.
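As a quick illustration of the new return type (using `torch.max`, one of the two converted ops):

```python
import torch

x = torch.arange(6.).reshape(2, 3)     # [[0, 1, 2], [3, 4, 5]]
result = torch.max(x, dim=0)           # an instance of torch.return_types.max
values, indices = result               # still unpacks like a plain tuple
named_values = result.values           # fields are also accessible by name
```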
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15429
Differential Revision: D13709678
Pulled By: ezyang
fbshipit-source-id: 23a511c9436977098afc49374e9a748b6e30bccf
Summary:
Initial enabling of the upcoming hip-clang compiler for the PyTorch source base.
Changes:
* update the Eigen submodule to a version including our upstreamed hip-clang enabling there
* modify a few ifdef guards with the `__HIP__` macro used by hip-clang
* use `__lane_id` instead of `hc::__lane_id`
* add Debug flags for ROCm to the cmake infrastructure
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16085
Differential Revision: D13709459
Pulled By: ezyang
fbshipit-source-id: 1b7b33fe810a0434766180580d4443ea177eb7c7
Summary:
`torch.distributed.launch` does not raise an error when a `subprocess.Popen` process exits with a non-zero return code.
For easier debugging, it should always raise an error when launched processes behave abnormally.
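The stricter behavior can be sketched roughly like this (`check_exit` is an illustrative helper, not the launcher's actual code):

```python
import subprocess
import sys

def check_exit(cmd):
    """Wait for a launched process and raise if it exited non-zero."""
    process = subprocess.Popen(cmd)
    process.wait()
    if process.returncode != 0:
        raise subprocess.CalledProcessError(process.returncode, cmd)
    return process.returncode

rc = check_exit([sys.executable, "-c", "print('worker ok')"])
```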
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16069
Differential Revision: D13709467
Pulled By: ezyang
fbshipit-source-id: 31d32a5ec8fed7bccd62d845bfba0e670ed3fe20
Summary:
Save reallocation costs, by reserving vectors according to how many elements we expect to put in.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16201
Differential Revision: D13762594
Pulled By: ezyang
fbshipit-source-id: 7e3bfe421489dde48a2ddb0920dd155f69baecc0
Summary:
Fixed a few C++ API callsites to work with v1.0.1.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16221
Differential Revision: D13759207
Pulled By: yf225
fbshipit-source-id: bd92c2b95a0c6ff3ba5d73cb249d0bc88cfdc340
Summary:
Now it is only necessary to use 'develop' or 'install' to build. Incremental cmake is on by default. `develop --cmake` forces it to rerun.
The NinjaBuilder stuff is dead. It was used to make building _C.so
faster but now _C.so is just an empty stub file.
Removed a bunch of custom build commands from setup.py that are
no longer meaningful now that cmake handles most of the build.
Removed unused targets in build_pytorch_lib.sh/bat
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16162
Differential Revision: D13744155
Pulled By: zdevito
fbshipit-source-id: d836484782c65b7f8e8c7a82620886f7a7777892
Summary:
This PR does three things:
~~Allow `int64_t?` in function schema, which provide an elegant way of implementing null-able int arguments, as discussed in https://github.com/pytorch/pytorch/pull/15208#pullrequestreview-185230081~~
~~Originally implemented in https://github.com/pytorch/pytorch/pull/15235~~
~~Example:~~
```yaml
- func: myop(Tensor self, int64_t? dim=None) -> Tensor
  variants: function
```
~~cc: zou3519~~
Edit: implemented in https://github.com/pytorch/pytorch/pull/15234
Previously tried in https://github.com/pytorch/pytorch/pull/12064. There was a problem that C++ does not have kwarg support, which makes it confusing to know whether `unique(t, 1)` actually means `unique(t, dim=1)` or `unique(t, sorted=1)`.
Now I think I have a better idea on how to implement this: there are two ATen operators: `unique` and `unique_dim`. `unique` has the same signature as in python, and exported to both python and C++. `unique_dim` has signature `unique_dim(tensor, dim, sorted=False, return_inverse=False)`, and only exported to C++, which could be used more naturally for a C++ user.
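The resulting Python-facing behavior can be sketched as follows (`unique_dim` itself is the C++-only entry point):

```python
import torch

t = torch.tensor([1, 3, 2, 3])
# sorted unique values, plus indices mapping each input element
# back into the returned values tensor
values, inverse = torch.unique(t, sorted=True, return_inverse=True)
```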
Differential Revision: D13540278
Pulled By: wanchaol
fbshipit-source-id: 3768c76a90b0881f565a1f890459ebccbdfe6ecd
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16175
Separate Moments from math and optimize it
i-am-not-moving-c2-to-c10
Reviewed By: houseroad
Differential Revision: D13742472
fbshipit-source-id: 90757d908d38c98ca69818855aaf68315e525992
Summary:
Submitting this PR as an update to existing PR (https://github.com/pytorch/pytorch/pull/15938) on houseroad 's request.
This PR replaces the use of ONNX op `ConstantLike` with `ConstantOfShape` in the ONNX exporter. In addition to removing the call sites in `symbolic.py`, it also replace the call site in `peephole.cpp`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16095
Differential Revision: D13745723
Pulled By: houseroad
fbshipit-source-id: e2a5f534f01adf199df9e27544f7afcfa540e1f0
Summary:
Resolves #15923 where LBFGS threw "Error: a leaf Variable that requires grad has been used in an in-place operation."
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16167
Differential Revision: D13745822
Pulled By: soumith
fbshipit-source-id: 7d1d0511d06838c0c6f4c8a6b53cf15193283059
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16174
Our service creates a new caffe2 workspace for the same underlying network on multiple threads concurrently at service startup time (later these workspaces are being reused for sequential requests), resulting in concurrent quantization via FullyConnectedDNNLowPOp calling GetOrCreateFbgemmPackBMatrix(). The lazily performed quantizations during the first inference in each workspace are all funnelled through GetOrCreateFbgemmPackBMatrix()'s cache_mutex, which means quantization is serialized, so at service startup time only a single CPU core is being used for around a minute until the serial quantization is done.
A better solution would be to avoid quantizing the same weight matrix of the operator copies in different net copies to begin with, but this is the simpler solution for our current problem.
Reviewed By: jspark1105
Differential Revision: D13708785
fbshipit-source-id: 537519896b3b939c552d67f400bafc8a69ce11eb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16135
Separate affine_channel from math and optimize it
i-am-not-moving-c2-to-c10
Reviewed By: houseroad
Differential Revision: D13727606
fbshipit-source-id: 8980af4afadaf964a18a9da581106fe30896a7e9
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16065
Before, we registered the caffe2 kernel with the c10 dispatcher using plain C types.
Now, we pass in IValues, which avoids the unwrapping inbetween.
Reviewed By: ezyang
Differential Revision: D13689036
fbshipit-source-id: b976a2c46a5a541f6a926b3df255e8a535e32420
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16051
This changes the kernels stored in the c10 dispatcher from plain C function pointers to IValue-based KernelFunction*.
Note that KernelFunction is currently taking an `ArrayRef<IValue>` as arguments. A later diff will change that to it taking a `Stack*`.
Reviewed By: ezyang
Differential Revision: D13684518
fbshipit-source-id: 1fa54f60cec2e967b92a4a043d6e3ac1627ed991
Summary:
This tests the water for adding back NNPACK in PyTorch, it's a lot better than the fallback THNN versions.
In #6151, we (ezyang and soumith) removed NNPACK support from PyTorch. Of course Maratyszcza might have advice, too. (Or an opinion on the CMake changes.)
The only functional changes are to use NNPack more aggressively on mobile and a .contiguous() to match NNPack's assumption (I stumbled over that while using NNPack for style transfer.)
The CMake changes try to use the NNPack we already have in git.
In terms of lines of code this is a large part of the diff of https://lernapparat.de/pytorch-jit-android/ . As far as I can tell, we don't have MKLDNN on mobile and the native THNN implementation are prohibitively expensive in terms of both CPU and memory.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15924
Differential Revision: D13709576
Pulled By: ezyang
fbshipit-source-id: f2e287739909451c173abf046588209a7450ca2c
Summary:
Partial fix for #15804, only w/o dim.
For jcjohnson benchmarking script I'm getting the following results on V100:
Before:
```
Running with N = 10000, M = 10000
cuda (no inverse): 0.98 ms
cpu (no inverse): 0.96 ms
cuda (with inverse): 1.07 ms
cpu (with inverse): 1.76 ms
Running with N = 10000, M = 100000
cuda (no inverse): 0.76 ms
cpu (no inverse): 1.53 ms
cuda (with inverse): 1.23 ms
cpu (with inverse): 3.02 ms
Running with N = 100000, M = 100000
cuda (no inverse): 1.28 ms
cpu (no inverse): 11.22 ms
cuda (with inverse): 69.76 ms
cpu (with inverse): 20.28 ms
Running with N = 100000, M = 1000000
cuda (no inverse): 0.78 ms
cpu (no inverse): 18.78 ms
cuda (with inverse): 133.45 ms
cpu (with inverse): 34.09 ms
Running with N = 500000, M = 500000
cuda (no inverse): 1.43 ms
cpu (no inverse): 61.13 ms
cuda (with inverse): 3315.18 ms
cpu (with inverse): 104.57 ms
Running with N = 500000, M = 5000000
cuda (no inverse): 0.86 ms
cpu (no inverse): 96.44 ms
cuda (with inverse): 5209.93 ms
cpu (with inverse): 176.10 ms
```
After
```
Running with N = 10000, M = 10000
cuda (no inverse): 1.04 ms
cpu (no inverse): 0.94 ms
cuda (with inverse): 0.64 ms
cpu (with inverse): 1.76 ms
Running with N = 10000, M = 100000
cuda (no inverse): 0.77 ms
cpu (no inverse): 1.55 ms
cuda (with inverse): 0.58 ms
cpu (with inverse): 2.79 ms
Running with N = 100000, M = 100000
cuda (no inverse): 1.30 ms
cpu (no inverse): 14.15 ms
cuda (with inverse): 1.63 ms
cpu (with inverse): 20.90 ms
Running with N = 100000, M = 1000000
cuda (no inverse): 0.82 ms
cpu (no inverse): 18.63 ms
cuda (with inverse): 0.61 ms
cpu (with inverse): 33.52 ms
Running with N = 500000, M = 500000
cuda (no inverse): 1.51 ms
cpu (no inverse): 59.81 ms
cuda (with inverse): 1.23 ms
cpu (with inverse): 110.69 ms
Running with N = 500000, M = 5000000
cuda (no inverse): 0.92 ms
cpu (no inverse): 104.26 ms
cuda (with inverse): 0.84 ms
cpu (with inverse): 187.12 ms
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16145
Differential Revision: D13738821
Pulled By: soumith
fbshipit-source-id: 0811fb4ade47e3b466cebbc124e3f3333a986749
Summary:
It turns out that clang-tidy is bundled with travis's standard trusty distribution, so no need to install it manually.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16164
Differential Revision: D13738986
Pulled By: suo
fbshipit-source-id: d0cd76c615625b2ed7f18951289412989f15849d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16049
We might see the pattern
```
if (scale_.numel() != N) {
  scale_->Resize(N);
  // set initial value for scale_
}
// In class:
Tensor scale_{CPU};
```
before in the code, where `scale_` is a member variable of Type `caffe2::Tensor`
This pattern actually serves two purposes, if `scale_` is partially initialized with device type but not size, this call will
initialize Tensor with the correct size, or if `scale_` is already initialized with size, it will check whether the size
matches a runtime value `N` and if not it will Resize. To rewrite this we'll do the following:
```
if (!scale_.defined() || scale_.numel() != N) {
  ReinitializeTensor(&scale_, {N}, at::dtype<float>().device(CPU));
  // set initial value for scale_
}
```
There are some variants, if `scale_` is resized to a constant size, we can call `ReinitializeTensor` instead
```
if (scale_.numel() != 1) {
  scale_->Resize(1);
}
```
-->
```
ReinitializeTensor(&scale_, {1}, at::dtype<float>().device(CPU));
```
Normal Resize will be refactored directly into ReinitializeTensor:
```
scale_->Resize(N);
```
-->
```
ReinitializeTensor(&scale_, {N}, at::dtype<float>().device(CPU));
```
Reviewed By: dzhulgakov
Differential Revision: D13667883
fbshipit-source-id: 2c7cb61544b72765b594011b99150eb5a1b50836
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16086
[caffe2] RNN operators should inherit step_net device_options
According to the NetDef documentation, if a network has a specific device option it applies to all network operators that do not explicitly specify one.
But this does not seem to be the case for RecurrentNetwork operators
Reviewed By: orionr
Differential Revision: D13699552
fbshipit-source-id: 14529bc9504e3b02f763e3c2429be21e46f82b68
Summary:
Add support for type inference for optional type refinement.
If a conditional is of the form "x is None" or "x is not None", or is a boolean expression containing multiple none checks, the proper type refinements are inserted in each branch.
For example:
if optional_tensor is not None and len(optional_tensor) < 2:
    # optional_tensor is a Tensor
if optional_tensor1 is not None and optional_tensor2 is not None:
    # both optional_tensor1 and optional_tensor2 are Tensors
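A minimal TorchScript sketch of the resulting refinement (the function name is illustrative):

```python
import torch
from typing import Optional

@torch.jit.script
def first_dim(x: Optional[torch.Tensor]) -> int:
    if x is not None and x.dim() > 0:
        # x is refined from Optional[Tensor] to Tensor in this branch
        return x.size(0)
    return -1
```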
TODO:
- not run an op for unchecked unwrap optional in the interpreter
- potentially refine types to prim::None (omitted for now to simplify things & because it's not an actual use case).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15587
Differential Revision: D13733810
Pulled By: eellison
fbshipit-source-id: 57c32be9f5a09ab5542ba0144a6059b96de23d7a
Summary:
Mention that if enforce_sorted=True, the user can set
enforce_sorted=False. This is a new flag that is probably hard to
discover unless one thoroughly reads the docs.
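For reference, the flag in question, shown with toy data:

```python
import torch
from torch.nn.utils.rnn import pack_padded_sequence

# Two padded sequences of lengths 2 and 3 -- NOT sorted by decreasing
# length, which would normally raise without enforce_sorted=False.
seqs = torch.zeros(2, 3, 1)
lengths = torch.tensor([2, 3])
packed = pack_padded_sequence(seqs, lengths, batch_first=True,
                              enforce_sorted=False)
```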
Fixes #15567
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16084
Differential Revision: D13701118
Pulled By: zou3519
fbshipit-source-id: c9aeb47ae9769d28b0051bcedb8f2f51a5a5c260
Summary:
This PR fixes a race condition for the TCP init method, where the master rank can exit earlier than slave ranks, causing the TCP daemon thread to be shut down before other slaves are able to access it.
This change lets every rank (process) write a special key to the store to mark that it has completed (and thus is about to exit). The master rank (which hosts the server) always waits until all ranks have completed before completing itself.
This should fix: https://github.com/pytorch/pytorch/issues/15638
Tested using the repro of https://github.com/pytorch/pytorch/issues/15638 and works fine. Also test_distributed and test_c10d should have already had this coverage.
I had to make the rendezvous test in c10d use a world size of 1, since it is single-process code.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15684
Differential Revision: D13570904
Pulled By: teng-li
fbshipit-source-id: 34f3bc471204bbd29320df359347ad5561c6b589
Summary:
Based on offline discussion it should be less surprising to the users of existing code. Thus caffe2::Tensor is now a move-only class (as it used to be), explicit calls to UnsafeSharedInstance() are necessary to get shared_ptr behavior.
This change also identified a few places that misused the copy constructor - those are fixed
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15416
Reviewed By: Yangqing
Differential Revision: D13524598
fbshipit-source-id: aea12d6dff77342606fa88ce4ddddbff266245a7
Summary:
This PR inlines `Attributes` into `Node`. It helps to cleanup the code a little as everything is one place (some of the cleanups are included in the PR).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16098
Differential Revision: D13717637
Pulled By: ZolotukhinM
fbshipit-source-id: c54ae65178a95a01354688921a9ccb1ca699f8eb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15856
They seem to be wrong.
cc zdevito to take a look but I think this is now more correct.
It's weird this didn't cause linker errors. Probably, this functionality isn't used across library boundaries yet.
Reviewed By: dzhulgakov
Differential Revision: D13605257
fbshipit-source-id: 7077ca9027c3ac79a4847ec15ead7ddb28696445
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15855
This is preparation work for moving IValue to c10.
Reviewed By: ezyang
Differential Revision: D13605259
fbshipit-source-id: cc545f582ab8607bb02aaf71273cb2710200b295
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16052
We need IValue to take/return Blob as an intrusive_ptr because we want to pass it around and Blob has disabled copying.
This is needed in a diff on top.
Reviewed By: ezyang
Differential Revision: D13684761
fbshipit-source-id: 7cb3d7e9fec39a2bc9f063d4d30404e6d7016eb2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16050
The c10 dispatcher will (soon) depend on IValue and IValue can't be moved to c10 yet because it depends on at::Tensor, which depends on legacy Type dispatch and we don't want the legacy dispatch in c10.
So instead, we move the c10 dispatcher back to ATen/core until we can actually move at::Tensor to c10.
Reviewed By: ezyang
Differential Revision: D13684517
fbshipit-source-id: 1125f4254223907c52f96ff73034f6d4ae9fd0a7
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16104
PyTorch PR 15784 removed cuda-convnet from the contrib directory. This broke
some internal-only fb dependencies. Moving this to the internal area.
Reviewed By: ezyang
Differential Revision: D13709112
fbshipit-source-id: 2d7811545da67489869b59c350a29817eff693cf
Summary:
Some cleanup to wildcard handling, including one bugfix: previously, we were not considering writes to the wildcard set as part of the potential write set for nodes.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16041
Differential Revision: D13705738
Pulled By: suo
fbshipit-source-id: acb8ccbaa70fe47445577ddf24a69f84630de411
Summary:
Confirmed on a local run that all the additional headers are present. This shouldn't be caught in any existing tests though.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16124
Differential Revision: D13720773
Pulled By: pjh5
fbshipit-source-id: 22a42639f5649cac555ecc5a8b6760a8cbfcf01f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15843
RNN/LSTMs only need one bias vector, but our implementation uses two to be compatible with CuDNN. This diff adds a comment to explain this.
Reviewed By: ezyang
Differential Revision: D13602365
fbshipit-source-id: eef5bd9383d9f241dc0ef0472f753b4a44cc19b5
Summary:
Fixes #12643, amends #3341.
- Allow multidimensional input ~~(but apply softmax over `dim=-1`)~~ with `dim` argument
- Cleaner: Less lines of code
- Faster (1.32x speedup vs original, 2x speedup vs using `torch.Distributions`)
- Small fixes in docstring
- Remove some references in docstring. Was the linked (excellent) ipynb the first to do the straight-through trick? Instead, I propose changing the reference to the two papers best known for it.
- Add deprecationwarning for `eps`. It's not needed anymore.
- Initial commit keeps some code alternatives commented to exploit CI
- As of discussion when `gumbel_softmax` was added (#3341), this was merged into `torch.nn.functional` before all the work with `Distributions` and `Pyro`, and there will probably be multiple other best practices for this in the future.
I've tested building using the `Distributions`-api, but it was too slow, see below.
I therefore propose not using `Distributions` to keep it fast and simple, but adding a comment in docstring that `gumbel_softmax` may be deprecated in the future.
```
dist = torch.distributions.RelaxedOneHotCategorical(temperature=tau, logits=logits, validate_args=False)
y_soft = dist.rsample()
```
Pros:
* Built using tricks like `logsumexp` etc
* Explicitly uses `torch.distributions.utils._finfo` to avoid overflow (old implementation had an `eps` flag)
* Maintained for this exact purpose.
Cons:
* Very slow. Construction of distribution adds overhead see timings below. May be solved in future with speedups of `TransformedDistribution` and `Distribution`.
* Assumes which `dim` to apply softmax over.
```
y_soft = logits.new(logits.shape)
y_soft = (logits - y_soft.exponential_().log()) / tau # Gumbel noise
y_soft = y_soft.softmax(dim) # Gumbel softmax noise
```
Pros:
* Faster
```
import time
start = time.time()
num_draws = 1000000
logits = torch.randn(1, 3)
counts = torch.zeros(1, 3)  # accumulator for drawn one-hot samples
for draw in range(num_draws):
    y_draw = gumbel_softmax(logits, hard=True)
    counts = counts + y_draw
end = time.time()
print(end - start)
>> 12.995795965194702
>> 7.658372640609741
>> 20.3382670879364
```
Decide on which path to choose. I'll commit changes to the unit tests in a while to show that it passes both old and new tests. I'll also remove the commented code about `RelaxedOneHotCategorical`.
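For completeness, the functional entry point under discussion can be exercised like this:

```python
import torch
import torch.nn.functional as F

logits = torch.randn(1, 3, requires_grad=True)
# hard=True returns a one-hot sample in the forward pass while keeping
# the soft sample's gradient via the straight-through trick
y_hard = F.gumbel_softmax(logits, tau=1.0, hard=True)
```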
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13339
Differential Revision: D13092434
Pulled By: ezyang
fbshipit-source-id: 4c21788df336f4e9c2ac289022e395b261227b4b
Summary:
If "matches_jit_signature" is set to True for a particular function, we will assume that the func syntax follows the JIT signature syntax. This is a temporary attribute and doesn't need to be set by developers outside the core team. It serves as a means of tracking an ongoing schema unification with the goal of aligning func syntax with other components of PyTorch in order to reduce overall complexity and match coverage of different function descriptions.
Followup PRs might be about removing _out from native_functions.yaml and using Tensor annotations instead, etc.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16040
Reviewed By: ezyang
Differential Revision: D13703176
Pulled By: cpuhrsch
fbshipit-source-id: ce248e1823a6f18efa95502f9f3eebf023b4a46c
Summary:
Without this "if", the code below throws "'Linear' object has no attribute '_buffers'".
With this "if", the error becomes "cannot assign buffer before Module.\_\_init\_\_() call", which I think is more accurate, just like register_parameter.
```
import math
import torch
from torch.nn.parameter import Parameter
from torch.nn import functional as F
from torch.nn import Module

class Linear(Module):
    def __init__(self, in_features, out_features, bias=True):
        self.in_features = in_features
        self.out_features = out_features
        self.register_buffer('test', torch.Tensor(out_features, in_features))
        self.weight = Parameter(torch.Tensor(out_features, in_features))
        if bias:
            self.bias = Parameter(torch.Tensor(out_features))
        else:
            self.register_parameter('bias', None)
        super(Linear, self).__init__()
        self.reset_parameters()

    def reset_parameters(self):
        stdv = 1. / math.sqrt(self.weight.size(1))
        self.weight.data.uniform_(-stdv, stdv)
        if self.bias is not None:
            self.bias.data.uniform_(-stdv, stdv)

    def forward(self, input):
        return F.linear(input, self.weight, self.bias)

    def extra_repr(self):
        return 'in_features={}, out_features={}, bias={}'.format(
            self.in_features, self.out_features, self.bias is not None
        )

linear = Linear(3, 4)
```
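For contrast, the same buffer registration succeeds once `Module.__init__()` runs first; a minimal sketch (the class name is illustrative):

```python
import torch
from torch.nn import Module

class Buffered(Module):
    def __init__(self):
        super(Buffered, self).__init__()  # sets up _buffers before use
        self.register_buffer('test', torch.zeros(4, 3))

m = Buffered()
```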
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16110
Differential Revision: D13715839
Pulled By: soumith
fbshipit-source-id: c300eff0a8655aade448354cf489a592f7db722a
Summary:
respect grad guard for torch.jit._fork and torch.jit._wait.
Verified that the test failed without the fix, and pass with the fix.
Ideally I would like to enable and disable grad inside the forked function.
It doesn't seems like it's supported at this moment. This code handles that
as well.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16101
Differential Revision: D13708374
Pulled By: gqchen
fbshipit-source-id: 0533f080c4d0253fb4c61d2a0d3cc22de5721a09
Summary:
1) Reverts https://github.com/pytorch/pytorch/pull/12302 which added support for batched pdist. Except I kept the (non-batched) test improvements that came with that PR, because they are nice to have. Motivation: https://github.com/pytorch/pytorch/issues/15511
2) For the non-batched pdist, improved the existing kernel by forcing fp64 math and properly checking cuda launch errors
3) Added a 'large tensor' test that at least on my machine, fails on the batch pdist implementation.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15901
Reviewed By: ezyang
Differential Revision: D13616730
Pulled By: gchanan
fbshipit-source-id: 620d3f9b9acd492dc131bad9d2ff618d69fc2954
Summary:
PR to update the shape notation for all of the torch.nn modules to take a unified form. The goal is to make these definitions machine-readable and checkable by unifying the style across all of the different modules.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15741
Differential Revision: D13709601
Pulled By: ezyang
fbshipit-source-id: fb89a03903fdf0cd0dcf76f3e469b8582b2f3634
Summary:
This issue was discovered by fehiepsi in https://github.com/uber/pyro/issues/1706 with the `log_prob` computation for Binomial, ~and can be seen with `torch.float32` when we have a combination of low probability value and high `total_count` - a test is added to capture this (since scipy only uses float64, the comparison is done using relative tolerance).~
The problem is in the code that tries to pull out the minimum values amongst the logits (written by me earlier, presumably to avoid numerical instability issues), but it is not needed.
EDIT: After a few attempts, I have been unable to reliably show that the change is more numerically stable, and have removed my previous test which fails on linux. The reason is that the issue manifests itself when `total_count` is high and `probs` is very low. However, the precision of `lgamma` when `total_count` is high is bad enough to wash away any benefits. The justification for this still stands though - (a) simplifies code (removes the unnecessary bit), (b) is no worse than the previous implementation, (c) has better continuity behavior as observed by fehiepsi in the issue above.
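The simplified `log_prob` reduces to the standard form below, shown as a pure-Python sketch (not the actual Binomial implementation) to make the lgamma-precision point concrete:

```python
import math

def binomial_log_prob(k, total_count, probs):
    # log C(n, k) + k*log(p) + (n - k)*log(1 - p). The log-binomial
    # coefficient uses lgamma, whose precision at large total_count
    # dominates the overall error (as noted above).
    log_binom = (math.lgamma(total_count + 1)
                 - math.lgamma(k + 1)
                 - math.lgamma(total_count - k + 1))
    return (log_binom + k * math.log(probs)
            + (total_count - k) * math.log1p(-probs))

# probability of exactly one head in two fair flips
p = math.exp(binomial_log_prob(1, 2, 0.5))
```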
cc. fehiepsi, alicanb, fritzo
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15962
Differential Revision: D13709541
Pulled By: ezyang
fbshipit-source-id: 596c6853b6e4d5fba42336afa168a665ab6fbde2
Summary: This PR aims to remove support for cuDNN 6.
Differential Revision: D13709595
Pulled By: ezyang
fbshipit-source-id: 853624db1cf66b0534d7028654c38c2806fb4107
Summary:
Idiomatic pyi files will fail with Python 2 flake8 even
though they would work with mypy. This is because pyi
files generally use Python 3 only syntax. No point
in linting them.
There are currently no pyi files checked in, this is purely
a prophylactic measure.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16105
Reviewed By: zou3519
Differential Revision: D13709409
Pulled By: ezyang
fbshipit-source-id: ec4a959e146f81ccb9533b04348be8dd78808421
Summary:
On some cloud-based x86 systems /sys/ is not mounted.
cpuinfo has a work-around for these systems, but it reports an error if sysfs files fail to read, and this error was confusing to some users (e.g. pytorch/cpuinfo#20). This update downgrades the error to a warning, so it is not reported with default configuration options.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16107
Differential Revision: D13715243
Pulled By: soumith
fbshipit-source-id: f5c4c86422343ca449487f0185f3a8865ccf3b9d
Summary:
1. Added `torch/csrc/cuda/Event.h` and `torch/csrc/cuda/Event.cpp` to bind Python Event class to C++ implementation.
2. Move all CUDA runtime invocations from `torch/cuda/streams.py` to C++
3. Added tests to cover Stream and Event APIs. ~(event IPC handle tests is introduced in #15974)~
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15937
Differential Revision: D13649001
Pulled By: mrshenli
fbshipit-source-id: 84ca58f35f6ba679a4ba33150ceba678d760d240
Summary:
- a typo fixed
- made the docs consistent with #5108
And maybe one more change is needed. According to the current docs
> The batch size should be larger than the number of GPUs used **locally**.
But shouldn't the batch size be larger than the number of GPUs used **either locally or remotely**? Sadly, I couldn't experiment this with my single GPU.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16010
Differential Revision: D13709516
Pulled By: ezyang
fbshipit-source-id: e44459a602a8a834fd365fe46e4063e9e045d5ce
Summary:
There is a little error in the comment: since the dependency is "A->B", it is task B that must start after task A finishes, not "B".
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15922
Differential Revision: D13709579
Pulled By: ezyang
fbshipit-source-id: 735afe83f4532b7c7456da3e96209b3e07071f37
Summary:
TensorProto.DataType in caffe2/proto/caffe2.proto has BYTE = 3 defined, while there is no corresponding TypeMeta defined in caffe2/core/types.cc: DataTypeToTypeMeta. This issue failed the C++ tutorial of MNIST + LMDB.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15627
Differential Revision: D13709602
Pulled By: ezyang
fbshipit-source-id: d4826d0f9b3975e6a8478d4bad1abbbedcaea197
Summary:
1. I fixed the importing process, which had some problems
- **I think `setup_helpers` should not be imported as the top level module. It can lead to many future errors. For example, what if `setup_helpers` imports another module from the upper level?** So we need to change it.
- The code is not consistent with other modules in the `tools` package. For example, other
modules in the package import `from tools.setuptools...`, not `from setuptools...`.
- **It should be able to run with `python -m tools.build_libtorch` command** because this module is a part of the tools package. Currently, you cannot do that and I think it's simply wrong.
~~2. I Added platform specific warning messages.
- I constantly forgot that I needed to define some environment variables in advance specific to my platform to build libtorch, especially when I'm working at a non pytorch root directory. So I thought adding warnings for common options would be helpful .~~
~~3. Made the build output path configurable. And a few other changes.~~
orionr ebetica
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15471
Differential Revision: D13709607
Pulled By: ezyang
fbshipit-source-id: 950d5727aa09f857d973538c50b1ab169d88da38
Summary:
If we use clang with SSE4 support, we will hit a function-redefinition
error between [1] and [2]. This patch adds some checks to fix the
problem.
I just turn on USE_NATIVE_ARCH with clang, then I hit the redefinition error.
[1]
caffe2/operators/quantized/int8_simd.h
[2]
third_party/gemmlowp/gemmlowp/fixedpoint/fixedpoint_sse.h
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13859
Differential Revision: D13095694
Pulled By: ezyang
fbshipit-source-id: c65166e4d5a04bb54e2b82c52740af00116ccb0d
Summary:
Use case:
Some data loader tests rely on `psutil` (a third party lib). So they are guarded by `skipIf`. But we want to always test them on CI envs. With `IS_PYTORCH_CI`, we can raise if `psutil` is not found.
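A minimal sketch of the pattern (the env-var name `IS_PYTORCH_CI` spelling and the helper shape here are assumptions for illustration):

```python
import os
import unittest

# Hypothetical: assume CI sets this env var.
IS_PYTORCH_CI = os.environ.get("IS_PYTORCH_CI", "0") == "1"

try:
    import psutil  # third-party; optional outside CI
    HAS_PSUTIL = True
except ImportError:
    HAS_PSUTIL = False
    if IS_PYTORCH_CI:
        # On CI we want a hard failure rather than a silent skip.
        raise RuntimeError("psutil is required on CI but was not found")

@unittest.skipIf(not HAS_PSUTIL, "psutil not found")
class TestDataLoader(unittest.TestCase):
    def test_workers(self):
        self.assertGreater(psutil.cpu_count(), 0)
```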
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16006
Reviewed By: ezyang
Differential Revision: D13673957
Pulled By: yf225
fbshipit-source-id: c63a7138093f45333c0b371fed0bcc88b67f2a22
Summary:
Adds support for `torch.norm`:
i. multiple dimensions for `dim`
ii. a `dtype` argument that specifies the math/output tensor type
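For intuition, the scalar case reduces to the usual p-norm; a pure-Python sketch (the `dim`/`dtype` generalizations are described in comments, not implemented here):

```python
def p_norm(values, p=2.0):
    # p-norm over a flat sequence. torch.norm's `dim` argument
    # generalizes this to reducing over several dimensions at once,
    # and `dtype` selects the type used for the accumulation/output.
    return sum(abs(float(v)) ** p for v in values) ** (1.0 / p)

assert p_norm([3, 4]) == 5.0        # Euclidean norm
assert p_norm([1, 1, 1, 1], p=1.0) == 4.0  # L1 norm
```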
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15414
Differential Revision: D13702022
Pulled By: ezyang
fbshipit-source-id: da2676f2b6aff988889b1539d0de8ecd4946823a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15912
Codemod generated with clangr shard mode, 25 files per diff,
To eliminate partially initialized Tensors, we split the initialization of local Tensor variables into two steps: first declare an uninitialized Tensor, then
call `ReinitializeTensor` to initialize it.
motivation: https://github.com/pytorch/pytorch/pull/12407
Reviewed By: dzhulgakov
Differential Revision: D13586734
fbshipit-source-id: 8485d2c51225343961351c7a2e8f95055534f9a9
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16081
A simple version of bound shape inference, conditioned on batch size. In addition to doing normal shape inference, it will change the batch size (1st dim of the shape) of the inputs as well as of batch-size-modulating ops such as `SparseLengthsSum`. Support for more ops, such as `SparseToDense`, is probably needed. We can build on this.
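A toy sketch of the "conditioned on batch size" step (the helper name and shape representation are hypothetical; real bound shape inference then propagates these shapes through the net):

```python
def rebind_batch_size(input_shapes, batch_size):
    # Rewrite the 1st (batch) dim of every input shape to the given
    # batch size; downstream inference must special-case ops that
    # modulate the batch dim, e.g. SparseLengthsSum.
    return {name: (batch_size,) + tuple(dims[1:])
            for name, dims in input_shapes.items()}

shapes = rebind_batch_size({"X": (1, 16), "Y": (1, 8)}, 32)
```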
Reviewed By: jackm321, rdzhabarov
Differential Revision: D13661968
fbshipit-source-id: 6a724a647e109757c26e3e26e15a49725ecc75cc
Summary:
1. Port the FractionalMaxPool3d implementation from THNN/THCUNN to ATen.
2. Expose this function to Python module nn.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15575
Differential Revision: D13612848
Pulled By: chandlerzuo
fbshipit-source-id: 5f474b39005efa7788e984e8a805456dcdc43f6c
Summary:
The cumsum over the probabilities can be not monotonically
non-decreasing. Thus it is hard to detect zero probability
classes using just the cumsum.
This changes the binary search postprocessing to use the
(non-cumulated) distribution instead.
Thank you, jcjohnson, for the bug report with
reproducing case.
Fixes: #13867
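A pure-Python sketch of the corrected sampling logic (not the CUDA kernel): binary-search the cumulative distribution, then post-process with the *non-cumulated* probabilities to step past zero-probability classes:

```python
import bisect
import random

def sample_index(probs):
    # Inclusive cumulative distribution.
    cum, total = [], 0.0
    for p in probs:
        total += p
        cum.append(total)
    u = random.uniform(0.0, total)
    idx = bisect.bisect_left(cum, u)
    # The cumsum can be flat (not strictly increasing) at classes with
    # zero probability, so check the raw distribution and step past
    # any class whose own probability is zero.
    while probs[idx] == 0.0 and idx + 1 < len(probs):
        idx += 1
    return idx

random.seed(0)
counts = [0, 0, 0]
for _ in range(1000):
    counts[sample_index([0.5, 0.0, 0.5])] += 1
assert counts[1] == 0  # the zero-probability class is never sampled
```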
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16075
Differential Revision: D13695565
Pulled By: soumith
fbshipit-source-id: 02c4d6f868f0050c1ae7d333f4317c5610e49cd9
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16047
Implements a single thread-safe map enabling sharing of the generated graph between
different ops.
Added `model_id` to every onnxified op to help create a unique id in the map.
Some formatting fixes.
Reviewed By: yinghai
Differential Revision: D13663927
fbshipit-source-id: 27417e8fe752fdd48abb6a87966cd76d592e1206
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16061
I discovered I needed to delete these names in preparation of moving
THCCachingAllocator to c10_cuda; might as well also fix all the other
sites too.
Reviewed By: dzhulgakov
Differential Revision: D13686869
fbshipit-source-id: e8cc55d39ac4bfd3e3a22c761f89a7a111ce5f5e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16059
Just deleted all __cplusplus ifdef guards; we only ever use
these headers in C++ contexts.
Reviewed By: dzhulgakov
Differential Revision: D13686580
fbshipit-source-id: ce28c4a32f3596bfb17aeeb34904a02899991453
Summary:
The correct logic is as follows:
* If there is an earlier split, we need to combine with its result
* If there is *not* a later split, we need to project before saving into the output.
This should partially fix #15837. For example:
```
In [7]: a=torch.ones([1838860800], dtype=torch.float, device="cuda:1")
In [8]: a.mean()
Out[8]: tensor(1., device='cuda:1')
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16023
Differential Revision: D13678449
Pulled By: umanwizard
fbshipit-source-id: ab5078484c88e96bb30121b5cf24a0e8b0a8c2f8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15814
Plan is to remove the APIs we want to deprecate one by one and make sure it still builds in sandcastle and ossci
Reviewed By: ezyang
Differential Revision: D12812029
fbshipit-source-id: ea0c3dd882bec95fcd4507160ebc61f598b6d040
Summary:
Treat GenericList similarly to tuples and TensorList: recursively unpack them and assignValueTrace accordingly. Also add interpreter support for ListUnpack on GenericList
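The recursive-unpacking idea can be sketched in pure Python (a stand-in for the tracer's handling of tuples/TensorList/GenericList, not the actual implementation):

```python
def flatten(value):
    # Recursively unpack tuples and lists into a flat sequence of leaf
    # values, mirroring how the tracer assigns a value trace to each
    # element of a nested container.
    if isinstance(value, (tuple, list)):
        out = []
        for v in value:
            out.extend(flatten(v))
        return out
    return [value]

assert flatten((1, [2, (3, 4)])) == [1, 2, 3, 4]
```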
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15969
Differential Revision: D13665139
Pulled By: jamesr66a
fbshipit-source-id: cd8cb3dd7475f424e48a69d217f2eac529df9f6a
Summary:
This puts stubs in the autograd profiler for the use of cuda APIs allowing the cuda parts of libtorch to be linked separately from the CPU parts.
This also edits the buck build.
Previous:
For GPU builds:
_C -> csrc -> caffe2
For CPU builds:
_C -> csrc-cpu -> caffe2
Now:
GPU:
_C -> libtorch_cuda -> (libtorch -> caffe2, for CPU)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15898
Reviewed By: ailzhang
Differential Revision: D13617991
Pulled By: zdevito
fbshipit-source-id: 6d84a50bb356a54b4217f93219902755601b00e1
Summary:
1. Add some gloo communication operators into related fallback list;
2. Work around to avoid compiling errors while using fallback operator whose CPU operator inherits from 'OperatorBase' directly like PrefetchOperator;
3. Add new cpu context support for some python module files and resnet50 training example file.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11330
Reviewed By: yinghai
Differential Revision: D13624519
Pulled By: wesolwsk
fbshipit-source-id: ce39d57ddb8cd7786db2e873bfe954069d972f4f
Summary:
Previously we were only constant propping prim::Constants, but we should be constant propping prim::None as well.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15979
Differential Revision: D13664692
Pulled By: eellison
fbshipit-source-id: 01839403576c21fc030c427e49275b8e1210fa8f
Summary:
Similarly to https://github.com/pytorch/pytorch/pull/13777, we apply post-processing quantization to RNN cell modules (`RNNCell`, `LSTMCell`, and `GRUCell`).
A further follow-up PR will involve quantizing the full `RNN`, `GRU`, and `LSTM` modules. This depends on those modules being scriptable as part of the standard library scripting effort, though. Note that infrastructure in this pr such as `gather_quantized_params` is currently unused but should be used in the future when we can port over the full RNN modules.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15469
Differential Revision: D13545802
Pulled By: jamesr66a
fbshipit-source-id: ad3b694517842893ea619438e9f5e88fd7b96510
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16020
Needs to go over more iterations. For conv, I think we need a high-level interface that abstracts out the low-level details of which code path will be taken (acc16, outlier-aware, depth-wise, group conv, ...); otherwise the client code will be complex, as can be seen from the DNNLOWP Conv ops. This will also help us make the interface more stable.
Reviewed By: dskhudia, jianyuh
Differential Revision: D13588996
fbshipit-source-id: 9afce9e441bcaf20437fcc2874fb9d4165a46bcb
Summary:
Timings are the same as for `std`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15892
Differential Revision: D13651173
Pulled By: umanwizard
fbshipit-source-id: a26bf1021dd972aa9e3e60fb901cd4983bfa190f
Summary:
Use new test utils in converter_nomnigraph_test , and add utils to set device option name, external inputs, outputs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15751
Differential Revision: D13586228
Pulled By: duc0
fbshipit-source-id: ff809dd7bf9f30641ce2a6fef7e2810f005521c2
Summary:
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
cc meganset
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15694
Differential Revision: D13573064
Pulled By: zou3519
fbshipit-source-id: 1d0b693d7c26db91826b81e6c98b45a69b5e9bc4
Summary:
In #15964, I learned that `errno` is only meaningful if the function call fails. E.g., on some macOS versions, a successful `fork()` sets `errno` to `EINVAL` in the child process. This commit changes the `SYSCALL` macro so error checking is only done when an error happens. This means checking whether `rv == -1` for most calls, but checking `rv == nullptr` for `inet_ntop`.
Now `SYSCALL` accepts a second argument `success_cond`, which should be an expression returning whether the call succeeded. `SYSCHECK_ERR_RETURN_NEG1` is the shorthand for checking if rv is `-1`.
Any suggestion on better macro names is welcomed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15986
Reviewed By: janewangfb
Differential Revision: D13661790
Pulled By: pietern
fbshipit-source-id: 9551b14b9f88805454a7bfb8e4d39e0f3aed8131
Summary:
bypass-lint
- Change all Caffe2 builds to use setup.py instead of cmake
- Add a -cmake- Caffe2 build configuration that uses cmake and only builds cpp
- Move skipIfCI logic from onnx test scripts to the rest of CI logic
- Removal of old PYTHONPATH/LD_LIBRARY_PATH/etc. env management
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15917
Reviewed By: orionr
Differential Revision: D13637583
Pulled By: pjh5
fbshipit-source-id: c5c5639db0251ba12b6e4b51b2ac3b26a8953153
Summary:
In Python, you can use the call operator to invoke the `forward()` method of a module. In C++ this was currently not possible, because I couldn't figure out how to deduce the return type of a module's `forward()` method under the constraint that `forward()` may not exist at all (since the base module class in C++ does not mandate a `forward()` method). I now figured it out, so the call operator can be used.
ezyang ebetica
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15831
Differential Revision: D13652676
Pulled By: goldsborough
fbshipit-source-id: ccab45a15215dda56460e560f0038781b539135f
Summary:
- Fixed a few typos and grammar errors.
- Changed the sentences a bit.
- Changed the format of the tuples to be consistent with padding notations in the other places. For example, `ReflectionPad2d`'s dostring contains :math:`H_{out} = H_{in} + \text{padding\_top} + \text{padding\_bottom}`.
I also made sure that the generated html doesn't break.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15984
Differential Revision: D13649939
Pulled By: soumith
fbshipit-source-id: 0abfa22a7bf1cbc6546ac4859652ce8741d41232
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15996
As reported in issue #15911, gcc 4.9 was hitting an internal compiler error due to a complex use of lambda functions in conv_dnnlowp_op.cc and conv_acc16_op.cc. This diff simplifies them.
Reviewed By: viswanathgs
Differential Revision: D13648264
fbshipit-source-id: 1551ae8a0a7653749185dca51ccceb2471b96b82
Summary:
Tested locally. It can now be started by running `set EXTRA_CAFFE2_CMAKE_FLAGS= -DTORCH_STATIC=1` before the build. If we want to make sure it works, maybe we should add it to CI.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15989
Differential Revision: D13649935
Pulled By: soumith
fbshipit-source-id: 956945ed572819d8cf0bc9bd48df3ea9bc6f4a8a
Summary:
This is follow up on #13945 where we had to turn off some TRT tests because some ops were not ready to accept ONNX opset 9+ models. This PR fixes Reshape.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15380
Differential Revision: D13649825
Pulled By: houseroad
fbshipit-source-id: b72e62803de5b63cc001c3fe4b3bf64dfa996e94
Summary:
For inference, if the StopGradient op is in-place, we just remove it.
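A minimal sketch of such a removal pass (op records here are simplified stand-ins for Caffe2 OperatorDefs, not the real net representation):

```python
def remove_inplace_stop_gradient(ops):
    # An in-place StopGradient (input blob == output blob) is a no-op
    # at inference time, so the pass simply drops it from the net.
    return [op for op in ops
            if not (op["type"] == "StopGradient"
                    and op["input"] == op["output"])]

net = [{"type": "StopGradient", "input": "x", "output": "x"},
       {"type": "Relu", "input": "x", "output": "y"}]
pruned = remove_inplace_stop_gradient(net)
```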
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12152
Differential Revision: D13633946
Pulled By: yinghai
fbshipit-source-id: 57762bcc37b38a1d39cb4af316ca50bfe961b105
Summary:
This is the first of several PRs to simplify AliasDb usage.
- Hide the concept wildcards from users. They are too hard to think about and too easy to forget about.
- Start moving "mutability-safe" graph mutation methods into AliasDb (right now, the various methods that deal with topological move).
Eventually I want to create a "mutability-aware" handle to the graph. If you only use that handle to transform the graph, you can be sure that all transformations are safe with respect to mutability.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15656
Differential Revision: D13615492
Pulled By: suo
fbshipit-source-id: 5c39a157b4ea76f1f976315d06a314a89cc4f22f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15908
"OperatorBase::" is changed to "this->template ".
For example,
# This no longer works
OperatorBase::GetSingleArgument<>()
# Should change to:
this->template GetSingleArgument<>()
https://fb.workplace.com/groups/101100140348621/permalink/576804082778222/
Follow up of D13574832.
Sample Diff:
D9319742, D10045844.
Reviewed By: jspark1105
Differential Revision: D13613574
fbshipit-source-id: 2cb4094557b4af78d41e289816cad3e1194fb82c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15967
Codemod generated with clangr shard mode, 25 files per diff,
To eliminate partially initialized Tensors, we split the initialization of local Tensor variables into two steps: first declare an uninitialized Tensor, then
call `ReinitializeTensor` to initialize it.
motivation: https://github.com/pytorch/pytorch/pull/12407
Reviewed By: smessmer
Differential Revision: D13586735
fbshipit-source-id: eae2d79e1107a2e813ce3809e690af4706aaa9ca
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15947
Codemod generated with clangr shard mode, 25 files per diff,
To eliminate partially initialized Tensors, we split the initialization of local Tensor variables into two steps: first declare an uninitialized Tensor, then
call `ReinitializeTensor` to initialize it.
motivation: https://github.com/pytorch/pytorch/pull/12407
Reviewed By: smessmer
Differential Revision: D13586732
fbshipit-source-id: 5295ab27ca0155f96a4fccf9c0ba8a609101ba24
Summary:
While integrating fork/join into production translation, we found that trying to export `transpose()` where the input is of `TensorType` (rather than `CompleteTensorType`) failed. This is not ideal, since `TensorType` still contains the number of dimensions of the tensor, and that's all the `transpose` symbolic needs.
This PR introduces a pybind binding for `dim()` on `TensorType` (and `CompleteTensorType` by inheritance). We now use this in places where it logically makes sense in the symbolics: those symbolics which only require knowledge of the number of dimensions rather than concrete sizes.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15933
Differential Revision: D13639657
Pulled By: jamesr66a
fbshipit-source-id: 6e50e407e93060085fd00a686a928764d0ec888d
Summary:
Implemented the LeakyRelu operator for MKL-DNN; the speed-up of a single operation is up to 10X on BDW.
Implemented the reshape operator for MKL-DNN; it resolves an occasional crash seen when using the fallback reshape operator.
Implemented the CreateBlobQueue and SafeEnqueueBlobs operators; they resolve a crash seen when using the fallback operators.
Fell back to the CreateBlobsQueueDB, TensorProtosDBInput, and CloseBlobsQueue operators.
Implemented the adam operator for MKL-DNN; the speed-up of a single operator is up to 6X on BDW.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11696
Reviewed By: yinghai
Differential Revision: D10100438
Pulled By: wesolwsk
fbshipit-source-id: 0b6e06897cc11e0a8e349d80a870b1e72e47f10d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15692
It was leading to occasional crashes with dynamically linked CUDA because the runtime was already destroyed.
Also, unique_ptr<T[]> is more suitable than deque<T> for the purpose.
Reviewed By: Yangqing
Differential Revision: D13571988
fbshipit-source-id: 37eb26dfbe361c49160367b53f87bd037c6c0e46
Summary:
That makes the definition of a "fusable node" much simpler,
as we don't need to keep considering whether something has to be an
"exit node" at every step. The fuser now tries to maximize the
pointwise fusions first, and proceeds to prepending chunks and appending
concats only once a fixed point is reached.
This patch not only makes the fuser much simpler to reason about,
it also makes it significantly easier to implement features like SumToSize
fusion, improving the performance of derivative graphs.
cc zou3519 mruberry
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15633
Differential Revision: D13575306
Pulled By: zou3519
fbshipit-source-id: 0c55ea61d65d1f1ed3d75a8e1e83bc85a83f3aff
Summary:
Adding bindings for `.cpu()` and `.cuda()` to script.
It's worth noting that if the device remains unchanged, the returned tensor aliases the input, but if the device does change, they do not alias each other.
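The aliasing contract can be sketched with a toy class (purely illustrative; `FakeTensor`/`to_device` are not PyTorch APIs):

```python
class FakeTensor:
    def __init__(self, device):
        self.device = device

    def to_device(self, device):
        # If the device is unchanged, return self (the result aliases
        # the input); otherwise construct a new tensor on the target
        # device -- mirroring the .cpu()/.cuda() aliasing behavior.
        return self if device == self.device else FakeTensor(device)

t = FakeTensor("cpu")
assert t.to_device("cpu") is t       # same device: aliases the input
assert t.to_device("cuda") is not t  # device change: fresh tensor
```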
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15904
Differential Revision: D13632879
Pulled By: eellison
fbshipit-source-id: 024a04f267909674aa1e510562efd9cb081f407c
Summary:
4GB is still too large and leads to CUDA OOM failures.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15959
Differential Revision: D13635146
Pulled By: mrshenli
fbshipit-source-id: 3dc34a03d6ed65c458839d8fa37cd05bf3bc8106
Summary:
Turns out this has basically been implemented already in Resize.h / Resize.cuh.
Also added some testing, basically just to check that empty_strided behaves equivalently to as_strided.
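For reference, the layout contract that `as_strided` (and now `empty_strided`) expose can be sketched in pure Python:

```python
def strided_get(storage, strides, index, offset=0):
    # Map a multi-dimensional index to a flat storage position via
    # strides: pos = offset + sum(index[d] * strides[d]).
    pos = offset
    for i, s in zip(index, strides):
        pos += i * s
    return storage[pos]

storage = list(range(12))
# A 3x2 "view" with strides (1, 3): stepping dim 0 moves 1 element in
# storage, stepping dim 1 moves 3 elements.
value = strided_get(storage, (1, 3), (2, 1))  # position 2*1 + 1*3 = 5
```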
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15948
Differential Revision: D13631098
Pulled By: gchanan
fbshipit-source-id: eb0e04eead45e4cff393ebde340f9d265779e185
Summary:
This PR moves `deviceProperties` from `THCState` struct to `CUDAContext` in ATen and hence, takes one more step towards removing `THCState`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14834
Differential Revision: D13633956
Pulled By: soumith
fbshipit-source-id: 51820ac224fc566f17aa92570fd378cff4248596
Summary:
When compiling for `TORCH_CUDA_ARCH_LIST=7.5` we were getting ptxas warnings (https://github.com/pytorch/pytorch/issues/14310). This was because we had some hardcoded values when using launch_bounds in kernels. The maximum number of threads per multiprocessor is 1024 for Turing architecture (7.5) but 2048 for previous architectures. The hardcoded launch_bounds in the kernel were requesting for 2048 threads when compiling for Turing and hence were generating the warning.
This PR adds a macro that checks for the bounds on the launch bounds value supplied. The max number of threads per block across all architectures is 1024. If a user supplies more than 1024, I just clamp it down to 512. Depending on this value, I set the minimum number of blocks per sm. This PR should resolve https://github.com/pytorch/pytorch/issues/14310. The gradient computation being wrong reported in that PR is probably due to the faulty card.
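The clamping logic can be sketched as follows (pure-Python stand-in for the C++ macro; the 2048 threads-per-SM figure is the pre-Turing value and the derivation of min blocks-per-SM is an assumption for illustration):

```python
def clamp_launch_bounds(max_threads_per_block, threads_per_sm=2048):
    # The per-block thread limit is 1024 across all architectures;
    # requests above it fall back to 512, and the minimum number of
    # blocks per SM is then derived from the clamped value.
    if max_threads_per_block > 1024:
        max_threads_per_block = 512
    min_blocks_per_sm = threads_per_sm // max_threads_per_block
    return max_threads_per_block, min_blocks_per_sm

assert clamp_launch_bounds(2048) == (512, 4)   # over-limit request clamped
assert clamp_launch_bounds(1024) == (1024, 2)  # in-range request kept
```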
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15461
Differential Revision: D13633952
Pulled By: soumith
fbshipit-source-id: 795aa151109f343ab5433bf3cb070cb6ec896fff
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15884
Codemod generated with clangr shard mode, 25 files per diff,
To eliminate partially initialized Tensors, we split the initialization of local Tensor variables into two steps: first declare an uninitialized Tensor, then
call `ReinitializeTensor` to initialize it.
motivation: https://github.com/pytorch/pytorch/pull/12407
Reviewed By: hyuen
Differential Revision: D13586737
fbshipit-source-id: dc8e49e9f29505b8898bb19f84c1a983f2d811ab
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15407
Don't ask the tensor for its intrusive pointer if we just want to check if two tensors are the same.
This mirrors ATen APIs.
Reviewed By: dzhulgakov
Differential Revision: D13520389
fbshipit-source-id: 681317f36f480ab60e532bb08a073f98f39770fd
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15316
This starts cleaning up the files in c10 according to the module structure we decided on.
Move to c10/util:
- Half.h, Half-inl.h, Half.cpp, bitcasts.h
Move to c10/core:
- Device.h, Device.cpp
- DeviceType.h, DeviceType.cpp
i-am-not-moving-c2-to-c10
Reviewed By: dzhulgakov
Differential Revision: D13498493
fbshipit-source-id: dfcf1c490474a12ab950c72ca686b8ad86428f63
Summary:
Unfortunately I do not know how to test this without merging it first
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15934
Reviewed By: orionr
Differential Revision: D13627472
Pulled By: pjh5
fbshipit-source-id: 35eced1483bbf3c0c3f6f62fb7bbbf2f200e50e6
Summary:
Wasn't clearing optimizer buffers before adding new entries to it during deserialization. Successive calls to `torch::load` with the same optimizer would just append to the buffer container. Also moved `serialize()` function from `torch::optim::detail` into `torch::optim` so users can use it for custom optimizers.
Fixes #15792
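The clear-before-load fix generalizes beyond C++; a toy Python sketch of the bug and the fix (the `Optimizer` class here is illustrative, not the libtorch API):

```python
class Optimizer:
    def __init__(self):
        self.buffers = []

    def load(self, saved_buffers):
        # Clear existing entries first; without this, successive load()
        # calls on the same optimizer would append to the buffer
        # container instead of replacing its contents.
        self.buffers.clear()
        self.buffers.extend(saved_buffers)

opt = Optimizer()
opt.load([1, 2])
opt.load([1, 2])  # second load must not double the buffers
assert opt.buffers == [1, 2]
```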
ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15926
Differential Revision: D13623615
Pulled By: goldsborough
fbshipit-source-id: e193091f25f56a95f2a9648af312cb7caa45f300
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15876
Build changes made it so some .so libraries are now registered after GlobalInit is called. Although this shouldn't be common, it also shouldn't be explicitly excluded. These changes allow for late Caffe2 registration, but also warn in that case.
Reviewed By: kuttas
Differential Revision: D13608186
fbshipit-source-id: 0ca7bcd32516d374077db0c2548cf8c28ccdd5f6
Summary:
Currently these tests are taking most of the time in test_jit.py run, with the
proposed changes the testing time is reduced by ~75%:
```
TestEndToEndHybridFrontendModels.test_neural_style: 203.360s -> 10.650s
TestEndToEndHybridFrontendModels.test_snli: 422.315s -> 9.152s
TestEndToEndHybridFrontendModels.test_super_resolution: 73.362s -> 19.185s
time python test/test_jit.py (real): 13m50.828s -> 3m11.768s
time python test/test_jit.py (user): 85m59.745s -> 13m18.135s
time python test/test_jit.py (sys): 144m9.028s -> 25m58.019s
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15906
Differential Revision: D13619659
Pulled By: ZolotukhinM
fbshipit-source-id: 6c22d8740f8ddb865c3a0667af32653723383816
Summary:
Other changes:
1. Avoided using `THCDeviceTensor` by re-calculating the mapping from cuda (blockIdx, threadIdx) to input/output tensor index.
2. Changed Camelcase naming to underscore naming.
Differential Revision: D13546803
fbshipit-source-id: 1df54f13e64934da3d803d9b6586bd5208d42d6d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15841
Fix the bugs in dnnlowp to support int8/int16 quantization for sparsenn.
Reviewed By: jspark1105
Differential Revision: D13600878
fbshipit-source-id: 27f06d7c54a663208320c8f211714220a9b49540
Summary:
Python 2 doesn't allow invoking `exec` from a nested function:
```
  File "test/test_jit.py", line 4653
    exec(code, globals(), scope)
SyntaxError: unqualified exec is not allowed in function 'test' it is a nested function
```
This patch wraps exec with a separate function, making it work for both python2
and python3.
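A minimal sketch of the workaround pattern (names are illustrative; in Python 3, where `exec` is a function, the wrapper is simply a no-op pass-through):

```python
# Factoring the exec call into a module-level helper sidesteps
# Python 2's restriction on unqualified exec inside nested functions.
def _exec_wrapper(code, glob, loc):
    exec(code, glob, loc)

def test():
    def nested():
        scope = {}
        _exec_wrapper("x = 1 + 1", globals(), scope)
        return scope["x"]
    return nested()

assert test() == 2
```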
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15882
Differential Revision: D13614235
Pulled By: ZolotukhinM
fbshipit-source-id: 9a074308c2379f089402e0bf5a996cc649d6dbca
Summary:
Optimized the CPU version of nonzero. It is now 2x faster on average than numpy.
Can be further optimized for 1D tensors and boolean tensors.
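For reference, the semantics being optimized reduce (in 1-D) to collecting the indices of non-zero entries, sketched in pure Python:

```python
def nonzero_1d(values):
    # Indices of non-zero entries for a 1-D input; torch.nonzero
    # generalizes this to one row of coordinates per non-zero element.
    return [i for i, v in enumerate(values) if v != 0]

assert nonzero_1d([0, 3, 0, 1, 2]) == [1, 3, 4]
```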
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15190
Differential Revision: D13468570
Pulled By: VitalyFedyunin
fbshipit-source-id: e55ce54d60626a42d9a10a02e407856458b8055e
Summary:
macos builds are broken now with the following error:
```
/usr/local/Homebrew/Library/Homebrew/config.rb:39:in `initialize': no implicit conversion of nil into String (TypeError)
from /usr/local/Homebrew/Library/Homebrew/config.rb:39:in `new'
from /usr/local/Homebrew/Library/Homebrew/config.rb:39:in `<top (required)>'
from /usr/local/Homebrew/Library/Homebrew/vendor/portable-ruby/2.3.7/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
from /usr/local/Homebrew/Library/Homebrew/vendor/portable-ruby/2.3.7/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
from /usr/local/Homebrew/Library/Homebrew/global.rb:25:in `<top (required)>'
from /usr/local/Homebrew/Library/Homebrew/brew.rb:13:in `require_relative'
from /usr/local/Homebrew/Library/Homebrew/brew.rb:13:in `<main>'
Exited with code 1
```
No recent commits look suspicious, and I can even reproduce locally on my macbook, so it might be related to some new `brew` updates. Empirically, calling `brew update` first seems to fix this.
Example error build: https://circleci.com/gh/pytorch/pytorch/534392?utm_campaign=vcs-integration-link&utm_medium=referral&utm_source=github-build-link
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15873
Differential Revision: D13608019
Pulled By: soumith
fbshipit-source-id: 1499cb5246929e275a11ca6fccef6ef32918e45e
Summary:
This was causing a problem in #15735 but appears to have been fixed.
Adding this test to prevent regressions.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15835
Differential Revision: D13600282
Pulled By: zou3519
fbshipit-source-id: d9939e74d372be71c50122a5f6a615fbd7fa4df6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15865
Factored out code used in the tests for the Add, Mul, and Sub operators
into two new methods: one to generate the test vectors, and a second
one to run the actual tests given a caffe2 and Python operator.
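The shape of that refactor can be sketched roughly like this; the helper names and the toy operators are illustrative, not the actual caffe2 test code:

```python
import random

def make_test_vectors(seed=0, n=4):
    # First helper: deterministically generate input pairs for the tests.
    rng = random.Random(seed)
    return [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]

def check_binary_op(op, reference, vectors):
    # Second helper: run a candidate operator against a reference
    # implementation on every generated vector.
    for a, b in vectors:
        assert abs(op(a, b) - reference(a, b)) < 1e-9

check_binary_op(lambda a, b: a + b, lambda a, b: a + b, make_test_vectors())
```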
Reviewed By: houseroad
Differential Revision: D13526955
fbshipit-source-id: 8970ba5a1305ca19a54a14b51816d4a19f19d678
Summary:
I fixed a very small extra parenthesis in a doctest.
I'm also going to use this issue as a place to propose the eventual inclusion of xdoctest (a pip-installable library I wrote) in PyTorch's test suite. I think there are a lot of problems with Python's built-in doctest module, and I've built xdoctest to fix them. I would love for my project to get some exposure, and its addition to PyTorch may benefit both projects. Please see the readme for more details on what xdoctest brings to the table over the builtin doctest module: https://github.com/Erotemic/xdoctest
I came across this small syntax error while ensuring xdoctest was compatible with PyTorch. It isn't 100% there yet, but I'm working on it. My goal is to make xdoctest compatible with all of torch's doctests out of the box before writing up the PR. I'm also airing the idea out loud before I commit too much time into this (or get my hopes up), so I'm attaching this little blurb to a no-brainer-merge PR to (1) demonstrate a little bit of value (xdoctest flagged this syntax error) and (2) see how it's received.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15646
Differential Revision: D13606111
Pulled By: soumith
fbshipit-source-id: d4492801a38ee0ae64ea0326a83239cee4d811a4
Summary:
Thank you, freesouls, for the reproducing example!
This strictly fixes the bug in gradients for varying-length inputs discussed in the middle-to-bottom of the bug report. I'll have a separate feature patch for inf losses -> NaN grads.
Fixes: #14401
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15798
Differential Revision: D13605739
Pulled By: soumith
fbshipit-source-id: 167ff42399c7e4cdfbd88d59bac5d25b57c0363f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15199
In order to call it from PyTorch, this op schema can't live in caffe2 but must be included from PyTorch.
Moving it to c10. This is not where it should be in the end (that's why there is a large TODO here),
but an intermediate hack to enable this use case and proof-of-concept.
Reviewed By: ezyang
Differential Revision: D13462124
fbshipit-source-id: 1e187b9def8ef049c91e6de947ea4a85758d711b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15367
This updates flat_hash_map and fixes an issue with singletons across library boundaries
(see the PRs linked at the top of the file)
Reviewed By: ezyang
Differential Revision: D13510912
fbshipit-source-id: e90a297a7a2d69ae3fe48e4fcd8a44ad4b81292a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15324
This was missing but needs to be here, otherwise we can't register schemas without linker errors.
Reviewed By: ezyang
Differential Revision: D13500679
fbshipit-source-id: ba06351cb8ae09ec456cb93e527d388ace578fbb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15195
This removes the use of caffe2::Tensor or at::Tensor in the c10 dispatcher and only uses c10::Tensor.
It also changes output tensors to be passed as `const Tensor&` instead of `Tensor*` because we otherwise can't forward them in operator_c10wrapper.h.
Reviewed By: ezyang
Differential Revision: D13461640
fbshipit-source-id: 7f79925a7d60f01660a24bbfda47391af0c70ed3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14819
This is a minimal wrapper for a c10::TensorImpl,
maybe destined for greatness later when we move caffe2::Tensor or at::Tensor into c10.
Reviewed By: dzhulgakov
Differential Revision: D13348039
fbshipit-source-id: 874f515358e94f35dc7a4c3e55b35fde59c51ff1
Summary:
Allow the comparison function used in ReadyQueue to handle the empty FunctionTasks created by the reentrant autograd.
Fix #11732
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15791
Differential Revision: D13598006
Pulled By: soumith
fbshipit-source-id: 0bfdf28a735fbfe44f0fdbaf8b74a6198e6a1984
Summary:
The overhead of the copy actually makes an appreciable difference when doing a lot of small reductions (i.e., when the reduced dimension is significantly smaller than the non-reduced dimensions).
```
import torch
x = torch.randn((1024, 10, 1024), dtype=torch.float64)
torch.set_num_threads(1)
%timeit x.std(1)
```
Before: 813.0 ms
After: 708.25 ms
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15845
Differential Revision: D13603246
Pulled By: umanwizard
fbshipit-source-id: 020d224d76fcb8a0b55b75b0f2937e9508891beb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15553
Add unit test and implementation of NHWC layout for Resize operator.
Also, add a parallel-loop pragma to the old NCHWC layout.
Reviewed By: jspark1105
Differential Revision: D13540762
fbshipit-source-id: eebf252bf0d1efdff180a171d804181045f100a5
Summary:
I fixed a grammatical error in this function previously, but I realized that its content was also wrong. A weight tensor of a convolutional layer should be at least 3-dimensional, not 2.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15830
Differential Revision: D13597968
Pulled By: soumith
fbshipit-source-id: 72a75106e88945c68d6462828b149441cfb5acde
Summary:
Fixes #15223.
This fixes an autograd bug where backprop either fails or produces
gradients of incorrect sizes when tensors with zero-sized dimensions are
involved.
Previously, we were reducing along dimensions that had size greater than 1
when summing to a size in autograd. This is incorrect because we should also reduce
along dimensions with size 0 to produce a tensor of size 1 in that
dimension that then gets viewed to the correct shape.
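The corrected dimension-selection logic can be sketched in plain Python; the function name and shapes are illustrative, not the autograd internals:

```python
def reduction_dims(grad_shape, target_shape):
    # Reduce over the extra leading dims, plus every dim where the target
    # size is 1 but the gradient size differs -- including size 0, which
    # the old code skipped because it only reduced dims of size > 1.
    lead = len(grad_shape) - len(target_shape)
    dims = list(range(lead))
    for i, (g, t) in enumerate(zip(grad_shape[lead:], target_shape)):
        if t == 1 and g != 1:
            dims.append(i + lead)
    return dims
```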
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15796
Differential Revision: D13593199
Pulled By: zou3519
fbshipit-source-id: 2e2acac34943a9b7fabadc10c9efd4f66db298fd
Summary:
Cache the workspace size information for MIOpen for a given configuration as opposed to inquiring it every time. This reduces overhead significantly as inquiring the workspace size forces a full read of the performance database in MIOpen and this database has grown significantly in recent releases. This caching gets us back to ideal performance.
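The caching pattern can be sketched with Python memoization as a stand-in; the real change caches MIOpen workspace sizes in C++, and the names and size formula below are illustrative only:

```python
from functools import lru_cache

query_count = []

@lru_cache(maxsize=None)
def workspace_size(config):
    query_count.append(config)      # models the expensive database read
    return len(config) * 1024       # placeholder for the real query

workspace_size("conv3x3")           # first call hits the "database"
workspace_size("conv3x3")           # second call is served from the cache
```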
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15742
Differential Revision: D13598932
Pulled By: bddppq
fbshipit-source-id: 4e65d247b71dec828293cf0562aac3fbd4fad83a
Summary:
This PR:
- Removes shape logic from the code generator, which was previously relied on to return chunk and concat information
- Copies the logic to detect if a kernel has a rand_like node to the executor, making its pass independent of the code generator
- Fixes a possible segfault where references to a vector still being modified were relied upon
The actual shape logic is unchanged.
The possible segfault is in the handling of the former "flat_inputs" in codegen.cpp. This vector holds pairs, and the second element of these pairs is a reference. In some cases these would be references to items in the vector chunk_desc, which could be added to later, possibly invalidating any references to items in it. I hit a similar segfault in testing when naively making parallel code for "flat_outputs."
I'm submitting this small PR because it's separable, self-contained, has a fix, and I am trying to actively get away from large PRs to encourage more stability and incremental change in the fuser.
ngimel zou3519
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15750
Differential Revision: D13597451
Pulled By: zou3519
fbshipit-source-id: 0d48b365779b42849b044ba0286258aacc7b0332
Summary:
The Thrust shipped with ROCm is recent enough to support this API. Minimize divergence between CUDA/ROCm by changing the ifdef guards.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15481
Differential Revision: D13598739
Pulled By: bddppq
fbshipit-source-id: 20d0a7e3887a4050eea65033161561af47411de1
Summary:
This PR contains changes for:
1. Using memory alloc from HIPContext while allocating workspace for MIOpen conv and transpose_conv operators rather than direct HIP mem alloc
2. Minor cleanup and removing an unnecessary sync call from MIOpen conv op
Differential Revision: D13598894
Pulled By: bddppq
fbshipit-source-id: 44886161abdf91cd29c7c93b3e23620e1b09c7c9
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15418
Previously we are using Resize + ShareData.
Instead, we'll create a function on Tensor that clones itself with same storage.
Suppose we want `t` to `ShareData` with `t0`. Previously:
```
Tensor t(dims, CPU);
t.Resize(t0.sizes());
t.ShareData(t0);
```
Now:
```
Tensor t = t0.Alias();
```
Reviewed By: dzhulgakov
Differential Revision: D13507609
fbshipit-source-id: 6e4275d02f4c3356cbce91127f1b01111dc86b9f
Summary:
Wanted to use `Tensor.isnan` in C++, figured it'd be nice to have, so I made it into a tiny native function.
gchanan ezyang apaszke
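For reference, a NaN check like this typically relies on the IEEE-754 rule that NaN is the only value that compares unequal to itself; a pure-Python sketch of the idea:

```python
def isnan(x):
    # NaN != NaN under IEEE 754; every other float equals itself.
    return x != x
```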
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15722
Differential Revision: D13591315
Pulled By: goldsborough
fbshipit-source-id: a78bd22101fde87a0257f759b9bfcf3b4208f5fa
Summary:
Fixes#15308. Before this change, `torch.save` and `torch.load` would
initialize the CUDA context on GPU 0 if it hadn't been initialized
already, even if the serialized tensors are only on GPU 1.
This PR fixes that bug by using CUDAGuard in the storage serialization
path.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15807
Differential Revision: D13593201
Pulled By: zou3519
fbshipit-source-id: 4addc91ea5a5278d56a03f3d422577ee39e99897
Summary:
We don't support reductions yet, but simply decomposing batch_norm
into a kernel that computes the stats, and the fusing everything else
with ReLU and following pointwise ops provides nice speedups.
Note that this is only limited to inference mode for now, because we
don't support convolutions and batch norm in AD, so the fuser isn't
applied to those parts.
This commit gives us a 7% end-to-end speedup for ResNet50 with batch size 32. Note that this only applies to inference mode at the moment due to lack of AD support for CNN operations (I'll be adding that soon), and not to the standard `torchvision` models, because they use in-place ops which aren't supported by the fuser (we need a way of proving that de-inplacing them is safe).
cc zou3519 zdevito mruberry ngimel
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15146
Differential Revision: D13548303
Pulled By: zou3519
fbshipit-source-id: a2e2e5abc383f637fae19bd1b423f20c2cbc056a
Summary:
See #15682
Pushing up this small PR to check if I am doing the right thing. If correct, more will follow for other Stream APIs. Questions will be added inline.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15737
Differential Revision: D13581400
Pulled By: mrshenli
fbshipit-source-id: 24afed7847b89b62f0692c79a101ec7ff9d9ee4d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15759
Some flags have too long names. And some other few minor clean ups.
Reviewed By: jianyuh
Differential Revision: D13587353
fbshipit-source-id: f8aee7f167505644f5d8f80fe2eed70201ef1e54
Summary:
* With the update of split outputs to a dynamic list, the export to ONNX broke.
The split IR now becomes two ops: 1. Dynamic[] <= Split(), and 2. out1, out2, out3
<= Prim::ListUnpack. In this fix, these two consecutive ops get fused when being
exported to ONNX.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15092
Reviewed By: dzhulgakov
Differential Revision: D13583832
Pulled By: houseroad
fbshipit-source-id: 3eb18c871e750921ad6d5cc179254bee9bcf4c99
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15758
DNNLOWP Conv operators became very complex due to many options. This diff simplifies them by not allowing fp32 in/out. This is OK for Conv operators because Conv operators are usually used in deep networks where quantizing and dequantizing using separate operators is not much overhead.
Reviewed By: csummersea
Differential Revision: D13587341
fbshipit-source-id: e88c919dae79d1c5b7d787ea539edf5bcb064afc
Summary:
Enable conv+add fusion, same as conv+sum
Caution: only element-wise add is supported on IDEEP without scalar
broadcast. Otherwise, the fusion is illegal.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15268
Differential Revision: D13577375
Pulled By: yinghai
fbshipit-source-id: 92c9c4b667c5ca5f7a262a5bffaa8aa68eeff3bd
Summary:
Adds `List` to the eval environment for type lines and allows `List` to be used on PythonOps (following the same style as the `Tuple` code); fixes #15661
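The underlying mechanism can be sketched as evaluating the type string with the `typing` names in scope; `resolve_type` is an illustrative name, not the actual JIT function:

```python
from typing import List, Tuple

def resolve_type(annotation_str):
    # Evaluate a type comment string with List/Tuple available, similar
    # in spirit to the eval-environment change described above.
    env = {"List": List, "Tuple": Tuple}
    return eval(annotation_str, env)
```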
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15721
Differential Revision: D13578540
Pulled By: driazati
fbshipit-source-id: fce54dc3c0931d8b017b2e3483f0ac53826dda94
Summary:
Just changing the version number doesn't seem to work. I also needed to fix the macOS brew parallel conflict.
Should this be merged together with https://github.com/pytorch/ossci-job-dsl/pull/36?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15795
Differential Revision: D13591839
Pulled By: yf225
fbshipit-source-id: 6b2a90943e63c8dcc4b6d9159eb54f1b5974c9ac
Summary:
Fix submitted by huntzhan in https://github.com/pytorch/cppdocs/pull/4. The source is in this repo so the patch has to be applied here.
soumith ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15701
Differential Revision: D13591302
Pulled By: goldsborough
fbshipit-source-id: 796957696fd560a9c5fb42265d7b2d018abaebe3
Summary:
Hello,
This is a little patch to fix `DeprecationWarning: invalid escape sequence`.
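This is the class of fix involved: escape sequences such as `\d` inside an ordinary string literal are invalid and trigger the warning on newer Pythons, while raw strings avoid it:

```python
import re

# r"\d+" is a raw string, so the backslash is not treated as an
# (invalid) escape sequence -- no DeprecationWarning is emitted.
pattern = re.compile(r"\d+")
digits = pattern.findall("a1b22")
```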
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15733
Differential Revision: D13587291
Pulled By: soumith
fbshipit-source-id: ce68db2de92ca7eaa42f78ca5ae6fbc1d4d90e05
Summary:
Fixing error in caffe2_benchmark binary
```
2018-12-29T14:09:59.7867995Z d:\a\1\s\caffe2_builders\v141\pytorch\binaries\benchmark_helper.h(90): error C2678: binary '|=': no operator found which takes a left-hand operand of type 'std::_Iosb<int>::_Openmode' (or there is no acceptable conversion) (compiling source file D:\a\1\s\caffe2_builders\v141\pytorch\binaries\benchmark_helper.cc) [D:\a\1\s\caffe2_builders\v141\pytorch\build\Release\binaries\caffe2_benchmark.vcxproj]
2018-12-29T14:09:59.7868252Z d:\a\1\s\caffe2_builders\v141\pytorch\binaries\benchmark_helper.h(92): error C2678: binary '|=': no operator found which takes a left-hand operand of type 'std::_Iosb<int>::_Openmode' (or there is no acceptable conversion) (compiling source file D:\a\1\s\caffe2_builders\v141\pytorch\binaries\benchmark_helper.cc) [D:\a\1\s\caffe2_builders\v141\pytorch\build\Release\binaries\caffe2_benchmark.vcxproj]
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15619
Differential Revision: D13580195
Pulled By: soumith
fbshipit-source-id: b0a4479cd5f7555801b1977aeee96b6433293da7
Summary:
Implementing a stream is quite annoying, since it is closely tied to the underlying storage stream buffer.
So in this PR, we add ReadAdapterInterface and have PyTorchStreamReader use it. We implement IStreamAdapter as a wrapper of std::istream, and keep the user interface unchanged.
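A rough Python sketch of the adapter idea; the real interface is C++, and the method names here are only illustrative:

```python
import io

class ReadAdapterInterface:
    # Abstract reader: random-access reads independent of the backing store.
    def size(self):
        raise NotImplementedError
    def read(self, pos, n):
        raise NotImplementedError

class IStreamAdapter(ReadAdapterInterface):
    """Adapts any seekable stream to the reader interface."""
    def __init__(self, stream):
        self._stream = stream
    def size(self):
        cur = self._stream.tell()
        self._stream.seek(0, io.SEEK_END)
        end = self._stream.tell()
        self._stream.seek(cur)      # restore the caller's position
        return end
    def read(self, pos, n):
        self._stream.seek(pos)
        return self._stream.read(n)
```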
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15551
Reviewed By: zrphercule
Differential Revision: D13568907
Pulled By: houseroad
fbshipit-source-id: 93708cb801248a6c101f35cb14d1631029365c3c
Summary:
support 0 size in any of the tensor dimensions in mkldnn
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15295
Differential Revision: D13573747
Pulled By: yinghai
fbshipit-source-id: 5bf7a0b9e2567e80f44981a7823be5407fc94e53
Summary:
port replication padding 2D and 3D from legacy TH API implementation
to ATen implementation.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15538
Differential Revision: D13547567
Pulled By: lhuang04
fbshipit-source-id: decfe100d9edfdcfb62f39ee23f37b6cae0d461f
Summary:
Before this PR, rsub did not convert its two operands to the same dtype; therefore "1 - x" could be exported to an ONNX model in which the two operands of rsub have different dtypes.
Adding this symbolic patch fixes the bug.
Related test cases are also added.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15707
Differential Revision: D13583042
Pulled By: zrphercule
fbshipit-source-id: 3a2de47a1a8d1ded1a0adfb911adbe6ac729cdef
Summary:
We are going to have some breaking changes in ConstantLike and related operators in onnx, therefore it is better to disable all related tests for these operators for now.
These operators are not currently supported by caffe2, and are not included in our most recently released onnx, therefore we do not need to worry about internal/external production breaking.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15740
Differential Revision: D13582528
Pulled By: zrphercule
fbshipit-source-id: 92a890c1dc2a833969af69edfea85331bb4d562f
Summary:
This improves the error message for "unknown builtin op" to suggest similarly named ops.
Currently it prints out all operators with a name within two edits.
Related issue: https://github.com/pytorch/pytorch/issues/13409
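The suggestion mechanism can be sketched with a plain Levenshtein distance; this is a sketch of the idea, not the actual implementation:

```python
def edit_distance(a, b):
    # Classic Levenshtein distance, computed row by row.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def suggest(name, ops, max_dist=2):
    # All known op names within max_dist edits of the unknown name.
    return sorted(op for op in ops if edit_distance(name, op) <= max_dist)
```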
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15183
Differential Revision: D13578509
Pulled By: eellison
fbshipit-source-id: 5c73408eda1f7aa456f5bd28790c34df0c76aeca
Summary:
see issue #15636
Please note: I built the documents, but the HTML is not updated with the edited content.
I also did not build the fork.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15664
Differential Revision: D13571310
Pulled By: soumith
fbshipit-source-id: d43be0f61705693d778cc12c13e86d6b06130ac7
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15708
nbits_in_non_outlier == 0 doesn't make sense because it means everything is an outlier and we can just use 32-bit accumulation.
Depending on the architecture, the break-even point between acc16 and acc32 can differ, so this adds thresholds for falling back to acc32.
Reviewed By: jianyuh
Differential Revision: D13574832
fbshipit-source-id: b7a37aacbfdc7867e31838dafcdd5f7c2ac282af
Summary:
see #15682
This is a quick fix implementing the simpler solution suggested by colesbury. As the benchmark result shows, it slows down `Stream.query()` by ~20%. I would be happy to further pursue a more complex solution by implementing this in C++/ATen, but I would still vote to merge this quick fix first just to get rid of the bug sooner.
Tests added.
FYI jeffreyksmithjr
Now:
```python
In [1]: def f():
...: d0 = torch.device('cuda:0')
...: d1 = torch.device('cuda:1')
...: with torch.cuda.device(d0):
...: s0 = torch.cuda.current_stream()
...: with torch.cuda.device(d1):
...: s1 = torch.cuda.current_stream()
...: s0.query()
...: s1.query()
In [4]: %timeit f()
38.1 µs ± 4.2 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [5]: %timeit f()
37.6 µs ± 2.7 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
```
Before:
```python
In [4]: %timeit f()
28.5 µs ± 1.74 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [5]: %timeit f()
35.3 µs ± 2.91 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15689
Differential Revision: D13571697
Pulled By: mrshenli
fbshipit-source-id: 4fe697f91248c6419136d37bb5b7147e612e2f4c
Summary:
This PR breaks up `TestJitGenerated` into 3 classes. This makes for
easier testing of specific groups (e.g. run all generated functional
tests without having to wait for the autograd tests)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13992
Differential Revision: D13076371
Pulled By: driazati
fbshipit-source-id: 1267af59be7d69feb690f5805fcd43fea58a7159
Summary:
This PR bypasses checking the user's configuration entirely and always use strict, since the CI considers it a hard failure if you can't pass flake8.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15693
Differential Revision: D13574889
Pulled By: suo
fbshipit-source-id: f5e1c5731cc49b6223b415317033c275bc7d4fec
Summary:
These `std::forward` calls cause VS2017 to emit:
error C2872: 'std': ambiguous symbol
This fix prevents the ambiguity by specifying that `::std` is intended.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15697
Differential Revision: D13573483
Pulled By: goldsborough
fbshipit-source-id: 0439de3523a37a18df7af0cff4a1284a53833ddd
Summary:
s_copy_ was previously special-cased for out of place tracing.
This adds support for inplace tracing, which fixes tracing of
inception_v3
Fixes #15216
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15690
Differential Revision: D13572011
Pulled By: zdevito
fbshipit-source-id: 1d565dec039a4b8c59179254285e61d2517ef9a9
Summary:
Fixes #15353.
Like cudnn conv implementation, mkldnn also falls back to the default `_convolution_double_backward` as double backward.
This bug wasn't caught by CI before because mkldnn is only used when input scalar type is float, but our tests are all using double as default.
Adding a test for float inputs, but mkldnn seems to have imprecision issues similar to the cudnn implementation, so here I only check that the double backward exists instead of calling `gradgradcheck`. Please correct me if the precision should actually be checked.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15686
Differential Revision: D13571682
Pulled By: ailzhang
fbshipit-source-id: f1762439762370f276cfd59e8b8b8a4dee960a4b
Summary:
This is an updated version of the earlier PR https://github.com/pytorch/pytorch/pull/15185, since that one was closed.
Currently PyTorch ONNX exporter exports the logical ops (lt, gt, le, ge, eq, ne) with output type in corresponding ONNX ops as type tensor(uint8). But ONNX spec allows for only tensor(bool), which is why models that have these ops fail to load properly.
This issue is captured in #11339. Part of this issue, relating to the allowed input types, has been fixed in ONNX spec by houseroad. This PR fixes the other part pertaining to output type.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15677
Reviewed By: dzhulgakov
Differential Revision: D13568450
Pulled By: houseroad
fbshipit-source-id: a6afbea1afdb4edad8f8b1bc492f50b14e5f2fce
Summary:
1. Avoided using `THCDeviceTensor` by re-calculating the mapping from cuda (blockIdx, threadIdx) to input/output tensor index.
2. Changed Camelcase naming to underscore naming.
Profiling:
Legacy:
```bash
$py.test test/test_nn.py -k ReflectionPad1d -v -s
....
=========== 2 passed, 1258 deselected, 800 warnings in 4.35 seconds ============
```
Now:
```bash
$py.test test/test_nn.py -k ReflectionPad1d -v -s
...
=========== 2 passed, 1258 deselected, 800 warnings in 4.03 seconds ============
```
I have two questions about the code. Any insights are appreciated. gchanan zou3519
1. I can verify that [this magic](https://github.com/pytorch/pytorch/blob/master/aten/src/THCUNN/TemporalReflectionPadding.cu#L32-L36) correctly maps output index to input index in different cases. But, I have no idea about how did you come up with this algorithm that merges three categories (in left padding, in original input, in right padding) into a single statement?
2. Why do we need [get contiguous](https://github.com/pytorch/pytorch/blob/master/aten/src/THNN/generic/TemporalReflectionPadding.c#L80) tensors when calculating forward and backward propagation?
Reflection_pad2d porting will come in the next PR.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15480
Differential Revision: D13544924
Pulled By: mrshenli
fbshipit-source-id: 182045434f210032a82cab721a190da0cd781fbf
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15625
3D group conv (both NCHW and NHWC layouts) was not correct.
Added group=2 to test_1d_convolution and test_3d_convolution in conv_test
Reviewed By: protonu
Differential Revision: D13562099
fbshipit-source-id: 586e8a7574a2764f2a3b559db6c2415b3ab90453
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15651
Add count_include_pad arg for PoolOpGradient on CPU and fix ARM performance issue.
Reviewed By: houseroad
Differential Revision: D13564257
fbshipit-source-id: 3a143f1122bc507ccb7827e9b46908d5c7203735
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15685
The declaration of "Dequantize" is in "fbsource/fbcode/deeplearning/fbgemm2/QuantUtils.h", so it requires the "namespace fbgemm".
<T> is actually optional, since the type can be deduced from the first argument.
In some places we have "Dequantize<T>(...)", while in other places we have "Dequantize(...)". We'd better unify them. As a reference, all occurrences of "Quantize" use "fbgemm::Quantize<T>(...)".
Reviewed By: jspark1105
Differential Revision: D13570847
fbshipit-source-id: 7fca9f7f9e4e0d9e5eb27ac44b8707adc3c80717
Summary:
soumith zou3519
I was browsing the code, and think `vec256_int.h` might need a minor revision, but not 100% sure.
1. It currently inverts the result by `XOR` with 0. Should it `XOR` with 1 instead?
~2. AVX2 logical operations would set all bits in a byte/word/... to `1` if the condition holds. So functions, such as `_mm256_cmpeq_epi64 ` would return `0/-1` instead of `0/1`. Should it be masked with `1` to make sure it returns 0/1?~
~Would I be correct if I assume that the code revised below is not yet activated, but will be after we port legacy code to ATen?~
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15659
Differential Revision: D13565929
Pulled By: mrshenli
fbshipit-source-id: 8ae3daf256c3d915dd855a2215c95275e899ea8c
Summary:
The requested changes support building PyTorch 1.0 on the Jetson Xavier with OpenBLAS. The Jetson Xavier with JetPack 3.3 has generic LAPACK installed. To pick up the CUDA-accelerated BLAS/LAPACK, I had to build OpenBLAS and build/link PyTorch from source; otherwise, I got a runtime error indicating the LAPACK routines were not CUDA-enabled.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15660
Differential Revision: D13571324
Pulled By: soumith
fbshipit-source-id: 9b148d081d6e7fa7e1824dfdd93283c67f69e683
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15417
Right now the way we test whether a Blob contains a CPU tensor in ```PythonOpBase``` is broken, which means the non-CPU path might never be taken.
Searching through the codebase, the non-GPU path is used in PythonDLPack, and that is used in PytorchOp, which is unused. So we'll remove the non-GPU path in this diff.
Reviewed By: dzhulgakov
Differential Revision: D13495011
fbshipit-source-id: 9fe9537f05026d2a2cf7051efa81d184de722710
Summary:
Right now it just prints whatever flake8 errors it finds and moves forward with the commit. This is too easy to miss.
It should block the commit so that the user can fix the issue.
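The gating behavior can be sketched as checking the linter's exit code and propagating failure; this is only a sketch, as the real hook is a git pre-commit script, and the command below stands in for the flake8 invocation:

```python
import subprocess
import sys

def gate_commit(cmd):
    # Run the lint command; a nonzero exit code means the caller should
    # abort the commit instead of printing errors and continuing.
    return subprocess.run(cmd).returncode == 0
```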
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15675
Differential Revision: D13567821
Pulled By: suo
fbshipit-source-id: 5f0de40ddd771bad8d6848417408cffbceb03183
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15632
Just formatting and a few lints.
Reviewed By: yinghai
Differential Revision: D13562403
fbshipit-source-id: c56f8ee61f68cdaccc0828a764ff729454f68259
Summary:
Changelog:
- Optimize btriunpack by using `torch.where` instead of indexing, in-place operations instead of out-of-place operations, and avoiding costly permutations by computing the final permutation over a list.
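The "final permutation over a list" idea can be sketched like this, assuming 0-indexed pivot values; the function is illustrative, not the actual btriunpack code:

```python
def final_permutation(pivots, n):
    # Apply LAPACK-style pivot swaps to a plain Python list once, then use
    # the result as a single permutation instead of permuting a tensor
    # repeatedly.
    perm = list(range(n))
    for i, p in enumerate(pivots):
        perm[i], perm[p] = perm[p], perm[i]
    return perm
```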
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15286
Differential Revision: D13562038
Pulled By: soumith
fbshipit-source-id: e2c94cfab5322bf1d24bf56d7b056619f553acc6
Summary:
Since #1323, tensors are shared via shared memory, but this feature was not active for numpy.
This PR fixes that.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14534
Differential Revision: D13561649
Pulled By: soumith
fbshipit-source-id: b6bc9e99fb91e8b675c2ef131fba9fa11c1647c0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15588
Use NHWC2NCHW or NCHW2NHWC functions which is easier to understand compared to code using transpose and generalizable to non-2D convolutions.
Reviewed By: csummersea
Differential Revision: D13557674
fbshipit-source-id: c4fdb8850503ea58f6b17b188513ae2b29691ec0
Summary:
This PR removes the TH/THC binding for gesv.
Changelog:
- Remove TH/THC binding
- Port single matrix case to ATen
- Enable test_gesv for CUDA as well
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15510
Differential Revision: D13559990
Pulled By: soumith
fbshipit-source-id: 9da2825e94d3103627e719709e6b1f8b521a07fb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15244
This diff keeps track of the extra_info information attached to each operator. When getPerOpStas() is called, it attaches the extra_info to the resulting ProfDagStats protobuf.
Facebook
Net transform attaches a global_op_id which is defined as a tuple of (orig_net_name, original_op_index) to each operator,
The global_op_id is encoded as extra_info in each operator.
Reviewed By: aazzolini
Differential Revision: D13016289
fbshipit-source-id: 3e2719ec7ed0ebe47740b77581c565ff7e79b102
Summary:
Throw a warning when calling `torch.load` on a zip file
Fixes #15570
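A minimal sketch of such a warning for file-like inputs; the function name and message are illustrative, not the exact ones added by the PR:

```python
import warnings
import zipfile

def warn_if_zip(f):
    # A zip signature means the file is likely not a legacy torch.save
    # archive, so alert the user before attempting to load it.
    if zipfile.is_zipfile(f):
        warnings.warn("torch.load received what looks like a zip archive, "
                      "not a file produced by torch.save")
```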
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15578
Differential Revision: D13555954
Pulled By: driazati
fbshipit-source-id: a37ecdb3dd0c23eff809f86e2f8b74cd48ff7277
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15582
Following convention of having caffe2_ prefix in command line options
Reviewed By: viswanathgs
Differential Revision: D13252055
fbshipit-source-id: 142a6395b832f211f34d0a87ec2d62c1e5fcdc69
Summary:
In this PR, we are moving all functions away from `Variable::Impl`, in order to get rid of `Variable::Impl` (and the `data_` Tensor in it) in the next PR. Some of the functions (such as `set_requires_grad` / `requires_grad` / `grad`) will be living in `AutogradMeta` class, while others (such as `backward()` / `rebase_history()` / `grad_accumulator()` / `grad_fn()`) will be living in `Variable` class.
This is the 2nd PR mentioned in https://github.com/pytorch/pytorch/issues/13638.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15487
Differential Revision: D13553173
Pulled By: yf225
fbshipit-source-id: 691f9432d0cd0640af380c757f3e3a2f64f8851c
Summary:
Short term solution, export group norm as an ATen op to unblock users.
Long term will add GroupNorm to onnx.
Add an end to end test for this one.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15569
Differential Revision: D13554293
Pulled By: houseroad
fbshipit-source-id: b4974c9ea2a1b81338ca1e5c6747efe2715d7932
Summary:
Now that `cuda.get/set_rng_state` accept `device` objects, the default value should be a device object, and the docs should mention so.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14324
Reviewed By: ezyang
Differential Revision: D13528707
Pulled By: soumith
fbshipit-source-id: 32fdac467dfea6d5b96b7e2a42dc8cfd42ba11ee
Summary:
It should be ScriptModule rather than TracedModule :)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15559
Differential Revision: D13552058
Pulled By: soumith
fbshipit-source-id: 0aa17639c225818b00d59daec4bc2336f039f658
Summary:
Simple check that runs against your PR's changes and complains if running clang-format would have created a change. Does nothing when run against master, so it's "safe" to accept changes that fail this check and it won't break the build.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15543
Reviewed By: soumith
Differential Revision: D13552080
Pulled By: suo
fbshipit-source-id: 462a73894c16e7108806af7fa88440c377d4d0d2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15365
On top of D13017791, adding rotated NMS support with the same kernel building
blocks. Results in 218x speedup on avg.
Reviewed By: SuperIRabbit
Differential Revision: D13509114
fbshipit-source-id: c1d33c8dc4bc50b5906b4f01bb0caf1115e2a357
Summary:
Changes originally in this PR:
1. Move Variable::Impl data members into TensorImpl as `AutogradMeta` struct
2. Change Variable::Impl functions to use data members in `AutogradMeta` struct
3. Add `shallow_copy_and_detach()` function to each subclass of TensorImpl
4. Do shallow copy when the user calls `make_variable(tensor)` / `make_variable_view(tensor)` / `variable.set_data(tensor)` / `variable.detach()`
Changes moved from https://github.com/pytorch/pytorch/pull/13645:
1. Add a flag to Variable to disallow size/stride/storage_ptr changes from in-place operations such as `resize_` / `resize_as_` / `set_` / `transpose_`, and set this flag to true when people call `tensor.data` in Python.
2. Write text in the docs to actively discourage changing the shape or storage of `tensor_detached` and expecting `tensor` to also be updated.
This is the 1st+2nd PR mentioned in https://github.com/pytorch/pytorch/issues/13638.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13827
Differential Revision: D13507173
Pulled By: yf225
fbshipit-source-id: b177b08438d534a8197e34e1ad4a837e2db0ed6a
Summary:
In the current README.md, `CMAKE_PREFIX_PATH` is set to the conda root even when you have activated a virtual environment. When a conda virtualenv is activated, packages are installed in `CONDA_PREFIX`, not the conda root, so I think `CMAKE_PREFIX_PATH` should also be set to `CONDA_PREFIX` in this case. Some build issues, perhaps including #14954, can be solved with the new instructions.
soumith,
When I made PR #15335 I was confused and made a wrong point. I think this PR could be the real solution.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15548
Differential Revision: D13549681
Pulled By: soumith
fbshipit-source-id: 42d855b6e49ee58d735d2f4715d3e5752a748693
Summary:
The `EmbeddingBag` module does not include a `from_pretrained` method like the `Embedding` module. I added it for consistency between the two modules.
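A minimal sketch of the added method, mirroring `Embedding.from_pretrained` (tensor values here are illustrative):

```python
import torch
import torch.nn as nn

weight = torch.tensor([[1.0, 2.0, 3.0],
                       [4.0, 5.0, 6.0]])
bag = nn.EmbeddingBag.from_pretrained(weight)  # frozen by default, like Embedding
input = torch.tensor([[0, 1]])                 # one bag holding rows 0 and 1
out = bag(input)                               # default mode="mean"
```

With the default mean pooling, `out` is the average of rows 0 and 1, i.e. `[[2.5, 3.5, 4.5]]`.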
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15273
Differential Revision: D13547842
Pulled By: soumith
fbshipit-source-id: 8ffde51ff0c1e8fc8310263b6f375da88089ff7d
Summary:
There is an inconsistency in the size of arguments for gesv, which is fixed in this PR.
Changelog:
- Replicate check in CPU as done for CUDA
- Fix argument ordering (minor) in CUDA checking
Fixes #15328
Differential Revision: D13531167
Pulled By: soumith
fbshipit-source-id: c4b4e4fc12880208d08e88d1e47e730ac98c2ad3
Summary:
The PR clang-formats everything in `torch/csrc/jit/` and adds it to the pre-commit hook.
Here is a list of non-mechanical changes:
- I went over each file and fixed up whenever I could tell that clang-format was clobbering comment formatting.
- Made the macros in register_prim_ops a little more clang-format friendly by omitting trailing commas
- Refactored autodiff.cpp to use a helper class with explicit state rather than a bunch of capturing lambdas
- Small improvements to the precommit hook clang-format
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15524
Differential Revision: D13547989
Pulled By: suo
fbshipit-source-id: 3ff1541bb06433ccfe6de6e33f29227a2b5bb493
Summary:
Currently, torch.isinf on an integral tensor raises `RuntimeError: value cannot be converted to type int16_t without overflow: inf`.
This PR suppresses the error and returns false (0) for all integral tensors, which also makes the behavior consistent with np.isinf.
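A short sketch of the behavior change:

```python
import torch

x = torch.tensor([1, 2, 3], dtype=torch.int16)
# Previously this raised a RuntimeError; now it returns all-false,
# matching np.isinf on integer arrays.
mask = torch.isinf(x)
assert not mask.any()

# Floating-point tensors are unaffected.
assert torch.isinf(torch.tensor([float("inf"), 1.0])).tolist() == [True, False]
```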
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15489
Reviewed By: zou3519
Differential Revision: D13540786
Pulled By: flashhack
fbshipit-source-id: e730dea849da6a59f3752d347bcfbadfd12c6483
Summary:
I removed the explanation on `num_inputs` parameter. This parameter was removed in #8168
colesbury
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15529
Differential Revision: D13547854
Pulled By: soumith
fbshipit-source-id: 8a9ac58f2c93a2533b82ec63089477166ed0bcb9
Summary:
Upgrade MKL-DNN to 0.17 and statically build MKL-DNN to fix potential build errors due to an old mkldnn version on the host system.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15504
Differential Revision: D13547885
Pulled By: soumith
fbshipit-source-id: 46f790a3d9289c1e153e51c62be17c5206ea8f9a
Summary:
There was a typo in C++ docs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15527
Differential Revision: D13547858
Pulled By: soumith
fbshipit-source-id: 1f5250206ca6e13b1b1443869b1e1c837a756cb5
Summary:
Currently re-implements the dataloader for stateful datasets. Outstanding work:
- Refactor DataLoader and DataLoader2 to have common base classes and only differ in specific pieces of logic,
- Figure out how to not duplicate the `MapDataset` logic for stateful vs. non-stateful
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15096
Differential Revision: D13522043
Pulled By: goldsborough
fbshipit-source-id: 08e461ca51783047f11facc4d27dfa2e4f1e4c2a
Summary:
This makes compatibility with different versions of python a little bit simpler, and fixes a problem where stdin wasn't being read from the terminal properly in the prompt.
zdevito This should fix your EOF exception.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15521
Differential Revision: D13546358
Pulled By: suo
fbshipit-source-id: fb7551a86c888196831c046d9d9848e7ff05b925
Summary:
The precommit hook shouldn't hard fail if there's no `clang-tidy`, just warn and omit the check.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15514
Differential Revision: D13545776
Pulled By: suo
fbshipit-source-id: 9bf3f8ee18703c6d1a39eb7776092fb5e120d2a1
Summary:
This does two things:
(1): revert #15114, which is incorrect and actually just completely disables parallelization in this function (because `at::get_num_threads` returns `-1` unless it has been set explicitly)
(2): Fix our (FB-internal) failing tests that #15114 was intended to fix, by still working correctly in a setup where `#ifdef _OPENMP` is set and `omp_get_max_threads() > 1` , but `#pragma omp parallel` only launches one thread. I believe such an unusual situation only exists in certain unit tests within FB infra but we still need it to work.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15483
Differential Revision: D13538940
Pulled By: umanwizard
fbshipit-source-id: a3362c7ac7327ced350d127bb426f82c59e42732
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15082
We didn't have unit test for low-precision rowwise adagrad
Reviewed By: chocjy
Differential Revision: D13300732
fbshipit-source-id: 46e7bdfc82c5a6855eeb6f653c0a96b0b3a20546
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15389
SparseLengthsMean was generating uninitialized data for empty inputs (lengths == 0). We should return zeros.
The unit tests were also not covering this special case which is fixed by this diff.
Reviewed By: salexspb
Differential Revision: D13515970
fbshipit-source-id: 3c35265638f64f13f0262cee930c94f8628005da
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15492
Add pthreadpool_create and pthreadpool_destroy, which are used by NNPACK tests.
Reviewed By: Maratyszcza
Differential Revision: D13540997
fbshipit-source-id: 628c599df87b552ca1a3703854ec170243f04d2e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15252
We would like to extend the model file format to include strongly typed, semantic information
about the model inputs and outputs.
The goal is for a user to be able to consider a model file like a function with
a well defined API describing what the inputs and outputs would be.
Reviewed By: dzhulgakov
Differential Revision: D13009915
fbshipit-source-id: 5df124a876ad03c05fbdaacae0eab659637734c1
Summary:
(otherwise len is not resolvable using torch::jit::compile)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15488
Differential Revision: D13539991
Pulled By: zdevito
fbshipit-source-id: 3ba85fa7b1adb163f9229c568f7997d22321903d
Summary:
This adds `self` to the list of reserved words and also sorts the lines and prevents the tracer from naming values 'self' (which happens in torch/tensor.py)
Fixes #15240
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15318
Differential Revision: D13540192
Pulled By: driazati
fbshipit-source-id: 46ae02e51b1b31d5c62110fa83ba258ea6bada27
Summary:
This adds AD support for adaptive_avg_pool2d, which is necessary for resnet50 in pytorch/vision:master. cc: soumith asuhan dlibenzi
apaszke I saw that autodiff bug you fixed in #15403 , as it doesn't prevent this PR from passing, so I'll leave it for your PR to fix it. :)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15459
Differential Revision: D13534732
Pulled By: ailzhang
fbshipit-source-id: 4e48b93e35d5ecfe7bd64b6a132a55b07843f206
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15458
Many nets in the wild seem to have outputs that are never produced by the net.
Reviewed By: ZolotukhinM
Differential Revision: D13534185
fbshipit-source-id: 2b23b39c28404c53f68868f3bf6df53c5fea9eab
Summary:
This PR allows a subclass of programs that have return statements that are not final in the graph.
`final_returns.h` contains a comment describing how this is accomplished.
To minimize complexity in `compiler.cpp`, this pass is done as an AST-to-AST rewrite before the compiler runs.
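For instance, scripted code in this shape (a hypothetical example) now compiles:

```python
import torch

@torch.jit.script
def sign_of_sum(x: torch.Tensor) -> int:
    s = float(x.sum())
    if s > 0:
        return 1       # non-final return, rewritten by the AST pass
    if s < 0:
        return -1
    return 0

assert sign_of_sum(torch.ones(3)) == 1
assert sign_of_sum(-torch.ones(3)) == -1
assert sign_of_sum(torch.zeros(3)) == 0
```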
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15463
Differential Revision: D13538962
Pulled By: zdevito
fbshipit-source-id: 67105ca873351825b4a364092ab1873779f3e462
Summary:
This PR implements infrastructure for post-processing a model to apply int8 quantization to its `nn.Linear` modules. Highlights of the implementation:
1) Inputs and outputs are `float` (quantized and packed internally), but the weight is quantized and packed ahead of time for efficiency. This implementation performs well in small-batch size GEMM calls. It should not be considered a general-purpose quantized GEMM kernel.
2) Weight packing is dependent on machine architecture (e.g. vector register width), so it is done just-in-time. Concretely, it is done on model load for the weights and it is done during operator execution for the input value.
3) Biases are unquantized
4) We fail loudly if we are attempting to run this on a machine that does not support FBGEMM. This is because we do not want a model's numerics to differ based on which machine it is run on. A model containing these FBGEMM ops *must* be run with FBGEMM
The API can be seen in the added test case. Highlights are:
1) `torch.jit.quantized.quantize_linear_modules` walks the module hierarchy of the passed-in Module and replaces all `nn.Linear` modules with a new `QuantizedLinear` module, which encapsulates the behavior described above.
2) `_pack()` and `_unpack()` script methods are present on `QuantizedLinear` modules. These methods should be called before serialization and after deserialization, respectively. This ensures that the weight matrix is properly packed for the running machine's architecture. Note that in the long term, we would like to move toward a more Pickle-style serialization technique, rather than having these explicit methods that mutate member values. This is blocked on being able to assign attributes in a ScriptMethod, among other things.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13777
Differential Revision: D13383276
Pulled By: jamesr66a
fbshipit-source-id: 00f29c9f34544add2b90107e3cf55a287802c344
Summary:
It is sometimes beneficial to run multiple batches in one benchmark and check the aggregated results.
This PR enables this functionality.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15443
Reviewed By: llyfacebook
Differential Revision: D13531129
Pulled By: sf-wind
fbshipit-source-id: 553a762a5cbadf5a3d9fd6af767ae34899bc1aa2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15473
Revert accidental changes introduced in D13335176
IntList is a range, and copying it just copies pointers. Thus the pointers would point either to deallocated memory or to the same memory, causing the equality check to always pass.
Reviewed By: ezyang
Differential Revision: D13537131
fbshipit-source-id: c97b3533be689bb4cdadd9e612f1284ac50e4bda
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15453
Just move things around to facilitate further development. No logic change.
Reviewed By: rdzhabarov
Differential Revision: D13533959
fbshipit-source-id: eebab1306939e802aacffb24a711d372fd67916c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15174
Previously, Caffe2 maintained a separate per-thread per-device
current logical CUDA stream ID. In this PR, we switch Caffe2 over
to using c10::Stream to manage the current stream, and also
manage the allocation of cudaStream_t objects.
This results in a slight behavior change: previously, Caffe2
would have been willing to allocate an arbitrary number of
CUDA streams, depending on how high the logical stream IDs
went. The c10::Stream pool has a fixed number of streams, once
you exceed it, it wraps around.
Reviewed By: dzhulgakov
Differential Revision: D13451550
fbshipit-source-id: da6cf33ee026932a2d873835f6e090f7b8a7d8dc
Summary:
Fixes #3584.
Motivation: manually sorting sequences, packing them, and then unsorting them
is something a lot of users have complained about doing, especially when we can
offer library support for them.
Overview: we internally sort sequences before packing them and store a list of
`unsorted_indices` that represent how to unsort the sequences inside
PackedSequence. The packing helper functions return PackedSequence with the
`permutation` field and the unpacking helper functions use it to unsort.
To implement this, the following changes were made:
- PackedSequence now keeps `sorted_indices` and `unsorted_indices`.
These two can be thought of as permutations and are inverses of each other.
`sorted_indices` is how the sequences were sorted; `unsorted_indices` is how
to unsort the sequences.
- Added an `enforce_sorted` argument to pack_sequence and pack_padded_sequence
that maintains the legacy behavior of error-ing out on unsorted-sequences.
When `enforce_sorted=True`, these functions maintain their ONNX exportability.
- pack_sequence(sequences, enforce_sorted) takes in unsorted sequences.
- pack_padded_sequence can take in a padded tensor that represents padded,
unsorted sequences.
- pad_packed_sequence unsorts the PackedSequence such that it is still the
inverse operation of packed_padded_sequence.
- RNNs apply `sort_indices` to their input hidden state and apply
`unsort_indices` to their output hidden state. This is to ensure that the
hidden state batches correspond to the user's ordering of input sequences.
NOT BC-Breaking
- The default for pack_sequence and pack_padded_sequence is
`enforce_sorted=True` to avoid breaking ONNX export. To use the new
functionality, pass in `enforce_sorted=False`
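The new flow can be sketched as follows (shapes and lengths here are illustrative):

```python
import torch
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

padded = torch.randn(3, 4, 5)       # (batch, max_len, features)
lengths = torch.tensor([2, 4, 3])   # deliberately unsorted

# No manual sorting needed anymore:
packed = pack_padded_sequence(padded, lengths, batch_first=True,
                              enforce_sorted=False)
unpacked, out_lengths = pad_packed_sequence(packed, batch_first=True)

# pad_packed_sequence undoes the internal sort, so the original
# batch ordering is preserved.
assert torch.equal(out_lengths, lengths)
assert unpacked.shape == (3, 4, 5)
```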
Testing Plan
- Modified TestNN.test_pack_sequence, TestNN.test_packed_padded_sequence,
and TestNN.test_variable_sequence (RNN test) to check the behavior
of unsorted sequences, sorted sequences, and sorted sequences with
enforce_sorted=True
- test/test_jit.py has a test to see if RNNs are exportable with
enforce_sorted=True
cc colesbury
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15225
Reviewed By: soumith
Differential Revision: D13507138
Pulled By: zou3519
fbshipit-source-id: b871dccd6abefffca81bc4e3efef1873faa242ef
Summary:
I noticed that some users don't even know we have this support. Adding into the doc
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15440
Differential Revision: D13531045
Pulled By: teng-li
fbshipit-source-id: 9757c400c0010608758c754df04e603b36035a10
Summary:
According to mypy, the trailing -> None is mandatory.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15448
Differential Revision: D13532179
Pulled By: ezyang
fbshipit-source-id: e8972f8c9ada4657c518cd7bcd46e489ab8ddf5f
Summary:
ROCm 2.0's compiler requires launch_bounds annotations if flat work group sizes are larger than the default of 256.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15400
Differential Revision: D13531239
Pulled By: ezyang
fbshipit-source-id: c0b40600a8c332823da6c7113c644d8dba424a9c
Summary:
This PR adds enough of the infra for supporting closures (inner script functions) in order to allow us to expression symbolic gradients using them. We do not actually ever run graphs that contain these closures. The symbolic_script infrastructure just extracts them out of the original forward graph and turns them into discrete forward/backward pairs. This cuts down on the type annotations necessary to write forward/backward pairs and aligns closely with the "differentiator" function approach to expression reverse-mode AD.
Example:
This code:
```
import torch
r = torch.jit.CompilationUnit(
'''
def mul_forward(self, other):
def backward(grad_output):
grad_self = (grad_output * other).sum_to_size(self.size())
grad_other = (grad_output * self).sum_to_size(other.size())
return grad_self, grad_other
return self * other, backward
''')
print(r.module.code)
```
Will produce this graph (pretty printed for clarity):
```
def mul_forward(self,
self: Tensor,
other: Tensor) -> Tuple[Tensor, Tuple[None, Tuple[Tensor, Tensor]]]:
backward = (self.__lambda, (other, self))
return (torch.mul(self, other), backward)
def __lambda(self,
context: Tuple[Tensor, Tensor],
grad_output: Tensor) -> Tuple[Tensor, Tensor]:
other, self, = context
grad_self = torch.sum_to_size(torch.mul(grad_output, other), torch.size(self))
grad_other = torch.sum_to_size(torch.mul(grad_output, self), torch.size(other))
return (grad_self, grad_other)
```
symbolic_script will then do some modifications to remove the unsupported prim::Function node, yielding:
```
def mul_forward(self,
self: Tensor,
other: Tensor) -> Tuple[Tensor, Tuple[None, Tuple[Tensor, Tensor]]]:
return (torch.mul(self, other), (other, self))
def backward(self,
context: Tuple[Tensor, Tensor],
grad_output: Tensor) -> Tuple[Tensor, Tensor]:
other, self, = context
grad_self = torch.sum_to_size(torch.mul(grad_output, other), torch.size(self))
grad_other = torch.sum_to_size(torch.mul(grad_output, self), torch.size(other))
return (grad_self, grad_other)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15411
Differential Revision: D13523340
Pulled By: zdevito
fbshipit-source-id: 4d4a269460e595b16802c00ec55ae00e3e682d49
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15413
In order to pass arguments to the iOS app, we need to extract the arguments
into their own file. Also, in the iOS app, do not use benchmark.json, which
parses the arguments.
This is an incompatible change and needs a hotfix to the tests.
Reviewed By: llyfacebook
Differential Revision: D13523240
fbshipit-source-id: b559cc7f52d8f50ee206a7ff8d7b59292d855197
Summary:
Currently PyTorch ONNX exporter exports the logical ops (`lt`, `gt`, `le`, `ge`, `eq`) with output type in corresponding ONNX ops as type `tensor(uint8)`. But ONNX spec allows for only `tensor(bool)`, which is why models that have these ops fail to load properly.
This issue is captured in https://github.com/pytorch/pytorch/issues/11339. Part of this issue, relating to the allowed input types, has been fixed in ONNX spec by houseroad. This PR fixes the other part pertaining to output type.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15185
Differential Revision: D13494873
Pulled By: houseroad
fbshipit-source-id: 069d2f956a5ae9bf0ac2540a32594a31b01adef8
Summary:
This PR makes some small changes for better consistency in our README and
CONTRIBUTING docs
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15373
Differential Revision: D13512753
Pulled By: driazati
fbshipit-source-id: 44398ad1894eef521d5f5acb1d06acaad67728cf
Summary:
Followup PR of #14904, and the stretch goal of #12653.
Directly calculate coordinates in the original tensor using column index in the result tensor. Every GPU thread takes care of a column (two numbers) in the output tensor.
The implementation detects and handles precision loss during calculating the square root of a `int64_t` variable, and supports tensors with up to `row * column = 2 ^ 59` numbers.
Algorithm details are described in [comments of TensorFactories.cu](23ddb6f58a/aten/src/ATen/native/cuda/TensorFactories.cu (L109-L255)).
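The ops in question are `tril_indices`/`triu_indices` (#12653); a quick usage sketch:

```python
import torch

idx = torch.tril_indices(3, 3)   # coordinates of the lower triangle
# idx[0] holds row indices, idx[1] holds column indices
assert idx.shape == (2, 6)       # 6 elements in the lower triangle of a 3x3
```

On CUDA builds, passing `device="cuda"` exercises the kernel described above.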
zou3519
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15203
Reviewed By: zou3519
Differential Revision: D13517695
Pulled By: mrshenli
fbshipit-source-id: 86b305d22cac08c8962a3b0cf8e9e620b7ec33ea
Summary:
This updates pdist to work for batched inputs, and updates the
documentation to reflect issues raised.
closes#9406
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12302
Reviewed By: ezyang
Differential Revision: D13528485
Pulled By: erikbrinkman
fbshipit-source-id: 63d93a6e1cc95b483fb58e9ff021758b341cd4de
Summary:
This is the CUDA version of #14535 .
It refactors Reduce.cuh to allow more general classes of reductions to be performed -- we no longer assume that the temporary data returned during reduction is just one scalar, and instead allow an arbitrary accumulate type.
We also allow 64-bit indexing when necessary, since in general we will no longer be able to accumulate directly in the output. (In the cases when we can, we continue to split the tensors until they can be addressed with 32-bits, as before).
As an initial use-case, we implement `std` in multiple dimensions.
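A sketch of the user-facing change:

```python
import torch

x = torch.randn(2, 3, 4)
out = x.std(dim=(1, 2))   # reduce over multiple dimensions at once
assert out.shape == (2,)
```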
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14990
Differential Revision: D13405097
Pulled By: umanwizard
fbshipit-source-id: a56c24dc2fd5326d417632089bd3f5c4f9f0d2cb
Summary:
This adds `self` to the list of reserved words and also sorts the lines and prevents the tracer from naming values 'self' (which happens in torch/tensor.py)
Fixes #15240
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15318
Differential Revision: D13498974
Pulled By: driazati
fbshipit-source-id: 488efb661476cdcdb8ecb9cb48942f02e3c1e611
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13321
This diff simply refactors the `ProfDAGCounters` into two:
* `ProfDAGCounters` that gathers stats at runtime.
* `ProfDAGReport` which holds the report from the gathered stats once stats collection is done.
This refactoring allow us to implement `+=` for `ProfDAGReport`, which can be used for aggregating same-net reports on each host.
Reviewed By: donglimm
Differential Revision: D12837988
fbshipit-source-id: 0470c5fd6437f12711cab25a15a12965d79b2a91
Summary:
Optional cleanup. This PR removes python_default_init from the yaml files and the code-gen, and utilizes the optional type to do the work.
This also fixes the bug in #13149 to correctly adopt the as_strided backward.
Fixes #9941
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15234
Differential Revision: D13502044
Pulled By: wanchaol
fbshipit-source-id: 774b61fc4414482cf11d56e22bd0275aefb352a4
Summary:
Save error info in the future for parent thread to pick up. Throw the error
when the thread is the root thread.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14523
Differential Revision: D13251756
Pulled By: highker
fbshipit-source-id: b40f9a45665e1a934743f131ec5e8bad5622ce67
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15248
OutputTensorCopyFrom takes four arguments: index, a source Tensor, TensorOptions and whether we want to perform an async call.
We want to provide default options for TensorOptions: (1) default device to context_.device(); (2) default dtype to input.dtype(). Users can also explicitly provide these options to override the default values.
The next diff will change the order of the TensorOptions parameter so that users don't need to write down tensor options unless they want to override them.
Reviewed By: dzhulgakov
Differential Revision: D13453824
fbshipit-source-id: 87401f81c7c3f9fd3d8936c710e6c2e04a59b689
Summary:
Current documentation example doesn't compile. This fixes the doc so the example works.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15372
Differential Revision: D13522167
Pulled By: goldsborough
fbshipit-source-id: 5171a5f8e165eafabd9d1a28d23020bf2655f38b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15366
Swap the old implementation with a slightly easier-to-understand one.
I ran the tests and compared the number of chains against the old algorithm. This one outperforms it on every test, but we have yet to see if that impacts performance at all.
old chain 34 nomnigraph chain 25
old chain 46 nomnigraph chain 34
old chain 228 nomnigraph chain 188
old chain 397 nomnigraph chain 338
Reviewed By: ilia-cher
Differential Revision: D13057451
fbshipit-source-id: ccd050bfead6eb94ab9c7b0a70b09a22c2b9e499
Summary:
Same as #14668, and was approved there.
ailzhang , please apply this patch to Horizon's `data_streamer.py`: https://gist.github.com/SsnL/020fdb3d6b7016d81b6ba1d04cc41459 Thank you!
Below is the original description at #14668:
As I am working on tasks in https://github.com/pytorch/pytorch/issues/13023, I realized how unreadable the code is because all functions to be run in multiprocessing must be at top global level. Adding more functionalities to `dataloader.py` will only make things worse.
So in this PR, I refactor `dataloader.py` and move much of it into `data._utils`. E.g., the `_worker_loop` and related methods are now in `data._utils.worker`, signal handling code in `data._utils.signal_handling`, collating code in `data._utils.collate`, etc. This split, IMHO, makes code much clearer. I will base my future changes to DataLoader on top of this.
No functionality is changed, except that I added `torch._six.queue`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15331
Reviewed By: yf225
Differential Revision: D13503120
Pulled By: ailzhang
fbshipit-source-id: 94df16b4d80ad1102c437cde0d5a2e62cffe1f8e
Summary:
Changelog:
- Renames `potrs` to `cholesky_solve` to remain consistent with Tensorflow and Scipy (not really, they call their function chol_solve)
- Default argument for upper in cholesky_solve is False. This will allow a seamless interface between `cholesky` and `cholesky_solve`, since the `upper` argument in both function are the same.
- Rename all tests
- Create a tentative alias for `cholesky_solve` under the name `potrs`, and add deprecated warning to not promote usage.
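A small sketch of the renamed API (using `torch.linalg.cholesky` for the factorization, which on current builds supersedes the older `torch.cholesky`):

```python
import torch

A = torch.tensor([[4.0, 1.0],
                  [1.0, 3.0]])      # symmetric positive-definite
b = torch.tensor([[1.0],
                  [2.0]])
L = torch.linalg.cholesky(A)        # lower-triangular factor
x = torch.cholesky_solve(b, L)      # upper=False is now the default
assert torch.allclose(A @ x, b)
```

The `upper=False` default is what lets the output of the factorization feed directly into the solve without extra flags.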
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15334
Differential Revision: D13507724
Pulled By: soumith
fbshipit-source-id: b826996541e49d2e2bcd061b72a38c39450c76d0
Summary:
A number of different passes rely on whether a node has side effects. This centralizes the list of side effectful ops in one place.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15188
Differential Revision: D13508438
Pulled By: eellison
fbshipit-source-id: 2143e782b787731ce007b6dcd50cbde30e1b8dd0
Summary:
For #6593 and #9515
This completes the support for optional<ScalarType> in native, JIT and autograd.
Note: Mostly following the existing implementation for optional<Scalar> that was added in https://github.com/pytorch/pytorch/pull/12582.
This PR introduces a way to make functions accept an optional dtype and it will unblock #9515 by allowing the `dtype` param for type promotion interface:
```
func: name(inputs, *, ScalarType? dtype=None, Casting casting=same_kind)
```
An alternative approach could have been using `ScalarType::Undefined` for the same purpose but without optional, though it would have been a bit hacky.
```
func: name(inputs, *, ScalarType dtype=Undefined, Casting casting=same_kind)
```
Here's an example use of this in action: 971f69eac6
There are already a bunch of native functions that were getting optional `dtype` through function overloading. https://github.com/pytorch/pytorch/pull/15133 is the attempt to migrate all of those. I will send those changes separately after this since some functions (e.g. sum) need quite a bit of change in the codebase. See the commits over there.
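The user-facing effect for a function like `sum` (which today gets its optional `dtype` via overloading) can be sketched as:

```python
import torch

t = torch.ones(3, dtype=torch.int32)
# dtype=None (the default) keeps the usual promotion behavior;
# an explicit dtype overrides it.
assert t.sum().dtype == torch.int64              # integral inputs promote to int64
assert t.sum(dtype=torch.float64).dtype == torch.float64
```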
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15154
Differential Revision: D13457760
Pulled By: tugrulates
fbshipit-source-id: 706134f0bd578683edd416b96329b49a1ba8ab48
Summary:
Pull cpuinfo changes that should make it work on AWS Lambda servers (which don't have `/sys/devices/system/cpu/{possible,present}` files, and probably don't mount sysfs at all).
I'm not 100% sure it will fix the issue, but getting this update in would make it easier for users to test using a nightly build.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15385
Reviewed By: soumith
Differential Revision: D13517467
Pulled By: Maratyszcza
fbshipit-source-id: e8e544cd1f9dad304172ebb7b6ba7a8ad7d34e66
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15371
Similar to D13387692:
Never call mutable_data from an OpenMP region!!!
Reviewed By: jspark1105
Differential Revision: D13511259
fbshipit-source-id: 100812d2a547c0a1d5018749d5fdc88162375673
Summary:
This PR adds clang-format automation:
- It only checks on whitelisted files, so we can enable incrementally without noise
- There is a pre-commit hook provided that will do the same check, plus prompt users to apply the clang-format changes (no change is made without the user agreeing).
My plan is to migrate over whole files at a time, clang-formatting them and then adding them to the whitelist. Doing it this way should avoid too many merge pains (the most you'll have to is run clang-format on the affected file before rebasing).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15254
Differential Revision: D13515888
Pulled By: suo
fbshipit-source-id: d098eabcc97aa228c4dfce8fc096c3b5a45b591f
Summary:
This separates the different parts of compiler.cpp to make their relationship more clear. In particular it adds:
* sugared_value.{h,cpp} - all the public SugaredValues that the compiler defines and a few that were inside compiler.cpp
* type_parser.{h, cpp} - Turns TreeRef's defining types into TypePtr
* schema_matching.{h, cpp} - infrastructure for matching arguments against overloaded schema and emitting builtin operators with a particular schema.
Retains:
* compiler.{h, cpp} - now responsible simply for the `defineMethodsInModule` infrastructure.
Some utility functions like inlineCallTo have moved to ir.h.
The only thing that is not a move is some changes in module.h/cpp that remove multiple returns from `Method::emit_call_to`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15355
Reviewed By: suo, wanchaol
Differential Revision: D13507524
Pulled By: zdevito
fbshipit-source-id: 69ec936a9ff1a383c12a883616346b219c72e393
Summary:
This PR enables autodiff to use the forward/backward graph compiled from python code, instead of using symbolic gradients(modifying the original graph directly).
We put the map in a separate .h file for now to wait for the native_functions.yaml and derivatives.yaml merge. This should ideally go into native_functions.yaml eventually.
This PR should be enough to unblock us for now, we can start writing gradients for aten functions in python.
Differential Revision: D13494635
Pulled By: ailzhang
fbshipit-source-id: f8d51a15243ac46afd09d930c573ccdfcd9fdaaf
Summary:
Modified step_lr for StepLR, MultiStepLR, ExponentialLR and CosineAnnealingLR. In this way, multiple schedulers can be used simultaneously to modify the learning rates.
Related issue: https://github.com/pytorch/pytorch/issues/13022
Added unit tests combining multiple schedulers.
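The composition semantics can be sketched in plain Python (toy classes, not the `torch.optim.lr_scheduler` API): each scheduler's `step()` applies its own multiplicative factor to the optimizer's current learning rate, so stacking schedulers composes their effects.

```python
class ToyOptimizer:
    def __init__(self, lr):
        self.lr = lr

class ToyStepLR:
    """Multiply lr by gamma every step_size epochs."""
    def __init__(self, opt, step_size, gamma):
        self.opt, self.step_size, self.gamma = opt, step_size, gamma
        self.epoch = 0
    def step(self):
        self.epoch += 1
        if self.epoch % self.step_size == 0:
            self.opt.lr *= self.gamma

class ToyExponentialLR:
    """Multiply lr by gamma every epoch."""
    def __init__(self, opt, gamma):
        self.opt, self.gamma = opt, gamma
    def step(self):
        self.opt.lr *= self.gamma

opt = ToyOptimizer(lr=1.0)
schedulers = [ToyStepLR(opt, step_size=2, gamma=0.1),
              ToyExponentialLR(opt, gamma=0.5)]
for epoch in range(2):
    for s in schedulers:
        s.step()

# StepLR fired once (x0.1), ExponentialLR fired twice (x0.25): 1.0 * 0.1 * 0.25
print(opt.lr)  # 0.025
```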
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14010
Reviewed By: ezyang
Differential Revision: D13494941
Pulled By: chandlerzuo
fbshipit-source-id: 7561270245639ba1f2c00748f8e4a5f7dec7160c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15348
We have a function resize_dim() on TensorImpl in c10/core/TensorImpl.h which lets you change the dimensionality of a tensor, resizing both sizes and strides. Unfortunately, this API is fairly easy to misuse, because it fills in the new entries with garbage when you size it larger. We want to refactor the call sites to use set_sizes_and_strides() instead, so that there is never an intermediate tensor state where the sizes/strides don't make sense. In this diff, resize_dim() is
replaced with set_sizes_and_strides() in aten/src/TH/THTensor.hpp.
Reviewed By: ezyang
Differential Revision: D13505512
fbshipit-source-id: 193bab89f0018c13ca07488be336d8e967746b76
Summary:
Changelog:
- change some expect tests that didn't have to be expect tests,
instead use self.assertAllFused
- Some of the fuser tests weren't using self.assertAllFused.
- Minor test renames
cc apaszke
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15134
Differential Revision: D13507481
Pulled By: zou3519
fbshipit-source-id: dd0788530a60bb5ed2f42b961fae3db2b4404b64
Summary:
Max and ReduceMax are smashed together, so we need to support the one-input case.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15241
Reviewed By: yinghai
Differential Revision: D13473312
Pulled By: houseroad
fbshipit-source-id: 9b8c847286a2631b006ca900271bc0d26574101a
Summary:
This PR changes Method (just Method not all graphs) to always have a single
return argument.
This is part 1 in a set of changes that will enable us to have better handling of early return statements.
The simplification that this change provides greatly reduces the work for the next step.
This change makes it so that Method and Python handle multiple returns in the same way:
* 0 - None
* 1 - <single value>
* many - Tuple[...]
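The convention can be sketched with an illustrative helper (plain Python, not JIT code):

```python
def pack_returns(values):
    """Mirror the return convention: zero returns -> None,
    one -> the value itself, many -> a tuple."""
    if len(values) == 0:
        return None
    if len(values) == 1:
        return values[0]
    return tuple(values)

print(pack_returns([]))      # None
print(pack_returns([42]))    # 42
print(pack_returns([1, 2]))  # (1, 2)
```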
The result is that a lot of special-case handling in compiler.cpp and its
bindings can be removed. It also fixes several bugs in return handling,
including one where return values were not always checked against their
attributed values.
Notes:
* inferTypeFrom is renamed to be more accurate and discourage use.
* This has uncovered some bugs in other components, which are noted in
the diff.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15289
Differential Revision: D13481649
Pulled By: zdevito
fbshipit-source-id: 0e2242a40bb28cca2d0e8be48bede96195e4858c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15322
caffe2 mobile opengl code is not used, deleting it to reduce complications when we perform other changes
Reviewed By: Maratyszcza
Differential Revision: D13499943
fbshipit-source-id: 6479f6b9f50f08b5ae28f8f0bc4a1c4fc3f3c3c2
Summary:
There is still a limitation on this: if a script module is somewhere
in the trace, the inputs/outputs can only be tensors or tuples of
tensors.
resolves #15052
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15184
Differential Revision: D13457691
Pulled By: highker
fbshipit-source-id: 8fe46afc41357a0eb8eadd83f687b31d074deb0e
Summary:
…on](#12115)
mean is calculated in two step sum()/numel(). For half precision, data gets
casted back to half after sum().
We fused the division into the reduction kernel by adding pre_op/post_op.
This allows us to do torch.ones(65536).cuda().half().mean() to return correct
result.
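The overflow can be reproduced outside CUDA with a numpy sketch (assuming numpy is available; it stands in for the reduction kernel here):

```python
import numpy as np

x = np.ones(65536, dtype=np.float16)

# Two-step mean: sum() exceeds fp16's max normal value (~65504)
# before the division ever happens.
naive = x.sum() / x.size
print(naive)  # inf

# Fusing the division into the reduction (sketched here by
# accumulating in fp32 and casting back) keeps the result representable.
fused = np.float16(x.astype(np.float32).sum() / x.size)
print(fused)  # 1.0
```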
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14878
Differential Revision: D13491159
Pulled By: soumith
fbshipit-source-id: e83802e1628b6d2615c45e18d7acf991d143a09e
Summary:
Fixes an issue that arose from https://github.com/pytorch/pytorch/pull/13481 where `.shared_memory()` couldn't be called. Effectively undoes all changes to `nn.Module` from that PR and solve the relevant problem in a different way (the goal was to be able to call `._apply()` on the Python wrapper for a C++ module).
soumith
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15305
Differential Revision: D13493937
Pulled By: goldsborough
fbshipit-source-id: 4cb8687f90fc8709a536c5e7eacd0dc8edf6f750
Summary:
The JIT uses `int64_t` for its integer type and `double` for its floating point type, but users quite often want to write `int` or `float` and that currently fails in not-so-nice ways for custom ops. This PR adds a simple `static_assert` to catch these common failure cases.
zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15247
Differential Revision: D13493941
Pulled By: goldsborough
fbshipit-source-id: c1cd0d10ab5838c75f167c0bdb57e45a0bc1344e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15250
This adds `__repr__` methods to all of the classes under task.py. This makes the objects much easier to interact with when using them in an interactive manner, such as in a Jupyter notebook.
The default `__repr__` method just returns the object ID which is very unhelpful.
Reviewed By: hanli0612
Differential Revision: D13475758
fbshipit-source-id: 6e1b166ec35163b9776c797b6a2e0d002560cd29
Summary:
Addresses #918, interpolation results should be similar to tf
* Adds bicubic interpolation operator to `nn.functional.interpolate`
* Corresponding test in `test_nn.py`
The operator is added in legacy `TH` to be aligned with the other upsampling operators; they can be refactored/moved to ATen all at once when #10482 is resolved
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9849
Differential Revision: D9007525
Pulled By: driazati
fbshipit-source-id: 93ef49a34ce4e5ffd4bda94cd9a6ddc939f0a4cc
Summary:
This PR add isinstance to do static type checking in JIT.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15076
Differential Revision: D13471067
Pulled By: wanchaol
fbshipit-source-id: d39b7ed5db9fcca4b503659d02cf7795950ea8ea
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15126
I want to make people stop manufacturing StreamId from thin air,
and a first step is to make people use the default stream.
Reviewed By: dzhulgakov
Differential Revision: D13432922
fbshipit-source-id: 9f0d8d70646c50d979bde5ba3c3addeebac48a3d
Summary:
Allows 2 functions that are boolean dispatched to have no docstrings (the only case that will fail now is if both functions have docstrings)
Fixes #15281
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15306
Differential Revision: D13494884
Pulled By: driazati
fbshipit-source-id: 65fec39ae03a7d6a68ad617c9b270faeb1617930
Summary:
`torch.expand` and `torch.ne` are used often in models and this PR adds ONNX export support for them. ArmenAg has created issue https://github.com/pytorch/pytorch/issues/10882 for this.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15050
Differential Revision: D13453036
Pulled By: houseroad
fbshipit-source-id: 4724b4ffcebda6cd6b2acac51d6733cb27318daf
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15125
I realized that it is really bad juju if you fake a StreamId
out of thin air, because in general this isn't going to work.
So, make the constructor a lot scarier.
Most "faking StreamId out of thin air" happens because someone
just wants to put something on the default stream.
Reviewed By: dzhulgakov
Differential Revision: D13432800
fbshipit-source-id: a86991d6fc1d8aa4e54e8175e5f06f90856238e6
Summary:
`rsplit` doesn't have kwargs in Python 2 so this line raises an error
Fixes #15135
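For illustration, the positional form works on both Python versions while the keyword form was Python-3-only:

```python
path = "torch.jit.frontend"

# Positional form: valid on both Python 2 and 3.
parts = path.rsplit(".", 1)
print(parts)  # ['torch.jit', 'frontend']

# Keyword form: fine on Python 3, but raised
# "TypeError: rsplit() takes no keyword arguments" on Python 2.
kw_parts = path.rsplit(sep=".", maxsplit=1)
print(kw_parts == parts)  # True
```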
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12732
Differential Revision: D10458630
Pulled By: driazati
fbshipit-source-id: a63e42fbc0e39e4291480775b516c98122ec05a1
Summary:
Cholesky by default returns the lower triangular matrix, see [docs](https://pytorch.org/docs/stable/torch.html#torch.cholesky).
However `torch.potrs` by default requires the upper triangular matrix. The naming of the variable `u` suggests that the example expects the upper to be returned, so I've added the flag to make that happen in the example.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15215
Differential Revision: D13476468
Pulled By: soumith
fbshipit-source-id: 7b68035f435a2b1be4d363b3f63e407394af949d
Summary:
This makes DCE more granular by tracking live values/aliases through the graph (rather than just nodes). So we can be more aggressive in DCE around control flow blocks. For example, in:
```
%a0 = aten::foo()
%b = aten::foo()
%a2, %b2 = prim::If(%cond) {
block0() {
%a1 = aten::foo(%a0)
%b1 = aten::foo(%b)
} -> (%a1, %b1)
}
return (%a2)
```
we will now dce all the `%b` stuff.
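The value-level liveness idea can be sketched in a few lines of plain Python over a toy IR (dicts, not the JIT's data structures; real DCE must also be conservative about side effects and aliasing, which this ignores):

```python
# Toy IR: each node maps its output name to the names of its inputs.
nodes = [
    ("a0", []),        # %a0 = aten::foo()
    ("b",  []),        # %b  = aten::foo()
    ("a1", ["a0"]),    # inside the if-block
    ("b1", ["b"]),
    ("a2", ["a1"]),    # if-output aliased to %a1
    ("b2", ["b1"]),
]
returns = ["a2"]

# Walk definitions backward from the returned values; anything
# never reached is dead.
defs = {out: ins for out, ins in nodes}
live = set()
work = list(returns)
while work:
    v = work.pop()
    if v in live:
        continue
    live.add(v)
    work.extend(defs.get(v, []))

dead = [out for out, _ in nodes if out not in live]
print(sorted(dead))  # ['b', 'b1', 'b2']
```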
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14910
Differential Revision: D13476445
Pulled By: suo
fbshipit-source-id: 2bf5db19711c07dde946697a4f4b270bd8baf791
Summary:
A friend of me is learning deep learning and pytorch, and he is confused by the following piece of code from the tutorial https://pytorch.org/tutorials/beginner/blitz/autograd_tutorial.html#gradients :
```python
x = torch.randn(3, requires_grad=True)
y = x * 2
while y.data.norm() < 1000:
y = y * 2
print(y)
gradients = torch.tensor([0.1, 1.0, 0.0001], dtype=torch.float)
y.backward(gradients)
print(x.grad)
```
He doesn't know where the following line comes from:
```python
gradients = torch.tensor([0.1, 1.0, 0.0001], dtype=torch.float)
```
What are we computing? Why don't we compute "the gradient of `y` w.r.t `x`"?
In the tutorial, it only says
> You can do many crazy things with autograd!
Which does not explain anything. It seems to be hard for some beginners of deep learning to understand why we ever call backward with an external gradient fed in and what doing so means. So I modified the tutorial in https://github.com/pytorch/tutorials/pull/385
and the docstring correspondingly in this PR, explaining the Jacobian vector product. Please review this PR and https://github.com/pytorch/tutorials/pull/385 together.
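The vector-Jacobian product being computed can be checked numerically with a numpy sketch (numpy stands in for autograd here; for the elementwise map `y = k * x` the Jacobian is `diag(k)`, so `y.backward(v)` would leave `x.grad = J.T @ v = k * v`):

```python
import numpy as np

x = np.array([0.5, -1.0, 2.0])
k = 2.0
while np.linalg.norm(k * x) < 1000:
    k *= 2.0                        # mirrors `y = y * 2` in the tutorial loop

J = np.diag(np.full(3, k))          # Jacobian of the map y = k * x
v = np.array([0.1, 1.0, 0.0001])    # the "gradients" vector from the tutorial

grad_x = J.T @ v                    # what x.grad would hold after y.backward(v)
print(k)        # 512.0
print(grad_x)   # equals k * v
```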
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15197
Differential Revision: D13476513
Pulled By: soumith
fbshipit-source-id: bee62282e9ab72403247384e4063bcdf59d40c3c
Summary:
Several enhancements are implemented:
* Resize the images to be within a boundary between min-size and max-size (height and width). It tries to resize the minimum side to match min-size while keeping the aspect ratio. However, if that would make the maximum side exceed max-size, it instead resizes the maximum side to equal max-size (and the minimum side ends up smaller than min-size). The min/max sizes are specified in the scale argument, in comma-separated form. If one of the sizes is -1, that size is not a restriction.
* Change the OpenCV resize function arguments from using cv::Size() to the x, y scale. Theoretically they should be the same. But in reality, the two ways of specifying them may result to different resized outputs.
* Once the image is read in, change the data to floats. That means, after resize and other preprocessing steps, the float values are preserved (not truncated to int).
* It is possible to convert data in text format to the blob format.
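The resize rule in the first bullet can be sketched as a small helper (hypothetical function, not the benchmark tool's code):

```python
def compute_scale(height, width, min_size, max_size):
    """Scale the shorter side to min_size while keeping the aspect
    ratio, but cap the longer side at max_size; -1 lifts that
    constraint."""
    short, long_ = min(height, width), max(height, width)
    scale = 1.0 if min_size == -1 else min_size / float(short)
    if max_size != -1 and long_ * scale > max_size:
        scale = max_size / float(long_)
    return scale

# 480x640 image: scaling the min side to 600 would push the max side
# to 800, so the max-size cap of 700 wins.
s = compute_scale(480, 640, min_size=600, max_size=700)
print(round(480 * s), round(640 * s))  # 525 700
```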
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15204
Reviewed By: llyfacebook
Differential Revision: D13467225
Pulled By: sf-wind
fbshipit-source-id: 7da34a72d43a9603cd7ab953f5821c1222d0178f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15242
Newer version ONNX Reshape gets shape info from a tensor. Hence for static backend, we need to provide this info to it when doing `onnxGetCompatibility` too.
Reviewed By: jackm321
Differential Revision: D13471959
fbshipit-source-id: 8a58e28edd900b6ad54a1dbd63ff2579fbe0e820
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15191
OSS:
Just splitting out basic flags from a unit test so I can extend them in another test where I need to add additional flags.
Reviewed By: yinghai
Differential Revision: D13159184
fbshipit-source-id: 9823e792cf0ed8d0379235c44564862b7d784845
Summary: Record unit of time for torch.cuda.Event's elapsed_time
Differential Revision: D13467646
Pulled By: zou3519
fbshipit-source-id: 4f1f4ef5fa4bc5a1b4775dfcec6ab155e5bf8d6e
Summary:
We need this, for example, to properly call `_unpack` when we have a traced module in the hierarchy
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15101
Differential Revision: D13468467
Pulled By: jamesr66a
fbshipit-source-id: c2b6740b12cde6e23395d12e42d4fc2c4c7ca3f2
Summary:
The compiler understands it and profits from knowing it by not using too
many VGPRs, as it otherwise assumes the default workgroup size of 256.
Fixes a problem in bringup of ROCm 2.0 on gfx906.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15228
Differential Revision: D13470950
Pulled By: bddppq
fbshipit-source-id: f9aa44c7c95299a099c0ea9317b9044cc056acc5
Summary:
tests work on ROCm 1.9.2 as present on CI (fp16 bringup, hipMemset and sparse improvements)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15232
Differential Revision: D13470991
Pulled By: bddppq
fbshipit-source-id: 45acc4f9ea5baaaf7672b86eb022948055779925
Summary:
Some of the codeblocks were showing up as normal text and the "unsupported modules" table was formatted incorrectly
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15227
Differential Revision: D13468847
Pulled By: driazati
fbshipit-source-id: eb7375710d4f6eca1d0f44dfc43c7c506300cb1e
Summary:
This PR adds the final set of clang-tidy checks we should add for our codebase: a last set of performance-related checks. Most fixes here are around changing `auto` to `const auto&` in a few places where unnecessary copies were made, and adding `reserve()` calls before loops doing repeated `push_back()`. Also a few cases of calling `std::string::find` with a single-character string literal instead of a single char, which uses a less efficient string search algorithm meant for searching larger substrings.

ezyang apaszke
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15198
Differential Revision: D13468797
Pulled By: goldsborough
fbshipit-source-id: 2bed1ea1c7c162b7f3e0e1026f17125e88c4d5b2
Summary:
Methods like `module.named_modules()` returns a container of `shared_ptr<nn::Module>`. Currently the `nn::Module` base class does not have Python bindings. This PR fixes this, and adds more unit tests.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15193
Differential Revision: D13458713
Pulled By: goldsborough
fbshipit-source-id: 4091fe1b96a1be8db14c6a4307fbacc2b41ff6fe
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15027
- Make DataRandomFiller able to accept input_dims and input_types for only non intermediate inputs. Add a helper to fill input directly to a workspace
Reviewed By: highker
Differential Revision: D13408345
fbshipit-source-id: 5fc54d33da12e3f0a200e79380d4c695b0339b17
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15022
Add setArgument testing utils to make it easy to set argument for an operator
Reviewed By: yinghai
Differential Revision: D13405225
fbshipit-source-id: b5c1859c6819d53c1a44718e2868e3137067df36
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15020
Add test utils for assertion of a tensor (sizes and values)
Reviewed By: salexspb
Differential Revision: D13401146
fbshipit-source-id: bc385df074043e03ea884940b5631b96de4a607e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15019
Put some utils to fill tensors to test_utils
Reviewed By: salexspb
Differential Revision: D13386691
fbshipit-source-id: 51d891aad1ca12dc5133c0352df65b8db4f96edb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15180
Test utils to create an operator
On top of D13370461
Reviewed By: ZolotukhinM
Differential Revision: D13382773
fbshipit-source-id: a88040ed5a60f31d3e73f1f958219cd7338dc52e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15014
Currently it looks like many of the simple operations, such as comparing tensors, creating tensors, and fetching tensors, are too verbose and take effort to write correctly in unit tests.
Easy-to-use utilities are important for productivity when writing unit tests. While Caffe2 Python unit tests are relatively easy to write at the moment, the C++ side is lacking.
In this change I create a test_util, started with assertsTensorEquals, getTensor, createTensor, and we can start putting more easy to use utilities there.
Reviewed By: salexspb
Differential Revision: D13370461
fbshipit-source-id: bee467a127e1d032ef19482f98aa5c776cf508c0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14858
This diff doesn't change logic but just takes the existing code and moves it to caffe2::Tensor
Reviewed By: ezyang
Differential Revision: D13365817
fbshipit-source-id: bc73b27a793602cb14200dcdf357aa63233da43c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14656
This diff doesn't move it yet, but prepares it to be moved, i.e. removes all access to class internals.
dzhulgakov: Please comment on if you think it still makes sense to land this even though it's not blocking anymore since we're going to move at::CopyBytes anyhow.
ezyang: There's some changes in the implementation, especially handling undefined dest tensors. Please review carefully.
Reviewed By: ezyang
Differential Revision: D13287688
fbshipit-source-id: 17800ca8a79ab1633f23be58d96f99a160d8ed24
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15113
cv::rotatedRectangleIntersection has a known float underflow bug that would cause failure in ```CV_Assert(intersection.size() <= 8)```
For rotated proposals, replace cv::rotatedRectangleIntersection with a correct version that doesn't have underflow problem.
Otherwise, when ```USE_CPP_GENERATE_PROPOSALS = true```, the training would fail.
Reviewed By: viswanathgs
Differential Revision: D13429770
fbshipit-source-id: 5e95d059f3c668f14059a0a83e8e53d8554cdb99
Summary:
Adding support for torch.tensor in script.
The input list is typed as t[], because it can be arbitrarily nested. I added a compile-time check that the inner type of the list is a bool, float, or int.
Also adds specialization for Boolean lists, which already existed at the IValue level but had not been added to the compiler yet.
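A sketch of that inner-type check in plain Python (hypothetical helper, not the compiler's implementation):

```python
def innermost_type(value):
    """Recurse through arbitrarily nested lists and verify every
    leaf is a bool, float, or int."""
    if isinstance(value, list):
        types = {innermost_type(v) for v in value}
        if len(types) != 1:
            raise TypeError("empty or mixed inner types: %r" % types)
        return types.pop()
    if isinstance(value, bool):   # check bool before int: bool subclasses int
        return bool
    if isinstance(value, (int, float)):
        return type(value)
    raise TypeError("unsupported element type: %s" % type(value).__name__)

print(innermost_type([[1, 2], [3, 4]]))    # <class 'int'>
print(innermost_type([[True], [False]]))   # <class 'bool'>
```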
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14913
Differential Revision: D13407930
Pulled By: eellison
fbshipit-source-id: d17f1195a22149d5b0d08d76c89a7fab8444f7c5
Summary:
This PR fixes around 250 places in the codebase where we were making unnecessary copies of objects (some large, some small).
ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15026
Differential Revision: D13458784
Pulled By: goldsborough
fbshipit-source-id: be5148b2ce09493588d70952e6f6d6ff5ec5199b
Summary:
This PR removes the usage of _finfo defined in torch.distributions.utils and changes the call sites
to use torch.finfo instead
Differential Revision: D13451936
Pulled By: soumith
fbshipit-source-id: 6dbda3a6179d9407bc3396bf1a2baf3e85bc4cf2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14918
When ProtoBuf-Lite is in use, ProtoDebugString just calls SerializeAsString.
This produces binary output, which is not a very suitable "debug" string.
Specifically, we've observed it causing problems when calling code tries to
add the debug string to a Java exception message (which requires valid UTF-8).
Now, we replace all non-ASCII bytes with "?".
This is not a very fast implementation, but generating debug strings shouldn't
be a performance-sensitive operation in any application.
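The byte-replacement idea, sketched in Python (the function name is illustrative, not the actual Caffe2 API):

```python
def sanitize_debug_string(data):
    """Replace every non-ASCII byte with '?' so the debug string
    is always valid UTF-8."""
    return "".join(chr(b) if b < 0x80 else "?" for b in data)

raw = b'op: "Conv" \xde\xad\xbe\xef'
print(sanitize_debug_string(raw))  # op: "Conv" ????
```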
Reviewed By: dzhulgakov
Differential Revision: D13385540
fbshipit-source-id: 8868172baf20efaf53fecf7d666a6980f59b64f5
Summary:
This PR enables C++ frontend modules to be bound into Python and added as submodules of Python modules. For this, I added lots of pybind11 bindings for the `torch::nn::Module` class, and modified the `torch.nn.Module` class in Python to have a new Metaclass that makes `isinstance(m, torch.nn.Module)` return true when `m` is a C++ frontend module. The methods and fields of C++ modules are bound in such a way that they work seamlessly as submodules of Python modules for most operations (one exception I know of: calling `.to()` ends up calling `.apply()` on each submodule with a Python lambda, which cannot be used in C++ -- this may require small changes on Python side).
I've added quite a bunch of tests to verify the bindings and equality with Python. I think I should also try out adding a C++ module as part of some large PyTorch module, like a WLM or something, and see if everything works smoothly.
The next step for inter-op across our system is ScriptModule <-> C++ Frontend Module inter-op. I think this will then also allow using C++ frontend modules from TorchScript.
apaszke zdevito
CC dzhulgakov
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13481
Differential Revision: D12981996
Pulled By: goldsborough
fbshipit-source-id: 147370d3596ebb0e94c82cec92993a148fee50a7
Summary:
Before this PR, loop unrolling + the graph fuser was creating multiple
FusionGroups with the same bodies (with different variable names) for
JIT LSTMs. Each FusionGroup got registered to a separate fusion key;
each key resulted in a different compilation for the same
specializations.
This PR makes it so that when registering FusionGroups with the fusion
compiler, the compiler first checks the KernelSpec cache to see if the
FusionGroup's graph exists already. If it does, then return the
corresponding KernelSpec's key to share compiled kernels.
In addition, graphs in the KernelSpec cache are canonicalized before
being cached. I added a flag to the canonicalize pass to remove unique
names of values.
This shortens the compile time for a JIT LSTM (seq_len of 100, loop
unroll factor of 8) from 5.3s to 2.3s. Most of this compile time is
running the graph fuser and/or fusion compiler; while this PR
makes it so that there is only one unique kernel in the forward pass,
there are a lot of different kernels (6) in the backward pass
(after loop unrolling) that should be investigated.
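A toy sketch of the caching idea (not the fuser's KernelSpec machinery): rename values in first-use order so two graphs that differ only in unique value names map to the same cache key.

```python
import re

def canonical_key(graph_lines):
    """Rewrite %-prefixed value names in first-use order, yielding a
    name-independent key for the graph."""
    mapping = {}
    def rename(match):
        name = match.group(0)
        if name not in mapping:
            mapping[name] = "%" + str(len(mapping))
        return mapping[name]
    return tuple(re.sub(r"%[\w.]+", rename, line) for line in graph_lines)

# Same graph, different unique names -> same key, one compiled kernel.
g1 = ["%x.1 = aten::mul(%a.2, %a.2)", "return (%x.1)"]
g2 = ["%y.7 = aten::mul(%b.3, %b.3)", "return (%y.7)"]
print(canonical_key(g1) == canonical_key(g2))  # True
```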
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14541
Differential Revision: D13324487
Pulled By: zou3519
fbshipit-source-id: b841d82ed35a959b5cfc72db033bf5a7b42cc4fb
Summary:
We don't need THCNumerics here since at::Half can be implicitly converted to float and the cuda math dispatches are handled by `/usr/local/cuda/include/crt/math_functions.hpp` and `cmath`. ATen should be free of THCNumerics after this and when porting kernels from THC, one should not use THCNumerics.
Should close: https://github.com/pytorch/pytorch/issues/11878
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15085
Differential Revision: D13447558
Pulled By: soumith
fbshipit-source-id: 4ff5cbf838edcd01e2d1397e4d7f4f920e9e9fc3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14950
Minimize the number of headers included from _avx2.cc files to avoid accidental compilation of functions defined in the header files reused by other translation units, which can lead to illegal instruction errors.
Reviewed By: dskhudia
Differential Revision: D13394483
fbshipit-source-id: 67149a6fb51f7f047e745bfe395cb6dd4ae7c1ae
Summary:
Now that PyTorch 1.0 is out, this should be updated :)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15136
Differential Revision: D13447377
Pulled By: soumith
fbshipit-source-id: bd4e662c53d0699f25d4d90c1b4c1e182b4427c2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15110
support casting to string on CPU
Reviewed By: intermilan
Differential Revision: D13429381
fbshipit-source-id: b737a1ba1237b10f692d5c42b42a544b94ba9fd1
Summary:
The speed-up of a single operation is up to 3X.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15106
Differential Revision: D13429596
Pulled By: bddppq
fbshipit-source-id: f8d987cafeac9bef9c3daf7e43ede8c6a4ee2ce5
Summary:
Certain tensor shapes failed when being resized. This pull request addresses the bug found in #13404.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14874
Differential Revision: D13429788
Pulled By: soumith
fbshipit-source-id: 8aa6451dbadce46d6d1c47a01cb26e6559bcfc8c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15147
Forgot to take out dnnlowp.cc from avx2 list in a previous diff.
Reviewed By: dskhudia
Differential Revision: D13440686
fbshipit-source-id: 9ada98b6e885c7d5f22c91a735ff60304480b4cb
Summary:
* relax MIOpen if statement to allow fp16/fp32 mixed precision training now supported by ROCm 1.9.2
* use gemm_ex API of rocBLAS in ROCm 1.9.2 instead of the previous hgemm API
* with this: enable all but one half test in test_nn
While there, also fix:
* a group convolution issue w/ MIOpen pertaining to properly initializing MIOpen on multi-GPU systems, which we detected while working on this
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14994
Differential Revision: D13439869
Pulled By: bddppq
fbshipit-source-id: 75e4eb51a59488882e64b5eabdc30555b25be25e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15103
There are two main optimizations in this diff:
1. Previously we generated all anchors for every single spatial grid first, and
then applied NMS to pick 2000 anchors according to RPN_PRE_NMS_TOP_N. First
sorting the scores, picking the top 2000, and then lazily generating only the
corresponding anchors is much faster.
2. Transposing bbox_deltas from (num_anchors * 4, H, W) to
(H, W, num_anchors * 4) was also quite slow - taking about 20ms in the RRPN
case when there are lots of anchors, whereas it's negligible in the RPN case
(about 0.1 ms). Instead of transposing, performing all operations in the
(num_anchors, H, W) format speeds things up.
For regular RPN scenario, this gives 5x speedup from 5.84ms to 1.18ms a case
with 35 anchors over a 600x600 image.
For rotated boxes with 245 anchors, the runtime down from 80ms to 27ms per
iter.
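The sort-then-generate idea from optimization (1), sketched in numpy (shapes and the `make_anchor` helper are illustrative stand-ins for the real anchor generator):

```python
import numpy as np

def topk_anchors(scores, k, make_anchor):
    """Pick the top-k scores first, then materialize anchors only
    for the survivors."""
    top = np.argpartition(-scores, k - 1)[:k]   # O(n) partial selection
    top = top[np.argsort(-scores[top])]         # order the k survivors
    return top, [make_anchor(i) for i in top]

rng = np.random.default_rng(0)
scores = rng.random(35 * 600)                   # toy flattened score map
idx, anchors = topk_anchors(scores, k=2000,
                            make_anchor=lambda i: ("anchor", i))
print(len(anchors))  # 2000
```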
Reviewed By: newstzpz
Differential Revision: D13428688
fbshipit-source-id: 6006b332925e01a7c9433ded2ff5dc9e6d96f7d3
Summary:
This is an optimized implementation that does the following:
1. Create an empty Tensor of the correct size.
2. Fill the Tensor with the correct values.
The following three designs for filling in the Tensor result in roughly the same performance. Hence, the second option is taken for simpler code and to return contiguous tensors.
1. Sequential: fill row coordinates first, then columns. This results in two for-loops and more arithmetic operations.
2. Interleaved: fill in index coordinates one by one, which jumps between the two output Tensor rows in every iteration.
3. Transpose: create an n x 2 Tensor, fill it sequentially, and then transpose it.
<img width="352" alt="screen shot 2018-12-10 at 3 54 39 pm" src="https://user-images.githubusercontent.com/16999635/49769172-07bd3580-fc94-11e8-8164-41839185e9f9.png">
NOTE:
This implementation returns a 2D tensor, instead of a tuple of two tensors. It means that users will not be able to do the following:
```python
x = torch.ones(3, 3)
i = torch.tril_indices(3, 3)
x[i] # need to first convert the 2D tensor into a tuple of two 1D tensors.
```
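The NOTE can be illustrated with numpy (which stands in for torch here; `np.tril_indices` already returns the tuple form, so we stack it to mimic the PR's 2D output):

```python
import numpy as np

i2d = np.stack(np.tril_indices(3))   # shape (2, 6): like the PR's 2D output
x = np.ones((3, 3))

# Direct fancy indexing with the 2D matrix indexes only axis 0:
print(x[i2d].shape)   # (2, 6, 3) -- not what the user wanted

# Convert to a tuple of two 1D index arrays first:
rows, cols = i2d
x[rows, cols] = 0
print(int(x.sum()))   # 3 strictly-upper-triangular entries remain
```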
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14904
Reviewed By: zou3519
Differential Revision: D13433027
Pulled By: mrshenli
fbshipit-source-id: 41c876aafcf584832d7069f7c5929ffb59e0ae6a
Summary:
Documents what is supported in the script standard library.
* Adds `my_script_module._get_method('forward').schema()` method to get function schema from a `ScriptModule`
* Removes `torch.nn.functional` from the list of builtins. The only functions not supported are `nn.functional.fold` and `nn.functional.unfold`, but those currently just dispatch to their corresponding aten ops, so from a user's perspective it looks like they work.
* Allow printing of `IValue::Device` by getting its string representation
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14912
Differential Revision: D13385928
Pulled By: driazati
fbshipit-source-id: e391691b2f87dba6e13be05d4aa3ed2f004e31da
Summary:
Fixes#15119. Before this PR, we were propagating constants through
aten::warn AND running it as a part of shape analysis.
This caused aten::warn to be run regardless of if it is
supposed to be run dynamically. This PR adds an exclusion for aten::warn
in constant propagation and shape analysis, similar to that of prim::RaiseException.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15124
Differential Revision: D13432815
Pulled By: zou3519
fbshipit-source-id: 15ab533ce2accb2da3fd4e569070c7979ce61708
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14248
This diff also introduces a horrifying hack to override CUDA's DeviceGuardImpl
with a HIPGuardImplMasqueradingAsCUDA, to accommodate PyTorch's current
behavior of pretending CUDA is HIP when you build with ROCm enabled.
Reviewed By: bddppq
Differential Revision: D13145293
fbshipit-source-id: ee0e207b6fd132f0d435512957424a002d588f02
Summary:
…r_list_unwrap.
These functions use unsafeGetTensorImpl(), which doesn't work with Variables (in a silent way that may blow up later).
So let's do early checking.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15105
Reviewed By: ezyang
Differential Revision: D13429149
Pulled By: gchanan
fbshipit-source-id: b85f6f5b7cdb9a6dd0c40205b924c840a3920ba0
Summary:
Fixes#15038.
aten::_cast_Float(tensor, non_blocking) support was added in #14336.
Its second argument is a bool, but because we don't support generating values
of type bool in the fuser codegen, the codegen errored out.
aten::_cast_Float in the fuser never actually uses its non_blocking
argument, so another way to fix this would be to have a special op for a
fused cast but I thought that we might have fusible ops that do take
bool arguments in the future so this would be good to have.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15057
Differential Revision: D13432091
Pulled By: zou3519
fbshipit-source-id: 455fe574f5f080aca9a112e346b841a2534a8dc3
Summary:
While moving these scenarios into `_test_dim_ops` I accidentally left an empty loop in the actual tests, causing them to do nothing.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15077
Differential Revision: D13428759
Pulled By: umanwizard
fbshipit-source-id: 08f53068981d9192c1408878b168e9053f4dc92e
Summary:
When I do this setup in a local Docker development environment,
I get the following error:
x86_64-linux-gnu-gcc: error trying to exec 'cc1plus': execvp: No such file or directory
Somehow, gcc seems to get confused when it gets run from the wrong
directory. Best not to do it.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15078
Differential Revision: D13432143
Pulled By: ezyang
fbshipit-source-id: b18e15f493503a4c8205c85f92a214e49762a7bc
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14631
Adds an empty name scope to allow people to jump out of the current name scope.
This can be useful when you want to access a blob from a parent or sibling scope.
Facebook:
e.g.: we encountered a potential use case in D13124249 (it's a large diff, please search for EmptyNameScope in that diff); we need to access a blob declared in the root name scope from a device name scope (device name scope has been used by the parallel_GPU API). `EmptyNameScope` can help us do that with ease.
I referenced to `EmptyDeviceScope` D6103412 while implementing this one.
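The idea can be sketched in plain Python (a minimal illustration of scope nesting, not the actual caffe2 `scope` module API; all names below are hypothetical):

```python
import contextlib
import threading

_scopes = threading.local()

def _stack():
    return getattr(_scopes, "stack", [])

@contextlib.contextmanager
def name_scope(name):
    """Push one scope component, in the spirit of caffe2's NameScope."""
    saved = _stack()
    _scopes.stack = saved + [name]
    try:
        yield
    finally:
        _scopes.stack = saved

@contextlib.contextmanager
def empty_name_scope():
    """Temporarily jump back to the root scope, restoring on exit."""
    saved = _stack()
    _scopes.stack = []
    try:
        yield
    finally:
        _scopes.stack = saved

def scoped_blob(name):
    """Resolve a blob name under the current scope."""
    return "/".join(_stack() + [name])
```

Inside `empty_name_scope()`, `scoped_blob("w")` resolves to the root-level `"w"` even when nested under device scopes; the previous scope stack comes back when the context exits.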
Reviewed By: yinghai
Differential Revision: D13272240
fbshipit-source-id: d4cde5abcc2336e456b6c6ef086266ef94d86da8
Summary:
Fixes a bug where serializing and then deserializing a hierarchy of submodules, in which one submodule has no parameters but its own submodules do, did not load properly. This had to do with the fact that the old protobuf format couldn't store empty parameters.
Fixes https://github.com/pytorch/pytorch/issues/14891
soumith ezyang ebetica
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15033
Differential Revision: D13411322
Pulled By: goldsborough
fbshipit-source-id: 2ef73b2aa93fa9e46b1cbe1fd47d9f134d6016d5
Summary:
…on first and all the values in the next line. This way, it can output arbitrary blob
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15108
Reviewed By: llyfacebook
Differential Revision: D13429346
Pulled By: sf-wind
fbshipit-source-id: 5e0bba2a46fbe8d997dfc3d55a698484552e3af8
Summary:
Provide a pre-commit hook that does flake8 and clang tidy checks. Enables the clang-tidy script to run in parallel to make it fast enough to be used in a pre-commit hook.
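A minimal sketch of what such a hook runner might look like (illustrative only, not the actual script added by the PR; the concrete lint commands are supplied by the caller):

```python
import subprocess
import sys

def run_check(cmd):
    """Run one lint command; return True if it exits cleanly."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    if proc.returncode != 0:
        sys.stderr.write(proc.stdout + proc.stderr)
    return proc.returncode == 0

def main(checks):
    # Run every check even if an early one fails, so the developer
    # sees all problems in one pass; fail the commit if any failed.
    results = [run_check(cmd) for cmd in checks]
    return 0 if all(results) else 1
```

A git pre-commit hook would call `main` with commands like `["flake8"]` and exit with its return value; a nonzero exit aborts the commit.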
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15102
Reviewed By: soumith
Differential Revision: D13429629
Pulled By: zdevito
fbshipit-source-id: bd52fe5652f29b033de8d9926d78350b2da4c2fc
Summary:
…_tensor.
This is part of a long series of paring down the Type interface.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15074
Differential Revision: D13421482
Pulled By: gchanan
fbshipit-source-id: 84010ee71fef2cb74d32d5de7858d8ed9f36b885
Summary:
```
This diff changes the HIPification of ATen to be out-of-place.
We now have the following mappings:
- ATen/cuda => ATen/hip
- ATen/native/cuda => ATen/native/hip
- ATen/native/sparse/cuda => ATen/native/sparse/hip
- THC => THH
- THCUNN => THHUNN
The build system is adjusted to know about these new build paths,
and HIPify is taught how to adjust include paths and
THC_GENERIC_FILE appropriately. ATen_hip is now built as
the ATen_hip library, rather than reusing ATen_cuda.
However, despite these new filepaths, none of the identifiers in ATen
have actually changed. So, e.g., THHGeneral.h still defines functions
named THC_blahblah, and HIP still shows up as CUDA in PyTorch itself.
We'll tackle this in a subsequent PR; this diff is just to get the files
out-of-place.
Minor extra improvements:
- Don't edit tmp_install when hipifying
- HIP no longer builds native_cudnn_cpp; it was unnecessary
- Caffe2_HIP_INCLUDES is now Caffe2_HIP_INCLUDE, for consistency
with all the other variables.
- HIP build now properly respects ATEN_CUDA_FILES_GEN_LIB (it
did not previously.)
- You can now override file extension matching in pyHIPIFY
by explicitly specifying its full name in the matching list.
This is used so we can HIPify CMakeLists.txt in some situations.
A little bit of string and sealing wax:
- gen.py grows a --rocm flag so that it knows to generate CUDA
files which actually refer to the HIP headers (e.g., THH.h)
We'll get rid of this eventually and generate real HIP files,
but not for this PR.
- Management of HIP dependencies is now completely deleted
from the ATen CMakeLists.txt. The old code was dead (because
it was shoveled in ATen_CUDA_DEPENDENCY_LIBS and promptly
ignored by the Caffe2 build system) and didn't actually work.
```
Stacked on https://github.com/pytorch/pytorch/pull/14849 review last commit only
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14866
Differential Revision: D13419475
Pulled By: ezyang
fbshipit-source-id: cb4c843df69a1d8369314c9fab1b7719520fa3db
Summary:
Removing the deprecated functions in `torch/csrc/variable_tensor_functions.h` (like `torch::CPU`) and corresponding implementations from `torch/csrc/torch.cpp` from master after the release.
ezyang gchanan soumith
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15003
Differential Revision: D13418086
Pulled By: goldsborough
fbshipit-source-id: a0accdf6f7b0efa1ec07ac7b74b86ff2da37543f
Summary:
…done once
This allow no-op build to work correctly even when BUILD_CAFFE2_OPS is on.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14982
Differential Revision: D13413960
Pulled By: zdevito
fbshipit-source-id: 6e5412a8c375af8a47c76f548cdd31cff15f3853
Summary:
This PR creates TestFuser inside test_jit.py to be a home for graph fuser
specific tests.
This was a useful exercise because now that all the fuser tests are in
one place, I can spot redundant and bitrotting tests for cleanup in a
future PR.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15072
Differential Revision: D13421458
Pulled By: zou3519
fbshipit-source-id: 80b1a7712feff75a0c186d1664601c4edbbca694
Summary: Removes all warnings spew for the TestJitGenerated tests
Differential Revision: D13420919
fbshipit-source-id: f251c12f923088ccc5daa2984c15003a67cbd1c1
Summary:
The coverage of scalar-input test cases were not accurate. This patch fixed that.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15029
Differential Revision: D13419764
Pulled By: zrphercule
fbshipit-source-id: a14a5cbef432bea8c9126156f5deb1125e1aeb47
Summary:
We were only using this file to configure flake8, and fbcode linters do not recognize tox.ini which causes spurious linter warnings.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15065
Differential Revision: D13420774
Pulled By: suo
fbshipit-source-id: e43a46befa36862c8b3c0a90074aec6a66531492
Summary:
Previously we were returning true if either IValue wasn't a tensor, which…is bad
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15031
Differential Revision: D13409759
Pulled By: suo
fbshipit-source-id: f8bdcd05d334c1276ce46f55812065d358c1ff5d
Summary:
Currently in caffe2, one cannot properly fetch the content of Int8 blobs.
Upon digging into the source, it turns out that the relevant code is not being compiled. Adding the source to CMakeLists.txt fixes this issue.
First time ever doing a pull request. Please let me know if there's any rule I should follow. Thanks.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15047
Differential Revision: D13417583
Pulled By: bddppq
fbshipit-source-id: dd39575971a3012635edbf97a045d80e4b62a8eb
Summary:
This is broken out of https://github.com/pytorch/pytorch/pull/13733/
We want to install cpp tests so they can ultimately be runnable from that location for Caffe2 tests run from PyTorch builds.
cc pjh5 yf225 anderspapitto
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15000
Reviewed By: pjh5
Differential Revision: D13416253
Pulled By: orionr
fbshipit-source-id: 51280be0a22557a742f90c9f303c58c35cbd4a38
Summary:
This removes FloatToInt style names, replacing them with just the destination
name (e.g. FloatToInt -> Float). This makes it more consistent with the
syntax and makes it easier to add type conversions (just add a new
prim::Int op, for instance).
None of these ops get serialized, so this should not affect loading of
old models.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14947
Differential Revision: D13408409
Pulled By: zdevito
fbshipit-source-id: d773fe863f14d9de893f686832769f8cc8903a8e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15041
Adding an alternative implementation of a task graph based on TBB
Reviewed By: dmudiger
Differential Revision: D13412517
fbshipit-source-id: f5efedd680bbe0072bf38d504e5682ab51dd630f
Summary:
1) at::functions are now also exposed in the at::legacy::th namespace and we move relevant calls over to use them (to avoid merge conflicts)
2) LegacyTHDispatch now handles device-type initialization
3) We generate derived LegacyTHDispatchers, e.g. THLegacyCPULongDispatcher, although they are currently empty.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14852
Reviewed By: ezyang
Differential Revision: D13360852
Pulled By: gchanan
fbshipit-source-id: af6705aeba3593ea5dba9bfc62890e5257bc81f8
Summary:
This PR aligns the Array struct such that cuda vector performance improvements can be utilized.
I tested this by using it on our Philox header. Note how the vector store instruction gets used for cuda vector types and when using alignas on Array, vs when not using alignas on Array.
With cuda vector type (uint4, uint2, float4): https://godbolt.org/z/UaWOmR
With alignas: https://godbolt.org/z/Eeh0t5
Without alignas: https://godbolt.org/z/QT63gq
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14920
Differential Revision: D13406751
Pulled By: soumith
fbshipit-source-id: 685b1010ef1f576dde30c278b1e9b642f87c843d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14197
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13642
Previously we passed a partially initialized Tensor to Deserialize, which filled
it with the result of deserializing a tensor proto. Now we want it to return
a Tensor directly, since a Tensor is just a shared pointer to TensorImpl.
Reviewed By: dzhulgakov
Differential Revision: D12874357
fbshipit-source-id: 12b80a763375da23cfa64a74d6bc186d8d03b94f
Summary:
This can be use to initialize state that is not necessarily eligible for serialization/is implementation-specific. Concretely, I'm going to use this to pack the weight matrices for quantized Linear modules according to the FBGEMM APIs
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14655
Differential Revision: D13404438
Pulled By: jamesr66a
fbshipit-source-id: 2d327cef5520fdd716b5b1b29effd60a049e8a4a
Summary:
We've virtualized the destructor for storage, so we
no longer have to forward to a particular backend.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14897
Differential Revision: D13399216
Pulled By: ezyang
fbshipit-source-id: 531d29c3f278477cfa8759f30ab4f304d695b659
Summary:
cc iotamudelta
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14993
Differential Revision: D13405804
Pulled By: ezyang
fbshipit-source-id: c4aa9ed29ee2a4f3abf76c1e0fa8babfd738db35
Summary:
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
cc iotamudelta
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14999
Differential Revision: D13405754
Pulled By: ezyang
fbshipit-source-id: 98459496494390ad1115b4f1f6738d53c14f0745
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14911
In optimized modes the compiler tries to inline all the
`unordered_map::operator[]` calls, creating a massive amount of code
which takes several minutes to optimize. Instead, create a table of
PODs and populate the maps using a simple loop.
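The shape of the fix can be illustrated in Python (the real change is in C++, where each inlined `operator[]` call bloated compile time; the table entries below are hypothetical stand-ins):

```python
# A plain table of (name, value) pairs, analogous to the table of PODs.
OP_TABLE = (
    ("relu", 1),
    ("sigmoid", 2),
    ("tanh", 3),
)

def build_registry(table):
    """Populate the map with one simple loop instead of one
    map[key] = value statement per entry; in the C++ original this
    replaces many inlined unordered_map::operator[] expansions."""
    registry = {}
    for name, value in table:
        registry[name] = value
    return registry
```

The resulting map is identical to what per-entry assignments would build; only the amount of generated code (and hence optimizer time) changes in the C++ case.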
Reviewed By: soumith, luciang
Differential Revision: D13382948
fbshipit-source-id: b6752921e0f7213595d26b39e4397f6a3897960b
Summary:
When rewriting `default_collate`, I noticed that `from_numpy`, `as_tensor`, and `tensor` all fail on `np.int8` arrays.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14700
Reviewed By: weiyangfb
Differential Revision: D13305297
Pulled By: soumith
fbshipit-source-id: 2937110f65ed714ee830d50098db292238e9b2a9
Summary:
The other direction of #14700
cc soumith
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14710
Reviewed By: weiyangfb
Differential Revision: D13306052
Pulled By: soumith
fbshipit-source-id: 202d038f139cf05e01069ff8d05268c66354c983
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14881
This diff allows us to pre-quantize and pre-pack the weight matrix used in DNNLOWP_ACC16.
The intended use pattern is to run Int8ConvPackWeight in init_net to generate a packed weight, which Int8Conv with the DNNLOWP_ACC16 engine then uses.
Reviewed By: csummersea
Differential Revision: D13374662
fbshipit-source-id: dd02b9a4eb7af1fe208aa857fcd0b445e6e395af
Summary:
1. Changes the prints along the 'rebuild' pathway to respect the '-q' flag of setup.py
A clean rebuild now only prints:
[zdevito@devgpu172.prn2 /data/users/zdevito/pytorch] python setup.py -q rebuild develop
[0/1] Install the project...
-- Install configuration: "RelWithDebInfo"
ninja: no work to do.
ninja: no work to do.
ninja: no work to do.
ninja: no work to do.
ninja: no work to do.
ninja: no work to do.
2. Deletes apparently dead calls to `generate_code`. Now that CMake builds these files,
it appears that it is getting called twice and the second version is never used.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14972
Reviewed By: soumith
Differential Revision: D13396330
Pulled By: zdevito
fbshipit-source-id: 83c45143bbc6a6d2c1cfee929291ec059f2b5dc3
Summary: we now get mkldnn automatically from third_party/ideep
Differential Revision: D13396480
Pulled By: soumith
fbshipit-source-id: 20f819ba4b78cbe9c7d0baeab1c575669cbf6c20
Summary:
This fixes rebuild issues with the ninja part of the build. With this patch all ninja files will now report `nothing to do` if nothing has changed assuming `BUILD_CAFFE2_OPS=0`.
1. This only does the python file processing for caffe2 when BUILD_CAFFE2_OPS=1. That part of the build is written in such a way that it always reruns and can take substantial time moving files around in an otherwise no-op build. In the future it should be rewritten to use a faster method of copying the files, or should treat copying the files as part of the build rules and only run when the files are out of date.
2. This points `sleef` to a patched version that fixes a dead build output that is causing everything to relink all the time. See https://github.com/shibatch/sleef/pull/231#partial-pull-merging for the upstream change.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14969
Reviewed By: soumith
Differential Revision: D13395998
Pulled By: zdevito
fbshipit-source-id: ca85b7be9e99c5c578103c144ef0f2c3b927e724
Summary:
Fix auto grad summing for IfOp where an intermediate output needs renaming.
Bug before this diff:
- we only renamed the output of IfOp without changing the subnet ops' outputs
- this resulted in a blob-not-found error
The unit test provides an example; this diff fixes that for IfOp.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14772
Differential Revision: D13327090
Pulled By: harouwu
fbshipit-source-id: ec40ee88526ace3619c54551e223dd71158a02f8
Summary:
This PR does the following:
1) Updates the ONNX export for `torch.zeros_like` and `torch.full_like` ops to use ONNX op `ConstantLike`. This reduces the export of experimental op `ConstantFill`, which may possibly be removed in future, see https://github.com/onnx/onnx/pull/1434).
2) It also adds export support for `torch.ones_like`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14903
Differential Revision: D13383700
Pulled By: houseroad
fbshipit-source-id: 566d00a943e9497172fcd5a034b638a650ab13a2
Summary:
Anywhere we used #include "foo.h", we now say #include <foo.h>
Paths are adjusted to be rooted out of aten/src, torch/lib, or
the root level directory.
I modified CMakeLists.txt by hand to remove TH and THC from
the include paths.
I used the following script to do the canonicalization:
```
import subprocess
import re
import os.path
files = subprocess.check_output(['git', 'ls-files']).decode('utf-8').rstrip().split('\n')
for fn in files:
    if not any(fn.endswith(suff) for suff in ['.cu', '.cpp', '.in', '.h', '.hpp', '.cu', '.cuh', '.cc']):
        continue
    if not any(fn.startswith(pref) for pref in ["aten/", "torch/"]):
        continue
    with open(fn, 'r') as f:
        c = f.read()
    def fmt(p):
        return "#include <{}>".format(p)
    def repl(m):
        p = m.group(1)
        if p in ["dlfcn.h", "unistd.h", "nvrtc.h", "cuda.h", "cuda_runtime.h", "cstdint", "cudnn.h", "Python.h", "cusparse.h", "cuda_runtime_api.h", "cuda_fp16.h", "cublas_v2.h", "stdint.h", "curand_kernel.h"]:
            return fmt(p)
        if any(p.startswith(pref) for pref in ["torch/csrc", "c10/", "ATen/", "caffe2/", "TH/", "THC/", "Eigen/", "gtest/", "zdl/", "gloo/", "onnx/", "miopen/"]):
            return fmt(p)
        for root in ["aten/src", "torch/lib", ""]:
            for bad_root in [os.path.dirname(fn), "aten/src/TH", "aten/src/THC", "torch/csrc"]:
                new_p = os.path.relpath(os.path.join(bad_root, p), root)
                if not new_p.startswith("../") and (os.path.exists(os.path.join(root, new_p)) or os.path.exists(os.path.join(root, new_p + ".in"))):
                    return fmt(new_p)
        print("ERROR: ", fn, p)
        return m.group(0)
    new_c = re.sub(r'#include "([^"]+)"', repl, c)
    if new_c != c:
        print(fn)
        with open(fn, 'w') as f:
            f.write(new_c)
```
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14849
Reviewed By: dzhulgakov
Differential Revision: D13363445
Pulled By: ezyang
fbshipit-source-id: 52361f878a672785f9306c9e9ab2513128092b68
Summary:
50x-100x speedup compared to current version.
Also, fixes a bug in the current version when batch size exceeds 1 (current version processes only the first image in this case).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14883
Differential Revision: D13390655
Pulled By: Maratyszcza
fbshipit-source-id: 1b33a97bf2d0866d38faa2b42e64fd2859017898
Summary:
Tested on a tensor with 1 billion elements and 3 dimensions on a powerful, highly
multi-core Linux machine.
parallelized: All operations (e.g., `t.std(1)`) that could be done in the old code are now several times faster. All
new operations (e.g., `t.std((0, 2))`) are significantly faster than the NumPy equivalents.
`t.std((0, 1, 2))`, a new operation, is logically equivalent to the
old `t.std()`, but faster.
serial: The above comment about old operations now being faster still
holds, but `t.std((t1, ..., tn))` is now a few
times slower than `t.std()`. If this turns out to be important, we can
special-case that to use the old algorithm.
The approach is to create a new method, `TensorIterator::foreach_reduced_elt`,
valid for `TensorIterator`s that represent a dimension reduction. This
method calls a supplied function for each element in the output,
supplying it with the input elements that correspond to that output.
Given that primitive, we can implement reductions like the following pseudocode:
If there is more than one output element:
```
PARALLEL FOR EACH element IN output:
accumulator = identity
SERIAL FOR EACH data_point IN element.corresponding_input:
accumulator.update(data_point)
element = accumulator.to_output()
```
If there is only one output element, we still want to parallelize, so we
do so along the *input* instead:
```
accumulators[n_threads]
PARALLEL FOR EACH input_chunk IN input.chunks():
accumulators[thread_num()] = identity
SERIAL FOR EACH data_point IN input_chunk:
accumulators[thread_num()].update_with_data(data_point)
accumulator = identity
SERIAL FOR EACH acc in accumulators:
accumulator.update_with_other_accumulator(acc)
output_element = accumulator.to_output()
```
Note that accumulators and data points do not have to be the same type
in general, since it might be necessary to track arbitrary amounts of
data at intermediate stages.
For example, for `std`, we use a parallel version of Welford's
algorithm, which requires us to track the mean, second moment, and number
of elements, so the accumulator type for `std` contains three pieces of
data.
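The single-output path above, per-thread accumulators merged at the end, maps naturally onto the parallel form of Welford's algorithm. A minimal Python sketch of the update and combine steps (illustrative only; the actual implementation lives in TensorIterator's C++ reduction kernels, and the chunks here stand in for per-thread work):

```python
import math

def welford_update(state, x):
    """Add one data point to a (count, mean, M2) accumulator."""
    n, mean, m2 = state
    n += 1
    delta = x - mean
    mean += delta / n
    m2 += delta * (x - mean)
    return (n, mean, m2)

def welford_combine(a, b):
    """Merge two accumulators (Chan et al.'s parallel variant)."""
    na, mean_a, m2a = a
    nb, mean_b, m2b = b
    if na == 0:
        return b
    if nb == 0:
        return a
    n = na + nb
    delta = mean_b - mean_a
    mean = mean_a + delta * nb / n
    m2 = m2a + m2b + delta * delta * na * nb / n
    return (n, mean, m2)

def std(chunks, unbiased=True):
    """Reduce each chunk serially, then merge the partial accumulators."""
    partials = []
    for chunk in chunks:  # each chunk would run on its own thread
        acc = (0, 0.0, 0.0)
        for x in chunk:
            acc = welford_update(acc, x)
        partials.append(acc)
    total = (0, 0.0, 0.0)
    for acc in partials:
        total = welford_combine(total, acc)
    n, _, m2 = total
    return math.sqrt(m2 / (n - 1 if unbiased else n))
```

Because the combine step is exact, splitting the input into chunks gives the same result as a single serial pass, which is what makes the input-parallel path safe.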
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14535
Differential Revision: D13283887
Pulled By: umanwizard
fbshipit-source-id: 8586b7bf00bf9f663c55d6f8323301e257f5ec3f
Summary:
* Enable unit tests known to work on ROCm.
* Disable a few that are known to be flaky for the time being.
* Use std::abs for Half
* No more special casing for ROCm in TensorMathReduce
* Document an important detail for a hardcoded block size w.r.t. ROCm in TensorMathReduce
ezyang bddppq for awareness
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14011
Differential Revision: D13387679
Pulled By: bddppq
fbshipit-source-id: 4177f2a57b09d866ccbb82a24318f273e3292f71
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14658
Remove this dependency by moving at::CopyBytes to c10.
The implementations for at::CopyBytes will have to live in aten/caffe2 for now because they're not unified for CUDA yet.
They'll be moved into c10/backend/xxx later.
Reviewed By: dzhulgakov
Differential Revision: D13288655
fbshipit-source-id: 1c92379345308b3cd39a402779d7b7999613fc0d
Summary:
Tracing records variable names and we have new types and stuff in the IR, so this updates the graph printouts in the docs
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14914
Differential Revision: D13385101
Pulled By: jamesr66a
fbshipit-source-id: 6477e4861f1ac916329853763c83ea157be77f23
Summary:
Added a few examples and explains to how publish/load models.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14862
Differential Revision: D13384790
Pulled By: ailzhang
fbshipit-source-id: 008166e84e59dcb62c0be38a87982579524fb20e
Summary:
This will let us install tests and other Caffe2 python code as a part of running Caffe2 tests in PyTorch.
Broken out of https://github.com/pytorch/pytorch/pull/13733/
cc pjh5 yf225
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14898
Reviewed By: pjh5
Differential Revision: D13381123
Pulled By: orionr
fbshipit-source-id: 0ec96629b0570f6cc2abb1d1d6fce084e7464dbe
Summary:
_th_tensor is moving off Type, so these calls need to be replaced.
Unfortunately, replacing these with a full-fledged solution [e.g. from_storage(..., TensorOptions)] is a bit complicated because the storage itself fully defines the Type (modulo variable). It's simpler to just wait for the Variable/Tensor merge rather than to solve this now, so instead I changed the call sites to: at::empty({0}, type.options()).set_(storage...).
This isn't great because we are also trying to get rid of Type::options, but this seems to be the lesser-of-two-evils.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14877
Differential Revision: D13374310
Pulled By: gchanan
fbshipit-source-id: eb953ed041507e6190d6f32e383912e5a08311cd
Summary:
Fixes#14099
I attempted to be as consistent as possible with the formatting, which is why my equation reads d*(k - 1) instead of (k - 1)*d.
Also there is an unused variable on line 46: `n = self.in_channels`. I could fix that here too if that's not too out of scope.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14876
Differential Revision: D13374317
Pulled By: soumith
fbshipit-source-id: a9f110acafa58cdb4206956dbe3ab4738d48292d
Summary:
- allow gradcheck to take sparse tensor as input
- sparse output is not allowed yet at gradcheck
- add backward for `to_dense()` to get around sparse output
- calling gradcheck at test_sparse, so that we can use `_gen_sparse()` and also easily cover coalesced / uncoalesced test cases
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14596
Differential Revision: D13271904
Pulled By: weiyangfb
fbshipit-source-id: 5317484104404fd38058884c86e987546011dd86
Summary:
Otherwise, these tests will fail, even though they are never meant to run on single-GPU machines.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14860
Differential Revision: D13369060
Pulled By: teng-li
fbshipit-source-id: 8a637a6d57335491ba8602cd09927700b2bbf8a0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14838
The GPU memory tracking logs are incredibly annoying and merely serve
to pollute output. I `VLOG(1)`ed them. Hopefully, this is non-controversial.
Reviewed By: kuttas
Differential Revision: D13343290
fbshipit-source-id: b3cae99346c97b66e97ea660061e15dc5c99b9fc
Summary:
Latest hcc can now properly cast to correct type internally, so there is no need to insert static_cast in hipify scripts anymore.
However the hcc included in the latest ROCm release (1.9.2) doesn't have this fix, so leaving a flag to continue doing static_cast for those using the official ROCm releases.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14853
Differential Revision: D13363171
Pulled By: bddppq
fbshipit-source-id: a36476a8511222ff3c933d31788e8a0ffb04f5ca
Summary:
Drop custom hcc/hip as the 1.9.2 release should contain the relevant patches therein.
Most notable feature in 1.9.2 is mixed precision support in rocBLAS and MIOpen. These features will be enabled by subsequent PRs.
bddppq ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14216
Differential Revision: D13354294
Pulled By: bddppq
fbshipit-source-id: 2541d4a196af21c9432c1aff7f6e65b572628028
Summary:
`torch.linspace(0, 1, 1)` fails with `RuntimeError: invalid argument 3: invalid number of points at ../aten/src/TH/generic/THTensorMoreMath.cpp:2119`, while `np.linspace(0, 1, 1)` works fine.
Looking at the code, there is even a comment by gchanan asking: "NumPy allows you to pass different points even if n <= 1 -- should we?"
I would say "yes". Currently, I would need to handle the case of `steps == 1` or `steps == 0` separately, making sure to change the `end` when calling `torch.linspace`. This is impractical. If we support `start != end`, there are two possibilities for the result: Either we ensure the first value in the resulting sequence always equals `start`, or we ensure the last value in the resulting sequence always equals `end`. Numpy chose the former, which also allows it to support a boolean `endpoint` flag. I'd say we should follow numpy.
This PR adapts `linspace` and `logspace` to mimic the behavior of numpy, adapts the tests accordingly, and extends the docstrings to make clear what happens when passing `steps=1`.
If you decide against this PR, the error message should become explicit about what I did wrong, and the documentation should be extended to mention this restriction.
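The proposed numpy-style behavior can be sketched in plain Python (an illustration of the semantics, not the ATen implementation):

```python
def linspace(start, end, steps):
    """NumPy-style linspace: steps == 1 yields [start], steps == 0 yields []."""
    if steps < 0:
        raise ValueError("number of steps must be non-negative")
    if steps == 0:
        return []
    if steps == 1:
        # NumPy's choice: the single value is `start`, not `end`.
        return [float(start)]
    step = (end - start) / (steps - 1)
    return [start + i * step for i in range(steps)]
```

With this, `linspace(0, 1, 1)` returns `[0.0]` instead of raising, matching `np.linspace(0, 1, 1)`.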
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14748
Differential Revision: D13356136
Pulled By: ezyang
fbshipit-source-id: db85b8f0a98a5e24b3acd766132ab71c91794a82
Summary:
Removes cast of half to float in torch.sum, with float16 input tensor and
float32 output tensor, instead we cast data when loading input in kernel.
This supposingly would save a kernel launch as well as a full global memory load
on promoted data type (float).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14580
Differential Revision: D13356203
Pulled By: ezyang
fbshipit-source-id: 85e91225b880a65fe3ceb493371b9b36407fdf48
Summary:
Not ready yet, need some comments / help with this. It's good enough for https://github.com/pytorch/xla immediate goals (forward + backward trace fusion), but there are at least two issues with it:
1. If we don't allow it, `test/test_jit.py` fails to cover the change.
2. If we allow the weight to be set, running `test/test_jit.py TestJitGenerated.test_nn_nll_loss` fails with:
```
======================================================================
ERROR: test_nn_nll_loss (__main__.TestJitGenerated)
----------------------------------------------------------------------
Traceback (most recent call last):
File "test/test_jit.py", line 10001, in do_test
fn, f_args_variable, kwargs_variable, no_grad=no_grad)
File "test/test_jit.py", line 9360, in check_against_reference
outputs_test = self.runAndSaveRNG(func, recording_inputs, kwargs)
File "test/test_jit.py", line 425, in runAndSaveRNG
results = func(*inputs, **kwargs)
File "test/test_jit.py", line 9298, in script_fn
self.assertExportImport(CU.the_method.graph, tensors)
File "test/test_jit.py", line 415, in assertExportImport
self.assertExportImportModule(m, inputs)
File "test/test_jit.py", line 419, in assertExportImportModule
self.assertEqual(self.runAndSaveRNG(m.forward, inputs),
File "test/test_jit.py", line 425, in runAndSaveRNG
results = func(*inputs, **kwargs)
RuntimeError:
arguments for call are not valid:
for operator aten::nll_loss_backward(Tensor grad_output, Tensor self, Tensor target, Tensor? weight, int reduction, int ignore_index, Tensor total_weight, *, Tensor out) -> Tensor:
expected a value of type Tensor for argument 'total_weight' but found bool
<internally-created-node>
~ <--- HERE
for operator aten::nll_loss_backward(Tensor grad_output, Tensor self, Tensor target, Tensor? weight, int reduction, int ignore_index, Tensor total_weight) -> Tensor:
expected a value of type Tensor for argument 'total_weight' but found bool
<internally-created-node>
~ <--- HERE
for call at:
<internally-created-node>
~ <--- HERE
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14305
Differential Revision: D13356265
Pulled By: ezyang
fbshipit-source-id: 504d783b2d87f923e698a6a4efc0fd9935a94a41
Summary:
This pull request contains changes for:
1. Added MIOpen RNN API miopenGetRNNLayerBiasSize and miopenGetRNNLayerParamSize.
2. Fixed usage of API miopenGetRNNLayerParam.
3. Modifying the RNN test to run using MIOpen engine.
Differential Revision: D13355699
Pulled By: bddppq
fbshipit-source-id: 6f750657f8049c5446eca893880b397804120b69
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14827
We need to send complete IO info when doing `onnxGetBackendCompatibility` to a backend like Glow. Previously we were missing some info because sometimes we generate more than one node from one C2 op. This fixes the issue.
Reviewed By: jackm321
Differential Revision: D13352049
fbshipit-source-id: 8d8ac70656a0ac42f3a0ccecad61456a4f3b2435
Summary:
Currently, pytorch doesn't depend on protobuf, so we don't need to include the protobuf dir in the pytorch cmake file.
And if we build caffe2 without custom protobuf[1], we will have a protobuf mismatch problem.
[1]
92dbd0219f/CMakeLists.txt (L65)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14182
Differential Revision: D13356273
Pulled By: ezyang
fbshipit-source-id: 8120c3452d158dc51d70156433d7b9076c6aed47
Summary:
Fix CMakeLists.txt, so the test for CPU won't run profile_observer_test.cc, as currently it only supports GPU
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14168
Differential Revision: D13356274
Pulled By: ezyang
fbshipit-source-id: 7d105f2e18675e5fab129864958148b0f18d582c
Summary:
I know that including CAFFE2_INCLUDE_DIRS in include headers is not necessary for newer cmakes. But I had this in one of my old projects and **cmake gave me an error that "/usr/lib/include" is an invalid path**.
It seems like "${_INSTALL_PREFIX}/lib/include" should be changed to "${_INSTALL_PREFIX}/include" as all caffe2 headers are in /include rather than /lib/include/
Please correct me if I am wrong?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14306
Differential Revision: D13356246
Pulled By: ezyang
fbshipit-source-id: e2d5d3c42352e59b245714ad90fd7a9ef48170d7
Summary:
On Windows, some ATen core files (Type.h, Tensor.h, TensorMethods.h) are generated with CRLF newlines (this may be environment dependent).
As a result, the file comparison in generate_outputs() fails and compilation stops.
This patch forces these files to be generated with LF line endings.
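One way to force LF output from a Python generator, regardless of platform (a minimal sketch of the idea, not the patch itself; the function name is made up):

```python
def write_generated_file(path, text):
    # Passing an explicit newline argument to open() disables Python's
    # platform newline translation, so the file gets LF line endings
    # even on Windows, where the default would be CRLF.
    with open(path, "w", newline="\n") as f:
        f.write(text)
```

Generated files written this way compare byte-identical across platforms, which is exactly what the generate-then-compare step needs.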
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14667
Differential Revision: D13356170
Pulled By: ezyang
fbshipit-source-id: ef8cc3a6cc8bf3c45b78e9eb3df98cf47c0d33bb
Summary:
pytorch_theme.css is no longer necessary for the cpp or html docs site build. The new theme styles are located at https://github.com/pytorch/pytorch_sphinx_theme. The Lato font is also no longer used in the new theme.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14779
Differential Revision: D13356125
Pulled By: ezyang
fbshipit-source-id: c7635eb7512c7dcaddb9cad596ab3dbc96480144
Summary:
Implement some simple fixes to clean up windows build by fixing compiler warnings. Three main types of warnings were fixes:
1. GCC specific pragmas were changed to not be used on windows.
2. cmake flags that don't exist on windows were removed from windows build
3. Fix a macro that was defined multiple times on Windows.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14490
Differential Revision: D13241988
Pulled By: ezyang
fbshipit-source-id: 38da8354f0e3a3b9c97e33309cdda9fd23c08247
Summary:
See #14554.
I can't figure out how the reported issue can happen. The next best
thing is to have more information when this happens again.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14813
Differential Revision: D13351908
Pulled By: pietern
fbshipit-source-id: 61b30fcae2e34da54329d0893ca4921b6ad60f0d
Summary:
It is possible that some sort of contention causes process scheduling
delays which in turn cause the timeout to *not* be hit.
Increased sleep here will decrease the probability of this happening.
Fixes #14555.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14814
Differential Revision: D13351924
Pulled By: pietern
fbshipit-source-id: 1222cf0855408dfcb79f30f94694c790ee998cf9
2. Persists CircleCI scripts (everything in `.circleci`) into a workspace. Why?
We don't always do a Git checkout on all subjobs, but we usually
still want to be able to call scripts one way or another in a subjob.
Persisting files this way lets us have access to them without doing a
checkout. This workspace is conventionally mounted on `~/workspace`
(this is distinguished from `~/project`, which is the conventional
working directory that CircleCI will default to starting your jobs
in.)
3. Write out the commit message to `.circleci/COMMIT_MSG`. This is so
we can determine in subjobs if we should actually run the jobs or
not, even if there isn't a Git checkout.
CircleCI configuration generator
================================
One may no longer make changes to the `.circleci/config.yml` file directly.
Instead, one must edit these Python scripts or files in the `verbatim-sources/` directory.
Usage
----------
1. Make changes to these scripts.
2. Run the `regenerate.sh` script in this directory and commit the script changes and the resulting change to `config.yml`.
You'll see a build failure on TravisCI if the scripts don't agree with the checked-in version.
Motivation
----------
These scripts establish a single, authoritative source of documentation for the CircleCI configuration matrix.
The documentation, in the form of diagrams, is automatically generated and cannot drift out of sync with the YAML content.
Furthermore, consistency is enforced within the YAML config itself, by using a single source of data to generate
multiple parts of the file.
* Facilitates one-off culling/enabling of CI configs for testing PRs on special targets
Also see https://github.com/pytorch/pytorch/issues/17038
Future direction
----------------
### Declaring sparse config subsets
See comment [here](https://github.com/pytorch/pytorch/pull/17323#pullrequestreview-206945747):
In contrast with a full recursive tree traversal of configuration dimensions,
> in the future I think we actually want to decrease our matrix somewhat and have only a few mostly-orthogonal builds that test as many different features as possible on PRs, plus a more complete suite on every PR and maybe an almost full suite nightly/weekly (we don't have this yet). Specifying PR jobs in the future might be easier to read with an explicit list when we come to this.
----------------
----------------
# How do the binaries / nightlies / releases work?
### What is a binary?
A binary or package (used interchangeably) is a pre-built collection of c++ libraries, header files, python bits, and other files. We build these and distribute them so that users do not need to install from source.
A **binary configuration** is a collection of
* release or nightly
* releases are stable, nightlies are beta and built every night
* python version
* linux: 2.7m, 2.7mu, 3.5m, 3.6m, 3.7m (mu is wide unicode or something like that. It usually doesn't matter but you should know that it exists)
* macos and windows: 2.7, 3.5, 3.6, 3.7
* cpu version
* cpu, cuda 9.0, cuda 10.0
* The supported cuda versions occasionally change
* operating system
* Linux - these are all built on CentOS. There haven't been any problems in the past building on CentOS and using on Ubuntu
* MacOS
* Windows - these are built on Azure pipelines
* devtoolset version (gcc compiler version)
* This only matters on Linux because only Linux uses gcc. tl;dr is that gcc made a backwards-incompatible change from gcc 4.8 to gcc 5, because it had to change how it implemented std::vector and std::string
### Where are the binaries?
The binaries are built in CircleCI. There are nightly binaries built every night at 9pm PST (midnight EST) and release binaries corresponding to PyTorch releases, usually every few months.
We have 3 types of binary packages
* pip packages - nightlies are stored on s3 (pip install -f <as3url>). releases are stored in a pip repo (pip install torch) (ask Soumith about this)
* conda packages - nightlies and releases are both stored in a conda repo. Nightly packages have a '_nightly' suffix
* libtorch packages - these are zips of all the c++ libraries, header files, and sometimes dependencies. These are c++ only
* shared with dependencies
* static with dependencies
* shared without dependencies
* static without dependencies
All binaries are built in CircleCI workflows. There are checked-in workflows (committed into the .circleci/config.yml) to build the nightlies every night. Releases are built by manually pushing a PR that builds the suite of release binaries (overwrite the config.yml to build the release)
# CircleCI structure of the binaries
Some quick vocab:
* A **workflow** is a CircleCI concept; it is a DAG of '**jobs**'. ctrl-f 'workflows' on https://github.com/pytorch/pytorch/blob/master/.circleci/config.yml to see the workflows.
* **jobs** are a sequence of '**steps**'
* **steps** are usually just a bash script or a builtin CircleCI command. *All steps run in new environments; environment variables declared in one script DO NOT persist to following steps.*
* CircleCI has a **workspace**, which is essentially a cache between steps of the *same job* in which you can store artifacts between steps.
## How are the workflows structured?
The nightly binaries have 3 workflows. We have one job (actually 3 jobs: build, test, and upload) per binary configuration
3. For each binary configuration, e.g. linux_conda_3.7_cpu, there is a
1. smoke_linux_conda_3.7_cpu
1. Downloads the package from the cloud, e.g. using the official pip or conda instructions
2. Runs the smoke tests
## How are the jobs structured?
The jobs are in https://github.com/pytorch/pytorch/tree/master/.circleci/verbatim-sources . Jobs are made of multiple steps. There are some shared steps used by all the binaries/smokes. Steps of these jobs are all delegated to scripts in https://github.com/pytorch/pytorch/tree/master/.circleci/scripts .
* Linux jobs: https://github.com/pytorch/pytorch/blob/master/.circleci/verbatim-sources/linux-binary-build-defaults.yml
* Common shared code (shared across linux and macos): https://github.com/pytorch/pytorch/blob/master/.circleci/verbatim-sources/nightly-binary-build-defaults.yml
* binary_checkout.sh - checks out pytorch/builder repo. Right now this also checks out pytorch/pytorch, but it shouldn't. pytorch/pytorch should just be shared through the workspace. This can handle being run before binary_populate_env.sh
* binary_populate_env.sh - parses BUILD_ENVIRONMENT into the separate env variables that make up a binary configuration. Also sets lots of default values, the date, the version strings, the location of folders in s3, all sorts of things. This generally has to be run before other steps.
* binary_install_miniconda.sh - Installs miniconda, cross platform. Also hacks this for the update_binary_sizes job that doesn't have the right env variables
* binary_run_in_docker.sh - Takes a bash script file (the actual test code) from a hardcoded location, spins up a docker image, and runs the script inside the docker image
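The variable-splitting that binary_populate_env.sh performs can be sketched as follows. This is an illustrative simplification, not the real script; the example BUILD_ENVIRONMENT value and the three variables mirror the local-build instructions later in this document.

```shell
# Simplified sketch (NOT the real binary_populate_env.sh): split a
# BUILD_ENVIRONMENT string into the variables a binary configuration needs.
BUILD_ENVIRONMENT="conda 3.6 cpu"   # example value
read -r PACKAGE_TYPE DESIRED_PYTHON DESIRED_CUDA <<EOF
$BUILD_ENVIRONMENT
EOF
export PACKAGE_TYPE DESIRED_PYTHON DESIRED_CUDA
echo "$PACKAGE_TYPE $DESIRED_PYTHON $DESIRED_CUDA"
```

The real script also computes dates, version strings, and s3 folder locations, which this sketch omits.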
### **Why do the steps all refer to scripts?**
CircleCI creates a final yaml file by inlining every <<* segment, so if we were to keep all the code in the config.yml itself then the config size would go over 4 MB and cause infra problems.
### **What is binary_run_in_docker for?**
So, CircleCI has several executor types: macos, machine, and docker are the ones we use. The 'machine' executor gives you two cores on some linux vm. The 'docker' executor gives you considerably more cores (nproc was 32 instead of 2 back when I tried in February). Since the dockers are faster, we try to run everything that we can in dockers. Thus
* linux build jobs use the docker executor. Running them on the docker executor was at least 2x faster than running them on the machine executor
* linux test jobs use the machine executor and spin up their own docker. Why this nonsense? It's because we run nvidia-docker for our GPU tests; any code that calls into the CUDA runtime needs to be run on nvidia-docker. To run a nvidia-docker you need to install some nvidia packages on the host machine and then call docker with the `--runtime=nvidia` argument. CircleCI doesn't support this, so we have to do it ourselves.
* This is not just a mere inconvenience. **This blocks all of our linux tests from using more than 2 cores.** But there is nothing that we can do about it, but wait for a fix on circleci's side. Right now, we only run some smoke tests (some simple imports) on the binaries, but this also affects non-binary test jobs.
* linux upload jobs use the machine executor. The upload jobs are so short that it doesn't really matter what they use
* linux smoke test jobs use the machine executor for the same reason as the linux test jobs
binary_run_in_docker.sh is a way to share the docker start-up code between the binary test jobs and the binary smoke test jobs
### **Why does binary_checkout also checkout pytorch? Why shouldn't it?**
We want all the nightly binary jobs to run on the exact same git commit, so we wrote our own checkout logic to ensure that the same commit was always picked. Later circleci changed that to use a single pytorch checkout and persist it through the workspace (they did this because our config file was too big, so they wanted to take a lot of the setup code into scripts, but the scripts needed the code repo to exist to be called, so they added a prereq step called 'setup' to checkout the code and persist the needed scripts to the workspace). The changes to the binary jobs were not properly tested, so they all broke from missing pytorch code no longer existing. We hotfixed the problem by adding the pytorch checkout back to binary_checkout, so now there's two checkouts of pytorch on the binary jobs. This problem still needs to be fixed, but it takes careful tracing of which code is being called where.
# Code structure of the binaries (circleci agnostic)
## Overview
The code that runs the binaries lives in two places, in the normal [github.com/pytorch/pytorch](http://github.com/pytorch/pytorch), but also in [github.com/pytorch/builder](http://github.com/pytorch/builder) , which is a repo that defines how all the binaries are built. The relevant code is
```
# All code needed to set-up environments for build code to run in,
# but only code that is specific to the current CI system
pytorch/pytorch
- .circleci/ # Folder that holds all circleci related stuff
- config.yml # GENERATED file that actually controls all circleci behavior
- verbatim-sources # Used to generate job/workflow sections in ^
- scripts/ # Code needed to prepare circleci environments for binary build scripts
- setup.py # Builds pytorch. This is wrapped in pytorch/builder
- cmake files # used in normal building of pytorch
# All code needed to prepare a binary build, given an environment
# with all the right variables/packages/paths.
pytorch/builder
# Given an installed binary and a proper python env, runs some checks
# to make sure the binary was built the proper way. Checks things like
# the library dependencies, symbols present, etc.
- check_binary.sh
# Given an installed binary, runs python tests to make sure everything
# is in order. These should be de-duped. Right now they both run smoke
# tests, but are called from different places. Usually just call some
# import statements, but also has overlap with check_binary.sh above
- run_tests.sh
- smoke_test.sh
# Folders that govern how packages are built. See paragraphs below
- conda/
- build_pytorch.sh # Entrypoint. Delegates to proper conda build folder
- switch_cuda_version.sh # Switches the active CUDA installation in Docker
- pytorch-nightly/ # Build-folder
- manywheel/
- build_cpu.sh # Entrypoint for cpu builds
- build.sh # Entrypoint for CUDA builds
- build_common.sh # Actual build script that ^^ call into
- wheel/
- build_wheel.sh # Entrypoint for wheel builds
```
Every type of package has an entrypoint build script that handles all the important logic.
## Conda
Both Linux and MacOS use the same code flow for the conda builds.
Conda packages are built with conda-build, see https://conda.io/projects/conda-build/en/latest/resources/commands/conda-build.html
Basically, you pass `conda build` a build folder (pytorch-nightly/ above) that contains a build script and a meta.yaml. The meta.yaml specifies what python environment to build the package in and what dependencies the resulting package should have, and the build script gets called in that env to build the thing.
tl;dr on conda-build:
1. Creates a brand new conda environment, based off of deps in the meta.yaml
1. Note that environment variables do not get passed into this build env unless they are specified in the meta.yaml
2. If the build fails this environment will stick around. You can activate it for much easier debugging. The “General Python” section below explains what exactly a python “environment” is.
2. Calls build.sh in the environment
3. Copies the finished package to a new conda env, also specified by the meta.yaml
4. Runs some simple import tests (if specified in the meta.yaml)
5. Saves the finished package as a tarball
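As a dry-run sketch of what the entrypoint ends up invoking (the command is only assembled and echoed here, never executed; the folder name comes from the tree above and the python version is illustrative):

```shell
# Assemble, but do not run, the conda-build command the entrypoint delegates to.
DESIRED_PYTHON=3.6                 # example value
BUILD_FOLDER=pytorch-nightly/      # the build folder from the repo tree above
cmd="conda build $BUILD_FOLDER --python $DESIRED_PYTHON"
echo "$cmd"
```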
The build.sh we use is essentially a wrapper around `python setup.py build`, but it also manually copies in some of our dependent libraries into the resulting tarball and messes with some rpaths.
The entrypoint file `builder/conda/build_conda.sh` is complicated because
* It works for both Linux and MacOS
* The mac builds used to create their own environments, since they all used to be on the same machine. There’s now a lot of extra logic to handle conda envs. This extra machinery could be removed
* It used to handle testing too, which adds more logic messing with python environments too. This extra machinery could be removed.
## Manywheels (linux pip and libtorch packages)
Manywheels are pip packages for linux distros. Note that these manywheels are not actually manylinux compliant.
`builder/manywheel/build_cpu.sh` and `builder/manywheel/build.sh` (for CUDA builds) just set different env vars and then call into `builder/manywheel/build_common.sh`
The entrypoint file `builder/manywheel/build_common.sh` is really really complicated because
* This used to handle building for several different python versions at the same time. The loops have been removed, but there's still unnecessary folders and movements here and there.
* The script is never used this way anymore. This extra machinery could be removed.
* This used to handle testing the pip packages too. This is why there’s testing code at the end that messes with python installations and stuff
* The script is never used this way anymore. This extra machinery could be removed.
* This also builds libtorch packages
* This should really be separate. libtorch packages are c++ only and have no python. They should not share infra with all the python specific stuff in this file.
* There is a lot of messing with rpaths. This is necessary, but could be made much much simpler if the above issues were fixed.
## Wheels (MacOS pip and libtorch packages)
The entrypoint file `builder/wheel/build_wheel.sh` is complicated because
* The mac builds used to all run on one machine (we didn’t have autoscaling mac machines till circleci). So this script handled siloing itself by setting-up and tearing-down its build env and siloing itself into its own build directory.
* The script is never used this way anymore. This extra machinery could be removed.
* This also builds libtorch packages
* Ditto the comment above. This should definitely be separated out.
Note that the MacOS Python wheels are still built in conda environments. Some of the dependencies present during build also come from conda.
## General notes
### Note on run_tests.sh, smoke_test.sh, and check_binary.sh
* These should all be consolidated
* These must run on all OS types: MacOS, Linux, and Windows
* These all run smoke tests at the moment. They inspect the packages some, maybe run a few import statements. They DO NOT run the python tests nor the cpp tests. The idea is that python tests on master and PR merges will catch all breakages. All these tests have to do is make sure the special binary machinery didn’t mess anything up.
* There are separate run_tests.sh and smoke_test.sh because one used to be called by the smoke jobs and one used to be called by the binary test jobs (see circleci structure section above). This is still true actually, but these could be united into a single script that runs these checks, given an installed pytorch package.
### Note on libtorch
Libtorch packages are built in the wheel build scripts: manywheel/build_*.sh for linux and build_wheel.sh for mac. There are several things wrong with this
* It’s confusing. Most of those scripts deal with python specifics.
* The extra conditionals everywhere severely complicate the wheel build scripts
* The process for building libtorch is different from the official instructions (a plain call to cmake, or a call to a script)
### Note on docker images / Dockerfiles
All linux builds occur in docker images. The docker images are
* soumith/conda-cuda
* Has ALL CUDA versions installed. The script pytorch/builder/conda/switch_cuda_version.sh sets /usr/local/cuda to a symlink to e.g. /usr/local/cuda-10.0 to enable different CUDA builds
* Also used for cpu builds
* soumith/manylinux-cuda90
* soumith/manylinux-cuda92
* soumith/manylinux-cuda100
* Also used for cpu builds
The Dockerfiles are available in pytorch/builder, but there is no circleci job or script to build these docker images, and they cannot be run locally (unless you have the correct local packages/paths). Only Soumith can build them right now.
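The /usr/local/cuda switch that switch_cuda_version.sh performs boils down to re-pointing one symlink. A safe sketch of the mechanism, using throwaway paths under /tmp instead of /usr/local:

```shell
# Simulate switching the "active" CUDA installation by re-pointing a symlink.
mkdir -p /tmp/fake_usr_local/cuda-9.0 /tmp/fake_usr_local/cuda-10.0
ln -sfn /tmp/fake_usr_local/cuda-9.0  /tmp/fake_usr_local/cuda
ln -sfn /tmp/fake_usr_local/cuda-10.0 /tmp/fake_usr_local/cuda  # switch builds
readlink /tmp/fake_usr_local/cuda   # now points at cuda-10.0
```

`ln -sfn` replaces the existing symlink atomically enough for this purpose, which is why a single image can serve all CUDA builds.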
### General Python
* This is still a good explanation of python installations https://caffe2.ai/docs/faq.html#why-do-i-get-import-errors-in-python-when-i-try-to-use-caffe2
# How to manually rebuild the binaries
tldr; make a PR that looks like https://github.com/pytorch/pytorch/pull/21159
Sometimes we want to push a change to master and then rebuild all of today's binaries after that change. As of May 30, 2019 there isn't a way to manually run a workflow in the UI. You can manually re-run a workflow, but it will use the exact same git commits as the first run and will not include any changes. So we have to make a PR and then force circleci to run the binary workflow instead of the normal tests. The above PR is an example of how to do this; essentially you copy-paste the binarybuilds workflow steps into the default workflow steps. If you need to point the builder repo to a different commit then you'd need to change https://github.com/pytorch/pytorch/blob/master/.circleci/scripts/binary_checkout.sh#L42-L45 to checkout what you want.
## How to test changes to the binaries via .circleci
Writing PRs that test the binaries is annoying, since the default circleci jobs that run on PRs are not the jobs that you want to run. Likely, changes to the binaries will touch something under .circleci/ and require that .circleci/config.yml be regenerated (.circleci/config.yml controls all .circleci behavior, and is generated using `.circleci/regenerate.sh` in python 3.7). But you also need to manually hardcode the binary jobs that you want to test into the .circleci/config.yml workflow, so you should actually make at least two commits, one for your changes and one to temporarily hardcode jobs. See https://github.com/pytorch/pytorch/pull/22928 as an example of how to do this.
# Update the PR, need to force since the commits are different now
git push origin my_branch --force
```
The advantage of this flow is that you can make new changes to the base commit and regenerate the .circleci without having to re-write which binary jobs you want to test on. The downside is that all updates will be force pushes.
## How to build a binary locally
### Linux
You can build Linux binaries locally easily using docker.
```
# Run the docker
# Use the correct docker image, soumith/conda-cuda used here as an example
#
# -v path/to/foo:path/to/bar makes path/to/foo on your local machine (the
# machine that you're running the command on) accessible to the docker
# container at path/to/bar. So if you then run `touch path/to/bar/baz`
# in the docker container then you will see path/to/foo/baz on your local
# machine. You could also clone the pytorch and builder repos in the docker.
#
# If you're building a CUDA binary then use `nvidia-docker run` instead, see below.
#
# If you know how, add ccache as a volume too and speed up everything
# Export whatever variables are important to you. All variables that you'd
# possibly need are in .circleci/scripts/binary_populate_env.sh
# You should probably always export at least these 3 variables
export PACKAGE_TYPE=conda
export DESIRED_PYTHON=3.6
export DESIRED_CUDA=cpu
# Call the entrypoint
# `|& tee foo.log` just copies all stdout and stderr output to foo.log
# The builds generate lots of output so you probably need this when
# building locally.
/builder/conda/build_pytorch.sh |& tee build_output.log
```
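The comments in the block above describe the `docker run` invocation without showing one. A dry-run sketch (the command is only assembled and echoed, never executed; the host path is an example):

```shell
# Assemble the docker run command described above; echo it instead of running it.
DOCKER_IMAGE=soumith/conda-cuda   # use nvidia-docker run for CUDA builds, see below
HOST_CLONES=$HOME/clones          # example host dir containing pytorch/ and builder/
cmd="docker run -it -v $HOST_CLONES/pytorch:/pytorch -v $HOST_CLONES/builder:/builder $DOCKER_IMAGE bash"
echo "$cmd"
```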
**Building CUDA binaries on docker**
To build a CUDA binary you need to use `nvidia-docker run` instead of just `docker run` (or you can manually pass `--runtime=nvidia`). This adds some needed libraries and things to build CUDA stuff.
You can build CUDA binaries on CPU only machines, but you can only run CUDA binaries on CUDA machines. This means that you can build a CUDA binary on a docker on your laptop if you so choose (though it’s gonna take a loong time).
For Facebook employees, ask about beefy machines that have docker support and use those instead of your laptop; it will be 5x as fast.
### MacOS
There’s no easy way to generate reproducible hermetic MacOS environments. If you have a Mac laptop then you can try emulating the .circleci environments as much as possible, but you probably have packages in /usr/local/, possibly installed by brew, that will probably interfere with the build. If you’re trying to repro an error on a Mac build in .circleci and you can’t seem to repro locally, then my best advice is actually to iterate on .circleci :/
But if you want to try, then I’d recommend
```
# Create a new terminal
# Clear your LD_LIBRARY_PATH and trim as much out of your PATH as you
# know how to do
# Install a new miniconda
# First remove any other python or conda installation from your PATH
# Always install miniconda 3, even if building for Python <3
# All MacOS builds use conda to manage the python env and dependencies
# that are built with, even the pip packages
conda create -yn binary python=2.7
conda activate binary
# Export whatever variables are important to you. All variables that you'd
# possibly need are in .circleci/scripts/binary_populate_env.sh
# You should probably always export at least these 3 variables
export PACKAGE_TYPE=conda
export DESIRED_PYTHON=3.6
export DESIRED_CUDA=cpu
# Call the entrypoint you want
path/to/builder/wheel/build_wheel.sh
```
N.B. installing a brand new miniconda is important. This has to do with how conda installations work. See the “General Python” section above, but tldr; is that
1. You make the ‘conda’ command accessible by prepending `path/to/conda_root/bin` to your PATH.
2. You make a new env and activate it, which then also gets prepended to your PATH. Now you have `path/to/conda_root/envs/new_env/bin:path/to/conda_root/bin:$PATH`
3. Now say you (or some code that you ran) call python executable `foo`
1. if you installed `foo` in `new_env`, then `path/to/conda_root/envs/new_env/bin/foo` will get called, as expected.
2. But if you forgot to install `foo` in `new_env` but happened to previously install it in your root conda env (called ‘base’), then unix/linux will still find `path/to/conda_root/bin/foo` . This is dangerous, since `foo` can be a different version than you want; `foo` can even be for an incompatible python version!
Newer conda versions and proper python hygiene can prevent this, but just install a new miniconda to be safe.
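The shadowing described in step 3 is easy to reproduce without conda at all; this sketch fakes the two bin directories with plain shell scripts:

```shell
# Fake a conda root and an env, each (initially) providing its own `foo`.
mkdir -p /tmp/conda_root/bin /tmp/conda_root/envs/new_env/bin
printf '#!/bin/sh\necho root-foo\n' > /tmp/conda_root/bin/foo
printf '#!/bin/sh\necho env-foo\n'  > /tmp/conda_root/envs/new_env/bin/foo
chmod +x /tmp/conda_root/bin/foo /tmp/conda_root/envs/new_env/bin/foo

PATH="/tmp/conda_root/bin:$PATH"               # step 1: root's bin on PATH
PATH="/tmp/conda_root/envs/new_env/bin:$PATH"  # step 2: env's bin prepended
foo   # the env's foo wins: prints env-foo

# The dangerous case: foo is missing from the env, so lookup silently
# falls through to the root install.
rm /tmp/conda_root/envs/new_env/bin/foo
hash -r   # clear the shell's command-location cache
foo       # prints root-foo
```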
z-value >= 3, there is a high chance of perf regression.
To reproduce this regression, run
`cd .jenkins/pytorch/perf_test/ && bash {}.sh` on your local machine
and compare the runtime before/after your code change.
'''.format(test_name))
else:
    print("z-value < 3, no perf regression detected.")
if ! python perf-tests/modules/test_cpu_torch.py ${ARGS}; then
    echo "To reproduce this regression, run \`cd .jenkins/pytorch/perf_test/ && bash ${FUNCNAME[0]}.sh\` on your local machine and compare the runtime before/after your code change."
if ! python perf-tests/modules/test_cpu_torch_tensor.py ${ARGS}; then
    echo "To reproduce this regression, run \`cd .jenkins/pytorch/perf_test/ && bash ${FUNCNAME[0]}.sh\` on your local machine and compare the runtime before/after your code change."
echo NOTE: To run `import torch`, please make sure to activate the conda environment by running `call %CONDA_PARENT_DIR%\Miniconda3\Scripts\activate.bat %CONDA_PARENT_DIR%\Miniconda3` in Command Prompt before running Git Bash.
) else (
7z a %TMP_DIR_WIN%\%IMAGE_COMMIT_TAG%.7z %CONDA_PARENT_DIR%\Miniconda3\Lib\site-packages\torch %CONDA_PARENT_DIR%\Miniconda3\Lib\site-packages\caffe2 && python %SCRIPT_HELPERS_DIR%\upload_image.py %TMP_DIR_WIN%\%IMAGE_COMMIT_TAG%.7z