pytorch

mirror of https://github.com/pytorch/pytorch.git synced 2025-10-20 21:14:14 +08:00

Author	SHA1	Message	Date
Ilia Cherniavskii	c3d05e86cc	Resend "Split ATen/Parallel into interface and backend" (#20825 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20825 ghimport-source-id: 0371fbd37cb37635647d473d5ac9f2859e787061 Differential Revision: D15458073 Pulled By: ilia-cher fbshipit-source-id: cd27d0da1691f6be1183cd152348ac0d93a53996	2019-05-24 02:03:06 -07:00
Will Feng	8cde4c4d22	Remove Variable::Impl and DifferentiableViewImpl (#17072) Summary: As part of the Variable/Tensor merge work: https://github.com/pytorch/pytorch/issues/13638, we make the following changes in this PR: 1. Remove the `Variable::Impl` class and the `DifferentiableViewImpl` class 2. Change all `Variable.data()` call sites to either use `Variable` directly, or use `Variable.tensor_data()` 3. Remove `Variable.data()` API 4. Add `Variable.variable_data()` that matches `tensor.data` in Python API, which creates a new `Variable` that shares the same storage and tensor metadata with the original `Variable`, but with a completely new autograd history. After this PR, Variable doesn't wrap a Tensor internally anymore, and both Variable and Tensor use the same TensorImpl class as their `impl_`. The only difference is that Variable always has AutogradMeta in its TensorImpl, but Tensor doesn't. Note that this PR is BC-breaking in the following use cases: Use Case 1: Previously, `x.data = y` worked even if `x` and `y` were of different TensorImpl types (e.g. `x` is a CPU dense tensor whose impl is of type TensorImpl, while `y` is a CPU sparse tensor whose impl is of type SparseTensorImpl). However, after this PR, `x.data = y` doesn't work anymore if `x` and `y` are of different TensorImpl types, because the underlying implementation `variable.set_data(tensor)` no longer works if `variable` and `tensor` have different TensorImpl types. Use Case 2: If a tensor `x`'s `grad` is sparse, accumulating dense gradients to `x` will change the tensor that `x.grad` is pointing to.
This is better illustrated with the following example:
```python
import torch

params = torch.tensor([1.5, 1.5]).requires_grad_()
with torch.no_grad():
    # Change gradient to a sparse tensor
    params.grad = torch.sparse_coo_tensor(torch.tensor([[1, 1]]).long(), torch.tensor([1., 1.]))

grad_saved = params.grad
params.backward(torch.tensor([1.5, 1.5]))
assert id(grad_saved) == id(params.grad)  # This will fail after this PR
```
The assertion in the last line will fail after this PR, because adding dense gradients to sparse gradients will change the `params.grad` tensor reference. Pull Request resolved: https://github.com/pytorch/pytorch/pull/17072 Differential Revision: D14075257 Pulled By: yf225 fbshipit-source-id: 0e681df641270dea586042dd26db59f2e76b5957	2019-05-23 21:09:04 -07:00
Syed Tousif Ahmed	b6d0f6c85a	Move THCTensor_{random, clampedRandom, cappedRandom} to ATen (#20620 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20620 ghimport-source-id: 7c09c2462021e3fa5adef61570a575964ff16125 Differential Revision: D15454050 Pulled By: ezyang fbshipit-source-id: 5b0421c56445baf19dbdbdd9680af128a5cdf443	2019-05-23 13:44:16 -07:00
Edward Yang	fd95947e68	Revert D15248618: Split ATen/Parallel into interface and backend Differential Revision: D15248618 Original commit changeset: 060879266bc8 fbshipit-source-id: fc5cbb030b87613c9e15100118c3d4a064097c20	2019-05-22 09:55:51 -07:00
Ilia Cherniavskii	c4a3b4d528	Split ATen/Parallel into interface and backend (#20057 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20057 ghimport-source-id: c583f61bf661c994eb4d0625748a299e892a7246 Differential Revision: D15248618 Pulled By: ilia-cher fbshipit-source-id: 060879266bc8616916fe220adef6ae6c0b076fbd	2019-05-21 19:15:47 -07:00
peter	bb20956e3c	Add support for CMake switches for VS 2019 (#20752 ) Summary: Appending `arch` to the generator name is not supported for VS starting from VS 2019. Pull Request resolved: https://github.com/pytorch/pytorch/pull/20752 Differential Revision: D15436740 Pulled By: ezyang fbshipit-source-id: 20057aae8f708d82619927bf2cb87dd1bc2df312	2019-05-21 13:46:39 -07:00
Xiaodong Wang	f3d827f311	Hipify fb/quantize Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20725 Reviewed By: bddppq Differential Revision: D15407710 fbshipit-source-id: e5fdeee7e2dffd43cfdd6fab6193eb8a80902c02	2019-05-21 10:51:36 -07:00
Xiaodong Wang	b5edeca39d	Split cpu/gpu in caffe2/distributed + some clean up (#20674) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20674 A few targets in caffe2/caffe2/distribute need to be split too; otherwise they won't compile. Also some cleanups, and change select_gpu_type to gpu_library_selector. Differential Revision: D15406019 fbshipit-source-id: 6455ab885b248502b48d4c7565597e00fecfd547	2019-05-21 10:51:33 -07:00
Karl Ostmo	0bfc0eeef7	restore hidden visibility by default for Linux builds (#20461 ) Summary: Symbols are given hidden visibility by default on Linux to emulate the behavior on Windows. This helps developers catch visibility issues in their streamlined Linux dev environment before being surprised, late in the process, by Windows errors. Pull Request resolved: https://github.com/pytorch/pytorch/pull/20461 Reviewed By: kostmo Differential Revision: D15410410 Pulled By: dzhulgakov fbshipit-source-id: 1d684b5a9a80b692966a775c3f1c56b7c72ffc95	2019-05-20 16:49:37 -07:00
Nikolay Korovaiko	f215db9b92	InsertGuards pass Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20438 Differential Revision: D15342655 Pulled By: Krovatkin fbshipit-source-id: a193e582d621b99f848573fb4478e7b62265dc9f	2019-05-20 10:49:19 -07:00
Levent Ertoz	5f14ef8cc1	Split out gpu/cpu targets based on gpu_library_targets (#20633 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20633 Merge the c2_gpu and is_amd_build logic in targets files. Reviewed By: dzhulgakov Differential Revision: D15176621 fbshipit-source-id: 9185b394ffcb305fd8d94dc7c7c92780bf10a511	2019-05-17 13:07:10 -07:00
Ilia Cherniavskii	409200df59	Move inter-op settings into ATen/Parallel (#20050 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20050 ghimport-source-id: cc102bab8abf3e56c099245976786317ed63ea14 Differential Revision: D15248576 Pulled By: ilia-cher fbshipit-source-id: 55ddcb7af387ddfc68a42ac7167de07ea648e249	2019-05-17 03:12:02 -07:00
Jesse Hellemn	5821a76b8e	Forcing gcc ABI and safer bash scripts, v2 (#20540) Summary: The first time this was merged it broke master and was reverted. This time I do not add ```set -u``` to the .circleci/scripts/setup* scripts. There's still a chance that ```set -u``` breaks the binary builds on master, but at least those can be fixed in parallel and don't completely eliminate signal from all merges. Pull Request resolved: https://github.com/pytorch/pytorch/pull/20540 Differential Revision: D15373444 Pulled By: pjh5 fbshipit-source-id: 0203c20865827366ecd8fa07b2db74d255549ed1	2019-05-16 09:40:01 -07:00
Vitaly Fedyunin	5b78a5eadb	Memory format support for contiguous and is_contiguous (#20455) Summary: #19975 was split into 2 PRs. This one: Introduce a MemoryFormat argument to the `x.is_contiguous(memory_format=torch.channels_last)` and `y = x.contiguous(memory_format=torch.channels_last)` functions. At this moment both functions just operate on strides and don't store any tensor state. (Original RFC #19092) ----- Expands functionality of two tensor functions `.is_contiguous` and `.contiguous` (both python and c++ api). Note: We had several complaints about the `.to(memory_format)` function, and decided not to support it. 1. `.contiguous` now supports an optional keyword-only argument - `memory_format`, which can be either `torch.contiguous_format` or `torch.channels_last`. - Using `torch.contiguous_format` will preserve existing `.contiguous()` behavior. - Calling `x.contiguous(memory_format=torch.channels_last)` returns a new tensor which maintains the same semantic layout (NCHW), but has a different memory allocation pattern. `x.contiguous(memory_format=torch.channels_last)` expects the input tensor to be 3d, 4d or 5d, and fails otherwise. 2. `.is_contiguous` now supports an optional keyword-only argument - `memory_format`, which can be either `torch.contiguous_format` or `torch.channels_last`. - `x.is_contiguous(memory_format=torch.contiguous_format)` preserves the same functionality as `x.is_contiguous()` and remains unchanged. - `x.is_contiguous(memory_format=torch.channels_last)` returns true if A) the input tensor is contiguous in memory AND B) it is allocated in memory in NHWC (or similar for 3d, 5d) format. Note: By the end of phase one, `x.is_contiguous(memory_format=torch.channels_last)` will calculate the state of the Tensor on every call. This functionality is going to be updated later.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20455 Differential Revision: D15341577 Pulled By: VitalyFedyunin fbshipit-source-id: bbb6b4159a8a49149110ad321109a3742383185d	2019-05-16 07:18:24 -07:00
Jerry Zhang	abb3698976	Add QInt32 ScalarType and qint32 data type (#19816) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19816 We need this for quantization for bias. Add a third argument of ScalarType to `quantize_linear`. Differential Revision: D15094174 fbshipit-source-id: f19ec8f4716cf5fe0aa21b38d45af6d27c9ab377	2019-05-15 18:50:18 -07:00
Igor Fedan	4c23c34e79	Computing var/stddev and mean at the same time (#18731 ) Summary: The current variance kernels compute mean at the same time. Many times we want both statistics together, so it seems reasonable to have a kwarg/function that allows us to get both values without launching an extra kernel. Pull Request resolved: https://github.com/pytorch/pytorch/pull/18731 Differential Revision: D14726082 Pulled By: ifedan fbshipit-source-id: 473cba0227b69eb2240dca5e61a8f4366df0e029	2019-05-15 16:42:38 -07:00
Edward Yang	73a97387c1	Replace AT_CHECK with TORCH_CHECK [shard 9/10] Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20435 Reviewed By: jerryzh168 Differential Revision: D15318877 fbshipit-source-id: 4d83571187ea14a604fef83ac355d328b46d93e1	2019-05-15 08:05:59 -07:00
Bram Wasti	8e26759f14	Back out "[pytorch][PR] Manually set _GLIBCXX_USE_CXX11_ABI in devtoolset7 binary builds" Summary: Original commit changeset: 571bba8a93ea Reviewed By: pjh5 Differential Revision: D15349783 fbshipit-source-id: 75c3e2b9b97e0ac0e8bcdef93e53b0d475c6fa38	2019-05-15 00:02:55 -07:00
Jesse Hellemn	ea38fbfc5c	Manually set _GLIBCXX_USE_CXX11_ABI in devtoolset7 binary builds (#20243 ) Summary: Fix for https://github.com/pytorch/pytorch/issues/17492 Pull Request resolved: https://github.com/pytorch/pytorch/pull/20243 Differential Revision: D15348101 Pulled By: pjh5 fbshipit-source-id: 571bba8a93eaa9806db3f3d38697c26b5285da7a	2019-05-14 18:02:42 -07:00
Zachary DeVito	9610f150d7	stop build spew on development (#20508 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20508 ghimport-source-id: 26a16e2918fb93058c7740afb85070e0d29b4d1b Differential Revision: D15343207 Pulled By: zdevito fbshipit-source-id: b6d8858024cc440d59cf88d69e0fbc0e67dc85ce	2019-05-14 15:30:52 -07:00
peter	30bdb8c0d7	Hotfix for caffe2 windows build (#20417) Summary: We don't need to overlay vc env when not using ninja. CMake will deal with it automatically. Overlaying is a no-op when the env is the same as the generator specified, but will generate the error "Cannot find CMAKE_CXX_COMPILER" when they are different. Pull Request resolved: https://github.com/pytorch/pytorch/pull/20417 Differential Revision: D15317081 Pulled By: ezyang fbshipit-source-id: 5d9100321ecd593e810c31158f22c67d3e34973b	2019-05-13 08:03:45 -07:00
Nikolay Korovaiko	9499c7b7ee	Profiling GraphExecutor Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19994 Differential Revision: D15307752 Pulled By: Krovatkin fbshipit-source-id: 7b35191042199ef16823487e15fe639968cbdc89	2019-05-10 23:05:47 -07:00
Karl Ostmo	4ba28deb6e	Unify libtorch and libcaffe2 (#17783 ) Summary: This PR is an intermediate step toward the ultimate goal of eliminating "caffe2" in favor of "torch". This PR moves all of the files that had constituted "libtorch.so" into the "libcaffe2.so" library, and wraps "libcaffe2.so" with a shell library named "libtorch.so". This means that, for now, `caffe2/CMakeLists.txt` becomes a lot bigger, and `torch/CMakeLists.txt` becomes smaller. The torch Python bindings (`torch_python.so`) still remain in `torch/CMakeLists.txt`. The follow-up to this PR will rename references to `caffe2` to `torch`, and flatten the shell into one library. Pull Request resolved: https://github.com/pytorch/pytorch/pull/17783 Differential Revision: D15284178 Pulled By: kostmo fbshipit-source-id: a08387d735ae20652527ced4e69fd75b8ff88b05	2019-05-10 09:50:53 -07:00
peter	8726b27333	Fix overlay_vc_env when called by legacy python (#20304 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/20155. Pull Request resolved: https://github.com/pytorch/pytorch/pull/20304 Differential Revision: D15292369 Pulled By: zdevito fbshipit-source-id: 7da2e0cb85c98d0fcd4461d39e2a8c57391db60e	2019-05-10 06:44:58 -07:00
Pieter Noordhuis	caa0d0c50a	Add c10d::broadcast_coalesced and tests (#20234) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20234 The differences from the existing function _dist_broadcast_coalesced are that this one works for both CPU and CUDA tensors and that it has a maximum number of in-flight operations. This should be the final change needed to have only a single version of DistributedDataParallel that supports both CPU and CUDA models, or even a mix of both. See #17757 for more information. Reviewed By: mrshenli Differential Revision: D15228099 fbshipit-source-id: a2113ba6b09b68cb5328f49f4c1960031eb43c93	2019-05-09 14:11:08 -07:00
Richard Zou	e01a5bf28b	Add USE_NAMEDTENSOR compilation flag. (#20162 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20162 ghimport-source-id: 0efcd67f04aa087e1dd5faeee550daa2f13ef1a5 Reviewed By: gchanan Differential Revision: D15278211 Pulled By: zou3519 fbshipit-source-id: 6fee981915d83e820fe8b50a8f59da22a428a9bf	2019-05-09 09:09:16 -07:00
Wanchao Liang	4d676d53a6	split canonicalize_ops, make a decompose pass (#19988 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19988 ghimport-source-id: 1dbf39e07099fa24ef9a6c0221312bf01a8011b7 Differential Revision: D15190355 Pulled By: wanchaol fbshipit-source-id: 83f2b6557efd758810ccb4a4229d71fdebfd06e0	2019-05-08 17:21:59 -07:00
Mikhail Zolotukhin	8a6072c3bd	SubgraphRewriter: Rename pattern fusion to subgraph rewrite. (#20082 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20082 ghimport-source-id: f0594f4ad918288fb3158b4ecfa8010cf09dd0c2 Differential Revision: D15190193 Pulled By: ZolotukhinM fbshipit-source-id: 81b026398c94f2fbf7487cafbb86b7364a78d827	2019-05-08 11:22:29 -07:00
Junjie Bai	bc5398451e	Enable ROCm multi-gpu with Gloo Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18640 Differential Revision: D15185822 Pulled By: bddppq fbshipit-source-id: 1b49ab3fb0f251cfc7ef3ddd62033ae0065a4ec3	2019-05-07 09:55:47 -07:00
Ilia Cherniavskii	481b6d0268	Allow a non-OpenMP based build (#19749 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19749 ghimport-source-id: a6636c0acddbdc5fd5b0dcb20b9f80cbdb9159b9 Differential Revision: D15141993 Pulled By: ilia-cher fbshipit-source-id: 96085608398b2a4c97c68b2948f5184d07f9ad3d	2019-05-06 19:34:48 -07:00
Mikhail Zolotukhin	722eb48ff2	Cleanup includes in torch/csrc/* (#19924 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19924 ghimport-source-id: f7248b16c8e263a7d0ba7975b1fc0b00cb2cf2c0 Differential Revision: D15125018 Pulled By: ZolotukhinM fbshipit-source-id: 322c7ca53e38ef8b43b5ac5bd747b28bc10379f1	2019-05-06 14:03:18 -07:00
davidriazati	18cb098588	Remove warnings on new_* constructors (#20026) Summary: Stack from [ghstack](https://github.com/ezyang/ghstack): * **#20026 Remove warnings on new_\* constructors** Revert of #16770, fixes #19995 Pull Request resolved: https://github.com/pytorch/pytorch/pull/20026 Pulled By: driazati Differential Revision: D15171691 fbshipit-source-id: 057c3b4a9fd6086ca240007e5404a286080f04b6	2019-05-01 16:35:36 -07:00
Mikhail Zolotukhin	360640bc9c	Extract Python-specific SugaredValues to a separate file from init.cpp. (#19986 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19986 ghimport-source-id: 67f5fec4b5b2114f2922505a7743ed27e6d7e6cc Differential Revision: D15160820 Pulled By: ZolotukhinM fbshipit-source-id: e39238db8f30a8809891bff8a2fe39977124f6ca	2019-04-30 19:38:23 -07:00
iurii zdebskyi	aa6403bae6	Added .bool() method Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19928 Differential Revision: D15131923 Pulled By: izdeby fbshipit-source-id: 3909cf4623fe85e98ceaf57fbb57745919899445	2019-04-30 10:34:31 -07:00
Mikhail Zolotukhin	2a95cf6345	Add a pattern-based fusion pass. (#19596 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19596 ghimport-source-id: 1d7af5877dbeffa826201812649a9009c06c6305 Differential Revision: D15042033 Pulled By: ZolotukhinM fbshipit-source-id: e3178d9aec2ac63fc3779ddedbd967aae0401c76	2019-04-29 19:17:31 -07:00
iurii zdebskyi	de19eeee99	Enabled masked for a bool tensor (#19140 ) Summary: Added deprecation warnings for the masked methods and enabled them for a bool tensor. Pull Request resolved: https://github.com/pytorch/pytorch/pull/19140 Differential Revision: D14888021 Pulled By: izdeby fbshipit-source-id: 0e42daf8f3732ca29f36d10485402bfc502716ad	2019-04-29 10:40:12 -07:00
Xiaodong Wang	9d0b5a1ce9	Build caffe2/fb/operators (#19688 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19688 Minor changes to hipify script to take extra folders. Reviewed By: bddppq Differential Revision: D15068427 fbshipit-source-id: e2e792c8227cbd0e15fd2564f87d740a62c477da	2019-04-29 09:01:10 -07:00
Michael Suo	a25b79531c	use fully qualified name for ScriptClasses (#19239 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19239 ghimport-source-id: 830aad6dc11d2a7247760a9c7c9fc8556f70a706 Differential Revision: D14928293 Reviewed By: eellison Pulled By: suo fbshipit-source-id: d2efa5d7f7397526083278d6650b9cee8d967b1a	2019-04-26 19:17:21 -07:00
Junjie Bai	c9f380df02	Add aten mkldnn linear operator Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19210 Reviewed By: dzhulgakov Differential Revision: D14901641 fbshipit-source-id: 8fa68b9941fd93cea0f313a828cba34c5c81ae11	2019-04-26 13:41:57 -07:00
Karl Ostmo	8f0603b128	C++ changes toward libtorch and libcaffe2 unification (#19554 ) Summary: * adds TORCH_API and AT_CUDA_API in places * refactor code generation Python logic to separate caffe2/torch outputs * fix hip and asan * remove profiler_cuda from hip * fix gcc warnings for enums * Fix PythonOp::Kind Pull Request resolved: https://github.com/pytorch/pytorch/pull/19554 Differential Revision: D15082727 Pulled By: kostmo fbshipit-source-id: 83a8a99717f025ab44b29608848928d76b3147a4	2019-04-26 01:38:10 -07:00
Thomas Viehmann	556c8a300b	Fall back to asking nvcc for detecting cuda version if no cudart is found (#19741) Summary: This happens on Debian/Ubuntu with distribution-provided cuda repackaging. Pull Request resolved: https://github.com/pytorch/pytorch/pull/19741 Differential Revision: D15082550 Pulled By: soumith fbshipit-source-id: 2ca39c6cdc9305896529b6fd537270116223cd6c	2019-04-25 10:54:20 -07:00
Roy Li	a6811e17c0	Restore copy_ overload with async arg (#19641 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19641 ghimport-source-id: 7099221334505bacdc209cff8bf29e3004c30379 Differential Revision: D15056755 Pulled By: li-roy fbshipit-source-id: e9063b606e72a70fc1270fbcdcf1c0b23d876dd3	2019-04-24 17:51:50 -07:00
Vitaly Fedyunin	d14abe3aff	Add torch.from_file function similar to the Storage.from_file, but returning tensor (#18688 ) Summary: Porting `torch.Storage.from_file(filename, shared, size)` function to `torch.from_file(filename, shared, size, dtype=torch.int)` Pull Request resolved: https://github.com/pytorch/pytorch/pull/18688 Differential Revision: D15012644 Pulled By: VitalyFedyunin fbshipit-source-id: 3f62ca9e414fad3847fe71b785ff97b5bdc2d2cd	2019-04-24 15:38:56 -07:00
Dmytro Dzhulgakov	d247912dbf	Add no-gpu build mode for all of PyTorch and Caffe2 Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19687 Differential Revision: D15023347 fbshipit-source-id: 5bed0d72e8ff337e066c142ca5c8e2c2bae93746	2019-04-24 13:27:59 -07:00
Dmytro Dzhulgakov	8b798f43e3	Commit explicit libtorch_python sources (#19607) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19607 Explicit is better than implicit - it's pretty hard to debug where a particular file is if it's not greppable. As a follow-up step, we should look at whether we can just include build_variables.py in CMake directly to share the setup of the two build systems. Reviewed By: ezyang Differential Revision: D15023348 fbshipit-source-id: 600ef2d1871bc28530c6a02681b284f7499904df	2019-04-23 19:49:42 -07:00
James Reed	80020b3d2d	Guard {set,rebase}_history on grad_fn check (#19623) Summary: We would previously have statements like ``` set_history(flatten_tensor_args( result ), grad_fn); ``` Internally, {set,rebase}_history would check grad_fn and short circuit if it is nullptr. However, this means that we are executing the expression `flatten_tensor_args( result )` and immediately throwing away the results. This was causing unnecessary allocations + overhead. My JIT overhead benchmark script (with custom benchmark method):
```
import torch, time

@torch.jit.script
def add(x, y):
    return x + y

a = torch.rand([])
b = torch.rand([])

niter = 1000000

with torch.no_grad():
    s = time.time()
    # `benchmark` is the custom benchmark method mentioned above
    add.__getattr__('forward').benchmark(niter, a, b)
    e = time.time() - s
    print('overhead per call (us)', e / niter * 1e6)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19623 Differential Revision: D15053399 Pulled By: jamesr66a fbshipit-source-id: 8777e1a2b5c5a5bbd3a035b7247c8154c5fc4aa6	2019-04-23 15:40:11 -07:00
Wanchao Liang	e9c8f372c4	dispatch max_pools with no indices, expose max_pools to torch namespace (#19449) Summary: In functional interfaces we do boolean dispatch, but always to max_pool\d_with_indices. This changes it to emit the max_pool\d op instead when it's not necessary to expose with_indices ops to different backends (for jit). It also binds max_pool\d to the torch namespace, which is the same behavior as avg_pool\d. Pull Request resolved: https://github.com/pytorch/pytorch/pull/19449 Differential Revision: D15016839 Pulled By: wanchaol fbshipit-source-id: f77cd5f0bcd6d8534c1296d89b061023a8288a2c	2019-04-23 11:20:05 -07:00
vishwakftw	c30224ad21	Rename potri to cholesky_inverse (#19498 ) Summary: Changelog: - Rename `potri` to `cholesky_inverse` to remain consistent with names of `cholesky` methods (`cholesky`, `cholesky_solve`) - Fix all callsites - Rename all tests - Create a tentative alias for `cholesky_inverse` under the name `potri` and add a deprecation warning to not promote usage Pull Request resolved: https://github.com/pytorch/pytorch/pull/19498 Differential Revision: D15029901 Pulled By: ezyang fbshipit-source-id: 2074286dc93d8744cdc9a45d54644fe57df3a57a	2019-04-22 08:18:39 -07:00
Roy Li	689dd800ed	Generate only one Type class per backend (#19295 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19295 ghimport-source-id: 9345110f91f044a449804ddd5116cc9179444a00 Differential Revision: D14948581 Pulled By: li-roy fbshipit-source-id: a317b03d58d621e8df162918038f7543bfb13ba2	2019-04-21 21:16:14 -07:00
Roy Li	ab78449e8c	Add ScalarType argument to Type::options() (#19270 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19270 ghimport-source-id: a5ade6131f3260066c5750ea1fa9ed5c998bb791 Differential Revision: D14938707 Pulled By: li-roy fbshipit-source-id: 018fb3f01706531a06515d6d861e5683a455a705	2019-04-21 21:16:07 -07:00
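
Commit 5b78a5eadb in the list above says that `.contiguous` and `.is_contiguous` with a `memory_format` argument "just operate" on strides. The following is a minimal pure-Python sketch of that idea for the 4d case, not the PyTorch implementation. The helper names `contiguous_strides`, `channels_last_strides` and `is_contiguous` are hypothetical, and the exact-match comparison is a simplifying assumption (it does not special-case dimensions of size 1).

```python
def contiguous_strides(shape):
    """Row-major strides (in elements) for a dense tensor of the given shape."""
    strides = [1] * len(shape)
    for i in range(len(shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]
    return tuple(strides)


def channels_last_strides(shape):
    """Strides for a logical (N, C, H, W) shape whose memory is ordered as NHWC."""
    n, c, h, w = shape  # hypothetical helper: 4d shapes only
    # Stepping along C moves 1 element, along W moves C, along H moves W*C.
    return (h * w * c, 1, w * c, c)


def is_contiguous(shape, strides, memory_format="contiguous_format"):
    """Compare the given strides with the ones the requested format implies."""
    if memory_format == "channels_last":
        expected = channels_last_strides(shape)
    else:
        expected = contiguous_strides(shape)
    return tuple(strides) == expected


shape = (2, 3, 4, 5)  # N, C, H, W
print(contiguous_strides(shape))     # (60, 20, 5, 1)
print(channels_last_strides(shape))  # (60, 1, 15, 3)
```

In this sketch the logical shape stays NCHW in both cases; only the strides differ, which is why a strides-only check is enough to tell the two layouts apart.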


1256 Commits