Summary:
#19975 was split into 2 PRs.
This one:
Introduces a MemoryFormat argument to the `x.is_contiguous(memory_format=torch.channels_last)` and `y = x.contiguous(memory_format=torch.channels_last)` functions.
At this moment both functions only operate on strides and do not store any tensor state.
(Original RFC #19092)
-----
Expands the functionality of two tensor functions, `.is_contiguous` and `.contiguous` (both the Python and C++ APIs).
Note: We had several complaints about `.to(memory_format)` function, and decided not to support it.
1. `.contiguous` now supports an optional keyword-only argument, `memory_format`, which can be either `torch.contiguous_format` or `torch.channels_last`.
- Using `torch.contiguous_format` preserves the existing `.contiguous()` behavior.
- Calling `x.contiguous(memory_format=torch.channels_last)` returns a new tensor which maintains the same semantic layout (NCHW) but has a different memory allocation pattern.
`x.contiguous(memory_format=torch.channels_last)` expects the input tensor to be 3d, 4d, or 5d, and fails otherwise.
2. `.is_contiguous` now supports an optional keyword-only argument, `memory_format`, which can be either `torch.contiguous_format` or `torch.channels_last`.
- `x.is_contiguous(memory_format=torch.contiguous_format)` preserves the same functionality as `x.is_contiguous()` and remains unchanged.
- `x.is_contiguous(memory_format=torch.channels_last)` returns true if A) the input tensor is contiguous in memory AND B) it is allocated in memory in NHWC (or the analogous format for 3d/5d).
Note: until the end of phase one, `x.is_contiguous(memory_format=torch.channels_last)` will compute the state of the Tensor on every call. This functionality will be updated later.
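A minimal usage sketch of the new arguments (my own illustration, not code from the PR), assuming a 4d NCHW input:
```
import torch

# A 4d NCHW tensor in the default (contiguous) layout
x = torch.randn(2, 3, 4, 4)
assert x.is_contiguous()
assert x.is_contiguous(memory_format=torch.contiguous_format)
assert not x.is_contiguous(memory_format=torch.channels_last)

# Re-allocate the same logical NCHW tensor with channels_last strides
y = x.contiguous(memory_format=torch.channels_last)
assert y.shape == x.shape                                  # semantic layout unchanged
assert y.is_contiguous(memory_format=torch.channels_last)  # memory pattern changed
```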
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20455
Differential Revision: D15341577
Pulled By: VitalyFedyunin
fbshipit-source-id: bbb6b4159a8a49149110ad321109a3742383185d
Summary:
Fixes: #19253
Fixing pow(Tensor, float) is straightforward.
The breakage for pow(float, Tensor) is a bit more subtle to trigger, and the fix needs `torch.log` (`math.log` didn't work) from the newly merged #19115 (thanks ngimel for pointing out that this has landed).
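For reference, a small sketch (mine, not from the PR) of the two call patterns involved; the derivative note in the comment is standard calculus, not a quote from the fix:
```
import torch

t = torch.rand(4, requires_grad=True)

# pow(Tensor, float): the straightforward case
y1 = t.pow(2.5)

# pow(float, Tensor): the subtler case; its derivative d/dx a**x = a**x * ln(a)
# is where torch.log comes in
y2 = torch.pow(2.5, t)

(y1.sum() + y2.sum()).backward()
```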
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19324
Differential Revision: D15003531
Pulled By: ailzhang
fbshipit-source-id: 8b22138fa27a43806b82886fb3a7b557bbb5a865
Summary:
This PR propagates our use of first-class module objects into the compiler. This creates a transitionary state where:
* compiler.cpp creates Graphs where `self` is a Module class and attributes/parameters/buffers/submodules are looked up with `prim::GetAttr`
* GraphExecutor still runs "lowered graphs" where the self object has been removed by a compiler pass `lower_first_class_method`.
* Tracing still creates "lowered graphs", and a pass "lift_lowered_method" creates a first-class method graph for things.
* This PR separates out Method and Function. A script::Function is a pure Graph with no `self` bound. Similar to Python, a script::Method is just a bound `self` and its underlying `script::Function`.
* This PR also separates CompilationUnit from Module. A CompilationUnit is just a list of named script::Functions. Classes have a CompilationUnit holding the class methods, and Modules also have a CompilationUnit holding their Methods. This avoids the weird circular case Module --has a-> Class -> has a -> Module ...
Details:
* In this transitionary state, we maintain two copies of a Graph: first-class module and lowered. The first-class one has a self argument that is the module's class type. The lowered one is the lowered graph that uses the initial_ivalues inputs.
* When defining lowered methods using `_defined_lowered` we immediately create the first-class equivalent. The reverse is done lazily, creating lowered_methods on demand from the class.
* The two-way conversions will be deleted in a future PR when the executor itself runs first-class objects. However, this requires more changes to (1) the traces, (2) the python bindings, and (3) the onnx export pass, and would make this PR way too large.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19167
Differential Revision: D14891966
Pulled By: zdevito
fbshipit-source-id: 0b5f03118aa65448a15c7a7818e64089ec93d7ea
Summary:
It is done by flattening all tensor lists that are inputs/outputs to the
graph into the inputs/outputs list in the autograd graph.
This is less desirable than simply allowing IValues to exist in the
inputs/outputs of autograd::Function but it is substantially less
intrusive.
`CaptureList` is a single class describing the variables captured for backward.
`UnpackInstructs` describes how the flattened inputs to backward are re-packed into lists.
ailzhang
This PR is also part 2 of covering maskrcnn & bert AD formulas, following #16689.
Ops added in this PR:
```
cat
index
meshgrid
reshape
split
split_with_sizes
stack
unbind
```
I will also add a few perf numbers here.
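As an illustration (a sketch of my own, not from the PR; the function name and shapes are arbitrary), here is one of the listed ops exercised through TorchScript so the new formula can kick in:
```
import torch

@torch.jit.script
def cat_fn(x, y):
    # cat is one of the tensor-list ops gaining an AD formula in this PR
    return torch.cat([x, y], dim=0)

a = torch.randn(2, 3, requires_grad=True)
b = torch.randn(2, 3, requires_grad=True)
cat_fn(a, b).sum().backward()
```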
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16784
Differential Revision: D14104063
Pulled By: ailzhang
fbshipit-source-id: 5ceadadfd67ccaac60c5fd6740786c5354e252b9
Summary:
Move these 2 ops back to autodiff to unblock xla CI.
I will leave them for my next PR to clean up symbolic_variable.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18969
Differential Revision: D14816811
Pulled By: ailzhang
fbshipit-source-id: dd8a7e133dcad29560d3d1d25691883960117299
Summary:
As discussed with gchanan we should deduplicate symbolic_variable and symbolic_script to prepare for the future merge with derivatives.yaml.
This PR moves most easy formulas to symbolic_script.
TODO: run benchmarks to make sure there is no perf regression
cc: apaszke zdevito wanchaol
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17986
Differential Revision: D14766412
Pulled By: ailzhang
fbshipit-source-id: d95a3f876e256c0f505779a71587c985571d3b8f
Summary:
Fix the layernorm formula when weight and bias are passed in.
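For context, a minimal sketch (mine, not from the PR) of the affected call pattern, i.e. layer norm with an explicit weight and bias:
```
import torch
import torch.nn.functional as F

x = torch.randn(4, 8, requires_grad=True)
weight = torch.randn(8, requires_grad=True)
bias = torch.randn(8, requires_grad=True)

# The case the fix targets: layer_norm called with weight and bias
out = F.layer_norm(x, (8,), weight, bias)
out.sum().backward()
```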
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18233
Differential Revision: D14760375
Pulled By: wanchaol
fbshipit-source-id: d6bd3b137bc04c391aa5c24d021d1f811ba2a877
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18378
ghimport-source-id: 55c29bb436a2153d29ff2f4488d99d8863c187b1
Stack from [ghstack](https://github.com/ezyang/ghstack):
* #18379 Enforce single parent for script submodules
* **#18378 Unify namespace of script::Module**
* #18314 Add ability to specialize class types to ArgumentSpec
* #18226 Add Slot type to abstract the raw pointers being used for slots.
This removes individual OrderedDicts in favor of a single unified
namespace for all things in a script::Module. This removes a whole
class of bugs where both a method and a parameter could get the
same name, for instance.
Since we no longer have to expose OrderedDict::Item objects, a lot of
downstream code can be simplified.
We no longer double-store names (both in the key of the dictionary,
and in the object itself).
Differential Revision: D14603723
fbshipit-source-id: b5f7551b3074679623edd6ea70269830353b4d4c
Summary:
If a test triggers autodiff, it must have a `DifferentiableGraph` in its differentiated forward graph, and this subgraph must have either the original aten node, or the corresponding nodes used in AD formula.
Typically a forward differentiable graph looks like this:
```
graph(%i0 : Float(),
%i1 : Float()):
%3 : Float() = prim::DifferentiableGraph_0(%i0, %i1)
return (%3)
with prim::DifferentiableGraph_0 = graph(%0 : Float(),
%1 : Float()):
%2 : Float() = aten::max(%0, %1)
return (%2)
```
which tells us `aten::max(Tensor self, Tensor other) -> Tensor` is symbolically differentiable.
Update: there are a lot of cases (fusions/ConstantChunk/python implementations) that break it, so I decided to make the check optionally take node names if they differ from the function name.
~~[OLD]Theoretically I could also check if `aten::max` is in the differentiable block or not to be more precise, but there're also cases like `chunk` where in a differentiable block it's replaced with a prim node (ConstantChunk) and we will have to special case them. Any suggestions here (to be more precise or no) is very welcome!~~
We used to have a list of the nn tests that should be run against AD; I moved it to a field set when constructing our tests (either torch or nn). I think it's cleaner this way, and it matches the fact that for the same op we may support one schema but not all; this way we can just turn on the corresponding test which triggers the supported schema.
cc: apaszke zdevito wanchaol ngimel for a review
[Done]:
- Going through a manual second pass of all tests to check whether they should enable the AD test or not.
- Add a readme about how to add AD for an op and how to add/enable its test in test_jit.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18509
Differential Revision: D14696811
Pulled By: ailzhang
fbshipit-source-id: c5e693277baac585cd3aed5ab2c0e7faa5e6f29f
Summary:
Dropout is now eligible for fusion, and the generated fused kernels are just as fast as dropout in ATen. Change its lowering in symbolic script so that it can actually be fused. Still special-cased for CUDA, because without fusion this lowering is less efficient than the current one (bernoulli_ * input). Testing is covered by the test case that ailzhang added (test_dropout_cuda).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18375
Differential Revision: D14611938
Pulled By: soumith
fbshipit-source-id: 11b18f4784e6c9265e382a8f8deca7add8df3b37
Summary:
This PR did two things:
1. Enable scalar->float specialization in symbolic script, so AD formulas that contain a scalar in the schema should write `float` instead.
2. Add addcmul and lerp to AD and the fuser.
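A small sketch (not from the PR; names are my own) exercising the two newly covered ops through TorchScript:
```
import torch

@torch.jit.script
def addcmul_lerp(x, y, z):
    # addcmul and lerp gain AD formulas (and become fusable) in this PR
    return torch.addcmul(x, y, z, value=0.5).lerp(z, 0.3)

a, b, c = (torch.randn(4, 4, requires_grad=True) for _ in range(3))
addcmul_lerp(a, b, c).sum().backward()
```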
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18081
Differential Revision: D14490493
Pulled By: wanchaol
fbshipit-source-id: b3b86d960d5f051b30733bc908b19786111cdaa4
Summary:
In ATen we have a `_fused_dropout` implementation for the CUDA case. As ngimel pointed out, discarding it in JIT AD hurts performance.
It doesn't seem ideal to include a backend-specific implementation in AD, but it helps prevent a performance regression for now.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17756
Differential Revision: D14368999
Pulled By: ailzhang
fbshipit-source-id: 9a371c5020f630e8f6e496849ec9772b6f196169
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18037
The FunctionSchema can now store an overload name and the parser knows how to parse it. Specify like this:
my_func.overload1(arg1: Tensor) -> Tensor
my_func.overload2(arg1: Tensor, arg2: Tensor) -> Tensor
Reviewed By: zdevito
Differential Revision: D14467497
fbshipit-source-id: 8832b32f07351bb61090357b17b77a6a2fed3650
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17585
Create a sugared value that represents a class during initialization. This is
so that assignments to attributes correctly define attributes in __init__ but
raise an error elsewhere.
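A hypothetical sketch (not code from the PR; class and method names are mine) of the behavior this enables for TorchScript classes:
```
import torch

@torch.jit.script
class Accumulator(object):
    def __init__(self):
        # Assignments here define new attributes: allowed inside __init__
        self.total = 0

    def add(self, x: int):
        # Re-assigning an existing attribute is fine, but defining a
        # brand-new attribute outside __init__ would raise an error, e.g.:
        # self.other = 1
        self.total = self.total + x
```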
Reviewed By: shannonzhu
Differential Revision: D14263403
fbshipit-source-id: 09b2feeb272302f00a79c2a0302fbdf5483aed6a
Summary:
Temporarily disable them for perf reasons. Will figure out a way to do `torch.zeros(sizes, grad.options())` in torchscript before enabling these.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17470
Differential Revision: D14210313
Pulled By: ailzhang
fbshipit-source-id: efaf44df1192ae42f4fe75998ff0073234bb4204
Summary:
This PR removes a few `self` sizes that were passed from the forward pass to the backward pass when `self` is already required in the backward pass. This could be the reason for the potential slowdown in #16689. I will attach a few perf numbers (still a bit volatile across runs, though) in a comment.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17187
Differential Revision: D14179512
Pulled By: ailzhang
fbshipit-source-id: 5f3b1f6f26a3fef6dec15623b940380cc13656fa
Summary:
This provides the minimum necessary to allow derivative formulas for things that have a kwarg-only specifier in their schema. Support for non-parser frontend default arguments for kwargs is not yet complete.
Fixes #16921
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17339
Differential Revision: D14160923
Pulled By: zdevito
fbshipit-source-id: 822e964c5a3fe2806509cf24d9f51c6dc01711c3
Summary:
Add the missing `std` introduced by #16689. Investigating why this wasn't caught in CI (or in my local dev environment).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17263
Reviewed By: ezyang
Differential Revision: D14134556
Pulled By: ailzhang
fbshipit-source-id: 6f0753fa858d3997e654924779646228d6d49838
Summary:
Trying to land again: make prim::None into a case of prim::Constant. The previous landing was reverted because it broke an important onnx export test.
https://github.com/pytorch/pytorch/pull/16160
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17186
Differential Revision: D14115304
Pulled By: eellison
fbshipit-source-id: 161435fc30460b4e116cdd62c7b2e5b94581dcb7
Summary:
This change simplifies the analysis done on constants, since prim::None no longer needs to be handled separately. To check whether a constant node is None, use `node->isNone()`.
Next step will be to remove prim::Undefined.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16160
Differential Revision: D14109636
Pulled By: eellison
fbshipit-source-id: d26fd383976163a2ddd4c24984bd672a541cc876
Summary:
The main problem with differentiating batch norm statically
is that we make a lot of complex run-time decisions about which backend
we choose. The autograd derivatives are then implemented for every
backend separately, which makes sense, because they might be saving
buffers containing different values. To resolve the issue, the forward
op returns the index of the chosen backend, and the backward function
takes it as an argument, so that it knows how to interpret the buffers.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15403
Differential Revision: D14098815
Pulled By: ailzhang
fbshipit-source-id: 7fcd3e6e0566433e81fe8286fb441c1ecaf198ad
Summary:
When adaptive pooling has to produce a single-pixel feature map, it is faster to do so by calling .mean(). Backward calls a pretty inefficient cuda kernel with atomics, which becomes ridiculously slow for half precision. For half, this PR provides an approx 30x speed-up for adaptive average pooling, which results in a 30% end-to-end speed-up on senet. Improvements are smaller for float, but still significant (approx 5x).
Also this PR unifies handling of 3d (no batch dimension) and 4d tensors, using negative dimension indices.
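The observation behind the optimization, sketched in eager mode (my own illustration, not the PR's actual code path):
```
import torch
import torch.nn.functional as F

x = torch.randn(8, 64, 7, 7, requires_grad=True)

# A 1x1 output of adaptive average pooling is just a mean over H and W,
# so it can be dispatched to .mean(), whose backward is far cheaper.
pooled = F.adaptive_avg_pool2d(x, 1)
via_mean = x.mean(-1, keepdim=True).mean(-2, keepdim=True)

assert torch.allclose(pooled, via_mean)
```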
cc ezyang for review.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17011
Reviewed By: ailzhang
Differential Revision: D14078747
Pulled By: soumith
fbshipit-source-id: 0eb9255da2351190a6bcaf68c30e2ae2402a2dd9
Summary:
- Moved a few functions from the `autograd` namespace to the `aten` namespace so they are visible from the JIT nativeResolver.
- Added a hack to look up keyword-only arguments. Will add proper support for kwarg-only later.
- Simulate function overloading in aten using `_<number>` as a function name suffix.
- Even when `forward` returns multiple outputs, as in `kthvalue`, there is at most one output requiring grad that we currently support.
- Removed the `TensorList`-related ops here since partial `TensorList` support is prone to bugs. Our symbolic diff for `cat` was never tested with autodiff, and it seems broken. We need to find a proper way to support these ops (either by properly supporting `TensorList` or something like `prim::ConstantChunk`) and leave them for the next PR.
Ops supported in this PR:
```
erf
expand_as
index
kthvalue
mean
permute
pow
rsub
select
sqrt
squeeze
t
to
topk
transpose
view
var
embedding
logsumexp
// grad is None
_dim_arange
contiguous
nonzero
ones_like
```
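A small sketch (not from the PR; the function name is mine) touching a few of the listed ops inside TorchScript:
```
import torch

@torch.jit.script
def ops_example(x):
    # permute, sqrt, and mean are among the ops gaining AD formulas here
    return x.permute([1, 0]).sqrt().mean()

inp = torch.rand(3, 4, requires_grad=True)
ops_example(inp).backward()
```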
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16689
Differential Revision: D14020806
Pulled By: ailzhang
fbshipit-source-id: a5e2c144a7be5a0d39d7ac5f93cb402ec12503a5
Summary:
Here is a fresh attempt at getting some fusion back in autodiff-generated graphs in the presence of SumToSize.
- The sum to size operator is now `aten::_grad_sum_to_size` to allow symbolic script differentiation (and that in turn would need to use this in place of sum_to_size to signal that it strictly operates on gradients). This is also used in the autodiff code, replacing `prim::SumToSize`.
- `_grad_sum_to_size` is now fusable; `cat`s - which are fused afterwards thanks to Adam's simplification of the code - are only fused if there is no `_grad_sum_to_size` in the fusion group.
- I push the `_grad_sum_to_size` out of the fusion group when compiling and record the desired summations in the KernelSpec. The reasoning is the following:
- As the autodiff is a repeated application of the chain rule, we always have the pattern `grad_in = mm(A, grad_out)`, with A often diagonal for cases interesting to the fuser, whence it is `grad_in = a * grad_out` (a pointwise multiplication). We know that only `grad_out` may have AutodiffGradSumToSize applied, so we can commute AutodiffGradSumToSize with the `mul` (and `div` and `neg` are of similar origin).
- For `type_as` the gradient might be giving the type, so just skip SumToSize,
- `add` (which was inserted as `prim::AutogradAdd`) adding gradients when the forward used the same value in several places. This is non-broadcasting, so we know that the two arguments would have the same sizes as inputs - which is good so we don't have to do bookkeeping of the two parts.
Details:
- During fusion, the Tensor arguments are always kept as the first parameters of the fusion group to accommodate indexing assumptions in the fuser.
- The rewriting of the fusion group to record the necessary output transformation and eliminate `_grad_sum_to_size` from the fusion group is now in the fuser compile step.
- In the execution step, the arguments are split into Tensor / Non-Tensor and the non-tensor args are mostly forgotten about except for doing `sum_to_size` at the end. This would want to be improved if/when we fuse nonconstant scalar arguments.
- In a number of places in the fuser, the non-Tensor arguments to the fusion group needed to be ignored.
Thank you, apaszke for the insightful discussion. All bad ideas and errors are my own.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14957
Differential Revision: D13888173
Pulled By: zou3519
fbshipit-source-id: 071992c876e8b845f2b3e6329ae03a835d39a0ea
Summary:
The PR clang-formats everything in `torch/csrc/jit/` and adds it to the pre-commit hook.
Here is a list of non-mechanical changes:
- I went over each file and fixed up whenever I could tell that clang-format was clobbering comment formatting.
- Made the macros in register_prim_ops a little more clang-format friendly by omitting trailing commas
- Refactored autodiff.cpp to use a helper class with explicit state rather than a bunch of capturing lambdas
- Small improvements to the precommit hook clang-format
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15524
Differential Revision: D13547989
Pulled By: suo
fbshipit-source-id: 3ff1541bb06433ccfe6de6e33f29227a2b5bb493
Summary:
This adds AD support for adaptive_avg_pool2d, which is necessary for resnet50 in pytorch/vision:master. cc: soumith asuhan dlibenzi
apaszke I saw the autodiff bug you fixed in #15403; since it doesn't prevent this PR from passing, I'll leave it for your PR to fix. :)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15459
Differential Revision: D13534732
Pulled By: ailzhang
fbshipit-source-id: 4e48b93e35d5ecfe7bd64b6a132a55b07843f206
Summary:
This PR adds enough of the infra for supporting closures (inner script functions) to allow us to express symbolic gradients using them. We do not actually ever run graphs that contain these closures. The symbolic_script infrastructure just extracts them out of the original forward graph and turns them into discrete forward/backward pairs. This cuts down on the type annotations necessary to write forward/backward pairs and aligns closely with the "differentiator" function approach to expressing reverse-mode AD.
Example:
This code:
```
import torch
r = torch.jit.CompilationUnit(
'''
def mul_forward(self, other):
def backward(grad_output):
grad_self = (grad_output * other).sum_to_size(self.size())
grad_other = (grad_output * self).sum_to_size(other.size())
return grad_self, grad_other
return self * other, backward
''')
print(r.module.code)
```
Will produce this graph (pretty printed for clarity):
```
def mul_forward(self,
self: Tensor,
other: Tensor) -> Tuple[Tensor, Tuple[None, Tuple[Tensor, Tensor]]]:
backward = (self.__lambda, (other, self))
return (torch.mul(self, other), backward)
def __lambda(self,
context: Tuple[Tensor, Tensor],
grad_output: Tensor) -> Tuple[Tensor, Tensor]:
other, self, = context
grad_self = torch.sum_to_size(torch.mul(grad_output, other), torch.size(self))
grad_other = torch.sum_to_size(torch.mul(grad_output, self), torch.size(other))
return (grad_self, grad_other)
```
symbolic_script will then do some modifications to remove the unsupported prim::Function node, yielding:
```
def mul_forward(self,
self: Tensor,
other: Tensor) -> Tuple[Tensor, Tuple[None, Tuple[Tensor, Tensor]]]:
return (torch.mul(self, other), (other, self))
def backward(self,
context: Tuple[Tensor, Tensor],
grad_output: Tensor) -> Tuple[Tensor, Tensor]:
other, self, = context
grad_self = torch.sum_to_size(torch.mul(grad_output, other), torch.size(self))
grad_other = torch.sum_to_size(torch.mul(grad_output, self), torch.size(other))
return (grad_self, grad_other)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15411
Differential Revision: D13523340
Pulled By: zdevito
fbshipit-source-id: 4d4a269460e595b16802c00ec55ae00e3e682d49
Summary:
This PR enables autodiff to use the forward/backward graphs compiled from Python code, instead of using symbolic gradients (modifying the original graph directly).
We put the map in a separate .h file for now to wait for the native_functions.yaml and derivatives.yaml merge. This should ideally go into native_functions.yaml eventually.
This PR should be enough to unblock us for now, we can start writing gradients for aten functions in python.
Differential Revision: D13494635
Pulled By: ailzhang
fbshipit-source-id: f8d51a15243ac46afd09d930c573ccdfcd9fdaaf