Summary:
There is a `2to3` fixer called `future` which you can target specifically to remove these redundant `from __future__` imports; the `caffe2` directory has the most of them:
```
2to3 -f future -w caffe2
```
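For context, a minimal before/after illustration of what the `future` fixer removes (the module below is made up for illustration):
```
# Before running the fixer, a typical module still carries a Python 2
# compatibility import that is a no-op on Python 3:
from __future__ import absolute_import, division, print_function, unicode_literals

def half(x):
    return x / 2  # true division is already the default on Python 3

# After `2to3 -f future -w <file>`, the `from __future__ import ...` line
# is simply deleted; the rest of the module is unchanged.
```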
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45033
Reviewed By: seemethere
Differential Revision: D23808648
Pulled By: bugra
fbshipit-source-id: 38971900f0fe43ab44a9168e57f2307580d36a38
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33977
Removing python2 from operator_test so we can retire python2 support for PyTorch.
Test Plan: waitforsandcastle
Reviewed By: seemethere
Differential Revision: D20129500
fbshipit-source-id: d4c82e4acfc795be9bec6a162c713e37ffb9f5ff
Summary:
The goal of this PR is to unify the CUDA and HIP device types in the Caffe2 Python front end.
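A rough sketch of the device-agnostic call sites this unification enables; `workspace.GpuDeviceType` and `workspace.has_hip_support` are assumed to be the aliases introduced around this change, so treat the exact names as assumptions:
```
# Sketch: pick whichever GPU device type the build provides (CUDA or HIP),
# falling back to CPU; assumes a GPU device 0 exists when GPU support is built.
from caffe2.python import core, workspace
from caffe2.proto import caffe2_pb2

device_type = caffe2_pb2.CPU
if workspace.has_gpu_support or workspace.has_hip_support:
    device_type = workspace.GpuDeviceType  # assumed alias: CUDA or HIP

with core.DeviceScope(core.DeviceOption(device_type, 0)):
    net = core.Net("unified_device_example")
    net.ConstantFill([], "x", shape=[2, 2], value=1.0)

workspace.RunNetOnce(net)
print(workspace.FetchBlob("x"))
```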
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14221
Differential Revision: D13148564
Pulled By: bddppq
fbshipit-source-id: ef9bd2c7d238200165f217097ac5727e686d887b
Summary:
The pytorch.org site redirects all of the http:// requests to the https:// site anyway, so the comments and error messages might as well refer directly to the https:// site. The GitHub project description should also be updated to point to https://pytorch.org
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12636
Differential Revision: D10377099
Pulled By: soumith
fbshipit-source-id: f47eaba1dd3eecc5dbe62afaf7022573dc3fd039
* Scope MultiRNN blobs with name as well as layers
Also don't double-scope MultiRNN in the case of multiple layers.
* Scope input projection of first layer with name
We don't scope it with layers because the projection is done
outside of the layer.
* Avoid scoping input blob in MemongerTest.test_rnn
* Rectify input_blob in prepare_input
Revert the change in memonger_test because rectifying the input solves the problem.
Summary:
There is a long-standing scoping problem which was introduced in the original Python wrappers early in H1: each RNNCell implementation has to manually scope the outputs of each of its operators. If somebody forgets, there can be subtle bugs with stacked layers, etc.
The approach is the following: the user has to explicitly specify the current scope when using apply_over_sequence and similar functions if the function is going to be called several times (e.g. when stacking layers). This way we use Caffe2's native scoping approach instead of inventing one extra API people have to use (i.e. passing a scope name as an argument to the RNNCell constructor).
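A hedged sketch of the resulting usage pattern; the LSTMCell and apply_over_sequence argument names are approximated from rnn_cell.py of this era and may not match exactly:
```
# The caller wraps each layer in a Caffe2 NameScope instead of passing a
# scope name into the RNNCell constructor.
import numpy as np
from caffe2.python import core, model_helper, rnn_cell, workspace

T, N, D, H = 4, 2, 8, 8
model = model_helper.ModelHelper(name="stacked_lstm")
workspace.FeedBlob("inputs", np.random.randn(T, N, D).astype(np.float32))
workspace.FeedBlob("seq_lengths", np.full((N,), T, dtype=np.int32))

layer_input, dim_in = "inputs", D
for layer in range(2):
    with core.NameScope("layer_{}".format(layer)):
        cell = rnn_cell.LSTMCell(
            input_size=dim_in, hidden_size=H,
            forget_bias=0.0, memory_optimization=False, name="lstm",
        )
        init_h, init_c = model.net.AddExternalInputs("init_h", "init_c")
        workspace.FeedBlob(str(init_h), np.zeros((1, N, H), dtype=np.float32))
        workspace.FeedBlob(str(init_c), np.zeros((1, N, H), dtype=np.float32))
        # The surrounding NameScope keeps each layer's blobs from colliding.
        outputs = cell.apply_over_sequence(
            model=model, inputs=layer_input,
            seq_lengths="seq_lengths", initial_states=[init_h, init_c],
        )
        layer_input, dim_in = outputs[0], H  # hidden_all feeds the next layer
```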
Closes https://github.com/caffe2/caffe2/pull/1681
Differential Revision: D6777536
Pulled By: salexspb
fbshipit-source-id: 73d860b8d4857589e04bdea5a6fcd3080d68427c
Summary:
When the sequence length input is left out, each sequence is treated as having
a length equal to the first dimension of the input tensor. This matches the
semantics of ONNX for that case.
Closes https://github.com/caffe2/caffe2/pull/1764
Reviewed By: dzhulgakov
Differential Revision: D6751219
Pulled By: anderspapitto
fbshipit-source-id: 89e0efd12339157627494e2b8c83e952bdd8a9f8
Summary: A version of MILSTMCell which uses layer normalization (see https://arxiv.org/pdf/1607.06450.pdf). There's a lot of copypasta because we don't want to make the existing RNNCell classes harder to approach / understand by adding new options.
Differential Revision: D6564208
fbshipit-source-id: 0bc43e12b6c08ebdf5ea6af2c631f785c302bdb4
Summary:
Adds a new `LSTMCell` subclass to the `rnn_cell` module that performs layer normalization on the fused input matrix. Moves around some code in `rnn_cell.py` to avoid copy-pasta. Adds relevant test cases to `rnn_cell_test.py`.
Had to fix `brew.layer_norm` first. See T24013870.
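For reference, a small numpy sketch of layer normalization applied to the fused gate pre-activations (illustrative only; the real cell uses Caffe2's LayerNorm op):
```
# Normalize each row of the fused [i, f, o, g] gate pre-activations to zero
# mean / unit variance, then apply a learned scale and shift.
import numpy as np

def layer_norm(x, gain, bias, eps=1e-4):
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return gain * (x - mean) / np.sqrt(var + eps) + bias

N, H = 2, 3
fused = np.random.randn(N, 4 * H).astype(np.float32)  # W x_t + U h_{t-1} + b
gain = np.ones(4 * H, dtype=np.float32)
bias = np.zeros(4 * H, dtype=np.float32)
print(layer_norm(fused, gain, bias).shape)  # (N, 4*H)
```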
Reviewed By: jhcross
Differential Revision: D6454883
fbshipit-source-id: 0f4ea7a778cc5be6a7274f7b28c793f5dd7c6095
Summary: The cudnn version of the DropoutOp was taking a significant (and unwarranted) amount of time in our RNN training. Further investigation showed that setting the cudnn dropout descriptors was an extremely expensive operation (https://pxl.cl/99nT), much more so than the dropout operation itself. This diff adds an option to DropoutCell to disable cudnn. The non-cudnn version uses a raw curand call that elides all of the expensive descriptor setting.
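A hedged construction sketch; the use_cudnn flag is the option this diff describes, and the other constructor arguments are assumptions based on rnn_cell.py:
```
# Wrap an LSTMCell in a DropoutCell and opt out of the cuDNN dropout path,
# so the cheaper curand-based dropout is used instead.
from caffe2.python import rnn_cell

lstm = rnn_cell.LSTMCell(
    input_size=256, hidden_size=256,
    forget_bias=0.0, memory_optimization=False, name="lstm",
)
cell = rnn_cell.DropoutCell(
    internal_cell=lstm, dropout_ratio=0.2,
    use_cudnn=False, name="lstm_dropout",
)
```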
Reviewed By: jmp84, akyrola
Differential Revision: D5972022
fbshipit-source-id: 6325ec5d6569f8b94d776cbb2554cc8ddb28f699
Summary: This caused gradient generation problems. Output was made in-place in PR-1185, by mistake, I believe.
Differential Revision: D5844825
fbshipit-source-id: 4ad84d0fb468aafde9f78463b9acf89316e633ca
Summary: As titled. I wonder why this had not been encountered before. It only affects cases where the states are copied over, though.
Reviewed By: Yangqing
Differential Revision: D5777314
fbshipit-source-id: 8aef435c832e4ead5bb3d3e35bb065c734a2af5f
Summary:
Implementation of a new variant of the attention module, which contains a recurrent decoder state with vectors corresponding to each source-side word and strictly increasing values, thus enabling it to model the degree to which source words have been translated.
The approach is a variant of the approaches described in https://arxiv.org/pdf/1601.04811.pdf. We simply include the sum of all previous attention weights for encoder words as a new recurrent state (coverage_t). A new linear transform on encoder_outputs is used to produce coverage_weights, which has the same dimensionality as encoder_outputs and implicitly models the fertility of source-side words (placing this extra informational strain on the encoder network).
Thus the encoder output, the decoder state, and the coverage weights have the same dimensionality for a given source word, and attention logits are calculated as v * tanh(coverage * coverage_weights + encoder_output + decoder_state).
Note: the entire coverage state for each translation instance is of shape (encoder_length, coverage_units), but the states for the RecurrentNetwork operator, used to train the decoder, must be flat in the data dimension. This state is therefore initialized with shape (encoder_length * coverage_units) [not shown in the open-source library] and reshaped appropriately within the apply_soft_coverage_attention() function.
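A small numpy sketch of the logit computation described above; shapes, names, and the coverage update are illustrative:
```
# attention_logit_t[i] = v . tanh(coverage_t[i] * coverage_weights[i]
#                                 + encoder_output[i] + decoder_state)
import numpy as np

enc_len, units = 5, 8
encoder_output = np.random.randn(enc_len, units)
coverage_weights = np.random.randn(enc_len, units)  # linear transform of encoder_outputs
decoder_state = np.random.randn(units)
coverage = np.zeros((enc_len, units))                # running sum of attention weights
v = np.random.randn(units)

logits = np.tanh(coverage * coverage_weights + encoder_output + decoder_state) @ v
attention = np.exp(logits) / np.exp(logits).sum()    # softmax over source positions
coverage += attention[:, None]                       # recurrent coverage update
```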
Differential Revision: D5593617
fbshipit-source-id: 7d0522b5eb0b26f22e8429e4461a459f2f16ed46
Summary:
The _LSTM helper is a legacy piece we had before all the RNNCell awesomeness landed. Now we need to pull it apart and create separate building blocks that people can use for any RNNs.
Please note the changes to a test with double scoping. That should go away once we change the RNNCell scoping logic so that each cell adds its own name to the scope for all of its outputs (see another diff: D5613139).
Reviewed By: jhcross
Differential Revision: D5632276
fbshipit-source-id: 1cb568ab995c4c0b3dd1b4bad2d028e34bded9c1
Summary:
Forward-only mode had broken at some point. Two things: RNNCell did not pass the parameter to recurrent.py, and recurrent.py itself was broken when forward_only=True after the Python 3 codemod.
Added a test to rnn_cell_test that actually checks the forward_only parameter is passed, to prevent future breakage.
Reviewed By: jmp84
Differential Revision: D5639306
fbshipit-source-id: b1bbc39d59c3f3734b2f40a1c2f3740c733e0bd4
Summary: GRU differs from LSTM in that it only has hidden states but no cell states. So reusing the code of _LSTM is problematic: we need to delete the part that creates the cell state, and change many other places that use a hard-coded 4 states (hidden_all, hidden, cell_all, cell) into 2 (hidden_all, hidden). Otherwise GRU will break during the backward pass, when the optimizer tries to apply gradients to each of the parameters, because the cell state is never used and so there are no gradients for the corresponding parameters (i.e., cell_state_w, cell_state_b).
Differential Revision: D5589309
fbshipit-source-id: f5af67dfe0842acd68223f6da3e96a81639e8049
Summary:
Implement dot attention as described in https://arxiv.org/abs/1508.04025
This saves the computation of weighted encoder outputs in `rnn_cell.py`.
When the encoder and decoder dimensions are different, we apply an FC, which corresponds to the "general" case below Figure 2 of the paper.
Refactored unit tests.
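A numpy sketch of the dot-attention scoring; the FC for mismatched dimensions is shown as a plain matrix multiply, and all names are illustrative:
```
# Luong-style dot attention: logits are direct dot products between the
# decoder state and each encoder output, so no separate "weighted encoder
# outputs" blob needs to be precomputed.
import numpy as np

enc_len, enc_dim, dec_dim = 6, 8, 4
encoder_outputs = np.random.randn(enc_len, enc_dim)
decoder_state = np.random.randn(dec_dim)

# "general" case: project the decoder state when dimensions differ.
W = np.random.randn(dec_dim, enc_dim)
projected = decoder_state @ W if enc_dim != dec_dim else decoder_state

logits = encoder_outputs @ projected              # (enc_len,)
weights = np.exp(logits) / np.exp(logits).sum()   # softmax over source positions
context = weights @ encoder_outputs               # attention-weighted context
```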
Reviewed By: jhcross
Differential Revision: D5486976
fbshipit-source-id: f9e9aea675b3b072fbe631bc004199b90a9d95cb
Summary: When creating parameters for a ModelHelper, we should use create_param instead of using param_init_net and model.params directly. This diff rewrites some of these cases in rnn_cell.py in order to keep model._parameter_info and model.params consistent.
Reviewed By: kittipatv
Differential Revision: D5477724
fbshipit-source-id: 28c4aaf8f98d9d89125af6a42ad328008f0079e1
Summary:
In order to get dimensions right, correctly identify gradients, etc., DropoutCell should delegate its own _prepare_output and _prepare_output_sequence methods to those of its internal cell.
This bug was identified by NVIDIA intern Syed Tousif Ahmed.
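The fix amounts to a delegation pattern along these lines (shown as a standalone subclass for illustration; the method signatures and the internal_cell attribute are assumptions, not the exact diff):
```
from caffe2.python import rnn_cell

class DelegatingDropoutCell(rnn_cell.DropoutCell):
    # Forward output preparation to the wrapped cell, so shapes and gradient
    # bookkeeping come from the internal cell rather than from defaults.
    def _prepare_output(self, model, states):
        return self.internal_cell._prepare_output(model, states)

    def _prepare_output_sequence(self, model, state_outputs):
        return self.internal_cell._prepare_output_sequence(model, state_outputs)
```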
Reviewed By: akyrola
Differential Revision: D5483082
fbshipit-source-id: f6df5b4a0502ed0771056638aab219fb5cc7d964
Summary:
For RNN attention, we should not include the invalid parts of the encoder output (based on encoder_lengths) in the computation. This diff accomplishes that by forcing the logits for those positions to negative infinity.
Note that this step can be bypassed by passing encoder_lengths=None, which is what we do for beam search, thus incurring no extra overhead for inference.
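A numpy sketch of the masking arithmetic; the actual implementation builds this out of Caffe2 ops inside the attention net:
```
# Positions past each sequence's encoder length get -inf logits, so they
# receive exactly zero attention weight after the softmax.
import numpy as np

max_len, batch = 5, 2
logits = np.random.randn(batch, max_len)
encoder_lengths = np.array([5, 3])

mask = np.arange(max_len)[None, :] >= encoder_lengths[:, None]
logits = np.where(mask, -np.inf, logits)
weights = np.exp(logits - logits.max(axis=1, keepdims=True))
weights /= weights.sum(axis=1, keepdims=True)
# rows of `weights` now sum to 1, with zeros at the padded positions
```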
Reviewed By: jamesr66a
Differential Revision: D5402547
fbshipit-source-id: 1863d6050b5129e4df829c6357f0aa9ded0715dc
Summary: Adding a test to check computational integrity of networks constructed with AttentionCell using UnrolledCell.
Reviewed By: salexspb
Differential Revision: D5306915
fbshipit-source-id: 02acfd1011f7d3ee5fac21cc2778c4a486190c43
Summary:
This diff fixes the gradient computation of residual connections for a training network constructed with MultiRNNCell.
It addresses a logic bug in _prepare_output() and _prepare_output_sequence() by keeping track internally of which layers have consecutive residual connections before the output, then reconstructing the final residual output by (re-)preparing the output of each of those layers and combining them with a Sum operation. This also involves keeping track of which states contribute to the reconstruction of the final sequence output, so that outputs_with_grads can be passed correctly to apply_over_sequence().
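Conceptually, the final output is rebuilt as a Sum over the prepared outputs of the residual layers, roughly like this numpy sketch:
```
import numpy as np

# Prepared outputs of the last layers that are chained by residual ('add')
# connections; the final output is their elementwise sum (a Sum op in Caffe2),
# so the gradient has to flow into every one of them.
prepared = [np.random.randn(4, 8) for _ in range(3)]
final_output = np.sum(prepared, axis=0)
```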
Differential Revision: D5300520
fbshipit-source-id: f37d800c909e631175de7045abe192351cc11c41
Summary:
While this is not intended to be the most performant or general solution, the
test plan shows that in some cases a static DAG RNN can perform better than
our own implementation. Hopefully we will get dynamic RNN DAG execution to be
at least as fast as this one; then we will not need this one in production,
only for testing.
Still putting it into our benchmark for comparison purposes.
Reviewed By: akyrola
Differential Revision: D5210038
fbshipit-source-id: fa44baf51c455872abd6ec5f5d151cf06e15b1fa
Summary: I accidentally noticed that we were calling the non-CUDNN version of Transpose with attention, and it is super slow. This broke when rnn_cell was changed to use ModelHelper instead of CNNModelHelper in D5062963, but the calls to Transpose were not "brewed".
Reviewed By: jamesr66a
Differential Revision: D5264248
fbshipit-source-id: b61494ae210f34597245f1195d20547f5b5cd8b5
Summary:
Static RNN allows unrolling an RNN into a Caffe2 graph using all the existing cell abstractions. In this diff I introduce several new tests that have already caught a few bugs in our RecurrentNetworkOp gradient accumulation logic by comparing it to an unrolled version.
Another use case is perf: potentially we can run an unrolled net faster because DAGNet will have access to the whole graph. The same goes for memonger. But that work is not part of this diff.
Reviewed By: akyrola
Differential Revision: D5200943
fbshipit-source-id: 20f16fc1b2ca500d06ccc60c4cec6e81839149dc
Summary: Use new blob as residual sum output, and add scoping to prevent any name conflicts.
Reviewed By: urikz
Differential Revision: D5167145
fbshipit-source-id: a01c87ed2278205e95e8395314b166afb1dca1b3
Summary: Added a new RNNCell, DropoutCell, which wraps an existing RNNCell and applies dropout to its primary output (as defined by get_output_state_index()).
Reviewed By: salexspb
Differential Revision: D5084871
fbshipit-source-id: 60474af84e5757a12e7fdc3814840dc9ba8e32a1
Summary: As noted by salexspb, MultiRNNCell had unreliable gradient computation. The problem was that the recurrent gradient and the gradient computed within the backward step net were not being accumulated during the backward pass, but rather written to the same blob, thus overwriting each other. This diff fixes that by artificially introducing an extra blob for the internal output and then accumulating it into the gradient coming from the recurrent connection.
Reviewed By: salexspb
Differential Revision: D5110059
fbshipit-source-id: 16add50989fe8866361bbc21afce5f214c5292fd
Summary:
Incorporate arbitrary dropout for encoder and decoder layers for Caffe2 NMT models using current configuration. This involves separate output processing (_prepare_output() and _prepare_output_sequence()) for the final layer in a MultiRNNCell.
Switching to using the newly introduced forward_only switch for RNN cells revealed an unrelated bug in our NetGradientChecker test, which urikz is investigating.
Reviewed By: salexspb
Differential Revision: D5031964
fbshipit-source-id: 19b49607d551aa3e2140041ef4e585f128c8f178
Summary:
Residual connections for multilayer RNN encoder/decoder for Caffe2 NMT model. Only supporting 'add' connections (the standard approach, which ves's TF experiments concluded was at least as good as other approaches), and also only implementing for residual_level >= 1 (which also fits our use case).
It is the responsibility of the config to ensure dimension compatibility: each level at and beyond residual_level (in both the encoder and decoder) should have the same number of units, with the exception that a bidirectional initial encoder layer should have half the number of units of the succeeding layer if that next layer is a residual layer.
Differential Revision: D5023160
fbshipit-source-id: f38c1b140638fee78cf3ef7d6b4602dd462484ee
Summary:
Update rnn_cell.py and the char_rnn.py example to use the new `brew` model.
- Deprecate CNNModelHelper
- Replace all helper functions with brew helper functions
- Use the `model.net.<SingleOp>` format to create bare-bones operators for better clarity (see the sketch below).
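A small sketch of the two patterns this update uses; the layer sizes and blob names are arbitrary:
```
# brew helpers for layers that create parameters; bare model.net.<SingleOp>
# calls for parameter-free operators.
import numpy as np
from caffe2.python import brew, model_helper, workspace

model = model_helper.ModelHelper(name="brew_example")
workspace.FeedBlob("data", np.random.randn(1, 64).astype(np.float32))

fc1 = brew.fc(model, "data", "fc1", dim_in=64, dim_out=32)  # creates W and b
relu1 = model.net.Relu(fc1, "relu1")                        # bare-bones operator

workspace.RunNetOnce(model.param_init_net)
workspace.RunNetOnce(model.net)
print(workspace.FetchBlob("relu1").shape)  # (1, 32)
```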
Reviewed By: salexspb
Differential Revision: D5062963
fbshipit-source-id: 254f7b9059a29621027d2b09e932f3f81db2e0ce