Commit Graph

141 Commits

Author SHA1 Message Date
c9be135bb9 Fix batch norm multiplier init (#12325)
Summary:
Fixes #12259
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12325

Differential Revision: D10203439

Pulled By: SsnL

fbshipit-source-id: 999cc134a45e2554313adb7eb93ee98e1f84335f
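
A hedged sketch of the change described above, assuming (per issue #12259) that the fix initializes the BatchNorm affine multiplier (gamma) to ones instead of sampling it uniformly; illustration only, not the actual source:

    import torch
    from torch import nn

    def reset_parameters(bn: nn.BatchNorm2d) -> None:
        bn.reset_running_stats()
        if bn.affine:
            nn.init.ones_(bn.weight)   # multiplier (gamma): ones, not uniform (assumption)
            nn.init.zeros_(bn.bias)    # offset (beta): zeros
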
2018-11-08 19:00:00 -08:00
2cd912bcc2 Fix more spectral norm bugs (#13350)
Summary:
Problems with SN and DP after #12671 :
1. In eval mode, `weight_orig` does not receive the correct gradient (#12737).

    Fix: keep `v` vector around as a buffer and always calculate `W = W_orig / (u @ W_orig @ v)` even in eval.

2. In training mode, the `weight` buffer of the parallelized module is never updated if someone touches `weight_orig` and/or `weight` and makes them stop sharing storage, so the weight used in `eval` is wrong.

    Fix: Make `weight` not a buffer anymore and always calculate it as above.

3. #12671 changed SN to update `u` in place so that DP works correctly, but that breaks backward through two forwards (e.g., the common GAN loss `D(real) - D(fake)`) because the vectors needed to backprop the 1st forward are changed by the 2nd forward.

    Fix: This PR clones `u` and `v` before using them.

To maintain BC, I added a hook interface for producing and loading the state_dict. This is ugly and we should really have a better interface for spectral_norm, but this patch is made for the purpose of fixing the issue. Even with a better interface, a BC mechanism for loading legacy state_dicts would still be needed.
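
A minimal sketch (not the actual torch.nn.utils.spectral_norm implementation) of the weight computation the fixes above imply: u and v are kept as buffers, cloned before use so that backward through an earlier forward still sees the vectors it was computed with, and W is always recomputed from weight_orig, even in eval mode:

    import torch
    import torch.nn.functional as F

    def compute_weight(weight_orig, u, v, training, n_power_iterations=1, eps=1e-12):
        # Flatten all trailing dims so the weight is treated as a 2D matrix.
        weight_mat = weight_orig.view(weight_orig.size(0), -1)
        if training:
            with torch.no_grad():
                for _ in range(n_power_iterations):
                    # Power iteration updates the buffers u and v in place.
                    v.copy_(F.normalize(torch.mv(weight_mat.t(), u), dim=0, eps=eps))
                    u.copy_(F.normalize(torch.mv(weight_mat, v), dim=0, eps=eps))
        # Clone so the graph of a previous forward (e.g. D(real) - D(fake))
        # is not invalidated when the buffers are updated in the next forward.
        u, v = u.clone(), v.clone()
        sigma = torch.dot(u, torch.mv(weight_mat, v))
        return weight_orig / sigma
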

cc crcrpar
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13350

Differential Revision: D12931044

Pulled By: SsnL

fbshipit-source-id: 8be6f934eaa62414d76d2c644dedd7e1b7eb31ef
2018-11-06 19:16:13 -08:00
f9a99d5504 Specify default initialization schemes for modules in docs (#9038)
Summary: This closes #6906.

Reviewed By: ezyang

Differential Revision: D8698632

Pulled By: weiyangfb

fbshipit-source-id: 259c1dbdc264a8e9f83e196fa72d135babd97d48
2018-07-24 11:58:08 -07:00
623ae0c07c Fix loading 0.4 BN checkpoints (#9004)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/8481
Closes https://github.com/pytorch/pytorch/pull/9004
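
A hedged sketch of the kind of backward-compatibility shim this fix implies, assuming (per issue #8481) that checkpoints saved with 0.4.0 predate the num_batches_tracked buffer; the class name below is hypothetical and the real, version-gated override lives in torch/nn/modules/batchnorm.py:

    import torch
    from torch import nn

    class BatchNorm2dBC(nn.BatchNorm2d):
        def _load_from_state_dict(self, state_dict, prefix, local_metadata, strict,
                                  missing_keys, unexpected_keys, error_msgs):
            # Old checkpoints have no num_batches_tracked entry; insert a
            # default so strict loading does not fail (sketch only).
            key = prefix + 'num_batches_tracked'
            if self.track_running_stats and key not in state_dict:
                state_dict[key] = torch.tensor(0, dtype=torch.long)
            super()._load_from_state_dict(state_dict, prefix, local_metadata, strict,
                                          missing_keys, unexpected_keys, error_msgs)
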

Reviewed By: soumith

Differential Revision: D8684017

Pulled By: SsnL

fbshipit-source-id: 57820ad5f6b60795358c9447409a364a93ffa1d9
2018-07-01 22:24:21 -07:00
3185d8342e Replace incorrect usages of "NotImplemented" (#7381)
* Replace incorrect usages of "NotImplemented"

Fixes #7266. Replaces "NotImplemented" (a value meant to be returned from
binary operator methods) with the correct "NotImplementedError" exception; a short illustration follows below.

* Address comments
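
A short illustration of the distinction (the class and method names are assumptions, not code from this PR): NotImplemented is a value returned from binary special methods, while NotImplementedError is the exception to raise from unimplemented methods:

    class Base:
        def forward(self, x):
            # An abstract method should raise the exception...
            raise NotImplementedError

    class Num:
        def __init__(self, v):
            self.v = v

        def __eq__(self, other):
            # ...while a binary operator returns the NotImplemented value so
            # Python can try the reflected operation on the other operand.
            if not isinstance(other, Num):
                return NotImplemented
            return self.v == other.v
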
2018-05-08 18:31:45 -04:00
76d3c30783 Enable resetting of batchnorm running moments and cumulative ("simple") moving average (#6445) 2018-04-26 19:27:24 -07:00
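
A brief usage sketch for the entry above, using the current module API (momentum=None selects the cumulative "simple" moving average, and reset_running_stats() clears the running moments):

    import torch
    from torch import nn

    # momentum=None: running stats become a cumulative average over all batches seen.
    bn = nn.BatchNorm2d(16, momentum=None)
    bn(torch.randn(8, 16, 4, 4))

    # Discard the accumulated moments (and the batch counter) and start over.
    bn.reset_running_stats()
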
605307f8f3 Add support for printing extra information in Module and refactor redundant codes (#5936)
This PR enables users to print extra information about their subclassed nn.Module.
For now the user-defined string is simply appended to the module name; the exact formatting should be discussed in this PR.

Before this PR, users had to redefine __repr__ and copy & paste the source code from Module.
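
With the interface this PR converged on (extra_repr in current PyTorch), a subclass only supplies the extra string and Module's __repr__ does the formatting; a minimal example (the Scale module is hypothetical):

    import torch
    from torch import nn

    class Scale(nn.Module):
        def __init__(self, num_features):
            super().__init__()
            self.num_features = num_features
            self.weight = nn.Parameter(torch.ones(num_features))

        def forward(self, x):
            return x * self.weight

        def extra_repr(self):
            # Inserted between the parentheses by nn.Module.__repr__
            return 'num_features={}'.format(self.num_features)

    print(Scale(8))   # Scale(num_features=8)
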

* Add support for extra information on Module

* Rewrite the repr method of Module

* Fix flake8

* Change the __repr__ to get_extra_repr in Linear

* Fix extra new-line for empty line

* Add test for __repr__ method

* Fix bug of block string indent

* Add indent for multi-line repr test.

* Address review comments

* Update tutorial for creating nn.Module

* Fix flake8, add extra_repr of bilinear

* Refactor DropoutNd

* Change to extra_repr in some Modules

* Fix flake8

* Refactor padding modules

* Refactor pooling module

* Fix typo

* Change to extra_repr

* Fix bug for GroupNorm

* Fix bug for LayerNorm
2018-04-02 13:52:33 -04:00
08891b0a4e Group Normalization (#5968)
* Group Normalization

* move to ATen
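
A brief usage sketch of the module added here, using the current constructor signature:

    import torch
    from torch import nn

    # 32 groups over 64 channels; input is (N, C, *)
    gn = nn.GroupNorm(num_groups=32, num_channels=64)
    y = gn(torch.randn(4, 64, 8, 8))
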
2018-03-24 12:16:18 -04:00
7e13138eb6 Revert "Enable resetting of batchnorm running stats and cumulative ("simple") moving average" (#5892)
* Revert "Port ATen and JIT C++ tests to Catch2 (#5788)"

This reverts commit 6f80023c29e0fb55f46a32c4931bc5d4ba749846.

* Revert "Fix error message for cat-ing zero-dim tensors (#5819)"

This reverts commit cf2e1760490d369e93017b9425279b235c10772d.

* Revert "Softmax symbolic should account for negative dim (#5846)"

This reverts commit ba64724aeea8ad5d4b50cd1154fca5a011618333.

* Revert "[fft][1 of 3] build system and helpers to support cuFFT and MKL (#5855)"

This reverts commit 22ef8e5654c45d1f5404e3add6ad19678c0b80a9.

* Revert "Don't modify requires_grad when running DataParallel in no_grad mode (#5880)"

This reverts commit d11b7fbd1c49ed7bd84c89d286e2763e6ba55f51.

* Revert "fix some methods not showing up in doc (#5882)"

This reverts commit 24fca0efb289a069929639783d1c050b79e591c0.

* Revert "ReduceOps cleanup and set_num_threads (#5723)"

This reverts commit 84400d5531500e1a3fbcfe8a3f2865f982405861.

* Revert "introduce shape_as_tensor and reshape_from_variable_shape (#5824)"

This reverts commit f446b82e70ca0aa42fffa58469c28b6bce51d021.

* Revert "Enable resetting of batchnorm running moments and cumulative ("simple") moving average (#5766)"

This reverts commit 99b1f6cfad85a4856550cc1e787afd7ff9e6c6aa.
2018-03-19 17:47:54 -04:00
99b1f6cfad Enable resetting of batchnorm running moments and cumulative ("simple") moving average (#5766) 2018-03-19 11:47:57 -04:00
76a283db40 [ready] General Documentation Improvements - 2 (#5685)
* Fix some minor errors in existing docs.

* Fix Convolution and Pooling docs in torch.nn.functional

* Cleaned up torch.nn.functional docs

* Address @SsnL 's comments

* Add multiplication sign missing in docs

* Fix more typos, and clear some warnings

* Change infinity symbol in LPPool2d

* Revert some changes in torch.nn.functional

* Few more minor changes
2018-03-13 09:47:43 -04:00
32b3841553 [ready] General documentation improvements (#5450)
* Improve documentation
1. Add formula for erf, erfinv
2. Make exp, expm1 similar to log, log1p
3. Symbol change in ge, le, ne, isnan

* Fix minor nit in the docstring

* More doc improvements
1. Added some formulae
2. Complete scanning till "Other Operations" in Tensor docs

* Add more changes
1. Modify all torch.Tensor wherever required

* Fix Conv docs
1. Fix minor nits in the references for LAPACK routines

* Improve Pooling docs
1. Fix lint error

* Improve docs for RNN, Normalization and Padding
1. Fix flake8 error for pooling

* Final fixes for torch.nn.* docs.
1. Improve Loss Function documentation
2. Improve Vision Layers documentation

* Fix lint error

* Improve docstrings in torch.nn.init

* Fix lint error

* Fix minor error in torch.nn.init.sparse

* Fix Activation and Utils Docs
1. Fix Math Errors
2. Add explicit clean to Makefile in docs to prevent running graph generation script
while cleaning
3. Fix utils docs

* Make PYCMD a Makefile argument; clean up prints in build_activation_images.py

* Fix batch norm doc error
2018-03-08 13:21:12 -05:00
27265503ad nn.* doc update after Variable/Tensor merge (#5459)
The nn.* counterpart of #5443. Mostly removes the Variable wrapper; also adds docs for nn.RReLU.

Notice that torch.randn(*, requires_grad=True) isn't documented until #5462 is done.
2018-03-01 18:11:39 -05:00
1848cad108 [ready] Layer Normalization (#4922)
* at::maybe_data_ptr and Check.h => TensorUtils.h

* THNN support for optional BN running_*

* ATen support for optional BN running_*

* Python nn.* support for optional BN running_*; Improve IN and BN doc

* Add tests for IN and BN new option

* Layer Norm (usage sketch after this list)

* Fix LRN doc

* functional interface for LN and IN

* Layer norm tests

* fix BN double backward returning undefined tensors

* fix jit test using wrong dim inputs for BN

* add/improve BN, IN and LN GPU tests with half type

* Update docs to be consistent with Conv notation
Fix ONNX
Clarify ONNX symbolic wrapper

* fix typo

* Address comments
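
A brief usage sketch of the pieces listed above, written against current API names (nn.LayerNorm, the track_running_stats option, and the functional layer_norm/instance_norm interfaces); assumed, not copied from this PR:

    import torch
    from torch import nn
    import torch.nn.functional as F

    x = torch.randn(4, 8, 16)

    # LayerNorm over the last dimension(s)
    ln = nn.LayerNorm(16)
    y = ln(x)

    # Optional running statistics for BatchNorm / InstanceNorm
    bn = nn.BatchNorm1d(8, track_running_stats=False)     # no running_mean/var buffers
    inorm = nn.InstanceNorm1d(8, track_running_stats=True)

    # Functional interfaces
    y = F.layer_norm(x, normalized_shape=(16,))
    y = F.instance_norm(x)
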
2018-02-22 11:56:41 -05:00
dd6d04ddf2 doc: Normalize all true/false in docstrings to `True|False` (#3593)
* doc: Normalize all true/false in docstrings to ``True|False``

This makes them more apparent in the documentation.

* doc: fix flake8
2017-11-09 08:12:29 -05:00
3109e4ad6a add common terminology to BatchNorm docs 2017-10-17 11:03:31 +02:00
0cd149f06f Add comments for default value (#2242) 2017-07-29 14:14:14 +05:30
46a868dab7 [Ready] Limit docs line length (#1900)
* some docs are ready

* docs

* docs

* fix some more

* fix some more
2017-07-10 10:24:54 -04:00
c6d7e1e6bf added input size checks to batchnorm (#2020) 2017-07-09 15:31:24 -04:00
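
A hedged sketch for the entry above; current BatchNorm modules use a similar _check_input_dim helper, but the exact form here is an assumption:

    def _check_input_dim(self, input):
        # BatchNorm2d expects 4D input (N, C, H, W); fail early with a clear message.
        if input.dim() != 4:
            raise ValueError('expected 4D input (got {}D input)'.format(input.dim()))
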
9d916e561c batch norm docfix (#1804)
fixes the formula for batch normalization (moves the epsilon inside
the square root)
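
For reference, the corrected formula, with epsilon inside the square root as in the current BatchNorm docs:

    y = \frac{x - \mathrm{E}[x]}{\sqrt{\mathrm{Var}[x] + \epsilon}} \cdot \gamma + \beta
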
2017-06-14 11:57:46 -04:00
46cf6ff5fb fix batchnorm docs (#1284) 2017-04-18 15:12:38 -04:00
d9678c2e34 Correct typo in batchnorm documentation 2017-03-22 13:55:45 +01:00
e7c1e6a8e3 [pep8] Fix most lint automatically with autopep8
Here's the command I used to invoke autopep8 (in parallel!):

    git ls-files | grep '\.py$' | xargs -n1 -P`nproc` autopep8 -i

Several rules are ignored in setup.cfg. The goal is to let autopep8
handle everything which it can handle safely, and to disable any rules
which are tricky or controversial to address. We may want to come back
and re-enable some of these rules later, but I'm trying to make this
patch as safe as possible.

Also configures flake8 to match pep8's behavior.

Also configures TravisCI to check the whole project for lint.
2017-01-28 01:15:51 +01:00
6d14ef8083 Update batchnorm docstrings
Add missing full stops and a blank line for increased clarity in the rendered documentation.
2017-01-19 14:15:26 +01:00
f0a6ca4d53 BatchNorm fixes (#423)
- don't use cuDNN for half inputs, because weight, bias, running_mean,
   etc. are required to be of a different type than for THCUNN
 - accept 3D inputs (N,C,L) in BatchNorm1d (see the sketch below)
 - remove accidental 'use_cudnn=False'
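
A brief illustration of the (N, C, L) input now accepted by BatchNorm1d, using the current API:

    import torch
    from torch import nn

    bn = nn.BatchNorm1d(20)            # C = 20 features/channels
    y2d = bn(torch.randn(8, 20))       # (N, C) still works
    y3d = bn(torch.randn(8, 20, 50))   # (N, C, L) now accepted as well
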
2017-01-09 13:16:51 -05:00
088f14c697 fix batchnorm and linear docs for rst 2017-01-04 13:35:55 -05:00
55e850d825 test if modules can be printed with fixes 2016-12-29 17:30:46 -05:00
62af45d99f Basic functional interface (#354) 2016-12-29 22:53:57 +01:00
ffcc38cf05 Deterministic ordering of parameters and buffers. (#317)
Uses the assignment syntax to get deterministic ordering of parameters.
The ordering of parameters using the constructor syntax is
non-deterministic because kwargs use dict() in Python 3.5 and earlier.
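
A short illustration (current nn.Module semantics assumed; the Net class is hypothetical): attributes assigned in __init__ register their parameters in definition order, so iteration order is deterministic:

    import torch
    from torch import nn

    class Net(nn.Module):
        def __init__(self):
            super().__init__()
            self.fc1 = nn.Linear(10, 20)   # registered first
            self.fc2 = nn.Linear(20, 5)    # registered second

    # Always fc1.weight, fc1.bias, fc2.weight, fc2.bias, in that order.
    print([name for name, _ in Net().named_parameters()])
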
2016-12-16 14:45:56 -05:00
513d902df1 adding __repr__ for nn 2016-11-07 16:17:40 -05:00
b4f4cca875 Rename training and evaluation methods 2016-10-30 00:16:06 +02:00
3cbe66ba8c Change requires_grad default to False 2016-10-05 08:46:34 -07:00
d92b7da733 fix documentation to not use forward 2016-09-30 09:49:30 -07:00
7f4ff0e615 Fix type conversions in nn 2016-09-27 15:45:49 -07:00
f9d25e8e72 Refactor nn (require specifying parameters explicitly) 2016-09-27 15:22:26 -07:00
8fdec15a55 Codemod to remove camel case method naming 2016-09-20 08:40:28 -07:00
b5f7720ab9 docstrings for container and batchnorm 2016-09-16 05:31:36 -04:00
fb39971464 Add more modules to nn 2016-09-14 11:05:56 -07:00
b738b09606 Clean up Module forward and __call__ (#14)
* _forward is renamed forward since users should override it

 * some __call__ overrides are changed to forward

* functions which return a single variable are changed to return that
   variable instead of a one-element tuple (see the sketch below)
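
A minimal illustration of the convention after this change (module name is hypothetical): users override forward and invoke the module via __call__, and a single tensor comes back rather than a one-element tuple:

    import torch
    from torch import nn

    class Doubler(nn.Module):
        def forward(self, x):      # override forward, not __call__
            return 2 * x

    m = Doubler()
    y = m(torch.ones(3))           # __call__ dispatches to forward (plus hooks)
    print(y)                       # tensor([2., 2., 2.]), a single tensor
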
2016-09-07 15:41:39 -04:00
959304bc0d Make BatchNorm2d inherit from BatchNorm 2016-08-30 13:25:01 -07:00
ea93fb7ac0 Add more nn modules 2016-08-23 19:15:21 -07:00