40 Commits

596b418391 [BE][PYFMT] migrate PYFMT for {torch,test}/{nn,optim}/** to ruff format (#144548)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144548
Approved by: https://github.com/ezyang
2025-06-14 11:27:04 +00:00
9795dba1e0 Optim package docstring fix (#129086)
Fix docstrings in various files in the optim package. This is the last remaining fix for issue #112593.

The fix can be verified by running `pydocstyle <path-to-file> --count`.

Fixes #112593

Related #128248

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129086
Approved by: https://github.com/janeyx99
2024-06-21 14:30:53 +00:00
560efaa471 Part 1: UFMT partial files in torch/optim due to the pr-sanity-checks (#124053)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124053
Approved by: https://github.com/ezyang
ghstack dependencies: #124048
2024-04-16 03:17:18 +00:00
1fd119948e [3/3] Update .pyi Python stub files and enable 'UFMT' linter (#95268)
Changes:

- #95200

1. Recognize `.py.in` and `.pyi.in` files as Python in VS Code for a better development experience.
2. Fix deep setting merge in `tools/vscode_settings.py`.

- #95267

3. Use `NamedTuple` rather than `namedtuple + __annotations__` for `torch.nn.utils.rnn.PackedSequence_`:

    `namedtuple + __annotations__`:

    ```python
    PackedSequence_ = namedtuple('PackedSequence_',
                                 ['data', 'batch_sizes', 'sorted_indices', 'unsorted_indices'])

    # type annotation for PackedSequence_ to make it compatible with TorchScript
    PackedSequence_.__annotations__ = {'data': torch.Tensor, 'batch_sizes': torch.Tensor,
                                       'sorted_indices': Optional[torch.Tensor],
                                       'unsorted_indices': Optional[torch.Tensor]}
    ```

    `NamedTuple` (Python 3.6+):

    ```python
    class PackedSequence_(NamedTuple):
        data: torch.Tensor
        batch_sizes: torch.Tensor
        sorted_indices: Optional[torch.Tensor]
        unsorted_indices: Optional[torch.Tensor]
    ```

- => this PR: #95268

4. Sort import statements and remove unnecessary imports in `.pyi`, `.pyi.in` files.
5. Format `.pyi`, `.pyi.in` files and remove unnecessary ellipsis `...` in type stubs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95268
Approved by: https://github.com/huydhn
2023-03-01 23:50:56 +00:00
715a0dc5c0 [PyTorch/d2go] fix optim _multi_tensor (#73215)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73215

Fixes an issue in the `_multi_tensor` optimizers, specifically in `sgd_mt`, introduced in 2cb03e926f

Reviewed By: mikaylagawarecki

Differential Revision: D34389034

fbshipit-source-id: ede153d52dca15909c6c022853589707f18dc8d1
(cherry picked from commit cc8a58e58459265414cceb975697e5bf83de154d)
2022-02-23 10:29:48 +00:00
2a5aaf1c49 Optim foreach cleanup for AdamW (#70484)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70484

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33767869

Pulled By: mikaylagawarecki

fbshipit-source-id: 2f5273bbfeea3ed502c5d77da4bebe1674243e86
(cherry picked from commit 2dd9b77917d67223012cfe1719d0919a422c5428)
2022-02-15 18:02:08 +00:00
dff58d519f Optim foreach cleanup for Rprop (#70483)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70483

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33767866

Pulled By: mikaylagawarecki

fbshipit-source-id: ffc5ae68eeea8fa09385862b853b731554b77bcb
(cherry picked from commit 3a0fe295807bb4519884a1838edeea1a9d222e41)
2022-02-15 18:02:08 +00:00
ce3094f5f6 Optim foreach cleanup for Rmsprop (#70482)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70482

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33767862

Pulled By: mikaylagawarecki

fbshipit-source-id: 8e2e9c986d5a3774093a79755940372945f1b3a9
(cherry picked from commit baea53727711fcc083e1c18641afd1e617c24495)
2022-02-15 18:02:08 +00:00
2cb03e926f Optim foreach cleanup for SGD (#70481)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70481

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33767868

Pulled By: mikaylagawarecki

fbshipit-source-id: 89b9227a4ddf99602855973cbc343c58ae3d5328
(cherry picked from commit ffea8ddcfd39f3f33e18d1c7b2b903d5464d5eb9)
2022-02-15 18:02:08 +00:00
5f9590681d Optim foreach cleanup for Adam (#70295)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70295

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33767870

Pulled By: mikaylagawarecki

fbshipit-source-id: f922f15ecb0307458c8ecee737325c42c4f3ce8b
(cherry picked from commit 66233a8a3eaa073acdaeaa16ca83413da8a2d969)
2022-02-15 18:02:08 +00:00
0972db5b7d Optim foreach cleanup for ASGD (#70231)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70231

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33767867

Pulled By: mikaylagawarecki

fbshipit-source-id: 4406824acbb6f427d52c1ced2d8a02a98c943b86
(cherry picked from commit cbd9a4da15e0c99410a53232aa0816050097dc3e)
2022-02-09 16:52:13 +00:00
5948522e9c Optim foreach cleanup for RAdam (#70230)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70230

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33767874

Pulled By: mikaylagawarecki

fbshipit-source-id: 9379db24266a7bbcc2c23849f87ae0af2e6729c0
(cherry picked from commit ecf7b31fc39ccfeeef36bb763ca8c96960be3577)
2022-02-09 16:52:13 +00:00
3653f07c7c Optim foreach cleanup for NAdam (#70229)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70229

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33767873

Pulled By: mikaylagawarecki

fbshipit-source-id: 833ead14c1d1659351ebfbeb41045a3c7eb96dad
(cherry picked from commit 9415df6b5c9620c9d53036c28fe3f297c6d4906c)
2022-02-09 16:52:13 +00:00
d9acfef831 Optim foreach cleanup for Adamax (#69982)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69982

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33767865

Pulled By: mikaylagawarecki

fbshipit-source-id: c5efd351e359825d38b71f57a2c61a2055c3c114
(cherry picked from commit 37bb80c2d7b441c5718cee6f1b37653d4937e20a)
2022-02-09 16:52:13 +00:00
dabfea8363 Optim foreach cleanup for Adagrad (#69981)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69981

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33767863

Pulled By: mikaylagawarecki

fbshipit-source-id: 1c99abe4ac4eb2a9eb896dff4837b539b94f68e7
(cherry picked from commit 61c28d0645046b67050efaf0d4617203126cbd30)
2022-02-09 16:52:12 +00:00
8e8d170674 Optim foreach cleanup for Adadelta (#69980)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69980

- Merged `torch/optim/adadelta.py` and `torch/optim/_multi_tensor/adadelta.py` into `torch/optim/adadelta.py`
- Moved the Adadelta functional forms from `torch/optim/_functional.py` and `torch/optim/_multi_tensor/_functional.py` to `torch/optim/adadelta.py`
- `torch/optim/_functional.py` now just imports from `torch/optim/adadelta.py`
- Added a test `test_optimizers_foreach_flag` that replicates `test_multi_tensor_optimizers` in `test/test_optim.py`
- Added a test `test_adadelta_new` that replicates the behavior of `test_adadelta` but with the `foreach` flag instead of the multi-tensor Adadelta class. If we delete `_multi_tensor/`, we could replace `test_adadelta` with this

Remaining TODO:

- [ ] Single-tensor Adadelta supports complex numbers but the multi-tensor path does not; integrate the single-tensor logic into the multi-tensor path and switch `test_adadelta_complex` to test `foreach` in `[True, False]`
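
For context, the merged layout typically dispatches between a per-tensor loop and a horizontally fused `torch._foreach_*` path based on the `foreach` flag. A minimal sketch of that pattern (illustrative only, not the code merged in this PR; the helper names are hypothetical):

```python
import torch
from typing import Optional

def _single_tensor_adadelta(params, grads, square_avgs, acc_deltas, *, lr, rho, eps):
    # Per-tensor loop: one kernel launch per op per parameter.
    for param, grad, square_avg, acc_delta in zip(params, grads, square_avgs, acc_deltas):
        square_avg.mul_(rho).addcmul_(grad, grad, value=1 - rho)
        std = square_avg.add(eps).sqrt_()
        delta = acc_delta.add(eps).sqrt_().div_(std).mul_(grad)
        acc_delta.mul_(rho).addcmul_(delta, delta, value=1 - rho)
        param.add_(delta, alpha=-lr)

def _multi_tensor_adadelta(params, grads, square_avgs, acc_deltas, *, lr, rho, eps):
    # Horizontally fused path: each torch._foreach_* call covers the whole list.
    torch._foreach_mul_(square_avgs, rho)
    torch._foreach_addcmul_(square_avgs, grads, grads, value=1 - rho)
    std = torch._foreach_add(square_avgs, eps)
    torch._foreach_sqrt_(std)
    deltas = torch._foreach_add(acc_deltas, eps)
    torch._foreach_sqrt_(deltas)
    torch._foreach_div_(deltas, std)
    torch._foreach_mul_(deltas, grads)
    torch._foreach_mul_(acc_deltas, rho)
    torch._foreach_addcmul_(acc_deltas, deltas, deltas, value=1 - rho)
    torch._foreach_add_(params, deltas, alpha=-lr)

def adadelta(params, grads, square_avgs, acc_deltas, *, lr=1.0, rho=0.9, eps=1e-6,
             foreach: Optional[bool] = None):
    # The merged functional picks an implementation based on the flag.
    impl = _multi_tensor_adadelta if foreach else _single_tensor_adadelta
    impl(params, grads, square_avgs, acc_deltas, lr=lr, rho=rho, eps=eps)
```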

Test Plan: Imported from OSS

Reviewed By: VitalyFedyunin, albanD

Differential Revision: D33413059

Pulled By: mikaylagawarecki

fbshipit-source-id: 92a9fa98705762bb6bd464261671e49aef40070e
(cherry picked from commit a008227d227749d79367d7d592bcefcf51c22df5)
2022-02-09 16:52:12 +00:00
8bb1d06702 [optim] ASGD fold state updates into functional and pass list of vars rather than states (#71335)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71335

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33767871

Pulled By: mikaylagawarecki

fbshipit-source-id: 84ebe1fafb1c27572f08c8c8026c882dd7e054c1
(cherry picked from commit 7613ebb3914b257637ed67b1945b2a4694f4a258)
2022-02-08 23:58:41 +00:00
ccc1a01dcb [optim] NAdam fold state updates into functional (#71334)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71334

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33767864

Pulled By: mikaylagawarecki

fbshipit-source-id: 4d985e9e346f40110bd4231e0f16e5643fbc448d
(cherry picked from commit 58aa77e367f7c863563c0469ef9df818236ed57c)
2022-02-08 23:58:41 +00:00
7176c92687 [optim] update step in functional and pass state_steps instead of state (#71333)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71333

Updated
- Adagrad
- Adamax
- Adam
- AdamW
- RAdam
- Make the multi-tensor functionals take `state_steps: List[Tensor]` instead of `states: List[Dict]`
- Change `state_steps` from `List[int]` to `List[Tensor]`, where each entry is a singleton tensor, so the step count can be updated within the functional

(NAdam and ASGD were updated in separate diffs to fold their handling of state into the functionals.)
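
A small runnable sketch of why the singleton-tensor representation matters (hedged; `sgd_with_step_count` is an illustrative stand-in, not a PyTorch functional):

```python
import torch
from typing import List

def sgd_with_step_count(params: List[torch.Tensor],
                        grads: List[torch.Tensor],
                        state_steps: List[torch.Tensor],
                        *, lr: float) -> None:
    # Before: callers passed `states: List[Dict]` and bumped state['step'] themselves.
    # After: each step is a singleton tensor, so incrementing it in place here is
    # visible through the caller's state dict -- a plain List[int] could not do that.
    torch._foreach_add_(state_steps, 1)
    torch._foreach_add_(params, grads, alpha=-lr)

params = [torch.randn(3) for _ in range(2)]
grads = [torch.randn(3) for _ in range(2)]
steps = [torch.tensor(0.0) for _ in range(2)]  # one singleton tensor per param
sgd_with_step_count(params, grads, steps, lr=0.1)
assert all(s.item() == 1 for s in steps)  # the step advanced inside the functional
```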

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33767872

Pulled By: mikaylagawarecki

fbshipit-source-id: 9baa7cafb6375eab839917df9287c65a437891f2
(cherry picked from commit 831c02b3d0f585f61165ead368213f94b97a99ee)
2022-02-08 16:51:19 +00:00
e1b84e1b6b fix loading of older models that don't have maximize (#71023)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71023

Reviewed By: jbschlosser

Differential Revision: D33483687

Pulled By: albanD

fbshipit-source-id: 2f3c6e97a9579be9ba15eca0756fc1f2c466fbb6
2022-01-10 06:01:24 -08:00
3a21f38a2e Integrate multi_tensor zero_grad into Optimizer base class (#69936)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69936

Currently, the optimizers in `torch/optim/_multi_tensor/` all override the base Optimizer class' implementation of `zero_grad` with the same foreach zero_grad implementation (e.g. [here](https://github.com/pytorch/pytorch/blob/master/torch/optim/_multi_tensor/adadelta.py#L93-L114)). There is a TODO that indicates that this should be refactored to the base class once the foreach ops are in good shape. This PR is intended to address that TODO.
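
Roughly, the shared base-class implementation buckets gradients and zeroes each bucket with one fused call (a simplified sketch, not the verbatim PyTorch code):

```python
import torch

def foreach_zero_grad(optimizer: torch.optim.Optimizer, set_to_none: bool = False) -> None:
    # Gather grads into per-(device, dtype) buckets, then issue a single fused
    # torch._foreach_zero_ per bucket instead of one zero_ call per tensor.
    buckets = {}
    for group in optimizer.param_groups:
        for p in group["params"]:
            if p.grad is None:
                continue
            if set_to_none:
                p.grad = None
            else:
                p.grad.detach_()
                p.grad.requires_grad_(False)
                buckets.setdefault((p.grad.device, p.grad.dtype), []).append(p.grad)
    for grads in buckets.values():
        torch._foreach_zero_(grads)
```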

Test Plan: Imported from OSS

Reviewed By: mrshenli

Differential Revision: D33346748

Pulled By: mikaylagawarecki

fbshipit-source-id: 6573f4776aeac757b6a778894681868191a1b4c7
2022-01-05 15:46:23 -08:00
a9c7d626e1 Add the maximize flag to AdamW (#70146)
Summary:
Related issue: https://github.com/pytorch/pytorch/issues/68052

cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse SciPioneer H-Huang

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70146

Reviewed By: malfet

Differential Revision: D33254561

Pulled By: albanD

fbshipit-source-id: f190c836a4162f936c5953e076747c345df21421
2021-12-23 09:20:29 -08:00
3d358a7678 Adds a maximize flag to Adam (#68164)
Summary:
Solves the next most important use case in https://github.com/pytorch/pytorch/issues/68052.

I have kept the style as close to that in SGD as seemed reasonable, given the slight differences in their internal implementations.

All feedback welcome!

cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse SciPioneer H-Huang

Pull Request resolved: https://github.com/pytorch/pytorch/pull/68164

Reviewed By: VitalyFedyunin

Differential Revision: D32994129

Pulled By: albanD

fbshipit-source-id: 65c57c3f3dbbd3e3e5338d51def54482503e8850
2021-12-13 05:53:53 -08:00
f8297d40fc Adds a maximize flag to SGD. (#67847)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/46480 -- for SGD.
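
Semantically, `maximize=True` flips the sign of the gradients before the usual minimizing update, so the same step ascends the objective. A minimal sketch of that idea (illustrative; `sgd_step` is a hypothetical name, not this PR's code):

```python
import torch
from typing import List

def sgd_step(params: List[torch.Tensor], grads: List[torch.Tensor],
             *, lr: float, maximize: bool = False) -> None:
    # Descending on -g is the same as ascending on g.
    if maximize:
        grads = torch._foreach_neg(grads)
    torch._foreach_add_(params, grads, alpha=-lr)
```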

## Notes:
- I have modified the existing tests to take a new `constructor_accepts_maximize` flag. When this is set to true, the `_test_basic_cases_template` function will test both maximizing and minimizing the sample function.
- This was the clearest way I could think of testing the changes -- I would appreciate feedback on this strategy.

## Work to be done:
- [ ] I need to update the docs.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67847

Reviewed By: H-Huang

Differential Revision: D32252631

Pulled By: albanD

fbshipit-source-id: 27915a3cc2d18b7e4d17bfc2d666fe7d2cfdf9a4
2021-11-09 00:43:07 -08:00
acb340de75 [Pytorch][Bootcamp] Add fixes and vanilla testing for Adagrad non-vectorized and vectorized optimizers to handle complex numbers (#66671)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66671

Made changes in the step functions of the vectorized and non-vectorized Adagrad optimizers to handle complex numbers as two real numbers, as per #65711 on GitHub
ghstack-source-id: 141442350
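
The "two real numbers" treatment amounts to viewing each complex tensor as a real tensor with a trailing (re, im) dimension and running the ordinary real-valued update on that view. A hedged sketch of the idea (not the exact code from this diff):

```python
import torch

def adagrad_update(param, grad, state_sum, *, lr: float, eps: float = 1e-10):
    # view_as_real returns a real view with a trailing dim of size 2 holding
    # (re, im); in-place updates on the view write through to the complex tensor.
    if torch.is_complex(param):
        param = torch.view_as_real(param)
        grad = torch.view_as_real(grad)
        state_sum = torch.view_as_real(state_sum)
    state_sum.addcmul_(grad, grad, value=1.0)
    std = state_sum.sqrt().add_(eps)
    param.addcdiv_(grad, std, value=-lr)
```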

Test Plan:
buck test mode/dev caffe2/test:optim -- 'test_adagrad_complex'
https://pxl.cl/1Rd44

Reviewed By: albanD

Differential Revision: D31673503

fbshipit-source-id: 90a0d0c69b556716e2d17c59ce80f09c750fc464
2021-10-25 10:13:21 -07:00
4a544df00d Implement and benchmark a torch.optim.multi_tensor.adagrad implementation (#59155)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/59155

Test Plan: Imported from OSS

Reviewed By: iramazanli

Differential Revision: D29525213

Pulled By: ramvenkat98

fbshipit-source-id: 6d7e8da91c965d1f4e955a084ed875bab641dc9a
2021-07-07 08:08:32 -07:00
f0e972a481 To add Nesterov Adam algorithm for multi-tensor optimizers API (#59165)
Summary:
Previously, in PR https://github.com/pytorch/pytorch/issues/59009, we added NAdam to the optimizers. In this PR we propose a multi-tensor version of NAdam for PyTorch.

NAdam was proposed in the paper https://openreview.net/forum?id=OM0jvwB8jIp57ZJjtNEZ and in the report http://cs229.stanford.edu/proj2015/054_report.pdf by Timothy Dozat.

It is one of the most widely used algorithms in the deep learning community.

It is worth noting that the implementation of NAdam is inspired by the Keras implementation:
f9d3868495/tensorflow/python/keras/optimizer_v2/nadam.py

Pull Request resolved: https://github.com/pytorch/pytorch/pull/59165

Reviewed By: vincentqb

Differential Revision: D29360577

Pulled By: iramazanli

fbshipit-source-id: 0fe14016303b2df2cb8cc31912a2674acf63d1e5
2021-06-27 17:00:41 -07:00
5563f4bda0 To add Rectified Adam algorithm for multi-tensor optimizers API (#59161)
Summary:
Previously, in PR https://github.com/pytorch/pytorch/issues/58968, we added RAdam to the optimizers. In this PR we propose a multi-tensor version of RAdam for PyTorch.

RAdam was proposed in the paper https://arxiv.org/pdf/1908.03265.pdf by Liyuan Liu et al.

It is one of the most widely used algorithms in the deep learning community.

Differing from the paper, we selected a variance tractability cut-off of 5 instead of 4, as is common practice.
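
For reference, the cut-off gates when RAdam uses the variance-rectified adaptive step rather than an SGD-with-momentum-style fallback. A small sketch of the check, following the paper's formula for the approximated SMA length rho_t (names here are illustrative):

```python
def use_rectified_update(step: int, beta2: float, cutoff: float = 5.0) -> bool:
    # step is 1-indexed; rho_inf is the maximum length of the approximated SMA.
    rho_inf = 2.0 / (1.0 - beta2) - 1.0
    rho_t = rho_inf - 2.0 * step * beta2**step / (1.0 - beta2**step)
    return rho_t > cutoff  # below the cut-off, fall back to the unadapted step
```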

Pull Request resolved: https://github.com/pytorch/pytorch/pull/59161

Reviewed By: vincentqb

Differential Revision: D29360576

Pulled By: iramazanli

fbshipit-source-id: 7ccdbf12b1ee7f12e66f7d7992123a70cc818b6b
2021-06-27 13:01:20 -07:00
e1bd4963e2 To introduce Functional API for multi-tensor (#60735)
Summary:
In this PR we port the multi-tensor optimizers to the functional API.

As we can see in https://github.com/pytorch/pytorch/blob/master/torch/optim/_functional.py, a functional API has been defined for most optimizers. However, we do not have a similar file / functionality for the multi-tensor optimizers:
https://github.com/pytorch/pytorch/tree/master/torch/optim/_multi_tensor

Therefore we add it in this PR.
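
The shape of the split is: a pure function that owns the math over flat tensor lists, and an `Optimizer` subclass that owns the state and calls into it. A hedged sketch of the pattern (illustrative names, not this PR's code):

```python
import torch
from typing import List

def sgd_functional(params: List[torch.Tensor], grads: List[torch.Tensor],
                   *, lr: float) -> None:
    # All the math lives in the functional, operating on tensor lists.
    torch._foreach_add_(params, grads, alpha=-lr)

class MultiTensorSGD(torch.optim.Optimizer):
    def __init__(self, params, lr=1e-3):
        super().__init__(params, defaults=dict(lr=lr))

    @torch.no_grad()
    def step(self):
        for group in self.param_groups:
            params = [p for p in group["params"] if p.grad is not None]
            grads = [p.grad for p in params]
            sgd_functional(params, grads, lr=group["lr"])
```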

Pull Request resolved: https://github.com/pytorch/pytorch/pull/60735

Reviewed By: vincentqb

Differential Revision: D29392253

Pulled By: iramazanli

fbshipit-source-id: cebc8e7b07ab11156370f5297cfb419cd9f20b46
2021-06-25 13:09:26 -07:00
5bcbbf5373 Lint trailing newlines (#54737)
Summary:
*Context:* https://github.com/pytorch/pytorch/issues/53406 added a lint for trailing whitespace at the ends of lines. However, in order to pass FB-internal lints, that PR also had to normalize the trailing newlines in four of the files it touched. This PR adds an OSS lint to normalize trailing newlines.

The changes to the following files (made in 54847d0adb9be71be4979cead3d9d4c02160e4cd) are the only manually-written parts of this PR:

- `.github/workflows/lint.yml`
- `mypy-strict.ini`
- `tools/README.md`
- `tools/test/test_trailing_newlines.py`
- `tools/trailing_newlines.py`

I would have liked to make this just a shell one-liner like the other three similar lints, but nothing I could find quite fit the bill. Specifically, all the answers I tried from the following Stack Overflow questions were far too slow (at least a minute and a half to run on this entire repository):

- [How to detect file ends in newline?](https://stackoverflow.com/q/38746)
- [How do I find files that do not end with a newline/linefeed?](https://stackoverflow.com/q/4631068)
- [How to list all files in the Git index without newline at end of file](https://stackoverflow.com/q/27624800)
- [Linux - check if there is an empty line at the end of a file [duplicate]](https://stackoverflow.com/q/34943632)
- [git ensure newline at end of each file](https://stackoverflow.com/q/57770972)

To avoid giving false positives during the few days after this PR is merged, we should probably only merge it after https://github.com/pytorch/pytorch/issues/54967.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/54737

Test Plan:
Running the shell script from the "Ensure correct trailing newlines" step in the `quick-checks` job of `.github/workflows/lint.yml` should print no output and exit in a fraction of a second with a status of 0. That was not the case prior to this PR, as shown by this failing GHA workflow run on an earlier draft of this PR:

- https://github.com/pytorch/pytorch/runs/2197446987?check_suite_focus=true

In contrast, this run (after correcting the trailing newlines in this PR) succeeded:

- https://github.com/pytorch/pytorch/pull/54737/checks?check_run_id=2197553241

To unit-test `tools/trailing_newlines.py` itself (this is run as part of our "Test tools" GitHub Actions workflow):
```
python tools/test/test_trailing_newlines.py
```

Reviewed By: malfet

Differential Revision: D27409736

Pulled By: samestep

fbshipit-source-id: 46f565227046b39f68349bbd5633105b2d2e9b19
2021-03-30 13:09:52 -07:00
8c798e0622 Forbid trailing whitespace (#53406)
Summary:
Context: https://github.com/pytorch/pytorch/pull/53299#discussion_r587882857

These are the only hand-written parts of this diff:
- the addition to `.github/workflows/lint.yml`
- the file endings changed in these four files (to appease FB-internal land-blocking lints):
  - `GLOSSARY.md`
  - `aten/src/ATen/core/op_registration/README.md`
  - `scripts/README.md`
  - `torch/csrc/jit/codegen/fuser/README.md`

The rest was generated by running this command (on macOS):
```
git grep -I -l ' $' -- . ':(exclude)**/contrib/**' ':(exclude)third_party' | xargs gsed -i 's/ *$//'
```

I looked over the auto-generated changes and didn't see anything that looked problematic.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/53406

Test Plan:
This run (after adding the lint but before removing existing trailing spaces) failed:
- https://github.com/pytorch/pytorch/runs/2043032377

This run (on the tip of this PR) succeeded:
- https://github.com/pytorch/pytorch/runs/2043296348

Reviewed By: walterddr, seemethere

Differential Revision: D26856620

Pulled By: samestep

fbshipit-source-id: 3f0de7f7c2e4b0f1c089eac9b5085a58dd7e0d97
2021-03-05 17:22:55 -08:00
e6779d4357 [*.py] Rename "Arguments:" to "Args:" (#49736)
Summary:
I've written custom parsers and emitters for everything from docstrings to classes and functions. However, I recently came across an issue when I was parsing/generating from the TensorFlow codebase: inconsistent use of `Args:` and `Arguments:` in its docstrings.

```sh
(pytorch#c348fae)$ for name in 'Args:' 'Arguments:'; do
    printf '%-10s %04d\n' "$name" "$(rg -IFtpy --count-matches "$name" | paste -s -d+ -- | bc)"; done
Args:      1095
Arguments: 0336
```

It is easy enough to extend my parsers to support both variants; however, it looks like `Arguments:` is wrong anyway, as per:

  - https://google.github.io/styleguide/pyguide.html#doc-function-args @ [`ddccc0f`](https://github.com/google/styleguide/blob/ddccc0f/pyguide.md)

  - https://chromium.googlesource.com/chromiumos/docs/+/master/styleguide/python.md#describing-arguments-in-docstrings @ [`9fc0fc0`](https://chromium.googlesource.com/chromiumos/docs/+/9fc0fc0/styleguide/python.md)

  - https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html @ [`c0ae8e3`](https://github.com/sphinx-contrib/napoleon/blob/c0ae8e3/docs/source/example_google.rst)

Therefore, only `Args:` is valid. This PR replaces them throughout the codebase.

PS: For related PRs, see tensorflow/tensorflow/pull/45420

PPS: The trackbacks automatically appearing below are sending the same changes to other repositories in the [PyTorch](https://github.com/pytorch) organisation.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49736

Reviewed By: albanD

Differential Revision: D25710534

Pulled By: soumith

fbshipit-source-id: 61e8ff01abb433e9f78185c2d1d0cbd7c22c1619
2020-12-28 09:34:47 -08:00
6230e337d5 Add torch._foreach_zero_ API (#47286)
Summary:
**In this PR**
- Add the `_foreach_zero_` API
- Update all optimizers under `_multi_tensor/` to use `_foreach_zero_` in the `zero_grad` method

Performance improvement:

```
----------------- OP:  zero_  -----------------
for-loop: 630.36 us
foreach: 90.84 us
```

Script:

```
import torch
import torch.optim as optim
import torch.nn as nn
import torchvision
import torch.utils.benchmark as benchmark_utils

inputs = [torch.rand(3, 200, 200, device="cuda") for _ in range(100)]

def main():
    for op in [
            "zero_"
        ]:
        print("\n\n----------------- OP: ", op, " -----------------")
        stmt = "[torch.{op}(t) for t in inputs]"
        timer = benchmark_utils.Timer(
            stmt=stmt.format(op = op),
            globals=globals(),
            label="str(optimizer)",
        )
        print(f"autorange:\n{timer.blocked_autorange()}\n\n")

        stmt = "torch._foreach_{op}(inputs)"
        timer_mta = benchmark_utils.Timer(
            stmt=stmt.format(op = op),
            globals=globals(),
            label="str(optimizer_mta)",
        )
        print(f"autorange:\n{timer_mta.blocked_autorange()}\n\n")

if __name__ == "__main__":
    main()

```
**TODO**
- Refactor zero_grad once foreach APIs are stable.

**Tested** via unit tests

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47286

Reviewed By: ngimel

Differential Revision: D24706240

Pulled By: izdeby

fbshipit-source-id: aac69d6d134d65126ae8e5916f3627b73d8a94bf
2020-12-16 20:04:25 -08:00
e7564b076c Refactor scalar list APIs to use overloads (#45673)
Summary:
Refactor foreach APIs to use overloads in case of scalar list inputs.
Tested via unit tests.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/45673

Reviewed By: heitorschueroff

Differential Revision: D24053424

Pulled By: izdeby

fbshipit-source-id: 35976cc50b4acfe228a32ed26cede579d5621cde
2020-10-19 09:28:49 -07:00
8a074af929 Added scalar lists APIs for addcdiv and addcmul (#45932)
Summary:
1) Added new APIs:

```
_foreach_addcdiv(Tensor(a!)[] self, Tensor[] tensor1, Tensor[] tensor2, float[] scalars)
_foreach_addcdiv_(Tensor(a!)[] self, Tensor[] tensor1, Tensor[] tensor2, float[] scalars)
_foreach_addcmul(Tensor(a!)[] self, Tensor[] tensor1, Tensor[] tensor2, float[] scalars)
_foreach_addcmul_(Tensor(a!)[] self, Tensor[] tensor1, Tensor[] tensor2, float[] scalars)
```

2) Updated optimizers to use new APIs

Tested via unit tests
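
For illustration, the scalar-list overload lets each tensor in the list use its own scalar within one fused call. A small hedged usage sketch (the tensors are placeholders):

```python
import torch

params = [torch.zeros(3) for _ in range(2)]
tensor1 = [torch.ones(3) for _ in range(2)]
tensor2 = [torch.full((3,), 2.0) for _ in range(2)]

# One fused call, with a distinct scalar per list element:
torch._foreach_addcmul_(params, tensor1, tensor2, [0.5, -0.5])
# params[0] is now 0 + 0.5 * (1 * 2) = 1.0 elementwise; params[1] is -1.0
```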

Pull Request resolved: https://github.com/pytorch/pytorch/pull/45932

Reviewed By: navahgar

Differential Revision: D24150306

Pulled By: izdeby

fbshipit-source-id: c2e65dedc95d9d81a2fdd116e41df0accb0b6f26
2020-10-14 08:12:37 -07:00
1a57b390e8 Add torch._foreach_maximum(TensorList, TensorList) & torch._foreach_minimum(TensorList, TensorList) APIs (#45692)
Summary:
- Added the `torch._foreach_maximum(TensorList, TensorList)` API
- Added the `torch._foreach_minimum(TensorList, TensorList)` API
- Updated the Adam/AdamW optimizers

Tested via unit tests
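
These list-wise elementwise ops are what the amsgrad path in Adam/AdamW needs: a running elementwise max across tensor lists in a single call. A hedged usage sketch:

```python
import torch

max_exp_avg_sqs = [torch.tensor([1.0, 5.0]), torch.tensor([2.0, 2.0])]
exp_avg_sqs = [torch.tensor([3.0, 4.0]), torch.tensor([1.0, 9.0])]

# Elementwise max across the two lists in one fused call, as used for
# amsgrad's running max of the second-moment estimate.
maxes = torch._foreach_maximum(max_exp_avg_sqs, exp_avg_sqs)
print(maxes)  # [tensor([3., 5.]), tensor([2., 9.])]
```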

Pull Request resolved: https://github.com/pytorch/pytorch/pull/45692

Reviewed By: anjali411

Differential Revision: D24142464

Pulled By: izdeby

fbshipit-source-id: 6a4fc343a1613cb1e26c8398450ac9cea0a2eb51
2020-10-13 09:22:30 -07:00
8c309fc052 Add more tests for mt optimizers (#45475)
Summary:
Add more test cases for mt optimizers and fix Adam/AdamW

Pull Request resolved: https://github.com/pytorch/pytorch/pull/45475

Reviewed By: soumith

Differential Revision: D23982727

Pulled By: izdeby

fbshipit-source-id: 4b24d37bd52a2fa3719d3e3a5dcf3b96990b0f5b
2020-09-28 23:59:58 -07:00
722faeb2a4 [RELAND] Added optimizers based on multi tensor apply (#45408)
Summary:
Original PR: https://github.com/pytorch/pytorch/pull/45299. The present PR fixes the minor bugs that caused the revert.

Adding a new namespace `torch.optim._multi_tensor` with a set of updated optimizers. These optimizers use the `_foreach` APIs, which improve performance significantly.

### Tests
- updated existing tests to use both optimizers
- added `test_multi_tensor_optimizers` test to verify correctness.

### Perf results

**Adam**
timeit: 42.69 ms --> 10.16 ms
autorange: 41.96 ms --> 10.28 ms

**AdamW**
timeit: 51.38 ms --> 15.63 ms
autorange: 50.82 ms --> 16.07 ms

**SGD**
timeit: 6.28 ms --> 4.40 ms
autorange: 6.13 ms --> 4.73 ms

**RMSprop**
timeit: 28.63 ms --> 5.89 ms
autorange: 28.27 ms -->  5.76 ms

**Rprop**
timeit: 213.30 --> 178.42
autorange: 212.03 --> 178.03

**ASGD**
timeit: 21.67 --> 9.33
autorange: 21.64 --> 9.27

**Adamax**
timeit: 55.60 --> 48.29
autorange: 55.22 -> 49.13

**Perf script used**

```
import torch
import torch.optim as optim
import torch.nn as nn
import torchvision
import torch.utils._benchmark as benchmark_utils

device = "cuda"
model = torchvision.models.resnet.resnet101(pretrained=True).to(device)
targets = torch.randint(0, 1000, (100, 100), device=device)
criterion = nn.CrossEntropyLoss()

optimizer = optim.SGD(model.parameters(), lr=1e-3) # <----------------------- optimizer.
                                                          # would compare optim.SGD vs optim._multi_tensor.SGD
running_loss = 0.0
target = torch.empty(128, dtype=torch.long, device=device).random_(5)

optimizer.zero_grad()
inputs = torch.rand(128, 3, 100, 100, device=device , requires_grad=True)
outputs = model(inputs)
loss = criterion(outputs, target)
loss.backward()
optimizer.step()
running_loss += loss.item()

def main():
    timer = benchmark_utils.Timer(
        stmt="optimizer.step()",
        globals=globals(),
        label="str(optimizer)",
    )

    for i in range(1):
        print(f"Run: {i}\n{'-' * 40}")
        print(f"timeit:\n{timer.timeit(1000)}\n")
        print(f"autorange:\n{timer.blocked_autorange()}\n\n")

if __name__ == "__main__":
    main()
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/45408

Reviewed By: gchanan

Differential Revision: D23956680

Pulled By: izdeby

fbshipit-source-id: c5eab7bf5fce14a287c15cead1cdc26e42cfed94
2020-09-28 13:14:04 -07:00
54a253fded Revert D23931987: Added optimizers based on multi tensor apply
Test Plan: revert-hammer

Differential Revision:
D23931987 (2b21e7767e)

Original commit changeset: 582134ef2d40

fbshipit-source-id: ffd500aea55fda34155442fb15e2529cb9c00100
2020-09-26 18:11:54 -07:00
2b21e7767e Added optimizers based on multi tensor apply (#45299)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45299

Adding a new namespace `torch.optim._multi_tensor` with a set of updated optimizers. These optimizers use the `_foreach` APIs, which improve performance significantly.

### Tests
- updated existing tests to use both optimizers
- added `test_multi_tensor_optimizers` test to verify correctness.

### Perf results

**Adam**
timeit: 42.69 ms --> 10.16 ms
autorange: 41.96 ms --> 10.28 ms

**AdamW**
timeit: 51.38 ms --> 15.63 ms
autorange: 50.82 ms --> 16.07 ms

**SGD**
timeit: 6.28 ms --> 4.40 ms
autorange: 6.13 ms --> 4.73 ms

**RMSprop**
timeit: 28.63 ms --> 5.89 ms
autorange: 28.27 ms -->  5.76 ms

**Rprop**
timeit: 213.30 --> 178.42
autorange: 212.03 --> 178.03

**ASGD**
timeit: 21.67 --> 9.33
autorange: 21.64 --> 9.27

**Adamax**
timeit: 55.60 --> 48.29
autorange: 55.22 -> 49.13

**Perf script used**

```
import torch
import torch.optim as optim
import torch.nn as nn
import torchvision
import torch.utils._benchmark as benchmark_utils

device = "cuda"
model = torchvision.models.resnet.resnet101(pretrained=True).to(device)
targets = torch.randint(0, 1000, (100, 100), device=device)
criterion = nn.CrossEntropyLoss()

optimizer = optim.SGD(model.parameters(), lr=1e-3) # <----------------------- optimizer.
                                                          # would compare optim.SGD vs optim._multi_tensor.SGD
running_loss = 0.0
target = torch.empty(128, dtype=torch.long, device=device).random_(5)

optimizer.zero_grad()
inputs = torch.rand(128, 3, 100, 100, device=device , requires_grad=True)
outputs = model(inputs)
loss = criterion(outputs, target)
loss.backward()
optimizer.step()
running_loss += loss.item()

def main():
    timer = benchmark_utils.Timer(
        stmt="optimizer.step()",
        globals=globals(),
        label="str(optimizer)",
    )

    for i in range(1):
        print(f"Run: {i}\n{'-' * 40}")
        print(f"timeit:\n{timer.timeit(1000)}\n")
        print(f"autorange:\n{timer.blocked_autorange()}\n\n")

if __name__ == "__main__":
    main()
```

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D23931987

Pulled By: izdeby

fbshipit-source-id: 582134ef2d402909d27d89a45c5b588fb7130ea1
2020-09-26 12:17:43 -07:00