Adds the `differentiable` argument, a method for updating parameters in an existing optimizer, and a template for testing the differentiability of multiple optimizers.
This is all based on discussions with @albanD & @jbschlosser.
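For illustration, a rough sketch of the kind of differentiable step this enables, loosely modeled on the testing template mentioned above (the helper name and exact call pattern here are assumptions, not code from the PR):
```python
import torch
from torch.optim import SGD

def take_differentiable_step(p_init, grad):
    # Clone so the update happens on a non-leaf tensor and stays in the autograd graph.
    p = p_init.clone()
    p.grad = grad
    opt = SGD([p], lr=0.1, differentiable=True)
    opt.step()  # with differentiable=True the update is not wrapped in no_grad
    return p

p0 = torch.rand(3, dtype=torch.float64, requires_grad=True)
g = torch.rand(3, dtype=torch.float64, requires_grad=True)
# gradcheck verifies that gradients flow correctly through the optimizer update.
torch.autograd.gradcheck(take_differentiable_step, (p0, g))
```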
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80938
Approved by: https://github.com/albanD
What was happening is that when we have multiple learning rate schedulers, the order in which they are initialized is not taken into account. This is a problem if they are initialized in sequential order (as one might intuitively do).
Each scheduler calls `step()` on initialization and sets the `lr` in its optimizer's `param_groups`. However, this means that step 0 will use the `lr` that was set by the very last scheduler (in the case of initializing schedulers sequentially) instead of the first scheduler.
The fix in this PR addresses the above bug by performing a call to the appropriate scheduler on initialization, after decrementing the `last_epoch` values in order to keep them the same post-step. This ensures that the correct scheduler is the one setting the `lr` values for the optimizer's `param_groups`.
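For illustration, a hedged sketch of the scenario described above (the concrete scheduler classes and the use of `SequentialLR` are assumptions made for this example, not code taken from the PR):
```python
import torch
from torch.nn import Parameter
from torch.optim import SGD
from torch.optim.lr_scheduler import ConstantLR, ExponentialLR, SequentialLR

model = [Parameter(torch.randn(2, 2, requires_grad=True))]
optimizer = SGD(model, lr=1.0)

# Each constructor below calls step() and overwrites the optimizer's lr, so
# without the fix epoch 0 would run with the lr set by the last-initialized
# scheduler instead of the first one.
scheduler1 = ConstantLR(optimizer, factor=0.1, total_iters=2)
scheduler2 = ExponentialLR(optimizer, gamma=0.9)
scheduler = SequentialLR(optimizer, schedulers=[scheduler1, scheduler2], milestones=[2])

# With the fix, epoch 0 uses scheduler1's warm-up lr (0.1), not scheduler2's.
print(optimizer.param_groups[0]["lr"])
```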
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72856
Approved by: https://github.com/jbschlosser
Fixes #60265
The initial LR for this scheduler is not consistent when a new instance is created with `last_epoch != -1`.
Maybe we can refactor the testing code to test `last_epoch != -1` in schedulers that can recreate their state from the current epoch?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60339
Approved by: https://github.com/albanD
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69980
- Merged `torch/optim/adadelta.py` and `torch/optim/_multi_tensor/adadelta.py` into `torch/optim/adadelta.py`
- Moved adadelta functional forms from `torch/optim/_functional.py` and `torch/optim/_multi_tensor/_functional.py` to `torch/optim/adadelta.py`
- `torch/optim/_functional.py` just imports from `torch/optim/adadelta.py`
- Added a test `test_optimizers_foreach_flag` which replicates `test_multi_tensor_optimizers` in `test/test_optim.py`
- Add a test `test_adadelta_new` that replicates the behavior of `test_adadelta` but with the `foreach` flag instead of using the multi-tensor adadelta class (see the sketch below). If we delete `_multi_tensor/` we could replace `test_adadelta` with this
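For reference, a minimal example of the consolidated interface exercised by these tests (parameter values are illustrative only):
```python
import torch
from torch.optim import Adadelta

# Same Adadelta class; the multi-tensor implementation is selected via the foreach flag.
params = [torch.randn(2, 2, requires_grad=True) for _ in range(3)]
opt = Adadelta(params, lr=1.0, foreach=True)

loss = sum((p ** 2).sum() for p in params)
loss.backward()
opt.step()
```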
Remaining TODO:
- [ ] Single-tensor adadelta supports complex but multi-tensor does not; we need to integrate the single-tensor logic into the multi-tensor path and switch `test_adadelta_complex` to test for `foreach in [True, False]`
Test Plan: Imported from OSS
Reviewed By: VitalyFedyunin, albanD
Differential Revision: D33413059
Pulled By: mikaylagawarecki
fbshipit-source-id: 92a9fa98705762bb6bd464261671e49aef40070e
(cherry picked from commit a008227d227749d79367d7d592bcefcf51c22df5)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71333
Updated
- Adagrad
- Adamax
- Adam
- AdamW
- RAdam
Make the multi_tensor functionals take `state_steps: List[Tensor]` instead of `states: List[Dict]`.
Change `state_steps: List[int]` to `state_steps: List[Tensor]`, where each entry is a singleton tensor, so the step can be updated within the functional (sketched below).
NAdam and ASGD were updated in separate diffs to fold their handling of state into the functionals.
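A rough sketch of this pattern (not the actual torch.optim functional code, just an illustration of how singleton step tensors can be updated inside a functional):
```python
from typing import List

import torch

def _sketch_functional(params: List[torch.Tensor],
                       grads: List[torch.Tensor],
                       state_steps: List[torch.Tensor],
                       lr: float) -> None:
    # The functional receives singleton step tensors and bumps them itself,
    # instead of being handed plain ints whose updates would be lost.
    for param, grad, step_t in zip(params, grads, state_steps):
        step_t += 1                          # step is updated *within* the functional
        step = step_t.item()
        param.add_(grad, alpha=-lr / step)   # placeholder update that uses the step count

p = torch.zeros(3)
g = torch.ones(3)
step = torch.tensor(0.)                      # singleton tensor instead of a Python int
_sketch_functional([p], [g], [step], lr=0.1)
print(step)  # tensor(1.)
```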
Test Plan: Imported from OSS
Reviewed By: anjali411
Differential Revision: D33767872
Pulled By: mikaylagawarecki
fbshipit-source-id: 9baa7cafb6375eab839917df9287c65a437891f2
(cherry picked from commit 831c02b3d0f585f61165ead368213f94b97a99ee)
Summary:
In `TestLRScheduler._test()`, an unused variable `optimizers` is created. This PR is a minor refactoring that removes the variable and the loop block that populates the set.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70668
Reviewed By: wenleix
Differential Revision: D33586236
Pulled By: albanD
fbshipit-source-id: cabf870a8221f144df9d3e2f2b564cdc5c255f5a
Summary:
Solves the next most important use case in https://github.com/pytorch/pytorch/issues/68052.
I have kept the style as close to that in SGD as seemed reasonable, given the slight differences in their internal implementations.
All feedback welcome!
cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse SciPioneer H-Huang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68164
Reviewed By: VitalyFedyunin
Differential Revision: D32994129
Pulled By: albanD
fbshipit-source-id: 65c57c3f3dbbd3e3e5338d51def54482503e8850
Summary:
Fixes https://github.com/pytorch/pytorch/issues/67601.
As simple a fix as I could make it. I even managed to delete some testing code!
I checked calling `super()` and, as I had feared, it doesn't work out of the box, so perhaps that ought to be revisited later.
As it stands, https://github.com/pytorch/pytorch/issues/20124 still applies to the chained scheduler, but I think this change is still an improvement.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68010
Reviewed By: zou3519
Differential Revision: D32278139
Pulled By: albanD
fbshipit-source-id: 4c6f9f1b2822affdf63a6d22ddfdbcb1c6afd579
Summary:
Fixes https://github.com/pytorch/pytorch/issues/46480 -- for SGD.
## Notes:
- I have modified the existing tests to take a new `constructor_accepts_maximize` flag. When this is set to `True`, the `_test_basic_cases_template` function will test both maximizing and minimizing the sample function (see the example below).
- This was the clearest way I could think of testing the changes -- I would appreciate feedback on this strategy.
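For reference, a small example of the maximize behavior these tests exercise (the toy objective is purely illustrative):
```python
import torch
from torch.optim import SGD

w = torch.tensor([0.0], requires_grad=True)
# With maximize=True, SGD steps in the direction that increases the objective.
opt = SGD([w], lr=0.1, maximize=True)

objective = -(w - 2.0) ** 2   # maximized at w = 2
objective.sum().backward()
opt.step()
print(w)  # moves towards 2 rather than away from it
```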
## Work to be done:
- [ ] I need to update the docs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67847
Reviewed By: H-Huang
Differential Revision: D32252631
Pulled By: albanD
fbshipit-source-id: 27915a3cc2d18b7e4d17bfc2d666fe7d2cfdf9a4
Summary:
Catches deprecation warnings when we call `scheduler.step(epoch)` in tests.
Removes duplicate parameters to optimizers unless we are specifically testing for that.
Fixes https://github.com/pytorch/pytorch/issues/67696
There is one warning remaining when I run this locally -- however that is due to the implementation of the `SequentialLR` Scheduler. I will open a new issue relating to that.
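One way such a deprecation warning can be caught in a test (a sketch only; the helpers actually used in `test_optim.py` may differ):
```python
import warnings

import torch
from torch.nn import Parameter
from torch.optim import SGD
from torch.optim.lr_scheduler import StepLR

model = [Parameter(torch.randn(2, 2, requires_grad=True))]
optimizer = SGD(model, lr=1.0)
scheduler = StepLR(optimizer, step_size=3)

optimizer.step()
# Passing an epoch to step() is deprecated; catching the UserWarning keeps the
# test output clean while still exercising the legacy code path.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", UserWarning)
    scheduler.step(epoch=1)
```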
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67954
Reviewed By: H-Huang
Differential Revision: D32244056
Pulled By: albanD
fbshipit-source-id: 2ab3086a58e10c8d29809ccbaab80606a1ec61d8
Summary:
Fixes part of https://github.com/pytorch/pytorch/issues/67696 by adding calls to `optimizer.step()` in various places.
## Notes for reviewers:
- It is not entirely clear which is the right optimizer to step in each case. I have favoured the more explicit approach of creating a set of optimizers and calling step on each of them.
- At the time of writing, the only scheduler without an `optimizer` instance variable is `ChainedScheduler`, which I needed to handle separately; I use `hasattr` to do this check (see the sketch after these notes). Let me know if this ought to be changed.
- I am opening this PR for review while it only solves part of the issue, as I'd rather get feedback sooner. I think it is fine to fix the issue in several PRs too.
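A sketch of the approach described above (`_step_attached_optimizers` is a hypothetical helper name, not the function added in this PR):
```python
def _step_attached_optimizers(schedulers):
    # Collect the optimizer from every scheduler that exposes one (ChainedScheduler
    # has no `optimizer` instance variable, hence the hasattr check) and step each
    # optimizer once, so that scheduler.step() never runs before optimizer.step().
    optimizers = {s.optimizer for s in schedulers if hasattr(s, "optimizer")}
    for opt in optimizers:
        opt.step()
```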
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67756
Reviewed By: jbschlosser
Differential Revision: D32187864
Pulled By: albanD
fbshipit-source-id: fd0d133bcaa3a24588e5a997ad198fdf5879ff5a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66587
Made some changes in the step function of the non-vectorized Adadelta optimizer to handle complex numbers as two real numbers, as per issue #65711 on GitHub.
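The general idea, sketched below, is that a complex tensor can be viewed as a real tensor with a trailing dimension of size 2, so the existing real-valued update applies unchanged (an illustration, not the exact code in `torch/optim/adadelta.py`):
```python
import torch

z = torch.randn(3, dtype=torch.complex64)      # a "complex parameter"
grad = torch.randn(3, dtype=torch.complex64)   # its "complex gradient"

z_real = torch.view_as_real(z)                 # shape (3, 2): real and imaginary parts
grad_real = torch.view_as_real(grad)

z_real.add_(grad_real, alpha=-0.1)             # a plain real-valued update step
# view_as_real returns a view, so z itself has been updated in place.
print(torch.view_as_complex(z_real).equal(z))  # True
```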
ghstack-source-id: 141484731
Test Plan:
buck test mode/dev caffe2/test:optim -- 'test_adadelta_complex'
https://pxl.cl/1R7kJ
Reviewed By: albanD
Differential Revision: D31630069
fbshipit-source-id: 2741177b837960538ce39772897af36bbce7b7d8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66671
Made changes in the step function of the vectorized and non-vectorized Adagrad optimizers to handle complex numbers as two real numbers, as per issue #65711 on GitHub.
ghstack-source-id: 141442350
Test Plan:
buck test mode/dev caffe2/test:optim -- 'test_adagrad_complex'
https://pxl.cl/1Rd44
Reviewed By: albanD
Differential Revision: D31673503
fbshipit-source-id: 90a0d0c69b556716e2d17c59ce80f09c750fc464
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66501
Add testing for the Adagrad optimizer to ensure that it behaves as if complex numbers are two real numbers in R^2, as per issue #65711 on GitHub.
ghstack-source-id: 140414042
Test Plan:
buck test mode/dev caffe2/test:optim -- 'test_adagrad_complex'
https://pxl.cl/1R27M
Reviewed By: albanD
Differential Revision: D31584240
fbshipit-source-id: 5c9938084566b8ea49cc8ff002789731f62fe87e
Summary: Adding a test to ensure non-vanilla SGD behaves as if complex numbers are two real numbers in R^2, as per issue #65711 on GitHub.
Test Plan:
```buck test mode/dev caffe2/test:optim -- 'test_sgd_complex'```
https://pxl.cl/1QLxw
Reviewed By: albanD
Differential Revision: D31477212
fbshipit-source-id: 500678e561a05ac96759223b4c87a37cab26c6a6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66230
Adding a test to ensure vanilla SGD behaves as if complex numbers are two real numbers in R^2, as per issue #65711 on GitHub:
https://github.com/pytorch/pytorch/issues/65711
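A sketch of the property such a test checks (illustrative only; the actual `test_sgd_complex` implementation may differ): optimizing a complex parameter should match optimizing its real view entry-for-entry.
```python
import torch
from torch.optim import SGD

z = torch.randn(4, dtype=torch.complex128, requires_grad=True)
r = torch.view_as_real(z.detach()).clone().requires_grad_(True)  # same values as two reals

opt_z = SGD([z], lr=0.1)
opt_r = SGD([r], lr=0.1)

for _ in range(3):
    opt_z.zero_grad()
    opt_r.zero_grad()
    (z * z.conj()).real.sum().backward()   # |z|^2 as a real-valued loss on the complex param
    (r * r).sum().backward()               # the same loss on the two-real view
    opt_z.step()
    opt_r.step()

# Vanilla SGD on the complex parameter matches SGD on its two real components.
assert torch.allclose(torch.view_as_real(z.detach()), r.detach())
```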
ghstack-source-id: 139918862
Test Plan:
```buck test mode/dev caffe2/test:optim -- 'test_sgd_complex'```
https://pxl.cl/1QHvX
Reviewed By: albanD
Differential Revision: D31449289
fbshipit-source-id: da8b00421085796a23b643e73f96b19b5b560a32
Summary:
Partially resolves https://github.com/pytorch/vision/issues/4281
In this PR we are proposing a new scheduler, SequentialLR, which enables a list of different schedulers to be called in different periods of the training process.
The main motivation for this scheduler is the recently gained popularity of a warm-up phase at the start of training. It has been shown that taking small steps in the initial stages of training can speed up convergence.
With the help of SequentialLR we enable calling a small constant (or linearly increasing) learning rate schedule followed by the actual target learning rate scheduler.
```python
scheduler1 = ConstantLR(optimizer, factor=0.1, total_iters=2)
scheduler2 = ExponentialLR(optimizer, gamma=0.9)
scheduler = SequentialLR(optimizer, schedulers=[scheduler1, scheduler2], milestones=[5])
for epoch in range(100):
    train(...)
    validate(...)
    scheduler.step()
```
This code snippet will call `ConstantLR` in the first 5 epochs and follow up with `ExponentialLR` in the subsequent epochs.
This scheduler can be used to call any group of schedulers one after another. The main consideration to keep in mind is that every time we switch to a new scheduler, we assume that the new scheduler starts from the beginning, i.e. the zeroth epoch.
We also add `ChainedScheduler` to the `optim.rst` and `lr_scheduler.pyi` files here.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64037
Reviewed By: albanD
Differential Revision: D30841099
Pulled By: iramazanli
fbshipit-source-id: 94f7d352066ee108eef8cda5f0dcb07f4d371751
Summary:
Partially unblocks https://github.com/pytorch/vision/issues/4281
Previously we added warm-up schedulers to PyTorch Core in PR https://github.com/pytorch/pytorch/pull/60836, which had two modes of execution, linear and constant, depending on the warm-up function.
In this PR we are changing this interface to a more direct form, separating the linear and constant modes into separate schedulers. In particular,
```python
scheduler1 = WarmUpLR(optimizer, warmup_factor=0.1, warmup_iters=5, warmup_method="constant")
scheduler2 = WarmUpLR(optimizer, warmup_factor=0.1, warmup_iters=5, warmup_method="linear")
```
will look like
```python
scheduler1 = ConstantLR(optimizer, warmup_factor=0.1, warmup_iters=5)
scheduler2 = LinearLR(optimizer, warmup_factor=0.1, warmup_iters=5)
```
correspondingly.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64395
Reviewed By: datumbox
Differential Revision: D30753688
Pulled By: iramazanli
fbshipit-source-id: e47f86d12033f80982ddf1faf5b46873adb4f324
Summary:
In this PR we are introducing ChainedScheduler, which was initially proposed in the discussion https://github.com/pytorch/pytorch/pull/26423#discussion_r329976246 .
The idea is to provide a user-friendly chaining method for schedulers, especially for cases where many of them are involved and we want a clean, easy-to-read interface. This method will become even more crucial once CompositeSchedulers and schedulers for different types of parameters are involved.
The immediate application of the Chained Scheduler is expected to be in the TorchVision library, to combine WarmUpLR and MultiStepLR (https://github.com/pytorch/vision/blob/master/references/video_classification/scheduler.py#L5). However, it can be expected that this method will be applied in many other use cases as well.
### Example
The usage is as simple as below:
```python
sched = ChainedScheduler([
    ExponentialLR(self.opt, gamma=0.9),
    WarmUpLR(self.opt, warmup_factor=0.2, warmup_iters=4, warmup_method="constant"),
    StepLR(self.opt, gamma=0.1, step_size=3),
])
```
Then calling
```python
sched.step()
```
would trigger the step function of all three schedulers consecutively.
Partially resolves https://github.com/pytorch/vision/issues/4281
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63491
Reviewed By: datumbox, mruberry
Differential Revision: D30576180
Pulled By: iramazanli
fbshipit-source-id: b43f0749f55faab25079641b7d91c21a891a87e4
Summary:
It has been discussed in https://github.com/pytorch/pytorch/pull/60836#issuecomment-899084092 that we have observed an obstacle to chaining some types of learning rate schedulers. In particular, we observed:
* Some of the learning rate schedulers return the initial learning rates at epoch 0 as
```
return self.base_lrs
```
* This can be a problem when two schedulers are chained and called as
```
scheduler1.step()
scheduler2.step()
```
In particular, we completely ignore the effect of scheduler1 at epoch 0. This would not be an issue if scheduler1 were ineffective at epoch 0, as is the case for many schedulers; however, for schedulers such as warm-up schedulers, whose multiplicative value at epoch 0 is smaller than 1, this can lead to undesired behavior.
The following code snippet illustrates the problem:
## Reproducing the bug
```python
import torch
from torch.nn import Parameter
from torch.optim import SGD
from torch.optim.lr_scheduler import WarmUpLR, ExponentialLR
model = [Parameter(torch.randn(2, 2, requires_grad=True))]
optimizer = SGD(model, 1.0)
scheduler1 = WarmUpLR(optimizer, warmup_factor=0.1, warmup_iters=5, warmup_method="constant")
scheduler2 = ExponentialLR(optimizer, gamma=0.9)
for epoch in range(10):
    print(epoch, scheduler2.get_last_lr()[0])
    optimizer.step()
    scheduler1.step()
    scheduler2.step()
```
### Current Result
```
0 1.0
1 0.9
2 0.81
3 0.7290000000000001
4 0.6561000000000001
5 5.904900000000001
6 5.314410000000001
7 4.782969000000001
8 4.304672100000001
9 3.874204890000001
```
### Expected Result
```
0 1.0
1 0.9
2 0.81
3 0.7290000000000001
4 0.6561000000000001
5 0.5904900000000001
6 0.5314410000000001
7 0.4782969000000001
8 0.4304672100000001
9 0.3874204890000001
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63457
Reviewed By: datumbox
Differential Revision: D30424160
Pulled By: iramazanli
fbshipit-source-id: 3e15af8d278c872cd6f53406b55f4d3ce5002867