Document complex optimizer semantic behavior (#121667)

<img width="817" alt="image" src="https://github.com/pytorch/pytorch/assets/31798555/565b389d-3e86-4767-9fcb-fe075b50aefe">

Pull Request resolved: https://github.com/pytorch/pytorch/pull/121667
Approved by: https://github.com/albanD
This commit is contained in:
Jane Xu
2024-03-15 08:30:54 -07:00
committed by PyTorch MergeBot
parent 12662900f9
commit 37e563276b
2 changed files with 32 additions and 3 deletions

@@ -418,8 +418,8 @@ The short version:
 the gradients are computed under the assumption that the function is a part of a larger real-valued
 loss function :math:`g(input)=L`. The gradient computed is :math:`\frac{\partial L}{\partial z^*}`
 (note the conjugation of z), the negative of which is precisely the direction of steepest descent
-used in Gradient Descent algorithm. Thus, all the existing optimizers work out of
-the box with complex parameters.
+used in Gradient Descent algorithm. Thus, there is a viable path in making the existing optimizers
+work out of the box with complex parameters.
 - This convention matches TensorFlow's convention for complex
   differentiation, but is different from JAX (which computes
   :math:`\frac{\partial L}{\partial z}`).
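
For readers landing on this commit, here is a minimal sketch of the behavior the revised note describes, assuming a PyTorch build whose optimizers accept complex parameters (the premise of this doc change); the parameter `z` and the loss `|z|^2` are illustrative choices, not taken from the diff. Because autograd fills `z.grad` with the conjugate Wirtinger derivative ∂L/∂z* for a real-valued loss, its negative points in the steepest-descent direction, so a stock optimizer such as `torch.optim.SGD` can be applied unchanged:

```python
# Minimal sketch: optimizing a complex parameter with a stock optimizer.
# Assumes complex-parameter support in torch.optim (recent PyTorch releases).
import torch

z = torch.tensor([3.0 + 4.0j], requires_grad=True)  # illustrative parameter
opt = torch.optim.SGD([z], lr=0.1)

for _ in range(50):
    opt.zero_grad()
    loss = (z * z.conj()).real.sum()  # L = |z|^2, a real-valued loss
    loss.backward()                   # fills z.grad per the dL/dz* convention
    opt.step()                        # steps against z.grad: steepest descent

print(z.detach())  # |z| shrinks toward 0 under plain gradient descent
```

For this particular loss, `z.grad` comes out as a positive real multiple of `z`, so every step uniformly rescales `z` toward the origin, the minimizer of |z|^2. Under JAX's ∂L/∂z convention (the last bullet of the diff), the raw derivative of a real-valued loss would instead be the complex conjugate of this, which is why the choice of convention matters for reusing real-valued optimizers as-is.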