Commit Graph

30 Commits

356ac3103a Revert "Stop parsing command line arguments every time common_utils is imported. (#156703)"
This reverts commit 310f901a71e53688866b14bb2f2b4c8eef9979b3.

Reverted https://github.com/pytorch/pytorch/pull/156703 on behalf of https://github.com/izaitsevfb due to breaking tests internally with `assert common_utils.SEED is not None` ([comment](https://github.com/pytorch/pytorch/pull/156703#issuecomment-3152337518))
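For context, a minimal hypothetical sketch (only `SEED` is taken from the message above; everything else is illustrative) of why deferring argument parsing out of import time can leave module-level state unset:

```python
# Hypothetical, simplified sketch: if argument parsing no longer happens at import
# time, a module-level global such as SEED stays None until parse_args() runs,
# which breaks callers that assert on it right after import.
import argparse

SEED = None  # only populated once parse_args() is called

def parse_args(argv=None):
    global SEED
    parser = argparse.ArgumentParser()
    parser.add_argument("--seed", type=int, default=1234)
    args = parser.parse_args(argv if argv is not None else [])
    SEED = args.seed
    return args

# A caller that runs `assert SEED is not None` before parse_args() now fails.
```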
2025-08-04 20:37:39 +00:00
310f901a71 Stop parsing command line arguments every time common_utils is imported. (#156703)
Last PR in the series to re-submit https://github.com/pytorch/pytorch/pull/134592 as smaller PRs:

https://github.com/pytorch/pytorch/pull/154612
https://github.com/pytorch/pytorch/pull/154628
https://github.com/pytorch/pytorch/pull/154715
https://github.com/pytorch/pytorch/pull/154716
https://github.com/pytorch/pytorch/pull/154725
https://github.com/pytorch/pytorch/pull/154728

Pull Request resolved: https://github.com/pytorch/pytorch/pull/156703
Approved by: https://github.com/clee2000
2025-08-02 16:38:54 +00:00
c199a4d0fd Move non inductor workflows cuda 12.6->cuda 12.8 (#155234)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/155234
Approved by: https://github.com/Skylion007, https://github.com/zxiiro, https://github.com/cyyever, https://github.com/malfet
2025-06-12 12:42:34 +00:00
792f1c47e9 No actual change, just remove variables containing Tensors from global scope (#143225)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143225
Approved by: https://github.com/ezyang
2024-12-17 16:14:25 +00:00
221350e3a4 Add None return type to init -- tests (#132352)
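For reference, the annotation pattern being added to the test files is simply an explicit `None` return type on `__init__` (hypothetical class for illustration):

```python
# Example of the annotation style this PR applies: __init__ explicitly returns None.
class Example:
    def __init__(self) -> None:
        self.value = 0
```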
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132352
Approved by: https://github.com/ezyang
ghstack dependencies: #132335, #132351
2024-08-01 15:44:51 +00:00
c165a8e71d Enable UFMT on test_decomp.py, test_expanded_weights.py and some files (#125117)
Part of: #123062

Ran lintrunner on:

- test/test_decomp.py
- test/test_deploy.py
- test/test_determination.py
- test/test_dlpack.py
- test/test_dynamic_shapes.py
- test/test_expanded_weights.py

Detail:

```bash
$ lintrunner -a --take UFMT --all-files
ok No lint issues.
Successfully applied all patches.
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/125117
Approved by: https://github.com/jansel
2024-05-07 02:36:40 +00:00
eb2bdfae88 Make variables in dict LazyTrackers (not lazily guarded yet) and avoid using DICT_KEYS guard (#117625)
Make variables in dict lazy and remove DICT_KEYS guard.

We build the keys of a dict depth-first and rely on the guards of
each element in the dict to create the correct guards. This allows us to
remove the rather buggy DICT_KEYS guard and make the guard lazy.
The guards are not completely lazy yet, as we still instantiate them in
`_HashableTracker._eq_impl`, but it should be possible to make them
truly lazy.

Also, adding new types to the set of supported key types should now be less
error-prone.

This is marginally less efficient when we graph break, but in turn we
should graph break much less. It also makes the dicts code easier to maintain
(removes `is_hashable_python_var`).
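As a purely illustrative toy (hypothetical names, not Dynamo's real API), the laziness described above amounts to recording a thunk per key and only materializing each guard when it is needed:

```python
# Toy illustration (hypothetical names, not Dynamo internals): one lazy guard per
# dict key, with the guard expression built only when materialize() is called.
from typing import Callable, Dict, List, Optional

class LazyGuard:
    def __init__(self, make_guard: Callable[[], str]) -> None:
        self._make_guard = make_guard
        self._guard: Optional[str] = None

    def materialize(self) -> str:
        if self._guard is None:
            self._guard = self._make_guard()
        return self._guard

def guards_for_dict(d: Dict) -> List[LazyGuard]:
    # Nothing is computed here; each key's guard is built on demand.
    return [LazyGuard(lambda k=k: f"dict key {k!r} is present and unchanged") for k in d]

guards = guards_for_dict({"a": 1, "b": 2})
print([g.materialize() for g in guards])
```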

Pull Request resolved: https://github.com/pytorch/pytorch/pull/117625
Approved by: https://github.com/jansel, https://github.com/peterbell10, https://github.com/anijain2305
ghstack dependencies: #117982, #118098, #117983
2024-02-02 14:38:08 +00:00
dd2cff1591 [Dynamo] Use isinstance rather than istype when checking if python module type (#117022)
This fixes an issue from a Meta internal use case, where the third-party `DictConfig` has a bug in its `__eq__` (see `omegaconf/dictconfig.py`, around line 596) which triggered a Dynamo error because we were using an `obj in [x, y]` check. I found that `isinstance` covers all the cases, so these special cases can be removed.
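A minimal, self-contained illustration (not the Dynamo code itself) of why the membership check is fragile while `isinstance` is not:

```python
# A buggy third-party __eq__ (as described above) blows up a membership check,
# while isinstance() never invokes __eq__ at all.
import types

class BadEq:
    def __eq__(self, other):
        raise RuntimeError("buggy __eq__")

obj = BadEq()

try:
    _ = obj in [types.ModuleType, dict]  # `in` falls back to == and hits the buggy __eq__
except RuntimeError as e:
    print("membership check failed:", e)

print(isinstance(obj, types.ModuleType))  # False, and __eq__ is never called
```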

Pull Request resolved: https://github.com/pytorch/pytorch/pull/117022
Approved by: https://github.com/ckluk2, https://github.com/jansel
2024-01-15 23:25:30 +00:00
bd10fea79a [BE]: Enable F821 and fix bugs (#116579)
Fixes #112371

I tried to fix as many of the bugs as I could; a few I could not figure out the proper fix for, so I left them with `noqa`s.
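For illustration (hypothetical code, not taken from the PR), the class of bug flake8's F821 rule catches is a reference to a name that was never defined or imported:

```python
# Hypothetical example of an F821 (undefined name) bug and its fix.
def scale(values, factor):
    # The buggy version referenced a misspelled name:
    #     return [v * facotr for v in values]   # F821: undefined name 'facotr'
    return [v * factor for v in values]

print(scale([1, 2, 3], 2))  # [2, 4, 6]
```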

Pull Request resolved: https://github.com/pytorch/pytorch/pull/116579
Approved by: https://github.com/ezyang
2024-01-01 08:40:46 +00:00
6d570ccd59 tf32 context fixes for various tests (#103137)
Addresses TF32-context-related failures from NVIDIA internal testing for the following unit tests (a minimal sketch of the TF32 toggling involved follows the lists):

H100:

- functorch/test_vmap.py: test_op_has_batch_rule

A100:

- test_expanded_weights.py: test_cnn_model_sum
- nn/test_convolution.py: test_conv2d_same_padding_backward
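A minimal sketch of the kind of TF32 context handling involved (the test suite's own helpers, e.g. the `tf32_on_and_off` decorator, are not reproduced here):

```python
# Sketch: temporarily disable TF32 for matmul and cuDNN so float32 comparisons use
# strict tolerances, then restore the previous settings.
import torch

def run_with_tf32_disabled(fn):
    old_matmul = torch.backends.cuda.matmul.allow_tf32
    old_cudnn = torch.backends.cudnn.allow_tf32
    torch.backends.cuda.matmul.allow_tf32 = False
    torch.backends.cudnn.allow_tf32 = False
    try:
        return fn()
    finally:
        torch.backends.cuda.matmul.allow_tf32 = old_matmul
        torch.backends.cudnn.allow_tf32 = old_cudnn
```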

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103137
Approved by: https://github.com/zou3519
2023-06-15 02:33:12 +00:00
d805a53f1f disable tf32 for rnn tests and norm tests (#102005)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/102005
Approved by: https://github.com/ngimel
2023-05-24 02:22:58 +00:00
fe0e28ab87 [decompositions] GRU decompositon with and without packed sequence (#91466)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91466
Approved by: https://github.com/zou3519
2023-02-08 14:16:30 +00:00
bef61225c3 [decompositions] add decomposition for RNN with packed sequence (#91281)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91281
Approved by: https://github.com/zou3519
2023-02-08 14:16:30 +00:00
e5f6e1f660 [decompositions] add LSTM decomp (#91124)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91124
Approved by: https://github.com/zou3519
2023-02-08 14:16:30 +00:00
20d01d2dc9 [expanded weights] add RNN support via decomp (#91807)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91807
Approved by: https://github.com/albanD
2023-02-08 14:16:30 +00:00
9eb4f9dd17 Tweak test tolerances to be compatible with A10G (#86538)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86538
Approved by: https://github.com/ngimel
2022-10-11 23:31:48 +00:00
3a511e8354 [Expanded Weights] add 'same' and 'valid' padding support (#83345)
Co-authored-by: Ashkan <yousefpour@fb.com>

Adds "same" and "valid" padding support, as Opacus (well @ashkan-software) did https://github.com/pytorch/opacus/pull/451

Basics of it are this:
- during forward pass, if there's "same" padding, we manually pad the input (NB: this will cause a small perf hit, haven't benchmarked yet)
- during backward pass, the gradient wrt input needs to be cut down to the correct size if the original padding was same (conv_transpose doesn't accept string padding). Because conv_transpose will give us a gradient wrt the padded shape, we cut down the gradient to the correct size (we know how much padding we added to the left and right)
- then, for the per sample gradients wrt weights, the input is already padded so neither the unfold nor group convolution have any padding
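A rough sketch (not the PR's actual implementation) of the manual "same" padding described above, assuming stride 1 so that the "same" output size is well defined:

```python
# Pad the input by hand so the convolution output matches the input's spatial size,
# then run the conv with no padding, as the forward-pass bullet above describes.
import torch
import torch.nn.functional as F

def conv2d_same(x, weight, dilation=1):
    kh, kw = weight.shape[-2:]
    pad_h = dilation * (kh - 1)          # total padding needed along height
    pad_w = dilation * (kw - 1)          # total padding needed along width
    # F.pad order for the last two dims is (left, right, top, bottom).
    x = F.pad(x, (pad_w // 2, pad_w - pad_w // 2, pad_h // 2, pad_h - pad_h // 2))
    return F.conv2d(x, weight, dilation=dilation)

x = torch.randn(2, 3, 8, 8)
w = torch.randn(4, 3, 3, 3)
print(conv2d_same(x, w).shape)  # torch.Size([2, 4, 8, 8])
```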
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83345
Approved by: https://github.com/zou3519
2022-08-16 22:39:08 +00:00
3d74fd4870 [Expanded Weights] add ability to not specify batch size (#80944)
Opacus has been asking for the ability to not specify a batch size. Previously a user had to do
`call_for_per_sample_grads(module, batch_size)(*args, **kwargs)`.
They rightfully pointed out that in most cases, when you're passing a single argument to a module's forward function, it seems repetitive to specify the batch size. The original rationale was that when a user passes more than one argument, we might not know what the batch size is if the arguments don't match. So this lets a user not specify a batch size (or pass it as None), meaning that
`call_for_per_sample_grads(linear_module)(torch.randn(5, 4))`
now works and has a batch size of 5.

If there are multiple tensor arguments with different batch sizes, we fail, even if one of the inputs wouldn't have been used by the module, because we can't tell which batch size we should be using.
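A hedged usage sketch of the behavior described above; the import path and the `grad_sample` attribute layout are assumptions that have shifted across versions:

```python
# Assumed import location; the batch size is inferred from the single tensor argument.
import torch
from torch.nn.utils._per_sample_grad import call_for_per_sample_grads  # assumed path

linear = torch.nn.Linear(4, 3)
out = call_for_per_sample_grads(linear)(torch.randn(5, 4))  # batch size inferred as 5
out.sum().backward()
print(linear.weight.grad_sample.shape)  # expected per-sample grads: (5, 3, 4)
```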
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80944
Approved by: https://github.com/zou3519
2022-07-19 13:43:46 +00:00
5973cbe657 [Expanded Weights] fix conv3d (#80943)
Conv3d had an ordering typo in its unfold (it needs a custom unfold since `unfold3d` doesn't exist in torch). This was caught by Opacus but not by us because dilation, padding, and stride always matched in our test cases. Conv3d has very few test cases since it doesn't have an OpInfo, and we were skipping some via the nn-module test filtering. So this updates the filtering to add more of the common nn tests (one of which did fail without the change).

Closes #80953
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80943
Approved by: https://github.com/zou3519
2022-07-19 13:43:46 +00:00
2bcbea1ff6 [Expanded Weights] fix layer norm (#80895)
Opacus found that Layer Norm can fail due to a wrong ordering in the ExpandedWeights code. What was happening is that all our tests had inputs requiring grad, so a layer-norm check was always short-circuiting in the tests and hiding the wrong ordering. This adds a test where the input does not require gradients and fixes the issue in Layer Norm.

Closes #80952
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80895
Approved by: https://github.com/zou3519
2022-07-19 13:43:45 +00:00
799bc645d9 [Expanded Weights] fix loss reduction (#80892)
Two changes in here:
(1) Changes `call_for_per_sample_grads` to be curried. Old call looks like:
`call_for_per_sample_grads(module, batch_size, args, kwargs)`
New call looks like:
`call_for_per_sample_grads(module, batch_size, loss_reduction=loss_reduction)(args, kwargs)`

(2) Adds the ability to specify a loss reduction, to match what is done in Opacus. Opacus has a more complete explanation, but essentially they want the per-sample gradient behavior to match what happens in a for loop over single examples. This gets messed up if you use a mean reduction at the end, since within a batch that ends up scaling all the grad_outputs by 1/batch_size; we offset that by scaling all the grad_samples by batch_size when the loss_reduction is mean.
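A tiny numeric illustration of the mean-reduction rescaling described in (2):

```python
# With loss = mean over the batch, each grad_output is scaled by 1/batch_size, so the
# raw per-sample grads come out batch_size times too small; scaling them back by
# batch_size recovers the "for loop over single examples" behavior.
import torch

batch_size = 8
true_per_sample = torch.randn(batch_size, 3)   # stand-in for loop-computed grads
mean_scaled = true_per_sample / batch_size     # what a mean-reduced loss yields
recovered = mean_scaled * batch_size           # offset applied when loss_reduction="mean"
assert torch.allclose(recovered, true_per_sample)
```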
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80892
Approved by: https://github.com/zou3519
2022-07-18 20:29:20 +00:00
d96d186537 [Expanded Weights] fix unbatched support issue (#80891)
Closes #78077

Adds a better failure message when encountering unbatched convolution. Also adds the inputs that use unbatched convolution as failure inputs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80891
Approved by: https://github.com/zou3519
2022-07-07 14:56:32 +00:00
56f4b69c4b [Expanded Weights] Fix instance norm (#79800)
Opacus found an issue with the input (batched) gradients produced by instance norm. What was surprising is that we do test that the input gradients match, but here the input gradients with instance norm are so close to 0 (typically around 1e-10) that they all look the same. The issue only shows up if you put another layer in front of instance norm so that those small differences get magnified. This fixes the bug and makes sure that each layer we support is exercised in at least one model-level test.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79800
Approved by: https://github.com/zou3519, https://github.com/albanD
2022-07-05 16:52:29 +00:00
0a7d6f34b0 expanded weights: instance norm faster rule
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70141

Approved by: https://github.com/zou3519
2022-04-19 19:40:09 +00:00
72f7193f4d expanded weights: group norm faster rule
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73696

Approved by: https://github.com/zou3519
2022-03-31 20:06:54 +00:00
8b8f3e836b expanded weights: layer norm faster rule
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73694

Approved by: https://github.com/zou3519
2022-03-31 19:10:43 +00:00
fc47257b30 expanded weights: embedding faster rule
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73693

Approved by: https://github.com/zou3519
2022-03-29 21:28:17 +00:00
78e17eaadc expanded weights: conv faster rule (#73692)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/73692

Test Plan: Imported from OSS

Reviewed By: bdhirsh

Differential Revision: D34719302

Pulled By: samdow

fbshipit-source-id: 2288320a5f5d6a442da78e9fbe722f300b844be9
(cherry picked from commit a4cf23383c16d3c61d53e9d21f426259d2dc2d37)
2022-03-10 04:06:08 +00:00
0973c5a1cc align signature of make_tensor with other creation ops (#72702)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72702

Test Plan: Imported from OSS

Reviewed By: mrshenli

Differential Revision: D34457729

Pulled By: mruberry

fbshipit-source-id: 83d580c4201eef946dc9cf4b9e28a3d36be55609
(cherry picked from commit aa4cf20fbeb4b795595729b8ac2e6ba7707d8283)
2022-02-25 06:30:31 +00:00
53faf78143 expanded weights without fast rules (#70140)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70140

[Design Doc for Expanded Weights](https://gist.github.com/samdow/fa0a164fec7963f93ff45284989cfc55) <-- gives an overview of the design for Expanded Weights

Introduces the ExpandedWeights mechanism and user-facing API, without any custom-implemented faster rules (a for-loop baseline sketch of per-sample gradients follows this list).
 - The user-facing API is in `_stateless.py` (with documentation)
 - Testing is in `test_expanded_weights`
 - The rest is the implementation of the erroring fallback plus the mechanism for registering faster per-sample-grad rules. Only linear is implemented here, but they are all implemented in #70141
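For reference, a purely illustrative for-loop baseline (hypothetical helper, not part of the PR) that per-sample gradients are meant to match:

```python
# Hypothetical baseline: compute per-sample gradients with an explicit per-example
# loop. Slow, but it defines the behavior the faster rules are checked against.
import torch

def per_sample_grads_loop(module, inputs, targets, loss_fn):
    grads = []
    for x, y in zip(inputs, targets):
        module.zero_grad()
        loss_fn(module(x.unsqueeze(0)), y.unsqueeze(0)).backward()
        grads.append({name: p.grad.detach().clone() for name, p in module.named_parameters()})
    return grads
```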

Test Plan: Imported from OSS

Reviewed By: mikaylagawarecki

Differential Revision: D34350950

Pulled By: samdow

fbshipit-source-id: 69c664b0bc3dff6951358d79d7e5d94882f7aef2
(cherry picked from commit ae1620d3b6507b27c3bc08ecfb2b1418aa8ce7d7)
2022-02-22 20:35:16 +00:00