Commit Graph

36 Commits

Author SHA1 Message Date
dcc3cf7066 [BE] fix ruff rule E226: add missing whitespace around operator in f-strings (#144415)
The fixes were generated by:

```bash
ruff check --fix --preview --unsafe-fixes --select=E226 .
lintrunner -a --take "RUFF,PYFMT" --all-files
```
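
For illustration, a hypothetical before/after of the kind of fix E226 produces inside an f-string expression (the names are made up, not from the PR):

```python
a, b = 1, 2
# Before: E226, missing whitespace around an arithmetic operator
msg = f"total={a+b}"
# After the autofix
msg = f"total={a + b}"
```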

Pull Request resolved: https://github.com/pytorch/pytorch/pull/144415
Approved by: https://github.com/huydhn, https://github.com/Skylion007
2025-01-08 21:55:00 +00:00
12e95aa4ee [BE]: Apply PERF401 autofixes from ruff (#140980)
* Automatically applies ruff rule PERF401, turning loops into equivalent list comprehensions, which are faster and do not leak the loop variable into the enclosing scope (see the sketch after this list).
* List comprehensions not only often have better typing, but are 50+% faster than for loops in terms of overhead. They also preserve length information and are easier for the interpreter to optimize.
* Manually went back and made mypy happy after the change.
* Also fixed style lints in files covered by flake8 but not by pyfmt.
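
A minimal sketch of the kind of rewrite PERF401 performs (the names are illustrative):

```python
class Item:
    def __init__(self, value: int, ready: bool) -> None:
        self.value, self.ready = value, ready

items = [Item(1, True), Item(2, False), Item(3, True)]

# Before: building a list with an explicit loop (flagged by PERF401)
results = []
for item in items:
    if item.ready:
        results.append(item.value)

# After: the equivalent list comprehension; faster, and `item` does not
# leak into the enclosing scope
results = [item.value for item in items if item.ready]
```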

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140980
Approved by: https://github.com/justinchuby, https://github.com/malfet
2024-11-20 17:52:07 +00:00
8db9dfa2d7 Flip default value for mypy disallow_untyped_defs [9/11] (#127846)
See #127836 for details.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127846
Approved by: https://github.com/ezyang
ghstack dependencies: #127842, #127843, #127844, #127845
2024-06-08 18:50:06 +00:00
3bf922a6ce Apply UFMT to low traffic torch modules (#106249)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106249
Approved by: https://github.com/Skylion007
2023-07-29 23:37:30 +00:00
3721fa5612 [BE] Enable ruff's UP rules and autoformat optim/ (#105426)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105426
Approved by: https://github.com/malfet, https://github.com/albanD, https://github.com/aaronenyeshi, https://github.com/janeyx99
2023-07-18 21:07:43 +00:00
5837e95d30 [Reland] Update mypy to 1.4.1 (#105227)
This PR re-lands
- [Typing] Fix PEP 484 Violation (#105022)
- Update mypy to 1.4.1 (#91983)

Both were reverted due to a conflict with the internal source repo.

Mostly fixes for PEP-484 violations (i.e., cases where a default argument is set to None but the type is not annotated as Optional), as in the sketch below.
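
A minimal illustration of the violation and its fix (a hypothetical function, not code from the PR):

```python
from typing import Optional

# PEP 484 violation: the default is None, but the annotation does not allow it
def lookup(key: str, default: str = None) -> str:  # flagged by mypy 1.4.1
    return default or key

# Fixed: annotate the parameter as Optional
def lookup_fixed(key: str, default: Optional[str] = None) -> str:
    return default or key
```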
Plus a few real fixes:
  - Add missing `_get_upgraders_entry_map` to `torch/_C/__init__.pyi`
  - Add missing return statement to `torch._export.deserialize_graph`
  - Fix error message in `torch.ao.ns.fx.weight_utils.get_lstm_mod_weights`
  - Add assert in `torch/optim/optimizer.py` that an Optional list is not None
TODO (in followup PR):
  - Fix erroneous `isinstance` check in `torch/ao/quantization/_pt2e/qat_utils.py`

Unrelated, to bypass CI failures due to the gcc9 dependency update in Ubuntu-18.04:
- Add a hack to `.ci/docker/install_conda.sh` that squashes the older libstdc++ from the conda environment in favor of the one from the OS
- Update bazel CUDA builds to Focal, as with libstdc++-6.0.32 the bazel builds lose the ability to catch exceptions (probably because they link cupti statically, but I could not find where that is done)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105227
Approved by: https://github.com/atalman, https://github.com/albanD, https://github.com/Skylion007
2023-07-15 20:30:20 +00:00
15fd1ea118 Revert "[Reland] Update mypy to 1.4.1 (#105227)"
This reverts commit c9c4f8efc3dd4e66059522bf5f5c1ba0431e2069.

Reverted https://github.com/pytorch/pytorch/pull/105227 on behalf of https://github.com/atalman due to trying to mitigate ci sev #105248 ([comment](https://github.com/pytorch/pytorch/pull/105227#issuecomment-1636510935))
2023-07-14 22:28:35 +00:00
c9c4f8efc3 [Reland] Update mypy to 1.4.1 (#105227)
This PR re-lands
- [Typing] Fix PEP 484 Violation (#105022)
- Update mypy to 1.4.1 (#91983)

Both were reverted due to a conflict with the internal source repo.

Mostly fixes for PEP-484 violations (i.e., cases where a default argument is set to None but the type is not annotated as Optional).
Plus a few real fixes:
  - Add missing `_get_upgraders_entry_map` to `torch/_C/__init__.pyi`
  - Add missing return statement to `torch._export.deserialize_graph`
  - Fix error message in `torch.ao.ns.fx.weight_utils.get_lstm_mod_weights`
  - Add assert in `torch/optim/optimizer.py` that an Optional list is not None
TODO (in followup PR):
  - Fix erroneous `isinstance` check in `torch/ao/quantization/_pt2e/qat_utils.py`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105227
Approved by: https://github.com/atalman, https://github.com/albanD, https://github.com/Skylion007
2023-07-14 20:45:12 +00:00
3c5a494d7a Revert "Update mypy to 1.4.1 (#91983)"
This reverts commit 634659e262f82bbc76aa776119c9fea079fbffe3.

Reverted https://github.com/pytorch/pytorch/pull/91983 on behalf of https://github.com/malfet due to its dependent change being reverted, so reverting this one as well to keep CI clean ([comment](https://github.com/pytorch/pytorch/pull/91983#issuecomment-1636059709))
2023-07-14 15:59:16 +00:00
634659e262 Update mypy to 1.4.1 (#91983)
Mostly fixes for PEP-484 violations (i.e., cases where a default argument is set to None but the type is not annotated as Optional).
Plus a few real fixes:
  - Add missing `_get_upgraders_entry_map` to `torch/_C/__init__.pyi`
  - Add missing return statement to `torch._export.deserialize_graph`
  - Fix error message in `torch.ao.ns.fx.weight_utils.get_lstm_mod_weights`
TODO (in followup PR):
  - Fix erroneous `isinstance` check in `torch/ao/quantization/_pt2e/qat_utils.py`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91983
Approved by: https://github.com/kit1980, https://github.com/ZainRizvi, https://github.com/huydhn, https://github.com/thiagocrepaldi, https://github.com/aaronenyeshi
2023-07-13 16:30:36 +00:00
9c3fbe7475 [BE] Enable flake8-simplify checks (#97984)
Enable some sensible flake8-simplify rules, mainly the SIM101 and `yield from` (SIM103) checks; @kit1980, tagging you since you wanted to be tagged on this CI check. Illustrative examples follow below.

Enabling these checks also helped flag one logical bug, so they are definitely beneficial (that bug is also fixed in this PR).
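
Hypothetical examples of the two kinds of checks (not code from the PR):

```python
from typing import Any, Iterable, Iterator

def describe(x: Any) -> str:
    # SIM101: merge multiple isinstance() calls on the same object
    # Before: if isinstance(x, int) or isinstance(x, float): ...
    if isinstance(x, (int, float)):
        return "number"
    return "other"

def flatten(lists: Iterable[Iterable[int]]) -> Iterator[int]:
    for lst in lists:
        # `yield from` replaces a loop that only re-yields its items
        # Before: for item in lst: yield item
        yield from lst
```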

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97984
Approved by: https://github.com/ezyang
2023-03-31 03:40:21 +00:00
e52786f3d1 Silence profiler error (#94013)
This is not 3.11-specific, but it is a lot more likely to occur on 3.11. Other reports of the same failure on 3.8 can be found at https://github.com/pytorch/pytorch/issues/64345.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94013
Approved by: https://github.com/malfet
2023-02-03 17:33:47 +00:00
0a4e4de525 [ROCm] add case for FP32MatMulPattern skip property (#84077)
TF32 is not supported on ROCm, so the FP32MatMulPattern in torch/profiler/_pattern_matcher.py should return False (i.e., skip) on ROCm instead of checking the results of torch.cuda.get_arch_list(). Otherwise, test_profiler.py's test_profiler_fp32_matmul_pattern (__main__.TestExperimentalUtils) will fail, depending on the gfx arch running the test.
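
A hedged sketch of what such a skip property might look like (the actual implementation may differ):

```python
import torch

class FP32MatMulPattern:
    @property
    def skip(self) -> bool:
        # TF32 does not exist on ROCm, so there is nothing to report there
        if torch.version.hip is not None:
            return True
        # On CUDA, skip unless a TF32-capable arch (sm_80 or newer) is present
        has_tf32 = any(
            arch.startswith("sm_") and int(arch.split("_")[1]) >= 80
            for arch in torch.cuda.get_arch_list()
        )
        return not has_tf32
```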

Signed-off-by: Jagadish Krishnamoorthy <jagdish.krishna@gmail.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/84077
Approved by: https://github.com/jeffdaily, https://github.com/kit1980
2022-12-13 20:27:35 +00:00
1cd6ebe095 Fix typos in messages under torch (#89049)
This PR fixes typos in messages in `.py` files under the torch directory. In `torch/onnx/symbolic_opset16.py` only, it also fixes a typo in a comment to make the operator name correct.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89049
Approved by: https://github.com/lezcano
2022-11-17 04:18:14 +00:00
6e6f929b2c [Profiler] Restructure inputs and capture TensorLists. (#87825)
This PR unifies and rationalizes some of the input representation in Result. The current approach of storing separate types in separate vectors is tedious for two types (Tensors and scalars), but would be even more annoying with the addition of TensorLists. A similar disconnect exists with sizes and strides, which the user is also expected to zip with tensor_metadata.

I simplified things by moving inputs to a variant and moving sizes and strides into TensorMetadata. This also forced collection of sizes and strides in python tracer which helps to bring it in line with op profiling. Collection of TensorLists is fairly straightforward; `InputOutputEncoder` already has a spot for them (I actually collected them in the original TorchTidy prototype) so it was just a matter of plumbing things through.

Differential Revision: [D40734451](https://our.internmc.facebook.com/intern/diff/D40734451/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87825
Approved by: https://github.com/slgong-fb, https://github.com/chaekit
2022-11-08 21:48:43 +00:00
acd2f21ea1 [Profiler] Update python binding type annotations (#85722)
The annotations for `torch._C._profiler` have gotten a bit stale. This PR simply brings them up to date.

There is one small quality of life change that alters behavior: instead of returning device type and index separately we return a `torch.device` object.
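
For illustration, the shape of the returned object (a sketch, not the binding code itself):

```python
import torch

# Before (sketch): device type and index were returned as separate values.
# After: a single torch.device carries both.
dev = torch.device("cuda", 0)
assert dev.type == "cuda" and dev.index == 0
```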

Differential Revision: [D39852803](https://our.internmc.facebook.com/intern/diff/D39852803/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85722
Approved by: https://github.com/chaekit
2022-10-03 05:41:39 +00:00
ba95984588 [Profiler] Make name a property. (#85720)
This is just a quality of life change. `.name` is 30% fewer characters than `.name()`. I should have done this from the start.
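
A minimal sketch of the change, assuming a Result-like Python class:

```python
class Result:
    def __init__(self, name: str) -> None:
        self._name = name

    @property
    def name(self) -> str:
        # Previously a method, accessed as result.name(); now result.name
        return self._name

assert Result("aten::mm").name == "aten::mm"
```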

Differential Revision: [D39788873](https://our.internmc.facebook.com/intern/diff/D39788873/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85720
Approved by: https://github.com/chaekit
2022-10-03 05:41:36 +00:00
282d8dfa68 [Profiler] Fix traversal utility (#85717)
`eventTreeDFS` traverses in the wrong order (right to left). Moreover, we will need more complex traversal (e.g., early stopping) for the memory profiler. Thus, I made a simple general `_traverse` method and added `functools.partial` specializations for DFS and BFS.
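
A sketch of the approach, close in shape to (but not necessarily identical with) the new utility:

```python
from collections import deque
from functools import partial

def _traverse(tree, next_fn, children_fn=lambda event: event.children, reverse=False):
    order = reversed if reverse else iter
    remaining = deque(order(tree))
    while remaining:
        event = next_fn(remaining)  # pop() for DFS, popleft() for BFS
        yield event
        remaining.extend(order(children_fn(event)))

# Reversing and popping from the right yields left-to-right depth-first order
traverse_dfs = partial(_traverse, next_fn=deque.pop, reverse=True)
traverse_bfs = partial(_traverse, next_fn=deque.popleft)
```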

Differential Revision: [D39788871](https://our.internmc.facebook.com/intern/diff/D39788871/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85717
Approved by: https://github.com/chaekit
2022-09-29 02:59:45 +00:00
1fa9a377d0 [Profiler] Start moving python bindings out of autograd (#82584)
A lot of profiler code still lives in autograd for historic reasons. However as we formalize and clean up profiler internals it makes sense to pull more and more into the profiler folders/namespace. For now I'm just moving some of the core config data structures and those related to `torch::profiler::impl::Result` to keep the scope manageable.

Differential Revision: [D37961462](https://our.internmc.facebook.com/intern/diff/D37961462/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D37961462/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82584
Approved by: https://github.com/albanD, https://github.com/Gamrix
2022-08-19 17:15:18 +00:00
e47637aabc fix matching against MemcpyAsync (#82782)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82782
Approved by: https://github.com/robieta
2022-08-05 03:25:24 +00:00
d852dce720 Revert "Bug fix in ExtraCUDACopy and remove unstable lints for release (#82693)"
This reverts commit 50a1124fd006e4d893687ce9ab1399565a9d2741.

Reverted https://github.com/pytorch/pytorch/pull/82693 on behalf of https://github.com/zengk95 due to it breaking test_profiler_extra_cuda_copy_pattern_benchmark on Windows (50a1124fd0)
2022-08-04 02:11:20 +00:00
50a1124fd0 Bug fix in ExtraCUDACopy and remove unstable lints for release (#82693)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82693
Approved by: https://github.com/robieta
2022-08-03 21:28:39 +00:00
7922bbef73 [TorchTidy] Add option to generate json report (#82261)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82261
Approved by: https://github.com/robieta
2022-08-03 21:25:55 +00:00
54064ad198 [TorchTidy] Add pattern to detect matrix alignment in fp16 AND reorganize benchmark structure (#82248)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82248
Approved by: https://github.com/robieta
2022-08-03 18:24:22 +00:00
ae2e303de0 [TorchTidy] Reland #81941 Add pattern to detect if bias is enabled in conv2d followed by batchnorm2d (#82421)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82421
Approved by: https://github.com/robieta
2022-08-02 17:18:37 +00:00
af7dc23124 [Profiler] Add tag property to Result (#81965)
With a bit of template deduction we can make the variant tell us what type it is, and then we don't have to rely on (and maintain) a bunch of `isinstance` checks in Python.
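
On the Python side, the effect is that callers can branch on the tag instead of isinstance checks; a hedged sketch, assuming the `_EventType` enum exposed by `torch._C._profiler`:

```python
from torch._C._profiler import _EventType

def handle(event) -> str:
    # The variant reports its own type via `tag`, so no isinstance checks
    if event.tag == _EventType.TorchOp:
        return "op"
    elif event.tag == _EventType.Allocation:
        return "allocation"
    return "other"
```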

Differential Revision: [D38066829](https://our.internmc.facebook.com/intern/diff/D38066829/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81965
Approved by: https://github.com/davidchencsl
2022-07-29 05:12:11 +00:00
2fe73164b6 Revert "[TorchTidy] Add pattern to detect if bias is enabled in conv2d followed by batchnorm2d (#81941)"
This reverts commit 615f2fda4f40be098da16be075d192f36820353f.

Reverted https://github.com/pytorch/pytorch/pull/81941 on behalf of https://github.com/ZainRizvi due to New test failed on ROCm builds
2022-07-28 16:14:18 +00:00
615f2fda4f [TorchTidy] Add pattern to detect if bias is enabled in conv2d followed by batchnorm2d (#81941)
Summary: For CUDA models only: if an nn.Conv2d is followed by an nn.BatchNorm2d, we don't actually need bias=True in the Conv2d, since the batch-norm math cancels out the effect of the bias.
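
A minimal sketch of the pattern being flagged:

```python
import torch.nn as nn

# BatchNorm2d subtracts the per-channel mean, so a conv bias immediately
# before it is mathematically cancelled out; bias=False avoids the waste
block = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, bias=False),
    nn.BatchNorm2d(16),
)
```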

Test Plan: Added test in test_profiler.py

Differential Revision: [D38082643](https://our.internmc.facebook.com/intern/diff/D38082643)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81941
Approved by: https://github.com/robieta
2022-07-28 08:28:18 +00:00
f445c220be [TorchTidy] Add pattern to detect if set_to_none is set in zero_grad() (#81921)
Summary: Detect whether we are using aten::zero_ to zero out the gradients instead of setting them to None.
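
The recommended usage, for illustration:

```python
import torch

model = torch.nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# set_to_none=True frees the grads instead of launching aten::zero_ kernels
optimizer.zero_grad(set_to_none=True)
```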

Test Plan:
Added test in test_profiler.py

Differential Revision: [D38082642](https://our.internmc.facebook.com/intern/diff/D38082642)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81921
Approved by: https://github.com/robieta
2022-07-28 06:43:57 +00:00
d537f868f3 [TorchTidy] Add Pattern to detect Synchronous Data Loader (#81740)
Summary: Setting num_workers > 0 in DataLoader gives us asynchronous data loading, which does not block computation and helps speed up training. By matching the call structure, we can detect whether a synchronous data loader is being used.
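
For illustration, how to opt into asynchronous loading:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.randn(1024, 8))
# num_workers > 0 loads batches in background worker processes,
# overlapping data loading with computation on the main process
loader = DataLoader(dataset, batch_size=32, num_workers=4)
```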

Test Plan:
Added test in test_profiler.py

Differential Revision: [D38082644](https://our.internmc.facebook.com/intern/diff/D38082644)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81740
Approved by: https://github.com/robieta
2022-07-27 22:00:17 +00:00
aeb97d9559 [TorchTidy] Add pattern to detect if single-tensor implementation optimizer is used (#81733)
Summary: For Adam, SGD, and AdamW, the multi-tensor implementation is faster, so for these optimizers we detect whether the single-tensor implementation is being used.
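
In recent PyTorch versions, the multi-tensor path can be requested explicitly via the `foreach` flag; a sketch (availability of the flag depends on the version):

```python
import torch

model = torch.nn.Linear(4, 2)
# foreach=True selects the multi-tensor (horizontally fused) implementation
opt = torch.optim.Adam(model.parameters(), lr=1e-3, foreach=True)
```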

Test Plan:
Added test in test_profiler.py

Differential Revision: [D38082641](https://our.internmc.facebook.com/intern/diff/D38082641)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81733
Approved by: https://github.com/robieta
2022-07-22 23:28:32 +00:00
64c6387c0f [Profiler] Add speedup estimate for FP32 pattern and Extra CUDA Copy Pattern (#81501)
Summary: The main idea is that we can run some baseline benchmarks after we are done matching the events. This lets us measure the speed gain accurately, because system performance varies from machine to machine.
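
A hedged sketch of the idea: time a baseline and an optimized variant on the machine at hand and report the ratio (the helper is illustrative, not the PR's code):

```python
import timeit

def estimated_speedup(baseline_fn, optimized_fn, repeats=100) -> float:
    # Benchmark both variants locally, since speedups vary across machines
    baseline = timeit.timeit(baseline_fn, number=repeats)
    optimized = timeit.timeit(optimized_fn, number=repeats)
    return baseline / optimized
```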

Test Plan: I did some manual testing on all the models in torchbench, and added a simple test in test_profiler.py.

Differential Revision: [D37894566](https://our.internmc.facebook.com/intern/diff/D37894566)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81501
Approved by: https://github.com/robieta
2022-07-22 23:26:26 +00:00
26ecd67d38 [Profiler] Add pattern to detect if TF32 is available but not used (#81273)
Summary: This commit adds recording of the global allow_tf32 flag in all torch ops, then adds a pattern that detects whether the user has set the TF32 flag, recommending they do so if it is not set.
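
The flags the pattern recommends enabling:

```python
import torch

# Allow TF32 tensor cores on Ampere and newer GPUs
torch.backends.cuda.matmul.allow_tf32 = True
torch.backends.cudnn.allow_tf32 = True
```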

Test Plan: Added a test in test_profiler.py

Differential Revision: [D37799414](https://our.internmc.facebook.com/intern/diff/D37799414)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81273
Approved by: https://github.com/robieta
2022-07-20 18:40:21 +00:00
6e621d0287 Add pattern to detect for loop indexing into tensor (#81056)
Differential Revision: [D37725266](https://our.internmc.facebook.com/intern/diff/D37725266)

Summary: This commit adds a pattern to the pattern matcher that detects for-loop-plus-indexing code that could be vectorized.
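
A minimal sketch of the anti-pattern and its vectorized form:

```python
import torch

x = torch.randn(1000)

# Flagged: a Python loop indexing into the tensor element by element
y = torch.empty_like(x)
for i in range(len(x)):
    y[i] = x[i] * 2

# Vectorized equivalent: one batched op, no per-element dispatch overhead
y = x * 2
```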

Test Plan: Added a test in test_profiler.py that creates a bootstrap example.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81056
Approved by: https://github.com/robieta
2022-07-20 18:37:24 +00:00
39db8b3823 [Profiler] Add Pattern that detects extra cuda copy (#80572)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80572
Approved by: https://github.com/robieta
2022-07-07 20:22:42 +00:00
69a77f77b7 [Profiler] Define pattern matcher structure (#80108)
Differential Revision: [D37535663](https://our.internmc.facebook.com/intern/diff/D37535663)

Summary:
This commit defines a base Pattern class and some helper functions.
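
A hedged sketch of what such a base class might look like (the method names are assumptions, not the exact API):

```python
class Pattern:
    """Base class for anti-pattern detectors over profiler events (sketch)."""

    name = "Generic Pattern"
    description = "Overwrite this in subclasses"

    def __init__(self, prof):
        self.prof = prof  # a completed torch.profiler.profile

    def match(self, event) -> bool:
        # Subclasses decide whether one event exhibits the anti-pattern
        raise NotImplementedError

    def matched_events(self):
        # Helper: walk the event tree and collect every matching event
        return [event for event in self.traverse() if self.match(event)]

    def traverse(self):
        # DFS/BFS over the profiler's event tree; left abstract in this sketch
        raise NotImplementedError
```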

Test Plan:
Added test in test_profiler.py
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80108
Approved by: https://github.com/robieta
2022-07-07 20:19:11 +00:00