e595136187
Enable PLC1802 on ruff ( #165813 )
...
This PR enables ruff check `PLC1802`, which detects `len` calls on sequences in a boolean test context.
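For illustration (example code, not from the PR itself), the pattern this rule rewrites looks like:

```python
def describe(batch):
    # Flagged by PLC1802: len() used directly as the boolean test
    if len(batch):
        return "non-empty"
    return "empty"


def describe_fixed(batch):
    # Preferred: a non-empty sequence is already truthy
    if batch:
        return "non-empty"
    return "empty"
```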
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165813
Approved by: https://github.com/ezyang
2025-10-18 05:44:14 +00:00
39116409a1
[torch/utils][Code Clean] Clean asserts in benchmark/ and data/ in torch/utils/ ( #165299 )
...
Including:
- `torch/utils/benchmarks/`
- `torch/utils/data/`
Fixes part of #164878
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165299
Approved by: https://github.com/albanD
2025-10-14 04:50:39 +00:00
8de85896e0
Enable ruff rule E721 ( #165162 )
...
`E721` checks for object type comparisons that use `==` and other comparison operators; it is recommended to use `is` for type comparisons.
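A minimal illustration of what E721 flags (example code, not from the PR):

```python
class Base: ...
class Child(Base): ...

a, b = Base(), Base()

# Flagged by E721: comparing types with ==
same_type_eq = type(a) == type(b)

# Preferred: identity comparison for exact type checks...
same_type_is = type(a) is type(b)

# ...or isinstance() when subclasses should match too
child_is_base = isinstance(Child(), Base)
```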
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165162
Approved by: https://github.com/Skylion007
2025-10-13 01:48:55 +00:00
816fb7f48d
Revert "Enable ruff rule E721 ( #165162 )"
...
This reverts commit 9e7c19f72b6d0690915c307409c0c0a76b5a3bf0.
Reverted https://github.com/pytorch/pytorch/pull/165162 on behalf of https://github.com/pytorch-auto-revert due to Reverted automatically by pytorch's autorevert, to avoid this behaviour add the tag autorevert: disable ([comment](https://github.com/pytorch/pytorch/pull/165162#issuecomment-3393328271 ))
2025-10-11 13:25:40 +00:00
9e7c19f72b
Enable ruff rule E721 ( #165162 )
...
`E721` checks for object type comparisons that use `==` and other comparison operators; it is recommended to use `is` for type comparisons.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165162
Approved by: https://github.com/Skylion007
2025-10-11 06:43:53 +00:00
a029675f6f
More ruff SIM fixes ( #164695 )
...
This PR applies ruff `SIM` rules to more files. Most changes are about simplifying `dict.get` because `None` is already the default value.
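The typical simplification (illustrative example, not taken from the diff):

```python
config = {"lr": 0.1}

# Before: the explicit None default is redundant (flagged by ruff's SIM rules)
momentum_before = config.get("momentum", None)

# After: dict.get already returns None for a missing key
momentum_after = config.get("momentum")
```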
Pull Request resolved: https://github.com/pytorch/pytorch/pull/164695
Approved by: https://github.com/ezyang
2025-10-09 03:24:50 +00:00
086dec3235
Pyrefly suppressions 6/n ( #164877 )
...
Adds suppressions so that pyrefly will typecheck clean: https://github.com/pytorch/pytorch/issues/163283
Almost there!
Test plan:
dmypy restart && python3 scripts/lintrunner.py -a
pyrefly check
step 1: delete lines in the pyrefly.toml file from the project-excludes field
step 2: run pyrefly check
step 3: add suppressions, clean up unused suppressions
before: https://gist.github.com/maggiemoss/4b3bf2037014e116bc00706a16aef199
after:
INFO 0 errors (5,064 ignored)
Only four directories left to enable
Pull Request resolved: https://github.com/pytorch/pytorch/pull/164877
Approved by: https://github.com/oulgen
2025-10-08 02:30:57 +00:00
f7ab8a2710
[1/N] Fix ruff warnings ( #164333 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/164333
Approved by: https://github.com/albanD
2025-10-01 16:48:32 +00:00
e30f01b5b5
[1/N] Simplify "in" operation for containers of a single item ( #164224 )
...
These issues are detected by ruff [FURB171](https://docs.astral.sh/ruff/rules/single-item-membership-test/#single-item-membership-test-furb171 ).
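In miniature, the rewrite looks like this (illustrative example):

```python
mode = "train"

# Flagged by FURB171: membership test against a single-item container
flagged = mode in ("train",)

# Preferred: a plain equality comparison
fixed = mode == "train"
```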
Pull Request resolved: https://github.com/pytorch/pytorch/pull/164224
Approved by: https://github.com/rec , https://github.com/Skylion007
2025-09-30 19:59:43 +00:00
3cda34ebde
[2/N] Apply ruff UP035 check in torch files ( #164054 )
...
This is the result of applying the ruff `UP035` check.
`Callable` is imported from `collections.abc` instead of `typing`.
`TypeAlias`, by contrast, is still imported from `typing`.
This PR is the follow-up of #163947 .
Pull Request resolved: https://github.com/pytorch/pytorch/pull/164054
Approved by: https://github.com/ezyang , https://github.com/Skylion007
2025-09-29 03:35:32 +00:00
e3b392bdfd
[BC breaking] Remove deprecated imports for torch.utils.data.datapipes.iter.grouping ( #163438 )
...
This PR removes import tricks of `SHARDING_PRIORITIES` and `ShardingFilterIterDataPipe` from `torch.utils.data.datapipes.iter.grouping`. They were slated for removal in PyTorch 2.1 but never actually removed.
Before change:
```
from torch.utils.data.datapipes.iter.grouping import SHARDING_PRIORITIES
from torch.utils.data.datapipes.iter.grouping import ShardingFilterIterDataPipe
```
works
After change:
these imports raise an `ImportError`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163438
Approved by: https://github.com/janeyx99
2025-09-23 05:02:06 +00:00
6a48f57d2f
[1/N] Remove 'type: ignore' suppressions ( #163468 )
...
Remove some unnecessary 'type: ignore' suppressions from python code.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163468
Approved by: https://github.com/Skylion007 , https://github.com/janeyx99
2025-09-23 03:53:11 +00:00
46e1b7d70b
remove allow-untyped-defs from ./torch/utils/data/datapipes/iter/fileopener.py ( #163469 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163469
Approved by: https://github.com/aorenste , https://github.com/Skylion007
ghstack dependencies: #163246
2025-09-22 20:29:09 +00:00
f591bb5056
Remove data_source argument from Sampler ( #163134 )
...
`data_source` was slated for removal in PyTorch 2.2 but was never actually removed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163134
Approved by: https://github.com/ezyang
2025-09-21 05:44:41 +00:00
a87aea03f7
Update RandomSampler docstring. data_source must be Sized not Dataset ( #158857 )
...
Fixes #158631
The docstring said `data_source` was a `Dataset`, but `RandomSampler` only needs something that implements `__len__`. This updates the docstring to use `Sized` instead, which matches the actual type used in the constructor.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/158857
Approved by: https://github.com/divyanshk
2025-09-20 04:05:25 +00:00
5cedc5a0ff
[BE][PYFMT] migrate PYFMT for torch/[p-z]*/ to ruff format ( #144552 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144552
Approved by: https://github.com/ezyang
2025-08-07 00:09:56 +00:00
5f7eae697d
Deprecate DataLoader pin_memory_device param ( #158323 )
...
Build on top of https://github.com/pytorch/pytorch/pull/146821
- Moves enabling pin_memory back inside `_BaseDataLoaderIter`
- This is required for `StatefulDataLoader`, which leverages `_BaseDataLoaderIter` directly rather than the `DataLoader` class init
- Add a simple test for CPU only env where setting `pin_memory=True` is a no-op.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/158323
Approved by: https://github.com/ramanishsingh
Co-authored-by: zeshengzong <zesheng.zong@outlook.com >
2025-07-31 18:42:07 +00:00
7e34f9c292
Add torch._C._log_api_usage_once to datapipes (mapper) ( #155489 )
...
This is to get a better understanding of how datapipes is used right now.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/155489
Approved by: https://github.com/ramanishsingh
2025-07-17 19:01:49 +00:00
b83d8827bc
Revert "Deprecate DataLoader pin_memory_device param ( #146821 )"
...
This reverts commit ab655816b8f76f511fb2262d45276d8d1b13d59c.
Reverted https://github.com/pytorch/pytorch/pull/146821 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/146821#issuecomment-3052093902 ))
2025-07-09 10:29:31 +00:00
ab655816b8
Deprecate DataLoader pin_memory_device param ( #146821 )
...
Following [ #131858 suggestion](https://github.com/pytorch/pytorch/pull/131858#pullrequestreview-2517760602 ) to optimize DataLoader code
Pull Request resolved: https://github.com/pytorch/pytorch/pull/146821
Approved by: https://github.com/divyanshk
Co-authored-by: Divyansh Khanna <divyanshkhanna09@gmail.com >
2025-07-08 09:24:53 +00:00
d40aaa42ee
[BE][16/16] fix typos in torch/ (torch/utils/) ( #156606 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156606
Approved by: https://github.com/albanD
ghstack dependencies: #156318 , #156320 , #156602 , #156604
2025-07-02 22:55:29 +00:00
7709ff5512
[remove untyped defs] batch 1 ( #157011 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/157011
Approved by: https://github.com/Skylion007
2025-06-30 23:54:40 +00:00
e6d8ed02cb
PyTorch Data Sampler benchmark ( #156974 )
...
## Motivation
Many PRs optimizing samplers (for eg https://github.com/pytorch/pytorch/pull/147706 , https://github.com/pytorch/pytorch/pull/137423 ) are leveraging an adhoc script for benchmarking samplers. The script and outputs are often copied over in PRs. We want to begin centralizing benchmarks for torch.utils.data components.
## What ?
* This PR adds a new `data` sub-folder under `benchmarks`, aimed at covering benchmarking scripts for torch.utils.data components like the dataloader and samplers.
* Specifically, this PR includes a simple script to time samplers. Such a script is often copy-pasted into PRs that optimize samplers; having it in a centralized location should prevent that and establish a common standard.
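A minimal sketch of such a timing helper (illustrative only; the names `batch_indices` and `time_sampler` are hypothetical stdlib-based stand-ins, not the script added by the PR):

```python
import timeit
from collections.abc import Iterator


def batch_indices(n: int, batch_size: int, drop_last: bool) -> Iterator[list[int]]:
    # Toy stand-in for a batch sampler: yields consecutive index batches
    batch: list[int] = []
    for i in range(n):
        batch.append(i)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch and not drop_last:
        yield batch


def time_sampler(n: int, batch_size: int, drop_last: bool, repeats: int = 5) -> float:
    # Best-of-N wall time for a full pass over the sampler
    return min(
        timeit.repeat(
            lambda: list(batch_indices(n, batch_size, drop_last)),
            repeat=repeats,
            number=1,
        )
    )
```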
## Output
```
Benchmark Results:
+--------------+-------------+----------------+-----------+-----------+
| Batch Size | Drop Last | Original (s) | New (s) | Speedup |
+==============+=============+================+===========+===========+
| 4 | True | 0.004 | 0.0088 | -119.62% |
+--------------+-------------+----------------+-----------+-----------+
| 4 | False | 0.0083 | 0.009 | -9.23% |
+--------------+-------------+----------------+-----------+-----------+
| 8 | True | 0.003 | 0.0074 | -147.64% |
+--------------+-------------+----------------+-----------+-----------+
| 8 | False | 0.0054 | 0.0075 | -38.72% |
+--------------+-------------+----------------+-----------+-----------+
| 64 | True | 0.0021 | 0.0056 | -161.92% |
+--------------+-------------+----------------+-----------+-----------+
| 64 | False | 0.0029 | 0.0055 | -92.50% |
+--------------+-------------+----------------+-----------+-----------+
| 640 | True | 0.002 | 0.0055 | -168.75% |
+--------------+-------------+----------------+-----------+-----------+
| 640 | False | 0.0024 | 0.0062 | -161.35% |
+--------------+-------------+----------------+-----------+-----------+
| 6400 | True | 0.0021 | 0.0055 | -160.13% |
+--------------+-------------+----------------+-----------+-----------+
| 6400 | False | 0.0021 | 0.0068 | -215.46% |
+--------------+-------------+----------------+-----------+-----------+
| 64000 | True | 0.0042 | 0.0065 | -55.29% |
+--------------+-------------+----------------+-----------+-----------+
| 64000 | False | 0.0029 | 0.0077 | -169.56% |
+--------------+-------------+----------------+-----------+-----------+
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156974
Approved by: https://github.com/ramanishsingh
2025-06-27 04:49:43 +00:00
f3e6c8e834
Fix #155016 for Docathon - convert rst to markdown ( #155198 )
...
Used [rst2myst tool](https://rst-to-myst.readthedocs.io/en/latest/ )
One note is that "Created On" and "Last Updated On" banner doesn't show in the markdown files... I'm not sure if that's just an artifact of my local build though.
Fixes #155016
Docs comparison (check out the 'new' whenever docs build)
1. cuda ([old](https://docs.pytorch.org/docs/main/cuda.html ) vs. [new](https://docs-preview.pytorch.org/pytorch/pytorch/155198/cuda.html ))
2. cuda.tunable ([old](https://docs.pytorch.org/docs/main/cuda.tunable.html ) vs. [new](https://docs-preview.pytorch.org/pytorch/pytorch/155198/cuda.tunable.html ))
3. leave cudnn_persistent_rnn.rst as is because it's reused in docstrings
4. leave cudnn_rnn_determinism.rst as is because it's reused in docstrings.
5. data ([old](https://docs.pytorch.org/docs/main/data.html ) vs. [new](https://docs-preview.pytorch.org/pytorch/pytorch/155198/data.html ))
Pull Request resolved: https://github.com/pytorch/pytorch/pull/155198
Approved by: https://github.com/albanD , https://github.com/svekars
2025-06-13 20:24:34 +00:00
dc82e911e7
remove allow-untyped-defs from torch/utils/data/datapipes/iter/filelister.py ( #154624 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/154624
Approved by: https://github.com/Skylion007
2025-05-30 08:38:05 +00:00
7ae204c3b6
[BE][CI][Easy] Run lintrunner on generated .pyi stub files ( #150732 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/150732
Approved by: https://github.com/malfet , https://github.com/cyyever , https://github.com/aorenste
2025-05-27 14:58:02 +00:00
9b2a45ac7d
Refactor torch/utils/data/datapipes/gen_pyi.py with torchgen ( #150626 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/150626
Approved by: https://github.com/aorenste
2025-05-17 06:21:41 +00:00
0f891cad5a
Enable ruff check for torch/utils/data/*.ipynb ( #148654 )
...
Fixes part of #146411
Enable ruff check for `torch/utils/data/*.ipynb` files
## Test Result
```bash
lintrunner -a --take RUFF torch/utils/data/*.ipynb
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148654
Approved by: https://github.com/Skylion007
2025-05-14 06:21:47 +00:00
b22fda9e1c
Remove conda refs in tools ( #152368 )
...
Fixes #152126
Did not find references in the two .ipynb files
Pull Request resolved: https://github.com/pytorch/pytorch/pull/152368
Approved by: https://github.com/atalman
2025-04-29 02:45:47 +00:00
7e11089fe5
Optimize dataloader Self typing ( #146816 )
...
Optimize `dataloader.py` method return type with Self typing
Pull Request resolved: https://github.com/pytorch/pytorch/pull/146816
Approved by: https://github.com/albanD
2025-04-08 03:52:23 +00:00
c3bb174bb2
SubsetRandomSampler - changed iteration over tensor to iteration over list ( #149126 )
...
Digging further into the problem at https://github.com/UKPLab/sentence-transformers/pull/3261 , it boils down to this expensive loop over a torch tensor. Looping over a list, like in `RandomSampler`, solves the issue.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/149126
Approved by: https://github.com/divyanshk , https://github.com/cyyever
2025-03-31 04:33:35 +00:00
68c12ecfe2
Move get accelerator to use build time flags when possible ( #146098 )
...
This PR does two main things (they are in a single PR to show how the newly added APIs are used).
- Add isBuilt and isAvailable APIs to the AcceleratorHook interface. See inline doc for their exact semantics
- Use the newly added isBuilt for accelerator check to ensure it does not poison fork
Pull Request resolved: https://github.com/pytorch/pytorch/pull/146098
Approved by: https://github.com/ngimel , https://github.com/malfet , https://github.com/EikanWang , https://github.com/jeromean
Co-authored-by: Jane (Yuan) Xu <31798555+janeyx99@users.noreply.github.com >
2025-03-10 13:17:58 +00:00
b246cd7b82
Revert "Move get accelerator to use build time flags when possible ( #146098 )"
...
This reverts commit 17302b4bc837af079d2f6480f07ea2c99b93fb4b.
Reverted https://github.com/pytorch/pytorch/pull/146098 on behalf of https://github.com/albanD due to Still fails with cuda build on a non-gpu machine ([comment](https://github.com/pytorch/pytorch/pull/146098#issuecomment-2707191770 ))
2025-03-07 18:59:58 +00:00
17302b4bc8
Move get accelerator to use build time flags when possible ( #146098 )
...
This PR does two main things (they are in a single PR to show how the newly added APIs are used).
- Add isBuilt and isAvailable APIs to the AcceleratorHook interface. See inline doc for their exact semantics
- Use the newly added isBuilt for accelerator check to ensure it does not poison fork
Pull Request resolved: https://github.com/pytorch/pytorch/pull/146098
Approved by: https://github.com/ngimel , https://github.com/malfet , https://github.com/EikanWang , https://github.com/jeromean
Co-authored-by: Jane (Yuan) Xu <31798555+janeyx99@users.noreply.github.com >
2025-03-07 15:19:34 +00:00
086d146f6f
Update ruff linter for PEP585 ( #147540 )
...
This turns on PEP585 enforcement in RUFF.
- Updates the target python version
- Stops ignoring UP006 warnings (PEP585)
- Fixes a few issues which crept into the tree in the last day
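The kind of rewrite PEP 585 enforcement produces (illustrative example):

```python
# Before (flagged by UP006): from typing import Dict, List
# def word_counts(words: List[str]) -> Dict[str, int]: ...

# After: builtin generics, usable in annotations since Python 3.9
def word_counts(words: list[str]) -> dict[str, int]:
    totals: dict[str, int] = {}
    for word in words:
        totals[word] = totals.get(word, 0) + 1
    return totals
```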
Pull Request resolved: https://github.com/pytorch/pytorch/pull/147540
Approved by: https://github.com/justinchuby , https://github.com/Skylion007
2025-02-22 04:45:17 +00:00
7ce4974e50
Fix PEP585 update ( #147536 )
...
Summary: D69920347 causes a pyre failure due to changing a base object from typing.Iterable to abc.Iterable. For now revert that change until it can be dealt with on its own.
Test Plan:
failures from D69920347 pass locally
unit tests pass
Reviewed By: oulgen
Differential Revision: D69936518
Pull Request resolved: https://github.com/pytorch/pytorch/pull/147536
Approved by: https://github.com/jeanschmidt
2025-02-21 14:37:03 +00:00
db4ce78d46
PEP585: More UP006 fixes ( #146392 )
...
This should be the final PR before we can enable RUFF UP006.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/146392
Approved by: https://github.com/justinchuby , https://github.com/albanD , https://github.com/Skylion007
2025-02-20 06:18:13 +00:00
ecf44d1002
Fixed a typo in dataset.py ( #146600 )
...
Changed word 'Mult' to 'Multi'.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/146600
Approved by: https://github.com/Skylion007
2025-02-07 05:09:51 +00:00
292af3cc89
[BE][Ez]: ISC001 Auto concatenate implicit one line strings ( #146408 )
...
Apply ruff rule about implicit string concatenation, this autofixes strings that are all the same type and on the same line. These lines are broken up likely as the result of autoformatters in the past. All fixes are automated using the autofixes in ISC001.
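What the autofix does, in miniature (illustrative example):

```python
# Before: two adjacent string literals implicitly concatenated on one line,
# typically left behind by an autoformatter (flagged by ISC001)
message = "expected a tensor, " "got a list instead"

# After the ISC001 autofix: a single literal
message_fixed = "expected a tensor, got a list instead"
```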
Pull Request resolved: https://github.com/pytorch/pytorch/pull/146408
Approved by: https://github.com/justinchuby , https://github.com/janeyx99
2025-02-04 19:07:04 +00:00
629840e038
Backout PEP585 use of Iterable ( #145438 )
...
Summary:
Importing `Iterable` from `collections.abc` here causes an internal product to fail MRO discovery, resulting in a collision between `Iterable` and `Generic`.
This fixes the failure on D68461304
Differential Revision: D68531443
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145438
Approved by: https://github.com/izaitsevfb
2025-01-23 11:45:37 +00:00
2f9d378f7b
PEP585 update - torch/utils ( #145201 )
...
See #145101 for details.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145201
Approved by: https://github.com/bobrenjc93
2025-01-21 21:04:10 +00:00
c07dc64017
Update pin memory related APIs to not pass 'device' argument ( #131858 )
...
Based on https://github.com/pytorch/pytorch/pull/126376 , this PR tries to update all PT callers (e.g., `Tensor.is_pinned()`, `Tensor.pin_memory()`) to not pass `device` argument.
As for `storage/untyped_storage.is_pinned()/pin_memory()`, we keep the `device` argument but passing `device` is discouraged. And if not given, the default `device` is still 'cuda' for BC.
Additionally, with device-agnostic pin_memory, the `pin_memory_device` argument of `torch.utils.data.DataLoader` is now discouraged. For BC, explicitly passing this argument is still effective. If not given, the default `device` will be the current accelerator.
Fixes #124908
Relates https://github.com/pytorch/pytorch/pull/126376
Pull Request resolved: https://github.com/pytorch/pytorch/pull/131858
Approved by: https://github.com/albanD
Co-authored-by: albanD <desmaison.alban@gmail.com >
2025-01-15 17:23:35 +00:00
90e81a157a
Migrate from Tuple -> tuple in torch/utils/data ( #144255 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144255
Approved by: https://github.com/andrewkho
2025-01-08 04:09:45 +00:00
99f2491af9
Revert "Use absolute path `path.resolve()` -> `path.absolute()` ( #129409 )"
...
This reverts commit 45411d1fc9a2b6d2f891b6ab0ae16409719e09fc.
Reverted https://github.com/pytorch/pytorch/pull/129409 on behalf of https://github.com/jeanschmidt due to Breaking internal CI, @albanD please help get this PR merged ([comment](https://github.com/pytorch/pytorch/pull/129409#issuecomment-2571316444 ))
2025-01-04 14:17:20 +00:00
45411d1fc9
Use absolute path `path.resolve()` -> `path.absolute()` ( #129409 )
...
Changes:
1. Always explicit `.absolute()`: `Path(__file__)` -> `Path(__file__).absolute()`
2. Replace `path.resolve()` with `path.absolute()` if the code is resolving the PyTorch repo root directory.
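The behavioral difference, sketched (illustrative example):

```python
from pathlib import Path

p = Path("pkg/../setup.py")

# absolute() merely prepends the current working directory;
# ".." components are kept and symlinks are not followed
abs_p = p.absolute()

# resolve() additionally normalizes ".." and follows symlinks, which can
# escape the intended directory when the checkout itself is a symlink
res_p = p.resolve()
```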
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129409
Approved by: https://github.com/albanD
2025-01-03 20:03:40 +00:00
55dc61dd52
Dataloader distribute tasks to workers when in_order is False ( #142324 )
...
Fixes #105203 and is a follow up PR to #141833
When `in_order` is True (the default), tasks are given out to workers in a round robin fashion. When `in_order` is False this is no longer needed, as we give up guarantees of reproducibility, and instead tasks should be given to workers that are able to perform work.
In this PR I've added tracking of the number of outstanding tasks for each worker (updated when tasks are added to their queue, and when data is returned to the main thread). When finding the next queue to add a task to, if `in_order` is False it will only add the task to the workers queue if it has fewer than `_prefetch_factor` tasks outstanding.
The current default behaviour is left as is.
Tests are also updated to assert on the worker IDs for each sample of data returned.
I've run the following to confirm they aren't flaky
```bash
for i in {1..20}; do python test/test_dataloader.py TestOutOfOrderDataLoader; done
```
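The two dispatch policies described above can be sketched as follows (illustrative helpers, not the actual DataLoader internals):

```python
def next_worker_in_order(last_worker: int, num_workers: int) -> int:
    # in_order=True: strict round-robin, regardless of each worker's backlog
    return (last_worker + 1) % num_workers


def next_worker_out_of_order(outstanding: list, prefetch_factor: int):
    # in_order=False: first worker with fewer than prefetch_factor
    # outstanding tasks; None means every worker's queue is full
    for worker, n_tasks in enumerate(outstanding):
        if n_tasks < prefetch_factor:
            return worker
    return None
```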
Pull Request resolved: https://github.com/pytorch/pytorch/pull/142324
Approved by: https://github.com/andrewkho
2025-01-03 12:57:04 +00:00
0d6db839a7
remove allow-untyped-defs from utils/data/datapipes/iter/streamreader.py ( #144088 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144088
Approved by: https://github.com/aorenste
2025-01-03 01:21:44 +00:00
cc4e70b7c3
Revert "Use absolute path `path.resolve()` -> `path.absolute()` ( #129409 )"
...
This reverts commit 135c7db99d646b8bd9603bf969d47d3dec5987b1.
Reverted https://github.com/pytorch/pytorch/pull/129409 on behalf of https://github.com/malfet due to need to revert to as dependency of https://github.com/pytorch/pytorch/pull/129374 ([comment](https://github.com/pytorch/pytorch/pull/129409#issuecomment-2562969825 ))
2024-12-26 17:26:06 +00:00
b77406a9ec
[BE][CI] bump `ruff` to 0.8.4 ( #143753 )
...
Changes:
1. Bump `ruff` from 0.7.4 to 0.8.4
2. Change `%`-formatted strings to f-string
3. Change arguments with the `__`-prefix to positional-only arguments with the `/` separator in function signatures.
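Changes 2 and 3 look like this in miniature (illustrative example):

```python
# Change 2: %-formatting rewritten to an f-string
rule, count = "UP031", 3
old_msg = "rule %s fired %d times" % (rule, count)
new_msg = f"rule {rule} fired {count} times"


# Change 3: a dunder-prefixed pseudo-positional parameter replaced by
# a true positional-only parameter using the `/` separator
def clamp_old(__value, lo=0.0, hi=1.0):
    return max(lo, min(hi, __value))


def clamp_new(value, /, lo=0.0, hi=1.0):
    return max(lo, min(hi, value))
```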
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143753
Approved by: https://github.com/Skylion007
2024-12-24 12:24:10 +00:00
135c7db99d
Use absolute path `path.resolve()` -> `path.absolute()` ( #129409 )
...
Changes:
1. Always explicit `.absolute()`: `Path(__file__)` -> `Path(__file__).absolute()`
2. Replace `path.resolve()` with `path.absolute()` if the code is resolving the PyTorch repo root directory.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129409
Approved by: https://github.com/albanD
2024-12-24 08:33:08 +00:00