Commit Graph

57 Commits

3cda34ebde [2/N] Apply ruff UP035 check in torch files (#164054)
This is the result of applying the ruff `UP035` check.
`Callable` is imported from `collections.abc` instead of `typing`.
`TypeAlias` is also imported from `typing`.
This PR is a follow-up to #163947.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/164054
Approved by: https://github.com/ezyang, https://github.com/Skylion007
2025-09-29 03:35:32 +00:00
e3b392bdfd [BC breaking] Remove deprecated imports for torch.utils.data.datapipes.iter.grouping (#163438)
This PR removes the import tricks for `SHARDING_PRIORITIES` and `ShardingFilterIterDataPipe` from `torch.utils.data.datapipes.iter.grouping`. They were slated for removal in PyTorch 2.1 but had not been removed.
Before this change:
```
from torch.utils.data.datapipes.iter.grouping import SHARDING_PRIORITIES
from torch.utils.data.datapipes.iter.grouping import ShardingFilterIterDataPipe
```
works.
After this change:
the same imports raise an `ImportError`.
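
The kind of import trick being removed can be sketched as follows. This is an assumption about the mechanism (a module-level `__getattr__` that forwards deprecated names, as described in PEP 562), not a copy of the PyTorch code; `demo_grouping` and `demo_sharding` are hypothetical stand-in modules built in memory so that the sketch runs without `torch`.

```python
import types
import warnings

# Hypothetical stand-in for the module that now owns the name.
sharding = types.ModuleType("demo_sharding")
sharding.SHARDING_PRIORITIES = "priorities-placeholder"

# Hypothetical stand-in for the old module that only forwards the name.
grouping = types.ModuleType("demo_grouping")


def _deprecated_getattr(name):
    # Forward a deprecated name to its new home and warn the caller.
    if name == "SHARDING_PRIORITIES":
        warnings.warn(f"{name} moved to demo_sharding", FutureWarning, stacklevel=2)
        return getattr(sharding, name)
    raise AttributeError(f"module 'demo_grouping' has no attribute {name!r}")


# Before: the shim is installed, so the old attribute path still resolves.
grouping.__getattr__ = _deprecated_getattr
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_value = grouping.SHARDING_PRIORITIES

# After: the shim is gone, so the old path fails with AttributeError
# (a `from ... import ...` statement would surface this as ImportError).
del grouping.__getattr__
try:
    grouping.SHARDING_PRIORITIES
    removed = False
except AttributeError:
    removed = True
```

Code that still relies on the old path would presumably need to import both names from `torch.utils.data.datapipes.iter.sharding` instead.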
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163438
Approved by: https://github.com/janeyx99
2025-09-23 05:02:06 +00:00
5cedc5a0ff [BE][PYFMT] migrate PYFMT for torch/[p-z]*/ to ruff format (#144552)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144552
Approved by: https://github.com/ezyang
2025-08-07 00:09:56 +00:00
2f9d378f7b PEP585 update - torch/utils (#145201)
See #145101 for details.
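
For readers unfamiliar with PEP 585, a minimal sketch of the annotation style it enables is shown here. The function `group_by_length` is hypothetical and is not taken from `torch/utils`; it assumes Python 3.9 or later.

```python
# PEP 585 style: built-in containers are subscripted directly,
# so typing.List and typing.Dict are no longer needed.
def group_by_length(words: list[str]) -> dict[int, list[str]]:
    """Group words by their length, preserving input order."""
    groups: dict[int, list[str]] = {}
    for word in words:
        groups.setdefault(len(word), []).append(word)
    return groups


grouped = group_by_length(["a", "bb", "c"])
```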

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145201
Approved by: https://github.com/bobrenjc93
2025-01-21 21:04:10 +00:00
f1df13f023 [BE][Easy] Fix PYI001: unprefixed-type-param in torch/utils/data/datapipes (#129885)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129885
Approved by: https://github.com/ezyang
2024-07-02 14:56:27 +00:00
7cf0b90e49 [BE] enable UFMT in torch.utils.data (#127705)
Part of #123062

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127705
Approved by: https://github.com/ezyang
ghstack dependencies: #127706, #127704
2024-06-27 23:16:24 +00:00
f911957573 [BE] sort imports in torch.utils.data (#127704)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/127704
Approved by: https://github.com/ezyang
ghstack dependencies: #127706
2024-06-27 23:16:24 +00:00
8db9dfa2d7 Flip default value for mypy disallow_untyped_defs [9/11] (#127846)
See #127836 for details.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127846
Approved by: https://github.com/ezyang
ghstack dependencies: #127842, #127843, #127844, #127845
2024-06-08 18:50:06 +00:00
f58ecd4823 docs: fix docstrings for datapipes and other (#112765)
Fixes #112636
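
As a sketch of what a docstring looks like once it satisfies a few of the pydocstyle rules this commit addresses (D205, D400 and D401), consider the hypothetical function below. It is not taken from the PyTorch sources.

```python
def is_non_negative(value: int) -> bool:
    """Check that the value is non-negative.

    The summary line is in the imperative mood (D401), ends with a
    period (D400), and is separated from this description by exactly
    one blank line (D205).
    """
    return value >= 0


doc_lines = is_non_negative.__doc__.splitlines()
```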

Before: 265
```
torch/utils/data/datapipes/dataframe/structures.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/dataframe/structures.py:8 in public class `DataChunkDF`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/dataframe/structures.py:8 in public class `DataChunkDF`:
        D208: Docstring is over-indented
torch/utils/data/datapipes/dataframe/structures.py:8 in public class `DataChunkDF`:
        D400: First line should end with a period (not ',')
torch/utils/data/datapipes/dataframe/structures.py:13 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/dataframe/structures.py:17 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/datapipe.py:43 in public class `IterDataPipe`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/datapipe.py:119 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:122 in public method `__getattr__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:135 in public method `register_function`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:139 in public method `register_datapipe_as_function`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:161 in public method `__getstate__`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/datapipe.py:161 in public method `__getstate__`:
        D401: First line should be in imperative mood; try rephrasing (found 'This')
torch/utils/data/datapipes/datapipe.py:171 in public method `__reduce_ex__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:180 in public method `set_getstate_hook`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:186 in public method `set_reduce_ex_hook`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:191 in public method `__repr__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:197 in public method `__str__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:203 in public method `__dir__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:208 in public method `reset`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/datapipe.py:208 in public method `reset`:
        D400: First line should end with a period (not ',')
torch/utils/data/datapipes/datapipe.py:217 in public class `DFIterDataPipe`:
        D101: Missing docstring in public class
torch/utils/data/datapipes/datapipe.py:223 in public class `MapDataPipe`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/datapipe.py:261 in public method `__getattr__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:274 in public method `register_function`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:278 in public method `register_datapipe_as_function`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:293 in public method `__getstate__`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/datapipe.py:293 in public method `__getstate__`:
        D401: First line should be in imperative mood; try rephrasing (found 'This')
torch/utils/data/datapipes/datapipe.py:303 in public method `__reduce_ex__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:312 in public method `set_getstate_hook`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:318 in public method `set_reduce_ex_hook`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:323 in public method `__repr__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:329 in public method `__str__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:335 in public method `__dir__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:392 in public class `DataChunk`:
        D101: Missing docstring in public class
torch/utils/data/datapipes/datapipe.py:393 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/datapipe.py:397 in public method `as_str`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:401 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:404 in public method `raw_iterator`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/iter/callable.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/iter/callable.py:23 in public class `MapperIterDataPipe`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/callable.py:23 in public class `MapperIterDataPipe`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/callable.py:63 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/callable.py:121 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/callable.py:125 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/callable.py:173 in public class `CollatorIterDataPipe`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/callable.py:213 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/combinatorics.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/iter/combinatorics.py:18 in public class `SamplerIterDataPipe`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/combinatorics.py:29 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/combinatorics.py:44 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combinatorics.py:47 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combinatorics.py:56 in public class `ShufflerIterDataPipe`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/combinatorics.py:56 in public class `ShufflerIterDataPipe`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/combinatorics.py:56 in public class `ShufflerIterDataPipe`:
        D400: First line should end with a period (not 'r')
torch/utils/data/datapipes/iter/combinatorics.py:94 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/combinatorics.py:114 in public method `set_shuffle`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/iter/combinatorics.py:118 in public method `set_seed`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/iter/combinatorics.py:122 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combinatorics.py:137 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combinatorics.py:142 in public method `reset`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/iter/combinatorics.py:150 in public method `__getstate__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combinatorics.py:165 in public method `__setstate__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combinatorics.py:179 in public method `__del__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/iter/combining.py:26 in public class `ConcaterIterDataPipe`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/combining.py:26 in public class `ConcaterIterDataPipe`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/combining.py:26 in public class `ConcaterIterDataPipe`:
        D400: First line should end with a period (not 'l')
torch/utils/data/datapipes/iter/combining.py:44 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/combining.py:51 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:55 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:64 in public class `ForkerIterDataPipe`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/combining.py:92 in public method `__new__`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/iter/combining.py:108 in private class `_ContainerTemplate`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/combining.py:108 in private class `_ContainerTemplate`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/combining.py:108 in private class `_ContainerTemplate`:
        D400: First line should end with a period (not 'd')
torch/utils/data/datapipes/iter/combining.py:126 in private method `get_length_by_instance`:
        D200: One-line docstring should fit on one line with quotes (found 3)
torch/utils/data/datapipes/iter/combining.py:126 in private method `get_length_by_instance`:
        D400: First line should end with a period (not '`')
torch/utils/data/datapipes/iter/combining.py:136 in private class `_ForkerIterDataPipe`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/combining.py:136 in private class `_ForkerIterDataPipe`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/combining.py:136 in private class `_ForkerIterDataPipe`:
        D400: First line should end with a period (not 's')
torch/utils/data/datapipes/iter/combining.py:275 in private class `_ChildDataPipe`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/combining.py:275 in private class `_ChildDataPipe`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/combining.py:275 in private class `_ChildDataPipe`:
        D400: First line should end with a period (not 's')
torch/utils/data/datapipes/iter/combining.py:320 in private method `_set_main_datapipe_valid_iterator_id`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/combining.py:343 in private method `_check_valid_iterator_id`:
        D200: One-line docstring should fit on one line with quotes (found 3)
torch/utils/data/datapipes/iter/combining.py:351 in public class `DemultiplexerIterDataPipe`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/combining.py:351 in public class `DemultiplexerIterDataPipe`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/combining.py:351 in public class `DemultiplexerIterDataPipe`:
        D400: First line should end with a period (not 'n')
torch/utils/data/datapipes/iter/combining.py:384 in public method `__new__`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/iter/combining.py:399 in private class `_DemultiplexerIterDataPipe`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/combining.py:399 in private class `_DemultiplexerIterDataPipe`:
        D400: First line should end with a period (not 's')
torch/utils/data/datapipes/iter/combining.py:534 in public class `MultiplexerIterDataPipe`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/combining.py:534 in public class `MultiplexerIterDataPipe`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/combining.py:534 in public class `MultiplexerIterDataPipe`:
        D400: First line should end with a period (not ',')
torch/utils/data/datapipes/iter/combining.py:549 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/combining.py:553 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:566 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:572 in public method `reset`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/iter/combining.py:575 in public method `__getstate__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:585 in public method `__setstate__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:593 in public method `__del__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:599 in public class `ZipperIterDataPipe`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/combining.py:599 in public class `ZipperIterDataPipe`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/combining.py:615 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/combining.py:622 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:626 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/filelister.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/iter/filelister.py:15 in public class `FileListerIterDataPipe`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/filelister.py:36 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/filelister.py:58 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/filelister.py:62 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/fileopener.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/iter/fileopener.py:15 in public class `FileOpenerIterDataPipe`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/fileopener.py:15 in public class `FileOpenerIterDataPipe`:
        D400: First line should end with a period (not 'm')
torch/utils/data/datapipes/iter/fileopener.py:42 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/fileopener.py:66 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/fileopener.py:69 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/grouping.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/iter/grouping.py:31 in public class `BatcherIterDataPipe`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/grouping.py:31 in public class `BatcherIterDataPipe`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/grouping.py:31 in public class `BatcherIterDataPipe`:
        D400: First line should end with a period (not 's')
torch/utils/data/datapipes/iter/grouping.py:55 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/grouping.py:68 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/grouping.py:79 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/grouping.py:91 in public class `UnBatcherIterDataPipe`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/grouping.py:91 in public class `UnBatcherIterDataPipe`:
        D400: First line should end with a period (not 'l')
torch/utils/data/datapipes/iter/grouping.py:112 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/grouping.py:118 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/grouping.py:143 in public class `GrouperIterDataPipe`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/grouping.py:143 in public class `GrouperIterDataPipe`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/grouping.py:143 in public class `GrouperIterDataPipe`:
        D400: First line should end with a period (not ',')
torch/utils/data/datapipes/iter/grouping.py:185 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/grouping.py:233 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/grouping.py:257 in public method `reset`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/iter/grouping.py:261 in public method `__getstate__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/grouping.py:278 in public method `__setstate__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/grouping.py:294 in public method `__del__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/routeddecoder.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/iter/routeddecoder.py:19 in public class `RoutedDecoderIterDataPipe`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/routeddecoder.py:19 in public class `RoutedDecoderIterDataPipe`:
        D400: First line should end with a period (not 'a')
torch/utils/data/datapipes/iter/routeddecoder.py:37 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/routeddecoder.py:53 in public method `add_handler`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/iter/routeddecoder.py:56 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/routeddecoder.py:62 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/selecting.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/iter/selecting.py:21 in public class `FilterIterDataPipe`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/selecting.py:46 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/selecting.py:70 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/sharding.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/iter/sharding.py:17 in public class `SHARDING_PRIORITIES`:
        D101: Missing docstring in public class
torch/utils/data/datapipes/iter/sharding.py:30 in public class `ShardingFilterIterDataPipe`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/sharding.py:30 in public class `ShardingFilterIterDataPipe`:
        D400: First line should end with a period (not 's')
torch/utils/data/datapipes/iter/sharding.py:39 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/sharding.py:47 in public method `apply_sharding`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/iter/sharding.py:74 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/sharding.py:79 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/streamreader.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/iter/streamreader.py:10 in public class `StreamReaderIterDataPipe`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/streamreader.py:10 in public class `StreamReaderIterDataPipe`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/streamreader.py:10 in public class `StreamReaderIterDataPipe`:
        D400: First line should end with a period (not 'l')
torch/utils/data/datapipes/iter/streamreader.py:27 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/streamreader.py:31 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/utils.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/iter/utils.py:9 in public class `IterableWrapperIterDataPipe`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/utils.py:29 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/utils.py:33 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/utils.py:49 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/callable.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/map/callable.py:14 in public function `default_fn`:
        D103: Missing docstring in public function
torch/utils/data/datapipes/map/callable.py:20 in public class `MapperMapDataPipe`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/map/callable.py:20 in public class `MapperMapDataPipe`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/map/callable.py:45 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/map/callable.py:55 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/callable.py:58 in public method `__getitem__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combinatorics.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/map/combinatorics.py:15 in public class `ShufflerIterDataPipe`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/map/combinatorics.py:55 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/map/combinatorics.py:68 in public method `set_shuffle`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/map/combinatorics.py:72 in public method `set_seed`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/map/combinatorics.py:76 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combinatorics.py:85 in public method `reset`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/map/combinatorics.py:92 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combinatorics.py:95 in public method `__getstate__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combinatorics.py:110 in public method `__setstate__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combining.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/map/combining.py:12 in public class `ConcaterMapDataPipe`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/map/combining.py:12 in public class `ConcaterMapDataPipe`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/map/combining.py:34 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/map/combining.py:43 in public method `__getitem__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combining.py:52 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combining.py:58 in public class `ZipperMapDataPipe`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/map/combining.py:58 in public class `ZipperMapDataPipe`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/map/combining.py:76 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/map/combining.py:85 in public method `__getitem__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combining.py:94 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/grouping.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/map/grouping.py:12 in public class `BatcherMapDataPipe`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/map/grouping.py:12 in public class `BatcherMapDataPipe`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/map/grouping.py:12 in public class `BatcherMapDataPipe`:
        D400: First line should end with a period (not 's')
torch/utils/data/datapipes/map/grouping.py:34 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/map/grouping.py:47 in public method `__getitem__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/grouping.py:60 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/utils.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/map/utils.py:9 in public class `SequenceWrapperMapDataPipe`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/map/utils.py:32 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/map/utils.py:45 in public method `__getitem__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/utils.py:48 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/utils/common.py:26 in public function `validate_input_col`:
        D400: First line should end with a period (not 'n')
torch/utils/data/datapipes/utils/common.py:26 in public function `validate_input_col`:
        D401: First line should be in imperative mood (perhaps 'Check', not 'Checks')
torch/utils/data/datapipes/utils/common.py:127 in private function `_check_unpickable_fn`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/utils/common.py:127 in private function `_check_unpickable_fn`:
        D400: First line should end with a period (not 'g')
torch/utils/data/datapipes/utils/common.py:127 in private function `_check_unpickable_fn`:
        D401: First line should be in imperative mood (perhaps 'Check', not 'Checks')
torch/utils/data/datapipes/utils/common.py:156 in public function `match_masks`:
        D103: Missing docstring in public function
torch/utils/data/datapipes/utils/common.py:170 in public function `get_file_pathnames_from_root`:
        D103: Missing docstring in public function
torch/utils/data/datapipes/utils/common.py:207 in public function `get_file_binaries_from_pathnames`:
        D103: Missing docstring in public function
torch/utils/data/datapipes/utils/common.py:220 in public function `validate_pathname_binary_tuple`:
        D103: Missing docstring in public function
torch/utils/data/datapipes/utils/common.py:290 in public class `StreamWrapper`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/utils/common.py:290 in public class `StreamWrapper`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/utils/common.py:290 in public class `StreamWrapper`:
        D400: First line should end with a period (not 'y')
torch/utils/data/datapipes/utils/common.py:298 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/utils/common.py:315 in public method `close_streams`:
        D200: One-line docstring should fit on one line with quotes (found 3)
torch/utils/data/datapipes/utils/common.py:331 in public method `__getattr__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:335 in public method `close`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/utils/common.py:351 in public method `autoclose`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/utils/common.py:351 in public method `autoclose`:
        D400: First line should end with a period (not 's')
torch/utils/data/datapipes/utils/common.py:359 in public method `__dir__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:364 in public method `__del__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:368 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:371 in public method `__next__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:374 in public method `__repr__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:380 in public method `__getstate__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:383 in public method `__setstate__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/decoder.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/utils/decoder.py:31 in public function `basichandlers`:
        D103: Missing docstring in public function
torch/utils/data/datapipes/utils/decoder.py:87 in public function `handle_extension`:
        D202: No blank lines allowed after function docstring (found 1)
torch/utils/data/datapipes/utils/decoder.py:87 in public function `handle_extension`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/utils/decoder.py:87 in public function `handle_extension`:
        D401: First line should be in imperative mood (perhaps 'Return', not 'Returns')
torch/utils/data/datapipes/utils/decoder.py:115 in public class `ImageHandler`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/utils/decoder.py:115 in public class `ImageHandler`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/utils/decoder.py:139 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/utils/decoder.py:143 in public method `__call__`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/utils/decoder.py:187 in public function `imagehandler`:
        D103: Missing docstring in public function
torch/utils/data/datapipes/utils/decoder.py:194 in public function `videohandler`:
        D103: Missing docstring in public function
torch/utils/data/datapipes/utils/decoder.py:215 in public function `audiohandler`:
        D103: Missing docstring in public function
torch/utils/data/datapipes/utils/decoder.py:236 in public class `MatHandler`:
        D101: Missing docstring in public class
torch/utils/data/datapipes/utils/decoder.py:237 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/utils/decoder.py:247 in public method `__call__`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/utils/decoder.py:253 in public function `mathandler`:
        D103: Missing docstring in public function
torch/utils/data/datapipes/utils/decoder.py:261 in public function `extension_extract_fn`:
        D103: Missing docstring in public function
torch/utils/data/datapipes/utils/decoder.py:270 in public class `Decoder`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/utils/decoder.py:276 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/utils/decoder.py:282 in public method `add_handler`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/utils/decoder.py:292 in public method `decode1`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/utils/decoder.py:309 in public method `decode`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/utils/decoder.py:326 in public method `__call__`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/utils/snapshot.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/utils/snapshot.py:11 in private function `_simple_graph_snapshot_restoration`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/utils/snapshot.py:11 in private function `_simple_graph_snapshot_restoration`:
        D400: First line should end with a period (not ',')
torch/utils/data/datapipes/utils/snapshot.py:11 in private function `_simple_graph_snapshot_restoration`:
        D401: First line should be in imperative mood; try rephrasing (found 'This')
torch/utils/tensorboard/_convert_np.py:1 at module level:
        D200: One-line docstring should fit on one line with quotes (found 3)
torch/utils/tensorboard/_convert_np.py:9 in public function `make_np`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/tensorboard/_convert_np.py:9 in public function `make_np`:
        D400: First line should end with a period (not ':')
265
```

After: 166
```
torch/utils/data/datapipes/dataframe/structures.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/dataframe/structures.py:10 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/dataframe/structures.py:14 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/datapipe.py:120 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:123 in public method `__getattr__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:136 in public method `register_function`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:140 in public method `register_datapipe_as_function`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:173 in public method `__reduce_ex__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:182 in public method `set_getstate_hook`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:188 in public method `set_reduce_ex_hook`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:193 in public method `__repr__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:199 in public method `__str__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:205 in public method `__dir__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:221 in public class `DFIterDataPipe`:
        D101: Missing docstring in public class
torch/utils/data/datapipes/datapipe.py:266 in public method `__getattr__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:279 in public method `register_function`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:283 in public method `register_datapipe_as_function`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:309 in public method `__reduce_ex__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:318 in public method `set_getstate_hook`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:324 in public method `set_reduce_ex_hook`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:329 in public method `__repr__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:335 in public method `__str__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:341 in public method `__dir__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:398 in public class `DataChunk`:
        D101: Missing docstring in public class
torch/utils/data/datapipes/datapipe.py:399 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/datapipe.py:403 in public method `as_str`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:407 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:410 in public method `raw_iterator`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/iter/callable.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/iter/callable.py:65 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/callable.py:123 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/callable.py:127 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/callable.py:216 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/combinatorics.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/iter/combinatorics.py:30 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/combinatorics.py:45 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combinatorics.py:48 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combinatorics.py:97 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/combinatorics.py:117 in public method `set_shuffle`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/iter/combinatorics.py:121 in public method `set_seed`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/iter/combinatorics.py:125 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combinatorics.py:140 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combinatorics.py:145 in public method `reset`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/iter/combinatorics.py:153 in public method `__getstate__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combinatorics.py:168 in public method `__setstate__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combinatorics.py:182 in public method `__del__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/iter/combining.py:46 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/combining.py:53 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:57 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:95 in public method `__new__`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/iter/combining.py:388 in public method `__new__`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/iter/combining.py:556 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/combining.py:560 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:573 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:579 in public method `reset`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/iter/combining.py:582 in public method `__getstate__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:592 in public method `__setstate__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:600 in public method `__del__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:624 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/combining.py:631 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:635 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/filelister.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/iter/filelister.py:37 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/filelister.py:59 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/filelister.py:63 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/fileopener.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/iter/fileopener.py:41 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/fileopener.py:65 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/fileopener.py:68 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/grouping.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/iter/grouping.py:57 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/grouping.py:70 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/grouping.py:81 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/grouping.py:115 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/grouping.py:121 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/grouping.py:190 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/grouping.py:238 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/grouping.py:262 in public method `reset`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/iter/grouping.py:266 in public method `__getstate__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/grouping.py:283 in public method `__setstate__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/grouping.py:299 in public method `__del__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/routeddecoder.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/iter/routeddecoder.py:38 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/routeddecoder.py:54 in public method `add_handler`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/iter/routeddecoder.py:57 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/routeddecoder.py:63 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/selecting.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/iter/selecting.py:47 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/selecting.py:71 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/sharding.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/iter/sharding.py:17 in public class `SHARDING_PRIORITIES`:
        D101: Missing docstring in public class
torch/utils/data/datapipes/iter/sharding.py:40 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/sharding.py:48 in public method `apply_sharding`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/iter/sharding.py:75 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/sharding.py:80 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/streamreader.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/iter/streamreader.py:29 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/streamreader.py:33 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/utils.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/iter/utils.py:30 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/utils.py:34 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/utils.py:50 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/callable.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/map/callable.py:14 in public function `default_fn`:
        D103: Missing docstring in public function
torch/utils/data/datapipes/map/callable.py:47 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/map/callable.py:57 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/callable.py:60 in public method `__getitem__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combinatorics.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/map/combinatorics.py:56 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/map/combinatorics.py:69 in public method `set_shuffle`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/map/combinatorics.py:73 in public method `set_seed`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/map/combinatorics.py:77 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combinatorics.py:86 in public method `reset`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/map/combinatorics.py:93 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combinatorics.py:96 in public method `__getstate__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combinatorics.py:111 in public method `__setstate__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combining.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/map/combining.py:36 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/map/combining.py:45 in public method `__getitem__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combining.py:54 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combining.py:80 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/map/combining.py:89 in public method `__getitem__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combining.py:98 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/grouping.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/map/grouping.py:36 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/map/grouping.py:49 in public method `__getitem__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/grouping.py:62 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/utils.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/map/utils.py:33 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/map/utils.py:46 in public method `__getitem__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/utils.py:49 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/utils/common.py:157 in public function `match_masks`:
        D103: Missing docstring in public function
torch/utils/data/datapipes/utils/common.py:171 in public function `get_file_pathnames_from_root`:
        D103: Missing docstring in public function
torch/utils/data/datapipes/utils/common.py:208 in public function `get_file_binaries_from_pathnames`:
        D103: Missing docstring in public function
torch/utils/data/datapipes/utils/common.py:221 in public function `validate_pathname_binary_tuple`:
        D103: Missing docstring in public function
torch/utils/data/datapipes/utils/common.py:300 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/utils/common.py:331 in public method `__getattr__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:335 in public method `close`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/utils/common.py:356 in public method `__dir__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:361 in public method `__del__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:365 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:368 in public method `__next__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:371 in public method `__repr__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:377 in public method `__getstate__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:380 in public method `__setstate__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/decoder.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/utils/decoder.py:31 in public function `basichandlers`:
        D103: Missing docstring in public function
torch/utils/data/datapipes/utils/decoder.py:141 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/utils/decoder.py:145 in public method `__call__`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/utils/decoder.py:189 in public function `imagehandler`:
        D103: Missing docstring in public function
torch/utils/data/datapipes/utils/decoder.py:196 in public function `videohandler`:
        D103: Missing docstring in public function
torch/utils/data/datapipes/utils/decoder.py:217 in public function `audiohandler`:
        D103: Missing docstring in public function
torch/utils/data/datapipes/utils/decoder.py:238 in public class `MatHandler`:
        D101: Missing docstring in public class
torch/utils/data/datapipes/utils/decoder.py:239 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/utils/decoder.py:249 in public method `__call__`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/utils/decoder.py:255 in public function `mathandler`:
        D103: Missing docstring in public function
torch/utils/data/datapipes/utils/decoder.py:263 in public function `extension_extract_fn`:
        D103: Missing docstring in public function
torch/utils/data/datapipes/utils/decoder.py:279 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/utils/decoder.py:285 in public method `add_handler`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/utils/decoder.py:295 in public method `decode1`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/utils/decoder.py:312 in public method `decode`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/utils/decoder.py:329 in public method `__call__`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/utils/snapshot.py:1 at module level:
        D100: Missing docstring in public module
166
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112765
Approved by: https://github.com/ejguan
2023-11-03 21:01:19 +00:00
abc1cadddb [BE] Enable ruff's UP rules and autoformat utils/ (#105424)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105424
Approved by: https://github.com/ezyang, https://github.com/malfet
2023-07-18 20:17:25 +00:00
46faa79e09 Simplify by using yield from in torch/utils/data (#97839)
Also see https://github.com/pytorch/pytorch/pull/97831
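
A minimal sketch of the pattern, not code taken from the PR (both generator names are hypothetical): a nested loop that re-yields every item can be replaced by delegating to the inner iterable with `yield from`.

```python
def chain_explicit(*iterables):
    # Old style: loop over each inner iterable and re-yield item by item.
    for iterable in iterables:
        for item in iterable:
            yield item


def chain_delegating(*iterables):
    # New style: delegate iteration of each inner iterable to `yield from`.
    for iterable in iterables:
        yield from iterable


print(list(chain_delegating([1, 2], "ab")))
```

Both generators produce the same sequence; the second is just shorter.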
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97839
Approved by: https://github.com/NivekT, https://github.com/Skylion007
2023-03-29 04:51:26 +00:00
d8f4026ebf Continue support sharding pipes in tud.datapipes.iter.grouping as deprecated (#94527)
Summary:
https://github.com/pytorch/pytorch/pull/94095 moves this into `tud.datapipes.iter.sharding`. However, because this was previously a public API, the move is a BC-breaking change.

As discussed in https://github.com/pytorch/data/pull/987#issuecomment-1422440049, we will keep backward-compatible support, but with a deprecation warning.
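
One common way to keep an old import location working while warning about it is a module-level `__getattr__` (PEP 562). The sketch below is an illustration under assumptions, not the code from this PR: `old_home`, `new_home`, and `ShardingFilter` are hypothetical stand-ins built in memory so the example is self-contained.

```python
import types
import warnings

# Hypothetical stand-ins for the new and the deprecated module locations.
new_home = types.ModuleType("new_home")
new_home.ShardingFilter = type("ShardingFilter", (), {})

old_home = types.ModuleType("old_home")


def _deprecated_getattr(name):
    # Forward the moved name to its new home and warn the caller.
    if name == "ShardingFilter":
        warnings.warn(
            f"`old_home.{name}` is deprecated; import it from `new_home` instead.",
            FutureWarning,
            stacklevel=2,
        )
        return getattr(new_home, name)
    raise AttributeError(f"module 'old_home' has no attribute {name!r}")


# PEP 562: a module-level __getattr__ is consulted only when normal lookup fails.
old_home.__getattr__ = _deprecated_getattr
```

In a real package the function would simply be defined as `__getattr__` at the top level of the old module.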

Differential Revision: D43161015

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94527
Approved by: https://github.com/ejguan, https://github.com/NivekT
2023-02-10 18:42:10 +00:00
1e2d82b8e4 [BE] Merge isinstance calls together (#94419)
Simplify and speed up `isinstance` calls by checking for multiple types at the same time.
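
For illustration only (the helper names are hypothetical and not from the PR), the rewrite replaces chained `isinstance` calls with a single call that takes a tuple of types:

```python
def is_sequence_separate(value):
    # Before: one isinstance call per type, joined with `or`.
    return isinstance(value, list) or isinstance(value, tuple)


def is_sequence_merged(value):
    # After: a single isinstance call with a tuple of candidate types.
    return isinstance(value, (list, tuple))


print(is_sequence_merged([1, 2]), is_sequence_merged("text"))
```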

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94419
Approved by: https://github.com/ezyang
2023-02-09 00:47:26 +00:00
d6dec1a5cf Refactor sharding data pipe into a seperate file (#94095)
Move `ShardingFilterIterDataPipe` into a dedicated file.

Also, propose a dedicated parent class (`_ShardingIterDataPipe`) for sharding data pipes, as this seems more like a "system/engine-level" datapipe that gives strong hints to RS on how to execute and needs first-class citizen treatment in RS (compared with other "user-level" datapipes, which are mostly composable `Callable[[Iterable], Iterable]`). This way, `graph_settings.py` no longer needs to decide based on whether `is_shardable` and `apply_sharding` are present on a DataPipe. Open to other discussions, though.
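
A rough sketch of the idea, assuming a simplified setting: every class and function name below is hypothetical and stands in for the real datapipe classes. With a shared parent class, the engine can find sharding pipes with a type check instead of probing for attributes.

```python
class _ShardingBase:
    """Hypothetical marker parent class for sharding pipes."""

    def apply_sharding(self, num_instances, instance_id):
        raise NotImplementedError


class EveryNthFilter(_ShardingBase):
    """Hypothetical sharding pipe: keeps the items assigned to one instance."""

    def __init__(self, source):
        self.source = source
        self.num_instances = 1
        self.instance_id = 0

    def apply_sharding(self, num_instances, instance_id):
        self.num_instances = num_instances
        self.instance_id = instance_id

    def __iter__(self):
        for index, item in enumerate(self.source):
            if index % self.num_instances == self.instance_id:
                yield item


def shard_if_sharding_pipe(pipe, num_instances, instance_id):
    # Type-based dispatch: no probing for `is_shardable` / `apply_sharding`.
    if isinstance(pipe, _ShardingBase):
        pipe.apply_sharding(num_instances, instance_id)
        return True
    return False
```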

Open question: Should [ShardingRoundRobinDispatcherIterDataPipe](01fc762003/torchdata/datapipes/iter/util/sharding.py (L16-L17)) also be considered a `_ShardingIterDataPipe`? (E.g. this sharding is executed by replicating (the metadata), while `ShardingRoundRobinDispatcherIterDataPipe` hints that replication is too expensive and so requires round-robin data exchange/dispatch.)

Differential Revision: D43014692

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94095
Approved by: https://github.com/ejguan, https://github.com/NivekT
2023-02-07 09:12:02 +00:00
b073c09f7a Added keep_key option to Grouper (#92532)
Fixes https://github.com/pytorch/data/issues/256

The testing of this module is currently suboptimal in general. We should improve this in the future.

@ejguan

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92532
Approved by: https://github.com/ejguan
2023-01-25 20:58:21 +00:00
ad782ff7df Enable xdoctest runner in CI for real this time (#83816)
Builds on #83317 and enables running the doctests. Just need to figure out what is causing the failures.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/83816
Approved by: https://github.com/ezyang, https://github.com/malfet
2022-12-29 05:32:42 +00:00
09ccda0d94 Fix: Make __len__ of datapipes dynamic (#88302)
Fixes #88074

Several datapipes have their lengths cached on being executed for the first time. However, source datapipes might change in length (most prominently, whenever `apply_sharding` is called). The behaviour is counter-intuitive because we do not expect `__len__` to have side-effects.

This PR makes `__len__` dynamically computed.

Changes:
- Add note to the `datapipes` README that `__len__` should be dynamic and why.
- Remove caching of length computations in `ConcaterIterDataPipe`, `MultiplexerIterDataPipe`, `ZipperIterDataPipe`, `BatcherIterDataPipe`, `ConcaterMapDataPipe`, and `BatcherMapDataPipe`.
- This required removal of the `length` attribute in setstate/getstate of `MultiplexerIterDataPipe`. I am unsure whether to remove this completely and risk breaking saved checkpoints (as I did) or whether to just ignore the `length` of the loaded `state`.
- This also means the classes above no longer have a `length` attribute. I have found no uses of this, though.
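
The difference between a cached and a dynamic `__len__` can be sketched in plain Python. The classes below (`Source`, `CachedBatcher`, `DynamicBatcher`) are hypothetical stand-ins, not the actual DataPipe implementations; changing `source.n` simulates a source whose length changes, for example after sharding is applied.

```python
class Source:
    """Hypothetical source whose length can change over time."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n


class CachedBatcher:
    """Caches its length on first use, so later source changes are missed."""

    def __init__(self, source, batch_size):
        self.source = source
        self.batch_size = batch_size
        self.length = None

    def __len__(self):
        if self.length is None:
            # Ceiling division: number of batches needed to cover the source.
            self.length = -(-len(self.source) // self.batch_size)
        return self.length


class DynamicBatcher:
    """Recomputes its length on every call, so it always reflects the source."""

    def __init__(self, source, batch_size):
        self.source = source
        self.batch_size = batch_size

    def __len__(self):
        return -(-len(self.source) // self.batch_size)


source = Source(10)
cached, dynamic = CachedBatcher(source, 2), DynamicBatcher(source, 2)
lengths_before = (len(cached), len(dynamic))
source.n = 4  # simulate the source shrinking
lengths_after = (len(cached), len(dynamic))
```

In this sketch both report the same length at first, but only the dynamic version follows the source after it changes.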

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88302
Approved by: https://github.com/NivekT
2022-12-09 19:15:53 +00:00
9dadf8fcc2 [DataPipes] Add group support to the sharding_filter (#88424)
Differential Revision: [D41006747](https://our.internmc.facebook.com/intern/diff/D41006747)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88424
Approved by: https://github.com/ejguan
2022-11-07 22:07:01 +00:00
ea72a0991c Add support to traverse all python collection objects (#84079)
Fixes https://github.com/pytorch/data/issues/752

This PR makes the `traverse` function support more of Python's collection data structures. The `getstate_hook` is invoked after a custom `__getstate__` function, which guarantees that `traverse` works as long as the `DataPipe` works properly with multiprocessing.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84079
Approved by: https://github.com/NivekT, https://github.com/VitalyFedyunin
2022-09-23 16:21:25 +00:00
4618371da5 Integrate xdoctest - Rebased (#82797)
This is a new version of #15648 based on the latest master branch.

Unlike the previous PR where I fixed a lot of the doctests in addition to integrating xdoctest, I'm going to reduce the scope here. I'm simply going to integrate xdoctest, and then I'm going to mark all of the failing tests as "SKIP". This will let xdoctest run on the dashboards, provide some value, and still let the dashboards pass. I'll leave fixing the doctests themselves to another PR.

In my initial commit, I do the bare minimum to get something running with failing dashboards. The few tests that I marked as skip are causing segfaults. Running xdoctest results in 293 failed, 201 passed tests. The next commits will be to disable those tests. (unfortunately I don't have a tool that will insert the `#xdoctest: +SKIP` directive over every failing test, so I'm going to do this mostly manually.)

Fixes https://github.com/pytorch/pytorch/issues/71105

@ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82797
Approved by: https://github.com/ezyang
2022-08-12 02:08:01 +00:00
35d97e21c8 [DataPipe] Simple graph snapshotting (#79479)
This mostly completes the "poor man's snapshotting" implementation (named "simple snapshotting"). This is the most basic version of snapshotting, but it should work for all DataPipes. I will be adding more efficient implementations for different types of DataPipes in future PRs.

### Implementation

The general idea of the simple snapshot is that we will:
1. Create a new iterator
2. Move that iterator forward by `n_iterations`
3. Save that as the `_fast_forward_iterator` of the DataPipe
4. The next time `iter` is called on the DataPipe, use the `_fast_forward_iterator`
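
The four steps above can be sketched in plain Python. `CountingPipe` and `simple_snapshot` are hypothetical names used only for illustration; the sketch ignores RNG state and assumes the fast-forwarded iterator is consumed once and then discarded.

```python
class CountingPipe:
    """Hypothetical iterable standing in for a DataPipe."""

    def __init__(self, n):
        self.n = n
        self._fast_forward_iterator = None

    def _generate(self):
        yield from range(self.n)

    def __iter__(self):
        # Step 4: hand out the fast-forwarded iterator once, if one was prepared.
        if self._fast_forward_iterator is not None:
            it, self._fast_forward_iterator = self._fast_forward_iterator, None
            return it
        return self._generate()


def simple_snapshot(pipe, n_iterations):
    """Hypothetical helper mirroring steps 1-3."""
    it = iter(pipe)                   # 1. create a new iterator
    for _ in range(n_iterations):     # 2. move it forward by n_iterations
        next(it)
    pipe._fast_forward_iterator = it  # 3. save it on the pipe


pipe = CountingPipe(6)
simple_snapshot(pipe, 2)
resumed = list(pipe)    # continues after the first two elements
restarted = list(pipe)  # a later iteration starts from the beginning
```

The cost of this approach is that restoring a snapshot replays every element that was already yielded, which is why it is described as the most basic version.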

### Usage
As of this implementation, the usage will look something like:
```python
rng = torch.Generator()
initial_rng_state = rng.get_state()
datapipe: IterDataPipe = ...
# Some usage of the DataPipe, here maybe yielding the first 5 values
n_iter = 5
it = iter(datapipe)
for _ in range(n_iter):
    next(it)
serialized_graph = pickle.dumps(datapipe)

# The serialized object has most of the sufficient information for simple snapshot (except for initial RNG state)
# It can be deserialized at a later point in time or by a different process
deserialized_graph = pickle.loads(serialized_graph)
# I think `DataLoader2` or `ReadingService` should store `initial_rng_state` that can be saved by the API that we later use
rng_for_deserialized = torch.Generator()
rng_for_deserialized.set_state(initial_rng_state)
n_iterations = deserialized_graph._number_of_samples_yielded

_simple_snapshot_graph(deserialized_graph, n_iterations, rng=rng_for_deserialized)
# The whole DataPipe graph should have the same state as before serialization, such that:
self.assertEqual(list(it), list(deserialized_graph))  # True
```

### Next Steps
If this looks acceptable, the next step is I will modify `DataLoader2`'s prototype ReadingService (the one with queues) to remember things like `initial_rng_state` and to have methods `save_snapshot` that will return the `(serialized graph, initial_rng)` and `restore_snapshot`. This should work for single worker data loading.

Note that, in the long term, `initial_rng_state` may not be necessary if we are able to directly save/restore the buffer and RNG state of `Shuffler` (that is work in progress). However, `initial_rng_state` and simple snapshot are still a good fall-back option for some edge cases where the buffer can't be stored.

Differential Revision: [D37943406](https://our.internmc.facebook.com/intern/diff/D37943406)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79479
Approved by: https://github.com/ejguan
2022-07-23 02:53:15 +00:00
3d218e1c87 Raise warning for unpickable local function (#547) (#80232)
Summary:
X-link: https://github.com/pytorch/data/pull/547

Fixes https://github.com/pytorch/data/issues/538
- Improve the validation function to raise warning about unpickable function when either lambda or local function is provided to DataPipe.
- The inner function from functools.partial object is extracted as well for validation
- Mimic the behavior of pickle module for local lambda function: It would only raise Error for the local function rather than lambda function. So, we will raise warning about local function not lambda function.
```py

>>> import pickle
>>> def fn():
...     lf = lambda x: x
...     pickle.dumps(lf)
>>> fn()
AttributeError: Can't pickle local object 'fn.<locals>.<lambda>'
```
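
A validation of this kind could look roughly like the sketch below. `is_local_fn` is a hypothetical helper, not the function added by this PR; it assumes a local function can be recognised by the `<locals>` marker in its `__qualname__`, and it unwraps `functools.partial` objects first because they carry no `__qualname__` of their own.

```python
import functools


def is_local_fn(fn):
    """Hypothetical check, not the actual PyTorch helper."""
    # Unwrap functools.partial to reach the inner function.
    while isinstance(fn, functools.partial):
        fn = fn.func
    # Functions defined inside another function carry "<locals>" in their qualified name.
    return "<locals>" in getattr(fn, "__qualname__", "")


def make_local_fn():
    def local_fn(x):
        return x

    return local_fn


local_result = is_local_fn(make_local_fn())
partial_result = is_local_fn(functools.partial(make_local_fn(), 1))
builtin_result = is_local_fn(functools.partial(int, base=2))
```

A caller could then emit a warning when the check returns `True` instead of waiting for `pickle` to fail later.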

This Diff also fixes the Error introduced by https://github.com/pytorch/pytorch/pull/79344

Test Plan:
CI on PyTorch and TorchData
Manually validated the tests from TorchVision

Differential Revision: D37417556

Pull Request resolved: https://github.com/pytorch/pytorch/pull/80232
Approved by: https://github.com/NivekT
2022-06-27 21:47:09 +00:00
79ba65c0f2 Revert "Raise warning for unpickable local function (#80140)"
This reverts commit 4b75b7d3c1504323d62c9f702de04b17aec0175c.

Reverted https://github.com/pytorch/pytorch/pull/80140 on behalf of https://github.com/ejguan due to It will break the CI for TorchData
2022-06-24 14:49:06 +00:00
4b75b7d3c1 Raise warning for unpickable local function (#80140)
Fixes https://github.com/pytorch/data/issues/538

- Improve the validation function to raise warning about unpickable function when either lambda or local function is provided to `DataPipe`.
- The inner function from `functools.partial` object is extracted as well for validation
- Mimic the behavior of `pickle` module for local lambda function: It would only raise Error for the local function rather than `lambda` function. So, we will raise warning about local function not lambda function.
```py
>>> import pickle
>>> def fn():
...     lf = lambda x: x
...     pickle.dumps(lf)
>>> fn()
AttributeError: Can't pickle local object 'fn.<locals>.<lambda>'
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80140
Approved by: https://github.com/VitalyFedyunin, https://github.com/NivekT
2022-06-24 13:50:51 +00:00
ee080918df [DataPipe] Moving DataPipe buffers from __iter__ to instance (self)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76999

Approved by: https://github.com/VitalyFedyunin
2022-05-18 01:31:39 +00:00
0289ab2cec Fix data-related public API (#368)
Summary:
X-link: https://github.com/pytorch/data/pull/368

This PR aims to expose the right data-related API.

Two more changes in this PR convert public APIs to private ones:
`check_lambda_fn` -> `_check_lambda_fn`
`deprecation_warning` -> `_deprecation_warning`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76143

Reviewed By: albanD, NivekT

Differential Revision: D35798311

Pulled By: ejguan

fbshipit-source-id: b13fded5c88a533c706702fb2070c918c839dca4
(cherry picked from commit 0b534b829a2e90e1e533951c6d334fdeaa9358b9)
2022-04-21 17:27:05 -07:00
eec994fc16 [DataPipe] Separating DataPipes from Dataset into different files (#73396)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73396

Separating DataPipes from Dataset into different files. This makes the code more maintainable and simplifies some of the code generation.

I have also tried to move `datapipe.py` into `torch.utils.data.datapipes`, but that leads to circular imports and requires rewriting many import statements. Should I put more time into going down that path?

Fixes https://github.com/pytorch/data/issues/213

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D34481962

Pulled By: NivekT

fbshipit-source-id: 42fb26fe7fc334636852cfd8719fc807bdaa7912
(cherry picked from commit 81e76a64e297cb5c58caa951c554e49526173936)
2022-03-15 14:46:34 +00:00
cd4ecce1bb [DataPipe] Fix issue with DataPipe serialization with dill (#72896)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72896

Fixing the issue described here: https://github.com/pytorch/data/issues/214

There will be a follow-up PR in TorchData as well

Test Plan: Imported from OSS

Reviewed By: gchanan

Differential Revision: D34258669

Pulled By: NivekT

fbshipit-source-id: 6dd88250ed14ebe779915dc46139be7e012e9d1b
(cherry picked from commit 025b8ed98019e576bfef04c33a3f33ed1a426a66)
2022-02-23 16:31:20 +00:00
f5e201e4e9 [DataPipe] Adding usage examples for IterDataPipes (#73033)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/73033

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D34313793

Pulled By: NivekT

fbshipit-source-id: 51125be2f79d73d02658b2b1c2691f96be8d4769
(cherry picked from commit 3e3c2df7c6a9f6cb51f2343254063487a091729a)
2022-02-18 15:12:34 +00:00
3e1eff9a0e [DataPipe] Add docstrings for IterDataPipe and MapDataPipe, along with small doc changes for consistency (#72618)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72618

The major changes are in torch/utils/data/dataset.py

Let me know if anything is unclear. I'm open to suggestions.

Test Plan: Imported from OSS

Reviewed By: VitalyFedyunin

Differential Revision: D34119492

Pulled By: NivekT

fbshipit-source-id: 358cb6d33d18501f9042431350f872ebaa9b4070
(cherry picked from commit 53b484f60ad942c9b86b060c40fe5a3b994424f9)
2022-02-10 16:25:36 +00:00
b4e5b4d92e [DataPipe] Fixing IterDataPipe docstrings (#72475)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72475

To render the change in documentation, please pull down this PR and build the doc in `TorchData`.

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D34078064

Pulled By: NivekT

fbshipit-source-id: 0d3d02d5d05ecd774251cf8d04c40413660446f1
(cherry picked from commit 9f604689a47ad94331d18b5d1370f9fff8fe42ad)
2022-02-08 22:52:27 +00:00
13ea2cb330 [DataPipe] Make GroupBy serializable with lambda function (#71497)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71497

Related to https://github.com/pytorch/data/issues/172

cc VitalyFedyunin ejguan NivekT

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D33668749

Pulled By: NivekT

fbshipit-source-id: 6506614e9d4389dc645d8985c00fdb3402122d9b
(cherry picked from commit 458e76fcb1a60691a225f3f5e4a058a51490732d)
2022-01-21 16:04:45 +00:00
1e3893ecbb [DataPipe] Removing deprecated DataPipes (#71161)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71161

Users should import these DataPipes from [TorchData](https://github.com/pytorch/data) if they would like to use them. We will be checking for any downstream library usage before landing this PR.

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D33532272

Pulled By: NivekT

fbshipit-source-id: 9dbfb21baf2d1183e0aa379049ad8304753e08a1
2022-01-13 07:37:48 -08:00
75dbe88b05 [DataPipe] removing unbatch_level from .groupby (#70249)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70249

IMO, the `unbatch_level` argument is not needed here since users can simply call `.unbatch` before calling `.groupby` if needed. One small step closer to a unified API with other libraries.

Note that we may rename the functional name from `.groupby` to `.group` in the future. TBD.

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D33259104

Pulled By: NivekT

fbshipit-source-id: 490e3b6f5927f9ebe8772d5a5e4fbabe9665dfdf
2021-12-22 07:13:12 -08:00
d8a44270d6 [DataPipe] Simplify BatcherIterDataPipe by removing 'unbatch_level' argument and functionality (#68594)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68594

Based on my conversation with ejguan [here](https://github.com/pytorch/pytorch/pull/68197#pullrequestreview-809148827), we both believe that having the `unbatch_level` argument and functionality is making this DataPipe unnecessarily complicated, because users can call `.unbatch` before `.batch` if they would like to do so. That will likely be cleaner as well.

I also checked other libraries (for example, [TensorFlow](https://www.tensorflow.org/api_docs/python/tf/data/Dataset#unbatch)), and I do not see them provide the ability to `unbatch` within the `batch` function either.

This PR simplifies the DataPipe by removing the argument.
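
The composition that replaces the argument can be sketched with plain generators. `unbatch` and `batch` here are hypothetical helper functions standing in for the `.unbatch` and `.batch` functional forms, not the DataPipe classes themselves.

```python
def unbatch(batches):
    """Flatten one level of batching."""
    for b in batches:
        yield from b


def batch(items, batch_size):
    """Group items into lists of at most batch_size elements."""
    buffer = []
    for item in items:
        buffer.append(item)
        if len(buffer) == batch_size:
            yield buffer
            buffer = []
    if buffer:
        # Emit the final, possibly smaller, batch.
        yield buffer


# Re-batch data that arrives in batches of two into batches of three.
rebatched = list(batch(unbatch([[0, 1], [2, 3], [4]]), 3))
```

Chaining the two steps keeps each one simple, which is the argument made above for dropping `unbatch_level`.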

cc VitalyFedyunin ejguan NivekT

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D32532594

Pulled By: NivekT

fbshipit-source-id: 7276ce76ba2a3f207c9dfa58803a48e320adefed
2021-12-01 22:00:31 -08:00
a66ff81837 [DataPipe] Optimize Grouper from N^2 to N (#68647)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68647

Fixes #68539

When all data from the source datapipe is depleted, there is no need to repeatedly search for the biggest group in the buffer; the remaining groups can simply be yielded one by one.
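
A simplified sketch of the idea, assuming a hypothetical `group_by` generator that ignores buffer and group size limits: once the source is exhausted, the remaining groups are emitted in a single pass over the buffer rather than by searching for the largest group before each yield.

```python
from collections import defaultdict


def group_by(items, key_fn):
    """Hypothetical grouping generator; not the actual Grouper implementation."""
    buffer = defaultdict(list)
    for item in items:
        buffer[key_fn(item)].append(item)
    # Source depleted: emit each remaining group once, in insertion order.
    # This is a single pass, with no search for the biggest group per yield.
    for key in tuple(buffer):
        yield buffer.pop(key)


groups = list(group_by(["a1", "b1", "a2", "c1", "b2"], lambda s: s[0]))
```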

Test Plan: Imported from OSS

Reviewed By: jbschlosser

Differential Revision: D32562646

Pulled By: ejguan

fbshipit-source-id: ce91763656bc457e9c7d0af5861a5606c89965d5
2021-11-22 07:49:13 -08:00
edab202a30 [DatePipe] add deprecation warnings for DataPipes that will solely exist in TorchData (#65827)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/65827

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D31272794

Pulled By: NivekT

fbshipit-source-id: 8da8266184b4df050422904cbc5fca6d7c3d2e02
2021-09-29 22:42:22 -07:00
ab5e1c69a7 [WIP] Example of DataPipes and DataFrames integration (#60840)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/60840

Test Plan: Imported from OSS

Reviewed By: wenleix, ejguan

Differential Revision: D29461080

Pulled By: VitalyFedyunin

fbshipit-source-id: 4909394dcd39e97ee49b699fda542b311b7e0d82
2021-09-13 18:50:15 -07:00
4f43480186 [DataPipe] adding/removing __len__ for different DataPipe (#64398)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64398

cc VitalyFedyunin ejguan

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D30710437

Pulled By: NivekT

fbshipit-source-id: 524eda43a2faa0db0c1a662bf9bb4283f0ade83c
2021-09-02 13:08:32 -07:00
a49907f984 Modify inline doc for DataPipe (#64221)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64221

List of tasks in this PR
- [x] Add inline doc for DataPipe
- [x] Improve the inline doc
- [x] Expose DataPipe to `datapipes.iter` (`UnBatcher`) Note: `Forker`, `Demux`, `Mux` are exposed in another PR authored by Kevin
- [x] Add correct typing to DataPipe
- [x] Unify the argument to `datapipe` rather than `source_datapipe`

Test Plan: Imported from OSS

Reviewed By: VitalyFedyunin

Differential Revision: D30650541

Pulled By: ejguan

fbshipit-source-id: c09d1b9742b8097d8e645c15947cef80c876877b
2021-08-30 18:45:46 -07:00
af85bc5ffd Replace group_by_key by group_by IterDataPipe (#64220)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64220

Remove `ByKeyGrouperIterDataPipe` due to duplicated functionality.
Fix a bug in `GrouperIterDataPipe` using the existing test.

Test Plan: Imported from OSS

Reviewed By: VitalyFedyunin

Differential Revision: D30650542

Pulled By: ejguan

fbshipit-source-id: 666b4d28282fb4f49f3ff101b8d08be16a50d836
2021-08-30 18:45:44 -07:00
7946f8a9f6 Rename DataPipe to Op-er (#63325)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63325

Rename each DataPipe to an operation name ending in "-er". The functional API should remain a verb, such as `read_from_tar`, `shuffle`, ... (discussed [here](https://github.com/facebookexternal/torchdata/pull/97#discussion_r688553905))
- Batch -> Batcher
- Collate -> Collator
- Concat -> Concater
- GroupByKey - > ByKeyGrouper ?
- ListDirFiles -> FileLister
- LoadFilesFromDisk -> FileLoader
- Map -> Mapper
- ReadFilesFromTar -> TarArchiveReader
- ReadFilesFromZip -> ZipArchiveReader
- ReadLinesFromFile -> LineReader
- Shuffle -> Shuffler
- ToBytes -> StreamReader
- Transforms -> Transformer
- Zip -> Zipper

Let me know if you have a better name for any DataPipe.

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D30466950

Pulled By: ejguan

fbshipit-source-id: 72909dca7b3964ab83b965891f96cc1ecf62d049
2021-08-23 14:36:10 -07:00
383a33a0eb Make DataChunk support list in-place ops (#63422)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63422

Fixes #63095

Make `DataChunk` delegate to the `list` methods, so that it supports in-place operations:
- `sort`
- `reverse`
- `append`
- `extend`
- `random.shuffle`
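
As a rough illustration, assuming the delegation is achieved by subclassing `list` (the hypothetical `Chunk` class below is a stand-in, not the real `DataChunk`), all of the listed operations then work in place:

```python
import random


class Chunk(list):
    """Hypothetical stand-in for DataChunk; inherits list's in-place methods."""


chunk = Chunk([3, 1, 2])
chunk.sort()       # [1, 2, 3]
chunk.append(4)    # [1, 2, 3, 4]
chunk.extend([5])  # [1, 2, 3, 4, 5]
chunk.reverse()    # [5, 4, 3, 2, 1]
after_ops = list(chunk)
random.shuffle(chunk)  # shuffles in place; the order is random
```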

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D30379027

Pulled By: ejguan

fbshipit-source-id: d176bd0cc8b89b915c7bb184ff243ab1f605616d
2021-08-18 08:48:47 -07:00
d1cbee7b2b Refactor BucketBatch (#63185)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63185

Test Plan: Imported from OSS

Reviewed By: bdhirsh

Differential Revision: D30288893

Pulled By: ejguan

fbshipit-source-id: b88b792d12a83c99d8ea9e516e3b4c54a82100f6
2021-08-16 06:42:56 -07:00
d3bdf345cb Introducing DataChunk for DataPipes batching (#62768)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62768

This is part of the TorchArrow DataFrame support preparation, split into multiple PRs to simplify the review process.

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D30149090

Pulled By: VitalyFedyunin

fbshipit-source-id: a36b5ff56e2ac6b06060014d4cd41b487754acb8
2021-08-06 08:38:33 -07:00
d3cb065b2f Implement usage of is_shardable and apply_sharding (#61236)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/61236

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D29588835

Pulled By: VitalyFedyunin

fbshipit-source-id: 00c3042f96af498637b2dcf6e3f842c1fc05ddd8
2021-07-12 14:23:20 -07:00
8e21ff91e2 [DataLoader] Add simple groupby DataPipe (#60675)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/60675

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D29461082

Pulled By: VitalyFedyunin

fbshipit-source-id: ded5a3a1555bfd8457d64b7e61ab6729fff9cb75
2021-07-01 08:40:20 -07:00
9b5e1e0734 [DataLoader] Make batch DataPipe sensitive to unbatch_level argument (#60672)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/60672

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D29461086

Pulled By: VitalyFedyunin

fbshipit-source-id: efc6b3b567323defe64d3f1b30a5708107e62dd4
2021-06-30 10:04:32 -07:00
fa030d1213 [DataPipes] Add simple unbatch to DataPipe (#59610)
Summary:
Implements the simple unbatch feature for DataPipe https://github.com/pytorch/pytorch/issues/58148

Pull Request resolved: https://github.com/pytorch/pytorch/pull/59610

Reviewed By: VitalyFedyunin

Differential Revision: D28994180

Pulled By: NivekT

fbshipit-source-id: 4bafe6e26c4f95a808c489b147369413a196fa1c
2021-06-09 16:53:31 -07:00
5c7e14d2bc [DataLoader] Switch NotImplementedError to TypeError for len (#59464)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59464

Fixes #59378

Test Plan: Imported from OSS

Reviewed By: jbschlosser

Differential Revision: D28944447

Pulled By: ejguan

fbshipit-source-id: 8b3d53a1863b41e578d56f219e452d18d7eae0d8
2021-06-08 07:16:18 -07:00