74 Commits

Author SHA1 Message Date
3cda34ebde [2/N] Apply ruff UP035 check in torch files (#164054)
This is the result of applying the ruff `UP035` check.
`Callable` is imported from `collections.abc` instead of `typing`.
`TypeAlias` is also imported from `typing`.
This PR is the follow-up of #163947.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/164054
Approved by: https://github.com/ezyang, https://github.com/Skylion007
2025-09-29 03:35:32 +00:00
5cedc5a0ff [BE][PYFMT] migrate PYFMT for torch/[p-z]*/ to ruff format (#144552)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144552
Approved by: https://github.com/ezyang
2025-08-07 00:09:56 +00:00
292af3cc89 [BE][Ez]: ISC001 Auto concatenate implicit one line strings (#146408)
Apply the ruff rule on implicit string concatenation. This autofixes strings that are all the same type and on the same line. These strings were likely split up by autoformatters in the past. All fixes are automated using the ISC001 autofixes.
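
A small sketch of what the ISC001 autofix amounts to; the message text is made up for illustration and does not come from the PR.

```python
# Before (flagged by ISC001): two literals on one line, joined implicitly.
before = "Expected a tensor " "but got None"

# After the autofix: a single literal with the same value.
after = "Expected a tensor but got None"
```

Because adjacent literals are concatenated by the parser, the fix should not change runtime behavior.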

Pull Request resolved: https://github.com/pytorch/pytorch/pull/146408
Approved by: https://github.com/justinchuby, https://github.com/janeyx99
2025-02-04 19:07:04 +00:00
2f9d378f7b PEP585 update - torch/utils (#145201)
See #145101 for details.
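
For context, a minimal sketch of a PEP 585 style annotation update, assuming Python 3.9 or newer; `group_pairs` is a hypothetical function, not code from this PR.

```python
# Before (pre-PEP 585): from typing import Dict, List, Tuple
# After: the built-in collections are subscripted directly.


def group_pairs(pairs: list[tuple[str, int]]) -> dict[str, list[int]]:
    """Collect the value of each (key, value) pair under its key."""
    grouped: dict[str, list[int]] = {}
    for key, value in pairs:
        grouped.setdefault(key, []).append(value)
    return grouped


grouped = group_pairs([("a", 1), ("b", 2), ("a", 3)])
```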

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145201
Approved by: https://github.com/bobrenjc93
2025-01-21 21:04:10 +00:00
90e81a157a Migrate from Tuple -> tuple in torch/utils/data (#144255)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144255
Approved by: https://github.com/andrewkho
2025-01-08 04:09:45 +00:00
f1df13f023 [BE][Easy] Fix PYI001: unprefixed-type-param in torch/utils/data/datapipes (#129885)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129885
Approved by: https://github.com/ezyang
2024-07-02 14:56:27 +00:00
7cf0b90e49 [BE] enable UFMT in torch.utils.data (#127705)
Part of #123062

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127705
Approved by: https://github.com/ezyang
ghstack dependencies: #127706, #127704
2024-06-27 23:16:24 +00:00
f911957573 [BE] sort imports in torch.utils.data (#127704)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/127704
Approved by: https://github.com/ezyang
ghstack dependencies: #127706
2024-06-27 23:16:24 +00:00
8db9dfa2d7 Flip default value for mypy disallow_untyped_defs [9/11] (#127846)
See #127836 for details.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127846
Approved by: https://github.com/ezyang
ghstack dependencies: #127842, #127843, #127844, #127845
2024-06-08 18:50:06 +00:00
4f5785b6b3 Enable possibly-undefined error code (#118533)
Fixes https://github.com/pytorch/pytorch/issues/118129

Suppressions automatically added with

```
import re

# Read the mypy output, one error per line.
with open("error_file.txt", "r") as f:
    errors = f.readlines()

# Map each file path to {line number: error code}.
error_lines = {}
for error in errors:
    # Matches lines of the form "path:line:col: error: message  [code]".
    match = re.match(r"(.*):(\d+):\d+: error:.*\[(.*)\]", error)
    if match:
        file_path, line_number, error_type = match.groups()
        if file_path not in error_lines:
            error_lines[file_path] = {}
        error_lines[file_path][int(line_number)] = error_type

# Append a "# type: ignore[code]" suppression to every offending line.
for file_path, lines in error_lines.items():
    with open(file_path, "r") as f:
        code = f.readlines()
    for line_number, error_type in sorted(lines.items(), key=lambda x: x[0], reverse=True):
        code[line_number - 1] = code[line_number - 1].rstrip() + f"  # type: ignore[{error_type}]\n"
    with open(file_path, "w") as f:
        f.writelines(code)
```
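
To make the script's pattern concrete, here is a small sketch that runs it on one line. The sample error line and the source line are hypothetical, and the sketch assumes mypy was run with column numbers shown, since the pattern expects a `path:line:col:` prefix.

```python
import re

# Same pattern as the script above: path, line number, and error code.
PATTERN = re.compile(r"(.*):(\d+):\d+: error:.*\[(.*)\]")

# Hypothetical mypy output line (assumes column numbers are enabled).
sample = 'torch/utils/example.py:12:5: error: Name "x" may be undefined  [possibly-undefined]'

match = PATTERN.match(sample)
assert match is not None
file_path, line_number, error_type = match.groups()

# The script appends this suffix to the offending source line.
source_line = "    return x\n"
patched = source_line.rstrip() + f"  # type: ignore[{error_type}]\n"
```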

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Co-authored-by: Catherine Lee <csl@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118533
Approved by: https://github.com/Skylion007, https://github.com/zou3519
2024-01-30 21:07:01 +00:00
40ece2e579 Revert "Enable possibly-undefined error code (#118533)"
This reverts commit 4f13f69a45ef53747e2eefffd65d91ce840b431b.

Reverted https://github.com/pytorch/pytorch/pull/118533 on behalf of https://github.com/clee2000 due to sorry i'm trying to figure out a codev merge conflict, if this works i'll be back to rebase and merge ([comment](https://github.com/pytorch/pytorch/pull/118533#issuecomment-1917695185))
2024-01-30 19:00:34 +00:00
4f13f69a45 Enable possibly-undefined error code (#118533)
Fixes https://github.com/pytorch/pytorch/issues/118129

Suppressions automatically added with

```
import re

# Read the mypy output, one error per line.
with open("error_file.txt", "r") as f:
    errors = f.readlines()

# Map each file path to {line number: error code}.
error_lines = {}
for error in errors:
    # Matches lines of the form "path:line:col: error: message  [code]".
    match = re.match(r"(.*):(\d+):\d+: error:.*\[(.*)\]", error)
    if match:
        file_path, line_number, error_type = match.groups()
        if file_path not in error_lines:
            error_lines[file_path] = {}
        error_lines[file_path][int(line_number)] = error_type

# Append a "# type: ignore[code]" suppression to every offending line.
for file_path, lines in error_lines.items():
    with open(file_path, "r") as f:
        code = f.readlines()
    for line_number, error_type in sorted(lines.items(), key=lambda x: x[0], reverse=True):
        code[line_number - 1] = code[line_number - 1].rstrip() + f"  # type: ignore[{error_type}]\n"
    with open(file_path, "w") as f:
        f.writelines(code)
```

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/118533
Approved by: https://github.com/Skylion007, https://github.com/zou3519
2024-01-30 05:08:10 +00:00
9bce208dfb Replace follow_imports = silent with normal (#118414)
This is a lot of files changed! Don't panic! Here's how it works:

* Previously, we set `follow_imports = silent` in our mypy.ini configuration. Per https://mypy.readthedocs.io/en/stable/running_mypy.html#follow-imports, this means that whenever a module is imported but is not listed as a file to be typechecked, mypy typechecks it as normal but suppresses all errors that occur in that file.
* When mypy is run inside lintrunner, the list of files is precisely the files covered by the glob in .lintrunner.toml, minus the files matched by its excludes.
* The top-level directive `# mypy: ignore-errors` instructs mypy to typecheck the file as normal, but ignore all errors.
* Therefore, setting `follow_imports = normal` should be equivalent, provided we put `# mypy: ignore-errors` on all files that were previously excluded from the file list.
* Having done this, we can remove the exclude list from .lintrunner.toml, since excluding a file from typechecking is now recorded in the files themselves.
* torch/_dynamo and torch/_inductor were previously in the exclude list, because they were covered by MYPYINDUCTOR. It is not OK to mark these as `# mypy: ignore-errors`, as this would impede typechecking on the alternate configuration. So they are temporarily being checked twice, but I am suppressing the errors in these files as the configurations are not quite the same. I plan to unify the configurations, so this is only a temporary state.
* A few straggler type errors remained after these changes, so I fixed them as needed.

In the future, to start type checking a file, just remove the ignore-errors directive from the top of the file.
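
As a rough illustration of the directive described above, consider the made-up module below. It is not a file from this PR, and `parse_port` and `port` are hypothetical names. With the directive on the first line, mypy still analyzes the module but reports none of its errors; deleting that line would surface the mismatched annotation.

```python
# mypy: ignore-errors
# The directive above must come before any code in the file.


def parse_port(value: str) -> int:
    """Convert a port given as text into an integer."""
    return int(value)


# A type checker would normally flag this line: an int is assigned to a
# name annotated as str. At runtime the annotation is not enforced.
port: str = parse_port("8080")
```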

The codemod was done with this script authored by GPT-4:

```
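import glob

# Glob patterns for the files that were previously excluded (elided here).
exclude_patterns = [
    ...
]

for pattern in exclude_patterns:
    for filepath in glob.glob(pattern, recursive=True):
        if filepath.endswith('.py'):
            # Read the whole file, rewind, and write it back with the
            # directive prepended.
            with open(filepath, 'r+') as f:
                content = f.read()
                f.seek(0, 0)
                f.write('# mypy: ignore-errors\n\n' + content)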
```
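
One possible variant, sketched here as an assumption rather than what the PR ran, skips files that already start with the directive so the codemod can be re-run safely. `add_ignore_directive` is a hypothetical helper, and the demo writes to a temporary file instead of the repository.

```python
import tempfile
from pathlib import Path

DIRECTIVE = "# mypy: ignore-errors\n\n"


def add_ignore_directive(path: Path) -> bool:
    """Prepend the directive unless the file already starts with it.

    Returns True if the file was modified.
    """
    content = path.read_text()
    if content.startswith(DIRECTIVE.rstrip()):
        return False
    path.write_text(DIRECTIVE + content)
    return True


# Demo on a throwaway file: the second call leaves the file unchanged.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "example.py"
    target.write_text("x = 1\n")
    first = add_ignore_directive(target)
    second = add_ignore_directive(target)
    final_text = target.read_text()
```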

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/118414
Approved by: https://github.com/thiagocrepaldi, https://github.com/albanD
2024-01-27 02:44:11 +00:00
f58ecd4823 docs: fix docstrings for datapipes and other (#112765)
Fixes #112636
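
For orientation, a minimal sketch of docstrings that would satisfy the most frequent codes in the report below (D105, D107, D204, D205, D400, D401). `ChunkExample` is a hypothetical class, not code from this PR.

```python
class ChunkExample:
    """Hold a list of items as one chunk.

    The summary line ends with a period (D400), is in the imperative
    mood (D401), and is followed by one blank line (D205).
    """

    # One blank line separates the class docstring from the body (D204).
    def __init__(self, items):
        """Store a copy of the items."""  # D107
        self.items = list(items)

    def __len__(self):
        """Return the number of stored items."""  # D105
        return len(self.items)
```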

Before: 265
```
torch/utils/data/datapipes/dataframe/structures.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/dataframe/structures.py:8 in public class `DataChunkDF`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/dataframe/structures.py:8 in public class `DataChunkDF`:
        D208: Docstring is over-indented
torch/utils/data/datapipes/dataframe/structures.py:8 in public class `DataChunkDF`:
        D400: First line should end with a period (not ',')
torch/utils/data/datapipes/dataframe/structures.py:13 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/dataframe/structures.py:17 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/datapipe.py:43 in public class `IterDataPipe`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/datapipe.py:119 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:122 in public method `__getattr__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:135 in public method `register_function`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:139 in public method `register_datapipe_as_function`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:161 in public method `__getstate__`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/datapipe.py:161 in public method `__getstate__`:
        D401: First line should be in imperative mood; try rephrasing (found 'This')
torch/utils/data/datapipes/datapipe.py:171 in public method `__reduce_ex__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:180 in public method `set_getstate_hook`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:186 in public method `set_reduce_ex_hook`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:191 in public method `__repr__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:197 in public method `__str__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:203 in public method `__dir__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:208 in public method `reset`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/datapipe.py:208 in public method `reset`:
        D400: First line should end with a period (not ',')
torch/utils/data/datapipes/datapipe.py:217 in public class `DFIterDataPipe`:
        D101: Missing docstring in public class
torch/utils/data/datapipes/datapipe.py:223 in public class `MapDataPipe`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/datapipe.py:261 in public method `__getattr__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:274 in public method `register_function`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:278 in public method `register_datapipe_as_function`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:293 in public method `__getstate__`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/datapipe.py:293 in public method `__getstate__`:
        D401: First line should be in imperative mood; try rephrasing (found 'This')
torch/utils/data/datapipes/datapipe.py:303 in public method `__reduce_ex__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:312 in public method `set_getstate_hook`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:318 in public method `set_reduce_ex_hook`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:323 in public method `__repr__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:329 in public method `__str__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:335 in public method `__dir__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:392 in public class `DataChunk`:
        D101: Missing docstring in public class
torch/utils/data/datapipes/datapipe.py:393 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/datapipe.py:397 in public method `as_str`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:401 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:404 in public method `raw_iterator`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/iter/callable.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/iter/callable.py:23 in public class `MapperIterDataPipe`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/callable.py:23 in public class `MapperIterDataPipe`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/callable.py:63 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/callable.py:121 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/callable.py:125 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/callable.py:173 in public class `CollatorIterDataPipe`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/callable.py:213 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/combinatorics.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/iter/combinatorics.py:18 in public class `SamplerIterDataPipe`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/combinatorics.py:29 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/combinatorics.py:44 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combinatorics.py:47 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combinatorics.py:56 in public class `ShufflerIterDataPipe`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/combinatorics.py:56 in public class `ShufflerIterDataPipe`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/combinatorics.py:56 in public class `ShufflerIterDataPipe`:
        D400: First line should end with a period (not 'r')
torch/utils/data/datapipes/iter/combinatorics.py:94 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/combinatorics.py:114 in public method `set_shuffle`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/iter/combinatorics.py:118 in public method `set_seed`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/iter/combinatorics.py:122 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combinatorics.py:137 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combinatorics.py:142 in public method `reset`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/iter/combinatorics.py:150 in public method `__getstate__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combinatorics.py:165 in public method `__setstate__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combinatorics.py:179 in public method `__del__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/iter/combining.py:26 in public class `ConcaterIterDataPipe`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/combining.py:26 in public class `ConcaterIterDataPipe`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/combining.py:26 in public class `ConcaterIterDataPipe`:
        D400: First line should end with a period (not 'l')
torch/utils/data/datapipes/iter/combining.py:44 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/combining.py:51 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:55 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:64 in public class `ForkerIterDataPipe`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/combining.py:92 in public method `__new__`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/iter/combining.py:108 in private class `_ContainerTemplate`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/combining.py:108 in private class `_ContainerTemplate`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/combining.py:108 in private class `_ContainerTemplate`:
        D400: First line should end with a period (not 'd')
torch/utils/data/datapipes/iter/combining.py:126 in private method `get_length_by_instance`:
        D200: One-line docstring should fit on one line with quotes (found 3)
torch/utils/data/datapipes/iter/combining.py:126 in private method `get_length_by_instance`:
        D400: First line should end with a period (not '`')
torch/utils/data/datapipes/iter/combining.py:136 in private class `_ForkerIterDataPipe`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/combining.py:136 in private class `_ForkerIterDataPipe`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/combining.py:136 in private class `_ForkerIterDataPipe`:
        D400: First line should end with a period (not 's')
torch/utils/data/datapipes/iter/combining.py:275 in private class `_ChildDataPipe`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/combining.py:275 in private class `_ChildDataPipe`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/combining.py:275 in private class `_ChildDataPipe`:
        D400: First line should end with a period (not 's')
torch/utils/data/datapipes/iter/combining.py:320 in private method `_set_main_datapipe_valid_iterator_id`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/combining.py:343 in private method `_check_valid_iterator_id`:
        D200: One-line docstring should fit on one line with quotes (found 3)
torch/utils/data/datapipes/iter/combining.py:351 in public class `DemultiplexerIterDataPipe`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/combining.py:351 in public class `DemultiplexerIterDataPipe`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/combining.py:351 in public class `DemultiplexerIterDataPipe`:
        D400: First line should end with a period (not 'n')
torch/utils/data/datapipes/iter/combining.py:384 in public method `__new__`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/iter/combining.py:399 in private class `_DemultiplexerIterDataPipe`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/combining.py:399 in private class `_DemultiplexerIterDataPipe`:
        D400: First line should end with a period (not 's')
torch/utils/data/datapipes/iter/combining.py:534 in public class `MultiplexerIterDataPipe`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/combining.py:534 in public class `MultiplexerIterDataPipe`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/combining.py:534 in public class `MultiplexerIterDataPipe`:
        D400: First line should end with a period (not ',')
torch/utils/data/datapipes/iter/combining.py:549 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/combining.py:553 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:566 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:572 in public method `reset`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/iter/combining.py:575 in public method `__getstate__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:585 in public method `__setstate__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:593 in public method `__del__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:599 in public class `ZipperIterDataPipe`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/combining.py:599 in public class `ZipperIterDataPipe`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/combining.py:615 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/combining.py:622 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:626 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/filelister.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/iter/filelister.py:15 in public class `FileListerIterDataPipe`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/filelister.py:36 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/filelister.py:58 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/filelister.py:62 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/fileopener.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/iter/fileopener.py:15 in public class `FileOpenerIterDataPipe`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/fileopener.py:15 in public class `FileOpenerIterDataPipe`:
        D400: First line should end with a period (not 'm')
torch/utils/data/datapipes/iter/fileopener.py:42 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/fileopener.py:66 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/fileopener.py:69 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/grouping.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/iter/grouping.py:31 in public class `BatcherIterDataPipe`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/grouping.py:31 in public class `BatcherIterDataPipe`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/grouping.py:31 in public class `BatcherIterDataPipe`:
        D400: First line should end with a period (not 's')
torch/utils/data/datapipes/iter/grouping.py:55 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/grouping.py:68 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/grouping.py:79 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/grouping.py:91 in public class `UnBatcherIterDataPipe`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/grouping.py:91 in public class `UnBatcherIterDataPipe`:
        D400: First line should end with a period (not 'l')
torch/utils/data/datapipes/iter/grouping.py:112 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/grouping.py:118 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/grouping.py:143 in public class `GrouperIterDataPipe`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/grouping.py:143 in public class `GrouperIterDataPipe`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/grouping.py:143 in public class `GrouperIterDataPipe`:
        D400: First line should end with a period (not ',')
torch/utils/data/datapipes/iter/grouping.py:185 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/grouping.py:233 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/grouping.py:257 in public method `reset`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/iter/grouping.py:261 in public method `__getstate__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/grouping.py:278 in public method `__setstate__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/grouping.py:294 in public method `__del__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/routeddecoder.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/iter/routeddecoder.py:19 in public class `RoutedDecoderIterDataPipe`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/routeddecoder.py:19 in public class `RoutedDecoderIterDataPipe`:
        D400: First line should end with a period (not 'a')
torch/utils/data/datapipes/iter/routeddecoder.py:37 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/routeddecoder.py:53 in public method `add_handler`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/iter/routeddecoder.py:56 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/routeddecoder.py:62 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/selecting.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/iter/selecting.py:21 in public class `FilterIterDataPipe`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/selecting.py:46 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/selecting.py:70 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/sharding.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/iter/sharding.py:17 in public class `SHARDING_PRIORITIES`:
        D101: Missing docstring in public class
torch/utils/data/datapipes/iter/sharding.py:30 in public class `ShardingFilterIterDataPipe`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/sharding.py:30 in public class `ShardingFilterIterDataPipe`:
        D400: First line should end with a period (not 's')
torch/utils/data/datapipes/iter/sharding.py:39 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/sharding.py:47 in public method `apply_sharding`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/iter/sharding.py:74 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/sharding.py:79 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/streamreader.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/iter/streamreader.py:10 in public class `StreamReaderIterDataPipe`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/streamreader.py:10 in public class `StreamReaderIterDataPipe`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/streamreader.py:10 in public class `StreamReaderIterDataPipe`:
        D400: First line should end with a period (not 'l')
torch/utils/data/datapipes/iter/streamreader.py:27 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/streamreader.py:31 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/utils.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/iter/utils.py:9 in public class `IterableWrapperIterDataPipe`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/utils.py:29 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/utils.py:33 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/utils.py:49 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/callable.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/map/callable.py:14 in public function `default_fn`:
        D103: Missing docstring in public function
torch/utils/data/datapipes/map/callable.py:20 in public class `MapperMapDataPipe`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/map/callable.py:20 in public class `MapperMapDataPipe`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/map/callable.py:45 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/map/callable.py:55 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/callable.py:58 in public method `__getitem__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combinatorics.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/map/combinatorics.py:15 in public class `ShufflerIterDataPipe`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/map/combinatorics.py:55 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/map/combinatorics.py:68 in public method `set_shuffle`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/map/combinatorics.py:72 in public method `set_seed`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/map/combinatorics.py:76 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combinatorics.py:85 in public method `reset`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/map/combinatorics.py:92 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combinatorics.py:95 in public method `__getstate__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combinatorics.py:110 in public method `__setstate__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combining.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/map/combining.py:12 in public class `ConcaterMapDataPipe`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/map/combining.py:12 in public class `ConcaterMapDataPipe`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/map/combining.py:34 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/map/combining.py:43 in public method `__getitem__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combining.py:52 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combining.py:58 in public class `ZipperMapDataPipe`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/map/combining.py:58 in public class `ZipperMapDataPipe`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/map/combining.py:76 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/map/combining.py:85 in public method `__getitem__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combining.py:94 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/grouping.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/map/grouping.py:12 in public class `BatcherMapDataPipe`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/map/grouping.py:12 in public class `BatcherMapDataPipe`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/map/grouping.py:12 in public class `BatcherMapDataPipe`:
        D400: First line should end with a period (not 's')
torch/utils/data/datapipes/map/grouping.py:34 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/map/grouping.py:47 in public method `__getitem__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/grouping.py:60 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/utils.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/map/utils.py:9 in public class `SequenceWrapperMapDataPipe`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/map/utils.py:32 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/map/utils.py:45 in public method `__getitem__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/utils.py:48 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/utils/common.py:26 in public function `validate_input_col`:
        D400: First line should end with a period (not 'n')
torch/utils/data/datapipes/utils/common.py:26 in public function `validate_input_col`:
        D401: First line should be in imperative mood (perhaps 'Check', not 'Checks')
torch/utils/data/datapipes/utils/common.py:127 in private function `_check_unpickable_fn`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/utils/common.py:127 in private function `_check_unpickable_fn`:
        D400: First line should end with a period (not 'g')
torch/utils/data/datapipes/utils/common.py:127 in private function `_check_unpickable_fn`:
        D401: First line should be in imperative mood (perhaps 'Check', not 'Checks')
torch/utils/data/datapipes/utils/common.py:156 in public function `match_masks`:
        D103: Missing docstring in public function
torch/utils/data/datapipes/utils/common.py:170 in public function `get_file_pathnames_from_root`:
        D103: Missing docstring in public function
torch/utils/data/datapipes/utils/common.py:207 in public function `get_file_binaries_from_pathnames`:
        D103: Missing docstring in public function
torch/utils/data/datapipes/utils/common.py:220 in public function `validate_pathname_binary_tuple`:
        D103: Missing docstring in public function
torch/utils/data/datapipes/utils/common.py:290 in public class `StreamWrapper`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/utils/common.py:290 in public class `StreamWrapper`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/utils/common.py:290 in public class `StreamWrapper`:
        D400: First line should end with a period (not 'y')
torch/utils/data/datapipes/utils/common.py:298 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/utils/common.py:315 in public method `close_streams`:
        D200: One-line docstring should fit on one line with quotes (found 3)
torch/utils/data/datapipes/utils/common.py:331 in public method `__getattr__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:335 in public method `close`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/utils/common.py:351 in public method `autoclose`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/utils/common.py:351 in public method `autoclose`:
        D400: First line should end with a period (not 's')
torch/utils/data/datapipes/utils/common.py:359 in public method `__dir__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:364 in public method `__del__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:368 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:371 in public method `__next__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:374 in public method `__repr__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:380 in public method `__getstate__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:383 in public method `__setstate__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/decoder.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/utils/decoder.py:31 in public function `basichandlers`:
        D103: Missing docstring in public function
torch/utils/data/datapipes/utils/decoder.py:87 in public function `handle_extension`:
        D202: No blank lines allowed after function docstring (found 1)
torch/utils/data/datapipes/utils/decoder.py:87 in public function `handle_extension`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/utils/decoder.py:87 in public function `handle_extension`:
        D401: First line should be in imperative mood (perhaps 'Return', not 'Returns')
torch/utils/data/datapipes/utils/decoder.py:115 in public class `ImageHandler`:
        D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/utils/decoder.py:115 in public class `ImageHandler`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/utils/decoder.py:139 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/utils/decoder.py:143 in public method `__call__`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/utils/decoder.py:187 in public function `imagehandler`:
        D103: Missing docstring in public function
torch/utils/data/datapipes/utils/decoder.py:194 in public function `videohandler`:
        D103: Missing docstring in public function
torch/utils/data/datapipes/utils/decoder.py:215 in public function `audiohandler`:
        D103: Missing docstring in public function
torch/utils/data/datapipes/utils/decoder.py:236 in public class `MatHandler`:
        D101: Missing docstring in public class
torch/utils/data/datapipes/utils/decoder.py:237 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/utils/decoder.py:247 in public method `__call__`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/utils/decoder.py:253 in public function `mathandler`:
        D103: Missing docstring in public function
torch/utils/data/datapipes/utils/decoder.py:261 in public function `extension_extract_fn`:
        D103: Missing docstring in public function
torch/utils/data/datapipes/utils/decoder.py:270 in public class `Decoder`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/utils/decoder.py:276 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/utils/decoder.py:282 in public method `add_handler`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/utils/decoder.py:292 in public method `decode1`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/utils/decoder.py:309 in public method `decode`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/utils/decoder.py:326 in public method `__call__`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/utils/snapshot.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/utils/snapshot.py:11 in private function `_simple_graph_snapshot_restoration`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/utils/snapshot.py:11 in private function `_simple_graph_snapshot_restoration`:
        D400: First line should end with a period (not ',')
torch/utils/data/datapipes/utils/snapshot.py:11 in private function `_simple_graph_snapshot_restoration`:
        D401: First line should be in imperative mood; try rephrasing (found 'This')
torch/utils/tensorboard/_convert_np.py:1 at module level:
        D200: One-line docstring should fit on one line with quotes (found 3)
torch/utils/tensorboard/_convert_np.py:9 in public function `make_np`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/tensorboard/_convert_np.py:9 in public function `make_np`:
        D400: First line should end with a period (not ':')
265
```

After: 166
```
torch/utils/data/datapipes/dataframe/structures.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/dataframe/structures.py:10 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/dataframe/structures.py:14 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/datapipe.py:120 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:123 in public method `__getattr__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:136 in public method `register_function`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:140 in public method `register_datapipe_as_function`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:173 in public method `__reduce_ex__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:182 in public method `set_getstate_hook`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:188 in public method `set_reduce_ex_hook`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:193 in public method `__repr__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:199 in public method `__str__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:205 in public method `__dir__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:221 in public class `DFIterDataPipe`:
        D101: Missing docstring in public class
torch/utils/data/datapipes/datapipe.py:266 in public method `__getattr__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:279 in public method `register_function`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:283 in public method `register_datapipe_as_function`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:309 in public method `__reduce_ex__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:318 in public method `set_getstate_hook`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:324 in public method `set_reduce_ex_hook`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:329 in public method `__repr__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:335 in public method `__str__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:341 in public method `__dir__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:398 in public class `DataChunk`:
        D101: Missing docstring in public class
torch/utils/data/datapipes/datapipe.py:399 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/datapipe.py:403 in public method `as_str`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:407 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:410 in public method `raw_iterator`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/iter/callable.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/iter/callable.py:65 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/callable.py:123 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/callable.py:127 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/callable.py:216 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/combinatorics.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/iter/combinatorics.py:30 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/combinatorics.py:45 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combinatorics.py:48 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combinatorics.py:97 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/combinatorics.py:117 in public method `set_shuffle`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/iter/combinatorics.py:121 in public method `set_seed`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/iter/combinatorics.py:125 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combinatorics.py:140 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combinatorics.py:145 in public method `reset`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/iter/combinatorics.py:153 in public method `__getstate__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combinatorics.py:168 in public method `__setstate__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combinatorics.py:182 in public method `__del__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/iter/combining.py:46 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/combining.py:53 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:57 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:95 in public method `__new__`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/iter/combining.py:388 in public method `__new__`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/iter/combining.py:556 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/combining.py:560 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:573 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:579 in public method `reset`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/iter/combining.py:582 in public method `__getstate__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:592 in public method `__setstate__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:600 in public method `__del__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:624 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/combining.py:631 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:635 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/filelister.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/iter/filelister.py:37 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/filelister.py:59 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/filelister.py:63 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/fileopener.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/iter/fileopener.py:41 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/fileopener.py:65 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/fileopener.py:68 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/grouping.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/iter/grouping.py:57 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/grouping.py:70 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/grouping.py:81 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/grouping.py:115 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/grouping.py:121 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/grouping.py:190 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/grouping.py:238 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/grouping.py:262 in public method `reset`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/iter/grouping.py:266 in public method `__getstate__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/grouping.py:283 in public method `__setstate__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/grouping.py:299 in public method `__del__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/routeddecoder.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/iter/routeddecoder.py:38 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/routeddecoder.py:54 in public method `add_handler`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/iter/routeddecoder.py:57 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/routeddecoder.py:63 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/selecting.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/iter/selecting.py:47 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/selecting.py:71 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/sharding.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/iter/sharding.py:17 in public class `SHARDING_PRIORITIES`:
        D101: Missing docstring in public class
torch/utils/data/datapipes/iter/sharding.py:40 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/sharding.py:48 in public method `apply_sharding`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/iter/sharding.py:75 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/sharding.py:80 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/streamreader.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/iter/streamreader.py:29 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/streamreader.py:33 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/utils.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/iter/utils.py:30 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/utils.py:34 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/utils.py:50 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/callable.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/map/callable.py:14 in public function `default_fn`:
        D103: Missing docstring in public function
torch/utils/data/datapipes/map/callable.py:47 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/map/callable.py:57 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/callable.py:60 in public method `__getitem__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combinatorics.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/map/combinatorics.py:56 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/map/combinatorics.py:69 in public method `set_shuffle`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/map/combinatorics.py:73 in public method `set_seed`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/map/combinatorics.py:77 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combinatorics.py:86 in public method `reset`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/map/combinatorics.py:93 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combinatorics.py:96 in public method `__getstate__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combinatorics.py:111 in public method `__setstate__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combining.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/map/combining.py:36 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/map/combining.py:45 in public method `__getitem__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combining.py:54 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combining.py:80 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/map/combining.py:89 in public method `__getitem__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combining.py:98 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/grouping.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/map/grouping.py:36 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/map/grouping.py:49 in public method `__getitem__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/grouping.py:62 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/utils.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/map/utils.py:33 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/map/utils.py:46 in public method `__getitem__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/map/utils.py:49 in public method `__len__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/utils/common.py:157 in public function `match_masks`:
        D103: Missing docstring in public function
torch/utils/data/datapipes/utils/common.py:171 in public function `get_file_pathnames_from_root`:
        D103: Missing docstring in public function
torch/utils/data/datapipes/utils/common.py:208 in public function `get_file_binaries_from_pathnames`:
        D103: Missing docstring in public function
torch/utils/data/datapipes/utils/common.py:221 in public function `validate_pathname_binary_tuple`:
        D103: Missing docstring in public function
torch/utils/data/datapipes/utils/common.py:300 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/utils/common.py:331 in public method `__getattr__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:335 in public method `close`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/utils/common.py:356 in public method `__dir__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:361 in public method `__del__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:365 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:368 in public method `__next__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:371 in public method `__repr__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:377 in public method `__getstate__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:380 in public method `__setstate__`:
        D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/decoder.py:1 at module level:
        D100: Missing docstring in public module
torch/utils/data/datapipes/utils/decoder.py:31 in public function `basichandlers`:
        D103: Missing docstring in public function
torch/utils/data/datapipes/utils/decoder.py:141 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/utils/decoder.py:145 in public method `__call__`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/utils/decoder.py:189 in public function `imagehandler`:
        D103: Missing docstring in public function
torch/utils/data/datapipes/utils/decoder.py:196 in public function `videohandler`:
        D103: Missing docstring in public function
torch/utils/data/datapipes/utils/decoder.py:217 in public function `audiohandler`:
        D103: Missing docstring in public function
torch/utils/data/datapipes/utils/decoder.py:238 in public class `MatHandler`:
        D101: Missing docstring in public class
torch/utils/data/datapipes/utils/decoder.py:239 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/utils/decoder.py:249 in public method `__call__`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/utils/decoder.py:255 in public function `mathandler`:
        D103: Missing docstring in public function
torch/utils/data/datapipes/utils/decoder.py:263 in public function `extension_extract_fn`:
        D103: Missing docstring in public function
torch/utils/data/datapipes/utils/decoder.py:279 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/data/datapipes/utils/decoder.py:285 in public method `add_handler`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/utils/decoder.py:295 in public method `decode1`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/utils/decoder.py:312 in public method `decode`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/utils/decoder.py:329 in public method `__call__`:
        D102: Missing docstring in public method
torch/utils/data/datapipes/utils/snapshot.py:1 at module level:
        D100: Missing docstring in public module
166
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112765
Approved by: https://github.com/ejguan
2023-11-03 21:01:19 +00:00
abc1cadddb [BE] Enable ruff's UP rules and autoformat utils/ (#105424)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105424
Approved by: https://github.com/ezyang, https://github.com/malfet
2023-07-18 20:17:25 +00:00
5837e95d30 [Reland] Update mypy to 1.4.1 (#105227)
This PR re-lands
- [Typing] Fix PEP 484 Violation (#105022)
- Update mypy to 1.4.1 (#91983)

These were reverted due to a conflict with the internal source repo.

Mostly fixes for PEP 484 violations (i.e. when a default arg is set to None, but the type is not annotated as Optional)
Plus a few real fixes:
  - Add missing `_get_upgraders_entry_map` to `torch/_C/__init__.pyi`
  - Add missing return statement to `torch._export.deserialize_graph`
  - Fix error message in `torch.ao.ns.fx.weight_utils.get_lstm_mod_weights`
  - Add assert in `torch/optim/optimizer.py` that Optional list is not None
TODO (in followup PR):
  - Fix erroneous `isinstance` check in `torch/ao/quantization/_pt2e/qat_utils.py`
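
For illustration only, a minimal sketch of the kind of PEP 484 violation described above. The function names are hypothetical and are not taken from the PR. Recent mypy releases are expected to reject the first form by default, while both forms behave the same at runtime.

```python
from typing import Optional


# Implicit Optional: the annotation says `int`, but the default is None.
# This is the pattern the commit message calls a PEP 484 violation.
def scale_implicit(x: float, factor: int = None) -> float:  # type checker complains here
    return x if factor is None else x * factor


# Explicit Optional: the annotation admits None, matching the default.
def scale_explicit(x: float, factor: Optional[int] = None) -> float:
    return x if factor is None else x * factor


no_factor = scale_explicit(2.0)
with_factor = scale_explicit(2.0, 3)
```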

Unrelated, to bypass CI failures due to the gcc9 dependency update in Ubuntu-18.04:
- Add hack to squash older libstdc++ from conda environment in favor of the one from the OS to `.ci/docker/install_conda.sh`
- Update bazel cuda builds to focal, as with libstdc++-6.0.32 bazel builds lose the ability to catch exceptions (probably because they link with cupti statically, but I could not find where it is done)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105227
Approved by: https://github.com/atalman, https://github.com/albanD, https://github.com/Skylion007
2023-07-15 20:30:20 +00:00
15fd1ea118 Revert "[Reland] Update mypy to 1.4.1 (#105227)"
This reverts commit c9c4f8efc3dd4e66059522bf5f5c1ba0431e2069.

Reverted https://github.com/pytorch/pytorch/pull/105227 on behalf of https://github.com/atalman due to trying to mitigate ci sev #105248 ([comment](https://github.com/pytorch/pytorch/pull/105227#issuecomment-1636510935))
2023-07-14 22:28:35 +00:00
c9c4f8efc3 [Reland] Update mypy to 1.4.1 (#105227)
This PR re-lands
- [Typing] Fix PEP 484 Violation (#105022)
- Update mypy to 1.4.1 (#91983)

These were reverted due to a conflict with the internal source repo.

Mostly fixes for PEP 484 violations (i.e. when a default arg is set to None, but the type is not annotated as Optional)
Plus a few real fixes:
  - Add missing `_get_upgraders_entry_map` to `torch/_C/__init__.pyi`
  - Add missing return statement to `torch._export.deserialize_graph`
  - Fix error message in `torch.ao.ns.fx.weight_utils.get_lstm_mod_weights`
  - Add assert in `torch/optim/optimizer.py` that Optional list is not None
TODO (in followup PR):
  - Fix erroneous `isinstance` check in `torch/ao/quantization/_pt2e/qat_utils.py`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105227
Approved by: https://github.com/atalman, https://github.com/albanD, https://github.com/Skylion007
2023-07-14 20:45:12 +00:00
3c5a494d7a Revert "Update mypy to 1.4.1 (#91983)"
This reverts commit 634659e262f82bbc76aa776119c9fea079fbffe3.

Reverted https://github.com/pytorch/pytorch/pull/91983 on behalf of https://github.com/malfet due to its dependent change being reverted, so reverting this one as well, to keep CI clean ([comment](https://github.com/pytorch/pytorch/pull/91983#issuecomment-1636059709))
2023-07-14 15:59:16 +00:00
634659e262 Update mypy to 1.4.1 (#91983)
Mostly fixes for PEP 484 violations (i.e. when a default arg is set to None, but the type is not annotated as Optional)
Plus a few real fixes:
  - Add missing `_get_upgraders_entry_map` to `torch/_C/__init__.pyi`
  - Add missing return statement to `torch._export.deserialize_graph`
  - Fix error message in `torch.ao.ns.fx.weight_utils.get_lstm_mod_weights`
TODO (in followup PR):
  - Fix erroneous `isinstance` check in `torch/ao/quantization/_pt2e/qat_utils.py`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91983
Approved by: https://github.com/kit1980, https://github.com/ZainRizvi, https://github.com/huydhn, https://github.com/thiagocrepaldi, https://github.com/aaronenyeshi
2023-07-13 16:30:36 +00:00
46faa79e09 Simplify by using yield from in torch/utils/data (#97839)
Also see https://github.com/pytorch/pytorch/pull/97831
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97839
Approved by: https://github.com/NivekT, https://github.com/Skylion007
2023-03-29 04:51:26 +00:00
622a11d512 Fix typos under torch/utils directory (#97516)
This PR fixes typos in comments and messages of `.py` files under `torch/utils` directory

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97516
Approved by: https://github.com/ezyang
2023-03-24 16:53:39 +00:00
939c4ae6cd [DataPipe] Add copy option to fork DataPipe (#96030)
Fixes pytorch/data#1061 and fixes pytorch/data#1032
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96030
Approved by: https://github.com/ejguan, https://github.com/NivekT
2023-03-10 17:31:56 +00:00
748bac8757 [BE]: Apply pyupgrade yield from and unit test alias upgrades (#94309)
Applies some more harmless pyupgrades. This one gets rid of deprecated aliases in unit tests and upgrades more `yield` for-loops into `yield from` generators, which are more performant and propagate more information / exceptions from the original generator. This is the modern recommended way of forwarding generators.
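
As a rough illustration of the rewrite (the function names below are made up for this sketch and do not come from the PR), the two generators yield the same items, but the second delegates directly to the underlying iterable:

```python
from collections.abc import Iterable, Iterator


def forward_with_loop(source: Iterable[int]) -> Iterator[int]:
    # Old style: re-yield every item by hand.
    for item in source:
        yield item


def forward_with_yield_from(source: Iterable[int]) -> Iterator[int]:
    # New style: delegate to the source. With a generator source,
    # send()/throw()/close() are forwarded to it as well.
    yield from source


loop_result = list(forward_with_loop(range(3)))
delegated_result = list(forward_with_yield_from(range(3)))
```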
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94309
Approved by: https://github.com/albanD
2023-02-07 20:08:58 +00:00
09ccda0d94 Fix: Make __len__ of datapipes dynamic (#88302)
Fixes #88074

Several datapipes have their lengths cached on being executed for the first time. However, source datapipes might change in length (most prominently, whenever `apply_sharding` is called). The behaviour is counter-intuitive because we do not expect `__len__` to have side-effects.

This PR makes `__len__` dynamically computed.

Changes:
- Add note to the `datapipes` README that `__len__` should be dynamic and why.
- Remove caching of length computations in `ConcaterIterDataPipe`, `MultiplexerIterDataPipe`, `ZipperIterDataPipe`, `BatcherIterDataPipe`, `ConcaterMapDataPipe`, and `BatcherMapDataPipe`.
- This required removal of the `length` attribute in setstate/getstate of `MultiplexerIterDataPipe`. I am unsure whether to remove this completely and risk breaking saved checkpoints (as I did) or whether to just ignore the `length` of the loaded `state`.
- This also means the classes above no longer have a `length` attribute. I have found no uses of this, though.
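
A minimal sketch of why cached lengths are surprising, using two hypothetical stand-in classes rather than the real DataPipes. The cached variant keeps reporting a stale length after its source changes; the dynamic variant recomputes on every call.

```python
class CachedLenConcat:
    """Hypothetical stand-in that caches its length on first use."""

    def __init__(self, *sources):
        self.sources = sources
        self._length = None

    def __len__(self):
        if self._length is None:  # side effect: the first call freezes the value
            self._length = sum(len(s) for s in self.sources)
        return self._length


class DynamicLenConcat:
    """Hypothetical stand-in that recomputes its length every time."""

    def __init__(self, *sources):
        self.sources = sources

    def __len__(self):
        return sum(len(s) for s in self.sources)


source = [1, 2, 3]
cached, dynamic = CachedLenConcat(source), DynamicLenConcat(source)
first_cached, first_dynamic = len(cached), len(dynamic)

source.append(4)  # the source changes length, e.g. after re-sharding
stale_length, fresh_length = len(cached), len(dynamic)
```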

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88302
Approved by: https://github.com/NivekT
2022-12-09 19:15:53 +00:00
b1eb42bcfd [4/4][DataPipe] Remove iterator depletion in Zipper (#89974)
Fixes: https://github.com/pytorch/data/issues/865

I will add another PR in torchdata to validate that this change solves the infinite datapipe problem (I have tested locally). This is one of the most annoying stacks of PRs, caused by the separation between TorchData and PyTorch.

There is a case where `file.close` is never called because the generator function never reaches its end. A simple example is to `zip` two DataPipes of different lengths: the longer DataPipe never reaches the end of its generator and is then cleaned up by `gc`, so the `file.close` line is not executed. (This is the reason that Vitaly had to create this [hack](4451eb24e6/torch/utils/data/datapipes/iter/combining.py (L573-L583)) to retrieve all remaining data to make sure the generator function is fully executed.)

However, this hack introduces another problem where an infinite datapipe would make `zip` never end as it would try to deplete the infinite iterator. See: https://github.com/pytorch/data/issues/865

So, in this PR, I am adding a `try-finally` clause to make sure the `file.close` is always executed during the destruction of `generator` object. Then, we don't need the hack within `zip` any more.
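
The idea can be sketched in plain Python, without DataPipes. `read_lines` is a hypothetical generator and the `io.StringIO` objects stand in for real files; the sketch calls `close()` on the generator explicitly instead of relying on garbage collection.

```python
import io


def read_lines(stream):
    # The finally block runs when the generator finishes normally AND when
    # it is closed or destroyed while suspended at the yield.
    try:
        for line in stream:
            yield line.rstrip("\n")
    finally:
        stream.close()


shorter = io.StringIO("a\n")
longer = io.StringIO("1\n2\n3\n")
shorter_gen, longer_gen = read_lines(shorter), read_lines(longer)

# zip stops as soon as the shorter generator is exhausted, so the longer
# generator stays suspended and never reaches its own end.
pairs = list(zip(shorter_gen, longer_gen))
closed_before = longer.closed

# Closing the suspended generator raises GeneratorExit inside it, which
# triggers the finally block and closes the stream.
longer_gen.close()
closed_after = longer.closed
```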

Differential Revision: [D41699469](https://our.internmc.facebook.com/intern/diff/D41699469)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89974
Approved by: https://github.com/NivekT, https://github.com/wenleix
2022-12-05 16:45:34 +00:00
bda6ff0990 [1/4][DataPipe] Properly cleanup unclosed files within generator function (#89973)
There is a case where `file.close` is never called because the generator function never reaches its end. A simple example is to `zip` two DataPipes of different lengths: the longer DataPipe never reaches the end of its generator and is then cleaned up by `gc`, so the `file.close` line is not executed. (This is the reason that Vitaly had to create this [hack](4451eb24e6/torch/utils/data/datapipes/iter/combining.py (L573-L583)) to retrieve all remaining data to make sure the generator function is fully executed.)

However, this hack introduces another problem where an infinite datapipe would make `zip` never end as it would try to deplete the infinite iterator. See: https://github.com/pytorch/data/issues/865

So, in this PR, I am adding a `try-finally` clause to make sure the `file.close` is always executed during the destruction of `generator` object. Then, we don't need the hack within `zip` any more.

Differential Revision: [D41699470](https://our.internmc.facebook.com/intern/diff/D41699470)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89973
Approved by: https://github.com/NivekT
2022-12-04 04:04:46 +00:00
3d8a853a87 [DataPipe] Add container template for _Fork and _Demux (#89216)
- This would remove the hard-coded check within `_ChildDataPipe`.
- Add `get_length_by_instance` to the parent class so that child DataPipes can have different lengths
- Prevent an error when `__del__` is executed after the object has already been removed
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89216
Approved by: https://github.com/NivekT
2022-11-17 23:06:41 +00:00
ea72a0991c Add support to traverse all python collection objects (#84079)
Fixes https://github.com/pytorch/data/issues/752

This PR makes the `traverse` function support more collection data structures from Python. The `getstate_hook` will be invoked after the custom `__getstate__` function. This guarantees that the `traverse` function works as long as the `DataPipe` works properly with multiprocessing.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84079
Approved by: https://github.com/NivekT, https://github.com/VitalyFedyunin
2022-09-23 16:21:25 +00:00
4618371da5 Integrate xdoctest - Rebased (#82797)
This is a new version of #15648 based on the latest master branch.

Unlike the previous PR where I fixed a lot of the doctests in addition to integrating xdoctest, I'm going to reduce the scope here. I'm simply going to integrate xdoctest, and then I'm going to mark all of the failing tests as "SKIP". This will let xdoctest run on the dashboards, provide some value, and still let the dashboards pass. I'll leave fixing the doctests themselves to another PR.

In my initial commit, I do the bare minimum to get something running with failing dashboards. The few tests that I marked as skip are causing segfaults. Running xdoctest results in 293 failed, 201 passed tests. The next commits will be to disable those tests. (unfortunately I don't have a tool that will insert the `#xdoctest: +SKIP` directive over every failing test, so I'm going to do this mostly manually.)

Fixes https://github.com/pytorch/pytorch/issues/71105

@ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82797
Approved by: https://github.com/ezyang
2022-08-12 02:08:01 +00:00
35d97e21c8 [DataPipe] Simple graph snapshotting (#79479)
This mostly completes the "poor man's snapshotting" implementation (named "simple snapshotting"). This is the most basic version of snapshotting, but it should work for all DataPipes. I will be adding more efficient implementations for different types of DataPipes in future PRs.

### Implementation

The general idea of the simple snapshot is that we will:
1. Create a new iterator
2. Move that iterator forward by `n_iterations`
3. Save that as the `_fast_forward_iterator` of the DataPipe
4. The next time `iter` is called on the DataPipe, use the `_fast_forward_iterator`
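
The four steps can be sketched on a toy iterable. `FastForwardable` and `simple_snapshot` are hypothetical names for this sketch only; they are not the PR's `_simple_snapshot_graph`, and RNG handling is left out entirely.

```python
import itertools


class FastForwardable:
    """Hypothetical iterable standing in for a DataPipe."""

    def __init__(self, data):
        self.data = list(data)
        self._fast_forward_iterator = None

    def _fresh_iter(self):
        return iter(self.data)

    def __iter__(self):
        # Step 4: hand out the pre-advanced iterator once, then go back
        # to producing fresh iterators.
        if self._fast_forward_iterator is not None:
            it, self._fast_forward_iterator = self._fast_forward_iterator, None
            return it
        return self._fresh_iter()


def simple_snapshot(pipe, n_iterations):
    it = pipe._fresh_iter()                       # step 1: new iterator
    for _ in itertools.islice(it, n_iterations):  # step 2: skip n items
        pass
    pipe._fast_forward_iterator = it              # step 3: store it


pipe = FastForwardable(range(10))
simple_snapshot(pipe, 5)
resumed = list(pipe)    # continues after the first 5 items
restarted = list(pipe)  # later iterations start from the beginning
```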

### Usage
As of this implementation, the usage will be something like:
```python
rng = torch.Generator()
initial_rng_state = rng.get_state()
datapipe: IterDataPipe = ...
# Some usage of the DataPipe, here maybe yielding the first 5 values
n_iter = 5
it = iter(datapipe)
for _ in range(n_iter):
    next(it)
serialized_graph = pickle.dumps(datapipe)

# The serialized object has most of the sufficient information for simple snapshot (except for initial RNG state)
# It can be deserialized at a later point in time or by a different process
deserialized_graph = pickle.loads(serialized_graph)
# I think `DataLoader2` or `ReadingService` should store `initial_rng_state` that can be saved by the API that we later use
rng_for_deserialized = torch.Generator()
rng_for_deserialized.set_state(initial_rng_state)
n_iterations = deserialized_graph._number_of_samples_yielded

_simple_snapshot_graph(deserialized_graph, n_iterations, rng=rng_for_deserialized)
# The whole DataPipe graph should have the same state as before serialization, such that:
self.assertEqual(list(it), list(deserialized_graph))  # True
```

### Next Steps
If this looks acceptable, the next step is I will modify `DataLoader2`'s prototype ReadingService (the one with queues) to remember things like `initial_rng_state` and to have methods `save_snapshot` that will return the `(serialized graph, initial_rng)` and `restore_snapshot`. This should work for single worker data loading.

Note that, in the long term, `initial_rng_state` may not be necessary if we are able to directly save/restore the buffer and RNG state of `Shuffler` (that is work in progress). However, `initial_rng_state` and simple snapshot is still a good fall-back option for some edge cases where the buffer can't be stored.

Differential Revision: [D37943406](https://our.internmc.facebook.com/intern/diff/D37943406)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79479
Approved by: https://github.com/ejguan
2022-07-23 02:53:15 +00:00
428e44ffa1 [DataPipe] Fixes various warnings, exceptions, and clean up testing (#81833)
I went through most of the warnings and exceptions raised in our tests to find these issues.

Changes:
1. In testing, `self.assertEquals` is deprecated, converting to `self.assertEqual` to get rid of the warning
2. Small changes for cleanliness and get rid of warnings (no actual change to result)
3. Correct `is_every_instance_exhausted` logic for `_Forker`
4. Catch `RuntimeError` raised by invalidated iterator during clean up
5. Check if attribute `parent_stream` exists before trying to access it

Differential Revision: [D38020122](https://our.internmc.facebook.com/intern/diff/D38020122)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81833
Approved by: https://github.com/ejguan
2022-07-21 18:59:40 +00:00
ccbf04dd5f [DataPipe] Fix fork/unzip with a single child (#81502)
When `Forker` or `Unzipper` only contains a single child, the buffer should be cleaned up. This is one of the root causes for the issue reported internally. See: https://fburl.com/2k0et1gv
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81502
Approved by: https://github.com/VitalyFedyunin, https://github.com/NivekT
2022-07-18 16:53:19 +00:00
331c0c1803 [DataLoader] Close open in DataPipe streams on best effort basis (#78952)
Adding ability to:
- Track open StreamWrappers with `StreamWrapper.session_streams`
- Automatically close parent StreamWrapper (ex. torchdata tar is the parent and extracted file streams are children)
- Close streams in structures discarded by filtering

Differential Revision: [D37489935](https://our.internmc.facebook.com/intern/diff/D37489935)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78952
Approved by: https://github.com/ejguan
2022-06-29 20:11:23 +00:00
3d218e1c87 Raise warning for unpickable local function (#547) (#80232)
Summary:
X-link: https://github.com/pytorch/data/pull/547

Fixes https://github.com/pytorch/data/issues/538
- Improve the validation function to raise a warning about an unpicklable function when either a lambda or a local function is provided to a DataPipe.
- The inner function of a functools.partial object is extracted as well for validation
- Mimic the behavior of the pickle module for a local lambda function: it only raises an error for the local function rather than the lambda function, so we raise a warning about the local function, not the lambda function.
```py

>>> import pickle
>>> def fn():
...     lf = lambda x: x
...     pickle.dumps(lf)
>>> fn()
AttributeError: Can't pickle local object 'fn.<locals>.<lambda>'
```
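
A rough sketch of the partial-unwrapping step described in the list above. `is_local_function` and `make_local` are hypothetical names for this illustration and are not the validation helper changed by this PR; the check relies only on the `<locals>` marker that Python puts in `__qualname__`.

```python
import functools
import operator


def is_local_function(fn) -> bool:
    # Unwrap (possibly nested) functools.partial objects first, so the
    # check applies to the inner function.
    while isinstance(fn, functools.partial):
        fn = fn.func
    # Functions defined inside another function carry "<locals>" in
    # their qualified name.
    return "<locals>" in getattr(fn, "__qualname__", "")


def make_local():
    def inner(x, y):
        return x + y

    return inner


local_partial = functools.partial(make_local(), 1)
global_partial = functools.partial(operator.add, 1)

local_flag = is_local_function(local_partial)
global_flag = is_local_function(global_partial)
```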

This Diff also fixes the Error introduced by https://github.com/pytorch/pytorch/pull/79344

Test Plan:
CI on PyTorch and TorchData
Manually validated the tests from TorchVision

Differential Revision: D37417556

Pull Request resolved: https://github.com/pytorch/pytorch/pull/80232
Approved by: https://github.com/NivekT
2022-06-27 21:47:09 +00:00
79ba65c0f2 Revert "Raise warning for unpickable local function (#80140)"
This reverts commit 4b75b7d3c1504323d62c9f702de04b17aec0175c.

Reverted https://github.com/pytorch/pytorch/pull/80140 on behalf of https://github.com/ejguan due to It will break the CI for TorchData
2022-06-24 14:49:06 +00:00
4b75b7d3c1 Raise warning for unpickable local function (#80140)
Fixes https://github.com/pytorch/data/issues/538

- Improve the validation function to raise a warning about an unpicklable function when either a lambda or a local function is provided to a `DataPipe`.
- The inner function of a `functools.partial` object is extracted as well for validation
- Mimic the behavior of the `pickle` module for a local lambda function: it only raises an error for the local function rather than the `lambda` function, so we raise a warning about the local function, not the lambda function.
```py
>>> import pickle
>>> def fn():
...     lf = lambda x: x
...     pickle.dumps(lf)
>>> fn()
AttributeError: Can't pickle local object 'fn.<locals>.<lambda>'
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80140
Approved by: https://github.com/VitalyFedyunin, https://github.com/NivekT
2022-06-24 13:50:51 +00:00
b4a6730ce1 [DataPipe] Refactor 'mux' to have buffer as an instance variable
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77775

Approved by: https://github.com/ejguan
2022-05-19 19:55:27 +00:00
97fa1d317f [DataPipe] Preventing automatic reset call after state is restored
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77774

Approved by: https://github.com/ejguan
2022-05-19 16:59:08 +00:00
fdd5f7214e Revert "[DataPipe] Preventing automatic reset call after state is restored"
This reverts commit ac1837ddd3082429886abbe2335d2fff75211d05.

Reverted https://github.com/pytorch/pytorch/pull/77774 on behalf of https://github.com/janeyx99
2022-05-19 14:26:42 +00:00
ac1837ddd3 [DataPipe] Preventing automatic reset call after state is restored
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77774

Approved by: https://github.com/ejguan
2022-05-19 13:53:14 +00:00
4d1ead6dff [DataPipe] Update mux data pipe (#76384) (#77145)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76384

OSS issue discussion: https://github.com/pytorch/data/issues/346
This diff updates `mux` and `mux_longest` data pipe.
`mux`: Yields one element at a time from each of the input Iterable DataPipes (functional name: ``mux``). As in, one element from the 1st input DataPipe, then one element from the 2nd DataPipe in the next iteration, and so on. It ends when the shortest input DataPipe is exhausted.

`mux` example:

```
>>> from torchdata.datapipes.iter import IterableWrapper
>>> dp1, dp2, dp3 = IterableWrapper(range(3)), IterableWrapper(range(10, 15)), IterableWrapper(range(20, 25))
>>> list(dp1.mux(dp2, dp3))
[0, 10, 20, 1, 11, 21, 2, 12, 22]
```
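
The round-robin behaviour described above can be approximated in plain Python. This is only a sketch of the semantics (`mux_sketch` is a hypothetical name), and it assumes that a partially collected round is dropped once any input runs out, which is what `zip` does.

```python
from itertools import chain


def mux_sketch(*iterables):
    # zip produces one "round" per step and stops at the shortest input;
    # chain.from_iterable flattens the rounds into a single stream.
    return chain.from_iterable(zip(*iterables))


result = list(mux_sketch(range(3), range(10, 15), range(20, 25)))
```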

Test Plan:
buck test mode/opt //caffe2/test:datapipe

https://www.internalfb.com/intern/testinfra/testrun/4785074706282345

Differential Revision: D36017945

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77145
Approved by: https://github.com/NivekT, https://github.com/ejguan
2022-05-18 16:23:07 +00:00
bbaefdf6b5 [DataPipe] Enforcing single valid iterator for IterDataPipes multiple DataPipes as outputs
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75995

Approved by: https://github.com/VitalyFedyunin
2022-05-18 01:31:39 +00:00
a008d19ff7 [DataPipe] Revamp serialization logic of DataPipes
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74984

Approved by: https://github.com/ejguan
2022-05-10 16:16:46 +00:00
ef63408853 Revert [DataPipe] Update mux data pipe
Reverts #76384

This is breaking the test test_demux_mux_datapipe (__main__.TestIterableDataPipeBasic). See logs: a997046017
and was red on the PR as well: https://hud.pytorch.org/pytorch/pytorch/pull/76384
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76507
Approved by: https://github.com/kit1980
2022-04-28 00:06:30 +00:00
a997046017 [DataPipe] Update mux data pipe (#76384)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76384

OSS issue discussion: https://github.com/pytorch/data/issues/346
This diff updates `mux` and `mux_longest` data pipe.
`mux`: Yields one element at a time from each of the input Iterable DataPipes (functional name: ``mux``). As in, one element from the 1st input DataPipe, then one element from the 2nd DataPipe in the next iteration, and so on. It ends when the shortest input DataPipe is exhausted.

`mux` example:

```
>>> from torchdata.datapipes.iter import IterableWrapper
>>> dp1, dp2, dp3 = IterableWrapper(range(3)), IterableWrapper(range(10, 15)), IterableWrapper(range(20, 25))
>>> list(dp1.mux(dp2, dp3))
[0, 10, 20, 1, 11, 21, 2, 12, 22]
```

Test Plan:
buck test mode/dev //pytorch/data/test:tests -- --exact 'pytorch/data/test:tests - test_mux_longest_iterdatapipe (test_datapipe.TestDataPipe)'

https://www.internalfb.com/intern/testinfra/testrun/3096224791148107

Reviewed By: ejguan

Differential Revision: D35799965

fbshipit-source-id: 320e71a342ec27e6e9200624aad42f4b99f97c3a
(cherry picked from commit 741ed595275df6c05026ed6f0e78d7052328fb7d)
2022-04-27 22:10:42 +00:00
ccd7233fdd [DataPipe] clearing buffer for DataPipes during __del__
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76345

Approved by: https://github.com/ejguan
2022-04-26 14:24:28 +00:00
0289ab2cec Fix data-related public API (#368)
Summary:
X-link: https://github.com/pytorch/data/pull/368

This PR aims to expose the right data-related API.

There are two more changes made in this PR to convert public API to private API:
`check_lambda_fn` -> `_check_lambda_fn`
`deprecation_warning` -> `_deprecation_warning`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76143

Reviewed By: albanD, NivekT

Differential Revision: D35798311

Pulled By: ejguan

fbshipit-source-id: b13fded5c88a533c706702fb2070c918c839dca4
(cherry picked from commit 0b534b829a2e90e1e533951c6d334fdeaa9358b9)
2022-04-21 17:27:05 -07:00
841a7f5187 [DataPipe] apply dill serialization for _Demux and add cache to traverse
- Fix `_Demux` not being picklable when dill is present https://github.com/pytorch/pytorch/pull/74958#issuecomment-1084637227
- And add cache to traverse function to prevent infinite recursion for circular reference of DataPipe (Fixes https://github.com/pytorch/data/issues/237)
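
To illustrate the second point, here is a minimal sketch of a traversal that survives circular references by remembering which objects it has already visited. `Node` and `traverse_sketch` are hypothetical and much simpler than the real `traverse` function.

```python
class Node:
    """Hypothetical graph node standing in for a DataPipe."""

    def __init__(self, name):
        self.name = name
        self.sources = []


def traverse_sketch(node, seen=None):
    if seen is None:
        seen = set()
    if id(node) in seen:  # already visited: stop instead of recursing forever
        return []
    seen.add(id(node))
    visited = [node.name]
    for src in node.sources:
        visited.extend(traverse_sketch(src, seen))
    return visited


a, b = Node("a"), Node("b")
a.sources.append(b)
b.sources.append(a)  # circular reference
order = traverse_sketch(a)
```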
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75034
Approved by: https://github.com/wenleix
2022-04-04 19:45:14 +00:00
eec994fc16 [DataPipe] Separating DataPipes from Dataset into different files (#73396)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73396

Separating DataPipes from Dataset into different files. This makes the code more maintainable and simplifies some of the code generation.

I have also tried to move `datapipe.py` into `torch.utils.data.datapipes`, but that leads to circular imports and requires rewriting many import statements. Should I put more time and go down that path some more?

Fixes https://github.com/pytorch/data/issues/213

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D34481962

Pulled By: NivekT

fbshipit-source-id: 42fb26fe7fc334636852cfd8719fc807bdaa7912
(cherry picked from commit 81e76a64e297cb5c58caa951c554e49526173936)
2022-03-15 14:46:34 +00:00