Commit Graph

8 Commits

Author SHA1 Message Date
3a0d088517 Flip default value for mypy disallow_untyped_defs [5/11] (#127842)
See #127836 for details.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127842
Approved by: https://github.com/oulgen
2024-06-08 18:49:18 +00:00
284b0b5f44 Add --local-ranks-filter to torchrun: allow logs filtering by rank (#118562)
Addresses issue https://github.com/pytorch/pytorch/issues/117383

The implementation exposes `--local-ranks-filter` which filters by rank which files we pass to `TailLog` (used in torchrun to determine which logs to output to stdout/stderr)

## Behavior
### with --tee
Currently --tee is implemented as --redirect to file, and streams file to console using `tail`. When --tee is specified, file logs will be unaffected and we will only filter the output to console.

### with --redirect
When --redirect is specified without --tee, nothing is logged to console, so we no-op.

### with neither
When neither --tee or --redirect are specified, torchrun uses empty string "" to indicate logging to console. We intercept this empty string, and redirect it to "/dev/null" to not print to console.

The api also allows a per-rank configuration for --tee and --redirect, and is also supported by this filter implementation.

## Usage
### without --tee
```
> TORCH_LOGS_FORMAT="%(levelname)s: %(message)s" TORCH_LOGS="graph" torchrun --standalone --nproc_per_node=2 --role rank --local_rank_filter=0 t.py
hello from rank 0 python
DEBUG: TRACED GRAPH
 __compiled_fn_0 <eval_with_key>.0 opcode         name    target                   args       kwargs
-------------  ------  -----------------------  ---------  --------
placeholder    l_x_    L_x_                     ()         {}
call_function  mul     <built-in function mul>  (l_x_, 5)  {}
output         output  output                   ((mul,),)  {}
...
```
### with --tee
```
> TORCH_LOGS_FORMAT="%(levelname)s: %(message)s" TORCH_LOGS="graph" torchrun --standalone --nproc_per_node=2 --role rank --tee 3 --local_rank_filter=0 t.py
[rank0]:hello from rank 0 python
[rank0]:DEBUG: TRACED GRAPH
[rank0]: __compiled_fn_0 <eval_with_key>.0 opcode         name    target                   args       kwargs
[rank0]:-------------  ------  -----------------------  ---------  --------
[rank0]:placeholder    l_x_    L_x_                     ()         {}
[rank0]:call_function  mul     <built-in function mul>  (l_x_, 5)  {}
[rank0]:output         output  output                   ((mul,),)  {}
...
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/118562
Approved by: https://github.com/wconstab, https://github.com/wanchaol
2024-02-07 04:29:54 +00:00
a4355d6b9a Revert "Add --filter-rank to torchrun: allow logs filtering by rank (#118562)"
This reverts commit 73229b4f931f8cd1799b0905d61e3d8e85157bcd.

Reverted https://github.com/pytorch/pytorch/pull/118562 on behalf of https://github.com/xmfan due to breaks MAST precheck, flag naming conflict ([comment](https://github.com/pytorch/pytorch/pull/118562#issuecomment-1924916601))
2024-02-02 23:56:21 +00:00
73229b4f93 Add --filter-rank to torchrun: allow logs filtering by rank (#118562)
Addresses issue https://github.com/pytorch/pytorch/issues/117383

The implementation exposes `--filter-ranks` which filters by rank which files we pass to `TailLog` (used in torchrun to determine which logs to output to stdout/stderr)

## Behavior
### with --tee
Currently --tee is implemented as --redirect to file, and streams file to console using `tail`. When --tee is specified, file logs will be unaffected and we will only filter the output to console.

### with --redirect
When --redirect is specified without --tee, nothing is logged to console, so we no-op.

### with neither
When neither --tee or --redirect are specified, torchrun uses empty string "" to indicate logging to console. We intercept this empty string, and redirect it to "/dev/null" to not print to console.

The api also allows a per-rank configuration for --tee and --redirect, and is also supported by this filter implementation.

## Usage
### without --tee
```
> TORCH_LOGS_FORMAT="%(levelname)s: %(message)s" TORCH_LOGS="graph" torchrun --standalone --nproc_per_node=2 --role rank --filter_ranks=0 t.py
hello from rank 0 python
DEBUG: TRACED GRAPH
 __compiled_fn_0 <eval_with_key>.0 opcode         name    target                   args       kwargs
-------------  ------  -----------------------  ---------  --------
placeholder    l_x_    L_x_                     ()         {}
call_function  mul     <built-in function mul>  (l_x_, 5)  {}
output         output  output                   ((mul,),)  {}
...
```
### with --tee
```
> TORCH_LOGS_FORMAT="%(levelname)s: %(message)s" TORCH_LOGS="graph" torchrun --standalone --nproc_per_node=2 --role rank --tee 3 --filter_ranks=0 t.py
[rank0]:hello from rank 0 python
[rank0]:DEBUG: TRACED GRAPH
[rank0]: __compiled_fn_0 <eval_with_key>.0 opcode         name    target                   args       kwargs
[rank0]:-------------  ------  -----------------------  ---------  --------
[rank0]:placeholder    l_x_    L_x_                     ()         {}
[rank0]:call_function  mul     <built-in function mul>  (l_x_, 5)  {}
[rank0]:output         output  output                   ((mul,),)  {}
...
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/118562
Approved by: https://github.com/wconstab, https://github.com/wanchaol
2024-01-31 07:40:01 +00:00
a8097ed479 Fix docstring errors in _composable_state.py, remote_device.py, value_ranges.py, utils.py, run.py, rendezvous.py, launch.py, argparse_util.py, __init__.py, _cycles.py (#112953)
Fixes #112639

```txt
 torch/utils/_sympy/value_ranges.py
 torch/utils/_sympy/value_ranges.py:60 in public class `ValueRanges`:
        D101: Missing docstring in public class
torch/utils/_sympy/value_ranges.py:68 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/_sympy/value_ranges.py:81 in public method `__contains__`:
        D105: Missing docstring in magic method
torch/utils/_sympy/value_ranges.py:86 in public method `tighten`:
        D400: First line should end with a period (not 'n')
torch/utils/_sympy/value_ranges.py:90 in public method `__and__`:
        D105: Missing docstring in magic method
torch/utils/_sympy/value_ranges.py:103 in public method `__or__`:
        D105: Missing docstring in magic method
torch/utils/_sympy/value_ranges.py:113 in public method `is_singleton`:
        D102: Missing docstring in public method
torch/utils/_sympy/value_ranges.py:118 in public method `unknown`:
        D102: Missing docstring in public method
torch/utils/_sympy/value_ranges.py:122 in public method `wrap`:
        D102: Missing docstring in public method
torch/utils/_sympy/value_ranges.py:129 in public method `increasing_map`:
        D400: First line should end with a period (not ')')
torch/utils/_sympy/value_ranges.py:135 in public method `decreasing_map`:
        D400: First line should end with a period (not ')')
torch/utils/_sympy/value_ranges.py:141 in public method `monotone_map`:
        D400: First line should end with a period (not 'g')
torch/utils/_sympy/value_ranges.py:149 in public method `convex_min_zero_map`:
        D400: First line should end with a period (not '0')
torch/utils/_sympy/value_ranges.py:149 in public method `convex_min_zero_map`:
        D403: First word of the first line should be properly capitalized ('Fn', not 'fn')
torch/utils/_sympy/value_ranges.py:158 in public method `coordinatewise_increasing_map`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/_sympy/value_ranges.py:158 in public method `coordinatewise_increasing_map`:
        D400: First line should end with a period (not ':')
torch/utils/_sympy/value_ranges.py:171 in public method `coordinatewise_monotone_map`:
        D400: First line should end with a period (not 'e')
torch/utils/_sympy/value_ranges.py:180 in private class `SymPyValueRangeAnalysis`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/_sympy/value_ranges.py:180 in private class `SymPyValueRangeAnalysis`:
        D400: First line should end with a period (not 's')
torch/utils/_sympy/value_ranges.py:386 in private method `reciprocal`:
        D210: No whitespaces allowed surrounding docstring text
torch/utils/_sympy/value_ranges.py:386 in private method `reciprocal`:
        D400: First line should end with a period (not 'n')
torch/utils/_sympy/value_ranges.py:488 in public class `ValueRangeAnalysis`:
        D101: Missing docstring in public class
torch/utils/_sympy/value_ranges.py:489 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/_sympy/value_ranges.py:501 in public method `bool_handler`:
        D102: Missing docstring in public method
torch/utils/_sympy/value_ranges.py:506 in public method `default_handler`:
        D102: Missing docstring in public method
torch/utils/_sympy/value_ranges.py:511 in public method `load`:
        D102: Missing docstring in public method
torch/utils/_sympy/value_ranges.py:514 in public method `store`:
        D102: Missing docstring in public method
torch/utils/_sympy/value_ranges.py:517 in public method `reduction`:
        D102: Missing docstring in public method
torch/utils/_sympy/value_ranges.py:520 in public method `index_expr`:
        D102: Missing docstring in public method
torch/utils/_sympy/value_ranges.py:525 in public method `to_dtype`:
        D102: Missing docstring in public method
torch/utils/_sympy/value_ranges.py:558 in public method `square`:
        D102: Missing docstring in public method
torch/utils/_sympy/value_ranges.py:562 in public method `neg`:
        D102: Missing docstring in public method
torch/utils/_sympy/value_ranges.py:566 in public method `truncdiv`:
        D102: Missing docstring in public method
torch/utils/_sympy/value_ranges.py:577 in public method `sub`:
        D102: Missing docstring in public method
torch/utils/_sympy/value_ranges.py:580 in public method `__getattr__`:
        D105: Missing docstring in magic method
torch/utils/_sympy/value_ranges.py:585 in public function `bound_sympy`:
        D103: Missing docstring in public function
36
torch/utils/_sympy/value_ranges.py:60 in public class `ValueRanges`:
        D101: Missing docstring in public class
torch/utils/_sympy/value_ranges.py:68 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/_sympy/value_ranges.py:81 in public method `__contains__`:
        D105: Missing docstring in magic method
torch/utils/_sympy/value_ranges.py:86 in public method `tighten`:
        D400: First line should end with a period (not 'n')
torch/utils/_sympy/value_ranges.py:90 in public method `__and__`:
        D105: Missing docstring in magic method
torch/utils/_sympy/value_ranges.py:103 in public method `__or__`:
        D105: Missing docstring in magic method
torch/utils/_sympy/value_ranges.py:113 in public method `is_singleton`:
        D102: Missing docstring in public method
torch/utils/_sympy/value_ranges.py:118 in public method `unknown`:
        D102: Missing docstring in public method
torch/utils/_sympy/value_ranges.py:122 in public method `wrap`:
        D102: Missing docstring in public method
torch/utils/_sympy/value_ranges.py:182 in private class `SymPyValueRangeAnalysis`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/_sympy/value_ranges.py:182 in private class `SymPyValueRangeAnalysis`:
        D400: First line should end with a period (not 's')
torch/utils/_sympy/value_ranges.py:388 in private method `reciprocal`:
        D210: No whitespaces allowed surrounding docstring text
torch/utils/_sympy/value_ranges.py:388 in private method `reciprocal`:
        D400: First line should end with a period (not 'n')
torch/utils/_sympy/value_ranges.py:490 in public class `ValueRangeAnalysis`:
        D101: Missing docstring in public class
torch/utils/_sympy/value_ranges.py:491 in public method `__init__`:
        D107: Missing docstring in __init__
torch/utils/_sympy/value_ranges.py:503 in public method `bool_handler`:
        D102: Missing docstring in public method
torch/utils/_sympy/value_ranges.py:508 in public method `default_handler`:
        D102: Missing docstring in public method
torch/utils/_sympy/value_ranges.py:513 in public method `load`:
        D102: Missing docstring in public method
torch/utils/_sympy/value_ranges.py:516 in public method `store`:
        D102: Missing docstring in public method
torch/utils/_sympy/value_ranges.py:519 in public method `reduction`:
        D102: Missing docstring in public method
torch/utils/_sympy/value_ranges.py:522 in public method `index_expr`:
        D102: Missing docstring in public method
torch/utils/_sympy/value_ranges.py:527 in public method `to_dtype`:
        D102: Missing docstring in public method
torch/utils/_sympy/value_ranges.py:560 in public method `square`:
        D102: Missing docstring in public method
torch/utils/_sympy/value_ranges.py:564 in public method `neg`:
        D102: Missing docstring in public method
torch/utils/_sympy/value_ranges.py:568 in public method `truncdiv`:
        D102: Missing docstring in public method
torch/utils/_sympy/value_ranges.py:579 in public method `sub`:
        D102: Missing docstring in public method
torch/utils/_sympy/value_ranges.py:582 in public method `__getattr__`:
        D105: Missing docstring in magic method
torch/utils/_sympy/value_ranges.py:587 in public function `bound_sympy`:
        D103: Missing docstring in public function
28

torch/utils/viz/_cycles.py
torch/utils/viz/_cycles.py:14 in public function `observe_garbage`:
        D103: Missing docstring in public function
torch/utils/viz/_cycles.py:207 in public function `object_annotation`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/viz/_cycles.py:207 in public function `object_annotation`:
        D400: First line should end with a period (not 'g')
torch/utils/viz/_cycles.py:256 in public class `Node`:
        D101: Missing docstring in public class
torch/utils/viz/_cycles.py:262 in public function `create_graph`:
        D103: Missing docstring in public function
torch/utils/viz/_cycles.py:308 in public function `escape`:
        D103: Missing docstring in public function
torch/utils/viz/_cycles.py:312 in public function `is_cuda_tensor`:
        D103: Missing docstring in public function
torch/utils/viz/_cycles.py:315 in public function `cuda_allocation_context`:
        D103: Missing docstring in public function
torch/utils/viz/_cycles.py:335 in public function `to_dot`:
        D103: Missing docstring in public function
torch/utils/viz/_cycles.py:406 in public function `to_html`:
        D103: Missing docstring in public function
torch/utils/viz/_cycles.py:416 in public function `observe_tensor_cycles`:
        D103: Missing docstring in public function
torch/utils/viz/_cycles.py:429 in public function `warn_tensor_cycles`:
        D205: 1 blank line required between summary line and description (found 0)
torch/utils/viz/_cycles.py:429 in public function `warn_tensor_cycles`:
        D400: First line should end with a period (not 'p')
torch/utils/viz/_cycles.py:429 in public function `warn_tensor_cycles`:
        D401: First line should be in imperative mood; try rephrasing (found 'Reference')
14
torch/utils/viz/_cycles.py:14 in public function `observe_garbage`:
        D103: Missing docstring in public function
torch/utils/viz/_cycles.py:256 in public class `Node`:
        D101: Missing docstring in public class
torch/utils/viz/_cycles.py:262 in public function `create_graph`:
        D103: Missing docstring in public function
torch/utils/viz/_cycles.py:308 in public function `escape`:
        D103: Missing docstring in public function
torch/utils/viz/_cycles.py:312 in public function `is_cuda_tensor`:
        D103: Missing docstring in public function
torch/utils/viz/_cycles.py:315 in public function `cuda_allocation_context`:
        D103: Missing docstring in public function
torch/utils/viz/_cycles.py:335 in public function `to_dot`:
        D103: Missing docstring in public function
torch/utils/viz/_cycles.py:406 in public function `to_html`:
        D103: Missing docstring in public function
torch/utils/viz/_cycles.py:416 in public function `observe_tensor_cycles`:
        D103: Missing docstring in public function
9

torch/distributed/argparse_util.py
torch/distributed/argparse_util.py:1 at module level:
        D100: Missing docstring in public module
torch/distributed/argparse_util.py:13 in public class `env`:
        D205: 1 blank line required between summary line and description (found 0)
torch/distributed/argparse_util.py:13 in public class `env`:
        D400: First line should end with a period (not 'g')
torch/distributed/argparse_util.py:13 in public class `env`:
        D412: No blank lines allowed between a section header and its content ('Example')
torch/distributed/argparse_util.py:43 in public method `__init__`:
        D107: Missing docstring in __init__
torch/distributed/argparse_util.py:56 in public method `__call__`:
        D102: Missing docstring in public method
torch/distributed/argparse_util.py:61 in public class `check_env`:
        D205: 1 blank line required between summary line and description (found 0)
torch/distributed/argparse_util.py:61 in public class `check_env`:
        D400: First line should end with a period (not 's')
torch/distributed/argparse_util.py:61 in public class `check_env`:
        D412: No blank lines allowed between a section header and its content ('Example')
torch/distributed/argparse_util.py:97 in public method `__init__`:
        D107: Missing docstring in __init__
torch/distributed/argparse_util.py:102 in public method `__call__`:
        D102: Missing docstring in public method
11
torch/distributed/argparse_util.py:1 at module level:
        D100: Missing docstring in public module
torch/distributed/argparse_util.py:43 in public method `__init__`:
        D107: Missing docstring in __init__
torch/distributed/argparse_util.py:56 in public method `__call__`:
        D102: Missing docstring in public method
torch/distributed/argparse_util.py:97 in public method `__init__`:
        D107: Missing docstring in __init__
torch/distributed/argparse_util.py:102 in public method `__call__`:
        D102: Missing docstring in public method
5

torch/distributed/_composable_state.py
torch/distributed/_composable_state.py:20 in private function `_get_module_state`:
        D202: No blank lines allowed after function docstring (found 1)
torch/distributed/_composable_state.py:20 in private function `_get_module_state`:
        D205: 1 blank line required between summary line and description (found 0)
torch/distributed/_composable_state.py:20 in private function `_get_module_state`:
        D400: First line should end with a period (not '`')
3
0

torch/distributed/launch.py
torch/distributed/launch.py:1 at module level:
        D205: 1 blank line required between summary line and description (found 0)
torch/distributed/launch.py:1 at module level:
        D400: First line should end with a period (not 'd')
torch/distributed/launch.py:156 in public function `parse_args`:
        D103: Missing docstring in public function
torch/distributed/launch.py:171 in public function `launch`:
        D103: Missing docstring in public function
torch/distributed/launch.py:180 in public function `main`:
        D103: Missing docstring in public function
5
torch/distributed/launch.py:157 in public function `parse_args`:
        D103: Missing docstring in public function
torch/distributed/launch.py:172 in public function `launch`:
        D103: Missing docstring in public function
torch/distributed/launch.py:181 in public function `main`:
        D103: Missing docstring in public function
3

torch/distributed/remote_device.py
torch/distributed/remote_device.py:1 at module level:
        D100: Missing docstring in public module
torch/distributed/remote_device.py:81 in private method `worker_name`:
        D205: 1 blank line required between summary line and description (found 0)
torch/distributed/remote_device.py:81 in private method `worker_name`:
        D401: First line should be in imperative mood (perhaps 'Return', not 'Returns')
torch/distributed/remote_device.py:88 in private method `rank`:
        D205: 1 blank line required between summary line and description (found 0)
torch/distributed/remote_device.py:88 in private method `rank`:
        D401: First line should be in imperative mood (perhaps 'Return', not 'Returns')
torch/distributed/remote_device.py:95 in private method `device`:
        D200: One-line docstring should fit on one line with quotes (found 3)
torch/distributed/remote_device.py:95 in private method `device`:
        D401: First line should be in imperative mood (perhaps 'Return', not 'Returns')
7
torch/distributed/remote_device.py:1 at module level:
        D100: Missing docstring in public module
torch/distributed/remote_device.py:85 in private method `rank`:
        D205: 1 blank line required between summary line and description (found 0)
torch/distributed/remote_device.py:85 in private method `rank`:
        D401: First line should be in imperative mood (perhaps 'Return', not 'Returns')
3

torch/distributed/rendezvous.py
torch/distributed/rendezvous.py:1 at module level:
        D100: Missing docstring in public module
torch/distributed/rendezvous.py:23 in public function `register_rendezvous_handler`:
        D401: First line should be in imperative mood (perhaps 'Register', not 'Registers')
torch/distributed/rendezvous.py:88 in public function `rendezvous`:
        D103: Missing docstring in public function
torch/distributed/rendezvous.py:147 in private function `_create_c10d_store`:
        D205: 1 blank line required between summary line and description (found 0)
torch/distributed/rendezvous.py:147 in private function `_create_c10d_store`:
        D400: First line should end with a period (not 'r')
5
torch/distributed/rendezvous.py:1 at module level:
        D100: Missing docstring in public module
torch/distributed/rendezvous.py:89 in public function `rendezvous`:
        D103: Missing docstring in public function
2

torch/distributed/run.py
torch/distributed/run.py:9 at module level:
        D205: 1 blank line required between summary line and description (found 0)
torch/distributed/run.py:9 at module level:
        D400: First line should end with a period (not '`')
torch/distributed/run.py:393 in public function `get_args_parser`:
        D202: No blank lines allowed after function docstring (found 1)
torch/distributed/run.py:393 in public function `get_args_parser`:
        D401: First line should be in imperative mood; try rephrasing (found 'Helper')
torch/distributed/run.py:610 in public function `parse_args`:
        D103: Missing docstring in public function
torch/distributed/run.py:615 in public function `parse_min_max_nnodes`:
        D103: Missing docstring in public function
torch/distributed/run.py:629 in public function `determine_local_world_size`:
        D103: Missing docstring in public function
torch/distributed/run.py:670 in public function `get_rdzv_endpoint`:
        D103: Missing docstring in public function
torch/distributed/run.py:677 in public function `get_use_env`:
        D205: 1 blank line required between summary line and description (found 0)
torch/distributed/run.py:677 in public function `get_use_env`:
        D401: First line should be in imperative mood (perhaps 'Retrieve', not 'Retrieves')
torch/distributed/run.py:689 in public function `config_from_args`:
        D103: Missing docstring in public function
torch/distributed/run.py:770 in public function `run_script_path`:
        D205: 1 blank line required between summary line and description (found 0)
torch/distributed/run.py:770 in public function `run_script_path`:
        D401: First line should be in imperative mood (perhaps 'Run', not 'Runs')
torch/distributed/run.py:781 in public function `run`:
        D103: Missing docstring in public function
torch/distributed/run.py:804 in public function `main`:
        D103: Missing docstring in public function
15
torch/distributed/run.py:611 in public function `parse_args`:
        D103: Missing docstring in public function
torch/distributed/run.py:616 in public function `parse_min_max_nnodes`:
        D103: Missing docstring in public function
torch/distributed/run.py:630 in public function `determine_local_world_size`:
        D103: Missing docstring in public function
torch/distributed/run.py:671 in public function `get_rdzv_endpoint`:
        D103: Missing docstring in public function
torch/distributed/run.py:691 in public function `config_from_args`:
        D103: Missing docstring in public function
torch/distributed/run.py:784 in public function `run`:
        D103: Missing docstring in public function
torch/distributed/run.py:807 in public function `main`:
        D103: Missing docstring in public function
7

torch/distributed/__init__.py
torch/distributed/__init__.py:1 at module level:
        D104: Missing docstring in public package
torch/distributed/__init__.py:8 in public function `is_available`:
        D205: 1 blank line required between summary line and description (found 0)
torch/distributed/__init__.py:8 in public function `is_available`:
        D400: First line should end with a period (not ',')
torch/distributed/__init__.py:8 in public function `is_available`:
        D401: First line should be in imperative mood (perhaps 'Return', not 'Returns')
4
torch/distributed/__init__.py:1 at module level:
        D104: Missing docstring in public package
1

torch/distributed/utils.py:1 at module level:
        D100: Missing docstring in public module
torch/distributed/utils.py:16 in private function `_pack_kwargs`:
        D205: 1 blank line required between summary line and description (found 0)
torch/distributed/utils.py:16 in private function `_pack_kwargs`:
        D400: First line should end with a period (not ')')
torch/distributed/utils.py:47 in private function `_cast_forward_inputs`:
        D205: 1 blank line required between summary line and description (found 0)
torch/distributed/utils.py:88 in private function `_recursive_to`:
        D200: One-line docstring should fit on one line with quotes (found 3)
torch/distributed/utils.py:141 in private function `_p_assert`:
        D205: 1 blank line required between summary line and description (found 0)
torch/distributed/utils.py:141 in private function `_p_assert`:
        D209: Multi-line docstring closing quotes should be on a separate line
torch/distributed/utils.py:141 in private function `_p_assert`:
        D400: First line should end with a period (not 't')
torch/distributed/utils.py:141 in private function `_p_assert`:
        D401: First line should be in imperative mood; try rephrasing (found 'This')
torch/distributed/utils.py:275 in private function `_sync_module_states`:
        D205: 1 blank line required between summary line and description (found 0)
torch/distributed/utils.py:275 in private function `_sync_module_states`:
        D400: First line should end with a period (not 'n')
torch/distributed/utils.py:275 in private function `_sync_module_states`:
        D401: First line should be in imperative mood (perhaps 'Sync', not 'Syncs')
torch/distributed/utils.py:300 in private function `_sync_params_and_buffers`:
        D205: 1 blank line required between summary line and description (found 0)
torch/distributed/utils.py:300 in private function `_sync_params_and_buffers`:
        D400: First line should end with a period (not 'y')
torch/distributed/utils.py:300 in private function `_sync_params_and_buffers`:
        D401: First line should be in imperative mood (perhaps 'Synchronize', not 'Synchronizes')
15
torch/distributed/utils.py:1 at module level:
        D100: Missing docstring in public module
1
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112953
Approved by: https://github.com/weifengpy
2023-11-08 01:13:09 +00:00
a6940aae37 [19/n][torch/elastic][upstream] Replace pytorch.distributed.launch with torchelastic launcher (#56214)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56214

Pull Request resolved: https://github.com/pytorch/pytorch/pull/56037

The diff introduces new  `torch.distributed.elastic_launch` and removes internals of `torch.distributed.launch` keeping backwards compatibility.

Since torchelastic and torch.launch are not fully compatible due to `--use_env` arg, the `torch.distributed.launch` deprecation is going to be iterative: as part of pytorch 1.9 we are going to deprecate it, and in the following releases we will remove `torch.distributed.launch`

The diff leaves `torchelastic.distributed.launch` module, and the follow up diffs will migrate the users form `torchelastic.distributed.launch` to `torch.distributed.elastic_launch`

Test Plan: buck test mode/dev-nosan //pytorch/elastic/torchelastic/distributed/...

Reviewed By: H-Huang

Differential Revision: D27805799

fbshipit-source-id: 599a4c0592fbc7a1bc1953040626dd6b72bac907
2021-04-16 13:38:23 -07:00
90e103ddfe Revert D27753803: [19/n][torch/elastic][upstream] Replace pytorch.distributed.launch with torchelastic launcher
Test Plan: revert-hammer

Differential Revision:
D27753803 (7c708ef4ea)

Original commit changeset: 5f24bcfdcb70

fbshipit-source-id: 650e229b788d046450615364e5cba65065a95e3b
2021-04-15 15:03:14 -07:00
7c708ef4ea [19/n][torch/elastic][upstream] Replace pytorch.distributed.launch with torchelastic launcher (#56037)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56037

The diff introduces new  `torch.distributed.elastic_launch` and removes internals of `torch.distributed.launch` keeping backwards compatibility.

Since torchelastic and torch.launch are not fully compatible due to `--use_env` arg, the `torch.distributed.launch` deprecation is going to be iterative: as part of pytorch 1.9 we are going to deprecate it, and in the following releases we will remove `torch.distributed.launch`

The diff leaves `torchelastic.distributed.launch` module, and the follow up diffs will migrate the users form `torchelastic.distributed.launch` to `torch.distributed.elastic_launch`

Test Plan: buck test mode/dev-nosan //pytorch/elastic/torchelastic/distributed/...

Reviewed By: cbalioglu

Differential Revision: D27753803

fbshipit-source-id: 5f24bcfdcb70356f0787b11f6cb9479f3515fb47
2021-04-15 11:09:12 -07:00