Polish DDP join API docstrings (#43973)
Summary: Polishes DDP join api docstrings and makes a few minor cosmetic changes.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/43973
Reviewed By: zou3519
Differential Revision: D23467238
Pulled By: rohan-varma
fbshipit-source-id: faf0ee56585fca5cc16f6891ea88032336b3be56
commit 3806c939bd
parent 442684cb25
committed by Facebook GitHub Bot
@@ -231,6 +231,7 @@ class DistributedDataParallel(Module):
                        parameters.
 
     Example::
+
         >>> import torch.distributed.autograd as dist_autograd
         >>> from torch.nn.parallel import DistributedDataParallel as DDP
         >>> from torch import optim
@@ -688,7 +689,7 @@ class DistributedDataParallel(Module):
     def join(self, divide_by_initial_world_size=True, enable=True):
         r"""
         A context manager to be used in conjunction with an instance of
-        :class:`torch.nn.parallel.distributed.DistributedDataParallel` to be
+        :class:`torch.nn.parallel.DistributedDataParallel` to be
         able to train with uneven inputs across participating processes.
 
         This context manager will keep track of already-joined DDP processes,
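This hunk covers only the opening lines of the ``join`` docstring, so a minimal usage sketch may help; ``ddp_model``, ``loader``, and ``optimizer`` are hypothetical names, and the process group is assumed to be initialized already:

    # Each rank wraps its training loop in ``join`` so that ranks which
    # exhaust their (possibly shorter) data early keep participating in
    # DDP's collective communication instead of hanging the other ranks.
    with ddp_model.join(divide_by_initial_world_size=True, enable=True):
        for batch in loader:
            loss = ddp_model(batch).sum()
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()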
@@ -710,10 +711,10 @@ class DistributedDataParallel(Module):
 
         .. warning::
             This module works only with the multi-process, single-device usage
-            of :class:`torch.nn.parallel.distributed.DistributedDataParallel`,
+            of :class:`torch.nn.parallel.DistributedDataParallel`,
             which means that a single process works on a single GPU.
 
-        ..warning::
+        .. warning::
             This module currently does not support custom distributed collective
             operations in the forward pass, such as ``SyncBatchNorm`` or other
             custom defined collectives in the model's forward pass.
@@ -731,7 +732,7 @@ class DistributedDataParallel(Module):
                 ``world_size`` even when we encounter uneven inputs. If you set
                 this to ``False``, we divide the gradient by the remaining
                 number of nodes. This ensures parity with training on a smaller
-                world_size although it also means the uneven inputs would
+                ``world_size`` although it also means the uneven inputs would
                 contribute more towards the global gradient. Typically, you
                 would want to set this to ``True`` for cases where the last few
                 inputs of your training job are uneven. In extreme cases, where
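The scaling choice this hunk documents can be made concrete with a small worked example; the numbers below are invented for illustration and are not from the source:

    # Suppose 4 ranks start training and 1 has already joined (run out
    # of data), leaving 3 active ranks whose gradients sum to 6.0.
    initial_world_size = 4
    active_ranks = 3
    grad_sum = 6.0

    # divide_by_initial_world_size=True: the divisor stays fixed at the
    # starting world size, so each input keeps the same weight it would
    # have had under even inputs.
    grad_true = grad_sum / initial_world_size   # 1.5

    # divide_by_initial_world_size=False: divide by the ranks that are
    # still active, matching training on the smaller world size; the
    # remaining (uneven) inputs contribute more to the global gradient.
    grad_false = grad_sum / active_ranks        # 2.0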
@@ -744,6 +745,7 @@ class DistributedDataParallel(Module):
 
 
         Example::
+
             >>> import torch
             >>> import torch.distributed as dist
             >>> import os
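The docstring's example is cut off by the diff context right after its imports. Purely as an illustration (not the docstring's actual continuation), here is a self-contained sketch of uneven-input training with ``join``; it assumes two GPUs with the NCCL backend, and the worker function, port, and batch counts are invented for the example:

    import os
    import torch
    import torch.distributed as dist
    import torch.multiprocessing as mp
    from torch.nn.parallel import DistributedDataParallel as DDP

    def worker(rank, world_size):
        os.environ["MASTER_ADDR"] = "localhost"
        os.environ["MASTER_PORT"] = "29500"  # assumed free port
        dist.init_process_group("nccl", rank=rank, world_size=world_size)
        # One process per GPU, per the single-device warning above.
        model = DDP(torch.nn.Linear(1, 1).to(rank), device_ids=[rank])
        # Rank 0 gets one extra batch, so the inputs are uneven.
        num_batches = 5 + (1 if rank == 0 else 0)
        with model.join():
            for _ in range(num_batches):
                out = model(torch.ones(8, 1, device=rank))
                out.sum().backward()
        dist.destroy_process_group()

    if __name__ == "__main__":
        mp.spawn(worker, args=(2,), nprocs=2)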