mirror of https://github.com/pytorch/pytorch.git (synced 2025-10-20 12:54:11 +08:00)
Convert to markdown: distributed.tensor.parallel.rst, distributed.tensor.rst, distributions.rst, dlpack.rst (#155297)
Fixes #155019

## Description

Convert to markdown: distributed.tensor.parallel.rst, distributed.tensor.rst, distributions.rst, dlpack.rst

## Checklist

- [x] dlpack.rst converted to dlpack.md --> [Preview](https://docs-preview.pytorch.org/pytorch/pytorch/155297/dlpack.html)
- [x] distributions.rst converted to distributions.md --> [Preview](https://docs-preview.pytorch.org/pytorch/pytorch/155297/distributions.html)
- [x] distributed.tensor.rst converted to distributed.tensor.md --> [Preview](https://docs-preview.pytorch.org/pytorch/pytorch/155297/distributed.tensor.html)
- [x] distributed.tensor.parallel.rst converted to distributed.tensor.parallel.md --> [Preview](https://docs-preview.pytorch.org/pytorch/pytorch/155297/distributed.tensor.parallel.html)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/155297
Approved by: https://github.com/svekars

Co-authored-by: Svetlana Karslioglu <svekars@meta.com>
Committed by PyTorch MergeBot · parent 764c02b78b · commit 799443605b
docs/source/distributed.tensor.md (new file, +250 lines)
:::{currentmodule} torch.distributed.tensor
:::

# torch.distributed.tensor

:::{note}
`torch.distributed.tensor` is currently in an alpha state and under
development. We are committed to backward compatibility for most of the APIs listed
in this doc, but there might be API changes if necessary.
:::

## PyTorch DTensor (Distributed Tensor)

PyTorch DTensor offers simple and flexible tensor sharding primitives that transparently handle distributed
logic, including sharded storage, operator computation, and collective communications across devices/hosts.
`DTensor` can be used to build different parallelism solutions and supports a sharded state_dict representation
when working with multi-dimensional sharding.

Please see examples from the PyTorch native parallelism solutions that are built on top of `DTensor`:

- [Tensor Parallel](https://pytorch.org/docs/main/distributed.tensor.parallel.html)
- [FSDP2](https://github.com/pytorch/torchtitan/blob/main/docs/fsdp.md)

```{eval-rst}
.. automodule:: torch.distributed.tensor
```

{class}`DTensor` follows the SPMD (single program, multiple data) programming model to empower users to
write a distributed program as if it were a **single-device program with the same convergence property**. It
provides a uniform tensor sharding layout (DTensor Layout) by specifying the {class}`DeviceMesh`
and {class}`Placement`:

- {class}`DeviceMesh` represents the device topology and the communicators of the cluster using
  an n-dimensional array.
- {class}`Placement` describes the sharding layout of the logical tensor on the {class}`DeviceMesh`.
  DTensor supports three types of placements: {class}`Shard`, {class}`Replicate` and {class}`Partial`.

### DTensor Class APIs

```{eval-rst}
.. currentmodule:: torch.distributed.tensor
```

{class}`DTensor` is a `torch.Tensor` subclass. This means that once a {class}`DTensor` is created, it can be
used in a very similar way to `torch.Tensor`, including running different types of PyTorch operators as if
running them on a single device, allowing proper distributed computation for PyTorch operators.

In addition to the existing `torch.Tensor` methods, it also offers a set of additional methods to convert to/from
`torch.Tensor`, `redistribute` the DTensor to a new DTensor Layout, get the full tensor content
on all devices, etc.

```{eval-rst}
.. autoclass:: DTensor
    :members: from_local, to_local, full_tensor, redistribute, device_mesh, placements
    :member-order: groupwise
    :special-members: __create_chunk_list__

```
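
To make these class APIs concrete, here is a minimal sketch, assuming a 4-GPU launch via `torchrun --nproc-per-node=4` so that each rank owns one shard:

```python
import os
import torch
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed.tensor import DTensor, Shard, Replicate

# One process per GPU; pin each rank to its own device.
torch.cuda.set_device(int(os.environ["LOCAL_RANK"]))
mesh = init_device_mesh("cuda", (4,))

# Wrap this rank's local shard into a DTensor that is logically 4x larger on dim 0.
local_part = torch.randn(2, 8, device="cuda")
dtensor = DTensor.from_local(local_part, device_mesh=mesh, placements=[Shard(0)])

print(dtensor.shape)             # logical (global) shape: torch.Size([8, 8])
print(dtensor.to_local().shape)  # this rank's shard: torch.Size([2, 8])

# Change the layout from sharded to replicated (an all-gather under the hood).
replicated = dtensor.redistribute(placements=[Replicate()])

# Materialize the full logical tensor on every rank.
full = dtensor.full_tensor()
```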

### DeviceMesh as the distributed communicator

```{eval-rst}
.. currentmodule:: torch.distributed.device_mesh
```

{class}`DeviceMesh` was built from DTensor as the abstraction to describe the cluster's device topology and to represent
multi-dimensional communicators (on top of `ProcessGroup`). To see the details of how to create/use a DeviceMesh,
please refer to the [DeviceMesh recipe](https://pytorch.org/tutorials/recipes/distributed_device_mesh.html).
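
For example, a 2x4 mesh over 8 GPUs with named data-parallel and tensor-parallel dimensions could be created roughly like this (a sketch, assuming `torchrun --nproc-per-node=8`):

```python
import os
import torch
from torch.distributed.device_mesh import init_device_mesh

torch.cuda.set_device(int(os.environ["LOCAL_RANK"]))

# Outer dim ("dp") for data parallelism, inner dim ("tp") for tensor parallelism.
mesh_2d = init_device_mesh("cuda", (2, 4), mesh_dim_names=("dp", "tp"))

dp_mesh = mesh_2d["dp"]  # 1-D sub-mesh / communicator along the "dp" dimension
tp_mesh = mesh_2d["tp"]  # 1-D sub-mesh / communicator along the "tp" dimension
print(dp_mesh.size(), tp_mesh.size())  # 2 4
```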

### DTensor Placement Types

```{eval-rst}
.. automodule:: torch.distributed.tensor.placement_types
```

```{eval-rst}
.. currentmodule:: torch.distributed.tensor.placement_types
```

DTensor supports the following types of {class}`Placement` on each {class}`DeviceMesh` dimension:

```{eval-rst}
.. autoclass:: Shard
    :members:
    :undoc-members:
```

```{eval-rst}
.. autoclass:: Replicate
    :members:
    :undoc-members:
```

```{eval-rst}
.. autoclass:: Partial
    :members:
    :undoc-members:
```

```{eval-rst}
.. autoclass:: Placement
    :members:
    :undoc-members:
```
(create_dtensor)=

## Different ways to create a DTensor

```{eval-rst}
.. currentmodule:: torch.distributed.tensor
```

There are three ways to construct a {class}`DTensor`:

- {meth}`distribute_tensor` creates a {class}`DTensor` from a logical or "global" `torch.Tensor` on
  each rank. This can be used to shard the leaf `torch.Tensor` s (i.e. model parameters/buffers
  and inputs).
- {meth}`DTensor.from_local` creates a {class}`DTensor` from a local `torch.Tensor` on each rank, which can
  be used to create a {class}`DTensor` from non-leaf `torch.Tensor` s (i.e. intermediate activation
  tensors during forward/backward).
- DTensor provides dedicated tensor factory functions (e.g. {meth}`empty`, {meth}`ones`, {meth}`randn`, etc.)
  to allow different {class}`DTensor` creations by directly specifying the {class}`DeviceMesh` and
  {class}`Placement`. Compared to {meth}`distribute_tensor`, this directly materializes the sharded memory
  on device, instead of performing sharding after initializing the logical Tensor memory.

### Create DTensor from a logical torch.Tensor

The SPMD (single program, multiple data) programming model in `torch.distributed` launches multiple processes
(i.e. via `torchrun`) to execute the same program. This means that the model inside the program would be
initialized on different processes first (i.e. the model might be initialized on CPU, on a meta device, or directly
on GPU if there is enough memory).

`DTensor` offers a {meth}`distribute_tensor` API that shards the model weights or Tensors to `DTensor` s,
where it creates a DTensor from the "logical" Tensor on each process. This empowers the created
`DTensor` s to comply with the single-device semantic, which is critical for **numerical correctness**.

```{eval-rst}
.. autofunction:: distribute_tensor
```
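
A short sketch of sharding a "global" weight across ranks, assuming a 4-GPU `torchrun` launch:

```python
import os
import torch
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed.tensor import distribute_tensor, Shard, Replicate

torch.cuda.set_device(int(os.environ["LOCAL_RANK"]))
mesh = init_device_mesh("cuda", (4,))

# The same logical tensor is constructed on every rank (e.g. from a seeded init).
big_weight = torch.randn(1024, 1024)

# Row-wise shard: each rank ends up storing a (256, 1024) local piece.
sharded = distribute_tensor(big_weight, device_mesh=mesh, placements=[Shard(0)])

# Or replicate: each rank stores the full (1024, 1024) copy.
replicated = distribute_tensor(big_weight, device_mesh=mesh, placements=[Replicate()])
```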

Along with {meth}`distribute_tensor`, DTensor also offers a {meth}`distribute_module` API to allow easier
sharding at the {class}`nn.Module` level:

```{eval-rst}
.. autofunction:: distribute_module

```
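
A minimal sketch of {meth}`distribute_module` with a custom partition function; the toy module and the `replicate_params` helper below are illustrative, not part of the API:

```python
import os
import torch
import torch.nn as nn
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed.tensor import distribute_module, distribute_tensor, Replicate

torch.cuda.set_device(int(os.environ["LOCAL_RANK"]))
mesh = init_device_mesh("cuda", (4,))

class MyMLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(8, 8)
        self.fc2 = nn.Linear(8, 8)

    def forward(self, x):
        return self.fc2(self.fc1(x).relu())

def replicate_params(name, module, device_mesh):
    # Called for every submodule: turn its parameters into replicated DTensors.
    for param_name, param in list(module.named_parameters(recurse=False)):
        dist_param = nn.Parameter(distribute_tensor(param, device_mesh, [Replicate()]))
        module.register_parameter(param_name, dist_param)

sharded_module = distribute_module(MyMLP(), mesh, partition_fn=replicate_params)
```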

### DTensor Factory Functions

DTensor also provides dedicated tensor factory functions to allow creating {class}`DTensor` directly
using `torch.Tensor`-like factory function APIs (e.g. `torch.ones`, `torch.empty`, etc.), by additionally
specifying the {class}`DeviceMesh` and {class}`Placement` for the {class}`DTensor` created:

```{eval-rst}
.. autofunction:: zeros
```

```{eval-rst}
.. autofunction:: ones
```

```{eval-rst}
.. autofunction:: empty
```

```{eval-rst}
.. autofunction:: full
```

```{eval-rst}
.. autofunction:: rand
```

```{eval-rst}
.. autofunction:: randn

```
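
As a quick illustration (a sketch, assuming the same 4-GPU setup as above), the factory functions take the global shape plus the mesh and placements:

```python
import os
import torch
import torch.distributed.tensor as dt
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed.tensor import Shard, Replicate

torch.cuda.set_device(int(os.environ["LOCAL_RANK"]))
mesh = init_device_mesh("cuda", (4,))

# Logical shape (1024, 1024); each rank only allocates its own (256, 1024) shard.
w = dt.ones(1024, 1024, device_mesh=mesh, placements=[Shard(0)])

# Fully replicated zeros: every rank allocates the whole (64,) tensor.
b = dt.zeros(64, device_mesh=mesh, placements=[Replicate()])
```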

## Debugging

```{eval-rst}
.. automodule:: torch.distributed.tensor.debug
```

```{eval-rst}
.. currentmodule:: torch.distributed.tensor.debug
```

### Logging

When launching the program, you can turn on additional logging using the `TORCH_LOGS` environment variable from
[torch._logging](https://pytorch.org/docs/main/logging.html#module-torch._logging):

- `TORCH_LOGS=+dtensor` will display `logging.DEBUG` messages and all levels above it.
- `TORCH_LOGS=dtensor` will display `logging.INFO` messages and above.
- `TORCH_LOGS=-dtensor` will display `logging.WARNING` messages and above.

### Debugging Tools

To debug a program that uses DTensor, and to understand more details about what collectives happened under the
hood, DTensor provides a {class}`CommDebugMode`:

```{eval-rst}
.. autoclass:: CommDebugMode
    :members:
    :undoc-members:
```
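
For instance, the following sketch (assuming a 4-GPU launch) records the collectives triggered by an explicit redistribute; the exact reporting helpers may vary slightly between releases:

```python
import os
import torch
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed.tensor import distribute_tensor, Shard, Replicate
from torch.distributed.tensor.debug import CommDebugMode

torch.cuda.set_device(int(os.environ["LOCAL_RANK"]))
mesh = init_device_mesh("cuda", (4,))

x = distribute_tensor(torch.randn(64, 32), mesh, [Shard(0)])

comm_mode = CommDebugMode()
with comm_mode:
    # Going from Shard(0) to Replicate() requires an all-gather.
    x_replicated = x.redistribute(mesh, [Replicate()])

# Summarize which collectives ran and how often.
print(comm_mode.get_comm_counts())
```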

To visualize the sharding of a DTensor that has fewer than 3 dimensions, DTensor provides {meth}`visualize_sharding`:

```{eval-rst}
.. autofunction:: visualize_sharding

```

## Experimental Features

`DTensor` also provides a set of experimental features. These features are either in the prototyping stage, or their basic
functionality is done and we are looking for user feedback. Please submit an issue to PyTorch if you have feedback on
these features.

```{eval-rst}
.. automodule:: torch.distributed.tensor.experimental
```

```{eval-rst}
.. currentmodule:: torch.distributed.tensor.experimental
```

```{eval-rst}
.. autofunction:: context_parallel
```

```{eval-rst}
.. autofunction:: local_map
```

```{eval-rst}
.. autofunction:: register_sharding

```

% modules that are missing docs, add the doc later when necessary

```{eval-rst}
.. py:module:: torch.distributed.tensor.device_mesh
```
docs/source/distributed.tensor.parallel.md (new file, +92 lines)
:::{role} hidden
:class: hidden-section
:::

# Tensor Parallelism - torch.distributed.tensor.parallel

Tensor Parallelism (TP) is built on top of the PyTorch DistributedTensor
([DTensor](https://github.com/pytorch/pytorch/blob/main/torch/distributed/tensor/README.md))
and provides different parallelism styles: Colwise, Rowwise, and Sequence Parallelism.

:::{warning}
Tensor Parallelism APIs are experimental and subject to change.
:::

The entrypoint to parallelize your `nn.Module` using Tensor Parallelism is:

```{eval-rst}
.. automodule:: torch.distributed.tensor.parallel
```

```{eval-rst}
.. currentmodule:: torch.distributed.tensor.parallel
```

```{eval-rst}
.. autofunction:: parallelize_module
```
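
As a rough sketch (assuming an 8-GPU `torchrun` launch; the toy module and plan are illustrative), the entrypoint is typically used like this:

```python
import os
import torch
import torch.nn as nn
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed.tensor.parallel import (
    ColwiseParallel,
    RowwiseParallel,
    parallelize_module,
)

torch.cuda.set_device(int(os.environ["LOCAL_RANK"]))
tp_mesh = init_device_mesh("cuda", (8,))

class FeedForward(nn.Module):
    def __init__(self, dim=256, hidden=1024):
        super().__init__()
        self.w1 = nn.Linear(dim, hidden)
        self.w2 = nn.Linear(hidden, dim)

    def forward(self, x):
        return self.w2(self.w1(x).relu())

model = FeedForward().cuda()

# w1 is split column-wise and w2 row-wise, so the pair needs a single
# all-reduce on the output of w2 in the forward pass.
model = parallelize_module(
    model,
    tp_mesh,
    {"w1": ColwiseParallel(), "w2": RowwiseParallel()},
)
```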

Tensor Parallelism supports the following parallel styles:

```{eval-rst}
.. autoclass:: torch.distributed.tensor.parallel.ColwiseParallel
    :members:
    :undoc-members:
```

```{eval-rst}
.. autoclass:: torch.distributed.tensor.parallel.RowwiseParallel
    :members:
    :undoc-members:
```

```{eval-rst}
.. autoclass:: torch.distributed.tensor.parallel.SequenceParallel
    :members:
    :undoc-members:
```

To simply configure the nn.Module's inputs and outputs with DTensor layouts
and perform necessary layout redistributions, without distributing the module
parameters to DTensors, the following `ParallelStyle` s can be used in
the `parallelize_plan` when calling `parallelize_module`:

```{eval-rst}
.. autoclass:: torch.distributed.tensor.parallel.PrepareModuleInput
    :members:
    :undoc-members:
```

```{eval-rst}
.. autoclass:: torch.distributed.tensor.parallel.PrepareModuleOutput
    :members:
    :undoc-members:
```

```{eval-rst}
.. autoclass:: torch.distributed.tensor.parallel.PrepareModuleInputOutput
    :members:
    :undoc-members:
```

:::{note}
When using `Shard(dim)` as the input/output layouts for the above
`ParallelStyle` s, we assume the input/output activation tensors are evenly sharded on
the tensor dimension `dim` on the `DeviceMesh` that TP operates on. For instance,
since `RowwiseParallel` accepts input that is sharded on the last dimension, it assumes
the input tensor has already been evenly sharded on the last dimension. For the case of unevenly sharded activation tensors, one could pass DTensors directly to the partitioned modules and use `use_local_output=False` to return a DTensor after each `ParallelStyle`, where the DTensor can track the uneven sharding information.
:::

For models like Transformer, we recommend users use `ColwiseParallel`
and `RowwiseParallel` together in the parallelize_plan to achieve the desired
sharding for the entire model (i.e. Attention and MLP).

Parallelized cross-entropy loss computation (loss parallelism) is supported via the following context manager:

```{eval-rst}
.. autofunction:: torch.distributed.tensor.parallel.loss_parallel
```
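
A usage sketch, assuming `logits` is a DTensor sharded on the class (last) dimension by the TP plan above and `labels` is a regular label tensor:

```python
import torch.nn.functional as F
from torch.distributed.tensor.parallel import loss_parallel

# Both the loss computation and the backward pass must run inside the context.
with loss_parallel():
    loss = F.cross_entropy(logits, labels)
    loss.backward()
```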

:::{warning}
The loss_parallel API is experimental and subject to change.
:::

@ -1,71 +0,0 @@ docs/source/distributed.tensor.parallel.rst deleted (converted to the Markdown file above)

@ -1,195 +0,0 @@ docs/source/distributed.tensor.rst deleted (converted to the Markdown file above)
@ -1,460 +1,692 @@ docs/source/distributions.md (converted from distributions.rst)
|
||||
```{eval-rst}
|
||||
.. role:: hidden
|
||||
:class: hidden-section
|
||||
```
|
||||
|
||||
# Probability distributions - torch.distributions
|
||||
|
||||
```{eval-rst}
|
||||
.. automodule:: torch.distributions
|
||||
```
|
||||
|
||||
```{eval-rst}
|
||||
.. currentmodule:: torch.distributions
|
||||
```
|
||||
|
||||
## {hidden}`Distribution`
|
||||
|
||||
```{eval-rst}
|
||||
.. currentmodule:: torch.distributions.distribution
|
||||
```
|
||||
|
||||
```{eval-rst}
|
||||
.. autoclass:: Distribution
|
||||
:members:
|
||||
:show-inheritance:
|
||||
```
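
For orientation, a small usage sketch of the base `Distribution` interface:

```python
import torch
from torch.distributions import Normal

dist = Normal(loc=torch.tensor(0.0), scale=torch.tensor(1.0))

x = dist.sample((5,))     # 5 draws, shape (5,)
logp = dist.log_prob(x)   # log density at each draw
y = dist.rsample((5,))    # reparameterized samples, differentiable w.r.t. loc/scale
print(dist.mean, dist.stddev)
```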
|
||||
|
||||
## {hidden}`ExponentialFamily`
|
||||
|
||||
```{eval-rst}
|
||||
.. currentmodule:: torch.distributions.exp_family
|
||||
```
|
||||
|
||||
```{eval-rst}
|
||||
.. autoclass:: ExponentialFamily
|
||||
:members:
|
||||
:show-inheritance:
|
||||
```
|
||||
|
||||
## {hidden}`Bernoulli`
|
||||
|
||||
```{eval-rst}
|
||||
.. currentmodule:: torch.distributions.bernoulli
|
||||
```
|
||||
|
||||
```{eval-rst}
|
||||
.. autoclass:: Bernoulli
|
||||
:members:
|
||||
:undoc-members:
|
||||
:show-inheritance:
|
||||
```
|
||||
|
||||
## {hidden}`Beta`
|
||||
|
||||
```{eval-rst}
|
||||
.. currentmodule:: torch.distributions.beta
|
||||
```
|
||||
|
||||
```{eval-rst}
|
||||
.. autoclass:: Beta
|
||||
:members:
|
||||
:undoc-members:
|
||||
:show-inheritance:
|
||||
```
|
||||
|
||||
## {hidden}`Binomial`
|
||||
|
||||
```{eval-rst}
|
||||
.. currentmodule:: torch.distributions.binomial
|
||||
```
|
||||
|
||||
```{eval-rst}
|
||||
.. autoclass:: Binomial
|
||||
:members:
|
||||
:undoc-members:
|
||||
:show-inheritance:
|
||||
```
|
||||
|
||||
## {hidden}`Categorical`
|
||||
|
||||
```{eval-rst}
|
||||
.. currentmodule:: torch.distributions.categorical
|
||||
```
|
||||
|
||||
```{eval-rst}
|
||||
.. autoclass:: Categorical
|
||||
:members:
|
||||
:undoc-members:
|
||||
:show-inheritance:
|
||||
```
|
||||
|
||||
## {hidden}`Cauchy`
|
||||
|
||||
```{eval-rst}
|
||||
.. currentmodule:: torch.distributions.cauchy
|
||||
```
|
||||
|
||||
```{eval-rst}
|
||||
.. autoclass:: Cauchy
|
||||
:members:
|
||||
:undoc-members:
|
||||
:show-inheritance:
|
||||
```
|
||||
|
||||
## {hidden}`Chi2`
|
||||
|
||||
```{eval-rst}
|
||||
.. currentmodule:: torch.distributions.chi2
|
||||
```
|
||||
|
||||
```{eval-rst}
|
||||
.. autoclass:: Chi2
|
||||
:members:
|
||||
:undoc-members:
|
||||
:show-inheritance:
|
||||
```
|
||||
|
||||
## {hidden}`ContinuousBernoulli`
|
||||
|
||||
```{eval-rst}
|
||||
.. currentmodule:: torch.distributions.continuous_bernoulli
|
||||
```
|
||||
|
||||
```{eval-rst}
|
||||
.. autoclass:: ContinuousBernoulli
|
||||
:members:
|
||||
:undoc-members:
|
||||
:show-inheritance:
|
||||
```
|
||||
|
||||
## {hidden}`Dirichlet`
|
||||
|
||||
```{eval-rst}
|
||||
.. currentmodule:: torch.distributions.dirichlet
|
||||
```
|
||||
|
||||
```{eval-rst}
|
||||
.. autoclass:: Dirichlet
|
||||
:members:
|
||||
:undoc-members:
|
||||
:show-inheritance:
|
||||
```
|
||||
|
||||
## {hidden}`Exponential`
|
||||
|
||||
```{eval-rst}
|
||||
.. currentmodule:: torch.distributions.exponential
|
||||
```
|
||||
|
||||
```{eval-rst}
|
||||
.. autoclass:: Exponential
|
||||
:members:
|
||||
:undoc-members:
|
||||
:show-inheritance:
|
||||
```
|
||||
|
||||
## {hidden}`FisherSnedecor`
|
||||
|
||||
```{eval-rst}
|
||||
.. currentmodule:: torch.distributions.fishersnedecor
|
||||
```
|
||||
|
||||
```{eval-rst}
|
||||
.. autoclass:: FisherSnedecor
|
||||
:members:
|
||||
:undoc-members:
|
||||
:show-inheritance:
|
||||
```
|
||||
|
||||
## {hidden}`Gamma`
|
||||
|
||||
```{eval-rst}
|
||||
.. currentmodule:: torch.distributions.gamma
|
||||
```
|
||||
|
||||
```{eval-rst}
|
||||
.. autoclass:: Gamma
|
||||
:members:
|
||||
:undoc-members:
|
||||
:show-inheritance:
|
||||
```
|
||||
|
||||
## {hidden}`GeneralizedPareto`
|
||||
|
||||
```{eval-rst}
|
||||
.. currentmodule:: torch.distributions.generalized_pareto
|
||||
```
|
||||
|
||||
```{eval-rst}
|
||||
.. autoclass:: GeneralizedPareto
|
||||
:members:
|
||||
:undoc-members:
|
||||
:show-inheritance:
|
||||
```
|
||||
|
||||
## {hidden}`Geometric`
|
||||
|
||||
```{eval-rst}
|
||||
.. currentmodule:: torch.distributions.geometric
|
||||
```
|
||||
|
||||
```{eval-rst}
|
||||
.. autoclass:: Geometric
|
||||
:members:
|
||||
:undoc-members:
|
||||
:show-inheritance:
|
||||
```
|
||||
|
||||
## {hidden}`Gumbel`
|
||||
|
||||
```{eval-rst}
|
||||
.. currentmodule:: torch.distributions.gumbel
|
||||
```
|
||||
|
||||
```{eval-rst}
|
||||
.. autoclass:: Gumbel
|
||||
:members:
|
||||
:undoc-members:
|
||||
:show-inheritance:
|
||||
```
|
||||
|
||||
## {hidden}`HalfCauchy`
|
||||
|
||||
```{eval-rst}
|
||||
.. currentmodule:: torch.distributions.half_cauchy
|
||||
```
|
||||
|
||||
```{eval-rst}
|
||||
.. autoclass:: HalfCauchy
|
||||
:members:
|
||||
:undoc-members:
|
||||
:show-inheritance:
|
||||
```
|
||||
|
||||
## {hidden}`HalfNormal`
|
||||
|
||||
```{eval-rst}
|
||||
.. currentmodule:: torch.distributions.half_normal
|
||||
```
|
||||
|
||||
```{eval-rst}
|
||||
.. autoclass:: HalfNormal
|
||||
:members:
|
||||
:undoc-members:
|
||||
:show-inheritance:
|
||||
```
|
||||
|
||||
## {hidden}`Independent`
|
||||
|
||||
```{eval-rst}
|
||||
.. currentmodule:: torch.distributions.independent
|
||||
```
|
||||
|
||||
```{eval-rst}
|
||||
.. autoclass:: Independent
|
||||
:members:
|
||||
:undoc-members:
|
||||
:show-inheritance:
|
||||
```
|
||||
|
||||
## {hidden}`InverseGamma`
|
||||
|
||||
```{eval-rst}
|
||||
.. currentmodule:: torch.distributions.inverse_gamma
|
||||
```
|
||||
|
||||
```{eval-rst}
|
||||
.. autoclass:: InverseGamma
|
||||
:members:
|
||||
:undoc-members:
|
||||
:show-inheritance:
|
||||
```
|
||||
|
||||
## {hidden}`Kumaraswamy`
|
||||
|
||||
```{eval-rst}
|
||||
.. currentmodule:: torch.distributions.kumaraswamy
|
||||
```
|
||||
|
||||
```{eval-rst}
|
||||
.. autoclass:: Kumaraswamy
|
||||
:members:
|
||||
:undoc-members:
|
||||
:show-inheritance:
|
||||
```
|
||||
|
||||
## {hidden}`LKJCholesky`
|
||||
|
||||
```{eval-rst}
|
||||
.. currentmodule:: torch.distributions.lkj_cholesky
|
||||
```
|
||||
|
||||
```{eval-rst}
|
||||
.. autoclass:: LKJCholesky
|
||||
:members:
|
||||
:undoc-members:
|
||||
:show-inheritance:
|
||||
```
|
||||
|
||||
## {hidden}`Laplace`
|
||||
|
||||
```{eval-rst}
|
||||
.. currentmodule:: torch.distributions.laplace
|
||||
```
|
||||
|
||||
```{eval-rst}
|
||||
.. autoclass:: Laplace
|
||||
:members:
|
||||
:undoc-members:
|
||||
:show-inheritance:
|
||||
```
|
||||
|
||||
## {hidden}`LogNormal`
|
||||
|
||||
```{eval-rst}
|
||||
.. currentmodule:: torch.distributions.log_normal
|
||||
```
|
||||
|
||||
```{eval-rst}
|
||||
.. autoclass:: LogNormal
|
||||
:members:
|
||||
:undoc-members:
|
||||
:show-inheritance:
|
||||
```
|
||||
|
||||
## {hidden}`LowRankMultivariateNormal`
|
||||
|
||||
```{eval-rst}
|
||||
.. currentmodule:: torch.distributions.lowrank_multivariate_normal
|
||||
```
|
||||
|
||||
```{eval-rst}
|
||||
.. autoclass:: LowRankMultivariateNormal
|
||||
:members:
|
||||
:undoc-members:
|
||||
:show-inheritance:
|
||||
```
|
||||
|
||||
## {hidden}`MixtureSameFamily`
|
||||
|
||||
```{eval-rst}
|
||||
.. currentmodule:: torch.distributions.mixture_same_family
|
||||
```
|
||||
|
||||
```{eval-rst}
|
||||
.. autoclass:: MixtureSameFamily
|
||||
:members:
|
||||
:undoc-members:
|
||||
:show-inheritance:
|
||||
```
|
||||
|
||||
## {hidden}`Multinomial`
|
||||
|
||||
```{eval-rst}
|
||||
.. currentmodule:: torch.distributions.multinomial
|
||||
```
|
||||
|
||||
```{eval-rst}
|
||||
.. autoclass:: Multinomial
|
||||
:members:
|
||||
:undoc-members:
|
||||
:show-inheritance:
|
||||
```
|
||||
|
||||
## {hidden}`MultivariateNormal`
|
||||
|
||||
```{eval-rst}
|
||||
.. currentmodule:: torch.distributions.multivariate_normal
|
||||
```
|
||||
|
||||
```{eval-rst}
|
||||
.. autoclass:: MultivariateNormal
|
||||
:members:
|
||||
:undoc-members:
|
||||
:show-inheritance:
|
||||
```
|
||||
|
||||
## {hidden}`NegativeBinomial`
|
||||
|
||||
```{eval-rst}
|
||||
.. currentmodule:: torch.distributions.negative_binomial
|
||||
```
|
||||
|
||||
```{eval-rst}
|
||||
.. autoclass:: NegativeBinomial
|
||||
:members:
|
||||
:undoc-members:
|
||||
:show-inheritance:
|
||||
```
|
||||
|
||||
## {hidden}`Normal`
|
||||
|
||||
```{eval-rst}
|
||||
.. currentmodule:: torch.distributions.normal
|
||||
```
|
||||
|
||||
```{eval-rst}
|
||||
.. autoclass:: Normal
|
||||
:members:
|
||||
:undoc-members:
|
||||
:show-inheritance:
|
||||
```
|
||||
|
||||
## {hidden}`OneHotCategorical`
|
||||
|
||||
```{eval-rst}
|
||||
.. currentmodule:: torch.distributions.one_hot_categorical
|
||||
```
|
||||
|
||||
```{eval-rst}
|
||||
.. autoclass:: OneHotCategorical
|
||||
:members:
|
||||
:undoc-members:
|
||||
:show-inheritance:
|
||||
```
|
||||
|
||||
## {hidden}`Pareto`
|
||||
|
||||
```{eval-rst}
|
||||
.. currentmodule:: torch.distributions.pareto
|
||||
```
|
||||
|
||||
```{eval-rst}
|
||||
.. autoclass:: Pareto
|
||||
:members:
|
||||
:undoc-members:
|
||||
:show-inheritance:
|
||||
```
|
||||
|
||||
## {hidden}`Poisson`
|
||||
|
||||
```{eval-rst}
|
||||
.. currentmodule:: torch.distributions.poisson
|
||||
```
|
||||
|
||||
```{eval-rst}
|
||||
.. autoclass:: Poisson
|
||||
:members:
|
||||
:undoc-members:
|
||||
:show-inheritance:
|
||||
```
|
||||
|
||||
## {hidden}`RelaxedBernoulli`
|
||||
|
||||
```{eval-rst}
|
||||
.. currentmodule:: torch.distributions.relaxed_bernoulli
|
||||
```
|
||||
|
||||
```{eval-rst}
|
||||
.. autoclass:: RelaxedBernoulli
|
||||
:members:
|
||||
:undoc-members:
|
||||
:show-inheritance:
|
||||
```
|
||||
|
||||
## {hidden}`LogitRelaxedBernoulli`
|
||||
|
||||
```{eval-rst}
|
||||
.. currentmodule:: torch.distributions.relaxed_bernoulli
|
||||
```
|
||||
|
||||
```{eval-rst}
|
||||
.. autoclass:: LogitRelaxedBernoulli
|
||||
:members:
|
||||
:undoc-members:
|
||||
:show-inheritance:
|
||||
```
|
||||
|
||||
## {hidden}`RelaxedOneHotCategorical`
|
||||
|
||||
```{eval-rst}
|
||||
.. currentmodule:: torch.distributions.relaxed_categorical
|
||||
```
|
||||
|
||||
```{eval-rst}
|
||||
.. autoclass:: RelaxedOneHotCategorical
|
||||
:members:
|
||||
:undoc-members:
|
||||
:show-inheritance:
|
||||
```
|
||||
|
||||
## {hidden}`StudentT`
|
||||
|
||||
```{eval-rst}
|
||||
.. currentmodule:: torch.distributions.studentT
|
||||
```
|
||||
|
||||
```{eval-rst}
|
||||
.. autoclass:: StudentT
|
||||
:members:
|
||||
:undoc-members:
|
||||
:show-inheritance:
|
||||
```
|
||||
|
||||
## {hidden}`TransformedDistribution`
|
||||
|
||||
```{eval-rst}
|
||||
.. currentmodule:: torch.distributions.transformed_distribution
|
||||
```
|
||||
|
||||
```{eval-rst}
|
||||
.. autoclass:: TransformedDistribution
|
||||
:members:
|
||||
:undoc-members:
|
||||
:show-inheritance:
|
||||
```
|
||||
|
||||
## {hidden}`Uniform`
|
||||
|
||||
```{eval-rst}
|
||||
.. currentmodule:: torch.distributions.uniform
|
||||
```
|
||||
|
||||
```{eval-rst}
|
||||
.. autoclass:: Uniform
|
||||
:members:
|
||||
:undoc-members:
|
||||
:show-inheritance:
|
||||
```
|
||||
|
||||
## {hidden}`VonMises`
|
||||
|
||||
```{eval-rst}
|
||||
.. currentmodule:: torch.distributions.von_mises
|
||||
```
|
||||
|
||||
```{eval-rst}
|
||||
.. autoclass:: VonMises
|
||||
:members:
|
||||
:undoc-members:
|
||||
:show-inheritance:
|
||||
```
|
||||
|
||||
## {hidden}`Weibull`
|
||||
|
||||
```{eval-rst}
|
||||
.. currentmodule:: torch.distributions.weibull
|
||||
```
|
||||
|
||||
```{eval-rst}
|
||||
.. autoclass:: Weibull
|
||||
:members:
|
||||
:undoc-members:
|
||||
:show-inheritance:
|
||||
```
|
||||
|
||||
## {hidden}`Wishart`
|
||||
|
||||
```{eval-rst}
|
||||
.. currentmodule:: torch.distributions.wishart
|
||||
```
|
||||
|
||||
```{eval-rst}
|
||||
.. autoclass:: Wishart
|
||||
:members:
|
||||
:undoc-members:
|
||||
:show-inheritance:
|
||||
```
|
||||
|
||||
## `KL Divergence`
|
||||
|
||||
```{eval-rst}
|
||||
.. automodule:: torch.distributions.kl
|
||||
```
|
||||
|
||||
```{eval-rst}
|
||||
.. currentmodule:: torch.distributions.kl
|
||||
```
|
||||
|
||||
```{eval-rst}
|
||||
.. autofunction:: kl_divergence
|
||||
```
|
||||
|
||||
```{eval-rst}
|
||||
.. autofunction:: register_kl
|
||||
```
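
A short sketch of both functions; `MyDist` is a hypothetical user-defined distribution, shown only to indicate where `register_kl` fits:

```python
import torch
from torch.distributions import Normal, kl_divergence, register_kl

p = Normal(torch.zeros(3), torch.ones(3))
q = Normal(torch.ones(3), 2.0 * torch.ones(3))
print(kl_divergence(p, q))  # elementwise KL(p || q), shape (3,)

# register_kl registers an analytic KL rule for a new pair of distribution types:
# @register_kl(MyDist, Normal)
# def _kl_mydist_normal(p, q):
#     return ...
```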
|
||||
|
||||
## `Transforms`
|
||||
|
||||
```{eval-rst}
|
||||
.. automodule:: torch.distributions.transforms
|
||||
:members:
|
||||
:member-order: bysource
|
||||
```
|
||||
|
||||
## `Constraints`
|
||||
|
||||
```{eval-rst}
|
||||
.. automodule:: torch.distributions.constraints
|
||||
:members:
|
||||
:member-order: bysource
|
||||
```
|
||||
|
||||
## `Constraint Registry`
|
||||
|
||||
```{eval-rst}
|
||||
.. automodule:: torch.distributions.constraint_registry
|
||||
:members:
|
||||
:member-order: bysource
|
||||
```
|
||||
|
||||
% This module needs to be documented. Adding here in the meantime
|
||||
|
||||
% for tracking purposes
|
||||
|
||||
```{eval-rst}
|
||||
.. py:module:: torch.distributions.bernoulli
|
||||
|
||||
.. py:module:: torch.distributions.beta
|
||||
|
||||
.. py:module:: torch.distributions.binomial
|
||||
|
||||
.. py:module:: torch.distributions.categorical
|
||||
|
||||
.. py:module:: torch.distributions.cauchy
|
||||
|
||||
.. py:module:: torch.distributions.chi2
|
||||
|
||||
.. py:module:: torch.distributions.continuous_bernoulli
|
||||
|
||||
.. py:module:: torch.distributions.dirichlet
|
||||
|
||||
.. py:module:: torch.distributions.distribution
|
||||
|
||||
.. py:module:: torch.distributions.exp_family
|
||||
|
||||
.. py:module:: torch.distributions.exponential
|
||||
|
||||
.. py:module:: torch.distributions.fishersnedecor
|
||||
|
||||
.. py:module:: torch.distributions.gamma
|
||||
|
||||
.. py:module:: torch.distributions.generalized_pareto
|
||||
|
||||
.. py:module:: torch.distributions.geometric
|
||||
|
||||
.. py:module:: torch.distributions.gumbel
|
||||
|
||||
.. py:module:: torch.distributions.half_cauchy
|
||||
|
||||
.. py:module:: torch.distributions.half_normal
|
||||
|
||||
.. py:module:: torch.distributions.independent
|
||||
|
||||
.. py:module:: torch.distributions.inverse_gamma
|
||||
|
||||
.. py:module:: torch.distributions.kumaraswamy
|
||||
|
||||
.. py:module:: torch.distributions.laplace
|
||||
|
||||
.. py:module:: torch.distributions.lkj_cholesky
|
||||
|
||||
.. py:module:: torch.distributions.log_normal
|
||||
|
||||
.. py:module:: torch.distributions.logistic_normal
|
||||
|
||||
.. py:module:: torch.distributions.lowrank_multivariate_normal
|
||||
|
||||
.. py:module:: torch.distributions.mixture_same_family
|
||||
|
||||
.. py:module:: torch.distributions.multinomial
|
||||
|
||||
.. py:module:: torch.distributions.multivariate_normal
|
||||
|
||||
.. py:module:: torch.distributions.negative_binomial
|
||||
|
||||
.. py:module:: torch.distributions.normal
|
||||
|
||||
.. py:module:: torch.distributions.one_hot_categorical
|
||||
|
||||
.. py:module:: torch.distributions.pareto
|
||||
|
||||
.. py:module:: torch.distributions.poisson
|
||||
|
||||
.. py:module:: torch.distributions.relaxed_bernoulli
|
||||
|
||||
.. py:module:: torch.distributions.relaxed_categorical
|
||||
|
||||
.. py:module:: torch.distributions.studentT
|
||||
|
||||
.. py:module:: torch.distributions.transformed_distribution
|
||||
|
||||
.. py:module:: torch.distributions.uniform
|
||||
|
||||
.. py:module:: torch.distributions.utils
|
||||
|
||||
.. py:module:: torch.distributions.von_mises
|
||||
|
||||
.. py:module:: torch.distributions.weibull
|
||||
|
||||
.. py:module:: torch.distributions.wishart
|
||||
```
|
@ -1,7 +1,13 @@ docs/source/dlpack.md (converted from dlpack.rst)

# torch.utils.dlpack

```{eval-rst}
.. currentmodule:: torch.utils.dlpack
```

```{eval-rst}
.. autofunction:: from_dlpack
```

```{eval-rst}
.. autofunction:: to_dlpack
```