Fixes #155027

Converted RST files to Markdown.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/155252
Approved by: https://github.com/svekars
Co-authored-by: Svetlana Karslioglu <svekars@meta.com>

commit d41f62b7a0 (parent 3d82a1dfb5), committed by PyTorch MergeBot
@@ -1,8 +1,14 @@
-torch.mps
-===================================
-
-.. automodule:: torch.mps
-.. currentmodule:: torch.mps
+# torch.mps
+
+```{eval-rst}
+.. automodule:: torch.mps
+```
+
+```{eval-rst}
+.. currentmodule:: torch.mps
+```
+
+```{eval-rst}
 .. autosummary::
     :toctree: generated
    :nosignatures:
@@ -19,9 +25,11 @@ torch.mps
     driver_allocated_memory
     recommended_max_memory
     compile_shader
+```

-MPS Profiler
-------------
+## MPS Profiler
+
+```{eval-rst}
 .. autosummary::
     :toctree: generated
     :nosignatures:
@@ -33,17 +41,27 @@ MPS Profiler
     profiler.is_capturing_metal
     profiler.is_metal_capture_enabled
     profiler.metal_capture
+```

-MPS Event
-------------
+## MPS Event
+
+```{eval-rst}
 .. autosummary::
     :toctree: generated
     :nosignatures:

     event.Event

+```
+
-.. This module needs to be documented. Adding here in the meantime
-.. for tracking purposes
+% This module needs to be documented. Adding here in the meantime
+
+% for tracking purposes
+
+```{eval-rst}
+.. py:module:: torch.mps.event
+```
+
+```{eval-rst}
+.. py:module:: torch.mps.profiler
+```
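For context, a minimal sketch of how the `torch.mps` APIs listed above fit together. This assumes a macOS build where the MPS backend is available; the workload is arbitrary and only illustrative:

```python
import torch

if torch.backends.mps.is_available():
    x = torch.randn(1024, 1024, device="mps")
    y = x @ x

    # Block the host until all queued MPS work has finished.
    torch.mps.synchronize()

    # Bytes of driver memory currently allocated for this process.
    print(torch.mps.driver_allocated_memory())

    # Time a region of GPU work with MPS events.
    start = torch.mps.event.Event(enable_timing=True)
    end = torch.mps.event.Event(enable_timing=True)
    start.record()
    z = y @ y
    end.record()
    torch.mps.synchronize()
    print(start.elapsed_time(end))  # elapsed time in milliseconds
```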
@@ -1,11 +1,16 @@
-torch.mtia
-===================================
+# torch.mtia

 The MTIA backend is implemented out of tree; only the interfaces are defined here.

+```{eval-rst}
 .. automodule:: torch.mtia
 .. currentmodule:: torch.mtia
+```
+
+```{eval-rst}
+.. currentmodule:: torch.mtia
+```
+
+```{eval-rst}
 .. autosummary::
     :toctree: generated
     :nosignatures:
@@ -32,12 +37,15 @@ The MTIA backend is implemented out of tree; only the interfaces are defined here.
     set_rng_state
     get_rng_state
     DeferredMtiaCallError
+```

-Streams and events
-------------------
+## Streams and events
+
+```{eval-rst}
 .. autosummary::
     :toctree: generated
     :nosignatures:

     Event
     Stream
+```
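Since the MTIA backend ships out of tree, the snippet below is only a hedged sketch of the in-tree interface, which mirrors the `torch.cuda`-style accelerator API; it does real work only on a build with an MTIA device attached:

```python
import torch

if torch.mtia.is_available():
    print(torch.mtia.device_count())

    x = torch.ones(4, device="mtia")
    s = torch.mtia.Stream()
    with torch.mtia.stream(s):
        y = x * 2  # enqueued on stream `s`

    torch.mtia.synchronize()  # wait for all queued MTIA work
    print(y)
```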
@@ -1,13 +1,19 @@
-torch.mtia.memory
-===================================
+# torch.mtia.memory

 The MTIA backend is implemented out of tree; only the interfaces are defined here.

+```{eval-rst}
 .. automodule:: torch.mtia.memory
 .. currentmodule:: torch.mtia.memory
+```
+
+```{eval-rst}
+.. currentmodule:: torch.mtia.memory
+```
+
+```{eval-rst}
 .. autosummary::
     :toctree: generated
     :nosignatures:

     memory_stats
+```
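A hedged example of the one helper documented here; the device argument and the shape of the returned mapping are assumptions, by analogy with `torch.cuda.memory_stats`:

```python
import torch

if torch.mtia.is_available():
    stats = torch.mtia.memory.memory_stats(torch.mtia.current_device())
    print(stats)  # allocator counters for the current MTIA device
```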
@@ -1,136 +1,139 @@
-:orphan:
+---
+orphan: true
+---

-.. _multiprocessing-doc:
+(multiprocessing-doc)=

-Multiprocessing package - torch.multiprocessing
-===============================================
+# Multiprocessing package - torch.multiprocessing

+```{eval-rst}
 .. automodule:: torch.multiprocessing
+```
+
+```{eval-rst}
+.. currentmodule:: torch.multiprocessing
+```

-.. warning::
-
-    If the main process exits abruptly (e.g. because of an incoming signal),
-    Python's ``multiprocessing`` sometimes fails to clean up its children.
-    It's a known caveat, so if you're seeing any resource leaks after
-    interrupting the interpreter, it probably means that this has just happened
-    to you.
+:::{warning}
+If the main process exits abruptly (e.g. because of an incoming signal),
+Python's `multiprocessing` sometimes fails to clean up its children.
+It's a known caveat, so if you're seeing any resource leaks after
+interrupting the interpreter, it probably means that this has just happened
+to you.
+:::

-Strategy management
--------------------
+## Strategy management

+```{eval-rst}
 .. autofunction:: get_all_sharing_strategies
+```

+```{eval-rst}
 .. autofunction:: get_sharing_strategy
+```

+```{eval-rst}
 .. autofunction:: set_sharing_strategy
+```
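The three strategy helpers compose as below (a sketch; the available strategies shown are typical of a Linux build):

```python
import torch.multiprocessing as mp

print(mp.get_all_sharing_strategies())  # e.g. {'file_descriptor', 'file_system'}
print(mp.get_sharing_strategy())        # strategy currently in effect

# Opt into the fallback strategy described under "Sharing strategies" below.
mp.set_sharing_strategy("file_system")
```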
-.. _multiprocessing-cuda-sharing-details:
+(multiprocessing-cuda-sharing-details)=

-Sharing CUDA tensors
---------------------
+## Sharing CUDA tensors

 Sharing CUDA tensors between processes is supported only in Python 3, using
-a ``spawn`` or ``forkserver`` start methods.
+the `spawn` or `forkserver` start methods.

 Unlike CPU tensors, the sending process is required to keep the original tensor
 as long as the receiving process retains a copy of the tensor. The refcounting is
 implemented under the hood but requires users to follow the best practices below.

-.. warning::
-    If the consumer process dies abnormally due to a fatal signal, the shared tensor
-    could be kept in memory forever as long as the sending process is running.
+:::{warning}
+If the consumer process dies abnormally due to a fatal signal, the shared tensor
+could be kept in memory forever as long as the sending process is running.
+:::

 1. Release memory ASAP in the consumer.

-   ::
-
-       ## Good
-       x = queue.get()
-       # do something with x
-       del x
+   ```
+   ## Good
+   x = queue.get()
+   # do something with x
+   del x
+   ```

-   ::
-
-       ## Bad
-       x = queue.get()
-       # do something with x
-       # do everything else (the producer has to keep x in memory)
+   ```
+   ## Bad
+   x = queue.get()
+   # do something with x
+   # do everything else (the producer has to keep x in memory)
+   ```
 2. Keep the producer process running until all consumers exit. This prevents
    the situation in which the producer process releases memory that is still
    in use by the consumer.

-   ::
-
-       ## producer
-       # send tensors, do something
-       event.wait()
+   ```
+   ## producer
+   # send tensors, do something
+   event.wait()
+   ```

-   ::
-
-       ## consumer
-       # receive tensors and use them
-       event.set()
+   ```
+   ## consumer
+   # receive tensors and use them
+   event.set()
+   ```
 3. Don't pass received tensors.

-   ::
-
-       # not going to work
-       x = queue.get()
-       queue_2.put(x)
+   ```
+   # not going to work
+   x = queue.get()
+   queue_2.put(x)
+   ```

-   ::
-
-       # you need to create a process-local copy
-       x = queue.get()
-       x_clone = x.clone()
-       queue_2.put(x_clone)
+   ```
+   # you need to create a process-local copy
+   x = queue.get()
+   x_clone = x.clone()
+   queue_2.put(x_clone)
+   ```

-   ::
-
-       # putting and getting from the same queue in the same process will likely segfault
-       queue.put(tensor)
-       x = queue.get()
+   ```
+   # putting and getting from the same queue in the same process will likely segfault
+   queue.put(tensor)
+   x = queue.get()
+   ```
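Put together, a minimal end-to-end sketch that respects all three practices (the `consumer` function and tensor shape are hypothetical; requires a CUDA build):

```python
import torch
import torch.multiprocessing as mp

def consumer(queue, done):
    x = queue.get()   # shared CUDA tensor; the producer must keep its copy alive
    y = x.clone()     # process-local copy, safe to pass along further (practice 3)
    del x             # release the shared storage as soon as possible (practice 1)
    done.set()        # tell the producer it may now exit (practice 2)

if __name__ == "__main__":
    ctx = mp.get_context("spawn")  # CUDA sharing needs spawn or forkserver
    queue, done = ctx.SimpleQueue(), ctx.Event()
    p = ctx.Process(target=consumer, args=(queue, done))
    p.start()
    queue.put(torch.ones(2, 2, device="cuda"))
    done.wait()  # keep the producer alive until the consumer has finished
    p.join()
```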
-Sharing strategies
-------------------
+## Sharing strategies

 This section provides a brief overview of how the different sharing strategies
 work. Note that it applies only to CPU tensors - CUDA tensors will always use
 the CUDA API, as that's the only way they can be shared.

-File descriptor - ``file_descriptor``
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+### File descriptor - `file_descriptor`

-.. note::
-
-    This is the default strategy (except for macOS, where it's not
-    supported).
+:::{note}
+This is the default strategy (except for macOS, where it's not
+supported).
+:::

 This strategy will use file descriptors as shared memory handles. Whenever a
-storage is moved to shared memory, a file descriptor obtained from ``shm_open``
+storage is moved to shared memory, a file descriptor obtained from `shm_open`
 is cached with the object, and when it's going to be sent to other processes,
 the file descriptor will be transferred (e.g. via UNIX sockets) to it. The
-receiver will also cache the file descriptor and ``mmap`` it, to obtain a shared
+receiver will also cache the file descriptor and `mmap` it, to obtain a shared
 view onto the storage data.

 Note that if a lot of tensors are shared, this strategy will keep a
 large number of file descriptors open most of the time. If your system has low
 limits for the number of open file descriptors, and you can't raise them, you
-should use the ``file_system`` strategy.
+should use the `file_system` strategy.

-File system - ``file_system``
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+### File system - `file_system`

-This strategy will use file names given to ``shm_open`` to identify the shared
+This strategy will use file names given to `shm_open` to identify the shared
 memory regions. This has the benefit of not requiring the implementation to cache
 the file descriptors obtained from it, but at the same time it is prone to shared
 memory leaks. The file can't be deleted right after its creation, because other
@@ -139,28 +142,27 @@ crash, or are killed, and don't call the storage destructors, the files will
 remain in the system. This is very serious, because they keep using up the
 memory until the system is restarted, or they're freed manually.

-To counter the problem of shared memory file leaks, :mod:`torch.multiprocessing`
-will spawn a daemon named ``torch_shm_manager`` that will isolate itself from
+To counter the problem of shared memory file leaks, {mod}`torch.multiprocessing`
+will spawn a daemon named `torch_shm_manager` that will isolate itself from
 the current process group, and will keep track of all shared memory allocations.
 Once all processes connected to it exit, it will wait a moment to ensure there
 will be no new connections, and will iterate over all shared memory files
 allocated by the group. If it finds that any of them still exist, they will be
 deallocated. We've tested this method and it proved to be robust to various
-failures. Still, if your system has high enough limits, and ``file_descriptor``
+failures. Still, if your system has high enough limits, and `file_descriptor`
 is a supported strategy, we do not recommend switching to this one.
-Spawning subprocesses
----------------------
+## Spawning subprocesses

-.. note::
-
-    Available for Python >= 3.4.
-
-    This depends on the ``spawn`` start method in Python's
-    ``multiprocessing`` package.
+:::{note}
+Available for Python >= 3.4.
+
+This depends on the `spawn` start method in Python's
+`multiprocessing` package.
+:::

 Spawning a number of subprocesses to perform some function can be done
-by creating ``Process`` instances and calling ``join`` to wait for
+by creating `Process` instances and calling `join` to wait for
 their completion. This approach works fine when dealing with a single
 subprocess but presents potential issues when dealing with multiple
 processes.
@@ -170,27 +172,48 @@ sequentially. If they don't, and the first process does not terminate,
 the process termination will go unnoticed. Also, there are no native
 facilities for error propagation.

-The ``spawn`` function below addresses these concerns and takes care
+The `spawn` function below addresses these concerns and takes care
 of error propagation, out of order termination, and will actively
 terminate processes upon detecting an error in one of them.

+```{eval-rst}
 .. automodule:: torch.multiprocessing.spawn
+```
+
+```{eval-rst}
+.. currentmodule:: torch.multiprocessing.spawn
+```
+
+```{eval-rst}
 .. autofunction:: spawn
+```
+
+```{eval-rst}
 .. currentmodule:: torch.multiprocessing
+```
+
+```{eval-rst}
 .. class:: SpawnContext

    Returned by :func:`~spawn` when called with ``join=False``.

    .. automethod:: join
+```

-.. This module needs to be documented. Adding here in the meantime
-.. for tracking purposes
+% This module needs to be documented. Adding here in the meantime
+
+% for tracking purposes
+
+```{eval-rst}
+.. py:module:: torch.multiprocessing.pool
+```
+
+```{eval-rst}
+.. py:module:: torch.multiprocessing.queue
+```
+
+```{eval-rst}
+.. py:module:: torch.multiprocessing.reductions
+```
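A minimal usage sketch for `spawn` (the worker body and process count are arbitrary):

```python
import torch.multiprocessing as mp

def worker(rank, nprocs):
    # Each spawned process receives its rank as the implicit first argument.
    print(f"worker {rank} of {nprocs} started")

if __name__ == "__main__":
    # Starts 4 processes, waits for all of them, and re-raises the first
    # failure (terminating the rest) if any worker hits an error.
    mp.spawn(worker, args=(4,), nprocs=4, join=True)
```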
@@ -1,11 +1,12 @@
+```{eval-rst}
 .. currentmodule:: torch
+```

-.. _name_inference_reference-doc:
+(name_inference_reference-doc)=

-Named Tensors operator coverage
-===============================
+# Named Tensors operator coverage

-Please read :ref:`named_tensors-doc` first for an introduction to named tensors.
+Please read {ref}`named_tensors-doc` first for an introduction to named tensors.

 This document is a reference for *name inference*, a process that defines how
 named tensors:
@@ -17,11 +18,13 @@ Below is a list of all operations that are supported with named tensors
 and their associated name inference rules.

 If you don't see an operation listed here, but it would help your use case, please
-`search if an issue has already been filed <https://github.com/pytorch/pytorch/issues?q=is%3Aopen+is%3Aissue+label%3A%22module%3A+named+tensor%22>`_ and if not, `file one <https://github.com/pytorch/pytorch/issues/new/choose>`_.
+[search if an issue has already been filed](https://github.com/pytorch/pytorch/issues?q=is%3Aopen+is%3Aissue+label%3A%22module%3A+named+tensor%22) and if not, [file one](https://github.com/pytorch/pytorch/issues/new/choose).

-.. warning::
-    The named tensor API is experimental and subject to change.
+:::{warning}
+The named tensor API is experimental and subject to change.
+:::

+```{eval-rst}
 .. csv-table:: Supported Operations
    :header: API, Name inference rule
    :widths: 20, 20
@@ -244,226 +247,221 @@ If you don't see an operation listed here, but it would help your use case, plea
    :meth:`Tensor.zero_`,None
    :func:`torch.zeros`,:ref:`factory-doc`

+```

-.. _keeps_input_names-doc:
+(keeps_input_names-doc)=

-Keeps input names
-^^^^^^^^^^^^^^^^^
+## Keeps input names

 All pointwise unary functions follow this rule as well as some other unary functions.

 - Check names: None
 - Propagate names: input tensor's names are propagated to the output.

-::
-
-    >>> x = torch.randn(3, 3, names=('N', 'C'))
-    >>> x.abs().names
-    ('N', 'C')
+```
+>>> x = torch.randn(3, 3, names=('N', 'C'))
+>>> x.abs().names
+('N', 'C')
+```
-.. _removes_dimensions-doc:
+(removes_dimensions-doc)=

-Removes dimensions
-^^^^^^^^^^^^^^^^^^
+## Removes dimensions

-All reduction ops like :meth:`~Tensor.sum` remove dimensions by reducing
-over the desired dimensions. Other operations like :meth:`~Tensor.select` and
-:meth:`~Tensor.squeeze` remove dimensions.
+All reduction ops like {meth}`~Tensor.sum` remove dimensions by reducing
+over the desired dimensions. Other operations like {meth}`~Tensor.select` and
+{meth}`~Tensor.squeeze` remove dimensions.

 Wherever one can pass an integer dimension index to an operator, one can also pass
 a dimension name. Functions that take lists of dimension indices can also take in a
 list of dimension names.

-- Check names: If :attr:`dim` or :attr:`dims` is passed in as a list of names,
-  check that those names exist in :attr:`self`.
-- Propagate names: If the dimensions of the input tensor specified by :attr:`dim`
-  or :attr:`dims` are not present in the output tensor, then the corresponding names
-  of those dimensions do not appear in ``output.names``.
+- Check names: If {attr}`dim` or {attr}`dims` is passed in as a list of names,
+  check that those names exist in {attr}`self`.
+- Propagate names: If the dimensions of the input tensor specified by {attr}`dim`
+  or {attr}`dims` are not present in the output tensor, then the corresponding names
+  of those dimensions do not appear in `output.names`.

-::
-
-    >>> x = torch.randn(1, 3, 3, 3, names=('N', 'C', 'H', 'W'))
-    >>> x.squeeze('N').names
-    ('C', 'H', 'W')
-
-    >>> x = torch.randn(3, 3, 3, 3, names=('N', 'C', 'H', 'W'))
-    >>> x.sum(['N', 'C']).names
-    ('H', 'W')
-
-    # Reduction ops with keepdim=True don't actually remove dimensions.
-    >>> x = torch.randn(3, 3, 3, 3, names=('N', 'C', 'H', 'W'))
-    >>> x.sum(['N', 'C'], keepdim=True).names
-    ('N', 'C', 'H', 'W')
+```
+>>> x = torch.randn(1, 3, 3, 3, names=('N', 'C', 'H', 'W'))
+>>> x.squeeze('N').names
+('C', 'H', 'W')
+
+>>> x = torch.randn(3, 3, 3, 3, names=('N', 'C', 'H', 'W'))
+>>> x.sum(['N', 'C']).names
+('H', 'W')
+
+# Reduction ops with keepdim=True don't actually remove dimensions.
+>>> x = torch.randn(3, 3, 3, 3, names=('N', 'C', 'H', 'W'))
+>>> x.sum(['N', 'C'], keepdim=True).names
+('N', 'C', 'H', 'W')
+```
-.. _unifies_names_from_inputs-doc:
+(unifies_names_from_inputs-doc)=

-Unifies names from inputs
-^^^^^^^^^^^^^^^^^^^^^^^^^
+## Unifies names from inputs

 All binary arithmetic ops follow this rule. Operations that broadcast still
 broadcast positionally from the right to preserve compatibility with unnamed
-tensors. To perform explicit broadcasting by names, use :meth:`Tensor.align_as`.
+tensors. To perform explicit broadcasting by names, use {meth}`Tensor.align_as`.

 - Check names: All names must match positionally from the right. i.e., in
-  ``tensor + other``, ``match(tensor.names[i], other.names[i])`` must be true for all
-  ``i`` in ``(-min(tensor.dim(), other.dim()) + 1, -1]``.
+  `tensor + other`, `match(tensor.names[i], other.names[i])` must be true for all
+  `i` in `(-min(tensor.dim(), other.dim()) + 1, -1]`.
 - Check names: Furthermore, all named dimensions must be aligned from the right.
-  During matching, if we match a named dimension ``A`` with an unnamed dimension
-  ``None``, then ``A`` must not appear in the tensor with the unnamed dimension.
+  During matching, if we match a named dimension `A` with an unnamed dimension
+  `None`, then `A` must not appear in the tensor with the unnamed dimension.
 - Propagate names: unify pairs of names from the right from both tensors to
   produce output names.

 For example,

-::
-
-    # tensor: Tensor[ N, None]
-    # other:  Tensor[None, C]
-    >>> tensor = torch.randn(3, 3, names=('N', None))
-    >>> other = torch.randn(3, 3, names=(None, 'C'))
-    >>> (tensor + other).names
-    ('N', 'C')
+```
+# tensor: Tensor[ N, None]
+# other:  Tensor[None, C]
+>>> tensor = torch.randn(3, 3, names=('N', None))
+>>> other = torch.randn(3, 3, names=(None, 'C'))
+>>> (tensor + other).names
+('N', 'C')
+```

 Check names:

-- ``match(tensor.names[-1], other.names[-1])`` is ``True``
-- ``match(tensor.names[-2], other.names[-2])`` is ``True``
-- Because we matched ``None`` in :attr:`tensor` with ``'C'``,
-  check to make sure ``'C'`` doesn't exist in :attr:`tensor` (it does not).
-- Check to make sure ``'N'`` doesn't exist in :attr:`other` (it does not).
+- `match(tensor.names[-1], other.names[-1])` is `True`
+- `match(tensor.names[-2], other.names[-2])` is `True`
+- Because we matched `None` in {attr}`tensor` with `'C'`,
+  check to make sure `'C'` doesn't exist in {attr}`tensor` (it does not).
+- Check to make sure `'N'` doesn't exist in {attr}`other` (it does not).

 Finally, the output names are computed with
-``[unify('N', None), unify(None, 'C')] = ['N', 'C']``
+`[unify('N', None), unify(None, 'C')] = ['N', 'C']`

-More examples::
+More examples:

-    # Dimensions don't match from the right:
-    # tensor: Tensor[N, C]
-    # other:  Tensor[ N]
-    >>> tensor = torch.randn(3, 3, names=('N', 'C'))
-    >>> other = torch.randn(3, names=('N',))
-    >>> (tensor + other).names
-    RuntimeError: Error when attempting to broadcast dims ['N', 'C'] and dims
-    ['N']: dim 'C' and dim 'N' are at the same position from the right but do
-    not match.
-
-    # Dimensions aren't aligned when matching tensor.names[-1] and other.names[-1]:
-    # tensor: Tensor[N, None]
-    # other:  Tensor[ N]
-    >>> tensor = torch.randn(3, 3, names=('N', None))
-    >>> other = torch.randn(3, names=('N',))
-    >>> (tensor + other).names
-    RuntimeError: Misaligned dims when attempting to broadcast dims ['N'] and
-    dims ['N', None]: dim 'N' appears in a different position from the right
-    across both lists.
+```
+# Dimensions don't match from the right:
+# tensor: Tensor[N, C]
+# other:  Tensor[ N]
+>>> tensor = torch.randn(3, 3, names=('N', 'C'))
+>>> other = torch.randn(3, names=('N',))
+>>> (tensor + other).names
+RuntimeError: Error when attempting to broadcast dims ['N', 'C'] and dims
+['N']: dim 'C' and dim 'N' are at the same position from the right but do
+not match.
+
+# Dimensions aren't aligned when matching tensor.names[-1] and other.names[-1]:
+# tensor: Tensor[N, None]
+# other:  Tensor[ N]
+>>> tensor = torch.randn(3, 3, names=('N', None))
+>>> other = torch.randn(3, names=('N',))
+>>> (tensor + other).names
+RuntimeError: Misaligned dims when attempting to broadcast dims ['N'] and
+dims ['N', None]: dim 'N' appears in a different position from the right
+across both lists.
+```

-.. note::
-    In both of the last examples, it is possible to align the tensors by names
-    and then perform the addition. Use :meth:`Tensor.align_as` to align
-    tensors by name or :meth:`Tensor.align_to` to align tensors to a custom
-    dimension ordering.
+:::{note}
+In both of the last examples, it is possible to align the tensors by names
+and then perform the addition. Use {meth}`Tensor.align_as` to align
+tensors by name or {meth}`Tensor.align_to` to align tensors to a custom
+dimension ordering.
+:::
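For instance, the first failing example above can be repaired by aligning by name first (a sketch; the output names follow the unification rule):

```
>>> tensor = torch.randn(3, 3, names=('N', 'C'))
>>> other = torch.randn(3, names=('N',))
>>> (tensor + other.align_as(tensor)).names
('N', 'C')
```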
-.. _permutes_dimensions-doc:
+(permutes_dimensions-doc)=

-Permutes dimensions
-^^^^^^^^^^^^^^^^^^^
+## Permutes dimensions

-Some operations, like :meth:`Tensor.t()`, permute the order of dimensions. Dimension names
+Some operations, like {meth}`Tensor.t()`, permute the order of dimensions. Dimension names
 are attached to individual dimensions so they get permuted as well.

-If the operator takes in positional index :attr:`dim`, it is also able to take a dimension
-name as :attr:`dim`.
+If the operator takes in a positional index {attr}`dim`, it is also able to take a dimension
+name as {attr}`dim`.

-- Check names: If :attr:`dim` is passed as a name, check that it exists in the tensor.
+- Check names: If {attr}`dim` is passed as a name, check that it exists in the tensor.
 - Propagate names: Permute dimension names in the same way as the dimensions that are
   being permuted.

-::
-
-    >>> x = torch.randn(3, 3, names=('N', 'C'))
-    >>> x.transpose('N', 'C').names
-    ('C', 'N')
+```
+>>> x = torch.randn(3, 3, names=('N', 'C'))
+>>> x.transpose('N', 'C').names
+('C', 'N')
+```
-.. _contracts_away_dims-doc:
+(contracts_away_dims-doc)=

-Contracts away dims
-^^^^^^^^^^^^^^^^^^^
+## Contracts away dims

 Matrix multiply functions follow some variant of this. Let's go through
-:func:`torch.mm` first and then generalize the rule for batch matrix multiplication.
+{func}`torch.mm` first and then generalize the rule for batch matrix multiplication.

-For ``torch.mm(tensor, other)``:
+For `torch.mm(tensor, other)`:

 - Check names: None
-- Propagate names: result names are ``(tensor.names[-2], other.names[-1])``.
+- Propagate names: result names are `(tensor.names[-2], other.names[-1])`.

-::
-
-    >>> x = torch.randn(3, 3, names=('N', 'D'))
-    >>> y = torch.randn(3, 3, names=('in', 'out'))
-    >>> x.mm(y).names
-    ('N', 'out')
+```
+>>> x = torch.randn(3, 3, names=('N', 'D'))
+>>> y = torch.randn(3, 3, names=('in', 'out'))
+>>> x.mm(y).names
+('N', 'out')
+```

 Inherently, a matrix multiplication performs a dot product over two dimensions,
 collapsing them. When two tensors are matrix-multiplied, the contracted dimensions
 disappear and do not show up in the output tensor.

-:func:`torch.mv`, :func:`torch.dot` work in a similar way: name inference does not
+{func}`torch.mv` and {func}`torch.dot` work in a similar way: name inference does not
 check input names and removes the dimensions that are involved in the dot product:

-::
-
-    >>> x = torch.randn(3, 3, names=('N', 'D'))
-    >>> y = torch.randn(3, names=('something',))
-    >>> x.mv(y).names
-    ('N',)
+```
+>>> x = torch.randn(3, 3, names=('N', 'D'))
+>>> y = torch.randn(3, names=('something',))
+>>> x.mv(y).names
+('N',)
+```

-Now, let's take a look at ``torch.matmul(tensor, other)``. Assume that ``tensor.dim() >= 2``
-and ``other.dim() >= 2``.
+Now, let's take a look at `torch.matmul(tensor, other)`. Assume that `tensor.dim() >= 2`
+and `other.dim() >= 2`.

 - Check names: Check that the batch dimensions of the inputs are aligned and broadcastable.
-  See :ref:`unifies_names_from_inputs-doc` for what it means for the inputs to be aligned.
+  See {ref}`unifies_names_from_inputs-doc` for what it means for the inputs to be aligned.
 - Propagate names: result names are obtained by unifying the batch dimensions and removing
   the contracted dimensions:
-  ``unify(tensor.names[:-2], other.names[:-2]) + (tensor.names[-2], other.names[-1])``.
+  `unify(tensor.names[:-2], other.names[:-2]) + (tensor.names[-2], other.names[-1])`.

-Examples::
+Examples:

-    # Batch matrix multiply of matrices Tensor['C', 'D'] and Tensor['E', 'F'].
-    # 'A', 'B' are batch dimensions.
-    >>> x = torch.randn(3, 3, 3, 3, names=('A', 'B', 'C', 'D'))
-    >>> y = torch.randn(3, 3, 3, names=('B', 'E', 'F'))
-    >>> torch.matmul(x, y).names
-    ('A', 'B', 'C', 'F')
+```
+# Batch matrix multiply of matrices Tensor['C', 'D'] and Tensor['E', 'F'].
+# 'A', 'B' are batch dimensions.
+>>> x = torch.randn(3, 3, 3, 3, names=('A', 'B', 'C', 'D'))
+>>> y = torch.randn(3, 3, 3, names=('B', 'E', 'F'))
+>>> torch.matmul(x, y).names
+('A', 'B', 'C', 'F')
+```

-Finally, there are fused ``add`` versions of many matmul functions. i.e., :func:`addmm`
-and :func:`addmv`. These are treated as composing name inference for :func:`mm` and
-name inference for :func:`add`.
+Finally, there are fused `add` versions of many matmul functions, e.g., {func}`addmm`
+and {func}`addmv`. These are treated as composing name inference for {func}`mm` with
+name inference for {func}`add`.
-.. _factory-doc:
+(factory-doc)=

-Factory functions
-^^^^^^^^^^^^^^^^^
+## Factory functions

-Factory functions now take a new :attr:`names` argument that associates a name
+Factory functions now take a new {attr}`names` argument that associates a name
 with each dimension.

-::
-
-    >>> torch.zeros(2, 3, names=('N', 'C'))
-    tensor([[0., 0., 0.],
-            [0., 0., 0.]], names=('N', 'C'))
+```
+>>> torch.zeros(2, 3, names=('N', 'C'))
+tensor([[0., 0., 0.],
+        [0., 0., 0.]], names=('N', 'C'))
+```
-.. _out_function_semantics-doc:
+(out_function_semantics-doc)=

-out function and in-place variants
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+## out function and in-place variants

-A tensor specified as an ``out=`` tensor has the following behavior:
+A tensor specified as an `out=` tensor has the following behavior:

 - If it has no named dimensions, then the names computed from the operation
   get propagated to it.
@@ -473,13 +471,13 @@ A tensor specified as an ``out=`` tensor has the following behavior:
 All in-place methods modify inputs to have names equal to the computed names
 from name inference. For example:

-::
-
-    >>> x = torch.randn(3, 3)
-    >>> y = torch.randn(3, 3, names=('N', 'C'))
-    >>> x.names
-    (None, None)
-
-    >>> x += y
-    >>> x.names
-    ('N', 'C')
+```
+>>> x = torch.randn(3, 3)
+>>> y = torch.randn(3, 3, names=('N', 'C'))
+>>> x.names
+(None, None)
+
+>>> x += y
+>>> x.names
+('N', 'C')
+```
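A short sketch of the first `out=` bullet above (an unnamed `out=` tensor receiving the computed names; the operation chosen is arbitrary):

```
>>> x = torch.randn(3, 3, names=('N', 'C'))
>>> out = torch.randn(3, 3)   # no named dimensions
>>> _ = torch.abs(x, out=out)
>>> out.names
('N', 'C')
```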