mirror of
https://github.com/pytorch/pytorch.git
synced 2025-10-24 07:27:32 +08:00
Compare commits
3 Commits
| Author | SHA1 | Date |
|---|---|---|
| | ccd5f4dbfc | |
| | 3cc21b5a46 | |
| | 27fb8750ad | |
16 .gitignore vendored
@@ -5,7 +5,6 @@ torch.egg-info/
torch/version.py
torch/csrc/generic/TensorMethods.cpp
torch/lib/*.so*
torch/lib/*.a*
torch/lib/*.dylib*
torch/lib/*.h
torch/lib/build
@@ -20,7 +19,6 @@ torch/csrc/nn/THCUNN.cpp
torch/csrc/nn/THNN_generic.cwrap
torch/csrc/nn/THNN_generic.cpp
torch/csrc/nn/THNN_generic.h
torch/csrc/generated
docs/src/**/*
test/data/legacy_modules.t7
test/data/gpu_tensors.pt
@@ -35,17 +33,3 @@ test/.coverage
*/**/*.so*
*/**/*.dylib*
test/data/legacy_serialized.pt
test/data/linear.pt

# IPython notebook checkpoints
.ipynb_checkpoints

# Editor temporaries
*.swn
*.swo
*.swp
*~

# OSX dir files
.DS_Store
.travis.yml

@@ -1,8 +1,7 @@
# https://travis-ci.org/pytorch/pytorch
language: python
dist: trusty
python:
- 2.7.9
- 2.7.8
- 2.7
- 3.5
- 3.6
CONTRIBUTING.md

@@ -44,9 +44,7 @@ https://github.com/pytorch/pytorch#from-source

The change you have to make is to replace

```
python setup.py install
```
`python setup.py install`

with

@@ -63,73 +61,18 @@ Hence, if you modify a python file, you do not need to reinstall pytorch again a

For example:
- Install local pytorch in `build develop` mode
- modify your python file `torch/__init__.py` (for example)
- modify your python file torch/__init__.py (for example)
- test functionality
- modify your python file `torch/__init__.py`
- modify your python file torch/__init__.py
- test functionality
- modify your python file `torch/__init__.py`
- modify your python file torch/__init__.py
- test functionality

You do not need to repeatedly install after modifying python files.

#### C++ Development tips

## Writing documentation

PyTorch uses [Google style](http://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html)
for formatting docstrings. Length of line inside docstrings block must be limited to 80 characters to
fit into Jupyter documentation popups.

## Managing multiple build trees

One downside to using `python setup.py develop` is that your development
version of pytorch will be installed globally on your account (e.g., if
you run `import torch` anywhere else, the development version will be
used.

If you want to manage multiple builds of PyTorch, you can make use of
[conda environments](https://conda.io/docs/using/envs.html) to maintain
separate Python package environments, each of which can be tied to a
specific build of PyTorch. To set one up:

```
conda create -n pytorch-myfeature
source activate pytorch-myfeature
# if you run python now, torch will NOT be installed
python setup.py build develop
```
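
For instance, a quick way to confirm which build an activated environment actually picks up is to check where `torch` is imported from. This is a minimal sketch, not part of the diff above; `pytorch-myfeature` is just the environment name used in the example:

```python
# Run inside the activated conda environment.
import torch

print(torch.__version__)  # version string of the active build
print(torch.__file__)     # filesystem path of the torch package being imported
```
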
## C++ Development tips

If you are working on the C++ code, there are a few important things that you
will want to keep in mind:

1. How to rebuild only the code you are working on, and
2. How to make rebuilds in the absence of changes go faster.

### Build only what you need.

`python setup.py build` will build everything, but since our build system is
not very optimized for incremental rebuilds, this will actually be very slow.
Far better is to only request rebuilds of the parts of the project you are
working on:

- Working on `torch/csrc`? Run `python setup.py develop` to rebuild
  (NB: no `build` here!)

- Working on `torch/lib/TH`, did not make any cmake changes, and just want to
  see if it compiles? Run `(cd torch/lib/build/TH && make install -j$(getconf _NPROCESSORS_ONLN))`. This
  applies for any other subdirectory of `torch/lib`. **Warning: Changes you
  make here will not be visible from Python.** See below.

- Working on `torch/lib` and want to run your changes / rerun cmake? Run
  `python setup.py build_deps`. Note that this will rerun cmake for
  every subdirectory in TH; if you are only working on one project,
  consider editing `torch/lib/build_all.sh` and commenting out the
  `build` lines of libraries you are not working on.

On the initial build, you can also speed things up with the environment
variables `DEBUG` and `NO_CUDA`.
When you are developing on the C++ side of things, the environment variables `DEBUG` and `NO_CUDA` are helpful.

- `DEBUG=1` will enable debug builds (-g -O0)
- `NO_CUDA=1` will disable compiling CUDA (in case you are developing on something not CUDA related), to save compile time.

@@ -139,15 +82,7 @@ For example:
NO_CUDA=1 DEBUG=1 python setup.py build develop
```

Make sure you continue to pass these flags on subsequent builds.

### Make no-op build fast.

Python `setuptools` is pretty dumb, and always rebuilds every C file in a
project. Using ccache in a situation like this is a real time-saver. However, by
default, ccache does not properly support CUDA stuff, so here are the
instructions for installing a custom `ccache` fork that has CUDA support:

Also, if you are developing a lot, using ccache is a real time-saver. By default, ccache does not properly support CUDA stuff, so here are the instructions for installing a custom `ccache` fork that has CUDA support:
```
# install and export ccache
if ! ls ~/ccache/bin/ccache
12 Dockerfile

@@ -1,16 +1,18 @@
FROM nvidia/cuda:8.0-cudnn6-devel-ubuntu16.04
FROM nvidia/cuda:8.0-devel-ubuntu16.04

RUN echo "deb http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1604/x86_64 /" > /etc/apt/sources.list.d/nvidia-ml.list

ENV CUDNN_VERSION 6.0.20
RUN apt-get update && apt-get install -y --no-install-recommends \
build-essential \
cmake \
git \
curl \
vim \
ca-certificates \
libjpeg-dev \
libpng-dev &&\
libpng-dev \
libcudnn6=$CUDNN_VERSION-1+cuda8.0 \
libcudnn6-dev=$CUDNN_VERSION-1+cuda8.0 && \
rm -rf /var/lib/apt/lists/*

RUN curl -o ~/miniconda.sh -O https://repo.continuum.io/miniconda/Miniconda3-4.2.12-Linux-x86_64.sh && \
@@ -28,9 +30,7 @@ COPY . .

RUN TORCH_CUDA_ARCH_LIST="3.5 5.2 6.0 6.1+PTX" TORCH_NVCC_FLAGS="-Xfatbin -compress-all" \
CMAKE_PREFIX_PATH="$(dirname $(which conda))/../" \
pip install -v .

RUN git clone https://github.com/pytorch/vision.git && cd vision && pip install -v .
python setup.py install

WORKDIR /workspace
RUN chmod -R a+w /workspace
97 README.md

@@ -2,30 +2,29 @@

--------------------------------------------------------------------------------

PyTorch is a Python package that provides two high-level features:
- Tensor computation (like NumPy) with strong GPU acceleration
- Deep neural networks built on a tape-based autograd system
PyTorch is a python package that provides two high-level features:
- Tensor computation (like numpy) with strong GPU acceleration
- Deep Neural Networks built on a tape-based autograd system

You can reuse your favorite Python packages such as NumPy, SciPy and Cython to extend PyTorch when needed.
You can reuse your favorite python packages such as numpy, scipy and Cython to extend PyTorch when needed.

We are in an early-release beta. Expect some adventures and rough edges.
We are in an early-release Beta. Expect some adventures and rough edges.

- [More about PyTorch](#more-about-pytorch)
- [More About PyTorch](#more-about-pytorch)
- [Installation](#installation)
- [Binaries](#binaries)
- [From Source](#from-source)
- [Docker Image](#docker-image)
- [From source](#from-source)
- [Docker image](#docker-image)
- [Getting Started](#getting-started)
- [Communication](#communication)
- [Releases and Contributing](#releases-and-contributing)
- [The Team](#the-team)

| System | 2.7 | 3.5 |
| System | Python | Status |
| --- | --- | --- |
| Linux CPU | [](https://travis-ci.org/pytorch/pytorch) | [](https://travis-ci.org/pytorch/pytorch) |
| Linux GPU | [](https://build.pytorch.org/job/pytorch-master-py2-linux) | [](https://build.pytorch.org/job/pytorch-master-py3-linux) |
| macOS CPU | [](https://build.pytorch.org/job/pytorch-master-py2-osx-cpu) | [](https://build.pytorch.org/job/pytorch-master-py3-osx-cpu) |

| Linux CPU | 2.7.8, 2.7, 3.5, nightly | [](https://travis-ci.org/pytorch/pytorch) |
| Linux GPU | 2.7 | [](https://build.pytorch.org/job/pytorch-master-py2) |
| Linux GPU | 3.5 | [](https://build.pytorch.org/job/pytorch-master-py3) |

## More about PyTorch

@@ -38,7 +37,7 @@ At a granular level, PyTorch is a library that consists of the following compone
</tr>
<tr>
<td><b> torch.autograd </b></td>
<td> a tape-based automatic differentiation library that supports all differentiable Tensor operations in torch </td>
<td> a tape based automatic differentiation library that supports all differentiable Tensor operations in torch </td>
</tr>
<tr>
<td><b> torch.nn </b></td>
@@ -46,7 +45,7 @@ At a granular level, PyTorch is a library that consists of the following compone
</tr>
<tr>
<td><b> torch.multiprocessing </b></td>
<td> Python multiprocessing, but with magical memory sharing of torch Tensors across processes. Useful for data loading and Hogwild training. </td>
<td> python multiprocessing, but with magical memory sharing of torch Tensors across processes. Useful for data loading and hogwild training. </td>
</tr>
<tr>
<td><b> torch.utils </b></td>
@@ -60,14 +59,14 @@ At a granular level, PyTorch is a library that consists of the following compone

Usually one uses PyTorch either as:

- a replacement for NumPy to use the power of GPUs.
- A replacement for numpy to use the power of GPUs.
- a deep learning research platform that provides maximum flexibility and speed

Elaborating further:

### A GPU-Ready Tensor Library
### A GPU-ready Tensor library

If you use NumPy, then you have used Tensors (a.k.a ndarray).
If you use numpy, then you have used Tensors (a.k.a ndarray).

<p align=center><img width="30%" src="docs/source/_static/img/tensor_illustration.png" /></p>

@@ -78,15 +77,15 @@ We provide a wide variety of tensor routines to accelerate and fit your scientif
such as slicing, indexing, math operations, linear algebra, reductions.
And they are fast!
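
As a small illustration of the kind of routine this covers, here is a minimal sketch (the `.cuda()` branch assumes a CUDA-capable machine):

```python
import torch

x = torch.rand(5, 3)           # 5x3 tensor, uniform on [0, 1)
y = torch.rand(5, 3)
print(x + y)                   # element-wise addition
print(x.mm(y.t()))             # matrix multiply -> 5x5

if torch.cuda.is_available():  # run the same ops on the GPU
    x, y = x.cuda(), y.cuda()
    print(x + y)
```
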
### Dynamic Neural Networks: Tape-Based Autograd
### Dynamic Neural Networks: Tape based Autograd

PyTorch has a unique way of building neural networks: using and replaying a tape recorder.

Most frameworks such as TensorFlow, Theano, Caffe and CNTK have a static view of the world.
Most frameworks such as `TensorFlow`, `Theano`, `Caffe` and `CNTK` have a static view of the world.
One has to build a neural network, and reuse the same structure again and again.
Changing the way the network behaves means that one has to start from scratch.

With PyTorch, we use a technique called reverse-mode auto-differentiation, which allows you to
With PyTorch, we use a technique called Reverse-mode auto-differentiation, which allows you to
change the way your network behaves arbitrarily with zero lag or overhead. Our inspiration comes
from several research papers on this topic, as well as current and past work such as
[autograd](https://github.com/twitter/torch-autograd),
@@ -98,45 +97,45 @@ You get the best of speed and flexibility for your crazy research.

<p align=center><img width="80%" src="docs/source/_static/img/dynamic_graph.gif" /></p>
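
A minimal sketch of what this looks like in user code (using the `Variable` API referred to elsewhere in this diff):

```python
import torch
from torch.autograd import Variable

x = Variable(torch.ones(2, 2), requires_grad=True)
y = (x * 3).sum()  # operations are recorded on the tape as they execute
y.backward()       # replay the tape backwards to compute gradients
print(x.grad)      # dy/dx is 3 for every element of x
```
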
### Python First
### Python first

PyTorch is not a Python binding into a monolithic C++ framework.
PyTorch is not a Python binding into a monolothic C++ framework.
It is built to be deeply integrated into Python.
You can use it naturally like you would use NumPy / SciPy / scikit-learn etc.
You can use it naturally like you would use numpy / scipy / scikit-learn etc.
You can write your new neural network layers in Python itself, using your favorite libraries
and use packages such as Cython and Numba.
Our goal is to not reinvent the wheel where appropriate.

### Imperative Experiences
### Imperative experiences

PyTorch is designed to be intuitive, linear in thought and easy to use.
When you execute a line of code, it gets executed. There isn't an asynchronous view of the world.
When you drop into a debugger, or receive error messages and stack traces, understanding them is straightforward.
The stack trace points to exactly where your code was defined.
When you drop into a debugger, or receive error messages and stack traces, understanding them is straight-forward.
The stack-trace points to exactly where your code was defined.
We hope you never spend hours debugging your code because of bad stack traces or asynchronous and opaque execution engines.

### Fast and Lean

PyTorch has minimal framework overhead. We integrate acceleration libraries
such as Intel MKL and NVIDIA (cuDNN, NCCL) to maximize speed.
At the core, its CPU and GPU Tensor and neural network backends
PyTorch has minimal framework overhead. We integrate acceleration libraries
such as Intel MKL and NVIDIA (CuDNN, NCCL) to maximize speed.
At the core, its CPU and GPU Tensor and Neural Network backends
(TH, THC, THNN, THCUNN) are written as independent libraries with a C99 API.
They are mature and have been tested for years.

Hence, PyTorch is quite fast – whether you run small or large neural networks.
Hence, PyTorch is quite fast -- whether you run small or large neural networks.

The memory usage in PyTorch is extremely efficient compared to Torch or some of the alternatives.
We've written custom memory allocators for the GPU to make sure that
your deep learning models are maximally memory efficient.
This enables you to train bigger deep learning models than before.

### Extensions without Pain
### Extensions without pain

Writing new neural network modules, or interfacing with PyTorch's Tensor API was designed to be straightforward
Writing new neural network modules, or interfacing with PyTorch's Tensor API was designed to be straight-forward
and with minimal abstractions.

You can write new neural network layers in Python using the torch API
[or your favorite NumPy-based libraries such as SciPy](http://pytorch.org/tutorials/advanced/numpy_extensions_tutorial.html).
[or your favorite numpy based libraries such as SciPy](http://pytorch.org/tutorials/advanced/numpy_extensions_tutorial.html).
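
For instance, such a layer can be an ordinary Python class (a sketch using `torch.nn.Module`; the class name and shapes here are illustrative):

```python
import torch
from torch import nn
from torch.autograd import Variable

class ScaledLinear(nn.Module):
    """Illustrative custom layer: a linear transform followed by a fixed scaling."""
    def __init__(self, in_features, out_features, scale=2.0):
        super(ScaledLinear, self).__init__()
        self.linear = nn.Linear(in_features, out_features)
        self.scale = scale

    def forward(self, x):
        return self.linear(x) * self.scale

layer = ScaledLinear(10, 5)
out = layer(Variable(torch.rand(4, 10)))  # batch of 4 -> output shape (4, 5)
```
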
If you want to write your layers in C/C++, we provide an extension API based on
[cffi](http://cffi.readthedocs.io/en/latest/) that is efficient and with minimal boilerplate.
@@ -150,16 +149,16 @@ Commands to install from binaries via Conda or pip wheels are on our website:

[http://pytorch.org](http://pytorch.org)

### From Source
### From source

If you are installing from source, we highly recommend installing an [Anaconda](https://www.continuum.io/downloads) environment.
You will get a high-quality BLAS library (MKL) and you get a controlled compiler version regardless of your Linux distro.

Once you have [Anaconda](https://www.continuum.io/downloads) installed, here are the instructions.
Once you have [anaconda](https://www.continuum.io/downloads) installed, here are the instructions.

If you want to compile with CUDA support, install
- [NVIDIA CUDA](https://developer.nvidia.com/cuda-downloads) 7.5 or above
- [NVIDIA cuDNN](https://developer.nvidia.com/cudnn) v5.x or above
- [NVIDIA CuDNN](https://developer.nvidia.com/cudnn) v5.x or above

If you want to disable CUDA support, export environment variable `NO_CUDA=1`.

@@ -167,7 +166,7 @@ If you want to disable CUDA support, export environment variable `NO_CUDA=1`.

On Linux
```bash
export CMAKE_PREFIX_PATH="$(dirname $(which conda))/../" # [anaconda root directory]
export CMAKE_PREFIX_PATH=[anaconda root directory]

# Install basic dependencies
conda install numpy pyyaml mkl setuptools cmake gcc cffi
@@ -197,21 +196,15 @@ MACOSX_DEPLOYMENT_TARGET=10.9 CC=clang CXX=clang++ python setup.py install

Dockerfile is supplied to build images with cuda support and cudnn v6. Build as usual
```
docker build -t pytorch .
docker build -t pytorch-cudnnv6 .
```
Alternatively, if you want a runtime image, build with

and run with nvidia-docker:
```
docker build -t pytorch . -f tools/docker/Dockerfile_runtime

nvidia-docker run --rm -ti --ipc=host pytorch-cudnnv6
```
and run with nvidia-docker:
```
nvidia-docker run --rm -ti --ipc=host pytorch
```
Please note that PyTorch uses shared memory to share data between processes, so if torch multiprocessing is used (e.g.
Please note that pytorch uses shared memory to share data between processes, so if torch multiprocessing is used (e.g.
for multithreaded data loaders) the default shared memory segment size that container runs with is not enough, and you
should increase shared memory size either with `--ipc=host` or `--shm-size` command line options to `nvidia-docker run`.
should increase shared memory size either with --ipc=host or --shm-size command line options to nvidia-docker run.

## Getting Started

@@ -223,13 +216,13 @@ Three pointers to get you started:

## Communication
* forums: discuss implementations, research, etc. http://discuss.pytorch.org
* GitHub issues: bug reports, feature requests, install issues, RFCs, thoughts, etc.
* Slack: general chat, online discussions, collaboration etc. https://pytorch.slack.com/ . If you need a slack invite, ping us at soumith@pytorch.org
* github issues: bug reports, feature requests, install issues, RFCs, thoughts, etc.
* slack: general chat, online discussions, collaboration etc. https://pytorch.slack.com/ . If you need a slack invite, ping us at soumith@pytorch.org
* newsletter: no-noise, one-way email newsletter with important announcements about pytorch. You can sign-up here: http://eepurl.com/cbG0rv

## Releases and Contributing

PyTorch has a 90 day release cycle (major releases).
PyTorch has a 90 day release cycle (major releases).
It's current state is Beta, we expect no obvious bugs. Please let us know if you encounter a bug by [filing an issue](https://github.com/pytorch/pytorch/issues).

We appreciate all contributions. If you are planning to contribute back bug-fixes, please do so without any further discussion.
@@ -112,7 +112,3 @@ footer p {
nav .hidden-section {
    display: inherit;
}

.wy-side-nav-search>div.version {
    color: #000;
}
@@ -9,8 +9,6 @@ Automatic differentiation package - torch.autograd

.. autofunction:: backward

.. autofunction:: grad

Variable
--------

@@ -40,8 +38,8 @@ All :class:`Variable` s keep track of in-place operations applied to them, and
if the implementation detects that a variable was saved for backward in one of
the functions, but it was modified in-place afterwards, an error will be raised
once backward pass is started. This ensures that if you're using in-place
functions and not seeing any errors, you can be sure that the computed
gradients are correct.
functions and not seing any errors, you can be sure that the computed gradients
are correct.
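
As an illustration of the behaviour described above (a sketch; the exact error message differs between versions)::

    >>> import torch
    >>> from torch.autograd import Variable
    >>> x = Variable(torch.randn(5), requires_grad=True)
    >>> y = x.tanh()       # tanh saves its output for the backward pass
    >>> y.add_(1)          # in-place modification of a saved Variable
    >>> y.sum().backward()
    RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation
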

.. autoclass:: Variable

@@ -75,10 +75,10 @@ author = 'Torch Contributors'
#
# The short X.Y version.
# TODO: change to [:2] at v1.0
version = 'master (' + torch.__version__ + ' )'
version = '.'.join(torch.__version__.split('+')[0].split('.')[:3])
# The full version, including alpha/beta/rc tags.
# TODO: verify this works as expected
release = 'master'
release = torch.__version__.split('+')[0]

# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
@@ -113,7 +113,7 @@ html_theme_path = [sphinx_rtd_theme.get_html_theme_path()]
#
html_theme_options = {
    'collapse_navigation': False,
    'display_version': True,
    'display_version': False,
    'logo_only': True,
}

@@ -205,7 +205,7 @@ from sphinx import addnodes


def patched_make_field(self, types, domain, items, **kw):
    # `kw` catches `env=None` needed for newer sphinx while maintaining
    # `kw` catches `env=None` needed for newer sphinx while maingaining
    # backwards compatibility when passed along further down!

    # type: (List, unicode, Tuple) -> nodes.field

@@ -25,10 +25,3 @@ Streams and events

.. autoclass:: Event
    :members:

NVIDIA Tools Extension (NVTX)
-----------------------------

.. autofunction:: torch.cuda.nvtx.mark
.. autofunction:: torch.cuda.nvtx.range_push
.. autofunction:: torch.cuda.nvtx.range_pop

@@ -10,4 +10,3 @@ torch.utils.data

.. autoclass:: torch.utils.data.sampler.RandomSampler
.. autoclass:: torch.utils.data.sampler.SubsetRandomSampler
.. autoclass:: torch.utils.data.sampler.WeightedRandomSampler
.. autoclass:: torch.utils.data.distributed.DistributedSampler
@@ -1,165 +0,0 @@
.. role:: hidden
    :class: hidden-section

Distributed communication package - torch.distributed
=====================================================

.. automodule:: torch.distributed
.. currentmodule:: torch.distributed

Currently torch.distributed supports three backends, each with
different capabilities. The table below shows which functions are available
for use with CPU / CUDA tensors.
MPI supports cuda only iff the implementation used to build PyTorch supports it.

+------------+-----------+-----------+-----------+
| Backend    | ``tcp``   | ``gloo``  | ``mpi``   |
+------------+-----+-----+-----+-----+-----+-----+
| Device     | CPU | GPU | CPU | GPU | CPU | GPU |
+============+=====+=====+=====+=====+=====+=====+
| send       | ✓   | ✘   | ✘   | ✘   | ✓   | ?   |
+------------+-----+-----+-----+-----+-----+-----+
| recv       | ✓   | ✘   | ✘   | ✘   | ✓   | ?   |
+------------+-----+-----+-----+-----+-----+-----+
| broadcast  | ✓   | ✘   | ✓   | ✓   | ✓   | ?   |
+------------+-----+-----+-----+-----+-----+-----+
| all_reduce | ✓   | ✘   | ✓   | ✓   | ✓   | ?   |
+------------+-----+-----+-----+-----+-----+-----+
| reduce     | ✓   | ✘   | ✘   | ✘   | ✓   | ?   |
+------------+-----+-----+-----+-----+-----+-----+
| all_gather | ✓   | ✘   | ✘   | ✘   | ✓   | ?   |
+------------+-----+-----+-----+-----+-----+-----+
| gather     | ✓   | ✘   | ✘   | ✘   | ✓   | ?   |
+------------+-----+-----+-----+-----+-----+-----+
| scatter    | ✓   | ✘   | ✘   | ✘   | ✓   | ?   |
+------------+-----+-----+-----+-----+-----+-----+
| barrier    | ✓   | ✘   | ✓   | ✓   | ✓   | ?   |
+------------+-----+-----+-----+-----+-----+-----+

Initialization
--------------

The package needs to be initialized using the :func:`torch.distributed.init_process_group`
function before calling any other methods.

.. autofunction:: init_process_group

.. autofunction:: get_rank

.. autofunction:: get_world_size

--------------------------------------------------------------------------------

Currently three initialization methods are supported:

TCP initialization
^^^^^^^^^^^^^^^^^^

Initialization will utilize a network address reachable from all processes.
If the address belongs to one of the machines, initialization requires that all processes
have manually specified ranks.

Alternatively, the address has to be a valid IP multicast address, in which case,
ranks can be assigned automatically. Multicast initialization also supports
a ``group_name`` argument, which allows you to use the same address for multiple jobs,
as long as they use different group names.

::

    import torch.distributed as dist

    # Use address of one of the machines
    dist.init_process_group(init_method='tcp://10.1.1.20:23456', rank=args.rank, world_size=4)

    # or a multicast address - rank will be assigned automatically if unspecified
    dist.init_process_group(init_method='tcp://[ff15:1e18:5d4c:4cf0:d02d:b659:53ba:b0a7]:23456',
                            world_size=4)

Shared file-system initialization
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Another initialization method makes use of a file system shared and visible from
all machines in a group. The URL should start with ``file://`` and contain a path
to a non-existent file (in an existing directory) on a shared file system.
This initialization method also supports a ``group_name`` argument, which allows you to
use the same shared file path for multiple jobs, as long as they use different
group names.

.. warning::
    This method assumes that the file system supports locking using ``fcntl`` - most
    local systems and NFS support it.

::

    import torch.distributed as dist

    # Rank will be assigned automatically if unspecified
    dist.init_process_group(init_method='file:///mnt/nfs/sharedfile', world_size=4,
                            group_name=args.group)

Environment variable initialization
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This method will read the configuration from environment variables, allowing
one to fully customize how the information is obtained. The variables to be set
are:

* ``MASTER_PORT`` - required; has to be a free port on machine with rank 0
* ``MASTER_ADDR`` - required (except for rank 0); address of rank 0 node
* ``WORLD_SIZE`` - required; can be set either here, or in a call to init function
* ``RANK`` - required; can be set either here, or in a call to init function

The machine with rank 0 will be used to set up all connections.

This is the default method, meaning that ``init_method`` does not have to be specified (or
can be ``env://``).

Groups
------

By default collectives operate on the default group (also called the world) and
require all processes to enter the distributed function call. However, some workloads can benefit
from more fine-grained communication. This is where distributed groups come
into play. :func:`~torch.distributed.new_group` function can be
used to create new groups, with arbitrary subsets of all processes. It returns
an opaque group handle that can be given as a ``group`` argument to all collectives
(collectives are distributed functions to exchange information in certain well-known programming patterns).

.. autofunction:: new_group

Point-to-point communication
----------------------------

.. autofunction:: send

.. autofunction:: recv

:func:`~torch.distributed.isend` and :func:`~torch.distributed.irecv`
return distributed request objects when used. In general, the type of this object is unspecified
as they should never be created manually, but they are guaranteed to support two methods:

* ``is_completed()`` - returns True if the operation has finished
* ``wait()`` - will block the process until the operation is finished.
  ``is_completed()`` is guaranteed to return True once it returns.

.. autofunction:: isend

.. autofunction:: irecv
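
A small usage sketch of these request objects (assumes the process group has already been initialized with one of the methods above, and that at least two processes are running)::

    import torch
    import torch.distributed as dist

    tensor = torch.zeros(10)
    if dist.get_rank() == 0:
        tensor += 1
        req = dist.isend(tensor, dst=1)   # non-blocking send to rank 1
    else:
        req = dist.irecv(tensor, src=0)   # non-blocking receive from rank 0
    req.wait()                            # block until the transfer has finished
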
Collective functions
--------------------

.. autofunction:: broadcast

.. autofunction:: all_reduce

.. autofunction:: reduce

.. autofunction:: all_gather

.. autofunction:: gather

.. autofunction:: scatter

.. autofunction:: barrier
@@ -30,7 +30,6 @@ PyTorch is an optimized tensor library for deep learning using GPUs and CPUs.
   optim
   torch.autograd <autograd>
   torch.multiprocessing <multiprocessing>
   torch.distributed <distributed>
   torch.legacy <legacy>
   cuda
   ffi

@@ -83,6 +83,6 @@ the current process group, and will keep track of all shared memory allocations.
Once all processes connected to it exit, it will wait a moment to ensure there
will be no new connections, and will iterate over all shared memory files
allocated by the group. If it finds that any of them still exist, they will be
deallocated. We've tested this method and it proved to be robust to various
deallocated. We've tested this method and it prooved to be robust to various
failures. Still, if your system has high enough limits, and ``file_descriptor``
is a supported strategy, we do not recommend switching to this one.
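
For reference, selecting a strategy is a single call (a sketch; ``file_system`` is the strategy discussed above)::

    import torch.multiprocessing as mp

    mp.set_sharing_strategy('file_system')    # opt in to the file_system strategy
    print(mp.get_sharing_strategy())          # -> 'file_system'
    print(mp.get_all_sharing_strategies())    # strategies supported on this system
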
@@ -160,7 +160,7 @@ Pooling Layers
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: AdaptiveMaxPool2d
    :members:
    :members:

:hidden:`AdaptiveAvgPool1d`
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -174,41 +174,7 @@ Pooling Layers
.. autoclass:: AdaptiveAvgPool2d
    :members:


Padding Layers
--------------

:hidden:`ReflectionPad2d`
~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: ReflectionPad2d
    :members:

:hidden:`ReplicationPad2d`
~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: ReplicationPad2d
    :members:

:hidden:`ReplicationPad3d`
~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: ReplicationPad3d
    :members:

:hidden:`ZeroPad2d`
~~~~~~~~~~~~~~~~~~~

.. autoclass:: ZeroPad2d
    :members:

:hidden:`ConstantPad2d`
~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: ConstantPad2d
    :members:


Non-linear Activations
----------------------------------

@@ -230,12 +196,6 @@ Non-linear Activations
.. autoclass:: ELU
    :members:

:hidden:`SELU`
~~~~~~~~~~~~~~

.. autoclass:: SELU
    :members:

:hidden:`PReLU`
~~~~~~~~~~~~~~~

@@ -343,19 +303,19 @@ Normalization layers
    :members:

:hidden:`InstanceNorm1d`
~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: InstanceNorm1d
    :members:

:hidden:`InstanceNorm2d`
~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: InstanceNorm2d
    :members:

:hidden:`InstanceNorm3d`
~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: InstanceNorm3d
    :members:
@@ -430,12 +390,6 @@ Dropout layers
.. autoclass:: Dropout3d
    :members:

:hidden:`AlphaDropout`
~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: AlphaDropout
    :members:


Sparse layers
----------------------------------
@@ -446,21 +400,9 @@ Sparse layers
.. autoclass:: Embedding
    :members:

:hidden:`EmbeddingBag`
~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: EmbeddingBag
    :members:

Distance functions
----------------------------------

:hidden:`CosineSimilarity`
~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: CosineSimilarity
    :members:

:hidden:`PairwiseDistance`
~~~~~~~~~~~~~~~~~~~~~~~~~~

@@ -495,12 +437,6 @@ Loss functions
.. autoclass:: NLLLoss
    :members:

:hidden:`PoissonNLLLoss`
~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: PoissonNLLLoss
    :members:

:hidden:`NLLLoss2d`
~~~~~~~~~~~~~~~~~~~

@@ -519,12 +455,6 @@ Loss functions
.. autoclass:: BCELoss
    :members:

:hidden:`BCEWithLogitsLoss`
~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: BCEWithLogitsLoss
    :members:

:hidden:`MarginRankingLoss`
~~~~~~~~~~~~~~~~~~~~~~~~~~~

@@ -573,12 +503,6 @@ Loss functions
.. autoclass:: MultiMarginLoss
    :members:

:hidden:`TripletMarginLoss`
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: TripletMarginLoss
    :members:


Vision layers
----------------
@@ -589,12 +513,6 @@ Vision layers
.. autoclass:: PixelShuffle
    :members:

:hidden:`Upsample`
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: Upsample
    :members:

:hidden:`UpsamplingNearest2d`
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

@@ -608,21 +526,15 @@ Vision layers
    :members:


DataParallel layers (multi-GPU, distributed)
--------------------------------------------
Multi-GPU layers
----------------

:hidden:`DataParallel`
~~~~~~~~~~~~~~~~~~~~~~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: DataParallel
    :members:

:hidden:`DistributedDataParallel`
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: torch.nn.parallel.DataParallel
    :members:


Utilities
---------
@@ -632,16 +544,6 @@ Utilities

.. autofunction:: torch.nn.utils.clip_grad_norm

:hidden:`weight_norm`
~~~~~~~~~~~~~~~~~~~~~

.. autofunction:: torch.nn.utils.weight_norm

:hidden:`remove_weight_norm`
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autofunction:: torch.nn.utils.remove_weight_norm


.. currentmodule:: torch.nn.utils.rnn

@@ -774,7 +676,7 @@ Pooling functions

.. autofunction:: adaptive_avg_pool2d


Non-linear activation functions
-------------------------------

@@ -804,11 +706,6 @@ Non-linear activation functions

.. autofunction:: elu

:hidden:`selu`
~~~~~~~~~~~~~~

.. autofunction:: selu

:hidden:`leaky_relu`
~~~~~~~~~~~~~~~~~~~~

@@ -887,11 +784,6 @@ Normalization functions

.. autofunction:: batch_norm

:hidden:`normalize`
~~~~~~~~~~~~~~~~~~~~

.. autofunction:: normalize

Linear functions
----------------

@@ -908,21 +800,6 @@ Dropout functions

.. autofunction:: dropout

:hidden:`alpha_dropout`
~~~~~~~~~~~~~~~~~~~~~~~

.. autofunction:: alpha_dropout

:hidden:`dropout2d`
~~~~~~~~~~~~~~~~~~~

.. autofunction:: dropout2d

:hidden:`dropout3d`
~~~~~~~~~~~~~~~~~~~

.. autofunction:: dropout3d

Distance functions
----------------------------------

@@ -931,100 +808,36 @@ Distance functions

.. autofunction:: pairwise_distance

:hidden:`cosine_similarity`
~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autofunction:: cosine_similarity


Loss functions
--------------

:hidden:`binary_cross_entropy`
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autofunction:: binary_cross_entropy

:hidden:`poisson_nll_loss`
~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autofunction:: poisson_nll_loss

:hidden:`cosine_embedding_loss`
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autofunction:: cosine_embedding_loss

:hidden:`cross_entropy`
~~~~~~~~~~~~~~~~~~~~~~~

.. autofunction:: cross_entropy

:hidden:`hinge_embedding_loss`
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autofunction:: hinge_embedding_loss

:hidden:`kl_div`
~~~~~~~~~~~~~~~~

.. autofunction:: kl_div

:hidden:`l1_loss`
~~~~~~~~~~~~~~~~~

.. autofunction:: l1_loss

:hidden:`mse_loss`
~~~~~~~~~~~~~~~~~~

.. autofunction:: mse_loss

:hidden:`margin_ranking_loss`
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autofunction:: margin_ranking_loss

:hidden:`multilabel_margin_loss`
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autofunction:: multilabel_margin_loss

:hidden:`multilabel_soft_margin_loss`
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autofunction:: multilabel_soft_margin_loss

:hidden:`multi_margin_loss`
~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autofunction:: multi_margin_loss

:hidden:`nll_loss`
~~~~~~~~~~~~~~~~~~

.. autofunction:: nll_loss

:hidden:`binary_cross_entropy_with_logits`
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autofunction:: binary_cross_entropy_with_logits
:hidden:`kl_div`
~~~~~~~~~~~~~~~~

.. autofunction:: kl_div

:hidden:`cross_entropy`
~~~~~~~~~~~~~~~~~~~~~~~

.. autofunction:: cross_entropy

:hidden:`binary_cross_entropy`
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autofunction:: binary_cross_entropy

:hidden:`smooth_l1_loss`
~~~~~~~~~~~~~~~~~~~~~~~~

.. autofunction:: smooth_l1_loss

:hidden:`soft_margin_loss`
~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autofunction:: soft_margin_loss

:hidden:`triplet_margin_loss`
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autofunction:: triplet_margin_loss

Vision functions
----------------

@@ -1038,32 +851,6 @@ Vision functions

.. autofunction:: pad

:hidden:`upsample`
~~~~~~~~~~~~~~~~~~~~~~~

.. autofunction:: upsample

:hidden:`upsample_nearest`
~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autofunction:: upsample_nearest

:hidden:`upsample_bilinear`
~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autofunction:: upsample_bilinear

:hidden:`grid_sample`
~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autofunction:: grid_sample

:hidden:`affine_grid`
~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autofunction:: affine_grid


torch.nn.init
=============

@@ -67,18 +67,18 @@ model. ``volatile`` also determines that ``requires_grad is False``.

Volatile differs from :ref:`excluding-requires_grad` in how the flag propagates.
If there's even a single volatile input to an operation, its output is also
going to be volatile. Volatility spreads across the graph much easier than
going to be volatile. Volatility spreads accross the graph much easier than
non-requiring gradient - you only need a **single** volatile leaf to have a
volatile output, while you need **all** leaves to not require gradient to
have an output that doesn't require gradient. Using volatile flag you don't
have an output the doesn't require gradient. Using volatile flag you don't
need to change any settings of your model parameters to use it for
inference. It's enough to create a volatile input, and this will ensure that
no intermediate states are saved.

.. code::

    >>> regular_input = Variable(torch.randn(1, 3, 227, 227))
    >>> volatile_input = Variable(torch.randn(1, 3, 227, 227), volatile=True)
    >>> regular_input = Variable(torch.randn(5, 5))
    >>> volatile_input = Variable(torch.randn(5, 5), volatile=True)
    >>> model = torchvision.models.resnet18(pretrained=True)
    >>> model(regular_input).requires_grad
    True
@@ -86,28 +86,21 @@ no intermediate states are saved.
    False
    >>> model(volatile_input).volatile
    True
    >>> model(volatile_input).grad_fn is None
    >>> model(volatile_input).creator is None
    True

How autograd encodes the history
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Autograd is reverse automatic differentiation system. Conceptually,
autograd records a graph recording all of the operations that created
the data as you execute operations, giving you a directed acyclic graph
whose leaves are the input variables and roots are the output variables.
By tracing this graph from roots to leaves, you can automatically
compute the gradients using the chain rule.

Internally, autograd represents this graph as a graph of
:class:`Function` objects (really expressions), which can be
:meth:`~torch.autograd.Function.apply` ed to compute the result of
evaluating the graph. When computing the forwards pass, autograd
simultaneously performs the requested computations and builds up a graph
representing the function that computes the gradient (the ``.grad_fn``
attribute of each :class:`Variable` is an entry point into this graph).
When the forwards pass is completed, we evaluate this graph in the
backwards pass to compute the gradients.
Each Variable has a ``.creator`` attribute, that points to the function, of
which it is an output. This is an entry point to a directed acyclic graph (DAG)
consisting of :class:`Function` objects as nodes, and references between them
being the edges. Every time an operation is performed, a new :class:`Function`
representing it is instantiated, its :meth:`~torch.autograd.Function.forward`
method is called, and its output :class:`Variable` s creators are set to it.
Then, by following the path from any :class:`Variable` to the leaves, it is
possible to reconstruct the sequence of operations that has created the data,
and automatically compute the gradients.

An important thing to note is that the graph is recreated from scratch at every
iteration, and this is exactly what allows for using arbitrary Python control
@@ -1,113 +0,0 @@
.. _broadcasting-semantics:

Broadcasting semantics
======================

Many PyTorch operations support :any:`NumPy Broadcasting Semantics <numpy.doc.broadcasting>`.

In short, if a PyTorch operation supports broadcast, then its Tensor arguments can be
automatically expanded to be of equal sizes (without making copies of the data).

General semantics
-----------------
Two tensors are "broadcastable" if the following rules hold:

- Each tensor has at least one dimension.
- When iterating over the dimension sizes, starting at the trailing dimension,
  the dimension sizes must either be equal, one of them is 1, or one of them
  does not exist.

For Example::

    >>> x=torch.FloatTensor(5,7,3)
    >>> y=torch.FloatTensor(5,7,3)
    # same shapes are always broadcastable (i.e. the above rules always hold)

    >>> x=torch.FloatTensor()
    >>> y=torch.FloatTensor(2,2)
    # x and y are not broadcastable, because x does not have at least 1 dimension

    # can line up trailing dimensions
    >>> x=torch.FloatTensor(5,3,4,1)
    >>> y=torch.FloatTensor(  3,1,1)
    # x and y are broadcastable.
    # 1st trailing dimension: both have size 1
    # 2nd trailing dimension: y has size 1
    # 3rd trailing dimension: x size == y size
    # 4th trailing dimension: y dimension doesn't exist

    # but:
    >>> x=torch.FloatTensor(5,2,4,1)
    >>> y=torch.FloatTensor(  3,1,1)
    # x and y are not broadcastable, because in the 3rd trailing dimension 2 != 3

If two tensors :attr:`x`, :attr:`y` are "broadcastable", the resulting tensor size
is calculated as follows:

- If the number of dimensions of :attr:`x` and :attr:`y` are not equal, prepend 1
  to the dimensions of the tensor with fewer dimensions to make them equal length.
- Then, for each dimension size, the resulting dimension size is the max of the sizes of
  :attr:`x` and :attr:`y` along that dimension.

For Example::

    # can line up trailing dimensions to make reading easier
    >>> x=torch.FloatTensor(5,1,4,1)
    >>> y=torch.FloatTensor(  3,1,1)
    >>> (x+y).size()
    torch.Size([5, 3, 4, 1])

    # but not necessary:
    >>> x=torch.FloatTensor(1)
    >>> y=torch.FloatTensor(3,1,7)
    >>> (x+y).size()
    torch.Size([3, 1, 7])

    >>> x=torch.FloatTensor(5,2,4,1)
    >>> y=torch.FloatTensor(3,1,1)
    >>> (x+y).size()
    RuntimeError: The size of tensor a (2) must match the size of tensor b (3) at non-singleton dimension 1

In-place semantics
------------------
One complication is that in-place operations do not allow the in-place tensor to change shape
as a result of the broadcast.

For Example::

    >>> x=torch.FloatTensor(5,3,4,1)
    >>> y=torch.FloatTensor(3,1,1)
    >>> (x.add_(y)).size()
    torch.Size([5, 3, 4, 1])

    # but:
    >>> x=torch.FloatTensor(1,3,1)
    >>> y=torch.FloatTensor(3,1,7)
    >>> (x.add_(y)).size()
    RuntimeError: The expanded size of the tensor (1) must match the existing size (7) at non-singleton dimension 2.

Backwards compatibility
-----------------------
Prior versions of PyTorch allowed certain pointwise functions to execute on tensors with different shapes,
as long as the number of elements in each tensor was equal. The pointwise operation would then be carried
out by viewing each tensor as 1-dimensional. PyTorch now supports broadcasting and the "1-dimensional"
pointwise behavior is considered deprecated and will generate a Python warning in cases where tensors are
not broadcastable, but have the same number of elements.

Note that the introduction of broadcasting can cause backwards incompatible changes in the case where
two tensors do not have the same shape, but are broadcastable and have the same number of elements.
For Example::

    >>> torch.add(torch.ones(4,1), torch.randn(4))

would previously produce a Tensor with size: torch.Size([4,1]), but now produces a Tensor with size: torch.Size([4,4]).
In order to help identify cases in your code where backwards incompatibilities introduced by broadcasting may exist,
you may set `torch.utils.backcompat.broadcast_warning.enabled` to `True`, which will generate a python warning
in such cases.

For Example::

    >>> torch.utils.backcompat.broadcast_warning.enabled=True
    >>> torch.add(torch.ones(4,1), torch.ones(4))
    __main__:1: UserWarning: self and other do not have the same shape, but are broadcastable, and have the same number of elements.
    Changing behavior in a backwards incompatible manner to broadcasting rather than viewing as 1-dimensional.
@@ -12,7 +12,7 @@ of your selected device, and the results will be always placed in on the same
device as the tensor.

Cross-GPU operations are not allowed by default, with the only exception of
:meth:`~torch.Tensor.copy_`. Unless you enable peer-to-peer memory accesses,
:meth:`~torch.Tensor.copy_`. Unless you enable peer-to-peer memory accesses
any attempts to launch ops on tensors spread across different devices will
raise an error.
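
A short sketch of these placement rules (assumes a machine with at least two GPUs)::

    import torch

    x = torch.cuda.FloatTensor(1)        # x lives on the current device (GPU 0)
    with torch.cuda.device(1):
        y = torch.cuda.FloatTensor(1)    # y lives on GPU 1
        # z = x + y                      # would raise an error: x and y are on different devices
        z = x.cuda(1) + y                # ok: copy x to GPU 1 first; z stays on GPU 1
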
@ -13,28 +13,31 @@ Extending :mod:`torch.autograd`
|
||||
Adding operations to :mod:`~torch.autograd` requires implementing a new
|
||||
:class:`Function` subclass for each operation. Recall that :class:`Function` s
|
||||
are what :mod:`~torch.autograd` uses to compute the results and gradients, and
|
||||
encode the operation history. Every new function requires you to implement 2
|
||||
encode the operation history. Every new function requires you to implement 3
|
||||
methods:
|
||||
|
||||
- ``__init__`` (*optional*) - if your operation is parametrized by/uses
|
||||
objects different than :class:`Variable` s, you should pass them as arguments
|
||||
to ``__init__``. For example, ``AddConstant`` function takes a scalar to add,
|
||||
while ``Transpose`` requires specifying which two dimensions to swap. If your
|
||||
function doesn't require any additional parameters, you can skip it.
|
||||
- :meth:`~Function.forward` - the code that performs the operation. It can take
|
||||
as many arguments as you want, with some of them being optional, if you
|
||||
specify the default values. All kinds of Python objects are accepted here.
|
||||
:class:`Variable` arguments will be converted to :class:`Tensor` s before the
|
||||
call, and their use will be registered in the graph. Note that this logic won't
|
||||
traverse lists/dicts/any other data structures and will only consider Variables
|
||||
that are direct arguments to the call. You can return either a single
|
||||
:class:`Tensor` output, or a :class:`tuple` of :class:`Tensor` s if there are
|
||||
multiple outputs. Also, please refer to the docs of :class:`Function` to find
|
||||
descriptions of useful methods that can be called only from :meth:`~Function.forward`.
|
||||
as many arguments as you want, with some of them being
|
||||
optional, if you specify the default values. Keep in mind that only
|
||||
:class:`Variable` s will be passed in here. You can return either a single
|
||||
:class:`Variable` output, or a :class:`tuple` of :class:`Variable` s if there
|
||||
are multiple. Also, please refer to the docs of :class:`Function` to find
|
||||
descriptions of useful methods that can be called only from
|
||||
:meth:`~Function.forward`.
|
||||
- :meth:`~Function.backward` - gradient formula. It will be given
|
||||
as many :class:`Variable` arguments as there were outputs, with each of them
|
||||
representing gradient w.r.t. that output. It should return as many
|
||||
:class:`Variable` s as there were inputs, with each of them containing the
|
||||
gradient w.r.t. its corresponding input. If your inputs didn't require
|
||||
gradient (see :attr:`~Variable.needs_input_grad`), or were non-:class:`Variable`
|
||||
objects, you can return :class:`python:None`. Also, if you have optional
|
||||
arguments to :meth:`~Variable.forward` you can return more gradients than there
|
||||
were inputs, as long as they're all :any:`python:None`.
|
||||
as many arguments as there were outputs, with each of them representing
|
||||
gradient w.r.t. that output. It should return as many :class:`Tensor` s as
|
||||
there were inputs, with each of them containing the gradient w.r.t.
|
||||
corresponding input. If you inputs didn't require gradient (see
|
||||
:attr:`~Variable.needs_input_grad`), or it was non-differentiable, you
|
||||
can return :class:`None`. Also, if you have optional arguments to
|
||||
:meth:`~Variable.forward` you can return more gradients than there were
|
||||
inputs, as long as they're all :any:`python:None`.
|
||||
|
||||
Below you can find code for a ``Linear`` function from :mod:`torch.nn`, with
|
||||
additional comments::
|
||||
@ -42,25 +45,22 @@ additional comments::
|
||||
# Inherit from Function
|
||||
class Linear(Function):
|
||||
|
||||
# Note that both forward and backward are @staticmethods
|
||||
@staticmethod
|
||||
# bias is an optional argument
|
||||
def forward(ctx, input, weight, bias=None):
|
||||
ctx.save_for_backward(input, weight, bias)
|
||||
def forward(self, input, weight, bias=None):
|
||||
self.save_for_backward(input, weight, bias)
|
||||
output = input.mm(weight.t())
|
||||
if bias is not None:
|
||||
output += bias.unsqueeze(0).expand_as(output)
|
||||
return output
|
||||
|
||||
# This function has only a single output, so it gets only one gradient
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
def backward(self, grad_output):
|
||||
# This is a pattern that is very convenient - at the top of backward
|
||||
# unpack saved_tensors and initialize all gradients w.r.t. inputs to
|
||||
# None. Thanks to the fact that additional trailing Nones are
|
||||
# ignored, the return statement is simple even when the function has
|
||||
# optional inputs.
|
||||
input, weight, bias = ctx.saved_variables
|
||||
input, weight, bias = self.saved_tensors
|
||||
grad_input = grad_weight = grad_bias = None
|
||||
|
||||
# These needs_input_grad checks are optional and there only to
|
||||
@ -76,39 +76,27 @@ additional comments::
|
||||
|
||||
return grad_input, grad_weight, grad_bias
|
||||
|
||||
Now, to make it easier to use these custom ops, we recommend aliasing their
|
||||
``apply`` method::
|
||||
Now, to make it easier to use these custom ops, we recommend wrapping them in
|
||||
small helper functions::
|
||||
|
||||
linear = Linear.aply
|
||||
|
||||
Here, we give an additional example of a function that is parametrized by
|
||||
non-Variable arguments::
|
||||
|
||||
class MulConstant(Function):
|
||||
@staticmethod
|
||||
def forward(ctx, tensor, constant):
|
||||
# ctx is a context object that can be used to stash information
|
||||
for backward computation
|
||||
ctx.constant = constant
|
||||
return tensor * constant
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
# We return as many input gradients as there were arguments.
|
||||
# Gradients of non-Tensor arguments to forward must be None.
|
||||
return grad_output * ctx.constant, None
|
||||
def linear(input, weight, bias=None):
|
||||
# First braces create a Function object. Any arguments given here
|
||||
# will be passed to __init__. Second braces will invoke the __call__
|
||||
# operator, that will then use forward() to compute the result and
|
||||
# return it.
|
||||
return Linear()(input, weight, bias)
|
||||
|
||||
You probably want to check if the backward method you implemented actually
computes the derivatives of your function. You can do this by comparing it with
numerical approximations using small finite differences::

    from torch.autograd import gradcheck

    # gradcheck takes a tuple of tensors as input, checks if your gradient
    # evaluated with these tensors is close enough to numerical
    # approximations and returns True if they all verify this condition.
    input = (Variable(torch.randn(20,20).double(), requires_grad=True), Variable(torch.randn(30,20).double(), requires_grad=True),)
    test = gradcheck(Linear.apply, input, eps=1e-6, atol=1e-4)
    input = (Variable(torch.randn(20,20).double(), requires_grad=True),)
    test = gradcheck(Linear(), input, eps=1e-6, atol=1e-4)
    print(test)
Extending :mod:`torch.nn`

@@ -114,21 +114,3 @@ Algorithms
    :members:
.. autoclass:: SGD
    :members:

How to adjust Learning Rate
---------------------------

:mod:`torch.optim.lr_scheduler` provides several methods to adjust the learning
rate based on the number of epochs. :class:`torch.optim.lr_scheduler.ReduceLROnPlateau`
allows dynamic learning rate reducing based on some validation measurements.
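
For context, a minimal sketch of how these schedulers are typically driven from
a training loop; the toy model, step size, and metric value below are
placeholders rather than part of this changeset::

    import torch
    import torch.nn as nn
    from torch.optim.lr_scheduler import StepLR, ReduceLROnPlateau

    model = nn.Linear(10, 2)     # toy model, only for illustration
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    scheduler = StepLR(optimizer, step_size=30, gamma=0.1)

    for epoch in range(100):
        scheduler.step()         # decays the learning rate by 10x every 30 epochs
        # ... one epoch of training would go here ...

    # ReduceLROnPlateau is driven by a validation metric instead of the epoch count:
    plateau = ReduceLROnPlateau(optimizer, mode='min', patience=10)
    plateau.step(0.5)            # pass in the current validation loss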
.. autoclass:: torch.optim.lr_scheduler.LambdaLR
    :members:
.. autoclass:: torch.optim.lr_scheduler.StepLR
    :members:
.. autoclass:: torch.optim.lr_scheduler.MultiStepLR
    :members:
.. autoclass:: torch.optim.lr_scheduler.ExponentialLR
    :members:
.. autoclass:: torch.optim.lr_scheduler.ReduceLROnPlateau
    :members:
@@ -1,7 +1,7 @@
.. currentmodule:: torch.sparse

torch.sparse
============
Sparse tensors
==============

.. warning::

@@ -12,13 +12,16 @@ efficiently store and process tensors for which the majority of elements
are zeros.

A sparse tensor is represented as a pair of dense tensors: a tensor
of values and a tensor of indices. A sparse tensor can be constructed
which contains the actual values :class:`torch.sparse.values`, and a
tensor which contains the coordinates of those values
:class:`torch.sparse.indices`. A sparse tensor can be constructed
by providing these two tensors, as well as the size of the sparse tensor
(which cannot be inferred from these tensors!)

    >>> i = torch.LongTensor([[0, 1], [2, 0]])
    >>> v = torch.FloatTensor([3, 4])
    >>> torch.sparse.FloatTensor(i, v, torch.Size([2,3])).to_dense()

     0  0  3
     4  0  0
    [torch.FloatTensor of size 2x3]
@@ -29,6 +32,7 @@ dimensions are sparse, and the rest of the dimensions are dense.

    >>> i = torch.LongTensor([[2, 4]])
    >>> v = torch.FloatTensor([[1, 3], [5, 7]])
    >>> torch.sparse.FloatTensor(i, v).to_dense()

     0  0
     0  0
     1  3

@@ -44,71 +48,42 @@ An empty sparse tensor can be constructed by specifying its size:
and values:
    [torch.FloatTensor with no dimension]

.. note::

    Our sparse tensor format permits *uncoalesced* sparse tensors, where
    there may be duplicate coordinates in the indices; in this case,
    the interpretation is that the value at that index is the sum of all
    duplicate value entries. Uncoalesced tensors permit us to implement
    certain operators more efficiently.

    For the most part, you shouldn't have to care whether a
    sparse tensor is coalesced or not, as most operations will work
    identically given a coalesced or uncoalesced sparse tensor.
    However, there are two cases in which you may need to care.

    First, if you repeatedly perform an operation that can produce
    duplicate entries (e.g., :func:`torch.sparse.FloatTensor.add`), you
    should occasionally coalesce your sparse tensors to prevent
    them from growing too large.

    Second, some operators will produce different values depending on
    whether or not they are coalesced (e.g.,
    :func:`torch.sparse.FloatTensor._values` and
    :func:`torch.sparse.FloatTensor._indices`, as well as
    :func:`torch.Tensor._sparse_mask`). These operators are
    prefixed by an underscore to indicate that they reveal internal
    implementation details and should be used with care, since code
    that works with coalesced sparse tensors may not work with
    uncoalesced sparse tensors; generally speaking, it is safest
    to explicitly coalesce before working with these operators.

    For example, suppose that we wanted to implement an operator
    by operating directly on :func:`torch.sparse.FloatTensor._values`.
    Multiplication by a scalar can be implemented in the obvious way,
    as multiplication distributes over addition; however, square root
    cannot be implemented directly, since ``sqrt(a + b) != sqrt(a) +
    sqrt(b)`` (which is what would be computed if you were given an
    uncoalesced tensor.)

Sparse tensors can have duplicate entries for an index; such a tensor is
called non-coalesced. Duplicate entries are summed together when
coalescing (or converting to another representation). Some operations
(for example, :func:`torch.FloatTensor.add`) produce duplicate entries;
if you repeatedly perform these operations, you should coalesce your
sparse tensors to prevent them from growing too large.
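
A rough sketch of that distinction, with made-up indices and values (note the
duplicated coordinate)::

    >>> i = torch.LongTensor([[0, 0, 1], [1, 1, 0]])   # (0, 1) appears twice
    >>> v = torch.FloatTensor([2, 3, 4])
    >>> s = torch.sparse.FloatTensor(i, v, torch.Size([2, 2]))
    >>> s._values() * 10                  # fine: scaling distributes over the implicit sum
    >>> s.coalesce()._values().sqrt()     # sqrt must only see the summed, coalesced values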
.. class:: FloatTensor()
|
||||
|
||||
.. method:: add
|
||||
.. method:: add_
|
||||
.. method:: clone
|
||||
.. method:: dim
|
||||
.. method:: div
|
||||
.. method:: div_
|
||||
.. method:: get_device
|
||||
.. method:: hspmm
|
||||
.. method:: mm
|
||||
.. method:: mul
|
||||
.. method:: mul_
|
||||
.. method:: resizeAs_
|
||||
.. method:: size
|
||||
.. method:: spadd
|
||||
.. method:: spmm
|
||||
.. method:: sspaddmm
|
||||
.. method:: sspmm
|
||||
.. method:: sub
|
||||
.. method:: sub_
|
||||
.. method:: t_
|
||||
.. method:: toDense
|
||||
.. method:: transpose
|
||||
.. method:: transpose_
|
||||
.. method:: zero_
|
||||
.. method:: coalesce
|
||||
.. method:: is_coalesced
|
||||
.. method:: _indices
|
||||
.. method:: _values
|
||||
.. method:: _nnz
|
||||
.. automethod:: add
|
||||
.. automethod:: add_
|
||||
.. automethod:: clone
|
||||
.. automethod:: contiguous
|
||||
.. automethod:: dim
|
||||
.. automethod:: div
|
||||
.. automethod:: div_
|
||||
.. automethod:: get_device
|
||||
.. automethod:: hspmm
|
||||
.. automethod:: indices
|
||||
.. automethod:: is_contiguous
|
||||
.. automethod:: mm
|
||||
.. automethod:: mul
|
||||
.. automethod:: mul_
|
||||
.. automethod:: nnz
|
||||
.. automethod:: resizeAs_
|
||||
.. automethod:: size
|
||||
.. automethod:: spadd
|
||||
.. automethod:: sparse_mask
|
||||
.. automethod:: spmm
|
||||
.. automethod:: sspaddmm
|
||||
.. automethod:: sspmm
|
||||
.. automethod:: sub
|
||||
.. automethod:: sub_
|
||||
.. automethod:: t_
|
||||
.. automethod:: toDense
|
||||
.. automethod:: transpose
|
||||
.. automethod:: transpose_
|
||||
.. automethod:: values
|
||||
.. automethod:: zero_
|
||||
|
||||
@@ -13,7 +13,7 @@ Data type                CPU tensor                  GPU tensor
======================== =========================== ================================
32-bit floating point    :class:`torch.FloatTensor`  :class:`torch.cuda.FloatTensor`
64-bit floating point    :class:`torch.DoubleTensor` :class:`torch.cuda.DoubleTensor`
16-bit floating point    :class:`torch.HalfTensor`   :class:`torch.cuda.HalfTensor`
16-bit floating point    N/A                         :class:`torch.cuda.HalfTensor`
8-bit integer (unsigned) :class:`torch.ByteTensor`   :class:`torch.cuda.ByteTensor`
8-bit integer (signed)   :class:`torch.CharTensor`   :class:`torch.cuda.CharTensor`
16-bit integer (signed)  :class:`torch.ShortTensor`  :class:`torch.cuda.ShortTensor`
@ -196,10 +196,9 @@ view of a storage and defines numeric operations on it.
|
||||
.. automethod:: lt
|
||||
.. automethod:: lt_
|
||||
.. automethod:: map_
|
||||
.. automethod:: masked_scatter_
|
||||
.. automethod:: masked_copy_
|
||||
.. automethod:: masked_fill_
|
||||
.. automethod:: masked_select
|
||||
.. automethod:: matmul
|
||||
.. automethod:: max
|
||||
.. automethod:: mean
|
||||
.. automethod:: median
|
||||
|
||||
@ -170,7 +170,6 @@ BLAS and LAPACK Operations
|
||||
.. autofunction:: ger
|
||||
.. autofunction:: gesv
|
||||
.. autofunction:: inverse
|
||||
.. autofunction:: matmul
|
||||
.. autofunction:: mm
|
||||
.. autofunction:: mv
|
||||
.. autofunction:: orgqr
|
||||
|
||||
@@ -1,78 +1,129 @@
torchvision.datasets
====================

All datasets are subclasses of :class:`torch.utils.data.Dataset`
i.e., they have ``__getitem__`` and ``__len__`` methods implemented.
Hence, they can all be passed to a :class:`torch.utils.data.DataLoader`
which can load multiple samples in parallel using ``torch.multiprocessing`` workers.
For example: ::

    imagenet_data = torchvision.datasets.ImageFolder('path/to/imagenet_root/')
    data_loader = torch.utils.data.DataLoader(imagenet_data,
                                              batch_size=4,
                                              shuffle=True,
                                              num_workers=args.nThreads)

The following dataset loaders are available:

The following datasets are available:
- `MNIST`_
- `COCO (Captioning and Detection)`_
- `LSUN Classification`_
- `ImageFolder`_
- `Imagenet-12`_
- `CIFAR10 and CIFAR100`_
- `STL10`_

.. contents:: Datasets
    :local:

Datasets have the API:

All the datasets have a very similar API. They all have two common arguments:
``transform`` and ``target_transform`` to transform the input and target respectively.
- ``__getitem__``
- ``__len__``
They all subclass from ``torch.utils.data.Dataset``
Hence, they can all be multi-threaded (python multiprocessing) using
standard torch.utils.data.DataLoader.

For example:

.. currentmodule:: torchvision.datasets
``torch.utils.data.DataLoader(coco_cap, batch_size=args.batchSize, shuffle=True, num_workers=args.nThreads)``

In the constructor, each dataset has a slightly different API as needed,
but they all take the keyword args (see the sketch after this list):

- ``transform`` - a function that takes in an image and returns a
  transformed version
  - common stuff like ``ToTensor``, ``RandomCrop``, etc. These can be
    composed together with ``transforms.Compose`` (see transforms section
    below)
- ``target_transform`` - a function that takes in the target and
  transforms it. For example, take in the caption string and return a
  tensor of word indices.
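
A minimal sketch of those keyword arguments in use; the root directory and the
toy ``target_transform`` are placeholders::

    import torchvision.datasets as dset
    import torchvision.transforms as transforms

    mnist = dset.MNIST(root='./data',                     # placeholder directory
                       train=True,
                       transform=transforms.ToTensor(),   # PIL image -> FloatTensor
                       target_transform=lambda y: y % 2,  # e.g. map labels to odd/even
                       download=True)
    img, target = mnist[0]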
MNIST
|
||||
~~~~~
|
||||
|
||||
.. autoclass:: MNIST
|
||||
``dset.MNIST(root, train=True, transform=None, target_transform=None, download=False)``
|
||||
|
||||
- ``root`` : root directory of dataset where ``processed/training.pt`` and ``processed/test.pt`` exist.
|
||||
- ``train`` : ``True`` = Training set, ``False`` = Test set
|
||||
- ``download`` : ``True`` = downloads the dataset from the internet and puts it in root directory. If dataset already downloaded, place the processed dataset (function available in mnist.py) in the ``processed`` folder.
|
||||
|
||||
COCO
|
||||
~~~~
|
||||
|
||||
.. note ::
|
||||
These require the `COCO API to be installed`_
|
||||
This requires the `COCO API to be installed`_
|
||||
|
||||
.. _COCO API to be installed: https://github.com/pdollar/coco/tree/master/PythonAPI
|
||||
|
||||
|
||||
Captions
|
||||
^^^^^^^^
|
||||
|
||||
.. autoclass:: CocoCaptions
|
||||
:members: __getitem__
|
||||
:special-members:
|
||||
|
||||
|
||||
Detection
|
||||
Captions:
|
||||
^^^^^^^^^
|
||||
|
||||
.. autoclass:: CocoDetection
|
||||
:members: __getitem__
|
||||
:special-members:
|
||||
``dset.CocoCaptions(root="dir where images are", annFile="json annotation file", [transform, target_transform])``
|
||||
|
||||
Example:
|
||||
|
||||
.. code:: python
|
||||
|
||||
import torchvision.datasets as dset
|
||||
import torchvision.transforms as transforms
|
||||
cap = dset.CocoCaptions(root = 'dir where images are',
|
||||
annFile = 'json annotation file',
|
||||
transform=transforms.ToTensor())
|
||||
|
||||
print('Number of samples: ', len(cap))
|
||||
img, target = cap[3] # load 4th sample
|
||||
|
||||
print("Image Size: ", img.size())
|
||||
print(target)
|
||||
|
||||
Output:
|
||||
|
||||
::
|
||||
|
||||
Number of samples: 82783
|
||||
Image Size: (3L, 427L, 640L)
|
||||
[u'A plane emitting smoke stream flying over a mountain.',
|
||||
u'A plane darts across a bright blue sky behind a mountain covered in snow',
|
||||
u'A plane leaves a contrail above the snowy mountain top.',
|
||||
u'A mountain that has a plane flying overheard in the distance.',
|
||||
u'A mountain view with a plume of smoke in the background']
|
||||
|
||||
Detection:
|
||||
^^^^^^^^^^
|
||||
|
||||
``dset.CocoDetection(root="dir where images are", annFile="json annotation file", [transform, target_transform])``
|
||||
|
||||
LSUN
|
||||
~~~~
|
||||
|
||||
.. autoclass:: LSUN
|
||||
:members: __getitem__
|
||||
:special-members:
|
||||
``dset.LSUN(db_path, classes='train', [transform, target_transform])``
|
||||
|
||||
- db\_path = root directory for the database files
|
||||
- ``classes`` = ``‘train’`` (all categories, training set), ``‘val’`` (all categories, validation set), ``‘test’`` (all categories, test set)
|
||||
- [``‘bedroom\_train’``, ``‘church\_train’``, …] : a list of categories to load
|
||||
|
||||
ImageFolder
~~~~~~~~~~~

.. autoclass:: ImageFolder
    :members: __getitem__
    :special-members:
A generic data loader where the images are arranged in this way:

::

    root/dog/xxx.png
    root/dog/xxy.png
    root/dog/xxz.png

    root/cat/123.png
    root/cat/nsdf3.png
    root/cat/asd932_.png

``dset.ImageFolder(root="root folder path", [transform, target_transform])``

It has the members (see the usage sketch below):

- ``self.classes`` - The class names as a list
- ``self.class_to_idx`` - Corresponding class indices
- ``self.imgs`` - The list of (image path, class-index) tuples
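
A rough usage sketch, assuming the ``dog``/``cat`` layout shown above under a
hypothetical root directory::

    import torchvision.datasets as dset
    import torchvision.transforms as transforms

    folder = dset.ImageFolder(root='path/to/root',        # hypothetical path
                              transform=transforms.ToTensor())
    print(folder.classes)        # e.g. ['cat', 'dog'] - sorted folder names
    print(folder.class_to_idx)   # e.g. {'cat': 0, 'dog': 1}
    img, class_index = folder[0]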
Imagenet-12
|
||||
~~~~~~~~~~~
|
||||
|
||||
This should simply be implemented with an ``ImageFolder`` dataset.
|
||||
This is simply implemented with an ImageFolder dataset.
|
||||
|
||||
The data is preprocessed `as described
|
||||
here <https://github.com/facebook/fb.resnet.torch/blob/master/INSTALL.md#download-the-imagenet-dataset>`__
|
||||
|
||||
@ -82,31 +133,30 @@ example <https://github.com/pytorch/examples/blob/27e2a46c1d1505324032b1d94fc6ce
|
||||
CIFAR
|
||||
~~~~~
|
||||
|
||||
.. autoclass:: CIFAR10
|
||||
:members: __getitem__
|
||||
:special-members:
|
||||
``dset.CIFAR10(root, train=True, transform=None, target_transform=None, download=False)``
|
||||
|
||||
``dset.CIFAR100(root, train=True, transform=None, target_transform=None, download=False)``
|
||||
|
||||
- ``root`` : root directory of dataset where there is folder
|
||||
``cifar-10-batches-py``
|
||||
- ``train`` : ``True`` = Training set, ``False`` = Test set
|
||||
- ``download`` : ``True`` = downloads the dataset from the internet and
|
||||
puts it in root directory. If dataset already downloaded, doesn't do anything.
|
||||
|
||||
STL10
|
||||
~~~~~
|
||||
|
||||
``dset.STL10(root, split='train', transform=None, target_transform=None, download=False)``
|
||||
|
||||
.. autoclass:: STL10
|
||||
:members: __getitem__
|
||||
:special-members:
|
||||
|
||||
SVHN
|
||||
~~~~~
|
||||
|
||||
|
||||
.. autoclass:: SVHN
|
||||
:members: __getitem__
|
||||
:special-members:
|
||||
|
||||
PhotoTour
|
||||
~~~~~~~~~
|
||||
|
||||
|
||||
.. autoclass:: PhotoTour
|
||||
:members: __getitem__
|
||||
:special-members:
|
||||
- ``root`` : root directory of dataset where there is folder ``stl10_binary``
|
||||
- ``split`` : ``'train'`` = Training set, ``'test'`` = Test set, ``'unlabeled'`` = Unlabeled set, ``'train+unlabeled'`` = Training + Unlabeled set (missing label marked as ``-1``)
|
||||
- ``download`` : ``True`` = downloads the dataset from the internet and puts it in root directory. If dataset already downloaded, doesn't do anything.
|
||||
|
||||
.. _MNIST: #mnist
|
||||
.. _COCO (Captioning and Detection): #coco
|
||||
.. _LSUN Classification: #lsun
|
||||
.. _ImageFolder: #imagefolder
|
||||
.. _Imagenet-12: #imagenet-12
|
||||
.. _CIFAR10 and CIFAR100: #cifar
|
||||
.. _STL10: #stl10
|
||||
.. _COCO API to be installed: https://github.com/pdollar/coco/tree/master/PythonAPI
|
||||
@ -1,12 +1,11 @@
|
||||
torchvision.models
|
||||
===================
|
||||
|
||||
|
||||
.. currentmodule:: torchvision.models
|
||||
|
||||
|
||||
.. automodule:: torchvision.models
|
||||
:members: alexnet, resnet18, resnet34, resnet50, resnet101, resnet152,
|
||||
vgg11, vgg11_bn, vgg13, vgg13_bn, vgg16, vgg16_bn, vgg19,
|
||||
vgg19_bn, inception_v3, squeezenet1_0, squeezenet1_1, densenet121,
|
||||
densenet169, densenet201, densenet161
|
||||
vgg19_bn
|
||||
:undoc-members:
|
||||
|
||||
@@ -3,8 +3,6 @@ torchvision.transforms

.. currentmodule:: torchvision.transforms

Transforms are common image transforms. They can be chained together using :class:`Compose`

.. autoclass:: Compose
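
For reference, a small sketch of chaining transforms with ``Compose``; the crop
size and normalization statistics are made-up values::

    import torchvision.transforms as transforms

    preprocess = transforms.Compose([
        transforms.RandomCrop(224),                    # example crop size
        transforms.ToTensor(),                         # PIL image -> FloatTensor in [0, 1]
        transforms.Normalize(mean=[0.5, 0.5, 0.5],     # made-up statistics
                             std=[0.5, 0.5, 0.5]),
    ])
    # tensor_image = preprocess(pil_image)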
Transforms on PIL.Image
|
||||
@ -26,20 +24,14 @@ Transforms on torch.\*Tensor
|
||||
----------------------------
|
||||
|
||||
.. autoclass:: Normalize
|
||||
:members: __call__
|
||||
:special-members:
|
||||
|
||||
|
||||
Conversion Transforms
|
||||
---------------------
|
||||
|
||||
.. autoclass:: ToTensor
|
||||
:members: __call__
|
||||
:special-members:
|
||||
|
||||
.. autoclass:: ToPILImage
|
||||
:members: __call__
|
||||
:special-members:
|
||||
|
||||
Generic Transforms
|
||||
------------------
|
||||
|
||||
86
setup.py
@ -15,24 +15,12 @@ import os
|
||||
from tools.setup_helpers.env import check_env_flag
|
||||
from tools.setup_helpers.cuda import WITH_CUDA, CUDA_HOME
|
||||
from tools.setup_helpers.cudnn import WITH_CUDNN, CUDNN_LIB_DIR, CUDNN_INCLUDE_DIR
|
||||
from tools.setup_helpers.split_types import split_types
|
||||
DEBUG = check_env_flag('DEBUG')
|
||||
WITH_DISTRIBUTED = not check_env_flag('NO_DISTRIBUTED')
|
||||
WITH_DISTRIBUTED = check_env_flag('WITH_DISTRIBUTED')
|
||||
WITH_DISTRIBUTED_MW = WITH_DISTRIBUTED and check_env_flag('WITH_DISTRIBUTED_MW')
|
||||
WITH_NCCL = WITH_CUDA and platform.system() != 'Darwin'
|
||||
SYSTEM_NCCL = False
|
||||
|
||||
|
||||
################################################################################
|
||||
# Workaround setuptools -Wstrict-prototypes warnings
|
||||
# I lifted this code from https://stackoverflow.com/a/29634231/23845
|
||||
################################################################################
|
||||
import distutils.sysconfig
|
||||
cfg_vars = distutils.sysconfig.get_config_vars()
|
||||
for key, value in cfg_vars.items():
|
||||
if type(value) == str:
|
||||
cfg_vars[key] = value.replace("-Wstrict-prototypes", "")
|
||||
|
||||
################################################################################
|
||||
# Monkey-patch setuptools to compile in parallel
|
||||
################################################################################
|
||||
@ -156,10 +144,6 @@ class build_ext(setuptools.command.build_ext.build_ext):
|
||||
print('-- Building NCCL library')
|
||||
else:
|
||||
print('-- Not using NCCL')
|
||||
if WITH_DISTRIBUTED:
|
||||
print('-- Building with distributed package ')
|
||||
else:
|
||||
print('-- Building without distributed package')
|
||||
|
||||
# cwrap depends on pyyaml, so we can't import it earlier
|
||||
from tools.cwrap import cwrap
|
||||
@ -171,14 +155,10 @@ class build_ext(setuptools.command.build_ext.build_ext):
|
||||
from tools.cwrap.plugins.NullableArguments import NullableArguments
|
||||
from tools.cwrap.plugins.CuDNNPlugin import CuDNNPlugin
|
||||
from tools.cwrap.plugins.WrapDim import WrapDim
|
||||
from tools.cwrap.plugins.AssertNDim import AssertNDim
|
||||
from tools.cwrap.plugins.Broadcast import Broadcast
|
||||
from tools.cwrap.plugins.ProcessorSpecificPlugin import ProcessorSpecificPlugin
|
||||
thp_plugin = THPPlugin()
|
||||
cwrap('torch/csrc/generic/TensorMethods.cwrap', plugins=[
|
||||
ProcessorSpecificPlugin(), BoolOption(), thp_plugin,
|
||||
AutoGPU(condition='IS_CUDA'), ArgcountSortPlugin(), KwargsPlugin(),
|
||||
AssertNDim(), WrapDim(), Broadcast()
|
||||
BoolOption(), thp_plugin, AutoGPU(condition='IS_CUDA'),
|
||||
ArgcountSortPlugin(), KwargsPlugin(), WrapDim()
|
||||
])
|
||||
cwrap('torch/csrc/cudnn/cuDNN.cwrap', plugins=[
|
||||
CuDNNPlugin(), NullableArguments()
|
||||
@ -225,10 +205,11 @@ class clean(distutils.command.clean.clean):
|
||||
include_dirs = []
|
||||
library_dirs = []
|
||||
extra_link_args = []
|
||||
extra_compile_args = ['-std=c++11', '-Wno-write-strings',
|
||||
# Python 2.6 requires -fno-strict-aliasing, see
|
||||
# http://legacy.python.org/dev/peps/pep-3123/
|
||||
'-fno-strict-aliasing']
|
||||
extra_compile_args = ['-std=c++11', '-Wno-write-strings']
|
||||
if os.getenv('PYTORCH_BINARY_BUILD') and platform.system() == 'Linux':
|
||||
print('PYTORCH_BINARY_BUILD found. Static linking libstdc++ on Linux')
|
||||
extra_compile_args += ['-static-libstdc++']
|
||||
extra_link_args += ['-static-libstdc++']
|
||||
|
||||
cwd = os.path.dirname(os.path.abspath(__file__))
|
||||
lib_path = os.path.join(cwd, "torch", "lib")
|
||||
@ -241,7 +222,6 @@ include_dirs += [
|
||||
tmp_install_path + "/include/TH",
|
||||
tmp_install_path + "/include/THPP",
|
||||
tmp_install_path + "/include/THNN",
|
||||
tmp_install_path + "/include/ATen",
|
||||
]
|
||||
|
||||
library_dirs.append(lib_path)
|
||||
@ -254,10 +234,7 @@ THCS_LIB = os.path.join(lib_path, 'libTHCS.so.1')
|
||||
THNN_LIB = os.path.join(lib_path, 'libTHNN.so.1')
|
||||
THCUNN_LIB = os.path.join(lib_path, 'libTHCUNN.so.1')
|
||||
THPP_LIB = os.path.join(lib_path, 'libTHPP.so.1')
|
||||
ATEN_LIB = os.path.join(lib_path, 'libATen.so.1')
|
||||
GLOO_LIB = os.path.join(lib_path, 'libgloo.a')
|
||||
GLOO_CUDA_LIB = os.path.join(lib_path, 'libgloo_cuda.a')
|
||||
THD_LIB = os.path.join(lib_path, 'libTHD.a')
|
||||
THD_LIB = os.path.join(lib_path, 'libTHD.so.1')
|
||||
NCCL_LIB = os.path.join(lib_path, 'libnccl.so.1')
|
||||
if platform.system() == 'Darwin':
|
||||
TH_LIB = os.path.join(lib_path, 'libTH.1.dylib')
|
||||
@ -267,26 +244,26 @@ if platform.system() == 'Darwin':
|
||||
THNN_LIB = os.path.join(lib_path, 'libTHNN.1.dylib')
|
||||
THCUNN_LIB = os.path.join(lib_path, 'libTHCUNN.1.dylib')
|
||||
THPP_LIB = os.path.join(lib_path, 'libTHPP.1.dylib')
|
||||
ATEN_LIB = os.path.join(lib_path, 'libATen.1.dylib')
|
||||
THD_LIB = os.path.join(lib_path, 'libTHD.1.dylib')
|
||||
NCCL_LIB = os.path.join(lib_path, 'libnccl.1.dylib')
|
||||
|
||||
if WITH_NCCL and subprocess.call('ldconfig -p | grep libnccl >/dev/null', shell=True) == 0:
|
||||
SYSTEM_NCCL = True
|
||||
SYSTEM_NCCL = True
|
||||
|
||||
main_compile_args = ['-D_THP_CORE']
|
||||
main_libraries = ['shm']
|
||||
main_link_args = [TH_LIB, THS_LIB, THPP_LIB, THNN_LIB, ATEN_LIB]
|
||||
main_link_args = [TH_LIB, THS_LIB, THPP_LIB, THNN_LIB]
|
||||
main_sources = [
|
||||
"torch/csrc/PtrWrapper.cpp",
|
||||
"torch/csrc/Module.cpp",
|
||||
"torch/csrc/Generator.cpp",
|
||||
"torch/csrc/Size.cpp",
|
||||
"torch/csrc/Exceptions.cpp",
|
||||
"torch/csrc/Tensor.cpp",
|
||||
"torch/csrc/Storage.cpp",
|
||||
"torch/csrc/DynamicTypes.cpp",
|
||||
"torch/csrc/byte_order.cpp",
|
||||
"torch/csrc/utils.cpp",
|
||||
"torch/csrc/expand_utils.cpp",
|
||||
"torch/csrc/utils/object_ptr.cpp",
|
||||
"torch/csrc/utils/tuple_parser.cpp",
|
||||
"torch/csrc/allocators.cpp",
|
||||
@ -295,7 +272,7 @@ main_sources = [
|
||||
"torch/csrc/autograd/engine.cpp",
|
||||
"torch/csrc/autograd/function.cpp",
|
||||
"torch/csrc/autograd/variable.cpp",
|
||||
"torch/csrc/autograd/input_buffer.cpp",
|
||||
"torch/csrc/autograd/grad_buffer.cpp",
|
||||
"torch/csrc/autograd/python_function.cpp",
|
||||
"torch/csrc/autograd/python_cpp_function.cpp",
|
||||
"torch/csrc/autograd/python_variable.cpp",
|
||||
@ -303,14 +280,9 @@ main_sources = [
|
||||
"torch/csrc/autograd/python_hook.cpp",
|
||||
"torch/csrc/autograd/functions/batch_normalization.cpp",
|
||||
"torch/csrc/autograd/functions/convolution.cpp",
|
||||
"torch/csrc/autograd/functions/basic_ops.cpp",
|
||||
"torch/csrc/autograd/functions/tensor.cpp",
|
||||
"torch/csrc/autograd/functions/accumulate_grad.cpp",
|
||||
"torch/csrc/autograd/functions/utils.cpp",
|
||||
"torch/csrc/autograd/functions/init.cpp",
|
||||
"torch/csrc/nn/THNN_generic.cpp",
|
||||
]
|
||||
main_sources += split_types("torch/csrc/Tensor.cpp")
|
||||
|
||||
try:
|
||||
import numpy as np
|
||||
@ -331,11 +303,8 @@ if WITH_DISTRIBUTED:
|
||||
"torch/csrc/distributed/Tensor.cpp",
|
||||
"torch/csrc/distributed/Storage.cpp",
|
||||
]
|
||||
extra_compile_args += ['-DWITH_DISTRIBUTED_MW']
|
||||
include_dirs += [tmp_install_path + "/include/THD"]
|
||||
main_link_args += [THD_LIB]
|
||||
if platform.system() == 'Linux':
|
||||
main_link_args += [GLOO_LIB]
|
||||
|
||||
if WITH_CUDA:
|
||||
cuda_lib_dirs = ['lib64', 'lib']
|
||||
@ -350,20 +319,17 @@ if WITH_CUDA:
|
||||
extra_link_args.append('-Wl,-rpath,' + cuda_lib_path)
|
||||
extra_compile_args += ['-DWITH_CUDA']
|
||||
extra_compile_args += ['-DCUDA_LIB_PATH=' + cuda_lib_path]
|
||||
main_libraries += ['cudart', 'nvToolsExt']
|
||||
main_libraries += ['cudart']
|
||||
main_link_args += [THC_LIB, THCS_LIB, THCUNN_LIB]
|
||||
if platform.system() == 'Linux':
|
||||
main_link_args += [GLOO_CUDA_LIB]
|
||||
main_sources += [
|
||||
"torch/csrc/cuda/Module.cpp",
|
||||
"torch/csrc/cuda/Storage.cpp",
|
||||
"torch/csrc/cuda/Stream.cpp",
|
||||
"torch/csrc/cuda/Tensor.cpp",
|
||||
"torch/csrc/cuda/AutoGPU.cpp",
|
||||
"torch/csrc/cuda/utils.cpp",
|
||||
"torch/csrc/cuda/expand_utils.cpp",
|
||||
"torch/csrc/cuda/serialization.cpp",
|
||||
]
|
||||
main_sources += split_types("torch/csrc/cuda/Tensor.cpp")
|
||||
|
||||
if WITH_NCCL:
|
||||
if SYSTEM_NCCL:
|
||||
@ -380,8 +346,6 @@ if WITH_CUDNN:
|
||||
"torch/csrc/cudnn/BatchNorm.cpp",
|
||||
"torch/csrc/cudnn/Conv.cpp",
|
||||
"torch/csrc/cudnn/cuDNN.cpp",
|
||||
"torch/csrc/cudnn/GridSampler.cpp",
|
||||
"torch/csrc/cudnn/AffineGridGenerator.cpp",
|
||||
"torch/csrc/cudnn/Types.cpp",
|
||||
"torch/csrc/cudnn/Handles.cpp",
|
||||
]
|
||||
@ -391,18 +355,6 @@ if DEBUG:
|
||||
extra_compile_args += ['-O0', '-g']
|
||||
extra_link_args += ['-O0', '-g']
|
||||
|
||||
if os.getenv('PYTORCH_BINARY_BUILD') and platform.system() == 'Linux':
|
||||
print('PYTORCH_BINARY_BUILD found. Static linking libstdc++ on Linux')
|
||||
# get path of libstdc++ and link manually.
|
||||
# for reasons unknown, -static-libstdc++ doesn't fully link some symbols
|
||||
CXXNAME = os.getenv('CXX', 'g++')
|
||||
STDCPP_LIB = subprocess.check_output([CXXNAME, '-print-file-name=libstdc++.a'])
|
||||
STDCPP_LIB = STDCPP_LIB[:-1]
|
||||
if type(STDCPP_LIB) != str: # python 3
|
||||
STDCPP_LIB = STDCPP_LIB.decode(sys.stdout.encoding)
|
||||
main_link_args += [STDCPP_LIB]
|
||||
version_script = os.path.abspath("tools/pytorch.version")
|
||||
extra_link_args += ['-Wl,--version-script=' + version_script]
|
||||
|
||||
def make_relative_rpath(path):
|
||||
if platform.system() == 'Darwin':
|
||||
@ -415,7 +367,7 @@ def make_relative_rpath(path):
|
||||
################################################################################
|
||||
|
||||
extensions = []
|
||||
packages = find_packages(exclude=('tools', 'tools.*',))
|
||||
packages = find_packages(exclude=('tools.*',))
|
||||
|
||||
C = Extension("torch._C",
|
||||
libraries=main_libraries,
|
||||
@ -462,7 +414,7 @@ if WITH_CUDA:
|
||||
)
|
||||
extensions.append(THCUNN)
|
||||
|
||||
version = '0.2.0'
|
||||
version = '0.1.12'
|
||||
if os.getenv('PYTORCH_BUILD_VERSION'):
|
||||
assert os.getenv('PYTORCH_BUILD_NUMBER') is not None
|
||||
version = os.getenv('PYTORCH_BUILD_VERSION') \
|
||||
@ -495,5 +447,5 @@ setup(name="torch", version=version,
|
||||
'lib/*.h',
|
||||
'lib/include/TH/*.h', 'lib/include/TH/generic/*.h',
|
||||
'lib/include/THC/*.h', 'lib/include/THC/generic/*.h']},
|
||||
install_requires=['pyyaml', 'numpy'],
|
||||
install_requires=['pyyaml'],
|
||||
)
|
||||
|
||||
@ -15,28 +15,15 @@ from torch.autograd import Variable
|
||||
|
||||
torch.set_default_tensor_type('torch.DoubleTensor')
|
||||
|
||||
SEED = 0
|
||||
SEED_SET = 0
|
||||
|
||||
|
||||
def parse_set_seed_once():
|
||||
global SEED
|
||||
global SEED_SET
|
||||
def run_tests():
|
||||
parser = argparse.ArgumentParser(add_help=False)
|
||||
parser.add_argument('--seed', type=int, default=123)
|
||||
args, remaining = parser.parse_known_args()
|
||||
if SEED_SET == 0:
|
||||
torch.manual_seed(args.seed)
|
||||
if torch.cuda.is_available():
|
||||
torch.cuda.manual_seed_all(args.seed)
|
||||
SEED = args.seed
|
||||
SEED_SET = 1
|
||||
torch.manual_seed(args.seed)
|
||||
if torch.cuda.is_available():
|
||||
torch.cuda.manual_seed_all(args.seed)
|
||||
remaining = [sys.argv[0]] + remaining
|
||||
return remaining
|
||||
|
||||
|
||||
def run_tests():
|
||||
remaining = parse_set_seed_once()
|
||||
unittest.main(argv=remaining)
|
||||
|
||||
|
||||
@ -90,7 +77,7 @@ def to_gpu(obj, type_map={}):
|
||||
elif torch.is_storage(obj):
|
||||
return obj.new().resize_(obj.size()).copy_(obj)
|
||||
elif isinstance(obj, Variable):
|
||||
assert obj.is_leaf
|
||||
assert obj.creator is None
|
||||
t = type_map.get(type(obj.data), get_gpu_type(type(obj.data)))
|
||||
return Variable(obj.data.clone().type(t), requires_grad=obj.requires_grad)
|
||||
elif isinstance(obj, list):
|
||||
@ -131,11 +118,6 @@ def is_iterable(obj):
|
||||
class TestCase(unittest.TestCase):
|
||||
precision = 1e-5
|
||||
|
||||
def setUp(self):
|
||||
torch.manual_seed(SEED)
|
||||
if torch.cuda.is_available():
|
||||
torch.cuda.manual_seed_all(SEED)
|
||||
|
||||
def assertTensorsSlowEqual(self, x, y, prec=None, message=''):
|
||||
max_err = 0
|
||||
self.assertEqual(x.size(), y.size())
|
||||
@ -147,7 +129,7 @@ class TestCase(unittest.TestCase):
|
||||
tc = t.coalesce()
|
||||
|
||||
value_map = {}
|
||||
for idx, val in zip(t._indices().t(), t._values()):
|
||||
for idx, val in zip(t.indices().t(), t.values()):
|
||||
idx_tup = tuple(idx)
|
||||
if idx_tup in value_map:
|
||||
value_map[idx_tup] += val
|
||||
@ -156,31 +138,26 @@ class TestCase(unittest.TestCase):
|
||||
|
||||
new_indices = sorted(list(value_map.keys()))
|
||||
new_values = [value_map[idx] for idx in new_indices]
|
||||
if t._values().ndimension() < 2:
|
||||
new_values = t._values().new(new_values)
|
||||
if t.values().ndimension() < 2:
|
||||
new_values = t.values().new(new_values)
|
||||
else:
|
||||
new_values = torch.stack(new_values)
|
||||
|
||||
new_indices = t._indices().new(new_indices).t()
|
||||
new_indices = t.indices().new(new_indices).t()
|
||||
tg = t.new(new_indices, new_values, t.size())
|
||||
|
||||
self.assertEqual(tc._indices(), tg._indices())
|
||||
self.assertEqual(tc._values(), tg._values())
|
||||
self.assertEqual(tc.indices(), tg.indices())
|
||||
self.assertEqual(tc.values(), tg.values())
|
||||
|
||||
return tg
|
||||
|
||||
def unwrapVariables(self, x, y):
|
||||
if isinstance(x, Variable) and isinstance(y, Variable):
|
||||
return x.data, y.data
|
||||
elif isinstance(x, Variable) or isinstance(y, Variable):
|
||||
raise AssertionError("cannot compare {} and {}".format(type(x), type(y)))
|
||||
return x, y
|
||||
|
||||
def assertEqual(self, x, y, prec=None, message=''):
|
||||
if prec is None:
|
||||
prec = self.precision
|
||||
|
||||
x, y = self.unwrapVariables(x, y)
|
||||
if isinstance(x, Variable) and isinstance(y, Variable):
|
||||
x = x.data
|
||||
y = y.data
|
||||
|
||||
if torch.is_tensor(x) and torch.is_tensor(y):
|
||||
def assertTensorsEqual(a, b):
|
||||
@ -201,16 +178,13 @@ class TestCase(unittest.TestCase):
|
||||
if x.is_sparse:
|
||||
x = self.safeCoalesce(x)
|
||||
y = self.safeCoalesce(y)
|
||||
assertTensorsEqual(x._indices(), y._indices())
|
||||
assertTensorsEqual(x._values(), y._values())
|
||||
assertTensorsEqual(x.indices(), y.indices())
|
||||
assertTensorsEqual(x.values(), y.values())
|
||||
else:
|
||||
assertTensorsEqual(x, y)
|
||||
elif type(x) == str and type(y) == str:
|
||||
super(TestCase, self).assertEqual(x, y)
|
||||
elif type(x) == set and type(y) == set:
|
||||
super(TestCase, self).assertEqual(x, y)
|
||||
elif is_iterable(x) and is_iterable(y):
|
||||
super(TestCase, self).assertEqual(len(x), len(y))
|
||||
for x_, y_ in zip(x, y):
|
||||
self.assertEqual(x_, y_, prec, message)
|
||||
else:
|
||||
@ -225,7 +199,9 @@ class TestCase(unittest.TestCase):
|
||||
if prec is None:
|
||||
prec = self.precision
|
||||
|
||||
x, y = self.unwrapVariables(x, y)
|
||||
if isinstance(x, Variable) and isinstance(y, Variable):
|
||||
x = x.data
|
||||
y = y.data
|
||||
|
||||
if torch.is_tensor(x) and torch.is_tensor(y):
|
||||
if x.size() != y.size():
|
||||
@ -259,33 +235,24 @@ class TestCase(unittest.TestCase):
|
||||
return
|
||||
raise AssertionError("object not found in iterable")
|
||||
|
||||
if sys.version_info < (3, 2):
|
||||
# assertRaisesRegexp renamed assertRaisesRegex in 3.2
|
||||
assertRaisesRegex = unittest.TestCase.assertRaisesRegexp
|
||||
|
||||
|
||||
def download_file(url, binary=True):
|
||||
def download_file(url, path, binary=True):
|
||||
if sys.version_info < (3,):
|
||||
from urlparse import urlsplit
|
||||
import urllib2
|
||||
request = urllib2
|
||||
error = urllib2
|
||||
else:
|
||||
from urllib.parse import urlsplit
|
||||
from urllib import request, error
|
||||
|
||||
filename = os.path.basename(urlsplit(url)[2])
|
||||
data_dir = os.path.join(os.path.dirname(__file__), 'data')
|
||||
path = os.path.join(data_dir, filename)
|
||||
import urllib.request
|
||||
import urllib.error
|
||||
request = urllib.request
|
||||
error = urllib.error
|
||||
|
||||
if os.path.exists(path):
|
||||
return path
|
||||
return True
|
||||
try:
|
||||
data = request.urlopen(url, timeout=15).read()
|
||||
with open(path, 'wb' if binary else 'w') as f:
|
||||
f.write(data)
|
||||
return path
|
||||
except error.URLError:
|
||||
msg = "could not download test file '{}'".format(url)
|
||||
warnings.warn(msg, RuntimeWarning)
|
||||
raise unittest.SkipTest(msg)
|
||||
return True
|
||||
except error.URLError as e:
|
||||
return False
|
||||
|
||||
@ -53,31 +53,29 @@ module_tests = [
|
||||
dict(
|
||||
module_name='ReLU',
|
||||
input_size=(2, 3, 4, 5),
|
||||
check_inplace=True,
|
||||
check_inplace=True
|
||||
),
|
||||
dict(
|
||||
module_name='ReLU6',
|
||||
input_size=(2, 3, 4, 5),
|
||||
check_inplace=True,
|
||||
check_inplace=True
|
||||
),
|
||||
dict(
|
||||
module_name='RReLU',
|
||||
input_size=(1, 2, 2),
|
||||
test_cuda=False,
|
||||
check_gradgrad=False,
|
||||
test_cuda=False
|
||||
),
|
||||
dict(
|
||||
module_name='RReLU',
|
||||
constructor_args=(0.1, 0.9),
|
||||
input_size=(4, 4, 5),
|
||||
desc='with_up_down',
|
||||
test_cuda=False,
|
||||
check_gradgrad=False,
|
||||
test_cuda=False
|
||||
),
|
||||
dict(
|
||||
module_name='Hardtanh',
|
||||
input_size=(3, 2, 5),
|
||||
reference_fn=lambda i, _: i.clamp(-1, 1),
|
||||
reference_fn=lambda i, _: i.clamp(-1, 1)
|
||||
),
|
||||
dict(
|
||||
module_name='Sigmoid',
|
||||
@ -90,35 +88,35 @@ module_tests = [
|
||||
dict(
|
||||
module_name='Softmax',
|
||||
input_size=(10, 20),
|
||||
reference_fn=lambda i, _: torch.exp(i).div(torch.exp(i).sum(1, True).expand(10, 20)),
|
||||
reference_fn=lambda i, _: torch.exp(i).div(torch.exp(i).sum(1).expand(10, 20))
|
||||
),
|
||||
dict(
|
||||
module_name='Softmax2d',
|
||||
input_size=(1, 3, 10, 20),
|
||||
reference_fn=lambda i, _: torch.exp(i).div(torch.exp(i).sum(1, False)),
|
||||
reference_fn=lambda i, _: torch.exp(i).div(torch.exp(i).sum(1).expand_as(i))
|
||||
),
|
||||
dict(
|
||||
module_name='LogSoftmax',
|
||||
input_size=(10, 20),
|
||||
reference_fn=lambda i, _: torch.exp(i).div_(torch.exp(i).sum(1, True).expand(10, 20)).log_(),
|
||||
reference_fn=lambda i, _: torch.exp(i).div_(torch.exp(i).sum(1).expand(10, 20)).log_()
|
||||
),
|
||||
dict(
|
||||
module_name='LogSoftmax',
|
||||
input_size=(1, 3, 10, 20),
|
||||
reference_fn=lambda i, _: torch.exp(i).div_(torch.exp(i).sum(1, False)).log_(),
|
||||
desc='multiparam',
|
||||
reference_fn=lambda i, _: torch.exp(i).div_(torch.exp(i).sum(1).expand_as(i)).log_(),
|
||||
desc='multiparam'
|
||||
),
|
||||
dict(
|
||||
module_name='ELU',
|
||||
constructor_args=(2.,),
|
||||
input_size=(3, 2, 5),
|
||||
check_inplace=True
|
||||
),
|
||||
# TODO: reference function
|
||||
dict(
|
||||
module_name='Hardshrink',
|
||||
constructor_args=(2.,),
|
||||
input_size=(4, 3, 2, 4),
|
||||
check_gradgrad=False,
|
||||
input_size=(4, 3, 2, 4)
|
||||
),
|
||||
dict(
|
||||
module_name='LeakyReLU',
|
||||
@ -135,40 +133,34 @@ module_tests = [
|
||||
dict(
|
||||
module_name='LogSigmoid',
|
||||
input_size=(2, 3, 4),
|
||||
reference_fn=lambda i, _: i.sigmoid().log(),
|
||||
check_gradgrad=False,
|
||||
reference_fn=lambda i, _: i.sigmoid().log()
|
||||
),
|
||||
dict(
|
||||
module_name='Softplus',
|
||||
input_size=(10, 20),
|
||||
reference_fn=lambda i, _: torch.log(1 + torch.exp(i)),
|
||||
check_gradgrad=False,
|
||||
reference_fn=lambda i, _: torch.log(1 + torch.exp(i))
|
||||
),
|
||||
dict(
|
||||
module_name='Softplus',
|
||||
constructor_args=(2,),
|
||||
input_size=(10, 20),
|
||||
reference_fn=lambda i, _: 1. / 2. * torch.log(1 + torch.exp(2 * i)),
|
||||
desc='beta',
|
||||
check_gradgrad=False,
|
||||
desc='beta'
|
||||
),
|
||||
dict(
|
||||
module_name='Softshrink',
|
||||
input_size=(3, 2, 5),
|
||||
check_gradgrad=False,
|
||||
input_size=(3, 2, 5)
|
||||
),
|
||||
dict(
|
||||
module_name='Softshrink',
|
||||
constructor_args=(1,),
|
||||
input_size=(3, 2, 5),
|
||||
desc='lambda',
|
||||
check_gradgrad=False,
|
||||
desc='lambda'
|
||||
),
|
||||
dict(
|
||||
module_name='CrossMapLRN2d',
|
||||
constructor_args=(5, 5e-3, 1e-3, 2),
|
||||
input_size=(2, 3, 6, 6),
|
||||
check_gradgrad=False,
|
||||
input_size=(2, 3, 6, 6)
|
||||
),
|
||||
dict(
|
||||
module_name='PReLU',
|
||||
@ -212,12 +204,11 @@ module_tests = [
|
||||
dict(
|
||||
module_name='Softsign',
|
||||
input_size=(3, 2, 5),
|
||||
reference_fn=lambda i, _: i.div(1 + torch.abs(i)),
|
||||
reference_fn=lambda i, _: i.div(1 + torch.abs(i))
|
||||
),
|
||||
dict(
|
||||
module_name='Softmin',
|
||||
input_size=(10, 20),
|
||||
check_gradgrad=False,
|
||||
input_size=(10, 20)
|
||||
),
|
||||
dict(
|
||||
module_name='Tanhshrink',
|
||||
@ -225,32 +216,19 @@ module_tests = [
|
||||
),
|
||||
]
|
||||
|
||||
|
||||
criterion_tests = [
|
||||
dict(module_name='L1Loss',
|
||||
input_size=(2, 3, 4),
|
||||
target=torch.randn(2, 3, 4),
|
||||
reference_fn=lambda i, t, _: 1. / i.numel() *
|
||||
sum((a - b).abs().sum() for a, b in zip(i, t)),
|
||||
sum((a - b).abs().sum() for a, b in zip(i, t))
|
||||
),
|
||||
dict(
|
||||
module_name='NLLLoss',
|
||||
input=torch.rand(15, 10).log(),
|
||||
target=torch.Tensor(15).uniform_().mul(10).floor().long(),
|
||||
),
|
||||
dict(
|
||||
module_name='NLLLoss',
|
||||
constructor_args=(None, False),
|
||||
input=torch.rand(15, 10).log(),
|
||||
target=torch.Tensor(15).uniform_().mul(10).floor().long(),
|
||||
desc='no_size_average'
|
||||
),
|
||||
dict(
|
||||
module_name='NLLLoss',
|
||||
constructor_args=(None, True, 2),
|
||||
input=torch.rand(15, 10).log(),
|
||||
target=torch.Tensor(15).uniform_().mul(10).floor().long(),
|
||||
desc='ignore_index'
|
||||
),
|
||||
dict(
|
||||
module_name='NLLLoss',
|
||||
constructor_args=(torch.rand(10),),
|
||||
@ -258,159 +236,120 @@ criterion_tests = [
|
||||
target=torch.Tensor(15).uniform_().mul(10).floor().long(),
|
||||
desc='weights',
|
||||
),
|
||||
dict(
|
||||
module_name='NLLLoss',
|
||||
constructor_args=(torch.rand(10), True, 2),
|
||||
input=torch.rand(15, 10).add(1e-2).log(),
|
||||
target=torch.Tensor(15).uniform_().mul(10).floor().long(),
|
||||
desc='weights_ignore_index'
|
||||
),
|
||||
dict(
|
||||
module_name='NLLLoss',
|
||||
constructor_args=(torch.rand(10), True, -1),
|
||||
input=torch.rand(15, 10).add(1e-2).log(),
|
||||
target=torch.Tensor(15).uniform_().mul(10 + 1).floor().long() - 1,
|
||||
desc='weights_ignore_index_neg'
|
||||
),
|
||||
dict(
|
||||
module_name='KLDivLoss',
|
||||
input=torch.rand(10, 10).log(),
|
||||
target=torch.rand(10, 10),
|
||||
check_gradgrad=False,
|
||||
target=torch.rand(10, 10)
|
||||
),
|
||||
dict(
|
||||
module_name='MSELoss',
|
||||
input=torch.randn(2, 3, 4, 5),
|
||||
target=torch.randn(2, 3, 4, 5),
|
||||
reference_fn=lambda i, t, _: (i - t).abs().pow(2).sum() / i.numel(),
|
||||
check_gradgrad=False,
|
||||
reference_fn=lambda i, t, _: (i - t).abs().pow(2).sum() / i.numel()
|
||||
),
|
||||
dict(
|
||||
module_name='BCELoss',
|
||||
input=torch.rand(15, 10).clamp_(1e-2, 1 - 1e-2),
|
||||
target=torch.randn(15, 10).gt(0).double(),
|
||||
check_gradgrad=False,
|
||||
target=torch.randn(15, 10).gt(0).double()
|
||||
),
|
||||
dict(
|
||||
module_name='BCELoss',
|
||||
constructor_args=(torch.rand(10),),
|
||||
input=torch.rand(15, 10).clamp_(1e-2, 1 - 1e-2),
|
||||
target=torch.randn(15, 10).gt(0).double(),
|
||||
desc='weights',
|
||||
check_gradgrad=False,
|
||||
desc='weights'
|
||||
),
|
||||
dict(
|
||||
module_name='CrossEntropyLoss',
|
||||
input=torch.randn(15, 10),
|
||||
target=torch.Tensor(15).uniform_().mul(10).floor().long(),
|
||||
check_gradgrad=False,
|
||||
target=torch.Tensor(15).uniform_().mul(10).floor().long()
|
||||
),
|
||||
dict(
|
||||
module_name='CrossEntropyLoss',
|
||||
constructor_args=(torch.rand(10),),
|
||||
input=torch.randn(15, 10),
|
||||
target=torch.Tensor(15).uniform_().mul(10).floor().long(),
|
||||
desc='weights',
|
||||
check_gradgrad=False,
|
||||
desc='weights'
|
||||
),
|
||||
dict(
|
||||
module_name='NLLLoss2d',
|
||||
input_size=(2, 3, 5, 5),
|
||||
target=torch.rand(2, 5, 5).mul(3).floor().long(),
|
||||
target=torch.rand(2, 5, 5).mul(3).floor().long()
|
||||
),
|
||||
dict(
|
||||
module_name='NLLLoss2d',
|
||||
constructor_args=(torch.rand(3),),
|
||||
input_size=(2, 3, 5, 5),
|
||||
target=torch.rand(2, 5, 5).mul(3).floor().long(),
|
||||
desc='weights',
|
||||
),
|
||||
dict(
|
||||
module_name='NLLLoss2d',
|
||||
constructor_args=(None, True, 3),
|
||||
input_size=(2, 3, 5, 5),
|
||||
target=torch.rand(2, 5, 5).mul(4).floor().long(),
|
||||
desc='ignore_index',
|
||||
desc='weights'
|
||||
),
|
||||
dict(
|
||||
module_name='HingeEmbeddingLoss',
|
||||
input=torch.rand(10),
|
||||
target=torch.randn(10).gt(0).double().mul_(2).sub(1),
|
||||
check_gradgrad=False,
|
||||
target=torch.randn(10).gt(0).double().mul_(2).sub(1)
|
||||
),
|
||||
dict(
|
||||
module_name='HingeEmbeddingLoss',
|
||||
constructor_args=(0.5,),
|
||||
input=torch.rand(10),
|
||||
target=torch.randn(10).gt(0).double().mul_(2).sub(1),
|
||||
desc='margin',
|
||||
check_gradgrad=False,
|
||||
desc='margin'
|
||||
),
|
||||
dict(
|
||||
module_name='MultiLabelMarginLoss',
|
||||
input_size=(5, 10),
|
||||
target=torch.rand(5, 10).mul(10).floor().long(),
|
||||
check_gradgrad=False,
|
||||
target=torch.rand(5, 10).mul(10).floor().long()
|
||||
),
|
||||
dict(
|
||||
module_name='MultiLabelSoftMarginLoss',
|
||||
input_size=(5, 10),
|
||||
target=torch.rand(5, 10).mul(2).floor(),
|
||||
check_gradgrad=False,
|
||||
target=torch.rand(5, 10).mul(2).floor()
|
||||
),
|
||||
dict(
|
||||
module_name='MultiLabelSoftMarginLoss',
|
||||
constructor_args=(torch.rand(10),),
|
||||
input_size=(5, 10),
|
||||
target=torch.rand(5, 10).mul(2).floor(),
|
||||
desc='weights',
|
||||
check_gradgrad=False,
|
||||
desc='weights'
|
||||
),
|
||||
dict(
|
||||
module_name='MultiMarginLoss',
|
||||
input_size=(5, 10),
|
||||
target=torch.rand(5).mul(8).floor().long(),
|
||||
check_gradgrad=False,
|
||||
target=torch.rand(5).mul(8).floor().long()
|
||||
),
|
||||
dict(
|
||||
module_name='SmoothL1Loss',
|
||||
input_size=(5, 10),
|
||||
target=torch.randn(5, 10),
|
||||
check_gradgrad=False,
|
||||
target=torch.randn(5, 10)
|
||||
),
|
||||
dict(
|
||||
module_name='SoftMarginLoss',
|
||||
input_size=(5, 5),
|
||||
target=torch.randn(5, 5).sign(),
|
||||
check_gradgrad=False,
|
||||
target=torch.randn(5, 5).sign()
|
||||
),
|
||||
dict(
|
||||
module_name='CosineEmbeddingLoss',
|
||||
input=(torch.rand(15, 10), torch.rand(15, 10)),
|
||||
target=torch.randn(15).sign(),
|
||||
check_gradgrad=False,
|
||||
target=torch.randn(15).sign()
|
||||
),
|
||||
dict(
|
||||
module_name='CosineEmbeddingLoss',
|
||||
constructor_args=(0.7,),
|
||||
input=(torch.rand(15, 10), torch.rand(15, 10)),
|
||||
target=torch.randn(15).sign(),
|
||||
desc='margin',
|
||||
check_gradgrad=False,
|
||||
desc='margin'
|
||||
),
|
||||
dict(
|
||||
module_name='MarginRankingLoss',
|
||||
input=(torch.randn(50).mul(10), torch.randn(50).mul(10)),
|
||||
target=torch.randn(50).sign(),
|
||||
check_gradgrad=False,
|
||||
target=torch.randn(50).sign()
|
||||
),
|
||||
dict(
|
||||
module_name='MarginRankingLoss',
|
||||
constructor_args=(2,),
|
||||
input=(torch.randn(50).mul(10), torch.randn(50).mul(10)),
|
||||
target=torch.randn(50).sign(),
|
||||
desc='margin',
|
||||
check_gradgrad=False,
|
||||
desc='margin'
|
||||
),
|
||||
]
|
||||
|
||||
@ -440,7 +379,6 @@ class NNTestCase(TestCase):
|
||||
if isinstance(input, Variable):
|
||||
if input.requires_grad and input.grad is not None:
|
||||
input.grad.data.zero_()
|
||||
input.grad.detach_()
|
||||
elif torch.is_tensor(input):
|
||||
return
|
||||
else:
|
||||
@ -501,6 +439,7 @@ class NNTestCase(TestCase):
|
||||
return out
|
||||
|
||||
res = tuple()
|
||||
# TODO: enable non-contig tests
|
||||
input = contiguous(input)
|
||||
if jacobian_input:
|
||||
res += get_numerical_jacobian(fw, input, input, eps=1e-6),
|
||||
@ -752,7 +691,6 @@ class CriterionTest(TestBase):
|
||||
test_case.assertEqual(out, expected_out)
|
||||
|
||||
test_case.check_criterion_jacobian(module, input, self.target)
|
||||
self._do_extra_tests(test_case, module, input, self.target)
|
||||
|
||||
def test_cuda(self, test_case):
|
||||
if not TEST_CUDA or not self.should_test_cuda:
|
||||
@ -779,6 +717,3 @@ class CriterionTest(TestBase):
|
||||
test_case.assertEqual(cpu_gradInput, gpu_gradInput, 4e-4)
|
||||
except NotImplementedError:
|
||||
pass
|
||||
|
||||
def _do_extra_tests(self, test_case, module, input, target):
|
||||
pass
|
||||
|
||||
@ -55,37 +55,32 @@ $PYCMD test_cuda.py $@
|
||||
echo "Running NCCL tests"
|
||||
$PYCMD test_nccl.py $@
|
||||
|
||||
distributed_set_up() {
|
||||
export TEMP_DIR="$(mktemp -d)"
|
||||
rm -rf "$TEMP_DIR/"*
|
||||
mkdir "$TEMP_DIR/barrier"
|
||||
mkdir "$TEMP_DIR/test_dir"
|
||||
}
|
||||
################################################################################
|
||||
if [[ "$TEST_DISTRIBUTED" -eq 1 ]]; then
|
||||
distributed_set_up() {
|
||||
export TEMP_DIR="$(mktemp -d)"
|
||||
rm -rf "$TEMP_DIR/"*
|
||||
mkdir "$TEMP_DIR/barrier"
|
||||
mkdir "$TEMP_DIR/test_dir"
|
||||
}
|
||||
|
||||
distributed_tear_down() {
|
||||
rm -rf "$TEMP_DIR"
|
||||
}
|
||||
distributed_tear_down() {
|
||||
rm -rf "$TEMP_DIR"
|
||||
}
|
||||
|
||||
trap distributed_tear_down EXIT SIGHUP SIGINT SIGTERM
|
||||
trap distributed_tear_down EXIT SIGHUP SIGINT SIGTERM
|
||||
|
||||
echo "Running distributed tests for the TCP backend"
|
||||
distributed_set_up
|
||||
BACKEND=tcp WORLD_SIZE=3 $PYCMD ./test_distributed.py
|
||||
distributed_tear_down
|
||||
echo "Running distributed tests for the TCP backend"
|
||||
distributed_set_up
|
||||
BACKEND=tcp WORLD_SIZE=3 $PYCMD ./test_distributed.py
|
||||
distributed_tear_down
|
||||
|
||||
echo "Running distributed tests for the Gloo backend"
|
||||
distributed_set_up
|
||||
BACKEND=gloo WORLD_SIZE=3 $PYCMD ./test_distributed.py
|
||||
distributed_tear_down
|
||||
|
||||
if [ -x "$(command -v mpiexec)" ]; then
|
||||
echo "Running distributed tests for the MPI backend"
|
||||
distributed_set_up
|
||||
BACKEND=mpi mpiexec -n 3 $PYCMD ./test_distributed.py
|
||||
distributed_tear_down
|
||||
else
|
||||
echo "Skipping MPI backend tests (MPI not found)"
|
||||
echo "Running distributed tests for the MPI backend"
|
||||
distributed_set_up
|
||||
BACKEND=mpi mpiexec -n 3 $PYCMD ./test_distributed.py
|
||||
distributed_tear_down
|
||||
fi
|
||||
################################################################################
|
||||
|
||||
if [[ $COVERAGE -eq 1 ]]; then
|
||||
coverage combine
|
||||
|
||||
File diff suppressed because it is too large
@ -94,7 +94,7 @@ def small_3d_positive(t):
|
||||
|
||||
|
||||
def small_3d_unique(t):
|
||||
return t(S, S, S).copy_(torch.arange(1, S * S * S + 1).view(S, S, S))
|
||||
return t(S, S, S).copy_(torch.arange(1, S * S * S + 1))
|
||||
|
||||
|
||||
def small_1d_lapack(t):
|
||||
@ -113,10 +113,6 @@ def small_2d_lapack_fat(t):
|
||||
return t(4, 3).copy_(torch.arange(1, 13).view(4, 3))
|
||||
|
||||
|
||||
def large_2d_lapack(t):
|
||||
return t(1000, 1000).normal_()
|
||||
|
||||
|
||||
def new_t(*sizes):
|
||||
def tmp(t):
|
||||
return t(*sizes).copy_(torch.randn(*sizes))
|
||||
@ -284,8 +280,6 @@ tests = [
|
||||
('qr', small_2d_lapack, lambda t: [], 'square', float_types),
|
||||
('qr', small_2d_lapack_skinny, lambda t: [], 'skinny', float_types),
|
||||
('qr', small_2d_lapack_fat, lambda t: [], 'fat', float_types),
|
||||
('qr', large_2d_lapack, lambda t: [], 'big', float_types),
|
||||
('inverse', new_t(20, 20), lambda t: [], None, float_types),
|
||||
|
||||
]
|
||||
|
||||
@ -300,7 +294,6 @@ custom_precision = {
|
||||
'baddbmm': 1e-4,
|
||||
'rsqrt': 1e-4,
|
||||
'cumprod': 1e-4,
|
||||
'qr': 3e-4,
|
||||
}
|
||||
|
||||
simple_pointwise = [
|
||||
@ -602,17 +595,6 @@ class TestCuda(TestCase):
|
||||
cuda_type = get_gpu_type(t)
|
||||
self.assertEqual(cuda_type(seq), reference)
|
||||
|
||||
def test_torch_manual_seed_seeds_cuda_devices(self):
|
||||
with freeze_rng_state():
|
||||
x = torch.zeros(4, 4).float().cuda()
|
||||
torch.manual_seed(2)
|
||||
self.assertEqual(torch.cuda.initial_seed(), 2)
|
||||
x.uniform_()
|
||||
torch.manual_seed(2)
|
||||
y = x.clone().uniform_()
|
||||
self.assertEqual(x, y)
|
||||
self.assertEqual(torch.cuda.initial_seed(), 2)
|
||||
|
||||
def test_manual_seed(self):
|
||||
with freeze_rng_state():
|
||||
x = torch.zeros(4, 4).float().cuda()
|
||||
@ -841,60 +823,12 @@ class TestCuda(TestCase):
|
||||
self.assertEqual(gpu_tensor1[0], 1)
|
||||
self.assertEqual(gpu_tensor0[0], 2)
|
||||
|
||||
@staticmethod
|
||||
def _select_broadcastable_dims(dims_full=None):
|
||||
return TestTorch._select_broadcastable_dims(dims_full)
|
||||
|
||||
def test_broadcast(self):
|
||||
TestTorch._test_broadcast(self, lambda t: t.cuda())
|
||||
|
||||
def test_broadcast_fallback(self):
|
||||
TestTorch._test_broadcast_fallback(self, lambda t: t.cuda())
|
||||
|
||||
def test_broadcast_fused_matmul(self):
|
||||
TestTorch._test_broadcast_fused_matmul(self, lambda t: t.cuda())
|
||||
|
||||
def test_broadcast_batched_matmul(self):
|
||||
TestTorch._test_broadcast_batched_matmul(self, lambda t: t.cuda())
|
||||
|
||||
def test_advancedindex(self):
|
||||
TestTorch._test_advancedindex(self, lambda t: t.cuda())
|
||||
|
||||
def test_advancedindex_big(self):
|
||||
TestTorch._test_advancedindex_big(self, lambda t: t.cuda())
|
||||
|
||||
def test_btrifact(self):
|
||||
TestTorch._test_btrifact(self, lambda t: t.cuda())
|
||||
|
||||
def test_btrisolve(self):
|
||||
TestTorch._test_btrisolve(self, lambda t: t.cuda())
|
||||
|
||||
def test_tensor_gather(self):
|
||||
TestTorch._test_gather(self, lambda t: t.cuda(), False)
|
||||
|
||||
def test_tensor_scatter(self):
|
||||
TestTorch._test_scatter_base(self, lambda t: t.cuda(), 'scatter_', test_bounds=False)
|
||||
|
||||
def test_tensor_scatterAdd(self):
|
||||
TestTorch._test_scatter_base(self, lambda t: t.cuda(), 'scatter_add_', test_bounds=False)
|
||||
|
||||
def test_tensor_scatterFill(self):
|
||||
TestTorch._test_scatter_base(self, lambda t: t.cuda(), 'scatter_', True, test_bounds=False)
|
||||
|
||||
def test_arange(self):
|
||||
for t in ['IntTensor', 'LongTensor', 'FloatTensor', 'DoubleTensor']:
|
||||
a = torch.cuda.__dict__[t]()
|
||||
torch.arange(0, 10, out=a)
|
||||
b = torch.__dict__[t]()
|
||||
torch.arange(0, 10, out=b)
|
||||
self.assertEqual(a, b.cuda())
|
||||
|
||||
def test_nvtx(self):
|
||||
# Just making sure we can see the symbols
|
||||
torch.cuda.nvtx.range_push("foo")
|
||||
torch.cuda.nvtx.mark("bar")
|
||||
torch.cuda.nvtx.range_pop()
|
||||
|
||||
|
||||
if HAS_CUDA:
|
||||
for decl in tests:
|
||||
|
||||
@ -3,7 +3,7 @@ import sys
|
||||
import torch
|
||||
import traceback
|
||||
import unittest
from torch.utils.data import Dataset, TensorDataset, DataLoader, ConcatDataset
from torch.utils.data import Dataset, TensorDataset, DataLoader
from common import TestCase, run_tests, TEST_NUMPY
from common_nn import TEST_CUDA

@@ -31,38 +31,6 @@ class TestTensorDataset(TestCase):
self.assertEqual(l[i], source[i][1])


class TestConcatDataset(TestCase):

def test_concat_two_singletons(self):
result = ConcatDataset([[0], [1]])
self.assertEqual(2, len(result))
self.assertEqual(0, result[0])
self.assertEqual(1, result[1])

def test_concat_two_non_singletons(self):
result = ConcatDataset([[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]])
self.assertEqual(10, len(result))
self.assertEqual(0, result[0])
self.assertEqual(5, result[5])

def test_concat_two_non_singletons_with_empty(self):
# Adding an empty dataset somewhere is correctly handled
result = ConcatDataset([[0, 1, 2, 3, 4],
[],
[5, 6, 7, 8, 9]])
self.assertEqual(10, len(result))
self.assertEqual(0, result[0])
self.assertEqual(5, result[5])

def test_concat_raises_index_error(self):
result = ConcatDataset([[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]])
with self.assertRaises(IndexError):
# this one goes to 11
result[11]


class ErrorDataset(Dataset):

def __init__(self, size):
@@ -109,7 +77,7 @@ class TestDataLoader(TestCase):
errors = 0
while True:
try:
next(it)
it.next()
except NotImplementedError:
errors += 1
except StopIteration:
@@ -123,14 +91,6 @@ class TestDataLoader(TestCase):
def test_sequential_batch(self):
self._test_sequential(DataLoader(self.dataset, batch_size=2))

def test_growing_dataset(self):
dataset = [torch.ones(4) for _ in range(4)]
dataloader_seq = DataLoader(dataset, shuffle=False)
dataloader_shuffle = DataLoader(dataset, shuffle=True)
dataset.append(torch.ones(4))
self.assertEqual(len(dataloader_seq), 5)
self.assertEqual(len(dataloader_shuffle), 5)

@unittest.skipIf(not TEST_CUDA, "CUDA unavailable")
def test_sequential_pin_memory(self):
loader = DataLoader(self.dataset, batch_size=2, pin_memory=True)
@@ -156,29 +116,6 @@ class TestDataLoader(TestCase):
def test_shuffle_batch_workers(self):
self._test_shuffle(DataLoader(self.dataset, batch_size=2, shuffle=True, num_workers=4))

def _test_batch_sampler(self, **kwargs):
# [(0, 1), (2, 3, 4), (5, 6), (7, 8, 9), ...]
batches = []
for i in range(0, 100, 5):
batches.append(tuple(range(i, i + 2)))
batches.append(tuple(range(i + 2, i + 5)))

dl = DataLoader(self.dataset, batch_sampler=batches, **kwargs)
self.assertEqual(len(dl), 40)
for i, (input, _target) in enumerate(dl):
if i % 2 == 0:
offset = i * 5 // 2
self.assertEqual(len(input), 2)
self.assertEqual(input, self.data[offset:offset + 2])
else:
offset = i * 5 // 2
self.assertEqual(len(input), 3)
self.assertEqual(input, self.data[offset:offset + 3])

def test_batch_sampler(self):
self._test_batch_sampler()
self._test_batch_sampler(num_workers=4)

@unittest.skipIf(not TEST_CUDA, "CUDA unavailable")
def test_shuffle_pin_memory(self):
loader = DataLoader(self.dataset, batch_size=2, shuffle=True, num_workers=4, pin_memory=True)

@ -14,12 +14,7 @@ from common import TestCase
|
||||
BACKEND = os.environ['BACKEND']
|
||||
TEMP_DIR = os.environ['TEMP_DIR']
|
||||
MASTER_PORT = '29500'
|
||||
MASTER_ADDR = '127.0.0.1'
|
||||
|
||||
|
||||
if not dist.is_available():
|
||||
print('Distributed not available, skipping tests')
|
||||
sys.exit(0)
|
||||
MASTER_ADDR = '127.0.0.1:' + MASTER_PORT
|
||||
|
||||
|
||||
@contextmanager
|
||||
@ -69,7 +64,7 @@ class Barrier(object):
|
||||
data = f.read()
|
||||
if int(data) >= cls.barrier_id:
|
||||
arrived += 1
|
||||
if arrived == dist.get_world_size():
|
||||
if arrived == dist.get_num_processes():
|
||||
break
|
||||
|
||||
if time.time() - start_time > timeout:
|
||||
@ -92,7 +87,7 @@ class _DistTestBase(object):
|
||||
return (group, group_id, rank)
|
||||
|
||||
def _init_global_test(self):
|
||||
group = [i for i in range(0, dist.get_world_size())]
|
||||
group = [i for i in range(0, dist.get_num_processes())]
|
||||
group_id = dist.group.WORLD
|
||||
rank = dist.get_rank()
|
||||
return (group, group_id, rank)
|
||||
@ -101,7 +96,7 @@ class _DistTestBase(object):
|
||||
def test_get_rank(self):
|
||||
test_dir = os.path.join(TEMP_DIR, 'test_dir')
|
||||
pid = str(os.getpid())
|
||||
num_processes = dist.get_world_size()
|
||||
num_processes = dist.get_num_processes()
|
||||
with open(os.path.join(test_dir, pid), 'w') as f:
|
||||
f.write(str(dist.get_rank()))
|
||||
|
||||
@ -122,16 +117,15 @@ class _DistTestBase(object):
|
||||
self._barrier()
|
||||
|
||||
# SEND RECV
|
||||
@unittest.skipIf(BACKEND == 'gloo', "Gloo does not support send/recv")
|
||||
def test_send_recv(self):
|
||||
rank = dist.get_rank()
|
||||
tensor = _build_tensor(rank + 1)
|
||||
for dest in range(0, dist.get_world_size()):
|
||||
for dest in range(0, dist.get_num_processes()):
|
||||
if dest == rank:
|
||||
continue
|
||||
dist.send(tensor, dest)
|
||||
|
||||
for src in range(0, dist.get_world_size()):
|
||||
for src in range(0, dist.get_num_processes()):
|
||||
if src == rank:
|
||||
continue
|
||||
tensor = _build_tensor(src + 1, value=-1)
|
||||
@ -142,32 +136,29 @@ class _DistTestBase(object):
|
||||
self._barrier()
|
||||
|
||||
# SEND RECV ANY SOURCE
|
||||
@unittest.skipIf(BACKEND == 'gloo',
|
||||
"Gloo does not support send/recv from any source")
|
||||
def test_send_recv_any_source(self):
|
||||
rank = dist.get_rank()
|
||||
tensor = _build_tensor(10, rank)
|
||||
for dest in range(0, dist.get_world_size()):
|
||||
for dest in range(0, dist.get_num_processes()):
|
||||
if dest == rank:
|
||||
continue
|
||||
dist.send(tensor, dest)
|
||||
|
||||
recv_ranks = set()
|
||||
for src in range(0, dist.get_world_size()):
|
||||
for src in range(0, dist.get_num_processes()):
|
||||
if src == rank:
|
||||
continue
|
||||
tensor = _build_tensor(10, value=-1)
|
||||
dist.recv(tensor)
|
||||
recv_ranks.add(tensor.resize_(1)[0])
|
||||
|
||||
self.assertEqual(len(recv_ranks), dist.get_world_size() - 1)
|
||||
self.assertEqual(len(recv_ranks), dist.get_num_processes() - 1)
|
||||
self._barrier()
|
||||
|
||||
# ISEND
|
||||
@unittest.skipIf(BACKEND == 'gloo', "Gloo does not support isend")
|
||||
def test_isend(self):
|
||||
rank = dist.get_rank()
|
||||
world_size = dist.get_world_size()
|
||||
world_size = dist.get_num_processes()
|
||||
|
||||
if rank == 0:
|
||||
requests = [
|
||||
@ -184,10 +175,9 @@ class _DistTestBase(object):
|
||||
self._barrier()
|
||||
|
||||
# IRECV
|
||||
@unittest.skipIf(BACKEND == 'gloo', "Gloo does not support irecv")
|
||||
def test_irecv(self):
|
||||
rank = dist.get_rank()
|
||||
world_size = dist.get_world_size()
|
||||
world_size = dist.get_num_processes()
|
||||
|
||||
if rank == 0:
|
||||
expected_tensors = [_build_tensor(src, -1) for src in range(1, world_size)]
|
||||
@ -206,17 +196,13 @@ class _DistTestBase(object):
|
||||
self._barrier()
|
||||
|
||||
# BROADCAST
|
||||
def _test_broadcast_helper(self, group, group_id, rank, cuda=False):
|
||||
def _test_broadcast_helper(self, group, group_id, rank):
|
||||
for src in group:
|
||||
expected_tensor = _build_tensor(src + 1)
|
||||
if cuda:
|
||||
expected_tensor = expected_tensor.cuda()
|
||||
if rank == src:
|
||||
dist.broadcast(expected_tensor, src, group_id)
|
||||
else:
|
||||
tensor = _build_tensor(src + 1, -1)
|
||||
if cuda:
|
||||
tensor = tensor.cuda()
|
||||
dist.broadcast(tensor, src, group_id)
|
||||
self.assertEqual(tensor, expected_tensor)
|
||||
|
||||
@ -226,11 +212,6 @@ class _DistTestBase(object):
|
||||
group, group_id, rank = self._init_global_test()
|
||||
self._test_broadcast_helper(group, group_id, rank)
|
||||
|
||||
@unittest.skipIf(BACKEND != 'gloo', "Only Gloo backend supports CUDA allReduce")
|
||||
def test_broadcast_cuda(self):
|
||||
group, group_id, rank = self._init_global_test()
|
||||
self._test_broadcast_helper(group, group_id, rank, True)
|
||||
|
||||
def test_broadcast_group(self):
|
||||
group, group_id, rank = self._init_group_test()
|
||||
self._test_broadcast_helper(group, group_id, rank)
|
||||
@ -248,14 +229,12 @@ class _DistTestBase(object):
|
||||
|
||||
self._barrier()
|
||||
|
||||
@unittest.skipIf(BACKEND == 'gloo', "Gloo does not support reduce")
|
||||
def test_reduce_sum(self):
|
||||
group, group_id, rank = self._init_global_test()
|
||||
self._test_reduce_helper(
|
||||
group, group_id, rank, dist.reduce_op.SUM, 2, 10, 2 + (10 * (len(group) - 1))
|
||||
)
|
||||
|
||||
@unittest.skipIf(BACKEND == 'gloo', "Gloo does not support reduce")
|
||||
def test_reduce_product(self):
|
||||
group, group_id, rank = self._init_global_test()
|
||||
self._test_reduce_helper(
|
||||
@ -263,28 +242,24 @@ class _DistTestBase(object):
|
||||
2, 10, reduce((lambda x, y: x * y), [10] * (len(group) - 1), 2)
|
||||
)
|
||||
|
||||
@unittest.skipIf(BACKEND == 'gloo', "Gloo does not support reduce")
|
||||
def test_reduce_min(self):
|
||||
group, group_id, rank = self._init_global_test()
|
||||
self._test_reduce_helper(
|
||||
group, group_id, rank, dist.reduce_op.MIN, 1010, 1, 1
|
||||
)
|
||||
|
||||
@unittest.skipIf(BACKEND == 'gloo', "Gloo does not support reduce")
|
||||
def test_reduce_max(self):
|
||||
group, group_id, rank = self._init_global_test()
|
||||
self._test_reduce_helper(
|
||||
group, group_id, rank, dist.reduce_op.MAX, -1, 10, 10
|
||||
)
|
||||
|
||||
@unittest.skipIf(BACKEND == 'gloo', "Gloo does not support reduce")
|
||||
def test_reduce_group_sum(self):
|
||||
group, group_id, rank = self._init_group_test()
|
||||
self._test_reduce_helper(
|
||||
group, group_id, rank, dist.reduce_op.SUM, 2, 10, 2 + (10 * (len(group) - 1))
|
||||
)
|
||||
|
||||
@unittest.skipIf(BACKEND == 'gloo', "Gloo does not support reduce")
|
||||
def test_reduce_group_product(self):
|
||||
group, group_id, rank = self._init_group_test()
|
||||
self._test_reduce_helper(
|
||||
@ -292,14 +267,12 @@ class _DistTestBase(object):
|
||||
2, 10, reduce((lambda x, y: x * y), [10] * (len(group) - 1), 2)
|
||||
)
|
||||
|
||||
@unittest.skipIf(BACKEND == 'gloo', "Gloo does not support reduce")
|
||||
def test_reduce_group_min(self):
|
||||
group, group_id, rank = self._init_group_test()
|
||||
self._test_reduce_helper(
|
||||
group, group_id, rank, dist.reduce_op.MIN, 1010, 1, 1
|
||||
)
|
||||
|
||||
@unittest.skipIf(BACKEND == 'gloo', "Gloo does not support reduce")
|
||||
def test_reduce_group_max(self):
|
||||
group, group_id, rank = self._init_group_test()
|
||||
self._test_reduce_helper(
|
||||
@ -307,19 +280,14 @@ class _DistTestBase(object):
|
||||
)
|
||||
|
||||
# ALL REDUCE
|
||||
def _test_all_reduce_helper(self, group, group_id, rank, op, master_value,
|
||||
worker_value, expected_value, cuda=False):
|
||||
def _test_all_reduce_helper(self, group, group_id, rank, op, master_value, worker_value, expected_value):
|
||||
for src in group:
|
||||
if rank == src:
|
||||
tensor = _build_tensor(src + 1).fill_(master_value)
|
||||
if cuda:
|
||||
tensor = tensor.cuda()
|
||||
dist.all_reduce(tensor, op, group_id)
|
||||
self.assertEqual(tensor, _build_tensor(src + 1, expected_value))
|
||||
else:
|
||||
tensor = _build_tensor(src + 1).fill_(worker_value)
|
||||
if cuda:
|
||||
tensor = tensor.cuda()
|
||||
dist.all_reduce(tensor, op, group_id)
|
||||
self.assertEqual(tensor, _build_tensor(src + 1, expected_value))
|
||||
|
||||
@ -331,13 +299,6 @@ class _DistTestBase(object):
|
||||
group, group_id, rank, dist.reduce_op.SUM, 2, 10, 2 + (10 * (len(group) - 1))
|
||||
)
|
||||
|
||||
@unittest.skipIf(BACKEND != 'gloo', "Only Gloo backend supports CUDA allReduce")
|
||||
def test_all_reduce_sum_cuda(self):
|
||||
group, group_id, rank = self._init_global_test()
|
||||
self._test_all_reduce_helper(
|
||||
group, group_id, rank, dist.reduce_op.SUM, 2, 10, 2 + (10 * (len(group) - 1)), True
|
||||
)
|
||||
|
||||
def test_all_reduce_product(self):
|
||||
group, group_id, rank = self._init_global_test()
|
||||
self._test_all_reduce_helper(
|
||||
@ -387,18 +348,20 @@ class _DistTestBase(object):
|
||||
for dest in group:
|
||||
tensor = _build_tensor(dest + 1, -1)
|
||||
expected_tensor = _build_tensor(dest + 1, rank)
|
||||
tensors = [_build_tensor(dest + 1, i) for i in group] if rank == dest else []
|
||||
dist.scatter(tensor, src=dest, scatter_list=tensors, group=group_id)
|
||||
self.assertEqual(tensor, expected_tensor)
|
||||
if rank == dest:
|
||||
tensors = [_build_tensor(dest + 1, i) for i in group]
|
||||
dist.scatter_send(tensors, tensor, group_id)
|
||||
self.assertEqual(tensor, expected_tensor)
|
||||
else:
|
||||
dist.scatter_recv(tensor, dest, group_id)
|
||||
self.assertEqual(tensor, expected_tensor)
|
||||
|
||||
self._barrier()
|
||||
|
||||
@unittest.skipIf(BACKEND == 'gloo', "Gloo does not support scatter")
|
||||
def test_scatter(self):
|
||||
group, group_id, rank = self._init_global_test()
|
||||
self._test_scatter_helper(group, group_id, rank)
|
||||
|
||||
@unittest.skipIf(BACKEND == 'gloo', "Gloo does not support scatter")
|
||||
def test_scatter_group(self):
|
||||
group, group_id, rank = self._init_group_test()
|
||||
self._test_scatter_helper(group, group_id, rank)
|
||||
@ -407,21 +370,22 @@ class _DistTestBase(object):
|
||||
def _test_gather_helper(self, group, group_id, rank):
|
||||
for dest in group:
|
||||
tensor = _build_tensor(dest + 1, rank)
|
||||
tensors = [_build_tensor(dest + 1, -1) for i in group] if rank == dest else []
|
||||
dist.gather(tensor, dst=dest, gather_list=tensors, group=group_id)
|
||||
if rank == dest:
|
||||
tensors = [_build_tensor(dest + 1, -1) for i in group]
|
||||
dist.gather_recv(tensors, tensor, group_id)
|
||||
|
||||
expected_tensors = [_build_tensor(dest + 1, i) for i in group]
|
||||
for t1, t2 in zip(tensors, expected_tensors):
|
||||
self.assertEqual(t1, t2)
|
||||
else:
|
||||
dist.gather_send(tensor, dest, group_id)
|
||||
|
||||
self._barrier()
|
||||
|
||||
@unittest.skipIf(BACKEND == 'gloo', "Gloo does not support gather")
|
||||
def test_gather(self):
|
||||
group, group_id, rank = self._init_global_test()
|
||||
self._test_gather_helper(group, group_id, rank)
|
||||
|
||||
@unittest.skipIf(BACKEND == 'gloo', "Gloo does not support gather")
|
||||
def test_gather_group(self):
|
||||
group, group_id, rank = self._init_group_test()
|
||||
self._test_gather_helper(group, group_id, rank)
|
||||
@ -473,13 +437,13 @@ class _DistTestBase(object):
|
||||
group, group_id, rank = self._init_group_test()
|
||||
self._test_barrier_helper(group, group_id, rank)
|
||||
|
||||
if BACKEND == 'tcp' or BACKEND == 'gloo':
|
||||
if BACKEND == 'tcp':
|
||||
WORLD_SIZE = os.environ['WORLD_SIZE']
|
||||
|
||||
class TestTCPOrGloo(TestCase, _DistTestBase):
|
||||
class TestTCP(TestCase, _DistTestBase):
|
||||
|
||||
MANAGER_PROCESS_RANK = -1
|
||||
JOIN_TIMEOUT = 10
|
||||
JOIN_TIMEOUT = 5
|
||||
|
||||
@staticmethod
|
||||
def manager_join(fn):
|
||||
@ -522,11 +486,7 @@ if BACKEND == 'tcp' or BACKEND == 'gloo':
|
||||
|
||||
def _run(self, rank):
|
||||
self.rank = rank
|
||||
try:
|
||||
dist.init_process_group(backend=BACKEND)
|
||||
except RuntimeError as e:
|
||||
if 'recompile' in e.args[0]:
|
||||
sys.exit(0)
|
||||
dist.init_process_group(backend=BACKEND)
|
||||
# self.id() == e.g. '__main__.TestDistributed.test_get_rank'
|
||||
# We're retreiving a corresponding test and executing it.
|
||||
getattr(self, self.id().split(".")[2])()
|
||||
|
||||
@@ -184,16 +184,16 @@ tests = [
OldModuleTest(nn.Sum,
(1,),
input_size=(2, 4, 5),
reference_fn=lambda i, _: i.sum(1, keepdim=False)),
reference_fn=lambda i, _: i.sum(1).squeeze(1)),
OldModuleTest(nn.Sum,
(1, True),
input_size=(2, 4, 5),
reference_fn=lambda i, _: i.sum(1, keepdim=False).div(i.size(1)),
reference_fn=lambda i, _: i.sum(1).div(i.size(1)).squeeze(1),
desc='sizeAverage'),
OldModuleTest(nn.Mean,
(1,),
input_size=(2, 4, 5),
reference_fn=lambda i, _: torch.mean(i, 1, keepdim=False)),
reference_fn=lambda i, _: torch.mean(i, 1).squeeze(1)),
OldModuleTest(lambda: nn.Sequential().add(nn.GradientReversal()).add(nn.GradientReversal()),
input_size=(4, 3, 2, 2),
fullname='GradientReversal'),
@@ -233,19 +233,19 @@ tests = [
reference_fn=lambda i, _: torch.bmm(i[0], i[1].view(i[1].size(0), i[1].size(1), 1)).squeeze()),
OldModuleTest(nn.Max,
input_size=(4, 5, 3),
reference_fn=lambda i, _: torch.max(i, 0, False)[0]),
reference_fn=lambda i, _: torch.max(i, 0)[0].squeeze()),
OldModuleTest(nn.Max,
(1,),
input_size=(4, 5, 3),
reference_fn=lambda i, _: torch.max(i, 1, False)[0],
reference_fn=lambda i, _: torch.max(i, 1)[0].squeeze(),
desc='with_dimension'),
OldModuleTest(nn.Min,
input_size=(4, 5, 3),
reference_fn=lambda i, _: torch.min(i, 0, False)[0]),
reference_fn=lambda i, _: torch.min(i, 0)[0].squeeze()),
OldModuleTest(nn.Min,
(1,),
input_size=(4, 5, 3),
reference_fn=lambda i, _: torch.min(i, 1, False)[0],
reference_fn=lambda i, _: torch.min(i, 1)[0].squeeze(),
desc='with_dimension'),
OldModuleTest(nn.MixtureTable,
tuple(),
@@ -532,7 +532,7 @@ for p in (1, 2, 1.5):
(p,),
input_size=(4, 5),
# Eh, we need to use p as a default, so it's passed by value
reference_fn=lambda i, _, p=p: i.div(i.norm(p, 1, True).expand_as(i)),
reference_fn=lambda i, _, p=p: i.div(i.norm(p, 1).expand_as(i)),
desc=str(p)),
)
for p in range(1, 4 + 1):
@@ -807,14 +807,14 @@ class TestNN(NNTestCase):
str(m)

output = m.forward(input)
output2 = input.sum(1, True).expand(4, 5).repeat(num_modules, 1)
output2 = input.sum(1).expand(4, 5).repeat(num_modules, 1)
self.assertEqual(output2, output)

gradInput = m.backward(input, torch.ones(output2.size()))
gradInput2 = torch.ones(4, 2).fill_(num_modules * 5)
self.assertEqual(gradInput, gradInput2)

gradWeight = input.sum(0, keepdim=True).expand(5, 2)
gradWeight = input.sum(0).expand(5, 2)
for l in linears:
self.assertEqual(gradWeight, l.gradWeight)

@@ -884,8 +884,8 @@ class TestNN(NNTestCase):
output2 = [input, input, input]
self.assertEqual(output2, output)
gradInput = module.backward(input, gradOutput)
gradInput2 = [_gradOutput[0].sum(0, keepdim=False), _gradOutput[1].sum(
0, keepdim=False), [_gradOutput[2].sum(0, keepdim=False)]]
gradInput2 = [_gradOutput[0].sum(0).squeeze(0), _gradOutput[1].sum(
0).squeeze(0), [_gradOutput[2].sum(0).squeeze(0)]]
self.assertTrue(isinstance(gradInput, list))
self.assertFalse(isinstance(gradInput[0], list))
self.assertFalse(isinstance(gradInput[1], list))

@@ -112,10 +112,9 @@ class leak_checker(object):
# test is no more than 4 higher than the 10th available at the
# start. This attempts to catch file descriptor leaks, but allows
# one-off initialization that may use up a file descriptor
# TODO: Disabled because this check is too flaky
# available_fds = self._get_next_fds(10)
# self.test_case.assertLessEqual(
# available_fds[-1] - self.next_fds[-1], 5)
available_fds = self._get_next_fds(10)
self.test_case.assertLessEqual(
available_fds[-1] - self.next_fds[-1], 5)
self.test_case.assertFalse(self.has_shm_files())
return False

@@ -297,8 +296,7 @@ class TestMultiprocessing(TestCase):
ctx = mp.get_context('spawn')
tensors = []
for i in range(5):
device = i % 2
tensors += [torch.arange(i * 5, (i + 1) * 5).cuda(device)]
tensors += [torch.arange(i * 5, (i + 1) * 5).cuda()]

inq = ctx.Queue()
outq = ctx.Queue()
@@ -314,7 +312,7 @@ class TestMultiprocessing(TestCase):
for i, tensor in enumerate(tensors):
v, device, tensor_size, storage_size = results[i]
self.assertEqual(v, torch.arange(i * 5, (i + 1) * 5).sum())
self.assertEqual(device, i % 2)
self.assertEqual(device, 0)
self.assertEqual(tensor_size, 5)
self.assertEqual(storage_size, 5)

@@ -393,10 +391,6 @@ class TestMultiprocessing(TestCase):
param = Parameter(torch.arange(1, 26).view(5, 5))
self._test_autograd_sharing(param)

def test_empty_shared(self):
t = torch.Tensor()
t.share_memory_()

def _test_is_shared(self):
t = torch.randn(5, 5)
self.assertFalse(t.is_shared())

test/test_nn.py: 1169 changed lines (file diff suppressed because it is too large)
@ -4,11 +4,8 @@ from copy import deepcopy
|
||||
import torch
|
||||
import torch.optim as optim
|
||||
import torch.legacy.optim as old_optim
|
||||
import torch.nn.functional as F
|
||||
from torch.optim import SGD
|
||||
from torch.autograd import Variable
|
||||
from torch import sparse
|
||||
from torch.optim.lr_scheduler import LambdaLR, StepLR, MultiStepLR, ExponentialLR, ReduceLROnPlateau
|
||||
|
||||
from common import TestCase, run_tests
|
||||
|
||||
|
||||
@ -61,49 +58,6 @@ class TestOptim(TestCase):
|
||||
|
||||
self.assertLessEqual(params.data.dist(solution), initial_dist)
|
||||
|
||||
def _test_rosenbrock_sparse(self, constructor):
|
||||
params_t = torch.Tensor([1.5, 1.5])
|
||||
|
||||
params = Variable(torch.Tensor([1.5, 1.5]), requires_grad=True)
|
||||
params_c = Variable(torch.Tensor([1.5, 1.5]), requires_grad=True)
|
||||
optimizer = constructor([params])
|
||||
optimizer_c = constructor([params_c])
|
||||
|
||||
solution = torch.Tensor([1, 1])
|
||||
initial_dist = params.data.dist(solution)
|
||||
|
||||
def eval(params, sparse_grad, w):
|
||||
# Depending on w, provide only the x or y gradient
|
||||
optimizer.zero_grad()
|
||||
loss = rosenbrock(params)
|
||||
loss.backward()
|
||||
grad = drosenbrock(params.data)
|
||||
# NB: We torture test the optimizer by returning an
|
||||
# uncoalesced sparse tensor
|
||||
if w:
|
||||
i = torch.LongTensor([[0, 0]])
|
||||
x = grad[0]
|
||||
v = torch.DoubleTensor([x / 4., x - x / 4.])
|
||||
else:
|
||||
i = torch.LongTensor([[1, 1]])
|
||||
y = grad[1]
|
||||
v = torch.DoubleTensor([y - y / 4., y / 4.])
|
||||
x = sparse.DoubleTensor(i, v, torch.Size([2]))
|
||||
if sparse_grad:
|
||||
params.grad.data = x
|
||||
else:
|
||||
params.grad.data = x.to_dense()
|
||||
return loss
|
||||
|
||||
for i in range(2000):
|
||||
# Do cyclic coordinate descent
|
||||
w = i % 2
|
||||
optimizer.step(functools.partial(eval, params, True, w))
|
||||
optimizer_c.step(functools.partial(eval, params_c, False, w))
|
||||
self.assertEqual(params.data, params_c.data)
|
||||
|
||||
self.assertLessEqual(params.data.dist(solution), initial_dist)
|
||||
|
||||
def _test_basic_cases_template(self, weight, bias, input, constructor):
|
||||
weight = Variable(weight, requires_grad=True)
|
||||
bias = Variable(bias, requires_grad=True)
|
||||
@ -201,9 +155,6 @@ class TestOptim(TestCase):
|
||||
def _build_params_dict(self, weight, bias, **kwargs):
|
||||
return [dict(params=[weight]), dict(params=[bias], **kwargs)]
|
||||
|
||||
def _build_params_dict_single(self, weight, bias, **kwargs):
|
||||
return [dict(params=bias, **kwargs)]
|
||||
|
||||
def test_sgd(self):
|
||||
self._test_rosenbrock(
|
||||
lambda params: optim.SGD(params, lr=1e-3),
|
||||
@ -223,11 +174,6 @@ class TestOptim(TestCase):
|
||||
self._build_params_dict(weight, bias, lr=1e-2),
|
||||
lr=1e-3)
|
||||
)
|
||||
self._test_basic_cases(
|
||||
lambda weight, bias: optim.SGD(
|
||||
self._build_params_dict_single(weight, bias, lr=1e-2),
|
||||
lr=1e-3)
|
||||
)
|
||||
|
||||
def test_adam(self):
|
||||
self._test_rosenbrock(
|
||||
@ -290,11 +236,6 @@ class TestOptim(TestCase):
|
||||
lr=1e-1)
|
||||
)
|
||||
|
||||
def test_adagrad_sparse(self):
|
||||
self._test_rosenbrock_sparse(
|
||||
lambda params: optim.Adagrad(params, lr=1e-1)
|
||||
)
|
||||
|
||||
def test_adamax(self):
|
||||
self._test_rosenbrock(
|
||||
lambda params: optim.Adamax(params, lr=1e-1),
|
||||
@ -402,157 +343,5 @@ class TestOptim(TestCase):
|
||||
optim.SGD(Variable(torch.randn(5, 5)), lr=3)
|
||||
|
||||
|
||||
class SchedulerTestNet(torch.nn.Module):
|
||||
def __init__(self):
|
||||
super(SchedulerTestNet, self).__init__()
|
||||
self.conv1 = torch.nn.Conv2d(1, 1, 1)
|
||||
self.conv2 = torch.nn.Conv2d(1, 1, 1)
|
||||
|
||||
def forward(self, x):
|
||||
return self.conv2(F.relu(self.conv1(x)))
|
||||
|
||||
|
||||
class TestLRScheduler(TestCase):
|
||||
def setUp(self):
|
||||
self.net = SchedulerTestNet()
|
||||
self.opt = SGD(
|
||||
[{'params': self.net.conv1.parameters()}, {'params': self.net.conv2.parameters(), 'lr': 0.5}],
|
||||
lr=0.05)
|
||||
|
||||
def test_step_lr(self):
|
||||
# lr = 0.05 if epoch < 3
|
||||
# lr = 0.005 if 30 <= epoch < 6
|
||||
# lr = 0.0005 if epoch >= 9
|
||||
single_targets = [0.05] * 3 + [0.005] * 3 + [0.0005] * 3 + [0.00005] * 3
|
||||
targets = [single_targets, list(map(lambda x: x * 10, single_targets))]
|
||||
scheduler = StepLR(self.opt, gamma=0.1, step_size=3)
|
||||
epochs = 10
|
||||
self._test(scheduler, targets, epochs)
|
||||
|
||||
def test_multi_step_lr(self):
|
||||
# lr = 0.05 if epoch < 2
|
||||
# lr = 0.005 if 2 <= epoch < 5
|
||||
# lr = 0.0005 if epoch < 9
|
||||
# lr = 0.00005 if epoch >= 9
|
||||
single_targets = [0.05] * 2 + [0.005] * 3 + [0.0005] * 4 + [0.00005] * 3
|
||||
targets = [single_targets, list(map(lambda x: x * 10, single_targets))]
|
||||
scheduler = MultiStepLR(self.opt, gamma=0.1, milestones=[2, 5, 9])
|
||||
epochs = 10
|
||||
self._test(scheduler, targets, epochs)
|
||||
|
||||
def test_exp_lr(self):
|
||||
single_targets = [0.05 * (0.9 ** x) for x in range(10)]
|
||||
targets = [single_targets, list(map(lambda x: x * 10, single_targets))]
|
||||
scheduler = ExponentialLR(self.opt, gamma=0.9)
|
||||
epochs = 10
|
||||
self._test(scheduler, targets, epochs)
|
||||
|
||||
def test_reduce_lr_on_plateau1(self):
|
||||
for param_group in self.opt.param_groups:
|
||||
param_group['lr'] = 0.5
|
||||
targets = [[0.5] * 20]
|
||||
metrics = [10 - i * 0.0167 for i in range(20)]
|
||||
scheduler = ReduceLROnPlateau(self.opt, threshold_mode='abs', mode='min',
|
||||
threshold=0.01, patience=5, cooldown=5)
|
||||
epochs = 10
|
||||
self._test_reduce_lr_on_plateau(scheduler, targets, metrics, epochs)
|
||||
|
||||
def test_reduce_lr_on_plateau2(self):
|
||||
for param_group in self.opt.param_groups:
|
||||
param_group['lr'] = 0.5
|
||||
targets = [[0.5] * 6 + [0.05] * 7 + [0.005] * 7 + [0.0005] * 2]
|
||||
metrics = [10 - i * 0.0165 for i in range(22)]
|
||||
scheduler = ReduceLROnPlateau(self.opt, patience=5, cooldown=0, threshold_mode='abs',
|
||||
mode='min', threshold=0.1)
|
||||
epochs = 22
|
||||
self._test_reduce_lr_on_plateau(scheduler, targets, metrics, epochs)
|
||||
|
||||
def test_reduce_lr_on_plateau3(self):
|
||||
for param_group in self.opt.param_groups:
|
||||
param_group['lr'] = 0.5
|
||||
targets = [[0.5] * (2 + 6) + [0.05] * (5 + 6) + [0.005] * 4]
|
||||
metrics = [-0.8] * 2 + [-0.234] * 20
|
||||
scheduler = ReduceLROnPlateau(self.opt, mode='max', patience=5, cooldown=5,
|
||||
threshold_mode='abs')
|
||||
epochs = 22
|
||||
self._test_reduce_lr_on_plateau(scheduler, targets, metrics, epochs)
|
||||
|
||||
def test_reduce_lr_on_plateau4(self):
|
||||
for param_group in self.opt.param_groups:
|
||||
param_group['lr'] = 0.5
|
||||
targets = [[0.5] * 20]
|
||||
metrics = [1.5 * (1.025 ** i) for i in range(20)] # 1.025 > 1.1**0.25
|
||||
scheduler = ReduceLROnPlateau(self.opt, mode='max', patience=3,
|
||||
threshold_mode='rel', threshold=0.1)
|
||||
epochs = 20
|
||||
self._test_reduce_lr_on_plateau(scheduler, targets, metrics, epochs)
|
||||
|
||||
def test_reduce_lr_on_plateau5(self):
|
||||
for param_group in self.opt.param_groups:
|
||||
param_group['lr'] = 0.5
|
||||
targets = [[0.5] * 6 + [0.05] * (5 + 6) + [0.005] * 4]
|
||||
metrics = [1.5 * (1.005 ** i) for i in range(20)]
|
||||
scheduler = ReduceLROnPlateau(self.opt, mode='max', threshold_mode='rel',
|
||||
threshold=0.1, patience=5, cooldown=5)
|
||||
epochs = 20
|
||||
self._test_reduce_lr_on_plateau(scheduler, targets, metrics, epochs)
|
||||
|
||||
def test_reduce_lr_on_plateau6(self):
|
||||
for param_group in self.opt.param_groups:
|
||||
param_group['lr'] = 0.5
|
||||
targets = [[0.5] * 20]
|
||||
metrics = [1.5 * (0.85 ** i) for i in range(20)]
|
||||
scheduler = ReduceLROnPlateau(self.opt, mode='min', threshold_mode='rel',
|
||||
threshold=0.1)
|
||||
epochs = 20
|
||||
self._test_reduce_lr_on_plateau(scheduler, targets, metrics, epochs)
|
||||
|
||||
def test_reduce_lr_on_plateau7(self):
|
||||
for param_group in self.opt.param_groups:
|
||||
param_group['lr'] = 0.5
|
||||
targets = [[0.5] * 6 + [0.05] * (5 + 6) + [0.005] * 4]
|
||||
metrics = [1] * 7 + [0.6] + [0.5] * 12
|
||||
scheduler = ReduceLROnPlateau(self.opt, mode='min', threshold_mode='rel',
|
||||
threshold=0.1, patience=5, cooldown=5)
|
||||
epochs = 20
|
||||
self._test_reduce_lr_on_plateau(scheduler, targets, metrics, epochs)
|
||||
|
||||
def test_reduce_lr_on_plateau8(self):
|
||||
for param_group in self.opt.param_groups:
|
||||
param_group['lr'] = 0.5
|
||||
targets = [[0.5] * 6 + [0.4] * 14, [0.5] * 6 + [0.3] * 14]
|
||||
metrics = [1.5 * (1.005 ** i) for i in range(20)]
|
||||
scheduler = ReduceLROnPlateau(self.opt, mode='max', threshold_mode='rel', min_lr=[0.4, 0.3],
|
||||
threshold=0.1, patience=5, cooldown=5)
|
||||
epochs = 20
|
||||
self._test_reduce_lr_on_plateau(scheduler, targets, metrics, epochs)
|
||||
|
||||
def test_lambda_lr(self):
|
||||
self.opt.param_groups[0]['lr'] = 0.05
|
||||
self.opt.param_groups[1]['lr'] = 0.4
|
||||
targets = [[0.05 * (0.9 ** x) for x in range(10)], [0.4 * (0.8 ** x) for x in range(10)]]
|
||||
scheduler = LambdaLR(self.opt,
|
||||
lr_lambda=[lambda x1: 0.9 ** x1, lambda x2: 0.8 ** x2])
|
||||
epochs = 10
|
||||
self._test(scheduler, targets, epochs)
|
||||
|
||||
def _test(self, scheduler, targets, epochs=10):
|
||||
for epoch in range(epochs):
|
||||
scheduler.step(epoch)
|
||||
for param_group, target in zip(self.opt.param_groups, targets):
|
||||
self.assertAlmostEqual(target[epoch], param_group['lr'],
|
||||
msg='LR is wrong in epoch {}: expected {}, got {}'.format(
|
||||
epoch, target[epoch], param_group['lr']), delta=1e-5)
|
||||
|
||||
def _test_reduce_lr_on_plateau(self, scheduler, targets, metrics, epochs=10, verbose=False):
|
||||
for epoch in range(epochs):
|
||||
scheduler.step(metrics[epoch])
|
||||
if verbose:
|
||||
print('epoch{}:\tlr={}'.format(epoch, self.opt.param_groups[0]['lr']))
|
||||
for param_group, target in zip(self.opt.param_groups, targets):
|
||||
self.assertAlmostEqual(target[epoch], param_group['lr'],
|
||||
msg='LR is wrong in epoch {}: expected {}, got {}'.format(
|
||||
epoch, target[epoch], param_group['lr']), delta=1e-5)
|
||||
|
||||
if __name__ == '__main__':
|
||||
run_tests()
|
||||
|
||||
@ -8,63 +8,28 @@ from common import TestCase, run_tests
|
||||
from common_nn import TEST_CUDA
|
||||
from numbers import Number
|
||||
|
||||
# triplet := (index type, value type, sparse type)
|
||||
cpu_triplet = (
|
||||
torch.LongTensor,
|
||||
torch.DoubleTensor,
|
||||
torch.sparse.DoubleTensor)
|
||||
|
||||
def cpu_only(inner):
|
||||
def outer(self, *args, **kwargs):
|
||||
if self.is_cuda:
|
||||
raise unittest.SkipTest("Test is CPU-only")
|
||||
inner(self, *args, **kwargs)
|
||||
return outer
|
||||
|
||||
|
||||
def cuda_only(inner):
|
||||
def outer(self, *args, **kwargs):
|
||||
if not self.is_cuda:
|
||||
raise unittest.SkipTest("Test is GPU-only")
|
||||
inner(self, *args, **kwargs)
|
||||
return outer
|
||||
if TEST_CUDA:
|
||||
cuda_triplet = (
|
||||
torch.cuda.LongTensor,
|
||||
torch.cuda.DoubleTensor,
|
||||
torch.cuda.sparse.DoubleTensor)
|
||||
|
||||
|
||||
class TestSparse(TestCase):
|
||||
|
||||
def setUp(self):
|
||||
# These parameters control the various ways we can run the test.
|
||||
# We will subclass and override this method to implement CUDA
|
||||
# tests
|
||||
self.is_cuda = False
|
||||
self.is_uncoalesced = False
|
||||
self.IndexTensor = torch.LongTensor
|
||||
self.ValueTensor = torch.DoubleTensor
|
||||
self.SparseTensor = torch.sparse.DoubleTensor
|
||||
|
||||
def _gen_sparse(self, d, nnz, with_size):
|
||||
# TODO: Consider implementing this in the CUDA case by directly
|
||||
# performing the operations on the GPU. You won't be able to
|
||||
# use torch.rand/torch.randn in this case because they are
|
||||
# CPU-only. If you do this, you can remove the is_cuda branch
|
||||
# at the end.
|
||||
#
|
||||
# If you do this, be sure to update assert_uncoalesced too
|
||||
|
||||
@staticmethod
|
||||
def _gen_sparse(d, nnz, with_size, is_cuda=False):
|
||||
if isinstance(with_size, Number):
|
||||
with_size = [with_size] * d
|
||||
|
||||
if self.is_uncoalesced:
|
||||
# We want to generate a tensor with a lot of uncoalesced
|
||||
# entries to stress test whether or not we handle this
|
||||
# (subtle) case correctly
|
||||
v_size = [nnz * 2] + list(with_size[d:])
|
||||
v = torch.randn(*v_size)
|
||||
r = torch.rand(d, nnz)
|
||||
# Repeat the indexes, so every position shows up twice
|
||||
i = torch.cat([r, r], dim=1) * \
|
||||
torch.Tensor(with_size[:d]).repeat(nnz * 2, 1).transpose(0, 1)
|
||||
i = i.type(torch.LongTensor)
|
||||
x = torch.sparse.DoubleTensor(i, v, torch.Size(with_size))
|
||||
self.assert_uncoalesced(x)
|
||||
v = torch.randn(nnz)
|
||||
i = (torch.rand(d, nnz) * with_size).type(torch.LongTensor)
|
||||
x = torch.sparse.DoubleTensor(i, v)
|
||||
else:
|
||||
# Generate a sparse tensor with d sparse dimensions; the
|
||||
# rest the dimensions with_size[d:] are dense.
|
||||
v_size = [nnz] + list(with_size[d:])
|
||||
v = torch.randn(*v_size)
|
||||
i = torch.rand(d, nnz) * \
|
||||
@ -72,62 +37,49 @@ class TestSparse(TestCase):
|
||||
i = i.type(torch.LongTensor)
|
||||
x = torch.sparse.DoubleTensor(i, v, torch.Size(with_size))
|
||||
|
||||
if self.is_cuda:
|
||||
if is_cuda:
|
||||
return x.cuda(), i.cuda(), v.cuda()
|
||||
else:
|
||||
return x, i.clone(), v.clone()
|
||||
|
||||
def assert_uncoalesced(self, x):
|
||||
"""
|
||||
Test if a CPU tensor is uncoalesced. This is used to ensure
|
||||
correctness of the uncoalesced tensor generation algorithm.
|
||||
"""
|
||||
assert not x.is_coalesced()
|
||||
# Strategy: construct a new sparse tensor with the raw value
|
||||
# field overwritten to a tensor of ones, coalesce it, and then
|
||||
# check if any value entries are > 1 (which indicates that the
|
||||
# original was uncoalesced.)
|
||||
i = x._indices().clone()
|
||||
v = x._values().clone().fill_(1)
|
||||
y = torch.sparse.DoubleTensor(i, v, x.size())
|
||||
z = self.safeCoalesce(y)
|
||||
assert (z._values() > 1).sum() > 0
|
||||
def _test_basic(self, is_cuda):
|
||||
x, i, v = self._gen_sparse(3, 10, 100, is_cuda)
|
||||
|
||||
def randn(self, *args, **kwargs):
|
||||
"""
|
||||
Variant of torch.randn that also works in the TEST_CUDA case.
|
||||
"""
|
||||
# TODO: Put this in torch.cuda.randn
|
||||
return self.ValueTensor(*args, **kwargs).normal_()
|
||||
self.assertEqual(i, x.indices())
|
||||
self.assertEqual(v, x.values())
|
||||
|
||||
def test_basic(self):
|
||||
x, i, v = self._gen_sparse(3, 10, 100)
|
||||
|
||||
self.assertEqual(i, x._indices())
|
||||
self.assertEqual(v, x._values())
|
||||
|
||||
x, i, v = self._gen_sparse(3, 10, [100, 100, 100])
|
||||
self.assertEqual(i, x._indices())
|
||||
self.assertEqual(v, x._values())
|
||||
x, i, v = self._gen_sparse(3, 10, [100, 100, 100], is_cuda)
|
||||
self.assertEqual(i, x.indices())
|
||||
self.assertEqual(v, x.values())
|
||||
self.assertEqual(x.ndimension(), 3)
|
||||
self.assertEqual(x.coalesce()._nnz(), 10)
|
||||
self.assertEqual(x.nnz(), 10)
|
||||
for i in range(3):
|
||||
self.assertEqual(x.size(i), 100)
|
||||
|
||||
SparseTensor = (cuda_triplet if is_cuda else cpu_triplet)[2]
|
||||
# Make sure we can access empty indices / values
|
||||
x = self.SparseTensor()
|
||||
self.assertEqual(x._indices().numel(), 0)
|
||||
self.assertEqual(x._values().numel(), 0)
|
||||
x = SparseTensor()
|
||||
self.assertEqual(x.indices().numel(), 0)
|
||||
self.assertEqual(x.values().numel(), 0)
|
||||
|
||||
def test_to_dense(self):
|
||||
i = self.IndexTensor([
|
||||
def test_basic(self):
|
||||
self._test_basic(False)
|
||||
|
||||
@unittest.skipIf(not TEST_CUDA, 'CUDA not available')
|
||||
def test_basic_cuda(self):
|
||||
self._test_basic(True)
|
||||
|
||||
def _test_to_dense(self, is_cuda):
|
||||
IndexTensor, ValueTensor, SparseTensor = \
|
||||
cuda_triplet if is_cuda else cpu_triplet
|
||||
i = IndexTensor([
|
||||
[0, 1, 2, 2],
|
||||
[0, 0, 0, 3],
|
||||
[0, 0, 1, 4],
|
||||
])
|
||||
v = self.ValueTensor([2, 1, 3, 4])
|
||||
x = self.SparseTensor(i, v, torch.Size([3, 4, 5]))
|
||||
res = self.ValueTensor([
|
||||
v = ValueTensor([2, 1, 3, 4])
|
||||
x = SparseTensor(i, v, torch.Size([3, 4, 5]))
|
||||
res = ValueTensor([
|
||||
[[2, 0, 0, 0, 0],
|
||||
[0, 0, 0, 0, 0],
|
||||
[0, 0, 0, 0, 0],
|
||||
@ -147,23 +99,23 @@ class TestSparse(TestCase):
|
||||
x.to_dense()
|
||||
self.assertEqual(res, x.to_dense())
|
||||
|
||||
def test_shared(self):
|
||||
i = self.IndexTensor([[2]])
|
||||
v = self.ValueTensor([5])
|
||||
x = self.SparseTensor(i, v, torch.Size([3]))
|
||||
v[0] = 6
|
||||
self.assertEqual(self.ValueTensor([0, 0, 6]), x.to_dense())
|
||||
i[0][0] = 0
|
||||
self.assertEqual(self.ValueTensor([6, 0, 0]), x.to_dense())
|
||||
def test_to_dense(self):
|
||||
self._test_to_dense(False)
|
||||
|
||||
def test_to_dense_hybrid(self):
|
||||
i = self.IndexTensor([
|
||||
@unittest.skipIf(not TEST_CUDA, 'CUDA not available')
|
||||
def test_to_dense_cuda(self):
|
||||
self._test_to_dense(True)
|
||||
|
||||
def _test_to_dense_hybrid(self, is_cuda):
|
||||
IndexTensor, ValueTensor, SparseTensor = \
|
||||
cuda_triplet if is_cuda else cpu_triplet
|
||||
i = IndexTensor([
|
||||
[0, 1, 2, 2],
|
||||
[0, 0, 0, 3],
|
||||
])
|
||||
v = self.ValueTensor([[2, 3], [1, 2], [3, 4], [4, 5]])
|
||||
x = self.SparseTensor(i, v, torch.Size([3, 4, 2]))
|
||||
res = self.ValueTensor([
|
||||
v = ValueTensor([[2, 3], [1, 2], [3, 4], [4, 5]])
|
||||
x = SparseTensor(i, v, torch.Size([3, 4, 2]))
|
||||
res = ValueTensor([
|
||||
[[2, 3],
|
||||
[0, 0],
|
||||
[0, 0],
|
||||
@ -183,131 +135,145 @@ class TestSparse(TestCase):
|
||||
x.to_dense()
|
||||
self.assertEqual(res, x.to_dense())
|
||||
|
||||
def test_contig(self):
|
||||
i = self.IndexTensor([
|
||||
def test_to_dense_hybrid(self):
|
||||
self._test_to_dense_hybrid(False)
|
||||
|
||||
@unittest.skipIf(not TEST_CUDA, 'CUDA not available')
|
||||
def test_to_dense_hybrid_cuda(self):
|
||||
self._test_to_dense_hybrid(True)
|
||||
|
||||
def _test_contig(self, is_cuda):
|
||||
IndexTensor, ValueTensor, SparseTensor = \
|
||||
cuda_triplet if is_cuda else cpu_triplet
|
||||
i = IndexTensor([
|
||||
[1, 0, 35, 14, 39, 6, 71, 66, 40, 27],
|
||||
[92, 31, 62, 50, 22, 65, 89, 74, 56, 34],
|
||||
])
|
||||
v = self.ValueTensor([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
|
||||
x = self.SparseTensor(i, v, torch.Size([100, 100]))
|
||||
exp_i = self.IndexTensor([
|
||||
v = ValueTensor([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
|
||||
x = SparseTensor(i, v, torch.Size([100, 100]))
|
||||
exp_i = IndexTensor([
|
||||
[0, 1, 6, 14, 27, 35, 39, 40, 66, 71],
|
||||
[31, 92, 65, 50, 34, 62, 22, 56, 74, 89],
|
||||
])
|
||||
exp_v = self.ValueTensor([2, 1, 6, 4, 10, 3, 5, 9, 8, 7])
|
||||
exp_v = ValueTensor([2, 1, 6, 4, 10, 3, 5, 9, 8, 7])
|
||||
x = self.safeCoalesce(x)
|
||||
self.assertEqual(exp_i, x._indices())
|
||||
self.assertEqual(exp_v, x._values())
|
||||
self.assertEqual(exp_i, x.indices())
|
||||
self.assertEqual(exp_v, x.values())
|
||||
|
||||
i = self.IndexTensor([
|
||||
i = IndexTensor([
|
||||
[2, 0, 2, 1],
|
||||
[0, 0, 3, 0],
|
||||
[1, 0, 4, 0],
|
||||
])
|
||||
v = self.ValueTensor([3, 2, 4, 1])
|
||||
x = self.SparseTensor(i, v, torch.Size([3, 4, 5]))
|
||||
exp_i = self.IndexTensor([
|
||||
v = ValueTensor([3, 2, 4, 1])
|
||||
x = SparseTensor(i, v, torch.Size([3, 4, 5]))
|
||||
exp_i = IndexTensor([
|
||||
[0, 1, 2, 2],
|
||||
[0, 0, 0, 3],
|
||||
[0, 0, 1, 4],
|
||||
])
|
||||
exp_v = self.ValueTensor([2, 1, 3, 4])
|
||||
exp_v = ValueTensor([2, 1, 3, 4])
|
||||
|
||||
x = self.safeCoalesce(x)
|
||||
self.assertEqual(exp_i, x._indices())
|
||||
self.assertEqual(exp_v, x._values())
|
||||
self.assertEqual(exp_i, x.indices())
|
||||
self.assertEqual(exp_v, x.values())
|
||||
|
||||
# Duplicate indices
|
||||
i = self.IndexTensor([
|
||||
i = IndexTensor([
|
||||
[0, 0, 2, 0],
|
||||
[0, 0, 3, 0],
|
||||
[0, 0, 4, 0],
|
||||
])
|
||||
v = self.ValueTensor([3, 2, 4, 1])
|
||||
x = self.SparseTensor(i, v, torch.Size([3, 4, 5]))
|
||||
exp_i = self.IndexTensor([
|
||||
v = ValueTensor([3, 2, 4, 1])
|
||||
x = SparseTensor(i, v, torch.Size([3, 4, 5]))
|
||||
exp_i = IndexTensor([
|
||||
[0, 2],
|
||||
[0, 3],
|
||||
[0, 4],
|
||||
])
|
||||
exp_v = self.ValueTensor([6, 4])
|
||||
exp_v = ValueTensor([6, 4])
|
||||
|
||||
x = self.safeCoalesce(x)
|
||||
self.assertEqual(exp_i, x._indices())
|
||||
self.assertEqual(exp_v, x._values())
|
||||
self.assertEqual(exp_i, x.indices())
|
||||
self.assertEqual(exp_v, x.values())
|
||||
|
||||
def test_contig_hybrid(self):
|
||||
i = self.IndexTensor([
|
||||
def test_contig(self):
|
||||
self._test_contig(False)
|
||||
|
||||
@unittest.skipIf(not TEST_CUDA, 'CUDA not available')
|
||||
def test_contig_cuda(self):
|
||||
self._test_contig(True)
|
||||
|
||||
def _test_contig_hybrid(self, is_cuda):
|
||||
IndexTensor, ValueTensor, SparseTensor = \
|
||||
cuda_triplet if is_cuda else cpu_triplet
|
||||
i = IndexTensor([
|
||||
[1, 0, 35, 14, 39, 6, 71, 66, 40, 27],
|
||||
[92, 31, 62, 50, 22, 65, 89, 74, 56, 34],
|
||||
])
|
||||
v = self.ValueTensor([
|
||||
v = ValueTensor([
|
||||
[1, 2], [2, 3], [3, 4], [4, 5], [5, 6],
|
||||
[6, 7], [7, 8], [8, 9], [9, 10], [10, 11],
|
||||
])
|
||||
x = self.SparseTensor(i, v, torch.Size([100, 100, 2]))
|
||||
exp_i = self.IndexTensor([
|
||||
x = SparseTensor(i, v, torch.Size([100, 100, 2]))
|
||||
exp_i = IndexTensor([
|
||||
[0, 1, 6, 14, 27, 35, 39, 40, 66, 71],
|
||||
[31, 92, 65, 50, 34, 62, 22, 56, 74, 89],
|
||||
])
|
||||
exp_v = self.ValueTensor([
|
||||
exp_v = ValueTensor([
|
||||
[2, 3], [1, 2], [6, 7], [4, 5], [10, 11],
|
||||
[3, 4], [5, 6], [9, 10], [8, 9], [7, 8],
|
||||
])
|
||||
x = self.safeCoalesce(x)
|
||||
self.assertEqual(exp_i, x._indices())
|
||||
self.assertEqual(exp_v, x._values())
|
||||
self.assertEqual(exp_i, x.indices())
|
||||
self.assertEqual(exp_v, x.values())
|
||||
|
||||
i = self.IndexTensor([
|
||||
i = IndexTensor([
|
||||
[2, 0, 2, 1],
|
||||
[0, 0, 3, 0],
|
||||
[1, 0, 4, 0],
|
||||
])
|
||||
v = self.ValueTensor([[3, 3, 3], [2, 2, 2], [4, 4, 4], [1, 1, 1]])
|
||||
x = self.SparseTensor(i, v, torch.Size([3, 4, 5, 3]))
|
||||
exp_i = self.IndexTensor([
|
||||
v = ValueTensor([[3, 3, 3], [2, 2, 2], [4, 4, 4], [1, 1, 1]])
|
||||
x = SparseTensor(i, v, torch.Size([3, 4, 5, 3]))
|
||||
exp_i = IndexTensor([
|
||||
[0, 1, 2, 2],
|
||||
[0, 0, 0, 3],
|
||||
[0, 0, 1, 4],
|
||||
])
|
||||
exp_v = self.ValueTensor([[2, 2, 2], [1, 1, 1], [3, 3, 3], [4, 4, 4]])
|
||||
exp_v = ValueTensor([[2, 2, 2], [1, 1, 1], [3, 3, 3], [4, 4, 4]])
|
||||
|
||||
x = self.safeCoalesce(x)
|
||||
self.assertEqual(exp_i, x._indices())
|
||||
self.assertEqual(exp_v, x._values())
|
||||
self.assertEqual(exp_i, x.indices())
|
||||
self.assertEqual(exp_v, x.values())
|
||||
|
||||
# Duplicate indices
|
||||
i = self.IndexTensor([
|
||||
i = IndexTensor([
|
||||
[0, 0, 2, 0],
|
||||
[0, 0, 3, 0],
|
||||
[0, 0, 4, 0],
|
||||
])
|
||||
v = self.ValueTensor([[3, 2, 3], [2, 1, 1], [4, 3, 4], [1, 1, 1]])
|
||||
x = self.SparseTensor(i, v, torch.Size([3, 4, 5, 3]))
|
||||
exp_i = self.IndexTensor([
|
||||
v = ValueTensor([[3, 2, 3], [2, 1, 1], [4, 3, 4], [1, 1, 1]])
|
||||
x = SparseTensor(i, v, torch.Size([3, 4, 5, 3]))
|
||||
exp_i = IndexTensor([
|
||||
[0, 2],
|
||||
[0, 3],
|
||||
[0, 4],
|
||||
])
|
||||
exp_v = self.ValueTensor([[6, 4, 5], [4, 3, 4]])
|
||||
exp_v = ValueTensor([[6, 4, 5], [4, 3, 4]])
|
||||
|
||||
x = self.safeCoalesce(x)
|
||||
self.assertEqual(exp_i, x._indices())
|
||||
self.assertEqual(exp_v, x._values())
|
||||
self.assertEqual(exp_i, x.indices())
|
||||
self.assertEqual(exp_v, x.values())
|
||||
|
||||
def test_clone(self):
|
||||
x, _, _ = self._gen_sparse(4, 20, 5)
|
||||
if self.is_uncoalesced:
|
||||
self.assertFalse(x.is_coalesced())
|
||||
y = x.clone()
|
||||
self.assertFalse(y.is_coalesced())
|
||||
x = x.coalesce()
|
||||
self.assertTrue(x.is_coalesced())
|
||||
y = x.clone()
|
||||
self.assertTrue(y.is_coalesced())
|
||||
def test_contig_hybrid(self):
|
||||
self._test_contig_hybrid(False)
|
||||
|
||||
def test_transpose(self):
|
||||
x = self._gen_sparse(4, 20, 5)[0]
|
||||
@unittest.skipIf(not TEST_CUDA, 'CUDA not available')
|
||||
def test_contig_hybrid_cuda(self):
|
||||
self._test_contig_hybrid(True)
|
||||
|
||||
def _test_transpose(self, is_cuda):
|
||||
x = self._gen_sparse(4, 20, 5, is_cuda=is_cuda)[0]
|
||||
y = x.to_dense()
|
||||
|
||||
for i, j in itertools.combinations(range(4), 2):
|
||||
@ -319,7 +285,13 @@ class TestSparse(TestCase):
|
||||
y = y.transpose(i, j)
|
||||
self.assertEqual(x.to_dense(), y)
|
||||
|
||||
@cpu_only
|
||||
def test_transpose(self):
|
||||
self._test_transpose(False)
|
||||
|
||||
@unittest.skipIf(not TEST_CUDA, 'CUDA not available')
|
||||
def test_transpose_cuda(self):
|
||||
self._test_transpose(True)
|
||||
|
||||
def test_mm(self):
|
||||
def test_shape(di, dj, dk):
|
||||
x, _, _ = self._gen_sparse(2, 20, [di, dj])
|
||||
@ -344,7 +316,6 @@ class TestSparse(TestCase):
|
||||
test_shape(100, 1000, 200)
|
||||
test_shape(64, 10000, 300)
|
||||
|
||||
@cpu_only
|
||||
def test_saddmm(self):
|
||||
def test_shape(di, dj, dk):
|
||||
x = self._gen_sparse(2, 20, [di, dj])[0]
|
||||
@ -369,10 +340,12 @@ class TestSparse(TestCase):
|
||||
test_shape(1000, 100, 100)
|
||||
test_shape(3000, 64, 300)
|
||||
|
||||
def test_dsmm(self):
|
||||
def _test_dsmm(self, is_cuda):
|
||||
def test_shape(di, dj, dk):
|
||||
x = self._gen_sparse(2, 20, [di, dj])[0]
|
||||
y = self.randn(dj, dk)
|
||||
x = self._gen_sparse(2, 20, [di, dj], is_cuda)[0]
|
||||
y = torch.randn(dj, dk)
|
||||
if is_cuda:
|
||||
y = y.cuda()
|
||||
|
||||
res = torch.dsmm(x, y)
|
||||
expected = torch.mm(x.to_dense(), y)
|
||||
@ -382,10 +355,19 @@ class TestSparse(TestCase):
|
||||
test_shape(1000, 100, 100)
|
||||
test_shape(3000, 64, 300)
|
||||
|
||||
def test_hsmm(self):
|
||||
def test_dsmm(self):
|
||||
self._test_dsmm(False)
|
||||
|
||||
@unittest.skipIf(not TEST_CUDA, 'CUDA not available')
|
||||
def test_dsmm_cuda(self):
|
||||
self._test_dsmm(True)
|
||||
|
||||
def _test_hsmm(self, is_cuda):
|
||||
def test_shape(di, dj, dk):
|
||||
x = self._gen_sparse(2, 20, [di, dj])[0]
|
||||
y = self.randn(dj, dk)
|
||||
x = self._gen_sparse(2, 20, [di, dj], is_cuda)[0]
|
||||
y = torch.randn(dj, dk)
|
||||
if is_cuda:
|
||||
y = y.cuda()
|
||||
|
||||
res = torch.hsmm(x, y)
|
||||
expected = torch.mm(x.to_dense(), y)
|
||||
@ -395,10 +377,19 @@ class TestSparse(TestCase):
|
||||
test_shape(1000, 100, 100)
|
||||
test_shape(3000, 64, 300)
|
||||
|
||||
def _test_spadd_shape(self, shape_i, shape_v=None):
|
||||
def test_hsmm(self):
|
||||
self._test_hsmm(False)
|
||||
|
||||
@unittest.skipIf(not TEST_CUDA, 'CUDA not available')
|
||||
def test_hsmm_cuda(self):
|
||||
self._test_hsmm(True)
|
||||
|
||||
def _test_spadd_shape(self, is_cuda, shape_i, shape_v=None):
|
||||
shape = shape_i + (shape_v or [])
|
||||
x, _, _ = self._gen_sparse(len(shape_i), 10, shape)
|
||||
y = self.randn(*shape)
|
||||
x, _, _ = self._gen_sparse(len(shape_i), 10, shape, is_cuda)
|
||||
y = torch.randn(*shape)
|
||||
if is_cuda:
|
||||
y = y.cuda()
|
||||
r = random.random()
|
||||
|
||||
res = torch.add(y, r, x)
|
||||
@ -410,7 +401,9 @@ class TestSparse(TestCase):
|
||||
s = list(shape)
|
||||
s[0] = shape[-1]
|
||||
s[-1] = shape[0]
|
||||
y = self.randn(*s)
|
||||
y = torch.randn(*s)
|
||||
if is_cuda:
|
||||
y = y.cuda()
|
||||
y.transpose_(0, len(s) - 1)
|
||||
r = random.random()
|
||||
|
||||
@ -419,22 +412,36 @@ class TestSparse(TestCase):
|
||||
|
||||
self.assertEqual(res, expected)
|
||||
|
||||
def _test_spadd(self, is_cuda):
|
||||
self._test_spadd_shape(is_cuda, [5, 6])
|
||||
self._test_spadd_shape(is_cuda, [10, 10, 10])
|
||||
self._test_spadd_shape(is_cuda, [50, 30, 20])
|
||||
self._test_spadd_shape(is_cuda, [5, 5, 5, 5, 5, 5])
|
||||
|
||||
def test_spadd(self):
|
||||
self._test_spadd_shape([5, 6])
|
||||
self._test_spadd_shape([10, 10, 10])
|
||||
self._test_spadd_shape([50, 30, 20])
|
||||
self._test_spadd_shape([5, 5, 5, 5, 5, 5])
|
||||
self._test_spadd(False)
|
||||
|
||||
@unittest.skipIf(not TEST_CUDA, 'CUDA not available')
|
||||
def test_spadd_cuda(self):
|
||||
self._test_spadd(True)
|
||||
|
||||
def _test_spadd_hybrid(self, is_cuda):
|
||||
self._test_spadd_shape(is_cuda, [5, 6], [2, 3])
|
||||
self._test_spadd_shape(is_cuda, [10, 10, 10], [3])
|
||||
self._test_spadd_shape(is_cuda, [50, 30, 20], [2])
|
||||
self._test_spadd_shape(is_cuda, [5, 5, 5, 5, 5, 5], [2])
|
||||
|
||||
def test_spadd_hybrid(self):
|
||||
self._test_spadd_shape([5, 6], [2, 3])
|
||||
self._test_spadd_shape([10, 10, 10], [3])
|
||||
self._test_spadd_shape([50, 30, 20], [2])
|
||||
self._test_spadd_shape([5, 5, 5, 5, 5, 5], [2])
|
||||
self._test_spadd_hybrid(False)
|
||||
|
||||
def _test_basic_ops_shape(self, shape_i, shape_v=None):
|
||||
@unittest.skipIf(not TEST_CUDA, 'CUDA not available')
|
||||
def test_spadd_hybrid_cuda(self):
|
||||
self._test_spadd_hybrid(True)
|
||||
|
||||
def _test_basic_ops_shape(self, is_cuda, shape_i, shape_v=None):
|
||||
shape = shape_i + (shape_v or [])
|
||||
x1, _, _ = self._gen_sparse(len(shape_i), 9, shape)
|
||||
x2, _, _ = self._gen_sparse(len(shape_i), 12, shape)
|
||||
x1, _, _ = self._gen_sparse(len(shape_i), 9, shape, is_cuda)
|
||||
x2, _, _ = self._gen_sparse(len(shape_i), 12, shape, is_cuda)
|
||||
|
||||
y1 = x1 + x2
|
||||
y2 = x1.clone()
|
||||
@ -491,25 +498,39 @@ class TestSparse(TestCase):
|
||||
self.assertTrue(y.is_coalesced())
|
||||
self.assertEqual(x1, y)
|
||||
# check that coalesce is out of place
|
||||
y._values().add_(1)
|
||||
self.assertEqual(z._values() + 1, y._values())
|
||||
y.values().add_(1)
|
||||
self.assertEqual(z.values() + 1, y.values())
|
||||
|
||||
def _test_basic_ops(self, is_cuda):
|
||||
self._test_basic_ops_shape(is_cuda, [5, 6])
|
||||
self._test_basic_ops_shape(is_cuda, [10, 10, 10])
|
||||
self._test_basic_ops_shape(is_cuda, [50, 30, 20])
|
||||
self._test_basic_ops_shape(is_cuda, [5, 5, 5, 5, 5, 5])
|
||||
|
||||
def test_basic_ops(self):
|
||||
self._test_basic_ops_shape([5, 6])
|
||||
self._test_basic_ops_shape([10, 10, 10])
|
||||
self._test_basic_ops_shape([50, 30, 20])
|
||||
self._test_basic_ops_shape([5, 5, 5, 5, 5, 5])
|
||||
self._test_basic_ops(False)
|
||||
|
||||
@unittest.skipIf(not TEST_CUDA, 'CUDA not available')
|
||||
def test_basic_ops_cuda(self):
|
||||
self._test_basic_ops(True)
|
||||
|
||||
def _test_basic_ops_hybrid(self, is_cuda):
|
||||
self._test_basic_ops_shape(is_cuda, [5, 6], [2, 3])
|
||||
self._test_basic_ops_shape(is_cuda, [10, 10, 10], [3])
|
||||
self._test_basic_ops_shape(is_cuda, [50, 30, 20], [2])
|
||||
self._test_basic_ops_shape(is_cuda, [5, 5, 5, 5, 5, 5], [2])
|
||||
|
||||
def test_basic_ops_hybrid(self):
|
||||
self._test_basic_ops_shape([5, 6], [2, 3])
|
||||
self._test_basic_ops_shape([10, 10, 10], [3])
|
||||
self._test_basic_ops_shape([50, 30, 20], [2])
|
||||
self._test_basic_ops_shape([5, 5, 5, 5, 5, 5], [2])
|
||||
self._test_basic_ops_hybrid(False)
|
||||
|
||||
def _test_sparse_mask_shape(self, shape_i, shape_v=None):
|
||||
@unittest.skipIf(not TEST_CUDA, 'CUDA not available')
|
||||
def test_basic_ops_hybrid_cuda(self):
|
||||
self._test_basic_ops_hybrid(True)
|
||||
|
||||
def _test_sparse_mask_shape(self, is_cuda, shape_i, shape_v=None):
|
||||
shape = shape_i + (shape_v or [])
|
||||
x1, _, _ = self._gen_sparse(len(shape_i), 9, shape)
|
||||
x2, _, _ = self._gen_sparse(len(shape_i), 12, shape)
|
||||
x1, _, _ = self._gen_sparse(len(shape_i), 9, shape, is_cuda)
|
||||
x2, _, _ = self._gen_sparse(len(shape_i), 12, shape, is_cuda)
|
||||
|
||||
y1 = x1 + x2
|
||||
y2 = x1.clone()
|
||||
@ -518,108 +539,78 @@ class TestSparse(TestCase):
|
||||
self.assertEqual(y1.to_dense(), expected)
|
||||
self.assertEqual(y2.to_dense(), expected)
|
||||
|
||||
def _test_sparse_mask_fixed(self):
|
||||
i = self.IndexTensor([
|
||||
[1, 3, 0, 4],
|
||||
[2, 1, 2, 3],
|
||||
def _test_sparse_mask_fixed(self, is_cuda):
|
||||
IndexTensor, ValueTensor, SparseTensor = \
|
||||
cuda_triplet if is_cuda else cpu_triplet
|
||||
i = IndexTensor([
|
||||
[1, 3, 3, 0, 4],
|
||||
[2, 1, 1, 2, 3],
|
||||
])
|
||||
v = self.ValueTensor([1, 2, 3, 4])
|
||||
x = self.SparseTensor(i, v, torch.Size([5, 4])).coalesce()
|
||||
dense = self.ValueTensor([
|
||||
v = ValueTensor([1, 2, 3, 4, 5])
|
||||
x = SparseTensor(i, v, torch.Size([5, 4]))
|
||||
dense = ValueTensor([
|
||||
[1, 2, 3, 4],
|
||||
[5, 6, 7, 8],
|
||||
[9, 10, 11, 12],
|
||||
[13, 14, 15, 16],
|
||||
[17, 18, 19, 20],
|
||||
])
|
||||
exp_v = self.ValueTensor([7, 14, 3, 20])
|
||||
res = dense._sparse_mask(x)
|
||||
expected = self.SparseTensor(i, exp_v, torch.Size([5, 4]))
|
||||
exp_v = ValueTensor([7, 14, 14, 3, 20])
|
||||
res = dense.sparse_mask(x)
|
||||
expected = SparseTensor(i, exp_v, torch.Size([5, 4]))
|
||||
self.assertEqual(res, expected)
|
||||
|
||||
def _test_sparse_mask(self, is_cuda):
|
||||
self._test_sparse_mask_fixed(is_cuda)
|
||||
|
||||
self._test_sparse_mask_shape(is_cuda, [5, 6])
|
||||
self._test_sparse_mask_shape(is_cuda, [10, 10, 10])
|
||||
self._test_sparse_mask_shape(is_cuda, [50, 30, 20])
|
||||
self._test_sparse_mask_shape(is_cuda, [5, 5, 5, 5, 5, 5])
|
||||
|
||||
def test_sparse_mask(self):
|
||||
self._test_sparse_mask_fixed()
|
||||
self._test_sparse_mask(False)
|
||||
|
||||
self._test_sparse_mask_shape([5, 6])
|
||||
self._test_sparse_mask_shape([10, 10, 10])
|
||||
self._test_sparse_mask_shape([50, 30, 20])
|
||||
self._test_sparse_mask_shape([5, 5, 5, 5, 5, 5])
|
||||
@unittest.skipIf(not TEST_CUDA, 'CUDA not available')
|
||||
def test_sparse_mask_cuda(self):
|
||||
self._test_sparse_mask(True)
|
||||
|
||||
def _test_sparse_mask_hybrid_fixed(self):
|
||||
i = self.IndexTensor([
|
||||
[1, 3, 0, 4],
|
||||
[2, 1, 2, 3],
|
||||
def _test_sparse_mask_hybrid_fixed(self, is_cuda):
|
||||
IndexTensor, ValueTensor, SparseTensor = \
|
||||
cuda_triplet if is_cuda else cpu_triplet
|
||||
i = IndexTensor([
|
||||
[1, 3, 3, 0, 4],
|
||||
[2, 1, 1, 2, 3],
|
||||
])
|
||||
v = self.ValueTensor([[1, 2], [2, 3], [3, 4], [4, 5]])
|
||||
# TODO: This is also testing that, if coalesce is a no-op,
|
||||
# the indices don't get permuted. I don't know if we actually
|
||||
# want to give this invariant.
|
||||
x = self.SparseTensor(i, v, torch.Size([5, 4, 2])).coalesce()
|
||||
dense = self.ValueTensor([
|
||||
v = ValueTensor([[1, 2], [2, 3], [3, 4], [4, 5], [5, 6]])
|
||||
x = SparseTensor(i, v, torch.Size([5, 4, 2]))
|
||||
dense = ValueTensor([
|
||||
[[1, 3], [2, 2], [3, 3], [4, 2]],
|
||||
[[5, 7], [6, 7], [7, 9], [8, 9]],
|
||||
[[9, 2], [10, 4], [11, 1], [12, 3]],
|
||||
[[13, 5], [14, 1], [15, 1], [16, 6]],
|
||||
[[17, 7], [18, 2], [19, 7], [20, 1]],
|
||||
])
|
||||
res = dense._sparse_mask(x)
|
||||
exp_v = self.ValueTensor([[7, 9], [14, 1], [3, 3], [20, 1]])
|
||||
expected = self.SparseTensor(i, exp_v, torch.Size([5, 4, 2]))
|
||||
res = dense.sparse_mask(x)
|
||||
exp_v = ValueTensor([[7, 9], [14, 1], [14, 1], [3, 3], [20, 1]])
|
||||
expected = SparseTensor(i, exp_v, torch.Size([5, 4, 2]))
|
||||
self.assertEqual(res, expected)
|
||||
|
||||
def _test_sparse_mask_hybrid(self, is_cuda):
|
||||
self._test_sparse_mask_hybrid_fixed(is_cuda)
|
||||
|
||||
self._test_sparse_mask_shape(is_cuda, [5, 6], [2, 3])
|
||||
self._test_sparse_mask_shape(is_cuda, [10, 10, 10], [3])
|
||||
self._test_sparse_mask_shape(is_cuda, [50, 30, 20], [2])
|
||||
self._test_sparse_mask_shape(is_cuda, [5, 5, 5, 5, 5, 5], [2])
|
||||
|
||||
def test_sparse_mask_hybrid(self):
|
||||
self._test_sparse_mask_hybrid_fixed()
|
||||
self._test_sparse_mask_hybrid(False)
|
||||
|
||||
self._test_sparse_mask_shape([5, 6], [2, 3])
|
||||
self._test_sparse_mask_shape([10, 10, 10], [3])
|
||||
self._test_sparse_mask_shape([50, 30, 20], [2])
|
||||
self._test_sparse_mask_shape([5, 5, 5, 5, 5, 5], [2])
|
||||
@unittest.skipIf(not TEST_CUDA, 'CUDA not available')
|
||||
def test_sparse_mask_hybrid_cuda(self):
|
||||
self._test_sparse_mask_hybrid(True)
|
||||
|
||||
@cuda_only
|
||||
def test_storage_not_null(self):
|
||||
x = torch.cuda.sparse.FloatTensor(2)
|
||||
self.assertNotEqual(x.get_device(), -1)
|
||||
|
||||
@cuda_only
|
||||
@unittest.skipIf(torch.cuda.device_count() < 2, "only one GPU detected")
|
||||
def test_same_gpu(self):
|
||||
i = self.IndexTensor([[2]]).cuda(1)
|
||||
v = self.ValueTensor([5]).cuda(1)
|
||||
x = self.SparseTensor(i, v, torch.Size([3]), device=1)
|
||||
self.assertEqual(x.get_device(), 1)
|
||||
self.assertEqual(x._values().get_device(), 1)
|
||||
self.assertEqual(x._indices().get_device(), 1)
|
||||
|
||||
x = self.SparseTensor(3, device=1)
|
||||
self.assertEqual(x.get_device(), 1)
|
||||
self.assertEqual(x._values().get_device(), 1)
|
||||
self.assertEqual(x._indices().get_device(), 1)
|
||||
|
||||
v = self.ValueTensor([5]).cuda(0)
|
||||
self.assertRaises(RuntimeError, lambda: self.SparseTensor(i, v, torch.Size([3])))
|
||||
|
||||
|
||||
class TestUncoalescedSparse(TestSparse):
|
||||
def setUp(self):
|
||||
super(TestUncoalescedSparse, self).setUp()
|
||||
self.is_uncoalesced = True
|
||||
|
||||
|
||||
@unittest.skipIf(not TEST_CUDA, 'CUDA not available')
|
||||
class TestCudaSparse(TestSparse):
|
||||
def setUp(self):
|
||||
super(TestCudaSparse, self).setUp()
|
||||
self.is_cuda = True
|
||||
self.IndexTensor = torch.cuda.LongTensor
|
||||
self.ValueTensor = torch.cuda.DoubleTensor
|
||||
self.SparseTensor = torch.cuda.sparse.DoubleTensor
|
||||
|
||||
|
||||
@unittest.skipIf(not TEST_CUDA, 'CUDA not available')
|
||||
class TestCudaUncoalescedSparse(TestCudaSparse):
|
||||
def setUp(self):
|
||||
super(TestCudaUncoalescedSparse, self).setUp()
|
||||
self.is_uncoalesced = True
|
||||
|
||||
if __name__ == '__main__':
|
||||
run_tests()
|
||||
|
||||
test/test_torch.py: 1191 changed lines (file diff suppressed because it is too large)
@@ -336,11 +336,16 @@ class TestLuaReader(TestCase):

@classmethod
def init(cls):
try:
path = download_file('https://download.pytorch.org/test_data/legacy_modules.t7')
except unittest.SkipTest:
DATA_URL = 'https://download.pytorch.org/test_data/legacy_modules.t7'
data_dir = os.path.join(os.path.dirname(__file__), 'data')
test_file_path = os.path.join(data_dir, 'legacy_modules.t7')
succ = download_file(DATA_URL, test_file_path)
if not succ:
warnings.warn(("Couldn't download the test file for TestLuaReader! "
"Tests will be incomplete!"), RuntimeWarning)
return
tests = load_lua(path)

tests = load_lua(test_file_path)
for name, test in tests['modules'].items():
test_name = 'test_' + name.replace('nn.', '')
setattr(cls, test_name, cls._module_test(name, test))

@ -4,7 +4,6 @@ from string import Template
|
||||
from copy import deepcopy
|
||||
from .plugins import ArgcountChecker, OptionalArguments, ArgumentReferences, \
|
||||
BeforeAfterCall, ConstantArguments, ReturnArguments, GILRelease
|
||||
from ..shared import cwrap_common
|
||||
|
||||
|
||||
class cwrap(object):
|
||||
@ -36,11 +35,11 @@ class cwrap(object):
|
||||
DEFAULT_PLUGIN_CLASSES = [ArgcountChecker, ConstantArguments, OptionalArguments,
|
||||
ArgumentReferences, BeforeAfterCall, ReturnArguments, GILRelease]
|
||||
|
||||
def __init__(self, source, destination=None, plugins=None, default_plugins=True):
|
||||
def __init__(self, source, destination=None, plugins=[], default_plugins=True):
|
||||
if destination is None:
|
||||
destination = source.replace('.cwrap', '.cpp')
|
||||
|
||||
self.plugins = [] if plugins is None else plugins
|
||||
self.plugins = plugins
|
||||
if default_plugins:
|
||||
defaults = [cls() for cls in self.DEFAULT_PLUGIN_CLASSES]
|
||||
self.plugins = defaults + self.plugins
|
||||
@ -52,10 +51,7 @@ class cwrap(object):
|
||||
with open(source, 'r') as f:
|
||||
declarations = f.read()
|
||||
|
||||
# wrap all the declarations in the source .cwrap file
|
||||
wrapper = self.wrap_declarations(declarations)
|
||||
|
||||
# let each plugin do any post-processing of the wrapped file
|
||||
for plugin in self.plugins:
|
||||
wrapper = plugin.process_full_file(wrapper)
|
||||
|
||||
@ -77,7 +73,7 @@ class cwrap(object):
|
||||
elif line == ']]':
|
||||
in_declaration = False
|
||||
declaration = yaml.load('\n'.join(declaration_lines))
|
||||
cwrap_common.set_declaration_defaults(declaration)
|
||||
self.set_declaration_defaults(declaration)
|
||||
|
||||
# Pass declaration in a list - maybe some plugins want to add
|
||||
# multiple wrappers
|
||||
@ -105,6 +101,24 @@ class cwrap(object):
|
||||
|
||||
return '\n'.join(output)
|
||||
|
||||
def set_declaration_defaults(self, declaration):
|
||||
declaration.setdefault('arguments', [])
|
||||
declaration.setdefault('return', 'void')
|
||||
if 'cname' not in declaration:
|
||||
declaration['cname'] = declaration['name']
|
||||
# Simulate multiple dispatch, even if it's not necessary
|
||||
if 'options' not in declaration:
|
||||
declaration['options'] = [{'arguments': declaration['arguments']}]
|
||||
del declaration['arguments']
|
||||
# Parse arguments (some of them can be strings)
|
||||
for option in declaration['options']:
|
||||
option['arguments'] = self.parse_arguments(option['arguments'])
|
||||
# Propagate defaults from declaration to options
|
||||
for option in declaration['options']:
|
||||
for k, v in declaration.items():
|
||||
if k != 'name' and k != 'options':
|
||||
option.setdefault(k, v)
|
||||
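# Illustrative example (hypothetical declaration, for exposition only): a minimal
# {'name': 'foo', 'arguments': ['int x']} gains 'cname': 'foo', 'return': 'void',
# and a single-entry 'options' list wrapping its (parsed) arguments, with the
# remaining declaration-level keys copied onto that option.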
|
||||
def parse_arguments(self, args):
|
||||
new_args = []
|
||||
for arg in args:
|
||||
@ -122,10 +136,6 @@ class cwrap(object):
|
||||
return new_args
|
||||
|
||||
def search_plugins(self, fnname, args, fallback):
|
||||
"""Search plugins for the given function to call with args.
|
||||
|
||||
If not found, call fallback with args.
|
||||
"""
|
||||
for plugin in self.plugins:
|
||||
wrapper = getattr(plugin, fnname)(*args)
|
||||
if wrapper is not None:
|
||||
|
||||
@ -1,6 +1,4 @@
|
||||
import os
|
||||
from . import CWrapPlugin
|
||||
from ...shared import cwrap_common
|
||||
|
||||
|
||||
class ArgcountSortPlugin(CWrapPlugin):
|
||||
@ -9,7 +7,8 @@ class ArgcountSortPlugin(CWrapPlugin):
|
||||
self.descending = descending
|
||||
|
||||
def process_declarations(self, declarations):
|
||||
def num_checked_args(option):
|
||||
return sum(map(lambda a: not a.get('ignore_check', False), option['arguments']))
|
||||
for declaration in declarations:
|
||||
cwrap_common.sort_by_number_of_options(declaration,
|
||||
self.descending)
|
||||
declaration['options'].sort(key=num_checked_args, reverse=self.descending)
|
||||
return declarations
|
||||
|
||||
@ -1,29 +0,0 @@
|
||||
from . import CWrapPlugin
|
||||
from string import Template
|
||||
|
||||
|
||||
class AssertNDim(CWrapPlugin):
|
||||
|
||||
PRE_CODE_TEMPLATE = Template(
|
||||
"""if(THTensor_(nDimension)(LIBRARY_STATE ${arg_op}) != ${dim_value}) {
|
||||
THError("Expected argument %s to have %d dimension(s), but has %d",
|
||||
"${op}", ${dim_value}, THTensor_(nDimension)(LIBRARY_STATE ${arg_op}));
|
||||
}
|
||||
""")
|
||||
|
||||
def process_option_code_template(self, template, option):
|
||||
new_code_pre = []
|
||||
|
||||
for _, arg in enumerate(option['arguments']):
|
||||
if 'assert_ndim' not in arg:
|
||||
continue
|
||||
|
||||
dim_value = arg.get('assert_ndim')
|
||||
op = arg.get('assign_name', arg['name'])
|
||||
arg_op = "arg_" + op
|
||||
new_code_pre.append(self.PRE_CODE_TEMPLATE.substitute(op=op,
|
||||
arg_op=arg_op,
|
||||
dim_value=dim_value))
|
||||
template = new_code_pre + template
|
||||
|
||||
return template
|
||||
@ -1,12 +1,6 @@
|
||||
from . import CWrapPlugin
|
||||
from string import Template
|
||||
|
||||
import sys
|
||||
if sys.version_info[0] == 3:
|
||||
string_type = str
|
||||
else:
|
||||
string_type = basestring
|
||||
|
||||
|
||||
class BoolOption(CWrapPlugin):
|
||||
|
||||
@ -21,8 +15,7 @@ class BoolOption(CWrapPlugin):
|
||||
for arg in option['arguments']:
|
||||
if self.is_bool_option(arg):
|
||||
arg['is_bool_option'] = True
|
||||
if isinstance(arg['if_true'], string_type):
|
||||
arg['type'] = 'const char*'
|
||||
arg['type'] = 'const char*'
|
||||
return declarations
|
||||
|
||||
def get_type_check(self, arg, option):
|
||||
|
||||
@ -1,318 +0,0 @@
|
||||
from . import CWrapPlugin
|
||||
from string import Template
|
||||
|
||||
# Arguments to the Broadcast Plugin:
|
||||
# broadcast: args_to_broadcast_against [inplace] [fallback]
|
||||
# [args_to_broadcast_against]: either a single argument (e.g. "arg1") or a comma-separated
|
||||
# list of two arguments (e.g. "tensor1,tensor2") indicating
|
||||
# arguments to broadcast specified argument (usually "self") against
|
||||
# [inplace] will generate code for in-place function, which doesn't allow the in-place
|
||||
# argument to be broadcast
|
||||
# [fallback] if tensors aren't broadcastable, preserves "element number" pointwise behavior,
|
||||
# where only number of elements need to match, and tensors are viewed as 1-dimensional.
|
||||
# [dims] specify if the tensors shouldn't be broadcast to a specific tensor or tensors, but to a combination
|
||||
# of individual dimension sizes of a set of tensors. For example: addbmm(C,A,B) a.k.a. [C + A @ B]
|
||||
# broadcasts C to the first dimension of A and the second dimension of B. Each dimension is specified as
|
||||
# [arg].dim[#] and dimensions are comma-separated. So, to specify that the tensor should be
|
||||
# broadcast to 3-dimensions with sizes:
|
||||
# tensor0->size[0] x tensor1->size[1] x tensor2->size[2]
|
||||
# you would write:
|
||||
# dims:tensor0.dim0,tensor1.dim1,tensor2.dim2
|
||||
# [types] if the tensors should be of different types than THTensor, specify as X where
|
||||
# the actual type to use is THXTensor (i.e. Byte for THByteTensor). If the type
|
||||
# should be THTensor, use 'Real'
|
||||
|
||||
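# An illustrative (hypothetical) use of these options on an argument entry:
#   - arg: THTensor* self
#     broadcast: other fallback inplace
# expands "other" to the shape of "self" for the in-place variant, preserving the
# old element-count behavior when the shapes are not broadcastable, while
#   broadcast: mat1,mat2 dims:mat1.dim0,mat2.dim1
# expands the annotated tensor to size mat1->size[0] x mat2->size[1].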
# For out of place:
|
||||
# Two args: expand the two args together
|
||||
# Three args (fused kernels): (e.g. addcmul) expand all three args together
|
||||
# Sketch of proof that this is the same:
|
||||
# consider addcmul, under expansion we want: a + (b * c) = (a + b * c) [all expanded together]
|
||||
# Let e(i, j) be the expansion of i with j, e(i, j, k) be the expansion of i with j,k
|
||||
#
|
||||
# Then a + (b * c) = e(a, e(b,c) * e(c,b)) + e(e(b,c) * e(c,b), a)
|
||||
# = e(a, e(b,c)) + e(e(b,c) * e(c,b), a) (only size matters for second param)
|
||||
# = e(a,b,c) + e(e(b,c) * e(c,b), a) (by associativity of max in expand)
|
||||
# = e(a,b,c) + e(b,c,a) * e(c,b,a) (see L1)
|
||||
# which is a + b * c all expanded together
|
||||
#
|
||||
# L1: Show e(i * j, a) = e(i,a) * e(j,a) where i,j have same size
|
||||
# Consider any index _{ s_0, ..., s_n}
|
||||
# e(i * j, a) = (i*j)_{f(s_0), ...,f(s_n)} where f is the expansion of that dimension with a
|
||||
# = i_{f(s_0), ..., f(s_n)} * j_{f(s_0), ..., f(s_n)} by definition of pointwise operator
|
||||
# = e(i,a) * e(j,a)
|
||||
|
||||
|
||||
class Broadcast(CWrapPlugin):
|
||||
|
||||
# Save and restore passed-in arguments in case later plugins use them
|
||||
POST_TEMPLATE = Template(
|
||||
"""${arg_op_other} = ${arg_op_other}_save;\n""")
|
||||
|
||||
def getPreArgStringTemplate(self, type=None):
|
||||
if type is None:
|
||||
ret = """THTensor *${arg_op_other}_save = ${arg_op_other};
|
||||
THTensorPtr ${arg_op_other}_guard(THTensor_(new)(LIBRARY_STATE_NOARGS));\n"""
|
||||
else:
|
||||
cpu_t = "TH" + type + "Tensor"
|
||||
gpu_t = "THCuda" + type + "Tensor"
|
||||
ret = ("#if !IS_CUDA\n" +
|
||||
cpu_t + " *${arg_op_other}_save = ${arg_op_other};\n" +
|
||||
cpu_t + "Ptr ${arg_op_other}_guard(" + cpu_t + "_new(LIBRARY_STATE_NOARGS));\n" +
|
||||
"#else\n" +
|
||||
gpu_t + " *${arg_op_other}_save = ${arg_op_other};\n" +
|
||||
"THPPointer<" + gpu_t + "> ${arg_op_other}_guard(\n" + gpu_t + "_new(LIBRARY_STATE_NOARGS));\n" +
|
||||
"#endif\n")
|
||||
return Template(ret)
|
||||
|
||||
def getExpandTemplate(self, expand_call, success_code, raise_errors):
|
||||
if not raise_errors:
|
||||
return Template(
|
||||
"bool expand_success = false;\n" +
|
||||
"try {\n" +
|
||||
expand_call +
|
||||
"\nexpand_success = true;\n" +
|
||||
"}\n"
|
||||
"catch (std::exception &e) {}\n" +
|
||||
"if(expand_success) {\n" +
|
||||
success_code +
|
||||
"\n}\n")
|
||||
else:
|
||||
return Template(
|
||||
expand_call + "\n" +
|
||||
success_code + "\n")
|
||||
|
||||
def getOutPlacePreExpand2Template(self, raise_errors):
|
||||
expand_code = """expand_outplace2(LIBRARY_STATE ${arg_op_a}_guard.get(), ${arg_op_other}_guard.get(),
|
||||
${arg_op_a}, ${arg_op_other},
|
||||
\"${op_a}\", \"${op_other}\", !${raise_errors});"""
|
||||
success_code = """${arg_op_a} = ${arg_op_a}_guard.get();
|
||||
${arg_op_other} = ${arg_op_other}_guard.get();"""
|
||||
return self.getExpandTemplate(expand_code, success_code, raise_errors)
|
||||
|
||||
def getOutPlacePreExpand3Template(self, raise_errors):
|
||||
expand_code = """expand_outplace3(LIBRARY_STATE ${arg_op_a}_guard.get(),
|
||||
${arg_op_other1}_guard.get(), ${arg_op_other2}_guard.get(),
|
||||
${arg_op_a}, ${arg_op_other1}, ${arg_op_other2},
|
||||
\"${op_a}\", \"${op_other1}\", \"${op_other2}\", !${raise_errors});"""
|
||||
success_code = """${arg_op_a} = ${arg_op_a}_guard.get();
|
||||
${arg_op_other1} = ${arg_op_other1}_guard.get();
|
||||
${arg_op_other2} = ${arg_op_other2}_guard.get();"""
|
||||
return self.getExpandTemplate(expand_code, success_code, raise_errors)
|
||||
|
||||
OUT_PLACE_PRE_EXPAND_PRE_DIM_TEMPLATE = Template(
|
||||
"""if(THTensor_(nDimension)(LIBRARY_STATE ${arg_op_dim}) <= ${arg_op_dim_value}) {
|
||||
THError("Argument %s requires at least %d dimensions, but only has %d",
|
||||
"${op_dim}", ${arg_op_dim_value} + 1, THTensor_(nDimension)(LIBRARY_STATE ${arg_op_dim}));
|
||||
}
|
||||
long ${arg_op_a}_dim${idx}_size = THTensor_(size)(LIBRARY_STATE ${arg_op_dim}, ${arg_op_dim_value});\n""")
|
||||
|
||||
OUT_PLACE_PRE_EXPAND1_DIM_TEMPLATE = Template(
|
||||
"""THLongStoragePtr ${arg_op_a}_storage(THLongStorage_newWithSize1(${arg_op_a}_dim0_size));\n""")
|
||||
|
||||
OUT_PLACE_PRE_EXPAND2_DIM_TEMPLATE = Template(
|
||||
"""THLongStoragePtr ${arg_op_a}_storage(
|
||||
THLongStorage_newWithSize2(${arg_op_a}_dim0_size, ${arg_op_a}_dim1_size));\n""")
|
||||
|
||||
OUT_PLACE_PRE_EXPAND3_DIM_TEMPLATE = Template(
|
||||
"""THLongStoragePtr ${arg_op_a}_storage(
|
||||
THLongStorage_newWithSize3(${arg_op_a}_dim0_size, ${arg_op_a}_dim1_size, ${arg_op_a}_dim2_size));\n""")
|
||||
|
||||
def getOutPlacePreExpandPostDimTemplate(self, raise_errors):
|
||||
expand_code = """expand(LIBRARY_STATE ${arg_op_a}_guard.get(), ${arg_op_a}, ${arg_op_a}_storage);"""
|
||||
success_code = """${arg_op_a} = ${arg_op_a}_guard.get();"""
|
||||
return self.getExpandTemplate(expand_code, success_code, raise_errors)
|
||||
|
||||
OUT_PLACE_PRE_TEMPLATE = Template(
|
||||
"""${code_arg_op_a}${code_arg_op_other1}${code_arg_op_other2}
|
||||
${expand_code}""")
|
||||
|
||||
def getInPlacePreExpand1Template(self, raise_errors):
|
||||
expand_code = """expand_inplace1(LIBRARY_STATE ${arg_op_other}_guard.get(), ${arg_op_other}, ${arg_op_a},
|
||||
\"${op_other}\", \"${op_a}\", !${raise_errors});"""
|
||||
success_code = """${arg_op_other} = ${arg_op_other}_guard.get();"""
|
||||
return self.getExpandTemplate(expand_code, success_code, raise_errors)
|
||||
|
||||
def getInPlacePreExpand2Template(self, raise_errors):
|
||||
expand_code = """expand_inplace2(LIBRARY_STATE ${arg_op_other1}_guard.get(), ${arg_op_other2}_guard.get(),
|
||||
${arg_op_other1}, ${arg_op_other2}, ${arg_op_a},
|
||||
\"${op_other1}\", \"${op_other2}\", \"${op_a}\", !${raise_errors});"""
|
||||
success_code = """${arg_op_other1} = ${arg_op_other1}_guard.get();
|
||||
${arg_op_other2} = ${arg_op_other2}_guard.get();"""
|
||||
return self.getExpandTemplate(expand_code, success_code, raise_errors)
|
||||
|
||||
IN_PLACE_PRE_TEMPLATE = Template(
|
||||
"""${code_arg_op_other1}${code_arg_op_other2}
|
||||
${expand_code}""")
|
||||
|
||||
def initialize(self, cwrap):
|
||||
self.cwrap = cwrap
|
||||
|
||||
# Arguments:
|
||||
# [0]: name of tensor to broadcast with (possibly two comma separated)
|
||||
# [1] inplace (optional). In place operations only broadcast on second tensor argument
|
||||
# [2] fallback (optional). Will fallback to applying to tensor of equal nElem if broadcast fails
|
||||
def process_option_code_template(self, template, option):
|
||||
new_code_pre = []
|
||||
new_code_post = []
|
||||
for _, arg in enumerate(option['arguments']):
|
||||
if 'broadcast' not in arg:
|
||||
continue
|
||||
|
||||
params = arg.get('broadcast').split(" ")
|
||||
op_a = arg.get('assign_name', arg['name'])
|
||||
in_place = "inplace" in params
|
||||
raise_errors = "false" if "fallback" in params else "true"
|
||||
|
||||
param_others = params[0].split(",")
|
||||
if len(param_others) > 2:
|
||||
raise ValueError('Broadcast only supports up to 2 secondary parameters')
|
||||
op_b = param_others[0]
|
||||
op_c = param_others[1] if len(param_others) == 2 else None
|
||||
arg_op_b = "arg_" + op_b
|
||||
arg_op_a = "arg_" + op_a
|
||||
arg_op_c = ("arg_" + op_c) if op_c else None
|
||||
|
||||
dims_kvs = []
|
||||
for p in params:
|
||||
if p.startswith("dims:"):
|
||||
assert(raise_errors == "true")
|
||||
if len(dims_kvs) != 0:
|
||||
raise ValueError("multiple specifications of dims")
|
||||
dims = p[len("dims:"):].split(",")
|
||||
for dim in dims:
|
||||
batchdim = dim.split(".")
|
||||
assert len(batchdim) == 2
|
||||
assert batchdim[1].startswith("dim")
|
||||
dim_val = batchdim[1][len("dim"):]
|
||||
dims_kvs.append({"op": batchdim[0], "arg_op": "arg_" + batchdim[0], "val": dim_val})
|
||||
|
||||
assert len(dims_kvs) <= 3
|
||||
for p in params[1:]:
|
||||
if p != "inplace" and p != "fallback" and not p.startswith("dims:") and not p.startswith("types:"):
|
||||
raise ValueError("invalid parameter {}".format(p))
|
||||
|
||||
type_op_b = None
|
||||
type_op_c = None
|
||||
for p in params:
|
||||
if p.startswith("types:"):
|
||||
if not in_place and len(dims_kvs) > 0:
|
||||
raise ValueError("type specification not supported yet for out-of-place functions "
|
||||
"that specify explicit dimensions")
|
||||
types = p[len("types:"):].split(",")
|
||||
assert(len(types) == (2 if op_c else 1))
|
||||
type_op_b = None if types[0] == "Real" else types[0]
|
||||
if op_c:
|
||||
type_op_c = None if types[1] == "Real" else types[1]
|
||||
|
||||
op_b_mapping = {
|
||||
"op_a": op_a,
|
||||
"op_other": op_b,
|
||||
"arg_op_a": arg_op_a,
|
||||
"arg_op_other": arg_op_b,
|
||||
"raise_errors": raise_errors
|
||||
}
|
||||
op_c_mapping = {
|
||||
"op_a": op_a,
|
||||
"op_other": op_c,
|
||||
"arg_op_a": arg_op_a,
|
||||
"arg_op_other": arg_op_c,
|
||||
"raise_errors": raise_errors
|
||||
}
|
||||
|
||||
if in_place:
|
||||
code_arg_op_other1 = self.getPreArgStringTemplate(type=type_op_b).substitute(op_b_mapping)
|
||||
code_arg_op_other2 = (
|
||||
self.getPreArgStringTemplate(type=type_op_c).substitute(op_c_mapping) if op_c else "")
|
||||
|
||||
if op_c:
|
||||
expand_code = self.getInPlacePreExpand2Template(raise_errors == "true").substitute(
|
||||
op_b_mapping,
|
||||
op_other1=op_b,
|
||||
op_other2=op_c,
|
||||
arg_op_other1=arg_op_b,
|
||||
arg_op_other2=arg_op_c)
|
||||
else:
|
||||
expand_code = self.getInPlacePreExpand1Template(raise_errors == "true").substitute(op_b_mapping)
|
||||
|
||||
new_code_pre.append(self.IN_PLACE_PRE_TEMPLATE.substitute(
|
||||
arg_op_a=arg_op_a,
|
||||
code_arg_op_other1=code_arg_op_other1,
|
||||
code_arg_op_other2=code_arg_op_other2,
|
||||
expand_code=expand_code,
|
||||
raise_errors=raise_errors))
|
||||
new_code_pre.append("")
|
||||
|
||||
post_code = self.POST_TEMPLATE.substitute(op_b_mapping)
|
||||
if op_c:
|
||||
post_code += self.POST_TEMPLATE.substitute(op_c_mapping)
|
||||
|
||||
new_code_post.append(post_code)
|
||||
new_code_post.append("")
|
||||
else:
|
||||
if len(dims_kvs) != 0:
|
||||
code_arg_op_a = self.getPreArgStringTemplate().substitute(arg_op_other=arg_op_a)
|
||||
code_arg_op_other1 = ""
|
||||
code_arg_op_other2 = ""
|
||||
expand_code = ""
|
||||
for idx, kv in enumerate(dims_kvs):
|
||||
expand_code += self.OUT_PLACE_PRE_EXPAND_PRE_DIM_TEMPLATE.substitute(
|
||||
arg_op_a=arg_op_a,
|
||||
op_dim=kv["op"],
|
||||
arg_op_dim=kv["arg_op"],
|
||||
arg_op_dim_value=kv["val"],
|
||||
idx=idx)
|
||||
|
||||
if len(dims_kvs) == 1:
|
||||
expand_code += self.OUT_PLACE_PRE_EXPAND1_DIM_TEMPLATE.substitute(
|
||||
arg_op_a=arg_op_a,
|
||||
arg_op_dim0=dims_kvs[0]["arg_op"])
|
||||
elif len(dims_kvs) == 2:
|
||||
expand_code += self.OUT_PLACE_PRE_EXPAND2_DIM_TEMPLATE.substitute(
|
||||
arg_op_a=arg_op_a,
|
||||
arg_op_dim0=dims_kvs[0]["arg_op"],
|
||||
arg_op_dim1=dims_kvs[1]["arg_op"])
|
||||
else:
|
||||
expand_code += self.OUT_PLACE_PRE_EXPAND3_DIM_TEMPLATE.substitute(
|
||||
arg_op_a=arg_op_a,
|
||||
arg_op_dim0=dims_kvs[0]["arg_op"],
|
||||
arg_op_dim1=dims_kvs[1]["arg_op"],
|
||||
arg_op_dim2=dims_kvs[2]["arg_op"])
|
||||
expand_code += self.getOutPlacePreExpandPostDimTemplate(raise_errors == "true").substitute(
|
||||
arg_op_a=arg_op_a,
|
||||
raise_errors=raise_errors)
|
||||
post_code = self.POST_TEMPLATE.substitute(arg_op_other=arg_op_a)
|
||||
|
||||
else:
|
||||
code_arg_op_a = self.getPreArgStringTemplate().substitute(arg_op_other=arg_op_a)
|
||||
code_arg_op_other1 = self.getPreArgStringTemplate(type=type_op_b).substitute(op_b_mapping)
|
||||
code_arg_op_other2 = (self.getPreArgStringTemplate(type=type_op_c).substitute(op_c_mapping)
|
||||
if op_c else "")
|
||||
|
||||
if op_c:
|
||||
expand_code = self.getOutPlacePreExpand3Template(raise_errors == "true").substitute(
|
||||
op_b_mapping,
|
||||
op_other1=op_b,
|
||||
op_other2=op_c,
|
||||
arg_op_other1=arg_op_b,
|
||||
arg_op_other2=arg_op_c)
|
||||
|
||||
else:
|
||||
expand_code = self.getOutPlacePreExpand2Template(
|
||||
raise_errors == "true").substitute(op_b_mapping)
|
||||
|
||||
post_code = self.POST_TEMPLATE.substitute(arg_op_other=arg_op_a)
|
||||
post_code += self.POST_TEMPLATE.substitute(op_b_mapping)
|
||||
post_code += self.POST_TEMPLATE.substitute(op_c_mapping) if op_c else ""
|
||||
|
||||
new_code_pre.append(self.OUT_PLACE_PRE_TEMPLATE.substitute(
|
||||
code_arg_op_a=code_arg_op_a,
|
||||
code_arg_op_other1=code_arg_op_other1,
|
||||
code_arg_op_other2=code_arg_op_other2,
|
||||
expand_code=expand_code))
|
||||
new_code_pre.append("")
|
||||
|
||||
new_code_post.append(post_code)
|
||||
new_code_post.append("")
|
||||
|
||||
template = new_code_pre + template + new_code_post
|
||||
return template
|
||||
@ -135,7 +135,7 @@ static PyObject * $name(PyObject *self, PyObject *args, PyObject *kwargs)
|
||||
if arg['name'] in ['self', 'state', 'dataType', 'handle']:
|
||||
arg['ignore_check'] = True
|
||||
declaration['options'] = self.filter_unique_options(declaration['options'])
|
||||
return [d for d in declarations if not d.get('only_register', False)]
|
||||
return declarations
|
||||
|
||||
def filter_unique_options(self, options):
|
||||
def signature(option):
|
||||
|
||||
@ -23,8 +23,6 @@ class GILRelease(CWrapPlugin):
|
||||
]
|
||||
|
||||
def process_option_code_template(self, template, option):
|
||||
if option.get('with_gil', False):
|
||||
return template
|
||||
call_idx = template.index('$call')
|
||||
template.insert(call_idx, self.BEFORE_CALL)
|
||||
template.insert(call_idx + 2, self.AFTER_CALL)
|
||||
|
||||
@ -30,8 +30,10 @@ class KwargsPlugin(CWrapPlugin):
|
||||
for option in declaration['options']:
|
||||
offset = 0
|
||||
for arg in option['arguments']:
|
||||
if arg.get('kwarg_only'):
|
||||
arg['no_idx'] = True
|
||||
if arg.get('kwarg_only') and not arg.get('ignore_check', False):
|
||||
offset += 1
|
||||
else:
|
||||
arg['kwarg_offset'] = offset
|
||||
return declarations
|
||||
|
||||
def get_arg_accessor(self, arg, option):
|
||||
@ -39,14 +41,14 @@ class KwargsPlugin(CWrapPlugin):
|
||||
return
|
||||
if arg.get('kwarg_only'):
|
||||
return self.KWARG_ONLY_ACCESSOR_TEMPLATE.substitute(name=arg['name'])
|
||||
return self.ACCESSOR_TEMPLATE.substitute(idx=arg['idx'], name=arg['name'])
|
||||
return self.ACCESSOR_TEMPLATE.substitute(idx=arg['idx'] - arg['kwarg_offset'], name=arg['name'])
|
||||
|
||||
def process_single_check(self, code, arg, arg_accessor):
|
||||
if arg.get('no_kwargs'):
|
||||
return code
|
||||
if arg.get('kwarg_only'):
|
||||
return self.KWARG_ONLY_CHECK_TEMPLATE.substitute(name=arg['name'], code=code)
|
||||
return self.CHECK_TEMPLATE.substitute(idx=arg['idx'], name=arg['name'], code=code)
|
||||
return self.CHECK_TEMPLATE.substitute(idx=arg['idx'] - arg['kwarg_offset'], name=arg['name'], code=code)
|
||||
|
||||
def process_wrapper(self, code, declaration):
|
||||
if declaration.get('no_kwargs'):
|
||||
|
||||
@ -1,18 +1,58 @@
|
||||
import os
|
||||
from copy import deepcopy
|
||||
from . import CWrapPlugin
|
||||
from itertools import product
|
||||
from ...shared import cwrap_common
|
||||
|
||||
|
||||
class OptionalArguments(CWrapPlugin):
|
||||
|
||||
def process_declarations(self, declarations):
|
||||
new_options = []
|
||||
for declaration in declarations:
|
||||
cwrap_common.enumerate_options_due_to_default(
|
||||
declaration,
|
||||
allow_kwarg=True,
|
||||
type_to_signature={},
|
||||
remove_self=False)
|
||||
|
||||
for option in declaration['options']:
|
||||
optional_args = []
|
||||
for i, arg in enumerate(option['arguments']):
|
||||
if 'default' in arg:
|
||||
optional_args.append(i)
|
||||
for permutation in product((True, False), repeat=len(optional_args)):
|
||||
option_copy = deepcopy(option)
|
||||
for i, bit in zip(optional_args, permutation):
|
||||
arg = option_copy['arguments'][i]
|
||||
if not bit:
|
||||
arg['type'] = 'CONSTANT'
|
||||
arg['ignore_check'] = True
|
||||
# PyYAML interprets NULL as None...
|
||||
arg['name'] = 'NULL' if arg['default'] is None else arg['default']
|
||||
new_options.append(option_copy)
|
||||
declaration['options'] = self.filter_unique_options(new_options)
|
||||
return declarations
|
||||
|
||||
def filter_unique_options(self, options):
|
||||
def signature(option, kwarg_only_count):
|
||||
if kwarg_only_count == 0:
|
||||
kwarg_only_count = None
|
||||
else:
|
||||
kwarg_only_count = -kwarg_only_count
|
||||
arg_signature = '#'.join(
|
||||
arg['type']
|
||||
for arg in option['arguments'][:kwarg_only_count]
|
||||
if not arg.get('ignore_check'))
|
||||
if kwarg_only_count is None:
|
||||
return arg_signature
|
||||
kwarg_only_signature = '#'.join(
|
||||
arg['name'] + '#' + arg['type']
|
||||
for arg in option['arguments'][kwarg_only_count:]
|
||||
if not arg.get('ignore_check'))
|
||||
return arg_signature + "#-#" + kwarg_only_signature
|
||||
seen_signatures = set()
|
||||
unique = []
|
||||
for option in options:
|
||||
for num_kwarg_only in range(0, len(option['arguments']) + 1):
|
||||
sig = signature(option, num_kwarg_only)
|
||||
if sig not in seen_signatures:
|
||||
if num_kwarg_only > 0:
|
||||
for arg in option['arguments'][-num_kwarg_only:]:
|
||||
arg['kwarg_only'] = True
|
||||
unique.append(option)
|
||||
seen_signatures.add(sig)
|
||||
break
|
||||
return unique
|
||||
|
||||
@ -1,90 +0,0 @@
|
||||
from copy import deepcopy
|
||||
from . import CWrapPlugin
|
||||
import yaml
|
||||
|
||||
|
||||
class ProcessorSpecificPlugin(CWrapPlugin):
|
||||
|
||||
def process_declarations(self, declarations):
|
||||
# In order to move Torch's random functions into the same cwrap
|
||||
# declaration, we need to be able to handle the fact that on the CPU
|
||||
# these functions take a generator argument, while on the GPU, they
|
||||
# do not. As such, we would like to split those declarations at cwrap
|
||||
# runtime into two separate declarations, one for the CPU (unchanged),
|
||||
# and one for the GPU (with the generator argument removed).
|
||||
#
|
||||
# For example, the declaration arguments:
|
||||
# arguments:
|
||||
# - THTensor* self
|
||||
# - arg: THGenerator* generator
|
||||
# default: THPDefaultGenerator->cdata
|
||||
# kwarg_only: True
|
||||
#
|
||||
# Would have the generator argument removed when generating for the GPU
|
||||
# backend.
|
||||
|
||||
def arg_contains_generator(arg):
|
||||
return (arg['type'] == 'THGenerator*' or (arg.get('default', None)
|
||||
is not None and 'THPDefaultGenerator' in
|
||||
str(arg.get('default', ""))))
|
||||
|
||||
def split_candidate(declaration):
|
||||
# First, check and see if it is a declaration for both CPU/GPU
|
||||
if all([proc in declaration['backends'] for
|
||||
proc in ['CPU', 'CUDA']]):
|
||||
for option in declaration['options']:
|
||||
for argument in option['arguments']:
|
||||
if arg_contains_generator(argument):
|
||||
return True
|
||||
|
||||
return False
|
||||
|
||||
def can_we_handle_the_split(declaration):
|
||||
# hook into here if the split cannot happen for some reason
|
||||
return True
|
||||
|
||||
def generator_split(declaration):
|
||||
# the split must make two changes: 1. remove the generator argument
|
||||
# for the GPU, and 2. assign the correct backends/types to the
|
||||
# split declaration
|
||||
dec_cpu = declaration
|
||||
dec_gpu = deepcopy(declaration)
|
||||
|
||||
# Remove GPU backend and types from dec_cpu
|
||||
dec_cpu['backends'].remove('CUDA')
|
||||
if dec_cpu.get('backend_type_pairs', False):
|
||||
dec_cpu['backend_type_pairs'] = (
|
||||
[pair for pair in dec_cpu['backend_type_pairs'] if
|
||||
pair[1] == 'CPU'])
|
||||
# also need to reach into options
|
||||
for option in dec_cpu['options']:
|
||||
option['backends'].remove('CUDA')
|
||||
|
||||
# Remove CPU backend and types from dec_gpu
|
||||
dec_gpu['backends'].remove('CPU')
|
||||
if dec_gpu.get('backend_type_pairs', False):
|
||||
dec_gpu['backend_type_pairs'] = (
|
||||
[pair for pair in dec_gpu['backend_type_pairs'] if
|
||||
pair[1] == 'CUDA'])
|
||||
# also need to reach into options
|
||||
for option in dec_gpu['options']:
|
||||
option['backends'].remove('CPU')
|
||||
|
||||
# Remove generator arguments from dec_gpu options
|
||||
for option in dec_gpu['options']:
|
||||
option['arguments'] = (
|
||||
[arg for arg in option['arguments'] if
|
||||
not arg_contains_generator(arg)])
|
||||
|
||||
return [dec_cpu, dec_gpu]
|
||||
|
||||
decs = []
|
||||
for declaration in declarations:
|
||||
if split_candidate(declaration):
|
||||
assert(can_we_handle_the_split(declaration))
|
||||
newdecs = generator_split(declaration)
|
||||
decs.extend(newdecs)
|
||||
else:
|
||||
decs.append(declaration)
|
||||
|
||||
return decs
|
||||
@ -127,7 +127,7 @@ PyObject * $name(PyObject *self, PyObject *args, PyObject *kwargs)
|
||||
""")
|
||||
|
||||
ALLOCATE_TMPL = Template("""\
|
||||
THP${type}TensorPtr _${name}_guard((THP${type}Tensor*) THP${type}Tensor_NewEmpty());
|
||||
THP${type}TensorPtr _${name}_guard = (THP${type}Tensor*) THP${type}Tensor_NewEmpty();
|
||||
if (!_${name}_guard.get()) return NULL;
|
||||
THP${type}Tensor* $name = _${name}_guard.get();
|
||||
""")
|
||||
@ -334,92 +334,8 @@ ${cpu}
|
||||
for option in declaration['options']
|
||||
for arg in option['arguments'])
|
||||
|
||||
def backends_types_to_defined_if_string(declaration):
|
||||
# A declaration has two fields: 'backends', which stores a list of
|
||||
# backends (currently 'cpu' and 'cuda') the declaration applies
|
||||
# to, and 'types', which stores a list of real types the
|
||||
# declaration applies to. In PyTorch, when a function is only
|
||||
# supported by a subset of types, we wrap it in macro definition
|
||||
# checks.
|
||||
#
|
||||
# Previously, we manually required the cwrap declaration to
|
||||
# specify which backend/type combinations a function was
|
||||
# defined for. Now, we explicitly list the types and backends for
|
||||
# a declaration, if it should only be supported for a specific
|
||||
# subset of types, backends, or type-backend pairs.
|
||||
|
||||
types = declaration.get('types', [])
|
||||
backends = declaration['backends']
|
||||
all_backends = ['CPU', 'CUDA']
|
||||
|
||||
def get_defined_string(backend, real):
|
||||
if backend == 'CUDA':
|
||||
if real == 'all':
|
||||
return "IS_CUDA"
|
||||
else:
|
||||
return 'CUDA_{0}'.format(real.upper())
|
||||
else:
|
||||
if real == 'all':
|
||||
return "!IS_CUDA"
|
||||
else:
|
||||
return 'defined(TH_REAL_IS_{0})'.format(real.upper())
|
||||
|
||||
def expand_composite_type(p, t):
|
||||
if t == 'floating_point':
|
||||
result = ['double', 'float']
|
||||
if p == 'CUDA':
|
||||
result.append('half')
|
||||
elif t == 'integral':
|
||||
result = ['byte', 'char', 'short', 'int', 'long']
|
||||
else:
|
||||
result = [t]
|
||||
return result
|
||||
|
||||
defineds = []
|
||||
|
||||
# The logic below does not handle corner cases well. We allow the
|
||||
# declaration to have a field 'backend_type_pairs' that stores a
|
||||
# dictionary from type --> backend representing allowed
|
||||
# combinations. Let's use these first.
|
||||
for pair in declaration.get('backend_type_pairs', []):
|
||||
p, t = pair
|
||||
defineds.extend([get_defined_string(p, et) for et in
|
||||
expand_composite_type(p, t)])
|
||||
|
||||
# In the base case, types is empty and backends contains both
|
||||
# 'CPU' and 'CUDA' --> this means we support all types, and our
|
||||
# string should be empty, or simply the list of explicit type
|
||||
# backend pairs
|
||||
if (len(types) == 0 and all([proc in backends for proc in
|
||||
all_backends])):
|
||||
return " || ".join(defineds)
|
||||
|
||||
# Case 2: types is empty, but only one backend type is specified
|
||||
if len(types) == 0 and len(backends) == 1:
|
||||
defineds.append('IS_CUDA' if backends[0] == 'CUDA' else
|
||||
"!IS_CUDA")
|
||||
return " || ".join(defineds)
|
||||
|
||||
# Else, we loop overall all of the backend, type pairs and add
|
||||
# them
|
||||
for p in backends:
|
||||
for t in types:
|
||||
defineds.extend([get_defined_string(p, et) for et in
|
||||
expand_composite_type(p, t)])
|
||||
|
||||
return " || ".join(defineds)
|
||||
|
||||
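# For example (illustrative): calling the helper above on a declaration with
# backends == ['CPU'] and types == ['floating_point'] yields
# "defined(TH_REAL_IS_DOUBLE) || defined(TH_REAL_IS_FLOAT)".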
for declaration in declarations:
|
||||
# Disable all methods for THHalfTensor, unless cpu_half is True
|
||||
|
||||
dfstr = backends_types_to_defined_if_string(declaration)
|
||||
if len(dfstr) > 0:
|
||||
# for now, need to check for distributed defined if as well
|
||||
if 'defined_if' in declaration:
|
||||
declaration['defined_if'] += ' && (' + dfstr + ')'
|
||||
else:
|
||||
declaration['defined_if'] = dfstr
|
||||
|
||||
if not declaration.get('cpu_half', False):
|
||||
defined_if = '!defined(TH_REAL_IS_HALF)'
|
||||
if 'defined_if' in declaration:
|
||||
@ -439,23 +355,15 @@ ${cpu}
|
||||
declaration['variables'] += ['PyObject *__out;']
|
||||
self.generate_out_options(declaration)
|
||||
if has_long_args(declaration):
|
||||
for option in declaration['options']:
|
||||
for arg in option['arguments']:
|
||||
if arg.get('long_args', False):
|
||||
arg['no_kwargs'] = True
|
||||
declaration['no_kwargs'] = True
|
||||
for option in declaration['options']:
|
||||
option['cname'] = 'TH{}Tensor_({})'.format(
|
||||
'S' if option.get('sparse', False) else '', option['cname'])
|
||||
if option.get('sparse', False):
|
||||
defined_if = option.get('defined_if', '')
|
||||
option['defined_if'] = '!IS_DISTRIBUTED' + (' && ' if defined_if else '') + defined_if
|
||||
|
||||
variants = declaration.get('variants', ['method'])
|
||||
if 'function' in variants:
|
||||
if declaration.get('with_stateless', False) or declaration.get('only_stateless', False):
|
||||
stateless_declaration = self.make_stateless(declaration)
|
||||
new_declarations.append(stateless_declaration)
|
||||
self.stateless_declarations.append(stateless_declaration)
|
||||
if 'method' not in variants:
|
||||
if declaration.get('only_stateless', False):
|
||||
continue
|
||||
|
||||
self.declarations.append(declaration)
|
||||
@ -468,13 +376,9 @@ ${cpu}
|
||||
|
||||
register_only = [d for d in declarations if d.get('only_register', False)]
|
||||
declarations = [d for d in declarations
|
||||
if (('method' in d.get('variants', ['method'])) and
|
||||
(not d.get('only_register', False)))]
|
||||
self.declarations.extend(filter(lambda x: 'method' in x.get('variants',
|
||||
['method']), register_only))
|
||||
self.stateless_declarations.extend(filter(lambda x: 'method' not in
|
||||
x.get('variants', ['method']),
|
||||
register_only))
|
||||
if (not d.get('only_stateless', False)) and (not d.get('only_register', False))]
|
||||
self.declarations.extend(filter(lambda x: not x.get('only_stateless', False), register_only))
|
||||
self.stateless_declarations.extend(filter(lambda x: x.get('only_stateless', False), register_only))
|
||||
|
||||
self.process_docstrings()
|
||||
|
||||
@ -516,7 +420,7 @@ ${cpu}
|
||||
sparse=('' if not sparse else 'S'),
|
||||
)
|
||||
if sparse:
|
||||
generated = '#if !defined(TH_REAL_IS_HALF) && !IS_DISTRIBUTED\n' + generated + '\n#endif\n\n'
|
||||
generated = '#ifndef TH_REAL_IS_HALF\n' + generated + '\n#endif\n\n'
|
||||
return generated
|
||||
|
||||
def process_full_file(self, code):
|
||||
@ -557,24 +461,11 @@ ${cpu}
|
||||
|
||||
if any(arg.get('long_args', False) for arg in option['arguments']):
|
||||
code = code.replace('__argcount ==', '__argcount >=')
|
||||
expected = str(int(option.get('output_provided', False)) +
|
||||
sum(not arg.get('no_kwargs', False) and not arg.get('ignore_check', False)
|
||||
for arg in option['arguments']))
|
||||
expected = str(int(option.get('output_provided', False)))
|
||||
code = '__dictcount == ' + expected + ' &&\n ' + code
|
||||
|
||||
return code
|
||||
|
||||
def process_option_code(self, code, option):
|
||||
if option.get('defined_if', ''):
|
||||
defined_if = option['defined_if']
|
||||
placeholder = ''
|
||||
# This means that it's a first option, so we need a dummy if,
|
||||
# so the next option can be an else if.
|
||||
if 'else if' not in code:
|
||||
placeholder = '\n #else\n if (false) {'
|
||||
return '#if ' + defined_if + '\n ' + code + placeholder + '\n #endif\n'
|
||||
return code
|
||||
|
||||
def process_pre_arg_assign(self, template, option):
|
||||
new_args = []
|
||||
for arg in option['arguments']:
|
||||
|
||||
@ -1,422 +1,55 @@
|
||||
|
||||
class CWrapPlugin(object):
|
||||
"""Base class from which all cwrap plugins should inherit.
|
||||
|
||||
Override any of the following methods to implement the desired wrapping
|
||||
behavior.
|
||||
"""
|
||||
|
||||
def initialize(self, cwrap):
|
||||
"""Initialize the Plugin class prior to calling any other functions.
|
||||
|
||||
It is used to give the Plugin access to the cwrap object's helper
|
||||
functions and state.
|
||||
|
||||
Args:
|
||||
cwrap: the cwrap object performing the wrapping.
|
||||
|
||||
"""
|
||||
pass
|
||||
|
||||
def get_type_check(self, arg, option):
|
||||
"""Used to generate code for runtime checks of object types.
|
||||
|
||||
The type can be found in arg['type']. For example, it could be
|
||||
THTensor*. If this Plugin recognizes the type in arg, it should
|
||||
return a Template string containing code that checks whether a
|
||||
Python object is of this type. For example, the return type in
|
||||
this case would be:
|
||||
|
||||
Template('(PyObject*)Py_TYPE($arg) == THPTensorClass')
|
||||
|
||||
As a simpler example, if the type == 'bool' then we would return:
|
||||
|
||||
Template('PyBool_Check($arg)')
|
||||
|
||||
Note that the name of the identifier that will be substituted must be
|
||||
$arg.
|
||||
|
||||
Args:
|
||||
arg: a Python object with a 'type' field representing the type
|
||||
to generate a check string for.
|
||||
option: dictionary containing the information for this specific
|
||||
option.
|
||||
|
||||
Returns:
|
||||
A Template string as described above, or None if this Plugin does
|
||||
not have a corresponding type check for the passed type.
|
||||
|
||||
"""
|
||||
pass
|
||||
|
||||
def get_type_unpack(self, arg, option):
|
||||
"""Used to generate code unpacking of Python objects into C types.
|
||||
|
||||
Similar to get_type_check, but for unpacking Python objects into their
|
||||
corresponding C types. The type is once again accessible via
|
||||
arg['type']. This time we return a Template string that unpacks an
|
||||
object. For a THTensor*, we know that the corresponding PyTorch type is
|
||||
a THPTensor*, so we need to get the cdata from the object. So we would
|
||||
return:
|
||||
|
||||
Template('((THPTensor*)$arg)->cdata')
|
||||
|
||||
For a simpler type, such as a long, we could do:
|
||||
|
||||
Template('PyLong_AsLong($arg)')
|
||||
|
||||
though in practice we will use our own custom unpacking code. Once
|
||||
again, $arg must be used as the identifier.
|
||||
|
||||
Args:
|
||||
arg: a Python object with a 'type' field representing the type
|
||||
to generate a unpack string for.
|
||||
option: dictionary containing the information for this specific
|
||||
option.
|
||||
|
||||
Returns:
|
||||
A Template string as described above, or None if this Plugin does
|
||||
not have a corresponding type unpack for the passed type.
|
||||
|
||||
"""
|
||||
pass
|
||||
|
||||
def get_return_wrapper(self, option):
|
||||
"""Used to generate code wrapping a function's return value.
|
||||
|
||||
Wrapped functions should always return a PyObject *. However,
|
||||
internally, the code will be working with C objects or primitives.
|
||||
Therefore, if a function has a return value we need to convert it back
|
||||
to a PyObject * before the function returns. Plugins can override this
|
||||
function to generate wrapper code for returning specific C types. The
|
||||
type is accessible via option['return'].
|
||||
|
||||
Continuing on with our THTensor* example, we might do something like:
|
||||
|
||||
Template('return THPTensor_(New)($result);')
|
||||
|
||||
In general, you want to do return <statement>; In this case, we call
|
||||
into THP's library routine that takes a THTensor* (the $result
|
||||
identifier) and returns a PyObject *.
|
||||
|
||||
For a bool, we could do Template('return PyBool_FromLong($result);').
|
||||
|
||||
Note that in other cases, our logic might be more complicated. For
|
||||
example, if our return value is also an argument to the function call,
|
||||
we might need to increase the reference count prior to returning.
|
||||
|
||||
Args:
|
||||
option: dictionary containing the information for this specific
|
||||
option.
|
||||
|
||||
Returns:
|
||||
A Template string as described above, or None if this Plugin does
|
||||
not have a corresponding return wrapper for the functions return
|
||||
type or specifier.
|
||||
|
||||
"""
|
||||
pass
|
||||
|
||||
def get_wrapper_template(self, declaration):
|
||||
"""Used to create a code template to wrap the options.
|
||||
|
||||
This function returns a Template string that contains the function call
|
||||
for the overall declaration, including the method definition, opening
|
||||
and closing brackets, and any additional code within the method body.
|
||||
Look through the examples to get a sense of what this might look like.
|
||||
The only requirements are that it contains unsubstituted template
|
||||
identifiers for anything the cwrap engine expects.
|
||||
|
||||
Note that for any declaration only one Plugin can generate the wrapper
|
||||
template.
|
||||
|
||||
Args:
|
||||
declaration: the declaration for the wrapped method.
|
||||
|
||||
Returns:
|
||||
A template string representing the entire function declaration,
|
||||
with identifiers as necessary.
|
||||
|
||||
"""
|
||||
pass
|
||||
|
||||
def get_assign_args(self, arguments):
|
||||
"""Used to modify argument metadata prior to assignment.
|
||||
|
||||
We have already setup argument checking, and how to unpack arguments.
|
||||
This function allows you to modify the metadata of an argument prior to
|
||||
actually performing the assignment. For example, you might want to
|
||||
check that an argument is of a specific type, but when unpacking it you
|
||||
might want to treat it as a different type. This function will allow
|
||||
you to do stuff like that --> e.g. you could set the 'type' field for a
|
||||
particular argument to be something else.
|
||||
|
||||
Args:
|
||||
arguments: a list of argument metadata dictionaries.
|
||||
|
||||
Returns:
|
||||
The same list of arguments, with any modifications as you see fit.
|
||||
|
||||
"""
|
||||
pass
|
||||
|
||||
def get_arg_accessor(self, arg, option):
|
||||
"""Used to generate a string for accessing the passed arg.
|
||||
|
||||
One of the key components of the YAML definition for a method to be
|
||||
wrapped are the arguments to that method. Override this function to
|
||||
show how to access that specific arg in the code. For example, you
|
||||
might do something different if the argument is a keyword argument, or
|
||||
a constant, or self. The base cwrap plugin has a fallback arg accessor
|
||||
for loading elements from the args PyObject * tuple passed to the
|
||||
function.
|
||||
|
||||
It's best to look at some of the existing Plugins to get a sense of what
|
||||
one might do.
|
||||
|
||||
Args:
|
||||
arg: a dictionary specifying attributes of the arg to be accessed
|
||||
option: dictionary containing the information for this specific
|
||||
option.
|
||||
|
||||
Returns:
|
||||
A string (note: not a Template string!) of code that can be used
|
||||
to access the given arg. If the plugin does not know how to access
|
||||
the arg, return None.
|
||||
"""
|
||||
pass
|
||||
|
||||
def process_full_file(self, code):
|
||||
"""Used to modify the code for the entire output file.
|
||||
|
||||
The last thing any plugin can do. Code contains the results of wrapping
|
||||
all the declarations. The plugin can do things like adding header
|
||||
guards, include statements, etc.
|
||||
|
||||
Args:
|
||||
code: a string source code for the wrapped declarations.
|
||||
|
||||
Returns:
|
||||
The same code, modified as the plugin sees fit.
|
||||
|
||||
"""
|
||||
return code
|
||||
|
||||
def process_single_check(self, code, arg, arg_accessor):
|
||||
"""Used to postprocess a type check.
|
||||
|
||||
Above we defined a function get_type_check that returns a Template
|
||||
string that allows for type checking a PyObject * for a specific type.
|
||||
In this function, the passed "code" is a combination of that type check
|
||||
along with a specific arg_accessor pasted in. For example:
|
||||
|
||||
'(PyObject*)Py_TYPE(PyTuple_GET_ITEM(args, 1)) == THPTensorClass'
|
||||
|
||||
This function can be overridden to support modifying this check string.
|
||||
For example, if an argument can be null, we might want to check and see
|
||||
if the type is Py_None, as well.
|
||||
|
||||
Args:
|
||||
code: The string code representing a type check for a specific
|
||||
argument being accessed.
|
||||
arg: dictionary containing properties of that specific argument
|
||||
arg_accessor: the arg_accessor string for that specific argument.
|
||||
Note that this is likely also embedded in code, but if you want to
|
||||
be able to access this arg and throw away the other code, you can
|
||||
do so.
|
||||
|
||||
Returns:
|
||||
A string representing the processed check/access string for this
|
||||
arg. If the plugin does not know how to modify a specific input, it
|
||||
should return the original code.
|
||||
|
||||
"""
|
||||
return code
|
||||
|
||||
def process_all_checks(self, code, option):
|
||||
"""Used to generate additional checks based on all the individual ones.
|
||||
|
||||
After individually processing each argument with get_type_check,
|
||||
get_arg_accessor, process_single_check, this function allows you to
|
||||
inspect the combined checks and do any additional checking/modify that
|
||||
string as you see fit. In particular, given code is a string like:
|
||||
|
||||
CHECK_TYPE(GET_ARG(0)) && CHECK_TYPE(GET_ARG(1)) && ..
|
||||
|
||||
We can process it as we see fit. For example, we may want to add a
|
||||
check at the beginning that we have the specified number of arguments.
|
||||
|
||||
Args:
|
||||
code: A string representing each argument check separated by an
|
||||
'&&'. code can be None if there are no arguments to be checked.
|
||||
option: dictionary containing the information for this specific
|
||||
option.
|
||||
|
||||
Returns:
|
||||
The modified code string with any additional checks, or just the
|
||||
existing code if no modifications are to be made.
|
||||
|
||||
"""
|
||||
return code
|
||||
|
||||
def process_single_unpack(self, code, arg, arg_accessor):
|
||||
"""Used to postprocess a type unpack.
|
||||
|
||||
Same as process_single_check above, but for type unpacking. E.g. an
|
||||
example code could be:
|
||||
|
||||
PyLong_FromLong(PyTuple_GET_ITEM(args, 0))
|
||||
|
||||
And this code could modify that as it sees fit. For example, if the
|
||||
result of accessing the argument is None, we would not want to call the
|
||||
unpacking code.
|
||||
|
||||
Args:
|
||||
code: The string code representing a type unpack for a specific
|
||||
argument being accessed.
|
||||
arg: dictionary containing properties of that specific argument
|
||||
arg_accessor: the arg_accessor string for that specific argument.
|
||||
Note that this is likely also embedded in code, but if you want to
|
||||
be able to access this arg and throw away the other code, you can
|
||||
do so.
|
||||
|
||||
Returns:
|
||||
A string representing the processed unpack/access string for this
|
||||
arg. If the plugin does not know how to modify a specific input, it
|
||||
should return the original code.
|
||||
|
||||
"""
|
||||
return code
|
||||
|
||||
def process_all_call_arg(self, code, option):
|
||||
"""Used to modify the arguments to the underlying C function call.
|
||||
|
||||
Code is the string of comma-separated arguments that will be passed to
|
||||
the wrapped C function. You can use this function to modify that string
|
||||
as you see fit. For example, THP prepends the LIBRARY_STATE definition
|
||||
so that the generated code will follow the conventions it uses for
|
||||
writing one function for both TH/THC calls.
|
||||
|
||||
Args:
|
||||
code: A string as described above.
|
||||
option: dictionary containing the information for this specific
|
||||
option.
|
||||
|
||||
Returns:
|
||||
The same code, modified as the plugin sees fit.
|
||||
|
||||
"""
|
||||
return code
|
||||
|
||||
def process_option_code(self, code, option):
|
||||
"""Used to modify the entire code body for an option.
|
||||
|
||||
Code in this case is a string containing the entire generated code for
|
||||
a specific option. Note that this body includes the checks for each
|
||||
option, i.e. if (type checks for one permutation) { ... } else if (type
|
||||
checks for another permutation) { ... } etc.
|
||||
|
||||
Args:
|
||||
code: string representing the generated code for the option
|
||||
option: dictionary containing the information for this specific
|
||||
option.
|
||||
|
||||
Returns:
|
||||
The same code, modified as the plugin sees fit.
|
||||
|
||||
"""
|
||||
return code
|
||||
|
||||
def process_wrapper(self, code, declaration):
|
||||
"""Used to modify the entire code body for a declaration.
|
||||
|
||||
Code in this case is a string containing the entire generated code for
|
||||
a specific declaration. This code can be modified as the plugin sees
|
||||
fit. For example, we might want to wrap the function in preprocessor
|
||||
guards if it is only enabled for floats.
|
||||
|
||||
Args:
|
||||
code: string representing the generated code for the declaration
|
||||
declaration: the declaration metadata.
|
||||
|
||||
Returns:
|
||||
The same code, modified as the plugin sees fit.
|
||||
|
||||
"""
|
||||
return code
|
||||
|
||||
def process_declarations(self, declarations):
|
||||
"""Used to process/modify the function's declaration.
|
||||
|
||||
Cwrap loads the YAML of a function to be cwrap'd into a dictionary.
|
||||
This is known as the declaration. The cwrap code sets some defaults as
|
||||
necessary, and then passes this dictionary to process_declarations.
|
||||
Overriding this code allows the plugin to modify this declaration as it
|
||||
sees fit prior to any code generation. The plugin may add, remove or
|
||||
modify the fields of the declaration dictionary. It can also save state
|
||||
to the Plugin for use in subsequent function overrides.
|
||||
|
||||
It's best to look at some of the existing Plugins to get a sense of what
|
||||
one might do.
|
||||
|
||||
Args:
|
||||
declarations: a list of declarations, i.e. dictionaries that define
|
||||
the function(s) being wrapped. Note that this can be plural, so the
|
||||
function must take care to modify each input declaration.
|
||||
|
||||
Returns:
|
||||
Those same declarations, modified as the Plugin sees fit. Note that
|
||||
you could insert a declaration, if you wanted to take an input
|
||||
declaration and e.g. wrap it multiple times.
|
||||
|
||||
"""
|
||||
return declarations
|
||||
|
||||
def process_option_code_template(self, template, option):
|
||||
"""Used to modify the code template for the option.
|
||||
|
||||
The "code template" can be thought of the actual body implementing the
|
||||
wrapped function call --> i.e. it is not the argument check,
|
||||
assignment, etc. but the actual logic of the function. The template is
|
||||
a list containing two operations: the $call, and the $return_result.
|
||||
These represent the "locations" where the function call will happen,
|
||||
and the function will return.
|
||||
|
||||
This function can modify the list to insert arbitrary code around the
|
||||
$call and $return_result. For example, one might want to wrap the code
|
||||
in a try/catch, or post-process the result in some way. This allows a
|
||||
plugin to do that.
|
||||
|
||||
Args:
|
||||
template: a list containing $call and $return_result, in addition
|
||||
to any arbitrary code inserted by other plugins.
|
||||
option: dictionary containing the information for this specific
|
||||
option.
|
||||
|
||||
Returns:
|
||||
The same "code template", possibly modified by this plugin.
|
||||
|
||||
"""
|
||||
return template
|
||||
|
||||
def process_pre_arg_assign(self, template, option):
|
||||
"""Used to include any code before argument assignment.
|
||||
|
||||
This function can be used to insert any code that will be part of the
|
||||
resulting function. The code is inserted after argument checks occur,
|
||||
but before argument assignment.
|
||||
|
||||
Args:
|
||||
template: String representing the code to be inserted. If other
|
||||
plugins have included code for pre_arg_assign, it will be included
|
||||
here.
|
||||
option: dictionary containing the information for this specific
|
||||
option.
|
||||
|
||||
Returns:
|
||||
template, with any additional code if needed.
|
||||
|
||||
"""
|
||||
return template
|
||||
|
||||
|
||||
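A minimal sketch of how these hooks fit together, assuming a hypothetical plugin that recognizes a plain bool argument type (the class name and the C snippets below are illustrative, not part of the actual plugin set):

from string import Template

class BoolPlugin(CWrapPlugin):
    """Hypothetical example: type-check, unpack and return-wrap plain bools."""

    def get_type_check(self, arg, option):
        # Emit a check that the PyObject* really is a Python bool.
        if arg['type'] == 'bool':
            return Template('PyBool_Check($arg)')

    def get_type_unpack(self, arg, option):
        # Convert the checked PyObject* into a C boolean expression.
        if arg['type'] == 'bool':
            return Template('($arg == Py_True)')

    def get_return_wrapper(self, option):
        # Wrap a C boolean result back into a PyObject*.
        if option['return'] == 'bool':
            return Template('return PyBool_FromLong($result);')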
@ -433,4 +66,3 @@ from .AutoGPU import AutoGPU
|
||||
from .CuDNNPlugin import CuDNNPlugin
|
||||
from .GenericNN import GenericNN
|
||||
from .WrapDim import WrapDim
|
||||
from .Broadcast import Broadcast
|
||||
|
||||
@ -1,27 +0,0 @@
|
||||
FROM ubuntu:16.04
|
||||
|
||||
LABEL com.nvidia.volumes.needed="nvidia_driver"
|
||||
RUN apt-get update && apt-get install -y --no-install-recommends \
|
||||
build-essential \
|
||||
git \
|
||||
curl \
|
||||
ca-certificates \
|
||||
libjpeg-dev \
|
||||
libpng-dev && \
|
||||
rm -rf /var/lib/apt/lists/*
|
||||
|
||||
RUN curl -o ~/miniconda.sh -O https://repo.continuum.io/miniconda/Miniconda3-4.2.12-Linux-x86_64.sh && \
|
||||
chmod +x ~/miniconda.sh && \
|
||||
~/miniconda.sh -b -p /opt/conda && \
|
||||
rm ~/miniconda.sh && \
|
||||
/opt/conda/bin/conda install conda-build && \
|
||||
/opt/conda/bin/conda create -y --name pytorch-py35 python=3.5.2 numpy pyyaml scipy ipython mkl&& \
|
||||
/opt/conda/bin/conda clean -ya
|
||||
ENV PATH /opt/conda/envs/pytorch-py35/bin:$PATH
|
||||
RUN conda install --name pytorch-py35 -c soumith magma-cuda80 && /opt/conda/bin/conda clean -ya
|
||||
RUN conda install --name pytorch-py35 pytorch torchvision cuda80 -c soumith && /opt/conda/bin/conda clean -ya
|
||||
|
||||
ENV LD_LIBRARY_PATH /usr/local/nvidia/lib:/usr/local/nvidia/lib64
|
||||
|
||||
WORKDIR /workspace
|
||||
RUN chmod -R a+w /workspace
|
||||
@ -3,13 +3,26 @@ import sys
|
||||
from string import Template, ascii_lowercase
|
||||
from ..cwrap import cwrap
|
||||
from ..cwrap.plugins import StandaloneExtension, GenericNN, NullableArguments, AutoGPU
|
||||
from ..shared import import_module
|
||||
|
||||
BASE_PATH = os.path.realpath(os.path.join(__file__, '..', '..', '..'))
|
||||
WRAPPER_PATH = os.path.join(BASE_PATH, 'torch', 'csrc', 'nn')
|
||||
THNN_UTILS_PATH = os.path.join(BASE_PATH, 'torch', '_thnn', 'utils.py')
|
||||
|
||||
|
||||
def import_module(name, path):
|
||||
if sys.version_info >= (3, 5):
|
||||
import importlib.util
|
||||
spec = importlib.util.spec_from_file_location(name, path)
|
||||
module = importlib.util.module_from_spec(spec)
|
||||
spec.loader.exec_module(module)
|
||||
return module
|
||||
elif sys.version_info >= (3, 0):
|
||||
from importlib.machinery import SourceFileLoader
|
||||
return SourceFileLoader(name, path).load_module()
|
||||
else:
|
||||
import imp
|
||||
return imp.load_source(name, path)
|
||||
|
||||
thnn_utils = import_module('torch._thnn.utils', THNN_UTILS_PATH)
|
||||
|
||||
FUNCTION_TEMPLATE = Template("""\
|
||||
@ -75,17 +88,14 @@ def wrap_function(name, type, arguments):
|
||||
cname = 'THNN_' + type + name
|
||||
declaration = ''
|
||||
declaration += 'extern "C" void ' + cname + \
|
||||
'(' + ', '.join(TYPE_TRANSFORMS[type].get(arg.type, arg.type)
|
||||
for arg in arguments) + ');\n'
|
||||
'(' + ', '.join(TYPE_TRANSFORMS[type].get(arg.type, arg.type) for arg in arguments) + ');\n'
|
||||
declaration += FUNCTION_TEMPLATE.substitute(name=type + name, cname=cname)
|
||||
indent = ' ' * 4
|
||||
dict_indent = ' ' * 6
|
||||
prefix = indent + '- '
|
||||
for arg in arguments:
|
||||
if not arg.is_optional:
|
||||
declaration += prefix + \
|
||||
TYPE_TRANSFORMS[type].get(
|
||||
arg.type, arg.type) + ' ' + arg.name + '\n'
|
||||
declaration += prefix + TYPE_TRANSFORMS[type].get(arg.type, arg.type) + ' ' + arg.name + '\n'
|
||||
else:
|
||||
t = TYPE_TRANSFORMS[type].get(arg.type, arg.type)
|
||||
declaration += prefix + 'type: ' + t + '\n' + \
|
||||
@ -130,7 +140,6 @@ def wrap_cunn():
|
||||
AutoGPU(has_self=False),
|
||||
])
|
||||
|
||||
|
||||
GENERIC_FUNCTION_TEMPLATE = Template("""\
|
||||
[[
|
||||
name: $name
|
||||
@ -159,7 +168,7 @@ def wrap_generic():
|
||||
defs = OrderedDict()
|
||||
|
||||
def should_wrap_function(name):
|
||||
if name.startswith('LookupTable_'):
|
||||
if name.startswith('LookupTable'):
|
||||
return False
|
||||
return (name.endswith('updateOutput') or
|
||||
name.endswith('updateGradInput') or
|
||||
|
||||
@ -1,12 +0,0 @@
|
||||
{
|
||||
global:
|
||||
_TH*;
|
||||
TH*;
|
||||
*THP*;
|
||||
*THCP*;
|
||||
PyInit*;
|
||||
init*;
|
||||
state;
|
||||
local:
|
||||
*;
|
||||
};
|
||||
@@ -1,39 +1,17 @@
import os
import platform
import ctypes.util
from subprocess import Popen, PIPE
import os

from .env import check_env_flag


def find_nvcc():
proc = Popen(['which', 'nvcc'], stdout=PIPE, stderr=PIPE)
out, err = proc.communicate()
out = out.decode().strip()
if len(out) > 0:
return os.path.dirname(out)
else:
return None


if check_env_flag('NO_CUDA'):
WITH_CUDA = False
CUDA_HOME = None
else:
CUDA_HOME = os.getenv('CUDA_HOME', '/usr/local/cuda')
if not os.path.exists(CUDA_HOME):
# We use nvcc path on Linux and cudart path on macOS
osname = platform.system()
if osname == 'Linux':
cuda_path = find_nvcc()
else:
cudart_path = ctypes.util.find_library('cudart')
if cudart_path is not None:
cuda_path = os.path.dirname(cudart_path)
else:
cuda_path = None
if cuda_path is not None:
CUDA_HOME = os.path.dirname(cuda_path)
cudart_path = ctypes.util.find_library('cudart')
if cudart_path is not None:
CUDA_HOME = os.path.dirname(cudart_path)
else:
CUDA_HOME = None
WITH_CUDA = CUDA_HOME is not None

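The CUDA detection above gates everything on `check_env_flag('NO_CUDA')`, a helper imported from `tools/setup_helpers/env.py` that is not part of this diff. Below is a minimal sketch of such a flag check, assuming it simply treats common truthy spellings as "on" (the exact accepted values are an assumption, not taken from this compare view):

```python
import os


def check_env_flag(name):
    # Hypothetical helper: report True when the variable is set to a common
    # "on" spelling, e.g. `NO_CUDA=1 python setup.py install`.
    return os.getenv(name, '').upper() in ('1', 'ON', 'YES', 'TRUE', 'Y')
```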
@@ -1,5 +1,4 @@
import os
import sys
import glob
from itertools import chain

@@ -10,8 +9,6 @@ from .cuda import WITH_CUDA, CUDA_HOME
def gather_paths(env_vars):
return list(chain(*(os.getenv(v, '').split(':') for v in env_vars)))

is_conda = 'conda' in sys.version or 'Continuum' in sys.version
conda_dir = os.path.join(os.path.dirname(sys.executable), '..')

WITH_CUDNN = False
CUDNN_LIB_DIR = None
@@ -22,7 +19,6 @@ if WITH_CUDA and not check_env_flag('NO_CUDNN'):
os.path.join(CUDA_HOME, 'lib'),
os.path.join(CUDA_HOME, 'lib64'),
'/usr/lib/x86_64-linux-gnu/',
'/usr/lib/powerpc64le-linux-gnu/',
] + gather_paths([
'LIBRARY_PATH',
])))
@@ -35,9 +31,6 @@ if WITH_CUDA and not check_env_flag('NO_CUDNN'):
'C_INCLUDE_PATH',
'CPLUS_INCLUDE_PATH',
])))
if is_conda:
lib_paths.append(os.path.join(conda_dir, 'lib'))
include_paths.append(os.path.join(conda_dir, 'include'))
for path in lib_paths:
if path is None or not os.path.exists(path):
continue

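For reference, `gather_paths` above just splits colon-separated search-path variables into one flat list of candidate directories that the cuDNN probe then walks. A small self-contained illustration (the environment value is made up):

```python
import os
from itertools import chain


def gather_paths(env_vars):
    # Same one-liner as in the hunk above.
    return list(chain(*(os.getenv(v, '').split(':') for v in env_vars)))


os.environ['LIBRARY_PATH'] = '/opt/cudnn/lib64:/usr/local/lib'
print(gather_paths(['LIBRARY_PATH']))  # ['/opt/cudnn/lib64', '/usr/local/lib']
```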
@@ -1,58 +0,0 @@
import os

this_file = os.path.dirname(os.path.abspath(__file__))
generated_dir = os.path.abspath(os.path.join(this_file, '..', '..', 'torch', 'csrc', 'generated'))

line_start = '//generic_include '

types = [
'Double',
'Float',
'Half',
'Long',
'Int',
'Short',
'Char',
'Byte'
]

generic_include = '#define {lib}_GENERIC_FILE "{path}"'
generate_include = '#include "{lib}/{lib}Generate{type}Type.h"'


def split_types(file_name):
assert file_name.startswith('torch/csrc/')
if not os.path.exists(generated_dir):
os.makedirs(generated_dir)

with open(file_name, 'r') as f:
lines = f.read().split('\n')

# Find //generic_include
for i, l in enumerate(lines):
if l.startswith(line_start):
args = l[len(line_start):]
lib_prefix, generic_file = filter(bool, args.split())
break
else:
raise RuntimeError("generic include not found")

gen_name_prefix = file_name[len('torch/csrc/'):].replace('/', '_').replace('.cpp', '')
gen_path_prefix = os.path.join(generated_dir, gen_name_prefix)

prefix = '\n'.join(lines[:i])
suffix = '\n'.join(lines[i + 1:])

to_build = []

g_include = generic_include.format(lib=lib_prefix, path=generic_file)
for t in types:
t_include = generate_include.format(lib=lib_prefix, type=t)
gen_path = gen_path_prefix + t + '.cpp'
to_build.append(gen_path)
with open(gen_path, 'w') as f:
f.write(prefix + '\n' +
g_include + '\n' +
t_include + '\n' +
suffix)
return to_build
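The deleted `split_types` helper expanded a single `//generic_include` marker into one generated `.cpp` file per scalar type, using the two templates defined at the top of the file. A tiny sketch of the strings it produces (the marker arguments here are hypothetical):

```python
generic_include = '#define {lib}_GENERIC_FILE "{path}"'
generate_include = '#include "{lib}/{lib}Generate{type}Type.h"'

# Hypothetical marker: //generic_include TH "generic/TensorMethods.cpp"
lib_prefix, generic_file = 'TH', 'generic/TensorMethods.cpp'

print(generic_include.format(lib=lib_prefix, path=generic_file))
# -> #define TH_GENERIC_FILE "generic/TensorMethods.cpp"
print(generate_include.format(lib=lib_prefix, type='Double'))
# -> #include "TH/THGenerateDoubleType.h"
```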
@@ -1,3 +0,0 @@
from .module_loader import import_module
from .cwrap_common import set_declaration_defaults, \
sort_by_number_of_options, enumerate_options_due_to_default
@@ -1 +0,0 @@
../../torch/lib/ATen/common_with_cwrap.py
@@ -1,16 +0,0 @@
import sys


def import_module(name, path):
if sys.version_info >= (3, 5):
import importlib.util
spec = importlib.util.spec_from_file_location(name, path)
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
return module
elif sys.version_info >= (3, 0):
from importlib.machinery import SourceFileLoader
return SourceFileLoader(name, path).load_module()
else:
import imp
return imp.load_source(name, path)
@@ -5,7 +5,7 @@ Additionally, it provides many utilities for efficient serializing of
Tensors and arbitrary types, and other useful utilities.

It has a CUDA counterpart, that enables you to run your tensor computations
on an NVIDIA GPU with compute capability >= 3.0.
on an NVIDIA GPU with compute capability >= 2.0.
"""

import sys
@@ -15,7 +15,7 @@ from .version import __version__
__all__ = [
'typename', 'is_tensor', 'is_storage', 'set_default_tensor_type',
'set_rng_state', 'get_rng_state', 'manual_seed', 'initial_seed',
'save', 'load', 'set_printoptions', 'chunk', 'split', 'stack', 'matmul',
'save', 'load', 'set_printoptions', 'chunk', 'split', 'stack',
'DoubleStorage', 'FloatStorage', 'LongStorage', 'IntStorage',
'ShortStorage', 'CharStorage', 'ByteStorage',
'DoubleTensor', 'FloatTensor', 'LongTensor', 'IntTensor',
@@ -129,9 +129,6 @@ def manual_seed(seed):
Args:
seed (int or long): The desired seed.
"""
if torch.cuda.is_available() and not torch.cuda._in_bad_fork:
torch.cuda.manual_seed_all(seed)

return default_generator.manual_seed(seed)


@@ -268,12 +265,12 @@ class ByteTensor(_C.ByteTensorBase, _TensorBase):

_storage_classes = {
DoubleStorage, FloatStorage, LongStorage, IntStorage, ShortStorage,
CharStorage, ByteStorage, HalfStorage
CharStorage, ByteStorage,
}

_tensor_classes = {
DoubleTensor, FloatTensor, LongTensor, IntTensor, ShortTensor,
CharTensor, ByteTensor, HalfTensor
CharTensor, ByteTensor,
}


@@ -339,9 +336,8 @@ import torch.nn
import torch.optim
import torch.multiprocessing
import torch.sparse
import torch.utils.backcompat
_C._init_names(list(torch._tensor_classes) + list(torch._storage_classes))

# attach docstrings to torch and tensor functions
from . import _torch_docs, _tensor_docs, _storage_docs
del _torch_docs, _tensor_docs, _storage_docs
from . import _torch_docs, _tensor_docs
del _torch_docs, _tensor_docs

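As a usage note for the seeding hunk above: `manual_seed` seeds the default CPU generator and, when CUDA is available (and the process is not in a bad fork state), all CUDA generators as well, so one call normally suffices for reproducible runs. A minimal sketch:

```python
import torch

torch.manual_seed(42)                 # seeds the default generator
x = torch.rand(3)
torch.manual_seed(42)
assert torch.equal(x, torch.rand(3))  # same seed, same numbers
```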
@@ -1,31 +0,0 @@
# Copyright (c) 2010-2017 Benjamin Peterson
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.


def with_metaclass(meta, *bases):
"""Create a base class with a metaclass."""
# This requires a bit of explanation: the basic idea is to make a dummy
# metaclass for one level of class instantiation that replaces itself with
# the actual metaclass.
class metaclass(meta):

def __new__(cls, name, this_bases, d):
return meta(name, bases, d)
return type.__new__(metaclass, 'temporary_class', (), {})
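The removed `with_metaclass` helper (vendored from six) builds a throwaway base class so a metaclass can be applied with the same syntax on Python 2 and 3. A short usage sketch with a hypothetical metaclass, assuming the helper above is in scope:

```python
class Meta(type):
    """Hypothetical metaclass that tags every class it creates."""

    def __new__(mcls, name, bases, namespace):
        cls = super(Meta, mcls).__new__(mcls, name, bases, namespace)
        cls.tagged = True
        return cls


class Base(object):
    pass


class MyClass(with_metaclass(Meta, Base)):
    pass


assert MyClass.tagged and isinstance(MyClass, Meta) and issubclass(MyClass, Base)
```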
@@ -1,43 +0,0 @@
"""Adds docstrings to Storage functions"""

import torch._C
from torch._C import _add_docstr as add_docstr


storage_classes = [
'DoubleStorageBase',
'FloatStorageBase',
'LongStorageBase',
'IntStorageBase',
'ShortStorageBase',
'CharStorageBase',
'ByteStorageBase',
]


def add_docstr_all(method, docstr):
for cls_name in storage_classes:
cls = getattr(torch._C, cls_name)
try:
add_docstr(getattr(cls, method), docstr)
except AttributeError:
pass


add_docstr_all('from_file',
"""
from_file(filename, shared=False, size=0) -> Storage

If shared is True then memory is shared between all processes. All changes are
written to the file. If shared is False then the changes on the storage do not
affect the file.

Size is the number of elements in the storage. If shared is False then the file
must contain at least `size * sizeof(Type)` bytes (`Type` is the type of
storage). If shared is True the file will be created if needed.

Args:
filename (str): file name to map
shared (bool): whether to share memory
size (int): number of elements in the storage
""")
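As a usage sketch for the `from_file` docstring above (the file name and size are illustrative, not taken from the repository):

```python
import torch

# Map 100 floats from a file into a storage; with shared=True the file is
# created if needed and writes are visible to other processes mapping it.
storage = torch.FloatStorage.from_file('/tmp/weights.bin', shared=True, size=100)
tensor = torch.FloatTensor(storage)   # tensor view over the mapped storage
```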
File diff suppressed because it is too large
@@ -67,7 +67,7 @@ def set_printoptions(

def _number_format(tensor, min_sz=-1):
min_sz = max(min_sz, 2)
tensor = torch.DoubleTensor(tensor.size()).copy_(tensor).abs_().view(tensor.nelement())
tensor = torch.DoubleTensor(tensor.nelement()).copy_(tensor).abs_()

pos_inf_mask = tensor.eq(float('inf'))
neg_inf_mask = tensor.eq(float('-inf'))

File diff suppressed because it is too large
@@ -3,8 +3,7 @@ import importlib


def _type(self, new_type=None, async=False):
"""Returns the type if `new_type` is not provided, else casts this object to
the specified type.
"""Casts this object to the specified type.

If this is already of the correct type, no copy is performed and the
original object is returned.
@@ -28,8 +27,8 @@ def _type(self, new_type=None, async=False):
raise RuntimeError("Cannot cast sparse tensor to dense tensor")
new_type_name = new_type.__module__ + '.' + new_type.__name__
new_values_type_name = new_type_name.replace('.sparse', '')
new_values = self._values().type(new_values_type_name, async)
return new_type(self._indices(), new_values, self.size())
new_values = self.values().type(new_values_type_name, async)
return new_type(self.indices(), new_values, self.size())
if new_type.is_sparse:
raise RuntimeError("Cannot cast dense tensor to sparse tensor")
return new_type(self.size()).copy_(self, async)
@@ -58,8 +57,8 @@ def _cuda(self, device=None, async=False):
with torch.cuda.device(device):
if self.is_sparse:
new_type = getattr(torch.cuda.sparse, self.__class__.__name__)
indices = self._indices().cuda(device, async)
values = self._values().cuda(device, async)
indices = self.indices().cuda(device, async)
values = self.values().cuda(device, async)
return new_type(indices, values, self.size())
else:
new_type = getattr(torch.cuda, self.__class__.__name__)
@@ -99,47 +98,3 @@ def _accumulate(iterable, fn=lambda x, y: x + y):
for element in it:
total = fn(total, element)
yield total


def _flatten_tensors(tensors):
"""Flatten tensors into a single contiguous 1D buffer"""
if len(tensors) == 1:
return tensors[0].contiguous().view(-1)
numels = [tensor.numel() for tensor in tensors]
size = sum(numels)
offset = 0
flat = tensors[0].new(size)
for tensor, numel in zip(tensors, numels):
flat.narrow(0, offset, numel).copy_(tensor, broadcast=False)
offset += numel
return flat


def _unflatten_tensors(flat, tensors):
"""View a flat buffer using the sizes of tensors"""
outputs = []
offset = 0
for tensor in tensors:
numel = tensor.numel()
outputs.append(flat.narrow(0, offset, numel).view_as(tensor))
offset += numel
return tuple(outputs)


def _take_tensors(tensors, size_limit):
"""Groups tensors into lists of up to size_limit bytes"""
buf = []
size = 0
last_type = type(tensors[0]) if len(tensors) > 0 else None
for tensor in tensors:
t = type(tensor)
param_size = tensor.numel() * tensor.element_size()
if t is not last_type or (size + param_size > size_limit and size > 0):
yield buf
last_type = t
size = 0
buf = []
buf.append(tensor)
size += param_size
if len(buf) > 0:
yield buf

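The removed `_flatten_tensors` / `_unflatten_tensors` helpers pack a list of tensors into one contiguous 1-D buffer and view it back with the original shapes. A rough usage sketch, assuming the two functions from the listing above are in scope:

```python
import torch

tensors = [torch.ones(2, 3), torch.zeros(4)]
flat = _flatten_tensors(tensors)            # 1-D buffer with 2*3 + 4 = 10 elements
views = _unflatten_tensors(flat, tensors)   # same data, original shapes restored
assert flat.numel() == 10
assert views[0].size() == tensors[0].size() and views[1].size() == tensors[1].size()
```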
@@ -5,7 +5,6 @@ changes to the existing code - you only need to wrap all tensors in
:class:`.Variable` objects.
"""
import torch
import warnings

from .variable import Variable
from .function import Function, NestedIOFunction
@@ -15,41 +14,13 @@ from .gradcheck import gradcheck
__all__ = ['Variable', 'Function', 'StochasticFunction', 'backward']


def _make_grads(outputs, grads, user_create_graph):
if user_create_graph is not None:
create_graph = user_create_graph
else:
create_graph = any(isinstance(grad, Variable) and not grad.volatile
for grad in grads)

new_grads = []
for out, grad in zip(outputs, grads):
if isinstance(grad, Variable):
new_grads.append(grad)
elif torch.is_tensor(grad):
new_grads.append(Variable(grad, volatile=not create_graph))
elif grad is None:
if out.requires_grad:
if out.numel() != 1:
raise RuntimeError("grad can be implicitly created only for scalar outputs")
data = out.data
new_grads.append(
Variable(data.new().resize_as_(data).fill_(1), volatile=not create_graph))
else:
new_grads.append(None)
else:
raise TypeError("gradients can be either Tensors, Variables or None, but got " +
type(grad).__name__)
return tuple(new_grads), create_graph


def backward(variables, grad_variables=None, retain_graph=None, create_graph=None, retain_variables=None):
def backward(variables, grad_variables, retain_variables=False):
"""Computes the sum of gradients of given variables w.r.t. graph leaves.

The graph is differentiated using the chain rule. If any of ``variables``
are non-scalar (i.e. their data has more than one element) and require
gradient, the function additionaly requires specifying ``grad_variables``.
It should be a sequence of matching length, that contains gradient of
It should be a sequence of matching length, that containins gradient of
the differentiated function w.r.t. corresponding variables (``None`` is an
acceptable value for all variables that don't need gradient tensors).

@@ -59,98 +30,15 @@ def backward(variables, grad_variables=None, retain_graph=None, create_graph=Non
Arguments:
variables (sequence of Variable): Variables of which the derivative will be
computed.
grad_variables (sequence of (Tensor, Variable or None)): Gradients w.r.t.
each element of corresponding variables. Any tensors will be
automatically converted to Variables that are volatile unless
``create_graph`` is True. None values can be specified for scalar
Variables or ones that don't require grad. If a None value would
be acceptable for all grad_variables, then this argument is optional.
retain_graph (bool, optional): If False, the graph used to compute the grad
will be freed. Note that in nearly all cases setting this option to True
is not needed and often can be worked around in a much more efficient
way. Defaults to the value of ``create_graph``.
create_graph (bool, optional): If true, graph of the derivative will
be constructed, allowing to compute higher order derivative products.
Defaults to False, unless ``grad_variables`` contains at least one
non-volatile Variable.
grad_variables (sequence of Tensor): Gradients w.r.t. each element of
corresponding variables. Required only for non-scalar variables that
require gradient.
retain_variables (bool): If ``True``, buffers necessary for computing
gradients won't be freed after use. It is only necessary to
specify ``True`` if you want to differentiate some subgraph multiple
times.
"""
variables = (variables,) if isinstance(variables, Variable) else tuple(variables)

if grad_variables is None:
grad_variables = [None] * len(variables)
elif isinstance(grad_variables, Variable) or torch.is_tensor(grad_variables):
grad_variables = [grad_variables]
else:
grad_variables = list(grad_variables)

grad_variables, create_graph = _make_grads(variables, grad_variables, create_graph)

if retain_variables is not None:
if retain_graph is not None:
raise ValueError("only one of retain_graph and retain_variables can be specified")
retain_graph = retain_variables
warnings.warn("retain_variables option is deprecated and will be removed in 0.3. "
"Use retain_graph instead.")
elif retain_graph is None:
retain_graph = create_graph

Variable._execution_engine.run_backward(
variables, grad_variables, retain_graph)
tuple(variables), tuple(grad_variables), retain_variables)


def grad(outputs, inputs, grad_outputs=None, retain_graph=None, create_graph=None, only_inputs=True):
"""Computes and returns the sum of gradients of outputs w.r.t. the inputs.

``grad_outputs`` should be a sequence of length matching ``output``
containing the pre-computed gradients w.r.t. each of the outputs. If an
output doesn't require_grad, then the gradient can be ``None``).
Gradients can be given as Tensors when one doesn't need the graph of the
derivative, or as Variables, in which case the graph will be created.

If ``only_inputs`` is True, the function will only return a list of gradients
w.r.t the specified inputs. If it's False, then gradient w.r.t. all remaining
leaves will still be computed, and will be accumulated into their ``.grad``
attribute.

Arguments:
outputs (sequence of Variable): outputs of the differentiated function.
inputs (sequence of Variable): Inputs w.r.t. which the gradient will be
returned (and not accumulated into ``.grad``).
grad_outputs (sequence of Tensor or Variable): Gradients w.r.t. each output.
Any tensors will be automatically converted to Variables that are
volatile unless ``create_graph`` is True. None values can be
specified for scalar Variables or ones that don't require grad.
If a None value would be acceptable for all grad_variables, then
this argument is optional.
retain_graph (bool, optional): If False, the graph used to compute the grad
will be freed. Note that in nearly all cases setting this option to True
is not needed and often can be worked around in a much more efficient
way. Defaults to the value of ``create_graph``.
create_graph (bool, optional): If True, graph of the derivative will
be constructed, allowing to compute higher order derivative products.
Defaults to False, unless ``grad_variables`` contains at least one
non-volatile Variable.
only_inputs (bool, optional): If True, gradient w.r.t. leaves that are
part of the graph, but don't appear in ``inputs`` won't be computed
and accumulated. Defaults to True.
"""

outputs = (outputs,) if isinstance(outputs, Variable) else tuple(outputs)
inputs = (inputs,) if isinstance(inputs, Variable) else tuple(inputs)
if grad_outputs is None:
grad_outputs = [None] * len(outputs)
elif isinstance(grad_outputs, Variable) or torch.is_tensor(grad_outputs):
grad_outputs = [grad_outputs]
else:
grad_outputs = list(grad_outputs)

grad_outputs, create_graph = _make_grads(outputs, grad_outputs, create_graph)
if retain_graph is None:
retain_graph = create_graph

return Variable._execution_engine.run_backward(
outputs, grad_outputs, retain_graph,
inputs, only_inputs)

if not torch._C._autograd_init():
raise RuntimeError("autograd initialization failed")
assert torch._C._autograd_init()

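To make the two signatures above concrete, here is a short usage sketch with the Variable-based API of this era (shapes and values are arbitrary):

```python
import torch
from torch.autograd import Variable, backward, grad

x = Variable(torch.randn(3), requires_grad=True)
y = (x * 2).sum()        # scalar output, so grad_variables can be omitted
backward(y)              # same as y.backward(); gradient lands in x.grad
print(x.grad)

z = (x ** 2).sum()
dz_dx, = grad(z, x)      # returns the gradient instead of accumulating it
```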
@ -1,228 +1,200 @@
|
||||
import torch
|
||||
from ..function import Function, InplaceFunction
|
||||
from .utils import maybe_unexpand, maybe_unexpand_or_view
|
||||
import math
|
||||
|
||||
|
||||
def maybe_view(tensor, size):
|
||||
if tensor.size() == size:
|
||||
return tensor
|
||||
return tensor.contiguous().view(size)
|
||||
|
||||
|
||||
class Add(InplaceFunction):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, a, b, inplace=False):
|
||||
ctx.a_size = a.size()
|
||||
ctx.b_size = b.size()
|
||||
if inplace:
|
||||
ctx.mark_dirty(a)
|
||||
def forward(self, a, b):
|
||||
self.b_size = b.size()
|
||||
if self.inplace:
|
||||
self.mark_dirty(a)
|
||||
return a.add_(b)
|
||||
else:
|
||||
return a.add(b)
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
return maybe_unexpand(grad_output, ctx.a_size), maybe_unexpand_or_view(grad_output, ctx.b_size), None
|
||||
def backward(self, grad_output):
|
||||
return grad_output, maybe_view(grad_output, self.b_size)
|
||||
|
||||
|
||||
class Sub(InplaceFunction):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, a, b, inplace=False):
|
||||
ctx.a_size = a.size()
|
||||
ctx.b_size = b.size()
|
||||
if inplace:
|
||||
ctx.mark_dirty(a)
|
||||
def forward(self, a, b):
|
||||
self.b_size = b.size()
|
||||
if self.inplace:
|
||||
self.mark_dirty(a)
|
||||
return a.sub_(b)
|
||||
else:
|
||||
return a.sub(b)
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
return maybe_unexpand(grad_output, ctx.a_size), maybe_unexpand_or_view(grad_output.neg(), ctx.b_size), None
|
||||
def backward(self, grad_output):
|
||||
return grad_output, maybe_view(grad_output.neg(), self.b_size)
|
||||
|
||||
|
||||
class Mul(Function):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, a, b):
|
||||
ctx.a_size = a.size()
|
||||
ctx.b_size = b.size()
|
||||
ctx.save_for_backward(a, b)
|
||||
def forward(self, a, b):
|
||||
self.b_size = b.size()
|
||||
self.save_for_backward(a, b)
|
||||
return a.mul(b)
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
a, b = ctx.saved_variables
|
||||
return maybe_unexpand(grad_output.mul(b), ctx.a_size), maybe_unexpand_or_view(grad_output.mul(a), ctx.b_size)
|
||||
def backward(self, grad_output):
|
||||
a, b = self.saved_tensors
|
||||
return grad_output.mul(b), maybe_view(grad_output.mul(a), self.b_size)
|
||||
|
||||
|
||||
class Div(Function):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, a, b):
|
||||
ctx.a_size = a.size()
|
||||
ctx.b_size = b.size()
|
||||
ctx.save_for_backward(a, b)
|
||||
def forward(self, a, b):
|
||||
self.b_size = b.size()
|
||||
self.save_for_backward(a, b)
|
||||
return a.div(b)
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
a, b = ctx.saved_variables
|
||||
b_rec = b.reciprocal()
|
||||
grad_a = grad_output.mul(b_rec)
|
||||
grad_b = grad_output.neg().mul(a).mul(b_rec).mul(b_rec)
|
||||
return maybe_unexpand(grad_a, ctx.a_size), maybe_unexpand_or_view(grad_b, ctx.b_size)
|
||||
def backward(self, grad_output):
|
||||
a, b = self.saved_tensors
|
||||
return grad_output.div(b), maybe_view(grad_output.neg().mul(a).div_(b).div_(b), self.b_size)
|
||||
|
||||
|
||||
class Pow(Function):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, a, b):
|
||||
ctx.a_size = a.size()
|
||||
ctx.b_size = b.size()
|
||||
ctx.save_for_backward(a, b)
|
||||
def forward(self, a, b):
|
||||
self.b_size = b.size()
|
||||
self.save_for_backward(a, b)
|
||||
return a.pow(b)
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
a, b = ctx.saved_variables
|
||||
grad_a = grad_output.mul(b).mul(a.pow(b - 1))
|
||||
grad_b = grad_output.mul(a.pow(b)).mul(a.log())
|
||||
return maybe_unexpand(grad_a, ctx.a_size), maybe_unexpand_or_view(grad_b, ctx.b_size)
|
||||
|
||||
|
||||
def sort_args(a, b):
|
||||
return (a, b, True) if torch.is_tensor(a) else (b, a, False)
|
||||
def backward(self, grad_output):
|
||||
a, b = self.saved_tensors
|
||||
return grad_output.mul(b).mul_(a.pow(b - 1)), maybe_view(grad_output.mul(a.pow(b)).mul_(a.log()), self.b_size)
|
||||
|
||||
|
||||
class AddConstant(InplaceFunction):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, a, b, inplace=False):
|
||||
tensor, constant, ctx.tensor_first = sort_args(a, b)
|
||||
if inplace:
|
||||
ctx.mark_dirty(tensor)
|
||||
return tensor.add_(constant)
|
||||
else:
|
||||
return tensor.add(constant)
|
||||
def __init__(self, constant, inplace=False):
|
||||
super(AddConstant, self).__init__(inplace)
|
||||
self.constant = constant
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
if ctx.tensor_first:
|
||||
return grad_output, None, None
|
||||
def forward(self, a):
|
||||
if self.inplace:
|
||||
self.mark_dirty(a)
|
||||
return a.add_(self.constant)
|
||||
else:
|
||||
return None, grad_output, None
|
||||
return a.add(self.constant)
|
||||
|
||||
def backward(self, grad_output):
|
||||
return grad_output
|
||||
|
||||
|
||||
class SubConstant(InplaceFunction):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, a, b, inplace=False):
|
||||
tensor, constant, ctx.tensor_first = sort_args(a, b)
|
||||
if ctx.tensor_first:
|
||||
if inplace:
|
||||
ctx.mark_dirty(tensor)
|
||||
return tensor.sub_(constant)
|
||||
else:
|
||||
return tensor.sub(constant)
|
||||
else:
|
||||
if inplace:
|
||||
ctx.mark_dirty(tensor)
|
||||
return tensor.neg_().add_(constant)
|
||||
else:
|
||||
return tensor.neg().add_(constant)
|
||||
def __init__(self, constant, sub_tensor=False, inplace=False):
|
||||
super(SubConstant, self).__init__(inplace)
|
||||
self.constant = constant
|
||||
self.sub_tensor = sub_tensor
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
if ctx.tensor_first:
|
||||
return grad_output, None, None
|
||||
def forward(self, a):
|
||||
if self.sub_tensor:
|
||||
if a.is_signed() and self.inplace:
|
||||
self.mark_dirty(a)
|
||||
return a.neg_().add_(self.constant)
|
||||
else:
|
||||
assert not self.inplace, "can't perform (constant - tensor) " \
|
||||
"subtraction in-place on an unsigned type"
|
||||
return a.new().resize_as_(a).fill_(self.constant).sub_(a)
|
||||
else:
|
||||
return None, grad_output.neg(), None
|
||||
if self.inplace:
|
||||
self.mark_dirty(a)
|
||||
return a.sub_(self.constant)
|
||||
else:
|
||||
return a.sub(self.constant)
|
||||
|
||||
def backward(self, grad_output):
|
||||
if self.sub_tensor:
|
||||
return grad_output.neg()
|
||||
else:
|
||||
return grad_output
|
||||
|
||||
|
||||
class MulConstant(InplaceFunction):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, a, b, inplace=False):
|
||||
tensor, ctx.constant, ctx.tensor_first = sort_args(a, b)
|
||||
if inplace:
|
||||
ctx.mark_dirty(tensor)
|
||||
return tensor.mul_(ctx.constant)
|
||||
else:
|
||||
return tensor.mul(ctx.constant)
|
||||
def __init__(self, constant, inplace=False):
|
||||
super(MulConstant, self).__init__(inplace)
|
||||
self.constant = constant
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
grad_input = grad_output.mul(ctx.constant)
|
||||
if ctx.tensor_first:
|
||||
return grad_input, None, None
|
||||
def forward(self, a):
|
||||
if self.inplace:
|
||||
self.mark_dirty(a)
|
||||
return a.mul_(self.constant)
|
||||
else:
|
||||
return None, grad_input, None
|
||||
return a.mul(self.constant)
|
||||
|
||||
def backward(self, grad_output):
|
||||
return grad_output.mul(self.constant)
|
||||
|
||||
|
||||
class DivConstant(InplaceFunction):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, a, b, inplace=False):
|
||||
tensor, ctx.constant, ctx.tensor_first = sort_args(a, b)
|
||||
ctx.inplace = inplace
|
||||
if ctx.tensor_first:
|
||||
if inplace:
|
||||
ctx.mark_dirty(tensor)
|
||||
return tensor.div_(ctx.constant)
|
||||
else:
|
||||
return tensor.div(ctx.constant)
|
||||
else:
|
||||
ctx.save_for_backward(tensor)
|
||||
if inplace:
|
||||
ctx.mark_dirty(tensor)
|
||||
return tensor.reciprocal_().mul_(ctx.constant)
|
||||
else:
|
||||
return tensor.reciprocal().mul_(ctx.constant)
|
||||
def __init__(self, constant, div_by_tensor=False, inplace=False):
|
||||
super(DivConstant, self).__init__(inplace)
|
||||
self.constant = constant
|
||||
self.div_by_tensor = div_by_tensor
|
||||
if self.inplace and self.div_by_tensor:
|
||||
# TODO: actually, as long as the type is floating point, we can
|
||||
raise RuntimeError("can't perform (constant / tensor) division in-place")
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
if ctx.tensor_first:
|
||||
return grad_output.div(ctx.constant), None, None
|
||||
def forward(self, a):
|
||||
if self.div_by_tensor:
|
||||
self.save_for_backward(a)
|
||||
return a.new().resize_as_(a).fill_(self.constant).div_(a)
|
||||
else:
|
||||
v, = ctx.saved_variables
|
||||
if ctx.inplace:
|
||||
return None, grad_output.mul(v).mul(v).div_(-ctx.constant), None
|
||||
if self.inplace:
|
||||
return a.div_(self.constant)
|
||||
else:
|
||||
v_rep = v.reciprocal()
|
||||
return None, grad_output.mul(v_rep).mul(v_rep).mul_(-ctx.constant), None
|
||||
return a.div(self.constant)
|
||||
|
||||
def backward(self, grad_output):
|
||||
if self.div_by_tensor:
|
||||
a = self.saved_tensors[0]
|
||||
return grad_output.neg().mul_(self.constant).div_(a).div_(a)
|
||||
else:
|
||||
return grad_output.div(self.constant)
|
||||
|
||||
|
||||
class PowConstant(Function):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, a, b):
|
||||
tensor, ctx.constant, ctx.tensor_first = sort_args(a, b)
|
||||
if ctx.tensor_first:
|
||||
ctx.save_for_backward(tensor)
|
||||
return tensor.pow(ctx.constant)
|
||||
else:
|
||||
result = torch.pow(ctx.constant, tensor)
|
||||
ctx.save_for_backward(result)
|
||||
return result
|
||||
def __init__(self, constant, tensor_power=False):
|
||||
super(PowConstant, self).__init__()
|
||||
self.constant = constant
|
||||
self.tensor_power = tensor_power
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
if ctx.tensor_first:
|
||||
var, = ctx.saved_variables
|
||||
return grad_output.mul(ctx.constant).mul(var.pow(ctx.constant - 1)), None
|
||||
def forward(self, a):
|
||||
if self.tensor_power:
|
||||
self.fw_result = torch.pow(self.constant, a)
|
||||
return self.fw_result
|
||||
else:
|
||||
var_result, = ctx.saved_variables
|
||||
return None, grad_output.mul(var_result).mul_(math.log(ctx.constant))
|
||||
self.save_for_backward(a)
|
||||
return a.pow(self.constant)
|
||||
|
||||
def backward(self, grad_output):
|
||||
if self.tensor_power:
|
||||
return grad_output.mul(self.fw_result).mul_(math.log(self.constant))
|
||||
else:
|
||||
a = self.saved_tensors[0]
|
||||
return grad_output.mul(self.constant).mul_(a.pow(self.constant - 1))
|
||||
|
||||
|
||||
class Negate(InplaceFunction):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, i, inplace=False):
|
||||
if inplace:
|
||||
ctx.mark_dirty(i)
|
||||
def forward(self, i):
|
||||
if self.inplace:
|
||||
return i.neg_()
|
||||
else:
|
||||
return i.neg()
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
return grad_output.neg(), None
|
||||
def backward(self, grad_output):
|
||||
return grad_output.neg()
|
||||
|
||||
@ -1,224 +1,195 @@
|
||||
import torch
|
||||
|
||||
from ..function import Function, InplaceFunction
|
||||
from .utils import maybe_unexpand
|
||||
|
||||
|
||||
# TODO: no need to save all args if the grad w.r.t. some of them is not needed
|
||||
def _get_output(ctx, arg, inplace=False):
|
||||
if inplace:
|
||||
ctx.mark_dirty(arg)
|
||||
return arg
|
||||
else:
|
||||
return arg.new().resize_as_(arg)
|
||||
class _BlasBase(InplaceFunction):
|
||||
|
||||
def __init__(self, alpha=1, beta=1, inplace=False):
|
||||
super(_BlasBase, self).__init__(inplace)
|
||||
self.alpha = alpha
|
||||
self.beta = beta
|
||||
|
||||
def _get_output(self, arg):
|
||||
if self.inplace:
|
||||
self.mark_dirty(arg)
|
||||
return arg
|
||||
else:
|
||||
return arg.new().resize_as_(arg)
|
||||
|
||||
|
||||
class Addmm(InplaceFunction):
|
||||
class Addmm(_BlasBase):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, add_matrix, matrix1, matrix2, alpha=1, beta=1, inplace=False):
|
||||
ctx.alpha = alpha
|
||||
ctx.beta = beta
|
||||
ctx.add_matrix_size = add_matrix.size()
|
||||
ctx.save_for_backward(matrix1, matrix2)
|
||||
output = _get_output(ctx, add_matrix, inplace=inplace)
|
||||
return torch.addmm(alpha, add_matrix, beta,
|
||||
def forward(self, add_matrix, matrix1, matrix2):
|
||||
self.save_for_backward(matrix1, matrix2)
|
||||
output = self._get_output(add_matrix)
|
||||
return torch.addmm(self.alpha, add_matrix, self.beta,
|
||||
matrix1, matrix2, out=output)
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
matrix1, matrix2 = ctx.saved_variables
|
||||
def backward(self, grad_output):
|
||||
matrix1, matrix2 = self.saved_tensors
|
||||
grad_add_matrix = grad_matrix1 = grad_matrix2 = None
|
||||
|
||||
if ctx.needs_input_grad[0]:
|
||||
grad_add_matrix = maybe_unexpand(grad_output, ctx.add_matrix_size)
|
||||
if ctx.alpha != 1:
|
||||
grad_add_matrix = grad_add_matrix.mul(ctx.alpha)
|
||||
if self.needs_input_grad[0]:
|
||||
grad_add_matrix = grad_output
|
||||
if self.alpha != 1:
|
||||
grad_add_matrix = grad_add_matrix.mul(self.alpha)
|
||||
|
||||
if ctx.needs_input_grad[1]:
|
||||
if matrix1.stride() == (1, matrix1.size(0)):
|
||||
# column major gradient if input is column major
|
||||
grad_matrix1 = torch.mm(matrix2, grad_output.t()).t()
|
||||
else:
|
||||
grad_matrix1 = torch.mm(grad_output, matrix2.t())
|
||||
if ctx.beta != 1:
|
||||
grad_matrix1 *= ctx.beta
|
||||
if self.needs_input_grad[1]:
|
||||
grad_matrix1 = torch.mm(grad_output, matrix2.t())
|
||||
if self.beta != 1:
|
||||
grad_matrix1 *= self.beta
|
||||
|
||||
if ctx.needs_input_grad[2]:
|
||||
if matrix2.stride() == (1, matrix2.size(0)):
|
||||
# column major gradient if input is column major
|
||||
grad_matrix2 = torch.mm(grad_output.t(), matrix1).t()
|
||||
else:
|
||||
grad_matrix2 = torch.mm(matrix1.t(), grad_output)
|
||||
if ctx.beta != 1:
|
||||
grad_matrix2 *= ctx.beta
|
||||
if self.needs_input_grad[2]:
|
||||
grad_matrix2 = torch.mm(matrix1.t(), grad_output)
|
||||
if self.beta != 1:
|
||||
grad_matrix2 *= self.beta
|
||||
|
||||
return grad_add_matrix, grad_matrix1, grad_matrix2, None, None, None
|
||||
return grad_add_matrix, grad_matrix1, grad_matrix2
|
||||
|
||||
|
||||
class Addbmm(InplaceFunction):
|
||||
class Addbmm(_BlasBase):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, add_matrix, batch1, batch2, alpha=1, beta=1, inplace=False):
|
||||
ctx.alpha = alpha
|
||||
ctx.beta = beta
|
||||
ctx.add_matrix_size = add_matrix.size()
|
||||
ctx.save_for_backward(batch1, batch2)
|
||||
output = _get_output(ctx, add_matrix, inplace=inplace)
|
||||
return torch.addbmm(alpha, add_matrix, beta,
|
||||
def forward(self, add_matrix, batch1, batch2):
|
||||
self.save_for_backward(batch1, batch2)
|
||||
output = self._get_output(add_matrix)
|
||||
return torch.addbmm(self.alpha, add_matrix, self.beta,
|
||||
batch1, batch2, out=output)
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
batch1, batch2 = ctx.saved_variables
|
||||
def backward(self, grad_output):
|
||||
batch1, batch2 = self.saved_tensors
|
||||
grad_add_matrix = grad_batch1 = grad_batch2 = None
|
||||
|
||||
if ctx.needs_input_grad[0]:
|
||||
grad_add_matrix = maybe_unexpand(grad_output, ctx.add_matrix_size)
|
||||
if ctx.alpha != 1:
|
||||
grad_add_matrix = grad_add_matrix.mul(ctx.alpha)
|
||||
if self.needs_input_grad[0]:
|
||||
grad_add_matrix = grad_output
|
||||
if self.alpha != 1:
|
||||
grad_add_matrix = grad_add_matrix.mul(self.alpha)
|
||||
|
||||
if any(ctx.needs_input_grad[1:]):
|
||||
if any(self.needs_input_grad[1:]):
|
||||
batch_grad_output = (grad_output
|
||||
.unsqueeze(0)
|
||||
.expand(batch1.size(0), batch1.size(1), batch2.size(2)))
|
||||
|
||||
if ctx.needs_input_grad[1]:
|
||||
if self.needs_input_grad[1]:
|
||||
grad_batch1 = torch.bmm(batch_grad_output, batch2.transpose(1, 2))
|
||||
if ctx.beta != 1:
|
||||
grad_batch1 *= ctx.beta
|
||||
if self.beta != 1:
|
||||
grad_batch1 *= self.beta
|
||||
|
||||
if ctx.needs_input_grad[2]:
|
||||
if self.needs_input_grad[2]:
|
||||
grad_batch2 = torch.bmm(batch1.transpose(1, 2), batch_grad_output)
|
||||
if ctx.beta != 1:
|
||||
grad_batch2 *= ctx.beta
|
||||
if self.beta != 1:
|
||||
grad_batch2 *= self.beta
|
||||
|
||||
return grad_add_matrix, grad_batch1, grad_batch2, None, None, None
|
||||
return grad_add_matrix, grad_batch1, grad_batch2
|
||||
|
||||
|
||||
class Baddbmm(InplaceFunction):
|
||||
class Baddbmm(_BlasBase):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, add_batch, batch1, batch2, alpha=1, beta=1, inplace=False):
|
||||
ctx.alpha = alpha
|
||||
ctx.beta = beta
|
||||
ctx.add_batch_size = add_batch.size()
|
||||
ctx.save_for_backward(batch1, batch2)
|
||||
output = _get_output(ctx, add_batch, inplace=inplace)
|
||||
return torch.baddbmm(alpha, add_batch, beta,
|
||||
def forward(self, add_batch, batch1, batch2):
|
||||
self.save_for_backward(batch1, batch2)
|
||||
output = self._get_output(add_batch)
|
||||
return torch.baddbmm(self.alpha, add_batch, self.beta,
|
||||
batch1, batch2, out=output)
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
batch1, batch2 = ctx.saved_variables
|
||||
def backward(self, grad_output):
|
||||
batch1, batch2 = self.saved_tensors
|
||||
grad_add_batch = grad_batch1 = grad_batch2 = None
|
||||
|
||||
if ctx.needs_input_grad[0]:
|
||||
grad_add_batch = maybe_unexpand(grad_output, ctx.add_batch_size)
|
||||
if ctx.alpha != 1:
|
||||
grad_add_batch = grad_add_batch.mul(ctx.alpha)
|
||||
if self.needs_input_grad[0]:
|
||||
grad_add_batch = grad_output
|
||||
if self.alpha != 1:
|
||||
grad_add_batch = grad_add_batch.mul(self.alpha)
|
||||
|
||||
if ctx.needs_input_grad[1]:
|
||||
if self.needs_input_grad[1]:
|
||||
grad_batch1 = torch.bmm(grad_output, batch2.transpose(1, 2))
|
||||
if ctx.beta != 1:
|
||||
grad_batch1 *= ctx.beta
|
||||
if self.beta != 1:
|
||||
grad_batch1 *= self.beta
|
||||
|
||||
if ctx.needs_input_grad[2]:
|
||||
if self.needs_input_grad[2]:
|
||||
grad_batch2 = torch.bmm(batch1.transpose(1, 2), grad_output)
|
||||
if ctx.beta != 1:
|
||||
grad_batch2 *= ctx.beta
|
||||
if self.beta != 1:
|
||||
grad_batch2 *= self.beta
|
||||
|
||||
return grad_add_batch, grad_batch1, grad_batch2, None, None, None
|
||||
return grad_add_batch, grad_batch1, grad_batch2
|
||||
|
||||
|
||||
class Addmv(InplaceFunction):
|
||||
class Addmv(_BlasBase):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, add_vector, matrix, vector, alpha=1, beta=1, inplace=False):
|
||||
ctx.alpha = alpha
|
||||
ctx.beta = beta
|
||||
ctx.add_vector_size = add_vector.size()
|
||||
ctx.save_for_backward(matrix, vector)
|
||||
output = _get_output(ctx, add_vector, inplace=inplace)
|
||||
return torch.addmv(alpha, add_vector, beta,
|
||||
def forward(self, add_vector, matrix, vector):
|
||||
self.save_for_backward(matrix, vector)
|
||||
output = self._get_output(add_vector)
|
||||
return torch.addmv(self.alpha, add_vector, self.beta,
|
||||
matrix, vector, out=output)
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
matrix, vector = ctx.saved_variables
|
||||
def backward(self, grad_output):
|
||||
matrix, vector = self.saved_tensors
|
||||
grad_add_vector = grad_matrix = grad_vector = None
|
||||
|
||||
if ctx.needs_input_grad[0]:
|
||||
grad_add_vector = maybe_unexpand(grad_output, ctx.add_vector_size)
|
||||
if ctx.alpha != 1:
|
||||
grad_add_vector = grad_add_vector.mul(ctx.alpha)
|
||||
if self.needs_input_grad[0]:
|
||||
grad_add_vector = grad_output
|
||||
if self.alpha != 1:
|
||||
grad_add_vector = grad_add_vector.mul(self.alpha)
|
||||
|
||||
if ctx.needs_input_grad[1]:
|
||||
if self.needs_input_grad[1]:
|
||||
grad_matrix = torch.ger(grad_output, vector)
|
||||
if ctx.beta != 1:
|
||||
grad_matrix *= ctx.beta
|
||||
if self.beta != 1:
|
||||
grad_matrix *= self.beta
|
||||
|
||||
if ctx.needs_input_grad[2]:
|
||||
if self.needs_input_grad[2]:
|
||||
grad_vector = torch.mv(matrix.t(), grad_output)
|
||||
if ctx.beta != 1:
|
||||
grad_vector *= ctx.beta
|
||||
if self.beta != 1:
|
||||
grad_vector *= self.beta
|
||||
|
||||
return grad_add_vector, grad_matrix, grad_vector, None, None, None
|
||||
return grad_add_vector, grad_matrix, grad_vector
|
||||
|
||||
|
||||
class Addr(InplaceFunction):
|
||||
class Addr(_BlasBase):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, add_matrix, vector1, vector2, alpha=1, beta=1, inplace=False):
|
||||
ctx.alpha = alpha
|
||||
ctx.beta = beta
|
||||
ctx.add_matrix_size = add_matrix.size()
|
||||
ctx.save_for_backward(vector1, vector2)
|
||||
output = _get_output(ctx, add_matrix, inplace=inplace)
|
||||
return torch.addr(alpha, add_matrix, beta,
|
||||
def forward(self, add_matrix, vector1, vector2):
|
||||
self.save_for_backward(vector1, vector2)
|
||||
output = self._get_output(add_matrix)
|
||||
return torch.addr(self.alpha, add_matrix, self.beta,
|
||||
vector1, vector2, out=output)
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
vector1, vector2 = ctx.saved_variables
|
||||
def backward(self, grad_output):
|
||||
vector1, vector2 = self.saved_tensors
|
||||
grad_add_matrix = grad_vector1 = grad_vector2 = None
|
||||
|
||||
if ctx.needs_input_grad[0]:
|
||||
grad_add_matrix = maybe_unexpand(grad_output, ctx.add_matrix_size)
|
||||
if ctx.alpha != 1:
|
||||
grad_add_matrix = grad_add_matrix.mul(ctx.alpha)
|
||||
if self.needs_input_grad[0]:
|
||||
grad_add_matrix = grad_output
|
||||
if self.alpha != 1:
|
||||
grad_add_matrix = grad_add_matrix.mul(self.alpha)
|
||||
|
||||
if ctx.needs_input_grad[1]:
|
||||
if self.needs_input_grad[1]:
|
||||
grad_vector1 = torch.mv(grad_output, vector2)
|
||||
if ctx.beta != 1:
|
||||
grad_vector1 *= ctx.beta
|
||||
if self.beta != 1:
|
||||
grad_vector1 *= self.beta
|
||||
|
||||
if ctx.needs_input_grad[2]:
|
||||
if self.needs_input_grad[2]:
|
||||
# TODO: maybe it's better to do transpose + mv + transpose
|
||||
grad_vector2 = torch.mm(vector1.unsqueeze(0), grad_output).squeeze(0)
|
||||
if ctx.beta != 1:
|
||||
grad_vector2 *= ctx.beta
|
||||
if self.beta != 1:
|
||||
grad_vector2 *= self.beta
|
||||
|
||||
return grad_add_matrix, grad_vector1, grad_vector2, None, None, None
|
||||
return grad_add_matrix, grad_vector1, grad_vector2
|
||||
|
||||
|
||||
class Dot(Function):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, vector1, vector2):
|
||||
ctx.save_for_backward(vector1, vector2)
|
||||
ctx.sizes = (vector1.size(), vector2.size())
|
||||
def forward(self, vector1, vector2):
|
||||
self.save_for_backward(vector1, vector2)
|
||||
self.sizes = (vector1.size(), vector2.size())
|
||||
return vector1.new((vector1.dot(vector2),))
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
vector1, vector2 = ctx.saved_variables
|
||||
def backward(self, grad_output):
|
||||
vector1, vector2 = self.saved_tensors
|
||||
grad_vector1 = grad_vector2 = None
|
||||
|
||||
if ctx.needs_input_grad[0]:
|
||||
grad_vector1 = vector2.mul(grad_output.expand(ctx.sizes[1])).view(ctx.sizes[0])
|
||||
if self.needs_input_grad[0]:
|
||||
grad_vector1 = vector2.mul(grad_output[0]).view(self.sizes[0])
|
||||
|
||||
if ctx.needs_input_grad[1]:
|
||||
grad_vector2 = vector1.mul(grad_output.expand(ctx.sizes[0])).view(ctx.sizes[1])
|
||||
if self.needs_input_grad[1]:
|
||||
grad_vector2 = vector1.mul(grad_output[0]).view(self.sizes[1])
|
||||
|
||||
return grad_vector1, grad_vector2
|
||||
|
||||
@ -1,28 +1,19 @@
|
||||
import torch
|
||||
|
||||
from ..function import Function
|
||||
from .utils import maybe_unexpand, maybe_unexpand_or_view
|
||||
|
||||
|
||||
# TODO: once Cpp-style functions are implemented we can detach a and b
|
||||
# before calling forward.
|
||||
class _CompareOp(Function):
|
||||
|
||||
@classmethod
|
||||
def forward(cls, ctx, a, b):
|
||||
ctx.a_size = a.size()
|
||||
ctx.b_tensor = torch.is_tensor(b)
|
||||
ctx.b_size = b.size() if ctx.b_tensor else None
|
||||
ctx.input_type = type(a)
|
||||
mask = getattr(a, cls.fn_name)(b)
|
||||
ctx.mark_non_differentiable(mask)
|
||||
return mask
|
||||
def __init__(self, scalar=None):
|
||||
super(_CompareOp, self).__init__()
|
||||
self.scalar = scalar
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
grad_input = (grad_output * 0).type(ctx.input_type)
|
||||
return (maybe_unexpand(grad_input, ctx.a_size),
|
||||
maybe_unexpand_or_view(grad_input, ctx.b_size) if ctx.b_tensor else None)
|
||||
def forward(self, tensor1, tensor2=None):
|
||||
other = tensor2 if tensor2 is not None else self.scalar
|
||||
mask = getattr(tensor1, self.fn_name)(other)
|
||||
self.mark_non_differentiable(mask)
|
||||
return mask
|
||||
|
||||
|
||||
class Eq(_CompareOp):
|
||||
|
||||
@ -1,104 +1,71 @@
|
||||
import torch
|
||||
|
||||
from ..function import Function
|
||||
from ..variable import Variable
|
||||
|
||||
|
||||
class Diag(Function):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, input, diagonal_idx=0):
|
||||
ctx.diagonal_idx = diagonal_idx
|
||||
return input.diag(ctx.diagonal_idx)
|
||||
def __init__(self, diagonal_idx=0):
|
||||
super(Diag, self).__init__()
|
||||
self.diagonal_idx = diagonal_idx
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
return grad_output.diag(ctx.diagonal_idx), None
|
||||
def forward(self, input):
|
||||
return input.diag(self.diagonal_idx)
|
||||
|
||||
def backward(self, grad_output):
|
||||
return grad_output.diag(self.diagonal_idx)
|
||||
|
||||
|
||||
class Tril(Function):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, input, diagonal_idx=0):
|
||||
ctx.diagonal_idx = diagonal_idx
|
||||
return input.tril(ctx.diagonal_idx)
|
||||
def __init__(self, diagonal_idx=0):
|
||||
super(Tril, self).__init__()
|
||||
self.diagonal_idx = diagonal_idx
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
return grad_output.tril(ctx.diagonal_idx), None
|
||||
def forward(self, input):
|
||||
return input.tril(self.diagonal_idx)
|
||||
|
||||
def backward(self, grad_output):
|
||||
return grad_output.tril(self.diagonal_idx)
|
||||
|
||||
|
||||
class Triu(Function):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, input, diagnoal_idx=0):
|
||||
ctx.diagonal_idx = diagnoal_idx
|
||||
return input.triu(ctx.diagonal_idx)
|
||||
def __init__(self, diagonal_idx=0):
|
||||
super(Triu, self).__init__()
|
||||
self.diagonal_idx = diagonal_idx
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
return grad_output.triu(ctx.diagonal_idx), None
|
||||
def forward(self, input):
|
||||
return input.triu(self.diagonal_idx)
|
||||
|
||||
def backward(self, grad_output):
|
||||
return grad_output.triu(self.diagonal_idx)
|
||||
|
||||
|
||||
class Trace(Function):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, input):
|
||||
ctx.isize = input.size()
|
||||
return input.new((input.trace(), ))
|
||||
def forward(self, input):
|
||||
self.isize = input.size()
|
||||
return input.new((input.trace(),))
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
isize = ctx.isize
|
||||
min_size = min(isize)
|
||||
grad_input = Variable(grad_output.data.new(isize).zero_()).view(-1)
|
||||
grad_input[::(isize[1] + 1)] = grad_output.expand(min_size)
|
||||
return grad_input.view(isize)
|
||||
def backward(self, grad_output):
|
||||
isize = self.isize
|
||||
grad_input = grad_output.new(isize).zero_()
|
||||
grad_input.view(-1)[::(isize[1] + 1)] = grad_output[0]
|
||||
return grad_input
|
||||
|
||||
|
||||
class Cross(Function):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, input, other, dim=-1):
|
||||
ctx.dim = dim
|
||||
ctx.save_for_backward(input, other)
|
||||
return torch.cross(input, other, ctx.dim)
|
||||
def __init__(self, dim=-1):
|
||||
self.dim = dim
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
input, other = ctx.saved_variables
|
||||
grad_input = other.cross(grad_output, ctx.dim)
|
||||
grad_other = grad_output.cross(input, ctx.dim)
|
||||
return grad_input, grad_other, None
|
||||
def forward(self, input, other):
|
||||
self.save_for_backward(input, other)
|
||||
return torch.cross(input, other, self.dim)
|
||||
|
||||
|
||||
class Inverse(Function):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, input):
|
||||
inverse = torch.inverse(input)
|
||||
ctx.save_for_backward(inverse)
|
||||
return inverse
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
inverse, = ctx.saved_variables
|
||||
return -torch.mm(inverse.t(), torch.mm(grad_output, inverse.t()))
|
||||
|
||||
|
||||
class Gesv(Function):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, b, a):
|
||||
# TODO see if one can backprop through LU
|
||||
X, LU = torch.gesv(b, a)
|
||||
ctx.save_for_backward(X, a)
|
||||
ctx.mark_non_differentiable(LU)
|
||||
return X, LU
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output, grad_LU=None):
|
||||
X, a = ctx.saved_variables
|
||||
grad_b, _ = torch.gesv(grad_output, a.t())
|
||||
grad_a = -torch.mm(grad_b, X.t())
|
||||
return grad_b, grad_a
|
||||
def backward(self, grad_output):
|
||||
input, other = self.saved_tensors
|
||||
grad_input = torch.cross(other, grad_output, self.dim)
|
||||
grad_other = torch.cross(grad_output, input, self.dim)
|
||||
return grad_input, grad_other
|
||||
|
||||
@ -1,351 +1,282 @@
|
||||
from itertools import repeat
|
||||
|
||||
from ..._thnn import type2backend
|
||||
from ..function import Function, InplaceFunction
|
||||
from ..variable import Variable
|
||||
from .utils import maybe_unexpand, maybe_unexpand_or_view
|
||||
|
||||
|
||||
class Exp(InplaceFunction):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, i, inplace=False):
|
||||
if inplace:
|
||||
ctx.mark_dirty(i)
|
||||
def forward(self, i):
|
||||
if self.inplace:
|
||||
self.mark_dirty(i)
|
||||
result = i.exp_()
|
||||
else:
|
||||
result = i.exp()
|
||||
ctx.save_for_backward(result)
|
||||
self.save_for_backward(result)
|
||||
return result
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
result, = ctx.saved_variables
|
||||
return grad_output * result, None
|
||||
def backward(self, grad_output):
|
||||
return self.saved_tensors[0] * grad_output
|
||||
|
||||
|
||||
class Log(Function):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, i):
|
||||
ctx.save_for_backward(i)
|
||||
def forward(self, i):
|
||||
self.save_for_backward(i)
|
||||
return i.log()
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
i, = ctx.saved_variables
|
||||
return grad_output.div(i)
|
||||
def backward(self, grad_output):
|
||||
return grad_output.div(self.saved_tensors[0])
|
||||
|
||||
|
||||
class Log1p(Function):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, i):
|
||||
ctx.save_for_backward(i)
|
||||
def forward(self, i):
|
||||
self.save_for_backward(i)
|
||||
return i.log1p()
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
i, = ctx.saved_variables
|
||||
return grad_output.div(i.add(1))
|
||||
def backward(self, grad_output):
|
||||
return grad_output.div(self.saved_tensors[0].add(1))
|
||||
|
||||
|
||||
class Tanh(InplaceFunction):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, i, inplace=False):
|
||||
if inplace:
|
||||
ctx.mark_dirty(i)
|
||||
def forward(self, i):
|
||||
if self.inplace:
|
||||
self.mark_dirty(i)
|
||||
result = i.tanh_()
|
||||
else:
|
||||
result = i.tanh()
|
||||
ctx.save_for_backward(result)
|
||||
self.save_for_backward(result)
|
||||
return result
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
result, = ctx.saved_variables
|
||||
if grad_output.volatile:
|
||||
grad_input = Variable(grad_output.data.new(grad_output.size()), volatile=True)
|
||||
backend = type2backend[type(result.data)]
|
||||
backend.Tanh_updateGradInput(backend.library_state, None, grad_output.data,
|
||||
grad_input.data, result.data)
|
||||
else:
|
||||
grad_input = grad_output * (1 - result * result)
|
||||
return grad_input, None
|
||||
def backward(self, grad_output):
|
||||
result, = self.saved_tensors
|
||||
return grad_output * (1 - result * result)
|
||||
|
||||
|
||||
class Sigmoid(InplaceFunction):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, i, inplace=False):
|
||||
if inplace:
|
||||
ctx.mark_dirty(i)
|
||||
def forward(self, i):
|
||||
if self.inplace:
|
||||
self.mark_dirty(i)
|
||||
result = i.sigmoid_()
|
||||
else:
|
||||
result = i.sigmoid()
|
||||
ctx.save_for_backward(result)
|
||||
self.save_for_backward(result)
|
||||
return result
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
result, = ctx.saved_variables
|
||||
if grad_output.volatile:
|
||||
grad_input = Variable(grad_output.data.new(grad_output.size()), volatile=True)
|
||||
backend = type2backend[type(result.data)]
|
||||
backend.Sigmoid_updateGradInput(backend.library_state, None, grad_output.data,
|
||||
grad_input.data, result.data)
|
||||
else:
|
||||
grad_input = grad_output * ((1 - result) * result)
|
||||
return grad_input, None
|
||||
def backward(self, grad_output):
|
||||
result, = self.saved_tensors
|
||||
return grad_output * ((1 - result) * result)
|
||||
|
||||
|
||||
class Sinh(Function):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, i):
|
||||
ctx.save_for_backward(i)
|
||||
def forward(self, i):
|
||||
self.save_for_backward(i)
|
||||
return i.sinh()
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
i, = ctx.saved_variables
|
||||
def backward(self, grad_output):
|
||||
i, = self.saved_tensors
|
||||
return grad_output * i.cosh()
|
||||
|
||||
|
||||
class Cosh(Function):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, i):
|
||||
ctx.save_for_backward(i)
|
||||
def forward(self, i):
|
||||
self.save_for_backward(i)
|
||||
return i.cosh()
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
i, = ctx.saved_variables
|
||||
def backward(self, grad_output):
|
||||
i, = self.saved_tensors
|
||||
return grad_output * i.sinh()
|
||||
|
||||
|
||||
class Abs(Function):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, i):
|
||||
ctx.save_for_backward(i)
|
||||
def forward(self, i):
|
||||
self.save_for_backward(i)
|
||||
return i.abs()
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
i, = ctx.saved_variables
|
||||
def backward(self, grad_output):
|
||||
i, = self.saved_tensors
|
||||
return grad_output * i.sign()
|
||||
|
||||
|
||||
class Clamp(Function):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, i, min_val, max_val):
|
||||
ctx._mask = (i.ge(min_val) * i.le(max_val))
|
||||
return i.clamp(min_val, max_val)
|
||||
def __init__(self, min_val, max_val):
|
||||
super(Clamp, self).__init__()
|
||||
self.min_val = min_val
|
||||
self.max_val = max_val
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
mask = Variable(ctx._mask.type_as(grad_output.data))
|
||||
return grad_output * mask, None, None
|
||||
def forward(self, i):
|
||||
self.save_for_backward(i)
|
||||
return i.clamp(self.min_val, self.max_val)
|
||||
|
||||
def backward(self, grad_output):
|
||||
i, = self.saved_tensors
|
||||
mask = i.ge(self.min_val) * i.le(self.max_val)
|
||||
return grad_output * mask.type_as(grad_output)
|
||||
|
||||
|
||||
class Sqrt(Function):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, i):
|
||||
ctx.save_for_backward(i)
|
||||
def forward(self, i):
|
||||
self.save_for_backward(i)
|
||||
return i.sqrt()
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
i, = ctx.saved_variables
|
||||
return grad_output.mul(i.pow(-0.5)).div_(2)
|
||||
def backward(self, grad_output):
|
||||
i, = self.saved_tensors
|
||||
return grad_output.mul(i.pow(-0.5)).div(2)
|
||||
|
||||
|
||||
class Sin(Function):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, i):
|
||||
ctx.save_for_backward(i)
|
||||
def forward(self, i):
|
||||
self.save_for_backward(i)
|
||||
return i.sin()
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
i, = ctx.saved_variables
|
||||
def backward(self, grad_output):
|
||||
i, = self.saved_tensors
|
||||
return grad_output * i.cos()
|
||||
|
||||
|
||||
class Cos(Function):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, i):
|
||||
ctx.save_for_backward(i)
|
||||
def forward(self, i):
|
||||
self.save_for_backward(i)
|
||||
return i.cos()
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
i, = ctx.saved_variables
|
||||
def backward(self, grad_output):
|
||||
i, = self.saved_tensors
|
||||
return grad_output.mul(i.sin()).neg_()
|
||||
|
||||
|
||||
class Tan(Function):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, i):
|
||||
ctx.save_for_backward(i)
|
||||
def forward(self, i):
|
||||
self.save_for_backward(i)
|
||||
return i.tan()
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
i, = ctx.saved_variables
|
||||
def backward(self, grad_output):
|
||||
i, = self.saved_tensors
|
||||
return grad_output.div(i.cos().pow(2))
|
||||
|
||||
|
||||
class Asin(Function):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, i):
|
||||
ctx.save_for_backward(i)
|
||||
def forward(self, i):
|
||||
self.save_for_backward(i)
|
||||
return i.asin()
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
i, = ctx.saved_variables
|
||||
return grad_output * (1 - i.mul(i)).sqrt().reciprocal()
|
||||
def backward(self, grad_output):
|
||||
i, = self.saved_tensors
|
||||
return grad_output * (1 - i.mul(i)).sqrt_().reciprocal_()
|
||||
|
||||
|
||||
class Acos(Function):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, i):
|
||||
ctx.save_for_backward(i)
|
||||
def forward(self, i):
|
||||
self.save_for_backward(i)
|
||||
return i.acos()
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
i, = ctx.saved_variables
|
||||
return grad_output.mul((1 - i.mul(i)).sqrt().reciprocal()).neg_()
|
||||
def backward(self, grad_output):
|
||||
i, = self.saved_tensors
|
||||
return grad_output.mul((1 - i.mul(i)).sqrt_().reciprocal_()).neg_()
|
||||
|
||||
|
||||
class Atan(Function):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, i):
|
||||
ctx.save_for_backward(i)
|
||||
def forward(self, i):
|
||||
self.save_for_backward(i)
|
||||
return i.atan()
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
i, = ctx.saved_variables
|
||||
return grad_output * i.mul(i).add_(1).reciprocal()
|
||||
def backward(self, grad_output):
|
||||
i, = self.saved_tensors
|
||||
return grad_output * i.mul(i).add_(1).reciprocal_()
|
||||
|
||||
|
||||
class Atan2(Function):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, y, x):
|
||||
ctx.save_for_backward(y, x)
|
||||
return y.atan2(x)
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
y, x, = ctx.saved_variables
|
||||
denominator = y.mul(y).add(x.mul(x)).reciprocal()
|
||||
return grad_output * x.mul(denominator), grad_output * y.neg().mul(denominator)
|
||||
|
||||
|
||||
# TODO: make inplace and update grad formulas
|
||||
class Reciprocal(Function):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, i):
|
||||
def forward(self, i):
|
||||
result = i.reciprocal()
|
||||
ctx.save_for_backward(result)
|
||||
self.save_for_backward(result)
|
||||
return result
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
result, = ctx.saved_variables
|
||||
def backward(self, grad_output):
|
||||
result, = self.saved_tensors
|
||||
return grad_output * result.mul(result).neg_()
|
||||
|
||||
|
||||
class Cmax(Function):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, a, b):
|
||||
ctx._a_size = a.size()
|
||||
ctx._b_size = b.size()
|
||||
ctx._mask = a.gt(b)
|
||||
def forward(self, a, b):
|
||||
self._max_buffer = a.gt(b).type_as(a)
|
||||
return a.max(b)
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
mask = Variable(ctx._mask.type_as(grad_output.data))
|
||||
def backward(self, grad_output):
|
||||
return (
|
||||
maybe_unexpand(grad_output * mask, ctx._a_size),
|
||||
maybe_unexpand_or_view(grad_output * Variable(ctx._mask.eq(0).type_as(grad_output.data)), ctx._b_size)
|
||||
grad_output * self._max_buffer,
|
||||
grad_output * self._max_buffer.eq(0).type_as(grad_output)
|
||||
)
|
||||
|
||||
|
||||
class CmaxConstant(Function):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, i, constant):
|
||||
ctx._mask = i.gt(constant)
|
||||
return i.clamp(min=constant)
|
||||
def __init__(self, constant):
|
||||
super(CmaxConstant, self).__init__()
|
||||
self.constant = constant
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
mask = Variable(ctx._mask.type_as(grad_output.data))
|
||||
return grad_output * mask, None
|
||||
def forward(self, i):
|
||||
self._max_buffer = i.gt(self.constant).type_as(i)
|
||||
return i.clamp(min=self.constant)
|
||||
|
||||
def backward(self, grad_output):
|
||||
return grad_output * self._max_buffer
|
||||
|
||||
|
||||
class Cmin(Function):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, a, b):
|
||||
ctx._a_size = a.size()
|
||||
ctx._b_size = b.size()
|
||||
ctx._mask = a.lt(b).type_as(a)
|
||||
def forward(self, a, b):
|
||||
self._min_buffer = a.lt(b).type_as(a)
|
||||
return a.min(b)
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
mask = Variable(ctx._mask.type_as(grad_output.data))
|
||||
def backward(self, grad_output):
|
||||
return (
|
||||
maybe_unexpand(grad_output * mask, ctx._a_size),
|
||||
maybe_unexpand_or_view(grad_output * Variable(ctx._mask.eq(0).type_as(grad_output.data)), ctx._b_size)
|
||||
grad_output * self._min_buffer,
|
||||
grad_output * self._min_buffer.eq(0).type_as(grad_output)
|
||||
)
|
||||
|
||||
|
||||
class CminConstant(Function):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, i, constant):
|
||||
ctx._mask = i.lt(constant)
|
||||
return i.clamp(max=constant)
|
||||
def __init__(self, constant):
|
||||
super(CminConstant, self).__init__()
|
||||
self.constant = constant
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
mask = Variable(ctx._mask.type_as(grad_output.data))
|
||||
return grad_output * mask, None
|
||||
def forward(self, i):
|
||||
self._min_buffer = i.lt(self.constant).type_as(i)
|
||||
return i.clamp(max=self.constant)
|
||||
|
||||
def backward(self, grad_output):
|
||||
return grad_output * self._min_buffer
|
||||
|
||||
|
||||
class _ConstantGrad(Function):
|
||||
grad_value = 0
|
||||
|
||||
@classmethod
|
||||
def forward(cls, ctx, *args):
|
||||
ctx._num_args = len(args)
|
||||
ctx._args0_size = args[0].size()
|
||||
return getattr(args[0], cls.__name__.lower())(*args[1:])
|
||||
def __init__(self, *args):
|
||||
super(_ConstantGrad, self).__init__()
|
||||
self.args = args
|
||||
|
||||
@classmethod
|
||||
def backward(cls, ctx, grad_output):
|
||||
return (maybe_unexpand(grad_output.mul(cls.grad_value), ctx._args0_size),) + (ctx._num_args - 1) * (None,)
|
||||
def forward(self, i):
|
||||
return getattr(i, type(self).__name__.lower())(*self.args)
|
||||
|
||||
def backward(self, grad_output):
|
||||
grad_input = grad_output.new(*repeat(1, grad_output.dim()))
|
||||
grad_input = grad_input.fill_(self.grad_value).expand_as(grad_output)
|
||||
return grad_input.mul(grad_output)
|
||||
|
||||
|
||||
class Floor(_ConstantGrad):
|
||||
@ -382,96 +313,91 @@ class Remainder(_ConstantGrad):
|
||||
|
||||
class Lerp(Function):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, a, b, weight):
|
||||
ctx._a_size = a.size()
|
||||
ctx._b_size = b.size()
|
||||
ctx._weight = float(weight)
|
||||
return a.lerp(b, ctx._weight)
|
||||
def __init__(self, weight):
|
||||
super(Lerp, self).__init__()
|
||||
self.weight = float(weight)
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
return (maybe_unexpand(grad_output.mul(1 - ctx._weight), ctx._a_size),
|
||||
maybe_unexpand_or_view(grad_output.mul(ctx._weight), ctx._b_size), None)
|
||||
def forward(self, a, b):
|
||||
return a.lerp(b, self.weight)
|
||||
|
||||
def backward(self, grad_output):
|
||||
return grad_output.mul(1 - self.weight), grad_output.mul(self.weight)
|
||||
|
||||
|
||||
class Rsqrt(InplaceFunction):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, i, inplace=False):
|
||||
if inplace:
|
||||
ctx.mark_dirty(i)
|
||||
result = i.rsqrt_()
|
||||
def forward(self, input):
|
||||
if self.inplace:
|
||||
self.mark_dirty(input)
|
||||
result = input.rsqrt_()
|
||||
else:
|
||||
result = i.rsqrt()
|
||||
ctx.save_for_backward(result)
|
||||
result = input.rsqrt()
|
||||
self.save_for_backward(result)
|
||||
return result
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
result, = ctx.saved_variables
|
||||
return result.pow(3).div_(-2).mul(grad_output), None
|
||||
def backward(self, grad_output):
|
||||
result, = self.saved_tensors
|
||||
return result.pow(3).div_(-2).mul_(grad_output)
|
||||
|
||||
|
||||
class Addcmul(InplaceFunction):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, add_tensor, mul_tensor1, mul_tensor2, scale=1.0, inplace=False):
|
||||
ctx._scale = scale
|
||||
ctx._add_tensor_size = add_tensor.size()
|
||||
ctx.save_for_backward(mul_tensor1, mul_tensor2)
|
||||
if inplace:
|
||||
ctx.mark_dirty(add_tensor)
|
||||
return add_tensor.addcmul_(scale, mul_tensor1, mul_tensor2)
|
||||
def __init__(self, scale=1, inplace=False):
|
||||
super(Addcmul, self).__init__(inplace)
|
||||
self.scale = scale
|
||||
|
||||
def forward(self, add_tensor, mul_tensor1, mul_tensor2):
|
||||
self.save_for_backward(mul_tensor1, mul_tensor2)
|
||||
if self.inplace:
|
||||
return add_tensor.addcmul_(self.scale, mul_tensor1, mul_tensor2)
|
||||
else:
|
||||
return add_tensor.addcmul(scale, mul_tensor1, mul_tensor2)
|
||||
return add_tensor.addcmul(self.scale, mul_tensor1, mul_tensor2)
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
def backward(self, grad_output):
|
||||
grad_add = grad_mul1 = grad_mul2 = None
|
||||
mul_tensor1, mul_tensor2 = ctx.saved_variables
|
||||
mul_tensor1, mul_tensor2 = self.saved_tensors
|
||||
|
||||
if ctx.needs_input_grad[0]:
|
||||
grad_add = maybe_unexpand(grad_output, ctx._add_tensor_size)
|
||||
if self.needs_input_grad[0]:
|
||||
grad_add = grad_output
|
||||
|
||||
if ctx.needs_input_grad[1]:
|
||||
grad_mul1 = maybe_unexpand_or_view(grad_output.mul(mul_tensor2).mul_(ctx._scale), mul_tensor1.size())
|
||||
if self.needs_input_grad[1]:
|
||||
grad_mul1 = grad_output.mul(mul_tensor2).mul(self.scale)
|
||||
|
||||
if ctx.needs_input_grad[2]:
|
||||
grad_mul2 = maybe_unexpand_or_view(grad_output.mul(mul_tensor1).mul_(ctx._scale), mul_tensor2.size())
|
||||
if self.needs_input_grad[2]:
|
||||
grad_mul2 = grad_output.mul(mul_tensor1).mul(self.scale)
|
||||
|
||||
return grad_add, grad_mul1, grad_mul2, None, None
|
||||
return grad_add, grad_mul1, grad_mul2
|
||||
|
||||
|
||||
class Addcdiv(InplaceFunction):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, add_tensor, div_tensor1, div_tensor2, scale=1.0, inplace=False):
|
||||
ctx._scale = scale
|
||||
ctx._add_tensor_size = add_tensor.size()
|
||||
ctx.save_for_backward(div_tensor1, div_tensor2)
|
||||
if inplace:
|
||||
ctx.mark_dirty(add_tensor)
|
||||
return add_tensor.addcdiv_(ctx._scale, div_tensor1, div_tensor2)
|
||||
def __init__(self, scale=1, inplace=False):
|
||||
super(Addcdiv, self).__init__(inplace)
|
||||
self.scale = scale
|
||||
|
||||
def forward(self, add_tensor, div_tensor1, div_tensor2):
|
||||
self.save_for_backward(div_tensor1, div_tensor2)
|
||||
if self.inplace:
|
||||
return add_tensor.addcdiv_(self.scale, div_tensor1, div_tensor2)
|
||||
else:
|
||||
return add_tensor.addcdiv(ctx._scale, div_tensor1, div_tensor2)
|
||||
return add_tensor.addcdiv(self.scale, div_tensor1, div_tensor2)
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
def backward(self, grad_output):
|
||||
grad_add = grad_div1 = grad_div2 = None
|
||||
div_tensor1, div_tensor2 = ctx.saved_variables
|
||||
div_tensor1, div_tensor2 = self.saved_tensors
|
||||
|
||||
if ctx.needs_input_grad[0]:
|
||||
grad_add = maybe_unexpand(grad_output, ctx._add_tensor_size)
|
||||
if self.needs_input_grad[0]:
|
||||
grad_add = grad_output
|
||||
|
||||
if ctx.needs_input_grad[1]:
|
||||
grad_div1 = maybe_unexpand_or_view(grad_output.div(div_tensor2).mul_(ctx._scale), div_tensor1.size())
|
||||
if self.needs_input_grad[1]:
|
||||
grad_div1 = grad_output.div(div_tensor2).mul(self.scale)
|
||||
|
||||
if ctx.needs_input_grad[2]:
|
||||
if self.needs_input_grad[2]:
|
||||
div_tensor2_sq = div_tensor2.mul(div_tensor2)
|
||||
grad_div2 = maybe_unexpand_or_view(grad_output.mul(div_tensor1).div(div_tensor2_sq).mul(-ctx._scale),
|
||||
div_tensor2.size())
|
||||
grad_div2 = grad_output.mul(div_tensor1).div_(div_tensor2_sq)
|
||||
grad_div2.neg_().mul_(self.scale)
|
||||
|
||||
return grad_add, grad_div1, grad_div2
|
||||
|
||||
return grad_add, grad_div1, grad_div2, None, None
|
||||
|
||||
# TODO: atan2 + inplace
|
||||
|
||||
@ -1,141 +1,110 @@
|
||||
from functools import reduce
|
||||
|
||||
from ..function import Function
|
||||
from ..variable import Variable
|
||||
import torch
|
||||
|
||||
|
||||
class Sum(Function):
|
||||
class _DimReduceFunction(Function):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, input, dim=None, keepdim=None):
|
||||
ctx.dim = dim
|
||||
ctx.keepdim = False if keepdim is None else keepdim
|
||||
ctx.input_size = input.size()
|
||||
if dim is None:
|
||||
return input.new((input.sum(),))
|
||||
def __init__(self, dim=None):
|
||||
super(_DimReduceFunction, self).__init__()
|
||||
self.dim = dim
|
||||
|
||||
def forward(self, input):
|
||||
self.input_size = input.size()
|
||||
fn = getattr(input, self.fn_name)
|
||||
if self.dim is None:
|
||||
return input.new((fn(),))
|
||||
else:
|
||||
if keepdim is not None:
|
||||
return input.sum(dim, keepdim=keepdim)
|
||||
else:
|
||||
return input.sum(dim)
|
||||
return fn(self.dim)
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
if ctx.dim is None:
|
||||
return grad_output.expand(ctx.input_size), None, None
|
||||
|
||||
class Sum(_DimReduceFunction):
|
||||
fn_name = 'sum'
|
||||
|
||||
def backward(self, grad_output):
|
||||
if self.dim is None:
|
||||
return grad_output.new(self.input_size).fill_(grad_output[0])
|
||||
else:
|
||||
if ctx.keepdim is False and len(ctx.input_size) != 1:
|
||||
grad_output = grad_output.unsqueeze(ctx.dim)
|
||||
|
||||
repeats = [1 for _ in ctx.input_size]
|
||||
repeats[ctx.dim] = ctx.input_size[ctx.dim]
|
||||
return grad_output.repeat(*repeats), None, None
|
||||
repeats = [1 for _ in self.input_size]
|
||||
repeats[self.dim] = self.input_size[self.dim]
|
||||
return grad_output.repeat(*repeats),
|
||||
|
||||
|
||||
class Prod(Function):
|
||||
class Prod(_DimReduceFunction):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, input, dim=None, keepdim=None):
|
||||
ctx.dim = dim
|
||||
ctx.keepdim = False if keepdim is None else keepdim
|
||||
ctx.input_size = input.size()
|
||||
if dim is None:
|
||||
ctx.result = input.prod()
|
||||
ctx.save_for_backward(input)
|
||||
return input.new((ctx.result,))
|
||||
def forward(self, input):
|
||||
self.input_size = input.size()
|
||||
if self.dim is None:
|
||||
self.result = input.prod()
|
||||
self.save_for_backward(input)
|
||||
return input.new((self.result,))
|
||||
else:
|
||||
if keepdim is not None:
|
||||
output = input.prod(dim, keepdim=keepdim)
|
||||
else:
|
||||
output = input.prod(dim)
|
||||
ctx.save_for_backward(input, output)
|
||||
output = input.prod(self.dim)
|
||||
self.save_for_backward(input, output)
|
||||
return output
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
def safe_zeros_backward(inp, dim):
|
||||
# note that the gradient is equivalent to:
|
||||
# cumprod(exclusive, normal) * cumprod(exclusive, reverse), e.g.:
|
||||
# input: [ a, b, c]
|
||||
# cumprod(exclusive, normal): [1 , a, a * b]
|
||||
# cumprod(exclusive, reverse): [b * c, c, 1]
|
||||
# product: [b * c, a * c, a * b]
|
||||
# and this is safe under input with 0s.
|
||||
if inp.size(dim) == 1:
|
||||
return grad_output
|
||||
def backward(self, grad_output):
|
||||
if self.dim is None:
|
||||
input, = self.saved_tensors
|
||||
zero_idx = (input == 0).nonzero()
|
||||
if zero_idx.dim() == 0:
|
||||
return grad_output.mul(self.result).expand_as(input).div(input)
|
||||
elif zero_idx.size(0) > 1:
|
||||
return grad_output.new(self.input_size).zero_()
|
||||
else:
|
||||
grad_input = grad_output.new(self.input_size).zero_()
|
||||
zero_idx = tuple(zero_idx[0].cpu())
|
||||
input_copy = input.clone()
|
||||
input_copy[zero_idx] = 1.
|
||||
grad_input[zero_idx] = grad_output[0] * input_copy.prod()
|
||||
return grad_input
|
||||
else:
|
||||
input, output = self.saved_tensors
|
||||
dim = self.dim if self.dim >= 0 else self.dim + input.dim()
|
||||
zero_mask = input == 0
|
||||
slice_zero_count = zero_mask.sum(dim)
|
||||
total_zeros = slice_zero_count.sum()
|
||||
grad_input = grad_output.mul(output).expand_as(input).div(input)
|
||||
if total_zeros == 0:
|
||||
return grad_input
|
||||
|
||||
ones_size = torch.Size((inp.size()[:dim] + (1,) + inp.size()[dim + 1:]))
|
||||
ones = Variable(grad_output.data.new(ones_size).fill_(1))
|
||||
exclusive_normal_nocp = torch.cat((ones, inp.narrow(dim, 0, inp.size(dim) - 1)), dim)
|
||||
exclusive_normal = exclusive_normal_nocp.cumprod(dim)
|
||||
some_zeros = slice_zero_count.gt(0).expand_as(grad_input)
|
||||
grad_input[some_zeros] = 0
|
||||
|
||||
def reverse_dim(var, dim):
|
||||
return var.index_select(dim, Variable(torch.arange(var.size(dim) - 1, -1, -1)).long())
|
||||
single_zero_idx = slice_zero_count.eq(1).nonzero()
|
||||
|
||||
narrow_reverse = reverse_dim(inp.narrow(dim, 1, inp.size(dim) - 1), dim)
|
||||
exclusive_reverse_nocp = torch.cat((ones, narrow_reverse), dim)
|
||||
exclusive_reverse = reverse_dim(exclusive_reverse_nocp.cumprod(dim), dim)
|
||||
if len(single_zero_idx) == 0:
|
||||
return grad_input
|
||||
|
||||
for idx in single_zero_idx:
|
||||
idx_tuple = tuple(idx.cpu())
|
||||
input_idx_tuple = idx_tuple[:dim] + (slice(0, None),) + idx_tuple[dim + 1:]
|
||||
|
||||
# slice_mask and input_copy are 1D
|
||||
slice_mask = zero_mask[input_idx_tuple]
|
||||
input_copy = input[input_idx_tuple].clone()
|
||||
zero_idx = slice_mask.nonzero()[0, 0]
|
||||
input_copy[zero_idx] = 1.
|
||||
|
||||
grad_idx_tuple = idx_tuple[:dim] + (zero_idx,) + idx_tuple[dim + 1:]
|
||||
grad_input[grad_idx_tuple] = grad_output[idx_tuple] * input_copy.prod()
|
||||
|
||||
grad_input = grad_output.expand_as(exclusive_normal).mul(exclusive_normal.mul(exclusive_reverse))
|
||||
return grad_input
|
||||
|
||||
if ctx.dim is None:
|
||||
input, = ctx.saved_variables
|
||||
zero_idx = (input.data == 0).nonzero()
|
||||
if zero_idx.dim() == 0:
|
||||
return grad_output.mul(ctx.result).expand_as(input).div(input), None, None
|
||||
elif zero_idx.size(0) > 1:
|
||||
return (grad_output * 0).expand_as(input), None, None
|
||||
else:
|
||||
return safe_zeros_backward(input.contiguous().view(-1), 0).view_as(input), None, None
|
||||
|
||||
class Mean(_DimReduceFunction):
|
||||
fn_name = 'mean'
|
||||
|
||||
def backward(self, grad_output):
|
||||
if self.dim is None:
|
||||
grad_input_val = grad_output[0]
|
||||
grad_input_val /= reduce(lambda x, y: x * y, self.input_size, 1)
|
||||
return grad_output.new(*self.input_size).fill_(grad_input_val)
|
||||
else:
|
||||
input, output = ctx.saved_variables
|
||||
dim = ctx.dim if ctx.dim >= 0 else ctx.dim + input.dim()
|
||||
if ctx.keepdim is False and len(ctx.input_size) != 1:
|
||||
grad_output = grad_output.unsqueeze(dim)
|
||||
output = output.unsqueeze(dim)
|
||||
|
||||
zero_mask = input == 0
|
||||
slice_zero_count = zero_mask.sum(dim, True)
|
||||
total_zeros = slice_zero_count.data.sum()
|
||||
if total_zeros == 0:
|
||||
grad_input = grad_output.mul(output).expand_as(input).div(input)
|
||||
else:
|
||||
grad_input = safe_zeros_backward(input, dim)
|
||||
|
||||
return grad_input, None, None
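The `safe_zeros_backward` comment above describes the gradient of `prod()` as the elementwise product of two exclusive cumulative products. A small numeric sketch of that identity on plain tensors (the values are made up; only operations already used in this file appear):

```python
import torch

# input [a, b, c] = [2, 3, 4]; grad of prod() w.r.t. each element should be
# [b*c, a*c, a*b] = [12, 8, 6]
inp = torch.Tensor([2, 3, 4])
ones = torch.ones(1)

# cumprod(exclusive, normal): [1, a, a*b] = [1, 2, 6]
exclusive_normal = torch.cat((ones, inp.narrow(0, 0, 2)), 0).cumprod(0)

# cumprod(exclusive, reverse): [b*c, c, 1] = [12, 4, 1]
narrow_reverse = inp.narrow(0, 1, 2).index_select(0, torch.LongTensor([1, 0]))
exclusive_reverse = torch.cat((ones, narrow_reverse), 0).cumprod(0)
exclusive_reverse = exclusive_reverse.index_select(0, torch.LongTensor([2, 1, 0]))

print(exclusive_normal * exclusive_reverse)  # 12, 8, 6
```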
|
||||
|
||||
|
||||
class Mean(Function):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, input, dim=None, keepdim=None):
|
||||
ctx.dim = dim
|
||||
ctx.keepdim = False if keepdim is None else keepdim
|
||||
ctx.input_size = input.size()
|
||||
if dim is None:
|
||||
return input.new((input.mean(),))
|
||||
else:
|
||||
if keepdim is not None:
|
||||
return input.mean(dim, keepdim=keepdim)
|
||||
else:
|
||||
return input.mean(dim)
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
if ctx.dim is None:
|
||||
grad_input_val = grad_output / reduce(lambda x, y: x * y, ctx.input_size, 1)
|
||||
return grad_input_val.expand(ctx.input_size), None, None
|
||||
else:
|
||||
if ctx.keepdim is False and len(ctx.input_size) != 1:
|
||||
grad_output = grad_output.unsqueeze(ctx.dim)
|
||||
|
||||
repeats = [1 for _ in ctx.input_size]
|
||||
dim_size = ctx.input_size[ctx.dim]
|
||||
repeats[ctx.dim] = dim_size
|
||||
return grad_output.repeat(*repeats).div_(dim_size), None, None
|
||||
repeats = [1 for _ in self.input_size]
|
||||
dim_size = self.input_size[self.dim]
|
||||
repeats[self.dim] = dim_size
|
||||
return grad_output.repeat(*repeats).div_(dim_size)
|
||||
|
||||
|
||||
class _SelectionFunction(Function):
|
||||
@ -143,53 +112,44 @@ class _SelectionFunction(Function):
|
||||
# additional_args is prepended before dim when calling the tensor
|
||||
# function. It's a no-op for subclasses other than kthvalue.
|
||||
# kthvalue not only requires us to pass a dim, but also preceed it with k.
|
||||
additional_args = tuple()
|
||||
|
||||
@classmethod
|
||||
def forward(cls, ctx, input, dim=None, keepdim=None, additional_args=tuple()):
|
||||
fn = getattr(input, cls.__name__.lower())
|
||||
ctx.dim = dim
|
||||
ctx.keepdim = False if keepdim is None else keepdim
|
||||
ctx.additional_args = additional_args
|
||||
ctx.input_size = input.size()
|
||||
if ctx.dim is None and cls.has_all_reduce:
|
||||
value = fn(*additional_args)
|
||||
ctx.indices_tuple = tuple(input.eq(value).nonzero()[0])
|
||||
def __init__(self, dim=None):
|
||||
super(_SelectionFunction, self).__init__()
|
||||
self.dim = dim
|
||||
|
||||
def forward(self, input):
|
||||
fn = getattr(input, type(self).__name__.lower())
|
||||
self.input_size = input.size()
|
||||
if self.dim is None and self.has_all_reduce:
|
||||
value = fn(*self.additional_args)
|
||||
self.indices = tuple(input.eq(value).nonzero()[0])
|
||||
return input.new((value,))
|
||||
else:
|
||||
if ctx.dim is None:
|
||||
if self.dim is None:
|
||||
dim = input.dim() - 1
|
||||
else:
|
||||
dim = ctx.dim
|
||||
dim = self.dim
|
||||
args = (dim,)
|
||||
if additional_args:
|
||||
args = additional_args + args
|
||||
if keepdim is not None:
|
||||
output, indices = fn(*args, keepdim=keepdim)
|
||||
else:
|
||||
output, indices = fn(*args)
|
||||
ctx.save_for_backward(indices)
|
||||
ctx.mark_non_differentiable(indices)
|
||||
if self.additional_args:
|
||||
args = self.additional_args + args
|
||||
output, indices = fn(*args)
|
||||
self.save_for_backward(indices)
|
||||
self.mark_non_differentiable(indices)
|
||||
return output, indices
|
||||
|
||||
@classmethod
|
||||
def backward(cls, ctx, grad_output, grad_indices=None):
|
||||
grad_input = Variable(grad_output.data.new(*ctx.input_size).zero_())
|
||||
if ctx.dim is None and cls.has_all_reduce:
|
||||
grad_input[ctx.indices_tuple] = grad_output
|
||||
def backward(self, grad_output, grad_indices=None):
|
||||
grad_input = grad_output.new(*self.input_size).zero_()
|
||||
if self.dim is None and self.has_all_reduce:
|
||||
grad_input[self.indices] = grad_output[0]
|
||||
else:
|
||||
if ctx.dim is None:
|
||||
dim = len(ctx.input_size) - 1
|
||||
if self.dim is None:
|
||||
dim = input.dim() - 1
|
||||
else:
|
||||
dim = ctx.dim
|
||||
|
||||
indices, = ctx.saved_variables
|
||||
if ctx.keepdim is False and len(ctx.input_size) != 1:
|
||||
grad_output = grad_output.unsqueeze(dim)
|
||||
grad_indices = grad_indices.unsqueeze(dim)
|
||||
indices = indices.unsqueeze(dim)
|
||||
|
||||
dim = self.dim
|
||||
indices, = self.saved_tensors
|
||||
grad_input.scatter_(dim, indices, grad_output)
|
||||
return grad_input, None, None, None
|
||||
return grad_input
|
||||
|
||||
|
||||
class Max(_SelectionFunction):
|
||||
@ -205,63 +165,53 @@ class Mode(_SelectionFunction):
|
||||
|
||||
|
||||
class Median(_SelectionFunction):
|
||||
pass
|
||||
has_all_reduce = False
|
||||
|
||||
|
||||
class Kthvalue(_SelectionFunction):
|
||||
has_all_reduce = False
|
||||
|
||||
@classmethod
|
||||
def forward(cls, ctx, input, k, dim=None, keepdim=None):
|
||||
return super(Kthvalue, cls).forward(ctx, input, dim, keepdim, (k,))
|
||||
def __init__(self, k, dim=None):
|
||||
super(Kthvalue, self).__init__(dim)
|
||||
self.additional_args = (k,)
|
||||
|
||||
|
||||
class Norm(Function):
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, input, p=2, dim=None, keepdim=None):
|
||||
ctx.p = p
|
||||
ctx.dim = dim
|
||||
ctx.keepdim = False if keepdim is None else keepdim
|
||||
def __init__(self, norm_type=2, dim=None):
|
||||
super(Norm, self).__init__()
|
||||
self.norm_type = norm_type
|
||||
self.dim = dim
|
||||
|
||||
if dim is None:
|
||||
ctx.norm = input.norm(p)
|
||||
ctx.save_for_backward(input)
|
||||
return input.new((ctx.norm,))
|
||||
def forward(self, input):
|
||||
if self.dim is None:
|
||||
self.norm = input.norm(self.norm_type)
|
||||
self.save_for_backward(input)
|
||||
return input.new((self.norm,))
|
||||
else:
|
||||
if keepdim is not None:
|
||||
output = input.norm(p, dim, keepdim=keepdim)
|
||||
else:
|
||||
output = input.norm(p, dim)
|
||||
ctx.save_for_backward(input, output)
|
||||
output = input.norm(self.norm_type, self.dim)
|
||||
self.save_for_backward(input, output)
|
||||
return output
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, grad_output):
|
||||
if ctx.dim is None:
|
||||
input, = ctx.saved_variables
|
||||
if ctx.p == 2:
|
||||
scale_v = (grad_output / ctx.norm).expand_as(input)
|
||||
return input.mul(scale_v), None, None, None
|
||||
def backward(self, grad_output):
|
||||
if self.dim is None:
|
||||
input, = self.saved_tensors
|
||||
if self.norm_type == 2:
|
||||
return input.mul(grad_output[0] / self.norm)
|
||||
else:
|
||||
pow = input.abs().pow(ctx.p - 2)
|
||||
scale_v = (grad_output / ctx.norm ** (ctx.p - 1)).expand_as(input)
|
||||
return input.mul(pow).mul(scale_v), None, None, None
|
||||
pow = input.abs().pow(self.norm_type - 2)
|
||||
scale = grad_output[0] / self.norm ** (self.norm_type - 1)
|
||||
return input.mul(pow).mul(scale)
|
||||
else:
|
||||
input, output = ctx.saved_variables
|
||||
|
||||
if ctx.keepdim is False and input.dim() != 1:
|
||||
grad_output = grad_output.unsqueeze(ctx.dim)
|
||||
output = output.unsqueeze(ctx.dim)
|
||||
|
||||
input, output = self.saved_tensors
|
||||
big_grad_output = grad_output.expand_as(input)
|
||||
if ctx.p == 2:
|
||||
if self.norm_type == 2:
|
||||
big_output = output.expand_as(input)
|
||||
return input.mul(big_grad_output).div(big_output), None, None, None
|
||||
return input.mul(big_grad_output).div(big_output)
|
||||
else:
|
||||
pow = input.abs().pow(ctx.p - 2)
|
||||
big_output = output.pow(ctx.p - 1).expand_as(input)
|
||||
return input.mul(pow).mul(big_grad_output).div(big_output), None, None, None
|
||||
pow = input.abs().pow(self.norm_type - 2)
|
||||
big_output = output.pow(self.norm_type - 1).expand_as(input)
|
||||
return input.mul(pow).mul(big_grad_output).div(big_output)
|
||||
|
||||
|
||||
# TODO: renorm
|
||||
|
||||
@ -1,3 +0,0 @@
|
||||
%s/self/ctx/g
|
||||
%s/\s\+def forward/ @staticmethod\r def forward/g
|
||||
%s/\s\+def backward/ @staticmethod\r @once_differentiable\r def backward/g
|
||||
@ -23,9 +23,8 @@ class Multinomial(StochasticFunction):
|
||||
if probs.dim() == 1:
|
||||
probs = probs.unsqueeze(0)
|
||||
samples = samples.unsqueeze(0)
|
||||
reward = reward.unsqueeze(0)
|
||||
# normalize probs (multinomial accepts weights)
|
||||
probs /= probs.sum(1, True).expand_as(probs)
|
||||
probs /= probs.sum(1).expand_as(probs)
|
||||
grad_probs = probs.new().resize_as_(probs).zero_()
|
||||
output_probs = probs.gather(1, samples)
|
||||
output_probs.add_(1e-6).reciprocal_()
|
||||
|
||||
File diff suppressed because it is too large
@ -1,39 +0,0 @@
|
||||
import torch
|
||||
|
||||
|
||||
def maybe_view(variable, size):
|
||||
if variable.size() == size:
|
||||
return variable
|
||||
return variable.contiguous().view(size)
|
||||
|
||||
|
||||
def maybe_unexpand(variable, old_size):
|
||||
num_unsqueezed = variable.dim() - len(old_size)
|
||||
expanded_dims = [dim for dim, (expanded, original)
|
||||
in enumerate(zip(variable.size()[num_unsqueezed:], old_size))
|
||||
if expanded != original]
|
||||
|
||||
for _ in range(num_unsqueezed):
|
||||
variable = variable.sum(0, keepdim=False)
|
||||
for dim in expanded_dims:
|
||||
variable = variable.sum(dim, keepdim=True)
|
||||
return variable
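A quick sketch of what `maybe_unexpand` does with a gradient that was computed against a broadcast operand; shapes here are hypothetical and the helper is assumed to be in scope:

```python
import torch

grad = torch.ones(2, 3)        # gradient w.r.t. an operand that was expanded to (2, 3)
old_size = torch.Size([3])     # the operand's original size before broadcasting

reduced = maybe_unexpand(grad, old_size)
print(reduced.size())          # torch.Size([3])
print(reduced)                 # every entry is 2.0: the broadcast dimension was summed out
```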
|
||||
|
||||
|
||||
def variable_expandable(variable, old_size):
|
||||
try:
|
||||
torch._C._infer_size(variable.size(), old_size)
|
||||
except RuntimeError:
|
||||
return False
|
||||
return True
|
||||
|
||||
|
||||
def maybe_unexpand_or_view(variable, old_size):
|
||||
var_expanded = True
|
||||
if maybe_view:
|
||||
var_expanded = variable_expandable(variable, old_size)
|
||||
|
||||
if var_expanded:
|
||||
return maybe_unexpand(variable, old_size)
|
||||
else:
|
||||
return maybe_view(variable, old_size)
|
||||
torch/autograd/engine.py (new file, 85 lines)
@ -0,0 +1,85 @@
|
||||
from collections import deque, defaultdict
|
||||
from torch._C import _ImperativeEngine as ImperativeEngine
|
||||
from .variable import Variable
|
||||
|
||||
|
||||
class BasicEngine(object):
|
||||
|
||||
def _compute_dependencies(self, function):
|
||||
dependencies = defaultdict(int)
|
||||
seen = {function}
|
||||
queue = [function]
|
||||
while len(queue) > 0:
|
||||
fn = queue.pop()
|
||||
for prev_fn, output_nr in fn.previous_functions:
|
||||
if not prev_fn.requires_grad or isinstance(prev_fn, Variable):
|
||||
continue
|
||||
dependencies[prev_fn] += 1
|
||||
if prev_fn not in seen:
|
||||
queue.append(prev_fn)
|
||||
seen.add(prev_fn)
|
||||
return dependencies
|
||||
|
||||
def _free_backward_dependency(self, dependencies, prev_fn):
|
||||
dependencies[prev_fn] -= 1
|
||||
if dependencies[prev_fn] == 0:
|
||||
del dependencies[prev_fn]
|
||||
return True
|
||||
return False
|
||||
|
||||
def _add_grad(self, need_copy, prev_grad, output_nr, d_prev_fn):
|
||||
copy_id = (id(prev_grad), output_nr)
|
||||
if not prev_grad[output_nr]:
|
||||
prev_grad[output_nr] = d_prev_fn
|
||||
need_copy.add(copy_id)
|
||||
else:
|
||||
grad_tensor = prev_grad[output_nr]
|
||||
if copy_id in need_copy:
|
||||
need_copy.remove(copy_id)
|
||||
grad_tensor = grad_tensor.clone()
|
||||
prev_grad[output_nr] = grad_tensor
|
||||
grad_tensor.add_(d_prev_fn)
|
||||
|
||||
def run_backward(self, variable, grad, retain_variables):
|
||||
if variable.creator is None:
|
||||
variable._do_backward((grad,), retain_variables)
|
||||
return
|
||||
|
||||
initial_grad = [None for _ in range(variable.creator.num_outputs)]
|
||||
initial_grad[variable.output_nr] = grad
|
||||
ready = deque([(variable.creator, initial_grad)])
|
||||
not_ready = {}
|
||||
need_copy = set()
|
||||
|
||||
dependencies = self._compute_dependencies(variable.creator)
|
||||
|
||||
while len(ready) > 0:
|
||||
fn, grad = ready.pop()
|
||||
grad_input = fn._do_backward(tuple(grad), retain_variables)
|
||||
for (prev_fn, output_nr), d_prev_fn in zip(fn.previous_functions, grad_input):
|
||||
if not prev_fn.requires_grad:
|
||||
# TODO: check that d_prev_fn is None and warn otherwise
|
||||
continue
|
||||
if isinstance(prev_fn, Variable):
|
||||
prev_fn._do_backward((d_prev_fn,), retain_variables)
|
||||
continue
|
||||
is_ready = self._free_backward_dependency(dependencies, prev_fn)
|
||||
if is_ready:
|
||||
if prev_fn in not_ready:
|
||||
prev_grad = not_ready[prev_fn]
|
||||
self._add_grad(need_copy, prev_grad, output_nr, d_prev_fn)
|
||||
else:
|
||||
if prev_fn.num_outputs != 1:
|
||||
raise RuntimeError("one of the function outputs "
|
||||
"wasn't used - this is an error not, but "
|
||||
"it's going to be fixed soon")
|
||||
prev_grad = (d_prev_fn,)
|
||||
ready.appendleft((prev_fn, prev_grad))
|
||||
else:
|
||||
if prev_fn in not_ready:
|
||||
prev_grad = not_ready[prev_fn]
|
||||
else:
|
||||
prev_grad = [None for _ in range(prev_fn.num_outputs)]
|
||||
|
||||
self._add_grad(need_copy, prev_grad, output_nr, d_prev_fn)
|
||||
not_ready[prev_fn] = prev_grad
|
||||
@ -1,12 +1,47 @@
|
||||
import torch
|
||||
import torch._C as _C
|
||||
import torch.utils.hooks as hooks
|
||||
from torch._six import with_metaclass
|
||||
import functools
|
||||
from collections import OrderedDict
|
||||
|
||||
|
||||
class _ContextMethodMixin(object):
|
||||
class Function(_C._FunctionBase):
|
||||
"""Records operation history and defines formulas for differentiating ops.
|
||||
|
||||
Every operation performed on :class:`Variable` s creates a new function
|
||||
object, that performs the computation, and records that it happened.
|
||||
The history is retained in the form of a DAG of functions, with edges
|
||||
denoting data dependencies (``input <- output``). Then, when backward is
|
||||
called, the graph is processed in the topological ordering, by calling
|
||||
:func:`backward` methods of each :class:`Function` object, and passing
|
||||
returned gradients on to next :class:`Function` s.
|
||||
|
||||
Normally, the only way users interact with functions is by creating
|
||||
subclasses and defining new operations. This is a recommended way of
|
||||
extending torch.autograd.
|
||||
|
||||
Since Function logic is a hotspot in most scripts, almost all of it
|
||||
was moved to our C backend, to ensure that the framework overhead is
|
||||
minimal.
|
||||
|
||||
Each function is meant to be used only once (in the forward pass).
|
||||
|
||||
Attributes:
|
||||
saved_tensors: Tuple of Tensors that were saved in the call to
|
||||
:func:`forward`.
|
||||
needs_input_grad: Tuple of booleans of length :attr:`num_inputs`,
|
||||
indicating whether a given input requires gradient. This can be
|
||||
used to optimize buffers saved for backward, and ignoring gradient
|
||||
computation in :func:`~Function.backward`.
|
||||
num_inputs: Number of inputs given to :func:`forward`.
|
||||
num_outputs: Number of tensors returned by :func:`forward`.
|
||||
requires_grad: Boolean indicating whether the :func:`backward` will
|
||||
ever need to be called.
|
||||
previous_functions: Tuple of (int, Function) pairs of length
|
||||
:attr:`num_inputs`. Each entry contains a reference to a
|
||||
:class:`Function` that created corresponding input, and an index
|
||||
of the previous function output that's been used.
|
||||
"""
|
||||
__call__ = _C._FunctionBase._do_forward
|
||||
|
||||
def save_for_backward(self, *tensors):
|
||||
"""Saves given tensors for a future call to :func:`~Function.backward`.
|
||||
@ -15,10 +50,9 @@ class _ContextMethodMixin(object):
|
||||
:func:`forward` **method.**
|
||||
|
||||
Later, saved tensors can be accessed through the :attr:`saved_tensors`
|
||||
attribute; or, if the corresponding Variable is needed (e.g. for double
|
||||
backwards), those can be accessed through the :attr:`saved_variables`
|
||||
attribute. Before returning them to the user, a check is made, to ensure
|
||||
they weren't used in any in-place operation that modified their content.
|
||||
attribute. Before returning them to the user, a check is made, to
|
||||
ensure they weren't used in any in-place operation that modified
|
||||
their content.
|
||||
|
||||
Arguments can also be ``None``.
|
||||
"""
|
||||
@ -31,7 +65,7 @@ class _ContextMethodMixin(object):
|
||||
:func:`forward` **method, and all arguments should be inputs.**
|
||||
|
||||
Every tensor that's been modified in-place in a call to :func:`forward`
|
||||
should be given to this function, to ensure correctness of our checks.
|
||||
should be given to this function, to ensure correcness of our checks.
|
||||
It doesn't matter whether the function is called before or after
|
||||
modification.
|
||||
"""
|
||||
@ -72,9 +106,6 @@ class _ContextMethodMixin(object):
|
||||
"""
|
||||
self.non_differentiable = args
|
||||
|
||||
|
||||
class _HookMixin(object):
|
||||
|
||||
@staticmethod
|
||||
def _register_hook(backward_hooks, hook):
|
||||
if backward_hooks is None:
|
||||
@ -83,84 +114,7 @@ class _HookMixin(object):
|
||||
backward_hooks[handle.id] = hook
|
||||
return backward_hooks, handle
|
||||
|
||||
|
||||
class BackwardCFunction(_C._FunctionBase, _ContextMethodMixin, _HookMixin):
|
||||
_is_legacy = False
|
||||
|
||||
def apply(self, *args):
|
||||
return self._forward_cls.backward(self, *args)
|
||||
|
||||
|
||||
class FunctionMeta(type):
|
||||
"""Function metaclass.
|
||||
|
||||
This metaclass sets up the following properties:
|
||||
_is_legacy: True if forward is not defined as a static method.
|
||||
_backward_cls: The Function class corresponding to the differentiated
|
||||
version of this function (which is generated on the fly by this
|
||||
metaclass).
|
||||
"""
|
||||
|
||||
def __init__(cls, name, bases, attrs):
|
||||
for super_cls in cls.mro():
|
||||
forward = super_cls.__dict__.get('forward')
|
||||
if forward is not None:
|
||||
has_static_forward = isinstance(forward, staticmethod) or isinstance(forward, classmethod)
|
||||
break
|
||||
|
||||
setattr(cls, '_is_legacy', not has_static_forward)
|
||||
|
||||
# old-style functions
|
||||
if not has_static_forward:
|
||||
return super(FunctionMeta, cls).__init__(name, bases, attrs)
|
||||
|
||||
backward_fn = type(name + 'Backward', (BackwardCFunction,), {'_forward_cls': cls})
|
||||
setattr(cls, '_backward_cls', backward_fn)
|
||||
|
||||
return super(FunctionMeta, cls).__init__(name, bases, attrs)
|
||||
|
||||
|
||||
class Function(with_metaclass(FunctionMeta, _C._FunctionBase, _ContextMethodMixin, _HookMixin)):
|
||||
"""Records operation history and defines formulas for differentiating ops.
|
||||
|
||||
Every operation performed on :class:`Variable` s creates a new function
|
||||
object, that performs the computation, and records that it happened.
|
||||
The history is retained in the form of a DAG of functions, with edges
|
||||
denoting data dependencies (``input <- output``). Then, when backward is
|
||||
called, the graph is processed in the topological ordering, by calling
|
||||
:func:`backward` methods of each :class:`Function` object, and passing
|
||||
returned gradients on to next :class:`Function` s.
|
||||
|
||||
Normally, the only way users interact with functions is by creating
|
||||
subclasses and defining new operations. This is a recommended way of
|
||||
extending torch.autograd.
|
||||
|
||||
Since Function logic is a hotspot in most scripts, almost all of it
|
||||
was moved to our C backend, to ensure that the framework overhead is
|
||||
minimal.
|
||||
|
||||
Each function is meant to be used only once (in the forward pass).
|
||||
|
||||
Attributes:
|
||||
saved_tensors: Tuple of Tensors that were saved in the call to
|
||||
:func:`forward`.
|
||||
saved_variables: Tuple of Variables that correspond to the tensors
|
||||
saved in the call to :func:`forward`.
|
||||
needs_input_grad: Tuple of booleans of length :attr:`num_inputs`,
|
||||
indicating whether a given input requires gradient. This can be
|
||||
used to optimize buffers saved for backward, and ignoring gradient
|
||||
computation in :func:`~Function.backward`.
|
||||
num_inputs: Number of inputs given to :func:`forward`.
|
||||
num_outputs: Number of tensors returned by :func:`forward`.
|
||||
requires_grad: Boolean indicating whether the :func:`backward` will
|
||||
ever need to be called.
|
||||
"""
|
||||
|
||||
# only for backward compatibility
|
||||
__call__ = _C._FunctionBase._do_forward
|
||||
|
||||
@staticmethod
|
||||
def forward(*args, **kwargs):
|
||||
def forward(self, *input):
|
||||
"""Performs the operation.
|
||||
|
||||
This function is to be overridden by all subclasses.
|
||||
@ -169,8 +123,7 @@ class Function(with_metaclass(FunctionMeta, _C._FunctionBase, _ContextMethodMixi
|
||||
"""
|
||||
raise NotImplementedError
|
||||
|
||||
@staticmethod
|
||||
def backward(*grad_outputs):
|
||||
def backward(self, *grad_output):
|
||||
"""Defines a formula for differentiating the operation.
|
||||
|
||||
This function is to be overridden by all subclasses.
|
||||
@ -184,41 +137,6 @@ class Function(with_metaclass(FunctionMeta, _C._FunctionBase, _ContextMethodMixi
|
||||
raise NotImplementedError
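The net effect of the hunks above is the move from the legacy interface (instance methods, `self.save_for_backward`) to static `forward(ctx, ...)`/`backward(ctx, ...)` methods invoked through `.apply`, exactly as the rewritten arithmetic functions earlier in this diff do. A minimal sketch of a custom op written against the new interface (the op itself is illustrative and not part of this change):

```python
import torch
from torch.autograd import Function, Variable


class Cube(Function):

    @staticmethod
    def forward(ctx, i):
        ctx.save_for_backward(i)
        return i.pow(3)

    @staticmethod
    def backward(ctx, grad_output):
        # saved_variables keeps the graph alive, so this backward is itself
        # differentiable (double backward works)
        i, = ctx.saved_variables
        return grad_output * 3 * i * i


x = Variable(torch.randn(5), requires_grad=True)
y = Cube.apply(x)
y.sum().backward()
```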
|
||||
|
||||
|
||||
def once_differentiable(fn):
|
||||
from .variable import Variable
|
||||
|
||||
@functools.wraps(fn)
|
||||
def wrapper(ctx, *args):
|
||||
tensor_args = [arg.data if isinstance(arg, Variable) else arg
|
||||
for arg in args]
|
||||
outputs = fn(ctx, *tensor_args)
|
||||
# XXX: this is only an approximation of these flags - there's no way
|
||||
# to figure out if fn didn't use ctx.saved_variables and as a result
|
||||
# some Variables might require grad, even if no args do.
|
||||
# Unfortunately, this leads to unexpected error messages ("no nodes
|
||||
# require computing gradients"), but I don't have a better idea.
|
||||
# These functions would raise an error in backward anyway.
|
||||
volatile = any(arg.volatile if isinstance(arg, Variable) else False
|
||||
for arg in args)
|
||||
requires_grad = any(arg.requires_grad if isinstance(arg, Variable) else False
|
||||
for arg in args)
|
||||
if volatile:
|
||||
def err_fn(*args):
|
||||
return args
|
||||
kwargs = {'volatile': True}
|
||||
else:
|
||||
err_fn = torch._C._functions.DelayedError(
|
||||
b"trying to differentiate twice a function that was marked"
|
||||
b"with @once_differentiable")
|
||||
kwargs = {'requires_grad': requires_grad}
|
||||
if not isinstance(outputs, tuple):
|
||||
var = Variable(outputs, **kwargs) if outputs is not None else None
|
||||
return err_fn(var)
|
||||
return err_fn(*[Variable(o, **kwargs) if o is not None else None
|
||||
for o in outputs])
|
||||
return wrapper
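`once_differentiable` lets a `ctx`-style backward be written at the tensor level when a double-differentiable formula is not available; the results are re-wrapped so that a second differentiation hits the `DelayedError` used above. A sketch of the intended usage, mirroring the conversion recipe in the three-line vim-script file shown earlier in this diff (the import path for the decorator is an assumption):

```python
import torch
from torch.autograd import Function
from torch.autograd.function import once_differentiable  # import path assumed


class MyAbs(Function):

    @staticmethod
    def forward(ctx, i):
        ctx.save_for_backward(i)
        return i.abs()

    @staticmethod
    @once_differentiable
    def backward(ctx, grad_output):
        # inside a once_differentiable backward, grad_output and the saved
        # values are plain tensors; the decorator wraps the result back up
        i, = ctx.saved_tensors
        return grad_output * i.sign()
```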
|
||||
|
||||
|
||||
class InplaceFunction(Function):
|
||||
|
||||
def __init__(self, inplace=False):
|
||||
|
||||
@ -1,41 +1,31 @@
|
||||
import torch
|
||||
from torch.autograd import Variable
|
||||
from collections import Iterable
|
||||
|
||||
|
||||
def iter_variables(x):
|
||||
def iter_gradients(x):
|
||||
if isinstance(x, Variable):
|
||||
if x.requires_grad:
|
||||
yield (x.grad.data, x.data) if x.grad is not None else (None, None)
|
||||
elif isinstance(x, Iterable):
|
||||
yield x.grad.data if x.grad is not None else None
|
||||
else:
|
||||
for elem in x:
|
||||
for result in iter_variables(elem):
|
||||
for result in iter_gradients(elem):
|
||||
yield result
|
||||
|
||||
|
||||
def zero_gradients(x):
|
||||
if isinstance(x, Variable):
|
||||
if x.grad is not None:
|
||||
x.grad.detach_()
|
||||
x.grad.data.zero_()
|
||||
elif isinstance(x, Iterable):
|
||||
for elem in x:
|
||||
zero_gradients(elem)
|
||||
def zero_gradients(i):
|
||||
for t in iter_gradients(i):
|
||||
if t is not None:
|
||||
t.zero_()
|
||||
|
||||
|
||||
def make_jacobian(input, num_out):
|
||||
if isinstance(input, Variable) and not input.requires_grad:
|
||||
return None
|
||||
elif torch.is_tensor(input) or isinstance(input, Variable):
|
||||
if torch.is_tensor(input) or isinstance(input, Variable):
|
||||
return torch.zeros(input.nelement(), num_out)
|
||||
elif isinstance(input, Iterable):
|
||||
jacobians = list(filter(
|
||||
lambda x: x is not None, (make_jacobian(elem, num_out) for elem in input)))
|
||||
if not jacobians:
|
||||
return None
|
||||
return type(input)(jacobians)
|
||||
else:
|
||||
return None
|
||||
return type(input)(filter(lambda x: x is not None,
|
||||
(make_jacobian(elem, num_out) for elem in input)))
|
||||
|
||||
|
||||
def iter_tensors(x, only_requiring_grad=False):
|
||||
@ -44,7 +34,7 @@ def iter_tensors(x, only_requiring_grad=False):
|
||||
elif isinstance(x, Variable):
|
||||
if x.requires_grad or not only_requiring_grad:
|
||||
yield x.data
|
||||
elif isinstance(x, Iterable):
|
||||
else:
|
||||
for elem in x:
|
||||
for result in iter_tensors(elem, only_requiring_grad):
|
||||
yield result
|
||||
@ -55,9 +45,8 @@ def contiguous(input):
|
||||
return input.contiguous()
|
||||
elif isinstance(input, Variable):
|
||||
return input.contiguous()
|
||||
elif isinstance(input, Iterable):
|
||||
else:
|
||||
return type(input)(contiguous(e) for e in input)
|
||||
return input
|
||||
|
||||
|
||||
def get_numerical_jacobian(fn, input, target, eps=1e-3):
|
||||
@ -81,9 +70,9 @@ def get_numerical_jacobian(fn, input, target, eps=1e-3):
|
||||
for i in range(flat_tensor.nelement()):
|
||||
orig = flat_tensor[i]
|
||||
flat_tensor[i] = orig - eps
|
||||
outa.copy_(fn(input), broadcast=False)
|
||||
outa.copy_(fn(input))
|
||||
flat_tensor[i] = orig + eps
|
||||
outb.copy_(fn(input), broadcast=False)
|
||||
outb.copy_(fn(input))
|
||||
flat_tensor[i] = orig
|
||||
|
||||
outb.add_(-1, outa).div_(2 * eps)
|
||||
@ -94,31 +83,21 @@ def get_numerical_jacobian(fn, input, target, eps=1e-3):
|
||||
|
||||
def get_analytical_jacobian(input, output):
|
||||
jacobian = make_jacobian(input, output.numel())
|
||||
jacobian_reentrant = make_jacobian(input, output.numel())
|
||||
grad_output = output.data.clone().zero_()
|
||||
flat_grad_output = grad_output.view(-1)
|
||||
reentrant = True
|
||||
correct_grad_sizes = True
|
||||
|
||||
for i in range(flat_grad_output.numel()):
|
||||
flat_grad_output.zero_()
|
||||
flat_grad_output[i] = 1
|
||||
for jacobian_c in (jacobian, jacobian_reentrant):
|
||||
zero_gradients(input)
|
||||
output.backward(grad_output, create_graph=True)
|
||||
for jacobian_x, (d_x, x) in zip(jacobian_c, iter_variables(input)):
|
||||
if d_x is None:
|
||||
jacobian_x[:, i].zero_()
|
||||
else:
|
||||
if d_x.size() != x.size():
|
||||
correct_grad_sizes = False
|
||||
jacobian_x[:, i] = d_x.to_dense() if d_x.is_sparse else d_x
|
||||
zero_gradients(input)
|
||||
output.backward(grad_output, retain_variables=True)
|
||||
for jacobian_x, d_x in zip(jacobian, iter_gradients(input)):
|
||||
if d_x is None:
|
||||
jacobian_x[:, i].zero_()
|
||||
else:
|
||||
jacobian_x[:, i] = d_x.to_dense() if d_x.is_sparse else d_x
|
||||
|
||||
for jacobian_x, jacobian_reentrant_x in zip(jacobian, jacobian_reentrant):
|
||||
if (jacobian_x - jacobian_reentrant_x).abs().max() != 0:
|
||||
reentrant = False
|
||||
|
||||
return jacobian, reentrant, correct_grad_sizes
|
||||
return jacobian
|
||||
|
||||
|
||||
def _as_tuple(x):
|
||||
@ -161,65 +140,21 @@ def gradcheck(func, inputs, eps=1e-6, atol=1e-5, rtol=1e-3):
|
||||
def fn(input):
|
||||
return _as_tuple(func(*input))[i].data
|
||||
|
||||
analytical, reentrant, correct_grad_sizes = get_analytical_jacobian(_as_tuple(inputs), o)
|
||||
numerical = get_numerical_jacobian(fn, inputs, inputs, eps)
|
||||
analytical = get_analytical_jacobian(_as_tuple(inputs), o)
|
||||
|
||||
for a, n in zip(analytical, numerical):
|
||||
if not ((a - n).abs() <= (atol + rtol * n.abs())).all():
|
||||
return False
|
||||
|
||||
if not reentrant:
|
||||
return False
|
||||
|
||||
if not correct_grad_sizes:
|
||||
return False
|
||||
|
||||
# check if the backward multiplies by grad_output
|
||||
zero_gradients(inputs)
|
||||
output = _as_tuple(func(*inputs))
|
||||
torch.autograd.backward(output, [o.data.new(o.size()).zero_() for o in output])
|
||||
var_inputs = list(filter(lambda i: isinstance(i, Variable), inputs))
|
||||
if not var_inputs:
|
||||
raise RuntimeError("no Variables found in input")
|
||||
for i in var_inputs:
|
||||
for i in inputs:
|
||||
if i.grad is None:
|
||||
continue
|
||||
if not i.grad.data.eq(0).all():
|
||||
return False
|
||||
|
||||
return True
|
||||
|
||||
|
||||
def gradgradcheck(func, inputs, grad_outputs, eps=1e-6, atol=1e-5, rtol=1e-3):
|
||||
"""Check gradients of gradients computed via small finite differences
|
||||
against analytical gradients
|
||||
This function checks that backpropagating through the gradients computed
|
||||
to the given grad_outputs is correct.
|
||||
|
||||
The check between numerical and analytical has the same behaviour as
|
||||
numpy.allclose https://docs.scipy.org/doc/numpy/reference/generated/numpy.allclose.html
|
||||
meaning it checks that
|
||||
absolute(a - n) <= (atol + rtol * absolute(n))
|
||||
is true for all elements of analytical gradient a and numerical gradient n.
|
||||
|
||||
Args:
|
||||
func: Python function that takes Variable inputs and returns
|
||||
a tuple of Variables
|
||||
inputs: tuple of Variables
|
||||
grad_outputs: tuple of Variables
|
||||
eps: perturbation for finite differences
|
||||
atol: absolute tolerance
|
||||
rtol: relative tolerance
|
||||
|
||||
Returns:
|
||||
True if all differences satisfy allclose condition
|
||||
"""
|
||||
def new_func(*input_args):
|
||||
input_args = input_args[:-len(grad_outputs)]
|
||||
outputs = func(*input_args)
|
||||
outputs = _as_tuple(outputs)
|
||||
input_args = tuple(x for x in input_args if isinstance(x, Variable) and x.requires_grad)
|
||||
grad_inputs = torch.autograd.grad(outputs, input_args, grad_outputs)
|
||||
return grad_inputs
|
||||
|
||||
return gradcheck(new_func, inputs + grad_outputs, eps, atol, rtol)
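A sketch of how these two checks are typically driven; double precision keeps the finite-difference noise below the default tolerances, and the function, shapes, and import path here are assumptions rather than part of the diff:

```python
import torch
from torch.autograd import Variable
from torch.autograd.gradcheck import gradcheck, gradgradcheck  # import path assumed

x = Variable(torch.randn(4, 3).double(), requires_grad=True)
w = Variable(torch.randn(3, 2).double(), requires_grad=True)

def fn(x, w):
    return x.mm(w)

# first order: compares the analytical Jacobian against central differences,
# elementwise, with the allclose-style tolerance described above
assert gradcheck(fn, (x, w), eps=1e-6, atol=1e-5, rtol=1e-3)

# second order: backprops through the gradients taken w.r.t. grad_out
grad_out = Variable(torch.randn(4, 2).double(), requires_grad=True)
assert gradgradcheck(fn, (x, w), (grad_out,))
```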
|
||||
|
||||
@ -1,11 +1,10 @@
|
||||
import sys
|
||||
import torch
|
||||
import torch._C as _C
|
||||
from collections import OrderedDict
|
||||
import torch.sparse as sparse
|
||||
import torch.utils.hooks as hooks
|
||||
import warnings
|
||||
import weakref
|
||||
|
||||
from ._functions import *
|
||||
|
||||
|
||||
class Variable(_C._VariableBase):
|
||||
@ -14,7 +13,7 @@ class Variable(_C._VariableBase):
|
||||
Variable is a thin wrapper around a Tensor object, that also holds
|
||||
the gradient w.r.t. to it, and a reference to a function that created it.
|
||||
This reference allows retracing the whole chain of operations that
|
||||
created the data. If the Variable has been created by the user, its grad_fn
|
||||
created the data. If the Variable has been created by the user, its creator
|
||||
will be ``None`` and we call such objects *leaf* Variables.
|
||||
|
||||
Since autograd only supports scalar valued function differentiation, grad
|
||||
@ -34,9 +33,8 @@ class Variable(_C._VariableBase):
|
||||
inference mode, i.e. don't save the history. See
|
||||
:ref:`excluding-subgraphs` for more details.
|
||||
Can be changed only on leaf Variables.
|
||||
is_leaf: Boolean indicating if the Variable is a graph leaf (i.e
|
||||
if it was created by the user).
|
||||
grad_fn: Gradient function graph trace.
|
||||
creator: Function of which the variable was an output. For leaf
|
||||
(user created) variables it's ``None``. Read-only attribute.
|
||||
|
||||
Parameters:
|
||||
data (any tensor class): Tensor to wrap.
|
||||
@ -62,30 +60,29 @@ class Variable(_C._VariableBase):
|
||||
def __getattr__(self, name):
|
||||
if name in self._fallthrough_methods:
|
||||
return getattr(self.data, name)
|
||||
return object.__getattribute__(self, name)
|
||||
raise AttributeError(name)
|
||||
|
||||
def __getitem__(self, key):
|
||||
if torch.is_tensor(key):
|
||||
key = Variable(key) # auto-wrap tensors
|
||||
if isinstance(key, Variable):
|
||||
if type(key.data).__name__ == 'ByteTensor':
|
||||
return MaskedSelect.apply(self, key)
|
||||
elif type(key.data).__name__ == 'LongTensor':
|
||||
return IndexSelect.apply(self, 0, key)
|
||||
# else fall through and raise an error in Index
|
||||
return Index.apply(self, key)
|
||||
if (isinstance(key, Variable) and
|
||||
type(key.data).__name__ == 'ByteTensor'):
|
||||
return MaskedSelect()(self, key)
|
||||
return Index(key)(self)
|
||||
|
||||
def __setitem__(self, key, value):
|
||||
if isinstance(key, Variable) and type(key.data).__name__ == 'ByteTensor':
|
||||
if (isinstance(key, Variable) and
|
||||
type(key.data).__name__ == 'ByteTensor'):
|
||||
if isinstance(value, Variable):
|
||||
return MaskedScatter.apply(self, key, value, True)
|
||||
return MaskedCopy(inplace=True)(self, key, value)
|
||||
else:
|
||||
return MaskedFill.apply(self, key, value, True)
|
||||
return MaskedFill(value, inplace=True)(self, key)
|
||||
else:
|
||||
return SetItem.apply(self, key, value)
|
||||
if isinstance(value, Variable):
|
||||
return SetItem(key)(self, value)
|
||||
else:
|
||||
return SetItem(key, value)(self)
|
||||
|
||||
def __deepcopy__(self, memo):
|
||||
if not self.is_leaf:
|
||||
if self.creator is not None:
|
||||
raise RuntimeError("Only Variables created explicitly by the user "
|
||||
"(graph leaves) support the deepcopy protocol at the moment")
|
||||
result = type(self)(self.data.clone())
|
||||
@ -109,22 +106,14 @@ class Variable(_C._VariableBase):
|
||||
# legacy serialization of Variable
|
||||
self.data = state[0]
|
||||
state = (state[3], state[4], state[2])
|
||||
if not self.is_leaf:
|
||||
if self.creator is not None:
|
||||
raise RuntimeError('__setstate__ can be only called on leaf variables')
|
||||
self.requires_grad, self.volatile, self._backward_hooks = state
|
||||
|
||||
def __repr__(self):
|
||||
return 'Variable containing:' + self.data.__repr__()
|
||||
|
||||
def __bool__(self):
|
||||
if self.data.numel() == 0:
|
||||
return False
|
||||
raise RuntimeError("bool value of Variable objects containing non-empty " +
|
||||
torch.typename(self.data) + " is ambiguous")
|
||||
|
||||
__nonzero__ = __bool__
|
||||
|
||||
def backward(self, gradient=None, retain_graph=None, create_graph=None, retain_variables=None):
|
||||
def backward(self, gradient=None, retain_variables=False):
|
||||
"""Computes the gradient of current variable w.r.t. graph leaves.
|
||||
|
||||
The graph is differentiated using the chain rule. If the variable is
|
||||
@ -133,27 +122,28 @@ class Variable(_C._VariableBase):
|
||||
It should be a tensor of matching type and location, that contains
|
||||
the gradient of the differentiated function w.r.t. ``self``.
|
||||
|
||||
This function accumulates gradients in the leaves - you might need to
|
||||
zero them before calling it.
|
||||
This function accumulates gradients in the leaves - you might need to zero
|
||||
them before calling it.
|
||||
|
||||
Arguments:
|
||||
gradient (Tensor, Variable or None): Gradient w.r.t. the
|
||||
variable. If it is a tensor, it will be automatically converted
|
||||
to a Variable that is volatile unless ``create_graph`` is True.
|
||||
None values can be specified for scalar Variables or ones that
|
||||
don't require grad. If a None value would be acceptable then
|
||||
this argument is optional.
|
||||
retain_graph (bool, optional): If False, the graph used to compute
|
||||
the grads will be freed. Note that in nearly all cases setting
|
||||
this option to True is not needed and often can be worked around
|
||||
in a much more efficient way. Defaults to the value of
|
||||
``create_graph``.
|
||||
create_graph (bool, optional): If true, graph of the derivative will
|
||||
be constructed, allowing to compute higher order derivative
|
||||
products. Defaults to False, unless ``gradient`` is a volatile
|
||||
Variable.
|
||||
gradient (Tensor): Gradient of the differentiated function
|
||||
w.r.t. the data. Required only if the data has more than one
|
||||
element. Type and location should match these of ``self.data``.
|
||||
retain_variables (bool): If ``True``, buffers necessary for computing
|
||||
gradients won't be freed after use. It is only necessary to
|
||||
specify ``True`` if you want to differentiate some subgraph multiple
|
||||
times (in some cases it will be much more efficient to use
|
||||
`autograd.backward`).
|
||||
"""
|
||||
torch.autograd.backward(self, gradient, retain_graph, create_graph, retain_variables)
|
||||
if self.volatile:
|
||||
raise RuntimeError('calling backward on a volatile variable')
|
||||
if gradient is None and self.requires_grad:
|
||||
if self.data.numel() != 1:
|
||||
raise RuntimeError(
|
||||
'backward should be called only on a scalar (i.e. 1-element tensor) '
|
||||
'or with gradient w.r.t. the variable')
|
||||
gradient = self.data.new().resize_as_(self.data).fill_(1)
|
||||
self._execution_engine.run_backward((self,), (gradient,), retain_variables)
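As described in the docstring above, `backward` needs an explicit `gradient` only for non-scalar outputs. A minimal sketch, assuming the Variable API shown in this hunk:

```python
import torch
from torch.autograd import Variable

x = Variable(torch.randn(3), requires_grad=True)

loss = (x * x).sum()        # scalar output: no gradient argument needed
loss.backward()

y = x * 2                   # non-scalar output: pass a gradient of matching
y.backward(torch.ones(3))   # type and shape (here, all ones)
print(x.grad)               # gradients accumulate in the leaf
```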
|
||||
|
||||
def register_hook(self, hook):
|
||||
"""Registers a backward hook.
|
||||
@ -187,8 +177,8 @@ class Variable(_C._VariableBase):
|
||||
"doesn't require gradient")
|
||||
if self._backward_hooks is None:
|
||||
self._backward_hooks = OrderedDict()
|
||||
if self.grad_fn is not None:
|
||||
self.grad_fn._register_hook_dict(self)
|
||||
if self.creator is not None:
|
||||
self.creator._register_hook_dict(self)
|
||||
handle = hooks.RemovableHandle(self._backward_hooks)
|
||||
self._backward_hooks[handle.id] = hook
|
||||
return handle
|
||||
@ -204,10 +194,10 @@ class Variable(_C._VariableBase):
|
||||
reward(Tensor): Tensor with per-element rewards. It has to match
|
||||
the device location and shape of Variable's data.
|
||||
"""
|
||||
if not isinstance(self.grad_fn, StochasticFunction):
|
||||
if not isinstance(self.creator, StochasticFunction):
|
||||
raise RuntimeError("reinforce() can be only called on outputs "
|
||||
"of stochastic functions")
|
||||
self.grad_fn._reinforce(reward)
|
||||
self.creator._reinforce(reward)
|
||||
|
||||
def detach(self):
|
||||
"""Returns a new Variable, detached from the current graph.
|
||||
@ -222,61 +212,35 @@ class Variable(_C._VariableBase):
|
||||
errors in correctness checks.
|
||||
"""
|
||||
result = NoGrad()(self) # this is needed, because it merges version counters
|
||||
result._grad_fn = None
|
||||
result._creator = None
|
||||
return result
|
||||
|
||||
def detach_(self):
|
||||
"""Detaches the Variable from the graph that created it, making it a
|
||||
leaf.
|
||||
"""
|
||||
self._grad_fn = None
|
||||
"""Detaches the Variable from the graph that created it, making it a leaf."""
|
||||
self._creator = None
|
||||
self.requires_grad = False
|
||||
|
||||
def retain_grad(self):
|
||||
"""Enables .grad attribute for non-leaf Variables."""
|
||||
if self.grad_fn is None: # no-op for leaves
|
||||
return
|
||||
if not self.requires_grad:
|
||||
raise RuntimeError("can't retain_grad on Variable that has requires_grad=False")
|
||||
if hasattr(self, 'retains_grad'):
|
||||
return
|
||||
weak_self = weakref.ref(self)
|
||||
|
||||
def retain_grad_hook(grad):
|
||||
var = weak_self()
|
||||
if var is None:
|
||||
return
|
||||
if var._grad is None:
|
||||
var._grad = grad.clone()
|
||||
else:
|
||||
var._grad = var._grad + grad
|
||||
|
||||
self.register_hook(retain_grad_hook)
|
||||
self.retains_grad = True
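`retain_grad` above works by registering a hook that clones or accumulates the incoming gradient into `._grad`, so `.grad` becomes available on non-leaf Variables too. A usage sketch under the same assumptions as the example earlier in this diff:

```python
import torch
from torch.autograd import Variable

x = Variable(torch.randn(3), requires_grad=True)
y = x * 2                   # non-leaf: .grad would normally stay None
y.retain_grad()
y.sum().backward()
print(y.grad)               # populated by the retain_grad hook
```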
|
||||
|
||||
def contiguous(self):
|
||||
self.data = self.data.contiguous()
|
||||
return self
|
||||
|
||||
def clone(self):
|
||||
return Clone.apply(self)
|
||||
return Clone()(self)
|
||||
|
||||
def type(self, t):
|
||||
if t != type(self.data):
|
||||
return Type.apply(self, t)
|
||||
return Type(t)(self)
|
||||
return self
|
||||
|
||||
def type_as(self, t):
|
||||
if isinstance(t, Variable):
|
||||
t = t.data
|
||||
return self.type(type(t))
|
||||
return self.type(type(t.data))
|
||||
|
||||
def _get_type(self, name):
|
||||
module = torch._import_dotted_name(self.data.__module__)
|
||||
return getattr(module, name)
|
||||
|
||||
def cuda(self, device_id=None, async=False):
|
||||
return CudaTransfer.apply(self, device_id, async)
|
||||
return CudaTransfer(device_id, async)(self)
|
||||
|
||||
def cpu(self):
|
||||
return self.type(getattr(torch, type(self.data).__name__))
|
||||
@ -310,10 +274,10 @@ class Variable(_C._VariableBase):
|
||||
|
||||
def _add(self, other, inplace):
|
||||
if isinstance(other, Variable):
|
||||
return Add.apply(self, other, inplace)
|
||||
return Add(inplace)(self, other)
|
||||
else:
|
||||
assert not torch.is_tensor(other)
|
||||
return AddConstant.apply(self, other, inplace)
|
||||
return AddConstant(other, inplace)(self)
|
||||
|
||||
def add(self, other):
|
||||
return self._add(other, False)
|
||||
@ -323,10 +287,10 @@ class Variable(_C._VariableBase):
|
||||
|
||||
def _sub(self, other, inplace):
|
||||
if isinstance(other, Variable):
|
||||
return Sub.apply(self, other, inplace)
|
||||
return Sub(inplace=inplace)(self, other)
|
||||
else:
|
||||
assert not torch.is_tensor(other)
|
||||
return SubConstant.apply(self, other, inplace)
|
||||
return SubConstant(other, inplace=inplace)(self)
|
||||
|
||||
def sub(self, other):
|
||||
return self._sub(other, False)
|
||||
@ -336,181 +300,178 @@ class Variable(_C._VariableBase):
|
||||
|
||||
def mul(self, other):
|
||||
if isinstance(other, Variable):
|
||||
return Mul.apply(self, other)
|
||||
return Mul()(self, other)
|
||||
else:
|
||||
assert not torch.is_tensor(other)
|
||||
return MulConstant.apply(self, other)
|
||||
return MulConstant(other)(self)
|
||||
|
||||
def mul_(self, other):
|
||||
if not isinstance(other, Variable) and not torch.is_tensor(other):
|
||||
return MulConstant.apply(self, other, True)
|
||||
return MulConstant(other, inplace=True)(self)
|
||||
raise RuntimeError("mul_ only supports scalar multiplication")
|
||||
|
||||
def div(self, other):
|
||||
if isinstance(other, Variable):
|
||||
return Div.apply(self, other)
|
||||
return Div()(self, other)
|
||||
else:
|
||||
assert not torch.is_tensor(other)
|
||||
return DivConstant.apply(self, other)
|
||||
return DivConstant(other)(self)
|
||||
|
||||
def div_(self, other):
|
||||
if not isinstance(other, Variable) and not torch.is_tensor(other):
|
||||
return DivConstant.apply(self, other, True)
|
||||
return DivConstant(other, inplace=True)(self)
|
||||
raise RuntimeError("div_ only supports scalar multiplication")
|
||||
|
||||
def pow(self, other):
|
||||
if isinstance(other, Variable):
|
||||
return Pow.apply(self, other)
|
||||
return Pow()(self, other)
|
||||
else:
|
||||
assert not torch.is_tensor(other)
|
||||
return PowConstant.apply(self, other)
|
||||
return PowConstant(other)(self)
|
||||
|
||||
def exp(self):
|
||||
return Exp.apply(self)
|
||||
return Exp()(self)
|
||||
|
||||
def exp_(self):
|
||||
return Exp.apply(self, True)
|
||||
return Exp(inplace=True)(self)
|
||||
|
||||
def log(self):
|
||||
return Log.apply(self)
|
||||
return Log()(self)
|
||||
|
||||
def log1p(self):
|
||||
return Log1p.apply(self)
|
||||
return Log1p()(self)
|
||||
|
||||
def neg(self):
|
||||
return Negate.apply(self)
|
||||
return Negate()(self)
|
||||
|
||||
def neg_(self):
|
||||
return Negate.apply(self, True)
|
||||
return Negate(inplace=True)(self)
|
||||
|
||||
def tanh(self):
|
||||
return Tanh.apply(self)
|
||||
return Tanh()(self)
|
||||
|
||||
def tanh_(self):
|
||||
return Tanh.apply(self, True)
|
||||
return Tanh(True)(self)
|
||||
|
||||
def sigmoid(self):
|
||||
return Sigmoid.apply(self)
|
||||
return Sigmoid()(self)
|
||||
|
||||
def sigmoid_(self):
|
||||
return Sigmoid.apply(self, True)
|
||||
return Sigmoid(True)(self)
|
||||
|
||||
def sin(self):
|
||||
return Sin.apply(self)
|
||||
return Sin()(self)
|
||||
|
||||
def cos(self):
|
||||
return Cos.apply(self)
|
||||
return Cos()(self)
|
||||
|
||||
def tan(self):
|
||||
return Tan.apply(self)
|
||||
return Tan()(self)
|
||||
|
||||
def asin(self):
|
||||
return Asin.apply(self)
|
||||
return Asin()(self)
|
||||
|
||||
def acos(self):
|
||||
return Acos.apply(self)
|
||||
return Acos()(self)
|
||||
|
||||
def atan(self):
|
||||
return Atan.apply(self)
|
||||
|
||||
def atan2(self, x):
|
||||
return Atan2.apply(self, x)
|
||||
return Atan()(self)
|
||||
|
||||
def sinh(self):
|
||||
return Sinh.apply(self)
|
||||
return Sinh()(self)
|
||||
|
||||
def cosh(self):
|
||||
return Cosh.apply(self)
|
||||
return Cosh()(self)
|
||||
|
||||
def abs(self):
|
||||
return Abs.apply(self)
|
||||
return Abs()(self)
|
||||
|
||||
def clamp(self, min=None, max=None):
|
||||
if min is None and max is None:
|
||||
raise ValueError("clamp requires specifying at least one of "
|
||||
"min and max arguments")
|
||||
elif min is None and max is not None:
|
||||
return CminConstant.apply(self, max)
|
||||
return CminConstant(max)(self)
|
||||
elif min is not None and max is None:
|
||||
return CmaxConstant.apply(self, min)
|
||||
return CmaxConstant(min)(self)
|
||||
else:
|
||||
return Clamp.apply(self, min, max)
|
||||
return Clamp(min, max)(self)
|
||||
|
||||
def reciprocal(self):
|
||||
return Reciprocal.apply(self)
|
||||
return Reciprocal()(self)
|
||||
|
||||
def floor(self):
|
||||
return Floor.apply(self)
|
||||
return Floor()(self)
|
||||
|
||||
def ceil(self):
|
||||
return Ceil.apply(self)
|
||||
return Ceil()(self)
|
||||
|
||||
def frac(self):
|
||||
return Frac.apply(self)
|
||||
return Frac()(self)
|
||||
|
||||
def sqrt(self):
|
||||
return Sqrt.apply(self)
|
||||
return Sqrt()(self)
|
||||
|
||||
def round(self):
|
||||
return Round.apply(self)
|
||||
return Round()(self)
|
||||
|
||||
def sign(self):
|
||||
return Sign.apply(self)
|
||||
return Sign()(self)
|
||||
|
||||
def trunc(self):
|
||||
return Trunc.apply(self)
|
||||
return Trunc()(self)
|
||||
|
||||
def fmod(self, value):
|
||||
return Fmod.apply(self, value)
|
||||
return Fmod(value)(self)
|
||||
|
||||
def remainder(self, value):
|
||||
return Remainder.apply(self, value)
|
||||
return Remainder(value)(self)
|
||||
|
||||
def lerp(self, tensor, weight):
|
||||
return Lerp.apply(self, tensor, weight)
|
||||
return Lerp(weight)(self, tensor)
|
||||
|
||||
def rsqrt(self):
|
||||
return Rsqrt.apply(self)
|
||||
return Rsqrt()(self)
|
||||
|
||||
def sum(self, dim=None, keepdim=None):
|
||||
return Sum.apply(self, dim, keepdim)
|
||||
def sum(self, dim=None):
|
||||
return Sum(dim)(self)
|
||||
|
||||
def prod(self, dim=None, keepdim=None):
|
||||
return Prod.apply(self, dim, keepdim)
|
||||
def prod(self, dim=None):
|
||||
return Prod(dim)(self)
|
||||
|
||||
def mean(self, dim=None, keepdim=None):
|
||||
return Mean.apply(self, dim, keepdim)
|
||||
def mean(self, dim=None):
|
||||
return Mean(dim)(self)
|
||||
|
||||
def max(self, dim=None, keepdim=None):
|
||||
def max(self, dim=None):
|
||||
if isinstance(dim, Variable):
|
||||
return Cmax.apply(self, dim)
|
||||
return Max.apply(self, dim, keepdim)
|
||||
return Cmax()(self, dim)
|
||||
return Max(dim)(self)
|
||||
|
||||
def min(self, dim=None, keepdim=None):
|
||||
def min(self, dim=None):
|
||||
if isinstance(dim, Variable):
|
||||
return Cmin.apply(self, dim)
|
||||
return Min.apply(self, dim, keepdim)
|
||||
return Cmin()(self, dim)
|
||||
return Min(dim)(self)
|
||||
|
||||
def mode(self, dim=None, keepdim=None):
|
||||
return Mode.apply(self, dim, keepdim)
|
||||
def mode(self, dim):
|
||||
return Mode(dim)(self)
|
||||
|
||||
def median(self, dim=None, keepdim=None):
|
||||
return Median.apply(self, dim, keepdim)
|
||||
def median(self, dim):
|
||||
return Median(dim)(self)
|
||||
|
||||
def kthvalue(self, k, dim=None, keepdim=None):
|
||||
return Kthvalue.apply(self, k, dim, keepdim)
|
||||
def kthvalue(self, dim):
|
||||
return Kthvalue(dim)(self)
|
||||
|
||||
def sort(self, dim=None, descending=False):
|
||||
return Sort.apply(self, dim, descending, True)
|
||||
return Sort(dim, descending)(self)
|
||||
|
||||
def topk(self, k, dim=None, largest=True, sorted=True):
|
||||
return Topk.apply(self, k, dim, largest, sorted, True)
|
||||
return Topk(k, dim, largest, sorted)(self)
|
||||
|
||||
def view(self, *sizes):
|
||||
return View.apply(self, sizes)
|
||||
return View(*sizes)(self)
|
||||
|
||||
def view_as(self, tensor):
|
||||
return View.apply(self, tensor.size())
|
||||
return View(*tensor.size())(self)
|
||||
|
||||
def split(self, split_size, dim=0):
|
||||
return torch.split(self, split_size, dim)
|
||||
@ -520,45 +481,32 @@ class Variable(_C._VariableBase):
|
||||
repeats = repeats[0]
|
||||
else:
|
||||
repeats = torch.Size(repeats)
|
||||
return Repeat.apply(self, repeats)
|
||||
return Repeat(repeats)(self)
|
||||
|
||||
def cumsum(self, dim):
|
||||
return Cumsum.apply(self, dim)
|
||||
return Cumsum(dim)(self)
|
||||
|
||||
def cumprod(self, dim):
|
||||
return Cumprod.apply(self, dim)
|
||||
|
||||
def unfold(self, dim, size, step):
|
||||
return Unfold.apply(self, dim, size, step)
|
||||
|
||||
def var(self, dim=None, keepdim=None, unbiased=True):
|
||||
keepdim_ = False if keepdim is None else keepdim
|
||||
mean = self.mean(dim, keepdim)
|
||||
def var(self, dim=None, unbiased=True):
|
||||
mean = self.mean(dim)
|
||||
if dim is None:
|
||||
mean = mean.view(*(1 for s in self.size()))
|
||||
# we could just set keepdim to True, but this preserves some fidelity
|
||||
elif keepdim_ is False and self.dim() != 1:
|
||||
mean = mean.unsqueeze(dim)
|
||||
mean_expanded = mean.expand_as(self)
|
||||
zero_centered = self.sub(mean_expanded)
|
||||
var = zero_centered.mul(zero_centered).sum(dim, keepdim=keepdim_)
|
||||
var = zero_centered.mul(zero_centered).sum(dim)
|
||||
numel = self.numel() if dim is None else self.size(dim)
|
||||
return var.div(numel - int(unbiased))
|
||||
|
||||
def std(self, dim=None, keepdim=None, unbiased=True):
|
||||
return self.var(dim, keepdim, unbiased).sqrt()
|
||||
def std(self, dim=None, unbiased=True):
|
||||
return self.var(dim, unbiased).sqrt()
|
||||
|
||||
def renorm(self, p, dim, maxnorm):
|
||||
t = self.transpose(dim, 0)
|
||||
flat = t.contiguous().view(self.size(0), -1)
|
||||
norms = flat.norm(p, 1, True)
|
||||
norms = flat.norm(p, 1)
|
||||
norms = norms.clamp(max=maxnorm).div(norms.add(1e-7))
|
||||
flat_out = flat.mul(norms.expand_as(flat))
|
||||
return flat_out.view(t.size()).transpose(dim, 0)
|
||||
|
||||
def matmul(self, other):
|
||||
return torch.matmul(self, other)
|
||||
|
||||
@staticmethod
|
||||
def _static_blas(cls, args, inplace):
|
||||
num_args = len(args)
|
||||
@ -569,14 +517,14 @@ class Variable(_C._VariableBase):
|
||||
alpha, beta = args[1:3]
|
||||
if num_args == 4:
|
||||
alpha = args[1]
|
||||
return cls.apply(*(args[:1] + args[-2:] + (alpha, beta, inplace)))
|
||||
return cls(alpha, beta, inplace)(*(args[:1] + args[-2:]))
|
||||
|
||||
def _blas(self, cls, args, inplace):
|
||||
return self._static_blas(cls, (self,) + args, inplace)
|
||||
|
||||
def mm(self, matrix):
|
||||
output = Variable(self.data.new(self.data.size(0), matrix.data.size(1)))
|
||||
return Addmm.apply(output, self, matrix, 0, 1, True)
|
||||
return self._static_blas(Addmm, (output, 0, 1, self, matrix), False)
|
||||
|
||||
def bmm(self, batch):
|
||||
output = Variable(self.data.new(self.data.size(0), self.data.size(1),
|
||||
@ -592,10 +540,10 @@ class Variable(_C._VariableBase):
|
||||
return self._static_blas(Addr, (output, 0, 1, self, vector), False)
|
||||
|
||||
def resize(self, *sizes):
|
||||
return Resize.apply(self, sizes)
|
||||
return Resize(*sizes)(self)
|
||||
|
||||
def resize_as(self, variable):
|
||||
return Resize.apply(self, variable.size())
|
||||
return Resize(*variable.size())(self)
|
||||
|
||||
def addmm(self, *args):
|
||||
return self._blas(Addmm, args, False)
|
||||
@ -628,186 +576,170 @@ class Variable(_C._VariableBase):
|
||||
return self._blas(Addr, args, True)
|
||||
|
||||
def dot(self, other):
|
||||
return Dot.apply(self, other)
|
||||
return Dot()(self, other)
|
||||
|
||||
def _addcop(self, op, args, inplace):
|
||||
def _addcop(self, op, args):
|
||||
if len(args) == 3:
|
||||
# args == [scale, tensor1, tensor2]
|
||||
return op.apply(self, args[1], args[2], args[0], inplace)
|
||||
# scale, tensor1, tensor2
|
||||
return op(args[0])(self, *args[1:])
|
||||
else:
|
||||
# args == [tensor1, tensor2]
|
||||
return op.apply(self, args[0], args[1], 1.0, inplace)
|
||||
# tensor1, tensor2
|
||||
return op()(self, *args)
|
||||
|
||||
def addcmul(self, *args):
|
||||
return self._addcop(Addcmul, args, False)
|
||||
return self._addcop(Addcmul, args)
|
||||
|
||||
def addcdiv(self, *args):
|
||||
return self._addcop(Addcdiv, args, False)
|
||||
return self._addcop(Addcdiv, args)
|
||||
|
||||
def addcmul_(self, *args):
|
||||
return self._addcop(Addcmul, args, True)
|
||||
|
||||
def addcdiv_(self, *args):
|
||||
return self._addcop(Addcdiv, args, True)
|
||||
|
||||
def norm(self, p=2, dim=None, keepdim=None):
|
||||
return Norm.apply(self, p, dim, keepdim)
|
||||
def norm(self, p=2, dim=None):
|
||||
return Norm(p, dim)(self)
|
||||
|
||||
def dist(self, tensor, p=2):
|
||||
return Norm.apply(self - tensor, p)
|
||||
return Norm(p)(self - tensor)
|
||||
|
||||
def index_add(self, dim, index, tensor):
|
||||
return IndexAdd.apply(self, dim, index, tensor)
|
||||
|
||||
def _advanced_index_add(self, index, tensor):
|
||||
return AdvancedIndexAdd.apply(self, index, tensor)
|
||||
return IndexAdd(dim)(self, index, tensor)
|
||||
|
||||
def index_add_(self, dim, index, tensor):
|
||||
return IndexAdd.apply(self, dim, index, tensor, True)
|
||||
return IndexAdd(dim, True)(self, index, tensor)
|
||||
|
||||
def index_copy(self, dim, index, tensor):
|
||||
return IndexCopy.apply(self, dim, index, tensor)
|
||||
return IndexCopy(dim)(self, index, tensor)
|
||||
|
||||
def index_copy_(self, dim, index, tensor):
|
||||
return IndexCopy.apply(self, dim, index, tensor, True)
|
||||
return IndexCopy(dim, True)(self, index, tensor)
|
||||
|
||||
def index_fill(self, dim, index, value):
|
||||
return IndexFill.apply(self, dim, index, value)
|
||||
return IndexFill(dim, value)(self, index)
|
||||
|
||||
def index_fill_(self, dim, index, value):
|
||||
return IndexFill.apply(self, dim, index, value, True)
|
||||
return IndexFill(dim, value, True)(self, index)
|
||||
|
||||
def index_select(self, dim, index):
|
||||
return IndexSelect.apply(self, dim, index)
|
||||
return IndexSelect(dim)(self, index)
|
||||
|
||||
def gather(self, dim, index):
|
||||
return Gather.apply(self, dim, index)
|
||||
return Gather(dim)(self, index)
|
||||
|
||||
def scatter(self, dim, index, source):
|
||||
return Scatter.apply(self, dim, index, source)
|
||||
return Scatter(dim)(self, index, source)
|
||||
|
||||
def scatter_(self, dim, index, source):
|
||||
return Scatter.apply(self, dim, index, source, True)
|
||||
|
||||
def scatter_add(self, dim, index, source):
|
||||
return ScatterAdd.apply(self, dim, index, source)
|
||||
|
||||
def scatter_add_(self, dim, index, source):
|
||||
return ScatterAdd.apply(self, dim, index, source, True)
|
||||
return Scatter(dim, True)(self, index, source)
|
||||
|
||||
def masked_copy(self, mask, variable):
|
||||
warnings.warn("masked_copy is deprecated and renamed to masked_scatter, and will be removed in v0.3")
|
||||
return MaskedScatter.apply(self, mask, variable)
|
||||
return MaskedCopy()(self, mask, variable)
|
||||
|
||||
def masked_copy_(self, mask, variable):
|
||||
warnings.warn("masked_copy_ is deprecated and renamed to masked_scatter_, and will be removed in v0.3")
|
||||
return MaskedScatter.apply(self, mask, variable, True)
|
||||
|
||||
def masked_scatter(self, mask, variable):
|
||||
return MaskedScatter.apply(self, mask, variable)
|
||||
|
||||
def masked_scatter_(self, mask, variable):
|
||||
return MaskedScatter.apply(self, mask, variable, True)
|
||||
return MaskedCopy(True)(self, mask, variable)
|
||||
|
||||
def masked_fill(self, mask, value):
|
||||
return MaskedFill.apply(self, mask, value)
|
||||
return MaskedFill(value)(self, mask)
|
||||
|
||||
def masked_fill_(self, mask, value):
|
||||
return MaskedFill.apply(self, mask, value, True)
|
||||
return MaskedFill(value, True)(self, mask)
|
||||
|
||||
def masked_select(self, mask):
|
||||
return MaskedSelect.apply(self, mask)
|
||||
return MaskedSelect()(self, mask)
|
||||
|
||||
def expand(self, *sizes):
|
||||
return Expand.apply(self, sizes)
|
||||
if isinstance(sizes[0], torch.Size):
|
||||
if len(sizes) > 1:
|
||||
raise ValueError("expand expects a several ints or a single "
|
||||
"torch.Size argument")
|
||||
sizes = sizes[0]
|
||||
return Expand(sizes)(self)
|
||||
|
||||
def expand_as(self, tensor):
|
||||
return Expand.apply(self, (tensor.size(),))
|
||||
return Expand(tensor.size())(self)
|
||||
|
||||
def t(self):
|
||||
if self.dim() != 2:
|
||||
raise RuntimeError("t() expects a 2D Variable, but self is {}D".format(self.dim()))
|
||||
return Transpose.apply(self, 0, 1)
|
||||
return Transpose(0, 1)(self)
|
||||
|
||||
def transpose(self, dim1, dim2):
|
||||
return Transpose.apply(self, dim1, dim2)
|
||||
return Transpose(dim1, dim2)(self)
|
||||
|
||||
def select(self, dim, _index):
|
||||
dim = dim if dim >= 0 else dim + self.dim()
|
||||
index = tuple(slice(None, None) for _ in range(dim)) + (_index,)
|
||||
return Index.apply(self, index)
|
||||
return Index(index)(self)
|
||||
|
||||
def narrow(self, dim, start_index, length):
|
||||
dim = dim if dim >= 0 else dim + self.dim()
|
||||
index = tuple(slice(None, None) for _ in range(dim)) + \
|
||||
(slice(start_index, start_index + length),)
|
||||
return Index.apply(self, index)
|
||||
|
||||
return Index(index)(self)
|
||||
|
||||
def chunk(self, num_chunks, dim=0):
|
||||
return Chunk.apply(self, num_chunks, dim)
|
||||
return Chunk(num_chunks, dim)(self)
|
||||
|
||||
def squeeze(self, dim=None):
|
||||
return Squeeze.apply(self, dim)
|
||||
|
||||
def squeeze_(self, dim=None):
|
||||
return Squeeze.apply(self, dim, True)
|
||||
return Squeeze(dim)(self)
|
||||
|
||||
def unsqueeze(self, dim):
|
||||
return Unsqueeze.apply(self, dim)
|
||||
return Unsqueeze(dim)(self)
|
||||
|
||||
def permute(self, *permutation):
|
||||
return Permute.apply(self, permutation)
|
||||
return Permute(permutation)(self)
|
||||
|
||||
def diag(self, diagonal=0):
|
||||
return Diag.apply(self, diagonal)
|
||||
def diag(self, diagonal_idx=0):
|
||||
return Diag(diagonal_idx)(self)
|
||||
|
||||
def tril(self, diagonal=0):
|
||||
return Tril.apply(self, diagonal)
|
||||
def tril(self, diagonal_idx=0):
|
||||
return Tril(diagonal_idx)(self)
|
||||
|
||||
def triu(self, diagonal=0):
|
||||
return Triu.apply(self, diagonal)
|
||||
def triu(self, diagonal_idx=0):
|
||||
return Triu(diagonal_idx)(self)
|
||||
|
||||
def trace(self):
|
||||
return Trace.apply(self)
|
||||
return Trace()(self)
|
||||
|
||||
def cross(self, other, dim=-1):
|
||||
return Cross.apply(self, other)
|
||||
return Cross(dim)(self, other)
|
||||
|
||||
def inverse(self):
|
||||
return Inverse.apply(self)
|
||||
|
||||
def gesv(self, a):
|
||||
return Gesv.apply(self, a)
|
||||
|
||||
def multinomial(self, num_samples=1, replacement=False):
|
||||
return Multinomial(num_samples, replacement)(self)
|
||||
def multinomial(self, num_samples=1, with_replacement=False):
|
||||
return Multinomial(num_samples, with_replacement)(self)
|
||||
|
||||
def bernoulli(self):
|
||||
return Bernoulli()(self)
|
||||
|
||||
def eq(self, other):
|
||||
if isinstance(other, Variable):
|
||||
return Eq()(self, other)
|
||||
assert not torch.is_tensor(other), "can't compare Variable and tensor"
|
||||
return Eq.apply(self, other)
|
||||
return Eq(other)(self)
|
||||
|
||||
def ne(self, other):
|
||||
if isinstance(other, Variable):
|
||||
return Ne()(self, other)
|
||||
assert not torch.is_tensor(other), "can't compare Variable and tensor"
|
||||
return Ne.apply(self, other)
|
||||
return Ne(other)(self)
|
||||
|
||||
def gt(self, other):
|
||||
if isinstance(other, Variable):
|
||||
return Gt()(self, other)
|
||||
assert not torch.is_tensor(other), "can't compare Variable and tensor"
|
||||
return Gt.apply(self, other)
|
||||
return Gt(other)(self)
|
||||
|
||||
def ge(self, other):
|
||||
if isinstance(other, Variable):
|
||||
return Ge()(self, other)
|
||||
assert not torch.is_tensor(other), "can't compare Variable and tensor"
|
||||
return Ge.apply(self, other)
|
||||
return Ge(other)(self)
|
||||
|
||||
def lt(self, other):
|
||||
if isinstance(other, Variable):
|
||||
return Lt()(self, other)
|
||||
assert not torch.is_tensor(other), "can't compare Variable and tensor"
|
||||
return Lt.apply(self, other)
|
||||
return Lt(other)(self)
|
||||
|
||||
def le(self, other):
|
||||
if isinstance(other, Variable):
|
||||
return Le()(self, other)
|
||||
assert not torch.is_tensor(other), "can't compare Variable and tensor"
|
||||
return Le.apply(self, other)
|
||||
return Le(other)(self)
|
||||
|
||||
def __add__(self, other):
|
||||
return self.add(other)
|
||||
@ -823,7 +755,7 @@ class Variable(_C._VariableBase):
|
||||
return self.sub_(other)
|
||||
|
||||
def __rsub__(self, other):
|
||||
return SubConstant.apply(other, self)
|
||||
return SubConstant(other, sub_tensor=True)(self)
|
||||
|
||||
def __mul__(self, other):
|
||||
return self.mul(other)
|
||||
@ -833,16 +765,28 @@ class Variable(_C._VariableBase):
|
||||
return self.mul_(other)
|
||||
|
||||
def __matmul__(self, other):
if not isinstance(other, Variable):
dim_self = self.dim()
try:
dim_other = other.dim()
except AttributeError:  # not a Variable
return NotImplemented
return self.matmul(other)
if dim_self == 1 and dim_other == 1:
return self.dot(other)
if dim_self == 2 and dim_other == 1:
return self.mv(other)
if dim_self == 1 and dim_other == 2:
return self.unsqueeze(0).mm(other).squeeze(0)
elif dim_self == 2 and dim_other == 2:
return self.mm(other)
raise ValueError("both arguments to __matmul__ need to be 1D or 2D, "
"but they are {}D and {}D".format(dim_self, dim_other))
|
||||
|
||||
def __div__(self, other):
|
||||
return self.div(other)
|
||||
__truediv__ = __div__
|
||||
|
||||
def __rdiv__(self, other):
|
||||
return DivConstant.apply(other, self)
|
||||
return DivConstant(other, div_by_tensor=True)(self)
|
||||
__rtruediv__ = __rdiv__
|
||||
|
||||
def __idiv__(self, other):
|
||||
@ -855,10 +799,10 @@ class Variable(_C._VariableBase):
|
||||
raise NotImplementedError("in-place pow not implemented")
|
||||
|
||||
def __rpow__(self, other):
|
||||
return PowConstant.apply(other, self)
|
||||
return PowConstant(other, tensor_power=True)(self)
|
||||
|
||||
def __neg__(self):
|
||||
return Negate.apply(self)
|
||||
return Negate()(self)
|
||||
|
||||
def __len__(self):
|
||||
return len(self.data)
|
||||
@ -894,7 +838,7 @@ class Variable(_C._VariableBase):
|
||||
|
||||
@staticmethod
|
||||
def cat(iterable, dim=0):
|
||||
return Concat.apply(dim, *iterable)
|
||||
return Concat(dim)(*iterable)
|
||||
|
||||
@staticmethod
|
||||
def normal(means, std=1):
|
||||
@ -917,7 +861,7 @@ class Variable(_C._VariableBase):
|
||||
tensors = args[1:]
|
||||
else:
|
||||
tensors = args
|
||||
return cls.apply(*(tensors + (alpha, beta, inplace)))
|
||||
return cls(alpha, beta, inplace)(*tensors)
|
||||
|
||||
@classmethod
|
||||
def addmm(cls, *args):
|
||||
@ -951,6 +895,5 @@ for method in dir(Variable):
|
||||
setattr(Variable._torch, method, as_static)
|
||||
|
||||
|
||||
from ._functions import *
|
||||
from torch._C import _ImperativeEngine as ImperativeEngine
|
||||
from .engine import ImperativeEngine
|
||||
Variable._execution_engine = ImperativeEngine()
|
||||
|
||||
@ -17,12 +17,6 @@ def _libcudnn():
|
||||
if hasattr(lib, 'cudnnGetErrorString'):
|
||||
lib.cudnnGetErrorString.restype = ctypes.c_char_p
|
||||
__cudnn_version = lib.cudnnGetVersion()
|
||||
compile_version = torch._C._cudnn_version()
|
||||
# Check that cuDNN major and minor versions match
|
||||
if (__cudnn_version // 100) != (compile_version // 100):
|
||||
raise RuntimeError(
|
||||
'cuDNN version mismatch: PyTorch was compiled against {} '
|
||||
'but linked against {}'.format(compile_version, __cudnn_version))
|
||||
else:
|
||||
lib = None
|
||||
return lib
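The comparison above keeps only the major/minor part of the version because cuDNN encodes versions as `major*1000 + minor*100 + patch` (e.g., 6.0.21 becomes 6021), so integer-dividing by 100 drops the patch level. A small arithmetic sketch (the version numbers here are made up):

```python
compile_version, linked_version = 6021, 6102
compile_version // 100   # 60 -> cuDNN 6.0
linked_version // 100    # 61 -> cuDNN 6.1, so the check above would raise
```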
|
||||
|
||||
@ -163,9 +163,9 @@ def get_parameters(fn, handle, weight_buf):
|
||||
# might as well merge the CUDNN ones into a single tensor as well
|
||||
if linear_id == 0 or linear_id == num_linear_layers / 2:
|
||||
assert filter_dim_a.prod() == filter_dim_a[0]
|
||||
size = (filter_dim_a[0] * num_linear_layers // 2, filter_dim_a[2])
|
||||
param = fn.weight_buf.new().set_(
|
||||
weight_buf.storage(), offset, size)
|
||||
weight_buf.storage(), offset,
|
||||
filter_dim_a[0] * num_linear_layers // 2, filter_dim_a[2])
|
||||
layer_params.append(param)
|
||||
else:
|
||||
assert cur_offset == offset
|
||||
@ -178,13 +178,10 @@ def get_parameters(fn, handle, weight_buf):
|
||||
|
||||
|
||||
def _copyParams(params_from, params_to):
|
||||
assert len(params_from) == len(params_to)
|
||||
for layer_params_from, layer_params_to in zip(params_from, params_to):
|
||||
# NOTE: these lists have all weights before all biases, so if the layer doesn't
|
||||
# use biases, zip will terminate once layer_params_from ends and ignore them.
|
||||
for param_from, param_to in zip(layer_params_from, layer_params_to):
|
||||
assert param_from.type() == param_to.type()
|
||||
param_to.copy_(param_from, broadcast=False)
|
||||
param_to.copy_(param_from)
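The inner `zip` in `_copyParams` above is what makes the missing-bias case work: `zip` stops at the shorter sequence, so unmatched entries are silently skipped. A one-line illustration (parameter names are hypothetical):

```python
# zip truncates to the shorter list, so unmatched entries are ignored.
list(zip(['w_ih', 'w_hh', 'b_ih', 'b_hh'], ['w_ih', 'w_hh']))
# -> [('w_ih', 'w_ih'), ('w_hh', 'w_hh')]
```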
|
||||
|
||||
|
||||
def forward(fn, input, hx, weight, output, hy):
|
||||
@ -245,21 +242,17 @@ def forward(fn, input, hx, weight, output, hy):
|
||||
fn.cy_desc = cudnn.descriptor(cx) if cx is not None else None
|
||||
|
||||
# create the weight buffer and copy the weights into it
|
||||
if fn.weight_buf is None:
|
||||
num_weights = get_num_weights(
|
||||
handle, fn.rnn_desc, fn.x_descs[0], fn.datatype)
|
||||
fn.weight_buf = x.new(num_weights)
|
||||
fn.w_desc = init_weight_descriptor(fn, fn.weight_buf)
|
||||
w = fn.weight_buf
|
||||
# this zero might not seem necessary, but it is in the case
|
||||
# where biases are disabled; then they won't be copied and must be zero'd.
|
||||
# Alternatively, _copyParams could be written more carefully.
|
||||
w.zero_()
|
||||
params = get_parameters(fn, handle, w)
|
||||
_copyParams(weight, params)
|
||||
else:
|
||||
fn.w_desc = init_weight_descriptor(fn, fn.weight_buf)
|
||||
w = fn.weight_buf
|
||||
num_weights = get_num_weights(
|
||||
handle, fn.rnn_desc, fn.x_descs[0], fn.datatype)
|
||||
fn.weight_buf = x.new(num_weights)
|
||||
fn.w_desc = init_weight_descriptor(fn, fn.weight_buf)
|
||||
w = fn.weight_buf
|
||||
# this zero might not seem necessary, but it is in the case
|
||||
# where biases are disabled; then they won't be copied and must be zero'd.
|
||||
# Alternatively, _copyParams could be written more carefully.
|
||||
w.zero_()
|
||||
params = get_parameters(fn, handle, w)
|
||||
_copyParams(weight, params)
|
||||
|
||||
if tuple(hx.size()) != hidden_size:
|
||||
raise RuntimeError('Expected hidden size {}, got {}'.format(
|
||||
@ -276,9 +269,7 @@ def forward(fn, input, hx, weight, output, hy):
|
||||
fn.x_descs,
|
||||
ctypes.byref(workspace_size)
|
||||
))
|
||||
fn.workspace_size = workspace_size.value
|
||||
with torch.cuda.device_of(input):
|
||||
workspace = torch.cuda.ByteTensor(fn.workspace_size)
|
||||
fn.workspace = torch.cuda.ByteTensor(workspace_size.value)
|
||||
if fn.requires_grad:
|
||||
reserve_size = ctypes.c_long()
|
||||
check_error(lib.cudnnGetRNNTrainingReserveSize(
|
||||
@ -301,7 +292,7 @@ def forward(fn, input, hx, weight, output, hy):
|
||||
fn.y_descs, ctypes.c_void_p(y.data_ptr()),
|
||||
fn.hy_desc, ctypes.c_void_p(hy.data_ptr()),
|
||||
fn.cy_desc, ctypes.c_void_p(cy.data_ptr()) if cx is not None else None,
|
||||
ctypes.c_void_p(workspace.data_ptr()), workspace.size(0),
|
||||
ctypes.c_void_p(fn.workspace.data_ptr()), fn.workspace.size(0),
|
||||
ctypes.c_void_p(fn.reserve.data_ptr()), fn.reserve.size(0)
|
||||
))
|
||||
else: # inference
|
||||
@ -316,7 +307,7 @@ def forward(fn, input, hx, weight, output, hy):
|
||||
fn.y_descs, ctypes.c_void_p(y.data_ptr()),
|
||||
fn.hy_desc, ctypes.c_void_p(hy.data_ptr()),
|
||||
fn.cy_desc, ctypes.c_void_p(cy.data_ptr()) if cx is not None else None,
|
||||
ctypes.c_void_p(workspace.data_ptr()), workspace.size(0)
|
||||
ctypes.c_void_p(fn.workspace.data_ptr()), fn.workspace.size(0)
|
||||
))
|
||||
|
||||
if fn.batch_first and not is_input_packed:
|
||||
@ -381,8 +372,6 @@ def backward_grad(fn, input, hx, weight, output, grad_output, grad_hy, grad_inpu
|
||||
if not dhy.is_cuda or not dy.is_cuda or (dcy is not None and not dcy.is_cuda):
|
||||
raise RuntimeError('Gradients aren\'t CUDA tensors')
|
||||
|
||||
with torch.cuda.device_of(input):
|
||||
workspace = torch.cuda.ByteTensor(fn.workspace_size)
|
||||
check_error(cudnn.lib.cudnnRNNBackwardData(
|
||||
handle,
|
||||
fn.rnn_desc,
|
||||
@ -397,7 +386,7 @@ def backward_grad(fn, input, hx, weight, output, grad_output, grad_hy, grad_inpu
|
||||
fn.x_descs, ctypes.c_void_p(dx.data_ptr()),
|
||||
fn.hx_desc, ctypes.c_void_p(dhx.data_ptr()),
|
||||
fn.cx_desc, ctypes.c_void_p(dcx.data_ptr()) if cx is not None else None,
|
||||
ctypes.c_void_p(workspace.data_ptr()), workspace.size(0),
|
||||
ctypes.c_void_p(fn.workspace.data_ptr()), fn.workspace.size(0),
|
||||
ctypes.c_void_p(fn.reserve.data_ptr()), fn.reserve.size(0)
|
||||
))
|
||||
|
||||
@ -450,8 +439,6 @@ def backward_weight(fn, input, hx, output, weight, grad_weight):
|
||||
y = output
|
||||
dw = fn.weight_buf.new().resize_as_(fn.weight_buf).zero_()
|
||||
|
||||
with torch.cuda.device_of(input):
|
||||
workspace = torch.cuda.ByteTensor(fn.workspace_size)
|
||||
check_error(cudnn.lib.cudnnRNNBackwardWeights(
|
||||
handle,
|
||||
fn.rnn_desc,
|
||||
@ -459,7 +446,7 @@ def backward_weight(fn, input, hx, output, weight, grad_weight):
|
||||
fn.x_descs, ctypes.c_void_p(x.data_ptr()),
|
||||
fn.hx_desc, ctypes.c_void_p(hx.data_ptr()),
|
||||
fn.y_descs, ctypes.c_void_p(y.data_ptr()),
|
||||
ctypes.c_void_p(workspace.data_ptr()), workspace.size(0),
|
||||
ctypes.c_void_p(fn.workspace.data_ptr()), fn.workspace.size(0),
|
||||
fn.w_desc, ctypes.c_void_p(dw.data_ptr()),
|
||||
ctypes.c_void_p(fn.reserve.data_ptr()), fn.reserve.size(0)
|
||||
))
|
||||
|
||||
@ -58,53 +58,18 @@ static std::unordered_map<std::string, Type> type_names = {
|
||||
{"Int", Type::INT},
|
||||
{"Long", Type::LONG},
|
||||
};
|
||||
|
||||
static std::unordered_map<std::string, at::ScalarType> attype_names = {
|
||||
{"Float", at::kFloat},
|
||||
{"Double", at::kDouble},
|
||||
{"Half", at::kHalf},
|
||||
{"Byte", at::kByte},
|
||||
{"Char", at::kChar},
|
||||
{"Short", at::kShort},
|
||||
{"Int", at::kInt},
|
||||
{"Long", at::kLong},
|
||||
};
|
||||
static std::unordered_map<PyTypeObject*, TensorType> pytype_to_tensortype;
|
||||
static std::unordered_map<TensorType, PyTypeObject*, TensorTypeHasher> tensortype_to_pytype;
|
||||
|
||||
static std::unordered_map<PyTypeObject*, at::Type*> pytype_to_attype;
|
||||
static std::unordered_map<at::Type*, PyTypeObject*> attype_to_pytype;
|
||||
|
||||
void registerPyTypeObject(PyTypeObject *pytype, const std::string& name, bool is_cuda, bool is_sparse)
|
||||
{
|
||||
TensorType type;
|
||||
at::Backend device;
|
||||
if(is_cuda) {
|
||||
if(is_sparse){
|
||||
device = at::kSparseCUDA;
|
||||
} else {
|
||||
device = at::kCUDA;
|
||||
}
|
||||
} else {
|
||||
if(is_sparse){
|
||||
device = at::kSparseCPU;
|
||||
} else {
|
||||
device = at::kCPU;
|
||||
}
|
||||
}
|
||||
|
||||
type.data_type = type_names.at(name);
|
||||
type.is_cuda = is_cuda;
|
||||
type.is_sparse = is_sparse;
|
||||
|
||||
pytype_to_tensortype[pytype] = type;
|
||||
tensortype_to_pytype[type] = pytype;
|
||||
|
||||
if(!(is_sparse && name == "Half")) {
|
||||
at::Type * attype = &at::getType(device,attype_names.at(name));
|
||||
pytype_to_attype[pytype] = attype;
|
||||
attype_to_pytype[attype] = pytype;
|
||||
}
|
||||
}
|
||||
|
||||
PyTypeObject* getPyTypeObject(const thpp::Tensor& tensor)
|
||||
@ -116,12 +81,6 @@ PyTypeObject* getPyTypeObject(const thpp::Tensor& tensor)
|
||||
|
||||
return tensortype_to_pytype.at(type);
|
||||
}
|
||||
PyTypeObject* getPyTypeObject(const at::Tensor& tensor)
|
||||
{
|
||||
if(attype_to_pytype.count(&tensor.type()) == 0)
|
||||
throw std::invalid_argument("unsupported Tensor type.");
|
||||
return attype_to_pytype.at(&tensor.type());
|
||||
}
|
||||
|
||||
static std::unique_ptr<Tensor> createTensor(void *tensor, Type type, bool is_cuda, bool is_sparse)
|
||||
{
|
||||
@ -208,22 +167,6 @@ std::unique_ptr<Tensor> createTensor(PyObject *data)
|
||||
wrapper->retain();
|
||||
return wrapper;
|
||||
}
|
||||
//rename to createTensor when THPP is removed
|
||||
at::Tensor createTensorAT(PyObject *data)
|
||||
{
|
||||
auto tensor_type = pytype_to_attype.at(Py_TYPE(data));
|
||||
auto tensor = ((THPVoidTensor *)data)->cdata;
|
||||
return tensor_type->unsafeTensorFromTH(tensor, true);
|
||||
}
|
||||
PyObject* createPyObject(at::Tensor tensor)
|
||||
{
|
||||
auto type = getPyTypeObject(tensor);
|
||||
PyObject *obj = type->tp_alloc(type, 0);
|
||||
if (obj) {
|
||||
((THPVoidTensor*)obj)->cdata = (THVoidTensor *)tensor.detach()->unsafeGetTH(true);
|
||||
}
|
||||
return obj;
|
||||
}
|
||||
|
||||
PyObject* createPyObject(const thpp::Tensor& tensor)
|
||||
{
|
||||
|
||||
@ -2,10 +2,9 @@
|
||||
|
||||
// Provides conversions between Python tensor objects and thpp::Tensors.
|
||||
|
||||
#include <Python.h>
|
||||
#include <memory>
|
||||
#include <Python.h>
|
||||
#include <THPP/THPP.h>
|
||||
#include <ATen/ATen.h>
|
||||
|
||||
namespace torch {
|
||||
|
||||
@ -23,9 +22,4 @@ std::unique_ptr<thpp::Tensor> createTensor(PyObject *data);
|
||||
// Creates Python tensor object from a Tensor
|
||||
PyObject* createPyObject(const thpp::Tensor& tensor);
|
||||
|
||||
PyObject* createPyObject(at::Tensor tensor);
|
||||
PyTypeObject* getPyTypeObject(const at::Tensor& tensor);
|
||||
//rename to createPyObject when THPP is removed
|
||||
at::Tensor createTensorAT(PyObject *data);
|
||||
|
||||
} // namespace torch
|
||||
|
||||
@ -48,7 +48,6 @@ struct python_error : public std::exception {
|
||||
|
||||
/** Sets the current Python error from this exception */
|
||||
inline void restore() {
|
||||
if (!type) return;
|
||||
// PyErr_Restore steals references
|
||||
AutoGIL gil;
|
||||
Py_XINCREF(type);
|
||||
@ -64,6 +63,22 @@ struct python_error : public std::exception {
|
||||
|
||||
#ifdef _THP_CORE
|
||||
|
||||
struct THException: public std::exception {
|
||||
THException(const char* msg): msg(msg) {};
|
||||
|
||||
virtual const char* what() const throw() {
|
||||
return msg.c_str();
|
||||
}
|
||||
|
||||
std::string msg;
|
||||
};
|
||||
|
||||
struct THArgException: public THException {
|
||||
THArgException(const char* msg, int argNumber): THException(msg), argNumber(argNumber) {};
|
||||
|
||||
const int argNumber;
|
||||
};
|
||||
|
||||
bool THPException_init(PyObject *module);
|
||||
#endif
|
||||
|
||||
|
||||
@ -33,7 +33,7 @@ static PyObject * THPGenerator_pynew(PyTypeObject *type, PyObject *args, PyObjec
|
||||
THPUtils_setError("torch.Generator constructor doesn't accept any arguments");
|
||||
return NULL;
|
||||
}
|
||||
THPGeneratorPtr self((THPGenerator *)type->tp_alloc(type, 0));
|
||||
THPGeneratorPtr self = (THPGenerator *)type->tp_alloc(type, 0);
|
||||
self->cdata = THGenerator_new();
|
||||
|
||||
return (PyObject*)self.release();
|
||||
@ -44,7 +44,7 @@ static PyObject * THPGenerator_getState(THPGenerator *self)
|
||||
{
|
||||
HANDLE_TH_ERRORS
|
||||
THGenerator *generator = self->cdata;
|
||||
THPByteTensorPtr res((THPByteTensor *)THPByteTensor_NewEmpty());
|
||||
THPByteTensorPtr res = (THPByteTensor *)THPByteTensor_NewEmpty();
|
||||
if (!res) return NULL;
|
||||
THByteTensor_getRNGState(generator, res->cdata);
|
||||
return (PyObject *)res.release();
|
||||
|
||||
@ -6,7 +6,6 @@
|
||||
#include <unordered_map>
|
||||
#include <libshm.h>
|
||||
#include <TH/TH.h>
|
||||
#include <ATen/ATen.h>
|
||||
|
||||
#include "torch/csrc/utils/python_strings.h"
|
||||
|
||||
@ -64,7 +63,7 @@ static PyObject * THPModule_initNames(PyObject *self, PyObject *arg)
|
||||
{
|
||||
static std::vector<std::string> names;
|
||||
|
||||
THPObjectPtr types(PySequence_Fast(arg, "expected a sequence"));
|
||||
THPObjectPtr types = PySequence_Fast(arg, "expected a sequence");
|
||||
if (!types) return NULL;
|
||||
|
||||
int num_classes = PySequence_Fast_GET_SIZE(types.get());
|
||||
@ -74,7 +73,7 @@ static PyObject * THPModule_initNames(PyObject *self, PyObject *arg)
|
||||
THPUtils_assert(PyType_Check(obj), "expected a PyTypeObject");
|
||||
PyTypeObject* type = (PyTypeObject*)obj;
|
||||
|
||||
THPObjectPtr module_name(PyObject_GetAttrString(obj, "__module__"));
|
||||
THPObjectPtr module_name = PyObject_GetAttrString(obj, "__module__");
|
||||
if (!module_name) return NULL;
|
||||
THPUtils_assert(THPUtils_checkString(module_name.get()),
|
||||
"expected __module__ to be a string");
|
||||
@ -214,7 +213,6 @@ dispatch: \
|
||||
IMPLEMENT_STATELESS(sigmoid)
|
||||
IMPLEMENT_STATELESS(log)
|
||||
IMPLEMENT_STATELESS(log1p)
|
||||
IMPLEMENT_STATELESS(lgamma)
|
||||
IMPLEMENT_STATELESS(exp)
|
||||
IMPLEMENT_STATELESS(cos)
|
||||
IMPLEMENT_STATELESS(acos)
|
||||
@ -467,64 +465,6 @@ PyObject *THPModule_addDocStr(PyObject *_unused, PyObject *args)
|
||||
Py_RETURN_NONE;
|
||||
}
|
||||
|
||||
|
||||
PyObject *THPModule_inferSize(PyObject *_unused, PyObject *args)
|
||||
{
|
||||
HANDLE_TH_ERRORS
|
||||
Py_ssize_t num_args = args ? PyTuple_Size(args) : 0;
|
||||
THPUtils_assert(num_args == 2, "expected exactly 2 arguments");
|
||||
PyObject *arg1 = PyTuple_GET_ITEM(args, 0);
|
||||
THPUtils_assert(THPSize_Check(arg1), "expected a torch.Size as argument 1");
|
||||
PyObject *arg2 = PyTuple_GET_ITEM(args, 1);
|
||||
THPUtils_assert(THPSize_Check(arg2), "expected a torch.Size as argument 2");
|
||||
|
||||
THLongStoragePtr size1_guard = THPUtils_unpackSize(arg1);
|
||||
THLongStorage *size1 = size1_guard.get();
|
||||
THLongStoragePtr size2_guard = THPUtils_unpackSize(arg2);
|
||||
THLongStorage *size2 = size2_guard.get();
|
||||
THLongStoragePtr sizes_guard(THLongStorage_new());
|
||||
THLongStorage *sizes = sizes_guard.get();
|
||||
|
||||
char error_buffer[1024];
|
||||
int ret = THLongStorage_inferSize2(sizes, size1->data, size1->size, size2->data, size2->size, error_buffer, 1024);
|
||||
THPUtils_assert(ret == 0, error_buffer);
|
||||
return THPSize_New(sizes->size, sizes->data);
|
||||
END_HANDLE_TH_ERRORS
|
||||
}
|
||||
|
||||
static PyObject *THPModule_setBackcompatBroadcastWarn(PyObject *module, PyObject *arg) {
|
||||
THPUtils_assert(PyBool_Check(arg), "set_backcompat_broadcast_warn expects a bool, "
|
||||
"but got %s", THPUtils_typename(arg));
|
||||
setBackCompatBroadcastWarn(arg == Py_True);
|
||||
Py_RETURN_NONE;
|
||||
}
|
||||
|
||||
static PyObject *THPModule_getBackcompatBroadcastWarn(PyObject *module)
|
||||
{
|
||||
return getBackCompatBroadcastWarn() ? Py_True : Py_False;
|
||||
}
|
||||
|
||||
static PyObject *THPModule_setBackcompatKeepdimWarn(PyObject *module, PyObject *arg) {
|
||||
THPUtils_assert(PyBool_Check(arg), "set_backcompat_keepdim_warn expects a bool, "
|
||||
"but got %s", THPUtils_typename(arg));
|
||||
setBackCompatKeepdimWarn(arg == Py_True);
|
||||
Py_RETURN_NONE;
|
||||
}
|
||||
|
||||
static PyObject *THPModule_getBackcompatKeepdimWarn(PyObject *module)
|
||||
{
|
||||
return getBackCompatKeepdimWarn() ? Py_True : Py_False;
|
||||
}
|
||||
|
||||
PyObject *THPModule_hasDistributed(PyObject *_unused)
|
||||
{
|
||||
#ifdef WITH_DISTRIBUTED
|
||||
Py_RETURN_TRUE;
|
||||
#else
|
||||
Py_RETURN_FALSE;
|
||||
#endif
|
||||
}
|
||||
|
||||
#ifdef WITH_CUDA
|
||||
extern PyObject * THCPModule_initExtension(PyObject *self);
|
||||
extern PyObject * THCPModule_setDevice_wrap(PyObject *self, PyObject *arg);
|
||||
@ -557,7 +497,6 @@ static PyMethodDef TorchMethods[] = {
|
||||
{"_add_docstr", (PyCFunction)THPModule_addDocStr, METH_VARARGS, NULL},
|
||||
{"_sparse_init", (PyCFunction)THSPModule_initExtension, METH_NOARGS, NULL},
|
||||
{"_init_names", (PyCFunction)THPModule_initNames, METH_O, NULL},
|
||||
{"_has_distributed",(PyCFunction)THPModule_hasDistributed, METH_NOARGS, NULL},
|
||||
#ifdef WITH_CUDA
|
||||
{"_cuda_init", (PyCFunction)THCPModule_initExtension, METH_NOARGS, NULL},
|
||||
{"_cuda_setDevice", (PyCFunction)THCPModule_setDevice_wrap, METH_O, NULL},
|
||||
@ -584,11 +523,6 @@ static PyMethodDef TorchMethods[] = {
|
||||
#endif
|
||||
{"_safe_call", (PyCFunction)THPModule_safeCall, METH_VARARGS | METH_KEYWORDS, NULL},
|
||||
{"_set_default_tensor_type", (PyCFunction)THPModule_setDefaultTensorType, METH_O, NULL},
|
||||
{"_infer_size", (PyCFunction)THPModule_inferSize, METH_VARARGS, NULL},
|
||||
{"_set_backcompat_broadcast_warn", (PyCFunction)THPModule_setBackcompatBroadcastWarn, METH_O, NULL},
|
||||
{"_get_backcompat_broadcast_warn", (PyCFunction)THPModule_getBackcompatBroadcastWarn, METH_NOARGS, NULL},
|
||||
{"_set_backcompat_keepdim_warn", (PyCFunction)THPModule_setBackcompatKeepdimWarn, METH_O, NULL},
|
||||
{"_get_backcompat_keepdim_warn", (PyCFunction)THPModule_getBackcompatKeepdimWarn, METH_NOARGS, NULL},
|
||||
{"get_num_threads", (PyCFunction)THPModule_getNumThreads, METH_NOARGS, NULL},
|
||||
{"set_num_threads", (PyCFunction)THPModule_setNumThreads, METH_O, NULL},
|
||||
{"from_numpy", (PyCFunction)THPModule_fromNumpy, METH_O, NULL},
|
||||
@ -596,7 +530,6 @@ static PyMethodDef TorchMethods[] = {
|
||||
{"sigmoid", (PyCFunction)THPModule_sigmoid, METH_VARARGS | METH_KEYWORDS, NULL},
|
||||
{"log", (PyCFunction)THPModule_log, METH_VARARGS | METH_KEYWORDS, NULL},
|
||||
{"log1p", (PyCFunction)THPModule_log1p, METH_VARARGS | METH_KEYWORDS, NULL},
|
||||
{"lgamma", (PyCFunction)THPModule_lgamma, METH_VARARGS | METH_KEYWORDS, NULL},
|
||||
{"exp", (PyCFunction)THPModule_exp, METH_VARARGS | METH_KEYWORDS, NULL},
|
||||
{"cos", (PyCFunction)THPModule_cos, METH_VARARGS | METH_KEYWORDS, NULL},
|
||||
{"acos", (PyCFunction)THPModule_acos, METH_VARARGS | METH_KEYWORDS, NULL},
|
||||
@ -719,6 +652,22 @@ static PyMethodDef TorchMethods[] = {
|
||||
{NULL, NULL, 0, NULL}
|
||||
};
|
||||
|
||||
static void errorHandler(const char *msg, void *data)
|
||||
{
|
||||
throw THException(msg);
|
||||
}
|
||||
|
||||
static void errorHandlerArg(int argNumber, const char *msg, void *data)
|
||||
{
|
||||
throw THArgException(msg, argNumber);
|
||||
}
|
||||
|
||||
static void updateErrorHandlers()
|
||||
{
|
||||
THSetDefaultErrorHandler(errorHandler, NULL);
|
||||
THSetDefaultArgErrorHandler(errorHandlerArg, NULL);
|
||||
}
|
||||
|
||||
bool THCPDoubleStorage_init(PyObject *module);
|
||||
bool THCPFloatStorage_init(PyObject *module);
|
||||
bool THCPHalfStorage_init(PyObject *module);
|
||||
@ -778,7 +727,6 @@ PyMODINIT_FUNC init_C()
|
||||
PyMODINIT_FUNC PyInit__C()
|
||||
#endif
|
||||
{
|
||||
THInferNumThreads();
|
||||
|
||||
#if PY_MAJOR_VERSION == 2
|
||||
#define ASSERT_TRUE(cmd) if (!(cmd)) {PyErr_SetString(PyExc_ImportError, "initialization error"); return;}
|
||||
@ -883,7 +831,8 @@ PyMODINIT_FUNC PyInit__C()
|
||||
Py_INCREF(has_cudnn);
|
||||
ASSERT_TRUE(PyModule_AddObject(module, "has_cudnn", has_cudnn) == 0);
|
||||
|
||||
#ifdef WITH_DISTRIBUTED_MW
|
||||
// TODO THD: enable once master-worker mode is implemented
|
||||
#if 0 && defined(WITH_DISTRIBUTED)
|
||||
// See comment on CUDA objects
|
||||
ASSERT_TRUE(THDPDoubleStorage_init(module));
|
||||
ASSERT_TRUE(THDPFloatStorage_init(module));
|
||||
@ -908,9 +857,7 @@ PyMODINIT_FUNC PyInit__C()
|
||||
ASSERT_TRUE(THPDefaultGenerator != nullptr);
|
||||
ASSERT_TRUE(PyModule_AddObject(module, "default_generator", (PyObject*)THPDefaultGenerator) == 0);
|
||||
|
||||
// force ATen to initialize because it handles
|
||||
// setting up TH Errors so that they throw C++ exceptions
|
||||
at::init();
|
||||
updateErrorHandlers();
|
||||
|
||||
#ifdef WITH_NUMPY
|
||||
import_array();
|
||||
|
||||
@ -1,100 +0,0 @@
|
||||
# csrc

The csrc directory contains all of the code concerned with integration
with Python. This is in contrast to lib, which contains the Torch
libraries that are Python agnostic. csrc depends on lib, but not vice
versa.

There are a number of utilities for easing integration with Python which
are worth knowing about, and which we briefly describe here. But first, the most
important gotchas:

* DO NOT forget to take out the GIL with `AutoGil` before calling the Python
  API or bringing a `THPObjectPtr` into scope.

* Make sure you include `Python.h` first in your header files, before
  any system headers; otherwise, you will get an `error: "_XOPEN_SOURCE" redefined`
  error. If you pay attention to warnings, you will see where you need to
  do this.

## Notes

### Note [Storage is not NULL]

Historically, Torch supported NULL storage, as a minor optimization to
avoid having to allocate a storage object when it would be empty.
However, this is actually a confusing special case to deal with, so
by and large, PyTorch assumes that, in fact, storage is never NULL.

One important case where this assumption matters is when tracking
the CUDA device a tensor is stored in: this information is stored
solely in the storage, so if a storage is NULL, we lose this information.

Although storage is never NULL, the data field of THStorage may be NULL. This
mostly occurs when we want to pre-allocate an output tensor struct, but then
have it be resized and filled with data by some operator: there's no point in
allocating data for it in this case!

## Files

### `Exceptions.h`

Frequently when working with the Python API, you may call a function
which returns an error. In this case, we want to return directly to the
Python interpreter, so that this exception can be propagated
accordingly; however, because the Python API is C-based, what actually
will happen is it will return control to whatever C++ code called it.
Similarly, if we raise a C++ exception, prior to returning to the Python
interpreter, we must set the Python error flags, so that it turns into a
Python exception.

Exceptions.h defines some useful helpers: `HANDLE_TH_ERRORS`, `END_HANDLE_TH_ERRORS`
and an exception class `python_error`. You use them like this:

```
// Entry point from Python interpreter
PyObject* run() {
  HANDLE_TH_ERRORS
  ...
  if (!x) throw python_error();
  ...
  END_HANDLE_TH_ERRORS
}
```

The `HANDLE_TH_ERRORS` macro will catch all exceptions and convert them
into an appropriate Python signal. `python_error` is a special
exception which doesn't carry any info; instead it says, "An error
occurred in the Python API; if you return to the interpreter, Python
will raise that exception, nothing else needs to be done."

### `utils/auto_gil.h`

Whenever you make any calls to the Python API, you must have taken out
the Python GIL, as none of these calls are thread safe. `AutoGIL` is
a RAII struct which handles taking and releasing the GIL. Use it like
this:

```
void iWantToUsePython() {
  AutoGil gil;
  ...
}
```

In general, the compiler will NOT warn you if you use Python
functionality without taking out the GIL, so DO NOT FORGET this call.

### `utils/object_ptr.h`

`THPPointer` is a smart pointer class analogous to `std::shared_ptr`,
but which is overloaded to handle the reference counting schemes of various
objects which are not based on `shared_ptr`. The most important overloads are:

* `PyObject` (so important we've aliased it as `THPObjectPtr`), which
  hooks into Python reference counting. (By the way, that means you
  MUST take out the GIL before bringing one of these into scope!)

* The various TH tensor and storage types (e.g., `THTensor`), which
  hook into TH's reference counting. (TH's reference counting
  IS thread safe, no locks necessary.)
|
||||
@ -25,7 +25,7 @@ PyObject * THPSize_New(int dim, long *sizes)
|
||||
|
||||
static PyObject * THPSize_pynew(PyTypeObject *type, PyObject *args, PyObject *kwargs)
|
||||
{
|
||||
THPObjectPtr self(PyTuple_Type.tp_new(type, args, kwargs));
|
||||
THPObjectPtr self = PyTuple_Type.tp_new(type, args, kwargs);
|
||||
if (self) {
|
||||
for (Py_ssize_t i = 0; i < PyTuple_Size(self); ++i) {
|
||||
PyObject *item = PyTuple_GET_ITEM(self.get(), i);
|
||||
@ -56,12 +56,13 @@ extern PyTypeObject THPSizeType;
|
||||
template<typename FnType, FnType fn, typename ...Args>
|
||||
static PyObject* wrap_tuple_fn(Args ... args)
|
||||
{
|
||||
THPObjectPtr result((*fn)(std::forward<Args>(args)...));
|
||||
PyObject *result = (*fn)(std::forward<Args>(args)...);
|
||||
if (!result) return NULL;
|
||||
if (PyTuple_Check(result.get())) {
|
||||
return PyObject_CallFunctionObjArgs((PyObject*)&THPSizeType, result.get(), NULL);
|
||||
if (PyTuple_Check(result)) {
|
||||
return PyObject_CallFunctionObjArgs((PyObject*)&THPSizeType, result, NULL);
|
||||
}
|
||||
return result.release();
|
||||
Py_INCREF(result);
|
||||
return result;
|
||||
}
|
||||
|
||||
static auto sq_concat = PyTuple_Type.tp_as_sequence->sq_concat;
|
||||
|
||||
@ -1,14 +1,12 @@
|
||||
#ifndef THP_H
|
||||
#define THP_H
|
||||
|
||||
#include <Python.h>
|
||||
#include <stdbool.h>
|
||||
#include <TH/TH.h>
|
||||
#include <THS/THS.h>
|
||||
|
||||
// Back-compatibility macros, Thanks to http://cx-oracle.sourceforge.net/
|
||||
// define PyInt_* macros for Python 3.x. NB: We must include Python.h first,
|
||||
// otherwise we'll incorrectly conclude PyInt_Check isn't defined!
|
||||
// define PyInt_* macros for Python 3.x
|
||||
#ifndef PyInt_Check
|
||||
#define PyInt_Check PyLong_Check
|
||||
#define PyInt_FromLong PyLong_FromLong
|
||||
@ -20,7 +18,6 @@
|
||||
#define LIBRARY_STATE
|
||||
#define LIBRARY_STATE_NOARGS
|
||||
#define LIBRARY_STATE_TYPE
|
||||
#define LIBRARY_STATE_TYPE_NOARGS
|
||||
|
||||
#define THP_API extern "C"
|
||||
|
||||
|
||||
@ -9,8 +9,12 @@
|
||||
#include <tuple>
|
||||
#include <TH/THMath.h>
|
||||
|
||||
#include "torch/csrc/THP.h"
|
||||
#include "torch/csrc/copy_utils.h"
|
||||
#include "torch/csrc/DynamicTypes.h"
|
||||
#include "THP.h"
|
||||
#include "copy_utils.h"
|
||||
#include "DynamicTypes.h"
|
||||
|
||||
//generic_include TH torch/csrc/generic/Tensor.cpp
|
||||
#include "generic/Tensor.cpp"
|
||||
#include <TH/THGenerateAllTypes.h>
|
||||
|
||||
#include "generic/Tensor.cpp"
|
||||
#include <TH/THGenerateHalfType.h>
|
||||
|
||||
@ -29,33 +29,11 @@ void StorageWeakRefAllocator::free(void* ptr) {
|
||||
|
||||
|
||||
#ifdef WITH_NUMPY
|
||||
/**
 * Note [Numpy memory management]
 * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 * For efficiency reasons, when a user converts to/from numpy arrays,
 * we want to share the underlying storage. This means that if we
 * turn a Numpy array into a Torch tensor, the Torch tensor must
 * keep the Numpy array alive, and vice versa for conversions in
 * the other direction.
 *
 * A Torch tensor keeps its backing Numpy array alive using the custom allocator
 * THNumpyArrayAllocator (backed by NumpyArrayAllocator), which holds a
 * THPObjectPointer to the Numpy PyArrayObject, and nulls it out upon free.
 * The relevant code is in torch/csrc/generic/Tensor.cpp.
 *
 * A Numpy array keeps its backing Torch tensor alive using the base object
 * <https://docs.scipy.org/doc/numpy-dev/reference/c-api.array.html#c.PyArray_SetBaseObject>
 * field of Numpy, which is Numpy's hook for allowing an external user to
 * manage memory. The relevant code is in
 * torch/csrc/generic/methods/TensorSerialization.cwrap
 */
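The note above is about keeping both sides of a shared buffer alive. A Python-level sketch of the behaviour it enables (a hedged illustration, not part of this file):

```python
import numpy as np
import torch

a = np.ones(3, dtype=np.float32)
t = torch.from_numpy(a)      # shares a's buffer, so t must keep a alive
a[0] = 5.0
print(t[0])                  # 5.0 -- same underlying storage

b = torch.zeros(3).numpy()   # sharing in the other direction: b keeps the tensor alive
```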
|
||||
|
||||
// See Note [Numpy memory management]
|
||||
void* NumpyArrayAllocator::realloc(void* ptr, ptrdiff_t size) {
|
||||
PyArrayObject *array_ptr = (PyArrayObject*)object.get();
|
||||
if (array_ptr && ptr == PyArray_DATA(array_ptr)) {
|
||||
void* newPtr = this->malloc(size);
|
||||
memcpy(newPtr, ptr, std::min((size_t) size, (size_t) PyArray_NBYTES(array_ptr)));
|
||||
memcpy(newPtr, ptr, std::min(size, PyArray_NBYTES(array_ptr)));
|
||||
// Whee! We're done!
|
||||
object = nullptr;
|
||||
return newPtr;
|
||||
@ -63,7 +41,7 @@ void* NumpyArrayAllocator::realloc(void* ptr, ptrdiff_t size) {
|
||||
return allocator->realloc(allocatorContext, ptr, size);
|
||||
}
|
||||
|
||||
// See Note [Numpy memory management]
|
||||
|
||||
void NumpyArrayAllocator::free(void* ptr) {
|
||||
PyArrayObject *array_ptr = (PyArrayObject*)object.get();
|
||||
if (!array_ptr || ptr != PyArray_DATA(array_ptr))
|
||||
@ -101,7 +79,6 @@ THAllocator THStorageWeakRefAllocator = {
|
||||
};
|
||||
|
||||
#ifdef WITH_NUMPY
|
||||
// See Note [Numpy memory management]
|
||||
THAllocator THNumpyArrayAllocator = {
|
||||
malloc_wrapper<NumpyArrayAllocator>,
|
||||
realloc_wrapper<NumpyArrayAllocator>,
|
||||
|
||||
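A simplified sketch of the keep-alive pattern the Note [Numpy memory management] describes. The struct below is illustrative only (it is not the `NumpyArrayAllocator` from torch/csrc): it holds a strong reference to the Python object that owns the storage and releases that reference when the storage is freed.

```
#include <Python.h>

struct KeepAliveAllocator {
  PyObject* backing;  // the object (e.g. a numpy array) that owns the memory

  explicit KeepAliveAllocator(PyObject* obj) : backing(obj) {
    Py_XINCREF(backing);  // keep the backing object alive while we use its data
  }

  void free(void* /*data*/) {
    // The data pointer belongs to `backing`, so nothing is deallocated here
    // directly; dropping our reference is what may eventually free the buffer.
    Py_CLEAR(backing);
  }
};
```
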
@@ -1,33 +0,0 @@
## Autograd

Autograd is a hotspot for PyTorch performance, so most of the heavy lifting is
implemented in C++. This implies that we have to do some shuffling between
Python and C++; and in general, we want data to be in a form that is convenient
to manipulate from C++.

Our general model is that for any key data type that autograd manipulates,
there are two implementations: a C++ type and a Python object type. For
example, consider variables in autograd: we have both `Variable` in `variable.h`
(the C++ type) and `THPVariable` in `python_variable.h` (the Python type.)
(By the way, THP stands for TorcH Python, not to be confused with THPP, TorcH
C++). `Variable` contains the payload of a variable, while `THPVariable` just
contains a `shared_ptr` reference to `Variable`, as well as references to other
Python objects which the Python runtime needs to know about. A lot of
data accessor implementations in `python_variable.cpp` simply reach through
to the underlying `Variable` and return the appropriate value.

The most complicated application of this principle is Function, which also
supports users implementing custom behavior in Python. We have the following
classes:

* `Function` in `function.h`, the C++ type.
* `THPFunction` in `python_function.h`, the Python object type. In
  `python_function.cpp`, you can see the boilerplate that tells the Python
  interpreter about this object.
* `PyFunction` in `python_function.h`, a subclass of `Function` which forwards
  `apply` to a Python `THPFunction`. (NOT a Python object, despite its name!)

Outside of `PyFunction`, the C++ objects largely avoid referencing Python
objects (there are a few exceptions, like `pyobj` in `Variable`, and
`PyFunction`, whose whole point is to let C++ call into Python). And `pyobj`
in `Function` to ensure uniqueness of the associated python wrapper (if it exists).
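A hypothetical sketch of the two-type pattern the removed README describes: a plain C++ payload plus a Python object that only carries a `shared_ptr` to it. The names below are illustrative; the real definitions live in `variable.h` and `python_variable.h`.

```
#include <Python.h>
#include <memory>

struct Payload {               // C++ side: the data autograd actually manipulates
  int example_field = 0;       // stand-in for the real members
};

struct PyPayload {             // Python side: CPython object header plus a reference
  PyObject_HEAD
  std::shared_ptr<Payload> cdata;  // accessors just reach through this pointer
};
```
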
@@ -1,11 +1,8 @@
#include "torch/csrc/autograd/engine.h"
#include "torch/csrc/autograd/functions/basic_ops.h"
#include "torch/csrc/utils/auto_gpu.h"

#include <atomic>
#include <condition_variable>
#include <cstdint>
#include <functional>
#include <iostream>
#include <mutex>
#include <set>
@@ -25,25 +22,15 @@ using thpp::Tensor;

namespace torch { namespace autograd {

// XXX: Changes to the way multithreading works in execute should be done with
// great care. Right now the implementation guarantees that a single function's
// apply will never be entered concurrently (even if multiple graphs are
// executed at the same time). Adding multiple threads per-device or removing
// engine thread affinity to the device can break this invariant, and we depend
// on it in a few places (e.g. AccumulateGrad function).

struct FunctionTask {
GraphTask* base;
BackwardTask* base;
std::shared_ptr<Function> fn;
// This buffer serves as an implicit "addition" node for all of the
// gradients flowing here. Once all the dependencies are finished, we
// use the contents of this buffer to run the function.
InputBuffer inputs;
GradBuffer grad;

FunctionTask(GraphTask* base, std::shared_ptr<Function> fn, InputBuffer inputs)
FunctionTask(BackwardTask* base, std::shared_ptr<Function> fn, GradBuffer grad)
: base(base)
, fn(fn)
, inputs(std::move(inputs)) {}
, grad(std::move(grad)) {}
};

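An illustrative sketch (not the real `InputBuffer`/`GradBuffer`) of the "implicit addition node" idea from the comment in `FunctionTask` above: gradients arriving for the same slot are summed in place until the function is ready to run.

```
#include <cstddef>
#include <vector>

struct AccumulatingBuffer {
  explicit AccumulatingBuffer(std::size_t size) : values(size, 0.0) {}

  // Add an incoming gradient into the given slot; repeated adds accumulate.
  void add(std::size_t pos, double grad) { values.at(pos) += grad; }

  std::vector<double> values;
};
```
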
struct ReadyQueue {
@@ -55,32 +42,26 @@ struct ReadyQueue {
FunctionTask pop_back();
};

struct GraphTask {
struct BackwardTask {
std::exception_ptr exception;
// Indicates if an error occurred while executing any task. When this is
// true, it signals all threads to stop executing.
std::atomic_bool has_error;
std::atomic<uint64_t> outstanding_tasks;
bool keep_graph;
bool has_any_work;
bool retain_variables;
bool node_requires_grad;

std::mutex mutex;
// Notified when a task finishes executing. Check outstanding_tasks to see
// if all tasks are done.
std::condition_variable not_done;
const Engine::callback_map& function_callbacks;
std::unordered_map<Function*, InputBuffer> not_ready;
std::unordered_map<Function*, GradBuffer> not_ready;
std::unordered_map<Function*, int> dependencies;

GraphTask(bool keep_graph, const Engine::callback_map& function_callbacks)
BackwardTask(bool retain_variables)
: exception()
, has_error(false)
, outstanding_tasks(0)
, keep_graph(keep_graph)
, has_any_work(false)
, retain_variables(retain_variables)
, node_requires_grad(false)
, mutex()
, not_done()
, function_callbacks(function_callbacks)
, not_ready()
, dependencies() {}
};
@@ -107,9 +88,7 @@ Engine::Engine() : ready_queues() {
// This Engine's ReadyQueues and their corresponding threads are leaked here
Engine::~Engine() = default;

auto Engine::thread_main(std::shared_ptr<ReadyQueue> queue, int device) -> void {
THInferNumThreads();
AutoGPU guard(device);
auto Engine::thread_main(std::shared_ptr<ReadyQueue> queue) -> void {
while (1) {
FunctionTask task = queue->pop_back();
if (!task.base->has_error.load()) {
@@ -134,73 +113,78 @@ auto Engine::thread_on_exception(FunctionTask& task, std::exception& e) -> void
}
}

static variable_list call_pre_hooks(Function& fn, variable_list inputs) {
static variable_list call_pre_hooks(Function& fn, variable_list grad_output) {
for (auto& hook : fn.pre_hooks) {
inputs = (*hook)(inputs);
grad_output = (*hook)(grad_output);
}
return inputs;
return grad_output;
}

static variable_list call_post_hooks(Function& fn, variable_list outputs, variable_list inputs) {
static variable_list call_post_hooks(Function& fn, variable_list grad_input, variable_list grad_output) {
for (auto& hook : fn.post_hooks) {
outputs = (*hook)(outputs, inputs);
grad_input = (*hook)(grad_input, grad_output);
}
return outputs;
return grad_input;
}

static variable_list call_function(FunctionTask& task) {
auto& fn = *task.fn;
auto inputs = call_pre_hooks(fn, InputBuffer::variables(std::move(task.inputs)));

auto& function_callbacks = task.base->function_callbacks;
auto callback_it = function_callbacks.find(&fn);
if (callback_it != function_callbacks.end()) {
auto& callback = callback_it->second;
if (!callback(&fn, inputs)) return variable_list(fn.next_functions.size());
}

auto fn_outputs = fn.apply(inputs);
return call_post_hooks(fn, std::move(fn_outputs), std::move(inputs));
auto grad_output = call_pre_hooks(*task.fn, GradBuffer::variables(std::move(task.grad)));
auto grad_input = task.fn->apply(grad_output);
return call_post_hooks(*task.fn, std::move(grad_input), std::move(grad_output));
}

auto Engine::evaluate_function(FunctionTask& task) -> void {
auto outputs = call_function(task);
auto grad_inputs = call_function(task);

auto& fn = *task.fn;
if (!task.base->keep_graph) {
if (!task.base->retain_variables) {
fn.releaseVariables();
}

if (outputs.size() != fn.next_functions.size()) {
if (grad_inputs.size() != fn.previous_functions.size()) {
std::stringstream ss;
ss << "Function '" << fn.name() << "' returned an invalid number of outputs - expected ";
ss << fn.next_functions.size() << ", but got " << outputs.size();
ss << "Function '" << fn.name() << "' returned an invalid number of gradients - expected ";
ss << fn.previous_functions.size() << ", but got " << grad_inputs.size();
throw std::runtime_error(ss.str());
}

int num_outputs = outputs.size();
for (int i = 0; i < num_outputs; ++i) {
auto& output = outputs[i];
auto& next_fn = fn.next_functions[i].first;
int input_nr = fn.next_functions[i].second;
int size = grad_inputs.size();
for (int i = 0; i < size; ++i) {
auto& grad_input = grad_inputs[i];
auto& prev_fn = fn.previous_functions[i].first;
int output_nr = fn.previous_functions[i].second;

if (!next_fn) {
// null inputs have no previous_function and we skip them here
if (!prev_fn) {
continue;
}

// Stochastic functions are placed in the ready queue by
// compute_dependencies, so we have to skip them here.
if (next_fn->is_stochastic || !next_fn->is_executable) {
// compute_dependencies, so we can skip them here.
if (prev_fn->is_stochastic || !prev_fn->requires_grad) {
continue;
}

std::lock_guard<std::mutex> lock(task.base->mutex);
// Check if the next function is ready to be computed
if (auto var = dynamic_cast<Variable*>(prev_fn.get())) {
if (!grad_input) {
// NOTE: grad_input can be NULL if the function returns None for a
// non_differentiable input. We may need to track additional information
// at the function level to determine if a NULL grad_input is an error.
std::stringstream ss;
ss << "Function '" << fn.name() << "' missing gradient at " << i;
throw std::runtime_error(ss.str());
}
var->backward(grad_input);
continue;
}

// Check if the function is ready for backward
bool is_ready = false;
auto& dependencies = task.base->dependencies;
auto it = dependencies.find(next_fn.get());
auto it = dependencies.find(prev_fn.get());
if (it == dependencies.end()) {
auto name = next_fn->name();
auto name = prev_fn->name();
throw std::runtime_error(std::string("dependency not found for ") + name);
} else if (--it->second == 0) {
dependencies.erase(it);
@@ -208,24 +192,24 @@ auto Engine::evaluate_function(FunctionTask& task) -> void {
}

auto& not_ready = task.base->not_ready;
auto not_ready_it = not_ready.find(next_fn.get());
auto not_ready_it = not_ready.find(prev_fn.get());
if (not_ready_it == not_ready.end()) {
// No buffers have been allocated for the function
InputBuffer input_buffer(next_fn->num_inputs);
input_buffer.add(input_nr, std::move(output));
GradBuffer prev_buffer(prev_fn->num_outputs);
prev_buffer.addGrad(output_nr, std::move(grad_input));
if (is_ready) {
auto& queue = ready_queue(input_buffer.device());
queue.push_front(FunctionTask(task.base, next_fn, std::move(input_buffer)));
auto& queue = ready_queue(prev_buffer.device());
queue.push_front(FunctionTask(task.base, prev_fn, std::move(prev_buffer)));
} else {
not_ready.emplace(next_fn.get(), std::move(input_buffer));
not_ready.emplace(prev_fn.get(), std::move(prev_buffer));
}
} else {
// The function already has a buffer
auto &input_buffer = not_ready_it->second;
input_buffer.add(input_nr, std::move(output));
auto &prev_buffer = not_ready_it->second;
prev_buffer.addGrad(output_nr, std::move(grad_input));
if (is_ready) {
auto& queue = ready_queue(input_buffer.device());
queue.push_front(FunctionTask(task.base, next_fn, std::move(input_buffer)));
auto& queue = ready_queue(prev_buffer.device());
queue.push_front(FunctionTask(task.base, prev_fn, std::move(prev_buffer)));
not_ready.erase(not_ready_it);
}
}
@@ -233,30 +217,30 @@ auto Engine::evaluate_function(FunctionTask& task) -> void {
}

/** Finds all stochastic functions and appends them to the queue */
auto Engine::find_stochastic_functions(function_queue& queue, Function* graph_root, GraphTask& task) -> void {
std::unordered_set<Function*> seen {graph_root};
function_queue search_queue {graph_root};
auto Engine::find_stochastic_functions(function_queue& queue, BackwardTask& task) -> void {
std::unordered_set<Function*> seen;
function_queue search_queue(queue);
while (search_queue.size() > 0) {
auto fn = search_queue.back(); search_queue.pop_back();
for (auto& next_fn_pair : fn->next_functions) {
auto& next_fn = next_fn_pair.first;
Function* next_ptr = next_fn.get();
if (!next_ptr) continue;
if (next_ptr->is_stochastic && next_ptr->is_executable && seen.count(next_ptr) == 0) {
ready_queue(-1).push_front(FunctionTask(&task, next_fn, InputBuffer(0)));
queue.push_back(next_ptr);
task.has_any_work = true;
for (auto& prev_fn_pair : fn->previous_functions) {
auto& prev_fn = prev_fn_pair.first;
Function* prev_ptr = prev_fn.get();
if (!prev_ptr) continue;
if (prev_ptr->is_stochastic && prev_ptr->requires_grad && seen.count(prev_ptr) == 0) {
ready_queue(-1).push_front(FunctionTask(&task, prev_fn, GradBuffer(0)));
queue.push_back(prev_ptr);
task.node_requires_grad = true;
}
if (seen.count(next_ptr) == 0) {
seen.insert(next_ptr);
search_queue.push_back(next_ptr);
if (seen.count(prev_ptr) == 0) {
seen.insert(prev_ptr);
search_queue.push_back(prev_ptr);
}
}
}
}

/** Computes the number of dependencies for each function which requires grad */
auto Engine::compute_dependencies(function_queue queue, GraphTask& task) -> void {
auto Engine::compute_dependencies(function_queue queue, BackwardTask& task) -> void {
// Just to make sure that they will never be added to the queue again
std::unordered_set<Function*> seen(queue.begin(), queue.end());

@@ -265,97 +249,99 @@ auto Engine::compute_dependencies(function_queue queue, GraphTask& task) -> void
auto& dependencies = task.dependencies;
while (queue.size() > 0) {
auto fn = std::move(queue.back()); queue.pop_back();
for (auto& next_fn_pair : fn->next_functions) {
Function* next_ptr = next_fn_pair.first.get();
if (!next_ptr) continue;
if (!next_ptr->is_executable) continue;
if (next_ptr->is_stochastic) continue; // Stochastic nodes were in the queue already
dependencies[next_ptr] += 1;
if (seen.count(next_ptr) == 0) {
seen.insert(next_ptr);
queue.push_back(next_ptr);
// This is needed only to filter out backward roots that don't require grad
if (!fn->requires_grad) continue;
for (auto& prev_fn_pair : fn->previous_functions) {
Function* prev_ptr = prev_fn_pair.first.get();
if (!prev_ptr) continue;
if (dynamic_cast<Variable*>(prev_ptr)) continue;
if (!prev_ptr->requires_grad) continue;
if (prev_ptr->is_stochastic) continue; // Stochastic nodes were in the queue already
dependencies[prev_ptr] += 1;
if (seen.count(prev_ptr) == 0) {
seen.insert(prev_ptr);
queue.push_back(prev_ptr);
}
}
}
}

struct ClearCallbacks {
ClearCallbacks(std::vector<std::function<void()>>& callbacks,
std::mutex &callbacks_lock)
: callbacks(callbacks)
, callbacks_lock(callbacks_lock) { clear(); }
~ClearCallbacks() { clear(); }

void clear() {
std::lock_guard<std::mutex> lock(callbacks_lock);
callbacks.clear();
}

std::vector<std::function<void()>>& callbacks;
std::mutex& callbacks_lock;
};

auto Engine::execute(const function_list& input_roots,
variable_list& inputs,
bool keep_graph,
const callback_map& callbacks) -> void {
std::call_once(start_threads_flag, &Engine::start_threads, this);
// Callbacks are only valid for the duration of this run and should always be cleared
ClearCallbacks _cb_guard(post_callbacks, post_callbacks_lock);

GraphTask graph_task(keep_graph, callbacks);
std::unique_lock<std::mutex> lock(graph_task.mutex);

auto graph_root = std::make_shared<GraphRoot>(input_roots, inputs);
function_queue roots;
for (auto entry : input_roots) {
if (entry.first->is_executable) {
graph_task.has_any_work = true;
roots.push_back(graph_root.get());
ready_queue(-1).push_front(FunctionTask(&graph_task, graph_root, InputBuffer(0)));
break;
auto Engine::find_creators(const variable_list& variables,
tensor_list& grad_variables,
BackwardTask& task) -> function_queue {
function_queue creators;
std::unordered_map<std::shared_ptr<Function>, std::unique_ptr<GradBuffer>> creator_grad;
int size = variables.size();
for (int i = 0; i < size; ++i) {
auto& var = variables[i];
auto& grad = grad_variables[i];
if (!var->creator) {
// If someone calls .backward() on a leaf, it's simple...
if (var->requires_grad) {
var->backward(std::make_shared<Variable>(std::move(grad), false, true));
task.node_requires_grad = true;
}
} else {
auto& creator = var->creator;
auto& buf = creator_grad[creator];
if (creator->requires_grad) {
if (!buf) buf.reset(new GradBuffer(creator->num_outputs));
buf->addGrad(var->output_nr, Variable::of(std::move(grad)));
}
}
}

// Search the graph and find all stochastic functions. Append them to the queue.
find_stochastic_functions(roots, graph_root.get(), graph_task);
for (auto& entry: creator_grad) {
const auto& creator = entry.first;
creators.push_back(creator.get());
if (creator->requires_grad) {
// NOTE: buf is null if creator doesn't require gradient
auto& buf = entry.second;
auto& queue = ready_queue(buf->device());
queue.push_front(FunctionTask(&task, creator, std::move(*buf)));
task.node_requires_grad = true;
}
}

if (!graph_task.has_any_work) {
return creators;
}

auto Engine::backward(const variable_list& variables,
tensor_list& grad_variables,
bool retain_variables) -> void {
static std::once_flag once_flag;
std::call_once(once_flag, &Engine::start_threads, this);

BackwardTask backward_task(retain_variables);
std::unique_lock<std::mutex> lock(backward_task.mutex);

// Find the unique creators and backprop into variables which don't have creators.
auto creators = find_creators(variables, grad_variables, backward_task);

// Search the graph and find all stochastic functions. Append them to the queue.
find_stochastic_functions(creators, backward_task);

if (!backward_task.node_requires_grad) {
throw std::runtime_error(
"there are no graph nodes that require computing gradients");
}

// Now compute the dependencies for all executable functions
compute_dependencies(std::move(roots), graph_task);
// Now compute the dependencies for each function which requires grad
compute_dependencies(std::move(creators), backward_task);

// Wait for all tasks to complete
graph_task.not_done.wait(lock, [&graph_task]{
return graph_task.outstanding_tasks.load() == 0;
// wait for all tasks to complete
backward_task.not_done.wait(lock, [&backward_task]{
return backward_task.outstanding_tasks.load() == 0;
});

// Check for an exception while running backwards
if (graph_task.has_error.load()) {
std::rethrow_exception(graph_task.exception);
// check for an exception while running backwards
if (backward_task.has_error.load()) {
std::rethrow_exception(backward_task.exception);
}

if (!graph_task.not_ready.empty()) {
if (!backward_task.not_ready.empty()) {
throw std::runtime_error("could not compute gradients for some functions");
}

// Unlocking is necessary, because the callback can register
// more callbacks (or they can be registered from other threads
// while it's waiting.
std::unique_lock<std::mutex> cb_lock(post_callbacks_lock);
for (std::size_t i = 0; i < post_callbacks.size(); ++i) {
cb_lock.unlock();
post_callbacks[i]();
cb_lock.lock();
}
}

void Engine::queue_callback(std::function<void()> callback) {
std::lock_guard<std::mutex> lock(post_callbacks_lock);
post_callbacks.emplace_back(std::move(callback));
}

auto Engine::ready_queue(int device) -> ReadyQueue& {
@@ -371,12 +357,10 @@ auto Engine::start_threads() -> void {
num_devices = 0;
}
#endif
int num_threads = num_devices + 1;
ready_queues = std::vector<std::shared_ptr<ReadyQueue>>(num_threads);
for (int i = 0; i < num_threads; ++i) {
auto& queue = ready_queues[i];
ready_queues = std::vector<std::shared_ptr<ReadyQueue>>(num_devices + 1);
for (auto& queue : ready_queues) {
queue.reset(new ReadyQueue());
std::thread t(&Engine::thread_main, this, queue, i - 1);
std::thread t(&Engine::thread_main, this, queue);
t.detach();
}
}

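The backward path above enqueues work for the worker threads and then blocks on `not_done` until `outstanding_tasks` drops to zero. A stand-alone sketch of that completion-wait pattern (illustrative only, not the engine's actual code):

```
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>

struct TaskTracker {
  std::mutex mutex;
  std::condition_variable not_done;
  int outstanding_tasks = 0;
};

int main() {
  TaskTracker tracker;
  std::deque<std::function<void()>> queue;

  // Enqueue a few no-op tasks and count them as outstanding.
  for (int i = 0; i < 4; ++i) {
    std::lock_guard<std::mutex> lock(tracker.mutex);
    queue.push_back([] {});
    ++tracker.outstanding_tasks;
  }

  // A worker drains the queue; each finished task is signalled via not_done.
  std::thread worker([&] {
    for (;;) {
      std::function<void()> task;
      {
        std::lock_guard<std::mutex> lock(tracker.mutex);
        if (queue.empty()) break;
        task = std::move(queue.front());
        queue.pop_front();
      }
      task();
      {
        std::lock_guard<std::mutex> lock(tracker.mutex);
        --tracker.outstanding_tasks;
      }
      tracker.not_done.notify_all();
    }
  });

  // The caller blocks until every task has completed, as backward() does.
  std::unique_lock<std::mutex> lock(tracker.mutex);
  tracker.not_done.wait(lock, [&] { return tracker.outstanding_tasks == 0; });
  lock.unlock();
  worker.join();
  return 0;
}
```
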
@@ -3,22 +3,20 @@
// Engine implements backpropagation from output variables and their gradients
// to "root" variables (variables created by the user with requires_grad=True).

#include <Python.h>
#include <deque>
#include <memory>
#include <unordered_map>
#include <utility>
#include <vector>
#include <functional>

#include "torch/csrc/autograd/function.h"
#include "torch/csrc/autograd/input_buffer.h"
#include "torch/csrc/autograd/grad_buffer.h"

namespace torch { namespace autograd {

struct ReadyQueue;
struct FunctionTask;
struct GraphTask;
struct BackwardTask;

// A single instance of this struct should be created through the whole process lifetime.
// The worker thread creation logic and Engine's destructor rely on this.
@@ -26,39 +24,31 @@ struct Engine {
Engine();
virtual ~Engine();

using ready_queue_type = std::deque<std::pair<std::shared_ptr<Function>, InputBuffer>>;
using ready_queue_type = std::deque<std::pair<std::shared_ptr<Function>, GradBuffer>>;
using function_queue = std::vector<Function*>;
using dependencies_type = std::unordered_map<Function*, int>;
using callback_type = std::function<bool (Function*, variable_list&)>;
using callback_map = std::unordered_map<Function*, callback_type>;

// Given a list of (Function, input number) pairs computes the value of the graph
// by following next_function references.
void execute(
const function_list& roots,
variable_list& inputs,
bool keep_graph,
const callback_map& callbacks = callback_map());

void queue_callback(std::function<void()> callback);
// Given a list of output variables and their gradients, computes the
// gradients of "root" variables by backpropagation.
void backward(
const variable_list& variables,
tensor_list& grad_variables,
bool retain_variables);

protected:
function_queue find_roots(
const function_list& roots,
variable_list& inputs,
GraphTask& task);
void find_stochastic_functions(function_queue& queue, Function* graph_root, GraphTask& task);
void compute_dependencies(function_queue queue, GraphTask& task);
function_queue find_creators(
const variable_list& variables,
tensor_list& grad_variables,
BackwardTask& task);
void find_stochastic_functions(function_queue& queue, BackwardTask& task);
void compute_dependencies(function_queue queue, BackwardTask& task);
void evaluate_function(FunctionTask& task);
ReadyQueue& ready_queue(int device);
void start_threads();
virtual void thread_main(std::shared_ptr<ReadyQueue> queue, int device);
virtual void thread_main(std::shared_ptr<ReadyQueue> queue);
virtual void thread_on_exception(FunctionTask& task, std::exception& e);

std::once_flag start_threads_flag;
std::vector<std::shared_ptr<ReadyQueue>> ready_queues;
std::vector<std::function<void()>> post_callbacks;
std::mutex post_callbacks_lock;
};

}} // namespace torch::autograd

@@ -10,22 +10,22 @@ namespace torch { namespace autograd {
auto Function::flags(const variable_list& inputs) -> FunctionFlags {
int num_inputs = inputs.size();
FunctionFlags f;
f.is_executable = false;
f.requires_grad = false;
f.is_volatile = false;
f.next_functions.resize(num_inputs);
f.previous_functions.resize(num_inputs);
for (int i = 0; i != num_inputs; ++i) {
auto& var = inputs[i];
if (var) {
f.is_executable |= var->requires_grad;
f.requires_grad |= var->requires_grad;
f.is_volatile |= var->is_volatile;
if (var->grad_fn) {
f.next_functions[i] = std::make_pair<>(var->grad_fn, var->output_nr);
if (var->creator) {
f.previous_functions[i] = std::make_pair<>(var->creator, var->output_nr);
} else {
f.next_functions[i] = std::make_pair<>(var->get_grad_accumulator(), 0);
f.previous_functions[i] = std::make_pair<>(var, 0);
}
}
}
f.is_executable &= !f.is_volatile;
f.requires_grad &= !f.is_volatile;
return f;
}

@@ -1,19 +1,18 @@
#pragma once

// Function is an abstract class that represents a single operation from one or
// more variables to one more or variables.
// more variables to one more or varaibles.
//
// Subclasses may represent "forward" or "backward" operations (i.e functions
// and their derivatives). Some functions may be used as both.

#include <Python.h>
#include "torch/csrc/autograd/function_hook.h"

#include <THPP/THPP.h>

#include <memory>
#include <THPP/THPP.h>
#include <vector>

#include "torch/csrc/autograd/saved_variable.h"
#include "torch/csrc/autograd/function_hook.h"

namespace torch { namespace autograd {

struct Function;
@@ -25,37 +24,30 @@ using function_list = std::vector<std::pair<std::shared_ptr<Function>, int>>;

// State used to create "backward" functions
struct FunctionFlags {
// Roughly speaking, is_executable corresponds to requires_grad.
// See http://pytorch.org/docs/notes/autograd.html for more details:
// both is_executable and is_volatile specify whether or not backwards
// gradient computation will be performed for a function, but they differ in
// their precedence.
bool is_executable = false;
bool requires_grad = false;
bool is_volatile = false;
// What functions take the output of this function as input.
// There is one function per output of this function.
function_list next_functions;
function_list previous_functions;
};

struct Function {
Function()
: num_inputs(0)
, next_functions()
, is_executable(false)
: num_outputs(0)
, previous_functions()
, requires_grad(false)
, is_volatile(false)
, is_stochastic(false)
, pre_hooks()
, post_hooks()
, pyobj(nullptr)
{}

Function(FunctionFlags&& flags)
: num_inputs(0)
, next_functions(std::move(flags.next_functions))
, is_executable(flags.is_executable)
: num_outputs(0)
, previous_functions(std::move(flags.previous_functions))
, requires_grad(flags.requires_grad)
, is_volatile(flags.is_volatile)
, is_stochastic(false)
, pre_hooks()
, post_hooks()
, pyobj(nullptr)
{}

Function(const Function& other) = delete;
@@ -65,7 +57,7 @@ struct Function {
// Implements the operation
virtual variable_list apply(const variable_list& inputs) = 0;

// Computes is_executable, is_volatile, and next_functions from a list
// Computes requires_grad, is_volatile, and previous_functions from a list
// of input variables
static FunctionFlags flags(const variable_list& inputs);

@@ -75,24 +67,21 @@ struct Function {
// Function name for debugging
virtual std::string name();

inline bool should_compute_output(int i) const {
auto& fn = next_functions[i].first;
return fn && fn->is_executable;
inline bool needs_input_grad(int i) const {
auto& fn = previous_functions[i].first;
return fn && fn->requires_grad;
}

inline void set_flags(FunctionFlags&& flags) {
is_executable = flags.is_executable;
next_functions = std::move(flags.next_functions);
}

int num_inputs;
function_list next_functions;
bool is_executable;
// These variables are usually only meaningful for "backward" functions.
// num_outputs is the number of outputs of corresponding "forward" function;
// it's actually the number of inputs of this function.
int num_outputs;
function_list previous_functions;
bool requires_grad;
bool is_volatile;
bool is_stochastic;
std::vector<std::shared_ptr<FunctionPreHook>> pre_hooks;
std::vector<std::shared_ptr<FunctionPostHook>> post_hooks;

PyObject *pyobj; // weak reference
};