It is generally recommended to use `is`/`is not` to compare types. This series of changes applies that suggestion across the code base, with the aim of eventually enabling the related linter checks.
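A minimal illustration of the rewrite (class name made up; the corresponding lint rule is pycodestyle/ruff E721):
```python
class Widget:
    pass

w = Widget()

# Before: equality comparison of types, flagged by E721
if type(w) == Widget:
    pass

# After: identity comparison, since classes are singleton objects
if type(w) is Widget:
    pass
```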
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165037
Approved by: https://github.com/mlazos
Fixes #76772, #144196
Extends #144106
- added type annotations to `lazy_property`.
- added type annotation to all `@property` and `@lazy_property` inside `torch.distributions` module.
- added a simple type-check unit test to ensure type inference is working.
- replaced deprecated annotations like `typing.List` with their modern counterparts.
- shortened `torch.Tensor` hints to plain `Tensor`, since the fully qualified name makes signatures very verbose.
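A rough sketch of what the annotations enable; the subclass and property below are made up for illustration:
```python
from torch import Tensor
from torch.distributions import Normal
from torch.distributions.utils import lazy_property


class DoubledNormal(Normal):
    @lazy_property
    def double_mean(self) -> Tensor:
        # Computed once on first access, then cached on the instance.
        return 2 * self.mean


d = DoubledNormal(0.0, 1.0)
m: Tensor = d.double_mean  # type checkers now infer Tensor here, not lazy_property
```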
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144110
Approved by: https://github.com/Skylion007
Over time, a large number of the existing type ignores have become irrelevant/unused/dead as a result of improvements in annotations and type checking.
Having these `# type: ignore` linger around is not ideal for two reasons:
- They are syntactically verbose/ugly.
- They can hide genuine bugs in the future: if a refactoring actually introduces an error, the stale ignore silently suppresses it.
I'm counting over 1500 unused ignores already. This is a first PR that removes some of them. Note that I haven't touched type ignores that looked "conditional" like the import challenge mentioned in https://github.com/pytorch/pytorch/pull/60006#issuecomment-2480604728. I will address these at a later point, and eventually would enable `warn_unused_ignores = True` in the mypy configuration as discussed in that comment to prevent accumulating more dead ignores going forward.
This PR should have no effect on runtime at all.
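For illustration, this is the kind of comment being removed, and what mypy reports once `warn_unused_ignores` is switched on (sketch, not taken from the code base):
```python
# A "dead" ignore: the assignment is valid, so the comment suppresses nothing.
x: int = 1  # type: ignore[assignment]

# With warn_unused_ignores = True, mypy flags it:
#   error: Unused "type: ignore" comment
```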
Pull Request resolved: https://github.com/pytorch/pytorch/pull/142325
Approved by: https://github.com/Skylion007, https://github.com/janeyx99
Update ruff to 0.4.1.
This version fixes a lot of false negatives/false positives, is 20-40% faster, and includes various other bug fixes.
Below is a before and after table showing the execution time of ruff lint and ruff format in milliseconds courtesy of https://astral.sh/blog/ruff-v0.4.0
| Repository | Linter (v0.3) | Linter (v0.4) | Formatter (v0.3) | Formatter (v0.4) |
|----------------------------------------------------|---------------|---------------|------------------|------------------|
| [pytorch/pytorch](https://github.com/pytorch/pytorch) | 328.7 | 251.8 | 351.1 | 274.9 |
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124549
Approved by: https://github.com/ezyang
Summary:
Fixes https://github.com/pytorch/pytorch/issues/72765.
- [x] Improved `NotImplementedError` verbosity.
- [x] Automate the docstring generation process
## Improved `NotImplementedError` verbosity
### Code
```python
import torch
dist = torch.distributions
torch_normal = dist.Normal(loc=0.0, scale=1.0)
torch_mixture = dist.MixtureSameFamily(
    dist.Categorical(torch.ones(5)),
    dist.Normal(torch.randn(5), torch.rand(5)),
)
dist.kl_divergence(torch_normal, torch_mixture)
```
#### Output before this PR
```python
NotImplementedError:
```
#### Output after this PR
```python
NotImplementedError: No KL(p || q) is implemented for p type Normal and q type MixtureSameFamily
```
## Automate the docstring generation process
### Docstring before this PR
```python
Compute Kullback-Leibler divergence :math:`KL(p \| q)` between two distributions.

.. math::

    KL(p \| q) = \int p(x) \log\frac {p(x)} {q(x)} \,dx

Args:
    p (Distribution): A :class:`~torch.distributions.Distribution` object.
    q (Distribution): A :class:`~torch.distributions.Distribution` object.

Returns:
    Tensor: A batch of KL divergences of shape `batch_shape`.

Raises:
    NotImplementedError: If the distribution types have not been registered via
        :meth:`register_kl`.
```
### Docstring after this PR
```python
Compute Kullback-Leibler divergence :math:`KL(p \| q)` between two distributions.

.. math::

    KL(p \| q) = \int p(x) \log\frac {p(x)} {q(x)} \,dx

Args:
    p (Distribution): A :class:`~torch.distributions.Distribution` object.
    q (Distribution): A :class:`~torch.distributions.Distribution` object.

Returns:
    Tensor: A batch of KL divergences of shape `batch_shape`.

Raises:
    NotImplementedError: If the distribution types have not been registered via
        :meth:`register_kl`.

KL divergence is currently implemented for the following distribution pairs:

* :class:`~torch.distributions.Bernoulli` and :class:`~torch.distributions.Bernoulli`
* :class:`~torch.distributions.Bernoulli` and :class:`~torch.distributions.Poisson`
* :class:`~torch.distributions.Beta` and :class:`~torch.distributions.Beta`
* :class:`~torch.distributions.Beta` and :class:`~torch.distributions.ContinuousBernoulli`
* :class:`~torch.distributions.Beta` and :class:`~torch.distributions.Exponential`
* :class:`~torch.distributions.Beta` and :class:`~torch.distributions.Gamma`
* :class:`~torch.distributions.Beta` and :class:`~torch.distributions.Normal`
* :class:`~torch.distributions.Beta` and :class:`~torch.distributions.Pareto`
* :class:`~torch.distributions.Beta` and :class:`~torch.distributions.Uniform`
* :class:`~torch.distributions.Binomial` and :class:`~torch.distributions.Binomial`
* :class:`~torch.distributions.Categorical` and :class:`~torch.distributions.Categorical`
* :class:`~torch.distributions.Cauchy` and :class:`~torch.distributions.Cauchy`
* :class:`~torch.distributions.ContinuousBernoulli` and :class:`~torch.distributions.ContinuousBernoulli`
* :class:`~torch.distributions.ContinuousBernoulli` and :class:`~torch.distributions.Exponential`
* :class:`~torch.distributions.ContinuousBernoulli` and :class:`~torch.distributions.Normal`
* :class:`~torch.distributions.ContinuousBernoulli` and :class:`~torch.distributions.Pareto`
* :class:`~torch.distributions.ContinuousBernoulli` and :class:`~torch.distributions.Uniform`
* :class:`~torch.distributions.Dirichlet` and :class:`~torch.distributions.Dirichlet`
* :class:`~torch.distributions.Exponential` and :class:`~torch.distributions.Beta`
* :class:`~torch.distributions.Exponential` and :class:`~torch.distributions.ContinuousBernoulli`
* :class:`~torch.distributions.Exponential` and :class:`~torch.distributions.Exponential`
* :class:`~torch.distributions.Exponential` and :class:`~torch.distributions.Gamma`
* :class:`~torch.distributions.Exponential` and :class:`~torch.distributions.Gumbel`
* :class:`~torch.distributions.Exponential` and :class:`~torch.distributions.Normal`
* :class:`~torch.distributions.Exponential` and :class:`~torch.distributions.Pareto`
* :class:`~torch.distributions.Exponential` and :class:`~torch.distributions.Uniform`
* :class:`~torch.distributions.ExponentialFamily` and :class:`~torch.distributions.ExponentialFamily`
* :class:`~torch.distributions.Gamma` and :class:`~torch.distributions.Beta`
* :class:`~torch.distributions.Gamma` and :class:`~torch.distributions.ContinuousBernoulli`
* :class:`~torch.distributions.Gamma` and :class:`~torch.distributions.Exponential`
* :class:`~torch.distributions.Gamma` and :class:`~torch.distributions.Gamma`
* :class:`~torch.distributions.Gamma` and :class:`~torch.distributions.Gumbel`
* :class:`~torch.distributions.Gamma` and :class:`~torch.distributions.Normal`
* :class:`~torch.distributions.Gamma` and :class:`~torch.distributions.Pareto`
* :class:`~torch.distributions.Gamma` and :class:`~torch.distributions.Uniform`
* :class:`~torch.distributions.Geometric` and :class:`~torch.distributions.Geometric`
* :class:`~torch.distributions.Gumbel` and :class:`~torch.distributions.Beta`
* :class:`~torch.distributions.Gumbel` and :class:`~torch.distributions.ContinuousBernoulli`
* :class:`~torch.distributions.Gumbel` and :class:`~torch.distributions.Exponential`
* :class:`~torch.distributions.Gumbel` and :class:`~torch.distributions.Gamma`
* :class:`~torch.distributions.Gumbel` and :class:`~torch.distributions.Gumbel`
* :class:`~torch.distributions.Gumbel` and :class:`~torch.distributions.Normal`
* :class:`~torch.distributions.Gumbel` and :class:`~torch.distributions.Pareto`
* :class:`~torch.distributions.Gumbel` and :class:`~torch.distributions.Uniform`
* :class:`~torch.distributions.HalfNormal` and :class:`~torch.distributions.HalfNormal`
* :class:`~torch.distributions.Independent` and :class:`~torch.distributions.Independent`
* :class:`~torch.distributions.Laplace` and :class:`~torch.distributions.Beta`
* :class:`~torch.distributions.Laplace` and :class:`~torch.distributions.ContinuousBernoulli`
* :class:`~torch.distributions.Laplace` and :class:`~torch.distributions.Exponential`
* :class:`~torch.distributions.Laplace` and :class:`~torch.distributions.Gamma`
* :class:`~torch.distributions.Laplace` and :class:`~torch.distributions.Laplace`
* :class:`~torch.distributions.Laplace` and :class:`~torch.distributions.Normal`
* :class:`~torch.distributions.Laplace` and :class:`~torch.distributions.Pareto`
* :class:`~torch.distributions.Laplace` and :class:`~torch.distributions.Uniform`
* :class:`~torch.distributions.LowRankMultivariateNormal` and :class:`~torch.distributions.LowRankMultivariateNormal`
* :class:`~torch.distributions.LowRankMultivariateNormal` and :class:`~torch.distributions.MultivariateNormal`
* :class:`~torch.distributions.MultivariateNormal` and :class:`~torch.distributions.LowRankMultivariateNormal`
* :class:`~torch.distributions.MultivariateNormal` and :class:`~torch.distributions.MultivariateNormal`
* :class:`~torch.distributions.Normal` and :class:`~torch.distributions.Beta`
* :class:`~torch.distributions.Normal` and :class:`~torch.distributions.ContinuousBernoulli`
* :class:`~torch.distributions.Normal` and :class:`~torch.distributions.Exponential`
* :class:`~torch.distributions.Normal` and :class:`~torch.distributions.Gamma`
* :class:`~torch.distributions.Normal` and :class:`~torch.distributions.Gumbel`
* :class:`~torch.distributions.Normal` and :class:`~torch.distributions.Laplace`
* :class:`~torch.distributions.Normal` and :class:`~torch.distributions.Normal`
* :class:`~torch.distributions.Normal` and :class:`~torch.distributions.Pareto`
* :class:`~torch.distributions.Normal` and :class:`~torch.distributions.Uniform`
* :class:`~torch.distributions.OneHotCategorical` and :class:`~torch.distributions.OneHotCategorical`
* :class:`~torch.distributions.Pareto` and :class:`~torch.distributions.Beta`
* :class:`~torch.distributions.Pareto` and :class:`~torch.distributions.ContinuousBernoulli`
* :class:`~torch.distributions.Pareto` and :class:`~torch.distributions.Exponential`
* :class:`~torch.distributions.Pareto` and :class:`~torch.distributions.Gamma`
* :class:`~torch.distributions.Pareto` and :class:`~torch.distributions.Normal`
* :class:`~torch.distributions.Pareto` and :class:`~torch.distributions.Pareto`
* :class:`~torch.distributions.Pareto` and :class:`~torch.distributions.Uniform`
* :class:`~torch.distributions.Poisson` and :class:`~torch.distributions.Bernoulli`
* :class:`~torch.distributions.Poisson` and :class:`~torch.distributions.Binomial`
* :class:`~torch.distributions.Poisson` and :class:`~torch.distributions.Poisson`
* :class:`~torch.distributions.TransformedDistribution` and :class:`~torch.distributions.TransformedDistribution`
* :class:`~torch.distributions.Uniform` and :class:`~torch.distributions.Beta`
* :class:`~torch.distributions.Uniform` and :class:`~torch.distributions.ContinuousBernoulli`
* :class:`~torch.distributions.Uniform` and :class:`~torch.distributions.Exponential`
* :class:`~torch.distributions.Uniform` and :class:`~torch.distributions.Gamma`
* :class:`~torch.distributions.Uniform` and :class:`~torch.distributions.Gumbel`
* :class:`~torch.distributions.Uniform` and :class:`~torch.distributions.Normal`
* :class:`~torch.distributions.Uniform` and :class:`~torch.distributions.Pareto`
* :class:`~torch.distributions.Uniform` and :class:`~torch.distributions.Uniform`
```
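A minimal sketch of how the pairs list can be generated at import time, assuming the registrations live in a `(type_p, type_q) -> function` dict such as `_KL_REGISTRY` in `torch.distributions.kl` (the actual implementation may differ):
```python
from torch.distributions.kl import _KL_REGISTRY  # registry filled by @register_kl


def _kl_pairs_doc() -> str:
    """Render the registered KL pairs as an RST bullet list for the docstring."""
    header = "KL divergence is currently implemented for the following distribution pairs:"
    lines = [header, ""]
    for p, q in sorted(_KL_REGISTRY, key=lambda pair: (pair[0].__name__, pair[1].__name__)):
        lines.append(
            f"* :class:`~torch.distributions.{p.__name__}` and "
            f":class:`~torch.distributions.{q.__name__}`"
        )
    return "\n".join(lines)


# e.g. kl_divergence.__doc__ += "\n" + _kl_pairs_doc() after all registrations have run
```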
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72845
Reviewed By: mikaylagawarecki
Differential Revision: D34344551
Pulled By: soulitzer
fbshipit-source-id: 7a603613a2f56f71138d56399c7c521e2238e8c5
(cherry picked from commit 6b2a51c796cd8a16551d629ca368360eec34faef)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63570
There is a use of `at::triangular_solve_out` in the file
`torch/csrc/jit/tensorexpr/external_functions.cpp` that I have not dared
to move to `at::linalg_solve_triangular_out`.
**Deprecation note:**
This PR deprecates the `torch.triangular_solve` function in favor of
`torch.linalg.solve_triangular`. An upgrade guide is added to the
documentation for `torch.triangular_solve`.
Note that it DOES NOT remove `torch.triangular_solve`, but
`torch.triangular_solve` will be removed in a future PyTorch release.
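The migration looks roughly like this (note the swapped argument order and that only the solution is returned):
```python
import torch

A = torch.randn(3, 3).triu() + 3 * torch.eye(3)  # well-conditioned upper-triangular matrix
B = torch.randn(3, 4)

# Deprecated: right-hand side first, returns a namedtuple
X_old = torch.triangular_solve(B, A, upper=True).solution

# Preferred: coefficient matrix first, returns the solution directly
X_new = torch.linalg.solve_triangular(A, B, upper=True)

assert torch.allclose(X_old, X_new)
```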
cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano
Test Plan: Imported from OSS
Reviewed By: mruberry
Differential Revision: D32618035
Pulled By: anjali411
fbshipit-source-id: 0bfb48eeb6d96eff3e96e8a14818268cceb93c83
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64181
This PR replaces all the calls to:
- `transpose(-2, -1)` or `transpose(-1, -2)` by `mT()` in C++ and `mT` in Python
- `conj().transpose(-2, -1)` or `transpose(-2, -1).conj()` or `conj().transpose(-1, -2)` or `transpose(-1, -2).conj()` by `mH()` in C++ and `mH` in Python.
It also simplifies two pieces of code, and fixes one bug where a pair of parentheses was missing in the function `make_symmetric_matrices`.
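In practice the codemod amounts to rewrites of this shape (Python side shown):
```python
import torch

x = torch.randn(2, 3, 4, dtype=torch.complex64)

old_t = x.transpose(-2, -1)          # before: batched matrix transpose
new_t = x.mT                         # after

old_h = x.conj().transpose(-2, -1)   # before: batched conjugate (Hermitian) transpose
new_h = x.mH                         # after

assert torch.equal(old_t, new_t) and torch.equal(old_h, new_h)
```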
Test Plan: Imported from OSS
Reviewed By: H-Huang
Differential Revision: D31692896
Pulled By: anjali411
fbshipit-source-id: e9112c42343663d442dc5bd53ff2b492094b434a
Summary:
Fixes https://github.com/pytorch/pytorch/issues/50496
Fixes https://github.com/pytorch/pytorch/issues/34859
Fixes https://github.com/pytorch/pytorch/issues/21596
This fixes many bugs involving `TransformedDistribution` and `ComposeTransform` when the component transforms changed their event shapes. Part of the fix is to introduce an `IndependentTransform` analogous to `distributions.Independent` and `constraints.independent`, and to introduce methods `Transform.forward_shape()` and `.inverse_shape()`. I have followed fehiepsi's suggestion and replaced `.input_event_dim` -> `.domain.event_dim` and `.output_event_dim` -> `.codomain.event_dim`. This allows us to deprecate `.event_dim` as an attribute.
## Summary of changes
- Fixes `TransformedDistribution` and `ComposeTransform` shape errors.
- Fixes a behavior bug in `LogisticNormal`.
- Fixes `kl_divergence(TransformedDistribution, TransformedDistribution)`
- Adds methods `Transform.forward_shape()`, `.inverse_shape()` which are required for correct shape computations in `TransformedDistribution` and `ComposeTransform`.
- Adds an `IndependentTransform`.
- Adds a `ReshapeTransform` which is invaluable in testing shape logic in `ComposeTransform` and `TransformedDistribution` and which will be used by stefanwebb's flowtorch.
- Fixes incorrect default values in `constraints.dependent.event_dim`.
- Documents the `.event_dim` and `.is_discrete` attributes.
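A small example of the kind of composition whose shapes now work out (shapes picked for illustration):
```python
import torch
from torch.distributions import Normal, TransformedDistribution
from torch.distributions.transforms import ReshapeTransform

# 6 i.i.d. normals, folded by the transform into a single (2, 3) event.
base = Normal(torch.zeros(6), torch.ones(6))
dist = TransformedDistribution(base, [ReshapeTransform((6,), (2, 3))])

x = dist.sample()
print(x.shape)                 # torch.Size([2, 3])
print(dist.log_prob(x).shape)  # torch.Size([]) -- the whole block is one event
```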
## Changes planned for follow-up PRs
- Memoize `constraints.dependent_property` as we do with `lazy_property`, since we now consult those properties much more often.
## Tested
- [x] added a test for `Dist.support` vs `Dist(**params).support` to ensure static and dynamic attributes agree.
- [x] refactoring is covered by existing tests
- [x] add test cases for `ReshapeTransform`
- [x] add a test for `TransformedDistribution` on a wide grid of input shapes
- [x] added a regression test for https://github.com/pytorch/pytorch/issues/34859
cc fehiepsi feynmanliang stefanwebb
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50581
Reviewed By: ezyang, glaringlee, jpchen
Differential Revision: D26024247
Pulled By: neerajprad
fbshipit-source-id: f0b9a296f780ff49659b132409e11a29985dde9b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18598
ghimport-source-id: c74597e5e7437e94a43c163cee0639b20d0d0c6a
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18598 Turn on F401: Unused import warning.**
This was requested by someone at Facebook; this lint is turned
on for Facebook by default. "Sure, why not."
I had to noqa a number of imports in __init__. Hypothetically
we're supposed to use __all__ in this case, but I was too lazy
to fix it. Left for future work.
Be careful! flake8-2 and flake8-3 behave differently with
respect to import resolution for # type: comments. flake8-3 will
report an import unused; flake8-2 will not. For now, I just
noqa'd all these sites.
All the changes were done by hand.
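For the `__init__` re-exports, the workaround looks like this (module and symbol names hypothetical):
```python
# package/__init__.py
# Re-exported for the public API; the noqa keeps F401 (imported but unused) quiet.
from .core import Thing  # noqa: F401
```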
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Differential Revision: D14687478
fbshipit-source-id: 30d532381e914091aadfa0d2a5a89404819663e3
Summary:
Changelog:
- Renames `trtrs` to `triangular_solve` to remain consistent with `cholesky_solve` and `solve`.
- Rename all tests, fix callsites
- Create a tentative alias for `triangular_solve` under the name `trtrs`, and add a deprecation warning to not promote usage.
- Move `isnan` to _torch_docs.py
- Remove unnecessary imports
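At the time of this change both spellings resolved to the same routine, with the old name emitting a deprecation warning (historical sketch; both have since been superseded by `torch.linalg.solve_triangular`):
```python
import torch

A = torch.randn(3, 3).triu() + 3 * torch.eye(3)
b = torch.randn(3, 2)

X, _ = torch.triangular_solve(b, A)  # new name
X2, _ = torch.trtrs(b, A)            # tentative alias, warns about deprecation
```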
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18213
Differential Revision: D14566902
Pulled By: ezyang
fbshipit-source-id: 544f57c29477df391bacd5de700bed1add456d3f
Summary:
- Remove single batch TH/THC implementations
- Remove `_batch_trtrs_lower` from `multivariate_normal`
- Add tests for batched behavior
- Modify trtrs_backward to accommodate for batched case
- Modify docs
In a future PR, this will be renamed to `triangular_solve`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18025
Differential Revision: D14523004
Pulled By: ifedan
fbshipit-source-id: 11c6a967d107f969b60e5a5c73ce6bb8099ebbe1
Summary:
This PR fixes an issue of the slowness expanded MVN.
A notebook to show the problem is [here](https://gist.github.com/fehiepsi/b15ac2978f1045d6d96b1d35b640d742). Basically, mvn's sample and log_prob have expensive computations based on `cholesky` and `trtrs`. We can save a lot of computation based on caching the unbroadcasted version of `scale_tril` (or `cov_diag`, `cov_factor` in lowrank mvn).
When expanding, this cached tensor should not be expanded together with other arguments.
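The scenario being optimized, roughly: expanding an MVN to a large batch should not force the Cholesky factor to be broadcast into every downstream computation (shapes illustrative):
```python
import torch
from torch.distributions import MultivariateNormal

d = 100
mvn = MultivariateNormal(torch.zeros(d), scale_tril=torch.eye(d))
expanded = mvn.expand(torch.Size([1000]))  # 1000 "copies" of the same MVN

x = expanded.sample()      # with the caching, reuses the single (d, d) scale_tril
lp = expanded.log_prob(x)  # instead of working with a (1000, d, d) factor
```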
Ref: https://github.com/uber/pyro/issues/1586
cc neerajprad fritzo
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14557
Differential Revision: D13277408
Pulled By: soumith
fbshipit-source-id: a6b16f999b008d5da148ccf519b7f32d9c6a5351
Summary:
Support broadcasting in `_kl_categorical_categorical`.
This makes it possible to do:
```python
import torch.distributions as dist
import torch
p_dist = dist.Categorical(torch.ones(1,10))
q_dist = dist.Categorical(torch.ones(100,10))
dist.kl_divergence(p_dist, q_dist)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10533
Differential Revision: D9341252
Pulled By: soumith
fbshipit-source-id: 34575b30160b43b6c9e4c3070dd7ef07c00ff5d7
Summary:
This pull request implements the low-rank multivariate normal distribution, where the covariance matrix has the form `W @ W.T + D`. Here D is a diagonal matrix and W has shape n x m with m << n. It uses the "matrix determinant lemma" and the "Woodbury matrix identity" to save computational cost.
Along the way, I also revised the MultivariateNormal distribution a bit. Here are the other changes:
+ `torch.trtrs` works with CUDA tensors, so I tried to use it instead of `torch.inverse`.
+ Use `torch.matmul` instead of `torch.bmm` in `_batch_mv`. The former is faster and simpler.
+ Use `torch.diagonal` for `_batch_diag`
+ Reimplement `_batch_mahalanobis` based on `_batch_trtrs_lower`.
+ Use trtrs to compute the second term of the KL divergence.
+ `variance` relies on `scale_tril` instead of `covariance_matrix`
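Usage sketch of the new distribution with n = 1000 and rank m = 10:
```python
import torch
from torch.distributions import LowRankMultivariateNormal

n, m = 1000, 10
loc = torch.zeros(n)
cov_factor = torch.randn(n, m)   # W, shape n x m with m << n
cov_diag = torch.rand(n) + 0.1   # D, the (positive) diagonal part

# Covariance is W @ W.T + diag(D), but the full n x n matrix is never materialized.
mvn = LowRankMultivariateNormal(loc, cov_factor, cov_diag)
x = mvn.sample()
lp = mvn.log_prob(x)  # uses the matrix determinant lemma / Woodbury identity internally
```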
TODO:
- [x] Resolve the fail at `_gradcheck_log_prob`
- [x] Add test for KL
cc fritzo stepelu apaszke
Pull Request resolved: https://github.com/pytorch/pytorch/pull/8635
Differential Revision: D8951893
Pulled By: ezyang
fbshipit-source-id: 488ee3db6071150c33a1fb6624f3cfd9b52760c3
* Codemod to update our codebase to 0.4 standard
* Update some of the test scripts
* remove Variable in test_clip_grad_value
* fix _symbolic_override_wrapper_maker
* Fix test_distributions when WITH_SCALARS.
* Use SCALAR_SHAPE in test, use self.scale in AffineTransform.
* Handle device correctly for scalars.
* Fix one hot categorical.
* Fix relaxed categorical.
* Add a new_tensor instance method to Variable that takes only data.
This works around the legacy problems of `new`, where e.g.
`new(5)` gives you an uninitialized tensor with 5 elements rather than a scalar (see the sketch after this list).
* Fix cuda scalar code path.
* Remove double return.
* Work around lack of WITH_SCALARS.
* Use tensor_new.
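A quick sketch of the `new_tensor` behavior referenced above (shown on a plain tensor; at the time it was added to `Variable`):
```python
import torch

base = torch.zeros(3, dtype=torch.float64)

# Legacy .new(5): allocates an *uninitialized* tensor with 5 elements.
uninitialized = base.new(5)

# .new_tensor(5): copies the data, so this is a scalar tensor holding 5,
# with base's dtype and device preserved.
scalar = base.new_tensor(5)
print(scalar.item(), scalar.dtype)  # 5.0 torch.float64
```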