133 Commits

774ae0851d [OpInfo] Added ReductionOpInfo subclass of OpInfo and ported sum test (#62737)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62737

ReductionOpInfo is a specialization of OpInfo for reduction operators. For now, it is designed to work with reductions that return a single tensor and that reduce all elements along one or more dimensions to a single value. In particular, this excludes operators such as `max` and `min`, which return multiple tensors, and `quantile`, which can return multiple values.
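For illustration only, here is a minimal sketch of the idea; it does not use the real `torch.testing._internal` constructor, and the class name, fields, and `sample_inputs` signature below are hypothetical:

```python
# Hypothetical, simplified stand-in for ReductionOpInfo; the real class has a
# much richer constructor (dtypes, sample inputs, skips, reference impls, ...).
from dataclasses import dataclass
from typing import Callable, Optional

import torch


@dataclass
class SimpleReductionOpInfo:
    name: str                            # e.g. "sum"
    op: Callable                         # the reduction callable under test
    identity: Optional[float] = None     # value of an empty reduction, if any
    supports_multiple_dims: bool = True  # can reduce several dims at once

    def sample_inputs(self, device, dtype):
        # Reductions covered here return a single tensor and collapse all
        # elements along one or more dimensions down to a single value.
        t = torch.randn(3, 4, 5, device=device, dtype=dtype)
        yield t, {}                   # full reduction
        yield t, {"dim": 1}           # reduce one dimension
        if self.supports_multiple_dims:
            yield t, {"dim": (0, 2)}  # reduce several dimensions at once


sum_info = SimpleReductionOpInfo(name="sum", op=torch.sum, identity=0.0)
for inp, kwargs in sum_info.sample_inputs("cpu", torch.float32):
    result = sum_info.op(inp, **kwargs)
```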

fixes https://github.com/pytorch/pytorch/issues/49746

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D30406568

Pulled By: heitorschueroff

fbshipit-source-id: 218b1da1902f67bcf4c3681e2a0f0029a25d51f1
2021-08-26 06:06:38 -07:00
f4bc28990f Compute cuda reduction buffer size in elements (#63969)
Summary:
Resubmit of https://github.com/pytorch/pytorch/issues/63885

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63969

Reviewed By: mruberry

Differential Revision: D30549423

Pulled By: ngimel

fbshipit-source-id: b16d25030d44ced789c125a333d72b02a8f45067
2021-08-25 18:18:37 -07:00
14d4723abd add bf16 support for bucketize (#55588)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/55588
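As a hedged usage sketch of what this enables (exact dtype coverage depends on the build), `torch.bucketize` can now take `bfloat16` boundaries and values:

```python
import torch

boundaries = torch.tensor([0.25, 0.5, 0.75], dtype=torch.bfloat16)
values = torch.tensor([0.1, 0.6, 0.9], dtype=torch.bfloat16)

# Returns the index of the bucket each value falls into; with this change the
# bfloat16 path is supported as well.
idx = torch.bucketize(values, boundaries)
print(idx)  # expected: tensor([0, 2, 3])
```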

Test Plan: Imported from OSS

Reviewed By: bdhirsh

Differential Revision: D28836796

Pulled By: VitalyFedyunin

fbshipit-source-id: c9ae5b969c30a45473533be5f29bb497f8da5143
2021-08-24 10:31:42 -07:00
99203580a9 Updates internal assert_allclose callsites in favor of assert_close (#61841)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61841

Redo of #60863.
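For reference, the replacement API is `torch.testing.assert_close`; a short usage sketch (the default tolerances are dtype-dependent):

```python
import torch
from torch.testing import assert_close

actual = torch.tensor([1.0, 2.0, 3.0])
expected = torch.tensor([1.0, 2.0, 3.0 + 1e-6])

# Raises an AssertionError with a detailed mismatch report if the tensors
# differ beyond the default rtol/atol for their dtype; this call passes.
assert_close(actual, expected)

# Tolerances can be overridden explicitly, much like assert_allclose before it.
assert_close(actual, expected, rtol=0.0, atol=1e-5)
```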

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D30408145

Pulled By: mruberry

fbshipit-source-id: 0b34ebc7f23ba38ecd89640b61d8aca59b7eab58
2021-08-19 12:50:41 -07:00
1022443168 Revert D30279364: [codemod][lint][fbcode/c*] Enable BLACK by default
Test Plan: revert-hammer

Differential Revision:
D30279364 (b004307252)

Original commit changeset: c1ed77dfe43a

fbshipit-source-id: eab50857675c51e0088391af06ec0ecb14e2347e
2021-08-12 11:45:01 -07:00
b004307252 [codemod][lint][fbcode/c*] Enable BLACK by default
Test Plan: manual inspection & sandcastle

Reviewed By: zertosh

Differential Revision: D30279364

fbshipit-source-id: c1ed77dfe43a3bde358f92737cd5535ae5d13c9a
2021-08-12 10:58:35 -07:00
4c4c5b14e4 Port sum.dim_IntList kernel to structured kernels. (#61642)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61642

Tracking issue: #55070

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D29783865

Pulled By: ezyang

fbshipit-source-id: 375d4cd5f915812108367601a610a428762e606d
2021-08-09 08:46:16 -07:00
e6ef87001c [BF16] Add BF16 support to _aminmax and _aminmax_all operators (#62767)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62767

Add BF16 support to _aminmax_all and _aminmax operators.
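These are private ops; as a hedged sketch of the behavior being enabled, here is the equivalent call through the public `torch.aminmax` (assuming its bfloat16 coverage matches the private ops touched here):

```python
import torch

x = torch.randn(4, 5).to(torch.bfloat16)

# One pass over the input returning both extrema, now usable on bfloat16.
mn, mx = torch.aminmax(x)             # full reduction over all elements
mn_d, mx_d = torch.aminmax(x, dim=1)  # per-row reduction
```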

Test Plan:
Added unit test:
https://www.internalfb.com/intern/testinfra/testconsole/testrun/2533274857208373/

Reviewed By: anjali411

Differential Revision: D30073837

fbshipit-source-id: 9cb4991e644cfdb2f0674ccaff161d223c174150
2021-08-06 08:56:12 -07:00
940cbbce76 Add BFloat16 support to CPU nansum (#61083)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61083

It's already supported on CUDA, so it seems reasonable to support on CPU as
well. This also changes `test_nansum` to compare against `torch.sum` since numpy
doesn't support BFloat16. Note that `test_nansum_vs_numpy` checks against
NumPy as well, so that's still being tested.
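A hedged sketch of the behavior this enables, mirroring how the updated `test_nansum` compares against `torch.sum`:

```python
import torch

x = torch.tensor([1.0, float("nan"), 2.0, float("nan")], dtype=torch.bfloat16)

# nansum treats NaNs as zero; with this change the bfloat16 CPU path works too.
result = torch.nansum(x)

# Reference computed the way the updated test does: zero out NaNs, then sum.
reference = torch.sum(torch.where(torch.isnan(x), torch.zeros_like(x), x))
assert torch.equal(result, reference)  # both equal 3.0 in bfloat16
```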

Test Plan: Imported from OSS

Reviewed By: navahgar

Differential Revision: D30006227

Pulled By: heitorschueroff

fbshipit-source-id: 1449730e1936417e7de1f8b3cf8cdcc15518873c
2021-08-02 16:03:57 -07:00
c44d9d9f70 Use cascade-summation to improve nansum accuracy (#61082)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61082

Fixes #59415

This implements nansum as a new `LoadPolicy` for the existing sum functions.
So, it's using the more accurate cascade-sum algorithm.
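For intuition, cascade (pairwise) summation keeps each element in only O(log n) additions instead of one long accumulation chain; a minimal NumPy sketch of the idea (not the actual C++ `LoadPolicy`/vectorized implementation):

```python
import numpy as np

def naive_sum(a):
    total = np.float32(0.0)
    for v in a:                  # one long chain of float32 additions
        total += v
    return total

def cascade_sum(a, block=128):
    if a.size <= block:
        return naive_sum(a)      # small blocks are summed sequentially
    mid = a.size // 2            # otherwise: split, sum halves, combine
    return cascade_sum(a[:mid], block) + cascade_sum(a[mid:], block)

a = np.full(100_000, 0.1, dtype=np.float32)
exact = a.astype(np.float64).sum()
print(abs(naive_sum(a) - exact))    # rounding error is clearly visible
print(abs(cascade_sum(a) - exact))  # much closer to the float64 reference
```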

I've also expanded `test_nansum` to cover the four special cases of the sum
algorithm (inner/outer reduction; vectorized or scalar).

Nansum performance comparison
-----------------------------
For float sums, contiguous reductions are as much as 10x faster and discontiguous sums are ~1.8x faster (more for small shapes due to TensorIterator overheads).

|        Shape | Dim | Master Contiguous (us) | This PR Contiguous (us) | Master Discontiguous (us) | This PR Discontiguous (us) |
|-------------:|-----|:----------------------:|:-----------------------:|:-------------------------:|:--------------------------:|
|     10, 1000 | 0   |          74.9          |           2.02          |            75.6           |            6.41            |
|              | 1   |          8.24          |           1.8           |            8.28           |            5.24            |
|    100, 1000 | 0   |           134          |           7.55          |            130            |            43.2            |
|              | 1   |          70.5          |           7.01          |            71.5           |            40.6            |
|   1000, 1000 | 0   |           726          |           69.2          |            737            |             403            |
|              | 1   |           702          |           51.0          |            709            |             404            |
|  10000, 1000 | 0   |         15,300         |          2,470          |           18,200          |           10,400           |
|              | 1   |          7,200         |          1,160          |           7,470           |            4,440           |
| 100000, 1000 | 0   |         163,000        |          28,000         |          199,000          |           131,000          |
|              | 1   |         70,700         |          13,500         |           75,700          |           44,200           |

Sum performance comparison
-------------------------

For float sums, performance is unchanged to within measurement precision:
|        Shape | Dim | Master Contiguous (us) | This PR Contiguous (us) | Master Discontiguous (us) | This PR Discontiguous (us) |
|-------------:|-----|:----------------------:|:-----------------------:|:-------------------------:|:--------------------------:|
|     10, 1000 | 0   |          1.92          |           2.01          |            4.2            |            4.49            |
|              | 1   |          1.68          |           1.68          |            2.79           |            2.75            |
|    100, 1000 | 0   |          6.52          |           7.07          |            26.9           |            27.3            |
|              | 1   |          5.91          |           5.66          |            16.8           |            16.9            |
|   1000, 1000 | 0   |          55.6          |           58.6          |            256            |             254            |
|              | 1   |          41.0          |           41.2          |            150            |             147            |
|  10000, 1000 | 0   |          1,370         |          1,650          |           8,070           |            8,020           |
|              | 1   |           908          |           845           |           3,100           |            2,980           |
| 100000, 1000 | 0   |         24,700         |          24,700         |           90,900          |           91,000           |
|              | 1   |         12,500         |          12,100         |           31,500          |           31,800           |

Test Plan: Imported from OSS

Reviewed By: mrshenli

Differential Revision: D29753523

Pulled By: ngimel

fbshipit-source-id: 28095ac39e4a07ff878775c98f7a7815d9a4e457
2021-07-19 21:47:43 -07:00
612632556d Fix torch.median crash on empty tensor (#61698)
Summary:
`torch.tensor([]).median()` now returns `nan` instead of crashing, which mimics the behavior of `np.median`.
Adds a test to `TestReductions.test_median_corner_cases`.
Fixes https://github.com/pytorch/pytorch/issues/61656
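A quick illustration of the fixed behavior:

```python
import torch

# Previously this crashed; after the fix it returns NaN, matching np.median
# on an empty input.
print(torch.tensor([]).median())  # tensor(nan)
```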

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61698

Reviewed By: heitorschueroff

Differential Revision: D29706912

Pulled By: malfet

fbshipit-source-id: ea5f58327fbff371f3fb8786b269430c7a10d05f
2021-07-16 12:36:18 -07:00
9d955abcdb Fix test_reductions when no SciPy is installed (#61699)
Summary:
Also, use the `skipIfNoSciPy` decorator instead of an inline `unittest.skipIf` check

This fixes regression introduced by https://github.com/pytorch/pytorch/pull/52565
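For context, the decorator comes from the shared test utilities; a sketch of the pattern (the import path and the example test body are assumptions, not code from this PR):

```python
import torch
from torch.testing._internal.common_utils import TestCase, run_tests, skipIfNoSciPy


class TestReductionsExample(TestCase):
    # The shared decorator replaces an inline
    # @unittest.skipIf(not TEST_SCIPY, "SciPy not found") guard.
    @skipIfNoSciPy
    def test_logsumexp_matches_scipy(self):
        from scipy.special import logsumexp  # only imported when SciPy exists
        x = torch.randn(10, dtype=torch.float64)
        self.assertAlmostEqual(torch.logsumexp(x, dim=0).item(),
                               float(logsumexp(x.numpy())), places=6)


if __name__ == "__main__":
    run_tests()
```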

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61699

Reviewed By: seemethere

Differential Revision: D29706938

Pulled By: malfet

fbshipit-source-id: 0b63c3ddadfa7f68bed994b71cadf68976d3b396
2021-07-15 15:57:11 -07:00
0c916c8a4e up the priority of numpy array comparisons in self.assertEqual (#59067)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/58988.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/59067

Reviewed By: jbschlosser

Differential Revision: D28986642

Pulled By: heitorschueroff

fbshipit-source-id: 3ef2d26b4010fc3519d0a1a020ea446ffeb46ba0
2021-06-22 13:07:07 -07:00
729f7cd52f Implement histogram operator on CPU (#58780)
Summary:
The existing [torch.histc](https://pytorch.org/docs/stable/generated/torch.histc.html) operator is limited in comparison to [numpy.histogram](https://numpy.org/doc/stable/reference/generated/numpy.histogram.html). This PR adds torch.histogram on CPU. The new operator replicates numpy.histogram's behavior, including support for caller-specified bin edges and weights. It was motivated by previous community requests for histogram.

The implementation was [benchmarked](https://docs.google.com/spreadsheets/d/1xCR0jODchVvwdVSAjiLsNCkmyictA6j1LNfDpWOafjw/edit?usp=sharing) against numpy.histogram as well as torch.histc. It is at least as fast as numpy.histogram across all input types tested, and performs in line with torch.histc for the limited inputs histc supports.
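A short usage sketch of the new operator's numpy-style interface:

```python
import torch

data = torch.tensor([0.1, 0.4, 0.5, 0.9, 1.7])
weights = torch.tensor([1.0, 2.0, 1.0, 1.0, 0.5])

# Caller-specified bin edges and per-element weights, as in numpy.histogram.
hist, bin_edges = torch.histogram(data, bins=torch.tensor([0.0, 0.5, 1.0, 2.0]),
                                  weight=weights)
print(hist)       # tensor([3.0000, 2.0000, 0.5000])
print(bin_edges)  # tensor([0.0000, 0.5000, 1.0000, 2.0000])
```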

mruberry

Pull Request resolved: https://github.com/pytorch/pytorch/pull/58780

Test Plan:
Added unit tests, OpInfo for the new torch.histogram operator.

Tested execution time on a variety of input sizes and compared to numpy.histogram performance: https://docs.google.com/spreadsheets/d/1xCR0jODchVvwdVSAjiLsNCkmyictA6j1LNfDpWOafjw/edit?usp=sharing

Reviewed By: ezyang

Differential Revision: D29134626

Pulled By: saketh-are

fbshipit-source-id: f2773085de1697f6bc6ffdeffe9a81267f51bdfc
2021-06-22 10:06:04 -07:00
0a26781966 fix numpy compatibility in test for torch.kthvalue (#59214)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/59201. Should be merged after https://github.com/pytorch/pytorch/issues/59067 to ensure this actually working correctly.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/59214

Reviewed By: albanD

Differential Revision: D28792363

Pulled By: mruberry

fbshipit-source-id: 0cf613463139352906fb567f1efcc582c2c25de8
2021-06-01 21:57:09 -07:00
e1078d42f0 std/var: Return real results for complex input (#58066)
Summary:
Fixes gh-56627
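A hedged illustration of the changed behavior:

```python
import torch

z = torch.randn(1000, dtype=torch.complex64)

# The variance of a complex tensor, E[|z - E[z]|^2], is a real non-negative
# quantity, so std/var now return real-dtype results for complex input.
print(z.var().dtype)  # torch.float32
print(z.std().dtype)  # torch.float32
```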

Pull Request resolved: https://github.com/pytorch/pytorch/pull/58066

Reviewed By: ngimel

Differential Revision: D28372987

Pulled By: mruberry

fbshipit-source-id: c34d55f1a48047ceefa298ef2f4f33ad7dd1e577
2021-05-12 03:26:55 -07:00
2043093217 Add correction parameter to std/var (#50903)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50903

First part of #50010. Also fixes #51127.
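In its current public form (the keyword landed through this series of PRs), `correction` generalizes the old `unbiased` flag; a short sketch:

```python
import torch

x = torch.randn(100)

# The divisor is N - correction: correction=1 is the old unbiased=True default,
# correction=0 is the old unbiased=False (population variance).
sample = x.var(correction=1)
biased = x.var(correction=0)

n = x.numel()
assert torch.isclose(biased, sample * (n - 1) / n)
```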

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D27911345

Pulled By: mruberry

fbshipit-source-id: 7138fddc935802918ab9ff19f4bc1b9f4d745d41
2021-05-07 14:40:28 -07:00
293830bc19 Fix min() and max() for empty tensors (#52565)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/34907

Pull Request resolved: https://github.com/pytorch/pytorch/pull/52565

Reviewed By: anjali411

Differential Revision: D27999955

Pulled By: ezyang

fbshipit-source-id: 30e88cc8d84806198500e3001ecf58fa764536dd
2021-04-30 15:55:10 -07:00
1f04494c0e Consolidate nondeterministic error tests (#55631)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/51498

Pull Request resolved: https://github.com/pytorch/pytorch/pull/55631

Reviewed By: malfet

Differential Revision: D27909953

Pulled By: mruberry

fbshipit-source-id: 9115b2433f9c276555be55bd51b270a7a2846829
2021-04-22 23:37:01 -07:00
35a66db774 Fix complex mean and reduction tests not being run (#55640)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55640

Mean is broken for complex types: since #53218 it allocates the result as a real
tensor, which discards the imaginary component. This wasn't caught in testing
because the `_test_dim_ops` tests are defined as closures inside `_test_dim_ops`
instead of as methods on the test class, so they never get run.
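The underlying discovery gotcha, in miniature (a generic sketch, not the actual test code): unittest only collects `test_*` methods defined on the class, so functions created inside a helper are silently ignored.

```python
import unittest


class TestDimOpsExample(unittest.TestCase):
    def _test_dim_ops(self):
        # Only exists as a local function if this helper is ever called, and
        # even then unittest never discovers it, so it never runs as a test.
        def test_mean_complex(self):
            self.assertEqual(1, 1)

    # Discovered and run: it is a test_* method on the TestCase itself.
    def test_sum_real(self):
        self.assertEqual(2, 2)


if __name__ == "__main__":
    unittest.main()  # runs test_sum_real only
```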

For best results, view diff with "Hide whitespace changes".

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D27671127

Pulled By: mruberry

fbshipit-source-id: 4a1f6fea1048919fda7339c867ee78e88f2d7bd2
2021-04-09 10:03:44 -07:00
4170a6cc24 Migrate mode from TH to ATen (#52043)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/24731, https://github.com/pytorch/pytorch/issues/24673, https://github.com/pytorch/pytorch/issues/24597, https://github.com/pytorch/pytorch/issues/24526, and https://github.com/pytorch/pytorch/issues/46507
Related https://github.com/pytorch/pytorch/issues/24507

Pull Request resolved: https://github.com/pytorch/pytorch/pull/52043

Reviewed By: mruberry

Differential Revision: D27468266

Pulled By: ngimel

fbshipit-source-id: 35a3229c2a706da9bad4ccd0070161831e5476ba
2021-04-02 22:21:53 -07:00
6e2d020037 Add interpolation kwarg to torch.quantile (#49267)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49267

This PR builds upon the PR https://github.com/pytorch/pytorch/pull/48711 by RockingJavaBean. The original PR introduced a BC breaking change by making the interpolation parameter positional. Thus, previous invocations of torch.quantile that did not include the interpolation parameter failed after the PR landed.

To avoid BC-breaking changes, we preserve the original signatures and make the interpolation parameter in the new signatures kwarg-only. For now, interpolation cannot have a default value, to avoid ambiguity with the deprecated signature. However, due to limitations of codegen and C++, we cannot have a required arg after optional ones, so this PR also makes dim and keepdim required args. Once the old signatures can be removed, the dim, keepdim, and interpolation parameters in the new signature will get their default values back.
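In its settled public form (the exact signature moved around during the transition described above), the keyword is used like this:

```python
import torch

x = torch.tensor([0.0, 1.0, 2.0, 3.0])

# interpolation is keyword-only; 'linear' is the default, matching numpy.
print(torch.quantile(x, 0.5))                         # tensor(1.5000)
print(torch.quantile(x, 0.5, interpolation="lower"))  # tensor(1.)
print(torch.quantile(x, 0.5, dim=0, keepdim=True,
                     interpolation="higher"))         # tensor([2.])
```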

__TODO__
 ---
- [ ] Run backward compat tests

This reverts commit 2f1d1eb7df5e8032392b73751c84025a2aa3d1ee.

Test Plan: Imported from OSS

Reviewed By: glaringlee

Differential Revision: D27337117

Pulled By: heitorschueroff

fbshipit-source-id: 7fe31f22027645e0d6cb3cab0392d532a4b362c9
2021-04-02 12:11:36 -07:00
f2a38a0edd Enabled BFloat16 support for argmax & argmin on both CPU & CUDA (#52582)
Summary:
1. Enabled `BFloat16` support for `argmax` & `argmin` on both CPU & CUDA
2. Added `OpInfo`s for `argmax` & `argmin`
3. Enabled `test_argminmax_multiple` for `float16`. It can't be enabled for `bfloat16`, as comparison is done with numpy, which doesn't currently support `bfloat16`.
4. Enabled `test_dim_arg_reduction_scalar` for `float16` & `bfloat16`.
5. Enabled `test_reduction_vectorize_along_output` for `bfloat16`.
6. Enabled `test_reduction_vectorize_along_input_corner` for `bfloat16`.
7. Enabled `test_dim_reduction` for both `float16` and `bfloat16`, except that both of them don't support `prod` on CPU.
8. Unskipped `TestCommonCPU.test_variant_consistency_jit` for dtype `bfloat16` for `amax` & `amin`, as they're passing.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/52582

Reviewed By: anjali411

Differential Revision: D27204704

Pulled By: heitorschueroff

fbshipit-source-id: cdad5df494d070f8e1a8fb83939441a91124b4d9
2021-03-23 03:38:11 -07:00
e698a634cc Enabled amin & amax for float16 & bfloat16 (#52579)
Summary:
1. Enabled `amax` & `amin` for `float16` & `bfloat16` dtypes for both CPU & CUDA.
2. Added `OpInfo`s for `amax` & `amin`.
3. Enabled `test_min_with_inf` & `test_max_with_inf` for both `float16` & `bfloat16`, as they also use `torch.amin` & `torch.amax` respectively.
4. Enabled `test_amax` & `test_amin` for `float16` but not for `bfloat16`, as comparison is done with `numpy`, which doesn't support `bfloat16`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/52579

Reviewed By: pbelevich

Differential Revision: D26784194

Pulled By: heitorschueroff

fbshipit-source-id: 1050de3e155b83f282fb30b0db6658eead89936c
2021-03-04 07:03:03 -08:00
3adc8f8cf7 Enable min & max for Float16 & BFloat16 (#51244)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/50790.

Added `min()` & `max()` support for `Float16` & `BFloat16`.
CUDA already supported these ops on `Float16`, so the other three combinations had to be enabled.
`OpInfo`s for `min` & `max` were also added, and their sample inputs were removed from `method_tests()`.
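A hedged sketch of the newly enabled CPU paths:

```python
import torch

x = torch.tensor([2.5, -1.0, 7.25], dtype=torch.bfloat16)

print(x.min(), x.max())    # bfloat16 scalars: -1.0 and 7.25
vals, idxs = x.max(dim=0)  # the dim overload returns (values, indices)
```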

### MORE INFO
The (slightly) long-term goal is to add dispatch for `min()` & `max()` related operations on CPU & CUDA for `Float16` & `BFloat16`,
wherever they aren't present already:
1. `amin()`
2. `argmax()`
3. `amax()`
4. `argmin()`
5. `torch._aminmax()`
6. `torch.clamp()` on CPU (already supported on CUDA)
7. `min()` (in this PR)
8. `max()` (in this PR)
9. `minimum()`
10. `maximum()`

I'll submit separate PRs for the other ops.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/51244

Reviewed By: jbschlosser

Differential Revision: D26503455

Pulled By: anjali411

fbshipit-source-id: c32247f214e9272ca2e4322a23337874e737b140
2021-02-18 23:13:51 -08:00
58eb23378f Clean up usage of torch._six partially (#49785)
Summary:
See https://github.com/pytorch/pytorch/issues/42919

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49785

Reviewed By: mruberry

Differential Revision: D25963833

Pulled By: bugra

fbshipit-source-id: 11c90d6b8d3f206c9d0a4d8621b773beb10c6ba2
2021-02-08 13:58:34 -08:00
5d45140d68 [numpy] torch.{all/any} : output dtype is always bool (#47878)
Summary:
BC-breaking note:

This PR changes the behavior of the any and all functions to always return a bool tensor. Previously these functions were only defined on bool and uint8 tensors, and when called on uint8 tensors they would also return a uint8 tensor. (When called on a bool tensor they would return a bool tensor.)
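A short illustration of the new behavior:

```python
import torch

u8 = torch.tensor([0, 1, 2], dtype=torch.uint8)

# Previously any()/all() on a uint8 tensor returned uint8; the result dtype is
# now always bool, matching numpy.
print(u8.any().dtype)  # torch.bool
print(u8.all().dtype)  # torch.bool

# Non-bool/uint8 dtypes are accepted as well now.
print(torch.tensor([0.0, 3.5]).any())  # tensor(True)
```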

PR summary:

https://github.com/pytorch/pytorch/pull/44790#issuecomment-725596687

Fixes items 2 and 3 from the comment linked above

Also Fixes https://github.com/pytorch/pytorch/issues/48352

Changes
* Output dtype is always `bool` (consistent with numpy). **BC-breaking**: previously the output matched the input dtype.
* Uses vectorized version for all dtypes on CPU
* Enables test for complex
* Update doc for `torch.all` and `torch.any`

TODO
* [x] Update docs
* [x] Benchmark
* [x] Raise issue on XLA

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47878

Reviewed By: albanD

Differential Revision: D25714324

Pulled By: mruberry

fbshipit-source-id: a87345f725297524242d69402dfe53060521ea5d
2021-01-08 11:05:39 -08:00
983bfc79ed Enable product for bool tensor (#48637)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/48351
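A quick illustration (hedged; the printed result may be promoted to an integer dtype):

```python
import torch

b = torch.tensor([True, True, False])

# Previously prod() was not implemented for bool tensors; it now reduces over
# the boolean elements and acts like a logical AND.
print(b.prod())      # falsy: one element is False
print(b[:2].prod())  # truthy: all elements are True
```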

Pull Request resolved: https://github.com/pytorch/pytorch/pull/48637

Reviewed By: mrshenli

Differential Revision: D25658596

Pulled By: mruberry

fbshipit-source-id: ff3ada74b6d281c8e4753ed38339a1c036f722ee
2020-12-21 14:11:26 -08:00
afce5890ff Revert D25421263: [pytorch][PR] [numpy] torch.{all/any} : output dtype is always bool
Test Plan: revert-hammer

Differential Revision:
D25421263 (c508e5b1bf)

Original commit changeset: c6c681ef9400

fbshipit-source-id: 4c0c9acf42b06a3ed0af8f757ea4512ca35b6c59
2020-12-16 11:11:13 -08:00
c508e5b1bf [numpy] torch.{all/any} : output dtype is always bool (#47878)
Summary:
BC-breaking note:

This PR changes the behavior of the any and all functions to always return a bool tensor. Previously these functions were only defined on bool and uint8 tensors, and when called on uint8 tensors they would also return a uint8 tensor. (When called on a bool tensor they would return a bool tensor.)

PR summary:

https://github.com/pytorch/pytorch/pull/44790#issuecomment-725596687

Fixes items 2 and 3 from the comment linked above

Also Fixes https://github.com/pytorch/pytorch/issues/48352

Changes
* Output dtype is always `bool` (consistent with numpy). **BC-breaking**: previously the output matched the input dtype.
* Uses vectorized version for all dtypes on CPU
* Enables test for complex
* Update doc for `torch.all` and `torch.any`

TODO
* [x] Update docs
* [x] Benchmark
* [x] Raise issue on XLA

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47878

Reviewed By: H-Huang

Differential Revision: D25421263

Pulled By: mruberry

fbshipit-source-id: c6c681ef94004d2bcc787be61a72aa059b333e69
2020-12-15 13:59:32 -08:00
2f1d1eb7df Revert D25428587: [pytorch][PR] add additional interpolation modes for torch.quantile
Test Plan: revert-hammer

Differential Revision:
D25428587 (25a8397bf3)

Original commit changeset: e98d24f6a651

fbshipit-source-id: fb217b8a19e853e83779a4edd312be86b26eb26d
2020-12-11 07:50:16 -08:00
25a8397bf3 add additional interpolation modes for torch.quantile (#48711)
Summary:
Fix https://github.com/pytorch/pytorch/issues/48523
Related  https://github.com/pytorch/pytorch/issues/38349

**BC-breaking Note:**

This PR updates PyTorch's quantile function to add the additional interpolation methods `lower`, `higher`, `nearest`, and `midpoint`, which are already supported by NumPy.

New parameter `interpolation` is added to the signature for both `torch.quantile` and `torch.nanquantile` functions.

- `quantile(input, q, dim=None, interpolation='linear', keepdim=False, *, out=None) -> Tensor`
- `nanquantile(input, q, dim=None, interpolation='linear', keepdim=False, *, out=None) -> Tensor`

The function signatures follow the NumPy-like style for now, keeping `out` at the end to be consistent with PyTorch.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/48711

Reviewed By: H-Huang

Differential Revision: D25428587

Pulled By: heitorschueroff

fbshipit-source-id: e98d24f6a651d302eb94f4ff4da18e38bdbf0124
2020-12-10 10:10:51 -08:00
36c87f1243 Refactors test_torch.py to be fewer than 10k lines (#47356)
Summary:
Creates multiple new test suites to have fewer tests in test_torch.py, consistent with previous test suite creation like test_unary_ufuncs.py and test_linalg.py.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47356

Reviewed By: ngimel

Differential Revision: D25202268

Pulled By: mruberry

fbshipit-source-id: 75fde3ca76545d1b32b86d432a5cb7a5ba8f5bb6
2020-11-28 20:11:40 -08:00