162 Commits

83e8612d11 Clean up test autograd (#67413)
Summary:
Partially fixes https://github.com/pytorch/pytorch/issues/66066

This PR:
 - cleans up op-specific testing from test_autograd. test_autograd should be reserved for testing generic autograd functionality
 - tests related to an operator are better colocated
 - see the tracker for details

What to think about when moving tests to their correct test suite:
 - naming, make sure it's not too generic
 - how the test is parametrized, sometimes we need to add/remove a device/dtype parameter
 - can this be merged with existing tests

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67413

Reviewed By: jbschlosser, albanD

Differential Revision: D32031480

Pulled By: soulitzer

fbshipit-source-id: 8e13da1e58a38d5cecbfdfd4fe2b4fe6f816897f
2021-11-03 15:26:09 -07:00
885a8e53ba replace onlyOnCPUAndCUDA with onlyNativeDeviceTypes (#65201)
Summary:
Reference https://github.com/pytorch/pytorch/issues/53849

Replace `onlyOnCPUAndCUDA` with `onlyNativeDeviceTypes`, which includes `cpu`, `cuda`, and `meta`.
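
A minimal sketch of how a test picks up the new decorator (the test class and test name here are hypothetical; the imports are PyTorch's internal test helpers):

```
import torch
from torch.testing._internal.common_device_type import (
    instantiate_device_type_tests,
    onlyNativeDeviceTypes,
)
from torch.testing._internal.common_utils import TestCase, run_tests


class TestExample(TestCase):
    # Instantiated for cpu, cuda, and meta rather than just cpu and cuda.
    @onlyNativeDeviceTypes
    def test_ones_shape(self, device):
        self.assertEqual(torch.ones(3, device=device).shape, (3,))


instantiate_device_type_tests(TestExample, globals())

if __name__ == "__main__":
    run_tests()
```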

Pull Request resolved: https://github.com/pytorch/pytorch/pull/65201

Reviewed By: mrshenli

Differential Revision: D31299718

Pulled By: mruberry

fbshipit-source-id: 2d8356450c035d6a314209ab51b2c237583920fd
2021-11-01 09:22:34 -07:00
83f70db95c Fix common device computation for comparison ops. (#66245)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66245

Fixes #66053

This PR splits `declare_static_dtype_and_device` into two new methods for
`TensorIteratorBase`: `declare_static_dtype` and `declare_static_device`.

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D31503849

Pulled By: ngimel

fbshipit-source-id: 4b131b691d29ceb5f3709f5d6503997ea0875c54
2021-10-22 18:43:17 -07:00
8a65047acc [skip ci] Set test owners for everything considered with module: tests (#66865)
Summary:
Action following https://github.com/pytorch/pytorch/issues/66232

cc mruberry

Pull Request resolved: https://github.com/pytorch/pytorch/pull/66865

Reviewed By: anjali411

Differential Revision: D31771147

Pulled By: janeyx99

fbshipit-source-id: 8bebe5ac2098364ef1ee93b590abb5f4455b0f89
2021-10-20 09:37:03 -07:00
0b8dc0f04a add BFloat16 operators on CPU: logaddexp, logaddexp2, remainder (#63621)
Summary:
Fixes #{issue number}
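
A quick smoke test of the newly covered dtype (illustrative, not taken from the PR):

```
import torch

a = torch.randn(4, dtype=torch.bfloat16)
b = torch.randn(4, dtype=torch.bfloat16)

# Each op now has a native BFloat16 CPU kernel; results stay in bfloat16.
print(torch.logaddexp(a, b).dtype)   # torch.bfloat16
print(torch.logaddexp2(a, b).dtype)  # torch.bfloat16
print(torch.remainder(a, b).dtype)   # torch.bfloat16
```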

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63621

Reviewed By: H-Huang

Differential Revision: D31640811

Pulled By: mruberry

fbshipit-source-id: 1fd061b65c196398738018eefc52bf459e424b1c
2021-10-15 13:11:45 -07:00
2223737da9 restore test_inplace_comparison_ops_require_inputs_have_same_dtype Expected behavior (#64267)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64267

This test expects every operation to throw a runtime error.

It also reinserts the in-place operation test and fixes a bug in the comparison operation.

fix: #64018

Test Plan: Imported from OSS

Reviewed By: gchanan

Differential Revision: D30720915

Pulled By: ezyang

fbshipit-source-id: 215a6556d20770f70f4ced1c1f9a9753933f1d37
2021-09-08 06:42:12 -07:00
7e4ebe06ca Fixes issue related to torch.trapezoid broadcasting behavior and documentation (#64054)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64054

Fixes #63608

cc mruberry rgommers heitorschueroff
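
For reference, a small example of the broadcasting behavior in question (illustrative values; a 1-D `x` is broadcast against each row of `y`):

```
import torch

y = torch.arange(12, dtype=torch.float32).reshape(3, 4)
x = torch.tensor([0.0, 1.0, 3.0, 6.0])  # shared sample points for every row

# Integrates along the last dimension, yielding one value per row.
print(torch.trapezoid(y, x))  # shape (3,)
```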

Test Plan: Imported from OSS

Reviewed By: saketh-are

Differential Revision: D30617078

Pulled By: NivekT

fbshipit-source-id: 815896ec56d447562790df4d662e94fd13457e2a
2021-09-07 11:41:55 -07:00
26b7ff5aea deprecate dtype getters from torch.testing namespace (#63554)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63554

Following https://github.com/pytorch/pytorch/pull/61840#issuecomment-884087809, this deprecates all the dtype getters publicly exposed in the `torch.testing` namespace. The reason for this is twofold:

1. If someone is not familiar with the C++ dispatch macros PyTorch uses, the names are misleading. For example `torch.testing.floating_types()` will only give you `float32` and `float64` skipping `float16` and `bfloat16`.
2. The dtype getters provide very minimal functionality that can be easily emulated by downstream libraries.

We thought about [providing a replacement](https://gist.github.com/pmeier/3dfd2e105842ad0de4505068a1a0270a), but ultimately decided against it. The major problem is BC: by keeping it, either the namespace gets messy again after a new dtype is added or we need to somehow version the return values of the getters.
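
For downstream code, the emulation can be as small as a handful of module-level constants, e.g. (an illustrative sketch, not a provided API):

```
import torch

# Pin down exactly the dtype sets the test suite cares about.
FLOATING_TYPES = (torch.float32, torch.float64)
FLOATING_TYPES_AND_HALF = FLOATING_TYPES + (torch.float16, torch.bfloat16)
INTEGRAL_TYPES = (torch.uint8, torch.int8, torch.int16, torch.int32, torch.int64)
COMPLEX_TYPES = (torch.complex64, torch.complex128)
```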

Test Plan: Imported from OSS

Reviewed By: H-Huang

Differential Revision: D30662206

Pulled By: mruberry

fbshipit-source-id: a2bdb10ab02ae665df1b5b76e8afa9af043bbf56
2021-09-07 08:58:51 -07:00
83e28a7d28 Use stacklevel for floordiv deprecation warnings (#64034)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/60548

`Tensor.__floordiv__` was indirectly deprecated by deprecation of `torch.floor_divide` (see https://github.com/pytorch/pytorch/issues/43874). Deprecating it directly provides clearer feedback.

Repro:
```
import torch
x = torch.tensor(0)
x // 1
```

Before this change, a deprecation warning was triggered within the C++ implementation of floor_divide:
```
UserWarning: floor_divide is deprecated, and will be removed in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values.
To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor'). (Triggered internally at  ../aten/src/ATen/native/BinaryOps.cpp:571.)
  return torch.floor_divide(self, other)
```

After this change, the warning instead cites the user's offending line of Python code:
```
UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values.
To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  x // 1
```
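
The underlying mechanism is the standard `stacklevel` argument of `warnings.warn`; a generic illustration (not the PR's actual code):

```
import warnings

def legacy_floordiv(a, b):
    # stacklevel=2 attributes the warning to the caller's line, not this one.
    warnings.warn(
        "__floordiv__ is deprecated, and its behavior will change "
        "in a future version of pytorch.",
        UserWarning,
        stacklevel=2,
    )
    return a // b

legacy_floordiv(7, 2)  # the emitted warning points at this line
```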

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64034

Reviewed By: mruberry

Differential Revision: D30658010

Pulled By: saketh-are

fbshipit-source-id: b0e6c5008d741897509d102f4a89efb47de4aa2a
2021-08-31 11:27:56 -07:00
d37636901e [Doc] make_tensor to torch.testing module (#63925)
Summary:
This PR aims to add `make_tensor` to the `torch.testing` module in PyTorch docs.

TODOs:

* [x] Add examples

cc: pmeier mruberry brianjo
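
A usage example along the lines of what the docs gained (keyword arguments shown for clarity; the exact signature may vary slightly between releases):

```
import torch

t = torch.testing.make_tensor((3, 3), dtype=torch.float32, device="cpu", low=0, high=1)
print(t.shape, t.dtype)           # torch.Size([3, 3]) torch.float32
print(t.min() >= 0, t.max() <= 1)
```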

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63925

Reviewed By: ngimel

Differential Revision: D30633487

Pulled By: mruberry

fbshipit-source-id: 8e5a1f880c6ece5925b4039fee8122bd739538af
2021-08-30 12:25:40 -07:00
70a3210eca Add BinaryUfuncOpInfo and broadcasting tests (#61964)
Summary:
As proof of concept, this PR uses the new `BinaryUfuncOpInfo` in broadcasting tests for `add`, `sub`, `mul`, `div`, `floor_div`, and `true_div`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61964

Reviewed By: ngimel

Differential Revision: D30407734

Pulled By: mruberry

fbshipit-source-id: ada28994f43b0635f279f45a02ecba18bc8ee033
2021-08-20 11:44:15 -07:00
1022443168 Revert D30279364: [codemod][lint][fbcode/c*] Enable BLACK by default
Test Plan: revert-hammer

Differential Revision:
D30279364 (b004307252)

Original commit changeset: c1ed77dfe43a

fbshipit-source-id: eab50857675c51e0088391af06ec0ecb14e2347e
2021-08-12 11:45:01 -07:00
b004307252 [codemod][lint][fbcode/c*] Enable BLACK by default
Test Plan: manual inspection & sandcastle

Reviewed By: zertosh

Differential Revision: D30279364

fbshipit-source-id: c1ed77dfe43a3bde358f92737cd5535ae5d13c9a
2021-08-12 10:58:35 -07:00
87465a6e68 adding operator cumulative_trapezoid (#61615)
Summary:
Stack from [ghstack](https://github.com/ezyang/ghstack):
* https://github.com/pytorch/pytorch/issues/61616
* **https://github.com/pytorch/pytorch/issues/61615**
* https://github.com/pytorch/pytorch/issues/61475

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61615

Reviewed By: malfet, mruberry

Differential Revision: D29975064

Pulled By: NivekT

fbshipit-source-id: 4d4e98f3efb720fdc44eb238ecbf0fa157ac13d7
2021-08-03 08:04:00 -07:00
fd8004b42e add bfloat16 impl for nextafter (#61829)
Summary:
Add `BFloat16` support for `nextafter`.

* [x] Add OpInfo
* [x] Add Implementation Test (C++ tests)
* [x] Add credit
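
A quick check of the new dtype coverage (illustrative):

```
import torch

a = torch.tensor([1.0, -1.0], dtype=torch.bfloat16)
b = torch.tensor([2.0, -2.0], dtype=torch.bfloat16)

# Smallest representable bfloat16 step from each element of `a` toward `b`.
print(torch.nextafter(a, b))
```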

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61829

Reviewed By: ejguan

Differential Revision: D29932498

Pulled By: mruberry

fbshipit-source-id: 89524531a4800569ba1addd08a4ace330a6f72a4
2021-08-02 23:16:58 -07:00
109bd5e78a OpInfo: bitwise_and (#61349)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61349

Also add type promotion test for bugs found by pr #60813
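
The kind of type-promotion behavior the added test exercises, in miniature (illustrative values):

```
import torch

a = torch.tensor([0b1100], dtype=torch.uint8)
b = torch.tensor([0b1010], dtype=torch.int16)

out = torch.bitwise_and(a, b)
print(out, out.dtype)  # tensor([8], dtype=torch.int16) -- uint8 & int16 promotes to int16
```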

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D29592840

Pulled By: ezyang

fbshipit-source-id: ee013b20e31baf6c6ebf2edb881ae6d8e215c7a6
2021-07-22 07:04:17 -07:00
604f503d30 Revert D29794958 + compilation fix (#61937)
Summary:
This PR un-reverts https://github.com/pytorch/pytorch/issues/61475 + fixes compilation with MSVC, which does not recognize alternative operator spellings (i.e., using `or` instead of `||`).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61937

Reviewed By: albanD

Differential Revision: D29805941

Pulled By: malfet

fbshipit-source-id: 01e5963c6717c1b44b260300d87ba0bf57f26ce9
2021-07-20 18:14:45 -07:00
22fff61f06 Revert D29794958: [pytorch][PR] changing trapz to trapezoid
Test Plan: revert-hammer

Differential Revision:
D29794958 (95cec8f4fa)

Original commit changeset: 60b9c07efd47

fbshipit-source-id: 2dcda2d62e01c2521a86ae5ed8246cfb686d3f64
2021-07-20 16:00:46 -07:00
95cec8f4fa changing trapz to trapezoid (#61475)
Summary:
This PR resolves issue https://github.com/pytorch/pytorch/issues/52606 while also adding support for complex numbers.

Stack from [ghstack](https://github.com/ezyang/ghstack):
* https://github.com/pytorch/pytorch/issues/61616
* https://github.com/pytorch/pytorch/issues/61615
* **https://github.com/pytorch/pytorch/issues/61475**

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61475

Reviewed By: mruberry

Differential Revision: D29794958

Pulled By: NivekT

fbshipit-source-id: 60b9c07efd47fd85b9c8178768fc7828d7b57d29
2021-07-20 15:25:55 -07:00
4d9fd8958b Support __rand__, __ror__ and __rxor__ (#59240)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/58120.

This PR implements `torch.Tensor.{__rand__/__ror__/__rxor__}` for the compatibility with NumPy’s interface.
(cc: mruberry, rgommers, emcastillo, kmaehashi)
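
With these reflected methods in place, a plain Python integer on the left-hand side dispatches to the Tensor operand, e.g.:

```
import torch

t = torch.tensor([0b0011, 0b0101])

print(0b0110 & t)  # Tensor.__rand__ -> tensor([2, 4])
print(0b0110 | t)  # Tensor.__ror__  -> tensor([7, 7])
print(0b0110 ^ t)  # Tensor.__rxor__ -> tensor([5, 3])
```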

Pull Request resolved: https://github.com/pytorch/pytorch/pull/59240

Reviewed By: ngimel

Differential Revision: D29482304

Pulled By: mruberry

fbshipit-source-id: 13789202c1d8dddf8658a45381aeedcc31e2f603
2021-07-07 13:34:14 -07:00
03b5a225a7 Test parametrization for instantiated device-specific tests (#60233)
Summary:
The `ops` decorator provides a way to parameterize a test across a given list of ops. This would be useful for modules as well (e.g. a `modules` decorator), but the mechanism by which this is accomplished is specific to ops. In the details, the `ops` decorator tags a test function with the metadata needed (list of ops, `dtypes`) and the actual tests are generated according to this metadata during the call to `instantiate_device_type_tests()`.

This PR makes this mechanism more generic, allowing for test parameterization across arbitrary dimensions. This makes a `modules` decorator (or any similar type of decorator) straightforward to implement without changes to the device-specific test instantiation logic.

One caveat is that, since this is implemented where the old `ops` decorator was (within `instantiate_device_type_tests()`), this only works for tests instantiated using the device-specific instantiation logic. Longer term, even device-specific test instantiation could be treated as an optional parameterization across device types, but this PR takes a low-risk approach for now. In practice, this just means that a `device` kwarg is required for all test signatures used with the mechanism.

The `ops` decorator has been refactored to use the generic mechanism and works the same as before, with one difference: when `OpDTypes.none` is specified, the test signature no longer needs an unused `dtype` kwarg. This is a nice bonus that demonstrates the added flexibility of a generic parameterization mechanism. The refactored form also has the bonus that all op-specific test generation logic is contained within the `ops` decorator class, improving readability.

Behind the scenes, the generic mechanism is a base decorator class (`_TestParameterizer`) from which `ops` derives. The core functionality is in the `_parameterize_test()` method, which takes in a test function and returns a generator that produces parameterized tests, including names and parameter kwargs to pass to them. Using the `ops` decorator results in a set of op-specific tests from a given generic test.
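
A simplified sketch of the shape of such a decorator (names and signatures here are illustrative, not the real internal API):

```
class _TestParameterizer:
    def _parameterize_test(self, test):
        # Yield (test_fn, name_suffix, param_kwargs) triples; the device-type
        # instantiation logic turns each triple into a concrete test method.
        raise NotImplementedError


class modules(_TestParameterizer):
    def __init__(self, module_infos):
        self.module_infos = module_infos

    def _parameterize_test(self, test):
        for module_info in self.module_infos:
            def parameterized(self, device, *, _mi=module_info, **kwargs):
                return test(self, device, _mi, **kwargs)

            yield parameterized, module_info.name, {}
```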

Pull Request resolved: https://github.com/pytorch/pytorch/pull/60233

Reviewed By: iramazanli

Differential Revision: D29494995

Pulled By: jbschlosser

fbshipit-source-id: a14446488c106094fafcaa75ccf8e9e3faf33bfc
2021-06-30 18:50:22 -07:00
dfd2edc025 [special] add zeta (#59623)
Summary:
Reference https://github.com/pytorch/pytorch/issues/50345

`zeta` was already present in the codebase to support computation of `polygamma`.

However, `zeta` only had a `double(double, double)` signature **for CPU** before this PR (which meant that `polygamma` computations were always upcast to `double` for the zeta part).

With this PR, float computations will take place in float and double in double.

Have also refactored the code and moved the duplicate code from `Math.cuh` to `Math.h`

**Note**: For scipy, `q` is optional, and if it is `None`, it defaults to `1`, which corresponds to the Riemann zeta function. However, for `torch.special.zeta`, I made it mandatory because it feels odd that without `q` this is the Riemann zeta and with `q` it is the general Hurwitz zeta. I think sticking to just the general form makes more sense, since passing `1` for `q` is trivial.

Verify:
* [x] Docs https://14234587-65600975-gh.circle-artifacts.com/0/docs/special.html#torch.special.zeta
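
A quick sanity check of the exposed operator (with `q = 1` the Hurwitz zeta reduces to the Riemann zeta function):

```
import torch

# zeta(2, 1) = pi**2 / 6 ≈ 1.6449
print(torch.special.zeta(torch.tensor(2.0), torch.tensor(1.0)))
```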

Pull Request resolved: https://github.com/pytorch/pytorch/pull/59623

Reviewed By: ngimel

Differential Revision: D29348269

Pulled By: mruberry

fbshipit-source-id: a3f9ebe1f7724dbe66de2b391afb9da1cfc3e4bb
2021-06-24 00:00:12 -07:00
26cdec6ce4 Support torch.bitwise_{left/right}_shift and __rlshift__, __rrshift__ (#59544)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/58121

This PR implements `torch.bitwise_left_shift`, `torch.bitwise_right_shift`, and `torch.Tensor.{__rlshift__/__rrshift__}` for compatibility with the Python array API standard.
(cc: mruberry, rgommers, emcastillo, kmaehashi)
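
Example usage of the new functions and the reflected operators (illustrative values):

```
import torch

a = torch.tensor([1, 2, 4])

print(torch.bitwise_left_shift(a, 2))   # tensor([ 4,  8, 16])
print(torch.bitwise_right_shift(a, 1))  # tensor([0, 1, 2])

# A Python int on the left dispatches to Tensor.__rlshift__ / __rrshift__.
print(1 << torch.tensor([1, 2, 3]))  # tensor([2, 4, 8])
print(8 >> torch.tensor([1, 2, 3]))  # tensor([4, 2, 1])
```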

Pull Request resolved: https://github.com/pytorch/pytorch/pull/59544

Reviewed By: ngimel

Differential Revision: D29348869

Pulled By: mruberry

fbshipit-source-id: 329aee296cf890735e8a9f858bccfe87c03d06ca
2021-06-23 23:57:16 -07:00
9e773ea7d5 Use accscalar_t for CUDA add/sub with Tensor and Scalar (#60454)
Summary:
Follow up of https://github.com/pytorch/pytorch/issues/60227, related to https://github.com/pytorch/pytorch/issues/59907 & https://github.com/pytorch/pytorch/issues/58833

With this pull request, `torch.add` & `torch.sub` use `acc_type` for `Scalar` if either of the two arguments is a `Scalar`.
This mimics the behavior of [`torch.mul`](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/cuda/BinaryMulDivKernel.cu#L18), `torch._foreach_(add|sub).Scalar` and `torch._foreach_(add|sub).ScalarList`.

 ---

**reference**
- torch.mul CUDA kernel: b0c9762e2d/aten/src/ATen/native/cuda/BinaryMulDivKernel.cu (L17-L25)
- `torch._foreach_(add|sub).Scalar`: cast scalar b0c9762e2d/aten/src/ATen/native/cuda/ForeachBinaryOpScalar.cu (L27)
- `torch._foreach_(add|sub).ScalarList`: `BinaryOpScalarListFunctor` b0c9762e2d/aten/src/ATen/native/cuda/ForeachFunctors.cuh (L180-L182) and multi_tensor_apply handles `scalar_t` and computes in `opmath_t` (almost equivalent to `accscalar_t`) b0c9762e2d/aten/src/ATen/native/cuda/MultiTensorApply.cuh (L60-L68). `BinaryOpScalarListFunctor` is used at b0c9762e2d/aten/src/ATen/native/cuda/ForeachBinaryOpScalarList.cu (L24)

cc ngimel ptrblck mcarilli

Pull Request resolved: https://github.com/pytorch/pytorch/pull/60454

Reviewed By: VitalyFedyunin

Differential Revision: D29345035

Pulled By: ngimel

fbshipit-source-id: 5dbafbdfe029a9544ec2e58f17d547928e017a04
2021-06-23 18:59:22 -07:00
0c916c8a4e up the priority of numpy array comparisons in self.assertEqual (#59067)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/58988.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/59067

Reviewed By: jbschlosser

Differential Revision: D28986642

Pulled By: heitorschueroff

fbshipit-source-id: 3ef2d26b4010fc3519d0a1a020ea446ffeb46ba0
2021-06-22 13:07:07 -07:00
b298013cd5 [add/sub] Cast alpha to acc_type (#60227)
Summary:
This PR lets `torch.add` & `torch.sub` CUDA kernels cast `alpha` to `acc_type`, not `scalar_t`.
I do not remove `cast`s from `test/test_foreach.py` because I'll do this in https://github.com/pytorch/pytorch/issues/59907 or follow-up for it.

Current upstream `torch._foreach_add` & `torch._foreach_sub` upcast the `alpha` parameter to `acc_type<scalar_t>`, while `torch.add` & `torch.sub` do not. This is problematic because the outputs of `torch.add` and `torch.sub` differ from `torch._foreach_add` and `torch._foreach_sub`, respectively, if the dtype of the input tensors is either `torch.half` or `torch.bfloat16`. The discrepancy is roughly proportional to `abs(alpha)`, except when `alpha` is exactly representable in 16 bits.
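
One way to observe the two code paths side by side (a rough sketch; requires CUDA, uses the private `_foreach_add`, and the residual it prints depends on the values involved):

```
import torch

if torch.cuda.is_available():
    x = torch.rand(1024, dtype=torch.half, device="cuda")
    y = torch.rand(1024, dtype=torch.half, device="cuda")
    alpha = 1e-3  # not exactly representable in 16 bits

    eager = torch.add(x, y, alpha=alpha)
    foreach = torch._foreach_add([x], [y], alpha=alpha)[0]
    # With alpha cast to acc_type in both kernels, this difference should be 0.
    print((eager - foreach).abs().max())
```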

ref:
- `torch._foreach_add` & `torch._foreach_sub` cast `alpha`: 6d0fb85a62/aten/src/ATen/native/cuda/ForeachBinaryOpList.cu (L21-L28), `BinaryOpListAlphaFunctor` is defined here: 6d0fb85a62/aten/src/ATen/native/cuda/ForeachFunctors.cuh (L202)

related: https://github.com/pytorch/pytorch/issues/58833, https://github.com/pytorch/pytorch/pull/59907

cc ngimel ptrblck mcarilli

Pull Request resolved: https://github.com/pytorch/pytorch/pull/60227

Reviewed By: mruberry

Differential Revision: D29252759

Pulled By: ngimel

fbshipit-source-id: 847f3b9493ae30a900f7445af00aef1abcc1ab21
2021-06-20 19:05:22 -07:00
0a5bfa9919 Support __rmod__ (#58476)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/58035.

This PR implements `torch.Tensor.__rmod__` and `torch.remainder(scalar, tensor)` for the compatibility with NumPy’s interface.
(cc: mruberry, rgommers, emcastillo, kmaehashi)

TODO:
  - [x] Update `tensor_binary_op` in test/test_binary_ufuncs.py after https://github.com/pytorch/pytorch/issues/58216 is merged.
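
Example of the two new entry points (illustrative values):

```
import torch

t = torch.tensor([2, 3, 4])

print(5 % t)                  # Tensor.__rmod__      -> tensor([1, 2, 1])
print(torch.remainder(5, t))  # remainder(Scalar, Tensor) overload, same result
```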

Pull Request resolved: https://github.com/pytorch/pytorch/pull/58476

Reviewed By: ngimel

Differential Revision: D28776810

Pulled By: mruberry

fbshipit-source-id: 74f8aea80f439ef2cc370333524e39971eeb7bf4
2021-06-05 16:19:24 -07:00
3113a1de4a Fix some tensor operators to return NotImplemented for invalid inputs (#58216)
Summary:
Same as https://github.com/pytorch/pytorch/issues/57934. (cc/ albanD)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/58216

Reviewed By: ailzhang

Differential Revision: D28494886

Pulled By: albanD

fbshipit-source-id: 380205867ee1cde90e1c6fcfe2a31749e1243530
2021-05-19 13:09:57 -07:00
098d9975a7 Port heaviside to structured kernel (#57933)
Summary:
Port heaviside to structured kernel
Related https://github.com/pytorch/pytorch/issues/55070
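
The port does not change the op's semantics; for reference (illustrative values):

```
import torch

inp = torch.tensor([-1.5, 0.0, 2.0])
values = torch.tensor(0.5)  # value used where the input is exactly zero

print(torch.heaviside(inp, values))  # tensor([0.0000, 0.5000, 1.0000])
```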

Pull Request resolved: https://github.com/pytorch/pytorch/pull/57933

Reviewed By: mruberry

Differential Revision: D28362533

Pulled By: ezyang

fbshipit-source-id: 96b4591db3f609434784bd0ef9e54c61c918fb88
2021-05-13 10:48:11 -07:00
5e83c62a9e Revert D28351931: [pytorch][PR] Fix some tensor operators to return NotImplemented for invalid inputs
Test Plan: revert-hammer

Differential Revision:
D28351931 (35521a2629)

Original commit changeset: 985457a44dba

fbshipit-source-id: 10724c219e53648f10a70719e25bcf774c6c7852
2021-05-12 13:58:03 -07:00
35521a2629 Fix some tensor operators to return NotImplemented for invalid inputs (#57934)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/57719.

This PR fixes `torch.Tensor{__rsub__, __rdiv__, __rtruediv__, __pow__, __rmatmul__}` to return `NotImplemented` instead of raising a `TypeError`.
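
Returning `NotImplemented` lets Python's binary-operator protocol fall back to the other operand's reflected method; a generic illustration with a made-up class:

```
import torch

class Exponent:
    def __rpow__(self, base):
        # Reached only because Tensor.__pow__ defers on the unknown operand.
        return f"custom pow with base {base}"

print(torch.tensor(2.0) ** Exponent())  # custom pow with base tensor(2.)
```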

cc/ mruberry: The first commit of this PR is the same as 1d209db1cc except for the commit message.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/57934

Reviewed By: mruberry

Differential Revision: D28351931

Pulled By: albanD

fbshipit-source-id: 985457a44dba24d2496794dfb8c1661cbcd4ff8f
2021-05-12 11:03:23 -07:00
14282232d9 Fix generate_not_implemented_tests not testing unknown types correctly (#56997)
Summary:
Currently, the test code is not testing unknown types correctly because `op` is overwritten in the for-loop (i.e., currently only `__ior__` is tested).
This PR fixes the test `generate_not_implemented_tests` to bind the operator name to each generated method, and removes operators that are currently unsupported (`__rand__`, …).
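
The bug and the fix in miniature (illustrative names, not the test's actual code):

```
# Buggy: every generated check closes over the same loop variable `op`,
# so all of them end up exercising whichever name was assigned last.
checks = [lambda: op for op in ("__iand__", "__ior__", "__ixor__")]
print({c() for c in checks})  # {'__ixor__'}

# Fixed: bind the current value, e.g. via a default argument.
checks = [lambda op=op: op for op in ("__iand__", "__ior__", "__ixor__")]
print({c() for c in checks})  # {'__iand__', '__ior__', '__ixor__'}
```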

cc/ mruberry This fix is needed to add tests for the operators we are going to introduce (e.g., `__rand__`)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/56997

Reviewed By: astaff

Differential Revision: D28118465

Pulled By: mruberry

fbshipit-source-id: c5a466a7604262ed5490862300d47043aff63d0b
2021-05-09 05:34:10 -07:00
20085f6d23 Support auto generation of device check (#56872)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56872

ghstack-source-id: 127914018

Test Plan: auto test

Reviewed By: ezyang

Differential Revision: D27986429

fbshipit-source-id: 0da8413b0b8e6810fcea27ed1de499f11f68bd1f
2021-05-01 12:02:09 -07:00
d4ddb47719 [special] Add xlog1py (#55138)
Summary:
Reference : https://github.com/pytorch/pytorch/issues/50345

* [x] Check Rendered Document (https://12494173-65600975-gh.circle-artifacts.com/0/docs/special.html#torch.special.xlog1py)
* [x] Tests in Binary Ufunc
* [x] OpInfo
* [x] Structured Kernel
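
A quick check of the new operator against its defining formula `x * log1p(y)` (illustrative values):

```
import torch

x = torch.tensor([0.0, 1.0, 2.0])
y = torch.tensor([0.5, 0.5, 0.5])

print(torch.special.xlog1py(x, y))  # tensor([0.0000, 0.4055, 0.8109])
print(x * torch.log1p(y))           # same values
```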

Pull Request resolved: https://github.com/pytorch/pytorch/pull/55138

Reviewed By: ngimel

Differential Revision: D27961461

Pulled By: mruberry

fbshipit-source-id: 30a8f41970a829bf50254aadf5615e8ce4148c7e
2021-04-30 05:51:13 -07:00
5536cda19a Update floor_divide behavior in line with NumPy 1.20 (#56893)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56893

Fixes gh-56814

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D28025814

Pulled By: mruberry

fbshipit-source-id: 8654978ea1d5aa7c12bcf5a8c939966287a2d34e
2021-04-28 05:01:23 -07:00
e8faf69739 fix torch.pow type promotion issue (#54085)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54085

Fixes https://github.com/pytorch/pytorch/issues/50121.

This fixes two similar issues pointed out with the dtype that `torch.pow` performs its computation. Thanks ngimel for spotting the issues originally (comments [here](https://github.com/pytorch/pytorch/pull/53669#discussion_r594624355) and [here](https://github.com/pytorch/pytorch/pull/53669#discussion_r594719704))!

Before:
```
>>> torch.pow(2, torch.tensor([17], dtype=torch.uint8), out=torch.tensor([0]))
tensor([0])
>>> torch.pow(2, torch.tensor(17, dtype=torch.uint8), out=torch.tensor(0))
tensor(131072)
>>> torch.pow(2, torch.tensor([17], dtype=torch.uint8, device='cuda'), out=torch.tensor([0], device='cuda'))
tensor([131072], device='cuda:0')
>>> torch.pow(2, torch.tensor(17, dtype=torch.uint8, device='cuda'), out=torch.tensor(0, device='cuda'))
tensor(131072, device='cuda:0')
```

After:
```
>>> torch.pow(2, torch.tensor([17], dtype=torch.uint8), out=torch.tensor([0]))
tensor([0])
>>> torch.pow(2, torch.tensor(17, dtype=torch.uint8), out=torch.tensor(0))
tensor(0)
>>> torch.pow(2, torch.tensor([17], dtype=torch.uint8, device='cuda'), out=torch.tensor([0], device='cuda'))
tensor([0], device='cuda:0')
>>> torch.pow(2, torch.tensor(17, dtype=torch.uint8, device='cuda'), out=torch.tensor(0, device='cuda'))
tensor(0, device='cuda:0')
```

In all four cases above, `tensor(0, ...)` is the correct value because the computed "common dtype" among the inputs is expected to be `uint8`. Computing `2 ** 17` in uint8 will then overflow to zero. Finally, we cast the computed output to the output tensor's dtype, which is `int64`.

There were two separate issues fixed in this PR: one for cpu and one for cuda:
* For CPU, the `pow(Scalar, Tensor)` overload wasn't calling `set_wrapped_number(true)` after wrapping the scalar in a Tensor, which caused the "promoted" scalar to incorrectly participate in type promotion (see the documented behavior [here](aa8714dfed/c10/core/TensorImpl.h (L590)))
* For CUDA, the cuda kernels defined in `PowKernel.cu` were using the output's dtype to run the computation, instead of the common dtype.

As an aside: The CPU and CUDA kernels actually both use `iter.dtype()` instead of `iter.common_dtype()` to run the computation, which I fixed. The reason that only manifested here for CUDA is because TensorIterator has cpu-specific logic to create temporary outputs with the intermediate dtype (shown [here](aa8714dfed/aten/src/ATen/TensorIterator.cpp (L349))). I'm not sure what the end state is there- I can imagine that being something we're more okay doing for cpu than for cuda, but it also leads to hard-to-track-down inconsistencies between the two like in this case.

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D27096330

Pulled By: bdhirsh

fbshipit-source-id: a7e2909243851625cb3056d1e7abb2383bfe95f2
2021-04-15 08:55:53 -07:00
aceceb3d5c Reland #50999 (Added pow() on CPU for float16 & bfloat16) (#55280)
Summary:
#### Reason for relanding
Line 1607 of `torch/testing/_internal/common_methods_invocations.py` of https://github.com/pytorch/pytorch/issues/50999  had `dtype` instead of `dtype=torch.bool`, so 4 of the 9 sample inputs for `bool` had incorrect dtype. This bug was caught by https://github.com/pytorch/pytorch/issues/54949.

1. Added support for pow() on CPU for `float16` (`Half`) and `bfloat16` types.
Both `pow(Tensor, Scalar)` and `pow(Tensor, Tensor)` are now supported for the aforementioned types.
However autograd isn't supported for `Float16` on CPU yet, as `log_vml_cpu` can't be enabled for it.
2. heitorschueroff added `pow_tensor_scalar_optimized_kernel` to refactor & simplify `PowKernel.cpp`.
It provides a common path for all the complex types & floating point types (except Float16, due to lack of complete AVX2 vectorization support for it).  It replaced code that had previously been duplicated for (float, double) and complex types,
so PowKernel.cpp looks a lot cleaner now.
3. Enabled (unskipped) some tests for `erf`, `erfc`,`erfinv`, `tan` and `linalg.vector.norm` which were being skipped earlier due to `pow()` not having been implemented for `float16` & `bfloat16`.
4. Added an OpInfo for `pow()` & enabled some test cases for `pow()`.
5. Extended the coverage of existing tests for `pow` in `test_binary_ufuncs.py` in order to enable comparison with `numpy`, even with discontiguous tensors, and added a test to ensure that a runtime error is raised for `pow`'s inplace variant if resizing the base tensor is required during its invocation.
6. Added `float16` & `bfloat16` to `square`'s dtype lists in its `UnaryUfuncInfo`.
7. Removed redundant `dtypesIfCPU` and `dtypesIfCUDA` from `OpInfo`s where they are equal to `dtypes`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/55280

Reviewed By: jbschlosser

Differential Revision: D27591772

Pulled By: heitorschueroff

fbshipit-source-id: c7420811b32595bb3353149a61e54a73f2eb352b
2021-04-13 13:23:29 -07:00
1e70d217e7 Add error message for complex alpha and non-complex inputs (#54964)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54964

Previously, the following would error out with a strange error message:
```
import torch
x=torch.randn(2)
torch.rsub(x, 1, alpha=2j)

Traceback (most recent call last)
<ipython-input-2-caf2a1c03d0b> in <module>
      1 import torch
      2 x=torch.randn(2)
----> 3 torch.rsub(x, 1, alpha=2j)

RuntimeError: value cannot be converted to type float without overflow: (-0,-2)
```

This happens because the alpha check doesn't cover the case where `x` is not complex but `alpha` is.
The error gets thrown further along in the implementation of torch.sub,
when it coerces `alpha` to be the same dtype as the input tensor:
https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/cpu/BinaryOpsKernel.cpp#L53

This PR fixes the bad error message by adding a new check to the alpha check.

Test Plan:
- pytest test/test_binary_ufuncs.py
- NB: add, sub, and rsub all share the same alpha check. The test only tests it for torch.add, but that should be sufficient.

Reviewed By: gchanan

Differential Revision: D27504017

Pulled By: zou3519

fbshipit-source-id: 70b9aa75a7a4faaaa93f6ba235cae85998a91697
2021-04-07 14:12:34 -07:00
2ee02b30b1 Replace rounding_mode="true" with rounding_mode=None (#51988)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51988

* **#51988 Replace rounding_mode="true" with rounding_mode=None**

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D27561817

Pulled By: mruberry

fbshipit-source-id: 60d1d9c389570f60d599fc1876518717367fb368
2021-04-05 14:53:43 -07:00
8377e6221a Revert D27478225: [pytorch][PR] Added pow() on CPU for float16 & bfloat16
Test Plan: revert-hammer

Differential Revision:
D27478225 (6d030c14cf)

Original commit changeset: d309dd98d5a9

fbshipit-source-id: e0518f15185b41946caf3a8456c7af3f52e5a910
2021-04-03 10:26:44 -07:00
6d030c14cf Added pow() on CPU for float16 & bfloat16 (#50999)
Summary:
Added the functionality desired in https://github.com/pytorch/pytorch/issues/50789.

1. Added support for pow() on CPU for `float16` (`Half`) and `bfloat16` types.
Both `pow(Tensor, Scalar)` and `pow(Tensor, Tensor)` are now supported for the aforementioned types.
However autograd isn't supported for `Float16` on CPU yet, as `log_vml_cpu` can't be enabled for it.
2. heitorschueroff added `pow_tensor_scalar_optimized_kernel` to refactor & simplify `PowKernel.cpp`.
It provides a common path for all the complex types & floating point types (except Float16, due to lack of complete AVX2 vectorization support for it).  It replaced code that had previously been duplicated for (float, double) and complex types,
so PowKernel.cpp looks a lot cleaner now.
3. Enabled (unskipped) some tests for `erf`, `erfc`,`erfinv`, `linalg.norm` and `linalg.vector.norm` which were being skipped earlier due to `pow()` not having been implemented for `float16` & `bfloat16`.
4. Added an OpInfo for `pow()` & enabled some test cases for `pow()`.
5. Extended the coverage of existing tests for `pow` in `test_binary_ufuncs.py` in order to enable comparison with `numpy`, even with discontiguous tensors, and added a test to ensure that a runtime error is raised for `pow`'s inplace variant if resizing the base tensor is required during its invocation.
6. Added `float16` & `bfloat16` to `square`'s dtype lists in its `UnaryUfuncInfo`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/50999

Reviewed By: zou3519

Differential Revision: D27478225

Pulled By: heitorschueroff

fbshipit-source-id: d309dd98d5a96d0cb9b08281757bb1c65266d011
2021-04-02 15:57:06 -07:00
bac566bf61 torch.square : OpInfo and minor fixes (#52551)
Summary:
Reference: https://github.com/pytorch/pytorch/issues/42515

Add `out` variant to be consistent with Unary Ops.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/52551

Reviewed By: heitorschueroff

Differential Revision: D27233482

Pulled By: mruberry

fbshipit-source-id: fef6f241849a12c46028bd1aad8f5ecc1dc65ea1
2021-03-24 00:04:42 -07:00
b93ab10b7a torch.lerp: cuda complex support (#54129)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/54048

TODO
* [x] Add test

Pull Request resolved: https://github.com/pytorch/pytorch/pull/54129

Reviewed By: bdhirsh

Differential Revision: D27261878

Pulled By: anjali411

fbshipit-source-id: 10937a2eab944c73b5a98ec6278f50a876b8c7dc
2021-03-23 19:58:43 -07:00
779cae9e42 port at::pow to structured (#53669)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53669

This PR does two things:
* Ports `pow` to be structured
* Fixes a bug with how pow handles mixed cpu and cuda tensors

**bug fix**
Pow is a binary op, and all binary ops that use TensorIterator are currently written to handle the case when one of the inputs is a CUDA tensor, and the other is a zero-dimensional cpu tensor.

`pow` incidentally only handles one of the two cases: it fails when the CUDA tensor is passed as the exponent, e.g. `at::pow(torch.tensor(2.0, device='cpu'), torch.tensor([2, 2], device='cuda'))`. Porting `pow` to structured happened to change the error that was outputted from a `TORCH_CHECK` in TensorIterator to an `INTERNAL_ASSERT` in loop.cuh, so I ended up trying to fix the error and update the tests. I added more details in a comment on the PR.

**notes on the structured port**
Pow is a little weird, so I wrote down a couple of issues I noticed during the port:
* Multiple independent overloads. `pow` has two overloads that have their own cpu/cuda kernels, meaning one doesn't call the other. I have to update the names of the kernel overloads to make the compiler happy, since the codegen would otherwise try to generate two classes with the same name. `pow` actually has 3 overloads that all have `out` variants, so I ported all 3 to structured- one of them just happens to redispatch one of the others in most cases.
* Name propagation. Is name propagation implemented per operator? Or is it expected to work for most/all ops by default? Right now it looks like it happens for TensorIterator ops by default. For ops that don't use TensorIterator, we need to explicitly pass the names through to the `set_output()` call in the meta function. This happened to matter for `pow` because it has 3 overloads, but only two of them directly use TensorIterator. I had to pass names directly to `set_output` in the 3rd overload to make tests happy.
*  Lack of `const Tensor &` in the C++ API. It's a goal to slowly make all `Tensor &` arguments const as part of the structured port, but in this case I needed to explicitly cast constness away because one structured kernel called back into the C++ API, which still has ordinary `Tensor &` arguments. This probably isn't something we'll fix soon, since we have boxing logic that actually relies on the `Tensor &` / `const Tensor &` distinction in some places.

Test Plan: Imported from OSS

Reviewed By: ezyang

Differential Revision: D27029821

Pulled By: bdhirsh

fbshipit-source-id: c1786e770de6e6c2474b9a48210b88057ab1018e
2021-03-19 14:30:48 -07:00
54a2498919 Modify tests to use assertWarnsOnceRegex instead of maybeWarnsRegex (#52387)
Summary:
Related to https://github.com/pytorch/pytorch/issues/50006

Follow on for https://github.com/pytorch/pytorch/issues/48560 to ensure TORCH_WARN_ONCE warnings are caught. Most of this is a straightforward find-and-replace, but I did find one place where the TORCH_WARN_ONCE warning was not wrapped into a Python warning.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/52387

Reviewed By: albanD

Differential Revision: D26773387

Pulled By: mruberry

fbshipit-source-id: 5be7efbc8ab4a32ec8437c9c45f3b6c3c328f5dd
2021-03-08 03:32:14 -08:00
3309f034aa remove pointless test (#52609)
Summary:
Fixes T81870118

Pull Request resolved: https://github.com/pytorch/pytorch/pull/52609

Reviewed By: mruberry

Differential Revision: D26584288

Pulled By: ngimel

fbshipit-source-id: 7cec37db46cfe5b5b2fd21fe7c3e3fcbb8aba049
2021-02-22 16:25:04 -08:00
983347fa25 Allow broadcasting against lerp weights. (#52319)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52319

Fixes: https://github.com/pytorch/pytorch/issues/52254

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D26488411

Pulled By: gchanan

fbshipit-source-id: 60eb471609986584c4235ba7f263581e988e7642
2021-02-18 09:53:25 -08:00
594a66d778 Warn about floor_divide performing incorrect rounding (#50281) (#50281)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50281

Pull Request resolved: https://github.com/pytorch/pytorch/pull/51745

Test Plan: Imported from OSS

Reviewed By: ngimel

Pulled By: mruberry

Differential Revision: D26257855

fbshipit-source-id: e5d497cf07b0c746838ed081c5d0e82fb4cb701b
2021-02-10 03:13:34 -08:00
b150f150ba Add division overload with rounding_mode selection (#51706)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51706

Pull Request resolved: https://github.com/pytorch/pytorch/pull/50280

As mentioned in gh-43874, this adds a `rounding_mode={'true', 'trunc', 'floor'}`
argument so `torch.div` can be used as a replacement for `floor_divide` during
the transitional period.

I've included dedicated kernels for truncated and floor division which
aren't strictly necessary for float, but do perform significantly better (~2x) than
doing true division followed by a separate rounding kernel.

Note: I introduce new overloads for `aten::div` instead of just adding a default
`rounding_mode` because various JIT passes rely on the exact operator schema.
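
For reference, the three modes side by side (note that a later change in this log replaces `rounding_mode="true"` with the default `rounding_mode=None`):

```
import torch

a = torch.tensor([-7.0, 7.0])

print(torch.div(a, 2))                         # true division:      tensor([-3.5000,  3.5000])
print(torch.div(a, 2, rounding_mode="trunc"))  # rounds toward zero: tensor([-3.,  3.])
print(torch.div(a, 2, rounding_mode="floor"))  # rounds toward -inf: tensor([-4.,  3.])
```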

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D26123271

Pulled By: mruberry

fbshipit-source-id: 51a83717602114597ec9c4d946e35a392eb01d46
2021-02-04 13:08:36 -08:00
4803eaf502 Implement NumPy-like function torch.fmax() & torch.fmin() (#49312)
Summary:
- Implementing the NumPy-like functions `torch.fmax()` and `torch.fmin()` recommended in https://github.com/pytorch/pytorch/issues/48440

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49312

Reviewed By: izdeby

Differential Revision: D25887246

Pulled By: heitorschueroff

fbshipit-source-id: d762eeff8b328bfcbe7d48b7ee9d2da72c249691
2021-01-20 06:45:25 -08:00