Commit Graph

206 Commits

Author SHA1 Message Date
860790de88 Makes torch.real and torch.imag NumPy compatible, but disables them for complex tensors (#35560)
Summary:
The current implementations of torch.real and torch.imag are not NumPy compatible. In particular:

- torch.real on a real tensor does not return the real tensor, like contiguous
- torch.real on a complex tensor does not return a real-valued view of the real part
- torch.imag on a complex tensor does not return a real-valued view of the imaginary part
- torch.Tensor.real and torch.Tensor.imag exist as methods, but in NumPy they are writable attributes

This PR makes the functions NumPy compatible by removing the method variants and out kwarg, restricting them to work on only real tensors, and updating the behavior of torch.real to return its input. New tests are added to test_torch.py to verify the behavior, a couple existing complex tests are skipped, and the documentation is updated to reflect the change.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35560

Differential Revision: D20714568

Pulled By: mruberry

fbshipit-source-id: 5dd092f45757b620c8426c829dd15ee997246a26
2020-03-29 02:09:00 -07:00
17abb7c31a Add docs to resize_ and resize_as_ (#35392)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35392

Test Plan: Imported from OSS

Differential Revision: D20650097

Pulled By: VitalyFedyunin

fbshipit-source-id: cff4f555d355dfee42394f6070fe3e466949aeb5
2020-03-25 14:09:25 -07:00
7c1ea736ba Extends true_divide to be a method (#34794)
Summary:
Per title. See related https://github.com/pytorch/pytorch/pull/34570.

In PyTorch 1.7 the plan is for torch.div and Python's division operator to perform "true" division, like Python 3, JAX, and NumPy. To facilitate this change, this PR expands true_divide to be a method so it can cover all of torch.div's use cases.

New true_divide tests are added to test_torch.py, test_type_promotion.py, and test_sparse.py.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34794

Differential Revision: D20545507

Pulled By: mruberry

fbshipit-source-id: 55286f819716c8823d1930441a69008560ac2bd5
2020-03-23 23:12:23 -07:00
40da01db6a Add docs about memory format (#34818)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34818

Test Plan: Imported from OSS

Differential Revision: D20601336

Pulled By: VitalyFedyunin

fbshipit-source-id: d34ad226be950bf134c6b383a4810ea6aa75599e
2020-03-23 15:06:33 -07:00
3b7e1cd2cc Makes floor_divide a method, adds sparse floor division (#34552)
Summary:
(Updated per review feedback)

`torch.floor_divide` is currently a function that can operate on two tensors or a tensor and a scalar (scalar x scalar floor division is handled natively by Python and the JIT has a builtin function for it). This PR updates it to:

- have an out variant: `floor_divide(x, y, out=z)`
- be a method on a tensor: `x.floor_divide(y)`
- have an in-place variant: `x.floor_divide_(y)`
- work with sparse tensors

Tests are added to test_sparse.py and test_torch.py for these new behaviors.

In addition, this PR:

- cleans up the existing sparse division and true_division code and improves their error message
- adds testing of sparse true_division to test_sparse.py
- extends existing floor_divide testing in test_torch to run on CUDA, too, not just the CPU

Unfortunately, making floor_divide a method requires breaking backwards compatibility, and floor_divide has been added to the BC whitelist since this is international. The BC issue is that the first parameter name to torch.floor_divide is changing from input to self. If you previously called torch.floor_divide with keyword arguments, e.g. torch.floor_divide(input=x, other=y), you will need to update to torch.floor_divide(self=x, other=y), or the more common torch.floor_divide(x, y).

The intent of this PR is to allow floor_divide to be substituted for division (torch.div, /) wherever division was previously used. In 1.6 we expect torch.div to perform true_division, and floor_divide is how users can continue to perform integer division with tensors.

There are two potential follow-up issues suggested by this PR:

- the test framework might benefit from additional tensor construction classes, like one to create dividends and divisors for multiple dtypes
- the test framework might benefit from a universal function test class. while methods have reasonable coverage as part of test_torch.py's TestTensorOp tests, function coverage is spotty. Universal functions are similar enough it should be possible to generate tests for them.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34552

Differential Revision: D20509850

Pulled By: mruberry

fbshipit-source-id: 2cd3c828aad67191c77f2ed8470411e246f604f8
2020-03-18 15:00:53 -07:00
a1eaaea288 Revert D20497453: [pytorch][PR] Makes floor_divide a method, adds sparse floor division
Test Plan: revert-hammer

Differential Revision:
D20497453

Original commit changeset: ac326f2007d8

fbshipit-source-id: b94b89b1a25521506e3d0a6b072d3d4d8c55e63d
2020-03-18 01:48:50 -07:00
b7129050e7 Makes floor_divide a method, adds sparse floor division (#34552)
Summary:
(Updated per review feedback)

`torch.floor_divide` is currently a function that can operate on two tensors or a tensor and a scalar (scalar x scalar floor division is handled natively by Python and the JIT has a builtin function for it). This PR updates it to:

- have an out variant: `floor_divide(x, y, out=z)`
- be a method on a tensor: `x.floor_divide(y)`
- have an in-place variant: `x.floor_divide_(y)`
- work with sparse tensors

Tests are added to test_sparse.py and test_torch.py for these new behaviors.

In addition, this PR:

- cleans up the existing sparse division and true_division code and improves their error message
- adds testing of sparse true_division to test_sparse.py
- extends existing floor_divide testing in test_torch to run on CUDA, too, not just the CPU

Unfortunately, making floor_divide a method requires breaking backwards compatibility, and floor_divide has been added to the BC whitelist since this is international. The BC issue is that the first parameter name to torch.floor_divide is changing from input to self. If you previously called torch.floor_divide with keyword arguments, e.g. torch.floor_divide(input=x, other=y), you will need to update to torch.floor_divide(self=x, other=y), or the more common torch.floor_divide(x, y).

The intent of this PR is to allow floor_divide to be substituted for division (torch.div, /) wherever division was previously used. In 1.6 we expect torch.div to perform true_division, and floor_divide is how users can continue to perform integer division with tensors.

There are two potential follow-up issues suggested by this PR:

- the test framework might benefit from additional tensor construction classes, like one to create dividends and divisors for multiple dtypes
- the test framework might benefit from a universal function test class. while methods have reasonable coverage as part of test_torch.py's TestTensorOp tests, function coverage is spotty. Universal functions are similar enough it should be possible to generate tests for them.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34552

Differential Revision: D20497453

Pulled By: mruberry

fbshipit-source-id: ac326f2007d8894f730d1278fef84d63bcb07b5d
2020-03-18 00:01:45 -07:00
c1dd70688a Fix deprecated python "add" calls (#33428)
Summary:
This PR fixed those python "add" calls using deprecated signature `add(Scalar, Tensor)`. The alternative signature `add(Tensor, alpha = Scalar)` is used.

cc csarofeen zasdfgbnm ptrblck ngimel
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33428

Differential Revision: D20002534

Pulled By: vincentqb

fbshipit-source-id: 81f2dd6170a47a9b53a17e5817c26e70d8afa130
2020-02-26 09:02:31 -08:00
d6ea4be153 Fix minor problems in index_put_ docs (#33689)
Summary:
Fix for https://github.com/pytorch/pytorch/issues/33641
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33689

Differential Revision: D20086967

Pulled By: ngimel

fbshipit-source-id: d9dde8edb904de1cf56b9337920cb29e008b72fb
2020-02-24 21:15:36 -08:00
9278196d89 scatter_add uses src, not other (#32307)
Summary:
using `other` kwarg gives `TypeError: scatter_add_() missing 1 required positional arguments: "src"`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32307

Differential Revision: D20076859

Pulled By: zou3519

fbshipit-source-id: dfb417c087d5be41fad02dc0b2cf0506c89b1b02
2020-02-24 18:01:34 -08:00
13e4ee7883 Added tensor.is_complex(), is_complex and dtype.is_complex py binding, tensor printing, and dixed the scalar type returned for complex float (#33268)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33268

Test Plan: Imported from OSS

Differential Revision: D19907698

Pulled By: anjali411

fbshipit-source-id: c3ce2e99fc09da91a90a8fb94e5525a00bb23703
2020-02-20 13:38:01 -08:00
aa3c871739 Adds TestViewOps, updates documentation (#32512)
Summary:
Understanding which ops return views and which return tensors with new storage is a common user issue, and an issue for developers connecting accelerators to PyTorch, too. This generic test suite verifies that ops which should return views do (and a few ops that shouldn't don't).  The documentation has also been updated for .t(), permute(), unfold(), and select() to clarify they return views.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32512

Differential Revision: D19659454

Pulled By: mruberry

fbshipit-source-id: b4334be9b698253a979e1bb8746fdb3ca24aa4e3
2020-02-04 11:10:34 -08:00
5b815d980e Added cummin
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/32238

Differential Revision: D19416791

Pulled By: anjali411

fbshipit-source-id: 5aadc0a7a55af40d76f444ab7d7d47ec822f55a5
2020-01-17 10:51:58 -08:00
8dc67a014f Add cummax
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/32169

Differential Revision: D19393236

Pulled By: anjali411

fbshipit-source-id: 5dac6b0a4038eb48458d4a0b253418daeccbb6bc
2020-01-14 17:19:10 -08:00
b0ac425dc4 Emit warning from deprecated torch function signatures (#32009)
Summary:
Continuation of https://github.com/pytorch/pytorch/issues/31514, fixes https://github.com/pytorch/pytorch/issues/28430
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32009

Test Plan:
I verified that the deprecation warnings only occur once on a relevant workflow. Built with:

```
buck build mode/opt //vision/fair/detectron2/tools:train_net
```

Ran with:

```
DETECTRON2_ENV_MODULE=detectron2.fb.env ~/local/train_net.par --config-file configs/quick_schedules/retinanet_R_50_FPN_instant_test.yaml --num-gpus 1 SOLVER.IMS_PER_BATCH 2
```

Inspected log:

```
[01/14 07:28:13 d2.engine.train_loop]: Starting training from iteration 0
buck-out/opt/gen/caffe2/generate-code=python_variable_methods.cpp/python_variable_methods.cpp:1299: UserWarning: This overload of add is deprecated:
add(Number alpha, Tensor other)
Consider using one of the following signatures instead:
add(Tensor other, Number alpha)
buck-out/opt/gen/caffe2/generate-code=python_variable_methods.cpp/python_variable_methods.cpp:1334: UserWarning: This overload of add_ is deprecated:
add_(Number alpha, Tensor other)
Consider using one of the following signatures instead:
add_(Tensor other, Number alpha)
[01/14 07:28:25 d2.utils.events]: eta: 0:00:10  iter: 19  total_loss: 1.699  loss_cls: 1.185  loss_box_reg: 0.501  time: 0.5020  data_time: 0.0224  lr: 0.000100  max_mem: 3722M
[01/14 07:28:35 fvcore.common.checkpoint]: Saving checkpoint to ./output/model_final.pth
```

Differential Revision: D19373523

Pulled By: ezyang

fbshipit-source-id: 75756de129645501f43ecc4e3bf8cc0f78c40b90
2020-01-14 11:44:29 -08:00
701ca68882 Docs entry for the is_quantized
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/32075

Test Plan: Imported from OSS

Differential Revision: D19353861

Pulled By: z-a-f

fbshipit-source-id: 4249216ac9a4af354a251c62181d65bc14cbfd3e
2020-01-13 13:54:35 -08:00
9ba6a768de Add op bitwise_or (#31559)
Summary:
ezyang ,  this PR add bitwise_or operator as https://github.com/pytorch/pytorch/pull/31104 .
Benchmark script :
```
import timeit
import torch
torch.manual_seed(1)

for n, t in [(10, 100000),(1000, 10000)]:
    print('__or__ (a.numel() == {}) for {} times'.format(n, t))
    for device in ('cpu', 'cuda'):
        for dtype in ('torch.int8', 'torch.uint8', 'torch.int16', 'torch.int32', 'torch.int64'):
            print(f'device: {device}, dtype: {dtype}, {t} times', end='\t\t')
            print(timeit.timeit(f'a | b\nif "{device}" == "cuda": torch.cuda.synchronize()', setup=f'import torch; a = torch.randint(0, 10, ({n},), dtype = {dtype}, device="{device}"); b = torch.randint(0, 10, ({n},), dtype = {dtype}, device="{device}")', number=t))

for n, t in [(10, 100000),(1000, 10000)]:
    print('__ior__ (a.numel() == {}) for {} times'.format(n, t))
    for device in ('cpu', 'cuda'):
        for dtype in ('torch.int8', 'torch.uint8', 'torch.int16', 'torch.int32', 'torch.int64'):
            print(f'device: {device}, dtype: {dtype}, {t} times', end='\t\t')
            print(timeit.timeit(f'a | b\nif "{device}" == "cuda": torch.cuda.synchronize()', setup=f'import torch; a = torch.randint(0, 10, ({n},), dtype = {dtype}, device="{device}"); b = torch.tensor(5, dtype = {dtype}, device="{device}")', number=t))
```
Device: **Tesla P100, skx-8180**
Cuda verison: **9.0.176**

Before:
```
__or__ (a.numel() == 10) for 100000 times
device: cpu, dtype: torch.int8, 100000 times            0.17616272252053022
device: cpu, dtype: torch.uint8, 100000 times           0.17148233391344547
device: cpu, dtype: torch.int16, 100000 times           0.17616403382271528
device: cpu, dtype: torch.int32, 100000 times           0.17717823758721352
device: cpu, dtype: torch.int64, 100000 times           0.1801931718364358
device: cuda, dtype: torch.int8, 100000 times           1.270583058707416
device: cuda, dtype: torch.uint8, 100000 times          1.2636413089931011
device: cuda, dtype: torch.int16, 100000 times          1.2839747751131654
device: cuda, dtype: torch.int32, 100000 times          1.2548385225236416
device: cuda, dtype: torch.int64, 100000 times          1.2650810535997152
__or__ (a.numel() == 1000) for 10000 times
device: cpu, dtype: torch.int8, 10000 times             0.031136621721088886
device: cpu, dtype: torch.uint8, 10000 times            0.030786747112870216
device: cpu, dtype: torch.int16, 10000 times            0.02391665056347847
device: cpu, dtype: torch.int32, 10000 times            0.024147341027855873
device: cpu, dtype: torch.int64, 10000 times            0.024414129555225372
device: cuda, dtype: torch.int8, 10000 times            0.12741921469569206
device: cuda, dtype: torch.uint8, 10000 times           0.1249831635504961
device: cuda, dtype: torch.int16, 10000 times           0.1283819805830717
device: cuda, dtype: torch.int32, 10000 times           0.12591975275427103
device: cuda, dtype: torch.int64, 10000 times           0.12655890546739101
__ior__ (a.numel() == 10) for 100000 times
device: cpu, dtype: torch.int8, 100000 times            0.3908365070819855
device: cpu, dtype: torch.uint8, 100000 times           0.38267823681235313
device: cpu, dtype: torch.int16, 100000 times           0.38239253498613834
device: cpu, dtype: torch.int32, 100000 times           0.3817988149821758
device: cpu, dtype: torch.int64, 100000 times           0.3901665909215808
device: cuda, dtype: torch.int8, 100000 times           1.4211318120360374
device: cuda, dtype: torch.uint8, 100000 times          1.4215159295126796
device: cuda, dtype: torch.int16, 100000 times          1.4307750314474106
device: cuda, dtype: torch.int32, 100000 times          1.4123614141717553
device: cuda, dtype: torch.int64, 100000 times          1.4480243818834424
__ior__ (a.numel() == 1000) for 10000 times
device: cpu, dtype: torch.int8, 10000 times             0.06468924414366484
device: cpu, dtype: torch.uint8, 10000 times            0.06442475505173206
device: cpu, dtype: torch.int16, 10000 times            0.05267547257244587
device: cpu, dtype: torch.int32, 10000 times            0.05286940559744835
device: cpu, dtype: torch.int64, 10000 times            0.06211103219538927
device: cuda, dtype: torch.int8, 10000 times            0.15332304500043392
device: cuda, dtype: torch.uint8, 10000 times           0.15353196952492
device: cuda, dtype: torch.int16, 10000 times           0.15300503931939602
device: cuda, dtype: torch.int32, 10000 times           0.15274472255259752
device: cuda, dtype: torch.int64, 10000 times           0.1512152962386608
```
After:
```
__or__ (a.numel() == 10) for 100000 times
device: cpu, dtype: torch.int8, 100000 times            0.2465507509186864
device: cpu, dtype: torch.uint8, 100000 times           0.2472386620938778
device: cpu, dtype: torch.int16, 100000 times           0.2469814233481884
device: cpu, dtype: torch.int32, 100000 times           0.2535214088857174
device: cpu, dtype: torch.int64, 100000 times           0.24855613708496094
device: cuda, dtype: torch.int8, 100000 times           1.4351346511393785
device: cuda, dtype: torch.uint8, 100000 times          1.4434308474883437
device: cuda, dtype: torch.int16, 100000 times          1.4520929995924234
device: cuda, dtype: torch.int32, 100000 times          1.4456610176712275
device: cuda, dtype: torch.int64, 100000 times          1.4580101007595658
__or__ (a.numel() == 1000) for 10000 times
device: cpu, dtype: torch.int8, 10000 times             0.029985425993800163
device: cpu, dtype: torch.uint8, 10000 times            0.03024935908615589
device: cpu, dtype: torch.int16, 10000 times            0.026356655173003674
device: cpu, dtype: torch.int32, 10000 times            0.027377349324524403
device: cpu, dtype: torch.int64, 10000 times            0.029163731262087822
device: cuda, dtype: torch.int8, 10000 times            0.14540370367467403
device: cuda, dtype: torch.uint8, 10000 times           0.1456305105239153
device: cuda, dtype: torch.int16, 10000 times           0.1450125053524971
device: cuda, dtype: torch.int32, 10000 times           0.1472016740590334
device: cuda, dtype: torch.int64, 10000 times           0.14709716010838747
__ior__ (a.numel() == 10) for 100000 times
device: cpu, dtype: torch.int8, 100000 times            0.27195510920137167
device: cpu, dtype: torch.uint8, 100000 times           0.2692424338310957
device: cpu, dtype: torch.int16, 100000 times           0.27726674638688564
device: cpu, dtype: torch.int32, 100000 times           0.2815811652690172
device: cpu, dtype: torch.int64, 100000 times           0.2852728571742773
device: cuda, dtype: torch.int8, 100000 times           1.4743850827217102
device: cuda, dtype: torch.uint8, 100000 times          1.4766502184793353
device: cuda, dtype: torch.int16, 100000 times          1.4774163831025362
device: cuda, dtype: torch.int32, 100000 times          1.4749693805351853
device: cuda, dtype: torch.int64, 100000 times          1.5772947426885366
__ior__ (a.numel() == 1000) for 10000 times
device: cpu, dtype: torch.int8, 10000 times             0.03614502027630806
device: cpu, dtype: torch.uint8, 10000 times            0.03619729354977608
device: cpu, dtype: torch.int16, 10000 times            0.0319912089034915
device: cpu, dtype: torch.int32, 10000 times            0.03319283854216337
device: cpu, dtype: torch.int64, 10000 times            0.0343862259760499
device: cuda, dtype: torch.int8, 10000 times            0.1581476852297783
device: cuda, dtype: torch.uint8, 10000 times           0.15974601730704308
device: cuda, dtype: torch.int16, 10000 times           0.15957212820649147
device: cuda, dtype: torch.int32, 10000 times           0.16002820804715157
device: cuda, dtype: torch.int64, 10000 times           0.16129320487380028
```

Fix  https://github.com/pytorch/pytorch/issues/24511, https://github.com/pytorch/pytorch/issues/24515, https://github.com/pytorch/pytorch/issues/24658, https://github.com/pytorch/pytorch/issues/24662.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31559

Differential Revision: D19315875

Pulled By: ezyang

fbshipit-source-id: 4a3ca88fdafbeb796079687e676228111eb44aad
2020-01-08 15:06:30 -08:00
5dfcfeebb8 Revert D19298735: Emit warning from deprecated torch function signatures
Test Plan: revert-hammer

Differential Revision:
D19298735

Original commit changeset: 03cb78af1765

fbshipit-source-id: 304a6d4412f53a8fc822d36897c96815432e0f70
2020-01-08 13:04:41 -08:00
0e5a6700cc Emit warning from deprecated torch function signatures (#31514)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/28430

The unpythonic signatures for functions such as `torch.addcdiv` are already seperated in [`deprecated.yaml`] and the signatures marked as deprecated in `PythonArgParser`. However, nothing was done with this information previously. So, this now emits a warning when the deprecated signatures are used.

One minor complication is that if all arguments are passed as keyword args then there is nothing to differentiate the deprecated overload. This can lead to false warnings being emitted. So, I've also modified `PythonArgParser` to prefer non-deprecated signatures.

[`deprecated.yaml`]: https://github.com/pytorch/pytorch/blob/master/tools/autograd/deprecated.yaml
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31514

Differential Revision: D19298735

Pulled By: ezyang

fbshipit-source-id: 03cb78af17658eaab9d577cd2497c6f413f07647
2020-01-07 10:57:53 -08:00
3f0b330736 corrected keyword argument name in docs for Tensor.scatter (#31617)
Summary:
See https://github.com/pytorch/pytorch/issues/31601
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31617

Differential Revision: D19268872

Pulled By: mruberry

fbshipit-source-id: 52f0213f4aab991fd549b7623556a2ced61631a6
2020-01-04 21:48:30 -08:00
b47e9b97a2 Add op bitwise_and (#31104)
Summary:
Refer to https://github.com/pytorch/pytorch/pull/25665,  add `bitwise_and` operator.
Benchmark script :
```
import timeit
#for __and__
for n, t in [(10, 100000),(1000, 10000)]:
    print('__and__ (a.numel() == {}) for {} times'.format(n, t))
    for device in ('cpu', 'cuda'):
        for dtype in ('torch.int8', 'torch.uint8', 'torch.int16', 'torch.int32', 'torch.int64'):
            print(f'device: {device}, dtype: {dtype}, {t} times', end='\t\t')
            print(timeit.timeit(f'a & b\nif "{device}" == "cuda": torch.cuda.synchronize()', setup=f'import torch; a = torch.randint(0, 10, ({n},), dtype = {dtype}, device="{device}"); b = torch.randint(0, 10, ({n},), dtype = {dtype}, device="{device}")', number=t))
#for __iand__
for n, t in [(10, 100000),(1000, 10000)]:
    print('__iand__ (a.numel() == {}) for {} times'.format(n, t))
    for device in ('cpu', 'cuda'):
        for dtype in ('torch.int8', 'torch.uint8', 'torch.int16', 'torch.int32', 'torch.int64'):
            print(f'device: {device}, dtype: {dtype}, {t} times', end='\t\t')
            print(timeit.timeit(f'a & b\nif "{device}" == "cuda": torch.cuda.synchronize()', setup=f'import torch; a = torch.randint(0, 10, ({n},), dtype = {dtype}, device="{device}"); b = torch.tensor(5, dtype = {dtype}, device="{device}")', number=t))
```
Device: **Tesla P100, skx-8180**
Cuda verison: **9.0.176**

Before:
```
__and__ (a.numel() == 10) for 100000 times
device: cpu, dtype: torch.int8, 100000 times            0.1766007635742426
device: cpu, dtype: torch.uint8, 100000 times           0.17322628945112228
device: cpu, dtype: torch.int16, 100000 times           0.17650844901800156
device: cpu, dtype: torch.int32, 100000 times           0.17711848113685846
device: cpu, dtype: torch.int64, 100000 times           0.18240160401910543
device: cuda, dtype: torch.int8, 100000 times           1.273967768996954
device: cuda, dtype: torch.uint8, 100000 times          1.2778537990525365
device: cuda, dtype: torch.int16, 100000 times          1.2753686187788844
device: cuda, dtype: torch.int32, 100000 times          1.2797665279358625
device: cuda, dtype: torch.int64, 100000 times          1.2933144550770521
__and__ (a.numel() == 1000) for 10000 times
device: cpu, dtype: torch.int8, 10000 times             0.031139614060521126
device: cpu, dtype: torch.uint8, 10000 times            0.03091452084481716
device: cpu, dtype: torch.int16, 10000 times            0.022756479680538177
device: cpu, dtype: torch.int32, 10000 times            0.025045674294233322
device: cpu, dtype: torch.int64, 10000 times            0.024164282716810703
device: cuda, dtype: torch.int8, 10000 times            0.12820732593536377
device: cuda, dtype: torch.uint8, 10000 times           0.12775669433176517
device: cuda, dtype: torch.int16, 10000 times           0.12697868794202805
device: cuda, dtype: torch.int32, 10000 times           0.12832533661276102
device: cuda, dtype: torch.int64, 10000 times           0.1280576130375266
__iand__ (a.numel() == 10) for 100000 times
device: cpu, dtype: torch.int8, 100000 times            0.3687064303085208
device: cpu, dtype: torch.uint8, 100000 times           0.36253443732857704
device: cpu, dtype: torch.int16, 100000 times           0.362891579978168
device: cpu, dtype: torch.int32, 100000 times           0.37680106051266193
device: cpu, dtype: torch.int64, 100000 times           0.3689364707097411
device: cuda, dtype: torch.int8, 100000 times           1.419940729625523
device: cuda, dtype: torch.uint8, 100000 times          1.4247053815051913
device: cuda, dtype: torch.int16, 100000 times          1.4191444097086787
device: cuda, dtype: torch.int32, 100000 times          1.4305962566286325
device: cuda, dtype: torch.int64, 100000 times          1.4567416654899716
__iand__ (a.numel() == 1000) for 10000 times
device: cpu, dtype: torch.int8, 10000 times             0.06224383972585201
device: cpu, dtype: torch.uint8, 10000 times            0.06205617543309927
device: cpu, dtype: torch.int16, 10000 times            0.05016433447599411
device: cpu, dtype: torch.int32, 10000 times            0.05216377507895231
device: cpu, dtype: torch.int64, 10000 times            0.06139362137764692
device: cuda, dtype: torch.int8, 10000 times            0.14827249851077795
device: cuda, dtype: torch.uint8, 10000 times           0.14801877550780773
device: cuda, dtype: torch.int16, 10000 times           0.14952312968671322
device: cuda, dtype: torch.int32, 10000 times           0.14999118447303772
device: cuda, dtype: torch.int64, 10000 times           0.14951884001493454
```
After:
```
__and__ (a.numel() == 10) for 100000 times
device: cpu, dtype: torch.int8, 100000 times            0.23157884553074837
device: cpu, dtype: torch.uint8, 100000 times           0.23063660878688097
device: cpu, dtype: torch.int16, 100000 times           0.23005440644919872
device: cpu, dtype: torch.int32, 100000 times           0.23748818412423134
device: cpu, dtype: torch.int64, 100000 times           0.24106105230748653
device: cuda, dtype: torch.int8, 100000 times           1.4394256137311459
device: cuda, dtype: torch.uint8, 100000 times          1.4436759827658534
device: cuda, dtype: torch.int16, 100000 times          1.4631587155163288
device: cuda, dtype: torch.int32, 100000 times          1.459101552143693
device: cuda, dtype: torch.int64, 100000 times          1.4784048134461045
__and__ (a.numel() == 1000) for 10000 times
device: cpu, dtype: torch.int8, 10000 times             0.028442862443625927
device: cpu, dtype: torch.uint8, 10000 times            0.028130197897553444
device: cpu, dtype: torch.int16, 10000 times            0.025318274274468422
device: cpu, dtype: torch.int32, 10000 times            0.02519288007169962
device: cpu, dtype: torch.int64, 10000 times            0.028299466706812382
device: cuda, dtype: torch.int8, 10000 times            0.14342594426125288
device: cuda, dtype: torch.uint8, 10000 times           0.145280827768147
device: cuda, dtype: torch.int16, 10000 times           0.14673697855323553
device: cuda, dtype: torch.int32, 10000 times           0.14499565307050943
device: cuda, dtype: torch.int64, 10000 times           0.14582364354282618
__iand__ (a.numel() == 10) for 100000 times
device: cpu, dtype: torch.int8, 100000 times            0.25548241566866636
device: cpu, dtype: torch.uint8, 100000 times           0.2552562616765499
device: cpu, dtype: torch.int16, 100000 times           0.25905191246420145
device: cpu, dtype: torch.int32, 100000 times           0.26635489892214537
device: cpu, dtype: torch.int64, 100000 times           0.26269810926169157
device: cuda, dtype: torch.int8, 100000 times           1.485458506271243
device: cuda, dtype: torch.uint8, 100000 times          1.4742380809038877
device: cuda, dtype: torch.int16, 100000 times          1.507783885113895
device: cuda, dtype: torch.int32, 100000 times          1.4926990242674947
device: cuda, dtype: torch.int64, 100000 times          1.519851053133607
__iand__ (a.numel() == 1000) for 10000 times
device: cpu, dtype: torch.int8, 10000 times             0.03425929415971041
device: cpu, dtype: torch.uint8, 10000 times            0.03293587639927864
device: cpu, dtype: torch.int16, 10000 times            0.029559112153947353
device: cpu, dtype: torch.int32, 10000 times            0.030915481969714165
device: cpu, dtype: torch.int64, 10000 times            0.03292469773441553
device: cuda, dtype: torch.int8, 10000 times            0.15792148280888796
device: cuda, dtype: torch.uint8, 10000 times           0.16000914946198463
device: cuda, dtype: torch.int16, 10000 times           0.1600684942677617
device: cuda, dtype: torch.int32, 10000 times           0.16162546630948782
device: cuda, dtype: torch.int64, 10000 times           0.1629159888252616
```
Fix  https://github.com/pytorch/pytorch/issues/24508, https://github.com/pytorch/pytorch/issues/24509,  https://github.com/pytorch/pytorch/issues/24655, https://github.com/pytorch/pytorch/issues/24656.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31104

Differential Revision: D18938930

Pulled By: VitalyFedyunin

fbshipit-source-id: a77e805a0b84e8ace16c6e648c2f67dad44f2e44
2020-01-03 10:32:36 -08:00
717274c001 Add useful warnings for t.grad when it won't be populated for known reasons (#30531)
Summary:
Fix https://github.com/pytorch/pytorch/issues/2362 and https://github.com/pytorch/pytorch/issues/19778

To avoid issues with frozen model, we only consider warning for Tensors that require gradients and are neither leafs nor retain gradients.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30531

Differential Revision: D18832767

Pulled By: albanD

fbshipit-source-id: 743e863dc14ab57713e66da78b2e4d759dfba0ff
2019-12-11 09:47:18 -08:00
5edfe9cb80 add torch.square (#30719)
Summary:
fixes https://github.com/pytorch/pytorch/issues/30524
This adds an new operator `torch.square` to PyTorch

I think it is ready for the first-time review now albanD
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30719

Differential Revision: D18909268

Pulled By: albanD

fbshipit-source-id: 5626c445d8db20471a56fc1d7a3490e77812662b
2019-12-10 15:22:46 -08:00
1189595875 Fix Tensor.argsort -> torch.argsort documentation link
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30464

Differential Revision: D18717657

Pulled By: zou3519

fbshipit-source-id: 9894f63c6cb1b5311117441e78805230d1bc09f3
2019-12-04 07:49:38 -08:00
a68b790293 fix ref to nonexistent torch.repeat
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30614

Differential Revision: D18808517

Pulled By: ezyang

fbshipit-source-id: 27f9bda6fbbd1c3c751a0e96fdc336bf724c0b31
2019-12-04 07:27:01 -08:00
bb5dcaf24f Add logical_and and logical_or (#30521)
Summary:
With the CI failure caused in 8bbafa0b32d2899ef6101172d62c6049427c977b fixed (incorrect return type of the lambdas in CUDA kernels)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30521

Differential Revision: D18770151

Pulled By: ailzhang

fbshipit-source-id: 02f0fe1d5718c34d24da6dbb5884ee8b247ce39a
2019-12-03 18:24:54 -08:00
ec5c08de74 Revert D18580867: Add logical_and and logical_or
Test Plan: revert-hammer

Differential Revision:
D18580867

Original commit changeset: 7e4d7c37da4d

fbshipit-source-id: 81fb604c7aef8d847f518f5faa016e7bd0423016
2019-11-27 09:27:00 -08:00
8bbafa0b32 Add logical_and and logical_or (#28162)
Summary:
Superseding https://github.com/pytorch/pytorch/issues/24379 as type promotion has been implemented.

Close https://github.com/pytorch/pytorch/issues/24379
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28162

Differential Revision: D18580867

Pulled By: ailzhang

fbshipit-source-id: 7e4d7c37da4dc8df87314bd4f1f6a7539e46586a
2019-11-26 17:38:22 -08:00
bd0394d473 Add op bitwise_xor to replace __xor__ and __ixor__ (#25665)
Summary:
We define `bitwise_xor` instead of
`__xor__` and `__ixor__`. The reason is that (a) it is not idiomatic to call
functions starting and ending with double underscores, and that (b) the
types of argument that we can add is limited (e.g., no out), and that (c) consistent with the naming of `bitwise_not` and numpy.

Fix https://github.com/pytorch/pytorch/issues/24513,  Fix https://github.com/pytorch/pytorch/issues/24517, Fix https://github.com/pytorch/pytorch/issues/24660, Fix https://github.com/pytorch/pytorch/issues/24664
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25665

Differential Revision: D17577143

Pulled By: VitalyFedyunin

fbshipit-source-id: 042f6385f9305bd66d50a8ce82e28f40a23a7266
2019-11-12 16:14:04 -08:00
ad47788647 Add Polygamma to the docs (#27696)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/25347
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27696

Differential Revision: D17916790

Pulled By: ezyang

fbshipit-source-id: ac2635a300b1ef0ab437e3ffac152239754fe828
2019-10-15 07:00:57 -07:00
a23edd6b9c Fix Type Errors in Examples about Named Tensor (#27828)
Summary:
`names` should be `tuple`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27828

Differential Revision: D17908112

Pulled By: zou3519

fbshipit-source-id: bd1454c5d6e6b690955f49380e34c4b0ddaf879b
2019-10-14 09:24:45 -07:00
82a69a690f Add documentation for torch.lgamma (#27812)
Summary:
Changelog:
- Add doc string in _torch_docs.py, _tensor_docs.py
- Expose in docs/source/torch.rst, docs/source/tensors.rst
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27812

Test Plan:
- Remove `lgamma`, `lgamma_` from the blacklist

Fixes https://github.com/pytorch/pytorch/issues/27783

Differential Revision: D17907630

Pulled By: ezyang

fbshipit-source-id: 14e662a4e5262126889a437e5c4bfb21936730e8
2019-10-14 08:47:04 -07:00
23bffc4f14 Fix most documentation warnings (#27782)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27782

Warnings show up when running `make html` to build documentation. All of
the warnings are very reasonable and point to bugs in our docs. This PR
attempts to fix most of those warnings.

In the future we will add something to the CI that asserts that there
are no warnings in our docs.

Test Plan: - build and view changes locally

Differential Revision: D17887067

Pulled By: zou3519

fbshipit-source-id: 6bf4d08764759133b20983d6cd7f5d27e5ee3166
2019-10-13 10:34:01 -07:00
7c472ec597 Vectorized complex unary and binary op support. (#26500)
Summary:
Added Complex support with AVX to unary ops and binary ops.

I need to add nan propagation to minimum() and maximum() in the future.
In-tree changes to pytorch to support complex numbers are being submitted here.
Out-of-tree support for complex numbers is here: pytorch-cpu-strided-complex extension

Preliminary Benchmarks are here.

I tried rrii and riri and found that riri is better in most situations.
Divide is very slow because you can't reduce 1/(x+y)
Sqrt is also very slow.
Reciprocal could be sped up after I add conj()
Everything else is typically within 20% of the real number performance.
Questions:

Why does macOS not support mil? #if AT_MKL_ENABLED() && !defined(__APPLE__) in vml.h. MKL does support some complex operations like Abs, so I was curious about trying it.
Is MKL just calling AVX?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26500

Differential Revision: D17835431

Pulled By: ezyang

fbshipit-source-id: 6746209168fbeb567af340c22bf34af28286bd54
2019-10-09 12:49:21 -07:00
59b14a7620 Documentation for named tensors (#27173)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27173

`docs/source/named_tensor.rst` is the entry point; most users will land
either here or the named tensor tutorial when looking to use named
tensors. We should strive to make this as readable, concise, and understandable
as possible.

`docs/source/name_inference.rst` lists all of the name inference rules.
It should be clear but it's hard to make it concise.

Please let me know if anything doesn't make sense and please propose
alternative wordings and/or restructuring to improve the documentation.
This should ultimately get cherry-picked into the 1.3 branch as one
monolithic commit so it would be good to get all necessary changes made
in this PR and not have any follow ups.

Test Plan: - built and reviewed locally with `cd docs/ && make html`.

Differential Revision: D17763046

Pulled By: zou3519

fbshipit-source-id: c7872184fc4b189d405b18dad77cad6899ae1522
2019-10-08 22:22:30 -07:00
9f9c6c0999 From docs of scatter_add_() removed erroneous comment on uniqueness of indices. (#27132)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/27080
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27132

Differential Revision: D17765307

Pulled By: soumith

fbshipit-source-id: b0892ff442f3b49f8e3cdf029e2a08b51fa88f28
2019-10-04 11:02:19 -07:00
b93823cb65 Per-channel quantized tensor to have only a single axis (#26675)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26675

Based on offline poll, we're very unlikely to have multi-axis quantized tensors in the foreseeable future. Let's simplify API and just return int instead of list. It also matches the singular `axis` name.

Test Plan: Imported from OSS

Differential Revision: D17537052

Pulled By: dzhulgakov

fbshipit-source-id: 676abc3b251d288468aaed467b5e5ca4063b98b0
2019-09-23 22:29:01 -07:00
8c1354c31b Implement more support for per-channel quantization (#26240)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26240

In particular adds support for empty/empty_like which is needed for memory layouts to work.

Test Plan: Imported from OSS

Differential Revision: D17443220

Pulled By: dzhulgakov

fbshipit-source-id: 9c9e25981999c0edaf40be104a5741e9c62a1333
2019-09-19 13:39:17 -07:00
687aa781df Fix typo
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/25238

Differential Revision: D17076308

Pulled By: mrshenli

fbshipit-source-id: 2827150be1d15af63088db21051ab0e3476992e6
2019-08-28 07:39:11 -07:00
12ea1d74f0 Add missing functions and methods for channelwise quantization (#24934)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24934

1) Functions and methods to get scales and zero_points for channelwise quantization were missing. Adding these.
2) Correctly print quantized tensors for channelwise quantization.
ghstack-source-id: 88868339

Test Plan:
buck test mode/dev caffe2/test:quantized -- 'test_qtensor\ \(test_quantized_tensor.TestQuantizedTensor\)'  --print-passing-details

```
Running 1 tests
Started new test run: https://our.intern.facebook.com/intern/testinfra/testrun/1970324844629541
      ✓ caffe2/test:quantized - test_qtensor (test_quantized_tensor.TestQuantizedTensor) 0.161 1/1 (passed)
Test output:
> test_qtensor (test_quantized_tensor.TestQuantizedTensor) ... ok
>
> ----------------------------------------------------------------------
> Ran 1 test in 0.161s
>
> OK
Finished test run: https://our.intern.facebook.com/intern/testinfra/testrun/1970324844629541
Summary (total time 6.61s):
  PASS: 1
  FAIL: 0
  SKIP: 0
  FATAL: 0
  TIMEOUT: 0
  OMIT: 0
```
To be added in a followup diff.
Current output for printing qtensors:
print(W_q.int_repr())
print(W_q)

```
> tensor([[[[-3,  0,  0],
>           [ 4, -2, -4],
>           [-1, -3, -2]],
>
>          [[-3,  1,  3],
>           [-3, -3,  3],
>           [-3, -5, -1]]],
>
>
>         [[[ 4, -3, -4],
>           [ 4, -3, -3],
>           [ 4, -1, -1]],
>
>          [[ 2, -3,  0],
>           [ 3,  1,  1],
>           [ 2, -4,  0]]]], dtype=torch.int8)
> tensor([[[[-0.9273, -0.2318, -0.2318],
>           [ 0.6955, -0.6955, -1.1592],
>           [-0.4637, -0.9273, -0.6955]],
>
>          [[-0.9273,  0.0000,  0.4637],
>           [-0.9273, -0.9273,  0.4637],
>           [-0.9273, -1.3910, -0.4637]]],
>
>
>         [[[ 0.3938, -0.1575, -0.2363],
>           [ 0.3938, -0.1575, -0.1575],
>           [ 0.3938,  0.0000,  0.0000]],
>
>          [[ 0.2363, -0.1575,  0.0788],
>           [ 0.3150,  0.1575,  0.1575],
>           [ 0.2363, -0.2363,  0.0788]]]], size=(2, 2, 3, 3), dtype=torch.qint8,
>        quantization_scheme=torch.per_channel_affine,
>        scale=tensor([0.2318, 0.0788]), zero_point=tensor([ 1, -1]))
```

Differential Revision: D16659715

fbshipit-source-id: f8d3eeaff8f618aa0cca4fd076db73318e6df946
2019-08-23 15:44:16 -07:00
e166811598 Documentation for Tensor.record_stream() (#24078)
Summary:
This patch writes documentation for `Tensor.record_stream()`, which is not a documented API currently. I've discussed publishing it with colesbury in https://github.com/pytorch/pytorch/issues/23729.

The documentation is based on [the introduction at `CUDACachingAllocator.cpp`](25d1496d58/c10/cuda/CUDACachingAllocator.cpp (L47-L50)). ~~I didn't explain full details of the life cycle of memory blocks or stream awareness of the allocator for the consistent level of details with other documentations.~~ I explained about the stream awareness in a note block.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24078

Differential Revision: D16743526

Pulled By: zou3519

fbshipit-source-id: 05819c3cc96733e2ba93c0a7c0ca06933acb22f3
2019-08-16 08:07:33 -07:00
338f9c860f Add logical_xor operator (#23847)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23847

Related to #23836

Pull Request resolved: https://github.com/pytorch/pytorch/pull/23847

Test Plan: Imported from OSS

Differential Revision: D16678300

Pulled By: gchanan

fbshipit-source-id: 67020aca5830b6bec2f561105954e0a8c2ee37e0
2019-08-15 08:40:25 -07:00
1f4c73618c Add logical_not operator. (#23839)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23839

Close #23836

Pull Request resolved: https://github.com/pytorch/pytorch/pull/23839

Test Plan: Imported from OSS

Differential Revision: D16678301

Pulled By: gchanan

fbshipit-source-id: 54e7b3f3b04c577e239b88493247e1c036266774
2019-08-15 08:40:21 -07:00
19c675178f Updated docs and added deprecation warnings to acknowledge a bool tensor (#22261)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22261
ghimport-source-id: 1611d62d056a04c0ad15ef662e594a3d206a78e2

Test Plan: Imported from OSS

Differential Revision: D16005990

Pulled By: izdeby

fbshipit-source-id: 2413824aa75a0755719e4df11acd21e6607e5a85
2019-08-05 07:42:34 -07:00
af638ad5d7 pin_memory should not copy on already pinned tensors (#23484)
Summary:
fixes https://github.com/pytorch/pytorch/issues/21076
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23484

Differential Revision: D16546264

Pulled By: ezyang

fbshipit-source-id: 8058e0bbc6336751f36b884d71234feef498a982
2019-07-30 21:16:23 -07:00
b3a9a7a9b9 Rename gels to lstsq (#23460)
Summary:
Changelog:
- Rename `gels` to `lstsq`
- Fix all callsites
- Rename all tests
- Create a tentative alias for `lstsq` under the name `gels` and add a deprecation warning to not promote usage.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23460

Test Plan: - All tests should pass to confirm that the patch is correct

Differential Revision: D16547834

Pulled By: colesbury

fbshipit-source-id: b3bdb8f4c5d14c7716c3d9528e40324cc544e496
2019-07-30 09:56:04 -07:00
45d3f495ef Add document of function torch.as_strided (#22842)
Summary:
Documentation of `torch.as_strided` and `Tensor.as_strided` is missing. As mentioned in https://github.com/pytorch/pytorch/issues/9886
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22842

Differential Revision: D16254106

Pulled By: soumith

fbshipit-source-id: dee142483fb9ef7bea84bd44a970b6eccdcdc471
2019-07-23 06:06:00 -07:00
bd88fd0793 Added .bfloat16() (#22852)
Summary:
Add conversion method for bfloat16
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22852

Differential Revision: D16256760

Pulled By: izdeby

fbshipit-source-id: 01d75495f9df513a0cdf78791c3eb013ab92bd95
2019-07-15 09:32:18 -07:00
45cf33a731 add fill_diagonal function (#21892)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/21796
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21892

Differential Revision: D16164678

Pulled By: colesbury

fbshipit-source-id: 85df8ae9b7a6a91b6023fe7295b3a8124e4526ea
2019-07-11 09:20:44 -07:00
e2dc1fc715 Add a bitwise NOT operator for integer and Boolean types (CPU).
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/22283

Test Plan: Imported from OSS

Differential Revision: D16183576

Pulled By: colesbury

fbshipit-source-id: 2e539fab8ff885dddb9bff334d1d784b28d65b8f
2019-07-10 12:17:44 -07:00