pytorch

mirror of https://github.com/pytorch/pytorch.git synced 2025-10-21 13:44:15 +08:00

Author	SHA1	Message	Date
lezcano	452569dffb	cfloat and cdouble functions (#58137 ) Summary: This adds the methods `Tensor.cfloat()` and `Tensor.cdouble()`. I was not able to find the tests for `.float()` functions. I'd be happy to add similar tests for these functions once someone points me to them. Fixes https://github.com/pytorch/pytorch/issues/56014 Pull Request resolved: https://github.com/pytorch/pytorch/pull/58137 Reviewed By: ejguan Differential Revision: D28412288 Pulled By: anjali411 fbshipit-source-id: ff3653cb3516bcb3d26a97b9ec3d314f1f42f83d	2021-05-13 21:13:37 -07:00
Peter Bell	2043093217	Add correction parameter to std/var (#50903 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/50903 First part of #50010. Also fixes #51127. Test Plan: Imported from OSS Reviewed By: ngimel Differential Revision: D27911345 Pulled By: mruberry fbshipit-source-id: 7138fddc935802918ab9ff19f4bc1b9f4d745d41	2021-05-07 14:40:28 -07:00
Ilqar Ramazanli	15975cf6a6	To add priority of int/int? over int[] on signature matching and adding {h,v,d}split methods (#57346 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/54555 As discussed in that issue, the {h,v,d}split methods unexpectedly match a single int[] argument when they are expected to match a single int argument. The same unexpected behavior can occur in other functions and methods that accept both int[] and int? as single-argument signatures. This PR solves the problem by giving int/int? arguments higher priority than int[] when sorting signatures. It also adds the {h,v,d}split methods, which helped us discover this unexpected behavior. Pull Request resolved: https://github.com/pytorch/pytorch/pull/57346 Reviewed By: ezyang Differential Revision: D28121234 Pulled By: iramazanli fbshipit-source-id: 851cf40b370707be89298177b51ceb4527f4b2d6	2021-05-03 18:52:41 -07:00
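The priority rule described in this commit can be pictured with a small stand-alone sketch. It is not the PyTorch argument parser: `Signature`, `priority` and `sort_signatures` are hypothetical names, and the type strings are only labels.

```python
from typing import List, NamedTuple


class Signature(NamedTuple):
    """Hypothetical stand-in for one overload: a name plus argument type labels."""
    name: str
    arg_types: List[str]


def priority(sig: Signature) -> int:
    # Overloads taking a plain int (or int?) sort first; int[] overloads sort last.
    return 1 if any(t == "int[]" for t in sig.arg_types) else 0


def sort_signatures(sigs: List[Signature]) -> List[Signature]:
    # sorted() is stable, so overloads with equal priority keep their original order.
    return sorted(sigs, key=priority)


overloads = [
    Signature("hsplit", ["Tensor", "int[]"]),
    Signature("hsplit", ["Tensor", "int"]),
]
ordered = sort_signatures(overloads)
print([s.arg_types[-1] for s in ordered])  # ['int', 'int[]']
```

With the int overload tried first, a call that passes a single integer no longer falls into the int[] overload by accident.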
Peter Bell	33eea146ee	torch.clamp with tensor min and max (#52695 ) Summary: Fixes gh-2793 Pull Request resolved: https://github.com/pytorch/pytorch/pull/52695 Reviewed By: mruberry Differential Revision: D27395977 Pulled By: ezyang fbshipit-source-id: f86aa240feb034d42e4c45447e72218f6a773c24	2021-05-03 12:56:16 -07:00
Akifumi Imanishi	9da0f2e95e	Support `__pos__` and `positive` (#55891 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/55604. This PR implements `torch.Tensor.__pos__` and `torch.positive` for the compatibility with NumPy’s interface. (cc: mruberry, rgommers, emcastillo and kmaehashi) Pull Request resolved: https://github.com/pytorch/pytorch/pull/55891 Reviewed By: H-Huang Differential Revision: D28025928 Pulled By: mruberry fbshipit-source-id: e43e329a802f31bf8805f6efab5c2c7ef34c88b9	2021-04-27 13:23:59 -07:00
kshitij12345	298db67220	[OpInfo] Add Function Variant and Opinfo for permute (#56125 ) Summary: Reference: https://github.com/pytorch/pytorch/issues/54261 Pull Request resolved: https://github.com/pytorch/pytorch/pull/56125 Reviewed By: ezyang Differential Revision: D27960312 Pulled By: mruberry fbshipit-source-id: b9dd89f7e69d7dff29f3b53828656c13df898fa5	2021-04-25 21:26:44 -07:00
Sameer Deshmukh	5fb1142702	Add CSR (compressed sparse row) layout for sparse tensors (#50937 ) Summary: Implement compressed sparse row format. Derived from the GCS implementation at https://github.com/pytorch/pytorch/pull/44190 Pull Request resolved: https://github.com/pytorch/pytorch/pull/50937 Reviewed By: mrshenli Differential Revision: D27439865 Pulled By: ezyang fbshipit-source-id: 3ba3dcb9679505b980ff6a5f513e913bbae2fb1d	2021-04-12 10:09:12 -07:00
mattip	7d56de1834	DOC: use autosummary on tensors.rst (#55042 ) Summary: Related to https://github.com/pytorch/pytorch/issues/52256 Splits tensors into a table-of-contents page and many sub-pages, one for each function Pull Request resolved: https://github.com/pytorch/pytorch/pull/55042 Reviewed By: mrshenli Differential Revision: D27628688 Pulled By: zou3519 fbshipit-source-id: 08e87700a8e7d5b3fba3f1949e29e988a42bf2c6	2021-04-08 06:44:23 -07:00
lezcano	fd02fc5d71	Port put_ and take from TH to ATen (#53356 ) Summary: The two ports were done together, as they can be implemented with the same kernel. In TH, they were already implemented with the same kernel. Resolves https://github.com/pytorch/pytorch/issues/24751 Resolves https://github.com/pytorch/pytorch/issues/24614 Resolves https://github.com/pytorch/pytorch/issues/24640 Resolves https://github.com/pytorch/pytorch/issues/24772 This port makes sure that it interacts correctly with the "deterministic algorithms" flag, as done in https://github.com/pytorch/pytorch/pull/51388 This PR also makes these two functions correct in the following aspects (all of them added to the tests as well): - Support for complex numbers - Correct handling of scalar inputs and zero-dimensional inputs - Implementation that does not do any copies nor sorting of any of the input tensors - Faster and more correct implementation of the backwards (now it works as it should when `source.shape() != index.shape()`) - Now `put_(..., accumulate=True)` is implemented correctly with atomic operations on GPU / CPU (when possible) and is deterministic (modulo the loss of precision that might happen due to the reordering of a sum of floats) - Adds the `torch.put` function that was missing (`index_put` exists, for example) - Corrected docs It also adds much more thorough testing of the operations and their gradients. There is a BC-breaking change, and that is that now we check that the inputs do not overlap in the `put_` operation. This was handled (some of the cases, other cases were wrong) in the TH implementation by making contiguous copies of the inputs. How should we handle this one? Edit. 
Benchmarks: <details> <summary>Script</summary>
```python
from IPython import get_ipython
import torch
from itertools import product

torch.manual_seed(13)
torch.set_num_threads(1)

ipython = get_ipython()

cpu = torch.device('cpu')
cuda = torch.device('cuda')

def run_test(ndims, size, index_len, device, cmd):
    print(f"cmd: {cmd}, ndims: {ndims}, tensor_size: {size}, index_len: {index_len}, device: {device}")

    # ndims-dimensional tensor with `size` elements along each dimension
    large_tensor = torch.rand(*([size] * ndims), device=device)
    small_tensor = torch.rand((index_len,), device=device)
    index = torch.randint(size * ndims, (index_len,), dtype=torch.long, device=device)
    if cmd == "put":
        command = "large_tensor.put_(index, small_tensor, accumulate=False)"
        if device == cuda:
            command += "; torch.cuda.synchronize()"
    elif cmd == "accumulate":
        command = "large_tensor.put_(index, small_tensor, accumulate=True)"
        if device == cuda:
            command += "; torch.cuda.synchronize()"
    elif cmd == "take":
        command = "torch.take(large_tensor, index)"
        if device == cuda:
            command += "; torch.cuda.synchronize()"
    # Time the command with IPython's %timeit magic
    ipython.magic(f"timeit {command}")
    print()

for method, device in product(["accumulate", "put", "take"], [cpu, cuda]):
    run_test(3, 1000, 10, device, method)
    run_test(3, 1000, 1000, device, method)
    run_test(3, 1000, 10000, device, method)
    run_test(2, 10000, 100000, device, method)
```
</details> ```python put_(accumulate=False) ``` <details> <summary>ATen CPU (1.5x - 2x speedup)</summary> ```python cmd: put, ndims: 3, tensor_size: 1000, index_len: 10, device: cpu 1.05 µs ± 2.35 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each) cmd: put, ndims: 3, tensor_size: 1000, index_len: 1000, device: cpu 3.15 µs ± 5.13 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each) cmd: put, ndims: 3, tensor_size: 1000, index_len: 10000, device: cpu 21.6 µs ± 13.1 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each) cmd: put, ndims: 2, tensor_size: 10000, index_len: 100000, device: cpu 238 µs ± 781 ns per loop (mean ± std. dev. 
of 7 runs, 1000 loops each) ``` </details> <details> <summary>TH CPU</summary> ```python cmd: put, ndims: 3, tensor_size: 1000, index_len: 10, device: cpu 722 ns ± 2.67 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each) cmd: put, ndims: 3, tensor_size: 1000, index_len: 1000, device: cpu 4.89 µs ± 18.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each) cmd: put, ndims: 3, tensor_size: 1000, index_len: 10000, device: cpu 42.5 µs ± 96.3 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each) cmd: put, ndims: 2, tensor_size: 10000, index_len: 100000, device: cpu 428 µs ± 774 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each) ``` </details> <details> <summary>ATen GPU (same speed)</summary> ```python cmd: put, ndims: 3, tensor_size: 1000, index_len: 10, device: cuda 8.99 µs ± 16 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each) cmd: put, ndims: 3, tensor_size: 1000, index_len: 1000, device: cuda 10.4 µs ± 24.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each) cmd: put, ndims: 3, tensor_size: 1000, index_len: 10000, device: cuda 10.4 µs ± 11.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each) cmd: put, ndims: 2, tensor_size: 10000, index_len: 100000, device: cuda 15.6 µs ± 1.12 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each) ``` </details> <details> <summary>TH GPU</summary> ```python cmd: put, ndims: 3, tensor_size: 1000, index_len: 10, device: cuda 8.44 µs ± 31.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each) cmd: put, ndims: 3, tensor_size: 1000, index_len: 1000, device: cuda 9.09 µs ± 4.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each) cmd: put, ndims: 3, tensor_size: 1000, index_len: 10000, device: cuda 9.77 µs ± 0.998 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each) cmd: put, ndims: 2, tensor_size: 10000, index_len: 100000, device: cuda 15.8 µs ± 5.7 ns per loop (mean ± std. dev. 
of 7 runs, 100000 loops each) ``` </details> ```python put_(accumulate=True) ``` <details> <summary>ATen CPU (x2 speedup)</summary> ```python cmd: accumulate, ndims: 3, tensor_size: 1000, index_len: 10, device: cpu 1.12 µs ± 2.91 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each) cmd: accumulate, ndims: 3, tensor_size: 1000, index_len: 1000, device: cpu 3.14 µs ± 2.05 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each) cmd: accumulate, ndims: 3, tensor_size: 1000, index_len: 10000, device: cpu 20.8 µs ± 25.9 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each) cmd: accumulate, ndims: 2, tensor_size: 10000, index_len: 100000, device: cpu 264 µs ± 263 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each) ``` </details> <details> <summary>TH CPU</summary> ```python cmd: accumulate, ndims: 3, tensor_size: 1000, index_len: 10, device: cpu 814 ns ± 1.87 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each) cmd: accumulate, ndims: 3, tensor_size: 1000, index_len: 1000, device: cpu 5.11 µs ± 6.02 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each) cmd: accumulate, ndims: 3, tensor_size: 1000, index_len: 10000, device: cpu 43.9 µs ± 49.4 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each) cmd: accumulate, ndims: 2, tensor_size: 10000, index_len: 100000, device: cpu 442 µs ± 1.07 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each) ``` </details> <details> <summary>ATen GPU (3x - 11x speedup)</summary> ```python cmd: accumulate, ndims: 3, tensor_size: 1000, index_len: 10, device: cuda 9.01 µs ± 14.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each) cmd: accumulate, ndims: 3, tensor_size: 1000, index_len: 1000, device: cuda 10.4 µs ± 15.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each) cmd: accumulate, ndims: 3, tensor_size: 1000, index_len: 10000, device: cuda 10.3 µs ± 44.3 ns per loop (mean ± std. dev. 
of 7 runs, 100000 loops each) cmd: accumulate, ndims: 2, tensor_size: 10000, index_len: 100000, device: cuda 12.6 µs ± 19 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each) ``` </details> <details> <summary>TH GPU</summary> ```python cmd: accumulate, ndims: 3, tensor_size: 1000, index_len: 10, device: cuda 34.7 µs ± 131 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each) cmd: accumulate, ndims: 3, tensor_size: 1000, index_len: 1000, device: cuda 38.2 µs ± 116 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each) cmd: accumulate, ndims: 3, tensor_size: 1000, index_len: 10000, device: cuda 61.2 µs ± 50.4 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each) cmd: accumulate, ndims: 2, tensor_size: 10000, index_len: 100000, device: cuda 140 µs ± 24.2 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each) ``` </details> ```python take() ``` <details> <summary>ATen CPU (1.1x speedup)</summary> ```python cmd: take, ndims: 3, tensor_size: 1000, index_len: 10, device: cpu 1.18 µs ± 2.34 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each) cmd: take, ndims: 3, tensor_size: 1000, index_len: 1000, device: cpu 2.79 µs ± 2.96 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each) cmd: take, ndims: 3, tensor_size: 1000, index_len: 10000, device: cpu 16.6 µs ± 10.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each) cmd: take, ndims: 2, tensor_size: 10000, index_len: 100000, device: cpu 161 µs ± 984 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each) ``` </details> <details> <summary>TH CPU</summary> ```python cmd: take, ndims: 3, tensor_size: 1000, index_len: 10, device: cpu 1.1 µs ± 3.14 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each) cmd: take, ndims: 3, tensor_size: 1000, index_len: 1000, device: cpu 2.93 µs ± 7.31 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each) cmd: take, ndims: 3, tensor_size: 1000, index_len: 10000, device: cpu 18.6 µs ± 14.5 ns per loop (mean ± std. dev. 
of 7 runs, 100000 loops each) cmd: take, ndims: 2, tensor_size: 10000, index_len: 100000, device: cpu 178 µs ± 139 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each) ``` </details> <details> <summary>ATen GPU (same speed)</summary> ```python cmd: take, ndims: 3, tensor_size: 1000, index_len: 10, device: cuda 9.38 µs ± 23.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each) cmd: take, ndims: 3, tensor_size: 1000, index_len: 1000, device: cuda 10.7 µs ± 9.77 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each) cmd: take, ndims: 3, tensor_size: 1000, index_len: 10000, device: cuda 10.6 µs ± 107 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each) cmd: take, ndims: 2, tensor_size: 10000, index_len: 100000, device: cuda 11.5 µs ± 21.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each) ``` </details> <details> <summary>TH GPU</summary> ```python cmd: take, ndims: 3, tensor_size: 1000, index_len: 10, device: cuda 9.31 µs ± 7.57 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each) cmd: take, ndims: 3, tensor_size: 1000, index_len: 1000, device: cuda 9.52 µs ± 5.78 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each) cmd: take, ndims: 3, tensor_size: 1000, index_len: 10000, device: cuda 9.73 µs ± 17.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each) cmd: take, ndims: 2, tensor_size: 10000, index_len: 100000, device: cuda 11.7 µs ± 5.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each) ``` </details> cc mruberry Pull Request resolved: https://github.com/pytorch/pytorch/pull/53356 Reviewed By: mruberry Differential Revision: D27520243 Pulled By: ngimel fbshipit-source-id: e3979349c2c62d2949e09fb05e5fd4883fbc9093	2021-04-05 18:05:38 -07:00
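For readers unfamiliar with the two operations benchmarked above, here is a minimal pure-Python sketch of their semantics on an already-flattened list. It is an illustration only, not the ATen kernel; the helper names `take` and `put` and the list-based representation are assumptions of this sketch.

```python
from typing import List


def take(flat: List[float], index: List[int]) -> List[float]:
    # Read elements by their position in the flattened (1-D) view of the data.
    return [flat[i] for i in index]


def put(flat: List[float], index: List[int], source: List[float],
        accumulate: bool = False) -> List[float]:
    # Write `source` values at the given flat positions and return a new list.
    out = list(flat)
    for i, value in zip(index, source):
        if accumulate:
            out[i] += value  # repeated indices add up
        else:
            out[i] = value   # in this sketch the last write wins
    return out


data = [1.0, 2.0, 3.0, 4.0]
print(take(data, [0, 3]))                                # [1.0, 4.0]
print(put(data, [1, 1], [10.0, 5.0], accumulate=True))   # [1.0, 17.0, 3.0, 4.0]
```

The "last write wins" behavior for repeated indices without accumulation is a property of this sequential sketch; a parallel kernel should not be assumed to give the same guarantee.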
Heitor Schueroff	6e2d020037	Add interpolation kwarg to torch.quantile (#49267 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49267 This PR builds upon the PR https://github.com/pytorch/pytorch/pull/48711 by RockingJavaBean. The original PR introduced a BC-breaking change by making the interpolation parameter positional. Thus, previous invocations of torch.quantile that did not include the interpolation parameter failed after the PR landed. To avoid BC-breaking changes, we preserve the original signatures and make the interpolation parameter in the new signatures kwarg-only. For now, interpolation cannot have a default value to avoid ambiguity with the deprecated signature. However, due to limitations of codegen and C++, we cannot have a required arg after optional ones. Thus, this PR also makes dim and keepdim required args. Once we can remove the old signatures, the dim, keepdim and interpolation parameters in the new signature will get their default values back. __TODO__ --- - [ ] Run backward compat tests This reverts commit 2f1d1eb7df5e8032392b73751c84025a2aa3d1ee. Test Plan: Imported from OSS Reviewed By: glaringlee Differential Revision: D27337117 Pulled By: heitorschueroff fbshipit-source-id: 7fe31f22027645e0d6cb3cab0392d532a4b362c9	2021-04-02 12:11:36 -07:00
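The keyword-only design described in this commit can be sketched in plain Python for a 1-D list. This is a hypothetical helper, not `torch.quantile` itself; the mode names (`linear`, `lower`, `higher`, `midpoint`) follow NumPy's conventional names and are an assumption here, since the commit message does not list them.

```python
import math
from typing import Sequence


def quantile(values: Sequence[float], q: float, *, interpolation: str) -> float:
    # `interpolation` is keyword-only and has no default, mirroring the
    # transitional signature described in the commit message.
    data = sorted(values)
    pos = q * (len(data) - 1)          # fractional index of the quantile
    lo, hi = math.floor(pos), math.ceil(pos)
    if interpolation == "lower":
        return data[lo]
    if interpolation == "higher":
        return data[hi]
    if interpolation == "midpoint":
        return (data[lo] + data[hi]) / 2
    if interpolation == "linear":
        return data[lo] + (data[hi] - data[lo]) * (pos - lo)
    raise ValueError(f"unknown interpolation mode: {interpolation!r}")


print(quantile([1, 2, 3, 4], 0.5, interpolation="linear"))  # 2.5
print(quantile([1, 2, 3, 4], 0.5, interpolation="lower"))   # 2
```

Because of the bare `*`, a call such as `quantile(values, 0.5, "linear")` raises `TypeError`, so an old positional call can never be silently reinterpreted as an interpolation mode.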
Jeff Yang	9ef53f7e0f	docs: remove extra backticks in `narrow_copy` (#54669 ) Summary: fixes https://github.com/pytorch/pytorch/issues/41590 https://11813004-65600975-gh.circle-artifacts.com/0/docs/tensors.html Pull Request resolved: https://github.com/pytorch/pytorch/pull/54669 Reviewed By: ailzhang Differential Revision: D27328228 Pulled By: zou3519 fbshipit-source-id: 9a4a9bc4b265b0e82cf91f94dbbfd842fc42cdcb	2021-03-29 10:38:21 -07:00
kshitij12345	0527d14248	[numpy] Add torch.take_along_dim (#52833 ) Summary: Reference: https://github.com/pytorch/pytorch/issues/38349 Wrapper around the existing `torch.gather` with broadcasting logic. TODO: * [x] Add Doc entry (see if phrasing can be improved) * [x] Add OpInfo * [x] Add test against numpy * [x] Handle broadcasting behaviour and when dim is not given. Pull Request resolved: https://github.com/pytorch/pytorch/pull/52833 Reviewed By: malfet Differential Revision: D27319038 Pulled By: mruberry fbshipit-source-id: 00f307825f92c679d96e264997aa5509172f5ed1	2021-03-28 05:22:51 -07:00
Xiang Gao	eec48303c0	Make index_add take a scalar argument alpha (#54176 ) Summary: ``` index_add(Tensor self, int dim, Tensor index, Tensor source) -> Tensor ``` now becomes ``` index_add(Tensor self, int dim, Tensor index, Tensor source, Scalar alpha=1) -> Tensor ``` Generally, this sounds useful and harmless, and inside PyTorch, we are already needing this feature in `add_out_dense_sparse_cuda`, see the `SparseCUDATensorMath.cu` change in this PR. Test not added yet. Will add if after discussion we believe this is a good idea. - [ ] TODO: add test Pull Request resolved: https://github.com/pytorch/pytorch/pull/54176 Reviewed By: ngimel Differential Revision: D27319198 Pulled By: mruberry fbshipit-source-id: fe43be082d1230c87c5313458213d5252be2ff23	2021-03-28 00:22:45 -07:00
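The effect of the new `alpha` argument can be shown with a 1-D pure-Python sketch: each source element is scaled by `alpha` before it is added at its target index. The list-based helper below is illustrative only and ignores the `dim` argument of the real signature.

```python
from typing import List


def index_add(self_vals: List[float], index: List[int],
              source: List[float], alpha: float = 1) -> List[float]:
    # out[index[i]] += alpha * source[i]; repeated indices accumulate.
    out = list(self_vals)
    for i, src in zip(index, source):
        out[i] += alpha * src
    return out


print(index_add([0.0, 0.0, 0.0], [0, 2, 0], [1.0, 2.0, 3.0]))           # [4.0, 0.0, 2.0]
print(index_add([0.0, 0.0, 0.0], [0, 2, 0], [1.0, 2.0, 3.0], alpha=2))  # [8.0, 0.0, 4.0]
```

With the default `alpha=1` the result matches the old signature, which is why the change is described as harmless.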
Heitor Schueroff	591084abb8	Deprecate torch.matrix_power in favor of torch.linalg.matrix_power (#53538 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/53538 * #52608 Added torch.linalg.matrix_power Test Plan: Imported from OSS Reviewed By: bdhirsh Differential Revision: D27261531 Pulled By: heitorschueroff fbshipit-source-id: 5a944b390f3cc6896c2aa92ba467319ddc9309e4	2021-03-23 15:11:24 -07:00
Xiong Wei	da10ccd35f	Implements cpu_kernel_multiple_outputs and torch.frexp (#51097 ) Summary: Close https://github.com/pytorch/pytorch/issues/51108 Related https://github.com/pytorch/pytorch/issues/38349 This PR implements `cpu_kernel_multiple_outputs` to support returning multiple values from a CPU kernel. ```c++ auto iter = at::TensorIteratorConfig() .add_output(out1) .add_output(out2) .add_input(in1) .add_input(in2) .build(); at::native::cpu_kernel_multiple_outputs(iter, [=](float a, float b) -> std::tuple<float, float> { float add = a + b; float mul = a * b; return std::tuple<float, float>(add, mul); } ); ``` Here `out1` will equal `torch.add(in1, in2)`, while `out2` will equal `torch.mul(in1, in2)`. It helps developers implement new torch functions that return two tensors more conveniently, such as NumPy-like functions [divmod](https://numpy.org/doc/1.18/reference/generated/numpy.divmod.html?highlight=divmod#numpy.divmod) and [frexp](https://numpy.org/doc/stable/reference/generated/numpy.frexp.html#numpy.frexp). This PR adds the `torch.frexp` function to exercise the new functionality provided by `cpu_kernel_multiple_outputs`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/51097 Reviewed By: albanD Differential Revision: D26982619 Pulled By: heitorschueroff fbshipit-source-id: cb61c7f2c79873ab72ab5a61cbdb9203531ad469	2021-03-15 10:44:32 -07:00
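The idea of one elementwise function feeding two outputs can be mirrored in plain Python. The helper `kernel_multiple_outputs` below is a hypothetical analogue of the C++ snippet in the commit message, not a binding to it; `math.frexp` is the standard-library scalar counterpart of the decomposition that `torch.frexp` performs per element.

```python
import math
from typing import Callable, List, Tuple


def kernel_multiple_outputs(
    fn: Callable[[float, float], Tuple[float, float]],
    in1: List[float],
    in2: List[float],
) -> Tuple[List[float], List[float]]:
    # Apply `fn` elementwise and route its two results to two output lists.
    out1, out2 = [], []
    for a, b in zip(in1, in2):
        first, second = fn(a, b)
        out1.append(first)
        out2.append(second)
    return out1, out2


add_out, mul_out = kernel_multiple_outputs(
    lambda a, b: (a + b, a * b), [1.0, 2.0], [3.0, 4.0]
)
print(add_out, mul_out)   # [4.0, 6.0] [3.0, 8.0]

# frexp splits a float into a mantissa and an integer exponent: x == m * 2**e.
mantissa, exponent = math.frexp(8.0)
print(mantissa, exponent)  # 0.5 4
```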
Mike Ruberry	1795398c24	Updates rounding_mode documentation to remove "true" (#52202 ) Summary: In design review the use of the word "true" for a "rounding mode" which actually performed no rounding was, understandably, considered confusing. This PR updates the documentation to remove references to "true." The signatures for torch.div and torch.divide are updated to reflect the future behavior where rounding_mode=None will be the default. This is slightly inaccurate. Today when rounding mode is not specified it is effectively None, but users cannot actually specify rounding_mode=None today. That change was considered too disruptive to the 1.8 branch cut process. Pull Request resolved: https://github.com/pytorch/pytorch/pull/52202 Reviewed By: gchanan Differential Revision: D26424979 Pulled By: mruberry fbshipit-source-id: db3cc769c0d9c6d7e42bfad294073c99fa9168d9	2021-02-12 09:19:39 -08:00
Michael Dagitses	d61d8d886b	correct value argument name for Tensor.index_fill_ docs (#51763 ) Summary: The name of "val" is inconsistent with the rest of the API and also inconsistent with the underlying C++ implementation. Pull Request resolved: https://github.com/pytorch/pytorch/pull/51763 Test Plan: Used the following command to demonstrate incorrect docs before and correct docs after: python -c 'import torch; print(torch.Tensor.index_fill_.__doc__)' Fixes https://github.com/pytorch/pytorch/issues/51250 Reviewed By: zhangguanheng66 Differential Revision: D26271273 Pulled By: dagitses fbshipit-source-id: 4897da80b639c54ca652d2111e13f26efe2646a0	2021-02-09 07:15:52 -08:00
Peter Bell	b150f150ba	Add division overload with rounding_mode selection (#51706 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/51706 Pull Request resolved: https://github.com/pytorch/pytorch/pull/50280 As mentioned in gh-43874, this adds a `rounding_mode={'true', 'trunc', 'floor'}` argument so `torch.div` can be used as a replacement for `floor_divide` during the transitional period. I've included dedicated kernels for truncated and floor division which aren't strictly necessary for float, but do perform significantly better (~2x) than doing true division followed by a separate rounding kernel. Note: I introduce new overloads for `aten::div` instead of just adding a default `rounding_mode` because various JIT passes rely on the exact operator schema. Test Plan: Imported from OSS Reviewed By: ngimel Differential Revision: D26123271 Pulled By: mruberry fbshipit-source-id: 51a83717602114597ec9c4d946e35a392eb01d46	2021-02-04 13:08:36 -08:00
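The three rounding modes named in this commit differ only for quotients that are not whole numbers, and `trunc` and `floor` differ only for negative quotients. A scalar pure-Python sketch follows; the helper `div` is illustrative and uses the mode strings from this commit message.

```python
import math


def div(a: float, b: float, rounding_mode: str = "true") -> float:
    q = a / b
    if rounding_mode == "true":
        return q                      # no rounding at all
    if rounding_mode == "trunc":
        return float(math.trunc(q))   # round toward zero
    if rounding_mode == "floor":
        return float(math.floor(q))   # round toward negative infinity
    raise ValueError(f"unknown rounding_mode: {rounding_mode!r}")


print(div(-7, 2))            # -3.5
print(div(-7, 2, "trunc"))   # -3.0
print(div(-7, 2, "floor"))   # -4.0
```

The sketch rounds after a true division; the commit notes that the dedicated kernels avoid this two-step approach for performance.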
Jeffrey Wan	b18eeaa80a	Implement `np.diff` for single order differences (#50569 ) Summary: Implements `np.diff` for single order differences only: - method and function variants for `diff` and function variant for `diff_out` - supports out variant, but not in-place since shape changes - adds OpInfo entry, and test in `test_torch` - automatic autograd because we are using the `Math` dispatch _Update: we only support Tensors for prepend and append in this PR. See discussion below and comments for more details._ Currently there is a quirk in the c++ API based on how this is implemented: it is not possible to specify scalar prepend and appends without also specifying all 4 arguments. That is because the goal is to match NumPy's diff signature of `diff(int n=1, int dim=-1, Union[Scalar, Tensor] prepend=None, Union[Scalar, Tensor] append=None)` where all arguments are optional, positional and in the correct order. There are a couple blockers. One is c++ ambiguity. This prevents us from simply doing `diff(int n=1, int dim=-1, Scalar? prepend=None, Tensor? append=None)` etc for all combinations of {Tensor, Scalar} x {Tensor, Scalar}. Why not have append, prepend not have default args and then write out the whole power set of {Tensor, Scalar, omitted} x {Tensor, Scalar, omitted} you might ask. Aside from having to write 18 overloads, this is actually illegal because arguments with defaults must come after arguments without defaults. This would mean having to write `diff(prepend, append, n, dim)` which is not desired. Finally writing out the entire power set of all arguments n, dim, prepend, append is out of the question because that would actually involve 2 * 2 * 3 * 3 = 36 combinations. And if we include the out variant, that would be 72 overloads! With this in mind, the current way this is implemented is actually to still do `diff(int n=1, int dim=-1, Scalar? prepend=None, Tensor? append=None)`. But also make use of `cpp_no_default_args`. 
The idea is to only have one of the 4 {Tensor, Scalar} x {Tensor, Scalar} provide default arguments for the c++ api, and add `cpp_no_default_args` for the remaining 3 overloads. With this, Python api works as expected, but some calls such as `diff(prepend=1)` won't work on c++ api. We can optionally add 18 more overloads that cover the {dim, n, no-args} x {scalar-tensor, tensor-scalar, scalar-scalar} x {out, non-out} cases for c++ api. _[edit: counting is hard - just realized this number is still wrong. We should try to count the cases we do cover instead and subtract that from the total: (2 * 2 * 3 * 3) - (3 + 2^4) = 17. 3 comes from the 3 of 4 combinations of {tensor, scalar}^2 that we declare to be `cpp_no_default_args`, and the one remaining case that has default arguments covers 2^4 cases. So actual count is 34 additional overloads to support all possible calls]_ _[edit: thanks to https://github.com/pytorch/pytorch/issues/50767 hacky_wrapper is no longer necessary; it is removed in the latest commit]_ hacky_wrapper was also necessary here because `Tensor?` will cause dispatch to look for the `const optional<Tensor>&` schema but also generate a `const Tensor&` declaration in Functions.h. hacky_wrapper allows us to define our function as `const Tensor&` but wraps it in optional for us, so this avoids both the errors while linking and loading. _[edit: rewrote the above to improve clarity and correct the fact that we actually need 18 more overloads (26 total), not 18 in total to complete the c++ api]_ Pull Request resolved: https://github.com/pytorch/pytorch/pull/50569 Reviewed By: H-Huang Differential Revision: D26176105 Pulled By: soulitzer fbshipit-source-id: cd8e77cc2de1117c876cd71c29b312887daca33f	2021-02-02 20:25:16 -08:00
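As a reference point for the behavior being ported, a single-order difference with optional `prepend` and `append` can be sketched in a few lines of pure Python. The helper works on 1-D lists only and accepts sequences (not scalars) for `prepend` and `append`, in line with the update note in the commit message; it is an illustration, not the ATen implementation.

```python
from typing import List, Optional, Sequence


def diff(values: Sequence[float],
         prepend: Optional[Sequence[float]] = None,
         append: Optional[Sequence[float]] = None) -> List[float]:
    # Concatenate prepend + values + append, then take adjacent differences.
    data = list(prepend or []) + list(values) + list(append or [])
    return [b - a for a, b in zip(data, data[1:])]


print(diff([1, 3, 6, 10]))                  # [2, 3, 4]
print(diff([1, 3, 6], prepend=[0]))         # [1, 2, 3]
print(diff([1, 3, 6], append=[10]))         # [2, 3, 4]
```

The output is one element shorter than the concatenated input, which is why the commit offers an out variant but no in-place variant.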
Sam Estep	c147aa306c	Use doctest directly to get docstring examples (#50596 ) Summary: This PR addresses [a two-year-old TODO in `test/test_type_hints.py`](`12942ea52b/test/test_type_hints.py (L21-L22)`) by replacing most of the body of our custom `get_examples_from_docstring` function with [a function from Python's built-in `doctest.DocTestParser` class](https://docs.python.org/3/library/doctest.html#doctest.DocTestParser.get_examples). This mostly made the parser more strict, catching a few errors in existing doctests: - missing `...` in multiline statements - missing space after `>>>` - unmatched closing parenthesis Also, as shown by [the resulting diff of the untracked `test/generated_type_hints_smoketest.py` file](https://pastebin.com/vC5Wz6M0) (also linked from the test plan below), this introduces a few incidental changes as well: - standalone comments are no longer preserved - indentation is now visually correct - [`example_torch_promote_types`](`4da9ceb743/torch/_torch_docs.py (L6753-L6772)`) is now present - an example called `example_torch_tensor___array_priority__` is added, although I can't tell where it comes from - the last nine lines of code from [`example_torch_tensor_align_as`](`5d45140d68/torch/_tensor_docs.py (L386-L431)`) are now present - the previously-misformatted third line from [`example_torch_tensor_stride`](`5d45140d68/torch/_tensor_docs.py (L3508-L3532)`) is now present Pull Request resolved: https://github.com/pytorch/pytorch/pull/50596 Test Plan: Checkout the base commit, typecheck the doctests, and save the generated file: ``` $ python test/test_type_hints.py TestTypeHints.test_doc_examples $ cp test/generated_type_hints_smoketest.py /tmp ``` Then checkout this PR, do the same thing, and compare: ``` $ python test/test_type_hints.py TestTypeHints.test_doc_examples $ git diff --no-index {/tmp,test}/generated_type_hints_smoketest.py ``` The test should succeed, and the diff should match [this paste](https://pastebin.com/vC5Wz6M0). 
Reviewed By: walterddr Differential Revision: D25926245 Pulled By: samestep fbshipit-source-id: 23bc379ff438420e556263c19582dba06d8e42ec	2021-01-20 15:55:36 -08:00
chengjun	4a8ef4525e	Add new backend type for Intel heterogeneous computation platform. (#49786 ) Summary: Add a new device type 'XPU' ('xpu' for lower case) to PyTorch. Changes are needed for code related to device model and kernel dispatch, e.g. DeviceType, Backend and DispatchKey etc. https://github.com/pytorch/pytorch/issues/48246 Pull Request resolved: https://github.com/pytorch/pytorch/pull/49786 Reviewed By: mrshenli Differential Revision: D25893962 Pulled By: ezyang fbshipit-source-id: 7ff0a316ee34cf0ed6fc7ead08ecdeb7df4b0052	2021-01-20 08:15:18 -08:00
kiyosora	4803eaf502	Implement NumPy-like function torch.fmax() & torch.fmin() (#49312 ) Summary: - Implements the NumPy-like functions `torch.fmax()` and `torch.fmin()` recommended in https://github.com/pytorch/pytorch/issues/48440 Pull Request resolved: https://github.com/pytorch/pytorch/pull/49312 Reviewed By: izdeby Differential Revision: D25887246 Pulled By: heitorschueroff fbshipit-source-id: d762eeff8b328bfcbe7d48b7ee9d2da72c249691	2021-01-20 06:45:25 -08:00
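The commit message does not spell out the semantics, so the scalar sketch below assumes the behavior documented for NumPy's `fmax` and `fmin`: when exactly one operand is NaN, the other operand is returned, and NaN results only when both are NaN.

```python
import math


def fmax(a: float, b: float) -> float:
    # Prefer the non-NaN operand; NaN is returned only if both are NaN.
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return max(a, b)


def fmin(a: float, b: float) -> float:
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return min(a, b)


nan = float("nan")
print(fmax(nan, 1.0))   # 1.0
print(fmin(2.0, nan))   # 2.0
print(fmax(1.0, 2.0))   # 2.0
```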
kshitij12345	5d45140d68	[numpy] torch.{all/any} : output dtype is always bool (#47878 ) Summary: BC-breaking note: This PR changes the behavior of the any and all functions to always return a bool tensor. Previously these functions were only defined on bool and uint8 tensors, and when called on uint8 tensors they would also return a uint8 tensor. (When called on a bool tensor they would return a bool tensor.) PR summary: https://github.com/pytorch/pytorch/pull/44790#issuecomment-725596687 Fixes 2 and 3 Also Fixes https://github.com/pytorch/pytorch/issues/48352 Changes * Output dtype is always `bool` (consistent with numpy) BC Breaking (Previously used to match the input dtype) * Uses vectorized version for all dtypes on CPU * Enables test for complex * Update doc for `torch.all` and `torch.any` TODO * [x] Update docs * [x] Benchmark * [x] Raise issue on XLA Pull Request resolved: https://github.com/pytorch/pytorch/pull/47878 Reviewed By: albanD Differential Revision: D25714324 Pulled By: mruberry fbshipit-source-id: a87345f725297524242d69402dfe53060521ea5d	2021-01-08 11:05:39 -08:00
Xiang Gao	d00acebd14	Add tensor.view(dtype) (#47951 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/42571 Note that this functionality is a subset of [`numpy.ndarray.view`](https://numpy.org/doc/stable/reference/generated/numpy.ndarray.view.html): - this only supports viewing a tensor as a dtype with the same number of bytes - this does not support viewing a tensor as a subclass of `torch.Tensor` Pull Request resolved: https://github.com/pytorch/pytorch/pull/47951 Reviewed By: ngimel Differential Revision: D25062301 Pulled By: mruberry fbshipit-source-id: 9fefaaef77f15d5b863ccd12d836932983794475	2021-01-08 06:55:21 -08:00
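Viewing data as another dtype of the same byte width reinterprets the stored bits instead of converting the values. The standard-library sketch below shows this for float32 and int32 with `struct`; the helper name is hypothetical and, unlike a real view, the sketch copies the bytes.

```python
import struct
from typing import List


def view_float32_as_int32(values: List[float]) -> List[int]:
    # Pack as little-endian float32, then read the same bytes back as int32.
    raw = struct.pack(f"<{len(values)}f", *values)
    return list(struct.unpack(f"<{len(values)}i", raw))


# 1.0 is stored as 0x3F800000; -0.0 has only the sign bit set.
print(view_float32_as_int32([1.0, 0.0, -0.0]))  # [1065353216, 0, -2147483648]
```

Both formats use four bytes per element, which is the "same number of bytes" restriction stated in the commit message.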
Ralf Gommers	d99a0c3b3e	Improve docs for scatter and gather functions (#49679 ) Summary: - Add warning about non-unique indices - And note that these functions don't broadcast - Add missing `torch.scatter` and `torch.scatter_add` doc entries - Fix parameter descriptions - Improve code examples to make indexing behaviour easier to understand Closes gh-48214 Closes gh-26191 Closes gh-37130 Closes gh-34062 xref gh-31776 Pull Request resolved: https://github.com/pytorch/pytorch/pull/49679 Reviewed By: mruberry Differential Revision: D25693660 Pulled By: ngimel fbshipit-source-id: 4983e7b4efcbdf1ab9f04e58973b4f983e8e43a4	2020-12-23 12:23:15 -08:00
kshitij12345	2780400904	[numpy] Add `torch.xlogy` (#48777 ) Summary: Reference https://github.com/pytorch/pytorch/issues/38349 Fixes https://github.com/pytorch/pytorch/issues/22656 TODO: * [x] Add docs * [x] Add tests Pull Request resolved: https://github.com/pytorch/pytorch/pull/48777 Reviewed By: ngimel Differential Revision: D25681346 Pulled By: mruberry fbshipit-source-id: 369e0a29ac8a2c44de95eec115bf75943fe1aa45	2020-12-22 15:05:59 -08:00
Xiong Wei	3779bdec56	Implementing NumPy-like function torch.broadcast_to (#48997 ) Summary: Related https://github.com/pytorch/pytorch/issues/38349 Implement NumPy-like function `torch.broadcast_to` to broadcast the input tensor to a new shape. Pull Request resolved: https://github.com/pytorch/pytorch/pull/48997 Reviewed By: anjali411, ngimel Differential Revision: D25663937 Pulled By: mruberry fbshipit-source-id: 0415c03f92f02684983f412666d0a44515b99373	2020-12-21 11:24:50 -08:00
Jeffrey Wan	d0a12c5a47	Add sinc operator (#48740 ) Summary: Implements the sinc operator. See https://numpy.org/doc/stable/reference/generated/numpy.sinc.html ![image](https://user-images.githubusercontent.com/13428986/101653855-cdffa080-3a0d-11eb-8426-ecc81c152ebd.png) Pull Request resolved: https://github.com/pytorch/pytorch/pull/48740 Reviewed By: ezyang Differential Revision: D25597565 Pulled By: soulitzer fbshipit-source-id: 6dbcf282ee4eba34930bc9e5c85c0c5e79cf0322	2020-12-18 15:52:24 -08:00
Jeffrey Wan	7767dcfc8d	Revert D25564477: [pytorch][PR] Add sinc operator Test Plan: revert-hammer Differential Revision: D25564477 (`bbc71435b7`) Original commit changeset: 13f36a2b84da fbshipit-source-id: 58cbe8109efaf499dd017531878b9fbbb27976bc	2020-12-16 13:19:16 -08:00
Natalia Gimelshein	afce5890ff	Revert D25421263: [pytorch][PR] [numpy] torch.{all/any} : output dtype is always bool Test Plan: revert-hammer Differential Revision: D25421263 (`c508e5b1bf`) Original commit changeset: c6c681ef9400 fbshipit-source-id: 4c0c9acf42b06a3ed0af8f757ea4512ca35b6c59	2020-12-16 11:11:13 -08:00
Jeffrey Wan	bbc71435b7	Add sinc operator (#48740 ) Summary: Implements the sinc operator. See https://numpy.org/doc/stable/reference/generated/numpy.sinc.html ![image](https://user-images.githubusercontent.com/13428986/101653855-cdffa080-3a0d-11eb-8426-ecc81c152ebd.png) Pull Request resolved: https://github.com/pytorch/pytorch/pull/48740 Reviewed By: izdeby Differential Revision: D25564477 Pulled By: soulitzer fbshipit-source-id: 13f36a2b84dadfb4fd1442a2a40a3a3246cbaecb	2020-12-16 10:33:02 -08:00
kshitij12345	c508e5b1bf	[numpy] torch.{all/any} : output dtype is always bool (#47878 ) Summary: BC-breaking note: This PR changes the behavior of the any and all functions to always return a bool tensor. Previously these functions were only defined on bool and uint8 tensors, and when called on uint8 tensors they would also return a uint8 tensor. (When called on a bool tensor they would return a bool tensor.) PR summary: https://github.com/pytorch/pytorch/pull/44790#issuecomment-725596687 Fixes 2 and 3 Also Fixes https://github.com/pytorch/pytorch/issues/48352 Changes * Output dtype is always `bool` (consistent with numpy) BC Breaking (Previously used to match the input dtype) * Uses vectorized version for all dtypes on CPU * Enables test for complex * Update doc for `torch.all` and `torch.any` TODO * [x] Update docs * [x] Benchmark * [x] Raise issue on XLA Pull Request resolved: https://github.com/pytorch/pytorch/pull/47878 Reviewed By: H-Huang Differential Revision: D25421263 Pulled By: mruberry fbshipit-source-id: c6c681ef94004d2bcc787be61a72aa059b333e69	2020-12-15 13:59:32 -08:00
Peter Bell	5180caeeb4	Remove deprecated spectral ops from torch namespace (#48594 ) Summary: Ref https://github.com/pytorch/pytorch/issues/42175 This removes the 4 deprecated spectral functions: `torch.{fft,rfft,ifft,irfft}`. `torch.fft` is also now imported by default. The actual `at::native` functions are still used in `torch.stft`, so they can't be fully removed yet, but they will be once https://github.com/pytorch/pytorch/issues/47601 has been merged. Pull Request resolved: https://github.com/pytorch/pytorch/pull/48594 Reviewed By: heitorschueroff Differential Revision: D25298929 Pulled By: mruberry fbshipit-source-id: e36737fe8192fcd16f7e6310f8b49de478e63bf0	2020-12-05 04:12:32 -08:00
kiyosora	6ab84ca0f3	Implement NumPy-like function torch.msort() (#48440 ) Summary: - Related with https://github.com/pytorch/pytorch/issues/38349 - Implementing the NumPy-like function `torch.msort()` . Pull Request resolved: https://github.com/pytorch/pytorch/pull/48440 Reviewed By: bdhirsh Differential Revision: D25265753 Pulled By: mruberry fbshipit-source-id: 7709ac5e5667e7541a3dc9048b9c9896b1a6dfa1	2020-12-04 04:32:09 -08:00
Heitor Schueroff	c134f32835	Implemented torch.inner (#46716 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46716 Implemented torch.inner similar to [numpy.inner](https://numpy.org/doc/stable/reference/generated/numpy.inner.html). For now it's implemented as a composite op. TODO - [x] Add documentation Test Plan: Imported from OSS Reviewed By: malfet Differential Revision: D24860351 Pulled By: heitorschueroff fbshipit-source-id: de5c82f285893495491fdba73b35634f4d00bac8	2020-12-03 11:37:55 -08:00
Kurt Mohler	2cb9204159	Add nondeterministic alert to index_copy, median CUDA and kthvalue CUDA (#46942 ) Summary: Also fixes issue where skipped tests did not properly restore deterministic flag. Fixes https://github.com/pytorch/pytorch/issues/46743 Pull Request resolved: https://github.com/pytorch/pytorch/pull/46942 Reviewed By: heitorschueroff Differential Revision: D25298020 Pulled By: mruberry fbshipit-source-id: 14b1680e1fa536ec72018d0cdb0a3cf83b098767	2020-12-03 11:03:07 -08:00
kshitij12345	5c9cef9a6c	[numpy] Add `torch.moveaxis` (#48581 ) Summary: Reference: https://github.com/pytorch/pytorch/issues/38349 #36048 https://github.com/pytorch/pytorch/pull/41480#issuecomment-734398262 Pull Request resolved: https://github.com/pytorch/pytorch/pull/48581 Reviewed By: bdhirsh Differential Revision: D25276307 Pulled By: mruberry fbshipit-source-id: 3e3e4df1343c5ce5b71457badc43f08c419ec5c3	2020-12-03 10:34:33 -08:00
Ailing Zhang	f61de25dfa	Fix index_put doc. (#48673 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/48673 fixes #48642 Test Plan: Imported from OSS Reviewed By: navahgar Differential Revision: D25257078 Pulled By: ailzhang fbshipit-source-id: e5ebd6e07aafb262989fc12131546037fed8ebf6	2020-12-02 10:01:11 -08:00
kiyosora	272f4db043	Implement NumPy-like function torch.float_power() (#44937 ) Summary: - Related with https://github.com/pytorch/pytorch/issues/38349 - Implementing the NumPy-like function `torch.float_power()` . Pull Request resolved: https://github.com/pytorch/pytorch/pull/44937 Reviewed By: ngimel Differential Revision: D25192119 Pulled By: mruberry fbshipit-source-id: 2e446b8e0c2825f045fe057e30c9419335557a05	2020-11-27 18:01:42 -08:00
Fayçal Arbai	2e0a8b75d8	An implementation of torch.tile as requested in pytorch/pytorch#38349 (#47974 ) Summary: The approach is to simply reuse `torch.repeat` while adding one more piece of functionality to tile, which is to prepend 1's to the reps array if the tensor has more dimensions than the reps given in input. Thus, for a tensor of shape (64, 3, 24, 24), reps of (2, 2) will become (1, 1, 2, 2), which is what NumPy does. I've encountered some instability with the test on my end, where I could get a random failure of the test (due to, sometimes, a random value of `self.dim()`, and sometimes, segfaults). I'd appreciate any feedback on the test or an explanation for this instability so I can fix this. Pull Request resolved: https://github.com/pytorch/pytorch/pull/47974 Reviewed By: ngimel Differential Revision: D25148963 Pulled By: mruberry fbshipit-source-id: bf63b72c6fe3d3998a682822e669666f7cc97c58	2020-11-24 18:07:25 -08:00
Randall Hunt	562d4c3bc5	Add basic ldexp operator for numpy compatibility (#45370 ) Summary: Adds ldexp operator for https://github.com/pytorch/pytorch/issues/38349 I'm not entirely sure the changes to `NamedRegistrations.cpp` were needed but I saw other operators in there so I added it. Normally the ldexp operator is used along with the frexp to construct and deconstruct floating point values. This is useful for performing operations on either the mantissa and exponent portions of floating point values. Sleef, std math.h, and cuda support both ldexp and frexp but not for all data types. I wasn't able to figure out how to get the iterators to play nicely with a vectorized kernel so I have left this with just the normal CPU kernel for now. This is the first operator I'm adding so please review with an eye for errors. Pull Request resolved: https://github.com/pytorch/pytorch/pull/45370 Reviewed By: mruberry Differential Revision: D24333516 Pulled By: ranman fbshipit-source-id: 2df78088f00aa9789aae1124eda399771e120d3f	2020-11-20 04:09:39 -08:00
kiyosora	008f840e7a	Implement in-place method torch.cumsum_ and torch.cumprod_ (#47651 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/47193 Pull Request resolved: https://github.com/pytorch/pytorch/pull/47651 Reviewed By: zou3519 Differential Revision: D24992438 Pulled By: ezyang fbshipit-source-id: c38bea55f4af1fc92be780eaa8e1d462316e6192	2020-11-19 11:20:12 -08:00
mfkasim91	8819bad86c	Implement igammac (3rd PR) (#48171 ) Summary: Related: https://github.com/pytorch/pytorch/issues/46183 (torch.igamma) This is the regularized upper incomplete gamma function. This is supposed to be exactly the same as https://github.com/pytorch/pytorch/issues/47463, but after rebasing the `viable/strict` branch. cc: mruberry Pull Request resolved: https://github.com/pytorch/pytorch/pull/48171 Reviewed By: zhangguanheng66 Differential Revision: D25060107 Pulled By: mruberry fbshipit-source-id: 89780dea21dbb2141cbc4f7f18192cb78a769b17	2020-11-18 23:44:32 -08:00
kshitij12345	68a3a3f3b5	Add `torch.swapdims` and `torch.swapaxes` (#46041 ) Summary: Reference https://github.com/pytorch/pytorch/issues/38349 Delegates to `torch.transpose` (not sure what is the best way to alias) TODO: * [x] Add test * [x] Add documentation Pull Request resolved: https://github.com/pytorch/pytorch/pull/46041 Reviewed By: gchanan Differential Revision: D25022816 Pulled By: mruberry fbshipit-source-id: c80223d081cef84f523ef9b23fbedeb2f8c1efc5	2020-11-18 11:35:53 -08:00
Richard Zou	59aca02224	Implement Tensor.new_empty_strided(sizes, strides, *, dtype, device, requires_grad) (#47225 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47225 Summary ------- This PR implements Tensor.new_empty_strided. Many of our torch. factory functions have a corresponding new_* method (e.g., torch.empty and torch.new_empty), but there is no corresponding method to torch.empty_strided. This PR adds one. Motivation ---------- The real motivation behind this is for vmap to be able to work through CopySlices. CopySlices shows up a lot in double backwards because a lot of view functions have backward formulas that perform view+inplace. `e0fd590ec9/torch/csrc/autograd/functions/tensor.cpp (L78-L106)` To support vmap through CopySlices, the approach in this stack is to: - add `Tensor.new_empty_strided` and replace `empty_strided` in CopySlices with that so that we can propagate batch information. - Make some slight modifications to AsStridedBackward (and add as_strided batching rule) Please let me know if it would be better if I squashed everything related to supporting vmap over CopySlices together into a single big PR. Test Plan --------- - New tests. Test Plan: Imported from OSS Reviewed By: ejguan Differential Revision: D24741688 Pulled By: zou3519 fbshipit-source-id: b688047d2eb3f92998896373b2e9d87caf2c4c39	2020-11-09 08:31:01 -08:00
Heitor Schueroff	a4ba018e57	Updated docs/test for dot and vdot (#47242 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47242 Test Plan: Imported from OSS Reviewed By: ejguan Differential Revision: D24733771 Pulled By: heitorschueroff fbshipit-source-id: 92e3b0e28e0565918335fa85d52abe5db9eeff57	2020-11-05 06:27:50 -08:00
Erjia Guan	f1ac63d324	Implement copysign (#46396 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46396 Related #38349 [numpy](https://numpy.org/doc/stable/reference/generated/numpy.copysign.html?highlight=copysign#numpy.copysign) - No in-place function - No method - Optional output - Available: byte, char, bool, int, short, long, float, double, half - Integral promoted to float - Not available: float/double complex `c = np.copysign(a, b)` \| a \| b \| c \| a.grad \| \| -1 \| -1 \| -1 \| 1 \| \| -0 \| -1 \| -0 \| 0 \| \| 0 \| -1 \| -0 \| 0 \| \| 1 \| -1 \| -1 \| -1 \| \| -1 \| -0 \| -1 \| 1 \| \| -0 \| -0 \| 0 \| 0 \| \| 0 \| -0 \| 0 \| 0 \| \| 1 \| -0 \| -1 \| -1 \| \| -1 \| 0 \| 1 \| -1 \| \| -0 \| 0 \| 0 \| 0 \| \| 0 \| 0 \| 0 \| 0 \| \| 1 \| 0 \| 1 \| 1 \| \| -1 \| 1 \| 1 \| -1 \| \| -0 \| 1 \| 0 \| 0 \| \| 0 \| 1 \| 0 \| 0 \| \| 1 \| 1 \| 1 \| 1 \| This function becomes non-differentiable at `a=0` for any `b`. So, in my opinion, we may set the gradient for `a=0` to 0. TODO: - [x] test (cpu/gpu) - [x] doc - [x] ~kernel_vec~ Test Plan: Imported from OSS Reviewed By: mruberry Differential Revision: D24401366 Pulled By: ejguan fbshipit-source-id: 3621c5ff74b185376a3705589983bb5197ab896d	2020-11-04 08:08:57 -08:00
Qi Zhou	0ec717c830	Support int32 indices and offsets in nn.EmbeddingBag (#46758 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46758 It's in general helpful to support int32 indices and offsets, especially when such tensors are large and need to be transferred to accelerator backends. Since it may not be very useful to support the combination of int32 indices and int64 offsets, here we enforce that these two must have the same type. Test Plan: unit tests Reviewed By: ngimel Differential Revision: D24470808 fbshipit-source-id: 94b8a1d0b7fc9fe3d128247aa042c04d7c227f0b	2020-11-03 23:33:50 -08:00
Ivan Yashchuk	f276ab55cd	Added Kronecker product of tensors (torch.kron) (#45358 ) Summary: This PR adds a function for calculating the Kronecker product of tensors. The implementation is based on `at::tensordot` with permutations and reshape. Tests pass. TODO: - [x] Add more test cases - [x] Write documentation - [x] Add entry `common_methods_invokations.py` Ref. https://github.com/pytorch/pytorch/issues/42666 Pull Request resolved: https://github.com/pytorch/pytorch/pull/45358 Reviewed By: mrshenli Differential Revision: D24680755 Pulled By: mruberry fbshipit-source-id: b1f8694589349986c3abfda3dc1971584932b3fa	2020-11-03 12:41:41 -08:00
mfkasim91	6eaa324c9f	Implement torch.igamma (#46183 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/41637 This is regularized lower incomplete gamma function, equivalent to scipy's `gammainc` and tensorflow `igamma`. cc fritzo mruberry Pull Request resolved: https://github.com/pytorch/pytorch/pull/46183 Reviewed By: gchanan Differential Revision: D24479126 Pulled By: mruberry fbshipit-source-id: fdf8ea289fe4ca1b408810732192411e948fcdfe	2020-10-29 11:40:18 -07:00
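
The torch.tile entry in this log (2e0a8b75d8) describes left-padding `reps` with 1's when the tensor has more dimensions than `reps` has entries. A minimal pure-Python sketch of that shape rule follows. It is illustrative only and is not the code from the PR: the helper names `pad_reps` and `tiled_shape` are hypothetical, and the branch that pads the shape when `reps` is longer is an assumption taken from NumPy's documented `tile` behaviour rather than from the commit message.

```python
from typing import Sequence, Tuple


def pad_reps(shape: Sequence[int], reps: Sequence[int]) -> Tuple[int, ...]:
    """Left-pad reps with 1s so it has at least one entry per dimension.

    Hypothetical helper mirroring the rule described in the commit message.
    """
    missing = max(len(shape) - len(reps), 0)
    return (1,) * missing + tuple(reps)


def tiled_shape(shape: Sequence[int], reps: Sequence[int]) -> Tuple[int, ...]:
    """Compute the output shape of a tile operation under the padding rule."""
    padded_reps = pad_reps(shape, reps)
    # Assumption (NumPy's documented behaviour): if reps is longer than the
    # shape, the shape is left-padded with 1s instead.
    missing = len(padded_reps) - len(shape)
    padded_shape = (1,) * missing + tuple(shape)
    # Each dimension is repeated by its corresponding reps entry.
    return tuple(s * r for s, r in zip(padded_shape, padded_reps))


# The example from the commit message: reps (2, 2) becomes (1, 1, 2, 2).
print(pad_reps((64, 3, 24, 24), (2, 2)))
print(tiled_shape((64, 3, 24, 24), (2, 2)))
```

With the padded reps, only the last two dimensions of the example tensor are repeated; the leading two are left unchanged.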


328 Commits