pytorch

mirror of https://github.com/pytorch/pytorch.git synced 2025-10-21 05:34:18 +08:00

Author	SHA1	Message	Date
PyTorch MergeBot	56e3bc5215	Revert "Add spdiags sparse matrix initialization (#78439)" This reverts commit cfb2034b657e8527767f1f74854bc62b4d6d4927. Reverted https://github.com/pytorch/pytorch/pull/78439 on behalf of https://github.com/suo because it broke Windows builds, see: `cfb2034b65`	2022-06-30 21:04:36 +00:00
Andrew M. James	cfb2034b65	Add spdiags sparse matrix initialization (#78439) Similar to [scipy.sparse.spdiags](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.spdiags.html#scipy-sparse-spdiags). Part of #70926. In other functions (e.g. [torch.diagonal](https://pytorch.org/docs/stable/generated/torch.diagonal.html#torch.diagonal)), diagonals of a tensor are referenced using the offset and the two dimensions that the diagonal is taken with respect to. Here the reference implementation from scipy only considers matrix output, so even if we only support 2-D output at first, it may be useful to consider how the dimensions corresponding to each diagonal would be specified for higher-dimensional output. The proposed torch signature implies that all offsets refer to the diagonals with respect to the only two dimensions of the output: ``` torch.sparse.spdiags(Tensor diagonals, IntTensor offsets, int[] shape, Layout? layout=None) -> SparseTensor ``` Above it is required that `diagonals.ndimension() == 2`, `offsets.ndimension() == 1`, `offsets.shape[0] == diagonals.shape[0]` and `len(shape) == 2`. This would need to be altered for the case where `len(shape) > 2`. One option is: ``` torch.sparse.spdiags(Tensor[] diagonals, IntTensor[] offsets, IntTensor dims, int[] shape, Layout? layout=None) -> SparseTensor ``` Here `offsets` and `diagonals` become lists of tensors, and the `IntTensor dims` argument is introduced. This would require that `len(diagonals) == len(offsets) == dims.shape[0]`, `dims.ndimension() == 2` and `dims.shape[1] == 2`; the same restrictions as in the 2-D case above also apply to the elements of `diagonals` and `offsets` pairwise (that is, `diagonals[i].ndimension() == 2`, `offsets[i].ndimension() == 1` and `offsets[i].shape[0] == diagonals[i].shape[0]` for all i). 
This form of the signature would construct the sparse result by placing the values from `diagonals[i][j]` into the diagonal with offset `offsets[i][j]` taken with respect to dimensions `dims[i]`. The specialization back to the original signature for the 2-D case could be seen as allowing the single row of `dims` to default to `[0, 1]` when only one `diagonals`/`offsets` pair is provided and `shape` is 2-D. This option allows the rows of an input element `diagonals[i]` to have a different length, which may be appropriate as the maximum length of a diagonal differs between dimension pairs. Another option is to specify, for each offset, the dimensions the diagonal is taken with respect to. This signature would look like: ``` torch.sparse.spdiags(Tensor diagonals, IntTensor offsets, IntTensor dims, int[] shape, Layout? layout=None) -> SparseTensor ``` Here, `diagonals` is still 2-D with dimension 0 matching the length of the 1-D `offsets`, and the tensor input `dims` is also 2-D with dimension 0 matching the length of the 1-D `offsets` and the second dimension fixed at `2`. In this case the sparse result is constructed by placing the elements from `diagonals[i]` into the output diagonal `output.diagonal(offsets[i], dim0=dims[i][0], dim1=dims[i][1])` (with some additional consideration that makes it more complicated than simply assigning to that view). The specialization from this back to the 2-D form could be seen as assuming `dims = [[0, 1], [0, 1]... len(offsets) times ]` when `len(shape) == 2`. In both proposed signatures for the N-D case the specialization back to the 2-D signature is a bit of a stretch for typical default-argument logic; however, I think the first is the better choice as it offers more flexibility. I think some discussion is required about: - [x] Should the N-D output case be implemented from the outset? - [x] If not, should the future addition of the N-D output case be considered when designing the interface? 
- [x] Other thoughts on the signature which includes the `dims` information for the N-D output case. Resolution: Since no one has requested N-D output support, I think it is fine to restrict this to sparse matrix generation. Should a request for N-D support come later, an overload accepting the additional `dims` could be added. Pull Request resolved: https://github.com/pytorch/pytorch/pull/78439 Approved by: https://github.com/nikitaved, https://github.com/cpuhrsch, https://github.com/pearu	2022-06-30 19:54:47 +00:00
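To make the 2-D case described in the commit above concrete, here is a minimal pure-Python sketch of how rows of `diagonals` could be scattered onto offsets of a matrix. It is not the PyTorch implementation: `spdiags_coo` is a hypothetical helper, it returns plain `(row, col, value)` triples rather than a sparse tensor, and it assumes the `scipy.sparse.spdiags` convention in which entry `j` of a diagonal row lands in column `j`.

```python
def spdiags_coo(diagonals, offsets, shape):
    """Hypothetical helper: return sorted (row, col, value) triples for a 2-D result."""
    # Mirror the restrictions listed in the commit message for the 2-D signature.
    if len(shape) != 2:
        raise ValueError("only 2-D output is supported in this sketch")
    if len(diagonals) != len(offsets):
        raise ValueError("need exactly one offset per row of diagonals")

    n_rows, n_cols = shape
    entries = []
    for diag, offset in zip(diagonals, offsets):
        for col, value in enumerate(diag):
            # Assumed scipy convention: entry j of a diagonal row sits in column j,
            # so its row index is j - offset.
            row = col - offset
            if 0 <= row < n_rows and col < n_cols:
                entries.append((row, col, value))
    return sorted(entries)


coo = spdiags_coo([[0, 1, 2], [3, 4, 5], [6, 7, 8]], [0, 1, 2], (3, 3))
print(coo)
```

Entries that would fall outside the requested shape are simply dropped, which is why the superdiagonals in the example contribute fewer values than the main diagonal.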
Christian Puhrsch	5da776dd08	[Resubmission] fix mul_out CUDA config for COO tensors (#80254) Fixes https://github.com/pytorch/pytorch/issues/79914 Duplicate of https://github.com/pytorch/pytorch/pull/79937. I wasn't able to push changes to the existing PR. Pull Request resolved: https://github.com/pytorch/pytorch/pull/80254 Approved by: https://github.com/eellison	2022-06-28 00:47:03 +00:00
Nikita Vedeneev	417677bf62	`permute` for COO sparse tensors (#79707 ) As per title. Partial implementation of https://github.com/pytorch/pytorch/issues/78422. We cannot satisfy the view semantics once operated over sparse dims. Pull Request resolved: https://github.com/pytorch/pytorch/pull/79707 Approved by: https://github.com/cpuhrsch	2022-06-25 08:49:58 +00:00
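As a rough illustration of what permuting the sparse dimensions of a COO tensor involves, the sketch below reorders the rows of the index matrix and the shape. It is an assumption-laden toy, not the code from the PR: `permute_coo` is a hypothetical helper, the tensor is assumed to have only sparse dimensions, and indices are plain nested lists. Because the index rows are copied into a new order, the result cannot simply alias the input's storage, which is one plausible reading of the remark about view semantics.

```python
def permute_coo(indices, shape, dims):
    """Hypothetical helper: permute a COO tensor whose dimensions are all sparse.

    indices is a list of len(shape) rows, each holding one coordinate per stored value.
    """
    if sorted(dims) != list(range(len(shape))):
        raise ValueError("dims must be a permutation of range(len(shape))")
    # Output dimension k takes its coordinates and size from input dimension dims[k].
    new_indices = [list(indices[d]) for d in dims]
    new_shape = tuple(shape[d] for d in dims)
    return new_indices, new_shape


indices = [[0, 1], [2, 0]]  # stored entries at (0, 2) and (1, 0)
shape = (2, 3)
new_indices, new_shape = permute_coo(indices, shape, (1, 0))
print(new_indices, new_shape)
```

The values are untouched by this operation; only the coordinates change, and the result may no longer be in coalesced (sorted) order.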
Nikita Vedeneev	03cf01bdc0	`index_select` for COO CUDA tensors. (#77551 ) Brings a native CUDA implementation for `index_select`. Master silently converts CUDA tensors to CPU for CUDA support. Case `nnz >> size` could be optimized similar to how https://github.com/pytorch/pytorch/pull/72710 is doing that. Some benchmarks: <details> <summary>PR/torch_sparse/master</summary> ``` [------------------------------- cuda coo.index_select -------------------------------] \| PR \| torch_sparse \| master 32 threads: --------------------------------------------------------------------------- n=10000, nnz=100, index_len=100, dim=0 \| 96 \| 327 \| 70 n=10000, nnz=100, index_len=100, dim=1 \| 120 \| 505 \| 74 n=10000, nnz=100, index_len=1000, dim=0 \| 90 \| 333 \| 93 n=10000, nnz=100, index_len=1000, dim=1 \| 120 \| 499 \| 98 n=10000, nnz=100, index_len=10000, dim=0 \| 92 \| 331 \| 350 n=10000, nnz=100, index_len=10000, dim=1 \| 100 \| 506 \| 352 n=100000, nnz=1000, index_len=100, dim=0 \| 53 \| 274 \| 60 n=100000, nnz=1000, index_len=100, dim=1 \| 90 \| 368 \| 71 n=100000, nnz=1000, index_len=1000, dim=0 \| 93 \| 332 \| 100 n=100000, nnz=1000, index_len=1000, dim=1 \| 130 \| 501 \| 140 n=100000, nnz=1000, index_len=10000, dim=0 \| 100 \| 341 \| 522 n=100000, nnz=1000, index_len=10000, dim=1 \| 130 \| 530 \| 549 n=1000000, nnz=10000, index_len=100, dim=0 \| 90 \| 429 \| 110 n=1000000, nnz=10000, index_len=100, dim=1 \| 296 \| 810 \| 355 n=1000000, nnz=10000, index_len=1000, dim=0 \| 100 \| 435 \| 170 n=1000000, nnz=10000, index_len=1000, dim=1 \| 309 \| 830 \| 548 n=1000000, nnz=10000, index_len=10000, dim=0 \| 110 \| 446 \| 750 n=1000000, nnz=10000, index_len=10000, dim=1 \| 310 \| 830 \| 1000 n=10, nnz=100, index_len=100, dim=0 \| 90 \| 333 \| 74 n=10, nnz=100, index_len=100, dim=1 \| 100 \| 497 \| 78 n=10, nnz=100, index_len=1000, dim=0 \| 90 \| 329 \| 140 n=10, nnz=100, index_len=1000, dim=1 \| 100 \| 800 \| 100 n=10, nnz=100, index_len=10000, dim=0 \| 93 \| 340 \| 900 n=10, 
nnz=100, index_len=10000, dim=1 \| 120 \| 800 \| 489 n=10, nnz=1000, index_len=100, dim=0 \| 90 \| 321 \| 140 n=10, nnz=1000, index_len=100, dim=1 \| 100 \| 680 \| 140 n=10, nnz=1000, index_len=1000, dim=0 \| 110 \| 349 \| 670 n=10, nnz=1000, index_len=1000, dim=1 \| 130 \| 740 \| 800 n=10, nnz=1000, index_len=10000, dim=0 \| 302 \| 503 \| 4882 n=10, nnz=1000, index_len=10000, dim=1 \| 325 \| 2257 \| 5262 n=10, nnz=10000, index_len=100, dim=0 \| 229 \| 349 \| 810 n=10, nnz=10000, index_len=100, dim=1 \| 433 \| 870 \| 700 n=10, nnz=10000, index_len=1000, dim=0 \| 666 \| 502 \| 5581 n=10, nnz=10000, index_len=1000, dim=1 \| 826 \| 2379 \| 4820 n=10, nnz=10000, index_len=10000, dim=0 \| 2534 \| 2700 \| 80000 n=10, nnz=10000, index_len=10000, dim=1 \| 2723 \| 18540 \| 80000 n=100, nnz=1000, index_len=100, dim=0 \| 94 \| 324 \| 110 n=100, nnz=1000, index_len=100, dim=1 \| 100 \| 499 \| 110 n=100, nnz=1000, index_len=1000, dim=0 \| 96 \| 337 \| 150 n=100, nnz=1000, index_len=1000, dim=1 \| 130 \| 800 \| 140 n=100, nnz=1000, index_len=10000, dim=0 \| 100 \| 346 \| 900 n=100, nnz=1000, index_len=10000, dim=1 \| 130 \| 760 \| 900 n=100, nnz=10000, index_len=100, dim=0 \| 90 \| 323 \| 190 n=100, nnz=10000, index_len=100, dim=1 \| 279 \| 800 \| 180 n=100, nnz=10000, index_len=1000, dim=0 \| 110 \| 339 \| 781 n=100, nnz=10000, index_len=1000, dim=1 \| 294 \| 870 \| 800 n=100, nnz=10000, index_len=10000, dim=0 \| 315 \| 505 \| 6264 n=100, nnz=10000, index_len=10000, dim=1 \| 497 \| 2398 \| 5404 n=1000, nnz=10000, index_len=100, dim=0 \| 90 \| 333 \| 160 n=1000, nnz=10000, index_len=100, dim=1 \| 279 \| 635 \| 150 n=1000, nnz=10000, index_len=1000, dim=0 \| 100 \| 328 \| 215 n=1000, nnz=10000, index_len=1000, dim=1 \| 287 \| 810 \| 207 n=1000, nnz=10000, index_len=10000, dim=0 \| 100 \| 339 \| 900 n=1000, nnz=10000, index_len=10000, dim=1 \| 291 \| 880 \| 1000 n=1000, nnz=100000, index_len=100, dim=0 \| 92 \| 358 \| 435 n=1000, nnz=100000, index_len=100, dim=1 \| 302 \| 900 \| 
530 n=1000, nnz=100000, index_len=1000, dim=0 \| 130 \| 360 \| 1000 n=1000, nnz=100000, index_len=1000, dim=1 \| 329 \| 930 \| 1200 n=1000, nnz=100000, index_len=10000, dim=0 \| 343 \| 530 \| 7000 n=1000, nnz=100000, index_len=10000, dim=1 \| 545 \| 2446 \| 6100 n=1000, nnz=1000000, index_len=100, dim=0 \| 355 \| 394 \| 2210 n=1000, nnz=1000000, index_len=100, dim=1 \| 1660 \| 2276 \| 2674 n=1000, nnz=1000000, index_len=1000, dim=0 \| 877 \| 574 \| 6700 n=1000, nnz=1000000, index_len=1000, dim=1 \| 2449 \| 3782 \| 9000 n=1000, nnz=1000000, index_len=10000, dim=0 \| 3112 \| 2931 \| 57000 n=1000, nnz=1000000, index_len=10000, dim=1 \| 7340 \| 20220 \| 65700 Times are in microseconds (us). ``` </details> Pull Request resolved: https://github.com/pytorch/pytorch/pull/77551 Approved by: https://github.com/cpuhrsch	2022-06-01 17:39:03 +00:00
Mike Ruberry	089203f8bc	Updates floor_divide to perform floor division (#78411) Fixes https://github.com/pytorch/pytorch/issues/43874 This PR changes floor_divide to perform floor division instead of truncation division. This is a BC-breaking change, but it's a "bug fix," and we've already warned users for several releases that this behavior would change. Pull Request resolved: https://github.com/pytorch/pytorch/pull/78411 Approved by: https://github.com/ngimel	2022-05-29 21:28:45 +00:00
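The difference between the two rounding modes named in the commit above only shows up when the quotient is negative. The following plain-Python sketch (no PyTorch involved; the helper names are illustrative) contrasts them on scalar inputs:

```python
import math


def trunc_divide(a, b):
    """Truncation division: round the quotient toward zero."""
    return math.trunc(a / b)


def floor_divide(a, b):
    """Floor division: round the quotient toward negative infinity."""
    return math.floor(a / b)


for a, b in [(7, 2), (-7, 2), (7, -2), (-7, -2)]:
    print(a, b, trunc_divide(a, b), floor_divide(a, b))
```

For `-7 / 2` truncation gives `-3` while flooring gives `-4`; for quotients that are positive the two agree.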
Nikita Vedeneev	00a1fb64bb	Faster `index_select` for sparse COO tensors on CPU. (#72710) Fixes https://github.com/pytorch/pytorch/issues/72212. This PR improves the previous algorithm in complexity. It also utilizes the structure of the problem and parallelizes computations when possible. Benchmark results. <details> <summary>Testing script</summary>
```python
import torch
import math
from IPython import get_ipython
from itertools import product
import pickle
from torch.utils.benchmark import Timer, Compare

torch.manual_seed(13)
#torch.set_num_threads(1)
ipython = get_ipython()

index_sizes = (100, 1000, 10000)

# specifies (n, nnz)
problem_dims = (
    # n > nnz
    (10000, 100),
    (100000, 1000),
    (1000000, 10000),
    # n < nnz
    (10, 100),
    (10, 1000),
    (10, 10000),
    (100, 1000),
    (100, 10000),
    (1000, 10000),
    (1000, 100000),
    (1000, 1000000),
    #(1000000, 1000000000),
)

def f(t, d, index):
    s = torch_sparse.SparseTensor.from_torch_sparse_coo_tensor(t)
    ss = s.index_select(d, index)
    return ss.coo()

name = "PR"
results = []

for (n, nnz), m in product(problem_dims, index_sizes):
    for d in (0, 1):
        if nnz < n:
            shape = (n, n)
        else:
            shape = (n, nnz // n) if d == 0 else (nnz // n, n)
        nrows, ncols = shape
        rowidx = torch.randint(low=0, high=nrows, size=(nnz,))
        colidx = torch.randint(low=0, high=ncols, size=(nnz,))
        itemidx = torch.vstack((rowidx, colidx))
        xvalues = torch.randn(nnz)
        index = torch.randint(low=0, high=n, size=(m,))

        SparseX = torch.sparse_coo_tensor(itemidx, xvalues, size=shape).coalesce()
        smtp = "SparseX.index_select(d, index)"
        timer = Timer(smtp,
                      globals=globals(),
                      label="coo.index_select",
                      description=f"{name}: coo.index_select",
                      sub_label=f"n={n}, nnz={nnz}, index_len={m}, dim={d}",
                      num_threads=torch.get_num_threads())
        results.append(timer.blocked_autorange())

compare = Compare(results)
compare.trim_significant_figures()
compare.print()

with open(f"{name}_index_select.pickle", 'wb') as f:
    pickle.dump(results, f)
```
</details> <details> <summary>Gather results</summary> ```python 
import pickle from torch.utils.benchmark import Timer, Compare files = [ "PR", "torch_sparse", "master" ] timers = [] for name in files: with open("{}_index_select.pickle".format(name), 'rb') as f: timers += pickle.load(f) compare = Compare(timers) compare.trim_significant_figures() compare.print() ``` </details> <details> <summary>PR/torch_sparse/master runtime comparison</summary> ``` [----------------------------------- coo.index_select ----------------------------------] \| PR \| torch_sparse \| master 32 threads: ----------------------------------------------------------------------------- n=10000, nnz=100, index_len=100, dim=0 \| 14 \| 140 \| 10 n=10000, nnz=100, index_len=100, dim=1 \| 14 \| 200 \| 10 n=10000, nnz=100, index_len=1000, dim=0 \| 30 \| 180 \| 38 n=10000, nnz=100, index_len=1000, dim=1 \| 34 \| 240 \| 38 n=10000, nnz=100, index_len=10000, dim=0 \| 278 \| 460 \| 330 n=10000, nnz=100, index_len=10000, dim=1 \| 275 \| 516 \| 330 n=100000, nnz=1000, index_len=100, dim=0 \| 16 \| 290 \| 31 n=100000, nnz=1000, index_len=100, dim=1 \| 26 \| 390 \| 31 n=100000, nnz=1000, index_len=1000, dim=0 \| 45 \| 405 \| 263 n=100000, nnz=1000, index_len=1000, dim=1 \| 73 \| 500 \| 261 n=100000, nnz=1000, index_len=10000, dim=0 \| 444 \| 783 \| 2570 n=100000, nnz=1000, index_len=10000, dim=1 \| 470 \| 890 \| 2590 n=1000000, nnz=10000, index_len=100, dim=0 \| 25 \| 2400 \| 270 n=1000000, nnz=10000, index_len=100, dim=1 \| 270 \| 4000 \| 269 n=1000000, nnz=10000, index_len=1000, dim=0 \| 74 \| 2600 \| 2620 n=1000000, nnz=10000, index_len=1000, dim=1 \| 464 \| 3600 \| 2640 n=1000000, nnz=10000, index_len=10000, dim=0 \| 635 \| 3300 \| 26400 n=1000000, nnz=10000, index_len=10000, dim=1 \| 1000 \| 3960 \| 26400 n=10, nnz=100, index_len=100, dim=0 \| 16 \| 137 \| 16 n=10, nnz=100, index_len=100, dim=1 \| 16 \| 220 \| 16 n=10, nnz=100, index_len=1000, dim=0 \| 63 \| 238 \| 81 n=10, nnz=100, index_len=1000, dim=1 \| 60 \| 698 \| 78 n=10, nnz=100, index_len=10000, dim=0 \| 
480 \| 940 \| 862 n=10, nnz=100, index_len=10000, dim=1 \| 330 \| 4930 \| 1070 n=10, nnz=1000, index_len=100, dim=0 \| 60 \| 200 \| 73 n=10, nnz=1000, index_len=100, dim=1 \| 56 \| 683 \| 70 n=10, nnz=1000, index_len=1000, dim=0 \| 480 \| 530 \| 1050 n=10, nnz=1000, index_len=1000, dim=1 \| 330 \| 4550 \| 1368 n=10, nnz=1000, index_len=10000, dim=0 \| 3100 \| 2900 \| 9300 n=10, nnz=1000, index_len=10000, dim=1 \| 3400 \| 46000 \| 9100 n=10, nnz=10000, index_len=100, dim=0 \| 400 \| 453 \| 857 n=10, nnz=10000, index_len=100, dim=1 \| 400 \| 4070 \| 1730 n=10, nnz=10000, index_len=1000, dim=0 \| 2840 \| 2600 \| 13900 n=10, nnz=10000, index_len=1000, dim=1 \| 3700 \| 40600 \| 16000 n=10, nnz=10000, index_len=10000, dim=0 \| 83200 \| 67400 \| 160000 n=10, nnz=10000, index_len=10000, dim=1 \| 68000 \| 528000 \| 190000 n=100, nnz=1000, index_len=100, dim=0 \| 46 \| 148 \| 31 n=100, nnz=1000, index_len=100, dim=1 \| 45 \| 242 \| 37 n=100, nnz=1000, index_len=1000, dim=0 \| 68 \| 248 \| 240 n=100, nnz=1000, index_len=1000, dim=1 \| 66 \| 755 \| 290 n=100, nnz=1000, index_len=10000, dim=0 \| 370 \| 802 \| 2250 n=100, nnz=1000, index_len=10000, dim=1 \| 372 \| 5430 \| 2770 n=100, nnz=10000, index_len=100, dim=0 \| 82 \| 210 \| 224 n=100, nnz=10000, index_len=100, dim=1 \| 74 \| 986 \| 270 n=100, nnz=10000, index_len=1000, dim=0 \| 350 \| 618 \| 2600 n=100, nnz=10000, index_len=1000, dim=1 \| 370 \| 4660 \| 4560 n=100, nnz=10000, index_len=10000, dim=0 \| 3000 \| 3400 \| 41680 n=100, nnz=10000, index_len=10000, dim=1 \| 5000 \| 47500 \| 30400 n=1000, nnz=10000, index_len=100, dim=0 \| 71 \| 160 \| 185 n=1000, nnz=10000, index_len=100, dim=1 \| 64 \| 516 \| 190 n=1000, nnz=10000, index_len=1000, dim=0 \| 100 \| 249 \| 1740 n=1000, nnz=10000, index_len=1000, dim=1 \| 98 \| 1030 \| 1770 n=1000, nnz=10000, index_len=10000, dim=0 \| 600 \| 808 \| 18300 n=1000, nnz=10000, index_len=10000, dim=1 \| 663 \| 5300 \| 18500 n=1000, nnz=100000, index_len=100, dim=0 \| 160 \| 258 \| 1890 
n=1000, nnz=100000, index_len=100, dim=1 \| 200 \| 3620 \| 2050 n=1000, nnz=100000, index_len=1000, dim=0 \| 500 \| 580 \| 18700 n=1000, nnz=100000, index_len=1000, dim=1 \| 640 \| 7550 \| 30000 n=1000, nnz=100000, index_len=10000, dim=0 \| 3400 \| 3260 \| 186000 n=1000, nnz=100000, index_len=10000, dim=1 \| 3600 \| 49600 \| 194000 n=1000, nnz=1000000, index_len=100, dim=0 \| 517 \| 957 \| 18700 n=1000, nnz=1000000, index_len=100, dim=1 \| 680 \| 39600 \| 37600 n=1000, nnz=1000000, index_len=1000, dim=0 \| 3600 \| 4500 \| 186000 n=1000, nnz=1000000, index_len=1000, dim=1 \| 5800 \| 76400 \| 190000 n=1000, nnz=1000000, index_len=10000, dim=0 \| 50000 \| 67900 \| 1800000 n=1000, nnz=1000000, index_len=10000, dim=1 \| 45000 \| 570000 \| 1900000 Times are in microseconds (us). ``` </details> Pull Request resolved: https://github.com/pytorch/pytorch/pull/72710 Approved by: https://github.com/pearu, https://github.com/cpuhrsch	2022-05-10 16:33:13 +00:00
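For readers unfamiliar with what `index_select` has to compute on a COO tensor, here is a small pure-Python sketch of the semantics. It is not the algorithm from the PR, whose details are not given in the commit message: `index_select_coo` is a hypothetical helper operating on nested lists, and the bucketing step is just one simple way to avoid rescanning every stored entry for every requested index.

```python
from collections import defaultdict


def index_select_coo(indices, values, dim, index):
    """Hypothetical helper: select slices of a COO tensor along `dim`.

    indices holds one row of coordinates per dimension; values holds one
    entry per stored element. Returns new (indices, values) lists.
    """
    # Bucket the positions of stored entries by their coordinate along `dim`.
    buckets = defaultdict(list)
    for pos, coord in enumerate(indices[dim]):
        buckets[coord].append(pos)

    out_indices = [[] for _ in indices]
    out_values = []
    for new_coord, wanted in enumerate(index):
        # Every stored entry whose coordinate equals `wanted` is copied,
        # with its coordinate along `dim` rewritten to the output position.
        for pos in buckets.get(wanted, ()):
            for d, row in enumerate(indices):
                out_indices[d].append(new_coord if d == dim else row[pos])
            out_values.append(values[pos])
    return out_indices, out_values


indices = [[0, 1, 1], [2, 0, 2]]  # entries at (0, 2), (1, 0), (1, 2)
values = [10.0, 20.0, 30.0]
out_indices, out_values = index_select_coo(indices, values, 0, [1, 1, 0])
print(out_indices, out_values)
```

Repeated indices duplicate the matching entries, so the output can hold more stored values than the input, which is consistent with the benchmark varying `index_len` independently of `nnz`.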
PyTorch MergeBot	8d67972b14	Revert "Faster `index_select` for sparse COO tensors on CPU. (#72710)" This reverts commit ce3857e73ccbfc1970e90ee886f22e9d26cc97fe. Reverted https://github.com/pytorch/pytorch/pull/72710 on behalf of https://github.com/malfet	2022-05-10 14:43:05 +00:00
Nikita Vedeneev	ce3857e73c	Faster `index_select` for sparse COO tensors on CPU. (#72710) Fixes https://github.com/pytorch/pytorch/issues/72212. This PR improves the previous algorithm in complexity. It also utilizes the structure of the problem and parallelizes computations when possible. Pull Request resolved: https://github.com/pytorch/pytorch/pull/72710 Approved by: https://github.com/pearu, https://github.com/cpuhrsch	2022-05-09 19:59:39 +00:00
Jane Xu	6d9dbd3391	Manually skip test_sparse_addmm as disable code is not working for now (#77076) Related to https://github.com/pytorch/pytorch/issues/73145 It was previously skipped for Linux and Windows, but macOS has become a problem as well. Pull Request resolved: https://github.com/pytorch/pytorch/pull/77076 Approved by: https://github.com/ezyang	2022-05-09 13:54:29 +00:00
Mikayla Gawarecki	0adf070574	Use scatter_reduce to support masked reductions on sparse COO tensors (sum, prod, amin, amax) Pull Request resolved: https://github.com/pytorch/pytorch/pull/75454 Approved by: https://github.com/cpuhrsch	2022-05-06 15:40:22 +00:00
PyTorch MergeBot	381e08309f	Revert "Use scatter_reduce to support masked reductions on sparse COO tensors (sum, prod, amin, amax)" This reverts commit fc2a2e8b7271b258f5f394c94e9154ebef4769e4. Reverted https://github.com/pytorch/pytorch/pull/75454 on behalf of https://github.com/b0noI	2022-05-04 22:31:31 +00:00
Mikayla Gawarecki	fc2a2e8b72	Use scatter_reduce to support masked reductions on sparse COO tensors (sum, prod, amin, amax) Pull Request resolved: https://github.com/pytorch/pytorch/pull/75454 Approved by: https://github.com/cpuhrsch	2022-05-03 23:17:07 +00:00
arindamroy-eng	7478ce187a	ROCM:Unskip more tests for ROCM5.0 Re-enabling more tests which are working on ROCM5.0 Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/75353 Approved by: https://github.com/ezyang	2022-04-19 19:45:55 +00:00
Pearu Peterson	a98b4666e0	Enable test_sparse_mask for Windows Pull Request resolved: https://github.com/pytorch/pytorch/pull/75189 Approved by: https://github.com/cpuhrsch	2022-04-11 17:21:29 +00:00
Brian Hirsh	1b7d7d9327	Reland: "free up dispatch key space (in C++)" (#74963 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/74963 This is a re-land of D35192346 (`9872a06d77`) and D35192317 (`a9216cde6c`), which together are a diff that changes the internal representation of `DispatchKeySet` in pytorch core to free up the number of dispatch keys that we have available. See a more detailed description of the design in the original PR: https://github.com/pytorch/pytorch/pull/69633. The original PR broke Milan workflows, which use a pytorch mobile build, and manifested as a memory corruption bug inside of `liboacrmerged.so`. Background: Existing Mobile Optimization Pytorch mobile builds have an existing optimization (here `cc23725e89/c10/core/DispatchKey.h (L382)` and here `cc23725e89/aten/src/ATen/core/dispatch/OperatorEntry.h (L214)`), which works as follows: Every operator in pytorch has a "dispatch table" of function pointers, corresponding to all of the (up to 64) different kernels that we might dispatch to when we run an operator in pytorch (autograd, cpu, cuda, complex number support, etc). In mobile builds, the size of that table is shrunk from 64 to 8 to save a bunch of space, because mobile doesn't end up using the functionality associated with most dispatch keys. The dispatcher also has a notion of "fallback kernels", which are kernels that you can register to a particular dispatch key, but should be able to work for "any operator". The array of fallback kernels is defined here: `cc23725e89/aten/src/ATen/core/dispatch/Dispatcher.h (L294)`. The mobile-optimization currently does not extend to this array (it wouldn't be that useful anyway because there is only one array of fallback kernels globally - vs. there is a separate dispatch table of function pointers per operator). So the per-operator tables on mobile are size 8, while the fallback table is size 64. 
The Bug This PR actually makes it difficult to enable that optimization separately for the per-operator arrays vs. the fallback array, and incidentally shrunk the size of the fallback array from 64 to 8 for mobile (that happened on this line: https://github.com/pytorch/pytorch/pull/69633/files#diff-f735cd7aa68f15b624100cbc4bb3b5ea76ffc7c9d3bec3b0ccabaa09609e5319R294). That isn't a problem by itself (since mobile doesn't actually use any of the fallbacks that can no longer be stored). However, pytorch core will still register all of those fallback kernels on startup in mobile builds, even if they aren't used. When we tried to register one of those fallbacks on startup, it would try to dump the kernel somewhere in memory past the bounds of the (now smaller) array inside of the `Dispatcher` object, `backendFallbackKernels_`. Why didn't this problem show up in OSS CI? Why didn't it break other internal mobile workflows aside from Milan? Ideally, this failure would show up as part of the OSS signal on GitHub, since we already have mobile OSS builds. Given that it was another memory corruption issue that only affected Milan (subset of mobile), I'm not sure what's specific about Milan's builds that caused it only to manifest there. dreiss I wonder if there's another flavor of mobile builds we could run in OSS CI that could potentially help catch this? The debugging experience was pretty difficult Debugging the Milan-specific failure was made difficult by the following: (1) lack of CI - the original Milan failure didn't surface on my original diff, because the Milan job(s) that failed weren't triggered to run on pytorch changes. There's probably a balance to strike here, since those jobs will only be useful if they aren't flaky, and if they can produce reliable failure logs for debugging. (2) It's difficult to get a repro. 
- my work laptop doesn't have the right specs to run the Milan development workflow (not enough disk space) - There is an existing OnDemand workflow for Milan, but it appears to be relatively new, and after a bunch of help from MarcioPorto, we ran into issues forwarding the log output from Milan tests on the emulator back to the terminal (see the original discussion here: https://fb.workplace.com/groups/OnDemandFRL/permalink/1424937774645433/) (3) Lack of stack-traces. - Most Milan failures didn't include actionable stack traces. phding generously helped me debug by running my suggested patches locally, and reporting back if there were any failures. The failing test didn't include a stack trace though (just the line where the crash appeared), so I ended up making some educated guesses about what the issue was based on the area of the crash. ghstack-source-id: 152688542 Test Plan: Confirmed with phding that the broken Milan workflow from the previous version of this diff is now passing. Reviewed By: phding, albanD Differential Revision: D35222806 fbshipit-source-id: 0ad115a0f768bc8ea5d4c203b2990254c7092d30 (cherry picked from commit 002b91966f11fd55ab3fa3801b636fa39a6dd12c)	2022-03-31 21:52:38 +00:00
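The failure mode described in the commit above, registering fallback kernels into an array that was shrunk from 64 slots to 8, can be mimicked with a bounds-checked toy. Everything here is hypothetical: `FallbackTable`, `register_all`, and the key indices are made up for illustration and are not dispatcher APIs. In Python the bad write is caught and reported; per the commit message, the real C++ code instead wrote past the end of `backendFallbackKernels_`.

```python
class FallbackTable:
    """Hypothetical stand-in for a fixed-size array of fallback kernels."""

    def __init__(self, size):
        self.slots = [None] * size

    def register(self, key_index, kernel):
        # The explicit check is what makes the problem visible in this sketch.
        if not 0 <= key_index < len(self.slots):
            raise IndexError(f"key index {key_index} outside table of size {len(self.slots)}")
        self.slots[key_index] = kernel


def register_all(table, key_indices):
    """Register one fallback per key index; return the indices that did not fit."""
    rejected = []
    for key_index in key_indices:
        try:
            table.register(key_index, f"fallback_{key_index}")
        except IndexError:
            rejected.append(key_index)
    return rejected


startup_keys = [3, 20, 41]  # made-up key indices registered at startup
full = register_all(FallbackTable(64), startup_keys)
shrunk = register_all(FallbackTable(8), startup_keys)
print(full, shrunk)
```

With 64 slots every registration fits; with 8 slots the registrations for the higher key indices have nowhere valid to go, even though, as the commit notes, mobile never uses those fallbacks.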
Nikita Shulga	bfac65dfe5	[testing] Update dispatch macros (#74977 ) This PR is reland of #74289 Co-authored-by: Khushi Agrawal <khushiagrawal411@gmail.com>	2022-03-30 14:13:21 -07:00
PyTorch MergeBot	2e4152b118	Revert "[testing] Update dispatch macros" This reverts commit eed19a0f38a81015ca50dd25e997b1c6e223d46b. Reverted https://github.com/pytorch/pytorch/pull/74289 on behalf of https://github.com/malfet	2022-03-30 19:52:37 +00:00
Khushi Agrawal	eed19a0f38	[testing] Update dispatch macros Hi, This PR is the follow-up PR of #71561. (the previous PR had a couple of merge conflicts and was reverted, this PR resolves that). Please take a look. Thanks! cc: @pmeier @mruberry @kshitij12345 Pull Request resolved: https://github.com/pytorch/pytorch/pull/74289 Approved by: https://github.com/pmeier, https://github.com/mruberry	2022-03-30 16:10:16 +00:00
Brian Hirsh	9872a06d77	Back out "free up dispatch key space (in C++)" (#74859 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/74859 Original commit changeset: 6d1dd0fd8144 Original Phabricator Diff: D34227616 (`2cbddc0e9b`) ghstack-source-id: 152381077 (Note: this ignores all push blocking failures!) Test Plan: Test on Milan with "get weather utterance" buck build fbsourcefbandroid/mode/opt fbsourcefbandroid/mode/milan_build_rdk //fbandroid/apps/wearable/system/speechservice:speechservice_target30_xhdpi_armv7_release_debug_keystore -c pt.has_backtaces=1 Reviewed By: phding Differential Revision: D35192346 fbshipit-source-id: b962de5d5effaf23f9aa8afd3ef36f8c6383de5b (cherry picked from commit 913e3027a11457aaa2d97a9d89ebc6133b14213c)	2022-03-29 15:39:17 +00:00
Christian Puhrsch	e55b73d65a	Add strided layout support for to_dense Fixes #59958 Pull Request resolved: https://github.com/pytorch/pytorch/pull/74486 Approved by: https://github.com/pearu, https://github.com/suo	2022-03-29 00:12:48 +00:00
Pearu Peterson	ebeea9e2ea	Support masked sum on sparse COO tensors. Pull Request resolved: https://github.com/pytorch/pytorch/pull/71239 Approved by: https://github.com/cpuhrsch	2022-03-25 18:26:39 +00:00
Brian Hirsh	2cbddc0e9b	free up dispatch key space (in C++) (#72827 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72827 Reland of D34034848 (`6690256021`) ghstack-source-id: 152161452 Test Plan: Confirm that Milan tests are passing Reviewed By: ezyang Differential Revision: D34227616 fbshipit-source-id: 6d1dd0fd8144dfbd9e194cd7564cce017e7db968 (cherry picked from commit e5c1b29fedd5c2a0bad810cedc94aa784136b6aa)	2022-03-25 17:04:51 +00:00
Nikita Shulga	ef066f0832	Revert D34856571: [pytorch][PR] Replace `get_all_` type macros with the ATen dispatch macros. Test Plan: revert-hammer Differential Revision: D34856571 (`3ded7b1da3`) Original commit changeset: 0dca038bcad5 Original Phabricator Diff: D34856571 (`3ded7b1da3`) fbshipit-source-id: 594553fa0b710d78beba59d5d2b646f1f1270386 (cherry picked from commit 8090eb9b12dcf452a9e7dc01792a66fb91b563b6)	2022-03-15 22:07:11 +00:00
Khushi Agrawal	3ded7b1da3	Replace `get_all_` type macros with the ATen dispatch macros. (#71561 ) Summary: Hi, Team! The PR is motivated from https://github.com/pytorch/pytorch/pull/71153#discussion_r782446738. It aims to replace `get_all` type macros with the ATen dispatch macros. The files it iterates over are: (Thanks, Lezcano, for the idea!!) <details> <summary> `test/test_autograd.py`</summary> <p> ```python 43:from torch.testing._internal.common_dtype import get_all_dtypes 8506: floating_dt = [dt for dt in get_all_dtypes() if dt.is_floating_point] ``` </p> </details> <details> <summary> `test/test_binary_ufuncs.py`</summary> <p> ```python 26: all_types_and_complex_and, integral_types_and, get_all_dtypes, get_all_int_dtypes, get_all_math_dtypes, 27: get_all_complex_dtypes, get_all_fp_dtypes, 935: dtypes(get_all_dtypes(include_bool=False, include_complex=False)) 1035: dtypes(get_all_dtypes( 1488: dtypes((get_all_dtypes(include_bool=False, include_bfloat16=False))) 1879: dtypes(product(get_all_dtypes(include_complex=False), get_all_dtypes(include_complex=False))) 1887: dtypes((get_all_int_dtypes() + [torch.bool])) 1913: dtypes((get_all_fp_dtypes())) 1941: dtypes((get_all_fp_dtypes())) 1977: dtypes(product(get_all_complex_dtypes(), get_all_dtypes())) 2019: dtypes(product(get_all_fp_dtypes(), get_all_fp_dtypes())) 2048: dtypes(get_all_dtypes()) 2110: dtypes(product(get_all_dtypes(include_complex=False), 2111: get_all_dtypes(include_complex=False))) 2128: types = [torch.bool, torch.bfloat16] + get_all_int_dtypes() 2173: if dtypes[1] in get_all_fp_dtypes(): 2178: dtypes(product(get_all_fp_dtypes(), 2179: get_all_fp_dtypes())) 2260: dtypesIfCUDA(set(get_all_math_dtypes('cuda')) - {torch.complex64, torch.complex128}) 2261: dtypes(set(get_all_math_dtypes('cpu')) - {torch.complex64, torch.complex128}) 2273: dtypesIfCUDA(set(get_all_math_dtypes('cuda')) - {torch.complex64, torch.complex128}) 2274: dtypes(set(get_all_math_dtypes('cpu')) - {torch.complex64, 
torch.complex128}) 2307: dtypes(get_all_math_dtypes('cpu')) 2319: dtypes(get_all_fp_dtypes(include_bfloat16=False)) 2331: dtypes(get_all_int_dtypes()) 2356: dtypes(get_all_dtypes(include_bfloat16=False, include_bool=False, include_complex=False)) 2393: if dtype in get_all_int_dtypes(): 2614: dtypes(get_all_dtypes()) 2624: dtypes(tuple(itertools.combinations_with_replacement(get_all_dtypes(), 2))) 2806: dtypes(list(product(get_all_dtypes(include_complex=False), 2807: get_all_dtypes(include_complex=False)))) 2866: dtypes(list(product(get_all_complex_dtypes(), 2867: get_all_complex_dtypes()))) 2902: dtypes(product(get_all_dtypes(), get_all_dtypes())) 2906: dtypes(product(get_all_dtypes(), get_all_dtypes())) 2910: dtypes(product(get_all_dtypes(), get_all_dtypes())) 3019: dtypes = [torch.float, torch.double] + get_all_complex_dtypes() 3221: dtypes(get_all_dtypes(include_complex=False)) 3407: dtypes(list(product(get_all_dtypes(include_bool=False), 3408: get_all_dtypes(include_bool=False)))) 3504: dtypes(product(get_all_dtypes(include_complex=False, include_bfloat16=False), 3505: get_all_dtypes(include_complex=False, include_bfloat16=False))) 3516: if x.dtype in get_all_int_dtypes() + [torch.bool]: 3643: dtypes(product(get_all_dtypes(include_complex=False, 3645: get_all_dtypes(include_complex=False, ``` </p> </details> <details> <summary> `test/test_complex.py`</summary> <p> ```python 6:from torch.testing._internal.common_dtype import get_all_complex_dtypes 11: dtypes(get_all_complex_dtypes()) ``` </p> </details> <details> <summary> `test/test_foreach.py`</summary> <p> ```python 18: get_all_dtypes, get_all_int_dtypes, get_all_complex_dtypes, get_all_fp_dtypes, 142: if dtype in get_all_int_dtypes(): 179: disable_fastpath = op.ref == torch.div and dtype in get_all_int_dtypes() + [torch.bool] 201: disable_fastpath = op.ref == torch.div and dtype in get_all_int_dtypes() + [torch.bool] 205: disable_fastpath \|= dtype in get_all_int_dtypes() + [torch.bool] 211: disable_fastpath 
\|= dtype not in get_all_complex_dtypes() 241: bool_int_div = op.ref == torch.div and dtype in get_all_int_dtypes() + [torch.bool] 246: disable_fastpath \|= dtype in get_all_int_dtypes() + [torch.bool] 248: disable_fastpath \|= dtype not in get_all_complex_dtypes() 250: disable_fastpath \|= True and dtype not in get_all_complex_dtypes() 307: disable_fastpath = dtype in get_all_int_dtypes() + [torch.bool] 365: if opinfo.name == "_foreach_abs" and dtype in get_all_complex_dtypes(): 376: ops(foreach_unary_op_db, dtypes=get_all_dtypes()) 393: dtypes=get_all_dtypes(include_half=True, include_bfloat16=True, include_complex=False)) 401: ops(foreach_minmax_op_db, dtypes=get_all_fp_dtypes(include_bfloat16=True, include_half=True)) 426: if ord in (1, 2) and dtype in torch.testing.get_all_fp_dtypes(): 439: dtypes(get_all_dtypes()) 449: ops(foreach_binary_op_db, dtypes=get_all_dtypes()) 481: ops(foreach_binary_op_db, dtypes=get_all_dtypes()) 536: if dtype in get_all_int_dtypes() + [torch.bool] and foreach_op == torch._foreach_div: 545: ops(foreach_binary_op_db, dtypes=get_all_dtypes()) 637: ops(foreach_pointwise_op_db, allowed_dtypes=get_all_fp_dtypes(include_half=False, include_bfloat16=False)) ``` </p> </details> <details> <summary> `test/test_linalg.py`</summary> <p> ```python 29: all_types, floating_types, floating_and_complex_types, get_all_dtypes, get_all_int_dtypes, get_all_complex_dtypes, 30: get_all_fp_dtypes, 111: dtypes((get_all_dtypes())) 794: float_and_complex_dtypes = get_all_fp_dtypes() + get_all_complex_dtypes() 807: dtypes((get_all_int_dtypes())) 828: dtypes((get_all_fp_dtypes() + get_all_complex_dtypes())) 841: if dtype in get_all_complex_dtypes(): 844: dtypes(itertools.product(get_all_dtypes(), 845: get_all_dtypes())) 855: for dtypes0, dtypes1, dtypes2 in product(get_all_dtypes(), repeat=3): 5607: get_all_fp_dtypes(include_half=not CUDA9, include_bfloat16=(CUDA11OrLater and SM53OrLater))) 5608: dtypes((set(get_all_dtypes()) - {torch.half, torch.bool})) 5644: 
dtypes((get_all_complex_dtypes() + get_all_fp_dtypes())) 6255: dtypesIfCUDA(get_all_complex_dtypes(), 6256: get_all_fp_dtypes(include_bfloat16=(TEST_WITH_ROCM or (CUDA11OrLater and SM53OrLater)), 6292: dtypesIfCUDA(get_all_fp_dtypes(include_bfloat16=(TEST_WITH_ROCM or (CUDA11OrLater and SM53OrLater)))) 6323: dtypesIfCUDA(get_all_complex_dtypes(), 6324: get_all_fp_dtypes(include_bfloat16=(TEST_WITH_ROCM or (CUDA11OrLater and SM53OrLater)))) 6325: dtypes(get_all_complex_dtypes(), get_all_fp_dtypes()) 6358: dtypesIfCUDA(([torch.float, torch.double] + get_all_complex_dtypes())) 6556: dtypes(get_all_fp_dtypes(), get_all_complex_dtypes()) 6668: dtypes(get_all_fp_dtypes(), get_all_complex_dtypes()) 6741: dtypes(get_all_fp_dtypes(), get_all_complex_dtypes()) ``` </p> </details> <details> <summary> `test/test_nn.py`</summary> <p> ```python 37:from torch.testing._internal.common_dtype import integral_types, get_all_fp_dtypes, get_all_math_dtypes 50: onlyNativeDeviceTypes, deviceCountAtLeast, largeTensorTest, expectedFailureMeta, skipMeta, get_all_device_types, \ 8862: for device in get_all_device_types(): 9629: for dt1 in get_all_math_dtypes(device): 9630: for dt2 in get_all_math_dtypes(device): 9631: for dt3 in get_all_math_dtypes(device): 9648: for input_dtype in get_all_math_dtypes(device): 9664: for input_dtype in get_all_math_dtypes(device): 13015: dtypes(get_all_fp_dtypes(include_bfloat16=AMPERE_OR_ROCM)) 13034: dtypes(get_all_fp_dtypes(include_bfloat16=AMPERE_OR_ROCM)) 13159: dtypes(get_all_fp_dtypes(include_bfloat16=AMPERE_OR_ROCM)) 17400: dtypesIfCUDA(get_all_fp_dtypes(include_bfloat16=AMPERE_OR_ROCM)) 17768: dtypesIfCUDA(get_all_fp_dtypes()) 17773: dtypesIfCUDA(get_all_fp_dtypes()) 17778: dtypesIfCUDA(get_all_fp_dtypes()) 17783: dtypesIfCUDA(get_all_fp_dtypes()) 17788: dtypesIfCUDA(get_all_fp_dtypes()) 17793: dtypesIfCUDA(get_all_fp_dtypes()) 17798: dtypesIfCUDA(get_all_fp_dtypes()) 17963: dtypesIfCUDA(get_all_fp_dtypes()) 17977: dtypesIfCUDA(get_all_fp_dtypes()) 
18684: def test_cross_entropy_loss_prob_target_all_reductions(self, device): ``` </p> </details> <details> <summary> `test/test_numpy_interop.py`</summary> <p> ```python 12:from torch.testing._internal.common_dtype import get_all_dtypes 399: dtypes(get_all_dtypes()) ``` </p> </details> <details> <summary> `test/test_ops.py`</summary> <p> ```python 12:from torch.testing._internal.common_dtype import floating_and_complex_types_and, get_all_dtypes 86: for dtype in get_all_dtypes(): ``` </p> </details> <details> <summary> `test/test_reductions.py`</summary> <p> ```python 16: get_all_dtypes, get_all_math_dtypes, get_all_int_dtypes, get_all_complex_dtypes, get_all_fp_dtypes, 360: allowed_dtypes=get_all_dtypes(include_bfloat16=False)) 366: allowed_dtypes=get_all_dtypes(include_bfloat16=False)) 394: allowed_dtypes=get_all_dtypes(include_bfloat16=False)) 750: for dtype in [dtype for dtype in get_all_math_dtypes('cpu') if dtype != torch.float16]: 1404: dtypes(get_all_dtypes(include_bool=False, include_complex=False)) 1457: dtypes((get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False) + 1458: get_all_complex_dtypes())) 1465: return dtype in get_all_int_dtypes() 1494: dtypes((get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False))) 1501: dtypes((get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False))) 1507: dtypes((get_all_complex_dtypes())) 1514: dtypes = list(get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False)) 1523: dtypes((get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False))) 1531: if dtype in get_all_fp_dtypes(): 1608: dtypes((get_all_dtypes(include_half=True, include_bfloat16=False, 1837: dtypes(get_all_dtypes(include_bool=False, include_complex=False)) 1855: dtypes((set(get_all_dtypes(include_bool=False, include_complex=False)) - {torch.uint8})) 3219: for dtype in get_all_dtypes(include_half=True, include_bfloat16=False, ``` </p> </details> <details> <summary> `test/test_serialization.py`</summary> <p> ```python 
26:from torch.testing._internal.common_dtype import get_all_dtypes 586: for device, dtype in product(devices, get_all_dtypes()): 589: for other_dtype in get_all_dtypes(): ``` </p> </details> <details> <summary> `test/test_shape_ops.py`</summary> <p> ```python 18:from torch.testing._internal.common_dtype import get_all_dtypes 230: dtypes(get_all_dtypes(include_complex=False, include_bool=False, include_half=False, 232: dtypesIfCUDA(get_all_dtypes(include_complex=False, include_bool=False, include_bfloat16=False)) 344: dtypes(get_all_dtypes()) 443: dtypes(get_all_dtypes()) 461: dtypes(get_all_dtypes()) 570: dtypes(get_all_dtypes(include_complex=False)) ``` </p> </details> <details> <summary> `test/test_sort_and_select.py`</summary> <p> ```python 12: all_types, all_types_and, floating_types_and, get_all_dtypes, get_all_int_dtypes, get_all_fp_dtypes, 136: dtypes(set(get_all_dtypes()) - {torch.bool, torch.complex64, torch.complex128}) 231: dtypes(set(get_all_dtypes()) - {torch.bool, torch.complex64, torch.complex128}) 296: dtypes((get_all_int_dtypes() + get_all_fp_dtypes())) 647: dtypesIfCUDA(get_all_fp_dtypes()) 678: dtypesIfCUDA((get_all_dtypes(include_complex=False, 682: dtypes((get_all_dtypes(include_complex=False, include_bool=False, include_half=False, include_bfloat16=False))) 739: dtypesIfCPU(set(get_all_dtypes()) - {torch.complex64, torch.complex128}) 740: dtypes(set(get_all_dtypes()) - {torch.bfloat16, torch.complex64, torch.complex128}) 799: dtypesIfCPU(set(get_all_dtypes()) - {torch.complex64, torch.complex128}) 800: dtypes(set(get_all_dtypes()) - {torch.bfloat16, torch.complex64, torch.complex128}) ``` </p> </details> <details> <summary> `test/test_sparse.py`</summary> <p> ```python 20:from torch.testing import get_all_complex_dtypes, get_all_fp_dtypes 29: floating_and_complex_types, floating_and_complex_types_and, get_all_dtypes, get_all_int_dtypes, 1963: return dtype in get_all_int_dtypes() 1994: dtypes(get_all_dtypes(include_bool=False, 
include_half=False, 2103: return dtype in get_all_int_dtypes() 2138: dtypes(get_all_dtypes(include_bool=False, include_half=False, 2626: all_sparse_dtypes = get_all_dtypes(include_complex=True) 2633: all_sparse_dtypes = get_all_dtypes(include_complex=True) 3230: dtypes(get_all_complex_dtypes(), 3231: get_all_fp_dtypes(include_half=False, include_bfloat16=False)) 3234: get_all_fp_dtypes( ``` </p> </details> <details> <summary> `test/test_sparse_csr.py`</summary> <p> ```python 7:from torch.testing import get_all_complex_dtypes, get_all_fp_dtypes, floating_and_complex_types, make_tensor 17:from torch.testing._internal.common_dtype import floating_types, get_all_dtypes 120: dtypes(get_all_dtypes()) 133: dtypes(get_all_dtypes()) 150: dtypes(get_all_dtypes()) 180: dtypes(get_all_dtypes()) 201: dtypes(get_all_dtypes()) 210: dtypes(get_all_dtypes()) 225: dtypes(get_all_dtypes()) 244: dtypes(get_all_dtypes()) 263: dtypes(get_all_dtypes()) 285: dtypes(get_all_dtypes()) 411: dtypes(get_all_dtypes()) 482: dtypes(get_all_dtypes()) 502: dtypes(get_all_dtypes()) 562: dtypes(get_all_dtypes()) 588: dtypesIfCUDA(get_all_complex_dtypes(), 589: get_all_fp_dtypes(include_half=SM53OrLater, include_bfloat16=SM80OrLater)) 745: dtypesIfCUDA(get_all_complex_dtypes(), 746: get_all_fp_dtypes(include_half=SM53OrLater and TEST_CUSPARSE_GENERIC, 765: dtypesIfCUDA(get_all_complex_dtypes(), 766: get_all_fp_dtypes(include_half=SM53OrLater and TEST_CUSPARSE_GENERIC, 801: torch.testing.get_all_fp_dtypes(include_bfloat16=SM80OrLater, 841: torch.testing.get_all_fp_dtypes(include_bfloat16=SM80OrLater, 1182: dtypes(get_all_dtypes()) 1276: dtypes(get_all_dtypes(include_bool=False, include_half=False, include_bfloat16=False)) 1286: dtypes(get_all_dtypes()) ``` </p> </details> <details> <summary> `test/test_tensor_creation_ops.py`</summary> <p> ```python 21: onlyCUDA, skipCPUIf, dtypesIfCUDA, skipMeta, get_all_device_types) 23: get_all_dtypes, get_all_math_dtypes, get_all_int_dtypes, get_all_fp_dtypes, 
get_all_complex_dtypes 150: for dt in get_all_dtypes(): 160: for dt in get_all_dtypes(): 314: dtypes = [dtype for dtype in get_all_dtypes() if dtype != torch.bfloat16] 1012: dtypes((get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False) + 1013: get_all_complex_dtypes())) 1032: dtypes((get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False) + 1033: get_all_complex_dtypes())) 1050: dtypes((get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False) + 1051: get_all_complex_dtypes())) 1745: dtypes((get_all_int_dtypes() + get_all_fp_dtypes())) 1779: dtypes((get_all_int_dtypes() + get_all_fp_dtypes())) 1868: dtypes((get_all_int_dtypes() + get_all_fp_dtypes())) 1926: dtypes((get_all_int_dtypes() + get_all_fp_dtypes())) 1954: do_test_empty_full(self, get_all_math_dtypes('cpu'), torch.strided, torch_device) 1956: do_test_empty_full(self, get_all_math_dtypes('cpu'), torch.strided, None) 1957: do_test_empty_full(self, get_all_math_dtypes('cpu'), torch.strided, torch_device) 2538: for device in get_all_device_types(): 2645: for dtype in get_all_dtypes(): 2678: dtypes((get_all_fp_dtypes(include_half=False, include_bfloat16=False) + 2679: get_all_complex_dtypes())) 2716: dtypes(get_all_fp_dtypes(include_half=False, include_bfloat16=False)) 2827: for dt in get_all_dtypes(): 2913: dtypes(get_all_dtypes(include_bool=False, include_half=False)) 2914: dtypesIfCUDA(get_all_dtypes(include_bool=False, include_half=True)) 3028: dtypes((get_all_fp_dtypes() + get_all_complex_dtypes())) 3033: dtypes((get_all_fp_dtypes() + get_all_complex_dtypes())) 3074: dtypes(get_all_dtypes(include_bool=False, include_half=False, include_complex=False)) 3075: dtypesIfCUDA(((get_all_int_dtypes() + [torch.float32, torch.float16, torch.bfloat16]) 3077: else get_all_dtypes(include_bool=False, include_half=True, include_complex=False))) 3873: dtypes(get_all_dtypes()) 3884: dtypes(get_all_dtypes(include_bool=False)) 3916: for other in get_all_dtypes(): 3922: dtypes(get_all_dtypes()) 
3932: dtypes(get_all_dtypes(include_bool=False)) 3955: dtypes(get_all_dtypes(include_bool=False)) 3961: dtypes(get_all_dtypes(include_bool=False)) 3965: dtypes(get_all_dtypes()) ``` </p> </details> <details> <summary> `test/test_testing.py`</summary> <p> ```python 25:from torch.testing._internal.common_dtype import get_all_dtypes 31: dtypes((get_all_dtypes(include_half=True, include_bfloat16=False, ``` </p> </details> <details> <summary> `test/test_torch.py`</summary> <p> ```python 51: expectedAlertNondeterministic, get_all_device_types, skipXLA) 57: get_all_fp_dtypes, get_all_int_dtypes, get_all_math_dtypes, get_all_dtypes, get_all_complex_dtypes 296: for d in get_all_device_types(): 323: for device in get_all_device_types(): 324: for dt1 in get_all_dtypes(): 325: for dt2 in get_all_dtypes(): 343: all_dtypes = get_all_dtypes() 350: all_dtypes = get_all_dtypes() 781: for dtype in get_all_dtypes(): 986: for device in get_all_device_types(): 1017: for device in get_all_device_types(): 1018: for dtype in get_all_math_dtypes(device): 2792: for device in get_all_device_types(): 3186: dtypes(get_all_dtypes()) 3195: for error_dtype in get_all_dtypes(): 3203: dtypes(get_all_dtypes()) 3212: for error_dtype in get_all_dtypes(): 4539: dtypes(get_all_fp_dtypes()) 4545: dtypes((get_all_int_dtypes() + get_all_fp_dtypes())) 4577: dtypes(get_all_fp_dtypes(include_half=False, include_bfloat16=False)) 4578: dtypesIfCPU((get_all_fp_dtypes(include_half=False, include_bfloat16=True))) 4579: dtypesIfCUDA((get_all_fp_dtypes(include_bfloat16=False))) 4599: dtypes((get_all_fp_dtypes(include_half=False, include_bfloat16=False))) 4600: dtypesIfCPU((get_all_dtypes(include_half=False, include_bfloat16=False, include_complex=False))) 4601: dtypesIfCUDA((get_all_dtypes(include_bfloat16=False, include_complex=False))) 4613: for p_dtype in get_all_fp_dtypes(include_half=device.startswith('cuda'), include_bfloat16=False): 4628: dtypes((get_all_fp_dtypes(include_half=False, include_bfloat16=False))) 
4629: dtypesIfCUDA((get_all_fp_dtypes(include_bfloat16=False))) 4640: dtypes(get_all_fp_dtypes()) 4723: dtypes(get_all_fp_dtypes()) 4735: dtypes(get_all_fp_dtypes(include_bfloat16=False)) 4736: dtypesIfCUDA(get_all_fp_dtypes()) 4747: dtypes(get_all_fp_dtypes()) 4761: dtypes(get_all_fp_dtypes()) 4771: dtypes(get_all_fp_dtypes()) 4792: dtypes((get_all_int_dtypes() + get_all_fp_dtypes())) 5302: dtypes(get_all_dtypes(include_bfloat16=False)) 5322: dtypes(get_all_dtypes(include_half=False, include_bfloat16=False)) 5323: dtypesIfCPU(get_all_dtypes(include_bfloat16=False)) 5324: dtypesIfCUDA(get_all_dtypes(include_bfloat16=False)) 5591: for dt in get_all_dtypes(): 5611: for dt in get_all_dtypes(): 5678: for dt in get_all_dtypes(): 5696: dtypesIfCUDA(set(get_all_math_dtypes('cuda'))) 5697: dtypes(set(get_all_math_dtypes('cpu'))) 5746: dtypes(get_all_dtypes()) 5780: dtypes(get_all_dtypes()) 5885: dtypes(get_all_dtypes()) 5902: dtypes(get_all_dtypes()) 5945: dtypes(get_all_dtypes()) 5979: dtypes(get_all_dtypes(include_bool=False)) 6049: dtypes(get_all_dtypes(include_bool=False)) 6092: dtypes((get_all_fp_dtypes(include_bfloat16=False, include_half=False) + 6093: get_all_complex_dtypes())) 6094: dtypesIfCPU(get_all_dtypes()) 6095: dtypesIfCUDA(get_all_dtypes()) 6122: dtypes((get_all_fp_dtypes(include_bfloat16=False, include_half=False) + 6123: get_all_complex_dtypes())) 6124: dtypesIfCPU(get_all_dtypes()) 6125: dtypesIfCUDA(get_all_dtypes()) 6163: dtypes((get_all_fp_dtypes(include_bfloat16=False, include_half=False) + 6164: get_all_complex_dtypes())) 6165: dtypesIfCPU(get_all_dtypes()) 6166: dtypesIfCUDA(get_all_dtypes()) 6190: dtypes((get_all_complex_dtypes() + 6191: get_all_int_dtypes())) 6238: dtypes(get_all_dtypes()) 6323: dtypes(get_all_dtypes()) 6389: dtypes(product(get_all_dtypes(), (torch.uint8, torch.bool))) 6699: dtypesIfCUDA(set(get_all_math_dtypes('cuda'))) 6700: dtypes(set(get_all_math_dtypes('cpu'))) 7452: dtypes(get_all_dtypes(include_bool=False)) 7461: 
dtypes(get_all_dtypes(include_bool=False)) 7477: dtypes(get_all_dtypes(include_bool=False)) 7496: dtypes(get_all_dtypes(include_bool=False)) 7538: dtypes(get_all_dtypes(include_bool=False)) 8162: dtypes((get_all_int_dtypes() + get_all_fp_dtypes() + 8163: get_all_complex_dtypes())) 8175: dtypes((get_all_int_dtypes() + get_all_fp_dtypes() + 8176: get_all_complex_dtypes())) ``` </p> </details> <details> <summary> `test/test_type_promotion.py`</summary> <p> ```python 14: get_all_dtypes, get_all_math_dtypes, get_all_int_dtypes, get_all_fp_dtypes 187: for dtype in get_all_dtypes(): 262: dtypes1 = get_all_math_dtypes('cuda') 263: dtypes2 = get_all_math_dtypes(device) 339: dtypes(itertools.product(get_all_dtypes(), get_all_dtypes())) 468: for dt1 in get_all_math_dtypes(device): 469: for dt2 in get_all_math_dtypes(device): 519: for dt1 in get_all_math_dtypes(device): 520: for dt2 in get_all_math_dtypes(device): 528: for dt in get_all_math_dtypes(device): 561: for dtype in get_all_dtypes(): 766: dtypes=get_all_math_dtypes(device)) 771: dtypes=get_all_math_dtypes(device)) 782: dtypes=get_all_math_dtypes(device)) 879: dtypes = get_all_dtypes(include_bfloat16=False) 898: dtypes = get_all_dtypes(include_bfloat16=False, include_bool=False) 965: dtypesIfCUDA(itertools.product(get_all_dtypes(include_bfloat16=False, include_complex=False), 966: get_all_dtypes(include_bfloat16=False, include_complex=False))) 967: dtypes(itertools.product(get_all_dtypes(include_half=False, include_bfloat16=False, 969: get_all_dtypes(include_half=False, include_bfloat16=False, 976: return dtype in get_all_int_dtypes() + [torch.bool] 979: return dtype in get_all_fp_dtypes(include_half=True, include_bfloat16=False) ``` </p> </details> <details> <summary> `test/test_unary_ufuncs.py`</summary> <p> ```python 24: floating_types_and, all_types_and_complex_and, floating_and_complex_types_and, get_all_dtypes, get_all_math_dtypes, 25: get_all_int_dtypes, get_all_fp_dtypes, get_all_complex_dtypes 517: 
dtypes((get_all_int_dtypes() + [torch.bool] + 518: get_all_fp_dtypes(include_bfloat16=False))) 596: dtypes(get_all_fp_dtypes(include_half=True, include_bfloat16=False)) 611: invalid_input_dtypes = get_all_int_dtypes() + \ 612: get_all_complex_dtypes() + \ 619: for dtype in get_all_fp_dtypes(include_half=True, include_bfloat16=False): 1048: dtypes(get_all_math_dtypes('cpu')) 1182: dtypesIfCUDA(get_all_fp_dtypes()) 1190: dtypesIfCUDA(get_all_fp_dtypes()) 1205: dtypesIfCUDA(get_all_fp_dtypes()) 1215: dtypesIfCUDA(get_all_fp_dtypes()) 1307: dtypes((get_all_dtypes(include_bool=False))) 1349: dtypes((get_all_fp_dtypes(include_half=False) + 1350: get_all_complex_dtypes())) 1351: dtypesIfCUDA((get_all_fp_dtypes(include_half=True) + 1352: get_all_complex_dtypes())) ``` </p> </details> <details> <summary> `test/test_view_ops.py`</summary> <p> ```python 19: get_all_dtypes, get_all_int_dtypes, get_all_fp_dtypes, get_all_complex_dtypes 124: dtypes((get_all_int_dtypes() + get_all_fp_dtypes())) 131: dtypes(get_all_dtypes(include_bfloat16=False)) 213: for view_dtype in [get_all_fp_dtypes(), get_all_complex_dtypes()]: 220: dtypes(get_all_dtypes()) 224: for view_dtype in get_all_dtypes(): 305: dtypes(get_all_complex_dtypes(include_complex32=True)) 343: dtypes(get_all_dtypes()) 354: dtypes(get_all_dtypes()) 364: dtypes(get_all_dtypes()) 374: dtypes(get_all_dtypes()) 384: dtypes((get_all_int_dtypes() + get_all_fp_dtypes())) 395: dtypes(get_all_complex_dtypes()) 426: dtypes(get_all_complex_dtypes()) 451: dtypes(product(get_all_complex_dtypes(), get_all_dtypes())) 1263: dtypes((torch.testing.get_all_dtypes())) 1279: dtypes((torch.testing.get_all_dtypes())) 1405: dtypes((get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False) + 1406: get_all_complex_dtypes())) 1471: dtypes(get_all_dtypes(include_bfloat16=False)) 1574: dtypes(get_all_dtypes()) 1601: dtypes(get_all_dtypes(include_bfloat16=False)) 1632: dtypes(*get_all_dtypes(include_bfloat16=False)) 1711: for dt in 
get_all_dtypes(): 1717: for dt in get_all_dtypes(): 1724: for dt in get_all_dtypes(): ``` </p> </details> I'm looking forward to your viewpoints. Thanks :) cc: mruberry kshitij12345 anjali411 Pull Request resolved: https://github.com/pytorch/pytorch/pull/71561 Reviewed By: samdow Differential Revision: D34856571 Pulled By: mruberry fbshipit-source-id: 0dca038bcad5cf69906245c496d2e61ac3876335 (cherry picked from commit b058f67b4313143efa714ab105f36e74083131b9)	2022-03-15 20:31:41 +00:00
Pearu Peterson	a5dcc0c378	Enable test_coalesce_cuda_bfloat16 (#73158 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/73158 Fixes #72893 Test Plan: Imported from OSS Reviewed By: albanD Differential Revision: D34515679 Pulled By: cpuhrsch fbshipit-source-id: 049f8ddf53023b78e1b48e15bbd3cdc58b6bf692 (cherry picked from commit 28a44ca56f66bfaaf14a049856b7d89fec8cd838)	2022-02-28 19:34:20 +00:00
Pearu Peterson	3c932c345b	Fix test_Sparse_to_Sparse_copy__cuda_bfloat16 failure (#73157 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/73157 Fixes #72892 Test Plan: Imported from OSS Reviewed By: george-qi Differential Revision: D34398986 Pulled By: cpuhrsch fbshipit-source-id: 20214be1859354fb18a306e8d1de9852a898c485 (cherry picked from commit c1816ef0cf8834149bebcc11f4402f0eedfae6f7)	2022-02-28 05:33:50 +00:00
Pearu Peterson	16cd6853e1	Fix test_sparse_addmm_...float16 and test_sparse_matmul_...float16 test failures (#73155 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/73155 Fixes #73145 Test Plan: Imported from OSS Reviewed By: mikaylagawarecki Differential Revision: D34398935 Pulled By: cpuhrsch fbshipit-source-id: b1e852f25b0888b37d9c9c1418ddf344ac8f0a04 (cherry picked from commit d63c977fb39c7dcb3f3d083edc4b25cd2d6c2ec4)	2022-02-26 05:30:36 +00:00
Pearu Peterson	4c522643e7	Fix CUDA error when multiplying sparse hybrid tensors with zero dense dimensions (#73428 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/73428 Fixes https://github.com/pytorch/pytorch/issues/73363 Test Plan: Imported from OSS Reviewed By: george-qi Differential Revision: D34478521 Pulled By: cpuhrsch fbshipit-source-id: cbc83f223a14c92ed8b284e5e2a8aab390e2bc5c (cherry picked from commit 9d7ecc848228f9a5b1761f9d3653d3cca49e0244)	2022-02-26 01:08:45 +00:00
Philip Meier	0973c5a1cc	align signature of make_tensor with other creation ops (#72702 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72702 Test Plan: Imported from OSS Reviewed By: mrshenli Differential Revision: D34457729 Pulled By: mruberry fbshipit-source-id: 83d580c4201eef946dc9cf4b9e28a3d36be55609 (cherry picked from commit aa4cf20fbeb4b795595729b8ac2e6ba7707d8283)	2022-02-25 06:30:31 +00:00
Rohan Varma	c3d79ac422	Manual skip sparse tests. Manual skip because the tests were not properly disabled by automation. Differential Revision: [D34456851](https://our.internmc.facebook.com/intern/diff/D34456851/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/73374	2022-02-24 20:26:02 +00:00
Alban Desmaison	49444bb501	Revert D34400588: [pytorch][PR] super setUp call missing in TestSparse Test Plan: revert-hammer Differential Revision: D34400588 (`555b215a90`) Original commit changeset: 40ac1c56918d Original Phabricator Diff: D34400588 (`555b215a90`) fbshipit-source-id: 0375279d06cc7a9d612bd70cc4c042cb3319a5fc (cherry picked from commit 7cd3d2da907e6f0882f56c8843d50586756a2fe6)	2022-02-24 14:34:01 +00:00
Jane Xu	555b215a90	super setUp call missing in TestSparse (#73217 ) Summary: Should fix the fact that the sparse tests are not correctly disabled (see https://github.com/pytorch/pytorch/issues/73145#issuecomment-1046952585). Pull Request resolved: https://github.com/pytorch/pytorch/pull/73217 Reviewed By: atalman Differential Revision: D34400588 Pulled By: janeyx99 fbshipit-source-id: 40ac1c56918d5c47debf962a2bd218a325626ad8 (cherry picked from commit e63dae284ba9056567fcaffc54d1aa38151c0a12)	2022-02-23 19:36:50 +00:00
Nikita Shulga	5dad19fef0	Back out "[pytorch][PR] add BFloat16 sparse operators on CPU: copy, coalesce, sparse_mask, ad…" Summary: Original commit changeset: f1274125234a Original Phabricator Diff: D34343016 (`c6f56599bb`) Test Plan: Abovementioned PR regressed OSS CI Reviewed By: atalman Differential Revision: D34379703 fbshipit-source-id: bc624cfd86249dde2fac635d9b66f08f86b4aed9 (cherry picked from commit e52827f1ae09e0c54fd3c7383b5ed49377b6293c)	2022-02-21 18:31:51 +00:00
Jiayi Sun	c6f56599bb	add BFloat16 sparse operators on CPU: copy, coalesce, sparse_mask, ad… (#72846 ) Summary: …d_out, addmm Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/72846 Reviewed By: mikaylagawarecki Differential Revision: D34343016 Pulled By: cpuhrsch fbshipit-source-id: f1274125234a3bacbb7a38fc642fbf5c9786d435 (cherry picked from commit c819456abf1d27ee09ae7f243222dd7e89cc82b4)	2022-02-19 01:33:51 +00:00
Pearu Peterson	e785c0a1ab	Enable Half/BFloat16 support for to_dense and coalesce methods. (#72397 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72397 Test Plan: Imported from OSS Reviewed By: jbschlosser, zou3519 Differential Revision: D34286114 Pulled By: cpuhrsch fbshipit-source-id: a4f7e2abc3b2d37437cbd09d693c1b409bb011b9 (cherry picked from commit 74f94447fcf12ff7c740e1008c84d0df9ec9e1f5)	2022-02-17 02:54:23 +00:00
Philip Meier	b5f2574f36	no longer coalesce sparse COO tensors before comparison (#69751 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69751 cc nikitaved pearu cpuhrsch IvanYashchuk Test Plan: Imported from OSS Reviewed By: zou3519 Differential Revision: D34262453 Pulled By: ezyang fbshipit-source-id: e2e62d2aa03fc569d2951c880960b256f5dc4aaa (cherry picked from commit cb6b0ef7198c5252c51a8fec1c19e3c17b33cc87)	2022-02-17 02:33:08 +00:00
Brian Hirsh	22ccf448e8	Revert D34034848: free up dispatch key space (in C++) Test Plan: revert-hammer Differential Revision: D34034848 (`6690256021`) Original commit changeset: 9677ee2c0a1a Original Phabricator Diff: D34034848 (`6690256021`) fbshipit-source-id: fd50943d915ef813bb9f9ab278fb582429eea3b1 (cherry picked from commit 3acefee1cdb89bc051d1ef2e9deb5698d2bd85c3)	2022-02-14 23:29:00 +00:00
Brian Hirsh	6690256021	free up dispatch key space (in C++) (#72402 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72402 The original PR had an array-out-of-bounds access in `DispatchKeyExtractor.cpp`, that wasn't caught by ASAN and appeared to only manifest in a subset of android internal tests. After fixing the OOB access (and adding more asserts), I confirmed that the android internal test passes. Reland of D33255193 (`20b8653dfa`) ghstack-source-id: 148830728 Test Plan: Steps to test: (1) connect to a mobile OD (2) run `one_world android emulator android-29` in a terminal to start the android emulator (3) In a separate terminal, run the test: `buck test //fbandroid/instrumentation_tests/com/facebook/pytorch/bi_xray:instrumentation_test -c test.external_runner=tpx -- --regex 'testBIXRayModel.*PyTorchBIXRayInstrumentationTest' --force-remote-execution --run-disabled` I also ran `buck test fbandroid/mode/dbg //fbandroid/instrumentation_tests/com/facebook/pytorch/bi_xray:instrumentation_test`, which failed before and passed after the PR. Reviewed By: albanD Differential Revision: D34034848 fbshipit-source-id: 9677ee2c0a1afd1183896f7055009445712523c5 (cherry picked from commit 9ab9b12d355540ad0923c6869ed088ff6c21490c)	2022-02-14 16:02:29 +00:00
Jacob Szwejbka	791e7df7d9	Back out "free up dispatch key space (in C++)" Summary: I think this diff stack broke all the related tasks below. Test Plan: For our failing tests: buck test //fbandroid/instrumentation_tests/com/facebook/pytorch/bi_xray:instrumentation_test -c test.external_runner=tpx -- --regex 'testBIXRayModel.*PyTorchBIXRayInstrumentationTest' --force-remote-execution --run-disabled For the ubn: Not really sure what to do, trying to build the app and see if I can use an effect? Reviewed By: shoumikhin Differential Revision: D34018849 fbshipit-source-id: 3571718cb6621931af931b494e0a70d6e0164e65 (cherry picked from commit 3cc63cb2ea2664dd1063b190614f2034cce5f2d0)	2022-02-05 01:25:42 +00:00
Brian Hirsh	20b8653dfa	free up dispatch key space (in C++) (#69633 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69633 Test Plan: Imported from OSS Reviewed By: albanD Differential Revision: D33255193 Pulled By: bdhirsh fbshipit-source-id: 79773e9c15bf4f2f27675121a49ff5ffd1375238 (cherry picked from commit eac0b1300569e035f3de28a1f0fdce03f60bd270)	2022-02-04 17:57:38 +00:00
Pearu Peterson	214f4bf2ff	Support sparse.sum on empty sparse tensor (#71091 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71091 Fixes https://github.com/pytorch/pytorch/issues/65394 The masked sum on a full input tensor (of any layout) with an all-true mask is the same as the sum on the strided input tensor (after applying `to_dense` to sparse inputs). Since masked sum uses `torch.sparse.sum` then, for the simplicity of masked reductions implementations, its reduction behavior ought to be defined by the behavior of the `torch.sum`. This PR implements the behavioral connection with respect to the directional summation of empty sparse tensors that correspond to all-zero strided tensors. cc nikitaved pearu cpuhrsch Test Plan: Imported from OSS Reviewed By: davidberard98 Differential Revision: D33651750 Pulled By: cpuhrsch fbshipit-source-id: 703891bff88c8da6270b4272f5d2da81688db67d (cherry picked from commit 53f97e80f7520594e9977ad61a1a727dadade645)	2022-01-19 18:58:08 +00:00
Pearu Peterson	677fab6d1d	Support broadcast_to on sparse COO tensors (#71073 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71073 cc nikitaved pearu cpuhrsch Test Plan: Imported from OSS Reviewed By: mikaylagawarecki Differential Revision: D33645744 Pulled By: cpuhrsch fbshipit-source-id: 4775c9636c4e868022a8c1bbfec93e351d1cf885 (cherry picked from commit 640f21e09a935a1231b99ddd6472b03158bdc283)	2022-01-19 04:33:41 +00:00
Pearu Peterson	e7602a1e30	Fix multiplication of 0-D sparse tensors (#70749 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70749 Fixes https://github.com/pytorch/pytorch/issues/65396 and a clang-tidy error. cc nikitaved pearu cpuhrsch Test Plan: Imported from OSS Reviewed By: jbschlosser Differential Revision: D33439136 Pulled By: cpuhrsch fbshipit-source-id: 45ec58de7c18db183f891431d4a26e98fd0e924a	2022-01-06 13:36:46 -08:00
Peter Bell	6de9f0fc94	OpInfo: Allow sample_inputs_func to be any iterable (#69256 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69256 Closes #52486 Test Plan: Imported from OSS Reviewed By: mrshenli Differential Revision: D32942008 Pulled By: mruberry fbshipit-source-id: f5b01b0298c0160b0bec6e86e2b6db8cfe746206	2021-12-09 08:37:26 -08:00
Peter Bell	1da1707568	Sparse: Implement simple unary ufuncs operators (#68887 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/68887 Closes #46988, closes #46987, closes #46761 By "simple" I mean operators that map 0->0 so we can implement it by just re-dispatching on the values tensor. That does mean we have `sin` but not `cos` for example, but without fill value support this is the best that can be done. Most of these don't support autograd because the derivative formulas use unsupported operators. cc nikitaved pearu cpuhrsch IvanYashchuk Test Plan: Imported from OSS Reviewed By: jbschlosser Differential Revision: D32734911 Pulled By: cpuhrsch fbshipit-source-id: 203ab105799f3d2d682b01ca3d6b18e7c994776a	2021-12-01 05:43:19 -08:00
Eli Uriegas	251686fc4c	Revert D32706197: Sparse: Implement simple unary ufuncs operators Test Plan: revert-hammer Differential Revision: D32706197 (`fbaa19a6fa`) Original commit changeset: 65e1acb36457 fbshipit-source-id: 45c4b486f9eee200d5a1f6d46d267617124f8a5e	2021-11-30 10:50:12 -08:00
Peter Bell	fbaa19a6fa	Sparse: Implement simple unary ufuncs operators (#68887 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/68887 Closes #46988, closes #46987, closes #46761 By "simple" I mean operators that map 0->0 so we can implement it by just re-dispatching on the values tensor. That does mean we have `sin` but not `cos` for example, but without fill value support this is the best that can be done. Most of these don't support autograd because the derivative formulas use unsupported operators. cc nikitaved pearu cpuhrsch IvanYashchuk Test Plan: Imported from OSS Reviewed By: jbschlosser Differential Revision: D32706197 Pulled By: cpuhrsch fbshipit-source-id: 65e1acb3645737ca7bdb7f2db739d8e118906f4b	2021-11-30 00:30:30 -08:00
Peter Bell	f5fa91ba2e	Sparse: Add additional opinfo tests (#68886 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/68886 cc nikitaved pearu cpuhrsch IvanYashchuk Test Plan: Imported from OSS Reviewed By: jbschlosser Differential Revision: D32697933 Pulled By: cpuhrsch fbshipit-source-id: fffdd1bc663cc1bc49abe8cf3680982d1cb497bc	2021-11-29 12:49:20 -08:00
Vinnam Kim	f89572f417	Add feature: zeros_like() from a dense tensor to a sparse tensor (#68108 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/67904. - Create a sparse tensor when the sparse layout is given even if the input tensor is not sparse. cc nikitaved pearu cpuhrsch IvanYashchuk Pull Request resolved: https://github.com/pytorch/pytorch/pull/68108 Reviewed By: anjali411 Differential Revision: D32316269 Pulled By: cpuhrsch fbshipit-source-id: 923dbd4dc7c74f51f7cdbafb2375a30271a6a886	2021-11-11 08:54:15 -08:00
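
Note on `1da1707568` / `fbaa19a6fa` (Sparse: Implement simple unary ufuncs operators): the message describes "simple" operators as those that map 0->0, so they can be implemented by re-dispatching on the values tensor. The following is a minimal pure-Python sketch of that idea, not the PyTorch implementation. The `CooSketch` class and `apply_zero_preserving` helper are hypothetical names introduced only for illustration, and the sketch assumes a coalesced tensor (no duplicate indices).

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class CooSketch:
    """Toy 2-D COO container (hypothetical, not a PyTorch type)."""
    indices: List[Tuple[int, int]]  # (row, col) of each stored element
    values: List[float]             # stored value for each index
    shape: Tuple[int, int]

    def to_dense(self) -> List[List[float]]:
        # Unstored positions are implicitly zero.
        dense = [[0.0] * self.shape[1] for _ in range(self.shape[0])]
        for (r, c), v in zip(self.indices, self.values):
            dense[r][c] = v
        return dense


def apply_zero_preserving(fn: Callable[[float], float], t: CooSketch) -> CooSketch:
    """Apply fn to the stored values only, keeping the sparsity pattern."""
    # Only valid when fn(0) == 0: implicit zeros must stay zero.
    if fn(0.0) != 0.0:
        raise ValueError("fn does not map 0 to 0; result would not be sparse")
    # Assumption: the input is coalesced, so each index appears once.
    if len(set(t.indices)) != len(t.indices):
        raise ValueError("duplicate indices; coalesce first")
    return CooSketch(list(t.indices), [fn(v) for v in t.values], t.shape)


t = CooSketch(indices=[(0, 1), (2, 0)], values=[math.pi / 2, 1.0], shape=(3, 2))
s = apply_zero_preserving(math.sin, t)  # sin(0) == 0, so this is allowed
```

Under the same assumptions, `apply_zero_preserving(math.cos, t)` raises `ValueError`, because `cos(0) == 1` would turn every implicit zero into a stored value. This matches the message's remark that `sin` is available but `cos` is not without fill-value support.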

265 Commits