Add deterministic path for CUDA cumsum (#136224)

Change `cumsum` to call its decomposition when `torch.use_deterministic_algorithms(True)` is set and the input is a CUDA tensor.
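
A minimal usage sketch of the behavior described above, assuming a PyTorch build that includes this change. The tensor size, the CPU fallback, and the variable names are illustrative choices, not part of the PR; on a machine without a GPU the snippet only exercises the CPU path.

```python
import torch

# Opt in to deterministic implementations globally.
torch.use_deterministic_algorithms(True)

# Illustrative fallback: use the GPU if one is present, otherwise the CPU,
# so the sketch still runs everywhere.
device = "cuda" if torch.cuda.is_available() else "cpu"

x = torch.randn(1000, device=device)

# With the flag set, repeated calls on the same input are expected to
# produce bit-identical results.
first = torch.cumsum(x, dim=0)
second = torch.cumsum(x, dim=0)
deterministic_match = torch.equal(first, second)
print(deterministic_match)

# Restore the default so later code is unaffected.
torch.use_deterministic_algorithms(False)
```

On builds that predate this change, the CUDA call may raise a `RuntimeError` in deterministic mode instead of returning a result, so the check is only meaningful with the change applied.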

Fixes #89492

Pull Request resolved: https://github.com/pytorch/pytorch/pull/136224
Approved by: https://github.com/ezyang, https://github.com/justinchuby
Kurt Mohler
2024-09-24 21:34:43 +00:00
committed by PyTorch MergeBot
parent 0133fbcfe7
commit b1a02bf708
9 changed files with 108 additions and 45 deletions

@@ -3317,38 +3317,6 @@ Example::
""".format(**reduceops_common_args),
)
add_docstr(
torch.cumsum,
r"""
cumsum(input, dim, *, dtype=None, out=None) -> Tensor
Returns the cumulative sum of elements of :attr:`input` in the dimension
:attr:`dim`.
For example, if :attr:`input` is a vector of size N, the result will also be
a vector of size N, with elements.
.. math::
y_i = x_1 + x_2 + x_3 + \dots + x_i
Args:
{input}
dim (int): the dimension to do the operation over
Keyword args:
{dtype}
{out}
Example::
>>> a = torch.randint(1, 20, (10,))
>>> a
tensor([13, 7, 3, 10, 13, 3, 15, 10, 9, 10])
>>> torch.cumsum(a, dim=0)
tensor([13, 20, 23, 33, 46, 49, 64, 74, 83, 93])
""".format(**reduceops_common_args),
)
add_docstr(
torch.count_nonzero,
r"""