6 Commits

d8c8ba2440 Fix unused Python variables in test/[e-z]* (#136964)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136964
Approved by: https://github.com/justinchuby, https://github.com/albanD
2024-12-18 23:02:30 +00:00
0711bff9aa [ROCm] add skipCUDAIfVersionLessThan to unskip test_jiterator for ROCm (#99197)
This unskips 121 tests that the decorator `@skipCUDAIf(_get_torch_cuda_version() < (11, 6))` was unintentionally skipping for ROCm: on ROCm builds the helper reports a zero CUDA version, so the bare comparison is always true and the skip fires even though ROCm could run the tests. Other decorators such as `skipCUDAVersionIn` only activate for the CUDA device, not for the CPU or ROCm-as-CUDA.
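A minimal sketch of the intended gating, not this PR's actual implementation (the decorator name is taken from the title; `torch.version.cuda` is `None` on ROCm builds):

```
import unittest
import torch

def skipCUDAIfVersionLessThan(min_version):
    # Skip only on genuine CUDA builds older than min_version; on ROCm or
    # CPU builds torch.version.cuda is None, so the test stays enabled.
    cuda = torch.version.cuda
    if cuda is not None:
        major, minor = (int(p) for p in cuda.split(".")[:2])
        if (major, minor) < min_version:
            return unittest.skip(f"requires CUDA >= {min_version}")
    return lambda fn: fn
```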

Pull Request resolved: https://github.com/pytorch/pytorch/pull/99197
Approved by: https://github.com/ngimel
2023-04-17 16:05:16 +00:00
eea0733045 Reduce pytest blocklist (#96016)
`TestCase = object` or variations of it get switched to `TestCase = NoTest`.

unittest collects tests based on subclassing unittest.TestCase, so setting TestCase = object removes a class from unittest's collection. pytest collects based on name (https://docs.pytest.org/en/7.1.x/reference/reference.html#confval-python_classes) but can be told to ignore a class (see the bottom of https://docs.pytest.org/en/7.1.x/example/pythoncollection.html#changing-naming-conventions), which is what NoTest does; a sketch follows.
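A minimal sketch of the pattern, assuming NoTest uses pytest's standard `__test__ = False` opt-out (the exact PyTorch helper may differ):

```
import unittest
import torch

class NoTest:
    # Not a unittest.TestCase subclass, so unittest's collector ignores it;
    # __test__ = False opts out of pytest's name-based collection as well.
    __test__ = False

# Typical use: alias the base class away when the suite cannot run here.
TestCase = unittest.TestCase if torch.cuda.is_available() else NoTest
```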
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96016
Approved by: https://github.com/ZainRizvi, https://github.com/huydhn
2023-03-07 18:30:27 +00:00
ff5fe9e622 [ROCm] enable jiterator (#77982)
### Description
Enables jiterator for ROCm builds. This includes the porting needed where hiprtc and nvrtc behavior differ, and it also ports the ROCm-versus-CUDA differences w.r.t. MAX_DIMS and NUM_THREADS from the non-jiterator code paths into jiterator (roughly as sketched below).
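A rough illustration of the NUM_THREADS divergence; the values are assumed from the CUDA warp and ROCm wavefront widths, not taken from this PR:

```
import torch

# Illustrative only: CUDA warps are 32 lanes wide, ROCm wavefronts are 64,
# so a warp-size-based launch heuristic yields different block sizes per
# backend; MAX_DIMS diverges between the backends in a similar fashion.
WARP_SIZE = 64 if torch.version.hip is not None else 32
NUM_THREADS = WARP_SIZE * 4  # e.g. 256 on ROCm vs. 128 on CUDA
```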

### Testing
CI with the ciflow/trunk label to force running the ROCm workflows that are currently trunk-only.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77982
Approved by: https://github.com/ngimel
2022-08-15 16:04:09 +00:00
6db8440f35 Python Jiterator supports multiple outputs (#78139)
This PR is part 3 of the series.
Part 1: https://github.com/pytorch/pytorch/pull/77902
Part 2: https://github.com/pytorch/pytorch/pull/77921

The Python Jiterator now supports returning multiple outputs:

```
import torch

fn = torch.cuda.jiterator._create_multi_output_jit_fn(
    """
    template <typename T>
    T binary_2outputs(T i0, T i1, T& out0, T& out1) {
        out0 = i0 + i1;
        out1 = i0 - i1;
    }
    """,
    num_outputs=2)

x = torch.rand(3, device='cuda')
y = torch.rand(3, device='cuda')
out0, out1 = fn(x, y)

# outputs are written through the reference parameters in the kernel
assert torch.allclose(out0, x + y)
assert torch.allclose(out1, x - y)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78139
Approved by: https://github.com/ngimel
2022-05-24 21:52:56 +00:00
8b6a78f39f Python Interface for Jiterator
This PR allows users to author a CUDA kernel in Python.

```
import torch
from torch.cuda.jiterator import create_jit_fn

code_string = "template <typename T> T my_kernel(T x, T y, T alpha) { return -x * y + x - y + alpha; }"
# alpha is pre-declared with a default here and can be overridden per call
jitted_fn = create_jit_fn(code_string, alpha=0)

a = torch.rand(3, device='cuda')
b = torch.rand(3, device='cuda')
result = jitted_fn(a, b, alpha=1.0)
```

Limitations:
- Only supports elementwise kernels
- 1~8 tensor inputs (zero-input kernels, e.g. factory methods, are not supported)
- Input tensors must live on a CUDA device
- CPU Scalars are not supported
- kwargs must be pre-declared when calling create_jit_fn (see the sketch after this list)
- kwargs must be convertible to at::Scalar, i.e. one of float64, int64_t, bool (complex is not supported for now)
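A short sketch of the kwargs contract; note that in released PyTorch this entry point landed as `torch.cuda.jiterator._create_jit_fn`, and the scale kernel below is illustrative:

```
import torch
from torch.cuda.jiterator import _create_jit_fn

# Every kwarg used at call time must be pre-declared with a default here;
# passing an undeclared kwarg at call time is rejected.
scale = _create_jit_fn(
    "template <typename T> T scale_op(T x, T alpha) { return x * alpha; }",
    alpha=1.0)

t = torch.arange(4, device='cuda', dtype=torch.float32)
print(scale(t))             # falls back to the declared default alpha=1.0
print(scale(t, alpha=2.0))  # per-call override
```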

TODOs:
- [x] consolidate union and c10::variant implementation
- [x] plug into existing op testing framework
- [ ] rename files, place files in the right folder
- [ ] place util functions in the right file
- [x] enforce assumptions in python interface e.g. <8 inputs, kwargs types
- [x] Add user-facing documentation
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76394
Approved by: https://github.com/mruberry
2022-05-06 18:44:28 +00:00