38 Commits

Author SHA1 Message Date
e925dfcc6b Enable all SIM rules except disabled ones (#164645)
`SIM` rules are useful for simplifying boolean expressions and enhancing code readability.
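For illustration (not code from the PR), this is the kind of pattern these rules flag — e.g. ruff's SIM103 (needless bool) rewrites an `if`/`else` that only returns `True`/`False`:

```
# Before: flagged by ruff's SIM103 (needless bool)
def is_large_verbose(n):
    if n > 100:
        return True
    else:
        return False

# After: return the condition directly
def is_large(n):
    return n > 100
```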

Pull Request resolved: https://github.com/pytorch/pytorch/pull/164645
Approved by: https://github.com/ezyang, https://github.com/mlazos
2025-10-17 07:27:11 +00:00
5d7360bb03 Revert "Enable all SIM rules except disabled ones (#164645)"
This reverts commit 321e6026925f6b6e8a36e3a8b7c0295cd7541911.

Reverted https://github.com/pytorch/pytorch/pull/164645 on behalf of https://github.com/izaitsevfb due to causes lint failures ([comment](https://github.com/pytorch/pytorch/pull/164645#issuecomment-3369274351))
2025-10-05 19:32:21 +00:00
321e602692 Enable all SIM rules except disabled ones (#164645)
`SIM` rules are useful for simplifying boolean expressions and enhancing code readability.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/164645
Approved by: https://github.com/ezyang
2025-10-05 07:38:25 +00:00
c0142f5c06 [ROCm] Enabling several UTs (#161715)
All these UTs are working as is; this change just removes the skips:
- test_p2p_ipc
- test_repros.py: working, added fp8 support
- test_activation_checkpointing.py
- test_content_store.py
- test_cuda_multigpu.py
- test_compute_comm_reordering.py
- test_segment_reductions.py
- test_dataloader.py
- test_math_ops.py
- test_loop_ordering.py
- test_control_flow.py
- distributed_test.py
- test_mem_tracker.py
- test_fsdp_optim_state.py
- test_fully_shard_mixed_precision.py: skipped for < ROCm7.0
- test_aot_inductor_custom_ops.py
- test_c10d_ops_nccl.py
- test_eager_transforms.py
- test_sparse_csr.py
- test_inductor_collectives.py
- test_fake_tensor.py
- test_cupy_as_tensor.py
- test_cuda.py: enable UTs that are working
- test_matmul_cuda.py: enable UTs that are working


Pull Request resolved: https://github.com/pytorch/pytorch/pull/161715
Approved by: https://github.com/msaroufim

Co-authored-by: Mark Saroufim <marksaroufim@fb.com>
2025-09-09 15:49:21 +00:00
8235c4f65d Revert "[ROCm] Enabling several UTs (#161715)"
This reverts commit b9ba612f7a968f7b27e121ca8f4d0a4d954f5354.

Reverted https://github.com/pytorch/pytorch/pull/161715 on behalf of https://github.com/jeanschmidt due to Need to revert in order to revert https://github.com/pytorch/pytorch/pull/159473, feel free to merge it back once conflicts are cleared ([comment](https://github.com/pytorch/pytorch/pull/161715#issuecomment-3264040604))
2025-09-07 21:03:17 +00:00
b9ba612f7a [ROCm] Enabling several UTs (#161715)
All these UTs are working as is; this change just removes the skips:
- test_p2p_ipc
- test_repros.py: working, added fp8 support
- test_activation_checkpointing.py
- test_content_store.py
- test_cuda_multigpu.py
- test_compute_comm_reordering.py
- test_segment_reductions.py
- test_dataloader.py
- test_math_ops.py
- test_loop_ordering.py
- test_control_flow.py
- distributed_test.py
- test_mem_tracker.py
- test_fsdp_optim_state.py
- test_fully_shard_mixed_precision.py: skipped for < ROCm7.0
- test_aot_inductor_custom_ops.py
- test_c10d_ops_nccl.py
- test_eager_transforms.py
- test_sparse_csr.py
- test_inductor_collectives.py
- test_fake_tensor.py
- test_cupy_as_tensor.py
- test_cuda.py: enable UTs that are working
- test_matmul_cuda.py: enable UTs that are working


Pull Request resolved: https://github.com/pytorch/pytorch/pull/161715
Approved by: https://github.com/pruthvistony, https://github.com/jeffdaily
2025-09-04 20:43:03 +00:00
fc0376e8b1 [BE][2/6] fix typos in test/ (test/test_*.py) (#157636)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/157636
Approved by: https://github.com/yewentao256, https://github.com/mlazos
ghstack dependencies: #156311, #156609
2025-07-09 11:02:23 +00:00
d8c8ba2440 Fix unused Python variables in test/[e-z]* (#136964)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136964
Approved by: https://github.com/justinchuby, https://github.com/albanD
2024-12-18 23:02:30 +00:00
cb71bcc542 Replace clone.detach with detach.clone (#140264)
Fixes #64532

As stated in the issue, replace `clone.detach` with `detach.clone`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140264
Approved by: https://github.com/soulitzer
2024-11-13 07:01:02 +00:00
73e1455327 [BE] Enable ruff's UP rules and autoformat test/ (#105434)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105434
Approved by: https://github.com/albanD
2023-07-19 20:36:06 +00:00
ddd7da7546 Enable more tests (#104437)
Remove `test_segment_reductions` from the list of blocklisted tests. Remove the `@onlyCPU` qualifier from test_segment_reductions as it has CUDA-specific parts.

Fixes https://github.com/pytorch/pytorch/issues/104410

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104437
Approved by: https://github.com/atalman, https://github.com/huydhn
2023-06-30 16:26:11 +00:00
193adde5f4 Fix UnboundLocalError in test_segment_reductions.py (#104353)
Summary:
Fixes:
```
UnboundLocalError: local variable 'expected_result' referenced before assignment
```
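The error occurs when a variable is assigned only on some branches and then read unconditionally. A minimal pure-Python sketch of the failure mode and the fix (function names and branches are hypothetical, not the actual test code):

```
def reduce_broken(values, mode):
    if mode == "sum":
        expected_result = sum(values)
    # For any other mode, expected_result was never assigned:
    return expected_result  # raises UnboundLocalError

def reduce_fixed(values, mode):
    expected_result = None  # initialize before the branches
    if mode == "sum":
        expected_result = sum(values)
    elif mode == "max":
        expected_result = max(values)
    return expected_result
```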

Test Plan: Sandcastle

Differential Revision: D47092967

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104353
Approved by: https://github.com/malfet
2023-06-29 16:29:34 +00:00
efed5a4969 Allow data size equal to 0 for SegmentReduce (#99733)
Summary:
Support the special case where data size is 0 for SegmentReduce.

Example code below:
```
x = torch.ones((0, 6)).cuda()
lengths = torch.tensor([0, 0]).cuda()
torch.segment_reduce(x, "sum", lengths=lengths, unsafe=False, initial=0)
```
Previously, this raised the error: `Expected data.numel() > 0 to be true, but got false.`
Now it is expected to return 0.
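A pure-Python sketch of the intended lengths-based semantics, where each zero-length segment contributes the `initial` value (a hypothetical helper for intuition, not the aten implementation):

```
def segment_sum(data, lengths, initial=0):
    # Each run of lengths[i] consecutive elements reduces to its sum;
    # an empty segment (length 0) reduces to `initial`.
    out, start = [], 0
    for n in lengths:
        out.append(sum(data[start:start + n], initial))
        start += n
    return out
```

With no data at all, `segment_sum([], [0, 0])` yields `[0, 0]`, matching the new behavior.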

Test Plan: contbuild & OSS CI

Differential Revision: D45133827

Pull Request resolved: https://github.com/pytorch/pytorch/pull/99733
Approved by: https://github.com/ngimel
2023-04-23 01:59:45 +00:00
496c0a207b Make segment_reduce properly private. (#93166)
I am attempting not to change the aten function, to reduce the number of BC issues on the TorchScript side.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/93166
Approved by: https://github.com/ngimel
2023-02-06 18:32:23 +00:00
661800a2cf Fix BC-breaking change introduced by #91499 (#93091)
This fixes BC-breaking changes introduced by https://github.com/pytorch/pytorch/pull/91499
Make the enum accept both `min` and `amin` values.
Reinstate testing.

To reiterate
454361435c/torch/masked/_ops.py (L786)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/93091
Approved by: https://github.com/ngimel
2023-01-27 03:58:35 +00:00
eb7b89771e unify reduction types from different operators: scatter, scatter_reduce, segment_reduce (#91499)
The target of this PR is to unify `ReductionType` for reduce operators so that we have the same set of reduce utils for `init` or `update` for vectorization.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91499
Approved by: https://github.com/ngimel
2023-01-13 04:32:34 +00:00
7360b53ff3 reland Add offsets-based reduction to segment_reduce (CPU, CUDA)
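Offsets-based reduction describes segments by cumulative start positions rather than per-segment lengths; a hedged sketch of the lengths-to-offsets relationship (helper name is hypothetical, not part of the aten API):

```
def lengths_to_offsets(lengths):
    # offsets[i] is the start index of segment i;
    # offsets[-1] is the total number of elements.
    offsets = [0]
    for n in lengths:
        offsets.append(offsets[-1] + n)
    return offsets
```

For example, lengths `[2, 3, 0]` correspond to offsets `[0, 2, 5, 5]`.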
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79725

Approved by: https://github.com/george-qi
2022-06-17 15:49:31 +00:00
3b194fd532 Revert "Add offsets-based reduction to segment_reduce (CPU, CUDA)"
This reverts commit 1ec30a6647be35d123a741a39cab8b4253c1cbe0.

Reverted https://github.com/pytorch/pytorch/pull/78907 on behalf of https://github.com/osalpekar due to Caused Typecasting errors in PT Distributed and fx2trt builds internally
2022-06-13 22:37:25 +00:00
1ec30a6647 Add offsets-based reduction to segment_reduce (CPU, CUDA)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78907

Approved by: https://github.com/cpuhrsch
2022-06-11 17:43:42 +00:00
e727539c29 Support multi-dimensional lengths in segment_reduce to support pytorch_scatter.segment_* functionalities (CUDA)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77061

Approved by: https://github.com/cpuhrsch
2022-06-11 01:45:22 +00:00
87a5ecced2 Revert "Support multi-dimensional lengths in segment_reduce to support pytorch_scatter.segment_* functionalities (CUDA)"
This reverts commit 40f7ef1f3db9717d8149a0bd1e8b8c80c8600753.

Reverted https://github.com/pytorch/pytorch/pull/77061 on behalf of https://github.com/janeyx99 due to Broke segment_reduce tests on trunk, e.g., 40f7ef1f3d
2022-06-10 01:57:34 +00:00
40f7ef1f3d Support multi-dimensional lengths in segment_reduce to support pytorch_scatter.segment_* functionalities (CUDA)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77061

Approved by: https://github.com/cpuhrsch
2022-06-10 00:49:37 +00:00
e289a18e78 Support multi-dimensional lengths in segment_reduce to support pytorch_scatter.segment_* functionalities (CPU-only)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77058

Approved by: https://github.com/cpuhrsch
2022-06-09 19:27:29 +00:00
814ff74460 Add prod reduce option to segment_reduce + opinfo
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76067

Approved by: https://github.com/cpuhrsch
2022-06-07 17:06:07 +00:00
6259601c8a Set test owners for tests with unknown owners (#67552)
Summary:
Action following https://github.com/pytorch/pytorch/issues/66232

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67552

Reviewed By: jbschlosser

Differential Revision: D32028248

Pulled By: janeyx99

fbshipit-source-id: a006f7026288b7126dba58b31cac28e10ce0fed6
2021-10-29 12:42:01 -07:00
f84a441718 [torch][segment_reduce] Update default values when initial value is not set (#61266)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61266

Same as title.
This mainly concludes the initially planned features for the op. The only missing functionality is reduction along an arbitrary axis (currently only axis 0 is supported).

Test Plan: Updated unit test.

Reviewed By: ngimel

Differential Revision: D29552037

fbshipit-source-id: 023c7cbf750a0671f76082708f14c05739dda07a
2021-07-07 13:34:10 -07:00
a78ad5dc4c [torch][segment_reduce] Add support for int lengths as well (#61141)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61141

Currently only long is supported. This diff adds support for other index types.

Next Steps:
- Update default, refactor unit test and test non_initial value as well
- Cleanup (more tests, benchmark, update documentation)

Test Plan: updated unit test. rely on CI.

Reviewed By: ngimel

Differential Revision: D29526308

fbshipit-source-id: b4043603483851ef7e0e93b0bb02ac7849c6449d
2021-07-07 13:34:09 -07:00
af66824c1f [torch][segment_reduce] Add support for sum and min reductions (#60379)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60379

This concludes the support for all reductions types initially planned (min, max, mean, sum).

Next Steps:
- Cleanups
  - update default values when length is 0 and initial is not given
  - templatize the code to avoid branching on every item (and other known improvements)
- more unit tests, verification
- benchmarking

Test Plan: updated unit tests.

Reviewed By: ngimel

Differential Revision: D29268218

fbshipit-source-id: c77d91671e01dcf96c18c758fa3ea522b2e13db9
2021-06-23 18:51:44 -07:00
6af5d00e4b [torch][segment_reduce] Add support for multi-dimensional input (cuda) (#60018)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60018

Same as title. This diff finishes cuda support for currently implemented reductions and input parameters.

Next Steps:
- Add support for sum/min
- More testing and benchmarking
- Cleanup
    - Update default values when length is 0
    - Use TensorIterator
    - Update documentation

Test Plan: Unit test to cover cuda forward path.

Reviewed By: ngimel

Differential Revision: D29135373

fbshipit-source-id: d070727eeb660f56782e7ac8a5b0798be688480a
2021-06-17 16:30:30 -07:00
a727f655c8 [torch][segment_reduce] Support for multi dimension (cpu only) (#59951)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59951

Add support for multi-d input for cpu forward/backward implementation.

Next step: Adding cuda support for multi-d input.

Test Plan: Added unit tests.

Reviewed By: ngimel

Differential Revision: D29105457

fbshipit-source-id: a389ba4cc10f02434a336b8e7d36259f32552e11
2021-06-17 16:29:14 -07:00
f9445c8a6b [torch][segment_reduce] Add cuda support for mean reduction (#59543)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59543

Building on top of previous PR: https://github.com/pytorch/pytorch/pull/59521

This diff adds support for mean reduction for CUDA (forward only currently).
The CUDA backward implementation will be added in a subsequent PR.
Next Steps:
- cuda backward support for mean
- 2d data input support
- more testing
- benchmarking

Test Plan: update unit test to cover this part as well.

Reviewed By: ngimel

Differential Revision: D28922838

fbshipit-source-id: 72b7e5e79db967116b96ad010f290c9f057232d4
2021-06-15 07:00:45 -07:00
ac6b5beade [torch][segment_reduce] Add support for mean reduction (cpu) (#59521)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59521

This diff adds support for mean reduction for CPU (forward + backward).

The CUDA implementation will come in a subsequent PR. We are using `cub::DeviceSegmentedReduce` for the other aggregations; we are investigating how to support mean with it, or we will write a custom kernel.

Next Steps:
- cuda support for mean
- 2d data input support
- more testing
- benchmarking

Test Plan: updated unit test. Still relying on manual data for ease of debugging. Will add more tests that cover edge cases once major features are complete.

Reviewed By: ngimel

Differential Revision: D28922547

fbshipit-source-id: 2fad53bbad2cce714808ff95759cbdbd45bb4ce6
2021-06-10 14:21:31 -07:00
20eac093a7 [torch][segment_reduce] Add support for initial value (#56923)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56923

Next Steps in order:
- Add backward support for CUDA
- Add support for more aggregation types
- Benchmarking (for cuda mainly)/more testing/documentation
- Support for multi dimension

Test Plan: Updated unit test to include 0 length segment as well.

Reviewed By: ngimel

Differential Revision: D27992228

fbshipit-source-id: 28851811f8a784a63162721c511d69e617a93727
2021-04-30 18:01:31 -07:00
e27740b38e [torch] Add backward support for segment reduce (CPU only)
Summary:
This is to set up boilerplate code for backward and the CPU implementation.

Next Steps in order:
- Add backward support for CUDA
- Add support for more aggregation types
- Benchmarking (for cuda mainly)/more testing/documentation
- Support for multi dimension

Test Plan:
Updated unit test to also check correctness of backward.

Wait for CI signal

Reviewed By: ngimel

Differential Revision: D27970340

fbshipit-source-id: 3e608c7fe3628b0a761dd8affc6aad8f65a6ef7f
2021-04-29 15:41:37 -07:00
6c37788cb1 [torch] Add cuda support for segment reduction 'max' (#56704)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56704

This is a resubmit of PR: https://github.com/pytorch/pytorch/pull/54175

Main changes compared to original PR:
- Switch to importing "<ATen/cuda/cub.cuh>"
- Use CUB_WRAPPER to reduce boilerplate code.

Test Plan:
Will check CI status.

Added unit test

Reviewed By: ngimel

Differential Revision: D27941257

fbshipit-source-id: 24a0e0c7f6c46126d2606fe42ed03dca15684415
2021-04-27 11:29:03 -07:00
364639041f Revert D27121170: [torch] Add cuda support for segment reduction 'max'
Test Plan: revert-hammer

Differential Revision:
D27121170 (eb5e1fc713)

Original commit changeset: 1c2565f42e29

fbshipit-source-id: 3dd394edcf5ef53c27098b4d0a1dd6fbbabdd506
2021-04-08 15:30:58 -07:00
eb5e1fc713 [torch] Add cuda support for segment reduction 'max' (#54175)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54175

Building on top of previous PR. This PR adds cuda support for 1D max reduction.

Next steps:
- Add support for other major reduction types (e.g. min, sum) for 1D tensor
- Documentation for the op
- Perf optimizations and benchmark util
- Backward support  (not high priority)
- Support for multi dimensional tensors (on data and lengths) (not high priority)
- Support for 'indices' (not high priority)

Test Plan: Added unit test

Reviewed By: ngimel

Differential Revision: D27121170

fbshipit-source-id: 1c2565f42e2903e6fc089d56983ce8857efbfa3c
2021-04-08 13:25:55 -07:00
7e3cf1ee24 [pytorch] Add native support for segment reduce step1: API definition (#53727)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53727

This is the first diff to add native support for segment reduction in PyTorch. It provides functionality similar to torch.scatter or `numpy.ufunc.reduceat`.

This diff mainly focuses on the API layer to make sure future improvements will not cause backward-compatibility issues. Once the API is settled, here are the next steps I am planning:
- Add support for other major reduction types (e.g. min, sum) for 1D tensor
- Add Cuda support
- Backward support
- Documentation for the op
- Perf optimizations and benchmark util
- Support for multi dimensional tensors (on data and lengths) (not high priority)
- Support for 'indices' (not high priority)
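For intuition about the `reduceat`-like behavior described above, here is a minimal pure-Python sketch of a lengths-based reduction with a pluggable reducer and an `initial` seed (illustrative only, not the aten implementation):

```
def segment_reduce(data, lengths, reduce, initial):
    # Apply `reduce` over each run of lengths[i] consecutive elements,
    # seeding each segment with `initial`; an empty segment yields `initial`.
    out, start = [], 0
    for n in lengths:
        acc = initial
        for x in data[start:start + n]:
            acc = reduce(acc, x)
        out.append(acc)
        start += n
    return out
```

For example, a segmented max over `[1, 5, 2, 4]` with lengths `[2, 2]` gives `[5, 4]`.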

Test Plan: Added unit test

Reviewed By: ngimel

Differential Revision: D26952075

fbshipit-source-id: 8040ec96def3013e7240cf675d499ee424437560
2021-03-23 16:00:30 -07:00