Commit Graph

15 Commits

Author SHA1 Message Date
a37afd23fa [custom_ops][perf] Move expensive pytree traversals of tensors to C++ (#148555)
(benchmark for 1 call)

Before:
```
└─ $ python ~/task_custom_ops_perf/test_custom_ops_perf_repro.py
DO_BENCH mutate: 77.72445678710938 us PROFILE:/home/ivankobzarev/task_custom_ops_perf/mutate.json
DO_BENCH no_mutate: 64.61143493652344 us PROFILE:/home/ivankobzarev/task_custom_ops_perf/no_mutate.json
DO_BENCH direct_mutate: 11.682510375976562 us PROFILE:/home/ivankobzarev/task_custom_ops_perf/direct_mutate.json
DO_BENCH direct_no_mutate: 18.596649169921875 us PROFILE:/home/ivankobzarev/task_custom_ops_perf/direct_no_mutate.json
```

After:
```
└─ $ python ~/task_custom_ops_perf/test_custom_ops_perf_repro.py
DO_BENCH mutate: 47.6837158203125 us PROFILE:/home/ivankobzarev/task_custom_ops_perf/mutate.json
DO_BENCH no_mutate: 31.709671020507812 us PROFILE:/home/ivankobzarev/task_custom_ops_perf/no_mutate.json
DO_BENCH direct_mutate: 10.967254638671875 us PROFILE:/home/ivankobzarev/task_custom_ops_perf/direct_mutate.json
DO_BENCH direct_no_mutate: 10.728836059570312 us PROFILE:/home/ivankobzarev/task_custom_ops_perf/direct_no_mutate.json
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148555
Approved by: https://github.com/zou3519
2025-04-01 18:45:48 +00:00
d256b2dcb2 Revert "[custom_ops][perf] Move expensive pytree traversals of tensors to C++ (#148555)"
This reverts commit d686d04c2f3bac110044ebad5cc46e3035d7b425.

Reverted https://github.com/pytorch/pytorch/pull/148555 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/148555#issuecomment-2753283221))
2025-03-26 05:27:52 +00:00
d686d04c2f [custom_ops][perf] Move expensive pytree traversals of tensors to C++ (#148555)
(benchmark for 1 call)

Before:
```
└─ $ python ~/task_custom_ops_perf/test_custom_ops_perf_repro.py
DO_BENCH mutate: 77.72445678710938 us PROFILE:/home/ivankobzarev/task_custom_ops_perf/mutate.json
DO_BENCH no_mutate: 64.61143493652344 us PROFILE:/home/ivankobzarev/task_custom_ops_perf/no_mutate.json
DO_BENCH direct_mutate: 11.682510375976562 us PROFILE:/home/ivankobzarev/task_custom_ops_perf/direct_mutate.json
DO_BENCH direct_no_mutate: 18.596649169921875 us PROFILE:/home/ivankobzarev/task_custom_ops_perf/direct_no_mutate.json
```

After:
```
└─ $ python ~/task_custom_ops_perf/test_custom_ops_perf_repro.py
DO_BENCH mutate: 47.6837158203125 us PROFILE:/home/ivankobzarev/task_custom_ops_perf/mutate.json
DO_BENCH no_mutate: 31.709671020507812 us PROFILE:/home/ivankobzarev/task_custom_ops_perf/no_mutate.json
DO_BENCH direct_mutate: 10.967254638671875 us PROFILE:/home/ivankobzarev/task_custom_ops_perf/direct_mutate.json
DO_BENCH direct_no_mutate: 10.728836059570312 us PROFILE:/home/ivankobzarev/task_custom_ops_perf/direct_no_mutate.json
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148555
Approved by: https://github.com/zou3519
2025-03-19 17:16:57 +00:00
5b5766665d PEP585 update - torch/_C torch/_decomp torch/_lazy torch/_library torch/_numpy torch/_prims torch/_refs torch/_strobelight (#145102)
See #145101 for details.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145102
Approved by: https://github.com/bobrenjc93
ghstack dependencies: #145105
2025-01-18 20:47:12 +00:00
e7eeee473c [BE][Easy][14/19] enforce style for empty lines in import segments in torch/_[a-c]*/ and torch/_[e-h]*/ and torch/_[j-z]*/ (#129765)
See https://github.com/pytorch/pytorch/pull/129751#issue-2380881501. Most changes are auto-generated by linter.

You can review these PRs via:

```bash
git diff --ignore-all-space --ignore-blank-lines HEAD~1
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129765
Approved by: https://github.com/ezyang
2024-07-31 10:42:50 +00:00
61421c42c0 [custom_op] don't invoke autograd.Function when unnecessary (#127976)
This matches our autograd logic for pytorch native operators. There's no
need to invoke an autograd.Function if we're under a torch.no_grad() or
if none of the inputs have requires_grad=True (invoking an
autograd.Function results in (noticeable) overhead).

Test Plan:
- new test
Pull Request resolved: https://github.com/pytorch/pytorch/pull/127976
Approved by: https://github.com/williamwen42
2024-06-13 23:38:23 +00:00
7cc07a3eb1 [custom_op] stop using nonlocals to store information (#128547)
Fixes https://github.com/pytorch/pytorch/issues/128544
Fixes https://github.com/pytorch/pytorch/issues/128535

We had a problem with multithreading where the nonlocals were being
clobbered. In the first place, we stored these nonlocals because we
wanted to ferry information from an autograd.Function.apply to
autograd.Function.forward.

Our new approach is:
- pass the information directly as an input to the
  autograd.Function.apply. This means that the autograd.Function.forward
  will receive the information too.
- this messes up ctx.needs_input_grad, which has an element per input to
  forward. The user should not see the additional information we passed.
  We fix this by temporarily overriding ctx.needs_input_grad to the
  right thing.
- this exposed a bug in that ctx.needs_input_grad wasn't correct for
  TensorList inputs. This PR fixes that too.

Test Plan:
- existing and new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/128547
Approved by: https://github.com/williamwen42, https://github.com/soulitzer
2024-06-13 13:36:39 +00:00
afe15d2d2f Flip default value for mypy disallow_untyped_defs [3/11] (#127840)
See #127836 for details.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127840
Approved by: https://github.com/oulgen
2024-06-08 18:28:01 +00:00
4e340a7f8b [custom_op] setup_context fills in default values (#124852)
This is to mirror autograd.Function's setup_context behavior.
The PyTorch Dispatcher removes default values for "FC/BC reasons", but I
convinced myself there's no FC/BC problem for the setup_context API.

Test Plan:
- new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124852
Approved by: https://github.com/albanD
ghstack dependencies: #124637, #124805, #124806
2024-04-25 04:22:01 +00:00
4f398eed0b [custom_op] register_autograd supports non-tensor kwargonly-args (#124806)
The user does not need to return gradients for these args.

We also change how setup_context works to adapt to kwargonly-args. If
the user's op has no kwonly-args, then their setup_context function must
look like `setup_context(ctx, inputs, output)`: we require that the
arguments have the same names.

If the user's op has kwonly-args, then their setup_context function must
look like `setup_context(ctx, inputs, keyword_only_inputs, output)`.
We require that the arguments have the same names.

Test Plan:
- new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124806
Approved by: https://github.com/albanD, https://github.com/williamwen42
ghstack dependencies: #124637, #124805
2024-04-25 01:51:02 +00:00
25c65d6642 Change register_autograd to reflect ordering of setup_context and backward (#124403)
old: `register_autograd(setup_context, backward, /)`
new: `register_autograd(backward, /, *, setup_context=None)`

Motivations:
- We introduce these APIs as "give us a backward and use setup_context
  to save things for backward".
- setup_context isn't always necessary.

Test Plan:
- tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124403
Approved by: https://github.com/albanD
ghstack dependencies: #124180, #124200, #124299, #124134, #124199
2024-04-19 17:56:30 +00:00
648c39c47d Add OpOverload.redispatch; use it in new custom ops API (#124089)
A kernel has "dispatcher convention" if there is an additional keyset
arg at the beginning of the argument list. This PR:
- adds a way to register kernels with dispatcher_convention using
  Library.impl (pass dispatcher_convention = True)
- adds OpOverload.redispatch

We use both of the above in the new custom ops API: we register the
autograd kernel in dispatcher convention so that we can actually call
redispatch like how pytorch built-in ops do it.

Test Plan:
- existing tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124089
Approved by: https://github.com/albanD
ghstack dependencies: #123937, #124064, #124065, #124066, #124071
2024-04-18 12:48:04 +00:00
645173a0b5 Add torch.library.register_autograd (#124071)
Allows registering autograd for all custom op entry points:
- the new-style custom op API (custom_op)
- the old-style torch.library APIs
- C++ operator registration

Test Plan:
- tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124071
Approved by: https://github.com/albanD
ghstack dependencies: #123937, #124064, #124065, #124066
2024-04-18 12:47:59 +00:00
a03711d24d [custom_ops] Support TensorList inputs/outputs (#123615)
We add a `supports_tensorlist` decorator that gives an autograd.Function
the ability to handle TensorLists.

Test Plan:
- custom_op_db tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123615
Approved by: https://github.com/albanD
2024-04-15 23:32:43 +00:00
9e8d2b6de2 Add register_autograd to register backward formulas for custom ops (#123110)
The user provides a `setup_context` and a `backward_function`. These
get put into a torch.autograd.Function that gets registered as the
custom op's autograd implementation.

Test Plan:
- we update custom ops in the custom_op_db to use the new
  register_autograd API.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123110
Approved by: https://github.com/albanD
ghstack dependencies: #123108, #123109
2024-04-05 22:03:47 +00:00