Summary: We were handling constant attrs in a few different ways before, leading to confusion and missed handling for fused dtypes. This diff consolidates some of that code and fixes the current breakage.
Test Plan: CI. Recently broken tests now pass.
Differential Revision: D36335238
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77401
Approved by: https://github.com/jaybean-dev, https://github.com/jamesr66a
Previously, we were taking the `.op` from OpOverload/OpOverloadPacket and looking for a mapping in `_jit_builtins` for their signature. Those will only exist for operators on the public API, not the overload packets, e.g. `torch.resize_as_` but not `torch.ops.aten.resize_as_` (at least in this case, and I'm pretty sure generally). The OpOverloads/OpOverloadPackets have schemas stored on them, so we can just use those directly.
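In code terms, the fix amounts to reading the schema directly off the overload object; a minimal sketch (the `_schema` attribute name is assumed from the current OpOverload implementation):
```python
import torch

packet = torch.ops.aten.resize_as_            # OpOverloadPacket
overload = torch.ops.aten.resize_as_.default  # a specific OpOverload

# The schema is stored right on the overload object, so there is no need to
# resolve it through _jit_builtins (which only covers public torch.* functions
# such as torch.resize_as_):
print(overload._schema)  # aten::resize_as_(Tensor(a!) self, Tensor the_template, ...) -> Tensor(a!)
```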
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77182
Approved by: https://github.com/anjali411
This is the `__torch_dispatch__` subclass used for tracing by AOTAutograd (https://github.com/pytorch/functorch/blob/main/functorch/_src/python_key.py).
Given that a couple of folks are now interested in using this infra, it seems like a good idea to put it in core, and focus our efforts on a single implementation.
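For reference, a minimal sketch of the kind of `__torch_dispatch__` wrapper subclass in question; this is an illustrative logging version with a made-up class name, not the actual AOTAutograd tracer:
```python
import torch
from torch.utils._pytree import tree_map

class TraceTensor(torch.Tensor):
    """Illustrative only: records every aten op dispatched on it."""

    @staticmethod
    def __new__(cls, elem):
        # Wrap a real tensor; the wrapper itself carries no data.
        r = torch.Tensor._make_wrapper_subclass(
            cls, elem.size(), dtype=elem.dtype, device=elem.device,
            requires_grad=elem.requires_grad,
        )
        r.elem = elem
        return r

    @classmethod
    def __torch_dispatch__(cls, func, types, args=(), kwargs=None):
        kwargs = kwargs or {}

        def unwrap(t):
            return t.elem if isinstance(t, TraceTensor) else t

        def wrap(t):
            return TraceTensor(t) if isinstance(t, torch.Tensor) else t

        out = func(*tree_map(unwrap, args), **tree_map(unwrap, kwargs))
        print(f"traced: {func}")  # a real tracer would emit an FX node here
        return tree_map(wrap, out)

x = TraceTensor(torch.randn(3))
y = x * 2 + 1  # prints the aten ops that were dispatched
```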
I put this up as a WIP, just for discussion, but here are some questions off the top of my head.
1. What should be the intended way of extending this tracer? Should we define extension points, or should folks simply copy paste and modify? If we do define extension points, what are the extension points we should define?
2. There are some open questions about the way we're overriding FX to resolve some lingering issues (i.e. dealing with `nn.Parameter` and `call_module` calls). @ezyang implemented an alternate version of this tensor in https://github.com/albanD/subclass_zoo/blob/main/tracer_tensor.py, but it appears he ran into some issues with it that led to me submitting this implementation. That being said, I think some of the things over there should still be ported.
3. Given that this is going to be shared infra, what other features should we put in here? One that comes to mind is to allow for meta-tensor tracing (perhaps by default?), with a more solid fallback.
Some of the other implementations (for reference on requirements):
1. FX2TRT: D34868356 (internal only)
2. Edge's? @gmagogsfm
cc: @ezyang , @jamesr66a , @zou3519 , @gmagogsfm, @842974287
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74360
Approved by: https://github.com/ezyang
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76469
Broken by Original commit changeset: 450e86c4e08a
Original Phabricator Diff: D35874477
Test Plan: Added unit test coverage to test_fx_experimental
Reviewed By: albanD
Differential Revision: D35978105
fbshipit-source-id: f22670b3b00a86777a26feaf4cb911595d150a17
(cherry picked from commit 91868b1e872c19d58d96a6c80a5e78dc6ffe4c7b)
This PR makes the following improvements:
- moves the custom skip list for test_normalize_operator_exhaustive in test_fx_experimental to use the typical OpInfo skip architecture (see the sketch after this list). The skips were updated to xfails, and that identified some operators which were no longer failing the test
- redundant tests with OpInfo-based testing in test_jit.py were removed
- test_dtypes was improved so its error messages are clear and it makes test_nondifferentiable redundant; the latter test has been removed
- OpInfo.supports_complex_autograd() is removed in favor of a more accurate and general test for whether the particular dtype is in the backward dtypes of the operator
- gradchecks have been improved to verify that an operator doesn't support grad if it claims not to
- gradchecks have been improved to test the gradient of all input tensors that require gradient
- the concept of "default test dtypes" has been removed
- excessive and mostly redundant out testing for elementwise unary operators has been removed
- metadata for whether an op supports nuanced "safe casting" to out behavior has been removed from OpInfos
- numerous skips have been converted to xfails
- numerous OpInfos have had their metadata fixed based on the new checks
- jit-specific utilities in common_methods_invocations.py have been moved to jit_programming_utils.py
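For reference, the skip-to-xfail conversion mentioned above roughly follows the standard OpInfo decoration pattern; a hedged sketch (the import path and the class/test name strings are assumptions, not copied from the diff):
```python
import unittest
from torch.testing._internal.common_methods_invocations import DecorateInfo

# Declared on the operator's OpInfo entry in common_methods_invocations.py
# instead of a hard-coded skip list inside test_fx_experimental, so an op
# that starts passing surfaces as an unexpected success rather than staying
# silently skipped:
fx_xfail = DecorateInfo(
    unittest.expectedFailure,
    "TestNormalizeOperators",
    "test_normalize_operator_exhaustive",
)
# ... and then in the op database: OpInfo("some_op", ..., skips=(fx_xfail,))
```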
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75951
Approved by: https://github.com/ngimel
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73748
This adds CPU-only slow test jobs, which previously would never run.
Includes fixes/skips for slow tests that fail (they need to be skipped now because they previously never ran).
Test Plan: Imported from OSS
Reviewed By: malfet
Differential Revision: D34628803
Pulled By: davidberard98
fbshipit-source-id: c090ab7bf7bda9e24ec5cdefa6fd35c6310dbac0
(cherry picked from commit 06f7a94a57cc7023e9c5442be8298d20cd011144)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73564
While maintaining API backward compatibility, add an optional output parameter to split_module() that returns a mapping from the new qualified names in the modules after the split to the old qualified names in the original module.
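A hedged usage sketch of the new output parameter (the keyword name `qualname_map` and the exact contents of the mapping are assumptions here):
```python
import torch
from torch.fx import symbolic_trace
from torch.fx.passes.split_module import split_module

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 4)

    def forward(self, x):
        return torch.relu(self.linear(x))

m = M()
traced = symbolic_trace(m)

# Existing call sites keep working; passing a dict additionally collects
# new qualified name -> original qualified name.
qualname_map = {}
split = split_module(traced, m, lambda node: 0, qualname_map=qualname_map)
print(qualname_map)  # e.g. {'submod_0.linear': 'linear'} (illustrative)
```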
Test Plan:
1. Added a test (test_split_qualname_mapping) to test_fx_experimental.py to check the returned qualname mapping
```
$ python test_fx_experimental.py
...
Ran 1084 tests in 73.464s
OK (skipped=531, expected failures=4)
```
2. Ask test_fx.py to accept split_module's new signature
```
$ python test_fx.py --accept
```
Reviewed By: jamesr66a
Differential Revision: D34541792
fbshipit-source-id: e8ec7e77ec884e4db7cad0c0593e31861c76e42d
(cherry picked from commit d2e5a95a353ee5fb52cdba065f127489e9df47ae)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61608
See #61544 for an example of issues created by functional wrappers. In this
case, these are directly wrapping the native function with no added
functionality. One exception was `bilinear` which was just missing the default
argument in C++, but was otherwise the same.
I've kept the symbol `torch.functional.istft` because it looks like public API,
but it could just as easily be moved to `_torch_docs.py`.
Test Plan: Imported from OSS
Reviewed By: ngimel
Differential Revision: D31401361
Pulled By: albanD
fbshipit-source-id: 162b74d0b2d4f2e5c4834687a94541960cefdd52
(cherry picked from commit 700cd73ca121d903f04f539af171d3f768565921)
Summary:
These tests were not actually running because they were defined in the local scope of another test.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71885
Reviewed By: scottxu0730
Differential Revision: D33806251
Pulled By: jansel
fbshipit-source-id: 48a2d7b472f160759ef55e6fff1f8890511e3345
(cherry picked from commit 9ae14efb25dd034fed60ae99465cd3673c24eed2)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71016
I found out that `split_module` doesn't preserve default values for arguments. In trying to fix that, I noticed that `Graph.placeholder` doesn't make it easy to add a default argument when making a placeholder. This PR addresses both of those issues.
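A hedged sketch of the placeholder side of the fix (the `default_value` keyword name is an assumption about the new `Graph.placeholder` signature):
```python
import torch
from torch.fx import Graph

g = Graph()
# Attach a default directly when creating the placeholder, so passes like
# split_module can carry it through to the generated submodules.
x = g.placeholder("x", default_value=torch.zeros(2))
g.output(x)
```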
Test Plan: Imported from OSS
Reviewed By: ansley
Differential Revision: D33482218
Pulled By: jamesr66a
fbshipit-source-id: 57ebcdab25d267333fb1034994e08fc1bdb128ee
Summary:
Earlier, we were only testing inputs with shape `(5,)` for `nn.functional.dropout`, but since it's used a lot, I feel it's a good idea to test a few more shapes, including scalars. This PR:
1. Revises sample inputs for `nn.functional.dropout` (see the sketch after this list)
2. Adds an OpInfo for `nn.functional.dropout2d`.
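A hedged sketch of the broader shape coverage for the `nn.functional.dropout` sample inputs (the real sample-inputs function in `common_methods_invocations.py` differs in detail; the names below are illustrative):
```python
from torch.testing import make_tensor
from torch.testing._internal.common_methods_invocations import SampleInput

def sample_inputs_dropout(op_info, device, dtype, requires_grad, **kwargs):
    # Cover a scalar plus a few low-dimensional shapes, not just (5,).
    for shape in ((), (5,), (3, 4), (2, 3, 4)):
        t = make_tensor(shape, device=device, dtype=dtype, requires_grad=requires_grad)
        yield SampleInput(t, kwargs=dict(p=0.5, training=True))
```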
A note regarding the documentation:
Looks like `nn.functional.dropout2d` also supports inputs of shape `(H, W)` apart from `(N, C, H, W)` / `(C, H, W)`, but the [documentation](https://pytorch.org/docs/stable/generated/torch.nn.Dropout2d.html#torch.nn.Dropout2d) doesn't mention the `(H, W)` case. Should that be revised, or am I missing something here? (Filed an issue here: https://github.com/pytorch/pytorch/issues/67892)
```python
# A 2D tensor is a valid input for Dropout2d
In [11]: tensor = torch.randn((3, 4), device='cpu', dtype=torch.float32)
In [12]: dropout2d = torch.nn.Dropout2d(p=0.5)
In [13]: dropout2d(tensor)
Out[13]:
tensor([[-0.1026, -0.0000, -0.0000, -0.0000],
        [-1.5647,  0.0000, -0.0000, -0.5820],
        [-0.0000, -3.2080,  0.1164, -3.6780]])
```
Issue Tracker: https://github.com/pytorch/pytorch/issues/54261
cc: mruberry zou3519
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67891
Reviewed By: mrshenli
Differential Revision: D32628527
Pulled By: mruberry
fbshipit-source-id: 4c9b89550f1d49526e294378ce107eba9f29cabb
Summary:
As per title.
While working on this I discovered several issues with these methods related to grad instabilities. I will file them and link them here later. It was quite painful to get all the tests to pass given these issues; sorry for the delay, mruberry!
cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69107
Reviewed By: zou3519
Differential Revision: D32920341
Pulled By: mruberry
fbshipit-source-id: 15b33e2b46acdcbff8a37d8e43e381eb55d1a296
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67357
This PR adds OpInfos for:
- new_ones, new_zeros, new_full, new_empty
- rand_like, randint_like
I forgot to add the _like functions in a previous PR, so here they are.
Test Plan: - wait for tests
Reviewed By: mruberry
Differential Revision: D31969533
Pulled By: zou3519
fbshipit-source-id: 236d70d66e82f1d6f8e5254b55ca2a37b54c9494
Summary:
All of the pooling modules except MaxUnpool and LPPool return either a
Tensor or [Tensor, Tensor]. The current type annotations are inaccurate,
and prevent scripting the module if return_indices is set to True in the
module.
There's not a great way to make this agree with mypy because the
overload is dependent on the value of return_indices, an attribute.
I tried changing the annotations from `Tensor` to
`Union[Tensor, Tuple[Tensor, Tensor]]`, but that breaks a bunch of uses
that have return_indices=False.
For example, this breaks:
4e94e84f65/torch/nn/modules/container.py (L139)
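To make the conflict concrete, a small illustrative sketch (not the actual module code) of why the `Union` return annotation breaks call sites that rely on `return_indices=False`:
```python
from typing import Tuple, Union
from torch import Tensor

# Illustrative signature only -- not the real pooling module code.
def pool(x: Tensor, return_indices: bool = False) -> Union[Tensor, Tuple[Tensor, Tensor]]:
    return (x, x.long()) if return_indices else x

def caller(x: Tensor) -> Tensor:
    out = pool(x)   # at runtime this is a plain Tensor,
    return out * 2  # but mypy flags it because `out` might be a tuple
```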
Also clean up how test names were being constructed in test_jit, since
otherwise we were getting name collisions when there were two tests on
the same nn.Module.
Fixes https://github.com/pytorch/pytorch/issues/45904
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65847
Reviewed By: ZolotukhinM
Differential Revision: D31462517
Pulled By: eellison
fbshipit-source-id: 6f9e8df1be6c75e5e1e9bae07cf3ad3603ba59bd
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64282
OpInfos for:
- Tensor.bfloat16, Tensor.bool, Tensor.byte, Tensor.char
- Tensor.double, Tensor.float, Tensor.half, Tensor.int
- Tensor.short, Tensor.long
None of these are supported by TorchScript. Also, the OpInfo autograd
test runner assumes that the operation is not allowed to change the
dtype of the argument, so only Tensor.double has
`supports_autograd=True` (in theory Tensor.bfloat16, Tensor.float,
Tensor.half should be differentiable).
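The conversions themselves are differentiable in eager mode, which is why the parenthetical above says they should be differentiable in theory; a quick illustration:
```python
import torch

x = torch.randn(3, dtype=torch.float32, requires_grad=True)
y = x.double()        # forward result is float64
y.sum().backward()
print(x.grad.dtype)   # torch.float32 -- the gradient is cast back to the input dtype
```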
Test Plan: - run tests
Reviewed By: dagitses
Differential Revision: D31452627
Pulled By: zou3519
fbshipit-source-id: b7f272e558558412c47aefe947af7f060dfb45c5
Summary:
OpInfo tracker: https://github.com/pytorch/pytorch/issues/54261
- Eliminate duplicated testing logic in test_autograd
- Moved tests that rely on this testing logic to use OpInfos
- `cat` already has OpInfo (no action needed)
- Created OpInfo for `block_diag` and `broadcast_tensors`
Ran into some FX errors. Added the ops to the skip list and created an issue here: https://github.com/pytorch/pytorch/issues/64997
Both `block_diag` and `broadcast_tensors` are variadic, so skipping `test_variant_consistency_jit` (from comments on other OpInfos, it looks like JIT does not support variadic tensors).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64993
Reviewed By: jbschlosser
Differential Revision: D30961736
Pulled By: soulitzer
fbshipit-source-id: e169305384a683acae1178c4e12e9e214a67226a