183 Commits

Author SHA1 Message Date
b11593c31b [8/N] Apply ruff UP035 rule (#165214)
This is a follow-up to #164653 that continues applying `UP035` fixes. The goal is to finally enable this rule.
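For context, `UP035` flags deprecated import locations; a typical fixit looks like this illustrative sketch (not code from the PR):

```python
# Before (flagged by UP035):
# from typing import Callable, Sequence
# After: import the ABCs from collections.abc instead.
from collections.abc import Callable, Sequence

def apply_all(fns: Sequence[Callable[[int], int]], x: int) -> int:
    for fn in fns:
        x = fn(x)
    return x
```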

Pull Request resolved: https://github.com/pytorch/pytorch/pull/165214
Approved by: https://github.com/ezyang
2025-10-15 03:18:57 +00:00
199e9abb6a [fx] fix split_module with symint (#160093)
Fixes https://github.com/pytorch/pytorch/issues/155220

Pull Request resolved: https://github.com/pytorch/pytorch/pull/160093
Approved by: https://github.com/ezyang
2025-08-13 05:50:15 +00:00
b9afdd9bcc Add flag to fx.passes.split_module to normalize input names (#157733)
This is useful for vLLM, which runs AOTAutograd directly on graphs after
they have been split.

I created a new flag for this instead of reusing
`keep_original_node_name` (please let me know if you think I should reuse this).
The reasoning is:
- The names of the placeholder nodes are different from the targets of
  the placeholder nodes. The targets are the actual input names.
- Backwards compatibility: this API has been out for ~4 years, it
  looks public, and it has extensive public use. For example, this change
  would actually be BC-breaking for vLLM (they rely on the subgraph input
  names being different at the moment).

Test Plan:
- new tests
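The exact name of the new flag isn't shown in this log; for orientation, baseline `split_module` usage looks like this sketch (hypothetical module and partition callback):

```python
import operator
import torch
from torch.fx import symbolic_trace
from torch.fx.passes.split_module import split_module

class M(torch.nn.Module):
    def forward(self, x):
        a = x + 1
        return a * 2

m = M()
gm = symbolic_trace(m)

# Route the add into partition 0 and everything else into partition 1.
def callback(node):
    return 0 if node.target is operator.add else 1

split = split_module(gm, m, callback)
print(split.submod_0.graph)  # the placeholder names here are what the flag normalizes
```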

Pull Request resolved: https://github.com/pytorch/pytorch/pull/157733
Approved by: https://github.com/ezyang
2025-07-08 13:47:24 +00:00
1f8ff94d4f PEP585: Add noqa to necessary tests (#146391)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/146391
Approved by: https://github.com/justinchuby, https://github.com/Skylion007
2025-02-12 15:29:50 +00:00
1c16cf70c3 Apply ruff fixes to tests (#146140)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/146140
Approved by: https://github.com/albanD
2025-02-04 05:41:01 +00:00
c4523999a1 Fix incorrect type comparison (#145449)
Summary: This change was incorrectly made as part of #145166

Differential Revision: D68536221

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145449
Approved by: https://github.com/bobrenjc93
2025-01-26 04:40:26 +00:00
99dbc5b0e2 PEP585 update - test (#145176)
See #145101 for details.
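PEP 585 means subscripting the builtin containers directly instead of the `typing` aliases; roughly (illustrative snippet):

```python
# PEP 585 (Python 3.9+): builtin generics replace typing.List, typing.Dict, ...
def bucket(xs: list[int]) -> dict[str, list[int]]:
    return {"even": [x for x in xs if x % 2 == 0],
            "odd": [x for x in xs if x % 2 == 1]}
```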

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145176
Approved by: https://github.com/bobrenjc93
2025-01-22 04:48:28 +00:00
0b2a3687b9 PEP585 update - torch/fx (#145166)
See #145101 for details.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145166
Approved by: https://github.com/bobrenjc93
2025-01-20 18:11:54 +00:00
df458be4e5 [4/N] Apply py39 ruff and pyupgrade fixes (#143257)
`torch/fx/passes/annotate_getitem_nodes.py` was changed to support the new type hinting annotations.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143257
Approved by: https://github.com/justinchuby, https://github.com/albanD
2025-01-04 10:47:51 +00:00
d8c8ba2440 Fix unused Python variables in test/[e-z]* (#136964)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136964
Approved by: https://github.com/justinchuby, https://github.com/albanD
2024-12-18 23:02:30 +00:00
792f1c47e9 No actual change, just remove variables containing Tensors from global scope (#143225)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143225
Approved by: https://github.com/ezyang
2024-12-17 16:14:25 +00:00
af47e05a96 [fx] make split_module work with keep_original_order=True and no-op graph (#141340)
Fixes https://github.com/pytorch/pytorch/issues/140014

Pull Request resolved: https://github.com/pytorch/pytorch/pull/141340
Approved by: https://github.com/ezyang
2024-11-24 06:41:30 +00:00
ee7eaad5c3 [dynamo] add SymNode bitwise and/or (#138777)
Fixes [T203472723](https://www.internalfb.com/intern/tasks/?t=203472723)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/138777
Approved by: https://github.com/ezyang
2024-11-22 23:36:16 +00:00
c1fe6be202 Revert "[dynamo] add SymNode bitwise and/or (#138777)"
This reverts commit c98ef0279e6eb968f5f9d22e1f193e7064594152.

Reverted https://github.com/pytorch/pytorch/pull/138777 on behalf of https://github.com/ezyang due to triggering AssertionError: Guard check failed: 14/2: name 'BitwiseFn_bitwise_or' is not defined ([comment](https://github.com/pytorch/pytorch/pull/138777#issuecomment-2477477776))
2024-11-14 21:52:40 +00:00
c98ef0279e [dynamo] add SymNode bitwise and/or (#138777)
Fixes [T203472723](https://www.internalfb.com/intern/tasks/?t=203472723)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/138777
Approved by: https://github.com/ezyang
2024-11-13 18:31:06 +00:00
0cf4cc3d5f [fx] split_module subgraph should always have an output node (#139275)
Fixes https://github.com/pytorch/pytorch/issues/138207

Pull Request resolved: https://github.com/pytorch/pytorch/pull/139275
Approved by: https://github.com/ezyang
2024-10-31 04:53:19 +00:00
221350e3a4 Add None return type to init -- tests (#132352)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132352
Approved by: https://github.com/ezyang
ghstack dependencies: #132335, #132351
2024-08-01 15:44:51 +00:00
9e473fd868 Make adding Buffers more like adding Parameters (#125971)
Add semantics for creating a buffer object similar to creating a parameter, by introducing a new Buffer class that can be used for type disambiguation. The underlying functionality of registering a buffer remains the same: the register_buffer method has not been changed. The persistent parameter on the Buffer type indicates whether the buffer should be persistent or not. The other non-test changes get the new Buffer type recognized by inductor and dynamo. The remaining test changes make sure that the Buffer type can be used as a drop-in replacement for register_buffer, since it just leads to register_buffer being called. Normal tensors can still be used as buffers, so these changes are intended to be backwards compatible.
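A usage sketch of the type described above (assuming the `torch.nn.Buffer` spelling this PR introduces):

```python
import torch
from torch import nn

class Window(nn.Module):
    def __init__(self):
        super().__init__()
        # Assigning a Buffer behaves like register_buffer: it appears in
        # state_dict and moves with the module across devices/dtypes.
        self.coeffs = nn.Buffer(torch.hann_window(8))
        # persistent=False keeps it out of state_dict, mirroring
        # register_buffer(..., persistent=False).
        self.scratch = nn.Buffer(torch.zeros(8), persistent=False)

m = Window()
assert "coeffs" in m.state_dict() and "scratch" not in m.state_dict()
```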

Fixes #35735

Co-authored-by: Mikayla Gawarecki <mikaylagawarecki@gmail.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/125971
Approved by: https://github.com/albanD, https://github.com/anijain2305, https://github.com/mlazos
2024-07-31 10:32:40 +00:00
973037be6a [BE][Easy] apply autofix for ruff rules unnecessary-collection-call (C408): list() / tuple() / dict() (#130199)
This PR changes empty collection factory calls to Python literals:

- `list()` -> `[]`
- `tuple()` -> `()`
- `dict()` -> `{}`

The Python literals are more performant and safer. For example, the bytecode for building an empty dictionary:

```bash
$ python3 -m dis - <<EOS
import collections

d1 = {}
d2 = dict()

dict = collections.OrderedDict
d3 = dict()
EOS
```

```text
  0           0 RESUME                   0

  1           2 LOAD_CONST               0 (0)
              4 LOAD_CONST               1 (None)
              6 IMPORT_NAME              0 (collections)
              8 STORE_NAME               0 (collections)

  3          10 BUILD_MAP                0
             12 STORE_NAME               1 (d1)

  4          14 PUSH_NULL
             16 LOAD_NAME                2 (dict)
             18 CALL                     0
             26 STORE_NAME               3 (d2)

  6          28 LOAD_NAME                0 (collections)
             30 LOAD_ATTR                8 (OrderedDict)
             50 STORE_NAME               2 (dict)

  7          52 PUSH_NULL
             54 LOAD_NAME                2 (dict)
             56 CALL                     0
             64 STORE_NAME               5 (d3)
             66 RETURN_CONST             1 (None)
```

The dict literal `{}` only has one bytecode `BUILD_MAP`, while the factory call `dict()` has three `PUSH_NULL + LOAD_NAME + CALL`. Also, the factory call is not safe if users override the `dict` name in `locals` or `globals` (see the example of replacing with `OrderedDict` above).
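A quick way to observe the difference (illustrative micro-benchmark, not from the PR):

```python
import timeit

# The literal compiles to a single BUILD_MAP; the call adds a global name
# lookup plus a call. Absolute numbers vary by machine and Python version.
print(timeit.timeit("{}"))      # literal
print(timeit.timeit("dict()"))  # factory call; measurably slower
```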

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130199
Approved by: https://github.com/malfet
2024-07-11 17:30:28 +00:00
e6bfa2958b Add aten._unsafe_masked_index (#116491)
To generate masked indexing operations that lower to
masked loads in Triton code
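A rough sketch of the eager semantics, under an assumed signature of `(tensor, mask, indices, fill)` (the signature is not confirmed by this log):

```python
import torch

def unsafe_masked_index_ref(t, mask, indices, fill):
    # Where mask is True, read t at the given indices; elsewhere return fill.
    # Indices are only trusted where mask holds, so clamp them defensively.
    safe = tuple(i.clamp(0, s - 1) for i, s in zip(indices, t.shape))
    return torch.where(mask, t[safe], torch.full((), fill, dtype=t.dtype))
```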

Pull Request resolved: https://github.com/pytorch/pytorch/pull/116491
Approved by: https://github.com/lezcano, https://github.com/peterbell10
2024-06-25 02:45:02 +00:00
d1fad416a8 Revert "Add aten._unsafe_masked_index (#116491)"
This reverts commit f03f8bc901a6c9038308a6353e8d280f4b5628f5.

Reverted https://github.com/pytorch/pytorch/pull/116491 on behalf of https://github.com/PaliC due to breaking onnx tests ([comment](https://github.com/pytorch/pytorch/pull/116491#issuecomment-2145557724))
2024-06-03 15:51:50 +00:00
f03f8bc901 Add aten._unsafe_masked_index (#116491)
To generate masked indexing operations that lower to
masked loads in Triton code

Pull Request resolved: https://github.com/pytorch/pytorch/pull/116491
Approved by: https://github.com/lezcano, https://github.com/peterbell10
2024-06-03 14:44:03 +00:00
35d3adb4b0 Add ATen Op _chunk_cat and _chunk_cat.out (#121081)
# Motivation

In the backward pass of per-parameter-sharding FSDP, each rank performs a reduce-scatter to sync gradients across ranks. A rank chunks each gradient tensor into `world_size` slices along the 0th dimension and concatenates all slices along the 1st dimension. Gradient tensors are padded before concatenation when tensor.size(0) % world_size != 0.

### Example 1
Consider `world_size=3` and tensors A (2x4), B (3x3), C (1x2):

Input tensors:
```
AAAA   BBB   CC
AAAA   BBB
       BBB
```

Reduce-scatter-copy-in Output:
```
AAAABBBCC
AAAABBB00
0000BBB00
```

### Example 2
Consider `world_size=2` and tensors A (2x4), B (3x3), C(1x2), D(4x2):

Input tensors:
```
AAAA   BBB   CC   DD
AAAA   BBB        DD
       BBB        DD
                  DD
```

Reduce-scatter-copy-in first pad:
```
AAAA   BBB   CC   DD
AAAA   BBB   00   DD
       BBB        DD
       000        DD
```

Then chunk and cat along dim as the output:
```
AAAABBBBBBCCDDDD
AAAABBB00000DDDD
```

The performance of reduce-scatter-copy-in is critical to per-parameter-sharding FSDP. However, implementing reduce-scatter-copy-in by composing existing ATen ops involves `cat` and irregular `pad`, leading to redundant data copies and unsatisfactory performance.

# PR
We provide ATen-native support for reduce-scatter-copy-in, namely `_chunk_cat()`:

```
_chunk_cat(Tensor[] tensors, int dim, int num_chunks) -> Tensor
```

This PR includes the registration of `_chunk_cat` and `_chunk_cat.out`, OpInfo tests, and a basic implementation composing existing ATen ops.
In the next PR, we will add the CUDA implementation. Compared with baselines composing existing ATen ops, the `_chunk_cat()` CUDA implementation improves copy bandwidth from 498 GB/s to 966 GB/s on a production benchmark.

## Requirements on input

1. If input tensors have different ndims, dim should be non-negative and less than the ndim of every input tensor. If all input tensors have the same ndim, both negative and non-negative dim are supported.
2. For wrapped_dim, all tensors should have the same size for dimensions 0, ..., wrapped_dim-1. There are no requirements on the remaining dimensions.
3. num_chunks must be positive.
4. The input tensor list must be non-empty, and each input tensor must have at least one element.
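As a rough pure-ATen reference for the examples above (the `dim=1` case; the real op is more general and faster):

```python
import torch

def chunk_cat_ref(tensors, dim, num_chunks):
    # Zero-pad each tensor along dim 0 to a multiple of num_chunks, split it
    # into num_chunks row-chunks, flatten each chunk, and concatenate the
    # chunks chunk-wise along `dim`.
    padded = []
    for t in tensors:
        pad = (-t.size(0)) % num_chunks
        if pad:
            t = torch.cat([t, t.new_zeros((pad, *t.shape[1:]))], dim=0)
        padded.append(t.chunk(num_chunks, dim=0))
    rows = [torch.cat([c.reshape(1, -1) for c in chunks], dim=dim)
            for chunks in zip(*padded)]
    return torch.cat(rows, dim=0)

# Example 1 above: world_size=3 with A(2x4), B(3x3), C(1x2) -> a 3x9 output.
A, B, C = torch.ones(2, 4), 2 * torch.ones(3, 3), 3 * torch.ones(1, 2)
print(chunk_cat_ref([A, B, C], dim=1, num_chunks=3).shape)  # torch.Size([3, 9])
```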

Pull Request resolved: https://github.com/pytorch/pytorch/pull/121081
Approved by: https://github.com/albanD
2024-03-08 21:48:12 +00:00
b3b585af64 Revert "[codemod] markDynamoStrictTest batch 16 (#117218)"
This reverts commit 47119785acbfe20d9ef6cf5d90887a441402f5c7.

Reverted https://github.com/pytorch/pytorch/pull/117218 on behalf of https://github.com/zou3519 due to just felt like reverting this ([comment](https://github.com/pytorch/pytorch/pull/117218#issuecomment-1888360366))
2024-01-12 03:06:20 +00:00
47119785ac [codemod] markDynamoStrictTest batch 16 (#117218)
[codemod] markDynamoStrictTest test_dataloader
[codemod] markDynamoStrictTest test_public_bindings
[codemod] markDynamoStrictTest test_namedtensor
[codemod] markDynamoStrictTest test_fx
[codemod] markDynamoStrictTest test_content_store
[codemod] markDynamoStrictTest test_schema_check
[codemod] markDynamoStrictTest lazy/test_ts_opinfo
[codemod] markDynamoStrictTest functorch/test_ops
Pull Request resolved: https://github.com/pytorch/pytorch/pull/117218
Approved by: https://github.com/bdhirsh
2024-01-12 00:32:36 +00:00
d8ad74857c Run translation validation on tracing error. (#106645)
This PR wraps the `InstructionTranslator` run in a try-catch block so that
translation validation (TV) runs if tracing ends up raising an error.

In this context, we run TV to catch simplification errors, which may render
`ShapeEnv.divisible` and `ShapeEnv.replacements` incorrect.

For example, #101173 describes a SymPy simplification bug that doesn't reach TV,
since TV runs only at the end of tracing.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106645
Approved by: https://github.com/ezyang
2023-08-14 13:43:34 +00:00
bc88028e8e Back out "Reland "Make adding buffers more like adding parameters (#104069)" (#106224)" (#106743)
Summary:
Original commit changeset: 81319beb97f3

Original Phabricator Diff: D47961182

Test Plan: revert to maintain backward compat with legacy ads_dper3 production package. Read details in: S357822

Reviewed By: atuljangra

Differential Revision: D48131623

@diff-train-skip-merge
(D48131623 landed internally)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106743
Approved by: https://github.com/malfet
2023-08-08 15:27:34 +00:00
33e70e34a3 More readable Z3 expressions printer. (#106643)
This PR makes Z3 expressions easier to read and understand by creating a custom printer
for them.

Z3 expressions can be printed in 2 forms:

1. Using the builtin `str(e)` function
2. Using the `e.sexpr()` method

The problem is that (1) is hard to read because its line breaks are not
intuitive. (2) is nicer, but the `to_int` and `to_real` functions clutter things up.

The custom printer is an improved `sexpr()` function that:

- Keeps everything on one line
- Gets rid of the `to_int` and `to_real` functions
- Reconstructs the floor division operations
- Merges commutative operation chains
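For concreteness, the two built-in forms look like this (assumes the `z3-solver` package; the real-division encoding of floor division is what produces the `to_real`/`to_int` noise):

```python
import z3

a, b = z3.Ints("a b")
e = z3.ToInt(z3.ToReal(a) / z3.ToReal(b)) + a * 2 >= 6
print(str(e))     # multi-line repr with unintuitive breaks
print(e.sexpr())  # s-expression form, cluttered by to_real/to_int
```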

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106643
Approved by: https://github.com/ezyang
2023-08-07 16:52:22 +00:00
424dc238f4 Fix split module interaction with dead code (#104554)
Summary:
This change fixes split_module's interaction with dead code. Previously, if a dead region was split out, split_module would throw an error while attempting to access the outputs of the partition, even though the partition has no outputs.

This change adds a new unit test to cover the dead-code case and changes the output check to allow no outputs. A split module with no outputs now returns None, like a normal Python function.

Unit Test Added:
test_split_module_dead_code

A module with dead code:
```
class ModWithDeadCode(torch.nn.Module):
            def forward(self, x):
                output = x * 2 # we want this
                dead_line = x + 2 # this is dead
                return output
```

Before:
```
torch/fx/passes/split_module.py, line 357, in split_module
base_mod_env[list(partition.outputs)[0]] = output_val
IndexError: list index out of range
```

After:
```
class GraphModule(torch.nn.Module):
    def forward(self, x):
        # No stacktrace found for following nodes
        submod_2 = self.submod_2(x)
        submod_1 = self.submod_1(x);  x = None
        return submod_1

    class GraphModule(torch.nn.Module):
        def forward(self, x):
            # No stacktrace found for following nodes
            add = x + 2;  x = None
            return None

    class GraphModule(torch.nn.Module):
        def forward(self, x):
            # No stacktrace found for following nodes
            mul = x * 2;  x = None
            return mul
```
Submod 2 is correctly extracted

Test Plan: Tested with new unit test

Differential Revision: D47196732

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104554
Approved by: https://github.com/yf225
2023-08-03 21:36:35 +00:00
d8e5f2aa6d Reland "Make adding buffers more like adding parameters (#104069)" (#106224)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106224
Approved by: https://github.com/atalman, https://github.com/albanD
2023-07-31 17:18:56 +00:00
c6653b65d8 Back out "Make adding buffers more like adding parameters (#104069)" (#105581)
Summary:
D47537831 is breaking pyper tests: https://fb.workplace.com/groups/802176577445480/posts/1018902842439518/

with `TypeError: register_buffer() takes 3 positional arguments but 4 were given`

Original commit changeset: d4b4069fbd38

Original Phabricator Diff: D47537831

Test Plan:
```
buck2 run //caffe2/torch/fb/training_toolkit/integration_tests/training_lifecycle/cogwheel_tests/pyper_release_v2:cogwheel_smallworld_inline_cvr_infer_pyper_pyper__canary_offline_training-launcher -- --run-harness-in-tupperware --build-fbpkg ads_dper3 --build-fbpkg training_platform
```

Reviewed By: atalman

Differential Revision: D47600140

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105581
Approved by: https://github.com/mikaylagawarecki
2023-07-20 03:39:53 +00:00
32d422f335 Make adding buffers more like adding parameters (#104069)
Add semantics for creating a buffer object similar to creating a parameter, by introducing a new `Buffer` class that can be used for type disambiguation. The underlying functionality of registering a buffer remains the same: the `register_buffer` method has not been changed. The `persistent` parameter on the `Buffer` type indicates whether the buffer should be persistent or not. The other non-test changes get the new `Buffer` type recognized by inductor and dynamo. The remaining test changes make sure that the `Buffer` type can be used as a drop-in replacement for `register_buffer`, since it just leads to `register_buffer` being called. Normal tensors can still be used as buffers, so these changes are intended to be backwards compatible.

Fixes #35735

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104069
Approved by: https://github.com/mikaylagawarecki
2023-07-17 17:59:05 +00:00
9b0b31a5e3 fix conv+bn folding issue for mixed dtype (#99696)
Align the conv+bn folding behavior with the JIT path for the mixed-dtype case: always keep the conv's weight and bias dtype after folding.
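A sketch of the folding arithmetic with that dtype rule applied (illustrative, not the PR's exact code):

```python
import torch

def fold_conv_bn_weights(w, b, bn_mean, bn_var, bn_eps, bn_w, bn_b):
    # Fold BN statistics into the conv weight/bias in fp32, then cast back
    # to the conv's original dtype, as described above.
    orig_dtype = w.dtype
    w = w.float()
    b = b.float() if b is not None else torch.zeros_like(bn_mean)
    scale = bn_w / torch.sqrt(bn_var + bn_eps)
    w_folded = w * scale.reshape(-1, *([1] * (w.dim() - 1)))
    b_folded = (b - bn_mean) * scale + bn_b
    return w_folded.to(orig_dtype), b_folded.to(orig_dtype)
```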

Pull Request resolved: https://github.com/pytorch/pytorch/pull/99696
Approved by: https://github.com/jgong5, https://github.com/jansel
2023-04-23 05:13:40 +00:00
039b4c8809 Add meta function for _upsample_bilinear2d_aa (#94982)
Differential Revision: D43353000

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94982
Approved by: https://github.com/ezyang
2023-02-19 07:11:20 +00:00
046e88a291 [BE] [3/3] Rewrite super() calls in test (#94592)
Rewrite Python built-in class `super()` calls. Only non-semantic changes should be applied.

- #94587
- #94588
- #94592

Also, methods with only a `super()` call are removed:

```diff
class MyModule(nn.Module):
-   def __init__(self):
-       super().__init__()
-
    def forward(self, ...):
        ...
```

Cases where the rewrite would change the semantics are kept unchanged. E.g.:

f152a79be9/caffe2/python/net_printer.py (L184-L190)

f152a79be9/test/test_jit_fuser_te.py (L2628-L2635)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94592
Approved by: https://github.com/ezyang, https://github.com/seemethere
2023-02-12 22:20:53 +00:00
67d9790985 [BE] Apply almost all remaining flake8-comprehension checks (#94676)
Applies the remaining flake8-comprehension fixes and checks. This change replaces all remaining unnecessary generator expressions with list/dict/set comprehensions, which are more succinct, performant, and better supported by our torch.jit compiler. It also removes useless generators such as `set(a for a in b)`, resolving them into just the set call.
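Representative fixits (illustrative snippet):

```python
pairs = [("a", 1), ("b", 2)]

s = {a for a in range(10)}             # was: set(a for a in range(10))
d = {k: v for k, v in pairs}           # was: dict((k, v) for k, v in pairs)
total = sum(x * x for x in range(10))  # bare generator args to sum() etc. are fine
```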

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94676
Approved by: https://github.com/ezyang
2023-02-12 01:01:25 +00:00
9171f7d4cd [BE] Modernize PyTorch even more for 3.8 with pyupgrade (#94520)
Applies some more pyupgrade fixits to PyTorch

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94520
Approved by: https://github.com/ezyang
2023-02-10 18:02:50 +00:00
b769005924 [fx][passes] Implement annotate getitem node FX passes (#90237)
Summary: One common cause of JIT unscriptability issues is the loss of node type annotations on local names after one or several FX transforms. One way to improve type coverage is to eagerly annotate the type of `getitem` nodes from their parent sequence node. This diff introduces an FX pass to do that.
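A simplified sketch of the idea (not the exact pass):

```python
import operator
from typing import get_args, get_origin

import torch.fx as fx

def annotate_getitem_nodes_sketch(graph: fx.Graph) -> None:
    # If a getitem node's parent carries an annotation like list[torch.Tensor],
    # copy the element type onto the getitem node so downstream scripting
    # keeps type information.
    for node in graph.nodes:
        if node.target is operator.getitem and getattr(node.args[0], "type", None):
            origin = get_origin(node.args[0].type)
            if origin in (list, tuple):
                node.type = get_args(node.args[0].type)[0]
```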

Test Plan:
```
buck2 test //caffe2/test:fx_experimental
```

Reviewed By: xush6528

Differential Revision: D41749744

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90237
Approved by: https://github.com/xush6528
2022-12-06 23:18:55 +00:00
bc73affdad prepare removal of deprecated functionality in torch.testing (#87969)
_Redo of #86586 with all BC breaking changes granularly placed into separate commits._

---

Per title. Deprecation happened on Feb 25, 2022 in c6f1bbc0ac33be0c8ad9956e3fc15e78ddb6cb95, which made it into the 1.12 release. Since it is now 245 days later and the next release will be 1.14, the removals later in the stack comply with the [BC policy](https://github.com/pytorch/pytorch/wiki/PyTorch's-Python-Frontend-Backward-and-Forward-Compatibility-Policy#minimizing-the-disruption-of-bc-breaking-changes).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87969
Approved by: https://github.com/mruberry
2022-11-02 14:04:48 +00:00
f325c29b05 [fx] Make NormalizeArgs preserve node type (#85637)
Summary: Make `NormalizeArgs` preserve node types when transforming the graph. This bug is preventing me from scripting a graph that goes through the fx2trt `acc_tracer`.

Test Plan: New unit test

Reviewed By: ipiszy

Differential Revision: D39753021

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85637
Approved by: https://github.com/Chillee
2022-09-26 21:30:16 +00:00
604487f239 OpInfo for Slice (#85554)
This is based on wconstab's tests from #84680.

Technically, slice is covered by the __getitem__ OpInfo, but it is
easier to debug/test a narrower internal function that only
exercises this functionality and not other advanced indexing.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85554
Approved by: https://github.com/mruberry, https://github.com/wconstab
2022-09-23 22:01:32 +00:00
5b88a2078b Follow GitHub relabeling of oncall: fx for test owners (#81821)
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81821
Approved by: https://github.com/janeyx99
2022-07-21 01:50:06 +00:00
9402219a36 Move serialize_module() out of OSS graph_manipulation.py to internal (#80785)
Differential Revision: D37582495

Pull Request resolved: https://github.com/pytorch/pytorch/pull/80785
Approved by: https://github.com/jfix71
2022-07-05 23:39:13 +00:00
4d88affb5d Ported proxy tensor tests over to core (#78890)
Will fill out later
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78890
Approved by: https://github.com/ezyang, https://github.com/zou3519
2022-06-07 00:28:53 +00:00
50cadfae10 Add strictness check and made tensors into leaves if input tensors were leaves (#77474)
I think this makes sense to do? Otherwise, if you call `backward()` in your traced function, you can't get gradients out of any tensors that should have been leaves.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77474
Approved by: https://github.com/ezyang
2022-05-21 01:16:39 +00:00
ba0ca0f591 Add torch dispatch mode to ProxyTensor tracing (#77174)
Uses a mode for ProxyTensor tracing so that it traces factory functions as well
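A minimal mode sketch illustrating why mode-based dispatch captures factory functions (illustrative, not ProxyTensor itself):

```python
import torch
from torch.utils._python_dispatch import TorchDispatchMode

class LoggingMode(TorchDispatchMode):
    # A dispatch mode sees every aten call made while it is active --
    # including factory functions like randn, which a plain tensor subclass
    # misses because no subclass instance participates in the call yet.
    def __torch_dispatch__(self, func, types, args=(), kwargs=None):
        print(f"traced: {func}")
        return func(*args, **(kwargs or {}))

with LoggingMode():
    x = torch.randn(2)  # intercepted
    y = x + 1           # intercepted
```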

cc @dhruvbird
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77174
Approved by: https://github.com/ezyang
2022-05-19 19:53:57 +00:00
18e36a6295 [graph_manipulation] Set fused dtypes for all constant params/buffers (#77401)
Summary: We were handling constant attrs in a few different ways before, leading to confusion and missed handling for fused dtypes. This diff consolidates some of that code and fixes the current breakage.

Test Plan: CI. Recently broken tests now pass.

Differential Revision: D36335238

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77401
Approved by: https://github.com/jaybean-dev, https://github.com/jamesr66a
2022-05-17 07:42:29 +00:00
7311390d35 [WIP] Make constructor calls in experimental MetaTracer serializable
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76789

Approved by: https://github.com/pbelevich
2022-05-11 00:19:47 +00:00
023aafbcd7 Fix for normalizing signature for op overloads (#77182)
Previously, we were taking the `.op` from OpOverload/OpOverloadPacket and looking for a mapping in `_jit_builtins` for their signature. Those only exist for operators on the public API, not the overload packets, e.g. `torch.resize_as_`, not `torch.ops.aten.resize_as_` (at least in this case, and I'm pretty sure generally). The OpOverloads/OpOverloadPackets have schemas stored on them, so we can just use those directly.
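For instance, the schema is available directly on the overload:

```python
import torch

# OpOverloads carry their schema, so signature normalization can read it
# off the overload instead of consulting the JIT builtin table.
overload = torch.ops.aten.resize_as_.default
print(overload._schema)
```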
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77182
Approved by: https://github.com/anjali411
2022-05-10 23:36:26 +00:00
fc95eda285 Added proxy tensor
This is the `__torch_dispatch__` subclass used for tracing by AOTAutograd (https://github.com/pytorch/functorch/blob/main/functorch/_src/python_key.py).

Given that a couple of folks are now interested in using this infra, it seems like a good idea to put it in core, and focus our efforts on a single implementation.
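For orientation, the entry point this infra grew into in today's tree is `make_fx`; a minimal usage sketch:

```python
import torch
from torch.fx.experimental.proxy_tensor import make_fx

def f(x):
    return torch.relu(x) + 1

# make_fx runs f under the proxy tensor and records every aten op hit
# via __torch_dispatch__ into an FX graph.
gm = make_fx(f)(torch.randn(3))
print(gm.graph)
```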

I put this up as a WIP, just for discussion, but some questions off the top of my head.

1. What should be the intended way of extending this tracer? Should we define extension points, or should folks simply copy paste and modify? If we do define extension points, what are the extension points we should define?
2. There are some open questions about the way we're overriding FX to resolve some lingering issues (i.e. dealing with `nn.Parameter` and `call_module` calls). @ezyang implemented an alternate version of this tensor in https://github.com/albanD/subclass_zoo/blob/main/tracer_tensor.py, but it appears he ran into some issues with it that led to me submitting this implementation. That being said, I think some of the things over there should still be ported.
3. Given that this is going to be shared infra, what other features should we put in here? One that comes to mind is to allow for meta-tensor tracing (perhaps by default?), with a more solid fallback.

Some of the other implementations (for reference on requirements).

1. FX2TRT: D34868356 (internal only)
2. Edge's? @gmagogsfm

cc: @ezyang , @jamesr66a , @zou3519 , @gmagogsfm, @842974287
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74360
Approved by: https://github.com/ezyang
2022-05-03 22:46:30 +00:00