Commit Graph

32 Commits

Author SHA1 Message Date
cyy
df458be4e5 [4/N] Apply py39 ruff and pyupgrade fixes (#143257)
`torch/fx/passes/annotate_getitem_nodes.py` was changed to support the new type hinting annotations.
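
A minimal sketch (hypothetical snippet, not taken from the changed file) of the kind of rewrite the py39 pyupgrade rules apply, where `typing` generics become PEP 585 builtin generics:

```python
from typing import List

# Before: typing-module generics
def getitem_indices_old(indices: List[int]) -> List[int]:
    return sorted(indices)

# After: builtin generics, valid on Python 3.9+
def getitem_indices_new(indices: list[int]) -> list[int]:
    return sorted(indices)
```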

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143257
Approved by: https://github.com/justinchuby, https://github.com/albanD
2025-01-04 10:47:51 +00:00
d8c8ba2440 Fix unused Python variables in test/[e-z]* (#136964)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136964
Approved by: https://github.com/justinchuby, https://github.com/albanD
2024-12-18 23:02:30 +00:00
221350e3a4 Add None return type to init -- tests (#132352)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132352
Approved by: https://github.com/ezyang
ghstack dependencies: #132335, #132351
2024-08-01 15:44:51 +00:00
8b08b0f340 [BE] enable ruff rule Q from flake8-quotes (#127713)
Enable [ruff rule `Q`](https://docs.astral.sh/ruff/rules/#flake8-quotes-q) from flake8-quotes. Fixes:

- [avoidable-escaped-quote (Q003)](https://docs.astral.sh/ruff/rules/avoidable-escaped-quote/#avoidable-escaped-quote-q003)
- [unnecessary-escaped-quote (Q004)](https://docs.astral.sh/ruff/rules/unnecessary-escaped-quote/#unnecessary-escaped-quote-q004)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127713
Approved by: https://github.com/ezyang
2024-06-02 23:25:26 +00:00
cac36e232e [PyTorch] Split StaticModule out of test_static_runtime (#121028)
I want to use StaticModule in another (internal) test, so I'm splitting it out.

Differential Revision: [D54384817](https://our.internmc.facebook.com/intern/diff/D54384817/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121028
Approved by: https://github.com/suo
2024-03-05 23:14:07 +00:00
d9ad7ac390 Skip test_fork_wait_4 and test_fork_wait_4_async (#112743)
Fixes #109782

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112743
Approved by: https://github.com/jbschlosser
2023-11-02 20:46:29 +00:00
046e88a291 [BE] [3/3] Rewrite super() calls in test (#94592)
Rewrite Python built-in class `super()` calls. Only non-semantic changes should be applied.

- #94587
- #94588
- #94592

Also, methods with only a `super()` call are removed:

```diff
class MyModule(nn.Module):
-   def __init__(self):
-       super().__init__()
-
    def forward(self, ...):
        ...
```
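
The core rewrite itself, sketched on a hypothetical module (the diff above only shows the removal of now-trivial `__init__` methods):

```python
import torch.nn as nn

class MyModule(nn.Module):
    def __init__(self):
        # Before: Python 2 style, explicit class and instance arguments
        # super(MyModule, self).__init__()
        # After: Python 3 zero-argument form
        super().__init__()

    def forward(self, x):
        return x
```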

Some cases that change the semantics should be kept unchanged. E.g.:

f152a79be9/caffe2/python/net_printer.py (L184-L190)

f152a79be9/test/test_jit_fuser_te.py (L2628-L2635)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94592
Approved by: https://github.com/ezyang, https://github.com/seemethere
2023-02-12 22:20:53 +00:00
fefdad6137 [Static Runtime] test case for staticRuntime::runAsync() API (#80407)
Summary:
- Python interface to call StaticRuntime::runAsync API
- creates a custom executor with execution on inter-op thread pool
- test cases for different async graph scenarios like multiple forks, nested forks, exception handling

Test Plan:
- local tests
    buck test mode/opt caffe2/test:static_runtime
    buck test mode/opt caffe2/benchmarks/static_runtime/fb:test_fb_operators
    buck test mode/opt caffe2/benchmarks/static_runtime:static_runtime_cpptest
- OSS CI tests

Differential Revision: D37471859

Pull Request resolved: https://github.com/pytorch/pytorch/pull/80407
Approved by: https://github.com/tenpercent
2022-07-05 23:40:53 +00:00
49368d9848 [Static Runtime] Added nested prim::fork and aten::wait test case (#79746)
Summary:
- Added test case for code coverage of fork and wait operator scenarios
- Nested fork operations and their corresponding wait calls

Differential Revision: D37221551

Pull Request resolved: https://github.com/pytorch/pytorch/pull/79746
Approved by: https://github.com/tenpercent
2022-06-20 16:32:17 +00:00
65a37923f9 [Static Runtime] Exception handling during fork subgraph execution (#79292)
Summary:
- Exception handling was not performed during forked subgraph execution.
- The forked subgraph can throw a runtime exception during execution; the Future returned by prim::fork needs to capture it so that aten::wait can propagate it.
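
A sketch of the behavior being covered (an assumed test shape, not copied from this diff): an exception raised inside a forked subgraph should surface from `aten::wait` rather than being lost by the executor.

```python
import torch

@torch.jit.script
def child(x: torch.Tensor) -> torch.Tensor:
    if bool(x.sum() > 0):
        raise RuntimeError("failure inside the forked subgraph")
    return x

@torch.jit.script
def fork_and_wait(x: torch.Tensor) -> torch.Tensor:
    fut = torch.jit.fork(child, x)   # prim::fork
    return torch.jit.wait(fut)       # aten::wait should rethrow the exception

try:
    fork_and_wait(torch.ones(3))
except Exception as err:
    print("caught:", type(err).__name__, err)
```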

Test Plan:
local test cases:
  - buck test caffe2/benchmarks/static_runtime/fb:test_fb_operators
  - buck test mode/opt caffe2/benchmarks/static_runtime:static_runtime_cpptest
  - buck test mode/opt caffe2/test:static_runtime

Async execution of the subgraph is tested by adding PyTorch profiler hooks around the StaticRuntime execution via the code below. Async execution in the thread pool is verified by checking the trace.
```
with profile(activities=[ProfilerActivity.CPU]) as prof:
    static_runtime_module(inputs)
prof.export_chrome_trace("trace.json")
```

Differential Revision: D37072493

Pull Request resolved: https://github.com/pytorch/pytorch/pull/79292
Approved by: https://github.com/mikeiovine
2022-06-11 03:11:49 +00:00
720cb5023a [Static Runtime] Implement prim::Fork and aten::wait (#78780)
Summary:
basic implementation of prim::fork and aten::wait

- current implementation uses interpreter to call the forked subgraph
- interpreter call to be replaced in future
- Added custom test cases for fork/wait procedures in the graph
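
For reference, a minimal TorchScript function (an illustrative sketch, not one of the added test cases) whose graph contains the two ops being implemented:

```python
import torch

def child(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    return a + b

@torch.jit.script
def fork_wait(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    fut = torch.jit.fork(child, a, b)    # lowers to prim::fork
    other = a * b                        # runs while the fork may execute elsewhere
    return other + torch.jit.wait(fut)   # lowers to aten::wait

print(fork_wait.graph)  # the printed IR contains prim::fork(...) and aten::wait(...)
```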

Test Plan:
Custom tests are created in test_static_runtime.py to verify Static Runtime output against the reference PyTorch output.

test command
- buck run caffe2/test:static_runtime
- buck run caffe2/benchmarks/static_runtime:static_runtime_cpptest
- buck test caffe2/benchmarks/static_runtime/fb:test_fb_operators

Differential Revision: D36881214

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78780
Approved by: https://github.com/tenpercent
2022-06-03 23:39:04 +00:00
b26801276f Fix quicklints (#72256)
Summary:
Regression introduced by https://github.com/pytorch/pytorch/pull/71854

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72256

Reviewed By: dagitses

Differential Revision: D33978449

Pulled By: malfet

fbshipit-source-id: 32c99cc95f0e1011a5d241875d47a78f6c869e0e
(cherry picked from commit ffa0aa8530c5ca8e0a17634d6e3de0c88e674ed6)
2022-02-03 15:17:35 +00:00
0bb3158eae [SR] Implement prim::CreateObject (#71854)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71854

Support `prim::CreateObject` - this is a native interpreter instruction, so we can't fall back to the JIT for this op.

Test Plan: New unit test exercises creating and modifying custom objects
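
A sketch (assumed shape, not the actual unit test) of a TorchScript program whose graph contains prim::CreateObject — instantiating a TorchScript class inside scripted code:

```python
import torch

@torch.jit.script
class Counter:
    def __init__(self, start: int):
        self.value = start

    def bump(self, by: int) -> int:
        self.value = self.value + by
        return self.value

@torch.jit.script
def use_counter(start: int) -> int:
    c = Counter(start)   # instantiation lowers to prim::CreateObject + __init__
    c.bump(1)            # modify the custom object
    return c.bump(2)

print(use_counter.graph)  # contains a prim::CreateObject node
```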

Reviewed By: d1jang

Differential Revision: D33783759

fbshipit-source-id: 8185ff71b5d441597d712a5d4aab7fc4dddf7034
(cherry picked from commit bd3f52d8e2cd8e20a8d66e2d2b802c1d92088e4e)
2022-02-03 12:18:46 +00:00
4635f5711f [static runtime][dper] multi_env tests for static runtime: selective enable (#67467)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67467

Unit tests for static runtime in the dper multi-env tests for CPU and scripted (including fx-traced + scripted) models. For now, this is only turned on for the single_operators_tests that are part of the inline_cvr local/local_ro/remote_ro models.

Will have another diff that turns this on by default and explicitly disables for certain tests.

Test Plan: buck test dper3/dper3/modules/low_level_modules/tests:single_operators_test

Reviewed By: hlu1, houseroad

Differential Revision: D30870488

fbshipit-source-id: 382daec8dbcb95135cdd43e7b84a1d23b445d27c
2021-11-18 01:04:12 -08:00
6259601c8a Set test owners for tests with unknown owners (#67552)
Summary:
Action following https://github.com/pytorch/pytorch/issues/66232

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67552

Reviewed By: jbschlosser

Differential Revision: D32028248

Pulled By: janeyx99

fbshipit-source-id: a006f7026288b7126dba58b31cac28e10ce0fed6
2021-10-29 12:42:01 -07:00
a0495b3cdb [SR] Remove unused operator() overload (#67001)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67001

The overload of `operator()` taking `std::vector<at::Tensor>` was only used for testing. In a diff following this one, I will add a new overload that takes `std::vector<c10::IValue> args` and no `kwargs` so we can avoid default-constructing `kwargs` everywhere.

This new overload will probably take a forwarding reference, so to avoid problems with overloading on forwarding reference and simplify the interface, it's best to remove this unused one.

Test Plan:
`buck test caffe2/benchmarks/static_runtime/...`

`buck test caffe2/test:static_runtime`

Reviewed By: hlu1

Differential Revision: D31821990

fbshipit-source-id: 6d2e4a75ca4abe6e262651532eb96c3b274c6f4a
2021-10-25 08:18:58 -07:00
99203580a9 Updates internal assert_allclose callsites in favor of assert_close (#61841)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61841

Redo of #60863.

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D30408145

Pulled By: mruberry

fbshipit-source-id: 0b34ebc7f23ba38ecd89640b61d8aca59b7eab58
2021-08-19 12:50:41 -07:00
ccd0977060 [Static Runtime] Support prim::GetAttr/SetAttr (#61505)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61505

The handling of `self` in Static Runtime was previously incorrect. This diff fixes that issue, since `self` is essential to prim::GetAttr/SetAttr: most of the time we're getting and setting attributes on `self`, the TorchScript module.
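
A small sketch (illustrative, not from the diff) of a scripted module whose forward both reads and writes an attribute on `self`, which is what lowers to prim::GetAttr/prim::SetAttr:

```python
import torch
import torch.nn as nn

class StatefulModule(nn.Module):
    def __init__(self):
        super().__init__()
        self.calls = 0  # mutable module attribute

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        self.calls = self.calls + 1   # prim::SetAttr on self
        return x * self.calls         # prim::GetAttr on self

m = torch.jit.script(StatefulModule())
print(m.graph)  # the graph takes `self` as its first input
```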

Reviewed By: ajyu

Differential Revision: D29350173

fbshipit-source-id: 6e62add4cda517ef8cd6c315d4cb0595e7d531fb
2021-07-10 14:06:06 -07:00
0521e420fd [Static Runtime] Temporarily disable fusion tests (#55342)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55342

The fusion stuff is pretty hard to debug. Given that we're not shipping this part of the stack any time soon, let's temporarily disable these tests and re-enable them when somebody has the cycles to debug them.

Test Plan: Verified that the tests are now disabled

Reviewed By: ajyu

Differential Revision: D27578573

fbshipit-source-id: cb8d7c9339f7c1700b7653b0231cf570996995ff
2021-04-05 20:54:02 -07:00
56f8379802 [static runtime] Move all heavy constructor logic into InferenceModule (renamed to StaticModule) (#51564)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51564

Constructor logic was spread throughout InferenceModule and StaticRuntime. This diff unifies the two. After a lot of discussion on D25961626, it became apparent that `clone` is uglier than a cheap StaticRuntime.

This means the old StaticRuntime is effectively StaticModule, and the only code in the new StaticRuntime is the `run` functions.

```
graph, schema = PrepareForStaticModule(torchscript_module)
sm = StaticModule(graph, schema, options)
sm(inputs)
# or create many cheap runtimes with the module
sr = StaticRuntime(sm)
sr(inputs)
```

Changelist:
- Rename InferenceModule StaticModule
- Move all logic for construction into StaticModule
- Create a new StaticRuntime that only has a unique memory planner (everything else is in StaticModule)
- Update comments with explanation
- Propagate all changes to predictor integration
- Propagate all changes to python integration
- Change semantics to be a bit more PyTorch-standard (no "run" calls, no "get_" getters).

Test Plan:
buck test //caffe2/test:static_runtime
buck test caffe2/benchmarks/static_runtime:static_runtime_cpptest

Reviewed By: hlu1

Differential Revision: D25592967

fbshipit-source-id: 8233bed03137ce129137af2d44bce0095033ef0f
2021-03-05 10:15:26 -08:00
a4383a69d4 Clean up some type annotations in caffe2/test (#49943)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49943

Upgrades type annotations from Python2 to Python3
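
The change is mechanical; a hypothetical before/after pair showing the annotation styles involved:

```python
import torch

# Python 2 style: signature described in a type comment
def scale_old(x, factor):
    # type: (torch.Tensor, float) -> torch.Tensor
    return x * factor

# Python 3 style: inline annotations
def scale_new(x: torch.Tensor, factor: float) -> torch.Tensor:
    return x * factor
```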

Test Plan: Sandcastle tests

Reviewed By: xush6528

Differential Revision: D25717534

fbshipit-source-id: 5aedea4db07efca126ffb6daee79617c30a67146
2021-01-13 10:01:55 -08:00
f4226b5c90 [static runtime] add static subgraph fusion pass (#49185)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49185

This diff adds a fusion feature that will let us use static runtime for *parts* of the graph. This will prove useful in cases where fully eliminating control flow is hard, etc.

TODO:
[x] factor out into separate fusion file
[x] add python test case
[x] add graph that isn't fully lowered test case
[x] add graph that has weird list/tuple outputs test case

the loop example looks quite good:
```
graph(%a.1 : Tensor,
      %b.1 : Tensor,
      %iters.1 : int):
  %12 : bool = prim::Constant[value=1]() # /data/users/bwasti/fbsource/fbcode/buck-out/dev/gen/caffe2/test/static_runtime#binary,link-tree/test_static_runtime.py:110:4
  %c.2 : Tensor = prim::StaticSubgraph_0(%a.1, %b.1)
  %c : Tensor = prim::Loop(%iters.1, %12, %c.2) # /data/users/bwasti/fbsource/fbcode/buck-out/dev/gen/caffe2/test/static_runtime#binary,link-tree/test_static_runtime.py:110:4
    block0(%i : int, %c.12 : Tensor):
      %c.10 : Tensor = prim::StaticSubgraph_1(%a.1, %c.12, %b.1)
      -> (%12, %c.10)
  return (%c)
with prim::StaticSubgraph_0 = graph(%0 : Tensor,
      %4 : Tensor):
  %5 : int = prim::Constant[value=2]()
  %6 : Tensor = aten::mul(%4, %5) # /data/users/bwasti/fbsource/fbcode/buck-out/dev/gen/caffe2/test/static_runtime#binary,link-tree/test_static_runtime.py:109:12
  %2 : int = prim::Constant[value=1]()
  %c.2 : Tensor = aten::add(%0, %6, %2) # /data/users/bwasti/fbsource/fbcode/buck-out/dev/gen/caffe2/test/static_runtime#binary,link-tree/test_static_runtime.py:109:8
  return (%c.2)
with prim::StaticSubgraph_1 = graph(%1 : Tensor,
      %7 : Tensor,
      %8 : Tensor):
  %9 : int = prim::Constant[value=1]()
  %c.4 : Tensor = aten::add(%7, %8, %9) # /data/users/bwasti/fbsource/fbcode/buck-out/dev/gen/caffe2/test/static_runtime#binary,link-tree/test_static_runtime.py:111:12
  %5 : int = prim::Constant[value=2]()
  %c.7 : Tensor = aten::mul_(%c.4, %5) # /data/users/bwasti/fbsource/fbcode/buck-out/dev/gen/caffe2/test/static_runtime#binary,link-tree/test_static_runtime.py:112:8
  %2 : int = prim::Constant[value=1]()
  %c.10 : Tensor = aten::sub_(%c.7, %1, %2) # /data/users/bwasti/fbsource/fbcode/buck-out/dev/gen/caffe2/test/static_runtime#binary,link-tree/test_static_runtime.py:113:8
  return (%c.10)
```
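
For readability, a rough Python reconstruction of the scripted function behind this graph (inferred from the line/column references in the IR, so treat it as an approximation):

```python
import torch

@torch.jit.script
def loop_example(a: torch.Tensor, b: torch.Tensor, iters: int) -> torch.Tensor:
    c = a + b * 2              # fused into prim::StaticSubgraph_0
    for i in range(iters):     # the prim::Loop itself stays in the outer graph
        c = c + b              # the loop body is fused into prim::StaticSubgraph_1
        c *= 2
        c -= a
    return c
```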

(Note: this ignores all push blocking failures!)

Test Plan:
buck test mode/no-gpu //caffe2/benchmarks/static_runtime:static_runtime_cpptest

buck test mode/no-gpu caffe2/test:static_runtime

Reviewed By: bertmaher

Differential Revision: D25385702

fbshipit-source-id: 2f24af4f11d92a959167facd03fbd24f464a6098
2020-12-10 14:03:11 -08:00
fe7d1d7d0e Add LeakyReLU operator to static runtime (#47798)
Summary:
- Add LeakyReLU operator to static runtime
- Add LeakyReLU benchmark
- Add LeakyReLU correctness test case

Static Runtime
```
------------------------------------------------------------------------------
Benchmark                                       Time           CPU Iterations
------------------------------------------------------------------------------
BM_leaky_relu/1                              4092 ns       4092 ns     172331
BM_leaky_relu/8                              4425 ns       4425 ns     158434
BM_leaky_relu/20                             4830 ns       4830 ns     145335
BM_leaky_relu_const/1                        3545 ns       3545 ns     198054
BM_leaky_relu_const/8                        3825 ns       3825 ns     183074
BM_leaky_relu_const/20                       4222 ns       4222 ns     165999
```

Interpreter
```
------------------------------------------------------------------------------
Benchmark                                       Time           CPU Iterations
------------------------------------------------------------------------------
BM_leaky_relu/1                              7183 ns       7182 ns      96377
BM_leaky_relu/8                              7580 ns       7580 ns      91588
BM_leaky_relu/20                             8066 ns       8066 ns      87183
BM_leaky_relu_const/1                        6466 ns       6466 ns     107925
BM_leaky_relu_const/8                        7063 ns       7063 ns      98768
BM_leaky_relu_const/20                       7380 ns       7380 ns      94564
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47798

Reviewed By: ezyang

Differential Revision: D24927043

Pulled By: kavoor

fbshipit-source-id: 69b12cc57f725f1dc8d68635788813710a74dc2b
2020-11-13 22:05:52 -08:00
996f444c00 [pt][static_runtime] Memory model (#46896)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46896

The idea of the memory model is quite similar to that of BlackBoxPredictor; however, it's more complicated in pt due to:
1) tensor views that share storage (with storage refcount bumps) but have different TensorImpls;
2) tensors sharing the same TensorImpl and the same storage, but with no refcount bump of the StorageImpl;
3) data types such as TensorList and Tuples that have Tensors in them;
4) the need to support a non-out/out variant mix while we move the aten ops to out variants.

As a result, I have to make the following adjustments:
1) remove tensors in output Tuples from internal blob list;
2) for memory allocation/deallocation, get candidate Tensors from the outputs of ops with out variant, extract StorageImpls from the Tensors, dedup, and remove output tensor StorageImpls, and get the final list of blobs for memory planning;
3) during the clean_up_memory pass, clean up memory held by the StorageImpls as well as Tensors/Lists/Tuples in IValues that don't participate in memory planning to reduce overall memory usage
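
A small eager-mode illustration (not part of the diff) of points 1) and 2) above, which are why candidate tensors are deduped by StorageImpl rather than by tensor identity:

```python
import torch

a = torch.randn(4, 4)

# 1) A view: a different TensorImpl that shares the same storage
#    (the StorageImpl refcount is bumped).
v = a.view(16)
assert v is not a                                         # distinct TensorImpls
assert v.storage().data_ptr() == a.storage().data_ptr()  # shared StorageImpl

# 2) Another reference to the same tensor: same TensorImpl, same storage,
#    no additional StorageImpl refcount bump.
alias = a
assert alias is a

# Deduping by storage pointer collapses all three names to one allocation.
unique_storages = {t.storage().data_ptr() for t in (a, v, alias)}
assert len(unique_storages) == 1
```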

Risk:
The PyTorch team is planning to deprecate the current resize_output API, which we do rely on. This is a pretty big risk.

https://www.internalfb.com/intern/diffusion/FBS/browsefile/master/fbcode/caffe2/aten/src/ATen/native/Resize.cpp?commit=6457b329847607553d34e788a3a7092f41f38895&lines=9-23

Test Plan:
```
buck test //caffe2/test:static_runtime
buck test //caffe2/benchmarks/static_runtime:static_runtime_cpptest
buck test //caffe2/caffe2/fb/predictor:pytorch_predictor_test
```
Benchmarks:
```
MKL_NUM_THREADS=1 OMP_NUM_THREADS=1 numactl -m 0 -C 13 \
buck-out/opt/gen/caffe2/caffe2/fb/predictor/ptvsc2_predictor_bench \
--scripted_model=/home/hlu/ads/adindexer/adindexer_ctr_mobilefeed/pt/merge/traced_precomputation.pt \
--pt_inputs=/home/hlu/ads/adindexer/adindexer_ctr_mobilefeed/pt/merge/container_precomputation_bs1.pt \
--iters=1000 --warmup_iters=10000 --num_threads=1 --pt_enable_static_runtime=true \
--pt_cleanup_activations=true --pt_enable_out_variant=false
```

| pt_cleanup_activations | pt_enable_out_variant | old ms/iter | new ms/iter |
|---|---|---|---|
| 0 | 0 | 0.31873 | 0.30228 |
| 0 | 1 | 0.30018 | 0.29184 |
| 1 | 0 | 0.35246 | 0.31895 |
| 1 | 1 | 0.35742 | 0.30417 |

Reviewed By: bwasti, raziel

Differential Revision: D24471854

fbshipit-source-id: 4ac37dca7d2a0c362120a7f02fd3995460c9a55c
2020-11-03 23:47:59 -08:00
e8d8de32b4 [StaticRuntime] Implement StaticRuntime::benchmark (#45639)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45639

`StaticRuntime::run_individual` mimics the caffe2 operator benchmark `SimpleNet::TEST_Benchmark`, so we can get accurate information on the operator breakdown. We found that the PyTorch AutogradProfiler adds a lot of overhead to small models, such as the adindexer precomputation_merge net: 100% for batch_size 1, 33% for batch_size 20. This implementation adds very little overhead, as shown in the test plan.

Test Plan: Test results are fb internal only.

Reviewed By: yinghai, dzhulgakov

Differential Revision: D24012088

fbshipit-source-id: f32eb420aace93e2de421a15e4209fce6a3d90f0
2020-10-06 20:54:43 -07:00
87b356d093 [static runtime] Split out graph preparation from runtime (#44131)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44131

Test Plan: Imported from OSS

Reviewed By: hlu1

Differential Revision: D23604305

Pulled By: bwasti

fbshipit-source-id: 7b47da4961d99074199417ef1407a788c7d80ee6
2020-09-28 13:01:23 -07:00
d1a11618f5 [static runtime] Add _out variants and reuse memory (#44128)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44128

Test Plan: Imported from OSS

Reviewed By: hlu1

Differential Revision: D23604304

Pulled By: bwasti

fbshipit-source-id: 06a23cb75700a0fc733069071843b7b498e7b9e9
2020-09-25 11:03:06 -07:00
a475613d1d [static runtime] Swap to out-variant compatible nodes (#44127)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44127

Test Plan: Imported from OSS

Reviewed By: hlu1

Differential Revision: D23604306

Pulled By: bwasti

fbshipit-source-id: 18ccfb9b466b822e28130be3d5c4fae36c76820b
2020-09-14 12:38:25 -07:00
8538a79bfe [jit][static] Basic executor (#43647)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43647

Nothing fancy, just a basic implementation of the graph executor without using a stack machine.

Reviewed By: bwasti

Differential Revision: D23208413

fbshipit-source-id: e483bb6ad7ba8591bbe1767e669654d82f42c356
2020-08-28 23:20:07 -07:00
8864148823 [jit] DeepAndWide benchmark (#43096)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43096

Add benchmark script for deep and wide model.

Reviewed By: bwasti, yinghai

Differential Revision: D23099925

fbshipit-source-id: aef09d8606eba1eccc0ed674dfea59b890d3648b
2020-08-15 01:27:12 -07:00
523b2ce9c6 [jit][static runtime] Simplify the graph and add operator whitelist (#43024)
Summary:
This PR adds an operator whitelist and simplifies graphs to help with development later on. Key to note in this PR is the use of both pattern substitution and the registration of custom operators. This will likely be one of the main types of optimization done in this folder.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/43024

Reviewed By: hlu1

Differential Revision: D23114262

Pulled By: bwasti

fbshipit-source-id: e25aa3564dcc8a2b48cfd1561b3ee2a4780ae462
2020-08-13 20:19:55 -07:00
ada8404f2d [jit] Scaffold a static runtime (#42753)
Summary:
The premise of this approach is that a small subset of neural networks is well represented by a data flow graph.  The README contains more information.

The name is subject to change, but I thought it was a cute reference to fire.

suo, let me know if you'd prefer this in a different spot.  Since it lowers a JIT'd module directly, I assumed the JIT folder would be appropriate.  There is no exposed Python interface yet (but one is mocked up in `test_accelerant.py`).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/42753

Reviewed By: zou3519

Differential Revision: D23043771

Pulled By: bwasti

fbshipit-source-id: 5353731e3aae31c08b5b49820815da98113eb551
2020-08-12 13:05:27 -07:00