Summary:
Currently the cpp_extension build in benchmarks is misleading as it has the same name with torch.utils.cpp_extension
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58708
Test Plan:
Run from `./benchmarks/operator_benchmark/pt_extension` folder:
```
python setup.py install
python cpp_extension_test.py
```
Note: CI doesn't matter as currently benchmarks/ folder is not compiled/test against CI
Reviewed By: robieta
Differential Revision: D28585582
Pulled By: walterddr
fbshipit-source-id: fc071040cf3cb52ee6c9252b2c5a0c3043393f57
Summary:
As this diff shows, currently there are a couple hundred instances of raw `noqa` in the codebase, which just ignore all errors on a given line. That isn't great, so this PR changes all existing instances of that antipattern to qualify the `noqa` with respect to a specific error code, and adds a lint to prevent more of this from happening in the future.
Interestingly, some of the examples the `noqa` lint catches are genuine attempts to qualify the `noqa` with a specific error code, such as these two:
```
test/jit/test_misc.py:27: print(f"{hello + ' ' + test}, I'm a {test}") # noqa E999
test/jit/test_misc.py:28: print(f"format blank") # noqa F541
```
However, those are still wrong because they are [missing a colon](https://flake8.pycqa.org/en/3.9.1/user/violations.html#in-line-ignoring-errors), which actually causes the error code to be completely ignored:
- If you change them to anything else, the warnings will still be suppressed.
- If you add the necessary colons then it is revealed that `E261` was also being suppressed, unintentionally:
```
test/jit/test_misc.py:27:57: E261 at least two spaces before inline comment
test/jit/test_misc.py:28:35: E261 at least two spaces before inline comment
```
I did try using [flake8-noqa](https://pypi.org/project/flake8-noqa/) instead of a custom `git grep` lint, but it didn't seem to work. This PR is definitely missing some of the functionality that flake8-noqa is supposed to provide, though, so if someone can figure out how to use it, we should do that instead.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56272
Test Plan:
CI should pass on the tip of this PR, and we know that the lint works because the following CI run (before this PR was finished) failed:
- https://github.com/pytorch/pytorch/runs/2365189927
Reviewed By: janeyx99
Differential Revision: D27830127
Pulled By: samestep
fbshipit-source-id: d6dcf4f945ebd18cd76c46a07f3b408296864fcb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47767
This diff implements the functionality of running benchmark on mobile on top of operator_benchmark framework. It does so through a few steps:
1. create a scripted module from existing benchmark case.
2. run mobile specific optimization pass on the scripted module
3. run the scripted module on AiBench by calling its Python API
A small change in the way of writing a benchmark case is introduced so that both local and mobile run can share the same interface. The change is about having inputs as arguments of the `forward` function, so that mobile optimization pass can be run successfully (otherwise everything will be optimized away by constant propagation).
Test Plan:
## local op_bench run
buck run caffe2/benchmarks/operator_benchmark:benchmark_all_test -- --iterations 1 --warmup_iterations 1
buck run caffe2/benchmarks/operator_benchmark:benchmark_all_test -- --iterations 1 --warmup_iterations 1 --use_jit
Exceptions: `py_module` op in `FakeQuantizePerTensorBaseOpBenchmark` and `FakeQuantizePerChannelBaseOpBenchmark` under JIT mode. These tests also failed in the base version
```
RuntimeError:
Module 'FakeQuantizePerChannelOpBenchmark' has no attribute 'op_func' (This function exists as an attribute on the Python module, but we failed to compile it to a TorchScript function.
The error stack is reproduced here:
Python builtin <built-in method apply of FunctionMeta object at 0x619000c652a0> is currently not supported in Torchscript:
File "/data/users/wangyang19/fbsource/fbcode/buck-out/dev/gen/caffe2/benchmarks/operator_benchmark/pt/quantization_test#link-tree/quantization_test.py", line 260
quant_min: int, quant_max: int
):
return _LearnableFakeQuantizePerChannelOp.apply(input, scale, zero_point, axis, quant_min, quant_max, 1.0)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
:
File "/data/users/wangyang19/fbsource/fbcode/buck-out/dev/gen/caffe2/benchmarks/operator_benchmark/pt/quantization_test#link-tree/quantization_test.py", line 313
axis: int, quant_min: int, quant_max: int
):
return self.op_func(input, scale, zero_point, axis, quant_min, quant_max)
~~~~~~~~~~~~ <--- HERE
```
`_consume_op` typing mismatch: chunk, split, qobserver, sort in qunary. These will be fixed in D24774105
## OSS test
python3 -m benchmark_all_test --iterations 1 --warmup_iterations 1 --use_jit
python3 -m benchmark_all_test --iterations 1 --warmup_iterations 1
## saved module graph
```
module __torch__.mobile_benchmark_utils.OpBenchmarkMobile {
parameters {
}
attributes {
training = True
num_iters = 1
benchmark = <__torch__.pt.add_test.___torch_mangle_4.AddBenchmark object at 0x6070001b8b50>
}
methods {
method forward {
graph(%self : __torch__.mobile_benchmark_utils.OpBenchmarkMobile):
%12 : None = prim::Constant() # /data/users/wangyang19/fbsource/fbcode/buck-out/dev/gen/caffe2/benchmarks/operator_benchmark/fb/pt/mobile/benchmark_all_test_fbcode#link-tree/mobile_benchmark_utils.py:9:4
%4 : bool = prim::Constant[value=1]() # /data/users/wangyang19/fbsource/fbcode/buck-out/dev/gen/caffe2/benchmarks/operator_benchmark/fb/pt/mobile/benchmark_all_test_fbcode#link-tree/mobile_benchmark_utils.py:10:8
%1 : int = prim::GetAttr[name="num_iters"](%self)
= prim::Loop(%1, %4) # /data/users/wangyang19/fbsource/fbcode/buck-out/dev/gen/caffe2/benchmarks/operator_benchmark/fb/pt/mobile/benchmark_all_test_fbcode#link-tree/mobile_benchmark_utils.py:10:8
block0(%i : int):
%6 : __torch__.pt.add_test.___torch_mangle_4.AddBenchmark = prim::GetAttr[name="benchmark"](%self)
%7 : __torch__.pt.add_test.___torch_mangle_4.AddBenchmark = prim::GetAttr[name="benchmark"](%self)
%self.inputs_tuple : (Float(1, 1, 1, strides=[1, 1, 1], requires_grad=0, device=cpu), Float(1, 1, 1, strides=[1, 1, 1], requires_grad=0, device=cpu)) = prim::Constant[value=({0.48884}, {0.809042})]()
%9 : Tensor, %10 : Tensor = prim::TupleUnpack(%self.inputs_tuple)
%23 : int = prim::Constant[value=1]()
%24 : Tensor = aten::add(%9, %10, %23) # /data/users/wangyang19/fbsource/fbcode/buck-out/dev/gen/caffe2/benchmarks/operator_benchmark/fb/pt/mobile/benchmark_all_test_fbcode#link-tree/pt/add_test.py:39:15
-> (%4)
return (%12)
}
}
submodules {
module __torch__.pt.add_test.___torch_mangle_4.AddBenchmark {
parameters {
}
attributes {
mobile_optimized = True
}
methods {
method forward {
graph(%self : __torch__.pt.add_test.___torch_mangle_4.AddBenchmark,
%input_one.1 : Tensor,
%input_two.1 : Tensor):
%3 : int = prim::Constant[value=1]()
%4 : Tensor = aten::add(%input_one.1, %input_two.1, %3) # /data/users/wangyang19/fbsource/fbcode/buck-out/dev/gen/caffe2/benchmarks/operator_benchmark/fb/pt/mobile/benchmark_all_test_fbcode#link-tree/pt/add_test.py:39:15
return (%4)
}
method get_inputs {
graph(%self : __torch__.pt.add_test.___torch_mangle_4.AddBenchmark):
%self.inputs_tuple : (Float(1, 1, 1, strides=[1, 1, 1], requires_grad=0, device=cpu), Float(1, 1, 1, strides=[1, 1, 1], requires_grad=0, device=cpu)) = prim::Constant[value=({0.48884}, {0.809042})]()
return (%self.inputs_tuple)
}
}
submodules {
}
}
}
}
```
Reviewed By: kimishpatel
Differential Revision: D24322214
fbshipit-source-id: 335317eca4f40c4083883eb41dc47caf25cbdfd1
Summary:
Otherwise, I don't understand how those could have been invoked
Also, what is the benefit of importing the same module twice?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38832
Differential Revision: D21675081
Pulled By: malfet
fbshipit-source-id: fee5604c4c433161b6b1a999d505b5acbbc3b421
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28837
The JIT code used in op bench is not compatibility with latest JIT code path. This diff aims to resolve that issue.
Test Plan:
```buck run mode/opt //caffe2/benchmarks/operator_benchmark/pt:add_test -- --use_jit
Building: finished in 02:29.8 min (100%) 7055/7055 jobs, 1 updated
Total time: 02:30.3 min
# ----------------------------------------
# PyTorch/Caffe2 Operator Micro-benchmarks
# ----------------------------------------
# Tag : short
# Benchmarking PyTorch: add
# Mode: JIT
# Name: add_M64_N64_K64_cpu
# Input: M: 64, N: 64, K: 64, device: cpu
Forward Execution Time (us) : 118.052
Reviewed By: hl475
Differential Revision: D18197057
fbshipit-source-id: 92edae8a48abc4115a558a91ba46cc9c3edb2eb8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22309
This diff enables PT operators to run with JIT mode. Users can control eager and JIT mode using the `use_jit` flag.
In this diff, we are putting operators in a loop and passed it to JIT. One extra step which wraps the operator with the `_consume` op is introduced to avoid dead code elimination optimization in JIT. With that, the reported time includes the real operator execution time plus the `_consume` (directly return input, nothing else if happening inside) op.
Reviewed By: zheng-xq
Differential Revision: D16033082
fbshipit-source-id: e03be89fd5a505e44e81015dfc63db9cd76fb8a1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21210
This diff introduces a new path to run op with JIT. There are two steps involved here:
1. Users need to script the op. This should happen in the `init` method.
2. The generated graph from step1 is passed to `jit_forward` which will be executed by the benchmark backend
Reviewed By: zheng-xq
Differential Revision: D15460831
fbshipit-source-id: 48441d9cd4be5d0acebab901f45544616e6ed2ee
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21206
This diff change the default test_name to be a globally unique value across tests. With that, users can list all the tests and choose to run a specific test.
Reviewed By: zheng-xq
Differential Revision: D15543508
fbshipit-source-id: 0814ef6a60d41637fed5245e30c282497cf21bb8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21149
The diff modifies the interface for PyTorch operators in the benchmark suite
Reviewed By: zheng-xq
Differential Revision: D15433897
fbshipit-source-id: e858183431eb37d90313356716c2de8709372b58
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19518
Previous design needs to run the op benchmarks from PyTorch root directory which could lead to `module not found` error in OSS environment. This diff fixes that issue by making the benchmark to be launched in the `benchmarks` folder.
Reviewed By: ilia-cher
Differential Revision: D15020787
fbshipit-source-id: eb09814a33432a66cc857702bc86538cd17bea3b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19450
We want to make each operator benchmark as a separate binary. The previous way to run the benchmark is by collecting all operators into a single binary, it is unnecessary when we want to filter a specific operator. This diff aims to resolve that issue.
Reviewed By: ilia-cher
Differential Revision: D14808159
fbshipit-source-id: 43cd25b219c6e358d0cd2a61463b34596bf3bfac
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18740
Test utilities for writing Caffe2/PyTorch performance microbenchmarks. Brief description of the file structure
* benchmark_core.py : core utiltiites for running microbenchmark tests
* benchmark_caffe2.py : Caffe2 specific benchmark utilitites
* benchmark_pytorch.py: PyTorch specific benchmark utilities
* benchmark_runner.py : Main function. Currently it can run the microbenchmark tests in a stand-alone mode. The next step is to have this integrate with AI-PEP.
The utilities are located at https://github.com/pytorch/pytorch/tree/master/test to have access to both Caffe2/PyTorch Python's frontend.
Include two operator microbenchmarks; support both Caffe2/PyTorch:
* MatMul
* Add
Reference: PyTorch benchmarks : https://github.com/pytorch/benchmark/tree/master/timing/python. In this work, we start with two example binary operators MatMul and Add, but eventually we should to cover unary operators like in the PyTorch benchmark repo.
Reviewed By: zheng-xq
Differential Revision: D13887111
fbshipit-source-id: b7a56b95448c9ec3e674b0de0ffb96af4439bfce