Commit Graph

25 Commits

Author SHA1 Message Date
8d45f555d7 [BE] [1/3] Rewrite super() calls in caffe2 and benchmarks (#94587)
Rewrite Python built-in class `super()` calls. Only non-semantic changes should be applied.

- #94587
- #94588
- #94592

Also, methods with only a `super()` call are removed:

```diff
class MyModule(nn.Module):
-   def __init__(self):
-       super().__init__()
-
    def forward(self, ...):
        ...
```

Some cases that change the semantics should be kept unchanged. E.g.:

f152a79be9/caffe2/python/net_printer.py (L184-L190)

f152a79be9/test/test_jit_fuser_te.py (L2628-L2635)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94587
Approved by: https://github.com/ezyang
2023-02-11 18:19:48 +00:00
8fce9a09cd [BE]: pyupgrade Python to 3.8 - imports and object inheritance only (#94308)
Apply parts of pyupgrade to torch (starting with the safest changes).
This PR only does two things: removes the need to inherit from object and removes unused future imports.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94308
Approved by: https://github.com/ezyang, https://github.com/albanD
2023-02-07 21:10:56 +00:00
277f587496 rename benchmark_cpp_extension (#58708)
Summary:
Currently the cpp_extension build in benchmarks is misleading as it has the same name with torch.utils.cpp_extension

Pull Request resolved: https://github.com/pytorch/pytorch/pull/58708

Test Plan:
Run from `./benchmarks/operator_benchmark/pt_extension` folder:
```
python setup.py install
python cpp_extension_test.py
```

Note: CI doesn't matter as currently benchmarks/ folder is not compiled/test against CI

Reviewed By: robieta

Differential Revision: D28585582

Pulled By: walterddr

fbshipit-source-id: fc071040cf3cb52ee6c9252b2c5a0c3043393f57
2021-05-24 11:04:02 -07:00
e3900d2ba5 Add lint for unqualified noqa (#56272)
Summary:
As this diff shows, currently there are a couple hundred instances of raw `noqa` in the codebase, which just ignore all errors on a given line. That isn't great, so this PR changes all existing instances of that antipattern to qualify the `noqa` with respect to a specific error code, and adds a lint to prevent more of this from happening in the future.

Interestingly, some of the examples the `noqa` lint catches are genuine attempts to qualify the `noqa` with a specific error code, such as these two:
```
test/jit/test_misc.py:27:            print(f"{hello + ' ' + test}, I'm a {test}") # noqa E999
test/jit/test_misc.py:28:            print(f"format blank") # noqa F541
```
However, those are still wrong because they are [missing a colon](https://flake8.pycqa.org/en/3.9.1/user/violations.html#in-line-ignoring-errors), which actually causes the error code to be completely ignored:

- If you change them to anything else, the warnings will still be suppressed.
- If you add the necessary colons then it is revealed that `E261` was also being suppressed, unintentionally:
  ```
  test/jit/test_misc.py:27:57: E261 at least two spaces before inline comment
  test/jit/test_misc.py:28:35: E261 at least two spaces before inline comment
  ```

I did try using [flake8-noqa](https://pypi.org/project/flake8-noqa/) instead of a custom `git grep` lint, but it didn't seem to work. This PR is definitely missing some of the functionality that flake8-noqa is supposed to provide, though, so if someone can figure out how to use it, we should do that instead.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/56272

Test Plan:
CI should pass on the tip of this PR, and we know that the lint works because the following CI run (before this PR was finished) failed:

- https://github.com/pytorch/pytorch/runs/2365189927

Reviewed By: janeyx99

Differential Revision: D27830127

Pulled By: samestep

fbshipit-source-id: d6dcf4f945ebd18cd76c46a07f3b408296864fcb
2021-04-19 13:16:18 -07:00
8ff0b6fef8 [OpBenchMobile] Enable operator_benchmark to run the benchmark on mobile through AiBench (#47767)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47767

This diff implements the functionality of running benchmark on mobile on top of operator_benchmark framework. It does so through a few steps:

1. create a scripted module from existing benchmark case.
2. run mobile specific optimization pass on the scripted module
3. run the scripted module on AiBench by calling its Python API

A small change in the way of writing a benchmark case is introduced so that both local and mobile run can share the same interface. The change is about having inputs as arguments of the `forward` function, so that mobile optimization pass can be run successfully (otherwise everything will be optimized away by constant propagation).

Test Plan:
## local op_bench run

buck run caffe2/benchmarks/operator_benchmark:benchmark_all_test --  --iterations 1 --warmup_iterations 1

buck run caffe2/benchmarks/operator_benchmark:benchmark_all_test --  --iterations 1 --warmup_iterations 1 --use_jit

Exceptions: `py_module` op in `FakeQuantizePerTensorBaseOpBenchmark` and `FakeQuantizePerChannelBaseOpBenchmark` under JIT mode. These tests also failed in the base version

```
RuntimeError:
Module 'FakeQuantizePerChannelOpBenchmark' has no attribute 'op_func' (This function exists as an attribute on the Python module, but we failed to compile it to a TorchScript function.
The error stack is reproduced here:

Python builtin <built-in method apply of FunctionMeta object at 0x619000c652a0> is currently not supported in Torchscript:
  File "/data/users/wangyang19/fbsource/fbcode/buck-out/dev/gen/caffe2/benchmarks/operator_benchmark/pt/quantization_test#link-tree/quantization_test.py", line 260
    quant_min: int, quant_max: int
):
    return _LearnableFakeQuantizePerChannelOp.apply(input, scale, zero_point, axis, quant_min, quant_max, 1.0)
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
:
  File "/data/users/wangyang19/fbsource/fbcode/buck-out/dev/gen/caffe2/benchmarks/operator_benchmark/pt/quantization_test#link-tree/quantization_test.py", line 313
        axis: int, quant_min: int, quant_max: int
    ):
        return self.op_func(input, scale, zero_point, axis, quant_min, quant_max)
               ~~~~~~~~~~~~ <--- HERE
```

`_consume_op` typing mismatch: chunk, split, qobserver, sort in qunary. These will be fixed in D24774105

## OSS test

python3 -m benchmark_all_test --iterations 1 --warmup_iterations 1 --use_jit
python3 -m benchmark_all_test --iterations 1 --warmup_iterations 1

## saved module graph
```
module __torch__.mobile_benchmark_utils.OpBenchmarkMobile {
  parameters {
  }
  attributes {
    training = True
    num_iters = 1
    benchmark = <__torch__.pt.add_test.___torch_mangle_4.AddBenchmark object at 0x6070001b8b50>
  }
  methods {
    method forward {
      graph(%self : __torch__.mobile_benchmark_utils.OpBenchmarkMobile):
        %12 : None = prim::Constant() # /data/users/wangyang19/fbsource/fbcode/buck-out/dev/gen/caffe2/benchmarks/operator_benchmark/fb/pt/mobile/benchmark_all_test_fbcode#link-tree/mobile_benchmark_utils.py:9:4
        %4 : bool = prim::Constant[value=1]() # /data/users/wangyang19/fbsource/fbcode/buck-out/dev/gen/caffe2/benchmarks/operator_benchmark/fb/pt/mobile/benchmark_all_test_fbcode#link-tree/mobile_benchmark_utils.py:10:8
        %1 : int = prim::GetAttr[name="num_iters"](%self)
         = prim::Loop(%1, %4) # /data/users/wangyang19/fbsource/fbcode/buck-out/dev/gen/caffe2/benchmarks/operator_benchmark/fb/pt/mobile/benchmark_all_test_fbcode#link-tree/mobile_benchmark_utils.py:10:8
          block0(%i : int):
            %6 : __torch__.pt.add_test.___torch_mangle_4.AddBenchmark = prim::GetAttr[name="benchmark"](%self)
            %7 : __torch__.pt.add_test.___torch_mangle_4.AddBenchmark = prim::GetAttr[name="benchmark"](%self)
            %self.inputs_tuple : (Float(1, 1, 1, strides=[1, 1, 1], requires_grad=0, device=cpu), Float(1, 1, 1, strides=[1, 1, 1], requires_grad=0, device=cpu)) = prim::Constant[value=({0.48884}, {0.809042})]()
            %9 : Tensor, %10 : Tensor = prim::TupleUnpack(%self.inputs_tuple)
            %23 : int = prim::Constant[value=1]()
            %24 : Tensor = aten::add(%9, %10, %23) # /data/users/wangyang19/fbsource/fbcode/buck-out/dev/gen/caffe2/benchmarks/operator_benchmark/fb/pt/mobile/benchmark_all_test_fbcode#link-tree/pt/add_test.py:39:15
            -> (%4)
        return (%12)

    }
  }
  submodules {
    module __torch__.pt.add_test.___torch_mangle_4.AddBenchmark {
      parameters {
      }
      attributes {
        mobile_optimized = True
      }
      methods {
        method forward {
          graph(%self : __torch__.pt.add_test.___torch_mangle_4.AddBenchmark,
                %input_one.1 : Tensor,
                %input_two.1 : Tensor):
            %3 : int = prim::Constant[value=1]()
            %4 : Tensor = aten::add(%input_one.1, %input_two.1, %3) # /data/users/wangyang19/fbsource/fbcode/buck-out/dev/gen/caffe2/benchmarks/operator_benchmark/fb/pt/mobile/benchmark_all_test_fbcode#link-tree/pt/add_test.py:39:15
            return (%4)

        }
        method get_inputs {
          graph(%self : __torch__.pt.add_test.___torch_mangle_4.AddBenchmark):
            %self.inputs_tuple : (Float(1, 1, 1, strides=[1, 1, 1], requires_grad=0, device=cpu), Float(1, 1, 1, strides=[1, 1, 1], requires_grad=0, device=cpu)) = prim::Constant[value=({0.48884}, {0.809042})]()
            return (%self.inputs_tuple)

        }
      }
      submodules {
      }
    }
  }
}

```

Reviewed By: kimishpatel

Differential Revision: D24322214

fbshipit-source-id: 335317eca4f40c4083883eb41dc47caf25cbdfd1
2020-11-12 17:15:05 -08:00
e829d4fba9 [op-bench] fix jit mode (#45774)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45774

Fix RuntimeError: No such operator operator_benchmark::_consume

Test Plan: waitforsandcastle

Reviewed By: ngimel

Differential Revision: D24064982

fbshipit-source-id: 13160b6d18569e659ca1ab0ca1d444ed9947260c
2020-10-05 09:29:41 -07:00
20ac736200 Remove py2 compatible future imports (#44735)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44735

Reviewed By: mruberry

Differential Revision: D23731306

Pulled By: ezyang

fbshipit-source-id: 0ba009a99e475ddbe22981be8ac636f8a1c8b02f
2020-09-16 12:55:57 -07:00
c02e7c464a Replace import cpp_benchmark with torch.utils.cpp_benchmark (#38832)
Summary:
Otherwise, I don't understand how those could have been invoked

Also, what is the benefit of importing the same module twice?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38832

Differential Revision: D21675081

Pulled By: malfet

fbshipit-source-id: fee5604c4c433161b6b1a999d505b5acbbc3b421
2020-05-20 18:53:09 -07:00
c543034531 add cuda sync when ops running on gpu (#29936)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29936

This diff adds synchronization after op execution to ensure all the cuda streams complete.

Test Plan:
```
buck run mode/opt //caffe2/benchmarks/operator_benchmark:benchmark_all_test -- --iterations 1
# ----------------------------------------
# PyTorch/Caffe2 Operator Micro-benchmarks
# ----------------------------------------
# Tag : short

# Benchmarking PyTorch: add
# Mode: Eager
# Name: add_M64_N64_K64_cpu
# Input: M: 64, N: 64, K: 64, device: cpu
Forward Execution Time (us) : 154.412

# Benchmarking PyTorch: add
# Mode: Eager
# Name: add_M64_N64_K64_cuda
# Input: M: 64, N: 64, K: 64, device: cuda
Forward Execution Time (us) : 101.115
...

Reviewed By: hl475

Differential Revision: D18542732

fbshipit-source-id: b979d26a174f488e971074dc1e16b00e17179c80
2019-11-15 18:02:48 -08:00
f31d6c70fe reduce op bench binary size (#29496)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29496

This diff reduces the binary size of op benchmark by avoiding creating all tests at once.

Test Plan:
```
buck run //caffe2/benchmarks/operator_benchmark:benchmark_all_test

# ----------------------------------------
# PyTorch/Caffe2 Operator Micro-benchmarks
# ----------------------------------------
# Tag : long

# Benchmarking PyTorch: add
# Mode: Eager
# Name: add_M8_N2_K1_cpu
# Input: M: 8, N: 2, K: 1, device: cpu
Forward Execution Time (us) : 160.781

# Benchmarking PyTorch: add
# Mode: Eager
# Name: add_M8_N2_K8_cpu
# Input: M: 8, N: 2, K: 8, device: cpu
Forward Execution Time (us) : 158.941

Reviewed By: hl475

Differential Revision: D18412342

fbshipit-source-id: 5db647019ae8c2e4d6ab361b54b63cf88236b1ae
2019-11-08 22:15:12 -08:00
0a68e8bab0 fix op bench runtime error when use_jit is enabled (#28837)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28837

The JIT code used in op bench is not compatibility with latest JIT code path. This diff aims to resolve that issue.

Test Plan:
```buck run mode/opt //caffe2/benchmarks/operator_benchmark/pt:add_test -- --use_jit
Building: finished in 02:29.8 min (100%) 7055/7055 jobs, 1 updated
  Total time: 02:30.3 min
# ----------------------------------------
# PyTorch/Caffe2 Operator Micro-benchmarks
# ----------------------------------------
# Tag : short

# Benchmarking PyTorch: add
# Mode: JIT
# Name: add_M64_N64_K64_cpu
# Input: M: 64, N: 64, K: 64, device: cpu
Forward Execution Time (us) : 118.052

Reviewed By: hl475

Differential Revision: D18197057

fbshipit-source-id: 92edae8a48abc4115a558a91ba46cc9c3edb2eb8
2019-10-29 12:08:28 -07:00
cbcb70f84c print last 50 runs when using ai_pep_format (#28128)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28128

as title

Test Plan:
```
# ----------------------------------------
# PyTorch/Caffe2 Operator Micro-benchmarks
# ----------------------------------------
# Tag : short

# Benchmarking PyTorch: add
PyTorchObserver {"type": "add_M64_N64_K64_cpu", "metric": "latency", "unit": "ms", "value": "29.169559478759766"}
PyTorchObserver {"type": "add_M64_N64_K64_cpu", "metric": "latency", "unit": "ms", "value": "29.206514358520508"}
PyTorchObserver {"type": "add_M64_N64_K64_cpu", "metric": "latency", "unit": "ms", "value": "29.4950008392334"}
PyTorchObserver {"type": "add_M64_N64_K64_cpu", "metric": "latency", "unit": "ms", "value": "29.172897338867188"}
PyTorchObserver {"type": "add_M64_N64_K64_cpu", "metric": "latency", "unit": "ms", "value": "29.27255630493164"}
PyTorchObserver {"type": "add_M64_N64_K64_cpu", "metric": "latency", "unit": "ms", "value": "29.549837112426758"}
PyTorchObserver {"type": "add_M64_N64_K64_cpu", "metric": "latency", "unit": "ms", "value": "29.63113784790039"}
...

Reviewed By: hl475

Differential Revision: D17957611

fbshipit-source-id: 4e70ba2070b97fbbca0d6d4295abbead2ac356d4
2019-10-16 15:22:23 -07:00
382917bbd1 report per iteration execution time (#27923)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27923

As title

Test Plan:
```
buck run caffe2/benchmarks/operator_benchmark/pt:add_test -- --iterations 3 --ai_pep_format true

# ----------------------------------------
# PyTorch/Caffe2 Operator Micro-benchmarks
# ----------------------------------------
# Tag : short

# Benchmarking PyTorch: add
PyTorchObserver {"type": "add_M64_N64_K64_cpu", "metric": "latency", "unit": "us", "value": "0.027768373489379883"}
PyTorchObserver {"type": "add_M64_N64_K64_cpu", "metric": "latency", "unit": "us", "value": "0.02661752700805664"}
PyTorchObserver {"type": "add_M64_N64_K64_cpu", "metric": "latency", "unit": "us", "value": "0.026746749877929688"}
...

Reviewed By: hl475

Differential Revision: D17911718

fbshipit-source-id: 6fe28f2ab9ce1e0feabb5b822f04ff32dac977a9
2019-10-14 15:44:42 -07:00
c1ed0150c5 canonical example of torch.add benchmark (#23402)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23402

This diff tries to make torch.add as a canonical example for op benchmark. Once it lands, we will also modify all other op benchmarks to be uniform with this example. With that, when people are adding new ops, they can copy paste any existing code.

Test Plan:
buck run mode/dev-nosan caffe2/benchmarks/operator_benchmark/pt:add_test -- --iterations 3

```
# ----------------------------------------
# PyTorch/Caffe2 Operator Micro-benchmarks
# ----------------------------------------
# Tag : short

# Benchmarking PyTorch: add
# Mode: Eager
# Name: add_M8_N16_K32_devicecpu
# Input: M: 8, N: 16, K: 32, device: cpu
Forward Execution Time (us) : 146.586

# Benchmarking PyTorch: add
# Mode: Eager
# Name: add_M8_N16_K32_devicecuda
# Input: M: 8, N: 16, K: 32, device: cuda
Forward Execution Time (us) : 92.151

# Benchmarking PyTorch: add
# Mode: Eager
# Name: add_M16_N16_K64_devicecpu
# Input: M: 16, N: 16, K: 64, device: cpu
Forward Execution Time (us) : 428.421

# Benchmarking PyTorch: add
# Mode: Eager
# Name: add_M16_N16_K64_devicecuda
# Input: M: 16, N: 16, K: 64, device: cuda
Forward Execution Time (us) : 89.811

# Benchmarking PyTorch: add
# Mode: Eager
# Name: add_M64_N64_K128_devicecpu
# Input: M: 64, N: 64, K: 128, device: cpu
Forward Execution Time (us) : 11857.012

# Benchmarking PyTorch: add
# Mode: Eager
# Name: add_M64_N64_K128_devicecuda
# Input: M: 64, N: 64, K: 128, device: cuda
Forward Execution Time (us) : 93.918

# Benchmarking PyTorch: add
# Mode: Eager
# Name: add_M8_N16_K32_devicecpu_bwdall
# Input: M: 8, N: 16, K: 32, device: cpu
Backward Execution Time (us) : 990.125

# Benchmarking PyTorch: add
# Mode: Eager
# Name: add_M8_N16_K32_devicecpu_bwd1
# Input: M: 8, N: 16, K: 32, device: cpu
Backward Execution Time (us) : 781.217

# Benchmarking PyTorch: add
# Mode: Eager
# Name: add_M8_N16_K32_devicecpu_bwd2
# Input: M: 8, N: 16, K: 32, device: cpu
Backward Execution Time (us) : 777.307
```

Reviewed By: zheng-xq

Differential Revision: D16501974

fbshipit-source-id: f1eec010eabf11ce4fcf6cfe6f85cd5241a7022d
2019-10-09 11:24:10 -07:00
3c986dff77 introduce auto_set to simplify benchmarking the backward path of operators (#23276)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23276

This diff introduces a new feature to simplify benchmarking the backward path of ops. Here is an example:

```
...
self.input_one = torch.rand(M, N, K, requires_grad=self.auto_set())
self.input_two = torch.rand(M, N, K, requires_grad=self.auto_set())
...
```

In this way, the benchmark will generate three different test cases.
1. input_one requires grad
2. input_two requires grad
3. both inputs require grad

Here is a sample output:
```
# Benchmarking PyTorch: add
# Mode: Eager
# Name: add_M1_N8_K8_bwdall
# Input: M: 1, N: 8, K: 8
Backward Execution Time (us) : 863.744

# Benchmarking PyTorch: add
# Mode: Eager
# Name: add_M1_N8_K8_bwd1
# Input: M: 1, N: 8, K: 8
Backward Execution Time (us) : 727.915

# Benchmarking PyTorch: add
# Mode: Eager
# Name: add_M1_N8_K8_bwd2
# Input: M: 1, N: 8, K: 8
Backward Execution Time (us) : 687.626
```

Reviewed By: zheng-xq

Differential Revision: D16450355

fbshipit-source-id: 50ae0916e81c3ff9f0c482ed6d386319eb15b305
2019-07-29 15:58:41 -07:00
b93f29ded3 add JIT path to the benchmark (#22309)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22309

This diff enables PT operators to run with JIT mode. Users can control eager and JIT mode using the `use_jit` flag.

In this diff, we are putting operators in a loop and passed it to JIT. One extra step which wraps the operator with the `_consume` op is introduced to avoid dead code elimination optimization in JIT.  With that, the reported time includes the real operator execution time plus the `_consume` (directly return input, nothing else if happening inside) op.

Reviewed By: zheng-xq

Differential Revision: D16033082

fbshipit-source-id: e03be89fd5a505e44e81015dfc63db9cd76fb8a1
2019-07-03 17:18:03 -07:00
341a7e4bb5 Fix issue in backward path (#21663)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21663

as title

Reviewed By: hl475

Differential Revision: D15770793

fbshipit-source-id: b3d0dd030237c4d62bddc388984a273153fac4a6
2019-06-11 21:09:25 -07:00
4e3c97a0be add separate path for op with JIT (#21210)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21210

This diff introduces a new path to run op with JIT. There are two steps involved here:
1. Users need to script the op. This should happen in the `init` method.
2. The generated graph from step1 is passed to `jit_forward` which will be executed by the benchmark backend

Reviewed By: zheng-xq

Differential Revision: D15460831

fbshipit-source-id: 48441d9cd4be5d0acebab901f45544616e6ed2ee
2019-06-10 19:53:58 -07:00
3004b397f0 change test_name to be globally unique value across tests (#21206)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21206

This diff change the default test_name to be a globally unique value across tests. With that, users can list all the tests and choose to run a specific test.

Reviewed By: zheng-xq

Differential Revision: D15543508

fbshipit-source-id: 0814ef6a60d41637fed5245e30c282497cf21bb8
2019-06-03 14:55:11 -07:00
ca80ec7c97 introduce a new intrace to add op [PT changes] (#21149)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21149

The diff modifies the interface for PyTorch operators in the benchmark suite

Reviewed By: zheng-xq

Differential Revision: D15433897

fbshipit-source-id: e858183431eb37d90313356716c2de8709372b58
2019-06-03 14:55:08 -07:00
19e6886576 Intra-op parallel microbenchmarks for PT (#19997)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19997
ghimport-source-id: 420d4a68a1ef879beee2734adba8abb575e0b0ab

Differential Revision: D15231375

Pulled By: ilia-cher

fbshipit-source-id: ce7248ea2ebb54d25c9d831c6e3f23f3534557dd
2019-05-06 20:21:45 -07:00
0c7e98b765 Support for non-contiguous tensors and arbitrary dtypes in PT benchmarks (#19993)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19993
ghimport-source-id: 4cf51b61bb83b72883148ab0faa0c75c3cef7635

Differential Revision: D15230363

Pulled By: ilia-cher

fbshipit-source-id: a3ab591d6fd24e874958401e63eaec56bda19a5c
2019-05-06 19:12:09 -07:00
26f12af537 Fix op benchmarks error in OSS environment (#19518)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19518

Previous design needs to run the op benchmarks from PyTorch root directory which could lead to `module not found` error in OSS environment. This diff fixes that issue by making the benchmark to be launched in the `benchmarks` folder.

Reviewed By: ilia-cher

Differential Revision: D15020787

fbshipit-source-id: eb09814a33432a66cc857702bc86538cd17bea3b
2019-04-19 16:25:16 -07:00
08f5c05d60 make separate operators as independent binaries (#19450)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19450

We want to make each operator benchmark as a separate binary. The previous way to run the benchmark is by collecting all operators into a single binary, it is unnecessary when we want to filter a specific operator. This diff aims to resolve that issue.

Reviewed By: ilia-cher

Differential Revision: D14808159

fbshipit-source-id: 43cd25b219c6e358d0cd2a61463b34596bf3bfac
2019-04-18 20:00:47 -07:00
5f5a2aaab9 Operator-level performance microbenchmarks (#18740)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18740

Test utilities for writing Caffe2/PyTorch performance microbenchmarks. Brief description of the file structure

* benchmark_core.py : core utiltiites for running microbenchmark tests
* benchmark_caffe2.py : Caffe2 specific benchmark utilitites
* benchmark_pytorch.py: PyTorch specific benchmark utilities
* benchmark_runner.py : Main function. Currently it can run the microbenchmark tests in a stand-alone mode. The next step is to have this integrate with AI-PEP.

The utilities are located at https://github.com/pytorch/pytorch/tree/master/test to have access to both Caffe2/PyTorch Python's frontend.

Include two operator microbenchmarks; support both Caffe2/PyTorch:
* MatMul
* Add

Reference: PyTorch benchmarks : https://github.com/pytorch/benchmark/tree/master/timing/python. In this work, we start with two example binary operators MatMul and Add, but eventually we should to cover unary operators like in the PyTorch benchmark repo.

Reviewed By: zheng-xq

Differential Revision: D13887111

fbshipit-source-id: b7a56b95448c9ec3e674b0de0ffb96af4439bfce
2019-04-02 17:06:19 -07:00