2193 Commits

007fd01e9b Enable PT operators running with {cpu, gpu} * {forward, backward} (#22416)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22416

This diff tests the combination of cpu/gpu and forward/backward paths for the PT add operator.

Reviewed By: hl475

Differential Revision: D15770792

fbshipit-source-id: 38cc648361d2501d774db407f988c3cb5115b2ae
2019-07-01 16:30:58 -07:00
3a198400f8 modify pool benchmarks
Summary: as title

Reviewed By: hl475

Differential Revision: D16058193

fbshipit-source-id: 8f4e04a0356960f6483d6ef58e64876740434849
2019-06-28 14:35:23 -07:00
89c709d217 modify unary operators benchmark
Summary: as title

Reviewed By: hl475

Differential Revision: D16057665

fbshipit-source-id: 07e31a17450fbfd88b5bd330c31c729de5300eaa
2019-06-28 14:03:41 -07:00
6cf4df5d06 add PT softmax ops to the benchmark suite (#21208)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21208

The diff adds softmax, softmax2d, and logsoftmax to the benchmark suite.

Reviewed By: zheng-xq

Differential Revision: D15526265

fbshipit-source-id: b7ba63032dba7146765513c8cb1ac5a6a7bd1a68
2019-06-28 13:58:20 -07:00
a4f281446b introduce flags to set omp and mkl threads (#21472)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21472

as title

Reviewed By: hl475

Differential Revision: D15695846

fbshipit-source-id: 44437f6b94a9c583275fcc711bb6ccf2b04f90fc
2019-06-26 09:33:05 -07:00
f59581218f Fix spelling errors (#21665)
Summary:
alloctor -> allocator
excutable -> executable
excution -> execution
foward -> forward
initiaize -> initialize
paralell -> parallel
preprocesor -> preprocessor
tranpose -> transpose
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21665

Differential Revision: D15806155

Pulled By: soumith

fbshipit-source-id: d92b21ec8650a2b32f05faf9af0b7d2b073e992c
2019-06-13 15:21:55 -07:00
341a7e4bb5 Fix issue in backward path (#21663)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21663

as title

Reviewed By: hl475

Differential Revision: D15770793

fbshipit-source-id: b3d0dd030237c4d62bddc388984a273153fac4a6
2019-06-11 21:09:25 -07:00
f2623c74a9 add PT pointwise unary ops to the benchmark suite (#21207)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21207

This diff adds 80 PT pointwise unary ops to the benchmark suite. Most of the ops are added using the generate_pt_tests_from_list interface. The rest are handled separately.

Reviewed By: zheng-xq

Differential Revision: D15471597

fbshipit-source-id: 8ea36e292a38b1dc50f064a48c8cd07dbf78ae56
2019-06-10 21:35:44 -07:00
4e3c97a0be add separate path for op with JIT (#21210)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21210

This diff introduces a new path to run ops with JIT. There are two steps involved here:
1. Users need to script the op. This should happen in the `init` method.
2. The generated graph from step 1 is passed to `jit_forward`, which is executed by the benchmark backend (see the sketch after this list).
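
As an illustration only (not the code in this diff), the two steps might look roughly like the sketch below; the class name, tensor shapes, and the `op_bench`/`TorchBenchmarkBase` names are assumptions, while `init` and `jit_forward` come from the description above:

```
import torch
import operator_benchmark as op_bench  # assumed front-end module name


class AddJITBenchmark(op_bench.TorchBenchmarkBase):
    def init(self, M, N):
        self.input_one = torch.rand(M, N)
        self.input_two = torch.rand(M, N)

        # Step 1: script the op inside `init`.
        @torch.jit.script
        def scripted_add(a, b):
            return a + b

        self.scripted_add = scripted_add
        self.set_module_name("add_jit")

    def jit_forward(self):
        # Step 2: the scripted graph is what the benchmark backend executes.
        return self.scripted_add(self.input_one, self.input_two)
```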

Reviewed By: zheng-xq

Differential Revision: D15460831

fbshipit-source-id: 48441d9cd4be5d0acebab901f45544616e6ed2ee
2019-06-10 19:53:58 -07:00
512c9d8c76 add PT gather op to the benchmark suite (#21614)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21614

as title

Reviewed By: kimishpatel

Differential Revision: D15525115

fbshipit-source-id: 6a17e1d791bdb432cc3d51e45c5e82b96268127d
2019-06-10 16:31:52 -07:00
a5cf6d5100 reorganize op bench directory (#21543)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21543

No code change in this diff.

Reviewed By: hl475

Differential Revision: D15721419

fbshipit-source-id: 06212cc882f5297064153417dc4d80bce9ec2667
2019-06-07 16:06:51 -07:00
f433913996 add more info back to BenchResult (#21502)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21502

In BenchResult, we keep name, avg_fwd, std_fwd, avg_bwd, and std_bwd. There is no information about the individual iterations. In this diff, I am adding more info to BenchResult to include the number reported from each iteration.
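
For reference, a hedged sketch of what such a result record could look like; only the fields listed above come from this summary, and the per-iteration fields are assumptions:

```
from collections import namedtuple

# name, avg_fwd, std_fwd, avg_bwd, std_bwd are the fields named above;
# fwd_times / bwd_times (per-iteration samples) are illustrative additions.
BenchResult = namedtuple(
    "BenchResult",
    ["name", "avg_fwd", "std_fwd", "fwd_times", "avg_bwd", "std_bwd", "bwd_times"],
)
```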

Reviewed By: wanchaol

Differential Revision: D15706306

fbshipit-source-id: 3f14be4ba91f1f6da473995783bd7af1d067938d
2019-06-06 18:43:51 -07:00
12528990f8 change output of ai_pep_format (#21440)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21440

This diff modifies the output format when ai_pep_format is enabled.

Reviewed By: hl475

Differential Revision: D15681042

fbshipit-source-id: df5f2dbb38d1bd866ca7f74ef4e63459d480be6e
2019-06-05 21:54:24 -07:00
b869a3b4ac add new ops to benchmark_all_test (#21365)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21365

This diff adds new operators to benchmark_all_test so that all the supported ops can be built into one binary.

Reviewed By: hl475

Differential Revision: D15627328

fbshipit-source-id: b7ca550a279f485102a6a6bd47e4032c7beb9940
2019-06-04 13:54:26 -07:00
3004b397f0 change test_name to be globally unique value across tests (#21206)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21206

This diff changes the default test_name to be a globally unique value across tests. With that, users can list all the tests and choose to run a specific one.

Reviewed By: zheng-xq

Differential Revision: D15543508

fbshipit-source-id: 0814ef6a60d41637fed5245e30c282497cf21bb8
2019-06-03 14:55:11 -07:00
ca80ec7c97 introduce a new interface to add op [PT changes] (#21149)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21149

The diff modifies the interface for PyTorch operators in the benchmark suite

Reviewed By: zheng-xq

Differential Revision: D15433897

fbshipit-source-id: e858183431eb37d90313356716c2de8709372b58
2019-06-03 14:55:08 -07:00
516ea33f6a add PT maxpool and avgpool ops to the benchmark suite (#21200)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21200

This diff adds MaxPool1d/2d/3d and AvgPool1d/2d/3d to the benchmark suite.

Reviewed By: hl475

Differential Revision: D15541980

fbshipit-source-id: 394d136ee94a16ee24285939323ca5fe317e99d3
2019-05-31 19:35:29 -07:00
dceea73460 add PT conv and convtranspose ops to the benchmark suite (#21199)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21199

This diff adds Conv1d, ConvTranspose1d, Conv2d, ConvTranspose2d, Conv3d, and ConvTranspose3d operators to the benchmark suite.

Reviewed By: hl475

Differential Revision: D15520817

fbshipit-source-id: 5512afec2be8a1036fbcd170f70265c7e455fcde
2019-05-31 19:35:25 -07:00
2d75d31398 add PT linear op to the benchmark suite (#21204)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21204

as title

Reviewed By: hl475

Differential Revision: D15484743

fbshipit-source-id: 7094a983e370e1c3952021146b58b844874b7d5e
2019-05-31 19:35:22 -07:00
00b3e69211 add PT batchnorm op to the benchmark suite (#21201)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21201

as title

Reviewed By: hl475

Differential Revision: D15482581

fbshipit-source-id: d93713a35be41e76d077df419cb24585f69d72eb
2019-05-31 19:35:18 -07:00
ed1078bde3 migrate matmul operator to the new interface (#21198)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21198

as title

Reviewed By: hl475

Differential Revision: D15325768

fbshipit-source-id: a5d7c6837cd09445e75846660d12807dd26af6cc
2019-05-31 19:35:15 -07:00
668dbcc41b migrate intraop benchmarks to the new interface (#21202)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21202

Migrate Ilia's op benchmarks to the new interface

Reviewed By: hl475

Differential Revision: D15322577

fbshipit-source-id: 8e75d51e7ddacbd56896c55f2996a9358491d83e
2019-05-31 16:19:04 -07:00
c62d476206 migrate add operator to the new interface (#21152)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21152

Migrate existing add benchmark to use the new op front-end

Reviewed By: zheng-xq

Differential Revision: D15325524

fbshipit-source-id: 34e969e1bd289913d881c476711bce9f8ac18a29
2019-05-31 16:19:00 -07:00
0223d3744a introduce a new interface to add op [C2 changes] (#21148)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21148

The diff modifies the interface for Caffe2 operators in the benchmark suite

Reviewed By: zheng-xq

Differential Revision: D15433888

fbshipit-source-id: c264a95906422d7a26c10b1f9836ba8b35e36b53
2019-05-31 09:21:07 -07:00
31089b02ce introduce a new interface to add op [core changes] (#21147)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21147

This diff introduces a new interface to add PT/C2 operators to the benchmark suite.

The following steps are needed to add a new operator:
1. Specify the input shapes and args for the operator in configs.
2. Create a PT/C2 benchmark class which includes ```init``` (create tensors), ```forward``` (specify the operator to be tested), and ```backward``` (gradient of the op) methods.
3. Call generate_pt_test/generate_c2_test to create test cases based on the configs (see the sketch after this list).
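
As a rough illustration of these three steps on the PT side (not the code in this diff; the `cross_product_configs` helper, shapes, and device list are assumptions):

```
import torch
import operator_benchmark as op_bench  # assumed front-end module name

# Step 1: input shapes and args live in configs.
add_configs = op_bench.cross_product_configs(
    M=[8, 64],
    N=[8, 64],
    device=["cpu", "cuda"],
    tags=["short"],
)


# Step 2: a PT benchmark class with init/forward.
class AddBenchmark(op_bench.TorchBenchmarkBase):
    def init(self, M, N, device):
        self.input_one = torch.rand(M, N, device=device, requires_grad=True)
        self.input_two = torch.rand(M, N, device=device)
        self.set_module_name("add")

    def forward(self):
        return torch.add(self.input_one, self.input_two)


# Step 3: generate test cases from the configs; the gradient variant
# exercises the backward path.
op_bench.generate_pt_test(add_configs, AddBenchmark)
op_bench.generate_pt_gradient_test(add_configs, AddBenchmark)

if __name__ == "__main__":
    op_bench.benchmark_runner.main()
```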

Reviewed By: zheng-xq

Differential Revision: D15250380

fbshipit-source-id: 1025a7cf60d2427baa0f3f716455946d3d3e6a27
2019-05-31 09:21:04 -07:00
cda9e995e2 Benchmark repeat op. (#20016)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20016

PT's repeat op benchmark

Reviewed By: zheng-xq

Differential Revision: D15166941

fbshipit-source-id: b1ed7af790460456210b60bfb4e44a08657e9612
2019-05-20 07:34:54 -07:00
eecf52b444 Fix in benchmark_test_generator (#20237)
Summary:
Add missing import
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20237

Differential Revision: D15245957

Pulled By: ilia-cher

fbshipit-source-id: 0f71aa08eb9ecac32002a1644838d06ab9faa37c
2019-05-07 17:03:25 -07:00
19e6886576 Intra-op parallel microbenchmarks for PT (#19997)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19997
ghimport-source-id: 420d4a68a1ef879beee2734adba8abb575e0b0ab

Differential Revision: D15231375

Pulled By: ilia-cher

fbshipit-source-id: ce7248ea2ebb54d25c9d831c6e3f23f3534557dd
2019-05-06 20:21:45 -07:00
8c97f0b19e Initialize Caffe2 only when running Caffe2 benchmarks (#19980)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19980
ghimport-source-id: ca31ca25b88a1c6219e4a32483f70738a8fdbf88

Differential Revision: D15229797

Pulled By: ilia-cher

fbshipit-source-id: 0b23dbdba0c0f60932a75d8b1900c54285f5a8e4
2019-05-06 19:17:23 -07:00
0c7e98b765 Support for non-contiguous tensors and arbitrary dtypes in PT benchmarks (#19993)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19993
ghimport-source-id: 4cf51b61bb83b72883148ab0faa0c75c3cef7635

Differential Revision: D15230363

Pulled By: ilia-cher

fbshipit-source-id: a3ab591d6fd24e874958401e63eaec56bda19a5c
2019-05-06 19:12:09 -07:00
3875e1ba45 try to make at::cat in mm_tree_reduction operate on contig tensors (#18816)
Summary:
Sometimes at::cat gets transposed inputs and takes a slow path. Also, make the jit_premul LSTM benchmark add the bias to the whole input tensor to avoid separate reduction kernels in the backward pass.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18816

Differential Revision: D15013576

Pulled By: wanchaol

fbshipit-source-id: bcfa1cf44180b11b05b0f55f034707012f66281a
2019-04-24 23:44:25 -07:00
26f12af537 Fix op benchmarks error in OSS environment (#19518)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19518

The previous design required running the op benchmarks from the PyTorch root directory, which could lead to a `module not found` error in the OSS environment. This diff fixes that issue by making the benchmarks launchable from the `benchmarks` folder.

Reviewed By: ilia-cher

Differential Revision: D15020787

fbshipit-source-id: eb09814a33432a66cc857702bc86538cd17bea3b
2019-04-19 16:25:16 -07:00
5da7b74d48 fix AI-PEP path error (#19514)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19514

as title

Reviewed By: hl475

Differential Revision: D15018499

fbshipit-source-id: 9ce38e3a577432e0575a6743f5dcd2e907d3ab9d
2019-04-19 16:25:13 -07:00
08f5c05d60 make separate operators as independent binaries (#19450)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19450

We want to make each operator benchmark a separate binary. Previously, the benchmarks were run by collecting all operators into a single binary, which is unnecessary when we only want to run a specific operator. This diff resolves that issue.

Reviewed By: ilia-cher

Differential Revision: D14808159

fbshipit-source-id: 43cd25b219c6e358d0cd2a61463b34596bf3bfac
2019-04-18 20:00:47 -07:00
45d5b6be48 Enhance front-end to add op (#19433)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19433

For the operator benchmark project, we need to cover a lot of operators, so the interface for adding operators needs to be very clean and simple. This diff implements a new interface for adding ops.

Here is the logic for adding a new operator to the benchmark:
```
long_config = {}    # placeholder: configs describing the op's input shapes/args
short_config = {}

map_func            # placeholder: maps a config entry to the op's inputs

add_test(
  [long_config, short_config],
  map_func,
  [caffe2 op],
  [pt op]
)
```

Reviewed By: zheng-xq

Differential Revision: D14791191

fbshipit-source-id: ac6738507cf1b9d6013dc8e546a2022a9b177f05
2019-04-18 17:07:02 -07:00
5627940e9c Add a fast path for batch-norm CPU inference. (#19152)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19152

Adding a fast path for batch-norm CPU inference when all tensors are contiguous.
* Leverage vectorization through simple loops.
* Fold the linear terms before computation (see the sketch after this list).
* For resnext-101, this version is 18.95 times faster.
* Add a microbenchmark:
  * (buck build mode/opt -c python.package_style=inplace --show-output //caffe2/benchmarks/operator_benchmark:batchnorm_benchmark) && \
    (OMP_NUM_THREADS=1 MKL_NUM_THREADS=1 buck-out/gen/caffe2/benchmarks/operator_benchmark/batchnorm_benchmark#binary.par)
  * batch_norm: data shape: [1, 256, 3136], bandwidth: 22.26 GB/s
  * batch_norm: data shape: [1, 65536, 1], bandwidth: 5.57 GB/s
  * batch_norm: data shape: [128, 2048, 1], bandwidth: 18.21 GB/s
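
A minimal sketch of the linear-term folding, written in NumPy for an [N, C, L] input purely for illustration (the actual fast path is a C++ kernel):

```
import numpy as np

def folded_batch_norm_inference(x, mean, var, gamma, beta, eps=1e-5):
    """x: [N, C, L]; mean, var, gamma, beta: per-channel, shape [C]."""
    # Fold the linear terms once, outside the per-element work:
    #   y = gamma * (x - mean) / sqrt(var + eps) + beta  ==  scale * x + shift
    scale = gamma / np.sqrt(var + eps)
    shift = beta - mean * scale
    # What remains per element is a single multiply-add over contiguous
    # memory, which simple loops can vectorize.
    return x * scale[None, :, None] + shift[None, :, None]
```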

Reviewed By: soumith, BIT-silence

Differential Revision: D14889728

fbshipit-source-id: 20c9e567e38ff7dbb9097873b85160eca2b0a795
2019-04-16 19:27:54 -07:00
3501576230 calculate execution time based on final iterations (#19299)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19299

I saw larger than 5% performance variation with small operators; this diff aims to reduce the variation by avoiding Python overhead. Previously, the benchmark ran the main loop for 100 iterations and then looked at the elapsed time. If it was not significant, the number of iterations was doubled and the benchmark rerun, repeating until the measurement became significant. The time was calculated as total_time / number_of_iterations, so the result included the Python launch overhead of every run.

Now, I change the logic to calculate the execution time based on the last run instead of all runs; the equation is time_in_last_run / number_of_iterations (see the sketch below).
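
A simplified sketch of the timing logic described above; the function name and significance threshold are illustrative:

```
import time

def measure_op(op, significant_time_sec=0.1, initial_iters=100):
    # Double the iteration count until a single timed run is long enough
    # to be significant, then report only the last run, so the Python
    # overhead of the earlier runs does not leak into the result.
    iters = initial_iters
    while True:
        start = time.time()
        for _ in range(iters):
            op()
        elapsed = time.time() - start
        if elapsed >= significant_time_sec:
            return elapsed / iters  # time_in_last_run / number_of_iterations
        iters *= 2
```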

Reviewed By: hl475

Differential Revision: D14925287

fbshipit-source-id: cb646298c08a651e27b99a5547350da367ffff47
2019-04-16 08:57:17 -07:00
07efee395c add Fast-RNN to AI-PEP
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18885

Reviewed By: hl475

Differential Revision: D14728854

fbshipit-source-id: 7e7a2946929551963f7c938e3d82a260a9efdfbd
2019-04-04 17:04:21 -07:00
cb66759600 temp fix for flake8 error (#18788)
Summary:
Fix lint error
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18788

Reviewed By: houseroad

Differential Revision: D14741840

Pulled By: mingzhe09088

fbshipit-source-id: 1fa630e3c6e606e3d78fe8293e5b0e7ea1b78da3
2019-04-02 22:52:52 -07:00
5f5a2aaab9 Operator-level performance microbenchmarks (#18740)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18740

Test utilities for writing Caffe2/PyTorch performance microbenchmarks. Brief description of the file structure:

* benchmark_core.py: core utilities for running microbenchmark tests
* benchmark_caffe2.py: Caffe2-specific benchmark utilities
* benchmark_pytorch.py: PyTorch-specific benchmark utilities
* benchmark_runner.py: Main function. Currently it can run the microbenchmark tests in a stand-alone mode. The next step is to have this integrate with AI-PEP.

The utilities are located at https://github.com/pytorch/pytorch/tree/master/test to have access to both the Caffe2 and PyTorch Python frontends.

Includes two operator microbenchmarks, supporting both Caffe2 and PyTorch:
* MatMul
* Add

Reference: PyTorch benchmarks: https://github.com/pytorch/benchmark/tree/master/timing/python. In this work, we start with two example binary operators, MatMul and Add, but eventually we should also cover unary operators as in the PyTorch benchmark repo.

Reviewed By: zheng-xq

Differential Revision: D13887111

fbshipit-source-id: b7a56b95448c9ec3e674b0de0ffb96af4439bfce
2019-04-02 17:06:19 -07:00
173f224570 Turn on F401: Unused import warning. (#18598)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18598
ghimport-source-id: c74597e5e7437e94a43c163cee0639b20d0d0c6a

Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18598 Turn on F401: Unused import warning.**

This was requested by someone at Facebook; this lint is turned
on for Facebook by default.  "Sure, why not."

I had to noqa a number of imports in __init__.  Hypothetically
we're supposed to use __all__ in this case, but I was too lazy
to fix it.  Left for future work.

Be careful!  flake8-2 and flake8-3 behave differently with
respect to import resolution for # type: comments.  flake8-3 will
report an import unused; flake8-2 will not.  For now, I just
noqa'd all these sites.

All the changes were done by hand.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Differential Revision: D14687478

fbshipit-source-id: 30d532381e914091aadfa0d2a5a89404819663e3
2019-03-30 09:01:17 -07:00
e22a2b9015 Minor fixes in fastrnns benchmarks
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18613

Reviewed By: wanchaol

Differential Revision: D14681838

fbshipit-source-id: 60bd5c9b09398c74335f003cd21ea32dd1c45876
2019-03-29 01:22:28 -07:00
6684ef3f23 Move fast rnn benchmark to pytorch/pytorch
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18369

Differential Revision: D14652039

Pulled By: wanchaol

fbshipit-source-id: 1177b1f60d96672c3e2c9d527b56ee06ca7c0af1
2019-03-27 14:46:09 -07:00