# Fast RNN benchmarks
Benchmarks for TorchScript models
For the most stable results, do the following:
- Set the CPU governor to performance mode (as opposed to powersave)
- Turn off turbo for all CPUs (assuming Intel CPUs)
- Shield CPUs via `cset shield` when running benchmarks (see the sketch after this list)
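As a starting point, here is a minimal sketch of that setup on a Linux machine. It assumes the `cpupower` and `cset` tools are installed, an Intel CPU using the `intel_pstate` driver, and hypothetical core numbers (2-5) that you should adjust for your machine:

```sh
# Set the performance governor on all CPUs (requires cpupower):
sudo cpupower frequency-set -g performance

# Turn off turbo on Intel CPUs driven by intel_pstate:
echo 1 | sudo tee /sys/devices/system/cpu/intel_pstate/no_turbo

# Shield cores 2-5 from other userspace and kernel threads,
# then run a benchmark inside the shield:
sudo cset shield --cpu 2-5 --kthread on
sudo cset shield --exec -- python -m fastrnns.bench

# Tear the shield down when done:
sudo cset shield --reset
```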
Some of these scripts accept command-line args, but most of them do not because I was lazy. More will probably be added in the future, but the default sizes are pretty reasonable.
## Test fastrnns (fwd + bwd) correctness
Test the fastrnns benchmarking scripts with the following:

`python -m fastrnns.test`

or run the test independently:

`python -m fastrnns.test --rnns jit`
## Run benchmarks
`python -m fastrnns.bench`

should give a good comparison, or you can specify the type of model to run:

`python -m fastrnns.bench --rnns cudnn aten jit --group rnns`
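The bench script is driven by argparse, so `--help` prints the full set of flags. A hedged sketch, assuming size flags such as `--seqLength` and `--miniBatch` exist in your version (verify against the help output first):

```sh
# List every flag the script accepts (argparse generates this):
python -m fastrnns.bench --help

# A hypothetical custom-size run; check the flag names via --help first:
python -m fastrnns.bench --rnns jit --seqLength 200 --miniBatch 32
```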
## Run model profiling (calls nvprof)
`python -m fastrnns.profile`

should generate an nvprof file for each model. You can also specify the models to generate nvprof files for separately:

`python -m fastrnns.profile --rnns aten jit`
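Once the profiles exist, they can be inspected with NVIDIA's standard tools. A minimal sketch, assuming `nvprof` and `nvvp` are on your `PATH`; `out.nvprof` is a hypothetical filename, so substitute whatever the script actually wrote:

```sh
# Summarize a captured profile on the command line:
nvprof --import-profile out.nvprof

# Or open it in the NVIDIA Visual Profiler GUI:
nvvp out.nvprof
```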
## Caveats
Use Linux for the most accurate timing. A lot of these tests only run on CUDA.