mirror of https://github.com/pytorch/pytorch.git synced 2025-10-20 21:14:14 +08:00

Basil Hosmer cab926b2c0 faster generate_square_subsequent_mask in nn.Transformer (#60631)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60631

Per #48360, speed up `Transformer.generate_square_subsequent_mask`. New impl is informally ~5x faster, though absolute difference is probably small.

PR includes Python and C++ versions as well as a couple of places where the previous impl had been copied around.

Test Plan: Imported from OSS

Reviewed By: jbschlosser, albanD

Differential Revision: D29356673

Pulled By: bhosmer

fbshipit-source-id: 4c062ba0ead61a445aeef451c78777bf0b3a631e

2021-06-25 16:07:01 -07:00

audio_text_models.py: Reland of benchmark code (#43428), 2020-08-24 13:27:26 -07:00
compare.py: Reland of benchmark code (#43428), 2020-08-24 13:27:26 -07:00
functional_autograd_benchmark.py: Reland of benchmark code (#43428), 2020-08-24 13:27:26 -07:00
ppl_models.py: Enable distribution validation if __debug__ (#48743), 2021-01-05 13:59:10 -08:00
README.md: Reland of benchmark code (#43428), 2020-08-24 13:27:26 -07:00
torchaudio_models.py: faster generate_square_subsequent_mask in nn.Transformer (#60631), 2021-06-25 16:07:01 -07:00
torchvision_models.py: Add lint for unqualified type: ignore (#56290), 2021-04-21 08:07:23 -07:00
utils.py: Reland of benchmark code (#43428), 2020-08-24 13:27:26 -07:00
vision_models.py: Reland of benchmark code (#43428), 2020-08-24 13:27:26 -07:00

README.md

Benchmarking tool for the autograd API

This folder contains a set of self-contained scripts for benchmarking the autograd API with several common models. The intended workflow is to run the benchmark before and after your change and generate a comparison table to share on the PR.

To do so, run functional_autograd_benchmark.py before your change (writing the output to before.txt) and again after your change (writing the output to after.txt). Then run compare.py to get a markdown table comparing the two runs.

The default arguments of functional_autograd_benchmark.py are suitable in most cases. You can override them to force a given device or to run even the (very) slow settings.
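For intuition about what each benchmark run measures, here is a minimal sketch of a timing harness. It is not the code in functional_autograd_benchmark.py: the helper name `time_callable` and its `num_iters` and `warmup` parameters are assumptions made for illustration, and it uses only the Python standard library.

```python
import statistics
import time
from typing import Callable, Tuple


def time_callable(
    fn: Callable[[], object], num_iters: int = 10, warmup: int = 1
) -> Tuple[float, float]:
    """Return (mean, variance) of the wall-clock run time of fn, in seconds.

    Hypothetical helper for illustration only.
    """
    for _ in range(warmup):
        fn()  # warm-up calls are executed but not timed

    samples = []
    for _ in range(num_iters):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)

    # Sample variance needs at least two measurements.
    variance = statistics.variance(samples) if len(samples) > 1 else 0.0
    return statistics.mean(samples), variance


if __name__ == "__main__":
    mean, var = time_callable(lambda: sum(i * i for i in range(100_000)))
    print(f"mean: {mean:.6f}s, variance: {var:.3e}")
```

When timing GPU work, a harness like this would also need to synchronize the device before reading the clock, since CUDA kernels are launched asynchronously.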

Sample usage

# Make sure you compile PyTorch in release mode and with the same flags before and after your change
export DEBUG=0
# When running on CPU, you may need to limit the number of threads to avoid oversubscription
export OMP_NUM_THREADS=10

# Compile pytorch with the base revision
git checkout master
python setup.py develop

# Run the benchmark for the base
# This will use the GPU if available.
pushd benchmarks/functional_autograd_benchmark
python functional_autograd_benchmark.py --output before.txt

# Compile pytorch with your change
popd
git checkout your_feature_branch
python setup.py develop

# Run the benchmark for the new version
pushd benchmarks/functional_autograd_benchmark
python functional_autograd_benchmark.py --output after.txt

# Get the markdown table that you can paste in your GitHub PR
python compare.py

popd
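
The comparison step in the usage above can be pictured with the following sketch, which turns two sets of mean timings into a markdown table. It does not reproduce compare.py or its file format: the function name `markdown_comparison`, the dictionary inputs, the column layout, and the placeholder benchmark names and numbers are all assumptions for illustration.

```python
from typing import Dict


def markdown_comparison(before: Dict[str, float], after: Dict[str, float]) -> str:
    """Build a markdown table comparing mean run times (seconds) of two runs.

    Hypothetical helper; inputs map a benchmark name to its mean run time.
    """
    lines = [
        "| benchmark | before (s) | after (s) | speedup |",
        "| --- | --- | --- | --- |",
    ]
    # Only benchmarks present in both runs can be compared.
    for name in sorted(before.keys() & after.keys()):
        b, a = before[name], after[name]
        speedup = b / a if a > 0 else float("nan")
        lines.append(f"| {name} | {b:.4f} | {a:.4f} | {speedup:.2f}x |")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder values, not real measurements.
    before = {"model_a/task_x": 0.0420, "model_a/task_y": 1.2500}
    after = {"model_a/task_x": 0.0400, "model_a/task_y": 1.0000}
    print(markdown_comparison(before, after))
```

A speedup above 1.00x in this sketch means the "after" run was faster than the "before" run.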

Files in this folder:

functional_autograd_benchmark.py is the main entry point to run the benchmark.
compare.py is the entry point to run the comparison script that generates a markdown table.
torchaudio_models.py and torchvision_models.py contain code extracted from torchaudio and torchvision so that the models can run without a specific version of these libraries installed.
ppl_models.py, vision_models.py and audio_text_models.py contain all the getter functions used for the benchmark.
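
To illustrate the idea of a getter function, the sketch below shows one possible shape: a function that builds the callable to benchmark together with sample inputs, plus a registry a driver script could loop over. Everything here is hypothetical, including `get_sum_of_squares`, the `GETTERS` registry, and the use of plain floats in place of tensors. The real getters in this folder may have different signatures.

```python
from typing import Callable, Dict, Sequence, Tuple

# Hypothetical types: a getter returns the function to benchmark and its inputs.
Inputs = Sequence[float]
Getter = Callable[[], Tuple[Callable[..., float], Inputs]]


def get_sum_of_squares() -> Tuple[Callable[..., float], Inputs]:
    """Hypothetical getter: build a stand-in 'model' function and sample inputs."""

    def forward(*xs: float) -> float:
        return sum(x * x for x in xs)

    return forward, (1.0, 2.0, 3.0)


# Hypothetical registry that a driver script could iterate over.
GETTERS: Dict[str, Getter] = {"sum_of_squares": get_sum_of_squares}

if __name__ == "__main__":
    for name, getter in GETTERS.items():
        forward, inputs = getter()
        print(name, forward(*inputs))
```

Keeping model construction behind getters like this lets a driver treat every model uniformly: it only needs a callable and its inputs.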