Summary: This is a followup to https://github.com/pytorch/pytorch/issues/49190. Broadly speaking, the goal is to make it easy to identify test time regressions introduced by PRs. Eventually the hope is to use this information to edit Dr CI comments, but this particular PR just does the analysis and prints it to stdout, so a followup PR would be needed to edit the actual comments on GitHub.

**Important:** for uninteresting reasons, this PR moves the `print_test_stats.py` file.

- *Before:* `test/print_test_stats.py`
- *After:* `torch/testing/_internal/print_test_stats.py`

Notes on the approach:

- Just taking the mean and stdev of the total job time over the last _N_ commits isn't sufficient, because e.g. if `master` was broken 5 commits ago, then a lot of those job times will be much shorter, breaking the statistics.
- We use the commit history to make better estimates for the mean and stdev of individual test (and suite) times, but only when the test in that historical commit is present and its status matches that of the base commit.
- We list all the tests that were removed or added, or whose status changed (e.g. skipped to not skipped, or vice versa), along with time estimates for that test case and its containing suite.
- We don't list tests whose time changed a lot if their status didn't change, because there's a lot of noise and it's unclear how to do that well without too many false positives.
- We show a human-readable commit graph that indicates exactly how many commits are in the pool of commits that could be causing regressions (e.g. if a PR has multiple commits in it, or if the base commit on `master` doesn't have a report in S3).
- We don't show an overall estimate of whether the PR increased or decreased the total test job time, because it's noisy and it's a bit tricky to aggregate stdevs up from individual tests to the whole job level. This might change in a followup PR.
- Instead, we simply show a summary at the bottom which says how many tests were removed/added/modified (where "modified" means that the status changed), and our best estimates of the mean times (and stdevs) of those changes.
- Importantly, the summary at the bottom covers only the test cases already shown in the more verbose diff report, and does not include any information about tests whose status didn't change but whose running time got much longer.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/50171

Test Plan: To run the unit tests:

```
$ python test/test_testing.py
$ python test/print_test_stats.py
```

To verify that this works, check the [CircleCI logs](https://app.circleci.com/pipelines/github/pytorch/pytorch/258628/workflows/9cfadc34-e042-485e-b3b3-dc251f160307) for a test job run on this PR; for example:

- pytorch_linux_bionic_py3_6_clang9_test

To test locally, use the following steps. First run an arbitrary test suite (you need to have some XML reports so that `test/print_test_stats.py` runs, but we'll be ignoring them here via the `--use-json` CLI option):

```
$ DATA_DIR=/tmp
$ ARBITRARY_TEST=testing
$ python test/test_$ARBITRARY_TEST.py --save-xml=$DATA_DIR/test/test_$ARBITRARY_TEST
```

Now choose a commit and a test job (it has to be on `master` since we're going to grab the test time data from S3, and [we only upload test times to S3 on the `master`, `nightly`, and `release` branches](https://github.com/pytorch/pytorch/pull/49645)):

```
$ export CIRCLE_SHA1=c39fb9771d89632c5c3a163d3c00af3bef1bd489
$ export CIRCLE_JOB=pytorch_linux_bionic_py3_6_clang9_test
```

Download the `*.json.bz2` file(s) for that commit/job pair:

```
$ aws s3 cp s3://ossci-metrics/test_time/$CIRCLE_SHA1/$CIRCLE_JOB/ $DATA_DIR/ossci-metrics/test_time/$CIRCLE_SHA1/$CIRCLE_JOB --recursive
```

And feed everything into `test/print_test_stats.py`:

```
$ bzip2 -kdc $DATA_DIR/ossci-metrics/test_time/$CIRCLE_SHA1/$CIRCLE_JOB/*Z.json.bz2 |
  torch/testing/_internal/print_test_stats.py --compare-with-s3 --use-json=/dev/stdin $DATA_DIR/test/test_$ARBITRARY_TEST
```

The first part of the output should be the same as before this PR; here is the new part, at the end of the output:

- https://pastebin.com/Jj1svhAn

Reviewed By: malfet, izdeby

Differential Revision: D26317769

Pulled By: samestep

fbshipit-source-id: 1ba06cec0fafac77f9e7341d57079543052d73db
# Code Coverage Tool for Pytorch

## Overview

This tool calculates code coverage for the Pytorch project. It's an integrated tool: you can use it to run tests and generate both file-level and line-level reports for C++ and Python tests. It is also the tool we use in CircleCI to generate a report for each master commit.
## Simple

- Simple command to run:

  ```
  python oss_coverage.py
  ```

- The `--clean` argument will do all the messy cleanup for you.
## But Powerful

- Choose the folders you are interested in:
  - The default folder is good enough in most cases
  - Flexible: you can specify one or more folder(s) that you are interested in
- Run only the tests you want:
  - By default it will run all the C++ and Python tests
  - Flexible: you can specify one or more test(s) that you want to run
- Final report:
  - File-level: the coverage percentage for each file you are interested in
  - Line-level: the coverage details for each line in each file you are interested in
  - HTML report (only for `gcc`): a graphical HTML report generated by `lcov` that combines the file-level and line-level reports
- More complex but flexible options:
  - Use different stages like `--run`, `--export`, and `--summary` to achieve more flexible functionality
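To make the two report kinds concrete, here is a minimal sketch of what a file-level percentage boils down to. This is a hypothetical helper for illustration only, not the tool's actual code:

```python
def file_level_coverage(covered_lines, uncovered_lines):
    """File-level report: percent of instrumented lines that were executed.

    The line-level report is just the raw sets themselves: which lines
    were hit, and which were not.
    """
    total = len(covered_lines) + len(uncovered_lines)
    if total == 0:
        return 0.0
    return 100.0 * len(covered_lines) / total

# Lines 1, 2, 5 executed; line 7 not executed.
print(file_level_coverage({1, 2, 5}, {7}))  # → 75.0
```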
## How to use

This section introduces the arguments you can use when running this tool. The arguments are powerful, giving you full flexibility to do different work.

We have two different compilers, `gcc` and `clang`, and this tool supports both. However, `gcc` is recommended because it's much faster and uses less disk space. The examples below are therefore divided into two parts, for `gcc` and `clang`.
## Preparation

The first step is to build Pytorch from source with the `CODE_COVERAGE` option set to `ON`. You may also want to set the `BUILD_TEST` option to `ON` to get the test binaries. Besides, if you are using the `gcc` compiler, it is recommended to also set `CMAKE_BUILD_CONFIG=Debug` to get accurate results.

See how to adjust build options for reference. The following is one way to adjust the build options:

```
# in the build/ folder (all build artifacts must be in the `build/` folder)
cmake .. -DCODE_COVERAGE=ON -DBUILD_TEST=ON -DCMAKE_BUILD_CONFIG=Debug
```
## Examples

The tool auto-detects the compiler type on your operating system, but if you are using a different one, you need to specify it. Besides, if you are using `clang`, the `llvm` tools are required. So the first step is to set some environment variables if needed:

```
# set the compiler type; the default is auto-detected, and you can check it at the start of log.txt
export COMPILER_TYPE="CLANG"
# set the llvm path for clang; by default it is /usr/local/opt/llvm/bin
export LLVM_TOOL_PATH=...
```
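As a rough illustration, auto-detection with an environment override could look like the sketch below. This is an assumption about the general pattern, not the tool's actual implementation, and `detect_compiler_type` is a hypothetical name:

```python
import os
import shutil

def detect_compiler_type(environ=os.environ):
    # An explicit COMPILER_TYPE setting wins over auto-detection
    override = environ.get("COMPILER_TYPE")
    if override:
        return override.upper()
    # Otherwise pick whichever supported compiler is found on PATH
    for binary, kind in (("clang", "CLANG"), ("gcc", "GCC")):
        if shutil.which(binary):
            return kind
    raise RuntimeError("no supported compiler (gcc/clang) found on PATH")

print(detect_compiler_type({"COMPILER_TYPE": "clang"}))  # → CLANG
```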
Great, you are ready to run the code coverage tool for the first time! Start with this simple command:

```
python oss_coverage.py --run-only=atest
```

This command runs the `atest` binary in the `build/bin/` folder and generates reports over the entire Pytorch folder. You can find the reports in `profile/summary`. But you may only be interested in the `aten` folder; in that case, try:

```
python oss_coverage.py --run-only=atest --interested-only=aten
```

In Pytorch, C++ tests are located in `build/bin/` and Python tests are located in `test/`. If you want to run a Python test, try:

```
python oss_coverage.py --run-only=test_complex.py
```

You may also want to specify more than one test or interested folder; in that case, try:

```
python oss_coverage.py --run-only=atest c10_logging_test --interested-only aten/src/Aten c10/core
```

That's it! With these two simple options, you can customize many different functionalities according to your needs.
By default, the tool runs all tests in the `build/bin` folder (by running all executable binaries in it) and the `test/` folder (by running `run_test.py`), and then collects coverage over the entire Pytorch folder. If this is what you want, try:

(Note: it's not recommended to run all the tests by default under `clang`, because it will take too much disk space.)

```
python oss_coverage.py
```
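"Running all tests in `build/bin`" presumably amounts to enumerating the executable files in that folder. Here is a minimal sketch of that idea; the helper name is illustrative, not the tool's code:

```python
import os

def list_test_binaries(bin_dir):
    # Collect regular files with the executable bit set, in stable order
    binaries = []
    for name in sorted(os.listdir(bin_dir)):
        path = os.path.join(bin_dir, name)
        if os.path.isfile(path) and os.access(path, os.X_OK):
            binaries.append(path)
    return binaries
```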
## For more complex arguments and functionalities

### GCC

Code coverage with the `gcc` compiler can be divided into 3 steps:

1. Run the tests: `--run`
2. Run `gcov` to get a JSON report: `--export`
3. Summarize it into a human-readable file report and line report: `--summary`

By default all steps will be run, but you can specify to run only one of them. The following are some usage scenarios:
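The default-to-all-stages behavior can be pictured as follows. This is a sketch under the assumption that stages always execute in pipeline order; the function name is illustrative:

```python
ALL_STAGES = ["run", "export", "summary"]

def select_stages(requested):
    # No stage flags given: run the whole pipeline
    if not requested:
        return list(ALL_STAGES)
    # Otherwise keep pipeline order, but only the requested stages
    return [s for s in ALL_STAGES if s in requested]

print(select_stages([]))           # → ['run', 'export', 'summary']
print(select_stages(["summary"]))  # → ['summary']
```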
#### 1. Interested in a different folder

`--summary` is useful when you have a different interested folder. For example:

```
# after running this command
python oss_coverage.py --run-only=atest --interested-folder=aten
# you may then want to learn atest's coverage over c10; instead of running the test again, you can:
python oss_coverage.py --run-only=atest --interested-folder=c10 --summary
```
#### 2. Run tests yourself

When you are developing a new feature, you may first run the tests yourself to make sure the implementation is right, and then want to learn its coverage. But sometimes the tests take a very long time and you don't want to wait to run them again when doing code coverage. In this case, you can use these arguments to accelerate your development (make sure you build Pytorch with the coverage option!):

```
# run tests when you are developing a new feature; assume the test is `test_nn.py`
python oss_coverage.py --run-only=test_nn.py
# or you can run it yourself
cd test/ && python test_nn.py
# then, when you want to learn about code coverage, you can just run:
python oss_coverage.py --run-only=test_nn.py --export --summary
```
### CLANG

The steps for `clang` are very similar to those for `gcc`, but the export stage is divided into two steps:

1. Run the tests: `--run`
2. Run `gcov` to get a JSON report: `--merge` `--export`
3. Summarize it into a human-readable file report and line report: `--summary`

Therefore, just replace `--export` in the `gcc` examples with `--merge` and `--export`, and you will find it works!
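That substitution rule can be expressed mechanically; a tiny sketch, purely illustrative and not part of the tool:

```python
def gcc_flags_to_clang(flags):
    # Per the rule above: --export becomes --merge followed by --export
    out = []
    for flag in flags:
        if flag == "--export":
            out.extend(["--merge", "--export"])
        else:
            out.append(flag)
    return out

print(gcc_flags_to_clang(["--run", "--export", "--summary"]))
# → ['--run', '--merge', '--export', '--summary']
```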
## Reference

For `gcc`:

- To see how to invoke `gcov`, reading "Invoking gcov" in the GCC manual will be helpful.

For `clang`:

- If you are not familiar with the procedure of generating a code coverage report using `clang`, reading "Source-based Code Coverage" in the Clang documentation will be helpful.