pytorch/tools/code_coverage
Sam Estep 21ef248fb8 [reland] Report test time regressions (#50171)
Summary:
This is a followup to https://github.com/pytorch/pytorch/issues/49190. Vaguely speaking, the goals are to make it easy to identify test time regressions introduced by PRs. Eventually the hope is to use this information to edit Dr CI comments, but this particular PR just does the analysis and prints it to stdout, so a followup PR would be needed to edit the actual comments on GitHub.

**Important:** for uninteresting reasons, this PR moves the `print_test_stats.py` file.

- *Before:* `test/print_test_stats.py`
- *After:* `torch/testing/_internal/print_test_stats.py`

Notes on the approach:

- Just getting the mean and stdev for the total job time of the last _N_ commits isn't sufficient, because e.g. if `master` was broken 5 commits ago, then a lot of those job times will be much shorter, breaking the statistics.
- We use the commit history to make better estimates for the mean and stdev of individual test (and suite) times, but only when the test in that historical commit is present and its status matches that of the base commit.
- We list all the tests that were removed or added, or whose status changed (e.g. skipped to not skipped, or vice versa), along with time (estimate) info for that test case and its containing suite.
- We don't list tests whose time changed a lot if their status didn't change, because there's a lot of noise and it's unclear how to do that well without too many false positives.
- We show a human-readable commit graph that indicates exactly how many commits are in the pool of commits that could be causing regressions (e.g. if a PR has multiple commits in it, or if the base commit on `master` doesn't have a report in S3).
- We don't show an overall estimate of whether the PR increased or decreased the total test job time, because it's noisy and it's a bit tricky to aggregate stdevs up from individual tests to the whole job level. This might change in a followup PR.
- Instead, we simply show a summary at the bottom which says how many tests were removed/added/modified (where "modified" means that the status changed), and our best estimates of the mean times (and stdevs) of those changes.
- Importantly, the summary at the bottom is only for the test cases that were already shown in the more verbose diff report, and does not include any information about tests whose status didn't change but whose running time got much longer.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/50171

Test Plan:
To run the unit tests:
```
$ python test/test_testing.py
$ python test/print_test_stats.py
```

To verify that this works, check the [CircleCI logs](https://app.circleci.com/pipelines/github/pytorch/pytorch/258628/workflows/9cfadc34-e042-485e-b3b3-dc251f160307) for a test job run on this PR; for example:
- pytorch_linux_bionic_py3_6_clang9_test

To test locally, use the following steps.

First run an arbitrary test suite (you need to have some XML reports so that `test/print_test_stats.py` runs, but we'll be ignoring them here via the `--use-json` CLI option):
```
$ DATA_DIR=/tmp
$ ARBITRARY_TEST=testing
$ python test/test_$ARBITRARY_TEST.py --save-xml=$DATA_DIR/test/test_$ARBITRARY_TEST
```
Now choose a commit and a test job (it has to be on `master` since we're going to grab the test time data from S3, and [we only upload test times to S3 on the `master`, `nightly`, and `release` branches](https://github.com/pytorch/pytorch/pull/49645)):
```
$ export CIRCLE_SHA1=c39fb9771d89632c5c3a163d3c00af3bef1bd489
$ export CIRCLE_JOB=pytorch_linux_bionic_py3_6_clang9_test
```
Download the `*.json.bz2` file(s) for that commit/job pair:
```
$ aws s3 cp s3://ossci-metrics/test_time/$CIRCLE_SHA1/$CIRCLE_JOB/ $DATA_DIR/ossci-metrics/test_time/$CIRCLE_SHA1/$CIRCLE_JOB --recursive
```
And feed everything into `test/print_test_stats.py`:
```
$ bzip2 -kdc $DATA_DIR/ossci-metrics/test_time/$CIRCLE_SHA1/$CIRCLE_JOB/*Z.json.bz2 | torch/testing/_internal/print_test_stats.py --compare-with-s3 --use-json=/dev/stdin $DATA_DIR/test/test_$ARBITRARY_TEST
```
The first part of the output should be the same as before this PR; here is the new part, at the end of the output:

- https://pastebin.com/Jj1svhAn

Reviewed By: malfet, izdeby

Differential Revision: D26317769

Pulled By: samestep

fbshipit-source-id: 1ba06cec0fafac77f9e7341d57079543052d73db
2021-02-08 15:35:21 -08:00

Code Coverage Tool for PyTorch

Overview

This tool calculates code coverage for the PyTorch project. It is an integrated tool: you can use it to run tests and generate both file-level and line-level reports for C++ and Python tests. It will also be the tool we use in CircleCI to generate a report for each master commit.

Simple

  • Simple command to run:
    • python oss_coverage.py
  • The --clean argument handles all the messy cleanup for you

But Powerful

  • Choose the folders you are interested in:
    • The default folder is good enough most of the time
    • Flexible: you can specify one or more folder(s) that you are interested in
  • Run only the test you want:
    • By default it will run all the C++ and Python tests
    • Flexible: you can specify one or more test(s) that you want to run
  • Final report:
    • File-Level: The coverage percentage for each file you are interested in
    • Line-Level: The coverage details for each line in each file you are interested in
    • HTML-Report (only for gcc): An HTML report generated by lcov that combines the file-level and line-level reports into a graphical view.
  • More complex but flexible options:
    • Use different stages like --run, --export, --summary to achieve more flexible functionality

How to use

This part introduces the arguments you can use when running this tool. The arguments are powerful and give you full flexibility to do different work. We have two different compilers, gcc and clang, and this tool supports both. However, gcc is recommended because it is much faster and uses less disk space. The examples are also divided into two parts, one for gcc and one for clang.

Preparation

The first step is to build PyTorch from source with the CODE_COVERAGE option ON. You may also want to set the BUILD_TEST option ON to get the test binaries. In addition, if you are using the gcc compiler, it is recommended to also select CMAKE_BUILD_CONFIG=Debug to get accurate results. See how to adjust build options for reference. The following is one way to adjust the build options:

# in build/ folder (all build artifacts must be in the `build/` folder)
cmake .. -DCODE_COVERAGE=ON -DBUILD_TEST=ON -DCMAKE_BUILD_CONFIG=Debug

Examples

The tool will auto-detect the compiler type on your system, but if you are using a different one, you need to specify it. In addition, if you are using clang, the llvm tools are required. So the first step is to set some environment variables if needed:

# set compiler type, the default is auto detected, you can check it at the start of log.txt
export COMPILER_TYPE="CLANG"
# set llvm path for clang, by default is /usr/local/opt/llvm/bin
export LLVM_TOOL_PATH=...
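
To make the interplay of these two variables concrete, here is a minimal Python sketch of how such settings could be resolved. It is not the tool's actual code: the function `resolve_compiler_settings` and its `detected` parameter are hypothetical, and only the variable names and the default llvm path come from the text above.

```python
from typing import Mapping, Optional, Tuple

# Default llvm tool location stated in this README.
DEFAULT_LLVM_TOOL_PATH = "/usr/local/opt/llvm/bin"


def resolve_compiler_settings(
    env: Mapping[str, str], detected: str
) -> Tuple[str, Optional[str]]:
    """Hypothetical helper: return (compiler_type, llvm_tool_path).

    `detected` stands in for whatever auto-detection reports; an explicit
    COMPILER_TYPE in `env` takes precedence over it.
    """
    compiler = env.get("COMPILER_TYPE", detected).upper()
    if compiler not in ("GCC", "CLANG"):
        raise ValueError(f"unsupported compiler type: {compiler}")
    # The llvm tools only matter for clang, so gcc gets no path at all.
    if compiler == "CLANG":
        return compiler, env.get("LLVM_TOOL_PATH", DEFAULT_LLVM_TOOL_PATH)
    return compiler, None


# No overrides: the detected compiler is used.
print(resolve_compiler_settings({}, detected="GCC"))
# Explicit override to clang, falling back to the default llvm path.
print(resolve_compiler_settings({"COMPILER_TYPE": "CLANG"}, detected="GCC"))
```

In practice you would pass `os.environ` as `env`; a plain dict is used here so the example does not depend on your shell.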

Great, you are ready to run the code coverage tool for the first time! Start with a simple command:

python oss_coverage.py --run-only=atest

This command will run the atest binary in the build/bin/ folder and generate reports over the entire PyTorch folder. You can find the reports in profile/summary. If you are only interested in the aten folder, try:

python oss_coverage.py --run-only=atest --interested-only=aten

In PyTorch, C++ tests are located in build/bin/ and Python tests are located in test/. If you want to run a Python test, try:

python oss_coverage.py --run-only=test_complex.py

You may also want to specify more than one test or more than one folder of interest. In this case, try:

python oss_coverage.py --run-only=atest c10_logging_test --interested-only aten/src/ATen c10/core

That's it! With these two simple options, you can customize the tool's behavior according to your needs. By default, the tool will run all tests in the build/bin folder (by running all executable binaries in it) and the test/ folder (by running run_test.py), and then collect coverage over the entire PyTorch folder. If this is what you want, try the following. (Note: running all tests by default is not recommended with clang, because it takes too much disk space.)

python oss_coverage.py
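
As an illustration of what "running all executable binaries" in a folder involves, the following Python sketch lists the candidates. It is only a sketch under stated assumptions: `list_test_binaries` is a hypothetical helper, not a function of the tool, and the real discovery logic may differ.

```python
import os
from pathlib import Path
from typing import List


def list_test_binaries(bin_dir: Path) -> List[str]:
    """Hypothetical helper: names of executable regular files in bin_dir, sorted."""
    if not bin_dir.is_dir():
        return []
    return sorted(
        entry.name
        for entry in bin_dir.iterdir()
        # Skip sub-directories and files without the executable permission.
        if entry.is_file() and os.access(entry, os.X_OK)
    )


# Lists C++ test binaries when run from the root of a PyTorch checkout
# that has been built with tests; prints [] if the folder does not exist.
print(list_test_binaries(Path("build/bin")))
```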

For more complex arguments and functionalities

GCC

Code coverage with the gcc compiler can be divided into three steps:

  1. run the tests: --run
  2. run gcov to get json report: --export
  3. summarize it into human-readable file-level and line-level reports: --summary

By default all steps are run, but you can choose to run only some of them. The following are some usage scenarios:

1. Interested in different folders

--summary is useful when you want coverage over a different folder of interest. For example,

# after running this command
python oss_coverage.py --run-only=atest --interested-only=aten
# you may then want to learn atest's coverage over c10; instead of running the test again, you can:
python oss_coverage.py --run-only=atest --interested-only=c10 --summary
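
The reason the summary stage can be repeated cheaply is that it only filters and aggregates data that has already been exported. A minimal Python sketch of that idea follows. The data layout, the `summarize` function, and the prefix-based folder matching are all assumptions made for illustration, not the tool's actual format or logic, and the coverage numbers are made up.

```python
from typing import Dict, List, Set, Tuple

# Assumed layout: file path -> (covered line numbers, uncovered line numbers).
LineCoverage = Dict[str, Tuple[Set[int], Set[int]]]


def summarize(coverage: LineCoverage, interested: List[str]) -> Dict[str, float]:
    """File-level percentages for files under any of the interested folders."""
    report: Dict[str, float] = {}
    for path, (covered, uncovered) in coverage.items():
        # Keep a file only if it lives under one of the interested folders.
        if not any(path.startswith(folder.rstrip("/") + "/") for folder in interested):
            continue
        total = len(covered) + len(uncovered)
        report[path] = 100.0 * len(covered) / total if total else 0.0
    return report


# Made-up line-level data standing in for one exported test run.
exported: LineCoverage = {
    "aten/src/ATen/Context.cpp": ({1, 2, 3}, {4}),
    "c10/core/Device.cpp": ({10, 11}, {12, 13}),
}
print(summarize(exported, ["aten"]))  # {'aten/src/ATen/Context.cpp': 75.0}
print(summarize(exported, ["c10"]))   # {'c10/core/Device.cpp': 50.0}
```

The same exported data answers both questions, which is why only the summary step needs to be rerun when the folder of interest changes.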

2. Run tests yourself

When you are developing a new feature, you may first run the tests yourself to make sure the implementation is correct, and then want to learn its coverage. But sometimes the tests take a very long time and you don't want to wait for them to run again when collecting code coverage. In this case, you can use these arguments to accelerate your development (make sure you build PyTorch with the coverage option!):

# run tests while you are developing a new feature; assume the test is `test_nn.py`
python oss_coverage.py --run-only=test_nn.py
# or you can run it yourself
cd test/ && python test_nn.py
# then you want to learn about code coverage, you can just run:
python oss_coverage.py --run-only=test_nn.py --export --summary

CLANG

The steps for clang are very similar to those for gcc, but the export stage is divided into two steps:

  1. run the tests: --run
  2. merge the profiles and export a json report with the llvm tools: --merge --export
  3. summarize it into human-readable file-level and line-level reports: --summary

Therefore, just replace --export in the gcc examples with --merge and --export, and they will work with clang.
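
The stage selection described for gcc and clang can be sketched in Python as follows. This is an illustrative sketch, not the tool's implementation: `resolve_stages` is a hypothetical helper, and rejecting a stage that a compiler does not list is an assumption of the sketch.

```python
from typing import Dict, Iterable, List

# Stage order per compiler, as listed in this README.
STAGES: Dict[str, List[str]] = {
    "GCC": ["run", "export", "summary"],
    "CLANG": ["run", "merge", "export", "summary"],
}


def resolve_stages(compiler: str, requested: Iterable[str]) -> List[str]:
    """Hypothetical helper: stages to execute, in pipeline order.

    An empty request means "run every stage", mirroring the default
    behaviour described above.
    """
    available = STAGES[compiler]
    wanted = set(requested)
    unknown = wanted - set(available)
    if unknown:
        raise ValueError(f"stages not available for {compiler}: {sorted(unknown)}")
    if not wanted:
        return list(available)
    # Preserve pipeline order regardless of the order the flags were given in.
    return [stage for stage in available if stage in wanted]


print(resolve_stages("GCC", []))                               # ['run', 'export', 'summary']
print(resolve_stages("GCC", ["summary", "export"]))            # ['export', 'summary']
print(resolve_stages("CLANG", ["merge", "export", "summary"])) # ['merge', 'export', 'summary']
```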

Reference

For gcc

  • To learn how to invoke gcov, reading Invoking gcov will be helpful.

For clang

  • If you are not familiar with the procedure for generating a code coverage report with clang, reading Source-based Code Coverage will be helpful.