The `usort` config in `pyproject.toml` has no effect due to a typo. Fixing the typo make `usort` do more and generate the changes in the PR. Except `pyproject.toml`, all changes are generated by `lintrunner -a --take UFMT --all-files`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/127126
Approved by: https://github.com/kit1980
The `usort` config in `pyproject.toml` has no effect due to a typo. Fixing the typo make `usort` do more and generate the changes in the PR. Except `pyproject.toml`, all changes are generated by `lintrunner -a --take UFMT --all-files`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/127126
Approved by: https://github.com/kit1980
ghstack dependencies: #127122, #127123, #127124, #127125
- Extend dynamo bench interface with '--compilers onnx' and '--compilers dynamo-onnx'
- ONNX bench exports model to onnx and runs in ONNX Runtime.
- Introduce error aggregation and report.
- Scripts to build ONNX deps and running ONNX bench.
- Huggingface accuracy check workaround for ONNX.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103135
Approved by: https://github.com/thiagocrepaldi, https://github.com/jansel
I notice from the Rockset data that there are only `float32` records, while there should be both dtypes there. It turns out that the benchmarks script generated by `runner.py` always removes the output directory by default, so there are only records from `float32` running later left.
For example, `rm -rf /var/lib/jenkins/workspace/test/test-reports` appeared twice in the CI log https://ossci-raw-job-status.s3.amazonaws.com/log/11840774308.
I'm adding a new flag `--keep-output-dir` to keep the output directory. This is off by default as I'm not sure how this script is used internally, people probably expect to see the output directory cleaned up everytime.
### Testing
Not really want to start the 10h jobs just to test this small flag, so I'm triple check the change to make sure that there is no bug
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96398
Approved by: https://github.com/weiwangmeta
Preferring dash over underscore in command-line options. Add `--command-arg-name` to the argument parser. The old arguments with underscores `--command_arg_name` are kept for backward compatibility.
Both dashes and underscores are used in the PyTorch codebase. Some argument parsers only have dashes or only have underscores in arguments. For example, the `torchrun` utility for distributed training only accepts underscore arguments (e.g., `--master_port`). The dashes are more common in other command-line tools. And it looks to be the default choice in the Python standard library:
`argparse.BooleanOptionalAction`: 4a9dff0e5a/Lib/argparse.py (L893-L895)
```python
class BooleanOptionalAction(Action):
def __init__(...):
if option_string.startswith('--'):
option_string = '--no-' + option_string[2:]
_option_strings.append(option_string)
```
It adds `--no-argname`, not `--no_argname`. Also typing `_` need to press the shift or the caps-lock key than `-`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94505
Approved by: https://github.com/ezyang, https://github.com/seemethere
Summary stat diff was reporting diff between previous day and the day before that, instead of today and previous day. Issue was because summary stats were not uploaded to the archive before the summary stat differ was run.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89789
Approved by: https://github.com/anijain2305
A number of dashboard improvements:
- Add accuracy failures to warnings section
- Add regression detection to all metrics (speedup, compile time, peak memory), not just accuracy
- Add testing flag to update-dashboard to prevent image/comment uploads
- Add section for comparing summary statistics (passrate, speedup) between 2 most recent reports
- Show names of reports for summary stats diff and regression detection sections
- Remove metric graphs from the comment (they can still be found in the generated text file)
Sample comment: https://github.com/pytorch/torchdynamo/issues/1831#issuecomment-1317565972
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89155
Approved by: https://github.com/anijain2305
Fix dashboard comment failure due to the following trace:
```
Traceback (most recent call last):
File "/scratch/anijain/dashboard/work/pytorch/benchmarks/dynamo/runner.py", line 1180, in <module>
DashboardUpdater(args).update()
File "/scratch/anijain/dashboard/work/pytorch/benchmarks/dynamo/runner.py", line 1119, in update
self.comment_on_gh(comment)
File "/scratch/anijain/dashboard/work/pytorch/benchmarks/dynamo/runner.py", line 1096, in comment_on_gh
subprocess.check_call(
File "/scratch/anijain/dashboard/env/lib/python3.9/subprocess.py", line 368, in check_call
retcode = call(*popenargs, **kwargs)
File "/scratch/anijain/dashboard/env/lib/python3.9/subprocess.py", line 349, in call
with Popen(*popenargs, **kwargs) as p:
File "/scratch/anijain/dashboard/env/lib/python3.9/subprocess.py", line 951, in __init__
self._execute_child(args, executable, preexec_fn, close_fds,
File "/scratch/anijain/dashboard/env/lib/python3.9/subprocess.py", line 1821, in _execute_child
raise child_exception_type(errno_num, err_msg, err_filename)
OSError: [Errno 7] Argument list too long: '/data/home/anijain/miniconda/bin/gh'
srun: error: a100-st-p4d24xlarge-27: task 0: Exited with exit code 1
```
That is, we were trying to execute a gh command in the OS that was too long.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89002
Approved by: https://github.com/davidberard98
Add absolute latency to dashboard, as requested by https://github.com/pytorch/torchdynamo/issues/1833#issuecomment-1302742914
Tested by setting `run.sh` to
```
# Setup the output directory
rm -rf ../test-dynamo-runner-logs-7/
mkdir ../test-dynamo-runner-logs-7/
# Commands for torchbench for device=cuda, dtype=float32 for training and for performance testing
python benchmarks/dynamo/torchbench.py --performance --float32 -dcuda --output=../test-dynamo-runner-logs-7//inductor_torchbench_float32_training_cuda_performance.csv --training --inductor --no-skip --dashboard --only mobilenet_v2 --cold_start_latency
# Commands for torchbench for device=cuda, dtype=float32 for training and for accuracy testing
python benchmarks/dynamo/torchbench.py --accuracy --float32 -dcuda --output=../test-dynamo-runner-logs-7//inductor_torchbench_float32_training_cuda_accuracy.csv --training --inductor --no-skip --dashboard --only mobilenet_v2
```
and running `python benchmarks/dynamo/runner.py --output-dir ../test-dynamo-runner-logs-7/ --dashboard-archive-path /data/home/williamwen/dynamo-runner-logs-copy --training --run --compilers inductor --flag-compilers inductor --suites torchbench --update-dashboard` (need to comment out the `generate_commands` line and change the github issue ID from 681 to something else).
Sample comment: https://github.com/pytorch/torchdynamo/issues/1831#issuecomment-1309645562
NOTE: this change breaks processing old logs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88790
Approved by: https://github.com/anijain2305
Implement various features in https://github.com/pytorch/torchdynamo/issues/1644:
- Upload nightly run logs to /fsx before parsing - for backing up parsing failures.
- Flag models with (1) < 0.95x speedup, (2) > 2min compile time, (3) < 0.9x compression ratio
- Flag models that were passing yesterday but failed today.
- Other small bug fixes.
See https://github.com/pytorch/torchdynamo/issues/1831 for sample outputs.
Also tested by running run.sh:
```bash
# Setup the output directory
rm -rf ../test-dynamo-runner-logs-3/
mkdir ../test-dynamo-runner-logs-3/
# Commands for torchbench for device=cuda, dtype=float32 for training and for performance testing
python benchmarks/dynamo/torchbench.py --performance --float32 -dcuda --output=../test-dynamo-runner-logs-3//inductor_torchbench_float32_training_cuda_performance.csv --training --inductor --no-skip --dashboard --only mobilenet_v2 --cold_start_latency
# Commands for torchbench for device=cuda, dtype=float32 for training and for accuracy testing
python benchmarks/dynamo/torchbench.py --accuracy --float32 -dcuda --output=../test-dynamo-runner-logs-3//inductor_torchbench_float32_training_cuda_accuracy.csv --training --inductor --no-skip --dashboard --only mobilenet_v2
```
with the command
`python benchmarks/dynamo/runner.py --output-dir ../test-dynamo-runner-logs-3/ --dashboard-archive-path /data/home/williamwen/dynamo-runner-logs-copy --training --run --compilers inductor --flag-compilers inductor --suites torchbench --update-dashboard` (need to comment out the `generate_commands` line and change the github issue ID from 681 to something else).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88516
Approved by: https://github.com/anijain2305