<p align="center">
<img width="60%" src="https://github.com/user-attachments/assets/f6bc50d0-3c78-47b7-bfd1-80c4d9f3bc18">
</p>
At a high level, the PyTorch OSS benchmark infrastructure consists of 5 key components:
1. Benchmark Servers - These servers come from various sources based on availability. Notable examples include:
   1. CUDA benchmarks like `torch.compile` on [linux.aws.h100](https://github.com/pytorch/pytorch/blob/main/.github/workflows/inductor-perf-test-nightly-h100.yml)
   2. ROCm benchmarks on [linux.rocm.gpu.mi300.2](https://github.com/pytorch/pytorch/blob/main/.github/workflows/inductor-perf-test-nightly-rocm.yml)
   3. x86 CPU benchmarks on [linux.24xl.spr-metal](https://github.com/pytorch/pytorch/blob/main/.github/workflows/inductor-perf-test-nightly-x86.yml)
   4. aarch64 CPU benchmarks on [linux.arm64.m7g.metal](https://github.com/pytorch/pytorch/blob/main/.github/workflows/inductor-perf-test-nightly-aarch64.yml)
   5. MPS benchmarks on [macos-m2-15](https://github.com/pytorch/pytorch/blob/main/.github/workflows/inductor-perf-test-nightly-macos.yml)
   6. [Android](https://github.com/pytorch/executorch/blob/main/.github/workflows/android-perf.yml) and [iOS](https://github.com/pytorch/executorch/blob/main/.github/workflows/apple-perf.yml) benchmarks on AWS Device Farms
2. Integration Layer - This is where benchmark results are processed. To support different use cases across the PyTorch organization, we don't dictate which benchmarks to run or how to run them. Instead, we provide an integration point on GitHub for CI and an API to upload benchmark results when running in a local environment. This gives teams the flexibility to run benchmarks their own way as long as the results are saved in a standardized format. This format is documented [here](https://github.com/pytorch/pytorch/wiki/How-to-integrate-with-PyTorch-OSS-benchmark-infra#output-format)
3. Centralized Benchmark Database - Located on ClickHouse Cloud at https://console.clickhouse.cloud under the `benchmark` database and `oss_ci_benchmark_v3` table.
4. HUD Benchmark Dashboards - The [benchmark dashboard family](https://hud.pytorch.org) with code in PyTorch [test-infra](https://github.com/pytorch/test-infra/tree/main/torchci/pages/benchmark)
5. UPCOMING Benchmark Tooling Collection:
   1. A querying API for programmatic benchmark data access
   2. A regression notification mechanism (via Grafana)
   3. A bisecting tool to identify root causes of regressions
## Benchmark results format
Your benchmark results should be formatted as a list of metrics as shown below. All fields are optional unless specified as required.
```
// The list of all benchmark metrics
[
  // ... metric entries, whose fields are described in the format documentation linked above ...
]
```
Note that using a JSON list is optional. Writing one JSON record per line ([JSONEachRow](https://clickhouse.com/docs/en/interfaces/formats#jsoneachrow)) is also accepted.
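
For example, a benchmark script can emit its results in the JSONEachRow form with a few lines of Python. The sketch below is illustrative only: the field names inside each record are placeholders, and the authoritative set of fields is the one described in the format documentation linked above.

```python
import json

# Illustrative records: the real field names and nesting are defined by the
# benchmark results format documented above
results = [
    {
        "benchmark": {"name": "my_benchmark"},
        "model": {"name": "my_model"},
        "metric": {"name": "latency_ms", "benchmark_values": [12.3, 12.5]},
    },
    {
        "benchmark": {"name": "my_benchmark"},
        "model": {"name": "my_model"},
        "metric": {"name": "speedup", "benchmark_values": [1.8]},
    },
]

# JSONEachRow: one JSON record per line, with no surrounding list
with open("benchmark_results.json", "w") as f:
    for record in results:
        f.write(json.dumps(record) + "\n")
```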
## Upload the benchmark results
### Upload API
The fastest way to upload benchmark results is to use the [upload_benchmark_results.py](https://github.com/pytorch/pytorch-integration-testing/blob/main/.github/scripts/upload_benchmark_results.py) script. The script requires `UPLOADER_[USERNAME|PASSWORD]` credentials, so please contact PyTorch Dev Infra if you need access. Once written to the database, benchmark results should be considered immutable, as updating or deleting them is a complex and costly process.

Here is an example usage:
```
python upload_benchmark_results.py \
    ... \
    --dry-run
```
Behind the scenes, we have an API deployed at `https://kvvka55vt7t2dzl6qlxys72kra0xtirv.lambda-url.us-east-1.on.aws` that accepts the benchmark result JSON and an S3 path where it will be stored.
```
# This path is an example - any path under the v3 directory is acceptable. If the path already exists, the API will not overwrite it
s3_path = f"v3/{repo_name}/{head_branch}/{head_sha}/{device}/benchmark_results.json"

payload = {
    # ... the remaining payload fields go here; the payload is then sent with requests.post
}
```
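
For illustration, a call to this endpoint could be shaped roughly like the sketch below. Treat it as a sketch only: the payload field names (`s3_path`, `content`) and the credential environment variable names are assumptions here; the authoritative request format lives in the upload script linked above.

```python
import os

import requests

# The upload endpoint mentioned above
UPLOAD_API_URL = "https://kvvka55vt7t2dzl6qlxys72kra0xtirv.lambda-url.us-east-1.on.aws"

# Assumed environment variables for the UPLOADER_[USERNAME|PASSWORD] credentials mentioned above
username = os.environ["UPLOADER_USERNAME"]
password = os.environ["UPLOADER_PASSWORD"]

# Example destination under the v3/ prefix; any unique path under v3/ works
s3_path = "v3/pytorch/pytorch/main/0123abc/cuda/benchmark_results.json"

# Hypothetical payload shape: the real field names are defined by upload_benchmark_results.py
with open("benchmark_results.json") as f:
    payload = {
        "username": username,
        "password": password,
        "s3_path": s3_path,
        "content": f.read(),
    }

response = requests.post(UPLOAD_API_URL, json=payload)
response.raise_for_status()
```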
### GitHub CI
1. If you are using PyTorch AWS self-hosted runners, they already have permission to upload benchmark results. No additional preparation is needed.
2. If you are using non-AWS runners (such as ROCm runners), please contact the PyTorch Dev Infra team (POC: @huydhn) to create a GitHub environment with S3 write permissions. This environment is called `upload-benchmark-results`. See [android-perf.yml](https://github.com/pytorch/executorch/blob/9666ee8259065a80858603b3bc9b95a71ecfe460/.github/workflows/android-perf.yml#L385) for an example.
#### A sample job on AWS self-hosted runners
## Query benchmark results
### [Experimental] Query API
An experimental query API is available at `https://queries.clickhouse.cloud/run/84649f4e-52c4-4cf9-bd6e-0a105ea145c8` for querying benchmark results from the database. Please contact PyTorch Dev Infra (@huydhn) if you need credentials to access it:
```
import os
# ... (credentials and the query request go here)
```
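
As an illustration, such a request could look like the sketch below. Everything beyond the endpoint URL is an assumption: the credential variable names, the basic-auth scheme, and the body with query variables and output format all depend on the credentials and query definition you receive from PyTorch Dev Infra.

```python
import os

import requests

# The experimental query endpoint mentioned above
QUERY_API_URL = "https://queries.clickhouse.cloud/run/84649f4e-52c4-4cf9-bd6e-0a105ea145c8"

# Assumed credential environment variables
key_id = os.environ["CLICKHOUSE_KEY_ID"]
key_secret = os.environ["CLICKHOUSE_KEY_SECRET"]

# Assumed request shape: basic auth plus a JSON body of query variables
response = requests.post(
    QUERY_API_URL,
    auth=(key_id, key_secret),
    json={
        "queryVariables": {"startTime": "2025-06-10", "stopTime": "2025-06-16"},
        "format": "JSONEachRow",
    },
)
response.raise_for_status()
print(response.text)
```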
### TorchAgent
To explore the benchmark database, we recommend using https://hud.pytorch.org/torchagent. You'll need write access to PyTorch to use the agent. The tool incorporates our [clickhouse-mcp](https://github.com/izaitsevfb/clickhouse-mcp) for database exploration.

For example, you can use this prompt to list available benchmarks: `List all the benchmark names from different GitHub repositories from Jun 10th to Jun 16th. List each name only once`
### Benchmark database
The benchmark database on ClickHouse Cloud is accessible to all Metamates. We also provide a [ClickHouse MCP server](https://github.com/izaitsevfb/clickhouse-mcp) that you can install to access the database through AI agents like Claude Code.

Follow these steps to access the database:
1. Log in to https://console.clickhouse.cloud. Metamates can log in with their Meta email via SSO and request access. Read-only access is granted by default.
2. Select the `benchmark` database
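
Once you have read-only access, the table can also be queried programmatically. Below is a minimal sketch using the [clickhouse-connect](https://github.com/ClickHouse/clickhouse-connect) Python client; the host and credential values are placeholders to be replaced with the connection details shown in your ClickHouse Cloud console.

```python
import clickhouse_connect

# Placeholder connection details: copy the real host and credentials from the
# ClickHouse Cloud console (read-only access is sufficient for querying)
client = clickhouse_connect.get_client(
    host="your-instance.clickhouse.cloud",
    port=8443,
    username="your_username",
    password="your_password",
    secure=True,
)

# Count the records in the centralized benchmark table mentioned above
result = client.query("SELECT count() FROM benchmark.oss_ci_benchmark_v3")
print(result.result_rows)
```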