## Architecture overview
<p align="center" width="100%">
<img width="60%" src="https://github.com/user-attachments/assets/aa7b56f9-a610-4f99-b793-1989b819d023">
<img width="60%" src="https://github.com/user-attachments/assets/f6bc50d0-3c78-47b7-bfd1-80c4d9f3bc18">
</p>

At a high level, PyTorch OSS benchmark infra consists of 5 different components:
1. The servers where the benchmarks are run. They come from various sources depending on availability. Some notable ones are:
   1. CUDA benchmarks like `torch.compile` on [linux.aws.h100](https://github.com/pytorch/pytorch/blob/main/.github/workflows/inductor-perf-test-nightly-h100.yml)
   2. ROCm benchmarks on [linux.rocm.gpu.mi300.2](https://github.com/pytorch/pytorch/blob/main/.github/workflows/inductor-perf-test-nightly-rocm.yml)
   3. x86 CPU benchmarks on [linux.24xl.spr-metal](https://github.com/pytorch/pytorch/blob/main/.github/workflows/inductor-perf-test-nightly-x86.yml)
   4. aarch64 CPU benchmarks on [linux.arm64.m7g.metal](https://github.com/pytorch/pytorch/blob/main/.github/workflows/inductor-perf-test-nightly-aarch64.yml)
   5. MPS benchmarks on [macos-m2-15](https://github.com/pytorch/pytorch/blob/main/.github/workflows/inductor-perf-test-nightly-macos.yml)
   6. [Android](https://github.com/pytorch/executorch/blob/main/.github/workflows/android-perf.yml) and [iOS](https://github.com/pytorch/executorch/blob/main/.github/workflows/apple-perf.yml) benchmarks on AWS Device Farm
2. The integration layer where benchmark results are processed. To support different use cases across the PyTorch org, we don't dictate which benchmarks can be run or how. Instead, we provide an integration touch point on GitHub for CI and an API to upload benchmark results when running in a local environment. This gives onboarding teams the flexibility to run their benchmarks in their own way, as long as the results are saved in a standardized format. The format is documented [here](https://github.com/pytorch/pytorch/wiki/How-to-integrate-with-PyTorch-OSS-benchmark-infra#output-format)
3. The centralized benchmark database, located on ClickHouse Cloud at https://console.clickhouse.cloud under the `benchmark` database and the `oss_ci_benchmark_v3` table
4. The family of [HUD benchmark dashboards](https://hud.pytorch.org), whose code lives in PyTorch [test-infra](https://github.com/pytorch/test-infra/tree/main/torchci/pages/benchmark)
5. An UPCOMING collection of benchmark-related tooling, including:
   1. A querying API to access the benchmark data programmatically
   2. A regression notification mechanism (via Grafana)
   3. A bisecting tool to root-cause the regression
## Benchmark results format
Your benchmark results need to be a list of metrics in the following format. All fields are optional unless specified otherwise.

Note that the JSON list is optional. Writing one JSON record per line ([JSONEachRow](https://clickhouse.com/docs/en/interfaces/formats#jsoneachrow)) is also accepted.
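
As a rough illustration only, the sketch below builds one record and serializes it both ways. It shows just the handful of fields that also appear in the sample query in the Database section (`benchmark.name`, `model.name`, `metric.name`, `metric.benchmark_values`); all names and values here are placeholders, so refer to the format specification above for the full schema.

```
# Minimal sketch of one benchmark result record; field values are placeholders.
import json

record = {
    "benchmark": {"name": "My PyTorch benchmark"},
    "model": {"name": "my_model"},
    "metric": {"name": "latency_ms", "benchmark_values": [12.3, 12.5, 12.1]},
}

# Serialized as a JSON list ...
print(json.dumps([record]))
# ... or as one JSON record per line (JSONEachRow)
print(json.dumps(record))
```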
## Upload the benchmark results
### Upload API
The API is currently deployed at `https://kvvka55vt7t2dzl6qlxys72kra0xtirv.lambda-url.us-east-1.on.aws`. It accepts the benchmark result JSON together with an S3 path indicating where to store it. The API is gated behind a credential at the moment, so please reach out to PyTorch Dev Infra if you need to use it.
```
# A sketch of calling the upload API with Python requests. The S3 path below is
# only an example: any path under the v3/ directory is OK, and if the path
# already exists, the API will refuse to overwrite it.
import json

import requests

s3_path = f"v3/{repo_name}/{head_branch}/{head_sha}/{device}/benchmark_results.json"

payload = {
    "username": UPLOADER_USERNAME,
    "password": UPLOADER_PASSWORD,
    "s3_path": s3_path,
    "content": json.dumps(benchmark_results),
}
headers = {"content-type": "application/json"}

requests.post(
    "https://kvvka55vt7t2dzl6qlxys72kra0xtirv.lambda-url.us-east-1.on.aws",
    json=payload,
    headers=headers,
)
```
You can also use the [upload_benchmark_results.py](https://github.com/pytorch/pytorch-integration-testing/blob/main/vllm-benchmarks/upload_benchmark_results.py) script, which implements the above logic, in your bash script. For example:
```
# Credentials for the upload API (request them from PyTorch Dev Infra)
UPLOADER_USERNAME=<REDACT>
UPLOADER_PASSWORD=<REDACT>
# Use the GPU model name (e.g. H100) as the device
GPU_DEVICE=$(nvidia-smi -i 0 --query-gpu=name --format=csv,noheader | awk '{print $2}')

python upload_benchmark_results.py \
  --repo pytorch \
  --benchmark-name "My PyTorch benchmark" \
  --benchmark-results benchmark-results-dir \
  --device "${GPU_DEVICE}" \
  --dry-run
```
### GitHub CI
1. If you are using PyTorch AWS self-hosted runners, they already have permission to upload the benchmark results. There is nothing else to prepare.
2. If you are using something else (non-AWS), for example ROCm runners, please reach out to the PyTorch Dev Infra team (PoC @huydhn) to create a GitHub environment with permission to write to S3. The environment is called `upload-benchmark-results`. For example, see [android-perf.yml](https://github.com/pytorch/executorch/blob/9666ee8259065a80858603b3bc9b95a71ecfe460/.github/workflows/android-perf.yml#L385)
#### A sample job on AWS self-hosted runners
```
name: A sample benchmark job that runs on all main commits
# ...
      github-token: ${{ secrets.GITHUB_TOKEN }}
```
#### A sample job on non-AWS runners
```
name: A sample benchmark job that runs on all main commits
# ...
      dry-run: false
      schema-version: v3
      github-token: ${{ secrets.GITHUB_TOKEN }}
```
## Database
The benchmark database on ClickHouse Cloud is currently accessible to all Metamates. We also provide a [ClickHouse MCP server](https://github.com/izaitsevfb/clickhouse-mcp) that you can install to access the database via an AI agent like Claude Code.

A quick way to check your access to the database is to follow these steps:
1. Log in to https://console.clickhouse.cloud. Metamates can log in with SSO using their Meta email and request access; read-only access is granted by default.
2. Select the `benchmark` database
3. Run a sample query:
```
select
    head_branch,
    head_sha,
    benchmark,
    model.name as model,
    metric.name as name,
    arrayAvg(metric.benchmark_values) as value
from
    oss_ci_benchmark_v3
where
    tupleElement(benchmark, 'name') = 'TorchAO benchmark'
    and oss_ci_benchmark_v3.timestamp < 1733870813
    and oss_ci_benchmark_v3.timestamp > 1733784413
```
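
If you prefer to query the database programmatically rather than through the console, a sketch along the lines below should work with the [clickhouse-connect](https://clickhouse.com/docs/en/integrations/python) Python client. The host, username, and password are placeholders for your own read-only ClickHouse Cloud credentials; this is just direct database access, not the upcoming querying API mentioned in the architecture overview.

```
# Sketch: run the sample query with clickhouse-connect (pip install clickhouse-connect).
# Host and credentials below are placeholders for your read-only access.
import clickhouse_connect

client = clickhouse_connect.get_client(
    host="<your-clickhouse-cloud-host>",
    username="<read-only-user>",
    password="<password>",
    database="benchmark",
    secure=True,
)

result = client.query(
    """
    select
        head_branch,
        head_sha,
        tupleElement(benchmark, 'name') as benchmark,
        model.name as model,
        metric.name as metric,
        arrayAvg(metric.benchmark_values) as value
    from oss_ci_benchmark_v3
    where tupleElement(benchmark, 'name') = 'TorchAO benchmark'
      and timestamp > 1733784413
      and timestamp < 1733870813
    """
)

# Each row is (head_branch, head_sha, benchmark, model, metric, value)
for row in result.result_rows:
    print(row)
```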