mirror of
https://github.com/pytorch/pytorch.git
synced 2025-11-16 23:44:53 +08:00
Motivation: Update OpenBLAS and change the build script to enable SBGEMM kernels. Update the PyTorch `jammy` builds for aarch64 to use `install_openblas.sh` instead of `conda_install`.

Link to the full [TorchInductor Performance Dashboard AArch64](https://hud.pytorch.org/benchmark/compilers?dashboard=torchinductor&startTime=Fri%2C%2006%20Jun%202025%2009%3A46%3A35%20GMT&stopTime=Fri%2C%2013%20Jun%202025%2009%3A46%3A35%20GMT&granularity=hour&mode=inference&dtype=bfloat16&deviceName=cpu%20(aarch64)&lBranch=adi/update_openblas&lCommit=0218b65bcf61971c1861cfe8bc586168b73aeb5f&rBranch=main&rCommit=9d59b516e9b3026948918e3ff8c2ef55a33d13ad)

1. This shows a promising speedup across most of the HF models in the benchmark, with a significant boost to SDPA layers in particular.
2. Overall torch-bench pass rate (cpp_wrapper mode) increased `[87%, 65/75 → 96%, 72/75]`

<img width="676" alt="Screenshot 2025-06-20 at 17 05 15" src="https://github.com/user-attachments/assets/2ca9c1bc-80c6-464a-8db6-b758f2476582" />

Pull Request resolved: https://github.com/pytorch/pytorch/pull/151547
Approved by: https://github.com/malfet, https://github.com/snadampal, https://github.com/fadara01

Co-authored-by: Christopher Sidebottom <chris.sidebottom@arm.com>
Co-authored-by: Ryo Suzuki <ryo.suzuki@arm.com>
Co-authored-by: Ye Tao <ye.tao@arm.com>
Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
22 lines
467 B
Bash
#!/bin/bash
# Script used only in CD pipeline

# Exit on the first failing command and echo each command as it runs.
set -ex

cd /
# Shallow-clone the requested release tag (defaults to v0.3.30).
git clone https://github.com/OpenMathLib/OpenBLAS.git -b "${OPENBLAS_VERSION:-v0.3.30}" --depth 1 --shallow-submodules

OPENBLAS_CHECKOUT_DIR="OpenBLAS"
# One make variable per line; BUILD_BFLOAT16=1 enables the bfloat16 (SBGEMM) kernels.
OPENBLAS_BUILD_FLAGS="
NUM_THREADS=128
USE_OPENMP=1
NO_SHARED=0
DYNAMIC_ARCH=1
TARGET=ARMV8
CFLAGS=-O3
BUILD_BFLOAT16=1
"

# OPENBLAS_BUILD_FLAGS must stay unquoted: word splitting turns each line
# into a separate make argument.
make -j8 ${OPENBLAS_BUILD_FLAGS} -C "${OPENBLAS_CHECKOUT_DIR}"
make -j8 ${OPENBLAS_BUILD_FLAGS} install -C "${OPENBLAS_CHECKOUT_DIR}"
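The script passes `${OPENBLAS_BUILD_FLAGS}` to `make` without quotes. The following minimal sketch shows why that matters for a multi-line flag string. It is not part of the original script: `count_args` and `DEMO_FLAGS` are hypothetical names introduced only for illustration, and the flag list is a shortened copy of the one above.

```shell
#!/bin/bash
# Hypothetical helper for illustration: prints how many arguments it received.
count_args() {
  echo "$#"
}

# Shortened copy of the flag list, in the same one-per-line layout.
DEMO_FLAGS="
NUM_THREADS=128
USE_OPENMP=1
BUILD_BFLOAT16=1
"

# Unquoted: the shell splits on whitespace (including newlines), so each
# flag becomes its own argument.
count_args ${DEMO_FLAGS}     # prints 3

# Quoted: the whole string, newlines included, is passed as one argument,
# which make would not see as separate VAR=value assignments.
count_args "${DEMO_FLAGS}"   # prints 1
```

This relies on the default `IFS` (space, tab, newline). It also means a flag value containing spaces would be split apart, so a Bash array would likely be the safer choice if such a flag were ever added.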