Files
pytorch/.ci/docker/common/install_openblas.sh
Aditya Tewari f85954e043 Update OpenBLAS commit (#151547)
Motivation: Update OpenBLAS and change build script to enable SBGEMM kernels . Update pytorch `jammy` builds for aarch64 to use `install_openblas.sh` instead of `conda_install`

Link to full [TorchInductor Performance Dashboard AArch64](https://hud.pytorch.org/benchmark/compilers?dashboard=torchinductor&startTime=Fri%2C%2006%20Jun%202025%2009%3A46%3A35%20GMT&stopTime=Fri%2C%2013%20Jun%202025%2009%3A46%3A35%20GMT&granularity=hour&mode=inference&dtype=bfloat16&deviceName=cpu%20(aarch64)&lBranch=adi/update_openblas&lCommit=0218b65bcf61971c1861cfe8bc586168b73aeb5f&rBranch=main&rCommit=9d59b516e9b3026948918e3ff8c2ef55a33d13ad)

1. This shows a promising speedup across most of the HF models in benchmark, specifically giving a significant boost to SDPA layers.
2. Overall torch-bench pass-rate (cpp_wrapper mode) increased `[87%, 65/75 → 96%, 72/75]`

<img width="676" alt="Screenshot 2025-06-20 at 17 05 15" src="https://github.com/user-attachments/assets/2ca9c1bc-80c6-464a-8db6-b758f2476582" />

Pull Request resolved: https://github.com/pytorch/pytorch/pull/151547
Approved by: https://github.com/malfet, https://github.com/snadampal, https://github.com/fadara01

Co-authored-by: Christopher Sidebottom <chris.sidebottom@arm.com>
Co-authored-by: Ryo Suzuki <ryo.suzuki@arm.com>
Co-authored-by: Ye Tao <ye.tao@arm.com>
Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
2025-07-10 14:58:12 +00:00

22 lines
467 B
Bash

#!/bin/bash
# Script used only in CD pipeline
set -ex
cd /
git clone https://github.com/OpenMathLib/OpenBLAS.git -b "${OPENBLAS_VERSION:-v0.3.30}" --depth 1 --shallow-submodules
OPENBLAS_CHECKOUT_DIR="OpenBLAS"
OPENBLAS_BUILD_FLAGS="
NUM_THREADS=128
USE_OPENMP=1
NO_SHARED=0
DYNAMIC_ARCH=1
TARGET=ARMV8
CFLAGS=-O3
BUILD_BFLOAT16=1
"
make -j8 ${OPENBLAS_BUILD_FLAGS} -C ${OPENBLAS_CHECKOUT_DIR}
make -j8 ${OPENBLAS_BUILD_FLAGS} install -C ${OPENBLAS_CHECKOUT_DIR}