AutoHeuristic: mixed_mm documentation (#133410)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/133410
Approved by: https://github.com/Chillee
ghstack dependencies: #133409
Alnis Murtovi
2024-08-13 22:38:59 -07:00
committed by PyTorch MergeBot
parent 142353eca3
commit f32a9e953f
3 changed files with 49 additions and 2 deletions


@@ -0,0 +1,16 @@
If you just want to re-generate the existing mixed_mm heuristics for A100/H100 from the already collected data, run the following scripts:
`bash get_mixedmm_dataset.sh # Downloads A100 and H100 datasets`
`bash gen_mixedmm_heuristic_a100.sh # Generates A100 heuristic`
`bash gen_mixedmm_heuristic_h100.sh # Generates H100 heuristic`
If you want to collect new data, or generate a heuristic for another GPU, use the `generate_heuristic.sh` script:
First, open `generate_heuristic.sh` and modify the variables according to the comments.
Then run the script to perform benchmarks and collect training data:
`bash generate_heuristic.sh collect`
Depending on how many GPUs you are using, this might take a day.
Afterwards, run the script to learn the heuristic:
`bash generate_heuristic.sh generate`
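The `generate` step trains a decision tree and emits it as Python code. As an illustration only, here is a hypothetical sketch of the shape of such a learned heuristic — the function name, feature thresholds, and choice labels are all invented, not taken from the generated files:

```python
# Hypothetical sketch of the kind of decision tree the generate step emits:
# nested threshold checks on the matmul shape that return a kernel choice.
def mixed_mm_choice(m: int, k: int, n: int) -> str:
    if m <= 16:
        # small m: assume the specialized Triton mixed-mm kernel wins here
        return "triton"
    if k >= 4096 and n >= 4096:
        return "triton"
    # otherwise fall back to the default choice (cast + regular mm)
    return "fallback"
```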


@@ -12,6 +12,7 @@ from benchmark_runner import BenchmarkRunner  # type: ignore[import-not-found]
 from benchmark_utils import (  # type: ignore[import-not-found]
     fits_in_memory,
     get_mm_tensors,
+    get_random_between_pow2,
 )
 
 import torch
@@ -95,7 +96,7 @@ class BenchmarkRunnerMixedMM(BenchmarkRunner):  # type: ignore[misc, no-any-unimported]
         if distr_type == "pow2":
             return self.get_random_pow2(min_power2=10, max_power2=17)
         elif distr_type == "uniform-between-pow2":
-            return self.get_random_between_pow2(min_power2=10, max_power2=17)
+            return get_random_between_pow2(min_power2=10, max_power2=17)
         elif distr_type == "uniform":
             return random.randint(1024, 131072)
         print(f"random_type {distr_type} not supported")
@@ -106,7 +107,7 @@ class BenchmarkRunnerMixedMM(BenchmarkRunner):  # type: ignore[misc, no-any-unimported]
         if pow2:
             return 2 ** random.randint(1, 7)
         else:
-            return self.get_random_between_pow2(1, 7)
+            return get_random_between_pow2(1, 7)
 
     def get_m_k_n(self, dtype: Any) -> Tuple[int, int, int]:
         numel_max = 2**31
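The diff above switches from a removed method to the free function `get_random_between_pow2` now imported from `benchmark_utils`. A minimal sketch of what such a helper plausibly does — pick a power-of-two bucket, then sample a size strictly between the two powers of two — with the implementation assumed rather than copied from `benchmark_utils`:

```python
import random

def get_random_between_pow2(min_power2: int, max_power2: int) -> int:
    # Choose a bucket [2^i, 2^(i+1)] uniformly at random, then sample a
    # value strictly between the two powers of two (assumed behavior),
    # so the benchmark covers non-power-of-two shapes as well.
    i = random.randint(min_power2, max_power2 - 1)
    return random.randint(2**i + 1, 2 ** (i + 1) - 1)
```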


@@ -0,0 +1,30 @@
#!/bin/bash
if [ $# -ne 1 ]; then
    echo "Error: This script requires exactly one argument."
    echo "'bash generate_heuristic_mixedmm.sh collect' to run benchmarks and collect training data."
    echo "'bash generate_heuristic_mixedmm.sh generate' to use the collected data to learn a heuristic."
    exit 1
fi
MODE=$1
# !!! SPECIFY THE GPUs THAT YOU WANT TO USE HERE !!!
GPU_DEVICE_IDS="4,5"
# !!! SPECIFY THE CONDA ENVIRONMENT THAT YOU WANT TO BE ACTIVATED HERE !!!
CONDA_ENV=heuristic-pr
NUM_SAMPLES=2000
# This is where AutoHeuristic will store autotuning results
OUTPUT_DIR="a100"
# !!! CHANGE THE NAME OF THE HEURISTIC IF YOU WANT TO LEARN A HEURISTIC FOR A GPU THAT IS NOT A100 !!!
HEURISTIC_NAME="MixedMMA100"
BENCHMARK_SCRIPT="gen_data_mixed_mm.py"
TRAIN_SCRIPT="train_decision_mixedmm.py"
bash ../generate_heuristic.sh ${MODE} ${GPU_DEVICE_IDS} ${CONDA_ENV} ${NUM_SAMPLES} ${OUTPUT_DIR} ${HEURISTIC_NAME} ${BENCHMARK_SCRIPT} ${TRAIN_SCRIPT}
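This wrapper only sets the variables and forwards them to the shared `../generate_heuristic.sh`, which dispatches on `MODE`. A hypothetical sketch of that dispatch contract — the real script also handles GPU assignment and conda activation, and these echo strings are invented:

```shell
# Hypothetical sketch of the MODE dispatch inside ../generate_heuristic.sh
run_mode() {
    case "$1" in
        collect)
            echo "collect: run the benchmark script and store autotuning data"
            ;;
        generate)
            echo "generate: train a heuristic from the collected data"
            ;;
        *)
            echo "unknown mode: $1" >&2
            return 1
            ;;
    esac
}
```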