[Doc] update max-autotune for CPU (#134986)

The current doc for `max-autotune` is applicable only for GPU. This PR adds the corresponding content for CPU.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134986
Approved by: https://github.com/jgong5, https://github.com/malfet
Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
2025-10-20 21:14:14 +08:00 · 2024-09-07 03:56:27 -07:00
parent f7c0c06692
commit 18479c5f70
1 changed files with 3 additions and 2 deletions
--- a/torch/__init__.py
+++ b/torch/__init__.py
@@ -2378,8 +2378,9 @@ def compile(
          There are other circumstances where CUDA graphs are not applicable; use TORCH_LOG=perf_hints
          to debug.

-        - "max-autotune" is a mode that leverages Triton based matrix multiplications and convolutions
-          It enables CUDA graphs by default.
+        - "max-autotune" is a mode that leverages Triton or template based matrix multiplications
+          on supported devices and Triton based convolutions on GPU.
+          It enables CUDA graphs by default on GPU.

        - "max-autotune-no-cudagraphs" is a mode similar to "max-autotune" but without CUDA graphs
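
For illustration only (not part of this commit), the updated docstring implies that the same `mode="max-autotune"` string can be passed to `torch.compile` whether the model lives on CPU or GPU. The sketch below assumes a working PyTorch install with a supported compiler backend; the helper names `pick_device` and `compile_for_device` are hypothetical and exist only for this example.

```python
import torch
import torch.nn as nn

# Mode string taken from the docstring in the diff above.
MODE = "max-autotune"


def pick_device() -> str:
    """Return "cuda" when a GPU is visible to PyTorch, otherwise "cpu"."""
    return "cuda" if torch.cuda.is_available() else "cpu"


def compile_for_device(model: nn.Module, device: str) -> nn.Module:
    """Hypothetical helper: move `model` to `device` and wrap it with torch.compile.

    torch.compile is lazy, so autotuning only happens on the first call
    of the returned module, not when this function returns.
    """
    return torch.compile(model.to(device), mode=MODE)


def run_once() -> torch.Tensor:
    """Build a small linear layer, compile it, and run one forward pass."""
    device = pick_device()
    model = compile_for_device(nn.Linear(64, 32), device)
    x = torch.randn(8, 64, device=device)
    with torch.no_grad():
        return model(x)
```

Calling `run_once()` triggers the actual compilation and autotuning, which may take noticeably longer than an eager forward pass. Per the docstring, CUDA graphs are enabled by default only on GPU; `"max-autotune-no-cudagraphs"` is the documented variant without them.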