Compare commits


1 Commit

Author SHA1 Message Date
9cb290cc89 FlashAttention remove Nvidia only GPUs to more generic. 2023-11-15 15:33:09 +01:00


@@ -62,7 +62,7 @@ model = AutoModelForCausalLM.from_pretrained(
 <Tip>
-FlashAttention-2 can only be used when the model's dtype is `fp16` or `bf16`, and it only runs on Nvidia GPUs. Make sure to cast your model to the appropriate dtype and load them on a supported device before using FlashAttention-2.
+FlashAttention-2 can only be used when the model's dtype is `fp16` or `bf16`, and is available on both AMD & Nvidia GPUs. Make sure to cast your model to the appropriate dtype and load them on a supported device before using FlashAttention-2.
 </Tip>
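
For context, the tip describes casting the model to half precision and loading it on a supported GPU before enabling FlashAttention-2. Below is a minimal sketch of that setup; the checkpoint name is a placeholder, and the exact flag depends on the Transformers version (newer releases use `attn_implementation="flash_attention_2"`, while older ones around this commit used `use_flash_attention_2=True`).

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder checkpoint; substitute any causal LM with FlashAttention-2 support.
model_id = "tiiuae/falcon-7b"

tokenizer = AutoTokenizer.from_pretrained(model_id)

# Cast to fp16 (or bf16) and enable FlashAttention-2 at load time, then move the
# model to a supported GPU. On AMD GPUs with ROCm, PyTorch also exposes the
# device as "cuda".
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    attn_implementation="flash_attention_2",
).to("cuda")
```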