Correct alignment in the seq_len diagram. (#5592)

Co-authored-by: Liqian Chen <liqian.chen@deeplang.ai>
Charles Riggins
2024-06-18 00:05:33 +08:00
committed by GitHub
parent 9333fb8eb9
commit 9e74d9d003

@@ -83,7 +83,7 @@ class FlashAttentionMetadata(AttentionMetadata):
# |---------------- N iteration ---------------------|
# |- tokenA -|......................|-- newTokens ---|
# |---------- context_len ----------|
-# |-------------------- seq_len ----------------------|
+# |-------------------- seq_len ---------------------|
# |-- query_len ---|
# Maximum query length in the batch. None for decoding.
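
For readers of the diagram: it shows seq_len spanning both context_len (tokens already computed) and query_len (the new tokens in this iteration). A minimal sketch of that relationship follows. The helper names `seq_len_from_parts` and `max_query_len` are hypothetical and are not part of the class touched by this commit; the sketch only assumes what the comment states, namely that the maximum query length is None for decoding.

```python
from typing import List, Optional


def seq_len_from_parts(context_len: int, query_len: int) -> int:
    """Total sequence length: already-computed tokens plus new tokens.

    Hypothetical helper mirroring the diagram, where the seq_len bar
    covers the context_len bar followed by the query_len bar.
    """
    if context_len < 0 or query_len < 0:
        raise ValueError("lengths must be non-negative")
    return context_len + query_len


def max_query_len(query_lens: List[int], is_decode: bool) -> Optional[int]:
    """Largest query length in the batch, or None for a decode batch.

    Assumption: an empty batch is also reported as None.
    """
    if is_decode or not query_lens:
        return None
    return max(query_lens)
```

For example, a request with 22 context tokens and 5 new tokens has a seq_len of 27 under this reading of the diagram.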