change GPT2ForSequenceClassification inference accuracy tolerance (#136749)
Fixes https://github.com/pytorch/pytorch/issues/123503. https://github.com/pytorch/pytorch/pull/121866 makes GPT2ForSequenceClassification hit SDPA pattern 18, which then fails the accuracy check. The issue only occurs with single-thread BF16 inference. This PR raises the model's tolerance from 4e-3 to 5e-3 so the check passes. The mismatch comes from small implementation differences between backends: for example, the SDPA math backend scales q and k before the matmul for numerical stability, while the flash-attention backend, being a different algorithm, diverges more.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/136749
Approved by: https://github.com/jgong5, https://github.com/jansel
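For intuition, here is a minimal sketch (PyTorch, not the actual SDPA kernels) of how scaling q and k before the matmul versus scaling the scores afterwards produces exactly this kind of small BF16 difference; the shapes and values are arbitrary:

import torch

torch.manual_seed(0)
q = torch.randn(8, 64, dtype=torch.bfloat16)
k = torch.randn(8, 64, dtype=torch.bfloat16)
scale = 1.0 / (q.shape[-1] ** 0.5)

# Variant A: scale q and k by sqrt(scale) before the matmul, loosely mirroring
# the math backend's pre-scaling for numerical stability.
scores_pre = (q * scale**0.5) @ (k * scale**0.5).transpose(-2, -1)

# Variant B: apply the full scale to the scores after the matmul.
scores_post = (q @ k.transpose(-2, -1)) * scale

# Mathematically equivalent, but bfloat16 rounding makes the two results
# differ slightly.
print((scores_pre.float() - scores_post.float()).abs().max())

The exact magnitude varies with shapes and seeds; the point is that two mathematically equivalent orderings round differently in bfloat16, so a cross-backend accuracy comparison needs some slack.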
committed by PyTorch MergeBot
parent fba2c0a23a
commit 67883e70c0
@@ -501,12 +501,12 @@ class HuggingfaceRunner(BenchmarkRunner):
             else:
                 return 1e-2, cosine
         else:
-            if name in self._config["tolerance"]["higher_inference"]:
-                return 4e-3, cosine
-
             if (
                 current_device == "cpu"
                 and name in self._config["tolerance"]["higher_inference_cpu"]
             ):
-                return 4e-3, cosine
+                return 5e-3, cosine
+
+            if name in self._config["tolerance"]["higher_inference"]:
+                return 4e-3, cosine
             return 1e-3, cosine
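The helper above returns a (tolerance, cosine) pair. As a rough, hypothetical sketch of how such a pair could drive an accuracy comparison (the real benchmark harness handles nested outputs and more elaborate logic), assuming plain tensor outputs:

import torch

def passes_accuracy(expected: torch.Tensor, actual: torch.Tensor,
                    tolerance: float, use_cosine: bool) -> bool:
    # Hypothetical simplified check, not the runner's actual implementation.
    if use_cosine:
        cos = torch.nn.functional.cosine_similarity(
            expected.flatten().float(), actual.flatten().float(), dim=0
        )
        return bool(cos > 1.0 - tolerance)
    return torch.allclose(expected.float(), actual.float(),
                          rtol=tolerance, atol=tolerance)

# e.g. GPT2ForSequenceClassification logits on CPU after this change:
# passes_accuracy(ref_logits, test_logits, tolerance=5e-3, use_cosine=False)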
@@ -89,6 +89,7 @@ tolerance:
 
   higher_inference_cpu:
     - LayoutLMForSequenceClassification
+    - GPT2ForSequenceClassification
 
   cosine: []
 
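The yaml section above is what populates self._config["tolerance"]. A minimal sketch of the lookup, with a hand-written dict standing in for the parsed huggingface.yaml and mirroring the post-change selection order from the first hunk:

def inference_tolerance(name: str, device: str) -> float:
    # Hand-written stand-in for the parsed yaml; the real runner loads the
    # file itself, and the higher_inference entries are omitted here.
    config = {
        "tolerance": {
            "higher_inference": [],
            "higher_inference_cpu": [
                "LayoutLMForSequenceClassification",
                "GPT2ForSequenceClassification",
            ],
        }
    }
    if device == "cpu" and name in config["tolerance"]["higher_inference_cpu"]:
        return 5e-3
    if name in config["tolerance"]["higher_inference"]:
        return 4e-3
    return 1e-3

print(inference_tolerance("GPT2ForSequenceClassification", "cpu"))  # 0.005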