[Triton] [Inductor] Restrict subprocess autotuning to just Triton (#162688)
Summary: Restricts subprocess benchmarking to only `TritonTemplateCaller`, which is what the underlying `target` method expects. This previously triggered a bug with large-K shapes, because the decompose-k choice is a `SubgraphChoiceCaller`.

Test Plan: mm autotuning with a large K and `TORCHINDUCTOR_AUTOTUNE_IN_SUBPROC=1`

Rollback Plan:

Differential Revision: D82181924

Pull Request resolved: https://github.com/pytorch/pytorch/pull/162688
Approved by: https://github.com/PaulZhang12, https://github.com/eellison, https://github.com/mlazos
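As a rough illustration of the change (a minimal sketch using stand-in stub classes, not the real Inductor `ChoiceCaller` hierarchy): the old predicate sent every non-`ExternKernelCaller` choice, including the decompose-k `SubgraphChoiceCaller`, to the subprocess benchmarker, whereas the new predicate only admits `TritonTemplateCaller`.

```python
# Minimal sketch of the routing change. The classes below are stand-in stubs,
# not the real torch._inductor classes; only the isinstance-based split
# mirrors the patch.

class ChoiceCaller: ...
class TritonTemplateCaller(ChoiceCaller): ...
class ExternKernelCaller(ChoiceCaller): ...
class SubgraphChoiceCaller(ChoiceCaller): ...  # e.g. decompose-k for large K

choices = [TritonTemplateCaller(), ExternKernelCaller(), SubgraphChoiceCaller()]

# Old split: everything that is not an ExternKernelCaller was benchmarked in
# the subprocess, so a SubgraphChoiceCaller reached a target() that only
# expects TritonTemplateCaller.
old_subproc = [c for c in choices if not isinstance(c, ExternKernelCaller)]
assert any(isinstance(c, SubgraphChoiceCaller) for c in old_subproc)  # the bug

# New split: only TritonTemplateCaller goes to the subprocess; everything else
# (extern kernels, subgraph choices) stays in the current process.
subproc, in_process = [], []
for c in choices:
    (subproc if isinstance(c, TritonTemplateCaller) else in_process).append(c)
assert all(isinstance(c, TritonTemplateCaller) for c in subproc)
```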
commit 082d3dd9d5 (parent 468c1f9e9d)
committed by PyTorch MergeBot
@@ -3032,8 +3032,13 @@ class AlgorithmSelectorCache(PersistentCache):
 
         # only benchmark triton kernel in sub process for now.
         # ATen/Extern kernel are still benchmarked in the current process.
-        extern = [c for c in choices if isinstance(c, ExternKernelCaller)]
-        triton = [c for c in choices if not isinstance(c, ExternKernelCaller)]
+        extern = []
+        triton = []
+        for c in choices:
+            if isinstance(c, TritonTemplateCaller):
+                triton.append(c)
+            else:
+                extern.append(c)
 
         timings = cls.benchmark_in_current_process(
             extern, input_nodes, layout, input_gen_fns, hint_override=hint_override
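To exercise the changed path along the lines of the test plan, something like the following should work (the matrix shape, dtype, and the max-autotune flag are illustrative assumptions, not taken from the PR; a CUDA device with Triton is required):

```python
# Illustrative repro of the test plan: mm autotuning with a large K while
# TORCHINDUCTOR_AUTOTUNE_IN_SUBPROC=1 routes Triton choices to a subprocess.
# Shape and the max-autotune flag are assumptions; requires CUDA + Triton.
import os

os.environ["TORCHINDUCTOR_AUTOTUNE_IN_SUBPROC"] = "1"
os.environ["TORCHINDUCTOR_MAX_AUTOTUNE"] = "1"

import torch


@torch.compile
def mm(a, b):
    return a @ b


# Large K so the decompose-k SubgraphChoiceCaller appears among the choices.
a = torch.randn(64, 65536, device="cuda", dtype=torch.float16)
b = torch.randn(65536, 64, device="cuda", dtype=torch.float16)
print(mm(a, b).shape)
```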