inductor: reduce compile time for cpu backend by reducing weight conversion (#104402)

Before this PR, we always added `to_mkldnn` before weight packing. That step is redundant: a dense tensor can be converted directly to a block tensor.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104402
Approved by: https://github.com/jgong5, https://github.com/jansel, https://github.com/eellison
2025-10-20 21:14:14 +08:00 · 2023-07-05 22:10:37 -04:00
parent 434fcffa21
commit 6bfd507c15
4 changed files with 15 additions and 18 deletions
--- a/test/test_mkldnn_fusion.py
+++ b/test/test_mkldnn_fusion.py
@@ -366,7 +366,7 @@ class TestMkldnnFusion(JitTestCase):

                        if prepack_weight:
                            packed_weight = torch.ops.mkldnn._reorder_convolution_transpose_weight(
-                                mod.conv_transpose.weight.to_mkldnn(),
+                                mod.conv_transpose.weight,
                                mod.conv_transpose.padding,
                                mod.conv_transpose.output_padding,
                                mod.conv_transpose.stride,