[dynamo][cpp-guards] Optimize tensor.grad accessor (#123226)

For the LayoutLM model, this reduces C++ guard overhead by 1.48x. These are the numbers:

![image](https://github.com/pytorch/pytorch/assets/13822661/25cfc35b-b67d-4903-8403-71fa931dacdd)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/123226
Approved by: https://github.com/jansel
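
For context, here is a minimal usage sketch (not part of this commit) of how the new source compares with chaining a plain `AttrSource`. It assumes a PyTorch build that includes this patch and that `LocalSource`, `AttrSource`, and `GradSource` are importable from `torch._dynamo.source`:

```python
# Hypothetical sketch: both sources produce the same guard name, but
# GradSource lets the guard machinery read tensor.grad directly in C++
# instead of going through generic Python attribute lookup.
from torch._dynamo.source import AttrSource, GradSource, LocalSource

x = LocalSource("x")               # source for a local variable `x`
via_attr = AttrSource(x, "grad")   # generic attribute access
via_grad = GradSource(x)           # dedicated grad source from this patch

print(via_attr.name())  # L['x'].grad
print(via_grad.name())  # L['x'].grad
```

The dedicated source keeps the same `name()` and reconstruction bytecode as the `AttrSource` chain, while giving the guard-building code an explicit hook to emit the faster C++ grad accessor.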
Authored by Animesh Jain on 2024-04-02 16:59:04 -07:00, committed by PyTorch MergeBot
parent 9288b27461
commit d91db70295
4 changed files with 91 additions and 11 deletions


@@ -170,6 +170,25 @@ class AttrSource(ChainedSource):
        return f"{self.base.name()}.{self.member}"


# Represents tensor.grad source. It could be represented by AttrSource as well,
# but we can access the grad field on a tensor directly in C++ without going
# through Python bytecode. Therefore, we use a separate source for the grad
# field.
@dataclasses.dataclass(frozen=True)
class GradSource(ChainedSource):
    member: str = "grad"

    def reconstruct(self, codegen):
        self.base.reconstruct(codegen)
        codegen.extend_output(codegen.create_load_attrs(self.member))

    def guard_source(self):
        return self.base.guard_source()

    def name(self):
        return f"{self.base.name()}.{self.member}"


@dataclasses.dataclass(frozen=True)
class ParamBufferSource(AttrSource):
    def guard_source(self):