pytorch/tools/setup_helpers/generate_linker_script.py
Robert Hardwick 1aeac304b8 Move prioritized text linker optimization code from setup.py to cmake (#160078)
Note: This is a replica of PR #155901, which will be closed. I had to create a new PR to add it to my ghstack, as some later commits depend on it.

### Summary

🚀 This PR moves the prioritized text linker optimization from setup.py to cmake (and enables it by default on Linux aarch64 systems).

This change consolidates what was previously manual CI logic into a single location (cmake), ensuring consistent behavior across local builds, CI pipelines, and developer environments.

### Motivation
Prioritized text layout has measurable performance benefits on Arm systems by reducing code padding and improving cache utilization. This optimization was previously triggered manually via CI scripts (.ci/aarch64_linux/aarch64_ci_build.sh) or user-set environment variables. By detecting the target architecture within cmake, this change enables the optimization automatically where applicable, improving maintainability and usability.

Note:

Due to ninja/cmake graph generation issues, we cannot apply the linker file globally to all targets, so the targets must be defined manually. See CMakeLists.txt: the main libraries torch_python, torch, torch_cpu, torch_cuda, and torch_xpu have been targeted, which should be enough to maintain the performance benefits outlined above.

Co-authored-by: Usamah Zaheer <usamah.zaheer@arm.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/160078
Approved by: https://github.com/seemethere
2025-09-18 17:09:48 +00:00

62 lines
2.0 KiB
Python

import argparse
import os
import subprocess
from pathlib import Path


def gen_linker_script(
    filein: str = "cmake/prioritized_text.txt", fout: str = "cmake/linker_script.ld"
) -> None:
    # Read the prioritized symbol names, one per line, skipping blank lines.
    with open(filein) as f:
        prioritized_text = f.readlines()
        prioritized_text = [
            line.replace("\n", "") for line in prioritized_text if line != "\n"
        ]

    # Ask the linker (taken from $LD, else `ld`) to print its default script.
    ld = os.environ.get("LD", "ld")
    linker_script_lines = subprocess.check_output([ld, "-verbose"], text=True).split(
        "\n"
    )

    # The default script is printed between two separator lines.
    indices = [
        i
        for i, x in enumerate(linker_script_lines)
        if x == "=================================================="
    ]
    linker_script_lines = linker_script_lines[indices[0] + 1 : indices[1]]

    # Locate the `.text :` output section header. The linker pads the section
    # name with spaces, so compare whitespace-separated tokens instead of
    # matching a literal with a fixed number of spaces.
    text_line_start = [
        i
        for i, line in enumerate(linker_script_lines)
        if line.split()[:2] == [".text", ":"]
    ]
    assert len(text_line_start) == 1, (
        "Expected exactly one .text output section in the linker script!"
    )
    text_line_start = text_line_start[0]

    # ensure that parent directory exists before writing
    fout = Path(fout)
    fout.parent.mkdir(parents=True, exist_ok=True)

    with open(fout, "w") as f:
        for lineid, line in enumerate(linker_script_lines):
            # Two lines below the header (past the opening brace line), emit
            # the prioritized input sections ahead of the default ones.
            if lineid == text_line_start + 2:
                f.write("    *(\n")
                for plines in prioritized_text:
                    f.write(f"      .text.{plines}\n")
                f.write("    )\n")
            f.write(f"{line}\n")


if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        description="Generate linker file based on prioritized symbols. Used for link-time optimization.",
    )
    parser.add_argument(
        "--filein",
        help="Path to prioritized_text.txt input file",
        default=argparse.SUPPRESS,
    )
    parser.add_argument(
        "--fout", help="Output path for linker ld file", default=argparse.SUPPRESS
    )
    # convert args to a dict to pass to gen_linker_script
    kwargs = vars(parser.parse_args())
    gen_linker_script(**kwargs)
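
The splice performed by gen_linker_script can be tried without invoking a linker. The following is a minimal sketch, not part of the PyTorch file: the helper names `extract_default_script` and `splice_prioritized`, the sample linker output, and the symbol names `hot_function_a` / `hot_function_b` are all hypothetical, and the sample assumes the `.text` header is immediately followed by an opening brace line.

```python
# Separator line copied from the script above; the default linker script is
# printed between two such lines.
SEPARATOR = "=================================================="


def extract_default_script(verbose_output: str) -> list[str]:
    """Return the lines between the first two separator lines."""
    lines = verbose_output.split("\n")
    marks = [i for i, line in enumerate(lines) if line == SEPARATOR]
    if len(marks) < 2:
        raise ValueError("no default linker script found in the output")
    return lines[marks[0] + 1 : marks[1]]


def splice_prioritized(script: list[str], symbols: list[str]) -> list[str]:
    """Insert a prioritized input-section block two lines below `.text :`."""
    starts = [
        i for i, line in enumerate(script) if line.split()[:2] == [".text", ":"]
    ]
    if len(starts) != 1:
        raise ValueError(f"expected one .text output section, found {len(starts)}")
    insert_at = starts[0] + 2  # skip the `.text :` line and the `{` line
    block = ["    *("] + [f"      .text.{s}" for s in symbols] + ["    )"]
    return script[:insert_at] + block + script[insert_at:]


# Hypothetical, heavily shortened stand-in for `ld -verbose` output.
sample = "\n".join(
    [
        "GNU ld (sample banner)",
        "using internal linker script:",
        SEPARATOR,
        "SECTIONS",
        "{",
        "  .text           :",
        "  {",
        "    *(.text.unlikely .text.*_unlikely .text.unlikely.*)",
        "    *(.text .stub .text.*)",
        "  }",
        "}",
        SEPARATOR,
    ]
)

script = extract_default_script(sample)
patched = splice_prioritized(script, ["hot_function_a", "hot_function_b"])
print("\n".join(patched))
```

Because the prioritized block is placed ahead of the default `*(.text ...)` patterns, the listed sections are laid out first and in the order given. The sketch raises `ValueError` where the script above would fail with an `IndexError` or an assertion, which makes a missing separator or a missing `.text` header easier to diagnose.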