[BE] Add pre-push hook for lintrunner to the PyTorch repo (#158389)

Adds a pre-commit hook (technically a pre-push hook) to the PyTorch repo.
**This is currently an opt-in feature**, which one can opt into by running `python scripts/setup_hooks.py` locally.

### Features
- **Run Lintrunner Before Push**: Before every `git push`, automatically runs lintrunner on your changes.
  - Really need to skip the checks? Run `git push --no-verify`
- **Consistent, Isolated Lintrunner Environment**: During pre-push, Lintrunner runs in its own virtual environment that contains all lintrunner dependencies, so the environment stays consistent and isolated. No more lintrunner failures because you created a new .venv. (Did you know you needed to run `lintrunner init` every time you make a new .venv?)
- **Dependencies Automatically Updated**: If .lintrunner.toml is updated, the hook automatically re-runs `lintrunner init` so that you install the latest dependencies it specifies.
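
The automatic re-initialization can be pictured as a hash comparison against a stamp file. The following is a minimal sketch of that idea, not the hook's own code: the helper names `file_sha256`, `needs_init`, and `record_init` are illustrative assumptions, and the stamp file location is left to the caller.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def needs_init(config: Path, stamp: Path) -> bool:
    """True when `config` changed since its digest was last written to `stamp`."""
    if not stamp.exists():
        return True
    return stamp.read_text().strip() != file_sha256(config)


def record_init(config: Path, stamp: Path) -> None:
    """Store the current digest after a (hypothetical) successful init step."""
    stamp.write_text(file_sha256(config))
```

A caller would run its init step only when `needs_init(...)` is true and call `record_init(...)` afterwards, so an unchanged config costs one file hash per push.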

### Installation
- Run `python scripts/setup_hooks.py`. Now every `git push` will first run lintrunner.

### Additional details
- The lintrunner used by the pre-push hook runs in a special per-repo virtual environment managed by the commit-hook tool, located under `~/.cache/pre-commit`
- Does not affect your regularly used lintrunner
  - Manual invocations of lintrunner will continue to depend on your local environment instead of the special pre-push one. If there's enough interest, we could explore consolidating them.
- Does not run `lintrunner -a` for you.
  - You still need to manually run that (can be changed later though!)
- Have staged/unstaged changes? No worries
  - This runs `git stash` before running the pre-commit hooks and pops your changes back afterwards, so only the changes actually being pushed will be tested

### Downsides
- No streaming UI updates
  - While you still get the same output from lintrunner that you're used to, the commit-hook framework doesn't show any output while lintrunner is actually running. Instead, it shows the entire output after lintrunner has finished, which could take a few minutes (especially if it has to run `lintrunner init` first)
- `uv` installation is required to run the setup script. The setup script will ask users to install uv if it's not available.
  - This is required to be able to install the pre-commit package in a safe way that's available no matter what .venv you are running in.

### Opting out
- Disable hook for a single push: Run `git push --no-verify`
- Disable hook permanently: If something goes wrong and you need to wipe your setup:
  - Delete the `~/.cache/pre-commit` folder and the `.git/hooks/pre-push` file in your local repo.
  - You can now rerun `python scripts/setup_hooks.py` to set up your git push hook again if you want.
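
The manual cleanup can also be scripted. This is a hedged sketch rather than anything shipped in this PR: `remove_hook_setup` is a hypothetical helper, and both paths are passed in explicitly because the cache location may differ on your machine.

```python
from __future__ import annotations

import shutil
from pathlib import Path


def remove_hook_setup(repo_root: Path, cache_dir: Path) -> list[Path]:
    """Delete the pre-push hook file and the pre-commit cache; return what was removed."""
    removed: list[Path] = []

    # The hook shim that git invokes on every `git push`.
    hook = repo_root / ".git" / "hooks" / "pre-push"
    if hook.is_file():
        hook.unlink()
        removed.append(hook)

    # The cache directory holding the hook's virtual environments.
    if cache_dir.is_dir():
        shutil.rmtree(cache_dir)
        removed.append(cache_dir)

    return removed
```

From the repo root this could be called as `remove_hook_setup(Path.cwd(), Path.home() / ".cache" / "pre-commit")`, assuming the cache sits in that default location. Note that this wipes the cache for every repo that uses it, not only PyTorch.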

### Potential Future Changes
Things that could be done to make this even better if folks like these ideas:
- Automatic setup
  - Our `CONTRIBUTING.md` file tells devs to run `make setup-env`.  That could be a good entry point to hook the installation into
- Fix the console output streaming
- Make every lintrunner invocation (including manual ones) use the same repo-specific venv that the commit-hook uses.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/158389
Approved by: https://github.com/seemethere
Zain Rizvi
2025-07-18 19:55:35 +00:00
committed by PyTorch MergeBot
parent 75e2628782
commit 1b5fdb23b9
3 changed files with 261 additions and 0 deletions

110
scripts/run_lintrunner.py Normal file

@@ -0,0 +1,110 @@
#!/usr/bin/env python3
"""
Pre-push hook wrapper for Lintrunner.

✓ Stores a hash of .lintrunner.toml in the venv
✓ Re-runs `lintrunner init` if that file's hash changes
"""

from __future__ import annotations

import hashlib
import os
import shutil
import subprocess
import sys
from pathlib import Path


REPO_ROOT = Path(__file__).resolve().parents[1]
LINTRUNNER_TOML_PATH = REPO_ROOT / ".lintrunner.toml"

# This is the path to the pre-commit-managed venv
VENV_ROOT = Path(sys.executable).parent.parent
# Stores the hash of .lintrunner.toml from the last time we ran `lintrunner init`
INITIALIZED_LINTRUNNER_TOML_HASH_PATH = VENV_ROOT / ".lintrunner_plugins_hash"


def ensure_lintrunner() -> None:
    """Fail if Lintrunner is not on PATH."""
    if shutil.which("lintrunner"):
        print("✅ lintrunner is already installed")
        return
    sys.exit(
        "❌ lintrunner is required but was not found on your PATH. "
        "Please run `python scripts/setup_hooks.py` to install and configure "
        "lintrunner before using this script. "
        "If `git push` still fails, you may need to open a new terminal."
    )


def ensure_virtual_environment() -> None:
    """Fail if not running within a virtual environment."""
    in_venv = (
        os.environ.get("VIRTUAL_ENV") is not None
        or hasattr(sys, "real_prefix")
        or (hasattr(sys, "base_prefix") and sys.base_prefix != sys.prefix)
    )

    if not in_venv:
        sys.exit(
            "❌ This script must be run from within a virtual environment. "
            "Please activate your virtual environment before running this script."
        )


def compute_file_hash(path: Path) -> str:
    """Returns SHA256 hash of a file's contents."""
    hasher = hashlib.sha256()
    with path.open("rb") as f:
        while chunk := f.read(8192):
            hasher.update(chunk)
    return hasher.hexdigest()


def read_stored_hash(path: Path) -> str | None:
    """Returns the previously stored hash, or None if it is missing or unreadable."""
    if not path.exists():
        return None
    try:
        return path.read_text().strip()
    except Exception:
        return None


def initialize_lintrunner_if_needed() -> None:
    """Runs lintrunner init if .lintrunner.toml changed since last run."""
    if not LINTRUNNER_TOML_PATH.exists():
        print("⚠️ No .lintrunner.toml found. Skipping init.")
        return

    print(
        f"INITIALIZED_LINTRUNNER_TOML_HASH_PATH = {INITIALIZED_LINTRUNNER_TOML_HASH_PATH}"
    )
    current_hash = compute_file_hash(LINTRUNNER_TOML_PATH)
    stored_hash = read_stored_hash(INITIALIZED_LINTRUNNER_TOML_HASH_PATH)

    if current_hash == stored_hash:
        print("✅ Lintrunner plugins already initialized and up to date.")
        return

    print("🔁 Running `lintrunner init` …", file=sys.stderr)
    subprocess.check_call(["lintrunner", "init"])
    # Only record the hash once `lintrunner init` has succeeded
    INITIALIZED_LINTRUNNER_TOML_HASH_PATH.write_text(current_hash)


def main() -> None:
    # 0. Ensure we're running in a virtual environment
    ensure_virtual_environment()
    print(f"🐍 Virtual env being used: {VENV_ROOT}", file=sys.stderr)

    # 1. Ensure lintrunner binary is available
    ensure_lintrunner()

    # 2. Check for plugin updates and re-init if needed
    initialize_lintrunner_if_needed()

    # 3. Run lintrunner with any passed arguments and propagate its exit code
    args = sys.argv[1:]  # Forward all arguments to lintrunner
    result = subprocess.call(["lintrunner"] + args)
    sys.exit(result)


if __name__ == "__main__":
    main()
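
The wrapper's venv handling rests on two observations: inside a virtual environment `sys.prefix` differs from `sys.base_prefix`, and on a POSIX layout the interpreter sits at `<venv>/bin/python`, so two `.parent` steps reach the venv root. Here is a small sketch of both, under stated assumptions: the helper names and the example path are hypothetical, and `PurePosixPath` is used only to keep the illustration platform-independent.

```python
import sys
from pathlib import PurePosixPath


def in_virtualenv(prefix: str, base_prefix: str) -> bool:
    """A venv gives the interpreter a prefix that differs from the base install."""
    return prefix != base_prefix


def venv_root(executable: str) -> PurePosixPath:
    """Map <venv>/bin/python to <venv>, mirroring `Path(sys.executable).parent.parent`."""
    return PurePosixPath(executable).parent.parent


if __name__ == "__main__":
    # Report whether the current interpreter is running inside a venv.
    print(in_virtualenv(sys.prefix, sys.base_prefix))
```

Because the hash stamp file lives under the venv root, deleting or recreating that venv naturally forces a fresh `lintrunner init`.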

139
scripts/setup_hooks.py Normal file

@@ -0,0 +1,139 @@
#!/usr/bin/env python3
"""
Bootstrap Git pre-push hook.

✓ Requires uv to be installed (fails if not available)
✓ Installs/updates pre-commit with uv (global, venv-proof)
✓ Registers the repo's pre-push hook and freezes hook versions

Run this from the repo root (inside or outside any project venv):

    python scripts/setup_hooks.py
"""

from __future__ import annotations

import shutil
import subprocess
import sys
from pathlib import Path


# ───────────────────────────────────────────
# Helper utilities
# ───────────────────────────────────────────
def run(cmd: list[str]) -> None:
    print(f"$ {' '.join(cmd)}")
    subprocess.check_call(cmd)


def which(cmd: str) -> bool:
    return shutil.which(cmd) is not None


def ensure_uv() -> None:
    if which("uv"):
        # Ensure the path uv installs binaries to is part of the system path
        print("$ uv tool update-shell")
        result = subprocess.run(
            ["uv", "tool", "update-shell"], capture_output=True, text=True
        )
        if result.returncode == 0:
            # Check if the output indicates changes were made
            if (
                "Updated" in result.stdout
                or "Added" in result.stdout
                or "Modified" in result.stdout
            ):
                print(
                    "⚠️ Shell configuration updated. You may need to restart your terminal for changes to take effect."
                )
            elif result.stdout.strip():
                print(result.stdout)
            return
        else:
            sys.exit(
                f"❌ Warning: uv tool update-shell failed: {result.stderr}. uv installed tools may not be available."
            )

    sys.exit(
        "\n❌ uv is required but was not found on your PATH.\n"
        " Please install uv first using the instructions at:\n"
        " https://docs.astral.sh/uv/getting-started/installation/\n"
        " Then rerun python scripts/setup_hooks.py\n"
    )


def ensure_tool_installed(tool: str, force_update: bool = False) -> None:
    """
    Checks to see if the tool is available and if not (or if force update requested) then
    it reinstalls it.

    If the tool is still not on PATH after installation, prints a warning: a new terminal
    needs to be opened before git pushes work as expected.
    """
    if force_update or not which(tool):
        print(f"Ensuring latest {tool} via uv …")
        run(["uv", "tool", "install", "--force", tool])
        if not which(tool):
            print(
                f"\n⚠️ {tool} installation succeeded, but it's not on PATH. Launch a new terminal if your git pushes don't work.\n"
            )


if sys.platform.startswith("win"):
    print(
        "\n⚠️ Lintrunner is not supported on Windows, so there are no pre-push hooks to add. Exiting setup.\n"
    )
    sys.exit(0)

# ───────────────────────────────────────────
# 1. Install dependencies
# ───────────────────────────────────────────
ensure_uv()

# Ensure pre-commit is installed globally via uv
ensure_tool_installed("pre-commit", force_update=True)

# Don't force a lintrunner update because it might break folks
# who already have it installed in a different way
ensure_tool_installed("lintrunner")

# ───────────────────────────────────────────
# 2. Activate (or refresh) the pre-push hook
# ───────────────────────────────────────────

# ── Activate (or refresh) the repo's pre-push hook ──────────────────────────
# Creates/overwrites .git/hooks/pre-push with a tiny shim that will call
# `pre-commit run --hook-stage pre-push` on every `git push`.
# This is why we need to install pre-commit globally.
#
# The --allow-missing-config flag lets pre-commit succeed if someone changes to
# a branch that doesn't have a pre-commit config
run(
    [
        "uv",
        "tool",
        "run",
        "pre-commit",
        "install",
        "--hook-type",
        "pre-push",
        "--allow-missing-config",
    ]
)

# ── Pin remote-hook versions for reproducibility ────────────────────────────
# (Note: we don't have remote hooks right now, but it future-proofs this script)
# 1. `autoupdate` bumps every remote hook's `rev:` in .pre-commit-config.yaml
#    to the latest commit on its default branch.
# 2. `--freeze` immediately rewrites each `rev:` to the exact commit SHA,
#    ensuring all contributors and CI run identical hook code.
run(["uv", "tool", "run", "pre-commit", "autoupdate", "--freeze"])

print(
    "\n✅ pre-commit is installed globally via uv and the pre-push hook is active.\n"
    " Lintrunner will now run automatically on every `git push`.\n"
)
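
After setup, it can be useful to confirm that the installed tools are visible on `PATH`, since the script warns that a new terminal may be needed. The following is an illustrative sketch only: `missing_tools` is a hypothetical helper and not part of this PR.

```python
from __future__ import annotations

import shutil


def missing_tools(tools: list[str]) -> list[str]:
    """Return the subset of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    # Tool names taken from the setup script above.
    absent = missing_tools(["uv", "pre-commit", "lintrunner"])
    if absent:
        print(f"Not on PATH: {', '.join(absent)}. Try opening a new terminal.")
    else:
        print("All hook tools are on PATH.")
```

An empty result suggests the hook shim should be able to find `pre-commit` on the next `git push`.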