pytorch/scripts/run_lintrunner.py
Zain Rizvi 1b5fdb23b9 [BE] Add pre-push hook for lintrunner to the PyTorch repo (#158389)
Adds a pre-commit hook (technically a pre-push hook) to the PyTorch repo.
**This is currently an opt-in feature**, which one can opt into by running `python scripts/setup_hooks.py` locally.

### Features
- **Run Lintrunner Before Push**: Before every `git push`, automatically runs lintrunner on your changes.
  - Really need to skip the checks? Run `git push --no-verify`
- **Consistent, Isolated Lintrunner Environment**: During pre-push, lintrunner runs in its own virtual environment that contains all lintrunner dependencies. No more lintrunner failures because you created a new .venv. (Did you know you needed to run `lintrunner init` every time you make a new .venv?)
- **Dependencies Automatically Updated**: If `.lintrunner.toml` is updated, the hook automatically re-runs `lintrunner init` so that you have the latest specified dependencies installed.
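
The automatic re-init can be pictured as a hash stamp: hash `.lintrunner.toml`, compare it with the value saved after the last successful init, and re-run only on a mismatch. Below is a minimal sketch of that idea. The helper names (`file_sha256`, `needs_init`, `record_init`, `demo`) and the sample config contents are made up for illustration; `scripts/run_lintrunner.py` remains the source of truth for the real behavior.

```python
from __future__ import annotations

import hashlib
import tempfile
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def needs_init(config: Path, stamp: Path) -> bool:
    """True when the config hash differs from the one recorded in the stamp file."""
    if not stamp.exists():
        return True
    return stamp.read_text().strip() != file_sha256(config)


def record_init(config: Path, stamp: Path) -> None:
    """Record the current config hash; call this only after init succeeds."""
    stamp.write_text(file_sha256(config))


def demo() -> list[bool]:
    """Walk through a first run, an unchanged run, and a run after an edit."""
    with tempfile.TemporaryDirectory() as tmp:
        config = Path(tmp) / ".lintrunner.toml"
        stamp = Path(tmp) / ".lintrunner_plugins_hash"
        config.write_text("[[linter]]\ncode = 'A'\n")  # hypothetical contents
        first = needs_init(config, stamp)  # no stamp yet
        record_init(config, stamp)
        unchanged = needs_init(config, stamp)  # hash matches the stamp
        config.write_text("[[linter]]\ncode = 'B'\n")
        edited = needs_init(config, stamp)  # hash no longer matches
        return [first, unchanged, edited]
```

Writing the stamp only after init succeeds means a failed `lintrunner init` is retried on the next push instead of being silently skipped.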

### Installation
- Run `python scripts/setup_hooks.py`. Now every `git push` will first run lintrunner.

### Additional details
- The lintrunner used by the pre-push hook runs in a special per-repo virtual environment managed by the commit-hook tool, located under `~/.cache/pre-commit`.
- Does not affect your regularly used lintrunner
  - Manual invocations of lintrunner will continue to depend on your local environment instead of the special pre-push one. If there's enough interest, we could explore consolidating them.
- Does not run `lintrunner -a` for you.
  - You still need to manually run that (can be changed later though!)
- Have staged/unstaged changes? No worries
  - This runs `git stash` before running the pre-commit hooks and pops your changes back afterwards, so only the changes actually being pushed will be tested.
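
The stash-then-restore behavior described above can be sketched as a context manager. This is an illustration only: how the hook framework performs the stash is not shown here, and the names `run_git` and `stashed_changes`, the injectable `run` parameter, and the choice to ignore untracked files are all assumptions made for the sketch.

```python
from __future__ import annotations

import subprocess
from contextlib import contextmanager
from typing import Callable, Iterator, Sequence

Runner = Callable[[Sequence[str]], str]


def run_git(args: Sequence[str]) -> str:
    """Run a git command and return its stdout (raises on a non-zero exit)."""
    return subprocess.run(
        ["git", *args], check=True, capture_output=True, text=True
    ).stdout


@contextmanager
def stashed_changes(run: Runner = run_git) -> Iterator[bool]:
    """Stash staged/unstaged changes for the duration of the block, then restore them."""
    # With --untracked-files=no, the output is empty unless tracked files changed,
    # which is exactly the set of changes a plain `git stash push` will save.
    dirty = bool(run(["status", "--porcelain", "--untracked-files=no"]).strip())
    if dirty:
        run(["stash", "push"])
    try:
        yield dirty
    finally:
        # Pop only if we stashed, so an older, unrelated stash is never touched.
        if dirty:
            run(["stash", "pop"])
```

Something like `with stashed_changes(): ...` around the lint step would restore the working tree even if the linter raises, because the pop sits in a `finally` block.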

### Downsides
- No streaming UI updates
  - While you still get the same output from lintrunner that you're used to, the commit-hook framework doesn't show any output while lintrunner is running. Instead, it shows the entire output after the linter has finished, which could take a few minutes (especially if it has to run `lintrunner init` first).
- `uv` installation is required to run the setup script. The setup script will ask users to install uv if it's not available.
  - This is required to be able to install the pre-commit package in a safe way that's available no matter what .venv you are running in.
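
A check for a required tool such as `uv` could look roughly like the sketch below. It is not the code from `scripts/setup_hooks.py`; the function name `require_tool` and the wording of the message are assumptions, and the only library call used is the standard `shutil.which`.

```python
from __future__ import annotations

import shutil


def require_tool(name: str, install_hint: str) -> str:
    """Return the path to `name` on PATH, or exit with an install hint."""
    path = shutil.which(name)
    if path is None:
        # SystemExit with a string prints the message to stderr and exits with status 1.
        raise SystemExit(
            f"`{name}` is required but was not found on your PATH. {install_hint}"
        )
    return path
```

A setup script might call `require_tool("uv", "Please install uv and re-run this script.")` before doing anything else, so the failure is reported up front rather than halfway through installation.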

### Opting out
- Disable hook for a single push: Run `git push --no-verify`
- Disable hook permanently: If something goes wrong and you need to wipe your setup:
  - Delete the `~/.cache/pre-commit` folder and the `.git/hooks/pre-push` file in your local repo.
  - You can now rerun `python scripts/setup_hooks.py` to set up your git push hook again if you want.
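
The manual cleanup above can also be scripted. The following is a sketch under stated assumptions: `remove_push_hook` is a hypothetical helper, not part of the repo, and the cache directory is passed in as a parameter rather than hard-coded so that you can point it at wherever the commit-hook tool keeps its cache on your machine.

```python
from __future__ import annotations

import shutil
from pathlib import Path


def remove_push_hook(repo_root: Path, cache_dir: Path) -> list[str]:
    """Delete the hook tool's cache and the pre-push hook; return what was removed."""
    removed: list[str] = []
    if cache_dir.is_dir():
        shutil.rmtree(cache_dir)
        removed.append(str(cache_dir))
    hook = repo_root / ".git" / "hooks" / "pre-push"
    if hook.is_file():
        hook.unlink()
        removed.append(str(hook))
    return removed
```

The function is idempotent: running it a second time finds nothing to delete and returns an empty list.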

### Potential Future Changes
Things that could be done to make this even better if folks like these ideas:
- Automatic setup
  - Our `CONTRIBUTING.md` file tells devs to run `make setup-env`.  That could be a good entry point to hook the installation into
- Fix the console output streaming
- Make every lintrunner invocation (including manual ones) use the same repo-specific venv that the commit-hook uses.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/158389
Approved by: https://github.com/seemethere
2025-07-18 19:55:35 +00:00


#!/usr/bin/env python3
"""
Pre-push hook wrapper for Lintrunner.

✓ Stores a hash of .lintrunner.toml in the venv
✓ Re-runs `lintrunner init` if that file's hash changes
"""

from __future__ import annotations

import hashlib
import os
import shutil
import subprocess
import sys
from pathlib import Path


REPO_ROOT = Path(__file__).resolve().parents[1]
LINTRUNNER_TOML_PATH = REPO_ROOT / ".lintrunner.toml"

# This is the path to the pre-commit-managed venv
VENV_ROOT = Path(sys.executable).parent.parent
# Stores the hash of .lintrunner.toml from the last time we ran `lintrunner init`
INITIALIZED_LINTRUNNER_TOML_HASH_PATH = VENV_ROOT / ".lintrunner_plugins_hash"


def ensure_lintrunner() -> None:
    """Fail if Lintrunner is not on PATH."""
    if shutil.which("lintrunner"):
        print("✅ lintrunner is already installed")
        return
    sys.exit(
        "❌ lintrunner is required but was not found on your PATH. "
        "Please run `python scripts/setup_hooks.py` to install and configure "
        "lintrunner before using this script. If `git push` still fails, "
        "you may need to open a new terminal."
    )


def ensure_virtual_environment() -> None:
    """Fail if not running within a virtual environment."""
    in_venv = (
        os.environ.get("VIRTUAL_ENV") is not None
        or hasattr(sys, "real_prefix")
        or (hasattr(sys, "base_prefix") and sys.base_prefix != sys.prefix)
    )
    if not in_venv:
        sys.exit(
            "❌ This script must be run from within a virtual environment. "
            "Please activate your virtual environment before running this script."
        )


def compute_file_hash(path: Path) -> str:
    """Returns SHA256 hash of a file's contents."""
    hasher = hashlib.sha256()
    with path.open("rb") as f:
        while chunk := f.read(8192):
            hasher.update(chunk)
    return hasher.hexdigest()


def read_stored_hash(path: Path) -> str | None:
    if not path.exists():
        return None
    try:
        return path.read_text().strip()
    except Exception:
        return None


def initialize_lintrunner_if_needed() -> None:
    """Runs lintrunner init if .lintrunner.toml changed since last run."""
    if not LINTRUNNER_TOML_PATH.exists():
        print("⚠️ No .lintrunner.toml found. Skipping init.")
        return
    print(
        f"INITIALIZED_LINTRUNNER_TOML_HASH_PATH = {INITIALIZED_LINTRUNNER_TOML_HASH_PATH}"
    )
    current_hash = compute_file_hash(LINTRUNNER_TOML_PATH)
    stored_hash = read_stored_hash(INITIALIZED_LINTRUNNER_TOML_HASH_PATH)
    if current_hash == stored_hash:
        print("✅ Lintrunner plugins already initialized and up to date.")
        return
    print("🔁 Running `lintrunner init` …", file=sys.stderr)
    subprocess.check_call(["lintrunner", "init"])
    INITIALIZED_LINTRUNNER_TOML_HASH_PATH.write_text(current_hash)


def main() -> None:
    # 0. Ensure we're running in a virtual environment
    ensure_virtual_environment()
    print(f"🐍 Virtual env being used: {VENV_ROOT}", file=sys.stderr)

    # 1. Ensure lintrunner binary is available
    ensure_lintrunner()

    # 2. Check for plugin updates and re-init if needed
    initialize_lintrunner_if_needed()

    # 3. Run lintrunner with any passed arguments and propagate its exit code
    args = sys.argv[1:]  # Forward all arguments to lintrunner
    result = subprocess.call(["lintrunner"] + args)
    sys.exit(result)


if __name__ == "__main__":
    main()