[BE] Isolate pre-push hook dependencies in dedicated virtual environment (#160048)

This adds two changes:
- Isolates pre-push hook dependencies in a dedicated venv, so they no longer affect your system environment
- Lets you manually run the pre-push lintrunner (including with lintrunner -a) by invoking `python scripts/lintrunner.py [-a]` (it's ugly, but better than nothing...for now)

This is a follow-up to:
- https://github.com/pytorch/pytorch/pull/158389

## Problem
The current pre-push hook setup installs lintrunner and related dependencies globally, which makes developers nervous about system pollution and can cause version conflicts with existing installations.

Also, if the pre-push lintrunner found errors, you had to hope your normal lintrunner could fix them, which wasn't always the case (e.g. if those errors only manifested in certain Python versions).

##  Key Changes:
  - Isolated Environment: Creates `.git/hooks/linter/.venv/` with Python 3.9 (the Python version used in CI) and an isolated lintrunner installation
  - User-Friendly CLI: New `python scripts/lintrunner.py` wrapper allows developers to run lintrunner (including `-a` auto-fix) from any environment
  - Simplified Architecture: Eliminates the pre-commit dependency entirely and uses direct git hooks instead
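
The "direct git hooks" idea can be sketched roughly as below. This is a minimal illustration rather than the code in this PR: the helper name `write_pre_push_hook` and its parameters are hypothetical, and it assumes a POSIX system where `/bin/bash` exists.

```python
from __future__ import annotations

import shlex
import stat
from pathlib import Path


def write_pre_push_hook(hooks_dir: Path, python_exe: Path, script: Path) -> Path:
    """Write an executable pre-push hook that runs `script` with `python_exe`.

    Hypothetical helper for illustration only.
    """
    # Quote both paths so spaces or shell metacharacters cannot break the hook.
    py = shlex.quote(str(python_exe))
    target = shlex.quote(str(script))
    hook_text = (
        "#!/bin/bash\n"
        "set -e\n"
        f"if [ ! -f {target} ]; then\n"
        "    exit 0  # wrapper missing (e.g. an older commit): skip linting\n"
        "fi\n"
        f"{py} {target}\n"
    )
    hooks_dir.mkdir(parents=True, exist_ok=True)
    hook = hooks_dir / "pre-push"
    hook.write_text(hook_text)
    # Git only runs hooks that carry the executable bit.
    hook.chmod(hook.stat().st_mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return hook
```

Because the hook is a plain shell script that exits with the wrapper's status, a failing lint run makes `git push` fail without any extra hook framework in between.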

##  File Changes:
  - scripts/setup_hooks.py: Rewritten to create isolated uv-managed virtual environment
  - scripts/lintrunner.py: New wrapper script with shared hash management logic
  - scripts/run_lintrunner.py: Removed (functionality merged into lintrunner.py)
  - .pre-commit-config.yaml: Removed (no longer needed)
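
The "hash management" mentioned above boils down to re-running an expensive step only when a config file changes. Here is a minimal sketch of that pattern, assuming a stamp file that stores the last seen SHA-256; the names `file_sha256` and `run_if_changed` are hypothetical and not taken from the PR.

```python
from __future__ import annotations

import hashlib
from pathlib import Path
from typing import Callable


def file_sha256(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in fixed-size chunks."""
    hasher = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def run_if_changed(config: Path, stamp: Path, action: Callable[[], None]) -> bool:
    """Run `action` only when `config` differs from the hash saved in `stamp`.

    Returns True when `action` ran. Hypothetical helper for illustration.
    """
    current = file_sha256(config)
    stored = stamp.read_text().strip() if stamp.exists() else None
    if current == stored:
        return False
    action()  # if this raises, the stamp is not updated and the next call retries
    stamp.write_text(current)
    return True
```

Writing the stamp only after the action succeeds means a failed init is retried on the next run instead of being silently skipped.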

##  Usage:
```
  # Setup (run once)
  python scripts/setup_hooks.py

  # Manual linting (works from any environment)
  python scripts/lintrunner.py        # Check mode
  python scripts/lintrunner.py -a     # Auto-fix mode

  # Git hooks work automatically
  git push  # Runs lintrunner in isolated environment

  # Need to skip the pre-push hook?
  git push --no-verify
```

##  Benefits:
  -  Zero global dependency installation
  -  Per-repository isolation prevents version conflicts
  -  Full lintrunner functionality is now accessible

##  Implementation Notes:
  - The virtual env is kept in a dedicated directory under `.git`, so each clone of the repo gets its own isolated environment
  - `lintrunner.py` does not need to be invoked from a specific venv. It invokes the hook venv itself.
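
Running a tool from a venv without activating it mostly comes down to calling the venv's executable with an adjusted environment. The following is a rough sketch of that idea, assuming the POSIX `bin/` venv layout; `venv_environment` and `run_in_venv` are hypothetical names used only for illustration.

```python
from __future__ import annotations

import os
import subprocess
from pathlib import Path


def venv_environment(venv_dir: Path) -> dict[str, str]:
    """Return a copy of os.environ that mimics an activated venv."""
    env = os.environ.copy()
    bin_dir = venv_dir / "bin"  # assumption: POSIX layout (Windows uses "Scripts")
    # Put the venv first on PATH so child processes pick up its python and tools.
    env["PATH"] = str(bin_dir) + os.pathsep + env.get("PATH", "")
    # Many tools check VIRTUAL_ENV to decide whether they are inside a venv.
    env["VIRTUAL_ENV"] = str(venv_dir)
    return env


def run_in_venv(venv_dir: Path, tool: str, args: list[str]) -> int:
    """Run `tool` from the venv's bin directory and return its exit code."""
    cmd = [str(venv_dir / "bin" / tool), *args]
    return subprocess.call(cmd, env=venv_environment(venv_dir))
```

The caller's own interpreter and environment are left untouched. Only the child process sees the modified `PATH` and `VIRTUAL_ENV`.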

A minor bug: the wrapper tends to garble the lintrunner output a bit, as the screenshot below shows. I haven't found a workaround so far, but the output remains understandable to users:
<img width="241" height="154" alt="image" src="https://github.com/user-attachments/assets/9496f925-8524-4434-8486-dc579442d688" />

## What's next?
Features that could be added:
- Check for lintrunner updates, auto-update if needed
- Depending on dev response, this could be enabled by default for all pytorch/pytorch environments

Pull Request resolved: https://github.com/pytorch/pytorch/pull/160048
Approved by: https://github.com/seemethere
Zain Rizvi
2025-08-12 01:58:44 +00:00
committed by PyTorch MergeBot
parent 7a974a88f2
commit 95210cc409
4 changed files with 251 additions and 207 deletions

.pre-commit-config.yaml (deleted)

@ -1,12 +0,0 @@
repos:
  - repo: local
    hooks:
      - id: lintrunner
        name: Run Lintrunner in an isolated venv before every push. The first run may be slow...
        entry: python scripts/run_lintrunner.py  # wrapper below
        language: python  # pre-commit manages venv for the wrapper
        additional_dependencies: []  # wrapper handles lintrunner install
        always_run: true
        stages: [pre-push]  # fire only on pre-push
        pass_filenames: false  # Lintrunner gets no per-file args
        verbose: true  # stream output as it is produced...allegedly anyways

scripts/lintrunner.py Normal file

@ -0,0 +1,181 @@
#!/usr/bin/env python3
"""
Wrapper script to run the isolated hook version of lintrunner.

This allows developers to easily run lintrunner (including with -a for auto-fixes)
using the same isolated environment that the pre-push hook uses, without having
to manually activate/deactivate virtual environments.

Usage:
    python scripts/lintrunner.py         # Check mode (same as git push)
    python scripts/lintrunner.py -a      # Auto-fix mode
    python scripts/lintrunner.py --help  # Show lintrunner help

This module also provides shared functionality for lintrunner hash management.
"""

from __future__ import annotations

import hashlib
import os
import shlex
import shutil
import subprocess
import sys
from pathlib import Path


def find_repo_root() -> Path:
    """Find repository root using git."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "--show-toplevel"],
            capture_output=True,
            text=True,
            check=True,
        )
        return Path(result.stdout.strip())
    except subprocess.CalledProcessError:
        sys.exit("❌ Not in a git repository")


def compute_file_hash(path: Path) -> str:
    """Returns SHA256 hash of a file's contents."""
    hasher = hashlib.sha256()
    with path.open("rb") as f:
        while chunk := f.read(8192):
            hasher.update(chunk)
    return hasher.hexdigest()


def read_stored_hash(path: Path) -> str | None:
    if not path.exists():
        return None
    try:
        return path.read_text().strip()
    except Exception:
        return None


# Venv location - change this if the path changes
HOOK_VENV_PATH = ".git/hooks/linter/.venv"


def get_hook_venv_path() -> Path:
    """Get the path to the hook virtual environment."""
    repo_root = find_repo_root()
    return repo_root / HOOK_VENV_PATH


def find_hook_venv() -> Path:
    """Locate the isolated hook virtual environment."""
    venv_dir = get_hook_venv_path()
    if not venv_dir.exists():
        sys.exit(
            f"❌ Hook virtual environment not found at {venv_dir}\n"
            " Please set this up by running: python scripts/setup_hooks.py"
        )
    return venv_dir


def check_lintrunner_installed(venv_dir: Path) -> None:
    """Check if lintrunner is installed in the given venv, exit if not."""
    result = subprocess.run(
        [
            "uv",
            "pip",
            "show",
            "--python",
            str(venv_dir / "bin" / "python"),
            "lintrunner",
        ],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    if result.returncode != 0:
        sys.exit(
            "❌ lintrunner is required but was not found in the hook environment. "
            "Please run `python scripts/setup_hooks.py` to reinstall."
        )
    print("✅ lintrunner is already installed")


def run_lintrunner(venv_dir: Path, args: list[str]) -> int:
    """Run lintrunner command in the specified venv and return exit code."""
    # Run lintrunner directly from the venv's bin directory with environment setup
    lintrunner_exe = venv_dir / "bin" / "lintrunner"
    cmd = [str(lintrunner_exe)] + args
    env = os.environ.copy()
    # PATH: Ensures lintrunner can find other tools in the venv (like python, pip, etc.)
    env["PATH"] = str(venv_dir / "bin") + os.pathsep + env.get("PATH", "")
    # VIRTUAL_ENV: Tells tools like pip_init.py that we're in a venv (prevents --user flag issues)
    env["VIRTUAL_ENV"] = str(venv_dir)
    # Note: Progress tends to be slightly garbled due to terminal control sequences,
    # but functionality and final results will be correct
    return subprocess.call(cmd, env=env)


def initialize_lintrunner_if_needed(venv_dir: Path) -> None:
    """Check if lintrunner needs initialization and run init if needed."""
    repo_root = find_repo_root()
    lintrunner_toml_path = repo_root / ".lintrunner.toml"
    initialized_hash_path = venv_dir / ".lintrunner_plugins_hash"
    if not lintrunner_toml_path.exists():
        print("⚠️ No .lintrunner.toml found. Skipping init.")
        return
    current_hash = compute_file_hash(lintrunner_toml_path)
    stored_hash = read_stored_hash(initialized_hash_path)
    if current_hash != stored_hash:
        print("🔁 Running `lintrunner init` …", file=sys.stderr)
        result = run_lintrunner(venv_dir, ["init"])
        if result != 0:
            sys.exit(f"❌ lintrunner init failed")
        initialized_hash_path.write_text(current_hash)
    else:
        print("✅ Lintrunner plugins already initialized and up to date.")


def main() -> None:
    """Run lintrunner in the isolated hook environment."""
    venv_dir = find_hook_venv()
    python_exe = venv_dir / "bin" / "python"
    if not python_exe.exists():
        sys.exit(f"❌ Python executable not found at {python_exe}")
    try:
        print(f"🐍 Virtual env being used: {venv_dir}", file=sys.stderr)
        # 1. Ensure lintrunner binary is available in the venv
        check_lintrunner_installed(venv_dir)
        # 2. Check for plugin updates and re-init if needed
        initialize_lintrunner_if_needed(venv_dir)
        # 3. Run lintrunner with any passed arguments and propagate its exit code
        args = sys.argv[1:]
        result = run_lintrunner(venv_dir, args)
        # If lintrunner failed and we're not already in auto-fix mode, suggest the wrapper
        if result != 0 and "-a" not in args:
            print(
                "\n💡 To auto-fix these issues, run: python scripts/lintrunner.py -a",
                file=sys.stderr,
            )
        sys.exit(result)
    except KeyboardInterrupt:
        print("\n Lintrunner interrupted by user (KeyboardInterrupt)", file=sys.stderr)
        sys.exit(1)  # Tell git push to fail


if __name__ == "__main__":
    main()

scripts/run_lintrunner.py (deleted)

@ -1,110 +0,0 @@
#!/usr/bin/env python3
"""
Pre-push hook wrapper for Lintrunner.

✓ Stores a hash of .lintrunner.toml in the venv
✓ Re-runs `lintrunner init` if that file's hash changes
"""

from __future__ import annotations

import hashlib
import os
import shutil
import subprocess
import sys
from pathlib import Path


REPO_ROOT = Path(__file__).resolve().parents[1]
LINTRUNNER_TOML_PATH = REPO_ROOT / ".lintrunner.toml"

# This is the path to the pre-commit-managed venv
VENV_ROOT = Path(sys.executable).parent.parent
# Stores the hash of .lintrunner.toml from the last time we ran `lintrunner init`
INITIALIZED_LINTRUNNER_TOML_HASH_PATH = VENV_ROOT / ".lintrunner_plugins_hash"


def ensure_lintrunner() -> None:
    """Fail if Lintrunner is not on PATH."""
    if shutil.which("lintrunner"):
        print("✅ lintrunner is already installed")
        return
    sys.exit(
        "❌ lintrunner is required but was not found on your PATH. Please run the `python scripts/setup_hooks.py` to install to configure lintrunner before using this script. If `git push` still fails, you may need to open an new terminal"
    )


def ensure_virtual_environment() -> None:
    """Fail if not running within a virtual environment."""
    in_venv = (
        os.environ.get("VIRTUAL_ENV") is not None
        or hasattr(sys, "real_prefix")
        or (hasattr(sys, "base_prefix") and sys.base_prefix != sys.prefix)
    )
    if not in_venv:
        sys.exit(
            "❌ This script must be run from within a virtual environment. "
            "Please activate your virtual environment before running this script."
        )


def compute_file_hash(path: Path) -> str:
    """Returns SHA256 hash of a file's contents."""
    hasher = hashlib.sha256()
    with path.open("rb") as f:
        while chunk := f.read(8192):
            hasher.update(chunk)
    return hasher.hexdigest()


def read_stored_hash(path: Path) -> str | None:
    if not path.exists():
        return None
    try:
        return path.read_text().strip()
    except Exception:
        return None


def initialize_lintrunner_if_needed() -> None:
    """Runs lintrunner init if .lintrunner.toml changed since last run."""
    if not LINTRUNNER_TOML_PATH.exists():
        print("⚠️ No .lintrunner.toml found. Skipping init.")
        return
    print(
        f"INITIALIZED_LINTRUNNER_TOML_HASH_PATH = {INITIALIZED_LINTRUNNER_TOML_HASH_PATH}"
    )
    current_hash = compute_file_hash(LINTRUNNER_TOML_PATH)
    stored_hash = read_stored_hash(INITIALIZED_LINTRUNNER_TOML_HASH_PATH)
    if current_hash == stored_hash:
        print("✅ Lintrunner plugins already initialized and up to date.")
        return
    print("🔁 Running `lintrunner init` …", file=sys.stderr)
    subprocess.check_call(["lintrunner", "init"])
    INITIALIZED_LINTRUNNER_TOML_HASH_PATH.write_text(current_hash)


def main() -> None:
    # 0. Ensure we're running in a virtual environment
    ensure_virtual_environment()
    print(f"🐍 Virtual env being used: {VENV_ROOT}", file=sys.stderr)
    # 1. Ensure lintrunner binary is available
    ensure_lintrunner()
    # 2. Check for plugin updates and re-init if needed
    initialize_lintrunner_if_needed()
    # 3. Run lintrunner with any passed arguments and propagate its exit code
    args = sys.argv[1:]  # Forward all arguments to lintrunner
    result = subprocess.call(["lintrunner"] + args)
    sys.exit(result)


if __name__ == "__main__":
    main()

scripts/setup_hooks.py

@ -1,31 +1,51 @@
#!/usr/bin/env python3
"""
Bootstrap Git pre-push hook.
Bootstrap Git pre-push hook with isolated virtual environment.
✓ Requires uv to be installed (fails if not available)
Installs/updates pre-commit with uv (global, venv-proof)
Registers the repo's pre-push hook and freezes hook versions
Creates isolated venv in .git/hooks/linter/.venv/ for hook dependencies
Installs lintrunner only in the isolated environment
✓ Creates direct git hook that bypasses pre-commit
Run this from the repo root (inside or outside any project venv):
python scripts/setup_hooks.py
IMPORTANT: The generated git hook references scripts/lintrunner.py. If users checkout
branches that don't have this file, git push will fail with "No such file or directory".
Users would need to either:
1. Re-run the old setup_hooks.py from that branch, or
2. Manually delete .git/hooks/pre-push to disable hooks temporarily, or
3. Switch back to a branch with the new scripts/lintrunner.py
"""
from __future__ import annotations
import shlex
import shutil
import subprocess
import sys
from pathlib import Path
from typing import Tuple
# Add scripts directory to Python path so we can import lintrunner module
scripts_dir = Path(__file__).parent
sys.path.insert(0, str(scripts_dir))
# Import shared functions from lintrunner module
from lintrunner import find_repo_root, get_hook_venv_path
# Restore sys.path to avoid affecting other imports
sys.path.pop(0)
# ───────────────────────────────────────────
# Helper utilities
# ───────────────────────────────────────────
def run(cmd: list[str]) -> None:
def run(cmd: list[str], cwd: Path = None) -> None:
print(f"$ {' '.join(cmd)}")
subprocess.check_call(cmd)
subprocess.check_call(cmd, cwd=cwd)
def which(cmd: str) -> bool:
@ -34,28 +54,7 @@ def which(cmd: str) -> bool:
def ensure_uv() -> None:
if which("uv"):
# Ensure the path uv installs binaries to is part of the system path
print("$ uv tool update-shell")
result = subprocess.run(
["uv", "tool", "update-shell"], capture_output=True, text=True
)
if result.returncode == 0:
# Check if the output indicates changes were made
if (
"Updated" in result.stdout
or "Added" in result.stdout
or "Modified" in result.stdout
):
print(
"⚠️ Shell configuration updated. You may need to restart your terminal for changes to take effect."
)
elif result.stdout.strip():
print(result.stdout)
return
else:
sys.exit(
f"❌ Warning: uv tool update-shell failed: {result.stderr}. uv installed tools may not be available."
)
return
sys.exit(
"\n❌ uv is required but was not found on your PATH.\n"
@ -65,29 +64,6 @@ def ensure_uv() -> None:
)
def ensure_tool_installed(
tool: str, force_update: bool = False, python_ver: Tuple[int, int] = None
) -> None:
"""
Checks to see if the tool is available and if not (or if force update requested) then
it reinstalls it.
Returns: Whether or not the tool is available on PATH. If it's not, a new terminal
needs to be opened before git pushes work as expected.
"""
if force_update or not which(tool):
print(f"Ensuring latest {tool} via uv …")
command = ["uv", "tool", "install", "--force", tool]
if python_ver:
# Add the Python version to the command if specified
command.extend(["--python", f"{python_ver[0]}.{python_ver[1]}"])
run(command)
if not which(tool):
print(
f"\n⚠️ {tool} installation succeed, but it's not on PATH. Launch a new terminal if your git pushes don't work.\n"
)
if sys.platform.startswith("win"):
print(
"\n⚠️ Lintrunner is not supported on Windows, so there are no pre-push hooks to add. Exiting setup.\n"
@ -95,52 +71,61 @@ if sys.platform.startswith("win"):
sys.exit(0)
# ───────────────────────────────────────────
# 1. Install dependencies
# 1. Setup isolated hook environment
# ───────────────────────────────────────────
ensure_uv()
# Ensure pre-commit is installed globally via uv
ensure_tool_installed("pre-commit", force_update=True, python_ver=(3, 9))
# Find repo root and setup hook directory
repo_root = find_repo_root()
venv_dir = get_hook_venv_path()
hooks_dir = venv_dir.parent.parent # Go from .git/hooks/linter/.venv to .git/hooks
# Don't force a lintrunner update because it might break folks
# who already have it installed in a different way
ensure_tool_installed("lintrunner")
# ───────────────────────────────────────────
# 2. Activate (or refresh) the pre-push hook
# ───────────────────────────────────────────
print(f"Setting up isolated hook environment in {venv_dir}")
# ── Activate (or refresh) the repo's pre-push hook ──────────────────────────
# Creates/overwrites .git/hooks/pre-push with a tiny shim that will call
# `pre-commit run --hook-stage pre-push` on every `git push`.
# This is why we need to install pre-commit globally.
#
# The --allow-missing-config flag lets pre-commit succeed if someone changes to
# a branch that doesn't have pre-commit installed
# Create isolated virtual environment for hooks
if venv_dir.exists():
print("Removing existing hook venv...")
shutil.rmtree(venv_dir)
run(["uv", "venv", str(venv_dir), "--python", "3.9"])
# Install lintrunner in the isolated environment
print("Installing lintrunner in isolated environment...")
run(
[
"uv",
"tool",
"run",
"pre-commit",
"install",
"--hook-type",
"pre-push",
"--allow-missing-config",
]
["uv", "pip", "install", "--python", str(venv_dir / "bin" / "python"), "lintrunner"]
)
# ── Pin remote-hook versions for reproducibility ────────────────────────────
# (Note: we don't have remote hooks right now, but it future-proofs this script)
# 1. `autoupdate` bumps every remote hook's `rev:` in .pre-commit-config.yaml
# to the latest commit on its default branch.
# 2. `--freeze` immediately rewrites each `rev:` to the exact commit SHA,
# ensuring all contributors and CI run identical hook code.
run(["uv", "tool", "run", "pre-commit", "autoupdate", "--freeze"])
# ───────────────────────────────────────────
# 2. Create direct git pre-push hook
# ───────────────────────────────────────────
pre_push_hook = hooks_dir / "pre-push"
python_exe = venv_dir / "bin" / "python"
lintrunner_script_path_quoted = shlex.quote(
str(repo_root / "scripts" / "lintrunner.py")
)
hook_script = f"""#!/bin/bash
set -e
# Check if lintrunner script exists (user might be on older commit)
if [ ! -f {lintrunner_script_path_quoted} ]; then
echo "⚠️ {lintrunner_script_path_quoted} not found - skipping linting (likely on an older commit)"
exit 0
fi
# Run lintrunner wrapper using the isolated venv's Python
{shlex.quote(str(python_exe))} {lintrunner_script_path_quoted}
"""
print(f"Creating git pre-push hook at {pre_push_hook}")
pre_push_hook.write_text(hook_script)
pre_push_hook.chmod(0o755) # Make executable
print(
"\npre-commit is installed globally via uv and the pre-push hook is active.\n"
"\nIsolated hook environment created and pre-push hook is active.\n"
" Lintrunner will now run automatically on every `git push`.\n"
f" Hook dependencies are isolated in {venv_dir}\n"
)