Package config/template files with torchgen (#78942)

This PR packages native_functions.yaml, tags.yaml and ATen/templates with torchgen.

This PR:
- adds a step to setup.py to copy the relevant files over into torchgen
- adds a docstring for torchgen (so `import torchgen; help(torchgen)` says something)
- adds a helper function in torchgen so you can get the torchgen root directory (and figure out where the packaged files are)
- changes some scripts to explicitly pass the location of torchgen, which will be helpful for the first item in the Future section.

Future
======

- torchgen, when invoked from the command line, should use sources in torchgen/packaged instead of aten/src. I'm unable to do this because people (aka PyTorch CI) invoke `python -m torchgen.gen` without installing torchgen.
- the source of truth for all of these files should be in torchgen. This is a bit annoying to execute on due to potential merge conflicts and dealing with merge systems
- CI and testing. The way things are set up right now is really fragile; we should have a CI job for torchgen.

Test Plan
=========

I ran the following locally:

```
python -m torchgen.gen -s torchgen/packaged
```

and verified that it produced output files. Furthermore, I did a setup.py install and checked that the files are actually being packaged with torchgen.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78942
Approved by: https://github.com/ezyang
2025-10-20 21:14:14 +08:00 · 2022-06-07 13:33:55 +00:00
parent 67badf0d5c
commit 9da5defff6
7 changed files with 79 additions and 2 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -122,6 +122,10 @@ env
 .circleci/scripts/COMMIT_MSG
 scripts/release_notes/*.json
 # These files get copied over on invoking setup.py
 torchgen/packaged/*
 !torchgen/packaged/README.md
 # IPython notebook checkpoints
 .ipynb_checkpoints
--- a/.jenkins/pytorch/codegen-test.sh
+++ b/.jenkins/pytorch/codegen-test.sh
@@ -27,6 +27,7 @@ rm -rf "$OUT"
 # aten codegen
 python -m torchgen.gen \
  -s aten/src/ATen \
  -d "$OUT"/torch/share/ATen
 # torch codegen
--- a/docs/cpp/source/check-doxygen.sh
+++ b/docs/cpp/source/check-doxygen.sh
@@ -16,7 +16,7 @@ pushd "$(dirname "$0")/../../.."
 cp torch/_utils_internal.py tools/shared
-python -m torchgen.gen
+python -m torchgen.gen --source-path aten/src/ATen
 python tools/setup_helpers/generate_code.py                 \
  --native-functions-path aten/src/ATen/native/native_functions.yaml \
--- a/setup.py
+++ b/setup.py
@@ -363,6 +363,33 @@ def check_submodules():
                                 'benchmark'), ['CMakeLists.txt'])
 # Windows has very bad support for symbolic links.
 # Instead of using symlinks, we're going to copy files over
 def mirror_files_into_torchgen():
    # (new_path, orig_path)
    # Directories are OK and are recursively mirrored.
    paths = [
        ('torchgen/packaged/ATen/native/native_functions.yaml', 'aten/src/ATen/native/native_functions.yaml'),
        ('torchgen/packaged/ATen/native/tags.yaml', 'aten/src/ATen/native/tags.yaml'),
        ('torchgen/packaged/ATen/templates', 'aten/src/ATen/templates'),
    ]
    for new_path, orig_path in paths:
        # Create the dirs involved in new_path if they don't exist
        if not os.path.exists(new_path):
            os.makedirs(os.path.dirname(new_path), exist_ok=True)
        # Copy the files from the orig location to the new location
        if os.path.isfile(orig_path):
            shutil.copyfile(orig_path, new_path)
            continue
        if os.path.isdir(orig_path):
            if os.path.exists(new_path):
                # copytree fails if the tree exists already, so remove it.
                shutil.rmtree(new_path)
            shutil.copytree(orig_path, new_path)
            continue
        raise RuntimeError("Check the file paths in `mirror_files_into_torchgen()`")
 # all the work we need to do _before_ setup runs
 def build_deps():
    report('-- Building version ' + version)
@@ -912,6 +939,7 @@ if __name__ == '__main__':
        print(e)
        sys.exit(1)
    mirror_files_into_torchgen()
    if RUN_BUILD_DEPS:
        build_deps()
@@ -1081,7 +1109,15 @@ if __name__ == '__main__':
                'utils/model_dump/code.js',
                'utils/model_dump/*.mjs',
            ],
-            'torchgen': [],
+            'torchgen': [
                # Recursive glob doesn't work in setup.py,
                # https://github.com/pypa/setuptools/issues/1806
                # To make this robust we should replace it with some code that
                # returns a list of everything under packaged/
                'packaged/ATen/*',
                'packaged/ATen/native/*',
                'packaged/ATen/templates/*',
            ],
            'caffe2': [
                'python/serialized_test/data/operator_test/*.zip',
            ],
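The comment in the `package_data` entry above notes that recursive globs are unavailable and suggests replacing the hand-written patterns with code that lists everything under `packaged/`. A minimal sketch of such a helper follows. The name `list_package_data` and its parameters are hypothetical and not part of this PR; the sketch assumes the paths should be relative to the package directory, as `package_data` patterns are.

```python
import os
from pathlib import Path
from typing import List


def list_package_data(package_dir: str, subdir: str) -> List[str]:
    """Return every file under package_dir/subdir, relative to package_dir.

    Hypothetical helper (not part of the PR): walks the tree so that nested
    directories are picked up without one glob pattern per directory level.
    """
    base = Path(package_dir)
    files = []
    for dirpath, _dirnames, filenames in os.walk(base / subdir):
        for name in filenames:
            # Use POSIX separators so the result looks the same on Windows.
            files.append((Path(dirpath) / name).relative_to(base).as_posix())
    return sorted(files)
```

Under these assumptions, the `'torchgen'` entry could be written as `'torchgen': list_package_data('torchgen', 'packaged')`, provided it is evaluated after `mirror_files_into_torchgen()` has run.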
--- a/torchgen/__init__.py
+++ b/torchgen/__init__.py
@@ -0,0 +1,10 @@
 """torchgen
 This module contains codegeneration utilities for PyTorch. It is used to
 build PyTorch from source, but may also be used for out-of-tree projects
 that extend PyTorch.
 Note well that we provide no BC guarantees for torchgen. If you're interested
 in using torchgen and want the PyTorch team to be aware, please reach out
 on GitHub.
 """
--- a/torchgen/gen.py
+++ b/torchgen/gen.py
@@ -2297,6 +2297,14 @@ def gen_declarations_yaml(
    )
 def get_torchgen_root() -> pathlib.Path:
    """
    If you're depending on torchgen out-of-tree, you can use the root to figure
    out the path to native_functions.yaml
    """
    return pathlib.Path(__file__).parent.resolve()
 def main() -> None:
    parser = argparse.ArgumentParser(description="Generate ATen source files")
    parser.add_argument(
--- a/torchgen/packaged/README.md
+++ b/torchgen/packaged/README.md
@@ -0,0 +1,18 @@
 What is torchgen/packaged?
 --------------------------
 This directory is a collection of files that have been mirrored from their
 original locations. setup.py is responsible for performing the mirroring.
 These files are necessary config files (e.g. `native_functions.yaml`) for
 torchgen to do its job; we mirror them over so that they can be packaged
 and distributed with torchgen.
 Ideally the source of truth of these files exists just in torchgen (and not
 elsewhere), but getting to that point is a bit difficult due to needing to
 deal with merge conflicts, multiple build systems, etc. We aspire towards
 this for the future, though.
 Note well that although we bundle torchgen with PyTorch, there are NO
 BC guarantees: use it at your own risk. If you're a user and want to use it,
 please reach out to us on GitHub.
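
For out-of-tree use, the packaged files can be located relative to the torchgen root. The following is a minimal sketch, assuming the directory layout that `mirror_files_into_torchgen()` creates in setup.py. The functions `packaged_file_paths` and `missing_packaged_files` are hypothetical names introduced for illustration and are not part of torchgen.

```python
from pathlib import Path
from typing import Dict, List


def packaged_file_paths(torchgen_root: Path) -> Dict[str, Path]:
    """Map each mirrored resource to its expected path under <root>/packaged.

    Hypothetical helper: the layout is taken from the (new_path, orig_path)
    pairs in setup.py and may change, since torchgen has no BC guarantees.
    """
    aten = Path(torchgen_root) / "packaged" / "ATen"
    return {
        "native_functions": aten / "native" / "native_functions.yaml",
        "tags": aten / "native" / "tags.yaml",
        "templates": aten / "templates",
    }


def missing_packaged_files(torchgen_root: Path) -> List[str]:
    """Return the names of expected resources that do not exist on disk."""
    paths = packaged_file_paths(torchgen_root)
    return sorted(name for name, path in paths.items() if not path.exists())
```

With torchgen installed, the root would come from the helper added in this PR, for example `packaged_file_paths(get_torchgen_root())` after `from torchgen.gen import get_torchgen_root`. An empty result from `missing_packaged_files` is one way to check that an install actually shipped the files.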