mirror of https://github.com/pytorch/pytorch.git
synced 2025-10-25 16:14:55 +08:00

Compare commits

81 Commits
| SHA1 |
|---|
| af3964a872 |
| 1645546aa9 |
| 350fad8a22 |
| 565d183042 |
| 2ebda372f6 |
| 28b846c486 |
| 9622eaa6fa |
| db8154df32 |
| b6eeea343d |
| 1fe9991554 |
| 00118024f3 |
| 87edf5a349 |
| 20972878cc |
| 0d1128d25c |
| 81dc60493d |
| b18df1cedf |
| 3976d77509 |
| 09c83673bf |
| 5b9a8f918e |
| f20fb2c1a1 |
| 4e00120117 |
| 2b3f35daea |
| c580437342 |
| 455e788fe6 |
| c980fb359b |
| bae45bb106 |
| 34557d80f4 |
| 1e77879b2a |
| ff52d424b2 |
| 4b7aa13b30 |
| e1f2d0916e |
| 4b5b7e53f6 |
| db66fa9436 |
| 392c89ab6a |
| cddf501fc5 |
| d0907d2c34 |
| 448a85a8e0 |
| ea3138fd09 |
| b89c96fe58 |
| 088f47bb89 |
| ddb3804f87 |
| a896311d06 |
| 937b634b5d |
| 004dfdc7cc |
| f8aa5e2ed7 |
| 8a49309f81 |
| 14de24d89c |
| c7cccc250e |
| 1f694e9a6e |
| 1108bced80 |
| c36d452224 |
| 11955b86d2 |
| 9a6788202b |
| d58bad4073 |
| f95e252984 |
| b49f0f8154 |
| 269c25267b |
| fde471ee2a |
| eb24d2ff6e |
| f768068c3b |
| c456451915 |
| f282d1dc7c |
| 2a3cae0f3e |
| 3d9630abc2 |
| da7a5147db |
| 5df8e582cd |
| 5dff261598 |
| aa0c8920af |
| a3b658bf3b |
| 94e89f3911 |
| f0956ad9ec |
| 452ea78f43 |
| 3d5d66868e |
| cf373e25e2 |
| 91d764c781 |
| 524235bb71 |
| e035fa028b |
| 58a928c3b9 |
| 4f1eefa8ad |
| 4251c151e3 |
| c0931a3a4d |
@ -1,4 +0,0 @@
# We do not use this library in our Bazel build. It contains an
# infinitely recursing symlink that makes Bazel very unhappy.
third_party/ittapi/
third_party/opentelemetry-cpp
.bazelrc (114 lines deleted)
@ -1,114 +0,0 @@
build --cxxopt=--std=c++17
build --copt=-I.
# Bazel does not support including its cc_library targets as system
# headers. We work around this for generated code
# (e.g. c10/macros/cmake_macros.h) by making the generated directory a
# system include path.
build --copt=-isystem --copt bazel-out/k8-fastbuild/bin
build --copt=-isystem --copt bazel-out/darwin-fastbuild/bin
build --experimental_ui_max_stdouterr_bytes=2048576

# Configuration to disable tty features for environments like CI
build:no-tty --curses no
build:no-tty --progress_report_interval 10
build:no-tty --show_progress_rate_limit 10

# Build with GPU support by default.
build --define=cuda=true
# rules_cuda configuration
build --@rules_cuda//cuda:enable_cuda
build --@rules_cuda//cuda:cuda_targets=sm_52
build --@rules_cuda//cuda:compiler=nvcc
build --repo_env=CUDA_PATH=/usr/local/cuda

# Configuration to build without GPU support
build:cpu-only --define=cuda=false
# define a separate build folder for faster switching between configs
build:cpu-only --platform_suffix=-cpu-only
# See the note on the config-less build for details about why we are
# doing this. We must also do it for the "-cpu-only" platform suffix.
build --copt=-isystem --copt=bazel-out/k8-fastbuild-cpu-only/bin
# rules_cuda configuration
build:cpu-only --@rules_cuda//cuda:enable_cuda=False

# Definition of --config=shell
# interactive shell immediately before execution
build:shell --run_under="//tools/bazel_tools:shellwrap"

# Disable all warnings for external repositories. We don't care about
# their warnings.
build --per_file_copt=^external/@-w

# Set additional warnings to error level.
#
# Implementation notes:
#  * we use file extensions to determine if we are using the C++
#    compiler or the cuda compiler
#  * we use ^// at the start of the regex to only permit matching
#    PyTorch files. This excludes external repos.
#
# Note that because this is logically a command-line flag, it is
# considered the final word on which warnings are enabled. This has the
# unfortunate consequence of preventing us from disabling an error at
# the target level because those flags will come before these flags in
# the action invocation. Instead we provide per-file exceptions after
# this.
#
# On the bright side, this means we don't have to more broadly apply
# the exceptions to an entire target.
#
# Looking for CUDA flags? We have a cu_library macro that we can edit
# directly. Look in //tools/rules:cu.bzl for details. Editing the
# macro over this has the following advantages:
#  * making changes does not require discarding the Bazel analysis
#    cache
#  * it allows for selective overrides on individual targets since the
#    macro-level opts will come earlier than target level overrides

build --per_file_copt='^//.*\.(cpp|cc)$'@-Werror=all
# The following warnings come from -Wall. We downgrade them from error
# to warnings here.
#
# We intentionally use #pragma unroll, which is compiler specific.
build --per_file_copt='^//.*\.(cpp|cc)$'@-Wno-error=unknown-pragmas

build --per_file_copt='^//.*\.(cpp|cc)$'@-Werror=extra
# The following warnings come from -Wextra. We downgrade them from error
# to warnings here.
#
# unused-parameter has a tremendous number of violations in the
# codebase. It will be a lot of work to fix them, so just disable it
# for now.
build --per_file_copt='^//.*\.(cpp|cc)$'@-Wno-unused-parameter
# missing-field-initializers not only has a large number of violations
# in the codebase, it is also used pervasively in the Python C
# API. There are a couple of catches though:
# * we use multiple versions of the Python API and hence have
#   potentially multiple different versions of each relevant
#   struct. They may have different numbers of fields. It will be
#   unwieldy to support multiple versions in the same source file.
# * Python itself for many of these structs recommends only
#   initializing a subset of the fields. We should respect the API
#   usage conventions of our dependencies.
#
# Hence, we just disable this warning altogether. We may want to clean
# up some of the clear-cut cases that could be risky, but we still
# likely want to have this disabled for the most part.
build --per_file_copt='^//.*\.(cpp|cc)$'@-Wno-missing-field-initializers

build --per_file_copt='^//.*\.(cpp|cc)$'@-Wno-unused-function
build --per_file_copt='^//.*\.(cpp|cc)$'@-Wno-unused-variable

build --per_file_copt='//:aten/src/ATen/RegisterCompositeExplicitAutograd\.cpp$'@-Wno-error=unused-function
build --per_file_copt='//:aten/src/ATen/RegisterCompositeImplicitAutograd\.cpp$'@-Wno-error=unused-function
build --per_file_copt='//:aten/src/ATen/RegisterMkldnnCPU\.cpp$'@-Wno-error=unused-function
build --per_file_copt='//:aten/src/ATen/RegisterNestedTensorCPU\.cpp$'@-Wno-error=unused-function
build --per_file_copt='//:aten/src/ATen/RegisterQuantizedCPU\.cpp$'@-Wno-error=unused-function
build --per_file_copt='//:aten/src/ATen/RegisterSparseCPU\.cpp$'@-Wno-error=unused-function
build --per_file_copt='//:aten/src/ATen/RegisterSparseCsrCPU\.cpp$'@-Wno-error=unused-function
build --per_file_copt='//:aten/src/ATen/RegisterNestedTensorMeta\.cpp$'@-Wno-error=unused-function
build --per_file_copt='//:aten/src/ATen/RegisterSparseMeta\.cpp$'@-Wno-error=unused-function
build --per_file_copt='//:aten/src/ATen/RegisterQuantizedMeta\.cpp$'@-Wno-error=unused-function
build --per_file_copt='//:aten/src/ATen/RegisterZeroTensor\.cpp$'@-Wno-error=unused-function
build --per_file_copt='//:torch/csrc/lazy/generated/RegisterAutogradLazy\.cpp$'@-Wno-error=unused-function
build --per_file_copt='//:torch/csrc/lazy/generated/RegisterLazy\.cpp$'@-Wno-error=unused-function
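The blanket `-Werror` flags and the per-file exceptions above compose purely by ordering: the exception flags come later on the command line, so they win for the files their regex matches. A hypothetical sketch of adding one more exception at invocation time (the target label and file path are illustrative, not taken from this diff):

```bash
# One extra generated file gets -Wno-error=unused-function; everything
# else keeps -Werror=all from the .bazelrc flags above.
bazel build \
  --per_file_copt='//:aten/src/ATen/RegisterSomethingNew\.cpp$'@-Wno-error=unused-function \
  //:torch
```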
@ -1 +0,0 @@
6.1.1
@ -1,26 +0,0 @@
[pt]
  is_oss=1

[buildfile]
  name = BUCK.oss
  includes = //tools/build_defs/select.bzl

[repositories]
  bazel_skylib = third_party/bazel-skylib/
  ovr_config = .

[download]
  in_build = true

[cxx]
  cxxflags = -std=c++17
  ldflags = -Wl,--no-undefined
  should_remap_host_platform = true
  cpp = /usr/bin/clang
  cc = /usr/bin/clang
  cxx = /usr/bin/clang++
  cxxpp = /usr/bin/clang++
  ld = /usr/bin/clang++

[project]
  default_flavors_mode=all
@ -1,14 +0,0 @@
# Jenkins

The scripts in this directory are the entrypoint for testing Caffe2.

The environment variable `BUILD_ENVIRONMENT` is expected to be set to
the build environment you intend to test. It is a hint for the build
and test scripts to configure Caffe2 a certain way and include/exclude
tests. For Docker images, it equals the name of the image itself. For
example: `py2-cuda9.0-cudnn7-ubuntu16.04`. The Docker images that are
built on Jenkins and are used in triggered builds already have this
environment variable set in their manifest. Also see
`./docker/jenkins/*/Dockerfile` and search for `BUILD_ENVIRONMENT`.

Our Jenkins installation is located at https://ci.pytorch.org/jenkins/.
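In practice this means the scripts are driven entirely by the environment rather than by flags; a minimal sketch, assuming the test entrypoint is run from this directory:

```bash
# BUILD_ENVIRONMENT selects the configuration; the value below is the
# example from the paragraph above.
export BUILD_ENVIRONMENT=py2-cuda9.0-cudnn7-ubuntu16.04
./test.sh
```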
@ -1,36 +0,0 @@
set -ex

LOCAL_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
ROOT_DIR=$(cd "$LOCAL_DIR"/../.. && pwd)
TEST_DIR="$ROOT_DIR/test"
gtest_reports_dir="${TEST_DIR}/test-reports/cpp"
pytest_reports_dir="${TEST_DIR}/test-reports/python"

# Figure out which Python to use
PYTHON="$(which python)"
if [[ "${BUILD_ENVIRONMENT}" =~ py((2|3)\.?[0-9]?\.?[0-9]?) ]]; then
  PYTHON=$(which "python${BASH_REMATCH[1]}")
fi

if [[ "${BUILD_ENVIRONMENT}" == *rocm* ]]; then
    # HIP_PLATFORM is auto-detected by hipcc; unset to avoid build errors
    unset HIP_PLATFORM
    if which sccache > /dev/null; then
        # Save sccache logs to file
        sccache --stop-server || true
        rm -f ~/sccache_error.log || true
        SCCACHE_ERROR_LOG=~/sccache_error.log SCCACHE_IDLE_TIMEOUT=0 sccache --start-server

        # Report sccache stats for easier debugging
        sccache --zero-stats
    fi
fi

# /usr/local/caffe2 is where the cpp bits are installed to in cmake-only
# builds. In +python builds the cpp tests are copied to /usr/local/caffe2 so
# that the test code in .ci/test.sh is the same
INSTALL_PREFIX="/usr/local/caffe2"

mkdir -p "$gtest_reports_dir" || true
mkdir -p "$pytest_reports_dir" || true
mkdir -p "$INSTALL_PREFIX" || true
@ -1,172 +0,0 @@
#!/bin/bash

# shellcheck source=./common.sh
source "$(dirname "${BASH_SOURCE[0]}")/common.sh"

if [[ ${BUILD_ENVIRONMENT} == *onnx* ]]; then
  pip install click mock tabulate networkx==2.0
  pip -q install --user "file:///var/lib/jenkins/workspace/third_party/onnx#egg=onnx"
fi

# Skip tests in environments where they are not built/applicable
if [[ "${BUILD_ENVIRONMENT}" == *-android* ]]; then
  echo 'Skipping tests'
  exit 0
fi
if [[ "${BUILD_ENVIRONMENT}" == *-rocm* ]]; then
  # temporary to locate some kernel issues on the CI nodes
  export HSAKMT_DEBUG_LEVEL=4
fi
# These additional packages are needed for circleci ROCm builds.
if [[ $BUILD_ENVIRONMENT == *rocm* ]]; then
    # Need networkx 2.0 because bellman_ford was moved in 2.1. Scikit-image by
    # default installs the most recent networkx version, so we install this lower
    # version explicitly before scikit-image pulls it in as a dependency.
    pip install networkx==2.0
    # click - onnx
    pip install --progress-bar off click protobuf tabulate virtualenv mock typing-extensions
fi

# Find where cpp tests and Caffe2 itself are installed
if [[ "$BUILD_ENVIRONMENT" == *cmake* ]]; then
  # For cmake-only builds we install everything into /usr/local
  cpp_test_dir="$INSTALL_PREFIX/cpp_test"
  ld_library_path="$INSTALL_PREFIX/lib"
else
  # For Python builds we install into python
  # cd to /usr first so the python import doesn't get confused by any 'caffe2'
  # directory in cwd
  python_installation="$(dirname $(dirname $(cd /usr && $PYTHON -c 'import os; import caffe2; print(os.path.realpath(caffe2.__file__))')))"
  caffe2_pypath="$python_installation/caffe2"
  cpp_test_dir="$python_installation/torch/test"
  ld_library_path="$python_installation/torch/lib"
fi

################################################################################
# C++ tests #
################################################################################
# Only run cpp tests in the first shard; don't run them a second time in the second shard
if [[ "${SHARD_NUMBER:-1}" == "1" ]]; then
  echo "Running C++ tests.."
  for test in $(find "$cpp_test_dir" -executable -type f); do
    case "$test" in
      # skip tests we know are hanging or bad
      */mkl_utils_test|*/aten/integer_divider_test)
        continue
        ;;
      */scalar_tensor_test|*/basic|*/native_test)
        if [[ "$BUILD_ENVIRONMENT" == *rocm* ]]; then
          continue
        else
          LD_LIBRARY_PATH="$ld_library_path" "$test"
        fi
        ;;
      */*_benchmark)
        LD_LIBRARY_PATH="$ld_library_path" "$test" --benchmark_color=false
        ;;
      *)
        # Currently, we use a mixture of gtest (caffe2) and Catch2 (ATen). While
        # planning to migrate to gtest as the common PyTorch c++ test suite, we
        # do NOT rely on Catch2's xml test reporter, because Catch2 doesn't
        # support multiple reporters
        # c.f. https://github.com/catchorg/Catch2/blob/master/docs/release-notes.md#223
        # which means that enabling XML output means you lose useful stdout
        # output for Jenkins.  It's more important to have useful console
        # output than it is to have XML output for Jenkins.
        # For the gtest-based tests, an XML report can be requested directly:
        LD_LIBRARY_PATH="$ld_library_path" \
            "$test" --gtest_output=xml:"$gtest_reports_dir/$(basename $test).xml"
        ;;
    esac
  done
fi

################################################################################
# Python tests #
################################################################################
if [[ "$BUILD_ENVIRONMENT" == *cmake* ]]; then
  exit 0
fi

# If pip is installed as root, we must use sudo.
# CircleCI docker images could install conda as the jenkins user, or use the OS's python package.
PIP=$(which pip)
PIP_USER=$(stat --format '%U' $PIP)
CURRENT_USER=$(id -u -n)
if [[ "$PIP_USER" = root && "$CURRENT_USER" != root ]]; then
  MAYBE_SUDO=sudo
fi

# Uninstall pre-installed hypothesis and coverage to use an older version, as newer
# versions remove the timeout parameter from settings, which ideep/conv_transpose_test.py uses
$MAYBE_SUDO pip -q uninstall -y hypothesis
$MAYBE_SUDO pip -q uninstall -y coverage

# "pip install hypothesis==3.44.6" from the official server is unreliable on
# CircleCI, so we host a copy on S3 instead
$MAYBE_SUDO pip -q install attrs==18.1.0 -f https://s3.amazonaws.com/ossci-linux/wheels/attrs-18.1.0-py2.py3-none-any.whl
$MAYBE_SUDO pip -q install coverage==4.5.1 -f https://s3.amazonaws.com/ossci-linux/wheels/coverage-4.5.1-cp36-cp36m-macosx_10_12_x86_64.whl
$MAYBE_SUDO pip -q install hypothesis==3.44.6 -f https://s3.amazonaws.com/ossci-linux/wheels/hypothesis-3.44.6-py3-none-any.whl

# Collect additional tests to run (outside caffe2/python)
EXTRA_TESTS=()

# CUDA builds always include NCCL support
if [[ "$BUILD_ENVIRONMENT" == *-cuda* ]] || [[ "$BUILD_ENVIRONMENT" == *-rocm* ]]; then
  EXTRA_TESTS+=("$caffe2_pypath/contrib/nccl")
fi

rocm_ignore_test=()
if [[ $BUILD_ENVIRONMENT == *-rocm* ]]; then
  # Currently these tests are failing on the ROCm platform:

  # On ROCm, RCCL (distributed) development isn't complete.
  # https://github.com/ROCmSoftwarePlatform/rccl
  rocm_ignore_test+=("--ignore $caffe2_pypath/python/data_parallel_model_test.py")

  # This test has been flaky in ROCm CI (but note the tests are
  # cpu-only so should be unrelated to ROCm)
  rocm_ignore_test+=("--ignore $caffe2_pypath/python/operator_test/blobs_queue_db_test.py")
  # This test is skipped on Jenkins (compiled without MKL) and otherwise known flaky
  rocm_ignore_test+=("--ignore $caffe2_pypath/python/ideep/convfusion_op_test.py")
  # This test is skipped on Jenkins (compiled without MKL) and causes a segfault on CircleCI
  rocm_ignore_test+=("--ignore $caffe2_pypath/python/ideep/pool_op_test.py")
fi

echo "Running Python tests.."
# locale setting is required by the click package
for loc in "en_US.utf8" "C.UTF-8"; do
  if locale -a | grep "$loc" >/dev/null 2>&1; then
    export LC_ALL="$loc"
    export LANG="$loc"
    break;
  fi
done

# Some Caffe2 tests fail when run using AVX512 ISA, see https://github.com/pytorch/pytorch/issues/66111
export DNNL_MAX_CPU_ISA=AVX2

# Should still run even in the absence of SHARD_NUMBER
if [[ "${SHARD_NUMBER:-1}" == "1" ]]; then
  # TODO(sdym@meta.com) remove this when the linked issue is resolved.
  # py is temporary until https://github.com/Teemu/pytest-sugar/issues/241 is fixed
  pip install --user py==1.11.0
  pip install --user pytest-sugar
  # NB: Warnings are disabled because they make it harder to see what
  # the actual erroring test is
  "$PYTHON" \
    -m pytest \
    -x \
    -v \
    --disable-warnings \
    --junit-xml="$pytest_reports_dir/result.xml" \
    --ignore "$caffe2_pypath/python/test/executor_test.py" \
    --ignore "$caffe2_pypath/python/operator_test/matmul_op_test.py" \
    --ignore "$caffe2_pypath/python/operator_test/pack_ops_test.py" \
    --ignore "$caffe2_pypath/python/mkl/mkl_sbn_speed_test.py" \
    --ignore "$caffe2_pypath/python/trt/test_pt_onnx_trt.py" \
    ${rocm_ignore_test[@]} \
    "$caffe2_pypath/python" \
    "${EXTRA_TESTS[@]}"
fi
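Note that, as written, both the C++ loop and the pytest invocation are gated on `SHARD_NUMBER` being 1 (or unset). A sketch of how a two-shard CI job would drive this script (values illustrative):

```bash
# Only shard 1 executes the gated blocks above; shard 2 performs just the
# environment setup.
SHARD_NUMBER=1 BUILD_ENVIRONMENT=py3.6-cuda9.0-cudnn7-ubuntu16.04 ./test.sh
SHARD_NUMBER=2 BUILD_ENVIRONMENT=py3.6-cuda9.0-cudnn7-ubuntu16.04 ./test.sh
```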
@ -1,32 +0,0 @@
# Docker images for GitHub CI

This directory contains everything needed to build the Docker images
that are used in our CI.

The Dockerfiles located in subdirectories are parameterized to
conditionally run build stages depending on build arguments passed to
`docker build`. This lets us use only a few Dockerfiles for many
images. The different configurations are identified by a freeform
string that we call a _build environment_. This string is persisted in
each image as the `BUILD_ENVIRONMENT` environment variable.

See `build.sh` for valid build environments (it's the giant switch).

## Contents

* `build.sh` -- dispatch script to launch all builds
* `common` -- scripts used to execute individual Docker build stages
* `ubuntu` -- Dockerfile for Ubuntu image for CPU build and test jobs
* `ubuntu-cuda` -- Dockerfile for Ubuntu image with CUDA support for nvidia-docker
* `ubuntu-rocm` -- Dockerfile for Ubuntu image with ROCm support
* `ubuntu-xpu` -- Dockerfile for Ubuntu image with XPU support

## Usage

```bash
# Build a specific image
./build.sh pytorch-linux-bionic-py3.8-gcc9 -t myimage:latest

# Set flags (see build.sh) and build image
sudo bash -c 'PROTOBUF=1 ./build.sh pytorch-linux-bionic-py3.8-gcc9 -t myimage:latest'
```
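Since the build environment is persisted in the image, a quick sanity check after a build looks like this (tag name reused from the example above):

```bash
# The image records its configuration in BUILD_ENVIRONMENT.
docker run --rm myimage:latest bash -c 'echo "$BUILD_ENVIRONMENT"'
```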
@ -1 +0,0 @@
<manifest package="org.pytorch.deps" />
@ -1,66 +0,0 @@
buildscript {
    ext {
        minSdkVersion = 21
        targetSdkVersion = 28
        compileSdkVersion = 28
        buildToolsVersion = '28.0.3'

        coreVersion = "1.2.0"
        extJUnitVersion = "1.1.1"
        runnerVersion = "1.2.0"
        rulesVersion = "1.2.0"
        junitVersion = "4.12"
    }

    repositories {
        google()
        mavenLocal()
        mavenCentral()
        jcenter()
    }

    dependencies {
        classpath 'com.android.tools.build:gradle:4.1.2'
        classpath 'com.vanniktech:gradle-maven-publish-plugin:0.14.2'
    }
}

repositories {
    google()
    jcenter()
}

apply plugin: 'com.android.library'

android {
    compileSdkVersion rootProject.compileSdkVersion
    buildToolsVersion rootProject.buildToolsVersion

    defaultConfig {
        minSdkVersion rootProject.minSdkVersion
        targetSdkVersion rootProject.targetSdkVersion
    }

    sourceSets {
        main {
            manifest.srcFile 'AndroidManifest.xml'
        }
    }
}

dependencies {
    implementation 'com.android.support:appcompat-v7:28.0.0'
    implementation 'androidx.appcompat:appcompat:1.0.0'
    implementation 'com.facebook.fbjni:fbjni-java-only:0.2.2'
    implementation 'com.google.code.findbugs:jsr305:3.0.1'
    implementation 'com.facebook.soloader:nativeloader:0.10.5'

    implementation 'junit:junit:' + rootProject.junitVersion
    implementation 'androidx.test:core:' + rootProject.coreVersion
    implementation 'androidx.test.ext:junit:' + rootProject.extJUnitVersion
    implementation 'androidx.test:rules:' + rootProject.rulesVersion
    implementation 'androidx.test:runner:' + rootProject.runnerVersion
}
@ -1,5 +0,0 @@
0.6b
manylinux_2_17
rocm6
04b5df8c8123f90cba3ede7e971e6fbc6040d506
3db6ecbc915893ff967abd6e1b43bd5f54949868873be60dc802086c3863e648
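The five lines are positional: release version, manylinux tag, ROCm base, pinned commit, and the tarball's SHA256. `install_aotriton.sh` (later in this diff) consumes them with a single `read`; roughly:

```bash
# Mirrors the read in install_aotriton.sh below; read hits EOF and
# returns non-zero, hence the `|| true`.
read -d "\n" VER MANYLINUX ROCMBASE PINNED_COMMIT SHA256 < aotriton_version.txt || true
echo "aotriton ${VER} (${MANYLINUX}, ${ROCMBASE}) sha256=${SHA256}"
```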
@ -1,558 +0,0 @@
#!/bin/bash

set -ex

image="$1"
shift

if [ -z "${image}" ]; then
  echo "Usage: $0 IMAGE"
  exit 1
fi

function extract_version_from_image_name() {
  eval export $2=$(echo "${image}" | perl -n -e"/$1(\d+(\.\d+)?(\.\d+)?)/ && print \$1")
  if [ "x${!2}" = x ]; then
    echo "variable '$2' not correctly parsed from image='$image'"
    exit 1
  fi
}

function extract_all_from_image_name() {
  # split $image into an array, splitting on '-'
  keep_IFS="$IFS"
  IFS="-"
  declare -a parts=($image)
  IFS="$keep_IFS"
  unset keep_IFS

  for part in "${parts[@]}"; do
    name=$(echo "${part}" | perl -n -e"/([a-zA-Z]+)\d+(\.\d+)?(\.\d+)?/ && print \$1")
    vername="${name^^}_VERSION"
    # "py" is the odd one out, needs this special case
    if [ "x${name}" = xpy ]; then
      vername=ANACONDA_PYTHON_VERSION
    fi
    # skip non-conforming fields such as "pytorch", "linux" or "bionic" without a version string
    if [ -n "${name}" ]; then
      extract_version_from_image_name "${name}" "${vername}"
    fi
  done
}

# Use the same pre-built XLA test image from PyTorch/XLA
if [[ "$image" == *xla* ]]; then
  echo "Using pre-built XLA test image..."
  exit 0
fi

if [[ "$image" == *-focal* ]]; then
  UBUNTU_VERSION=20.04
elif [[ "$image" == *-jammy* ]]; then
  UBUNTU_VERSION=22.04
elif [[ "$image" == *ubuntu* ]]; then
  extract_version_from_image_name ubuntu UBUNTU_VERSION
elif [[ "$image" == *centos* ]]; then
  extract_version_from_image_name centos CENTOS_VERSION
fi

if [ -n "${UBUNTU_VERSION}" ]; then
  OS="ubuntu"
elif [ -n "${CENTOS_VERSION}" ]; then
  OS="centos"
else
  echo "Unable to derive operating system base..."
  exit 1
fi

DOCKERFILE="${OS}/Dockerfile"
# For Ubuntu 22.04 CUDA images, start from the plain Ubuntu docker image instead of the nvidia/cuda image.
if [[ "$image" == *cuda* && "$UBUNTU_VERSION" != "22.04" ]]; then
  DOCKERFILE="${OS}-cuda/Dockerfile"
elif [[ "$image" == *rocm* ]]; then
  DOCKERFILE="${OS}-rocm/Dockerfile"
elif [[ "$image" == *xpu* ]]; then
  DOCKERFILE="${OS}-xpu/Dockerfile"
elif [[ "$image" == *cuda*linter* ]]; then
  # Use a separate Dockerfile for the linter to keep the image size small
  DOCKERFILE="linter-cuda/Dockerfile"
elif [[ "$image" == *linter* ]]; then
  # Use a separate Dockerfile for the linter to keep the image size small
  DOCKERFILE="linter/Dockerfile"
fi

# CMake 3.18 is needed to support the CUDA17 language variant
CMAKE_VERSION=3.18.5

_UCX_COMMIT=7bb2722ff2187a0cad557ae4a6afa090569f83fb
_UCC_COMMIT=20eae37090a4ce1b32bcce6144ccad0b49943e0b

# It's annoying to rename jobs every time you want to rewrite a
# configuration, so we hardcode everything here rather than do it
# from scratch
| case "$image" in | ||||
|   pytorch-linux-focal-cuda12.4-cudnn9-py3-gcc9) | ||||
|     CUDA_VERSION=12.4.0 | ||||
|     CUDNN_VERSION=9 | ||||
|     ANACONDA_PYTHON_VERSION=3.10 | ||||
|     GCC_VERSION=9 | ||||
|     PROTOBUF=yes | ||||
|     DB=yes | ||||
|     VISION=yes | ||||
|     KATEX=yes | ||||
|     UCX_COMMIT=${_UCX_COMMIT} | ||||
|     UCC_COMMIT=${_UCC_COMMIT} | ||||
|     CONDA_CMAKE=yes | ||||
|     TRITON=yes | ||||
|     ;; | ||||
|   pytorch-linux-focal-cuda12.1-cudnn9-py3-gcc9) | ||||
|     CUDA_VERSION=12.1.1 | ||||
|     CUDNN_VERSION=9 | ||||
|     ANACONDA_PYTHON_VERSION=3.10 | ||||
|     GCC_VERSION=9 | ||||
|     PROTOBUF=yes | ||||
|     DB=yes | ||||
|     VISION=yes | ||||
|     KATEX=yes | ||||
|     UCX_COMMIT=${_UCX_COMMIT} | ||||
|     UCC_COMMIT=${_UCC_COMMIT} | ||||
|     CONDA_CMAKE=yes | ||||
|     TRITON=yes | ||||
|     ;; | ||||
|   pytorch-linux-focal-cuda12.4-cudnn9-py3-gcc9-inductor-benchmarks) | ||||
|     CUDA_VERSION=12.4.0 | ||||
|     CUDNN_VERSION=9 | ||||
|     ANACONDA_PYTHON_VERSION=3.10 | ||||
|     GCC_VERSION=9 | ||||
|     PROTOBUF=yes | ||||
|     DB=yes | ||||
|     VISION=yes | ||||
|     KATEX=yes | ||||
|     UCX_COMMIT=${_UCX_COMMIT} | ||||
|     UCC_COMMIT=${_UCC_COMMIT} | ||||
|     CONDA_CMAKE=yes | ||||
|     TRITON=yes | ||||
|     INDUCTOR_BENCHMARKS=yes | ||||
|     ;; | ||||
|   pytorch-linux-focal-cuda12.1-cudnn9-py3-gcc9-inductor-benchmarks) | ||||
|     CUDA_VERSION=12.1.1 | ||||
|     CUDNN_VERSION=9 | ||||
|     ANACONDA_PYTHON_VERSION=3.10 | ||||
|     GCC_VERSION=9 | ||||
|     PROTOBUF=yes | ||||
|     DB=yes | ||||
|     VISION=yes | ||||
|     KATEX=yes | ||||
|     UCX_COMMIT=${_UCX_COMMIT} | ||||
|     UCC_COMMIT=${_UCC_COMMIT} | ||||
|     CONDA_CMAKE=yes | ||||
|     TRITON=yes | ||||
|     INDUCTOR_BENCHMARKS=yes | ||||
|     ;; | ||||
|   pytorch-linux-focal-cuda12.1-cudnn9-py3.12-gcc9-inductor-benchmarks) | ||||
|     CUDA_VERSION=12.1.1 | ||||
|     CUDNN_VERSION=9 | ||||
|     ANACONDA_PYTHON_VERSION=3.12 | ||||
|     GCC_VERSION=9 | ||||
|     PROTOBUF=yes | ||||
|     DB=yes | ||||
|     VISION=yes | ||||
|     KATEX=yes | ||||
|     UCX_COMMIT=${_UCX_COMMIT} | ||||
|     UCC_COMMIT=${_UCC_COMMIT} | ||||
|     CONDA_CMAKE=yes | ||||
|     TRITON=yes | ||||
|     INDUCTOR_BENCHMARKS=yes | ||||
|     ;; | ||||
|   pytorch-linux-focal-cuda12.4-cudnn9-py3.12-gcc9-inductor-benchmarks) | ||||
|     CUDA_VERSION=12.4.0 | ||||
|     CUDNN_VERSION=9 | ||||
|     ANACONDA_PYTHON_VERSION=3.12 | ||||
|     GCC_VERSION=9 | ||||
|     PROTOBUF=yes | ||||
|     DB=yes | ||||
|     VISION=yes | ||||
|     KATEX=yes | ||||
|     UCX_COMMIT=${_UCX_COMMIT} | ||||
|     UCC_COMMIT=${_UCC_COMMIT} | ||||
|     CONDA_CMAKE=yes | ||||
|     TRITON=yes | ||||
|     INDUCTOR_BENCHMARKS=yes | ||||
|     ;; | ||||
|   pytorch-linux-focal-cuda11.8-cudnn9-py3-gcc9) | ||||
|     CUDA_VERSION=11.8.0 | ||||
|     CUDNN_VERSION=9 | ||||
|     ANACONDA_PYTHON_VERSION=3.10 | ||||
|     GCC_VERSION=9 | ||||
|     PROTOBUF=yes | ||||
|     DB=yes | ||||
|     VISION=yes | ||||
|     KATEX=yes | ||||
|     UCX_COMMIT=${_UCX_COMMIT} | ||||
|     UCC_COMMIT=${_UCC_COMMIT} | ||||
|     CONDA_CMAKE=yes | ||||
|     TRITON=yes | ||||
|     ;; | ||||
  pytorch-linux-focal-py3-clang10-onnx)
    ANACONDA_PYTHON_VERSION=3.8
    CLANG_VERSION=10
    PROTOBUF=yes
    DB=yes
    VISION=yes
    CONDA_CMAKE=yes
    ONNX=yes
    ;;
  pytorch-linux-focal-py3-clang9-android-ndk-r21e)
    ANACONDA_PYTHON_VERSION=3.8
    CLANG_VERSION=9
    LLVMDEV=yes
    PROTOBUF=yes
    ANDROID=yes
    ANDROID_NDK_VERSION=r21e
    GRADLE_VERSION=6.8.3
    NINJA_VERSION=1.9.0
    ;;
  pytorch-linux-focal-py3.8-clang10)
    ANACONDA_PYTHON_VERSION=3.8
    CLANG_VERSION=10
    PROTOBUF=yes
    DB=yes
    VISION=yes
    VULKAN_SDK_VERSION=1.2.162.1
    SWIFTSHADER=yes
    CONDA_CMAKE=yes
    TRITON=yes
    ;;
  pytorch-linux-focal-py3.11-clang10)
    ANACONDA_PYTHON_VERSION=3.11
    CLANG_VERSION=10
    PROTOBUF=yes
    DB=yes
    VISION=yes
    VULKAN_SDK_VERSION=1.2.162.1
    SWIFTSHADER=yes
    CONDA_CMAKE=yes
    TRITON=yes
    ;;
  pytorch-linux-focal-py3.8-gcc9)
    ANACONDA_PYTHON_VERSION=3.8
    GCC_VERSION=9
    PROTOBUF=yes
    DB=yes
    VISION=yes
    CONDA_CMAKE=yes
    TRITON=yes
    ;;
  pytorch-linux-focal-rocm-n-1-py3)
    ANACONDA_PYTHON_VERSION=3.8
    GCC_VERSION=9
    PROTOBUF=yes
    DB=yes
    VISION=yes
    ROCM_VERSION=6.0
    NINJA_VERSION=1.9.0
    CONDA_CMAKE=yes
    TRITON=yes
    ;;
  pytorch-linux-focal-rocm-n-py3)
    ANACONDA_PYTHON_VERSION=3.8
    GCC_VERSION=9
    PROTOBUF=yes
    DB=yes
    VISION=yes
    ROCM_VERSION=6.1
    NINJA_VERSION=1.9.0
    CONDA_CMAKE=yes
    TRITON=yes
    ;;
  pytorch-linux-jammy-xpu-2024.0-py3)
    ANACONDA_PYTHON_VERSION=3.8
    GCC_VERSION=11
    PROTOBUF=yes
    DB=yes
    VISION=yes
    XPU_VERSION=0.5
    NINJA_VERSION=1.9.0
    CONDA_CMAKE=yes
    TRITON=yes
    ;;
  pytorch-linux-jammy-py3.8-gcc11-inductor-benchmarks)
    ANACONDA_PYTHON_VERSION=3.8
    GCC_VERSION=11
    PROTOBUF=yes
    DB=yes
    VISION=yes
    KATEX=yes
    CONDA_CMAKE=yes
    TRITON=yes
    DOCS=yes
    INDUCTOR_BENCHMARKS=yes
    ;;
  pytorch-linux-jammy-cuda11.8-cudnn9-py3.8-clang12)
    ANACONDA_PYTHON_VERSION=3.8
    CUDA_VERSION=11.8
    CUDNN_VERSION=9
    CLANG_VERSION=12
    PROTOBUF=yes
    DB=yes
    VISION=yes
    TRITON=yes
    ;;
  pytorch-linux-jammy-py3-clang12-asan)
    ANACONDA_PYTHON_VERSION=3.9
    CLANG_VERSION=12
    PROTOBUF=yes
    DB=yes
    VISION=yes
    CONDA_CMAKE=yes
    TRITON=yes
    ;;
  pytorch-linux-jammy-py3-clang15-asan)
    ANACONDA_PYTHON_VERSION=3.10
    CLANG_VERSION=15
    CONDA_CMAKE=yes
    VISION=yes
    ;;
  pytorch-linux-jammy-py3.8-gcc11)
    ANACONDA_PYTHON_VERSION=3.8
    GCC_VERSION=11
    PROTOBUF=yes
    DB=yes
    VISION=yes
    KATEX=yes
    CONDA_CMAKE=yes
    TRITON=yes
    DOCS=yes
    UNINSTALL_DILL=yes
    ;;
  pytorch-linux-jammy-py3-clang12-executorch)
    ANACONDA_PYTHON_VERSION=3.10
    CLANG_VERSION=12
    CONDA_CMAKE=yes
    EXECUTORCH=yes
    ;;
  pytorch-linux-focal-linter)
    # TODO: Use 3.9 here because of this issue https://github.com/python/mypy/issues/13627.
    # We will need to update the mypy version eventually, but that's for another day. The task
    # would be to upgrade mypy to 1.0.0 with Python 3.11.
    ANACONDA_PYTHON_VERSION=3.9
    CONDA_CMAKE=yes
    ;;
  pytorch-linux-jammy-cuda11.8-cudnn9-py3.9-linter)
    ANACONDA_PYTHON_VERSION=3.9
    CUDA_VERSION=11.8
    CONDA_CMAKE=yes
    ;;
  pytorch-linux-jammy-aarch64-py3.10-gcc11)
    ANACONDA_PYTHON_VERSION=3.10
    GCC_VERSION=11
    ACL=yes
    PROTOBUF=yes
    DB=yes
    VISION=yes
    CONDA_CMAKE=yes
    # snadampal: skipping sccache due to the following issue
    # https://github.com/pytorch/pytorch/issues/121559
    SKIP_SCCACHE_INSTALL=yes
    # snadampal: skipping the llvm src build install because the current version
    # from pytorch/llvm:9.0.1 is x86 specific
    SKIP_LLVM_SRC_BUILD_INSTALL=yes
    ;;
  *)
    # Catch-all for builds that are not hardcoded.
    PROTOBUF=yes
    DB=yes
    VISION=yes
    echo "image '$image' did not match an existing build configuration"
    if [[ "$image" == *py* ]]; then
      extract_version_from_image_name py ANACONDA_PYTHON_VERSION
    fi
    if [[ "$image" == *cuda* ]]; then
      extract_version_from_image_name cuda CUDA_VERSION
      extract_version_from_image_name cudnn CUDNN_VERSION
    fi
    if [[ "$image" == *rocm* ]]; then
      extract_version_from_image_name rocm ROCM_VERSION
      NINJA_VERSION=1.9.0
      TRITON=yes
      # To ensure that any ROCm config will build using conda cmake
      # and thus have LAPACK/MKL enabled
      CONDA_CMAKE=yes
    fi
    if [[ "$image" == *centos7* ]]; then
      NINJA_VERSION=1.10.2
    fi
    if [[ "$image" == *gcc* ]]; then
      extract_version_from_image_name gcc GCC_VERSION
    fi
    if [[ "$image" == *clang* ]]; then
      extract_version_from_image_name clang CLANG_VERSION
    fi
    if [[ "$image" == *devtoolset* ]]; then
      extract_version_from_image_name devtoolset DEVTOOLSET_VERSION
    fi
    if [[ "$image" == *glibc* ]]; then
      extract_version_from_image_name glibc GLIBC_VERSION
    fi
    if [[ "$image" == *cmake* ]]; then
      extract_version_from_image_name cmake CMAKE_VERSION
    fi
  ;;
esac
|  | ||||
| tmp_tag=$(basename "$(mktemp -u)" | tr '[:upper:]' '[:lower:]') | ||||
|  | ||||
| #when using cudnn version 8 install it separately from cuda | ||||
| if [[ "$image" == *cuda*  && ${OS} == "ubuntu" ]]; then | ||||
|   IMAGE_NAME="nvidia/cuda:${CUDA_VERSION}-cudnn${CUDNN_VERSION}-devel-ubuntu${UBUNTU_VERSION}" | ||||
|   if [[ ${CUDNN_VERSION} == 9 ]]; then | ||||
|     IMAGE_NAME="nvidia/cuda:${CUDA_VERSION}-devel-ubuntu${UBUNTU_VERSION}" | ||||
|   fi | ||||
| fi | ||||
|  | ||||
| # Build image | ||||
| docker build \ | ||||
|        --no-cache \ | ||||
|        --progress=plain \ | ||||
|        --build-arg "BUILD_ENVIRONMENT=${image}" \ | ||||
|        --build-arg "PROTOBUF=${PROTOBUF:-}" \ | ||||
|        --build-arg "LLVMDEV=${LLVMDEV:-}" \ | ||||
|        --build-arg "DB=${DB:-}" \ | ||||
|        --build-arg "VISION=${VISION:-}" \ | ||||
|        --build-arg "UBUNTU_VERSION=${UBUNTU_VERSION}" \ | ||||
|        --build-arg "CENTOS_VERSION=${CENTOS_VERSION}" \ | ||||
|        --build-arg "DEVTOOLSET_VERSION=${DEVTOOLSET_VERSION}" \ | ||||
|        --build-arg "GLIBC_VERSION=${GLIBC_VERSION}" \ | ||||
|        --build-arg "CLANG_VERSION=${CLANG_VERSION}" \ | ||||
|        --build-arg "ANACONDA_PYTHON_VERSION=${ANACONDA_PYTHON_VERSION}" \ | ||||
|        --build-arg "GCC_VERSION=${GCC_VERSION}" \ | ||||
|        --build-arg "CUDA_VERSION=${CUDA_VERSION}" \ | ||||
|        --build-arg "CUDNN_VERSION=${CUDNN_VERSION}" \ | ||||
|        --build-arg "TENSORRT_VERSION=${TENSORRT_VERSION}" \ | ||||
|        --build-arg "ANDROID=${ANDROID}" \ | ||||
|        --build-arg "ANDROID_NDK=${ANDROID_NDK_VERSION}" \ | ||||
|        --build-arg "GRADLE_VERSION=${GRADLE_VERSION}" \ | ||||
|        --build-arg "VULKAN_SDK_VERSION=${VULKAN_SDK_VERSION}" \ | ||||
|        --build-arg "SWIFTSHADER=${SWIFTSHADER}" \ | ||||
|        --build-arg "CMAKE_VERSION=${CMAKE_VERSION:-}" \ | ||||
|        --build-arg "NINJA_VERSION=${NINJA_VERSION:-}" \ | ||||
|        --build-arg "KATEX=${KATEX:-}" \ | ||||
|        --build-arg "ROCM_VERSION=${ROCM_VERSION:-}" \ | ||||
|        --build-arg "PYTORCH_ROCM_ARCH=${PYTORCH_ROCM_ARCH:-gfx906;gfx90a}" \ | ||||
|        --build-arg "IMAGE_NAME=${IMAGE_NAME}" \ | ||||
|        --build-arg "UCX_COMMIT=${UCX_COMMIT}" \ | ||||
|        --build-arg "UCC_COMMIT=${UCC_COMMIT}" \ | ||||
|        --build-arg "CONDA_CMAKE=${CONDA_CMAKE}" \ | ||||
|        --build-arg "TRITON=${TRITON}" \ | ||||
|        --build-arg "ONNX=${ONNX}" \ | ||||
|        --build-arg "DOCS=${DOCS}" \ | ||||
|        --build-arg "INDUCTOR_BENCHMARKS=${INDUCTOR_BENCHMARKS}" \ | ||||
|        --build-arg "EXECUTORCH=${EXECUTORCH}" \ | ||||
|        --build-arg "XPU_VERSION=${XPU_VERSION}" \ | ||||
|        --build-arg "ACL=${ACL:-}" \ | ||||
|        --build-arg "SKIP_SCCACHE_INSTALL=${SKIP_SCCACHE_INSTALL:-}" \ | ||||
|        --build-arg "SKIP_LLVM_SRC_BUILD_INSTALL=${SKIP_LLVM_SRC_BUILD_INSTALL:-}" \ | ||||
|        -f $(dirname ${DOCKERFILE})/Dockerfile \ | ||||
|        -t "$tmp_tag" \ | ||||
|        "$@" \ | ||||
|        . | ||||
|  | ||||
| # NVIDIA dockers for RC releases use tag names like `11.0-cudnn9-devel-ubuntu18.04-rc`, | ||||
| # for this case we will set UBUNTU_VERSION to `18.04-rc` so that the Dockerfile could | ||||
| # find the correct image. As a result, here we have to replace the | ||||
| #   "$UBUNTU_VERSION" == "18.04-rc" | ||||
| # with | ||||
| #   "$UBUNTU_VERSION" == "18.04" | ||||
| UBUNTU_VERSION=$(echo ${UBUNTU_VERSION} | sed 's/-rc$//') | ||||
|  | ||||
| function drun() { | ||||
|   docker run --rm "$tmp_tag" $* | ||||
| } | ||||
|  | ||||
| if [[ "$OS" == "ubuntu" ]]; then | ||||
|  | ||||
|   if !(drun lsb_release -a 2>&1 | grep -qF Ubuntu); then | ||||
|     echo "OS=ubuntu, but:" | ||||
|     drun lsb_release -a | ||||
|     exit 1 | ||||
|   fi | ||||
|   if !(drun lsb_release -a 2>&1 | grep -qF "$UBUNTU_VERSION"); then | ||||
|     echo "UBUNTU_VERSION=$UBUNTU_VERSION, but:" | ||||
|     drun lsb_release -a | ||||
|     exit 1 | ||||
|   fi | ||||
| fi | ||||
|  | ||||
| if [ -n "$ANACONDA_PYTHON_VERSION" ]; then | ||||
|   if !(drun python --version 2>&1 | grep -qF "Python $ANACONDA_PYTHON_VERSION"); then | ||||
|     echo "ANACONDA_PYTHON_VERSION=$ANACONDA_PYTHON_VERSION, but:" | ||||
|     drun python --version | ||||
|     exit 1 | ||||
|   fi | ||||
| fi | ||||
|  | ||||
| if [ -n "$GCC_VERSION" ]; then | ||||
|   if !(drun gcc --version 2>&1 | grep -q " $GCC_VERSION\\W"); then | ||||
|     echo "GCC_VERSION=$GCC_VERSION, but:" | ||||
|     drun gcc --version | ||||
|     exit 1 | ||||
|   fi | ||||
| fi | ||||
|  | ||||
| if [ -n "$CLANG_VERSION" ]; then | ||||
|   if !(drun clang --version 2>&1 | grep -qF "clang version $CLANG_VERSION"); then | ||||
|     echo "CLANG_VERSION=$CLANG_VERSION, but:" | ||||
|     drun clang --version | ||||
|     exit 1 | ||||
|   fi | ||||
| fi | ||||
|  | ||||
| if [ -n "$KATEX" ]; then | ||||
|   if !(drun katex --version); then | ||||
|     echo "KATEX=$KATEX, but:" | ||||
|     drun katex --version | ||||
|     exit 1 | ||||
|   fi | ||||
| fi | ||||
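For names that miss every hardcoded case, the catch-all branch derives the configuration from the name alone via `extract_version_from_image_name`. A hedged example with a made-up image name:

```bash
# Hypothetical: "py3.7" -> ANACONDA_PYTHON_VERSION=3.7 and "gcc7" ->
# GCC_VERSION=7; extra args after the image name go to `docker build`.
./build.sh pytorch-linux-focal-py3.7-gcc7 -t pytorch-ci:tmp
```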
@ -1,133 +0,0 @@
ARG CENTOS_VERSION

FROM centos:${CENTOS_VERSION}

ARG CENTOS_VERSION

# Set AMD gpu targets to build for
ARG PYTORCH_ROCM_ARCH
ENV PYTORCH_ROCM_ARCH ${PYTORCH_ROCM_ARCH}

# Install required packages to build Caffe2

# Install common dependencies (so that this step can be cached separately)
COPY ./common/install_base.sh install_base.sh
RUN bash ./install_base.sh && rm install_base.sh

# Update CentOS git version
RUN yum -y remove git
RUN yum -y remove git-*
RUN yum -y install https://packages.endpoint.com/rhel/7/os/x86_64/endpoint-repo-1.9-1.x86_64.rpm || \
    (yum -y install https://packages.endpointdev.com/rhel/7/os/x86_64/endpoint-repo-1.9-1.x86_64.rpm && \
    sed -i "s/packages.endpoint/packages.endpointdev/" /etc/yum.repos.d/endpoint.repo)
RUN yum install -y git

# Install devtoolset
ARG DEVTOOLSET_VERSION
COPY ./common/install_devtoolset.sh install_devtoolset.sh
RUN bash ./install_devtoolset.sh && rm install_devtoolset.sh
ENV BASH_ENV "/etc/profile"

# (optional) Install non-default glibc version
ARG GLIBC_VERSION
COPY ./common/install_glibc.sh install_glibc.sh
RUN if [ -n "${GLIBC_VERSION}" ]; then bash ./install_glibc.sh; fi
RUN rm install_glibc.sh

# Install user
COPY ./common/install_user.sh install_user.sh
RUN bash ./install_user.sh && rm install_user.sh

# Install conda and other packages (e.g., numpy, pytest)
ARG ANACONDA_PYTHON_VERSION
ARG CONDA_CMAKE
ENV ANACONDA_PYTHON_VERSION=$ANACONDA_PYTHON_VERSION
ENV PATH /opt/conda/envs/py_$ANACONDA_PYTHON_VERSION/bin:/opt/conda/bin:$PATH
COPY requirements-ci.txt /opt/conda/requirements-ci.txt
COPY ./common/install_conda.sh install_conda.sh
COPY ./common/common_utils.sh common_utils.sh
RUN bash ./install_conda.sh && rm install_conda.sh common_utils.sh /opt/conda/requirements-ci.txt

# (optional) Install protobuf for ONNX
ARG PROTOBUF
COPY ./common/install_protobuf.sh install_protobuf.sh
RUN if [ -n "${PROTOBUF}" ]; then bash ./install_protobuf.sh; fi
RUN rm install_protobuf.sh
ENV INSTALLED_PROTOBUF ${PROTOBUF}

# (optional) Install database packages like LMDB and LevelDB
ARG DB
COPY ./common/install_db.sh install_db.sh
RUN if [ -n "${DB}" ]; then bash ./install_db.sh; fi
RUN rm install_db.sh
ENV INSTALLED_DB ${DB}

# (optional) Install vision packages like OpenCV
ARG VISION
COPY ./common/install_vision.sh ./common/cache_vision_models.sh ./common/common_utils.sh ./
RUN if [ -n "${VISION}" ]; then bash ./install_vision.sh; fi
RUN rm install_vision.sh cache_vision_models.sh common_utils.sh
ENV INSTALLED_VISION ${VISION}

# Install rocm
ARG ROCM_VERSION
COPY ./common/install_rocm.sh install_rocm.sh
RUN bash ./install_rocm.sh
RUN rm install_rocm.sh
COPY ./common/install_rocm_magma.sh install_rocm_magma.sh
RUN bash ./install_rocm_magma.sh
RUN rm install_rocm_magma.sh
COPY ./common/install_amdsmi.sh install_amdsmi.sh
RUN bash ./install_amdsmi.sh
RUN rm install_amdsmi.sh
ENV PATH /opt/rocm/bin:$PATH
ENV PATH /opt/rocm/hcc/bin:$PATH
ENV PATH /opt/rocm/hip/bin:$PATH
ENV PATH /opt/rocm/opencl/bin:$PATH
ENV PATH /opt/rocm/llvm/bin:$PATH
ENV MAGMA_HOME /opt/rocm/magma
ENV LANG en_US.utf8
ENV LC_ALL en_US.utf8

# (optional) Install non-default CMake version
ARG CMAKE_VERSION
COPY ./common/install_cmake.sh install_cmake.sh
RUN if [ -n "${CMAKE_VERSION}" ]; then bash ./install_cmake.sh; fi
RUN rm install_cmake.sh

# (optional) Install non-default Ninja version
ARG NINJA_VERSION
COPY ./common/install_ninja.sh install_ninja.sh
RUN if [ -n "${NINJA_VERSION}" ]; then bash ./install_ninja.sh; fi
RUN rm install_ninja.sh

ARG TRITON
# Install triton. This needs to be done before sccache because the latter will
# try to reach out to S3, which docker build runners don't have access to.
ENV CMAKE_C_COMPILER cc
ENV CMAKE_CXX_COMPILER c++
COPY ./common/install_triton.sh install_triton.sh
COPY ./common/common_utils.sh common_utils.sh
COPY ci_commit_pins/triton-rocm.txt triton-rocm.txt
COPY triton_version.txt triton_version.txt
RUN if [ -n "${TRITON}" ]; then bash ./install_triton.sh; fi
RUN rm install_triton.sh common_utils.sh triton-rocm.txt triton_version.txt

# Install AOTriton (Early fail)
COPY ./aotriton_version.txt aotriton_version.txt
COPY ./common/common_utils.sh common_utils.sh
COPY ./common/install_aotriton.sh install_aotriton.sh
RUN ["/bin/bash", "-c", "./install_aotriton.sh /opt/rocm && rm -rf install_aotriton.sh aotriton_version.txt common_utils.sh"]
ENV AOTRITON_INSTALLED_PREFIX /opt/rocm/aotriton

# Install ccache/sccache (do this last, so we get priority in PATH)
COPY ./common/install_cache.sh install_cache.sh
ENV PATH /opt/cache/bin:$PATH
RUN bash ./install_cache.sh && rm install_cache.sh

# Include BUILD_ENVIRONMENT environment variable in image
ARG BUILD_ENVIRONMENT
ENV BUILD_ENVIRONMENT ${BUILD_ENVIRONMENT}

USER jenkins
CMD ["bash"]
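Normally `build.sh` supplies every ARG; a minimal manual build of this Dockerfile might look like the sketch below (all values illustrative, run from the docker directory so the `./common` COPY paths resolve):

```bash
docker build \
  --build-arg CENTOS_VERSION=7 \
  --build-arg DEVTOOLSET_VERSION=9 \
  --build-arg ANACONDA_PYTHON_VERSION=3.8 \
  --build-arg ROCM_VERSION=6.0 \
  --build-arg PYTORCH_ROCM_ARCH="gfx906;gfx90a" \
  -f centos-rocm/Dockerfile -t centos-rocm:tmp .
```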
@ -1 +0,0 @@
d4b3e5cc607e97afdba79dc90f8ef968142f347c
@ -1 +0,0 @@
243e186efbf7fb93328dd6b34927a4e8c8f24395
@ -1 +0,0 @@
730b907b4d45a4713cbc425cbf224c46089fd514
@ -1 +0,0 @@
21eae954efa5bf584da70324b640288c3ee7aede
@ -1 +0,0 @@
aac14a3b93f11d781d1d5ebc5400b15ae8df5185
@ -1 +0,0 @@
45fff310c891f5a92d55445adf8cc9d29df5841e
@ -1,18 +0,0 @@
#!/bin/bash

set -ex

source "$(dirname "${BASH_SOURCE[0]}")/common_utils.sh"

# Cache the test models at ~/.cache/torch/hub/
IMPORT_SCRIPT_FILENAME="/tmp/torchvision_import_script.py"
as_jenkins echo 'import torchvision; torchvision.models.mobilenet_v2(pretrained=True); torchvision.models.mobilenet_v3_large(pretrained=True);' > "${IMPORT_SCRIPT_FILENAME}"

pip_install --pre torch torchvision --index-url https://download.pytorch.org/whl/nightly/cpu
# Very weird quoting behavior here https://github.com/conda/conda/issues/10972,
# so echo the command to a file and run the file instead
conda_run python "${IMPORT_SCRIPT_FILENAME}"

# Cleaning up
conda_run pip uninstall -y torch torchvision
rm "${IMPORT_SCRIPT_FILENAME}" || true
@ -1,36 +0,0 @@
#!/bin/bash

# Work around a bug where devtoolset replaces sudo and breaks it.
if [ -n "$DEVTOOLSET_VERSION" ]; then
  export SUDO=/bin/sudo
else
  export SUDO=sudo
fi

as_jenkins() {
  # NB: unsetting the environment variables works around a conda bug
  # https://github.com/conda/conda/issues/6576
  # NB: Pass on PATH and LD_LIBRARY_PATH to sudo invocation
  # NB: This must be run from a directory that jenkins has access to,
  # works around https://github.com/conda/conda-package-handling/pull/34
  $SUDO -E -H -u jenkins env -u SUDO_UID -u SUDO_GID -u SUDO_COMMAND -u SUDO_USER env "PATH=$PATH" "LD_LIBRARY_PATH=$LD_LIBRARY_PATH" $*
}

conda_install() {
  # Ensure that the install command doesn't upgrade/downgrade Python.
  # This should be called as
  #   conda_install pkg1 pkg2 ... [-c channel]
  as_jenkins conda install -q -n py_$ANACONDA_PYTHON_VERSION -y python="$ANACONDA_PYTHON_VERSION" $*
}

conda_run() {
  as_jenkins conda run -n py_$ANACONDA_PYTHON_VERSION --no-capture-output $*
}

pip_install() {
  as_jenkins conda run -n py_$ANACONDA_PYTHON_VERSION pip install --progress-bar off $*
}

get_pinned_commit() {
  cat "${1}".txt
}
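These helpers are sourced by the install scripts in this directory; typical usage looks like the following sketch (package and pin names illustrative, though `triton-rocm.txt` does appear in the Dockerfile above):

```bash
source "$(dirname "${BASH_SOURCE[0]}")/common_utils.sh"

pip_install numpy                          # pip inside the conda env, as jenkins
conda_run python -c 'import numpy'         # run a command inside the conda env
commit=$(get_pinned_commit triton-rocm)    # reads triton-rocm.txt
```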
@ -1,16 +0,0 @@
set -euo pipefail

readonly version=v24.04
readonly src_host=https://review.mlplatform.org/ml
readonly src_repo=ComputeLibrary

# Clone ACL
[[ ! -d ${src_repo} ]] && git clone ${src_host}/${src_repo}.git
cd ${src_repo}

git checkout $version

# Build with scons
scons -j8 Werror=0 debug=0 neon=1 opencl=0 embed_kernels=0 \
  os=linux arch=armv8a build=native multi_isa=1 \
  fixed_format_kernels=1 openmp=1 cppthreads=0
@ -1,5 +0,0 @@
#!/bin/bash

set -ex

cd /opt/rocm/share/amd_smi && pip install .
| @ -1,112 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| set -ex | ||||
|  | ||||
| [ -n "${ANDROID_NDK}" ] | ||||
|  | ||||
| _https_amazon_aws=https://ossci-android.s3.amazonaws.com | ||||
|  | ||||
| apt-get update | ||||
| apt-get install -y --no-install-recommends autotools-dev autoconf unzip | ||||
| apt-get autoclean && apt-get clean | ||||
| rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* | ||||
|  | ||||
| pushd /tmp | ||||
| curl -Os --retry 3 $_https_amazon_aws/android-ndk-${ANDROID_NDK}-linux-x86_64.zip | ||||
| popd | ||||
| _ndk_dir=/opt/ndk | ||||
| mkdir -p "$_ndk_dir" | ||||
| unzip -qo /tmp/android*.zip -d "$_ndk_dir" | ||||
| _versioned_dir=$(find "$_ndk_dir/" -mindepth 1 -maxdepth 1 -type d) | ||||
| mv "$_versioned_dir"/* "$_ndk_dir"/ | ||||
| rmdir "$_versioned_dir" | ||||
| rm -rf /tmp/* | ||||
|  | ||||
| # Install OpenJDK | ||||
| # https://hub.docker.com/r/picoded/ubuntu-openjdk-8-jdk/dockerfile/ | ||||
|  | ||||
| sudo apt-get update && \ | ||||
|     apt-get install -y openjdk-8-jdk && \ | ||||
|     apt-get install -y ant && \ | ||||
|     apt-get clean && \ | ||||
|     rm -rf /var/lib/apt/lists/* && \ | ||||
|     rm -rf /var/cache/oracle-jdk8-installer; | ||||
|  | ||||
| # Fix certificate issues, found as of | ||||
| # https://bugs.launchpad.net/ubuntu/+source/ca-certificates-java/+bug/983302 | ||||
|  | ||||
| sudo apt-get update && \ | ||||
|     apt-get install -y ca-certificates-java && \ | ||||
|     apt-get clean && \ | ||||
|     update-ca-certificates -f && \ | ||||
|     rm -rf /var/lib/apt/lists/* && \ | ||||
|     rm -rf /var/cache/oracle-jdk8-installer; | ||||
|  | ||||
| export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/ | ||||
|  | ||||
| # Installing android sdk | ||||
| # https://github.com/circleci/circleci-images/blob/staging/android/Dockerfile.m4 | ||||
|  | ||||
| _tmp_sdk_zip=/tmp/android-sdk-linux.zip | ||||
| _android_home=/opt/android/sdk | ||||
|  | ||||
| rm -rf $_android_home | ||||
| sudo mkdir -p $_android_home | ||||
| curl --silent --show-error --location --fail --retry 3 --output /tmp/android-sdk-linux.zip $_https_amazon_aws/android-sdk-linux-tools3859397-build-tools2803-2902-platforms28-29.zip | ||||
| sudo unzip -q $_tmp_sdk_zip -d $_android_home | ||||
| rm $_tmp_sdk_zip | ||||
|  | ||||
| sudo chmod -R 777 $_android_home | ||||
|  | ||||
| export ANDROID_HOME=$_android_home | ||||
| export ADB_INSTALL_TIMEOUT=120 | ||||
|  | ||||
| export PATH="${ANDROID_HOME}/tools:${ANDROID_HOME}/tools/bin:${ANDROID_HOME}/platform-tools:${PATH}" | ||||
| echo "PATH:${PATH}" | ||||
|  | ||||
| # Installing Gradle | ||||
| echo "GRADLE_VERSION:${GRADLE_VERSION}" | ||||
| _gradle_home=/opt/gradle | ||||
sudo rm -rf $_gradle_home
| sudo mkdir -p $_gradle_home | ||||
|  | ||||
| curl --silent --output /tmp/gradle.zip --retry 3 $_https_amazon_aws/gradle-${GRADLE_VERSION}-bin.zip | ||||
|  | ||||
| sudo unzip -q /tmp/gradle.zip -d $_gradle_home | ||||
| rm /tmp/gradle.zip | ||||
|  | ||||
| sudo chmod -R 777 $_gradle_home | ||||
|  | ||||
| export GRADLE_HOME=$_gradle_home/gradle-$GRADLE_VERSION | ||||
| alias gradle="${GRADLE_HOME}/bin/gradle" | ||||
|  | ||||
| export PATH="${GRADLE_HOME}/bin/:${PATH}" | ||||
| echo "PATH:${PATH}" | ||||
|  | ||||
| gradle --version | ||||
|  | ||||
| mkdir /var/lib/jenkins/gradledeps | ||||
| cp build.gradle /var/lib/jenkins/gradledeps | ||||
| cp AndroidManifest.xml /var/lib/jenkins/gradledeps | ||||
|  | ||||
| pushd /var/lib/jenkins | ||||
|  | ||||
| export GRADLE_LOCAL_PROPERTIES=gradledeps/local.properties | ||||
| rm -f $GRADLE_LOCAL_PROPERTIES | ||||
| echo "sdk.dir=/opt/android/sdk" >> $GRADLE_LOCAL_PROPERTIES | ||||
| echo "ndk.dir=/opt/ndk" >> $GRADLE_LOCAL_PROPERTIES | ||||
|  | ||||
| chown -R jenkins /var/lib/jenkins/gradledeps | ||||
| chgrp -R jenkins /var/lib/jenkins/gradledeps | ||||
|  | ||||
| sudo -H -u jenkins $GRADLE_HOME/bin/gradle -Pandroid.useAndroidX=true -p /var/lib/jenkins/gradledeps -g /var/lib/jenkins/.gradle --refresh-dependencies --debug --stacktrace assemble | ||||
|  | ||||
| chown -R jenkins /var/lib/jenkins/.gradle | ||||
| chgrp -R jenkins /var/lib/jenkins/.gradle | ||||
|  | ||||
| popd | ||||
|  | ||||
| rm -rf /var/lib/jenkins/.gradle/daemon | ||||
|  | ||||
| # Cache vision models used by the test | ||||
| source "$(dirname "${BASH_SOURCE[0]}")/cache_vision_models.sh" | ||||
| @ -1,23 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| set -ex | ||||
|  | ||||
| source "$(dirname "${BASH_SOURCE[0]}")/common_utils.sh" | ||||
|  | ||||
| TARBALL='aotriton.tar.bz2' | ||||
# This read command always returns with exit code 1 (it hits EOF before the delimiter), hence the || true
| read -d "\n" VER MANYLINUX ROCMBASE PINNED_COMMIT SHA256 < aotriton_version.txt || true | ||||
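# The pin file holds five whitespace-separated fields; illustrative values only:
#   0.6b manylinux_2_17 rocm6.1 <pinned-commit-sha> <sha256>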
| ARCH=$(uname -m) | ||||
| AOTRITON_INSTALL_PREFIX="$1" | ||||
| AOTRITON_URL="https://github.com/ROCm/aotriton/releases/download/${VER}/aotriton-${VER}-${MANYLINUX}_${ARCH}-${ROCMBASE}.tar.bz2" | ||||
|  | ||||
| cd "${AOTRITON_INSTALL_PREFIX}" | ||||
| # Must use -L to follow redirects | ||||
| curl -L --retry 3 -o "${TARBALL}" "${AOTRITON_URL}" | ||||
| ACTUAL_SHA256=$(sha256sum "${TARBALL}" | cut -d " " -f 1) | ||||
| if [ "${SHA256}" != "${ACTUAL_SHA256}" ]; then | ||||
|   echo -n "Error: The SHA256 of downloaded tarball is ${ACTUAL_SHA256}," | ||||
|   echo " which does not match the expected value ${SHA256}." | ||||
  exit 1
| fi | ||||
| tar xf "${TARBALL}" && rm -rf "${TARBALL}" | ||||
| @ -1,159 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| set -ex | ||||
|  | ||||
| install_ubuntu() { | ||||
|   # NVIDIA dockers for RC releases use tag names like `11.0-cudnn9-devel-ubuntu18.04-rc`, | ||||
|   # for this case we will set UBUNTU_VERSION to `18.04-rc` so that the Dockerfile could | ||||
|   # find the correct image. As a result, here we have to check for | ||||
|   #   "$UBUNTU_VERSION" == "18.04"* | ||||
|   # instead of | ||||
|   #   "$UBUNTU_VERSION" == "18.04" | ||||
|   if [[ "$UBUNTU_VERSION" == "20.04"* ]]; then | ||||
|     cmake3="cmake=3.16*" | ||||
|     maybe_libiomp_dev="" | ||||
|   elif [[ "$UBUNTU_VERSION" == "22.04"* ]]; then | ||||
|     cmake3="cmake=3.22*" | ||||
|     maybe_libiomp_dev="" | ||||
|   else | ||||
|     cmake3="cmake=3.5*" | ||||
|     maybe_libiomp_dev="libiomp-dev" | ||||
|   fi | ||||
|  | ||||
|   if [[ "$CLANG_VERSION" == 15 ]]; then | ||||
|     maybe_libomp_dev="libomp-15-dev" | ||||
|   elif [[ "$CLANG_VERSION" == 12 ]]; then | ||||
|     maybe_libomp_dev="libomp-12-dev" | ||||
|   elif [[ "$CLANG_VERSION" == 10 ]]; then | ||||
|     maybe_libomp_dev="libomp-10-dev" | ||||
|   else | ||||
|     maybe_libomp_dev="" | ||||
|   fi | ||||
|  | ||||
|   # HACK: UCC testing relies on libnccl library from NVIDIA repo, and version 2.16 crashes | ||||
|   # See https://github.com/pytorch/pytorch/pull/105260#issuecomment-1673399729 | ||||
|   if [[ "$UBUNTU_VERSION" == "20.04"* && "$CUDA_VERSION" == "11.8"* ]]; then | ||||
|     maybe_libnccl_dev="libnccl2=2.15.5-1+cuda11.8 libnccl-dev=2.15.5-1+cuda11.8 --allow-downgrades --allow-change-held-packages" | ||||
|   else | ||||
|     maybe_libnccl_dev="" | ||||
|   fi | ||||
|  | ||||
|   # Install common dependencies | ||||
|   apt-get update | ||||
|   # TODO: Some of these may not be necessary | ||||
|   ccache_deps="asciidoc docbook-xml docbook-xsl xsltproc" | ||||
|   deploy_deps="libffi-dev libbz2-dev libreadline-dev libncurses5-dev libncursesw5-dev libgdbm-dev libsqlite3-dev uuid-dev tk-dev" | ||||
|   numpy_deps="gfortran" | ||||
|   apt-get install -y --no-install-recommends \ | ||||
|     $ccache_deps \ | ||||
|     $numpy_deps \ | ||||
|     ${deploy_deps} \ | ||||
|     ${cmake3} \ | ||||
|     apt-transport-https \ | ||||
|     autoconf \ | ||||
|     automake \ | ||||
|     build-essential \ | ||||
|     ca-certificates \ | ||||
|     curl \ | ||||
|     git \ | ||||
|     libatlas-base-dev \ | ||||
|     libc6-dbg \ | ||||
|     ${maybe_libiomp_dev} \ | ||||
|     libyaml-dev \ | ||||
|     libz-dev \ | ||||
|     libjemalloc2 \ | ||||
|     libjpeg-dev \ | ||||
|     libasound2-dev \ | ||||
|     libsndfile-dev \ | ||||
|     ${maybe_libomp_dev} \ | ||||
|     ${maybe_libnccl_dev} \ | ||||
|     software-properties-common \ | ||||
|     wget \ | ||||
|     sudo \ | ||||
|     vim \ | ||||
|     jq \ | ||||
|     libtool \ | ||||
|     unzip \ | ||||
|     gpg-agent \ | ||||
|     gdb | ||||
|  | ||||
|   # Should resolve issues related to various apt package repository cert issues | ||||
|   # see: https://github.com/pytorch/pytorch/issues/65931 | ||||
|   apt-get install -y libgnutls30 | ||||
|  | ||||
|   # Cleanup package manager | ||||
|   apt-get autoclean && apt-get clean | ||||
|   rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* | ||||
| } | ||||
|  | ||||
| install_centos() { | ||||
|   # Need EPEL for many packages we depend on. | ||||
|   # See http://fedoraproject.org/wiki/EPEL | ||||
|   yum --enablerepo=extras install -y epel-release | ||||
|  | ||||
|   ccache_deps="asciidoc docbook-dtds docbook-style-xsl libxslt" | ||||
|   numpy_deps="gcc-gfortran" | ||||
|   # Note: protobuf-c-{compiler,devel} on CentOS are too old to be used | ||||
|   # for Caffe2. That said, we still install them to make sure the build | ||||
|   # system opts to build/use protoc and libprotobuf from third-party. | ||||
|   yum install -y \ | ||||
|     $ccache_deps \ | ||||
|     $numpy_deps \ | ||||
|     autoconf \ | ||||
|     automake \ | ||||
|     bzip2 \ | ||||
|     cmake \ | ||||
|     cmake3 \ | ||||
|     curl \ | ||||
|     gcc \ | ||||
|     gcc-c++ \ | ||||
|     gflags-devel \ | ||||
|     git \ | ||||
|     glibc-devel \ | ||||
|     glibc-headers \ | ||||
|     glog-devel \ | ||||
|     libstdc++-devel \ | ||||
|     libsndfile-devel \ | ||||
|     make \ | ||||
|     opencv-devel \ | ||||
|     sudo \ | ||||
|     wget \ | ||||
|     vim \ | ||||
|     unzip \ | ||||
|     gdb | ||||
|  | ||||
|   # Cleanup | ||||
|   yum clean all | ||||
|   rm -rf /var/cache/yum | ||||
|   rm -rf /var/lib/yum/yumdb | ||||
|   rm -rf /var/lib/yum/history | ||||
| } | ||||
|  | ||||
| # Install base packages depending on the base OS | ||||
| ID=$(grep -oP '(?<=^ID=).+' /etc/os-release | tr -d '"') | ||||
| case "$ID" in | ||||
|   ubuntu) | ||||
|     install_ubuntu | ||||
|     ;; | ||||
|   centos) | ||||
|     install_centos | ||||
|     ;; | ||||
|   *) | ||||
|     echo "Unable to determine OS..." | ||||
|     exit 1 | ||||
|     ;; | ||||
| esac | ||||
|  | ||||
| # Install Valgrind separately since the apt-get version is too old. | ||||
| mkdir valgrind_build && cd valgrind_build | ||||
| VALGRIND_VERSION=3.20.0 | ||||
| wget https://ossci-linux.s3.amazonaws.com/valgrind-${VALGRIND_VERSION}.tar.bz2 | ||||
| tar -xjf valgrind-${VALGRIND_VERSION}.tar.bz2 | ||||
| cd valgrind-${VALGRIND_VERSION} | ||||
| ./configure --prefix=/usr/local | ||||
make -j$(($(nproc) - 2))
| sudo make install | ||||
| cd ../../ | ||||
| rm -rf valgrind_build | ||||
| alias valgrind="/usr/local/bin/valgrind" | ||||
| @ -1,118 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| set -ex | ||||
|  | ||||
| install_ubuntu() { | ||||
|   echo "Preparing to build sccache from source" | ||||
|   apt-get update | ||||
  # libssl-dev will not work as it is upgraded to libssl3 in Ubuntu 22.04.
  # Instead, use the lib and headers from OpenSSL 1.1 installed by `install_openssl.sh`
|   apt-get install -y cargo | ||||
|   echo "Checking out sccache repo" | ||||
|   git clone https://github.com/pytorch/sccache | ||||
|   cd sccache | ||||
|   echo "Building sccache" | ||||
|   cargo build --release | ||||
|   cp target/release/sccache /opt/cache/bin | ||||
|   echo "Cleaning up" | ||||
|   cd .. | ||||
|   rm -rf sccache | ||||
|   apt-get remove -y cargo rustc | ||||
|   apt-get autoclean && apt-get clean | ||||
| } | ||||
|  | ||||
| install_binary() { | ||||
|   echo "Downloading sccache binary from S3 repo" | ||||
|   curl --retry 3 https://s3.amazonaws.com/ossci-linux/sccache -o /opt/cache/bin/sccache | ||||
| } | ||||
|  | ||||
| mkdir -p /opt/cache/bin | ||||
| mkdir -p /opt/cache/lib | ||||
| sed -e 's|PATH="\(.*\)"|PATH="/opt/cache/bin:\1"|g' -i /etc/environment | ||||
| export PATH="/opt/cache/bin:$PATH" | ||||
|  | ||||
| # Setup compiler cache | ||||
| if [ -n "$ROCM_VERSION" ]; then | ||||
|   curl --retry 3 http://repo.radeon.com/misc/.sccache_amd/sccache -o /opt/cache/bin/sccache | ||||
| else | ||||
|   ID=$(grep -oP '(?<=^ID=).+' /etc/os-release | tr -d '"') | ||||
  # TODO: Install the pre-built binary from S3, as building from source
  # (https://github.com/pytorch/sccache) has started failing mysteriously,
  # with the sccache server unable to start with the following error:
  #   sccache: error: Invalid argument (os error 22)
|   install_binary | ||||
| fi | ||||
| chmod a+x /opt/cache/bin/sccache | ||||
|  | ||||
| function write_sccache_stub() { | ||||
|   # Unset LD_PRELOAD for ps because of asan + ps issues | ||||
|   # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90589 | ||||
|   printf "#!/bin/sh\nif [ \$(env -u LD_PRELOAD ps -p \$PPID -o comm=) != sccache ]; then\n  exec sccache $(which $1) \"\$@\"\nelse\n  exec $(which $1) \"\$@\"\nfi" > "/opt/cache/bin/$1" | ||||
|   chmod a+x "/opt/cache/bin/$1" | ||||
| } | ||||
|  | ||||
| write_sccache_stub cc | ||||
| write_sccache_stub c++ | ||||
| write_sccache_stub gcc | ||||
| write_sccache_stub g++ | ||||
|  | ||||
| # NOTE: See specific ROCM_VERSION case below. | ||||
| if [ "x$ROCM_VERSION" = x ]; then | ||||
|   write_sccache_stub clang | ||||
|   write_sccache_stub clang++ | ||||
| fi | ||||
|  | ||||
| if [ -n "$CUDA_VERSION" ]; then | ||||
|   # TODO: This is a workaround for the fact that PyTorch's FindCUDA | ||||
|   # implementation cannot find nvcc if it is setup this way, because it | ||||
|   # appears to search for the nvcc in PATH, and use its path to infer | ||||
|   # where CUDA is installed.  Instead, we install an nvcc symlink outside | ||||
|   # of the PATH, and set CUDA_NVCC_EXECUTABLE so that we make use of it. | ||||
|  | ||||
|   write_sccache_stub nvcc | ||||
|   mv /opt/cache/bin/nvcc /opt/cache/lib/ | ||||
| fi | ||||
|  | ||||
| if [ -n "$ROCM_VERSION" ]; then | ||||
|   # ROCm compiler is hcc or clang. However, it is commonly invoked via hipcc wrapper. | ||||
|   # hipcc will call either hcc or clang using an absolute path starting with /opt/rocm, | ||||
|   # causing the /opt/cache/bin to be skipped. We must create the sccache wrappers | ||||
|   # directly under /opt/rocm while also preserving the original compiler names. | ||||
|   # Note symlinks will chain as follows: [hcc or clang++] -> clang -> clang-?? | ||||
|   # Final link in symlink chain must point back to original directory. | ||||
|  | ||||
|   # Original compiler is moved one directory deeper. Wrapper replaces it. | ||||
|   function write_sccache_stub_rocm() { | ||||
|     OLDCOMP=$1 | ||||
|     COMPNAME=$(basename $OLDCOMP) | ||||
|     TOPDIR=$(dirname $OLDCOMP) | ||||
|     WRAPPED="$TOPDIR/original/$COMPNAME" | ||||
|     mv "$OLDCOMP" "$WRAPPED" | ||||
|     printf "#!/bin/sh\nexec sccache $WRAPPED \"\$@\"" > "$OLDCOMP" | ||||
|     chmod a+x "$OLDCOMP" | ||||
|   } | ||||
|  | ||||
|   if [[ -e "/opt/rocm/hcc/bin/hcc" ]]; then | ||||
|     # ROCm 3.3 or earlier. | ||||
|     mkdir /opt/rocm/hcc/bin/original | ||||
|     write_sccache_stub_rocm /opt/rocm/hcc/bin/hcc | ||||
|     write_sccache_stub_rocm /opt/rocm/hcc/bin/clang | ||||
|     write_sccache_stub_rocm /opt/rocm/hcc/bin/clang++ | ||||
|     # Fix last link in symlink chain, clang points to versioned clang in prior dir | ||||
|     pushd /opt/rocm/hcc/bin/original | ||||
|     ln -s ../$(readlink clang) | ||||
|     popd | ||||
|   elif [[ -e "/opt/rocm/llvm/bin/clang" ]]; then | ||||
|     # ROCm 3.5 and beyond. | ||||
|     mkdir /opt/rocm/llvm/bin/original | ||||
|     write_sccache_stub_rocm /opt/rocm/llvm/bin/clang | ||||
|     write_sccache_stub_rocm /opt/rocm/llvm/bin/clang++ | ||||
|     # Fix last link in symlink chain, clang points to versioned clang in prior dir | ||||
|     pushd /opt/rocm/llvm/bin/original | ||||
|     ln -s ../$(readlink clang) | ||||
|     popd | ||||
|   else | ||||
|     echo "Cannot find ROCm compiler." | ||||
|     exit 1 | ||||
|   fi | ||||
| fi | ||||
| @ -1,44 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| set -ex | ||||
|  | ||||
| if [ -n "$CLANG_VERSION" ]; then | ||||
|  | ||||
|   if [[ $CLANG_VERSION == 9 && $UBUNTU_VERSION == 18.04 ]]; then | ||||
|     sudo apt-get update | ||||
|     # gpg-agent is not available by default on 18.04 | ||||
|     sudo apt-get install  -y --no-install-recommends gpg-agent | ||||
|     wget --no-check-certificate -O - https://apt.llvm.org/llvm-snapshot.gpg.key | sudo apt-key add  - | ||||
|     apt-add-repository "deb http://apt.llvm.org/bionic/ llvm-toolchain-bionic-${CLANG_VERSION} main" | ||||
|   elif [[ $UBUNTU_VERSION == 22.04 ]]; then | ||||
|     # work around ubuntu apt-get conflicts | ||||
|     sudo apt-get -y -f install | ||||
|   fi | ||||
|  | ||||
|   sudo apt-get update | ||||
|   apt-get install -y --no-install-recommends clang-"$CLANG_VERSION" | ||||
|   apt-get install -y --no-install-recommends llvm-"$CLANG_VERSION" | ||||
|  | ||||
|   # Install dev version of LLVM. | ||||
|   if [ -n "$LLVMDEV" ]; then | ||||
|     sudo apt-get install -y --no-install-recommends llvm-"$CLANG_VERSION"-dev | ||||
|   fi | ||||
|  | ||||
|   # Use update-alternatives to make this version the default | ||||
|   update-alternatives --install /usr/bin/clang clang /usr/bin/clang-"$CLANG_VERSION" 50 | ||||
|   update-alternatives --install /usr/bin/clang++ clang++ /usr/bin/clang++-"$CLANG_VERSION" 50 | ||||
|   # Override cc/c++ to clang as well | ||||
|   update-alternatives --install /usr/bin/cc cc /usr/bin/clang 50 | ||||
|   update-alternatives --install /usr/bin/c++ c++ /usr/bin/clang++ 50 | ||||
|  | ||||
|   # clang's packaging is a little messed up (the runtime libs aren't | ||||
|   # added into the linker path), so give it a little help | ||||
|   clang_lib=("/usr/lib/llvm-$CLANG_VERSION/lib/clang/"*"/lib/linux") | ||||
|   echo "$clang_lib" > /etc/ld.so.conf.d/clang.conf | ||||
|   ldconfig | ||||
|  | ||||
|   # Cleanup package manager | ||||
|   apt-get autoclean && apt-get clean | ||||
|   rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* | ||||
|  | ||||
| fi | ||||
| @ -1,31 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| set -ex | ||||
|  | ||||
| [ -n "$CMAKE_VERSION" ] | ||||
|  | ||||
| # Remove system cmake install so it won't get used instead | ||||
| ID=$(grep -oP '(?<=^ID=).+' /etc/os-release | tr -d '"') | ||||
| case "$ID" in | ||||
|   ubuntu) | ||||
|     apt-get remove cmake -y | ||||
|     ;; | ||||
|   centos) | ||||
|     yum remove cmake -y | ||||
|     ;; | ||||
|   *) | ||||
|     echo "Unable to determine OS..." | ||||
|     exit 1 | ||||
|     ;; | ||||
| esac | ||||
|  | ||||
| # Turn 3.6.3 into v3.6 | ||||
path=$(echo "${CMAKE_VERSION}" | sed -e 's/\([0-9]\.[0-9]\+\).*/v\1/')
| file="cmake-${CMAKE_VERSION}-Linux-x86_64.tar.gz" | ||||
|  | ||||
| # Download and install specific CMake version in /usr/local | ||||
| pushd /tmp | ||||
| curl -Os --retry 3 "https://cmake.org/files/${path}/${file}" | ||||
| tar -C /usr/local --strip-components 1 --no-same-owner -zxf cmake-*.tar.gz | ||||
| rm -f cmake-*.tar.gz | ||||
| popd | ||||
| @ -1,127 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| set -ex | ||||
|  | ||||
| # Optionally install conda | ||||
| if [ -n "$ANACONDA_PYTHON_VERSION" ]; then | ||||
|   BASE_URL="https://repo.anaconda.com/miniconda" | ||||
|  | ||||
|   MAJOR_PYTHON_VERSION=$(echo "$ANACONDA_PYTHON_VERSION" | cut -d . -f 1) | ||||
|   MINOR_PYTHON_VERSION=$(echo "$ANACONDA_PYTHON_VERSION" | cut -d . -f 2) | ||||
|  | ||||
  if [[ $(uname -m) == "aarch64" ]]; then
    BASE_URL="https://github.com/conda-forge/miniforge/releases/latest/download"
    case "$MAJOR_PYTHON_VERSION" in
      3)
        CONDA_FILE="Miniforge3-Linux-aarch64.sh"
        ;;
      *)
        echo "Unsupported ANACONDA_PYTHON_VERSION: $ANACONDA_PYTHON_VERSION"
        exit 1
        ;;
    esac
  else
    case "$MAJOR_PYTHON_VERSION" in
      3)
        CONDA_FILE="Miniconda3-latest-Linux-x86_64.sh"
        ;;
      *)
        echo "Unsupported ANACONDA_PYTHON_VERSION: $ANACONDA_PYTHON_VERSION"
        exit 1
        ;;
    esac
  fi
|  | ||||
|   mkdir -p /opt/conda | ||||
|   chown jenkins:jenkins /opt/conda | ||||
|  | ||||
|   source "$(dirname "${BASH_SOURCE[0]}")/common_utils.sh" | ||||
|  | ||||
|   pushd /tmp | ||||
|   wget -q "${BASE_URL}/${CONDA_FILE}" | ||||
|   # NB: Manually invoke bash per https://github.com/conda/conda/issues/10431 | ||||
|   as_jenkins bash "${CONDA_FILE}" -b -f -p "/opt/conda" | ||||
|   popd | ||||
|  | ||||
|   # NB: Don't do this, rely on the rpath to get it right | ||||
|   #echo "/opt/conda/lib" > /etc/ld.so.conf.d/conda-python.conf | ||||
|   #ldconfig | ||||
|   sed -e 's|PATH="\(.*\)"|PATH="/opt/conda/bin:\1"|g' -i /etc/environment | ||||
|   export PATH="/opt/conda/bin:$PATH" | ||||
|  | ||||
|   # Ensure we run conda in a directory that jenkins has write access to | ||||
|   pushd /opt/conda | ||||
|  | ||||
|   # Prevent conda from updating to 4.14.0, which causes docker build failures | ||||
|   # See https://hud.pytorch.org/pytorch/pytorch/commit/754d7f05b6841e555cea5a4b2c505dd9e0baec1d | ||||
|   # Uncomment the below when resolved to track the latest conda update | ||||
|   # as_jenkins conda update -y -n base conda | ||||
|  | ||||
|   if [[ $(uname -m) == "aarch64" ]]; then | ||||
|     export SYSROOT_DEP="sysroot_linux-aarch64=2.17" | ||||
|   else | ||||
|     export SYSROOT_DEP="sysroot_linux-64=2.17" | ||||
|   fi | ||||
|  | ||||
|   # Install correct Python version | ||||
|   # Also ensure sysroot is using a modern GLIBC to match system compilers | ||||
|   as_jenkins conda create -n py_$ANACONDA_PYTHON_VERSION -y\ | ||||
|              python="$ANACONDA_PYTHON_VERSION" \ | ||||
|              ${SYSROOT_DEP} | ||||
|  | ||||
|   # libstdcxx from conda default channels are too old, we need GLIBCXX_3.4.30 | ||||
|   # which is provided in libstdcxx 12 and up. | ||||
|   conda_install libstdcxx-ng=12.3.0 -c conda-forge | ||||
|  | ||||
|   # Install PyTorch conda deps, as per https://github.com/pytorch/pytorch README | ||||
|   if [[ $(uname -m) == "aarch64" ]]; then | ||||
|     CONDA_COMMON_DEPS="astunparse pyyaml setuptools openblas==0.3.25=*openmp* ninja==1.11.1 scons==4.5.2" | ||||
|  | ||||
|     if [ "$ANACONDA_PYTHON_VERSION" = "3.8" ]; then | ||||
|       conda_install numpy=1.24.4 ${CONDA_COMMON_DEPS} | ||||
|     else | ||||
|       conda_install numpy=1.26.2 ${CONDA_COMMON_DEPS} | ||||
|     fi | ||||
|   else | ||||
|     CONDA_COMMON_DEPS="astunparse pyyaml mkl=2021.4.0 mkl-include=2021.4.0 setuptools" | ||||
|  | ||||
|     if [ "$ANACONDA_PYTHON_VERSION" = "3.11" ] || [ "$ANACONDA_PYTHON_VERSION" = "3.12" ]; then | ||||
|       conda_install numpy=1.26.0 ${CONDA_COMMON_DEPS} | ||||
|     else | ||||
|       conda_install numpy=1.21.2 ${CONDA_COMMON_DEPS} | ||||
|     fi | ||||
|   fi | ||||
|  | ||||
|   # Install llvm-8 as it is required to compile llvmlite-0.30.0 from source | ||||
|   # and libpython-static for torch deploy | ||||
|   conda_install llvmdev=8.0.0 "libpython-static=${ANACONDA_PYTHON_VERSION}" | ||||
|  | ||||
  # Use conda cmake in some cases. Conda cmake will be newer than our supported
  # min version (3.5 for xenial and 3.10 for bionic), so we only do it in the
  # builds that we know should use conda. Specifically, Ubuntu bionic and focal
  # cannot find conda mkl with stock cmake, so we need a cmake from conda.
|   if [ -n "${CONDA_CMAKE}" ]; then | ||||
|     conda_install cmake | ||||
|   fi | ||||
|  | ||||
|   # Magma package names are concatenation of CUDA major and minor ignoring revision | ||||
|   # I.e. magma-cuda102 package corresponds to CUDA_VERSION=10.2 and CUDA_VERSION=10.2.89 | ||||
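  # e.g. CUDA_VERSION=12.1.105 -> TMP=121.105 -> magma-cuda121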
|   if [ -n "$CUDA_VERSION" ]; then | ||||
|     conda_install magma-cuda$(TMP=${CUDA_VERSION/./};echo ${TMP%.*[0-9]}) -c pytorch | ||||
|   fi | ||||
|  | ||||
|   # Install some other packages, including those needed for Python test reporting | ||||
|   pip_install -r /opt/conda/requirements-ci.txt | ||||
|  | ||||
|   pip_install -U scikit-learn | ||||
|  | ||||
|   if [ -n "$DOCS" ]; then | ||||
|     apt-get update | ||||
|     apt-get -y install expect-dev | ||||
|  | ||||
|     # We are currently building docs with python 3.8 (min support version) | ||||
|     pip_install -r /opt/conda/requirements-docs.txt | ||||
|   fi | ||||
|  | ||||
|   popd | ||||
| fi | ||||
| @ -1,22 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| if [[ -n "${CUDNN_VERSION}" ]]; then | ||||
|     # cuDNN license: https://developer.nvidia.com/cudnn/license_agreement | ||||
|     mkdir tmp_cudnn | ||||
|     pushd tmp_cudnn | ||||
|     if [[ ${CUDA_VERSION:0:2} == "12" ]]; then | ||||
|         CUDNN_NAME="cudnn-linux-x86_64-9.1.0.70_cuda12-archive" | ||||
|     elif [[ ${CUDA_VERSION:0:2} == "11" ]]; then | ||||
|         CUDNN_NAME="cudnn-linux-x86_64-9.1.0.70_cuda11-archive" | ||||
|     else | ||||
|         print "Unsupported CUDA version ${CUDA_VERSION}" | ||||
|         exit 1 | ||||
|     fi | ||||
|     curl --retry 3 -OLs https://developer.download.nvidia.com/compute/cudnn/redist/cudnn/linux-x86_64/${CUDNN_NAME}.tar.xz | ||||
|     tar xf ${CUDNN_NAME}.tar.xz | ||||
|     cp -a ${CUDNN_NAME}/include/* /usr/local/cuda/include/ | ||||
|     cp -a ${CUDNN_NAME}/lib/* /usr/local/cuda/lib64/ | ||||
|     popd | ||||
|     rm -rf tmp_cudnn | ||||
|     ldconfig | ||||
| fi | ||||
| @ -1,26 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| set -ex | ||||
|  | ||||
| # cuSPARSELt license: https://docs.nvidia.com/cuda/cusparselt/license.html | ||||
| mkdir tmp_cusparselt && cd tmp_cusparselt | ||||
|  | ||||
| if [[ ${CUDA_VERSION:0:4} =~ ^12\.[1-4]$ ]]; then | ||||
|     arch_path='sbsa' | ||||
|     export TARGETARCH=${TARGETARCH:-$(uname -m)} | ||||
|     if [ ${TARGETARCH} = 'amd64' ] || [ "${TARGETARCH}" = 'x86_64' ]; then | ||||
|         arch_path='x86_64' | ||||
|     fi | ||||
|     CUSPARSELT_NAME="libcusparse_lt-linux-${arch_path}-0.5.2.1-archive" | ||||
|     curl --retry 3 -OLs https://developer.download.nvidia.com/compute/cusparselt/redist/libcusparse_lt/linux-${arch_path}/${CUSPARSELT_NAME}.tar.xz | ||||
| elif [[ ${CUDA_VERSION:0:4} == "11.8" ]]; then | ||||
|     CUSPARSELT_NAME="libcusparse_lt-linux-x86_64-0.4.0.7-archive" | ||||
|     curl --retry 3 -OLs https://developer.download.nvidia.com/compute/cusparselt/redist/libcusparse_lt/linux-x86_64/${CUSPARSELT_NAME}.tar.xz | ||||
else
    echo "Unsupported CUDA version ${CUDA_VERSION} for cuSPARSELt"
    exit 1
fi
|  | ||||
| tar xf ${CUSPARSELT_NAME}.tar.xz | ||||
| cp -a ${CUSPARSELT_NAME}/include/* /usr/local/cuda/include/ | ||||
| cp -a ${CUSPARSELT_NAME}/lib/* /usr/local/cuda/lib64/ | ||||
| cd .. | ||||
| rm -rf tmp_cusparselt | ||||
| ldconfig | ||||
| @ -1,38 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| set -ex | ||||
|  | ||||
| install_ubuntu() { | ||||
|   apt-get update | ||||
|  | ||||
|   # Cleanup | ||||
|   apt-get autoclean && apt-get clean | ||||
|   rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* | ||||
| } | ||||
|  | ||||
| install_centos() { | ||||
|   # Need EPEL for many packages we depend on. | ||||
|   # See http://fedoraproject.org/wiki/EPEL | ||||
|   yum --enablerepo=extras install -y epel-release | ||||
|  | ||||
|   # Cleanup | ||||
|   yum clean all | ||||
|   rm -rf /var/cache/yum | ||||
|   rm -rf /var/lib/yum/yumdb | ||||
|   rm -rf /var/lib/yum/history | ||||
| } | ||||
|  | ||||
| # Install base packages depending on the base OS | ||||
| ID=$(grep -oP '(?<=^ID=).+' /etc/os-release | tr -d '"') | ||||
| case "$ID" in | ||||
|   ubuntu) | ||||
|     install_ubuntu | ||||
|     ;; | ||||
|   centos) | ||||
|     install_centos | ||||
|     ;; | ||||
|   *) | ||||
|     echo "Unable to determine OS..." | ||||
|     exit 1 | ||||
|     ;; | ||||
| esac | ||||
| @ -1,10 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| set -ex | ||||
|  | ||||
| [ -n "$DEVTOOLSET_VERSION" ] | ||||
|  | ||||
| yum install -y centos-release-scl | ||||
| yum install -y devtoolset-$DEVTOOLSET_VERSION | ||||
|  | ||||
| echo "source scl_source enable devtoolset-$DEVTOOLSET_VERSION" > "/etc/profile.d/devtoolset-$DEVTOOLSET_VERSION.sh" | ||||
| @ -1,25 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| set -ex | ||||
|  | ||||
| if [ -n "$KATEX" ]; then | ||||
|   apt-get update | ||||
|   # Ignore error if gpg-agent doesn't exist (for Ubuntu 16.04) | ||||
|   apt-get install -y gpg-agent || : | ||||
|  | ||||
|   curl --retry 3 -sL https://deb.nodesource.com/setup_16.x | sudo -E bash - | ||||
|   sudo apt-get install -y nodejs | ||||
|  | ||||
|   curl --retry 3 -sS https://dl.yarnpkg.com/debian/pubkey.gpg | sudo apt-key add - | ||||
|   echo "deb https://dl.yarnpkg.com/debian/ stable main" | sudo tee /etc/apt/sources.list.d/yarn.list | ||||
|  | ||||
|   apt-get update | ||||
|   apt-get install -y --no-install-recommends yarn | ||||
|   yarn global add katex --prefix /usr/local | ||||
|  | ||||
|   sudo apt-get -y install doxygen | ||||
|  | ||||
|   apt-get autoclean && apt-get clean | ||||
|   rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* | ||||
|  | ||||
| fi | ||||
| @ -1,61 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| set -ex | ||||
|  | ||||
| source "$(dirname "${BASH_SOURCE[0]}")/common_utils.sh" | ||||
|  | ||||
| clone_executorch() { | ||||
|   EXECUTORCH_PINNED_COMMIT=$(get_pinned_commit executorch) | ||||
|  | ||||
  # Clone the ExecuTorch repository
|   git clone https://github.com/pytorch/executorch.git | ||||
|  | ||||
|   # and fetch the target commit | ||||
|   pushd executorch | ||||
|   git checkout "${EXECUTORCH_PINNED_COMMIT}" | ||||
|   git submodule update --init | ||||
|   popd | ||||
|  | ||||
|   chown -R jenkins executorch | ||||
| } | ||||
|  | ||||
| install_buck2() { | ||||
|   pushd executorch/.ci/docker | ||||
|  | ||||
|   BUCK2_VERSION=$(cat ci_commit_pins/buck2.txt) | ||||
|   source common/install_buck.sh | ||||
|  | ||||
|   popd | ||||
| } | ||||
|  | ||||
| install_conda_dependencies() { | ||||
|   pushd executorch/.ci/docker | ||||
|   # Install conda dependencies like flatbuffer | ||||
|   conda_install --file conda-env-ci.txt | ||||
|   popd | ||||
| } | ||||
|  | ||||
| install_pip_dependencies() { | ||||
|   pushd executorch/.ci/docker | ||||
|   # Install all Python dependencies | ||||
|   pip_install -r requirements-ci.txt | ||||
|   popd | ||||
| } | ||||
|  | ||||
| setup_executorch() { | ||||
|   pushd executorch | ||||
|   source .ci/scripts/utils.sh | ||||
|  | ||||
|   install_flatc_from_source | ||||
|   pip_install . | ||||
|  | ||||
  # Make sure that all the newly generated files are owned by Jenkins
|   chown -R jenkins . | ||||
|   popd | ||||
| } | ||||
|  | ||||
| clone_executorch | ||||
| install_buck2 | ||||
| install_conda_dependencies | ||||
| install_pip_dependencies | ||||
| setup_executorch | ||||
| @ -1,20 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| set -ex | ||||
|  | ||||
| if [ -n "$GCC_VERSION" ]; then | ||||
|  | ||||
|   # Need the official toolchain repo to get alternate packages | ||||
|   add-apt-repository ppa:ubuntu-toolchain-r/test | ||||
|   apt-get update | ||||
|   apt-get install -y g++-$GCC_VERSION | ||||
|   update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-"$GCC_VERSION" 50 | ||||
|   update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-"$GCC_VERSION" 50 | ||||
|   update-alternatives --install /usr/bin/gcov gcov /usr/bin/gcov-"$GCC_VERSION" 50 | ||||
|  | ||||
|  | ||||
|   # Cleanup package manager | ||||
|   apt-get autoclean && apt-get clean | ||||
|   rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* | ||||
|  | ||||
| fi | ||||
| @ -1,34 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| set -ex | ||||
|  | ||||
| [ -n "$GLIBC_VERSION" ] | ||||
| if [[ -n "$CENTOS_VERSION" ]]; then | ||||
|   [ -n "$DEVTOOLSET_VERSION" ] | ||||
| fi | ||||
|  | ||||
| yum install -y wget sed | ||||
|  | ||||
| mkdir -p /packages && cd /packages | ||||
| wget -q http://ftp.gnu.org/gnu/glibc/glibc-$GLIBC_VERSION.tar.gz | ||||
| tar xzf glibc-$GLIBC_VERSION.tar.gz | ||||
| if [[ "$GLIBC_VERSION" == "2.26" ]]; then | ||||
|   cd glibc-$GLIBC_VERSION | ||||
|   sed -i 's/$name ne "nss_test1"/$name ne "nss_test1" \&\& $name ne "nss_test2"/' scripts/test-installation.pl | ||||
|   cd .. | ||||
| fi | ||||
| mkdir -p glibc-$GLIBC_VERSION-build && cd glibc-$GLIBC_VERSION-build | ||||
|  | ||||
| if [[ -n "$CENTOS_VERSION" ]]; then | ||||
|   export PATH=/opt/rh/devtoolset-$DEVTOOLSET_VERSION/root/usr/bin:$PATH | ||||
| fi | ||||
|  | ||||
| ../glibc-$GLIBC_VERSION/configure --prefix=/usr CFLAGS='-Wno-stringop-truncation -Wno-format-overflow -Wno-restrict -Wno-format-truncation -g -O2' | ||||
| make -j$(nproc) | ||||
| make install | ||||
|  | ||||
| # Cleanup | ||||
| rm -rf /packages | ||||
| rm -rf /var/cache/yum/* | ||||
| rm -rf /var/lib/rpm/__db.* | ||||
| yum clean all | ||||
| @ -1,26 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| set -ex | ||||
|  | ||||
| source "$(dirname "${BASH_SOURCE[0]}")/common_utils.sh" | ||||
|  | ||||
| function install_huggingface() { | ||||
  local commit
  commit=$(get_pinned_commit huggingface)
|   pip_install pandas==2.0.3 | ||||
|   pip_install "git+https://github.com/huggingface/transformers@${commit}" | ||||
| } | ||||
|  | ||||
| function install_timm() { | ||||
|   local commit | ||||
|   commit=$(get_pinned_commit timm) | ||||
|   pip_install pandas==2.0.3 | ||||
|   pip_install "git+https://github.com/huggingface/pytorch-image-models@${commit}" | ||||
|   # Clean up | ||||
|   conda_run pip uninstall -y cmake torch torchvision triton | ||||
| } | ||||
|  | ||||
| # Pango is needed for weasyprint which is needed for doctr | ||||
| conda_install pango | ||||
| install_huggingface | ||||
| install_timm | ||||
| @ -1,6 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| set -ex | ||||
|  | ||||
| mkdir -p /usr/local/include | ||||
| cp jni.h /usr/local/include | ||||
| @ -1,8 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| set -ex | ||||
|  | ||||
| git clone --branch v1.15 https://github.com/linux-test-project/lcov.git | ||||
| pushd lcov | ||||
| sudo make install   # will be installed in /usr/local/bin/lcov | ||||
| popd | ||||
| @ -1,29 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| set -ex | ||||
|  | ||||
| source "$(dirname "${BASH_SOURCE[0]}")/common_utils.sh" | ||||
|  | ||||
| if [ -n "${UBUNTU_VERSION}" ]; then | ||||
|   apt update | ||||
|   apt-get install -y clang doxygen git graphviz nodejs npm libtinfo5 | ||||
| fi | ||||
|  | ||||
| # Do shallow clone of PyTorch so that we can init lintrunner in Docker build context | ||||
| git clone https://github.com/pytorch/pytorch.git --depth 1 | ||||
| chown -R jenkins pytorch | ||||
|  | ||||
| pushd pytorch | ||||
| # Install all linter dependencies | ||||
| pip_install -r requirements.txt | ||||
| conda_run lintrunner init | ||||
|  | ||||
| # Cache .lintbin directory as part of the Docker image | ||||
| cp -r .lintbin /tmp | ||||
| popd | ||||
|  | ||||
# Node dependencies required by the toc linter job
| npm install -g markdown-toc | ||||
|  | ||||
| # Cleaning up | ||||
| rm -rf pytorch | ||||
| @ -1,13 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| set -ex | ||||
|  | ||||
| [ -n "$NINJA_VERSION" ] | ||||
|  | ||||
| url="https://github.com/ninja-build/ninja/releases/download/v${NINJA_VERSION}/ninja-linux.zip" | ||||
|  | ||||
| pushd /tmp | ||||
| wget --no-verbose --output-document=ninja-linux.zip "$url" | ||||
| unzip ninja-linux.zip -d /usr/local/bin | ||||
| rm -f ninja-linux.zip | ||||
| popd | ||||
| @ -1,51 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| set -ex | ||||
|  | ||||
| source "$(dirname "${BASH_SOURCE[0]}")/common_utils.sh" | ||||
|  | ||||
| retry () { | ||||
|     "$@" || (sleep 10 && "$@") || (sleep 20 && "$@") || (sleep 40 && "$@") | ||||
| } | ||||
|  | ||||
| # A bunch of custom pip dependencies for ONNX | ||||
| pip_install \ | ||||
|   beartype==0.15.0 \ | ||||
|   filelock==3.9.0 \ | ||||
|   flatbuffers==2.0 \ | ||||
|   mock==5.0.1 \ | ||||
|   ninja==1.10.2 \ | ||||
|   networkx==2.0 \ | ||||
|   numpy==1.24.2 | ||||
|  | ||||
| # ONNXRuntime should be installed before installing | ||||
| # onnx-weekly. Otherwise, onnx-weekly could be | ||||
| # overwritten by onnx. | ||||
| pip_install \ | ||||
|   parameterized==0.8.1 \ | ||||
|   pytest-cov==4.0.0 \ | ||||
|   pytest-subtests==0.10.0 \ | ||||
|   tabulate==0.9.0 \ | ||||
|   transformers==4.36.2 | ||||
|  | ||||
| pip_install coloredlogs packaging | ||||
|  | ||||
| pip_install onnxruntime==1.18 | ||||
| pip_install onnx==1.16.0 | ||||
| # pip_install "onnxscript@git+https://github.com/microsoft/onnxscript@3e869ef8ccf19b5ebd21c10d3e9c267c9a9fa729" --no-deps | ||||
| pip_install onnxscript==0.1.0.dev20240523 --no-deps | ||||
|  | ||||
| # Cache the transformers model to be used later by ONNX tests. We need to run the transformers | ||||
| # package to download the model. By default, the model is cached at ~/.cache/huggingface/hub/ | ||||
| IMPORT_SCRIPT_FILENAME="/tmp/onnx_import_script.py" | ||||
| as_jenkins echo 'import transformers; transformers.AutoModel.from_pretrained("sshleifer/tiny-gpt2"); transformers.AutoTokenizer.from_pretrained("sshleifer/tiny-gpt2"); transformers.AutoModelForSpeechSeq2Seq.from_pretrained("openai/whisper-large-v3");' > "${IMPORT_SCRIPT_FILENAME}" | ||||
|  | ||||
| # Need a PyTorch version for transformers to work | ||||
| pip_install --pre torch --index-url https://download.pytorch.org/whl/nightly/cpu | ||||
| # Very weird quoting behavior here https://github.com/conda/conda/issues/10972, | ||||
| # so echo the command to a file and run the file instead | ||||
| conda_run python "${IMPORT_SCRIPT_FILENAME}" | ||||
|  | ||||
| # Cleaning up | ||||
| conda_run pip uninstall -y torch | ||||
| rm "${IMPORT_SCRIPT_FILENAME}" || true | ||||
| @ -1,10 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| sudo apt-get update | ||||
| # also install ssh to avoid error of: | ||||
| # -------------------------------------------------------------------------- | ||||
| # The value of the MCA parameter "plm_rsh_agent" was set to a path | ||||
| # that could not be found: | ||||
| #   plm_rsh_agent: ssh : rsh | ||||
| sudo apt-get install -y ssh | ||||
| sudo apt-get install -y --allow-downgrades --allow-change-held-packages openmpi-bin libopenmpi-dev | ||||
| @ -1,17 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| set -ex | ||||
|  | ||||
| OPENSSL=openssl-1.1.1k | ||||
|  | ||||
| wget -q -O "${OPENSSL}.tar.gz" "https://ossci-linux.s3.amazonaws.com/${OPENSSL}.tar.gz" | ||||
| tar xf "${OPENSSL}.tar.gz" | ||||
| cd "${OPENSSL}" | ||||
| ./config --prefix=/opt/openssl -d '-Wl,--enable-new-dtags,-rpath,$(LIBRPATH)' | ||||
# NOTE: openssl's install target errors out when run with the -j option, so build in parallel but install serially
NPROC=$(($(nproc) - 2))
make -j${NPROC}; make install_sw
| # Link the ssl libraries to the /usr/lib folder. | ||||
| sudo ln -s /opt/openssl/lib/lib* /usr/lib | ||||
| cd .. | ||||
| rm -rf "${OPENSSL}" | ||||
| @ -1,19 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| set -ex | ||||
|  | ||||
| pb_dir="/usr/temp_pb_install_dir" | ||||
| mkdir -p $pb_dir | ||||
|  | ||||
| # On the nvidia/cuda:9-cudnn7-devel-centos7 image we need this symlink or | ||||
| # else it will fail with | ||||
| #   g++: error: ./../lib64/crti.o: No such file or directory | ||||
| ln -s /usr/lib64 "$pb_dir/lib64" | ||||
|  | ||||
| curl -LO "https://github.com/protocolbuffers/protobuf/releases/download/v3.17.3/protobuf-all-3.17.3.tar.gz" --retry 3 | ||||
|  | ||||
| tar -xvz --no-same-owner -C "$pb_dir" --strip-components 1 -f protobuf-all-3.17.3.tar.gz | ||||
NPROC=$(($(nproc) - 2))
pushd "$pb_dir" && ./configure && make -j${NPROC} && make -j${NPROC} check && sudo make -j${NPROC} install && sudo ldconfig
| popd | ||||
| rm -rf $pb_dir | ||||
| @ -1,148 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| set -ex | ||||
|  | ||||
| ver() { | ||||
|     printf "%3d%03d%03d%03d" $(echo "$1" | tr '.' ' '); | ||||
| } | ||||
|  | ||||
| install_ubuntu() { | ||||
|     apt-get update | ||||
|     if [[ $UBUNTU_VERSION == 18.04 ]]; then | ||||
|       # gpg-agent is not available by default on 18.04 | ||||
|       apt-get install -y --no-install-recommends gpg-agent | ||||
|     fi | ||||
|     if [[ $UBUNTU_VERSION == 20.04 ]]; then | ||||
|       # gpg-agent is not available by default on 20.04 | ||||
|       apt-get install -y --no-install-recommends gpg-agent | ||||
|     fi | ||||
|     apt-get install -y kmod | ||||
|     apt-get install -y wget | ||||
|  | ||||
|     # Need the libc++1 and libc++abi1 libraries to allow torch._C to load at runtime | ||||
|     apt-get install -y libc++1 | ||||
|     apt-get install -y libc++abi1 | ||||
|  | ||||
|     # Add amdgpu repository | ||||
    UBUNTU_VERSION_NAME=$(grep UBUNTU_CODENAME /etc/os-release | awk -F= '{print $2}')
|     echo "deb [arch=amd64] https://repo.radeon.com/amdgpu/${ROCM_VERSION}/ubuntu ${UBUNTU_VERSION_NAME} main" > /etc/apt/sources.list.d/amdgpu.list | ||||
|  | ||||
|     # Add rocm repository | ||||
|     wget -qO - http://repo.radeon.com/rocm/rocm.gpg.key | apt-key add - | ||||
|     local rocm_baseurl="http://repo.radeon.com/rocm/apt/${ROCM_VERSION}" | ||||
|     echo "deb [arch=amd64] ${rocm_baseurl} ${UBUNTU_VERSION_NAME} main" > /etc/apt/sources.list.d/rocm.list | ||||
|     apt-get update --allow-insecure-repositories | ||||
|  | ||||
|     DEBIAN_FRONTEND=noninteractive apt-get install -y --allow-unauthenticated \ | ||||
|                    rocm-dev \ | ||||
|                    rocm-utils \ | ||||
|                    rocm-libs \ | ||||
|                    rccl \ | ||||
|                    rocprofiler-dev \ | ||||
|                    roctracer-dev \ | ||||
|                    amd-smi-lib | ||||
|  | ||||
|     if [[ $(ver $ROCM_VERSION) -ge $(ver 6.1) ]]; then | ||||
|         DEBIAN_FRONTEND=noninteractive apt-get install -y --allow-unauthenticated rocm-llvm-dev | ||||
|     fi | ||||
|  | ||||
|     # precompiled miopen kernels added in ROCm 3.5, renamed in ROCm 5.5 | ||||
|     # search for all unversioned packages | ||||
    # if the search fails, set -e would abort this script; use `|| true` to tolerate an empty result
|     MIOPENHIPGFX=$(apt-cache search --names-only miopen-hip-gfx | awk '{print $1}' | grep -F -v . || true) | ||||
|     if [[ "x${MIOPENHIPGFX}" = x ]]; then | ||||
|       echo "miopen-hip-gfx package not available" && exit 1 | ||||
|     else | ||||
|       DEBIAN_FRONTEND=noninteractive apt-get install -y --allow-unauthenticated ${MIOPENHIPGFX} | ||||
|     fi | ||||
|  | ||||
|     # ROCm 6.0 had a regression where journal_mode was enabled on the kdb files resulting in permission errors at runtime | ||||
|     for kdb in /opt/rocm/share/miopen/db/*.kdb | ||||
|     do | ||||
        sqlite3 $kdb "PRAGMA journal_mode=off; VACUUM;"
|     done | ||||
|  | ||||
|     # Cleanup | ||||
|     apt-get autoclean && apt-get clean | ||||
|     rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* | ||||
| } | ||||
|  | ||||
| install_centos() { | ||||
|  | ||||
|   yum update -y | ||||
|   yum install -y kmod | ||||
|   yum install -y wget | ||||
|   yum install -y openblas-devel | ||||
|  | ||||
|   yum install -y epel-release | ||||
  yum install -y dkms "kernel-headers-$(uname -r)" "kernel-devel-$(uname -r)"
|  | ||||
|   # Add amdgpu repository | ||||
|   local amdgpu_baseurl | ||||
|   if [[ $OS_VERSION == 9 ]]; then | ||||
|       amdgpu_baseurl="https://repo.radeon.com/amdgpu/${ROCM_VERSION}/rhel/9.0/main/x86_64" | ||||
|   else | ||||
|       amdgpu_baseurl="https://repo.radeon.com/amdgpu/${ROCM_VERSION}/rhel/7.9/main/x86_64" | ||||
|   fi | ||||
|   echo "[AMDGPU]" > /etc/yum.repos.d/amdgpu.repo | ||||
|   echo "name=AMDGPU" >> /etc/yum.repos.d/amdgpu.repo | ||||
|   echo "baseurl=${amdgpu_baseurl}" >> /etc/yum.repos.d/amdgpu.repo | ||||
|   echo "enabled=1" >> /etc/yum.repos.d/amdgpu.repo | ||||
|   echo "gpgcheck=1" >> /etc/yum.repos.d/amdgpu.repo | ||||
|   echo "gpgkey=http://repo.radeon.com/rocm/rocm.gpg.key" >> /etc/yum.repos.d/amdgpu.repo | ||||
|  | ||||
|   local rocm_baseurl="http://repo.radeon.com/rocm/yum/${ROCM_VERSION}" | ||||
|   echo "[ROCm]" > /etc/yum.repos.d/rocm.repo | ||||
|   echo "name=ROCm" >> /etc/yum.repos.d/rocm.repo | ||||
|   echo "baseurl=${rocm_baseurl}" >> /etc/yum.repos.d/rocm.repo | ||||
|   echo "enabled=1" >> /etc/yum.repos.d/rocm.repo | ||||
|   echo "gpgcheck=1" >> /etc/yum.repos.d/rocm.repo | ||||
|   echo "gpgkey=http://repo.radeon.com/rocm/rocm.gpg.key" >> /etc/yum.repos.d/rocm.repo | ||||
|  | ||||
|   yum update -y | ||||
|  | ||||
|   yum install -y \ | ||||
|                    rocm-dev \ | ||||
|                    rocm-utils \ | ||||
|                    rocm-libs \ | ||||
|                    rccl \ | ||||
|                    rocprofiler-dev \ | ||||
|                    roctracer-dev \ | ||||
|                    amd-smi-lib | ||||
|  | ||||
|   # precompiled miopen kernels; search for all unversioned packages | ||||
  # if the search fails, set -e would abort this script; use `|| true` to tolerate an empty result
|   MIOPENHIPGFX=$(yum -q search miopen-hip-gfx | grep miopen-hip-gfx | awk '{print $1}'| grep -F kdb. || true) | ||||
|   if [[ "x${MIOPENHIPGFX}" = x ]]; then | ||||
|     echo "miopen-hip-gfx package not available" && exit 1 | ||||
|   else | ||||
|     yum install -y ${MIOPENHIPGFX} | ||||
|   fi | ||||
|  | ||||
|   # ROCm 6.0 had a regression where journal_mode was enabled on the kdb files resulting in permission errors at runtime | ||||
|   for kdb in /opt/rocm/share/miopen/db/*.kdb | ||||
|   do | ||||
      sqlite3 $kdb "PRAGMA journal_mode=off; VACUUM;"
|   done | ||||
|  | ||||
|   # Cleanup | ||||
|   yum clean all | ||||
|   rm -rf /var/cache/yum | ||||
|   rm -rf /var/lib/yum/yumdb | ||||
|   rm -rf /var/lib/yum/history | ||||
| } | ||||
|  | ||||
| # Install Python packages depending on the base OS | ||||
| ID=$(grep -oP '(?<=^ID=).+' /etc/os-release | tr -d '"') | ||||
| case "$ID" in | ||||
|   ubuntu) | ||||
|     install_ubuntu | ||||
|     ;; | ||||
|   centos) | ||||
|     install_centos | ||||
|     ;; | ||||
|   *) | ||||
|     echo "Unable to determine OS..." | ||||
|     exit 1 | ||||
|     ;; | ||||
| esac | ||||
| @ -1,31 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| set -ex | ||||
|  | ||||
| # "install" hipMAGMA into /opt/rocm/magma by copying after build | ||||
| git clone https://bitbucket.org/icl/magma.git | ||||
| pushd magma | ||||
|  | ||||
| # Version 2.7.2 + ROCm related updates | ||||
| git checkout a1625ff4d9bc362906bd01f805dbbe12612953f6 | ||||
|  | ||||
| cp make.inc-examples/make.inc.hip-gcc-mkl make.inc | ||||
| echo 'LIBDIR += -L$(MKLROOT)/lib' >> make.inc | ||||
| echo 'LIB += -Wl,--enable-new-dtags -Wl,--rpath,/opt/rocm/lib -Wl,--rpath,$(MKLROOT)/lib -Wl,--rpath,/opt/rocm/magma/lib' >> make.inc | ||||
| echo 'DEVCCFLAGS += --gpu-max-threads-per-block=256' >> make.inc | ||||
| export PATH="${PATH}:/opt/rocm/bin" | ||||
| if [[ -n "$PYTORCH_ROCM_ARCH" ]]; then | ||||
  amdgpu_targets=$(echo "$PYTORCH_ROCM_ARCH" | sed 's/;/ /g')
else
  amdgpu_targets=$(rocm_agent_enumerator | grep -v gfx000 | sort -u | xargs)
| fi | ||||
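# e.g. PYTORCH_ROCM_ARCH="gfx906;gfx90a" -> amdgpu_targets="gfx906 gfx90a"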
| for arch in $amdgpu_targets; do | ||||
|   echo "DEVCCFLAGS += --offload-arch=$arch" >> make.inc | ||||
| done | ||||
# hipcc with the openmp flag may cause isnan() on __device__ not to be found; depending on context, the compiler may attempt to match it with the host definition
| sed -i 's/^FOPENMP/#FOPENMP/g' make.inc | ||||
| make -f make.gen.hipMAGMA -j $(nproc) | ||||
| LANG=C.UTF-8 make lib/libmagma.so -j $(nproc) MKLROOT=/opt/conda/envs/py_$ANACONDA_PYTHON_VERSION | ||||
| make testing/testing_dgemm -j $(nproc) MKLROOT=/opt/conda/envs/py_$ANACONDA_PYTHON_VERSION | ||||
| popd | ||||
| mv magma /opt/rocm | ||||
| @ -1,24 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| set -ex | ||||
|  | ||||
| [ -n "${SWIFTSHADER}" ] | ||||
|  | ||||
retry () {
    "$@" || (sleep 1 && "$@") || (sleep 2 && "$@") || (sleep 4 && "$@") || (sleep 8 && "$@")
}
|  | ||||
| _https_amazon_aws=https://ossci-android.s3.amazonaws.com | ||||
|  | ||||
| # SwiftShader | ||||
| _swiftshader_dir=/var/lib/jenkins/swiftshader | ||||
| _swiftshader_file_targz=swiftshader-abe07b943-prebuilt.tar.gz | ||||
| mkdir -p $_swiftshader_dir | ||||
| _tmp_swiftshader_targz="/tmp/${_swiftshader_file_targz}" | ||||
|  | ||||
| curl --silent --show-error --location --fail --retry 3 \ | ||||
|   --output "${_tmp_swiftshader_targz}" "$_https_amazon_aws/${_swiftshader_file_targz}" | ||||
|  | ||||
| tar -C "${_swiftshader_dir}" -xzf "${_tmp_swiftshader_targz}" | ||||
|  | ||||
| export VK_ICD_FILENAMES="${_swiftshader_dir}/build/Linux/vk_swiftshader_icd.json" | ||||
| @ -1,72 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| set -ex | ||||
|  | ||||
| source "$(dirname "${BASH_SOURCE[0]}")/common_utils.sh" | ||||
|  | ||||
| get_conda_version() { | ||||
|   as_jenkins conda list -n py_$ANACONDA_PYTHON_VERSION | grep -w $* | head -n 1 | awk '{print $2}' | ||||
| } | ||||
|  | ||||
| conda_reinstall() { | ||||
|   as_jenkins conda install -q -n py_$ANACONDA_PYTHON_VERSION -y --force-reinstall $* | ||||
| } | ||||
|  | ||||
| if [ -n "${ROCM_VERSION}" ]; then | ||||
|   TRITON_REPO="https://github.com/openai/triton" | ||||
|   TRITON_TEXT_FILE="triton-rocm" | ||||
| elif [ -n "${XPU_VERSION}" ]; then | ||||
|   TRITON_REPO="https://github.com/intel/intel-xpu-backend-for-triton" | ||||
|   TRITON_TEXT_FILE="triton-xpu" | ||||
| else | ||||
|   TRITON_REPO="https://github.com/openai/triton" | ||||
|   TRITON_TEXT_FILE="triton" | ||||
| fi | ||||
|  | ||||
| # The logic here is copied from .ci/pytorch/common_utils.sh | ||||
| TRITON_PINNED_COMMIT=$(get_pinned_commit ${TRITON_TEXT_FILE}) | ||||
|  | ||||
| if [ -n "${UBUNTU_VERSION}" ];then | ||||
|     apt update | ||||
|     apt-get install -y gpg-agent | ||||
| fi | ||||
|  | ||||
| if [ -n "${CONDA_CMAKE}" ]; then | ||||
|   # Keep the current cmake and numpy version here, so we can reinstall them later | ||||
|   CMAKE_VERSION=$(get_conda_version cmake) | ||||
|   NUMPY_VERSION=$(get_conda_version numpy) | ||||
| fi | ||||
|  | ||||
| if [ -z "${MAX_JOBS}" ]; then | ||||
|     export MAX_JOBS=$(nproc) | ||||
| fi | ||||
|  | ||||
| if [ -n "${UBUNTU_VERSION}" ] && [ -n "${GCC_VERSION}" ] && [[ "${GCC_VERSION}" == "7" ]]; then | ||||
|   # Triton needs at least gcc-9 to build | ||||
|   apt-get install -y g++-9 | ||||
|  | ||||
|   CXX=g++-9 pip_install "git+${TRITON_REPO}@${TRITON_PINNED_COMMIT}#subdirectory=python" | ||||
| elif [ -n "${UBUNTU_VERSION}" ] && [ -n "${CLANG_VERSION}" ]; then | ||||
|   # Triton needs <filesystem> which surprisingly is not available with clang-9 toolchain | ||||
|   add-apt-repository -y ppa:ubuntu-toolchain-r/test | ||||
|   apt-get install -y g++-9 | ||||
|  | ||||
|   CXX=g++-9 pip_install "git+${TRITON_REPO}@${TRITON_PINNED_COMMIT}#subdirectory=python" | ||||
| else | ||||
|   pip_install "git+${TRITON_REPO}@${TRITON_PINNED_COMMIT}#subdirectory=python" | ||||
| fi | ||||
|  | ||||
| if [ -n "${CONDA_CMAKE}" ]; then | ||||
  # TODO: This is to make sure that the same cmake and numpy versions from the
  # conda install script are used. Without this step, the newer cmake version
  # (3.25.2) downloaded by the triton build step via pip will fail to detect
  # conda MKL. Once that issue is fixed, this can be removed.
  #
  # The correct numpy version also needs to be set here because conda claims
  # that it causes an inconsistent environment. Without this, conda will
  # attempt to install the latest numpy version, which fails ASAN tests with
  # the following import error: Numba needs NumPy 1.20 or less.
|   conda_reinstall cmake="${CMAKE_VERSION}" | ||||
|   # Note that we install numpy with pip as conda might not have the version we want | ||||
|   pip_install --force-reinstall numpy=="${NUMPY_VERSION}" | ||||
| fi | ||||
| @ -1,53 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| set -ex | ||||
|  | ||||
| if [[ -d "/usr/local/cuda/" ]];  then | ||||
|   with_cuda=/usr/local/cuda/ | ||||
| else | ||||
|   with_cuda=no | ||||
| fi | ||||
|  | ||||
| function install_ucx() { | ||||
|   set -ex | ||||
|   git clone --recursive https://github.com/openucx/ucx.git | ||||
|   pushd ucx | ||||
|   git checkout ${UCX_COMMIT} | ||||
|   git submodule update --init --recursive | ||||
|  | ||||
|   ./autogen.sh | ||||
|   ./configure --prefix=$UCX_HOME      \ | ||||
|       --enable-mt                     \ | ||||
|       --with-cuda=$with_cuda          \ | ||||
|       --enable-profiling              \ | ||||
|       --enable-stats | ||||
|   time make -j | ||||
|   sudo make install | ||||
|  | ||||
|   popd | ||||
|   rm -rf ucx | ||||
| } | ||||
|  | ||||
| function install_ucc() { | ||||
|   set -ex | ||||
|   git clone --recursive https://github.com/openucx/ucc.git | ||||
|   pushd ucc | ||||
|   git checkout ${UCC_COMMIT} | ||||
|   git submodule update --init --recursive | ||||
|  | ||||
|   ./autogen.sh | ||||
|   # We only run distributed tests on Tesla M60 and A10G | ||||
|   NVCC_GENCODE="-gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_86,code=compute_86" | ||||
|   ./configure --prefix=$UCC_HOME          \ | ||||
|     --with-ucx=$UCX_HOME                  \ | ||||
|     --with-cuda=$with_cuda                \ | ||||
|     --with-nvcc-gencode="${NVCC_GENCODE}" | ||||
|   time make -j | ||||
|   sudo make install | ||||
|  | ||||
|   popd | ||||
|   rm -rf ucc | ||||
| } | ||||
|  | ||||
| install_ucx | ||||
| install_ucc | ||||
| @ -1,33 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| set -ex | ||||
|  | ||||
| # Mirror jenkins user in container | ||||
# The jenkins user should have the same user-id as ec2-user
| echo "jenkins:x:1000:1000::/var/lib/jenkins:" >> /etc/passwd | ||||
| echo "jenkins:x:1000:" >> /etc/group | ||||
| # Needed on focal or newer | ||||
| echo "jenkins:*:19110:0:99999:7:::" >>/etc/shadow | ||||
|  | ||||
| # Create $HOME | ||||
| mkdir -p /var/lib/jenkins | ||||
| chown jenkins:jenkins /var/lib/jenkins | ||||
| mkdir -p /var/lib/jenkins/.ccache | ||||
| chown jenkins:jenkins /var/lib/jenkins/.ccache | ||||
|  | ||||
| # Allow writing to /usr/local (for make install) | ||||
| chown jenkins:jenkins /usr/local | ||||
|  | ||||
| # Allow sudo | ||||
| # TODO: Maybe we shouldn't | ||||
| echo 'jenkins ALL=(ALL) NOPASSWD:ALL' > /etc/sudoers.d/jenkins | ||||
|  | ||||
| # Work around bug where devtoolset replaces sudo and breaks it. | ||||
| if [ -n "$DEVTOOLSET_VERSION" ]; then | ||||
|   SUDO=/bin/sudo | ||||
| else | ||||
|   SUDO=sudo | ||||
| fi | ||||
|  | ||||
| # Test that sudo works | ||||
| $SUDO -u jenkins $SUDO -v | ||||
| @ -1,46 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| set -ex | ||||
|  | ||||
| install_ubuntu() { | ||||
|   apt-get update | ||||
|   apt-get install -y --no-install-recommends \ | ||||
|           libopencv-dev | ||||
|  | ||||
|   # Cleanup | ||||
|   apt-get autoclean && apt-get clean | ||||
|   rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* | ||||
| } | ||||
|  | ||||
| install_centos() { | ||||
|   # Need EPEL for many packages we depend on. | ||||
|   # See http://fedoraproject.org/wiki/EPEL | ||||
|   yum --enablerepo=extras install -y epel-release | ||||
|  | ||||
|   yum install -y \ | ||||
|       opencv-devel | ||||
|  | ||||
|   # Cleanup | ||||
|   yum clean all | ||||
|   rm -rf /var/cache/yum | ||||
|   rm -rf /var/lib/yum/yumdb | ||||
|   rm -rf /var/lib/yum/history | ||||
| } | ||||
|  | ||||
| # Install base packages depending on the base OS | ||||
| ID=$(grep -oP '(?<=^ID=).+' /etc/os-release | tr -d '"') | ||||
| case "$ID" in | ||||
|   ubuntu) | ||||
|     install_ubuntu | ||||
|     ;; | ||||
|   centos) | ||||
|     install_centos | ||||
|     ;; | ||||
|   *) | ||||
|     echo "Unable to determine OS..." | ||||
|     exit 1 | ||||
|     ;; | ||||
| esac | ||||
|  | ||||
| # Cache vision models used by the test | ||||
| source "$(dirname "${BASH_SOURCE[0]}")/cache_vision_models.sh" | ||||
| @ -1,24 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| set -ex | ||||
|  | ||||
| [ -n "${VULKAN_SDK_VERSION}" ] | ||||
|  | ||||
retry () {
    "$@" || (sleep 1 && "$@") || (sleep 2 && "$@") || (sleep 4 && "$@") || (sleep 8 && "$@")
}
|  | ||||
| _vulkansdk_dir=/var/lib/jenkins/vulkansdk | ||||
| _tmp_vulkansdk_targz=/tmp/vulkansdk.tar.gz | ||||
|  | ||||
| curl \ | ||||
|   --silent \ | ||||
|   --show-error \ | ||||
|   --location \ | ||||
|   --fail \ | ||||
|   --retry 3 \ | ||||
|   --output "${_tmp_vulkansdk_targz}" "https://ossci-android.s3.amazonaws.com/vulkansdk-linux-x86_64-${VULKAN_SDK_VERSION}.tar.gz" | ||||
|  | ||||
| mkdir -p "${_vulkansdk_dir}" | ||||
| tar -C "${_vulkansdk_dir}" -xzf "${_tmp_vulkansdk_targz}" --strip-components 1 | ||||
| rm -rf "${_tmp_vulkansdk_targz}" | ||||
| @ -1,114 +0,0 @@ | ||||
| #!/bin/bash | ||||
| set -xe | ||||
|  | ||||
|  | ||||
# Intel® software for general-purpose GPU capabilities.
| # Refer to https://www.intel.com/content/www/us/en/developer/articles/tool/pytorch-prerequisites-for-intel-gpus.html | ||||
|  | ||||
# Users should update to the latest version as it becomes available.
|  | ||||
| function install_ubuntu() { | ||||
|     apt-get update -y | ||||
|     apt-get install -y gpg-agent wget | ||||
|  | ||||
|     # Set up the repository. To do this, download the key to the system keyring | ||||
|     wget -qO - https://repositories.intel.com/gpu/intel-graphics.key \ | ||||
|         | gpg --dearmor --output /usr/share/keyrings/intel-graphics.gpg | ||||
|     wget -qO - https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB \ | ||||
|         | gpg --dearmor --output /usr/share/keyrings/intel-for-pytorch-gpu-dev-keyring.gpg | ||||
|  | ||||
|     # Add the signed entry to APT sources and configure the APT client to use the Intel repository | ||||
|     echo "deb [arch=amd64 signed-by=/usr/share/keyrings/intel-graphics.gpg] \ | ||||
|         https://repositories.intel.com/gpu/ubuntu jammy/lts/2350 unified" \ | ||||
|         | tee /etc/apt/sources.list.d/intel-gpu-jammy.list | ||||
|     echo "deb [signed-by=/usr/share/keyrings/intel-for-pytorch-gpu-dev-keyring.gpg] \ | ||||
|         https://apt.repos.intel.com/intel-for-pytorch-gpu-dev all main" \ | ||||
|         | tee /etc/apt/sources.list.d/intel-for-pytorch-gpu-dev.list | ||||
|  | ||||
|     # Update the packages list and repository index | ||||
|     apt-get update | ||||
|  | ||||
|     # The xpu-smi packages | ||||
|     apt-get install -y flex bison xpu-smi | ||||
|     # Compute and Media Runtimes | ||||
|     apt-get install -y \ | ||||
|         intel-opencl-icd intel-level-zero-gpu level-zero \ | ||||
|         intel-media-va-driver-non-free libmfx1 libmfxgen1 libvpl2 \ | ||||
|         libegl-mesa0 libegl1-mesa libegl1-mesa-dev libgbm1 libgl1-mesa-dev libgl1-mesa-dri \ | ||||
|         libglapi-mesa libgles2-mesa-dev libglx-mesa0 libigdgmm12 libxatracker2 mesa-va-drivers \ | ||||
|         mesa-vdpau-drivers mesa-vulkan-drivers va-driver-all vainfo hwinfo clinfo | ||||
|     # Development Packages | ||||
|     apt-get install -y libigc-dev intel-igc-cm libigdfcl-dev libigfxcmrt-dev level-zero-dev | ||||
|     # Install Intel Support Packages | ||||
|     if [ -n "$XPU_VERSION" ]; then | ||||
|         apt-get install -y intel-for-pytorch-gpu-dev-${XPU_VERSION} | ||||
|     else | ||||
|         apt-get install -y intel-for-pytorch-gpu-dev | ||||
|     fi | ||||
|  | ||||
|     # Cleanup | ||||
|     apt-get autoclean && apt-get clean | ||||
|     rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* | ||||
| } | ||||
|  | ||||
| function install_centos() { | ||||
|     dnf install -y 'dnf-command(config-manager)' | ||||
|     dnf config-manager --add-repo \ | ||||
|         https://repositories.intel.com/gpu/rhel/8.6/production/2328/unified/intel-gpu-8.6.repo | ||||
|     # To add the EPEL repository needed for DKMS | ||||
|     dnf -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm | ||||
|         # https://dl.fedoraproject.org/pub/epel/epel-release-latest-9.noarch.rpm | ||||
|  | ||||
|     # Create the YUM repository file in the /tmp directory as a normal user | ||||
|     tee > /tmp/oneAPI.repo << EOF | ||||
| [oneAPI] | ||||
| name=Intel® oneAPI repository | ||||
| baseurl=https://yum.repos.intel.com/oneapi | ||||
| enabled=1 | ||||
| gpgcheck=1 | ||||
| repo_gpgcheck=1 | ||||
| gpgkey=https://yum.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB | ||||
| EOF | ||||
|  | ||||
|     # Move the newly created oneAPI.repo file to the YUM configuration directory /etc/yum.repos.d | ||||
|     mv /tmp/oneAPI.repo /etc/yum.repos.d | ||||
|  | ||||
|     # The xpu-smi packages | ||||
|     dnf install -y flex bison xpu-smi | ||||
|     # Compute and Media Runtimes | ||||
|     dnf install -y \ | ||||
|         intel-opencl intel-media intel-mediasdk libmfxgen1 libvpl2 \ | ||||
|         level-zero intel-level-zero-gpu mesa-dri-drivers mesa-vulkan-drivers \ | ||||
|         mesa-vdpau-drivers libdrm mesa-libEGL mesa-libgbm mesa-libGL \ | ||||
|         mesa-libxatracker libvpl-tools intel-metrics-discovery \ | ||||
|         intel-metrics-library intel-igc-core intel-igc-cm \ | ||||
|         libva libva-utils intel-gmmlib libmetee intel-gsc intel-ocloc hwinfo clinfo | ||||
|     # Development packages | ||||
|     dnf install -y --refresh \ | ||||
|         intel-igc-opencl-devel level-zero-devel intel-gsc-devel libmetee-devel | ||||
|     # Install Intel® oneAPI Base Toolkit | ||||
|     dnf install intel-basekit -y | ||||
|  | ||||
|     # Cleanup | ||||
|     dnf clean all | ||||
|     rm -rf /var/cache/yum | ||||
|     rm -rf /var/lib/yum/yumdb | ||||
|     rm -rf /var/lib/yum/history | ||||
| } | ||||
|  | ||||
|  | ||||
| # The installation depends on the base OS | ||||
| ID=$(grep -oP '(?<=^ID=).+' /etc/os-release | tr -d '"') | ||||
| case "$ID" in | ||||
|     ubuntu) | ||||
|         install_ubuntu | ||||
|     ;; | ||||
|     centos) | ||||
|         install_centos | ||||
|     ;; | ||||
|     *) | ||||
|         echo "Unable to determine OS..." | ||||
|         exit 1 | ||||
|     ;; | ||||
| esac | ||||
(File diff suppressed because it is too large)
							| @ -1,44 +0,0 @@ | ||||
| ARG UBUNTU_VERSION | ||||
|  | ||||
| FROM ubuntu:${UBUNTU_VERSION} | ||||
|  | ||||
| ARG UBUNTU_VERSION | ||||
|  | ||||
| ENV DEBIAN_FRONTEND noninteractive | ||||
|  | ||||
| # Install common dependencies (so that this step can be cached separately) | ||||
| COPY ./common/install_base.sh install_base.sh | ||||
| RUN bash ./install_base.sh && rm install_base.sh | ||||
|  | ||||
| # Install missing libomp-dev | ||||
| RUN apt-get update && apt-get install -y --no-install-recommends libomp-dev && apt-get autoclean && apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* | ||||
|  | ||||
| # Install user | ||||
| COPY ./common/install_user.sh install_user.sh | ||||
| RUN bash ./install_user.sh && rm install_user.sh | ||||
|  | ||||
| # Install conda and other packages (e.g., numpy, pytest) | ||||
| ARG ANACONDA_PYTHON_VERSION | ||||
| ARG CONDA_CMAKE | ||||
| ENV ANACONDA_PYTHON_VERSION=$ANACONDA_PYTHON_VERSION | ||||
| ENV PATH /opt/conda/envs/py_$ANACONDA_PYTHON_VERSION/bin:/opt/conda/bin:$PATH | ||||
| COPY requirements-ci.txt /opt/conda/requirements-ci.txt | ||||
| COPY ./common/install_conda.sh install_conda.sh | ||||
| COPY ./common/common_utils.sh common_utils.sh | ||||
| RUN bash ./install_conda.sh && rm install_conda.sh common_utils.sh /opt/conda/requirements-ci.txt | ||||
|  | ||||
| # Install cuda and cudnn | ||||
| ARG CUDA_VERSION | ||||
| RUN wget -q https://raw.githubusercontent.com/pytorch/builder/main/common/install_cuda.sh -O install_cuda.sh | ||||
| RUN bash ./install_cuda.sh ${CUDA_VERSION} && rm install_cuda.sh | ||||
| ENV DESIRED_CUDA ${CUDA_VERSION} | ||||
| ENV PATH /usr/local/nvidia/bin:/usr/local/cuda/bin:$PATH | ||||
|  | ||||
| # Note that Docker build forbids copying files from outside the build context | ||||
| COPY ./common/install_linter.sh install_linter.sh | ||||
| COPY ./common/common_utils.sh common_utils.sh | ||||
| RUN bash ./install_linter.sh | ||||
| RUN rm install_linter.sh common_utils.sh | ||||
|  | ||||
| USER jenkins | ||||
| CMD ["bash"] | ||||
| @ -1,34 +0,0 @@ | ||||
| ARG UBUNTU_VERSION | ||||
|  | ||||
| FROM ubuntu:${UBUNTU_VERSION} | ||||
|  | ||||
| ARG UBUNTU_VERSION | ||||
|  | ||||
| ENV DEBIAN_FRONTEND noninteractive | ||||
|  | ||||
| # Install common dependencies (so that this step can be cached separately) | ||||
| COPY ./common/install_base.sh install_base.sh | ||||
| RUN bash ./install_base.sh && rm install_base.sh | ||||
|  | ||||
| # Install user | ||||
| COPY ./common/install_user.sh install_user.sh | ||||
| RUN bash ./install_user.sh && rm install_user.sh | ||||
|  | ||||
| # Install conda and other packages (e.g., numpy, pytest) | ||||
| ARG ANACONDA_PYTHON_VERSION | ||||
| ARG CONDA_CMAKE | ||||
| ENV ANACONDA_PYTHON_VERSION=$ANACONDA_PYTHON_VERSION | ||||
| ENV PATH /opt/conda/envs/py_$ANACONDA_PYTHON_VERSION/bin:/opt/conda/bin:$PATH | ||||
| COPY requirements-ci.txt /opt/conda/requirements-ci.txt | ||||
| COPY ./common/install_conda.sh install_conda.sh | ||||
| COPY ./common/common_utils.sh common_utils.sh | ||||
| RUN bash ./install_conda.sh && rm install_conda.sh common_utils.sh /opt/conda/requirements-ci.txt | ||||
|  | ||||
| # Note that Docker build forbids copying files from outside the build context | ||||
| COPY ./common/install_linter.sh install_linter.sh | ||||
| COPY ./common/common_utils.sh common_utils.sh | ||||
| RUN bash ./install_linter.sh | ||||
| RUN rm install_linter.sh common_utils.sh | ||||
|  | ||||
| USER jenkins | ||||
| CMD ["bash"] | ||||
| @ -1,314 +0,0 @@ | ||||
| # Python dependencies required for unit tests | ||||
|  | ||||
| #awscli==1.6 #this breaks some platforms | ||||
| #Description: AWS command line interface | ||||
| #Pinned versions: 1.6 | ||||
| #test that import: | ||||
|  | ||||
| boto3==1.19.12 | ||||
| #Description: AWS SDK for python | ||||
| #Pinned versions: 1.19.12, 1.16.34 | ||||
| #test that import: | ||||
|  | ||||
| click | ||||
| #Description: Command Line Interface Creation Kit | ||||
| #Pinned versions: | ||||
| #test that import: | ||||
|  | ||||
| coremltools==5.0b5 ; python_version < "3.12" | ||||
| #Description: Apple framework for ML integration | ||||
| #Pinned versions: 5.0b5 | ||||
| #test that import: | ||||
|  | ||||
| #dataclasses #this breaks some platforms | ||||
| #Description: Provides decorators for automatically adding special methods to user classes | ||||
| #Pinned versions: | ||||
| #test that import: | ||||
|  | ||||
| dill==0.3.7 | ||||
| #Description: dill extends pickle with serialization and de-serialization of most built-ins | ||||
| #Pinned versions: 0.3.7 | ||||
| #test that import: dynamo/test_replay_record.py test_dataloader.py test_datapipe.py test_serialization.py | ||||
|  | ||||
| expecttest==0.1.6 | ||||
| #Description: method for writing tests where the test framework auto-populates | ||||
| # the expected output based on previous runs | ||||
| #Pinned versions: 0.1.6 | ||||
| #test that import: | ||||
|  | ||||
| flatbuffers==2.0 | ||||
| #Description: cross platform serialization library | ||||
| #Pinned versions: 2.0 | ||||
| #test that import: | ||||
|  | ||||
| hypothesis==5.35.1 | ||||
| # Pin hypothesis to avoid flakiness: https://github.com/pytorch/pytorch/issues/31136 | ||||
| #Description: advanced library for generating parametrized tests | ||||
| #Pinned versions: 3.44.6, 4.53.2 | ||||
| #test that import: test_xnnpack_integration.py, test_pruning_op.py, test_nn.py | ||||
|  | ||||
| junitparser==2.1.1 | ||||
| #Description: junitparser handles JUnit/xUnit result XML files | ||||
| #Pinned versions: 2.1.1 | ||||
| #test that import: | ||||
|  | ||||
| lark==0.12.0 | ||||
| #Description: parser | ||||
| #Pinned versions: 0.12.0 | ||||
| #test that import: | ||||
|  | ||||
| librosa>=0.6.2 ; python_version < "3.11" | ||||
| #Description: A python package for music and audio analysis | ||||
| #Pinned versions: >=0.6.2 | ||||
| #test that import: test_spectral_ops.py | ||||
|  | ||||
| #mkl #this breaks linux-bionic-rocm4.5-py3.7 | ||||
| #Description: Intel oneAPI Math Kernel Library | ||||
| #Pinned versions: | ||||
| #test that import: test_profiler.py, test_public_bindings.py, test_testing.py, | ||||
| #test_nn.py, test_mkldnn.py, test_jit.py, test_fx_experimental.py, | ||||
| #test_autograd.py | ||||
|  | ||||
| #mkl-devel | ||||
| # see mkl | ||||
|  | ||||
| #mock | ||||
| #Description: A testing library that allows you to replace parts of your | ||||
| #system under test with mock objects | ||||
| #Pinned versions: | ||||
| #test that import: test_modules.py, test_nn.py, | ||||
| #test_testing.py | ||||
|  | ||||
| #MonkeyType # breaks pytorch-xla-linux-bionic-py3.7-clang8 | ||||
| #Description: collects runtime types of function arguments and return | ||||
| #values, and can automatically generate stub files | ||||
| #Pinned versions: | ||||
| #test that import: | ||||
|  | ||||
| mypy==1.9.0 | ||||
| # Pin MyPy version because new errors are likely to appear with each release | ||||
| #Description: linter | ||||
| #Pinned versions: 1.9.0 | ||||
| #test that import: test_typing.py, test_type_hints.py | ||||
|  | ||||
| networkx==2.8.8 | ||||
| #Description: creation, manipulation, and study of | ||||
| #the structure, dynamics, and functions of complex networks | ||||
| #Pinned versions: 2.8.8 | ||||
| #test that import: functorch | ||||
|  | ||||
| #ninja | ||||
| #Description: build system. Note that installing it from | ||||
| #here breaks things, so it is commented out | ||||
| #Pinned versions: 1.10.0.post1 | ||||
| #test that import: run_test.py, test_cpp_extensions_aot.py, test_determination.py | ||||
|  | ||||
| numba==0.49.0 ; python_version < "3.9" | ||||
| numba==0.54.1 ; python_version == "3.9" | ||||
| numba==0.55.2 ; python_version == "3.10" | ||||
| #Description: Just-In-Time Compiler for Numerical Functions | ||||
| #Pinned versions: 0.54.1, 0.49.0, <=0.49.1 | ||||
| #test that import: test_numba_integration.py | ||||
| #For numba issue see https://github.com/pytorch/pytorch/issues/51511 | ||||
|  | ||||
| #numpy | ||||
| #Description: Provides N-dimensional arrays and linear algebra | ||||
| #Pinned versions: 1.20 | ||||
| #test that import: test_view_ops.py, test_unary_ufuncs.py, test_type_promotion.py, | ||||
| #test_type_info.py, test_torch.py, test_tensorexpr_pybind.py, test_tensorexpr.py, | ||||
| #test_tensorboard.py, test_tensor_creation_ops.py, test_static_runtime.py, | ||||
| #test_spectral_ops.py, test_sort_and_select.py, test_shape_ops.py, | ||||
| #test_segment_reductions.py, test_reductions.py, test_pruning_op.py, | ||||
| #test_overrides.py, test_numpy_interop.py, test_numba_integration.py | ||||
| #test_nn.py, test_namedtensor.py, test_linalg.py, test_jit_cuda_fuser.py, | ||||
| #test_jit.py, test_indexing.py, test_datapipe.py, test_dataloader.py, | ||||
| #test_binary_ufuncs.py | ||||
|  | ||||
| #onnxruntime | ||||
| #Description: scoring engine for Open Neural Network Exchange (ONNX) models | ||||
| #Pinned versions: 1.9.0 | ||||
| #test that import: | ||||
|  | ||||
| opt-einsum==3.3 | ||||
| #Description: Python library to optimize tensor contraction order, used in einsum | ||||
| #Pinned versions: 3.3 | ||||
| #test that import: test_linalg.py | ||||
|  | ||||
| optree==0.11.0 | ||||
| #Description: A library for tree manipulation | ||||
| #Pinned versions: 0.11.0 | ||||
| #test that import: test_vmap.py, test_aotdispatch.py, test_dynamic_shapes.py, | ||||
| #test_pytree.py, test_ops.py, test_control_flow.py, test_modules.py, | ||||
| #common_utils.py, test_eager_transforms.py, test_python_dispatch.py, | ||||
| #test_expanded_weights.py, test_decomp.py, test_overrides.py, test_masked.py, | ||||
| #test_ops.py, test_prims.py, test_subclass.py, test_functionalization.py, | ||||
| #test_schema_check.py, test_profiler_tree.py, test_meta.py, test_torchxla_num_output.py, | ||||
| #test_utils.py, test_proxy_tensor.py, test_memory_profiler.py, test_view_ops.py, | ||||
| #test_pointwise_ops.py, test_dtensor_ops.py, test_torchinductor.py, test_fx.py, | ||||
| #test_fake_tensor.py, test_mps.py | ||||
|  | ||||
| pillow==10.3.0 | ||||
| #Description:  Python Imaging Library fork | ||||
| #Pinned versions: 10.3.0 | ||||
| #test that import: | ||||
|  | ||||
| protobuf==3.20.2 | ||||
| #Description:  Google’s data interchange format | ||||
| #Pinned versions: 3.20.1 | ||||
| #test that import: test_tensorboard.py | ||||
|  | ||||
| psutil | ||||
| #Description: information on running processes and system utilization | ||||
| #Pinned versions: | ||||
| #test that import: test_profiler.py, test_openmp.py, test_dataloader.py | ||||
|  | ||||
| pytest==7.3.2 | ||||
| #Description: testing framework | ||||
| #Pinned versions: | ||||
| #test that import: test_typing.py, test_cpp_extensions_aot.py, run_test.py | ||||
|  | ||||
| pytest-xdist==3.3.1 | ||||
| #Description: plugin for running pytest in parallel | ||||
| #Pinned versions: | ||||
| #test that import: | ||||
|  | ||||
| pytest-flakefinder==1.1.0 | ||||
| #Description: plugin for rerunning tests a fixed number of times in pytest | ||||
| #Pinned versions: 1.1.0 | ||||
| #test that import: | ||||
|  | ||||
| pytest-rerunfailures>=10.3 | ||||
| #Description: plugin for rerunning failure tests in pytest | ||||
| #Pinned versions: | ||||
| #test that import: | ||||
|  | ||||
| #pytest-benchmark | ||||
| #Description: fixture for benchmarking code | ||||
| #Pinned versions: 3.2.3 | ||||
| #test that import: | ||||
|  | ||||
| #pytest-sugar | ||||
| #Description: shows failures and errors instantly | ||||
| #Pinned versions: | ||||
| #test that import: | ||||
|  | ||||
| xdoctest==1.1.0 | ||||
| #Description: runs doctests in pytest | ||||
| #Pinned versions: 1.1.0 | ||||
| #test that import: | ||||
|  | ||||
| pygments==2.15.0 | ||||
| #Description: supports doctest highlighting | ||||
| #Pinned versions: 2.12.0 | ||||
| #test that import: the doctests | ||||
|  | ||||
| #PyYAML | ||||
| #Description: data serialization format | ||||
| #Pinned versions: | ||||
| #test that import: | ||||
|  | ||||
| #requests | ||||
| #Description: HTTP library | ||||
| #Pinned versions: | ||||
| #test that import: test_type_promotion.py | ||||
|  | ||||
| #rich | ||||
| #Description: rich text and beautiful formatting in the terminal | ||||
| #Pinned versions: 10.9.0 | ||||
| #test that import: | ||||
|  | ||||
| scikit-image==0.19.3 ; python_version < "3.10" | ||||
| scikit-image==0.20.0 ; python_version >= "3.10" | ||||
| #Description: image processing routines | ||||
| #Pinned versions: | ||||
| #test that import: test_nn.py | ||||
|  | ||||
| #scikit-learn | ||||
| #Description: machine learning package | ||||
| #Pinned versions: 0.20.3 | ||||
| #test that import: | ||||
|  | ||||
| scipy==1.10.1 ; python_version <= "3.11" | ||||
| scipy==1.12.0 ; python_version == "3.12" | ||||
| # Pin SciPy because of failing distribution tests (see #60347) | ||||
| #Description: scientific python | ||||
| #Pinned versions: 1.10.1 | ||||
| #test that import: test_unary_ufuncs.py, test_torch.py,test_tensor_creation_ops.py | ||||
| #test_spectral_ops.py, test_sparse_csr.py, test_reductions.py,test_nn.py | ||||
| #test_linalg.py, test_binary_ufuncs.py | ||||
|  | ||||
| #tabulate | ||||
| #Description: Pretty-print tabular data | ||||
| #Pinned versions: | ||||
| #test that import: | ||||
|  | ||||
| tb-nightly==2.13.0a20230426 | ||||
| #Description: TensorBoard | ||||
| #Pinned versions: | ||||
| #test that import: | ||||
|  | ||||
| # needed by torchgen utils | ||||
| typing-extensions | ||||
| #Description: type hints for python | ||||
| #Pinned versions: | ||||
| #test that import: | ||||
|  | ||||
| #virtualenv | ||||
| #Description: virtual environment for python | ||||
| #Pinned versions: | ||||
| #test that import: | ||||
|  | ||||
| unittest-xml-reporting<=3.2.0,>=2.0.0 | ||||
| #Description: saves unit test results to xml | ||||
| #Pinned versions: | ||||
| #test that import: | ||||
|  | ||||
| #lintrunner is supported on aarch64-linux only from version 0.12.4 | ||||
| lintrunner==0.12.5 | ||||
| #Description: all about linters! | ||||
| #Pinned versions: 0.12.5 | ||||
| #test that import: | ||||
|  | ||||
| rockset==1.0.3 | ||||
| #Description: queries Rockset | ||||
| #Pinned versions: 1.0.3 | ||||
| #test that import: | ||||
|  | ||||
| ghstack==0.8.0 | ||||
| #Description: ghstack tool | ||||
| #Pinned versions: 0.8.0 | ||||
| #test that import: | ||||
|  | ||||
| jinja2==3.1.4 | ||||
| #Description: jinja2 template engine | ||||
| #Pinned versions: 3.1.4 | ||||
| #test that import: | ||||
|  | ||||
| pytest-cpp==2.3.0 | ||||
| #Description: This is used by pytest to invoke C++ tests | ||||
| #Pinned versions: 2.3.0 | ||||
| #test that import: | ||||
|  | ||||
| z3-solver==4.12.2.0 | ||||
| #Description: The Z3 Theorem Prover Project | ||||
| #Pinned versions: | ||||
| #test that import: | ||||
|  | ||||
| tensorboard==2.13.0 | ||||
| #Description: Also included in .ci/docker/requirements-docs.txt | ||||
| #Pinned versions: | ||||
| #test that import: test_tensorboard | ||||
|  | ||||
| pywavelets==1.4.1 ; python_version < "3.12" | ||||
| pywavelets==1.5.0 ; python_version >= "3.12" | ||||
| #Description: This is a requirement of scikit-image; we need to pin | ||||
| # it here because 1.5.0 conflicts with numpy 1.21.2 used in CI | ||||
| #Pinned versions: 1.4.1 | ||||
| #test that import: | ||||
|  | ||||
| lxml==5.0.0 | ||||
| #Description: This is a requirement of unittest-xml-reporting | ||||
|  | ||||
| # Python-3.9 binaries | ||||
|  | ||||
| PyGithub==2.3.0 | ||||
| @ -1,49 +0,0 @@ | ||||
| sphinx==5.3.0 | ||||
| #Description: This is used to generate PyTorch docs | ||||
| #Pinned versions: 5.3.0 | ||||
| -e git+https://github.com/pytorch/pytorch_sphinx_theme.git#egg=pytorch_sphinx_theme | ||||
|  | ||||
| # TODO: sphinxcontrib.katex 0.9.0 adds a local KaTeX server to speed up pre-rendering, | ||||
| # but it doesn't seem to work and hangs around idly. The initial thought is that it is | ||||
| # probably related to the Docker setup. We can investigate this later. | ||||
| sphinxcontrib.katex==0.8.6 | ||||
| #Description: This is used to generate PyTorch docs | ||||
| #Pinned versions: 0.8.6 | ||||
|  | ||||
| matplotlib==3.5.3 | ||||
| #Description: This is used to generate PyTorch docs | ||||
| #Pinned versions: 3.5.3 | ||||
|  | ||||
| tensorboard==2.13.0 | ||||
| #Description: This is used to generate PyTorch docs | ||||
| #Pinned versions: 2.13.0 | ||||
|  | ||||
| breathe==4.34.0 | ||||
| #Description: This is used to generate PyTorch C++ docs | ||||
| #Pinned versions: 4.34.0 | ||||
|  | ||||
| exhale==0.2.3 | ||||
| #Description: This is used to generate PyTorch C++ docs | ||||
| #Pinned versions: 0.2.3 | ||||
|  | ||||
| docutils==0.16 | ||||
| #Description: This is used to generate PyTorch C++ docs | ||||
| #Pinned versions: 0.16 | ||||
|  | ||||
| bs4==0.0.1 | ||||
| #Description: This is used to generate PyTorch C++ docs | ||||
| #Pinned versions: 0.0.1 | ||||
|  | ||||
| IPython==8.12.0 | ||||
| #Description: This is used to generate PyTorch functorch docs | ||||
| #Pinned versions: 8.12.0 | ||||
|  | ||||
| myst-nb==0.17.2 | ||||
| #Description: This is used to generate PyTorch functorch docs | ||||
| #Pinned versions: 0.13.2 | ||||
|  | ||||
| # The following are required to build torch.distributed.elastic.rendezvous.etcd* docs | ||||
| python-etcd==0.4.5 | ||||
| sphinx-copybutton==0.5.0 | ||||
| sphinx-panels==0.4.1 | ||||
| myst-parser==0.18.1 | ||||
| @ -1 +0,0 @@ | ||||
| 3.0.0 | ||||
| @ -1,158 +0,0 @@ | ||||
| ARG UBUNTU_VERSION | ||||
| ARG CUDA_VERSION | ||||
| ARG IMAGE_NAME | ||||
|  | ||||
| FROM ${IMAGE_NAME} | ||||
|  | ||||
| ARG UBUNTU_VERSION | ||||
| ARG CUDA_VERSION | ||||
|  | ||||
| ENV DEBIAN_FRONTEND noninteractive | ||||
|  | ||||
| # Install common dependencies (so that this step can be cached separately) | ||||
| COPY ./common/install_base.sh install_base.sh | ||||
| RUN bash ./install_base.sh && rm install_base.sh | ||||
|  | ||||
| # Install user | ||||
| COPY ./common/install_user.sh install_user.sh | ||||
| RUN bash ./install_user.sh && rm install_user.sh | ||||
|  | ||||
| # Install katex | ||||
| ARG KATEX | ||||
| COPY ./common/install_docs_reqs.sh install_docs_reqs.sh | ||||
| RUN bash ./install_docs_reqs.sh && rm install_docs_reqs.sh | ||||
|  | ||||
| # Install conda and other packages (e.g., numpy, pytest) | ||||
| ARG ANACONDA_PYTHON_VERSION | ||||
| ENV ANACONDA_PYTHON_VERSION=$ANACONDA_PYTHON_VERSION | ||||
| ENV PATH /opt/conda/envs/py_$ANACONDA_PYTHON_VERSION/bin:/opt/conda/bin:$PATH | ||||
| ARG CONDA_CMAKE | ||||
| COPY requirements-ci.txt /opt/conda/requirements-ci.txt | ||||
| COPY ./common/install_conda.sh install_conda.sh | ||||
| COPY ./common/common_utils.sh common_utils.sh | ||||
| RUN bash ./install_conda.sh && rm install_conda.sh common_utils.sh /opt/conda/requirements-ci.txt | ||||
|  | ||||
| # Install gcc | ||||
| ARG GCC_VERSION | ||||
| COPY ./common/install_gcc.sh install_gcc.sh | ||||
| RUN bash ./install_gcc.sh && rm install_gcc.sh | ||||
|  | ||||
| # Install clang | ||||
| ARG CLANG_VERSION | ||||
| COPY ./common/install_clang.sh install_clang.sh | ||||
| RUN bash ./install_clang.sh && rm install_clang.sh | ||||
|  | ||||
| # (optional) Install protobuf for ONNX | ||||
| ARG PROTOBUF | ||||
| COPY ./common/install_protobuf.sh install_protobuf.sh | ||||
| RUN if [ -n "${PROTOBUF}" ]; then bash ./install_protobuf.sh; fi | ||||
| RUN rm install_protobuf.sh | ||||
| ENV INSTALLED_PROTOBUF ${PROTOBUF} | ||||
|  | ||||
| # (optional) Install database packages like LMDB and LevelDB | ||||
| ARG DB | ||||
| COPY ./common/install_db.sh install_db.sh | ||||
| RUN if [ -n "${DB}" ]; then bash ./install_db.sh; fi | ||||
| RUN rm install_db.sh | ||||
| ENV INSTALLED_DB ${DB} | ||||
|  | ||||
| # (optional) Install vision packages like OpenCV | ||||
| ARG VISION | ||||
| COPY ./common/install_vision.sh ./common/cache_vision_models.sh ./common/common_utils.sh ./ | ||||
| RUN if [ -n "${VISION}" ]; then bash ./install_vision.sh; fi | ||||
| RUN rm install_vision.sh cache_vision_models.sh common_utils.sh | ||||
| ENV INSTALLED_VISION ${VISION} | ||||
|  | ||||
| # (optional) Install UCC | ||||
| ARG UCX_COMMIT | ||||
| ARG UCC_COMMIT | ||||
| ENV UCX_COMMIT $UCX_COMMIT | ||||
| ENV UCC_COMMIT $UCC_COMMIT | ||||
| ENV UCX_HOME /usr | ||||
| ENV UCC_HOME /usr | ||||
| ADD ./common/install_ucc.sh install_ucc.sh | ||||
| RUN if [ -n "${UCX_COMMIT}" ] && [ -n "${UCC_COMMIT}" ]; then bash ./install_ucc.sh; fi | ||||
| RUN rm install_ucc.sh | ||||
|  | ||||
| COPY ./common/install_openssl.sh install_openssl.sh | ||||
| ENV OPENSSL_ROOT_DIR /opt/openssl | ||||
| RUN bash ./install_openssl.sh | ||||
| ENV OPENSSL_DIR /opt/openssl | ||||
|  | ||||
| ARG INDUCTOR_BENCHMARKS | ||||
| COPY ./common/install_inductor_benchmark_deps.sh install_inductor_benchmark_deps.sh | ||||
| COPY ./common/common_utils.sh common_utils.sh | ||||
| COPY ci_commit_pins/huggingface.txt huggingface.txt | ||||
| COPY ci_commit_pins/timm.txt timm.txt | ||||
| RUN if [ -n "${INDUCTOR_BENCHMARKS}" ]; then bash ./install_inductor_benchmark_deps.sh; fi | ||||
| RUN rm install_inductor_benchmark_deps.sh common_utils.sh timm.txt huggingface.txt | ||||
|  | ||||
| # (optional) Install non-default CMake version | ||||
| ARG CMAKE_VERSION | ||||
| COPY ./common/install_cmake.sh install_cmake.sh | ||||
| RUN if [ -n "${CMAKE_VERSION}" ]; then bash ./install_cmake.sh; fi | ||||
| RUN rm install_cmake.sh | ||||
|  | ||||
| ARG TRITON | ||||
| # Install triton. This needs to be done before sccache because the latter will | ||||
| # try to reach out to S3, which docker build runners don't have access to. | ||||
| COPY ./common/install_triton.sh install_triton.sh | ||||
| COPY ./common/common_utils.sh common_utils.sh | ||||
| COPY ci_commit_pins/triton.txt triton.txt | ||||
| COPY triton_version.txt triton_version.txt | ||||
| RUN if [ -n "${TRITON}" ]; then bash ./install_triton.sh; fi | ||||
| RUN rm install_triton.sh common_utils.sh triton.txt triton_version.txt | ||||
|  | ||||
| # Install ccache/sccache (do this last, so we get priority in PATH) | ||||
| COPY ./common/install_cache.sh install_cache.sh | ||||
| ENV PATH /opt/cache/bin:$PATH | ||||
| # See https://github.com/pytorch/pytorch/issues/82174 | ||||
| # TODO(sdym@fb.com): | ||||
| # check if this is still needed after the migration off Xenial is complete | ||||
| ENV CARGO_NET_GIT_FETCH_WITH_CLI true | ||||
| RUN bash ./install_cache.sh && rm install_cache.sh | ||||
| ENV CMAKE_CUDA_COMPILER_LAUNCHER=/opt/cache/bin/sccache | ||||
|  | ||||
| # Add jni.h for java host build | ||||
| COPY ./common/install_jni.sh install_jni.sh | ||||
| COPY ./java/jni.h jni.h | ||||
| RUN bash ./install_jni.sh && rm install_jni.sh | ||||
|  | ||||
| # Install Open MPI for CUDA | ||||
| COPY ./common/install_openmpi.sh install_openmpi.sh | ||||
| RUN if [ -n "${CUDA_VERSION}" ]; then bash install_openmpi.sh; fi | ||||
| RUN rm install_openmpi.sh | ||||
|  | ||||
| # Include BUILD_ENVIRONMENT environment variable in image | ||||
| ARG BUILD_ENVIRONMENT | ||||
| ENV BUILD_ENVIRONMENT ${BUILD_ENVIRONMENT} | ||||
|  | ||||
| # AWS specific CUDA build guidance | ||||
| ENV TORCH_CUDA_ARCH_LIST Maxwell | ||||
| ENV TORCH_NVCC_FLAGS "-Xfatbin -compress-all" | ||||
| ENV CUDA_PATH /usr/local/cuda | ||||
|  | ||||
| # Install LLVM dev version (Defined in the pytorch/builder github repository) | ||||
| COPY --from=pytorch/llvm:9.0.1 /opt/llvm /opt/llvm | ||||
|  | ||||
| # Install CUDNN | ||||
| ARG CUDNN_VERSION | ||||
| ARG CUDA_VERSION | ||||
| COPY ./common/install_cudnn.sh install_cudnn.sh | ||||
| RUN if [ -n "${CUDNN_VERSION}" ]; then bash install_cudnn.sh; fi | ||||
| RUN rm install_cudnn.sh | ||||
|  | ||||
| # Install CUSPARSELT | ||||
| ARG CUDA_VERSION | ||||
| COPY ./common/install_cusparselt.sh install_cusparselt.sh | ||||
| RUN bash install_cusparselt.sh | ||||
| RUN rm install_cusparselt.sh | ||||
|  | ||||
| # Delete /usr/local/cuda-11.X/cuda-11.X symlinks | ||||
| RUN if [ -h /usr/local/cuda-11.6/cuda-11.6 ]; then rm /usr/local/cuda-11.6/cuda-11.6; fi | ||||
| RUN if [ -h /usr/local/cuda-11.7/cuda-11.7 ]; then rm /usr/local/cuda-11.7/cuda-11.7; fi | ||||
| RUN if [ -h /usr/local/cuda-12.1/cuda-12.1 ]; then rm /usr/local/cuda-12.1/cuda-12.1; fi | ||||
| RUN if [ -h /usr/local/cuda-12.4/cuda-12.4 ]; then rm /usr/local/cuda-12.4/cuda-12.4; fi | ||||
|  | ||||
| USER jenkins | ||||
| CMD ["bash"] | ||||
.ci/docker/ubuntu-rocm/.gitignore (vendored, 1 line)
							| @ -1 +0,0 @@ | ||||
| *.sh | ||||
| @ -1,125 +0,0 @@ | ||||
| ARG UBUNTU_VERSION | ||||
|  | ||||
| FROM ubuntu:${UBUNTU_VERSION} | ||||
|  | ||||
| ARG UBUNTU_VERSION | ||||
|  | ||||
| ENV DEBIAN_FRONTEND noninteractive | ||||
|  | ||||
| # Set AMD gpu targets to build for | ||||
| ARG PYTORCH_ROCM_ARCH | ||||
| ENV PYTORCH_ROCM_ARCH ${PYTORCH_ROCM_ARCH} | ||||
|  | ||||
| # Install common dependencies (so that this step can be cached separately) | ||||
| COPY ./common/install_base.sh install_base.sh | ||||
| RUN bash ./install_base.sh && rm install_base.sh | ||||
|  | ||||
| # Install clang | ||||
| ARG LLVMDEV | ||||
| ARG CLANG_VERSION | ||||
| COPY ./common/install_clang.sh install_clang.sh | ||||
| RUN bash ./install_clang.sh && rm install_clang.sh | ||||
|  | ||||
| # Install user | ||||
| COPY ./common/install_user.sh install_user.sh | ||||
| RUN bash ./install_user.sh && rm install_user.sh | ||||
|  | ||||
| # Install conda and other packages (e.g., numpy, pytest) | ||||
| ARG ANACONDA_PYTHON_VERSION | ||||
| ARG CONDA_CMAKE | ||||
| ENV ANACONDA_PYTHON_VERSION=$ANACONDA_PYTHON_VERSION | ||||
| ENV PATH /opt/conda/envs/py_$ANACONDA_PYTHON_VERSION/bin:/opt/conda/bin:$PATH | ||||
| COPY requirements-ci.txt /opt/conda/requirements-ci.txt | ||||
| COPY ./common/install_conda.sh install_conda.sh | ||||
| COPY ./common/common_utils.sh common_utils.sh | ||||
| RUN bash ./install_conda.sh && rm install_conda.sh common_utils.sh /opt/conda/requirements-ci.txt | ||||
|  | ||||
| # Install gcc | ||||
| ARG GCC_VERSION | ||||
| COPY ./common/install_gcc.sh install_gcc.sh | ||||
| RUN bash ./install_gcc.sh && rm install_gcc.sh | ||||
|  | ||||
| # (optional) Install protobuf for ONNX | ||||
| ARG PROTOBUF | ||||
| COPY ./common/install_protobuf.sh install_protobuf.sh | ||||
| RUN if [ -n "${PROTOBUF}" ]; then bash ./install_protobuf.sh; fi | ||||
| RUN rm install_protobuf.sh | ||||
| ENV INSTALLED_PROTOBUF ${PROTOBUF} | ||||
|  | ||||
| # (optional) Install database packages like LMDB and LevelDB | ||||
| ARG DB | ||||
| COPY ./common/install_db.sh install_db.sh | ||||
| RUN if [ -n "${DB}" ]; then bash ./install_db.sh; fi | ||||
| RUN rm install_db.sh | ||||
| ENV INSTALLED_DB ${DB} | ||||
|  | ||||
| # (optional) Install vision packages like OpenCV | ||||
| ARG VISION | ||||
| COPY ./common/install_vision.sh ./common/cache_vision_models.sh ./common/common_utils.sh ./ | ||||
| RUN if [ -n "${VISION}" ]; then bash ./install_vision.sh; fi | ||||
| RUN rm install_vision.sh cache_vision_models.sh common_utils.sh | ||||
| ENV INSTALLED_VISION ${VISION} | ||||
|  | ||||
| # Install rocm | ||||
| ARG ROCM_VERSION | ||||
| COPY ./common/install_rocm.sh install_rocm.sh | ||||
| RUN bash ./install_rocm.sh | ||||
| RUN rm install_rocm.sh | ||||
| COPY ./common/install_rocm_magma.sh install_rocm_magma.sh | ||||
| RUN bash ./install_rocm_magma.sh | ||||
| RUN rm install_rocm_magma.sh | ||||
| ENV ROCM_PATH /opt/rocm | ||||
| ENV PATH /opt/rocm/bin:$PATH | ||||
| ENV PATH /opt/rocm/hcc/bin:$PATH | ||||
| ENV PATH /opt/rocm/hip/bin:$PATH | ||||
| ENV PATH /opt/rocm/opencl/bin:$PATH | ||||
| ENV PATH /opt/rocm/llvm/bin:$PATH | ||||
| ENV MAGMA_HOME /opt/rocm/magma | ||||
| ENV LANG C.UTF-8 | ||||
| ENV LC_ALL C.UTF-8 | ||||
|  | ||||
| # Install amdsmi | ||||
| COPY ./common/install_amdsmi.sh install_amdsmi.sh | ||||
| RUN bash ./install_amdsmi.sh | ||||
| RUN rm install_amdsmi.sh | ||||
|  | ||||
| # (optional) Install non-default CMake version | ||||
| ARG CMAKE_VERSION | ||||
| COPY ./common/install_cmake.sh install_cmake.sh | ||||
| RUN if [ -n "${CMAKE_VERSION}" ]; then bash ./install_cmake.sh; fi | ||||
| RUN rm install_cmake.sh | ||||
|  | ||||
| # (optional) Install non-default Ninja version | ||||
| ARG NINJA_VERSION | ||||
| COPY ./common/install_ninja.sh install_ninja.sh | ||||
| RUN if [ -n "${NINJA_VERSION}" ]; then bash ./install_ninja.sh; fi | ||||
| RUN rm install_ninja.sh | ||||
|  | ||||
| ARG TRITON | ||||
| # Install triton. This needs to be done before sccache because the latter will | ||||
| # try to reach out to S3, which docker build runners don't have access to. | ||||
| COPY ./common/install_triton.sh install_triton.sh | ||||
| COPY ./common/common_utils.sh common_utils.sh | ||||
| COPY ci_commit_pins/triton-rocm.txt triton-rocm.txt | ||||
| COPY triton_version.txt triton_version.txt | ||||
| RUN if [ -n "${TRITON}" ]; then bash ./install_triton.sh; fi | ||||
| RUN rm install_triton.sh common_utils.sh triton-rocm.txt triton_version.txt | ||||
|  | ||||
| # Install AOTriton | ||||
| COPY ./aotriton_version.txt aotriton_version.txt | ||||
| COPY ./common/common_utils.sh common_utils.sh | ||||
| COPY ./common/install_aotriton.sh install_aotriton.sh | ||||
| RUN ["/bin/bash", "-c", "./install_aotriton.sh /opt/rocm && rm -rf install_aotriton.sh aotriton_version.txt common_utils.sh"] | ||||
| ENV AOTRITON_INSTALLED_PREFIX /opt/rocm/aotriton | ||||
|  | ||||
| # Install ccache/sccache (do this last, so we get priority in PATH) | ||||
| COPY ./common/install_cache.sh install_cache.sh | ||||
| ENV PATH /opt/cache/bin:$PATH | ||||
| RUN bash ./install_cache.sh && rm install_cache.sh | ||||
|  | ||||
| # Include BUILD_ENVIRONMENT environment variable in image | ||||
| ARG BUILD_ENVIRONMENT | ||||
| ENV BUILD_ENVIRONMENT ${BUILD_ENVIRONMENT} | ||||
|  | ||||
| USER jenkins | ||||
| CMD ["bash"] | ||||
| @ -1,118 +0,0 @@ | ||||
| ARG UBUNTU_VERSION | ||||
|  | ||||
| FROM ubuntu:${UBUNTU_VERSION} | ||||
|  | ||||
| ARG UBUNTU_VERSION | ||||
|  | ||||
| ENV DEBIAN_FRONTEND noninteractive | ||||
|  | ||||
| ARG CLANG_VERSION | ||||
|  | ||||
| # Install common dependencies (so that this step can be cached separately) | ||||
| COPY ./common/install_base.sh install_base.sh | ||||
| RUN bash ./install_base.sh && rm install_base.sh | ||||
|  | ||||
| # Install clang | ||||
| ARG LLVMDEV | ||||
| COPY ./common/install_clang.sh install_clang.sh | ||||
| RUN bash ./install_clang.sh && rm install_clang.sh | ||||
|  | ||||
| # Install user | ||||
| COPY ./common/install_user.sh install_user.sh | ||||
| RUN bash ./install_user.sh && rm install_user.sh | ||||
|  | ||||
| # Install katex | ||||
| ARG KATEX | ||||
| COPY ./common/install_docs_reqs.sh install_docs_reqs.sh | ||||
| RUN bash ./install_docs_reqs.sh && rm install_docs_reqs.sh | ||||
|  | ||||
| # Install conda and other packages (e.g., numpy, pytest) | ||||
| ARG ANACONDA_PYTHON_VERSION | ||||
| ARG CONDA_CMAKE | ||||
| ARG DOCS | ||||
| ENV ANACONDA_PYTHON_VERSION=$ANACONDA_PYTHON_VERSION | ||||
| ENV PATH /opt/conda/envs/py_$ANACONDA_PYTHON_VERSION/bin:/opt/conda/bin:$PATH | ||||
| ENV DOCS=$DOCS | ||||
| COPY requirements-ci.txt requirements-docs.txt /opt/conda/ | ||||
| COPY ./common/install_conda.sh install_conda.sh | ||||
| COPY ./common/common_utils.sh common_utils.sh | ||||
| RUN bash ./install_conda.sh && rm install_conda.sh common_utils.sh /opt/conda/requirements-ci.txt /opt/conda/requirements-docs.txt | ||||
|  | ||||
| # Install gcc | ||||
| ARG GCC_VERSION | ||||
| COPY ./common/install_gcc.sh install_gcc.sh | ||||
| RUN bash ./install_gcc.sh && rm install_gcc.sh | ||||
|  | ||||
| # Install lcov for C++ code coverage | ||||
| COPY ./common/install_lcov.sh install_lcov.sh | ||||
| RUN  bash ./install_lcov.sh && rm install_lcov.sh | ||||
|  | ||||
| COPY ./common/install_openssl.sh install_openssl.sh | ||||
| RUN bash ./install_openssl.sh | ||||
| ENV OPENSSL_ROOT_DIR /opt/openssl | ||||
| ENV OPENSSL_DIR /opt/openssl | ||||
| RUN rm install_openssl.sh | ||||
|  | ||||
| ARG INDUCTOR_BENCHMARKS | ||||
| COPY ./common/install_inductor_benchmark_deps.sh install_inductor_benchmark_deps.sh | ||||
| COPY ./common/common_utils.sh common_utils.sh | ||||
| COPY ci_commit_pins/huggingface.txt huggingface.txt | ||||
| COPY ci_commit_pins/timm.txt timm.txt | ||||
| RUN if [ -n "${INDUCTOR_BENCHMARKS}" ]; then bash ./install_inductor_benchmark_deps.sh; fi | ||||
| RUN rm install_inductor_benchmark_deps.sh common_utils.sh timm.txt huggingface.txt | ||||
|  | ||||
| # Install XPU Dependencies | ||||
| ARG XPU_VERSION | ||||
| COPY ./common/install_xpu.sh install_xpu.sh | ||||
| RUN bash ./install_xpu.sh && rm install_xpu.sh | ||||
|  | ||||
| ARG TRITON | ||||
| # Install triton. This needs to be done before sccache because the latter will | ||||
| # try to reach out to S3, which docker build runners don't have access to. | ||||
| COPY ./common/install_triton.sh install_triton.sh | ||||
| COPY ./common/common_utils.sh common_utils.sh | ||||
| COPY ci_commit_pins/triton-xpu.txt triton-xpu.txt | ||||
| COPY triton_version.txt triton_version.txt | ||||
| RUN if [ -n "${TRITON}" ]; then bash ./install_triton.sh; fi | ||||
| RUN rm install_triton.sh common_utils.sh triton-xpu.txt triton_version.txt | ||||
|  | ||||
| # (optional) Install database packages like LMDB and LevelDB | ||||
| ARG DB | ||||
| COPY ./common/install_db.sh install_db.sh | ||||
| RUN if [ -n "${DB}" ]; then bash ./install_db.sh; fi | ||||
| RUN rm install_db.sh | ||||
| ENV INSTALLED_DB ${DB} | ||||
|  | ||||
| # (optional) Install vision packages like OpenCV | ||||
| ARG VISION | ||||
| COPY ./common/install_vision.sh ./common/cache_vision_models.sh ./common/common_utils.sh ./ | ||||
| RUN if [ -n "${VISION}" ]; then bash ./install_vision.sh; fi | ||||
| RUN rm install_vision.sh cache_vision_models.sh common_utils.sh | ||||
| ENV INSTALLED_VISION ${VISION} | ||||
|  | ||||
| # (optional) Install non-default CMake version | ||||
| ARG CMAKE_VERSION | ||||
| COPY ./common/install_cmake.sh install_cmake.sh | ||||
| RUN if [ -n "${CMAKE_VERSION}" ]; then bash ./install_cmake.sh; fi | ||||
| RUN rm install_cmake.sh | ||||
|  | ||||
| # (optional) Install non-default Ninja version | ||||
| ARG NINJA_VERSION | ||||
| COPY ./common/install_ninja.sh install_ninja.sh | ||||
| RUN if [ -n "${NINJA_VERSION}" ]; then bash ./install_ninja.sh; fi | ||||
| RUN rm install_ninja.sh | ||||
|  | ||||
| # Install ccache/sccache (do this last, so we get priority in PATH) | ||||
| COPY ./common/install_cache.sh install_cache.sh | ||||
| ENV PATH /opt/cache/bin:$PATH | ||||
| RUN bash ./install_cache.sh && rm install_cache.sh | ||||
|  | ||||
| # Include BUILD_ENVIRONMENT environment variable in image | ||||
| ARG BUILD_ENVIRONMENT | ||||
| ENV BUILD_ENVIRONMENT ${BUILD_ENVIRONMENT} | ||||
|  | ||||
| # Install LLVM dev version (Defined in the pytorch/builder github repository) | ||||
| COPY --from=pytorch/llvm:9.0.1 /opt/llvm /opt/llvm | ||||
|  | ||||
| USER jenkins | ||||
| CMD ["bash"] | ||||
| @ -1,203 +0,0 @@ | ||||
| ARG UBUNTU_VERSION | ||||
|  | ||||
| FROM ubuntu:${UBUNTU_VERSION} | ||||
|  | ||||
| ARG UBUNTU_VERSION | ||||
|  | ||||
| ENV DEBIAN_FRONTEND noninteractive | ||||
|  | ||||
| ARG CLANG_VERSION | ||||
|  | ||||
| # Install common dependencies (so that this step can be cached separately) | ||||
| COPY ./common/install_base.sh install_base.sh | ||||
| RUN bash ./install_base.sh && rm install_base.sh | ||||
|  | ||||
| # Install clang | ||||
| ARG LLVMDEV | ||||
| COPY ./common/install_clang.sh install_clang.sh | ||||
| RUN bash ./install_clang.sh && rm install_clang.sh | ||||
|  | ||||
| # Install user | ||||
| COPY ./common/install_user.sh install_user.sh | ||||
| RUN bash ./install_user.sh && rm install_user.sh | ||||
|  | ||||
| # Install katex | ||||
| ARG KATEX | ||||
| COPY ./common/install_docs_reqs.sh install_docs_reqs.sh | ||||
| RUN bash ./install_docs_reqs.sh && rm install_docs_reqs.sh | ||||
|  | ||||
| # Install conda and other packages (e.g., numpy, pytest) | ||||
| ARG ANACONDA_PYTHON_VERSION | ||||
| ARG CONDA_CMAKE | ||||
| ARG DOCS | ||||
| ENV ANACONDA_PYTHON_VERSION=$ANACONDA_PYTHON_VERSION | ||||
| ENV PATH /opt/conda/envs/py_$ANACONDA_PYTHON_VERSION/bin:/opt/conda/bin:$PATH | ||||
| ENV DOCS=$DOCS | ||||
| COPY requirements-ci.txt requirements-docs.txt /opt/conda/ | ||||
| COPY ./common/install_conda.sh install_conda.sh | ||||
| COPY ./common/common_utils.sh common_utils.sh | ||||
| RUN bash ./install_conda.sh && rm install_conda.sh common_utils.sh /opt/conda/requirements-ci.txt /opt/conda/requirements-docs.txt | ||||
| RUN if [ -n "${UNINSTALL_DILL}" ]; then pip uninstall -y dill; fi | ||||
|  | ||||
| # Install gcc | ||||
| ARG GCC_VERSION | ||||
| COPY ./common/install_gcc.sh install_gcc.sh | ||||
| RUN bash ./install_gcc.sh && rm install_gcc.sh | ||||
|  | ||||
| # Install lcov for C++ code coverage | ||||
| COPY ./common/install_lcov.sh install_lcov.sh | ||||
| RUN  bash ./install_lcov.sh && rm install_lcov.sh | ||||
|  | ||||
| # Install cuda and cudnn | ||||
| ARG CUDA_VERSION | ||||
| RUN wget -q https://raw.githubusercontent.com/pytorch/builder/main/common/install_cuda.sh -O install_cuda.sh | ||||
| RUN bash ./install_cuda.sh ${CUDA_VERSION} && rm install_cuda.sh | ||||
| ENV DESIRED_CUDA ${CUDA_VERSION} | ||||
| ENV PATH /usr/local/nvidia/bin:/usr/local/cuda/bin:$PATH | ||||
|  | ||||
| # (optional) Install UCC | ||||
| ARG UCX_COMMIT | ||||
| ARG UCC_COMMIT | ||||
| ENV UCX_COMMIT $UCX_COMMIT | ||||
| ENV UCC_COMMIT $UCC_COMMIT | ||||
| ENV UCX_HOME /usr | ||||
| ENV UCC_HOME /usr | ||||
| ADD ./common/install_ucc.sh install_ucc.sh | ||||
| RUN if [ -n "${UCX_COMMIT}" ] && [ -n "${UCC_COMMIT}" ]; then bash ./install_ucc.sh; fi | ||||
| RUN rm install_ucc.sh | ||||
|  | ||||
| # (optional) Install protobuf for ONNX | ||||
| ARG PROTOBUF | ||||
| COPY ./common/install_protobuf.sh install_protobuf.sh | ||||
| RUN if [ -n "${PROTOBUF}" ]; then bash ./install_protobuf.sh; fi | ||||
| RUN rm install_protobuf.sh | ||||
| ENV INSTALLED_PROTOBUF ${PROTOBUF} | ||||
|  | ||||
| # (optional) Install database packages like LMDB and LevelDB | ||||
| ARG DB | ||||
| COPY ./common/install_db.sh install_db.sh | ||||
| RUN if [ -n "${DB}" ]; then bash ./install_db.sh; fi | ||||
| RUN rm install_db.sh | ||||
| ENV INSTALLED_DB ${DB} | ||||
|  | ||||
| # (optional) Install vision packages like OpenCV | ||||
| ARG VISION | ||||
| COPY ./common/install_vision.sh ./common/cache_vision_models.sh ./common/common_utils.sh ./ | ||||
| RUN if [ -n "${VISION}" ]; then bash ./install_vision.sh; fi | ||||
| RUN rm install_vision.sh cache_vision_models.sh common_utils.sh | ||||
| ENV INSTALLED_VISION ${VISION} | ||||
|  | ||||
| # (optional) Install Android NDK | ||||
| ARG ANDROID | ||||
| ARG ANDROID_NDK | ||||
| ARG GRADLE_VERSION | ||||
| COPY ./common/install_android.sh ./common/cache_vision_models.sh ./common/common_utils.sh ./ | ||||
| COPY ./android/AndroidManifest.xml AndroidManifest.xml | ||||
| COPY ./android/build.gradle build.gradle | ||||
| RUN if [ -n "${ANDROID}" ]; then bash ./install_android.sh; fi | ||||
| RUN rm install_android.sh cache_vision_models.sh common_utils.sh | ||||
| RUN rm AndroidManifest.xml | ||||
| RUN rm build.gradle | ||||
| ENV INSTALLED_ANDROID ${ANDROID} | ||||
|  | ||||
| # (optional) Install Vulkan SDK | ||||
| ARG VULKAN_SDK_VERSION | ||||
| COPY ./common/install_vulkan_sdk.sh install_vulkan_sdk.sh | ||||
| RUN if [ -n "${VULKAN_SDK_VERSION}" ]; then bash ./install_vulkan_sdk.sh; fi | ||||
| RUN rm install_vulkan_sdk.sh | ||||
|  | ||||
| # (optional) Install swiftshader | ||||
| ARG SWIFTSHADER | ||||
| COPY ./common/install_swiftshader.sh install_swiftshader.sh | ||||
| RUN if [ -n "${SWIFTSHADER}" ]; then bash ./install_swiftshader.sh; fi | ||||
| RUN rm install_swiftshader.sh | ||||
|  | ||||
| # (optional) Install non-default CMake version | ||||
| ARG CMAKE_VERSION | ||||
| COPY ./common/install_cmake.sh install_cmake.sh | ||||
| RUN if [ -n "${CMAKE_VERSION}" ]; then bash ./install_cmake.sh; fi | ||||
| RUN rm install_cmake.sh | ||||
|  | ||||
| # (optional) Install non-default Ninja version | ||||
| ARG NINJA_VERSION | ||||
| COPY ./common/install_ninja.sh install_ninja.sh | ||||
| RUN if [ -n "${NINJA_VERSION}" ]; then bash ./install_ninja.sh; fi | ||||
| RUN rm install_ninja.sh | ||||
|  | ||||
| COPY ./common/install_openssl.sh install_openssl.sh | ||||
| RUN bash ./install_openssl.sh | ||||
| ENV OPENSSL_ROOT_DIR /opt/openssl | ||||
| ENV OPENSSL_DIR /opt/openssl | ||||
| RUN rm install_openssl.sh | ||||
|  | ||||
| ARG INDUCTOR_BENCHMARKS | ||||
| COPY ./common/install_inductor_benchmark_deps.sh install_inductor_benchmark_deps.sh | ||||
| COPY ./common/common_utils.sh common_utils.sh | ||||
| COPY ci_commit_pins/huggingface.txt huggingface.txt | ||||
| COPY ci_commit_pins/timm.txt timm.txt | ||||
| RUN if [ -n "${INDUCTOR_BENCHMARKS}" ]; then bash ./install_inductor_benchmark_deps.sh; fi | ||||
| RUN rm install_inductor_benchmark_deps.sh common_utils.sh timm.txt huggingface.txt | ||||
|  | ||||
| ARG TRITON | ||||
| # Install triton. This needs to be done before sccache because the latter will | ||||
| # try to reach out to S3, which docker build runners don't have access to. | ||||
| COPY ./common/install_triton.sh install_triton.sh | ||||
| COPY ./common/common_utils.sh common_utils.sh | ||||
| COPY ci_commit_pins/triton.txt triton.txt | ||||
| RUN if [ -n "${TRITON}" ]; then bash ./install_triton.sh; fi | ||||
| RUN rm install_triton.sh common_utils.sh triton.txt | ||||
|  | ||||
| ARG EXECUTORCH | ||||
| # Build and install executorch | ||||
| COPY ./common/install_executorch.sh install_executorch.sh | ||||
| COPY ./common/common_utils.sh common_utils.sh | ||||
| COPY ci_commit_pins/executorch.txt executorch.txt | ||||
| RUN if [ -n "${EXECUTORCH}" ]; then bash ./install_executorch.sh; fi | ||||
| RUN rm install_executorch.sh common_utils.sh executorch.txt | ||||
|  | ||||
| ARG ONNX | ||||
| # Install ONNX dependencies | ||||
| COPY ./common/install_onnx.sh ./common/common_utils.sh ./ | ||||
| RUN if [ -n "${ONNX}" ]; then bash ./install_onnx.sh; fi | ||||
| RUN rm install_onnx.sh common_utils.sh | ||||
|  | ||||
| # (optional) Build ACL | ||||
| ARG ACL | ||||
| COPY ./common/install_acl.sh install_acl.sh | ||||
| RUN if [ -n "${ACL}" ]; then bash ./install_acl.sh; fi | ||||
| RUN rm install_acl.sh | ||||
| ENV INSTALLED_ACL ${ACL} | ||||
|  | ||||
| # Install ccache/sccache (do this last, so we get priority in PATH) | ||||
| ARG SKIP_SCCACHE_INSTALL | ||||
| COPY ./common/install_cache.sh install_cache.sh | ||||
| ENV PATH /opt/cache/bin:$PATH | ||||
| RUN if [ -z "${SKIP_SCCACHE_INSTALL}" ]; then bash ./install_cache.sh; fi | ||||
| RUN rm install_cache.sh | ||||
|  | ||||
| # Add jni.h for java host build | ||||
| COPY ./common/install_jni.sh install_jni.sh | ||||
| COPY ./java/jni.h jni.h | ||||
| RUN bash ./install_jni.sh && rm install_jni.sh | ||||
|  | ||||
| # Install Open MPI for CUDA | ||||
| COPY ./common/install_openmpi.sh install_openmpi.sh | ||||
| RUN if [ -n "${CUDA_VERSION}" ]; then bash install_openmpi.sh; fi | ||||
| RUN rm install_openmpi.sh | ||||
|  | ||||
| # Include BUILD_ENVIRONMENT environment variable in image | ||||
| ARG BUILD_ENVIRONMENT | ||||
| ENV BUILD_ENVIRONMENT ${BUILD_ENVIRONMENT} | ||||
|  | ||||
| # Install LLVM dev version (Defined in the pytorch/builder github repository) | ||||
| ARG SKIP_LLVM_SRC_BUILD_INSTALL | ||||
| COPY --from=pytorch/llvm:9.0.1 /opt/llvm /opt/llvm | ||||
| RUN if [ -n "${SKIP_LLVM_SRC_BUILD_INSTALL}" ]; then set -eu; rm -rf /opt/llvm; fi | ||||
|  | ||||
| # AWS specific CUDA build guidance | ||||
| ENV TORCH_CUDA_ARCH_LIST Maxwell | ||||
| ENV TORCH_NVCC_FLAGS "-Xfatbin -compress-all" | ||||
| ENV CUDA_PATH /usr/local/cuda | ||||
|  | ||||
| USER jenkins | ||||
| CMD ["bash"] | ||||
| @ -1,14 +0,0 @@ | ||||
| # Jenkins | ||||
|  | ||||
| The scripts in this directory are the entrypoint for testing the ONNX exporter. | ||||
|  | ||||
| The environment variable `BUILD_ENVIRONMENT` is expected to be set to | ||||
| the build environment you intend to test. It is a hint for the build | ||||
| and test scripts to configure Caffe2 a certain way and include/exclude | ||||
| tests. For Docker images, the value equals the name of the image itself. For | ||||
| example: `py2-cuda9.0-cudnn7-ubuntu16.04`. The Docker images that are | ||||
| built on Jenkins and are used in triggered builds already have this | ||||
| environment variable set in their manifest. Also see | ||||
| `./docker/jenkins/*/Dockerfile` and search for `BUILD_ENVIRONMENT`. | ||||
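|  | ||||
| When running these scripts outside of those images, you would set it | ||||
| yourself first, e.g. (reusing the example image name above): | ||||
|  | ||||
| ``` | ||||
| export BUILD_ENVIRONMENT=py2-cuda9.0-cudnn7-ubuntu16.04 | ||||
| ``` | ||||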
|  | ||||
| Our Jenkins installation is located at https://ci.pytorch.org/jenkins/. | ||||
| @ -1,23 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| set -ex | ||||
|  | ||||
| source "$(dirname "${BASH_SOURCE[0]}")/../pytorch/common_utils.sh" | ||||
|  | ||||
| LOCAL_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd) | ||||
| ROOT_DIR=$(cd "$LOCAL_DIR"/../.. && pwd) | ||||
| TEST_DIR="$ROOT_DIR/test" | ||||
| pytest_reports_dir="${TEST_DIR}/test-reports/python" | ||||
|  | ||||
| # Figure out which Python to use | ||||
| PYTHON="$(which python)" | ||||
| if [[ "${BUILD_ENVIRONMENT}" =~ py((2|3)\.?[0-9]?\.?[0-9]?) ]]; then | ||||
|   PYTHON=$(which "python${BASH_REMATCH[1]}") | ||||
| fi | ||||
|  | ||||
| if [[ "${BUILD_ENVIRONMENT}" == *rocm* ]]; then | ||||
|     # HIP_PLATFORM is auto-detected by hipcc; unset to avoid build errors | ||||
|     unset HIP_PLATFORM | ||||
| fi | ||||
|  | ||||
| mkdir -p "$pytest_reports_dir" || true | ||||
| @ -1,29 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| # shellcheck source=./common.sh | ||||
| source "$(dirname "${BASH_SOURCE[0]}")/common.sh" | ||||
|  | ||||
| # Workaround for dind-rootless userid mapping (https://github.com/pytorch/ci-infra/issues/96) | ||||
| WORKSPACE_ORIGINAL_OWNER_ID=$(stat -c '%u' "/var/lib/jenkins/workspace") | ||||
| cleanup_workspace() { | ||||
|   echo "sudo may print the following warning message that can be ignored. The chown command will still run." | ||||
|   echo "    sudo: setrlimit(RLIMIT_STACK): Operation not permitted" | ||||
|   echo "For more details refer to https://github.com/sudo-project/sudo/issues/42" | ||||
|   sudo chown -R "$WORKSPACE_ORIGINAL_OWNER_ID" /var/lib/jenkins/workspace | ||||
| } | ||||
| # Disable shellcheck SC2064 as we want to parse the original owner immediately. | ||||
| # shellcheck disable=SC2064 | ||||
| trap_add cleanup_workspace EXIT | ||||
| sudo chown -R jenkins /var/lib/jenkins/workspace | ||||
| git config --global --add safe.directory /var/lib/jenkins/workspace | ||||
|  | ||||
| if [[ "$BUILD_ENVIRONMENT" == *onnx* ]]; then | ||||
|   # TODO: This can be removed later once vision is also part of the Docker image | ||||
|   pip install -q --user --no-use-pep517 "git+https://github.com/pytorch/vision.git@$(cat .github/ci_commit_pins/vision.txt)" | ||||
|   # JIT C++ extensions require ninja, so put it into PATH. | ||||
|   export PATH="/var/lib/jenkins/.local/bin:$PATH" | ||||
|   # NB: The ONNX test is fast (~15m), so it's ok to retry it a few more times to avoid any | ||||
|   # flaky issues. We need to bring this into the standard PyTorch run_test eventually; the | ||||
|   # issue is tracked in https://github.com/pytorch/pytorch/issues/98626 | ||||
|   "$ROOT_DIR/scripts/onnx/test.sh" | ||||
| fi | ||||
| @ -1,4 +0,0 @@ | ||||
| source-path=SCRIPTDIR | ||||
|  | ||||
| # we'd like to enable --external-sources here but can't | ||||
| # https://github.com/koalaman/shellcheck/issues/1818 | ||||
| @ -1,42 +0,0 @@ | ||||
| This directory contains scripts for our continuous integration. | ||||
|  | ||||
| One important thing to keep in mind when reading the scripts here is | ||||
| that they are all based off of Docker images, which we build for each of | ||||
| the various system configurations we want to run on Jenkins.  This means | ||||
| it is very easy to run these tests yourself: | ||||
|  | ||||
| 1. Figure out what Docker image you want.  The general template for our | ||||
|    images looks like: | ||||
|    ``registry.pytorch.org/pytorch/pytorch-$BUILD_ENVIRONMENT:$DOCKER_VERSION``, | ||||
|    where ``$BUILD_ENVIRONMENT`` is one of the build environments | ||||
|    enumerated in | ||||
|    [pytorch-dockerfiles](https://github.com/pytorch/pytorch/blob/master/.ci/docker/build.sh). The Dockerfile used by Jenkins can be found under the `.ci` [directory](https://github.com/pytorch/pytorch/blob/master/.ci/docker). | ||||
|  | ||||
| 2. Run ``docker run -it -u jenkins $DOCKER_IMAGE``, clone PyTorch and | ||||
|    run one of the scripts in this directory (see the sketch below). | ||||
|  | ||||
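| A minimal sketch of these two steps (the tag and script path are | ||||
| illustrative; substitute the values you picked in step 1): | ||||
|  | ||||
| ``` | ||||
| docker run -it -u jenkins registry.pytorch.org/pytorch/pytorch-$BUILD_ENVIRONMENT:$DOCKER_VERSION | ||||
| # inside the container: | ||||
| git clone https://github.com/pytorch/pytorch && cd pytorch | ||||
| export BUILD_ENVIRONMENT=...   # same value as in the image name | ||||
| .ci/pytorch/build.sh | ||||
| ``` | ||||
|  | ||||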
| The Docker images are designed so that any "reasonable" build commands | ||||
| will work; if you look in [build.sh](build.sh) you will see that it is a | ||||
| very simple script.  This is intentional.  Idiomatic build instructions | ||||
| should work inside all of our Docker images.  You can tweak the commands | ||||
| however you need (e.g., in case you want to rebuild with DEBUG, or rerun | ||||
| the build with higher verbosity, etc.). | ||||
|  | ||||
| We have to do some work to make this so.  Here is a summary of the | ||||
| mechanisms we use: | ||||
|  | ||||
| - We install binaries to directories like `/usr/local/bin` which | ||||
|   are automatically part of your PATH. | ||||
|  | ||||
| - We add entries to the PATH using Docker ENV variables (so | ||||
|   they apply when you enter Docker) and `/etc/environment` (so they | ||||
|   continue to apply even if you sudo), instead of modifying | ||||
|   `PATH` in our build scripts. | ||||
|  | ||||
| - We use `/etc/ld.so.conf.d` to register directories containing | ||||
|   shared libraries, instead of modifying `LD_LIBRARY_PATH` in our | ||||
|   build scripts. | ||||
|  | ||||
| - We reroute well known paths like `/usr/bin/gcc` to alternate | ||||
|   implementations with `update-alternatives`, instead of setting | ||||
|   `CC` and `CXX` in our build scripts (see the sketch below). | ||||
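|  | ||||
| As a rough sketch of that last mechanism (the compiler version is | ||||
| illustrative): | ||||
|  | ||||
| ``` | ||||
| # register gcc-9 as the "gcc" alternative with priority 100, then select it | ||||
| update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-9 100 | ||||
| update-alternatives --set gcc /usr/bin/gcc-9 | ||||
| ``` | ||||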
| @ -1,34 +0,0 @@ | ||||
| #!/usr/bin/env bash | ||||
| # DO NOT ADD 'set -x', so as not to reveal CircleCI secret context environment variables | ||||
| set -eu -o pipefail | ||||
|  | ||||
| # This script uses linux host toolchain + mobile build options in order to | ||||
| # build & test mobile libtorch without having to setup Android/iOS | ||||
| # toolchain/simulator. | ||||
|  | ||||
| # shellcheck source=./common.sh | ||||
| source "$(dirname "${BASH_SOURCE[0]}")/common.sh" | ||||
| # shellcheck source=./common-build.sh | ||||
| source "$(dirname "${BASH_SOURCE[0]}")/common-build.sh" | ||||
|  | ||||
| # Install torch & torchvision - used to download & trace test model. | ||||
| # Ideally we should use the libtorch built on the PR so that backward | ||||
| # incompatible changes won't break this script - but it will significantly slow | ||||
| # down mobile CI jobs. | ||||
| # Here we install nightly instead of stable so that we have an option to | ||||
| # temporarily skip mobile CI jobs on BC-breaking PRs until they are in nightly. | ||||
| retry pip install --pre torch torchvision \ | ||||
|   -f https://download.pytorch.org/whl/nightly/cpu/torch_nightly.html \ | ||||
|   --progress-bar off | ||||
|  | ||||
| # Run end-to-end process of building mobile library, linking into the predictor | ||||
| # binary, and running forward pass with a real model. | ||||
| if [[ "$BUILD_ENVIRONMENT" == *-mobile-custom-build-static* ]]; then | ||||
|   TEST_CUSTOM_BUILD_STATIC=1 test/mobile/custom_build/build.sh | ||||
| elif [[ "$BUILD_ENVIRONMENT" == *-mobile-lightweight-dispatch* ]]; then | ||||
|   test/mobile/lightweight_dispatch/build.sh | ||||
| else | ||||
|   TEST_DEFAULT_BUILD=1 test/mobile/custom_build/build.sh | ||||
| fi | ||||
|  | ||||
| print_sccache_stats | ||||
| @ -1,393 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| set -ex | ||||
|  | ||||
| # Required environment variable: $BUILD_ENVIRONMENT | ||||
| # (This is set by default in the Docker images we build, so you don't | ||||
| # need to set it yourself.) | ||||
|  | ||||
| # shellcheck source=./common.sh | ||||
| source "$(dirname "${BASH_SOURCE[0]}")/common.sh" | ||||
| # shellcheck source=./common-build.sh | ||||
| source "$(dirname "${BASH_SOURCE[0]}")/common-build.sh" | ||||
|  | ||||
| if [[ "$BUILD_ENVIRONMENT" == *-mobile-*build* ]]; then | ||||
|   exec "$(dirname "${BASH_SOURCE[0]}")/build-mobile.sh" "$@" | ||||
| fi | ||||
|  | ||||
| echo "Python version:" | ||||
| python --version | ||||
|  | ||||
| echo "GCC version:" | ||||
| gcc --version | ||||
|  | ||||
| echo "CMake version:" | ||||
| cmake --version | ||||
|  | ||||
| echo "Environment variables:" | ||||
| env | ||||
|  | ||||
| if [[ "$BUILD_ENVIRONMENT" == *cuda* ]]; then | ||||
|   # Use jemalloc during compilation to mitigate https://github.com/pytorch/pytorch/issues/116289 | ||||
|   export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2 | ||||
|   echo "NVCC version:" | ||||
|   nvcc --version | ||||
| fi | ||||
|  | ||||
| if [[ "$BUILD_ENVIRONMENT" == *cuda11* ]]; then | ||||
|   if [[ "$BUILD_ENVIRONMENT" != *cuda11.3* && "$BUILD_ENVIRONMENT" != *clang* ]]; then | ||||
|     # TODO: there is a linking issue when building with UCC using clang, | ||||
|     # disable it for now and to be fix later. | ||||
|     # TODO: disable UCC temporarily to enable CUDA 12.1 in CI | ||||
|     export USE_UCC=1 | ||||
|     export USE_SYSTEM_UCC=1 | ||||
|   fi | ||||
| fi | ||||
|  | ||||
| if [[ ${BUILD_ENVIRONMENT} == *"parallelnative"* ]]; then | ||||
|   export ATEN_THREADING=NATIVE | ||||
| fi | ||||
|  | ||||
| # Enable LLVM dependency for TensorExpr testing | ||||
| if [[ "$BUILD_ENVIRONMENT" == *rocm* ]]; then | ||||
|   export USE_LLVM=/opt/rocm/llvm | ||||
|   export LLVM_DIR=/opt/rocm/llvm/lib/cmake/llvm | ||||
| else | ||||
|   export USE_LLVM=/opt/llvm | ||||
|   export LLVM_DIR=/opt/llvm/lib/cmake/llvm | ||||
| fi | ||||
|  | ||||
| if [[ "$BUILD_ENVIRONMENT" == *executorch* ]]; then | ||||
|   # To build test_edge_op_registration | ||||
|   export BUILD_EXECUTORCH=ON | ||||
|   export USE_CUDA=0 | ||||
| fi | ||||
|  | ||||
| if ! which conda; then | ||||
|   # In ROCm CIs, we do cross-compilation on build machines with Intel | ||||
|   # CPUs and later run tests on machines with AMD CPUs. | ||||
|   # Also leave out two builds to make sure non-MKLDNN builds still work. | ||||
|   if [[ "$BUILD_ENVIRONMENT" != *rocm* ]]; then | ||||
|     export USE_MKLDNN=1 | ||||
|   else | ||||
|     export USE_MKLDNN=0 | ||||
|   fi | ||||
| else | ||||
|   # CMAKE_PREFIX_PATH precedence: | ||||
|   # 1. $CONDA_PREFIX, if defined. This follows the pytorch official build instructions. | ||||
|   # 2. /opt/conda/envs/py_${ANACONDA_PYTHON_VERSION}, if ANACONDA_PYTHON_VERSION defined. | ||||
|   #    This is for CI, which defines ANACONDA_PYTHON_VERSION but not CONDA_PREFIX. | ||||
|   # 3. $(conda info --base). The fallback value of pytorch official build | ||||
|   #    instructions actually refers to this. | ||||
|   #    Commonly this is /opt/conda/ | ||||
|   if [[ -v CONDA_PREFIX ]]; then | ||||
|     export CMAKE_PREFIX_PATH=${CONDA_PREFIX} | ||||
|   elif [[ -v ANACONDA_PYTHON_VERSION ]]; then | ||||
|     export CMAKE_PREFIX_PATH="/opt/conda/envs/py_${ANACONDA_PYTHON_VERSION}" | ||||
|   else | ||||
|     # already checked by `! which conda` | ||||
|     CMAKE_PREFIX_PATH="$(conda info --base)" | ||||
|     export CMAKE_PREFIX_PATH | ||||
|   fi | ||||
|  | ||||
|   # Workaround required for MKL library linkage | ||||
|   # https://github.com/pytorch/pytorch/issues/119557 | ||||
|   if [ "$ANACONDA_PYTHON_VERSION" = "3.12" ]; then | ||||
|     export CMAKE_LIBRARY_PATH="/opt/conda/envs/py_$ANACONDA_PYTHON_VERSION/lib/" | ||||
|     export CMAKE_INCLUDE_PATH="/opt/conda/envs/py_$ANACONDA_PYTHON_VERSION/include/" | ||||
|   fi | ||||
| fi | ||||
|  | ||||
| if [[ "$BUILD_ENVIRONMENT" == *aarch64* ]]; then | ||||
|   export USE_MKLDNN=1 | ||||
|   export USE_MKLDNN_ACL=1 | ||||
|   export ACL_ROOT_DIR=/ComputeLibrary | ||||
| fi | ||||
|  | ||||
| if [[ "$BUILD_ENVIRONMENT" == *libtorch* ]]; then | ||||
|   POSSIBLE_JAVA_HOMES=() | ||||
|   POSSIBLE_JAVA_HOMES+=(/usr/local) | ||||
|   POSSIBLE_JAVA_HOMES+=(/usr/lib/jvm/java-8-openjdk-amd64) | ||||
|   POSSIBLE_JAVA_HOMES+=(/Library/Java/JavaVirtualMachines/*.jdk/Contents/Home) | ||||
|   # Add the Windows-specific JNI | ||||
|   POSSIBLE_JAVA_HOMES+=("$PWD/.circleci/windows-jni/") | ||||
|   for JH in "${POSSIBLE_JAVA_HOMES[@]}" ; do | ||||
|     if [[ -e "$JH/include/jni.h" ]] ; then | ||||
|       # Stop searching if we hit the Windows-specific JNI path but aren't on Windows | ||||
|       if [[ "$JH" == "$PWD/.circleci/windows-jni/" && "$OSTYPE" != "msys" ]] ; then | ||||
|         break | ||||
|       fi | ||||
|       echo "Found jni.h under $JH" | ||||
|       export JAVA_HOME="$JH" | ||||
|       export BUILD_JNI=ON | ||||
|       break | ||||
|     fi | ||||
|   done | ||||
|   if [ -z "$JAVA_HOME" ]; then | ||||
|     echo "Did not find jni.h" | ||||
|   fi | ||||
| fi | ||||
|  | ||||
| # Use special scripts for Android builds | ||||
| if [[ "${BUILD_ENVIRONMENT}" == *-android* ]]; then | ||||
|   export ANDROID_NDK=/opt/ndk | ||||
|   build_args=() | ||||
|   if [[ "${BUILD_ENVIRONMENT}" == *-arm-v7a* ]]; then | ||||
|     build_args+=("-DANDROID_ABI=armeabi-v7a") | ||||
|   elif [[ "${BUILD_ENVIRONMENT}" == *-arm-v8a* ]]; then | ||||
|     build_args+=("-DANDROID_ABI=arm64-v8a") | ||||
|   elif [[ "${BUILD_ENVIRONMENT}" == *-x86_32* ]]; then | ||||
|     build_args+=("-DANDROID_ABI=x86") | ||||
|   elif [[ "${BUILD_ENVIRONMENT}" == *-x86_64* ]]; then | ||||
|     build_args+=("-DANDROID_ABI=x86_64") | ||||
|   fi | ||||
|   if [[ "${BUILD_ENVIRONMENT}" == *vulkan* ]]; then | ||||
|     build_args+=("-DUSE_VULKAN=ON") | ||||
|   fi | ||||
|   build_args+=("-DUSE_LITE_INTERPRETER_PROFILER=OFF") | ||||
|   exec ./scripts/build_android.sh "${build_args[@]}" "$@" | ||||
| fi | ||||
|  | ||||
| if [[ "$BUILD_ENVIRONMENT" != *android* && "$BUILD_ENVIRONMENT" == *vulkan* ]]; then | ||||
|   export USE_VULKAN=1 | ||||
|   # shellcheck disable=SC1091 | ||||
|   source /var/lib/jenkins/vulkansdk/setup-env.sh | ||||
| fi | ||||
|  | ||||
| if [[ "$BUILD_ENVIRONMENT" == *rocm* ]]; then | ||||
|   # hcc used to run out of memory, silently exiting without stopping | ||||
|   # the build process, leaving undefined symbols in the shared lib, | ||||
|   # causing undefined symbol errors when later running tests. | ||||
|   # We used to set MAX_JOBS to 4 to avoid this, but it is no longer an issue. | ||||
|   if [ -z "$MAX_JOBS" ]; then | ||||
|     export MAX_JOBS=$(($(nproc) - 1)) | ||||
|   fi | ||||
|  | ||||
|   if [[ -n "$CI" && -z "$PYTORCH_ROCM_ARCH" ]]; then | ||||
|       # Set ROCM_ARCH to gfx906 for CI builds, if user doesn't override. | ||||
|       echo "Limiting PYTORCH_ROCM_ARCH to gfx906 for CI builds" | ||||
|       export PYTORCH_ROCM_ARCH="gfx906" | ||||
|   fi | ||||
|  | ||||
|   # hipify sources | ||||
|   python tools/amd_build/build_amd.py | ||||
| fi | ||||
|  | ||||
| if [[ "$BUILD_ENVIRONMENT" == *xpu* ]]; then | ||||
|   # shellcheck disable=SC1091 | ||||
|   source /opt/intel/oneapi/compiler/latest/env/vars.sh | ||||
|   export USE_XPU=1 | ||||
| fi | ||||
|  | ||||
| # sccache will fail for CUDA builds if all cores are used for compiling | ||||
| # gcc 7 with sccache seems to have an intermittent OOM issue if all cores are used | ||||
| if [ -z "$MAX_JOBS" ]; then | ||||
|   if { [[ "$BUILD_ENVIRONMENT" == *cuda* ]] || [[ "$BUILD_ENVIRONMENT" == *gcc7* ]]; } && which sccache > /dev/null; then | ||||
|     export MAX_JOBS=$(($(nproc) - 1)) | ||||
|   fi | ||||
| fi | ||||
|  | ||||
| # TORCH_CUDA_ARCH_LIST must be passed in as an environment variable | ||||
| if [[ "$BUILD_ENVIRONMENT" == *cuda* && -z "$TORCH_CUDA_ARCH_LIST" ]]; then | ||||
|   echo "TORCH_CUDA_ARCH_LIST must be defined" | ||||
|   exit 1 | ||||
| fi | ||||
|  | ||||
| # We only build FlashAttention files for CUDA compute capability 8.0+, and they | ||||
| # require large amounts of memory to build and will OOM | ||||
| if [[ "$BUILD_ENVIRONMENT" == *cuda* ]] && [[ "$TORCH_CUDA_ARCH_LIST" == *"8.6"* || "$TORCH_CUDA_ARCH_LIST" == *"8.0"* ]]; then | ||||
|   echo "WARNING: FlashAttention files require large amounts of memory to build and will OOM" | ||||
|   echo "Setting MAX_JOBS=(nproc-2)/3 to reduce memory usage" | ||||
|   export MAX_JOBS="$(( $(nproc --ignore=2) / 3 ))" | ||||
| fi | ||||
|  | ||||
| if [[ "${BUILD_ENVIRONMENT}" == *clang* ]]; then | ||||
|   export CC=clang | ||||
|   export CXX=clang++ | ||||
| fi | ||||
|  | ||||
| if [[ "$BUILD_ENVIRONMENT" == *-clang*-asan* ]]; then | ||||
|   export LDSHARED="clang --shared" | ||||
|   export USE_CUDA=0 | ||||
|   export USE_ASAN=1 | ||||
|   export UBSAN_FLAGS="-fno-sanitize-recover=all;-fno-sanitize=float-divide-by-zero;-fno-sanitize=float-cast-overflow" | ||||
|   unset USE_LLVM | ||||
| fi | ||||
|  | ||||
| if [[ "${BUILD_ENVIRONMENT}" == *no-ops* ]]; then | ||||
|   export USE_PER_OPERATOR_HEADERS=0 | ||||
| fi | ||||
|  | ||||
| if [[ "${BUILD_ENVIRONMENT}" == *-pch* ]]; then | ||||
|     export USE_PRECOMPILED_HEADERS=1 | ||||
| fi | ||||
|  | ||||
| if [[ "${BUILD_ENVIRONMENT}" == *linux-focal-py3.7-gcc7-build*  ]]; then | ||||
|   export USE_GLOO_WITH_OPENSSL=ON | ||||
| fi | ||||
|  | ||||
| if [[ "${BUILD_ENVIRONMENT}" != *android* && "${BUILD_ENVIRONMENT}" != *cuda* ]]; then | ||||
|   export BUILD_STATIC_RUNTIME_BENCHMARK=ON | ||||
| fi | ||||
|  | ||||
| # Do not change workspace permissions for ROCm CI jobs | ||||
| # as it can leave workspace with bad permissions for cancelled jobs | ||||
| if [[ "$BUILD_ENVIRONMENT" != *rocm* ]]; then | ||||
|   # Workaround for dind-rootless userid mapping (https://github.com/pytorch/ci-infra/issues/96) | ||||
|   WORKSPACE_ORIGINAL_OWNER_ID=$(stat -c '%u' "/var/lib/jenkins/workspace") | ||||
|   cleanup_workspace() { | ||||
|     echo "sudo may print the following warning message that can be ignored. The chown command will still run." | ||||
|     echo "    sudo: setrlimit(RLIMIT_STACK): Operation not permitted" | ||||
|     echo "For more details refer to https://github.com/sudo-project/sudo/issues/42" | ||||
|     sudo chown -R "$WORKSPACE_ORIGINAL_OWNER_ID" /var/lib/jenkins/workspace | ||||
|   } | ||||
|   # Disable shellcheck SC2064 as we want the original owner expanded immediately. | ||||
|   # shellcheck disable=SC2064 | ||||
|   trap_add cleanup_workspace EXIT | ||||
|   sudo chown -R jenkins /var/lib/jenkins/workspace | ||||
|   git config --global --add safe.directory /var/lib/jenkins/workspace | ||||
| fi | ||||
|  | ||||
| if [[ "$BUILD_ENVIRONMENT" == *-bazel-* ]]; then | ||||
|   set -e | ||||
|  | ||||
|   get_bazel | ||||
|   install_sccache_nvcc_for_bazel | ||||
|  | ||||
|   # Leave 1 CPU free and use only up to 80% of memory to reduce the chance of | ||||
|   # crashing the runner | ||||
|   BAZEL_MEM_LIMIT="--local_ram_resources=HOST_RAM*.8" | ||||
|   BAZEL_CPU_LIMIT="--local_cpu_resources=HOST_CPUS-1" | ||||
|  | ||||
|   if [[ "$CUDA_VERSION" == "cpu" ]]; then | ||||
|     # Build torch, the Python module, and tests for CPU-only | ||||
|     tools/bazel build --config=no-tty "${BAZEL_MEM_LIMIT}" "${BAZEL_CPU_LIMIT}" --config=cpu-only :torch :torch/_C.so :all_tests | ||||
|   else | ||||
|     tools/bazel build --config=no-tty "${BAZEL_MEM_LIMIT}" "${BAZEL_CPU_LIMIT}" //... | ||||
|   fi | ||||
| else | ||||
|   # check that setup.py would fail with bad arguments | ||||
|   echo "The next three invocations are expected to fail with invalid command error messages." | ||||
|   ( ! get_exit_code python setup.py bad_argument ) | ||||
|   ( ! get_exit_code python setup.py clean] ) | ||||
|   ( ! get_exit_code python setup.py clean bad_argument ) | ||||
|  | ||||
|   if [[ "$BUILD_ENVIRONMENT" != *libtorch* ]]; then | ||||
|     # rocm builds fail when WERROR=1 | ||||
|     # XLA test build fails when WERROR=1 | ||||
|     # set only when building other architectures | ||||
|     # or building non-XLA tests. | ||||
|     if [[ "$BUILD_ENVIRONMENT" != *rocm*  && | ||||
|           "$BUILD_ENVIRONMENT" != *xla* ]]; then | ||||
|       if [[ "$BUILD_ENVIRONMENT" != *py3.8* ]]; then | ||||
|         # Install the NumPy 2.0 release candidate for builds, which should | ||||
|         # be backward compatible with NumPy 1.X | ||||
|         python -mpip install --pre numpy==2.0.0rc1 | ||||
|       fi | ||||
|       WERROR=1 python setup.py bdist_wheel | ||||
|     else | ||||
|       if [[ "$BUILD_ENVIRONMENT" == *xla* ]]; then | ||||
|         source .ci/pytorch/install_cache_xla.sh | ||||
|       fi | ||||
|       python setup.py bdist_wheel | ||||
|     fi | ||||
|     pip_install_whl "$(echo dist/*.whl)" | ||||
|  | ||||
|     # TODO: I'm not sure why, but somehow we lose verbose commands | ||||
|     set -x | ||||
|  | ||||
|     assert_git_not_dirty | ||||
|     # Copy ninja build logs to dist folder | ||||
|     mkdir -p dist | ||||
|     if [ -f build/.ninja_log ]; then | ||||
|       cp build/.ninja_log dist | ||||
|     fi | ||||
|  | ||||
|     if [[ "$BUILD_ENVIRONMENT" == *rocm* ]]; then | ||||
|       # remove sccache wrappers post-build; runtime compilation of MIOpen kernels does not yet fully support them | ||||
|       sudo rm -f /opt/cache/bin/cc | ||||
|       sudo rm -f /opt/cache/bin/c++ | ||||
|       sudo rm -f /opt/cache/bin/gcc | ||||
|       sudo rm -f /opt/cache/bin/g++ | ||||
|       pushd /opt/rocm/llvm/bin | ||||
|       if [[ -d original ]]; then | ||||
|         sudo mv original/clang . | ||||
|         sudo mv original/clang++ . | ||||
|       fi | ||||
|       sudo rm -rf original | ||||
|       popd | ||||
|     fi | ||||
|  | ||||
|     CUSTOM_TEST_ARTIFACT_BUILD_DIR=${CUSTOM_TEST_ARTIFACT_BUILD_DIR:-"build/custom_test_artifacts"} | ||||
|     CUSTOM_TEST_USE_ROCM=$([[ "$BUILD_ENVIRONMENT" == *rocm* ]] && echo "ON" || echo "OFF") | ||||
|     CUSTOM_TEST_MODULE_PATH="${PWD}/cmake/public" | ||||
|     mkdir -pv "${CUSTOM_TEST_ARTIFACT_BUILD_DIR}" | ||||
|  | ||||
|     # Build custom operator tests. | ||||
|     CUSTOM_OP_BUILD="${CUSTOM_TEST_ARTIFACT_BUILD_DIR}/custom-op-build" | ||||
|     CUSTOM_OP_TEST="$PWD/test/custom_operator" | ||||
|     python --version | ||||
|     SITE_PACKAGES="$(python -c 'from distutils.sysconfig import get_python_lib; print(get_python_lib())')" | ||||
|     mkdir -p "$CUSTOM_OP_BUILD" | ||||
|     pushd "$CUSTOM_OP_BUILD" | ||||
|     cmake "$CUSTOM_OP_TEST" -DCMAKE_PREFIX_PATH="$SITE_PACKAGES/torch" -DPython_EXECUTABLE="$(which python)" \ | ||||
|           -DCMAKE_MODULE_PATH="$CUSTOM_TEST_MODULE_PATH" -DUSE_ROCM="$CUSTOM_TEST_USE_ROCM" | ||||
|     make VERBOSE=1 | ||||
|     popd | ||||
|     assert_git_not_dirty | ||||
|  | ||||
|     # Build jit hook tests | ||||
|     JIT_HOOK_BUILD="${CUSTOM_TEST_ARTIFACT_BUILD_DIR}/jit-hook-build" | ||||
|     JIT_HOOK_TEST="$PWD/test/jit_hooks" | ||||
|     python --version | ||||
|     SITE_PACKAGES="$(python -c 'from distutils.sysconfig import get_python_lib; print(get_python_lib())')" | ||||
|     mkdir -p "$JIT_HOOK_BUILD" | ||||
|     pushd "$JIT_HOOK_BUILD" | ||||
|     cmake "$JIT_HOOK_TEST" -DCMAKE_PREFIX_PATH="$SITE_PACKAGES/torch" -DPython_EXECUTABLE="$(which python)" \ | ||||
|           -DCMAKE_MODULE_PATH="$CUSTOM_TEST_MODULE_PATH" -DUSE_ROCM="$CUSTOM_TEST_USE_ROCM" | ||||
|     make VERBOSE=1 | ||||
|     popd | ||||
|     assert_git_not_dirty | ||||
|  | ||||
|     # Build custom backend tests. | ||||
|     CUSTOM_BACKEND_BUILD="${CUSTOM_TEST_ARTIFACT_BUILD_DIR}/custom-backend-build" | ||||
|     CUSTOM_BACKEND_TEST="$PWD/test/custom_backend" | ||||
|     python --version | ||||
|     mkdir -p "$CUSTOM_BACKEND_BUILD" | ||||
|     pushd "$CUSTOM_BACKEND_BUILD" | ||||
|     cmake "$CUSTOM_BACKEND_TEST" -DCMAKE_PREFIX_PATH="$SITE_PACKAGES/torch" -DPython_EXECUTABLE="$(which python)" \ | ||||
|           -DCMAKE_MODULE_PATH="$CUSTOM_TEST_MODULE_PATH" -DUSE_ROCM="$CUSTOM_TEST_USE_ROCM" | ||||
|     make VERBOSE=1 | ||||
|     popd | ||||
|     assert_git_not_dirty | ||||
|   else | ||||
|     # Test no-Python build | ||||
|     echo "Building libtorch" | ||||
|  | ||||
|     # This is an attempt to mitigate flaky libtorch build OOM errors. By default, build | ||||
|     # parallelization is set to the number of CPUs minus 2. So, let's try a more | ||||
|     # conservative value here. A 4xlarge has 16 CPUs. | ||||
|     MAX_JOBS=$(nproc --ignore=4) | ||||
|     export MAX_JOBS | ||||
|  | ||||
|     # NB: Install outside of source directory (at the same level as the root | ||||
|     # pytorch folder) so that it doesn't get cleaned away prior to docker push. | ||||
|     BUILD_LIBTORCH_PY=$PWD/tools/build_libtorch.py | ||||
|     mkdir -p ../cpp-build/caffe2 | ||||
|     pushd ../cpp-build/caffe2 | ||||
|     WERROR=1 VERBOSE=1 DEBUG=1 python "$BUILD_LIBTORCH_PY" | ||||
|     popd | ||||
|   fi | ||||
| fi | ||||
|  | ||||
| if [[ "$BUILD_ENVIRONMENT" != *libtorch* && "$BUILD_ENVIRONMENT" != *bazel* ]]; then | ||||
|   # export test times so that potential sharded tests that'll branch off this build will use consistent data | ||||
|   # don't do this for libtorch as libtorch is C++ only and thus won't have python tests run on its build | ||||
|   python tools/stats/export_test_times.py | ||||
| fi | ||||
|  | ||||
| # snadampal: skipping this until sccache support is added for aarch64 | ||||
| # https://github.com/pytorch/pytorch/issues/121559 | ||||
| if [[ "$BUILD_ENVIRONMENT" != *aarch64* ]]; then | ||||
|   print_sccache_stats | ||||
| fi | ||||
| @ -1,58 +0,0 @@ | ||||
| #!/usr/bin/env bash | ||||
|  | ||||
| # This script can also be used to test whether your diff changes any codegen output. | ||||
| # | ||||
| # Run it before and after your change: | ||||
| #   .ci/pytorch/codegen-test.sh <baseline_output_dir> | ||||
| #   .ci/pytorch/codegen-test.sh <test_output_dir> | ||||
| # | ||||
| # Then run diff to compare the generated files: | ||||
| #   diff -Naur <baseline_output_dir> <test_output_dir> | ||||
|  | ||||
| set -eu -o pipefail | ||||
|  | ||||
| if [ "$#" -eq 0 ]; then | ||||
|   # shellcheck source=./common.sh | ||||
|   source "$(dirname "${BASH_SOURCE[0]}")/common.sh" | ||||
|   OUT="$(dirname "${BASH_SOURCE[0]}")/../../codegen_result" | ||||
| else | ||||
|   OUT=$1 | ||||
| fi | ||||
|  | ||||
| set -x | ||||
|  | ||||
| rm -rf "$OUT" | ||||
|  | ||||
| # aten codegen | ||||
| python -m torchgen.gen \ | ||||
|   -s aten/src/ATen \ | ||||
|   -d "$OUT"/torch/share/ATen | ||||
|  | ||||
| # torch codegen | ||||
| python -m tools.setup_helpers.generate_code \ | ||||
|   --install_dir "$OUT" | ||||
|  | ||||
| # pyi codegen | ||||
| mkdir -p "$OUT"/pyi/torch/_C | ||||
| mkdir -p "$OUT"/pyi/torch/nn | ||||
| python -m tools.pyi.gen_pyi \ | ||||
|   --native-functions-path aten/src/ATen/native/native_functions.yaml \ | ||||
|   --tags-path aten/src/ATen/native/tags.yaml \ | ||||
|   --deprecated-functions-path tools/autograd/deprecated.yaml \ | ||||
|   --out "$OUT"/pyi | ||||
|  | ||||
| # autograd codegen (called by torch codegen but can run independently) | ||||
| python -m tools.autograd.gen_autograd \ | ||||
|   "$OUT"/torch/share/ATen/Declarations.yaml \ | ||||
|   aten/src/ATen/native/native_functions.yaml \ | ||||
|   aten/src/ATen/native/tags.yaml \ | ||||
|   "$OUT"/autograd \ | ||||
|   tools/autograd | ||||
|  | ||||
| # annotated_fn_args codegen (called by torch codegen but can run independently) | ||||
| mkdir -p "$OUT"/annotated_fn_args | ||||
| python -m tools.autograd.gen_annotated_fn_args \ | ||||
|   aten/src/ATen/native/native_functions.yaml \ | ||||
|   aten/src/ATen/native/tags.yaml \ | ||||
|   "$OUT"/annotated_fn_args \ | ||||
|   tools/autograd | ||||
| @ -1,59 +0,0 @@ | ||||
| #!/bin/bash | ||||
| # Required environment variables: | ||||
| #   $BUILD_ENVIRONMENT (should be set by your Docker image) | ||||
|  | ||||
| if [[ "$BUILD_ENVIRONMENT" != *win-* ]]; then | ||||
|     # Save the absolute path in case later we chdir (as occurs in the gpu perf test) | ||||
|     script_dir="$( cd "$(dirname "${BASH_SOURCE[0]}")" || exit ; pwd -P )" | ||||
|  | ||||
|     if which sccache > /dev/null; then | ||||
|         # Save sccache logs to file | ||||
|         sccache --stop-server > /dev/null  2>&1 || true | ||||
|         rm -f ~/sccache_error.log || true | ||||
|  | ||||
|         function sccache_epilogue() { | ||||
|             echo "::group::Sccache Compilation Log" | ||||
|             echo '=================== sccache compilation log ===================' | ||||
|             python "$script_dir/print_sccache_log.py" ~/sccache_error.log 2>/dev/null || true | ||||
|             echo '=========== If your build fails, please take a look at the log above for possible reasons ===========' | ||||
|             sccache --show-stats | ||||
|             sccache --stop-server || true | ||||
|             echo "::endgroup::" | ||||
|         } | ||||
|  | ||||
|         # Register the function here so that the error log can be printed even when | ||||
|         # sccache fails to start, e.g. on a timeout error | ||||
|         trap_add sccache_epilogue EXIT | ||||
|  | ||||
|         if [[ -n "${SKIP_SCCACHE_INITIALIZATION:-}" ]]; then | ||||
|             # sccache --start-server seems to hang forever on self hosted runners for GHA | ||||
|             # so let's just go ahead and skip the --start-server altogether since it seems | ||||
|             # as though sccache still gets used even when the sccache server isn't started | ||||
|             # explicitly | ||||
|             echo "Skipping sccache server initialization, setting environment variables" | ||||
|             export SCCACHE_IDLE_TIMEOUT=0 | ||||
|             export SCCACHE_ERROR_LOG=~/sccache_error.log | ||||
|             export RUST_LOG=sccache::server=error | ||||
|         elif [[ "${BUILD_ENVIRONMENT}" == *rocm* ]]; then | ||||
|             SCCACHE_ERROR_LOG=~/sccache_error.log SCCACHE_IDLE_TIMEOUT=0 sccache --start-server | ||||
|         else | ||||
|             # increasing SCCACHE_IDLE_TIMEOUT so that extension_backend_test.cpp can build after this PR: | ||||
|             # https://github.com/pytorch/pytorch/pull/16645 | ||||
|             SCCACHE_ERROR_LOG=~/sccache_error.log SCCACHE_IDLE_TIMEOUT=0 RUST_LOG=sccache::server=error sccache --start-server | ||||
|         fi | ||||
|  | ||||
|         # Report sccache stats for easier debugging. It's ok if this command | ||||
|         # times out and fails on macOS | ||||
|         sccache --zero-stats || true | ||||
|     fi | ||||
|  | ||||
|     if which ccache > /dev/null; then | ||||
|         # Report ccache stats for easier debugging | ||||
|         ccache --zero-stats | ||||
|         ccache --show-stats | ||||
|         function ccache_epilogue() { | ||||
|             ccache --show-stats | ||||
|         } | ||||
|         trap_add ccache_epilogue EXIT | ||||
|     fi | ||||
| fi | ||||
| @ -1,24 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| # Common setup for all Jenkins scripts | ||||
| # shellcheck source=./common_utils.sh | ||||
| source "$(dirname "${BASH_SOURCE[0]}")/common_utils.sh" | ||||
| set -ex | ||||
|  | ||||
| # Required environment variables: | ||||
| #   $BUILD_ENVIRONMENT (should be set by your Docker image) | ||||
|  | ||||
| # Figure out which Python to use for ROCm | ||||
| if [[ "${BUILD_ENVIRONMENT}" == *rocm* ]]; then | ||||
|   # HIP_PLATFORM is auto-detected by hipcc; unset to avoid build errors | ||||
|   unset HIP_PLATFORM | ||||
|   export PYTORCH_TEST_WITH_ROCM=1 | ||||
|   # temporary to locate some kernel issues on the CI nodes | ||||
|   export HSAKMT_DEBUG_LEVEL=4 | ||||
|   # improve rccl performance for distributed tests | ||||
|   export HSA_FORCE_FINE_GRAIN_PCIE=1 | ||||
| fi | ||||
|  | ||||
| # TODO: Re-enable libtorch testing for macOS, see https://github.com/pytorch/pytorch/issues/62598 | ||||
| # shellcheck disable=SC2034 | ||||
| BUILD_TEST_LIBTORCH=0 | ||||
| @ -1,240 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| # Common util **functions** that can be sourced in other scripts. | ||||
|  | ||||
| # note: printf is used instead of echo to avoid backslash | ||||
| # processing and to properly handle values that begin with a '-'. | ||||
|  | ||||
| log() { printf '%s\n' "$*"; } | ||||
| error() { log "ERROR: $*" >&2; } | ||||
| fatal() { error "$@"; exit 1; } | ||||
|  | ||||
| retry () { | ||||
|     "$@" || (sleep 10 && "$@") || (sleep 20 && "$@") || (sleep 40 && "$@") | ||||
| } | ||||
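|  | ||||
| # Example (illustrative): retry a flaky network call; attempts are spaced | ||||
| # 10, 20, and 40 seconds apart before the failure finally propagates: | ||||
| #   retry pip install --progress-bar off some-package | ||||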
|  | ||||
| # compositional trap taken from https://stackoverflow.com/a/7287873/23845 | ||||
| # appends a command to a trap | ||||
| # | ||||
| # - 1st arg:  code to add | ||||
| # - remaining args:  names of traps to modify | ||||
| # | ||||
| trap_add() { | ||||
|     trap_add_cmd=$1; shift || fatal "${FUNCNAME[0]} usage error" | ||||
|     for trap_add_name in "$@"; do | ||||
|         trap -- "$( | ||||
|             # helper fn to get existing trap command from output | ||||
|             # of trap -p | ||||
|             extract_trap_cmd() { printf '%s\n' "$3"; } | ||||
|             # print existing trap command with newline | ||||
|             eval "extract_trap_cmd $(trap -p "${trap_add_name}")" | ||||
|             # print the new trap command | ||||
|             printf '%s\n' "${trap_add_cmd}" | ||||
|         )" "${trap_add_name}" \ | ||||
|             || fatal "unable to add to trap ${trap_add_name}" | ||||
|     done | ||||
| } | ||||
| # set the trace attribute for the above function.  this is | ||||
| # required to modify DEBUG or RETURN traps because functions don't | ||||
| # inherit them unless the trace attribute is set | ||||
| declare -f -t trap_add | ||||
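|  | ||||
| # Example (illustrative): both handlers run on EXIT, in insertion order, | ||||
| # because each call appends to (rather than replaces) the existing trap: | ||||
| #   trap_add 'echo cleanup step 1' EXIT | ||||
| #   trap_add 'echo cleanup step 2' EXIT | ||||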
|  | ||||
| function assert_git_not_dirty() { | ||||
|     # TODO: we should add an option to `build_amd.py` that reverts the repo to | ||||
|     #       an unmodified state. | ||||
|     if [[ "$BUILD_ENVIRONMENT" != *rocm* ]] && [[ "$BUILD_ENVIRONMENT" != *xla* ]] ; then | ||||
|         git_status=$(git status --porcelain | grep -v '?? third_party' || true) | ||||
|         if [[ $git_status ]]; then | ||||
|             echo "Build left local git repository checkout dirty" | ||||
|             echo "git status --porcelain:" | ||||
|             echo "${git_status}" | ||||
|             exit 1 | ||||
|         fi | ||||
|     fi | ||||
| } | ||||
|  | ||||
| function pip_install_whl() { | ||||
|   # This is used to install PyTorch and other build artifact wheels locally | ||||
|   # without using any network connection | ||||
|   python3 -mpip install --no-index --no-deps "$@" | ||||
| } | ||||
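|  | ||||
| # e.g. build.sh calls pip_install_whl "$(echo dist/*.whl)" to install the | ||||
| # just-built wheel without touching the network. | ||||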
|  | ||||
| function pip_install() { | ||||
|   # retry 3 times | ||||
|   # old versions of pip don't have the "--progress-bar" flag | ||||
|   pip install --progress-bar off "$@" || pip install --progress-bar off "$@" || pip install --progress-bar off "$@" ||\ | ||||
|   pip install "$@" || pip install "$@" || pip install "$@" | ||||
| } | ||||
|  | ||||
| function pip_uninstall() { | ||||
|   # uninstall 2 times | ||||
|   pip uninstall -y "$@" || pip uninstall -y "$@" | ||||
| } | ||||
|  | ||||
| function get_exit_code() { | ||||
|   set +e | ||||
|   "$@" | ||||
|   retcode=$? | ||||
|   set -e | ||||
|   return $retcode | ||||
| } | ||||
|  | ||||
| function get_bazel() { | ||||
|   # Download and use the cross-platform, dependency-free Python | ||||
|   # version of Bazelisk to fetch the platform specific version of | ||||
|   # Bazel to use from .bazelversion. | ||||
|   retry curl --location --output tools/bazel \ | ||||
|     https://raw.githubusercontent.com/bazelbuild/bazelisk/v1.16.0/bazelisk.py | ||||
|   shasum --algorithm=1 --check \ | ||||
|     <(echo 'd4369c3d293814d3188019c9f7527a948972d9f8  tools/bazel') | ||||
|   chmod u+x tools/bazel | ||||
| } | ||||
|  | ||||
| # This function is bazel-specific because of a bug | ||||
| # in bazel that requires some special path massaging | ||||
| # as a workaround. See | ||||
| # https://github.com/bazelbuild/bazel/issues/10167 | ||||
| function install_sccache_nvcc_for_bazel() { | ||||
|   sudo mv /usr/local/cuda/bin/nvcc /usr/local/cuda/bin/nvcc-real | ||||
|  | ||||
|   # Write the `/usr/local/cuda/bin/nvcc` wrapper script | ||||
|   cat << EOF | sudo tee /usr/local/cuda/bin/nvcc | ||||
| #!/bin/sh | ||||
| if [ \$(env -u LD_PRELOAD ps -p \$PPID -o comm=) != sccache ]; then | ||||
|   exec sccache /usr/local/cuda/bin/nvcc "\$@" | ||||
| else | ||||
|   exec external/local_cuda/cuda/bin/nvcc-real "\$@" | ||||
| fi | ||||
| EOF | ||||
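|   # How the wrapper avoids infinite recursion: a plain invocation has a | ||||
|   # non-sccache parent, so it re-execs itself under sccache; sccache then | ||||
|   # calls the wrapper again, whose parent now is sccache, so it falls | ||||
|   # through to the real compiler (nvcc-real under Bazel's external tree). | ||||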
|  | ||||
|   sudo chmod +x /usr/local/cuda/bin/nvcc | ||||
| } | ||||
|  | ||||
| function install_monkeytype { | ||||
|   # Install MonkeyType | ||||
|   pip_install MonkeyType | ||||
| } | ||||
|  | ||||
|  | ||||
| function get_pinned_commit() { | ||||
|   cat .github/ci_commit_pins/"${1}".txt | ||||
| } | ||||
|  | ||||
| function install_torchaudio() { | ||||
|   local commit | ||||
|   commit=$(get_pinned_commit audio) | ||||
|   if [[ "$1" == "cuda" ]]; then | ||||
|     # TODO: This would be better passed as a parameter from the _linux-test | ||||
|     # workflow so that it can be consistent with what is set in the build | ||||
|     TORCH_CUDA_ARCH_LIST="8.0;8.6" pip_install --no-use-pep517 --user "git+https://github.com/pytorch/audio.git@${commit}" | ||||
|   else | ||||
|     pip_install --no-use-pep517 --user "git+https://github.com/pytorch/audio.git@${commit}" | ||||
|   fi | ||||
|  | ||||
| } | ||||
|  | ||||
| function install_torchtext() { | ||||
|   local data_commit | ||||
|   local text_commit | ||||
|   data_commit=$(get_pinned_commit data) | ||||
|   text_commit=$(get_pinned_commit text) | ||||
|   pip_install --no-use-pep517 --user "git+https://github.com/pytorch/data.git@${data_commit}" | ||||
|   pip_install --no-use-pep517 --user "git+https://github.com/pytorch/text.git@${text_commit}" | ||||
| } | ||||
|  | ||||
| function install_torchvision() { | ||||
|   local orig_preload | ||||
|   local commit | ||||
|   commit=$(get_pinned_commit vision) | ||||
|   orig_preload=${LD_PRELOAD} | ||||
|   if [ -n "${LD_PRELOAD}" ]; then | ||||
|     # Silence dlerror to work-around glibc ASAN bug, see https://sourceware.org/bugzilla/show_bug.cgi?id=27653#c9 | ||||
|     echo 'char* dlerror(void) { return "";}'|gcc -fpic -shared -o "${HOME}/dlerror.so" -x c - | ||||
|     LD_PRELOAD=${orig_preload}:${HOME}/dlerror.so | ||||
|   fi | ||||
|   pip_install --no-use-pep517 --user "git+https://github.com/pytorch/vision.git@${commit}" | ||||
|   if [ -n "${LD_PRELOAD}" ]; then | ||||
|     LD_PRELOAD=${orig_preload} | ||||
|   fi | ||||
| } | ||||
|  | ||||
| function install_tlparse() { | ||||
|   pip_install --user "tlparse==0.3.7" | ||||
|   PATH="$(python -m site --user-base)/bin:$PATH" | ||||
| } | ||||
|  | ||||
| function install_torchrec_and_fbgemm() { | ||||
|   local torchrec_commit | ||||
|   torchrec_commit=$(get_pinned_commit torchrec) | ||||
|   local fbgemm_commit | ||||
|   fbgemm_commit=$(get_pinned_commit fbgemm) | ||||
|   pip_uninstall torchrec-nightly | ||||
|   pip_uninstall fbgemm-gpu-nightly | ||||
|   pip_install setuptools-git-versioning scikit-build pyre-extensions | ||||
|   # See https://github.com/pytorch/pytorch/issues/106971 | ||||
|   CUDA_PATH=/usr/local/cuda-12.1 pip_install --no-use-pep517 --user "git+https://github.com/pytorch/FBGEMM.git@${fbgemm_commit}#egg=fbgemm-gpu&subdirectory=fbgemm_gpu" | ||||
|   pip_install --no-use-pep517 --user "git+https://github.com/pytorch/torchrec.git@${torchrec_commit}" | ||||
| } | ||||
|  | ||||
| function clone_pytorch_xla() { | ||||
|   if [[ ! -d ./xla ]]; then | ||||
|     git clone --recursive -b r2.4 https://github.com/pytorch/xla.git | ||||
|     pushd xla | ||||
|     # pin the xla hash so that we don't get broken by changes to xla | ||||
|     git checkout "$(cat ../.github/ci_commit_pins/xla.txt)" | ||||
|     git submodule sync | ||||
|     git submodule update --init --recursive | ||||
|     popd | ||||
|   fi | ||||
| } | ||||
|  | ||||
| function checkout_install_torchdeploy() { | ||||
|   local commit | ||||
|   commit=$(get_pinned_commit multipy) | ||||
|   pushd .. | ||||
|   git clone --recurse-submodules https://github.com/pytorch/multipy.git | ||||
|   pushd multipy | ||||
|   git checkout "${commit}" | ||||
|   python multipy/runtime/example/generate_examples.py | ||||
|   BUILD_CUDA_TESTS=1 pip install -e . | ||||
|   popd | ||||
|   popd | ||||
| } | ||||
|  | ||||
| function test_torch_deploy(){ | ||||
|  pushd .. | ||||
|  pushd multipy | ||||
|  ./multipy/runtime/build/test_deploy | ||||
|  ./multipy/runtime/build/test_deploy_gpu | ||||
|  popd | ||||
|  popd | ||||
| } | ||||
|  | ||||
| function checkout_install_torchbench() { | ||||
|   local commit | ||||
|   commit=$(get_pinned_commit torchbench) | ||||
|   git clone https://github.com/pytorch/benchmark torchbench | ||||
|   pushd torchbench | ||||
|   git checkout "$commit" | ||||
|  | ||||
|   if [ "$1" ]; then | ||||
|     python install.py --continue_on_fail models "$@" | ||||
|   else | ||||
|     # Occasionally the installation may fail on one model but it is ok to continue | ||||
|     # to install and test other models | ||||
|     python install.py --continue_on_fail | ||||
|   fi | ||||
|   popd | ||||
| } | ||||
|  | ||||
| function print_sccache_stats() { | ||||
|   echo 'PyTorch Build Statistics' | ||||
|   sccache --show-stats | ||||
|  | ||||
|   if [[ -n "${OUR_GITHUB_JOB_ID}" ]]; then | ||||
|     sccache --show-stats --stats-format json | jq .stats \ | ||||
|       > "sccache-stats-${BUILD_ENVIRONMENT}-${OUR_GITHUB_JOB_ID}.json" | ||||
|   else | ||||
|     echo "env var OUR_GITHUB_JOB_ID not set, will not write sccache stats to json" | ||||
|   fi | ||||
| } | ||||
| @ -1,93 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| # This is where the local pytorch install in the docker image is located | ||||
| pt_checkout="/var/lib/jenkins/workspace" | ||||
|  | ||||
| # Since we're cat-ing this file, we need to escape all $'s | ||||
| echo "cpp_doc_push_script.sh: Invoked with $*" | ||||
|  | ||||
| # for statements like ${1:-${DOCS_INSTALL_PATH:-docs/}} | ||||
| # the order of operations goes: | ||||
| #   1. Check if there's an argument $1 | ||||
| #   2. If no argument check for environment var DOCS_INSTALL_PATH | ||||
| #   3. If no environment var, fall back to the default 'docs/' | ||||
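| # For example (illustrative values): | ||||
| #   no argument, DOCS_INSTALL_PATH unset      -> docs/ | ||||
| #   no argument, DOCS_INSTALL_PATH=site/docs  -> site/docs | ||||
| #   argument "v2.4" given                     -> v2.4 (wins over both) | ||||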
|  | ||||
| # NOTE: It might seem weird to gather the second argument before the first, | ||||
| #       but since DOCS_INSTALL_PATH can be derived from DOCS_VERSION it's better | ||||
| #       to gather the version first, so we don't break people who rely on this script | ||||
| # Argument 2: Which version of the C++ API docs we are building. | ||||
| version="${2:-${DOCS_VERSION:-main}}" | ||||
| if [ -z "$version" ]; then | ||||
| echo "error: cpp_doc_push_script.sh: version (arg2) not specified" | ||||
|   exit 1 | ||||
| fi | ||||
|  | ||||
| # Argument 1: Where to copy the built documentation for the C++ API to | ||||
| # (pytorch.github.io/$install_path) | ||||
| install_path="${1:-${DOCS_INSTALL_PATH:-docs/${DOCS_VERSION}}}" | ||||
| if [ -z "$install_path" ]; then | ||||
| echo "error: cpp_doc_push_script.sh: install_path (arg1) not specified" | ||||
|   exit 1 | ||||
| fi | ||||
|  | ||||
| echo "install_path: $install_path  version: $version" | ||||
|  | ||||
| # ======================== Building PyTorch C++ API Docs ======================== | ||||
|  | ||||
| echo "Building PyTorch C++ API docs..." | ||||
|  | ||||
| # Clone the cppdocs repo | ||||
| rm -rf cppdocs | ||||
| git clone https://github.com/pytorch/cppdocs | ||||
|  | ||||
| set -ex | ||||
|  | ||||
| # Generate ATen files | ||||
| pushd "${pt_checkout}" | ||||
| time python -m torchgen.gen \ | ||||
|   -s aten/src/ATen \ | ||||
|   -d build/aten/src/ATen | ||||
|  | ||||
| # Copy some required files | ||||
| cp torch/_utils_internal.py tools/shared | ||||
|  | ||||
| # Generate PyTorch files | ||||
| time python tools/setup_helpers/generate_code.py \ | ||||
|   --native-functions-path aten/src/ATen/native/native_functions.yaml \ | ||||
|   --tags-path aten/src/ATen/native/tags.yaml | ||||
|  | ||||
| # Build the docs | ||||
| pushd docs/cpp | ||||
| time make VERBOSE=1 html -j | ||||
|  | ||||
| popd | ||||
| popd | ||||
|  | ||||
| pushd cppdocs | ||||
|  | ||||
| # Purge everything with some exceptions | ||||
| mkdir /tmp/cppdocs-sync | ||||
| mv _config.yml README.md /tmp/cppdocs-sync/ | ||||
| rm -rf ./* | ||||
|  | ||||
| # Copy over all the newly generated HTML | ||||
| cp -r "${pt_checkout}"/docs/cpp/build/html/* . | ||||
|  | ||||
| # Copy back _config.yml | ||||
| rm -rf _config.yml | ||||
| mv /tmp/cppdocs-sync/* . | ||||
|  | ||||
| # Make a new commit | ||||
| git add . || true | ||||
| git status | ||||
| git config user.email "soumith+bot@pytorch.org" | ||||
| git config user.name "pytorchbot" | ||||
| # If there aren't changes, don't make a commit; the push is a no-op | ||||
| git commit -m "Generate C++ docs from pytorch/pytorch@${GITHUB_SHA}" || true | ||||
| git status | ||||
|  | ||||
| if [[ "${WITH_PUSH:-}" == true ]]; then | ||||
|   git push -u origin | ||||
| fi | ||||
|  | ||||
| popd | ||||
| @ -1,124 +0,0 @@ | ||||
| from datetime import datetime, timedelta | ||||
| from tempfile import mkdtemp | ||||
|  | ||||
| from cryptography import x509 | ||||
| from cryptography.hazmat.primitives import hashes, serialization | ||||
| from cryptography.hazmat.primitives.asymmetric import rsa | ||||
| from cryptography.x509.oid import NameOID | ||||
|  | ||||
| temp_dir = mkdtemp() | ||||
| print(temp_dir) | ||||
|  | ||||
|  | ||||
| def genrsa(path): | ||||
|     key = rsa.generate_private_key( | ||||
|         public_exponent=65537, | ||||
|         key_size=2048, | ||||
|     ) | ||||
|     with open(path, "wb") as f: | ||||
|         f.write( | ||||
|             key.private_bytes( | ||||
|                 encoding=serialization.Encoding.PEM, | ||||
|                 format=serialization.PrivateFormat.TraditionalOpenSSL, | ||||
|                 encryption_algorithm=serialization.NoEncryption(), | ||||
|             ) | ||||
|         ) | ||||
|     return key | ||||
|  | ||||
|  | ||||
| def create_cert(path, C, ST, L, O, key): | ||||
|     subject = issuer = x509.Name( | ||||
|         [ | ||||
|             x509.NameAttribute(NameOID.COUNTRY_NAME, C), | ||||
|             x509.NameAttribute(NameOID.STATE_OR_PROVINCE_NAME, ST), | ||||
|             x509.NameAttribute(NameOID.LOCALITY_NAME, L), | ||||
|             x509.NameAttribute(NameOID.ORGANIZATION_NAME, O), | ||||
|         ] | ||||
|     ) | ||||
|     cert = ( | ||||
|         x509.CertificateBuilder() | ||||
|         .subject_name(subject) | ||||
|         .issuer_name(issuer) | ||||
|         .public_key(key.public_key()) | ||||
|         .serial_number(x509.random_serial_number()) | ||||
|         .not_valid_before(datetime.utcnow()) | ||||
|         .not_valid_after( | ||||
|             # Our certificate will be valid for 10 days | ||||
|             datetime.utcnow() | ||||
|             + timedelta(days=10) | ||||
|         ) | ||||
|         .add_extension( | ||||
|             x509.BasicConstraints(ca=True, path_length=None), | ||||
|             critical=True, | ||||
|         ) | ||||
|         .sign(key, hashes.SHA256()) | ||||
|     ) | ||||
|     # Write our certificate out to disk. | ||||
|     with open(path, "wb") as f: | ||||
|         f.write(cert.public_bytes(serialization.Encoding.PEM)) | ||||
|     return cert | ||||
|  | ||||
|  | ||||
| def create_req(path, C, ST, L, O, key): | ||||
|     csr = ( | ||||
|         x509.CertificateSigningRequestBuilder() | ||||
|         .subject_name( | ||||
|             x509.Name( | ||||
|                 [ | ||||
|                     # Provide various details about who we are. | ||||
|                     x509.NameAttribute(NameOID.COUNTRY_NAME, C), | ||||
|                     x509.NameAttribute(NameOID.STATE_OR_PROVINCE_NAME, ST), | ||||
|                     x509.NameAttribute(NameOID.LOCALITY_NAME, L), | ||||
|                     x509.NameAttribute(NameOID.ORGANIZATION_NAME, O), | ||||
|                 ] | ||||
|             ) | ||||
|         ) | ||||
|         .sign(key, hashes.SHA256()) | ||||
|     ) | ||||
|     with open(path, "wb") as f: | ||||
|         f.write(csr.public_bytes(serialization.Encoding.PEM)) | ||||
|     return csr | ||||
|  | ||||
|  | ||||
| def sign_certificate_request(path, csr_cert, ca_cert, private_ca_key): | ||||
|     cert = ( | ||||
|         x509.CertificateBuilder() | ||||
|         .subject_name(csr_cert.subject) | ||||
|         .issuer_name(ca_cert.subject) | ||||
|         .public_key(csr_cert.public_key()) | ||||
|         .serial_number(x509.random_serial_number()) | ||||
|         .not_valid_before(datetime.utcnow()) | ||||
|         .not_valid_after( | ||||
|             # Our certificate will be valid for 10 days | ||||
|             datetime.utcnow() | ||||
|             + timedelta(days=10) | ||||
|         ) | ||||
|         # Sign our certificate with the CA's private key | ||||
|         .sign(private_ca_key, hashes.SHA256()) | ||||
|     ) | ||||
|     with open(path, "wb") as f: | ||||
|         f.write(cert.public_bytes(serialization.Encoding.PEM)) | ||||
|     return cert | ||||
|  | ||||
|  | ||||
| ca_key = genrsa(temp_dir + "/ca.key") | ||||
| ca_cert = create_cert( | ||||
|     temp_dir + "/ca.pem", | ||||
|     "US", | ||||
|     "New York", | ||||
|     "New York", | ||||
|     "Gloo Certificate Authority", | ||||
|     ca_key, | ||||
| ) | ||||
|  | ||||
| pkey = genrsa(temp_dir + "/pkey.key") | ||||
| csr = create_req( | ||||
|     temp_dir + "/csr.csr", | ||||
|     "US", | ||||
|     "California", | ||||
|     "San Francisco", | ||||
|     "Gloo Testing Company", | ||||
|     pkey, | ||||
| ) | ||||
|  | ||||
| cert = sign_certificate_request(temp_dir + "/cert.pem", csr, ca_cert, ca_key) | ||||
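|  | ||||
| # Optional sanity check (illustrative, not part of the original flow): verify | ||||
| # that the issued certificate really was signed by the CA key generated above. | ||||
| # Raises cryptography.exceptions.InvalidSignature on a mismatch. | ||||
| from cryptography.hazmat.primitives.asymmetric import padding | ||||
|  | ||||
| ca_cert.public_key().verify( | ||||
|     cert.signature, | ||||
|     cert.tbs_certificate_bytes, | ||||
|     padding.PKCS1v15(), | ||||
|     cert.signature_hash_algorithm, | ||||
| ) | ||||
| print("certificate chain verified; files written to", temp_dir) | ||||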
| @ -1,6 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| # shellcheck source=./common.sh | ||||
| source "$(dirname "${BASH_SOURCE[0]}")/common.sh" | ||||
|  | ||||
| docker build -t pytorch . | ||||
| @ -1,9 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| # shellcheck source=./common.sh | ||||
| source "$(dirname "${BASH_SOURCE[0]}")/common.sh" | ||||
|  | ||||
| echo "Testing pytorch docs" | ||||
|  | ||||
| cd docs | ||||
| TERM=vt100 make doctest | ||||
| @ -1 +0,0 @@ | ||||
| raise ModuleNotFoundError("Sorry PyTorch, but our NumPy is in the other folder") | ||||
| @ -1,40 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| # This is where the local pytorch install in the docker image is located | ||||
| pt_checkout="/var/lib/jenkins/workspace" | ||||
| source "$pt_checkout/.ci/pytorch/common_utils.sh" | ||||
| echo "functorch_doc_push_script.sh: Invoked with $*" | ||||
|  | ||||
| set -ex | ||||
|  | ||||
| version=${DOCS_VERSION:-nightly} | ||||
| echo "version: $version" | ||||
|  | ||||
| # Build functorch docs | ||||
| pushd $pt_checkout/functorch/docs | ||||
| make html | ||||
| popd | ||||
|  | ||||
| git clone https://github.com/pytorch/functorch -b gh-pages --depth 1 functorch_ghpages | ||||
| pushd functorch_ghpages | ||||
|  | ||||
| if [ "$version" == "main" ]; then | ||||
|   version=nightly | ||||
| fi | ||||
|  | ||||
| git rm -rf "$version" || true | ||||
| mv "$pt_checkout/functorch/docs/build/html" "$version" | ||||
|  | ||||
| git add "$version" || true | ||||
| git status | ||||
| git config user.email "soumith+bot@pytorch.org" | ||||
| git config user.name "pytorchbot" | ||||
| # If there aren't changes, don't make a commit; the push is a no-op | ||||
| git commit -m "Generate Python docs from pytorch/pytorch@${GITHUB_SHA}" || true | ||||
| git status | ||||
|  | ||||
| if [[ "${WITH_PUSH:-}" == true ]]; then | ||||
|   git push -u origin gh-pages | ||||
| fi | ||||
|  | ||||
| popd | ||||
| @ -1,37 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| # Script for installing sccache on the xla build job, which uses xla's docker | ||||
| # image and doesn't have sccache installed on it.  This is mostly copied from | ||||
| # .ci/docker/install_cache.sh.  Changes are: removing checks that will always | ||||
| # return the same thing (e.g. checks for ROCm and CUDA), changing the path | ||||
| # where sccache is installed, and not changing /etc/environment. | ||||
|  | ||||
| set -ex | ||||
|  | ||||
| install_binary() { | ||||
|   echo "Downloading sccache binary from S3 repo" | ||||
|   curl --retry 3 https://s3.amazonaws.com/ossci-linux/sccache -o /tmp/cache/bin/sccache | ||||
| } | ||||
|  | ||||
| mkdir -p /tmp/cache/bin | ||||
| mkdir -p /tmp/cache/lib | ||||
| export PATH="/tmp/cache/bin:$PATH" | ||||
|  | ||||
| install_binary | ||||
| chmod a+x /tmp/cache/bin/sccache | ||||
|  | ||||
| function write_sccache_stub() { | ||||
|   # Unset LD_PRELOAD for ps because of asan + ps issues | ||||
|   # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90589 | ||||
|   # shellcheck disable=SC2086 | ||||
|   # shellcheck disable=SC2059 | ||||
|   printf "#!/bin/sh\nif [ \$(env -u LD_PRELOAD ps -p \$PPID -o comm=) != sccache ]; then\n  exec sccache $(which $1) \"\$@\"\nelse\n  exec $(which $1) \"\$@\"\nfi" > "/tmp/cache/bin/$1" | ||||
|   chmod a+x "/tmp/cache/bin/$1" | ||||
| } | ||||
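|  | ||||
| # For reference, the stub generated above for e.g. `gcc` expands to roughly | ||||
| # the following (assuming `which gcc` resolves to /usr/bin/gcc): | ||||
| # | ||||
| #   #!/bin/sh | ||||
| #   if [ $(env -u LD_PRELOAD ps -p $PPID -o comm=) != sccache ]; then | ||||
| #     exec sccache /usr/bin/gcc "$@" | ||||
| #   else | ||||
| #     exec /usr/bin/gcc "$@" | ||||
| #   fi | ||||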
|  | ||||
| write_sccache_stub cc | ||||
| write_sccache_stub c++ | ||||
| write_sccache_stub gcc | ||||
| write_sccache_stub g++ | ||||
| write_sccache_stub clang | ||||
| write_sccache_stub clang++ | ||||
| @ -1,11 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| if [ -z "${BUILD_ENVIRONMENT}" ] || [[ "${BUILD_ENVIRONMENT}" == *-build* ]]; then | ||||
|   # shellcheck source=./macos-build.sh | ||||
|   source "$(dirname "${BASH_SOURCE[0]}")/macos-build.sh" | ||||
| fi | ||||
|  | ||||
| if [ -z "${BUILD_ENVIRONMENT}" ] || [[ "${BUILD_ENVIRONMENT}" == *-test* ]]; then | ||||
| # shellcheck source=./macos-test.sh | ||||
|   source "$(dirname "${BASH_SOURCE[0]}")/macos-test.sh" | ||||
| fi | ||||
| @ -1,92 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| # shellcheck disable=SC2034 | ||||
| # shellcheck source=./macos-common.sh | ||||
| source "$(dirname "${BASH_SOURCE[0]}")/macos-common.sh" | ||||
| # shellcheck source=./common-build.sh | ||||
| source "$(dirname "${BASH_SOURCE[0]}")/common-build.sh" | ||||
|  | ||||
| # Build PyTorch | ||||
| if [ -z "${CI}" ]; then | ||||
|   export DEVELOPER_DIR=/Applications/Xcode9.app/Contents/Developer | ||||
| fi | ||||
|  | ||||
| # This helper function wraps calls to binaries with sccache, but only if they're not already wrapped with sccache. | ||||
| # For example, `clang` will be `sccache clang`, but `sccache clang` will not become `sccache sccache clang`. | ||||
| # The way this is done is by detecting the command of the parent pid of the current process and checking whether | ||||
| # that is sccache, and wrapping the call in sccache if the parent is not already sccache. | ||||
| function write_sccache_stub() { | ||||
|   output=$1 | ||||
|   binary=$(basename "${output}") | ||||
|  | ||||
|   printf "#!/bin/sh\nif [ \$(ps auxc \$(ps auxc -o ppid \$\$ | grep \$\$ | rev | cut -d' ' -f1 | rev) | tr '\\\\n' ' ' | rev | cut -d' ' -f2 | rev) != sccache ]; then\n  exec sccache %s \"\$@\"\nelse\n  exec %s \"\$@\"\nfi" "$(which "${binary}")" "$(which "${binary}")" > "${output}" | ||||
|   chmod a+x "${output}" | ||||
| } | ||||
|  | ||||
| if which sccache > /dev/null; then | ||||
|   # Create temp directory for sccache shims | ||||
|   tmp_dir=$(mktemp -d) | ||||
|   trap 'rm -rfv ${tmp_dir}' EXIT | ||||
|   write_sccache_stub "${tmp_dir}/clang++" | ||||
|   write_sccache_stub "${tmp_dir}/clang" | ||||
|  | ||||
|   export PATH="${tmp_dir}:$PATH" | ||||
| fi | ||||
|  | ||||
| cross_compile_arm64() { | ||||
|   # Cross compilation for arm64 | ||||
|   # Explicitly set USE_DISTRIBUTED=0 to align with the default build config on mac. This also serves as the sole CI config that tests | ||||
|   # that building with USE_DISTRIBUTED=0 works at all. See https://github.com/pytorch/pytorch/issues/86448 | ||||
|   USE_DISTRIBUTED=0 CMAKE_OSX_ARCHITECTURES=arm64 MACOSX_DEPLOYMENT_TARGET=11.0 USE_MKLDNN=OFF USE_QNNPACK=OFF WERROR=1 BUILD_TEST=OFF USE_PYTORCH_METAL=1 python setup.py bdist_wheel | ||||
| } | ||||
|  | ||||
| compile_arm64() { | ||||
|   # Compilation for arm64 | ||||
|   # TODO: Compile with OpenMP support (but this causes CI regressions as cross-compilation was done with OpenMP disabled) | ||||
|   USE_DISTRIBUTED=0 USE_OPENMP=1 MACOSX_DEPLOYMENT_TARGET=11.0 WERROR=1 BUILD_TEST=OFF USE_PYTORCH_METAL=1 python setup.py bdist_wheel | ||||
| } | ||||
|  | ||||
| compile_x86_64() { | ||||
|   USE_DISTRIBUTED=0 WERROR=1 python setup.py bdist_wheel --plat-name=macosx_10_9_x86_64 | ||||
| } | ||||
|  | ||||
| build_lite_interpreter() { | ||||
|     echo "Testing libtorch (lite interpreter)." | ||||
|  | ||||
|     CPP_BUILD="$(pwd)/../cpp_build" | ||||
|     # Ensure the removal of the tmp directory | ||||
|     trap 'rm -rfv ${CPP_BUILD}' EXIT | ||||
|     rm -rf "${CPP_BUILD}" | ||||
|     mkdir -p "${CPP_BUILD}/caffe2" | ||||
|  | ||||
|     # It looks like libtorch needs to be built in the "${CPP_BUILD}/caffe2" folder. | ||||
|     BUILD_LIBTORCH_PY=$PWD/tools/build_libtorch.py | ||||
|     pushd "${CPP_BUILD}/caffe2" || exit | ||||
|     VERBOSE=1 DEBUG=1 python "${BUILD_LIBTORCH_PY}" | ||||
|     popd || exit | ||||
|  | ||||
|     "${CPP_BUILD}/caffe2/build/bin/test_lite_interpreter_runtime" | ||||
| } | ||||
|  | ||||
| print_cmake_info | ||||
|  | ||||
| if [[ ${BUILD_ENVIRONMENT} = *arm64* ]]; then | ||||
|   if [[ $(uname -m) == "arm64" ]]; then | ||||
|     compile_arm64 | ||||
|   else | ||||
|     cross_compile_arm64 | ||||
|   fi | ||||
| elif [[ ${BUILD_ENVIRONMENT} = *lite-interpreter* ]]; then | ||||
|   export BUILD_LITE_INTERPRETER=1 | ||||
|   build_lite_interpreter | ||||
| else | ||||
|   compile_x86_64 | ||||
| fi | ||||
|  | ||||
| if which sccache > /dev/null; then | ||||
|   print_sccache_stats | ||||
| fi | ||||
|  | ||||
| python tools/stats/export_test_times.py | ||||
|  | ||||
| assert_git_not_dirty | ||||
| @ -1,33 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| # Common prelude for macos-build.sh and macos-test.sh | ||||
|  | ||||
| # shellcheck source=./common.sh | ||||
| source "$(dirname "${BASH_SOURCE[0]}")/common.sh" | ||||
|  | ||||
| sysctl -a | grep machdep.cpu | ||||
|  | ||||
| # These are required for both the build job and the test job. | ||||
| # In the latter, they are needed to test cpp extensions. | ||||
| export MACOSX_DEPLOYMENT_TARGET=11.1 | ||||
| export CXX=clang++ | ||||
| export CC=clang | ||||
|  | ||||
| print_cmake_info() { | ||||
|   CMAKE_EXEC=$(which cmake) | ||||
|   echo "$CMAKE_EXEC" | ||||
|  | ||||
|   CONDA_INSTALLATION_DIR=$(dirname "$CMAKE_EXEC") | ||||
|   # Print all libraries under cmake rpath for debugging | ||||
|   ls -la "$CONDA_INSTALLATION_DIR/../lib" | ||||
|  | ||||
|   export CMAKE_EXEC | ||||
|   # Explicitly add conda env lib folder to cmake rpath to address the flaky issue | ||||
|   # where cmake dependencies couldn't be found. This seems to point to how conda | ||||
|   # links $CMAKE_EXEC to its package cache when cloning a new environment | ||||
|   install_name_tool -add_rpath @executable_path/../lib "${CMAKE_EXEC}" || true | ||||
|   # Adding the rpath will invalidate cmake signature, so signing it again here | ||||
|   # to trust the executable. EXC_BAD_ACCESS (SIGKILL (Code Signature Invalid)) | ||||
|   # with an exit code 137 otherwise | ||||
|   codesign -f -s - "${CMAKE_EXEC}" || true | ||||
| } | ||||
| @ -1,169 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| # shellcheck disable=SC2034 | ||||
| # shellcheck source=./macos-common.sh | ||||
| source "$(dirname "${BASH_SOURCE[0]}")/macos-common.sh" | ||||
|  | ||||
| if [[ -n "$CONDA_ENV" ]]; then | ||||
|   # Use binaries under conda environment | ||||
|   export PATH="$CONDA_ENV/bin":$PATH | ||||
| fi | ||||
|  | ||||
| # Test that OpenMP is enabled for non-arm64 build | ||||
| if [[ ${BUILD_ENVIRONMENT} != *arm64* ]]; then | ||||
|   pushd test | ||||
|   if [[ ! $(python -c "import torch; print(int(torch.backends.openmp.is_available()))") == "1" ]]; then | ||||
|     echo "Build should have OpenMP enabled, but torch.backends.openmp.is_available() is False" | ||||
|     exit 1 | ||||
|   fi | ||||
|   popd | ||||
| fi | ||||
|  | ||||
| setup_test_python() { | ||||
|   # The CircleCI worker hostname doesn't resolve to an address. | ||||
|   # This environment variable makes ProcessGroupGloo default to | ||||
|   # using the address associated with the loopback interface. | ||||
|   export GLOO_SOCKET_IFNAME=lo0 | ||||
|   echo "Ninja version: $(ninja --version)" | ||||
|   echo "Python version: $(which python) ($(python --version))" | ||||
|  | ||||
|   # Increase default limit on open file handles from 256 to 1024 | ||||
|   ulimit -n 1024 | ||||
| } | ||||
|  | ||||
| test_python_all() { | ||||
|   setup_test_python | ||||
|  | ||||
|   time python test/run_test.py --verbose --exclude-jit-executor | ||||
|  | ||||
|   assert_git_not_dirty | ||||
| } | ||||
|  | ||||
| test_python_shard() { | ||||
|   if [[ -z "$NUM_TEST_SHARDS" ]]; then | ||||
|     echo "NUM_TEST_SHARDS must be defined to run a Python test shard" | ||||
|     exit 1 | ||||
|   fi | ||||
|  | ||||
|   setup_test_python | ||||
|  | ||||
|   time python test/run_test.py --verbose --exclude-jit-executor --exclude-distributed-tests --shard "$1" "$NUM_TEST_SHARDS" | ||||
|  | ||||
|   assert_git_not_dirty | ||||
| } | ||||
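|  | ||||
| # e.g. with NUM_TEST_SHARDS=3, `test_python_shard 2` runs the second of three | ||||
| # roughly equal shards via `run_test.py ... --shard 2 3`. | ||||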
|  | ||||
| test_libtorch() { | ||||
|   # C++ API | ||||
|  | ||||
|   if [[ "$BUILD_TEST_LIBTORCH" == "1" ]]; then | ||||
|     # NB: Install outside of source directory (at the same level as the root | ||||
|     # pytorch folder) so that it doesn't get cleaned away prior to docker push. | ||||
|     # But still clean it before we perform our own build. | ||||
|  | ||||
|     echo "Testing libtorch" | ||||
|  | ||||
|     CPP_BUILD="$PWD/../cpp-build" | ||||
|     rm -rf "$CPP_BUILD" | ||||
|     mkdir -p "$CPP_BUILD"/caffe2 | ||||
|  | ||||
|     BUILD_LIBTORCH_PY=$PWD/tools/build_libtorch.py | ||||
|     pushd "$CPP_BUILD"/caffe2 | ||||
|     VERBOSE=1 DEBUG=1 python "$BUILD_LIBTORCH_PY" | ||||
|     popd | ||||
|  | ||||
|     MNIST_DIR="${PWD}/test/cpp/api/mnist" | ||||
|     python tools/download_mnist.py --quiet -d "${MNIST_DIR}" | ||||
|  | ||||
|     # Unfortunately it seems like the test can't load from miniconda3 | ||||
|     # without these paths being set | ||||
|     export DYLD_LIBRARY_PATH="$DYLD_LIBRARY_PATH:$PWD/miniconda3/lib" | ||||
|     export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:$PWD/miniconda3/lib" | ||||
|     TORCH_CPP_TEST_MNIST_PATH="${MNIST_DIR}" CPP_TESTS_DIR="${CPP_BUILD}/caffe2/bin" python test/run_test.py --cpp --verbose -i cpp/test_api | ||||
|  | ||||
|     assert_git_not_dirty | ||||
|   fi | ||||
| } | ||||
|  | ||||
| test_custom_backend() { | ||||
|   print_cmake_info | ||||
|  | ||||
|   echo "Testing custom backends" | ||||
|   pushd test/custom_backend | ||||
|   rm -rf build && mkdir build | ||||
|   pushd build | ||||
|   SITE_PACKAGES="$(python -c 'from distutils.sysconfig import get_python_lib; print(get_python_lib())')" | ||||
|   CMAKE_PREFIX_PATH="$SITE_PACKAGES/torch" "${CMAKE_EXEC}" .. | ||||
|   make VERBOSE=1 | ||||
|   popd | ||||
|  | ||||
|   # Run Python tests and export a lowered module. | ||||
|   python test_custom_backend.py -v | ||||
|   python backend.py --export-module-to=model.pt | ||||
|   # Run C++ tests using the exported module. | ||||
|   build/test_custom_backend ./model.pt | ||||
|   rm -f ./model.pt | ||||
|   popd | ||||
|   assert_git_not_dirty | ||||
| } | ||||
|  | ||||
| test_custom_script_ops() { | ||||
|   print_cmake_info | ||||
|  | ||||
|   echo "Testing custom script operators" | ||||
|   pushd test/custom_operator | ||||
|   # Build the custom operator library. | ||||
|   rm -rf build && mkdir build | ||||
|   pushd build | ||||
|   SITE_PACKAGES="$(python -c 'from distutils.sysconfig import get_python_lib; print(get_python_lib())')" | ||||
|   CMAKE_PREFIX_PATH="$SITE_PACKAGES/torch" "${CMAKE_EXEC}" .. | ||||
|   make VERBOSE=1 | ||||
|   popd | ||||
|  | ||||
|   # Run tests Python-side and export a script module. | ||||
|   python test_custom_ops.py -v | ||||
|   python model.py --export-script-module=model.pt | ||||
|   # Run tests C++-side and load the exported script module. | ||||
|   build/test_custom_ops ./model.pt | ||||
|   popd | ||||
|   assert_git_not_dirty | ||||
| } | ||||
|  | ||||
| test_jit_hooks() { | ||||
|   print_cmake_info | ||||
|  | ||||
|   echo "Testing jit hooks in cpp" | ||||
|   pushd test/jit_hooks | ||||
|   # Build the custom operator library. | ||||
|   rm -rf build && mkdir build | ||||
|   pushd build | ||||
|   SITE_PACKAGES="$(python -c 'from distutils.sysconfig import get_python_lib; print(get_python_lib())')" | ||||
|   CMAKE_PREFIX_PATH="$SITE_PACKAGES/torch" "${CMAKE_EXEC}" .. | ||||
|   make VERBOSE=1 | ||||
|   popd | ||||
|  | ||||
|   # Run tests Python-side and export a script module. | ||||
|   python model.py --export-script-module=model | ||||
|   # Run tests C++-side and load the exported script module. | ||||
|   build/test_jit_hooks ./model | ||||
|   popd | ||||
|   assert_git_not_dirty | ||||
| } | ||||
|  | ||||
| install_tlparse | ||||
|  | ||||
| if [[ $NUM_TEST_SHARDS -gt 1 ]]; then | ||||
|   test_python_shard "${SHARD_NUMBER}" | ||||
|   if [[ "${SHARD_NUMBER}" == 1 ]]; then | ||||
|     test_libtorch | ||||
|     test_custom_script_ops | ||||
|   elif [[ "${SHARD_NUMBER}" == 2 ]]; then | ||||
|     test_jit_hooks | ||||
|     test_custom_backend | ||||
|   fi | ||||
| else | ||||
|   test_python_all | ||||
|   test_libtorch | ||||
|   test_custom_script_ops | ||||
|   test_jit_hooks | ||||
|   test_custom_backend | ||||
| fi | ||||
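
For context, a hedged sketch of what the `--shard "$1" "$NUM_TEST_SHARDS"` contract above amounts to: the test list is split deterministically so that shards are disjoint and together cover every test. The round-robin scheme below is only an illustration; `run_test.py` may instead balance shards by recorded test times.

```python
# Hypothetical sketch of deterministic sharding; not run_test.py's actual code.
def select_shard(tests, shard, num_shards):
    """Return the tests assigned to the 1-indexed `shard` of `num_shards`."""
    assert 1 <= shard <= num_shards
    # Sorting first keeps the assignment stable across machines.
    return [t for i, t in enumerate(sorted(tests)) if i % num_shards == shard - 1]

if __name__ == "__main__":
    tests = ["test_autograd", "test_nn", "test_ops", "test_torch"]
    print(select_shard(tests, shard=1, num_shards=2))  # ['test_autograd', 'test_ops']
```
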
| @ -1,62 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| # Required environment variable: $BUILD_ENVIRONMENT | ||||
| # (This is set by default in the Docker images we build, so you don't | ||||
| # need to set it yourself.) | ||||
|  | ||||
| # shellcheck source=./common.sh | ||||
| source "$(dirname "${BASH_SOURCE[0]}")/common.sh" | ||||
|  | ||||
| echo "Testing pytorch" | ||||
| time python test/run_test.py --include test_cuda_multigpu test_cuda_primary_ctx --verbose | ||||
|  | ||||
| # Disabling tests to see if they solve timeout issues; see https://github.com/pytorch/pytorch/issues/70015 | ||||
| # python tools/download_mnist.py --quiet -d test/cpp/api/mnist | ||||
| # OMP_NUM_THREADS=2 TORCH_CPP_TEST_MNIST_PATH="test/cpp/api/mnist" build/bin/test_api | ||||
| time python test/run_test.py --verbose -i distributed/test_c10d_common | ||||
| time python test/run_test.py --verbose -i distributed/test_c10d_gloo | ||||
| time python test/run_test.py --verbose -i distributed/test_c10d_nccl | ||||
| time python test/run_test.py --verbose -i distributed/test_c10d_spawn_gloo | ||||
| time python test/run_test.py --verbose -i distributed/test_c10d_spawn_nccl | ||||
| time python test/run_test.py --verbose -i distributed/test_cuda_p2p | ||||
| time python test/run_test.py --verbose -i distributed/test_store | ||||
| time python test/run_test.py --verbose -i distributed/test_pg_wrapper | ||||
| time python test/run_test.py --verbose -i distributed/rpc/cuda/test_tensorpipe_agent | ||||
| # FSDP tests | ||||
| # ${f#*/} strips the leading "test/" so run_test.py receives a repo-relative test path | ||||
| for f in test/distributed/fsdp/*.py ; do time python test/run_test.py --verbose -i "${f#*/}" ; done | ||||
| # ShardedTensor tests (including distributed checkpoint) | ||||
| time python test/run_test.py --verbose -i distributed/checkpoint/test_checkpoint | ||||
| time python test/run_test.py --verbose -i distributed/checkpoint/test_file_system_checkpoint | ||||
| time python test/run_test.py --verbose -i distributed/_shard/sharding_spec/test_sharding_spec | ||||
| time python test/run_test.py --verbose -i distributed/_shard/sharding_plan/test_sharding_plan | ||||
| time python test/run_test.py --verbose -i distributed/_shard/sharded_tensor/test_sharded_tensor | ||||
| time python test/run_test.py --verbose -i distributed/_shard/sharded_tensor/test_sharded_tensor_reshard | ||||
|  | ||||
| # functional collective tests | ||||
| time python test/run_test.py --verbose -i distributed/test_functional_api | ||||
|  | ||||
| # DTensor tests | ||||
| time python test/run_test.py --verbose -i distributed/_tensor/test_random_ops | ||||
| time python test/run_test.py --verbose -i distributed/_tensor/test_dtensor_compile | ||||
|  | ||||
| # DeviceMesh test | ||||
| time python test/run_test.py --verbose -i distributed/test_device_mesh | ||||
|  | ||||
| # DTensor/TP tests | ||||
| time python test/run_test.py --verbose -i distributed/tensor/parallel/test_ddp_2d_parallel | ||||
| time python test/run_test.py --verbose -i distributed/tensor/parallel/test_fsdp_2d_parallel | ||||
| time python test/run_test.py --verbose -i distributed/tensor/parallel/test_tp_examples | ||||
| time python test/run_test.py --verbose -i distributed/tensor/parallel/test_tp_random_state | ||||
|  | ||||
| # FSDP2 tests | ||||
| time python test/run_test.py --verbose -i distributed/_composable/fsdp/test_fully_shard_training -- -k test_2d_mlp_with_nd_mesh | ||||
|  | ||||
| # Pipelining composability tests | ||||
| time python test/run_test.py --verbose -i distributed/pipelining/test_composability.py | ||||
|  | ||||
| # Other tests (arguments after -- are forwarded to the underlying test runner) | ||||
| time python test/run_test.py --verbose -i test_cuda_primary_ctx | ||||
| time python test/run_test.py --verbose -i test_optim -- -k test_forloop_goes_right_direction_multigpu | ||||
| time python test/run_test.py --verbose -i test_optim -- -k test_mixed_device_dtype | ||||
| time python test/run_test.py --verbose -i test_foreach -- -k test_tensors_grouping | ||||
| assert_git_not_dirty | ||||
| @ -1,22 +0,0 @@ | ||||
| #!/bin/bash | ||||
| set -e | ||||
|  | ||||
| run_test () { | ||||
|   rm -rf test_tmp/ && mkdir test_tmp/ && cd test_tmp/ | ||||
|   "$@" | ||||
|   cd .. && rm -rf test_tmp/ | ||||
| } | ||||
|  | ||||
| get_runtime_of_command () { | ||||
|   TIMEFORMAT=%R | ||||
|  | ||||
|   # runtime=$( { time ($@ &> /dev/null); } 2>&1 1>/dev/null) | ||||
|   runtime=$( { time "$@"; } 2>&1 1>/dev/null) | ||||
|   if [[ $runtime == *"Error"* ]]; then | ||||
|     exit 1 | ||||
|   fi | ||||
|   runtime=${runtime#+++ $@} | ||||
|   runtime=$(python -c "print($runtime)") | ||||
|  | ||||
|   echo "$runtime" | ||||
| } | ||||
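
As a rough illustration of what `get_runtime_of_command` measures, here is a hypothetical Python equivalent (not part of these scripts) that times a command with its output suppressed and fails on a non-zero exit status instead of grepping the captured output for "Error":

```python
import subprocess
import time

def runtime_of_command(*cmd):
    """Run cmd with stdout/stderr discarded; return wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL,
                   stderr=subprocess.DEVNULL)
    return time.perf_counter() - start

# Example with a hypothetical workload: runtime_of_command("python", "main.py")
```
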
| @ -1,90 +0,0 @@ | ||||
| import argparse | ||||
| import json | ||||
| import math | ||||
| import sys | ||||
|  | ||||
| parser = argparse.ArgumentParser() | ||||
| parser.add_argument( | ||||
|     "--test-name", dest="test_name", action="store", required=True, help="test name" | ||||
| ) | ||||
| parser.add_argument( | ||||
|     "--sample-stats", | ||||
|     dest="sample_stats", | ||||
|     action="store", | ||||
|     required=True, | ||||
|     help="stats from sample", | ||||
| ) | ||||
| parser.add_argument( | ||||
|     "--update", | ||||
|     action="store_true", | ||||
|     help="whether to update baseline using stats from sample", | ||||
| ) | ||||
| args = parser.parse_args() | ||||
|  | ||||
| test_name = args.test_name | ||||
|  | ||||
| if "cpu" in test_name: | ||||
|     backend = "cpu" | ||||
| elif "gpu" in test_name: | ||||
|     backend = "gpu" | ||||
|  | ||||
| data_file_path = f"../{backend}_runtime.json" | ||||
|  | ||||
| with open(data_file_path) as data_file: | ||||
|     data = json.load(data_file) | ||||
|  | ||||
| if test_name in data: | ||||
|     mean = float(data[test_name]["mean"]) | ||||
|     sigma = float(data[test_name]["sigma"]) | ||||
| else: | ||||
|     # Let the test pass if baseline number doesn't exist | ||||
|     mean = sys.maxsize | ||||
|     sigma = 0.001 | ||||
|  | ||||
| print("population mean: ", mean) | ||||
| print("population sigma: ", sigma) | ||||
|  | ||||
| # Let the test pass if baseline number is NaN (which happened in | ||||
| # the past when we didn't have logic for catching NaN numbers) | ||||
| if math.isnan(mean) or math.isnan(sigma): | ||||
|     mean = sys.maxsize | ||||
|     sigma = 0.001 | ||||
|  | ||||
| sample_stats_data = json.loads(args.sample_stats) | ||||
|  | ||||
| sample_mean = float(sample_stats_data["mean"]) | ||||
| sample_sigma = float(sample_stats_data["sigma"]) | ||||
|  | ||||
| print("sample mean: ", sample_mean) | ||||
| print("sample sigma: ", sample_sigma) | ||||
|  | ||||
| if math.isnan(sample_mean): | ||||
|     raise Exception("""Error: sample mean is NaN""")  # noqa: TRY002 | ||||
| elif math.isnan(sample_sigma): | ||||
|     raise Exception("""Error: sample sigma is NaN""")  # noqa: TRY002 | ||||
|  | ||||
| z_value = (sample_mean - mean) / sigma | ||||
|  | ||||
| print("z-value: ", z_value) | ||||
|  | ||||
| if z_value >= 3: | ||||
|     raise Exception(  # noqa: TRY002 | ||||
|         f"""\n | ||||
| z-value >= 3, there is a high chance of a perf regression.\n | ||||
| To reproduce this regression, run | ||||
| `cd .ci/pytorch/perf_test/ && bash {test_name}.sh` on your local machine | ||||
| and compare the runtime before/after your code change. | ||||
| """ | ||||
|     ) | ||||
| else: | ||||
|     print("z-value < 3, no perf regression detected.") | ||||
|     if args.update: | ||||
|         print("We will use these numbers as new baseline.") | ||||
|         new_data_file_path = f"../new_{backend}_runtime.json" | ||||
|         # The new-baseline file is assumed to have been seeded with the | ||||
|         # current baseline earlier in the CI job. | ||||
|         with open(new_data_file_path) as new_data_file: | ||||
|             new_data = json.load(new_data_file) | ||||
|         new_data[test_name] = {} | ||||
|         new_data[test_name]["mean"] = sample_mean | ||||
|         new_data[test_name]["sigma"] = max(sample_sigma, sample_mean * 0.1) | ||||
|         with open(new_data_file_path, "w") as new_data_file: | ||||
|             json.dump(new_data, new_data_file, indent=4) | ||||
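
A worked example of the check above: with a baseline mean of 10.0 s and sigma of 0.5 s, a sample mean of 12.0 s gives z = (12.0 - 10.0) / 0.5 = 4.0 and trips the `z_value >= 3` threshold, while 11.0 s gives z = 2.0 and passes. The one-sided test in isolation (illustrative only):

```python
def is_regression(sample_mean, baseline_mean, baseline_sigma, threshold=3.0):
    """One-sided z-test: flags slowdowns only, never speedups."""
    z_value = (sample_mean - baseline_mean) / baseline_sigma
    return z_value >= threshold

assert is_regression(12.0, 10.0, 0.5)      # z = 4.0 -> regression
assert not is_regression(11.0, 10.0, 0.5)  # z = 2.0 -> within noise
assert not is_regression(8.0, 10.0, 0.5)   # faster runs are never flagged
```
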
| @ -1,17 +0,0 @@ | ||||
| # Compute the mean and standard deviation of runtime samples passed on | ||||
| # the command line and print them as JSON for compare_with_baseline.py. | ||||
| import json | ||||
| import sys | ||||
|  | ||||
| import numpy | ||||
|  | ||||
| sample_data_list = sys.argv[1:] | ||||
| sample_data_list = [float(v.strip()) for v in sample_data_list] | ||||
|  | ||||
| sample_mean = numpy.mean(sample_data_list) | ||||
| sample_sigma = numpy.std(sample_data_list) | ||||
|  | ||||
| data = { | ||||
|     "mean": sample_mean, | ||||
|     "sigma": sample_sigma, | ||||
| } | ||||
|  | ||||
| print(json.dumps(data)) | ||||
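
The JSON printed here is exactly the payload `compare_with_baseline.py` consumes via `--sample-stats`. A minimal illustration of the round trip, using made-up sample values:

```python
import json

import numpy

samples = [1.21, 1.19, 1.25, 1.22]  # hypothetical runtimes in seconds
stats = {"mean": numpy.mean(samples), "sigma": numpy.std(samples)}
payload = json.dumps(stats)  # numpy.float64 subclasses float, so this serializes
parsed = json.loads(payload)
print(parsed["mean"], parsed["sigma"])
```
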
| @ -1,43 +0,0 @@ | ||||
| #!/bin/bash | ||||
| set -e | ||||
|  | ||||
| . ./common.sh | ||||
|  | ||||
| test_cpu_speed_mini_sequence_labeler () { | ||||
|   echo "Testing: mini sequence labeler, CPU" | ||||
|  | ||||
|   export OMP_NUM_THREADS=4 | ||||
|   export MKL_NUM_THREADS=4 | ||||
|  | ||||
|   git clone https://github.com/pytorch/benchmark.git | ||||
|  | ||||
|   cd benchmark/ | ||||
|  | ||||
|   git checkout 726567a455edbfda6199445922a8cfee82535664 | ||||
|  | ||||
|   cd scripts/mini_sequence_labeler | ||||
|  | ||||
|   SAMPLE_ARRAY=() | ||||
|   NUM_RUNS=$1 | ||||
|  | ||||
|   for (( i=1; i<=NUM_RUNS; i++ )) do | ||||
|     runtime=$(get_runtime_of_command python main.py) | ||||
|     SAMPLE_ARRAY+=("${runtime}") | ||||
|   done | ||||
|  | ||||
|   cd ../../.. | ||||
|  | ||||
|   stats=$(python ../get_stats.py "${SAMPLE_ARRAY[@]}") | ||||
|   echo "Runtime stats in seconds:" | ||||
|   echo "$stats" | ||||
|  | ||||
|   if [ "$2" == "compare_with_baseline" ]; then | ||||
|     python ../compare_with_baseline.py --test-name "${FUNCNAME[0]}" --sample-stats "${stats}" | ||||
|   elif [ "$2" == "compare_and_update" ]; then | ||||
|     python ../compare_with_baseline.py --test-name "${FUNCNAME[0]}" --sample-stats "${stats}" --update | ||||
|   fi | ||||
| } | ||||
|  | ||||
| if [[ "${BASH_SOURCE[0]}" == "${0}" ]]; then | ||||
|   run_test test_cpu_speed_mini_sequence_labeler "$@" | ||||
| fi | ||||
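
Each of these speed-test scripts (the one above and those that follow) shares the same pattern: pin OMP/MKL thread counts, clone a benchmark repo at a fixed commit, time NUM_RUNS invocations, and feed the samples through `get_stats.py` and `compare_with_baseline.py`. A compact, hypothetical condensation of that sampling loop:

```python
import statistics

def benchmark(run_once, num_runs):
    """Collect num_runs runtimes from `run_once` and summarize them in the
    same {"mean", "sigma"} shape that get_stats.py emits."""
    samples = [run_once() for _ in range(num_runs)]  # run_once returns seconds
    return {"mean": statistics.mean(samples),
            "sigma": statistics.pstdev(samples)}  # population std, like numpy.std
```
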
| @ -1,45 +0,0 @@ | ||||
| #!/bin/bash | ||||
| set -e | ||||
|  | ||||
| . ./common.sh | ||||
|  | ||||
| test_cpu_speed_mnist () { | ||||
|   echo "Testing: MNIST, CPU" | ||||
|  | ||||
|   export OMP_NUM_THREADS=4 | ||||
|   export MKL_NUM_THREADS=4 | ||||
|  | ||||
|   git clone https://github.com/pytorch/examples.git -b perftests | ||||
|  | ||||
|   cd examples/mnist | ||||
|  | ||||
|   conda install -c pytorch torchvision-cpu | ||||
|  | ||||
|   # Download data | ||||
|   python main.py --epochs 0 | ||||
|  | ||||
|   SAMPLE_ARRAY=() | ||||
|   NUM_RUNS=$1 | ||||
|  | ||||
|   for (( i=1; i<=NUM_RUNS; i++ )) do | ||||
|     runtime=$(get_runtime_of_command python main.py --epochs 1 --no-log) | ||||
|     echo "$runtime" | ||||
|     SAMPLE_ARRAY+=("${runtime}") | ||||
|   done | ||||
|  | ||||
|   cd ../.. | ||||
|  | ||||
|   stats=$(python ../get_stats.py "${SAMPLE_ARRAY[@]}") | ||||
|   echo "Runtime stats in seconds:" | ||||
|   echo "$stats" | ||||
|  | ||||
|   if [ "$2" == "compare_with_baseline" ]; then | ||||
|     python ../compare_with_baseline.py --test-name "${FUNCNAME[0]}" --sample-stats "${stats}" | ||||
|   elif [ "$2" == "compare_and_update" ]; then | ||||
|     python ../compare_with_baseline.py --test-name "${FUNCNAME[0]}" --sample-stats "${stats}" --update | ||||
|   fi | ||||
| } | ||||
|  | ||||
| if [[ "${BASH_SOURCE[0]}" == "${0}" ]]; then | ||||
|   run_test test_cpu_speed_mnist "$@" | ||||
| fi | ||||
| @ -1,29 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| . ./common.sh | ||||
|  | ||||
| test_cpu_speed_torch () { | ||||
|   echo "Testing: torch.*, CPU" | ||||
|  | ||||
|   export OMP_NUM_THREADS=4 | ||||
|   export MKL_NUM_THREADS=4 | ||||
|  | ||||
|   git clone https://github.com/yf225/perf-tests.git | ||||
|  | ||||
|   if [ "$1" == "compare_with_baseline" ]; then | ||||
|     export ARGS=(--compare ../cpu_runtime.json) | ||||
|   elif [ "$1" == "compare_and_update" ]; then | ||||
|     export ARGS=(--compare ../cpu_runtime.json --update ../new_cpu_runtime.json) | ||||
|   elif [ "$1" == "update_only" ]; then | ||||
|     export ARGS=(--update ../new_cpu_runtime.json) | ||||
|   fi | ||||
|  | ||||
|   if ! python perf-tests/modules/test_cpu_torch.py "${ARGS[@]}"; then | ||||
|     echo "To reproduce this regression, run \`cd .ci/pytorch/perf_test/ && bash ${FUNCNAME[0]}.sh\` on your local machine and compare the runtime before/after your code change." | ||||
|     exit 1 | ||||
|   fi | ||||
| } | ||||
|  | ||||
| if [[ "${BASH_SOURCE[0]}" == "${0}" ]]; then | ||||
|   run_test test_cpu_speed_torch "$@" | ||||
| fi | ||||
| @ -1,29 +0,0 @@ | ||||
| #!/bin/bash | ||||
|  | ||||
| . ./common.sh | ||||
|  | ||||
| test_cpu_speed_torch_tensor () { | ||||
|   echo "Testing: torch.Tensor.*, CPU" | ||||
|  | ||||
|   export OMP_NUM_THREADS=4 | ||||
|   export MKL_NUM_THREADS=4 | ||||
|  | ||||
|   git clone https://github.com/yf225/perf-tests.git | ||||
|  | ||||
|   if [ "$1" == "compare_with_baseline" ]; then | ||||
|     export ARGS=(--compare ../cpu_runtime.json) | ||||
|   elif [ "$1" == "compare_and_update" ]; then | ||||
|     export ARGS=(--compare ../cpu_runtime.json --update ../new_cpu_runtime.json) | ||||
|   elif [ "$1" == "update_only" ]; then | ||||
|     export ARGS=(--update ../new_cpu_runtime.json) | ||||
|   fi | ||||
|  | ||||
|   if ! python perf-tests/modules/test_cpu_torch_tensor.py "${ARGS[@]}"; then | ||||
|     echo "To reproduce this regression, run \`cd .ci/pytorch/perf_test/ && bash ${FUNCNAME[0]}.sh\` on your local machine and compare the runtime before/after your code change." | ||||
|     exit 1 | ||||
|   fi | ||||
| } | ||||
|  | ||||
| if [[ "${BASH_SOURCE[0]}" == "${0}" ]]; then | ||||
|   run_test test_cpu_speed_torch_tensor "$@" | ||||
| fi | ||||
| @ -1,44 +0,0 @@ | ||||
| #!/bin/bash | ||||
| set -e | ||||
|  | ||||
| . ./common.sh | ||||
|  | ||||
| test_gpu_speed_cudnn_lstm () { | ||||
|   echo "Testing: CuDNN LSTM, GPU" | ||||
|  | ||||
|   export OMP_NUM_THREADS=4 | ||||
|   export MKL_NUM_THREADS=4 | ||||
|  | ||||
|   git clone https://github.com/pytorch/benchmark.git | ||||
|  | ||||
|   cd benchmark/ | ||||
|  | ||||
|   git checkout 43dfb2c0370e70ef37f249dc09aff9f0ccd2ddb0 | ||||
|  | ||||
|   cd scripts/ | ||||
|  | ||||
|   SAMPLE_ARRAY=() | ||||
|   NUM_RUNS=$1 | ||||
|  | ||||
|   for (( i=1; i<=NUM_RUNS; i++ )) do | ||||
|     runtime=$(get_runtime_of_command python cudnn_lstm.py --skip-cpu-governor-check) | ||||
|     echo "$runtime" | ||||
|     SAMPLE_ARRAY+=("${runtime}") | ||||
|   done | ||||
|  | ||||
|   cd ../.. | ||||
|  | ||||
|   stats=$(python ../get_stats.py "${SAMPLE_ARRAY[@]}") | ||||
|   echo "Runtime stats in seconds:" | ||||
|   echo "$stats" | ||||
|  | ||||
|   if [ "$2" == "compare_with_baseline" ]; then | ||||
|     python ../compare_with_baseline.py --test-name "${FUNCNAME[0]}" --sample-stats "${stats}" | ||||
|   elif [ "$2" == "compare_and_update" ]; then | ||||
|     python ../compare_with_baseline.py --test-name "${FUNCNAME[0]}" --sample-stats "${stats}" --update | ||||
|   fi | ||||
| } | ||||
|  | ||||
| if [[ "${BASH_SOURCE[0]}" == "${0}" ]]; then | ||||
|   run_test test_gpu_speed_cudnn_lstm "$@" | ||||
| fi | ||||
| @ -1,44 +0,0 @@ | ||||
| #!/bin/bash | ||||
| set -e | ||||
|  | ||||
| . ./common.sh | ||||
|  | ||||
| test_gpu_speed_lstm () { | ||||
|   echo "Testing: LSTM, GPU" | ||||
|  | ||||
|   export OMP_NUM_THREADS=4 | ||||
|   export MKL_NUM_THREADS=4 | ||||
|  | ||||
|   git clone https://github.com/pytorch/benchmark.git | ||||
|  | ||||
|   cd benchmark/ | ||||
|  | ||||
|   git checkout 43dfb2c0370e70ef37f249dc09aff9f0ccd2ddb0 | ||||
|  | ||||
|   cd scripts/ | ||||
|  | ||||
|   SAMPLE_ARRAY=() | ||||
|   NUM_RUNS=$1 | ||||
|  | ||||
|   for (( i=1; i<=NUM_RUNS; i++ )) do | ||||
|     runtime=$(get_runtime_of_command python lstm.py --skip-cpu-governor-check) | ||||
|     echo "$runtime" | ||||
|     SAMPLE_ARRAY+=("${runtime}") | ||||
|   done | ||||
|  | ||||
|   cd ../.. | ||||
|  | ||||
|   stats=$(python ../get_stats.py "${SAMPLE_ARRAY[@]}") | ||||
|   echo "Runtime stats in seconds:" | ||||
|   echo "$stats" | ||||
|  | ||||
|   if [ "$2" == "compare_with_baseline" ]; then | ||||
|     python ../compare_with_baseline.py --test-name "${FUNCNAME[0]}" --sample-stats "${stats}" | ||||
|   elif [ "$2" == "compare_and_update" ]; then | ||||
|     python ../compare_with_baseline.py --test-name "${FUNCNAME[0]}" --sample-stats "${stats}" --update | ||||
|   fi | ||||
| } | ||||
|  | ||||
| if [[ "${BASH_SOURCE[0]}" == "${0}" ]]; then | ||||
|   run_test test_gpu_speed_lstm "$@" | ||||
| fi | ||||
| @ -1,44 +0,0 @@ | ||||
| #!/bin/bash | ||||
| set -e | ||||
|  | ||||
| . ./common.sh | ||||
|  | ||||
| test_gpu_speed_mlstm () { | ||||
|   echo "Testing: MLSTM, GPU" | ||||
|  | ||||
|   export OMP_NUM_THREADS=4 | ||||
|   export MKL_NUM_THREADS=4 | ||||
|  | ||||
|   git clone https://github.com/pytorch/benchmark.git | ||||
|  | ||||
|   cd benchmark/ | ||||
|  | ||||
|   git checkout 43dfb2c0370e70ef37f249dc09aff9f0ccd2ddb0 | ||||
|  | ||||
|   cd scripts/ | ||||
|  | ||||
|   SAMPLE_ARRAY=() | ||||
|   NUM_RUNS=$1 | ||||
|  | ||||
|   for (( i=1; i<=NUM_RUNS; i++ )) do | ||||
|     runtime=$(get_runtime_of_command python mlstm.py --skip-cpu-governor-check) | ||||
|     echo "$runtime" | ||||
|     SAMPLE_ARRAY+=("${runtime}") | ||||
|   done | ||||
|  | ||||
|   cd ../.. | ||||
|  | ||||
|   stats=$(python ../get_stats.py "${SAMPLE_ARRAY[@]}") | ||||
|   echo "Runtime stats in seconds:" | ||||
|   echo "$stats" | ||||
|  | ||||
|   if [ "$2" == "compare_with_baseline" ]; then | ||||
|     python ../compare_with_baseline.py --test-name "${FUNCNAME[0]}" --sample-stats "${stats}" | ||||
|   elif [ "$2" == "compare_and_update" ]; then | ||||
|     python ../compare_with_baseline.py --test-name "${FUNCNAME[0]}" --sample-stats "${stats}" --update | ||||
|   fi | ||||
| } | ||||
|  | ||||
| if [[ "${BASH_SOURCE[0]}" == "${0}" ]]; then | ||||
|   run_test test_gpu_speed_mlstm "$@" | ||||
| fi | ||||
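
Each script's CLI contract is the same: `$1` is the number of timed runs and `$2` is an optional comparison mode (`compare_with_baseline` or `compare_and_update`). A hypothetical driver invocation, sketched in Python:

```python
# Hypothetical driver illustrating the scripts' CLI contract:
# $1 = NUM_RUNS, $2 = comparison mode (optional).
import subprocess

subprocess.run(
    ["bash", "test_gpu_speed_mlstm.sh", "20", "compare_with_baseline"],
    check=True,
)
```
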
Some files were not shown because too many files have changed in this diff.