Compare commits

...

267 Commits

Author SHA1 Message Date
a8d6afb511 Disabling amp context when invoking compiler (#138659)
Disabling amp context when invoking compiler (#138624)

Fix for https://github.com/pytorch/pytorch/issues/133974

Pull Request resolved: https://github.com/pytorch/pytorch/pull/138624
Approved by: https://github.com/bdhirsh, https://github.com/drisspg

(cherry picked from commit 5942b2985000e0c69ec955b6c88dee8b5d7e67fd)

Co-authored-by: eellison <elias.ellison@gmail.com>
2024-10-22 18:14:52 -07:00
f31b8bbc5b [MPS] Fix sliced cast (#138535)
[MPS] Fix sliced cast (#138314)

This fixes an internal crash due to an invalid buffer size computation when the sliced API is used.

Not sure what the purpose of the following was:
```c++
IntArrayRef baseShape;
if (src.is_view()) {
  baseShape = src._base().sizes();
} else {
  baseShape = getIMPSAllocator()->getBufferShape(src.storage().data());
}
int flattenedShaped = 1;
for (const auto i : c10::irange(baseShape.size())) {
  flattenedShaped *= baseShape[i];
}
```
As `flattenedShaped` can be computed much more simply as `[srcBuf length]/src.element_size()`, and even if `srcBuf` is padded it's a safe thing to do.

When someone allocated a buffer to hold, say, uint8 and then view-casted it to float16, the attempt to compute `baseShape` returned the size of the original tensor in its original data type, rather than the size in the new dtype.
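
A hypothetical repro sketch of that failure mode (shapes assumed; requires an MPS device):

```python
import torch

# Slice a uint8 buffer, then reinterpret the slice as float16: the buffer
# size must be computed in bytes from the source buffer, not from the base
# tensor's element count in its original dtype.
buf = torch.randint(0, 255, (16, 16), dtype=torch.uint8, device="mps")
sliced = buf[:, 8:]                   # a view into the original storage
as_half = sliced.view(torch.float16)  # 8 uint8 bytes -> 4 float16 values per row
print(as_half.shape)                  # torch.Size([16, 4])
```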

Fixes https://github.com/pytorch/pytorch/issues/137800
Pull Request resolved: https://github.com/pytorch/pytorch/pull/138314
Approved by: https://github.com/albanD, https://github.com/DenisVieriu97

(cherry picked from commit de16159e565e7a08294347e31e97ca08a3468227)

Co-authored-by: Nikita Shulga <nikita.shulga@gmail.com>
2024-10-22 16:25:25 -07:00
848e7ac42a [SDPA-CUDNN] Make CuDNN Attention Opt in (#138587)
[SDPA-CUDNN] Make CuDNN Attention Opt in (#138522)

# Summary
Currently we have a `cudnn_order` that says that on H100 with a new enough CuDNN backend (we ship a 9.1 version in OSS) we try to run CuDNN attention first. We have already encountered a few bugs with the release of 2.5:

1. https://github.com/pytorch/pytorch/issues/138529
2. https://github.com/huggingface/diffusers/issues/9704
3. https://github.com/pytorch/pytorch/pull/138354

In light of the above, we are going to make the CuDNN backend opt-in by default.

This can be done easily with the context manager for choosing backends, i.e.:
```python
import torch
import torch.nn.functional as F
from torch.nn.attention import sdpa_kernel, SDPBackend

# Example inputs (shapes assumed): batch, heads, sequence length, head dim.
q = k = v = torch.rand(2, 8, 128, 64, device="cuda", dtype=torch.float16)

with sdpa_kernel(SDPBackend.CUDNN_ATTENTION):
    out = F.scaled_dot_product_attention(q, k, v)
```

This PR puts the CuDNN backend at the lowest precedence in the backend list, meaning that the Math backend will always be chosen unless it is disabled (which is done via the context manager).

Cc @atalman

Pull Request resolved: https://github.com/pytorch/pytorch/pull/138522
Approved by: https://github.com/ngimel, https://github.com/eqy, https://github.com/malfet

(cherry picked from commit 9a9a0abc2818d40d06eda6c0b6fdbc949474f12e)

Co-authored-by: drisspg <drisspguessous@gmail.com>
2024-10-22 15:51:29 -07:00
885c823759 Update doc copyrights to 2024 (#138650)
Update copyrights to 2024 (#138638)

Spiritual successor of https://github.com/pytorch/pytorch/pull/119413 + CPP docs copyright update as well
Fixes https://github.com/pytorch/pytorch/issues/138630

Pull Request resolved: https://github.com/pytorch/pytorch/pull/138638
Approved by: https://github.com/atalman

(cherry picked from commit d1be61ce4eb31640d1bdce07c8e6b17d03cbdca6)

Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
2024-10-22 15:47:37 -07:00
8c3ed97baa Update cpuinfo submodule (#138600)
Spiritual cherry-pick of https://github.com/pytorch/pytorch/pull/138351 that picks https://github.com/pytorch/cpuinfo/pull/258 into the branch

Fixes https://github.com/pytorch/pytorch/issues/138333


Test Plan: `python -c "import torch"` finishes without any output on the screen
2024-10-22 15:06:53 -07:00
70cf2bbc0b Add link to torch.compile the missing manual in troubleshooting (#137369)
Add link to torch.compile the missing manual in troubleshooting (#137301)

Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/137301
Approved by: https://github.com/svekars

Co-authored-by: Svetlana Karslioglu <svekars@meta.com>
(cherry picked from commit 22e19bd2d70409c2908edf0b0d00abb9209e3aaa)

Co-authored-by: Michael Lazos <mlazos@meta.com>
2024-10-22 12:32:56 -07:00
cde6b382ff Don't try to load cufile (#138539)
Don't try to load cufile (#138501)

Trying to load it caused a big issue with the 2.5.0 release - https://github.com/pytorch/pytorch/issues/138324

cufile is not actually used currently by default, see #133489

Pull Request resolved: https://github.com/pytorch/pytorch/pull/138501
Approved by: https://github.com/atalman, https://github.com/mikaylagawarecki, https://github.com/malfet

(cherry picked from commit 012ff2a0aaf81bce25b7eda8c0021f5a784c11a6)

Co-authored-by: Sergii Dymchenko <sdym@meta.com>
2024-10-22 10:45:53 -07:00
4076a738b0 [Cherry-Pick] Use cuda 12.4 pytorch_extra_install_requirements as default (#138526)
Cherry-Picks https://github.com/pytorch/pytorch/pull/138458
Need to do it manually due to conflict with generated files.
2024-10-21 18:11:08 -07:00
a97c15174b update getting started xpu (#138090)
update getting started xpu (#137479)

1. Respect the comments from the community: downgrade "Beta" to "Prototype" for the first XPU release with wheels.
2. Add wheel installation of torchaudio & torchvision for nightly builds on Windows.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/137479
Approved by: https://github.com/atalman, https://github.com/malfet

(cherry picked from commit 7ba706c74e352dc752a0afd4b23d250108e2cb3e)

Co-authored-by: Zheng, Zhaoqiong <zhaoqiong.zheng@intel.com>
2024-10-18 08:24:39 -07:00
32f585d934 [Release only] use triton 3.1.x from pypi (#137895)
* [Release only] use triton 3.1.x from pypi

* [Release only] Disable triton build workflows

* fix
2024-10-14 13:48:44 -04:00
417a0763a7 [split build] move periodic split builds into own concurrency group (#135510) (#137265)
To avoid nightly workflows cancelling each other
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135510
Approved by: https://github.com/clee2000, https://github.com/huydhn, https://github.com/malfet

Co-authored-by: Sahan Paliskara <sahanp@meta.com>
Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
2024-10-03 08:29:56 -07:00
119e7344d9 [RELEASE-ONLY CHANGES] Fix dependency on filesystem on Linux (#137242)
* Apply changes from https://github.com/pytorch/pytorch/pull/135374

* Fix dependency on filesystem on Linux (#137209)

Similar to: https://github.com/pytorch/pytorch/pull/134494
We are seeing a comeback of https://github.com/pytorch/pytorch/issues/133437 due to the use of filesystem on Linux

Pull Request resolved: https://github.com/pytorch/pytorch/pull/137209
Approved by: https://github.com/kit1980, https://github.com/malfet

---------

Co-authored-by: atalman <atalman@fb.com>
2024-10-02 18:04:47 -07:00
783a6a424c [MPS] Add regression test for fft.fftfreq (#137215)
[MPS] Add regression test for `fft.fftfreq` (#135440)

The issue reported in #135223 was already solved in #128393. This PR adds a regression test for it.
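
A minimal sketch of such a regression check (assuming an MPS device; not necessarily the exact test added by the PR):

```python
import torch

# fft.fftfreq on MPS should match the CPU reference.
torch.testing.assert_close(
    torch.fft.fftfreq(8, device="mps").cpu(),
    torch.fft.fftfreq(8),
)
```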

Fixes #135223

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135440
Approved by: https://github.com/ezyang

(cherry picked from commit 09287e3af484421099d46ee56cc6ad7809402064)

Co-authored-by: Roy Hvaara <roy@lightyear.no>
2024-10-02 15:19:35 -07:00
5375201dff [MPS] Add missing dispatch to rshift.Tensor (#137212)
[MPS] Add missing dispatch to rshift.Tensor (#135607)

Missed it while working on https://github.com/pytorch/pytorch/pull/131813
Test plan: `python -c "import torch;print(torch.randint(100, 500, (64,), device='mps') >> torch.tensor([3,], device='mps'))"`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135607
Approved by: https://github.com/manuelcandales

(cherry picked from commit 3bf6be457d40034aa4b603b7ea1b8977051221ed)

Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
2024-10-02 14:54:41 -07:00
1de132ec9e [MPS] Fix 5D+ reductions over negative dimensions (#137211)
[MPS] Fix 5D+ reductions over negative dimensions (#136198)

This fixes a bug introduced by https://github.com/pytorch/pytorch/pull/99856, which attempts to speed up reductions for 5D+ tensors whose trailing dimensions are all ones, but introduces crashes/off-by-one errors for wrapped (negative) dimensions.

Added a regression test case to `TestMPS.test_sum`.
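
A minimal repro sketch under these assumptions (5D tensor, trailing singleton dimensions, wrapped/negative reduction dim; requires an MPS device):

```python
import torch

# Reduction over a negative dimension of a 5D tensor whose trailing
# dimensions are all ones; previously crashed or was off by one.
x = torch.rand(2, 3, 4, 1, 1, device="mps")
torch.testing.assert_close(x.sum(dim=-3).cpu(), x.cpu().sum(dim=-3))
```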

Fixes https://github.com/pytorch/pytorch/issues/136132

Pull Request resolved: https://github.com/pytorch/pytorch/pull/136198
Approved by: https://github.com/albanD

(cherry picked from commit f6f1504d39c92f1ab2a1ee10a7da97745593151f)

Co-authored-by: Nikita Shulga <nikita.shulga@gmail.com>
2024-10-02 14:52:55 -07:00
0b1b609ed7 [NCCL] Don't override waitUntilInitialized's setting of comm->initialized_ (#137210)
[NCCL] Don't override `waitUntilInitialized`'s setting of `comm->initialized_` (#136155)

#133630 sets `initialized_` to `true`, which causes previous wait codepaths to skip necessary waits; see also https://github.com/pytorch/pytorch/issues/136151

CC @shuqiangzhang @wconstab

Pull Request resolved: https://github.com/pytorch/pytorch/pull/136155
Approved by: https://github.com/fduwjj, https://github.com/kwen2501, https://github.com/c-p-i-o, https://github.com/shuqiangzhang

(cherry picked from commit e3aa5e2f6410727efead4970a4ec30569cf76881)

Co-authored-by: eqy <eddiey@nvidia.com>
2024-10-02 14:51:50 -07:00
0b45af9c10 Fix addmm silent correctness on aarch64 (#137208)
Fix addmm silent correctness on aarch64 (#136371)

Do not dispatch to fast gemmv functions when alpha is not equal to 1

Add regression test to address the problem

Fixes https://github.com/pytorch/pytorch/issues/136299
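
A regression-style sketch of the affected pattern (shapes and values assumed):

```python
import torch

# With alpha != 1 the fast gemmv path must not be taken (it ignored alpha);
# the result must match the reference beta * M + alpha * (mat @ vec).
M = torch.rand(4)
mat = torch.rand(4, 8)
vec = torch.rand(8)
out = torch.addmv(M, mat, vec, beta=1.0, alpha=0.5)
torch.testing.assert_close(out, M + 0.5 * (mat @ vec))
```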

Pull Request resolved: https://github.com/pytorch/pytorch/pull/136371
Approved by: https://github.com/swolchok

(cherry picked from commit c3e678382ba8eeeb3749b54e7b5256d7fb02cbd0)

Co-authored-by: Nikita Shulga <nikita.shulga@gmail.com>
2024-10-02 14:50:52 -07:00
1a0b166ba2 [ONNX] Add assertion nodes to ignoring list (#137214)
[ONNX] Add assertion nodes to ignoring list (#135591)

Fixes #135419

PS: there are 104 empty output nodes; I suggest we add them one by one as we run into them.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135591
Approved by: https://github.com/justinchuby

(cherry picked from commit 492f064f15a56afd3839a3075f844632f9cd7964)

Co-authored-by: titaiwangms <titaiwang@microsoft.com>
2024-10-02 13:12:27 -07:00
3a541ef8c2 Clarify that libtorch API is C++17 compatible (#137206)
Clarify that `libtorch` API is C++17 compatible (#136471)

As it relies on some common C++17 primitives, such as `std::optional`.
Replace all docs references to C++14 with C++17.

Fixes https://github.com/pytorch/pytorch/issues/133205

Pull Request resolved: https://github.com/pytorch/pytorch/pull/136471
Approved by: https://github.com/kit1980, https://github.com/atalman

(cherry picked from commit 4fd16dd8aa259cd75c9a6d2ddcd8171cd1ee8e28)

Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
2024-10-02 12:57:34 -07:00
f8c4c252ca [Release only] Set WITH_PUSH when WITH_PUSH_ROCM is set (#137177) 2024-10-02 11:40:01 -04:00
8af31b2e49 [RELEASE-ONLY Change] Push ROCm images on RC (#137148)
* [RELEASE-ONLY Change] Push ROCm images on RC

We usually don't need to push docker images on RC, but for the 2.5.0 release a cherry-pick for ROCm actually modified the docker images.

Do this for ROCm only to be safe.
After the release, think about the desired behavior and implement it in a more generic way.

* Hardcode 2.5 in the tag
2024-10-01 18:40:25 -07:00
8a71edcca5 [RELEASE-ONLY CHANGES] Disable slow workflows (#136805)
* Delete slow workflows

* Create slow.yml
2024-10-01 17:39:41 -07:00
058d3de7b9 [dynamo] Do not treat user defined nn module attributes static for dynamic shape infra (#137025)
[dynamo] Do not treat user defined nn module attributes static for dynamic shape infra (#136516)

Fixes https://github.com/pytorch/pytorch/issues/136254

The regression was introduced in https://github.com/pytorch/pytorch/pull/132736, where originally we were trying to fix another regression. This PR and the offending PR together say: "treat user-defined nn module attributes as automatic dynamic, but for cudagraphs they will be considered static". This avoids recompilations. It can lead to a cudagraph recording, which is OK. This also maintains the state from before the inline_inbuilt_nn_modules flag was introduced.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/136516
Approved by: https://github.com/williamwen42
2024-09-30 18:02:31 -07:00
17d25897b2 [FlexAttention] Fix output layout (#135882) (#136905)
We previously only supported a v_head dim equal to the qk_head dim. When we allowed different head dims, I accidentally kept the same query strides for the output. This PR fixes this bug and also ensures that we always produce output in the same stride order as the input query.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135882
Approved by: https://github.com/yanboliang, https://github.com/Chillee

(cherry picked from commit ae02d663cdf493362699d2672ed7dc9019a7033b)
2024-09-30 18:01:24 -07:00
70298e91f9 [Cherry-pick][DSD] Fix distributed state dict full_state_dict option hang during set_state_dict (#135725) and Fix loading uneven full tensor into sharded state dict (#136365) (#136903)
* [DSD] Fix distributed state dict full_state_dict option hang during set_state_dict (#135725)

Fix https://github.com/pytorch/pytorch/issues/134095
This fixes the distributed state dict full_state_dict option hanging during set_state_dict. We switch `_distribute_tensors` in _state_dict_utils.py to use `DTensor.from_local` instead of `distribute_tensor` to support the FSDP2+TP 2D strided-sharding use case, as `distribute_tensor` cannot handle strided sharding yet. `distribute_tensor` incurs a scatter behind the scenes, while `DTensor.from_local` takes the local slice from the full tensor on each rank to create the DTensor (no collective). This means it's the user's responsibility to make sure the full_tensor from the full_state_dict is the same across all ranks.
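
A hedged sketch of that approach (not the actual diff; assumes an initialized process group and that the world size evenly divides the dim-0 size):

```python
import torch
import torch.distributed as dist
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed.tensor import DTensor, Shard

# Build a DTensor from each rank's local slice of a replicated full tensor.
# DTensor.from_local issues no collective, unlike distribute_tensor, so the
# caller must guarantee the full tensor is identical on all ranks.
mesh = init_device_mesh("cpu", (dist.get_world_size(),))
full = torch.arange(16.0).reshape(8, 2)  # must match on every rank
local = full.chunk(dist.get_world_size(), dim=0)[dist.get_rank()]
dt = DTensor.from_local(local, mesh, [Shard(0)])
```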
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135725
Approved by: https://github.com/fegin

(cherry picked from commit 0cdc6a8dcd7e294b01d8914385bbe45e79c1770d)

* [DSD] Fix loading uneven full tensor into sharded state dict (#136365)

Fix #136228.

This is a follow up on https://github.com/pytorch/pytorch/pull/135725. We need to pass shape and stride from the original dtensor, since for uneven case, `from_local` would calculate shape and stride assuming the tensor is evenly-sharded based on the local tensor.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/136365
Approved by: https://github.com/fegin

(cherry picked from commit 637d5c4b7eb0fb82e019eed29efc0aa6ba92dc24)
2024-09-30 16:49:45 -07:00
69ed7c7093 [SymmetricMemory] improve multicast initialization/fallback logic (#136894)
[SymmetricMemory] improve multicast initialization/fallback logic (#136577)

Fixes https://github.com/pytorch/pytorch/issues/136494

Currently, CUDASymmetricMemory::rendezvous() initializes a multicast address if multicast support is present. However, if we believe multicast support is present but cuMulticastCreate still fails for some reason, we do not fall back gracefully.

- In addition to CUDART and driver version check, query CU_DEVICE_ATTRIBUTE_MULTICAST_SUPPORTED to determine multicast support for a rank/device.
- Before initializing multicast for a block, ensure all ranks/devices have multicast support.
- This is unlikely, but if cuMulticastCreate still fails on rank 0, print the corresponding driver error message as a warning, and gracefully skip multicast initialization for the block.
- Introduced an environment variable (TORCH_SYMM_MEM_DISABLE_MULTICAST) to allow users to explicitly disable multicast support as a workaround; a usage sketch follows this list.
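
A minimal usage sketch for that opt-out (the variable name is taken from the commit message; presumably it must be set before the first rendezvous):

```python
import os

# Explicitly disable multicast support as a workaround.
os.environ["TORCH_SYMM_MEM_DISABLE_MULTICAST"] = "1"
```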

Pull Request resolved: https://github.com/pytorch/pytorch/pull/136577
Approved by: https://github.com/Chillee, https://github.com/eqy

(cherry picked from commit d55eef5c596b3955dd8ee43c721b1c311dbab5e0)
2024-09-30 15:54:05 -07:00
d80f521ee2 fix requirements.txt installation failure issue on Windows (#136893) 2024-09-30 15:51:36 -07:00
57717c8768 Fix lint (#137052)
Fix lint (#137023)

By migrating some of the workflows to Python-3.9 as 3.8 has been deprecated by https://github.com/pytorch/pytorch/pull/132138

Pull Request resolved: https://github.com/pytorch/pytorch/pull/137023
Approved by: https://github.com/ZainRizvi, https://github.com/janeyx99, https://github.com/seemethere, https://github.com/kit1980, https://github.com/Skylion007

(cherry picked from commit 40f80a70fab672e96a088796fa1b742a70b93070)

Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
2024-09-30 14:56:13 -07:00
eqy
550ed97a89 [cuDNN][SDPA] cherrypick Support attn_bias in cuDNN (#130482) (#136885)
[cuDNN][SDPA] Support `attn_bias` in cuDNN (#130482)

CC @drisspg

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130482
Approved by: https://github.com/drisspg, https://github.com/Skylion007, https://github.com/malfet

Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
2024-09-30 13:17:37 -07:00
051df20ac2 [ROCm] Update to AOTriton 0.7b (Cherry-picked) (#135869)
* Always create seed and offset tensors on GPU memory.

* Adjust fudge_factors for test_flash_attention_vs_math_ref_grads

* Skip enable_gqa=True tests

* Fix cudagraph support for FA backend

* Update the AOTriton FA API to meet hipGraph demands.

* Enable test_fused_attention_vs_math_ref_grads_cudagraph and skip seq_len_q != seq_len_k when is_causal=True

* The main FA and ME tests passed after heavily hacking the fudge factors...

* [SDPA] Add experimental support to Navi31

* Changes aotriton_version.txt to 0.7b release

* Make the fudge factors more explicit.

* Code clean up.

* Claim GQA is not supported on ROCM in can_use_flash_attention

* Switch to .gz package

* Skip failures on test/test_native_mha.py

* Skip more GQA tests

* Skip nn_functional_scaled_dot_product_attention related tests

* Disable Efficient attention on fp32 + is_casual=True

* Revert "Disable Efficient attention on fp32 + is_casual=True"

This reverts commit 36324a49d2c322146adbd678902fa32d008b8b8b.

It's not very effective and forcing MATH backend does not help. Need
further investigations.

* Add missing imports

* Disable test_transformerencoderlayer and test_transformerdecoder

* Fix two more problems

* Fix lint

* Re-enable test_transformers

* Skip some tests in test_multiheadattention_fastpath_attn_mask on ROCM

* fix lint

* skip test_pointwise_associative_scan_tuple_reverse_True_combine_mode_pointwise_cuda on ROCm

* skip more test_pointwise_associative_scan

* Fix per suggestions from Nikita

* Update skip reason of test_transformerencoderlayer

* Add missing using
2024-09-30 13:14:55 -07:00
bc421d456e [RELEASE ONLY CHANGES] Revert XNNPACK Update (#136522)
* Revert "Update generate-xnnpack-wrappers.py parsing to handle build identifier (#134724)"

This reverts commit 679b8fe426babd40f8f9b013392df7a52901c926.

* Revert "Update PyTorch for XNNPACK 87ee0b4 (#134518)"

This reverts commit 3b40b07efbc13422d59fff83ffaddc5141a5fa5e.
2024-09-27 12:58:59 -07:00
aa574ab7e3 SDPA regression fix to work around high-precision by default (#136536)
Add option to configure reduced precision math backend for SDPA (#135964)

Summary: Address https://github.com/pytorch/pytorch/issues/135778 by adding a global flag to configure whether to use high precision or low precision for the math backend of SDPA.

Test Plan: buck2 run mode/opt //scripts/feikou/llm:run_attn_kernels

Differential Revision: D62625515

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135964
Approved by: https://github.com/jbschlosser

(cherry picked from commit 0a35986cdb72383c4e173abbdfc140ae2f98d5b1)

Co-authored-by: Jianyu Huang <jianyuhuang@meta.com>
2024-09-27 09:44:04 -07:00
24bd87d5dd [Docs] fix inconsistent docs in conv1d, conv2d, and conv3d (#136813)
[Docs] fix inconsistent docs in conv1d, conv2d, and conv3d (#135894)

Addresses https://github.com/pytorch/pytorch/issues/135880
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135894
Approved by: https://github.com/mikaylagawarecki, https://github.com/malfet

(cherry picked from commit c9de2efde6ba0e1f15fe3ea99646855fdd9debaa)

Co-authored-by: Sahan Paliskara <sahanp@meta.com>
2024-09-26 18:59:00 -07:00
6101aafa34 [Update] Update note for Getting Started with PyTorch on Intel GPUs (#136731)
[Update] Update note for Getting Started with PyTorch on Intel GPUs (#129946)

remove the hardware and software prerequisites and set up env part.
keep the prerequisites section and link to pytorch prerequistes for intel gpus for driver install, intel support package install and env set up
https://www.intel.com/content/www/us/en/developer/articles/tool/pytorch-prerequisites-for-intel-gpus.html
Update the support for Intel Client GPU MTL-H
Update inference & training examples

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129946
Approved by: https://github.com/seemethere

(cherry picked from commit f3dd1721f48e503145669843a8fe90e0ad57d9f7)

Co-authored-by: Zheng, Zhaoqiong <zhaoqiong.zheng@intel.com>
2024-09-26 17:50:08 -07:00
396413f05c Fix ROCm skip decorator for test_ddp_tp and multiprocess UTs (#136161) (#136801)
skip_if_rocm is used only in the multiprocess case (when the UT test class is a child of MultiProcessTestCase): each individual process can exit with a skip code. If used for a single-process UT, it will cause the UT to fail because the process returns a non-zero exit code. Use skipIfRocm in single-process UTs.

To avoid the above confusion, this PR renames skip_if_rocm to skip_if_rocm_multiprocess.
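
A hedged usage sketch of the distinction (decorator names as given in the commit message; treat the exact import paths as assumptions):

```python
from torch.testing._internal.common_distributed import (
    MultiProcessTestCase,
    skip_if_rocm_multiprocess,  # renamed from skip_if_rocm by this PR
)
from torch.testing._internal.common_utils import TestCase, skipIfRocm

class MultiProcUT(MultiProcessTestCase):
    @skip_if_rocm_multiprocess  # each child process may exit with a skip code
    def test_collective(self): ...

class SingleProcUT(TestCase):
    @skipIfRocm  # single process: a skip exit code would look like a failure
    def test_op(self): ...
```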

Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/136161
Approved by: https://github.com/jithunnair-amd, https://github.com/kwen2501, https://github.com/fegin

Co-authored-by: Prachi Gupta <prachi.gupta@amd.com>
2024-09-26 16:09:37 -07:00
c25781c5d2 Update current maintainers (#136769)
Update current maintainers (#136672)

This file hadn't had an overhaul in a few years, so this is long overdue. Most of the credit goes to @orionr for gathering all of this info.

The main rules we followed:
- No code contributor is removed, they're all placed as emeritus
- Breakdown too big categories to make this document useful to know who to ping
- No category where the code is still in the codebase is removed
- We did not rework the categories (for example to be closer to module: labels) and leave that for later
- All non-emeritus names are ordered by their number of comments on issues related to their topic
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136672
Approved by: https://github.com/eqy, https://github.com/ezyang, https://github.com/seemethere, https://github.com/malfet

(cherry picked from commit 2421344d8f582084b69b7b00fe0304b1c9732f65)

Co-authored-by: albanD <desmaison.alban@gmail.com>
2024-09-26 11:45:15 -07:00
ecd330669e Constrain setuptools to 72.1.0 or older in requirements.txt (#136729)
Constrain setuptools to 72.1.0 or older in requirements.txt (#136489)

FIXES: https://github.com/pytorch/pytorch/issues/136541

Setuptools>=74.0.0 has deprecated support for some functions in distutils, so the builds run into errors such as ```AttributeError: module 'distutils' has no attribute '_msvccompiler'```. Also, the pytorch builds have setuptools pinned to 72.1.0 according to these PRs: https://github.com/pytorch/builder/pull/1995 and 89d9a8cf6f. So, until there is a fix to change the function usage in accordance with the latest setuptools, the 72.1.0 version works fine.

Also observed in CI jobs: https://github.com/pytorch/pytorch/actions/runs/10979326524
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136489
Approved by: https://github.com/malfet

Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
(cherry picked from commit 8d65d9f11bbbcb0ea4152510ffe9784135cfacc8)

Co-authored-by: ratnampa <ratnam.parikh@intel.com>
2024-09-26 00:05:27 -07:00
1715708183 Revert "Trace fwd graph under no_grad mode #134872" (#136734)
This reverts commit aca4d6b891ac1b7296195283edb8b64c828a6351.
2024-09-25 23:10:28 -07:00
2e2c00f74c Make test_skip_data_serialization regex more flexible (#136710)
Make test_skip_data_serialization regex more flexible (#136580)

Some CI machines seem to throw "Can't get local object" rather than
"Can't pickle local object".
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136580
Approved by: https://github.com/mikaylagawarecki

(cherry picked from commit a0c76ea8533a4327fa36dfeeaf0863749d5e6dca)

Co-authored-by: Jez Ng <jezng@fb.com>
2024-09-25 18:33:27 -07:00
cbe476a5a7 Disable iOS workflow (#136706)
Disable iOS workflow (#136571)

See https://github.com/pytorch/pytorch/issues/136284
It's been broken for more than a week and it does not seem like anyone cares about fixing it.
Once it's landed, I'll reassign the issue to `oncall: mobile`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136571
Approved by: https://github.com/huydhn, https://github.com/kit1980

(cherry picked from commit 5340feb8aa97fe0a70251a52f2c9b570adf7dbe0)

Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
2024-09-25 18:18:23 -07:00
4b030d47b1 [RELEASE-ONLY CHANGES] Don't push to https://ghcr.io/ (#136703)
Don't push to https://ghcr.io/ on the release branch: we don't need it and it fails with "unauthorized: unauthenticated: User cannot be authenticated with the token provided".
2024-09-25 18:03:12 -07:00
9b80ddecd6 Fix hardcoded ROCm paths in Caffe2Targets.cmake (#136700)
Fix hardcoded ROCm paths in `Caffe2Targets.cmake` (#136283)

Fixes #131701

Use CMake imported targets more consistently to eliminate hardcoded paths.

Here are the new relevant sections of Caffe2Targets.cmake:
```
set_target_properties(c10_hip PROPERTIES
  INTERFACE_INCLUDE_DIRECTORIES "${_IMPORT_PREFIX}/include"
  INTERFACE_LINK_LIBRARIES "c10;hip::amdhip64"
)
```

```
set_target_properties(torch_hip PROPERTIES
  INTERFACE_COMPILE_DEFINITIONS "USE_C10D_NCCL"
  INTERFACE_COMPILE_OPTIONS "-fPIC;-D__HIP_PLATFORM_AMD__=1;-DCUDA_HAS_FP16=1;-DUSE_ROCM;-D__HIP_NO_HALF_OPERATORS__=1;-D__HIP_NO_HALF_CONVERSIONS__=1;-DTORCH_HIP_VERSION=602;-Wno-shift-count-negative;-Wno-shift-count-overflow;-Wno-duplicate-decl-specifier;-DCAFFE2_USE_MIOPEN;-DTHRUST_DEVICE_SYSTEM=THRUST_DEVICE_SYSTEM_HIP;-std=c++17;-DHIPBLAS_V2;-DHIP_NEW_TYPE_ENUMS"
  INTERFACE_INCLUDE_DIRECTORIES "${_IMPORT_PREFIX}/include"
  INTERFACE_LINK_LIBRARIES "c10_hip;torch_cpu_library;hip::amdhip64;MIOpen;hiprtc::hiprtc;roc::hipblaslt;roc::hipblas;hip::hipfft;hip::hiprand;roc::hipsparse;roc::hipsolver"
)
```

The HIPCUB dependency was not actually used, which is why it is removed here; the imported target had undesirable side effects.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/136283
Approved by: https://github.com/jeffdaily, https://github.com/Skylion007, https://github.com/jithunnair-amd, https://github.com/atalman

(cherry picked from commit e8f1dd6ba0675d3e11808e5198b0d927278a6f91)

Co-authored-by: Nichols A. Romero <nick.romero@amd.com>
2024-09-25 17:45:39 -07:00
6e86793f75 [ROCm] upgrade ROCm CI builds to py3.10 (#134108) (#136696)
Upgrade ROCm CI builds to py3.10

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134108
Approved by: https://github.com/jeffdaily, https://github.com/jithunnair-amd, https://github.com/atalman

Co-authored-by: Jack Taylor <jack.taylor@amd.com>
2024-09-25 17:17:30 -07:00
7c550fea95 [ROCm][CI] upgrade CI to ROCm 6.2 (#132555) (#136467)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132555
Approved by: https://github.com/pruthvistony, https://github.com/malfet
2024-09-25 13:42:27 -07:00
7a00785c23 [ROCm] Cherry-pick unit test fixes to release/2.5 (#136557)
* [ROCm] skip test_fp8_cast_and_t on non-MI300 machines (#135917)

Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135917
Approved by: https://github.com/malfet

(cherry picked from commit 6cdc70bccd369d0d7ec4b54d921c91d62bd931f7)

* Skip pointwise associative scan tests due to regression (changes based on PR https://github.com/pytorch/pytorch/pull/135995)

* Cherry-pick fix from https://github.com/pytorch/pytorch/pull/135702

---------

Co-authored-by: Prachi Gupta <prachi.gupta@amd.com>
Co-authored-by: Jithun Nair <jithun.nair@amd.com>
2024-09-25 16:12:47 -04:00
4e6a99e5f3 fix stride compare failed when size value equal to one in ForeachUtils.h (#136426)
fix stride compare failed when size value equal to one in ForeachUtils.h (#134546)

When a size value equals one, the corresponding tensor stride needs to be skipped in the comparison.
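
A small sketch of the case in question (shapes assumed; `_foreach_add_` used only to exercise the foreach path):

```python
import torch

# Two tensors with identical shapes whose strides differ only in a size-1
# dimension; such stride differences should not block the foreach fast path.
a = torch.empty_strided((4, 1), (1, 1)).fill_(1.0)
b = torch.empty_strided((4, 1), (1, 4)).fill_(2.0)
torch._foreach_add_([a], [b])
print(a)  # all 3.0
```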
@ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/134546
Approved by: https://github.com/janeyx99
2024-09-25 12:26:29 -07:00
dd73223b90 [ROCm] [BUGFIX] Re-enable rocm-specific tuning parameters v2 (#133852) (#136139)
Small bug fix - https://github.com/pytorch/pytorch/pull/124592 replaced the torch.version.hip with device_props but made a mistake in porting the original logic.

The original code was:
`if torch.version.hip is not None:`

Which was incorrectly replaced by:
`if self.device_props.type != "hip":`

Another occurrence of https://github.com/pytorch/pytorch/pull/130617

Pull Request resolved: https://github.com/pytorch/pytorch/pull/133852
Approved by: https://github.com/masnesral, https://github.com/malfet

(cherry picked from commit da587de9cb7cdbcd69f65783cd7a9589198bd6f6)
2024-09-25 11:56:24 -07:00
c5e5254a79 Fix xpu memory stats error (#135818) (#136420)
# Motivation
fix https://github.com/pytorch/pytorch/issues/135726
When merging two free blocks, I made the stupid mistake of decreasing the active memory size by the wrong amount: it should be decreased by the original block size instead of the merged block size.

# Additional Context
Add a UT to guard this scenario.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135818
Approved by: https://github.com/EikanWang

(cherry picked from commit e6b68359d7c86aff25eefe77e0774c02b38f44b4)
2024-09-24 13:58:37 -04:00
ffed7b71e8 Fix dynamo benchmark skip logic for cpu device (#135193) (#135793)
Fixes #132380; adjust the torchbench and huggingface skip-model lists so that we can remove `--no-skip` when running benchmarks on the 3 suites.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135193
Approved by: https://github.com/chuanqi129, https://github.com/jansel

(cherry picked from commit 7ec17b49cf89cfeb97272a1baddcc30fa6fa66d8)
2024-09-20 15:47:42 -07:00
becdf8ae4f [Inductor] Increase multiplier to 3 for Inductor AMP FP16 benchmark correctness check (#135932) (#136262)
Fix https://github.com/pytorch/pytorch/issues/135657.
Aligned with AMP BF16, use a multiplier of 3 for the Inductor AMP FP16 benchmark correctness check.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135932
Approved by: https://github.com/CaoE, https://github.com/jgong5, https://github.com/jansel
2024-09-20 15:45:26 -07:00
fb276d2652 [ONNX] Fix numpy method to return the correct type (#136162) (#136203)
[ONNX] Fix numpy method to return the correct type (#136162)

The previous implementation of the `numpy()` method returned `fp64` when the tensor was `fp32`. This is unexpected but seems to be caused by calling `__array__(dtype=None)` on the numpy array. I updated the implementation to define the `numpy()` method explicitly and added tests to guard the behavior.
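
A hedged illustration of the fix's shape (hypothetical wrapper class, not the exporter's actual implementation):

```python
import numpy as np

# Implement numpy() explicitly so the backing array's dtype is preserved,
# instead of relying on __array__(dtype=None), which promoted fp32 to fp64.
class TensorWrapper:
    def __init__(self, data: np.ndarray):
        self._data = data

    def numpy(self) -> np.ndarray:
        return self._data  # original dtype, no copy or promotion

t = TensorWrapper(np.zeros(3, dtype=np.float32))
assert t.numpy().dtype == np.float32
```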

This needs to be cherry-picked into torch 2.5
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136162
Approved by: https://github.com/gramalingam, https://github.com/xadupre

(cherry picked from commit 67b14ce8bd9d4d0ad1920e57bc148644775646ac)
2024-09-20 15:44:05 -07:00
b7de7932fd Update torch-xpu-ops pin (ATen XPU implementation) (#135833)
Update torch-xpu-ops pin (ATen XPU implementation) (#135647)

Release cycle for PyTorch 2.5
1. Fixes a runtime error on Windows: failure to load torch_xpu_ops_unary_binary_kernels.dll because the binary size is too large.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135647
Approved by: https://github.com/EikanWang
2024-09-20 15:39:53 -07:00
1954439802 Update document for autocast on CPU (#136082)
update document for autocast on CPU
2024-09-20 15:28:26 -07:00
3920988456 [Cherry-pick] [ONNX] Update fake mode usage in onnx docs (#135512) and Drop final None values as inputs for nodes in exporter graph (#135520) (#136005)
* [ONNX] Update fake mode usage in onnx docs (#135512)

Update fake mode usage in onnx docs
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135512
Approved by: https://github.com/justinchuby

Co-authored-by: Justin Chu <justinchuby@users.noreply.github.com>
(cherry picked from commit 66db61f0d1b55323af958c3401d74842dc00b0c7)

* [ONNX] Drop final None values as inputs for nodes in exporter graph (#135520)

When the value for an optional input is not provided, it is defaulted to `None`, which gets translated to "" in the ONNX graph. To avoid this, if we have a list of inputs and the final few are all `None`, strip them in the graph.
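
A minimal sketch of the trimming rule (hypothetical helper name):

```python
# Drop trailing None values from a node's input list so they do not become
# empty-string ("") inputs in the ONNX graph; interior Nones are kept.
def drop_trailing_nones(inputs: list) -> list:
    while inputs and inputs[-1] is None:
        inputs = inputs[:-1]
    return inputs

assert drop_trailing_nones(["x", None, "y", None, None]) == ["x", None, "y"]
```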
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135520
Approved by: https://github.com/justinchuby

Co-authored-by: Justin Chu <justinchuby@users.noreply.github.com>
(cherry picked from commit e2f9a83b85af1bca54d97d2cc1259d72bf9579e5)
2024-09-20 15:25:25 -07:00
1db2a6562c [DCP] Fixes the stateless optimizer issue of distributed state_dict (… (#136000)
[DCP] Fixes the stateless optimizer issue of distributed state_dict (#135535)

Some optimizers don't have states, which can cause get_state_dict/set_state_dict to behave incorrectly. This PR fixes the issues.

fixes: https://github.com/pytorch/pytorch/issues/133415

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135535
Approved by: https://github.com/wz337

Co-authored-by: Chien-Chin Huang <chienchin@fb.com>
2024-09-20 15:24:10 -07:00
4b5bf41476 [inductor] [cpp] fix the input contiguous check in max-autotune (#135561)
[inductor] [cpp] fix the input contiguous check in max-autotune (#134982)

## Description
Fixes the FP32 accuracy failure of `resmlp_12_224` and BF16 accuracy failure of `volo_d1_224` in timm.

In this PR, we check whether the input is contiguous in the following way:
If it has a `FixedLayout`, we know the accurate strides. For a `FlexibleLayout`, if its data is a `ComputedBuffer`, we can get the fill order of the buffer to decide whether it's contiguous. For the other cases, we won't use the GEMM template, as we can't infer whether the input is contiguous.
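
A hedged sketch of that decision rule (hypothetical helper and predicates; not the actual inductor code):

```python
from typing import Optional

def input_is_contiguous(node) -> Optional[bool]:
    layout = node.get_layout()
    if is_fixed(layout):            # FixedLayout: strides are accurate
        return layout.stride[-1] == 1
    if is_flexible(layout) and is_computed_buffer(node.data):
        # FlexibleLayout over a ComputedBuffer: infer from the fill order
        return fill_order_is_contiguous(node)
    return None                     # unknown: do not use the GEMM template
```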

## Additional context
The current GEMM template only supports this case: `input.get_stride()[-1] == 1`. In `resmlp_12_224`, when we run into this check, the layout of `input` is a `FlexibleLayout`. The reason is that when realizing the input which is a `View` IR, the `convert_to_reinterpret_view` call fails:
d14fe3ffed/torch/_inductor/ir.py (L4712-L4715)

And it finally runs into this `copy_input` and returns a `FlexibleLayout`.
d14fe3ffed/torch/_inductor/ir.py (L4722)

When checking its stride, this `FlexibleLayout` indeed satisfies `input.get_stride()[-1] == 1` but it is later decided as a `FixedLayout` with `size = (3072, 196), stride = (1, 3072)`, which is not supported by the GEMM template, thus causing accuracy issue in this model.
The `FlexibleLayout` is converted to `FixedLayout` during [CppPackedGemmTemplate.add_choices](d14fe3ffed/torch/_inductor/mkldnn_lowerings.py (L1051)) which calls [slice_nd](d14fe3ffed/torch/_inductor/codegen/cpp_template_kernel.py (L150)) when rendering the kernel (`slice_nd(X)`). When creating the `SliceView` IR, [as_storage_and_layout](d14fe3ffed/torch/_inductor/ir.py (L2288)) invokes
[decide_layout](d14fe3ffed/torch/_inductor/ir.py (L2135)) and converts it to a `FixedLayout` with `size = (3072, 196), stride = (1, 3072)`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134982
Approved by: https://github.com/jgong5, https://github.com/leslie-fang-intel, https://github.com/jansel
2024-09-20 15:20:16 -07:00
a889c85498 [Inductor] simplify indexing_exprs in LoopBody._init_with_copy (#135574) (#135935)
This PR uses `var_ranges` information to simplify `indexing_exprs` in `LoopBody._init_with_copy` to reduce occurrences of `FloorDiv` and `ModularIndexing` in the `indexing_exprs`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135574
Approved by: https://github.com/jgong5, https://github.com/leslie-fang-intel, https://github.com/jansel
2024-09-19 17:03:24 -07:00
813e06461c [ONNX] Fix symbolic values and numpy implementation (#135786) (#135868)
1. Remove `__eq__` to make `SymbolicTensor` hashable and test for that (see the background sketch after this list)
2. Update the `__array__` method so that it works for tensor on GPU
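
Background for point 1: in Python, defining `__eq__` without `__hash__` sets `__hash__` to `None`, making instances unhashable.

```python
# Defining __eq__ without __hash__ makes a class unhashable.
class WithEq:
    def __eq__(self, other):
        return isinstance(other, WithEq)

try:
    hash(WithEq())
except TypeError as e:
    print(e)  # unhashable type: 'WithEq'
```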

Fixes https://github.com/pytorch/pytorch/issues/135700
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135786
Approved by: https://github.com/titaiwangms
2024-09-19 16:58:05 -07:00
9887030485 Revert "[fx] Bypass custom __setattr__ in Node.__init__ (#135079)" (#… (#135625)
Revert "[fx] Bypass custom __setattr__ in Node.__init__ (#135079)" (#135562)

This reverts commit 66da3b3b2acacb116a9b23e91b24934830eaf6b8.

#135079 breaks internal tests and needs to be reverted. Revert with mergebot doesn't work as this PR is technically part of the stack, but, according to @jansel, it should be possible to revert it individually.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135562
Approved by: https://github.com/jansel, https://github.com/seemethere

Co-authored-by: Ivan Zaitsev <ivanzaitsev@fb.com>
2024-09-19 16:50:49 -07:00
9e315fef22 Revert "[Release only] Temporary disable triton xpu build" (#136276)
Revert "[Release only] Temporary disable triton xpu build (#136206)"

This reverts commit 6b14e6cfddc62f71a65f250c58d2482fdf2126b4.
2024-09-18 11:17:52 -07:00
6b14e6cfdd [Release only] Temporary disable triton xpu build (#136206) 2024-09-17 12:12:09 -07:00
828d686e1c [ONNX] Fix scaled_dot_product_attention with float scale (#135710)
Fixes #125158

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135594
Approved by: https://github.com/justinchuby

(cherry picked from commit e48ee2cf50d86d87ef7c7d0839267dbed4903ebf)
2024-09-12 09:50:58 -07:00
612fc7c447 [ONNX] Improves documentation of ONNX exporter (#135526)
The PR updates the documentation to reflect the changes introduced in PyTorch 2.5 related to the ONNX exporter.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135372
Approved by: https://github.com/justinchuby

Co-authored-by: Justin Chu <justinchuby@users.noreply.github.com>
(cherry picked from commit 5e145861f23370d342125118dd9b0dcba26147db)

Co-authored-by: Xavier Dupré <xadupre@users.noreply.github.com>
2024-09-12 09:49:45 -07:00
cea562006e [Cherry-Pick] Bump triton pin and release version, revert temporary changes to build from pin (#135613)
* Revert "[RELEASE-ONLY CHANGES] Temp changes to build triton from pin for first RC (#135517)"

This reverts commit 4a3dabd67f8ce63f2fc45f278421cca3cc532cfe.

* Build triton from release branch

* triton_pin

* fix

* Bump triton xpu pin and release version (#135638)

Similar with https://github.com/pytorch/pytorch/pull/135627

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135638
Approved by: https://github.com/atalman

---------

Co-authored-by: chuanqiw <chuanqi.wang@intel.com>
2024-09-11 15:03:44 -04:00
ba27502501 Use upload-artifact@v4.4.0 for create_release.yml (#135534)
Use upload-artifact@v4.4.0 for create_release.yml (#135528)

Fixes failure: https://github.com/pytorch/pytorch/actions/runs/10780281005/job/29895846007

Due to a broken sync between
```
actions/upload-artifact@v2
and
actions/download-artifact@v4.1.7
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135528
Approved by: https://github.com/kit1980, https://github.com/malfet

(cherry picked from commit 9b764491e32755d64efd4e7ff1c90cf587f3e3ff)

Co-authored-by: atalman <atalman@fb.com>
2024-09-10 14:43:16 -04:00
4a3dabd67f [RELEASE-ONLY CHANGES] Temp changes to build triton from pin for first RC (#135517)
[Release only] Temp changes to build triton from pin
2024-09-09 15:08:05 -04:00
e130696270 [RELEASE-ONLY CHANGES] Branch Cut for Release 2.5 (#135506)
* [RELEASE-ONLY CHANGES] Branch Cut for Release 2.5

* fix_lint
2024-09-09 14:05:58 -04:00
b7eb7256fb docs: torch.nn.utils.rnn.pack_padded_sequence: improve docs (#135417)
docs: `torch.nn.utils.rnn.pack_padded_sequence`: improve docs

/cc @mikaylagawarecki
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135417
Approved by: https://github.com/ezyang
2024-09-09 03:16:11 +00:00
c1ae78be92 [inductor] calibration inductor windows uts (18/N) (#135449)
Skip the test_quantized_* UTs of `test/inductor/test_cpu_select_algorithm.py`;
Windows inductor doesn't support quantization so far.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135449
Approved by: https://github.com/ezyang
2024-09-09 03:10:54 +00:00
defb515306 [NJT]Add permute ops support (#135336)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135336
Approved by: https://github.com/davidberard98
2024-09-08 21:00:41 +00:00
31c4e0d37d [inductor] Cleanup analysis done at lowering time (#135412)
Before this we would take multiple passes over the body of each IRNode as we did lowering.  This combines most analysis into `OpCounterCSE` so it can be done in a single pass.

Before:
![image](https://github.com/user-attachments/assets/0047db09-4258-4491-a9a6-b078e183092a)

After:
![image](https://github.com/user-attachments/assets/1e03adcb-8303-4bb1-8bbb-cc42dacd44d7)

This stack:
![image](https://github.com/user-attachments/assets/d6b50b24-c30c-4d23-8b1a-344b3ba65d7a)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135412
Approved by: https://github.com/oulgen
ghstack dependencies: #135286, #135306, #135377, #135400
2024-09-08 18:02:36 +00:00
53290ca00b [inductor] Refactor BaseSchedulerNode.__init__ (#135400)
Might be a small compile time improvement since we remove a call to extract_read_writes().

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135400
Approved by: https://github.com/oulgen
ghstack dependencies: #135286, #135306, #135377
2024-09-08 18:02:36 +00:00
16f5155992 [inductor] Fast path for extract_read_writes without tracing (#135377)
Before (bottom of stack):
![image](https://github.com/user-attachments/assets/13060ff9-b31d-42a9-8e8f-c50b2bf3dc2f)

After (this PR):
![image](https://github.com/user-attachments/assets/7d190821-b614-46b7-9e9e-9087443df654)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135377
Approved by: https://github.com/oulgen
ghstack dependencies: #135286, #135306
2024-09-08 18:02:32 +00:00
37144be03d [inductor] Remove ReadWrites.op_counts (#135306)
This was (almost) unused.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135306
Approved by: https://github.com/oulgen
ghstack dependencies: #135286
2024-09-08 18:02:28 +00:00
3bdc54ed18 [inductor] Refactor LoopBody.memory_usage (#135286)
This is preparing for some other changes where I speed up extract_read_writes tracing.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135286
Approved by: https://github.com/oulgen
2024-09-08 18:02:24 +00:00
cyy
2196f32475 [22/N] Fix clang-tidy warnings in jit (#135319)
Follows #134537
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135319
Approved by: https://github.com/titaiwangms
2024-09-08 17:18:29 +00:00
cfc227ad43 [reland][dtensor] move DTensor to public namespace (#134203)
reland of https://github.com/pytorch/pytorch/pull/133113

I have to create a new PR because the previous reverted PR could not either be rebased, or imported successfully :(

----

Moving DTensor to be in the public namespace, to formally add the documentation page that includes all the public APIs. This includes:

* many path renames and path import fixes
* a dedicated doc page without too much content yet (adding in the next PRs)
* To preserve the BC for users still using the torch.distributed._tensor, I added a shim script to redirect old path calls to the new module

The BC preserving is evidented by the fact that all DTensor tests are still working without changing the public imports. So it's safe to land the changes

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134203
Approved by: https://github.com/tianyu-l
2024-09-08 17:08:40 +00:00
20cab91a12 [dynamo] Remove skip from jit freeze tests (#135281)
Fixes https://github.com/pytorch/pytorch/issues/119781
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135281
Approved by: https://github.com/zou3519
2024-09-08 15:11:12 +00:00
a6fae2e811 Use BRGEMM for Half flash attention forward kernel (#131879)
Use oneDNN BRGEMM on packed data to get better performance on the 5th generation of Xeon where Intel® Advanced Matrix Extensions (AMX) will have fp16 support, e.g. amx-fp16.
Multiple models have achieved acceleration, for instance, FP16 stable diffusion v2.1 has achieved over 50% improvement.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131879
Approved by: https://github.com/jgong5, https://github.com/peterbell10
ghstack dependencies: #131878
2024-09-08 12:32:23 +00:00
042f2f7746 [ONNX] Re-raise the exception if the dynamic shapes cannot be refined (#135418)
Improve error reporting. Otherwise, most of the time users will just see that the shapes could not be refined, without the underlying cause.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135418
Approved by: https://github.com/titaiwangms
2024-09-08 05:30:34 +00:00
fd494dd426 Change wrapped_linear_prepack and wrapped_quantized_linear_prepacked to private by adding _ as prefix (#135401)
Summary: In https://github.com/pytorch/pytorch/pull/134232, we added two new ops wrapped_linear_prepack and wrapped_quantized_linear_prepacked. From the review comments and offline discussion, we are changing them to private by adding `_` as prefix

Differential Revision: D62325142

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135401
Approved by: https://github.com/houseroad
2024-09-08 04:16:24 +00:00
8334cb2fb9 remove commented out breakpoints (#135363)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135363
Approved by: https://github.com/oulgen
2024-09-08 02:15:45 +00:00
e72ed4717e [Dynamo] Fix Huggingface PretrainedConfig get non const attr (#135413)
Fixes #135329

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135413
Approved by: https://github.com/anijain2305
2024-09-07 19:16:29 +00:00
3bebc09be9 [FlexAttention] Align the matmul tensorcore usage (#135168)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135168
Approved by: https://github.com/Chillee
2024-09-07 16:33:41 +00:00
a2db22e6bb [inductor] Catch BrokenProcessPool and print a more helpful message. (#135120)
Summary: BrokenProcessPool means a parallel-compile subprocess exited, which we never expect. It's likely due to a crash, so print a more meaningful error message and instructions that it's probably easier to debug by turning off parallel compile. Output looks like:
```
...
  File "/data/users/slarsen/pytorch/torch/_inductor/runtime/compile_tasks.py", line 45, in _reload_python_module
    exec(code, mod.__dict__, mod.__dict__)
  File "/tmp/torchinductor_slarsen/4q/c4qw7xk5lbb7whg5txnk4hwbc7z6kepak3o666tr3d64gcad5r5b.py", line 815, in <module>
    async_compile.wait(globals())
  File "/data/users/slarsen/pytorch/torch/_inductor/async_compile.py", line 265, in wait
    raise RuntimeError(
RuntimeError: A compilation subprocess exited unexpectedly. This is likely due to a crash. To facilitate debugging, you can re-run with TORCHINDUCTOR_COMPILE_THREADS=1 to cause compilation to occur in the main process.
```
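
A hedged sketch of the pattern (not inductor's actual code):

```python
from concurrent.futures.process import BrokenProcessPool

# Translate a worker crash into an actionable error, as the commit describes.
def wait_all(futures):
    try:
        return [f.result() for f in futures]
    except BrokenProcessPool as exc:
        raise RuntimeError(
            "A compilation subprocess exited unexpectedly, likely due to a crash. "
            "Re-run with TORCHINDUCTOR_COMPILE_THREADS=1 to compile in the main process."
        ) from exc
```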

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135120
Approved by: https://github.com/Chillee
2024-09-07 16:33:37 +00:00
eac5e12548 [inductor] Move LoopBody to its own file (#135257)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135257
Approved by: https://github.com/oulgen
2024-09-07 16:29:15 +00:00
18479c5f70 [Doc] update max-autotune for CPU (#134986)
The current doc for `max-autotune` is applicable only for GPU. This PR adds the corresponding content for CPU.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134986
Approved by: https://github.com/jgong5, https://github.com/malfet

Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
2024-09-07 13:42:40 +00:00
f7c0c06692 Add oneDNN BRGEMM support on CPU (#131878)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/131878
Approved by: https://github.com/jgong5, https://github.com/peterbell10
2024-09-07 13:22:30 +00:00
b53d97c7be [Intel GPU] Add XPU memory-related APIs (#129919)
# Motivation
According to https://github.com/pytorch/pytorch/issues/116322, we will help unify the device allocator, so we first introduce a simple XPU device allocator with only the key functionality, and expect to add memory-statistics-related functionality after the unification.
But now, some memory-statistics-related APIs listed in https://github.com/pytorch/pytorch/issues/127929 have been requested. We need more time to unify the device allocator, so in order to improve the user experience we support these memory-statistics-related APIs before the unification.

# Additional Context
Fixes: #127929
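
A hedged usage sketch of such APIs (assumes an XPU device and that the CUDA-style names carry over; treat the exact names as assumptions):

```python
import torch

if torch.xpu.is_available():
    x = torch.empty(1024, 1024, device="xpu")
    print(torch.xpu.memory_allocated())      # bytes currently allocated
    print(torch.xpu.max_memory_allocated())  # peak allocation so far
    del x
    torch.xpu.empty_cache()                  # release cached blocks
```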

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129919
Approved by: https://github.com/dvrogozh, https://github.com/abhilash1910, https://github.com/gujinghui, https://github.com/EikanWang, https://github.com/albanD
ghstack dependencies: #130923
2024-09-07 11:15:17 +00:00
6c1da66407 [Reland] Refactor caching device allocator utils (#130923)
# Motivation
Following [[RFC] Intel GPU Runtime Upstreaming for Allocator ](https://github.com/pytorch/pytorch/issues/116322), this PR aims to refactor caching device allocator utils to improve code reuse usage.
This is the first PR, we could prepare some follow-up PRs continuing to refactor the device caching allocator.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130923
Approved by: https://github.com/EikanWang, https://github.com/gujinghui, https://github.com/albanD, https://github.com/eqy
2024-09-07 11:14:17 +00:00
d7c97e7245 [inductor][cpp][gemm] cache blocking config for dynamic shapes (#133538)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133538
Approved by: https://github.com/leslie-fang-intel
ghstack dependencies: #135277, #133447

Co-authored-by: Wu, Chunyuan <chunyuan.wu@intel.com>
2024-09-07 11:09:30 +00:00
be9f4ffe88 [inductor][cpp][gemm] enable dynamic M for k-slicing (#133447)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133447
Approved by: https://github.com/leslie-fang-intel
ghstack dependencies: #135277

Co-authored-by: Wu, Chunyuan <chunyuan.wu@intel.com>
2024-09-07 11:09:30 +00:00
692faa9bc6 [inductor][cpp][gemm] reduce memory alloc overhead by allocating local acc once per thread (#135277)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135277
Approved by: https://github.com/leslie-fang-intel

Co-authored-by: Wu, Chunyuan <chunyuan.wu@intel.com>
2024-09-07 11:09:25 +00:00
32f3af72b7 [ONNX] Support FakeTensor in ONNXProgram (#135399)
Sync with https://github.com/justinchuby/torch-onnx/compare/v0.1.20...v0.1.21 to support FakeTensors in ONNXProgram. Specifically, this PR implements the `apply_weights` method to allow users to supply a dictionary of concrete tensors to replace FakeTensors in the exported model weights.

An error is raised when users try to serialize a FakeTensor to avoid segfaults.

Also fixed a bug in `.save()` when `keep_initializers_as_inputs` is True and `include_initializers` is False.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135399
Approved by: https://github.com/titaiwangms
2024-09-07 04:48:18 +00:00
ebab5c85c4 [FlexAttention] Skip very small block size unit tests on H100 due to Triton bug (#135393)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135393
Approved by: https://github.com/BoyuanFeng
2024-09-07 04:35:22 +00:00
3d734d837b [ONNX] Handle mixed sequence inputs properly (#135378)
Previously, when an input contains a mixture of `Value` and python constants like `[SymbolicTensor('sym_size_int_3', type=Tensor(INT64), shape=[], producer=node_Shape_0, index=0), 512]`, we get errors like

```pytb
Traceback (most recent call last):
  File "/Users/justinc/Documents/GitHub/torch-onnx/src/torch_onnx/_building.py", line 367, in _call_op
    converted_named_inputs = _process_python_constants_and_sequences(
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/justinc/Documents/GitHub/torch-onnx/src/torch_onnx/_building.py", line 275, in _process_python_constants_and_sequences
    raise TypeError(
TypeError: Constant input '[SymbolicTensor('sym_size_int_3', type=Tensor(INT64), shape=[], producer=node_Shape_0, index=0), 512]' of type '<class 'list'>' is not supported
```

This PR updates Sequence handling to support this case, as well as variadic inputs and ONNX Sequence inputs.

Synced from https://github.com/justinchuby/torch-onnx/pull/187
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135378
Approved by: https://github.com/titaiwangms
2024-09-07 03:07:39 +00:00
c92227c41a [quant][pt2e] fix placeholder typo and related quantization tests (#135379)
A previous typo in "placeholder" and the related quantization tests are fixed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135379
Approved by: https://github.com/jerryzh168
2024-09-07 02:31:43 +00:00
e6a0221fc6 [Inductor] Optionally allow padding on non-GPU devices (#135280)
This is the OSS component of a larger MTIA diff.

Currently, Inductor disables padding for non-GPU devices. We need to change this behavior to enable padding on MTIA.

This PR adds a config option to enable padding on the CPU, or any other non-GPU device. In the future, we might want to enable padding on all devices by default. However, that might require supporting device-dependent padding defaults, since CPUs will likely use different settings than H100 GPUs.

Differential Revision: D61038114

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135280
Approved by: https://github.com/jfix71, https://github.com/shunting314
2024-09-07 02:19:14 +00:00
a6b9d444fb [ONNX] Refactor exporter errors (#135180)
Refactor exporter errors to combine old errors and new errors for API consistency.

This PR also

1. Removes the `_C._check_onnx_proto(proto)` call in the old exporter. We don't need the ONNX checker because it is limited.
2. Removes the `OnnxExporterError` defined in the dynamo module. This class unnecessarily stores the onnx program object, making it very bulky. Instead, we revert to use the plain OnnxExporterError defined in the `errors` module and use it as the base class for all errors.
3. Continues to expose `OnnxExporterError` in `torch.onnx` and the rest of the errors in `torch.onnx.errors`.
4. Removes the `CheckerError` and `InvalidExportOptionsError` from `torch.onnx`. This is BC breaking but should have low impact.
5. I did not rename existing errors out of compatibility considerations, even though `ExporterError` would have been more succinct.

Fixes https://github.com/pytorch/pytorch/issues/135125
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135180
Approved by: https://github.com/titaiwangms
2024-09-07 00:50:15 +00:00
d42b0c8f22 Add release matrix for 2.5 (#135383)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135383
Approved by: https://github.com/huydhn
2024-09-07 00:49:53 +00:00
941d094dd1 [Dynamo][DTensor] Fixes SymNodeVariable() is not a constant error in Compiled DDP + TP unit test (#135315)
Before the fix, the unit test will fail at forward Dynamo tracing:
```
  File "/data/users/willfeng/pytorch/test/distributed/_composable/test_replicate_with_compiler.py", line 415, in test_ddp_tp
    loss = compiled_replicate_model(data).sum()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
...
torch._dynamo.exc.InternalTorchDynamoError: SymNodeVariable() is not a constant

from user code:
   File "/data/users/willfeng/pytorch/torch/distributed/tensor/parallel/_data_parallel_utils.py", line 34, in _unflatten_tensor
    result = DTensor.from_local(
```
After the fix, the compilation fails at a later step (Compiled Autograd tracing), due to needing "pre-dispatch tracing of backward graph" feature (see details at https://github.com/pytorch/pytorch/issues/127797#issuecomment-2291695474).

I believe this PR is a net improvement, because it should also fix the 1D Traceable FSDP2 failure case on internal models (https://github.com/pytorch/pytorch/issues/130978#issuecomment-2319476690), which is much harder to build a minimal unit test for.

Fixes https://github.com/pytorch/pytorch/issues/130978.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135315
Approved by: https://github.com/bdhirsh
2024-09-07 00:11:25 +00:00
b1a934741e Change test_constant_prop_preserve_metadata (#135268)
Summary: In the new export_for_training, "stack_trace" does not exist in node meta anymore.

Test Plan:
```
buck run fbcode//mode/dev-nosan fbcode//caffe2/test:quantization_pt2e -- -r test_constant_prop_preserve_metadata
```

Reviewed By: angelayi

Differential Revision: D62219974

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135268
Approved by: https://github.com/angelayi
2024-09-07 00:02:35 +00:00
0c661f3e1a [Split Build] Refactor split build binary builds into their own workflows and move split build binary builds to periodic (#134624)
Since we need to move the split build binary tests from trunk to periodic, this PR refactors those jobs into their own workflow.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134624
Approved by: https://github.com/malfet
2024-09-06 23:57:56 +00:00
2c7e314803 [Inductor][CPP] Fix the issue of view dtype (#135301)
**Summary**
Fix issue: https://github.com/pytorch/pytorch/issues/135160, it's a regression introduced by https://github.com/pytorch/pytorch/pull/134569, where the dtype of `to_dtype_bitcast` was incorrectly handled when using the scalarize implementation.

**TestPlan**
```
python -u -m pytest -s -v test/inductor/test_cpu_repro.py -k test_view_dtype
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135301
Approved by: https://github.com/jgong5, https://github.com/jansel
2024-09-06 23:36:44 +00:00
ead4407f57 [inductor] Fix loop split optimization (#135303)
Fix https://github.com/pytorch/pytorch/issues/135274.

Improve the check for whether the div expr matches by also checking that `split_var` is in `original_body.iter_vars`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135303
Approved by: https://github.com/jgong5, https://github.com/leslie-fang-intel, https://github.com/jansel
2024-09-06 23:06:25 +00:00
2f5b40c099 [aoti test] Disable FP8 funz dtypes in fp8 runtime check test (#135373)
Fixing https://github.com/pytorch/pytorch/issues/126734

The key point is that the fnuz FP8 types are AMD-only.

source: https://github.com/openxla/stablehlo/blob/main/rfcs/20230321-fp8_fnuz.md

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135373
Approved by: https://github.com/chenyang78
2024-09-06 23:05:47 +00:00
993b5647ab [export] fix placeholder name collision tests by removing map call (#135366)
The current test is failing because of the unstable state of map: torch.compile and non-strict export take two separate routes, unlike cond and while_loop. This PR fixes the test itself. We'll fix map in follow-up PRs.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135366
Approved by: https://github.com/angelayi
2024-09-06 22:02:50 +00:00
2ab26806f1 Require tlparse for failing tests in test_structured_trace.py (#135376)
Summary: These tests are currently failing internally. Per discussion, skip if tlparse is unavailable

Test Plan:
```
feature remove tlparse
buck2 test 'fbcode//mode/opt' fbcode//caffe2/test/dynamo:test_dynamo -- --run-disabled --regex test_structured_trace.py
feature install tlparse
buck2 test 'fbcode//mode/opt' fbcode//caffe2/test/dynamo:test_dynamo -- --run-disabled --regex test_structured_trace.py
```

Differential Revision: D62310342

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135376
Approved by: https://github.com/ezyang
2024-09-06 21:53:41 +00:00
b1612569f6 [BE] Clarify defaulting behavior in optimizer (#135384)
Fixes #135340

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135384
Approved by: https://github.com/drisspg, https://github.com/jainapurva
2024-09-06 21:52:55 +00:00
dc0e818738 [FR] Automatically infer a common filename prefix (#135158)
Save the annoyance of specifying this on the command line each time
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135158
Approved by: https://github.com/fduwjj, https://github.com/c-p-i-o
ghstack dependencies: #135157
2024-09-06 21:44:27 +00:00
06e414d7fe [FR] Make trace_dir a required argument (#135157)
Ensures users get a clean error if they forget to specify the dir, and
improves the help message.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135157
Approved by: https://github.com/c-p-i-o, https://github.com/fduwjj
2024-09-06 21:44:27 +00:00
a681260caf Revert "[ONNX] Refactor exporter errors (#135180)"
This reverts commit 5eebd9315a72422d59b6f8d8ca8e4e573e231d5c.

Reverted https://github.com/pytorch/pytorch/pull/135180 on behalf of https://github.com/clee2000 due to I think this broke test_public_bindings.py::TestPublicBindings::test_correct_module_names [GH job link](https://github.com/pytorch/pytorch/actions/runs/10743909338/job/29800779403) [HUD commit link](5eebd9315a), possibly a landrace with the PR that landed before it ([comment](https://github.com/pytorch/pytorch/pull/135180#issuecomment-2334844191))
2024-09-06 21:39:18 +00:00
95e976a63f [dynamo] recursively skip frames when Dynamo cache limit is hit (#135144)
Fixes https://github.com/pytorch/pytorch/pull/135144 and [T197117723](https://www.internalfb.com/intern/tasks/?t=197117723).

In general, adds `SkipCodeRecursiveException` to Dynamo - when raised in Dynamo, convert_frame will return a `skip_code_recursive_flag` back to C Dynamo, signaling it to skip the current frame and all recursive calls.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135144
Approved by: https://github.com/jansel, https://github.com/anijain2305
2024-09-06 21:38:53 +00:00
306ac44eaa [ez][TD] Fix request for issue body returns None (#135389)
I assumed it would be an empty string if the body is empty, but it's just None.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135389
Approved by: https://github.com/malfet
2024-09-06 21:02:01 +00:00
a7643baceb Revert expectFailureIf condition on tests with torch.compile on Windows (#134759)
Fixes #134716

This PR reverts some changes introduced in 6eae569546 (#133987)

torch.compile is not available on Windows, so these tests should be expected to fail.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134759
Approved by: https://github.com/malfet
2024-09-06 20:51:55 +00:00
a4030e37be [dynamo] reland map/zip iterator related changes (#135074)
Differential Revision: [D62211019](https://our.internmc.facebook.com/intern/diff/D62211019)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135074
Approved by: https://github.com/jansel, https://github.com/anijain2305, https://github.com/mlazos
2024-09-06 20:38:02 +00:00
22e1fb6faa [test][easy] Add debug utils for cpu select algorithm test (#135038)
Summary: Add debug utils to debug a flaky test in fbcode ci.

Some context: https://github.com/pytorch/pytorch/pull/126545

Test Plan: ci

Differential Revision: D62005445

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135038
Approved by: https://github.com/jgong5, https://github.com/XuehaiPan
2024-09-06 20:30:49 +00:00
2a4890e315 [ONNX] Clean up the missed lines from previous PRs (#135368)
Deletes some lines that were missed in previous PRs.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135368
Approved by: https://github.com/justinchuby
2024-09-06 20:27:52 +00:00
3ce433aef2 [TCPStore] use wait counters (#135283)
This replaces the existing TCPStore counters with the new shared wait counters. There are no users of the TCPStore counters, so they should be completely safe to remove.

Test plan:

Existing tests + build

There's no OSS backend for wait counters, so we can't write any tests with them currently.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135283
Approved by: https://github.com/c-p-i-o
2024-09-06 19:54:25 +00:00
7f2d20e687 Run all autograd node post hooks (#134728)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/134728
Approved by: https://github.com/albanD, https://github.com/soulitzer
2024-09-06 19:44:28 +00:00
32fd29c1ea [ONNX] Properly handle Attributes in traceable functions (#135367)
Previously, the attributes were sent in as Attr objects even when the function was called as a plain Python function. This PR turns them into Python objects.

From https://github.com/justinchuby/torch-onnx/pull/186
Related https://github.com/microsoft/onnxscript/issues/1846

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135367
Approved by: https://github.com/justinchuby
2024-09-06 19:35:22 +00:00
5eebd9315a [ONNX] Refactor exporter errors (#135180)
Refactor exporter errors to combine old errors and new errors for API consistency.

This PR also

1. Removes the `_C._check_onnx_proto(proto)` call in the old exporter. We don't need the ONNX checker because it is limited.
2. Removes the `OnnxExporterError` defined in the dynamo module. This class unnecessarily stores the onnx program object, making it very bulky. Instead, we revert to using the plain OnnxExporterError defined in the `errors` module and use it as the base class for all errors.
3. Continues to expose `OnnxExporterError` in `torch.onnx` and the rest of the errors in `torch.onnx.errors`.
4. Removes the `CheckerError` and `InvalidExportOptionsError` from `torch.onnx`. This is BC breaking but should have low impact.
5. I did not rename existing errors out of compatibility considerations, even though `ExporterError` would have been more succinct.

Fixes https://github.com/pytorch/pytorch/issues/135125
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135180
Approved by: https://github.com/titaiwangms
2024-09-06 19:10:56 +00:00
a15aabc975 Add MaskedTensor passthrough: unfold, F.Unfold, F.Fold, stack (#125262)
Hi,
I noticed the `unfold` operator was missing on MaskedTensor.

I tested that my change works when calling unfold and backward on a `MaskedTensor`, but I didn't find the tests for the dispatch of such an operation. Where are they?
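
A minimal sketch of the new passthrough based on the description above (MaskedTensor is a prototype API, so details may shift):

```python
import torch
from torch.masked import masked_tensor

data = torch.arange(6.0)
mask = torch.tensor([True, False, True, True, False, True])
mt = masked_tensor(data, mask, requires_grad=True)

out = mt.unfold(0, 2, 2)  # unfold now passes through to data and mask
out.sum().backward()      # backward works, as described above
```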
Pull Request resolved: https://github.com/pytorch/pytorch/pull/125262
Approved by: https://github.com/cpuhrsch
2024-09-06 19:06:23 +00:00
b143426db3 [Inductor] Use argument names as the key for the constants dict and the signature dict (#135170)
Referencing how triton constructs these dictionaries

ca3fb5f6fa/python/triton/runtime/jit.py (L639)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135170
Approved by: https://github.com/htyu
2024-09-06 19:05:00 +00:00
13ba0a2e5c Run bypassed graph compile outside the except block to avoid chaining of exceptions (#135175)
Fixes #135172

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135175
Approved by: https://github.com/masnesral, https://github.com/ezyang
2024-09-06 19:03:57 +00:00
8520ce5f78 Fix incorrect trace of post-accumulate grad hook on tensor with zero dims (#135226)
Fix incorrect trace of post-accumulate grad hook on tensor with zero dimensions

Fixes #135207

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135226
Approved by: https://github.com/xmfan
2024-09-06 18:19:54 +00:00
196748d491 [elastic] support local_addr across all rendezvous impls (#135262)
Summary:
There was a regression introduced in https://github.com/pytorch/pytorch/pull/125743 that caused `local_addr` to no longer be used. This fixes that by passing `local_addr` to `RendezvousStoreInfo.build` everywhere it's used.

This also fixes a number of tests, allowing them to run in parallel, which hugely sped up the testing cycle, as this change touches many different rendezvous implementations. This required a few fixes in unrelated tests.

Test Plan:
Added tests for the common rendezvous implementations that exercise `local_addr`, to prevent future regressions.

```
buck2 test @//mode/dev-nosan fbcode//caffe2/test/distributed/elastic/... fbcode//caffe2/torch/distributed/elastic/... -- --stress-runs 3
```

To vet the parallelism changes I also ran with 3 stress runs each to identify flakiness caused by parallelism.

Differential Revision: D62256407

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135262
Approved by: https://github.com/fduwjj, https://github.com/wz337
2024-09-06 17:55:43 +00:00
177e4f4218 remove _check call on item() for torch.istft (#135234)
Fixes #135014

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135234
Approved by: https://github.com/tugsbayasgalan
2024-09-06 17:31:25 +00:00
3988b3468b [aoti][easy] remove breakpoint() in wrapper.py (#134807)
Differential Revision: D61687146

Remove an unintended breakpoint in code.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134807
Approved by: https://github.com/YUNQIUGUO
2024-09-06 17:25:05 +00:00
04118d8617 [export] Record the global torch version in serialization. (#135243)
Summary: In general I think it will be useful to also record the global torch version in the EP, so that we can track them in the logging in addition to the schema version.

Test Plan: CI

Reviewed By: henryoier

Differential Revision: D62252626

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135243
Approved by: https://github.com/yushangdi
2024-09-06 17:02:06 +00:00
24482e5c68 [torch][fx] Set maximum warning count during fx.Graph.lint (#135069)
Summary:
resnet152 spent about 15 minutes writing warning messages in _unlift
during `to_executorch` because they're all written to unbuffered stderr
by the `warnings` module.

These warnings are almost always about get_attr nodes referencing a
non-existent name:
```py
warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does '
  'not reference an nn.Module, nn.Parameter, or buffer, which is '
  'what \'get_attr\' Nodes typically target'
)
```
I'm not aware of a way to configure the warnings module to write this out
at most once, so I'm just going to disable the lint for now.

Test Plan:
Re-ran resnet152 with Executorch and the XNNPackBackend, it is much faster now

Differential Revision: D62156090

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135069
Approved by: https://github.com/yushangdi
2024-09-06 16:41:59 +00:00
c0ec599f27 Update submodule ideep to include aarch64 change (#134897)
This PR is per ARM request, which is in https://github.com/intel/ideep/issues/334.

Context for the request is: Arm team has upstreamed the dynamic quantization changes, all the PRs were merged (torch, ideep, oneDNN), but without this ideep submodule update, the feature will not work. The change is isolated to only matmul operator and quantization path alone.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134897
Approved by: https://github.com/jgong5, https://github.com/atalman, https://github.com/snadampal
2024-09-06 16:40:26 +00:00
7074de43c0 Porting to GCC 15 (#135188)
uint8_t is found in the <cstdint> header.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135188
Approved by: https://github.com/Skylion007
2024-09-06 16:16:53 +00:00
771dcce11d [AOTI][Tooling][6/n] Fix long dtype input tensors calling mean() in aoti_torch_print_tensor_handle (#135072)
Differential Revision: D61635232

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135072
Approved by: https://github.com/hl475, https://github.com/ColinPeppler
2024-09-06 15:59:32 +00:00
de74aafff4 error on exporting ScriptModule (#135302)
Test Plan: added test

Differential Revision: D62279179

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135302
Approved by: https://github.com/yushangdi
2024-09-06 15:12:40 +00:00
ad29a2c0dc Add Inductor config for default stride behavior (#135238)
By default, Inductor is allowed to manipulate the layout
(strides+storage offset) of input tensors to custom operators.

We want to change it so that the default is that Inductor should respect
the stride order of input tensors to custom operators.

This PR adds a config to toggle the behavior, in the next PR up we'll
change the default. We also make the following changes:
- We add a new operator Tag (flexible_layout), which means that
inductor is allowed to manipulate the layout. When we flip the default,
users can specify they want the old behavior by using this tag.
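
A hedged sketch of opting a custom op into the old behavior via the new tag (assumes the tag lands as `torch.Tag.flexible_layout` and that `torch.library.define` accepts `tags`):

```python
import torch

# Tag the op so Inductor may still manipulate its input layouts.
torch.library.define(
    "mylib::scale", "(Tensor x) -> Tensor",
    tags=(torch.Tag.flexible_layout,),
)

@torch.library.impl("mylib::scale", "CompositeExplicitAutograd")
def scale(x):
    return x * 2.0
```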

This is a reland of https://github.com/pytorch/pytorch/pull/126986,
which was previously reverted due to silent incorrectness. We've since
fixed the silent incorrectness
(https://github.com/pytorch/pytorch/pull/133639)

Test Plan:
- new test

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135238
Approved by: https://github.com/albanD
2024-09-06 14:48:24 +00:00
3a9e33dca8 [torchelastic] Don't do signal handling when off the main thread (#135088)
Summary:
In multiprocessing, signal handling is not possible if the thread is not the main thread. This resulted in the following error:
> "ValueError('signal only works in main thread of the main interpreter')"

To address this issue, the diff checks whether the thread is the main thread and, if not, skips signal handling.
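
A minimal sketch of the guard described above (handler and signal names are illustrative):

```python
import signal
import threading

def _handler(signum, frame):
    pass  # illustrative cleanup logic

# signal.signal raises ValueError off the main thread, so only install
# handlers when we actually are on the main thread.
if threading.current_thread() is threading.main_thread():
    signal.signal(signal.SIGTERM, _handler)
```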

Test Plan:
Before this change, MAST job failed:
https://fburl.com/mlhub/iq2m10v8

With this change, MAST job succeeded:
https://fburl.com/mlhub/q6kb8343

Differential Revision: D62166943

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135088
Approved by: https://github.com/d4l3k
2024-09-06 14:47:03 +00:00
a086882d72 [inductor][triton] mark workspace args as mutated (#134648)
SplitScan makes use of a workspace arg that needs to be zeroed before it is used - then, it is used to communicate between thread blocks during the triton kernel implementation. It is mutated during the execution of the kernel, so it should be marked as such.

Before this PR, it is not marked as mutated; AFAIK this is fine during normal execution, but during autotuning it causes problems. The workspace starts off zeroed (as expected), but during autotuning the kernel will be executed multiple times and the workspace does not get re-set between executions, resulting in incorrect data. If the data is used for indexing, then you can fail device-side asserts (and the results after the initial run (with autotuning) could be wrong). The test added in this PR repros the issue when the fix is removed.

When we mark the arg as mutated, then the arg gets cloned before autotuning, so that the arg passed to the kernel during autotuning will always be zeroed as expected.
804852c1f9/torch/_inductor/runtime/triton_heuristics.py (L685-L689)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134648
Approved by: https://github.com/peterbell10, https://github.com/jansel
2024-09-06 14:23:37 +00:00
84ae6b7d6b AOTDispatcher: limit cases when we detach() graph inputs to non-leaves (#134193)
This PR is slightly a revival / update to the discussion from https://github.com/pytorch/pytorch/pull/98960:

Part of FSDP2's tracing strategy right now is that:

(1) it is painful/difficult to handle the case where we have multiple graph input tensors that are aliased to each other and at least one of them is duplicated

(2) we already have longstanding logic to remove duplicate input tensors from the graph in dynamo. Morally, FSDP2 gives us duplicate input tensors in the backward graph for every `unsharded_param`, because we have (a) the `unsharded_param` being closed over by the backward hook to resize/allgather, and (b) the same `unsharded_param` being saved for backward by autograd (we now guarantee in the partitioner that we will always save the base tensor for backward and recompute views)

(3) However, we were still seeing cases where the `unsharded_param` showed up twice in the backward graph inputs, as distinct tensor objects (with different python ids) instead of being true duplicates that dynamo can de-dup.

It turns out that this was because we were `.detach()`ing the `unsharded_param` in AOTDispatcher before plumbing it through the compiled forward (and so autograd would save a detach'd version of the `unsharded_param`). This is precisely because of the logic from https://github.com/pytorch/pytorch/pull/98960.

However, re-reading the detailed comments, it seems unnecessary to do a detach() on a graph input that is a (leaf) `nn.Parameter`, even if it happens to get no gradients in the backward. Since it is a leaf, we don't have to worry about the autograd engine "continuing to backprop through the graph beyond the current tensor" (the leaf has no other grad_fn for autograd to backprop through).

So this PR makes us a bit less aggressive about calling detach() on inputs: we only do it when:

(1) our graph input statically will get a `None` gradient (and also has no metadata mutations, the existing state)

(2) **and** our graph input is a non-leaf tensor (so detach()ing is actually required to prevent autograd from incorrectly backpropping past the non-leaf; see the illustration below).
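
To make the leaf/non-leaf distinction concrete, a small illustration (not AOTDispatcher code):

```python
import torch

p = torch.nn.Parameter(torch.randn(3))  # leaf graph input
x = p * 2                               # non-leaf: has a grad_fn to backprop past

print(p.is_leaf, p.grad_fn)  # True None -> detach() is unnecessary
print(x.is_leaf, x.grad_fn)  # False <MulBackward0 ...> -> detach() is required
```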

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134193
Approved by: https://github.com/yf225

Co-authored-by: Will Feng <yf225@cornell.edu>
2024-09-06 14:06:48 +00:00
60a097a071 [CD] Update binary_linux_test.sh to include calling builder smoke test (#133869)
Run smoke test

Fixes #1969

Pull Request resolved: https://github.com/pytorch/pytorch/pull/133869
Approved by: https://github.com/atalman

Co-authored-by: Andrey Talman <atalman@fb.com>
2024-09-06 13:27:24 +00:00
13bae39e22 [inductor] [cpp] improve cache blocking for is_dynamic_M (#131306)
## Performance
Models with >= 3% performance speedup are listed below:

### AMP single-thread dynamic shape (measured on CPU with AMX support)
No regressions

| Model Family | Model Name | Speedup |
|--------------|------------|---------|
| torchbench | soft_actor_critic | 3% |

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131306
Approved by: https://github.com/jgong5, https://github.com/leslie-fang-intel
ghstack dependencies: #135275

Co-authored-by: Jiong Gong <jiong.gong@intel.com>
2024-09-06 13:21:24 +00:00
4ef6c05f65 [inductor][cpp][gemm] fix autotune runtime error from linear_binary fusion (#135275)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135275
Approved by: https://github.com/leslie-fang-intel
2024-09-06 13:21:23 +00:00
d6b9bd3e60 Also handle compiler collective when input variable doesn't exist on all ranks (#135147)
Internal xref:
https://fb.workplace.com/groups/3095840833991792/permalink/3810738595835342/

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135147
Approved by: https://github.com/jansel
2024-09-06 13:18:36 +00:00
d0591f4658 Ignore fresh unbacked when doing recursive make_fx inside HOPs (#135053)
Internal xref: https://fb.workplace.com/groups/6829516587176185/posts/7705964779531357/

This now also incorporates a test from https://github.com/pytorch/pytorch/pull/133585 (which it fixes) and the prep PR https://github.com/pytorch/pytorch/pull/134407 Including the PR desc from that:

I am trying to fix a problem reported by a user in [fb.workplace.com/groups/6829516587176185/permalink/7705964779531357](https://fb.workplace.com/groups/6829516587176185/permalink/7705964779531357/). The summary of this problem is that when we do collect-metadata analysis in AOTAutograd, we accumulate pending unbacked symbols which are going to be discarded at the end of the trace. However, if we do a recursive make_fx inside tracing, as occurs with torch.cond, we end up seeing that there are pending unbacked symbols that aren't associated with a binding, even though it's spurious (they've leaked into the inner make_fx call from the outer AOTAutograd analysis).

In https://github.com/pytorch/pytorch/pull/133588 I tried to just prevent adding the symbols to the pending list at all in the first place. But this itself caused some problems which were fixed in https://github.com/pytorch/pytorch/pull/124785 . The problem fixed in that PR is that when we allocate tangents that have unbacked size, something prevented them from having correct unbacked SymInts when ignore fresh unbacked SymInts was enabled. So I had patched it at the time by just not suppressing pending symbols and clearing them out some other way.

I think... I was wrong in that PR? That is to say, it was OK to avoid putting the fresh unbacked symbols in the pending list; the real problem was suppressing unbacked renamings. But there doesn't seem to be a good reason to suppress these; this PR shows that it doesn't actually fail any tests if you do these anyway. Intuitively, this makes sense, because you can't trigger renamings unless you're actually adding unbacked symbols to the pending set.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135053
Approved by: https://github.com/ydwu4
2024-09-06 13:13:15 +00:00
b5dea061c8 check compilation status before query cudnn version in conv (#135332)
This PR fixes https://github.com/pytorch/pytorch/issues/135322. The cuDNN compilation status should be checked first, before querying the version; otherwise, conv may trigger a RuntimeError before any check in other non-CUDA backends.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135332
Approved by: https://github.com/EikanWang, https://github.com/atalman
2024-09-06 12:50:04 +00:00
041960a1ce [Dynamo] Automatically in-graph traceable tensor subclass ctors (#135151)
Fixes https://github.com/pytorch/pytorch/issues/114389

Previously, dynamo would attempt to trace through the `__init__` of traceable tensor subclasses. Since their constructors are AOT dispatcher traceable by definition, dynamo should automatically put these in the graph, like we do for any other tensors; tracing through the constructor instead is difficult because dynamo would need to apply mutations post tensor subclass creation in the graph.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135151
Approved by: https://github.com/bdhirsh
2024-09-06 12:23:38 +00:00
67c7924ea1 [inductor] Fix gen_transposed_tile_load_store (#135307)
A recent PR, https://github.com/pytorch/pytorch/pull/131745, brought new VLA logic into the cpp codegen, which raises a build failure on MSVC with error code `Compiler Error C2131`: https://learn.microsoft.com/en-us/cpp/error-messages/compiler-errors-1/compiler-error-c2131?view=msvc-170

reproduce UT:
```cmd
pytest test\inductor\test_torchinductor_dynamic_shapes.py -v -k test_large_block_sizes_dynamic_shapes_cpu
```

Original generated code:
```c++
alignas(16) float tmp1[static_cast<int64_t>(((-256LL)*(c10::div_floor_integer(static_cast<int64_t>(ks1), static_cast<int64_t>(16LL)))) + (16LL*ks1))];
```

Changes:
allocate a large-enough fixed-sized buffer.

New generated code:
```c++
alignas(16) float tmp1[16*16];
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135307
Approved by: https://github.com/jgong5, https://github.com/jansel
2024-09-06 10:44:08 +00:00
217ba7b2ab [Docs] Update FileCheck doc (#135199)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135199
Approved by: https://github.com/soulitzer
2024-09-06 08:18:38 +00:00
758d515d98 [Inductor][CPP] Select tiling factor for lower precision data types (#133830)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133830
Approved by: https://github.com/jgong5, https://github.com/jansel
2024-09-06 08:12:37 +00:00
60d98b4cfb Update torch-xpu-ops pin (ATen XPU implementation) (#135300)
Release cycle for PyTorch 2.5
1. Bug fix: correct the reduction logic in the cdist kernel.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135300
Approved by: https://github.com/EikanWang
2024-09-06 07:30:09 +00:00
590a3e9f8a [export][training ir migration] quantized_decomposed.quantize_per_tensor decomposition (#134525)
Summary:
In the graph of the TestXNNPACKQuantizer.test_dynamic_linear_with_conv test, some quantized_decomposed.quantize_per_tensor.default ops become quantized_decomposed.dequantize_per_tensor.tensor ops when using the new training IR.

This is because we lift params/buffers before calling make_fx. So previously, for the graph that's passed to make_fx, `graph.L__self___linear1.weight` was a tensor; now, in training IR, `graph.L__self___linear1.weight` is a FakeTensor. This caused the node overload to be different.

Test Plan:
```
buck2 run 'fbcode//mode/dev-nosan' fbcode//caffe2/test/quantization:test_quantization -- -r test_dynamic_linear_with_conv
```

Differential Revision: D61364547

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134525
Approved by: https://github.com/tugsbayasgalan, https://github.com/jerryzh168
2024-09-06 07:06:06 +00:00
764ee6e3f9 [FlexAttention] Specify padding_value for boundary checked loads (#134573)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/134573
Approved by: https://github.com/Chillee
2024-09-06 06:47:26 +00:00
67f98a99a4 [DeviceMesh][Easy] Make RuntimeError a bit more descriptive by including the actual world_size (#135271)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135271
Approved by: https://github.com/fduwjj
2024-09-06 06:23:20 +00:00
e020a8755a [Fix][FR][ez] Remove debugging logs (#135308)
Removing the print added during the debugging process.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135308
Approved by: https://github.com/wz337
2024-09-06 06:14:33 +00:00
7ffb3b201c [inductor] Remove LoopBody.reads,writes,other (#135256)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135256
Approved by: https://github.com/oulgen
ghstack dependencies: #135070, #135076, #135082, #135084, #135079, #135235
2024-09-06 06:11:55 +00:00
f946bf88c4 [inductor] Skip retracing an existing LoopBody (#135235)
This is roughly a 7% speedup in inductor compile time for hf_Bert_large.  The time spent in `LoopBody.__init__` improves from 15% to 8% of `fx_codegen_and_compile`.

Before
![image](https://github.com/user-attachments/assets/7de0f28e-35bd-472f-b4be-b52733d2a85c)

After
![image](https://github.com/user-attachments/assets/5f0cf11a-43c5-43ae-b13c-f32383a75a7f)

Overall
![image](https://github.com/user-attachments/assets/6a369d8c-fb5e-4ad2-9504-0fc745ad6568)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135235
Approved by: https://github.com/oulgen
ghstack dependencies: #135070, #135076, #135082, #135084, #135079
2024-09-06 06:11:55 +00:00
66da3b3b2a [fx] Bypass custom __setattr__ in Node.__init__ (#135079)
Before:
![image](https://github.com/user-attachments/assets/5f0a6ae6-6049-44d0-b5f2-a549a23ad97f)

After:
![image](https://github.com/user-attachments/assets/51c9f91b-f8a0-4043-8362-65813feec823)
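
A minimal sketch of the technique in the title (illustrative, not the actual `fx.Node` code):

```python
class Slotted:
    def __setattr__(self, name, value):
        # imagine expensive validation/bookkeeping here
        super().__setattr__(name, value)

    def __init__(self, target):
        # object.__setattr__ bypasses Slotted.__setattr__ entirely,
        # avoiding per-attribute overhead in a hot constructor
        object.__setattr__(self, "target", target)

node = Slotted("add")
print(node.target)  # "add"
```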

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135079
Approved by: https://github.com/oulgen
ghstack dependencies: #135070, #135076, #135082, #135084
2024-09-06 06:11:46 +00:00
41e653456e [RDP] Fix "No module named 'libfb’" (#135244)
Summary:
D62215095 introduced an import error in arvr pipelines, as the is_fbcode() function does not work as intended.

This changes is_fbcode() to be a much stricter check.

Test Plan:
```
buck2 run arvr/mode/platform010/opt-stripped //arvr/libraries/depthlink/clients/mr_replay:pipeline_runner -c bolt.use_eva3_sim=True -- --config_file arvr/libraries/depthlink/clients/mr_replay/configs/runner_config.yaml --features DEPTH
```

Differential Revision: D62237502

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135244
Approved by: https://github.com/aorenste
2024-09-06 04:52:31 +00:00
e40a0a9359 Add randomness checking for sdpa vmap (#135176)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135176
Approved by: https://github.com/zou3519
2024-09-06 04:50:49 +00:00
c05a7adb36 [inductor][debug] fix draw_buffers (#135266)
**Before:**
![image](https://github.com/user-attachments/assets/aac756f3-1349-4647-9da3-87cf105cf647)

**After:**
<img width="791" alt="image" src="https://github.com/user-attachments/assets/d72c663c-e598-42fa-ac40-9e58956f1ec1">

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135266
Approved by: https://github.com/yf225
2024-09-06 04:12:41 +00:00
5f57be7571 [Distributed] Change function call in test to non-deprecated to eliminate warning (#134938)
Migrate function calls in tests to eliminate the warning message below and reduce the chance of test failures when the deprecated methods are removed; a migration sketch follows after the warning message.

-  from deprecated `save_state_dict` change to `save`
-  from deprecated `load_state_dict` change to `load`

Warning message:
```bash
pytorch/test/distributed/checkpoint/test_fsdp_model_state.py:37: FutureWarning: `save_state_dict` is deprecated and will be removed in future versions.Please use `save` instead.

```
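
A hedged sketch of the migration (assumes a single-process run, where `torch.distributed.checkpoint` falls back to non-distributed saving):

```python
import torch
import torch.distributed.checkpoint as dcp

state_dict = {"weight": torch.randn(4, 4)}

# deprecated: dcp.save_state_dict(state_dict, storage_writer=...)
dcp.save(state_dict, storage_writer=dcp.FileSystemWriter("/tmp/ckpt"))

# deprecated: dcp.load_state_dict(state_dict, storage_reader=...)
dcp.load(state_dict, storage_reader=dcp.FileSystemReader("/tmp/ckpt"))
```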

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134938
Approved by: https://github.com/wz337, https://github.com/fegin
2024-09-06 03:25:09 +00:00
29d72c1100 [inductor] check intel compiler minimal version (#135209)
On Windows, early versions of icx have a `-print-file-name` issue and can't preload correctly for Inductor. This adds a minimal version check for the Intel compiler.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135209
Approved by: https://github.com/ezyang
2024-09-06 03:21:07 +00:00
3b1a334c0f [Inductor][CPP] Avoid mistake wgt tensor delete (#135100)
**Summary**
Fix issue: https://github.com/pytorch/pytorch/issues/134998: Previously, we only checked if the `get_attr` FX node for the weight had a single user node. However, two `get_attr` nodes may share the same tensor, and the tensor should not be deleted in such cases. In this PR, we count the users of the tensor, in addition to the number of users of the node, to decide whether the tensor can be deleted.

**TestPlan**
```
 python test/inductor/test_cpu_select_algorithm.py -k test_linear_wgt_multi_users
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135100
Approved by: https://github.com/jgong5
2024-09-06 03:13:36 +00:00
07689a38bf [Inductor] Fix AOT weight alignment issue on CPU (#135205)
**Summary**
Fix issue: https://github.com/pytorch/pytorch/issues/135027. On CPU, the `consts_size` used to generate `_binary_constants_bin_start` is not padded to `ALIGN_BYTES`, while `serialized_weights` is, causing a failure in the 16K alignment check.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135205
Approved by: https://github.com/jgong5, https://github.com/desertfire
2024-09-06 03:06:51 +00:00
06a7dc21c1 Remove dead expect_rational (#135105)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135105
Approved by: https://github.com/malfet
2024-09-06 02:57:27 +00:00
d9a18173fa Report qualname of exception type rather than <class 'RuntimeError'> (#135146)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135146
Approved by: https://github.com/Skylion007, https://github.com/albanD, https://github.com/yanboliang
ghstack dependencies: #135148, #135145
2024-09-06 02:56:50 +00:00
d8543e3162 Include exception type qualname when rewrapping InternalTorchDynamoError (#135145)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135145
Approved by: https://github.com/drisspg, https://github.com/anijain2305
ghstack dependencies: #135148
2024-09-06 02:56:50 +00:00
ad01fc194d Consolidate raise and rewrap raise error branches (#135148)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135148
Approved by: https://github.com/anijain2305, https://github.com/albanD, https://github.com/yanboliang, https://github.com/malfet
2024-09-06 02:56:46 +00:00
e162414963 add instrumentation of CCA stats for reserved and allocated memory size (#135231)
As titled
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135231
Approved by: https://github.com/c-p-i-o
2024-09-06 02:48:56 +00:00
9e5a797771 Improve test_public_bindings import module error reporting (#135258)
Error was hard to understand without message. Render it now. See https://github.com/pytorch/pytorch/pull/135259 for it in action.

Example failure:

```
2024-09-05T20:04:45.3022000Z FAILED [5.9524s] test_public_bindings.py::TestPublicBindings::test_modules_can_be_imported - AssertionError: String comparison failed: '' != "torch._logging.scribe failed to import w[112 chars].py)"
2024-09-05T20:04:45.3025413Z + torch._logging.scribe failed to import with error ImportError: cannot import name 'TypeAlias' from 'typing' (/opt/conda/envs/py_3.9/lib/python3.9/typing.py)
2024-09-05T20:04:45.3026990Z
```

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135258
Approved by: https://github.com/albanD
2024-09-06 02:40:03 +00:00
b46a1b9e2d Use Python 3.9 on all libtorch jobs (#135245)
Part of the migration py3.8->3.9

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135245
Approved by: https://github.com/izaitsevfb
2024-09-06 02:27:22 +00:00
9688014820 aarch64: extend matmul heuristic checks to all neoverse platforms (#134548)
For aarch64 neoverse platforms there are two gemm backends available
for the matmul operator in PyTorch: (1) Arm Compute Library and (2) OpenBLAS.
While Arm Compute Library provides better performance over OpenBLAS,
it has overhead in kernel launch time, and hence we use OpenBLAS
for smaller tensor compute. The heuristic was originally implemented for
neoverse_v1. This commit extends the heuristic to other neoverse platforms.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134548
Approved by: https://github.com/malfet
2024-09-06 01:40:50 +00:00
8f6e73f068 [ONNX] Enable experimental exporter logic to dynamo_export and support refine dynamic_shapes (#134976)
(1) Enable experimental exporter logic to dynamo_export
(2) Refine dynamic shapes and retry export in export strategies
(3) Delete `torch_export_graph_extractor` and use the new export logic
(4) Disable ExportedProgram test in `test_fx_onnx_with_onnxruntime.py`, as ONNXProgram is different now.

Fixes https://github.com/pytorch/pytorch/issues/126479
Fixes #135183
Pull Request resolved: https://github.com/pytorch/pytorch/pull/134976
Approved by: https://github.com/justinchuby
2024-09-06 01:29:56 +00:00
1e57ef08fa [AOTI] Support MKLDNN qconv ops in cpp wrapper (#134795)
Summary: Similar to https://github.com/pytorch/pytorch/pull/134475, support qconv in the ABI-compatible mode for cpp-wrapper Inductor.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134795
Approved by: https://github.com/leslie-fang-intel, https://github.com/chunyuan-w, https://github.com/angelayi
ghstack dependencies: #134475, #134783
2024-09-06 01:01:53 +00:00
614b86d602 [AOTI] Support MKLDNN qlinear ops in cpp wrapper (#134783)
Summary: Similar to https://github.com/pytorch/pytorch/pull/134475, support qlinear in the ABI-compatible mode for cpp-wrapper Inductor.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/134783
Approved by: https://github.com/leslie-fang-intel, https://github.com/chunyuan-w, https://github.com/angelayi
ghstack dependencies: #134475
2024-09-06 01:01:53 +00:00
0b96dfb736 [AOTI] Support MKLDNN conv ops in cpp wrapper (#134475)
Summary: Partially fix https://github.com/pytorch/pytorch/issues/123040. In the ABI-compatible mode, MKLDNN fallback ops do not have C shim implementations and thus need to go through the custom ops launch path. Other MLKDNN ops will be fixed in following PRs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/134475
Approved by: https://github.com/leslie-fang-intel, https://github.com/chunyuan-w, https://github.com/angelayi
2024-09-06 01:01:53 +00:00
62b221d5cc Add Percentages to Function Events (#135155)
Summary: Users have recently asked that the profiler add self/total CPU and device percentages to FunctionEvents so that teams can process the data procedurally. Some of it could be done mathematically via subroutines, but since we already have the information in _build_table, let's build it there.
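
For context, a minimal snippet whose `table()` output is rendered by the `_build_table` mentioned above (the new per-event percentage fields are not named here, since this PR defines them):

```python
import torch
from torch.profiler import profile, ProfilerActivity

with profile(activities=[ProfilerActivity.CPU]) as prof:
    torch.randn(256, 256) @ torch.randn(256, 256)

# The percentage columns in this table come from _build_table; the PR
# additionally stores them on the FunctionEvent objects themselves.
print(prof.key_averages().table(sort_by="self_cpu_time_total"))
```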

Test Plan: Check that we have the same table as before but also check that the parameters we check also have the expected values

Differential Revision: D62210351

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135155
Approved by: https://github.com/shanw-meta, https://github.com/kit1980
2024-09-06 00:39:11 +00:00
66dd4577b1 Track base of FunctionalTensor in inference mode. (#135141)
The idea behind the tracking is the following: whenever we see a tensor, if the tensor is a root tensor (does not have any view metas), we consider it as the base of all the tensors that share its storage.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135141
Approved by: https://github.com/zou3519
2024-09-06 00:10:25 +00:00
cc28634172 [Submodule] Bump pybind11 to v2.13.5 (#135202)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135202
Approved by: https://github.com/Skylion007
2024-09-06 00:09:00 +00:00
c83cdf068b [DTensor] Fix view op replicating on tensor dim when the size of the tensor dim = 1 (#135054)
We found a corner case: when a tensor dimension is 1, calling `view(1)` results in unexpected replication (see case 1 below). When the tensor dimension to shard is not 1, no matter whether the tensor dimension is evenly shardable across the mesh dimension, it won't cause implicit replication behind the scenes as long as view doesn't change the size of the given tensor dimension (see cases 2 and 3).

When the tensor dimension to shard is of size 1, it is not being added to shardable_dims here:
https://github.com/pytorch/pytorch/blob/main/torch/distributed/_tensor/ops/_view_ops.py#L518

```
# uneven case where the size of the tensor dimension to shard is 1
p = torch.randn(1, 2)
mesh = init_device_mesh("cuda", (2,))
dtensor = distribute_tensor(p, mesh, [Shard(0)])
t = dtensor.view(1, 2)
# this would result in replication, meaning t is now replicated across all ranks.

# uneven case where the size of the tensor dimension to shard is not 1
p = torch.randn(3, 2)
mesh = init_device_mesh("cuda", (2,))
dtensor = distribute_tensor(p, mesh, [Shard(0)])
t = dtensor.view(3, 2)
# this would not result in replication, meaning t stays as sharded.

# even case
p = torch.randn(2,2)
dtensor = distribute_tensor(p, mesh, [Shard(0)])
t = dtensor.view(2, 2)
# this would not result in replication, meaning t stays as sharded.
```

Differential Revision: [D62155606](https://our.internmc.facebook.com/intern/diff/D62155606)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135054
Approved by: https://github.com/tianyu-l, https://github.com/wanchaol
2024-09-06 00:03:54 +00:00
28ccfba248 [ONNX] Delete ONNXProgramSerializer (#135261)
Fixes #135182

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135261
Approved by: https://github.com/justinchuby
2024-09-05 23:52:51 +00:00
b2386bdca1 [debug] Add helper to run cProfile on a function (#135084)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135084
Approved by: https://github.com/oulgen
ghstack dependencies: #135070, #135076, #135082
2024-09-05 23:41:30 +00:00
bdfc8d9f96 [fx] Don't use generators in map_aggregate (#135082)
While the generators avoid a copy, they are slow.

Before:
![image](https://github.com/user-attachments/assets/70a55a9a-0595-4105-b0ab-22cf77c7409c)

After:
![image](https://github.com/user-attachments/assets/cecb9c59-ae36-47de-8b08-cab2c7cb3d57)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135082
Approved by: https://github.com/oulgen
ghstack dependencies: #135070, #135076
2024-09-05 23:41:30 +00:00
70779dded8 [fx] Compile time optimization in Node.__update_args_kwargs (#135076)
Before this we took two passes over all of the args.

Before:
![image](https://github.com/user-attachments/assets/24ce5628-03f4-4983-9f2d-5ddf0ca5816e)

After:
![image](https://github.com/user-attachments/assets/c9681aa2-32f0-4f6b-a598-fc6f90ffafb5)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135076
Approved by: https://github.com/Chillee
ghstack dependencies: #135070
2024-09-05 23:41:30 +00:00
ea231300d1 [inductor] Improve compile time regression from MemoryDep.normalize (#135070)
Possible fix for #135056

Before
![image](https://github.com/user-attachments/assets/3962cb85-e808-4fd4-991f-471ff5ef7eae)

After
![image](https://github.com/user-attachments/assets/2322d48d-6518-4518-baca-336027b5cda8)

Measured based on:
```
python benchmarks/dynamo/torchbench.py --ci --accuracy --timing --explain --inductor --device cuda --training --only hf_Bert_large --stats -n1
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135070
Approved by: https://github.com/Chillee
2024-09-05 23:41:30 +00:00
8f66995459 Revert "Support rolling over a percentage of workflows (#134816)"
This reverts commit fc890b55b51098437b6149abf1026a8b2aaee389.

Reverted https://github.com/pytorch/pytorch/pull/134816 on behalf of https://github.com/malfet due to Causes lint to intermittently fail ([comment](https://github.com/pytorch/pytorch/pull/134816#issuecomment-2332902609))
2024-09-05 23:39:41 +00:00
144fde4fd2 [MPS] Add support for autocast in MPS (#99272)
Fixes https://github.com/pytorch/pytorch/issues/88415

Need to run inductor/test_cpu_select_algorithm
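
A minimal usage sketch of what this enables (requires an MPS-capable build and device):

```python
import torch

model = torch.nn.Linear(8, 8).to("mps")
x = torch.randn(2, 8, device="mps")

with torch.autocast(device_type="mps", dtype=torch.float16):
    y = model(x)
print(y.dtype)  # torch.float16
```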

Pull Request resolved: https://github.com/pytorch/pytorch/pull/99272
Approved by: https://github.com/malfet

Co-authored-by: Siddharth Kotapati <skotapati@apple.com>
Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
Co-authored-by: Roy Hvaara <roy@lightyear.no>
2024-09-05 23:23:17 +00:00
43f4947d44 fix fake tensor tolist implementation (#135131)
Summary:
When exporting for training with `tolist`, we do not hit `FunctionalTensor.tolist` since we do not functionalize. Unfortunately, this means we hit `FakeTensor.tolist`, which creates unbacked symints that are not backed by proxies.

Rather than trying to patch up this low-level implementation, we replace it with essentially what `FunctionalTensor.tolist` does, which is higher-level: we desugar to `item()` calls and let that take care of unbacked symints.
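
Conceptually, the desugaring described above looks like this for a 1-D tensor (illustrative only):

```python
import torch

def tolist_via_item(t: torch.Tensor) -> list:
    # each element goes through item(), which is what lets the export
    # machinery handle the resulting (unbacked) symints uniformly
    return [t[i].item() for i in range(t.shape[0])]

print(tolist_via_item(torch.tensor([1, 2, 3])))  # [1, 2, 3]
```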

Test Plan:
Some expected failures are gone now.
Also found a test for `tolist` that was written when `FunctionalTensor.tolist` was implemented but not really doing much; repurposed it now to exercise more modes.

Differential Revision: D62197742

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135131
Approved by: https://github.com/ezyang
2024-09-05 23:20:31 +00:00
65e1c34061 [rfc] scuba for flight recorder (#134794)
Summary: Record flight recorder status in a scuba table.

Test Plan: Testing with timing out a job. Will post results soon.

Differential Revision: D61729221

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134794
Approved by: https://github.com/fduwjj
2024-09-05 23:18:10 +00:00
830247c355 [Intel Triton] Update Intel Triton to release/2.5.0 (#134074)
This PR relands https://github.com/pytorch/pytorch/pull/134053

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134074
Approved by: https://github.com/EikanWang
2024-09-05 22:46:31 +00:00
4262755b5a [cond] fix typo in cond codegen (#134708)
As titled.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134708
Approved by: https://github.com/jansel
2024-09-05 22:38:24 +00:00
3825607144 Add torch._logging.scribe (#135224)
See https://github.com/pytorch/pytorch/pull/135138 for a usage example. Meta only, see https://docs.google.com/document/d/1JpbAQvRhTmuxjnKKjT7qq57dsnV84nxSLpWJo1abJuE/edit#heading=h.9wi46k7np6xw for context

fbscribelogger is a library that allows us to write to scribe, which is Meta's logging infrastructure, when you have an appropriate access token (this token is available for jobs running on main, as well as authorized jobs with the ci-scribe label). The resulting data is accessible via Scuba (a real time in-memory database) and Hive (a more traditional SQL persisted database).

Here's the motivating use case. Suppose there is somewhere in PyTorch's codebase where you'd like to log an event, and then you'd like to find all the situations where this log is called. If PyTorch is rolled out to our internal users, we have some FB-oriented APIs (like torch._utils_internal.signpost_event) with which you can do this. But you have to actually land your PR to main, wait for it to be ingested to fbcode, and then wait for us to actually roll out this version, before you get any data. But what if you want the results within the next few hours? Instead, you can use torch._logging.scribe to directly write to our logging infrastructure *from inside CI jobs.* The most convenient approach is to log unstructured JSON blobs to `open_source_signpost` (added in this PR; you can also add your own dedicated table as described in the GDoc above). After adding logging code to your code, you can push your PR to CI, add the 'ci-scribe' label, and in a few hours view the results in Scuba, e.g., (Meta-only) https://fburl.com/scuba/torch_open_source_signpost/z2mq8o4l If you want continuous logging on all commits on master, you can land your PR and it will continuously get logged for all CI runs that happen on main.

Eventually, if your dataset is important enough, you can consider collaborating with PyTorch Dev Infra to get the data collected in our public AWS cloud so that OSS users can view it without Meta-internal access. But this facility is really good for prototyping / one-off experiments. It's entirely self-serve: just add your logging, run your PR CI with ci-scribe, get results, do analysis in Scuba.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135224
Approved by: https://github.com/Skylion007
2024-09-05 22:37:13 +00:00
3c8f71ff93 [cuDNN][64-bit indexing] cuDNN v9.3+ supports non-batch-splittable convolutions with > 2**31 elements (#134890)
For longstanding issues such as #95024

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134890
Approved by: https://github.com/Skylion007
2024-09-05 22:22:45 +00:00
fc890b55b5 Support rolling over a percentage of workflows (#134816)
In order to support adding a rollover percentage, this ended up being a complete rewrite of runner_determinator.py.

Details of the new format are in the comments up top.

On the plus side, this now includes some unit tests.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/134816
Approved by: https://github.com/PaliC, https://github.com/zxiiro
2024-09-05 22:21:45 +00:00
058a69d91a [fbcode][dynamo] Turn on guard_nn_modules using justknobs_check (#134928)
As Title

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134928
Approved by: https://github.com/ezyang
2024-09-05 22:05:54 +00:00
6c5920d515 Tune int8 AMX WoQ micro-kernel for CPU (#134832)
This patch prevents performance regression against the default ATen implementation for LLaMA 3.1 int8 GPTQ WoQ workload.

Uses AMX micro-kernel only if `M` >= `block_m`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134832
Approved by: https://github.com/jgong5
2024-09-05 22:01:14 +00:00
116fd474da [export] Expand coverage to more copied sym ops for unflattener. (#135119)
Test Plan:
buck2 test 'fbcode//mode/opt' fbcode//torchrec/ir/tests:test_serializer -- --run-disabled

```
File changed: fbcode//caffe2/torch/export/unflatten.py
Buck UI: https://www.internalfb.com/buck2/2e0377e7-e2b6-4bd0-8133-a787245165a0
Test UI: https://www.internalfb.com/intern/testinfra/testrun/5066549824883887
Network: Up: 0B  Down: 0B
Jobs completed: 16. Time elapsed: 10.2s.
Tests finished: Pass 6. Fail 0. Fatal 0. Skip 0. Build failure 0
```

Differential Revision: D62190172

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135119
Approved by: https://github.com/yushangdi
2024-09-05 21:58:20 +00:00
a5d70cf545 [PyTorch] Add isfinite to BFloat16-math.h (#135052)
Missing function from <cmath>.

Differential Revision: [D62148884](https://our.internmc.facebook.com/intern/diff/D62148884/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135052
Approved by: https://github.com/PaliC, https://github.com/albanD
ghstack dependencies: #135031
2024-09-05 21:50:36 +00:00
7fe819d917 [PyTorch] Fix -Wshadow -Werror build in BFloat16-inl.h (#135031)
`float_t` is required to exist in C99 math.h, which causes -Wshadow to fire. We don't need the alias, fortunately.

Differential Revision: [D62135908](https://our.internmc.facebook.com/intern/diff/D62135908/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135031
Approved by: https://github.com/albanD
2024-09-05 21:48:21 +00:00
f63571060c Revert "Use actions/upload-artifact@v4.4.0 for rest of workflows (#135264)"
This reverts commit 9c0b03020b7204ca5d5dbe18174bab005f79c47b.

Reverted https://github.com/pytorch/pytorch/pull/135264 on behalf of https://github.com/atalman due to broke CI ([comment](https://github.com/pytorch/pytorch/pull/135264#issuecomment-2332674607))
2024-09-05 21:43:05 +00:00
38fead8f7c [hop] preserve metadata in re-tracing hop subgraph by running with interpreter (#135159)
This way, `Interpreter.run` can preserve the current metadata of subgraphs correctly when tracing them.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135159
Approved by: https://github.com/tugsbayasgalan
2024-09-05 21:36:56 +00:00
24a223c49d Run inductor micro benchmark on x86 metal runner (#135042)
This enables inductor micro benchmark on CPU (x86):

* Running on AWS metal runner for more accurate benchmark
* I added a new `arch` column, which will be either x86_64 or arm64 for CPU, or the GPU name for GPU.  We can use this later to differentiate between different setups, i.e. cuda (a100) vs cuda (a10g), or cpu (x86_64) vs cpu (arm64)

The next step would be to run this on cpu (arm64) and cuda (a10g).

### Testing
Here is the CSV results from my test run https://github.com/pytorch/pytorch/actions/runs/10709344180

```
name,metric,target,actual,dtype,device,arch,is_model
mlp_layer_norm_gelu,flops_utilization,0.8,17.36,bfloat16,cpu,x86_64,False
gather_gemv,memory_bandwidth(GB/s),990,170.80,int8,cpu,x86_64,False
gather_gemv,memory_bandwidth(GB/s),1060,204.78,bfloat16,cpu,x86_64,False
Mixtral-8x7B-v0.1,token_per_sec,175,26.68,int8,cpu,x86_64,True
Mixtral-8x7B-v0.1,memory_bandwidth(GB/s),1130,171.91,int8,cpu,x86_64,True
Mixtral-8x7B-v0.1,compilation_time(s),162,47.36,int8,cpu,x86_64,True
gemv,memory_bandwidth(GB/s),870,236.36,int8,cpu,x86_64,False
gemv,memory_bandwidth(GB/s),990,305.71,bfloat16,cpu,x86_64,False
Llama-2-7b-chat-hf,token_per_sec,94,14.01,bfloat16,cpu,x86_64,True
Llama-2-7b-chat-hf,memory_bandwidth(GB/s),1253,185.18,bfloat16,cpu,x86_64,True
Llama-2-7b-chat-hf,compilation_time(s),162,74.99,bfloat16,cpu,x86_64,True
Llama-2-7b-chat-hf,token_per_sec,144,25.09,int8,cpu,x86_64,True
Llama-2-7b-chat-hf,memory_bandwidth(GB/s),957,165.83,int8,cpu,x86_64,True
Llama-2-7b-chat-hf,compilation_time(s),172,70.69,int8,cpu,x86_64,True
layer_norm,memory_bandwidth(GB/s),950,172.03,bfloat16,cpu,x86_64,False
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135042
Approved by: https://github.com/yanboliang
2024-09-05 21:31:36 +00:00
e4920a1364 [Traceable FSDP2][Dynamo] allow tracing through auto_functionalized HOP (#135169)
If an `auto_functionalized` HOP is included in backward graph due to activation checkpointing, we will run into a scenario where Compiled Autograd Dynamo tracing will need to trace through the `auto_functionalized` HOP. This PR adds support for it.

Test commands:
- `pytest -rA test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_trace_auto_functionalized`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135169
Approved by: https://github.com/zou3519
2024-09-05 21:22:45 +00:00
bc5ecf83d7 [training ir migration] Fix quantization tests (#135184)
Summary:
Fixed some quantization tests for the new training IR:

Fixed the batch norm node pattern matcher: in training IR, we have an `aten.batch_norm` node instead of `aten._native_batch_norm_legit` and `aten._native_batch_norm_legit_no_training`.

Test Plan:
```
buck run fbcode//mode/dev-nosan fbcode//caffe2/test:quantization_pt2e
```

Reviewed By: tugsbayasgalan

Differential Revision: D62209819

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135184
Approved by: https://github.com/tugsbayasgalan
2024-09-05 21:19:28 +00:00
e55c0f59e5 Revert "[Reland] Refactor caching device allocator utils (#130923)"
This reverts commit 9809080b9ed657a8c0ea0383be7cbdce3a26e05e.

Reverted https://github.com/pytorch/pytorch/pull/130923 on behalf of https://github.com/kit1980 due to breaking internal builds - Error: Relocation overflow has occured ([comment](https://github.com/pytorch/pytorch/pull/130923#issuecomment-2332640961))
2024-09-05 21:16:14 +00:00
a4cf9653ee Revert "Remove Caffe2 code from tool scripts (#134941)"
This reverts commit c818ecd1698a28d9fadf4a81453a89914b18374a.

Reverted https://github.com/pytorch/pytorch/pull/134941 on behalf of https://github.com/kit1980 due to breaking internal builds - The path `caffe2/operators/hip/gather_op.cuh` does not exist ([comment](https://github.com/pytorch/pytorch/pull/134941#issuecomment-2332636624))
2024-09-05 21:12:54 +00:00
9c0b03020b Use actions/upload-artifact@v4.4.0 for rest of workflows (#135264)
To be consistent with https://github.com/pytorch/pytorch/pull/135263 and rest of workflows. Use v4.4.0.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135264
Approved by: https://github.com/kit1980, https://github.com/malfet
2024-09-05 21:05:06 +00:00
034717a029 [ROCm] remove triton-rocm commit pin and merge pins with triton.txt (#133438)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133438
Approved by: https://github.com/jithunnair-amd, https://github.com/malfet

Co-authored-by: Jithun Nair <37884920+jithunnair-amd@users.noreply.github.com>
2024-09-05 20:36:45 +00:00
9c38b00999 [export] Add ability to run eagerly on UnflattenedModule (#133996)
Summary:
Added the context manager `_disable_interpreter`, which is meant to be put around a call to `unflatten`. This will generate an UnflattenedModule and sub-InterpreterModules that do not use torch.fx.Interpreter to run eagerly. We want this as a state of the module instead of a context manager around running the module, because it's not clear where the unflattened module will be called.
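
A minimal usage sketch of what the summary describes; the import path is an assumption (the PR only names the context manager):

```python
import torch
# Assumed import location for the new context manager:
from torch.export.unflatten import _disable_interpreter

class M(torch.nn.Module):
    def forward(self, x):
        return x + 1

ep = torch.export.export(M(), (torch.randn(2),))
with _disable_interpreter():
    unflat = torch.export.unflatten(ep)
# Submodules of `unflat` now run eagerly rather than via torch.fx.Interpreter.
print(unflat(torch.randn(2)))
```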

This seems to improve the performance: https://fb.workplace.com/groups/1075192433118967/posts/1473590629945810/?comment_id=1473621763276030

Test Plan: CI

Differential Revision: D60939034

Pull Request resolved: https://github.com/pytorch/pytorch/pull/133996
Approved by: https://github.com/pianpwk
2024-09-05 20:28:42 +00:00
8efe547046 Use actions/upload-artifact@v4.4.0 for triton builds (#135263)
Same as: https://github.com/pytorch/pytorch/pull/135139
Fixes upload failure: https://github.com/pytorch/pytorch/actions/runs/10722567217/job/29748125015
fix regression introduced by https://github.com/pytorch/pytorch/pull/135068

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135263
Approved by: https://github.com/kit1980, https://github.com/huydhn
2024-09-05 20:03:39 +00:00
82d00acfee Allow cross-device copies for cpu scalars in refs (#135140)
This copies our eager-mode behavior where someone can do torch.add(a, b, out=c)
where a and b are CPU scalar tensors and c is a CUDA tensor.
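
A minimal sketch of that behavior (assumes a CUDA device is available):

```python
import torch

a = torch.tensor(2.0)               # CPU scalar tensor
b = torch.tensor(3.0)               # CPU scalar tensor
c = torch.empty((), device="cuda")  # CUDA output tensor
torch.add(a, b, out=c)              # cross-device write is allowed for CPU scalars
print(c)                            # tensor(5., device='cuda:0')
```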

Fixes https://github.com/pytorch/pytorch/issues/121619 by side effect (we get into a situation where we're writing a CPU scalar into a FakeTensor that is actually a meta tensor)

Test Plan:
- new test

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135140
Approved by: https://github.com/williamwen42, https://github.com/yanboliang
2024-09-05 19:08:48 +00:00
098431a29d Update Resize.cpp with new device type (#135117)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135117
Approved by: https://github.com/egienvalue
2024-09-05 18:53:13 +00:00
be660ea2d3 [PT2] Directly set meta.val in group_batch_fusion_aten (#135078)
Summary: instead of using FakeTensorProp after the pass

Differential Revision: D62162640

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135078
Approved by: https://github.com/frank-wei
2024-09-05 18:17:06 +00:00
52c7c89ea4 [Inductor][CPP] Leverage full bits for BF16/FP16 vectorization (#126502)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/126502
Approved by: https://github.com/jgong5, https://github.com/jansel
2024-09-05 17:17:46 +00:00
1efd341d15 [fake_tensor] Move unrecognized_type NotImplemented before ConstProp (#135033)
We should not try to do ConstProp on unrecognized types (e.g. subclasses).
For those types, returning NotImplemented falls through to the next torch_dispatch.
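
As a general illustration of that fallthrough mechanism (not the FakeTensor code itself), a subclass can decline an op like this:

```python
import torch

class Passthrough(torch.Tensor):
    @classmethod
    def __torch_dispatch__(cls, func, types, args=(), kwargs=None):
        # Returning NotImplemented declines the op, so dispatch falls
        # through to the next handler instead of this subclass.
        return NotImplemented
```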

Test:
```
 python test/functorch/test_aotdispatch.py -k test_aot_test_subclasses_with_tensor_factories
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135033
Approved by: https://github.com/zou3519, https://github.com/bdhirsh
2024-09-05 17:09:41 +00:00
a096f2899d Add torch.serialization.skip_data context manager (#134504)
## Semantic

The semantic is
(1) By default `torch.serialization.skip_data(materialize_fake_tensors=False)` will make `torch.save` skip writing storages (but reserve space for them in the checkpoint).

```python
import torch
import torch.nn as nn

sd = nn.Linear(3, 5).state_dict()
with torch.serialization.skip_data():
    torch.save(sd, 'foo.pt')
print(torch.load('foo.pt', weights_only=True))
```

(2) With `torch.serialization.skip_data(materialize_fake_tensors=True)`, if a FakeTensor is passed to `torch.save`, the pickler will treat these FakeTensors as being "materialized": space will be reserved in the checkpoint for the associated storage bytes, and when loading, the type will be Tensor instead of FakeTensor.

```python
import torch
import torch.nn as nn
from torch._subclasses.fake_tensor import FakeTensorMode

with FakeTensorMode():
    m = nn.Linear(3, 5, dtype=torch.float16, device='cuda')

sd = m.state_dict()
with torch.serialization.skip_data(materialize_fake_tensors=True):
    torch.save(sd, 'bla.pt')
print(torch.load('bla.pt', weights_only=True))
# OrderedDict([('weight', tensor([[0., 0., 0.],
#        [0., 0., 0.],
#        [0., 0., 0.],
#        [0., 0., 0.],
#        [0., 0., 0.]], device='cuda:0', dtype=torch.float16)), ('bias', tensor([0., 0., 0., 0., 0.], device='cuda:0', dtype=torch.float16))])

```

## Follow Ups

- [ ] `torch.load` semantic for skip_data context manager
- [ ] Mechanism for getting offsets of storages saved via this method (for writing in a separate pass)

Differential Revision: [D62238610](https://our.internmc.facebook.com/intern/diff/D62238610)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/134504
Approved by: https://github.com/albanD
2024-09-05 16:53:39 +00:00
dbeb8a1691 Render log filepaths that are not anchored in torch's directory in a reasonable way (#135165)
For example, if I do TORCH_LOGS=fbscribelogger I'll get:

```
I0904 17:59:07.567000 3672513 fbscribelogger/__init__.py:161] stop
```

instead of

```
I0904 12:46:15.332000 2930287 ../../../../../home/ezyang/local/a/pytorch-env/lib/python3.10/site-packages/fbscribelogger/__init__.py:161] stop
```

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135165
Approved by: https://github.com/Skylion007
2024-09-05 16:48:09 +00:00
b1f72e2984 Gradient scaler for DTensor (#132816)
Solve the request [here](https://github.com/pytorch/pytorch/issues/120003#issuecomment-2248805798).
Enable DTensor input in the gradient scaler's APIs, especially `.unscale_()`.
A corresponding dispatch strategy is added to accept DTensor input.

To let `found_inf` be reduced across devices, we add an allreduce at dispatch, after the dispatch strategy and kernel run.
Since `aten._amp_foreach_non_finite_check_and_unscale_.default` is an in-place op, `grad_scale` as arg[0] will be mutated in place, so redesigning a strategy or refactoring the kernel would not help.
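
A minimal sketch of the API flow this enables (assumes a CUDA device; per this PR, the same flow should also work when the gradients are DTensors on an initialized device mesh):

```python
import torch

model = torch.nn.Linear(8, 8).cuda()
opt = torch.optim.SGD(model.parameters(), lr=0.1)
scaler = torch.amp.GradScaler("cuda")

with torch.autocast("cuda", dtype=torch.float16):
    loss = model(torch.randn(4, 8, device="cuda")).sum()
scaler.scale(loss).backward()
scaler.unscale_(opt)   # the API extended here to accept DTensor gradients
scaler.step(opt)
scaler.update()
```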

The test files check the following, under 1-d (dp) and 2-d (dp, tp) cases:
1. whether the non-inf values are unscaled
2. whether all DTensors on each device can find inf, even when it is not on their device
3. whether new parameters are generated when no inf is found
4. whether the scale is updated when inf is found

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132816
Approved by: https://github.com/XilunWu, https://github.com/weifengpy, https://github.com/wanchaol
2024-09-05 16:44:32 +00:00
bb3c2408f4 [inductor][test] in test_unbacked_symints, replace inductor's skipCUDAIf with common device type's skipcudaif (#133936)
Differential Revision: D61506212

Use `skipCUDAIf` from `torch.testing._internal.common_device_type` if we create the test class with `instantiate_device_type_tests`.

`instantiate_device_type_tests` makes sure the class has the attr `device_type`, which works with `skipCUDAIf` from `torch.testing._internal.common_device_type`.

Also skipping test_vertical_pointwise_reduction_fusion for the cpu test class, since the test expects cuda.

FAILED [0.0026s] test/inductor/test_unbacked_symints.py::TestUnbackedSymintsCPU::test_vertical_pointwise_reduction_fusion_cpu - AttributeError: 'TestUnbackedSymintsCPU' object has no attribute 'device'

repro:
```
CUDA_VISIBLE_DEVICES="" pytest test/inductor/test_unbacked_symints.py -k cpu -v
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/133936
Approved by: https://github.com/ColinPeppler, https://github.com/desertfire
2024-09-05 16:40:14 +00:00
2c99f17a32 Implement VariableTracker.python_type() (#134215)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/134215
Approved by: https://github.com/amjames, https://github.com/jansel
2024-09-05 16:35:47 +00:00
0043dcd79e Switch torch pt2e xnnpack tests to use export_for_training (#134788)
Migrate all the callsites inside the pt2e XNNPACK tests to use export_for_training.

Differential Revision: D61994553

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134788
Approved by: https://github.com/mergennachin
2024-09-05 16:11:18 +00:00
2e2fb668fa Upgrade expecttest to 0.2.1 (#135136)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135136
Approved by: https://github.com/albanD, https://github.com/atalman, https://github.com/Skylion007
2024-09-05 16:05:35 +00:00
9d24f945ba [CI] Use larger instance for building triton whl (#135201)
When running the "Build Triton Wheels" CI jobs, builds failed due to a lack of resources. This PR uses a larger runner to avoid these issues.

The failure message looks like:

```
Process completed with exit code 137.
```

Related running actions:
Failed actions: https://github.com/pytorch/pytorch/actions/runs/10714445036
Success actions: https://github.com/pytorch/pytorch/actions/runs/10716710830

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135201
Approved by: https://github.com/chuanqi129, https://github.com/atalman
2024-09-05 14:36:23 +00:00
ecbd715363 [Intel GPU][Windows] Fix overriding default CMAKE_CXX_FLAGS (#135093)
The root cause is that `/EHsc` is part of the default `CMAKE_CXX_FLAGS` in CMake.
The fix is to not override the default `CMAKE_CXX_FLAGS`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135093
Approved by: https://github.com/EikanWang, https://github.com/atalman
2024-09-05 12:52:43 +00:00
58f2477a26 [Dynamo] Support builtin function frozenset (#134563)
Support builtin function frozenset in dynamo
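
A minimal sketch of the kind of code this lets Dynamo trace (the shapes here are static, so the membership test folds to a constant):

```python
import torch

@torch.compile
def f(x):
    allowed = frozenset({2, 4, 8})
    if x.shape[0] in allowed:
        return x + 1
    return x - 1

print(f(torch.ones(2)))  # tensor([2., 2.])
```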

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134563
Approved by: https://github.com/anijain2305, https://github.com/EikanWang, https://github.com/jansel
2024-09-05 12:15:10 +00:00
43dcb4bb61 Revise CPU vectorization ISA support API (#135075)
Revising (mostly renaming) the CPU vectorization ISA support API (non-frontend-user-facing). Also added an AVX512_BF16 ISA detection API.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135075
Approved by: https://github.com/leslie-fang-intel, https://github.com/jgong5, https://github.com/ezyang
2024-09-05 12:14:56 +00:00
50d1e37079 [AOTI] Fix a unbacked symint retrieve bug (#134670)
Summary: Fix https://github.com/pytorch/pytorch/issues/134081. When an unbacked symint is computed as the shape of a tensor from a tuple, the generated C++ code needs to use std::get<> to extract the tensor.

Differential Revision: [D62142113](https://our.internmc.facebook.com/intern/diff/D62142113)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/134670
Approved by: https://github.com/angelayi, https://github.com/22quinn, https://github.com/chenyang78
2024-09-05 11:34:14 +00:00
b99ef1a02e Update torch-xpu-ops pin (ATen XPU implementation) (#135185)
Release cycle for PyTorch 2.5
1. Update specific AOT targets for Windows. On Windows, the AOT target list prefers Intel client GPUs.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135185
Approved by: https://github.com/EikanWang
2024-09-05 10:05:23 +00:00
8a5c8e5db9 Update unbacked symints in masked_select more precisely (#134899)
## Summary
At the moment, the fake impl for `masked_select` simply sets the upper range of its size-like SymInt to `sys.maxsize` (9223372036854775807, the max value for a signed int64) if there are any SymInts in the original input tensor shape. This PR constrains the range more intelligently by using the upper ranges of each SymInt in the input tensor shape.
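
As a hypothetical illustration of the tighter bound: for an input of shape `(s0, 128)` where `s0` has an upper bound of 1024, the output of `masked_select` can hold at most `1024 * 128` elements.

```python
import sys

# The product of per-dim upper bounds gives a far tighter numel bound
# than the previous default of sys.maxsize.
numel_upper_bound = 1024 * 128          # 131072
assert numel_upper_bound < sys.maxsize  # sys.maxsize == 9223372036854775807
```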

This solves an issue where a model being lowered to Executorch errors during memory planning, because the memory allocated for `masked_select` ended up exceeding the 64-bit address space (`INT_MAX * size(dtype)`).

## Test plan
- Passes existing unit tests (tests case where upper bound is inf)
- Added unit test to verify upper bound reduction calculation
- Tested end-to-end by exporting with TORCH_LOGS="export" and ensuring that the range for `masked_select`'s SymInt size has the correct upper bound
Pull Request resolved: https://github.com/pytorch/pytorch/pull/134899
Approved by: https://github.com/ezyang
2024-09-05 09:01:06 +00:00
c7328dff7f Enhance the stability of the complex divide code (#134647)
In C++, when a floating-point literal (e.g., 3.14) is compared with a variable of type float, the literal is by default interpreted as a double.
```c++
float f = 3.14f;
if (f == 3.14) { // f is promoted to double; the comparison runs in double precision
    // Do something
}
```
If a device does not support double, an error will occur.
This PR addresses the issue of complex64 errors on machines that do not support double operations.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134647
Approved by: https://github.com/EikanWang, https://github.com/albanD
2024-09-05 08:36:37 +00:00
749dc6ceda [inductor] [cpp] use_local_acc if template_buffer_has_other_users (#135081)
Fix the compilation error of `coat_lite_mini` in timm and `YituTechConvBert` in HF:
```
/tmp/tmpuu94adg_/nf/cnf3zm677wbfjzzll522zvjp57g44udzfnj66ac2t5b2odvfqpts.cpp:239:33: error: invalid conversion from ‘const float*’ to ‘float*’ [-fpermissive]
  239 |                                 &(in_ptr2[static_cast<int64_t>(n_start + (192L*m_start) + (Nr*nci) + ((-1L)*Nr*nc))]),
      |                                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |                                 |
      |                                 const float*
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135081
Approved by: https://github.com/jgong5
ghstack dependencies: #134984
2024-09-05 08:31:31 +00:00
eaeae0ac95 [c10d] Change collective to take in a list of tensors so it work fully for all collectives (#135049)
We found that currently we only pass one input and one output tensor to the `collective` function, and this causes the NaN check, work numel stats, and FR input/output sizes to be inaccurate for all-to-all, scatter, and reduce. So we let the collective take in a list of tensors to ensure it works for all collectives inside PGNCCL.

This partially reverts what we did in https://github.com/pytorch/pytorch/pull/119421, and down the road we will have another round of cleanup on the collective to make it cleaner. For now, at least for the sake of correctness, we changed it back.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135049
Approved by: https://github.com/kwen2501
2024-09-05 07:56:56 +00:00
5a0e7a408f restore CSE'd node metadata in runtime asserts pass (#134516)
Adds val, and optionally stack_trace & nn_module_stack metadata back to SymInt compute nodes that we CSE, with a hook on `graph.create_node()`. Not sure if there's other metadata we want to populate here?

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134516
Approved by: https://github.com/ezyang
2024-09-05 07:50:04 +00:00
81a8624296 [Intel GPU] Customized XPU behaviour in indexing, group norm (#134453)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/134453
Approved by: https://github.com/EikanWang, https://github.com/albanD
ghstack dependencies: #133980
2024-09-05 07:41:57 +00:00
731fd3172a [inductor] [cpp] generate reindexer for each epilogue_node (#134984)
Fixes the FP32 accuracy failure of `levit_128` in timm.

Previously, we used `Y`, the output of the final epilogue node, to calculate the reindexer. We actually need to use each epilogue node to calculate the reindexer from the GEMM output to that epilogue node.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134984
Approved by: https://github.com/jgong5
2024-09-05 07:08:31 +00:00
9d705605dd Fix decomp behaviour in export training IR (#134801)
Subset of the changes in https://github.com/pytorch/pytorch/pull/132901; that one can't land because it is too complicated. The rest of the change will be implemented as a follow-up after the export design meeting. This part just makes the training IR -> inference IR decomp take the same path as normal export.

Differential Revision: [D62000525](https://our.internmc.facebook.com/intern/diff/D62000525)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/134801
Approved by: https://github.com/avikchaudhuri, https://github.com/angelayi
2024-09-05 06:37:44 +00:00
05feb6e4ed [Inductor] support masked vectorization for the tail_loop for dynamic shapes (#131745)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/131745
Approved by: https://github.com/jgong5, https://github.com/leslie-fang-intel, https://github.com/jansel
2024-09-05 06:17:48 +00:00
7b280c31ba [export] dynamic_shapes serialization, load/dump (#134718)
Adds utility functions `_dump_dynamic_shapes` and `_load_dynamic_shapes`.

- `_dump_dynamic_shapes`: dynamic shapes spec -> serialized format:
    - takes in the `dynamic_shapes` pytree object you'd feed into `export()`, and dumps into serialized format
- `_load_dynamic_shapes`: serialized format -> dynamic shapes spec
    - takes the serialized format, and produces a `dynamic_shapes` object you feed into `export()`

For example with dumping:
```
import torch
from torch.export import Dim
# Assumed import location for the utilities added in this PR:
from torch.export.dynamic_shapes import _dump_dynamic_shapes

dx = Dim("dx", min=4, max=16)
dy = dx + 1

inputs = (
    [
        torch.randn(4, 4),
        torch.randn(5, 4),
    ],
    torch.randn(4),
    torch.randn(4, 4),
    "hello",
)
dynamic_shapes = {
    "a": [
        (dx, 4),
        (dy, 4),
    ],
    "b": (Dim.AUTO,),
    "c": None,
    "d": None,
}
out = _dump_dynamic_shapes(dynamic_shapes, inputs)
```

would generate the following output:
```
DynamicShapesSpec(
    dynamic_shapes=(
        [
            ['dx', 4],
            ['dx + 1', 4],
        ],
        ['_DimHint.STATIC'],
        ['_DimHint.STATIC', '_DimHint.STATIC'],
        None,
    ),
    dims={
        'dx': RootDim(
            min=4,
            max=16,
            derived=['dx + 1'],
        ),
    },
)
```

The serialized format contains 2 keys, `dynamic_shapes` and `dims`.
- `dynamic_shapes` is the pytree structure matching the input to `export()`, with strings in place of Dim names and enums, and ints/Nones otherwise. Each tensor is represented with a list of shapes, non-tensors with Nones.
- `dims` contains the min/max range and derived-dims info for each root dim.

The test cases show some roundtrippability guarantees for these functions. Definitely taking naming suggestions for them :)

Follow up: utility function to extract serializable format from ExportedProgram.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/134718
Approved by: https://github.com/avikchaudhuri
2024-09-05 05:39:44 +00:00
f2a7228aed [executorch hash update] update the pinned executorch hash (#135162)
This PR is auto-generated nightly by [this action](https://github.com/pytorch/pytorch/blob/main/.github/workflows/nightly.yml).
Update the pinned executorch hash.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135162
Approved by: https://github.com/pytorchbot
2024-09-05 04:21:51 +00:00
8fb1281db9 [Traceable FSDP2] Skip _backward_prefetch under compile, and rely on compiler pass to have prefetching (#135163)
Before this PR, when traceable FSDP2 + AC is run, an error would be thrown:
```
  File "/data/users/willfeng/pytorch/torch/_dynamo/variables/builtin.py", line 1449, in call_getitem
    return args[0].call_method(tx, "__getitem__", args[1:], kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/users/willfeng/pytorch/torch/_dynamo/variables/lists.py", line 435, in call_method
    return super().call_method(tx, name, args, kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/users/willfeng/pytorch/torch/_dynamo/variables/lists.py", line 392, in call_method
    return super().call_method(tx, name, args, kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/users/willfeng/pytorch/torch/_dynamo/variables/lists.py", line 131, in call_method
    return self.getitem_const(tx, value)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/users/willfeng/pytorch/torch/_dynamo/variables/lists.py", line 106, in getitem_const
    return self.items[index]
Error: Index out of bound

from user code:
   File "<eval_with_key>.5", line 105, in forward
    aot0_trace_wrapped = torch__dynamo__trace_wrapped_higher_order_op_self_invoke(aot0_tangents_1, bw_state = aot0_primals_34);  aot0_tangents_1 = None
  File "/data/users/willfeng/pytorch/torch/_dynamo/_trace_wrapped_higher_order_op.py", line 74, in self_invoke
    return _trace_wrapped_op(*args, **dyn_kwargs, **kwargs)
  File "/data/users/willfeng/pytorch/torch/_dynamo/external_utils.py", line 132, in call_hook_from_backward_state
    return getattr(bw_state, hook_name)(*args, **kwargs)
  File "/data/users/willfeng/pytorch/torch/distributed/_composable/fsdp/_fsdp_state.py", line 271, in _pre_backward
    self._fsdp_param_group.pre_backward(default_prefetch)
  File "/data/users/willfeng/pytorch/torch/distributed/_composable/fsdp/_fsdp_param_group.py", line 332, in pre_backward
    self._backward_prefetch()
  File "/data/users/willfeng/pytorch/torch/distributed/_composable/fsdp/_fsdp_param_group.py", line 417, in _backward_prefetch
    target_fsdp_param_group = self.comm_ctx.post_forward_order[target_index]
```

Since it's okay to rely on the compiler to recover the "prefetching" pattern, we will skip this `_backward_prefetch()` code path during tracing to avoid the error, and have a compiler pass (in a future PR) achieve the equivalent prefetching overlap.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135163
Approved by: https://github.com/awgu
2024-09-05 03:32:04 +00:00
a7a53b796b [Intel GPU]device guard codegen for XPU (#133980)
This PR is a supplement to #130082. The previous PR #130082 fulfills the basic functionality of the codegen, but we found it fails to handle the device-sameness check in lots of unit tests. The current PR aims to facilitate XPU device guard code generation.

With current PR, the code snippet in `RegisterXPU.cpp` is as follows, where we can see the device guard is successfully generated.
```c++
namespace {
at::Tensor & wrapper_XPU_Tensor_float_out_normal_out(const at::Tensor & mean, double std, ::std::optional<at::Generator> generator, at::Tensor & out) {
  std::optional<Device> common_device = std::nullopt;
(void)common_device; // Suppress unused variable warning
  c10::impl::check_and_update_common_device(common_device, out, "wrapper_XPU_Tensor_float_out_normal_out", "out");
  c10::impl::check_and_update_common_device(common_device, mean, "wrapper_XPU_Tensor_float_out_normal_out", "mean");
  const OptionalDeviceGuard device_guard(device_of(out));
  return at::native::normal_out(mean, std, generator, out);
}
} // anonymous namespace
```
Nevertheless, without current change, the generated code is
```c++
namespace {
at::Tensor & wrapper_XPU_Tensor_float_out_normal_out(const at::Tensor & mean, double std, ::std::optional<at::Generator> generator, at::Tensor & out) {
    // No device check
  // DeviceGuard omitted
  return at::native::normal_out(mean, std, generator, out);
}
} // anonymous namespace
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/133980
Approved by: https://github.com/EikanWang, https://github.com/malfet
2024-09-05 01:53:31 +00:00
30b98940b8 Fix typo in comment (#135111)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135111
Approved by: https://github.com/aorenste, https://github.com/oulgen
2024-09-05 01:39:04 +00:00
724faac260 [FSDP] casting input args with dataclass(frozen=True) (#135067)
resolve: https://github.com/pytorch/pytorch/pull/135029

When enabling mixed precision, FSDP casts input args to the desired dtype by calling `_apply_to_tensors`. When the input args contain a `dataclass(frozen=True)`, we hit the following runtime error, because `_apply_to_tensors` uses `setattr`:

`dataclasses.FrozenInstanceError: cannot assign to field 'some_key'`. The fix is to use the dataclasses API `dataclasses.replace`.
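
A minimal sketch of the failure and the fix, outside of FSDP:

```python
import dataclasses

@dataclasses.dataclass(frozen=True)
class Inputs:
    x: float

inp = Inputs(x=1.0)
# setattr(inp, "x", 2.0) would raise dataclasses.FrozenInstanceError
new_inp = dataclasses.replace(inp, x=2.0)  # builds a new instance instead of mutating
```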

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135067
Approved by: https://github.com/awgu
2024-09-05 01:19:53 +00:00
04e11c7eed Update current scripts used for setting up s390x runners (#129866)
Update current scripts used for setting up s390x runners

Just a documentation update.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129866
Approved by: https://github.com/malfet, https://github.com/huydhn
2024-09-05 01:17:54 +00:00
a3e0d4bf07 [FlexAttention] Fix mismatched backward strides for eager impl (#135152)
# Fixes:
The first repro from: https://github.com/pytorch/pytorch/issues/134888

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135152
Approved by: https://github.com/Chillee
2024-09-05 01:14:53 +00:00
27d86f93fe Remove redundant code (#134955)
Remove GetPrivateUse1HooksInterface
Pull Request resolved: https://github.com/pytorch/pytorch/pull/134955
Approved by: https://github.com/Skylion007
2024-09-05 01:11:32 +00:00
32f45f01a9 [dynamo] Retire CompileProfiler (#135133)
Fixes confusion in https://github.com/pytorch/pytorch/issues/113443

We have TORCH_LOGS that supersedes CompileProfiler

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135133
Approved by: https://github.com/ezyang
ghstack dependencies: #135039, #135121, #135129, #135130
2024-09-05 01:08:40 +00:00
4a661e089a [FR] Add version based logic to FR script and make traces print can be filtered (#135154)
This PR passes the version around, so that we can have different behaviors for different versions of the FR dump. It also adds logic to filter by certain PGs (by desc) and ranks, to show only their traces.

Some minor refactors make the names more accurate and the util functions work.

<img width="1180" alt="image" src="https://github.com/user-attachments/assets/4ef8a2d6-1296-4a45-b9a7-6d3b48fbe233">

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135154
Approved by: https://github.com/wconstab
2024-09-05 00:59:32 +00:00
105ac2418c Fix binary builds artifact download (#135139)
By upgrading upload-artifacts action to v4.4.0

As the artifact store layout is different between v3 and v4 actions, artifacts uploaded by v3 cannot be downloaded by v4.

Should fix `Unable to download artifact(s): Artifact not found for name: libtorch-cpu-shared-with-deps-release`, which could be seen for example [here](https://github.com/pytorch/pytorch/actions/runs/10707740040/job/29690137218#step:7:29)

I.e. fix regression introduced by https://github.com/pytorch/pytorch/pull/135068

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135139
Approved by: https://github.com/atalman, https://github.com/huydhn
2024-09-05 00:43:34 +00:00
560f449d8f Fix: use clone_preserve_strides in auto_functionalized_v2 (#135142)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135142
Approved by: https://github.com/zou3519
ghstack dependencies: #134409
2024-09-05 00:39:48 +00:00
956da79bda [CUDA][AMP] Fix autocast_dtype (#133938)
Fixes #132715

The failure in #132715 is due to `autocast_dtype` being a thread-local variable. It causes `get_autocast_dtype()` to be inconsistent across threads.

To be exact, here is what is happening: the amp dtype is set to `bfloat16` on the main thread. The `backward` call runs on a side thread, so `at::autocast::prioritize` fails because `lower_precision_fp` defaults to `float16`:
6f738d6434/aten/src/ATen/autocast_mode.h (L221-L225)
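
A minimal repro sketch of the pattern (assumes a CUDA device): the forward runs under a bfloat16 autocast region on the main thread, while autograd executes the backward on a worker thread that previously saw the default `float16`.

```python
import torch

a = torch.randn(4, 4, device="cuda", requires_grad=True)
b = torch.randn(4, 4, device="cuda", requires_grad=True)
with torch.autocast("cuda", dtype=torch.bfloat16):
    out = torch.mm(a, b)   # computed in bfloat16 under autocast
out.sum().backward()       # backward runs on autograd's side thread
```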

This PR makes `autocast_dtype` thread-global so it is consistent among all threads of the forward and backward passes.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/133938
Approved by: https://github.com/soulitzer
2024-09-05 00:07:32 +00:00
977a909250 [CI] Build pytorch wheel with Torch XPU Operators on Windows (#133151)
# Description
This pipeline enables the CI build on Windows for PRs labeled with ciflow/xpu. It builds the torch binary with Torch XPU Operators on Windows using Visual Studio Build Tools 2022.

# Changes
1. Install xpu batch file (install_xpu.bat) - Checks whether the build machine has oneAPI in its environment and whether that version is the latest. If not, installs the latest publicly released oneAPI on the machine.
2. GHA callable pipeline (_win-build.yml) - Sets vc_year and use_xpu as parameters to configure the wheel build environment.
3. GHA workflow (xpu.yml) - Adds a new Windows build job and passes parameters to it.
4. Build wheels script (.ci/pytorch/win-test-helpers/build_pytorch.bat) - Prepares the environment for building, e.g. installs the oneAPI bundle.

# Note
1. For building wheels for Intel GPU, you need Visual Studio Build Tools version >= 2022.
2. This pipeline requires Visual Studio Build Tools 2022 to build wheels. For now, we specify "windows.4xlarge.nonephemeral" as the build machine label in the yaml file. We will soon request self-hosted runners with an Intel GPU and Visual Studio Build Tools 2022 installed.

Work for #114850

Pull Request resolved: https://github.com/pytorch/pytorch/pull/133151
Approved by: https://github.com/chuanqi129, https://github.com/atalman

Co-authored-by: chuanqiw <chuanqi.wang@intel.com>
2024-09-05 00:02:46 +00:00
b3ef0c99f5 [PP] Fix zero bubble composability with DP (#134052)
Moved all the backward functions (`stage_backward_input`, `stage_backward_weight`, `stage_backward`) under the same `backward_maybe_with_nosync` function which controls the logic of the data parallel wrappers.

FSDP was not working with zero bubble PP because there are twice as many "backward" calls and we update the weight gradients after `autograd.grad` is called. As a result, we need to manually call the FSDP `post_backward_hook()` after the weights have the correct gradients.

Fixes the tests:
`python test/distributed/_composable/test_composability/test_pp_composability.py ComposabilityTest.test_manual_with_data_parallel_dp_type_FSDP_ScheduleClass0_use_new_runtime_False`

`python test/distributed/_composable/test_composability/test_pp_composability.py ComposabilityTest.test_manual_with_data_parallel_dp_type_DDP_ScheduleClass0_use_new_runtime_False`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134052
Approved by: https://github.com/kwen2501
2024-09-04 23:46:29 +00:00
43c9b4e0e6 Fix unintentional deduplication of returned tensors (#134726)
When CSE was used, returned tensors that had gone through identical
processing steps, but were distinct from a data perspective, were pruned
out of the graph. This commit protects tensors that are direct outputs
from being pruned, and adds a test for this behavior.

Closes #88813 and #114344

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134726
Approved by: https://github.com/amjames, https://github.com/zou3519, https://github.com/bdhirsh
2024-09-04 23:42:56 +00:00
00a8666708 [ONNX] Support output_names in dynamic_axes when dynamo=True (#135134)
Prior to this PR, if output_names appeared in dynamic_axes, it errored when we converted it to the dynamic_shapes of torch.export, as we only recognized input_names.
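
A minimal sketch of the case this fixes (the file name and axis names are illustrative):

```python
import torch

model = torch.nn.Linear(4, 4)
x = torch.randn(2, 4)
torch.onnx.export(
    model, (x,), "model.onnx",
    input_names=["x"], output_names=["y"],
    dynamic_axes={"x": {0: "batch"}, "y": {0: "batch"}},  # output name now handled
    dynamo=True,
)
```
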
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135134
Approved by: https://github.com/justinchuby
2024-09-04 23:42:13 +00:00
4f70b3cfae [CUDA][complex][TF32] Update test_noncontiguous_samples tolerances for complex64 (#134526)
A recent cuDNN heuristics change surfaces the same TF32 issue as `float32`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134526
Approved by: https://github.com/ezyang
2024-09-04 23:37:16 +00:00
359077fa43 [export] Fix indentation (#135128)
Summary: as title

Test Plan: CI

Differential Revision: D62195680

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135128
Approved by: https://github.com/tugsbayasgalan
2024-09-04 23:26:36 +00:00
9810ce9ca7 [PP] Go back to export instead of _export (#134299)
Reverts https://github.com/pytorch/pytorch/pull/130998 because FakeTensor + real device suffice to work around the autocast issue in HF.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134299
Approved by: https://github.com/lessw2020
2024-09-04 23:25:17 +00:00
804852c1f9 [dynamo] Search for _torchdynamo_inline only for functions (#135130)
Issue seen in https://github.com/pytorch/pytorch/issues/93633

Fixes https://github.com/pytorch/pytorch/issues/93633

Unable to create a testcase

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135130
Approved by: https://github.com/williamwen42, https://github.com/yanboliang
ghstack dependencies: #135039, #135121, #135129
2024-09-04 23:02:59 +00:00
13a4a0c60d [Inductor] Apply loop split optimization in codegen_node (#132389)
This PR applies a loop split optimization in codegen_node to avoid non-contiguous loads. When a vector is loaded in a non-contiguous manner because of a division in the index, we eliminate the division by splitting the loop, making the load contiguous.

Example:
```
import torch
import torch.nn as nn

class GNReLU(torch.nn.Module):
    def __init__(self, num_groups, num_channels):
        super(GNReLU, self).__init__()
        self.gn = nn.GroupNorm(num_groups, num_channels)

    def forward(self, x):
        return torch.nn.functional.relu(self.gn(x))

input = torch.randn(2, 960, 96, 96).to(memory_format=torch.channels_last)
m = GNReLU(32, 960).eval()
compiled_m = torch.compile(m)

with torch.no_grad():
    compiled_m(input)
```

Generated code:

- Before:
```
cpp_fused_native_group_norm_relu_0 = async_compile.cpp_pybinding(['const float*', 'const float*', 'const float*', 'float*', 'float*', 'float*'], '''
#include "/tmp/torchinductor_jiayisun/vu/cvuckxaygqfovv2zu2byqhcmiejbke7mdhf2rpgpr5mlscdev2hg.h"
extern "C"  void kernel(const float* in_ptr0,
                       const float* in_ptr1,
                       const float* in_ptr2,
                       float* out_ptr0,
                       float* out_ptr1,
                       float* out_ptr2)
{
    #pragma omp parallel num_threads(56)
    {
        int tid = omp_get_thread_num();
        {
            #pragma omp for collapse(2)
            for(long x0=static_cast<long>(0L); x0<static_cast<long>(2L); x0+=static_cast<long>(1L))
            {
                for(long x1=static_cast<long>(0L); x1<static_cast<long>(32L); x1+=static_cast<long>(1L))
                {
                    {
                        Welford<float> tmp_acc0 = Welford<float>();
                        Welford<at::vec::Vectorized<float>> tmp_acc0_vec = Welford<at::vec::Vectorized<float>>();
                        Welford<at::vec::Vectorized<float>> masked_tmp_acc0_vec = Welford<at::vec::Vectorized<float>>();
                        static WeightRecp<at::vec::Vectorized<float>> wrecps0(static_cast<long>(17280L));
                        for(long x2=static_cast<long>(0L); x2<static_cast<long>(9216L); x2+=static_cast<long>(1L))
                        {
                            for(long x3=static_cast<long>(0L); x3<static_cast<long>(16L); x3+=static_cast<long>(16L))
                            {
                                auto tmp0 = at::vec::Vectorized<float>::loadu(in_ptr0 + static_cast<long>(x3 + (30L*x1) + (960L*x2) + (8847360L*x0)), 16);
                                tmp_acc0_vec = welford_combine(tmp_acc0_vec, tmp0, &wrecps0);
                            }
                            for(long x3=static_cast<long>(16L); x3<static_cast<long>(30L); x3+=static_cast<long>(14L))
                            {
                                auto tmp0 = at::vec::Vectorized<float>::loadu(in_ptr0 + static_cast<long>(x3 + (30L*x1) + (960L*x2) + (8847360L*x0)), 14);
                                masked_tmp_acc0_vec = welford_combine(masked_tmp_acc0_vec, tmp0, 14, &wrecps0);
                            }
                        }
                        tmp_acc0 = welford_combine(tmp_acc0, welford_vec_reduce_all(masked_tmp_acc0_vec));
                        tmp_acc0 = welford_combine(tmp_acc0, welford_vec_reduce_all(tmp_acc0_vec));
                        out_ptr0[static_cast<long>(x1 + (32L*x0))] = static_cast<float>(tmp_acc0.mean);
                        out_ptr1[static_cast<long>(x1 + (32L*x0))] = static_cast<float>(tmp_acc0.m2);
                    }
                }
            }
        }
        {
            #pragma omp for collapse(2)
            for(long x0=static_cast<long>(0L); x0<static_cast<long>(2L); x0+=static_cast<long>(1L))
            {
                for(long x1=static_cast<long>(0L); x1<static_cast<long>(9216L); x1+=static_cast<long>(1L))
                {
                    for(long x2=static_cast<long>(0L); x2<static_cast<long>(960L); x2+=static_cast<long>(16L))
                    {
                        auto tmp0 = at::vec::Vectorized<float>::loadu(in_ptr0 + static_cast<long>(x2 + (960L*x1) + (8847360L*x0)), 16);
                        auto tmp1 =
                        [&]
                        {
                            __at_align__ std::array<float, 16> tmpbuf;
                            #pragma GCC unroll 16
                            for (long x2_inner = 0; x2_inner < 16; x2_inner++)
                            {
                                tmpbuf[x2_inner] = out_ptr0[static_cast<long>((32L*x0) + (c10::div_floor_integer((x2 + x2_inner), 30L)))];
                            }
                            return at::vec::Vectorized<float>::loadu(tmpbuf.data(), 16);
                        }
                        ()
                        ;
                        auto tmp3 =
                        [&]
                        {
                            __at_align__ std::array<float, 16> tmpbuf;
                            #pragma GCC unroll 16
                            for (long x2_inner = 0; x2_inner < 16; x2_inner++)
                            {
                                tmpbuf[x2_inner] = out_ptr1[static_cast<long>((32L*x0) + (c10::div_floor_integer((x2 + x2_inner), 30L)))];
                            }
                            return at::vec::Vectorized<float>::loadu(tmpbuf.data(), 16);
                        }
                        ()
                        ;
                        auto tmp12 = at::vec::Vectorized<float>::loadu(in_ptr1 + static_cast<long>(x2), 16);
                        auto tmp14 = at::vec::Vectorized<float>::loadu(in_ptr2 + static_cast<long>(x2), 16);
                        auto tmp2 = tmp0 - tmp1;
                        auto tmp4 = static_cast<float>(276480.0);
                        auto tmp5 = at::vec::Vectorized<float>(tmp4);
                        auto tmp6 = tmp3 / tmp5;
                        auto tmp7 = static_cast<float>(1e-05);
                        auto tmp8 = at::vec::Vectorized<float>(tmp7);
                        auto tmp9 = tmp6 + tmp8;
                        auto tmp10 = tmp9.rsqrt();
                        auto tmp11 = tmp2 * tmp10;
                        auto tmp13 = tmp11 * tmp12;
                        auto tmp15 = tmp13 + tmp14;
                        auto tmp16 = at::vec::clamp_min(tmp15, decltype(tmp15)(0));
                        tmp16.store(out_ptr2 + static_cast<long>(x2 + (960L*x1) + (8847360L*x0)));
                    }
                }
            }
        }
    }
}
''')

async_compile.wait(globals())
del async_compile

def call(args):
    arg2_1, = args
    args.clear()
    assert_size_stride(arg2_1, (2, 960, 96, 96), (8847360, 1, 92160, 960))
    buf0 = empty_strided_cpu((2, 32, 1, 1), (32, 1, 64, 64), torch.float32)
    buf1 = empty_strided_cpu((2, 32, 1, 1), (32, 1, 64, 64), torch.float32)
    buf3 = empty_strided_cpu((2, 960, 96, 96), (8847360, 1, 92160, 960), torch.float32)
    cpp_fused_native_group_norm_relu_0(arg2_1, _frozen_param3, _frozen_param2, buf0, buf1, buf3)
    del arg2_1
    return (buf3, )
```

- After:
```
cpp_fused_native_group_norm_relu_0 = async_compile.cpp_pybinding(['const float*', 'const float*', 'const float*', 'float*', 'float*', 'float*'], '''
#include "/tmp/torchinductor_jiayisun/vu/cvuckxaygqfovv2zu2byqhcmiejbke7mdhf2rpgpr5mlscdev2hg.h"
extern "C"  void kernel(const float* in_ptr0,
                       const float* in_ptr1,
                       const float* in_ptr2,
                       float* out_ptr0,
                       float* out_ptr1,
                       float* out_ptr2)
{
    #pragma omp parallel num_threads(56)
    {
        int tid = omp_get_thread_num();
        {
            #pragma omp for collapse(2)
            for(long x0=static_cast<long>(0L); x0<static_cast<long>(2L); x0+=static_cast<long>(1L))
            {
                for(long x1=static_cast<long>(0L); x1<static_cast<long>(32L); x1+=static_cast<long>(1L))
                {
                    {
                        Welford<float> tmp_acc0 = Welford<float>();
                        Welford<at::vec::Vectorized<float>> tmp_acc0_vec = Welford<at::vec::Vectorized<float>>();
                        Welford<at::vec::Vectorized<float>> masked_tmp_acc0_vec = Welford<at::vec::Vectorized<float>>();
                        static WeightRecp<at::vec::Vectorized<float>> wrecps0(static_cast<long>(17280L));
                        for(long x2=static_cast<long>(0L); x2<static_cast<long>(9216L); x2+=static_cast<long>(1L))
                        {
                            for(long x3=static_cast<long>(0L); x3<static_cast<long>(16L); x3+=static_cast<long>(16L))
                            {
                                auto tmp0 = at::vec::Vectorized<float>::loadu(in_ptr0 + static_cast<long>(x3 + (30L*x1) + (960L*x2) + (8847360L*x0)), 16);
                                tmp_acc0_vec = welford_combine(tmp_acc0_vec, tmp0, &wrecps0);
                            }
                            for(long x3=static_cast<long>(16L); x3<static_cast<long>(30L); x3+=static_cast<long>(14L))
                            {
                                auto tmp0 = at::vec::Vectorized<float>::loadu(in_ptr0 + static_cast<long>(x3 + (30L*x1) + (960L*x2) + (8847360L*x0)), 14);
                                masked_tmp_acc0_vec = welford_combine(masked_tmp_acc0_vec, tmp0, 14, &wrecps0);
                            }
                        }
                        tmp_acc0 = welford_combine(tmp_acc0, welford_vec_reduce_all(masked_tmp_acc0_vec));
                        tmp_acc0 = welford_combine(tmp_acc0, welford_vec_reduce_all(tmp_acc0_vec));
                        out_ptr0[static_cast<long>(x1 + (32L*x0))] = static_cast<float>(tmp_acc0.mean);
                        out_ptr1[static_cast<long>(x1 + (32L*x0))] = static_cast<float>(tmp_acc0.m2);
                    }
                }
            }
        }
        {
            #pragma omp for collapse(2)
            for(long x0=static_cast<long>(0L); x0<static_cast<long>(2L); x0+=static_cast<long>(1L))
            {
                for(long x1=static_cast<long>(0L); x1<static_cast<long>(9216L); x1+=static_cast<long>(1L))
                {
                    #pragma GCC ivdep
                    for(long x2=static_cast<long>(0L); x2<static_cast<long>(32L); x2+=static_cast<long>(1L))
                    {
                        for(long x3=static_cast<long>(0L); x3<static_cast<long>(16L); x3+=static_cast<long>(16L))
                        {
                            auto tmp0 = at::vec::Vectorized<float>::loadu(in_ptr0 + static_cast<long>(x3 + (30L*x2) + (960L*x1) + (8847360L*x0)), 16);
                            auto tmp1 = out_ptr0[static_cast<long>(x2 + (32L*x0))];
                            auto tmp4 = out_ptr1[static_cast<long>(x2 + (32L*x0))];
                            auto tmp12 = at::vec::Vectorized<float>::loadu(in_ptr1 + static_cast<long>(x3 + (30L*x2)), 16);
                            auto tmp14 = at::vec::Vectorized<float>::loadu(in_ptr2 + static_cast<long>(x3 + (30L*x2)), 16);
                            auto tmp2 = at::vec::Vectorized<float>(tmp1);
                            auto tmp3 = tmp0 - tmp2;
                            auto tmp5 = static_cast<float>(276480.0);
                            auto tmp6 = tmp4 / tmp5;
                            auto tmp7 = static_cast<float>(1e-05);
                            auto tmp8 = decltype(tmp6)(tmp6 + tmp7);
                            auto tmp9 = 1 / std::sqrt(tmp8);
                            auto tmp10 = at::vec::Vectorized<float>(tmp9);
                            auto tmp11 = tmp3 * tmp10;
                            auto tmp13 = tmp11 * tmp12;
                            auto tmp15 = tmp13 + tmp14;
                            auto tmp16 = at::vec::clamp_min(tmp15, decltype(tmp15)(0));
                            tmp16.store(out_ptr2 + static_cast<long>(x3 + (30L*x2) + (960L*x1) + (8847360L*x0)));
                        }
                        for(long x3=static_cast<long>(16L); x3<static_cast<long>(30L); x3+=static_cast<long>(14L))
                        {
                            auto tmp0 = at::vec::Vectorized<float>::loadu(in_ptr0 + static_cast<long>(x3 + (30L*x2) + (960L*x1) + (8847360L*x0)), 14);
                            auto tmp1 = out_ptr0[static_cast<long>(x2 + (32L*x0))];
                            auto tmp4 = out_ptr1[static_cast<long>(x2 + (32L*x0))];
                            auto tmp12 = at::vec::Vectorized<float>::loadu(in_ptr1 + static_cast<long>(x3 + (30L*x2)), 14);
                            auto tmp14 = at::vec::Vectorized<float>::loadu(in_ptr2 + static_cast<long>(x3 + (30L*x2)), 14);
                            auto tmp2 = at::vec::Vectorized<float>(tmp1);
                            auto tmp3 = tmp0 - tmp2;
                            auto tmp5 = static_cast<float>(276480.0);
                            auto tmp6 = tmp4 / tmp5;
                            auto tmp7 = static_cast<float>(1e-05);
                            auto tmp8 = decltype(tmp6)(tmp6 + tmp7);
                            auto tmp9 = 1 / std::sqrt(tmp8);
                            auto tmp10 = at::vec::Vectorized<float>(tmp9);
                            auto tmp11 = tmp3 * tmp10;
                            auto tmp13 = tmp11 * tmp12;
                            auto tmp15 = tmp13 + tmp14;
                            auto tmp16 = at::vec::clamp_min(tmp15, decltype(tmp15)(0));
                            tmp16.store(out_ptr2 + static_cast<long>(x3 + (30L*x2) + (960L*x1) + (8847360L*x0)), 14);
                        }
                    }
                }
            }
        }
    }
}
''')

async_compile.wait(globals())
del async_compile

def call(args):
    arg2_1, = args
    args.clear()
    assert_size_stride(arg2_1, (2, 960, 96, 96), (8847360, 1, 92160, 960))
    buf0 = empty_strided_cpu((2, 32, 1, 1), (32, 1, 64, 64), torch.float32)
    buf1 = empty_strided_cpu((2, 32, 1, 1), (32, 1, 64, 64), torch.float32)
    buf3 = empty_strided_cpu((2, 960, 96, 96), (8847360, 1, 92160, 960), torch.float32)
    cpp_fused_native_group_norm_relu_0(arg2_1, _frozen_param3, _frozen_param2, buf0, buf1, buf3)
    del arg2_1
    return (buf3, )
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132389
Approved by: https://github.com/leslie-fang-intel, https://github.com/jansel

Co-authored-by: Jiong Gong <jiong.gong@intel.com>
2024-09-04 22:42:46 +00:00
87842cc658 [dynamo][super] Corner case where the class is not present in the __mro__ (#135129)
I could not come up with a testcase. This was seen in https://github.com/pytorch/pytorch/issues/93633

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135129
Approved by: https://github.com/yanboliang
ghstack dependencies: #135039, #135121
2024-09-04 22:30:09 +00:00
d9ae92cd6e [Dynamo] Support for proxying frozen dataclasses (#134846)
Fixes https://github.com/pytorch/pytorch/issues/133858

Details: Previously, Dynamo would treat dataclasses as UserDefinedVariables. This was undesirable when we would like to proxy the value into the graph, which is needed for TensorSubclassMetadata. To rectify this, frozen dataclasses can now be proxied similarly to NamedTuples. We require the object to be frozen because, if arbitrary mutation were allowed, we would need to replay those mutations in the graph after construction of the object.

For tracing construction of the variable, the generated `__init__` for the dataclass uses `object.__setattr__` because frozen dataclasses throw errors on the usual `__setattr__` invocation. With this treatment, no special handling is needed in dynamo for frozen dataclass construction.
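
A small illustration of that construction detail, independent of Dynamo:

```python
import dataclasses

@dataclasses.dataclass(frozen=True)
class Meta:
    size: int

m = Meta(size=4)
# m.size = 8 would raise dataclasses.FrozenInstanceError; the generated
# __init__ itself assigns fields via object.__setattr__, which bypasses
# the frozen check:
object.__setattr__(m, "size", 8)
print(m.size)  # 8
```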

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134846
Approved by: https://github.com/bdhirsh, https://github.com/anijain2305
2024-09-04 22:17:00 +00:00
ed06772e35 [TorchElastic] add warning when users try to pass a "use_libuv" argument to create_c10d_store (#135062)
**Summary**
Extend the warning message to be more self-explanatory

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135062
Approved by: https://github.com/shuqiangzhang
2024-09-04 22:05:51 +00:00
fb1c580892 [BE][optim] Make pyright recognize exported symbols (#135043)
Follows pattern introduced by https://github.com/pytorch/pytorch/pull/80955 which [pyright](https://github.com/microsoft/pyright) prefers over `__all__` symbol, see https://github.com/microsoft/pylance-release/issues/2953#issuecomment-1168956296
Fixes https://github.com/pytorch/pytorch/issues/134985

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135043
Approved by: https://github.com/janeyx99
2024-09-04 21:53:46 +00:00
2276940f8c Make Dynamo inline through torch._library.custom_ops.autograd (#135066)
Fixes https://github.com/pytorch/pytorch/issues/135057

The bug was: in the situation where Dynamo graph breaks in the forward
and Compiled Autograd uses Dynamo to introspect the backward, we end up
running into an "Unsupported: inlining through SKIPFILES" error. The
solution is to mark the entirety of this module as inlineable.

Test Plan:
- new test

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135066
Approved by: https://github.com/bdhirsh, https://github.com/williamwen42, https://github.com/yanboliang
2024-09-04 21:48:28 +00:00
4e6df83d19 [PT] Add out variant for avg_pool1d and adaptive_avg_pool1d (#135051)
Test Plan: CI

Differential Revision: D62148410

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135051
Approved by: https://github.com/SS-JIA
2024-09-04 21:20:01 +00:00
a8611da86f [dynamo][backend match] Optimize backend match for common case (#135121)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135121
Approved by: https://github.com/williamwen42
ghstack dependencies: #135039
2024-09-04 21:02:29 +00:00
661 changed files with 18817 additions and 9893 deletions

View File

@ -1,5 +1,5 @@
0.6b
0.7b
manylinux_2_17
rocm6.2
7f07e8a1cb1f99627eb6d77f5c0e9295c775f3c7
e4ab195d2bd19e939c675a13280c29714c6ef9f2cf420690da150fa0cac043b1
9be04068c3c0857a4cfd17d7e39e71d0423ebac2
3e9e1959d23b93d78a08fcc5f868125dc3854dece32fd9458be9ef4467982291

View File

@ -286,18 +286,7 @@ case "$image" in
TRITON=yes
;;
pytorch-linux-focal-rocm-n-1-py3)
ANACONDA_PYTHON_VERSION=3.8
GCC_VERSION=9
PROTOBUF=yes
DB=yes
VISION=yes
ROCM_VERSION=6.0
NINJA_VERSION=1.9.0
CONDA_CMAKE=yes
TRITON=yes
;;
pytorch-linux-focal-rocm-n-py3)
ANACONDA_PYTHON_VERSION=3.8
ANACONDA_PYTHON_VERSION=3.10
GCC_VERSION=9
PROTOBUF=yes
DB=yes
@ -307,6 +296,17 @@ case "$image" in
CONDA_CMAKE=yes
TRITON=yes
;;
pytorch-linux-focal-rocm-n-py3)
ANACONDA_PYTHON_VERSION=3.10
GCC_VERSION=9
PROTOBUF=yes
DB=yes
VISION=yes
ROCM_VERSION=6.2
NINJA_VERSION=1.9.0
CONDA_CMAKE=yes
TRITON=yes
;;
pytorch-linux-jammy-xpu-2024.0-py3)
ANACONDA_PYTHON_VERSION=3.9
GCC_VERSION=11

View File

@ -108,10 +108,10 @@ ENV CMAKE_C_COMPILER cc
ENV CMAKE_CXX_COMPILER c++
COPY ./common/install_triton.sh install_triton.sh
COPY ./common/common_utils.sh common_utils.sh
COPY ci_commit_pins/triton-rocm.txt triton-rocm.txt
COPY ci_commit_pins/triton.txt triton.txt
COPY triton_version.txt triton_version.txt
RUN if [ -n "${TRITON}" ]; then bash ./install_triton.sh; fi
RUN rm install_triton.sh common_utils.sh triton-rocm.txt triton_version.txt
RUN rm install_triton.sh common_utils.sh triton.txt triton_version.txt
# Install AOTriton (Early fail)
COPY ./aotriton_version.txt aotriton_version.txt

View File

@ -1 +1 @@
d519b4d3a1ffdc81b45e2b1d4733423ce0577813
cd1c833b079adb324871dcbbe75b43d42ffc0ade

View File

@ -1 +0,0 @@
21eae954efa5bf584da70324b640288c3ee7aede

View File

@ -1 +1 @@
1b2f15840e0d70eec50d84c7a0575cb835524def
91b14bf5593cf58a8541f3e6b9125600a867d4ef

View File

@ -1 +1 @@
dedb7bdf339a3546896d4820366ca562c586bfa0
5fe38ffd73c2ac6ed6323b554205186696631c6f

View File

@ -4,12 +4,12 @@ set -ex
source "$(dirname "${BASH_SOURCE[0]}")/common_utils.sh"
TARBALL='aotriton.tar.bz2'
TARBALL='aotriton.tar.gz'
# This read command alwasy returns with exit code 1
read -d "\n" VER MANYLINUX ROCMBASE PINNED_COMMIT SHA256 < aotriton_version.txt || true
ARCH=$(uname -m)
AOTRITON_INSTALL_PREFIX="$1"
AOTRITON_URL="https://github.com/ROCm/aotriton/releases/download/${VER}/aotriton-${VER}-${MANYLINUX}_${ARCH}-${ROCMBASE}-shared.tar.bz2"
AOTRITON_URL="https://github.com/ROCm/aotriton/releases/download/${VER}/aotriton-${VER}-${MANYLINUX}_${ARCH}-${ROCMBASE}-shared.tar.gz"
cd "${AOTRITON_INSTALL_PREFIX}"
# Must use -L to follow redirects

View File

@ -10,6 +10,21 @@ if [[ -z $ROCM_VERSION ]]; then
exit 1;
fi
IS_UBUNTU=0
ID=$(grep -oP '(?<=^ID=).+' /etc/os-release | tr -d '"')
case "$ID" in
ubuntu)
IS_UBUNTU=1
;;
centos)
IS_UBUNTU=0
;;
*)
echo "Unable to determine OS..."
exit 1
;;
esac
# To make version comparison easier, create an integer representation.
save_IFS="$IFS"
IFS=. ROCM_VERSION_ARRAY=(${ROCM_VERSION})
@ -58,8 +73,7 @@ MIOPEN_CMAKE_COMMON_FLAGS="
"
# Pull MIOpen repo and set DMIOPEN_EMBED_DB based on ROCm version
if [[ $ROCM_INT -ge 60200 ]] && [[ $ROCM_INT -lt 60300 ]]; then
echo "ROCm 6.2 MIOpen does not need any patches, do not build from source"
exit 0
MIOPEN_BRANCH="release/rocm-rel-6.2-staging"
elif [[ $ROCM_INT -ge 60100 ]] && [[ $ROCM_INT -lt 60200 ]]; then
echo "ROCm 6.1 MIOpen does not need any patches, do not build from source"
exit 0
@ -93,12 +107,21 @@ else
exit 1
fi
yum remove -y miopen-hip
if [[ ${IS_UBUNTU} == 1 ]]; then
apt-get remove -y miopen-hip
else
yum remove -y miopen-hip
fi
git clone https://github.com/ROCm/MIOpen -b ${MIOPEN_BRANCH}
pushd MIOpen
# remove .git to save disk space since CI runner was running out
rm -rf .git
# Don't build CK to save docker build time
if [[ $ROCM_INT -ge 60200 ]]; then
sed -i '/composable_kernel/d' requirements.txt
fi
# Don't build MLIR to save docker build time
# since we are disabling MLIR backend for MIOpen anyway
if [[ $ROCM_INT -ge 50400 ]] && [[ $ROCM_INT -lt 50500 ]]; then
@ -111,10 +134,15 @@ cmake -P install_deps.cmake --minimum
# clean up since CI runner was running out of disk space
rm -rf /tmp/*
yum clean all
rm -rf /var/cache/yum
rm -rf /var/lib/yum/yumdb
rm -rf /var/lib/yum/history
if [[ ${IS_UBUNTU} == 1 ]]; then
apt-get autoclean && apt-get clean
rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
else
yum clean all
rm -rf /var/cache/yum
rm -rf /var/lib/yum/yumdb
rm -rf /var/lib/yum/history
fi
## Build MIOpen
mkdir -p build
@ -131,7 +159,11 @@ make -j $(nproc) package
# clean up since CI runner was running out of disk space
rm -rf /usr/local/cget
yum install -y miopen-*.rpm
if [[ ${IS_UBUNTU} == 1 ]]; then
sudo dpkg -i miopen-hip*.deb
else
yum install -y miopen-*.rpm
fi
popd
rm -rf MIOpen

View File

@ -12,10 +12,7 @@ conda_reinstall() {
as_jenkins conda install -q -n py_$ANACONDA_PYTHON_VERSION -y --force-reinstall $*
}
if [ -n "${ROCM_VERSION}" ]; then
TRITON_REPO="https://github.com/openai/triton"
TRITON_TEXT_FILE="triton-rocm"
elif [ -n "${XPU_VERSION}" ]; then
if [ -n "${XPU_VERSION}" ]; then
TRITON_REPO="https://github.com/intel/intel-xpu-backend-for-triton"
TRITON_TEXT_FILE="triton-xpu"
else

View File

@ -135,7 +135,7 @@ fi
)
GITHUB_REF=${GITHUB_REF:-$(git symbolic-ref -q HEAD || git describe --tags --exact-match)}
GIT_BRANCH_NAME=${GITHUB_REF##*/}
GIT_BRANCH_NAME="2.5"
GIT_COMMIT_SHA=${GITHUB_SHA:-$(git rev-parse HEAD)}
DOCKER_IMAGE_BRANCH_TAG=${DOCKER_IMAGE}-${GIT_BRANCH_NAME}
DOCKER_IMAGE_SHA_TAG=${DOCKER_IMAGE}-${GIT_COMMIT_SHA}

View File

@ -30,9 +30,14 @@ dill==0.3.7
#Pinned versions: 0.3.7
#test that import: dynamo/test_replay_record.py test_dataloader.py test_datapipe.py test_serialization.py
expecttest==0.1.6
expecttest==0.2.1
#Description: method for writing tests where test framework auto populates
# the expected output based on previous runs
#Pinned versions: 0.2.1
#test that import:
fbscribelogger==0.1.6
#Description: write to scribe from authenticated jobs on CI
#Pinned versions: 0.1.6
#test that import:

View File

@ -1 +1 @@
3.0.0
3.1.0

View File

@ -68,6 +68,8 @@ RUN rm install_rocm.sh
COPY ./common/install_rocm_magma.sh install_rocm_magma.sh
RUN bash ./install_rocm_magma.sh
RUN rm install_rocm_magma.sh
ADD ./common/install_miopen.sh install_miopen.sh
RUN bash ./install_miopen.sh ${ROCM_VERSION} && rm install_miopen.sh
ENV ROCM_PATH /opt/rocm
ENV PATH /opt/rocm/bin:$PATH
ENV PATH /opt/rocm/hcc/bin:$PATH
@ -100,10 +102,10 @@ ARG TRITON
# try to reach out to S3, which docker build runners don't have access to
COPY ./common/install_triton.sh install_triton.sh
COPY ./common/common_utils.sh common_utils.sh
COPY ci_commit_pins/triton-rocm.txt triton-rocm.txt
COPY ci_commit_pins/triton.txt triton.txt
COPY triton_version.txt triton_version.txt
RUN if [ -n "${TRITON}" ]; then bash ./install_triton.sh; fi
RUN rm install_triton.sh common_utils.sh triton-rocm.txt triton_version.txt
RUN rm install_triton.sh common_utils.sh triton.txt triton_version.txt
# Install AOTriton
COPY ./aotriton_version.txt aotriton_version.txt
@ -121,5 +123,8 @@ RUN bash ./install_cache.sh && rm install_cache.sh
ARG BUILD_ENVIRONMENT
ENV BUILD_ENVIRONMENT ${BUILD_ENVIRONMENT}
# Install LLVM dev version (Defined in the pytorch/builder github repository)
COPY --from=pytorch/llvm:9.0.1 /opt/llvm /opt/llvm
USER jenkins
CMD ["bash"]

View File

@ -49,13 +49,8 @@ if [[ ${BUILD_ENVIRONMENT} == *"parallelnative"* ]]; then
fi
# Enable LLVM dependency for TensorExpr testing
if [[ "$BUILD_ENVIRONMENT" == *rocm* ]]; then
export USE_LLVM=/opt/rocm/llvm
export LLVM_DIR=/opt/rocm/llvm/lib/cmake/llvm
else
export USE_LLVM=/opt/llvm
export LLVM_DIR=/opt/llvm/lib/cmake/llvm
fi
export USE_LLVM=/opt/llvm
export LLVM_DIR=/opt/llvm/lib/cmake/llvm
if [[ "$BUILD_ENVIRONMENT" == *executorch* ]]; then
# To build test_edge_op_registration

View File

@ -198,7 +198,7 @@ function install_torchrec_and_fbgemm() {
function clone_pytorch_xla() {
if [[ ! -d ./xla ]]; then
git clone --recursive --quiet https://github.com/pytorch/xla.git
git clone --recursive -b r2.5 https://github.com/pytorch/xla.git
pushd xla
# pin the xla hash so that we don't get broken by changes to xla
git checkout "$(cat ../.github/ci_commit_pins/xla.txt)"

View File

@ -596,6 +596,9 @@ test_single_dynamo_benchmark() {
test_inductor_micro_benchmark() {
TEST_REPORTS_DIR=$(pwd)/test/test-reports
if [[ "${TEST_CONFIG}" == *cpu* ]]; then
test_inductor_set_cpu_affinity
fi
python benchmarks/gpt_fast/benchmark.py --output "${TEST_REPORTS_DIR}/gpt_fast_benchmark.csv"
}

View File

@ -24,6 +24,12 @@ call %INSTALLER_DIR%\install_sccache.bat
if errorlevel 1 goto fail
if not errorlevel 0 goto fail
if "%USE_XPU%"=="1" (
:: Install xpu support packages
call %INSTALLER_DIR%\install_xpu.bat
if errorlevel 1 exit /b 1
)
:: Miniconda has been installed as part of the Windows AMI with all the dependencies.
:: We just need to activate it here
call %INSTALLER_DIR%\activate_miniconda3.bat
@ -43,6 +49,16 @@ if "%VC_VERSION%" == "" (
)
if errorlevel 1 goto fail
if not errorlevel 0 goto fail
if "%USE_XPU%"=="1" (
:: Activate xpu environment - VS env is required for xpu
call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
if errorlevel 1 exit /b 1
:: Reduce build time. Only the MTL self-hosted runner is available now
SET TORCH_XPU_ARCH_LIST=xe-lpg
SET USE_KINETO=0
)
@echo on
popd

View File

@ -0,0 +1,91 @@
@echo on
REM Description: Install Intel Support Packages on Windows
REM BKM reference: https://www.intel.com/content/www/us/en/developer/articles/tool/pytorch-prerequisites-for-intel-gpu/2-5.html
set XPU_INSTALL_MODE=%~1
if "%XPU_INSTALL_MODE%"=="" goto xpu_bundle_install_start
if "%XPU_INSTALL_MODE%"=="bundle" goto xpu_bundle_install_start
if "%XPU_INSTALL_MODE%"=="driver" goto xpu_driver_install_start
if "%XPU_INSTALL_MODE%"=="all" goto xpu_driver_install_start
:arg_error
echo Illegal XPU installation mode. The value can be "bundle"/"driver"/"all"
echo If the value is left empty, the default "bundle" mode is used
exit /b 1
:xpu_driver_install_start
:: TODO Need more testing for driver installation
set XPU_DRIVER_LINK=https://downloadmirror.intel.com/830975/gfx_win_101.5972.exe
curl -o xpu_driver.exe --retry 3 --retry-all-errors -k %XPU_DRIVER_LINK%
echo "XPU Driver installing..."
start /wait "Intel XPU Driver Installer" "xpu_driver.exe"
if errorlevel 1 exit /b 1
del xpu_driver.exe
if "%XPU_INSTALL_MODE%"=="driver" goto xpu_install_end
:xpu_bundle_install_start
set XPU_BUNDLE_PARENT_DIR=C:\Program Files (x86)\Intel\oneAPI
set XPU_BUNDLE_URL=https://registrationcenter-download.intel.com/akdlm/IRC_NAS/9d1a91e2-e8b8-40a5-8c7f-5db768a6a60c/w_intel-for-pytorch-gpu-dev_p_0.5.3.37_offline.exe
set XPU_PTI_URL=https://registrationcenter-download.intel.com/akdlm/IRC_NAS/9d1a91e2-e8b8-40a5-8c7f-5db768a6a60c/w_intel-pti-dev_p_0.9.0.37_offline.exe
set XPU_BUNDLE_VERSION=0.5.3+31
set XPU_PTI_VERSION=0.9.0+36
set XPU_BUNDLE_PRODUCT_NAME=intel.oneapi.win.intel-for-pytorch-gpu-dev.product
set XPU_PTI_PRODUCT_NAME=intel.oneapi.win.intel-pti-dev.product
set XPU_BUNDLE_INSTALLED=0
set XPU_PTI_INSTALLED=0
set XPU_BUNDLE_UNINSTALL=0
set XPU_PTI_UNINSTALL=0
:: Check if XPU bundle is target version or already installed
if exist "%XPU_BUNDLE_PARENT_DIR%\Installer\installer.exe" goto xpu_bundle_ver_check
goto xpu_bundle_install
:xpu_bundle_ver_check
"%XPU_BUNDLE_PARENT_DIR%\Installer\installer.exe" --list-products > xpu_bundle_installed_ver.log
for /f "tokens=1,2" %%a in (xpu_bundle_installed_ver.log) do (
if "%%a"=="%XPU_BUNDLE_PRODUCT_NAME%" (
echo %%a Installed Version: %%b
set XPU_BUNDLE_INSTALLED=1
if not "%XPU_BUNDLE_VERSION%"=="%%b" (
start /wait "Installer Title" "%XPU_BUNDLE_PARENT_DIR%\Installer\installer.exe" --action=remove --eula=accept --silent --product-id %XPU_BUNDLE_PRODUCT_NAME% --product-ver %%b --log-dir uninstall_bundle
set XPU_BUNDLE_UNINSTALL=1
)
)
if "%%a"=="%XPU_PTI_PRODUCT_NAME%" (
echo %%a Installed Version: %%b
set XPU_PTI_INSTALLED=1
if not "%XPU_PTI_VERSION%"=="%%b" (
start /wait "Installer Title" "%XPU_BUNDLE_PARENT_DIR%\Installer\installer.exe" --action=remove --eula=accept --silent --product-id %XPU_PTI_PRODUCT_NAME% --product-ver %%b --log-dir uninstall_bundle
set XPU_PTI_UNINSTALL=1
)
)
)
if errorlevel 1 exit /b 1
if exist xpu_bundle_installed_ver.log del xpu_bundle_installed_ver.log
if "%XPU_BUNDLE_INSTALLED%"=="0" goto xpu_bundle_install
if "%XPU_BUNDLE_UNINSTALL%"=="1" goto xpu_bundle_install
if "%XPU_PTI_INSTALLED%"=="0" goto xpu_pti_install
if "%XPU_PTI_UNINSTALL%"=="1" goto xpu_pti_install
goto xpu_install_end
:xpu_bundle_install
curl -o xpu_bundle.exe --retry 3 --retry-all-errors -k %XPU_BUNDLE_URL%
echo "XPU Bundle installing..."
start /wait "Intel Pytorch Bundle Installer" "xpu_bundle.exe" --action=install --eula=accept --silent --log-dir install_bundle
if errorlevel 1 exit /b 1
del xpu_bundle.exe
:xpu_pti_install
curl -o xpu_pti.exe --retry 3 --retry-all-errors -k %XPU_PTI_URL%
echo "XPU PTI installing..."
start /wait "Intel PTI Installer" "xpu_pti.exe" --action=install --eula=accept --silent --log-dir install_bundle
if errorlevel 1 exit /b 1
del xpu_pti.exe
:xpu_install_end
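Stripped of its `goto` plumbing, the script above is a reconcile loop: for each pinned product, remove any mismatched installed version, then install the pinned one. A sketch of that logic in shell, with `installer` as a stub standing in for Intel's offline installer (its interface here is an assumption):
```
#!/usr/bin/env bash
# Stub mimicking `installer.exe --list-products` output: "<product-id> <version>".
installer() {
  case "$1" in
    --list-products) echo "intel.oneapi.win.intel-pti-dev.product 0.9.0+30" ;;
    *) echo "installer $*" ;;   # pretend to install/remove
  esac
}
reconcile() {
  local product="$1" want="$2" have
  have=$(installer --list-products | awk -v p="$product" '$1 == p { print $2 }')
  if [ -n "$have" ] && [ "$have" != "$want" ]; then
    installer --action=remove --silent --product-id "$product"   # stale version
    have=""
  fi
  [ -z "$have" ] && installer --action=install --silent --product-id "$product"
}
reconcile intel.oneapi.win.intel-for-pytorch-gpu-dev.product "0.5.3+31"  # absent -> install
reconcile intel.oneapi.win.intel-pti-dev.product "0.9.0+36"              # mismatched -> remove, reinstall
```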

View File

@ -119,6 +119,11 @@ fi
# Test the package
/builder/check_binary.sh
if [[ "\$GPU_ARCH_TYPE" != *s390x* && "\$GPU_ARCH_TYPE" != *xpu* && "\$GPU_ARCH_TYPE" != *rocm* && "$PACKAGE_TYPE" != libtorch ]]; then
# Exclude s390x, xpu, rocm and libtorch builds from smoke testing
python /builder/test/smoke_test/smoke_test.py --package=torchonly --torch-compile-check disabled
fi
# Clean temp files
cd /builder && git clean -ffdx

View File

@ -90,7 +90,7 @@ fi
if [[ "$PACKAGE_TYPE" =~ .*wheel.* && -n "$PYTORCH_BUILD_VERSION" && "$PYTORCH_BUILD_VERSION" =~ .*rocm.* && $(uname) == "Linux" ]]; then
TRITON_REQUIREMENT="pytorch-triton-rocm==${TRITON_VERSION}; ${TRITON_CONSTRAINT}"
if [[ -n "$PYTORCH_BUILD_VERSION" && "$PYTORCH_BUILD_VERSION" =~ .*dev.* ]]; then
TRITON_SHORTHASH=$(cut -c1-10 $PYTORCH_ROOT/.ci/docker/ci_commit_pins/triton-rocm.txt)
TRITON_SHORTHASH=$(cut -c1-10 $PYTORCH_ROOT/.ci/docker/ci_commit_pins/triton.txt)
TRITON_REQUIREMENT="pytorch-triton-rocm==${TRITON_VERSION}+${TRITON_SHORTHASH}; ${TRITON_CONSTRAINT}"
fi
if [[ -z "${PYTORCH_EXTRA_INSTALL_REQUIREMENTS:-}" ]]; then

View File

@ -1 +1 @@
2eb4a60ed14a38260b85b0c765161f0ce45be6d1
r2.5

View File

@ -9,6 +9,7 @@ ciflow_push_tags:
- ciflow/inductor-rocm
- ciflow/inductor-perf-compare
- ciflow/inductor-micro-benchmark
- ciflow/inductor-micro-benchmark-cpu-x86
- ciflow/inductor-cu124
- ciflow/linux-aarch64
- ciflow/mps

View File

@ -1,6 +1,7 @@
boto3==1.19.12
hypothesis==6.56.4
expecttest==0.1.6
expecttest==0.2.1
fbscribelogger==0.1.6
librosa>=0.6.2
mpmath==1.3.0
networkx==2.8.7

View File

@ -15,9 +15,7 @@ REPO_DIR = SCRIPT_DIR.parent.parent
def read_triton_pin(device: str = "cuda") -> str:
triton_file = "triton.txt"
if device == "rocm":
triton_file = "triton-rocm.txt"
elif device == "xpu":
if device == "xpu":
triton_file = "triton-xpu.txt"
with open(REPO_DIR / ".ci" / "docker" / "ci_commit_pins" / triton_file) as f:
return f.read().strip()

View File

@ -39,9 +39,9 @@ SUPPORTED_PERIODICAL_MODES: Dict[str, Callable[[Optional[str]], bool]] = {
}
# The link to the published list of disabled jobs
DISABLED_JOBS_URL = "https://ossci-metrics.s3.amazonaws.com/disabled-jobs.json"
DISABLED_JOBS_URL = "https://ossci-metrics.s3.amazonaws.com/disabled-jobs.json?versionId=sxzMTP57qj.Vwz8dN1glkTK560Txq9W3"
# and unstable jobs
UNSTABLE_JOBS_URL = "https://ossci-metrics.s3.amazonaws.com/unstable-jobs.json"
UNSTABLE_JOBS_URL = "https://ossci-metrics.s3.amazonaws.com/unstable-jobs.json?versionId=8f1.4S3MupuHXH8t0waxyGnPsGHJYdv9"
# Some constants used to handle disabled and unstable jobs
JOB_NAME_SEP = "/"

View File

@ -325,6 +325,7 @@ def generate_wheels_matrix(
os: str,
arches: Optional[List[str]] = None,
python_versions: Optional[List[str]] = None,
use_split_build: bool = False,
) -> List[Dict[str, str]]:
package_type = "wheel"
if os == "linux" or os == "linux-aarch64" or os == "linux-s390x":
@ -371,7 +372,17 @@ def generate_wheels_matrix(
) and python_version == "3.13":
continue
if use_split_build and (
arch_version not in ["12.4", "12.1", "11.8", "cpu"] or os != "linux"
):
raise RuntimeError(
"Split build is only supported on linux with cuda 12.4, 12.1, 11.8, and cpu.\n"
f"Currently attempting to build on arch version {arch_version} and os {os}.\n"
"Please modify the matrix generation to exclude this combination."
)
# 12.1 linux wheels require PYTORCH_EXTRA_INSTALL_REQUIREMENTS to install
if (
arch_version in ["12.4", "12.1", "11.8"]
and os == "linux"
@ -385,6 +396,7 @@ def generate_wheels_matrix(
"desired_cuda": translate_desired_cuda(
gpu_arch_type, gpu_arch_version
),
"use_split_build": "True" if use_split_build else "False",
"devtoolset": (
"cxx11-abi" if arch_version == "cuda-aarch64" else ""
),
@ -400,7 +412,8 @@ def generate_wheels_matrix(
),
}
)
if arch_version != "cuda-aarch64":
# Special build for use on Colab. Python 3.10 for 12.1 CUDA
if python_version == "3.10" and arch_version == "12.1":
ret.append(
{
"python_version": python_version,
@ -409,40 +422,16 @@ def generate_wheels_matrix(
"desired_cuda": translate_desired_cuda(
gpu_arch_type, gpu_arch_version
),
"use_split_build": "True",
"use_split_build": "True" if use_split_build else "False",
"devtoolset": "",
"container_image": WHEEL_CONTAINER_IMAGES[arch_version],
"package_type": package_type,
"pytorch_extra_install_requirements": (
PYTORCH_EXTRA_INSTALL_REQUIREMENTS[arch_version] # fmt: skip
if os != "linux-aarch64"
else ""
),
"build_name": f"{package_type}-py{python_version}-{gpu_arch_type}{gpu_arch_version}-split".replace( # noqa: B950
"pytorch_extra_install_requirements": "",
"build_name": f"{package_type}-py{python_version}-{gpu_arch_type}{gpu_arch_version}-full".replace( # noqa: B950
".", "_"
),
}
)
# Special build for use on Colab. Python 3.10 for 12.1 CUDA
if python_version == "3.10" and arch_version == "12.1":
ret.append(
{
"python_version": python_version,
"gpu_arch_type": gpu_arch_type,
"gpu_arch_version": gpu_arch_version,
"desired_cuda": translate_desired_cuda(
gpu_arch_type, gpu_arch_version
),
"use_split_build": "False",
"devtoolset": "",
"container_image": WHEEL_CONTAINER_IMAGES[arch_version],
"package_type": package_type,
"pytorch_extra_install_requirements": "",
"build_name": f"{package_type}-py{python_version}-{gpu_arch_type}{gpu_arch_version}-full".replace( # noqa: B950
".", "_"
),
}
)
else:
ret.append(
{
@ -452,6 +441,7 @@ def generate_wheels_matrix(
"desired_cuda": translate_desired_cuda(
gpu_arch_type, gpu_arch_version
),
"use_split_build": "True" if use_split_build else "False",
"devtoolset": (
"cxx11-abi" if arch_version == "cpu-cxx11-abi" else ""
),
@ -461,12 +451,13 @@ def generate_wheels_matrix(
".", "_"
),
"pytorch_extra_install_requirements": (
PYTORCH_EXTRA_INSTALL_REQUIREMENTS["12.1"] # fmt: skip
PYTORCH_EXTRA_INSTALL_REQUIREMENTS["12.4"] # fmt: skip
if os != "linux" and gpu_arch_type != "xpu"
else ""
),
}
)
return ret

View File

@ -61,6 +61,7 @@ class BinaryBuildWorkflow:
# Mainly for macos
cross_compile_arm64: bool = False
macos_runner: str = "macos-14-xlarge"
use_split_build: bool = False
def __post_init__(self) -> None:
if self.abi_version:
@ -69,12 +70,20 @@ class BinaryBuildWorkflow:
)
else:
self.build_environment = f"{self.os}-binary-{self.package_type}"
if self.use_split_build:
# added to distinguish concurrency groups
self.build_environment += "-split"
def generate_workflow_file(self, workflow_template: jinja2.Template) -> None:
output_file_path = (
GITHUB_DIR
/ f"workflows/generated-{self.build_environment}-{self.branches}.yml"
)
if self.use_split_build:
output_file_path = (
GITHUB_DIR
/ f"workflows/generated-{self.build_environment}-{self.branches}"
)
with open(output_file_path, "w") as output_file:
GENERATED = "generated" # Note that please keep the variable GENERATED otherwise phabricator will hide the whole file
output_file.writelines([f"# @{GENERATED} DO NOT EDIT MANUALLY\n"])
@ -110,6 +119,20 @@ LINUX_BINARY_BUILD_WORFKLOWS = [
isolated_workflow=True,
),
),
BinaryBuildWorkflow(
os=OperatingSystem.LINUX,
package_type="manywheel",
build_configs=generate_binary_build_matrix.generate_wheels_matrix(
OperatingSystem.LINUX,
use_split_build=True,
arches=["11.8", "12.1", "12.4", "cpu"],
),
ciflow_config=CIFlowConfig(
labels={LABEL_CIFLOW_BINARIES, LABEL_CIFLOW_BINARIES_WHEEL},
isolated_workflow=True,
),
use_split_build=True,
),
BinaryBuildWorkflow(
os=OperatingSystem.LINUX,
package_type="conda",
@ -162,6 +185,21 @@ LINUX_BINARY_SMOKE_WORKFLOWS = [
),
branches="main",
),
BinaryBuildWorkflow(
os=OperatingSystem.LINUX,
package_type="manywheel",
build_configs=generate_binary_build_matrix.generate_wheels_matrix(
OperatingSystem.LINUX,
arches=["11.8", "12.1", "12.4"],
python_versions=["3.9"],
use_split_build=True,
),
ciflow_config=CIFlowConfig(
labels={LABEL_CIFLOW_PERIODIC},
),
branches="main",
use_split_build=True,
),
BinaryBuildWorkflow(
os=OperatingSystem.LINUX,
package_type="libtorch",

View File

@ -3,7 +3,7 @@
## Install prerequisites.
```
$ sudo dnf install docker
$ sudo dnf install podman podman-docker jq
```
## Add services.
@ -27,23 +27,48 @@ $ sudo systemctl enable --now qemu-user-static
## Rebuild the image
In order to build or update the `iiilinuxibmcom/actions-runner` image, e.g. to get the
latest OS security fixes, use the following commands:
First build the s390x builder image `docker.io/pytorch/manylinuxs390x-builder`
using the following commands:
```
$ cd ~
$ git clone https://github.com/pytorch/pytorch
$ cd pytorch
$ git submodule update --init --recursive
$ GPU_ARCH_TYPE=cpu-s390x "$(pwd)/.ci/docker/manywheel/build.sh" manylinuxs390x-builder
$ docker image tag localhost/pytorch/manylinuxs390x-builder docker.io/pytorch/manylinuxs390x-builder:cpu-s390x
$ docker image save -o ~/manywheel-s390x.tar docker.io/pytorch/manylinuxs390x-builder:cpu-s390x
```
The next step is to build the `actions-runner` image using:
```
$ cd self-hosted-builder
$ sudo docker build \
--build-arg repo=<owner>/<name> \
--build-arg token=<***> \
--pull \
-f actions-runner.Dockerfile \
-t iiilinuxibmcom/actions-runner \
-t iiilinuxibmcom/actions-runner.<name> \
.
```
If it fails, ensure that selinux doesn't prevent it from working.
If there are failures, ensure that selinux doesn't prevent it from working.
In the worst case, selinux can be disabled with `setenforce 0`.
Now prepare all necessary files for runner registration:
```
$ sudo mkdir -p /etc/actions-runner/<name>
$ sudo chmod 700 /etc/actions-runner/<name>
$ sudo /bin/cp <github_app_private_key_file> /etc/actions-runner/<name>/key_private.pem
$ sudo echo <github_app_id> | sudo tee /etc/actions-runner/<name>/appid.env
$ sudo echo <github_app_install_id> | sudo tee /etc/actions-runner/<name>/installid.env
$ sudo echo NAME=<worker_name> | sudo tee /etc/actions-runner/<name>/env
$ sudo echo ORG=<github_org> | sudo tee -a /etc/actions-runner/<name>/env
$ cd self-hosted-builder
$ sudo /bin/cp helpers/*.sh /usr/local/bin/
$ sudo chmod 755 /usr/local/bin/app_token.sh /usr/local/bin/gh_token_generator.sh
```
## Autostart the runner.
```

View File

@ -1,12 +1,12 @@
# Self-Hosted IBM Z Github Actions Runner.
# Temporary image: amd64 dependencies.
FROM docker.io/amd64/ubuntu:22.04 as ld-prefix
FROM docker.io/amd64/ubuntu:23.10 as ld-prefix
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get -y install ca-certificates libicu70 libssl3
RUN apt-get update && apt-get -y install ca-certificates libicu72 libssl3
# Main image.
FROM docker.io/s390x/ubuntu:22.04
FROM docker.io/s390x/ubuntu:23.10
# Packages for pytorch building and testing.
ENV DEBIAN_FRONTEND=noninteractive
@ -16,6 +16,7 @@ RUN apt-get update && apt-get -y install \
gcc \
git \
jq \
zip \
libxml2-dev \
libxslt-dev \
ninja-build \
@ -43,24 +44,28 @@ COPY fs/ /
RUN chmod +x /usr/bin/actions-runner /usr/bin/entrypoint
# install podman
RUN apt -y install podman podman-docker
# amd64 Github Actions Runner.
RUN useradd -m actions-runner
USER actions-runner
WORKDIR /home/actions-runner
RUN curl -L https://github.com/actions/runner/releases/download/v2.309.0/actions-runner-linux-x64-2.309.0.tar.gz | tar -xz
# repository
ARG repo
# set up the python virtual environment which is later used by the runner.
# build workflows use "python -m pip install ...",
# which doesn't work for a non-root user
RUN virtualenv --system-site-packages venv
# repository token
ARG token
# copy prebuilt manywheel docker image for builds and tests
# build command is:
# GPU_ARCH_TYPE=cpu-s390x "$(pwd)/manywheel/build_docker.sh"
# and save command is:
# docker image save -o manywheel-s390x.tar pytorch/manylinuxs390x-builder:cpu-s390x
#
COPY --chown=actions-runner:actions-runner manywheel-s390x.tar /home/actions-runner/manywheel-s390x.tar
RUN ./config.sh \
--unattended \
--url "https://github.com/${repo}" \
--token "${token}" \
--no-default-labels \
--labels self-hosted,linux.s390x
RUN curl -L https://github.com/actions/runner/releases/download/v2.317.0/actions-runner-linux-x64-2.317.0.tar.gz | tar -xz
ENTRYPOINT ["/usr/bin/entrypoint"]
CMD ["/usr/bin/actions-runner"]

View File

@ -8,12 +8,16 @@ StartLimitIntervalSec=0
Type=simple
Restart=always
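# "-" prefix on the ExecStartPre below: ignore failure (there may be no stale container to remove)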
ExecStartPre=-/usr/bin/docker rm --force actions-runner.%i
ExecStartPre=-/usr/local/bin/gh_token_generator.sh /etc/actions-runner/%i/appid.env /etc/actions-runner/%i/installid.env /etc/actions-runner/%i/key_private.pem /etc/actions-runner/%i/ghtoken.env
ExecStart=/usr/bin/docker run \
--env-file=/etc/actions-runner/%i/env \
--env-file=/etc/actions-runner/%i/ghtoken.env \
--init \
--interactive \
--name=actions-runner.%i \
--rm \
iiilinuxibmcom/actions-runner
--privileged \
iiilinuxibmcom/actions-runner.%i
ExecStop=/bin/sh -c "docker exec actions-runner.%i kill -INT -- -1"
ExecStop=/bin/sh -c "docker wait actions-runner.%i"
ExecStop=/bin/sh -c "docker rm actions-runner.%i"

View File

@ -2,5 +2,45 @@
set -e -u
# first import docker image
if [ -f ./manywheel-s390x.tar ] ; then
docker image load --input manywheel-s390x.tar
docker image tag docker.io/pytorch/manylinuxs390x-builder:cpu-s390x docker.io/pytorch/manylinuxs390x-builder:cpu-s390x-main
rm -f manywheel-s390x.tar
fi
token_file=registration-token.json
# Generate registration token
curl \
-X POST \
-H "Accept: application/vnd.github.v3+json" \
-H "Authorization: Bearer ${ACCESS_TOKEN}" \
"https://api.github.com/orgs/${ORG}/actions/runners/registration-token" \
-o "$token_file"
unset ACCESS_TOKEN
# register the runner as an ephemeral runner;
# it does one job, then stops and unregisters
registration_token=$(jq --raw-output .token "$token_file")
./config.sh \
--unattended \
--ephemeral \
--url "https://github.com/${ORG}" \
--token "${registration_token}" \
--name "${NAME}" \
--no-default-labels \
--labels self-hosted,linux.s390x
unset registration_token
rm -f "$token_file"
# enter the python virtual environment.
# build workflows use "python -m pip install ...",
# which doesn't work for a non-root user
source venv/bin/activate
# Run one job.
./run.sh --once
./run.sh

View File

@ -0,0 +1,84 @@
#!/usr/bin/env bash
#
# Request an ACCESS_TOKEN to be used by a GitHub App
# Environment variables that need to be set up:
# * APP_ID, the GitHub app's ID
# * INSTALL_ID, the GitHub app's installation ID
# * APP_PRIVATE_KEY, the content of the GitHub app's private key in PEM format.
#
# https://github.com/orgs/community/discussions/24743#discussioncomment-3245300
#
set -o pipefail
_GITHUB_HOST=${GITHUB_HOST:="github.com"}
# If URL is not github.com then use the enterprise api endpoint
if [[ ${GITHUB_HOST} = "github.com" ]]; then
URI="https://api.${_GITHUB_HOST}"
else
URI="https://${_GITHUB_HOST}/api/v3"
fi
API_VERSION=v3
API_HEADER="Accept: application/vnd.github.${API_VERSION}+json"
CONTENT_LENGTH_HEADER="Content-Length: 0"
APP_INSTALLATIONS_URI="${URI}/app/installations"
# JWT parameters based off
# https://docs.github.com/en/developers/apps/building-github-apps/authenticating-with-github-apps#authenticating-as-a-github-app
#
# JWT token issuance and expiration parameters
JWT_IAT_DRIFT=60
JWT_EXP_DELTA=600
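# Backdating iat by 60s tolerates clock drift between this host and GitHub;
# exp = iat + 600s keeps the JWT within GitHub's 10-minute maximum lifetime.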
JWT_JOSE_HEADER='{
"alg": "RS256",
"typ": "JWT"
}'
build_jwt_payload() {
now=$(date +%s)
iat=$((now - JWT_IAT_DRIFT))
jq -c \
--arg iat_str "${iat}" \
--arg exp_delta_str "${JWT_EXP_DELTA}" \
--arg app_id_str "${APP_ID}" \
'
($iat_str | tonumber) as $iat
| ($exp_delta_str | tonumber) as $exp_delta
| ($app_id_str | tonumber) as $app_id
| .iat = $iat
| .exp = ($iat + $exp_delta)
| .iss = $app_id
' <<< "{}" | tr -d '\n'
}
base64url() {
base64 | tr '+/' '-_' | tr -d '=\n'
}
rs256_sign() {
openssl dgst -binary -sha256 -sign <(echo "$1")
}
request_access_token() {
jwt_payload=$(build_jwt_payload)
encoded_jwt_parts=$(base64url <<<"${JWT_JOSE_HEADER}").$(base64url <<<"${jwt_payload}")
encoded_mac=$(echo -n "$encoded_jwt_parts" | rs256_sign "${APP_PRIVATE_KEY}" | base64url)
generated_jwt="${encoded_jwt_parts}.${encoded_mac}"
auth_header="Authorization: Bearer ${generated_jwt}"
app_installations_response=$(curl -sX POST \
-H "${auth_header}" \
-H "${API_HEADER}" \
--header "X-GitHub-Api-Version: 2022-11-28" \
--url "https://api.github.com/app/installations/${INSTALL_ID}/access_tokens" \
)
echo "$app_installations_response" | jq --raw-output '.token'
}
request_access_token

View File

@ -0,0 +1,10 @@
#!/usr/bin/env bash
SCRIPT_DIR=$(dirname "$0")
APP_ID=$1
INSTALL_ID=$2
APP_PRIVATE_KEY=$3
DST_FILE="$4"
ACCESS_TOKEN="$(APP_ID="$(<"${APP_ID}")" INSTALL_ID="$(<"${INSTALL_ID}")" APP_PRIVATE_KEY="$(<"${APP_PRIVATE_KEY}")" "${SCRIPT_DIR}/app_token.sh")"
echo "ACCESS_TOKEN=${ACCESS_TOKEN}" > "${DST_FILE}"

View File

@ -1,6 +1,6 @@
{%- set upload_artifact_s3_action = "seemethere/upload-artifact-s3@v5" -%}
{%- set download_artifact_s3_action = "seemethere/download-artifact-s3@v4" -%}
{%- set upload_artifact_action = "actions/upload-artifact@v3" -%}
{%- set upload_artifact_action = "actions/upload-artifact@v4.4.0" -%}
{%- set download_artifact_action = "actions/download-artifact@v4.1.7" -%}
{%- set timeout_minutes = 240 -%}
@ -8,7 +8,7 @@
# NOTE: If testing pytorch/builder changes you can change this variable to change what pytorch/builder reference
# the binary builds will check out
{%- set builder_repo = "pytorch/builder" -%}
{%- set builder_branch = "main" -%}
{%- set builder_branch = "release/2.5" -%}
{%- macro concurrency(build_environment) -%}
concurrency:
@ -36,7 +36,7 @@ concurrency:
{%- macro setup_ec2_windows() -%}
!{{ display_ec2_information() }}
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}

View File

@ -142,10 +142,10 @@ jobs:
with:
name: !{{ config["build_name"] }}
path: "${{ runner.temp }}/artifacts/"
!{{ common.checkout(deep_clone=False, directory="pytorch") }}
!{{ common.checkout(deep_clone=False, directory="builder", repository=common.builder_repo, branch=common.builder_branch) }}
!{{ common.checkout(deep_clone=False, directory="pytorch", checkout_pr_head=False) }}
!{{ common.checkout(deep_clone=False, directory="builder", repository=common.builder_repo, branch=common.builder_branch, checkout_pr_head=False) }}
- name: Pull Docker image
uses: pytorch/test-infra/.github/actions/pull-docker-image@main
uses: pytorch/test-infra/.github/actions/pull-docker-image@release/2.5
with:
docker-image: !{{ config["container_image"] }}
- name: Test Pytorch binary
@ -164,13 +164,13 @@ jobs:
with:
name: !{{ config["build_name"] }}
path: "${{ runner.temp }}/artifacts/"
!{{ common.checkout(deep_clone=False, directory="pytorch") }}
!{{ common.checkout(deep_clone=False, directory="builder", repository=common.builder_repo, branch=common.builder_branch) }}
!{{ common.checkout(deep_clone=False, directory="pytorch", checkout_pr_head=False) }}
!{{ common.checkout(deep_clone=False, directory="builder", repository=common.builder_repo, branch=common.builder_branch, checkout_pr_head=False) }}
- name: ROCm set GPU_FLAG
run: |
echo "GPU_FLAG=--device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon" >> "${GITHUB_ENV}"
- name: Pull Docker image
uses: pytorch/test-infra/.github/actions/pull-docker-image@main
uses: pytorch/test-infra/.github/actions/pull-docker-image@release/2.5
with:
docker-image: !{{ config["container_image"] }}
- name: Test Pytorch binary

View File

@ -78,8 +78,8 @@ jobs:
elif [ -d "/Applications/Xcode_13.3.1.app" ]; then
echo "DEVELOPER_DIR=/Applications/Xcode_13.3.1.app/Contents/Developer" >> "${GITHUB_ENV}"
fi
!{{ common.checkout(deep_clone=False, directory="pytorch") }}
!{{ common.checkout(deep_clone=False, directory="builder", repository=common.builder_repo, branch=common.builder_branch) }}
!{{ common.checkout(deep_clone=False, directory="pytorch", checkout_pr_head=False) }}
!{{ common.checkout(deep_clone=False, directory="builder", repository=common.builder_repo, branch=common.builder_branch, checkout_pr_head=False) }}
- name: Install sccache (only for non-forked PRs, and pushes to trunk)
uses: nick-fields/retry@v3.0.0
if: ${{ github.event_name == 'push' || github.event.pull_request.head.repo.full_name == github.repository }}
@ -101,7 +101,7 @@ jobs:
# shellcheck disable=SC1091
source "${RUNNER_TEMP}/anaconda/bin/activate"
"${PYTORCH_ROOT}/.circleci/scripts/binary_macos_build.sh"
- uses: actions/upload-artifact@v3
- uses: actions/upload-artifact@v4.4.0
if: always()
with:
name: !{{ config["build_name"] }}

View File

@ -45,7 +45,7 @@
{%- if is_windows %}
# This is a dummy value for libtorch to work correctly with our batch scripts
# without this value pip does not get installed for some reason
DESIRED_PYTHON: "3.8"
DESIRED_PYTHON: "3.9"
{%- endif %}
{%- else %}

View File

@ -79,8 +79,8 @@ jobs:
steps:
!{{ common.setup_ec2_windows() }}
!{{ set_runner_specific_vars() }}
!{{ common.checkout(deep_clone=False, directory="pytorch") }}
!{{ common.checkout(deep_clone=False, directory="builder", repository=common.builder_repo, branch=common.builder_branch) }}
!{{ common.checkout(deep_clone=False, directory="pytorch", checkout_pr_head=False) }}
!{{ common.checkout(deep_clone=False, directory="builder", repository=common.builder_repo, branch=common.builder_branch, checkout_pr_head=False) }}
- name: Populate binary env
shell: bash
run: |
@ -121,8 +121,8 @@ jobs:
with:
name: !{{ config["build_name"] }}
path: "${{ env.PYTORCH_FINAL_PACKAGE_DIR }}"
!{{ common.checkout(deep_clone=False, directory="pytorch") }}
!{{ common.checkout(deep_clone=False, directory="builder", repository=common.builder_repo, branch=common.builder_branch) }}
!{{ common.checkout(deep_clone=False, directory="pytorch", checkout_pr_head=False) }}
!{{ common.checkout(deep_clone=False, directory="builder", repository=common.builder_repo, branch=common.builder_branch, checkout_pr_head=False) }}
- name: Populate binary env
shell: bash
run: |

View File

@ -37,7 +37,7 @@ jobs:
keep-going: ${{ steps.filter.outputs.keep-going }}
steps:
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
with:
fetch-depth: 1
submodules: false
@ -59,25 +59,25 @@ jobs:
runs-on: ${{ matrix.runner }}
steps:
- name: Setup SSH (Click me for login details)
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
# [see note: pytorch repo ref]
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
- name: Setup Linux
uses: ./.github/actions/setup-linux
- name: Calculate docker image
id: calculate-docker-image
uses: pytorch/test-infra/.github/actions/calculate-docker-image@main
uses: pytorch/test-infra/.github/actions/calculate-docker-image@release/2.5
with:
docker-image-name: ${{ inputs.docker-image-name }}
- name: Pull docker image
uses: pytorch/test-infra/.github/actions/pull-docker-image@main
uses: pytorch/test-infra/.github/actions/pull-docker-image@release/2.5
with:
docker-image: ${{ steps.calculate-docker-image.outputs.docker-image }}
@ -141,5 +141,5 @@ jobs:
if: always()
- name: Teardown Linux
uses: pytorch/test-infra/.github/actions/teardown-linux@main
uses: pytorch/test-infra/.github/actions/teardown-linux@release/2.5
if: always()

View File

@ -37,7 +37,7 @@ jobs:
keep-going: ${{ steps.filter.outputs.keep-going }}
steps:
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
with:
fetch-depth: 1
submodules: false
@ -59,25 +59,25 @@ jobs:
runs-on: ${{ matrix.runner }}
steps:
- name: Setup SSH (Click me for login details)
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
# [see note: pytorch repo ref]
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
- name: Setup Linux
uses: ./.github/actions/setup-linux
- name: Calculate docker image
id: calculate-docker-image
uses: pytorch/test-infra/.github/actions/calculate-docker-image@main
uses: pytorch/test-infra/.github/actions/calculate-docker-image@release/2.5
with:
docker-image-name: ${{ inputs.docker-image-name }}
- name: Pull docker image
uses: pytorch/test-infra/.github/actions/pull-docker-image@main
uses: pytorch/test-infra/.github/actions/pull-docker-image@release/2.5
with:
docker-image: ${{ steps.calculate-docker-image.outputs.docker-image }}
@ -186,5 +186,5 @@ jobs:
if: always()
- name: Teardown Linux
uses: pytorch/test-infra/.github/actions/teardown-linux@main
uses: pytorch/test-infra/.github/actions/teardown-linux@release/2.5
if: always()

View File

@ -47,7 +47,7 @@ jobs:
reenabled-issues: ${{ steps.filter.outputs.reenabled-issues }}
steps:
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
with:
fetch-depth: 1
submodules: false
@ -69,25 +69,25 @@ jobs:
runs-on: ${{ matrix.runner }}
steps:
- name: Setup SSH (Click me for login details)
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
# [see note: pytorch repo ref]
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
- name: Setup Linux
uses: ./.github/actions/setup-linux
- name: Calculate docker image
id: calculate-docker-image
uses: pytorch/test-infra/.github/actions/calculate-docker-image@main
uses: pytorch/test-infra/.github/actions/calculate-docker-image@release/2.5
with:
docker-image-name: ${{ inputs.docker-image-name }}
- name: Pull docker image
uses: pytorch/test-infra/.github/actions/pull-docker-image@main
uses: pytorch/test-infra/.github/actions/pull-docker-image@release/2.5
with:
docker-image: ${{ steps.calculate-docker-image.outputs.docker-image }}
@ -97,7 +97,7 @@ jobs:
run: echo "IN_ARC_RUNNER=$([ -f /.inarc ] && echo true || echo false)" >> "$GITHUB_OUTPUT"
- name: Install nvidia driver, nvidia-docker runtime, set GPU_FLAG
uses: pytorch/test-infra/.github/actions/setup-nvidia@main
uses: pytorch/test-infra/.github/actions/setup-nvidia@release/2.5
if: ${{ inputs.cuda-version != 'cpu' && steps.check_arc_runner.outputs.IN_ARC_RUNNER == 'false' }}
- name: Output disk space left
@ -206,5 +206,5 @@ jobs:
file-suffix: bazel-${{ github.job }}_${{ steps.get-job-id.outputs.job-id }}
- name: Teardown Linux
uses: pytorch/test-infra/.github/actions/teardown-linux@main
uses: pytorch/test-infra/.github/actions/teardown-linux@release/2.5
if: always()

View File

@ -159,13 +159,13 @@ jobs:
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
if: inputs.build_environment != 'linux-s390x-binary-manywheel'
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.github-token }}
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
with:
no-sudo: ${{ inputs.build_environment == 'linux-aarch64-binary-manywheel' || inputs.build_environment == 'linux-s390x-binary-manywheel' }}
@ -195,7 +195,6 @@ jobs:
- name: Checkout PyTorch to pytorch dir
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -209,7 +208,7 @@ jobs:
- name: Checkout pytorch/builder to builder dir
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -235,7 +234,7 @@ jobs:
- name: Pull Docker image
if: ${{ steps.filter.outputs.is-test-matrix-empty == 'False' && inputs.build_environment != 'linux-s390x-binary-manywheel' }}
uses: pytorch/test-infra/.github/actions/pull-docker-image@main
uses: pytorch/test-infra/.github/actions/pull-docker-image@release/2.5
with:
docker-image: ${{ inputs.DOCKER_IMAGE }}
@ -283,7 +282,7 @@ jobs:
# Ensure the working directory gets chowned back to the current user
docker run --rm -v "${RUNNER_TEMP}/artifacts:/v" -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
- uses: actions/upload-artifact@v3
- uses: actions/upload-artifact@v4.4.0
if: ${{ steps.filter.outputs.is-test-matrix-empty == 'False' }}
with:
name: ${{ inputs.build_name }}
@ -293,7 +292,7 @@ jobs:
- name: Teardown Linux
if: always() && inputs.build_environment != 'linux-s390x-binary-manywheel'
uses: pytorch/test-infra/.github/actions/teardown-linux@main
uses: pytorch/test-infra/.github/actions/teardown-linux@release/2.5
- name: Chown workspace
if: always() && inputs.build_environment != 'linux-s390x-binary-manywheel'

View File

@ -142,14 +142,14 @@ jobs:
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
if: inputs.build_environment != 'linux-s390x-binary-manywheel'
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.github-token }}
# Setup the environment
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
with:
no-sudo: ${{ inputs.build_environment == 'linux-aarch64-binary-manywheel' || inputs.build_environment == 'linux-s390x-binary-manywheel' }}
@ -172,7 +172,6 @@ jobs:
- name: Checkout PyTorch to pytorch dir
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
@ -185,7 +184,7 @@ jobs:
- name: Checkout pytorch/builder to builder dir
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -216,12 +215,12 @@ jobs:
path: "${{ runner.temp }}/artifacts/"
- name: Install nvidia driver, nvidia-docker runtime, set GPU_FLAG
uses: pytorch/test-infra/.github/actions/setup-nvidia@main
uses: pytorch/test-infra/.github/actions/setup-nvidia@release/2.5
if: ${{ inputs.GPU_ARCH_TYPE == 'cuda' && steps.filter.outputs.is-test-matrix-empty == 'False' }}
- name: Pull Docker image
if: ${{ steps.filter.outputs.is-test-matrix-empty == 'False' && inputs.build_environment != 'linux-s390x-binary-manywheel' }}
uses: pytorch/test-infra/.github/actions/pull-docker-image@main
uses: pytorch/test-infra/.github/actions/pull-docker-image@release/2.5
with:
docker-image: ${{ inputs.DOCKER_IMAGE }}
@ -231,7 +230,7 @@ jobs:
- name: Teardown Linux
if: always() && inputs.build_environment != 'linux-s390x-binary-manywheel'
uses: pytorch/test-infra/.github/actions/teardown-linux@main
uses: pytorch/test-infra/.github/actions/teardown-linux@release/2.5
- name: Chown workspace
if: always() && inputs.build_environment != 'linux-s390x-binary-manywheel'

View File

@ -103,7 +103,7 @@ jobs:
USE_SPLIT_BUILD: ${{ inputs.use_split_build }}
steps:
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
with:
no-sudo: true

View File

@ -28,7 +28,7 @@ jobs:
keep-going: ${{ steps.filter.outputs.keep-going }}
steps:
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
with:
fetch-depth: 1
submodules: false
@ -49,7 +49,7 @@ jobs:
runs-on: ${{ matrix.runner }}
steps:
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
- name: Set up JDK 8
uses: actions/setup-java@v3
@ -58,7 +58,7 @@ jobs:
distribution: 'temurin'
- name: Setup miniconda
uses: pytorch/test-infra/.github/actions/setup-miniconda@main
uses: pytorch/test-infra/.github/actions/setup-miniconda@release/2.5
with:
python-version: 3.8
environment-file: .github/requirements/conda-env-${{ runner.os }}-${{ runner.arch }}

View File

@ -84,7 +84,7 @@ jobs:
name: build-docs-${{ matrix.docs_type }}-${{ inputs.push }}
steps:
- name: Setup SSH (Click me for login details)
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
instructions: |
@ -95,7 +95,7 @@ jobs:
# [see note: pytorch repo ref]
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
- name: Setup Linux
uses: ./.github/actions/setup-linux
@ -110,12 +110,12 @@ jobs:
- name: Calculate docker image
id: calculate-docker-image
uses: pytorch/test-infra/.github/actions/calculate-docker-image@main
uses: pytorch/test-infra/.github/actions/calculate-docker-image@release/2.5
with:
docker-image-name: ${{ inputs.docker-image }}
- name: Pull docker image
uses: pytorch/test-infra/.github/actions/pull-docker-image@main
uses: pytorch/test-infra/.github/actions/pull-docker-image@release/2.5
with:
docker-image: ${{ steps.calculate-docker-image.outputs.docker-image }}
@ -222,5 +222,5 @@ jobs:
s3-prefix: pytorch/pytorch/${{ github.event.pull_request.number }}/functorchdocs
- name: Teardown Linux
uses: pytorch/test-infra/.github/actions/teardown-linux@main
uses: pytorch/test-infra/.github/actions/teardown-linux@release/2.5
if: always()

View File

@ -46,7 +46,7 @@ jobs:
keep-going: ${{ steps.filter.outputs.keep-going }}
steps:
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
with:
fetch-depth: 1
submodules: false
@ -80,7 +80,7 @@ jobs:
steps:
# [see note: pytorch repo ref]
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
- name: Populate CI build options
shell: bash
@ -102,7 +102,7 @@ jobs:
brew install libtool
- name: Setup miniconda for iOS
uses: pytorch/test-infra/.github/actions/setup-miniconda@main
uses: pytorch/test-infra/.github/actions/setup-miniconda@release/2.5
with:
python-version: "3.9"
environment-file: .github/requirements/conda-env-iOS.txt

View File

@ -108,7 +108,7 @@ jobs:
test-matrix: ${{ steps.filter.outputs.test-matrix }}
steps:
- name: Setup SSH (Click me for login details)
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -117,7 +117,7 @@ jobs:
# checkout because when we run this action we don't *have* a local
# checkout. In other cases you should prefer a local checkout.
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
- name: Setup Linux
uses: ./.github/actions/setup-linux
@ -132,7 +132,7 @@ jobs:
- name: Calculate docker image
id: calculate-docker-image
uses: pytorch/test-infra/.github/actions/calculate-docker-image@main
uses: pytorch/test-infra/.github/actions/calculate-docker-image@release/2.5
with:
docker-image-name: ${{ inputs.docker-image-name }}
@ -146,7 +146,7 @@ jobs:
echo "docker pull ghcr.io/pytorch/ci-image:${tag/:/-}"
- name: Pull docker image
uses: pytorch/test-infra/.github/actions/pull-docker-image@main
uses: pytorch/test-infra/.github/actions/pull-docker-image@release/2.5
with:
docker-image: ${{ steps.calculate-docker-image.outputs.docker-image }}
@ -269,5 +269,5 @@ jobs:
s3-bucket: ${{ inputs.s3-bucket }}
- name: Teardown Linux
uses: pytorch/test-infra/.github/actions/teardown-linux@main
uses: pytorch/test-infra/.github/actions/teardown-linux@release/2.5
if: always()

View File

@ -72,7 +72,7 @@ jobs:
timeout-minutes: ${{ matrix.mem_leak_check == 'mem_leak_check' && 600 || inputs.timeout-minutes }}
steps:
- name: Setup SSH (Click me for login details)
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
if: ${{ !contains(matrix.runner, 'gcp.a100') }}
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -81,7 +81,7 @@ jobs:
docker exec -it $(docker container ps --format '{{.ID}}') bash
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
- name: Setup Linux
uses: ./.github/actions/setup-linux
@ -96,7 +96,7 @@ jobs:
- name: Calculate docker image
id: calculate-docker-image
uses: pytorch/test-infra/.github/actions/calculate-docker-image@main
uses: pytorch/test-infra/.github/actions/calculate-docker-image@release/2.5
with:
docker-image-name: ${{ inputs.docker-image }}
@ -110,7 +110,7 @@ jobs:
echo "docker pull ghcr.io/pytorch/ci-image:${tag/:/-}"
- name: Pull docker image
uses: pytorch/test-infra/.github/actions/pull-docker-image@main
uses: pytorch/test-infra/.github/actions/pull-docker-image@release/2.5
with:
docker-image: ${{ steps.calculate-docker-image.outputs.docker-image }}
@ -121,7 +121,7 @@ jobs:
- name: Install nvidia driver, nvidia-docker runtime, set GPU_FLAG
id: install-nvidia-driver
uses: pytorch/test-infra/.github/actions/setup-nvidia@main
uses: pytorch/test-infra/.github/actions/setup-nvidia@release/2.5
if: ${{ contains(inputs.build-environment, 'cuda') && !contains(matrix.config, 'nogpu') && steps.check_arc_runner.outputs.IN_ARC_RUNNER == 'false' }}
- name: Lock NVIDIA A100 40GB Frequency
@ -342,7 +342,7 @@ jobs:
path: ./**/core.[1-9]*
- name: Teardown Linux
uses: pytorch/test-infra/.github/actions/teardown-linux@main
uses: pytorch/test-infra/.github/actions/teardown-linux@release/2.5
if: always()
# NB: We are currently having an intermittent GPU-related issue on G5 runners with

View File

@ -71,11 +71,11 @@ jobs:
test-matrix: ${{ steps.filter.outputs.test-matrix }}
steps:
- name: Clean up disk space before running MacOS workflow
uses: pytorch/test-infra/.github/actions/check-disk-space@main
uses: pytorch/test-infra/.github/actions/check-disk-space@release/2.5
# [see note: pytorch repo ref]
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
- name: Set xcode version
env:
@ -87,7 +87,7 @@ jobs:
- name: Setup miniconda
if: inputs.environment-file == ''
uses: pytorch/test-infra/.github/actions/setup-miniconda@main
uses: pytorch/test-infra/.github/actions/setup-miniconda@release/2.5
with:
python-version: ${{ inputs.python-version }}
environment-file: .github/requirements/conda-env-${{ runner.os }}-${{ runner.arch }}
@ -97,7 +97,7 @@ jobs:
# environment even though the arch is x86-64
- name: Setup miniconda using the provided environment file
if: inputs.environment-file != ''
uses: pytorch/test-infra/.github/actions/setup-miniconda@main
uses: pytorch/test-infra/.github/actions/setup-miniconda@release/2.5
with:
python-version: ${{ inputs.python-version }}
environment-file: ${{ inputs.environment-file }}
@ -207,4 +207,4 @@ jobs:
- name: Clean up disk space
if: always()
continue-on-error: true
uses: pytorch/test-infra/.github/actions/check-disk-space@main
uses: pytorch/test-infra/.github/actions/check-disk-space@release/2.5

View File

@ -41,7 +41,7 @@ jobs:
reenabled-issues: ${{ steps.filter.outputs.reenabled-issues }}
steps:
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
with:
submodules: false
@ -82,7 +82,7 @@ jobs:
use-gha: true
- name: Setup miniconda
uses: pytorch/test-infra/.github/actions/setup-miniconda@main
uses: pytorch/test-infra/.github/actions/setup-miniconda@release/2.5
with:
python-version: ${{ inputs.python-version }}
environment-file: .github/requirements/conda-env-${{ runner.os }}-${{ runner.arch }}
@ -161,4 +161,4 @@ jobs:
- name: Clean up disk space
if: always()
continue-on-error: true
uses: pytorch/test-infra/.github/actions/check-disk-space@main
uses: pytorch/test-infra/.github/actions/check-disk-space@release/2.5

View File

@ -74,11 +74,11 @@ jobs:
done
- name: Clean up disk space before running MacOS workflow
uses: pytorch/test-infra/.github/actions/check-disk-space@main
uses: pytorch/test-infra/.github/actions/check-disk-space@release/2.5
# [see note: pytorch repo ref]
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
- name: Download build artifacts
uses: ./.github/actions/download-build-artifacts
@ -93,7 +93,7 @@ jobs:
use-gha: true
- name: Setup miniconda
uses: pytorch/test-infra/.github/actions/setup-miniconda@main
uses: pytorch/test-infra/.github/actions/setup-miniconda@release/2.5
with:
python-version: ${{ inputs.python-version }}
environment-file: .github/requirements/conda-env-${{ runner.os }}-${{ runner.arch }}
@ -217,4 +217,4 @@ jobs:
- name: Clean up disk space
if: always()
continue-on-error: true
uses: pytorch/test-infra/.github/actions/check-disk-space@main
uses: pytorch/test-infra/.github/actions/check-disk-space@release/2.5

View File

@ -58,7 +58,7 @@ jobs:
steps:
# [see note: pytorch repo ref]
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
with:
no-sudo: true
@ -80,12 +80,12 @@ jobs:
- name: Calculate docker image
id: calculate-docker-image
uses: pytorch/test-infra/.github/actions/calculate-docker-image@main
uses: pytorch/test-infra/.github/actions/calculate-docker-image@release/2.5
with:
docker-image-name: ${{ inputs.docker-image }}
- name: Pull docker image
uses: pytorch/test-infra/.github/actions/pull-docker-image@main
uses: pytorch/test-infra/.github/actions/pull-docker-image@release/2.5
with:
docker-image: ${{ steps.calculate-docker-image.outputs.docker-image }}

View File

@ -28,7 +28,7 @@ jobs:
keep-going: ${{ steps.filter.outputs.keep-going }}
steps:
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
with:
fetch-depth: 1
submodules: false
@ -59,10 +59,10 @@ jobs:
SUPPORT_ABI: '${{ matrix.support_abi }}'
steps:
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
- name: Setup miniconda
uses: pytorch/test-infra/.github/actions/setup-miniconda@main
uses: pytorch/test-infra/.github/actions/setup-miniconda@release/2.5
with:
python-version: 3.8
environment-file: .github/requirements/conda-env-${{ runner.os }}-${{ runner.arch }}.txt

View File

@ -45,7 +45,7 @@ jobs:
ISSUE_OWNER: ${{ inputs.issue_owner }}
steps:
# - name: Checkout PyTorch
# uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
# uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
# with:
# fetch-depth: 1
# submodules: true

View File

@ -11,6 +11,16 @@ on:
required: true
type: string
description: What CUDA version to build with, "cpu" for none.
use-xpu:
required: false
type: boolean
default: false
description: If set, build with XPU support.
vc-year:
required: false
type: string
default: "2019"
description: The Visual Studio year to use for building.
build-with-debug:
required: false
type: boolean
@ -69,10 +79,10 @@ jobs:
git config --global core.fsmonitor false
- name: Clean up leftover processes on non-ephemeral Windows runner
uses: pytorch/test-infra/.github/actions/cleanup-runner@main
uses: pytorch/test-infra/.github/actions/cleanup-runner@release/2.5
- name: Setup SSH (Click me for login details)
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
instructions: |
@ -87,7 +97,7 @@ jobs:
# [see note: pytorch repo ref]
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
with:
no-sudo: true
@ -141,7 +151,7 @@ jobs:
SCCACHE_REGION: us-east-1
VC_PRODUCT: "BuildTools"
VC_VERSION: ""
VC_YEAR: "2019"
VC_YEAR: "${{ inputs.vc-year }}"
ALPINE_IMAGE: "308535385114.dkr.ecr.us-east-1.amazonaws.com/tool/alpine"
AWS_DEFAULT_REGION: us-east-1
PR_NUMBER: ${{ github.event.pull_request.number }}
@ -149,6 +159,7 @@ jobs:
DEBUG: ${{ inputs.build-with-debug && '1' || '0' }}
TORCH_CUDA_ARCH_LIST: "8.6"
USE_CUDA: ${{ inputs.cuda-version != 'cpu' && '1' || '0' }}
USE_XPU: ${{ inputs.use-xpu == true && '1' || '0' }}
OUR_GITHUB_JOB_ID: ${{ steps.get-job-id.outputs.job-id }}
run: |
.ci/pytorch/win-build.sh

View File

@ -57,10 +57,10 @@ jobs:
git config --global core.fsmonitor false
- name: Clean up leftover processes on non-ephemeral Windows runner
uses: pytorch/test-infra/.github/actions/cleanup-runner@main
uses: pytorch/test-infra/.github/actions/cleanup-runner@release/2.5
- name: Setup SSH (Click me for login details)
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
instructions: |
@ -76,7 +76,7 @@ jobs:
# [see note: pytorch repo ref]
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
with:
no-sudo: true

View File

@ -54,7 +54,7 @@ jobs:
steps:
# [see note: pytorch repo ref]
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
- name: Setup XPU
uses: ./.github/actions/setup-xpu
@ -72,12 +72,12 @@ jobs:
- name: Calculate docker image
id: calculate-docker-image
uses: pytorch/test-infra/.github/actions/calculate-docker-image@main
uses: pytorch/test-infra/.github/actions/calculate-docker-image@release/2.5
with:
docker-image-name: ${{ inputs.docker-image }}
- name: Pull docker image
uses: pytorch/test-infra/.github/actions/pull-docker-image@main
uses: pytorch/test-infra/.github/actions/pull-docker-image@release/2.5
with:
docker-image: ${{ steps.calculate-docker-image.outputs.docker-image }}

View File

@ -40,12 +40,12 @@ jobs:
CUDA_VERSION: ${{ matrix.cuda_version }}
steps:
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
with:
submodules: false
- name: Calculate docker image
if: env.WITH_PUSH == 'false'
uses: pytorch/test-infra/.github/actions/calculate-docker-image@main
uses: pytorch/test-infra/.github/actions/calculate-docker-image@release/2.5
with:
docker-image-name: conda-builder${{ matrix.cuda_version == 'cpu' && '-' || '-cuda' }}${{matrix.cuda_version}}
docker-build-dir: .ci/docker/conda

View File

@ -40,12 +40,12 @@ jobs:
GPU_ARCH_VERSION: ${{ matrix.cuda_version }}
steps:
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
with:
submodules: false
- name: Calculate docker image
if: env.WITH_PUSH == 'false'
uses: pytorch/test-infra/.github/actions/calculate-docker-image@main
uses: pytorch/test-infra/.github/actions/calculate-docker-image@release/2.5
with:
docker-image-name: libtorch-cxx11-builder-cuda${{matrix.cuda_version}}
docker-build-dir: .ci/docker/libtorch
@ -75,12 +75,12 @@ jobs:
GPU_ARCH_VERSION: ${{ matrix.rocm_version }}
steps:
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
with:
submodules: false
- name: Calculate docker image
if: env.WITH_PUSH == 'false'
uses: pytorch/test-infra/.github/actions/calculate-docker-image@main
uses: pytorch/test-infra/.github/actions/calculate-docker-image@release/2.5
with:
docker-image-name: libtorch-cxx11-builder-rocm${{matrix.rocm_version}}
docker-build-dir: .ci/docker/libtorch
@ -104,12 +104,12 @@ jobs:
runs-on: linux.9xlarge.ephemeral
steps:
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
with:
submodules: false
- name: Calculate docker image
if: env.WITH_PUSH == 'false'
uses: pytorch/test-infra/.github/actions/calculate-docker-image@main
uses: pytorch/test-infra/.github/actions/calculate-docker-image@release/2.5
with:
docker-image-name: libtorch-cxx11-builder-cpu
docker-build-dir: .ci/docker/libtorch


@ -27,6 +27,7 @@ env:
DOCKER_REGISTRY: "docker.io"
DOCKER_BUILDKIT: 1
WITH_PUSH: ${{ github.event_name == 'push' && (github.ref == 'refs/heads/main' || startsWith(github.ref, 'refs/heads/release')) }}
WITH_PUSH_ROCM: ${{ github.event_name == 'push' && startsWith(github.event.ref, 'refs/tags/v') }}
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.sha }}-${{ github.event_name == 'workflow_dispatch' }}
@ -46,12 +47,12 @@ jobs:
- name: Purge tools folder (free space for build)
run: rm -rf /opt/hostedtoolcache
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
with:
submodules: false
- name: Calculate docker image
if: env.WITH_PUSH == 'false'
uses: pytorch/test-infra/.github/actions/calculate-docker-image@main
uses: pytorch/test-infra/.github/actions/calculate-docker-image@release/2.5
with:
docker-image-name: manylinux-builder-cuda${{matrix.cuda_version}}
docker-build-dir: .ci/docker/manywheel
@ -84,12 +85,12 @@ jobs:
- name: Purge tools folder (free space for build)
run: rm -rf /opt/hostedtoolcache
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
with:
submodules: false
- name: Calculate docker image
if: env.WITH_PUSH == 'false'
uses: pytorch/test-infra/.github/actions/calculate-docker-image@main
uses: pytorch/test-infra/.github/actions/calculate-docker-image@release/2.5
with:
docker-image-name: manylinux2_28-builder-cuda${{matrix.cuda_version}}
docker-build-dir: .ci/docker/manywheel
@ -122,7 +123,7 @@ jobs:
uses: actions/checkout@v3
- name: Calculate docker image
if: env.WITH_PUSH == 'false'
uses: pytorch/test-infra/.github/actions/calculate-docker-image@main
uses: pytorch/test-infra/.github/actions/calculate-docker-image@release/2.5
with:
docker-image-name: manylinuxaarch64-builder-cuda${{matrix.cuda_version}}
docker-build-dir: .ci/docker/manywheel
@ -152,41 +153,42 @@ jobs:
GPU_ARCH_VERSION: ${{ matrix.rocm_version }}
steps:
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
with:
submodules: false
- name: Calculate docker image
if: env.WITH_PUSH == 'false'
uses: pytorch/test-infra/.github/actions/calculate-docker-image@main
if: env.WITH_PUSH_ROCM == 'false'
uses: pytorch/test-infra/.github/actions/calculate-docker-image@release/2.5
with:
docker-image-name: manylinux-builder-rocm${{matrix.rocm_version}}
docker-build-dir: .ci/docker/manywheel
always-rebuild: true
push: true
- name: Authenticate if WITH_PUSH
if: env.WITH_PUSH == 'true'
- name: Authenticate if WITH_PUSH_ROCM
if: env.WITH_PUSH_ROCM == 'true'
env:
DOCKER_TOKEN: ${{ secrets.DOCKER_TOKEN }}
DOCKER_ID: ${{ secrets.DOCKER_ID }}
run: |
if [[ "${WITH_PUSH}" == true ]]; then
if [[ "${WITH_PUSH_ROCM}" == true ]]; then
echo "${DOCKER_TOKEN}" | docker login -u "${DOCKER_ID}" --password-stdin
fi
- name: Build Docker Image
if: env.WITH_PUSH == 'true'
if: env.WITH_PUSH_ROCM == 'true'
run: |
export WITH_PUSH=true
.ci/docker/manywheel/build.sh manylinux-builder:rocm${{matrix.rocm_version}}
build-docker-cpu:
environment: ${{ (github.ref == 'refs/heads/main' || startsWith(github.event.ref, 'refs/tags/v')) && 'docker-build' || '' }}
runs-on: am2.linux.9xlarge.ephemeral
steps:
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
with:
submodules: false
- name: Calculate docker image
if: env.WITH_PUSH == 'false'
uses: pytorch/test-infra/.github/actions/calculate-docker-image@main
uses: pytorch/test-infra/.github/actions/calculate-docker-image@release/2.5
with:
docker-image-name: manylinux-builder-cpu
docker-build-dir: .ci/docker/manywheel
@ -212,12 +214,12 @@ jobs:
GPU_ARCH_TYPE: cpu-manylinux_2_28
steps:
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
with:
submodules: false
- name: Calculate docker image
if: env.WITH_PUSH == 'false'
uses: pytorch/test-infra/.github/actions/calculate-docker-image@main
uses: pytorch/test-infra/.github/actions/calculate-docker-image@release/2.5
with:
docker-image-name: manylinux2_28-builder-cpu
docker-build-dir: .ci/docker/manywheel
@ -243,12 +245,12 @@ jobs:
GPU_ARCH_TYPE: cpu-aarch64
steps:
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
with:
submodules: false
- name: Calculate docker image
if: env.WITH_PUSH == 'false'
uses: pytorch/test-infra/.github/actions/calculate-docker-image@main
uses: pytorch/test-infra/.github/actions/calculate-docker-image@release/2.5
with:
docker-image-name: manylinuxaarch64-builder-cpu-aarch64
docker-build-dir: .ci/docker/manywheel
@ -274,12 +276,12 @@ jobs:
GPU_ARCH_TYPE: cpu-aarch64-2_28
steps:
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
with:
submodules: false
- name: Calculate docker image
if: env.WITH_PUSH == 'false'
uses: pytorch/test-infra/.github/actions/calculate-docker-image@main
uses: pytorch/test-infra/.github/actions/calculate-docker-image@release/2.5
with:
docker-image-name: manylinux2_28_aarch64-builder-cpu-aarch64
docker-build-dir: .ci/docker/manywheel
@ -308,12 +310,12 @@ jobs:
GPU_ARCH_TYPE: cpu-cxx11-abi
steps:
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
with:
submodules: false
- name: Calculate docker image
if: env.WITH_PUSH == 'false'
uses: pytorch/test-infra/.github/actions/calculate-docker-image@main
uses: pytorch/test-infra/.github/actions/calculate-docker-image@release/2.5
with:
docker-image-name: manylinuxcxx11-abi-builder-cpu-cxx11-abi
docker-build-dir: .ci/docker/manywheel
@ -339,12 +341,12 @@ jobs:
GPU_ARCH_TYPE: xpu
steps:
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
with:
submodules: false
- name: Calculate docker image
if: env.WITH_PUSH == 'false'
uses: pytorch/test-infra/.github/actions/calculate-docker-image@main
uses: pytorch/test-infra/.github/actions/calculate-docker-image@release/2.5
with:
docker-image-name: manylinux2_28-builder-xpu
docker-build-dir: .ci/docker/manywheel
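
The docker-build workflow above adds a second gate, `WITH_PUSH_ROCM`, next to the existing `WITH_PUSH`: the original fires on pushes to `main` or `release/*` branches, while the new one fires only on version-tag pushes, and the ROCm image job is switched over to it. The two expressions, restated from the diff:

```yaml
# WITH_PUSH keys off branch pushes; WITH_PUSH_ROCM keys off
# refs/tags/v* pushes only.
env:
  WITH_PUSH: ${{ github.event_name == 'push' && (github.ref == 'refs/heads/main' || startsWith(github.ref, 'refs/heads/release')) }}
  WITH_PUSH_ROCM: ${{ github.event_name == 'push' && startsWith(github.event.ref, 'refs/tags/v') }}
```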


@ -1,322 +0,0 @@
name: Build Triton wheels
on:
push:
branches:
- main
tags:
# NOTE: Binary build pipelines should only get triggered on release candidate builds
# Release candidate tags look like: v1.11.0-rc1
- v[0-9]+.[0-9]+.[0-9]+-rc[0-9]+
paths:
- .github/workflows/build-triton-wheel.yml
- .github/scripts/build_triton_wheel.py
- .github/ci_commit_pins/triton.txt
- .ci/docker/ci_commit_pins/triton.txt
- .ci/docker/ci_commit_pins/triton-rocm.txt
- .ci/docker/ci_commit_pins/triton-xpu.txt
pull_request:
paths:
- .github/workflows/build-triton-wheel.yml
- .github/scripts/build_triton_wheel.py
- .github/ci_commit_pins/triton.txt
- .ci/docker/ci_commit_pins/triton.txt
- .ci/docker/ci_commit_pins/triton-rocm.txt
- .ci/docker/ci_commit_pins/triton-xpu.txt
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.sha }}-${{ github.event_name == 'workflow_dispatch' }}
cancel-in-progress: true
jobs:
build-wheel:
name: "Build Triton Wheel"
runs-on: [self-hosted, linux.2xlarge]
strategy:
fail-fast: false
matrix:
py_vers: [ "3.8", "3.9", "3.10", "3.11", "3.12" ]
device: ["cuda", "rocm", "xpu"]
include:
- device: "rocm"
rocm_version: "6.2"
- device: "cuda"
rocm_version: ""
timeout-minutes: 40
env:
DOCKER_IMAGE: ${{ matrix.device == 'rocm' && format('pytorch/manylinux-builder:rocm{0}', matrix.rocm_version) || 'pytorch/manylinux-builder:cpu' }}
PY_VERS: ${{ matrix.py_vers }}
BUILD_DEVICE: ${{ matrix.device }}
steps:
- name: Setup SSH (Click me for login details)
uses: pytorch/test-infra/.github/actions/setup-ssh@main
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
with:
submodules: false
- name: Setup Linux
uses: ./.github/actions/setup-linux
- name: Pull Docker image
uses: pytorch/test-infra/.github/actions/pull-docker-image@main
with:
docker-image: ${{ env.DOCKER_IMAGE }}
- name: Build Triton wheel
env:
IS_RELEASE_TAG: ${{ startsWith(github.event.ref, 'refs/tags/v') }}
run: |
set -x
mkdir -p "${RUNNER_TEMP}/artifacts/"
container_name=$(docker run \
--tty \
--detach \
-v "${GITHUB_WORKSPACE}:/pytorch" \
-v "${RUNNER_TEMP}/artifacts:/artifacts" \
-w /artifacts/ \
"${DOCKER_IMAGE}" \
)
# Determine python executable for given version
case $PY_VERS in
3.8)
PYTHON_EXECUTABLE=/opt/python/cp38-cp38/bin/python
;;
3.9)
PYTHON_EXECUTABLE=/opt/python/cp39-cp39/bin/python
;;
3.10)
PYTHON_EXECUTABLE=/opt/python/cp310-cp310/bin/python
;;
3.11)
PYTHON_EXECUTABLE=/opt/python/cp311-cp311/bin/python
;;
3.12)
PYTHON_EXECUTABLE=/opt/python/cp312-cp312/bin/python
;;
*)
echo "Unsupported python version ${PY_VERS}"
exit 1
;;
esac
RELEASE=""
if [[ "${IS_RELEASE_TAG}" == true ]]; then
RELEASE="--release"
fi
docker exec -t "${container_name}" yum install -y zlib-devel zip
docker exec -t "${container_name}" "${PYTHON_EXECUTABLE}" -m pip install -U setuptools==67.4.0
# Triton xpu build uses GCC 11
if [[ "${BUILD_DEVICE}" == xpu ]]; then
docker exec -t "${container_name}" yum install -y devtoolset-11-gcc-c++
docker exec -t "${container_name}" bash -c "source /opt/rh/devtoolset-11/enable && ${PYTHON_EXECUTABLE} /pytorch/.github/scripts/build_triton_wheel.py --device=$BUILD_DEVICE $RELEASE"
else
docker exec -t "${container_name}" bash -c "${PYTHON_EXECUTABLE} /pytorch/.github/scripts/build_triton_wheel.py --device=$BUILD_DEVICE $RELEASE"
fi
docker exec -t "${container_name}" chown -R 1000.1000 /artifacts
- uses: actions/upload-artifact@v3
with:
name: pytorch-triton-wheel-${{ matrix.py_vers }}-${{ matrix.device }}
if-no-files-found: error
path: ${{ runner.temp }}/artifacts/*
- name: Teardown Linux
uses: pytorch/test-infra/.github/actions/teardown-linux@main
if: always()
upload-wheel:
runs-on: ubuntu-22.04
needs: build-wheel
permissions:
id-token: write
contents: read
container:
image: continuumio/miniconda3:4.12.0
environment: ${{ (github.event_name == 'push' && (github.event.ref == 'refs/heads/main' || startsWith(github.event.ref, 'refs/tags/v'))) && 'conda-aws-upload' || '' }}
steps:
- uses: actions/checkout@v3
- name: Configure AWS credentials(PyTorch account) for main
if: ${{ github.event_name == 'push' && github.event.ref == 'refs/heads/main' }}
uses: aws-actions/configure-aws-credentials@v3
with:
role-to-assume: arn:aws:iam::749337293305:role/gha_workflow_nightly_build_wheels
aws-region: us-east-1
- name: Configure AWS credentials(PyTorch account) for RC builds
if: ${{ github.event_name == 'push' && (startsWith(github.event.ref, 'refs/tags/') && !startsWith(github.event.ref, 'refs/tags/ciflow/')) }}
uses: aws-actions/configure-aws-credentials@v3
with:
role-to-assume: arn:aws:iam::749337293305:role/gha_workflow_test_build_wheels
aws-region: us-east-1
- name: Download Build Artifacts
uses: actions/download-artifact@v4.1.7
with:
# Download all available artifacts
path: ${{ runner.temp }}/artifacts-all
- name: Select Wheel Artifacts
shell: bash
run: |
set -x
mkdir -p "${RUNNER_TEMP}/artifacts/"
mv "${RUNNER_TEMP}"/artifacts-all/pytorch-triton-wheel-*/* "${RUNNER_TEMP}/artifacts/"
- name: Set DRY_RUN (only for tagged pushes)
if: ${{ github.event_name == 'push' && (github.event.ref == 'refs/heads/main' || startsWith(github.event.ref, 'refs/tags/v')) }}
shell: bash
run: |
echo "DRY_RUN=disabled" >> "$GITHUB_ENV"
- name: Set UPLOAD_CHANNEL (only for tagged pushes)
if: ${{ github.event_name == 'push' && startsWith(github.event.ref, 'refs/tags/v') }}
shell: bash
run: |
set -ex
# reference ends with an RC suffix
if [[ "${GITHUB_REF_NAME}" = *-rc[0-9]* ]]; then
echo "UPLOAD_CHANNEL=test" >> "$GITHUB_ENV"
fi
# NB: This step is gated by DRY_RUN, which is enabled everywhere except main and release branches
- name: Upload binaries
env:
PACKAGE_TYPE: wheel
# The UPLOAD_SUBFOLDER needs to be empty here so that triton wheels are uploaded
# to nightly or test
UPLOAD_SUBFOLDER: ""
PKG_DIR: ${{ runner.temp }}/artifacts
shell: bash
run: |
set -ex
bash .circleci/scripts/binary_upload.sh
build-conda:
name: "Build Triton Conda"
runs-on: [self-hosted, linux.2xlarge]
strategy:
fail-fast: false
matrix:
py_vers: [ "3.8", "3.9", "3.10", "3.11", "3.12" ]
timeout-minutes: 40
env:
DOCKER_IMAGE: pytorch/conda-builder:cpu
PY_VERS: ${{ matrix.py_vers }}
steps:
- name: Setup SSH (Click me for login details)
uses: pytorch/test-infra/.github/actions/setup-ssh@main
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
with:
submodules: false
- name: Setup Linux
uses: ./.github/actions/setup-linux
- name: Pull Docker image
uses: pytorch/test-infra/.github/actions/pull-docker-image@main
with:
docker-image: ${{ env.DOCKER_IMAGE }}
- name: Build Triton conda package
env:
IS_RELEASE_TAG: ${{ startsWith(github.event.ref, 'refs/tags/v') }}
run: |
set -x
mkdir -p "${RUNNER_TEMP}/artifacts/"
container_name=$(docker run \
--tty \
--detach \
-v "${GITHUB_WORKSPACE}:/pytorch" \
-v "${RUNNER_TEMP}/artifacts:/artifacts" \
-w /artifacts/ \
"${DOCKER_IMAGE}" \
)
RELEASE=""
if [[ "${IS_RELEASE_TAG}" == true ]]; then
RELEASE="--release"
fi
docker exec -t "${container_name}" yum install -y llvm11 llvm11-devel llvm11-static llvm11-libs zlib-devel
docker exec -t "${container_name}" python /pytorch/.github/scripts/build_triton_wheel.py --build-conda --py-version="${PY_VERS}" $RELEASE
docker exec -t "${container_name}" chown -R 1000.1000 /artifacts
- uses: actions/upload-artifact@v3
with:
name: pytorch-triton-conda-${{ matrix.py_vers }}
if-no-files-found: error
path: ${{ runner.temp }}/artifacts/*
- name: Teardown Linux
uses: pytorch/test-infra/.github/actions/teardown-linux@main
if: always()
upload-conda:
runs-on: ubuntu-22.04
needs: build-conda
container:
image: continuumio/miniconda3:4.12.0
environment: ${{ (github.event_name == 'push' && (github.event.ref == 'refs/heads/main' || startsWith(github.event.ref, 'refs/tags/v'))) && 'conda-aws-upload' || '' }}
steps:
- uses: actions/checkout@v3
- name: Download Build Artifacts
uses: actions/download-artifact@v4.1.7
with:
# Download all available artifacts
path: ${{ runner.temp }}/artifacts-all
- name: Select Conda Artifacts
shell: bash
run: |
set -x
mkdir -p "${RUNNER_TEMP}/artifacts/"
mv "${RUNNER_TEMP}"/artifacts-all/pytorch-triton-conda-*/* "${RUNNER_TEMP}/artifacts/"
- name: Set DRY_RUN (only for tagged pushes)
if: ${{ github.event_name == 'push' && (github.event.ref == 'refs/heads/main' || startsWith(github.event.ref, 'refs/tags/v')) }}
shell: bash
run: |
echo "DRY_RUN=disabled" >> "$GITHUB_ENV"
- name: Set UPLOAD_CHANNEL (only for tagged pushes)
if: ${{ github.event_name == 'push' && startsWith(github.event.ref, 'refs/tags/v') }}
shell: bash
run: |
set -ex
# reference ends with an RC suffix
if [[ "${GITHUB_REF_NAME}" = *-rc[0-9]* ]]; then
echo "UPLOAD_CHANNEL=test" >> "$GITHUB_ENV"
fi
# NB: This step is gated by DRY_RUN, which is enabled everywhere except nightly and release branches
- name: Upload binaries to Anaconda
env:
PACKAGE_TYPE: conda
PKG_DIR: ${{ runner.temp }}/artifacts
# When running these on pull_request events these should be blank
CONDA_PYTORCHBOT_TOKEN: ${{ secrets.CONDA_PYTORCHBOT_TOKEN }}
CONDA_PYTORCHBOT_TOKEN_TEST: ${{ secrets.CONDA_PYTORCHBOT_TOKEN_TEST }}
shell: bash
run: |
set -ex
if [[ "${UPLOAD_CHANNEL:-nightly}" == "nightly" ]]; then
export ANACONDA_API_TOKEN="${CONDA_PYTORCHBOT_TOKEN}"
else
export ANACONDA_API_TOKEN="${CONDA_PYTORCHBOT_TOKEN_TEST}"
fi
bash .circleci/scripts/binary_upload.sh
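
The deleted build-triton-wheel workflow above resolved the interpreter for each matrix entry with a case statement over the manylinux layout, where CPython installs live under /opt/python/cp&lt;maj&gt;&lt;min&gt;-cp&lt;maj&gt;&lt;min&gt;/bin/python. A compact sketch of the same mapping (the PY_VERS value is a hypothetical matrix input):

```yaml
# Hedged sketch: derive the manylinux interpreter path; equivalent to the
# workflow's case statement for the versions it supported.
steps:
  - name: Resolve python executable
    run: |
      PY_VERS="3.12"                         # hypothetical matrix input
      TAG="cp${PY_VERS/./}"                  # "3.12" -> "cp312"
      PYTHON_EXECUTABLE="/opt/python/${TAG}-${TAG}/bin/python"
      echo "PYTHON_EXECUTABLE=${PYTHON_EXECUTABLE}"
```

The same workflow routed uploads by tag shape: a push of a `v*-rcN` tag set `UPLOAD_CHANNEL=test`, anything else fell through to the default nightly channel, and `DRY_RUN` stayed enabled except on main and tag pushes.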


@ -35,7 +35,7 @@ jobs:
runs-on: linux.20_04.4x
steps:
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
with:
submodules: false
fetch-depth: 1


@ -11,7 +11,7 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
- name: Run close_nonexistent_disable_issues.py
env:


@ -63,7 +63,7 @@ jobs:
files: ${{env.PT_RELEASE_FILE}}
- name: Upload source distribution to GHA artifacts for release tags
if: ${{ github.event_name == 'push' && startsWith(github.ref, 'refs/tags/v') && contains(github.ref, 'rc') }}
uses: actions/upload-artifact@v2
uses: actions/upload-artifact@v4.4.0
with:
name: ${{ env.PT_RELEASE_FILE }}
path: ${{ env.PT_RELEASE_FILE }}


@ -82,28 +82,28 @@ jobs:
# [see note: pytorch repo ref]
# deep clone (fetch-depth 0) required for git merge-base
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
- name: Setup Linux
uses: ./.github/actions/setup-linux
- name: Build docker image
id: build-docker-image
uses: pytorch/test-infra/.github/actions/calculate-docker-image@main
uses: pytorch/test-infra/.github/actions/calculate-docker-image@release/2.5
with:
docker-image-name: ${{ matrix.docker-image-name }}
always-rebuild: true
push: true
- name: Pull docker image
uses: pytorch/test-infra/.github/actions/pull-docker-image@main
uses: pytorch/test-infra/.github/actions/pull-docker-image@release/2.5
with:
docker-image: ${{ steps.build-docker-image.outputs.docker-image }}
- uses: nick-fields/retry@v3.0.0
name: Push to https://ghcr.io/
id: push-to-ghcr-io
if: ${{ github.event_name == 'push' }}
if: ${{ 0 && github.event_name == 'push' }}
env:
ECR_DOCKER_IMAGE: ${{ steps.build-docker-image.outputs.docker-image }}
GHCR_PAT: ${{ secrets.GHCR_PAT }}
@ -128,5 +128,5 @@ jobs:
if: always()
- name: Teardown Linux
uses: pytorch/test-infra/.github/actions/teardown-linux@main
uses: pytorch/test-infra/.github/actions/teardown-linux@release/2.5
if: always()
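
In the docker-builds diff above, the push-to-ghcr step is not removed but short-circuited: prefixing the `if:` expression with a falsy literal (`0 && …`) makes the condition always evaluate false, so the step is skipped on this branch while the original predicate stays in place for easy re-enabling. A hedged illustration:

```yaml
# Illustration only: 0 coerces to false in Actions expressions, so the
# && chain never reaches the real condition and the step never runs.
steps:
  - name: Push to ghcr.io (disabled)
    if: ${{ 0 && github.event_name == 'push' }}
    run: echo "unreachable on this branch"
```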


@ -41,7 +41,7 @@ jobs:
matrix: ${{ steps.generate-matrix.outputs.matrix }}
steps:
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
with:
fetch-depth: 1
submodules: true
@ -69,7 +69,7 @@ jobs:
CUDNN_VERSION: ${{ matrix.cudnn_version }}
steps:
- name: Setup SSH (Click me for login details)
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
# [see note: pytorch repo ref]
@ -147,12 +147,12 @@ jobs:
fi
- name: Teardown Linux
uses: pytorch/test-infra/.github/actions/teardown-linux@main
uses: pytorch/test-infra/.github/actions/teardown-linux@release/2.5
if: always()
validate:
needs: build
uses: pytorch/builder/.github/workflows/validate-docker-images.yml@main
uses: pytorch/builder/.github/workflows/validate-docker-images.yml@release/2.5
with:
channel: nightly
ref: main


@ -57,13 +57,14 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu-aarch64
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-main
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-2.5
use_split_build: False
DESIRED_PYTHON: "3.9"
runs_on: linux.arm64.m7g.4xlarge.ephemeral
ALPINE_IMAGE: "arm64v8/alpine"
build_name: manywheel-py3_9-cpu-aarch64
build_environment: linux-aarch64-binary-manywheel
PYTORCH_EXTRA_INSTALL_REQUIREMENTS: nvidia-cuda-nvrtc-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-runtime-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-cupti-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cudnn-cu12==9.1.0.70; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cublas-cu12==12.1.3.1; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cufft-cu12==11.0.2.54; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-curand-cu12==10.3.2.106; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusolver-cu12==11.4.5.107; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusparse-cu12==12.1.0.106; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nccl-cu12==2.21.5; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvtx-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64'
PYTORCH_EXTRA_INSTALL_REQUIREMENTS: nvidia-cuda-nvrtc-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-runtime-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-cupti-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cudnn-cu12==9.1.0.70; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cublas-cu12==12.4.5.8; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cufft-cu12==11.2.1.3; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-curand-cu12==10.3.5.147; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusolver-cu12==11.6.1.9; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusparse-cu12==12.3.1.170; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nccl-cu12==2.21.5; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvtx-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvjitlink-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64'
secrets:
github-token: ${{ secrets.GITHUB_TOKEN }}
manywheel-py3_9-cpu-aarch64-test: # Testing
@ -80,7 +81,8 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu-aarch64
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-main
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-2.5
use_split_build: False
DESIRED_PYTHON: "3.9"
build_name: manywheel-py3_9-cpu-aarch64
build_environment: linux-aarch64-binary-manywheel
@ -102,7 +104,8 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu-aarch64
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-main
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-2.5
use_split_build: False
DESIRED_PYTHON: "3.9"
build_name: manywheel-py3_9-cpu-aarch64
secrets:
@ -123,8 +126,9 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cu124
GPU_ARCH_TYPE: cuda-aarch64
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cuda12.4-main
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cuda12.4-2.5
DESIRED_DEVTOOLSET: cxx11-abi
use_split_build: False
DESIRED_PYTHON: "3.9"
runs_on: linux.arm64.m7g.4xlarge.ephemeral
ALPINE_IMAGE: "arm64v8/alpine"
@ -147,8 +151,9 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cu124
GPU_ARCH_TYPE: cuda-aarch64
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cuda12.4-main
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cuda12.4-2.5
DESIRED_DEVTOOLSET: cxx11-abi
use_split_build: False
DESIRED_PYTHON: "3.9"
build_name: manywheel-py3_9-cuda-aarch64
secrets:
@ -169,13 +174,14 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu-aarch64
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-main
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-2.5
use_split_build: False
DESIRED_PYTHON: "3.10"
runs_on: linux.arm64.m7g.4xlarge.ephemeral
ALPINE_IMAGE: "arm64v8/alpine"
build_name: manywheel-py3_10-cpu-aarch64
build_environment: linux-aarch64-binary-manywheel
PYTORCH_EXTRA_INSTALL_REQUIREMENTS: nvidia-cuda-nvrtc-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-runtime-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-cupti-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cudnn-cu12==9.1.0.70; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cublas-cu12==12.1.3.1; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cufft-cu12==11.0.2.54; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-curand-cu12==10.3.2.106; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusolver-cu12==11.4.5.107; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusparse-cu12==12.1.0.106; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nccl-cu12==2.21.5; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvtx-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64'
PYTORCH_EXTRA_INSTALL_REQUIREMENTS: nvidia-cuda-nvrtc-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-runtime-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-cupti-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cudnn-cu12==9.1.0.70; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cublas-cu12==12.4.5.8; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cufft-cu12==11.2.1.3; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-curand-cu12==10.3.5.147; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusolver-cu12==11.6.1.9; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusparse-cu12==12.3.1.170; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nccl-cu12==2.21.5; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvtx-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvjitlink-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64'
secrets:
github-token: ${{ secrets.GITHUB_TOKEN }}
manywheel-py3_10-cpu-aarch64-test: # Testing
@ -192,7 +198,8 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu-aarch64
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-main
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-2.5
use_split_build: False
DESIRED_PYTHON: "3.10"
build_name: manywheel-py3_10-cpu-aarch64
build_environment: linux-aarch64-binary-manywheel
@ -214,7 +221,8 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu-aarch64
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-main
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-2.5
use_split_build: False
DESIRED_PYTHON: "3.10"
build_name: manywheel-py3_10-cpu-aarch64
secrets:
@ -235,8 +243,9 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cu124
GPU_ARCH_TYPE: cuda-aarch64
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cuda12.4-main
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cuda12.4-2.5
DESIRED_DEVTOOLSET: cxx11-abi
use_split_build: False
DESIRED_PYTHON: "3.10"
runs_on: linux.arm64.m7g.4xlarge.ephemeral
ALPINE_IMAGE: "arm64v8/alpine"
@ -259,8 +268,9 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cu124
GPU_ARCH_TYPE: cuda-aarch64
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cuda12.4-main
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cuda12.4-2.5
DESIRED_DEVTOOLSET: cxx11-abi
use_split_build: False
DESIRED_PYTHON: "3.10"
build_name: manywheel-py3_10-cuda-aarch64
secrets:
@ -281,13 +291,14 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu-aarch64
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-main
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-2.5
use_split_build: False
DESIRED_PYTHON: "3.11"
runs_on: linux.arm64.m7g.4xlarge.ephemeral
ALPINE_IMAGE: "arm64v8/alpine"
build_name: manywheel-py3_11-cpu-aarch64
build_environment: linux-aarch64-binary-manywheel
PYTORCH_EXTRA_INSTALL_REQUIREMENTS: nvidia-cuda-nvrtc-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-runtime-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-cupti-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cudnn-cu12==9.1.0.70; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cublas-cu12==12.1.3.1; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cufft-cu12==11.0.2.54; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-curand-cu12==10.3.2.106; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusolver-cu12==11.4.5.107; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusparse-cu12==12.1.0.106; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nccl-cu12==2.21.5; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvtx-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64'
PYTORCH_EXTRA_INSTALL_REQUIREMENTS: nvidia-cuda-nvrtc-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-runtime-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-cupti-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cudnn-cu12==9.1.0.70; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cublas-cu12==12.4.5.8; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cufft-cu12==11.2.1.3; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-curand-cu12==10.3.5.147; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusolver-cu12==11.6.1.9; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusparse-cu12==12.3.1.170; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nccl-cu12==2.21.5; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvtx-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvjitlink-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64'
secrets:
github-token: ${{ secrets.GITHUB_TOKEN }}
manywheel-py3_11-cpu-aarch64-test: # Testing
@ -304,7 +315,8 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu-aarch64
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-main
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-2.5
use_split_build: False
DESIRED_PYTHON: "3.11"
build_name: manywheel-py3_11-cpu-aarch64
build_environment: linux-aarch64-binary-manywheel
@ -326,7 +338,8 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu-aarch64
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-main
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-2.5
use_split_build: False
DESIRED_PYTHON: "3.11"
build_name: manywheel-py3_11-cpu-aarch64
secrets:
@ -347,8 +360,9 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cu124
GPU_ARCH_TYPE: cuda-aarch64
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cuda12.4-main
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cuda12.4-2.5
DESIRED_DEVTOOLSET: cxx11-abi
use_split_build: False
DESIRED_PYTHON: "3.11"
runs_on: linux.arm64.m7g.4xlarge.ephemeral
ALPINE_IMAGE: "arm64v8/alpine"
@ -371,8 +385,9 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cu124
GPU_ARCH_TYPE: cuda-aarch64
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cuda12.4-main
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cuda12.4-2.5
DESIRED_DEVTOOLSET: cxx11-abi
use_split_build: False
DESIRED_PYTHON: "3.11"
build_name: manywheel-py3_11-cuda-aarch64
secrets:
@ -393,13 +408,14 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu-aarch64
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-main
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-2.5
use_split_build: False
DESIRED_PYTHON: "3.12"
runs_on: linux.arm64.m7g.4xlarge.ephemeral
ALPINE_IMAGE: "arm64v8/alpine"
build_name: manywheel-py3_12-cpu-aarch64
build_environment: linux-aarch64-binary-manywheel
PYTORCH_EXTRA_INSTALL_REQUIREMENTS: nvidia-cuda-nvrtc-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-runtime-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-cupti-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cudnn-cu12==9.1.0.70; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cublas-cu12==12.1.3.1; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cufft-cu12==11.0.2.54; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-curand-cu12==10.3.2.106; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusolver-cu12==11.4.5.107; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusparse-cu12==12.1.0.106; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nccl-cu12==2.21.5; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvtx-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64'
PYTORCH_EXTRA_INSTALL_REQUIREMENTS: nvidia-cuda-nvrtc-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-runtime-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-cupti-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cudnn-cu12==9.1.0.70; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cublas-cu12==12.4.5.8; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cufft-cu12==11.2.1.3; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-curand-cu12==10.3.5.147; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusolver-cu12==11.6.1.9; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusparse-cu12==12.3.1.170; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nccl-cu12==2.21.5; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvtx-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvjitlink-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64'
secrets:
github-token: ${{ secrets.GITHUB_TOKEN }}
manywheel-py3_12-cpu-aarch64-test: # Testing
@ -416,7 +432,8 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu-aarch64
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-main
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-2.5
use_split_build: False
DESIRED_PYTHON: "3.12"
build_name: manywheel-py3_12-cpu-aarch64
build_environment: linux-aarch64-binary-manywheel
@ -438,7 +455,8 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu-aarch64
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-main
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-2.5
use_split_build: False
DESIRED_PYTHON: "3.12"
build_name: manywheel-py3_12-cpu-aarch64
secrets:
@ -459,8 +477,9 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cu124
GPU_ARCH_TYPE: cuda-aarch64
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cuda12.4-main
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cuda12.4-2.5
DESIRED_DEVTOOLSET: cxx11-abi
use_split_build: False
DESIRED_PYTHON: "3.12"
runs_on: linux.arm64.m7g.4xlarge.ephemeral
ALPINE_IMAGE: "arm64v8/alpine"
@ -483,8 +502,9 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cu124
GPU_ARCH_TYPE: cuda-aarch64
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cuda12.4-main
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cuda12.4-2.5
DESIRED_DEVTOOLSET: cxx11-abi
use_split_build: False
DESIRED_PYTHON: "3.12"
build_name: manywheel-py3_12-cuda-aarch64
secrets:
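
The aarch64 manywheel diff above makes three changes per job: it freezes the builder image tag (`cpu-aarch64-main` → `cpu-aarch64-2.5`), adds an explicit `use_split_build: False`, and bumps the CUDA pins in `PYTORCH_EXTRA_INSTALL_REQUIREMENTS` from the 12.1 to the 12.4 series while adding `nvidia-nvjitlink-cu12`. The pin list is a `|`-separated set of PEP 508 requirements whose markers restrict them to x86_64 Linux; abbreviated:

```yaml
# Abbreviated sketch of the pin-list format (two of the twelve entries);
# the markers keep the CUDA wheels off non-x86_64 installs.
env:
  PYTORCH_EXTRA_INSTALL_REQUIREMENTS: >-
    nvidia-cuda-nvrtc-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' |
    nvidia-nvjitlink-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64'
```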


@ -57,7 +57,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/conda-builder:cpu-main
DOCKER_IMAGE: pytorch/conda-builder:cpu-2.5
DESIRED_PYTHON: "3.9"
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
build_name: conda-py3_9-cpu
@ -78,7 +78,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/conda-builder:cpu-main
DOCKER_IMAGE: pytorch/conda-builder:cpu-2.5
DESIRED_PYTHON: "3.9"
build_name: conda-py3_9-cpu
build_environment: linux-binary-conda
@ -100,7 +100,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/conda-builder:cpu-main
DOCKER_IMAGE: pytorch/conda-builder:cpu-2.5
DESIRED_PYTHON: "3.9"
build_name: conda-py3_9-cpu
secrets:
@ -122,7 +122,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-main
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-2.5
DESIRED_PYTHON: "3.9"
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
runs_on: linux.24xlarge.ephemeral
@ -145,7 +145,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-main
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-2.5
DESIRED_PYTHON: "3.9"
build_name: conda-py3_9-cuda11_8
build_environment: linux-binary-conda
@ -168,7 +168,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-main
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-2.5
DESIRED_PYTHON: "3.9"
build_name: conda-py3_9-cuda11_8
secrets:
@ -190,7 +190,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-main
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-2.5
DESIRED_PYTHON: "3.9"
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
runs_on: linux.24xlarge.ephemeral
@ -213,7 +213,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-main
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-2.5
DESIRED_PYTHON: "3.9"
build_name: conda-py3_9-cuda12_1
build_environment: linux-binary-conda
@ -236,7 +236,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-main
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-2.5
DESIRED_PYTHON: "3.9"
build_name: conda-py3_9-cuda12_1
secrets:
@ -258,7 +258,7 @@ jobs:
DESIRED_CUDA: cu124
GPU_ARCH_VERSION: 12.4
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda12.4-main
DOCKER_IMAGE: pytorch/conda-builder:cuda12.4-2.5
DESIRED_PYTHON: "3.9"
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
runs_on: linux.24xlarge.ephemeral
@ -281,7 +281,7 @@ jobs:
DESIRED_CUDA: cu124
GPU_ARCH_VERSION: 12.4
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda12.4-main
DOCKER_IMAGE: pytorch/conda-builder:cuda12.4-2.5
DESIRED_PYTHON: "3.9"
build_name: conda-py3_9-cuda12_4
build_environment: linux-binary-conda
@ -304,7 +304,7 @@ jobs:
DESIRED_CUDA: cu124
GPU_ARCH_VERSION: 12.4
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda12.4-main
DOCKER_IMAGE: pytorch/conda-builder:cuda12.4-2.5
DESIRED_PYTHON: "3.9"
build_name: conda-py3_9-cuda12_4
secrets:
@ -325,7 +325,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/conda-builder:cpu-main
DOCKER_IMAGE: pytorch/conda-builder:cpu-2.5
DESIRED_PYTHON: "3.10"
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
build_name: conda-py3_10-cpu
@ -346,7 +346,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/conda-builder:cpu-main
DOCKER_IMAGE: pytorch/conda-builder:cpu-2.5
DESIRED_PYTHON: "3.10"
build_name: conda-py3_10-cpu
build_environment: linux-binary-conda
@ -368,7 +368,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/conda-builder:cpu-main
DOCKER_IMAGE: pytorch/conda-builder:cpu-2.5
DESIRED_PYTHON: "3.10"
build_name: conda-py3_10-cpu
secrets:
@ -390,7 +390,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-main
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-2.5
DESIRED_PYTHON: "3.10"
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
runs_on: linux.24xlarge.ephemeral
@ -413,7 +413,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-main
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-2.5
DESIRED_PYTHON: "3.10"
build_name: conda-py3_10-cuda11_8
build_environment: linux-binary-conda
@ -436,7 +436,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-main
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-2.5
DESIRED_PYTHON: "3.10"
build_name: conda-py3_10-cuda11_8
secrets:
@ -458,7 +458,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-main
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-2.5
DESIRED_PYTHON: "3.10"
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
runs_on: linux.24xlarge.ephemeral
@ -481,7 +481,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-main
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-2.5
DESIRED_PYTHON: "3.10"
build_name: conda-py3_10-cuda12_1
build_environment: linux-binary-conda
@ -504,7 +504,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-main
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-2.5
DESIRED_PYTHON: "3.10"
build_name: conda-py3_10-cuda12_1
secrets:
@ -526,7 +526,7 @@ jobs:
DESIRED_CUDA: cu124
GPU_ARCH_VERSION: 12.4
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda12.4-main
DOCKER_IMAGE: pytorch/conda-builder:cuda12.4-2.5
DESIRED_PYTHON: "3.10"
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
runs_on: linux.24xlarge.ephemeral
@ -549,7 +549,7 @@ jobs:
DESIRED_CUDA: cu124
GPU_ARCH_VERSION: 12.4
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda12.4-main
DOCKER_IMAGE: pytorch/conda-builder:cuda12.4-2.5
DESIRED_PYTHON: "3.10"
build_name: conda-py3_10-cuda12_4
build_environment: linux-binary-conda
@ -572,7 +572,7 @@ jobs:
DESIRED_CUDA: cu124
GPU_ARCH_VERSION: 12.4
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda12.4-main
DOCKER_IMAGE: pytorch/conda-builder:cuda12.4-2.5
DESIRED_PYTHON: "3.10"
build_name: conda-py3_10-cuda12_4
secrets:
@ -593,7 +593,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/conda-builder:cpu-main
DOCKER_IMAGE: pytorch/conda-builder:cpu-2.5
DESIRED_PYTHON: "3.11"
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
build_name: conda-py3_11-cpu
@ -614,7 +614,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/conda-builder:cpu-main
DOCKER_IMAGE: pytorch/conda-builder:cpu-2.5
DESIRED_PYTHON: "3.11"
build_name: conda-py3_11-cpu
build_environment: linux-binary-conda
@ -636,7 +636,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/conda-builder:cpu-main
DOCKER_IMAGE: pytorch/conda-builder:cpu-2.5
DESIRED_PYTHON: "3.11"
build_name: conda-py3_11-cpu
secrets:
@ -658,7 +658,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-main
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-2.5
DESIRED_PYTHON: "3.11"
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
runs_on: linux.24xlarge.ephemeral
@ -681,7 +681,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-main
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-2.5
DESIRED_PYTHON: "3.11"
build_name: conda-py3_11-cuda11_8
build_environment: linux-binary-conda
@ -704,7 +704,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-main
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-2.5
DESIRED_PYTHON: "3.11"
build_name: conda-py3_11-cuda11_8
secrets:
@ -726,7 +726,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-main
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-2.5
DESIRED_PYTHON: "3.11"
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
runs_on: linux.24xlarge.ephemeral
@ -749,7 +749,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-main
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-2.5
DESIRED_PYTHON: "3.11"
build_name: conda-py3_11-cuda12_1
build_environment: linux-binary-conda
@ -772,7 +772,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-main
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-2.5
DESIRED_PYTHON: "3.11"
build_name: conda-py3_11-cuda12_1
secrets:
@ -794,7 +794,7 @@ jobs:
DESIRED_CUDA: cu124
GPU_ARCH_VERSION: 12.4
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda12.4-main
DOCKER_IMAGE: pytorch/conda-builder:cuda12.4-2.5
DESIRED_PYTHON: "3.11"
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
runs_on: linux.24xlarge.ephemeral
@ -817,7 +817,7 @@ jobs:
DESIRED_CUDA: cu124
GPU_ARCH_VERSION: 12.4
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda12.4-main
DOCKER_IMAGE: pytorch/conda-builder:cuda12.4-2.5
DESIRED_PYTHON: "3.11"
build_name: conda-py3_11-cuda12_4
build_environment: linux-binary-conda
@ -840,7 +840,7 @@ jobs:
DESIRED_CUDA: cu124
GPU_ARCH_VERSION: 12.4
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda12.4-main
DOCKER_IMAGE: pytorch/conda-builder:cuda12.4-2.5
DESIRED_PYTHON: "3.11"
build_name: conda-py3_11-cuda12_4
secrets:
@ -861,7 +861,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/conda-builder:cpu-main
DOCKER_IMAGE: pytorch/conda-builder:cpu-2.5
DESIRED_PYTHON: "3.12"
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
build_name: conda-py3_12-cpu
@ -882,7 +882,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/conda-builder:cpu-main
DOCKER_IMAGE: pytorch/conda-builder:cpu-2.5
DESIRED_PYTHON: "3.12"
build_name: conda-py3_12-cpu
build_environment: linux-binary-conda
@ -904,7 +904,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/conda-builder:cpu-main
DOCKER_IMAGE: pytorch/conda-builder:cpu-2.5
DESIRED_PYTHON: "3.12"
build_name: conda-py3_12-cpu
secrets:
@ -926,7 +926,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-main
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-2.5
DESIRED_PYTHON: "3.12"
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
runs_on: linux.24xlarge.ephemeral
@ -949,7 +949,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-main
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-2.5
DESIRED_PYTHON: "3.12"
build_name: conda-py3_12-cuda11_8
build_environment: linux-binary-conda
@ -972,7 +972,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-main
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-2.5
DESIRED_PYTHON: "3.12"
build_name: conda-py3_12-cuda11_8
secrets:
@ -994,7 +994,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-main
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-2.5
DESIRED_PYTHON: "3.12"
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
runs_on: linux.24xlarge.ephemeral
@ -1017,7 +1017,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-main
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-2.5
DESIRED_PYTHON: "3.12"
build_name: conda-py3_12-cuda12_1
build_environment: linux-binary-conda
@ -1040,7 +1040,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-main
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-2.5
DESIRED_PYTHON: "3.12"
build_name: conda-py3_12-cuda12_1
secrets:
@ -1062,7 +1062,7 @@ jobs:
DESIRED_CUDA: cu124
GPU_ARCH_VERSION: 12.4
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda12.4-main
DOCKER_IMAGE: pytorch/conda-builder:cuda12.4-2.5
DESIRED_PYTHON: "3.12"
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
runs_on: linux.24xlarge.ephemeral
@ -1085,7 +1085,7 @@ jobs:
DESIRED_CUDA: cu124
GPU_ARCH_VERSION: 12.4
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda12.4-main
DOCKER_IMAGE: pytorch/conda-builder:cuda12.4-2.5
DESIRED_PYTHON: "3.12"
build_name: conda-py3_12-cuda12_4
build_environment: linux-binary-conda
@ -1108,7 +1108,7 @@ jobs:
DESIRED_CUDA: cu124
GPU_ARCH_VERSION: 12.4
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda12.4-main
DOCKER_IMAGE: pytorch/conda-builder:cuda12.4-2.5
DESIRED_PYTHON: "3.12"
build_name: conda-py3_12-cuda12_4
secrets:
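
The conda binary-build diff above applies the same freeze mechanically across the build/test/upload job triple for every (Python, CUDA) pair: floating `-main` builder image tags become release-frozen `-2.5` tags. One representative matrix cell after the change:

```yaml
# One cell of the matrix; every conda-py3_*-cu* job in the file has
# the same shape.
env:
  DESIRED_CUDA: cu124
  GPU_ARCH_VERSION: 12.4
  GPU_ARCH_TYPE: cuda
  DOCKER_IMAGE: pytorch/conda-builder:cuda12.4-2.5  # was :cuda12.4-main
  DESIRED_PYTHON: "3.12"
```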


@ -52,7 +52,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cpu-main
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cpu-2.5
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: cxx11-abi
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
@ -74,7 +74,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cpu-main
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cpu-2.5
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: cxx11-abi
build_name: libtorch-cpu-shared-with-deps-cxx11-abi


@ -57,7 +57,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cpu-main
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cpu-2.5
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: cxx11-abi
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
@ -79,7 +79,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cpu-main
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cpu-2.5
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: cxx11-abi
build_name: libtorch-cpu-shared-with-deps-cxx11-abi
@ -102,7 +102,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cpu-main
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cpu-2.5
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: cxx11-abi
build_name: libtorch-cpu-shared-with-deps-cxx11-abi
@ -125,7 +125,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cuda11.8-main
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cuda11.8-2.5
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: cxx11-abi
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
@ -148,7 +148,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cuda11.8-main
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cuda11.8-2.5
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: cxx11-abi
build_name: libtorch-cuda11_8-shared-with-deps-cxx11-abi
@ -172,7 +172,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cuda11.8-main
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cuda11.8-2.5
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: cxx11-abi
build_name: libtorch-cuda11_8-shared-with-deps-cxx11-abi
@ -195,7 +195,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cuda12.1-main
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cuda12.1-2.5
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: cxx11-abi
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
@ -218,7 +218,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cuda12.1-main
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cuda12.1-2.5
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: cxx11-abi
build_name: libtorch-cuda12_1-shared-with-deps-cxx11-abi
@ -242,7 +242,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cuda12.1-main
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cuda12.1-2.5
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: cxx11-abi
build_name: libtorch-cuda12_1-shared-with-deps-cxx11-abi
@ -265,7 +265,7 @@ jobs:
DESIRED_CUDA: cu124
GPU_ARCH_VERSION: 12.4
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cuda12.4-main
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cuda12.4-2.5
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: cxx11-abi
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
@ -288,7 +288,7 @@ jobs:
DESIRED_CUDA: cu124
GPU_ARCH_VERSION: 12.4
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cuda12.4-main
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cuda12.4-2.5
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: cxx11-abi
build_name: libtorch-cuda12_4-shared-with-deps-cxx11-abi
@ -312,7 +312,7 @@ jobs:
DESIRED_CUDA: cu124
GPU_ARCH_VERSION: 12.4
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cuda12.4-main
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cuda12.4-2.5
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: cxx11-abi
build_name: libtorch-cuda12_4-shared-with-deps-cxx11-abi
@ -335,7 +335,7 @@ jobs:
DESIRED_CUDA: rocm6.1
GPU_ARCH_VERSION: 6.1
GPU_ARCH_TYPE: rocm
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:rocm6.1-main
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:rocm6.1-2.5
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: cxx11-abi
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
@ -360,7 +360,7 @@ jobs:
GPU_ARCH_VERSION: 6.1
GPU_ARCH_TYPE: rocm
SKIP_ALL_TESTS: 1
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:rocm6.1-main
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:rocm6.1-2.5
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: cxx11-abi
steps:
@ -374,7 +374,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -386,7 +385,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -400,9 +399,9 @@ jobs:
run: |
echo "GPU_FLAG=--device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon" >> "${GITHUB_ENV}"
- name: Pull Docker image
uses: pytorch/test-infra/.github/actions/pull-docker-image@main
uses: pytorch/test-infra/.github/actions/pull-docker-image@release/2.5
with:
docker-image: pytorch/libtorch-cxx11-builder:rocm6.1-main
docker-image: pytorch/libtorch-cxx11-builder:rocm6.1-2.5
- name: Test Pytorch binary
uses: ./pytorch/.github/actions/test-pytorch-binary
- name: Teardown ROCm
@ -422,7 +421,7 @@ jobs:
DESIRED_CUDA: rocm6.1
GPU_ARCH_VERSION: 6.1
GPU_ARCH_TYPE: rocm
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:rocm6.1-main
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:rocm6.1-2.5
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: cxx11-abi
build_name: libtorch-rocm6_1-shared-with-deps-cxx11-abi
@ -445,7 +444,7 @@ jobs:
DESIRED_CUDA: rocm6.2
GPU_ARCH_VERSION: 6.2
GPU_ARCH_TYPE: rocm
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:rocm6.2-main
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:rocm6.2-2.5
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: cxx11-abi
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
@ -470,7 +469,7 @@ jobs:
GPU_ARCH_VERSION: 6.2
GPU_ARCH_TYPE: rocm
SKIP_ALL_TESTS: 1
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:rocm6.2-main
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:rocm6.2-2.5
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: cxx11-abi
steps:
@ -484,7 +483,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -496,7 +494,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@@ -510,9 +508,9 @@ jobs:
run: |
echo "GPU_FLAG=--device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon" >> "${GITHUB_ENV}"
- name: Pull Docker image
uses: pytorch/test-infra/.github/actions/pull-docker-image@main
uses: pytorch/test-infra/.github/actions/pull-docker-image@release/2.5
with:
docker-image: pytorch/libtorch-cxx11-builder:rocm6.2-main
docker-image: pytorch/libtorch-cxx11-builder:rocm6.2-2.5
- name: Test Pytorch binary
uses: ./pytorch/.github/actions/test-pytorch-binary
- name: Teardown ROCm
@@ -532,7 +530,7 @@ jobs:
DESIRED_CUDA: rocm6.2
GPU_ARCH_VERSION: 6.2
GPU_ARCH_TYPE: rocm
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:rocm6.2-main
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:rocm6.2-2.5
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: cxx11-abi
build_name: libtorch-rocm6_2-shared-with-deps-cxx11-abi
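
The change above repeats across every job in this workflow: each builder image moves from the floating `-main` tag to the release-pinned `-2.5` tag. As a hedged sketch, one way to sanity-check that the pinned tags actually exist before landing such a retag (image names are copied from the diff; anonymous Docker Hub access and a recent Docker CLI are assumed):

```bash
# Confirm the release-pinned builder tags referenced in this diff resolve.
# 'docker manifest inspect' queries the registry without pulling layers.
for img in \
    pytorch/libtorch-cxx11-builder:cuda12.4-2.5 \
    pytorch/libtorch-cxx11-builder:rocm6.1-2.5 \
    pytorch/libtorch-cxx11-builder:rocm6.2-2.5; do
  docker manifest inspect "$img" > /dev/null && echo "OK  $img"
done
```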


@@ -52,7 +52,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-main
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-2.5
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: pre-cxx11
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
@@ -74,7 +74,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-main
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-2.5
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: pre-cxx11
build_name: libtorch-cpu-shared-with-deps-pre-cxx11


@@ -57,7 +57,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-main
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-2.5
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: pre-cxx11
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
@@ -79,7 +79,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-main
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-2.5
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: pre-cxx11
build_name: libtorch-cpu-shared-with-deps-pre-cxx11
@@ -102,7 +102,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-main
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-2.5
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: pre-cxx11
build_name: libtorch-cpu-shared-with-deps-pre-cxx11
@@ -125,7 +125,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-main
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-2.5
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: pre-cxx11
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
@@ -148,7 +148,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-main
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-2.5
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: pre-cxx11
build_name: libtorch-cuda11_8-shared-with-deps-pre-cxx11
@@ -172,7 +172,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-main
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-2.5
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: pre-cxx11
build_name: libtorch-cuda11_8-shared-with-deps-pre-cxx11
@@ -195,7 +195,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-main
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-2.5
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: pre-cxx11
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
@@ -218,7 +218,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-main
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-2.5
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: pre-cxx11
build_name: libtorch-cuda12_1-shared-with-deps-pre-cxx11
@@ -242,7 +242,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-main
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-2.5
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: pre-cxx11
build_name: libtorch-cuda12_1-shared-with-deps-pre-cxx11
@@ -265,7 +265,7 @@ jobs:
DESIRED_CUDA: cu124
GPU_ARCH_VERSION: 12.4
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.4-main
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.4-2.5
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: pre-cxx11
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
@@ -288,7 +288,7 @@ jobs:
DESIRED_CUDA: cu124
GPU_ARCH_VERSION: 12.4
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.4-main
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.4-2.5
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: pre-cxx11
build_name: libtorch-cuda12_4-shared-with-deps-pre-cxx11
@@ -312,7 +312,7 @@ jobs:
DESIRED_CUDA: cu124
GPU_ARCH_VERSION: 12.4
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.4-main
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.4-2.5
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: pre-cxx11
build_name: libtorch-cuda12_4-shared-with-deps-pre-cxx11
@@ -335,7 +335,7 @@ jobs:
DESIRED_CUDA: rocm6.1
GPU_ARCH_VERSION: 6.1
GPU_ARCH_TYPE: rocm
DOCKER_IMAGE: pytorch/manylinux-builder:rocm6.1-main
DOCKER_IMAGE: pytorch/manylinux-builder:rocm6.1-2.5
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: pre-cxx11
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
@@ -360,7 +360,7 @@ jobs:
GPU_ARCH_VERSION: 6.1
GPU_ARCH_TYPE: rocm
SKIP_ALL_TESTS: 1
DOCKER_IMAGE: pytorch/manylinux-builder:rocm6.1-main
DOCKER_IMAGE: pytorch/manylinux-builder:rocm6.1-2.5
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: pre-cxx11
steps:
@@ -374,7 +374,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@@ -386,7 +385,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@@ -400,9 +399,9 @@ jobs:
run: |
echo "GPU_FLAG=--device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon" >> "${GITHUB_ENV}"
- name: Pull Docker image
uses: pytorch/test-infra/.github/actions/pull-docker-image@main
uses: pytorch/test-infra/.github/actions/pull-docker-image@release/2.5
with:
docker-image: pytorch/manylinux-builder:rocm6.1-main
docker-image: pytorch/manylinux-builder:rocm6.1-2.5
- name: Test Pytorch binary
uses: ./pytorch/.github/actions/test-pytorch-binary
- name: Teardown ROCm
@@ -422,7 +421,7 @@ jobs:
DESIRED_CUDA: rocm6.1
GPU_ARCH_VERSION: 6.1
GPU_ARCH_TYPE: rocm
DOCKER_IMAGE: pytorch/manylinux-builder:rocm6.1-main
DOCKER_IMAGE: pytorch/manylinux-builder:rocm6.1-2.5
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: pre-cxx11
build_name: libtorch-rocm6_1-shared-with-deps-pre-cxx11
@@ -445,7 +444,7 @@ jobs:
DESIRED_CUDA: rocm6.2
GPU_ARCH_VERSION: 6.2
GPU_ARCH_TYPE: rocm
DOCKER_IMAGE: pytorch/manylinux-builder:rocm6.2-main
DOCKER_IMAGE: pytorch/manylinux-builder:rocm6.2-2.5
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: pre-cxx11
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
@@ -470,7 +469,7 @@ jobs:
GPU_ARCH_VERSION: 6.2
GPU_ARCH_TYPE: rocm
SKIP_ALL_TESTS: 1
DOCKER_IMAGE: pytorch/manylinux-builder:rocm6.2-main
DOCKER_IMAGE: pytorch/manylinux-builder:rocm6.2-2.5
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: pre-cxx11
steps:
@@ -484,7 +483,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@@ -496,7 +494,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@@ -510,9 +508,9 @@ jobs:
run: |
echo "GPU_FLAG=--device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon" >> "${GITHUB_ENV}"
- name: Pull Docker image
uses: pytorch/test-infra/.github/actions/pull-docker-image@main
uses: pytorch/test-infra/.github/actions/pull-docker-image@release/2.5
with:
docker-image: pytorch/manylinux-builder:rocm6.2-main
docker-image: pytorch/manylinux-builder:rocm6.2-2.5
- name: Test Pytorch binary
uses: ./pytorch/.github/actions/test-pytorch-binary
- name: Teardown ROCm
@@ -532,7 +530,7 @@ jobs:
DESIRED_CUDA: rocm6.2
GPU_ARCH_VERSION: 6.2
GPU_ARCH_TYPE: rocm
DOCKER_IMAGE: pytorch/manylinux-builder:rocm6.2-main
DOCKER_IMAGE: pytorch/manylinux-builder:rocm6.2-2.5
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: pre-cxx11
build_name: libtorch-rocm6_2-shared-with-deps-pre-cxx11


@@ -53,7 +53,8 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-main
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-2.5
use_split_build: False
DESIRED_PYTHON: "3.9"
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
build_name: manywheel-py3_9-cuda11_8
@@ -76,7 +77,8 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-main
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-2.5
use_split_build: False
DESIRED_PYTHON: "3.9"
build_name: manywheel-py3_9-cuda11_8
build_environment: linux-binary-manywheel
@@ -85,53 +87,6 @@ jobs:
secrets:
github-token: ${{ secrets.GITHUB_TOKEN }}
manywheel-py3_9-cuda11_8-split-build:
if: ${{ github.repository_owner == 'pytorch' }}
uses: ./.github/workflows/_binary-build-linux.yml
needs: get-label-type
with:
PYTORCH_ROOT: /pytorch
BUILDER_ROOT: /builder
PACKAGE_TYPE: manywheel
# TODO: This is a legacy variable that we eventually want to get rid of in
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-main
use_split_build: True
DESIRED_PYTHON: "3.9"
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
build_name: manywheel-py3_9-cuda11_8-split
build_environment: linux-binary-manywheel
PYTORCH_EXTRA_INSTALL_REQUIREMENTS: nvidia-cuda-nvrtc-cu11==11.8.89; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-runtime-cu11==11.8.89; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-cupti-cu11==11.8.87; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cudnn-cu11==9.1.0.70; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cublas-cu11==11.11.3.6; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cufft-cu11==10.9.0.58; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-curand-cu11==10.3.0.86; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusolver-cu11==11.4.1.48; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusparse-cu11==11.7.5.86; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nccl-cu11==2.21.5; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvtx-cu11==11.8.86; platform_system == 'Linux' and platform_machine == 'x86_64'
secrets:
github-token: ${{ secrets.GITHUB_TOKEN }}
manywheel-py3_9-cuda11_8-split-test: # Testing
if: ${{ github.repository_owner == 'pytorch' }}
needs:
- manywheel-py3_9-cuda11_8-split-build
- get-label-type
uses: ./.github/workflows/_binary-test-linux.yml
with:
PYTORCH_ROOT: /pytorch
BUILDER_ROOT: /builder
PACKAGE_TYPE: manywheel
# TODO: This is a legacy variable that we eventually want to get rid of in
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-main
use_split_build: True
DESIRED_PYTHON: "3.9"
build_name: manywheel-py3_9-cuda11_8-split
build_environment: linux-binary-manywheel
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
runs_on: linux.4xlarge.nvidia.gpu
secrets:
github-token: ${{ secrets.GITHUB_TOKEN }}
manywheel-py3_9-cuda12_1-build:
if: ${{ github.repository_owner == 'pytorch' }}
uses: ./.github/workflows/_binary-build-linux.yml
@@ -145,7 +100,8 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-main
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-2.5
use_split_build: False
DESIRED_PYTHON: "3.9"
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
build_name: manywheel-py3_9-cuda12_1
@@ -168,7 +124,8 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-main
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-2.5
use_split_build: False
DESIRED_PYTHON: "3.9"
build_name: manywheel-py3_9-cuda12_1
build_environment: linux-binary-manywheel
@@ -177,53 +134,6 @@ jobs:
secrets:
github-token: ${{ secrets.GITHUB_TOKEN }}
manywheel-py3_9-cuda12_1-split-build:
if: ${{ github.repository_owner == 'pytorch' }}
uses: ./.github/workflows/_binary-build-linux.yml
needs: get-label-type
with:
PYTORCH_ROOT: /pytorch
BUILDER_ROOT: /builder
PACKAGE_TYPE: manywheel
# TODO: This is a legacy variable that we eventually want to get rid of in
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-main
use_split_build: True
DESIRED_PYTHON: "3.9"
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
build_name: manywheel-py3_9-cuda12_1-split
build_environment: linux-binary-manywheel
PYTORCH_EXTRA_INSTALL_REQUIREMENTS: nvidia-cuda-nvrtc-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-runtime-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-cupti-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cudnn-cu12==9.1.0.70; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cublas-cu12==12.1.3.1; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cufft-cu12==11.0.2.54; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-curand-cu12==10.3.2.106; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusolver-cu12==11.4.5.107; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusparse-cu12==12.1.0.106; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nccl-cu12==2.21.5; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvtx-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64'
secrets:
github-token: ${{ secrets.GITHUB_TOKEN }}
manywheel-py3_9-cuda12_1-split-test: # Testing
if: ${{ github.repository_owner == 'pytorch' }}
needs:
- manywheel-py3_9-cuda12_1-split-build
- get-label-type
uses: ./.github/workflows/_binary-test-linux.yml
with:
PYTORCH_ROOT: /pytorch
BUILDER_ROOT: /builder
PACKAGE_TYPE: manywheel
# TODO: This is a legacy variable that we eventually want to get rid of in
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-main
use_split_build: True
DESIRED_PYTHON: "3.9"
build_name: manywheel-py3_9-cuda12_1-split
build_environment: linux-binary-manywheel
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
runs_on: linux.4xlarge.nvidia.gpu
secrets:
github-token: ${{ secrets.GITHUB_TOKEN }}
manywheel-py3_9-cuda12_4-build:
if: ${{ github.repository_owner == 'pytorch' }}
uses: ./.github/workflows/_binary-build-linux.yml
@@ -237,7 +147,8 @@ jobs:
DESIRED_CUDA: cu124
GPU_ARCH_VERSION: 12.4
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.4-main
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.4-2.5
use_split_build: False
DESIRED_PYTHON: "3.9"
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
build_name: manywheel-py3_9-cuda12_4
@@ -260,7 +171,8 @@ jobs:
DESIRED_CUDA: cu124
GPU_ARCH_VERSION: 12.4
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.4-main
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.4-2.5
use_split_build: False
DESIRED_PYTHON: "3.9"
build_name: manywheel-py3_9-cuda12_4
build_environment: linux-binary-manywheel
@@ -268,50 +180,3 @@ jobs:
runs_on: linux.4xlarge.nvidia.gpu
secrets:
github-token: ${{ secrets.GITHUB_TOKEN }}
manywheel-py3_9-cuda12_4-split-build:
if: ${{ github.repository_owner == 'pytorch' }}
uses: ./.github/workflows/_binary-build-linux.yml
needs: get-label-type
with:
PYTORCH_ROOT: /pytorch
BUILDER_ROOT: /builder
PACKAGE_TYPE: manywheel
# TODO: This is a legacy variable that we eventually want to get rid of in
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cu124
GPU_ARCH_VERSION: 12.4
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.4-main
use_split_build: True
DESIRED_PYTHON: "3.9"
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
build_name: manywheel-py3_9-cuda12_4-split
build_environment: linux-binary-manywheel
PYTORCH_EXTRA_INSTALL_REQUIREMENTS: nvidia-cuda-nvrtc-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-runtime-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-cupti-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cudnn-cu12==9.1.0.70; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cublas-cu12==12.4.5.8; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cufft-cu12==11.2.1.3; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-curand-cu12==10.3.5.147; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusolver-cu12==11.6.1.9; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusparse-cu12==12.3.1.170; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nccl-cu12==2.21.5; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvtx-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvjitlink-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64'
secrets:
github-token: ${{ secrets.GITHUB_TOKEN }}
manywheel-py3_9-cuda12_4-split-test: # Testing
if: ${{ github.repository_owner == 'pytorch' }}
needs:
- manywheel-py3_9-cuda12_4-split-build
- get-label-type
uses: ./.github/workflows/_binary-test-linux.yml
with:
PYTORCH_ROOT: /pytorch
BUILDER_ROOT: /builder
PACKAGE_TYPE: manywheel
# TODO: This is a legacy variable that we eventually want to get rid of in
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cu124
GPU_ARCH_VERSION: 12.4
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.4-main
use_split_build: True
DESIRED_PYTHON: "3.9"
build_name: manywheel-py3_9-cuda12_4-split
build_environment: linux-binary-manywheel
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
runs_on: linux.4xlarge.nvidia.gpu
secrets:
github-token: ${{ secrets.GITHUB_TOKEN }}

File diff suppressed because it is too large


@@ -0,0 +1,182 @@
# @generated DO NOT EDIT MANUALLY
# Template is at: .github/templates/linux_binary_build_workflow.yml.j2
# Generation script: .github/scripts/generate_ci_workflows.py
name: linux-binary-manywheel-split
on:
push:
branches:
- main
tags:
- 'ciflow/periodic/*'
workflow_dispatch:
env:
# Needed for conda builds
ALPINE_IMAGE: "308535385114.dkr.ecr.us-east-1.amazonaws.com/tool/alpine"
ANACONDA_USER: pytorch
AWS_DEFAULT_REGION: us-east-1
BINARY_ENV_FILE: /tmp/env
BUILD_ENVIRONMENT: linux-binary-manywheel-split
BUILDER_ROOT: /builder
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
PR_NUMBER: ${{ github.event.pull_request.number }}
PYTORCH_FINAL_PACKAGE_DIR: /artifacts
PYTORCH_ROOT: /pytorch
SHA1: ${{ github.event.pull_request.head.sha || github.sha }}
SKIP_ALL_TESTS: 0
concurrency:
group: linux-binary-manywheel-split-${{ github.event.pull_request.number || github.ref_name }}-${{ github.ref_type == 'branch' && github.sha }}-${{ github.event_name == 'workflow_dispatch' }}
cancel-in-progress: true
jobs:
get-label-type:
name: get-label-type
uses: ./.github/workflows/_runner-determinator.yml
with:
triggering_actor: ${{ github.triggering_actor }}
issue_owner: ${{ github.event.pull_request.user.login || github.event.issue.user.login }}
curr_branch: ${{ github.head_ref || github.ref_name }}
curr_ref_type: ${{ github.ref_type }}
manywheel-py3_9-cuda11_8-build:
if: ${{ github.repository_owner == 'pytorch' }}
uses: ./.github/workflows/_binary-build-linux.yml
needs: get-label-type
with:
PYTORCH_ROOT: /pytorch
BUILDER_ROOT: /builder
PACKAGE_TYPE: manywheel
# TODO: This is a legacy variable that we eventually want to get rid of in
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-2.5
use_split_build: True
DESIRED_PYTHON: "3.9"
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
build_name: manywheel-py3_9-cuda11_8
build_environment: linux-binary-manywheel-split
PYTORCH_EXTRA_INSTALL_REQUIREMENTS: nvidia-cuda-nvrtc-cu11==11.8.89; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-runtime-cu11==11.8.89; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-cupti-cu11==11.8.87; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cudnn-cu11==9.1.0.70; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cublas-cu11==11.11.3.6; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cufft-cu11==10.9.0.58; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-curand-cu11==10.3.0.86; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusolver-cu11==11.4.1.48; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusparse-cu11==11.7.5.86; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nccl-cu11==2.21.5; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvtx-cu11==11.8.86; platform_system == 'Linux' and platform_machine == 'x86_64'
secrets:
github-token: ${{ secrets.GITHUB_TOKEN }}
manywheel-py3_9-cuda11_8-test: # Testing
if: ${{ github.repository_owner == 'pytorch' }}
needs:
- manywheel-py3_9-cuda11_8-build
- get-label-type
uses: ./.github/workflows/_binary-test-linux.yml
with:
PYTORCH_ROOT: /pytorch
BUILDER_ROOT: /builder
PACKAGE_TYPE: manywheel
# TODO: This is a legacy variable that we eventually want to get rid of in
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-2.5
use_split_build: True
DESIRED_PYTHON: "3.9"
build_name: manywheel-py3_9-cuda11_8
build_environment: linux-binary-manywheel-split
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
runs_on: linux.4xlarge.nvidia.gpu
secrets:
github-token: ${{ secrets.GITHUB_TOKEN }}
manywheel-py3_9-cuda12_1-build:
if: ${{ github.repository_owner == 'pytorch' }}
uses: ./.github/workflows/_binary-build-linux.yml
needs: get-label-type
with:
PYTORCH_ROOT: /pytorch
BUILDER_ROOT: /builder
PACKAGE_TYPE: manywheel
# TODO: This is a legacy variable that we eventually want to get rid of in
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-2.5
use_split_build: True
DESIRED_PYTHON: "3.9"
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
build_name: manywheel-py3_9-cuda12_1
build_environment: linux-binary-manywheel-split
PYTORCH_EXTRA_INSTALL_REQUIREMENTS: nvidia-cuda-nvrtc-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-runtime-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-cupti-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cudnn-cu12==9.1.0.70; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cublas-cu12==12.1.3.1; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cufft-cu12==11.0.2.54; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-curand-cu12==10.3.2.106; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusolver-cu12==11.4.5.107; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusparse-cu12==12.1.0.106; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nccl-cu12==2.21.5; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvtx-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64'
secrets:
github-token: ${{ secrets.GITHUB_TOKEN }}
manywheel-py3_9-cuda12_1-test: # Testing
if: ${{ github.repository_owner == 'pytorch' }}
needs:
- manywheel-py3_9-cuda12_1-build
- get-label-type
uses: ./.github/workflows/_binary-test-linux.yml
with:
PYTORCH_ROOT: /pytorch
BUILDER_ROOT: /builder
PACKAGE_TYPE: manywheel
# TODO: This is a legacy variable that we eventually want to get rid of in
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-2.5
use_split_build: True
DESIRED_PYTHON: "3.9"
build_name: manywheel-py3_9-cuda12_1
build_environment: linux-binary-manywheel-split
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
runs_on: linux.4xlarge.nvidia.gpu
secrets:
github-token: ${{ secrets.GITHUB_TOKEN }}
manywheel-py3_9-cuda12_4-build:
if: ${{ github.repository_owner == 'pytorch' }}
uses: ./.github/workflows/_binary-build-linux.yml
needs: get-label-type
with:
PYTORCH_ROOT: /pytorch
BUILDER_ROOT: /builder
PACKAGE_TYPE: manywheel
# TODO: This is a legacy variable that we eventually want to get rid of in
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cu124
GPU_ARCH_VERSION: 12.4
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.4-2.5
use_split_build: True
DESIRED_PYTHON: "3.9"
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
build_name: manywheel-py3_9-cuda12_4
build_environment: linux-binary-manywheel-split
PYTORCH_EXTRA_INSTALL_REQUIREMENTS: nvidia-cuda-nvrtc-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-runtime-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-cupti-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cudnn-cu12==9.1.0.70; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cublas-cu12==12.4.5.8; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cufft-cu12==11.2.1.3; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-curand-cu12==10.3.5.147; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusolver-cu12==11.6.1.9; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusparse-cu12==12.3.1.170; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nccl-cu12==2.21.5; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvtx-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvjitlink-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64'
secrets:
github-token: ${{ secrets.GITHUB_TOKEN }}
manywheel-py3_9-cuda12_4-test: # Testing
if: ${{ github.repository_owner == 'pytorch' }}
needs:
- manywheel-py3_9-cuda12_4-build
- get-label-type
uses: ./.github/workflows/_binary-test-linux.yml
with:
PYTORCH_ROOT: /pytorch
BUILDER_ROOT: /builder
PACKAGE_TYPE: manywheel
# TODO: This is a legacy variable that we eventually want to get rid of in
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cu124
GPU_ARCH_VERSION: 12.4
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.4-2.5
use_split_build: True
DESIRED_PYTHON: "3.9"
build_name: manywheel-py3_9-cuda12_4
build_environment: linux-binary-manywheel-split
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
runs_on: linux.4xlarge.nvidia.gpu
secrets:
github-token: ${{ secrets.GITHUB_TOKEN }}
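
The `linux-binary-manywheel-split` workflow above is generated output, not hand-written YAML: per its banner, it is rendered from `.github/templates/linux_binary_build_workflow.yml.j2` by `.github/scripts/generate_ci_workflows.py`. A minimal sketch of the regeneration step, assuming it is run from the repository root in a Python environment with the script's dependencies installed:

```bash
# Re-render all @generated workflow files from their Jinja templates,
# then show any drift between the templates and the checked-in YAML.
python .github/scripts/generate_ci_workflows.py
git diff -- .github/workflows/
```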

File diff suppressed because it is too large


@@ -57,13 +57,14 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu-s390x
DOCKER_IMAGE: pytorch/manylinuxs390x-builder:cpu-s390x-main
DOCKER_IMAGE: pytorch/manylinuxs390x-builder:cpu-s390x-2.5
use_split_build: False
DESIRED_PYTHON: "3.9"
runs_on: linux.s390x
ALPINE_IMAGE: "docker.io/s390x/alpine"
build_name: manywheel-py3_9-cpu-s390x
build_environment: linux-s390x-binary-manywheel
PYTORCH_EXTRA_INSTALL_REQUIREMENTS: nvidia-cuda-nvrtc-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-runtime-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-cupti-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cudnn-cu12==9.1.0.70; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cublas-cu12==12.1.3.1; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cufft-cu12==11.0.2.54; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-curand-cu12==10.3.2.106; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusolver-cu12==11.4.5.107; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusparse-cu12==12.1.0.106; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nccl-cu12==2.21.5; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvtx-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64'
PYTORCH_EXTRA_INSTALL_REQUIREMENTS: nvidia-cuda-nvrtc-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-runtime-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-cupti-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cudnn-cu12==9.1.0.70; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cublas-cu12==12.4.5.8; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cufft-cu12==11.2.1.3; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-curand-cu12==10.3.5.147; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusolver-cu12==11.6.1.9; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusparse-cu12==12.3.1.170; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nccl-cu12==2.21.5; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvtx-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvjitlink-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64'
secrets:
github-token: ${{ secrets.GITHUB_TOKEN }}
manywheel-py3_9-cpu-s390x-test: # Testing
@@ -80,7 +81,8 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu-s390x
DOCKER_IMAGE: pytorch/manylinuxs390x-builder:cpu-s390x-main
DOCKER_IMAGE: pytorch/manylinuxs390x-builder:cpu-s390x-2.5
use_split_build: False
DESIRED_PYTHON: "3.9"
build_name: manywheel-py3_9-cpu-s390x
build_environment: linux-s390x-binary-manywheel
@@ -102,7 +104,8 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu-s390x
DOCKER_IMAGE: pytorch/manylinuxs390x-builder:cpu-s390x-main
DOCKER_IMAGE: pytorch/manylinuxs390x-builder:cpu-s390x-2.5
use_split_build: False
DESIRED_PYTHON: "3.9"
build_name: manywheel-py3_9-cpu-s390x
secrets:
@@ -123,13 +126,14 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu-s390x
DOCKER_IMAGE: pytorch/manylinuxs390x-builder:cpu-s390x-main
DOCKER_IMAGE: pytorch/manylinuxs390x-builder:cpu-s390x-2.5
use_split_build: False
DESIRED_PYTHON: "3.10"
runs_on: linux.s390x
ALPINE_IMAGE: "docker.io/s390x/alpine"
build_name: manywheel-py3_10-cpu-s390x
build_environment: linux-s390x-binary-manywheel
PYTORCH_EXTRA_INSTALL_REQUIREMENTS: nvidia-cuda-nvrtc-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-runtime-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-cupti-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cudnn-cu12==9.1.0.70; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cublas-cu12==12.1.3.1; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cufft-cu12==11.0.2.54; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-curand-cu12==10.3.2.106; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusolver-cu12==11.4.5.107; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusparse-cu12==12.1.0.106; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nccl-cu12==2.21.5; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvtx-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64'
PYTORCH_EXTRA_INSTALL_REQUIREMENTS: nvidia-cuda-nvrtc-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-runtime-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-cupti-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cudnn-cu12==9.1.0.70; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cublas-cu12==12.4.5.8; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cufft-cu12==11.2.1.3; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-curand-cu12==10.3.5.147; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusolver-cu12==11.6.1.9; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusparse-cu12==12.3.1.170; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nccl-cu12==2.21.5; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvtx-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvjitlink-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64'
secrets:
github-token: ${{ secrets.GITHUB_TOKEN }}
manywheel-py3_10-cpu-s390x-test: # Testing
@@ -146,7 +150,8 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu-s390x
DOCKER_IMAGE: pytorch/manylinuxs390x-builder:cpu-s390x-main
DOCKER_IMAGE: pytorch/manylinuxs390x-builder:cpu-s390x-2.5
use_split_build: False
DESIRED_PYTHON: "3.10"
build_name: manywheel-py3_10-cpu-s390x
build_environment: linux-s390x-binary-manywheel
@@ -168,7 +173,8 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu-s390x
DOCKER_IMAGE: pytorch/manylinuxs390x-builder:cpu-s390x-main
DOCKER_IMAGE: pytorch/manylinuxs390x-builder:cpu-s390x-2.5
use_split_build: False
DESIRED_PYTHON: "3.10"
build_name: manywheel-py3_10-cpu-s390x
secrets:
@@ -189,13 +195,14 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu-s390x
DOCKER_IMAGE: pytorch/manylinuxs390x-builder:cpu-s390x-main
DOCKER_IMAGE: pytorch/manylinuxs390x-builder:cpu-s390x-2.5
use_split_build: False
DESIRED_PYTHON: "3.11"
runs_on: linux.s390x
ALPINE_IMAGE: "docker.io/s390x/alpine"
build_name: manywheel-py3_11-cpu-s390x
build_environment: linux-s390x-binary-manywheel
PYTORCH_EXTRA_INSTALL_REQUIREMENTS: nvidia-cuda-nvrtc-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-runtime-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-cupti-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cudnn-cu12==9.1.0.70; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cublas-cu12==12.1.3.1; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cufft-cu12==11.0.2.54; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-curand-cu12==10.3.2.106; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusolver-cu12==11.4.5.107; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusparse-cu12==12.1.0.106; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nccl-cu12==2.21.5; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvtx-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64'
PYTORCH_EXTRA_INSTALL_REQUIREMENTS: nvidia-cuda-nvrtc-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-runtime-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-cupti-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cudnn-cu12==9.1.0.70; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cublas-cu12==12.4.5.8; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cufft-cu12==11.2.1.3; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-curand-cu12==10.3.5.147; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusolver-cu12==11.6.1.9; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusparse-cu12==12.3.1.170; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nccl-cu12==2.21.5; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvtx-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvjitlink-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64'
secrets:
github-token: ${{ secrets.GITHUB_TOKEN }}
manywheel-py3_11-cpu-s390x-test: # Testing
@@ -212,7 +219,8 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu-s390x
DOCKER_IMAGE: pytorch/manylinuxs390x-builder:cpu-s390x-main
DOCKER_IMAGE: pytorch/manylinuxs390x-builder:cpu-s390x-2.5
use_split_build: False
DESIRED_PYTHON: "3.11"
build_name: manywheel-py3_11-cpu-s390x
build_environment: linux-s390x-binary-manywheel
@@ -234,7 +242,8 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu-s390x
DOCKER_IMAGE: pytorch/manylinuxs390x-builder:cpu-s390x-main
DOCKER_IMAGE: pytorch/manylinuxs390x-builder:cpu-s390x-2.5
use_split_build: False
DESIRED_PYTHON: "3.11"
build_name: manywheel-py3_11-cpu-s390x
secrets:
@@ -255,13 +264,14 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu-s390x
DOCKER_IMAGE: pytorch/manylinuxs390x-builder:cpu-s390x-main
DOCKER_IMAGE: pytorch/manylinuxs390x-builder:cpu-s390x-2.5
use_split_build: False
DESIRED_PYTHON: "3.12"
runs_on: linux.s390x
ALPINE_IMAGE: "docker.io/s390x/alpine"
build_name: manywheel-py3_12-cpu-s390x
build_environment: linux-s390x-binary-manywheel
PYTORCH_EXTRA_INSTALL_REQUIREMENTS: nvidia-cuda-nvrtc-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-runtime-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-cupti-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cudnn-cu12==9.1.0.70; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cublas-cu12==12.1.3.1; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cufft-cu12==11.0.2.54; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-curand-cu12==10.3.2.106; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusolver-cu12==11.4.5.107; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusparse-cu12==12.1.0.106; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nccl-cu12==2.21.5; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvtx-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64'
PYTORCH_EXTRA_INSTALL_REQUIREMENTS: nvidia-cuda-nvrtc-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-runtime-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-cupti-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cudnn-cu12==9.1.0.70; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cublas-cu12==12.4.5.8; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cufft-cu12==11.2.1.3; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-curand-cu12==10.3.5.147; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusolver-cu12==11.6.1.9; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusparse-cu12==12.3.1.170; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nccl-cu12==2.21.5; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvtx-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvjitlink-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64'
secrets:
github-token: ${{ secrets.GITHUB_TOKEN }}
manywheel-py3_12-cpu-s390x-test: # Testing
@@ -278,7 +288,8 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu-s390x
DOCKER_IMAGE: pytorch/manylinuxs390x-builder:cpu-s390x-main
DOCKER_IMAGE: pytorch/manylinuxs390x-builder:cpu-s390x-2.5
use_split_build: False
DESIRED_PYTHON: "3.12"
build_name: manywheel-py3_12-cpu-s390x
build_environment: linux-s390x-binary-manywheel
@@ -300,7 +311,8 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu-s390x
DOCKER_IMAGE: pytorch/manylinuxs390x-builder:cpu-s390x-main
DOCKER_IMAGE: pytorch/manylinuxs390x-builder:cpu-s390x-2.5
use_split_build: False
DESIRED_PYTHON: "3.12"
build_name: manywheel-py3_12-cpu-s390x
secrets:
@@ -321,13 +333,14 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu-s390x
DOCKER_IMAGE: pytorch/manylinuxs390x-builder:cpu-s390x-main
DOCKER_IMAGE: pytorch/manylinuxs390x-builder:cpu-s390x-2.5
use_split_build: False
DESIRED_PYTHON: "3.13"
runs_on: linux.s390x
ALPINE_IMAGE: "docker.io/s390x/alpine"
build_name: manywheel-py3_13-cpu-s390x
build_environment: linux-s390x-binary-manywheel
PYTORCH_EXTRA_INSTALL_REQUIREMENTS: nvidia-cuda-nvrtc-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-runtime-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-cupti-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cudnn-cu12==9.1.0.70; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cublas-cu12==12.1.3.1; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cufft-cu12==11.0.2.54; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-curand-cu12==10.3.2.106; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusolver-cu12==11.4.5.107; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusparse-cu12==12.1.0.106; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nccl-cu12==2.21.5; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvtx-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64'
PYTORCH_EXTRA_INSTALL_REQUIREMENTS: nvidia-cuda-nvrtc-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-runtime-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-cupti-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cudnn-cu12==9.1.0.70; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cublas-cu12==12.4.5.8; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cufft-cu12==11.2.1.3; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-curand-cu12==10.3.5.147; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusolver-cu12==11.6.1.9; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusparse-cu12==12.3.1.170; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nccl-cu12==2.21.5; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvtx-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvjitlink-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64'
secrets:
github-token: ${{ secrets.GITHUB_TOKEN }}
manywheel-py3_13-cpu-s390x-test: # Testing
@@ -344,7 +357,8 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu-s390x
DOCKER_IMAGE: pytorch/manylinuxs390x-builder:cpu-s390x-main
DOCKER_IMAGE: pytorch/manylinuxs390x-builder:cpu-s390x-2.5
use_split_build: False
DESIRED_PYTHON: "3.13"
build_name: manywheel-py3_13-cpu-s390x
build_environment: linux-s390x-binary-manywheel
@@ -366,7 +380,8 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu-s390x
DOCKER_IMAGE: pytorch/manylinuxs390x-builder:cpu-s390x-main
DOCKER_IMAGE: pytorch/manylinuxs390x-builder:cpu-s390x-2.5
use_split_build: False
DESIRED_PYTHON: "3.13"
build_name: manywheel-py3_13-cpu-s390x
secrets:


@@ -74,7 +74,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@@ -86,7 +85,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@@ -117,7 +116,7 @@ jobs:
# shellcheck disable=SC1091
source "${RUNNER_TEMP}/anaconda/bin/activate"
"${PYTORCH_ROOT}/.circleci/scripts/binary_macos_build.sh"
- uses: actions/upload-artifact@v3
- uses: actions/upload-artifact@v4.4.0
if: always()
with:
name: conda-py3_9-cpu
@@ -138,7 +137,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/conda-builder:cpu-main
DOCKER_IMAGE: pytorch/conda-builder:cpu-2.5
DESIRED_PYTHON: "3.9"
build_name: conda-py3_9-cpu
use_s3: False
@@ -189,7 +188,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@@ -201,7 +199,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@@ -232,7 +230,7 @@ jobs:
# shellcheck disable=SC1091
source "${RUNNER_TEMP}/anaconda/bin/activate"
"${PYTORCH_ROOT}/.circleci/scripts/binary_macos_build.sh"
- uses: actions/upload-artifact@v3
- uses: actions/upload-artifact@v4.4.0
if: always()
with:
name: conda-py3_10-cpu
@@ -253,7 +251,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/conda-builder:cpu-main
DOCKER_IMAGE: pytorch/conda-builder:cpu-2.5
DESIRED_PYTHON: "3.10"
build_name: conda-py3_10-cpu
use_s3: False
@@ -304,7 +302,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@@ -316,7 +313,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@@ -347,7 +344,7 @@ jobs:
# shellcheck disable=SC1091
source "${RUNNER_TEMP}/anaconda/bin/activate"
"${PYTORCH_ROOT}/.circleci/scripts/binary_macos_build.sh"
- uses: actions/upload-artifact@v3
- uses: actions/upload-artifact@v4.4.0
if: always()
with:
name: conda-py3_11-cpu
@@ -368,7 +365,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/conda-builder:cpu-main
DOCKER_IMAGE: pytorch/conda-builder:cpu-2.5
DESIRED_PYTHON: "3.11"
build_name: conda-py3_11-cpu
use_s3: False
@@ -419,7 +416,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@@ -431,7 +427,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@@ -462,7 +458,7 @@ jobs:
# shellcheck disable=SC1091
source "${RUNNER_TEMP}/anaconda/bin/activate"
"${PYTORCH_ROOT}/.circleci/scripts/binary_macos_build.sh"
- uses: actions/upload-artifact@v3
- uses: actions/upload-artifact@v4.4.0
if: always()
with:
name: conda-py3_12-cpu
@@ -483,7 +479,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/conda-builder:cpu-main
DOCKER_IMAGE: pytorch/conda-builder:cpu-2.5
DESIRED_PYTHON: "3.12"
build_name: conda-py3_12-cpu
use_s3: False


@@ -49,7 +49,7 @@ jobs:
DESIRED_DEVTOOLSET: cxx11-abi
# This is a dummy value for libtorch to work correctly with our batch scripts
# without this value pip does not get installed for some reason
DESIRED_PYTHON: "3.8"
DESIRED_PYTHON: "3.9"
steps:
# NOTE: These environment variables are put here so that they can be applied on every job equally
# They are also here because setting them at a workflow level doesn't give us access to the
@@ -78,7 +78,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@@ -90,7 +89,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@@ -121,7 +120,7 @@ jobs:
# shellcheck disable=SC1091
source "${RUNNER_TEMP}/anaconda/bin/activate"
"${PYTORCH_ROOT}/.circleci/scripts/binary_macos_build.sh"
- uses: actions/upload-artifact@v3
- uses: actions/upload-artifact@v4.4.0
if: always()
with:
name: libtorch-cpu-shared-with-deps-cxx11-abi
@@ -142,7 +141,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cpu-main
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cpu-2.5
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: cxx11-abi
build_name: libtorch-cpu-shared-with-deps-cxx11-abi


@@ -46,7 +46,7 @@ jobs:
GPU_ARCH_TYPE: cpu
SKIP_ALL_TESTS: 1
DESIRED_PYTHON: "3.9"
PYTORCH_EXTRA_INSTALL_REQUIREMENTS: nvidia-cuda-nvrtc-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-runtime-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-cupti-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cudnn-cu12==9.1.0.70; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cublas-cu12==12.1.3.1; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cufft-cu12==11.0.2.54; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-curand-cu12==10.3.2.106; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusolver-cu12==11.4.5.107; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusparse-cu12==12.1.0.106; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nccl-cu12==2.21.5; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvtx-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64'
PYTORCH_EXTRA_INSTALL_REQUIREMENTS: nvidia-cuda-nvrtc-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-runtime-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-cupti-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cudnn-cu12==9.1.0.70; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cublas-cu12==12.4.5.8; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cufft-cu12==11.2.1.3; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-curand-cu12==10.3.5.147; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusolver-cu12==11.6.1.9; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusparse-cu12==12.3.1.170; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nccl-cu12==2.21.5; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvtx-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvjitlink-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64'
steps:
# NOTE: These environment variables are put here so that they can be applied on every job equally
# They are also here because setting them at a workflow level doesn't give us access to the
@@ -75,7 +75,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@@ -87,7 +86,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@@ -118,7 +117,7 @@ jobs:
# shellcheck disable=SC1091
source "${RUNNER_TEMP}/anaconda/bin/activate"
"${PYTORCH_ROOT}/.circleci/scripts/binary_macos_build.sh"
- uses: actions/upload-artifact@v3
- uses: actions/upload-artifact@v4.4.0
if: always()
with:
name: wheel-py3_9-cpu
@@ -139,7 +138,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-main
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-2.5
DESIRED_PYTHON: "3.9"
build_name: wheel-py3_9-cpu
use_s3: False
@@ -162,7 +161,7 @@ jobs:
GPU_ARCH_TYPE: cpu
SKIP_ALL_TESTS: 1
DESIRED_PYTHON: "3.10"
PYTORCH_EXTRA_INSTALL_REQUIREMENTS: nvidia-cuda-nvrtc-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-runtime-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-cupti-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cudnn-cu12==9.1.0.70; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cublas-cu12==12.1.3.1; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cufft-cu12==11.0.2.54; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-curand-cu12==10.3.2.106; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusolver-cu12==11.4.5.107; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusparse-cu12==12.1.0.106; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nccl-cu12==2.21.5; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvtx-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64'
PYTORCH_EXTRA_INSTALL_REQUIREMENTS: nvidia-cuda-nvrtc-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-runtime-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-cupti-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cudnn-cu12==9.1.0.70; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cublas-cu12==12.4.5.8; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cufft-cu12==11.2.1.3; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-curand-cu12==10.3.5.147; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusolver-cu12==11.6.1.9; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusparse-cu12==12.3.1.170; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nccl-cu12==2.21.5; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvtx-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvjitlink-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64'
steps:
# NOTE: These environment variables are put here so that they can be applied on every job equally
# They are also here because setting them at a workflow level doesn't give us access to the
@@ -191,7 +190,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@@ -203,7 +201,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -234,7 +232,7 @@ jobs:
# shellcheck disable=SC1091
source "${RUNNER_TEMP}/anaconda/bin/activate"
"${PYTORCH_ROOT}/.circleci/scripts/binary_macos_build.sh"
- uses: actions/upload-artifact@v3
- uses: actions/upload-artifact@v4.4.0
if: always()
with:
name: wheel-py3_10-cpu
@ -255,7 +253,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-main
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-2.5
DESIRED_PYTHON: "3.10"
build_name: wheel-py3_10-cpu
use_s3: False
@ -278,7 +276,7 @@ jobs:
GPU_ARCH_TYPE: cpu
SKIP_ALL_TESTS: 1
DESIRED_PYTHON: "3.11"
PYTORCH_EXTRA_INSTALL_REQUIREMENTS: nvidia-cuda-nvrtc-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-runtime-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-cupti-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cudnn-cu12==9.1.0.70; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cublas-cu12==12.1.3.1; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cufft-cu12==11.0.2.54; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-curand-cu12==10.3.2.106; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusolver-cu12==11.4.5.107; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusparse-cu12==12.1.0.106; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nccl-cu12==2.21.5; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvtx-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64'
PYTORCH_EXTRA_INSTALL_REQUIREMENTS: nvidia-cuda-nvrtc-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-runtime-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-cupti-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cudnn-cu12==9.1.0.70; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cublas-cu12==12.4.5.8; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cufft-cu12==11.2.1.3; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-curand-cu12==10.3.5.147; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusolver-cu12==11.6.1.9; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusparse-cu12==12.3.1.170; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nccl-cu12==2.21.5; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvtx-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvjitlink-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64'
steps:
# NOTE: These environment variables are put here so that they can be applied on every job equally
# They are also here because setting them at a workflow level doesn't give us access to the
@ -307,7 +305,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -319,7 +316,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -350,7 +347,7 @@ jobs:
# shellcheck disable=SC1091
source "${RUNNER_TEMP}/anaconda/bin/activate"
"${PYTORCH_ROOT}/.circleci/scripts/binary_macos_build.sh"
- uses: actions/upload-artifact@v3
- uses: actions/upload-artifact@v4.4.0
if: always()
with:
name: wheel-py3_11-cpu
@ -371,7 +368,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-main
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-2.5
DESIRED_PYTHON: "3.11"
build_name: wheel-py3_11-cpu
use_s3: False
@ -394,7 +391,7 @@ jobs:
GPU_ARCH_TYPE: cpu
SKIP_ALL_TESTS: 1
DESIRED_PYTHON: "3.12"
PYTORCH_EXTRA_INSTALL_REQUIREMENTS: nvidia-cuda-nvrtc-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-runtime-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-cupti-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cudnn-cu12==9.1.0.70; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cublas-cu12==12.1.3.1; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cufft-cu12==11.0.2.54; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-curand-cu12==10.3.2.106; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusolver-cu12==11.4.5.107; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusparse-cu12==12.1.0.106; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nccl-cu12==2.21.5; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvtx-cu12==12.1.105; platform_system == 'Linux' and platform_machine == 'x86_64'
PYTORCH_EXTRA_INSTALL_REQUIREMENTS: nvidia-cuda-nvrtc-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-runtime-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cuda-cupti-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cudnn-cu12==9.1.0.70; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cublas-cu12==12.4.5.8; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cufft-cu12==11.2.1.3; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-curand-cu12==10.3.5.147; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusolver-cu12==11.6.1.9; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-cusparse-cu12==12.3.1.170; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nccl-cu12==2.21.5; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvtx-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64' | nvidia-nvjitlink-cu12==12.4.127; platform_system == 'Linux' and platform_machine == 'x86_64'
steps:
# NOTE: These environment variables are put here so that they can be applied on every job equally
# They are also here because setting them at a workflow level doesn't give us access to the
@ -423,7 +420,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -435,7 +431,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -466,7 +462,7 @@ jobs:
# shellcheck disable=SC1091
source "${RUNNER_TEMP}/anaconda/bin/activate"
"${PYTORCH_ROOT}/.circleci/scripts/binary_macos_build.sh"
- uses: actions/upload-artifact@v3
- uses: actions/upload-artifact@v4.4.0
if: always()
with:
name: wheel-py3_12-cpu
@ -487,7 +483,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-main
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-2.5
DESIRED_PYTHON: "3.12"
build_name: wheel-py3_12-cpu
use_s3: False


@ -71,7 +71,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -102,7 +102,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -114,7 +113,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -132,7 +131,7 @@ jobs:
shell: bash
run: |
"${PYTORCH_ROOT}/.circleci/scripts/binary_windows_build.sh"
- uses: actions/upload-artifact@v3
- uses: actions/upload-artifact@v4.4.0
if: always()
with:
name: conda-py3_9-cpu
@ -185,7 +184,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -221,7 +220,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -233,7 +231,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -317,7 +315,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -348,7 +346,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -360,7 +357,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -378,7 +375,7 @@ jobs:
shell: bash
run: |
"${PYTORCH_ROOT}/.circleci/scripts/binary_windows_build.sh"
- uses: actions/upload-artifact@v3
- uses: actions/upload-artifact@v4.4.0
if: always()
with:
name: conda-py3_9-cuda11_8
@ -432,7 +429,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -468,7 +465,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -480,7 +476,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -565,7 +561,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -596,7 +592,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -608,7 +603,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -626,7 +621,7 @@ jobs:
shell: bash
run: |
"${PYTORCH_ROOT}/.circleci/scripts/binary_windows_build.sh"
- uses: actions/upload-artifact@v3
- uses: actions/upload-artifact@v4.4.0
if: always()
with:
name: conda-py3_9-cuda12_1
@ -680,7 +675,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -716,7 +711,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -728,7 +722,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -813,7 +807,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -844,7 +838,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -856,7 +849,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -874,7 +867,7 @@ jobs:
shell: bash
run: |
"${PYTORCH_ROOT}/.circleci/scripts/binary_windows_build.sh"
- uses: actions/upload-artifact@v3
- uses: actions/upload-artifact@v4.4.0
if: always()
with:
name: conda-py3_9-cuda12_4
@ -928,7 +921,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -964,7 +957,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -976,7 +968,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -1060,7 +1052,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -1091,7 +1083,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -1103,7 +1094,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -1121,7 +1112,7 @@ jobs:
shell: bash
run: |
"${PYTORCH_ROOT}/.circleci/scripts/binary_windows_build.sh"
- uses: actions/upload-artifact@v3
- uses: actions/upload-artifact@v4.4.0
if: always()
with:
name: conda-py3_10-cpu
@ -1174,7 +1165,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -1210,7 +1201,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -1222,7 +1212,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -1306,7 +1296,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -1337,7 +1327,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -1349,7 +1338,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -1367,7 +1356,7 @@ jobs:
shell: bash
run: |
"${PYTORCH_ROOT}/.circleci/scripts/binary_windows_build.sh"
- uses: actions/upload-artifact@v3
- uses: actions/upload-artifact@v4.4.0
if: always()
with:
name: conda-py3_10-cuda11_8
@ -1421,7 +1410,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -1457,7 +1446,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -1469,7 +1457,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -1554,7 +1542,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -1585,7 +1573,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -1597,7 +1584,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -1615,7 +1602,7 @@ jobs:
shell: bash
run: |
"${PYTORCH_ROOT}/.circleci/scripts/binary_windows_build.sh"
- uses: actions/upload-artifact@v3
- uses: actions/upload-artifact@v4.4.0
if: always()
with:
name: conda-py3_10-cuda12_1
@ -1669,7 +1656,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -1705,7 +1692,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -1717,7 +1703,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -1802,7 +1788,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -1833,7 +1819,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -1845,7 +1830,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -1863,7 +1848,7 @@ jobs:
shell: bash
run: |
"${PYTORCH_ROOT}/.circleci/scripts/binary_windows_build.sh"
- uses: actions/upload-artifact@v3
- uses: actions/upload-artifact@v4.4.0
if: always()
with:
name: conda-py3_10-cuda12_4
@ -1917,7 +1902,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -1953,7 +1938,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -1965,7 +1949,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -2049,7 +2033,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -2080,7 +2064,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -2092,7 +2075,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -2110,7 +2093,7 @@ jobs:
shell: bash
run: |
"${PYTORCH_ROOT}/.circleci/scripts/binary_windows_build.sh"
- uses: actions/upload-artifact@v3
- uses: actions/upload-artifact@v4.4.0
if: always()
with:
name: conda-py3_11-cpu
@ -2163,7 +2146,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -2199,7 +2182,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -2211,7 +2193,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -2295,7 +2277,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -2326,7 +2308,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -2338,7 +2319,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -2356,7 +2337,7 @@ jobs:
shell: bash
run: |
"${PYTORCH_ROOT}/.circleci/scripts/binary_windows_build.sh"
- uses: actions/upload-artifact@v3
- uses: actions/upload-artifact@v4.4.0
if: always()
with:
name: conda-py3_11-cuda11_8
@ -2410,7 +2391,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -2446,7 +2427,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -2458,7 +2438,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -2543,7 +2523,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -2574,7 +2554,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -2586,7 +2565,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -2604,7 +2583,7 @@ jobs:
shell: bash
run: |
"${PYTORCH_ROOT}/.circleci/scripts/binary_windows_build.sh"
- uses: actions/upload-artifact@v3
- uses: actions/upload-artifact@v4.4.0
if: always()
with:
name: conda-py3_11-cuda12_1
@ -2658,7 +2637,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -2694,7 +2673,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -2706,7 +2684,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -2791,7 +2769,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -2822,7 +2800,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -2834,7 +2811,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -2852,7 +2829,7 @@ jobs:
shell: bash
run: |
"${PYTORCH_ROOT}/.circleci/scripts/binary_windows_build.sh"
- uses: actions/upload-artifact@v3
- uses: actions/upload-artifact@v4.4.0
if: always()
with:
name: conda-py3_11-cuda12_4
@ -2906,7 +2883,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -2942,7 +2919,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -2954,7 +2930,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -3038,7 +3014,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -3069,7 +3045,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -3081,7 +3056,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -3099,7 +3074,7 @@ jobs:
shell: bash
run: |
"${PYTORCH_ROOT}/.circleci/scripts/binary_windows_build.sh"
- uses: actions/upload-artifact@v3
- uses: actions/upload-artifact@v4.4.0
if: always()
with:
name: conda-py3_12-cpu
@ -3152,7 +3127,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -3188,7 +3163,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -3200,7 +3174,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -3284,7 +3258,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -3315,7 +3289,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -3327,7 +3300,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -3345,7 +3318,7 @@ jobs:
shell: bash
run: |
"${PYTORCH_ROOT}/.circleci/scripts/binary_windows_build.sh"
- uses: actions/upload-artifact@v3
- uses: actions/upload-artifact@v4.4.0
if: always()
with:
name: conda-py3_12-cuda11_8
@ -3399,7 +3372,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -3435,7 +3408,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -3447,7 +3419,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -3532,7 +3504,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -3563,7 +3535,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -3575,7 +3546,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -3593,7 +3564,7 @@ jobs:
shell: bash
run: |
"${PYTORCH_ROOT}/.circleci/scripts/binary_windows_build.sh"
- uses: actions/upload-artifact@v3
- uses: actions/upload-artifact@v4.4.0
if: always()
with:
name: conda-py3_12-cuda12_1
@ -3647,7 +3618,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -3683,7 +3654,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -3695,7 +3665,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -3780,7 +3750,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -3811,7 +3781,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -3823,7 +3792,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -3841,7 +3810,7 @@ jobs:
shell: bash
run: |
"${PYTORCH_ROOT}/.circleci/scripts/binary_windows_build.sh"
- uses: actions/upload-artifact@v3
- uses: actions/upload-artifact@v4.4.0
if: always()
with:
name: conda-py3_12-cuda12_4
@ -3895,7 +3864,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -3931,7 +3900,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -3943,7 +3911,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder


@ -51,7 +51,7 @@ jobs:
LIBTORCH_VARIANT: shared-with-deps
# This is a dummy value for libtorch to work correctly with our batch scripts
# without this value pip does not get installed for some reason
DESIRED_PYTHON: "3.8"
DESIRED_PYTHON: "3.9"
steps:
- name: Display EC2 information
shell: bash
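Across the libtorch jobs, the placeholder `DESIRED_PYTHON` is bumped from "3.8" to "3.9", consistent with the 2.5 release line targeting Python 3.9 and newer. A guard of the kind a consuming script might use, shown for illustration only:

```python
import sys

# Illustrative only: PyTorch 2.5 binaries target Python >= 3.9, so 3.8 is no
# longer a valid interpreter version here, even as a dummy value.
if sys.version_info < (3, 9):
    raise RuntimeError("PyTorch 2.5 requires Python >= 3.9")
```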
@ -68,7 +68,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -99,7 +99,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -111,7 +110,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -129,7 +128,7 @@ jobs:
shell: bash
run: |
"${PYTORCH_ROOT}/.circleci/scripts/binary_windows_build.sh"
- uses: actions/upload-artifact@v3
- uses: actions/upload-artifact@v4.4.0
if: always()
with:
name: libtorch-cpu-shared-with-deps-debug
@ -169,7 +168,7 @@ jobs:
LIBTORCH_VARIANT: shared-with-deps
# This is a dummy value for libtorch to work correctly with our batch scripts
# without this value pip does not get installed for some reason
DESIRED_PYTHON: "3.8"
DESIRED_PYTHON: "3.9"
steps:
- name: Display EC2 information
shell: bash
@ -186,7 +185,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -222,7 +221,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -234,7 +232,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder


@ -58,7 +58,7 @@ jobs:
LIBTORCH_VARIANT: shared-with-deps
# This is a dummy value for libtorch to work correctly with our batch scripts
# without this value pip does not get installed for some reason
DESIRED_PYTHON: "3.8"
DESIRED_PYTHON: "3.9"
steps:
- name: Display EC2 information
shell: bash
@ -75,7 +75,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -106,7 +106,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -118,7 +117,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -136,7 +135,7 @@ jobs:
shell: bash
run: |
"${PYTORCH_ROOT}/.circleci/scripts/binary_windows_build.sh"
- uses: actions/upload-artifact@v3
- uses: actions/upload-artifact@v4.4.0
if: always()
with:
name: libtorch-cpu-shared-with-deps-debug
@ -176,7 +175,7 @@ jobs:
LIBTORCH_VARIANT: shared-with-deps
# This is a dummy value for libtorch to work correctly with our batch scripts
# without this value pip does not get installed for some reason
DESIRED_PYTHON: "3.8"
DESIRED_PYTHON: "3.9"
steps:
- name: Display EC2 information
shell: bash
@ -193,7 +192,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -229,7 +228,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -241,7 +239,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -290,7 +288,7 @@ jobs:
LIBTORCH_VARIANT: shared-with-deps
# This is a dummy value for libtorch to work correctly with our batch scripts
# without this value pip does not get installed for some reason
DESIRED_PYTHON: "3.8"
DESIRED_PYTHON: "3.9"
build_name: libtorch-cpu-shared-with-deps-debug
secrets:
github-token: ${{ secrets.GITHUB_TOKEN }}
@ -316,7 +314,7 @@ jobs:
LIBTORCH_VARIANT: shared-with-deps
# This is a dummy value for libtorch to work correctly with our batch scripts
# without this value pip does not get installed for some reason
DESIRED_PYTHON: "3.8"
DESIRED_PYTHON: "3.9"
steps:
- name: Display EC2 information
shell: bash
@ -333,7 +331,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -364,7 +362,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -376,7 +373,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -394,7 +391,7 @@ jobs:
shell: bash
run: |
"${PYTORCH_ROOT}/.circleci/scripts/binary_windows_build.sh"
- uses: actions/upload-artifact@v3
- uses: actions/upload-artifact@v4.4.0
if: always()
with:
name: libtorch-cuda11_8-shared-with-deps-debug
@ -435,7 +432,7 @@ jobs:
LIBTORCH_VARIANT: shared-with-deps
# This is a dummy value for libtorch to work correctly with our batch scripts
# without this value pip does not get installed for some reason
DESIRED_PYTHON: "3.8"
DESIRED_PYTHON: "3.9"
steps:
- name: Display EC2 information
shell: bash
@ -452,7 +449,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -488,7 +485,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -500,7 +496,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -550,7 +546,7 @@ jobs:
LIBTORCH_VARIANT: shared-with-deps
# This is a dummy value for libtorch to work correctly with our batch scripts
# without this value pip does not get installed for some reason
DESIRED_PYTHON: "3.8"
DESIRED_PYTHON: "3.9"
build_name: libtorch-cuda11_8-shared-with-deps-debug
secrets:
github-token: ${{ secrets.GITHUB_TOKEN }}
@ -576,7 +572,7 @@ jobs:
LIBTORCH_VARIANT: shared-with-deps
# This is a dummy value for libtorch to work correctly with our batch scripts
# without this value pip does not get installed for some reason
DESIRED_PYTHON: "3.8"
DESIRED_PYTHON: "3.9"
steps:
- name: Display EC2 information
shell: bash
@ -593,7 +589,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -624,7 +620,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -636,7 +631,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -654,7 +649,7 @@ jobs:
shell: bash
run: |
"${PYTORCH_ROOT}/.circleci/scripts/binary_windows_build.sh"
- uses: actions/upload-artifact@v3
- uses: actions/upload-artifact@v4.4.0
if: always()
with:
name: libtorch-cuda12_1-shared-with-deps-debug
@ -695,7 +690,7 @@ jobs:
LIBTORCH_VARIANT: shared-with-deps
# This is a dummy value for libtorch to work correctly with our batch scripts
# without this value pip does not get installed for some reason
DESIRED_PYTHON: "3.8"
DESIRED_PYTHON: "3.9"
steps:
- name: Display EC2 information
shell: bash
@ -712,7 +707,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -748,7 +743,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -760,7 +754,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -810,7 +804,7 @@ jobs:
LIBTORCH_VARIANT: shared-with-deps
# This is a dummy value for libtorch to work correctly with our batch scripts
# without this value pip does not get installed for some reason
DESIRED_PYTHON: "3.8"
DESIRED_PYTHON: "3.9"
build_name: libtorch-cuda12_1-shared-with-deps-debug
secrets:
github-token: ${{ secrets.GITHUB_TOKEN }}
@ -836,7 +830,7 @@ jobs:
LIBTORCH_VARIANT: shared-with-deps
# This is a dummy value for libtorch to work correctly with our batch scripts
# without this value pip does not get installed for some reason
DESIRED_PYTHON: "3.8"
DESIRED_PYTHON: "3.9"
steps:
- name: Display EC2 information
shell: bash
@ -853,7 +847,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -884,7 +878,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -896,7 +889,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -914,7 +907,7 @@ jobs:
shell: bash
run: |
"${PYTORCH_ROOT}/.circleci/scripts/binary_windows_build.sh"
- uses: actions/upload-artifact@v3
- uses: actions/upload-artifact@v4.4.0
if: always()
with:
name: libtorch-cuda12_4-shared-with-deps-debug
@ -955,7 +948,7 @@ jobs:
LIBTORCH_VARIANT: shared-with-deps
# This is a dummy value for libtorch to work correctly with our batch scripts
# without this value pip does not get installed for some reason
DESIRED_PYTHON: "3.8"
DESIRED_PYTHON: "3.9"
steps:
- name: Display EC2 information
shell: bash
@ -972,7 +965,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -1008,7 +1001,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -1020,7 +1012,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -1070,7 +1062,7 @@ jobs:
LIBTORCH_VARIANT: shared-with-deps
# This is a dummy value for libtorch to work correctly with our batch scripts
# without this value pip does not get installed for some reason
DESIRED_PYTHON: "3.8"
DESIRED_PYTHON: "3.9"
build_name: libtorch-cuda12_4-shared-with-deps-debug
secrets:
github-token: ${{ secrets.GITHUB_TOKEN }}


@ -51,7 +51,7 @@ jobs:
LIBTORCH_VARIANT: shared-with-deps
# This is a dummy value for libtorch to work correctly with our batch scripts
# without this value pip does not get installed for some reason
DESIRED_PYTHON: "3.8"
DESIRED_PYTHON: "3.9"
steps:
- name: Display EC2 information
shell: bash
@ -68,7 +68,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -99,7 +99,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -111,7 +110,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -129,7 +128,7 @@ jobs:
shell: bash
run: |
"${PYTORCH_ROOT}/.circleci/scripts/binary_windows_build.sh"
- uses: actions/upload-artifact@v3
- uses: actions/upload-artifact@v4.4.0
if: always()
with:
name: libtorch-cpu-shared-with-deps-release
@ -169,7 +168,7 @@ jobs:
LIBTORCH_VARIANT: shared-with-deps
# This is a dummy value for libtorch to work correctly with our batch scripts
# without this value pip does not get installed for some reason
DESIRED_PYTHON: "3.8"
DESIRED_PYTHON: "3.9"
steps:
- name: Display EC2 information
shell: bash
@ -186,7 +185,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -222,7 +221,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -234,7 +232,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder


@ -58,7 +58,7 @@ jobs:
LIBTORCH_VARIANT: shared-with-deps
# This is a dummy value for libtorch to work correctly with our batch scripts
# without this value pip does not get installed for some reason
DESIRED_PYTHON: "3.8"
DESIRED_PYTHON: "3.9"
steps:
- name: Display EC2 information
shell: bash
@ -75,7 +75,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -106,7 +106,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -118,7 +117,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -136,7 +135,7 @@ jobs:
shell: bash
run: |
"${PYTORCH_ROOT}/.circleci/scripts/binary_windows_build.sh"
- uses: actions/upload-artifact@v3
- uses: actions/upload-artifact@v4.4.0
if: always()
with:
name: libtorch-cpu-shared-with-deps-release
@ -176,7 +175,7 @@ jobs:
LIBTORCH_VARIANT: shared-with-deps
# This is a dummy value for libtorch to work correctly with our batch scripts
# without this value pip does not get installed for some reason
DESIRED_PYTHON: "3.8"
DESIRED_PYTHON: "3.9"
steps:
- name: Display EC2 information
shell: bash
@ -193,7 +192,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -229,7 +228,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -241,7 +239,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -290,7 +288,7 @@ jobs:
LIBTORCH_VARIANT: shared-with-deps
# This is a dummy value for libtorch to work correctly with our batch scripts
# without this value pip does not get installed for some reason
DESIRED_PYTHON: "3.8"
DESIRED_PYTHON: "3.9"
build_name: libtorch-cpu-shared-with-deps-release
secrets:
github-token: ${{ secrets.GITHUB_TOKEN }}
@ -316,7 +314,7 @@ jobs:
LIBTORCH_VARIANT: shared-with-deps
# This is a dummy value for libtorch to work correctly with our batch scripts
# without this value pip does not get installed for some reason
DESIRED_PYTHON: "3.8"
DESIRED_PYTHON: "3.9"
steps:
- name: Display EC2 information
shell: bash
@ -333,7 +331,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -364,7 +362,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -376,7 +373,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -394,7 +391,7 @@ jobs:
shell: bash
run: |
"${PYTORCH_ROOT}/.circleci/scripts/binary_windows_build.sh"
- uses: actions/upload-artifact@v3
- uses: actions/upload-artifact@v4.4.0
if: always()
with:
name: libtorch-cuda11_8-shared-with-deps-release
@ -435,7 +432,7 @@ jobs:
LIBTORCH_VARIANT: shared-with-deps
# This is a dummy value for libtorch to work correctly with our batch scripts
# without this value pip does not get installed for some reason
DESIRED_PYTHON: "3.8"
DESIRED_PYTHON: "3.9"
steps:
- name: Display EC2 information
shell: bash
@ -452,7 +449,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -488,7 +485,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -500,7 +496,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -550,7 +546,7 @@ jobs:
LIBTORCH_VARIANT: shared-with-deps
# This is a dummy value for libtorch to work correctly with our batch scripts
# without this value pip does not get installed for some reason
DESIRED_PYTHON: "3.8"
DESIRED_PYTHON: "3.9"
build_name: libtorch-cuda11_8-shared-with-deps-release
secrets:
github-token: ${{ secrets.GITHUB_TOKEN }}
@ -576,7 +572,7 @@ jobs:
LIBTORCH_VARIANT: shared-with-deps
# This is a dummy value for libtorch to work correctly with our batch scripts
# without this value pip does not get installed for some reason
DESIRED_PYTHON: "3.8"
DESIRED_PYTHON: "3.9"
steps:
- name: Display EC2 information
shell: bash
@ -593,7 +589,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -624,7 +620,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -636,7 +631,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -654,7 +649,7 @@ jobs:
shell: bash
run: |
"${PYTORCH_ROOT}/.circleci/scripts/binary_windows_build.sh"
- uses: actions/upload-artifact@v3
- uses: actions/upload-artifact@v4.4.0
if: always()
with:
name: libtorch-cuda12_1-shared-with-deps-release
@ -695,7 +690,7 @@ jobs:
LIBTORCH_VARIANT: shared-with-deps
# This is a dummy value for libtorch to work correctly with our batch scripts
# without this value pip does not get installed for some reason
DESIRED_PYTHON: "3.8"
DESIRED_PYTHON: "3.9"
steps:
- name: Display EC2 information
shell: bash
@ -712,7 +707,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -748,7 +743,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -760,7 +754,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -810,7 +804,7 @@ jobs:
LIBTORCH_VARIANT: shared-with-deps
# This is a dummy value for libtorch to work correctly with our batch scripts
# without this value pip does not get installed for some reason
DESIRED_PYTHON: "3.8"
DESIRED_PYTHON: "3.9"
build_name: libtorch-cuda12_1-shared-with-deps-release
secrets:
github-token: ${{ secrets.GITHUB_TOKEN }}
@ -836,7 +830,7 @@ jobs:
LIBTORCH_VARIANT: shared-with-deps
# This is a dummy value for libtorch to work correctly with our batch scripts
# without this value pip does not get installed for some reason
DESIRED_PYTHON: "3.8"
DESIRED_PYTHON: "3.9"
steps:
- name: Display EC2 information
shell: bash
@ -853,7 +847,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -884,7 +878,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -896,7 +889,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -914,7 +907,7 @@ jobs:
shell: bash
run: |
"${PYTORCH_ROOT}/.circleci/scripts/binary_windows_build.sh"
- uses: actions/upload-artifact@v3
- uses: actions/upload-artifact@v4.4.0
if: always()
with:
name: libtorch-cuda12_4-shared-with-deps-release
@ -955,7 +948,7 @@ jobs:
LIBTORCH_VARIANT: shared-with-deps
# This is a dummy value for libtorch to work correctly with our batch scripts
# without this value pip does not get installed for some reason
DESIRED_PYTHON: "3.8"
DESIRED_PYTHON: "3.9"
steps:
- name: Display EC2 information
shell: bash
@ -972,7 +965,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@main
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -1008,7 +1001,6 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -1020,7 +1012,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: main
ref: release/2.5
submodules: recursive
repository: pytorch/builder
path: builder
@ -1070,7 +1062,7 @@ jobs:
LIBTORCH_VARIANT: shared-with-deps
# This is a dummy value for libtorch to work correctly with our batch scripts
# without this value pip does not get installed for some reason
DESIRED_PYTHON: "3.8"
DESIRED_PYTHON: "3.9"
build_name: libtorch-cuda12_4-shared-with-deps-release
secrets:
github-token: ${{ secrets.GITHUB_TOKEN }}
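
Taken together, the hunks above apply one mechanical release-cut pattern across the Windows libtorch jobs: reusable actions and the pytorch/builder checkout move from `main` to the `release/2.5` branch, `actions/upload-artifact` is pinned to `v4.4.0`, and the placeholder `DESIRED_PYTHON` is bumped from 3.8 to 3.9, consistent with Python 3.8 being dropped for the 2.5 release. A condensed sketch of the recurring step pattern (steps lifted from the hunks above; the surrounding job definition is omitted):

```yaml
steps:
  - name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
    # Composite actions are pinned to the release branch instead of main
    uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.5
    continue-on-error: true
  - name: Checkout pytorch/builder
    uses: malfet/checkout@silent-checkout
    with:
      ref: release/2.5          # was: main
      submodules: recursive
      repository: pytorch/builder
      path: builder
  - uses: actions/upload-artifact@v4.4.0   # was: v3
    if: always()
```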

File diff suppressed because it is too large.


@ -0,0 +1,40 @@
name: inductor-micro-benchmark-x86
on:
schedule:
- cron: 0 7 * * *
push:
tags:
- ciflow/inductor-micro-benchmark-cpu-x86/*
workflow_dispatch:
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref_name }}-${{ github.ref_type == 'branch' && github.sha }}-${{ github.event_name == 'workflow_dispatch' }}-${{ github.event_name == 'schedule' }}
cancel-in-progress: true
permissions: read-all
jobs:
linux-jammy-cpu-py3_9-gcc11-inductor-build:
name: linux-jammy-cpu-py3.9-gcc11-inductor
uses: ./.github/workflows/_linux-build.yml
with:
build-environment: linux-jammy-py3.9-gcc11
docker-image-name: pytorch-linux-jammy-py3.9-gcc11-inductor-benchmarks
# Use metal host for benchmark jobs
test-matrix: |
{ include: [
{ config: "inductor-micro-benchmark-cpu-x86", shard: 1, num_shards: 1, runner: "linux.24xl.spr-metal" },
]}
linux-jammy-cpu-py3_9-gcc11-inductor-micro-benchmark-test:
name: linux-jammy-cpu-py3.9-gcc11-inductor
uses: ./.github/workflows/_linux-test.yml
needs: linux-jammy-cpu-py3_9-gcc11-inductor-build
with:
build-environment: linux-jammy-py3.9-gcc11
docker-image: ${{ needs.linux-jammy-cpu-py3_9-gcc11-inductor-build.outputs.docker-image }}
test-matrix: ${{ needs.linux-jammy-cpu-py3_9-gcc11-inductor-build.outputs.test-matrix }}
use-gha: anything-non-empty-to-use-gha
timeout-minutes: 720
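
The new workflow uses the standard reusable-workflow pairing: the build job calls `_linux-build.yml`, which publishes `docker-image` and `test-matrix` as outputs, and the test job forwards them through `needs`. A stripped-down sketch of that wiring (job ids shortened; the internals of the reusable workflows are assumed, not shown here):

```yaml
jobs:
  build:
    uses: ./.github/workflows/_linux-build.yml
    with:
      build-environment: linux-jammy-py3.9-gcc11
      docker-image-name: pytorch-linux-jammy-py3.9-gcc11-inductor-benchmarks
  test:
    needs: build
    uses: ./.github/workflows/_linux-test.yml
    with:
      build-environment: linux-jammy-py3.9-gcc11
      # Outputs published by the build workflow feed the test workflow
      docker-image: ${{ needs.build.outputs.docker-image }}
      test-matrix: ${{ needs.build.outputs.test-matrix }}
```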


@ -22,11 +22,11 @@ concurrency:
permissions: read-all
jobs:
linux-focal-rocm6_1-py3_8-inductor-build:
name: rocm6.1-py3.8-inductor
linux-focal-rocm6_2-py3_10-inductor-build:
name: rocm6.2-py3.10-inductor
uses: ./.github/workflows/_linux-build.yml
with:
build-environment: linux-focal-rocm6.1-py3.8
build-environment: linux-focal-rocm6.2-py3.10
docker-image-name: pytorch-linux-focal-rocm-n-py3
test-matrix: |
{ include: [
@ -34,14 +34,14 @@ jobs:
{ config: "inductor", shard: 2, num_shards: 2, runner: "linux.rocm.gpu.2" },
]}
linux-focal-rocm6_1-py3_8-inductor-test:
linux-focal-rocm6_2-py3_10-inductor-test:
permissions:
id-token: write
contents: read
name: rocm6.1-py3.8-inductor
name: rocm6.2-py3.10-inductor
uses: ./.github/workflows/_rocm-test.yml
needs: linux-focal-rocm6_1-py3_8-inductor-build
needs: linux-focal-rocm6_2-py3_10-inductor-build
with:
build-environment: linux-focal-rocm6.1-py3.8
docker-image: ${{ needs.linux-focal-rocm6_1-py3_8-inductor-build.outputs.docker-image }}
test-matrix: ${{ needs.linux-focal-rocm6_1-py3_8-inductor-build.outputs.test-matrix }}
build-environment: linux-focal-rocm6.2-py3.10
docker-image: ${{ needs.linux-focal-rocm6_2-py3_10-inductor-build.outputs.docker-image }}
test-matrix: ${{ needs.linux-focal-rocm6_2-py3_10-inductor-build.outputs.test-matrix }}
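
Note that bumping ROCm 6.1/py3.8 to 6.2/py3.10 is not a single rename: the job id, the display name, the `build-environment` string, and every `needs`/`outputs` reference to the old id all have to change together, which is why each hunk touches four coupled spots. In sketch form (names from the diff; the numbered comments are editorial):

```yaml
jobs:
  linux-focal-rocm6_2-py3_10-inductor-build:          # 1. job id
    name: rocm6.2-py3.10-inductor                     # 2. display name
    uses: ./.github/workflows/_linux-build.yml
    with:
      build-environment: linux-focal-rocm6.2-py3.10   # 3. build environment
      docker-image-name: pytorch-linux-focal-rocm-n-py3
  linux-focal-rocm6_2-py3_10-inductor-test:
    name: rocm6.2-py3.10-inductor
    uses: ./.github/workflows/_rocm-test.yml
    needs: linux-focal-rocm6_2-py3_10-inductor-build  # 4. references to the new id
    with:
      build-environment: linux-focal-rocm6.2-py3.10
      docker-image: ${{ needs.linux-focal-rocm6_2-py3_10-inductor-build.outputs.docker-image }}
      test-matrix: ${{ needs.linux-focal-rocm6_2-py3_10-inductor-build.outputs.test-matrix }}
```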


@ -15,7 +15,7 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Run BC Lint Action
uses: pytorch/test-infra/.github/actions/bc-lint@main
uses: pytorch/test-infra/.github/actions/bc-lint@release/2.5
with:
repo: ${{ github.event.pull_request.head.repo.full_name }}
base_sha: ${{ github.event.pull_request.base.sha }}


@ -24,7 +24,7 @@ jobs:
curr_branch: ${{ github.head_ref || github.ref_name }}
lintrunner-clang:
uses: pytorch/test-infra/.github/workflows/linux_job.yml@main
uses: pytorch/test-infra/.github/workflows/linux_job.yml@release/2.5
needs: get-label-type
with:
timeout: 120
@ -41,7 +41,7 @@ jobs:
.github/scripts/lintrunner.sh
lintrunner-noclang:
uses: pytorch/test-infra/.github/workflows/linux_job.yml@main
uses: pytorch/test-infra/.github/workflows/linux_job.yml@release/2.5
needs: get-label-type
with:
timeout: 120
@ -57,7 +57,7 @@ jobs:
.github/scripts/lintrunner.sh
quick-checks:
uses: pytorch/test-infra/.github/workflows/linux_job.yml@main
uses: pytorch/test-infra/.github/workflows/linux_job.yml@release/2.5
needs: get-label-type
with:
runner: "${{ needs.get-label-type.outputs.label-type }}linux.2xlarge"
@ -100,7 +100,7 @@ jobs:
if: github.event_name == 'pull_request' && !contains(github.event.pull_request.labels.*.name, 'skip-pr-sanity-checks')
steps:
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
with:
submodules: false
fetch-depth: -1
@ -113,7 +113,7 @@ jobs:
bash .github/scripts/pr-sanity-check.sh
workflow-checks:
uses: pytorch/test-infra/.github/workflows/linux_job.yml@main
uses: pytorch/test-infra/.github/workflows/linux_job.yml@release/2.5
needs: get-label-type
with:
runner: "${{ needs.get-label-type.outputs.label-type }}linux.2xlarge"
@ -127,6 +127,7 @@ jobs:
conda activate "${CONDA_ENV}"
# Regenerate workflows
export RELEASE_VERSION_TAG="2.5"
.github/scripts/generate_ci_workflows.py
RC=0
@ -150,7 +151,7 @@ jobs:
exit $RC
toc:
uses: pytorch/test-infra/.github/workflows/linux_job.yml@main
uses: pytorch/test-infra/.github/workflows/linux_job.yml@release/2.5
needs: get-label-type
with:
runner: "${{ needs.get-label-type.outputs.label-type }}linux.2xlarge"
@ -189,7 +190,7 @@ jobs:
test-tools:
name: Test tools
if: ${{ github.repository == 'pytorch/pytorch' }}
uses: pytorch/test-infra/.github/workflows/linux_job.yml@main
uses: pytorch/test-infra/.github/workflows/linux_job.yml@release/2.5
needs: get-label-type
with:
runner: "${{ needs.get-label-type.outputs.label-type }}linux.2xlarge"
@ -211,19 +212,19 @@ jobs:
runs-on: linux.20_04.4x
steps:
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
with:
submodules: false
fetch-depth: 1
- name: Setup Python 3.8
- name: Setup Python 3.9
uses: actions/setup-python@v4
with:
python-version: '3.8'
python-version: '3.9'
architecture: x64
cache: pip
- name: Install dependencies
run: |
pip install pytest-rerunfailures==11.1.* pytest-flakefinder==1.1.* pytest-xdist==3.3.* expecttest==0.1.* numpy==1.24.*
pip install pytest-rerunfailures==11.1.* pytest-flakefinder==1.1.* pytest-xdist==3.3.* expecttest==0.2.* fbscribelogger==0.1.* numpy==1.24.*
pip install torch --pre --index-url https://download.pytorch.org/whl/nightly/cpu/
- name: Run run_test.py (nonretryable)
run: |
@ -241,7 +242,7 @@ jobs:
# [see note: pytorch repo ref]
# deep clone (fetch-depth 0) required, to allow us to use git log
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
with:
submodules: false
fetch-depth: 1
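
Two changes in this file go beyond branch pinning. The test-tools job moves from Python 3.8 to 3.9 and updates its test dependencies (`expecttest` 0.1 to 0.2, plus the new `fbscribelogger`), and the workflow-checks job now exports `RELEASE_VERSION_TAG="2.5"` before regenerating workflows, so the generator emits release-branch pins rather than `main` references. A hedged sketch of that regeneration step (assuming `generate_ci_workflows.py` reads the tag from the environment, as the added export implies):

```yaml
- name: Regenerate workflows
  run: |
    # RELEASE_VERSION_TAG steers generate_ci_workflows.py toward
    # release/2.5 pins (assumption based on the export added above).
    export RELEASE_VERSION_TAG="2.5"
    .github/scripts/generate_ci_workflows.py
```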


@ -48,7 +48,7 @@ jobs:
path: llm-target-determinator
- name: Setup miniconda
uses: pytorch/test-infra/.github/actions/setup-miniconda@main
uses: pytorch/test-infra/.github/actions/setup-miniconda@release/2.5
with:
python-version: "3.9"
@ -117,5 +117,5 @@ jobs:
AWS_REGION: ""
- name: Teardown Linux
uses: pytorch/test-infra/.github/actions/teardown-linux@main
uses: pytorch/test-infra/.github/actions/teardown-linux@release/2.5
if: always()


@ -21,7 +21,7 @@ jobs:
environment: upload-stats
steps:
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
with:
fetch-depth: 1
submodules: false


@ -56,7 +56,7 @@ jobs:
if: ${{ github.event_name == 'schedule' }}
steps:
- name: update-vision-commit-hash
uses: pytorch/test-infra/.github/actions/update-commit-hash@main
uses: pytorch/test-infra/.github/actions/update-commit-hash@release/2.5
with:
repo-name: vision
branch: main
@ -71,7 +71,7 @@ jobs:
if: ${{ github.event_name == 'schedule' }}
steps:
- name: update-audio-commit-hash
uses: pytorch/test-infra/.github/actions/update-commit-hash@main
uses: pytorch/test-infra/.github/actions/update-commit-hash@release/2.5
with:
repo-name: audio
branch: main
@ -86,7 +86,7 @@ jobs:
if: ${{ github.event_name == 'schedule' }}
steps:
- name: update-executorch-commit-hash
uses: pytorch/test-infra/.github/actions/update-commit-hash@main
uses: pytorch/test-infra/.github/actions/update-commit-hash@release/2.5
with:
repo-name: executorch
branch: main


@ -19,7 +19,7 @@ jobs:
if: ${{ github.event.pull_request.number != 26921 && github.repository_owner == 'pytorch' }}
steps:
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.5
- uses: ethanis/nitpicker@v1
with:
nitpicks: '.github/nitpicks.yml'


@ -214,7 +214,9 @@ jobs:
# TODO: Figure out how to migrate this job to M1 runner
ios-build-test:
name: ios-build-test
if: github.event_name != 'schedule' || github.event.schedule == '45 0,8,16 * * 1-5' || github.event.schedule == '45 4 * * 0,6' || github.event.schedule == '29 8 * * *'
# Has been broken for a while, see https://github.com/pytorch/pytorch/issues/136284
# if: github.event_name != 'schedule' || github.event.schedule == '45 0,8,16 * * 1-5' || github.event.schedule == '45 4 * * 0,6' || github.event.schedule == '29 8 * * *'
if: false
uses: ./.github/workflows/_ios-build-test.yml
with:
trigger-event: ${{ github.event_name }}
@ -293,13 +295,13 @@ jobs:
docker-image: ${{ needs.linux-vulkan-focal-py3_11-clang10-build.outputs.docker-image }}
test-matrix: ${{ needs.linux-vulkan-focal-py3_11-clang10-build.outputs.test-matrix }}
linux-focal-rocm6_1-py3_8-build:
name: linux-focal-rocm6.1-py3.8
linux-focal-rocm6_2-py3_10-build:
name: linux-focal-rocm6.2-py3.10
uses: ./.github/workflows/_linux-build.yml
needs: get-label-type
with:
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
build-environment: linux-focal-rocm6.1-py3.8
build-environment: linux-focal-rocm6.2-py3.10
docker-image-name: pytorch-linux-focal-rocm-n-py3
test-matrix: |
{ include: [
@ -308,19 +310,19 @@ jobs:
{ config: "distributed", shard: 3, num_shards: 3, runner: "linux.rocm.gpu" },
]}
linux-focal-rocm6_1-py3_8-test:
linux-focal-rocm6_2-py3_10-test:
permissions:
id-token: write
contents: read
name: linux-focal-rocm6.1-py3.8
name: linux-focal-rocm6.2-py3.10
uses: ./.github/workflows/_rocm-test.yml
needs:
- linux-focal-rocm6_1-py3_8-build
- linux-focal-rocm6_2-py3_10-build
- target-determination
with:
build-environment: linux-focal-rocm6.1-py3.8
docker-image: ${{ needs.linux-focal-rocm6_1-py3_8-build.outputs.docker-image }}
test-matrix: ${{ needs.linux-focal-rocm6_1-py3_8-build.outputs.test-matrix }}
build-environment: linux-focal-rocm6.2-py3.10
docker-image: ${{ needs.linux-focal-rocm6_2-py3_10-build.outputs.docker-image }}
test-matrix: ${{ needs.linux-focal-rocm6_2-py3_10-build.outputs.test-matrix }}
linux-focal-cuda12_1-py3_10-gcc9-experimental-split-build:
name: linux-focal-cuda12.1-py3.10-gcc9-experimental-split-build
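
Disabling the broken iOS job with `if: false` rather than deleting it keeps the job definition, its inputs, and the original schedule condition (left commented out) in the file, so restoring it once https://github.com/pytorch/pytorch/issues/136284 is fixed is a one-line change. The resulting job block reads:

```yaml
ios-build-test:
  name: ios-build-test
  # Has been broken for a while, see https://github.com/pytorch/pytorch/issues/136284
  # if: github.event_name != 'schedule' || github.event.schedule == '45 0,8,16 * * 1-5' || github.event.schedule == '45 4 * * 0,6' || github.event.schedule == '29 8 * * *'
  if: false
  uses: ./.github/workflows/_ios-build-test.yml
```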


@ -503,15 +503,15 @@ jobs:
]}
secrets: inherit
linux-focal-rocm6_1-py3_8-build:
linux-focal-rocm6_2-py3_10-build:
# don't run build twice on main
if: github.event_name == 'pull_request'
name: linux-focal-rocm6.1-py3.8
name: linux-focal-rocm6.2-py3.10
uses: ./.github/workflows/_linux-build.yml
needs: get-label-type
with:
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
build-environment: linux-focal-rocm6.1-py3.8
build-environment: linux-focal-rocm6.2-py3.10
docker-image-name: pytorch-linux-focal-rocm-n-py3
sync-tag: rocm-build
test-matrix: |


@ -31,11 +31,11 @@ jobs:
id-token: write
contents: read
linux-focal-rocm6_1-py3_8-build:
name: linux-focal-rocm6.1-py3.8
linux-focal-rocm6_2-py3_10-build:
name: linux-focal-rocm6.2-py3.10
uses: ./.github/workflows/_linux-build.yml
with:
build-environment: linux-focal-rocm6.1-py3.8
build-environment: linux-focal-rocm6.2-py3.10
docker-image-name: pytorch-linux-focal-rocm-n-py3
sync-tag: rocm-build
test-matrix: |
@ -48,16 +48,16 @@ jobs:
{ config: "default", shard: 6, num_shards: 6, runner: "linux.rocm.gpu.2" },
]}
linux-focal-rocm6_1-py3_8-test:
linux-focal-rocm6_2-py3_10-test:
permissions:
id-token: write
contents: read
name: linux-focal-rocm6.1-py3.8
name: linux-focal-rocm6.2-py3.10
uses: ./.github/workflows/_rocm-test.yml
needs:
- linux-focal-rocm6_1-py3_8-build
- linux-focal-rocm6_2-py3_10-build
- target-determination
with:
build-environment: linux-focal-rocm6.1-py3.8
docker-image: ${{ needs.linux-focal-rocm6_1-py3_8-build.outputs.docker-image }}
test-matrix: ${{ needs.linux-focal-rocm6_1-py3_8-build.outputs.test-matrix }}
build-environment: linux-focal-rocm6.2-py3.10
docker-image: ${{ needs.linux-focal-rocm6_2-py3_10-build.outputs.docker-image }}
test-matrix: ${{ needs.linux-focal-rocm6_2-py3_10-build.outputs.test-matrix }}


@ -22,6 +22,7 @@ permissions: read-all
jobs:
llm-td:
if: false
name: before-test
uses: ./.github/workflows/llm_td_retrieval.yml
permissions:
@ -29,6 +30,7 @@ jobs:
contents: read
target-determination:
if: false
name: before-test
uses: ./.github/workflows/target_determination.yml
needs: llm-td
@ -37,6 +39,7 @@ jobs:
contents: read
get-label-type:
if: false
name: get-label-type
uses: ./.github/workflows/_runner-determinator.yml
with:
@ -46,6 +49,7 @@ jobs:
curr_ref_type: ${{ github.ref_type }}
linux-focal-cuda12_1-py3-gcc9-slow-gradcheck-build:
if: false
name: linux-focal-cuda12.1-py3-gcc9-slow-gradcheck
uses: ./.github/workflows/_linux-build.yml
needs: get-label-type
@ -56,15 +60,18 @@ jobs:
cuda-arch-list: 8.6
test-matrix: |
{ include: [
{ config: "default", shard: 1, num_shards: 6, runner: "${{ needs.get-label-type.outputs.label-type }}linux.g5.4xlarge.nvidia.gpu" },
{ config: "default", shard: 2, num_shards: 6, runner: "${{ needs.get-label-type.outputs.label-type }}linux.g5.4xlarge.nvidia.gpu" },
{ config: "default", shard: 3, num_shards: 6, runner: "${{ needs.get-label-type.outputs.label-type }}linux.g5.4xlarge.nvidia.gpu" },
{ config: "default", shard: 4, num_shards: 6, runner: "${{ needs.get-label-type.outputs.label-type }}linux.g5.4xlarge.nvidia.gpu" },
{ config: "default", shard: 5, num_shards: 6, runner: "${{ needs.get-label-type.outputs.label-type }}linux.g5.4xlarge.nvidia.gpu" },
{ config: "default", shard: 6, num_shards: 6, runner: "${{ needs.get-label-type.outputs.label-type }}linux.g5.4xlarge.nvidia.gpu" },
{ config: "default", shard: 1, num_shards: 8, runner: "${{ needs.get-label-type.outputs.label-type }}linux.g5.4xlarge.nvidia.gpu" },
{ config: "default", shard: 2, num_shards: 8, runner: "${{ needs.get-label-type.outputs.label-type }}linux.g5.4xlarge.nvidia.gpu" },
{ config: "default", shard: 3, num_shards: 8, runner: "${{ needs.get-label-type.outputs.label-type }}linux.g5.4xlarge.nvidia.gpu" },
{ config: "default", shard: 4, num_shards: 8, runner: "${{ needs.get-label-type.outputs.label-type }}linux.g5.4xlarge.nvidia.gpu" },
{ config: "default", shard: 5, num_shards: 8, runner: "${{ needs.get-label-type.outputs.label-type }}linux.g5.4xlarge.nvidia.gpu" },
{ config: "default", shard: 6, num_shards: 8, runner: "${{ needs.get-label-type.outputs.label-type }}linux.g5.4xlarge.nvidia.gpu" },
{ config: "default", shard: 7, num_shards: 8, runner: "${{ needs.get-label-type.outputs.label-type }}linux.g5.4xlarge.nvidia.gpu" },
{ config: "default", shard: 8, num_shards: 8, runner: "${{ needs.get-label-type.outputs.label-type }}linux.g5.4xlarge.nvidia.gpu" },
]}
linux-focal-cuda12_1-py3-gcc9-slow-gradcheck-test:
if: false
name: linux-focal-cuda12.1-py3-gcc9-slow-gradcheck
uses: ./.github/workflows/_linux-test.yml
needs:
@ -77,6 +84,7 @@ jobs:
timeout-minutes: 300
linux-focal-cuda12_1-py3_10-gcc9-sm86-build:
if: false
name: linux-focal-cuda12.1-py3.10-gcc9-sm86
uses: ./.github/workflows/_linux-build.yml
needs: get-label-type
@ -87,11 +95,13 @@ jobs:
cuda-arch-list: 8.6
test-matrix: |
{ include: [
{ config: "slow", shard: 1, num_shards: 2, runner: "${{ needs.get-label-type.outputs.label-type }}linux.g5.4xlarge.nvidia.gpu" },
{ config: "slow", shard: 2, num_shards: 2, runner: "${{ needs.get-label-type.outputs.label-type }}linux.g5.4xlarge.nvidia.gpu" },
{ config: "slow", shard: 1, num_shards: 3, runner: "${{ needs.get-label-type.outputs.label-type }}linux.g5.4xlarge.nvidia.gpu" },
{ config: "slow", shard: 2, num_shards: 3, runner: "${{ needs.get-label-type.outputs.label-type }}linux.g5.4xlarge.nvidia.gpu" },
{ config: "slow", shard: 3, num_shards: 3, runner: "${{ needs.get-label-type.outputs.label-type }}linux.g5.4xlarge.nvidia.gpu" },
]}
linux-focal-cuda12_1-py3_10-gcc9-sm86-test:
if: false
name: linux-focal-cuda12.1-py3.10-gcc9-sm86
uses: ./.github/workflows/_linux-test.yml
needs:
@ -103,6 +113,7 @@ jobs:
test-matrix: ${{ needs.linux-focal-cuda12_1-py3_10-gcc9-sm86-build.outputs.test-matrix }}
linux-focal-py3_9-clang10-build:
if: false
name: linux-focal-py3.9-clang10
uses: ./.github/workflows/_linux-build.yml
needs: get-label-type
@ -117,6 +128,7 @@ jobs:
]}
linux-focal-py3_9-clang10-test:
if: false
name: linux-focal-py3.9-clang10
uses: ./.github/workflows/_linux-test.yml
needs:
@ -127,13 +139,14 @@ jobs:
docker-image: ${{ needs.linux-focal-py3_9-clang10-build.outputs.docker-image }}
test-matrix: ${{ needs.linux-focal-py3_9-clang10-build.outputs.test-matrix }}
linux-focal-rocm6_1-py3_8-build:
name: linux-focal-rocm6.1-py3.8
linux-focal-rocm6_2-py3_10-build:
if: false
name: linux-focal-rocm6.2-py3.10
uses: ./.github/workflows/_linux-build.yml
needs: get-label-type
with:
runner_prefix: "${{ needs.get-label-type.outputs.label-type }}"
build-environment: linux-focal-rocm6.1-py3.8
build-environment: linux-focal-rocm6.2-py3.10
docker-image-name: pytorch-linux-focal-rocm-n-py3
test-matrix: |
{ include: [
@ -141,21 +154,23 @@ jobs:
{ config: "slow", shard: 2, num_shards: 2, runner: "linux.rocm.gpu" },
]}
linux-focal-rocm6_1-py3_8-test:
linux-focal-rocm6_2-py3_10-test:
if: false
permissions:
id-token: write
contents: read
name: linux-focal-rocm6.1-py3.8
name: linux-focal-rocm6.2-py3.10
uses: ./.github/workflows/_rocm-test.yml
needs:
- linux-focal-rocm6_1-py3_8-build
- linux-focal-rocm6_2-py3_10-build
- target-determination
with:
build-environment: linux-focal-rocm6.1-py3.8
docker-image: ${{ needs.linux-focal-rocm6_1-py3_8-build.outputs.docker-image }}
test-matrix: ${{ needs.linux-focal-rocm6_1-py3_8-build.outputs.test-matrix }}
build-environment: linux-focal-rocm6.2-py3.10
docker-image: ${{ needs.linux-focal-rocm6_2-py3_10-build.outputs.docker-image }}
test-matrix: ${{ needs.linux-focal-rocm6_2-py3_10-build.outputs.test-matrix }}
linux-jammy-py3_10-clang15-asan-build:
if: false
name: linux-jammy-py3.10-clang15-asan
uses: ./.github/workflows/_linux-build.yml
needs: get-label-type
@ -173,6 +188,7 @@ jobs:
secrets: inherit
linux-jammy-py3_10-clang15-asan-test:
if: false
name: linux-jammy-py3.10-clang15-asan
uses: ./.github/workflows/_linux-test.yml
needs:

Some files were not shown because too many files have changed in this diff.