Compare commits


332 Commits

Author SHA1 Message Date
b33a283e9a [nccl-pg] Pass pg name and desc to NCCL communicator (#124149)
Summary:
Pass Process Group Name and Desc to NCCL communicator in order to access pg information in NCCL layer.
The information is passed as a commDesc string (i.e. "<pg_desc>:<pg_name>").
The function is only valid when NCCL_COMM_DESCRIPTION is defined.

Differential Revision: D55703310

Pull Request resolved: https://github.com/pytorch/pytorch/pull/124149
Approved by: https://github.com/shuqiangzhang
2024-04-16 15:08:38 -07:00
7a551d81e5 [c10d/nccl-pg] allow user to pass process group description (#123472)
Summary:
We need a way to allow users to set a customized description for a process group, e.g. FSDP, PP.

Here are several use cases of user specified group_desc:
- Logging: we can easily match a log line and understand what this collective/PG is used for.
- PyTorch traces (e.g. Kineto, Execution Trace) can benefit from the PG desc, since trace analysis and benchmarks will be able to easily differentiate PG purposes like FSDP and PP.
- Lower-layer collective (e.g. NCCL) debugging: we will be able to expose the PG desc to the NCCL communicator so NCCL-layer operations can be easily correlated to a PG.

Solution: Add a group_desc field to c10d
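
A hedged usage sketch of the new field, assuming the Python API surfaces it as a `group_desc` keyword on `new_group` (launch under torchrun with enough ranks):
```python
# Hedged sketch: label sub-process-groups so logs, traces, and NCCL-level debug
# output can be tied back to their purpose. Assumes dist.new_group() accepts a
# group_desc keyword as described above; run under torchrun with >= 4 ranks.
import torch.distributed as dist

dist.init_process_group("nccl")  # env:// rendezvous provided by torchrun
fsdp_pg = dist.new_group(ranks=[0, 1], group_desc="FSDP")
pp_pg = dist.new_group(ranks=[2, 3], group_desc="PP")
```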

Differential Revision: D55781850

Pull Request resolved: https://github.com/pytorch/pytorch/pull/123472
Approved by: https://github.com/kwen2501
2024-04-16 15:08:38 -07:00
1515a90475 [DCP] Adds ability to create a CPU state dict that is both shared and pinned (#122338)
[DCP] Adds ability to create a CPU state dict that is both shared and pinned, as well as a new utility specific to copying the state dict

https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__MEMORY.html#group__CUDART__MEMORY_1ge8d5c17670f16ac4fc8fcb4181cb490c
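
The linked CUDA docs describe page-locked (pinned) host memory. A generic illustration of why a pinned CPU destination matters when offloading GPU state (this is motivation only, not the new DCP utility itself):
```python
# Generic sketch: copying a GPU tensor into a pre-allocated pinned CPU buffer
# can be done asynchronously and avoids an extra staging copy. Illustrative
# only; the actual DCP utility added by this PR is not shown here.
import torch

if torch.cuda.is_available():
    gpu_param = torch.randn(1024, 1024, device="cuda")
    cpu_dst = torch.empty(gpu_param.shape, dtype=gpu_param.dtype, pin_memory=True)
    cpu_dst.copy_(gpu_param, non_blocking=True)  # async device-to-host copy
    torch.cuda.synchronize()
```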

Pull Request resolved: https://github.com/pytorch/pytorch/pull/122338
Approved by: https://github.com/fegin
2024-04-16 15:08:22 -07:00
4882ec2a91 Pass and record process_group_name when creating ProcessGroupNCCL (#123117)
Summary:
Pass the Python c10d group_name to the C++ ProcessGroupNCCL so that the PG name is consistent across the different layers.
Also record the pg_name in the flight recorder entry.

Differential Revision: D55597200

Pull Request resolved: https://github.com/pytorch/pytorch/pull/123117
Approved by: https://github.com/wconstab
2024-04-16 13:48:35 -07:00
972b8060bd [c10d] make monitorThread sleep when we try to dump (#123788)
Summary:
We separated the FR dump logic from the desync debug logic,
so we no longer set collectiveDebugInfoMode_ to true when we just need an FR
dump. That's why the monitor thread did not sleep and tried to kill the
process without waiting for the dump.

The fix is simple: we should sleep whenever shouldDump_ is true.
Test Plan:
Existing unit tests

Pull Request resolved: https://github.com/pytorch/pytorch/pull/123788
Approved by: https://github.com/wconstab
2024-04-11 09:19:15 -07:00
3e7683ae18 [c10d] dump on any exception (timeout + nccl error) (#123023)
Summary:
The existing flight recorder dumping logic dumps only on timeout, but not
on NCCL errors. This resulted in the faulty ranks missing dumps when an NCCL
error happens.

So in this PR, we revise the dump logic such that records are dumped
when any exception is detected. An exception could be 1. an NCCL async error, or
2. a watchdog timeout.

Also, the existing code tends to mix the flight recorder dump logic
with the desync debug logic, which is not desirable. We now dump the desync
debug report only when a timeout is detected.
Test Plan:
Added a new unit test to trigger nccl error and dump, and make sure the
dump is triggered by the error.

Also existing dump on timeout tests should still pass.

(sqzhang_1) [sqzhang@devgpu009.cln1 ~/pytorch (84bf9d4c)]$ python
test/distributed/test_c10d_nccl.py NcclErrorDumpTest
NCCL version 2.19.3+cuda12.0
[E329 19:15:11.775879730 ProcessGroupNCCL.cpp:565] [Rank 0] Watchdog
caught collective operation timeout: WorkNCCL(SeqNum=2,
OpType=ALLREDUCE, NumelIn=10, NumelOut=10, Timeout(ms)=10000) ran for
10028 milliseconds before timing out.
[E329 19:15:11.777459894 ProcessGroupNCCL.cpp:1561] [PG 0 Rank 0]
Exception hit in NCCL work: 2
[E329 19:15:12.660717323 ProcessGroupNCCL.cpp:1332] [PG 0 Rank 0]
Received a timeout signal from this local rank and will start to dump
the debug info. Last enqueued NCCL work: 2, last completed NCCL work: 1.
[E329 19:15:12.660932242 ProcessGroupNCCL.cpp:1167] [PG 0 Rank 0]
ProcessGroupNCCL preparing to dump debug info.
[E329 19:15:12.661192990 ProcessGroupNCCL.cpp:1174] [PG 0 Rank 0]
ProcessGroupNCCL dumping nccl trace to /tmp/tmp06psqil3/trace_0
[F329 19:15:12.661485601 ProcessGroupNCCL.cpp:1185] [PG 0 Rank 0] [PG 0
Rank 0] ProcessGroupNCCL's watchdog detected a collective timeout from
the local rank. This is most likely caused by incorrect usages of
collectives, e.g., wrong sizes used across ranks, the order of
collectives is not same for all ranks or the scheduled collective, for
some reason, didn't run. Additionally, this can be caused by GIL
deadlock or other reasons such as network errors or bugs in the
communications library (e.g. NCCL), etc. We tried our best to dump the
debug info into the storage to help you debug the issue.

Tags:

Pull Request resolved: https://github.com/pytorch/pytorch/pull/123023
Approved by: https://github.com/wconstab
2024-04-02 15:41:15 -07:00
f2e9ec2dc5 [c10d] dump from one and only one thread (PG0's monitor thread) (#120893)
Summary:
When there are multiple PGs in a process and a hardware failure happens,
we found that multiple PGs/threads in the same
process compete to dump the same records at the same time. This
affects the reliability of the dumps.

In this PR, we make the change such that only one thread/PG
could dump: PG0's monitor thread. We use a static variable to indicate
that something (e.g., a collective timeout) has triggered the dump
locally.

The monitor thread dumps debug info under any one of these 3 conditions:
1. the static variable is set to true by the watchdog thread when it detects
a timeout or a pipe dump signal
2. a timeout signal is received from other ranks through the TCPStore
3. no heartbeat from the watchdog
Test Plan:
python test/distributed/test_c10d_nccl.py -k
test_timeout_dumps_on_stuck_ranks

Pull Request resolved: https://github.com/pytorch/pytorch/pull/120893
Approved by: https://github.com/wconstab
2024-04-02 15:36:05 -07:00
dde4324d8e [NCCL PG] Enable ncclCommDevIdxMap unconditionally (#122049)
Differential Revision: D54993977

The initial purpose of ncclCommDevIdxMap is to support NCCL zero-copy algorithms. Therefore, it was only enabled (with its values filled) if useTensorRegisterAllocatorHook_ was set to true. However, we now rely on it to support dumping NCCL information in a single PG, so we need it to be always available, regardless of whether useTensorRegisterAllocatorHook_ is enabled.
This change moves the code that fills ncclCommDevIdxMap out of the if (useTensorRegisterAllocatorHook_) statement.

See diff

Pull Request resolved: https://github.com/pytorch/pytorch/pull/122049
Approved by: https://github.com/shuqiangzhang
2024-03-26 17:14:06 -07:00
94c079104d [c10d] fix the macro definition of NCCL_COMM_DUMP (#120502)
Summary:
Only if both macros are defined should we dump the comm state;
otherwise, use the original definition.

The previous implementation missed the function definition when IS_NCCL_EXP is defined but NCCL_COMM_DUMP is not.

Test Plan:
Build and unit test

Pull Request resolved: https://github.com/pytorch/pytorch/pull/120502
Approved by: https://github.com/dsjohns2, https://github.com/Skylion007
2024-03-26 14:09:00 -07:00
a6afee6d94 [c10d][flight recorder] dump additional NCCL debug info (#120063)
Summary:
This PR is mainly about the flight recorder side of the changes: it takes a
map of maps as input and dumps it as a picklable object. It also adds functions that
should be compiled only when NCCL_COMM_DUMP is defined.
Test Plan:
Integration tests with NCCL will be done later; here we only do the
c10d side of the dump test, aka NCCLTraceTest.

Testing the dump function is a bit tricky as we don't have
existing C++ unit tests for it. So we still use the Python NCCLTraceTest with
the Python binding of _dump_nccl_trace(): we manually feed
dump_nccl_trace with a map of test info, assert the pickle result, and
print the converted Python dict:
```
(sqzhang_1) [sqzhang@devgpu009.cln1 ~/pytorch (main)]$  python
test/distributed/test_c10d_nccl.py NCCLTraceTest
NCCL version 2.19.3+cuda12.0
[rank0]:[E ProcessGroupNCCL.cpp:1200] [PG 0 Rank 0] ProcessGroupNCCL
preparing to dump debug info.
.NCCL version 2.19.3+cuda12.0
.NCCL version 2.19.3+cuda12.0
{'ncclID2': {'Key2': 'Value2', 'Key1': 'Value1'}, 'ncclID1': {'Key2':
'Value2', 'Key1': 'Value1'}}
{'ncclID2': {'Key2': 'Value2', 'Key1': 'Value1'}, 'ncclID1': {'Key2':
'Value2', 'Key1': 'Value1'}}
.NCCL version 2.19.3+cuda12.0
{'ncclID2': {'Key2': 'Value2', 'Key1': 'Value1'}, 'ncclID1': {'Key2':
'Value2', 'Key1': 'Value1'}}
{'ncclID2': {'Key2': 'Value2', 'Key1': 'Value1'}, 'ncclID1': {'Key2':
'Value2', 'Key1': 'Value1'}}
.NCCL version 2.19.3+cuda12.0
.NCCL version 2.19.3+cuda12.0
.NCCL version 2.19.3+cuda12.0
.NCCL version 2.19.3+cuda12.0
.
----------------------------------------------------------------------
Ran 8 tests in 95.761s
OK
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/120063
Approved by: https://github.com/wconstab
2024-03-26 14:08:19 -07:00
d092857531 [Caffe2 CPU tests] Update CMakeLists.txt 2024-02-24 12:18:10 -08:00
6aad5e444a Fix missing MAST log when there is Unicode non-decodable text in logs (#119298)
Summary:
## Issue
When there is Unicode non-decodable text in logs, `tail_logger` will stop working afterwards, i.e. f527390102

In the example, the process stopped producing Python logs after 17:20:21 until the job finished
```
[0]:I0201 17:20:21.338000 3429 gen_ai/genie_projects/llm/metaformers/reward_model_score.py:335] Progress: 118 batches out of 512 total batches. 23.05 % | (gpu mem: 25.8GB, free CPU mem: 1387.8GB)
I0201 17:39:14 Stopping twtask-main.service with Service Result: [success] Exit Code: [exited] Exit Status: [0]
```
At the end, `UnicodeDecodeError` was thrown with no call stack.

## Fix
Use `errors="replace"` to avoid throwing exception when `UnicodeDecodeError` happens.
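
An illustrative sketch of the general fix (not the tail_logger code itself):
```python
# Decoding with errors="replace" substitutes U+FFFD for invalid bytes instead
# of raising, so the log tailer keeps running.
raw = b"Progress: 118 batches \xff\xfe out of 512"

try:
    raw.decode("utf-8")  # raises UnicodeDecodeError on the invalid bytes
except UnicodeDecodeError as exc:
    print("strict decode failed:", exc)

print(raw.decode("utf-8", errors="replace"))  # logging keeps going
```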

Test Plan: f528854819

Differential Revision: D53483644

Co-authored-by: Jack Zhang <jackzh@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119298
Approved by: https://github.com/XilunWu
2024-02-24 12:16:39 -08:00
c54ce9313b [c10d][flight recorder] store a copy of string in entry (#119837)
Summary:
Previously, we just stored the char pointer in the entry; the string is a
temp object and will be destructed by the time we want to dump/access it.

A quick fix is to store a copy of the string, without changing the
upstream char*.

An alternative is to change every profilingTitle into std::string; this,
however, would need a comprehensive overhaul of the code up to the
c10d::work layer above workNCCL, RecordFunction, etc.

We chose the first option for this change.

Resolve #119808

Pull Request resolved: https://github.com/pytorch/pytorch/pull/119837
Approved by: https://github.com/zdevito, https://github.com/wconstab
2024-02-14 11:38:10 -08:00
1fe59f4ef7 [c10d][flight recorder] remove unintended assignment of entry (#119748)
Summary:
auto& entry = entries_.at(*id % max_entries_);
entry = entries_.at(*id % max_entries_);
The second line above has the unintended consequence of invoking the copy/assignment
operator of the entry object, since the reference itself cannot be re-seated.

Also, what could cause the crash is that the entry reference can become invalid if entries_ is
resized by other threads, and this could result in a 'copy to a garbage
location'. The fix is to use a pointer, which can be re-assigned after
re-acquiring the lock.

Tests: python test/distributed/test_c10d_nccl.py NCCLTraceTest

Pull Request resolved: https://github.com/pytorch/pytorch/pull/119748
Approved by: https://github.com/wconstab, https://github.com/fegin
2024-02-14 11:38:10 -08:00
e693fb2bb1 [nccl flight recorder] record time we discover start and complete (#119249)
Some APIs like ncclCommAbort can cause NCCL kernels to finish even if
they were previously stuck. Because we can gather the trace buffer after
those calls, we can end up seeing some collectives marked completed even though
that completion happened several minutes after they started and clearly after
the timeout. This changes how we record state so that we keep track of the time
we discover a state change, so even if the collective eventually gets marked complete,
we can observe that it happened minutes after it was scheduled.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/119249
Approved by: https://github.com/wconstab
2024-02-14 11:38:10 -08:00
4fe510baf6 [NCCL PG] log NCCL comm at creation and abort (#118335)
Summary: It helps correlate NCCL PG with corresponding NCCL comm in separate logs.

Differential Revision: D53107647

Pull Request resolved: https://github.com/pytorch/pytorch/pull/118335
Approved by: https://github.com/wconstab
2024-02-14 11:38:04 -08:00
7c507b78c4 [c10d] Expose check method to Python for store via pybind (#116144)
Differential Revision: [D52310987](https://our.internmc.facebook.com/intern/diff/D52310987)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/116144
Approved by: https://github.com/wconstab
2024-01-31 11:08:27 -08:00
0019901601 [C10D] Fix nccl flightrecorder ignored dump timeout (#118142)
Don't call future.get() unless it's ready, because it waits.
Also, refactor the code a bit for simplicity.

We should do a follow-on PR to clean up the timeouts further, but this
should fix the glaring timeout bug.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/118142
Approved by: https://github.com/shuqiangzhang
ghstack dependencies: #118044, #118046, #118047
2024-01-26 16:48:00 -08:00
18be18535b [C10D] Make Flight Recorder report time_created in ns (#118047)
Addresses (6) from #117883

Pull Request resolved: https://github.com/pytorch/pytorch/pull/118047
Approved by: https://github.com/zdevito
ghstack dependencies: #118044, #118046
2024-01-26 16:48:00 -08:00
2729367313 [C10D] Add version tag to NCCL Flight Recorder Dump (#118046)
Addresses (3) from #117883

Pull Request resolved: https://github.com/pytorch/pytorch/pull/118046
Approved by: https://github.com/zdevito
ghstack dependencies: #118044
2024-01-26 16:48:00 -08:00
33537aae24 [C10D] Make NCCL Flight Recorder dump produce a dict (#118044)
Putting the list of entries into a particular key of a top-level dict
paves the way for adding other metadata as other top level keys.

Addresses 1 and 2 from #117883
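
A hedged sketch of inspecting the new dump shape, using the `_dump_nccl_trace()` Python binding mentioned earlier in this log (the exact module path is an assumption):
```python
# The dump is pickled bytes whose top level is now a dict rather than a bare
# list of entries, which leaves room for extra metadata keys.
import pickle
from torch._C._distributed_c10d import _dump_nccl_trace  # assumed location of the binding

dump = pickle.loads(_dump_nccl_trace())
print(type(dump))           # dict
print(sorted(dump.keys()))  # the entry list plus other top-level metadata keys
```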

Pull Request resolved: https://github.com/pytorch/pytorch/pull/118044
Approved by: https://github.com/zdevito
2024-01-26 16:48:00 -08:00
dcdb1337dd [C10D] Finer-grain nccl heartbeat, avoid false positive hangs (#118016)
Summary:
Previously, the heartbeat was incremented once per finishing a for loop over a list
of in-progress work items, under the assumption that either the processing
would be predictably quick, or it would hang completely.

In fact, there can be CUDA API contention that causes the processing of works
to slow down arbitrarily but not truly deadlock.  To guard against this, we
bump the heartbeat at the smallest unit of progress: one work item being
successfully processed.

Test Plan: CI

Differential Revision: D52973948

Pull Request resolved: https://github.com/pytorch/pytorch/pull/118016
Approved by: https://github.com/shuqiangzhang, https://github.com/kwen2501
2024-01-26 16:48:00 -08:00
9cf0f2bd59 Move getDurationFromFirstEvent to USE_C10D_NCCL ifdef (#117738)
Fixes #117517

Move the NCCL-related function *getDurationFromFirstEvent* under the USE_C10D_NCCL ifdef (related to https://github.com/pytorch/pytorch/issues/114575).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/117738
Approved by: https://github.com/wconstab, https://github.com/XilunWu
2024-01-26 16:48:00 -08:00
1d2e877c05 [ProcessGroup] Make watchdog check work queue more frequently (#117297)
Today the watchdog's sleep interval is 1 s. That's a bit long compared to the speed of a modern GPU link (or network link).

Take DDP and Ampere for example:

DDP's bucket size = 25 MB
Ampere's NVLink speed = 250 GB/s

25 MB / 250 GB/s = 100 ms.
So we are updating the interval to 100 ms.

Update:
25 MB / 250 GB/s = 0.1 ms
But let's see how it goes before making the checking even more aggressive.
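
A quick sanity check of the transfer-time estimate above:
```python
# Quick arithmetic check of the bucket-transfer time quoted in the Update.
bucket_bytes = 25e6        # DDP bucket size: 25 MB
link_bytes_per_s = 250e9   # NVLink bandwidth: 250 GB/s

transfer_ms = bucket_bytes / link_bytes_per_s * 1e3
print(f"{transfer_ms:.3f} ms")  # 0.100 ms, matching the corrected estimate
```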

Pull Request resolved: https://github.com/pytorch/pytorch/pull/117297
Approved by: https://github.com/fduwjj
2024-01-26 16:48:00 -08:00
f27b979b0c [c10d] Move the timeout dump check from watchdog to monitoring thread (#117168)
To avoid a potential hang in the watchdog thread, which would prevent us from dumping timeout debugging info, we move the check of the global collective timeout signals and the dumping of debugging info to the monitoring thread. We also need to ensure that we don't wait very long to check the timeout signal from the store; otherwise, we will miss the signal and won't get the debugging info dumped.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/117168
Approved by: https://github.com/wconstab
2024-01-26 16:48:00 -08:00
f30d6047ad [c10d] Add a timeout check interval variable for timeout dump (#117093)
The current timeout check frequency relies on the monitoring thread's timeout, which can be too long (even if we set it to 2 mins), so let's use a separate timeout variable which users can configure. And we only let the default PG check the TCPStore, so even more frequent checks should be fine. (Our stress test is performed every half second.)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/117093
Approved by: https://github.com/wconstab, https://github.com/kwen2501
2024-01-26 16:48:00 -08:00
75311510ef [C10D] Add duration_ms to flight recorder (#114817)
Measures the duration of a collective operation using nccl start/end
events and includes this duration (in ms) in the flight recorder data.

duration_ms will be an optional field, since it only works when
timing is enabled.  Currently timing is enabled when flight recorder
is enabled, but this is not a strict requirement.  Duration is also
not available for collectives not in a completed state.

Note: computing duration can lead to a hang due to calling cudaEventDuration when
the cuda driver queue is full.

We don't ever want dump() api to hang, since we might want dump to help
debug a hang. Hence, we only query durations from the watchdog thread,
and it's possible during dump() call, some of the most recent
collectives durations won't have been computed yet at time of dump.  We
make this tradeoff to ensure that dump() itself will never hang.
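
A generic illustration of the event-based timing involved (the real implementation lives in C++ and, as explained above, only queries durations from the watchdog thread):
```python
# Record start/end CUDA events around an op and query the elapsed time later;
# the matmul here merely stands in for the collective being timed.
import torch

if torch.cuda.is_available():
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    x = torch.randn(1024, 1024, device="cuda")
    start.record()
    y = x @ x
    end.record()
    end.synchronize()  # the duration is only valid once the end event completed
    print(f"duration_ms = {start.elapsed_time(end):.3f}")
```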

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114817
Approved by: https://github.com/fduwjj, https://github.com/zdevito
ghstack dependencies: #116905
2024-01-26 16:48:00 -08:00
dbd6094d05 [C10D](reland) Add GIL checker to NCCL watchdog monitor (#117312)
Whenever the monitor thread kills the watchdog thread for being stuck, we do so to save cluster time and get a faster failure signal, but we want to know more about why it got stuck.

One possible reason for watchdog stuckness is GIL contention, which could be ruled out or observed by making an attempt to acquire the GIL at exit time.

If we cannot acquire the GIL within a short time window (1s) we abort the attempt and report GIL contention, otherwise we report that GIL was acquired successfully.

Reland: uses a function pointer to avoid destructor ordering issues on dlclose. (It looks like the destructor for the std::function was being run after the libtorchpython lib was unloaded, leading to a crash.)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/117312
Approved by: https://github.com/zdevito
2024-01-26 16:48:00 -08:00
397b9d47e9 [ProcessGroup] Do not print NCCL_DEBUG before NCCL init (#117328)
In case /etc/nccl.conf is used, `NCCL_DEBUG` is not set in the sys env until NCCL initializes.
The deleted print point is before NCCL initialization, hence may be inaccurate.
This PR removes it and relies on the other print point which is after NCCL comm creation.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/117328
Approved by: https://github.com/wconstab, https://github.com/fduwjj
2024-01-26 16:48:00 -08:00
36a01a8ab9 [c10d][EZ] Add more logs in the destructor of ProcessGroupNCCL for better root cause investigation (#117291)
Add logs to the place where we inspect whether a hang happens.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/117291
Approved by: https://github.com/XilunWu, https://github.com/shuqiangzhang
2024-01-26 16:48:00 -08:00
ee336cf58a [c10d] Add comments to the rest environment variable within NCCLPG (#117092)
Not every environment variable within NCCLPG has comments; let's add comments to each of them.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/117092
Approved by: https://github.com/kwen2501
ghstack dependencies: #116545
2024-01-26 16:48:00 -08:00
a9e2e745d7 [c10d] Add extra sleep in waitForDumpOrTimeout to ensure enough time for all ranks dump debug info (#116545)
We added an extra sleep and made it configurable so that users can set an extra wait to ensure all ranks have dumped the debug info.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/116545
Approved by: https://github.com/wconstab
2024-01-26 16:48:00 -08:00
ab4df89eea [C10D] Rename flightrecorder key vars to avoid confusion (#116905)
Key vars are strings used as dict keys (e.g. duration_s was the string
"duration_ms").

The _s suffix suggested seconds, which was confusing since duration_s was a key string and
duration_ms is another variable holding a time value.

Now duration_key is "duration_ms".

Pull Request resolved: https://github.com/pytorch/pytorch/pull/116905
Approved by: https://github.com/zdevito
2024-01-26 16:48:00 -08:00
9d02ebe876 [c10d] To make ProcessGroupNCCL to use globalStore for coordination (#117075)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/117075
Approved by: https://github.com/wconstab
ghstack dependencies: #117074
2024-01-26 16:48:00 -08:00
b61e01cce9 [c10d] Add a recursive method to get the inner most store (#117074)
In c10d PG initialization, we wrap the TCPStore with multiple layers of PrefixStore, which adds layers of prefixes.

One example is:
"default_pg/0//cuda//timeout_dump"
When initializing the default PG, because there is no store passed, we first add the prefix "default_pg" to the TCPStore returned from rendezvous:

bdeaaad70c/torch/distributed/distributed_c10d.py (L1240)

We then add pg_name (aka 0) bdeaaad70c/torch/distributed/distributed_c10d.py (L1376) and device (aka cuda) bdeaaad70c/torch/distributed/distributed_c10d.py (L1387)

to the prefix. Then, when we call store_->set("timeout_dump"), the actual key used for writing into the TCPStore is "default_pg/0//cuda//timeout_dump".

For sub-PGs, things get even more interesting: we put the store wrapped with the default PG name into a cache:
bdeaaad70c/torch/distributed/distributed_c10d.py (L1517)

And when creating each sub-PG, its PG name is appended right after the cached store prefix. Example keys are:
'default_pg/0//10//cuda//timeout_dump', 'default_pg/0//12//cuda//timeout_dump', 'default_pg/0//38//cuda//timeout_dump', 'default_pg/0//39//cuda//timeout_dump' (10, 12, 38 and 39 are the PG names of the sub-PGs created).

The reason why the number in the name gets bumped up so high is that for each sub-PG creation, all ranks have to call the API together, and the global variable used for the PG name is bumped up monotonically:

bdeaaad70c/torch/distributed/distributed_c10d.py (L3666)

Similar things happen when hashing is used for PG names.

This has a potential issue: each sub-PG has its own instance of ProcessGroupNCCL, and if we want to set something global to notify all sub-PGs (and all ranks), this added prefix causes bugs. For example, if on sub-PG 1 we set a value in the TCPStore with the key 'default_pg/0//1//cuda//timeout_dump', while we use the default PG instances to check the TCPStore, which use the key 'default_pg/0//cuda//timeout_dump', the default PG instances will never get the notified signals. So in this PR, we add a new API to PrefixStore which gets the innermost non-PrefixStore for set and check. The next PR will make changes to the NCCL watchdog.
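
An illustrative Python sketch of the prefix layering (HashStore stands in for the rendezvous TCPStore, and the prefixes are simplified):
```python
# The same logical key resolves to different physical keys through different
# PrefixStore wrappers, which is why a signal set by a sub-PG was invisible to
# the default PG instances.
import torch.distributed as dist

base = dist.HashStore()
cached = dist.PrefixStore("default_pg/0", base)         # store cached at default-PG init

default_pg_store = dist.PrefixStore("cuda", cached)                      # used by the default PG
sub_pg_store = dist.PrefixStore("cuda", dist.PrefixStore("1", cached))   # used by sub-PG "1"

default_pg_store.set("timeout_dump", "1")
# sub_pg_store.get("timeout_dump") would look up a different physical key, so
# the new API walks down to the innermost non-PrefixStore for set/check.
```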

Pull Request resolved: https://github.com/pytorch/pytorch/pull/117074
Approved by: https://github.com/wconstab, https://github.com/H-Huang
2024-01-26 16:48:00 -08:00
f7ce61ba53 [C10D] Dump cpp stacktraces on heartbeat monitor timeout (#116717)
Summary:
If heartbeat monitor times out and kills the process, we want to know why.

It's convenient to use an internal tool for this, but we plan to later
integrate with torchelastic to call into pyspy or something else, which will be
both better (including py stacks) and compatible with OSS.

Test Plan: tested manually, observed c++ stacktraces were dumped

Reviewed By: fduwjj

Differential Revision: D52370243

Pull Request resolved: https://github.com/pytorch/pytorch/pull/116717
Approved by: https://github.com/zdevito
2024-01-26 16:48:00 -08:00
e7bae15ab1 [C10D] Make heartbeat_ atomic (#116702)
Summary:
Currently, the code is working. We know this because we observe heartbeat
timeouts.

However, there is a chance that if the code were refactored, the compiler could
optimize away the load of heartbeat_ inside heartbeatMonitor, and we wouldn't
know.

Using an atomic here is not really for thread synchronization, but more to ensure
that compiler optimizations (hoisting the read outside the loop) can never be
allowed to happen.  Again, we know this isn't currently happening, because if it
were, it would not be an intermittent failure, it would be an always-failure
(at least with a fixed compiler/platform).

I previously avoided an atomic because we didn't want shared locks between the heartbeat
monitor and the watchdog thread.  Why? If the watchdog held the lock and hung, the monitor
could also hang.  However, this really can't happen (AFAIK) when using an
atomic.

Test Plan: existing CI tests

Differential Revision: D52378257

Pull Request resolved: https://github.com/pytorch/pytorch/pull/116702
Approved by: https://github.com/fduwjj, https://github.com/zdevito
2024-01-26 16:48:00 -08:00
e71b422908 [C10D] Improve Heartbeat Monitor exit logs (#116268) (#116661)
Summary:

- add workMetaList_.size() so we know how many outstanding works there
  were when killing
- Print our first log before debuginfo dump instead of after, since it
  is clearer when reading the logs that we time out and then dump
- Organize the log strings: put them near where they are used

cc mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse H-Huang kwen2501 awgu penguinwu fegin XilunWu wanchaol fduwjj wz337 tianyu-l yf225

imported-using-ghimport

Test Plan: Imported from OSS

Reviewed By: fduwjj

Differential Revision: D52369167

Pulled By: wconstab

Pull Request resolved: https://github.com/pytorch/pytorch/pull/116661
Approved by: https://github.com/fduwjj
2024-01-26 16:48:00 -08:00
389940ce60 [c10d] Make DebugInfoWriter Singleton across all PG objects (#116489)
Previously, we had the writer registered to each NCCL PG (backend), so for every PG we have a NCCL PG instance; if a customized writer is used with multiple sub-PGs, the user has to register the writer for every backend, which is bad UX. Furthermore, the debug info is global, so it does not make sense to have a writer per instance. We even have a static mutex in `dumpDebuggingInfo` to ensure we serialize the writes; that makes it more obvious that we can make the writer a singleton, so that we only have one writer instance for all PG instances.

Although the rationale is clear, the implementation may vary a lot, so this PR is an RFC for now to see whether this implementation makes sense or not.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/116489
Approved by: https://github.com/kwen2501
2024-01-26 16:48:00 -08:00
b2237a7c85 [C10d] Fix Log Prefix in NCCLPG so that each instance gets its own prefix (#116520)
Somehow the log prefix only has ProcessGroup 0 and rank [global rank]. This does not give the expected result, which, per the comment, should be "a prefix that is unique to this process group and rank". So this PR fixes it and makes the prefix different for different sub-PGs.

The reason is that we made the prefix static, so it is shared across all NCCLPG instances, and whoever calls this function first sets `rank_` and `uid_` in the prefix. We always initialize PG 0 first; that's why we always see PG[0] + global ranks for all sub-PGs.

<img width="484" alt="image" src="https://github.com/pytorch/pytorch/assets/6937752/7fbb0226-7e25-4306-9cee-22e17b00bc8e">

Pull Request resolved: https://github.com/pytorch/pytorch/pull/116520
Approved by: https://github.com/wconstab
ghstack dependencies: #116218
2024-01-26 16:48:00 -08:00
ef5dfe3f3e [c10d] Fix timeout dump path write path overlap when there are multiple PGs (#116218)
Basically, we observed that if there are multiple PGs and the timeout happens on one of the sub-PGs, we somehow use the local rank in the dump file name. We realize that:
1. For setting the timeout signal in the store, any watchdog thread from any PG can do it.
2. For checking and dumping, only the watchdog thread of the default PG (which we always create and which contains all ranks, so there is no file-name conflict) is needed, because the store signal and the dumped debug info are both global.
3. Since the dump is global, we want to avoid the case where ranks from a sub-PG pollute logs from global ranks (local rank 0 vs global rank 0). So we use global ranks here to initialize the debug info writer. (Down the road, we are thinking about making it a singleton so that users only register it once for the multi-PG case.)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/116218
Approved by: https://github.com/wconstab
2024-01-26 16:48:00 -08:00
e303dc3c08 [c10d] Add stream info during nccl comm abort call (#116076)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/116076
Approved by: https://github.com/XilunWu
2024-01-26 16:48:00 -08:00
265efad2de [C10D] Increase TORCH_NCCL_HEARTBEAT_TIMEOUT_SEC (#116267)
Change default from 2 min to 10 min.

Why? Many cases of heartbeat timeout were reported, but increasing
timeout led to the same job hanging in a different place, suggesting
heartbeat kill was working well and not a false positive.  However, some
others reported jobs running fine with increased timeouts.  One such
case was investigated below, and suggests that indeed a 2 min timeout is
too aggressive.  While we have not fully root caused the issue, it
is better to avoid killing jobs that would otherwise complete.

Current theory is that watchdog is not totally deadlocked, but is slowed
down in its processing of work objs due to some intermittent resource
contention.  Hence, allowing more time is more of a workaround than a
fix.

Debug/Analysis:
https://docs.google.com/document/d/1NMNWoTB86ZpP9bqYLZ_EVA9byOlEfxw0wynMVEMlXwM
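
A minimal sketch of overriding the timeout via the environment variable named in the title (it must be set before the NCCL process group is created):
```python
# Sketch: raise the heartbeat timeout; the value is read at PG construction.
import os

os.environ["TORCH_NCCL_HEARTBEAT_TIMEOUT_SEC"] = "600"  # 10 minutes, the new default

import torch.distributed as dist
# dist.init_process_group("nccl")  # picks the setting up when the NCCL PG is created
```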

Differential Revision: [D52368791](https://our.internmc.facebook.com/intern/diff/D52368791)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/116267
Approved by: https://github.com/fduwjj
2024-01-26 16:48:00 -08:00
60f0455905 [C10D] Make all PGNCCL LOG usages use logPrefix() (#116060)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/116060
Approved by: https://github.com/fduwjj
ghstack dependencies: #116059
2024-01-26 16:48:00 -08:00
4898313791 [C10D] Add logPrefix to abortCommsFromMap (#116059)
Prints additional info such as PG ID/Rank.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/116059
Approved by: https://github.com/fduwjj
2024-01-26 16:48:00 -08:00
f4da9adf6b [C10D] Add waitForDumpOrTimeout to log on dump abandonment (#115876)
Helps call attention to any cases where the dump actually times out.

The timeout is likely to hit if we run into slow stacktrace processing.

Log any exceptions encountered in the background thread, but don't raise
them; we're already willing to abandon the debug dump, and want to
proceed with our normal execution (in the case of dumppipe) or shutdown
process (when dumping happens on timeout and shutdown is already
initiated).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115876
Approved by: https://github.com/zdevito
ghstack dependencies: #115807
2024-01-26 16:48:00 -08:00
8f7f35273e [c10d] Polish NCCL PG monitor thread log message (#115888)
We turned on monitor thread by default in https://github.com/pytorch/pytorch/pull/112518, and we want the error message that is displayed when the monitor kills the process to be more informative.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115888
Approved by: https://github.com/wconstab
2024-01-26 16:48:00 -08:00
44ec9612ed [C10D] Log PG size in init log (#115807)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115807
Approved by: https://github.com/XilunWu
2024-01-26 16:48:00 -08:00
4d3bea2b29 [nccl flight recorder] nullptr profiling name (#115851)
Sometimes the profiling name can be a nullptr, which
throws on conversion to std::string. This adds a check.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115851
Approved by: https://github.com/wconstab
2024-01-26 16:48:00 -08:00
0bcdddc3c1 [C10D] Make dumpDebuggingInfo share a mutex across PGs (#115803)
The mutex was originally added to avoid racing to dump debuginfo,
where a race in this case would result in a corrupted dump file.

The reason a mutex helps is that it forces all dump requests to be
serialized, so that an observer would either see an in-progress file, a
complete file, or no file.  Without a mutex, a fourth state is possible
(a file that has been written to by multiple threads and is invalid).

Because the mutex was a ProcessGroupNCCL class member, and each PG
instance has its own watchdog thread that can launch a dump, it was not
doing its job.  Making the mutex static shares it between instances of
the class and ensures serialization of dumps triggered by any PG.

(Note: dumps triggered by different PGs have the same, global contents
anyway- there is only one global flight recorder, so it doesn't matter
who triggers it.)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115803
Approved by: https://github.com/kwen2501
ghstack dependencies: #115771, #115798, #115800, #115801
2024-01-26 16:48:00 -08:00
28b6220312 [C10D] Change PGNCCL logs to prefix [PG {} Rank {}] (#115801)
Adds a PG {process group uid} prefix component to logs.

This is helpful in situations where there are multiple process groups,
and rank information by itself is confusing.  (For example, rank0 on PG1
may correspond to rank3 on PG0.  People may assume 'rank0' references
the global (PG0) world, but it may reference a sub-PG.  Prefacing the PG
helps clarify this.)

Does NOT change logs from inside WorkNCCL functions, since WorkNCCL
doesn't know what PG ID it corresponds to. Will address these logs
separately.

Example:

```
[I ProcessGroupNCCL.cpp:787] [PG 0 Rank 0] ProcessGroupNCCL initialization ...
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115801
Approved by: https://github.com/fduwjj
ghstack dependencies: #115771, #115798, #115800
2024-01-26 16:48:00 -08:00
210b7b65e2 [C10D] Refactor NCCL logs to use common prefix helper (#115800)
Put the repeated code that string formats [Rank {rank}] in one place.

Sets up for the next PR that also adds more info to this prefix.

(Does not change exception messages, which could be done as well;
exception messages are not formatted quite the same way. This tries
instead to keep from changing log behavior (in this PR) and only
refactor code.)

Did limited testing (some logs were observed OK).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115800
Approved by: https://github.com/fduwjj
ghstack dependencies: #115771, #115798
2024-01-26 16:48:00 -08:00
4da10b5cd3 [C10D] Only open NCCL dump pipe file once per process (#115798)
The NCCL flight recorder is per-process (it is shared by all
processgroups), but individual process groups used to construct their
own pipe for being signaled to dump the flight recorder.

This ensures that only one pipe per process is created, by only creating
the pipe on the first ProcessGroup (uid_ == 0) which should be the world
group.

Filenames are still keyed off of rank, but this should now be global
rank instead of sub-pg rank, making the filenames unique across the
whole trainer process.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115798
Approved by: https://github.com/zdevito
ghstack dependencies: #115771
2024-01-26 16:48:00 -08:00
f09763814f [C10D] Make DumpPipe disabled when FlightRecorder disabled (#115771)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115771
Approved by: https://github.com/fduwjj
2024-01-26 16:48:00 -08:00
80923ed5a6 [C10D] Make DumpPipe pipe file configurable (#115770)
Add TORCH_NCCL_DEBUG_INFO_PIPE_FILE env, allowing separate pipe file
location from dump file location.

Defaults PIPE_FILE to empty, meaning disabled.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115770
Approved by: https://github.com/zdevito
2024-01-26 16:48:00 -08:00
0ff155fb65 Fix SDPA for SAM (#115636)
Addresses the regression for Segment Anything Fast in https://github.com/pytorch-labs/segment-anything-fast/issues/99
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115636
Approved by: https://github.com/soulitzer, https://github.com/ani300
2023-12-12 18:52:38 +00:00
8885128dcc Fix backward for SDPA NT jagged layout (#115576)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115576
Approved by: https://github.com/jbschlosser, https://github.com/ani300
2023-12-12 18:35:40 +00:00
7553c49514 [S382174] Fix distributed debug w/ non-equal split (#115483)
Summary:
In collectives, it's possible to have a non-equal split, which has a different implementation, and the output tensor sizes will be different, e.g. https://www.internalfb.com/code/fbsource/[460afb1172b5]/fbcode/caffe2/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp?lines=3104. However, TORCH_DISTRIBUTED_DEBUG=DETAIL assumes the output tensor sizes are the same, does the check, and fails the job if they don't match: https://fburl.com/code/mhte9ty8. The c10d code should handle this.

Ideally we should also check the input sizes across ranks and make sure they're the same. Maybe in the next diff.

Test Plan: Test torchrec's TWRW w/ non-even split and it's working now.

Reviewed By: zhangruiskyline

Differential Revision: D52010942

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115483
Approved by: https://github.com/kwen2501, https://github.com/fegin, https://github.com/XilunWu
2023-12-12 18:02:05 +00:00
d521857411 Terminate handler (#101332)
Fixes #50051.
This PR is based on #50320 and addresses the last feedback.
On Windows it is enabled by default. It can be enabled or disabled via the USE_CUSTOM_TERMINATE env variable.

This PR adds support for overriding the terminate handler in order to log uncaught exceptions in threads.
If an exception is thrown and not caught, it will print <Unhandled exception caught in c10/util/AbortHandler.h>.
The point of doing this is that in issue #50051, exceptions were thrown but not logged. With this logging system it will be easier to debug them in the future.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101332
Approved by: https://github.com/albanD, https://github.com/malfet
2023-12-12 17:55:27 +00:00
36b5136270 [inductor] Don't print disable_cudagraphs_reason when cudagraphs is disabled (#115489)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115489
Approved by: https://github.com/yanboliang
2023-12-12 17:50:18 +00:00
670eb83573 Enable test_sparse_addmm for crossref tests (#115536)
Fixes https://github.com/pytorch/pytorch/issues/97284

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115536
Approved by: https://github.com/cpuhrsch
2023-12-12 17:26:40 +00:00
a8dc9d8e35 [8/n] Update XNNPACK Version Part 8 Everything Remaining to get it to work (#115587)
> **__Note:__** The XNNPACK upgrade is too large, on the order of **40k** files and **10m** lines of code, thus we break the update of the library into multiple parts. All parts [1 - 6/n] must be landed together for it to work. ***This also means that if there is a revert, please revert the entire stack.***

This change contains everything remaining that is required for the XNNPACK version update to work.

Differential Revision: [D52044420](https://our.internmc.facebook.com/intern/diff/D52044420/)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115587
Approved by: https://github.com/digantdesai
2023-12-12 17:17:19 +00:00
e918461377 Add instructions for generating optimal Triton kernel parameters of bsr_dense_addmm (#115504)
As in the title.

In addition, enable verbose output when executing the torch/sparse/_triton_ops_meta.py script.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115504
Approved by: https://github.com/cpuhrsch
ghstack dependencies: #115499
2023-12-12 16:44:51 +00:00
32286512cc Add tune_bsr_dense_addmm as an API to find optimal triton kernel parameters for bsr_dense_addmm (#115499)
As in the title.

In addition:
- improve the algorithm for finding a minimum of operation timings: break the inner loop early when a new minimum candidate is found
- add tests and fix bugs

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115499
Approved by: https://github.com/cpuhrsch
2023-12-12 16:44:51 +00:00
40dc0580a6 [inductor] De-duplicate triton helper functions (#115546)
Previously, if two calls to cumsum were generated in the same triton kernel,
we would generate identical helper functions with different names. Now this
recognizes identical functions and only defines each one once. To do this I defer
choosing the name until after codegen.
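
A repro-style sketch of the scenario above (a toy function; whether both scans land in the same Triton kernel depends on the scheduler, and a CUDA build with Triton is assumed):
```python
# Two cumsums that may be fused into one Triton kernel; when they do share a
# kernel, the generated scan helper is now emitted only once.
import torch

@torch.compile
def two_scans(a, b):
    return a.cumsum(0) + b.cumsum(0)

if torch.cuda.is_available():
    out = two_scans(torch.randn(1024, device="cuda"), torch.randn(1024, device="cuda"))
```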

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115546
Approved by: https://github.com/lezcano
ghstack dependencies: #109132
2023-12-12 16:30:50 +00:00
02196c21ac [inductor] Parameterize ir.Scan on combine_fn (#109132)
This replaces `tl.cumsum` and `tl.cumprod` with calls to `tl.associative_scan`
where the combine function is generated from inductor IR.

So before we had:
```python
@triton.jit
def triton_(in_ptr0, out_ptr0, xnumel, rnumel, XBLOCK : tl.constexpr):
    xnumel = 20
    rnumel = 30
    RBLOCK: tl.constexpr = 32
    xoffset = tl.program_id(0) * XBLOCK
    xindex = xoffset + tl.arange(0, XBLOCK)[:, None]
    xmask = xindex < xnumel
    rindex = tl.arange(0, RBLOCK)[None, :]
    rmask = rindex < rnumel
    r1 = rindex
    x0 = xindex
    tmp0 = tl.load(in_ptr0 + (r1 + (30*x0)), rmask & xmask, other=0).to(tl.float32)
    tmp1 = tl.broadcast_to(tmp0, [XBLOCK, RBLOCK])
    tmp2 = tl.where(rmask & xmask, tmp1, 0)
    tmp3 = tl.cumsum(tmp2, 1)
    tl.store(out_ptr0 + (r1 + (30*x0)), tmp3, rmask & xmask)
```

Now we have:
```python
@triton.jit
def _triton_helper_fn0(arg0, arg1):
    tmp0 = arg0 + arg1
    return tmp0

@triton.jit
def triton_(in_ptr0, out_ptr0, xnumel, rnumel, XBLOCK : tl.constexpr):
    xnumel = 20
    rnumel = 30
    RBLOCK: tl.constexpr = 32
    xoffset = tl.program_id(0) * XBLOCK
    xindex = xoffset + tl.arange(0, XBLOCK)[:, None]
    xmask = xindex < xnumel
    rindex = tl.arange(0, RBLOCK)[None, :]
    rmask = rindex < rnumel
    r1 = rindex
    x0 = xindex
    tmp0 = tl.load(in_ptr0 + (r1 + (30*x0)), rmask & xmask, other=0).to(tl.float32)
    tmp1 = tl.broadcast_to(tmp0, [XBLOCK, RBLOCK])
    tmp2 = tl.where(rmask & xmask, tmp1, 0)
    tmp3 = tl.associative_scan(tmp2, 1, _triton_helper_fn0)
    tl.store(out_ptr0 + (r1 + (30*x0)), tmp3, rmask & xmask)
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109132
Approved by: https://github.com/lezcano
2023-12-12 16:30:50 +00:00
d5286d7ea8 [export] Add canonical form for differentiating IR (#115589)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115589
Approved by: https://github.com/suo
2023-12-12 16:21:57 +00:00
de4b2e59a7 [PyTorch] AOTI: add more basic aoti_torch getters (#112799)
There was a lot of simple information about tensors that we couldn't get. In
particular, we didn't know the lengths of the arrays returned by sizes
and strides.

Differential Revision: [D50949929](https://our.internmc.facebook.com/intern/diff/D50949929/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112799
Approved by: https://github.com/desertfire, https://github.com/aakhundov
ghstack dependencies: #112116, #112174, #112405, #112798
2023-12-12 15:56:33 +00:00
c5c4d81b1b Switched stale workflow to linux.large.arc (#115635)
Switched stale workflow to linux.large.arc
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115635
Approved by: https://github.com/jeanschmidt
2023-12-12 15:33:59 +00:00
4fafc36c33 [MPS] Fix sum and prod for complex types (#115554)
By not force-casting dtype to float

Test plan: `python -c "import torch;print(torch.linspace(-3.0, 3.0, 50, dtype=torch.cfloat, device='mps').sqrt().sin().sum())"`

Before:
```
tensor(21.1778+0.j, device='mps:0')
```
After
```
tensor(21.1778+39.1377j, device='mps:0')
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115554
Approved by: https://github.com/lezcano
ghstack dependencies: #115512, #115513
2023-12-12 15:04:45 +00:00
07f03b4a62 [MPS] Add support for MPSDataTypeComplexFloat[16|32] (#115513)
But limit it to MacOS Sonoma +

Before, calling `torch.cat` with complex types failed, but now it works.
Before:
```
% python -c "import torch;print(torch.cat([torch.rand(3, 3, dtype=torch.cfloat).to('mps'), torch.rand(3, 3, dtype=torch.cfloat).to('mps')]))"
TypeError: Trying to convert ComplexFloat to the MPS backend but it does not have support for that dtype.
```
After:
```
% python -c "import torch;print(torch.cat([torch.rand(3, 3, dtype=torch.cfloat).to('mps'), torch.rand(3, 3, dtype=torch.cfloat).to('mps')]))"
tensor([[0.4857+0.0030j, 0.9375+0.8630j, 0.3544+0.9911j],
        [0.5293+0.8652j, 0.8440+0.1991j, 0.5152+0.8276j],
        [0.0136+0.7469j, 0.1403+0.4761j, 0.2943+0.0896j],
        [0.6458+0.0035j, 0.3579+0.4577j, 0.1723+0.1508j],
        [0.4420+0.3554j, 0.4396+0.7272j, 0.2479+0.1191j],
        [0.3895+0.2292j, 0.7886+0.1613j, 0.9243+0.4180j]], device='mps:0')
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115513
Approved by: https://github.com/kulinseth
ghstack dependencies: #115512
2023-12-12 15:04:45 +00:00
21cf6e76c2 Revert "Use linux.large.arc for stale workflow (#115440)"
This reverts commit dadb3694ffaa2a0bfe78516c294a46566430c1ad.

Reverted https://github.com/pytorch/pytorch/pull/115440 on behalf of https://github.com/DanilBaibak due to Did not merge properly ([comment](https://github.com/pytorch/pytorch/pull/115440#issuecomment-1852126050))
2023-12-12 14:20:29 +00:00
dadb3694ff Use linux.large.arc for stale workflow (#115440)
* Try linux.large.arc for stale workflow

* Run stale workflow on PR changes

* Added arc runner label to the list of self-hosted runners

* Added concurrency for linux-job

* Cleanup

* Added workflow_dispatch for testing purpose
2023-12-12 15:11:09 +01:00
7350dcb307 [CI] Fix lint errors on master (#115627)
Differential Revision: [D52073432](https://our.internmc.facebook.com/intern/diff/D52073432)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115627
Approved by: https://github.com/atalman
2023-12-12 13:53:14 +00:00
bc51a0c22f Revert "[PyTorch] AOTI: add more basic aoti_torch getters (#112799)"
This reverts commit 3de2596abed9717a166635b48126302fcf46527a.

Reverted https://github.com/pytorch/pytorch/pull/112799 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/112799#issuecomment-1852076887))
2023-12-12 13:52:34 +00:00
f98b0f3ebc Add bfloat16 support to torch.sparse.addmm for CPU (#115535)
Fixes https://github.com/pytorch/pytorch/issues/73145.
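
A minimal sketch exercising the added dtype support on CPU:
```python
# bfloat16 inputs to torch.sparse.addmm on CPU (previously unsupported).
import torch

dense = torch.randn(3, 4, dtype=torch.bfloat16)
mat1 = torch.randn(3, 5, dtype=torch.bfloat16).to_sparse()  # sparse COO operand
mat2 = torch.randn(5, 4, dtype=torch.bfloat16)

out = torch.sparse.addmm(dense, mat1, mat2)  # dense + mat1 @ mat2
print(out.dtype, tuple(out.shape))           # torch.bfloat16 (3, 4)
```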

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115535
Approved by: https://github.com/cpuhrsch
2023-12-12 13:26:33 +00:00
d6f8850653 Revert "[Export] Test non-strict mode on existing test cases (#115399)"
This reverts commit 36527df344c0c33dae8bc6c94eded8646013b736.

Reverted https://github.com/pytorch/pytorch/pull/115399 on behalf of https://github.com/atalman due to OSSCI oncall, broke CI tests ([comment](https://github.com/pytorch/pytorch/pull/115399#issuecomment-1851988651))
2023-12-12 13:02:18 +00:00
a8acd6c410 Add Half support for AvgPool2d on CPU (#109578)
Add Half support for AvgPool2d (both channels last and channels first) on CPU
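
A quick sanity-check sketch of the added dtype support, in both memory formats:
```python
# Half-precision AvgPool2d on CPU, channels-first and channels-last.
import torch

pool = torch.nn.AvgPool2d(kernel_size=3, stride=2)
x = torch.randn(2, 3, 16, 16, dtype=torch.half)     # channels first
x_cl = x.to(memory_format=torch.channels_last)      # channels last

y, y_cl = pool(x), pool(x_cl)
print(y.dtype, torch.allclose(y.float(), y_cl.float(), atol=1e-3))
```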

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109578
Approved by: https://github.com/mingfeima, https://github.com/albanD
2023-12-12 12:59:47 +00:00
92fd3927b0 [export][reland] Add math.* ops to pass base (#115559)
Reland of https://github.com/pytorch/pytorch/pull/115271/
Fixes https://github.com/pytorch/pytorch/issues/115209
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115559
Approved by: https://github.com/zhxchen17, https://github.com/atalman
ghstack dependencies: #115556, #115557, #115558
2023-12-12 10:46:41 +00:00
36527df344 [Export] Test non-strict mode on existing test cases (#115399)
Summary:
The Dynamo test methodology provides a good example of applying various
treatments to the same set of test cases. A pitfall is the global config,
which could easily be modified somewhere. Here we change the behavior of
the export API by hijacking it with self-defined code.

To support the non-strict test suite, `strict=False` is explicitly
passed into the export API when it's called with or without the strict arg.

* For existing strict test cases that fail, non-strict also fails.
* For cases that pass in strict mode but fail in non-strict mode, we mark them as
`@testing.expectedFailureNonStrict`.
* Moreover, I manually checked the failure reasons, and some of them are not
related to the nn.Module assertion exception. I mark them as `# Need to fix
for non-strict mode`.

Test Plan:
python test/export/test_export_nonstrict.py
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115399
Approved by: https://github.com/zhxchen17, https://github.com/tugsbayasgalan
2023-12-12 07:11:53 +00:00
fdf814c6ca Revert "[MPS] Add support for MPSDataTypeComplexFloat[16|32] (#115513)"
This reverts commit a4bb4a237348ff8d688e43ba542ee59a9d7ed4a6.

Reverted https://github.com/pytorch/pytorch/pull/115513 on behalf of https://github.com/malfet due to Broke Mac x86 periodic builds ([comment](https://github.com/pytorch/pytorch/pull/115513#issuecomment-1851398773))
2023-12-12 06:50:47 +00:00
46694e92b7 Revert "[MPS] Fix sum and prod for complex types (#115554)"
This reverts commit 8b28380c8ed5b5bfe479392bcffeccf8b89be328.

Reverted https://github.com/pytorch/pytorch/pull/115554 on behalf of https://github.com/malfet due to Broke MacOS x86 builds ([comment](https://github.com/pytorch/pytorch/pull/115554#issuecomment-1851395982))
2023-12-12 06:47:39 +00:00
f28687dfb2 Do not use pytorchbot-env from upload-test-stats (#115606)
As it was only needed to check our token rate limits

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115606
Approved by: https://github.com/huydhn
2023-12-12 06:42:33 +00:00
1eca63c6ac [DeviceMesh] Move helper function 'get_mesh_dim_by_name' to MeshEnv class (#115572)
Move the helper function `get_mesh_dim_by_name` outside of the DeviceMesh class to keep the public class cleaner.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115572
Approved by: https://github.com/XilunWu, https://github.com/wanchaol
2023-12-12 06:29:46 +00:00
3de2596abe [PyTorch] AOTI: add more basic aoti_torch getters (#112799)
There was a lot of simple information about tensors that we couldn't get. In
particular, we didn't know the lengths of the arrays returned by sizes
and strides.

Differential Revision: [D50949929](https://our.internmc.facebook.com/intern/diff/D50949929/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112799
Approved by: https://github.com/desertfire, https://github.com/aakhundov
ghstack dependencies: #112116, #112174, #112405, #112798
2023-12-12 06:19:45 +00:00
2b323e61ad [PyTorch] AOTI: Use static_cast, not dynamic_cast (#112798)
dynamic_cast is for when we aren't certain about the type. We are certain (and will crash anyway if we're wrong).

Differential Revision: [D50812978](https://our.internmc.facebook.com/intern/diff/D50812978/)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112798
Approved by: https://github.com/chenyang78, https://github.com/desertfire, https://github.com/jansel, https://github.com/khabinov
ghstack dependencies: #112116, #112174, #112405
2023-12-12 06:19:45 +00:00
ca52195112 [PyTorch] AOTI: Avoid aoti_torch_data_ptr calls for constants at inference time (#112405)
Cache aoti_torch_get_data_ptr at constants update time.

Differential Revision: [D50708982](https://our.internmc.facebook.com/intern/diff/D50708982/)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112405
Approved by: https://github.com/chenyang78, https://github.com/desertfire, https://github.com/khabinov
ghstack dependencies: #112116, #112174
2023-12-12 06:19:45 +00:00
24c67fe8cf [PyTorch] AOTI: Emit static constexpr int array vars when possible (#112174)
No need to populate a stack-based array for a shape/stride array when it's statically known.

Differential Revision: [D50699889](https://our.internmc.facebook.com/intern/diff/D50699889/)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112174
Approved by: https://github.com/chenyang78, https://github.com/desertfire, https://github.com/jansel
ghstack dependencies: #112116
2023-12-12 06:19:45 +00:00
ff6f987adc [PyTorch] Replace cached thread_locals with stack allocation in AOTI (#112116)
This changes cached thread_local tensors to stack-allocated buffers. Since we were incidentally caching output in a thread_local, I had to add manual thread_local caching of outputs, which I implemented by caching a buffer and a Tensor whose storage is that buffer and then just memcpying the result into the cached buffer every time. Ideally, memory planning would be able to identify allocations that are the backing storage for outputs, but this should be good enough in the absence of planning.

Differential Revision: [D50416438](https://our.internmc.facebook.com/intern/diff/D50416438/)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112116
Approved by: https://github.com/jansel, https://github.com/desertfire
2023-12-12 06:19:45 +00:00
405a0040cf Adds tool to visualize sharding (#114307)
This pull request adds a tool to visualize sharding. It uses the device_mesh and placement details to construct a visualization of the split of a torch dtensor.

Things to fix:

- [x] This implementation only uses the first element of the placement tuple; when can there be more than one element?
- [x] The calculation of the split happens here, but maybe it is already done somewhere internally in the Shard class; can we directly call that here?

Fixes #108746

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114307
Approved by: https://github.com/wanchaol
2023-12-12 06:18:03 +00:00
65651d970b Optimize the copy of Half to Float and Float to Half on CPU (#103148)
### Description
Optimize the copy of Half to Float and Float to Half on CPU.

### Testing

Single core:
shape | fp16 -> fp32 / ms | fp32 -> fp16 / ms | bf16 -> fp32 / ms | fp32 -> bf16 / ms
-- | -- | -- | -- | --
size: (1, 777) | 0.00345 | 0.00344 | 0.00411 | 0.00410
size: (2, 512) | 0.00355 | 0.00344 | 0.00431 | 0.00400
size: (10, 555) | 0.00473 | 0.00391 | 0.00562 | 0.00477
size: (1, 2048, 1024) | 0.488 | 0.480 | 0.498 | 0.499
size: (32, 100, 777) | 0.584 | 0.568 | 0.571 | 0.587

28 cores:
shape | fp16 -> fp32 / ms | fp32 -> fp16 / ms | bf16 -> fp32 / ms | fp32 -> bf16 / ms
-- | -- | -- | -- | --
size: (10, 555) |  0.00472 | 0.00369 | 0.00576 |  0.00481
size: (1, 2048, 1024) |  0.0189 | 0.0188 | 0.0173 | 0.0251
size: (64, 512, 1024) | 3.159 | 2.375 |  3.152 | 2.358
size: (32, 100, 777) | 0.0225 | 0.0195 | 0.0193 | 0.0261
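
A rough sketch of how a conversion copy from the table can be timed (shape taken from the table above; the exact numbers depend on the machine):
```python
# Time an fp32 -> fp16 copy with torch.utils.benchmark.
import torch
from torch.utils.benchmark import Timer

x = torch.randn(1, 2048, 1024)  # fp32 source
t = Timer(stmt="x.to(torch.half)", globals={"x": x, "torch": torch})
print(t.timeit(100))
```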

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103148
Approved by: https://github.com/jgong5, https://github.com/cpuhrsch
2023-12-12 05:57:52 +00:00
b6a4866330 [export][reland][refactor][3/n] Move unlift to separate file (#115558)
Reland of https://github.com/pytorch/pytorch/pull/114787

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115558
Approved by: https://github.com/zhxchen17, https://github.com/atalman
ghstack dependencies: #115556, #115557
2023-12-12 05:37:07 +00:00
36199747f3 [export][reland][refactor][2/n] Move tracing logic (#115557)
Reland of https://github.com/pytorch/pytorch/pull/114768
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115557
Approved by: https://github.com/zhxchen17
ghstack dependencies: #115556
2023-12-12 05:37:07 +00:00
dd9a989b83 [export][reland][refactor][1/n] Split dynamic shapes (#115556)
Reland of https://github.com/pytorch/pytorch/pull/114764
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115556
Approved by: https://github.com/zhxchen17
2023-12-12 05:36:41 +00:00
744d74c456 [inductor][optimus] enable smart fusion (#115471)
Summary: Enable gmm smart fusion in D51698686

Test Plan: buck test

Differential Revision: D52002137

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115471
Approved by: https://github.com/mengluy0125
2023-12-12 05:04:36 +00:00
fbb744fd49 [dtensor] enable radam foreach optimizer (#115566)
As titled, test both non-foreach and foreach optim

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115566
Approved by: https://github.com/XilunWu
ghstack dependencies: #115297, #115564, #115565
2023-12-12 03:57:00 +00:00
c322e5b5e9 [dtensor] add test for nadam optimizer (#115565)
as titled, foreach ops already supported, just add test

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115565
Approved by: https://github.com/XilunWu
ghstack dependencies: #115297, #115564
2023-12-12 03:57:00 +00:00
4bd661c472 [dtensor] enable adadelta foreach optimizer (#115564)
as titled

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115564
Approved by: https://github.com/XilunWu
ghstack dependencies: #115297
2023-12-12 03:56:55 +00:00
8a27352d6b [dtensor] add a implicit replication flag (#115297)
This PR adds experimental implicit replication support for DTensor to
inter-operate with torch.Tensor: under this context manager, DTensor
can work together with torch.Tensor by assuming the torch.Tensor's
sharding layout is replicated.

Note that this is risky for DTensor, so we don't turn it on by default;
but for certain cases where the tensor is for sure replicated, users can
use this to allow DTensor and Tensor computation to work together.
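
A rough usage sketch (the exact import path for the context manager is an assumption; the point is that plain tensors are treated as replicated only inside it):
```
# Hedged sketch: exact module path for implicit_replication is an assumption.
import torch
import torch.distributed as dist
from torch.distributed._tensor import DeviceMesh, Shard, distribute_tensor
from torch.distributed._tensor.experimental import implicit_replication

dist.init_process_group("gloo")
mesh = DeviceMesh("cpu", list(range(dist.get_world_size())))
dt = distribute_tensor(torch.randn(8, 8), mesh, placements=[Shard(0)])
plain = torch.ones(8, 8)

with implicit_replication():
    out = dt + plain  # plain tensor assumed replicated; would error outside the context
```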

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115297
Approved by: https://github.com/awgu
2023-12-12 03:56:48 +00:00
c70f995b5c [DeviceMesh] Add mesh_dim_names to DeviceMesh __repr__ if it exists (#115579)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115579
Approved by: https://github.com/wanchaol
2023-12-12 02:18:34 +00:00
0fc04e274d [inductor] Fix an aliased output bug (#115373)
Summary: for https://github.com/pytorch/pytorch/issues/97083, when

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115373
Approved by: https://github.com/jansel
2023-12-12 01:18:59 +00:00
89ee3af076 [Reland][Dynamo] Don't log compilation metrics for PyTorch unit tests (#115571)
Reland #115452, which was reverted to simplify a merge conflict with #115386

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115571
Approved by: https://github.com/yanboliang
2023-12-12 01:15:54 +00:00
064846dbc2 [cpu] flash attention optimization (#115151)
### Modifications
- **EXP**: Add a fast version with a reduced accuracy (ULP20) to vec exp `exp_u20` and use it in flash attention.
- **FUSION**: Do fusion for `softmax` ops.
- **SCALE**: Move the calculation of `scaling_factor` after `gemm`.

### Performance
_Model: Stable Diffusion V2.1_

| Version | BF16 Kernel latency (s) | BF16 speedup | FP32 Kernel latency (s) | FP32 speedup |
| ----- | ----- | ----- | ----- | ----- |
| PT | 15.865 |  | 35.362 |  |
| PT + EXP | 12.518 | 21.10% | 19.327 | 45.35% |
| PT + EXP + FUSION | 11.774 | 25.79% | 18.306 | 48.23% |
| PT + EXP + FUSION + SCALE | 11.053 | 30.33% | 18.360 | 48.08% |

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115151
Approved by: https://github.com/jgong5, https://github.com/drisspg
2023-12-12 01:09:55 +00:00
0379c11248 [c10d] Enable PG NCCL monitor thread by default (#115577)
We added a monitor thread in NCCL PG in https://github.com/pytorch/pytorch/pull/112518. To summarize what we are doing in the monitor thread: it listens to the heartbeat from the watchdog thread and detects unhealthy NCCL watchdog hangs (due to several reasons such as nccl/cuda API bugs or unexpected blocking behaviors). This is the last resort to ensure that we don't silently keep the training job running for hours.

We didn't enable this feature by default at first, since we wanted to perform more due diligence and have some customers try it out. So far, we haven't seen any obstacle that blocks turning on this feature, and we have received positive feedback from users. We have now decided to turn it on by default in this PR.

If this feature turns out not to work as expected and disturbs one's training process, one can set `TORCH_NCCL_ENABLE_MONITORING=0` to disable it. Please kindly file an issue with us so that we can see if we missed any corner cases during the design.
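
For reference, a minimal sketch of opting out (the environment variable comes from the paragraph above and must be set before the process group is created):
```
# Sketch: disable the NCCL watchdog monitor thread via the documented env var.
import os
os.environ["TORCH_NCCL_ENABLE_MONITORING"] = "0"  # must be set before PG init

import torch.distributed as dist
dist.init_process_group(backend="nccl")  # run under torchrun so rank/world size are set
```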

Differential Revision: [D52045911](https://our.internmc.facebook.com/intern/diff/D52045911)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115577
Approved by: https://github.com/wconstab, https://github.com/kwen2501
2023-12-12 00:45:54 +00:00
6988e40b48 [quant][fx] Lower operator.matmul in convert_fx (#113954)
Summary: We support lowering `torch.matmul` but not
`operator.matmul`. This commit adds support for the latter,
which enables lowering the shorthand `@`. This addresses
https://github.com/pytorch/pytorch/issues/111450.
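
A minimal sketch of the path being exercised (the qconfig choices here are assumptions, not taken from the PR):
```
# Sketch: FX graph mode quantization of a module that uses the `@` shorthand.
import torch
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import convert_fx, prepare_fx

class MatmulModule(torch.nn.Module):
    def forward(self, x, y):
        return x @ y  # operator.matmul, now lowered like torch.matmul

example_inputs = (torch.randn(4, 8), torch.randn(8, 4))
m = prepare_fx(MatmulModule().eval(), get_default_qconfig_mapping(), example_inputs)
m(*example_inputs)   # calibration
m = convert_fx(m)    # `@` should now lower to the quantized matmul op
```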

Test Plan:
python test/test_quantization.py TestQuantizeFx

Reviewers: jerryzh168

Subscribers: jerryzh168, supriyar
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113954
Approved by: https://github.com/jerryzh168
2023-12-12 00:34:58 +00:00
0a464ad1a7 [dtensor] turn back on symbolic shape in tests (#115568)
As titled: since @jbschlosser enabled dynamic shape support for traceable
subclasses, turn the tests back on with the default setting.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115568
Approved by: https://github.com/XilunWu
2023-12-12 00:26:23 +00:00
078773b32b [ROCm] Add owners for more HIP-specific paths (#113989)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113989
Approved by: https://github.com/jeffdaily, https://github.com/malfet
2023-12-12 00:24:38 +00:00
17de38c9af [Dynamo] Check duplication when loading dynamo tracing rules (#115059)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115059
Approved by: https://github.com/jansel
2023-12-12 00:22:20 +00:00
0692240b90 [dtensor] account for empty list when turning to OpStrategy (#115298)
Trying to fix https://github.com/pytorch/pytorch/issues/115065

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115298
Approved by: https://github.com/XilunWu
2023-12-12 00:11:16 +00:00
19c67a9db5 [dynamo] Fix a closure cell empty error (#115541)
Summary: Fixes https://github.com/pytorch/pytorch/issues/97115. The solution given by @jansel in that issue works. Checking in the code so it won't get lost.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115541
Approved by: https://github.com/jansel
2023-12-12 00:01:51 +00:00
617c228fba [CI] Lower the smoketest speedup threshold for nangpt (#115562)
Summary:
https://github.com/pytorch/pytorch/actions/runs/7158691360/job/19491437314
shows the variance can be larger than previously expected. Lowering it
for now and if it continues to be a problem, we should switch to some
other more stable model.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115562
Approved by: https://github.com/chenyang78
2023-12-11 23:46:30 +00:00
4471fe6c39 [sparse][semi-structured] add alg_id to _cslt_sparse_mm and _cslt_sparse_mm_search (#115178)
Summary:

cuSPARSELt has support for different alg_id, which are set via

`cusparseLTMatmulAlgSetAttribute`, in total there are 4 different
alg_ids, 0 - 3.

Previously we were just using the default alg_id, as from our initial
experiments we found that for most shapes the default alg_id is the
fastest, and that the alg_ids made no difference to numerical correctness,
only performance. From our previous experiments, the fastest alg_id seemed
to differ only for small matmul shapes.

danthe3rd found a performance regression when running with
cuSPARSELt v0.4.0 vs v0.5.0, on LLM shapes, which match these
characteristics (activations are small, weights are large).

However it's likely that this is due to the alg_id ordering changing, as
mentioned in the release notes for v0.5.0.
```
cusparseLtMatmulAlgSelectionInit() does not ensure the same ordering of
algorithm id alg as in v0.4.0.
```

This PR adds in the following:
- support for passing in alg_id to _cslt_sparse_mm
- a new op, _cslt_sparse_mm_search, which returns the optimal alg_id for
  a given matmul

_cslt_sparse_mm_search has the same function signature as
_cslt_sparse_mm, minus the alg_id parameter.
We are able to achieve v0.4.0 performance with alg_id=1 on the shapes
that daniel provided.

We will address autoselecting the best alg_id in a future PR, possibly
with torch.compile.
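
A hedged sketch of the intended flow (these are private ops, and the exact signatures, including the keyword name for `alg_id`, are assumptions; requires a cuSPARSELt-enabled build):
```
# Hedged sketch: private ops, assumed signatures, needs cuSPARSELt support.
import torch

# Build a 2:4-sparse fp16 matrix (two non-zeros per group of four).
mask = torch.tensor([0., 0., 1., 1.], device="cuda", dtype=torch.half).tile(128, 32)
A = torch.randn(128, 128, device="cuda", dtype=torch.half) * mask
B = torch.randn(128, 128, device="cuda", dtype=torch.half)

A_compressed = torch._cslt_compress(A)
alg_id = torch._cslt_sparse_mm_search(A_compressed, B)       # pick the fastest alg_id
out = torch._cslt_sparse_mm(A_compressed, B, alg_id=alg_id)  # use it explicitly
```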

Test Plan:
```
python test/test_sparse_semi_structured -k cslt
```

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115178
Approved by: https://github.com/cpuhrsch
2023-12-11 23:08:51 +00:00
8b28380c8e [MPS] Fix sum and prod for complex types (#115554)
By not force-casting dtype to float

Test plan: `python -c "import torch;print(torch.linspace(-3.0, 3.0, 50, dtype=torch.cfloat, device='mps').sqrt().sin().sum())"`

Before:
```
tensor(21.1778+0.j, device='mps:0')
```
After
```
tensor(21.1778+39.1377j, device='mps:0')
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115554
Approved by: https://github.com/lezcano
ghstack dependencies: #115512, #115513
2023-12-11 23:03:44 +00:00
a4bb4a2373 [MPS] Add support for MPSDataTypeComplexFloat[16|32] (#115513)
But limit it to MacOS Sonoma +

Before, calling `torch.cat` with complex types failed, but now it works.
Before:
```
% python -c "import torch;print(torch.cat([torch.rand(3, 3, dtype=torch.cfloat).to('mps'), torch.rand(3, 3, dtype=torch.cfloat).to('mps')]))"
TypeError: Trying to convert ComplexFloat to the MPS backend but it does not have support for that dtype.
```
After:
```
% python -c "import torch;print(torch.cat([torch.rand(3, 3, dtype=torch.cfloat).to('mps'), torch.rand(3, 3, dtype=torch.cfloat).to('mps')]))"
tensor([[0.4857+0.0030j, 0.9375+0.8630j, 0.3544+0.9911j],
        [0.5293+0.8652j, 0.8440+0.1991j, 0.5152+0.8276j],
        [0.0136+0.7469j, 0.1403+0.4761j, 0.2943+0.0896j],
        [0.6458+0.0035j, 0.3579+0.4577j, 0.1723+0.1508j],
        [0.4420+0.3554j, 0.4396+0.7272j, 0.2479+0.1191j],
        [0.3895+0.2292j, 0.7886+0.1613j, 0.9243+0.4180j]], device='mps:0')
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115513
Approved by: https://github.com/kulinseth
ghstack dependencies: #115512
2023-12-11 23:03:44 +00:00
288822c968 Increase ROCm test shards to 6 (#110997)
To reduce signal time

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110997
Approved by: https://github.com/huydhn, https://github.com/malfet
2023-12-11 22:30:16 +00:00
4307ccde99 Move ONNX's TorchModelType to pytorch_test_common to fix circ. dep. (#115353)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115353
Approved by: https://github.com/BowenBao
2023-12-11 22:23:03 +00:00
suo
ccd5bde6a3 [export] Reintroduce InterpreterModule to unflatten (#115436)
InterpreterModule is better than GraphModule codegen; it's more debuggable and
has better stack traces. The only reason we don't use it today is because
torch.compile doesn't work with it.

I work around this by constructing a GraphModule separately for usage during
dynamo tracing, but otherwise using torch.fx.Interpreter.
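
For illustration (not the unflattener code itself), the difference between the two execution paths:
```
# Sketch: run the same FX graph via generated codegen vs. torch.fx.Interpreter.
import torch
import torch.fx

def f(x):
    return torch.relu(x) + 1

gm = torch.fx.symbolic_trace(f)
x = torch.randn(3)

out_codegen = gm(x)                           # GraphModule's generated forward()
out_interp = torch.fx.Interpreter(gm).run(x)  # node-by-node; clearer stack traces
assert torch.equal(out_codegen, out_interp)
```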

Differential Revision: [D51971661](https://our.internmc.facebook.com/intern/diff/D51971661/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115436
Approved by: https://github.com/zhxchen17
ghstack dependencies: #115408
2023-12-11 22:15:32 +00:00
suo
c137335b5c [export] make UnflattenedModule not inherit from GraphModule (#115408)
UnflattenedModule doesn't really behave like a graph module; we customize `__call__` to do something completely different than what GraphModule does. So, things that test `isinstance(unflattened_module, GraphModule)` and do something with the GraphModule are often broken.

This change makes UnflattenedModule its own thing.

Differential Revision: [D51959097](https://our.internmc.facebook.com/intern/diff/D51959097/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115408
Approved by: https://github.com/zhxchen17
2023-12-11 22:15:21 +00:00
8c1567d021 [c10d] Change watchdog inner loop function name to make it more accurate (#115404)
The function name `workCleanupLoop` does not reflect everything we do in the watchdog thread, so this proposes a new name that better reflects what the watchdog thread actually does.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115404
Approved by: https://github.com/kwen2501, https://github.com/wconstab
2023-12-11 22:00:06 +00:00
99f06c0cc2 [BE] update errors to be more descriptive (#115443)
we call `_check_single_tensor` and `_check_tensor_list` as validation but don't print out the param types that were invalid

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115443
Approved by: https://github.com/XilunWu
2023-12-11 21:21:10 +00:00
b706c4116d [MPS] Add MacOS 14 runtime check (#115512)
Prerequisite for adding more complex type support and FFT operation

Check using `conjugateWithTensor:name:` selector defined as follows
```objc
/// Returns the complex conjugate of the input tensor elements.
///
/// - Parameters:
///   - tensor: The input tensor.
///   - name: An optional string which serves as an identifier for the operation..
/// - Returns: A valid `MPSGraphTensor` object containing the elementwise result of the applied operation.
-(MPSGraphTensor *) conjugateWithTensor:(MPSGraphTensor *) tensor
                                   name:(NSString * _Nullable) name
MPS_AVAILABLE_STARTING(macos(14.0), ios(17.0), tvos(17.0))
MPS_SWIFT_NAME( conjugate(tensor:name:) );
```

- Rename `isOnMacOS13orNewer(unsigned minor)` hook to `isOnMacOSorNewer(major, minor)`
- Replace `torch._C.__mps_is_on_macos_13_or_newer` with `torch._C._mps_is_on_macos_or_newer`
- Add `torch.backends.mps.is_macos_or_newer` public API
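
A hedged usage sketch of the new public API listed above (the `(major, minor)` argument form is assumed from the renamed C++ hook):
```
# Hedged sketch: the argument form of is_macos_or_newer is an assumption.
import torch

if torch.backends.mps.is_available() and torch.backends.mps.is_macos_or_newer(14, 0):
    x = torch.rand(3, 3, dtype=torch.cfloat, device="mps")  # complex types need Sonoma+
```
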
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115512
Approved by: https://github.com/albanD
2023-12-11 21:11:42 +00:00
03ff44c958 [c10d] Fix Store check condition in NCCL PG watchdog (#115475)
In https://github.com/pytorch/pytorch/pull/115449/, somehow after turning on `DUMP_ON_TIMEOUT=1`, some existing tests failed. Upon checking, the failure is because of the TCPStore check call within the watchdog thread.

1. It's not because TCPStore creation has not completed: even if I make it sleep for a long time, the test still fails. Rather, it's because we query the TCPStore after we shut down the PG.

2. The reason for that is: the `std::chrono::steady_clock::now()` function in C++ returns a `time_point` object representing the current point in time according to the steady clock. The default unit of this time_point is not directly specified in terms of seconds or nanoseconds; rather, it depends on the internal representation of the steady clock, which can vary between implementations. In reality it's nanoseconds, which makes the delta so big that we check the store every time the watchdog thread wakes up. To make things even worse, `terminateProcessGroup_` might be set to `True` after the outermost while-loop condition is checked but before the TCPStore check, so the watchdog gets stuck checking a TCPStore that has already been deleted, while the main thread is still waiting for the watchdog to join.

The solution here is:
1. Add back `std::chrono::duration_cast` to ensure the delta is indeed in milliseconds, so that the timeout check logic works as expected.
2. Also check `terminateProcessGroup_`, so that we don't do any dump when the main thread has already marked the process as exited.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115475
Approved by: https://github.com/wconstab
2023-12-11 21:06:05 +00:00
ccc9e5f5bc Optimize conv2d pw quantized (#115221)
Summary:
In order to get better performance on conv2d pointwise (pw), it's better to read the input together in a batch.

With this optimization on CUNET-enc ops:

Kernel Name              Workgroup Size         Duration P50 (ns)
===========              ==============         =================
vulkan.quantized_conv2d_pw_2x2{96, 72, 2}                       891332
vulkan.quantized_conv2d_pw_2x2{48, 36, 4}                       528528
vulkan.quantized_conv2d_pw_2x2{24, 18, 8}                       557336

Without this optimization:
Kernel Name              Workgroup Size         Duration P50 (ns)
===========              ==============         =================
vulkan.quantized_conv2d_pw_2x2{96, 72, 2}                      1633268
vulkan.quantized_conv2d_pw_2x2{48, 36, 4}                      1177228
vulkan.quantized_conv2d_pw_2x2{24, 18, 8}                      1343264

Test Plan:
Ensure all vulkan quantize tests pass:
buck2 run --target-platforms ovr_config//platform/macos:arm64-fbsource //xplat/caffe2:pt_vulkan_quantized_api_test_binAppleMac\#macosx-arm64 -c pt.vulkan_full_precision=1 --show-output
Running main() from third-party/googletest/1.11.0/googletest/googletest/src/gtest_main.cc
[==========] Running 78 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 78 tests from VulkanAPITest
[ RUN      ] VulkanAPITest.uniform_buffer_copy
...
[----------] Global test environment tear-down
[==========] 78 tests from 1 test suite ran. (1519 ms total)
[  PASSED  ] 78 tests.

buck2 run --target-platforms ovr_config//platform/macos:arm64-fbsource  //xplat/caffe2:pt_vulkan_api_test_binAppleMac\#macosx-arm64 -c pt.vulkan_full_precision=1 --show-output"

Running main() from third-party/googletest/1.11.0/googletest/googletest/src/gtest_main.cc
[==========] Running 395 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 395 tests from VulkanAPITest
[ RUN      ] VulkanAPITest.zero_size_tensor
[       OK ] VulkanAPITest.zero_size_tensor (83 ms)
...
xplat/caffe2/aten/src/ATen/test/vulkan_api_test.cpp:7593: Skipped
QueryPool is not available
[  SKIPPED ] VulkanAPITest.querypool_flushed_shader_log (0 ms)
[----------] 395 tests from VulkanAPITest (6515 ms total)

[----------] Global test environment tear-down
[==========] 395 tests from 1 test suite ran. (6515 ms total)
[  PASSED  ] 394 tests.
[  SKIPPED ] 1 test, listed below:
[  SKIPPED ] VulkanAPITest.querypool_flushed_shader_log

  YOU HAVE 5 DISABLED TESTS

Reviewed By: yipjustin

Differential Revision: D50997530

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115221
Approved by: https://github.com/yipjustin
2023-12-11 20:59:15 +00:00
585aea6e77 [xla hash update] update the pinned xla hash (#115528)
This PR is auto-generated nightly by [this action](https://github.com/pytorch/pytorch/blob/main/.github/workflows/_update-commit-hash.yml).
Update the pinned xla hash.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115528
Approved by: https://github.com/clee2000
2023-12-11 20:22:46 +00:00
505574c46a Add decomposition for torch.block_diag (#115096)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115096
Approved by: https://github.com/peterbell10
2023-12-11 20:04:22 +00:00
5fe2b138e3 Revert "[inductor] Fix an aliased output bug (#115373)"
This reverts commit 1310f0bf38293b68a781287d1de8cf699a76974d.

Reverted https://github.com/pytorch/pytorch/pull/115373 on behalf of https://github.com/atalman due to Sorry for reverting your change it broke inductor tests ([comment](https://github.com/pytorch/pytorch/pull/115373#issuecomment-1850792869))
2023-12-11 20:02:15 +00:00
c52b78ebc2 [ez] Remove some args from run_test.py (#115459)
Don't think anyone uses these
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115459
Approved by: https://github.com/malfet, https://github.com/huydhn
2023-12-11 19:56:37 +00:00
b5578cb08b [ez] Remove unittest retries (#115460)
Pytest is used in CI now for reruns, and I doubt people are using the env vars when running locally. IMO removing this code makes the run function easier to read.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115460
Approved by: https://github.com/malfet, https://github.com/huydhn
2023-12-11 19:46:09 +00:00
5c0976fa04 Revert "[dynamo] guarded config (#111299)" (#115386)
This reverts commit 5927e9cbf2ac18aaaaecaab02258b7a35ac10969.

Differential Revision: [D51959266](https://our.internmc.facebook.com/intern/diff/D51959266)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115386
Approved by: https://github.com/yanboliang, https://github.com/malfet
ghstack dependencies: #115384, #115401, #115385
2023-12-11 19:35:42 +00:00
6db7b30db4 Revert "[dynamo] Cache size calc for differing config (#111300)" (#115385)
This reverts commit 78318d024989cf86e1ede424997cd42d2d291694.

Differential Revision: [D51959268](https://our.internmc.facebook.com/intern/diff/D51959268)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115385
Approved by: https://github.com/malfet
ghstack dependencies: #115384, #115401
2023-12-11 19:35:42 +00:00
f06f51b152 Revert "[Dynamo] Don't log compilation metrics for PyTorch unit tests (#115452)"
This reverts commit cd444aa075dd1e9c5d85cf3fbca9e078c74a7580.

Reverted https://github.com/pytorch/pytorch/pull/115452 on behalf of https://github.com/davidberard98 due to Merge conflict with #115385, which already landed in fbcode ([comment](https://github.com/pytorch/pytorch/pull/115452#issuecomment-1850729965))
2023-12-11 19:21:40 +00:00
f5f6618813 [executorch hash update] update the pinned executorch hash (#115311)
This PR is auto-generated nightly by [this action](https://github.com/pytorch/pytorch/blob/main/.github/workflows/_update-commit-hash.yml).
Update the pinned executorch hash.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115311
Approved by: https://github.com/pytorchbot
2023-12-11 18:31:44 +00:00
40a14e07ef Revert "[sparse][semi-structured] add alg_id to _cslt_sparse_mm and _cslt_sparse_mm_search (#115178)"
This reverts commit 1e5636f7915035b09dce22ad1d2170a65f344214.

Reverted https://github.com/pytorch/pytorch/pull/115178 on behalf of https://github.com/huydhn due to Sorry for reverting your change, but the Window build failure looks legit 1e5636f791 ([comment](https://github.com/pytorch/pytorch/pull/115178#issuecomment-1850605711))
2023-12-11 18:07:17 +00:00
5f41fc7619 [c10d] Change NCCL PG watchdog error msg and test comments (#115403)
Address the nit comments in https://github.com/pytorch/pytorch/pull/115226/

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115403
Approved by: https://github.com/wconstab
ghstack dependencies: #115226
2023-12-11 17:55:28 +00:00
794545c11f [BE]: Enable RUF015 codebase wide (#115507)
Constant-time access of the first value in a collection, instead of converting the collection to a list to get the first item (which is linear). The rule is turned on, which autofixes and enforces this.
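
For concreteness, the pattern the rule rewrites:
```
# The pattern RUF015 flags and its constant-time autofix.
items = {"a": 1, "b": 2, "c": 3}

first_slow = list(items)[0]     # builds a full list first: O(n)
first_fast = next(iter(items))  # autofixed form: O(1)
```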

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115507
Approved by: https://github.com/malfet
2023-12-11 15:51:01 +00:00
1e5636f791 [sparse][semi-structured] add alg_id to _cslt_sparse_mm and _cslt_sparse_mm_search (#115178)
Summary:

cuSPARSELt has support for different alg_id, which are set via

`cusparseLTMatmulAlgSetAttribute`, in total there are 4 different
alg_ids, 0 - 3.

Previously we were just using the default alg_id, as from our initial
experiments we found that for most shapes the default alg_id is the
fastest, and that the alg_ids made no difference to numerical correctness,
only performance. From our previous experiments, the fastest alg_id seemed
to differ only for small matmul shapes.

danthe3rd found a performance regression when running with
cuSPARSELt v0.4.0 vs v0.5.0, on LLM shapes, which match these
characteristics (activations are small, weights are large).

However it's likely that this is due to the alg_id ordering changing, as
mentioned in the release notes for v0.5.0.
```
cusparseLtMatmulAlgSelectionInit() does not ensure the same ordering of
algorithm id alg as in v0.4.0.
```

This PR adds in the following:
- support for passing in alg_id to _cslt_sparse_mm
- a new op, _cslt_sparse_mm_search, which returns the optimal alg_id for
  a given matmul

_cslt_sparse_mm_search has the same function signature as
_cslt_sparse_mm, minus the alg_id parameter.
We are able to achieve v0.4.0 performance with alg_id=1 on the shapes
that daniel provided.

We will address autoselecting the best alg_id in a future PR, possibly
with torch.compile.

Test Plan:
```
python test/test_sparse_semi_structured -k cslt
```

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115178
Approved by: https://github.com/cpuhrsch
2023-12-11 15:47:28 +00:00
b88be1686d Revert "[export][refactor][1/n] Move dynamic shapes logic (#114764)" (#115508)
GitHub first oncall.
This reverts commit 53bf8cfcf9c966096e829247380462d0a3a61e8d.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115508
Approved by: https://github.com/malfet, https://github.com/angelayi
2023-12-11 14:54:51 +00:00
f017a1af3f [MPS] add complex_out to MPS backend (#110851)
Adds support for at::complex_out to the MPS backend

Implemented in a binary kernel using the view_as_real pattern for handling complex dtypes in the mps backend.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110851
Approved by: https://github.com/kulinseth
2023-12-11 13:37:55 +00:00
de89a53df8 [benchmarking] Reduce box_detections_per_img for vision_maskrcnn (#115487)
This fixes a failure on the [perf dashboard](https://hud.pytorch.org/benchmark/compilers) with `--amp` mode.  I believe boxes 5 and 6 were getting swapped.  The existing comment explains the issue.

Before
```
$ ./benchmarks/dynamo/torchbench.py --training  --accuracy --no-translation-validation --amp --backend=inductor --disable-cudagraphs --only vision_maskrcnn
...
[2023-12-09 13:21:27,292] torch._dynamo.utils: [ERROR] RMSE (res-fp64): 0.00171, (ref-fp64): 0.00054 and shape=torch.Size([256, 256, 3, 3])
[2023-12-09 13:21:27,292] torch._dynamo.utils: [ERROR] Accuracy failed for key name backbone.fpn.layer_blocks.2.0.weight.grad
fail_accuracy
```

After
```
$ ./benchmarks/dynamo/torchbench.py --training  --accuracy --no-translation-validation --amp --backend=inductor --disable-cudagraphs --only vision_maskrcnn
...
pass
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115487
Approved by: https://github.com/yanboliang
2023-12-11 08:42:25 +00:00
274fdc81f8 [Dynamo][6.3/N] Further cleanup torch.py (#114669)
A follow-up PR to clean up what I found during the refactor of torch.py

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114669
Approved by: https://github.com/jansel
2023-12-11 07:16:03 +00:00
fe01605830 [aotinductor] replace lld with the default ld linker (#115478)
Currently, we place constants in the .so. To avoid cases
where constants are too large (i.e. >2G), we put the
constants into .lrodata, which doesn't have the 2G limit.
Not sure why, but lld still issues errors like the one below
even when those large constant data are stored in the
.lrodata section:

"relocation R_X86_64_PC32 out of range: 5459191920 is not in
[-2147483648, 2147483647]"

In contrast, the default GNU ld linker works fine. Let's
switch back to using ld to unblock some internal models.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115478
Approved by: https://github.com/desertfire, https://github.com/htyu
2023-12-11 02:35:26 +00:00
1310f0bf38 [inductor] Fix an aliased output bug (#115373)
Summary: for https://github.com/pytorch/pytorch/issues/97083, when

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115373
Approved by: https://github.com/jansel
2023-12-10 23:52:39 +00:00
2e6b809d6b [AOTI] Fix a missing declaration for the result of item() (#115175)
Differential Revision: [D51968539](https://our.internmc.facebook.com/intern/diff/D51968539)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115175
Approved by: https://github.com/chenyang78
2023-12-10 22:49:45 +00:00
9b3cb1c66c Fix environment condition for docker-release.yml
As those are run on nightlies and release tags, the environment should be set accordingly.

Also simplify `WITH_PUSH` condition.

Should fix https://github.com/pytorch/pytorch/actions/runs/7156407285/job/19494049140
2023-12-10 14:09:39 -08:00
38f890341d Implement pass-through state_dict and load_state_dict for dynamo OptimizedModule (#113423)
Fixes #113422
Fixes #94575

This is now possible:
```py
model = Model()
compiled_model = torch.compile(model)

model.load_state_dict(compiled_model.state_dict())  # previously key mismatch!
```

This also makes it much easier to checkpoint and load models that were wrapped like so:
```py
FSDP(torch.compile(model))
# or
DDP(torch.compile(model))
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113423
Approved by: https://github.com/msaroufim
2023-12-10 22:09:19 +00:00
26266c9718 [CI] Call torch.cuda.empty_cache to release device memory (#114663)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114663
Approved by: https://github.com/eellison
2023-12-10 21:27:42 +00:00
694cc6af56 [benchmarks] Fix NameError: name 'args' is not defined (#115494)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115494
Approved by: https://github.com/Skylion007, https://github.com/desertfire
2023-12-10 21:22:21 +00:00
21a1d31ed8 [caffe2] update Meta-internal googletest references (#115407)
Summary: Update test dependencies to point to the new internal googletest location.

Test Plan: CI

Differential Revision: D51951643

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115407
Approved by: https://github.com/cccclai
2023-12-10 20:37:13 +00:00
24a463c46c Revert "[export][refactor][2/n] Move tracing logic (#114768)" (#115503)
Github first oncall.
This reverts commit 0ab57ee7eab5391289d30e8c49fceee3f503f539.

Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115503
Approved by: https://github.com/angelayi, https://github.com/kit1980
2023-12-10 19:30:15 +00:00
b4ef59f740 Revert "[dynamo] remove unused OptimizeCtx field - export (#113901)" (#115401)
This reverts commit b62230a685666e8c2b8a5cb31b16352d286bcf9f.

Differential Revision: [D52001024](https://our.internmc.facebook.com/intern/diff/D52001024)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115401
Approved by: https://github.com/malfet
ghstack dependencies: #115384
2023-12-10 18:17:24 +00:00
b36fc6790e Revert "[dynamo] Guard on HAS_GRAPH_BREAKS if graph breaks are present (i.e. cache miss if compiled object requires nopython) (#114073)" (#115384)
This reverts commit 0bb29f945079ac4c83d674f7b3ff755cfb5396cf.

Differential Revision: [D51959267](https://our.internmc.facebook.com/intern/diff/D51959267)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115384
Approved by: https://github.com/malfet
2023-12-10 18:16:02 +00:00
6c1e75e646 Revert "[HigherOrderOp] make MapHigherOrder create map_impl call_function node instead of map (#115205)"
This reverts commit 8b747358783d2411afe1136dcc9da95c01bfbdaa.

Reverted https://github.com/pytorch/pytorch/pull/115205 on behalf of https://github.com/atalman due to ghfirst broke internal tests ([comment](https://github.com/pytorch/pytorch/pull/115205#issuecomment-1848995376))
2023-12-10 15:25:55 +00:00
100c466bff [CI][Inductor] Skip CPU tests when running on GPU (#115430)
This just follows the standard practice for CI: when one specifies `PYTORCH_TESTING_DEVICE_ONLY_FOR=cuda`, only tests targeting that device should be run.

Do it by refactoring part of `instantiate_device_type_tests` into `get_desired_device_type_test_bases` and using it from test_torchinductor.py to skip CPU tests

Fixes https://github.com/pytorch/pytorch/issues/115423

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115430
Approved by: https://github.com/seemethere
2023-12-10 15:21:24 +00:00
08d63a75a4 Revert "[HigherOrderOp] Remove additional get item calls in MapHigherOrder. (#115207)"
This reverts commit dd6ae6d3b473906d32fcb8a319895e31b039f224.

Reverted https://github.com/pytorch/pytorch/pull/115207 on behalf of https://github.com/atalman due to ghfirst broke internal tests ([comment](https://github.com/pytorch/pytorch/pull/115207#issuecomment-1848991919))
2023-12-10 15:12:12 +00:00
fbeca60b1f Remove replace_all and make VTs mutable (#113725)
1.  Removes calls to `replace_all` and `clone` and makes VTs mutable.
2. Properly handles Tuple Iterator mutation. Previously TupleIterator variables would only be properly reconstructed if they were advanced at least once in a frame. On calls to `next`, the source information would be lost (due to constructing a new iterator without using builder), which would ensure that during codegen the variable would be reconstructed from scratch. Now that VTs are mutated, the source is never lost, so we need to properly track mutation and handle it by replaying calls to `next` at the end of the modified bytecode.
3. Added test for checking iadd side effects, this was missing in our unit test coverage.
4. Fixed two incorrect sources: DelayGraphBreakVariable and UserMethodVariable both relied on setting the source to AttrSource(parent, name) at the callsite of `var_getattr`.
5. Fixed a bug in in-place adding for lists: it would set the resulting VariableTracker's source to `None`, which would utilize a different reconstruct path in codegen. Now this is handled explicitly by reconstructing vars when allow_cache=`False`, so that during side effect replay, the mutated var is correctly updated.

In subsequent PRs:
* Refactoring side effect tracking to be significantly simpler (I think we only need an `is_modified` flag)
* Refactor `next_variables` iterator to match the signature of `next`
* Remove all references to `options` in the code
* Refactor VTs representing mutable collections to implement their own mutation update handling
* Remove clone and/or make it specific to lists for creating slices
* Add mutation tracking/replay for sets
* Add mutation tracking/replay for iter.py
* Removing setting source in builder (it's set at the top level after a var is returned)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113725
Approved by: https://github.com/jansel
2023-12-10 09:31:21 +00:00
f71d931b32 [Dynamo][6.2/N] Dump the in graph function list(~2600 ops) and add unit tests. (#114196)
This is the second PR according https://github.com/pytorch/pytorch/pull/113009#issuecomment-1804417925

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114196
Approved by: https://github.com/jansel
2023-12-10 06:41:51 +00:00
4eb5838e18 Revert "Enable builtin tests for ONNX Export with ExportedProgram models (#114762)"
This reverts commit 13d2e3eba79000028291f4739a6e9c937dbe4264.

Reverted https://github.com/pytorch/pytorch/pull/114762 on behalf of https://github.com/huydhn due to Sorry for reverting your change but ONNX test is failing from this commit 13d2e3eba7 ([comment](https://github.com/pytorch/pytorch/pull/114762#issuecomment-1848831147))
2023-12-10 01:55:47 +00:00
2ee240d14a Revert "Move ONNX's TorchModelType to pytorch_test_common to fix circ. dep. (#115353)"
This reverts commit 960ad9d94e365c758b19298b45bcba5225b79e0c.

Reverted https://github.com/pytorch/pytorch/pull/115353 on behalf of https://github.com/huydhn due to Sorry for reverting your change but ONNX test is failing from the commit below in the stack 13d2e3eba7 ([comment](https://github.com/pytorch/pytorch/pull/115353#issuecomment-1848830883))
2023-12-10 01:53:50 +00:00
4490d4692b [doc] Rewrite benchmarks/dynamo/README.md (#115485)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115485
Approved by: https://github.com/yanboliang
2023-12-10 00:37:53 +00:00
8ddc549c0f [BE][JIT] Do not wrap shared_ptr with optional (#115473)
While reviewing https://github.com/pytorch/pytorch/pull/115381 I noticed that `torch::jit::GraphFunction::optimized_graph_` is an `std::array<c10::optional<std::shared_ptr<Graph>>, N>`, which feels excessive as `shared_ptr` is already nullable and has `operator bool()`. Looking at https://github.com/pytorch/pytorch/pull/26488, which introduced the change, also does not hint that this indirection is necessary.

Test plan: CI

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115473
Approved by: https://github.com/davidberard98, https://github.com/Skylion007
2023-12-09 20:43:40 +00:00
641ec2115f [AOTI] move model runner into a library (#115220)
Summary: So that we can import it in fbcode and do some AOTI run in py env

Test Plan: existed AOTI tests

Reviewed By: chenyang78

Differential Revision: D51780021

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115220
Approved by: https://github.com/desertfire
2023-12-09 19:03:32 +00:00
c039f01bd9 Increased hardcoded limit for number of GPUs. (#115368)
Fixes #115331.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115368
Approved by: https://github.com/albanD
2023-12-09 18:10:51 +00:00
cyy
99f222372b [5/N] Fixes clang-tidy warnings in c10/{core,util}/*.h (#115354)
This PR continues to fix clang-tidy warnings for headers in c10/core and c10/util.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115354
Approved by: https://github.com/Skylion007
2023-12-09 17:16:04 +00:00
937d616e82 Re-enable type checking for distributed_c10d.py (#115223)
Re-enable type checking for distributed_c10d.py

Type checking for distributed_c10d.py was inadvertently turned off at some point; this change fixes the type issues that have accumulated since then.

Note: the backwards compatibility linter does not like some of these changes.  But they were incorrect before.  This needs human verification, however.

#suppress-api-compatibility-check

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115223
Approved by: https://github.com/wconstab
2023-12-09 11:07:54 +00:00
485ea9a70a [DTensor] Add DTensor experimental op for LayerNorm backward sharding rule propagation (#115398)
Summary: This diff is only for prototype to unblock the TP work. PyTorch distributed team is working on a more generic backward op for `aten.layer_norm`. Will remove this op from the experimental file once it is ready.

Test Plan:
**Local Test**:
Accuracy:
- Dtensor + Checkpoint: first run loss: P884569822 (on-par with baseline: P884213363)
- 2nd by loading saved checkpoint: P884583429 (on-par with baseline: P884271869)

Trace:
- Collective functions are inserted automatically.
- Example: https://fburl.com/perfdoctor/l567ww1x

**MAST Test**:
With: trainer = 128, batch_size=512
- NE on-par:
(see: 4441_ep_bs512_2fsdp_tp_sp_dtensor)

Differential Revision: D51490868

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115398
Approved by: https://github.com/wanchaol
2023-12-09 09:38:56 +00:00
eb3aa424ce [Reland][Dynamo] Added support for math.radians on ints with dynamic shapes (#115477)
Reland #114507

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115477
Approved by: https://github.com/larryliu0820
2023-12-09 08:58:18 +00:00
960ad9d94e Move ONNX's TorchModelType to pytorch_test_common to fix circ. dep. (#115353)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115353
Approved by: https://github.com/BowenBao
ghstack dependencies: #114407, #115281, #114762
2023-12-09 07:47:03 +00:00
13d2e3eba7 Enable builtin tests for ONNX Export with ExportedProgram models (#114762)
Fixed by https://github.com/pytorch/pytorch/pull/113982
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114762
Approved by: https://github.com/BowenBao
ghstack dependencies: #114407, #115281
2023-12-09 07:46:43 +00:00
7e941a932b Store user model to simplify ONNXProgram.{adapt_torch_*,__call__} APIs (#115281)
Currently (after https://github.com/pytorch/pytorch/pull/114407), the user must pass the original ``model`` to APIs such as ``ONNXProgram.__call__``, ``ONNXProgram.adapt_torch_inputs_to_onnx`` and ``ONNXProgram.adapt_torch_outputs_to_onnx``.

This was needed because when the model is fakefied, a version of the non-fakefied model is needed so that the initializers, buffers and constants can be extracted from a real model (and used as input to the ONNX model).
That approach brings an unnecessary usability burden to the user when the model is not fakefied, because the model that was already passed to ``torch.onnx.dynamo_export`` could be used to extract the ``state_dict``.

This PR adds an ``ONNXProgram._model_torch`` attribute to store the user model and demotes the ``model`` argument of the aforementioned APIs to optional (as opposed to required).

As a result, for the fakefied model scenario, the user still needs to pass the model, but for non-fakefied models, the persisted model is implicitly used to extract the model state_dict, making it easier to use.
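
A hedged sketch of the non-fakefied case after this change (API names are taken from the text above; requires the dynamo-based exporter's dependencies):
```
# Hedged sketch: for a non-fakefied model, `model=` no longer needs to be
# passed back to the adapt_* APIs; the persisted model is used implicitly.
import torch

model = torch.nn.Linear(4, 4).eval()
x = torch.randn(1, 4)

onnx_program = torch.onnx.dynamo_export(model, x)
onnx_inputs = onnx_program.adapt_torch_inputs_to_onnx(x)  # previously required the model
```
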
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115281
Approved by: https://github.com/BowenBao
ghstack dependencies: #114407
2023-12-09 07:46:12 +00:00
da341d0d48 [Dynamo][6.1/N] Refactor out TorchInGraphFunctionVariable and improve heuristic (#113432)
This is splitted from #113009, please check https://github.com/pytorch/pytorch/pull/113009#issuecomment-1804417925 for more details.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113432
Approved by: https://github.com/ezyang, https://github.com/jansel
2023-12-09 05:11:44 +00:00
1c1f2bbe8a Add a space in the error message (#115465)
Summary:
As title says

Created from CodeHub with https://fburl.com/edit-in-codehub

Test Plan:
waitforsandcastle

Sandcastle run

Reviewed By: eeggl

Differential Revision: D52000286

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115465
Approved by: https://github.com/kwen2501
2023-12-09 04:35:51 +00:00
3ebf9acea1 [Triton] Replace triton.runtime.jit.get_cuda_stream with torch.cuda.c… (#115397)
triton.runtime.jit.get_cuda_stream was removed in https://github.com/openai/triton/pull/2756

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115397
Approved by: https://github.com/jansel
2023-12-09 04:30:42 +00:00
cyy
516bd4a72c [1/N] Use std::in_place (#115170)
It is time to gradually replace c10::in_place with std::in_place.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115170
Approved by: https://github.com/colesbury
2023-12-09 03:52:39 +00:00
2ed47fecc5 Robustify torch.multiprocessing.spawn error reporting to be less deadlock prone (#114688)
multiprocessing.Queue relies on, among other things, background threads to send messages between processes.  This works in the happy path but can cause issues if a process exits by bypassing atexit handlers or by crashing, because the writer to the Queue can terminate while the reader is blocked reading the queue.  The reader sees the queue as non-empty, yet even with a timeout it will actually block forever.

An example of a Queue deadlock is here: https://gist.github.com/chipturner/342f72341f087737befe9df84d0e41ce

Since the error reporting case here is a simple one-shot message from the dying child to the parent, we can just use a file-based rendezvous.  This eliminates the deadlock when a large traceback is still being flushed to the network when a child exits.
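
A generic sketch of the file-based idea (not torch.multiprocessing's actual implementation): the dying child writes its traceback to a known file, and the parent reads it after `join()`, so no Queue reader can block on a half-sent message.
```
# Generic sketch of file-based child error reporting (names are illustrative).
import multiprocessing as mp
import os
import tempfile
import traceback

def worker(error_file):
    try:
        raise RuntimeError("boom")
    except Exception:
        with open(error_file, "w") as f:
            f.write(traceback.format_exc())
        raise

if __name__ == "__main__":
    error_file = os.path.join(tempfile.mkdtemp(), "error.txt")
    p = mp.Process(target=worker, args=(error_file,))
    p.start()
    p.join()
    if p.exitcode != 0 and os.path.exists(error_file):
        print("child failed with:\n" + open(error_file).read())
```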

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114688
Approved by: https://github.com/suo, https://github.com/yifuwang
2023-12-09 03:36:43 +00:00
2962271f58 [ONNX][dynamo_export] Extend expected fx output types for int, float, bool (#115431)
Fixes exporting ops, such as `aten::_scaled_dot_product_flash_attention`, that return int, float, and bool typed outputs.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115431
Approved by: https://github.com/titaiwangms, https://github.com/thiagocrepaldi
2023-12-09 03:24:48 +00:00
41b1919208 [nested_tensor]Python subclass NT overhead improvement (2/n): avoid getting from WeakTensorKeyDictionary twice during __init__ (#115450)
Summary:
Most NT operations end with creating a new NestedTensor, which is time-consuming. This tries to reduce overhead during NestedTensor creation.

The ops return a new NestedTensor with the same offsets, so "tensor not in _tensor_symint_registry" would be false in most cases. The "in" (`__contains__`) check takes ~8 us. If we use "get" directly, we save a few us for most NT operations.
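
Illustrating the pattern with a hypothetical registry dict and `next_symint` helper (not the actual NT code):
```
# Before: membership test + lookup = two registry probes on the common hit path.
def get_tensor_symint_before(registry, offsets, next_symint):
    if offsets not in registry:
        registry[offsets] = next_symint()
    return registry[offsets]

# After: a single .get() probe covers the common hit path.
def get_tensor_symint_after(registry, offsets, next_symint):
    symint = registry.get(offsets)
    if symint is None:
        symint = registry[offsets] = next_symint()
    return symint
```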

Test Plan:
Before:
get_tensor_symint take 15us
https://pxl.cl/3XF83
After
get_tensor_symint take 10us
https://pxl.cl/3XFc9

Differential Revision: D51992836

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115450
Approved by: https://github.com/soulitzer
2023-12-09 03:12:31 +00:00
d40a7c6026 Add decompositions for replication_pad (#115113)
Fixes #115395

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115113
Approved by: https://github.com/peterbell10
2023-12-09 02:44:07 +00:00
d7705f325d Patch --save-xml when TEST_IN_SUBPROCESS (#115463)
Patch `--save-xml` when `TEST_IN_SUBPROCESS`

When `--save-xml` is given as a unit test argument and the test is handled by a `TEST_IN_SUBPROCESS` handler (e.g., `run_test_with_subprocess` for `distributed/test_c10d_nccl`), the `--save-xml` args are first "consumed" by the argparser in `common_utils.py`. When a following subprocess in this `if TEST_IN_SUBPROCESS:` section starts, there are no `--save-xml` args, thus leaving `args.save_xml` as `None`.

Since the argparser for the `--save-xml` option defaults to `_get_test_report_path()` when the arg is `None`, this is not a problem for GitHub CI runs. It could be an issue when people run those tests without `CI=1`: test reports won't be saved in this case even if they passed `--save-xml=xxx`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115463
Approved by: https://github.com/clee2000
2023-12-09 02:38:31 +00:00
c9c4cdf9a9 [AOTAutograd] Do not call ctx.mark_dirty on mutations hidden from autograd (#115324)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115324
Approved by: https://github.com/bdhirsh
2023-12-09 02:23:13 +00:00
3361496f96 Fix the corner case of index_add (#114929)
Fixes #114864

As the title stated.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114929
Approved by: https://github.com/mikaylagawarecki
2023-12-09 01:57:25 +00:00
3c54ff6bcd Update ONNX's IO Adapter to support FakeTensor with ExportedProgram (#114407)
Currently, the ONNX exporter using torch.nn.Module as input can support
FakeTensor because the ONNX model stores all initializers

When using torch.export.ExportedProgram as input, the initializers are
lifted as inputs. In order to execute the ONNX model, we need to pass a
reference to the non-fake model to the
ONNXProgram.adapt_torch_inputs_to_onnx API, so that initializers can be
fetched from the model and fed to the ONNX model as input

ps: https://github.com/pytorch/pytorch/issues/115461 will track the API revision for the cases where additional `model_with_state_dict` are required to produce complete ONNX files exported with fake support. This is also tracked by the umbrella fake tensor issue https://github.com/pytorch/pytorch/issues/105464 FYI @BowenBao
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114407
Approved by: https://github.com/BowenBao
2023-12-09 01:48:27 +00:00
495054545c Allow preserve_rng_state=True when torch.compile + selective checkpointing + CUDA (#113718)
Fixes https://github.com/pytorch/pytorch/issues/113717.

When `preserve_rng_state=True`, we let AOTAutograd trace through `torch.random.fork_rng` op, and the tracing doesn't work under CUDA, hence the original error reported in the issue.

But since we are already doing RNG functionalization at Inductor level, we don't actually need to trace this `fork_rng` op. So we should just rewrite `preserve_rng_state` to False when we are using torch.compile (and let Inductor do its RNG functionalization which it's already been doing).
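
For reference, the flag in question on the user side (simplified here to plain activation checkpointing rather than the selective variant; details of the original repro are assumptions):
```
# Hedged sketch: activation checkpointing under torch.compile with
# preserve_rng_state=True, which previously failed to trace on CUDA.
import torch
from torch.utils.checkpoint import checkpoint

def block(x):
    return torch.nn.functional.dropout(torch.relu(x), p=0.1, training=True)

@torch.compile
def fn(x):
    # Under compile this flag is now effectively rewritten to False, since
    # Inductor already functionalizes RNG (per the description above).
    return checkpoint(block, x, use_reentrant=False, preserve_rng_state=True)

out = fn(torch.randn(8, 8, device="cuda", requires_grad=True))
```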

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113718
Approved by: https://github.com/wanchaol
2023-12-09 01:47:25 +00:00
cd444aa075 [Dynamo] Don't log compilation metrics for PyTorch unit tests (#115452)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115452
Approved by: https://github.com/zou3519
2023-12-09 01:39:36 +00:00
e1370ff80f Vectorize CPU ATen mean kernel for BF16 & FP16 dtypes (#114582)
## Summary
Since #97351, CPU ATen kernel for `mean` for BF16 & FP16 dtypes has been unvectorized (it's not even implicitly vectorized).

This PR vectorizes `mean` for BF16 & FP16 on CPU in a `cast_fp32 -> sum -> div -> cast_bf16_or_fp16` fashion.

The perf benefit would be especially pronounced on machines with `AVX512_BF16` and/or `AVX512_FP16` ISA support.

## Benchmarking data for BF16 (collected before & after the change in this PR)

**Machine:** Intel&reg; Xeon&reg; (4th generation series, formerly codenamed Sapphire Rapids) Platinum 8468H
One socket (48 physical cores) - used `numactl --membind=0 --cpunodebind=0`
libtcmalloc & Intel OpenMP were preloaded

Environment variable used -
`KMP_AFFINITY=granularity=fine,compact,1,0 KMP_BLOCKTIME=1 KMP_SETTINGS=1 OMP_NUM_THREADS=48 MKL_NUM_THREADS=48`

**Workload:** E2E performance on BS 32 resnet50 (using BF16 via AMP) inference using oneDNN Graph JIT fuser (`mean` kernel is dispatched to eager mode ATen kernel, and is the bottleneck right now)

| **BEFORE:** Latency with unvectorized mean (lower is better)| **AFTER:** Latency with vectorized mean (lower is better)| Speedup due to vectorizing mean|
|----------------------------|-------------------------|------------|
|                19.1 ms           |                10.8  ms       | latency reduced by ~43.45%      |

**Benchmarking script for BF16 -**

 ```
import time
import torch
import torchvision

# enable oneDNN Graph JIT fuser
torch.jit.enable_onednn_fusion(True)
# AMP for JIT mode is enabled by default, and is divergent with its eager mode counterpart
torch._C._jit_set_autocast_mode(False)

# sample input should be of the same shape as expected inputs
example_input = torch.rand(32, 3, 224, 224)
# Using resnet50 from torchvision in this example for illustrative purposes,
# but the line below can indeed be modified to use custom models as well.
model = getattr(torchvision.models, "resnet50")().eval()

with torch.no_grad(), torch.cpu.amp.autocast(cache_enabled=False, dtype=torch.bfloat16):
    # Conv-BatchNorm folding for CNN-based Vision Models should be done with ``torch.fx.experimental.optimization.fuse`` when AMP is used
    import torch.fx.experimental.optimization as optimization
    # Please note that optimization.fuse need not be called when AMP is not used
    model = optimization.fuse(model)
    model = torch.jit.trace(model, (example_input))
    model = torch.jit.freeze(model)
    # a couple of warm-up runs
    model(example_input)
    model(example_input)
    # speedup would be observed in subsequent runs
    start = time.time()
    model(example_input)
    end = time.time()
    inference_time = (end - start) * 1000
    print("Inference time is ", inference_time)
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114582
Approved by: https://github.com/jgong5, https://github.com/malfet
2023-12-09 01:02:13 +00:00
f614ed78b8 [docs, dynamo] fix typos in dynamo custom backend docs (#115444)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115444
Approved by: https://github.com/eellison
2023-12-08 23:58:26 +00:00
fb19947962 Add decompositions for reflection_pad{1, 2, 3}d (#115100)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115100
Approved by: https://github.com/peterbell10
2023-12-08 23:05:57 +00:00
9f7b3a4e18 Move autolabeler to "oncall: distributed" not "module:.." (#115447)
Reasoning for the change is spelled out in this issue

https://github.com/pytorch/pytorch/issues/115168

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115447
Approved by: https://github.com/huydhn, https://github.com/malfet
2023-12-08 22:53:20 +00:00
749f0c90e1 Revert "[export][refactor][3/n] Move unlift to separate file (#114787)" (#115457)
Github First Oncall: This reverts commit 967863d91dbe0a56fa7bcc4e075a25cc4ad67c81.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115457
Approved by: https://github.com/osalpekar
2023-12-08 22:33:28 +00:00
28de29fdda [releng] version 2.2 -> 2.3 (#115446)
Release 2.2 branch cut is completed. Hence bump the nightly version to 2.3.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115446
Approved by: https://github.com/huydhn, https://github.com/seemethere, https://github.com/malfet
2023-12-08 22:25:52 +00:00
3e47e3f441 Revert "[export] Fix graph output mismatch issue with constant outputs. (#115280)"
This reverts commit 622688fab9fc6d20ff3475a8a0a1fdb6af9d837e.

Reverted https://github.com/pytorch/pytorch/pull/115280 on behalf of https://github.com/atalman due to ghfirst issue when importing, will reland this PR ([comment](https://github.com/pytorch/pytorch/pull/115280#issuecomment-1847903624))
2023-12-08 22:10:03 +00:00
3dab46fe19 Revert "[export] Dont skip output caching for now. (#115374)"
This reverts commit fd79995fd6d9f599ff60b721ae56bb7b0aa4eb93.

Reverted https://github.com/pytorch/pytorch/pull/115374 on behalf of https://github.com/atalman due to ghfirst issue when importing, will reland this PR ([comment](https://github.com/pytorch/pytorch/pull/115374#issuecomment-1847899901))
2023-12-08 22:06:21 +00:00
aaaf5c08fb [ez] Don't run workflows on forks (#115429)
Adds the `if: github.repository_owner == 'pytorch'` to some jobs to make sure they don't run on forks, since they usually either fail or remain pending due to not having the correct machines to run.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115429
Approved by: https://github.com/huydhn, https://github.com/botmethere, https://github.com/malfet, https://github.com/atalman
2023-12-08 21:41:58 +00:00
b5d3d3ebf0 [ao] making hist_obs handle torch.inf and closeby values (#103467)
Summary: This PR does 2 things:

1) Previously this would simply error; now it will ignore any
torch.inf values that it receives. Note: the code checks for torch.inf after
aminmax; that way, if there are no torch.inf values found, the perf is
relatively unchanged.

2) as mentioned in https://github.com/pytorch/pytorch/issues/100051,
values close to (but not quite at) the maximum/minimum float value could
overflow to infinity in the course of _adjust_min_max() (when this large
value would be multiplied by something in the middle of a calculation
that would otherwise result in a non-inf value). This was fixed by
rearranging the order of operations for the lines in question without
altering the actual equations. Specifically, where operations in lines
1095, 1098 and 1100 have multiplication and division of large values,
it's better to divide the two large values before multiplying, rather
than multiplying the two large values together (creating overflow) before dividing like it had been.
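
A tiny numeric illustration of point 2 (values chosen purely for illustration):
```
# float32 max is ~3.4e38; ordering the ops differently avoids the overflow.
import torch

big_a = torch.tensor(3e38)
big_b = torch.tensor(2e38)
small = torch.tensor(4.0)

print(big_a * small / big_b)  # inf: the product overflows before the division
print(big_a / big_b * small)  # 6.0: dividing the two large values first stays finite
```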

Test Plan: python test/test_quantization.py
TestObserver.test_histogram_observer_ignore_infinity

python test/test_quantization.py TestObserver.test_histogram_observer_handle_close_to_infinity
Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: [D51489345](https://our.internmc.facebook.com/intern/diff/D51489345)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103467
Approved by: https://github.com/andrewor14
2023-12-08 21:41:31 +00:00
1215f2ffe2 [dtensor] readme typo (#115383)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115383
Approved by: https://github.com/awgu
ghstack dependencies: #115365
2023-12-08 21:40:40 +00:00
af925a56a1 Revert "[export] Add math.* ops to pass base (#115271)"
This reverts commit 6c0a4ced530dab78db455c37508931de2eb56239.

Reverted https://github.com/pytorch/pytorch/pull/115271 on behalf of https://github.com/atalman due to ghfirst issue when importing, will reland this PR ([comment](https://github.com/pytorch/pytorch/pull/115271#issuecomment-1847852211))
2023-12-08 21:17:56 +00:00
12d7ea19af [Indcutor][fx pass] Add sub and div pointwise ops to the post grad fusion (#115389)
Summary: Titled

Test Plan:
# unit test
```
buck2 test 'fbcode//mode/dev-nosan' fbcode//caffe2/test/inductor:group_batch_fusion
```
Buck UI: https://www.internalfb.com/buck2/792c58db-c369-487d-9a42-b5da471657c0
Test UI: https://www.internalfb.com/intern/testinfra/testrun/2814749981661407
Network: Up: 74KiB  Down: 29KiB  (reSessionID-b47c266b-12d6-4e88-8dc3-4af1dd7ecbb4)
Jobs completed: 20. Time elapsed: 2:09.6s.
Cache hits: 0%. Commands: 2 (cached: 0, remote: 0, local: 2)
Tests finished: Pass 7. Fail 0. Fatal 0. Skip 0. Build failure 0

# local reproduce
OC: P899142918
MAI: P899175452
# e2e (oc)

Differential Revision: D51957242

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115389
Approved by: https://github.com/dshi7, https://github.com/jackiexu1992, https://github.com/xuzhao9
2023-12-08 21:07:03 +00:00
e8e4141773 Revert "[Dynamo][6.1/N] Refactor out TorchInGraphFunctionVariable and improve heuristic (#113432)"
This reverts commit e61d6b42f0f4e4fa5bb816e03fb81e5bbcc9fa06.

Reverted https://github.com/pytorch/pytorch/pull/113432 on behalf of https://github.com/huydhn due to Sorry for reverting your change, but it is failing dynamo tests in trunk e61d6b42f0, landrace? ([comment](https://github.com/pytorch/pytorch/pull/113432#issuecomment-1847787981))
2023-12-08 20:15:39 +00:00
d7180161b5 Revert "[SparseCsr] Remove triton sdpa skip after triton pin update (#109601)"
This reverts commit f64b10803f5fdd34e43fba7f421401bcfe247c19.

Reverted https://github.com/pytorch/pytorch/pull/109601 on behalf of https://github.com/huydhn due to Sorry for reverting your change, but it is failing in trunk with this error ZeroDivisionError: integer division or modulo by zero ([comment](https://github.com/pytorch/pytorch/pull/109601#issuecomment-1847784383))
2023-12-08 20:12:53 +00:00
4186932bac Revert "[export] Remove runtime assertion pass (#115196)"
This reverts commit c163b3c03563c11640d4dbee504ef63101b019fe.

Reverted https://github.com/pytorch/pytorch/pull/115196 on behalf of https://github.com/atalman due to Broke internal test ([comment](https://github.com/pytorch/pytorch/pull/115196#issuecomment-1847778344))
2023-12-08 20:07:04 +00:00
317486edb0 [C10D] Decouple flight recorder from enableTiming (#115358)
RE #115301

Decoupling gives us a path to disable timing without disabling the
flight recorder.

Flight recorder is still useful for stuckness analysis without 'timing'.

Disabling timing makes it miss the 'started'
state that comes from using an extra nccl event at the start of each
collective.  It will also be missing 'duration_ms' of collectives, which
hasn't been landed yet, but is useful for timing/perf work more than
stuckness analysis.

Hopefully we can enable timing by default and leave both on, but it's
nice to have the flexibility for now.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115358
Approved by: https://github.com/fduwjj
2023-12-08 19:44:45 +00:00
suo
3d999d2f2c [export] optimize unflattener (#115364)
Unflattening was slow on the APS FM model (which has thousands of nn.EmbeddingBag modules).

A quick glance at the profile shows that 75% of unflattening time was spent copying this node list, which is immutable and globally shared. So just passing it around as a tuple yields a 4x speedup.

Differential Revision: [D51929775](https://our.internmc.facebook.com/intern/diff/D51929775/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115364
Approved by: https://github.com/zhxchen17
2023-12-08 19:32:01 +00:00
494cb28231 [PyTorch] AOTI: add ArrayRefTensor (#112115)
This adds a shim for AOTI generated code to pretend a raw array works like an AtenTensorHandle. This allows parts of AOTI that generate uses of tensors to continue to be unaware of how those tensors are allocated. See the following diff/PR for usage.

Differential Revision: [D50570252](https://our.internmc.facebook.com/intern/diff/D50570252/)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112115
Approved by: https://github.com/chenyang78, https://github.com/desertfire
2023-12-08 19:31:50 +00:00
a2b89154bf New swap function (#111747)
This PR proposes a new approach to solve the problem of nn/optim being linked only by Python object identity.
The idea is to have a function that can swap the content of two Tensors t1 and t2 while preserving all the old references.
This would allow us to swap `model.weight` with a new Tensor (which can be any subclass of Tensor and any TensorImpl: xla, sparse, and nested tensorimpls would work). The use within nn will be done in a follow-up.

This is done by swapping the whole content of the PyObject and then putting back the fields associated with external references (refcount, gc tracking and weakrefs).
Note that we have to properly handle all the cases where there is memory used before the public PyObject* pointer and where the PyObject is bigger due to dict/weakref being inlined (older CPython versions) or due to slots.

The main limitation of this approach is that the number of slots needs to match for the objects being swapped, which thus limits the usage of slots in subclasses.

Draft right now to see what @colesbury thinks about doing this?
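
A rough usage sketch of the intended behavior. I'm assuming the helper ends up exposed as `torch.utils.swap_tensors`; the exact entry point may differ from what this PR lands:

```python
import torch

lin = torch.nn.Linear(4, 2)
old_ref = lin.weight                               # an external reference to the parameter
new_w = torch.nn.Parameter(torch.zeros(2, 4))

# Assumed public entry point for the swap described above.
torch.utils.swap_tensors(lin.weight, new_w)

assert old_ref is lin.weight                       # the old reference still points at the module's weight...
assert torch.equal(lin.weight, torch.zeros(2, 4))  # ...but its content is now the new tensor's
```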

Pull Request resolved: https://github.com/pytorch/pytorch/pull/111747
Approved by: https://github.com/colesbury
2023-12-08 18:49:35 +00:00
5f2ff29569 Fix typo in https://pytorch.org/docs/stable/sparse.html (#115282)
Fixes #111473

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115282
Approved by: https://github.com/svekars
2023-12-08 18:31:33 +00:00
68f74dd162 Add python and C++ support for LPPool3d (#114199)
Add Python and C++ support for LPPool3d. Fixes #114114
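
A small usage sketch, assuming the new module mirrors the existing `nn.LPPool2d` signature (`norm_type`, `kernel_size`, optional `stride`, `ceil_mode`):

```python
import torch
import torch.nn as nn

# Power-average (p-norm) pooling over 2x2x2 windows of a 5D input (N, C, D, H, W).
pool = nn.LPPool3d(norm_type=2, kernel_size=2, stride=2)
x = torch.randn(1, 3, 8, 8, 8)
print(pool(x).shape)  # expected: torch.Size([1, 3, 4, 4, 4])
```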

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114199
Approved by: https://github.com/mikaylagawarecki
2023-12-08 18:18:44 +00:00
1c3a4a864c Remove always restore (#115317)
Removes always restore, assuming that a HOP will cleanup any leftover state from tracing fwd + bwd

This required a minor change to the autograd fn variable higher order op. If we are tracing forward DON'T add the call_function node into the main graph, since we are only tracing it for the purposes of speculation. Instead return the result directly to be passed to the backward for speculation. This was the only observable side effect on the output graph that I found.

Test plan:
test_smoke_from_test_autograd in test_autograd_function.py

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115317
Approved by: https://github.com/voznesenskym, https://github.com/jansel
2023-12-08 18:17:37 +00:00
a3f93dc44d [EZ] [CD] Enable Triton 3.12 conda builds (#115424)
Currently there is a chicken-and-egg problem with enabling triton builds for the platform, as the package depends on `torch`, so I can only submit this change a few days after https://github.com/pytorch/pytorch/pull/114819

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115424
Approved by: https://github.com/clee2000, https://github.com/seemethere
2023-12-08 18:10:45 +00:00
81b565b142 [CI] Fix a missing write_csv_when_exception problem (#115370)
Summary: Fix a problem shown in https://github.com/pytorch/pytorch/actions/runs/7124839624/job/19400589129 when a model times out.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115370
Approved by: https://github.com/eellison
2023-12-08 18:09:53 +00:00
c370450f02 [inductor] Remove hashing of tensor data for constants (#115356)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115356
Approved by: https://github.com/eellison
2023-12-08 18:05:34 +00:00
e61d6b42f0 [Dynamo][6.1/N] Refactor out TorchInGraphFunctionVariable and improve heuristic (#113432)
This is splitted from #113009, please check https://github.com/pytorch/pytorch/pull/113009#issuecomment-1804417925 for more details.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113432
Approved by: https://github.com/ezyang, https://github.com/jansel
2023-12-08 17:15:14 +00:00
898554a3a3 [torchgen] Add logic in custom ops to return empty tensor (#114143)
Summary: Add two pieces of logic:

1. If the custom op is returning a `Tensor` but also doesn't have an out tensor as input, return an empty tensor.
2. If the custom op is returning more than one Tensor and the number of out tensors does not match the number of returned Tensors, return a tuple of empty tensors.

Test Plan: Rely on new unit tests

Differential Revision: D51471651

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114143
Approved by: https://github.com/cccclai
2023-12-08 17:03:44 +00:00
b3b5bd51ea [raas][torch][jit] Allow not storing the optimized graph (#115381)
Summary:
GraphFunction internally stores the optimized graph after generating it, and the graph is then passed into the executor, which makes a copy of it. So we effectively store the optimized graph twice.

This diff allows setting a flag to not store the optimized graph inside the GraphFunction.

The code is a no-op right now until the flag is enabled.

Test Plan:
I ran SL with this on raas with good memory savings on the raas server. From the command line:

example model run
```
buck run mode/opt-clang  sigrid/predictor/client/localnet:run_model -- --model_id_to_load=953556500 --model_snapshot_to_load=362

I1207 11:04:58.657143 3556226 SigridPredictorLocalModelFactory.cpp:32] Memory usage for 953556500_362 is 255646 Kb
```

then with flag enabled:
```
buck run mode/opt-clang  sigrid/predictor/client/localnet:run_model -- --model_id_to_load=953556500 --model_snapshot_to_load=362 --torch_jit_do_not_store_optimized_graph=true
I1207 11:06:25.245779 3577383 SigridPredictorLocalModelFactory.cpp:32] Memory usage for 953556500_362 is 165167 Kb
```
So, collectively with this flag and the flag from D51950418:
```
buck run mode/opt-clang  sigrid/predictor/client/localnet:run_model -- --model_id_to_load=953556500 --model_snapshot_to_load=362 --torch_jit_do_not_store_optimized_graph=true --torch_jit_enable_profiling_graph_executor=false

I1207 11:09:17.502743 3592345 SigridPredictorLocalModelFactory.cpp:32] Memory usage for 953556500_362 is 114848 Kb
```

Differential Revision: D51931895

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115381
Approved by: https://github.com/malfet
2023-12-08 16:29:13 +00:00
f64b10803f [SparseCsr] Remove triton sdpa skip after triton pin update (#109601)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109601
Approved by: https://github.com/desertfire, https://github.com/amjames
2023-12-08 15:49:16 +00:00
72e58a756c Set markDynamoStrictTest in functorch/test_vmap.py (#115274)
We set markDynamoStrictTest in most of functorch/test_vmap.py. This
revealed many existing failing tests, so we mark those all as expected
failures or skip them.

Test Plan:
- CI
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115274
Approved by: https://github.com/guilhermeleobas, https://github.com/kshitij12345
ghstack dependencies: #115267, #115276, #115268
2023-12-08 14:51:19 +00:00
cc8f6f56dc [quant][pt2e] Add convert callback to Observer module (#115001)
Summary:
This is to allow easier extension of the quant workflow in the future, as we are seeing more
diverse ways of doing quantization.

Putting this up for feedback first.

Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_observer_callback

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115001
Approved by: https://github.com/kimishpatel
2023-12-08 13:47:37 +00:00
ca15671c30 Fix failing test_invalid_input_csr_large (#114940)
The test introduced in #102530 has a bug:
Construction of `crow_indices` raises an exception: "value cannot be converted to type int32 without overflow", which is obviously correct.
This makes the test, which is supposed to check for an overflow in nnz, fail.
Fix by making the construction of `crow_indices` pass, although with an invalid value that would error later but triggers the correct check.

Given that I'm not sure it is even worth checking for an overflow in nnz:
- `crow_indices[..., -1] == nnz` is already enforced
- this can only hold if `crow_indices` is able to hold `nnz` without overflow
- `col_indices` has to be of the same type as `crow_indices`
- Hence the type of `col_indices` has to be able to hold the value of `nnz`

So in conclusion: The situation being checked for cannot reasonably occur

CC @pearu as the test author for additional insight

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114940
Approved by: https://github.com/pearu, https://github.com/cpuhrsch
2023-12-08 11:55:21 +00:00
23fa9621e4 [DeviceMesh] Rename _device_mesh.py to device_mesh.py to prepare for beta (#115099) (#115193)
Summary:

Rename _device_mesh.py to device_mesh.py, update all callsites, add documentation.
We created stubs for the public class and methods in torch.distributed.device_mesh so that torch.distributed.device_mesh can be imported whether or not distributed is available.

Original diff reverted: D51629761
Original PR reverted: https://github.com/pytorch/pytorch/pull/115099
Prior to landing, all CI signals passed. Shipit added the "ci/trunk" label to the PR, DID NOT wait for it, and went ahead committing. More context can be found in the reverted PR above.

Test Plan: CI.

Differential Revision: D51861018

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115193
Approved by: https://github.com/fegin
2023-12-08 08:44:32 +00:00
6c585de076 [CUDA] baddmm should fall back to addmm for batch=1 (#114992)
I.e., it feels reasonable to always call `at::cuda::gemm` rather than `at::cuda::bgemm` when num_batches == 1.
After the change, benchmark results for torch built with CUDA-12, using the [following perf script](https://gist.github.com/malfet/6a17156d7f5663b8b12054a1beff3fe1) on A100, are as follows:
|      Shape     |  bmm_time |  mm_time  | slow down (%) |
| -------------- | --------- | --------- | ------------- |
|    1x1x4096    |   14.18   |   14.31   |     -0.89     |
|    1x1x8192    |   14.37   |   14.37   |     -0.05     |
|   1x1x16384    |   14.03   |   14.12   |     -0.68     |
|   1x1x32768    |   14.19   |   14.24   |     -0.35     |
|   1x1x65536    |   14.85   |   14.52   |     2.30      |
|   1x1x131072   |   14.03   |   14.07   |     -0.33     |
|  128x128x128   |   11.34   |   11.06   |     2.56      |
|  256x256x256   |   14.85   |   14.40   |     3.15      |
|  512x512x512   |   27.22   |   27.22   |     -0.01     |
| 1024x1024x1024 |  129.66   |  129.50   |     0.12      |
| 2048x2048x2048 |  972.18   |  973.24   |     -0.11     |
|  129x127x129   |   11.21   |   11.25   |     -0.39     |
|  257x255x257   |   14.50   |   14.43   |     0.44      |
|  513x511x513   |   29.01   |   29.01   |     0.01      |
| 1025x1023x1025 |  137.65   |  137.64   |     0.01      |
| 2049x2047x2049 |  982.58   |  982.65   |     -0.01     |
|  4097x3x4097   |   86.65   |   86.64   |     0.01      |
|  8193x3x8193   |  384.02   |  383.96   |     0.02      |
| 16385x3x16385  |  1106.73  |  1107.32  |     -0.05     |
| 32769x3x32769  |  4739.49  |  4739.48  |     0.00      |
| 65537x3x65537  | 17377.78  | 17378.74  |     -0.01     |
|  4097x5x4097   |   87.09   |   87.12   |     -0.03     |
|  8193x5x8193   |  301.38   |  301.36   |     0.01      |
| 16385x5x16385  |  1107.38  |  1108.04  |     -0.06     |
| 32769x5x32769  |  4743.73  |  4744.07  |     -0.01     |
| 65537x5x65537  | 17392.32  | 17395.42  |     -0.02     |
|  4097x7x4097   |   87.17   |   87.19   |     -0.02     |
|  8193x7x8193   |  301.94   |  302.00   |     -0.02     |
| 16385x7x16385  |  1107.17  |  1106.79  |     0.03      |
| 32769x7x32769  |  4747.15  |  4747.13  |     0.00      |
| 65537x7x65537  | 17403.85  | 17405.02  |     -0.01     |

Fixes perf problem reported in https://github.com/pytorch/pytorch/issues/114911
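
The dispatch change is internal, but the equivalence it relies on is easy to sanity-check: a batched matmul with a single batch is just a plain matmul on the squeezed operands (an illustrative sketch, not the kernel path itself):

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
a = torch.randn(1, 128, 64, device=device)   # batch dimension of size 1
b = torch.randn(1, 64, 32, device=device)

out_batched = torch.bmm(a, b)                     # previously always took the bgemm path
out_single = torch.mm(a[0], b[0]).unsqueeze(0)    # the single-gemm equivalent

torch.testing.assert_close(out_batched, out_single)
```
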
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114992
Approved by: https://github.com/Skylion007, https://github.com/eqy
2023-12-08 07:53:17 +00:00
4d70802133 [c10d] Use TCPStore to record NCCL timeout and dump debug info (#115226)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115226
Approved by: https://github.com/wconstab
2023-12-08 06:19:40 +00:00
2c84616a94 Move the shape env symint cache to a symbol cache, better routing for subclass fakification [re-pr 115227] (#115396)
Context:

Joel sees that unless he manually writes to the fake tensor memo, fakification seems to produce spurious symbols! Voz (me) objects, saying that not only is directly writing to memo a bad pattern, recursively invoking fakification on tensor subclass elements in dynamo should suffice! Joel says that while he morally agrees, he has a test proving otherwise, a most perplexing situation.

Digging in, I figured out that while *we were* making fake tensors correctly, with properly cached symbols and the like, we were *also* incorrectly creating spurious symbols, leading the test to fail.

Before this PR, we would only cache source->symint. This was generally fine, but meant that you would create a symbol, then potentially throw it out due to symint cache. For example, the cache hit flow was:

make a symbol (ex: s2) -> use it to make a symint -> hit the cache (my_source-s1)

Now, in this example,  you have a symbol in your val_to_var/var_to_val (s2) that is unused. This is sound, but wasteful, and furthermore, misleading.

This was causing a test added in a PR in this stack to fail, specifically, because the test was using

```
curr_var_to_val = {
    str(k): v for k, v in context.fake_mode.shape_env.var_to_val.items()
}
```

To validate that no new symbols were being created (that is, that recursively creating fake tensors for subclasses was working).

The test is correct, but the implementation of caching would make (by this method of observation) cache hits look like cache misses.

So, the fix here is to move the cache up to be a general symbol cache, rather than only a cache for symints.

The initial implementation did that! But then, it ran into some interesting errors when it came to replay. When replaying symbol creation, behaviors would diverge in the new shape env! How could that be? The answer is because creating a new shape_env resulted in us replaying symbol creation... but with a cache from a different shape env! This was short circuiting symbol creation - and so, adding an extra layer to the cache for id(shape_env) fixes the problem.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115396
Approved by: https://github.com/mlazos
2023-12-08 05:02:21 +00:00
d0f161eae4 [vision hash update] update the pinned vision hash (#111264)
This PR is auto-generated nightly by [this action](https://github.com/pytorch/pytorch/blob/main/.github/workflows/_update-commit-hash.yml).
Update the pinned vision hash.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111264
Approved by: https://github.com/pytorchbot
2023-12-08 03:33:33 +00:00
9521331ba5 [pytorch] Multiprocessing api to use sigkill if sigterm doesn't kill the process (#115219)
Summary:
[pytorch] Multiprocessing api to use sigkill if sigterm doesn't kill the process
We have seen a handful of training jobs stuck where one of the trainers goes down
while others are stuck in C++ land and hence not handling the SIGTERM.

Test Plan: Manually validated by attaching gdb to one of the processes and sending a kill -9 to another. Saw the log ```WARNING] Unable to shutdown process 4422 via Signals.SIGTERM, forcefully exiting via Signals.SIGKILL```

Differential Revision: D51862545

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115219
Approved by: https://github.com/wconstab, https://github.com/fduwjj
2023-12-08 02:26:19 +00:00
459845b82d [cuDNN][cuDNN frontend] Bump cudnn_frontend submodule to 1.0 (#115218)
A prerequisite for cuDNN flash attention #113713 .

CC @malfet @atalman @drisspg @ptrblck

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115218
Approved by: https://github.com/drisspg, https://github.com/malfet
2023-12-08 02:24:26 +00:00
e071d6a9eb [Nested tensor]avoid using shape in python subclass NT, use _size instead (#115371)
Summary:
calling tensor.shape will call torch_dispatch which adds more overhead.

Testing overhead difference in "NT + NT" operation:
**Before:**
the add operation takes ~300us
{F1167963824}
**After:**
the add operation takes ~200us
 {F1167964056}

Test Plan: unit tests in test_nestedtensor

Differential Revision: D51949135

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115371
Approved by: https://github.com/soulitzer, https://github.com/jbschlosser
2023-12-08 02:08:36 +00:00
5432088098 Adds Checkpointer Wrapper for DCP [3/N] (#114603)
Adds a useful high level wrapper for calling `dist.save/load` with the correct storage readers and writers.

Instead of doing:

```
DCP.save(
    state_dict={...},
    storage_writer=StorageWriter(...)
)

DCP.load(
    state_dict={...},
    storage_reader=StorageReader(...)
)
```

We can now do:

```
checkpointer = Checkpointer(...)

checkpointer.save(state_dict={...})
checkpointer.load(state_dict={...})
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114603
Approved by: https://github.com/fegin, https://github.com/wz337
2023-12-08 01:03:21 +00:00
3b01f30b20 Prevent invalid pointwise ops on jagged with transposed ragged dim (#115190)
TODO: tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115190
Approved by: https://github.com/soulitzer, https://github.com/ani300
2023-12-08 00:54:03 +00:00
784e20e3d7 [C10D] Make dumpPipe use async launcher (#115375)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115375
Approved by: https://github.com/fduwjj
ghstack dependencies: #115332
2023-12-08 00:16:22 +00:00
bb7746275c Add is_integer to SymFloat (#114703)
Fixes #114676

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114703
Approved by: https://github.com/peterbell10
2023-12-07 23:23:53 +00:00
f5919335db Fix _load_from_state_dict for num_batches_tracked in batchnorm (#115285)
I approved https://github.com/pytorch/pytorch/pull/110850 which did the following

Previously:
`num_batches_tracked` not in state_dict when doing `m.load_state_dict(state_dict)` --> always overwrite module's `num_batches_tracked` in `load_from_state_dict` with a 0 cpu tensor

Now:
`num_batches_tracked` not in state_dict loaded when doing `m.load_state_dict(state_dict)` --> only overwrite module's `num_batches_tracked`  in `load_from_state_dict` with a 0 cpu tensor if module does not have `num_batches_tracked`

This causes the following issue:

```
with torch.device('meta'):
     m = BatchNorm(...)
m.load_state_dict(state_dict, assign=True)
```

If `num_batches_tracked` is not in `state_dict`, since the module's `num_batches_tracked` is present on the meta device, it is not overwritten with a 0 cpu tensor. When compiling, this error is raised

```
AssertionError: Does not support mixing cuda+meta
```

I am not sure whether the explicit check for the meta device makes sense as a fix; will add testing if this fix is OK.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115285
Approved by: https://github.com/albanD
2023-12-07 22:48:26 +00:00
18d57dde2d Remove remaining uses of copy_graphstate (#115321)
After auditing higher_order_ops.py, the graph checkpoints were only getting used in the event of an exception, so it is safe to remove because we restart analysis in this case now.

To make this clearer, the current state is the following:
```
Checkpoint side effects
Capture subgraph
if graph break:
  restore as usual
else:
  throw away inlining translator and subgraph tracer
Restore side effects
```

This will change to the following after this change:
```
Checkpoint side effects
Capture subgraph:
if graph break:
  restart analysis
else:
  throw away inlining translator and subgraph tracer
Restore side effects
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115321
Approved by: https://github.com/jansel, https://github.com/zou3519
2023-12-07 22:35:02 +00:00
ecba053cff [quant][pt2e] XNNPACKQuantizer skip inserting observers for non-float Tensors (#114999)
Summary:
As titled.

Test Plan:
python test/test_quantization.py -k test_add_mul_long

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114999
Approved by: https://github.com/kimishpatel, https://github.com/guangy10
2023-12-07 22:13:36 +00:00
dacf5d6e92 [DTensor] Remove assert to allow tensor sharding dimension < Shard(x).ndim (#115114)
Consolidated by changes made by @yoyoyocmu. https://www.internalfb.com/diff/D51821717
Remove assert to allow tensor dimension < Shard(x).ndim. With the current padding, we do support this already.

Follow up: we will still need to fix the size mismatch and `full_tensor()` hang when tensor is uneven-sharded.
Created issue here: https://github.com/pytorch/pytorch/issues/115310

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115114
Approved by: https://github.com/yoyoyocmu, https://github.com/wanchaol
2023-12-07 21:57:30 +00:00
7562b45454 Reland "[C10D] Use future for flight recorder dump (#115176)" (#115332)
Replaces the "always sleep 30 sec before abort" with "wait up to 30 sec
for the future to complete then abort". The difference in this case is
the abort happens as soon as the dump finishes up to a maximum, instead
of always waiting the maximum.

Allows multiple calls to dump, which will be serialized.

Renames tryWriteDebugInfo to launchAsyncDebugDump in spirit of the
change to support more than one launch and to always launch rather than
only launching on the first call.

Adds a test for dumping on timeout.

This reverts commit ac7d14baad53fa7d63119418f760190f289d8a01.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115332
Approved by: https://github.com/fduwjj
2023-12-07 21:20:58 +00:00
fd79995fd6 [export] Dont skip output caching for now. (#115374)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115374
Approved by: https://github.com/tugsbayasgalan
2023-12-07 20:31:30 +00:00
6a6a1e3ef7 [dtensor] update README to make all example runnable (#115365)
as titled, also add torchrun commands

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115365
Approved by: https://github.com/fegin
2023-12-07 20:23:37 +00:00
c06ab369e8 [OAT] toggle for forcing matmul precision matching (#115326)
Summary: Add a toggle to inductor config that will force matmul precision dtypes to match between cublas and triton backends for addmm, bmm, and mm operations.

Test Plan: CI + model launches

Reviewed By: jansel

Differential Revision: D51442001

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115326
Approved by: https://github.com/jansel
2023-12-07 20:22:12 +00:00
7faa67f6ef [inductor] enable mkldnn op weight pre-packing on aarch64 (#115037)
This PR enables the fx passes and mkldnn optimizations for aarch64. It improved BERT inference performance by up to 5.8x on an AWS c7g instance when comparing torch.compile() vs the no-compile path. This is enabled when pytorch is built with the USE_MKLDNN_ACL option for aarch64.
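
The optimization is applied transparently under `torch.compile`; a hedged sketch of the kind of comparison described above (model and shapes are illustrative, and the mkldnn/ACL weight pre-packing only applies on aarch64 builds with USE_MKLDNN_ACL):

```python
import time
import torch
from torch import nn

model = nn.Sequential(nn.Linear(768, 3072), nn.GELU(), nn.Linear(3072, 768)).eval()
x = torch.randn(32, 768)

compiled = torch.compile(model)   # fx passes + weight pre-packing happen here on eligible builds

with torch.no_grad():
    compiled(x)                   # warm-up (triggers compilation)
    t0 = time.perf_counter(); model(x); t1 = time.perf_counter()
    compiled(x); t2 = time.perf_counter()

print(f"eager: {t1 - t0:.5f}s  compiled: {t2 - t1:.5f}s")
```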

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115037
Approved by: https://github.com/jgong5, https://github.com/malfet
2023-12-07 19:58:38 +00:00
7201edc0a5 Fix RNN class constructor signature (#115341)
Fixes #114617

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115341
Approved by: https://github.com/mikaylagawarecki
2023-12-07 19:46:33 +00:00
21cca2494d Move test_multi_tensor_optimizers to use OptimizerInfos (#114797)
This PR aims for parity+ compared to the old testing for the simplest foreach test case.

Test coverage increase: we now test foreach optimizers with CPU as well as on GPU.

Before:
```
(pytorch-3.10) [janeyx@devgpu023.odn1 ~/local/pytorch (19136605)]$ python test/test_optim.py -v -k test_multi_tensor_optimizers
/home/janeyx/.conda/envs/pytorch-3.10/lib/python3.10/site-packages/scipy/__init__.py:146: UserWarning: A NumPy version >=1.17.3 and <1.25.0 is required for this version of SciPy (detected version 1.26.0
  warnings.warn(f"A NumPy version >={np_minversion} and <{np_maxversion}"
test_multi_tensor_optimizers (optim.test_optim.TestOptim) ... ok

----------------------------------------------------------------------
Ran 1 test in 7.253s

OK
(pytorch-3.10) [janeyx@devgpu023.odn1 ~/local/pytorch (19136605)]$
```

Now, we get granular test cases at the cost of overhead!
```
(pytorch-3.10) [janeyx@devgpu023.odn1 ~/local/pytorch (19136605)]$ python test/test_optim.py -v -k test_foreach
/home/janeyx/.conda/envs/pytorch-3.10/lib/python3.10/site-packages/scipy/__init__.py:146: UserWarning: A NumPy version >=1.17.3 and <1.25.0 is required for this version of SciPy (detected version 1.26.0
  warnings.warn(f"A NumPy version >={np_minversion} and <{np_maxversion}"
test_foreach_ASGD_cpu_float64 (__main__.TestOptimRenewedCPU) ... ok
test_foreach_Adadelta_cpu_float64 (__main__.TestOptimRenewedCPU) ... ok
test_foreach_Adagrad_cpu_float64 (__main__.TestOptimRenewedCPU) ... ok
test_foreach_AdamW_cpu_float64 (__main__.TestOptimRenewedCPU) ... ok
test_foreach_Adam_cpu_float64 (__main__.TestOptimRenewedCPU) ... ok
test_foreach_Adamax_cpu_float64 (__main__.TestOptimRenewedCPU) ... ok
test_foreach_NAdam_cpu_float64 (__main__.TestOptimRenewedCPU) ... ok
test_foreach_RAdam_cpu_float64 (__main__.TestOptimRenewedCPU) ... ok
test_foreach_RMSprop_cpu_float64 (__main__.TestOptimRenewedCPU) ... ok
test_foreach_Rprop_cpu_float64 (__main__.TestOptimRenewedCPU) ... ok
test_foreach_SGD_cpu_float64 (__main__.TestOptimRenewedCPU) ... ok
test_foreach_ASGD_cuda_float64 (__main__.TestOptimRenewedCUDA) ... ok
test_foreach_Adadelta_cuda_float64 (__main__.TestOptimRenewedCUDA) ... ok
test_foreach_Adagrad_cuda_float64 (__main__.TestOptimRenewedCUDA) ... ok
test_foreach_AdamW_cuda_float64 (__main__.TestOptimRenewedCUDA) ... ok
test_foreach_Adam_cuda_float64 (__main__.TestOptimRenewedCUDA) ... ok
test_foreach_Adamax_cuda_float64 (__main__.TestOptimRenewedCUDA) ... ok
test_foreach_NAdam_cuda_float64 (__main__.TestOptimRenewedCUDA) ... ok
test_foreach_RAdam_cuda_float64 (__main__.TestOptimRenewedCUDA) ... ok
test_foreach_RMSprop_cuda_float64 (__main__.TestOptimRenewedCUDA) ... ok
test_foreach_Rprop_cuda_float64 (__main__.TestOptimRenewedCUDA) ... ok
test_foreach_SGD_cuda_float64 (__main__.TestOptimRenewedCUDA) ... ok

----------------------------------------------------------------------
Ran 22 tests in 30.954s

OK
(pytorch-3.10) [janeyx@devgpu023.odn1 ~/local/pytorch (19136605)]$
```

Why the increase in time?
Two reasons:
1. overhead. Any _CUDA_ *Info test (OpInfo, ModuleInfo, OptimizerInfo) will wrap itself with the `CudaNonDefaultStream` policy, and `CudaNonDefaultStream.__enter__` when called for the first time will go through all visible CUDA devices and synchronize each of them, thus forcing the CUDAContext to be init'd. Doing this for all 8 devices takes ~10-15s. Also, test parametrization costs a little overhead too, but not to the level init'ing CUDA context does.
2. We test more! Now, we have 72 configs (in the foreach optimizer world) whereas we only had 59 before.

Next steps for the future:
- consider adding more Tensor LR configs (like a Tensor LR without capturable in the single tensor case)
- this is likely the next PR or 2: migrate all uses of _test_derived_optimizers in test_optim to TestOptimRenewed

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114797
Approved by: https://github.com/albanD
2023-12-07 19:37:56 +00:00
16373bbc1f fix error message in pytorch (#115349)
Fixes https://dev-discuss.pytorch.org/t/typo-in-error-message/1709 .

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115349
Approved by: https://github.com/Skylion007
2023-12-07 19:27:29 +00:00
suo
eb4ba35b07 fix test_weak.py on mac (#115367)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115367
Approved by: https://github.com/albanD
2023-12-07 19:19:56 +00:00
b0a9641815 [Inductor][fx pass] Fuse pointwise operators in the post grad (#114778)
Summary: We construct a unified API that makes it easy to add pointwise ops to be batched in the post-grad pass

Test Plan:
# unit test
```
buck2 test 'fbcode//mode/dev-nosan' fbcode//caffe2/test/inductor:group_batch_fusion
```
Buck UI: https://www.internalfb.com/buck2/19b3f641-782f-4f94-a953-3ff9ce2cfa7b
Test UI: https://www.internalfb.com/intern/testinfra/testrun/1125900251953016
Network: Up: 67KiB  Down: 32KiB  (reSessionID-c2a80f26-8227-4f78-89fc-bcbda0ae8353)
Jobs completed: 18. Time elapsed: 1:19.8s.
Cache hits: 0%. Commands: 2 (cached: 0, remote: 0, local: 2)
Tests finished: Pass 6. Fail 0. Fatal 0. Skip 0. Build failure 0
# local reproduce
### cmf
P881792289
### igctr
### dsnn
### icvr

Reviewed By: xuzhao9

Differential Revision: D51332067

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114778
Approved by: https://github.com/xuzhao9
2023-12-07 19:04:03 +00:00
3a5fb0d456 markDynamoStrictTest in functorch/test_eager_transforms.py (#115268)
We're doing some more work around the functorch-torch.compile
interaction. The current state is that these tests might not get run in
the Dynamo CI shard. Using this decorator makes them actually run (by
resetting the Dynamo state before/after each test).

Test Plan:
Wait for CI
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115268
Approved by: https://github.com/voznesenskym, https://github.com/guilhermeleobas
ghstack dependencies: #115267, #115276
2023-12-07 18:42:21 +00:00
a1bfaf75dc markDynamoStrictTest: add nopython flag, set default to False (#115276)
Default should be False because in general, we're interested
in reliability and composability: we want to check that
running PyTorch with and without Dynamo has the same semantics (with
graph breaks allowed).

Test Plan:
Existing tests?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115276
Approved by: https://github.com/voznesenskym
ghstack dependencies: #115267
2023-12-07 18:42:21 +00:00
2847045ed9 Set _dynamo.config.capture_func_transforms=False (#115267)
Due to not all tests in the Dynamo shard actually running in CI, we've
started to bitrot on this implementation. Since our plan is to trace
into the functorch implementations instead of constructing a HOP
(which is what capture_func_transforms=True does), let's turn off this
config by default.

Test Plan:
- Tests

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115267
Approved by: https://github.com/voznesenskym, https://github.com/guilhermeleobas
2023-12-07 18:42:15 +00:00
3e66385ddd Add Work to distributed docs (#115172)
Summary:
Documenting the `Work` object

For a collective (broadcast, all_reduce, etc.) with async_op=True, we return a `Work` object on which users can call `.wait()`, `.is_success()`, among other things, but this class is not documented
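
For reference, the shape of the API being documented (a minimal sketch; it assumes a process group has been initialized, e.g. via torchrun):

```python
import torch
import torch.distributed as dist

dist.init_process_group("gloo")   # e.g. launched with torchrun --nproc_per_node=2

t = torch.ones(4)
work = dist.all_reduce(t, async_op=True)   # returns a Work handle instead of blocking
work.wait()                                # block until the collective completes
print(work.is_completed(), t)
```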

Test Plan: Preview the docs build in OSS

Differential Revision: D51854974

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115172
Approved by: https://github.com/wconstab
2023-12-07 18:12:10 +00:00
ee8b33f7d5 Fixed crash when calling pad_packed_tensor when packed with cuda tensors and ensure_sorted=false due to indexing with tensors on different devices (#115028)
Fixes #115027

Fix in csrc as done in the python code [here](https://github.com/pytorch/pytorch/blob/main/torch/nn/utils/rnn.py#L338).
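
For context, a minimal sketch of the scenario in the title, using the standard `torch.nn.utils.rnn` helpers (the fix itself lands in the C++ counterpart of this path, e.g. as hit by scripted code):

```python
import torch
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

seqs = torch.randn(3, 5, 8, device="cuda")   # padded batch on GPU: (batch, time, features)
lengths = torch.tensor([3, 5, 2])            # unsorted lengths, kept on CPU as required
packed = pack_padded_sequence(seqs, lengths, batch_first=True, enforce_sorted=False)

# Unpacking re-applies unsorted_indices; with CUDA data, this indexing is where the
# device mismatch occurred in the C++ implementation.
out, out_lengths = pad_packed_sequence(packed, batch_first=True)
```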

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115028
Approved by: https://github.com/drisspg
2023-12-07 18:09:18 +00:00
suo
686a3e0bf0 [pytorch][PR] introduce WeakHashRef (#115216)
We would like weak dictionaries that have `torch.ScriptObject` keys. Similar to tensors, we need to override the behavior of the ref to do the right thing under comparison.

This change also makes it so that WeakIdKeyDictionary works with a pluggable ref_type.

Differential Revision: [D51828205](https://our.internmc.facebook.com/intern/diff/D51828205/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115216
Approved by: https://github.com/albanD
2023-12-07 17:48:11 +00:00
684ce1b21d Revert "Assert that output could only be the last node of the FX graph (#115179)"
This reverts commit 4a9fb9832abc00dff9729b7d7a9647b376882f38.

Reverted https://github.com/pytorch/pytorch/pull/115179 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/115179#issuecomment-1845776365))
2023-12-07 17:26:27 +00:00
dd6ae6d3b4 [HigherOrderOp] Remove additional get item calls in MapHigherOrder. (#115207)
As titled, this PR removes the unnecessary getitem call from the graph that's manipulated in MapHigherOrder. There we want to get the first-dim slice of the original tensor for speculation, but using call_method would accidentally create a get_item call in the graph, so we avoid it by calling unpack_var_sequence on the input tensor.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115207
Approved by: https://github.com/yanboliang
ghstack dependencies: #115115, #115204, #115205
2023-12-07 17:06:44 +00:00
8b74735878 [HigherOrderOp] make MapHigherOrder create map_impl call_function node instead of map (#115205)
We want to remove the map_wrapper and replace it with dynamo always on. This is the first step of that plan.

In this PR, we make dynamo directly generate map_impl nodes. This hasn't touched the eager logic yet. So the execution path after this PR looks like: 1. `dynamo -> map_impl` when torch.compile is on (before this PR, it's `dynamo -> map_wrapper -> map_impl`), and 2. `map_wrapper -> map_impl` (this PR didn't touch the logic here).

The added TODO(yidi) is addressed in the following PR.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115205
Approved by: https://github.com/yanboliang
ghstack dependencies: #115115, #115204
2023-12-07 17:06:44 +00:00
be3efbebb6 [HigherOrderOp] make MapHigherOrder use should_flatten_output=True (#115204)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115204
Approved by: https://github.com/yanboliang
ghstack dependencies: #115115
2023-12-07 17:06:35 +00:00
998c87f93c [BE][HigherOrderOp] extract redundant code that unflattens the output (#115115)
We need this function to unflatten the variable tracker for HOPs that want pytree output support, e.g. map.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115115
Approved by: https://github.com/yanboliang
2023-12-07 17:06:28 +00:00
43f42bf3cb Updated docs for deprecated torch.set_default_tensor_type (#115041)
Added deprecation note for torch.set_default_tensor_type. Updated docs that referenced this method.
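
A short sketch of the deprecated call next to the finer-grained setters that typically replace it:

```python
import torch

# Deprecated: one global switch that couples the default dtype and device.
torch.set_default_tensor_type(torch.DoubleTensor)

# Preferred replacements.
torch.set_default_dtype(torch.float64)   # controls the default floating-point dtype
torch.set_default_device("cpu")          # controls the default device for factory functions

print(torch.ones(2).dtype)  # torch.float64
```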

Fixes #113646.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115041
Approved by: https://github.com/janeyx99
2023-12-07 16:17:36 +00:00
441ecf03e2 Update gloo submodule (#115158)
Updates to pull ROCm 6.0 related changes and a few minor updates in gloo.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115158
Approved by: https://github.com/jeffdaily, https://github.com/malfet
2023-12-07 15:55:08 +00:00
cyy
7b8084d1c6 [5/N] Fixes clang-tidy warnings in c10/core/*.h (#115232)
This PR continues to fix clang-tidy warnings for headers in c10/core.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115232
Approved by: https://github.com/Skylion007
2023-12-07 15:48:03 +00:00
d08b20d534 Update FlashAttention to v2.3.6 (#115313)
# Summary
This PR updates the FlashAttention code from 02ac572f3f (tag 2.3.2) to 92dd5703ec (tag 2.3.6).

As well, I think this should be cherry-picked into the 2.2.0 release since there was a temporary ~15% perf regression for causal masking. It is not technically a regression since Flash wasn't released yet, but it would be nice to have in the release.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115313
Approved by: https://github.com/Skylion007
2023-12-07 15:47:16 +00:00
78b945484b [c10d] Extend NCCL communicator splitting to more use cases (#114916)
Previously we could only use `ncclCommSplit` when we knew all backends were connected on all shards (due to the need to perform a NOCOLOR split), which in practice meant we could only use it for subgroups that were copies of the entire world.

This change allows for specifying a bound device id to `init_process_group` which tells the pg and its backends that the specified device, and the specified device only, will be associated with this rank.

This guarantee lets us do an early connect (which we could not previously do due to how ProcessGroupNCCL infers devices based on tensors and not the rank number). And by doing the early connect, we have the guarantee that ranks are connected and can perform nocolor splits when needed.
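
A hedged sketch of what binding a device at init time looks like (I'm assuming the new argument is named `device_id`; the exact name comes from this PR):

```python
import os
import torch
import torch.distributed as dist

local_rank = int(os.environ.get("LOCAL_RANK", 0))
device = torch.device("cuda", local_rank)

# Binding this rank to a single device lets the NCCL backend connect eagerly,
# which in turn allows later subgroups to be created via ncclCommSplit.
dist.init_process_group("nccl", device_id=device)   # assumed kwarg name

subgroup = dist.new_group(ranks=[0, 1])
```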

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114916
Approved by: https://github.com/kwen2501
2023-12-07 15:13:01 +00:00
a6736ac851 Add call to run_tests for a few tests (#115097)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115097
Approved by: https://github.com/wconstab, https://github.com/fduwjj
2023-12-07 08:27:40 +00:00
3c882925da Make subclass type instances constants (like UserDefinedClasses) (#115323)
As title

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115323
Approved by: https://github.com/oulgen
2023-12-07 08:10:59 +00:00
5e3631db31 [DTensor] force re-compute sharding when normalized_shape differs in fwd layer norm (#115250)
**Summary**:
#114174 did not test the case where `elementwise_affine=False` (i.e. `weight` and `bias` are `None`), and this test would fail due to cached sharding propagation. The difference in sharding prop between these cases is that, when `weight` and `bias` are None, the forward layer norm op will be recognized as a "static shape op" and `propagate_op_sharding` will be applied rather than `propagate_op_sharding_non_cached`. A fix is to force re-computing sharding when `normalized_shape` changes by setting the op schema's `RuntimeSchemaInfo.static_argnum` to include `normalized_shape` (i.e. 1).

**Test**:
pytest test/distributed/_tensor/test_math_ops.py -s -k layer_norm

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115250
Approved by: https://github.com/wanchaol
2023-12-07 07:44:06 +00:00
622688fab9 [export] Fix graph output mismatch issue with constant outputs. (#115280)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115280
Approved by: https://github.com/tugsbayasgalan
2023-12-07 06:11:08 +00:00
e1f159e6b2 Remove redundant API named is_int_list (#115136)
Fixes #114933

As the title states.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115136
Approved by: https://github.com/zou3519
2023-12-07 04:55:13 +00:00
5309ac1b98 Add test case to prove non-strict export supports external call (#115245)
Current non-strict test cases (added in #114697) are already supported by strict mode, so they can't demonstrate the incremental value of non-strict mode. How about adding test cases that fail in strict mode but pass in non-strict mode?

Test Plan:
python test/export/test_export.py -k test_external_call_non_strict_real_tensor
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115245
Approved by: https://github.com/tugsbayasgalan, https://github.com/zhxchen17
2023-12-07 04:51:15 +00:00
a93b9ee9d8 [quant][be] Add a test for per channel quant for groupwise conv (#115224)
Summary:
just making sure this works

Test Plan:
python test/test_quantization.py -k test_groupwise_per_channel_quant

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115224
Approved by: https://github.com/andrewor14
2023-12-07 04:46:20 +00:00
b7eb9b1e7e [Autotune] Enable register pressure handling logic for H100. (#115295)
I have seen the register pressure handling logic help performance on H100 for a couple of kernels. Also, my local runs of Huggingface and timm_models both show neutral results.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115295
Approved by: https://github.com/jansel
2023-12-07 04:37:44 +00:00
f55ab176fc [OAT] move matmul precision out of system info (#115242)
Summary: move matmul precision out of the system info (system hash) and into the cache in preparation for switching precisions during compile

Test Plan: CI

Reviewed By: jansel

Differential Revision: D51442000

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115242
Approved by: https://github.com/jansel
2023-12-07 04:30:06 +00:00
7ec145bfed [Quant] [PT2] Fix XNNPACKQuantizer set_module_type issue (#115252)
**Summary**
Fix the issue https://github.com/pytorch/pytorch/issues/115251. The root cause is that we pass the `filter_fn` parameter of `find_sequential_partitions` in the wrong position. Use a keyword arg to fix this issue.

**Test Plan**
```
python -u -m pytest -s -v test_quantization.py -k test_set_module_type_case_2
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115252
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
2023-12-07 03:08:20 +00:00
6c0a4ced53 [export] Add math.* ops to pass base (#115271)
Fixes https://github.com/pytorch/pytorch/issues/115209

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115271
Approved by: https://github.com/ydwu4
2023-12-07 02:47:04 +00:00
d7160c9223 Handle potential ValueError exception when stringifying signals (#114696)
On some systems it is possible to receive a signal that does not have a name.  Rare, but possible.  This prevents our error handler from crashing and instead properly reports the signal.
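
The underlying Python behavior, as a standalone sketch (not the actual error-handler code):

```python
import signal

def describe_signal(signum: int) -> str:
    try:
        return signal.Signals(signum).name   # raises ValueError if the number has no named signal
    except ValueError:
        return f"Unknown signal {signum}"

print(describe_signal(signal.SIGTERM))   # "SIGTERM"
print(describe_signal(37))               # falls back gracefully, e.g. for unnamed real-time signals
```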

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114696
Approved by: https://github.com/xmfan
2023-12-07 02:10:30 +00:00
ac7d14baad Revert "[C10D] Use future for flight recorder dump (#115176)"
This reverts commit 0e07e3dbe434ce31a5aea634628c7d39747f265f.

Reverted https://github.com/pytorch/pytorch/pull/115176 on behalf of https://github.com/huydhn due to Sorry for reverting your change, but the test_timeout_dumps is failing in trunk 0e07e3dbe4 ([comment](https://github.com/pytorch/pytorch/pull/115176#issuecomment-1844076455))
2023-12-07 02:09:58 +00:00
3a18211622 Guard on subclass inner tensors (#114965)
This PR introduces guarding on subclass inner tensors.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114965
Approved by: https://github.com/voznesenskym
ghstack dependencies: #114311, #115212
2023-12-07 01:47:48 +00:00
c163b3c035 [export] Remove runtime assertion pass (#115196)
Reland of https://github.com/pytorch/pytorch/pull/111949/

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115196
Approved by: https://github.com/avikchaudhuri
2023-12-07 01:44:11 +00:00
73c0035160 Add reset_storage method to FunctionalTensorWrapper (#115235)
In certain edge cases when using lazy tensors, the base tensor stored in the `FunctionalStorageImpl` and the `value_` tensor stored in the `FunctionalTensorWrapper` diverge. For instance, take this simple example
```python
class Model(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = torch.nn.Linear(4, 2, bias=False)

    def forward(self, x):
        return x @ self.fc1.weight.transpose(0, 1)

with torch.device("lazy"):
    model = Model()

    x = torch.ones(4)
    out = model(x)
```
The call to `transpose` on the lazily initialized weight `fc1.weight` applies a view op on the functional tensor, which only gets propagated to the functional tensor wrapper and not to the base tensor in the storage, thus causing them to diverge.

To fix this behaviour, we need to reset the functional tensor's storage. To facilitate this, we add a `reset_storage` method to `FunctionalTensorWrapper` which clears away the old storage and view metas.

CC: @behzad-a @GlebKazantaev @wconstab @bdhirsh
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115235
Approved by: https://github.com/bdhirsh
2023-12-07 01:32:01 +00:00
cyy
4e9fe496cd Remove c10::either (#112733)
Time to remove it.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112733
Approved by: https://github.com/albanD
2023-12-07 01:31:53 +00:00
240f4b2d25 make __lookup_backend return None when cache misses (#114766)
Fixes #114674. The error occurs because cached_backends is a thread-local object; when it's accessed from another thread, we get a cache miss. The naive fix is to just return None and recompile when the cache misses. This could also be related to making dynamo more thread-safe, but I'm not sure whether there is an ongoing effort or not.
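
A standalone sketch of the failure mode in plain Python (not the dynamo internals): a thread-local cache is simply empty when read from another thread, so the lookup has to tolerate a miss instead of raising.

```python
import threading

cached_backends = threading.local()
cached_backends.store = {"my_backend": object()}   # populated on the main thread

def lookup(name):
    store = getattr(cached_backends, "store", None)
    if store is None or name not in store:
        return None            # cache miss: the caller recompiles instead of crashing
    return store[name]

t = threading.Thread(target=lambda: print(lookup("my_backend")))   # prints None
t.start(); t.join()
print(lookup("my_backend") is not None)                            # True on the main thread
```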

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114766
Approved by: https://github.com/IvanYashchuk, https://github.com/Neilblaze, https://github.com/jansel
2023-12-07 00:25:01 +00:00
7457a5f4be [inductor] adapt to the get_max_simd_tflops Triton API change (#115288)
Differential Revision: D51907617

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115288
Approved by: https://github.com/hl475, https://github.com/chenyang78
2023-12-07 00:22:06 +00:00
ae5365819d [ONNX] Extend test_fx_op_consistency.py to cover ExportedProgram model type (#114886)
This PR adds `ExportedProgram` coverage to `test_fx_op_consistency.py`, which helps us identify the necessary but missing io_steps.
Next, we should refactor the tests to actually cover all ops supported by the registry.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114886
Approved by: https://github.com/thiagocrepaldi
2023-12-07 00:03:23 +00:00
3642f29a64 DistributedDataParallel._post_forward, fix return (#114678)
Fix `return` in case of `_delay_all_reduce_all_params`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114678
Approved by: https://github.com/Skylion007, https://github.com/fegin
2023-12-06 23:44:52 +00:00
0e07e3dbe4 [C10D] Use future for flight recorder dump (#115176)
Replaces the "always sleep 30 sec before abort" with "wait up to 30 sec
for the future to complete then abort".  The difference in this case is
the abort happens as soon as the dump finishes up to a maximum, instead
of always waiting the maximum.

Allows multiple calls to dump, which will be serialized.

Renames `tryWriteDebugInfo` to `launchAsyncDebugDump` in spirit of the
change to support more than one launch and to always launch rather than
only launching on the first call.

Adds a test for dumping on timeout.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115176
Approved by: https://github.com/zdevito
2023-12-06 23:42:19 +00:00
0757e2ba84 [aotautograd] Fix an output shape error when inputs are aliased (#115279)
Summary: Per https://github.com/pytorch/pytorch/issues/97083, when an output
is marked as OutputType.is_input but a synthetic base is constructed
because of aliased inputs, we may need to update the output type to
OutputType.alias_of_input.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115279
Approved by: https://github.com/bdhirsh
2023-12-06 23:10:21 +00:00
7e0e124a5d Automated submodule update: FBGEMM (#115103)
This is an automated pull request to update the first-party submodule for [pytorch/FBGEMM](https://github.com/pytorch/FBGEMM).

New submodule commit: dbc3157bf2

Test Plan: Ensure that CI jobs succeed on GitHub before landing.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115103
Approved by: https://github.com/malfet
2023-12-06 22:47:40 +00:00
83cb6a75ad [dynamo] add list iterator contains (#115237)
Fixes https://github.com/pytorch/pytorch/issues/115236

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115237
Approved by: https://github.com/jansel
2023-12-06 22:26:16 +00:00
71bf4f3b87 [CI] Add torch/_functorch/_aot_autograd to auto-label rule (#115283)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115283
Approved by: https://github.com/bdhirsh
2023-12-06 20:07:53 +00:00
1489e4bcf3 [Quant] [PT2] Enable batchnorm in _move_exported_model_to_eval (#114547)
**Summary**
Add standalone batchnorm into `_move_exported_model_to_eval` to move it from training mode into eval mode

**Test Plan**
```
python -m pytest test_mkldnn_pattern_matcher.py -k test_qat_bn_conv2d
python -u -m pytest -s -v test_quantize_pt2e.py -k test_bn_move_exported_model_to_eval
```

Differential Revision: [D51853407](https://our.internmc.facebook.com/intern/diff/D51853407)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114547
Approved by: https://github.com/jgong5, https://github.com/andrewor14
2023-12-06 19:51:22 +00:00
c99db5617a Introduce general metadata cache to jagged layout NestedTensor (#115212)
Slight refactor to:
* lazily compute min / max seq_len used for flash. this avoids unnecessary graph breaks / specialization when we're not accessing these
* store min / max seq_len in a general `metadata_cache`. condensing these should make it easier to avoid specializing on these and others we may add in the future
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115212
Approved by: https://github.com/soulitzer, https://github.com/ani300
ghstack dependencies: #114311
2023-12-06 19:40:35 +00:00
b6de337d16 [funcol] a few optimizations to funcol (#113324)
Apply a few optimizations to funcol:

- For allgather on a non-0 dim, the resulting tensor already needs to access
data in order to do torch.cat, so we sync-wait here so that we don't
need to go through ACT dispatch for chunk + cat altogether (see the plain-tensor sketch after this list).
- Have a fast-return path for aten.view as it's a commonly hit op for
view-related ops.
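
A plain-tensor sketch of the reshuffle mentioned in the first bullet (illustrative only, not the funcol implementation): the collective always gathers along dim 0, so a gather along another dim is finished with chunk + cat on the already-materialized result.

```python
import torch

world_size, gather_dim = 4, 1
local = torch.randn(2, 3)

# Stand-in for the all_gather output, which always stacks shards along dim 0.
gathered_dim0 = torch.cat([local.clone() for _ in range(world_size)], dim=0)   # shape (8, 3)

# Finishing the gather along dim 1 requires chunk + cat, which touches the data.
out = torch.cat(torch.chunk(gathered_dim0, world_size, dim=0), dim=gather_dim)
print(out.shape)   # torch.Size([2, 12])
```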

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113324
Approved by: https://github.com/XilunWu
2023-12-06 19:25:35 +00:00
2cf0cf8137 [dynamo / DDP] - lazily compile submodules - to propagate real tensor strides to backend compiler (#114154)
Fixes https://github.com/pytorch/pytorch/issues/113812, https://github.com/pytorch/pytorch/issues/102591, Probably fixes: https://github.com/pytorch/pytorch/issues/113740, https://github.com/pytorch/pytorch/issues/113786, https://github.com/pytorch/pytorch/issues/113788

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114154
Approved by: https://github.com/wconstab, https://github.com/yf225
2023-12-06 18:50:14 +00:00
967863d91d [export][refactor][3/n] Move unlift to separate file (#114787)
Differential Revision: [D51823960](https://our.internmc.facebook.com/intern/diff/D51823960)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114787
Approved by: https://github.com/ydwu4
ghstack dependencies: #114764, #114768
2023-12-06 16:46:47 +00:00
0ab57ee7ea [export][refactor][2/n] Move tracing logic (#114768)
2/n of refactoring export code:

* Moved tracing logic in torch/_export/init.py to torch/export/_tracer.py

Differential Revision: [D51823961](https://our.internmc.facebook.com/intern/diff/D51823961)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114768
Approved by: https://github.com/ydwu4
ghstack dependencies: #114764
2023-12-06 16:46:47 +00:00
53bf8cfcf9 [export][refactor][1/n] Move dynamic shapes logic (#114764)
1/n of refactoring export code:
* Moved dynamic shapes/constraints/dynamic_dims logic in torch/_export/__init__.py and torch/export/__init__.py to torch/export/dynamic_shapes.py

Differential Revision: [D51823962](https://our.internmc.facebook.com/intern/diff/D51823962)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114764
Approved by: https://github.com/ydwu4
2023-12-06 16:46:38 +00:00
5f939e32e3 [CI] Log load_model failures in csv (#114784)
Summary: Right now when load_model fails (either because of loading error or validation eager run failure), the result won't be logged in generated csv files. Let's log them in csv so that they are monitored by the expected results checking.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114784
Approved by: https://github.com/malfet
2023-12-06 15:19:16 +00:00
67c8ad7285 Fix autograd.Function x enum input x torch.compile (#115206)
Fixes https://github.com/pytorch/pytorch/issues/114777. We treat Enums
like we do ConstantVariable.

Test Plan:
New test

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115206
Approved by: https://github.com/yanboliang
ghstack dependencies: #115185, #115186, #115187
2023-12-06 15:18:25 +00:00
233ce0d24b Support GPU annotations for auto-trace jobs similar on-demand support (#114638)
Summary: When using auto_trace, gpu_user_annotation is not shown in the results. Fixing this by including `GPU_USER_ANNOTATION` in `kCudaTypes`.

Differential Revision: D51597995

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114638
Approved by: https://github.com/aaronenyeshi
2023-12-06 09:38:13 +00:00
d4c79a3078 Add an attention bias subclass for a lower right causal masking (#114823)
# Summary
This PR introduces a new Tensor subclass that is designed to be used with torch.nn.functional.scaled_dot_product_attention. Currently we have a boolean `is_causal` flag that allows users to do causal masking without the need to actually create the "realized" attention bias and pass it into sdpa. We originally added this flag since there is native support in both fused kernels we support. This provides a big performance gain (the kernels only need to iterate over ~0.5x the sequence, and for very large sequence lengths this can provide very large memory improvements).

The flag was introduced when the early on in the kernel development and at the time it was implicitly meant to "upper_left" causal attention. This distinction only matters when the attention_bias is not square. For a more detailed break down see: https://github.com/pytorch/pytorch/issues/108108. The kernels default behavior has since changed, largely due to the rise of autogressive text generation. And unfortunately this would lead to a BC break. In the long term it may actually be beneficial to change the default meaning of `is_causal` to represent lower_right causal masking.

The larger theme though is laid here: https://github.com/pytorch/pytorch/issues/110681. The thesis being that there is alot of innovation in SDPA revolving around the attention_bias being used. This is the first in hopefully a few more attention_biases that we would like to add. The next interesting one would be `sliding_window` which is used by the popular mistral model family.
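
To make the upper-left vs. lower-right distinction concrete, here is a small illustration built with plain boolean masks rather than the new subclass (shapes and values are made up for the example):

```python
import torch
import torch.nn.functional as F

q_len, kv_len = 2, 5  # non-square case: 2 new queries attending over 5 keys
q = torch.randn(1, 1, q_len, 8)
k = torch.randn(1, 1, kv_len, 8)
v = torch.randn(1, 1, kv_len, 8)

# Upper-left: the causal diagonal starts at the top-left corner,
# so query i only sees keys 0..i.
upper_left = torch.tril(torch.ones(q_len, kv_len, dtype=torch.bool))

# Lower-right: the diagonal is aligned to the bottom-right corner,
# so the last query sees every key (the usual decoding semantics).
lower_right = torch.tril(torch.ones(q_len, kv_len, dtype=torch.bool),
                         diagonal=kv_len - q_len)

out_ul = F.scaled_dot_product_attention(q, k, v, attn_mask=upper_left)
out_lr = F.scaled_dot_product_attention(q, k, v, attn_mask=lower_right)
```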

Results from benchmarking: I improved the meff_attention perf, hence the slightly decreased max perf.
```Shell
+---------+--------------------+------------+-----------+-----------+-----------+-----------+----------------+----------+
|  Type   |      Speedup       | batch_size | num_heads | q_seq_len | k_seq_len | embed_dim |     dtype      | head_dim |
+---------+--------------------+------------+-----------+-----------+-----------+-----------+----------------+----------+
| Average | 1.2388050062214226 |            |           |           |           |           |                |          |
|   Max   | 1.831672915579016  |    128     |    32     |   1024    |   2048    |   2048    | torch.bfloat16 |    64    |
|   Min   | 0.9430534166730135 |     1      |    16     |    256    |    416    |   2048    | torch.bfloat16 |   128    |
+---------+--------------------+------------+-----------+-----------+-----------+-----------+----------------+----------+
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114823
Approved by: https://github.com/cpuhrsch
2023-12-06 08:29:26 +00:00
4a9fb9832a Assert that output could only be the last node of the FX graph (#115179)
Test Plan: unit tests

Differential Revision: D51856848

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115179
Approved by: https://github.com/Chillee
2023-12-06 08:17:16 +00:00
fcf6a76108 [aot_inductor][pass] fuse parallel linear based on pre grad aten IR (#114776)
Summary:
This work is for PT2 inference. Since the IR from Export will change to pre-grad aten IR in a few months, we need to start this work now. Here is what this diff does:
1) Copy the fuse parallel linear pass to the fb folder and adapt it to aten IR. We still want to keep the original `group_batch_fusion.py` because it is still used in training. In the future, when PT2 training decides to retire the torch-IR-based group_batch_fusion, we can remove it. But right now it's better to keep the torch IR and aten IR versions separate.

Our plan is to gradually transform the existing, important pre-grad passes into aten-IR-based passes.

Differential Revision: D51017854

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114776
Approved by: https://github.com/zhxchen17
2023-12-06 05:48:20 +00:00
d250b2158e [4/N] Fixes clang-tidy warnings in header files (#115163)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115163
Approved by: https://github.com/Skylion007
2023-12-06 05:00:01 +00:00
f4c67ffff4 [dynamo] Improve support for dynamic shapes str.format and _assert (#115203)
This removes a graph break in vision_maskrcnn.
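
A rough sketch (illustrative only; the function and message are made up) of the kind of code this targets, formatting and asserting on a symbolic size inside a compiled function:

```python
import torch

@torch.compile(dynamic=True)
def f(x):
    # Formatting a symbolic size and asserting on it should no longer
    # force a graph break under dynamic shapes.
    n = x.size(0)
    assert n > 0, "expected a non-empty batch, got {}".format(n)
    return x * 2

f(torch.randn(4, 3))
f(torch.randn(7, 3))
```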

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115203
Approved by: https://github.com/yanboliang
2023-12-06 04:54:45 +00:00
4ff4e06b5b Update xla pin (#115211)
This updates the pin past 062aa91a9c so the flaky test can be skipped
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115211
Approved by: https://github.com/malfet
2023-12-06 04:52:37 +00:00
534f25887b [inductor] avoid inplace for ComplexView (#115166)
Fix https://github.com/pytorch/pytorch/issues/115071
A regression introduced by https://github.com/pytorch/pytorch/pull/112875/files#diff-d2539c9c8dc6a3d7e457767a880612e96d3c85752a77ead49a9e4e00a3e4c3c7R335

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115166
Approved by: https://github.com/Skylion007
2023-12-06 04:52:28 +00:00
490f2d7570 Skip privateuse1's checkZeroPoints (#114117)
We want to use ``quantize_per_channel`` to create a quantized tensor, but we found that ``checkZeroPoints`` fails for the ``privateuse1`` backend.

``quantize_tensor_per_channel_affine`` calls ``checkZeroPoints`` for all backends except ``CUDA``:
140c54e6cc/aten/src/ATen/native/quantized/AffineQuantizer.cpp (L162-L164)

However, our ``privateuse1`` backend hits a segmentation fault if we try to cast our data to int64_t in ``checkZeroPoints``:
140c54e6cc/aten/src/ATen/native/quantized/AffineQuantizer.cpp (L82-L88)

So could we skip ``privateuse1``'s ``checkZeroPoints`` and perform this check in the actual device function instead? What do you think?
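
For reference, a minimal example of the entry point under discussion (run here on the default CPU backend; the failure above is specific to a `privateuse1` device):

```python
import torch

x = torch.randn(3, 4)
scales = torch.tensor([0.1, 0.2, 0.3])
zero_points = torch.tensor([0, 0, 0])

# Per-channel quantization along dim 0; checkZeroPoints validates zero_points
# on the C++ side before the backend-specific kernel runs.
qx = torch.quantize_per_channel(x, scales, zero_points, axis=0, dtype=torch.qint8)
print(qx.int_repr())
```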
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114117
Approved by: https://github.com/jerryzh168
2023-12-06 04:44:49 +00:00
acdd06e00f [executorch hash update] update the pinned executorch hash (#115215)
This PR is auto-generated nightly by [this action](https://github.com/pytorch/pytorch/blob/main/.github/workflows/_update-commit-hash.yml).
Update the pinned executorch hash.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115215
Approved by: https://github.com/pytorchbot
2023-12-06 04:33:25 +00:00
a548e80536 Use test_vulkan to validate run_test without boto3 (#115233)
Use `test_vulkan` because `test_weak` can undergo changes, whereas `test_vulkan` is a no-op for CPU builds.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115233
Approved by: https://github.com/suo
2023-12-06 03:45:52 +00:00
2bff36bb0e [c10d] Change set timeout API name to _set_default_timeout (#115197)
Somehow the feedback did not show up; this PR addresses the comment in https://github.com/pytorch/pytorch/pull/115141.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115197
Approved by: https://github.com/XilunWu, https://github.com/wconstab
2023-12-06 03:38:39 +00:00
b56b002842 Fix NULL dereference in binary CPU ops (#115183)
Targeted fix for https://github.com/pytorch/pytorch/issues/113037

A more fundamental fix, where those functions are not even called for empty tensors, is coming later.
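
For illustration, the class of call this hardens (a made-up minimal case, not the exact repro from the issue):

```python
import torch

# Binary CPU ops on empty tensors should return an empty result rather than
# dereferencing a null data pointer.
a = torch.empty(0)
b = torch.empty(0)
print(torch.add(a, b))      # tensor([])
print(torch.mul(a, 2.0))    # tensor([])
```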

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115183
Approved by: https://github.com/drisspg, https://github.com/atalman, https://github.com/huydhn
2023-12-06 03:37:47 +00:00
892a14a450 [vision hash update] update the pinned vision hash (#111408)
This PR is auto-generated nightly by [this action](https://github.com/pytorch/pytorch/blob/main/.github/workflows/_update-commit-hash.yml).
Update the pinned vision hash.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111408
Approved by: https://github.com/pytorchbot
2023-12-06 03:25:52 +00:00
ef6cbf4e1f remove myself from CODEOWNERS (#115230)
Trying to rein in my notifications ;-)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115230
Approved by: https://github.com/malfet
2023-12-06 02:50:50 +00:00
b0b190f7c0 More descriptive error message for unsupported inputs to HOP (#115187)
Test Plan:
See updated tests

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115187
Approved by: https://github.com/ydwu4, https://github.com/yanboliang
ghstack dependencies: #115185, #115186
2023-12-06 01:29:03 +00:00
b5b011a5cd Expand input types for HOPs that use manually_set_subgraph_inputs=False (#115186)
Previously we only supported Tensor, Constants, and SymNode. We lift
that restriction (there's not really a good reason for it). HOPs like
torch.cond, torch.map already do input validation (those are the ones
that can only support Tensor, Constant, and SymNode inputs).

Test Plan:
New test for `wrap`, which is a HOP that has
manually_set_subgraph_inputs=False

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115186
Approved by: https://github.com/ydwu4, https://github.com/yanboliang
ghstack dependencies: #115185
2023-12-06 01:29:03 +00:00
bc46347152 Refactor how HOPs create new args to subgraphs (#115185)
This PR combines the logic for Tensor and SymNode.

Test Plan:
- Existing tests

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115185
Approved by: https://github.com/ydwu4, https://github.com/yanboliang
2023-12-06 01:29:03 +00:00
f6291a5e93 [Quant] [Inductor] Enable QLinear weight prepack when input dimension size exceeds 2 (#113928)
**Summary**
Enable the qlinear weight prepack when the input dimension size exceeds 2. There are extra reshape nodes before and after the `addmm` or `mm` node in that case.

**Test Plan**
```
python -m pytest test_mkldnn_pattern_matcher.py -k input_dim_exceeds_2
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113928
Approved by: https://github.com/jgong5, https://github.com/eellison
ghstack dependencies: #113733, #113912
2023-12-06 01:24:15 +00:00
6d0cf26c3a [Quant] [Inductor] Enable Dequant Promotion when Linear input dimension size exceeds 2 (#113912)
**Summary**
When decomposing `Linear` to `addmm` or `mm` within Inductor, if the input dimension size exceeds 2, `reshape` nodes are introduced to convert the input into a 2-dimensional form before and after the `addmm` or `mm` node. It is essential to identify and match this pattern during quantization for dequantization promotion. For instance,
```
        #            quant
        #      + - - - | - - - +
        #      |    dequant    |
        #      |       |       |
        #      |    reshape    |
        #      |    /     \    |
        #      |  node1  node2 |
        #      + - | - - - | - +
        #        reshape reshape
        #      + - | - - - | - +
        #        quant    quant
```
In this PR, we mainly do 2 things:

- Extend support for the dequantization pattern in QLinear when the input dimension size exceeds 2.
- Revise the implementation of the dequant promotion pass, as it now needs to accommodate the matching of four different patterns.

**Test Plan**
```
python -m pytest test_mkldnn_pattern_matcher.py -k input_dim_exceeds_2
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113912
Approved by: https://github.com/jgong5, https://github.com/eellison
ghstack dependencies: #113733
2023-12-06 01:20:36 +00:00
4a624d1f8a [Quant] [PT2] Enable QLinear input with multi dims (#113733)
**Summary**
In the previous QLinear implementation, it was assumed that inputs have a dimension of 2. In this update, we have modified QLinear to accept inputs with a dimension greater than 2, incorporating input and output reshaping accordingly.
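
As a rough illustration of the reshaping involved (plain float ops, not the quantized kernels themselves), a Linear on a 3-D input is equivalent to flattening the leading dims, running `addmm`, and restoring the shape:

```python
import torch

x = torch.randn(8, 16, 32)           # (batch, seq, in_features)
w = torch.randn(64, 32)              # (out_features, in_features)
b = torch.randn(64)

x2d = x.reshape(-1, 32)              # flatten leading dims -> (128, 32)
y2d = torch.addmm(b, x2d, w.t())     # 2-D matmul with bias
y = y2d.reshape(8, 16, 64)           # restore leading dims

assert torch.allclose(y, torch.nn.functional.linear(x, w, b), atol=1e-5)
```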

**Test Plan**
```
python -u -m pytest -s -v test_quantized_op.py -k test_qlinear_pt2e
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113733
Approved by: https://github.com/jgong5, https://github.com/eellison
2023-12-06 01:16:51 +00:00
b8ce05456c enable cat for cuda bits types (#115044)
It was already working for CPU, so this brings parity.
Also, slightly reduce the number of compiled kernels by using OpaqueType.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115044
Approved by: https://github.com/malfet
2023-12-06 00:05:18 +00:00
b9c4fb68c5 [ONNX][Bench] Fix model name retrieval and remove unused argument (#115108)
There might have been some upstream updates; the previous hack no longer picks up model names, so update to use the other, more appropriate variable.
Also fix a bug with an unused argument that was supposed to be removed.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115108
Approved by: https://github.com/thiagocrepaldi
2023-12-05 23:55:12 +00:00
ae457a2c4a [PyTorch] Change test_aot_inductor CPU test failures syntax (#115180)
This portion of D50416438 is extremely subject to merge conflicts. It can also be safely landed without full CI round trip because it changes just one test file that we can simply run to make sure it works.

Differential Revision: [D51856943](https://our.internmc.facebook.com/intern/diff/D51856943/)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115180
Approved by: https://github.com/mikekgfb, https://github.com/desertfire
2023-12-05 23:55:08 +00:00
01ec71e466 [NFC][Autotune] Use device_prop.regsPerMultiprocessor instead of hardcoded reg number. (#115094)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115094
Approved by: https://github.com/jansel
2023-12-05 23:49:46 +00:00
1102d37958 remove aot_config.keep_inference_input_mutations from assert_functional_graph (#115195)
We technically allow backends of aot_autograd to pass a config saying "yes, I am ok with seeing input mutations in my graph".

With https://github.com/pytorch/pytorch/pull/112906 though, there can be input mutations that show up in the backward (which we need to handle for correctness), and they are a large pain to keep out of the graph. The meta-point is that it's been ~a year since we added the config, and it almost always makes sense for backends to support input mutations for performance reasons (inductor does). So I just allow these input mutations in the graph in this rare backward situation, even if the backend didn't explicitly use the config.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115195
Approved by: https://github.com/drisspg
2023-12-05 23:36:37 +00:00
7aac689b19 [inductor] Add ir.Scan and lower aten.cumsum on CUDA (#106581)
This adds the `ir.Scan` node (currently only supported on CUDA) which re-uses the existing reduction kernel machinery to support different kinds of non-pointwise ops. Just like reductions it supports prologue and epilogue fusions and has both persistent and non-persistent kernel generation.

Currently this doesn't support the equivalent of `Reduction.create_multilayer` and will instead fall back to eager in those cases. This is because splitting into multiple kernel invocations ends up being far slower than cub's single kernel strategy which matches the performance of a copy kernel.
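
For example, a compiled cumulative sum now lowers through the new scan support instead of falling back (a trivial sketch; requires a CUDA device):

```python
import torch

@torch.compile
def prefix_sum(x):
    # cumsum is a non-pointwise op that the new ir.Scan node can codegen.
    return torch.cumsum(x, dim=-1)

x = torch.randn(4, 1024, device="cuda")
print(prefix_sum(x)[:, -1])  # last column holds each row's total
```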

Fixes https://github.com/pytorch/pytorch/issues/93631

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106581
Approved by: https://github.com/lezcano, https://github.com/atalman
2023-12-05 23:31:49 +00:00
d78fe039eb Introduce OptimizerInfos + add a test_errors (#114178)
Introduce OptimizerInfos + use them to refactor out the error testing.

Why OptimizerInfos?
- cleaner, easier way to test all configs of optimizers
- would plug in well with devicetype to auto-enable tests for devices like MPS, meta
- would allow for more granular testing. currently, lots of functionality is tested in `_test_basic_cases` and some of that should be broken down more.

What did I do for error testing?
- I moved out some error cases from `_test_basic_cases` into a new test_errors parametrized test.
- The new test has to live in TestOptimRenewed (bikeshedding welcome) because the parametrized tests need to take in device and dtype and hook correctly, and not all tests in TestOptim do that.
- TestOptimRenewed also is migrating to the toplevel test/test_optim.py now because importing TestOptimRenewed does not work (because of test instantiation, TestOptimRenewed gets replaced with TestOptimRenewedDevice for CPU, CUDA, and whatever other device).

Is there any change in test coverage?
- INCREASE: The error case where a single Parameter (vs a container of them) is passed in has now expanded to all optims instead of only LBFGS
- DECREASE: Not much. The only thing is we no longer test two error cases for foreach=True AND foreach=False, which I think is redundant. (Highlighted in comments)

Possible but not urgent next step: test ALL possible error cases by going through all the constructors.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114178
Approved by: https://github.com/albanD
2023-12-05 22:58:36 +00:00
99257002fa Extend auto_functionalized to support ops that return Tensors (#115135)
We can auto-functionalize operators that mutate their inputs as long as
the outputs of the operator do not alias their inputs. The user needs
to provide an abstract impl for the operator if it has non-trivial
returns.
- We update can_auto_functionalize(op) to include ops that return (but
  do not alias) Tensors
- We update auto_functionalized(op, mutated_args_names, kwargs) to
  return (out, mutated_args), where `out = op(**kwargs)` and
  `mutated_args` are the new values of the inputs that would have been
  mutated.
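
A conceptual, pure-Python sketch of that contract (illustrative only; this is not the actual PT2 implementation, and the op is made up):

```python
import torch

def my_mutating_op(buf, x):
    buf.add_(x)          # mutates its input ...
    return buf * 2       # ... and also returns a (non-aliasing) tensor

def auto_functionalized(op, mutated_arg_names, kwargs):
    # Run the op on clones so the caller's tensors are untouched, then hand
    # back both the op's output and the post-mutation values of the inputs.
    clones = {name: kwargs[name].clone() for name in mutated_arg_names}
    out = op(**{**kwargs, **clones})
    return out, tuple(clones[name] for name in mutated_arg_names)

buf, x = torch.zeros(3), torch.ones(3)
out, (new_buf,) = auto_functionalized(my_mutating_op, ["buf"], {"buf": buf, "x": x})
print(out, new_buf, buf)   # buf is unchanged; new_buf holds the mutated value
```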

Test Plan:
- new test

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115135
Approved by: https://github.com/bdhirsh
ghstack dependencies: #114955, #114956, #115134
2023-12-05 22:43:06 +00:00
d0aad93249 Refactor can_auto_functionalize (#115134)
In preparation for the next PR up in the stack, which is going to update
"can_auto_functionalize" to support more operators than just ones that
return nothing. We are unable to auto-generate FakeTensor kernels for
operators that do not return nothing, but we are able to generate
functionalization kernels for operators that return something.

Test Plan:
Existing tests

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115134
Approved by: https://github.com/bdhirsh
ghstack dependencies: #114955, #114956
2023-12-05 22:43:06 +00:00
4620170008 [Dynamo] Revert multiple PRs since they caused compilation to get stuck internally (#115126)
Revert the following PRs to mitigate an internal issue where compilation gets stuck:
#113432
#114016
#114507
#114196
#114739
#114669

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115126
Approved by: https://github.com/xush6528
2023-12-05 22:35:37 +00:00
80527c0cf2 [AOTInductor] Double buffering for Weights (#114446)
Summary:
This adds a function to the model container that does weight swapping with double buffering.

There are two parts to double buffering:
a) Write constants into the inactive buffer
b) Swap the active buffer

For (a), we write the constants into the buffer that is currently not in use, and store the information in both the constants map and the corresponding constant array used for reads.
For (b), we take the lock, activate the constant map/constant array that was inactive, and flag the one that was in use as inactive.
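
A language-agnostic sketch of the same idea in Python (the real implementation lives in the C++ model container; names here are illustrative):

```python
import threading

class DoubleBufferedConstants:
    """Generic double-buffering sketch; not the AOTInductor API."""

    def __init__(self, initial):
        self._buffers = [dict(initial), dict(initial)]
        self._active = 0
        self._lock = threading.Lock()

    def read(self, name):
        # Readers always see the currently active buffer.
        return self._buffers[self._active][name]

    def write_inactive(self, updates):
        # (a) Write new constants into the buffer that is not in use.
        self._buffers[1 - self._active].update(updates)

    def swap(self):
        # (b) Take the lock and flip which buffer is active.
        with self._lock:
            self._active = 1 - self._active
```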

Test Plan:
test/cpp/aot_inductor/test.cpp

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: [D51543732](https://our.internmc.facebook.com/intern/diff/D51543732)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114446
Approved by: https://github.com/chenyang78, https://github.com/eellison
2023-12-05 22:31:56 +00:00
12085914b8 Replace bsr_dense_mm triton kernel with bsr_dense_addm triton kernel (#115030)
The `bsr_dense_addmm` triton kernel introduced in https://github.com/pytorch/pytorch/pull/114595 is a generalization of the `bsr_dense_mm` triton kernel and a more efficient version of it, because it uses an extra kernel parameter `SPLIT_N` that has a notable effect on performance when the r.h.s. operand has a larger number of columns.

This PR eliminates the `bsr_dense_mm` triton kernel in favor of the `bsr_dense_addmm` triton kernel.
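
For context, the BSR layout these kernels consume can be built like this (shown on CPU for brevity; the Triton kernels themselves target CUDA):

```python
import torch

dense = torch.randn(64, 64)
bsr = dense.to_sparse_bsr((16, 16))

# Block-compressed row layout: row pointers over block rows, and values
# stored as (num_blocks, 16, 16) tiles.
print(bsr.crow_indices().shape)  # torch.Size([5])      -> 64/16 block rows + 1
print(bsr.values().shape)        # torch.Size([16, 16, 16])
```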

The performance increase of `bsr_dense_mm` is as follows (float16, `NVIDIA A100-SXM4-80GB`):
- with 16x16 blocks, the average/maximal speed up is 50/71 %
- with 32x32 blocks, the average/maximal speed up is 30/63 %
- with 64x64 blocks, the average/maximal speed up is 12/26 %
- with 128x128 blocks, the average/maximal speed up is 7/17 %

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115030
Approved by: https://github.com/cpuhrsch
2023-12-05 22:29:24 +00:00
f35f52e4a6 Update auto_request_review.yml (#115182)
remove myself to avoid notification noise

Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115182
Approved by: https://github.com/huydhn, https://github.com/albanD
2023-12-05 21:36:18 +00:00
f09e8381b7 [Inductor][fx pass] Fix a bug in batch linear fusion in the post grad (#115061) (#115131)
Summary:

Titled

Test Plan:
```
buck2 test 'fbcode//mode/dev-nosan' fbcode//caffe2/test/inductor:group_batch_fusion
```
Buck UI: https://www.internalfb.com/buck2/ab4b918c-9ffa-4d00-a747-880521a27851
Test UI: https://www.internalfb.com/intern/testinfra/testrun/16607023638890043
Network: Up: 11MiB  Down: 117MiB  (reSessionID-079402d0-8fd7-4797-9ed5-dd0f778dce1a)
Jobs completed: 189430. Time elapsed: 2:02.5s.
Cache hits: 99%. Commands: 77000 (cached: 76995, remote: 5, local: 0)
Tests finished: Pass 7. Fail 0. Fatal 0. Skip 0. Build failure 0

Reviewed By: mengluy0125

Differential Revision: D51796899

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115131
Approved by: https://github.com/mengluy0125
2023-12-05 21:20:17 +00:00
ab120e65fb Fix FSDP + TP state dict in param unflattening (#115105)
Summary:
This diff fixes the param unflattening when using FSDP together with TP. Currently we hardcode the `reshape_size` multiplier to 2, when it should instead be the size of the process group.

Before the fix, example exception: `shape '[257, 514]' is invalid for input of size 264196`, where the process group size is 4 instead of 2.

Test Plan:
**CI**:
CI test

**Unit test**:
`buck2 test mode/dev-nosan //caffe2/test/distributed/tensor/parallel:fsdp_2d_parallel`
- Passed

**Test model with WHEN**:
- Verified that checkpoint can be saved and resumed successfully;
- Verified the accuracy with window_ne, which is on-par with baseline.
https://pxl.cl/3Wp8w

Differential Revision: D51826120

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115105
Approved by: https://github.com/fegin
2023-12-05 21:19:56 +00:00
22704426c3 Expand dynamic dims support for traceable subclasses (#114311)
Continuation of #112185, following the design in this [doc](https://docs.google.com/document/d/1ipSxcTzEMMOAPvxP-YJlD5JBZZmIGgh8Q34ixtOUCRo).

Summary:
* Introduce `SubclassSymbolicPolicy` containing separate dynamic dim / constraint policies for the outer and inner tensors
    * Expand the automatic dynamic algorithm to recurse into inner tensors and produce one of these for a subclass instance
    * Maintain legacy behavior for subclasses by recursively calling `mark_dynamic()` on inner tensors *of the same dim as outer* when `mark_dynamic(outer, ...)` is called
    * Addresses this: 6a86cf00ad/torch/_dynamo/variables/builder.py (L1750)
* Add `outer_size` and `outer_stride` arguments to `__tensor_unflatten__()` so that you can find out what symbols were allocated for the outer size / stride (you are expected to return a tensor that compares equal to the outer symbols)
    * Signatures now:
    ```python
    # attrs is a list of inner tensor attributes on x; inner_tensor = getattr(x, attr)
    # ctx is anything useful for rebuilding the class we want to guard on
    attrs, ctx = x.__tensor_flatten__()
    ...
    # inner_tensors is a dict of {attr -> tensor}
    # ctx is taken unmodified from flattening and (eventually) guarded on
    # outer_size is the expected size of the output; possibly symbolic
    # outer_stride is the expected strides of the output; possibly symbolic
    y = MySubclass.__tensor_unflatten__(inner_tensors, ctx, outer_size, outer_stride)

    # at the __tensor_unflatten__() call-site in PT2, we assert y.shape == outer_size and y.stride() == outer_stride
    # the assert simplifies symbols when there are relationships between outer and inner symbols
    ```
    * Size info needed for `NestedTensor` at least, stride info needed for `DTensor` at least
    * Punting on `outer_storage_offset` because storage_offset handling is horribly broken in PT2 right now
* ~~Add new `__tensor_mark_dynamic__()` to allow overriding the behavior of mark_dynamic on a per-subclass basis~~ (booted to future work)
* ~~Add guards for tensor subclasses by calling `__tensor_flatten__()` in the guard to test equality on `ctx`~~
    * Now handled in #114469
* Next PR: add TENSOR_MATCH guards on inner tensors

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114311
Approved by: https://github.com/ezyang, https://github.com/drisspg, https://github.com/voznesenskym, https://github.com/bdhirsh
2023-12-05 21:09:25 +00:00
259a99669d [NCCL flight recorder] Dump when writing to pipe (#115139)
If TORCH_NCCL_DUMP_ON_TIMEOUT is set, then along with producing a dump
file when a timeout happens, you can trigger a dump by writing to local pipe
`<TORCH_NCCL_DEBUG_INFO_TEMP_FILE>_<rank>.pipe` (by default
/tmp/nccl_trace_{rank}_<rank>.pipe).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115139
Approved by: https://github.com/wconstab
2023-12-05 20:44:23 +00:00
5fdae89c03 [docs][aoti] Link to export docs in AOTI docs (#115088)
Context: https://fb.workplace.com/groups/1075192433118967/posts/1341833143121560/?comment_id=1341841786454029

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115088
Approved by: https://github.com/desertfire
2023-12-05 20:22:42 +00:00
630 changed files with 21775 additions and 24340 deletions

View File

@ -1 +1 @@
b2f5dfe80704404298467347b8ee3ac229efed47
ca6322dcfc51b209a06b76d160bd95d81d58f15c

View File

@ -61,7 +61,6 @@ install_ubuntu() {
${maybe_libiomp_dev} \
libyaml-dev \
libz-dev \
libjemalloc2 \
libjpeg-dev \
libasound2-dev \
libsndfile-dev \

View File

@ -298,8 +298,3 @@ pywavelets==1.4.1
# it here because 1.5.0 conflicts with numpy 1.21.2 used in CI
#Pinned versions: 1.4.1
#test that import:
lxml==4.9.4
#Description: This is a requirement of unittest-xml-reporting
# have to pin to 4.9.4 because 5.0.0 release on Dec 29th missing
# Python-3.9 binaries

View File

@ -1 +1 @@
2.2.0
2.1.0

View File

@ -28,8 +28,6 @@ echo "Environment variables:"
env
if [[ "$BUILD_ENVIRONMENT" == *cuda* ]]; then
# Use jemalloc during compilation to mitigate https://github.com/pytorch/pytorch/issues/116289
export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2
echo "NVCC version:"
nvcc --version
fi

View File

@ -173,7 +173,7 @@ function install_torchrec_and_fbgemm() {
function clone_pytorch_xla() {
if [[ ! -d ./xla ]]; then
git clone --recursive -b r2.2 https://github.com/pytorch/xla.git
git clone --recursive --quiet https://github.com/pytorch/xla.git
pushd xla
# pin the xla hash so that we don't get broken by changes to xla
git checkout "$(cat ../.github/ci_commit_pins/xla.txt)"

View File

@ -500,7 +500,11 @@ test_inductor_torchbench_smoketest_perf() {
python benchmarks/dynamo/torchbench.py --device cuda --performance --bfloat16 --inference \
--export-aot-inductor --only nanogpt --output "$TEST_REPORTS_DIR/inductor_inference_smoketest.csv"
# The threshold value needs to be actively maintained to make this check useful
python benchmarks/dynamo/check_perf_csv.py -f "$TEST_REPORTS_DIR/inductor_inference_smoketest.csv" -t 5.2
# The perf number of nanogpt seems not very stable, e.g.
# https://github.com/pytorch/pytorch/actions/runs/7158691360/job/19491437314,
# and thus we lower its threshold to reduce flakiness. If this continues to be a problem,
# we switch to use some other model.
python benchmarks/dynamo/check_perf_csv.py -f "$TEST_REPORTS_DIR/inductor_inference_smoketest.csv" -t 4.9
# Check memory compression ratio for a few models
for test in hf_Albert timm_vision_transformer; do
@ -1069,7 +1073,7 @@ elif [[ "${TEST_CONFIG}" == *torchbench* ]]; then
# https://github.com/opencv/opencv-python/issues/885
pip_install opencv-python==4.8.0.74
if [[ "${TEST_CONFIG}" == *inductor_torchbench_smoketest_perf* ]]; then
checkout_install_torchbench hf_Bert hf_Albert timm_vision_transformer
checkout_install_torchbench hf_Bert hf_Albert nanogpt timm_vision_transformer
PYTHONPATH=$(pwd)/torchbench test_inductor_torchbench_smoketest_perf
else
checkout_install_torchbench

.circleci/config.yml generated
View File

@ -657,8 +657,6 @@ jobs:
TEST_CONFIG: << parameters.test-config >>
SHARD_NUMBER: << parameters.shard-number >>
NUM_TEST_SHARDS: << parameters.num-test-shards >>
PYTORCH_RETRY_TEST_CASES: 1
PYTORCH_OVERRIDE_FLAKY_SIGNAL: 1
steps:
- checkout
- attach_workspace:

View File

@ -211,8 +211,6 @@
TEST_CONFIG: << parameters.test-config >>
SHARD_NUMBER: << parameters.shard-number >>
NUM_TEST_SHARDS: << parameters.num-test-shards >>
PYTORCH_RETRY_TEST_CASES: 1
PYTORCH_OVERRIDE_FLAKY_SIGNAL: 1
steps:
- checkout
- attach_workspace:

View File

@ -3,6 +3,7 @@ self-hosted-runner:
- linux.20_04.4x
- linux.20_04.16x
- linux.large
- linux.large.arc
- linux.2xlarge
- linux.4xlarge
- linux.12xlarge

View File

@ -46,8 +46,7 @@ runs:
retry_wait_seconds: 30
command: |
set -eux
# PyYAML 6.0 doesn't work with MacOS x86 anymore
python3 -m pip install requests==2.26.0 pyyaml==6.0.1
python3 -m pip install requests==2.26.0 pyyaml==6.0
- name: Parse ref
id: parse-ref

View File

@ -12,7 +12,6 @@ reviewers:
symbolic-shapes:
- symbolic-shapes
- antoniojkim
- wconstab
- SherlockNoMad
Chillee:
- ezyang

View File

@ -1 +1 @@
c1e2095c3a16fbe7db25b9e2f206025488c2c203
e12d200c97d7aab668b976e92b46513c9ca7a0d8

View File

@ -1 +1 @@
r2.2
a80c1e7f958e7d8e8f92319db70876940e67ad9b

.github/labeler.yml vendored
View File

@ -23,8 +23,9 @@
- torch/_subclasses/meta_utils.py
- test/distributed/test_dynamo_distributed.py
- test/distributed/test_inductor_collectives.py
- torch/_functorch/partitioners.py
- torch/_functorch/_aot_autograd/**
- torch/_functorch/aot_autograd.py
- torch/_functorch/partitioners.py
- .ci/docker/ci_commit_pins/**
- .github/ci_commit_pins/**
- c10/core/Sym*
@ -72,7 +73,7 @@
"ciflow/trunk":
- .ci/docker/ci_commit_pins/triton.txt
"module: distributed":
"oncall: distributed":
- torch/csrc/distributed/**
- torch/distributed/**
- torch/nn/parallel/**

View File

@ -84,14 +84,7 @@ def build_triton(
triton_repo = "https://github.com/openai/triton"
triton_pkg_name = "pytorch-triton"
check_call(["git", "clone", triton_repo], cwd=tmpdir)
if release:
ver, rev, patch = version.split(".")
check_call(
["git", "checkout", f"release/{ver}.{rev}.x"], cwd=triton_basedir
)
else:
check_call(["git", "checkout", commit_hash], cwd=triton_basedir)
check_call(["git", "checkout", commit_hash], cwd=triton_basedir)
if build_conda:
with open(triton_basedir / "meta.yaml", "w") as meta:
print(
@ -137,7 +130,7 @@ def build_triton(
cwd=triton_basedir,
env=env,
)
conda_path = list(Path(tmpdir).glob("linux-64/torchtriton*.bz2"))[0]
conda_path = next(iter(Path(tmpdir).glob("linux-64/torchtriton*.bz2")))
shutil.copy(conda_path, Path.cwd())
return Path.cwd() / conda_path.name
@ -163,7 +156,7 @@ def build_triton(
[sys.executable, "setup.py", "bdist_wheel"], cwd=triton_pythondir, env=env
)
whl_path = list((triton_pythondir / "dist").glob("*.whl"))[0]
whl_path = next(iter((triton_pythondir / "dist").glob("*.whl")))
shutil.copy(whl_path, Path.cwd())
if build_rocm:

View File

@ -62,9 +62,9 @@ SUPPORTED_PERIODICAL_MODES: Dict[str, Callable[[Optional[str]], bool]] = {
}
# The link to the published list of disabled jobs
DISABLED_JOBS_URL = "https://ossci-metrics.s3.amazonaws.com/disabled-jobs.json?versionId=jbbJUxI_SSZFssBBGCU6ybH9sxHitHLY"
DISABLED_JOBS_URL = "https://ossci-metrics.s3.amazonaws.com/disabled-jobs.json"
# and unstable jobs
UNSTABLE_JOBS_URL = "https://ossci-metrics.s3.amazonaws.com/unstable-jobs.json?versionId=hUtTalgnWb1m3AtJyVLUdu7DBrnddRkp"
UNSTABLE_JOBS_URL = "https://ossci-metrics.s3.amazonaws.com/unstable-jobs.json"
# Some constants used to handle disabled and unstable jobs
JOB_NAME_SEP = "/"

View File

@ -16,12 +16,6 @@ from typing import Dict, List, Optional, Tuple
CUDA_ARCHES = ["11.8", "12.1"]
CUDA_ARCHES_FULL_VERSION = {"11.8": "11.8.0", "12.1": "12.1.1"}
CUDA_ARCHES_CUDNN_VERSION = {"11.8": "8", "12.1": "8"}
ROCM_ARCHES = ["5.6", "5.7"]
@ -30,7 +24,6 @@ CPU_CXX11_ABI_ARCH = ["cpu-cxx11-abi"]
CPU_AARCH64_ARCH = ["cpu-aarch64"]
PYTORCH_EXTRA_INSTALL_REQUIREMENTS = {
"11.8": (
"nvidia-cuda-nvrtc-cu11==11.8.89; platform_system == 'Linux' and platform_machine == 'x86_64' | " # noqa: B950
@ -93,7 +86,9 @@ def get_nccl_wheel_version(arch_version: str) -> str:
requirements = map(
str.strip, re.split("[;|]", PYTORCH_EXTRA_INSTALL_REQUIREMENTS[arch_version])
)
return [x for x in requirements if x.startswith("nvidia-nccl-cu")][0].split("==")[1]
return next(x for x in requirements if x.startswith("nvidia-nccl-cu")).split("==")[
1
]
def validate_nccl_dep_consistency(arch_version: str) -> None:

View File

@ -1,42 +0,0 @@
#!/usr/bin/env python3
"""Generates a matrix for docker releases through github actions
Will output a condensed version of the matrix. Will include fllowing:
* CUDA version short
* CUDA full verison
* CUDNN version short
* Image type either runtime or devel
* Platform linux/arm64,linux/amd64
"""
import json
from typing import Dict, List
import generate_binary_build_matrix
DOCKER_IMAGE_TYPES = ["runtime", "devel"]
def generate_docker_matrix() -> Dict[str, List[Dict[str, str]]]:
ret: List[Dict[str, str]] = []
for cuda, version in generate_binary_build_matrix.CUDA_ARCHES_FULL_VERSION.items():
for image in DOCKER_IMAGE_TYPES:
ret.append(
{
"cuda": cuda,
"cuda_full_version": version,
"cudnn_version": generate_binary_build_matrix.CUDA_ARCHES_CUDNN_VERSION[
cuda
],
"image_type": image,
"platform": "linux/arm64,linux/amd64",
}
)
return {"include": ret}
if __name__ == "__main__":
build_matrix = generate_docker_matrix()
print(json.dumps(build_matrix))

View File

@ -8,7 +8,7 @@
# NOTE: If testing pytorch/builder changes you can change this variable to change what pytorch/builder reference
# the binary builds will check out
{%- set builder_repo = "pytorch/builder" -%}
{%- set builder_branch = "release/2.2" -%}
{%- set builder_branch = "main" -%}
{%- macro concurrency(build_environment) -%}
concurrency:
@ -36,7 +36,7 @@ concurrency:
{%- macro setup_ec2_windows() -%}
!{{ display_ec2_information() }}
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}

View File

@ -99,13 +99,13 @@ jobs:
with:
name: !{{ config["build_name"] }}
path: "${{ runner.temp }}/artifacts/"
!{{ common.checkout(deep_clone=False, directory="pytorch", checkout_pr_head=False) }}
!{{ common.checkout(deep_clone=False, directory="builder", repository=common.builder_repo, branch=common.builder_branch, checkout_pr_head=False) }}
!{{ common.checkout(deep_clone=False, directory="pytorch") }}
!{{ common.checkout(deep_clone=False, directory="builder", repository=common.builder_repo, branch=common.builder_branch) }}
- name: ROCm set GPU_FLAG
run: |
echo "GPU_FLAG=--device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon" >> "${GITHUB_ENV}"
- name: Pull Docker image
uses: pytorch/test-infra/.github/actions/pull-docker-image@release/2.2
uses: pytorch/test-infra/.github/actions/pull-docker-image@main
with:
docker-image: !{{ config["container_image"] }}
- name: Test Pytorch binary

View File

@ -81,8 +81,8 @@ jobs:
elif [ -d "/Applications/Xcode_13.3.1.app" ]; then
echo "DEVELOPER_DIR=/Applications/Xcode_13.3.1.app/Contents/Developer" >> "${GITHUB_ENV}"
fi
!{{ common.checkout(deep_clone=False, directory="pytorch", checkout_pr_head=False) }}
!{{ common.checkout(deep_clone=False, directory="builder", repository=common.builder_repo, branch=common.builder_branch, checkout_pr_head=False) }}
!{{ common.checkout(deep_clone=False, directory="pytorch") }}
!{{ common.checkout(deep_clone=False, directory="builder", repository=common.builder_repo, branch=common.builder_branch) }}
- name: Install sccache (only for non-forked PRs, and pushes to trunk)
uses: nick-fields/retry@v2.8.2
if: ${{ github.event_name == 'push' || github.event.pull_request.head.repo.full_name == github.repository }}

View File

@ -65,8 +65,8 @@ jobs:
steps:
!{{ common.setup_ec2_windows() }}
!{{ set_runner_specific_vars() }}
!{{ common.checkout(deep_clone=False, directory="pytorch", checkout_pr_head=False) }}
!{{ common.checkout(deep_clone=False, directory="builder", repository=common.builder_repo, branch=common.builder_branch, checkout_pr_head=False) }}
!{{ common.checkout(deep_clone=False, directory="pytorch") }}
!{{ common.checkout(deep_clone=False, directory="builder", repository=common.builder_repo, branch=common.builder_branch) }}
- name: Populate binary env
shell: bash
run: |
@ -105,8 +105,8 @@ jobs:
with:
name: !{{ config["build_name"] }}
path: "${{ env.PYTORCH_FINAL_PACKAGE_DIR }}"
!{{ common.checkout(deep_clone=False, directory="pytorch", checkout_pr_head=False) }}
!{{ common.checkout(deep_clone=False, directory="builder", repository=common.builder_repo, branch=common.builder_branch, checkout_pr_head=False) }}
!{{ common.checkout(deep_clone=False, directory="pytorch") }}
!{{ common.checkout(deep_clone=False, directory="builder", repository=common.builder_repo, branch=common.builder_branch) }}
- name: Populate binary env
shell: bash
run: |

View File

@ -29,6 +29,7 @@ env:
jobs:
filter:
if: github.repository_owner == 'pytorch'
runs-on: [self-hosted, linux.large]
outputs:
test-matrix: ${{ steps.filter.outputs.test-matrix }}
@ -36,7 +37,7 @@ jobs:
keep-going: ${{ steps.filter.outputs.keep-going }}
steps:
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.2
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
with:
fetch-depth: 1
submodules: false
@ -58,25 +59,25 @@ jobs:
runs-on: ${{ matrix.runner }}
steps:
- name: Setup SSH (Click me for login details)
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
# [see note: pytorch repo ref]
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.2
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
- name: Setup Linux
uses: ./.github/actions/setup-linux
- name: Calculate docker image
id: calculate-docker-image
uses: pytorch/test-infra/.github/actions/calculate-docker-image@release/2.2
uses: pytorch/test-infra/.github/actions/calculate-docker-image@main
with:
docker-image-name: ${{ inputs.docker-image-name }}
- name: Pull docker image
uses: pytorch/test-infra/.github/actions/pull-docker-image@release/2.2
uses: pytorch/test-infra/.github/actions/pull-docker-image@main
with:
docker-image: ${{ steps.calculate-docker-image.outputs.docker-image }}
@ -140,5 +141,5 @@ jobs:
if: always()
- name: Teardown Linux
uses: pytorch/test-infra/.github/actions/teardown-linux@release/2.2
uses: pytorch/test-infra/.github/actions/teardown-linux@main
if: always()

View File

@ -29,6 +29,7 @@ env:
jobs:
filter:
if: github.repository_owner == 'pytorch'
runs-on: [self-hosted, linux.large]
outputs:
test-matrix: ${{ steps.filter.outputs.test-matrix }}
@ -36,7 +37,7 @@ jobs:
keep-going: ${{ steps.filter.outputs.keep-going }}
steps:
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.2
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
with:
fetch-depth: 1
submodules: false
@ -58,25 +59,25 @@ jobs:
runs-on: ${{ matrix.runner }}
steps:
- name: Setup SSH (Click me for login details)
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
# [see note: pytorch repo ref]
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.2
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
- name: Setup Linux
uses: ./.github/actions/setup-linux
- name: Calculate docker image
id: calculate-docker-image
uses: pytorch/test-infra/.github/actions/calculate-docker-image@release/2.2
uses: pytorch/test-infra/.github/actions/calculate-docker-image@main
with:
docker-image-name: ${{ inputs.docker-image-name }}
- name: Pull docker image
uses: pytorch/test-infra/.github/actions/pull-docker-image@release/2.2
uses: pytorch/test-infra/.github/actions/pull-docker-image@main
with:
docker-image: ${{ steps.calculate-docker-image.outputs.docker-image }}
@ -185,5 +186,5 @@ jobs:
if: always()
- name: Teardown Linux
uses: pytorch/test-infra/.github/actions/teardown-linux@release/2.2
uses: pytorch/test-infra/.github/actions/teardown-linux@main
if: always()

View File

@ -33,6 +33,7 @@ env:
jobs:
filter:
if: github.repository_owner == 'pytorch'
runs-on: [self-hosted, linux.large]
outputs:
test-matrix: ${{ steps.filter.outputs.test-matrix }}
@ -41,7 +42,7 @@ jobs:
reenabled-issues: ${{ steps.filter.outputs.reenabled-issues }}
steps:
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.2
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
with:
fetch-depth: 1
submodules: false
@ -63,30 +64,30 @@ jobs:
runs-on: ${{ matrix.runner }}
steps:
- name: Setup SSH (Click me for login details)
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
# [see note: pytorch repo ref]
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.2
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
- name: Setup Linux
uses: ./.github/actions/setup-linux
- name: Calculate docker image
id: calculate-docker-image
uses: pytorch/test-infra/.github/actions/calculate-docker-image@release/2.2
uses: pytorch/test-infra/.github/actions/calculate-docker-image@main
with:
docker-image-name: ${{ inputs.docker-image-name }}
- name: Pull docker image
uses: pytorch/test-infra/.github/actions/pull-docker-image@release/2.2
uses: pytorch/test-infra/.github/actions/pull-docker-image@main
with:
docker-image: ${{ steps.calculate-docker-image.outputs.docker-image }}
- name: Install nvidia driver, nvidia-docker runtime, set GPU_FLAG
uses: pytorch/test-infra/.github/actions/setup-nvidia@release/2.2
uses: pytorch/test-infra/.github/actions/setup-nvidia@main
if: ${{ inputs.cuda-version != 'cpu' }}
- name: Output disk space left
@ -121,8 +122,6 @@ jobs:
GITHUB_RUN_NUMBER: ${{ github.run_number }}
GITHUB_RUN_ATTEMPT: ${{ github.run_attempt }}
JOB_ID: ${{ steps.get-job-id.outputs.job-id }}
PYTORCH_RETRY_TEST_CASES: 1
PYTORCH_OVERRIDE_FLAKY_SIGNAL: 1
REENABLED_ISSUES: ${{ needs.filter.outputs.reenabled-issues }}
# TODO duplicated
AWS_DEFAULT_REGION: us-east-1
@ -159,8 +158,6 @@ jobs:
-e TORCH_CUDA_ARCH_LIST \
-e OUR_GITHUB_JOB_ID \
-e CUDA_VERSION \
-e PYTORCH_RETRY_TEST_CASES \
-e PYTORCH_OVERRIDE_FLAKY_SIGNAL \
--env-file="/tmp/github_env_${GITHUB_RUN_ID}" \
--security-opt seccomp=unconfined \
--cap-add=SYS_PTRACE \
@ -199,5 +196,5 @@ jobs:
file-suffix: bazel-${{ github.job }}_${{ steps.get-job-id.outputs.job-id }}
- name: Teardown Linux
uses: pytorch/test-infra/.github/actions/teardown-linux@release/2.2
uses: pytorch/test-infra/.github/actions/teardown-linux@main
if: always()

View File

@ -139,13 +139,13 @@ jobs:
run: env
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.github-token }}
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.2
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
with:
no-sudo: ${{ inputs.build_environment == 'linux-aarch64-binary-manywheel' }}
@ -173,6 +173,7 @@ jobs:
- name: Checkout PyTorch to pytorch dir
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -186,7 +187,7 @@ jobs:
- name: Checkout pytorch/builder to builder dir
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -212,7 +213,7 @@ jobs:
- name: Pull Docker image
if: ${{ steps.filter.outputs.is-test-matrix-empty == 'False' }}
uses: pytorch/test-infra/.github/actions/pull-docker-image@release/2.2
uses: pytorch/test-infra/.github/actions/pull-docker-image@main
with:
docker-image: ${{ inputs.DOCKER_IMAGE }}
@ -269,7 +270,7 @@ jobs:
- name: Teardown Linux
if: always()
uses: pytorch/test-infra/.github/actions/teardown-linux@release/2.2
uses: pytorch/test-infra/.github/actions/teardown-linux@main
- name: Chown workspace
if: always()

View File

@ -127,14 +127,14 @@ jobs:
} >> "${GITHUB_ENV} }}"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.github-token }}
# Setup the environment
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.2
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
with:
no-sudo: ${{ inputs.build_environment == 'linux-aarch64-binary-manywheel' }}
@ -155,6 +155,7 @@ jobs:
- name: Checkout PyTorch to pytorch dir
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
@ -167,7 +168,7 @@ jobs:
- name: Checkout pytorch/builder to builder dir
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -198,12 +199,12 @@ jobs:
path: "${{ runner.temp }}/artifacts/"
- name: Install nvidia driver, nvidia-docker runtime, set GPU_FLAG
uses: pytorch/test-infra/.github/actions/setup-nvidia@release/2.2
uses: pytorch/test-infra/.github/actions/setup-nvidia@main
if: ${{ inputs.GPU_ARCH_TYPE == 'cuda' && steps.filter.outputs.is-test-matrix-empty == 'False' }}
- name: Pull Docker image
if: ${{ steps.filter.outputs.is-test-matrix-empty == 'False' }}
uses: pytorch/test-infra/.github/actions/pull-docker-image@release/2.2
uses: pytorch/test-infra/.github/actions/pull-docker-image@main
with:
docker-image: ${{ inputs.DOCKER_IMAGE }}
@ -213,7 +214,7 @@ jobs:
- name: Teardown Linux
if: always()
uses: pytorch/test-infra/.github/actions/teardown-linux@release/2.2
uses: pytorch/test-infra/.github/actions/teardown-linux@main
- name: Chown workspace
if: always()

View File

@ -100,7 +100,7 @@ jobs:
SHA1: ${{ github.event.pull_request.head.sha || github.sha }}
steps:
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.2
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
with:
no-sudo: true

View File

@ -15,6 +15,7 @@ defaults:
jobs:
filter:
if: github.repository_owner == 'pytorch'
runs-on: [self-hosted, linux.large]
outputs:
test-matrix: ${{ steps.filter.outputs.test-matrix }}
@ -22,7 +23,7 @@ jobs:
keep-going: ${{ steps.filter.outputs.keep-going }}
steps:
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.2
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
with:
fetch-depth: 1
submodules: false
@ -43,7 +44,7 @@ jobs:
runs-on: ${{ matrix.runner }}
steps:
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.2
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
- name: Set up JDK 8
uses: actions/setup-java@v3
@ -52,7 +53,7 @@ jobs:
distribution: 'temurin'
- name: Setup miniconda
uses: pytorch/test-infra/.github/actions/setup-miniconda@release/2.2
uses: pytorch/test-infra/.github/actions/setup-miniconda@main
with:
python-version: 3.8
environment-file: .github/requirements/conda-env-${{ runner.os }}-${{ runner.arch }}

View File

@ -66,7 +66,7 @@ jobs:
name: build-docs-${{ matrix.docs_type }}-${{ inputs.push }}
steps:
- name: Setup SSH (Click me for login details)
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
instructions: |
@ -77,19 +77,19 @@ jobs:
# [see note: pytorch repo ref]
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.2
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
- name: Setup Linux
uses: ./.github/actions/setup-linux
- name: Calculate docker image
id: calculate-docker-image
uses: pytorch/test-infra/.github/actions/calculate-docker-image@release/2.2
uses: pytorch/test-infra/.github/actions/calculate-docker-image@main
with:
docker-image-name: ${{ inputs.docker-image }}
- name: Pull docker image
uses: pytorch/test-infra/.github/actions/pull-docker-image@release/2.2
uses: pytorch/test-infra/.github/actions/pull-docker-image@main
with:
docker-image: ${{ steps.calculate-docker-image.outputs.docker-image }}
@ -187,5 +187,5 @@ jobs:
s3-prefix: pytorch/pytorch/${{ github.event.pull_request.number }}/functorchdocs
- name: Teardown Linux
uses: pytorch/test-infra/.github/actions/teardown-linux@release/2.2
uses: pytorch/test-infra/.github/actions/teardown-linux@main
if: always()

View File

@ -38,6 +38,7 @@ env:
jobs:
filter:
if: github.repository_owner == 'pytorch'
runs-on: [self-hosted, linux.large]
outputs:
test-matrix: ${{ steps.filter.outputs.test-matrix }}
@ -45,7 +46,7 @@ jobs:
keep-going: ${{ steps.filter.outputs.keep-going }}
steps:
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.2
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
with:
fetch-depth: 1
submodules: false
@ -79,7 +80,7 @@ jobs:
steps:
# [see note: pytorch repo ref]
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.2
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
- name: Populate CI build options
shell: bash
@ -101,7 +102,7 @@ jobs:
brew install libtool
- name: Setup miniconda for iOS
uses: pytorch/test-infra/.github/actions/setup-miniconda@release/2.2
uses: pytorch/test-infra/.github/actions/setup-miniconda@main
with:
python-version: "3.9"
environment-file: .github/requirements/conda-env-iOS.txt

View File

@ -73,7 +73,7 @@ jobs:
test-matrix: ${{ steps.filter.outputs.test-matrix }}
steps:
- name: Setup SSH (Click me for login details)
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -82,14 +82,14 @@ jobs:
# checkout because when we run this action we don't *have* a local
# checkout. In other cases you should prefer a local checkout.
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.2
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
- name: Setup Linux
uses: ./.github/actions/setup-linux
- name: Calculate docker image
id: calculate-docker-image
uses: pytorch/test-infra/.github/actions/calculate-docker-image@release/2.2
uses: pytorch/test-infra/.github/actions/calculate-docker-image@main
with:
docker-image-name: ${{ inputs.docker-image-name }}
@ -103,7 +103,7 @@ jobs:
echo "docker pull ghcr.io/pytorch/ci-image:${tag/:/-}"
- name: Pull docker image
uses: pytorch/test-infra/.github/actions/pull-docker-image@release/2.2
uses: pytorch/test-infra/.github/actions/pull-docker-image@main
with:
docker-image: ${{ steps.calculate-docker-image.outputs.docker-image }}
@ -209,5 +209,5 @@ jobs:
path: sccache-stats-*.json
- name: Teardown Linux
uses: pytorch/test-infra/.github/actions/teardown-linux@release/2.2
uses: pytorch/test-infra/.github/actions/teardown-linux@main
if: always()

View File

@ -57,7 +57,7 @@ jobs:
timeout-minutes: ${{ matrix.mem_leak_check == 'mem_leak_check' && 600 || inputs.timeout-minutes }}
steps:
- name: Setup SSH (Click me for login details)
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
if: ${{ !contains(matrix.runner, 'gcp.a100') }}
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -66,14 +66,14 @@ jobs:
docker exec -it $(docker container ps --format '{{.ID}}') bash
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.2
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
- name: Setup Linux
uses: ./.github/actions/setup-linux
- name: Calculate docker image
id: calculate-docker-image
uses: pytorch/test-infra/.github/actions/calculate-docker-image@release/2.2
uses: pytorch/test-infra/.github/actions/calculate-docker-image@main
with:
docker-image-name: ${{ inputs.docker-image }}
@ -87,13 +87,13 @@ jobs:
echo "docker pull ghcr.io/pytorch/ci-image:${tag/:/-}"
- name: Pull docker image
uses: pytorch/test-infra/.github/actions/pull-docker-image@release/2.2
uses: pytorch/test-infra/.github/actions/pull-docker-image@main
with:
docker-image: ${{ steps.calculate-docker-image.outputs.docker-image }}
- name: Install nvidia driver, nvidia-docker runtime, set GPU_FLAG
id: install-nvidia-driver
uses: pytorch/test-infra/.github/actions/setup-nvidia@release/2.2
uses: pytorch/test-infra/.github/actions/setup-nvidia@main
if: contains(inputs.build-environment, 'cuda') && !contains(matrix.config, 'nogpu')
- name: Lock NVIDIA A100 40GB Frequency
@ -164,8 +164,6 @@ jobs:
BRANCH: ${{ steps.parse-ref.outputs.branch }}
SHA1: ${{ github.event.pull_request.head.sha || github.sha }}
BASE_SHA: ${{ github.event.pull_request.base.sha || github.sha }}
PYTORCH_RETRY_TEST_CASES: 1
PYTORCH_OVERRIDE_FLAKY_SIGNAL: 1
TEST_CONFIG: ${{ matrix.config }}
SHARD_NUMBER: ${{ matrix.shard }}
NUM_TEST_SHARDS: ${{ matrix.num_shards }}
@ -219,8 +217,6 @@ jobs:
-e NUM_TEST_SHARDS \
-e REENABLED_ISSUES \
-e CONTINUE_THROUGH_ERROR \
-e PYTORCH_RETRY_TEST_CASES \
-e PYTORCH_OVERRIDE_FLAKY_SIGNAL \
-e PR_LABELS \
-e MAX_JOBS="$(nproc --ignore=2)" \
-e SCCACHE_BUCKET \
@ -300,7 +296,7 @@ jobs:
path: ./**/core.[1-9]*
- name: Teardown Linux
uses: pytorch/test-infra/.github/actions/teardown-linux@release/2.2
uses: pytorch/test-infra/.github/actions/teardown-linux@main
if: always()
# NB: We are currently having an intermittent GPU-related issue on G5 runners with

View File

@ -71,11 +71,11 @@ jobs:
test-matrix: ${{ steps.filter.outputs.test-matrix }}
steps:
- name: Clean up disk space before running MacOS workflow
uses: pytorch/test-infra/.github/actions/check-disk-space@release/2.2
uses: pytorch/test-infra/.github/actions/check-disk-space@main
# [see note: pytorch repo ref]
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.2
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
- name: Set xcode version
env:
@ -87,7 +87,7 @@ jobs:
- name: Setup miniconda
if: inputs.environment-file == ''
uses: pytorch/test-infra/.github/actions/setup-miniconda@release/2.2
uses: pytorch/test-infra/.github/actions/setup-miniconda@main
with:
python-version: ${{ inputs.python-version }}
environment-file: .github/requirements/conda-env-${{ runner.os }}-${{ runner.arch }}
@ -97,7 +97,7 @@ jobs:
# environment even though the arch is x86-64
- name: Setup miniconda using the provided environment file
if: inputs.environment-file != ''
uses: pytorch/test-infra/.github/actions/setup-miniconda@release/2.2
uses: pytorch/test-infra/.github/actions/setup-miniconda@main
with:
python-version: ${{ inputs.python-version }}
environment-file: ${{ inputs.environment-file }}
@ -207,4 +207,4 @@ jobs:
- name: Clean up disk space
if: always()
continue-on-error: true
uses: pytorch/test-infra/.github/actions/check-disk-space@release/2.2
uses: pytorch/test-infra/.github/actions/check-disk-space@main

View File

@ -28,6 +28,7 @@ on:
jobs:
filter:
if: github.repository_owner == 'pytorch'
runs-on: [self-hosted, linux.large]
outputs:
test-matrix: ${{ steps.filter.outputs.test-matrix }}
@ -36,7 +37,7 @@ jobs:
reenabled-issues: ${{ steps.filter.outputs.reenabled-issues }}
steps:
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.2
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
with:
fetch-depth: 1
submodules: false
@ -80,7 +81,7 @@ jobs:
use-gha: true
- name: Setup miniconda
uses: pytorch/test-infra/.github/actions/setup-miniconda@release/2.2
uses: pytorch/test-infra/.github/actions/setup-miniconda@main
with:
python-version: ${{ inputs.python-version }}
environment-file: .github/requirements/conda-env-${{ runner.os }}-${{ runner.arch }}
@ -95,8 +96,6 @@ jobs:
ENV_NAME: conda-test-env-${{ github.run_id }}
PY_VERS: 3.9
PR_BODY: ${{ github.event.pull_request.body }}
PYTORCH_RETRY_TEST_CASES: 1
PYTORCH_OVERRIDE_FLAKY_SIGNAL: 1
CONTINUE_THROUGH_ERROR: ${{ needs.filter.outputs.keep-going }}
PIP_REQUIREMENTS_FILE: .github/requirements/pip-requirements-${{ runner.os }}.txt
REENABLED_ISSUES: ${{ needs.filter.outputs.reenabled-issues }}
@ -158,4 +157,4 @@ jobs:
- name: Clean up disk space
if: always()
continue-on-error: true
uses: pytorch/test-infra/.github/actions/check-disk-space@release/2.2
uses: pytorch/test-infra/.github/actions/check-disk-space@main

@ -57,8 +57,6 @@ jobs:
SHARD_NUMBER: ${{ matrix.shard }}
NUM_TEST_SHARDS: ${{ matrix.num_shards }}
PR_BODY: ${{ github.event.pull_request.body }}
PYTORCH_RETRY_TEST_CASES: 1
PYTORCH_OVERRIDE_FLAKY_SIGNAL: 1
steps:
- name: Clean up leftover processes on MacOS pet runner
continue-on-error: true
@ -79,11 +77,11 @@ jobs:
- name: Clean up disk space before running MacOS workflow
uses: pytorch/test-infra/.github/actions/check-disk-space@release/2.2
uses: pytorch/test-infra/.github/actions/check-disk-space@main
# [see note: pytorch repo ref]
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.2
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
- name: Download build artifacts
uses: ./.github/actions/download-build-artifacts
@ -92,7 +90,7 @@ jobs:
use-gha: true
- name: Setup miniconda
uses: pytorch/test-infra/.github/actions/setup-miniconda@release/2.2
uses: pytorch/test-infra/.github/actions/setup-miniconda@main
with:
python-version: ${{ inputs.python-version }}
environment-file: .github/requirements/conda-env-${{ runner.os }}-${{ runner.arch }}
@ -218,4 +216,4 @@ jobs:
- name: Clean up disk space
if: always()
continue-on-error: true
uses: pytorch/test-infra/.github/actions/check-disk-space@release/2.2
uses: pytorch/test-infra/.github/actions/check-disk-space@main

@ -54,7 +54,7 @@ jobs:
steps:
# [see note: pytorch repo ref]
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.2
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
with:
no-sudo: true
@ -63,12 +63,12 @@ jobs:
- name: Calculate docker image
id: calculate-docker-image
uses: pytorch/test-infra/.github/actions/calculate-docker-image@release/2.2
uses: pytorch/test-infra/.github/actions/calculate-docker-image@main
with:
docker-image-name: ${{ inputs.docker-image }}
- name: Pull docker image
uses: pytorch/test-infra/.github/actions/pull-docker-image@release/2.2
uses: pytorch/test-infra/.github/actions/pull-docker-image@main
with:
docker-image: ${{ steps.calculate-docker-image.outputs.docker-image }}
@ -131,8 +131,6 @@ jobs:
JOB_NAME: ${{ steps.get-job-id.outputs.job-name }}
BRANCH: ${{ steps.parse-ref.outputs.branch }}
SHA1: ${{ github.event.pull_request.head.sha || github.sha }}
PYTORCH_RETRY_TEST_CASES: 1
PYTORCH_OVERRIDE_FLAKY_SIGNAL: 1
CONTINUE_THROUGH_ERROR: ${{ steps.keep-going.outputs.keep-going }}
TEST_CONFIG: ${{ matrix.config }}
SHARD_NUMBER: ${{ matrix.shard }}
@ -180,8 +178,6 @@ jobs:
-e TEST_CONFIG \
-e NUM_TEST_SHARDS \
-e REENABLED_ISSUES \
-e PYTORCH_RETRY_TEST_CASES \
-e PYTORCH_OVERRIDE_FLAKY_SIGNAL \
-e CONTINUE_THROUGH_ERROR \
-e MAX_JOBS="$(nproc --ignore=2)" \
-e SCCACHE_BUCKET \

@ -15,6 +15,7 @@ defaults:
jobs:
filter:
if: github.repository_owner == 'pytorch'
runs-on: [self-hosted, linux.large]
outputs:
test-matrix: ${{ steps.filter.outputs.test-matrix }}
@ -22,7 +23,7 @@ jobs:
keep-going: ${{ steps.filter.outputs.keep-going }}
steps:
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.2
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
with:
fetch-depth: 1
submodules: false
@ -53,10 +54,10 @@ jobs:
SUPPORT_ABI: '${{ matrix.support_abi }}'
steps:
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.2
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
- name: Setup miniconda
uses: pytorch/test-infra/.github/actions/setup-miniconda@release/2.2
uses: pytorch/test-infra/.github/actions/setup-miniconda@main
with:
python-version: 3.8
environment-file: .github/requirements/conda-env-${{ runner.os }}-${{ runner.arch }}.txt

@ -60,10 +60,10 @@ jobs:
git config --global core.fsmonitor false
- name: Clean up leftover processes on non-ephemeral Windows runner
uses: pytorch/test-infra/.github/actions/cleanup-runner@release/2.2
uses: pytorch/test-infra/.github/actions/cleanup-runner@main
- name: Setup SSH (Click me for login details)
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
instructions: |
@ -78,7 +78,7 @@ jobs:
# [see note: pytorch repo ref]
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.2
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
with:
no-sudo: true

@ -54,10 +54,10 @@ jobs:
git config --global core.fsmonitor false
- name: Clean up leftover processes on non-ephemeral Windows runner
uses: pytorch/test-infra/.github/actions/cleanup-runner@release/2.2
uses: pytorch/test-infra/.github/actions/cleanup-runner@main
- name: Setup SSH (Click me for login details)
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
instructions: |
@ -73,7 +73,7 @@ jobs:
# [see note: pytorch repo ref]
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.2
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
with:
no-sudo: true
@ -139,8 +139,6 @@ jobs:
USE_CUDA: ${{ inputs.cuda-version != 'cpu' && '1' || '0' }}
INSTALL_WINDOWS_SDK: 1
PYTHON_VERSION: 3.8
PYTORCH_RETRY_TEST_CASES: 1
PYTORCH_OVERRIDE_FLAKY_SIGNAL: 1
CONTINUE_THROUGH_ERROR: ${{ steps.keep-going.outputs.keep-going }}
VC_PRODUCT: "BuildTools"
VC_VERSION: ""

@ -3,7 +3,7 @@ name: Build Triton wheels
on:
push:
branches:
- release/2.2
- main
tags:
# NOTE: Binary build pipelines should only get triggered on release candidate builds
# Release candidate tags look like: v1.11.0-rc1
@ -47,12 +47,12 @@ jobs:
BUILD_DEVICE: ${{ matrix.device }}
steps:
- name: Setup SSH (Click me for login details)
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.2
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
with:
submodules: false
@ -60,7 +60,7 @@ jobs:
uses: ./.github/actions/setup-linux
- name: Pull Docker image
uses: pytorch/test-infra/.github/actions/pull-docker-image@release/2.2
uses: pytorch/test-infra/.github/actions/pull-docker-image@main
with:
docker-image: ${{ env.DOCKER_IMAGE }}
@ -125,7 +125,7 @@ jobs:
path: ${{ runner.temp }}/artifacts/*
- name: Teardown Linux
uses: pytorch/test-infra/.github/actions/teardown-linux@release/2.2
uses: pytorch/test-infra/.github/actions/teardown-linux@main
if: always()
upload-wheel:
@ -182,19 +182,19 @@ jobs:
strategy:
fail-fast: false
matrix:
py_vers: [ "3.8", "3.9", "3.10", "3.11" ]
py_vers: [ "3.8", "3.9", "3.10", "3.11", "3.12" ]
timeout-minutes: 40
env:
DOCKER_IMAGE: pytorch/conda-builder:cpu
PY_VERS: ${{ matrix.py_vers }}
steps:
- name: Setup SSH (Click me for login details)
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.2
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
with:
submodules: false
@ -202,7 +202,7 @@ jobs:
uses: ./.github/actions/setup-linux
- name: Pull Docker image
uses: pytorch/test-infra/.github/actions/pull-docker-image@release/2.2
uses: pytorch/test-infra/.github/actions/pull-docker-image@main
with:
docker-image: ${{ env.DOCKER_IMAGE }}
@ -238,7 +238,7 @@ jobs:
path: ${{ runner.temp }}/artifacts/*
- name: Teardown Linux
uses: pytorch/test-infra/.github/actions/teardown-linux@release/2.2
uses: pytorch/test-infra/.github/actions/teardown-linux@main
if: always()
upload-conda:

@ -29,7 +29,7 @@ jobs:
runs-on: linux.20_04.4x
steps:
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.2
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
with:
submodules: false
fetch-depth: 1

@ -10,7 +10,7 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.2
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
- name: Run close_nonexistent_disable_issues.py
env:

@ -65,21 +65,21 @@ jobs:
# [see note: pytorch repo ref]
# deep clone (fetch-depth 0) required for git merge-base
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.2
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
- name: Setup Linux
uses: ./.github/actions/setup-linux
- name: Build docker image
id: build-docker-image
uses: pytorch/test-infra/.github/actions/calculate-docker-image@release/2.2
uses: pytorch/test-infra/.github/actions/calculate-docker-image@main
with:
docker-image-name: ${{ matrix.docker-image-name }}
always-rebuild: true
push: true
- name: Pull docker image
uses: pytorch/test-infra/.github/actions/pull-docker-image@release/2.2
uses: pytorch/test-infra/.github/actions/pull-docker-image@main
with:
docker-image: ${{ steps.build-docker-image.outputs.docker-image }}
@ -109,5 +109,5 @@ jobs:
if: always()
- name: Teardown Linux
uses: pytorch/test-infra/.github/actions/teardown-linux@release/2.2
uses: pytorch/test-infra/.github/actions/teardown-linux@main
if: always()

@ -11,8 +11,8 @@ on:
branches:
- nightly
tags:
# We want to run this build on final release tag
- v[0-9]+.[0-9]+.[0-9]+
# Release candidate tags look like: v1.11.0-rc1
- v[0-9]+.[0-9]+.[0-9]+-rc[0-9]+
- ciflow/nightly/*
concurrency:
@ -26,45 +26,28 @@ env:
DOCKER_REGISTRY: ghcr.io
NO_BUILD_SUFFIX: true
USE_BUILDX: 1
WITH_PUSH: ${{ github.event_name == 'push' && (github.event.ref == 'refs/heads/nightly' || (startsWith(github.event.ref, 'refs/tags/') && !startsWith(github.event.ref, 'refs/tags/ciflow/'))) }}
WITH_PUSH: ${{ github.event_name == 'push' && (github.event.ref == 'refs/heads/nightly' || startsWith(github.event.ref, 'refs/tags/v')) }}
jobs:
generate-matrix:
if: github.repository_owner == 'pytorch'
runs-on: [self-hosted, linux.large]
outputs:
matrix: ${{ steps.generate-matrix.outputs.matrix }}
steps:
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
with:
fetch-depth: 1
submodules: true
- name: Get docker release matrix
id: generate-matrix
run: |
MATRIX_BLOB="$(python3 .github/scripts/generate_docker_release_matrix.py)"
echo "${MATRIX_BLOB}"
echo "matrix=${MATRIX_BLOB}" >> "${GITHUB_OUTPUT}"
build:
if: ${{ github.repository == 'pytorch/pytorch' }}
runs-on: [self-hosted, linux.2xlarge]
environment: ${{ (github.ref == 'refs/heads/main' || startsWith(github.event.ref, 'refs/tags/v')) && 'docker-build' || '' }}
environment: ${{ (github.ref == 'refs/heads/nightly' || startsWith(github.event.ref, 'refs/tags/v')) && 'docker-build' || '' }}
timeout-minutes: 240
needs: generate-matrix
strategy:
matrix: ${{ fromJson(needs.generate-matrix.outputs.matrix) }}
fail-fast: false
matrix:
include:
# nvidia specific images don't exist for arm64 so only build the runtime image
- image_type: runtime
platform: linux/arm64,linux/amd64
- image_type: devel
platform: linux/amd64
env:
BUILD_IMAGE_TYPE: ${{ matrix.image_type }}
BUILD_PLATFORMS: ${{ matrix.platform }}
CUDA_VERSION: ${{ matrix.cuda_full_version }}
CUDA_VERSION_SHORT: ${{ matrix.cuda }}
CUDNN_VERSION: ${{ matrix.cudnn_version }}
steps:
- name: Setup SSH (Click me for login details)
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
# [see note: pytorch repo ref]
@ -114,11 +97,10 @@ jobs:
- name: Push nightly tags
if: ${{ github.event.ref == 'refs/heads/nightly' && matrix.image_type == 'runtime' }}
run: |
PYTORCH_DOCKER_TAG="${PYTORCH_VERSION}-cuda$(CUDA_VERSION_SHORT)-cudnn$(CUDNN_VERSION)-runtime"
PYTORCH_DOCKER_TAG="${PYTORCH_VERSION}-runtime"
CUDA_VERSION=$(python3 -c "import re;print(re.search('CUDA_VERSION\s+=\s+([0-9\.]+)',open('docker.Makefile').read())[1],end='')")
PYTORCH_NIGHTLY_COMMIT=$(docker run ghcr.io/pytorch/pytorch-nightly:"${PYTORCH_DOCKER_TAG}" \
python -c 'import torch; print(torch.version.git_version[:7],end="")')
docker tag ghcr.io/pytorch/pytorch-nightly:"${PYTORCH_DOCKER_TAG}" \
ghcr.io/pytorch/pytorch-nightly:"${PYTORCH_NIGHTLY_COMMIT}-cu${CUDA_VERSION}"
docker push ghcr.io/pytorch/pytorch-nightly:"${PYTORCH_NIGHTLY_COMMIT}-cu${CUDA_VERSION}"
@ -127,5 +109,5 @@ jobs:
ghcr.io/pytorch/pytorch-nightly:latest
docker push ghcr.io/pytorch/pytorch-nightly:latest
- name: Teardown Linux
uses: pytorch/test-infra/.github/actions/teardown-linux@release/2.2
uses: pytorch/test-infra/.github/actions/teardown-linux@main
if: always()

@ -47,7 +47,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu-aarch64
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-2.2
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-main
DESIRED_PYTHON: "3.8"
runs_on: linux.arm64.2xlarge
ALPINE_IMAGE: "arm64v8/alpine"
@ -68,7 +68,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu-aarch64
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-2.2
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-main
DESIRED_PYTHON: "3.8"
build_name: manywheel-py3_8-cpu-aarch64
build_environment: linux-aarch64-binary-manywheel
@ -87,7 +87,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu-aarch64
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-2.2
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-main
DESIRED_PYTHON: "3.8"
build_name: manywheel-py3_8-cpu-aarch64
secrets:
@ -109,7 +109,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu-aarch64
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-2.2
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-main
DESIRED_PYTHON: "3.9"
runs_on: linux.arm64.2xlarge
ALPINE_IMAGE: "arm64v8/alpine"
@ -130,7 +130,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu-aarch64
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-2.2
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-main
DESIRED_PYTHON: "3.9"
build_name: manywheel-py3_9-cpu-aarch64
build_environment: linux-aarch64-binary-manywheel
@ -149,7 +149,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu-aarch64
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-2.2
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-main
DESIRED_PYTHON: "3.9"
build_name: manywheel-py3_9-cpu-aarch64
secrets:
@ -171,7 +171,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu-aarch64
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-2.2
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-main
DESIRED_PYTHON: "3.10"
runs_on: linux.arm64.2xlarge
ALPINE_IMAGE: "arm64v8/alpine"
@ -192,7 +192,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu-aarch64
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-2.2
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-main
DESIRED_PYTHON: "3.10"
build_name: manywheel-py3_10-cpu-aarch64
build_environment: linux-aarch64-binary-manywheel
@ -211,7 +211,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu-aarch64
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-2.2
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-main
DESIRED_PYTHON: "3.10"
build_name: manywheel-py3_10-cpu-aarch64
secrets:
@ -233,7 +233,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu-aarch64
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-2.2
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-main
DESIRED_PYTHON: "3.11"
runs_on: linux.arm64.2xlarge
ALPINE_IMAGE: "arm64v8/alpine"
@ -254,7 +254,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu-aarch64
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-2.2
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-main
DESIRED_PYTHON: "3.11"
build_name: manywheel-py3_11-cpu-aarch64
build_environment: linux-aarch64-binary-manywheel
@ -273,7 +273,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu-aarch64
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-2.2
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-main
DESIRED_PYTHON: "3.11"
build_name: manywheel-py3_11-cpu-aarch64
secrets:
@ -295,7 +295,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu-aarch64
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-2.2
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-main
DESIRED_PYTHON: "3.12"
runs_on: linux.arm64.2xlarge
ALPINE_IMAGE: "arm64v8/alpine"
@ -316,7 +316,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu-aarch64
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-2.2
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-main
DESIRED_PYTHON: "3.12"
build_name: manywheel-py3_12-cpu-aarch64
build_environment: linux-aarch64-binary-manywheel
@ -335,7 +335,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu-aarch64
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-2.2
DOCKER_IMAGE: pytorch/manylinuxaarch64-builder:cpu-aarch64-main
DESIRED_PYTHON: "3.12"
build_name: manywheel-py3_12-cpu-aarch64
secrets:

@ -47,7 +47,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/conda-builder:cpu-2.2
DOCKER_IMAGE: pytorch/conda-builder:cpu-main
DESIRED_PYTHON: "3.8"
build_name: conda-py3_8-cpu
build_environment: linux-binary-conda
@ -65,7 +65,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/conda-builder:cpu-2.2
DOCKER_IMAGE: pytorch/conda-builder:cpu-main
DESIRED_PYTHON: "3.8"
build_name: conda-py3_8-cpu
build_environment: linux-binary-conda
@ -83,7 +83,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/conda-builder:cpu-2.2
DOCKER_IMAGE: pytorch/conda-builder:cpu-main
DESIRED_PYTHON: "3.8"
build_name: conda-py3_8-cpu
secrets:
@ -106,7 +106,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-2.2
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-main
DESIRED_PYTHON: "3.8"
runs_on: linux.24xlarge
build_name: conda-py3_8-cuda11_8
@ -126,7 +126,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-2.2
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-main
DESIRED_PYTHON: "3.8"
build_name: conda-py3_8-cuda11_8
build_environment: linux-binary-conda
@ -145,7 +145,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-2.2
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-main
DESIRED_PYTHON: "3.8"
build_name: conda-py3_8-cuda11_8
secrets:
@ -168,7 +168,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-2.2
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-main
DESIRED_PYTHON: "3.8"
runs_on: linux.24xlarge
build_name: conda-py3_8-cuda12_1
@ -188,7 +188,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-2.2
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-main
DESIRED_PYTHON: "3.8"
build_name: conda-py3_8-cuda12_1
build_environment: linux-binary-conda
@ -207,7 +207,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-2.2
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-main
DESIRED_PYTHON: "3.8"
build_name: conda-py3_8-cuda12_1
secrets:
@ -229,7 +229,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/conda-builder:cpu-2.2
DOCKER_IMAGE: pytorch/conda-builder:cpu-main
DESIRED_PYTHON: "3.9"
build_name: conda-py3_9-cpu
build_environment: linux-binary-conda
@ -247,7 +247,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/conda-builder:cpu-2.2
DOCKER_IMAGE: pytorch/conda-builder:cpu-main
DESIRED_PYTHON: "3.9"
build_name: conda-py3_9-cpu
build_environment: linux-binary-conda
@ -265,7 +265,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/conda-builder:cpu-2.2
DOCKER_IMAGE: pytorch/conda-builder:cpu-main
DESIRED_PYTHON: "3.9"
build_name: conda-py3_9-cpu
secrets:
@ -288,7 +288,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-2.2
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-main
DESIRED_PYTHON: "3.9"
runs_on: linux.24xlarge
build_name: conda-py3_9-cuda11_8
@ -308,7 +308,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-2.2
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-main
DESIRED_PYTHON: "3.9"
build_name: conda-py3_9-cuda11_8
build_environment: linux-binary-conda
@ -327,7 +327,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-2.2
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-main
DESIRED_PYTHON: "3.9"
build_name: conda-py3_9-cuda11_8
secrets:
@ -350,7 +350,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-2.2
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-main
DESIRED_PYTHON: "3.9"
runs_on: linux.24xlarge
build_name: conda-py3_9-cuda12_1
@ -370,7 +370,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-2.2
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-main
DESIRED_PYTHON: "3.9"
build_name: conda-py3_9-cuda12_1
build_environment: linux-binary-conda
@ -389,7 +389,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-2.2
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-main
DESIRED_PYTHON: "3.9"
build_name: conda-py3_9-cuda12_1
secrets:
@ -411,7 +411,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/conda-builder:cpu-2.2
DOCKER_IMAGE: pytorch/conda-builder:cpu-main
DESIRED_PYTHON: "3.10"
build_name: conda-py3_10-cpu
build_environment: linux-binary-conda
@ -429,7 +429,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/conda-builder:cpu-2.2
DOCKER_IMAGE: pytorch/conda-builder:cpu-main
DESIRED_PYTHON: "3.10"
build_name: conda-py3_10-cpu
build_environment: linux-binary-conda
@ -447,7 +447,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/conda-builder:cpu-2.2
DOCKER_IMAGE: pytorch/conda-builder:cpu-main
DESIRED_PYTHON: "3.10"
build_name: conda-py3_10-cpu
secrets:
@ -470,7 +470,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-2.2
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-main
DESIRED_PYTHON: "3.10"
runs_on: linux.24xlarge
build_name: conda-py3_10-cuda11_8
@ -490,7 +490,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-2.2
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-main
DESIRED_PYTHON: "3.10"
build_name: conda-py3_10-cuda11_8
build_environment: linux-binary-conda
@ -509,7 +509,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-2.2
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-main
DESIRED_PYTHON: "3.10"
build_name: conda-py3_10-cuda11_8
secrets:
@ -532,7 +532,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-2.2
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-main
DESIRED_PYTHON: "3.10"
runs_on: linux.24xlarge
build_name: conda-py3_10-cuda12_1
@ -552,7 +552,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-2.2
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-main
DESIRED_PYTHON: "3.10"
build_name: conda-py3_10-cuda12_1
build_environment: linux-binary-conda
@ -571,7 +571,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-2.2
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-main
DESIRED_PYTHON: "3.10"
build_name: conda-py3_10-cuda12_1
secrets:
@ -593,7 +593,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/conda-builder:cpu-2.2
DOCKER_IMAGE: pytorch/conda-builder:cpu-main
DESIRED_PYTHON: "3.11"
build_name: conda-py3_11-cpu
build_environment: linux-binary-conda
@ -611,7 +611,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/conda-builder:cpu-2.2
DOCKER_IMAGE: pytorch/conda-builder:cpu-main
DESIRED_PYTHON: "3.11"
build_name: conda-py3_11-cpu
build_environment: linux-binary-conda
@ -629,7 +629,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/conda-builder:cpu-2.2
DOCKER_IMAGE: pytorch/conda-builder:cpu-main
DESIRED_PYTHON: "3.11"
build_name: conda-py3_11-cpu
secrets:
@ -652,7 +652,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-2.2
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-main
DESIRED_PYTHON: "3.11"
runs_on: linux.24xlarge
build_name: conda-py3_11-cuda11_8
@ -672,7 +672,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-2.2
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-main
DESIRED_PYTHON: "3.11"
build_name: conda-py3_11-cuda11_8
build_environment: linux-binary-conda
@ -691,7 +691,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-2.2
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-main
DESIRED_PYTHON: "3.11"
build_name: conda-py3_11-cuda11_8
secrets:
@ -714,7 +714,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-2.2
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-main
DESIRED_PYTHON: "3.11"
runs_on: linux.24xlarge
build_name: conda-py3_11-cuda12_1
@ -734,7 +734,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-2.2
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-main
DESIRED_PYTHON: "3.11"
build_name: conda-py3_11-cuda12_1
build_environment: linux-binary-conda
@ -753,7 +753,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-2.2
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-main
DESIRED_PYTHON: "3.11"
build_name: conda-py3_11-cuda12_1
secrets:
@ -775,7 +775,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/conda-builder:cpu-2.2
DOCKER_IMAGE: pytorch/conda-builder:cpu-main
DESIRED_PYTHON: "3.12"
build_name: conda-py3_12-cpu
build_environment: linux-binary-conda
@ -793,7 +793,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/conda-builder:cpu-2.2
DOCKER_IMAGE: pytorch/conda-builder:cpu-main
DESIRED_PYTHON: "3.12"
build_name: conda-py3_12-cpu
build_environment: linux-binary-conda
@ -811,7 +811,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/conda-builder:cpu-2.2
DOCKER_IMAGE: pytorch/conda-builder:cpu-main
DESIRED_PYTHON: "3.12"
build_name: conda-py3_12-cpu
secrets:
@ -834,7 +834,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-2.2
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-main
DESIRED_PYTHON: "3.12"
runs_on: linux.24xlarge
build_name: conda-py3_12-cuda11_8
@ -854,7 +854,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-2.2
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-main
DESIRED_PYTHON: "3.12"
build_name: conda-py3_12-cuda11_8
build_environment: linux-binary-conda
@ -873,7 +873,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-2.2
DOCKER_IMAGE: pytorch/conda-builder:cuda11.8-main
DESIRED_PYTHON: "3.12"
build_name: conda-py3_12-cuda11_8
secrets:
@ -896,7 +896,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-2.2
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-main
DESIRED_PYTHON: "3.12"
runs_on: linux.24xlarge
build_name: conda-py3_12-cuda12_1
@ -916,7 +916,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-2.2
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-main
DESIRED_PYTHON: "3.12"
build_name: conda-py3_12-cuda12_1
build_environment: linux-binary-conda
@ -935,7 +935,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-2.2
DOCKER_IMAGE: pytorch/conda-builder:cuda12.1-main
DESIRED_PYTHON: "3.12"
build_name: conda-py3_12-cuda12_1
secrets:

@ -42,7 +42,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cpu-2.2
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cpu-main
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: cxx11-abi
build_name: libtorch-cpu-shared-with-deps-cxx11-abi
@ -61,7 +61,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cpu-2.2
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cpu-main
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: cxx11-abi
build_name: libtorch-cpu-shared-with-deps-cxx11-abi

@ -47,7 +47,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cpu-2.2
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cpu-main
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: cxx11-abi
build_name: libtorch-cpu-shared-with-deps-cxx11-abi
@ -66,7 +66,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cpu-2.2
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cpu-main
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: cxx11-abi
build_name: libtorch-cpu-shared-with-deps-cxx11-abi
@ -85,7 +85,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cpu-2.2
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cpu-main
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: cxx11-abi
build_name: libtorch-cpu-shared-with-deps-cxx11-abi
@ -109,7 +109,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cuda11.8-2.2
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cuda11.8-main
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: cxx11-abi
build_name: libtorch-cuda11_8-shared-with-deps-cxx11-abi
@ -129,7 +129,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cuda11.8-2.2
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cuda11.8-main
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: cxx11-abi
build_name: libtorch-cuda11_8-shared-with-deps-cxx11-abi
@ -149,7 +149,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cuda11.8-2.2
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cuda11.8-main
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: cxx11-abi
build_name: libtorch-cuda11_8-shared-with-deps-cxx11-abi
@ -173,7 +173,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cuda12.1-2.2
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cuda12.1-main
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: cxx11-abi
build_name: libtorch-cuda12_1-shared-with-deps-cxx11-abi
@ -193,7 +193,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cuda12.1-2.2
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cuda12.1-main
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: cxx11-abi
build_name: libtorch-cuda12_1-shared-with-deps-cxx11-abi
@ -213,7 +213,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cuda12.1-2.2
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cuda12.1-main
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: cxx11-abi
build_name: libtorch-cuda12_1-shared-with-deps-cxx11-abi
@ -237,7 +237,7 @@ jobs:
DESIRED_CUDA: rocm5.6
GPU_ARCH_VERSION: 5.6
GPU_ARCH_TYPE: rocm
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:rocm5.6-2.2
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:rocm5.6-main
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: cxx11-abi
build_name: libtorch-rocm5_6-shared-with-deps-cxx11-abi
@ -259,7 +259,7 @@ jobs:
GPU_ARCH_VERSION: 5.6
GPU_ARCH_TYPE: rocm
SKIP_ALL_TESTS: 1
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:rocm5.6-2.2
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:rocm5.6-main
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: cxx11-abi
steps:
@ -273,6 +273,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -284,7 +285,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -298,9 +299,9 @@ jobs:
run: |
echo "GPU_FLAG=--device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon" >> "${GITHUB_ENV}"
- name: Pull Docker image
uses: pytorch/test-infra/.github/actions/pull-docker-image@release/2.2
uses: pytorch/test-infra/.github/actions/pull-docker-image@main
with:
docker-image: pytorch/libtorch-cxx11-builder:rocm5.6-2.2
docker-image: pytorch/libtorch-cxx11-builder:rocm5.6-main
- name: Test Pytorch binary
uses: ./pytorch/.github/actions/test-pytorch-binary
- name: Teardown ROCm
@ -317,7 +318,7 @@ jobs:
DESIRED_CUDA: rocm5.6
GPU_ARCH_VERSION: 5.6
GPU_ARCH_TYPE: rocm
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:rocm5.6-2.2
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:rocm5.6-main
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: cxx11-abi
build_name: libtorch-rocm5_6-shared-with-deps-cxx11-abi
@ -341,7 +342,7 @@ jobs:
DESIRED_CUDA: rocm5.7
GPU_ARCH_VERSION: 5.7
GPU_ARCH_TYPE: rocm
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:rocm5.7-2.2
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:rocm5.7-main
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: cxx11-abi
build_name: libtorch-rocm5_7-shared-with-deps-cxx11-abi
@ -363,7 +364,7 @@ jobs:
GPU_ARCH_VERSION: 5.7
GPU_ARCH_TYPE: rocm
SKIP_ALL_TESTS: 1
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:rocm5.7-2.2
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:rocm5.7-main
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: cxx11-abi
steps:
@ -377,6 +378,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -388,7 +390,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -402,9 +404,9 @@ jobs:
run: |
echo "GPU_FLAG=--device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon" >> "${GITHUB_ENV}"
- name: Pull Docker image
uses: pytorch/test-infra/.github/actions/pull-docker-image@release/2.2
uses: pytorch/test-infra/.github/actions/pull-docker-image@main
with:
docker-image: pytorch/libtorch-cxx11-builder:rocm5.7-2.2
docker-image: pytorch/libtorch-cxx11-builder:rocm5.7-main
- name: Test Pytorch binary
uses: ./pytorch/.github/actions/test-pytorch-binary
- name: Teardown ROCm
@ -421,7 +423,7 @@ jobs:
DESIRED_CUDA: rocm5.7
GPU_ARCH_VERSION: 5.7
GPU_ARCH_TYPE: rocm
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:rocm5.7-2.2
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:rocm5.7-main
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: cxx11-abi
build_name: libtorch-rocm5_7-shared-with-deps-cxx11-abi

@ -42,7 +42,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-main
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: pre-cxx11
build_name: libtorch-cpu-shared-with-deps-pre-cxx11
@ -61,7 +61,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-main
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: pre-cxx11
build_name: libtorch-cpu-shared-with-deps-pre-cxx11

@ -47,7 +47,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-main
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: pre-cxx11
build_name: libtorch-cpu-shared-with-deps-pre-cxx11
@ -66,7 +66,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-main
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: pre-cxx11
build_name: libtorch-cpu-shared-with-deps-pre-cxx11
@ -85,7 +85,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-main
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: pre-cxx11
build_name: libtorch-cpu-shared-with-deps-pre-cxx11
@ -109,7 +109,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-main
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: pre-cxx11
build_name: libtorch-cuda11_8-shared-with-deps-pre-cxx11
@ -129,7 +129,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-main
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: pre-cxx11
build_name: libtorch-cuda11_8-shared-with-deps-pre-cxx11
@ -149,7 +149,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-main
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: pre-cxx11
build_name: libtorch-cuda11_8-shared-with-deps-pre-cxx11
@ -173,7 +173,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-main
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: pre-cxx11
build_name: libtorch-cuda12_1-shared-with-deps-pre-cxx11
@ -193,7 +193,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-main
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: pre-cxx11
build_name: libtorch-cuda12_1-shared-with-deps-pre-cxx11
@ -213,7 +213,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-main
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: pre-cxx11
build_name: libtorch-cuda12_1-shared-with-deps-pre-cxx11
@ -237,7 +237,7 @@ jobs:
DESIRED_CUDA: rocm5.6
GPU_ARCH_VERSION: 5.6
GPU_ARCH_TYPE: rocm
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.6-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.6-main
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: pre-cxx11
build_name: libtorch-rocm5_6-shared-with-deps-pre-cxx11
@ -259,7 +259,7 @@ jobs:
GPU_ARCH_VERSION: 5.6
GPU_ARCH_TYPE: rocm
SKIP_ALL_TESTS: 1
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.6-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.6-main
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: pre-cxx11
steps:
@ -273,6 +273,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -284,7 +285,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -298,9 +299,9 @@ jobs:
run: |
echo "GPU_FLAG=--device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon" >> "${GITHUB_ENV}"
- name: Pull Docker image
uses: pytorch/test-infra/.github/actions/pull-docker-image@release/2.2
uses: pytorch/test-infra/.github/actions/pull-docker-image@main
with:
docker-image: pytorch/manylinux-builder:rocm5.6-2.2
docker-image: pytorch/manylinux-builder:rocm5.6-main
- name: Test Pytorch binary
uses: ./pytorch/.github/actions/test-pytorch-binary
- name: Teardown ROCm
@ -317,7 +318,7 @@ jobs:
DESIRED_CUDA: rocm5.6
GPU_ARCH_VERSION: 5.6
GPU_ARCH_TYPE: rocm
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.6-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.6-main
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: pre-cxx11
build_name: libtorch-rocm5_6-shared-with-deps-pre-cxx11
@ -341,7 +342,7 @@ jobs:
DESIRED_CUDA: rocm5.7
GPU_ARCH_VERSION: 5.7
GPU_ARCH_TYPE: rocm
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.7-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.7-main
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: pre-cxx11
build_name: libtorch-rocm5_7-shared-with-deps-pre-cxx11
@ -363,7 +364,7 @@ jobs:
GPU_ARCH_VERSION: 5.7
GPU_ARCH_TYPE: rocm
SKIP_ALL_TESTS: 1
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.7-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.7-main
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: pre-cxx11
steps:
@ -377,6 +378,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -388,7 +390,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -402,9 +404,9 @@ jobs:
run: |
echo "GPU_FLAG=--device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon" >> "${GITHUB_ENV}"
- name: Pull Docker image
uses: pytorch/test-infra/.github/actions/pull-docker-image@release/2.2
uses: pytorch/test-infra/.github/actions/pull-docker-image@main
with:
docker-image: pytorch/manylinux-builder:rocm5.7-2.2
docker-image: pytorch/manylinux-builder:rocm5.7-main
- name: Test Pytorch binary
uses: ./pytorch/.github/actions/test-pytorch-binary
- name: Teardown ROCm
@ -421,7 +423,7 @@ jobs:
DESIRED_CUDA: rocm5.7
GPU_ARCH_VERSION: 5.7
GPU_ARCH_TYPE: rocm
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.7-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.7-main
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: pre-cxx11
build_name: libtorch-rocm5_7-shared-with-deps-pre-cxx11

@ -43,7 +43,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-main
DESIRED_PYTHON: "3.8"
build_name: manywheel-py3_8-cuda11_8
build_environment: linux-binary-manywheel
@ -63,7 +63,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-main
DESIRED_PYTHON: "3.8"
build_name: manywheel-py3_8-cuda11_8
build_environment: linux-binary-manywheel
@ -83,7 +83,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-main
DESIRED_PYTHON: "3.8"
build_name: manywheel-py3_8-cuda12_1
build_environment: linux-binary-manywheel
@ -103,7 +103,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-main
DESIRED_PYTHON: "3.8"
build_name: manywheel-py3_8-cuda12_1
build_environment: linux-binary-manywheel

@ -47,7 +47,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-main
DESIRED_PYTHON: "3.8"
build_name: manywheel-py3_8-cpu
build_environment: linux-binary-manywheel
@ -65,7 +65,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-main
DESIRED_PYTHON: "3.8"
build_name: manywheel-py3_8-cpu
build_environment: linux-binary-manywheel
@ -83,7 +83,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-main
DESIRED_PYTHON: "3.8"
build_name: manywheel-py3_8-cpu
secrets:
@ -105,7 +105,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu-cxx11-abi
GPU_ARCH_TYPE: cpu-cxx11-abi
DOCKER_IMAGE: pytorch/manylinuxcxx11-abi-builder:cpu-cxx11-abi-2.2
DOCKER_IMAGE: pytorch/manylinuxcxx11-abi-builder:cpu-cxx11-abi-main
DESIRED_DEVTOOLSET: cxx11-abi
DESIRED_PYTHON: "3.8"
build_name: manywheel-py3_8-cpu-cxx11-abi
@ -124,7 +124,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu-cxx11-abi
GPU_ARCH_TYPE: cpu-cxx11-abi
DOCKER_IMAGE: pytorch/manylinuxcxx11-abi-builder:cpu-cxx11-abi-2.2
DOCKER_IMAGE: pytorch/manylinuxcxx11-abi-builder:cpu-cxx11-abi-main
DESIRED_DEVTOOLSET: cxx11-abi
DESIRED_PYTHON: "3.8"
build_name: manywheel-py3_8-cpu-cxx11-abi
@ -143,7 +143,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu-cxx11-abi
GPU_ARCH_TYPE: cpu-cxx11-abi
DOCKER_IMAGE: pytorch/manylinuxcxx11-abi-builder:cpu-cxx11-abi-2.2
DOCKER_IMAGE: pytorch/manylinuxcxx11-abi-builder:cpu-cxx11-abi-main
DESIRED_DEVTOOLSET: cxx11-abi
DESIRED_PYTHON: "3.8"
build_name: manywheel-py3_8-cpu-cxx11-abi
@ -167,7 +167,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-main
DESIRED_PYTHON: "3.8"
build_name: manywheel-py3_8-cuda11_8
build_environment: linux-binary-manywheel
@ -187,7 +187,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-main
DESIRED_PYTHON: "3.8"
build_name: manywheel-py3_8-cuda11_8
build_environment: linux-binary-manywheel
@ -206,7 +206,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-main
DESIRED_PYTHON: "3.8"
build_name: manywheel-py3_8-cuda11_8
secrets:
@ -229,7 +229,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-main
DESIRED_PYTHON: "3.8"
build_name: manywheel-py3_8-cuda12_1
build_environment: linux-binary-manywheel
@ -249,7 +249,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-main
DESIRED_PYTHON: "3.8"
build_name: manywheel-py3_8-cuda12_1
build_environment: linux-binary-manywheel
@ -268,7 +268,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-main
DESIRED_PYTHON: "3.8"
build_name: manywheel-py3_8-cuda12_1
secrets:
@ -291,7 +291,7 @@ jobs:
DESIRED_CUDA: rocm5.6
GPU_ARCH_VERSION: 5.6
GPU_ARCH_TYPE: rocm
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.6-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.6-main
DESIRED_PYTHON: "3.8"
build_name: manywheel-py3_8-rocm5_6
build_environment: linux-binary-manywheel
@ -312,7 +312,7 @@ jobs:
GPU_ARCH_VERSION: 5.6
GPU_ARCH_TYPE: rocm
SKIP_ALL_TESTS: 1
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.6-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.6-main
DESIRED_PYTHON: "3.8"
steps:
- name: Setup ROCm
@ -325,6 +325,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -336,7 +337,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -350,9 +351,9 @@ jobs:
run: |
echo "GPU_FLAG=--device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon" >> "${GITHUB_ENV}"
- name: Pull Docker image
uses: pytorch/test-infra/.github/actions/pull-docker-image@release/2.2
uses: pytorch/test-infra/.github/actions/pull-docker-image@main
with:
docker-image: pytorch/manylinux-builder:rocm5.6-2.2
docker-image: pytorch/manylinux-builder:rocm5.6-main
- name: Test Pytorch binary
uses: ./pytorch/.github/actions/test-pytorch-binary
- name: Teardown ROCm
@ -369,7 +370,7 @@ jobs:
DESIRED_CUDA: rocm5.6
GPU_ARCH_VERSION: 5.6
GPU_ARCH_TYPE: rocm
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.6-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.6-main
DESIRED_PYTHON: "3.8"
build_name: manywheel-py3_8-rocm5_6
secrets:
@ -392,7 +393,7 @@ jobs:
DESIRED_CUDA: rocm5.7
GPU_ARCH_VERSION: 5.7
GPU_ARCH_TYPE: rocm
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.7-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.7-main
DESIRED_PYTHON: "3.8"
build_name: manywheel-py3_8-rocm5_7
build_environment: linux-binary-manywheel
@ -413,7 +414,7 @@ jobs:
GPU_ARCH_VERSION: 5.7
GPU_ARCH_TYPE: rocm
SKIP_ALL_TESTS: 1
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.7-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.7-main
DESIRED_PYTHON: "3.8"
steps:
- name: Setup ROCm
@ -426,6 +427,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -437,7 +439,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -451,9 +453,9 @@ jobs:
run: |
echo "GPU_FLAG=--device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon" >> "${GITHUB_ENV}"
- name: Pull Docker image
uses: pytorch/test-infra/.github/actions/pull-docker-image@release/2.2
uses: pytorch/test-infra/.github/actions/pull-docker-image@main
with:
docker-image: pytorch/manylinux-builder:rocm5.7-2.2
docker-image: pytorch/manylinux-builder:rocm5.7-main
- name: Test Pytorch binary
uses: ./pytorch/.github/actions/test-pytorch-binary
- name: Teardown ROCm
@ -470,7 +472,7 @@ jobs:
DESIRED_CUDA: rocm5.7
GPU_ARCH_VERSION: 5.7
GPU_ARCH_TYPE: rocm
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.7-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.7-main
DESIRED_PYTHON: "3.8"
build_name: manywheel-py3_8-rocm5_7
secrets:
@ -492,7 +494,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-main
DESIRED_PYTHON: "3.9"
build_name: manywheel-py3_9-cpu
build_environment: linux-binary-manywheel
@ -510,7 +512,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-main
DESIRED_PYTHON: "3.9"
build_name: manywheel-py3_9-cpu
build_environment: linux-binary-manywheel
@ -528,7 +530,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-main
DESIRED_PYTHON: "3.9"
build_name: manywheel-py3_9-cpu
secrets:
@ -550,7 +552,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu-cxx11-abi
GPU_ARCH_TYPE: cpu-cxx11-abi
DOCKER_IMAGE: pytorch/manylinuxcxx11-abi-builder:cpu-cxx11-abi-2.2
DOCKER_IMAGE: pytorch/manylinuxcxx11-abi-builder:cpu-cxx11-abi-main
DESIRED_DEVTOOLSET: cxx11-abi
DESIRED_PYTHON: "3.9"
build_name: manywheel-py3_9-cpu-cxx11-abi
@ -569,7 +571,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu-cxx11-abi
GPU_ARCH_TYPE: cpu-cxx11-abi
DOCKER_IMAGE: pytorch/manylinuxcxx11-abi-builder:cpu-cxx11-abi-2.2
DOCKER_IMAGE: pytorch/manylinuxcxx11-abi-builder:cpu-cxx11-abi-main
DESIRED_DEVTOOLSET: cxx11-abi
DESIRED_PYTHON: "3.9"
build_name: manywheel-py3_9-cpu-cxx11-abi
@ -588,7 +590,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu-cxx11-abi
GPU_ARCH_TYPE: cpu-cxx11-abi
DOCKER_IMAGE: pytorch/manylinuxcxx11-abi-builder:cpu-cxx11-abi-2.2
DOCKER_IMAGE: pytorch/manylinuxcxx11-abi-builder:cpu-cxx11-abi-main
DESIRED_DEVTOOLSET: cxx11-abi
DESIRED_PYTHON: "3.9"
build_name: manywheel-py3_9-cpu-cxx11-abi
@ -612,7 +614,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-main
DESIRED_PYTHON: "3.9"
build_name: manywheel-py3_9-cuda11_8
build_environment: linux-binary-manywheel
@ -632,7 +634,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-main
DESIRED_PYTHON: "3.9"
build_name: manywheel-py3_9-cuda11_8
build_environment: linux-binary-manywheel
@ -651,7 +653,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-main
DESIRED_PYTHON: "3.9"
build_name: manywheel-py3_9-cuda11_8
secrets:
@ -674,7 +676,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-main
DESIRED_PYTHON: "3.9"
build_name: manywheel-py3_9-cuda12_1
build_environment: linux-binary-manywheel
@ -694,7 +696,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-main
DESIRED_PYTHON: "3.9"
build_name: manywheel-py3_9-cuda12_1
build_environment: linux-binary-manywheel
@ -713,7 +715,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-main
DESIRED_PYTHON: "3.9"
build_name: manywheel-py3_9-cuda12_1
secrets:
@ -736,7 +738,7 @@ jobs:
DESIRED_CUDA: rocm5.6
GPU_ARCH_VERSION: 5.6
GPU_ARCH_TYPE: rocm
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.6-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.6-main
DESIRED_PYTHON: "3.9"
build_name: manywheel-py3_9-rocm5_6
build_environment: linux-binary-manywheel
@ -757,7 +759,7 @@ jobs:
GPU_ARCH_VERSION: 5.6
GPU_ARCH_TYPE: rocm
SKIP_ALL_TESTS: 1
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.6-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.6-main
DESIRED_PYTHON: "3.9"
steps:
- name: Setup ROCm
@ -770,6 +772,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -781,7 +784,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -795,9 +798,9 @@ jobs:
run: |
echo "GPU_FLAG=--device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon" >> "${GITHUB_ENV}"
- name: Pull Docker image
uses: pytorch/test-infra/.github/actions/pull-docker-image@release/2.2
uses: pytorch/test-infra/.github/actions/pull-docker-image@main
with:
docker-image: pytorch/manylinux-builder:rocm5.6-2.2
docker-image: pytorch/manylinux-builder:rocm5.6-main
- name: Test Pytorch binary
uses: ./pytorch/.github/actions/test-pytorch-binary
- name: Teardown ROCm
@ -814,7 +817,7 @@ jobs:
DESIRED_CUDA: rocm5.6
GPU_ARCH_VERSION: 5.6
GPU_ARCH_TYPE: rocm
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.6-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.6-main
DESIRED_PYTHON: "3.9"
build_name: manywheel-py3_9-rocm5_6
secrets:
@ -837,7 +840,7 @@ jobs:
DESIRED_CUDA: rocm5.7
GPU_ARCH_VERSION: 5.7
GPU_ARCH_TYPE: rocm
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.7-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.7-main
DESIRED_PYTHON: "3.9"
build_name: manywheel-py3_9-rocm5_7
build_environment: linux-binary-manywheel
@ -858,7 +861,7 @@ jobs:
GPU_ARCH_VERSION: 5.7
GPU_ARCH_TYPE: rocm
SKIP_ALL_TESTS: 1
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.7-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.7-main
DESIRED_PYTHON: "3.9"
steps:
- name: Setup ROCm
@ -871,6 +874,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -882,7 +886,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -896,9 +900,9 @@ jobs:
run: |
echo "GPU_FLAG=--device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon" >> "${GITHUB_ENV}"
- name: Pull Docker image
uses: pytorch/test-infra/.github/actions/pull-docker-image@release/2.2
uses: pytorch/test-infra/.github/actions/pull-docker-image@main
with:
docker-image: pytorch/manylinux-builder:rocm5.7-2.2
docker-image: pytorch/manylinux-builder:rocm5.7-main
- name: Test Pytorch binary
uses: ./pytorch/.github/actions/test-pytorch-binary
- name: Teardown ROCm
@ -915,7 +919,7 @@ jobs:
DESIRED_CUDA: rocm5.7
GPU_ARCH_VERSION: 5.7
GPU_ARCH_TYPE: rocm
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.7-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.7-main
DESIRED_PYTHON: "3.9"
build_name: manywheel-py3_9-rocm5_7
secrets:
@ -937,7 +941,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-main
DESIRED_PYTHON: "3.10"
build_name: manywheel-py3_10-cpu
build_environment: linux-binary-manywheel
@ -955,7 +959,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-main
DESIRED_PYTHON: "3.10"
build_name: manywheel-py3_10-cpu
build_environment: linux-binary-manywheel
@ -973,7 +977,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-main
DESIRED_PYTHON: "3.10"
build_name: manywheel-py3_10-cpu
secrets:
@ -995,7 +999,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu-cxx11-abi
GPU_ARCH_TYPE: cpu-cxx11-abi
DOCKER_IMAGE: pytorch/manylinuxcxx11-abi-builder:cpu-cxx11-abi-2.2
DOCKER_IMAGE: pytorch/manylinuxcxx11-abi-builder:cpu-cxx11-abi-main
DESIRED_DEVTOOLSET: cxx11-abi
DESIRED_PYTHON: "3.10"
build_name: manywheel-py3_10-cpu-cxx11-abi
@ -1014,7 +1018,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu-cxx11-abi
GPU_ARCH_TYPE: cpu-cxx11-abi
DOCKER_IMAGE: pytorch/manylinuxcxx11-abi-builder:cpu-cxx11-abi-2.2
DOCKER_IMAGE: pytorch/manylinuxcxx11-abi-builder:cpu-cxx11-abi-main
DESIRED_DEVTOOLSET: cxx11-abi
DESIRED_PYTHON: "3.10"
build_name: manywheel-py3_10-cpu-cxx11-abi
@ -1033,7 +1037,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu-cxx11-abi
GPU_ARCH_TYPE: cpu-cxx11-abi
DOCKER_IMAGE: pytorch/manylinuxcxx11-abi-builder:cpu-cxx11-abi-2.2
DOCKER_IMAGE: pytorch/manylinuxcxx11-abi-builder:cpu-cxx11-abi-main
DESIRED_DEVTOOLSET: cxx11-abi
DESIRED_PYTHON: "3.10"
build_name: manywheel-py3_10-cpu-cxx11-abi
@ -1057,7 +1061,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-main
DESIRED_PYTHON: "3.10"
build_name: manywheel-py3_10-cuda11_8
build_environment: linux-binary-manywheel
@ -1077,7 +1081,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-main
DESIRED_PYTHON: "3.10"
build_name: manywheel-py3_10-cuda11_8
build_environment: linux-binary-manywheel
@ -1096,7 +1100,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-main
DESIRED_PYTHON: "3.10"
build_name: manywheel-py3_10-cuda11_8
secrets:
@ -1119,7 +1123,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-main
DESIRED_PYTHON: "3.10"
build_name: manywheel-py3_10-cuda12_1
build_environment: linux-binary-manywheel
@ -1139,7 +1143,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-main
DESIRED_PYTHON: "3.10"
build_name: manywheel-py3_10-cuda12_1
build_environment: linux-binary-manywheel
@ -1158,7 +1162,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-main
DESIRED_PYTHON: "3.10"
build_name: manywheel-py3_10-cuda12_1
secrets:
@ -1181,7 +1185,7 @@ jobs:
DESIRED_CUDA: rocm5.6
GPU_ARCH_VERSION: 5.6
GPU_ARCH_TYPE: rocm
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.6-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.6-main
DESIRED_PYTHON: "3.10"
build_name: manywheel-py3_10-rocm5_6
build_environment: linux-binary-manywheel
@ -1202,7 +1206,7 @@ jobs:
GPU_ARCH_VERSION: 5.6
GPU_ARCH_TYPE: rocm
SKIP_ALL_TESTS: 1
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.6-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.6-main
DESIRED_PYTHON: "3.10"
steps:
- name: Setup ROCm
@ -1215,6 +1219,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -1226,7 +1231,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -1240,9 +1245,9 @@ jobs:
run: |
echo "GPU_FLAG=--device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon" >> "${GITHUB_ENV}"
- name: Pull Docker image
uses: pytorch/test-infra/.github/actions/pull-docker-image@release/2.2
uses: pytorch/test-infra/.github/actions/pull-docker-image@main
with:
docker-image: pytorch/manylinux-builder:rocm5.6-2.2
docker-image: pytorch/manylinux-builder:rocm5.6-main
- name: Test Pytorch binary
uses: ./pytorch/.github/actions/test-pytorch-binary
- name: Teardown ROCm
@ -1259,7 +1264,7 @@ jobs:
DESIRED_CUDA: rocm5.6
GPU_ARCH_VERSION: 5.6
GPU_ARCH_TYPE: rocm
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.6-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.6-main
DESIRED_PYTHON: "3.10"
build_name: manywheel-py3_10-rocm5_6
secrets:
@ -1282,7 +1287,7 @@ jobs:
DESIRED_CUDA: rocm5.7
GPU_ARCH_VERSION: 5.7
GPU_ARCH_TYPE: rocm
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.7-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.7-main
DESIRED_PYTHON: "3.10"
build_name: manywheel-py3_10-rocm5_7
build_environment: linux-binary-manywheel
@ -1303,7 +1308,7 @@ jobs:
GPU_ARCH_VERSION: 5.7
GPU_ARCH_TYPE: rocm
SKIP_ALL_TESTS: 1
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.7-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.7-main
DESIRED_PYTHON: "3.10"
steps:
- name: Setup ROCm
@ -1316,6 +1321,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -1327,7 +1333,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -1341,9 +1347,9 @@ jobs:
run: |
echo "GPU_FLAG=--device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon" >> "${GITHUB_ENV}"
- name: Pull Docker image
uses: pytorch/test-infra/.github/actions/pull-docker-image@release/2.2
uses: pytorch/test-infra/.github/actions/pull-docker-image@main
with:
docker-image: pytorch/manylinux-builder:rocm5.7-2.2
docker-image: pytorch/manylinux-builder:rocm5.7-main
- name: Test Pytorch binary
uses: ./pytorch/.github/actions/test-pytorch-binary
- name: Teardown ROCm
@ -1360,7 +1366,7 @@ jobs:
DESIRED_CUDA: rocm5.7
GPU_ARCH_VERSION: 5.7
GPU_ARCH_TYPE: rocm
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.7-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.7-main
DESIRED_PYTHON: "3.10"
build_name: manywheel-py3_10-rocm5_7
secrets:
@ -1382,7 +1388,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-main
DESIRED_PYTHON: "3.11"
build_name: manywheel-py3_11-cpu
build_environment: linux-binary-manywheel
@ -1400,7 +1406,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-main
DESIRED_PYTHON: "3.11"
build_name: manywheel-py3_11-cpu
build_environment: linux-binary-manywheel
@ -1418,7 +1424,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-main
DESIRED_PYTHON: "3.11"
build_name: manywheel-py3_11-cpu
secrets:
@ -1440,7 +1446,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu-cxx11-abi
GPU_ARCH_TYPE: cpu-cxx11-abi
DOCKER_IMAGE: pytorch/manylinuxcxx11-abi-builder:cpu-cxx11-abi-2.2
DOCKER_IMAGE: pytorch/manylinuxcxx11-abi-builder:cpu-cxx11-abi-main
DESIRED_DEVTOOLSET: cxx11-abi
DESIRED_PYTHON: "3.11"
build_name: manywheel-py3_11-cpu-cxx11-abi
@ -1459,7 +1465,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu-cxx11-abi
GPU_ARCH_TYPE: cpu-cxx11-abi
DOCKER_IMAGE: pytorch/manylinuxcxx11-abi-builder:cpu-cxx11-abi-2.2
DOCKER_IMAGE: pytorch/manylinuxcxx11-abi-builder:cpu-cxx11-abi-main
DESIRED_DEVTOOLSET: cxx11-abi
DESIRED_PYTHON: "3.11"
build_name: manywheel-py3_11-cpu-cxx11-abi
@ -1478,7 +1484,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu-cxx11-abi
GPU_ARCH_TYPE: cpu-cxx11-abi
DOCKER_IMAGE: pytorch/manylinuxcxx11-abi-builder:cpu-cxx11-abi-2.2
DOCKER_IMAGE: pytorch/manylinuxcxx11-abi-builder:cpu-cxx11-abi-main
DESIRED_DEVTOOLSET: cxx11-abi
DESIRED_PYTHON: "3.11"
build_name: manywheel-py3_11-cpu-cxx11-abi
@ -1502,7 +1508,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-main
DESIRED_PYTHON: "3.11"
build_name: manywheel-py3_11-cuda11_8
build_environment: linux-binary-manywheel
@ -1522,7 +1528,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-main
DESIRED_PYTHON: "3.11"
build_name: manywheel-py3_11-cuda11_8
build_environment: linux-binary-manywheel
@ -1541,7 +1547,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-main
DESIRED_PYTHON: "3.11"
build_name: manywheel-py3_11-cuda11_8
secrets:
@ -1564,7 +1570,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-main
DESIRED_PYTHON: "3.11"
build_name: manywheel-py3_11-cuda12_1
build_environment: linux-binary-manywheel
@ -1584,7 +1590,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-main
DESIRED_PYTHON: "3.11"
build_name: manywheel-py3_11-cuda12_1
build_environment: linux-binary-manywheel
@ -1603,7 +1609,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-main
DESIRED_PYTHON: "3.11"
build_name: manywheel-py3_11-cuda12_1
secrets:
@ -1626,7 +1632,7 @@ jobs:
DESIRED_CUDA: rocm5.6
GPU_ARCH_VERSION: 5.6
GPU_ARCH_TYPE: rocm
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.6-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.6-main
DESIRED_PYTHON: "3.11"
build_name: manywheel-py3_11-rocm5_6
build_environment: linux-binary-manywheel
@ -1647,7 +1653,7 @@ jobs:
GPU_ARCH_VERSION: 5.6
GPU_ARCH_TYPE: rocm
SKIP_ALL_TESTS: 1
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.6-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.6-main
DESIRED_PYTHON: "3.11"
steps:
- name: Setup ROCm
@ -1660,6 +1666,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -1671,7 +1678,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -1685,9 +1692,9 @@ jobs:
run: |
echo "GPU_FLAG=--device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon" >> "${GITHUB_ENV}"
- name: Pull Docker image
uses: pytorch/test-infra/.github/actions/pull-docker-image@release/2.2
uses: pytorch/test-infra/.github/actions/pull-docker-image@main
with:
docker-image: pytorch/manylinux-builder:rocm5.6-2.2
docker-image: pytorch/manylinux-builder:rocm5.6-main
- name: Test Pytorch binary
uses: ./pytorch/.github/actions/test-pytorch-binary
- name: Teardown ROCm
@ -1704,7 +1711,7 @@ jobs:
DESIRED_CUDA: rocm5.6
GPU_ARCH_VERSION: 5.6
GPU_ARCH_TYPE: rocm
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.6-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.6-main
DESIRED_PYTHON: "3.11"
build_name: manywheel-py3_11-rocm5_6
secrets:
@ -1727,7 +1734,7 @@ jobs:
DESIRED_CUDA: rocm5.7
GPU_ARCH_VERSION: 5.7
GPU_ARCH_TYPE: rocm
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.7-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.7-main
DESIRED_PYTHON: "3.11"
build_name: manywheel-py3_11-rocm5_7
build_environment: linux-binary-manywheel
@ -1748,7 +1755,7 @@ jobs:
GPU_ARCH_VERSION: 5.7
GPU_ARCH_TYPE: rocm
SKIP_ALL_TESTS: 1
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.7-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.7-main
DESIRED_PYTHON: "3.11"
steps:
- name: Setup ROCm
@ -1761,6 +1768,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -1772,7 +1780,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -1786,9 +1794,9 @@ jobs:
run: |
echo "GPU_FLAG=--device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon" >> "${GITHUB_ENV}"
- name: Pull Docker image
uses: pytorch/test-infra/.github/actions/pull-docker-image@release/2.2
uses: pytorch/test-infra/.github/actions/pull-docker-image@main
with:
docker-image: pytorch/manylinux-builder:rocm5.7-2.2
docker-image: pytorch/manylinux-builder:rocm5.7-main
- name: Test Pytorch binary
uses: ./pytorch/.github/actions/test-pytorch-binary
- name: Teardown ROCm
@ -1805,7 +1813,7 @@ jobs:
DESIRED_CUDA: rocm5.7
GPU_ARCH_VERSION: 5.7
GPU_ARCH_TYPE: rocm
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.7-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.7-main
DESIRED_PYTHON: "3.11"
build_name: manywheel-py3_11-rocm5_7
secrets:
@ -1827,7 +1835,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-main
DESIRED_PYTHON: "3.12"
build_name: manywheel-py3_12-cpu
build_environment: linux-binary-manywheel
@ -1845,7 +1853,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-main
DESIRED_PYTHON: "3.12"
build_name: manywheel-py3_12-cpu
build_environment: linux-binary-manywheel
@ -1863,7 +1871,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-main
DESIRED_PYTHON: "3.12"
build_name: manywheel-py3_12-cpu
secrets:
@ -1885,7 +1893,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu-cxx11-abi
GPU_ARCH_TYPE: cpu-cxx11-abi
DOCKER_IMAGE: pytorch/manylinuxcxx11-abi-builder:cpu-cxx11-abi-2.2
DOCKER_IMAGE: pytorch/manylinuxcxx11-abi-builder:cpu-cxx11-abi-main
DESIRED_DEVTOOLSET: cxx11-abi
DESIRED_PYTHON: "3.12"
build_name: manywheel-py3_12-cpu-cxx11-abi
@ -1904,7 +1912,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu-cxx11-abi
GPU_ARCH_TYPE: cpu-cxx11-abi
DOCKER_IMAGE: pytorch/manylinuxcxx11-abi-builder:cpu-cxx11-abi-2.2
DOCKER_IMAGE: pytorch/manylinuxcxx11-abi-builder:cpu-cxx11-abi-main
DESIRED_DEVTOOLSET: cxx11-abi
DESIRED_PYTHON: "3.12"
build_name: manywheel-py3_12-cpu-cxx11-abi
@ -1923,7 +1931,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu-cxx11-abi
GPU_ARCH_TYPE: cpu-cxx11-abi
DOCKER_IMAGE: pytorch/manylinuxcxx11-abi-builder:cpu-cxx11-abi-2.2
DOCKER_IMAGE: pytorch/manylinuxcxx11-abi-builder:cpu-cxx11-abi-main
DESIRED_DEVTOOLSET: cxx11-abi
DESIRED_PYTHON: "3.12"
build_name: manywheel-py3_12-cpu-cxx11-abi
@ -1947,7 +1955,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-main
DESIRED_PYTHON: "3.12"
build_name: manywheel-py3_12-cuda11_8
build_environment: linux-binary-manywheel
@ -1967,7 +1975,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-main
DESIRED_PYTHON: "3.12"
build_name: manywheel-py3_12-cuda11_8
build_environment: linux-binary-manywheel
@ -1986,7 +1994,7 @@ jobs:
DESIRED_CUDA: cu118
GPU_ARCH_VERSION: 11.8
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cuda11.8-main
DESIRED_PYTHON: "3.12"
build_name: manywheel-py3_12-cuda11_8
secrets:
@ -2009,7 +2017,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-main
DESIRED_PYTHON: "3.12"
build_name: manywheel-py3_12-cuda12_1
build_environment: linux-binary-manywheel
@ -2029,7 +2037,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-main
DESIRED_PYTHON: "3.12"
build_name: manywheel-py3_12-cuda12_1
build_environment: linux-binary-manywheel
@ -2048,7 +2056,7 @@ jobs:
DESIRED_CUDA: cu121
GPU_ARCH_VERSION: 12.1
GPU_ARCH_TYPE: cuda
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cuda12.1-main
DESIRED_PYTHON: "3.12"
build_name: manywheel-py3_12-cuda12_1
secrets:
@ -2071,7 +2079,7 @@ jobs:
DESIRED_CUDA: rocm5.6
GPU_ARCH_VERSION: 5.6
GPU_ARCH_TYPE: rocm
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.6-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.6-main
DESIRED_PYTHON: "3.12"
build_name: manywheel-py3_12-rocm5_6
build_environment: linux-binary-manywheel
@ -2092,7 +2100,7 @@ jobs:
GPU_ARCH_VERSION: 5.6
GPU_ARCH_TYPE: rocm
SKIP_ALL_TESTS: 1
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.6-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.6-main
DESIRED_PYTHON: "3.12"
steps:
- name: Setup ROCm
@ -2105,6 +2113,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -2116,7 +2125,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -2130,9 +2139,9 @@ jobs:
run: |
echo "GPU_FLAG=--device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon" >> "${GITHUB_ENV}"
- name: Pull Docker image
uses: pytorch/test-infra/.github/actions/pull-docker-image@release/2.2
uses: pytorch/test-infra/.github/actions/pull-docker-image@main
with:
docker-image: pytorch/manylinux-builder:rocm5.6-2.2
docker-image: pytorch/manylinux-builder:rocm5.6-main
- name: Test Pytorch binary
uses: ./pytorch/.github/actions/test-pytorch-binary
- name: Teardown ROCm
@ -2149,7 +2158,7 @@ jobs:
DESIRED_CUDA: rocm5.6
GPU_ARCH_VERSION: 5.6
GPU_ARCH_TYPE: rocm
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.6-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.6-main
DESIRED_PYTHON: "3.12"
build_name: manywheel-py3_12-rocm5_6
secrets:
@ -2172,7 +2181,7 @@ jobs:
DESIRED_CUDA: rocm5.7
GPU_ARCH_VERSION: 5.7
GPU_ARCH_TYPE: rocm
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.7-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.7-main
DESIRED_PYTHON: "3.12"
build_name: manywheel-py3_12-rocm5_7
build_environment: linux-binary-manywheel
@ -2193,7 +2202,7 @@ jobs:
GPU_ARCH_VERSION: 5.7
GPU_ARCH_TYPE: rocm
SKIP_ALL_TESTS: 1
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.7-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.7-main
DESIRED_PYTHON: "3.12"
steps:
- name: Setup ROCm
@ -2206,6 +2215,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -2217,7 +2227,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -2231,9 +2241,9 @@ jobs:
run: |
echo "GPU_FLAG=--device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon" >> "${GITHUB_ENV}"
- name: Pull Docker image
uses: pytorch/test-infra/.github/actions/pull-docker-image@release/2.2
uses: pytorch/test-infra/.github/actions/pull-docker-image@main
with:
docker-image: pytorch/manylinux-builder:rocm5.7-2.2
docker-image: pytorch/manylinux-builder:rocm5.7-main
- name: Test Pytorch binary
uses: ./pytorch/.github/actions/test-pytorch-binary
- name: Teardown ROCm
@ -2250,7 +2260,7 @@ jobs:
DESIRED_CUDA: rocm5.7
GPU_ARCH_VERSION: 5.7
GPU_ARCH_TYPE: rocm
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.7-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:rocm5.7-main
DESIRED_PYTHON: "3.12"
build_name: manywheel-py3_12-rocm5_7
secrets:


@ -79,6 +79,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -90,7 +91,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -139,7 +140,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/conda-builder:cpu-2.2
DOCKER_IMAGE: pytorch/conda-builder:cpu-main
DESIRED_PYTHON: "3.8"
build_name: conda-py3_8-cpu
use_s3: False
@ -195,6 +196,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -206,7 +208,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -255,7 +257,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/conda-builder:cpu-2.2
DOCKER_IMAGE: pytorch/conda-builder:cpu-main
DESIRED_PYTHON: "3.9"
build_name: conda-py3_9-cpu
use_s3: False
@ -311,6 +313,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -322,7 +325,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -371,7 +374,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/conda-builder:cpu-2.2
DOCKER_IMAGE: pytorch/conda-builder:cpu-main
DESIRED_PYTHON: "3.10"
build_name: conda-py3_10-cpu
use_s3: False
@ -427,6 +430,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -438,7 +442,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -487,7 +491,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/conda-builder:cpu-2.2
DOCKER_IMAGE: pytorch/conda-builder:cpu-main
DESIRED_PYTHON: "3.11"
build_name: conda-py3_11-cpu
use_s3: False
@ -543,6 +547,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -554,7 +559,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -603,7 +608,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/conda-builder:cpu-2.2
DOCKER_IMAGE: pytorch/conda-builder:cpu-main
DESIRED_PYTHON: "3.12"
build_name: conda-py3_12-cpu
use_s3: False


@ -81,6 +81,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -92,7 +93,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -141,7 +142,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cpu-2.2
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cpu-main
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: cxx11-abi
build_name: libtorch-cpu-shared-with-deps-cxx11-abi


@ -78,6 +78,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -89,7 +90,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -138,7 +139,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-main
DESIRED_PYTHON: "3.8"
build_name: wheel-py3_8-cpu
use_s3: False
@ -195,6 +196,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -206,7 +208,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -255,7 +257,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-main
DESIRED_PYTHON: "3.9"
build_name: wheel-py3_9-cpu
use_s3: False
@ -312,6 +314,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -323,7 +326,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -372,7 +375,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-main
DESIRED_PYTHON: "3.10"
build_name: wheel-py3_10-cpu
use_s3: False
@ -429,6 +432,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -440,7 +444,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -489,7 +493,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-main
DESIRED_PYTHON: "3.11"
build_name: wheel-py3_11-cpu
use_s3: False
@ -546,6 +550,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -557,7 +562,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -606,7 +611,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-main
DESIRED_PYTHON: "3.12"
build_name: wheel-py3_12-cpu
use_s3: False


@ -77,6 +77,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -88,7 +89,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -137,7 +138,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/conda-builder:cpu-2.2
DOCKER_IMAGE: pytorch/conda-builder:cpu-main
DESIRED_PYTHON: "3.8"
build_name: conda-py3_8-cpu
use_s3: False
@ -193,6 +194,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -204,7 +206,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -253,7 +255,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/conda-builder:cpu-2.2
DOCKER_IMAGE: pytorch/conda-builder:cpu-main
DESIRED_PYTHON: "3.9"
build_name: conda-py3_9-cpu
use_s3: False
@ -309,6 +311,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -320,7 +323,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -369,7 +372,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/conda-builder:cpu-2.2
DOCKER_IMAGE: pytorch/conda-builder:cpu-main
DESIRED_PYTHON: "3.10"
build_name: conda-py3_10-cpu
use_s3: False
@ -425,6 +428,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -436,7 +440,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -485,7 +489,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/conda-builder:cpu-2.2
DOCKER_IMAGE: pytorch/conda-builder:cpu-main
DESIRED_PYTHON: "3.11"
build_name: conda-py3_11-cpu
use_s3: False
@ -541,6 +545,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -552,7 +557,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -601,7 +606,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/conda-builder:cpu-2.2
DOCKER_IMAGE: pytorch/conda-builder:cpu-main
DESIRED_PYTHON: "3.12"
build_name: conda-py3_12-cpu
use_s3: False


@ -81,6 +81,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -92,7 +93,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -141,7 +142,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cpu-2.2
DOCKER_IMAGE: pytorch/libtorch-cxx11-builder:cpu-main
LIBTORCH_VARIANT: shared-with-deps
DESIRED_DEVTOOLSET: cxx11-abi
build_name: libtorch-cpu-shared-with-deps-cxx11-abi


@ -78,6 +78,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -89,7 +90,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -138,7 +139,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-main
DESIRED_PYTHON: "3.8"
build_name: wheel-py3_8-cpu
use_s3: False
@ -195,6 +196,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -206,7 +208,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -255,7 +257,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-main
DESIRED_PYTHON: "3.9"
build_name: wheel-py3_9-cpu
use_s3: False
@ -312,6 +314,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -323,7 +326,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -372,7 +375,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-main
DESIRED_PYTHON: "3.10"
build_name: wheel-py3_10-cpu
use_s3: False
@ -429,6 +432,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -440,7 +444,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -489,7 +493,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-main
DESIRED_PYTHON: "3.11"
build_name: wheel-py3_11-cpu
use_s3: False
@ -546,6 +550,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -557,7 +562,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -606,7 +611,7 @@ jobs:
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: cpu
GPU_ARCH_TYPE: cpu
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-2.2
DOCKER_IMAGE: pytorch/manylinux-builder:cpu-main
DESIRED_PYTHON: "3.12"
build_name: wheel-py3_12-cpu
use_s3: False


@ -62,7 +62,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -93,6 +93,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -104,7 +105,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -173,7 +174,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -209,6 +210,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -220,7 +222,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -302,7 +304,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -333,6 +335,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -344,7 +347,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -414,7 +417,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -450,6 +453,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -461,7 +465,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -544,7 +548,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -575,6 +579,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -586,7 +591,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -656,7 +661,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -692,6 +697,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -703,7 +709,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -785,7 +791,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -816,6 +822,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -827,7 +834,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -896,7 +903,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -932,6 +939,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -943,7 +951,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -1025,7 +1033,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -1056,6 +1064,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -1067,7 +1076,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -1137,7 +1146,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -1173,6 +1182,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -1184,7 +1194,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -1267,7 +1277,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -1298,6 +1308,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -1309,7 +1320,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -1379,7 +1390,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -1415,6 +1426,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -1426,7 +1438,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -1508,7 +1520,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -1539,6 +1551,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -1550,7 +1563,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -1619,7 +1632,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -1655,6 +1668,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -1666,7 +1680,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -1748,7 +1762,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -1779,6 +1793,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -1790,7 +1805,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -1860,7 +1875,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -1896,6 +1911,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -1907,7 +1923,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -1990,7 +2006,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -2021,6 +2037,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -2032,7 +2049,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -2102,7 +2119,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -2138,6 +2155,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -2149,7 +2167,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -2231,7 +2249,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -2262,6 +2280,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -2273,7 +2292,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -2342,7 +2361,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -2378,6 +2397,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -2389,7 +2409,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -2471,7 +2491,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -2502,6 +2522,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -2513,7 +2534,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -2583,7 +2604,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -2619,6 +2640,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -2630,7 +2652,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -2713,7 +2735,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -2744,6 +2766,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -2755,7 +2778,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -2825,7 +2848,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -2861,6 +2884,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -2872,7 +2896,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -2954,7 +2978,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -2985,6 +3009,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -2996,7 +3021,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -3065,7 +3090,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -3101,6 +3126,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -3112,7 +3138,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -3194,7 +3220,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -3225,6 +3251,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -3236,7 +3263,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -3306,7 +3333,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -3342,6 +3369,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -3353,7 +3381,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -3436,7 +3464,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -3467,6 +3495,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -3478,7 +3507,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -3548,7 +3577,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -3584,6 +3613,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -3595,7 +3625,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder


@ -59,7 +59,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -90,6 +90,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -101,7 +102,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -174,7 +175,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -210,6 +211,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -221,7 +223,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder


@ -66,7 +66,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -97,6 +97,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -108,7 +109,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -181,7 +182,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -217,6 +218,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -228,7 +230,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -318,7 +320,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -349,6 +351,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -360,7 +363,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -434,7 +437,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -470,6 +473,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -481,7 +485,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -572,7 +576,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -603,6 +607,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -614,7 +619,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -688,7 +693,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -724,6 +729,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -735,7 +741,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder


@ -59,7 +59,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -90,6 +90,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -101,7 +102,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -174,7 +175,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -210,6 +211,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -221,7 +223,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder


@ -66,7 +66,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -97,6 +97,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -108,7 +109,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -181,7 +182,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -217,6 +218,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -228,7 +230,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -318,7 +320,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -349,6 +351,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -360,7 +363,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -434,7 +437,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -470,6 +473,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -481,7 +485,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -572,7 +576,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -603,6 +607,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -614,7 +619,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -688,7 +693,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -724,6 +729,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -735,7 +741,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder


@ -63,7 +63,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -94,6 +94,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -105,7 +106,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -174,7 +175,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -210,6 +211,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -221,7 +223,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -304,7 +306,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -335,6 +337,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -346,7 +349,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -416,7 +419,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -452,6 +455,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -463,7 +467,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -547,7 +551,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -578,6 +582,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -589,7 +594,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -659,7 +664,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -695,6 +700,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -706,7 +712,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -789,7 +795,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -820,6 +826,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -831,7 +838,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -900,7 +907,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -936,6 +943,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -947,7 +955,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -1030,7 +1038,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -1061,6 +1069,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -1072,7 +1081,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -1142,7 +1151,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -1178,6 +1187,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -1189,7 +1199,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -1273,7 +1283,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -1304,6 +1314,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -1315,7 +1326,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -1385,7 +1396,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -1421,6 +1432,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -1432,7 +1444,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -1515,7 +1527,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -1546,6 +1558,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -1557,7 +1570,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -1626,7 +1639,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -1662,6 +1675,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -1673,7 +1687,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -1756,7 +1770,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -1787,6 +1801,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -1798,7 +1813,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -1868,7 +1883,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -1904,6 +1919,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -1915,7 +1931,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -1999,7 +2015,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -2030,6 +2046,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -2041,7 +2058,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -2111,7 +2128,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -2147,6 +2164,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -2158,7 +2176,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -2241,7 +2259,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -2272,6 +2290,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -2283,7 +2302,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -2352,7 +2371,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -2388,6 +2407,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -2399,7 +2419,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -2482,7 +2502,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -2513,6 +2533,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -2524,7 +2545,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -2594,7 +2615,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -2630,6 +2651,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -2641,7 +2663,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -2725,7 +2747,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -2756,6 +2778,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -2767,7 +2790,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -2837,7 +2860,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -2873,6 +2896,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -2884,7 +2908,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -2967,7 +2991,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -2998,6 +3022,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -3009,7 +3034,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -3078,7 +3103,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -3114,6 +3139,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -3125,7 +3151,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -3208,7 +3234,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -3239,6 +3265,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -3250,7 +3277,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -3320,7 +3347,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -3356,6 +3383,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -3367,7 +3395,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -3451,7 +3479,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -3482,6 +3510,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -3493,7 +3522,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder
@ -3563,7 +3592,7 @@ jobs:
echo "instance-type: $(get_ec2_metadata instance-type)"
echo "system info $(uname -a)"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: pytorch/test-infra/.github/actions/setup-ssh@release/2.2
uses: pytorch/test-infra/.github/actions/setup-ssh@main
continue-on-error: true
with:
github-secret: ${{ secrets.GITHUB_TOKEN }}
@ -3599,6 +3628,7 @@ jobs:
- name: Checkout PyTorch
uses: malfet/checkout@silent-checkout
with:
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
submodules: recursive
path: pytorch
quiet-checkout: true
@ -3610,7 +3640,7 @@ jobs:
- name: Checkout pytorch/builder
uses: malfet/checkout@silent-checkout
with:
ref: release/2.2
ref: main
submodules: recursive
repository: pytorch/builder
path: builder


@ -4,7 +4,6 @@ on:
push:
branches:
- main
- release/*
tags:
- ciflow/inductor/*
workflow_dispatch:


@ -26,7 +26,7 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Run BC Lint Action
uses: pytorch/test-infra/.github/actions/bc-lint@release/2.2
uses: pytorch/test-infra/.github/actions/bc-lint@main
with:
repo: ${{ github.event.pull_request.head.repo.full_name }}
base_sha: ${{ github.event.pull_request.base.sha }}


@ -15,7 +15,7 @@ on:
# When any other step fails, it's job will be retried once by retryBot.
jobs:
lintrunner:
uses: pytorch/test-infra/.github/workflows/linux_job.yml@release/2.2
uses: pytorch/test-infra/.github/workflows/linux_job.yml@main
with:
timeout: 120
runner: linux.2xlarge
@ -63,7 +63,7 @@ jobs:
exit $RC
quick-checks:
uses: pytorch/test-infra/.github/workflows/linux_job.yml@release/2.2
uses: pytorch/test-infra/.github/workflows/linux_job.yml@main
with:
runner: linux.2xlarge
docker-image: pytorch-linux-focal-linter
@ -104,7 +104,7 @@ jobs:
if: github.event_name == 'pull_request' && !contains(github.event.pull_request.labels.*.name, 'skip-pr-sanity-checks')
steps:
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.2
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
with:
submodules: false
fetch-depth: -1
@ -117,7 +117,7 @@ jobs:
bash .github/scripts/pr-sanity-check.sh
workflow-checks:
uses: pytorch/test-infra/.github/workflows/linux_job.yml@release/2.2
uses: pytorch/test-infra/.github/workflows/linux_job.yml@main
with:
runner: linux.2xlarge
docker-image: pytorch-linux-focal-linter
@ -129,7 +129,6 @@ jobs:
CONDA_ENV=$(conda env list --json | jq -r ".envs | .[-1]")
conda activate "${CONDA_ENV}"
export RELEASE_VERSION_TAG=2.2
# Regenerate workflows
.github/scripts/generate_ci_workflows.py
@ -154,7 +153,7 @@ jobs:
exit $RC
toc:
uses: pytorch/test-infra/.github/workflows/linux_job.yml@release/2.2
uses: pytorch/test-infra/.github/workflows/linux_job.yml@main
with:
runner: linux.2xlarge
docker-image: pytorch-linux-focal-linter
@ -192,7 +191,7 @@ jobs:
test-tools:
name: Test tools
if: ${{ github.repository == 'pytorch/pytorch' }}
uses: pytorch/test-infra/.github/workflows/linux_job.yml@release/2.2
uses: pytorch/test-infra/.github/workflows/linux_job.yml@main
with:
runner: linux.2xlarge
docker-image: pytorch-linux-focal-linter
@ -213,7 +212,7 @@ jobs:
runs-on: linux.20_04.4x
steps:
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.2
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
with:
submodules: false
fetch-depth: 1
@ -229,8 +228,8 @@ jobs:
pip install torch --pre --index-url https://download.pytorch.org/whl/nightly/cpu/
- name: Run run_test.py (nonretryable)
run: |
# Run test_weak, which is very fast
python3 test/run_test.py --include test_weak --verbose
# Run test_vulkan, which is a fast noop on Linux
python3 test/run_test.py --include test_vulkan --verbose
test_collect_env:
if: ${{ github.repository == 'pytorch/pytorch' }}
@ -243,7 +242,7 @@ jobs:
# [see note: pytorch repo ref]
# deep clone (fetch-depth 0) required, to allow us to use git log
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.2
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
with:
submodules: false
fetch-depth: 1


@ -21,7 +21,7 @@ jobs:
environment: upload-stats
steps:
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.2
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
with:
fetch-depth: 1
submodules: false


@ -12,8 +12,6 @@ on:
push:
tags:
- ciflow/periodic/*
branches:
- release/*
workflow_dispatch:
concurrency:


@ -25,9 +25,12 @@ jobs:
sync-tag: rocm-build
test-matrix: |
{ include: [
{ config: "default", shard: 1, num_shards: 3, runner: "linux.rocm.gpu" },
{ config: "default", shard: 2, num_shards: 3, runner: "linux.rocm.gpu" },
{ config: "default", shard: 3, num_shards: 3, runner: "linux.rocm.gpu" },
{ config: "default", shard: 1, num_shards: 6, runner: "linux.rocm.gpu" },
{ config: "default", shard: 2, num_shards: 6, runner: "linux.rocm.gpu" },
{ config: "default", shard: 3, num_shards: 6, runner: "linux.rocm.gpu" },
{ config: "default", shard: 4, num_shards: 6, runner: "linux.rocm.gpu" },
{ config: "default", shard: 5, num_shards: 6, runner: "linux.rocm.gpu" },
{ config: "default", shard: 6, num_shards: 6, runner: "linux.rocm.gpu" },
]}
linux-focal-rocm5_7-py3_8-test:


@ -10,8 +10,6 @@ on:
push:
tags:
- ciflow/slow/*
branches:
- release/*
workflow_dispatch:
concurrency:


@ -16,11 +16,12 @@ on:
schedule:
# Run hourly.
- cron: 30 * * * *
workflow_dispatch:
jobs:
stale:
if: ${{ github.repository == 'pytorch/pytorch' }}
runs-on: ubuntu-latest
runs-on: linux.large.arc
steps:
- uses: actions/github-script@v6


@ -16,6 +16,7 @@ concurrency:
jobs:
# There must be at least one job here to satisfy GitHub action workflow syntax
introduction:
if: github.repository_owner == 'pytorch'
runs-on: ubuntu-latest
continue-on-error: true
steps:


@ -14,7 +14,7 @@ jobs:
if: ${{ github.repository == 'pytorch/pytorch' }}
steps:
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.2
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
with:
fetch-depth: 1
submodules: false


@ -44,7 +44,7 @@ jobs:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
uses: pytorch/test-infra/.github/actions/upload-alerts@release/2.2
uses: pytorch/test-infra/.github/actions/upload-alerts@main
with:
alerts: '${{ steps.alert_creation_step.outputs.script-output }}'
organization: "pytorch"


@ -10,6 +10,7 @@ jobs:
# the conclusion field in the github context is sometimes null
# solution adapted from https://github.com/community/community/discussions/21090#discussioncomment-3226271
get_workflow_conclusion:
if: github.repository_owner == 'pytorch'
runs-on: ubuntu-latest
outputs:
conclusion: ${{ fromJson(steps.get_conclusion.outputs.data).conclusion }}
@ -25,8 +26,9 @@ jobs:
upload-test-stats:
needs: get_workflow_conclusion
if:
github.event.workflow_run.conclusion == 'success' || github.event.workflow_run.conclusion == 'failure' ||
needs.get_workflow_conclusion.outputs.conclusion == 'success' || needs.get_workflow_conclusion.outputs.conclusion == 'failure'
github.repository_owner == 'pytorch' &&
(github.event.workflow_run.conclusion == 'success' || github.event.workflow_run.conclusion == 'failure' ||
needs.get_workflow_conclusion.outputs.conclusion == 'success' || needs.get_workflow_conclusion.outputs.conclusion == 'failure')
runs-on: ubuntu-22.04
environment: upload-stats
name: Upload test stats for ${{ github.event.workflow_run.id }}, attempt ${{ github.event.workflow_run.run_attempt }}
@ -37,7 +39,7 @@ jobs:
run: echo "${TRIGGERING_WORKFLOW}"
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.2
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
- uses: actions/setup-python@v4
with:
@ -93,15 +95,12 @@ jobs:
python3 -m tools.stats.check_disabled_tests --workflow-run-id "${WORKFLOW_RUN_ID}" --workflow-run-attempt "${WORKFLOW_RUN_ATTEMPT}" --repo "${REPO_FULLNAME}"
check-api-rate:
if: ${{ always() }}
if: ${{ always() && github.repository_owner == 'pytorch' }}
runs-on: ubuntu-latest
continue-on-error: true
environment: pytorchbot-env
steps:
- name: Get our GITHUB_TOKEN API limit usage
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
PYTORCHBOT_TOKEN: ${{ secrets.GH_PYTORCHBOT_TOKEN}}
run: |
curl -H "Accept: application/vnd.github.v3+json" -H "Authorization: token $GITHUB_TOKEN" https://api.github.com/rate_limit
curl -H "Accept: application/vnd.github.v3+json" -H "Authorization: token $PYTORCHBOT_TOKEN" https://api.github.com/rate_limit


@ -29,7 +29,7 @@ jobs:
name: Upload dynamo performance stats for ${{ github.event.workflow_run.id }}, attempt ${{ github.event.workflow_run.run_attempt }}
steps:
- name: Checkout PyTorch
uses: pytorch/pytorch/.github/actions/checkout-pytorch@release/2.2
uses: pytorch/pytorch/.github/actions/checkout-pytorch@main
with:
submodules: false
fetch-depth: 1


@ -134,7 +134,6 @@ exclude_patterns = [
'torch/distributed/elastic/agent/server/api.py',
'torch/testing/_internal/**',
'torch/distributed/fsdp/fully_sharded_data_parallel.py',
'torch/distributed/distributed_c10d.py',
# TODO(suo): these exclusions were added just to get lint clean on master.
# Follow up to do more target suppressions and remove them.
'torch/ao/quantization/fx/convert.py',
@ -878,11 +877,7 @@ exclude_patterns = [
'test/jit/**', # should be run through test/test_jit.py
'test/ao/sparsity/**', # should be run through test/test_ao_sparsity.py
'test/fx/**', # should be run through test/test_fx.py
'test/bottleneck_test/test_args.py',
'test/bottleneck_test/test_cuda.py',
'test/distributed/_shard/test_sharder.py',
'test/distributed/_tensor/experimental/test_tp_transform.py',
'test/distributed/_tensor/test_optimizers.py',
'test/bottleneck_test/**', # excluded by test/run_test.py
'test/distributed/argparse_util_test.py',
'test/distributed/bin/test_script.py',
'test/distributed/checkpoint/e2e/test_pipeline.py',
@ -919,6 +914,7 @@ exclude_patterns = [
'test/dynamo/test_compile.py',
'test/dynamo/test_sources.py',
'test/functorch/test_logging.py',
'test/inductor/test_aot_inductor_utils.py',
'test/lazy/test_bindings.py',
'test/lazy/test_extract_compiled_graph.py',
'test/lazy/test_meta_kernel.py',


@ -288,7 +288,7 @@ option(USE_VULKAN_RELAXED_PRECISION "Vulkan - Use relaxed precision math in the
option(USE_XNNPACK "Use XNNPACK" ON)
option(USE_ZMQ "Use ZMQ" OFF)
option(USE_ZSTD "Use ZSTD" OFF)
option(USE_ROCM_KERNEL_ASSERT "Use Kernel Assert for ROCm" OFF)
option(TORCH_DISABLE_GPU_ASSERTS "Disable GPU asserts by default" OFF)
# Ensure that an ITT build is the default for x86 CPUs
cmake_dependent_option(
USE_ITT "Use Intel(R) VTune Profiler ITT functionality" ON


@ -52,7 +52,8 @@ nn/qat/ @jerryzh168
# Docker
/.ci/docker/ @jeffdaily
/.ci/docker/ci_commit_pins/triton.txt @desertfire @Chillee @eellison @shunting314 @ngimel @bertmaher @jeffdaily @jataylo @jithunnair-amd @pruthvistony
/.ci/docker/ci_commit_pins/triton.txt @desertfire @Chillee @eellison @shunting314 @bertmaher @jeffdaily @jataylo @jithunnair-amd @pruthvistony
/.ci/docker/ci_commit_pins/triton-rocm.txt @jeffdaily @jataylo @jithunnair-amd @pruthvistony
# Github Actions
# This list is for people wanting to be notified every time there's a change
@ -76,15 +77,15 @@ nn/qat/ @jerryzh168
/test/test_linalg.py @lezcano @nikitaved @IvanYashchuk
# OpInfo-related files
/torch/testing/_internal/common_methods_invocations.py @mruberry @ngimel
/torch/testing/_internal/common_device_type.py @mruberry @ngimel
test/test_ops.py @mruberry @ngimel
test/test_ops_gradients.py @mruberry @ngimel @soulitzer
test/test_ops_fwd_gradients.py @mruberry @ngimel @soulitzer
test/test_unary_ufuncs.py @mruberry @ngimel
test/test_binary_ufuncs.py @mruberry @ngimel
test/test_reductions.py @mruberry @ngimel
test/test_type_promotion.py @mruberry @ngimel
/torch/testing/_internal/common_methods_invocations.py @mruberry
/torch/testing/_internal/common_device_type.py @mruberry
test/test_ops.py @mruberry
test/test_ops_gradients.py @mruberry @soulitzer
test/test_ops_fwd_gradients.py @mruberry @soulitzer
test/test_unary_ufuncs.py @mruberry
test/test_binary_ufuncs.py @mruberry
test/test_reductions.py @mruberry
test/test_type_promotion.py @mruberry
# functorch-related things
# This list is for people wanting to be notified every time there's a change
@ -113,6 +114,16 @@ torch/utils/data/ @ejguan
torch/utils/hipify/ @jeffdaily @jithunnair-amd
tools/amd_build/ @jeffdaily @jithunnair-amd
# ROCm-specific files
aten/src/ATen/hip/ @jeffdaily @jithunnair-amd
aten/src/ATen/miopen/ @jeffdaily @jithunnair-amd
aten/src/ATen/native/miopen/ @jeffdaily @jithunnair-amd
c10/hip @jeffdaily @jithunnair-amd
caffe2/core/hip @jeffdaily @jithunnair-amd
caffe2/operators/hip @jeffdaily @jithunnair-amd
caffe2/operators/rnn/hip @jeffdaily @jithunnair-amd
caffe2/utils/hip @jeffdaily @jithunnair-amd
# torch.export
/torch/export/ @avikchaudhuri @gmagogsfm @tugsbayasgalan @zhxchen17
/torch/_export/ @avikchaudhuri @gmagogsfm @tugsbayasgalan @zhxchen17


@@ -63,7 +63,7 @@ RUN --mount=type=cache,target=/opt/ccache \
FROM conda as conda-installs
ARG PYTHON_VERSION=3.8
ARG CUDA_VERSION=12.1
ARG CUDA_VERSION=11.7
ARG CUDA_CHANNEL=nvidia
ARG INSTALL_CHANNEL=pytorch-nightly
# Automatically set by buildx


@@ -108,13 +108,13 @@ TensorIteratorConfig& TensorIteratorConfig::add_owned_output(const TensorBase& o
num_inputs_ == 0,
"Keep in mind that you have to add all outputs first before adding any input. "
"For more details, see https://github.com/pytorch/pytorch/wiki/How-to-use-TensorIterator.");
tensors_.push_back(c10::MaybeOwned<TensorBase>::owned(c10::in_place, output));
tensors_.push_back(c10::MaybeOwned<TensorBase>::owned(std::in_place, output));
num_outputs_++;
return *this;
}
TensorIteratorConfig& TensorIteratorConfig::add_owned_input(const TensorBase& input) {
tensors_.push_back(c10::MaybeOwned<TensorBase>::owned(c10::in_place, input));
tensors_.push_back(c10::MaybeOwned<TensorBase>::owned(std::in_place, input));
num_inputs_++;
return *this;
}
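
This hunk, and the OperandInfo, TensorBase and operator-registration hunks below, swap the c10::in_place tag for the standard std::in_place tag when constructing c10::MaybeOwned and c10::optional values in place (a c10/util/in_place.h include is dropped in a later hunk). As a standalone illustration of what the in-place tag means, here is a minimal sketch using std::optional from the standard library instead of the c10 wrappers, which would need a full ATen build:

#include <iostream>
#include <optional>
#include <string>
#include <utility>

int main() {
  // Construct the contained std::string directly inside the optional from the
  // constructor arguments, instead of building a temporary and moving it in.
  std::optional<std::string> name(std::in_place, 5, 'x'); // holds "xxxxx"

  // An engaged optional holding a default-constructed value; the diff's
  // c10::MaybeOwned<TensorBase>::owned(std::in_place) follows the same pattern.
  std::optional<std::string> engaged_default(std::in_place);

  std::cout << *name << ' ' << engaged_default->size() << '\n'; // prints: xxxxx 0
  return 0;
}

In-place construction with the tag avoids an extra temporary and move, which is why it shows up in hot paths such as TensorIteratorConfig; switching from c10::in_place to std::in_place does not change behavior, it effectively drops c10's own copy of a standard tag.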


@@ -210,7 +210,7 @@ struct TORCH_API OperandInfo {
private:
c10::MaybeOwned<TensorBase> tensor_base_;
c10::MaybeOwned<TensorBase> original_tensor_base_ =
c10::MaybeOwned<TensorBase>::owned(c10::in_place);
c10::MaybeOwned<TensorBase>::owned(std::in_place);
// We store TensorBase visibly in the header to allow inline access.
// However, we sometimes need a genuine `const Tensor &` for the


@@ -133,7 +133,6 @@
#include <c10/util/bits.h>
#include <c10/util/complex.h>
#include <c10/util/floating_point_utils.h>
#include <c10/util/in_place.h>
#include <c10/util/intrusive_ptr.h>
#include <c10/util/irange.h>
#include <c10/util/llvmMathExtras.h>


@@ -996,7 +996,7 @@ inline c10::MaybeOwned<TensorBase> borrow_from_optional_tensor(
const c10::optional<TensorBase>& opt) {
return opt.has_value()
? c10::MaybeOwned<TensorBase>::borrowed(*opt)
: c10::MaybeOwned<TensorBase>::owned(c10::in_place);
: c10::MaybeOwned<TensorBase>::owned(std::in_place);
}
inline c10::MaybeOwned<TensorBase> TensorBase::expect_contiguous(MemoryFormat memory_format) const & {


@@ -215,7 +215,7 @@ std::tuple<Tensor, int64_t, std::vector<Tensor>, c10::optional<int64_t>, Dict<st
dummyTensor(DispatchKey::CUDA),
5,
{dummyTensor(DispatchKey::CPU), dummyTensor(DispatchKey::CUDA)},
c10::optional<int64_t>(c10::in_place, 0),
c10::optional<int64_t>(std::in_place, 0),
dict
);
}


@@ -231,7 +231,7 @@ std::tuple<Tensor, int64_t, c10::List<Tensor>, c10::optional<int64_t>, Dict<stri
dummyTensor(DispatchKey::CUDA),
5,
c10::List<Tensor>({dummyTensor(DispatchKey::CPU), dummyTensor(DispatchKey::CUDA)}),
c10::optional<int64_t>(c10::in_place, 0),
c10::optional<int64_t>(std::in_place, 0),
dict
);
}


@@ -196,7 +196,7 @@ TEST(OperatorRegistrationTestLegacyLambdaBasedKernel, givenKernelWithMultipleOut
dummyTensor(DispatchKey::CUDA),
5,
{dummyTensor(DispatchKey::CPU), dummyTensor(DispatchKey::CUDA)},
c10::optional<int64_t>(c10::in_place, 0),
c10::optional<int64_t>(std::in_place, 0),
dict
);
});


@@ -195,7 +195,7 @@ TEST(OperatorRegistrationTestLambdaBasedKernel, givenKernelWithMultipleOutputs_w
dummyTensor(DispatchKey::CUDA),
5,
c10::List<Tensor>({dummyTensor(DispatchKey::CPU), dummyTensor(DispatchKey::CUDA)}),
c10::optional<int64_t>(c10::in_place, 0),
c10::optional<int64_t>(std::in_place, 0),
dict
);
}));


@@ -213,7 +213,7 @@ struct KernelWithMultipleOutputs final : OperatorKernel {
dummyTensor(DispatchKey::CUDA),
5,
c10::List<Tensor>({dummyTensor(DispatchKey::CPU), dummyTensor(DispatchKey::CUDA)}),
c10::optional<int64_t>(c10::in_place, 0),
c10::optional<int64_t>(std::in_place, 0),
dict
);
}


@@ -3,7 +3,6 @@
#include <ATen/core/function_schema.h>
#include <c10/util/Metaprogramming.h>
#include <c10/util/flat_hash_map.h>
#include <c10/util/either.h>
#include <c10/util/Optional.h>
#include <c10/core/DispatchKey.h>
#include <c10/core/PyHandleCache.h>


@@ -369,6 +369,9 @@ public:
Vectorized<T> expm1() const {
return map(Sleef_expm1f8_u10);
}
Vectorized<T> exp_u20() const {
return exp();
}
Vectorized<T> fmod(const Vectorized<T> & q) const {
__m256 x_lo, x_hi;
cvt_to_fp32<T>(values, x_lo, x_hi);


@@ -169,6 +169,9 @@ public:
Vectorized<double> expm1() const {
return Vectorized<double>(Sleef_expm1d4_u10(values));
}
Vectorized<double> exp_u20() const {
return exp();
}
Vectorized<double> fmod(const Vectorized<double>& q) const {
return Vectorized<double>(Sleef_fmodd4(values, q));
}


@@ -204,6 +204,68 @@ public:
Vectorized<float> expm1() const {
return Vectorized<float>(Sleef_expm1f8_u10(values));
}
Vectorized<float> exp_u20() const {
// A faster version of exp with ULP=20
static __m256 vec_factorial_1 =
_mm256_set1_ps(0.999999701f); // 1/factorial(1)
static __m256 vec_factorial_2 =
_mm256_set1_ps(0.499991506f); // 1/factorial(2)
static __m256 vec_factorial_3 =
_mm256_set1_ps(0.166676521f); // 1/factorial(3)
static __m256 vec_factorial_4 =
_mm256_set1_ps(0.0418978221f); // 1/factorial(4)
static __m256 vec_factorial_5 =
_mm256_set1_ps(0.00828929059f); // 1/factorial(5)
static __m256 vec_exp_log2ef =
(__m256)_mm256_set1_epi32(0x3fb8aa3b); // log2(e)
static __m256 vec_half = _mm256_set1_ps(0.5f);
static __m256 vec_one = _mm256_set1_ps(1.f);
static __m256 vec_zero = _mm256_set1_ps(0.f);
static __m256 vec_two = _mm256_set1_ps(2.f);
static __m256 vec_ln2f = (__m256)_mm256_set1_epi32(0x3f317218); // ln(2)
static __m256 vec_ln_flt_min = (__m256)_mm256_set1_epi32(0xc2aeac50);
static __m256 vec_ln_flt_max = (__m256)_mm256_set1_epi32(0x42b17218);
static __m256i vec_127 = _mm256_set1_epi32(0x0000007f);
static int n_mantissa_bits = 23;
// exp(x) =
// = exp(n * ln(2) + r) // divide x by ln(2) and get quot and rem
// = 2^n * exp(r) // simplify the exp(n*ln(2)) expression
auto less_ln_flt_min_mask =
_mm256_cmp_ps(values, vec_ln_flt_min, 1 /*_CMP_LT_OS*/);
auto vec_src = _mm256_min_ps(values, vec_ln_flt_max);
vec_src = _mm256_max_ps(vec_src, vec_ln_flt_min);
// fx = floorf(x * log2ef + 0.5)
auto vec_fx = _mm256_fmadd_ps(vec_src, vec_exp_log2ef, vec_half);
vec_fx = _mm256_floor_ps(vec_fx);
// x = x - fx * ln2
auto vec_exp_poly = _mm256_fnmadd_ps(vec_fx, vec_ln2f, vec_src);
// compute polynomial
auto vec_res =
_mm256_fmadd_ps(vec_exp_poly, vec_factorial_5, vec_factorial_4);
vec_res = _mm256_fmadd_ps(vec_exp_poly, vec_res, vec_factorial_3);
vec_res = _mm256_fmadd_ps(vec_exp_poly, vec_res, vec_factorial_2);
vec_res = _mm256_fmadd_ps(vec_exp_poly, vec_res, vec_factorial_1);
vec_res = _mm256_fmadd_ps(vec_exp_poly, vec_res, vec_one);
// compute 2^(n-1)
auto vec_exp_number = _mm256_sub_ps(vec_fx, vec_one);
auto vec_exp_number_i = _mm256_cvtps_epi32(vec_exp_number);
auto vec_two_pow_n_i = _mm256_add_epi32(vec_exp_number_i, vec_127);
vec_two_pow_n_i = _mm256_slli_epi32(vec_two_pow_n_i, n_mantissa_bits);
auto vec_two_pow_n = (__m256)vec_two_pow_n_i;
vec_two_pow_n =
_mm256_blendv_ps(vec_two_pow_n, vec_zero, less_ln_flt_min_mask);
// y = y * 2^n
vec_res = _mm256_mul_ps(vec_res, vec_two_pow_n);
vec_res = _mm256_mul_ps(vec_res, vec_two);
return vec_res;
}
Vectorized<float> fmod(const Vectorized<float>& q) const {
return Vectorized<float>(Sleef_fmodf8(values, q));
}
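
The hunk above adds a vectorized fast-exp path: reduce x = n·ln(2) + r, approximate exp(r) with a degree-5 polynomial, then rebuild 2^(n-1) by writing the biased exponent bits directly and multiplying by two at the end. A minimal scalar sketch of the same scheme, for illustration only (exp_u20_scalar and the decimal constants are this sketch's own; the real kernel is the AVX2 code above):

#include <cmath>
#include <cstdint>
#include <cstdio>
#include <cstring>

// Scalar mirror of the AVX2 exp_u20 range-reduction scheme shown in the diff.
static float exp_u20_scalar(float x) {
  constexpr float log2e = 1.442695041f;      // 0x3fb8aa3b
  constexpr float ln2 = 0.693147181f;        // 0x3f317218
  constexpr float ln_flt_min = -87.3365479f; // 0xc2aeac50
  constexpr float ln_flt_max = 88.7228394f;  // 0x42b17218

  const bool flush_to_zero = x < ln_flt_min; // mirrors less_ln_flt_min_mask
  x = std::fmax(std::fmin(x, ln_flt_max), ln_flt_min);

  // fx = floorf(x * log2(e) + 0.5); r = x - fx * ln(2)
  const float fx = std::floor(x * log2e + 0.5f);
  const float r = x - fx * ln2;

  // Degree-5 polynomial for exp(r), coefficients as in the vectorized code.
  float p = 0.00828929059f;    // ~1/5!
  p = p * r + 0.0418978221f;   // ~1/4!
  p = p * r + 0.166676521f;    // ~1/3!
  p = p * r + 0.499991506f;    // ~1/2!
  p = p * r + 0.999999701f;    // ~1/1!
  p = p * r + 1.0f;

  // Build 2^(n-1) by packing (n-1)+127 into the float exponent field.
  const int32_t n_minus_1 = static_cast<int32_t>(fx) - 1;
  uint32_t bits = static_cast<uint32_t>(n_minus_1 + 127) << 23;
  float two_pow_n;
  std::memcpy(&two_pow_n, &bits, sizeof(two_pow_n));
  if (flush_to_zero) two_pow_n = 0.0f;

  return p * two_pow_n * 2.0f; // y = exp(r) * 2^(n-1) * 2
}

int main() {
  const float xs[] = {-3.0f, -0.5f, 0.0f, 1.0f, 5.0f};
  for (float x : xs) {
    std::printf("x=%5.2f  approx=%.6f  std::exp=%.6f\n",
                x, exp_u20_scalar(x), std::exp(x));
  }
  return 0;
}

Only the AVX2 backend shown here gets the polynomial fast path; the other Vectorized specializations in the surrounding hunks simply forward exp_u20() to exp(), so every backend exposes the same interface at no worse than the default accuracy.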


@@ -421,6 +421,9 @@ public:
map(std::expm1)
);
}
Vectorized<float> exp_u20() const {
return exp();
}
Vectorized<float> fmod(const Vectorized<float>& q) const {
USE_SLEEF(
{


@@ -249,6 +249,9 @@ class Vectorized<double> {
Vectorized<double> expm1() const {
return {Sleef_expm1d2_u10(_vec0), Sleef_expm1d2_u10(_vec1)};
}
Vectorized<double> C10_ALWAYS_INLINE exp_u20() const {
return exp();
}
Vectorized<double> lgamma() const __ubsan_ignore_undefined__ {
return {Sleef_lgammad2_u10(_vec0), Sleef_lgammad2_u10(_vec1)};


@@ -312,6 +312,9 @@ class Vectorized<float> {
Vectorized<float> expm1() const {
return {Sleef_expm1f4_u10(_vec0), Sleef_expm1f4_u10(_vec1)};
}
Vectorized<float> C10_ALWAYS_INLINE exp_u20() const {
return exp();
}
Vectorized<float> C10_ALWAYS_INLINE log() const {
return {Sleef_logf4_u10(_vec0), Sleef_logf4_u10(_vec1)};


@@ -1072,6 +1072,9 @@ struct Vectorized<T, std::enable_if_t<is_zarch_implemented<T>()>> {
Vectorized<T> expm1() const {
return mapSleef(Sleef_expm1f4_u10, Sleef_expm1d2_u10);
}
Vectorized<T> exp_u20() const {
return exp();
}
Vectorized<T> log() const {
return mapSleef(Sleef_logf4_u10, Sleef_logd2_u10);

Some files were not shown because too many files have changed in this diff.