Compare commits

...

920 Commits

Author SHA1 Message Date
bc2c6edaf1 Re-enable Windows debug libtorch (#73900) 2022-03-08 10:46:44 -05:00
46f85865c0 Also install c10d headers with .h extension (#73422) (#73497)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73422

Fixes https://github.com/pytorch/pytorch/issues/73421
ghstack-source-id: 149978120

Test Plan: None

Reviewed By: cbalioglu

Differential Revision: D34475711

fbshipit-source-id: 9e4d1d57021cbff51f53762b32bbfffbf3f81c4c
2022-03-01 10:37:30 -05:00
a556333dfa scatter_reduce documentation (#73125) (#73365)
Summary:
Reland of https://github.com/pytorch/pytorch/issues/68580 (which was milestoned for 1.11) plus a partial revert of https://github.com/pytorch/pytorch/pull/72543

Pull Request resolved: https://github.com/pytorch/pytorch/pull/73125

Reviewed By: bdhirsh

Differential Revision: D34355217

Pulled By: malfet

fbshipit-source-id: 325ecdeaf53183d653b44ee5e6e8839ceefd9200
(cherry picked from commit 71db31748a8adcd8f95d5faf04aaa454e9c4c760)
(cherry picked from commit cfb6c942fed64dbb81ccc4f14b2a6650123af2e1)
2022-02-24 11:37:42 -08:00
3c14fe2151 Introduce an environment variable to change c10 log level (#71746) (#73357)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71746

This PR contains the following improvements:

- It exposes a new environment variable `TORCH_CPP_LOG_LEVEL` that enables users to set the log level of c10 logging facility (supports both GLOG and c10 loggers). Valid values are `INFO`, `WARNING`, `ERROR`, and `FATAL` or their numerical equivalents `0`, `1`, `2`, and `3`.
- It implements an `initLogging()` function and calls it as part of `torch._C` module import to ensure that the underlying logging facility is correctly initialized in Python.

With these changes a user can dynamically set the log level of c10 as in the following example:

```
$ TORCH_CPP_LOG_LEVEL=INFO python my_torch_script.py
```
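As an illustration of the value handling described above (symbolic names `INFO`/`WARNING`/`ERROR`/`FATAL` or their numeric equivalents `0`–`3`), here is a minimal Python sketch of how such an environment value could be normalized. The `parse_log_level` helper is hypothetical and is not PyTorch's actual implementation, which lives in C++:

```python
# Hypothetical sketch: normalize a TORCH_CPP_LOG_LEVEL-style value.
# Accepts the symbolic names INFO/WARNING/ERROR/FATAL or 0-3.
import os

_LEVELS = {"INFO": 0, "WARNING": 1, "ERROR": 2, "FATAL": 3}

def parse_log_level(value):
    """Return the numeric log level for an env-var setting."""
    value = value.strip().upper()
    if value in _LEVELS:
        return _LEVELS[value]
    level = int(value)  # raises ValueError for non-numeric garbage
    if level not in _LEVELS.values():
        raise ValueError(f"invalid log level: {value}")
    return level

if __name__ == "__main__":
    os.environ.setdefault("TORCH_CPP_LOG_LEVEL", "INFO")
    print(parse_log_level(os.environ["TORCH_CPP_LOG_LEVEL"]))  # 0
```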
ghstack-source-id: 149822703

Test Plan: Run existing tests.

Reviewed By: malfet

Differential Revision: D33756252

fbshipit-source-id: 7fd078c03a598595d992de0b474a23cec91838af
(cherry picked from commit 01d6ec6207faedf259ed1368730e9e197cb3e1c6)
2022-02-24 10:46:15 -08:00
055052bf64 Improvements to C10d log (#73358)
* Prefix c10d log messages with `[c10d]` for easier troubleshooting (#73144)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73144

This PR formats c10d log messages written by the `C10D_INFO/WARN/ERROR` macros by prefixing them with the `[c10d]` tag for easier troubleshooting. See #73121 for a specific customer request.

Note though that this is a temporary fix to unblock our users. Ideally our global logging facility should natively support component-based preambles.
ghstack-source-id: 149748943

Test Plan: N/A

Reviewed By: rohan-varma

Differential Revision: D34363975

fbshipit-source-id: 6b8096ac4b2fa344406c866a2e7665541cb60b34
(cherry picked from commit af14aef18d0239f04730545596a05536e0f9c857)

* Refactor TORCH_DISTRIBUTED_DEBUG implementation (#73166)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73166

This PR refactors, cleans up, and optimizes the implementation of `TORCH_DISTRIBUTED_DEBUG`. It also introduces three new user APIs: `get_debug_level()`, `set_debug_level()`, and `set_debug_level_from_env()` to retrieve and modify the debug level after a process has started.
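The shape of the three new APIs can be sketched in Python; the enum values mirror the documented `TORCH_DISTRIBUTED_DEBUG` settings (`OFF`, `INFO`, `DETAIL`), but this is an illustration of the API surface, not PyTorch's code:

```python
# Minimal sketch of a get/set/set-from-env debug-level API as described
# above.  Illustration only -- not torch.distributed's implementation.
import enum
import os

class DebugLevel(enum.IntEnum):
    OFF = 0
    INFO = 1
    DETAIL = 2

_debug_level = DebugLevel.OFF

def get_debug_level():
    return _debug_level

def set_debug_level(level):
    global _debug_level
    _debug_level = DebugLevel(level)

def set_debug_level_from_env():
    """Re-read TORCH_DISTRIBUTED_DEBUG, e.g. after the process started."""
    name = os.environ.get("TORCH_DISTRIBUTED_DEBUG", "OFF").upper()
    set_debug_level(DebugLevel[name])

os.environ["TORCH_DISTRIBUTED_DEBUG"] = "DETAIL"
set_debug_level_from_env()
print(get_debug_level().name)  # DETAIL
```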
ghstack-source-id: 149778566

Test Plan: Run the existing unit tests.

Reviewed By: rohan-varma

Differential Revision: D34371226

fbshipit-source-id: e18443b411adcbaf39b2ec999178c198052fcd5b
(cherry picked from commit 26d6bb1584b83a0490d8b766482656a5887fa21d)

* Introduce debug and trace log levels in c10d (#73167)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73167

This PR adds `C10D_DEBUG` and `C10D_TRACE` macros to enable fine grained logging in c10d. It also updates some log statements of `socket` to make its output less noisy.
ghstack-source-id: 149778567

Test Plan: Manual testing with different socket conditions.

Reviewed By: rohan-varma

Differential Revision: D34371426

fbshipit-source-id: a852b05ec353b18b0540ce5f803666c3da21ddd7
(cherry picked from commit 4519b06ac57f177dfc086bc10e8e1a746ba0870d)

* Make "server socket not listening" warning logs less noisy (#73149)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73149

This PR improves the handling of the "server socket not yet listening" warning log in c10d `socket`. Instead of outputting it after every failed attempt (meaning every second), it is now written every 20 seconds. Note though that if the log level is set to `INFO`, we keep writing a detailed message every second as before with additional `errno` information.

With log level set to `WARN` the output looks like:
```
[W socket.cpp:598] [c10d] No socket on (127.0.0.1, 29501) is listening yet, will retry.
[W socket.cpp:598] [c10d] No socket on (127.0.0.1, 29501) is listening yet, will retry.
...
[E socket.cpp:726] [c10d] The client socket has timed out after 300s while trying to connect to (127.0.0.1, 29501).
```

With log level set to `INFO` (a.k.a. verbose or debug level) the output looks like:
```
[I socket.cpp:515] [c10d] The client socket will attempt to connect to an IPv6 address of (127.0.0.1, 29501).
[I socket.cpp:582] [c10d] The client socket is attempting to connect to [localhost]:29501.
[I socket.cpp:643] [c10d] The server socket on [localhost]:29501 is not yet listening (errno: 111 - Connection refused), will retry.
[W socket.cpp:598] [c10d] No socket on (127.0.0.1, 29501) is listening yet, will retry.
[I socket.cpp:582] [c10d] The client socket is attempting to connect to [localhost]:29501.
[I socket.cpp:643] [c10d] The server socket on [localhost]:29501 is not yet listening (errno: 111 - Connection refused), will retry.
[I socket.cpp:582] [c10d] The client socket is attempting to connect to [localhost]:29501.
[I socket.cpp:643] [c10d] The server socket on [localhost]:29501 is not yet listening (errno: 111 - Connection refused), will retry.
[I socket.cpp:582] [c10d] The client socket is attempting to connect to [localhost]:29501.
[I socket.cpp:643] [c10d] The server socket on [localhost]:29501 is not yet listening (errno: 111 - Connection refused), will retry.
...
[W socket.cpp:598] [c10d] No socket on (127.0.0.1, 29501) is listening yet, will retry.
...
[E socket.cpp:726] [c10d] The client socket has timed out after 300s while trying to connect to (127.0.0.1, 29501).
```
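The throttling behavior above can be sketched as a retry loop that logs a detailed INFO line on every attempt but emits the WARN line only once every `warn_every` attempts. The `connect_once`/`log` helpers are hypothetical stand-ins, not c10d's socket code:

```python
# Sketch of warning-log throttling: INFO on every failed attempt,
# WARN only every 20th attempt.  Illustration only, not c10d code.
def retry_connect(connect_once, log, max_attempts=60, warn_every=20):
    for attempt in range(max_attempts):
        if connect_once():
            return True
        log("INFO", "server socket not yet listening, will retry")
        if attempt % warn_every == 0:
            log("WARN", "no socket is listening yet, will retry")
    return False

messages = []
attempts = iter([False] * 30 + [True])  # 30 failures, then success
retry_connect(lambda: next(attempts), lambda lvl, msg: messages.append(lvl))
print(messages.count("WARN"))  # 2 (attempts 0 and 20), vs. 30 INFO lines
```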
ghstack-source-id: 149778565

Test Plan: Run manual tests to verify the correctness of the log message.

Reviewed By: rohan-varma

Differential Revision: D34365217

fbshipit-source-id: 296d01fa8b1ba803432903c10686d8a75145e539
(cherry picked from commit 8ae5aff0c5ffcc3e87d27d2deba6fedf8cef45cd)

* Rename `_get_debug_mode` to `get_debug_level` in distributed.py
2022-02-24 10:37:41 -08:00
68ef2a2188 Documenting cuda 11.5 windows issue (#73013) (#73312)
Summary:
Adding documentation about compiling extension with CUDA 11.5 and Windows

Example of failure: https://github.com/pytorch/pytorch/runs/4408796098?check_suite_focus=true

 Note: Don't use `torch/extension.h` with CUDA 11.5 under Windows in your C++ code:
    Use the ATen interface instead of the torch interface in all CUDA 11.5 code under Windows; it has been failing with errors due to a bug in nvcc.
    Example use:
```
#include <ATen/ATen.h>
at::Tensor SigmoidAlphaBlendForwardCuda(....)
```
    Instead of:
```
#include <torch/extension.h>
torch::Tensor SigmoidAlphaBlendForwardCuda(...)
```
    Currently open issue for the nvcc bug: https://github.com/pytorch/pytorch/issues/69460
    Complete workaround code example: cb170ac024

Pull Request resolved: https://github.com/pytorch/pytorch/pull/73013

Reviewed By: malfet, seemethere

Differential Revision: D34306134

Pulled By: atalman

fbshipit-source-id: 3c5b9d7a89c91bd1920dc63dbd356e45dc48a8bd
(cherry picked from commit 87098e7f17fca1b98c90fafe2dde1defb6633f49)
2022-02-24 10:34:39 -08:00
9647fb7d18 Use "large" macos for binary builds
Hopefully it will fix the timeout

Pull Request resolved: https://github.com/pytorch/pytorch/pull/73089

(cherry picked from commit 99427654aa86d052420f18b03ee9aa9abcf7e6d0)
2022-02-24 09:55:15 -08:00
e4944871c8 stop sccache server after building (#72794) (#73122)
Summary:
This is to avoid the situation where the directory in which sccache is installed cannot be deleted.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72794

Reviewed By: H-Huang

Differential Revision: D34222877

Pulled By: janeyx99

fbshipit-source-id: 2765d6f49b375d15598586ed83ae4c5e667e7226
(cherry picked from commit 551e21ca582c80d88a466b7bfe4eda9dee0c9a5f)

Co-authored-by: Yi Zhang <zhanyi@microsoft.com>
2022-02-21 11:08:08 -08:00
bbf2c0e3c6 Disable test history as it's fragile
Related to #73083

Pull Request resolved: https://github.com/pytorch/pytorch/pull/73093

(cherry picked from commit 08510ba5e4ae0b53b67f0fbbc9f53b35aec9902c)
2022-02-18 15:01:40 -08:00
e6e8877bc2 avoiding adding some functions to the public python API before 1.11 release (#72543) (#72913)
cherry-picked for 1.11 release

(cherry picked from commit 6676a0c79a3b2bc1aa95e09e91eb92a6eca6b764)
2022-02-18 14:49:13 -08:00
eaa80c6fd8 [DataPipe] Adding usage examples for IterDataPipes (#73036)
Adding usage examples for IterDataPipes, with additional improvements to the descriptions of `groupby`, `IterDataPipe`, and `MapDataPipe`.

Differential Revision: [D34313793](https://our.internmc.facebook.com/intern/diff/D34313793)
2022-02-18 14:38:05 -08:00
7fa092949e [NNC] TensorExprKernel state should not be modified on calls to run methods (#73029)
A typical use case for `TensorExprKernel` is to create the kernel once and call it multiple times, possibly in parallel. For the parallel calls to work, we need to ensure that the run() method calls do not change any state in `TensorExprKernel`.

Before this change, the `run()` method was modifying the sizes and strides vectors when dynamic shapes were present. This manifested as a data race when running a model with Static Runtime.
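The fix can be illustrated with a small Python sketch (not NNC code): `run()` resolves the concrete sizes for each call into locals instead of mutating state shared between concurrent callers, so parallel invocations don't race:

```python
# Toy illustration of the thread-safety pattern described above: keep the
# kernel immutable after construction and compute per-call data locally.
import threading

class Kernel:
    def __init__(self, symbolic_sizes):
        self.symbolic_sizes = symbolic_sizes  # never mutated by run()

    def run(self, bindings):
        # Resolve dynamic dims into a per-call local -- no shared mutation.
        return [bindings.get(d, d) for d in self.symbolic_sizes]

kernel = Kernel(["batch", 3, 224])
results = []
lock = threading.Lock()

def worker(batch):
    out = kernel.run({"batch": batch})
    with lock:
        results.append(out)

threads = [threading.Thread(target=worker, args=(b,)) for b in (1, 2, 4, 8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(r[0] for r in results))  # [1, 2, 4, 8]: each call saw its own sizes
```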
ghstack-source-id: 149398820

Differential Revision: [D34287960](https://our.internmc.facebook.com/intern/diff/D34287960/)

Co-authored-by: Raghavan Raman <raghavanr@fb.com>
2022-02-18 14:31:59 -08:00
74cd18623e Fix doc regressions for various modules and functional forms (#73014) (#73049)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73014

Fixes #72501
Fixes #72502
Fixes #72503
Fixes #72504
Fixes #72505
Fixes #72506
Fixes #72507
Fixes #72509
Fixes #72510

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D34305640

Pulled By: jbschlosser

fbshipit-source-id: 62f341633fdb0316eaa346cf7247865290eb830a
(cherry picked from commit 8362d264e7b2c0c2bd5d688a87bf4f8f0bf60f0f)

Co-authored-by: Joel Schlosser <jbschlosser@fb.com>
2022-02-18 08:23:21 -08:00
dad4c2d032 Fix sequence_ops_test (#72844) (#73017) 2022-02-17 11:31:27 -08:00
565742cb63 [CircleCI] Re-enable nightly android builds (#73027)
A stop-gap measure to re-enable publishing of Android maven packages by
CI, see https://github.com/pytorch/pytorch/issues/72902

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72903

(cherry picked from commit 3493646f7636046c603921ef9a8b5c3fc635f39f)
Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

Co-authored-by: Nikita Shulga <nshulga@fb.com>
2022-02-17 14:27:55 -05:00
7acb591cf9 Add docstrings to native_channel_shuffle (#72954)
ghstack-source-id: 9288da6390b5e5702c250788a2644ec6ad32df3c
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72919
2022-02-17 08:07:08 -05:00
a5d5a6ad4f Set BLAS_LIBRARIES to ${MKL_LIBRARIES} for MKL case (#72806) (#72959)
This reverts a [suggestion](https://github.com/pytorch/pytorch/pull/49647#discussion_r677737470) proposed in https://github.com/pytorch/pytorch/pull/49647,

which is somehow sufficient to work around the symptoms of https://github.com/pytorch/pytorch/issues/72653

I.e. before this change, `BLAS_LIBRARIES` were set to `caffe2::mkl`
which is an interface library with link property set as follows:
59dd84cab6/cmake/public/mkl.cmake (L10-L12)
2022-02-17 07:54:33 -05:00
ea5089751f [JIT] API Changes for dynamic fusion (#72937)
* Move dyn fusion api to jit/api/module/

ghstack-source-id: 5597012c7381629ed478c10925b1b08eed1a32bf
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72638

* Make fusion strategy api public

ghstack-source-id: b2ede61e046297f9f6132c3afd23e88b33d5b4eb
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72639

Co-authored-by: Elias Ellison <eellison@devfair044.h1.fair>
2022-02-16 15:12:09 -08:00
d216c83667 [release/1.11] Create a CI workflow for XLA tests using the XLA test image (#72938)
* Create a CI workflow for XLA tests using the XLA test image (#72496)

Summary:
This PR resolves https://github.com/pytorch/pytorch/issues/72693

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72496

Reviewed By: H-Huang

Differential Revision: D34255441

Pulled By: seemethere

fbshipit-source-id: fdfd54fbd59ef7266a78c9f729c1d5b6ed25e9d6
(cherry picked from commit ba14f0ee6cfa2fe248784d2dc5d54e427aef6bf7)

* Update .github/workflows/generated-pytorch-xla-linux-bionic-py3.7-clang8.yml

Fixes lint

Co-authored-by: Nikita Shulga <nikita.shulga@gmail.com>
2022-02-16 15:10:16 -08:00
f7e0ca546c add optional encoding argument to fileopener so users can open files in non-default encodings. (#72800)
Co-authored-by: Elijah Rippeth <elijah.rippeth@gmail.com>
Co-authored-by: Nikita Shulga <nshulga@fb.com>
2022-02-16 13:17:11 -08:00
89ee69e173 Rename Typed/UntypedStorage to _Typed/_UntypedStorage (#72540) (#72914)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72540

Reviewed By: jbschlosser

Differential Revision: D34216823

Pulled By: bdhirsh

fbshipit-source-id: 1bc9930ab582771ebf02308e035576cd1a0dbe47
(cherry picked from commit 329238f612a9d92586bb0e5b33bcc45a0ec6936b)

Co-authored-by: Kurt Mohler <kmohler@quansight.com>
2022-02-16 12:24:21 -08:00
e0aad8e864 [quant][core][docs] Add docs for torch.quantize_per_tensor_dynamic (#72311) (#72929)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72311

att

Test Plan:
doc page in github

Imported from OSS

Reviewed By: bdhirsh

Differential Revision: D33996034

fbshipit-source-id: 797f7a55176e9219586d16142ca351c5c9cbe828
2022-02-16 12:22:12 -08:00
28ad47f553 [ONNX] Fix lstm reshape shape inference regression (#72734)
Fixes #72399
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72532

Co-authored-by: BowenBao <bowbao@microsoft.com>
2022-02-15 11:04:47 -08:00
2cc3c2ef38 [1.11][DataPipe] Docs Improvement (#72801)
* [DataPipe] Fixing MapDataPipe docstrings

[ghstack-poisoned]

* [DataPipe] Fixing IterDataPipe docstrings

[ghstack-poisoned]

* [DataPipe] Add docstrings for IterDataPipe and MapDataPipe, along with small doc changes for consistency

[ghstack-poisoned]
2022-02-15 08:24:38 -05:00
b0037f707f pad_sequence: fix regression - support tensor (#72436) (#72697)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/71365

Based on https://github.com/pytorch/pytorch/pull/72343

Thanks jbschlosser

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72436

Reviewed By: bdhirsh

Differential Revision: D34117724

Pulled By: jbschlosser

fbshipit-source-id: e5d6599d0791025e18ab36ae16c417a91554bf64
(cherry picked from commit ffe8a0e41b7906920e392a9588347215ac44f46f)

Co-authored-by: kshitij12345 <kshitijkalambarkar@gmail.com>
2022-02-14 08:44:45 -05:00
b6a3176c1c Cat shape analysis fix for -1 dim (#72678)
ghstack-source-id: b4e1e8b74889653d70b6111de71797c2e10f347d
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72616

Co-authored-by: Elias Ellison <eellison@devfair044.h1.fair>
2022-02-14 08:42:51 -05:00
5fac320809 Fix refcounting in access of saved for forward attribute (#72627) (#72656)
Summary:
fix https://github.com/pytorch/pytorch/issues/72612

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72627

Reviewed By: soulitzer

Differential Revision: D34119834

Pulled By: albanD

fbshipit-source-id: 893a1e88a738eb40072af2106527340aea1d0006
(cherry picked from commit 511a1f16c5e37f4946907bc89b246eb684b89428)

Co-authored-by: albanD <desmaison.alban@gmail.com>
2022-02-14 08:38:16 -05:00
1f406fe91d Pin builder repo for GHA builds to release/1.11 (#72739)
* Builder repo is not pinned in release branch

* Updated workflows
2022-02-11 15:29:26 -05:00
6a46b2e2aa Fix for builder repo not pinned in release branch (#72719) (#72732)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/72655

Please note: the Readme.md change will be made after this change and the release-specific change land, so that the readme can reference the commit of the release-specific change as an example

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72719

Reviewed By: seemethere

Differential Revision: D34177045

Pulled By: atalman

fbshipit-source-id: 2abb7af8cf1337704933c19c0d06022034ec77b4
(cherry picked from commit 31ff276d5e2cacc0e0592d624f3d486d5e8cfd1c)
2022-02-11 11:41:24 -08:00
503a0923d3 Fix tagged build detection for binary builds (#72628) (#72652)
Summary:
Should fix the following [error](https://github.com/pytorch/pytorch/runs/5058514346#step:13:88):
```
+ git --git-dir /pytorch/pytorch/.git describe --tags --match 'v[0-9]*.[0-9]*.[0-9]*' --exact
fatal: not a git repository: '/pytorch/pytorch/.git'
```
By setting `workdir` correctly for GHA linux and Windows builds

Also, abort `tagged_version` if GIT_DIR does not exist (as this script should only be executed in the context of a git folder).
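The guard can be sketched as follows: bail out of tagged-version detection when the git directory is missing instead of letting `git describe` fail. The helper name and paths are illustrative, not the actual build-script code:

```python
# Sketch: only run `git describe` when a .git directory actually exists.
import os
import subprocess

def tagged_version(repo_root):
    git_dir = os.path.join(repo_root, ".git")
    if not os.path.isdir(git_dir):
        return None  # not a git checkout; skip tag detection
    try:
        return subprocess.check_output(
            ["git", "--git-dir", git_dir, "describe", "--tags",
             "--match", "v[0-9]*.[0-9]*.[0-9]*", "--exact"],
            text=True,
        ).strip()
    except subprocess.CalledProcessError:
        return None  # HEAD is not at an exact matching tag

print(tagged_version("/nonexistent"))  # None
```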

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72628

Reviewed By: atalman

Differential Revision: D34120721

Pulled By: malfet

fbshipit-source-id: 035e93e243e601f9c24659cd247f9c029210fba5
(cherry picked from commit 3a6c97b6ddb185d706494f64423a761fee8fce09)
(cherry picked from commit b6df02bbbb5b786b198938ffb5d90fa5251df3eb)
2022-02-10 07:17:32 -08:00
6641e9b75f Fix SVD error code handling for OpenBLAS 0.3.15+ and MKL 2022+ (again) (#72357) (#72513)
Summary:
This PR was opened as copy of https://github.com/pytorch/pytorch/pull/68812 by request https://github.com/pytorch/pytorch/pull/68812#issuecomment-1030215862.

-----

Fixes https://github.com/pytorch/pytorch/issues/67693.

Reference LAPACK (used in OpenBLAS) changed the `info` error code for SVD when inputs contain non-finite numbers. In PyTorch, we raise an internal assert error for negative `info` codes because they usually indicate an incorrect implementation. However, this is no longer the case for SVD in newer versions of LAPACK. MKL (2021.4.0 was tested) still gives a positive error code for this kind of input. This change aligns our code with both the OpenBLAS and MKL behavior.

MKL 2022 uses the latest reference LAPACK behavior and returns the same `info` as OpenBLAS 0.3.15+.
This PR also fixes https://github.com/pytorch/pytorch/issues/71645, which is due to the updated MKL version in CI.
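The convention change can be illustrated with a toy checker (not PyTorch's actual code): under the old behavior a negative `info` always meant a bad argument, i.e. an internal bug, while newer reference LAPACK and MKL 2022 can report input-dependent SVD failures through those codes as well, so they must surface as user-facing errors rather than internal assertions:

```python
# Illustration of interpreting a LAPACK-style `info` return code.
# `new_lapack` flags the newer convention described above.
def check_svd_info(info, new_lapack):
    if info == 0:
        return "ok"
    if info < 0 and not new_lapack:
        # Old convention: negative always meant a bad argument -> our bug.
        raise AssertionError(f"internal error: bad argument {-info}")
    # Positive codes -- and negative ones under the newer convention --
    # are input-dependent failures (non-convergence, non-finite input).
    raise ValueError("svd failed: non-convergence or non-finite input")
```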

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72357

Reviewed By: albanD

Differential Revision: D34012245

Pulled By: ngimel

fbshipit-source-id: 2b66c173cc3458d8c766b542d0d569191cdce310
(cherry picked from commit fa29e65611ea5028bf6d2d3c151d79e6c9e4ffef)
2022-02-09 18:58:00 -05:00
4f9f0e7a13 Fix doc build for release branches (#72567) (#72635)
Summary:
Add "v[0-9]+.[0-9]+.[0-9]+-rc[0-9]+" wildcard to tag triggers
Add similar `startsWith(github.event.ref, 'refs/tags/v1.')` for push
conditions
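To illustrate what the release-candidate tag wildcard accepts, here is an equivalent regular-expression check in Python. Note this is an illustration only: the workflow itself uses GitHub Actions' own tag-filter pattern syntax, not Python regexes:

```python
# Equivalent regex for the "v<major>.<minor>.<patch>-rc<n>" tag wildcard.
import re

RC_TAG = re.compile(r"^v[0-9]+\.[0-9]+\.[0-9]+-rc[0-9]+$")

def is_rc_tag(tag):
    return RC_TAG.match(tag) is not None

print(is_rc_tag("v1.11.0-rc3"))  # True
print(is_rc_tag("v1.11.0"))      # False: final release, not an rc tag
```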

Fixes https://github.com/pytorch/pytorch/issues/72519

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72567

Reviewed By: atalman

Differential Revision: D34116048

Pulled By: malfet

fbshipit-source-id: 7ef6ae3972ff7eba213ae9c4eb4afea5a7e11827
(cherry picked from commit 3785553532ccf636e389c97713f2c5bbfec836ba)
2022-02-09 15:55:24 -08:00
0ea924fc98 Disable complex32 (#72604) 2022-02-09 15:51:37 -08:00
5a78725c29 Add missing entry for sampled_addmm in sparse.rst (#72312) (#72514)
Summary:
Let's make the documentation for `torch.sparse.sampled_addmm` searchable in the PyTorch documentation.
This PR shall be cherry-picked for the next 1.11 release.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72312

Reviewed By: davidberard98

Differential Revision: D34045230

Pulled By: cpuhrsch

fbshipit-source-id: c1b1dc907443284857f48c8ce1efab22c6701bbe
(cherry picked from commit 225929ecf20eb369f862b091818f5af16ee78f88)
2022-02-08 10:25:15 -08:00
f72151b900 [ONNX] Resolve attribute error in CI (#72350) (#72439)
Summary:
Tests under `test/onnx/test_models_onnxruntime.py` fail with `AttributeError: 'TestModels' object has no attribute 'onnx_shape_inference'`.

This failure appeared in CI without any code changes to the related files; it is likely due to a different test-case run order. The test code was badly written: if the test class `TestModels_new_jit_API` runs first, it assigns `TestModels.onnx_shape_inference = True`, masking the problem; if `TestModels` runs first, the `AttributeError` is raised.

Fixes https://github.com/pytorch/pytorch/issues/72337

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72350

Reviewed By: jbschlosser, seemethere, janeyx99

Differential Revision: D34010794

Pulled By: malfet

fbshipit-source-id: 816f7bee89ea0251bb5df8f482b68f8dc4823997
(cherry picked from commit b39b23bec5dfd3f2fd24a0d781757c20ff94b1db)

Co-authored-by: BowenBao <bowbao@microsoft.com>
2022-02-07 12:35:32 -08:00
8380187819 Pin librosa (#72440)
Should mitigate https://github.com/pytorch/pytorch/issues/72432
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72433

Co-authored-by: Jane Xu <janeyx@fb.com>
2022-02-07 10:06:57 -08:00
7cc129e60c Remove forcing CUDNN_STATIC when CAFFE2_STATIC_LINK_CUDA (#72290) (#72356)
Summary:
Remove forcing CUDNN_STATIC when CAFFE2_STATIC_LINK_CUDA is set.
Since we are transitioning to dynamic loading for multiple PyTorch dependencies, and cuDNN is the first step in this transition, we want to stop forcing cuDNN to be linked statically and instead load it dynamically.

Tested using following workflow:
https://github.com/pytorch/pytorch/actions/runs/1790666862

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72290

Reviewed By: albanD

Differential Revision: D34003793

Pulled By: atalman

fbshipit-source-id: 41bda7ac019a612ee53ceb18d1e372b1bb3cb68e
(cherry picked from commit 4a01940e681f996017d924b08946188ef352ef41)

Co-authored-by: Andrey Talman <atalman@fb.com>
2022-02-04 14:56:08 -05:00
ff6c348762 Revert "Bump torch version to 1.12 (#72221)"
This reverts commit 0ca0e02685a9d033ac4f04e2fa5c8ba6dbc5ae50.
2022-02-04 11:38:35 -08:00
03a283b2b1 Fix persistent worker exits before pin_memory thread (#72269)
* release 1.11 Install torch from test channel, Pin builder and xla repo (#72217)

* Fix persistent worker exits before pin_memory thread

ghstack-source-id: 2d15b14df2e2d84b309081dffbedc4836495ae95
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71579

Co-authored-by: Andrey Talman <atalman@fb.com>
2022-02-04 11:31:16 -08:00
614e765575 [1.11] Make svd / svdvals fully functorch compatible (#72181) (#72274)
* release 1.11 Install torch from test channel, Pin builder and xla repo (#72217)

* Make svd / svdvals fully functorch compatible (#72181)

Summary:
This should (hopefully) make all the CI from `functorch` go green (including jvp's!) after replacing `VARIADIC_BDIMS_BOXED(_svd_helper);` with `VARIADIC_BDIMS_BOXED(_linalg_svd);` and removing all the skips and xfails associated with `linalg.svdvals`.

Locally, there's just one test that started failing because of this: `test_vmapjvpall_norm_nuc_cpu_float32`. I have no idea what's going on here, but it's a jvp test, so not a regression, and it might very well be caused by the jvp of another operation within `norm_nuc`, as this is a composite operation.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72181

Reviewed By: ngimel

Differential Revision: D33952744

Pulled By: zou3519

fbshipit-source-id: 2a2510d97eed4a0bfc25615264ddd36e38856efe
(cherry picked from commit 5805fa107c3a91c58f8ecc9778cfc87aa7f64233)

Co-authored-by: Andrey Talman <atalman@fb.com>
Co-authored-by: lezcano <lezcano-93@hotmail.com>
2022-02-04 14:29:02 -05:00
7b0e140ecc [1.11] Remove torch.vmap (#65496) (#72275)
* release 1.11 Install torch from test channel, Pin builder and xla repo (#72217)

* [1.11] Remove torch.vmap (#65496)

torch.vmap is a prototype feature and should not be in the stable
binary. This PR:
- Removes the torch.vmap API
- Removes the documentation entry for torch.vmap
- Changes the vmap tests to use an internal API instead of torch.vmap.

Test Plan:
- Tested locally (test_torch, test_autograd, test_type_hints, test_vmap),
but also wait for CI.

Co-authored-by: Andrey Talman <atalman@fb.com>
2022-02-04 11:23:44 -08:00
3fab33e1c9 release 1.11 Install torch from test channel, Pin builder and xla repo (#72217) 2022-02-04 11:15:10 -08:00
e81bfffbe1 [AutoAccept][Codemod][FBSourceClangFormatLinter] Daily arc lint --take CLANGFORMAT
Reviewed By: zertosh

Differential Revision: D33999038

fbshipit-source-id: f6e3f24997eb3e478857341d21fa6aaf9dd3a906
(cherry picked from commit d530a426d4b20475cfb3b2538d0e0e2c017c358b)
2022-02-04 11:14:53 +00:00
6297aa114f [DataPipe] Extend FileLister to support loading multiple directories (#72260)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72260

Test Plan: Imported from OSS

Reviewed By: dagitses, NivekT

Differential Revision: D33979744

Pulled By: ejguan

fbshipit-source-id: 5733d20382642fc2274afd838b33c98150d81e91
(cherry picked from commit f70537ae76448f5898da8d3f4884d0b3a29d40eb)
2022-02-04 07:55:00 +00:00
03260f85ff Update torch flatbuffer usage to OSS version (#71957)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71957

Update users of flatbuffer serializer/loader to use the version in torch/csrc.

Test Plan:
sandcastle

Ran `buck run :test_models -- -k test_aten_relu` passes

Reviewed By: gmagogsfm

Differential Revision: D33720611

fbshipit-source-id: 6cdf7ab43ffca83327a677853be8f4918c47d53d
(cherry picked from commit 4f59e3547e2cd346a3f2310bc2d1f6a931fb826e)
2022-02-04 01:54:15 +00:00
3f6643e661 [FX] Fix default argument handling for Interpreter (#72272)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72272

Test Plan: Imported from OSS

Reviewed By: dagitses

Differential Revision: D33984862

Pulled By: jamesr66a

fbshipit-source-id: 7d89901c2041806df86c9b08f3af731f3afc9100
(cherry picked from commit f79f0e451e1da820cb59cc3267376615699061ea)
2022-02-04 01:46:20 +00:00
dbf09bc088 Sparse: Use per-operator headers (#71115)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71115

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D33949904

Pulled By: malfet

fbshipit-source-id: c49f76fac3fc79385f01da02f32ed526462ab962
(cherry picked from commit 121801ad32fb34fb05c9b7fef3cb1de08aca24b6)
2022-02-04 01:39:48 +00:00
e90f5586d6 Add support for include-what-you-use (#71114)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71114

`include-what-you-use` or `iwyu` is a clang-based tool that looks at
the code's AST to figure out which symbols need to be included and
with the help of user-defined mappings it suggests the include
files that are actually needed.

This is very nice for the per-operator headers build because it give
you a list of exactly the `ATen/ops` headers needed by the file. You
still need to manually write the include-guards etc. but at least this
automates the most tedious part.

The header mappings aren't perfect yet so it will still suggest you
include basic c10 components everywhere instead of taking it
transitively from `TensorBase.h`. However, this does provide some
useful mappings and removes bad include paths from the build system
that were causing bad suggestions.

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D33949901

Pulled By: malfet

fbshipit-source-id: d5b015ef9e168bee4b8717b8e87ccc0608da62a1
(cherry picked from commit ecb2ffb35a5b1509a1275834fbe5c25e60ea1b79)
2022-02-04 01:39:48 +00:00
36385bee78 Remove CUDA Foreach... files dependency on function operators (#68462)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68462

ATen has a header dependency problem. Whenever an operator is added or modified, it changes `ATen/Functions.h` and `ATen/NativeFunctions.h` which in turn requires essentially every single file to be rebuilt. Per-operator headers allow files to only include the specific operators they use and so minimizes unnecessary rebuilds during incremental builds and improves cache hits in CI builds.

See this note for more details:
3a03af2f50/aten/src/ATen/templates/Functions.h (L20)

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D33949899

Pulled By: malfet

fbshipit-source-id: c044c73891eaaa5533dc2fac1b12fcfb1b871312
(cherry picked from commit 3c7f4da61f967b9fc35ecd0dc3e6323a85c300ef)
2022-02-04 01:39:48 +00:00
3c266620a4 [quant][gpu] Adding quantized conv operator in cudnn (#70622)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70622

This PR is the initial PR to add eager mode quantized GPU operator support, we'll start
with convolution, following cudnn fp32 Conv code and the example cudnn frontend code
https://github.com/pytorch/pytorch/pull/51390
https://github.com/NVIDIA/cudnn-frontend/blob/main/samples/fusion_sample.cpp#L557

Test Plan:
python test/test_quantization.py TestQuantizedConv.test_qconv2d_cudnn

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D33409155

fbshipit-source-id: cb5183d274993fcd2c3ab6de8ae022baa9f89f7f
(cherry picked from commit 4fde5559dee2a28907b09f96bc5a8dd259148d2e)
2022-02-04 01:32:55 +00:00
aa04c8edc7 Combine install miniconda routes in Windows GHA
Fixes recent network issues with installing miniconda, as well as cleaning up the logic by removing duplication

e.g., https://github.com/pytorch/pytorch/runs/4858321602?check_suite_focus=true

Test plan:
Windows tests and builds should pass
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72268
2022-02-04 01:29:46 +00:00
8c505bbc86 Make ShardedTensor ctor more inline with torch.Tensor ctor (#72164)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72164

torch.Tensor ctor creates an empty tensor and this PR makes
ShardedTensor on par with that.

In particular we remove TensorInitParams and instead always create an empty
tensor and then fill it in for things like ones, zeros, full, etc. This is
in line with torch.ones etc. as well, since even for those APIs we first create
an empty tensor and then fill it out.
ghstack-source-id: 148318045

Test Plan: waitforbuildbot

Reviewed By: wanchaol

Differential Revision: D33934603

fbshipit-source-id: 5655bbd726f29e74600ebe9f33f9dc5952b528f4
(cherry picked from commit 78b301c78c9d5046e2f0a9818dcbc2cc45e7cdd0)
2022-02-04 01:16:25 +00:00
defde3bb04 [NNC] Use index for stride mapping in kernel.cpp (#72266)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72266

Within the kernel, we may manipulate `Value *` in `OptimizeCat`, which would invalidate the input `Value *` -> Stride mapping.

Fix for https://github.com/pytorch/pytorch/issues/72173
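The failure mode can be sketched in plain Python (this is a toy illustration, not NNC code): keying metadata by the value object breaks once an optimization pass replaces that object, while keying by input index survives the rewrite:

```python
# Why an object-keyed stride map goes stale when a pass swaps values.
class Value:
    def __init__(self, name):
        self.name = name

inputs = [Value("a"), Value("b")]
strides_by_object = {v: (4, 1) for v in inputs}
strides_by_index = {i: (4, 1) for i, _ in enumerate(inputs)}

# An optimization pass (like OptimizeCat) swaps in a rewritten value.
inputs[0] = Value("a_rewritten")

print(inputs[0] in strides_by_object)  # False: the mapping is stale
print(0 in strides_by_index)           # True: index keys stay valid
```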

Test Plan: Imported from OSS

Reviewed By: dagitses, davidberard98

Differential Revision: D33986306

Pulled By: eellison

fbshipit-source-id: dc33cd2b545e49e90d1e46b9fcf1e6dbb4b829db
(cherry picked from commit 5e4555968a0d7b9e42ab6368575137b1c1db814f)
2022-02-04 00:12:38 +00:00
9d4d782e42 remove alwayslink/link_whole from //c10 library (#70997)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70997

This is no longer necessary because the sublibraries that need this
have it specified.
ghstack-source-id: 147786997

Test Plan: Verified manually that this works with Bazel and Buck.

Reviewed By: malfet

Differential Revision: D33477915

fbshipit-source-id: f00f8ac24747711904fe49df4fc9400beec54f3b
(cherry picked from commit 3325437d2b20c398e3edfb389d6d3d3e6ce74d93)
2022-02-04 00:07:27 +00:00
0ca0e02685 Bump torch version to 1.12 (#72221)
Summary:
Bump torch version to 1.12

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72221

Reviewed By: dagitses, atalman

Differential Revision: D33987446

Pulled By: seemethere

fbshipit-source-id: f5fc1c4954ff116baab9e4afe3955c0e7842e6cf
(cherry picked from commit 78d62aa29364d46c816f0b6941ce85246824f85d)
2022-02-04 00:02:28 +00:00
e970160c19 remove //c10:headers dependency from //c10 (#70996)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70996

This is no longer necessary and does not exist internally.
ghstack-source-id: 148159361

Test Plan: Relying on CI.

Reviewed By: malfet

Differential Revision: D33477755

fbshipit-source-id: 7d375a0770d5c6277cfdea4bb0e85a9b2b4f40cd
(cherry picked from commit 360f9a548c2e4cde1b97b5902ca62a8e43af4070)
2022-02-03 22:55:04 +00:00
4fc6ab5e81 [DataPipe] Fix OOM when traverse IterDataPipe due to pickling (#72209)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72209

This PR would fix https://fburl.com/jtfyksyr

Test Plan: Imported from OSS

Reviewed By: dagitses, NivekT

Differential Revision: D33955933

Pulled By: ejguan

fbshipit-source-id: 120203a3c2323a0c7081715bb6628d1768f8b1c4
(cherry picked from commit 469f3d056276ad34c31dad4a3ba5c570f511854a)
2022-02-03 22:55:04 +00:00
90458004cb move //c10/cuda/test to shared build structure (#71429)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71429

Note that this was untested in OSS Bazel.
ghstack-source-id: 148159363

Test Plan: Tested locally. Rely on CI to validate.

Reviewed By: malfet

Differential Revision: D33638407

fbshipit-source-id: 12ae383ccadc1375b92d9c6a12d43821e48f9dcb
(cherry picked from commit 12be8c195ce11d9697264b1423d1e7ad28a915cb)
2022-02-03 22:33:41 +00:00
2d4513e4c4 [PyTorch] MHA: remove TODO about transposing second arg to linear() (#72229)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72229

linear() transposes its second argument internally.
ghstack-source-id: 148259782

Test Plan: CI, review

Reviewed By: zrphercule

Differential Revision: D33962112

fbshipit-source-id: fedb2139d652ab933f9a0829db6060d14e3e3c7a
(cherry picked from commit 2b4259db0b62bcfca089352fb6d76d100616d2d2)
2022-02-03 21:34:53 +00:00
286f5a51f9 move //c10:tests target to the shared //c10/test package (#70928)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70928

ghstack-source-id: 148159366

Test Plan: Ensured that the same number of tests are found and run.

Reviewed By: malfet

Differential Revision: D33455272

fbshipit-source-id: fba1e3409b14794be3e6fe4445c56dd5361cfe9d
(cherry picked from commit b45fce500aa9c3f69915bf0857144ba6d268e649)
2022-02-03 20:14:57 +00:00
c965b47995 add ZeroTensor specialization to div.Tensor (#71862)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/70027.  Fixes https://github.com/pytorch/pytorch/issues/69925.  Reverts https://github.com/pytorch/pytorch/issues/70061.  Completes a task from https://github.com/pytorch/pytorch/issues/69687.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71862

Reviewed By: george-qi

Differential Revision: D33824078

Pulled By: anjali411

fbshipit-source-id: 114754161077281f9804476288ed796d344ee54b
(cherry picked from commit 55ce1d5c02b5869497d07e916c46b6f50287542c)
2022-02-03 20:08:15 +00:00
69c9bc9d16 [PTE] Adding prim::to_list to be emitted (#72238)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72238

Adding a missed operator so it is emitted either as an operator (version 7 and below) or as an instruction (version 8 and above)
ghstack-source-id: 148278722

Test Plan: CI

Reviewed By: JacobSzwejbka

Differential Revision: D33970756

fbshipit-source-id: 876f0ea48dde2ee93fa40d38a264181e2fcf42ce
(cherry picked from commit f2666f99acaf9efa1a066f22319962e841209d54)
2022-02-03 19:30:17 +00:00
6d9c0073a8 create //c10/cuda library (#70863)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70863

ghstack-source-id: 148159368

Test Plan: Ought to be a no-op: rely on CI to validate.

Reviewed By: malfet

Differential Revision: D33367290

fbshipit-source-id: cb550538b9eafaa0117f94077ebd4cb920688881
(cherry picked from commit 077d9578bcbf5e41e806c6acb7a8f7c622f66fe9)
2022-02-03 19:17:18 +00:00
38f696c0cd [nnc] Add an API to unroll loops by a given factor (#72071)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72071

Reviewed By: ngimel

Differential Revision: D33946250

Pulled By: navahgar

fbshipit-source-id: 3f3f92054174620025a9d71154d006f1738953e2
(cherry picked from commit d8b53598e92e8d2e050bc1d0cd070fbe8e2d77dd)
2022-02-03 18:41:21 +00:00
8f4f0ba7e5 use random number in shared file name (#72232)
Summary:
A tentative fix for https://github.com/pytorch/pytorch/issues/67864
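A minimal sketch of the idea (hypothetical names, not the actual fix): mix a random number into the shared file name, in addition to the pid, so concurrent workers no longer collide on the same name:

```python
import os
import random
import tempfile

def shared_file_name(prefix="torch_shm"):
    # Append a random 32-bit suffix so two processes that would otherwise
    # derive the same name do not race on the same shared file.
    suffix = random.getrandbits(32)
    return os.path.join(tempfile.gettempdir(), f"{prefix}_{os.getpid()}_{suffix:08x}")
```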

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72232

Reviewed By: mruberry

Differential Revision: D33966413

Pulled By: ngimel

fbshipit-source-id: 8204c2bf6cc3c5722f8e54652d7be257355162c2
(cherry picked from commit 2af7cfcf4e1934e0867cae6f98d4a87e0f35f5a1)
2022-02-03 18:16:27 +00:00
38ebb776a4 Fail with unexpected success for fatal errors (#72016)
Summary:
The rest of the tests from the CUDA test suite are skipped after GPU context corruption is encountered.
For tests decorated with `expectedFailure`, this creates the false impression that the entire test suite is passing.
Remedy this by suppressing the exception and printing a warning about the unexpected success when `should_stop_early` is true.
Also print a warning when this happens (to make attribution easier), as well as when the condition is first detected.
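The suppression logic can be sketched as follows (names illustrative, not the actual `common_device_type.py` code):

```python
import warnings

def run_expected_failure(test_fn, should_stop_early):
    # If the suite is terminating early due to a fatal error, an
    # expectedFailure test is reported as an unexpected success instead of
    # silently "passing" via the swallowed exception.
    try:
        test_fn()
    except RuntimeError:
        if should_stop_early():
            warnings.warn("Suppressed expected failure that resulted in fatal error")
            return "unexpected success"
        return "expected failure"
    return "unexpected success"
```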

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72016

Test Plan:
`python test_ops.py -v  -k test_fn_fwgrad_bwgrad_gradient`
Before the change:
```
test_fn_fwgrad_bwgrad_gradient_cpu_complex128 (__main__.TestGradientsCPU) ... ok
test_fn_fwgrad_bwgrad_gradient_cpu_float64 (__main__.TestGradientsCPU) ... ok
test_fn_fwgrad_bwgrad_gradient_cuda_complex128 (__main__.TestGradientsCUDA) ... expected failure

----------------------------------------------------------------------
Ran 3 tests in 0.585s
OK (expected failures=1)
```

After the change:
```
test_fn_fwgrad_bwgrad_gradient_cpu_complex128 (__main__.TestGradientsCPU) ... ok
test_fn_fwgrad_bwgrad_gradient_cpu_float64 (__main__.TestGradientsCPU) ... ok
test_fn_fwgrad_bwgrad_gradient_cuda_complex128 (__main__.TestGradientsCUDA) ... /home/conda/miniconda3/lib/python3.9/site-packages/torch/testing/_internal/common_utils.py:1670: UserWarning: TEST SUITE EARLY TERMINATION due to torch.cuda.synchronize() failed with CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
  warn(f"TEST SUITE EARLY TERMINATION due to torch.cuda.synchronize() failed with {rte}")
/home/conda/miniconda3/lib/python3.9/site-packages/torch/testing/_internal/common_device_type.py:382: UserWarning: Suppressed expected failure that resulted in fatal error
  warn("Suppressed expected failure that resulted in fatal error")
unexpected success

----------------------------------------------------------------------
Ran 3 tests in 0.595s

FAILED (unexpected successes=1)
```
And `stderr` from XML file contains requested info:
```
/home/conda/miniconda3/lib/python3.9/site-packages/torch/testing/_internal/common_utils.py:1670: UserWarning: TEST SUITE EARLY TERMINATION due to torch.cuda.synchronize() failed with CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
  warn(f"TEST SUITE EARLY TERMINATION due to torch.cuda.synchronize() failed with {rte}")
/home/conda/miniconda3/lib/python3.9/site-packages/torch/testing/_internal/common_device_type.py:382: UserWarning: Suppressed expected failure that resulted in fatal error
  warn("Suppressed expected failure that resulted in fatal error")
```

Fixes https://github.com/pytorch/pytorch/issues/71973

Reviewed By: janeyx99, ngimel

Differential Revision: D33854287

Pulled By: malfet

fbshipit-source-id: dd0f5a4d2fcd21ebb7ee50ce4ec4914405a812d0
(cherry picked from commit 0c0baf393158b430e938ff3be3f4b59f85620e35)
2022-02-03 17:49:59 +00:00
7c182f785c [Quant][devs] Separated implementations for quantized & non-quantized tensors for the unsqueeze function (#71648)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71648

This PR is part of a series of PRs addressing https://github.com/pytorch/pytorch/issues/54150,
related to using dispatcher for calls to quantized backends as opposed to if/else conditionals.
This particular PR separates the calls to quantized & non-quantized backends for unsqueeze
using a dispatcher.

Test Plan:
Additional testing was not implemented because test cases in the existing test suite already make use of the unsqueeze function for various backends.

Differential Revision: D33809041

Reviewed By: albanD, jerryzh168

Pulled By: dzdang

fbshipit-source-id: 304d3311bc88e9bdc0ebc600e4da8e3e661134ad
(cherry picked from commit 978604a03e95f2ec7b542fad60264b61c440e9b9)
2022-02-03 17:36:28 +00:00
85591dc85d Test 0->0 correspondence for Unary Ops with Sparse CSR inputs (#70302)
Summary:
Since there is no rule in PyTorch (Sparse CSR) for filling in zeros, it was decided that only ops that preserve the 0->0 correspondence will be supported. This PR adds a test to ensure that rule is not broken.

`sample_inputs_unary` may or may not generate a zero in the sample input. Hence, this separate test is good for validating the rule, and the support for Sparse CSR.
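The 0->0 rule can be illustrated with plain math functions (this is an illustration only, not the test added by this PR): an op qualifies when f(0) == 0, so it can act on the stored values alone without densifying the implicit zeros.

```python
import math

# Zero-preserving unary ops keep sparsity: the implicit zeros stay zero.
zero_preserving = [math.sin, math.tanh, math.sqrt, abs]
for f in zero_preserving:
    assert f(0.0) == 0.0

# cos violates the rule (cos(0) == 1), so applying it to a sparse CSR tensor
# would have to materialize every implicit zero.
assert math.cos(0.0) == 1.0
```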

cc nikitaved pearu cpuhrsch

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70302

Reviewed By: albanD

Differential Revision: D33922501

Pulled By: cpuhrsch

fbshipit-source-id: 10f67a220b95a8e75205345a33744ad536fdcf53
(cherry picked from commit ade9bf781852af7be98bd254ec5117ebdd89ec31)
2022-02-03 16:53:27 +00:00
f03734d337 [Quant][devs] Separated implementations for quantized & non-quantized tensors for the squeeze function (#71639)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71639

This PR is part of a series of PRs addressing https://github.com/pytorch/pytorch/issues/54150,
related to using dispatcher for calls to quantized backends as opposed to if/else conditionals.
This particular PR separates the calls to quantized & non-quantized backends for squeeze
using a dispatcher.

Test Plan:
Additional testing was not implemented because test cases in the existing test suite already make use of the squeeze function for various backends.

Differential Revision: D33798546

Reviewed By: jerryzh168

Pulled By: dzdang

fbshipit-source-id: 549cd7b16afb2e93ff453c9b256bab6ce73d57ce
(cherry picked from commit 193591c072e1241445dc1b67bffd925af52e330f)
2022-02-03 16:45:00 +00:00
bafe440d14 [Quant][devs] Added early return statement in squeeze_qtensor when input tensor has zero dimensions or the input dim is not size 1 (#71876)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71876

This PR is part of a series of PRs addressing https://github.com/pytorch/pytorch/issues/54150,
related to using dispatcher for calls to quantized backends as opposed to if/else conditionals.
This particular PR doesn't address any dispatcher issues but is the first of 2 stacked PRs that addresses separating
the implementations for quantized & non-quantized squeeze functions.

Differential Revision: D33798473

Test Plan: Imported from OSS

Reviewed By: jerryzh168

Pulled By: dzdang

fbshipit-source-id: d3502eff89c02a110d3d12e6e3d3fab496197842
(cherry picked from commit 2456f7d627d781f9abbe26b22915482682861c7b)
2022-02-03 16:45:00 +00:00
35e25258ca Fix lint (#72261)
Summary:
fixes https://www.torch-ci.com/minihud?name_filter=Lint%20/%20quick-checks

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72261

Reviewed By: ejguan

Differential Revision: D33979857

Pulled By: janeyx99

fbshipit-source-id: b0a074ff35c61f9e1e52ed5a05d2be26158f19bf
(cherry picked from commit 7ae13a749bd74dc4214a1c51437fd01510a18280)
2022-02-03 16:35:59 +00:00
bae40bc764 [Quant][devs] Separated implementations for quantized & non-quantized tensors in index_select_cpu_ (#71900)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71900

This PR is part of a series of PRs addressing https://github.com/pytorch/pytorch/issues/54150,
related to using dispatcher for calls to quantized backends as opposed to if/else conditionals.
This particular PR separates the calls to quantized & non-quantized backends for index_select_cpu_
using a dispatcher.

Differential Revision: D33809857

Test Plan: Imported from OSS

Reviewed By: albanD

Pulled By: dzdang

fbshipit-source-id: 3792a139c3c98e3a22b29304eeef593a091cf928
(cherry picked from commit 88550e01b8ec25a641e8ca751cbef62064d71ac9)
2022-02-03 16:00:30 +00:00
b62827b81a [Quant][devs] Separated implementations for quantized & non-quantized tensors in fill_ (#71939)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71939

This PR is part of a series of PRs addressing https://github.com/pytorch/pytorch/issues/54150,
related to using dispatcher for calls to quantized backends as opposed to if/else conditionals.
This particular PR separates the calls to quantized & non-quantized backends for fill_
using a dispatcher.

Differential Revision: D33827371

Test Plan: Imported from OSS

Reviewed By: jerryzh168

Pulled By: dzdang

fbshipit-source-id: d034f83de844ef777a2d71e5464f582cba634550
(cherry picked from commit 9f38385051e41a32ccc631dc3354caa03188649b)
2022-02-03 15:52:39 +00:00
5b7c72101c [Quant][devs] Removed check for is_quantized in dequantize_cpu_or_cuda (#71958)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71958

This PR is part of a series of PRs addressing https://github.com/pytorch/pytorch/issues/54150,
related to using dispatcher for calls to quantized backends as opposed to if/else conditionals.
This particular PR isn't dispatcher related, but it does remove the extraneous torch check for a quantized tensor,
since the dispatcher already handles the quantized backend for this particular function.

Differential Revision: D33833765

Test Plan: Imported from OSS

Reviewed By: ngimel

Pulled By: dzdang

fbshipit-source-id: c3bb531a5c09326bdf724b5185a19ea0a379bba7
(cherry picked from commit f053b8248f895446f6a9d352de4038df6c6d4b2d)
2022-02-03 15:36:15 +00:00
6474195ec8 [Quant][devs] Changed empty_quantized call for quantized tensor to resize_output (#71899)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71899

This PR is part of a series of PRs addressing https://github.com/pytorch/pytorch/issues/54150,
related to using dispatcher for calls to quantized backends as opposed to if/else conditionals.
This particular PR removes the call to empty_quantized for quantized tensors and substitutes
resize_output, which works for quantized tensors, based on current understanding.
Using the dispatcher for this function was determined to be impractical, as it would entail
a significant amount of duplicate code.

Differential Revision: D33809138

Test Plan: Imported from OSS

Reviewed By: jerryzh168

Pulled By: dzdang

fbshipit-source-id: 5bacea37356547ceacea4b3f6b0141ac3a223dcf
(cherry picked from commit 3bb82ff3040c9a7905a3cfe8a57c69cfe0721955)
2022-02-03 15:36:15 +00:00
2291981eea Fix backwardcompat definitions after #72200 (#72255)
Summary:
Adds `aten::native_multi_head_self_attention` to allowlist

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72255

Reviewed By: dagitses

Differential Revision: D33978360

Pulled By: malfet

fbshipit-source-id: cf24aaf82f5da07510adc04b6d76ea002d0136eb
(cherry picked from commit f98e86c044d18b187d35a000f2cf58c72e8d7c44)
2022-02-03 15:25:07 +00:00
b26801276f Fix quicklints (#72256)
Summary:
Regression introduced by https://github.com/pytorch/pytorch/pull/71854

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72256

Reviewed By: dagitses

Differential Revision: D33978449

Pulled By: malfet

fbshipit-source-id: 32c99cc95f0e1011a5d241875d47a78f6c869e0e
(cherry picked from commit ffa0aa8530c5ca8e0a17634d6e3de0c88e674ed6)
2022-02-03 15:17:35 +00:00
0bb3158eae [SR] Implement prim::CreateObject (#71854)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71854

Support `prim::CreateObject` - this is a native interpreter instruction, so we can't fall back to the JIT for this op.

Test Plan: New unit test exercises creating and modifying custom objects

Reviewed By: d1jang

Differential Revision: D33783759

fbshipit-source-id: 8185ff71b5d441597d712a5d4aab7fc4dddf7034
(cherry picked from commit bd3f52d8e2cd8e20a8d66e2d2b802c1d92088e4e)
2022-02-03 12:18:46 +00:00
cff5e22a72 [SR] Relax aten::__is__ constraint for SR enablement (#71807)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71807

There's no need to completely disallow `aten::__is__` and `aten::__isnot__`. The only problematic case is when the comparison is between two tensors, e.g. in

```
def forward(x):
    y = x.detach()
    # Should be false, but we get True
    # after our EliminateNoOps pass
    return x is y
```

Test Plan: New unit test covers this case

Reviewed By: d1jang

Differential Revision: D33783668

fbshipit-source-id: c9f57fa96937ecce38a21554f12b69c45cc58fe4
(cherry picked from commit 019588f4ca3fcd2b3ae51bccab102f0538745b15)
2022-02-03 12:18:46 +00:00
7cdbbfaee2 Revert D33716716: [pytorch][PR] Added remove_duplicate parameter to nn.Module
Test Plan: revert-hammer

Differential Revision:
D33716716 (7e8217549f)

Original commit changeset: ff1ed9980bd1

Original Phabricator Diff: D33716716 (7e8217549f)

fbshipit-source-id: 91c3d9acc5bc731da716dd0d2485431f85f861c9
(cherry picked from commit c81d193bf0fccbffdc009255bc85d0c287c1e409)
2022-02-03 09:04:29 +00:00
88547396eb [PT-D] Enable megatron-lm style MLP layers (Changes mainly on sharded linear op) (#69735)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69735

We want to build a prototype of Megatron-LM so that we can apply PT-D ops to models like transformers and other Meta flagship models.

The basic idea of Megatron-LM is as follows:
1. Col-wise sharding of linear weight. Perform the linear op for the first layer.
2. Perform a math op (optional), such as ReLU or GeLU. We use GeLU in our example unit test. The input is from step 1.
3. Row-wise sharding of linear weight. Perform the linear op for the second layer. The input is from step 2.

We then save communications to concatenate the col-wise sharding results and spreading the input to different ranks for row-wise sharding.
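The sharding scheme above can be sketched in plain Python (illustrative only, not the PT-D API): column-shard the first weight, apply the nonlinearity per shard, row-shard the second weight, then sum the partial outputs.

```python
# Tiny two-rank Megatron-LM-style MLP: relu(x @ W1) @ W2 computed as a sum
# of per-shard partial results, with no concat between the two layers.
def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def relu(m):
    return [[max(0.0, v) for v in row] for row in m]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

# x: 1x4 input, W1: 4x4, W2: 4x2
x = [[1.0, -2.0, 3.0, 0.5]]
W1 = [[(i + j) % 3 - 1.0 for j in range(4)] for i in range(4)]
W2 = [[(i * j) % 2 + 0.5 for j in range(2)] for i in range(4)]

# Unsharded reference: relu(x @ W1) @ W2
ref = matmul(relu(matmul(x, W1)), W2)

# Rank 0 holds columns 0-1 of W1 and rows 0-1 of W2; rank 1 holds the rest.
W1_cols = [[row[:2] for row in W1], [row[2:] for row in W1]]
W2_rows = [W2[:2], W2[2:]]

# Each rank computes relu(x @ W1_shard) @ W2_shard; the partials sum to the
# full result because relu is elementwise.
partials = [matmul(relu(matmul(x, c)), r) for c, r in zip(W1_cols, W2_rows)]
out = add(partials[0], partials[1])
assert all(abs(a - b) < 1e-9 for ra, rb in zip(out, ref) for a, b in zip(ra, rb))
```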

The changes are as follows:
1. Return a ShardedTensor for the col-wise sharding in the sharded_linear op.
2. Return a PartialTensor for the row-wise sharding in the sharded_linear op.
3. Leverage APIs already defined for `reshard` to merge/aggregate local results to a fully sync local result if needed.
4. Add helper function to create sharded tensor based on the local result.
5. Add a unit test to test the Megatron-LM idea mentioned above and compare with local ops, including the grad and optimizer so that we can ensure the correctness of the implementation.
6. Refactor the unit test of sharded linear to reflect the changes in the code.
ghstack-source-id: 148273049

Test Plan: Unit test + CI

Reviewed By: pritamdamania87

Differential Revision: D32978221

fbshipit-source-id: 565fc92e7807e19d53b0261f8ace3945bef69e3e
(cherry picked from commit 344abe75202493c8313502e1b22d634568e1b225)
2022-02-03 06:12:15 +00:00
19d0de8a57 [PT-D][RFC] Resharding related API implement for ShardedTensor and Partial Tensor (#70079)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70079

We defined a new concept named `PartialTensor`, which is an abstraction to represent Tensors that need aggregation across multiple devices and multiple processes.

We also defined an API `reshard_output` to reshard a `PartialTensor` to a `Tensor`, or reshard a `ShardedTensor` to a `ShardedTensor`/`Tensor`. This is done via the class `ModuleResharder`, which acts as a wrapper of the original module plus a reshard in the final step.

The `reshard` logic is defined in each class (`ShardedTensor` and `PartialTensor`).
ghstack-source-id: 148273050

Test Plan: Unit test is in the next PR.

Reviewed By: pritamdamania87

Differential Revision: D33121037

fbshipit-source-id: 5f56617ea526b857c5b73df6e069697d428ec359
(cherry picked from commit 58b1457cbcfc9c0bfb3083ef07fbc9e60f0ba51e)
2022-02-03 05:26:02 +00:00
541773d268 Make native MHA private for release 1.11 (#72200)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72200

This op should still remain private in release 1.11; add an underscore before the op name to make that happen

Test Plan: buck run mode/opt -c fbcode.enable_gpu_sections=true pytext/fb/tools:benchmark_transformers -- mha --batch-size=10 --max-sequence-length=16

Reviewed By: bdhirsh

Differential Revision: D33952191

fbshipit-source-id: 3f8525ac9c23bb286f51476342113ebc31b8ed59
(cherry picked from commit 6e41bfa4fc242987165fafda1a01735838e3f73d)
2022-02-03 04:15:18 +00:00
2a391284fc Revert D33851316: ci: Migrate macOS x86_64 binary builds to GHA
Test Plan: revert-hammer

Differential Revision:
D33851316 (c2e63b43ce)

Original commit changeset: 3c953f0e4e4b

Original Phabricator Diff: D33851316 (c2e63b43ce)

fbshipit-source-id: d95670332bbe44725b589e6d895f99b6d8821024
(cherry picked from commit 5f1861d777b913a94be7844e5eef28b53ab7010d)
2022-02-03 04:08:39 +00:00
14538fa7bf [AutoAccept][Codemod][FBSourceClangFormatLinter] Daily arc lint --take CLANGFORMAT
Reviewed By: zertosh

Differential Revision: D33962464

fbshipit-source-id: a8f0633dbd3fcb26b68e3d48886d520a46eea631
(cherry picked from commit 85f819baa3be5a2b81faef35b9d8fb422a5cb8fe)
2022-02-03 04:02:37 +00:00
bf09ece782 Make svd / svdvals fully functorch compatible (#72181)
Summary:
This should (hopefully) make all the CI from `functorch` go green (including jvp's!) after changing `VARIADIC_BDIMS_BOXED(_svd_helper);` with `VARIADIC_BDIMS_BOXED(_linalg_svd);` and removing all the skip and xfails associated to `linalg.svdvals`.

Locally, there's just one test that started failing because of this: `test_vmapjvpall_norm_nuc_cpu_float32`. I have no idea what's going on here, but it's a jvp test, so not a regression, and it might very well be caused by the jvp of another operation within `norm_nuc`, as this is a composite operation.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72181

Reviewed By: ngimel

Differential Revision: D33952744

Pulled By: zou3519

fbshipit-source-id: 2a2510d97eed4a0bfc25615264ddd36e38856efe
(cherry picked from commit 5805fa107c3a91c58f8ecc9778cfc87aa7f64233)
2022-02-03 03:21:22 +00:00
e03c3dd150 Add leaf module code example (#72100)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72100

Facebook:
Enable the splitter to properly read the leaf modules specified by the acc_tracer leaf module list, and parse a leaf module as run_on_acc if a customized leaf module converter is provided.
Add a scratch board for customized leaf module converters and example code for the std_conv2d_same converter.

Reviewed By: jfix71

Differential Revision: D33698402

fbshipit-source-id: 01ce84ee1543f0fb8a8899256530ef1300797417
(cherry picked from commit 1357b2d528358da74609b728bfcbe2b8b4b90a50)
2022-02-03 02:07:00 +00:00
c83eaf5c26 add the default destructor of TensorImpl (#72190)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72190

# Context
Some compilers could not generate the destructor correctly.

# Mitigation
Add the default destructor.

Test Plan: ^CI

Reviewed By: albanD

Differential Revision: D33936970

fbshipit-source-id: c21aa1cce8565d8c25389de8970880392737afb1
(cherry picked from commit 7ab4b8b14e96e2c02d515c2d3059f1e6c1259b51)
2022-02-03 01:40:25 +00:00
c585d35463 CUDACachingAllocator: Keep one event queue per stream (#71745)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/71616

This fixes the leaks in my test case. I have not tested it on our big models yet, but will report back if we can.

This potentially impacts allocator performance in that it slightly increases the amount of CPU memory we allocate for data structures, and it means that `process_events` may look at a larger number of events in the case where there are multiple streams with long-running ops on them.

However, I suspect that in general, either:
- An application isn't using very many streams or very many long-running ops, in which case the performance is essentially the same
- Or, they are, which is precisely the case where https://github.com/pytorch/pytorch/issues/71616 bites you, and so freeing memory faster is probably more valuable than the slight CPU overhead here.

I'm not attached to this approach or any of its details, but figured it was worth throwing up for discussion.
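The per-stream event queue idea can be sketched as follows (plain Python, illustrative names only, not the CUDACachingAllocator code):

```python
from collections import deque

# One event queue per stream: processing stops at the first *pending* event
# in each stream's own queue, so a long-running op on one stream no longer
# blocks reclaiming blocks whose events on other streams already finished.
class Allocator:
    def __init__(self):
        self.events = {}  # stream id -> deque of (event, block)

    def record(self, stream, event, block):
        self.events.setdefault(stream, deque()).append((event, block))

    def process_events(self, is_done):
        freed = []
        for stream, q in self.events.items():
            # Stop at the first pending event in *this* stream only.
            while q and is_done(q[0][0]):
                freed.append(q.popleft()[1])
        return freed
```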

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71745

Reviewed By: soulitzer

Differential Revision: D33948288

Pulled By: ngimel

fbshipit-source-id: 73e95f8a9bbe385a77de483d1c58b857b5d84e81
(cherry picked from commit d233719c072341607e6dab226b5cbfe8d316d91f)
2022-02-03 01:35:19 +00:00
d23231fd8c Fix upgrader codegen when constant list is 0 (#72199)
Summary:
When the constant list is empty, the previous codegen generated something like
```
std::vector<c10::IValue>({

}), // constants list,
```
However, this fails quick-checks because it includes trailing whitespace. This PR generates the following instead.
```
std::vector<c10::IValue>(), // constants list,
```
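A hypothetical sketch of the special case in plain Python (illustrative names, not the actual upgrader codegen):

```python
def emit_constants(constants):
    # Special-case the empty list so no whitespace-only line is emitted
    # between the braces.
    if not constants:
        return "std::vector<c10::IValue>(), // constants list,"
    body = "\n".join(f"  {c}," for c in constants)
    return "std::vector<c10::IValue>({\n" + body + "\n}), // constants list,"
```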

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72199

ghstack-source-id: 148231023

Test Plan: CI

Reviewed By: tugsbayasgalan

Differential Revision: D33952046

fbshipit-source-id: 359b8a418928c89bbeb446b44774b312c94f03bc
(cherry picked from commit 060490f66724e418a43548c2eaffa3244e780557)
2022-02-03 00:41:03 +00:00
4a7e07e53e Fix torch.save and detach for CSR Tensor (#71963)
Summary:
Currently saving a CSR Tensor simply fails. This also addresses the segfault encountered in https://github.com/pytorch/pytorch/issues/71652.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71963

Reviewed By: jbschlosser

Differential Revision: D33895938

Pulled By: cpuhrsch

fbshipit-source-id: a333505d3a216705147c2aaaaeb2a0fd0c2a5e43
(cherry picked from commit a88265921cd8cf29871b5c2174f5e3184b3df8d3)
2022-02-02 23:59:24 +00:00
c2e63b43ce ci: Migrate macOS x86_64 binary builds to GHA (#71888)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71888

Migrates binary builds for x86_64 for macOS from CircleCI to GHA.

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

Test Plan: Imported from OSS

Reviewed By: malfet

Differential Revision: D33851316

Pulled By: seemethere

fbshipit-source-id: 3c953f0e4e4b434f4e0f95156d50484a5b56d0c7
(cherry picked from commit 15de76a6bec68e18fc4882f29984cc32a546754c)
2022-02-02 23:44:13 +00:00
4c62ffa11e Improve fx2trt benchmark (#72145)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72145

- Added a predicate that allows us to skip lowering nodes with specific names.
- Added an observer function to help with debugging

Reviewed By: jasonjk-park, houseroad

Differential Revision: D33785834

fbshipit-source-id: 7bdb7f33851da1118763c85f8e2121d01e4914a2
(cherry picked from commit 4e2268ed45c394822f38ef82334f0c76721556cf)
2022-02-02 22:41:58 +00:00
1ad53b51d0 Fix unused variable warning in LostCTC.cu (#72155)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72155

Fixes:
```

caffe2/aten/src/ATen/native/cuda/LossCTC.cu(420): warning: parameter "max_target_length" was declared but never referenced
          detected during instantiation of "at::Tensor at::native::<unnamed>::ctc_loss_backward_gpu_template<scalar_t,target_scalar_type>(const at::Tensor &, const at::Tensor &, const at::Tensor &, c10::IntArrayRef, c10::IntArrayRef, const at::Tensor &, const at::Tensor &, int64_t, __nv_bool) [with scalar_t=double, target_scalar_type=c10::ScalarType::Long]"
(758): here

caffe2/aten/src/ATen/native/cuda/LossCTC.cu(427): warning: parameter "num_labels" was declared but never referenced
          detected during instantiation of "at::Tensor at::native::<unnamed>::ctc_loss_backward_gpu_template<scalar_t,target_scalar_type>(const at::Tensor &, const at::Tensor &, const at::Tensor &, c10::IntArrayRef, c10::IntArrayRef, const at::Tensor &, const at::Tensor &, int64_t, __nv_bool) [with scalar_t=double, target_scalar_type=c10::ScalarType::Long]"
(758): here

caffe2/aten/src/ATen/native/cuda/LossCTC.cu(427): warning: parameter "BLANK" was declared but never referenced
          detected during instantiation of "at::Tensor at::native::<unnamed>::ctc_loss_backward_gpu_template<scalar_t,target_scalar_type>(const at::Tensor &, const at::Tensor &, const at::Tensor &, c10::IntArrayRef, c10::IntArrayRef, const at::Tensor &, const at::Tensor &, int64_t, __nv_bool) [with scalar_t=double, target_scalar_type=c10::ScalarType::Long]"
(758): here

caffe2/aten/src/ATen/native/cuda/LossCTC.cu(420): warning: parameter "max_target_length" was declared but never referenced
          detected during instantiation of "at::Tensor at::native::<unnamed>::ctc_loss_backward_gpu_template<scalar_t,target_scalar_type>(const at::Tensor &, const at::Tensor &, const at::Tensor &, c10::IntArrayRef, c10::IntArrayRef, const at::Tensor &, const at::Tensor &, int64_t, __nv_bool) [with scalar_t=double, target_scalar_type=c10::ScalarType::Long]"
(758): here
```

Test Plan: Sandcastle

Reviewed By: malfet

Differential Revision: D33930498

fbshipit-source-id: 34b408b16c2a3f45e73f808f07f89e3fd443e9cc
(cherry picked from commit 9cc3c2051bc84c25bf4cc64364becb61f700b7ee)
2022-02-02 22:35:44 +00:00
774e0847c9 Add hook for functorch to error out with unoverridable autograd operations (#72176)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72176

I went through the manual_cpp_binding operations in
native_functions.yaml looking for important things that people use that
don't go through the dispatcher and came up with this.

There's currently no mechanism for functorch (or Tensor subclasses)
to change the behavior of tensor.requires_grad_() and
tensor.retains_grad() because these don't go through the dispatcher at
all.

This PR adds a hook for functorch to be able to throw an error on these.
In the future they should probably be overridable with torch_dispatch
(or at least configurable!).

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33943151

Pulled By: zou3519

fbshipit-source-id: df7eb0acad1da3adaf8c07e503ccf899e34571a2
(cherry picked from commit bba7207dc77a12ceedfbd16d44e4d287287423bf)
2022-02-02 22:07:03 +00:00
f607af126e Set correct device id on efficientzerotensors (#71611)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71611

Fixes https://github.com/pytorch/pytorch/issues/71160 https://github.com/pytorch/pytorch/issues/69925 #69913

Test Plan: Imported from OSS

Reviewed By: VitalyFedyunin

Differential Revision: D33897543

Pulled By: anjali411

fbshipit-source-id: f1d8608c351876b8c2619da5ef891f74bad30ab5
(cherry picked from commit 643e666ea34aaf1584d478cd85025bf66d93cb21)
2022-02-02 21:51:32 +00:00
889a62ddb2 Update approvers for ONNX (#71659)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71659

Reviewed By: albanD, seemethere

Differential Revision: D33924156

Pulled By: malfet

fbshipit-source-id: 451aa349dcb49dea07f804d758a181254544c0a9
(cherry picked from commit cd773b93fb46cf48fd5076bb88edd8cd9e0250ad)
2022-02-02 21:46:36 +00:00
2d5296b0e7 [SR] Implement prim::Loop (#69838)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69838

Implement `prim::Loop` with the new `StaticRuntimeBlockRunner` abstraction.
ghstack-source-id: 148186483

Test Plan: New unit tests: `buck test caffe2/benchmarks/static_runtime/...`

Reviewed By: d1jang

Differential Revision: D33049595

fbshipit-source-id: 550de5167b46fccd65ff77d092785289b5e5d532
(cherry picked from commit 8baf1753af34f4c166b4680e42589517fd2e508d)
2022-02-02 19:30:50 +00:00
2aa699505d [SR] Implement prim::If (#69837)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69837

Implement `prim::If` with the new `StaticRuntimeBlockRunner` abstraction.
ghstack-source-id: 148186475

Test Plan:
New unit tests: `buck test caffe2/benchmarks/static_runtime/...`

Accuracy test at top of stack

Reviewed By: d1jang

Differential Revision: D33045908

fbshipit-source-id: 281fb4a73528249fa60f65ac26f8ae6737771f55
(cherry picked from commit de3b12dc0871e8ca09891c257e1dfd7cd352aa7c)
2022-02-02 19:30:50 +00:00
d2599701fd [SR] Force sub-blocks to return at least one output (#69836)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69836

It is technically possible for the sub-blocks to return zero outputs. This is problematic for `StaticRuntimeBlockRunner`, because it assumes that at least one output is being returned.

Rather than slowing down SR with special logic for this corner case, we can simply force these sub-blocks to return `None`.
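The normalization can be sketched as a tiny pass (a hypothetical helper for illustration, not SR's actual API):

```python
def normalize_block_outputs(outputs):
    """StaticRuntimeBlockRunner assumes at least one output exists, so a
    sub-block that returns zero outputs is rewritten to return a single
    None instead. Sketch only; the real pass operates on JIT graph IR."""
    return outputs if outputs else [None]
```
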
ghstack-source-id: 148186453

Test Plan: Sub-blocks with no return values tested at top of stack

Reviewed By: d1jang

Differential Revision: D33050420

fbshipit-source-id: 17d9e19fda6431aa9fd0b155131349bac42bc149
(cherry picked from commit c97fd07bf53e1e253a0e6c733db5ea7c86698fc9)
2022-02-02 19:30:50 +00:00
238dded10f [SR] Graph pass to create owned refs of special IValues (#69835)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69835

`StaticRuntimeBlockRunner` moves its outputs to the return value at the end of `run_impl`. However, there's a corner case where this can cause problems. If we return a constant, then the only reference in the `constants_` array can be destroyed by this move. We could add special logic to handle this in `run_impl`. But since this is a relatively rare corner case, it's simpler to just add an op that does nothing but create an owned reference to its input. This owned reference can be safely moved out of `StaticRuntimeBlockRunner`.

Note that this also applies to returned values in sub-blocks that are from outer scopes.
ghstack-source-id: 148186452

Test Plan:
`buck test caffe2/benchmarks/static_runtime/...`

Added a new unit test with a graph that simply returns a constant.

Tests with sub-blocks at top of stack.

Reviewed By: d1jang

Differential Revision: D33047519

fbshipit-source-id: 22b6058f0d1da8a6d1d61a6f2866bc518bff482b
(cherry picked from commit a8f89a12ee726aa7d7e546dee25d696eef868ce7)
2022-02-02 19:30:50 +00:00
b0518b2705 Codegen: Do less work in dry-runs for sharded files (#69805)
Summary:
This improves a dry-run of `gen.py` from 0.80s to 0.45s.

`FileManager` in `dry_run` mode doesn't actually need to compute the
environment; it just records the filenames that would have been
written.
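
The optimization can be sketched as follows (a simplified stand-in; the real `FileManager` in the codegen differs in detail):

```python
class FileManager:
    """Sketch of the dry-run fast path: record filenames in both modes,
    but only compute the (potentially expensive) template environment
    when actually writing output."""

    def __init__(self, dry_run: bool = False) -> None:
        self.dry_run = dry_run
        self.filenames: set = set()

    def write(self, filename: str, env_callable) -> None:
        # Record the filename in both modes so callers can list outputs.
        self.filenames.add(filename)
        if self.dry_run:
            return  # skip computing the environment entirely
        env = env_callable()  # only evaluated for real runs
        with open(filename, "w") as f:
            f.write(str(env))
```
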

cc ezyang bhosmer bdhirsh

Pull Request resolved: https://github.com/pytorch/pytorch/pull/69805

Reviewed By: ngimel

Differential Revision: D33944912

Pulled By: albanD

fbshipit-source-id: 74f22af3f2bd5afdef7105961270198566fa91e5
(cherry picked from commit 6fcdc15954788257b76e14087ba1ebf63fd3ab27)
2022-02-02 19:25:16 +00:00
cb0d7f0d96 [Profiler] Defer KinetoEvent and GenericTraceActivity creation to post processing. (#71539)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71539

This is the first of the optimizing changes. One of the issues with Kineto sometimes being unavailable is that we cannot use it as a storage mechanism. KinetoEvent currently fills this role; however, KinetoEvent is VERY expensive. A second issue is that, because we currently write to two objects, we hold the state lock for the duration of both event creations, which is not ideal.

This applies the following optimizations:
1) Intermediate data is stored in a deque in KinetoThreadLocalState, which saves a data->KinetoObserverContext->KinetoEvent double copy. The new KinetoObserverContext just holds a pointer to the element in the deque.
2) OpEventData is much lighter weight (though still far from ideal)
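
A rough Python sketch of optimization (1); the class and field names here are simplified stand-ins for the C++ types, not the actual profiler API:

```python
from collections import deque


class OpEventData:
    """Lightweight intermediate record (stand-in for the C++ struct)."""

    def __init__(self):
        self.name = None
        self.start_us = None


class KinetoThreadLocalState:
    def __init__(self):
        # A deque gives stable storage: appending never moves existing
        # elements, so observer contexts can safely hold references into it.
        self.op_events = deque()

    def new_observer_context(self):
        self.op_events.append(OpEventData())
        # The context keeps a reference to the element instead of copying
        # data through an ObserverContext -> KinetoEvent double copy.
        return self.op_events[-1]
```
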

Test Plan:
Script: P470970719
Result: P470970794
For the base case (no special flags), 40% reduction in the `profiler_kineto` portion of the overhead.

Reviewed By: aaronenyeshi

Differential Revision: D32691800

fbshipit-source-id: 3d90d74000105d0ef1a7cb86d01236610e7e3bbd
(cherry picked from commit fbca1b05bac60ed81d6cd3b2cfdb7ffb94ebeb6a)
2022-02-02 18:42:50 +00:00
e19f2e52ad enable GHA workflow defaults for ROCm (#72142)
Summary:
Also updates the ROCm per-job health check.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72142

Reviewed By: ngimel

Differential Revision: D33947337

Pulled By: seemethere

fbshipit-source-id: aaf47f3046d254df9f43a71d168f7b81ae44eeab
(cherry picked from commit 72974ae68199e174200d66f9bafa2e0c9cdbfb55)
2022-02-02 18:30:03 +00:00
31b348411a fix typos in aten/src/ATen/native/mkldnn (#71853)
Summary:
AT_MKLDNN_EBABLED => AT_MKLDNN_ENABLED

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71853

Reviewed By: albanD

Differential Revision: D33883760

Pulled By: mrshenli

fbshipit-source-id: 0cd028808b0de3c66b5666e1cd032a76855e882e
(cherry picked from commit 640ffaf806b61578b00d6f145cffa2833e76920d)
2022-02-02 17:34:59 +00:00
02f6226bff [fix] Dropout2d-3d no-batch-dim (#69885)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/69801

TODO:
* [x] Update C++ API

cc albanD mruberry jbschlosser walterddr kshitij12345

Pull Request resolved: https://github.com/pytorch/pytorch/pull/69885

Reviewed By: mruberry

Differential Revision: D33175470

Pulled By: jbschlosser

fbshipit-source-id: c9d7d9e0f59ba290a0157725c338a345f3d58b9f
(cherry picked from commit 7e4271a1564b36ef9a49ac35c210338470fedb89)
2022-02-02 16:40:32 +00:00
39483b5918 Updated CONTRIBUTING.md (#72044)
Summary:
Removed duplicate statements

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72044

Reviewed By: albanD

Differential Revision: D33883764

Pulled By: mrshenli

fbshipit-source-id: db467b2adec6a62edf132a70f70e665a97266fa4
(cherry picked from commit aa38c8af79eea946cfb66ad6e3eff578363c48a4)
2022-02-02 16:13:38 +00:00
a1383a9cfa Reland torch.ops API change machinery with the core functionality disabled (#71785)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71785

see https://github.com/pytorch/pytorch/pull/67254
ghstack-source-id: 147648699

Test Plan: github CI

Reviewed By: albanD

Differential Revision: D33777229

fbshipit-source-id: 517b36be9743025eb40d708d380dae62e3663184
(cherry picked from commit a637e695694d3fd615dbe821394bfe53d41b6901)
2022-02-02 16:06:29 +00:00
1fdbe9aa76 Make asarray behavior consistent with Python Array API. (#71757)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/70591

This PR makes `torch.asarray` consistent with [the Python Array API](https://data-apis.org/array-api/latest/API_specification/generated/signatures.creation_functions.asarray.html#signatures.creation_functions.asarray) (which also happens to match `torch.as_tensor` behavior). Specifically, it makes `asarray` casting conditional on the presence of the `dtype` argument. This solves the issue where Python scalars (and lists) were passed as input without specifying the `dtype`.

Before:
```python
>>> torch.asarray([True, False])
tensor([1., 0.])
```

After:
```python
>>> torch.asarray([True, False])
tensor([True, False])
```
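
The rule boils down to a small decision sketch (`infer_or_cast` is a hypothetical stand-in using plain Python types instead of torch dtypes, not a torch API):

```python
def infer_or_cast(values, dtype=None):
    # Keep the inferred element type unless the caller passes dtype
    # explicitly; an explicit dtype still casts.
    if dtype is None:
        return list(values)            # e.g. bools stay bools
    return [dtype(v) for v in values]  # explicit dtype casts, as before
```
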

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71757

Reviewed By: mrshenli

Differential Revision: D33774995

Pulled By: anjali411

fbshipit-source-id: 9f293401f993dca4046ceb61f714773ed4cf7c46
(cherry picked from commit 0c6f98ebe7c843a68f624d2d9c3cae39f018bb66)
2022-02-02 15:57:31 +00:00
2336571cb7 make fsdp folder to be public (#72084)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72084

Make the fsdp folder public.
ghstack-source-id: 148173447

Test Plan: unit tests

Reviewed By: mrshenli

Differential Revision: D33903417

fbshipit-source-id: 7852a2adc4af09af48a5ffa52ebf210489f834d5
(cherry picked from commit bd06513cfe2f391941bb0afa611dd39994585513)
2022-02-02 15:50:14 +00:00
ed435e903f [AutoAccept][Codemod][FBSourceClangFormatLinter] Daily arc lint --take CLANGFORMAT
Reviewed By: zertosh

Differential Revision: D33938055

fbshipit-source-id: 6c0643a18f09854e87e183341f252c66dd6395a6
(cherry picked from commit fd183aedbc0f015bd43a01a28930093ab94ab41e)
2022-02-02 11:27:15 +00:00
8757e21c6a Update logspace and bump the version number to 9 (#72051)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72051

Test Plan: TestUpgraders.test_aten_logspace && TestSaveLoadForOpVersion.test_aten_logspace

Reviewed By: khabinov, cccclai

Differential Revision: D33885098

fbshipit-source-id: 0c669d0b00f451bc65427900dcf4d8032318a341
(cherry picked from commit b12d1aa2aada12df5ff7b1f0f71574a8363bfaca)
2022-02-02 08:54:14 +00:00
230186f9f7 Jiterator cache for Windows (#72048)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/71967

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72048

Reviewed By: mruberry

Differential Revision: D33902836

Pulled By: ngimel

fbshipit-source-id: af7f9e0b9735282cf17a61486330856f66ab4548
(cherry picked from commit 64150c3a7917b037e646ddd4d4c105022a49a0ec)
2022-02-02 07:11:13 +00:00
64670e414e [reland] Create torch.distributed._shard package. (#72141)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72141

We have many sharding components currently:
torch.distributed._sharded_tensor, torch.distributed._sharding_spec,
torch.distributed._sharded_optimizer and more coming.

As a result, we are organizing all of this under the `torch.distributed._shard`
package. For BC reasons, I'm still keeping the old packages and having them just
reference the new package.
ghstack-source-id: 148150861

Test Plan: waitforbuildbot

Reviewed By: fduwjj

Differential Revision: D33904585

fbshipit-source-id: 057e847eb7521b536a3ee4e0f94871aacc752062
(cherry picked from commit 29a70dd7afde6083bab942081020a13278f38e52)
2022-02-02 06:58:20 +00:00
7b014cc645 [DataPipe] Disable Typing for DataPipe before branch cut (#72123)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72123

There is a bug in the DataPipe typing system that would take more than a week to fix. I will follow up on it later this month. As branch cut is today, this PR disables typing to make sure the release works.

Test Plan: Imported from OSS

Reviewed By: VitalyFedyunin

Differential Revision: D33920610

Pulled By: ejguan

fbshipit-source-id: febff849ab2272fd3b1c5127a20f27eb82992d9c
(cherry picked from commit ee103e62e70b69236294f8228ac8061fd95cd4fd)
2022-02-02 05:00:41 +00:00
aa99df5cc3 Check for grad mode enabled in dynamic shape fusion check (#72161)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72161

Following logic here: 3dce68fdf4/aten/src/ATen/core/tensor_type.cpp (L329)

Test Plan: Imported from OSS

Reviewed By: navahgar

Differential Revision: D33934368

Pulled By: eellison

fbshipit-source-id: 8555ef72070559905f65c6e883a7ae49e5bbbdc3
(cherry picked from commit 1db78befd65edf5ee79621919b9c97796bb9d2b6)
2022-02-02 04:40:22 +00:00
0c49800c1c change default grain size in gather scatter kernel to improve CPU performance (#72082)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72082

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64478

Test Plan: Imported from OSS

Reviewed By: VitalyFedyunin

Differential Revision: D33862395

Pulled By: frank-wei

fbshipit-source-id: 1e439d90b56294f27aaa7905bc81e772cf47fbbf
(cherry picked from commit 1064561fec2df83b8a20bfd44c205a47af8ef03b)
2022-02-02 04:20:20 +00:00
7c2eda3829 Fix fx docs (#72108)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72108

Test Plan: Imported from OSS

Reviewed By: jamesr66a

Differential Revision: D33916855

Pulled By: mrshenli

fbshipit-source-id: 5fff6c87555109e43954eff99164e68a56ff95da
(cherry picked from commit 1611c4c75c6398cdd72fb9edfd04e8f9ff2f9ece)
2022-02-02 03:28:07 +00:00
f99147dec0 Targeted documentation updates in autograd.functional (#72111)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72111

For vectorize flag:
- Advertises the use of functorch

For autograd.functional.jvp:
- Advertises the use of functorch and the low-level jvp API, both of
which will be more performant than the double backprop trick.

Test Plan: - view docs

Reviewed By: albanD

Differential Revision: D33918065

Pulled By: zou3519

fbshipit-source-id: 6e19699aa94f0e023ccda0dc40551ad6d932b7c7
(cherry picked from commit b4662ceb99bf79d56727d9f1343669e584af50bd)
2022-02-02 03:19:31 +00:00
a60e2ae037 [TensorExpr] Move AOT compilation logic from aot_compiler.cpp to NNC's to_backend (#70375)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70375

Differential Revision: D33303645

Test Plan: Imported from OSS

Reviewed By: VitalyFedyunin, priyaramani

Pulled By: ZolotukhinM

fbshipit-source-id: 01ab9fab9bb0d63f89b06a146d3c5fb6ed7fe52d
(cherry picked from commit aac8e0ed900d1b760606b0b50eb064e6b00f8b7a)
2022-02-02 02:34:55 +00:00
64668e61b8 [TensorExpr] AOT Compiler: support symbolic shape arguments. (#70374)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70374

Differential Revision: D33303646

Test Plan: Imported from OSS

Reviewed By: navahgar, priyaramani

Pulled By: ZolotukhinM

fbshipit-source-id: da81af93b27e632b34c6f35b0ff3c933cba74c19
(cherry picked from commit 4af5fb18a1ddc32452b1acfd1d77451331e9c09d)
2022-02-02 02:34:55 +00:00
a7cebda955 Revert D33892443: use irange for caffe2/aten directory
Test Plan: revert-hammer

Differential Revision:
D33892443 (10cc66bc78)

Original commit changeset: eb76a3b39e6b

Original Phabricator Diff: D33892443 (10cc66bc78)

fbshipit-source-id: 1072520eea5cae121b6dc82403d6679f9c768e3e
(cherry picked from commit b3dbef5e63b6afb50be10e75f8e5a2f69b5a8053)
2022-02-02 00:36:53 +00:00
a34c17bfaa [Pytorch Edge] Fix Custom Class Parser (#72153)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72153

Forgot that schemas can have types like foo.bar[]
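
The shape of type token the parser now needs to accept can be illustrated with a regex (a sketch for illustration only, not the actual parser code):

```python
import re

# Dotted custom-class names with an optional [] list suffix, e.g. "foo.bar[]".
TYPE_TOKEN = re.compile(r"[A-Za-z_]\w*(?:\.[A-Za-z_]\w*)*(?:\[\])?$")

assert TYPE_TOKEN.match("foo.bar[]")
```
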

Test Plan: CI and the extra files regenerated in this diff

Reviewed By: tugsbayasgalan

Differential Revision: D33928283

fbshipit-source-id: 810d25f8f7c1dd7c75e149739fc9f59c6eafe3b9
(cherry picked from commit 6fe5e8c437d1eddb600448ecf323262fc1a4c60b)
2022-02-02 00:30:51 +00:00
aa5dab02b2 [fix] EmbeddingBag segfault for out-of-bounds idx (#71904)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/71094

Added checks for out-of-bound indices

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71904

Reviewed By: jbschlosser, VitalyFedyunin

Differential Revision: D33893387

Pulled By: ngimel

fbshipit-source-id: 0ba7038bd7e10c6fa6700646a0fe755b73db0ec9
(cherry picked from commit 4d6ae2e3f4ed73e243fe39f49f575433752d6ab1)
2022-02-02 00:04:26 +00:00
67a275c293 Fix persistent worker exits before pin_memory thread (#71579)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71579

Fixes #1551

As the comment in the code explains, we register a function to terminate the persistent workers.
By keeping a reference to these workers in `atexit`, we prevent the Python interpreter from killing the persistent worker processes before the `pin_memory_thread` exits.
And, if users explicitly kill the DataLoader iterator, the function registered in `atexit` becomes a no-op.
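
The mechanism can be sketched with the stdlib (a simplification; the actual DataLoader code differs):

```python
import atexit


def register_worker_cleanup(workers):
    """Register an atexit hook that keeps a reference to the workers alive
    and terminates any still running at interpreter exit. If the iterator
    already shut the workers down, the hook is a no-op."""

    def _cleanup():
        for w in workers:
            if w.is_alive():  # no-op when workers have already exited
                w.terminate()
                w.join()

    atexit.register(_cleanup)
    return _cleanup  # returned here only to make the sketch testable
```
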

Test Plan: Imported from OSS

Reviewed By: VitalyFedyunin

Differential Revision: D33896537

Pulled By: ejguan

fbshipit-source-id: 36b57eac7523d8aa180180c2b61fc693ea4638ae
(cherry picked from commit 05add2ae0fcd08b6ecb5dc46cfbf4c126c6427ed)
2022-02-01 23:57:17 +00:00
10cc66bc78 use irange for caffe2/aten directory (#72067)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72067

The majority of the script used to generate these changes comes from Richard Barnes (D28874212).

Use irange in PyTorch, which brings some benefits:
- const safety
- might help the compiler generate a more efficient binary
- more concise
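
The rewrite the script performs can be demonstrated with a trimmed version of the same regex (a sketch; the full pattern and header-insertion logic are in the script quoted below):

```python
import re

# Same shape of pattern as use_irange.py, trimmed to a few integer types.
FOR_LOOP = re.compile(
    r"for\s*\((?:int32_t|int64_t|uint32_t|size_t|int|auto) "
    r"([A-Za-z0-9_]+)\s*=\s*([^\s]+)\s*;"
    r"\s*\1\s*<\s*([^\s]+)\s*;\s*(?:\+\+\1|\1\+\+)\s*\)\s*({?)"
)


def rewrite(line):
    m = FOR_LOOP.search(line)
    if not m:
        return line  # leave non-matching lines untouched
    var, lo, hi, brace = m.groups()
    args = hi if lo == "0" else f"{lo}, {hi}"
    tail = " {" if brace == "{" else ""
    return (line[: m.start()]
            + f"for (const auto {var} : c10::irange({args})){tail}"
            + line[m.end():])


rewrite("for (int64_t i = 0; i < n; ++i) {")
# -> "for (const auto i : c10::irange(n)) {"
```
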

Originally, I was planning to change everything, including the header files. But that caused too many errors in other places, so I changed the script to only touch the cpp and cc files.

```
#filetypes = ('.cpp', '.cc', '.h', '.hpp')
filetypes = ('.cpp', '.cc')
```

Even when changing only the cpp (cc) files, there are still some unknown issues, so I limited the script to the **aten** folder to begin with.
```
#target_path = '..'
target_path = '../aten'
```
**Later on, we could run the script for each folder one by one.**

The following files are known to cause issues (such as namespace conflicts when the code is already in the c10 namespace, loop variables that must not be constant, etc.). We will need to deal with them one by one.
```
excluded_files = ['../c10/util/ConstexprCrc.h',
    '../aten/src/ATen/core/jit_type.h',
    '../aten/src/ATen/native/Math.h',
    '../c10/util/variant.h',
    '../c10/util/flags_use_no_gflags.cpp',
    '../caffe2/operators/cc_bmm_bg_op.h',
    '../aten/src/ATen/core/tensor_type.cpp',
    '../aten/src/ATen/native/Linear.cpp',
    '../aten/src/ATen/native/ConvolutionTBC.cpp',
    '../caffe2/share/fb/mask_rcnn/bbox_concat_batch_splits_op.h',
    '../aten/src/ATen/native/BatchLinearAlgebra.cpp',
    '../aten/src/ATen/native/quantized/cpu/kernels/QuantizedOpKernels.cpp',
    '../aten/src/ATen/native/cuda/DistributionTemplates.h',
    '../c10/util/sparse_bitset.h',
    '../torch/csrc/distributed/c10d/TCPStore.cpp',
    '../caffe2/fb/operators/calibration_op.h',
    '../torch/csrc/jit/testing/file_check.cpp',
    '../torch/csrc/jit/passes/concat_opt.cpp',
    '../torch/csrc/jit/tensorexpr/operators/reduction.cpp',
    '../torch/fb/operators/select_keys.cpp',
    '../torch/fb/operators/calibration/bucketize_calibration.cpp',
    '../fb/custom_ops/maskrcnn/int8/int8_aabb_roi_align.cpp',
    '../fb/custom_ops/maskrcnn/aabb/aabb_roi_align.cpp',
    '../caffe2/fb/tests/RecordIOHelper.cpp',
    '../test/cpp/api/rnn.cpp',
    '../torch/fb/training_toolkit/common/tdigest/tests/TestBufferedTDigest.cpp'
    ]
```

I placed **use_irange.py** at caffe2/scripts and ran the script from there.
```
[charleszhang@devvm7388]~/fbsource/fbcode/caffe2/scripts% pwd
/home/charleszhang/fbsource/fbcode/caffe2/scripts
[charleszhang@devvm7388]~/fbsource/fbcode/caffe2/scripts% ls -l use*
-rwxr-xr-x 1 charleszhang users 5174 Jan 27 10:18 use_irange.py
```

The following is **use_irange.py** I used to generate the changes.
```
#!/usr/bin/env python3
# (c) Facebook, Inc. and its affiliates. Confidential and proprietary.

import re
import os

irange_header = "#include <c10/util/irange.h>"

# I recommend using https://regex101.com/ to understand this.
for_loop_regex = re.compile(
    r"for\s*\((?:int32_t|int64_t|uint32_t|uint64_t|size_t|int|unsigned|auto|std::size_t|short|uint16_t|uint8_t) ([A-Za-z0-9_]+)\s*=\s*([^\s]+)\s*;\s*\1\s*<\s*([^\s]+)\s*;\s*(?:\+\+\1|\1\+\+)\s*\)\s*({?)")

header_regex = re.compile(r'#include ["<][^>"]+(?:[">])')

new_loop_zero = "for (const auto {loop_var} : c10::irange({upper_bound})){bracket}"
new_loop_range = (
    "for (const auto {loop_var} : c10::irange({lower_bound}, {upper_bound})){bracket}"
)

#header_insertion_points = (("c10", "alpha"), ("ATen/", "after"), ("torch/", "before"))

def find_c10(data : str) -> int:
    insert_at = -1
    for m in header_regex.finditer(data):
        if "c10/" in m.group(0):
            if insert_at == -1:  # first c10 header seen
                insert_at = m.span()[0]
            if irange_header > m.group(0):
                insert_at = m.span()[1]
    return insert_at

def find_ATen(data : str) -> int:
    insert_at = -1
    for m in header_regex.finditer(data):
        if "ATen/" in m.group(0):
            insert_at = m.span()[1]
    return insert_at

def find_torch(data : str) -> int:
    for m in header_regex.finditer(data):
        if "torch/" in m.group(0):
            return m.span()[0]
    return -1

def find_header_insertion_point(data: str) -> "tuple[int, str]":
    """Look through headers to find an insertion point."""

    m = find_c10(data)
    if m != -1:
        return m, "after"
    else:
        m = find_ATen(data)
        if m != -1:
            return m, "after"
        else:
            m = find_torch(data)
            return m, "before"

def process_one_file(a_file : str):
    data = ''
    with open(a_file) as f:
        data = f.read()
    has_for_loop = for_loop_regex.findall(data)
    if not has_for_loop:
        return
    needs_header = has_for_loop and irange_header not in data

    if needs_header:
        pos, stype = find_header_insertion_point(data)
        # do not change the file if we do not know where to insert the header
        # (for now, since there are too many such files)
        if pos == -1:
            return
        if stype == "after":
            data = data[0:pos] + "\n" + irange_header + data[pos:]
        else:
            data = data[0:pos] + irange_header + "\n" + data[pos:]

    start = 0
    new_data = ""
    for match in for_loop_regex.finditer(data):
        loop_text_begin, loop_text_end = match.span()
        loop_var = match.group(1)
        lower_bound = match.group(2)
        upper_bound = match.group(3)
        bracket = " {" if match.group(4) == "{" else ""
        if lower_bound == "0":
            replacement_loop = new_loop_zero.format(
                loop_var=loop_var, upper_bound=upper_bound, bracket=bracket
            )
        else:
            replacement_loop = new_loop_range.format(
                loop_var=loop_var,
                lower_bound=lower_bound,
                upper_bound=upper_bound,
                bracket=bracket,
            )
        new_data += data[start : loop_text_begin] + replacement_loop
        start = loop_text_end
    new_data += data[start:]

    with open(a_file, "w") as fout:
        fout.write(new_data)

#filetypes = ('.cpp', '.cc', '.h', '.hpp')
filetypes = ('.cpp', '.cc')
#target_path = '..'
target_path = '../aten'

excluded_files = ['../c10/util/ConstexprCrc.h',
    '../aten/src/ATen/core/jit_type.h',
    '../aten/src/ATen/native/Math.h',
    '../c10/util/variant.h',
    '../c10/util/flags_use_no_gflags.cpp',
    '../caffe2/operators/cc_bmm_bg_op.h',
    '../aten/src/ATen/core/tensor_type.cpp',
    '../aten/src/ATen/native/Linear.cpp',
    '../aten/src/ATen/native/ConvolutionTBC.cpp',
    '../caffe2/share/fb/mask_rcnn/bbox_concat_batch_splits_op.h',
    '../aten/src/ATen/native/BatchLinearAlgebra.cpp',
    '../aten/src/ATen/native/quantized/cpu/kernels/QuantizedOpKernels.cpp',
    '../aten/src/ATen/native/cuda/DistributionTemplates.h',
    '../c10/util/sparse_bitset.h',
    '../torch/csrc/distributed/c10d/TCPStore.cpp',
    '../caffe2/fb/operators/calibration_op.h',
    '../torch/csrc/jit/testing/file_check.cpp',
    '../torch/csrc/jit/passes/concat_opt.cpp',
    '../torch/csrc/jit/tensorexpr/operators/reduction.cpp',
    '../torch/fb/operators/select_keys.cpp',
    '../torch/fb/operators/calibration/bucketize_calibration.cpp',
    '../fb/custom_ops/maskrcnn/int8/int8_aabb_roi_align.cpp',
    '../fb/custom_ops/maskrcnn/aabb/aabb_roi_align.cpp',
    '../caffe2/fb/tests/RecordIOHelper.cpp',
    '../test/cpp/api/rnn.cpp',
    '../torch/fb/training_toolkit/common/tdigest/tests/TestBufferedTDigest.cpp'
    ]

for current_folder, subfolders, files in os.walk(target_path):
    for a_file in files:
        if a_file.endswith(filetypes) and current_folder != '../caffe2/torch/jit':
            full_path = os.path.join(current_folder, a_file)
            if full_path not in excluded_files:
                process_one_file(full_path)

```

Test Plan: Sandcastle

Reviewed By: r-barnes

Differential Revision: D33892443

fbshipit-source-id: eb76a3b39e6bebb867ede85f74af9791ee8be566
(cherry picked from commit 28f8a2a6cca5b9a4e4ce4166bdc50135caf1b311)
2022-02-01 23:42:20 +00:00
e39bf13316 Fix internal assert custom function when input does not require grad (#72008)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72008

Fixes  #71119

Technically BC-breaking: when an input does not require grad, it was previously returned as-is rather than as a view, since no view was needed. Now we also return a view in that case (whether or not forward AD runs).

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33859553

Pulled By: soulitzer

fbshipit-source-id: 81b3fa371f4c0904630878500aa190492c562367
(cherry picked from commit ee74bc82342e2a42577101cb1aef43330a028a89)
2022-02-01 22:36:04 +00:00
26f88eb0e6 Revert D32053748: [pytorch] use cublas lt interface for bias fusion
Test Plan: revert-hammer

Differential Revision:
D32053748 (702d375df5)

Original commit changeset: accf787c8727

Original Phabricator Diff: D32053748 (702d375df5)

fbshipit-source-id: 735fe64de4d525d8c9f2833952b09483afeaea98
(cherry picked from commit 099bd88c628feb648baad7cb33484f5772ed1052)
2022-02-01 22:03:17 +00:00
b9f9d78a8c Update flatbuffers to v2.0.5 (#72132)
Summary:
This updates the flatbuffers submodule from v1.12.1 to v2.0.5, but according to the release notes for [v2.0.0](https://github.com/google/flatbuffers/releases/tag/v2.0.0):
> Note, "2.0" doesn't signify any kind of major overhaul of FlatBuffers, it is merely trying to be more semver compatible, and this release does have breaking changes for some languages much like all releases before it.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72132

Reviewed By: seemethere

Differential Revision: D33923945

Pulled By: malfet

fbshipit-source-id: 9398d35f6bbc4ec05562a25f6ee444b66df94086
(cherry picked from commit 2335d5f69b0b0ee36ead7f5d66cfc47a1954f834)
2022-02-01 21:32:27 +00:00
3dce68fdf4 [SR] Eliminate op_name_ in ProcessedNode (#71986)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71986

To address concerns over space increase from control flow.

`op_name_` was only stored as a minor optimization to avoid a name lookup during logging, so we can safely get rid of it. Thanks to the sampling mechanism, `get_op_name()` is called very infrequently, so this shouldn't cause too much of a regression.
ghstack-source-id: 148086244

Test Plan: CI

Reviewed By: d1jang

Differential Revision: D33821005

fbshipit-source-id: 6f74eb30a54a046ca90768aebbcde22e8c435f35
(cherry picked from commit 361ba32e97dbd130938bae10b5159730822c518c)
2022-02-01 21:22:26 +00:00
b1897d6d99 fixing stride order for expanded tensor (#71665)
Summary:
The default initialization of the stride order was not correct. This resulted in an expanded tensor reporting the wrong strides, since stride 0 is ignored by the TensorIterator stride computation logic [Computing output strides].

Quick fix with cpp tests as well.

Note that things still look strange when we expand from a rank-1, size-1 tensor, as that gives us inconsistent strides.
```
In [7]: x = torch.rand([1])

In [8]: x.expand(1, 1, 4, 4).stride()
Out[8]: (0, 0, 0, 0)

In [9]: x.expand(4, 4, 1, 1).stride()
Out[9]: (0, 0, 1, 1)

In [10]: x.expand(4, 1, 4, 1).stride()
Out[10]: (0, 0, 0, 1)
```

Meanwhile, a scalar tensor seems to work fine.
```
In [2]: x = torch.tensor(1.0)

In [3]: x.expand(4, 1, 1, 4).stride()
Out[3]: (0, 0, 0, 0)

In [4]: x.expand(4, 1, 4, 1).stride()
Out[4]: (0, 0, 0, 0)

In [5]: x.expand(4, 4, 1, 1).stride()
Out[5]: (0, 0, 0, 0)

In [6]: x.expand(1, 1, 4, 4).stride()
Out[6]: (0, 0, 0, 0)
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71665

Reviewed By: mrshenli

Differential Revision: D33849958

Pulled By: davidberard98

fbshipit-source-id: 982cd7fa352747d1e094a022475d6d1381ba75e5
(cherry picked from commit 0e0b587fe18ed47f4e801bb55a10641b9decd6e4)
2022-02-01 21:22:26 +00:00
b8a4ee5e35 Clean up old warnings in F.interpolate (#72093)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/71720

This PR removes the old warnings for `recompute_scale_factor` and `align_corners`.

Looking at this, I realize that the tests I modified don't really check whether or not a warning is created for `recompute_scale_factor`. If desired, I can add a couple of lines to those tests to pass a floating point in the `scale_factors` kwarg, along with `recompute_scale_factor=None`.

Let me know how this looks, thanks so much!

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72093

Reviewed By: mruberry

Differential Revision: D33917615

Pulled By: albanD

fbshipit-source-id: e822f0a15b813ecf312cdc6ed0b693e7f1d1ca89
(cherry picked from commit c14852b85c79d11adb1307a35cbf82e60ae21d50)
2022-02-01 21:18:29 +00:00
29d9100277 Process commit update2
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72139
2022-02-01 21:09:46 +00:00
d0f397ae61 Avoid unnecessary copy of padding/dilation vectors in check_shape_forward (#72019)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72019

Test Plan: CI + perf measurement

Reviewed By: marksantaniello

Differential Revision: D33846964

fbshipit-source-id: d206387a6efd16005e0cda75da9fd5fac40b405b
(cherry picked from commit 02bbdd78d2353d641622acd071b26dc7686a9e5c)
2022-02-01 20:08:05 +00:00
9e8334e3ae [tensorexpr][quant] Enable tensorexpr for quant,dequant (#71243)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71243

Test Plan: Imported from OSS

Reviewed By: ZolotukhinM

Differential Revision: D33554981

Pulled By: IvanKobzarev

fbshipit-source-id: 461f1cbece3bc8be6a3e9cf16bdbcc4fc5dd2593
(cherry picked from commit d2f9aac2c6bd0942d26f98be0a89c815b22a7f03)
2022-02-01 19:48:53 +00:00
34e4418dfa [nnc] tensorexpr for quantized/aten::upsample_nearest2d (#71236)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71236

Test Plan: Imported from OSS

Reviewed By: ZolotukhinM

Differential Revision: D33553305

Pulled By: IvanKobzarev

fbshipit-source-id: 2442afee6d23123bb3a4bc52d3555393b0254106
(cherry picked from commit 90a263fc08dc6302a74736070c02606937486956)
2022-02-01 19:48:53 +00:00
e118d6e59f Add lowering path for LinearReLU module (#71427)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71427

This commit adds a lowering path for the LinearReLU modules
in static quantization mode. This includes torch.nn.qat.Linear,
torch.nn.intrinsic.LinearReLU, and torch.nn.intrinsic.qat.LinearReLU.
Future commits will add support for dynamic quantization and functional
LinearReLU.

Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_linear_module

Imported from OSS

Reviewed By: george-qi

Differential Revision: D33694742

fbshipit-source-id: 19af11f82b1ad8ade0c307498971c29a3f776036
(cherry picked from commit b3f607de439f2ba7c0a03ad1ac494127685cbf4e)
2022-02-01 19:31:31 +00:00
be7ee92669 Update process_commit.py
Update new release note labelling system message on PRs

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72121
2022-02-01 19:17:40 +00:00
e0a0f37a11 Add docs for fusion strategy (#72036)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72036

Test Plan: Imported from OSS

Reviewed By: navahgar

Differential Revision: D33864651

Pulled By: eellison

fbshipit-source-id: b1010d97d95fd01603cc7e04a978857cbde3c4fb
(cherry picked from commit e83b008017e54c68c7be374085c3217e8e38cce5)
2022-02-01 19:07:02 +00:00
a55ef69e68 update default fusion strategy (#72038)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72038

Test Plan: Imported from OSS

Reviewed By: navahgar

Differential Revision: D33864653

Pulled By: eellison

fbshipit-source-id: 9f0d7fa5f72a901566fae937668d3a6ede2c4b03
(cherry picked from commit aeee43e8d92255d2926a7ce0f540ffb46a681d6a)
2022-02-01 19:07:02 +00:00
b44f724aef [nnc] Update cuda codegen to use llvm for thread and block extent computations (#72040)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72040

Test Plan: Imported from OSS

Reviewed By: navahgar

Differential Revision: D33865041

Pulled By: eellison

fbshipit-source-id: 41b4e648f69a048c7d84410da0f082ec3916f4f9
(cherry picked from commit 6be040e5fe3bb2ab6c0d3226a4c1def9f8a0730d)
2022-02-01 19:07:02 +00:00
27a4d39756 NNC Dynamic Channels last fixes (#72032)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72032

This contains a few channels last changes from benchmarking:
- don't permute back to channels last for dynamic shapes on CPU; perf is not good, and use cases for it are exotic at the moment
- remove the conditional-one handling when permuting a channels-last symbolic tensor on CUDA; it's not needed in the permutation case, as tests show
- remove logic in torch/csrc/jit/tensorexpr/loopnest.cpp that prevented inlining; the condition it checks is always valid given valid construction of the IR

I can split up as needed.

Test Plan: Imported from OSS

Reviewed By: navahgar

Differential Revision: D33864652

Pulled By: eellison

fbshipit-source-id: f16674fb02dfff22670d8a2f856c5a317fd15717
(cherry picked from commit a9a069783956802e9e2f30c7a06e8e2ca8d210a1)
2022-02-01 19:07:02 +00:00
59a6375639 [NNC] Add Tests for Dynamic Shape Fusion Change default fusion strategy (#71651)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71651

The only tests that regress do so because chunk is NYI; the other tests I touched were passing only because `assertAllFused` wasn't working correctly. Also, we're no longer compiling conv/matmul with dynamic shapes.

Test Plan: Imported from OSS

Reviewed By: navahgar

Differential Revision: D33801500

Pulled By: eellison

fbshipit-source-id: 074118ab4a975b7db876a4fcdfb9483afb879e79
(cherry picked from commit abaa7948c18bf2dc885efd9323a92449d321afbc)
2022-02-01 19:07:02 +00:00
f1499d6c18 Refactor PE so fusion specializations are configurable (#71650)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71650

Refactors PE so there is a current fusion strategy, which takes a vector, e.g. [(STATIC, 2), (DYNAMIC, 10)], meaning: fuse two static invocations, then fuse 10 dynamic ones, then stop specializing.
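A minimal sketch of how such a strategy list could be consumed (hypothetical Python model; the actual implementation lives in the profiling executor's C++):

```python
# Hypothetical sketch of consuming a fusion strategy like
# [(STATIC, 2), (DYNAMIC, 10)]: hand out 2 static specializations,
# then 10 dynamic ones, then stop specializing.
STATIC, DYNAMIC = "STATIC", "DYNAMIC"

class FusionStrategy:
    def __init__(self, strategy):
        self.strategy = list(strategy)
        self.index = 0
        self.remaining = self.strategy[0][1] if self.strategy else 0

    def next_specialization(self):
        """Return STATIC/DYNAMIC for the next compilation, or None to stop."""
        # advance to the next (kind, count) pair once the current one is used up
        while self.index < len(self.strategy) and self.remaining == 0:
            self.index += 1
            if self.index < len(self.strategy):
                self.remaining = self.strategy[self.index][1]
        if self.index >= len(self.strategy):
            return None
        self.remaining -= 1
        return self.strategy[self.index][0]

fs = FusionStrategy([(STATIC, 2), (DYNAMIC, 3)])
kinds = [fs.next_specialization() for _ in range(7)]
# kinds == ['STATIC', 'STATIC', 'DYNAMIC', 'DYNAMIC', 'DYNAMIC', None, None]
```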

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33801501

Pulled By: eellison

fbshipit-source-id: ebc7ac3c57e35a3b9bb15ab751f0aa1d25cc9bd5
(cherry picked from commit 8dd89088d3ceae800ea110d0b6949b759d4fe582)
2022-02-01 19:07:02 +00:00
cf1833df70 [WIP] add explicit dynamic fusion arg (#71173)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71173

Test Plan: Imported from OSS

Reviewed By: navahgar

Differential Revision: D33536222

Pulled By: eellison

fbshipit-source-id: a097408ecdd6e284432de128feb297993d882d52
(cherry picked from commit 0e3419b2d32e798539a35db9c47aa85f7487a524)
2022-02-01 19:07:02 +00:00
e305248a33 Add logspace test modules (#72052)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72052

Test Plan: tofu_demands_tests

Reviewed By: cccclai

Differential Revision: D33885099

fbshipit-source-id: 53eb12c6f4416d35f118a1357b714fec16d74157
(cherry picked from commit 2608cd937975f938c6bbc9ac5b5f9bf197ff1b95)
2022-02-01 18:43:00 +00:00
7e8217549f Added remove_duplicate parameter to nn.Module (#39)
Summary:
Pull Request resolved: https://github.com/pytorch/torchrec/pull/39

Pull Request resolved: https://github.com/facebookresearch/torchrec/pull/6

This makes it so that shared parameters get their own entry in `named_parameters`.

More broadly, this makes it so that
```
params_and_buffers = {**mod.named_parameters(remove_duplicate=False), **mod.named_buffers(remove_duplicate=False)}
_stateless.functional_call(mod, params_and_buffers, args, kwargs)
```
is identical to calling the original module's forward pass.
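A toy illustration of the `remove_duplicate` semantics (a hypothetical minimal model, not the actual `nn.Module` implementation): two names can refer to the same parameter object, and the flag controls whether the second name is yielded.

```python
# Hypothetical minimal model of named_parameters(remove_duplicate=...);
# object() stands in for a shared weight tensor.
class ToyModule:
    def __init__(self):
        shared = object()
        self._params = {"encoder.weight": shared, "decoder.weight": shared}

    def named_parameters(self, remove_duplicate=True):
        seen = set()
        for name, p in self._params.items():
            if remove_duplicate:
                if id(p) in seen:
                    continue  # skip the second name for a shared parameter
                seen.add(id(p))
            yield name, p

mod = ToyModule()
assert len(dict(mod.named_parameters())) == 1                        # deduplicated
assert len(dict(mod.named_parameters(remove_duplicate=False))) == 2  # own entries
```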

cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse SciPioneer H-Huang

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71542

Reviewed By: jbschlosser, albanD

Differential Revision: D33716716

Pulled By: Chillee

fbshipit-source-id: ff1ed9980bd1a3f7ebaf695ee5e401202b543213
(cherry picked from commit d6e3ad3cd0c694886d4d15a38876835e01f68134)
2022-02-01 18:34:58 +00:00
4567d5ded4 Upgrade oneDNN to v2.5.2 (#71546)
Summary:
This PR upgrades oneDNN to v2.5.2 and includes the corresponding build support.

v2.4 changes:
- Improved performance for future Intel Xeon Scalable processor (code name Sapphire Rapids). The functionality is disabled by default and should be enabled via CPU dispatcher control.
- Improved binary primitive performance for cases when one of the tensors is broadcasted.
- Improved performance of reduction primitive, reorder, shuffle primitives.
- Improved performance of depthwise convolution forward propagation for processors with Intel AVX-512 support
- Improved performance of forward inner product primitive for the shapes with minibatch equal to 1 for processors with Intel AVX-512 support
- Improved performance of int8 matmul and inner product primitives for processors with Intel AVX2 and Intel DL Boost support

v2.5 changes:
- Improved performance for future Intel Xeon Scalable processors (code name Sapphire Rapids). The functionality is now enabled by default and requires Linux kernel 5.16.
- Improved performance of matmul primitive for processors with Intel AVX-512 support.

v2.5.2 changes:
- Fixed performance regression in binary primitive with broadcast
- Fixed segmentation fault in depthwise convolution primitive for shapes with huge spatial size for processors with Intel AVX-512 support

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71546

Reviewed By: george-qi

Differential Revision: D33827108

Pulled By: VitalyFedyunin

fbshipit-source-id: 8f5a19b331c82af5b0783f081e061e1034a93952
(cherry picked from commit 9705212fe9b7b0838cc010d040c37d1175be83ce)
2022-02-01 18:34:58 +00:00
c61be5fb22 Add split_with_sizes converter (#71953)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71953

Add converter for split_with_sizes

Reviewed By: yinghai

Differential Revision: D33829024

fbshipit-source-id: 50de383797a347ef7afecfbda80b2c84e244e404
(cherry picked from commit f21ced40bfc04aadf5b80c48f84649510cc3a09b)
2022-02-01 18:30:54 +00:00
b28e696516 Update linspace and bump version number to 8 (#71486)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71486

This PR adds upgraders for linspace and linspace.out, as the optional `steps` argument will be deprecated soon. Old models will use a `steps` value of 100 when nothing is provided.
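A hedged sketch of the idea (the helper name and the linspace formula here are illustrative, not the actual upgrader code): when an old model omits `steps`, the upgrader substitutes the historic default of 100 so behavior is preserved after the default is removed.

```python
# Illustrative stand-in for the linspace upgrader's defaulting behavior;
# not the real torch implementation.
def upgrade_linspace_call(start, end, steps=None):
    if steps is None:
        steps = 100  # old implicit default baked in by the upgrader
    if steps == 1:
        return [start]
    # evenly spaced points including both endpoints
    return [start + (end - start) * i / (steps - 1) for i in range(steps)]

pts = upgrade_linspace_call(0.0, 1.0)
assert len(pts) == 100 and pts[0] == 0.0 and pts[-1] == 1.0
```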

Test Plan: buck-out/gen/caffe2/test/jit#binary.par -r TestUpgraders.test_aten_linspace

Reviewed By: cccclai, mruberry

Differential Revision: D33654308

fbshipit-source-id: 0e0138091da0b11d4f49156eeb6bcd7e46102a5b
(cherry picked from commit 931ae4af3200b37d1cebcb7f30e8ba880c1305ec)
2022-02-01 18:16:55 +00:00
5024c1bc7b Make get_file_pathnames_from_root output order deterministic (#70435)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/70103

I added an argument so it can be disabled. I called it `deterministic_order` because `sort` could be confusing: the output is actually sorted, but by directory levels.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70435

Reviewed By: albanD

Differential Revision: D33899755

Pulled By: ejguan

fbshipit-source-id: e8a08f03a49120333b2d27f332cd21a3240a02a9
(cherry picked from commit 4616e43ec30ba425585c041f8895196909f94d1b)
2022-02-01 18:12:23 +00:00
2c3ecb435e Automated submodule update: FBGEMM (#72116)
Summary:
This is an automated pull request to update the first-party submodule for [pytorch/FBGEMM](https://github.com/pytorch/FBGEMM).

New submodule commit: 1280f817bf

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72116

Test Plan: Ensure that CI jobs succeed on GitHub before landing.

Reviewed By: jasonjk-park

Differential Revision: D33919076

fbshipit-source-id: 8d27fd898af101494e4b54f9abfd27e6169cfd4d
(cherry picked from commit 1731bbd676f8bc739cdb5d9b50cb151816318484)
2022-02-01 18:01:41 +00:00
702d375df5 [pytorch] use cublas lt interface for bias fusion (#71200)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71200

To quantify how much cublas lt interface can help param bench (https://github.com/facebookresearch/param/) linear perf

On V100 GPU

for b in 512 1024; do for i in {1..5}; do param_bench/train/compute/pt/pytorch_linear.py --device gpu --dtype=float16 --hidden-size 1024 --batch-size ${b}; done; done

Before this commit
batch size 512: median 21.4 TF/s (20.7, 20.6, 21.8, 21.6, 21.4)
batch size 1024: median 40.1 TF/s (39.4, 39.3, 40.2, 40.4, 40.1)

After this commit
batch size 512: median 23.5 TF/s (23.2, 23.5, 23.8, 23.9, 23.6 ) 9.8% speedup
batch size 1024: median 41.6 TF/s (42.7, 41.6, 40.4, 41.3, 41.9 ) 3.7% speedup

Reviewed By: jasonjk-park, ngimel

Differential Revision: D32053748

fbshipit-source-id: accf787c8727a2f8fb16fae92de461367ac10442
(cherry picked from commit 254532ac451859982da07648431ccbea12e21397)
2022-02-01 17:55:25 +00:00
1a30954f44 CUDA TopK Optimization: use multiple block per slice (#71081)
Summary:
# Overview
Currently the CUDA topk implementation uses only one block per slice, which limits performance for big slices. This PR addresses that issue.

There are two parts to the topk calculation: find the kth value (`radixFindKthValues`) in each slice, then gather the topk values (`gatherTopK`) based on the kth value. The `radixFindKthValues` kernel now supports multiple blocks. `gatherTopK` may also need a multi-block version (separate PR?).

kthvalue, quantile, and median could also use the same code (separate PR).
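As a rough illustration of that two-phase structure (a plain-Python sketch per slice, not the CUDA kernels themselves):

```python
# Sketch of topk as two phases, mirroring radixFindKthValues + gatherTopK:
# phase 1 finds the kth-largest value, phase 2 gathers elements around it,
# taking only as many ties of the kth value as needed to reach k elements.
def topk_two_phase(row, k):
    kth = sorted(row, reverse=True)[k - 1]        # phase 1: find the kth value
    strictly_greater = [v for v in row if v > kth]
    ties_needed = k - len(strictly_greater)       # phase 2: gather with ties
    return strictly_greater + [kth] * ties_needed

assert sorted(topk_two_phase([5, 1, 4, 4, 2], 3), reverse=True) == [5, 4, 4]
```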

# Benchmark

Benchmark result with input `x = torch.randn((D1, D2), dtype=torch.float32)` and `k = 2000` on RTX 3080: https://docs.google.com/spreadsheets/d/1BAGDkTCHK1lROtjYSjuu_nLuFkwfs77VpsVPymyO8Gk/edit?usp=sharing

Benchmark plot: left is multi-block; right is dispatched based on the heuristic results from the above Google sheet.
<p class="img">
<img width=49%  src="https://user-images.githubusercontent.com/9999318/150860547-7e450ed2-df09-4292-a02a-cb0e1040eebe.png">
<img width=49%  src="https://user-images.githubusercontent.com/9999318/150860579-672b88ca-e500-4846-825c-65d31d126df4.png">
</p>

The performance of the divide-and-conquer implementation at https://github.com/pytorch/pytorch/pull/39850 is not stable as the D1 and D2 sizes increase; for more detail please check the above Google sheet.

<p>
<img width=49%  src="https://user-images.githubusercontent.com/9999318/150860563-21d5a5a3-9d6a-4cef-9031-cac4d2d8edee.png">
</p>

# cubin binary size
The cubin binary sizes for TensorTopK.cubin (topk) and Sorting.cubin (kthvalue, quantile, etc.) have been reduced by removing `#pragma unroll` at [SortingRadixSelect.cuh](https://github.com/pytorch/pytorch/pull/71081/files#diff-df06046dc4a2620f47160e1b16b8566def855c0f120a732e0d26bc1e1327bb90L321) and the `largest` template argument, without much performance regression.

The final binary size before and after the PR is
```
# master
-rw-rw-r-- 1 richard richard  18M Jan 24 20:07 TensorTopK.cu.1.sm_86.cubin
-rw-rw-r-- 1 richard richard  16M Jan 24 20:07 Sorting.cu.1.sm_86.cubin
# this PR
-rw-rw-r-- 1 richard richard 5.0M Jan 24 20:11 TensorTopK.cu.1.sm_86.cubin
-rw-rw-r-- 1 richard richard 2.5M Jan 24 20:11 Sorting.cu.1.sm_86.cubin
```

script to extract cubin
```
# build with REL_WITH_DEB_INFO=0
# at pytorch directory
cubin_path=build/caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/cubin; mkdir -p $cubin_path; cd $cubin_path; find ../ -type f -name '*cu.o' -exec cuobjdump {} -xelf all \; ; ls -lh *.cubin -S | head -70
```

# benchmark script
```py
import torch
import pandas as pd
import torch.utils.benchmark as benchmark

torch.manual_seed(1)
dtype = torch.float
data = []

for d1 in [1, 20, 40, 60, 80, 100, 200, 400, 800, 1000, 2000, 4000, 6000, 8000, 10000, 100000, 500000]:
    if d1 <= 1000:
        D2 = [100, 200, 300, 400, 800, 1000, 2000, 3000, 4000, 5000, 8000, 10000, 20000, 30000, 40000, 80000, 100000, 200000, 300000, 400000, 500000]
    else:
        D2 = [100, 200, 300, 400, 800, 1000, 5000, 10000, 20000, 30000]
    for d2 in D2:
        k = 2000 if d2 >= 2000 else d2 // 2
        print(f"----------------- D1 = {d1}, D2 = {d2} -----------------")
        try:
            x = torch.randn((d1, d2), dtype=dtype, device="cuda")
            m = benchmark.Timer(
                stmt='x.topk(k=k, dim=1, sorted=False, largest=True)',
                globals={'x': x, 'k': k},
                num_threads=1,
            ).blocked_autorange(min_run_time=1)
            print(m)
            time_ms = m.median * 1000
        except RuntimeError: # OOM
            time_ms = -1
        data.append([d1, d2, k, time_ms])

df = pd.DataFrame(data=data, columns=['D1', 'D2', 'k', 'time(ms)'])
print(df)
df.to_csv('benchmark.csv')
```

plot script could be found at: https://github.com/yueyericardo/misc/tree/master/share/topk-script

cc zasdfgbnm ngimel

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71081

Reviewed By: albanD

Differential Revision: D33823002

Pulled By: ngimel

fbshipit-source-id: c0482664e9d74f7cafc559a07c6f0b564c9e3ed0
(cherry picked from commit be367b8d076aebf53ab7511f6a8a86834c76c95b)
2022-02-01 17:43:51 +00:00
4aade95029 [PyTorch] Rework stat collection in CUDACachingAllocator (#71669)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71669

This was relatively inefficient. Rather than looping for each type of stat we want to update, we now do one loop covering all the stats.
ghstack-source-id: 148013645

Reviewed By: ngimel

Differential Revision: D33725458

fbshipit-source-id: 39ef5d65a73d4ef67f259de8c02c7df29487d990
(cherry picked from commit 7ca46689b72ba7611517447a292445571bd02dd7)
2022-02-01 17:24:51 +00:00
ca2ff12ea3 [PyTorch] Remove call_once from CUDACachingAllocator (#71668)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71668

As https://en.cppreference.com/w/cpp/thread/call_once mentions, function-local statics are probably more efficient.
ghstack-source-id: 148013646

Reviewed By: ngimel

Differential Revision: D33722954

fbshipit-source-id: a2737c2d6dfdd23b26cbe34574b80e3da0d4b8a4
(cherry picked from commit a6ddb24558f41aff12f76ba49a28d0a3082aec20)
2022-02-01 17:24:51 +00:00
da0423aa0b [PyTorch] Use a better hash table in CUDACachingAllocator (#71667)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71667

We have flat_hash_set because it performs better than std::unordered_set.
ghstack-source-id: 148013648

Reviewed By: ngimel

Differential Revision: D33720595

fbshipit-source-id: aa6077c474dd6fc61ce17e24ebde4056c8bae361
(cherry picked from commit 386082eaf1d4669c7967ba9cdf765d9d677f5cd9)
2022-02-01 17:24:51 +00:00
4b789df68b [SR] Add BlockRunner and handle sub-blocks (#69834)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69834

* Modify the `StaticModule` constructor to handle index initialization for sub-blocks.
* Add a new class `StaticRuntimeBlockRunner`. This class is almost exactly like what we've been calling `StaticRuntime` up to this point, except that it does not own a `values_` array. All `StaticRuntimeBlockRunners` hold an unowned reference to a `values_` array owned by `StaticRuntime`. This is a useful abstraction for implementing control flow - it gives us a way for sub-blocks to look up values from surrounding scopes!
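A hedged toy model of the ownership scheme described above (names are illustrative, not the real Static Runtime classes): the runtime owns a single values array; each block runner holds an unowned reference to it, so sub-blocks can read values produced in enclosing scopes.

```python
# Toy model of StaticRuntime owning values_ while block runners share it;
# slot indices and class names here are hypothetical.
class BlockRunner:
    def __init__(self, values, input_slots, output_slot):
        self.values = values          # unowned reference, shared with the runtime
        self.input_slots = input_slots
        self.output_slot = output_slot

    def run(self):
        # a sub-block reading values produced in the surrounding scope
        self.values[self.output_slot] = sum(self.values[i] for i in self.input_slots)

class Runtime:
    def __init__(self, num_values):
        self.values = [None] * num_values  # the single owning array
        self.root = BlockRunner(self.values, input_slots=[0, 1], output_slot=2)

rt = Runtime(3)
rt.values[0], rt.values[1] = 10, 32
rt.root.run()
assert rt.values[2] == 42              # the sub-block saw the outer values
```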
ghstack-source-id: 148086245

Test Plan: `buck test caffe2/benchmarks/static_runtime/...`

Reviewed By: d1jang

Differential Revision: D33028039

fbshipit-source-id: 4f01417bad51a0cf09b1680a518308da647be1f6
(cherry picked from commit 3a9feffd929869120c717d35aa55aad8a382783d)
2022-02-01 17:20:55 +00:00
7bb614fc71 Simplify TensorImpl size check and fix error message (#72070)
Summary:
Today, the enum is ignored and the generic assert within the equal function is used, leading to an error message with no information when it fails.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72070

Reviewed By: bdhirsh

Differential Revision: D33893602

Pulled By: albanD

fbshipit-source-id: 4bc644e9232cbf0bafef22d713948915eb6964ff
(cherry picked from commit bdcc5f5f476f3b9ccd2068f365a734b7df756f02)
2022-02-01 17:09:35 +00:00
ba8d5f6f75 [JIT] FuseLinear pass now handles CallFunction("linear", ...) (#61646)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61646

There are several passes which are written to handle both
`CallFunction("linear", ...)` and `aten::linear(...)` despite the two being
functionally identical.

This changes `FuseLinear` to also normalize the `CallFunction` variant to
`aten::linear`. That way each subsequent transformation only has to handle one
form instead of both.

Test Plan: Imported from OSS

Reviewed By: mikaylagawarecki

Differential Revision: D33754261

Pulled By: albanD

fbshipit-source-id: 42465cea790538481efc881a249dafdda4bba5d4
(cherry picked from commit ebeca9434caf74c5e75f61b98db443779fe5c6a9)
2022-02-01 16:59:26 +00:00
e8d226cd9a Remove some unnecessary python functional wrappers (#61608)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61608

See #61544 for an example of issues created by functional wrappers. In this
case, these are directly wrapping the native function with no added
functionality. One exception was `bilinear` which was just missing the default
argument in C++, but was otherwise the same.

I've kept the symbol `torch.functional.istft` because it looks like public API,
but it could just as easily be moved to `_torch_docs.py`.

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D31401361

Pulled By: albanD

fbshipit-source-id: 162b74d0b2d4f2e5c4834687a94541960cefdd52
(cherry picked from commit 700cd73ca121d903f04f539af171d3f768565921)
2022-02-01 16:59:26 +00:00
7ea96a7293 [quant][fx] Don't assume bias is a keyword-argument (#71426)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71426

dbr quantization makes faulty assumptions about which arguments are
passed as keyword arguments and which are passed as positional
arguments. This happens to work currently due to a quirk of how
`__torch_function__` is implemented in python functions, but will
break when the operators are moved to C++.

Test Plan: Imported from OSS

Reviewed By: george-qi

Differential Revision: D33754262

Pulled By: albanD

fbshipit-source-id: 63515d7a166449726e1beaba6659443b6261742d
(cherry picked from commit f7b18848455cd95872b2b658111206b71ce4b3f7)
2022-02-01 16:59:26 +00:00
a5e27c45dc Use new_empty in dropout (#72078)
Summary:
This will be needed by functorch to have the expected behavior of randomness:
Dropout generates a tensor of the right size and then calls `bernoulli_` on that. In order to get the expected behavior from ensembled creation, we'll need to make sure that the generated tensor is a batched tensor.This works mostly because most tensors are created as `empty_like` but this one just creates `empty` because it needs a new shape, only for feature dropout. There is also no analogous version in CUDA because this directly calls`_dropout_impl` here (not in native_functions.yaml)

This shouldn't change the behavior outside of functorch

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72078

Reviewed By: zou3519

Differential Revision: D33898338

Pulled By: samdow

fbshipit-source-id: 9d9ed59d138d732d9647b2771ccf2ea97cffae1c
(cherry picked from commit e51cf3ebf2c80a65296c7513576042dd58e0de28)
2022-02-01 16:54:43 +00:00
44e2b8da28 Automated submodule update: FBGEMM (#72068)
Summary:
This is an automated pull request to update the first-party submodule for [pytorch/FBGEMM](https://github.com/pytorch/FBGEMM).

New submodule commit: 35d4dd4eb3

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72068

Test Plan: Ensure that CI jobs succeed on GitHub before landing.

Reviewed By: malfet

Differential Revision: D33892960

fbshipit-source-id: 462b24ab3a81862bbfdc8e80fe07ea262e11829f
(cherry picked from commit c5d2b40fa61e185fab1237c07a0ddc875bcb9203)
2022-02-01 16:24:53 +00:00
58dabebcd7 improve quantized error checking for structured kernels (#71928)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71928

Test Plan: Imported from OSS

Reviewed By: wconstab, bhosmer

Differential Revision: D33823417

Pulled By: bdhirsh

fbshipit-source-id: e894b9724833b77b12963cc4bf194bc6ce526ad9
(cherry picked from commit 6be10b79e7b3ff59aa8d6ca7cf86fc73f545933a)
2022-02-01 16:09:45 +00:00
f20fa66f70 Revert "[fix] max_pool1d: composite compliance (#70900)" (#71992)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71992

This reverts commit b7222e15b6a457099b74420e29b3a39a3e8b5f1a.

We are conservatively reverting this because it broke a test in functorch.
The original PR added a `_max_pool1d_cpu` operator. I'm not sure if it
is actually safe to revert this due to the addition of the new operator
(someone may have serialized it between now and then) but because it has
only been two weeks this should be fine.

Test Plan: - wait for tests

Reviewed By: jbschlosser, VitalyFedyunin

Differential Revision: D33882918

Pulled By: zou3519

fbshipit-source-id: f146e82e6b46690376b3d8825dc7f7da62e2c7de
(cherry picked from commit 1606333e6ce23d618863a9b0e504352bd55569bc)
2022-02-01 15:07:21 +00:00
1cc824ef59 Fix old GCC ABI check in CMake package config (#72081)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72081

This PR fixes the libstdc++ ABI check in the CMake package configuration file (i.e. `TorchConfig.cmake`). The `_GLIBCXX_USE_CXX11_ABI` flag is a property of `libstdc++`, not of the GNU Compiler Collection. In its current form, C++ libraries built with Clang on Linux fail, since the `torch` CMake target propagates `_GLIBCXX_USE_CXX11_ABI` only when used with gcc.
ghstack-source-id: 148056323

Test Plan: Built a dummy C++ library that depends on libtorch with both gcc and clang on Linux

Reviewed By: malfet

Differential Revision: D33899849

fbshipit-source-id: 3e933b2c7a17d1fba086caa8aaec831223760882
(cherry picked from commit 41d18c64c4e88db615ecf6f3ef973bd8f985377a)
2022-02-01 13:21:00 +00:00
c93d6f90c9 Revert #62143, the new CUDNN_RNN_ALGO_PERSIST_STATIC_SMALL_H algorithm (#72089)
Summary:
Revert "[cuDNN] Add a new optimized cuDNN RNN algorithm for small RNN hidden_size (https://github.com/pytorch/pytorch/issues/62143)"

This reverts commit 965b9f483ef99f98af8a5be0e751d41e5ef0efdc.

This new cuDNN RNN algorithm is causing some failures in our internal testing.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72089

Reviewed By: mruberry

Differential Revision: D33905226

Pulled By: ngimel

fbshipit-source-id: 5563a2c275e697477cf79bada3b81a33f1bf2aaa
(cherry picked from commit 35c240a8dc4ac65add84e30da1dde33402333892)
2022-02-01 05:33:29 +00:00
bb456d2bf7 Split cuda: list cpp files that go in _cu library explicitly (#69082)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69082

Test Plan: Imported from OSS

Reviewed By: dagitses, mruberry

Differential Revision: D32723669

Pulled By: malfet

fbshipit-source-id: e9a815b882089dcf1ba76194c728cd7c45377deb
(cherry picked from commit 456d1ebeb8080e8082b9760fdd072390b55b9b2a)
2022-02-01 02:45:01 +00:00
e784808bc6 DOC: create 1.12 docs from a tag like v1.12.2rc1 (#71985)
Summary:
brianjo, malfet

The documentation team would prefer the documentation versions to only have a major.minor version, not major.minor.patch. See also pytorch/pytorch.github.io#921

The regex can be tested by this bash 1-liner (where $tag is something like `v10.1225.0rc1`)
```
echo $tag | sed -e 's/v*\([0-9]*\.[0-9]*\).*/\1/'
```

I have lost track a bit, is the CI run for a tag actually building and pushing documentation?

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71985

Reviewed By: mrshenli

Differential Revision: D33845882

Pulled By: malfet

fbshipit-source-id: 3cb644d8b01f5ddf87c0ac7c43e23e9fd292d660
(cherry picked from commit f884bd86740547e3164adde7bdc6318b944f9bdb)
2022-02-01 01:18:29 +00:00
a319bce58d Make sure we set GITHUB token in the header for pr-label GHA (#72085)
Summary:
Make sure we set GITHUB token in the header for pr-label GHA

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72085

Reviewed By: seemethere

Differential Revision: D33904391

Pulled By: atalman

fbshipit-source-id: 039130a4f94070d78186b018696f53fad6142a8a
(cherry picked from commit f42c74b03c4d37f980d831b4365c6dc0e3fd1613)
2022-02-01 00:08:35 +00:00
cf70466970 [ONNX] Improve scope inference in function extraction
Covers more cases of scope inference where consecutive nodes don't have valid scope information. Usually these nodes are created in some pass whose authors forgot to assign a meaningful scope to them.
* One rule of `InferScope` is to check if the current node's outputs' users share the same scope. Recursively run `InferScope` on the user nodes if they are missing scope as well. Since the graph is SSA, the depth is finite.
* Fix one pass that missed scope information for a new node.
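A hedged sketch of that recursion (a toy node model, not the actual JIT IR classes): a node missing a scope takes the scope shared by all users of its outputs, recursing into users that themselves lack a scope; SSA form keeps the recursion finite.

```python
# Toy model of the InferScope rule: a scope-less node inherits the scope
# its users agree on, recursing into scope-less users.
def infer_scope(node):
    if node.scope is not None:
        return node.scope
    user_scopes = {infer_scope(u) for u in node.users}
    user_scopes.discard(None)
    if len(user_scopes) == 1:   # all users agree on a single scope
        return user_scopes.pop()
    return None                 # ambiguous or no users: leave the scope unset

class Node:
    def __init__(self, scope=None, users=()):
        self.scope, self.users = scope, list(users)

leaf = Node(scope="Model/Linear")
mid = Node(users=[leaf])        # missing scope, inferred from its user
root = Node(users=[mid])        # inferred through the scope-less user
assert infer_scope(root) == "Model/Linear"
```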
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71897
2022-01-31 23:58:53 +00:00
a83cf17807 Composite compliance for gather_backward (#71766)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71766

No need for a tensorSubclassLike check here (for functorch at least),
changing the zeros to new_zeros is sufficient.

Test Plan: - tested with functorch

Reviewed By: anjali411

Differential Revision: D33772752

Pulled By: zou3519

fbshipit-source-id: 5779a1c20b032d00a549c58ff905cf768f10467f
(cherry picked from commit a927c664d601d0b1cbbd3cda7dc297364c1d9e94)
2022-01-31 23:52:37 +00:00
25f5f2cd06 Composite compliance for index_put (#71765)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71765

This PR gives index_put the same treatment as
https://github.com/pytorch/pytorch/pull/71751.

Test Plan: - wait for tests

Reviewed By: albanD

Differential Revision: D33772755

Pulled By: zou3519

fbshipit-source-id: 441c9e9188a35ce1a04a337d1c3fdaf87846acf6
(cherry picked from commit 4a168ac40d5c0c6464e1c4fd609f97e5c5e57176)
2022-01-31 23:52:37 +00:00
1f2751cdea Composite compliance for index_copy, index_fill, masked_scatter, masked_fill (#71751)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71751

Before this PR, each of the above operations were composite and had
in-place variants that were primitive w.r.t. autograd.

The backward passes are not composite compliant due to the op (e.g.
index_copy) decomposing into index_copy_ and index_copy_'s backward
formula having in-place operations in it. To fix this, for each of the
ops mentioned in the title:
- I deleted the autograd formula for the in-place variant and replaced
it with the out-of-place variant
- This makes the forward-ad formula slightly slower because the codegen
generates a temporary, but offline discussion suggests it's not worth
maintaining two sets of formulas for this, and we can make the autograd
codegen smarter in the future.
- I then replaced instances of grad.clone().inplace_variant_(...) with
grad.outplace_variant(...)

Test Plan:
- run existing tests to check correctness
- run functorch tests

Reviewed By: anjali411

Differential Revision: D33772756

Pulled By: zou3519

fbshipit-source-id: fd22fe1d542e6e2a16af0865c2ddce0e65c04d70
(cherry picked from commit d025ba03270d53e19b2e68e8dd7ae49f2bb84532)
2022-01-31 23:52:37 +00:00
c62b515691 Make diag_embed a primitive w.r.t. autograd (#71750)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71750

functorch has problems vmap-ing over diag_embed due to the in-place copy_.
This PR adds a backward formula for it so that it becomes a primitive
w.r.t. autograd.

Test Plan: - tested with functorch

Reviewed By: anjali411

Differential Revision: D33772753

Pulled By: zou3519

fbshipit-source-id: da8ff3a10a1de1d60e6de6292003079d4b5ba861
(cherry picked from commit afe9059bfb1f2856e463e6ae988ec0ae86fdd470)
2022-01-31 23:52:37 +00:00
184b78c4c1 [acc_ops] Move slice_tensor to consider single dim at a time (#5906)
Summary:
Pull Request resolved: https://github.com/pytorch/glow/pull/5906

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71883

Fixes slice_tensor retracing. Include fix for retrace coverage.

Missed in D33760455 (66939e3b94).

Test Plan: CI

Reviewed By: wushirong

Differential Revision: D33802222

fbshipit-source-id: 4e0e44ae4a4eb70b99d79f0cd582182031b87e25
(cherry picked from commit 98fd23cf7b93ad22667023eaa696ef7d89f96147)
2022-01-31 23:37:36 +00:00
082ff25f37 [reland][bc-breaking][quant][be] Refactor fuser_method to include is_qat argument" (#71956)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71956

Pull Request resolved: https://github.com/facebookresearch/mobile-vision/pull/59

Original commit changeset: f3912e210e8c

Original Phabricator Diff: D33178977 (ef501e8fed)

Test Plan:
Please see original diff for test plans

**Static Docs Preview: classyvision**
|[Full Site](https://our.intern.facebook.com/intern/staticdocs/eph/D33833203/V3/classyvision/)|

|**Modified Pages**|

Reviewed By: andrewor14

Differential Revision: D33833203

fbshipit-source-id: 74a8f22730b00aafa6a173b208e635c1d696959e
(cherry picked from commit fb88772b18b26141be11f3885af6294eb1bc8466)
2022-01-31 23:02:22 +00:00
847dbb8684 CMake: Clean up unused definitions (#69216)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69216

This cleans up 4 pre-processor defines not used by any code:
- HAVE_GCC_GET_CPUID
- USE_GCC_GET_CPUID
- USE_AVX
- USE_AVX2

`cpuid` isn't used in PyTorch any more, we only use `cpuinfo`.
`USE_AVX*` is also not used, instead `HAVE_*_CPU_DEFINITIONS` tells
you which `CPU_CAPABILITY` flags are being compiled.

There is also `fbgemm`'s code path adding `third_party` as an include
path, despite `fbgemm` having a dedicated include directory and a
CMake setup that properly includes it.

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33794424

Pulled By: malfet

fbshipit-source-id: 99d504af088818d4a26c2f6ce67ec0d59a5eb703
(cherry picked from commit 2e099d41f0e2f7d96c6013ac83223a75f4e4f862)
2022-01-31 22:49:11 +00:00
d693739248 CMake: Clean up unused definitions (#69216)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69216

Currently `torch_cpu` has command-line arguments relating to CUDA
libraries, e.g. `-DMAGMA_V2`. This happens because
`include_directories` and `add_definitions` indiscriminately change
the compile commands of all targets.

Instead, creating a proper magma target limits the flags to
just `torch_cuda`.

Test Plan: Imported from OSS

Reviewed By: dagitses

Differential Revision: D33794174

Pulled By: malfet

fbshipit-source-id: 762eabf3b9576bef94e8caa3ed4764c0e2c72b08
(cherry picked from commit f7d127b654330e3b37a134200571122aab08079b)
2022-01-31 22:49:11 +00:00
d46256bd7c [skip ci] Remove unused outdated .circleci bazel_definitions file (#71943)
Summary:
Small clean-up: this file is no longer necessary after migrating to GHA, so remove it.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71943

Test Plan: running .circleci/regenerate.sh yields no config changes

Reviewed By: malfet

Differential Revision: D33901182

Pulled By: janeyx99

fbshipit-source-id: e8ff16395c81be25dae5b84619c6b4bfe749ada2
(cherry picked from commit e564c1ed5e2b23db537c25f9312647f13a10ab15)
2022-01-31 22:49:11 +00:00
a1b4410964 Add owners to custom test infra (#72080)
Summary:
Follow-up from today's meeting, where it was not clear who owns these processes.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72080

Reviewed By: malfet

Differential Revision: D33898729

Pulled By: janeyx99

fbshipit-source-id: 79d0e8210b8a6b9876eb50af448e6967a88d38bf
(cherry picked from commit 57cd82ef02c8192154d644af317a51d5f6d2f9e8)
2022-01-31 22:08:03 +00:00
871e240e63 Improved error message for interpolation (#72066)
Summary:
Description:
- Improved error message for CUDA interpolation with antialiasing

jbschlosser, could you please check this PR and confirm whether the error message wording is clearer now? Thanks.
I'm skipping all the tests for now; once we agree on the wording (and after any required updates), I'll restart the tests to ensure nothing is broken.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72066

Reviewed By: VitalyFedyunin

Differential Revision: D33892729

Pulled By: jbschlosser

fbshipit-source-id: 6249c7a1c51aa2e242f4bb8bfbe3f2abab17a8e8
(cherry picked from commit 44eb5391cf4fed54b379e96dfa9f23ef6ab1ecfa)
2022-01-31 20:50:42 +00:00
6714d039a1 [bug fix] for add_activation layer, mobilenetv2 is fixed (#71979)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71979

As titled.
Adds a local TRT test for easy verification.

Test Plan:
buck run mode/opt  -c=python.package_style=inplace   scripts/wwei6:trt_local_test
buck test mode/dev-nosan caffe2/test/fx2trt/converters:test_hardtanh

Reviewed By: 842974287

Differential Revision: D33824456

fbshipit-source-id: d824b7da09929de66190fd8a077d4e73b68b9909
(cherry picked from commit 19abcadecc6ff8b58991552a874230a068294e0d)
2022-01-31 19:44:22 +00:00
689c218c36 [caffe2] Rename c10d::detail::vformat to resolve conflict with fmt (#72039)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72039

As it turned out, naming the function `vformat` was a bad idea: it caused a subtle compilation error due to a conflict with `fmt::vformat`, with the wrong function overload being found during lookup.

(Note: this ignores all push blocking failures!)

Test Plan:
```
buck build //caffe2:libtorch
```

Reviewed By: cbalioglu

Differential Revision: D33864790

fbshipit-source-id: 08f8a1cdb5dfe72707a00a4ab7a859ea0d33b847
(cherry picked from commit 6fbca57d5e76dea88e1fe60431c5a42ab3ff738b)
2022-01-31 19:32:58 +00:00
dbd090d610 .github: Change binary build workflow trigger (#71890)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71890

Makes it so that ciflow is the way to trigger binary builds instead of
doing both pushes and ciflow

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

Test Plan: Imported from OSS

Reviewed By: mrshenli

Differential Revision: D33851317

Pulled By: seemethere

fbshipit-source-id: 5e357bddfe004b996e2e1a9336dbbd622321a83d
(cherry picked from commit 11e061d89c5c9ee2a9fc168b367373f68c1946ec)
2022-01-31 18:33:40 +00:00
34494e6252 Back out "Create torch.distributed.shard package." (#72062)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72062

Original commit changeset: dc692b31e260

Original Phabricator Diff: D33755913 (87bbcf70f7)

Test Plan: CI

Reviewed By: pbelevich

Differential Revision: D33891115

fbshipit-source-id: 37286e03d743d8691319f07c95e9561d54f3d6d0
(cherry picked from commit 0c1b3fe00848a275d44d8c91fba91d3df6d4927f)
2022-01-31 18:29:27 +00:00
bb6b501aa0 Back out "[pytorch][PR] Fix SVD error code handling for OpenBLAS 0.3.15+ and MKL 2022+" (#72063)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72063

Original commit changeset: fd1c86e37e40

Original Phabricator Diff: D33844257 (2017b404ec)

Test Plan: CI

Reviewed By: pbelevich

Differential Revision: D33890846

fbshipit-source-id: df9eea497038ec256893a6ce7c3dcd645441b50d
(cherry picked from commit c9bf2ba5e708f70227443121fdc51b4bdd7f54c7)
2022-01-31 18:09:45 +00:00
95d71ed212 Run the pr-label check on PR closed action and validate closed_by (#71917)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/68459
This PR implements a reminder to assign release notes and topic labels to each PR when it is merged. Here is an example of the message posted on the issue related to the PR:

> Hey atalman. You merged this PR, but no release notes category and topic labels were added.
>
> The list of valid release and topic labels is available at https://github.com/pytorch/pytorch/labels?q=release+notes+or+topic

Tested by manually running the process_commit.py script in standalone mode, passing commit_hash = "e020414cb25cd763103f77a10c6225ce27cbbb6e", which should resolve to this PR.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71917

Reviewed By: malfet

Differential Revision: D33847058

Pulled By: atalman

fbshipit-source-id: 370e0928b792df721b216a8e08b22253f03abda3
(cherry picked from commit dfa86f440f155a3328ad4149a92ea48fcd72f158)
2022-01-31 17:44:19 +00:00
74c44ba9d6 Revert D33850228: [pytorch][PR] Implement Tanh Gelu Approximation
Test Plan: revert-hammer

Differential Revision:
D33850228 (23d03025dc)

Original commit changeset: 3cc33fb298e4

Original Phabricator Diff: D33850228 (23d03025dc)

fbshipit-source-id: 9436e7df73c2b2e2011f321674f24973316d3692
(cherry picked from commit c9efb582233fec9abfe86bb85d2b496047bf62a7)
2022-01-31 17:44:19 +00:00
5045c18bd1 Error if pocketfft is not found (#67909)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/67842

cc mruberry peterbell10

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67909

Reviewed By: albanD

Differential Revision: D33759534

Pulled By: malfet

fbshipit-source-id: 03548c95fe233b812b303ce9603c20ff9f626c39
(cherry picked from commit 214624e254770b1b160d30b000cc244b0c8601b4)
2022-01-31 17:29:48 +00:00
23d03025dc Implement Tanh Gelu Approximation (#61439)
Summary:
1. Implements https://github.com/pytorch/pytorch/issues/39853
2. Adds approximate boolean flag to Gelu
3. Enables Tanh Gelu approximation
4. Adds double backward support for Gelu
5. Enable Tanh Gelu in NvFuser

```
def gelu(x, approximate : str = 'none'):
    if approximate == 'tanh':
        # sqrt(2/pi) = 0.7978845608028654
        return 0.5 * x * (1.0 + torch.tanh(0.7978845608028654 * (x + 0.044715 * torch.pow(x, 3.0))))
    else:
        return x * normcdf(x)
```
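
For reference, the snippet above can be made runnable in plain Python; the hedged assumption here is that `normcdf` denotes the standard-normal CDF, Phi(x) = 0.5 * (1 + erf(x / sqrt(2))), which matches the exact GELU definition:

```python
import math

def gelu_ref(x, approximate="none"):
    # Pure-Python reference of the snippet above; `normcdf` is taken to
    # be the standard-normal CDF, Phi(x) = 0.5 * (1 + erf(x / sqrt(2))).
    if approximate == "tanh":
        # sqrt(2/pi) = 0.7978845608028654
        return 0.5 * x * (1.0 + math.tanh(0.7978845608028654
                                          * (x + 0.044715 * x ** 3)))
    return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
```

The two branches agree closely for moderate inputs, which is the point of the approximation.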

Linking XLA PR - https://github.com/pytorch/xla/pull/3039

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61439

Reviewed By: cpuhrsch

Differential Revision: D33850228

Pulled By: jbschlosser

fbshipit-source-id: 3cc33fb298e480d7ecc5c67716da019d60c6ab33
(cherry picked from commit 3a53b3e94fd58190d1261efd3cf41b53506fb96e)
2022-01-31 17:07:45 +00:00
ca61292465 Add append method for nn.Sequential (#71326)
Summary:
Partially addresses https://github.com/pytorch/pytorch/issues/71249, and potentially supersedes https://github.com/pytorch/pytorch/pull/20274.
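
A toy, torch-free sketch of the append semantics being added (the `SequentialLike` class is illustrative only, not the actual `torch.nn.Sequential` implementation):

```python
class SequentialLike:
    """Toy ordered container of callables, mimicking the chain-of-modules
    idea behind nn.Sequential (illustrative, not the torch API)."""
    def __init__(self, *fns):
        self._fns = list(fns)

    def append(self, fn):
        # Append a new stage to the end of the chain.
        self._fns.append(fn)
        return self  # returning self allows chained appends

    def __call__(self, x):
        # Run the input through every stage in insertion order.
        for fn in self._fns:
            x = fn(x)
        return x
```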

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71326

Reviewed By: cpuhrsch

Differential Revision: D33855047

Pulled By: jbschlosser

fbshipit-source-id: a3a682e206f93b4c52bc3405e2f7b26aea6635ea
(cherry picked from commit c0b27bbf2a12c970abef4c64e419bfc4840aa8ea)
2022-01-31 16:54:12 +00:00
72c972e1e1 Fix bug in linspace model generation (#72027)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72027

Test Plan: build is successful

Reviewed By: cccclai

Differential Revision: D33858104

fbshipit-source-id: 77a5d4ab15efc21f128efbba1fcce63c6dea8018
(cherry picked from commit b03b9d6b2f5435afd22ac66155dd8542e55bb5da)
2022-01-31 05:13:30 +00:00
1e4aefaa2f Revert D33834916: Set correct device id on efficientzerotensors
Test Plan: revert-hammer

Differential Revision:
D33834916 (a18cfb790d)

Original commit changeset: 11cec343e95e

Original Phabricator Diff: D33834916 (a18cfb790d)

fbshipit-source-id: 3d3f60b760b445383768161b1d21ea4dadbe5d7c
(cherry picked from commit eba41aa646461aa341e3a629f668d327581b2f9c)
2022-01-31 03:49:56 +00:00
6208c2800e torch/monitor: merge Interval and FixedCount stats (#72009)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72009

This simplifies the Stats interface by merging IntervalStat and FixedCountStat into a single Stat with a specific window duration and an optional maximum number of samples per window. This preserves the original intention of having comparably sized windows (for statistical purposes) while also having a consistent output bandwidth.
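
A rough, torch-free sketch of the merged design (names and aggregates are illustrative, not the actual `torch.monitor` API):

```python
import time

class Stat:
    """Windowed stat sketch: samples are aggregated into fixed-duration
    windows, with an optional cap on samples per window to bound output
    bandwidth. Illustrative only, not torch.monitor."""
    def __init__(self, window_seconds, max_samples=None):
        self.window_seconds = window_seconds
        self.max_samples = max_samples
        self._window_start = None
        self._samples = []
        self.closed_windows = []

    def add(self, value, now=None):
        now = time.monotonic() if now is None else now
        if self._window_start is None:
            self._window_start = now
        # Close the current window once its duration has elapsed.
        if now - self._window_start >= self.window_seconds:
            self._flush(now)
        # Drop samples beyond the per-window cap.
        if self.max_samples is None or len(self._samples) < self.max_samples:
            self._samples.append(value)

    def _flush(self, now):
        if self._samples:
            s = self._samples
            self.closed_windows.append(
                {"count": len(s), "sum": sum(s), "min": min(s), "max": max(s)})
        self._samples = []
        self._window_start = now
```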

Test Plan:
```
buck test //caffe2/test:monitor //caffe2/test/cpp/monitor:monitor
```

Reviewed By: kiukchung

Differential Revision: D33822956

fbshipit-source-id: a74782492421be613a1a8b14341b6fb2e8eeb8b4
(cherry picked from commit 293b94e0b4646521ffe047e5222c4bba7e688464)
2022-01-30 23:21:59 +00:00
a18cfb790d Set correct device id on efficientzerotensors (#71611)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71611

Fixes https://github.com/pytorch/pytorch/issues/71160 https://github.com/pytorch/pytorch/issues/69925

Test Plan: Imported from OSS

Reviewed By: george-qi

Differential Revision: D33834916

Pulled By: anjali411

fbshipit-source-id: 11cec343e95e2ee188ab7576f26f64aa19317891
(cherry picked from commit f6e86f8a6b3924b67533183bd0b31e7ea4fcd2a9)
2022-01-30 20:53:15 +00:00
784bd92340 Use upgrader_mobile.cpp as the reference for codegen unittest (#71930)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71930

Previously `fbcode/caffe2/test/mobile/test_upgrader_bytecode_table_example.cpp` was checked in as an intermediate step to make sure the upgrader codegen works properly, before the upgrader codegen was actually being used.

This change uses `buck run mode/opt //caffe2/torch/fb/mobile/upgrader_codegen:upgrader_codegen` to generate `upgrader_mobile.cpp`, so we no longer need the checked-in file `test_upgrader_bytecode_table_example.cpp` for the codegen unit test.
ghstack-source-id: 147957826

Test Plan:
```
buck test mode/opt //caffe2/test:upgrader_codegen
```

Reviewed By: tugsbayasgalan

Differential Revision: D33746264

fbshipit-source-id: 18de3cae53aed966e67f8dc42976a2d10d3788b3
(cherry picked from commit 661ffa786063d3e47cd7bcbe16b3baf2fff74808)
2022-01-30 03:11:32 +00:00
af65634d1c Move generated keyword out of gen_mobile_upgraders.py (#71938)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71938

The `generated` keyword marks the file as generated and hides its changes in review. It's also misleading, because `gen_mobile_upgraders.py` itself is not autogenerated. Move the keyword out of `gen_mobile_upgraders.py` so its changes are easier to see.
ghstack-source-id: 147957825

Test Plan:
```
buck run mode/opt //caffe2/torch/fb/mobile/upgrader_codegen:upgrader_codegen
```

Reviewed By: tugsbayasgalan

Differential Revision: D33826982

fbshipit-source-id: 593c19f8ef4c9da776b11650863dc43c0b171cd5
(cherry picked from commit 43038d5bc7a41312a005d62f432c5ca19ed79f21)
2022-01-30 03:11:32 +00:00
815532d40c Unsqueeze ops to reduce the number of reshapes we use in LTC (#72011)
Summary:

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72011

Reviewed By: gmagogsfm

Differential Revision: D33855760

Pulled By: Krovatkin

fbshipit-source-id: abe5572567c8f7746e7b06a552dfbe5566c3d3ce
(cherry picked from commit 8eac12685f17a145e1d5d78fcf0d65131248c5c3)
2022-01-29 04:21:36 +00:00
7a69752c27 Make upgrader test model generation more robust (#72030)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72030

Test Plan: Imported from OSS

Reviewed By: mrshenli

Differential Revision: D33863263

Pulled By: tugsbayasgalan

fbshipit-source-id: 931578848ba530583008be6540003b2dcf4d55ce
(cherry picked from commit 67cd085104631264eb12c2c808eb4ed7b973a652)
2022-01-29 02:54:02 +00:00
87bbcf70f7 Create torch.distributed.shard package. (#71742)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71742

We have many sharding components currently:
torch.distributed._sharded_tensor, torch.distributed._sharding_spec,
torch.distributed._sharded_optimizer and more coming.

As a result, organizing all of this under the `torch.distributed.shard`
package. For BC reasons, I'm still keeping the old packages and have them just
reference the new package.
ghstack-source-id: 147899768

Test Plan: waitforbuildbot

Reviewed By: fduwjj, wanchaol

Differential Revision: D33755913

fbshipit-source-id: dc692b31e2607063d55dfcb3db33ec53961d5a5b
(cherry picked from commit 5b6885f3587786217f8ce143f2329ceec618404e)
2022-01-29 00:48:06 +00:00
db370b7a1e [warnings][caffe2] Fix broken asserts (never trigger) (#72014)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72014

`assert("string")` evaluates as `assert(true)` and thus never fires (oops!)
`assert(false && "string")` is the prescribed and supported way clang supports asserting "never" so that a string can be captured

Test Plan: ci pass

Differential Revision: D33824206

fbshipit-source-id: 223443f7ebecd78e1732c13ebb4ae416c0a0b11a
(cherry picked from commit 8e3721d0dc6adb92a9baed96552959f71e27cca4)
2022-01-29 00:48:06 +00:00
8ca7484ce7 [FIX] Enable TORCH_CHECK again (#71971)
Summary:
Following https://github.com/pytorch/pytorch/pull/71947, `TORCH_CHECK` should now be fixed.

cc: albanD NSProgrammer

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71971

Reviewed By: mrshenli

Differential Revision: D33845192

Pulled By: albanD

fbshipit-source-id: 020cbe1504ef6dd54f703d7bbc57c2cd22253363
(cherry picked from commit a154fdfb727f3762a8e1c9c71e48222ecdb3966e)
2022-01-29 00:48:06 +00:00
d68c314b13 [warnings][caffe2] Fix asserts yielding -Wstring-conversion warnings (#72013)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72013

Find and replace `assert(!"` with `assert(false && "`
Excludes headers and paths that contain "third-party" or "external"

Clang raises a `-Wstring-conversion` warning when a string is treated as a boolean. This is not uncommon in asserts (e.g. `assert(!"should never happen")`). Clang does, however, permit `expr && "string"` in order to support these assertion use cases.
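
The same pitfall exists in Python, where a non-empty string is truthy; a small self-contained illustration (an analogue for exposition, not part of the C++ change above):

```python
def never_fires():
    # A non-empty string is truthy, so this assertion can never fail --
    # the Python analogue of C's assert("string") pitfall.
    assert "should never happen"
    return True

def fires_correctly(cond):
    # The supported idiom: assert the condition, keep the message.
    assert cond, "cond must be true"
    return True
```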

Test Plan: ci pass

Differential Revision: D33823092

fbshipit-source-id: 9a1af012215bdc91f8b4162ddb2df28d51539773
(cherry picked from commit 0286910350492eea61050bd9c7d21727b607858c)
2022-01-29 00:48:06 +00:00
2017b404ec Fix SVD error code handling for OpenBLAS 0.3.15+ and MKL 2022+ (#68812)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/67693.

Reference LAPACK (used in OpenBLAS) changed the `info` error code for SVD when inputs contain non-finite numbers. In PyTorch, we raise an internal assert error for negative `info` codes because they usually indicate a wrong implementation. However, this is no longer the case for SVD in newer versions of LAPACK. MKL (tried 2021.4.0) still gives a positive error code for this kind of input. This change aligns our code with the OpenBLAS and MKL behavior.

**UPDATE:**
MKL 2022 uses the latest reference LAPACK behavior and returns the same `info` as OpenBLAS 0.3.15+.
This PR fixes https://github.com/pytorch/pytorch/issues/71645 that is due to the updated MKL version in CI.
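
Illustratively, the change amounts to treating a negative SVD `info` code as a user-facing error rather than an internal assert, since newer LAPACK can return one for non-finite inputs. A hedged pure-Python sketch of the convention (messages and structure are illustrative, not PyTorch's actual error-handling code):

```python
def check_svd_info(info):
    """Sketch of LAPACK-style `info` handling for SVD.

    info == 0: success; info > 0: convergence failure; info < 0: a bad
    argument -- which, with newer reference LAPACK, can also mean the
    input matrix contained non-finite values, so it must surface as a
    user-facing error rather than an internal assert."""
    if info < 0:
        raise ValueError(
            f"svd: argument {-info} was invalid; with newer LAPACK this "
            "can also indicate non-finite values in the input")
    if info > 0:
        raise RuntimeError(f"svd: algorithm failed to converge (info={info})")
    return True
```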

Pull Request resolved: https://github.com/pytorch/pytorch/pull/68812

Reviewed By: mrshenli

Differential Revision: D33844257

Pulled By: ngimel

fbshipit-source-id: fd1c86e37e405b330633d039f49dce466391b66e
(cherry picked from commit c00a9bdeb0dc8d49317b93d19b7b938a4cfb7a38)
2022-01-29 00:48:06 +00:00
bc9d1e709a [EASY] Adding virtual to the isUnionType op (#69554)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69554

Test Plan: Imported from OSS

Reviewed By: mikaylagawarecki

Differential Revision: D33750966

Pulled By: Gamrix

fbshipit-source-id: d2da8138c72709e6d1359df638ac29ca9d0f1556
(cherry picked from commit 00f434ee04ca458941b240732c10d006efce69cc)
2022-01-29 00:48:06 +00:00
8fa5cde3a9 Fix hooks (#71970)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71970

- Provide default arg for power SGD convenience wrapper that matches the main API default

Test Plan: CI

Reviewed By: H-Huang

Differential Revision: D33837457

fbshipit-source-id: 8f4efab4992b3fff09456a18db2c83e087c25bdf
(cherry picked from commit 83f52fb3c7c82d4f3cb07a9469cfac6ac5a49658)
2022-01-28 23:07:33 +00:00
3e1e02595a Avoid unnecessary copy of ExecutionPlan in operator() (#71982)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71982

Remove unnecessary copy

Test Plan: CI + perf measurement

Reviewed By: marksantaniello

Differential Revision: D33816746

fbshipit-source-id: 1145f2f8b2e3516520bbf5069a4e7399516f4497
(cherry picked from commit 0a18a6f3bafc948a0ae1129a3403fec8cc097e38)
2022-01-28 22:08:32 +00:00
2821574eea [caffe2] Fix compilation with fmt 8.x (#71966)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71966

Fix a few issues that block migration to fmt 8.x:

1. Format strings must be known at compile time by default
2. `formatter` specialization must be visible when formatting an object

Test Plan: sandcastleit

Reviewed By: cbalioglu

Differential Revision: D33835157

fbshipit-source-id: 642d36ae7cd4a3894aff1a6ecc096f72348df864
(cherry picked from commit 970ad5bc010e48d8c3e8f5818e9ab05a3785968e)
2022-01-28 21:41:13 +00:00
726cc39242 Rename inplace variant of freeze_module (#71437)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71437

The two versions of freeze_module can easily be mixed up. This change makes the distinction clearer.

Test Plan: Imported from OSS

Reviewed By: george-qi

Differential Revision: D33824856

Pulled By: Gamrix

fbshipit-source-id: 206bda52f1346f7d2096f55c4660bca5f0011bdf
(cherry picked from commit d7bc6d372f1eeca63588bb235ac124170916892d)
2022-01-28 21:30:23 +00:00
4cd7819854 [caffe2][torch] Remove unreferenced local variable e (#71856)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71856

Fix this warning that flags on the MSVC build:
```
caffe2\torch\csrc\jit\frontend\tree_views.h(919): warning C4101: 'e': unreferenced local variable
```

Test Plan: CI

Reviewed By: jamesr66a

Differential Revision: D33784473

fbshipit-source-id: 83e84f419157da6a563f223e9488f8bef4046efb
(cherry picked from commit 5451aaa23ece11ca2b4e592b291f8754fe97a2d0)
2022-01-28 21:00:37 +00:00
57a9b499dc torch/monitor: update pyi definitions (#71950)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71950

This updates the .pyi definitions to match the pybind interfaces.

Test Plan:
```
pyre
```

CI

Reviewed By: kiukchung, edward-io

Differential Revision: D33830311

fbshipit-source-id: 147b1fbfd242dd9cec1cff05768f7a96d9599af4
(cherry picked from commit 347a5ebcc34c4583f80ccaa65b194e6f51714475)
2022-01-28 20:43:18 +00:00
65d3adc65d Add linspace test modules (#71850)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71850

Test Plan: none

Reviewed By: cccclai

Differential Revision: D33785383

fbshipit-source-id: 67441df869f8cff8e75aec9adbeff2d31736a879
(cherry picked from commit 7043ae8f76db0632cd6a5dbb19a06cbc1c9a9a5a)
2022-01-28 20:16:51 +00:00
bc0e216d1f [jit][edge] Print correct type strings in code file for mobile models. (#71968)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71968

Right now, when we output types to Python files under `code/`, we directly write the dynamic type representation `Dynamic<>`, which causes the server side to load an unsupported type. Instead, we should do the fallback in export_module.cpp.
ghstack-source-id: 147856473

Test Plan:
CI
buck test //xplat/pytorch/mobile/test:test_read_all_mobile_model_configs
```
...
[       OK ] GeneralAndSpecial/BackPortTest.BackPortForChunkIdx/37 (39142 ms)
[ RUN      ] GeneralAndSpecial/BackPortTest.BackPortForChunkIdx/38
 total: 6 success: 6 failure: 0
[       OK ] GeneralAndSpecial/BackPortTest.BackPortForChunkIdx/38 (9651 ms)
[ RUN      ] GeneralAndSpecial/BackPortTest.BackPortForChunkIdx/39
 total: 4 success: 4 failure: 0
[       OK ] GeneralAndSpecial/BackPortTest.BackPortForChunkIdx/39 (5509 ms)
[----------] 40 tests from GeneralAndSpecial/BackPortTest (806244 ms total)

[----------] Global test environment tear-down
[==========] 41 tests from 2 test cases ran. (810453 ms total)
[  PASSED  ] 41 tests.
```

Reviewed By: pavithranrao

Differential Revision: D33830355

fbshipit-source-id: 0be608fadf14daa2b703f31118ab648cb7b75f9b
(cherry picked from commit 6d65049ae5ac1ef6a11d19de48dd4d926b793b34)
2022-01-28 20:06:21 +00:00
63429bf4b3 Removed JIT FC tweaks for interpolation options (#71937)
Summary:
Description:
- Removed JIT FC tweaks for interpolation options: nearest-exact and antialiasing

They were added in
- https://github.com/pytorch/pytorch/pull/64501 (Sept 04 2021)
- https://github.com/pytorch/pytorch/pull/65142 (Sept 16 2021)

cc jbschlosser

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71937

Reviewed By: mrshenli

Differential Revision: D33845502

Pulled By: jbschlosser

fbshipit-source-id: 8a94454fd643cd2aef21b06689f72a0f16620d30
(cherry picked from commit b21173d64c27d3ee12b608f2805f209611077aa0)
2022-01-28 19:56:59 +00:00
09e54ffec3 .github: Ensure we're using correct build matrix (#72010)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72010

We were adding additional CUDA arches to our libtorch builds when we
shouldn't have been

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

Test Plan: Imported from OSS

Reviewed By: atalman

Differential Revision: D33851196

Pulled By: seemethere

fbshipit-source-id: 52055d0cf5b528f45ef0aa33da297cd4175e8dcf
(cherry picked from commit f33b27ecab856a69c52625abf292f51dd2602229)
2022-01-28 19:35:07 +00:00
b2b63209e1 Simplify code in get buffers and parameters (#70399)
Summary:

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70399

Reviewed By: george-qi

Differential Revision: D33823535

Pulled By: khabinov

fbshipit-source-id: 8d1e49595da1f5cc14db7634a8c27556b02a5361
(cherry picked from commit 78bbb536142ee985df69d2f195a17427b97bb8ea)
2022-01-28 19:15:56 +00:00
3d2d466fc0 [Quant] Fixed errors in test_embedding introduced by https://github.com/pytorch/pytorch/pull/69768 (#71387)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71387

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33808593

Pulled By: dzdang

fbshipit-source-id: 3950400dc7506006666fcd055819e9a08a42eda9
(cherry picked from commit 38dc2de49d0f53b0bc650e70ef98410b6432face)
2022-01-28 19:10:18 +00:00
99bc978b78 [JIT] Propagate requires_grad to autodiff subgraphs (#71666)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71666

When JIT autodiff constructs a gradient computation graph, it only adds gradients for tensors that require grad. Previously, requires_grad information was **not** propagated to the subgraph that autodiff used; as a result, autodiff would calculate *all* gradients, even if requires_grad had never been set during profiling runs. In certain cases, this can lead to performance issues. For example, during training, the gradient of the input data is not needed, but it is still computed.

This propagates requires_grad to the subgraph passed into autodiff, so that autodiff will not compute unnecessary gradients.

Test: `./bin/test_jit --gtest_filter="AutodiffRemoveUnusedGradientsTest.Linear"`

Test Plan: Imported from OSS

Reviewed By: eellison

Differential Revision: D33725304

Pulled By: davidberard98

fbshipit-source-id: ca7ab4c9a6a26f94f93aff2d5a4135e125323ba1
(cherry picked from commit a97fe0556da1d74d04250c7cbcd1b8e9d8b41ebe)
2022-01-28 18:57:36 +00:00
765669e1b9 Update docs for torch.real to indicate that it's supported for real tensors (#71962)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71962

Test Plan: Imported from OSS

Reviewed By: davidberard98

Differential Revision: D33846613

Pulled By: anjali411

fbshipit-source-id: a9782bf4e8a7f3ae1fcd4f7ff558ba80b6af012c
(cherry picked from commit 93ea37800ffaae9cd4e085f7d963ad5fc8ce78fa)
2022-01-28 18:46:40 +00:00
cb823d9f07 Revert D33744717: [pytorch][PR] Implement Tanh Gelu Approximation
Test Plan: revert-hammer

Differential Revision:
D33744717 (f499ab9cef)

Original commit changeset: d64532a562ed

Original Phabricator Diff: D33744717 (f499ab9cef)

fbshipit-source-id: 396c3f63de5865f894dbc353d0790a01a624be93
(cherry picked from commit e9fb2d1db1821c8a064975403de46ae6c4b3102c)
2022-01-28 18:35:01 +00:00
0c3bc426a8 LTC move squeeze to master (#71677)
Summary:

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71677

Reviewed By: wconstab, alanwaketan

Differential Revision: D33750874

Pulled By: Krovatkin

fbshipit-source-id: 124783a955ae27ded5bbf5a90fa99bb1d599d3f6
(cherry picked from commit 7cc9293cefb29831127c73689f80f3d44847d18e)
2022-01-28 18:24:31 +00:00
c5df294940 Fix bug in upgrader generation in mobile (#71578)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71578

Use more robust way of extracting upgrader min and max versions

Test Plan: omgitsgreen

Reviewed By: cccclai

Differential Revision: D33690113

fbshipit-source-id: 79a964acb26d7ca1354e104710a285b8da3f46d1
(cherry picked from commit 9e316ee5c12e7bce9b17edebec2eeb38ecabd336)
2022-01-28 18:20:59 +00:00
f499ab9cef Implement Tanh Gelu Approximation (#61439)
Summary:
1. Implements https://github.com/pytorch/pytorch/issues/39853
2. Adds approximate boolean flag to Gelu
3. Enables Tanh Gelu approximation
4. Adds double backward support for Gelu
5. Enable Tanh Gelu in NvFuser

```
def gelu(x, approximate : str = 'none'):
    if approximate == 'tanh':
        # sqrt(2/pi) = 0.7978845608028654
        return 0.5 * x * (1.0 + torch.tanh(0.7978845608028654 * (x + 0.044715 * torch.pow(x, 3.0))))
    else:
        return x * normcdf(x)
```

Linking XLA PR - https://github.com/pytorch/xla/pull/3039

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61439

Reviewed By: mikaylagawarecki

Differential Revision: D33744717

Pulled By: jbschlosser

fbshipit-source-id: d64532a562ed53247bb4fa52bb16722634d5c187
(cherry picked from commit 4713dd9ccaa8983422bf3aa7b73df8d9ebd8cc02)
2022-01-28 16:59:09 +00:00
e58d5b718a Remove code for using our own build cudnn image, use nvidia image (#71952)
Summary:
Remove code for using our own cudnn build image; use the NVIDIA image instead.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71952

Reviewed By: mrshenli

Differential Revision: D33845873

Pulled By: atalman

fbshipit-source-id: 59806dedd925a13700ddf090f32c8c4dae10692d
(cherry picked from commit 90e4755658adb5738ec8c37377e4bd5a276c76c5)
2022-01-28 16:48:21 +00:00
de44a50f14 index_backward: use out-of-place index_put if any input is subclass (#71779)
Summary:
Reference: https://github.com/pytorch/functorch/issues/393

Context:

The derivative of `__getitem__`/`index` is
f5a71ec2d6/tools/autograd/derivatives.yaml (L733-L734)

where `index_backward` is defined as
f5a71ec2d6/torch/csrc/autograd/FunctionsManual.cpp (L3892-L3894)

A problem arises when `grad` is not a BatchedTensor but one of the other inputs is. In that case, `grad.new_zeros` returns an unbatched tensor, and the call to the in-place `_index_put_impl_` errors because it expects `zeros_like_self` to be batched.

To avoid this, we dispatch to the out-of-place `index_put` if any of the input tensors is subclassed; otherwise we dispatch to the in-place `_index_put_impl_`.
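
The dispatch rule can be sketched in a few lines of pure Python (toy stand-in classes; `index_put` and `_index_put_impl_` are the names from the PR, everything else is illustrative):

```python
class Tensor:
    """Stand-in base class for a plain tensor (illustrative only)."""

class BatchedTensor(Tensor):
    """Stand-in for a tensor subclass such as functorch's BatchedTensor."""

def index_backward_dispatch(grad, index, value):
    # If any participating tensor is a subclass, the freshly created
    # zeros buffer may not match the subclass, so the out-of-place path
    # must be taken; otherwise the cheaper in-place path is safe.
    if any(type(t) is not Tensor for t in (grad, index, value)):
        return "index_put"        # out-of-place
    return "_index_put_impl_"     # in-place
```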

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71779

Reviewed By: albanD

Differential Revision: D33790596

Pulled By: zou3519

fbshipit-source-id: 9d6d81b758740cab7b3db9b905f1e8053f82b835
(cherry picked from commit ba0407a86ef3cabf885cd127649fa6dcd7f75117)
2022-01-28 16:19:34 +00:00
5735f2f875 Make detach redispatch like a regular PyTorch operator (#71707)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71707

Why?
- detach should behave like jax.stop_gradient in functorch. Because it
does not detach all the way through, functorch (as well as a Tensor
Subclass wrapping a Tensor subclass) won't see it after the first
layer/subclass handles it.

How?
- This PR changes detach to dispatch all the way through to the backend.
- This PR also modifies native::detach to call shallow_copy_and_detach
instead of native::alias. This is because today, the semantics of detach
and alias are different -- they differ only by
allow_tensor_metadata_change. In the future, we may choose to deprecate
this flag.
- NB: Before and after this PR, detach() shows up twice in
torch_dispatch: https://github.com/pytorch/pytorch/issues/71725. This is
not a regression so I didn't want to fix it in this PR because it is
weird to fix.

Test Plan: - added new tests; run existing tests

Reviewed By: albanD

Differential Revision: D33752860

Pulled By: zou3519

fbshipit-source-id: 40cc2dc8232e75a02586a4ba5b0ef5f16cb76617
(cherry picked from commit f88aae426ec00bba907e9ad5d1cd6ed2c40bf14a)
2022-01-28 16:13:36 +00:00
fa38e93fe9 Add lightweight reparametrization for _stateless calls (#68969)
Summary:
https://github.com/pytorch/pytorch/issues/61447 introduced a mechanism for performing functional calls on a model using the reparametrization API. However, the overhead introduced in a single call was too large.
I tried to address this by modifying the reparametrization code to support spare tensors, but the changes needed were too large due to type checking, and several parts of the code expected actual `nn.Module` objects, so this option was not feasible.
The benchmark uses resnet50 and calls functional with a parameters dict covering 0, 25, 50, 75, and 100% of the model's total parameters.

Used script:
https://gist.github.com/emcastillo/f344a58638bd71d130c71c45f86f0c3a

| % of parameters passed | CPU Time (us) | GPU Time (us) |
|------------------------|---------------|---------------|
| regular call           | 5539          | 184909        |
| 0                      | 5561          | 184843        |
| 25                     | 11363         | 189236        |
| 50                     | 18716         | 195378        |
| 75                     | 22851         | 198641        |
| 100                    | 27441         | 202281        |

This PR just swaps the `__getattr__` of the submodules to look into a dict holding only the parameters when called, greatly reducing the burden of having to instantiate custom modules and calling forward to just retrieve a tensor.

The execution times now are as follows:

| % of parameters passed     | CPU Time (us) | GPU Time (us) |
|----------------------------|---------------|---------------|
| regular call               | 5939          | 187533        |
| 0                          | 5899          | 187570        |
| 25                         | 8541          | 188953        |
| 50                         | 10045         | 189826        |
| 75                         | 11049         | 190344        |
| 100                        | 11911         | 190800        |
| functorch with 100% params | 14014         | 191727        |

Now we see that the CPU time overhead is greatly reduced and the GPU time barely increases due to the effective overlap.
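
The `__getattr__` swap described above can be sketched in plain Python. This is a minimal stand-in for `nn.Module`, not PyTorch's actual implementation; the class and function names here are hypothetical:

```python
from contextlib import contextmanager


class Module:
    """Minimal stand-in for an nn.Module: parameters live in a dict and are
    resolved through __getattr__, as in PyTorch."""

    def __init__(self, **params):
        self._parameters = dict(params)

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        params = self.__dict__.get("_parameters", {})
        if name in params:
            return params[name]
        raise AttributeError(name)

    def forward(self, x):
        return self.weight * x + self.bias


@contextmanager
def swap_parameters(module, overrides):
    """Temporarily answer parameter lookups from `overrides` by swapping in a
    subclass with a patched __getattr__ (dunder lookup happens on the class,
    so a per-instance patch requires a class swap)."""
    orig_cls = type(module)

    def patched_getattr(self, name):
        if name in overrides:
            return overrides[name]
        return orig_cls.__getattr__(self, name)

    patched_cls = type("Patched" + orig_cls.__name__, (orig_cls,),
                       {"__getattr__": patched_getattr})
    module.__class__ = patched_cls
    try:
        yield module
    finally:
        module.__class__ = orig_cls


m = Module(weight=3.0, bias=1.0)
print(m.forward(2.0))            # 7.0 (original parameters)
with swap_parameters(m, {"weight": 10.0}):
    print(m.forward(2.0))        # 21.0 (overridden weight, original bias)
print(m.forward(2.0))            # 7.0 again after the context exits
```

Because no custom modules are instantiated and no extra `forward` calls are made, the per-call overhead is just a dict lookup, which matches the reduced CPU times in the second table.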

cc albanD zou3519

Pull Request resolved: https://github.com/pytorch/pytorch/pull/68969

Reviewed By: george-qi

Differential Revision: D33836360

Pulled By: albanD

fbshipit-source-id: 532561f64b18ca14c6ae2d77dcacb339397a589d
(cherry picked from commit fd4b6bdfbff4cb3d1da47b7fd73f1edfe43ba65c)
2022-01-28 14:38:45 +00:00
9413c0cd3e Revert D32626563: [pytorch][PR] Fix SVD error code handling for OpenBLAS 0.3.15+ and MKL 2022+
Test Plan: revert-hammer

Differential Revision:
D32626563 (e06eb286da)

Original commit changeset: 09042f07cdc9

Original Phabricator Diff: D32626563 (e06eb286da)

fbshipit-source-id: 387681e68121708a97dfe2768297b470fa84c097
(cherry picked from commit 6ad4864d631a32c62a638d248d06bfd7481405c3)
2022-01-28 13:24:43 +00:00
e88c999da3 [AutoAccept][Codemod][FBSourceClangFormatLinter] Daily arc lint --take CLANGFORMAT
Reviewed By: zertosh

Differential Revision: D33839320

fbshipit-source-id: d316e2a0c26ef0dfe1172df4d422b086b0cfe398
(cherry picked from commit 96a237d13f2757c8239f23bceedcb27337c84dfb)
2022-01-28 11:33:18 +00:00
be2dc8f294 Sparse CSR CUDA: Add torch.baddbmm and torch.bmm (#68711)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68711

This PR adds possibility to multiply a single CSR matrix by a batch of dense matrices.

cc nikitaved pearu cpuhrsch IvanYashchuk ngimel

Test Plan: Imported from OSS

Reviewed By: davidberard98

Differential Revision: D33773319

Pulled By: cpuhrsch

fbshipit-source-id: 1623ce9affbc4fdc6d6130a95c5a42022858b62b
(cherry picked from commit 628c8e366d6325fed631edfbe9a35d130c529344)
2022-01-28 07:25:32 +00:00
e06eb286da Fix SVD error code handling for OpenBLAS 0.3.15+ and MKL 2022+ (#68812)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/67693.

Reference LAPACK (used in OpenBLAS) changed info error code for svd when inputs contain non-finite numbers. In PyTorch, we raise an internal assert error for negative `info` error codes because usually, it would indicate wrong implementation. However, this is not the case with SVD now in newer versions of LAPACK. MKL (tried 2021.4.0) still gives a positive error code for this kind of input. This change aligns with the OpenBLAS and MKL behavior in our code.

**UPDATE:**
MKL 2022 uses the latest reference LAPACK behavior and returns the same `info` codes as OpenBLAS 0.3.15+.
This PR fixes https://github.com/pytorch/pytorch/issues/71645 that is due to the updated MKL version in CI.
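
To make the convention concrete, here is a hypothetical sketch (not PyTorch's actual code; the function name is made up) of how a LAPACK `info` code from an SVD routine could be classified once newer LAPACK versions may return a negative code for non-finite input:

```python
def classify_svd_info(info):
    """Hypothetical dispatch for a LAPACK SVD `info` return code.

    info == 0: success.
    info > 0:  classic convention, the algorithm did not converge.
    info < 0:  historically an invalid-argument internal error, but newer
               reference LAPACK (used by OpenBLAS 0.3.15+ and MKL 2022+)
               also returns a negative code when the input contains
               NaN/Inf, so treat it as a user-facing value error rather
               than an internal assertion failure.
    """
    if info == 0:
        return "success"
    if info > 0:
        return "convergence_failure"
    return "invalid_input"
```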

Pull Request resolved: https://github.com/pytorch/pytorch/pull/68812

Reviewed By: osalpekar

Differential Revision: D32626563

Pulled By: ngimel

fbshipit-source-id: 09042f07cdc9c24ce1fa5cd6f4483340c7b5b06c
(cherry picked from commit aadf50731945ac626936956e229cf2056a291741)
2022-01-28 06:33:29 +00:00
60997be85c Replace LOG by LOG_EVERY_N to avoid log spamming (#71755)
Summary:
This warning in DDP can cause log spamming, so print it only every N occurrences instead.
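
glog's `LOG_EVERY_N` rate-limits by a per-call-site occurrence counter. A minimal Python sketch of the same idea (a hand-rolled analogue, not the C++ macro):

```python
import itertools
from collections import defaultdict

# One counter per logging site, keyed here by an explicit string.
_counters = defaultdict(itertools.count)


def log_every_n(key, n, message, emit=print):
    """Emit `message` only on the 1st, (n+1)th, (2n+1)th, ... call for `key`,
    mimicking glog's LOG_EVERY_N rate limiting."""
    occurrence = next(_counters[key])
    if occurrence % n == 0:
        emit(f"[occurrence {occurrence + 1}] {message}")


for _ in range(10):
    log_every_n("ddp-warn", 4, "Grad strides do not match bucket view strides")
# prints on occurrences 1, 5, and 9 only
```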

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71755

Reviewed By: albanD

Differential Revision: D33763034

Pulled By: rohan-varma

fbshipit-source-id: 2d2fe691979b0c7f96a40ca6f9cd29a80b4395dd
(cherry picked from commit 7d879b98e24b978cba5d94a753ddfc781a240933)
2022-01-28 05:23:49 +00:00
fb0e27d38a Add mechanism for functorch to error out on autograd.Function (#71866)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71866

See title. There is a minimal perf regression for the non-functorch case
(a TLS access and a null check).

Test Plan: Imported from OSS

Reviewed By: soulitzer

Differential Revision: D33825279

Pulled By: zou3519

fbshipit-source-id: afa2ad5a672cc9225d2bb6b46ee7f3f1513c1e02
(cherry picked from commit 17ae1d3e9dcf57193a2d90f755e18994671c9f13)
2022-01-28 05:01:06 +00:00
5bd19ba846 Expect test_fn_fwgrad_bwgrad to fail because forward AD is not implemented (#71944)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71944

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33828924

Pulled By: soulitzer

fbshipit-source-id: f754d0f08567f20324d10f37502b1ab37aca3d8f
2022-01-27 20:46:35 -08:00
746e702104 Revert D33827835: Try setting pull_request checkout to head ref
Test Plan: revert-hammer

Differential Revision:
D33827835 (e27271d05a)

Original commit changeset: 45c7829f2ed8

Original Phabricator Diff: D33827835 (e27271d05a)

fbshipit-source-id: 6b813c12dac54b6e3eea61afbf7a2b72e9bb2c67
2022-01-27 20:46:26 -08:00
0c2b1b8bcf Update docs for forward AD and make them public (#71643)
Summary:
Follow up: we would need to update the links to the tutorial later

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71643

Reviewed By: albanD

Differential Revision: D33713982

Pulled By: soulitzer

fbshipit-source-id: a314ffa4e7d5c5ebdef9c50033f338b06578d71c
(cherry picked from commit ba30daaaa5bb79619332f59e6826f19623bc1697)
2022-01-28 03:33:00 +00:00
e849c8b0f2 Move bytecode generation to python (#71681)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71681

Test Plan: Imported from OSS

Reviewed By: gmagogsfm, cccclai

Differential Revision: D33730791

Pulled By: tugsbayasgalan

fbshipit-source-id: e752e9ae20c01a57a3bea270f604215fdcc9182e
(cherry picked from commit 69c9dc0548d794a6b3be8e1f9967e5fd56310a29)
2022-01-28 02:33:00 +00:00
7bc5962329 Trace asserts with fx by looking at byte code (#70960)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70960

This patch uses some bytecode introspection logic to see whether a boolean is being used as an assert condition; if so, it records the assert in the fx graph and allows the trace to continue.
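
The kind of introspection involved can be illustrated with the standard `dis` module. This sketch only scans a whole function, whereas the actual patch inspects the calling frame's bytecode at the point where the boolean is consumed; the helper name is made up:

```python
import dis


def uses_assert(fn):
    """Return True if `fn`'s bytecode contains an assert statement.

    On CPython 3.9+ an `assert` compiles to a LOAD_ASSERTION_ERROR
    instruction (older versions load the AssertionError global); note that
    asserts are compiled out entirely under `python -O`.
    """
    return any(
        ins.opname == "LOAD_ASSERTION_ERROR"
        or (ins.opname == "LOAD_GLOBAL" and ins.argval == "AssertionError")
        for ins in dis.get_instructions(fn)
    )


def with_assert(x):
    assert x > 0, "x must be positive"
    return x


def without_assert(x):
    return x
```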

Test Plan: Imported from OSS

Reviewed By: jamesr66a

Differential Revision: D33570397

Pulled By: zdevito

fbshipit-source-id: 99d26cede8fe42c96d4032d9353c1ede7eb3d969
(cherry picked from commit 30d002da25b8eca134d44d43596ce78c4ef8c221)
2022-01-28 02:04:21 +00:00
1aa2257cac Error message update: use proper name of custom c++ classes (#71922)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71922

Use the proper name in the error message and remove "torchbind", since it is not an official term in the documentation.

Test Plan: Imported from OSS

Reviewed By: cccclai

Differential Revision: D33824899

Pulled By: iseeyuan

fbshipit-source-id: 41968494c04fab39292d9cc4dc6e15cca99cbff4
(cherry picked from commit 9732a52ed264f013e9ba3844f86be11d31444954)
2022-01-28 01:43:19 +00:00
7d613ab1d6 Fix indentation typo in test_fx_experimental.py (#71885)
Summary:
These tests were not actually running as they were defined in the local scope of another test

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71885

Reviewed By: scottxu0730

Differential Revision: D33806251

Pulled By: jansel

fbshipit-source-id: 48a2d7b472f160759ef55e6fff1f8890511e3345
(cherry picked from commit 9ae14efb25dd034fed60ae99465cd3673c24eed2)
2022-01-28 00:41:12 +00:00
8548657ddb TransformedDistribution.icdf: Fix erroneous icdf ValueError (#71393)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/66946

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71393

Reviewed By: albanD

Differential Revision: D33810839

Pulled By: neerajprad

fbshipit-source-id: a86d8ce3b196b6b06e1466e4030fc549bc07d332
(cherry picked from commit 88743f53b2877a3ab43365c6cb5771856bf24967)
2022-01-28 00:34:08 +00:00
d0ff1f0013 [FSDP] Backward prefetch in recursive call (#71804)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71804

Add backward prefetch arg when using auto_wrap_policy. Unittests are
updated appropriately.
ghstack-source-id: 147753214

Test Plan: CI

Reviewed By: zhaojuanmao

Differential Revision: D33782346

fbshipit-source-id: c0176b48db29c3756a8873e809610ed53480102b
(cherry picked from commit 764acb3f1c8fb9879b6c92a934df1a7d2c9e3f3d)
2022-01-28 00:34:08 +00:00
a30b0cf52a [FSDP] Add/refactor unit test for wrap (#71803)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71803

1. Extra check for wrapping with override args,

2. Enhance UT to make sure
`wrap` doesn't wrap outside of ctx.
ghstack-source-id: 147753225

Test Plan: CI

Reviewed By: zhaojuanmao

Differential Revision: D33774512

fbshipit-source-id: 1f8d60bdf9b3ba257fee465064a0e25235b3622b
(cherry picked from commit 9ab775b29eddcd193c11398184bee8beffed0327)
2022-01-28 00:34:08 +00:00
99a9929254 [Easy] Format DDP error (#71802)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71802

Per title
ghstack-source-id: 147753213

Test Plan: CI

Reviewed By: zhaojuanmao

Differential Revision: D33774513

fbshipit-source-id: f798ea9f63aa1ae573c6b012cc6e749d126dedea
(cherry picked from commit 631157b3ea834c499cea740df6877644b8e27a10)
2022-01-28 00:34:08 +00:00
f115a42362 Revert D33805315: [pytorch][PR] Automated submodule update: FBGEMM
Test Plan: revert-hammer

Differential Revision:
D33805315 (75cc2184e1)

Original commit changeset: 6c341cdff97b

Original Phabricator Diff: D33805315 (75cc2184e1)

fbshipit-source-id: 4c5580fbb0258dc8d69b3f321a124568355abc8d
(cherry picked from commit c891b1cddf83109085f6e5bf11a18925cba08fb6)
2022-01-28 00:23:31 +00:00
51ae9ccba4 Fix forward AD for cudnn batch norm (#71901)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71901

We didn't catch this initially because CuDNN is not being tested on CI.

The following tests fail on master (if we build with CuDNN), but pass with this PR:
- `test_forward_mode_AD_nn_functional_batch_norm_cuda_float64`
- `test_forward_mode_AD_nn_functional_instance_norm_cuda_float64`

I don't think it is documented anywhere, but from the tests passing now I'm going to guess `result1` and `result2` return `mean` and `invstd` respectively. Previously, I thought mean and variance were returned because the variables were named `saved_mean` and `saved_var`.

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33818652

Pulled By: soulitzer

fbshipit-source-id: ecee760f5aec620dc70f57de4fb3573c8f2f5f31
(cherry picked from commit 73fd3e021c3478fedc7a7ca258269c029b7790a6)
2022-01-27 23:55:37 +00:00
3b9f2e2cca [GHF] More verbose failures messages (#71941)
Summary:
Modify _check_output to capture `CalledProcessError` and add
stdout/stderr to the failure message
Also record the GitHub Actions run id in the failure message (computed from `${{ github.run_id }}`)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71941

Reviewed By: seemethere

Differential Revision: D33829633

Pulled By: malfet

fbshipit-source-id: 060b2856ca6c71574075effa72b982f9e1d64e6e
(cherry picked from commit a9ad7df9b540f9ab14524a644cab5e06225debe4)
2022-01-27 23:49:58 +00:00
c85965600c Fix bug where frozen mod not used for OFI #68903 (#71436)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71436

Fixes issue #68903

Test Plan: Imported from OSS

Reviewed By: george-qi

Differential Revision: D33824857

Pulled By: Gamrix

fbshipit-source-id: 8d351feb4a621916f55003c58527a1e85eec476e
(cherry picked from commit 57bb4200403692adee1c09f22ce02b1534bad202)
2022-01-27 23:37:50 +00:00
b486797864 [jit][edge] Make flatbuffer_serializer print correct type strings. (#71935)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71935

The flatbuffer serializer today prints type strings based on the platform: for example, "DynamicType" is exported whenever C10_MOBILE is defined. This is not intended behavior; we should export the correct type name to reduce confusion for users.
ghstack-source-id: 147821109

Test Plan:
```
buck run fbcode/mode/dbg //arvr/firmware/silicon/turing:test_torch -c pt.has_backtraces=1 -c turing.min_runtime=1 -c turing.dsp_op=1 -c turing.model_file=test1.ptl

Downloaded 0/66 artifacts, 0.00 bytes, 100.0% cache miss (for updated rules)
Building: finished in 38.2 sec (100%) 345/345 jobs, 36/345 updated
  Total time: 38.2 sec
BUILD SUCCEEDED
Conv:  input [1, 32, 4, 4] residuals [1] weights [4, 4, 1, 1, 2, 32] nlu_params [4, 128] in_ch 32 out_ch 32 groups 1 kernel  stride  padding  upsample 0 op_type 0 act_type 0
--tensor: 0x7ffdd461c6e8
        device: cpu
        is_quantized: 0
        contiguous: 1
        layout: Strided
        dtype: int
        itemsize: 4
        data_ptr: 0x7f781a0a2c10
        dim: 4
        size: [1, 32, 4, 4]
        stride: [512, 16, 4, 1]
dump data/size: 0x7f781a0a2c10/512
        0       00000004
        1       00000004
        2       00000004
        3       00000004
        4       00000004
        5       00000004
        6       00000004
        7       00000004
        8       00000004
        9       00000004
        10      00000004
        11      00000004
        12      00000004
        13      00000004
        14      00000004
        15      00000004
```

Reviewed By: qihqi

Differential Revision: D33826292

fbshipit-source-id: 3c579d89d31fe8d0df5ea6915746aa70da7e3d5c
(cherry picked from commit 9723a84f83f00d7aa84c6c21d55897e4e23d7209)
2022-01-27 23:22:56 +00:00
1407939f69 Remove unnecessary non_contiguous and gradient tests from test_linalg (#68188)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68188

As per title

cc mruberry jianyuh nikitaved pearu walterddr IvanYashchuk xwang233 Lezcano

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33774419

Pulled By: mruberry

fbshipit-source-id: d63b599a25a6426463548d632d13a403cad1cc34
(cherry picked from commit eed47601fa11dfe7161c430955a5af6784bb3d42)
2022-01-27 23:13:17 +00:00
6cb128c8dd Generalize noncontiguous tests to several outputs (#67996)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67996

This is necessary for most matrix decompositions in `linalg`.

cc mruberry

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33774418

Pulled By: mruberry

fbshipit-source-id: 576f2dda9d484808b4acf0621514c0ffe26834e6
(cherry picked from commit fb07c50aa9c143aa9dafab57936a8a8a7d3b4ec4)
2022-01-27 23:13:17 +00:00
171cf153d2 Make repeat_interleave respect the conj and neg bits. (#68523)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68523

As per title.

cc ezyang gchanan anjali411 dylanbespalko mruberry Lezcano nikitaved

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33774421

Pulled By: mruberry

fbshipit-source-id: f47918c53c10ebec61198d5926287b711e141643
(cherry picked from commit 5b8cc5086683dc8a13d24bf52a2880b62587b1f5)
2022-01-27 23:13:17 +00:00
a675770adc Deactivate the tracking of gradients in sampling functions within OpInfos (#68522)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68522

Some OpInfos were inadvertently generating samples with a `grad_fn`, for
example when using functions like `transpose()` or `conj()` on the
inputs to generate transposed or conjugated variants. This PR corrects
this and deactivates the tracking of gradients in all the sampling
functions.

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33774420

Pulled By: mruberry

fbshipit-source-id: da0e6189a2d67a2cb0fd458054558d36dbad9b61
(cherry picked from commit 42b0870774ff4a07fbba1d991f3ea0a4dbae735a)
2022-01-27 23:13:17 +00:00
e2011b29aa Add OpInfo test to check that floating point inputs in OpInfos have requires_grad set to True (#69909)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69909

This test detected a number of sampling methods that were not generating
the samples as expected, e.g. `index_put`, `cosine_embedding`, `stft`, but
perhaps most notably the generator for `BinOps`.

It also detected that `remainder` and `fmod` did not implement the
backward formula for the second input. I added this in the previous PR.

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33774422

Pulled By: mruberry

fbshipit-source-id: 76cfc75b1fdfd72ee64aa524665f83a75fe52509
(cherry picked from commit 13ea7b436bc6301be4cf7bb7d559177d895502b3)
2022-01-27 23:13:17 +00:00
dcc6aed52c Implement derivatives for torch.remainder and torch.fmod wrt the second argument and update the docs (#69908)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69908

I also took this chance to clarify a bit the documentation of these
functions.
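
For reference, my reading of the formulas involved (derived here from the standard definitions, not quoted from the PR): since `remainder(a, b) = a - b * floor(a / b)` and `fmod(a, b) = a - b * trunc(a / b)`, treating the piecewise-constant floor/trunc factors as locally constant gives

```
\frac{\partial}{\partial b}\,\mathrm{remainder}(a, b) = -\left\lfloor \frac{a}{b} \right\rfloor,
\qquad
\frac{\partial}{\partial b}\,\mathrm{fmod}(a, b) = -\operatorname{trunc}\!\left(\frac{a}{b}\right).
```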

cc brianjo mruberry

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33774417

Pulled By: mruberry

fbshipit-source-id: ab4a9014006783d1f87d432ecb959c854374c2d4
(cherry picked from commit f319a75d781bbe12a48ef1ffd21d3874dfee3bfa)
2022-01-27 23:13:16 +00:00
b62780fc4f [warnings] Disable broken TORCH_CHECK (#71947)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71947

This is a recent regression that blocks our migration to turning `-Wstring-conversion` into an error.
Comment it out until albanD can resolve it in the future.

Test Plan: compiles locally

Reviewed By: stephinphection

Differential Revision: D33829899

fbshipit-source-id: 47833d0d8dada087d748ee7e500179ff16f2a138
(cherry picked from commit e3c77ff4458aed174e08a5ec233c606509fb5bc6)
2022-01-27 23:07:20 +00:00
75cc2184e1 Automated submodule update: FBGEMM (#65595)
Summary:
This is an automated pull request to update the first-party submodule for [pytorch/FBGEMM](https://github.com/pytorch/FBGEMM).

New submodule commit: 2728266e4c

Pull Request resolved: https://github.com/pytorch/pytorch/pull/65595

Test Plan: Ensure that CI jobs succeed on GitHub before landing.

Reviewed By: malfet

Differential Revision: D33805315

fbshipit-source-id: 6c341cdff97b9f7c23a1cd69f65e0936da502f29
(cherry picked from commit a2b62c1fa18d93d20fd4d0c56ac60f8aeb1a75d0)
2022-01-27 23:02:49 +00:00
81d1ce05fd Add complex support for Jiterator, port sinc to Jiterator (#71577)
Summary:
I copy-pasted part of the C++ standard library from LLVM (libc++), made it a string, and modified it to implement complex support for the Jiterator.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71577

Reviewed By: george-qi

Differential Revision: D33820258

Pulled By: ngimel

fbshipit-source-id: 3d4ea834803b99904a79e430f749407635a3cf6d
(cherry picked from commit f2c3b2a9a5d89099c3752605b7c4394f2d61a00d)
2022-01-27 14:59:55 -08:00
8551989bff [c10d] Enable gather_object on nccl (#71623)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71623

Enable gather_object on the nccl backend, since we already support `dist.gather` on nccl. This requires the user to set the current device properly.
ghstack-source-id: 147754836

Test Plan: distributed_nccl_spawn -r test_gather_object

Reviewed By: zou3519

Differential Revision: D33701042

fbshipit-source-id: 39cff22947a7cac69d0c923b956dc10f25353a6f
(cherry picked from commit 6e6eff497ff9ac4888ba1876740ac80ea1eb2201)
2022-01-27 14:59:55 -08:00
e27271d05a Try setting pull_request checkout to head ref (#71734)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71734

There are two commits that we test sometimes in CI:
1. The merge commit (a test merge between the PR head ref and the latest base ref)
2. The head ref (the exact commit that was at the head of the user's branch when they pushed).

This distinction is fairly subtle; in the case of 1, you are effectively running against a "rebased" version of your PR's branch. The problem is that we use *both* of these commits today, with confusing results—depending on how you put up your PR and what workflows are running, we might be testing two different commits!

We should probably consolidate on one. This would eliminate a subtle but complex part of our CI (I am mildly horrified by the complexity of [this explanation](https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md#which-commit-is-used-in-ci), although it's heroic that someone went and documented it lol). This PR consolidates on using the head ref (option 2).
- This is the behavior of phabricator/fbcode, which many PT devs will be more familiar with.
- This is the behavior of ghstack
- Our master branch moves quite quickly, so the chance that there is a substantial divergence between your local test runs and CI is high, with confusing results that are nondeterministic based on when you put up the PR.
- We use a linear history/squash-rebase-merge workflow, which is better modeled by option 2. Option 1 effectively emulates a merge-commit-style workflow.

The primary disadvantage is that now when re-running workflows, you will not be re-running against a "rebased" version of the PR, but the exact head ref that was pushed. Tbh I find it quite unintuitive that what you're testing changes depending on when you press the re-run button, but I know at least malfet does this so it's worth mentioning.

Test Plan: Imported from OSS

Reviewed By: janeyx99, cpuhrsch

Differential Revision: D33827835

Pulled By: suo

fbshipit-source-id: 45c7829f2ed8e097562d0bf16db5fc6a238a86dc
(cherry picked from commit e53fab96905cfab9c3f2e98de51e09006c17842d)
2022-01-27 14:59:54 -08:00
e755a4f124 Update the operator version check logic when generating models for testing upgraders (#71894)
Summary:
The model generation script checks the model version to ensure the developer runs the script before changing an operator.

Previously, the check used the old model version. However, it is hard for developers to know the old version number. With this change, the check uses the current maximum operator version instead. It is less strict, but more developer-friendly.
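
The relaxed check can be sketched as follows (a hypothetical helper illustrating the comparison described above, not the actual script code):

```python
def should_regenerate(model_operator_version, current_max_operator_version):
    """Decide whether a test fixture model must be regenerated.

    Instead of requiring the developer to know the old (pre-change)
    version number, compare the stored model's operator version against
    the current maximum: a model generated at or above the current max is
    up to date, anything older must be regenerated before the operator
    change lands.
    """
    return model_operator_version < current_max_operator_version
```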

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71894

ghstack-source-id: 147769215

Test Plan:
first time run:
```
chenlai@devvm5615:~/fbsource/fbcode(b82243650)$ buck run mode/opt //caffe2/torch/fb/mobile/upgrader_codegen:upgrader_test_models_gen
Parsing buck files: finished in 0.7 sec
Downloaded 0/2 artifacts, 0.00 bytes, 100.0% cache miss (for updated rules)
Building: finished in 21.6 sec (100%) 11547/11547 jobs, 2/11547 updated
  Total time: 22.4 sec
BUILD SUCCEEDED
TestVersionedDivTensorExampleV7() aten::div.Tensor
INFO:test.jit.fixtures_srcs.generate_models:Processing TestVersionedDivTensorExampleV7
INFO:test.jit.fixtures_srcs.generate_models:Generating model test_versioned_div_tensor_example_v7 and it's save to /data/users/chenlai/fbsource/fbcode/caffe2/test/jit/fixtures/test_versioned_div_tensor_example_v7.ptl
chenlai@devvm5615:~/fbsource/fbcode(b82243650)$
```

second time run:
```
chenlai@devvm5615:~/fbsource/fbcode(b82243650)$ rm caffe2/test/jit/fixtures/test_versioned_div_tensor_example_v4.ptl
chenlai@devvm5615:~/fbsource/fbcode(b82243650)$ buck run mode/opt //caffe2/torch/fb/mobile/upgrader_codegen:upgrader_test_models_gen
Action graph will be rebuilt because files have been added or removed.
Parsing buck files: finished in 2.0 sec
Building... 17.4 sec (99%) 9289/9290 jobs, 0/9290 updated
TestVersionedDivTensorExampleV7() aten::div.Tensor
INFO:test.jit.fixtures_srcs.generate_models:Processing TestVersionedDivTensorExampleV7
INFO:test.jit.fixtures_srcs.generate_models:Model test_versioned_div_tensor_example_v7 already exists, skipping
chenlai@devvm5615:~/fbsource/fbcode(b82243650)$ jf s
```

Reviewed By: tugsbayasgalan

Differential Revision: D33804737

fbshipit-source-id: 7424b81a700703bdf896ec606c2dac8df6dbf8a6
(cherry picked from commit 44b4e37d30077a3160b8a92209af339a6f2fc885)
2022-01-27 21:15:32 +00:00
0cae3c0481 Improved error messages for max_unpool{}d operators (#67328)
Summary:
~As per the title, this PR adds OpInfos for `max_unpoolNd` operators. There are a few TODOs:~

* [x] Improve error messages for the rest of the functions in the CUDA file for the un-pooling operators.
~* [x] Raise issues for the failures, and provide descriptions.~

~Note to the reviewers: I'll add descriptions and reasons for the skips, I'm not totally sure about them, hence the skips for now.~

cc: mruberry saketh-are

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67328

Reviewed By: george-qi

Differential Revision: D33818126

Pulled By: albanD

fbshipit-source-id: 8ddc8510be7f4ea19eca3ae7f052aeca590d8d48
(cherry picked from commit bd9903d16ceed7e1a5e0d1ead747df085434a53d)
2022-01-27 21:06:54 +00:00
eeda31fa08 Added antialias flag to interpolate (CUDA, bilinear and bicubic) (#70930)
Summary:
Description:
- Added antialias flag to interpolate (CUDA)
  - forward and backward for bicubic mode
  - added tests

Previous PR for CPU bilinear, https://github.com/pytorch/pytorch/pull/65142
Previous PR for CPU bicubic, https://github.com/pytorch/pytorch/pull/68819

### Benchmarks

<details>
<summary>
Bilinear forward pass, PIL, PTH CPU and PTH CUDA
</summary>

Code: https://gist.github.com/vfdev-5/b173761a567f2283b3c649c3c0574112

```

Torch version: 1.11.0a0+gitd032369
Torch config: PyTorch built with:
  - GCC 9.3
  - C++ Version: 201402
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - CPU capability usage: AVX2
  - CUDA Runtime 11.1
  - NVCC architecture flags: -gencode;arch=compute_61,code=sm_61
  - CuDNN 8.0.5
  - Build settings: BUILD_TYPE=Release, CUDA_VERSION=11.1, CUDNN_VERSION=8.0.5, CXX_COMPILER=/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_PYTORCH_QNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.11.0, USE_CUDA=1, USE_CUDNN=1, USE_EIGEN_FOR_BLAS=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=OFF, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=0, USE_OPENMP=ON, USE_ROCM=OFF,

Num threads: 8
[----------------------------------- Downsampling (bilinear): torch.Size([1, 3, 906, 438]) -> (320, 196) -----------------------------------]
                                                  |  Reference, PIL 8.4.0, mode: RGB  |  1.11.0a0+gitd032369 cpu  |  1.11.0a0+gitd032369 cuda
8 threads: ----------------------------------------------------------------------------------------------------------------------------------
      channels_first contiguous torch.float32     |               2851.2              |            874.1          |            57.1
      channels_last non-contiguous torch.float32  |               2856.1              |           1155.8          |           130.6

Times are in microseconds (us).

[----------------------------------- Downsampling (bilinear): torch.Size([1, 3, 906, 438]) -> (460, 220) -----------------------------------]
                                                  |  Reference, PIL 8.4.0, mode: RGB  |  1.11.0a0+gitd032369 cpu  |  1.11.0a0+gitd032369 cuda
8 threads: ----------------------------------------------------------------------------------------------------------------------------------
      channels_first contiguous torch.float32     |               3705.9              |           1005.8          |            66.3
      channels_last non-contiguous torch.float32  |               3742.9              |           1332.8          |           143.5

Times are in microseconds (us).

[------------------------------------ Downsampling (bilinear): torch.Size([1, 3, 906, 438]) -> (120, 96) -----------------------------------]
                                                  |  Reference, PIL 8.4.0, mode: RGB  |  1.11.0a0+gitd032369 cpu  |  1.11.0a0+gitd032369 cuda
8 threads: ----------------------------------------------------------------------------------------------------------------------------------
      channels_first contiguous torch.float32     |               1768.0              |           725.2           |            77.9
      channels_last non-contiguous torch.float32  |               1753.7              |           942.5           |           144.0

Times are in microseconds (us).

[----------------------------------- Downsampling (bilinear): torch.Size([1, 3, 906, 438]) -> (1200, 196) ----------------------------------]
                                                  |  Reference, PIL 8.4.0, mode: RGB  |  1.11.0a0+gitd032369 cpu  |  1.11.0a0+gitd032369 cuda
8 threads: ----------------------------------------------------------------------------------------------------------------------------------
      channels_first contiguous torch.float32     |               9522.6              |           2593.8          |           157.8
      channels_last non-contiguous torch.float32  |               9513.5              |           3622.7          |           241.5

Times are in microseconds (us).

[----------------------------------- Downsampling (bilinear): torch.Size([1, 3, 906, 438]) -> (120, 1200) ----------------------------------]
                                                  |  Reference, PIL 8.4.0, mode: RGB  |  1.11.0a0+gitd032369 cpu  |  1.11.0a0+gitd032369 cuda
8 threads: ----------------------------------------------------------------------------------------------------------------------------------
      channels_first contiguous torch.float32     |               2240.1              |           565.5           |            93.3
      channels_last non-contiguous torch.float32  |               2244.2              |           972.7           |           170.8

Times are in microseconds (us).

[------------------------- Downsampling (bilinear): torch.Size([1, 1, 906, 438]) -> (320, 196) --------------------------]
                                 |  Reference, PIL 8.4.0, mode: F  |  1.11.0a0+gitd032369 cpu  |  1.11.0a0+gitd032369 cuda
8 threads: ---------------------------------------------------------------------------------------------------------------
       contiguous torch.float32  |              1441.3             |           386.1           |            22.3

Times are in microseconds (us).

[------------------------- Downsampling (bilinear): torch.Size([1, 1, 906, 438]) -> (460, 220) --------------------------]
                                 |  Reference, PIL 8.4.0, mode: F  |  1.11.0a0+gitd032369 cpu  |  1.11.0a0+gitd032369 cuda
8 threads: ---------------------------------------------------------------------------------------------------------------
       contiguous torch.float32  |              1815.2             |           376.8           |            27.8

Times are in microseconds (us).

[-------------------------- Downsampling (bilinear): torch.Size([1, 1, 906, 438]) -> (120, 96) --------------------------]
                                 |  Reference, PIL 8.4.0, mode: F  |  1.11.0a0+gitd032369 cpu  |  1.11.0a0+gitd032369 cuda
8 threads: ---------------------------------------------------------------------------------------------------------------
       contiguous torch.float32  |              962.3              |           400.0           |            29.4

Times are in microseconds (us).

[------------------------- Downsampling (bilinear): torch.Size([1, 1, 906, 438]) -> (1200, 196) -------------------------]
                                 |  Reference, PIL 8.4.0, mode: F  |  1.11.0a0+gitd032369 cpu  |  1.11.0a0+gitd032369 cuda
8 threads: ---------------------------------------------------------------------------------------------------------------
       contiguous torch.float32  |              4749.7             |           910.1           |            63.7

Times are in microseconds (us).

[------------------------- Downsampling (bilinear): torch.Size([1, 1, 906, 438]) -> (120, 1200) -------------------------]
                                 |  Reference, PIL 8.4.0, mode: F  |  1.11.0a0+gitd032369 cpu  |  1.11.0a0+gitd032369 cuda
8 threads: ---------------------------------------------------------------------------------------------------------------
       contiguous torch.float32  |              1098.1             |           272.0           |            36.4

Times are in microseconds (us).

```

</details>

<details>
<summary>
Bicubic forward pass, PIL, PTH CPU and PTH CUDA
</summary>

Code: https://gist.github.com/vfdev-5/b173761a567f2283b3c649c3c0574112

```

Torch version: 1.11.0a0+gitd032369
Torch config: PyTorch built with:
  - GCC 9.3
  - C++ Version: 201402
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - CPU capability usage: AVX2
  - CUDA Runtime 11.1
  - NVCC architecture flags: -gencode;arch=compute_61,code=sm_61
  - CuDNN 8.0.5
  - Build settings: BUILD_TYPE=Release, CUDA_VERSION=11.1, CUDNN_VERSION=8.0.5, CXX_COMPILER=/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_PYTORCH_QNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.11.0, USE_CUDA=1, USE_CUDNN=1, USE_EIGEN_FOR_BLAS=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=OFF, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=0, USE_OPENMP=ON, USE_ROCM=OFF,

Num threads: 8
[------------------------------------ Downsampling (bicubic): torch.Size([1, 3, 906, 438]) -> (320, 196) -----------------------------------]
                                                  |  Reference, PIL 8.4.0, mode: RGB  |  1.11.0a0+gitd032369 cpu  |  1.11.0a0+gitd032369 cuda
8 threads: ----------------------------------------------------------------------------------------------------------------------------------
      channels_first contiguous torch.float32     |               4522.4              |           1406.7          |           170.3
      channels_last non-contiguous torch.float32  |               4530.0              |           1435.4          |           242.2

Times are in microseconds (us).

[------------------------------------ Downsampling (bicubic): torch.Size([1, 3, 906, 438]) -> (460, 220) -----------------------------------]
                                                  |  Reference, PIL 8.4.0, mode: RGB  |  1.11.0a0+gitd032369 cpu  |  1.11.0a0+gitd032369 cuda
8 threads: ----------------------------------------------------------------------------------------------------------------------------------
      channels_first contiguous torch.float32     |               5726.4              |           1628.6          |           164.0
      channels_last non-contiguous torch.float32  |               5722.6              |           1665.6          |           234.7

Times are in microseconds (us).

[------------------------------------ Downsampling (bicubic): torch.Size([1, 3, 906, 438]) -> (120, 96) ------------------------------------]
                                                  |  Reference, PIL 8.4.0, mode: RGB  |  1.11.0a0+gitd032369 cpu  |  1.11.0a0+gitd032369 cuda
8 threads: ----------------------------------------------------------------------------------------------------------------------------------
      channels_first contiguous torch.float32     |               2909.1              |           1461.5          |           276.9
      channels_last non-contiguous torch.float32  |               2892.9              |           1458.7          |           345.1

Times are in microseconds (us).

[----------------------------------- Downsampling (bicubic): torch.Size([1, 3, 906, 438]) -> (1200, 196) -----------------------------------]
                                                  |  Reference, PIL 8.4.0, mode: RGB  |  1.11.0a0+gitd032369 cpu  |  1.11.0a0+gitd032369 cuda
8 threads: ----------------------------------------------------------------------------------------------------------------------------------
      channels_first contiguous torch.float32     |              14699.2              |           4283.9          |           407.1
      channels_last non-contiguous torch.float32  |              14711.3              |           4321.1          |           477.0

Times are in microseconds (us).

[----------------------------------- Downsampling (bicubic): torch.Size([1, 3, 906, 438]) -> (120, 1200) -----------------------------------]
                                                  |  Reference, PIL 8.4.0, mode: RGB  |  1.11.0a0+gitd032369 cpu  |  1.11.0a0+gitd032369 cuda
8 threads: ----------------------------------------------------------------------------------------------------------------------------------
      channels_first contiguous torch.float32     |               3467.0              |           980.0           |           339.2
      channels_last non-contiguous torch.float32  |               3465.2              |           982.3           |           407.8

Times are in microseconds (us).

[-------------------------- Downsampling (bicubic): torch.Size([1, 1, 906, 438]) -> (320, 196) --------------------------]
                                 |  Reference, PIL 8.4.0, mode: F  |  1.11.0a0+gitd032369 cpu  |  1.11.0a0+gitd032369 cuda
8 threads: ---------------------------------------------------------------------------------------------------------------
       contiguous torch.float32  |              2396.7             |           877.8           |            68.1

Times are in microseconds (us).

[-------------------------- Downsampling (bicubic): torch.Size([1, 1, 906, 438]) -> (460, 220) --------------------------]
                                 |  Reference, PIL 8.4.0, mode: F  |  1.11.0a0+gitd032369 cpu  |  1.11.0a0+gitd032369 cuda
8 threads: ---------------------------------------------------------------------------------------------------------------
       contiguous torch.float32  |              3068.2             |           777.3           |            64.7

Times are in microseconds (us).

[-------------------------- Downsampling (bicubic): torch.Size([1, 1, 906, 438]) -> (120, 96) ---------------------------]
                                 |  Reference, PIL 8.4.0, mode: F  |  1.11.0a0+gitd032369 cpu  |  1.11.0a0+gitd032369 cuda
8 threads: ---------------------------------------------------------------------------------------------------------------
       contiguous torch.float32  |              1540.2             |           829.3           |           100.4

Times are in microseconds (us).

[------------------------- Downsampling (bicubic): torch.Size([1, 1, 906, 438]) -> (1200, 196) --------------------------]
                                 |  Reference, PIL 8.4.0, mode: F  |  1.11.0a0+gitd032369 cpu  |  1.11.0a0+gitd032369 cuda
8 threads: ---------------------------------------------------------------------------------------------------------------
       contiguous torch.float32  |              7919.5             |           1467.8          |           151.6

Times are in microseconds (us).

[------------------------- Downsampling (bicubic): torch.Size([1, 1, 906, 438]) -> (120, 1200) --------------------------]
                                 |  Reference, PIL 8.4.0, mode: F  |  1.11.0a0+gitd032369 cpu  |  1.11.0a0+gitd032369 cuda
8 threads: ---------------------------------------------------------------------------------------------------------------
       contiguous torch.float32  |              1695.7             |           631.2           |           117.7

Times are in microseconds (us).

```

</details>

<details>
<summary>
Bilinear backward pass, PTH CPU and PTH CUDA
</summary>

Code: https://gist.github.com/vfdev-5/b173761a567f2283b3c649c3c0574112

```
- Measure only backward op

Torch version: 1.11.0a0+gitd032369
Torch config: PyTorch built with:
  - GCC 9.3
  - C++ Version: 201402
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - CPU capability usage: AVX2
  - CUDA Runtime 11.1
  - NVCC architecture flags: -gencode;arch=compute_61,code=sm_61
  - CuDNN 8.0.5
  - Build settings: BUILD_TYPE=Release, CUDA_VERSION=11.1, CUDNN_VERSION=8.0.5, CXX_COMPILER=/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_PYTORCH_QNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.11.0, USE_CUDA=1, USE_CUDNN=1, USE_EIGEN_FOR_BLAS=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=OFF, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=0, USE_OPENMP=ON, USE_ROCM=OFF,

Num threads: 8
[------------- Downsampling backward (bilinear): torch.Size([1, 3, 906, 438]) -> (320, 196) ------------]
                                                  |  1.11.0a0+gitd032369 cpu  |  1.11.0a0+gitd032369 cuda
8 threads: ----------------------------------------------------------------------------------------------
      channels_first contiguous torch.float32     |           4686.8          |           215.7
      channels_last non-contiguous torch.float32  |           5101.1          |           220.5

Times are in microseconds (us).

[------------- Downsampling backward (bilinear): torch.Size([1, 3, 906, 438]) -> (460, 220) ------------]
                                                  |  1.11.0a0+gitd032369 cpu  |  1.11.0a0+gitd032369 cuda
8 threads: ----------------------------------------------------------------------------------------------
      channels_first contiguous torch.float32     |           6011.2          |           204.4
      channels_last non-contiguous torch.float32  |           6396.0          |           210.0

Times are in microseconds (us).

[------------- Downsampling backward (bilinear): torch.Size([1, 3, 906, 438]) -> (120, 96) -------------]
                                                  |  1.11.0a0+gitd032369 cpu  |  1.11.0a0+gitd032369 cuda
8 threads: ----------------------------------------------------------------------------------------------
      channels_first contiguous torch.float32     |           2035.6          |           250.2
      channels_last non-contiguous torch.float32  |           1589.6          |           252.5

Times are in microseconds (us).

[------------ Downsampling backward (bilinear): torch.Size([1, 3, 906, 438]) -> (1200, 196) ------------]
                                                  |  1.11.0a0+gitd032369 cpu  |  1.11.0a0+gitd032369 cuda
8 threads: ----------------------------------------------------------------------------------------------
      channels_first contiguous torch.float32     |          11392.5          |           256.5
      channels_last non-contiguous torch.float32  |          11640.2          |           263.9

Times are in microseconds (us).

[------------ Downsampling backward (bilinear): torch.Size([1, 3, 906, 438]) -> (120, 1200) ------------]
                                                  |  1.11.0a0+gitd032369 cpu  |  1.11.0a0+gitd032369 cuda
8 threads: ----------------------------------------------------------------------------------------------
      channels_first contiguous torch.float32     |          11769.6          |           465.9
      channels_last non-contiguous torch.float32  |          12407.0          |           474.4

Times are in microseconds (us).

[---- Downsampling backward (bilinear): torch.Size([1, 1, 906, 438]) -> (320, 196) ----]
                                 |  1.11.0a0+gitd032369 cpu  |  1.11.0a0+gitd032369 cuda
8 threads: -----------------------------------------------------------------------------
       contiguous torch.float32  |           3931.0          |           133.3

Times are in microseconds (us).

[---- Downsampling backward (bilinear): torch.Size([1, 1, 906, 438]) -> (460, 220) ----]
                                 |  1.11.0a0+gitd032369 cpu  |  1.11.0a0+gitd032369 cuda
8 threads: -----------------------------------------------------------------------------
       contiguous torch.float32  |           5594.8          |           133.9

Times are in microseconds (us).

[---- Downsampling backward (bilinear): torch.Size([1, 1, 906, 438]) -> (120, 96) -----]
                                 |  1.11.0a0+gitd032369 cpu  |  1.11.0a0+gitd032369 cuda
8 threads: -----------------------------------------------------------------------------
       contiguous torch.float32  |           1272.6          |           133.0

Times are in microseconds (us).

[--- Downsampling backward (bilinear): torch.Size([1, 1, 906, 438]) -> (1200, 196) ----]
                                 |  1.11.0a0+gitd032369 cpu  |  1.11.0a0+gitd032369 cuda
8 threads: -----------------------------------------------------------------------------
       contiguous torch.float32  |          10618.1          |           134.0

Times are in microseconds (us).

[--- Downsampling backward (bilinear): torch.Size([1, 1, 906, 438]) -> (120, 1200) ----]
                                 |  1.11.0a0+gitd032369 cpu  |  1.11.0a0+gitd032369 cuda
8 threads: -----------------------------------------------------------------------------
       contiguous torch.float32  |          11082.2          |           154.6

Times are in microseconds (us).

```

</details>

<details>
<summary>
Bicubic backward pass, PTH CPU and PTH CUDA
</summary>

Code: https://gist.github.com/vfdev-5/b173761a567f2283b3c649c3c0574112

```
- Measure only backward op

Torch version: 1.11.0a0+gitd032369
Torch config: PyTorch built with:
  - GCC 9.3
  - C++ Version: 201402
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - CPU capability usage: AVX2
  - CUDA Runtime 11.1
  - NVCC architecture flags: -gencode;arch=compute_61,code=sm_61
  - CuDNN 8.0.5
  - Build settings: BUILD_TYPE=Release, CUDA_VERSION=11.1, CUDNN_VERSION=8.0.5, CXX_COMPILER=/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_PYTORCH_QNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.11.0, USE_CUDA=1, USE_CUDNN=1, USE_EIGEN_FOR_BLAS=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=OFF, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=0, USE_OPENMP=ON, USE_ROCM=OFF,

Num threads: 8
[------------- Downsampling backward (bicubic): torch.Size([1, 3, 906, 438]) -> (320, 196) -------------]
                                                  |  1.11.0a0+gitd032369 cpu  |  1.11.0a0+gitd032369 cuda
8 threads: ----------------------------------------------------------------------------------------------
      channels_first contiguous torch.float32     |           6791.2          |           618.9
      channels_last non-contiguous torch.float32  |           7125.2          |           622.9

Times are in microseconds (us).

[------------- Downsampling backward (bicubic): torch.Size([1, 3, 906, 438]) -> (460, 220) -------------]
                                                  |  1.11.0a0+gitd032369 cpu  |  1.11.0a0+gitd032369 cuda
8 threads: ----------------------------------------------------------------------------------------------
      channels_first contiguous torch.float32     |           8806.2          |           600.3
      channels_last non-contiguous torch.float32  |           9167.6          |           607.5

Times are in microseconds (us).

[-------------- Downsampling backward (bicubic): torch.Size([1, 3, 906, 438]) -> (120, 96) -------------]
                                                  |  1.11.0a0+gitd032369 cpu  |  1.11.0a0+gitd032369 cuda
8 threads: ----------------------------------------------------------------------------------------------
      channels_first contiguous torch.float32     |           3683.6          |           693.8
      channels_last non-contiguous torch.float32  |           3617.4          |           695.0

Times are in microseconds (us).

[------------- Downsampling backward (bicubic): torch.Size([1, 3, 906, 438]) -> (1200, 196) ------------]
                                                  |  1.11.0a0+gitd032369 cpu  |  1.11.0a0+gitd032369 cuda
8 threads: ----------------------------------------------------------------------------------------------
      channels_first contiguous torch.float32     |          17548.2          |           779.4
      channels_last non-contiguous torch.float32  |          17966.2          |           786.5

Times are in microseconds (us).

[------------- Downsampling backward (bicubic): torch.Size([1, 3, 906, 438]) -> (120, 1200) ------------]
                                                  |  1.11.0a0+gitd032369 cpu  |  1.11.0a0+gitd032369 cuda
8 threads: ----------------------------------------------------------------------------------------------
      channels_first contiguous torch.float32     |            28.4           |            1.6
      channels_last non-contiguous torch.float32  |            28.4           |            1.6

Times are in milliseconds (ms).

[---- Downsampling backward (bicubic): torch.Size([1, 1, 906, 438]) -> (320, 196) -----]
                                 |  1.11.0a0+gitd032369 cpu  |  1.11.0a0+gitd032369 cuda
8 threads: -----------------------------------------------------------------------------
       contiguous torch.float32  |           6266.1          |           208.5

Times are in microseconds (us).

[---- Downsampling backward (bicubic): torch.Size([1, 1, 906, 438]) -> (460, 220) -----]
                                 |  1.11.0a0+gitd032369 cpu  |  1.11.0a0+gitd032369 cuda
8 threads: -----------------------------------------------------------------------------
       contiguous torch.float32  |           8218.3          |           200.8

Times are in microseconds (us).

[----- Downsampling backward (bicubic): torch.Size([1, 1, 906, 438]) -> (120, 96) -----]
                                 |  1.11.0a0+gitd032369 cpu  |  1.11.0a0+gitd032369 cuda
8 threads: -----------------------------------------------------------------------------
       contiguous torch.float32  |           3458.9          |           231.9

Times are in microseconds (us).

[---- Downsampling backward (bicubic): torch.Size([1, 1, 906, 438]) -> (1200, 196) ----]
                                 |  1.11.0a0+gitd032369 cpu  |  1.11.0a0+gitd032369 cuda
8 threads: -----------------------------------------------------------------------------
       contiguous torch.float32  |          15729.3          |           261.6

Times are in microseconds (us).

[---- Downsampling backward (bicubic): torch.Size([1, 1, 906, 438]) -> (120, 1200) ----]
                                 |  1.11.0a0+gitd032369 cpu  |  1.11.0a0+gitd032369 cuda
8 threads: -----------------------------------------------------------------------------
       contiguous torch.float32  |          26279.8          |           547.0

Times are in microseconds (us).

```

</details>

Code is moved from torchvision: https://github.com/pytorch/vision/pull/4211 and optimized

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70930

Reviewed By: zou3519

Differential Revision: D33817902

Pulled By: jbschlosser

fbshipit-source-id: d63a620f8972ff36b63841f0bc6c820466f58f69
(cherry picked from commit d358cfdb7d1a9efb4b9254446cb9b1d0af35a26d)
2022-01-27 20:43:08 +00:00
567c2bb8e9 Support printing inplace operators in FX (#71887)
Summary:
Pretty-print inplace operators (`a += b`, etc.) in generated FX code. This is useful because it allows `torch.jit.script()` to parse these operators without error.

I don't believe FX tracing supports inplace ops yet, though I am generating them in torchdynamo and want to be able to lower them with torchscript.
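The emission strategy can be sketched in a few lines (the `emit_call` helper and the operator-to-symbol mapping below are hypothetical illustrations, not the actual FX codegen): when a call targets an inplace operator, print augmented-assignment syntax instead of a function call.

```python
# Rough illustration only (assumed names; not the actual FX code generator):
# print calls to inplace operators as `a += b` rather than `a = iadd(a, b)`,
# which torch.jit.script() can then parse.
import operator

# Assumed mapping from inplace operator functions to their symbols.
INPLACE_TO_SYMBOL = {
    operator.iadd: "+=",
    operator.isub: "-=",
    operator.imul: "*=",
}

def emit_call(target, op, arg):
    symbol = INPLACE_TO_SYMBOL.get(op)
    if symbol is None:
        # Fall back to ordinary call syntax for non-inplace ops.
        return f"{target} = {op.__name__}({target}, {arg})"
    return f"{target} {symbol} {arg}"

print(emit_call("a", operator.iadd, "b"))  # a += b
print(emit_call("x", operator.mul, "y"))   # x = mul(x, y)
```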

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71887

Reviewed By: jamesr66a

Differential Revision: D33806248

Pulled By: jansel

fbshipit-source-id: 5eb9f744caab2f745cefc83ea658e12e9e7a817d
(cherry picked from commit eacbd6bb83571f9e58d84243aeed277e7a4f1fe5)
2022-01-27 20:35:22 +00:00
5a14eca191 Revert D33820822: [pytorch][PR] Run the pr-label check on PR closed action and validate closed_by
Test Plan: revert-hammer

Differential Revision:
D33820822 (e020414cb2)

Original commit changeset: 373ffcbc2bca

Original Phabricator Diff: D33820822 (e020414cb2)

fbshipit-source-id: e9b4cf7f3861c8a383b9ab52f55a6ca3fe8b7fb2
(cherry picked from commit 43b9277c235a5bc4bcaa71347432c56f04a096c9)
2022-01-27 20:04:03 +00:00
6feba4bc7e Implement scatter primitive for ProcessGroupNCCL (#70029)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70029

This PR implements NCCL scatter and adds it to ProcessGroupNCCL.

NCCL doesn’t directly provide a scatter primitive, so it needs to be implemented on top of NCCL’s send/recv API.

1. In ProcessGroupNCCL.cpp, the inputTensors are first flattened; then outputTensors and inputFlattened are passed by the collective class to the scatter() function in nccl.cpp.
2. In nccl.cpp, scatter is implemented using ncclSend/ncclRecv: the root rank uses a for loop to send (distribute) the inputTensors to each rank, then all the ranks receive their inputTensor from the root rank.
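The two steps above can be followed in a minimal runnable sketch (the `nccl_send`/`nccl_recv` stubs and the in-process mailbox are assumptions made for illustration only, not the real NCCL API or the ProcessGroupNCCL code):

```python
# Sketch of scatter built from point-to-point send/recv, as described above.
# ncclSend/ncclRecv are stubbed with an in-process mailbox so the control
# flow can be followed without a GPU or multiple processes.

mailbox = {}  # (src, dst) -> payload; stands in for ncclSend/ncclRecv

def nccl_send(data, src, dst):
    mailbox[(src, dst)] = data

def nccl_recv(src, dst):
    return mailbox.pop((src, dst))

def scatter(flat_inputs, rank, root, world_size):
    # Root distributes one chunk per rank with a send loop; every rank
    # (including root) then receives its own chunk from root.
    if rank == root:
        for peer in range(world_size):
            nccl_send(flat_inputs[peer], root, peer)
    return nccl_recv(root, rank)

# Simulate 3 ranks in one process: root 0 runs first so its sends land
# in the mailbox before the other ranks receive.
outputs = {0: scatter([10, 20, 30], 0, 0, 3)}
for r in (1, 2):
    outputs[r] = scatter(None, r, 0, 3)
print(outputs)  # {0: 10, 1: 20, 2: 30}
```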
ghstack-source-id: 147754837

Test Plan:
test_scatter_ops
test_scatter_stress
test_scatter_checks

Reviewed By: pritamdamania87

Differential Revision: D33154823

fbshipit-source-id: 4513e7eaf7d47a60eb67da99dc6c2e9a2882f3fd
(cherry picked from commit 93201f9d4a87c556110e60ceb93826abd71cf518)
2022-01-27 19:37:55 +00:00
9b53d3194c Implement gather primitive for ProcessGroupNCCL (#66745)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66745

This PR implements NCCL gather and adds it to ProcessGroupNCCL using the NCCL send/recv API.

NCCL doesn’t directly provide a gather primitive, so it needs to be implemented on top of NCCL’s send/recv API.
1. In ProcessGroupNCCL.cpp, the outputTensors are first flattened; then inputTensors and outputFlattened are passed by the collective class to the gather() function in nccl.cpp.
2. In nccl.cpp, gather is implemented using ncclSend/ncclRecv: all the ranks send their inputTensor to the root rank, and the root rank uses a for loop to receive these inputTensors.
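The dual of the scatter pattern, sketched with the same kind of stand-ins (the `nccl_send`/`nccl_recv` stubs and mailbox are illustrative assumptions, not the real NCCL API):

```python
# Sketch of gather built from point-to-point send/recv, as described above.
# ncclSend/ncclRecv are stubbed with an in-process mailbox.

mailbox = {}  # (src, dst) -> payload; stands in for ncclSend/ncclRecv

def nccl_send(data, src, dst):
    mailbox[(src, dst)] = data

def nccl_recv(src, dst):
    return mailbox.pop((src, dst))

def gather(value, rank, root, world_size):
    # Step 1: every rank sends its input tensor to the root rank.
    nccl_send(value, rank, root)
    # Step 2: the root rank receives the inputs from all ranks in a loop.
    if rank == root:
        return [nccl_recv(peer, root) for peer in range(world_size)]
    return None

# Simulate 3 ranks in one process; non-root ranks run first so their
# sends are already in the mailbox when root collects.
for r in (1, 2):
    gather(r * 10, r, 0, 3)
result = gather(0, 0, 0, 3)
print(result)  # [0, 10, 20]
```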
ghstack-source-id: 147754838

Test Plan:
test_gather_ops
test_gather_checks
test_gather_stress

Reviewed By: pritamdamania87

Differential Revision: D29616361

fbshipit-source-id: b500d9b8e67113194c5cc6575fb0e5d806dc7782
(cherry picked from commit d560ee732eb559782a2d1d88b3cf118dcfc404bc)
2022-01-27 19:37:55 +00:00
0a8b391936 ci: Enable tests for iOS on GHA
These were left out of the initial migration for some reason, so this just
transfers those tests over.

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71644

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
2022-01-27 19:32:12 +00:00
e020414cb2 Run the pr-label check on PR closed action and validate closed_by (#71917)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/68459

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71917

Reviewed By: malfet

Differential Revision: D33820822

Pulled By: atalman

fbshipit-source-id: 373ffcbc2bcae4041f7f84a4883c086a61afd03b
(cherry picked from commit 2c2c81a9850a8f43397e725a1ec2c593edc66fab)
2022-01-27 18:54:09 +00:00
8ff1a8fdca Implement forward AD for linalg.svd and improve svd_backward (#70253)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70253

I included a derivation of the formula in the complex case, as it is
particularly tricky. As far as I know, this is the first time this formula
is derived in the literature.

I also implemented a more efficient and more accurate version of svd_backward.
More importantly, I also added a lax check in the complex case making sure the loss
function just depends on the subspaces spanned by the pairs of singular
vectors, and not their joint phase.

cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano

Test Plan: Imported from OSS

Reviewed By: mikaylagawarecki

Differential Revision: D33751982

Pulled By: mruberry

fbshipit-source-id: c2a4a92a921a732357e99c01ccb563813b1af512
(cherry picked from commit 391319ed8f2e0ecc1e034d8eaecfb38f5ea4615f)
2022-01-27 18:38:30 +00:00
84f1685397 Rewrite svd and linalg.svd as structured kernels (#69827)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69827

In general, the current pattern allows for implementing optimisations
for all the backends in a common place (see for example the optimisation
for empty matrices).

After this PR, `torch.svd` is implemented in terms of `linalg.svd` and
`linalg.svdvals`, as expected. This makes it differentiable in the case
when `compute_uv=False`, although this is not particularly important, as
`torch.svd` will eventually be deprecated.

This PR also instantiates smaller `U` / `V` when calling cusolver_gesvdj
in the cases when `full_matrices=False` or `compute_uv=False`.

The memory for the auxiliary `U` and `V` in the cases above, needed for some
cuSOLVER routines, is allocated via raw allocators rather than through fully
fledged tensors, as it's just a blob of memory the algorithm requests.
As the code is better structured now, it was easier to see that `U` and
`Vh` needn't be allocated when calling `svd_cusolver_gesvd`.

Now `linalg.svdvals` works as expected with respect to the `out=` parameter.
Note that in the test `test_svd_memory_allocation` we were
passing a tensor of the wrong size and dtype and the test seemed to
pass...

This PR also changes the backward formula to avoid saving the input
matrix, as it's not necessary. In a follow up PR, I will clean the
backward formula and make it more numerically stable and efficient.

This PR also does a number of memory optimisations here and there, and fixes
the call to cusolver_gesvd, which was incorrect for m <= n. To test
this path, I compiled the code with a flag to unconditionally execute
the `if (!gesvdj_convergence_check.empty())` branch, and all the tests
passed.

I also took this chance to simplify the tests for these functions in
`test_linalg.py`, as we had lots of tests that were testing some
functionality that is already currently tested in the corresponding
OpInfos. I used xwang233's feature to test both MAGMA and CUDA
backends. This is particularly good for SVD, as cuSOLVER is always
chosen over MAGMA when available, so testing MAGMA otherwise would be
tricky.

cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano

Test Plan: Imported from OSS

Reviewed By: mikaylagawarecki

Differential Revision: D33751983

Pulled By: mruberry

fbshipit-source-id: 11d48d977946345583d33d14fb11a170a7d14fd2
(cherry picked from commit a1860bd567f2d136e74695275214bc0eaf542028)
2022-01-27 18:38:30 +00:00
2a8b91548e Add scripts to OSS merge rules
There is nothing but build scripts for OSS in there

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71927
2022-01-27 18:18:41 +00:00
a49f2412e4 [SR] Add static runtime scopes to record function (#70944)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70944

Added special net-level/op-level scopes for static runtime. We can use these to add special behavior in record functions when they are invoked from a static runtime context.

Reviewed By: navahgar

Differential Revision: D33458211

fbshipit-source-id: 0b7022100e9f5ac872f4cb5bfba14e92af2c71b0
(cherry picked from commit b486548544c5e822803071756c85e675e37d2dad)
2022-01-27 18:00:08 +00:00
09c417ae65 Add new reduce options and autograd support for scatter_reduce (#71788)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71788

Test Plan: Imported from OSS

Reviewed By: mikaylagawarecki

Differential Revision: D33778525

Pulled By: cpuhrsch

fbshipit-source-id: 47b8544e29df3075bc6ede894c59499a7ffec876
(cherry picked from commit ddcddac7262dd78a4002aaaea08daa3c50526028)
2022-01-27 17:38:50 +00:00
a432b9a7c6 Clean repo after checkout
Otherwise, the lint workflow can fail due to artifacts left over from a
previous run.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71792
2022-01-27 17:28:12 +00:00
fdec94504f Rename _scatter_reduce to scatter_reduce and make it unstructured (#71787)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71787

Test Plan: Imported from OSS

Reviewed By: mikaylagawarecki

Differential Revision: D33778524

Pulled By: cpuhrsch

fbshipit-source-id: 55a330e1c2227c0eaaa1c0d2f9205a4dee24a11b
(cherry picked from commit 6e4a8a91dac179f2302f87964090a4f991a4392f)
2022-01-27 16:29:13 +00:00
21d307cd22 CUDNN changes for cuda 11.5 (#71869)
Summary:
CUDNN changes for cuda 11.5

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71869

Reviewed By: janeyx99

Differential Revision: D33817943

Pulled By: atalman

fbshipit-source-id: 5da5f8f45877ac12c0ee4d982082fd24e5f09adb
(cherry picked from commit 3f3d96af693b0515eea658bcb4b9ed3fed32eea4)
2022-01-27 15:33:58 +00:00
b66f1bc80f fx quant: make forked subgraph rewriter preserve stack trace (#71858)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71858

Makes the forked subgraph rewriter code path preserve stack traces.
The strategy is pretty simple for now:
1. find any specified stack trace in pattern graph
2. if found, copy this stack trace to every node in replacement graph

If more complicated logic is needed in the future, we can address it
at a later time.

Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_stack_trace_preserved_subgraph_rewriter
```

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D33791740

fbshipit-source-id: 38bb4885549a9f954278c6c14fa41f58f1d5f7b7
(cherry picked from commit 5cc32a87ce62ad9a1c8d2240cfe630cbf1cc838d)
2022-01-27 15:33:58 +00:00
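The two-step strategy in the commit above (find any specified stack trace in the pattern graph, then copy it to every replacement node) can be sketched with a minimal stand-in Node class; the real code operates on torch.fx graph nodes.

```python
class Node:
    """Minimal stand-in for an FX graph node carrying a stack trace."""
    def __init__(self, name, stack_trace=None):
        self.name = name
        self.stack_trace = stack_trace

def copy_stack_trace(pattern_nodes, replacement_nodes):
    # 1. find any specified stack trace in the pattern graph
    trace = next(
        (n.stack_trace for n in pattern_nodes if n.stack_trace is not None),
        None,
    )
    # 2. if found, copy this stack trace to every node in the replacement graph
    if trace is not None:
        for n in replacement_nodes:
            n.stack_trace = trace

pattern = [Node("a"), Node("b", stack_trace="File foo.py, line 3")]
replacement = [Node("x"), Node("y")]
copy_stack_trace(pattern, replacement)
print([n.stack_trace for n in replacement])
```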
a6d9dd9370 [c10d] Use the term "errno" instead of "generic error" in logs and error messages (#71865)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71865

This PR changes the term "generic error" to "errno" in c10d log outputs and error messages to make the root cause more clear.

```
[W socket.cpp:634] The server socket on [localhost]:29501 is not yet listening (generic error: 111 - Connection refused), will retry.
```

becomes

```
[W socket.cpp:634] The server socket on [localhost]:29501 is not yet listening (errno: 111 - Connection refused), will retry.
```
ghstack-source-id: 147716733

Test Plan: No behavioral change, run existing unit and integration tests.

Reviewed By: H-Huang

Differential Revision: D33792822

fbshipit-source-id: f57b0ec0fc4135e83c46fdc93911edbce9d26ec1
(cherry picked from commit f61dd92a43b8e253b770c3db7da0a1fba9b81cab)
2022-01-27 13:32:39 +00:00
7aa4a1f63e torch/monitor: TensorboardEventHandler (#71658)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71658

This adds the beginnings of a TensorboardEventHandler which will log stats to Tensorboard.

Test Plan: buck test //caffe2/test:monitor

Reviewed By: edward-io

Differential Revision: D33719954

fbshipit-source-id: e9847c1319255ce0d9cf2d85d8b54b7a3c681bd2
(cherry picked from commit 5c8520a6baea51db02e4e29d0210b3ced60fa18d)
2022-01-27 08:33:55 +00:00
d4d0ab71b3 use torch.testing.assert_equal in TestCase.assertEqual (#67796)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67796

Supersedes #58981.

cc mruberry

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D33542994

Pulled By: mruberry

fbshipit-source-id: 527099f5fdc154fd95ee48cd19f0a85eeec43443
(cherry picked from commit 1a58915e2cfde5c48ad77198a917872a03fd1b72)
2022-01-27 08:33:55 +00:00
de58a27769 define //c10/core:CPUAllocator target (#70862)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70862

ghstack-source-id: 147642558

Test Plan: Should be a no-op, rely on CI to validate.

Reviewed By: malfet

Differential Revision: D33330151

fbshipit-source-id: f566993f47cffa0df85105f3787bb5c6385cf5d6
(cherry picked from commit a17c3865efb6f1fa7e14adb20e5d5ed441543885)
2022-01-27 07:34:53 +00:00
41690d7804 define //c10/mobile targets (#70861)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70861

ghstack-source-id: 147642549

Test Plan: Should be a no-op. Rely on CI to validate.

Reviewed By: malfet

Differential Revision: D33329870

fbshipit-source-id: 7dbccaa994737c5fe7195d02dffd61eeceb19ceb
(cherry picked from commit 2b5264ebc49e4a5445c066e07f15bad041f42ac8)
2022-01-27 07:34:52 +00:00
844a4b47df extract out //c10/core:alloc_cpu (#70859)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70859

ghstack-source-id: 147642534

Test Plan: Extracting code unmodified to a new library: relying on CI to validate.

Reviewed By: malfet

Differential Revision: D33329688

fbshipit-source-id: f60327467d197ec1862fb3554f8b83e6c84cab5c
(cherry picked from commit f82e7c0e9beba1113defe6d55cf8a232551e913b)
2022-01-27 07:34:52 +00:00
fc6a488e9a extract out //c10/core:alignment (#70858)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70858

ghstack-source-id: 147642533

Test Plan: Extracted a constant to a new header, trusting CI build to validate.

Reviewed By: malfet

Differential Revision: D33329689

fbshipit-source-id: 8697bb81a5cc3366462ebdf1f214b62d478fa77c
(cherry picked from commit 16663847e179ea1c2a16f2bb538cfe3aca032593)
2022-01-27 07:34:52 +00:00
4523a73288 Fix usages of TORCH_CHECK/_INTERNAL_ASSERT without condition (#71879)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71879

Two locations of improper macro usage were reported (https://github.com/pytorch/pytorch/issues/71848), and this diff fixes them. In both cases this is behavior-changing, since the incorrect usages would have passed the assertion due to interpreting the error string as the condition; both cases should have been 'assert false'.

Test Plan: Run CI

Reviewed By: alanwaketan

Differential Revision: D33800406

fbshipit-source-id: dfe3d9a6455e6eb96cb639022f8813a8bd6520c3
(cherry picked from commit ee551e5a16828f273d7694820fa9d9fa1fa52129)
2022-01-27 04:20:55 +00:00
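The C++ bug described above (passing only an error string to TORCH_CHECK, so the string itself is treated as the condition) has a direct Python analogue: a non-empty string is always truthy, so an assert on the string alone can never fire.

```python
def buggy(x):
    # BUG: "x must be positive" is the "condition" -- always truthy,
    # so this assert never fires, mirroring TORCH_CHECK("message").
    assert "x must be positive"
    return x

def fixed(x):
    # Correct: condition first, message second.
    assert x > 0, "x must be positive"
    return x

buggy(-1)          # silently passes despite the invalid input
try:
    fixed(-1)
except AssertionError as e:
    print(e)       # x must be positive
```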
56511f859a Revert D33178977: [bc-breaking][quant][be] Refactor fuser_method to include is_qat argument
Test Plan: revert-hammer

Differential Revision:
D33178977 (ef501e8fed)

Original commit changeset: 0c1499c45526

Original Phabricator Diff: D33178977 (ef501e8fed)

fbshipit-source-id: f3912e210e8c588fdbdc9c3c5f4acf2aa8fe6678
(cherry picked from commit cd62183414e757b6012522aee01442e818b7b06d)
2022-01-27 03:29:40 +00:00
bf69a61293 (1/2) Make TorchScript Preserve Fully Qualified Class Name for Python Exceptions: backend change
Summary: Reland for D33282878 (911d527b87). Land the backend change first to maintain FC. Will wait for 2 weeks after this diff is in, and then land the front-end change in the next diff.

Test Plan:
test in next diff

time buck test mode/dev-nosan fblearner/flow/projects/langtech/translation:tests -- test_e2e_base_training

Reviewed By: gmagogsfm

Differential Revision: D33342547

fbshipit-source-id: b3dee9a4bdfd78103848c12629e5fccafdd621e3
(cherry picked from commit ae1935f1af755180e5607e870ff365dc17061e4a)
2022-01-27 03:29:40 +00:00
027c0d7f8e fixed compilations on xla tensor print (#71147)
Summary:
Fixes multiple compilations on XLA tensor print. Please check the conversation here: https://github.com/pytorch/xla/pull/3253

This is done to avoid compilations during tensor printing. Torch performs some tensor operations, like slicing, to make the tensor readable. These operations result in compilations. Hence, to avoid the compilations, we copy the tensor to CPU before printing.

example:

```
dev = xm.xla_device()
def test_linear(input_shape=(8, 1024)):
    import pdb
    pdb.set_trace()
    linear = torch.nn.Linear(in_features=1024, out_features=4096, bias=True).to(dev)
    inp = torch.randn(*input_shape).to(dev)
    output = linear(inp)
    xm.mark_step()
    return output
```
Returning from this function would have resulted in 63 compiles, since PDB prints the value of the returned output, which in this case is an XLA tensor.

Now with the current change, there is no compilation.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71147

Reviewed By: shunting314

Differential Revision: D33795177

Pulled By: wconstab

fbshipit-source-id: 74b53d9a1cb7ef67f9d8b0a32064f3896be449b5
(cherry picked from commit a9e0687fc5c9981fb55ea4dc406c283c80fa20c9)
2022-01-27 02:28:19 +00:00
76a2c22341 [c10d] Improve the "not yet listening" warning message of socket (#71864)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71864

A very minor change in one of the warning messages of `socket` to make it clear that it is a transient issue and not an error.

```
[W socket.cpp:634] The server socket on [localhost]:29501 is not yet listening (errno: 111 - Connection refused).
```

becomes

```
[W socket.cpp:634] The server socket on [localhost]:29501 is not yet listening (errno: 111 - Connection refused), will retry.
```
ghstack-source-id: 147716736

Test Plan: No behavioral change. Run the existing unit and integration tests.

Reviewed By: H-Huang

Differential Revision: D33792888

fbshipit-source-id: 79b287325945d0353c4568d84d1b52c820783cfc
(cherry picked from commit 9e5b627551fdf3bd6d06eb669883f9423d0999f1)
2022-01-27 02:28:19 +00:00
0099796978 [CUDA Pinned Memory] [Retry] Alternative implementation of pinned memory allocator focusing on multi-threaded scalability (#69299)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69299

https://github.com/pytorch/pytorch/pull/68906 + https://github.com/pytorch/pytorch/pull/68749 plugged one correctness hole (non-blocking copies of offset pinned memory tensors) while introducing another (non-blocking copies of pinned memory tensors with a non-standard DataPtr context).

In this revision, we use both the tensor data pointer and context to attempt to identify the originating block in the pinned memory allocator.

Test Plan: New unit tests added to cover the missing case previously.

Reviewed By: yinghai

Differential Revision: D32787087

fbshipit-source-id: 0cb0d29d7c39a13f433eb1cd423dc0d2a303c955
(cherry picked from commit 297157b1a13b5c75d860cac9eba4fe7fe1ad5e6f)
2022-01-27 01:33:55 +00:00
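One plausible shape of the lookup change described above, sketched in plain Python: identify the originating allocator block by both the data pointer and the DataPtr context, so a tensor with a non-standard context or an offset pointer still resolves to the right block. The class, method names, and fallback-by-context logic are all illustrative assumptions, not the actual CUDA caching allocator code.

```python
class PinnedBlockRegistry:
    """Illustrative registry keyed by (data pointer, DataPtr context)."""

    def __init__(self):
        self._blocks = {}  # (ptr, ctx) -> block metadata

    def register(self, ptr, ctx, block):
        self._blocks[(ptr, ctx)] = block

    def find(self, ptr, ctx):
        # Try an exact (ptr, ctx) hit first, then fall back to matching
        # the context alone, which covers offset data pointers.
        if (ptr, ctx) in self._blocks:
            return self._blocks[(ptr, ctx)]
        for (_, c), block in self._blocks.items():
            if c == ctx:
                return block
        return None

reg = PinnedBlockRegistry()
reg.register(0x1000, "ctx-A", {"size": 4096})
print(reg.find(0x1000, "ctx-A"))  # exact hit
print(reg.find(0x1010, "ctx-A"))  # offset pointer, resolved via the context
```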
ebeeee7b2b [warnings][caffe2] Fix -Wstring-conversion warnings (#71874)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71874

Treating a string as a boolean is a clang warning (`-Wstring-conversion`).  Clang, however, makes an exception for cases where you `&&` the string, specifically for assertion use cases.

e.g.
```
assert(false && "should never happen!");
```

There are a number of checks/asserts that never actually trigger because they were checking against a string, which is always non-zero (and evaluates to true). This will fix all those impotent asserts/checks.

Test Plan: CI Pass

Differential Revision: D33796853

fbshipit-source-id: a895e047173bbea243fba76705e5b1aa5c5db064
(cherry picked from commit 0decb563d10e312f7f6730f740da006ed04fad37)
2022-01-27 01:33:55 +00:00
cf3ef23713 Propagate full autocast state to CheckpointFunction's forward-inside-backward (#71169)
Summary:
Should fix https://github.com/pytorch/pytorch/issues/71124 (implements https://github.com/pytorch/pytorch/issues/71124#issuecomment-1009436056).

cc mcarilli ptrblck

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71169

Reviewed By: albanD

Differential Revision: D33793556

Pulled By: ngimel

fbshipit-source-id: 80a4b4f0657b922002e3446fb6b48f082fa98453
(cherry picked from commit cf9beee28bf7b541b4631c13fa35bb494212e1cd)
2022-01-27 00:31:53 +00:00
804f13289e [ONNX] Update opset_version restriction for local function
Export should fail if export_modules_as_functions is set and opset_version < 15.
This is because opset_version < 15 implies IR version < 8, which means no local function support.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71619
2022-01-27 00:21:13 +00:00
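The restriction above can be sketched as a standalone validation step; the function name is illustrative, not the exporter's actual API.

```python
def check_local_function_export(opset_version, export_modules_as_functions):
    """Reject opset < 15 when exporting modules as local functions,
    since opset < 15 implies IR version < 8, which has no local
    function support."""
    if export_modules_as_functions and opset_version < 15:
        raise ValueError(
            f"export_modules_as_functions requires opset_version >= 15, "
            f"got {opset_version} (IR version < 8 has no local functions)"
        )

check_local_function_export(15, True)       # ok
try:
    check_local_function_export(14, True)   # should fail
except ValueError as e:
    print(e)
```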
ef501e8fed [bc-breaking][quant][be] Refactor fuser_method to include is_qat argument (#70009)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70009

Currently we rely on module.training to decide whether we'll do a QAT fusion or a PTQ fusion. This is not ideal since the training flag has nothing to do with quantization, so this PR introduces an extra flag `is_qat` to control it.

Note: currently we still have the constraint that when `is_qat` is True, the modules must be in training mode; we can relax this constraint later.

Test Plan:
```
python test/test_quantization.py TestFuseFx
python test/test_quantization.py TestFusion
```

Imported from OSS

Reviewed By: mruberry

Differential Revision: D33178977

fbshipit-source-id: 0c1499c45526971140d9ad58e2994d1edf5ad770
(cherry picked from commit 2d51f9fb28967f1c5aab260d84b8d32d838f4f26)
2022-01-26 23:33:28 +00:00
b066931106 fixing of usage of rel_tol for test adadelta (#71880)
Summary:
Recently I made a PR to change some test tolerances: https://github.com/pytorch/pytorch/pull/69919
It turns out that the previous decorator does not work with the test optim unit test framework. I have summarized the issue in the following doc:
https://docs.google.com/document/d/1BOrp29r31A2WXwM0O6ydsCs43wi01sAgdduKd7is_ec/edit?usp=sharing

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71880

Reviewed By: cpuhrsch

Differential Revision: D33801967

Pulled By: jbschlosser

fbshipit-source-id: 094feba10e2ee2a94e3ab754e4140e16b634ea09
(cherry picked from commit d504ddd950f69a6784b93a2e7630d24d5c7051fe)
2022-01-26 23:33:28 +00:00
c224f82ed3 Revert "Disable XLA config"
This reverts commit cda6f40151c2bd9d2e981141d809a47e46c26c52.

Reverted https://github.com/pytorch/pytorch/pull/71733 on behalf of @malfet
2022-01-26 23:19:30 +00:00
bbe6144b45 Revert "Fix lint"
This reverts commit 9d47652bee39ffd0001fc6088ae949be08674710.

Reverted https://github.com/pytorch/pytorch/pull/71735 on behalf of @malfet
2022-01-26 23:11:23 +00:00
7beb030e11 .github: Exclude rocm from ciflow/all,ciflow/trunk
Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71884
2022-01-26 22:38:12 +00:00
5bd33247ec [GHF] Add revert workflow
This adds `try_revert` repository dispatch that will revert commit
that were previously landed by merge workflow if requested by org member

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71868
2022-01-26 22:35:02 +00:00
5ee629e50d .github: Enable windows binary builds (#71484)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71484

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

Test Plan: Imported from OSS

Reviewed By: malfet, atalman

Differential Revision: D33800904

Pulled By: seemethere

fbshipit-source-id: 56d0a6e34ac8023745e36ae341efec79384d1dde
(cherry picked from commit 0339a882c99d0da56a85d492ac1aab2188816daa)
2022-01-26 22:29:33 +00:00
9f4bdf7811 Refactor flatbuffer loader to allow overriding how IValues are parsed. (#71661)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71661

https://docs.google.com/document/d/1OoPKREoqNbOUIcbGzfk8TTIibeTgx3c6Lr3NthF7-PM/edit

Test Plan: unittest

Reviewed By: zhxchen17

Differential Revision: D33720630

fbshipit-source-id: da24993cf5568c689cb6fda64ba4943d77f8b5e6
(cherry picked from commit 327cf75d234ee2b1aea79dc909b890b96927f536)
2022-01-26 22:29:33 +00:00
666ff0ae22 Update _create_c10d_store to check port value (#71863)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71863

Port number is an int in Python, but needs to be a uint16_t when passed to the TCPStore constructor.

Related to #67172

Test Plan: Imported from OSS

Reviewed By: cbalioglu

Differential Revision: D33793270

Pulled By: H-Huang

fbshipit-source-id: 89ab47ec8bd7518f9ecbf7d01871fe059b0e77b1
(cherry picked from commit 84bff1f5bb11029ff3fcf7a04faa3b9c7b25286a)
2022-01-26 22:29:33 +00:00
d7e5870b9e Fixes pr-labels workflow trigger (#71871)
Summary:
Fixes pr-labels workflow trigger

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71871

Reviewed By: anjali411, janeyx99

Differential Revision: D33796618

Pulled By: atalman

fbshipit-source-id: b93595b62ee831a40578e524e499e0ddd6affeb8
(cherry picked from commit 3080548ff3cbd9b74621602d4e211e5ea0034f5c)
2022-01-26 22:29:33 +00:00
d73dc9b7d1 [GHF] Small cleanups
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71867
2022-01-26 22:08:10 +00:00
bdcdf94bdd [Opt Overlap] Clean up code in _OptimizerHookState (#71620)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71620

Remove from_functional_optim and make it the default constructor, since
that is the only way _OptimizerHookState is now being built. Also, we no
longer need to expose the create_functional_optim helper function.
ghstack-source-id: 147577174

Test Plan: CI

Reviewed By: cbalioglu

Differential Revision: D33700593

fbshipit-source-id: ba089ce3bf66ccf8f71cffdd0f4d4bddc03e8b14
(cherry picked from commit a50b2caf0e19f9793fbf18b371d30e3dd8c5c0cf)
2022-01-26 19:33:49 +00:00
1c8fcc44cb [Opt Overlap] Support optimizing partial set of parameters (#71608)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71608

Per title
ghstack-source-id: 147577178

Test Plan: CI

Reviewed By: cbalioglu

Differential Revision: D33696382

fbshipit-source-id: 5b638d3edf5f03ba476356d61e96ca604de18c8f
(cherry picked from commit 436b547fb0080c81e656fa4753b5d7275e3a3283)
2022-01-26 19:33:49 +00:00
c44d0ac181 Implement labelling for release notes and topics check (#71726)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/68459

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71726

Reviewed By: malfet

Differential Revision: D33793084

Pulled By: atalman

fbshipit-source-id: 5840bff63c897f09e9d2d70ef435b8764c5d64c0
(cherry picked from commit dd6b3a1131913d2ad8b6dde18aabede3495adc29)
2022-01-26 18:33:24 +00:00
46817895bd [Profiler] Split observer implementations based on ProfilerState (#71135)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71135

The NVTX profiler is quite different from the other Kineto cases, so it's worth it to peel it off early so that later logic can assume either KINETO or KINETO_GPU_FALLBACK. This is more important since we're going to change the Kineto internals. (You can see the python tracer was unnecessarily coupled to NVTX just because the control logic was intermingled.)

There's also no reason to put the legacy observer state in the header rather than the cpp file now that the kineto profiler doesn't need it, so we should shield it from prying eyes.

The recent headaches with TLS downcasting and RPC integration (D32678163 (7ea86dfdb1), D33283314 (681e78bace), D33437773 (7d6535cab3)) have made crystal clear that we need a lot more safety in the profiler, particularly as we shift things around.

Test Plan: Unit tests. This is no longer a performance PR.

Reviewed By: aaronenyeshi

Differential Revision: D32710829

fbshipit-source-id: f9138598b3cfeba71872905a7afab3c03c0d56e7
(cherry picked from commit 059a39d8e3b184337ddd401cfd242c47b8ad0538)
2022-01-26 18:33:24 +00:00
d3bbb281f3 [numpy] add decimals argument to round (#66195)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/65908

Added a new overload instead of updating the current signature. (Had issues with JIT and **maybe** it would have been FC breaking)

TODO:

* [x] Don't compute `std::pow(10, decimals)` for each element.
* [x] Update docs (https://docs-preview.pytorch.org/66195/generated/torch.round.html?highlight=round#torch.round)
* [x] Add tests
* ~~Should we try to make it composite?~~
* ~~Should we add specialized test with more values of `decimals` outside of OpInfo with larger range of values in input tensor?~~

cc mruberry rgommers

Pull Request resolved: https://github.com/pytorch/pytorch/pull/66195

Reviewed By: anjali411

Differential Revision: D31821385

Pulled By: mruberry

fbshipit-source-id: 9a03fcb809440f0c83530108284e69c345e1850f
(cherry picked from commit 50b67c696880b8dcfc42796956b4780b83bf7a7e)
2022-01-26 17:35:03 +00:00
7e6312a5df [SR] Reverse iteration order in resetMemory (#71705)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71705

This fixes a crash in `resetMemory` caused by trying to access a `TensorImpl` via a borrowed `IValue` after it had already been destroyed. We need to clean up all borrows *before* we destroy the owning `IValue`, not after.
ghstack-source-id: 147688982

Test Plan:
New unit test covers this case

ICE w/ inline_cvr v0 [finishes successfully](https://www.internalfb.com/intern/unidash/dashboard/ads_infra_cost_estimation/a_metrics/?e[select_ESTIMATION_RUN_ID]=ICE_mikeiovine_16431103211c65), didn't see any nnpi errors

Reviewed By: ajyu

Differential Revision: D33725435

fbshipit-source-id: f8dd109382b5cf54df6f194f8dcb5c0812b174bb
(cherry picked from commit 31339d9d38e63248d2ac3646be71008ed731f16c)
2022-01-26 17:35:03 +00:00
e04ade92ae Skip compiledWithCuDNN() call for mobile to avoid segfault (#71775)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71775

Mobile is running into segfaults at the `compiledWithCuDNN()` call as described in T110194934. This fix works around it with an #ifdef, following the approach done [here](d32b7d9585/aten/src/ATen/native/Convolution.cpp (L1076-L1088)). TBD how to fix the underlying cause.

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33778888

Pulled By: jbschlosser

fbshipit-source-id: 2a22b2eaa858ee6adf5b3c25a1c470c6aebc3f87
(cherry picked from commit e90a6bb402f45f45b7219f453ca38ee85603f3eb)
2022-01-26 17:01:32 +00:00
0891c908bb Revert D33768645: Set correct device id on efficientzerotensors
Test Plan: revert-hammer

Differential Revision:
D33768645 (5dd6cd55ba)

Original commit changeset: 66ce9907630b

Original Phabricator Diff: D33768645 (5dd6cd55ba)

fbshipit-source-id: 4bb1ad46f01cd33aeb813bdc123741cf665194a8
(cherry picked from commit 8ca385b1d8f80f7d2d40a1f177f5155c228ab46e)
2022-01-26 17:01:32 +00:00
adcf34f65a Revert D33778917: Disable some forward mode AD tests
Test Plan: revert-hammer

Differential Revision:
D33778917 (24f577dcb2)

Original commit changeset: 57dfbff61817

Original Phabricator Diff: D33778917 (24f577dcb2)

fbshipit-source-id: f734169bc15baacbe40da904a84def02a1af8261
(cherry picked from commit 630edf5bedc3bd06c1da796e428e3d31d3b44bde)
2022-01-26 17:01:32 +00:00
25e84fa4e5 Add forward AD formulas for some losses (#71026)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71026

...and fmod

Testing:
- L1Loss: new module tests (linear in the real case only)
- SmoothL1Loss: new module tests
- MSELoss: tested - OpInfo + new module tests
- huberloss: tested - OpInfo + new module tests
- multi-margin-loss: new module tests
- kl-div: OpInfo + new module tests
- fmod: OpInfo

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33485661

Pulled By: soulitzer

fbshipit-source-id: 542ef5148183b9f574d06b2e2e345d0d889537b7
(cherry picked from commit 60765438e8de82cf9dd2fca71f2ae218c0a38493)
2022-01-26 16:31:26 +00:00
c6d885e489 extract out //c10/core:base library (#70857)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70857

ghstack-source-id: 147302543

Test Plan: Relying on CI

Reviewed By: malfet

Differential Revision: D33329579

fbshipit-source-id: 961abdecabb7b2c6f090e00a6a670e5b70aa5bca
(cherry picked from commit 2b8c4bb0a4f6b22e028aa4cfbf06f09fb6873fa3)
2022-01-26 16:31:26 +00:00
130ca58601 extract final two libraries out of //c10/util (#70856)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70856

ghstack-source-id: 147302541

Test Plan: Relying on CI

Reviewed By: malfet

Differential Revision: D33329555

fbshipit-source-id: 1e7884b2df1c294a8fe9e7f3664a139487d27978
(cherry picked from commit 643cc436ec416ea42a73ec2e376b1d5e747192ac)
2022-01-26 16:31:26 +00:00
bfc481cf67 extract //c10/core:ScalarType to its own library (#70855)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70855

This library is depended on by parts of util so it has to go next.
ghstack-source-id: 147170897

Test Plan: Rely on CI.

Reviewed By: malfet

Differential Revision: D33329527

fbshipit-source-id: 28a111f602ee085c1d9b0acec29790488f8c8f0b
(cherry picked from commit e3601b94ff4a89caeb0c012a0d946613934646b9)
2022-01-26 16:31:26 +00:00
f37d2046f8 Implements allreduce_coalesced for ProcessGroupNCCL (#62140)
Summary:
Implements allreduce_coalesced for ProcessGroupNCCL as an NCCL group of allreduces on separate tensors, as proposed in https://github.com/pytorch/pytorch/issues/38995#issuecomment-882804595. In recent versions of NCCL, performance of grouped comms has improved significantly. A group can execute with just one kernel, so a grouped comm on a set of unflattened tensors can be more performant than flattening + a single flat NCCL call.

The same approach can easily extend to broadcast_coalesced and reduce_coalesced.

I'm still not sure how (hypothetical) all_gather_coalesced and reduce_scatter_coalesced ops should be exposed or implemented, because we need to consider "_base" variants where the output or input tensor is pre-flattened. For example, https://github.com/pytorch/pytorch/issues/61781 effectively wants "allgather_base_coalesced".

I'm also not sure how the _multigpu variants should enter the picture. With the approach I've written here, ProcessGroupNCCL::allreduce accepts a vector of tensors that are either all on the same device (in which case it'll do an allreduce_coalesced) or all on different devices (in which case it'll do an allreduce_multigpu). In other words it can do _coalesced or _multigpu but not both at once.

for some reason github wont let me add agolynski to the reviewers

cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse SciPioneer H-Huang

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62140

Reviewed By: fduwjj

Differential Revision: D33781010

Pulled By: cbalioglu

fbshipit-source-id: f0c233da9ebae57d7ccecf6d8dc432d936d4d3ce
(cherry picked from commit e43cb81d300bd9e9926f6e01ae77f4accb12c258)
2022-01-26 13:31:30 +00:00
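What allreduce_coalesced computes can be sketched with plain Python lists standing in for per-rank tensor lists: every rank ends up with the elementwise sum of each tensor across ranks, without flattening the tensors into one buffer first (in the real implementation, NCCL grouping is what makes the unflattened form cheap).

```python
def allreduce_coalesced(per_rank_tensor_lists):
    """Sum the t-th 'tensor' elementwise across all ranks; every rank
    receives the same reduced list of tensors."""
    n_tensors = len(per_rank_tensor_lists[0])
    reduced = []
    for t in range(n_tensors):  # one grouped reduction per tensor
        length = len(per_rank_tensor_lists[0][t])
        summed = [sum(rank[t][i] for rank in per_rank_tensor_lists)
                  for i in range(length)]
        reduced.append(summed)
    # each rank gets its own copy of the reduced result
    return [[list(x) for x in reduced] for _ in per_rank_tensor_lists]

rank0 = [[1, 2], [10]]
rank1 = [[3, 4], [20]]
out = allreduce_coalesced([rank0, rank1])
print(out[0])   # [[4, 6], [30]]
```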
942a084c46 Remove state_dict from AveragedModel and use buffers instead (#71763)
Summary:
Fixes [https://github.com/pytorch/pytorch/issues/66686](https://github.com/pytorch/pytorch/issues/66686)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71763

Reviewed By: anjali411

Differential Revision: D33770907

Pulled By: prabhat00155

fbshipit-source-id: ee32f2cb8475c9add4e1a9a5d3d784ef95825efc
(cherry picked from commit a15898b072ae5234c76afa005ec492ed158c51aa)
2022-01-26 13:31:30 +00:00
40e88b75c4 extract out //c10/util:base library (#70854)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70854

We can't do the entire package since parts of it depend on //c10/core.
ghstack-source-id: 147170901

Test Plan: Rely on CI.

Reviewed By: malfet

Differential Revision: D33321821

fbshipit-source-id: 6d634da872a382a60548e2eea37a0f9f93c6f080
(cherry picked from commit 0afa808367ff92b6011b61dcbb398a2a32e5e90d)
2022-01-26 11:51:45 +00:00
108b37db84 [Array API] Add linalg.diagonal (#70599)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70599

This PR adds `linalg.diagonal` following the Array API:
https://data-apis.org/array-api/latest/extensions/linear_algebra_functions.html#linalg-diagonal-x-axis1-0-axis2-1-offset-0

Fixes https://github.com/pytorch/pytorch/issues/62813

cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano rgommers pmeier asmeurer leofang AnirudhDagar asi1024 emcastillo kmaehashi

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33760506

Pulled By: mruberry

fbshipit-source-id: e32c3490321d8c3f31b3bb538bc1f72b39bd2854
(cherry picked from commit 44f41f8e3922892ca2f86c9c05249336de40e9ee)
2022-01-26 08:08:32 +00:00
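The Array API semantics that `linalg.diagonal` follows can be sketched for a single 2-D matrix: take the diagonal with an offset, where a positive offset selects above the main diagonal and a negative one below. The real op additionally batches over leading dimensions, which this sketch omits.

```python
def diagonal_2d(matrix, offset=0):
    """Extract the (offset-shifted) diagonal of a 2-D list-of-lists."""
    rows, cols = len(matrix), len(matrix[0])
    out = []
    for i in range(rows):
        j = i + offset
        if 0 <= j < cols:
            out.append(matrix[i][j])
    return out

m = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
print(diagonal_2d(m))       # [1, 5, 9]
print(diagonal_2d(m, 1))    # [2, 6]
print(diagonal_2d(m, -1))   # [4, 8]
```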
fe277b8717 [jit][edge] Migrate to TypeFactory for jit types on mobile (#71516)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71516

Mobile should be able to contruct dynamic types by default.
ghstack-source-id: 147498365

Test Plan:
CI.

**-48KB** binary size reduction for igios BSB.
UMBEX link: https://www.internalfb.com/intern/unigraph/explorer/?jsgq_traversal_spec=%7B%22builds%22%3A[%22bsb%3A422553426218394%5Cu0040base%22%2C%22bsb%3A422553426218394%5Cu0040diff%22]%7D&unigraph_project=UnigraphProjectMbex&is_mbex_redirected

Reviewed By: iseeyuan

Differential Revision: D33673958

fbshipit-source-id: 8600c04ae929283681971aae264d3774188df9cd
(cherry picked from commit 64ebcec09e69d2eff64fdbf926fb43d3b67f99b2)
2022-01-26 07:32:04 +00:00
e5794974cb [acc_tracer] Do not rewrite the leaf modules (#71790)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71790

If a leaf module is specified, it means we should treat it as a blackbox and we should just avoid rewriting it too.

Test Plan:
```
buck test caffe2/test:test_fx_acc_tracer
```
with a new unit test.

Reviewed By: jfix71, houseroad, wushirong

Differential Revision: D33731903

fbshipit-source-id: 0560d9e8435b40f30d9b99dc3b2f47d1a04eb38b
(cherry picked from commit 747e9e44ee1792bd6ac5089ced4ffe5f43b09316)
2022-01-26 07:32:04 +00:00
d3354602fc [Easy] DDP typo fix (#71607)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71607

Per title
ghstack-source-id: 147577177

Test Plan: N/a

Reviewed By: cbalioglu

Differential Revision: D33694038

fbshipit-source-id: 5a5a618f13bc8b91127169efcebb90b5a36474a1
(cherry picked from commit 62f17f116d8c94f11c93c4d04149bc1e6ab504aa)
2022-01-26 07:32:04 +00:00
10ca760c0a [Opt Overlap] Implement register_fused_optim in DDP (#71606)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71606

Per title
ghstack-source-id: 147577172

Test Plan: CI

Reviewed By: cbalioglu

Differential Revision: D33694037

fbshipit-source-id: a148d5ce6031f0cc20f33785cfe2c27d1fc2d682
(cherry picked from commit ace3261e0cd6898e3203cf30e78e17e80e5fc42f)
2022-01-26 07:32:04 +00:00
8273912a8c [Opt Overlap] Implement _OverlappedOptimizer (#71605)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71605

ghstack-source-id: 147577173

Test Plan: CI

Reviewed By: cbalioglu

Differential Revision: D33692686

fbshipit-source-id: b0fdb45245d923e1de8fef4431d3e235ac57dcbf
(cherry picked from commit 8b83dbf690d6f426cc5f3954d77c829a3d302962)
2022-01-26 07:32:04 +00:00
bd6ec4efb4 [TensorExpr] Add lowerings for scalar binary ops (+,-,*,/,&,|,^,<<,>>,cmp). (#71298)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71298

Differential Revision: D33576534

Test Plan: Imported from OSS

Reviewed By: anjali411

Pulled By: ZolotukhinM

fbshipit-source-id: 93787b6f11180fcbfbacbb55e1bfb79700320a0e
(cherry picked from commit b2a8e83f97075a9e2c241eefc2357974c8fe0098)
2022-01-26 06:32:51 +00:00
1dbcde2ade [TensorExpr] Support scalar intermediate and output values. (#71186)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71186

So far we've only supported scalar inputs, but couldn't handle scalar outputs
or intermediates. This PR adds that support.

Scalar outputs are returned as 0-dim tensors. If the kernel is invoked on a
stack of IValues, we correctly convert the results to scalar IValues when
needed. If the kernel is invoked with a vector of void* pointers, everything
works out of the box without any conversions.

Lowerings for scalar operators are a bit tricky. Usual lowerings return a pair
<Buf, Stmt> (aka Tensor), but for scalar operators we also want to have the
corresponding Var that the lowering function supposedly creates (in theory we
could just use Loads and Stores, but I'm worried it can affect performance as
there is no guarantee this will be optimized by LLVM). So, what we do here to
work around this is we return a fake buf + stmt that sets the corresponding
var. Then outside of the lowering we create a real buffer and generate a Store
to it with the value from the variable we passed as the base handle of the fake
buf. This real buffer is then treated as usual by the rest of the system and we
can use it if we need to return this scalar value as a kernel output. If we do
not need to return it, then the Store will be deleted by the DCE pass.

Differential Revision: D33539324

Test Plan: Imported from OSS

Reviewed By: navahgar

Pulled By: ZolotukhinM

fbshipit-source-id: ab4524b9820ce204f106effcf6232ed33d4ee223
(cherry picked from commit 7faa0939f08e7235c2a7faa49da5eb84372165e7)
2022-01-26 06:32:51 +00:00
530e7f6195 Define check_sizes_nonnegative as inline (#71640)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71640

Moving this function into the cpp file caused a small regression in
empty_cpu's callgrind instruction count, so moving it back.

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D33712880

Pulled By: ngimel

fbshipit-source-id: 64b3cb76308da38a3f0384de69500bea6ce6a30b
(cherry picked from commit d3791bc986d12a2e995bfb65fed5c35ddf7a9ae6)
2022-01-26 05:31:19 +00:00
88c298c28f Fix symbolic shape function for flatten (silvasean's) (#71762)
Summary:
in response to https://github.com/pytorch/pytorch/pull/71727

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71762

Reviewed By: anjali411

Differential Revision: D33771227

Pulled By: makslevental

fbshipit-source-id: 7a46d200665d51c95978153f0638f35cdc7c3742
(cherry picked from commit eb3f404093010a62977c4559d4a931093a991ccc)
2022-01-26 04:30:18 +00:00
66939e3b94 [acc_tracer] Add test coverage for retracing (#71752)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71752

Added coverage for reshape specifically, which required a fix. The problem for `acc_ops.reshape`, as best as I understand it:

- `torch.reshape` requires the `shape` arg to be a `tuple` of `ints`
- If `torch.reshape` is passed a `tuple` whose first element is not an `int`, it throws a TypeError, e.g. `TypeError: reshape(): argument 'shape' (position 2) must be tuple of ints, not tuple`
- This TypeError is thrown when the `shape` we're reshaping to contains an FX Proxy, which happens when the first element of the `shape` tuple is input-dependent.
- As a workaround we use `tensor.reshape` instead of `torch.reshape`, which doesn't do equivalent type checking for a `tuple` of `ints`.
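The workaround can be sketched with a symbolically traced module whose target shape is input-dependent (the module below is an illustrative example, not code from this diff):

```python
import torch
import torch.fx

class Flattener(torch.nn.Module):
    def forward(self, x):
        # Under symbolic tracing, x.shape[0] is an fx.Proxy, so passing
        # (x.shape[0], -1) to torch.reshape would trip its "tuple of ints"
        # type check; the Tensor.reshape method tolerates the Proxy.
        return x.reshape(x.shape[0], -1)

gm = torch.fx.symbolic_trace(Flattener())
out = gm(torch.randn(4, 2, 3))
assert out.shape == (4, 6)
```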

Also remove unnecessary `acc_utils.get_field_from_acc_out_ty()` with cast to `TensorMetadata`.

Test Plan: Added test coverage

Reviewed By: yinghai

Differential Revision: D33760455

fbshipit-source-id: bff5563bf9e3d9e9318901b56211151d2c0e4eb2
(cherry picked from commit d5c1b9732a208dd305a3215920f1ea23e2f327f7)
2022-01-26 04:30:18 +00:00
b36b11cbc1 Separating CaptureDataFrame out of DFIterDataPipe (#71776)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71776

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D33771602

Pulled By: VitalyFedyunin

fbshipit-source-id: 59d85bc707a9568f1f0960fc184113a4f422d2df
(cherry picked from commit 93522768efc8c525887ad52b415009535fe02cfb)
2022-01-26 03:25:02 +00:00
dfcbe059ec Obliviate ALL_TENSORTYPES and ALL_TENSORTYPES2. (#71153)
Summary:
Hi,
The PR fixes https://github.com/pytorch/pytorch/issues/71096. It scans all the test files and replaces `ALL_TENSORTYPES` and `ALL_TENSORTYPES2` with `get_all_fp_dtypes`.

I'm looking forward to your viewpoints!

Thanks!

cc: janeyx99 kshitij12345

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71153

Reviewed By: jbschlosser, mruberry

Differential Revision: D33533346

Pulled By: anjali411

fbshipit-source-id: 75e79ca2756c1ddaf0e7e0289257fca183a570b3
(cherry picked from commit da54b54dc5f1c7d9db92dab98c2db177d944cc7e)
2022-01-26 03:25:02 +00:00
166d4e4201 Change test_conv_large parameter initialization (#71521)
Summary:
This PR twiddles the parameters of the conv layer in `test_conv_large` to better avoid NaN values. Previously, this test would cause a NaN to be computed for `scale` (propagated from `.mean()` on the `.grad` tensor). This NaN would then be propagated to the scaled gradients via division, resulting in a bogus `assertEqual` check as `NaN == NaN` is by default true. (This behavior was observed on V100 and A100).
To improve visibility of failures in the event of NaNs in `grad1`, scale is now computed from `grad2`.

Interestingly enough, we discovered this issue when trying out some less common setups that broke this test; it turns out those breakages were cases where there were no NaN values (leading to an actual `assertEqual` check that would fail for `float16`).
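The NaN pitfall behind the bogus check is just IEEE-754 semantics, sketched here without any CUDA involvement:

```python
import math

nan = float("nan")

# IEEE-754: NaN never compares equal to itself...
assert nan != nan
assert math.isnan(nan)

# ...so a test harness that deliberately treats NaN == NaN as "equal"
# (as assertEqual does by default) passes vacuously when both the
# computed and reference values are all-NaN.
computed, reference = nan, nan
passes_vacuously = math.isnan(computed) and math.isnan(reference)
assert passes_vacuously
```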

CC ptrblck ngimel puririshi98

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71521

Reviewed By: anjali411

Differential Revision: D33776705

Pulled By: ngimel

fbshipit-source-id: a1ec4792cba04c6322b22ef5b80ce08579ea4cf6
(cherry picked from commit d207bd9b87f8e8c2cb13182b7295c17e19dc3dba)
2022-01-26 02:32:15 +00:00
965b9f483e [cuDNN] Add a new optimized cuDNN RNN algorithm for small RNN hidden_size (#62143)
Summary:
This PR enables a new cuDNN RNN/LSTM algorithm `CUDNN_RNN_ALGO_PERSIST_STATIC_SMALL_H` when the hidden_size is small. Operator benchmark observes 10x performance improvement in some shapes.

- [X] forward https://github.com/xwang233/code-snippet/tree/master/cudnn-rnn-bench-62143/forward
- [X] backward https://github.com/xwang233/code-snippet/tree/master/cudnn-rnn-bench-62143/backward
- [X] end-to-end model: benchmark looks good

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62143

Reviewed By: anjali411

Differential Revision: D33771442

Pulled By: ngimel

fbshipit-source-id: 0640abc6b90ebd2428c3182ce03bf0b9c30a2ec9
(cherry picked from commit 73b153a528fb9b64b994c1174882bc2f64b1ed47)
2022-01-26 02:32:15 +00:00
358b5078ec update missing ops message (#71294)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71294

This message will be visible to both internal and OSS users.

Test Plan: sandcastle

Reviewed By: dhruvbird, cccclai

Differential Revision: D33575804

fbshipit-source-id: a672e065f80aa20abd344951f0aaa07104defaf7
(cherry picked from commit 53703bed101c2a3f04bf85191681a95a137d1146)
2022-01-26 01:31:47 +00:00
24f577dcb2 Disable some forward mode AD tests (#71791)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71791

forward fix for the failing tests.

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D33778917

Pulled By: anjali411

fbshipit-source-id: 57dfbff618172652c4bcdf56308196bc5b1ecea4
(cherry picked from commit dd7f86e6f9f79668f2990bd0482b4813ec2f38d7)
2022-01-26 00:34:29 +00:00
e4500306c8 [Quant] Enable default reference path for CopyNodeQuantizeHandler (#71168)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71168

In this PR we want to enable the reference path by default for CopyNodeQuantizeHandler

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: andrewor14

Differential Revision: D33715995

fbshipit-source-id: eda44892fcea3a1cba54ac75bc020f73e1becc8c
(cherry picked from commit a2cf63f68d36a3847dd3d2fae7614469ffaab51b)
2022-01-25 23:32:11 +00:00
5dd6cd55ba Set correct device id on efficientzerotensors (#71611)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71611

Fixes https://github.com/pytorch/pytorch/issues/71160

Test Plan: Imported from OSS

Reviewed By: pbelevich, ngimel

Differential Revision: D33768645

Pulled By: anjali411

fbshipit-source-id: 66ce9907630b65a12c0775077147a7e72ff4cee4
(cherry picked from commit 3af98a4d707463a12f2b39bc839c5d7e51eec840)
2022-01-25 23:32:11 +00:00
ce6e6812b1 use legacy unrolled kernel for non-trivial offset calc cases (#71710)
Summary:
This leads to across the board improvements on Pascals, big perf improvements for some broadcasting patterns and datatypes on V100 (along with some 3-5% regressions for some other patterns). The most common improving pattern on V100 is half-precision x+bias, that improves by ~5%. Full V100 results in https://docs.google.com/spreadsheets/d/1K67x-6_TPT9Yt6533NfECEhUyfbqBxLH9M5Z3gymzXE/edit#gid=1218963246, benchmarking script in https://gist.github.com/ngimel/986ee84a1dd234a0485e99544e0fc8b6
Most importantly, it reduces context size by 40 MB.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71710

Reviewed By: mruberry

Differential Revision: D33769330

Pulled By: ngimel

fbshipit-source-id: 5a7942261e06003ca79bfa3b071106aab1a8a4bc
(cherry picked from commit f9b51b48112b25353c928711974537a0792516c8)
2022-01-25 22:30:48 +00:00
f3ebf06e98 Release GIL when assigning to real or imag components (#71747)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71747

The getter is trivial as it's just creating a view tensor, but the
setter is actually copying data so does call into kernel code.
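The getter/setter asymmetry can be seen in eager mode (the GIL behavior itself is not observable from this sketch):

```python
import torch

z = torch.tensor([1 + 2j])

# Getter: a view over the real components, no data copy.
r = z.real
assert r.item() == 1.0

# Setter: copies data into the complex tensor, which is the call that
# now runs with the GIL released.
z.real = torch.tensor([5.0])
assert z.item() == 5 + 2j
```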

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33770046

Pulled By: albanD

fbshipit-source-id: f0a70acaef790ae1e5b2f68ac4ce046e850c9624
(cherry picked from commit 36a0109400b256b32a185fcd05f21f302197c081)
2022-01-25 22:30:48 +00:00
dba42056d8 Release GIL in Tensor indexing functions (#71728)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71728

Fixes gh-68739

For simple indexing this adds a `gil_scoped_release` before calling
`set_item`. For tuple indexing, the slicing operation is done with the
GIL because otherwise it would have to re-aquire the GIL for each
element in the tuple. However, the GIL is released for the final
`copy_to` operation which is where the actual kernels are called.

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33770047

Pulled By: albanD

fbshipit-source-id: 67304a65e2cbf3b3ba9843687d9c63926d29298f
(cherry picked from commit d0a85046b7a497df8f377ff43f1667982ede7f2a)
2022-01-25 22:30:48 +00:00
de8d0203e9 Allow torch.Tensor.real on real-valued tensors (#71718)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71718

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33770668

Pulled By: anjali411

fbshipit-source-id: bad21ebe72220b9017a0b8efa71eaeab84bd9e9f
(cherry picked from commit aa0a922757277ac7b3ad4d633648a89c385ccc0d)
2022-01-25 22:30:48 +00:00
03f1f0cfe4 Check the availability of MAGMA / cuSOLVER when setting the Linalg backend. (#69826)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69826

This simplifies the logic needed to handle the defaultBackend flag in linalg functions.

cc ngimel jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano

Test Plan: Imported from OSS

Reviewed By: mikaylagawarecki, ngimel

Differential Revision: D33751984

Pulled By: mruberry

fbshipit-source-id: 6963820be38d4f2d82ebb5196dfcccf034ad6784
(cherry picked from commit 49c81220160062a05bc10a25d487a1f14a2959cd)
2022-01-25 21:40:31 +00:00
332d67b065 Add hascuSOLVER flag to Context (#69825)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69825

As per title.

cc ngimel jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano

Test Plan: Imported from OSS

Reviewed By: mikaylagawarecki, ngimel

Differential Revision: D33751986

Pulled By: mruberry

fbshipit-source-id: 8625c7246d627b5c3680d92d4e8afdd7efc7dd69
(cherry picked from commit 7ca16beb28bc541b65cd07271f40c889c30e3b85)
2022-01-25 21:40:31 +00:00
12e01f7825 linalg.matrix_rank: fix cpp interface + add more overloads (#70575)
Summary:
As per title.

cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70575

Reviewed By: albanD

Differential Revision: D33760541

Pulled By: mruberry

fbshipit-source-id: e048941311c885f91ae524ab34cb732a18eda6c4
(cherry picked from commit 2d686e002d908c5307aac121ede5a0a03bca3327)
2022-01-25 21:29:31 +00:00
33403f4848 edge_order check in torch.gradient only applies to dim argument (#67926)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/67919

The compatibility check on `edge_order` in `pre_check_gradient` now looks only at the `dim` argument if it is present; otherwise it checks all dimensions.

Previously, it would check all dimensions regardless of the dim argument and throw unnecessary errors.
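With the fix, a dimension that is too small for the requested `edge_order` no longer matters as long as it is not listed in `dim` (illustrative sketch):

```python
import torch

t = torch.randn(5, 2)

# dim=1 has size 2, too small for edge_order=2, but since we only
# differentiate along dim=0 it is no longer checked.
(g,) = torch.gradient(t, dim=0, edge_order=2)
assert g.shape == t.shape
```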

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67926

Reviewed By: albanD

Differential Revision: D33760621

Pulled By: mruberry

fbshipit-source-id: d490cd8610c68ff3787e670fc947de3cbf2db062
(cherry picked from commit 45bc56de9e287f715186378682e22bc6ac7a6ccc)
2022-01-25 21:29:31 +00:00
f3e81f3eed Remove copies in jit_log.cpp (#67841)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/67841

Reviewed By: anjali411

Differential Revision: D33768433

Pulled By: ZolotukhinM

fbshipit-source-id: 9c081895f7b98eb1ed55fc65250d5ab1f33463b7
(cherry picked from commit a32445da4dc6b69c8ad79282031128b0e637be82)
2022-01-25 20:32:12 +00:00
07ca1fc88b remove hasPrimaryContext workaround on ROCm (#71146)
Summary:
As issue https://github.com/pytorch/pytorch/issues/59750 is fixed, this PR is to remove the workaround implemented for it on ROCm.

Enabled hasPrimaryContext() related PyTorch unit tests on ROCm.

cc: amathews-amd, jithunnair-amd

cc jeffdaily sunway513 jithunnair-amd ROCmSupport KyleCZH

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71146

Reviewed By: anjali411

Differential Revision: D33754615

Pulled By: albanD

fbshipit-source-id: b3a5c65a20c6d52d5f2ffc9e6f9628c819329b5d
(cherry picked from commit cfdd12166cfd1365de0ebe5a75ce40ac7fde15cc)
2022-01-25 20:32:12 +00:00
22a77d7b92 [warning] Disable broken assert (#71778)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71778

This assert was broken (never triggers).  Fixing the assert leads to test failures.  We need to fix those test failures, so a FIXME has been filed.  The urgency is avoiding the compile time failure that will come with enabling `-Wstring-conversion` as an error.

Test Plan: CI Pass

Reviewed By: r-barnes

Differential Revision: D33754171

fbshipit-source-id: 834b070b94007af583d0fc6c022f23b6703f3fbc
(cherry picked from commit ac8f905fb11c75b470b964f5ff5157e79d4c4b60)
2022-01-25 20:32:12 +00:00
456a4dc6bb [warning] Fix TORCH_INTERNAL_ASSERT calls missing condition to check 2/x (#71767)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71767

This will fix a ton of broken asserts that should always fire but never actually fire.
All would have been caught with `-Wstring-conversion` warnings enabled.

Test Plan: CI Pass

Differential Revision: D33754170

fbshipit-source-id: fa47dbf3b3e6cc27a2dfbdce7ac0416c47122ad7
(cherry picked from commit 23802fe3b5e14bbee6affc1393f3966603f5a983)
2022-01-25 20:32:12 +00:00
f866e8b5aa [fx2trt] Add trt splitter setting (#71717)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71717

Add a setting class for trt splitter which has a specific setting `use_implicit_batch_dim`. Further diffs will try to merge `fx2trt/split.py` and trt splitter.

Test Plan: CI

Reviewed By: wushirong

Differential Revision: D33745251

fbshipit-source-id: 5192da9c9b69d86839a8f26636852d405a40cfe7
(cherry picked from commit e2b0ccb1fab82eb54145404f7fce82294693f9a4)
2022-01-25 20:32:12 +00:00
9a2b43085d Improve docs for from_dlpack and to_dlpack (#70437)
Summary:
This moves the warning to the legacy function where it belongs, improves the phrasing, and adds examples.

There may be more to do to make `from_dlpack` more discoverable as a follow-up, because in multiple issues/PR we discovered people wanted new things (e.g., a memoryview-like object, or `__array_interface__` support) that `from_dlpack` already provides.
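A small example of the modern `from_dlpack` path, which exchanges data zero-copy (here round-tripping a PyTorch tensor through itself; normally the producer would be another framework):

```python
import torch
from torch.utils.dlpack import from_dlpack

x = torch.arange(4)
y = from_dlpack(x)        # consumes x.__dlpack__(); no legacy capsule needed

# The exchange is zero-copy: both tensors share the same storage.
y[0] = 99
assert x[0].item() == 99
```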

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70437

Reviewed By: albanD

Differential Revision: D33760552

Pulled By: mruberry

fbshipit-source-id: e8a61fa99d42331cc4bf3adfe494cab13ca6d499
(cherry picked from commit 880ad9665956078958af93132a4c6ae820bbaac9)
2022-01-25 20:32:12 +00:00
f5a71ec2d6 [Opt Overlap] Implement as_functional_optim and create_functional_optim (#71604)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71604

Implement 2 helper functions:
- as_functional_optim which takes in a torch.optim class type and arguments and
  creates the corresponding functional optimizer.
- create_functional_optim which takes in the functional optimizer class type
  and constructs it. Note that as_functional_optim calls into
  create_functional_optim.

  The first will be used in future PRs as described in
  https://github.com/pytorch/pytorch/issues/67570 to create a functional
  optimizer from a traditional optimizer. The latter is used in
  _OptimizerHookState to create a functional optimizer.

  Both new helper functions are covered by unittests.
ghstack-source-id: 147577170

Test Plan: CI

Reviewed By: cbalioglu

Differential Revision: D33688995

fbshipit-source-id: 8b2daafd1b914efa90877cc4313aa9a428546fc1
(cherry picked from commit 42fdae2991b93754501852802c292556c9addc6c)
2022-01-25 18:32:13 +00:00
541817628b [Easy] Add comment explaining DistributedOptimizer gating (#71603)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71603

Small comment to clarify this.
ghstack-source-id: 147577171

Test Plan: CI

Reviewed By: cbalioglu

Differential Revision: D33688994

fbshipit-source-id: 4c87e6ed48416a0aad695861893f183bee7c5252
(cherry picked from commit f8868629c160de01d16f675e69c9f3d4321c04cb)
2022-01-25 18:32:13 +00:00
281663955f [Opt Overlap] Create Optimizer Hook State directly from functional optim (#71602)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71602

The design in https://github.com/pytorch/pytorch/issues/67570 requires
`_OptimizerHookState` to be created directly from a functional optimizer. Add
support and tests for this. Also refactor a few tests.
ghstack-source-id: 147577175

Test Plan: CI

Reviewed By: cbalioglu

Differential Revision: D33687477

fbshipit-source-id: f3c789aa77773f918e01a8d0cf08739b2edf07b3
(cherry picked from commit 4851e1c6d4a200d6efcc8354c98936ab4044f761)
2022-01-25 18:32:13 +00:00
6848e0dae5 Fix RNN modules with inputs shapes containing-0 in CUDA (#71696)
Summary:
We found a discrepancy between CPU and CUDA when using RNN modules: input shapes containing 0s would cause an invalid configuration argument error on CUDA (kernel grid size is 0), while returning a valid tensor on CPU.

A reproducer:

```
import torch

x = torch.zeros((5, 0, 3)).cuda()
gru = torch.nn.GRU(input_size=3, hidden_size=4).to("cuda")
gru(x)
```

Run with `CUDA_LAUNCH_BLOCKING=1` set.

cc ngimel albanD

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71696

Reviewed By: mikaylagawarecki

Differential Revision: D33743674

Pulled By: ngimel

fbshipit-source-id: e9334175d10969fdf1f9c63985910d944bbd26e7
(cherry picked from commit 70838ba69bbfd1b39f6c208f9dbefaad3f11d47e)
2022-01-25 18:32:13 +00:00
211deb0364 Fix CI quick-checks (#71773)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71773

Test Plan: Imported from OSS

Reviewed By: pbelevich

Differential Revision: D33770042

Pulled By: anjali411

fbshipit-source-id: 9dd3f8c8592663d385ab0cd4376aaa4b9c7d9ec2
(cherry picked from commit 739c8885c78b3e39c0b5814f1bece0e3bbb6521b)
2022-01-25 18:32:13 +00:00
d32b7d9585 Logic to auto-categorize commits (#64929)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64929

Auto-categorized 63% of the commits for the PyTorch 1.10 release (2.2k out of 3.4k commits)

Test Plan: Imported from OSS

Reviewed By: malfet

Differential Revision: D33768760

Pulled By: anjali411

fbshipit-source-id: 0655090af83e923f8c26fa1ce9f190edc542b97e
(cherry picked from commit 2fe30f77b83cbcfcb8fc09f728c8853600e8f303)
2022-01-25 17:32:41 +00:00
0fdb90da5e [warning] Fix TORCH_INTERNAL_ASSERT calls missing condition to check 1/x (#71711)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71711

This will fix a ton of broken asserts that should always fire but never actually fire.
All would have been caught with `-Wstring-conversion` warnings enabled.

Test Plan: CI Pass

Differential Revision: D33743605

fbshipit-source-id: 062641f9d5d02c6e317c5a286fd01017cf77237f
(cherry picked from commit 639b42e04b78c35389a1e3a12ae46901d7808e53)
2022-01-25 15:45:21 +00:00
bb157dd4eb Make methods of internal file_obj visible from StreamWrapper (#71653)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71653

Test Plan: Imported from OSS

Reviewed By: NivekT

Differential Revision: D33718749

Pulled By: ejguan

fbshipit-source-id: f3a8244f22ca37049b8678afa0e329b23c957a9d
(cherry picked from commit a4d12ca48ec153ad5f058152e7df4a9a1421b184)
2022-01-25 15:34:24 +00:00
16a9ffba4b Allow specifying num_samples to RandomSampler even when replacement=False (#71568)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/38032 and #39214

Hi, I modified the RandomSampler to satisfy the requirement of https://github.com/pytorch/pytorch/issues/38032. I also added and deleted some test cases in test/test_dataloader.py to match the new requirement.
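The new behavior in a nutshell — with `replacement=False`, `num_samples` now controls how many (distinct) indices are drawn:

```python
from torch.utils.data import RandomSampler

data = list(range(10))
sampler = RandomSampler(data, replacement=False, num_samples=5)

idx = list(sampler)
assert len(idx) == 5
assert len(set(idx)) == 5           # no replacement: all indices distinct
assert all(0 <= i < len(data) for i in idx)
```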

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71568

Reviewed By: mikaylagawarecki

Differential Revision: D33741776

Pulled By: ejguan

fbshipit-source-id: 2d25f5096b7b36ad9fb6455107182f387cf8ee43
(cherry picked from commit 9c7e1891c220534f1939b57caedd9e9615c65464)
2022-01-25 15:34:24 +00:00
133c213415 updated the docs for BatchNorm1d and InstanceNorm1d (#71371)
Summary:
Fixes input shape notation inconsistencies mentioned here: https://github.com/pytorch/pytorch/issues/71366

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71371

Reviewed By: anjali411

Differential Revision: D33746814

Pulled By: jbschlosser

fbshipit-source-id: 21a080ea30192cd109e437f758afe54d57778724
(cherry picked from commit c1fecebd036dab6ed1f6af58263e9fda5f8d435a)
2022-01-25 15:34:24 +00:00
1df4eca6d7 [Operator Versioning][Test] Automate model generating process (#70629)
Summary:
This change automates the process of generating the old models used for testing upgraders. Developers will:
1. Add a module in `caffe2/test/jit/fixtures_srcs/fixtures_src.py`
2. Register the module in `caffe2/test/jit/fixtures_srcs/generate_models.py`
3. Run `python test/jit/fixtures_src/generate_models.py`

The model will be dumped to `test/jit/fixtures`.

This script also ensures:
1. The output model operator version is as expected
2. The output model will include the changed operator

Example log:
```
(base) chenlai@chenlai-mp pytorch % python3 /Users/chenlai/pytorch/test/jit/fixtures_src/generate_models.py
TestVersionedDivTensorExampleV4() aten::div.Tensor
INFO:__main__:Processing TestVersionedDivTensorExampleV4
INFO:__main__:Generating model test_versioned_div_tensor_example_v4 and it's save to /Users/chenlai/pytorch/test/jit/fixtures/test_versioned_div_tensor_example_v4.ptl
```
The second time to run
```
(base) chenlai@chenlai-mp pytorch % python3 /Users/chenlai/pytorch/test/jit/fixtures_src/generate_models.py
TestVersionedDivTensorExampleV4() aten::div.Tensor
INFO:__main__:Processing TestVersionedDivTensorExampleV4
INFO:__main__:Model test_versioned_div_tensor_example_v4 already exists, skipping
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70629

ghstack-source-id: 147585826

Test Plan:
OSS
```
python3 /Users/chenlai/pytorch/test/jit/fixtures_src/generate_models.py
```
Internal:
```
buck run mode/opt //caffe2/torch/fb/mobile/upgrader_codegen:upgrader_test_models_gen
```

Reviewed By: iseeyuan, tugsbayasgalan

Differential Revision: D33410841

fbshipit-source-id: 28e2b851a30f12a74e4ac8a539d76e83bbc4fb3a
(cherry picked from commit 6614f1bdf360b69bcf9eb4bca30707e5bd0e8a2b)
2022-01-25 09:33:23 +00:00
e33cd8f382 Drop unused variables (#71685)
Summary:
This fixes a number of unused variable warnings that appear when compiling with LLVM-12 on platform010. Fixes are made by removing the variable when possible or by using `/* */` comments to unname the variable when a shared interface is used or eliminating the variable entirely would require extensive changes or risk modifying a public API.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71685

Test Plan: Sandcastle

Reviewed By: luciang, meyering

Differential Revision: D33728264

fbshipit-source-id: 49286ad7f5271ca1cb48dc70039097305285c948
(cherry picked from commit a2306cddd67b5f2d83d7c2829aea7cb3d1ce767e)
2022-01-25 08:35:01 +00:00
7a0c97195f Add save_for_forward to custom function (#71569)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71569

Not sure if this is the right API

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33695395

Pulled By: soulitzer

fbshipit-source-id: 652b5758f15d901f98ff0da94e977030c7f3415b
(cherry picked from commit 9421a6846ad35cebbb84bd052769527505092a0c)
2022-01-25 07:30:46 +00:00
09aeadf4ab Fix custom function forward AD internal assert (#71531)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71531

Based on the comment above the original internal assert, this is the desired check.
1. Don't error, and automatically make jvp return a view for that tensor output (this is easier than I originally thought: https://github.com/pytorch/pytorch/pull/71531#discussion_r789211877)
2. Error (currently doing)

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33695399

Pulled By: soulitzer

fbshipit-source-id: dba49890a55ad1dd59ed5c41faa96bf7cfc9e562
(cherry picked from commit fdb0f266f51e939e122676ab378f4cacba4295aa)
2022-01-25 07:30:46 +00:00
1cc3291716 Fix custom function when non tensor argument precedes tensor argument (#71530)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71530

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33695397

Pulled By: soulitzer

fbshipit-source-id: 49ccd062f73ccf69c47aca2552fde182d582be2a
(cherry picked from commit 68d502a01332701f80588735bb174df715fb3bcf)
2022-01-25 07:30:46 +00:00
1295d2699f don't include Loops.cuh from Reduce.cuh (#71730)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71730

Reviewed By: mruberry

Differential Revision: D33754606

Pulled By: ngimel

fbshipit-source-id: ddebd147fdfdd66fa723ca7c4341c3d4648b5182
(cherry picked from commit a8754002dcc2a3e279eb76cd2d91b9440be5dbe3)
2022-01-25 06:30:19 +00:00
70f3078dd6 [Pytorch Edge] Wrap lowered module in to_backend (#71597)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71597

Problem: _jit_to_backend overrides get/set state. This means any attributes added to the module after lowering will not be preserved after serialization. For edge workflows the biggest problem here is it breaks bundled_inputs.

Solution?:

A quick and easy way to handle the issues with to_backend overriding get/set state: wrap the lowered module in another module with forwarding functions for the API specified in 'method_compile_spec'.

The tradeoff with this approach is that the actual workhorse of the module is now one layer deep, which might make debugging somewhat messier and more confusing. The other approach Martin, David, and I talked about would be to only lower the portions that require custom get/set state logic. This leaves the top level the same, and only specific backend internals are changed. Personally, I'm not sure how well that really addresses the debugging concern. It seems like if you cracked the model open you'd still run into a similar amount of confusion, with a lot of the referenced variables and logic coming from another module.

The other concern with this approach is whether or not 'compile_spec' specifies the public API of the module (since that's our source of truth for this wrapper). While it may not be enforced, it certainly seems to be true by convention, and the to_backend API already uses it as the source of truth for all functions that get generated in the resulting module. I say we formally commit to this (compile spec keys being functions) being the contract of the API instead of just assuming it to be the case and then having weird behavior if it's not.

Test Plan:
New Unit Test
CI to check for existing behavior and contracts.

manually tested in a notebook with bundled inputs.

{P475790313}

Reviewed By: raziel

Differential Revision: D33694257

fbshipit-source-id: 9ff27db421eba41bac083dff11a22e9e40a36970
(cherry picked from commit 91ef49977ef0bf18242df381a3ee805c24d6f68d)
2022-01-25 06:30:19 +00:00
b82c4a890d Fix aten's native's folder docs. (#71395)
Summary:
Fixes typos in `aten/src/ATen/native/README.md`. The following were the fixes:
- Update string type to `c10::string_view` instead of `std::string`.
- Update the link `torch/_python_dispatcher.py`, which was broken.

**Link to docs:** https://github.com/pytorch/pytorch/tree/master/aten/src/ATen/native/README.md

Thanks!

cc: mruberry kshitij12345

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71395

Reviewed By: mikaylagawarecki

Differential Revision: D33743229

Pulled By: mruberry

fbshipit-source-id: 9deebffede20bf68dfc8e45088c8ab2dffb7564c
(cherry picked from commit 8bedb2cb60aa62b189f6341cf2d92fe46e9f3f7a)
2022-01-25 06:30:19 +00:00
35e7ac3fa1 Fix bug in singleCheckErrors (#71706)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71706

This fixes a bug in singleCheckErrors introduced by #69437 (thanks
Lezcano for the catch). Checking existence of a substring in a larger
string is done with (name.find(text) != name.npos) but we omitted the
second half of the check.
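The same trap exists in Python, where `str.find` returns -1 (truthy!) on a miss and 0 (falsy) for a match at the start of the string, so the sentinel comparison must be spelled out, just as with `npos` in C++:

```python
log = "CUDA error: CUBLAS_STATUS_NOT_INITIALIZED"

# Buggy: the truthiness of find() is nearly the opposite of "substring present".
buggy_hit = bool(log.find("CUDA error"))          # match at index 0 -> False!
buggy_miss = bool(log.find("no such text"))       # -1 is truthy -> True!

# Correct: compare against the sentinel explicitly (C++: != std::string::npos).
correct_hit = log.find("CUDA error") != -1
assert not buggy_hit and buggy_miss and correct_hit
```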

Test Plan: - Code reading; I guess there are no tests for this in CI

Reviewed By: mikaylagawarecki

Differential Revision: D33742822

Pulled By: zou3519

fbshipit-source-id: a12017bb12b941282704bd2110e8632f02c24b04
(cherry picked from commit afb5a04a44232671961d554139e5e19ee711fcab)
2022-01-25 05:33:36 +00:00
9d47652bee Fix lint
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71735
2022-01-25 01:32:32 +00:00
09f7b42f5c Add @suo to the list of CI GHF approvers (#71737)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71737

Reviewed By: suo

Differential Revision: D33755929

Pulled By: malfet

fbshipit-source-id: 21bec237575f26e16305227e6b521aed23e6b94e
(cherry picked from commit b3d33c3f029c89290b175fb415d9e1e7a50949e8)
2022-01-25 01:30:02 +00:00
506d41d659 Improve disable name match (#71499)
Summary:
Allows disabling issues to disable all parametrized tests with dtypes.

Tested locally with:
1. .pytorch-disabled-tests.json as
```
{"test_bitwise_ops (__main__.TestBinaryUfuncs)": ["https://github.com/pytorch/pytorch/issues/99999", ["mac"]]}
```
and running `python test_binary_ufuncs.py --import-disabled-tests -k test_bitwise_ops` yields all tests skipped.

2. .pytorch-disabled-tests.json as
```
{"test_bitwise_ops_cpu_int16 (__main__.TestBinaryUfuncsCPU)": ["https://github.com/pytorch/pytorch/issues/99999", ["mac"]]}
```
and running `python test_binary_ufuncs.py --import-disabled-tests -k test_bitwise_ops` yields only `test_bitwise_ops_cpu_int16` skipped.

NOTE: this only works with dtype parametrization, not all prefixes, e.g., disabling `test_async_script` would NOT disable `test_async_script_capture`. This is the most intuitive behavior, I believe, but I can be convinced otherwise.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71499

Reviewed By: mruberry

Differential Revision: D33742723

Pulled By: janeyx99

fbshipit-source-id: 98a84f9e80402978fa8d22e0f018e6c6c4339a72
(cherry picked from commit 3f778919caebd3f5cae13963b4824088543e2311)
2022-01-25 01:30:02 +00:00
7bc220e060 Update distributed.rst for ProcessGroup Extensions (#71482)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71482

cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse SciPioneer H-Huang

Test Plan: Imported from OSS

Reviewed By: rohan-varma

Differential Revision: D33745986

Pulled By: mrshenli

fbshipit-source-id: fe2d0491901bf00be09deb5c556bc1e1d359b725
(cherry picked from commit be5104bfd7e15b027647d2b8bdcd7adbe5de241d)
2022-01-25 00:30:08 +00:00
cda6f40151 Disable XLA config
See https://github.com/pytorch/pytorch/issues/71732

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71733
2022-01-25 00:16:55 +00:00
8ba1ee6aa7 [tensorexpr][easy] add missing comma to test_jit_fuser_te.py (#71642)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71642

The missing comma was causing implicit string concatenation in a list of strings

Test Plan: Imported from OSS

Reviewed By: ZolotukhinM

Differential Revision: D33713185

Pulled By: davidberard98

fbshipit-source-id: a2458629d78202713a5bb2f8c720ff9b81939c31
(cherry picked from commit b077598f1d41948ebe05e2d644ba2dd37446b900)
2022-01-24 22:18:37 +00:00
f75e92a936 Fix for retracing documentation which would break for n-ary operators (#71599)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/68195

Updated fx.rst documentation and followed the instructions in [contributing.md](https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md#writing-documentation) to generate html. I hit errors that looked very similar to https://github.com/pytorch/pytorch/issues/32703, but gathered from that thread that a non-zero exit is OK for documentation building and that these are warnings not affecting the html generation (at least for the root rst folder). The HTML output is plain without any styling, please confirm this is intentional.

Screenshot of generated html:
<img width="1438" alt="Screen Shot 2022-01-20 at 4 31 24 PM" src="https://user-images.githubusercontent.com/9580531/150439448-1a626d74-68ba-4f94-91f2-a6942959b049.png">

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71599

Reviewed By: jamesr66a

Differential Revision: D33719546

Pulled By: zephirefaith

fbshipit-source-id: cc9b8ddb13cfdb9f14ebff54cf0d894a8b842aa1
(cherry picked from commit 170db5d7be005e90980c41449b6a9a4c23b3a91f)
2022-01-24 20:07:08 +00:00
edcd4a20ea Exit once there's an environment error (#71693)
Summary:
Narrow the scope of https://github.com/pytorch/pytorch/issues/69730.
Once there's an error, stop the script.
Since it's a random error, it most likely has something to do with the environment.

Let's see the stat.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71693

Reviewed By: seemethere, mikaylagawarecki

Differential Revision: D33742733

Pulled By: janeyx99

fbshipit-source-id: b453957c2cb450eb79b89614db426b50eef1d14f
(cherry picked from commit cd32fa53d994c4c4590cd7f4962671330eda28c1)
2022-01-24 18:45:56 +00:00
b372be4211 [nn] lstm : no batch dim support (#71056)
Summary:
Reference: https://github.com/pytorch/pytorch/issues/60585

TODO:
* [x] Update docs

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71056

Reviewed By: samdow

Differential Revision: D33638643

Pulled By: jbschlosser

fbshipit-source-id: c0949829de8a8e6e7b2873f459a8d7da597a3be3
(cherry picked from commit f94d5849f66dd7da2ae4037b7c1d3e72817e926f)
2022-01-24 15:13:40 +00:00
99d9883a22 dbr quant: make SeenOpInfo a dataclass (#71267)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71267

Refactors `SeenOpInfo` to be a dataclass, to be consistent with
`QTensorInfo`, so we can get real typing. Fixes the type errors. No logic change.
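
A minimal sketch of the refactor, with hypothetical field names rather than the real `SeenOpInfo` fields: a `collections.namedtuple` carries no field types, while a dataclass declares explicit types that mypy can check.

```python
from collections import namedtuple
from dataclasses import dataclass

# Before: namedtuple fields are untyped, so type checkers cannot verify
# call sites that construct or read them.
SeenOpInfoTuple = namedtuple("SeenOpInfoTuple", ["idx", "type"])

# After: a dataclass with explicit field types, same lightweight usage.
@dataclass
class SeenOpInfo:
    idx: int
    type: str

info = SeenOpInfo(idx=0, type="add")
print(info.idx, info.type)
```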

Test Plan:
```
python test/test_quantization.py -k DBR
```

Reviewed By: HDCharles

Differential Revision: D33567129

Pulled By: vkuzo

fbshipit-source-id: 55f89d7a497b6db1fd9956255d964663032a0401
(cherry picked from commit 7fdec92b9cc9ecbc8ca7224cfec5668543cd8cfc)
2022-01-24 14:24:48 +00:00
41afeea791 dbr quant: split observer insertion to a separate pass (#71253)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71253

Before this PR, observers were inserted at the same time as
we recorded ops seen while tracing with example input. This is not
ideal because for function fusion (not yet implemented),
we need to be able to look ahead from the current op to properly
insert observers.

This PR refactors observer insertion in DBR quantization to happen
in a separate pass after the ops are recorded.  There is no functionality
change in this diff, but this PR will make it easier to implement
function fusion in a future PR.

Note: the qconfig is still used during tracing to assign each
op an inference dtype. This is not ideal; in the future we may move this
step to happen as a separate pass as well. We keep it as is in this PR
because some more refactoring would be needed to allow this step to both
happen in a separate pass and survive module boundaries.

Test Plan:
```
python test/test_quantization.py -k DBR
```

Reviewed By: wenleix

Differential Revision: D33558280

Pulled By: vkuzo

fbshipit-source-id: 54e9cea6ad05317a8c7c92be005d33653617bed6
(cherry picked from commit 2985849916dbd194b6bf44cc3c360e9450da6828)
2022-01-24 14:24:48 +00:00
e12cc227a2 dbr quant: make QTensorInfo a dataclass and add orig_dtype (#71245)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71245

This is a refactor to make a future PR of making observer
insertion be a separate pass easier.
1. adds orig_dtype, so we always record what was seen while tracing
2. switches from namedtuple to dataclass, so we can have more explicit types

Test Plan: CI and mypy

Reviewed By: HDCharles

Differential Revision: D33558281

Pulled By: vkuzo

fbshipit-source-id: b9f87e25a3538fee145f020916a31699046a9c11
(cherry picked from commit 3c8db243605220e990e3c7280ed475d6e90c32fb)
2022-01-24 14:24:48 +00:00
8fe82b855e dbr quant: do not crash on unsupported qconfig_dict keys if they are empty (#71233)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71233

Some keys in qconfig_dict are not implemented yet for DBR quant.
However, FX graph mode quantization modifies qconfig_dict inplace,
so if users use the same dict for DBR and FX they may hit errors.

This PR reduces the chance of these errors by only throwing an
exception in DBR quant if the unsupported keys have nonempty values.
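
A sketch of the described guard (the key names here are hypothetical, not the real qconfig_dict keys): unsupported keys only raise when their values are non-empty, so a dict that FX graph mode quantization emptied in place still passes.

```python
# Only raise for unsupported qconfig_dict keys when their values are
# non-empty; falsy values (missing key, empty list/dict, None) pass.
UNSUPPORTED_KEYS = ("module_name_regex", "module_name_object_type_order")

def check_qconfig_dict(qconfig_dict):
    for key in UNSUPPORTED_KEYS:
        if qconfig_dict.get(key):
            raise AssertionError(f"qconfig_dict key {key!r} is not yet supported")

check_qconfig_dict({"module_name_regex": []})  # ok: key present but empty
```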

Test Plan:
```
python test/test_quantization.py TestQuantizeDBR.test_qconfig_dict_unsupported_only_crashes_when_nonempty
```

Reviewed By: samdow

Differential Revision: D33552398

Pulled By: vkuzo

fbshipit-source-id: 4191ad7ae23929455fef6acaf2c045c65db0b0bd
(cherry picked from commit 8b1911f33e1298225055aff375c0479760767468)
2022-01-24 14:24:48 +00:00
c3570fd945 fx quant: preserve node stack trace throughout prepare and convert (#70757)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70757

This is an initial PR on a way to preserve stack traces throughout FX
graph mode quantization.  It preserves the stack traces for ops
for all of the quantize handlers. A future PR will add stack traces
for dtype transitions.
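
The core idea can be sketched with simplified stand-in classes (not the real `torch.fx` API): when a rewrite pass replaces a node, copy the `stack_trace` field forward so the provenance of the original op survives prepare and convert.

```python
# Stand-in classes illustrating stack trace preservation across a rewrite;
# the real pass operates on torch.fx graph nodes.
class Node:
    def __init__(self, name, stack_trace=None):
        self.name = name
        self.stack_trace = stack_trace

def replace_with_quantized(old_node: "Node") -> "Node":
    new_node = Node(name=old_node.name + "_quantized")
    new_node.stack_trace = old_node.stack_trace  # preserve provenance
    return new_node

conv = Node("conv1", stack_trace='File "model.py", line 12, in forward')
print(replace_with_quantized(conv).stack_trace)
```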

Test Plan:
```
python test/test_quantization.py
TestQuantizeFx.test_stack_trace_preserved
```

Note: the above only tests a single case. In a future PR, once we
expand coverage, we can expand the utility functions to check for stack
traces on all tests.

```
python test/test_quantization.py
TestQuantizeFx.test_stack_trace_preserved
```

Imported from OSS

Differential Revision: D33432485

Reviewed By: jerryzh168

Pulled By: vkuzo

fbshipit-source-id: 56c56850393132487430a850fa1def826a9c39c0
(cherry picked from commit c11155b31eb9d228380501860f522a8c89eb2040)
2022-01-24 14:15:43 +00:00
e0d829a266 Kill the test_torch.py mixin and creates test_scatter_gather_ops (#71691)
Summary:
Per title.

Also annotates test_torch.py with additional cleanup tasks and adds empty sample inputs to elementwise unary and binary OpInfos.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71691

Reviewed By: ngimel

Differential Revision: D33735126

Pulled By: mruberry

fbshipit-source-id: 8cc097a7581a8b620540c95b2a5889c1165ecf23
(cherry picked from commit 5c6a245a3f9ba7c064fc77c8cd4045f903e73cfd)
2022-01-24 09:32:32 +00:00
3a03af2f50 [AutoAccept][Codemod][FBSourceClangFormatLinter] Daily arc lint --take CLANGFORMAT
Reviewed By: zertosh

Differential Revision: D33730646

fbshipit-source-id: 3af18fc393aecce8f03c9e9689deefcafa3a978e
(cherry picked from commit a578b8b07c5799730b6e3d12a7bf32f5a62c95b2)
2022-01-23 03:30:36 +00:00
9b3a56eecf [Optimizer Overlap] Move hooks to own file (#71601)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71601

Moves current prototype optimizer overlap to its own file for a better
namespace. No code changes besides a few comment fixes. Note that this code is
still prototype and not expected to be used by an end user.
ghstack-source-id: 147458826

Test Plan: CI

Reviewed By: cbalioglu

Differential Revision: D33662678

fbshipit-source-id: 3cc931323230a4b66c02b9e6f744aaf5c48d4d34
(cherry picked from commit 5070595c7f6de85f75249eb22cbd561f9450fcc2)
2022-01-23 00:04:32 +00:00
ba08440e88 [Opt Overlap] Remove redundant tests (#71600)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71600

These tests in test_c10d_nccl test a subset of functionality that's
already covered by distributed_test.py, no need for these additional tests.
ghstack-source-id: 147458823

Test Plan: CI

Reviewed By: cbalioglu

Differential Revision: D33662679

fbshipit-source-id: 2d1c1223fdd72a851c537b4793a71d65190d2553
(cherry picked from commit 14565ac5a6e059ec06af8583fcefa80626c95990)
2022-01-23 00:04:32 +00:00
3e55fa6385 Fix unused variable warning in FractionalMaxPool3d (#71591)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71591

Test Plan: Sandcastle

Reviewed By: ngimel

Differential Revision: D33692623

fbshipit-source-id: f96446751f41555a6b5e5289f94efb98c87c66d0
(cherry picked from commit a5c272793e55c4a41af553727a97f490c8e152fa)
2022-01-22 22:30:43 +00:00
27308642a0 Fix unused variable warning in layer_norm_kernel.cu (#71587)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71587

Test Plan: Sandcastle

Reviewed By: meyering

Differential Revision: D33692615

fbshipit-source-id: 84d249eb967fb97191aaf59c9a96cac9f8cae498
(cherry picked from commit ca36d04c3166dbeac73ef88ecc215c0b426169dc)
2022-01-22 20:58:04 +00:00
6ed46b08ab Fix unused variable warning in LossCTC.cu (#71588)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71588

Test Plan: Sandcastle

Reviewed By: malfet

Differential Revision: D33692622

fbshipit-source-id: b8ee83b775bc5b87f8ddc0936da2df24ab42420d
(cherry picked from commit b88934c5da4997fdef4f7f66b89fa8bee51c1525)
2022-01-22 20:58:04 +00:00
6edb06daa6 Fix unused variable warning in DilatedMaxPool3d.cu (#71590)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71590

Test Plan: Sandcastle

Reviewed By: malfet

Differential Revision: D33692617

fbshipit-source-id: e9663458c8598822d3b022a0623a66a0e8d1fa5f
(cherry picked from commit 692c94ae85ec99ee3494c1df53362231085a3398)
2022-01-22 20:52:34 +00:00
b9a7dd79f9 Fix unused variable warning in DepthwiseConv2d.cu (#71584)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71584

Test Plan: Sandcastle

Reviewed By: meyering

Differential Revision: D33692624

fbshipit-source-id: 6edba51a62e1754d9ef8ead4178688ada4bd6ffc
(cherry picked from commit 3f52e885fb1d620b3d5b200325e2bd6ea5f69d66)
2022-01-22 20:04:29 +00:00
f269f990f2 [jiterator] polygamma (#71162)
Summary:
Reference: https://github.com/pytorch/pytorch/issues/69463

TODO:
* [x] Add note regarding how to capture value and it's limitations.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71162

Reviewed By: ngimel

Differential Revision: D33697346

Pulled By: mruberry

fbshipit-source-id: 0308a6c12cf4b488ab26bdae14291c1f6dbba9ab
(cherry picked from commit 0d3f8c52a17b63079c0d09c965d76cc27cd69146)
2022-01-22 06:49:08 +00:00
c7c864bbd1 Fix unused variable warning in AveragePool2d (#71585)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71585

Test Plan: Sandcastle

Reviewed By: malfet

Differential Revision: D33692619

fbshipit-source-id: 6f851a0925e7b91e107aa35353986a1fcaf630c4
(cherry picked from commit 53ce218424b84d9cf5aa25908afbacaf1f5ac757)
2022-01-22 04:59:33 +00:00
8d5d875ac7 Fix unused variable warning in ConvolutionMM2d.cu (#71593)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71593

Test Plan: Sandcastle

Reviewed By: malfet

Differential Revision: D33692621

fbshipit-source-id: 64d079e10cdd9c2c71a42540fbf396f0f66efe11
(cherry picked from commit 6959ddd920ceae1d751665d7f48777a779fcbf23)
2022-01-22 02:56:28 +00:00
70532e32d9 Fix unused variable warning in EmbeddingBag.cu (#71589)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71589

Test Plan: Sandcastle

Reviewed By: malfet

Differential Revision: D33692625

fbshipit-source-id: ad58e704de0202fb4f9e25c9c932f0cd5c832198
(cherry picked from commit ae77c2c8097f99f67fcec99c3c123b28139b3c3f)
2022-01-22 02:56:28 +00:00
1794fdd154 Fix unused variable warning in MultiLabelMarginCriterion.cu (#71594)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71594

Test Plan: Sandcastle

Reviewed By: malfet

Differential Revision: D33692620

fbshipit-source-id: ae1ac664f1e7bce2ce054839b6d5eed4fdcd454f
(cherry picked from commit 4377a8b053a60d252cf49b5b92593f80ebb3af31)
2022-01-22 02:52:06 +00:00
4f498f11cd Fix unused variable warning in DistanceKernel.cu (#71586)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71586

Test Plan: Sandcastle

Reviewed By: malfet

Differential Revision: D33692618

fbshipit-source-id: 6265efc9b7106737ac8e70e36e611c8997572d5c
(cherry picked from commit 5b2699a0c72723314897d8ed04a63016b48daa90)
2022-01-22 02:52:06 +00:00
9950f3b7e6 [BE][GHA] Further refactor checkout_pytorch
Rename `checkout_pytorch` to `checkout` and give the `submodules` argument
a default value, which allows one to replace 10+
`common.checkout_pytorch("recursive")` calls with `common.checkout()`

And also use the same macro to checkout builder in binary builds
workflow

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71664
2022-01-22 01:33:02 +00:00
dcc1e1cd87 [BE] Use !{{ common.checkout_pytorch("recursive") }} in binary builds workflows
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71663
2022-01-22 01:33:02 +00:00
c9bd1c60ed Move upgraders from python to cpp (#70593)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70593

Test Plan: Imported from OSS

Reviewed By: gmagogsfm

Differential Revision: D33402543

Pulled By: tugsbayasgalan

fbshipit-source-id: 713c54fbbb2bc4c96d5e3b6084f3090a8923a12d
(cherry picked from commit e72b375264395ac1264c07e9c69dd0dd7adfebb8)
2022-01-22 00:24:24 +00:00
47cf0dbf8b Prefer at::detail::empty_cuda to the native function (#70618)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70618

`at::native::empty_cuda` is called directly in some places to avoid
the extra dispatch, however it lacks features like device guards and a
`TensorOptions` overload.

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D33623676

Pulled By: ngimel

fbshipit-source-id: 3ac56c4f8acc90281323195a34fc0a1ef8148fbe
(cherry picked from commit 4aaf8b29d0de927ec9ced9f8749a96b2be9c4a89)
2022-01-22 00:20:40 +00:00
86aefdc082 Revert D33694867: Fix persistent worker exits before pin_memory thread
Test Plan: revert-hammer

Differential Revision:
D33694867 (e2191e7084)

Original commit changeset: 0847f4d424a0

Original Phabricator Diff: D33694867 (e2191e7084)

fbshipit-source-id: 5f28616700d8647cbe468a9e300724a7f0c6cc15
(cherry picked from commit 3d8125ba6d669e19efebd9c76b0bd19499a1583e)
2022-01-22 00:09:28 +00:00
ce3215db70 Fix nnq.dropout in vision mobilenetv3 pretrain model (#71438)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71438

Fix issue https://github.com/pytorch/vision/issues/5198
skip observer for nn.dropout to load pretrain model

Test Plan:
python -c "import torchvision; torchvision.models.quantization.mobilenet_v3_large(pretrained=True, quantize=True)"

Imported from OSS

Reviewed By: HDCharles

Differential Revision: D33641707

fbshipit-source-id: 14ea26557c4ff3b942cf46bf06610db0b8f06b05
(cherry picked from commit 0b8b178d261431e604165f2419e95a32c7ecc6b2)
2022-01-22 00:02:48 +00:00
91b43b7820 Add clean workspace step to clang-tidy workflow (#71655)
Summary:
Should alleviate instances of "blah not a repository" that happen due to non-ephemeral runners not cleaning up properly.

https://github.com/pytorch/pytorch/runs/4902228068?check_suite_focus=true

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71655

Reviewed By: malfet

Differential Revision: D33719058

Pulled By: janeyx99

fbshipit-source-id: 4ff35893d00c99026154d71e4d1ae7a54ac5c42a
(cherry picked from commit 13ca9d1f91b9101a6350d3caf45fbc158e7ae47a)
2022-01-21 23:19:11 +00:00
ae285d837e [1/n][caffe2] Add session based margin loss function in caffe2 operator
Summary: Add session based margin loss into a caffe2 operator. This is the first diff to make these two losses available to dper3

Test Plan:
unit tests succeed with gradient check for both new loss functions
buck test //caffe2/caffe2/python/operator_test:softmax_l2r_operator_test
buck test //caffe2/caffe2/python/operator_test:margin_loss_l2r_operator_test

E2E test in bento notebook with model training in N1488923
margin loss model: f318207967 f318207399

Notice that the E2E test is run with dper change in D33532976 to change a full model

Reviewed By: devashisht

Differential Revision: D32902460

fbshipit-source-id: 8f21b9109f500583431156908b632e503ed90dbd
(cherry picked from commit 1592111aa4ed6cfdd7ca5f54de70396e9610757c)
2022-01-21 23:13:36 +00:00
26d54b4076 monitor: add docstrings to pybind interface (#71481)
Summary:
This adds argument names and docstrings so the docs are a lot more understandable.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71481

Test Plan:
docs/tests CI should suffice

![Screenshot 2022-01-19 at 16-35-10 torch monitor — PyTorch master documentation](https://user-images.githubusercontent.com/909104/150240882-e69cfa17-e2be-4569-8ced-71979a89b369.png)

Reviewed By: edward-io

Differential Revision: D33661255

Pulled By: d4l3k

fbshipit-source-id: 686835dfe331b92a51f4409ec37f8ee6211e49d3
(cherry picked from commit 0a6accda1bec839bbc9387d80caa51194e81d828)
2022-01-21 23:04:33 +00:00
84fe4279db Structured Kernels: Use at::detail::empty functions (#70617)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70617

This reduces the divergence between the code generated for
`create_out` across different devices, and means the `TensorOptions` don't
need to be unpacked.

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D33623680

Pulled By: ngimel

fbshipit-source-id: 54f36774a8530be99c26a54270d4d95f3e38d684
(cherry picked from commit b22ba92e27e638f96a290835b71ad162411fa535)
2022-01-21 22:57:27 +00:00
0a2cdd18f3 nice error msg from load_state_dict for non-tensor value (#70596)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/67549

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70596

Reviewed By: anjali411

Differential Revision: D33710750

Pulled By: jbschlosser

fbshipit-source-id: 870b5fafffcd005fd4fcd62f865542739c133805
(cherry picked from commit da374fbc58a61774632c6517d68ad56ecb82019e)
2022-01-21 22:02:13 +00:00
71a41323bb BackendSelect: Use at::_ops API and per-operator headers (#69840)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69840

Test Plan: Imported from OSS

Reviewed By: jbschlosser

Differential Revision: D33160027

Pulled By: albanD

fbshipit-source-id: 0e492ec8bab73da90afd9df70f48c17a8206a768
(cherry picked from commit 133ec77e9f970fa042ce6fd3fd85c888334f8086)
2022-01-21 21:44:24 +00:00
b09d6224e2 Register{Schema,BackendSelect}.cpp: cleanup header includes (#70021)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70021

`RegisterSchema.cpp` only uses strings to register operator schemas,
so doesn't need to include any operator headers at all (except
indirectly through `torch/library.h`).

`RegisterBackendSelect.cpp` only needs the dispatcher API.

Test Plan: Imported from OSS

Reviewed By: jbschlosser

Differential Revision: D33160028

Pulled By: albanD

fbshipit-source-id: 68fb5cb8775077b6f174428a1fcced2a7836b714
(cherry picked from commit 35774ad7ac6ebbb6d17552ca9eb76fd3c06dcf43)
2022-01-21 21:44:24 +00:00
7dd6ead0ac Update actions/stale to latest version
as title


Pull Request resolved: https://github.com/pytorch/pytorch/pull/71628
2022-01-21 21:30:19 +00:00
d8abe813bc [LocalSGD] Move feature to Beta, clean up some docs (#71621)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71621

Moves this feature to beta as discussed, and cleans up some docs.
Synced offline with wayi1 who mentioned that the current names are preferred
as he works to prototype hierarchical allreduce as discussed in this RFC: https://github.com/pytorch/pytorch/issues/71325.
ghstack-source-id: 147382940

Test Plan: CI

Reviewed By: zhaojuanmao

Differential Revision: D33700444

fbshipit-source-id: 8eb543f5b02a119d0790a5c0919e6def6383a067
(cherry picked from commit 656e9809b2429d1924e008164a1f4ca770700a9a)
2022-01-21 21:10:42 +00:00
29a7cb41d8 [BE] Fix FSDP flaky test (#71525)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71525

Closes https://github.com/pytorch/pytorch/issues/71496. Use file init
for test as opposed to TCP init which runs into some port racing conditions as
seen in the failures for that issue.
ghstack-source-id: 147300691

Test Plan: CI

Reviewed By: zhaojuanmao

Differential Revision: D33676165

fbshipit-source-id: fcf83f7c7541d3521d3e38481195b0c7cb081691
(cherry picked from commit ea091c4af7d864e4d2ebcda6f72d04e17ae7bd82)
2022-01-21 21:00:13 +00:00
4e031419aa Skip broken svd tests (#71646)
Summary:
Workaround for https://github.com/pytorch/pytorch/issues/71645 breaking CI

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71646

Reviewed By: jbschlosser

Differential Revision: D33717104

Pulled By: albanD

fbshipit-source-id: f12d3895ecadd7000faa203e3d070dc0ee81e2f7
(cherry picked from commit 2b7c234d8409fa46037dff669ee7f50403c9b973)
2022-01-21 20:47:21 +00:00
e2191e7084 Fix persistent worker exits before pin_memory thread (#71579)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71579

Fixes #1551

As noted in the code comment, this registers a function to terminate persistent workers, using `atexit` to make sure their termination always happens last (after the pin_memory thread exits).
We need such a mechanism because the Python interpreter may clean up worker processes before the DataLoader iterator in some rare cases.

Test Plan: Imported from OSS

Reviewed By: VitalyFedyunin

Differential Revision: D33694867

Pulled By: ejguan

fbshipit-source-id: 0847f4d424a0cd6b3c0be8235d505415970254e8
(cherry picked from commit 18ad4621af5b5ff3c66b56051a00f6bfd3bf7a51)
2022-01-21 20:31:16 +00:00
8b3f58d311 Labels more elementwise binary operators correctly as BinaryUfuncInfos (#71622)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/66322.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71622

Reviewed By: ngimel

Differential Revision: D33702106

Pulled By: mruberry

fbshipit-source-id: bd0b7b9172cb161daebc5852b9546e734be8ac17
(cherry picked from commit 02f2ff1646414c0978135b4d69e3d0d82b0c9ac1)
2022-01-21 19:42:38 +00:00
b40dbdc49f Fix test ownership lint (#71554)
Summary:
I noticed after creating https://github.com/pytorch/pytorch/issues/71553 that the test ownership lint was not working properly.

This fixes my egregious mistake and fixes the broken lints.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71554

Reviewed By: malfet

Differential Revision: D33690732

Pulled By: janeyx99

fbshipit-source-id: ba4dfbcd98038e4afd63e326832ae40935d2501e
(cherry picked from commit 1bbc3d343ac143f10b3d4052496812fccfd9e853)
2022-01-21 18:24:42 +00:00
3a77fb244b [PyTorch][Static Runtime] Delete cleanup_activations option (#71501)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71501

This option disabled the memory planner. Supporting it would require us to add multiple versions of ops that borrow their inputs (because they rely on the memory planner to support that), and I'm not aware of a particular need to continue supporting it.
ghstack-source-id: 147385569

Test Plan: CI, rerun broken test from task

Reviewed By: mikeiovine

Differential Revision: D33669290

fbshipit-source-id: ecb01995891aecb5f4d0da2d9c51eed1f8fe489a
(cherry picked from commit 5e4fefb109b6c92d59fc7e24d69f1b6b2780c776)
2022-01-21 18:15:43 +00:00
8d880b06a1 stochastic_depth support (#71536)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71536

as titled

Test Plan: buck test mode/dev-nosan caffe2/test:test_fx_acc_tracer -- test_stochastic_depth

Reviewed By: yinghai

Differential Revision: D33668640

fbshipit-source-id: 3a8e6fc04b5529373d9dc77fef4514e9d01bf088
(cherry picked from commit e346d1d7a306f64334146a1a5c107c3db3ce8cd8)
2022-01-21 17:44:19 +00:00
9ada2b0768 add dumb retry to installing miniconda (#71558)
Summary:
Potentially can alleviate some issues related to https://github.com/pytorch/pytorch/issues/69730

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71558

Reviewed By: anjali411

Differential Revision: D33686601

Pulled By: janeyx99

fbshipit-source-id: fbf3aa77c2b97237f0080d1d2ae13c997d99b6c1
(cherry picked from commit 94ecd3840104bbb65751a106a642426f412ff4c7)
2022-01-21 17:06:45 +00:00
1a917e637c Bump dlpack.h to latest version (#65047)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/64995

Pull Request resolved: https://github.com/pytorch/pytorch/pull/65047

Reviewed By: VitalyFedyunin

Differential Revision: D32468916

Pulled By: mruberry

fbshipit-source-id: 3e0a17a3a264a77956ea7b795bd472c6fc79566c
(cherry picked from commit bd480b9892b9fa8a3a46fd0d7babeaf5d649a8b6)
2022-01-21 16:55:14 +00:00
13ea2cb330 [DataPipe] Make GroupBy serializable with lambda function (#71497)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71497

Related to https://github.com/pytorch/data/issues/172

cc VitalyFedyunin ejguan NivekT

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D33668749

Pulled By: NivekT

fbshipit-source-id: 6506614e9d4389dc645d8985c00fdb3402122d9b
(cherry picked from commit 458e76fcb1a60691a225f3f5e4a058a51490732d)
2022-01-21 16:04:45 +00:00
36b4c95e74 [DataPipe] adding serialization test for all core IterDataPipes (#71456)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71456

Related to https://github.com/pytorch/data/issues/172

cc VitalyFedyunin ejguan NivekT

Test Plan: Imported from OSS

Reviewed By: zou3519

Differential Revision: D33668748

Pulled By: NivekT

fbshipit-source-id: ea2085d5ed47533ca49258cc52471373c6ae1847
(cherry picked from commit d5f6fde1d08bf77789176930cf4dc7faa7a6b5a3)
2022-01-21 16:04:45 +00:00
40d1f77384 Codegen: python_torch_functions only include relevant operators (#68693)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68693

Generation of python bindings for native functions is split over 8
different files. One for each namespace, with the torch namespace
split into 3 shards, and methods in their own file as well. This
change ensures that editing any single (non-method) operator only
causes one of these files to be rebuilt.

Test Plan: Imported from OSS

Reviewed By: jbschlosser

Differential Revision: D32596270

Pulled By: albanD

fbshipit-source-id: 0570ec69e7476b8f1bc21138ba18fe8f95ebbe3f
(cherry picked from commit ba0fc71a3a6835e49b332a8be52bf798fa2726b3)
2022-01-21 15:37:06 +00:00
7680a0ae9d Deprecates _aminmax (#71576)
Summary:
Replaces https://github.com/pytorch/pytorch/pull/62432. Existing callsites are updated.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71576

Reviewed By: ngimel

Differential Revision: D33689960

Pulled By: mruberry

fbshipit-source-id: fad1ba78347ecec7fd48f21862c3eb606662b8f4
(cherry picked from commit 6cd438e9a156d9fc20e34f195721fd1a9374314b)
2022-01-21 09:23:29 +00:00
401e755354 Fix hsplit vsplit dsplit crash when section is 0 (#69342)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/69270 #68970

The crash was the dim size being taken modulo zero when split_size is 0.
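
A pure-Python sketch of the added guard (the helper name is hypothetical): reject `sections == 0` up front instead of computing `dim_size % sections`, which previously crashed.

```python
# Validate the section count before any modulo arithmetic.
def check_split_sections(dim_size: int, sections: int) -> None:
    if sections <= 0:
        raise ValueError(f"number of sections must be larger than 0, got {sections}")
    if dim_size % sections != 0:
        raise ValueError(f"dim size {dim_size} must be divisible by {sections} sections")

check_split_sections(12, 3)  # ok
try:
    check_split_sections(12, 0)  # previously a modulo-by-zero crash
except ValueError as e:
    print(e)
```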

Pull Request resolved: https://github.com/pytorch/pytorch/pull/69342

Reviewed By: ngimel

Differential Revision: D33692683

Pulled By: mruberry

fbshipit-source-id: aab82e5617c23c265b7dd3a8bac2dd245aaef5b4
(cherry picked from commit bcbf3b4c4d587394bc7586f32310f637fc9e3de7)
2022-01-21 09:16:22 +00:00
dea61e7e6c [Docs] Fixed missing format common args (#70439)
Summary:
Description:
- Fixing missing format common args: https://pytorch.org/docs/master/generated/torch.select.html#torch.select

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70439

Reviewed By: ngimel

Differential Revision: D33699723

Pulled By: mruberry

fbshipit-source-id: 5e5d79021a5ce2dcafe2731eee08044611549f3a
(cherry picked from commit d1d16c6569b3fc2d0bd513b312baaacc36fe5a2e)
2022-01-21 08:49:10 +00:00
c5fe70021c Fix version strings in CI (#71564)
In CI PRs are being tagged like `ciflow/cpu/$PR_NUMBER` which is
causing version strings to be set as non-numbers. This breaks the
caffe2 build because it uses CAFFE2_VERSION_MAJOR etc. as numbers.
2022-01-20 21:06:54 -08:00
3a963d5621 [fx2trt][torchbench] enable shufflenet lowering (#71562)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71562

Previously we had some unsupported ops and the perf improvement was not promising (10% at batch size 32)
```
Unsupported node types in the model:
acc_ops.reshape: ((), {'input': torch.float16})
mean: ((torch.float16,), {})
```

After the diff stack, we don't have any unsupported nodes.

Also moved `lower_to_trt` to lower.py.

Test Plan: buck run mode/dev-nosan -c python.package_style=inplace scripts/dsy842974287/cu_model:vision

Reviewed By: wushirong

Differential Revision: D33483843

fbshipit-source-id: 4a54e25af3e5a6e4a299737994b60b999f529aa6
(cherry picked from commit add0077c27e7155fff7aaab96c506a872a00b83c)
2022-01-21 03:36:24 +00:00
26c123efbd empty_cuda: Add functions that don't depend on Tensor (#70616)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70616

This adds `at::detail::empty_cuda` and
`at::detail::empty_strided_cuda` to complement the cpu and meta APIs.

These functions also include the `lazyInitCUDA` and `DeviceGuard` that
are missing from the `at::native::empty_cuda` interface, and so are
safer to use.

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D33623677

Pulled By: ngimel

fbshipit-source-id: 1c38e84881083df8e025250388f0c8f392974b92
(cherry picked from commit 4bc48c7008acf2394db7d02dee69dd7a8cfb87b8)
2022-01-21 03:11:38 +00:00
abe361754e [fix] isin : non-contiguous input on cpu (#70659)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/67432

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70659

Reviewed By: anjali411

Differential Revision: D33532405

Pulled By: mruberry

fbshipit-source-id: fc3cbe3893e4c87b6a11f1bbaad66f49d6eda215
(cherry picked from commit 65aef8031c30f45f10b66d8c904845033e67bd50)
2022-01-21 02:56:53 +00:00
7ee0712642 Fix torch.{unique, unique_consecutive} out of bound (#71540)
Summary:
This PR ensures that the input iterator is always ahead of the output
iterator. Thus, we won't have an out-of-bounds issue, since the input
iterator will reach the end before the output iterator does.
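
The invariant can be illustrated with a pure-Python in-place `unique_consecutive` sketch (a simplification of the CUDA kernel's logic): the read index always stays at or ahead of the write index, so writes never clobber unread input and the read index reaches the end first.

```python
# In-place deduplication of consecutive equal values; read >= write holds
# throughout the loop, mirroring the input-ahead-of-output invariant.
def unique_consecutive(values):
    if not values:
        return values
    write = 0
    for read in range(1, len(values)):
        if values[read] != values[write]:
            write += 1
            values[write] = values[read]
    del values[write + 1:]
    return values

print(unique_consecutive([1, 1, 2, 2, 2, 3]))  # [1, 2, 3]
```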

Fixes https://github.com/pytorch/pytorch/issues/71089

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71540

Reviewed By: mruberry

Differential Revision: D33688123

Pulled By: ngimel

fbshipit-source-id: f57718931d09a0fbea76ac1bd6cc8c7150af0978
(cherry picked from commit dc6e0e219a9e9b9ccea9ff5406458b56f556b2e4)
2022-01-21 02:36:49 +00:00
9f0227a0eb Revert "[ONNX] Minor doc update (#69501)" (#71615)
This reverts commit 114c13d020af9f6069610196177ee7e69d87fa8a.
2022-01-20 17:35:04 -08:00
76fd3cfd38 fix python version error (#71021)
Summary:
If python3 is the one running the tests but there also exists a `python` binary installed as python2.7, the test will fail with a syntax error.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71021

Reviewed By: zou3519

Differential Revision: D33667073

Pulled By: albanD

fbshipit-source-id: 8e489b491439be1740fc32ca5c7cdceb2145771e
(cherry picked from commit 5adfece429fcfe6ace778dd67f060d04a3d54699)
2022-01-21 01:05:53 +00:00
e2dc2aca93 Export ONNX models with readable input/output names (#68976)
Summary:
For some ONNX exported models, the input/output names sometimes have a numeric value, and this makes it pretty hard to inspect the generated graphs in the case of large models.

The solution in this PR was initially submitted to our internal utilities library by take-cheeze https://github.com/pfnet/pytorch-pfn-extras/pull/102

Now we would like to upstream this change by adding an extra kwarg when exporting the model to allow replacing these numeric names with actual debuggable ones.

As an example, the following code shows that the module output is `3`

```python
g, p, o = _model_to_graph(module, torch.ones(1, 10))
for n in g.nodes():
    for v in n.outputs():
        print(v.debugName())
```
output
```
3
```

With this PR

```
v3_Gemm
```

This allows identifying this output as a value from the associated Gemm layer.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/68976

Reviewed By: jansel

Differential Revision: D33662246

Pulled By: msaroufim

fbshipit-source-id: 45f56eef2a84d9a318db20c6a6de6c2743b9cd99
(cherry picked from commit 513c1d28f1708ccf8224caa92165de702cf43fc3)
2022-01-21 00:34:56 +00:00
64a3827d4e [Quant] remove inplace hardtanh in test (#71519)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71519

Remove inplace hardtanh in fx quantized op test case

Test Plan:
python3 test/test_quantization.py TestQuantizeFxOps.test_clamp

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D33675227

fbshipit-source-id: a496150ca4b485f953f68e24ddf9beb8ed1d94c0
(cherry picked from commit f65a888900aeef812bb3e6d8a231395c95914db9)
2022-01-21 00:30:41 +00:00
114c13d020 [ONNX] Minor doc update (#69501)
Fix the wiki URL.

Also minor reorganization in onnx.rst.

[ONNX] restore documentation of public functions (#69623)

The build-docs check requires all public functions to be documented.
These should really not be public, but we'll fix that later.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71609
2022-01-21 00:13:40 +00:00
9adee84a3f .github: Improve syncbranch debugability (#71596)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71596

Adds a dry_run option to test out pushes, as well as a debug flag that
lets you see which git commands are running

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

Test Plan: Imported from OSS

Reviewed By: malfet, bigfootjon

Differential Revision: D33695224

Pulled By: seemethere

fbshipit-source-id: 03bf6a3f2d9594089e134d95c3d35a6779ba7a26
(cherry picked from commit a75a402f9d02d5e4c709e25ca385264f854945d1)
2022-01-20 23:53:02 +00:00
64d221ffbf Add onnx.rst to the list of mergeable files
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71559
2022-01-20 23:50:47 +00:00
c92ff47afd Use == operator to test type equivalance in pytorch_jni_common.cpp (#71508)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71508

"==" is the more universal way to test type equality, and ::get() no longer incurs any refcount overhead, so we can switch to == instead of relying on type kinds.
ghstack-source-id: 147353057

Test Plan:
CI
buck test xplat/caffe2/android:pytorch_jni_common_test

Differential Revision: D33672433

fbshipit-source-id: 5973fd40de48b8017b5c3ebaa55bcf5b4b391aa3
(cherry picked from commit db84a4b566d1f2f17cda8785e11bc11739e6f50c)
2022-01-20 23:46:51 +00:00
0df607ce00 Separate title and body of commit by 2 lines (#71598)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71598

Reviewed By: seemethere

Differential Revision: D33696070

Pulled By: malfet

fbshipit-source-id: 8508c548279658f7d751b4c064b0d0c5053b7660
(cherry picked from commit 5c679f2bea5c3b4df7292cd40ebe2a804efbf854)
2022-01-20 23:42:42 +00:00
dc0a8a6587 Improve storage assertion of Tensor's enforce_invariants (#70380)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70380

A small change in `Tensor`'s `enforce_invariants` that addresses tensor types that don't use the regular storage mechanism.
ghstack-source-id: 147328303

Test Plan: Existing unit tests.

Reviewed By: zou3519

Differential Revision: D33304602

fbshipit-source-id: c8cc41ed38a3eec147f40fe1029fd059748c87b5
(cherry picked from commit da4e87f20b0ec8bb1003e519ed39ba32de62a89d)
2022-01-20 23:00:49 +00:00
db2fc82a54 Generalize IValue's aliased hash handling for opaque tensors (#70371)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70371

This PR generalizes the aliased hash handling for opaque tensors beyond MKL-DNN.
ghstack-source-id: 147328304

Test Plan: Run existing tests.

Reviewed By: zou3519

Differential Revision: D33301787

fbshipit-source-id: db741ac347e933f8d65b029cd5be5f01804a960e
(cherry picked from commit aa8822a31a1002ea0c2440041e5e8cb862666535)
2022-01-20 23:00:49 +00:00
2eb4b05b94 torch/monitor: make tests more robust on windows (#71581)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71581

Fixes https://github.com/pytorch/pytorch/issues/71553

Test Plan:
add ciflow/windows to CI

  buck test //caffe2/test:monitor -- --stress-runs 100 test_interval_sec

I don't have a windows machine so need to rely on CI to test

Reviewed By: edward-io

Differential Revision: D33691540

fbshipit-source-id: 69f28f1dfa7243e4eeda642f9bef6d5d168381d2
(cherry picked from commit 5d24dc7c2f5e8e0f48fdd602b1eaa3a8e6929715)
2022-01-20 14:33:41 -08:00
bc608b6e16 Add gitutils tests (#71580)
Summary:
Test PeekableIterator behavior

Add `.github/scripts/test_*.py` to list of tests run by test_tools
workflow and pin Python version to 3.7 in test_tools workflow

Change PeekableIterator inheritance from collections.abc.Iterator to
typing.Iterator, which is a correct alias starting from Python 3.7
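A minimal sketch of a peekable iterator subclassing `typing.Iterator`, in the spirit of the class under test (names are illustrative, not the actual `.github/scripts` implementation):

```python
from typing import Iterator, TypeVar

T = TypeVar("T")

class PeekableIterator(Iterator[T]):  # typing.Iterator is subclassable from 3.7 on
    def __init__(self, iterable):
        self._it = iter(iterable)
        self._buffer = []  # holds at most one peeked-but-unconsumed item

    def peek(self) -> T:
        if not self._buffer:
            self._buffer.append(next(self._it))
        return self._buffer[0]

    def __next__(self) -> T:
        if self._buffer:
            return self._buffer.pop(0)
        return next(self._it)
```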

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71580

Reviewed By: bigfootjon

Differential Revision: D33690659

Pulled By: malfet

fbshipit-source-id: 71f270b15138230772e2eed0da66cdfcb34825cc
(cherry picked from commit 42abb07396fa90272afb0b56508bd3a1f5c4ccbe)
2022-01-20 14:33:41 -08:00
08b389fc36 .github: Set DRY_RUN for refs/heads/nightly (#71570)
Summary:
This wasn't getting set correctly before

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71570

Reviewed By: malfet

Differential Revision: D33688286

Pulled By: seemethere

fbshipit-source-id: 57d87ce714477683ab123c4b8382ede4149835bc
(cherry picked from commit 7fce5aa077be27efd58bcf67f38af46c12e08d73)
2022-01-20 14:33:41 -08:00
e43769e8ab remove ciflow_should_run (#70321)
Summary:
This PR implements the workflow changes described in https://fb.quip.com/oi8wAvajpR4g. Combined with the bot logic in d928549336 (can be moved to probot but is easier to test there), it fully implements the proposal.

The CIFlow comment is slightly outdated now but is still technically correct (all the commands will continue to work as before, just through a different mechanism).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70321

Reviewed By: atalman, janeyx99

Differential Revision: D33690370

Pulled By: suo

fbshipit-source-id: 8d81ffeb249cdae53c5526798a4a504560d0204f
(cherry picked from commit 5ed8d0dfae6dcf8bacaf6e4229e7b40b5c2b2446)
2022-01-20 14:33:41 -08:00
cyy
67385918ab move header inclusion (#71307)
Summary:
A header is used only in the .cc file and it is included by the public header. This causes errors when I try to include the public header.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71307

Reviewed By: zou3519

Differential Revision: D33650700

Pulled By: ngimel

fbshipit-source-id: d08dd335208da3aaafe333522d9525976c513151
(cherry picked from commit 94805495a0d30c54f22b0609db177d7ac3e26093)
2022-01-20 21:02:43 +00:00
8b5775a30f Fix unused variable warning in Sorting.cu (#71555)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71555

Test Plan: Sandcastle

Reviewed By: malfet, mruberry

Differential Revision: D33675174

fbshipit-source-id: c849b809d17d8c51b4ddba24ec5e2ae7bd1fa69a
(cherry picked from commit 440fca60be13135883f2cc3be98b078035906032)
2022-01-20 20:07:32 +00:00
9f1ad2d3e4 Fix unused variable warning in lp_pool_op.cu (#71557)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71557

Test Plan: Sandcastle

Reviewed By: malfet

Differential Revision: D33675165

fbshipit-source-id: d18f0833c8aeef7d2cc35919ebb328cac5a92db4
(cherry picked from commit bca6b8c8d164f0139ec7330d51e09bef7eba44ee)
2022-01-20 20:07:32 +00:00
deb1c2f837 Include act_rewriter_allow_list and leaf_module in lower (#71289)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71289

Test Plan:
Facebook
buck test mode/opt -j 10 caffe2/test/fx2trt/trt_lower:test_fx2trt_lower

Reviewed By: 842974287

Differential Revision: D33556639

fbshipit-source-id: feda71ce08690b85576fc43a18e670fe94beaa91
(cherry picked from commit 6c8efc264fa378c06d1deb5259d91d237b000d53)
2022-01-20 19:38:41 +00:00
10da8726ef Fix unused variable warning in adagrad_fused_op_gpu.cu (#71556)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71556

Test Plan: Sandcastle

Reviewed By: jspark1105

Differential Revision: D33675154

fbshipit-source-id: f5ce981f5e7d3351be225d22156070bf2a7ed2b3
(cherry picked from commit 7409c02b4f685e2e1881cd2cfb627adf8ebc13cb)
2022-01-20 19:38:41 +00:00
add774ddbd .github: Set rocm workflows to only run on PRs (#71567)
Summary:
Also adds a mechanism for all workflows to do this

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

cc jeffdaily sunway513 jithunnair-amd ROCmSupport KyleCZH

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71567

Reviewed By: malfet

Differential Revision: D33687713

Pulled By: seemethere

fbshipit-source-id: a3c7ef41ed04f9caa82c180961d2f4b7c24582dd
(cherry picked from commit eef2eafffd4c6311eff73d86fffaa42460cd2603)
2022-01-20 19:38:41 +00:00
53b3904115 Fix memory leak in ShardedTensor. (#71445)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71445

A reference to the ShardedTensor was always added to the global map
`_sharded_tensor_map`, which never got cleaned up since the map always held a
reference to the ShardedTensor.

A couple of fixes for this:
1) Add to the global map only for `init_rrefs=True` since only this codepath
requires this.
2) Add a `weakref` to the global map to avoid having a reference to the
ShardedTensor forever that never gets cleaned up.
ghstack-source-id: 147299580

Test Plan: waitforbuildbot

Reviewed By: fduwjj

Differential Revision: D33641013

fbshipit-source-id: c552fa3359186514445fd5715bec93f67dc2262d
(cherry picked from commit d25f1a645313dcbf8c37158d80c42c983262cec2)
2022-01-20 19:38:41 +00:00
4b3cf1eaf7 [BE]Clarify how to check memory saving if using gradient_as_bucket_view (#71483)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71483

Clarify that the peak memory saving should be checked after the first iteration when using gradient_as_bucket_view
ghstack-source-id: 147271113

Test Plan: unit test

Reviewed By: rohan-varma

Differential Revision: D33662424

fbshipit-source-id: f760da38e166ae85234e526ddf1526269ea25d42
(cherry picked from commit a40dda20daa2fe051fcaa8fee5f3641aeea1da1c)
2022-01-20 19:38:41 +00:00
e926360cb8 [Pytorch Edge] Refactor Compatibility Stuff into own directory (#71432)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71432

Organizing jit/mobile a little more

ghstack-source-id: 147184536

Test Plan: ci.

Reviewed By: iseeyuan

Differential Revision: D33640527

fbshipit-source-id: f3a7884fe0d06d80bb8d9cf141ecaee34b6f88ff
(cherry picked from commit 4c3d1e5435db04a4ca2898ddf0811490f5959555)
2022-01-20 19:38:41 +00:00
1c61d8c43f [PT1.11] make static graph to be stable (#71459)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71459

1. add static_graph feature to DDP constructor;
2. still keep _set_static_graph() API, so that existing use cases are not affected, also it can be called internally by DDP constructor
3. four cases are covered:
    static_graph = False, _set_static_graph() is called;
    static_graph = False, _set_static_graph() is not called;
    static_graph = True, _set_static_graph() is not called;
    static_graph = True, _set_static_graph() is called;
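The four cases above can be sketched with a toy class (illustrative only, not DDP's real internals): the constructor calls the legacy API internally when the flag is set, and a redundant explicit call becomes a no-op.

```python
class FakeDDP:
    """Toy stand-in showing how a constructor flag can wrap a legacy setter."""

    def __init__(self, static_graph=False):
        self._static_graph = False
        if static_graph:
            # constructor path: internally reuse the existing API
            self._set_static_graph()

    def _set_static_graph(self):
        if self._static_graph:
            return  # already enabled (flag + explicit call): harmless no-op
        self._static_graph = True
```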
ghstack-source-id: 147263797

Test Plan: unit tests

Reviewed By: rohan-varma

Differential Revision: D33646738

fbshipit-source-id: 8c1730591152aab91afce7133d2adf1efd723855
(cherry picked from commit dc246a1129a8ce5f70e551d7d8e00e0dab8ec6af)
2022-01-20 19:38:41 +00:00
11d8fe59fd Revert "Move syncbranches and trymerge to 3.9"
This reverts commit c7c767726be3989d8f1841df71f0624a96b1135f.
2022-01-20 11:26:12 -08:00
c7c767726b Move syncbranches and trymerge to 3.9
Hotfix for `TypeError: 'ABCMeta' object is not subscriptable`
2022-01-20 11:12:13 -08:00
4868907cf3 [binaries] fix dump_operator_name binary (#71246)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71246

Test Plan: Imported from OSS

Reviewed By: malfet

Differential Revision: D33555962

Pulled By: IvanKobzarev

fbshipit-source-id: 2b386e52fa8e76c877fec5b6b15d99f7d280801f
(cherry picked from commit f6d60fdff68964f77aa46ca2c51327cb66566194)
2022-01-20 17:33:08 +00:00
89c844db9b [torch.distributions] Implement positive-semidefinite constraint (#71375)
Summary:
While implementing https://github.com/pytorch/pytorch/issues/70275, I thought it would be useful to have a `torch.distributions.constraints` member that checks the positive semi-definiteness of matrix random variables.
This PR implements it with `torch.linalg.eigvalsh`, unlike `torch.distributions.constraints.positive_definite`, which is implemented with `torch.linalg.cholesky_ex`.
Currently, `torch.linalg.cholesky_ex` only returns the order of the leading minor that is not positive-definite in symmetric matrices, so we can't check positive semi-definiteness with that mechanism.
cc neerajprad
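The core idea can be sketched with NumPy's analogous `eigvalsh` (the PR itself uses `torch.linalg.eigvalsh`; tolerance handling here is illustrative): a symmetric matrix is PSD iff its smallest eigenvalue is non-negative.

```python
import numpy as np

def is_positive_semidefinite(mat, tol=1e-8):
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending
    # order, so it suffices to check the smallest one against -tol.
    eigvals = np.linalg.eigvalsh(mat)
    return bool(eigvals[0] >= -tol)
```

Note that a Cholesky-based check would reject the singular-but-PSD case (e.g. a matrix with a zero eigenvalue), which is exactly why the eigenvalue route is needed here.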

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71375

Reviewed By: H-Huang

Differential Revision: D33663990

Pulled By: neerajprad

fbshipit-source-id: 02cefbb595a1da5e54a239d4f17b33c619416518
(cherry picked from commit 43eaea5bd861714f234e9efc1a7fb571631298f4)
2022-01-20 17:33:08 +00:00
640bfa7e6f Refactor convolution_backward's cudnn cases (#71491)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71491

Changed the Cudnn and CudnnTranspose cases to only make the input
contiguous when it is needed for the grad_weight computation.

Reading the implementations of cudnn_convolution_transpose_backward and
cudnn_convolution_backward gives me confidence that `input` isn't used
for the grad_weight computation. However, the memory format logic is so
convoluted that I'm not 100% sure this is correct. All the tests pass, though,
and on request I can directly pass `backend_memory_format` to
{cudnn_convolution_backward, cudnn_convolution_transpose_backward}.

Test Plan: - pytest test/test_nn.py -v -k "conv"

Reviewed By: jbschlosser

Differential Revision: D33664694

Pulled By: zou3519

fbshipit-source-id: 9f4929686fe34f7aaf5331bfa49e98022b9d6c08
(cherry picked from commit 9e2ba0daca88139f7941bcb56bbc23825585d7a2)
2022-01-20 17:33:08 +00:00
06f14c2d63 Refactor convolution_backward's CudaDepthwise3d case (#71490)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71490

Deleted unnecessary .contiguous() calls in convolution_backward. The
CudaDepthwise3d case always hits _depthwise_3d_backward_cuda_out,
which will make arguments contiguous as necessary.

Changed _depthwise_3d_backward_cuda_out
- to make the input contiguous only when we're computing grad_weight
- to make the weight contiguous only when we're computing grad_input

Test Plan: - pytest test/test_nn.py -v -k "conv"

Reviewed By: jbschlosser

Differential Revision: D33664696

Pulled By: zou3519

fbshipit-source-id: d01d4f213e21ef4778de089a158933737b191cdf
(cherry picked from commit c6eb977c94a07f9812567a43b125b453eb5c5051)
2022-01-20 17:33:08 +00:00
17d2a5167e Refactor convolution_backward's CudaDepthwise2d case (#71489)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71489

Deleted unnecessary .contiguous() calls in convolution_backward. The
CudaDepthwise2d case always hits conv_depthwise2d_backward_cuda_out,
which makes the grad_output / self contiguous.

Changed conv_depthwise2d_backward_cuda_out to change `self_` (aka the
image input to convolution) to be contiguous only when we're computing
the grad_weight. This is because when we are computing the grad_input,
we only need the values from the grad_output and the weight.

Test Plan: - pytest test/test_nn.py -v -k "conv"

Reviewed By: jbschlosser

Differential Revision: D33664697

Pulled By: zou3519

fbshipit-source-id: 7a755fa8a076809c5490422d69fdf7ed80c8e29a
(cherry picked from commit 862ae63bab74113b3607b1bbc0a82f27992550fe)
2022-01-20 17:33:08 +00:00
42f7afc4cd [BE] Improve gitutils
Inherit `PeekableIterator` from `collections.abc.Iterator`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71515
2022-01-20 17:07:06 +00:00
a9f44b22c0 Fix composite compliance problems for linalg.{matrix_power, inv, cholesky} (#69437)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69437

linalg.{inv, cholesky} are problematic because they call .data_ptr().
This makes them not composite compliant (e.g. meta tensors will not run
on them correctly). This PR makes them composite compliant by adding a
new native_functions operator that does error checking,
`_linalg_check_errors(Tensor info, str api_name, bool is_matrix)`,
which is a primitive with respect to autograd.

This PR modifies linalg.inv and linalg.cholesky to call the new error
check function. I also needed to refactor singleCheckErrors and
batchCheckErrors to accept a c10::string_view instead of a
`const char*`; you can convert `const char*` to c10::string_view but not
the other way around because `string_view` does not require null
terminated buffers.

Finally, there is a bugfix in `__torch_dispatch__` for this PR for
the composite compliant testing mechanism. Previously,
`__torch_dispatch__` could not handle operators with no returns; this PR
fixes that. No returns in C++ is equivalent to a single None return in
Python.
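A hypothetical Python sketch of what an error-checking primitive like `_linalg_check_errors` does (names and message text are illustrative): it inspects the LAPACK `info` codes and raises, without the caller ever needing `.data_ptr()` on the input tensor.

```python
def linalg_check_errors(info, api_name, is_matrix):
    """info: list of LAPACK status codes, one per matrix in the batch."""
    for batch_idx, code in enumerate(info):
        if code != 0:
            where = "" if is_matrix else f" (batch element {batch_idx})"
            raise RuntimeError(
                f"{api_name}: the factorization could not be completed"
                f"{where}; info = {code}"
            )
```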

Test Plan: - composite compliant tests

Reviewed By: albanD

Differential Revision: D32883666

Pulled By: zou3519

fbshipit-source-id: d5a3f52ebab116c93e1a54a203eacc8f787de7e2
(cherry picked from commit 9e24c9599a043877ab4f289469be55550c996a79)
2022-01-20 16:14:34 +00:00
011fd1d933 [DataPipe] improving DataPipe unit tests (#70215)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70215

A few renaming, formatting, and additional tests to make the unit tests better.

cc VitalyFedyunin ejguan NivekT

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D33344610

Pulled By: NivekT

fbshipit-source-id: bb36f7452bdc44964c9ce0650c7ae308ba2c5aa5
(cherry picked from commit 0aae20cb27038b7b3598520db4304a604f1e6799)
2022-01-20 15:49:53 +00:00
9f0c808593 [AutoAccept][Codemod][FBSourceClangFormatLinter] Daily arc lint --take CLANGFORMAT
Reviewed By: zertosh

Differential Revision: D33677079

fbshipit-source-id: 997b73bebdcf83e09138bddc4bce257d0740e874
(cherry picked from commit 620023ad32a9e2c971edf79cd8d9653a987a5aff)
2022-01-20 12:13:18 +00:00
06838ce8b1 fix: do_constant_folding arg when exporting ONNX (#71348)
Summary:
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71348

Reviewed By: H-Huang

Differential Revision: D33662228

Pulled By: msaroufim

fbshipit-source-id: a69c72838b7ff41a2305453ef00666c060ade593
(cherry picked from commit 75dd62b406a655ff9612a340a80a3bd563bd9919)
2022-01-20 05:42:35 +00:00
21b697b646 add flatbuffer_loader and flatbuffer_serializer as BUCK target (#71463)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71463

title

Test Plan: unittest

Reviewed By: zhxchen17

Differential Revision: D33651339

fbshipit-source-id: 4bf325a40e263a441fd86bce560645ad0c1ebb23
(cherry picked from commit 4cb02e62a68f338e3388ad09276ced9b8f4cdcb1)
2022-01-20 04:51:10 +00:00
99df96d800 Add silu and hardsigmoid converter (#71453)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71453

As title

Test Plan: unit test

Reviewed By: frank-wei

Differential Revision: D33646384

fbshipit-source-id: d86326c93e4d6bd59c9152592721f0e6ecf7f6fb
(cherry picked from commit d886380edef3388d60d529100332f9d9564f0913)
2022-01-20 03:16:20 +00:00
80b19c4c8c Enable Python bindings for UntypedStorage (#68945)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68945

This PR enables the Python conversion functions for `Storage` (specifically `UntypedStorage`) and also cleans up some remnants of the deprecated typed storages from `DynamicTypes.cpp`.
ghstack-source-id: 147245110

Test Plan: Run the existing unit and integration tests.

Reviewed By: albanD

Differential Revision: D32676505

fbshipit-source-id: 3a3f6db4fb0da5c78dd406c96ab70bdc37015521
(cherry picked from commit d6427b94cf88b078bd228d43cd2afbabf0773b39)
2022-01-20 02:11:34 +00:00
f5b19ba683 Additional unit test for sharded linear. (#70476)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70476

1) Support a single dimension for inputs
2) Test several error cases

Partially addresses https://github.com/pytorch/pytorch/issues/65638
ghstack-source-id: 146307607

Test Plan: waitforbuildbot

Reviewed By: fduwjj

Differential Revision: D33344357

fbshipit-source-id: 4de7a7177452951dbcce76f27441703447609e6f
(cherry picked from commit 96dfded5697e451b54f113f99b6d0da6f6af500d)
2022-01-20 01:23:44 +00:00
a5d5b11252 Add GitHub merge rules (#71514)
Summary:
The following subfolders of the project were identified as ones that can be
merged on GitHub first and then asynchronously merged into the Meta
codebase:
## ONNX exporter
PRs that include only files under `torch/onnx`, `torch/csrc/jit/passes/onnx` and `test/onnx` and are reviewed by garymm
## CUDA fusers
PRs that include only files under `torch/csrc/jit/codegen/fuser/cuda`, `torch/csrc/jit/codegen/cuda` or `benchmarks/cpp/nvfuser` and reviewed by csarofeen or ngimel
## OSS CI
PRs that include only files under `.circleci`, `.github` and `.jenkins` and are reviewed either by seemethere or myself

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71514

Reviewed By: bigfootjon

Differential Revision: D33673050

Pulled By: malfet

fbshipit-source-id: 21b909d49cb73ff79879b3ea0568e53ef65aa08c
(cherry picked from commit 520226c1bf341fe6a9e1cd42f18da73c43386062)
2022-01-20 01:16:25 +00:00
c59942ac73 [PyTorch] Fix a bunch of structured kernel refcounting (#71140)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71140

Structured kernels need to use the borrowing variants of the build APIs to TensorIterator. (I am working on a debug check for this, but it is currently too strict, and relaxing it does not catch these bugs.)
ghstack-source-id: 147191022

Test Plan: CI

Reviewed By: bdhirsh

Differential Revision: D33520003

fbshipit-source-id: 3b0ff9036acdb78ae6fc7489ed0ed487d5ff080f
(cherry picked from commit 80ef4e14e33718a9ad5aaefc218bb773e3b15a5c)
2022-01-20 00:30:43 +00:00
b98e955b24 [flatbuffer] Fix forward flatbuffer type handling with dynamic type. (#71500)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71500

Some places in flatbuffer_loader.cpp need to update to newer API call following the dynamic type changes.
ghstack-source-id: 147278860

Test Plan:
rebase D33665961
```
[zhxchen17@devbig560.ftw3 /data/users/zhxchen17/fbsource]  buck run fbcode/mode/dbg //arvr/firmware/silicon/turing:test_torch -c turing.min_runtime=1 -c turing.dsp_op=1 -c turing.model_file=test1.ptl -c pt.has_backtraces=1
Action graph will be rebuilt because files have been added or removed.
Downloaded 0/4 artifacts, 0.00 bytes, 100.0% cache miss (for updated rules)
Building: finished in 6.1 sec (100%) 253/253 jobs, 3/253 updated
  Total time: 6.1 sec
BUILD SUCCEEDED
Conv:  input [1, 32, 4, 4] residuals [1] weights [4, 4, 1, 1, 2, 32] nlu_params [4, 128] in_ch 32 out_ch 32 groups 1 kernel  stride  padding  upsample 0 op_type 0 act_type 0
```

Reviewed By: qihqi

Differential Revision: D33668588

fbshipit-source-id: 44163c1bc0ea57e4bd265384a253d6cc7b96ed4a
(cherry picked from commit 746487075e36fe90317b631cb3a839d16fd0723f)
2022-01-20 00:22:35 +00:00
565f78f571 [Pytorch] Speed up LayerNorm 4-5% (#71423)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71423

Replacing this math with a load seems to improve perf.
ghstack-source-id: 147171800

Test Plan: ptvsc2_predictor_bench runs on model from mikeiovine courtesy of mikeiovine

Reviewed By: mikeiovine, xiaomengy

Differential Revision: D33552176

fbshipit-source-id: f21a4cd66c13b9fcb7bcf48f356bdc85e94c4216
(cherry picked from commit 0354fcb9889e7345321fe4dc9e30495a67709a4d)
2022-01-20 00:16:17 +00:00
958f9cf5ff [PyTorch][Static Runtime] Fix extra refcount bumps in layer_norm (#71237)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71237

Noticed these on inspection.
ghstack-source-id: 147171799

Test Plan: CI

Reviewed By: mikeiovine

Differential Revision: D33519799

fbshipit-source-id: 167c63323b345a5822303cecdbbbbb959f66f6e4
(cherry picked from commit 57e8da2d354497d3370906d1ae145288a2fd166b)
2022-01-20 00:16:17 +00:00
811af25963 Fix trivial typo at the doc of torch.lobpcg (#71464)
Summary:
I think `symmetric positive defined generalized eigenvalue problem` should be changed to `symmetric positive definite generalized eigenvalue problem`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71464

Reviewed By: ejguan

Differential Revision: D33660670

Pulled By: H-Huang

fbshipit-source-id: 85dc830ed56a98d8a38bd2843f575f6ce08498cf
(cherry picked from commit dbbef542c04a8dd93514ac7783f4546e5da7ca58)
2022-01-20 00:07:39 +00:00
dc5cda0cca Update min python version to 3.7 in setup.py and mypy configs (#71494)
Summary:
As Python-3.6 have reached EOL

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71494

Reviewed By: atalman

Differential Revision: D33667509

Pulled By: malfet

fbshipit-source-id: ab1f03085cfb9161df77ba5ce373b81f5e7ef3ae
(cherry picked from commit 60343166d97b1eb1649b29a78ad390d39926b642)
2022-01-20 00:03:57 +00:00
06bc6748a1 [acc_ops] Remove usage of kwarg expansion via **locals() for jit scripting support (#71425)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71425

att

Test Plan: CI

Reviewed By: yuhc

Differential Revision: D33639228

fbshipit-source-id: 95edced3b19a531d417538f00f0a555295c8741f
(cherry picked from commit 45455a6edc721a0362a5e775ac2fb52f4f16c84d)
2022-01-19 23:49:50 +00:00
ef4bc3fa2f [distributed] Make rref_proxy._invoke_rpc trully async when needed. (#70206)
Summary:
From https://github.com/pytorch/pytorch/issues/67626: RRefProxy (rref.rpc_async, rref.rpc_sync, rref.remote) currently uses a blocking RPC call to the owner

This is done by chaining async calls. In the sync case we wait on the
resulting Future.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70206

Test Plan:
I ran rpc_tests using tensorpipe_rpc_agent_test_fixture.py and had to
adjust test_rref_proxy_timeout to the new behavior.

I ran into test_tensorpipe_set_default_timeout failing due to the
timeout being too small. Doesn't look related to this change.
mrshenli
Fixes https://github.com/pytorch/pytorch/issues/67626

cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse SciPioneer H-Huang

Reviewed By: pritamdamania87

Differential Revision: D33243348

Pulled By: kumpera

fbshipit-source-id: e1e8c34bb3d170407c0a793e2e585357f905d3c6
(cherry picked from commit 1ad5a7ceea17d00872e593650ef50d85bb232cda)
2022-01-19 23:37:15 +00:00
70c9146c40 [nnc] Update block and thread extents in cuda_codegen to use int64_t (#71428)
Summary:
The block and thread extent calculations in `cuda_codegen` should be using `int64_t` instead of `int`. The updated test, `test_dynamic_shapes`, fails without this change.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71428

Reviewed By: samdow

Differential Revision: D33640374

Pulled By: navahgar

fbshipit-source-id: 64c340ad2a9a1fa1fe066cf1c5dfc3b546b7be6d
(cherry picked from commit 6ea546ce116fc05d9d7e225bc29f7fe86be439de)
2022-01-19 23:21:24 +00:00
2dbbb1a921 [fx2trt] Issue warnings instead of error if there's possible const folding opportunities (#71031)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71031

During the conversion stage, we might create some constants when the size op is called and the size is static. Raising an error here causes problems for this case. Generally speaking, it doesn't hurt to allow skipping const folding.

Test Plan:
Test with D33483843 on shufflenet.

Added unit tests.

Reviewed By: wushirong

Differential Revision: D33484183

fbshipit-source-id: 5b32c06297e56965befd7e83fe8ca273e3665cee
(cherry picked from commit e6b79bd3dd626f4b0035b9792a246fc09098d5ef)
2022-01-19 23:16:23 +00:00
61713acb07 Add trymerge workflow (#71488)
Summary:
This one, will react to `repo_dispatch` event sent by PyTorch Probot
when `pytorchbot merge this` command is issued

At the moment, the workflow will only attempt to merge PRs that were not
created from a forked repo and that match the rules defined in
`.github/merge_rules.json`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71488

Reviewed By: bigfootjon

Differential Revision: D33665142

Pulled By: malfet

fbshipit-source-id: e22daa1892523e62d7b7a941960636a6514cb7d7
(cherry picked from commit 92059bab073e2cd6ca6e9f946ffc2f956e22895c)
2022-01-19 23:11:48 +00:00
f45e217c01 Consolidate the overloads of TensorImpl::shallow_copy_and_detach (#68953)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68953

This PR consolidates the almost identical lvalue and rvalue implementations of shallow_copy_and_detach into a single templated function.
ghstack-source-id: 147238376

Test Plan: Run existing unit tests.

Reviewed By: fduwjj

Differential Revision: D32679741

fbshipit-source-id: 89a870335d2e09ffd005c943733a787d20d352f9
(cherry picked from commit 750344c8600e05d4ab593956257c8191919eeef8)
2022-01-19 21:52:13 +00:00
805b7575db test //c10/... without Google libraries in OSS (#70853)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70853

We support both configurations, so we should ensure they both work.
ghstack-source-id: 147170900

Test Plan: This is adding a test to CI.

Reviewed By: malfet

Differential Revision: D33304505

fbshipit-source-id: 7074b6b98d05f60801bb1d74bc9ac1458c768d28
(cherry picked from commit 8e4134b77789a157be5ba3df1d07f9bb308ca3b6)
2022-01-19 20:56:12 +00:00
78e1f9db34 port //c10/macros to common build structure (#70852)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70852

This is the first change that uses a common build file, build.bzl, to
hold most of the build logic.
ghstack-source-id: 147170895

Test Plan: Relying on internal and external CI.

Reviewed By: malfet

Differential Revision: D33299331

fbshipit-source-id: a66afffba6deec76b758dfb39bdf61d747b5bd99
(cherry picked from commit d9163c56f55cfc97c20f5a6d505474d5b8839201)
2022-01-19 20:56:12 +00:00
661d10aab4 use c10/macros/cmake_macros.h in fbcode build (#70851)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70851

This is a step towards OSS/fbcode convergence since OSS uses this file
in both CMake and Bazel.
ghstack-source-id: 147170896

Test Plan: Relying on the extensive CI internal tests for this.

Reviewed By: malfet

Differential Revision: D33299102

fbshipit-source-id: c650dd4755f8d696d5fce81c583d5c73782e3990
(cherry picked from commit 741ca140c82f728e3b349d703a7de239e5bbf13c)
2022-01-19 20:56:12 +00:00
bdeec0c7b6 [fx] add documentation to AccOpProperties (#71450)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71450

att

Test Plan: no test

Reviewed By: jfix71

Differential Revision: D33515471

fbshipit-source-id: ded40ca117f63c971d6c5ed4556932cc71c009ca
(cherry picked from commit a9f66d5921241645191c1df3292dc6e784860165)
2022-01-19 20:50:21 +00:00
7ce6db48e5 add rocm GHA workflow (#68552)
Summary:
cc jeffdaily sunway513 jithunnair-amd ROCmSupport KyleCZH

Pull Request resolved: https://github.com/pytorch/pytorch/pull/68552

Reviewed By: bdhirsh

Differential Revision: D33569551

Pulled By: seemethere

fbshipit-source-id: cc7d68a22ad0eedd4d11eea3cf43a909e5b8616b
(cherry picked from commit 2bb701eb9d2c1ec79bf3f5b3e75cb7ec41fdeb4d)
2022-01-19 20:31:17 +00:00
15e7d18124 [jit][edge] Create convenience wrapper for dynamic type constructors. (#71457)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71457

Today DynamicType is hard to create because we have separate APIs for different types. In this diff we introduce an easier API to create types like the following:
```
#include <ATen/core/type_factory.h>

auto type = dynT<ListType>(dynT<TensorType>()); // etc...
```
ghstack-source-id: 147211236

Test Plan: CI

Reviewed By: iseeyuan

Differential Revision: D33647746

fbshipit-source-id: c850cf31ae781244eac805906a2fc110ef065a70
(cherry picked from commit 8cfd51d75f010ca6f7f98b7e8ef807ead4d5f8f3)
2022-01-19 20:11:11 +00:00
ac26f8237c Allow disabling nvfuser without CUDA (#71358)
Summary:
On a CPU-only build of PyTorch, `torch._C._jit_set_nvfuser_enabled(False)` would throw an error even though it is a no-op. With this fix:
```
>>> torch._C._jit_set_nvfuser_enabled(False)
False
>>> torch._C._jit_set_nvfuser_enabled(True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: Running CUDA fuser is only supported on CUDA builds.
>>>
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71358

Reviewed By: eellison

Differential Revision: D33601135

Pulled By: jansel

fbshipit-source-id: c764df2fa197ce7b4f71e5df0a91cd988766e99c
(cherry picked from commit a801df93210302e918eca7134d3c0a19ac5bae5d)
2022-01-19 20:01:09 +00:00
214f4bf2ff Support sparse.sum on empty sparse tensor (#71091)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71091

Fixes https://github.com/pytorch/pytorch/issues/65394

The masked sum on a full input tensor (of any layout) with an all-true mask is the same as the sum on the strided input tensor (after applying `to_dense` to sparse inputs).
Since masked sum uses `torch.sparse.sum`, for the simplicity of the masked reduction implementations its reduction behavior ought to be defined by the behavior of `torch.sum`. This PR implements that behavioral connection for the directional summation of empty sparse tensors, which correspond to all-zero strided tensors.
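As a plain-Python sketch of the intended semantics (names and COO-style layout here are illustrative, not the actual implementation): summing an empty sparse matrix along a dimension behaves like summing the all-zero dense matrix it represents.

```python
def directional_sum(rows, cols, indices, values, dim):
    # Sum a COO-style sparse matrix along `dim`. An empty matrix
    # (no stored values) yields a zero vector, not an error --
    # matching the behavior of summing the all-zero dense matrix.
    out_len = cols if dim == 0 else rows
    out = [0.0] * out_len
    for (r, c), v in zip(indices, values):
        out[c if dim == 0 else r] += v
    return out
```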

cc nikitaved pearu cpuhrsch

Test Plan: Imported from OSS

Reviewed By: davidberard98

Differential Revision: D33651750

Pulled By: cpuhrsch

fbshipit-source-id: 703891bff88c8da6270b4272f5d2da81688db67d
(cherry picked from commit 53f97e80f7520594e9977ad61a1a727dadade645)
2022-01-19 18:58:08 +00:00
3b589c3497 [DDP Checkpointing] non-reentrant checkpoint tests (#69060)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69060

Saved-variable-hooks checkpointing was added in https://github.com/pytorch/pytorch/pull/69508; this PR adds some tests for DDP.

Specifically, we can support almost all DDP use cases with this new API, such as dynamic module with find_unused_parameters=True. One case remains to be supported, which is static_graph + non-reentrant based checkpointing. The underlying reason this does not work is https://github.com/pytorch/pytorch/issues/58111.
ghstack-source-id: 147219887

Test Plan: CI

Reviewed By: zhaojuanmao

Differential Revision: D32712126

fbshipit-source-id: ba5ae9ca77fd8929ee020c7dc97838bae9a1931b
(cherry picked from commit 9c7f93e21728d1627d85c351a21e7c8da832bff7)
2022-01-19 18:09:41 +00:00
75aaa9f92b Remove simd qualifier for pragma omp loop in upsample_nearest_op.h (#71462)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71462

Fixes
```
      6 aienv/aienv_ig_reels_base:caffe2/modules/detectron/upsample_nearest_op.h:65:1: error: loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Werror,-Wpass-failed=transform-warning]
      6 deep_entity_classification/si_dec_gnn:caffe2/modules/detectron/upsample_nearest_op.h:65:1: error: loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Werror,-Wpass-failed=transform-warning]
      6 feed_recommendation_infra/multifeed_execution_graph_service_nosan:caffe2/modules/detectron/upsample_nearest_op.h:65:1: error: loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Werror,-Wpass-failed=transform-warning]
     12 mobile_cv/mobile-vision_experimental:caffe2/modules/detectron/upsample_nearest_op.h:65:1: error: loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Werror,-Wpass-failed=transform-warning]
     30 mobile_cv/mobile-vision_xraymobilev2_detection_caffe2:caffe2/modules/detectron/upsample_nearest_op.h:65:1: error: loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Werror,-Wpass-failed=transform-warning]
     42 aienv/aienv:caffe2/modules/detectron/upsample_nearest_op.h:65:1: error: loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Werror,-Wpass-failed=transform-warning]
    128 feed_recommendation_infra/multifeed_recagg_dev:caffe2/modules/detectron/upsample_nearest_op.h:65:1: error: loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Werror,-Wpass-failed=transform-warning]
    136 fluent2/fblearner_flow_projects_fluent2_nosan:caffe2/modules/detectron/upsample_nearest_op.h:65:1: error: loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Werror,-Wpass-failed=transform-warning]
   1338 f6/f6_nosan:caffe2/modules/detectron/upsample_nearest_op.h:65:1: error: loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Werror,-Wpass-failed=transform-warning]
```

Test Plan: Sandcastle

Reviewed By: luciang

Differential Revision: D33641869

fbshipit-source-id: 8424849cfac5cb0109272dec2086863067bbde66
(cherry picked from commit d18429905c7661486ed8ec0cdcdd7d94b9c62762)
2022-01-19 18:04:10 +00:00
908fd3d78b [fix] composite compliance: quantile and nanquantile (#70894)
Summary:
Reference https://github.com/pytorch/pytorch/issues/69991

Refactored such that only the `out` variant copies the result into `out`; otherwise we just return the result of the composite functions as is.
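A minimal sketch of the refactored pattern (hypothetical names; the real composite functions live in ATen): the base implementation returns a fresh result, and only the `out` variant copies into the caller's buffer.

```python
def quantile_impl(values, q):
    # Composite computation: returns a fresh result via linear interpolation.
    s = sorted(values)
    idx = q * (len(s) - 1)
    lo = int(idx)
    hi = min(lo + 1, len(s) - 1)
    frac = idx - lo
    return s[lo] * (1 - frac) + s[hi] * frac

def quantile_out(values, q, out):
    # Only the `out` variant copies the result into the caller's buffer.
    out[0] = quantile_impl(values, q)
    return out
```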

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70894

Reviewed By: samdow

Differential Revision: D33641742

Pulled By: zou3519

fbshipit-source-id: 671be13b31a7fff3afc0b7976706a5ecfc51ccac
(cherry picked from commit e7d5ac9af319be327adc16d2d7048139a4b2ddd3)
2022-01-19 17:54:00 +00:00
a0ada2d22b Back out "[pytorch][PR] Performance and memory improvements to batched torch.linalg.solve" (#71421)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71421

Original commit changeset: 7a0dd443cd0e

Original Phabricator Diff: D33028236 (410e91adee)

Test Plan: PyTorch OSS CI

Reviewed By: ngimel

Differential Revision: D33637628

fbshipit-source-id: 1e81485be202b2f9d6a1ff315279cc099754c2dc
(cherry picked from commit c2d730bfeb2a9e4a3af1442b8d1fe5bf28a95f2b)
2022-01-19 17:26:01 +00:00
8a9243996c Lazy load pandas when importing pytorch (#71316)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/71313

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71316

Reviewed By: wenleix

Differential Revision: D33595043

Pulled By: malfet

fbshipit-source-id: da8c7a7f132696645191d7b7055c4c21970d92c3
(cherry picked from commit 2d4847780a4d26426d2300861069160836130063)
2022-01-19 17:02:50 +00:00
671a0b5376 Move sccache compilation log to its own group (#71444)
Summary:
The sccache compilation log is often misleading.

We can move it to its own group so people don't see it right away

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71444

Reviewed By: atalman

Differential Revision: D33659650

Pulled By: janeyx99

fbshipit-source-id: f22fd21640a8747beeacce8857bbb8281efd76f4
(cherry picked from commit e25970abf99801fc04d4ae15f8f5ffe63dd1dc41)
2022-01-19 16:47:36 +00:00
7ed2a43d26 Adding wheels with py3.10 (#71419)
Summary:
Adding wheels with py3.10

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71419

Reviewed By: janeyx99

Differential Revision: D33657770

Pulled By: atalman

fbshipit-source-id: 5d24f1771991ff07fbfd92d04d3d5211cf53084c
(cherry picked from commit bf2f2624e12821a417a17bd374e13fda5ab69724)
2022-01-19 16:40:39 +00:00
b56ba296b1 Support multiple input dims for sharded linear. (#70266)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70266

Addresses some of the issues mentioned in
https://github.com/pytorch/pytorch/issues/65638. The ShardedLinear implementation
only supports 2D inputs.

On the other hand `nn.Linear` supports arbitrary dimensions for inputs and
outputs. As a result, in this PR I've added support to ensure that
ShardedLinear supports arbitrary input dims as well.
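One common way to do this (a sketch, not necessarily the exact approach taken in the PR) is to collapse the leading dimensions: view `(*leading, in_features)` as `(prod(leading), in_features)`, run the 2D sharded matmul, then view the output back to `(*leading, out_features)`.

```python
from functools import reduce
import operator

def collapse_leading_dims(shape, in_features):
    # Collapse arbitrary leading dims so a 2D sharded matmul can be applied;
    # returns the flattened 2D shape plus the leading dims needed to
    # reshape the output back afterwards.
    *leading, last = shape
    assert last == in_features, "last dim must equal in_features"
    rows = reduce(operator.mul, leading, 1)
    return (rows, in_features), tuple(leading)
```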
ghstack-source-id: 147206607

Test Plan: waitforbuildbot

Reviewed By: wanchaol

Differential Revision: D33267630

fbshipit-source-id: 0460994c3aa33348b80547d9274206ef90cb29b6
(cherry picked from commit 7c289e1dbf491008e091ed0a49f98f2ebcfb4175)
2022-01-19 08:07:14 +00:00
fbc3b8c1bb [RPC] Fix a few flaky RPC tsan tests (#71460)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71460

When running with TSAN, we use a larger RPC timeout: https://github.com/pytorch/pytorch/blob/master/torch/testing/_internal/dist_utils.py#L68. As a result, the assertions here are invalid.

Tried to fix this by just setting `self.rpc_backend_options.rpc_timeout` to the new timeout, but `rpc_backend_options` is reconstructed every time it is accessed, so this doesn't work: https://github.com/pytorch/pytorch/blob/master/torch/testing/_internal/distributed/rpc/tensorpipe_rpc_agent_test_fixture.py#L15

Just removing the asserts should be fine as they don't really add value to what's being tested.
ghstack-source-id: 147208455

Test Plan: CI

Reviewed By: fduwjj

Differential Revision: D33648421

fbshipit-source-id: 9a5052b1c851fe7f838792d8bdf17d0563b4aa00
(cherry picked from commit 96ddab3433aff88961236d2d64f2b685de1ccc15)
2022-01-19 06:12:43 +00:00
9515213070 [Operator Versioning] Remove version compare as they are decoupled now (#71461)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71461

After the operator versioning work, the version in the model file is used for operator versioning, while bytecode_version is used for bytecode versioning (for the bytecode schema). They are two separate things now and this comparison is not needed.
ghstack-source-id: 147209286

Test Plan: CI

Reviewed By: iseeyuan, tugsbayasgalan

Differential Revision: D33648592

fbshipit-source-id: beaa136a728f88435176a00c07b2d521210f107f
(cherry picked from commit e90e650e1a5134473117eda802d679171e035082)
2022-01-19 04:51:45 +00:00
677fab6d1d Support broadcast_to on sparse COO tensors (#71073)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71073

cc nikitaved pearu cpuhrsch

Test Plan: Imported from OSS

Reviewed By: mikaylagawarecki

Differential Revision: D33645744

Pulled By: cpuhrsch

fbshipit-source-id: 4775c9636c4e868022a8c1bbfec93e351d1cf885
(cherry picked from commit 640f21e09a935a1231b99ddd6472b03158bdc283)
2022-01-19 04:33:41 +00:00
9b9b878c89 Fixes jiterator cache macro include + updates CUDA note with cache variables (#71452)
Summary:
Per title.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71452

Reviewed By: ngimel

Differential Revision: D33646495

Pulled By: mruberry

fbshipit-source-id: bbf627e6d7a724a83a3ea2ae9c0f50430f8d578e
(cherry picked from commit d1e72b144aad9607ce53c477b7edfdce17cfd1c0)
2022-01-19 03:45:05 +00:00
125bdb6d51 empty_meta: Add functions that don't depend on Tensor (#70615)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70615

This adds `at::detail::empty_meta` and
`at::detail::empty_strided_meta` to complement the cpu API.

Test Plan: Imported from OSS

Reviewed By: samdow

Differential Revision: D33623678

Pulled By: ngimel

fbshipit-source-id: 59e003116361fb547ec2c633bbc15a7973e21d0e
(cherry picked from commit b4f5836fa106418755381abedf327125bde744ef)
2022-01-19 03:41:20 +00:00
b4a75af758 [fx2trt] Export some options out (#71315)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71315

Add variables in LowerSetting to export options from TRTInterpreter and interpreter.run:
- explicit precision
- int8_mode

Export skip_folding_node_fn options from split_const_subgraphs.

Reviewed By: wushirong

Differential Revision: D33585385

fbshipit-source-id: 3d20b69d255ad97487e462436ae479587a8e2118
(cherry picked from commit f24a279517b16624a02d458e10275d78ec3d5699)
2022-01-19 02:13:31 +00:00
87215ed526 empty_strided: Factor out generic implementation (#70614)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70614

This creates an `empty_strided_generic` function which, similar to
`empty_generic`, is a device-independent tensor constructor. This also
adds `at::detail::empty_strided_cpu` to complement
`at::detail::empty_cpu`.

Test Plan: Imported from OSS

Reviewed By: samdow

Differential Revision: D33623679

Pulled By: ngimel

fbshipit-source-id: 85994e88d664870bf425f398dfcdfc467885c694
(cherry picked from commit 2ff2a89df5752cfad667463aa3c3bffe8479ec9a)
2022-01-19 01:54:16 +00:00
d5e9a276ea Adapt to llvm marking SmallVector::set_size private (#71434)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71434

See also https://reviews.llvm.org/D115380

Reviewed By: zhuhan0

Differential Revision: D33638540

fbshipit-source-id: a55e51462dc0d8f55a75bb79d9d76db781a36af2
(cherry picked from commit 78d1d65f77ab575acb367196608606206d29b0f1)
2022-01-19 00:54:03 +00:00
30739f5329 ci: Change binary trigger to be nightly push (#71447)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71447

Changes the nightly build trigger to be based on pushes to the `nightly`
branch instead of being based on the tagged push. This aligns it with
our current CircleCI trigger and should make it so that it's easily
viewable using tools like https://hud.pytorch.org/ci/pytorch/pytorch/nightly

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

Test Plan: Imported from OSS

Reviewed By: malfet

Differential Revision: D33647102

Pulled By: seemethere

fbshipit-source-id: c6757da35b7ec2d68bf36160dd7f3cb9ed040899
(cherry picked from commit 99b7b22650440e82fe5b150af3db53cf8c9deabd)
2022-01-19 00:27:42 +00:00
6f4c491c6b empty_cpu: Add functions that don't depend on Tensor (#70613)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70613

This refactors `at::detail::empty_cpu` to use only `TensorBase` so you
can construct tensors without including `Tensor.h`. It also adds a
`TensorOptions` version to reduce friction in operators moving from
the `at::empty` API.

Test Plan: Imported from OSS

Reviewed By: samdow

Differential Revision: D33623682

Pulled By: ngimel

fbshipit-source-id: 7a7b08bc2ed06830a3d698197a0c8389a096dc1d
(cherry picked from commit 2e17ad0bbd6dea2ea99c264fe3ea66414c991c8e)
2022-01-19 00:01:58 +00:00
6964aa2ced backout D33469839 (#71443)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71443

cogwheel test inline_cvr_infer_canary_pyper_model_publish is timing out.

The convert_fx call takes > 20 mins for local and local_ro sub modules, which used to take ~ 2 mins.

Test Plan:
Fblearn flow run
* the following cmd took 1113 seconds before the diff and 5002 seconds after.
    flow-cli clone-locally 320014219  --run-as-secure-group pytorch_at_scale  --operators pyper_model_publish_workflow.pyper_model_publish_workflow.process_torch_package_model_files.process_non_sparse_parameters[0]

Cogwheel test
* Cogwheel test with packages in B3588 (the last good run) took 4694.48s
* Cogwheel test with packages in B3590 (the first timeout) took 13975.83s
* Cogwheel test with the following packages took 4535.04s
  * all packages in B3588 except the model publish
  * the model publish built with D33469839 (043e84b3d2) reversed (created D33633570)

Reviewed By: albanD, jerryzh168

Differential Revision: D33633570

fbshipit-source-id: dc5e777c48a90c551641a3f79126461f6a60449e
(cherry picked from commit 03ab65023a9f4175584ddac1cca7eab51397c84a)
2022-01-18 23:51:51 +00:00
4fd1992a60 [Docs][BE] DDP doc fix (#71363)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71363

Looks like DDP example is currently broken as per
https://discuss.pytorch.org/t/official-ddp-example-is-broken/141493. Fix the
issue by setting the correct env variable.
ghstack-source-id: 147080377

Test Plan: CI

Reviewed By: mrshenli

Differential Revision: D33607250

fbshipit-source-id: e0e7d03cc365c186253b959c4c5405a5e3609218
(cherry picked from commit 32472884ec04d0e9b348b07d645dd1027389f8e8)
2022-01-18 22:24:51 +00:00
322f13d914 [Profiler] Fix memory profile type from recent refactor (#71417)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71417

I accidentally changed CPU_INSTANT_EVENT to CPU_OP, which broke TensorBoard.

Test Plan: Make memory profiling unit test check this case.

Reviewed By: aaronenyeshi

Differential Revision: D33637286

fbshipit-source-id: c95945f6b85cd4168820bd4d2a9203274a0a5bd6
(cherry picked from commit b1e258672af4b83d824b8c8eb565af0ffdfa895b)
2022-01-18 22:18:11 +00:00
ff8fb717db Fix get_git_repo_dir (#71448)
Summary:
Otherwise, rev-list will only pick-up commits in `.github` repo

Before:
```
% git -C .github rev-list 1eb6146d967b2d09af37c54af411d03f0b790209..1ff7f65cc1ad499a71457368894ca14bed069749 -- .
598b55fd1894b7edb21f208b1c86fd6a377ebc69
ae089d6bdf03f1fadbc76fdf3d284081396251e8
```
After
```
% git -C . rev-list 1eb6146d967b2d09af37c54af411d03f0b790209..1ff7f65cc1ad499a71457368894ca14bed069749 -- .
1ff7f65cc1ad499a71457368894ca14bed069749
2ac58b0dc13f152bea180dd3d64b7c36fe0ba755
598b55fd1894b7edb21f208b1c86fd6a377ebc69
55899528a266363d27e0cf5e82b1b94524509756
ae089d6bdf03f1fadbc76fdf3d284081396251e8
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71448

Reviewed By: seemethere, atalman

Differential Revision: D33644256

Pulled By: malfet

fbshipit-source-id: fa2e06f6767e7702af6ce85471aea07fa58292c0
(cherry picked from commit 594cecc0e1b95bacbd0d1d87bf7c622a3a5b04e5)
2022-01-18 22:12:41 +00:00
b8679ee1fc fix conv+bn folding issue when bn hasn't running states (#71259)
Summary:
When folding conv+bn where the bn has no running stats, there is an error on both the JIT and FX paths:

```
import torch

import torch.nn as nn

import torch.fx.experimental.optimization as optimization

class M(nn.Module):
    def __init__(self):
        super(M, self).__init__()
        self.conv = nn.Conv2d(32, 64, 3, stride=2)
        self.bn = nn.BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=False)

    def forward(self, x):
        x = self.conv(x)
        x = self.bn(x)
        return x

x = torch.randn([1, 32, 50, 50])

model = M().eval()

'''
# jit path
with torch.no_grad():
    traced = torch.jit.trace(model, x).eval()
    traced = torch.jit.freeze(traced)
'''

# FX path
fused_model = optimization.fuse(model)
```

Errors before this fix:
1. JIT path
```
Traceback (most recent call last):
  File "bn_test.py", line 27, in <module>
    traced = torch.jit.freeze(traced)
  File "/home/xiaobinz/miniconda3/envs/pytorch-master/lib/python3.8/site-packages/torch/jit/_freeze.py", line 119, in freeze
    run_frozen_optimizations(out, optimize_numerics, preserved_methods)
  File "/home/xiaobinz/miniconda3/envs/pytorch-master/lib/python3.8/site-packages/torch/jit/_freeze.py", line 167, in run_frozen_optimizations
    torch._C._jit_pass_optimize_frozen_graph(mod.graph, optimize_numerics)
RuntimeError: Expected Tensor but got None
```
2. FX path
```
Traceback (most recent call last):
  File "bn_test.py", line 31, in <module>
    model = optimization.fuse(model, inplace=True)
  File "/home/xiaobinz/miniconda3/envs/pytorch-master/lib/python3.8/site-packages/torch/fx/experimental/optimization.py", line 71, in fuse
    fused_conv = fuse_conv_bn_eval(conv, bn)
  File "/home/xiaobinz/miniconda3/envs/pytorch-master/lib/python3.8/site-packages/torch/nn/utils/fusion.py", line 11, in fuse_conv_bn_eval
    fuse_conv_bn_weights(fused_conv.weight, fused_conv.bias,
  File "/home/xiaobinz/miniconda3/envs/pytorch-master/lib/python3.8/site-packages/torch/nn/utils/fusion.py", line 23, in fuse_conv_bn_weights
    bn_var_rsqrt = torch.rsqrt(bn_rv + bn_eps)
TypeError: unsupported operand type(s) for +: 'NoneType' and 'float'
```

This PR will fix this issue.
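The failing step is the `rsqrt(bn_rv + bn_eps)` computation when the BN module tracked no running variance (`track_running_stats=False`, so `bn_rv` is `None`). A minimal sketch of a guard (falling back to unit variance is an assumed fix for illustration; the actual patch may handle this differently):

```python
import math

def bn_scale(bn_running_var, bn_eps):
    # With track_running_stats=False the BN module has no running variance
    # (None), so `rsqrt(None + eps)` blows up during fusion. Guard against
    # it -- here by assuming unit variance (illustrative fallback only).
    if bn_running_var is None:
        bn_running_var = 1.0
    return 1.0 / math.sqrt(bn_running_var + bn_eps)
```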

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71259

Reviewed By: anjali411

Differential Revision: D33595049

Pulled By: davidberard98

fbshipit-source-id: 0fe56bb2bb25d6d54ebc53789d2ad22458da9012
(cherry picked from commit 5672c083784585e6e1ec5657f02bd3051afb2b50)
2022-01-18 22:12:41 +00:00
a986154950 Lazy import packaging in torch_version (#71345)
Summary:
As it is a pretty big package and to be used during normal
course of PyTorch initialization

Fixes https://github.com/pytorch/pytorch/issues/71280

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71345

Reviewed By: seemethere

Differential Revision: D33594547

Pulled By: malfet

fbshipit-source-id: e0abea82dbdc29914512b610692701140d3e68a2
(cherry picked from commit 1ff7f65cc1ad499a71457368894ca14bed069749)
2022-01-18 22:12:41 +00:00
efd274bbcb Fix for Windows builds with Python 3.10, getting rid of ssize_t (ssize_t is not a C++ defined type) (#71390)
Summary:
Fix for Windows builds with Python 3.10, getting rid of `ssize_t` (`ssize_t` is not a C++-defined type).

Here is the completed binary build: https://app.circleci.com/pipelines/github/pytorch/pytorch/441527/workflows/144edb79-b398-4d70-92fe-b63158c1b439/jobs/16954881

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71390

Reviewed By: samdow

Differential Revision: D33637686

Pulled By: atalman

fbshipit-source-id: fcdfca672dc20385a3d2339c20e69bd2d1717e88
(cherry picked from commit 2ac58b0dc13f152bea180dd3d64b7c36fe0ba755)
2022-01-18 22:12:41 +00:00
ea0524dbc3 [FIX LOG] Complete a '\n' in GRAPH_DEBUG (#70421)
Summary:
In graph_executor.cpp, line 963, a '\n' is missing from a GRAPH_DEBUG call that all the other GRAPH_DEBUG sites here include.
Without it, the GRAPH_DEBUG output runs together:

[DEBUG graph_executor.cpp:963] After CheckInplace (end of runOptimization)graph(%0 : Float(*, *, *, *, requires_grad=0, device=cpu),

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70421

Reviewed By: Gamrix

Differential Revision: D33596430

Pulled By: davidberard98

fbshipit-source-id: 0e7c3c02ce44bf925f0c45e96a382104059fe397
(cherry picked from commit 55899528a266363d27e0cf5e82b1b94524509756)
2022-01-18 22:12:41 +00:00
02ac73a973 ci: Add PR trigger for binary builds workflows (#71431)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71431

Adds a PR trigger based on paths to the binary build workflows to make
it easier to test / verify changes to the binary build workflows without
adding a bunch of skipped checks to the majority of our workflows

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

Test Plan: Imported from OSS

Reviewed By: atalman

Differential Revision: D33641276

Pulled By: seemethere

fbshipit-source-id: 0ed65cbcebf06dfe998f81d67df817250dd1a716
(cherry picked from commit 598b55fd1894b7edb21f208b1c86fd6a377ebc69)
2022-01-18 21:19:27 +00:00
5243986df6 Update syncbranches workflow (#71420)
Summary:
Use `pytorchmergebot` credentials to do the merge.
Infer the sync branch name from the workflow rather than hardcoding it.
Move common functions from `syncbranches.py` to `gitutils.py`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71420

Reviewed By: bigfootjon

Differential Revision: D33638846

Pulled By: malfet

fbshipit-source-id: a568fd9ca04f4f142a7f5f64363e9516f5f4ef1c
2022-01-18 11:31:57 -08:00
1eb6146d96 Add manual simple retry to ECR login (#71287)
Summary:
Current retry with AWS_MAX_ATTEMPTS does not seem to work as we still get failures https://github.com/pytorch/pytorch/runs/4806177738?check_suite_focus=true

This should hopefully alleviate the failures.
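The idea is a manual retry loop with backoff rather than relying on `AWS_MAX_ATTEMPTS`. A sketch of that pattern (illustrative helper, not the actual workflow code):

```python
import time

def retry(fn, attempts=5, base_delay=0.0):
    # Retry `fn` up to `attempts` times with exponential backoff,
    # re-raising the last exception if every attempt fails.
    for i in range(attempts):
        try:
            return fn()
        except Exception:
            if i == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** i))
```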

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71287

Reviewed By: malfet, seemethere

Differential Revision: D33573788

Pulled By: janeyx99

fbshipit-source-id: 300fde9a9fa5a2da3e9d18b7989a3676500d8011
2022-01-18 10:56:53 -08:00
2bb6a4f437 Generate aten_interned_strings.h automatically (#69407)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69407

This generates aten_interned_strings.h from `native_functions.yaml`
which is more like how it was originally done. The items deleted from
`interned_strings.h` are duplicates that need to be removed in order
for the code to compile; some of the remaining items may still be out
of date but it is fairly benign even if that's the case.

Test Plan: Imported from OSS

Reviewed By: zou3519

Differential Revision: D32923636

Pulled By: albanD

fbshipit-source-id: a0fd6b3714e70454c5f4ea9b19da5e047d2a4687
2022-01-18 08:29:54 -08:00
d665097cad allow Bazel to build without glog and gflags (#70850)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70850

We support both, so we want to ensure both continue to work.
ghstack-source-id: 146960552

Test Plan: Tested manually. A subsequent diff adds this test configuration to CI.

Reviewed By: malfet

Differential Revision: D33297464

fbshipit-source-id: 70e1431d0907d480c576239af93ef57036d5e4d7
2022-01-18 08:08:46 -08:00
ffdc6b4994 extract //c10/macros to its own package (#70849)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70849

ghstack-source-id: 146960563

Test Plan: Bazel CI tests will protect this.

Reviewed By: malfet

Differential Revision: D33297235

fbshipit-source-id: 6504a977e82ad2f2232a74233b96cdea8bf94a20
2022-01-18 08:08:42 -08:00
8d0e354191 fix CAFFE2_BUILD_MAIN_LIB to the correct C10_BUILD_MAIN_LIB (#70848)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70848

This is the C10 library; it is the main lib we are building
here. While here, use `local_defines` instead of `copts` for this
definition. Both `copts` and `local_defines` only apply to the
compilation units in the library, and not transitively.
ghstack-source-id: 146998039

Test Plan: We are relying on CI to verify this doesn't cause any problems.

Reviewed By: malfet

Differential Revision: D33429420

fbshipit-source-id: b3fc84c0588bd43346e3f9f77e851d293bde9428
2022-01-18 08:05:20 -08:00
fd9e08df5d Make Demux serializable with lambda function (#71311)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71311

Test Plan: Imported from OSS

Reviewed By: NivekT

Differential Revision: D33584552

Pulled By: ejguan

fbshipit-source-id: 52324faf5547f9f77582ec170ec91ce3114cfc61
2022-01-18 06:47:54 -08:00
f0db15122f [AutoAccept][Codemod][FBSourceClangFormatLinter] Daily arc lint --take CLANGFORMAT
Reviewed By: zertosh

Differential Revision: D33629127

fbshipit-source-id: 47befcd98cfa544a4d822161d8bfbe8d7a788e4d
2022-01-18 01:50:08 -08:00
d17f340a2e The Cacherator (#71350)
Summary:
This PR adds a persistent filesystem cache for jitted kernels. The cache is disabled on Windows because it relies on POSIX headers.

The cache writes, by default, to `~/.cache/torch/kernels`, but the location can be controlled by setting the `PYTORCH_KERNEL_CACHE_PATH` environment variable. A separate environment variable, `USE_PYTORCH_KERNEL_CACHE`, will disable all caching logic when set to zero.

The use of a persistent filesystem cache dramatically lowers the "first call time" for an operator after it has been compiled, because it skips (most of) the JIT compilation process. On systems where we compile only to PTX, that PTX still has to be just-in-time compiled by the driver API, so an additional first-call latency of around 10 milliseconds is expected. On systems which compile to SASS the additional first-call latency is about one millisecond. This compares with times of 150+ milliseconds for just-in-time kernel compilation.

Files in the cache use a mostly human readable string that includes an SHA1 hash of the CUDA C string used to generate them. Note that this is not an SHA1 hash of the file's contents, because the contents are the compiled ptx or SASS. No verification is done when the file is loaded to ensure the kernel is what's expected, but it's far more likely you'll be struck by a meteor than observe two file names conflict. Using SHA1 hashes to generate unique ids this way is a common practice (GitHub does it, too).

This cache design could be reused by other fusion systems and should allow us to jiterate more operations without fear of regressing the "incremental development" scenario where users are tweaking or extending programs slightly, rerunning them, and then repeating that process again and again. Without a cache each run of the program would have to recompile every jitted kernel, but with this cache we expect a negligible impact on the user experience.
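The file-naming scheme described above can be sketched as follows (function name and exact layout are illustrative; only the SHA1-of-source idea comes from the description):

```python
import hashlib
import os

def kernel_cache_path(cuda_src: str,
                      cache_dir: str = "~/.cache/torch/kernels") -> str:
    # The file name embeds a SHA1 of the CUDA C source string (not of the
    # compiled ptx/SASS contents), so identical source always maps to the
    # same cached binary across runs.
    digest = hashlib.sha1(cuda_src.encode("utf-8")).hexdigest()
    return os.path.join(os.path.expanduser(cache_dir), "kernel_" + digest)
```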

cc kshitij12345, xwang233

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71350

Reviewed By: ngimel

Differential Revision: D33626671

Pulled By: mruberry

fbshipit-source-id: d55df53416fbe46348623846f699f9b998e6c318
2022-01-17 23:52:14 -08:00
7b9fff90d2 empty_generic: Remove redundant device argument (#70612)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70612

The device information is embedded in the `DataPtr` returned from the
allocator, so this argument is completely ignored.

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D33623681

Pulled By: ngimel

fbshipit-source-id: bea64707bb17d46debb0ed7c1175493df56fee77
2022-01-17 20:18:43 -08:00
f93ffc9ea8 Sparse CSR: Handle zero matrix consistently for triangular_solve (#71304)
Summary:
This PR enables `test_block_triangular` tests on the CPU.
These tests revealed that there was a problem with how the nnz==0 case is handled. Now we return a tensor filled with NaNs both on CUDA and CPU.

cc nikitaved pearu cpuhrsch

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71304

Reviewed By: davidberard98

Differential Revision: D33600482

Pulled By: cpuhrsch

fbshipit-source-id: d09cb619f8b6e54b9f07eb16765ad1c183c42487
2022-01-17 13:47:49 -08:00
17540c5c80 [warnings][Caffe2] Suppress warnings in non-c10 headers (#71370)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71370

Round out suppressing warnings in `caffe2` headers

Test Plan: CI check

Reviewed By: r-barnes

Differential Revision: D33613084

fbshipit-source-id: 9306d480bd796aeae4d887ad26b6ddc2c571c9e4
2022-01-17 10:09:31 -08:00
cf47338191 [Caffe2][warnings] Suppress -Wimplicit-int-float-conversion in TypeSafeSignMath.h for clang (#71369)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71369

Suppress `-Wimplicit-int-float-conversion` in `TypeSafeSignMath.h` when building with clang

Test Plan: CI check

Reviewed By: r-barnes

Differential Revision: D33612983

fbshipit-source-id: cff1239bc252d4a2f54a50a2bbcd48aeb8bf31ca
2022-01-17 10:05:21 -08:00
ddf97a59ca Remove the dependency of pytorch nightly. (#71323)
Summary:
This PR removes the PyTorch nightly dependencies of TorchBench CI. Instead, it relies on the bisection script to install TorchBench dependencies (https://github.com/pytorch/benchmark/pull/694).
This will unblock TorchBench CI users when the nightly build fails (e.g., https://github.com/pytorch/pytorch/issues/71260)

RUN_TORCHBENCH: resnet18
TORCHBENCH_BRANCH: xz9/optimize-bisection

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71323

Reviewed By: wconstab

Differential Revision: D33591713

Pulled By: xuzhao9

fbshipit-source-id: f1308ea33ece1f18196c993b40978351160ccc0c
2022-01-17 09:52:36 -08:00
a383d01774 [fbcode][warnings] Suppress warnings in caffe2/c10 (#71356)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71356

Suppress remaining header based warnings in `caffe2/c10` when building with `clang`

Test Plan: CI pass

Reviewed By: r-barnes

Differential Revision: D33600097

fbshipit-source-id: e1c0d84a0bad768eb03e047d62b5379cf28b48e2
2022-01-15 18:34:08 -08:00
1ecfa1d61a Load zip file in deploy interpreter (#71072)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71072

This PR replaces the old logic of loading frozen torch through CPython by loading zipped torch modules directly onto the deploy interpreter. We embed the zip file as a section of the ELF file and load it back in the interpreter executable. Then, we insert the zip file directly into the sys.path of each initialized interpreter. Python's implicit ZipImporter can load modules from a zip file as long as it is on sys.path.
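
The mechanism relies only on standard zipimport behavior, which can be sketched with the stdlib alone (the module name `demo_mod` and the zip name below are made up for illustration):

```python
import os
import sys
import tempfile
import zipfile

# Build a zip containing a trivial module, standing in for the
# zipped torch modules embedded as an ELF section.
tmpdir = tempfile.mkdtemp()
zip_path = os.path.join(tmpdir, "frozen_modules.zip")
with zipfile.ZipFile(zip_path, "w") as zf:
    zf.writestr("demo_mod.py", "ANSWER = 42\n")

# Inserting the zip into sys.path is enough; zipimport does the rest.
sys.path.insert(0, zip_path)
import demo_mod
print(demo_mod.ANSWER)  # prints 42
```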

Test Plan: buck test //caffe2/torch/csrc/deploy:test_deploy

Reviewed By: shunting314

Differential Revision: D32442552

fbshipit-source-id: 627f0e91e40e72217f3ceac79002e1d8308735d5
2022-01-15 14:39:59 -08:00
08d8f81704 [quant][fix][fx][graphmode] Fix qconfig setting for fused modules (#71254)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71254

when we configure linear and relu with the same qconfig, we currently have utility functions to also
generate a qconfig for the fused linear-relu module, but this code was not called in the correct order,
which resulted in unexpected behaviors. This PR fixes the issue. Please see the test case for more details.
(Test case is from Supriya)

Test Plan:
python test/test_quantization.py TestQuantizeFx.test_fused_module_qat_swap

Imported from OSS

Reviewed By: supriyar

Differential Revision: D33558321

fbshipit-source-id: d95114dc4b77264e603c262c2da02a3de4acba69
2022-01-14 23:31:11 -08:00
bb49352354 caffe2/torch/csrc/jit/frontend/tree_views: workaround nvcc compiler error
Test Plan:
Move it outside the header so it's not seen by nvcc

```
$ buck2 build -c fbcode.platform=platform010 fbcode//accelerators/pytorch/lib/cuda:ngram_repeat_block_cuda
Downloading buck2...
[======================================================================]

watchman fresh instance event, clearing cache
Using disallowed linker flag 'arvr/third-party/toolchains/platform009/build/mesa/lib/libGL.so' in library rule 'fbsource//third-party/toolchains:opengl'
Using disallowed linker flag 'arvr/third-party/freeglut/3.0.0/libs/x64-linux/libglut.a' in library rule 'fbsource//third-party/toolchains:GLUT'
Action Failed for fbcode//accelerators/pytorch/lib/cuda:ngram_repeat_block_cuda (ovr_config//platform/linux:x86_64-fbcode-platform010-clang-6dbc4bb1b9a32829)#5:
cxx_compile ngram_repeat_block_cuda_kernel.cu (pic) failed with non-zero exit code 1
debug information: action_digest=b2bda91d24dad53e960c740ef9a412cee1902d86:94
stdout:
stderr:
fbcode/caffe2/torch/csrc/jit/frontend/tree_views.h: In instantiation of 'static torch::jit::Maybe<T> torch::jit::Maybe<T>::create(const torch::jit::SourceRange&, const T&) [with T = torch::jit::List<torch::jit::Property>]':
fbcode/caffe2/torch/csrc/jit/frontend/tree_views.h:505:117:   required from here
fbcode/caffe2/torch/csrc/jit/frontend/tree_views.h:220:33: error: cannot convert 'const torch::jit::List<torch::jit::Property>' to 'torch::jit::TreeList&&' {aka 'c10::SmallVector<c10::intrusive_ptr<torch::jit::Tree>, 4>&&'}
  220 |     return Maybe<T>(Compound::create(TK_OPTION, range, {value}));
      |                ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
fbcode/caffe2/torch/csrc/jit/frontend/tree.h:144:1: note:   initializing argument 3 of 'static torch::jit::TreeRef torch::jit::Compound::create(int, const torch::jit::SourceRange&, torch::jit::TreeList&&)'
  143 |       const SourceRange& range_,
      |         ~~~~~~~~~~~~~~~~~~~~~~~~
  144 |       TreeList&& trees_) {
      | ^
fbcode/caffe2/torch/csrc/jit/frontend/tree_views.h: In instantiation of 'static torch::jit::Maybe<T> torch::jit::Maybe<T>::create(const torch::jit::SourceRange&, const T&) [with T = torch::jit::List<torch::jit::Assign>]':
fbcode/caffe2/torch/csrc/jit/frontend/tree_views.h:505:171:   required from here
fbcode/caffe2/torch/csrc/jit/frontend/tree_views.h:220:33: error: cannot convert 'const torch::jit::List<torch::jit::Assign>' to 'torch::jit::TreeList&&' {aka 'c10::SmallVector<c10::intrusive_ptr<torch::jit::Tree>, 4>&&'}
  220 |     return Maybe<T>(Compound::create(TK_OPTION, range, {value}));
      |                ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
fbcode/caffe2/torch/csrc/jit/frontend/tree.h:144:1: note:   initializing argument 3 of 'static torch::jit::TreeRef torch::jit::Compound::create(int, const torch::jit::SourceRange&, torch::jit::TreeList&&)'
  143 |       const SourceRange& range_,
      |         ~~~~~~~~~~~~~~~~~~~~~~~~
  144 |       TreeList&& trees_) {
      | ^
cc1plus: note: unrecognized command-line option '-Wno-ignored-optimization-argument' may have been intended to silence earlier diagnostics
cc1plus: note: unrecognized command-line option '-Wno-ambiguous-reversed-operator' may have been intended to silence earlier diagnostics
cc1plus: note: unrecognized command-line option '-Wno-ignored-optimization-argument' may have been intended to silence earlier diagnostics
cc1plus: note: unrecognized command-line option '-Wno-ambiguous-reversed-operator' may have been intended to silence earlier diagnostics
command: buck-out/v2/gen/fbcode/999b02f9444004c1/tools/build/__wrap_nvcc.py__/wrap_nvcc.py -_NVCC_BIN_ fbcode ...<omitted>... ors/pytorch/lib/cuda/__ngram_repeat_block_cuda__/__objects__/ngram_repeat_block_cuda_kernel.cu.pic.o (rerun with -v to view the untruncated command)

```

Reviewed By: zhxchen17

Differential Revision: D33592885

fbshipit-source-id: a36dcb3c8265d009b2287f0a479695d1ddbf85aa
2022-01-14 21:58:31 -08:00
4bf1be898d caffe: fix warning: overloaded virtual function "torch::jit::Function::call" is only partially overridden in class "torch::jit::GraphFunction"
Summary:
Need to bring in all signatures

https://www.internalfb.com/code/fbsource/[36035b9e4e41813e215ffd5f4377d65b7259237e]/fbcode/caffe2/aten/src/ATen/core/function.h?lines=91-101

Test Plan:
```
Action Failed for fbcode//accelerators/pytorch/lib/cuda:ngram_repeat_block_cuda (ovr_config//platform/linux:x86_64-fbcode-platform010-clang-6dbc4bb1b9a32829)#5:
cxx_compile ngram_repeat_block_cuda_kernel.cu (pic) failed with non-zero exit code 1
debug information: action_digest=988629a726bc4eabcaf334db2317a969958d5fd2:94
stdout:
stderr:
fbcode/caffe2/torch/csrc/jit/api/function_impl.h(11): warning: overloaded virtual function "torch::jit::Function::call" is only partially overridden in class "torch::jit::GraphFunction"

fbcode/caffe2/torch/csrc/jit/api/function_impl.h(11): warning: overloaded virtual function "torch::jit::Function::call" is only partially overridden in class "torch::jit::GraphFunction"

fbcode/caffe2/torch/csrc/jit/frontend/tree_views.h: In instantiation of 'static torch::jit::Maybe<T> torch::jit::Maybe<T>::create(const torch::jit::SourceRange&, const T&) [with T = torch::jit::List<torch::jit::Property>]':
fbcode/caffe2/torch/csrc/jit/frontend/tree_views.h:505:117:   required from here
fbcode/caffe2/torch/csrc/jit/frontend/tree_views.h:220:33: error: cannot convert 'const torch::jit::List<torch::jit::Property>' to 'torch::jit::TreeList&&' {aka 'c10::SmallVector<c10::intrusive_ptr<torch::jit::Tree>, 4>&&'}
  220 |     return Maybe<T>(Compound::create(TK_OPTION, range, {value}));
      |                ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
fbcode/caffe2/torch/csrc/jit/frontend/tree.h:144:1: note:   initializing argument 3 of 'static torch::jit::TreeRef torch::jit::Compound::create(int, const torch::jit::SourceRange&, torch::jit::TreeList&&)'
  143 |       const SourceRange& range_,
      |         ~~~~~~~~~~~~~~~~~~~~~~~~
  144 |       TreeList&& trees_) {
      | ^
fbcode/caffe2/torch/csrc/jit/frontend/tree_views.h: In instantiation of 'static torch::jit::Maybe<T> torch::jit::Maybe<T>::create(const torch::jit::SourceRange&, const T&) [with T = torch::jit::List<torch::jit::Assign>]':
fbcode/caffe2/torch/csrc/jit/frontend/tree_views.h:505:171:   required from here
fbcode/caffe2/torch/csrc/jit/frontend/tree_views.h:220:33: error: cannot convert 'const torch::jit::List<torch::jit::Assign>' to 'torch::jit::TreeList&&' {aka 'c10::SmallVector<c10::intrusive_ptr<torch::jit::Tree>, 4>&&'}
  220 |     return Maybe<T>(Compound::create(TK_OPTION, range, {value}));
      |                ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
fbcode/caffe2/torch/csrc/jit/frontend/tree.h:144:1: note:   initializing argument 3 of 'static torch::jit::TreeRef torch::jit::Compound::create(int, const torch::jit::SourceRange&, torch::jit::TreeList&&)'
  143 |       const SourceRange& range_,
      |         ~~~~~~~~~~~~~~~~~~~~~~~~
  144 |       TreeList&& trees_) {
      | ^
cc1plus: note: unrecognized command-line option '-Wno-ignored-optimization-argument' may have been intended to silence earlier diagnostics
cc1plus: note: unrecognized command-line option '-Wno-ambiguous-reversed-operator' may have been intended to silence earlier diagnostics
cc1plus: note: unrecognized command-line option '-Wno-ignored-optimization-argument' may have been intended to silence earlier diagnostics
cc1plus: note: unrecognized command-line option '-Wno-ambiguous-reversed-operator' may have been intended to silence earlier diagnostics
command: buck-out/v2/gen/fbcode/999b02f9444004c1/tools/build/__wrap_nvcc.py__/wrap_nvcc.py -_NVCC_BIN_ fbcode ...<omitted>... ors/pytorch/lib/cuda/__ngram_repeat_block_cuda__/__objects__/ngram_repeat_block_cuda_kernel.cu.pic.o (rerun with -v to view the untruncated command)
```

Differential Revision: D33579670

fbshipit-source-id: 9acb443732feb3e921ce0fa5f38f21ed44f64114
2022-01-14 20:27:09 -08:00
3ed27a96ed [BE] Refactor repetitions into `TorchVersion._cmp_wrapper` (#71344)
Summary:
First step towards https://github.com/pytorch/pytorch/issues/71280

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71344

Reviewed By: b0noI

Differential Revision: D33594463

Pulled By: malfet

fbshipit-source-id: 0295f0d9f0342f05a390b2bd4aa0a5958c76579b
2022-01-14 19:57:55 -08:00
c43e0286a9 [PyTorch][Lazy] Make hashing null optionals cheap (#71290)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71290

The existing code called an out-of-line hash function on a constant. This is just going to get the same random-looking 64-bit integer every time, so I just changed the constant to an integer I generated with `hex(random.randint(0x1000000000000000, 0xFFFFFFFFFFFFFFFF))` to get the same effect but without the runtime hashing.
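
For reference, the quoted one-liner expands to something like the following, run once offline; the resulting literal is what gets pasted into the C++ source (the variable name is made up for illustration):

```python
import random

# Hashing a compile-time constant at runtime would always yield the
# same value anyway, so a fixed random-looking 64-bit integer chosen
# once is equivalent, without the runtime hashing cost.
seed_constant = hex(random.randint(0x1000000000000000, 0xFFFFFFFFFFFFFFFF))
print(seed_constant)  # value differs per run
```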
ghstack-source-id: 146991945

Test Plan: CI

Reviewed By: wconstab

Differential Revision: D33574676

fbshipit-source-id: d6ce1e1cc0db67dfede148b7e3173508ec311ea8
2022-01-14 17:13:50 -08:00
a138aad6e6 [jit][edge] Return a no-op nullptr for UnionType on mobile for backward compatibility. (#71341)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71341

Old models containing UnionType need to be loaded even if they don't actually use Unions.
This is not the best solution; we need to catch this error on the compiler side instead, but before doing that we can land this first to at least mitigate model loading crash issues.
ghstack-source-id: 147056684

Test Plan:
CI
Verified with jaebong on his device locally.

Differential Revision: D33593276

fbshipit-source-id: fac4bc85c652974c7c10186a29f36e3e411865ad
2022-01-14 17:06:13 -08:00
b7222e15b6 [fix] max_pool1d: composite compliance (#70900)
Summary:
Reference: https://github.com/pytorch/pytorch/issues/69991

Not sure if this is a good idea as this increases the number of operators.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70900

Reviewed By: wenleix

Differential Revision: D33585964

Pulled By: zou3519

fbshipit-source-id: 11bfa2e00ee123a6d36f7d4cccdf0c1a3e664d8c
2022-01-14 15:36:27 -08:00
fcbc34a5eb [PyTorch][Static Runtime] Avoid recomputing input size in dict_unpack (#71252)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71252

Same old problem, same old solution.

Interestingly, I tried using c10::irange instead, but that caused really bad assembly to be generated -- we lost inlining for lots of the loop body!
ghstack-source-id: 146939573

Test Plan:
CI

Spot-checked assembly before/after and confirmed that loop termination value was recomputed before and not after

Reviewed By: mikeiovine

Differential Revision: D33558118

fbshipit-source-id: 9fda2f1f89bacba2e8b5e61ba432871e973201fe
2022-01-14 14:33:56 -08:00
bf82d2012e [PyTorch] Add IValue::toDimVector & mostly replace toIntVector with it (#71247)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71247

Most uses of toIntVector() were for a Tensor shape. We have DimVector to avoid heap allocations in those cases, so let's use it.
ghstack-source-id: 146933314

Test Plan: CI -- if we think DimVector is good in general then I think we have to think this change is good?

Reviewed By: mikeiovine

Differential Revision: D33556198

fbshipit-source-id: cf2ad92c2d0b99ab1df4da0f6843e6ccb9a6320b
2022-01-14 14:32:40 -08:00
94ed61eb5c Pin numba to 0.54.1 (#71327)
Summary:
Not sure what is going on, but numba==0.55.0, currently installed in for example 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-py3.7-clang9:0d18ad2827487386d2a7864b11fec5bc83de6545, is built against a newer version of numpy; this was apparently silently fixed on the PyPI side (the latest numba download is numba-0.55.0-1-cp37-cp37m-manylinux2014_x86_64.manylinux_2_17_x86_64.whl)
Fixes https://github.com/pytorch/pytorch/issues/71320

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71327

Reviewed By: suo, seemethere, atalman, janeyx99

Differential Revision: D33589002

Pulled By: malfet

fbshipit-source-id: d362a2b2fd045bc1720cd7fdc4c7b18b7d607fc4
2022-01-14 14:06:15 -08:00
d74bb42f7a Add a missing precondition to DistributedSampler docstring (#70104)
Summary:
Distributed sampler sets different indices for different processes. By doing this, it assumes that the data is the same across the board and in the same order. This may seem trivial; however, there are times when users don't guarantee the order of their items, because they rely on something such as the order in which the filesystem lists a directory (which is not guaranteed and may vary between computers), or the order in which a `set` is iterated.

I think it's better to make it clearer.
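
A stdlib-only sketch of the failure mode (this mimics the rank-strided index selection a distributed sampler performs, not DistributedSampler's actual implementation):

```python
def shard_indices(num_items: int, rank: int, world_size: int):
    # Each rank takes every world_size-th index; these indices only
    # refer to the same samples if every process sees the data in
    # the same order.
    return list(range(rank, num_items, world_size))

# Sort anything with unspecified order (e.g. a directory listing)
# before sharding, so all ranks agree on which sample index 0 is.
files = sorted(["b.jpg", "a.jpg", "c.jpg"])
print([files[i] for i in shard_indices(len(files), rank=0, world_size=2)])  # ['a.jpg', 'c.jpg']
print([files[i] for i in shard_indices(len(files), rank=1, world_size=2)])  # ['b.jpg']
```

If one process listed the files in a different order, the two ranks would overlap on some samples and silently skip others.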

cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse SciPioneer H-Huang

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70104

Reviewed By: bdhirsh

Differential Revision: D33569539

Pulled By: rohan-varma

fbshipit-source-id: 68ff028cb360cadaee8c441256c1b027a57c7089
2022-01-14 13:55:12 -08:00
2faccc2f5d [quant] Remove some redundant entries in backend_config_dict for TensorRT (#70971)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70971

"root_module" and "reference_quantized_module_for_root" are only used in convert, removed
them for fused module and qat module swapping configurations
We may be able to remove some other fields as well.

Test Plan:
python test/fx2trt/test_quant_trt.py TestQuantizeFxTRTOps

Imported from OSS

Reviewed By: andrewor14

Differential Revision: D33470739

fbshipit-source-id: 67e6d58d7a3ec9fbd8c13527e701c06119aeb219
2022-01-14 12:43:25 -08:00
d793cc1993 Revert "Pin numba ot 0.54.1"
This reverts commit ac7f188c64805f2f9dd134f5781d3b584688e677 that was
landed accidentally.
2022-01-14 12:32:39 -08:00
ac7f188c64 Pin numba to 0.54.1
As the newer one is incompatible with the numpy version we are using
Fixes https://github.com/pytorch/pytorch/issues/71320
2022-01-14 12:25:47 -08:00
680d61daab [LT] Remove torch::lazy::convertShapes (#71291)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71291

This commit removes torch::lazy::convertShapes since it's no longer used.
In addition, it replaces a numel logic within LTCTensorImpl.

Test Plan:
./build/bin/test_lazy
CI in lazy_tensor_staging branch

Reviewed By: wconstab

Differential Revision: D33575084

Pulled By: alanwaketan

fbshipit-source-id: b104ef39fd552822e1f4069eab2cb942d48423a6
2022-01-14 12:06:39 -08:00
c7d1501e4d fractional_maxpool3d: port to structured kernel (#70414)
Summary:
Port fractional maxpool 3d to structured kernel

Fixes https://github.com/pytorch/pytorch/issues/55070

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70414

Reviewed By: zdevito, wenleix

Differential Revision: D33572110

Pulled By: bdhirsh

fbshipit-source-id: 1f89eb511335f51cc7abbb0230e165da8752f9fc
2022-01-14 12:01:16 -08:00
a4196a9abf Remove unused optimizers variable in test (#70668)
Summary:
In `TestLRScheduler._test()`, an unused variable `optimizers` is created. This PR is a minor refactoring that removes the variable and the loop block that populates the set.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70668

Reviewed By: wenleix

Differential Revision: D33586236

Pulled By: albanD

fbshipit-source-id: cabf870a8221f144df9d3e2f2b564cdc5c255f5a
2022-01-14 11:59:49 -08:00
054b90f0d6 add channels last support for ChannelShuffle (#50247)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/50247

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D26007052

Pulled By: VitalyFedyunin

fbshipit-source-id: 08f737d64a65791c8002ffd56b79b02cf14d6159
2022-01-14 11:55:21 -08:00
e531646955 Fix docstring for nn.MultiHeadAttention (#71100)
Summary:
Fixes nn.MultiHeadAttention's docstring problem reported at https://github.com/pytorch/pytorch/issues/70498.

cc albanD mruberry jbschlosser walterddr kshitij12345

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71100

Reviewed By: mruberry

Differential Revision: D33531726

Pulled By: albanD

fbshipit-source-id: d2aa8fa44d0f6b166a809b7e5ceee26efcbccf36
2022-01-14 10:29:18 -08:00
17bb68618f Copy: Fix CPU transpose path ignoring neg and conj bits (#69026)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69026

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D33064533

Pulled By: anjali411

fbshipit-source-id: 98c25586a1707ac2324f69f652ce5a14dd59c0ad
2022-01-14 10:13:33 -08:00
84b1c9798c add BFloat16 support for AvgPool2d on CPU (#66927)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/66927

Test Plan: Imported from OSS

Reviewed By: mikaylagawarecki

Differential Revision: D33353198

Pulled By: VitalyFedyunin

fbshipit-source-id: 1aeaa4bb90ac99210b8f6051c09d6995d06ce3a1
2022-01-14 07:59:10 -08:00
88012c7daf [AutoAccept][Codemod][FBSourceClangFormatLinter] Daily arc lint --take CLANGFORMAT
Reviewed By: zertosh

Differential Revision: D33577744

fbshipit-source-id: 7ecc8367998ee1dffde54c2f4dd3cfafe19a53c9
2022-01-14 06:10:57 -08:00
3a0c680a14 Jiterates exp2, erfc, erfinv and entr and refactors code_template.h to ATen (#71295)
Summary:
Per title.

cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse SciPioneer H-Huang

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71295

Reviewed By: ngimel

Differential Revision: D33575885

Pulled By: mruberry

fbshipit-source-id: bc841b46fc0b5458a26a4d4465b18a7a54cd5a5b
2022-01-13 23:58:51 -08:00
d068849cc0 - Fixed memory leak in ir_simplifier.cpp (#71285)
Summary:
The leak was causing long running inference loops to exhaust system memory. I tracked down the issue and noted that ModRound can be copied by value without worrying about a performance hit.

I originally branched from release/1.10 and made these changes. This commit includes the same changes but from master as requested in the original PR https://github.com/pytorch/pytorch/pull/71077

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71285

Reviewed By: wenleix

Differential Revision: D33575821

Pulled By: ZolotukhinM

fbshipit-source-id: 64333f6cbb2c222f05481499c9cae4c7e0116af6
2022-01-13 22:29:06 -08:00
910c01020e add BFloat16 support for AdaptiveMaxPool2d on CPU (#66929)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/66929

Test Plan: Imported from OSS

Reviewed By: mikaylagawarecki

Differential Revision: D33353199

Pulled By: VitalyFedyunin

fbshipit-source-id: d402d5deb7ca766259ca42118ddc16625e134c4c
2022-01-13 20:00:42 -08:00
9e45c89891 remove skips from determinant tests (#70034)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/67512.

The accuracy requirement for non-contiguous inputs when using complex64 was too high, so I reduced it to up to 1e-3.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70034

Reviewed By: anjali411

Differential Revision: D33530382

Pulled By: mruberry

fbshipit-source-id: 057daf75dc5feca5bb2f4428922eb7489435da60
2022-01-13 19:13:28 -08:00
356af8f857 Do not use ssize_t in python_arg_parser.[cpp|h] (#71250)
Summary:
Use `Py_ssize_t` when calling the Python API
Use `c10::irange` to automatically infer the loop type
Use `size_t` or `unsigned` for unsigned types

Partially addresses https://github.com/pytorch/pytorch/issues/69948

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71250

Reviewed By: atalman

Differential Revision: D33569724

Pulled By: malfet

fbshipit-source-id: c9eb75be9859d586c00db2f824c68840488a2822
2022-01-13 19:10:30 -08:00
675acfc1f4 Remove unwanted comma (#71193)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/70611

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71193

Reviewed By: ngimel

Differential Revision: D33542841

Pulled By: mruberry

fbshipit-source-id: 0f2f1218c056aea7ecf86ba4036cfb10df6e8614
2022-01-13 19:09:05 -08:00
558622642b Fix torch.dsplit docs dim specification (#70557)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/70445.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70557

Reviewed By: ngimel

Differential Revision: D33542864

Pulled By: mruberry

fbshipit-source-id: c3a7929bfcd964da99225ad715f4546f1fc8002a
2022-01-13 19:04:51 -08:00
5f2b4be3b9 [jit] Split DynamicType conformance test into smaller pieces. (#71275)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71275

Currently it's taking more than 10 minutes to run the conformance test. Instead we should use a parametrized test to shard it into segments so that they can run in parallel.
ghstack-source-id: 146990608

Test Plan:
```
[zhxchen17@devbig560.ftw3 /data/users/zhxchen17/fbsource/fbcode] buck test mode/dev-tsan //caffe2/test/cpp/jit:jit -- -r 'LiteInterpreterDynamicTypeTestFixture'
Building... 34.9 sec (99%) 12110/12111 jobs, 0/12111 updated
Tpx test run coordinator for Facebook. See https://fburl.com/tpx for details.
Running with tpx session id: ebea52b3-7c7f-46be-9f69-18e2e7b040cc
Trace available for this run at /tmp/tpx-20220113-113635.717778/trace.log
RemoteExecution session id: reSessionID-ebea52b3-7c7f-46be-9f69-18e2e7b040cc-tpx
Started reporting to test run: https://www.internalfb.com/intern/testinfra/testrun/4222124735827748
    ✓ ListingSuccess: caffe2/test/cpp/jit:jit : 431 tests discovered (11.173)
    ✓ Pass: caffe2/test/cpp/jit:jit - Conformance/LiteInterpreterDynamicTypeTestFixture.Conformance/0 (51.331)
    ✓ Pass: caffe2/test/cpp/jit:jit - Conformance/LiteInterpreterDynamicTypeTestFixture.Conformance/1 (65.614)
    ✓ Pass: caffe2/test/cpp/jit:jit - Conformance/LiteInterpreterDynamicTypeTestFixture.Conformance/3 (76.875)
    ✓ Pass: caffe2/test/cpp/jit:jit - Conformance/LiteInterpreterDynamicTypeTestFixture.Conformance/5 (77.271)
    ✓ Pass: caffe2/test/cpp/jit:jit - Conformance/LiteInterpreterDynamicTypeTestFixture.Conformance/4 (78.871)
    ✓ Pass: caffe2/test/cpp/jit:jit - Conformance/LiteInterpreterDynamicTypeTestFixture.Conformance/6 (78.984)
    ✓ Pass: caffe2/test/cpp/jit:jit - Conformance/LiteInterpreterDynamicTypeTestFixture.Conformance/7 (84.068)
    ✓ Pass: caffe2/test/cpp/jit:jit - Conformance/LiteInterpreterDynamicTypeTestFixture.Conformance/2 (85.198)
    ✓ Pass: caffe2/test/cpp/jit:jit - Conformance/LiteInterpreterDynamicTypeTestFixture.Conformance/8 (88.815)
    ✓ Pass: caffe2/test/cpp/jit:jit - Conformance/LiteInterpreterDynamicTypeTestFixture.Conformance/9 (90.332)
Summary
  Pass: 10
  ListingSuccess: 1
If you need help understanding your runs, please follow the wiki: https://fburl.com/posting_in_tpx_users
Finished test run: https://www.internalfb.com/intern/testinfra/testrun/4222124735827748
```

Reviewed By: qihqi

Differential Revision: D33570442

fbshipit-source-id: 5c49e03b0f88068d444c84b4adeaaf45433ce1fa
2022-01-13 18:22:55 -08:00
81f693d509 [ONNX] minor clarifications of docstrings (#69260) (#69549)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69549

[ONNX] minor clarifications of docstrings

1. Make description of ONNX_ATEN_FALLBACK more accurate (after #67460).
2. Specify minimum and maximum values for opset_version. This is pretty
   important information and we should not make users dig through source
   code to find it.

Test Plan: Imported from OSS

Reviewed By: malfet

Differential Revision: D32994267

Pulled By: msaroufim

fbshipit-source-id: ba641404107baa23506d337eca742fc1fe9f0772
2022-01-13 18:03:27 -08:00
d555d3f0d0 Update generated header to use flatbuffer v1.12; (#71279)
Summary:
Update generated header to use flatbuffer v1.12;
Also pin flatbuffer repo to v1.12

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71279

Test Plan:
unittest

Reviewed By: gmagogsfm

Differential Revision: D33572140

Pulled By: qihqi

fbshipit-source-id: 319efc70f6c491c66a3dfcd7cad1f7defe69916b
2022-01-13 17:23:30 -08:00
e47771cca0 [ao] Removing unused allow list arguments from propagate_qconfig and helper (#71104)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71104

This shouldn't change any functionality given that those
variables were not used. It should be noted that a similar variable is
used in add_observer which is why it wasn't removed from there.
ghstack-source-id: 146940043

Test Plan:
python test/test_quantization.py

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D33510352

fbshipit-source-id: c66ed72c2b71a6e1822f9311467adaa1f4b730d0
2022-01-13 16:07:29 -08:00
e7c87e8b44 [quant] fix dropout in FX graph mode quantization (#71043)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71043

Fixes issue #68250, where dropout breaks FX graph mode quantization.

Test Plan:
python test/test_quantization.py TestStaticQuantizedModule

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D33490176

fbshipit-source-id: 155546505b28ffc635ada65a1464b9d622dbc235
2022-01-13 15:59:59 -08:00
eac3decf93 ModuleList concatenation (#70887)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/70441.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70887

Reviewed By: ejguan

Differential Revision: D33555431

Pulled By: albanD

fbshipit-source-id: ce42459ee46a611e98e89f02686acbac16b6b668
2022-01-13 15:31:07 -08:00
2981534f54 [nn] cross_entropy: no batch dim support (#71055)
Summary:
Reference: https://github.com/pytorch/pytorch/issues/60585

cc albanD mruberry jbschlosser walterddr kshitij12345

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71055

Reviewed By: anjali411

Differential Revision: D33567403

Pulled By: jbschlosser

fbshipit-source-id: 4d0a311ad7419387c4547e43e533840c8b6d09d8
2022-01-13 14:48:51 -08:00
e4d522a3cf More informative messages for None types comparisons (#69802)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69802

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33555886

Pulled By: Gamrix

fbshipit-source-id: 3045cbe04de22f05db41a99ad3dda90c5271aa0f
2022-01-13 13:59:28 -08:00
ed9804088a Adding support for loops (#70209)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70209

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33555889

Pulled By: Gamrix

fbshipit-source-id: f6c0c9d517849e3679e07ac1c8cf3bf367e91882
2022-01-13 13:59:25 -08:00
18d91a97e4 Adding custom device type change rules (#69051)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69051

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33555884

Pulled By: Gamrix

fbshipit-source-id: c38812277d0e2aa008903a4328cb72e34bc6e1e6
2022-01-13 13:59:21 -08:00
03c4d2b9e3 Adding support for Ifs in Device Type Analysis (#69050)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69050

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33555887

Pulled By: Gamrix

fbshipit-source-id: f7f057c5985f8b6e7a9fe5702a944b2b4cc4d5b5
2022-01-13 13:59:18 -08:00
4a8aa971cc Building a TensorProperty AbstractBaseClass (#71184)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71184

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33555890

Pulled By: Gamrix

fbshipit-source-id: 694f7b5327b93257010b0abeed3310b0b816c0a8
2022-01-13 13:59:15 -08:00
dabcbb2726 Testing for Default Inference for Device Type (#69052)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69052

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33555888

Pulled By: Gamrix

fbshipit-source-id: dbd43ebfc1bea4b17a96bdd378ea730ccf5944b2
2022-01-13 13:59:12 -08:00
ade83ed90c Building Default Inference for Device Type (#69049)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69049

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33555885

Pulled By: Gamrix

fbshipit-source-id: 7364066cbc544ab8442a47c82ea89f0e73eaaa06
2022-01-13 13:57:08 -08:00
b64946cbc1 [acc_normalizer] Delete is_wrapped after normalization (#71046)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71046

att

Test Plan:
Added test coverage.

yinghai verifying locally for issue.

Reviewed By: kflu, 842974287

Differential Revision: D33487868

fbshipit-source-id: 5da615f66f50500b30bae84592859305b2971e1e
2022-01-13 13:33:01 -08:00
71b274d34d [pytorch] move ATen/CUDAGeneratorImpl.h to ATen/cuda (#71224)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71224

Pull Request resolved: https://github.com/facebookresearch/FBTT-Embedding/pull/19

Pull Request resolved: https://github.com/pytorch/FBGEMM/pull/860

This patch follows up D33414890 (5cae40c169).

This patch removes an alias header "`ATen/CUDAGeneratorImpl.h`" since it has been moved to `ATen/cuda/CUDAGeneratorImpl.h`. This change should have already been propagated.

Test Plan: Internal and external CI

Reviewed By: jianyuh

Differential Revision: D33534276

fbshipit-source-id: 368177784ec84f003aad911cf4dd4da4a6e8e3d4
2022-01-13 13:29:44 -08:00
1de830a985 Use ptrdiff_t rather than ssize_t (#71271)
Summary:
`diff_type` kind of naturally should be `ptrdiff_t`, as `ssize_t` is actually defined [here](https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/sys_types.h.html) as :
> The type ssize_t shall be capable of storing values at least in the range [-1, {SSIZE_MAX}].

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71271

Reviewed By: atalman

Differential Revision: D33569304

Pulled By: malfet

fbshipit-source-id: 57dafed5fc42a1f91cdbed257e76cec4fdfbbebe
2022-01-13 12:41:53 -08:00
83b45fe166 [ao] disabling dynamic conv/convT ops (#71110)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71110

As mentioned in https://github.com/pytorch/pytorch/issues/70480, the dynamic conv ops are currently missing a key feature needed to bring their performance in line with other dynamic ops; this diff therefore disables conv/convT from being automatically quantized during dynamic convert.

Test Plan: buck test //caffe2/test:quantization --test-selectors test_quantized_module#TestDynamicQuantizedModule

Reviewed By: vkuzo

Differential Revision: D33511152

fbshipit-source-id: 50618fbe734c898664c390f896e70c68f1df3208
2022-01-13 11:28:02 -08:00
37eaf7640f Revert "Revert D33480077: .github: Re-enable xla test config" (#71202)
Summary:
This reverts commit 14922a136f940e2f9bc9d04d7963b8141138efa0.

Re-enable xla test config since PTXLA head is back to green -- https://app.circleci.com/pipelines/github/pytorch/xla.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71202

Reviewed By: wenleix

Differential Revision: D33569109

Pulled By: seemethere

fbshipit-source-id: ee0985768d1dfaa6c28865ae5b3dbce2a4a340f7
2022-01-13 11:19:18 -08:00
40eb004da5 Use nightly-binary instead of nightly to deduplicate refs for nightlies (#71270)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/71260

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71270

Reviewed By: seemethere

Differential Revision: D33568858

Pulled By: janeyx99

fbshipit-source-id: 03de185af987e5cb3b021d842be20c4a353b1033
2022-01-13 10:10:35 -08:00
003c94c790 [Quant] Templatize activationLimits function (#71220)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71220

This is to allow using this function for uint8 as well as int8

Test Plan:
buck test caffe2/test:quantization
This primarily tests T=uint8

Reviewed By: kimishpatel

Differential Revision: D33520713

fbshipit-source-id: 9640cf0a446e4c4e76887d643d72b767945bae76
2022-01-13 09:31:16 -08:00
4a26624670 [Quant] Add a guard against shapes for qnnpack qadd (#71219)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71219

qnnpack kernel does not support broadcasting

Test Plan: buck test caffe2/test:quantization

Reviewed By: kimishpatel

Differential Revision: D33520613

fbshipit-source-id: 93c5226d53cb7b90ed495ff7b14158f7171d25bf
2022-01-13 09:31:12 -08:00
e1b9d5854a [Quant] Add quantized input tensor data type checks (#71218)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71218

This asserts quint8 support and fails with a helpful error message when used with a different qdtype

Test Plan: buck test caffe2/test:quantization

Reviewed By: kimishpatel

Differential Revision: D33455785

fbshipit-source-id: 6ec728f59bb707c2d941b50e6375a698c66284c0
2022-01-13 09:29:55 -08:00
188b744390 Make docker build cron once a week and not every hour on Wed (#71255)
Summary:
Running it many times a day was probably not intentional

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71255

Reviewed By: suo, atalman

Differential Revision: D33559155

Pulled By: janeyx99

fbshipit-source-id: c8703cea6f3188c9bcb0867b895261808d3164ee
2022-01-13 08:26:57 -08:00
1e3893ecbb [DataPipe] Removing deprecated DataPipes (#71161)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71161

Users should import these DataPipes from [TorchData](https://github.com/pytorch/data) if they would like to use them. We will be checking for any downstream library usage before landing this PR.

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D33532272

Pulled By: NivekT

fbshipit-source-id: 9dbfb21baf2d1183e0aa379049ad8304753e08a1
2022-01-13 07:37:48 -08:00
60632a00fe [AutoAccept][Codemod][FBSourceClangFormatLinter] Daily arc lint --take CLANGFORMAT
Reviewed By: zertosh

Differential Revision: D33561057

fbshipit-source-id: 79873717c45c8bbe6d0ae760e718770fd960185d
2022-01-13 03:27:06 -08:00
ff78c73286 [ONNX] Remove f arg from export_to_pretty_string (#69045) (#69546)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69546

The arg is not used and was previously deprecated.

Also remove torch.onnx._export_to_pretty_string. It's redundant with the
public version.

Test Plan: Imported from OSS

Reviewed By: malfet

Differential Revision: D32994270

Pulled By: msaroufim

fbshipit-source-id: f8f3933b371a0d868d9247510bcd73c31a9d6fcc
2022-01-12 21:31:36 -08:00
3cc34a4502 [PyTorch][Static Runtime] s/toObject/toObjectRef/ in native ops (#71238)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71238

Saves a refcount bump for these.
ghstack-source-id: 146927203

Test Plan: CI

Reviewed By: mikeiovine

Differential Revision: D33554385

fbshipit-source-id: b2f8d5afdc0eb80c8765d88560d0e547376f28d1
2022-01-12 18:44:40 -08:00
ffdc0e23af [SR] Add various missing native ops (#71113)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71113

This diff adds a variety of missing ~~out variants~~/native ops. Most of these are trivial, so I included them all in one diff.

Native ops
* `aten::mul` (list variant)
* `aten::sub` (int variant)
* `aten::add` (list variant)
* `aten::Int`

Out variants
* ~~`aten::gt`~~ (codegen will handle)
* ~~`aten::eq`~~ (codegen will handle)
ghstack-source-id: 146927552

Test Plan: `buck test caffe2/benchmarks/static_runtime:static_runtime_cpptest`

Reviewed By: hlu1

Differential Revision: D33510756

fbshipit-source-id: df385958b9561955b2e866dab2e4c050abd26766
2022-01-12 18:40:31 -08:00
f6b804ba9f Fallback to server JIT type for type checking.
Summary:
T109800703
In runtime fallback to server JIT type if a DynamicType is parsed.

Test Plan: local headset

Reviewed By: scramsby

Differential Revision: D33557763

fbshipit-source-id: f5fe7dabf668de2f55cc26f9ebe8bcbccd570ce3
2022-01-12 17:59:54 -08:00
84d4087874 Fix trt const_fold as output use case (#71194)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71194

Reviewed By: jfix71, khabinov

Differential Revision: D33541168

fbshipit-source-id: dd5787430b272977963323a6ce38b3e15e979278
2022-01-12 16:57:19 -08:00
1bbea3c3a2 [PyTorch][JIT] Support mayContainAlias(Value*, ArrayRef<Value*>) (#69853)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69853

We can implement this overload more efficiently.
ghstack-source-id: 146924693

Test Plan:
patched alias_analysis tests

Time reported to initialize a predictor by static runtime when given ctr_mobile_feed local_ro net is 9.5s instead of 10.5s.

Reviewed By: mikeiovine

Differential Revision: D33039731

fbshipit-source-id: 52559d678e9eb00e335b9e0db304e7a5840ea397
2022-01-12 16:53:54 -08:00
cd253938a9 [PyTorch][SR][easy] s/input_or_constant_aliases/external_aliases/ (#69852)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69852

Looks like a stale comment.
ghstack-source-id: 146924694

Test Plan: review

Reviewed By: hlu1

Differential Revision: D33033264

fbshipit-source-id: aa0eff463c42716bdd7142d4662d8668af439f68
2022-01-12 16:52:26 -08:00
1bc3571078 [pytorch][PR] Add ability for a mobile::Module to save as flatbuffer (#70201)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70201

Included functions:
save_mobile_module -> saves a mobile::Module to flatbuffer
load_mobile_module_from_file -> loads a flatbuffer into mobile::Module
parse_mobile_module -> parses from bytes or deserialized flatbuffer module object

Compared to previous attempts, this diff only adds flatbuffer to the cmake target and leaves the fbcode/xplat ones unchanged.

Test Plan: unittest

Reviewed By: malfet, gmagogsfm

Differential Revision: D33239362

fbshipit-source-id: b9ca36b83d6af2d78cc50b9eb9e2a6fa7fce0763
2022-01-12 16:30:39 -08:00
7a93d8bb2d Revert D32374542: Implement the patterns module for the multi subgraph rewriter.
Test Plan: revert-hammer

Differential Revision:
D32374542 (de62bcac66)

Original commit changeset: 4ae8da575976

Original Phabricator Diff: D32374542 (de62bcac66)

fbshipit-source-id: 901e41d6abb202c5b1c6a3a84b060b2677b5bbe1
2022-01-12 15:50:58 -08:00
9ca367d48b [nnc] Use given kernel function name while emitting code (#67781)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67781

Update `LLVMCodeGen` in NNC to use the given kernel function name while emitting code.

This was earlier committed as D31445799 (c30dc52739) and got reverted as part of a stack of diffs that included a cache for `PyTorchLLVMJIT`, which was the likely culprit.

Test Plan:
```
buck test mode/opt //caffe2/test/cpp/tensorexpr:tensorexpr -- --exact 'caffe2/test/cpp/tensorexpr:tensorexpr - LLVM.CodeGenKernelFuncName'
```

Reviewed By: ZolotukhinM, bdhirsh

Differential Revision: D32145958

fbshipit-source-id: 5f4e0400c4fa7cabce5b91e6de2a294fa0cad88e
2022-01-12 15:49:17 -08:00
67941c8a94 Document torch.cuda.ExternalStream, torch.cuda.caching_allocator_alloc and torch.cuda.caching_allocator_delete (#70126)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/67414. Fixes https://github.com/pytorch/pytorch/issues/70117.

cc brianjo mruberry ngimel

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70126

Reviewed By: mruberry

Differential Revision: D33542910

Pulled By: ngimel

fbshipit-source-id: 4b870f4dceca6ee4cc8fba58819f1cb18ac9f857
2022-01-12 15:44:40 -08:00
ad803936d1 Codegen: ADInplaceOrViewType only include operators registered (#68692)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68692

ADInplaceOrViewType is a sharded file, so by only including specific
operator headers, we ensure that changing one (non-method) operator
only needs one shard to be re-compiled.

This also ports the generated code over to the `at::_ops` interface,
and the code generator itself to using `write_sharded` instead of
re-implementing its own version of sharding.

Test Plan: Imported from OSS

Reviewed By: bdhirsh

Differential Revision: D33217916

Pulled By: albanD

fbshipit-source-id: 90f1868f72644f1b5aa023cefd6a102bbbec95af
2022-01-12 15:34:45 -08:00
cc55da8a9b [caffe2/server quant] use new depthwise conv fbgemm interface (#71166)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71166

Remove the use of deprecated old interface

Test Plan: CI

Reviewed By: jiyuanzFB

Differential Revision: D33533494

fbshipit-source-id: 930eb93cd67c7a9bb77708cc48914aa0c9f1c841
2022-01-12 15:29:07 -08:00
de62bcac66 Implement the patterns module for the multi subgraph rewriter. (#71181)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71181

This diff introduces the patterns module that defines a pattern-replacement pair for the experimental multi subgraph rewriter.

Test Plan: Tested locally. Unit test suite forthcoming.

Reviewed By: ajauhri

Differential Revision: D32374542

fbshipit-source-id: 4ae8da575976e96b02c5c33c6ae2a0943fc7f126
2022-01-12 15:12:05 -08:00
3c0c5bde0e [cmake] Uncomment binaries (#71157)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71157

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D33528259

Pulled By: IvanKobzarev

fbshipit-source-id: b8c216558ca612bedd4c37205f38ed29c2c82b3c
2022-01-12 15:01:44 -08:00
e1f01d2c01 .ci: Add nightly trigger, remove CircleCI linux binary builds (#70957)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70957

Adds nightly trigger for github actions using a workflow that will pull
down viable/strict and tag it as `nightly` and then re-push it up to the
repository.

Also removes CircleCI linux binary builds since they will now be
outmoded in favor of our new GHA workflow

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

Test Plan: Imported from OSS

Reviewed By: suo

Differential Revision: D33535609

Pulled By: seemethere

fbshipit-source-id: ca6402df37db46e1872ff25befe96afa12e7b1af
2022-01-12 14:31:51 -08:00
6c1be299c1 caffe2/c10/core/TensorImpl.h: adapt to clang 12 (#70973)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70973

clang12 builds fail like this:

  caffe2/c10/core/TensorImpl.h:2615:1: error: static_assert failed due to requirement 'sizeof(void *) != sizeof(long) || sizeof(c10::TensorImpl) == sizeof(long) * 24' "You changed the size of TensorImpl on 64-bit arch.See Note [TensorImpl size constraints] on how to proceed."

Yet eliciting the size of that struct with this one-line addition:

  char (*__show_sizeof)[sizeof( TensorImpl )] = 1;

reports that its size is indeed 192 (aka 8 * 24):

  caffe2/c10/core/TensorImpl.h:2615:8: error: cannot initialize a variable of type 'char (*)[192]' with an rvalue of type 'int'

On closer inspection we determined that failures were occurring because TensorImpl was sometimes of size 208 and other times of size 192. The 192 size was expected and TensorImpl was hard-coded to raise an error for any other case on a 64-bit system, including the one we found where the size was 208.

Additional investigation revealed that systems using GCC 11 and CUDA 11040 with either C++ 201402 or 201703 would sometimes yield TensorImpl sizes of 208, whereas newer systems without CUDA would always yield sizes of 192.

The difference turned out to be that `std::unique_ptr` on NVCC systems is sometimes of size 16 and other times of size 8, accounting fully for the observed difference in TensorImpl sizes. We have not yet been able to find a set of preprocessor macros that predict when each size will occur.

To handle the situation, we've added extensive debugging information to the TensorImpl size-checking logic. A number of preprocessing definitions capture compiler versions and other information to help understand what changes might have affected the size of TensorImpl. The size of each member of TensorImpl is now individually checked, along with the total size. Template-based comparison functions are used to provide compile-time outputs about the system state as well as the observed and expected sizes of each item considered.

The template-based comparison functions cause the code to break if it's run on a 32-bit system because the templates and their associated static_asserts are compiled whether or not they'll ultimately be used. In C++17 we could prevent this using `if constexpr`; however, PyTorch is pinned to C++14, so we cannot. Instead, we check pointer size (`#if UINTPTR_MAX == 0xFFFFFFFF`) to determine which system we're on and provide separate checks for 32 vs 64-bit systems.

A final wrinkle is that 32-bit systems have some variations in data size as well. We handle these by checking that the relevant items are `<=` the expected values.

In summary...

Improvements over the previous situation:
* Added checks for 32-bit systems
* The sizes of individual fields are now checked
* Compile-time size results (expected versus observed) are provided
* Compile-time compiler and system info is provided
* Landing this diff will actually enable checks of TensorImpl size; they are currently disabled to expedite LLVM-12 + newer CUDA upgrade efforts.

Some work that could still be done:
* Figure out what preprocessor flags (if any) predict the size of `std::unique_ptr` for 64-bit systems and of various elements of 32-bit systems.

Test Plan: Building no longer triggers that static_assert failure.

Reviewed By: luciang

Differential Revision: D32749655

fbshipit-source-id: 481f84da6ff61b876a5aaba89b8589ec54d59fbe
2022-01-12 14:27:16 -08:00
385773cb77 add BFloat16 support for MaxPool2d on CPU (#56903)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/56903

Test Plan: Imported from OSS

Reviewed By: mikaylagawarecki

Differential Revision: D28836791

Pulled By: VitalyFedyunin

fbshipit-source-id: e03d55cc30dfa3628f096938fbad34b1031948af
2022-01-12 14:20:20 -08:00
de902b5d02 [FX] Add a default_value arg to Graph.placeholder and fix split_module (#71016)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71016

I found out that `split_module` doesn't preserve default values for arguments. In trying to fix that, I noticed that `Graph.placeholder` doesn't make it easy to add a default argument when making a placeholder. This PR addresses both of those issues.

Test Plan: Imported from OSS

Reviewed By: ansley

Differential Revision: D33482218

Pulled By: jamesr66a

fbshipit-source-id: 57ebcdab25d267333fb1034994e08fc1bdb128ee
2022-01-12 14:03:17 -08:00
5749be4678 Fix the shape inconsistency of out and elem tensor (#71065)
Summary:
See bug report  https://github.com/pytorch/pytorch/issues/71063

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71065

Reviewed By: anjali411

Differential Revision: D33549921

Pulled By: ejguan

fbshipit-source-id: bc43f5f9a88f7dcd8729d0e0f4b90d20f40b3064
2022-01-12 13:57:19 -08:00
2290976880 ci: Comment out pull_request trigger for binary builds (#71244)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71244

Binary builds add a lot of skipped jobs to the default ciflow workflow,
so we're commenting out the pull_request trigger for now until the new
ciflow mechanism becomes available

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

Test Plan: Imported from OSS

Reviewed By: suo

Differential Revision: D33555049

Pulled By: seemethere

fbshipit-source-id: 2d0d4704e7297d5931b2c9705ee4dfb26760736e
2022-01-12 13:48:10 -08:00
bfe1abd3b5 torch/monitor: add pybind (#69567)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69567

This exposes torch.monitor events and stats via pybind11 to the underlying C++ implementation.

* The registration interface is a tad different since it takes a lambda function in Python, whereas in C++ it's a full class.
* This makes small changes to the counter interfaces: since there's no way to create an initializer list at runtime, they now also take a vector.
* Only double-based stats are provided in Python since it's intended more for high-level stats where float imprecision shouldn't be an issue. This can be changed down the line if the need arises.

```
events = []

def handler(event):
    events.append(event)

handle = register_event_handler(handler)

log_event(Event(type="torch.monitor.TestEvent", timestamp=datetime.now(), metadata={"foo": 1.0}))
```

D32969391 is now included in this diff.
This cleans up the naming for events. type is now name, message is gone, and metadata is renamed data.

Test Plan: buck test //caffe2/test:monitor //caffe2/test/cpp/monitor:monitor

Reviewed By: kiukchung

Differential Revision: D32924141

fbshipit-source-id: 563304c2e3261a4754e40cca39fc64c5a04b43e8
2022-01-12 13:35:11 -08:00
90ef54f8ea [PyTorch] Remove buggy ExclusivelyOwnedTraits<intrusive_ptr<T>> (#70647)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70647

It wasn't checking for the null state and it wasn't used.
ghstack-source-id: 146819525

Test Plan: CI

Reviewed By: hlu1

Differential Revision: D33414728

fbshipit-source-id: 7fcd648577cbfc35320c5c3ca9a19a14bd4d6858
2022-01-12 12:19:52 -08:00
479ce1c3a0 [PyTorch] Add isUndefined to ExclusivelyOwnedTraits<TensorBase> debug msg (#70638)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70638

We are seeing these assertions fire infrequently. Add more information to aid in debugging when they fire.
ghstack-source-id: 146819527

Test Plan: CI

Reviewed By: bdhirsh

Differential Revision: D33412651

fbshipit-source-id: 7e35faf9f4eeaa5f2455a4392e00f62fe692811c
2022-01-12 12:18:33 -08:00
4d28cef03a Added AutocastCPU string (#70013)
Summary:
Description:
- Added "AutocastCPU" string repr into `toString` method

Before
```
std::cout << c10::DispatchKey::AutocastCPU;
> UNKNOWN_TENSOR_TYPE_ID
```
and now:
```
std::cout << c10::DispatchKey::AutocastCPU;
> AutocastCPU
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70013

Reviewed By: ejguan

Differential Revision: D33550777

Pulled By: bdhirsh

fbshipit-source-id: b31e15e6d52fc1768af085e428328117d588f283
2022-01-12 12:06:46 -08:00
7884143dff Legacy support for embedded interpreter (#71197)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71197

Adds back legacy support for the embedded interpreter to use the .data section in internal use cases. Specifically, this allows for dynamic loading of Python extension files.

Test Plan: buck test mode/opt //caffe2/torch/csrc/deploy:test_deploy_gpu_legacy

Reviewed By: shunting314

Differential Revision: D33542636

fbshipit-source-id: b49f94163c91619934bc35595304b9e84d0098fc
2022-01-12 11:48:27 -08:00
a71b4dc164 Update nightly wheels to ROCm4.5.2 (#71064)
Summary:
cc jeffdaily sunway513 jithunnair-amd ROCmSupport KyleCZH

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71064

Reviewed By: malfet, janeyx99

Differential Revision: D33552643

Pulled By: seemethere

fbshipit-source-id: 3754f69188864f6b3639818a4b9013ed255a2d7d
2022-01-12 11:41:55 -08:00
fd0d4bef03 Edit cron to make the docker jobs run hopefully (#71232)
Summary:
Our docker builds have not been running with our previous cron; this changes it so it should hopefully work.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71232

Reviewed By: ejguan

Differential Revision: D33552231

Pulled By: janeyx99

fbshipit-source-id: 1a3e1607b03d37614eedf04093d73f1b96698840
2022-01-12 11:37:03 -08:00
70951884d4 Add option to load historic operators in IR when the operator is deprecated (#71148)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71148

Test Plan: Imported from OSS

Reviewed By: gmagogsfm

Differential Revision: D33521300

Pulled By: tugsbayasgalan

fbshipit-source-id: a0607dba5e7233590384326537017eb0b18da419
2022-01-12 11:07:04 -08:00
8f4cec2231 [warnings][Caffe2] Suppress warnings in caffe2 headers (#71196)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71196

`caffe2` headers contain code that can elicit warnings when built with strict compiler flags.  Rather than force downstream/consuming code to weaken their compiler flags, suppress those warnings in the header using `#pragma clang diagnostic` suppressions.

Test Plan: CI Pass

Reviewed By: malfet

Differential Revision: D33536233

fbshipit-source-id: 74404e7a5edaf244f79f7a0addd991a84442a31f
2022-01-12 10:16:35 -08:00
149f5ffa36 Fix inconsistency between new and old upgrader design (#71185)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71185

Test Plan: Imported from OSS

Reviewed By: gmagogsfm

Differential Revision: D33539191

Pulled By: tugsbayasgalan

fbshipit-source-id: 721093793574663d56a8080c6a488024620266a1
2022-01-12 09:54:31 -08:00
54fe2741a1 [fx2trt] break down div (#71172)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71172

Break down div to smaller ops to make those div ops look like all other elementwise ops.

Use the operator div ops instead of torch div where possible, to avoid converting literal numbers to a torch tensor (as in the following).
```
a = 1
b = 2

// `c` would be 0.5
c = a / b

// `c` would be torch.tensor([0.5])
c = torch.div(a, b)
```

The problem we saw on shufflenet is that there's a size op followed by a div op, which results in int64 tensors in the acc traced graph (the acc tracer turns operator.div into acc_ops.div, which uses torch.div). And the trt splitter splits out the reshape op that consumes the div op, because we have a rule to split out ops that take int64 tensors as inputs.
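A stdlib-only sketch of the Python side of this distinction (the tensor behavior is as shown in the snippet above): `a / b` on plain numbers dispatches to `operator.truediv`, which stays a plain Python number, so no tensor (and no int64 promotion) is ever created:

```python
import operator

# `a / b` on Python ints calls operator.truediv and yields a Python float,
# rather than wrapping the operands in a tensor.
assert operator.truediv(1, 2) == 0.5
assert isinstance(operator.truediv(1, 2), float)
```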

Test Plan: Unit tests.

Reviewed By: wushirong

Differential Revision: D33482231

fbshipit-source-id: 508a171520c4e5b4188cfc5c30c1370ba9db1c55
2022-01-12 09:46:46 -08:00
6a40bb0fdf [DataPipe] Update deprecation warning (#71171)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71171

Editing two warnings to more accurately portray the deprecation plan for the DataPipes

cc VitalyFedyunin ejguan NivekT

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D33535785

Pulled By: NivekT

fbshipit-source-id: b902aaa3637ade0886c86a57b58544ff7993fd91
2022-01-12 09:34:53 -08:00
706777bf56 Disable the output invocation in jit (#71138)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71138

Test Plan: Imported from OSS

Reviewed By: navahgar

Differential Revision: D33521059

Pulled By: eellison

fbshipit-source-id: eaf20eaa6e62159dff9369a7b75e6d6009fb45d0
2022-01-12 09:11:37 -08:00
5480deb183 Add support for permutting dynamic fusion group outputs to channels last format (#70656)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70656

Test Plan: Imported from OSS

Reviewed By: navahgar

Differential Revision: D33458650

Pulled By: eellison

fbshipit-source-id: f0c7d20743deac7a87f7c9176e60da8100aefe41
2022-01-12 09:11:34 -08:00
39be20f259 [JIT][NNC] Add handling of strides to dynamic shape support. (#70464)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70464

Add handling of strided input tensors to dynamic fusion. This is done with the same set of input striding specializations as https://github.com/pytorch/pytorch/pull/60684/:
```
  S_ONE, // STRIDE_ONE: packed
  S_CONT, // STRIDE_CONTIGUOUS: stride[i + 1] * sizes[i + 1]
  S_TRAN_CONT, // STRIDE_TRANSPOSED_CONTIGUOUS: stride[i-1] * sizes[i-1]
  S_AS_ARG, // STRIDE_AS_ARG: stride passed in as runtime value
```
and then two additional specializations for a) contiguous tensor and b) channels-last tensor. channels-last is a common case and we should optimize for it. additionally, tensors natively store whether they are contiguous/channels-last contiguous, which makes it faster to check if tensors follow this pattern.
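As a hedged illustration of the two extra specializations (the helper names here are made up, not from the diff), the contiguous and channels-last stride patterns for an NCHW shape can be computed like this; the channels-last result matches the strides in the example IR below:

```python
def contiguous_strides(shape):
    # Row-major layout: the innermost dimension has stride 1, and each
    # outer stride is the product of all inner sizes.
    strides = [1] * len(shape)
    for i in range(len(shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]
    return strides

def channels_last_strides(shape):
    # An NCHW shape laid out as NHWC: the channel dimension becomes innermost.
    n, c, h, w = shape
    return [h * w * c, 1, w * c, c]

shape = (10, 11, 12, 13)
assert contiguous_strides(shape) == [1716, 156, 13, 1]
assert channels_last_strides(shape) == [1716, 1, 143, 11]
```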

Output striding will be done in a follow up.

The striding is stored on both the TensorGroup node and on the guard node. The striding descriptors are stored as a vector of strings on the node for debuggability and to make use of storing ivalues as attributes on nodes.

As an example:

```

%8 : Double(10, 11, 12, 13, strides=[1716, 1, 143, 11], requires_grad=0, device=cpu) = prim::TensorExprGroup_0[symbolic_shape_inputs=[-37, -36, -35, -34], striding_inputs_desc=[["TENSOR_CONT_CHANNELS_LAST"]]](%x, %24, %23, %22, %21)
```

Test Plan: Imported from OSS

Reviewed By: navahgar

Differential Revision: D33458649

Pulled By: eellison

fbshipit-source-id: c42616d3c683d70f6258180d23d3841a31a6030d
2022-01-12 09:11:31 -08:00
975e7d246e Remove ignore shapes arg (#71144)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71144

This wasn't being used anywhere. It was originally intended for the SR flow but we're doing something else now.

Test Plan: Imported from OSS

Reviewed By: navahgar, ZolotukhinM

Differential Revision: D33521061

Pulled By: eellison

fbshipit-source-id: 0574698a2b7409df6feb703f81e806d886225307
2022-01-12 09:09:49 -08:00
97585ae1e7 Simplify forward / backward AD for linalg.eigh and add checks (#70528)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70528

This PR adds checks for the backward of `linalg.eigh`, similar to those
deduced in https://github.com/pytorch/pytorch/pull/70253

It also makes its the implementation parallel that of the (fwd/bwd) derivative of
`torch.linalg.eig` and it makes most OpInfo tests pass.

cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D33530149

Pulled By: albanD

fbshipit-source-id: 1f368b8d450d4e9e8ae74d3881c78513c27eb956
2022-01-12 08:35:52 -08:00
061be8d600 Correct forward AD for linalg.eig and add checks (#70527)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70527

This PR adds checks for the backward of `linalg.eig`, similar to those
deduced in https://github.com/pytorch/pytorch/pull/70253

It also modifies the function so that it does not save the input matrix,
as it's not necessary.

It also corrects the forward AD formula for it to be correct. Now all
the tests pass for `linalg.eig` and `linalg.eigvals`.

It also updates the docs to reflect better what's going on here.

cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D33530148

Pulled By: albanD

fbshipit-source-id: 984521a04f81ecb28ac1c4402b0243c63dd6959d
2022-01-12 08:30:55 -08:00
e1aea9b968 Add retry to disabled tests file download (#71030)
Summary:
Helps with spotty disabling brought up in https://github.com/pytorch/pytorch/issues/70877 and https://github.com/pytorch/pytorch/issues/70875

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71030

Reviewed By: malfet, atalman

Differential Revision: D33486379

Pulled By: janeyx99

fbshipit-source-id: 56c4d56c2bd8be47a51dee19373aac6c9c5d1691
2022-01-12 08:20:44 -08:00
928ca95ff0 fix TensorLikePair origination (#70304)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70304

Without this patch `TensorLikePair` will try to instantiate everything, although it should only do so for tensor-likes. This is problematic if it is used before a different pair type that would be able to handle the inputs but never gets the chance, because `TensorLikePair` fails first.

```python
from torch.testing._comparison import assert_equal, TensorLikePair, ObjectPair

assert_equal("a", "a", pair_types=(TensorLikePair, ObjectPair))
```

```
ValueError: Constructing a tensor from <class 'str'> failed with
new(): invalid data type 'str'.
```
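The fix follows the dispatch convention the comparison machinery relies on: a pair type signals inputs it cannot handle so the next pair type in the chain gets a chance. A simplified stdlib-only sketch of that chain (class bodies are illustrative, not the real implementation):

```python
class UnsupportedInputs(Exception):
    """Raised by a pair type to pass inputs on to the next pair type."""

class NumberPair:
    def __init__(self, actual, expected):
        if not all(isinstance(v, (int, float)) for v in (actual, expected)):
            raise UnsupportedInputs  # bail out instead of erroring
        self.actual, self.expected = actual, expected

class ObjectPair:
    def __init__(self, actual, expected):
        self.actual, self.expected = actual, expected  # handles anything

def originate_pair(actual, expected, pair_types):
    # Try each pair type in order; the first one that accepts wins.
    for pair_type in pair_types:
        try:
            return pair_type(actual, expected)
        except UnsupportedInputs:
            continue
    raise ValueError("no pair type was able to handle the inputs")

# Strings fall through NumberPair and land in ObjectPair.
assert isinstance(originate_pair("a", "a", (NumberPair, ObjectPair)), ObjectPair)
```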

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D33542995

Pulled By: mruberry

fbshipit-source-id: 77a5cc0abad44356c3ec64c7ec46e84d166ab2dd
2022-01-12 06:44:00 -08:00
49a5b33a74 add a equality comparison helper for assert_close internals (#69750)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69750

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D33542993

Pulled By: mruberry

fbshipit-source-id: 0de0559c33ec0f1dad205113cb363a652140b62d
2022-01-12 06:43:57 -08:00
b0a10a709f add explanation of quantized comparison strategy in assert_close (#68911)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68911

Closes #68548.

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D33542997

Pulled By: mruberry

fbshipit-source-id: 78accf20a83cd72254ae0036dc23f9e5376a4c65
2022-01-12 06:43:53 -08:00
802dd2b725 change sparse COO comparison strategy in assert_close (#68728)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68728

This removes the ability for `assert_close` to `.coalesce()` the tensors internally. Additionally, we now also check `.sparse_dim()`. Sparse team: please make sure that is the behavior you want for all sparse COO comparisons in the future. #67796 will temporarily keep BC by always coalescing, but in the future `TestCase.assertEqual` will no longer do that.
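The stricter policy can be sketched in plain Python (a toy model with hypothetical names, not the actual `assert_close` internals): the comparison checks `sparse_dim` and no longer canonicalizes inputs itself, so uncoalesced tensors must be coalesced by the caller.

```python
# Toy model of the stricter sparse COO comparison. Names are hypothetical;
# a "tensor" here is a (sparse_dim, indices, values) triple.
def assert_sparse_equal(a, b):
    if a[0] != b[0]:
        raise AssertionError("sparse_dim mismatch")
    # No implicit .coalesce(): representations are compared as-is.
    if a[1] != b[1] or a[2] != b[2]:
        raise AssertionError("indices/values mismatch; coalesce inputs first")

coalesced = (1, ((0,), (1,)), (3.0, 4.0))
# Same logical tensor, but with a duplicate entry at index 0 (uncoalesced).
uncoalesced = (1, ((0,), (0,), (1,)), (1.0, 2.0, 4.0))
```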

cc nikitaved pearu cpuhrsch IvanYashchuk

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D33542996

Pulled By: mruberry

fbshipit-source-id: a8d2322c6ee1ca424e3efb14ab21787328cf28fc
2022-01-12 06:43:50 -08:00
8d05174def make meta tensor data access error message more expressive in assert_close (#68802)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68802

Without this patch, the error message of comparing meta tensors looks like this after #68722 was merged:

```python
>>> t = torch.empty((), device="meta")
>>> assert_close(t, t)
NotImplementedError: Could not run 'aten::abs.out' with arguments from the 'Meta' backend. [...]
[...]
The above exception was the direct cause of the following exception:
[...]
RuntimeError: Comparing

TensorLikePair(
    id=(),
    actual=tensor(..., device='meta', size=()),
    expected=tensor(..., device='meta', size=()),
    rtol=1.3e-06,
    atol=1e-05,
    equal_nan=False,
    check_device=True,
    check_dtype=True,
    check_layout=True,
    check_stride=False,
    check_is_coalesced=True,
)

resulted in the unexpected exception above. If you are a user and see this message during normal operation please file an issue at https://github.com/pytorch/pytorch/issues. If you are a developer and working on the comparison functions, please except the previous error and raise an expressive `ErrorMeta` instead.
```

Thus, we follow our own advice and turn it into an expected exception until #68592 is resolved:

```python
>>> t = torch.empty((), device="meta")
>>> assert_close(t, t)
ValueError: Comparing meta tensors is currently not supported
```

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D33542999

Pulled By: mruberry

fbshipit-source-id: 0fe1ddee15b5decdbd4c5dd84f03804ca7eac95b
2022-01-12 06:43:47 -08:00
b652887ad7 improve documentation of comparison internals (#68977)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68977

Follow-up to #68722 to address the review comments that were left open before merge.

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D33542998

Pulled By: mruberry

fbshipit-source-id: 23c567cd328f83ae4df561ac8ee6c40c259408c9
2022-01-12 06:42:30 -08:00
523d448968 Remove deprecated cuDNN convolution ops (#71128)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71128

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33517677

Pulled By: jbschlosser

fbshipit-source-id: 1690fd38a38ee7cf16865209280a9c457c5f70ff
2022-01-12 06:34:42 -08:00
93b2399c6c [AutoAccept][Codemod][FBSourceClangFormatLinter] Daily arc lint --take CLANGFORMAT
Reviewed By: zertosh

Differential Revision: D33544281

fbshipit-source-id: 4f0b5d6d490e6fcb967550cfb1dc0111b1770f73
2022-01-12 04:16:43 -08:00
4a8d4cde65 Fix for tensor in list return added to wildcard set (#71170)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71170

As with an output returned in a tuple, an output returned in a list will not have any further uses, so adding it directly to the list's contained elements does not give incorrect behavior. This unblocks a use case in op authoring.

cc Chillee

Test Plan: Imported from OSS

Reviewed By: d1jang

Differential Revision: D33535608

Pulled By: eellison

fbshipit-source-id: 2066d28e98c2f5d1b3d7e0206c7e39a27b3884b1
2022-01-11 22:12:39 -08:00
9bccb31306 Remove precise tuple construct flag (#71121)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71121

Test Plan: Imported from OSS

Reviewed By: d1jang

Differential Revision: D33515234

Pulled By: eellison

fbshipit-source-id: 57cfe171b583a6bb4d3493a34b159061e97a11b8
2022-01-11 22:12:36 -08:00
47ad6628f1 add optional refining (#69776)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69776

If we have a node output which is an optional type, but both if blocks produce a non-optional value, we can try to refine the if output type, which can open up further optimization opportunities.
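The refinement rule can be sketched in plain Python (hypothetical representation, not the JIT's actual type system: a type is either a string like `"Tensor"` or a tuple `("Optional", inner)`):

```python
def refine_if_output(declared, true_branch, false_branch):
    # If the declared output is Optional[T] and both branches actually
    # produce T, refine the output type to T.
    if (isinstance(declared, tuple) and declared[0] == "Optional"
            and true_branch == false_branch == declared[1]):
        return declared[1]
    return declared
```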

Test Plan: Imported from OSS

Reviewed By: jbschlosser

Differential Revision: D33515235

Pulled By: eellison

fbshipit-source-id: 34f6ab94ac4238498f9db36a1b673c5d165e832e
2022-01-11 22:12:34 -08:00
772b3e92bf Parse symbolic shapes (#69775)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69775

Adds parsing for Symbolic Shapes.

Test Plan: Imported from OSS

Reviewed By: jbschlosser

Differential Revision: D33515233

Pulled By: eellison

fbshipit-source-id: 7ebb22c0ab37d78e459ebcab67bb86f731d00376
2022-01-11 22:12:31 -08:00
97e8dcba5e Fix mis-specified device arg name (#69645)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69645

As noted in the code comment:
the existing device operator is registered with input name `a`, which prevents `torch.device(type="cuda")` from working. Add a shim layer here.
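The shim idea can be sketched in plain Python (hypothetical names, not the actual registration code): the registered operator takes an argument named `a`, and a thin wrapper additionally accepts the `type` keyword users expect.

```python
def _device_op(a):
    # Models the existing operator, registered with input name `a`.
    return ("device", a)

def device(*args, type=None):
    # Shim layer: also accept the `type` keyword and forward it.
    if type is not None:
        return _device_op(type)
    return _device_op(*args)
```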

Test Plan: Imported from OSS

Reviewed By: jbschlosser

Differential Revision: D33515231

Pulled By: eellison

fbshipit-source-id: c04af8158a9568a20cd5fbbbd573f6efab98fd60
2022-01-11 22:11:24 -08:00
9465c24245 [jit][edge] Use dynamic type instead of union types for schema parsers. (#70509)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70509

TypeFactory will construct DynamicType when building on Edge platforms. We use this facility to make FunctionSchema return DynamicType all the time for OptionalType. We don't explicitly use DynamicTypeFactory everywhere because that requires too many changes and will split the entire aten codebase.
ghstack-source-id: 146818621

Test Plan: CI

Reviewed By: iseeyuan

Differential Revision: D33306737

fbshipit-source-id: d7ce00b438f7c03b43945d578280cfd254b1f634
2022-01-11 20:14:25 -08:00
40121456af Sparse CSR: Add torch.randn_like (#68083)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68083

This PR adds support for `torch.randn_like(sparse_csr_tensor)`.
It creates a new sparse csr tensor with same indices but different values that are normally distributed.

In addition, `.normal_()` and `torch.empty_like` were implemented because `randn_like` is a composite of these two functions.
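The idea — same sparsity pattern, fresh normally distributed values — can be sketched with a toy CSR triple (hypothetical names, plain Python, not the real ATen implementation):

```python
import random

def randn_like_csr(crow_indices, col_indices, values):
    # Reuse the sparsity pattern unchanged; draw new values from N(0, 1).
    new_values = [random.gauss(0.0, 1.0) for _ in values]
    return crow_indices, col_indices, new_values
```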

cc nikitaved pearu cpuhrsch IvanYashchuk

Test Plan: Imported from OSS

Reviewed By: jbschlosser

Differential Revision: D33511280

Pulled By: cpuhrsch

fbshipit-source-id: 6129083e8bc6cc5af2e0191294bd5e4e864f6c0e
2022-01-11 18:29:24 -08:00
831c129e85 fx quant: fix test_fx_acc_tracer::test_quantized_batch_norm2d (#71175)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71175

D33330022 was landed with a Meta test failure (ghstack clobbered the fix),
resubmitting the Meta-only part to fix CI.

Test Plan:
```
buck test mode/opt //caffe2/test:test_fx_acc_tracer -- --exact 'caffe2/test:test_fx_acc_tracer - test_quantized_batch_norm2d (fx_acc.test_acc_tracer.AccTracerTest)' --run-disabled
```

Reviewed By: HDCharles

Differential Revision: D33531994

fbshipit-source-id: 39dc945c54fb9a7205c9d4114ede6b5ab99c5012
2022-01-11 17:38:00 -08:00
410e91adee Performance and memory improvements to batched torch.linalg.solve (#69752)
Summary:
Previously for single input matrix A and batched matrix B, matrix A was expanded and cloned before computing the LU decomposition and solving the linear system.

With this PR the LU decomposition is computed once for a single matrix and then expanded&cloned if required by a backend library call for the linear system solving.

Here's a basic comparison:
```python
# BEFORE THE PR
In [1]: import torch
In [2]: a = torch.randn(256, 256)
In [3]: b = torch.randn(1024, 256, 2)
In [4]: %%timeit
   ...: torch.linalg.solve(a, b)
   ...:
   ...:
329 ms ± 17.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

# WITH THIS PR
In [1]: import torch
In [2]: a = torch.randn(256, 256)
In [3]: b = torch.randn(1024, 256, 2)
In [4]: %%timeit
   ...: torch.linalg.solve(a, b)
   ...:
   ...:
21.4 ms ± 23 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
```

cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano

Pull Request resolved: https://github.com/pytorch/pytorch/pull/69752

Reviewed By: albanD

Differential Revision: D33028236

Pulled By: mruberry

fbshipit-source-id: 7a0dd443cd0ece81777c68b29438750f6524ac24
2022-01-11 16:14:16 -08:00
786f946098 [Profiler] Add glue layer to reduce the use of #ifdef USE_KINETO in the profiler code. (#69798)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69798

One of the major sources of complexity in `profiler_kineto.cpp` is that kineto may or may not be available. The code (including the types) follows two related but often distinct codepaths, and large sections may or may not be `#ifdef`'d out.

Optimizing such code while preserving correctness is quite difficult; at one point I realized that I had broken the non-Kineto case, because moving work into the finalize step ran astray of a very large `#ifdef` around the finalize logic.

In order to make optimization more tractable, I gathered all of the calls to Kineto APIs and isolated them in the `kineto_shim.h/.cpp` files: the header allows callers to pretend as though Kineto is always available (mostly), and the cpp file hides most of the horrible `#ifdef`s so they don't pollute the main profiler code.

Test Plan: Unit tests.

Reviewed By: aaronenyeshi

Differential Revision: D32690568

fbshipit-source-id: 9a276654ef0ff9d40817c2f88f95071683f150c5
2022-01-11 15:57:46 -08:00
a3b7dd7b78 Enable nested default hooks (#70932)
Summary:
When default hooks are set, they are pushed onto a stack.
When nesting context-manager, only the inner-most hooks will
be applied.

There is special care needed to update the TLS code. See also https://github.com/pytorch/pytorch/issues/70940 (i.e. do we need to be storing the enabled flag as well?)
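The stack behavior can be sketched in plain Python (a toy model; the real implementation stores the stack in autograd TLS): entering a context manager pushes its hooks, leaving it pops them, and only the inner-most pair is ever applied.

```python
import contextlib

_hook_stack = []  # thread-local state in the real implementation

@contextlib.contextmanager
def default_hooks(pack, unpack):
    _hook_stack.append((pack, unpack))  # push on enter
    try:
        yield
    finally:
        _hook_stack.pop()               # pop on exit

def current_hooks():
    # Only the inner-most hooks apply.
    return _hook_stack[-1] if _hook_stack else None
```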

Fixes https://github.com/pytorch/pytorch/issues/70134

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70932

Reviewed By: mruberry

Differential Revision: D33530370

Pulled By: albanD

fbshipit-source-id: 3197d585d77563f36c175d3949115a0776b309f4
2022-01-11 15:03:49 -08:00
433cf44b79 delete ecr_gc_docker job (#71178)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71178

This should no longer be needed as we now set a lifecycle policy on ECR
and we also don't generate lots of temporary containers anymore.

Test Plan: Imported from OSS

Reviewed By: seemethere

Differential Revision: D33537851

Pulled By: suo

fbshipit-source-id: b97b7525be6f62ec8771dfb6a7ee13b22b78ac5a
2022-01-11 14:53:31 -08:00
e7634f83ce [jit][edge] Migrate base types to DynamicType on mobile. (#70233)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70233

Make the type parser produce DynamicType for all base types that don't have type arguments, and return a DynamicType pointer from IValue::type().
ghstack-source-id: 146818622

Test Plan: no behavior change.

Reviewed By: iseeyuan

Differential Revision: D33137219

fbshipit-source-id: 1612c924f5619261ebb21359936309b41b2754f5
2022-01-11 13:53:29 -08:00
ecb6defa36 Fixed docs for forward_ad.make_dual (#71159)
Summary:
Minor docs change.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71159

Reviewed By: mruberry

Differential Revision: D33530031

Pulled By: albanD

fbshipit-source-id: e0bbe3a29a7de675fa4c9bf90976616f0e093f74
2022-01-11 13:47:09 -08:00
2c8cb8a964 Speed up quantized upsampling for channels last (#70903)
Summary:
Moving the calls to `q_zero_point()` outside the for loop considerably
speeds up upsampling for channels last format.

This fix is very similar to https://github.com/pytorch/pytorch/pull/66525 but applies it for channels last format.

Fixes https://github.com/pytorch/pytorch/issues/70902

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70903

Reviewed By: mruberry

Differential Revision: D33531805

Pulled By: vkuzo

fbshipit-source-id: e723f1e3d53bdd66529c1326dccba889402a126c
2022-01-11 13:28:10 -08:00
edf15ebbc2 Adding python 3.10 binary workflows (#71132)
Summary:
Testing python 3.10

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71132

Reviewed By: mruberry

Differential Revision: D33534609

Pulled By: atalman

fbshipit-source-id: 561412735fb6d1269fca3db0fac5afd437a0bde2
2022-01-11 13:18:18 -08:00
7d6535cab3 Make Kineto + distributed a warning rather than an error (#71120)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71120

D33283314 (681e78bace) is causing jobs to fail when profiled, which is not ideal.

Test Plan:
pyper-online-cli launch 306587531 AI_INFRA ads_global_pyper_sla oncall_model_store --training_package_version training_platform:9344fe410969bdf614bc89cff0280281 --training_stage ONLINE --training_environment DEV --timeout 1728000

(Courtesy of yanjzhou)

Reviewed By: xw285cornell

Differential Revision: D33437773

fbshipit-source-id: 5c492f83146ff82557cfc1142aade3432cf73ca5
2022-01-11 12:50:17 -08:00
45b0bafb38 Drop more unused variables (#71123)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71123

Test Plan: Sandcastle

Reviewed By: ngimel

Differential Revision: D33511656

fbshipit-source-id: b53565b589720cce9fdfe3bc222853dba8645aff
2022-01-11 12:46:24 -08:00
6c03f8d9e5 Drop unused variables and add some const (#71106)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71106

Test Plan: Sandcastle

Reviewed By: ngimel

Differential Revision: D33490855

fbshipit-source-id: 9fc4a4e4a7ad5e6c31f394ec6d8221b964fdf043
2022-01-11 12:38:59 -08:00
1c8b167327 Move implementation of empty_like for sparse COO (#71103)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71103

Previously the implementation of empty_like for sparse COO was a
conditional path in the generic implementation.
This PR makes use of the Dispatcher and moves the implementation into a
separate function.

cc nikitaved pearu cpuhrsch

Test Plan: Imported from OSS

Reviewed By: jbschlosser

Differential Revision: D33511240

Pulled By: cpuhrsch

fbshipit-source-id: 9a84f82a27e3cf0ac819d867b86df6d10ddf7fa7
2022-01-11 12:30:39 -08:00
a8612cd72a Skip failing tests in test_nn if compiled without LAPACK (#70913)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/70912

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70913

Reviewed By: mruberry

Differential Revision: D33534840

Pulled By: albanD

fbshipit-source-id: 0facf5682140ecd7a78edb34b9cd997f9319e084
2022-01-11 12:21:18 -08:00
14922a136f Revert D33480077: .github: Re-enable xla test config
Test Plan: revert-hammer

Differential Revision:
D33480077 (18e1e1d4d3)

Original commit changeset: a2e720c55d0e

Original Phabricator Diff: D33480077 (18e1e1d4d3)

fbshipit-source-id: e4e114a9a6d7940491ac0741e94f455a490f077a
2022-01-11 12:12:15 -08:00
940b89b03f Disable Python-3.6 binary builds (#71163)
Summary:
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71163

Reviewed By: anjali411

Differential Revision: D33532813

Pulled By: malfet

fbshipit-source-id: ab0833c2db187c452681a17907583599ff1cb481
2022-01-11 11:25:45 -08:00
4f35b9144c [jit][edge] Migrate ListType to DynamicType on mobile. (#70212)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70212

Use DynamicType instead of ListType all over the place in Lite Interpreter. Namely we need to modify the following places:
1. Type parser which produces the Type constants.
2. IValue::type() which returns reflected Type from IValues.
3. Helper functions to construct the container value.
4. Typechecks which test whether a type instance is a particular container type.
ghstack-source-id: 146818619

Test Plan: CI

Reviewed By: iseeyuan

Differential Revision: D33176931

fbshipit-source-id: 9144787f5fc4778538e5c665946974eb6171a2e6
2022-01-11 10:57:53 -08:00
18e1e1d4d3 .github: Re-enable xla test config (#71008)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71008

This reverts commit 6f83841582d8d818129dc4ce82a8478f221b32d7.

Test Plan: Imported from OSS

Reviewed By: janeyx99

Differential Revision: D33480077

Pulled By: seemethere

fbshipit-source-id: a2e720c55d0e1995e2b6cf2da7c801f377d52b3f
2022-01-11 10:49:20 -08:00
85c6489cdc ci: unquote env variables (#71139)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71139

These variables were being interpreted as quoted in the GITHUB_ENV
file, meaning they didn't register correctly when running the actual
binary_upload.sh, leading to binaries not actually getting uploaded.

This remedies that.

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

Test Plan: Imported from OSS

Reviewed By: malfet, b0noI

Differential Revision: D33519952

Pulled By: seemethere

fbshipit-source-id: 727f6d4e5dbdfd0a3e2c76058bee9430b2c717a9
2022-01-11 10:21:11 -08:00
cf61738097 Drop unused variables; make things const; use some auto (#71107)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71107

Test Plan: Sandcastle

Reviewed By: ngimel

Differential Revision: D33490773

fbshipit-source-id: 0d259db9c58c9b33aecc560075f6dcfa78883467
2022-01-11 08:55:54 -08:00
3c2ae2b47c Revert D32994274: [ONNX] Link to the wiki (#68505)
Test Plan: revert-hammer

Differential Revision:
D32994274 (a606ea73d6)

Original commit changeset: 34d54f935799

Original Phabricator Diff: D32994274 (a606ea73d6)

fbshipit-source-id: 81fc96c2aff9d14efb5e092fffd0685e507837e6
2022-01-11 07:40:14 -08:00
1b496cf158 Fixes doc errors in Tensor.triu(), Tensor.tril(), Tensor.ravel(). (#71057)
Summary:
Hi, PyTorch Team!
I am very much interested in starting up my contribution to PyTorch. I made several contributions in NumPy and CuPy, but this is my first PR towards PyTorch. I aim to contribute more in the upcoming future.

The PR fixes https://github.com/pytorch/pytorch/issues/70972  https://github.com/pytorch/pytorch/issues/70975.

#### Aim of PR
The functions like `Tensor.ravel`, `Tensor.tril`, `Tensor.tril_`, `Tensor.triu`, and `Tensor.triu_` had a couple of typos in docs. The PR aims to resolve that.

I'm looking forward to your viewpoints. Thanks!

cc: kshitij12345 vadimkantorov Lezcano TestSomething22

cc brianjo mruberry

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71057

Reviewed By: preeti1205

Differential Revision: D33502911

Pulled By: mruberry

fbshipit-source-id: 8ce0b68a29658a5a0be79bc807dfa7d71653532d
2022-01-11 07:34:59 -08:00
ac0d131291 Deprecating routed decoder (#70990)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70990

Releasing the `decode` API so domains can implement custom `decode` DataPipes for now.

Test Plan: Imported from OSS

Reviewed By: NivekT

Differential Revision: D33477620

Pulled By: ejguan

fbshipit-source-id: d3c30ba55c327f4849d56f42d328a932a31777ed
2022-01-11 06:56:48 -08:00
d6b7d69d8b Python3.10 migration adding to binary linux tests (#71130)
Summary:
Python3.10 migration adding to binary linux tests

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71130

Reviewed By: seemethere, janeyx99

Differential Revision: D33518787

Pulled By: atalman

fbshipit-source-id: 53c2c1b96e7a530a2af9ae7d5840bf8398b870e5
2022-01-11 05:54:07 -08:00
fb8a9732d9 [AutoAccept][Codemod][FBSourceClangFormatLinter] Daily arc lint --take CLANGFORMAT
Reviewed By: zertosh

Differential Revision: D33524330

fbshipit-source-id: 112291a23e2efe2d573bee86ead8ce2fc3957e5b
2022-01-11 04:33:21 -08:00
fdda7b5e8a [Codemod][FBSourceBlackLinter] Daily arc lint --take BLACK
Reviewed By: zertosh

Differential Revision: D33525225

fbshipit-source-id: 973eb9f9a5dfbd70bf0127f44089237969c2bb68
2022-01-11 04:20:46 -08:00
40b80aa490 [jit][edge] Migrate TupleType to DynamicType on mobile. (#70205)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70205

Use DynamicType instead of TupleType all over the place in Lite Interpreter. Namely we need to modify the following places:
1. Type parser which produces the Type constants.
2. IValue::type() which returns reflected Type from IValues.
3. Helper functions to construct the container value.
4. Typechecks which test whether a type instance is a particular container type.
ghstack-source-id: 146818620

Test Plan: CI

Reviewed By: iseeyuan

Differential Revision: D33176925

fbshipit-source-id: 00f7a5db37ba772c912643c733db6c52dfdc695d
2022-01-11 01:01:48 -08:00
5cae40c169 [pytorch][aten][cuda] move CUDAGeneratorImpl.h to ATen/cuda (#70650)
Summary:
This patch moves a CUDA-specific file, `CUDAGeneratorImpl.h` to `ATen/cuda` as the following TODO comment in  `CUDAGeneratorImpl.h` suggests:
```
// TODO: this file should be in ATen/cuda, not top level
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70650

Reviewed By: jianyuh, xw285cornell

Differential Revision: D33414890

Pulled By: shintaro-iwasaki

fbshipit-source-id: 4ff839205f4e4ea4c8767f164d583eb7072f1b8b
2022-01-10 22:27:04 -08:00
33a5905cc6 [quant] fix reduce_range warning (#71027)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71027

Fix issue #61054: remove the warning for reduce_range=True, which caused
the error message "UserWarning: Please use quant_min and quant_max to specify the range for observers".

Test Plan:
python test/test_quantization.py TestFakeQuantizeOps

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D33484341

fbshipit-source-id: 97c3d4658926183f88a0c4665451dd7f913d30e6
2022-01-10 20:05:36 -08:00
59e166feb2 [Quant][DBR] Add test for serialization (#70078)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70078

This commit adds a serialization test for DBR.

Test Plan:
python test/test_quantization.py TestQuantizeDBR.test_serialization

Imported from OSS

Reviewed By: VitalyFedyunin

Differential Revision: D33405192

fbshipit-source-id: 39c4cca49aff8b960f4dec6c272fbd0da267fa95
2022-01-10 17:50:05 -08:00
043e84b3d2 Per-overload torch.ops API (#67254)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67254

Fixes https://github.com/pytorch/pytorch/issues/65997

BC breaking:
`output = torch.ops._test.leaky_relu(self=torch.tensor(-1.0))` now fails with the error `TypeError: __call__() got multiple values for argument 'self'` since we call into `OpOverloadBundle`'s `__call__` method that has `self` bound to it as its first argument.
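The collision can be reproduced with a minimal stand-in class (a toy sketch, not the real `OpOverloadBundle`): because the instance is already bound to `__call__`'s first parameter `self`, passing `self` again as a keyword raises the `TypeError` described above.

```python
class OpOverloadBundle:
    # Toy stand-in: the bound instance occupies the `self` parameter.
    def __call__(self, x=None):
        return x

op = OpOverloadBundle()
# op(1) works; op(self=1) collides with the bound instance and raises
# TypeError: __call__() got multiple values for argument 'self'.
```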

Follow up work:
1. disallow `default` as an overload name for aten operators.
2. Add a method to obtain a list of all overloads (exclude the ones registered by JIT)
3. Add methods/properties to `OpOverload` to access more schema information (types of input and output args etc)

cc ezyang gchanan

Test Plan: Imported from OSS

Reviewed By: pbelevich

Differential Revision: D33469839

Pulled By: anjali411

fbshipit-source-id: c3fc43460f1c7c9651c64b4d46337be21c400621
2022-01-10 17:29:06 -08:00
b12ca69179 [jit][edge] Migrate DictType to DynamicType on mobile. (#70202)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70202

Use DynamicType instead of DictType all over the place in Lite Interpreter. Namely we need to modify the following places:
1. Type parser which produces the Type constants.
2. IValue::type() which returns reflected Type from IValues.
3. Helper functions to construct the container value.
4. Typechecks which test whether a type instance is a particular container type.
ghstack-source-id: 146735648

Test Plan: no behavior change.

Reviewed By: iseeyuan

Differential Revision: D33137257

fbshipit-source-id: 971bf431658c422ea9353cc32cdab66e98876e9d
2022-01-10 15:55:29 -08:00
a606ea73d6 [ONNX] Link to the wiki (#68505) (#69544)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69544

Test Plan: Imported from OSS

Reviewed By: malfet

Differential Revision: D32994274

Pulled By: msaroufim

fbshipit-source-id: 34d54f935799fa94516a541a241900ec205c7427

Co-authored-by: Gary Miguel <garymiguel@microsoft.com>
2022-01-10 15:51:04 -08:00
7397683b57 Add forward AD formulas for mv, scatter_add, _s_where (#70468)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70468

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33405364

Pulled By: soulitzer

fbshipit-source-id: 7681c33fb264a7a3ec6436ebb7c5bb07cd5ffc3d
2022-01-10 13:54:10 -08:00
78994d13c0 Add forward AD formulas for {batch,layer,group}_norm (#70355)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70355

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33405362

Pulled By: soulitzer

fbshipit-source-id: 55a92e88a04e7b15a0a223025d66c14f7db2a190
2022-01-10 13:52:16 -08:00
7a08030903 Fix fx2trt CI test trigger condition (#71014)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71014

Replace test trigger with test_config matching.

Test Plan:
CI
https://github.com/pytorch/pytorch/runs/4746717568?check_suite_focus=true

Reviewed By: janeyx99

Differential Revision: D33480971

fbshipit-source-id: 9513e464753343a7ae47fcfaf48119f34bae94c5
2022-01-10 13:37:24 -08:00
80659b71a5 Hoisting common expressions out of If blocks [retry] (#65645)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65645

This is a retry of PR: https://github.com/pytorch/pytorch/pull/59492

Latest Changes: Added more tests, added the getOrCreateDB pattern, updated logic to remove unnecessary checks,
and addressed all comments.

Adding code to find common expressions from the two subblocks of an if
operation and hoist them before the if block.
This also allows Dead Code Elimination to
then eliminate some if blocks.
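The core idea can be sketched in plain Python (a toy over lists of statement strings; the real pass works on JIT IR and must also check side effects and aliasing): statements shared by both branches are moved in front of the `if`.

```python
def hoist_common(true_block, false_block):
    # Move leading statements that both branches share out of the if.
    hoisted = []
    while true_block and false_block and true_block[0] == false_block[0]:
        hoisted.append(true_block.pop(0))
        false_block.pop(0)
    return hoisted, true_block, false_block
```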

Test Plan: python test_jit.py TestIfHoisting

Reviewed By: eellison

Differential Revision: D33302065

Pulled By: Gamrix

fbshipit-source-id: a5a184a480cf07354359aaca344c6e27b687a3c2
2022-01-10 13:28:17 -08:00
569aeec1bc fix typo in debugging_hooks.py (#70956)
Summary:
I just fixed a small typo in the debugging_hooks documentation

cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse SciPioneer H-Huang

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70956

Reviewed By: jbschlosser

Differential Revision: D33508898

Pulled By: dagitses

fbshipit-source-id: fc5935e5a2e2ddc45657a22d3b33a11aba378d9b
2022-01-10 12:59:42 -08:00
49ed097ebe Add documentation for lowering (#71116)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71116

As title, add more inline documentation for code.

Test Plan:
no
pingpoke

Reviewed By: 842974287

Differential Revision: D33465611

fbshipit-source-id: 6b5529893098e5591470c2f41a0d8989e3cfccb9
2022-01-10 12:56:59 -08:00
3fbff80bea ci: Move MAX_JOBS to not set on Darwin (#71122)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71122

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33515392

Pulled By: seemethere

fbshipit-source-id: 376608c9a6e2e685a07d5010ce443a3f02475ee5
2022-01-10 12:49:14 -08:00
cfc1117591 Update sparse.rst to warn about _values() (#71088)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/70357

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71088

Reviewed By: jbschlosser

Differential Revision: D33511207

Pulled By: cpuhrsch

fbshipit-source-id: 9d0c5445842ed96999eb88445cbea7ae284b1a6f
2022-01-10 12:43:46 -08:00
30699cbfd5 Reland D33284352: [jit][edge] Do not reuse mobile type parser for all unpicklers. (#71048)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71048

reland D33284352 (0a921ba0d0)
ghstack-source-id: 146735646

Test Plan: All Github CI: ciflow rerun -l ciflow/all

Reviewed By: gmagogsfm

Differential Revision: D33489731

fbshipit-source-id: 3e160209a1abb193ad3eed3018054aa7d331025e
2022-01-10 12:42:23 -08:00
fb66f561b1 Add copy out to the fallback path in SR invocation of composed op (#70871)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70871

We had previously handled reusing memory in the optimized kernel execution path, but had not yet handled it when hitting the unoptimized fallback.

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D33458652

Pulled By: eellison

fbshipit-source-id: 4eb62181ed02c95813a99638f5e2d0f9347b5c08
2022-01-10 12:16:38 -08:00
c8332256ee [JIT] Refactor SR invocation of fusion (#70508)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70508

We can create the code object at compile time instead of at runtime to speed it up. This also makes the compilation cache unnecessary. TODO: figure out if there's a way to cache the InterpreterState object
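The compile-once idea maps onto Python's builtin `compile`/`eval` as a rough analogy (not the JIT's actual mechanism): building the code object up front means each invocation only executes, never re-parses.

```python
src = "sum(i * i for i in range(100))"

# Built once, up front — analogous to creating the code object
# at compile time rather than on every run.
code = compile(src, "<expr>", "eval")

def run_many(n):
    # Each call reuses the precompiled code object instead of
    # re-parsing `src`.
    return [eval(code) for _ in range(n)]
```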

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D33458648

Pulled By: eellison

fbshipit-source-id: 710389741e7c6210528f2f96ab496fcd533d942a
2022-01-10 12:16:35 -08:00
0adc7cc546 Inline Fallback Functions For Debugging (#70463)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70463

Fix for https://github.com/pytorch/pytorch/issues/52940

When we call inlining on a fallback function, insert the runtime optimized version of its graph.

Test Plan: Imported from OSS

Reviewed By: jbschlosser, davidberard98

Differential Revision: D33458651

Pulled By: eellison

fbshipit-source-id: fd7e5e2b5273a1677014ba1a766538c3ee9cad76
2022-01-10 12:15:11 -08:00
840459a269 [ONNX] Relax constant_fold gather with indices rank > 1 (#68140) (#68493)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68493

Fixes #66786.

`index_select` only supports `index` of 1-D tensor. `ONNX::Gather` allows `index` to have rank `q`. Abort constant folding `ONNX::Gather` if `index` rank is larger than 1.

Test Plan: Imported from OSS

Reviewed By: jansel

Differential Revision: D32483826

Pulled By: msaroufim

fbshipit-source-id: a8e8389d85287a859d32abf8d8d98852290b0a03

Co-authored-by: BowenBao <bowbao@microsoft.com>
2022-01-10 11:55:02 -08:00
4b47047dae [ONNX] Add support for shrink ops (#66969) (#68492)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68492

* Initial commit

* Fix flake issue

* Add test tags

Test Plan: Imported from OSS

Reviewed By: jansel

Differential Revision: D32483827

Pulled By: msaroufim

fbshipit-source-id: 41c623712524465b877d0fe0e2f4001d475bf2ce
2022-01-10 11:38:31 -08:00
62441157e3 Have getFilesToLevels return a reference (#71047)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71047

The copy induced by getFilesToLevels is currently consuming 3,457,470,000 cycles per day. A reference might fix that.

Reference:
```
["Inline torch::jit::JitLoggingConfig::getFilesToLevels[abi:cxx11] @ caffe2/torch/csrc/jit/jit_log.cpp:54"]
```

Test Plan: Sandcastle

Reviewed By: ngimel

Differential Revision: D33479180

fbshipit-source-id: 05d306ad9ea23e2f30348a08d547ebe274eb0c10
2022-01-10 11:32:32 -08:00
87484d67e3 .github: Enable linux binary builds (#68388)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68388

Updates the GPU architectures and adds an on_pull_request trigger for
the binary build workflows so that we can iterate on this later.

TODO:
* Create follow up PR to enable nightly linux GHA builds / disable CircleCI nightly linux builds

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

Test Plan: Imported from OSS

Reviewed By: janeyx99

Differential Revision: D33462294

Pulled By: seemethere

fbshipit-source-id: 5fa30517550d36f504b491cf6c1e5c9da56d8191
2022-01-10 11:30:45 -08:00
e9a8bb59b4 Move the apply_tensor_props into its own function for more public use (#67786)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/67786

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D32175962

Pulled By: Gamrix

fbshipit-source-id: caefe1c849277632d976a6b5513f72b47595f2c0
2022-01-10 11:26:03 -08:00
3ef10da97d add support for pickle v4 (#70642)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70642

Review history on https://github.com/pytorch/pytorch/pull/70014
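As an illustration of what the protocol bump means on the wire, here is a plain-`pickle` sketch (standard library only, independent of torch.package):

```python
import pickle

data = {"weights": [1.0, 2.0], "name": "layer1"}
blob = pickle.dumps(data, protocol=4)

# Protocol >= 2 payloads start with the PROTO opcode (0x80) followed
# by the protocol number, so the version is recoverable from the bytes.
assert blob[0] == 0x80 and blob[1] == 4
assert pickle.loads(blob) == data
```

Protocol 4 (Python 3.4+) adds framing and support for very large objects, which is why deserializers must explicitly recognize it.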

Test Plan: Imported from OSS

Reviewed By: suo

Differential Revision: D33414364

Pulled By: PaliC

fbshipit-source-id: 7e7ed491c6f16d4fac3a03f7e403935823c03aa6
2022-01-10 11:13:41 -08:00
118bd82dde detect mocked module on saving pass (#70641)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70641

Raises a NotImplementedError if we attempt to pickle an object that uses a mocked module. Now we no longer have to load the object to get this check; instead it happens right on the saving path.
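A rough sketch of the save-path check using a plain-`pickle` subclass (the module set, class names, and detection logic here are hypothetical, not torch.package's actual implementation):

```python
import io
import pickle

MOCKED_MODULES = {"fake_lib"}  # hypothetical set of mocked module roots

class StrictPickler(pickle.Pickler):
    def reducer_override(self, obj):
        # Called before the normal dispatch; refuse objects whose class
        # comes from a mocked module, so the failure surfaces at save time.
        mod = getattr(type(obj), "__module__", "") or ""
        if mod.split(".")[0] in MOCKED_MODULES:
            raise NotImplementedError(
                f"cannot pickle object from mocked module {mod!r}"
            )
        return NotImplemented  # defer to the normal pickling machinery

def strict_dumps(obj):
    buf = io.BytesIO()
    StrictPickler(buf).dump(obj)
    return buf.getvalue()
```

Ordinary objects pickle normally; an object whose class claims to come from a mocked module raises immediately during `dump`, rather than at load time.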

Review history is on https://github.com/pytorch/pytorch/pull/69793. The PR was moved to a different branch because the original branch became corrupted.

Test Plan: Imported from OSS

Reviewed By: suo

Differential Revision: D33414365

Pulled By: PaliC

fbshipit-source-id: 6d72ddb05c47a3d060e9622ec0b6e5cd6c6c71c8
2022-01-10 11:11:55 -08:00
c4400fc431 Retire repeat_test_for_types (#71033)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/69865

cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse SciPioneer H-Huang

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71033

Reviewed By: mruberry

Differential Revision: D33486370

Pulled By: janeyx99

fbshipit-source-id: 71f9383dbc1e00b572f26eb4f04d0a94c6759e35
2022-01-10 09:13:54 -08:00
e1b84e1b6b fix loading of older models that don't have maximize (#71023)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71023

Reviewed By: jbschlosser

Differential Revision: D33483687

Pulled By: albanD

fbshipit-source-id: 2f3c6e97a9579be9ba15eca0756fc1f2c466fbb6
2022-01-10 06:01:24 -08:00
b27dfa70c4 caffe2: disable TensorImpl static_assert (temporary)
Test Plan: buck2 build -c cxx.modules=false -c fbcode.platform=platform010 fbcode//caffe2/aten:ATen-cu

Reviewed By: singhsrb, meyering

Differential Revision: D33501636

fbshipit-source-id: a1a5bbb2b160eba8eb5abba4f6ae1929a58e11e9
2022-01-09 23:11:17 -08:00
fca8a0acaa Prevent import race condition that leaves torch.package.PackagePickler with unwanted dispatch table entries. (#71025)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71025

TL;DR In some cases:
1) user imports `dill`, which mutates `_Pickler.dispatch`,
2) user imports lib that imports `torch.package`
3) `PackagePickler.dispatch = _Pickler.dispatch.copy()` makes a copy of the mutated table
4) user calls `dill.extend(use_dill=False)` to reset `_Pickler.dispatch`, expecting everything to be okay
5) `PackagePickler` is used to pickle something like `ModuleDict`. `PackagePickler.dispatch` has stale entries to dill pickle functions like `save_module_dict`, which sometimes hard-code calls to `StockPickler.save_global`, which is unaware of torch.package module prefixes.
6) Exception is raised, e.g. `Got unhandled exception Can't pickle <class '<torch_package_2>.caffe2.mylib'>: it's not found as <class '<torch_package_2>.caffe2.mylib'>`
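The staleness in steps 3-4 is just dictionary-copy semantics, which a few lines can demonstrate (the dicts here stand in for the real dispatch tables):

```python
shared = {}                              # stands in for pickle._Pickler.dispatch
shared[dict] = "dill_save_module_dict"   # (1) dill mutates the shared table
snapshot = shared.copy()                 # (3) PackagePickler copies it at import
del shared[dict]                         # (4) dill.extend(use_dill=False) resets

assert dict not in shared    # the original table is clean again...
assert dict in snapshot      # ...but the copy still holds the stale entry
```

Because the copy is taken once at import time, no later reset of the original table can repair it, which is exactly why the import order matters.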

Differential Revision: D33483672

fbshipit-source-id: d7cd2a925bedf27c02524a6a4c3132a262f5c984
2022-01-09 15:13:39 -08:00
2bed616e0f [Dist tests] Make event_listener work for all dist tests (#70628)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70628

The event_listener thread is used to log process tracebacks when a timed-out
process sends it a request to get its traceback. However, this thread is
created in the `_run` function, which is overridden by some classes such as
`TestDistBackend`, so those tests did not have this feature. Move the
event_listener setup logic to `run_test`, which is called by all distributed
test classes, enabling it for all distributed tests. Also modify the logger
setup to ensure that logging.info calls are printed in the subprocess.
ghstack-source-id: 146714642

Test Plan: CI

Reviewed By: jaceyca, fduwjj

Differential Revision: D33410613

fbshipit-source-id: aa616d69d251bc9d04e45781c501d2244f011843
2022-01-09 14:54:09 -08:00
9267fd8d73 [WIP] [ATen] Add native_multi_attention_self_attention CPU + GPU implementation (#70649)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70649

As described in https://fb.quip.com/oxpiA1uDBjgP

This implements the first parts of the RFC, and is a rough draft showing the approach. The idea is that for the first cut we can maintain very close (identical I believe in this diff) numerical equivalence to the existing nn.MHA implementation, which is what this diff attempts to do. In subsequent implementations, once we have a working and adopted native self-attention implementation, we could then explore alternative implementations, etc.

The current implementation is similar to existing dedicated implementations such as LightSeq/FasterTransformer/DeepSpeed, and for MHA on both CPUs and GPUs is between 1.2x and 2x faster depending on the setting. It makes some approximations/restrictions (doesn't handle masking in masked softmax, etc), but these shouldn't materially impact performance.
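For reference, the core computation that both the nn.MHA path and any native kernel must agree on is scaled dot-product attention; a dependency-free sketch (plain lists, single head, no masking, purely illustrative):

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention over lists of row vectors:
    softmax(Q K^T / sqrt(d)) V, computed one query row at a time."""
    d = len(K[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in K]
        w = softmax(scores)
        out.append([sum(wi * v[j] for wi, v in zip(w, V))
                    for j in range(len(V[0]))])
    return out
```

When all keys are identical the attention weights are uniform, so the output is the plain average of the value rows, which is a handy sanity check for any implementation.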

This does the first few items:

* add native_multi_head_attention(...) , native_multi_head_attention_backward(..) to native_functions.yaml
* Implement native_multi_head_attention(..) on GPU, extracting bits and pieces out of LS/DS/FT as appropriate
* Implement native_multi_head_attention(..) on CPU

The backward implementation is still WIP, but the idea would be to:

* Hook these up in derivatives.yaml
* Implement native_multi_head_attention_backward(..) on GPU, extracting bits and pieces out of LS/DS (not FT since it’s inference only)
* Implement native_multi_head_attention_backward(..) on CPU
* In torch.nn.functional.multi_head_attention_forward 23321ba7a3/torch/nn/functional.py (L4953), add some conditionals to check if we are being called in a BERT/ViT-style encoder fashion, and invoke the native function directly.

Test Plan: TODO

Reviewed By: mikekgfb

Differential Revision: D31829981

fbshipit-source-id: c430344d91ba7a5fbee3138e50b3e62efbb33d96
2022-01-08 21:50:41 -08:00
785b6905de reduce plan generation log spam (#70880)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70880

Change loglevel to `debug` in caffe2 `optimizer.py` for logging rowwise Adagrad engine.

Test Plan: CI + sandcastle

Reviewed By: boryiingsu

Differential Revision: D33439337

fbshipit-source-id: b158249b8df771c0ec8b642210ede39972929b00
2022-01-08 10:07:06 -08:00
49a07c8922 Suppress some unused variable warnings in Sorting.cu and TensorTopK.cu (#70999)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70999

Test Plan: Sandcastle

Reviewed By: malfet

Differential Revision: D33470240

fbshipit-source-id: 906932cb5f497c77465b70ec9bc6fcb0705719de
2022-01-08 00:41:58 -08:00
d1e049c306 Fix some unused variable warnings and make some stuff const in ReplicationPadding.cu (#70998)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70998

Test Plan: Sandcastle

Reviewed By: malfet

Differential Revision: D33460035

fbshipit-source-id: bdf70fd04cce40a2a8d60c2c405f8d6cee9127e5
2022-01-08 00:40:51 -08:00
11aa1961c1 Use (void)error_unused to avoid unused warning (#71000)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71000

Test Plan: Sandcastle

Reviewed By: ngimel

Differential Revision: D33470600

fbshipit-source-id: 868a6ee33a04846bd1efbe06ab306fbaad3bf9db
2022-01-07 23:39:30 -08:00
704af23ee4 Use a reference in GetSingleArgument (#71007)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71007

A string copy at Line 417 is currently consuming 125,749,287,000 cycles/day. I suspect the issue is with a copy-on-return, but we can experiment with introducing a reference in the middle to see if that produces a good savings without changing the interface.

Reference
```
["Inline caffe2::ArgumentHelper::GetSingleArgument @ caffe2/caffe2/utils/proto_utils.cc:417"]
```

Test Plan: Sandcastle

Reviewed By: xw285cornell

Differential Revision: D33478883

fbshipit-source-id: e863e359c0c718fcd0d52fd4b3c7858067de0670
2022-01-07 20:18:56 -08:00
9762aa0fdc Revert D33284352: [jit][edge] Do not reuse mobile type parser for all unpicklers.
Test Plan: revert-hammer

Differential Revision:
D33284352 (0a921ba0d0)

Original commit changeset: 997c4f110b36

Original Phabricator Diff: D33284352 (0a921ba0d0)

fbshipit-source-id: af316727442a64f1ae40d53d7a9d26ec550d634e
2022-01-07 19:58:03 -08:00
f626bef598 Fix docstring for nn.Hardshrink (#71012)
Summary:
Fixes nn.Hardshrink's docstring problem reported at https://github.com/pytorch/pytorch/issues/70498.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71012

Reviewed By: dagitses

Differential Revision: D33482333

Pulled By: jbschlosser

fbshipit-source-id: 00eea76299676fc97c5cc31421af9c73665bfcf4
2022-01-07 18:56:47 -08:00
0a921ba0d0 [jit][edge] Do not reuse mobile type parser for all unpicklers. (#70338)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70338

Today Unpickler is used by both server and mobile for deserializing models, and it always falls back to the mobile parser when no type resolver is provided by the user. However, this is not intended, as the server and mobile type parsers support different things. In this diff we provide a default fallback using the script parser and opt out of it for all mobile cases.
ghstack-source-id: 146727330

(Note: this ignores all push blocking failures!)

Test Plan: CI

Reviewed By: iseeyuan

Differential Revision: D33284352

fbshipit-source-id: 997c4f110b36eee6596e8f23f6a87bf91a4197ed
2022-01-07 18:35:32 -08:00
3f3eae6737 [jit] Split Tensor type implementations to separate file. (#70121)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70121

Move all TensorType dependencies into a separate `tensor_type.cpp`, so that we don't accidentally link it into the minimal runtime.
ghstack-source-id: 146727331

(Note: this ignores all push blocking failures!)

Test Plan: no behavior change.

Reviewed By: gmagogsfm

Differential Revision: D33102286

fbshipit-source-id: e9fe176201bd2696cb8c65c670fcf225e81e8908
2022-01-07 18:35:29 -08:00
53b9c0f12d [jit] Polymorphic IValue::type() for DynamicType. (#70120)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70120

Before the change:
```
c10::Type t = ivalue.type();
```
After the change:
```
c10::Type t = ivalue.type();
c10::DynamicType d = ivalue.type<c10::DynamicType>(); // new path
```
The new path will be adopted in PyTorch Lite Interpreter to support lightweight type reflection. Note that type getters are selected at compile time, so there is no performance overhead. The benefits of having a DynamicType will be elaborated in a separate document, but in short, DynamicType provides an isolated type system for controlling binary size bloat, and shrinks the ~20 supported Type symbols down to one, so that the size taken by specializations and function name symbols is greatly reduced.

Lite Interpreter should only use the `<DynamicType>` variant of the interfaces from aten, to reduce binary size.
ghstack-source-id: 146727334

(Note: this ignores all push blocking failures!)

Test Plan: CI

Reviewed By: gmagogsfm

Differential Revision: D33102276

fbshipit-source-id: c5354e7d88f9de260c9b02636214b40fe15f8a10
2022-01-07 18:35:26 -08:00
62909facb3 [jit] Decouple ivalue.h from jit_type.h (#70119)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70119

JIT type and IValue have a mutual dependency today for various reasons. Things are made worse by `jit_type.h` and `ivalue.h` mutually including each other, causing nondeterministic name resolution in different translation units and preventing us from safely using symbols from `jit_type.h` in `ivalue.h`. This diff doesn't address the mutual dependency between JIT type and IValue at the linking level, but at the header level.

We choose to remove the include of `ivalue.h` from `jit_type.h` because it's much harder to make a type-free header for IValue. We achieve this by removing EnumType (the only JIT type depending on IValue) from `jit_type.h`, and letting downstream users include an explicit `enum_type.h` as needed. We also move some IValue inline member function definitions back to `ivalue_inl.h` so that `jit_type.h` doesn't need the IValue definition to be present.
We also remove a seemingly accidental include of `jit_type.h` from `ATen/core/List_inl.h` so that `ivalue.h` can include `jit_type.h` directly; otherwise, due to another mutual inclusion between `ivalue.h` and `List_inl.h`, we can still get nondeterministic behavior.
ghstack-source-id: 146727333

(Note: this ignores all push blocking failures!)

Test Plan: no behavior change.

Reviewed By: gmagogsfm

Differential Revision: D33155792

fbshipit-source-id: d39d24688004c2ec16c50dbfdeedb7b55f71cd36
2022-01-07 18:34:17 -08:00
0eb2fc608c [fx_acc] ensure all acc ops args to be keyword arguments (#70952)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70952

ATT

Test Plan: test_plan_no

Reviewed By: wushirong

Differential Revision: D33456343

fbshipit-source-id: 26b0c1042de6072ff8741617dd3523edc4a9b5fd
2022-01-07 17:53:36 -08:00
0cd474b2ce fix op not scriptable
Summary: Fix torch.sort, min/max, and torch.numel not being scriptable after quantization

Test Plan: python3 test/test_quantization.py TestQuantizeFxOps.test_general_shape_ops

Reviewed By: jerryzh168

Differential Revision: D33467184

Pulled By: terrychenism

fbshipit-source-id: 13775ab36d4007978df48c9af71d83398fce5161
2022-01-07 16:55:28 -08:00
d26e5ced72 Add missing docstrings for ONNX converter API. Fixes #67393 (#67640) (#68489)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/68489

Test Plan: Imported from OSS

Reviewed By: jansel

Differential Revision: D32483783

Pulled By: msaroufim

fbshipit-source-id: 512e4495040a6a9833d501de2301f1709b0352b9
2022-01-07 16:43:09 -08:00
c59c86706e [quant] Add back README.md for backend_config (#70964)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70964

Accidentally deleted before; adding it back. We'll make this more complete after
the structure is finalized.

Test Plan:
no test needed

Imported from OSS

Reviewed By: dagitses

Differential Revision: D33470738

fbshipit-source-id: 00459a4b00514d3d0346de68788fab4cad8a5d12
2022-01-07 15:44:51 -08:00
00e5610914 FX quant: allow duplicate named_modules during fbgemm lowering (#70927)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70927

Earlier, when replacing deq-ref-quant modules, we got only non-duplicate
named modules. When the model contains duplicate names, the lowering fails the
second time.
This PR allows duplicates when getting the named modules.

Test Plan: buck test //caffe2/torch/fb/model_transform/quantization/tests:fx_quant_api_test

Reviewed By: jerryzh168

Differential Revision: D33440028

fbshipit-source-id: f2fabd49a293beb90d7b4bf471610cde6279fd66
2022-01-07 15:43:31 -08:00
ad88354e25 torch.futures doc formatting (#70630)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70630

Params is incorrectly formatted [here](https://pytorch.org/docs/master/futures.html?highlight=future#:~:text=way%20as%20then().-,Parameters,-callback%20(Future)%20%E2%80%93%20a):

![image](https://user-images.githubusercontent.com/14858254/148119877-6c719851-4edd-4126-8ef7-e6c1920304cf.png)

Updated docs:

https://docs-preview.pytorch.org/70630/futures.html?highlight=future#:~:text=way%20as%20then().-,Parameters,-callback%20(Future)%20%E2%80%93%20a

cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse SciPioneer H-Huang

Test Plan: Imported from OSS

Reviewed By: dagitses, mrshenli

Differential Revision: D33478214

Pulled By: H-Huang

fbshipit-source-id: 8cd7022ae79a8e6fe8b5fa8b767c55903c9ac368
2022-01-07 15:22:22 -08:00
d583eca8c3 Add workflow to sync fbsync->master (#71013)
Summary:
Main logic of the workflow is implemented in `syncbranches.py` script,
which computes patch-id's of divergent history (as determined by `git
merge-base`) and treats all patches present in sync branch with
non-matching patch-ids as ones missing in target branch
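The patch-id idea (from `git patch-id`) can be sketched as hashing a diff with its volatile parts stripped, so the same change cherry-picked onto another branch hashes identically. A simplified stand-in, not the script's actual code:

```python
import hashlib

def patch_id(diff_text):
    """Hash a diff while ignoring blob hashes ("index ..." lines) and
    hunk offsets ("@@ ..." lines), so the same change applied at a
    different location or on a different base gets the same id."""
    kept = [line for line in diff_text.splitlines()
            if not line.startswith(("index ", "@@"))]
    return hashlib.sha1("\n".join(kept).encode()).hexdigest()
```

Two diffs that differ only in blob hashes and hunk offsets then compare equal, while any change to the actual patch content changes the id.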

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71013

Reviewed By: bigfootjon

Differential Revision: D33480885

Pulled By: malfet

fbshipit-source-id: bd72c061720d0cba49c6754ec4e94437d8a5c262
2022-01-07 15:09:23 -08:00
d7db5fb462 ctc loss no batch dim support (#70092)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70092

Test Plan: Imported from OSS

Reviewed By: jbschlosser

Differential Revision: D33280068

Pulled By: george-qi

fbshipit-source-id: 3278fb2d745a396fe27c00fb5f40df0e7f584f81
2022-01-07 14:33:22 -08:00
9032d73f3b Disable cpp tests in multigpu job (#71015)
Summary:
See if this fixes the timeouts described in https://github.com/pytorch/pytorch/issues/70015

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71015

Reviewed By: dagitses

Differential Revision: D33483762

Pulled By: suo

fbshipit-source-id: 09bf93e73669a1211b200b4b272bfaa0d78a21d2
2022-01-07 14:32:01 -08:00
0721fc6474 Decouple MapDataPipe from Dataset (#70991)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70991

Test Plan: Imported from OSS

Reviewed By: dagitses

Differential Revision: D33477680

Pulled By: ejguan

fbshipit-source-id: d3e89492e921a96791319f35052a229684ddf7cf
2022-01-07 14:28:41 -08:00
3febe0d986 Remove backward op for 3d depthwise convolution (#70462)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70462

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33340495

Pulled By: jbschlosser

fbshipit-source-id: a180951680aef8fb123463af098582ef6cf9bbdb
2022-01-07 14:24:34 -08:00
704fbc29ae Remove backward op for 2d depthwise convolution (#70461)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70461

Test Plan: Imported from OSS

Reviewed By: bdhirsh

Differential Revision: D33340494

Pulled By: jbschlosser

fbshipit-source-id: f2d8b2fcf9ad0f42b644b1dba51a694d83975566
2022-01-07 14:23:15 -08:00
a70297e7cb NNAPI: quant logistic fix (#70847)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70847

NNAPI needs a fixed zero point and scale for sigmoid (logistic)
ghstack-source-id: 146555935

Test Plan: LIBNEURALNETWORKS_PATH="/path/to/libneuralnetworks.so" pytest test/test_nnapi.py

Reviewed By: dreiss

Differential Revision: D33237918

fbshipit-source-id: 05ef3a81bf1589ad44b599a19bce4066531c432b
2022-01-07 13:36:33 -08:00
ed50a35cf8 [Model Averaging] Update the documentation of PeriodicModelAverager (#70974)
Summary:
Here 20 is a bad example, since the warmup step is set to 100; 200 iterations makes much more sense.

cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse SciPioneer H-Huang

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70974

Reviewed By: dagitses

Differential Revision: D33474576

Pulled By: rohan-varma

fbshipit-source-id: 4c7043108897848bde9503d77999971ad5567aa6
2022-01-07 13:20:42 -08:00
c8b897333c [rnn/gru] no batch dim (#70977)
Summary:
Reference https://github.com/pytorch/pytorch/issues/60585

Reland: https://github.com/pytorch/pytorch/pull/70442

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70977

Reviewed By: dagitses, george-qi

Differential Revision: D33477256

Pulled By: jbschlosser

fbshipit-source-id: 2035c2d00b2f627c7046fd9b13c71b9360cd6fad
2022-01-07 13:14:41 -08:00
338eb1b2b3 [LTC] Export torch::lazy::GetBackendDevice() (#70963)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70963

This commit exports torch::lazy::GetBackendDevice().

Test Plan: CI in the lazy_tensor_staging branch.

Reviewed By: wconstab

Differential Revision: D33468938

Pulled By: alanwaketan

fbshipit-source-id: f65599c9238bf6b4f4ffbd5194befdc267272831
2022-01-07 13:13:18 -08:00
0a002f879e Actually clean on clean workspace, including hidden files (#71018)
Summary:
The workspace should be totally empty before checking out PyTorch; this
is especially important with non-ephemeral runners.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71018

Reviewed By: robieta

Differential Revision: D33482985

Pulled By: suo

fbshipit-source-id: cafa123d2b893bfbdad62295586b5b79f1542b3a
2022-01-07 13:04:54 -08:00
bc026c0577 [jit] Split Union type and Optional type to separate impl file. (#69483)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69483

To avoid accidental linking to Union type and Optional type in Edge runtimes, we can separate these types into different files, so that we don't accidentally link with them in type.cpp.
ghstack-source-id: 146670525

Test Plan: just code move.

Reviewed By: ejguan

Differential Revision: D32264607

fbshipit-source-id: c60b6246f21f3eb0a67f827a9782f70ce5200da7
2022-01-07 11:23:15 -08:00
1011ac188f [jit][edge] Create DynamicType for OptionalType in mobile. (#68137)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68137

A small step to replace existing OptionalType usage to DynamicType in Edge runtime.
ghstack-source-id: 146670520

Test Plan: CI

Reviewed By: iseeyuan

Differential Revision: D32264617

fbshipit-source-id: 62d3ffad40901842deac19ca2098ea5ca132e718
2022-01-07 11:23:12 -08:00
0517e719ac [jit] Add conformance test for DynamicType with server JIT types. (#69482)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69482

Add a test to enumerate a number of JIT type combinations and see if their subtyping behavior is preserved in the new DynamicType system.
ghstack-source-id: 146670526

Test Plan: buck test mode/opt //caffe2/test/cpp/jit:jit -- --exact 'caffe2/test/cpp/jit:jit - LiteInterpreterTest.DynamicType'

Reviewed By: gmagogsfm

Differential Revision: D32891263

fbshipit-source-id: 728211b39778e93db011b69b0a4047df78a8fc5b
2022-01-07 11:23:09 -08:00
649dda9fee [jit] Implement DynamicType for TorchScript runtime. (#68136)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68136

DynamicType is an extension to existing server JIT types. Today using normal server types on Edge is a bit problematic because in embedded environments we don't need the full spectrum of types but we still build with these unneeded dependencies.

Is it possible to just get rid of unneeded JIT types from Edge builds? It's not easy to do so at this moment. For example, on Edge we don't support Union type, but we have to pull in the dependency of Union type because Optional type is being supported which inherits from Union type, so Union type has to be included in the build. Although we could split Union type and Optional type, it could be argued that the root cause is every time we use anything inheriting from `c10::Type`, we don't have the direct evidence of how much dependency we pull in, because we do virtual calls and we don't know what exactly we're calling with server JIT types. If we don't know, it's highly possible that the linker doesn't know either so it cannot effectively strip unused methods.

To address this problem, one option is to implement a separate `DynamicType` which has simpler behavior and doesn't store different types as different symbols in binary but rather raw data (or "tag"). This could increase the binary size by several KBs, so I included several binary size reductions in the same stack, hoping at least we don't regress the binary size.

Currently `DynamicType` inherits from `c10::Type` because I want to reduce the migration cost of `DynamicType` by making it interfacing with existing server JIT types. In the future `DynamicType` should be implemented as a separate class without relying on `c10::Type` to make things both simpler and leaner.
ghstack-source-id: 146670522

Test Plan: in the next diff.

Reviewed By: VitalyFedyunin

Differential Revision: D32264615

fbshipit-source-id: 180eb0998a14eacc1d8b28db39870d84fcc17d5b
2022-01-07 11:23:07 -08:00
0408449244 [jit] Reclaim some binary size. (#68038)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68038

Replace `const std::function&` with `c10::function_ref`, because the former uses type erasure, adds 5-10 KB of size overhead, and adds another level of indirection to calls of the underlying functions. In contrast, a non-owning `c10::function_ref` compiles down to a raw function pointer, which should be much smaller.
ghstack-source-id: 146670523

Test Plan: eyes

Reviewed By: iseeyuan, mrshenli

Differential Revision: D32264619

fbshipit-source-id: 558538fd882b8e1f4e72c4fd5e9d36d05f301e1e
2022-01-07 11:21:46 -08:00
dd1121435b SequentialLR update _last_lr on step (#70558)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/68956.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70558

Reviewed By: dagitses

Differential Revision: D33430213

Pulled By: albanD

fbshipit-source-id: 446f182610de32db224d55b244d76c3076e8080f
2022-01-07 10:36:35 -08:00
195181d4df Revert "add very dumb retry to ecr gc"
This reverts commit 22f528043342ea06d00835616e8447e2b8c94adb.
2022-01-07 10:29:13 -08:00
c6e727d05b Fix adamw formula doc (#68587)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/68482

Pull Request resolved: https://github.com/pytorch/pytorch/pull/68587

Reviewed By: dagitses, jbschlosser

Differential Revision: D33478646

Pulled By: albanD

fbshipit-source-id: 4e6419829c3faa7449c041e7d467a6dab30fe917
2022-01-07 10:15:16 -08:00
08074c8f2d Update gradcheck.py (#70950)
Summary:
Following https://github.com/pytorch/pytorch/pull/64837#discussion_r779870974

Changed torch.equal to torch.allclose, as exact comparison could be flaky.
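The same pitfall exists with plain Python floats, which is why tolerance-based comparison is preferred (standard-library illustration, not the gradcheck code itself):

```python
import math

a = 0.1 + 0.2
b = 0.3

assert a != b  # exact comparison fails on accumulated rounding error
assert math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-12)  # tolerant check passes
```

torch.allclose plays the role of `math.isclose` here, comparing element-wise within relative and absolute tolerances instead of bit-for-bit.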

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70950

Reviewed By: albanD

Differential Revision: D33462426

Pulled By: anjali411

fbshipit-source-id: aeaba9d2a98d1d0af04fa2cab8c495c23ec0a9cc
2022-01-07 09:29:10 -08:00
8dfff8b2e2 Fix scatter for empty indexes (#70662)
Summary:
This PR fixes an issue with `scatter` where the output is garbage for zero-sized indexes.

```py
import torch

null_index = torch.zeros((0, 4), dtype=torch.int64)
null_arr = torch.zeros((0, 4))
zeros_arr = torch.zeros((1, 4))

result = zeros_arr.scatter(0, null_index, null_arr)

print(null_index)
print(null_arr)
print(zeros_arr)
print(result)
```

```
tensor([], size=(0, 4), dtype=torch.int64)
tensor([], size=(0, 4))
tensor([[0., 0., 0., 0.]])
tensor([[1.7036e+19, 2.9965e+32, 3.9133e-14, 1.3585e-19]])
```

The output array is never filled if the `index` argument has 0 elements.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70662

Reviewed By: dagitses

Differential Revision: D33476807

Pulled By: albanD

fbshipit-source-id: 97dbdd9c0133899e58828c43ecba81838807b8af
2022-01-07 09:20:43 -08:00
4e7e8f2826 [PyTorch] Outline destructor of CppFunction (#63688)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63688

CppFunction is used for function registration, so it's not performance-sensitive. Outlining the destructor should reduce code size.
ghstack-source-id: 146648927

Test Plan: Mobile buildsizebot

Reviewed By: dhruvbird

Differential Revision: D30462640

fbshipit-source-id: de410f933bf936c16769a10a52092469007c8487
2022-01-07 09:16:23 -08:00
40c512f52c split cuda for all 11.X (#70899)
Summary:
The code didn't support CUDA 11.5 or above.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70899

Reviewed By: ngimel

Differential Revision: D33469544

Pulled By: janeyx99

fbshipit-source-id: ea38de36b025051f76322fe840e3851408195160
2022-01-07 09:11:16 -08:00
2378421340 Implement torch.allclose for sharded tensor. (#70331)
Summary:
Implement torch.allclose op for sharded tensors.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70331

Test Plan:
Automated test added.
pritamdamania87
Fixes https://github.com/pytorch/pytorch/issues/67112

cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse SciPioneer H-Huang

Reviewed By: pritamdamania87

Differential Revision: D33339137

Pulled By: kumpera

fbshipit-source-id: 4263e468eaa117317b190f69877bf3f8bbac5658
2022-01-07 08:37:04 -08:00
997fa8671d Fix docstring for nn.Hardsigmoid (#70987)
Summary:
Fixes nn.Hardsigmoid's docstring problem reported at https://github.com/pytorch/pytorch/issues/70498.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70987

Reviewed By: dagitses

Differential Revision: D33476974

Pulled By: albanD

fbshipit-source-id: bf3a1c485dd2c369c56981f9afbfe45aa9cee2cc
2022-01-07 08:13:53 -08:00
f135438d3b Dispatch to at::convolution instead of at::_convolution in _convolution_double_backward (#70661)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70661

Dispatching to at::convolution can make Lazy Tensor trace the right convolution op.

Test Plan: pytest test/test_nn.py -k test_conv_double_backward_strided_with_3D_input_and_weight

Reviewed By: wconstab, jbschlosser

Differential Revision: D33428780

Pulled By: desertfire

fbshipit-source-id: 899e4135588ea33fff23d16103c25d9bcd3f902c
2022-01-07 07:53:46 -08:00
9ad21091dd [SR] Give VarStackNodeWrapper an iterator (#69922)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69922

D32596934 (65f54bc000) made the serial stack implementation a bit brittle. It introduced a new container type: `VarStackNodeWrapper`. This type was used as a template parameter in the serial stack implementation.

The other type used in the serial stack implementation is `at::ArrayRef<at::Tensor>`. Ideally, the interface of `VarStackNodeWrapper` should be as close as possible to this other type. However, because the new container type did not have an iterator, expressions like this would fail to compile:
```
for (const auto& tensor : tensors) {
  // do something
}
```
Introducing this iterator will make the code easier to maintain going forward.

Test Plan:
`buck test caffe2/benchmarks/static_runtime:static_runtime_cpptest -- Stack`

I consider this a `VarStack` implementation detail, so I'd prefer not to test it directly. We can test it implicitly by adding some code to the serial stack implementation that uses the iterator.

Reviewed By: swolchok

Differential Revision: D33101489

fbshipit-source-id: 7cf44c072d230c41bd9113cf2393bc6a6645a5b5
2022-01-07 07:24:47 -08:00
6e16c9bb1d Add support for deleteKey for FileStore (#69953)
Summary:
torch_ucc uses `deleteKey`, and trying to run PyTorch tests with torch_ucc leads to a failure about `deleteKey` not being implemented for FileStore.

cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse SciPioneer H-Huang

Pull Request resolved: https://github.com/pytorch/pytorch/pull/69953

Reviewed By: ngimel

Differential Revision: D33458457

Pulled By: H-Huang

fbshipit-source-id: f46afd59f950722ae594d9aafb8843f14019e930
2022-01-07 06:20:59 -08:00
d697bb4220 Adapt llvm_codegen.cpp to LLVM TOT (#70810)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70810

Adapt to LLVM top-of-tree APIs.

For context: LLVM is moving towards opaque pointers for IR values: https://llvm.org/docs/OpaquePointers.html

I also changed some `value->getScalarType()->getPointerElementType()`  expressions to directly reference relevant types. This is simpler and more in line with the intentions of the opaque IR pointers. (In fact I would expect those expressions to break in the future). I did not fix places where the relevant type wasn't obvious to me though.

Test Plan:
-
```
$ cd fbsource/fbcode
$ tp2_update_fbcode llvm-fb --branch=staging
# symlinks point to d9c037cf2b4f0268cb1897b99c8c87c5d0232616 TP2 revision
$ buck build mode/opt-clang-thinlto unicorn:index_server -c unicorn.hfsort="1" -c cxx.profile="fbcode//fdo/autofdo/unicorn/index_server:autofdo" -c cxx.modules=False -c cxx.extra_cxxflags="-Wforce-no-error"
```
- Check sandcastle jobs

Reviewed By: modiking

Differential Revision: D33431503

fbshipit-source-id: 33f39d0a0c0f4b805ab877a811ea0a670f834abf
2022-01-07 05:07:25 -08:00
87139d8532 [LTC] Sync LazyGraphExecutor and LazyTensor with the staging branch (#70867)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70867

This commit syncs LazyGraphExecutor and LazyTensor with the staging branch's
latest changes.

Test Plan: CI in the lazy_tensor_staging branch.

Reviewed By: wconstab, desertfire

Differential Revision: D33440005

Pulled By: alanwaketan

fbshipit-source-id: 0dd72643dbf81a87fc4b05019b6564fcb28f1979
2022-01-07 01:51:53 -08:00
1cdc643714 [TensorExpr] Add a pass for trimming JIT graphs. (#66847)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66847

Trimming means that we try to remove a small portion of the graph while
keeping it valid, and we repeat this step N times. This is
useful for debugging, when we want to find a minimal example reproducing
the issue at hand.
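As a rough illustration (in Python, with made-up names — the real pass operates on JIT IR graphs), the trimming loop amounts to repeatedly deleting a small chunk and keeping the deletion only if the reduced graph still reproduces the issue:

```python
import random


def trim_graph(nodes, is_valid, attempts=50, rng=None):
    """Randomized trimming sketch: repeatedly delete a small contiguous
    chunk of `nodes`, keeping the deletion only if `is_valid` still
    holds (i.e. the reduced "graph" still reproduces the issue)."""
    rng = rng or random.Random(0)
    nodes = list(nodes)
    for _ in range(attempts):
        if len(nodes) <= 1:
            break
        i = rng.randrange(len(nodes))
        j = min(len(nodes), i + rng.randint(1, 3))
        candidate = nodes[:i] + nodes[j:]
        # Keep the deletion only if the smaller version is still valid.
        if candidate and is_valid(candidate):
            nodes = candidate
    return nodes
```

Each kept deletion shrinks the input while preserving the property of interest, which is the same delta-debugging-style strategy the pass applies to JIT graphs.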

Differential Revision: D31751397

Test Plan: Imported from OSS

Reviewed By: navahgar

Pulled By: ZolotukhinM

fbshipit-source-id: 07d8ba1435af8fd2d7b8cf00db6685543fe97a85
2022-01-07 01:03:59 -08:00
8223ef1cd8 [TensorExpr] Clean-up logic for copying input tensors and remove some dead code. (#70535)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70535

This also fixes handling of inputs that happen to be outputs (they
require a copy).

Test Plan: Imported from OSS

Reviewed By: pbelevich

Differential Revision: D33399116

Pulled By: ZolotukhinM

fbshipit-source-id: 9845838eb653b82ae47b527631b51893990d5319
2022-01-07 01:03:56 -08:00
5d7cc8f22a [TensorExpr] Add some graph-rewrite passes to prepare models for AOT compilation. (#66515)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66515

These passes should not be used generally, as they change the API of the
model's forward method, but they help with experimenting with the model and
ironing out all the kinks before it can be compiled properly. In the
long run we should ideally provide a better way to enable such
experiments.

Differential Revision: D31590862

Test Plan: Imported from OSS

Reviewed By: navahgar

Pulled By: ZolotukhinM

fbshipit-source-id: 74ded34c6c871d4cafa29f43dc27c7e71daff8fc
2022-01-07 01:03:53 -08:00
cdbf83b0c3 [TensorExpr] Add helper passes for AOT pipeline. (#66514)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66514

These passes will
1) help us analyze the graph before trying to compile it
and report errors upfront if it's not possible,
2) fill in missing strides/dtype/device info in JIT IR. Ideally, this
should be done by a dedicated JIT pass, but until one is available, we'll
be using the workaround defined here.

Differential Revision: D31590860

Test Plan: Imported from OSS

Reviewed By: bdhirsh

Pulled By: ZolotukhinM

fbshipit-source-id: fe8fdefbeacae8079958dd0b4b27809cc0acb34b
2022-01-07 01:02:31 -08:00
a311cfa800 Revert D33460427: [pytorch][PR] [rnn/gru] : no batch dim
Test Plan: revert-hammer

Differential Revision:
D33460427 (6eba936082)

Original commit changeset: c64d9624c305

Original Phabricator Diff: D33460427 (6eba936082)

fbshipit-source-id: 9a5000e202c5f383b03dd6caad9399e46e4ce80e
2022-01-06 23:37:28 -08:00
1622546050 use irange for loops (#70248)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70248

Modified loops in files under fbsource/fbcode/caffe2/ from the format
```
for(TYPE var=x0;var<x_max;x++)
```
to the format
```
for(const auto var: irange(xmax))
```

This was achieved by running r-barnes's loop upgrader script (D28874212), modified to exclude all files under /torch/jit, with a number of reversions and unused-variable suppression warnings added by hand.

Test Plan: Sandcastle

Reviewed By: malfet

Differential Revision: D32813863

fbshipit-source-id: 527244b4a2b220fdfe7f17dee3599603f492a2ca
2022-01-06 23:14:29 -08:00
36d9e03ab7 Reserve vector in gather_ranges_to_dense_op.h (#70478)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70478

Test Plan: Sandcastle

Reviewed By: xw285cornell

Differential Revision: D33339890

fbshipit-source-id: 50330e18e344f872d03f146cea0ed11eef4f506e
2022-01-06 23:10:28 -08:00
df6eb9bbab Fixed to_folder not saving dtype (#69983)
Summary:
As above.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/69983

Reviewed By: pbelevich, ngimel

Differential Revision: D33466529

Pulled By: Chillee

fbshipit-source-id: 2d2f0ad5b8e2492aba4c19fa034c8b6c0848a568
2022-01-06 22:15:56 -08:00
23f902f7e4 Fix incorrect variable in autograd docs (#70884)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/68362.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70884

Reviewed By: mruberry

Differential Revision: D33463331

Pulled By: ngimel

fbshipit-source-id: 834ba9c450972710e0424cc92af222551f0b4a4a
2022-01-06 20:53:10 -08:00
22f5280433 add very dumb retry to ecr gc 2022-01-06 20:29:39 -08:00
c18e6b790e Adding elu,selu,softsign support for fx2trt (#70811)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70811

Add support of the above ops in fx2trt.

Reviewed By: 842974287

Differential Revision: D33407911

fbshipit-source-id: 8c635ddbd1cae6b0a0a04d345b0e0347111a6619
2022-01-06 19:42:24 -08:00
70b18b9511 Fix comment indentation issue (#70227)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70227

Test Plan: Imported from OSS

Reviewed By: malfet

Differential Revision: D33251107

Pulled By: tugsbayasgalan

fbshipit-source-id: 293ffe5dde38480ea13963a2d7e1eb99dc594d22
2022-01-06 19:14:39 -08:00
32bf5e0ef9 Add native impl of gelu for QuantizedCPU (#69968)
Summary:
Add native implementation of gelu for quantized CPU.

cc jerryzh168 jianyuh raghuramank100 jamesr66a vkuzo

Pull Request resolved: https://github.com/pytorch/pytorch/pull/69968

Reviewed By: ejguan

Differential Revision: D33187095

Pulled By: vkuzo

fbshipit-source-id: 4c4bf0eb47d2d9c2b8827174f2ccdea41986148a
2022-01-06 19:01:26 -08:00
6eba936082 [rnn/gru] no batch dim (#70442)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/60585

TODO:
* [x] Doc updates

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70442

Reviewed By: zou3519

Differential Revision: D33460427

Pulled By: jbschlosser

fbshipit-source-id: c64d9624c305d90570c79d11a28557f9ec667b27
2022-01-06 18:39:09 -08:00
880a5b9ea6 [PyTorch] Move prim string ops to JIT op registry (#70501)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70501

This PR migrates prim string ops to be registered in the JIT op registry instead of the dispatcher. Since the implementation of these ops is backend agnostic, there's no need to go through the dispatcher. We rely on `test_jit_string.py` to verify the correctness of these ops. I'm also adding tests to make sure all the operators are covered.

Test Plan: Rely on `test_jit_string.py`.

Reviewed By: iseeyuan

Differential Revision: D33351638

fbshipit-source-id: ecc8359da935a32d3a31add2c395a149a0d8892f
2022-01-06 18:26:28 -08:00
ddea6980fe [PyTorch][JIT] Don't refcount Type singletons (#69579)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69579

This should help us avoid reference counting overhead on singleton Type subclasses without a major rewrite of the Type subsystem.
ghstack-source-id: 146643993

Test Plan:
Ran //caffe2/caffe2/fb/high_perf_models/pytorch/benchmark_framework_overheads:cpp_benchmark with arguments `--op empty -niter 40 --stressTestRecordFunction --captureRecordFunctionInputs` on devbig with turbo off.

Before:
```
I1206 13:47:15.037441 1201670 bench.cpp:144] Mean 0.737675
I1206 13:47:15.037463 1201670 bench.cpp:145] Median 0.736725
I1206 13:47:15.037468 1201670 bench.cpp:146] Min 0.722897
I1206 13:47:15.037473 1201670 bench.cpp:147] stddev 0.00508187
I1206 13:47:15.037482 1201670 bench.cpp:148] stddev / mean 0.00688903
```

After:
```
I1206 13:48:16.830123 1205612 bench.cpp:144] Mean 0.66988
I1206 13:48:16.830150 1205612 bench.cpp:145] Median 0.663956
I1206 13:48:16.830157 1205612 bench.cpp:146] Min 0.65986
I1206 13:48:16.830164 1205612 bench.cpp:147] stddev 0.0335928
I1206 13:48:16.830171 1205612 bench.cpp:148] stddev / mean 0.0501475
```

Static runtime startup is also improved; for CMF local_ro, time to initialize a predictor went from 10.01s to 9.59s.

(Note: I wish I had a production workload to demonstrate the advantage of this on. I tried ctr_mobile_feed local_ro net but it was neutral. Anything that manipulates types or List/Dict a lot might be promising.)

Reviewed By: suo

Differential Revision: D32923880

fbshipit-source-id: c82ed6689b3598e61047fbcb2149982173127ff0
2022-01-06 17:39:16 -08:00
e6befbe85c Add flag to optionally average output attention weights across heads (#70055)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/47583

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70055

Reviewed By: bhosmer

Differential Revision: D33457866

Pulled By: jbschlosser

fbshipit-source-id: 17746b3668b0148c1e1ed8333227b7c42f1e3bf5
2022-01-06 17:32:37 -08:00
cc7382dd92 Enable upgraders in TS server (#70539)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70539

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70538

ghstack-source-id: 146384458

Test Plan: python test/test_jit.py TestUpgraders

Reviewed By: gmagogsfm

Differential Revision: D33375195

fbshipit-source-id: 170960b409175bb987cf9dbb65ffed3283e5f6f9
2022-01-06 17:10:30 -08:00
7b8f73dd32 No-batch-dim support for ConvNd (#70506)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70506

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33355034

Pulled By: jbschlosser

fbshipit-source-id: 5a42645299b1d82cee7d461826acca1c5b35a71c
2022-01-06 16:53:50 -08:00
6896b2d734 [NNC Testing] Randomized loop nest infrastructure (#70410)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70410

Trying again after #70174 was reverted. Previously the env
variable was read into a static variable in C++, causing state to be retained
across runs and causing test failures. The static qualifier is removed in this PR.

Test Plan: Imported from OSS

Reviewed By: ZolotukhinM

Differential Revision: D33321435

fbshipit-source-id: 6d108eb00cac9150a142ccc3c9a65a1867dd7de4
2022-01-06 16:21:42 -08:00
b7742b437a Allow RNN hidden_size to be 0 (#70556)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/56767.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70556

Reviewed By: ngimel

Differential Revision: D33455156

Pulled By: jbschlosser

fbshipit-source-id: 5dc57b09d7beb6ae81dfabc318e87c109bb4e6ae
2022-01-06 14:18:36 -08:00
e7602a1e30 Fix multiplication of 0-D sparse tensors (#70749)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70749

Fixes https://github.com/pytorch/pytorch/issues/65396 and a clang-tidy error.

cc nikitaved pearu cpuhrsch

Test Plan: Imported from OSS

Reviewed By: jbschlosser

Differential Revision: D33439136

Pulled By: cpuhrsch

fbshipit-source-id: 45ec58de7c18db183f891431d4a26e98fd0e924a
2022-01-06 13:36:46 -08:00
4fa70a2483 [pytorch] fix hipify_python (#70619)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70619

This Diff improves `hipify_python`, which is needed for AMD GPUs.

Change 1:
```
if (c == "," or ind == len(kernel_string) - 1) and closure == 0:
```
This is needed to deal with the following case (ex: https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/test/cuda_vectorized_test.cu#L111)
```
kernel<<<val, func()>>>(...)
// In this case, kernel_string is "val, func()"
// so closure gets 0 when ind == len(kernel_string) - 1.
```
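A minimal Python sketch of the underlying idea — splitting the `<<<...>>>` argument string on commas only at bracket depth zero, so the trailing argument is flushed when the end of the string is reached at depth zero. The function name here is hypothetical, not the actual hipify code:

```python
def split_kernel_args(kernel_string):
    """Split a kernel-launch argument string on top-level commas only,
    so nested calls such as func(x, y) are kept intact."""
    args, depth, current = [], 0, []
    for ind, c in enumerate(kernel_string):
        if c in "([<":
            depth += 1
        elif c in ")]>":
            depth -= 1
        if c == "," and depth == 0:
            args.append("".join(current).strip())
            current = []
        else:
            current.append(c)
        # Mirrors the patched condition: also flush the final argument
        # when we reach the end of the string at depth zero.
        if ind == len(kernel_string) - 1 and depth == 0 and current:
            args.append("".join(current).strip())
    return args
```

For `"val, func(x, y)"` this yields `["val", "func(x, y)"]`: the comma inside `func(...)` is at depth 1, so it does not split, while the end-of-string check captures the last argument — the case the original condition missed.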

Change 2:
```
mask_comments()
```
This is needed to deal with a case where "<<<" is included in a comment or a string literal (ex: https://github.com/pytorch/pytorch/blob/master/torch/csrc/deploy/interpreter/builtin_registry.cpp#L71)
```
abc = "<<<XYZ>>>"
// Though this <<<XYZ>>> is irrelevant to CUDA kernels,
// the current script attempts to hipify this and fails.
```

Test Plan:
This patch fixes errors I encountered by running
```
python3 tools/amd_build/build_amd.py
```

I confirmed, with Linux `diff`, that this patch does not change HIP code that was generated successfully with the original script.

Reviewed By: hyuen

Differential Revision: D33407743

fbshipit-source-id: bec822e040a154be4cda1c294536792ca8d596ae
2022-01-06 13:27:43 -08:00
9c455d7086 dbr quant: add limited support for torch.nn.ModuleList (#70372)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70372

Enables basic support for `torch.nn.ModuleList` in DBR quant
by stopping it from being a leaf.  For now, we
require the user to check for `AutoQuantizationState` if they are
looping over the contents without any bounds checking.

In future PRs, we can explore how to solve this without requiring
user code changes.

Test Plan:
```
python test/test_quantization.py TestQuantizeDBR.test_module_list
```

Reviewed By: VitalyFedyunin

Differential Revision: D33302329

Pulled By: vkuzo

fbshipit-source-id: 1604748d4b6c2b9d14b50df46268246da807d539
2022-01-06 13:25:13 -08:00
c3f0c77b64 dbr quant support for custom leaf modules, part 3/x (#70349)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70349

Makes sure that child modules of non traceable leaf modules
do not participate in quantization swaps.  This should feature complete
the `non_traceable_module_class` feature.

Test Plan:
```
python test/test_quantization.py TestQuantizeDBR.test_prepare_custom_config_dict_non_traceable_module_class
```

Reviewed By: VitalyFedyunin

Differential Revision: D33296246

Pulled By: vkuzo

fbshipit-source-id: 08287429c89ee6aa42d13ca3060a74679a478181
2022-01-06 13:25:10 -08:00
423d8aabbd dbr quant: support for custom leaf modules, part 2/x (#70335)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70335

Adds test case that functions are not quantized inside custom leaf modules.
No logic change needed as it already works correctly.

Note: FX scripting rewriter does not go into modules without auto-quant,
which is why we are using torch.jit.trace to look at the graph.

Test Plan:
```
python test/test_quantization.py TestQuantizeDBR.test_prepare_custom_config_dict_non_traceable_module_class
```

Reviewed By: jerryzh168

Differential Revision: D33286370

Pulled By: vkuzo

fbshipit-source-id: 26c81c9e1ce7c4d38ddc1e318730cf1eaa25ff69
2022-01-06 13:25:07 -08:00
b12852eb41 dbr quant: support for custom leaf modules, part 1/x (#70330)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70330

Starts adding support for custom leaf modules, part 1/x.
In this PR, we ensure that leaf modules and all of their children
do not get `AutoQuantizationState` objects attached to them.
The API is matching prepare_fx, using the `prepare_custom_config_dict`
argument and the `non_traceable_module_class` key within that dict.

The next couple of PRs will ensure that modules and functions in
leaves do not get quantized, keeping it separate to make PRs smaller.

Test Plan:
```
python test/test_quantization.py TestQuantizeDBR.test_prepare_custom_config_dict_non_traceable_module_class
```

Reviewed By: jerryzh168

Differential Revision: D33285310

Pulled By: vkuzo

fbshipit-source-id: 532025fda5532b420fad0a4a0847074d1ac4ad93
2022-01-06 13:25:04 -08:00
a8929c3278 dbr quant: unbreak case when child module not returning any outputs (#70329)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70329

Fixes a crash in DBR when a child module does not return any tensors.
This happens sometimes in user models.

Test Plan:
```
python test/test_quantization.py TestQuantizeDBR.test_child_module_does_not_return_tensor
```

Reviewed By: VitalyFedyunin

Differential Revision: D33285309

Pulled By: vkuzo

fbshipit-source-id: 42b8cffb5ee02ce171a3e6c64d140bb5f217225a
2022-01-06 13:25:01 -08:00
f742853838 dbr quant: support functional linear without bias (#70328)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70328

Currently linear with bias crashes DBR convert step, this PR fixes it.
This unbreaks testing DBR on some customer models.

Test Plan:
```
python test/test_quantization.py TestQuantizeDBRIndividualOps.test_linear_functional_nobias
```

Reviewed By: jerryzh168

Differential Revision: D33285311

Pulled By: vkuzo

fbshipit-source-id: 757c7270be9e3ff9cdf2609b1e426e9fd34e50ff
2022-01-06 13:24:58 -08:00
c21a540866 dbr quant: support dynamic linear (#70257)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70257

Makes dynamic quantization for linear module work in DBR quant.

Coverage for more ops and functionals will be in future PRs.

Test Plan:
```
python test/test_quantization.py -k DBR
```

Reviewed By: jerryzh168

Differential Revision: D33262300

Pulled By: vkuzo

fbshipit-source-id: c1cb0f9dd3f42216ad6ba19f4222b171ff170174
2022-01-06 13:24:55 -08:00
dfb807d65e dbr quant: do not attach auto_quant_state to observers (#70256)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70256

Somewhere in previous PRs we started attaching AutoQuantState
to observers. This PR removes that, as it serves no purpose
and makes model debugging more complicated.

Test Plan:
```
python test/test_quantization.py -k DBR
```

Reviewed By: jerryzh168

Differential Revision: D33262299

Pulled By: vkuzo

fbshipit-source-id: a3543b44c517325d57f5ed03b961a8955049e682
2022-01-06 13:23:43 -08:00
524bbb1442 [LTC] Sync gen_lazy_tensor.py from the staging branch (#70385)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70385

This commit syncs gen_lazy_tensor.py from the lazy_tensor_staging branch
to master.

Test Plan: CI in the lazy_tensor_staging branch.

Reviewed By: wconstab

Differential Revision: D33306232

Pulled By: alanwaketan

fbshipit-source-id: a15c72b22418637f851a6cd4901a9f5c4be75449
2022-01-06 13:12:37 -08:00
81b52c290f Adding leaky_relu support for fx2trt (#70799)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70799

Add op support in fx2trt for leaky_relu
1. add support in acc_ops and corresponding unit test
2. add support in acc_ops_converters and corresponding unit test

Reviewed By: 842974287

Differential Revision: D33399095

fbshipit-source-id: 978340e64b35ffefabdc48273ddfa86b5ee1816e
2022-01-06 12:40:14 -08:00
19f04da21e GHA: Make WORKFLOW_ID not a concatenation of run_id and run_num (#70938)
Summary:
![image](https://user-images.githubusercontent.com/31798555/148432431-f990a26b-55d4-414e-9abd-8cdb4b4e9844.png)

Since both GITHUB_RUN_ID and GITHUB_RUN_NUM are unchanged in rerun attempts, there's little reason to track both. It ends up just being confusing and also hard to use in joins in queries.

Currently, the only places the concatenated WORKFLOW_ID is used are for our test stats JSONs in S3 and our binary size stats in Scuba; the code is linked respectively:
https://github.com/pytorch/pytorch/blob/master/tools/stats/print_test_stats.py#L824
https://github.com/pytorch/pytorch/blob/master/tools/stats/upload_binary_size_to_scuba.py#L58
And I don't think we use the WORKFLOW_IDs in either stats in any queries yet.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70938

Reviewed By: seemethere, ngimel

Differential Revision: D33458655

Pulled By: janeyx99

fbshipit-source-id: 885b125a978fa0cc51553b08b8c63d5fdcf354d0
2022-01-06 12:34:10 -08:00
10b55648f5 CI: remove unused yaml and make upload_binary_size_to_scuba script work with GHA (#70643)
Summary:
Removes unused pytorch-job-specs.yml

It looks like the recent Android GHA jobs use upload_binary_size_to_scuba.py, but a portion of the script was still using CircleCI-only variables

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70643

Reviewed By: ngimel

Differential Revision: D33455659

Pulled By: janeyx99

fbshipit-source-id: cfe79a674641ed3327c7650d2107ace2a5050983
2022-01-06 10:05:27 -08:00
578fe11673 [pytorch][aten][cuda] fix LpNormFunctor (#70601)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70601

`&` has lower precedence than `==`, so `==` is evaluated first. This behavior is almost certainly unintended. This patch fixes it.
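In C++, `==` binds tighter than `&`, so an expression like `x & 1 == 0` parses as `x & (1 == 0)` rather than the intended `(x & 1) == 0`. The actual condition in `LpNormFunctor` is not shown in this summary, so the expression below is a hypothetical one chosen to illustrate the two parse trees in Python:

```python
def as_cpp_parses(x):
    # C++ parse of `x & 1 == 0`: the comparison happens first,
    # so this evaluates to x & 0, i.e. 0, for every x.
    return x & (1 == 0)


def as_intended(x):
    # The intended test: is the low bit of x clear?
    return (x & 1) == 0
```

`as_cpp_parses(4)` is `0` (falsy) even though `as_intended(4)` is `True` — the kind of silent logic error the patch corrects.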

Test Plan: 🧐  Carefully check the change.

Reviewed By: hyuen

Differential Revision: D33397964

fbshipit-source-id: e3ac5b04e4688dfbf9d8ac3e5c4aa72282bf6ee9
2022-01-06 09:50:34 -08:00
c00d33033c Remove repeat test for types in test nn (#70872)
Summary:
Helps fix a part of https://github.com/pytorch/pytorch/issues/69865

The first commit just migrates everything as is.

The second commit uses the "device" variable instead of passing "cuda" everywhere

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70872

Reviewed By: jbschlosser

Differential Revision: D33455941

Pulled By: janeyx99

fbshipit-source-id: 9d9ec8c95f1714c40d55800e652ccd69b0c314dc
2022-01-06 09:20:02 -08:00
bc514cb425 Skip distributed tests if built with USE_DISTRIBUTED=0 (#70677)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/70676

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70677

Reviewed By: albanD

Differential Revision: D33439808

Pulled By: janeyx99

fbshipit-source-id: 7f9971eb564dbbb6625fe5f78328c3abe3808719
2022-01-06 08:55:05 -08:00
ff408fca7f Forward AD formulas for activation backwards (#70460)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70460

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33405363

Pulled By: soulitzer

fbshipit-source-id: f68b59857a609ff593e9e399b9287d58dacef9e2
2022-01-06 08:41:17 -08:00
3051aabd0e Add forward AD formulas for convolution and some others (#69956)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69956

Test Plan: Imported from OSS

Reviewed By: albanD, bdhirsh

Differential Revision: D33235974

Pulled By: soulitzer

fbshipit-source-id: ea60d687edc5d62d92f3fd3cb6640421d32c908c
2022-01-06 08:39:51 -08:00
4916a21f10 quantization: fix scale+zp serialization of quantized BatchNorm{2|3}d (#70432)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70432

Scale and zero_point need to be buffers for serialization to work
on them properly.  This PR moves them to buffers.  This is BC breaking,
but the "before" state was completely broken (scale + zp were not
serialized at all) so there is no value in trying to handle it.
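The reason buffers matter can be sketched with a minimal stand-in for `nn.Module` (this mimic is illustrative only — real modules get this behavior from `register_buffer`): plain Python attributes never reach `state_dict`, so they are silently dropped on save/load.

```python
class MiniModule:
    """Toy mimic of the relevant nn.Module behavior: only registered
    buffers are included in state_dict; plain attributes are not."""

    def __init__(self):
        self._buffers = {}

    def register_buffer(self, name, value):
        self._buffers[name] = value

    def state_dict(self):
        return dict(self._buffers)


# Before the fix (conceptually): scale/zero_point as plain attributes.
broken = MiniModule()
broken.scale = 0.25        # never serialized
broken.zero_point = 128    # never serialized

# After the fix: registered as buffers, so they round-trip.
fixed = MiniModule()
fixed.register_buffer("scale", 0.25)
fixed.register_buffer("zero_point", 128)
```

`broken.state_dict()` is `{}`, while `fixed.state_dict()` contains both `scale` and `zero_point` — which is why moving them to buffers is what makes serialization work.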

Test Plan:
```
python test/test_quantization.py TestStaticQuantizedModule.test_batch_norm2d_serialization
python test/test_quantization.py TestStaticQuantizedModule.test_batch_norm3d_serialization
```

Imported from OSS

Differential Revision: D33330022

Reviewed By: jerryzh168

Pulled By: vkuzo

fbshipit-source-id: 673c61f1a9f8f949fd9e6d09a4dbd9e5c9d5fd04
2022-01-06 08:26:20 -08:00
6773589a06 Drop some unused variables (#70879)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70879

Sandcastle from layer_norm_kernel.cu

Test Plan: Sandcastle

Reviewed By: malfet

Differential Revision: D33439040

fbshipit-source-id: e7d0e37ab25d62c63f675da3b6eff670fd93b26a
2022-01-06 08:11:25 -08:00
748790588c Upgrading the loop to use irange (#70326)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70326

See D24145988 for context: it allows loops such as for(int i=0;i<10;i++) to be expressed as for(const auto i : c10::irange(10)). This is nice because it auto-types the loops and adds const-safety to the iteration variable.

Test Plan: buck run //caffe2/torch/fb/sparsenn:test

Reviewed By: r-barnes

Differential Revision: D33243400

fbshipit-source-id: b1f1b4163f4bf662031baea9e5268459b40c69a3
2022-01-06 07:06:53 -08:00
b0fdca8855 Bump version number to 7 and compile old operators with old schema (#68358)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/68358

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33433730

Pulled By: tugsbayasgalan

fbshipit-source-id: 202c58365bae13195d3545cefcb0da9162b02151
2022-01-05 23:57:22 -08:00
8bdbe94344 Add forward compatability tests in CI (#64139)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64139

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D30626912

Pulled By: tugsbayasgalan

fbshipit-source-id: 781a88386701b42e2e86daaca0a779d1fc1c4df3
2022-01-05 23:40:06 -08:00
402f2934bf Revert D33262228: Per-overload torch.ops API
Test Plan: revert-hammer

Differential Revision:
D33262228 (8e6d1738a4)

Original commit changeset: 600dbf511514

Original Phabricator Diff: D33262228 (8e6d1738a4)

fbshipit-source-id: 238fa88ea9c4f26c7511334765c07452fbca9655
2022-01-05 22:10:11 -08:00
884aa2baad ci: Make linux.*xlarge non-ephemeral (#70869)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70869

Makes Linux runners non-ephemeral to reduce the number of
CreateInstance calls we make to AWS, as well as the number of
GitHub API calls we make in order to create new instances.

Should help alleviate some of the queuing issues we may observe due to
AWS / Github rate limits

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

Test Plan: Imported from OSS

Reviewed By: suo

Differential Revision: D33436874

Pulled By: seemethere

fbshipit-source-id: b2736fb4c9d175b1b0e2efb5017dcb4a8d4c05f4
2022-01-05 22:04:21 -08:00
2367face24 Prefer maybe_multiply when multiplying by a constant (#68185)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68185

As per title

We also fix the first input to `handle_r_to_c` for `rsub`, as it was
flipped for the two inputs.

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D32684855

Pulled By: mruberry

fbshipit-source-id: ffeab8d561e657105b314a883260f00d0ae59bbf
2022-01-05 20:33:43 -08:00
1a061c7fe1 Merge index_{add,fill,copy,select} sampling (#68184)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68184

This was in the TODO list as the three operations are very similar.
Did this as one of them was failing in the noncontig tests and I wanted
to make sure that all of them were tested properly, as they all appear
in the derivative formulas of each other.

After this PR, these operations do pass the noncontiguous tests.

cc mruberry

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D32684854

Pulled By: mruberry

fbshipit-source-id: 5db58be8d1e1fce434eab9cdf410cbf1024bbdf9
2022-01-05 20:33:40 -08:00
baeca11a21 Remove random_fullrank_matrix_distinc_singular_value (#68183)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68183

We do so in favour of
`make_fullrank_matrices_with_distinct_singular_values`: this latter
one not only has an even longer name, but also generates inputs
correctly so that they work with the PR that tests noncontiguous inputs
later in this stack.

We also heavily simplified the generation of samples for the SVD, as it was
fairly convoluted and was not generating the inputs correctly for
the noncontiguous test.

To do the transition, we also needed to fix the following issue, as it was popping
up in the tests:

Fixes https://github.com/pytorch/pytorch/issues/66856

cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D32684853

Pulled By: mruberry

fbshipit-source-id: e88189c8b67dbf592eccdabaf2aa6d2e2f7b95a4
2022-01-05 20:33:37 -08:00
08ef4ae0bc Remove unnecessary sync in linalg.det (#67014)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67014

LAPACK functions return negative infos when there was an unexpected
input. This happens (for example) when the user does not specify
matrices of the correct size. We already check all of these things on the
PyTorch end, so this check, which induces a synchronisation, is
unnecessary.

I also took this chance to avoid some code repetition in the computation
of the determinant of `P`. I also replaced the use of `ExclusivelyOwned<Tensor>`
with regular `Tensor`s plus moving into the tuple, which should be at least as efficient.

cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D32684851

Pulled By: mruberry

fbshipit-source-id: dc046d1cce4c07071d16c4e2eda36412bd734e0f
2022-01-05 20:33:34 -08:00
4d4e81d869 Make linalg.lu_factor structured (#66934)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66934

cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D32684856

Pulled By: mruberry

fbshipit-source-id: 1675448da9a8677c8420005ce753972234e7accc
2022-01-05 20:33:31 -08:00
012c38e04d Add contiguous_strides as a correct replacement of defaultStride (#67789)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67789

`at::defaultStride` was added in https://github.com/pytorch/pytorch/pull/18779.
As it was noted in that PR, it differs from the actual computation of
the default strides when one or more of the dimensions of the tensor are
zero. See https://github.com/pytorch/pytorch/pull/18779#discussion_r272296140

We add two functions, `contiguous_strides` and `contiguous_strides_vec`
which correct this issue and we replace the previous (wrong) uses of
`defaultStride`.

cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D32684852

Pulled By: mruberry

fbshipit-source-id: 62997a5a97a4241a12e73e2be2e192b80b491cb1
2022-01-05 20:33:28 -08:00
a35b4b49d2 Add linalg.lu_factor (#66933)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66933

This PR exposes `torch.lu` as `torch.linalg.lu_factor` and
`torch.linalg.lu_factor_ex`.

This PR also adds support for matrices with zero elements both in
the size of the matrix and the batch. Note that this function simply
returns empty tensors of the correct size in this case.

We add a test and an OpInfo for the new function.

This PR also adds documentation for this new function in line of
the documentation in the rest of `torch.linalg`.
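For intuition, what `lu_factor` computes can be sketched in pure Python with partial pivoting, using LAPACK's `getrf`-style packed output. This is an illustrative toy (the function name is invented, and it assumes a square nonsingular input), not the actual implementation, which dispatches to LAPACK/cuSOLVER:

```python
def lu_factor_sketch(a):
    """LU factorization with partial pivoting on a list-of-lists matrix.

    Returns (lu, piv): `lu` holds U on and above the diagonal and the
    unit-lower-triangular L (without its diagonal) below it; `piv[k]` is
    the row swapped with row k at step k (LAPACK-style pivot indices)."""
    n = len(a)
    lu = [row[:] for row in a]
    piv = list(range(n))
    for k in range(n):
        # Choose the pivot row: the largest remaining entry in column k.
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        piv[k] = p
        lu[k], lu[p] = lu[p], lu[k]
        # Eliminate below the pivot, storing the multipliers (L's entries)
        # in the zeroed-out positions.
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu, piv
```

For `[[0., 2.], [1., 3.]]` this returns `lu == [[1., 3.], [0., 2.]]` and `piv == [1, 1]` (row 0 was swapped with row 1), mirroring the factor/pivot pair the new `torch.linalg.lu_factor` exposes.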

Fixes https://github.com/pytorch/pytorch/issues/56590
Fixes https://github.com/pytorch/pytorch/issues/64014

cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano

Test Plan: Imported from OSS

Reviewed By: gchanan

Differential Revision: D32834069

Pulled By: mruberry

fbshipit-source-id: 51ef12535fa91d292f419acf83b800b86ee9c7eb
2022-01-05 20:32:12 -08:00
3f53365086 define get_dot_graph (#70541)
Summary:
In the [docstring](https://github.com/pytorch/pytorch/blob/master/torch/fx/passes/graph_drawer.py#L54-L60) we mention `get_dot_graph`, but it is not defined, so I defined it here.
Not sure if this is preferred, or whether we should update the docstring to use `get_main_dot_graph` instead.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70541

Test Plan:
```
g = FxGraphDrawer(symbolic_traced, "resnet18")
with open("a.svg", "w") as f:
    f.write(g.get_dot_graph().create_svg())
```

Reviewed By: khabinov

Differential Revision: D33378080

Pulled By: mostafaelhoushi

fbshipit-source-id: 7feea2425a12d5628ddca15beff0fe5110f4a111
2022-01-05 20:00:20 -08:00
917d56a7e4 Copy: Fix conj bit being ignored on type mismatch (#68963)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/68963

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D33064492

Pulled By: anjali411

fbshipit-source-id: 043f927d6bfff46bf5f8ea6fce9409f250bf8ff8
2022-01-05 17:59:32 -08:00
cfc5519661 Support Sparse CSR transpose. Fix clang-tidy warnings. (#70582)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70582

cc nikitaved pearu cpuhrsch

Test Plan: Imported from OSS

Reviewed By: jbschlosser

Differential Revision: D33414446

Pulled By: cpuhrsch

fbshipit-source-id: dd0888d9dd3885579e853643a60d13373b5d6b15
2022-01-05 17:41:51 -08:00
3a21f38a2e Integrate multi_tensor zero_grad into Optimizer base class (#69936)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69936

Currently, the optimizers in `torch/optim/_multi_tensor/` all override the base Optimizer class' implementation of `zero_grad` with the same foreach zero_grad implementation (e.g. [here](https://github.com/pytorch/pytorch/blob/master/torch/optim/_multi_tensor/adadelta.py#L93-L114)). There is a TODO that indicates that this should be refactored to the base class once the foreach ops are in good shape. This PR is intended to address that TODO.
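The shape of the refactor can be sketched with toy classes (hypothetical names; plain lists stand in for gradient tensors): each subclass previously carried an identical `zero_grad`, and hoisting it into the base class leaves the subclasses with nothing to override.

```python
class Param:
    def __init__(self, grad=None):
        self.grad = grad

class Optimizer:
    # One shared zero_grad on the base class instead of a duplicated
    # copy in every _multi_tensor optimizer.
    def __init__(self, params):
        self.params = list(params)

    def zero_grad(self, set_to_none=False):
        for p in self.params:
            if p.grad is None:
                continue
            p.grad = None if set_to_none else [0.0] * len(p.grad)

class Adadelta(Optimizer):
    pass  # no per-optimizer zero_grad override needed any more

opt = Adadelta([Param([1.0, 2.0]), Param(None)])
opt.zero_grad()
print([p.grad for p in opt.params])  # [[0.0, 0.0], None]
```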

Test Plan: Imported from OSS

Reviewed By: mrshenli

Differential Revision: D33346748

Pulled By: mikaylagawarecki

fbshipit-source-id: 6573f4776aeac757b6a778894681868191a1b4c7
2022-01-05 15:46:23 -08:00
8e6d1738a4 Per-overload torch.ops API (#67254)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67254

Fixes https://github.com/pytorch/pytorch/issues/65997

TODO: disallow `default` as an overload name for aten operators.

BC breaking:
`output = torch.ops._test.leaky_relu(self=torch.tensor(-1.0))` now fails with the error `TypeError: __call__() got multiple values for argument 'self'` since we call into `OpOverloadBundle`'s `__call__` method that has `self` bound to it as its first argument.
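The collision is ordinary Python binding behavior, reproducible with a toy stand-in class (this is not the real `OpOverloadBundle`): the instance is already bound as the first argument of `__call__`, so a `self=` keyword argument duplicates it.

```python
class OpOverloadBundle:
    # Stand-in: the real class dispatches to an overload; here we just echo.
    def __call__(self, *args, **kwargs):
        return args, kwargs

op = OpOverloadBundle()
try:
    op(self=-1.0)  # `self` is already bound to `op` itself
except TypeError as e:
    print(e)  # ... got multiple values for argument 'self'
```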

cc ezyang gchanan

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33262228

Pulled By: anjali411

fbshipit-source-id: 600dbf511514ea9b41aea3e6b1bc1102dab08909
2022-01-05 15:17:41 -08:00
f9e1a1c97f Increase tolerance for test_adadelta (#69919)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/69698

Pull Request resolved: https://github.com/pytorch/pytorch/pull/69919

Reviewed By: cpuhrsch

Differential Revision: D33286427

Pulled By: jbschlosser

fbshipit-source-id: a2ca90683c14b6669f9b1804881ac675ba925fc5
2022-01-05 15:02:10 -08:00
ce409d8f50 docs: clarify smooth l1 == l1 when beta == 0 (#70673)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/68558.
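The equivalence the docs now state follows directly from the elementwise definition (a pure-Python sketch of the formula, not the actual kernel): with `beta == 0` the quadratic branch is never taken, so the loss reduces exactly to L1.

```python
def smooth_l1(diff, beta):
    # Quadratic near zero, linear in the tails; at beta == 0 the first
    # branch is unreachable and the expression collapses to |diff|.
    if beta > 0 and abs(diff) < beta:
        return 0.5 * diff * diff / beta
    return abs(diff) - 0.5 * beta

print(smooth_l1(0.5, 1.0))   # 0.125 (quadratic region)
print(smooth_l1(-3.0, 0.0))  # 3.0 == abs(-3.0), i.e. plain L1
```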

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70673

Reviewed By: albanD

Differential Revision: D33430267

Pulled By: jbschlosser

fbshipit-source-id: db92187ff4f2799b19a6c4a5a6b653e9211c3aca
2022-01-05 14:35:35 -08:00
2431218ee4 Jiterates more ops (#70663)
Summary:
This PR jiterates:

- lcm
- i0e
- i1e
- ndtri
- erfcx
- digamma
- trigamma
- lgamma

It also adds TODOs to jiterate `kaiser_window`, `igamma`, `igammac` and `polygamma`, but jiterating those ops requires more features.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70663

Reviewed By: ngimel

Differential Revision: D33420854

Pulled By: mruberry

fbshipit-source-id: 6f32ac3cf24eda051bf19b6d20e94cdf81f50761
2022-01-05 13:57:25 -08:00
a5bc44422a [PyTorch] Remove the List/Dict move operations (#69370)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69370

These operations are likely slower than copying because they perform a heap allocation and reference count bump, whereas copying is just a reference count bump. This diff is up to see (1) if anything breaks and (2) if we can measure any improvements.
ghstack-source-id: 146468907

Test Plan:
Ran //sigrid/lib/features/tests:pytorch_feature_conversion_benchmark before/after

```
swolchok@devbig032 ~/f/fbcode> for x in (seq 5); sudo scripts/bertrand/noise/denoise.sh /tmp/pytorch_feature_conversion_benchmark.Dec7Stable ; end
============================================================================
sigrid/lib/features/tests/PyTorchFeatureConversionBenchmark.cpprelative  time/iter  iters/s
============================================================================
PyTorchFeatureConversionDenseBenchmark                       2.43us  410.68K
PyTorchFeatureConversionIdListBenchmark                      3.74us  267.65K
PyTorchFeatureConversionIdScoreListBenchmark                 4.98us  200.81K
============================================================================
============================================================================
sigrid/lib/features/tests/PyTorchFeatureConversionBenchmark.cpprelative  time/iter  iters/s
============================================================================
PyTorchFeatureConversionDenseBenchmark                       2.43us  410.75K
PyTorchFeatureConversionIdListBenchmark                      3.75us  266.92K
PyTorchFeatureConversionIdScoreListBenchmark                 4.98us  200.97K
============================================================================
============================================================================
sigrid/lib/features/tests/PyTorchFeatureConversionBenchmark.cpprelative  time/iter  iters/s
============================================================================
PyTorchFeatureConversionDenseBenchmark                       2.44us  410.43K
PyTorchFeatureConversionIdListBenchmark                      3.75us  266.75K
PyTorchFeatureConversionIdScoreListBenchmark                 5.04us  198.23K
============================================================================
============================================================================
sigrid/lib/features/tests/PyTorchFeatureConversionBenchmark.cpprelative  time/iter  iters/s
============================================================================
PyTorchFeatureConversionDenseBenchmark                       2.43us  411.17K
PyTorchFeatureConversionIdListBenchmark                      3.74us  267.60K
PyTorchFeatureConversionIdScoreListBenchmark                 5.00us  199.84K
============================================================================
============================================================================
sigrid/lib/features/tests/PyTorchFeatureConversionBenchmark.cpprelative  time/iter  iters/s
============================================================================
PyTorchFeatureConversionDenseBenchmark                       2.44us  410.19K
PyTorchFeatureConversionIdListBenchmark                      3.73us  267.89K
PyTorchFeatureConversionIdScoreListBenchmark                 4.96us  201.46K
============================================================================
swolchok@devbig032 ~/f/fbcode> for x in (seq 5); sudo scripts/bertrand/noise/denoise.sh /tmp/pytorch_feature_conversion_benchmark.Dec8RemoveListAndDictMove ; end
============================================================================
sigrid/lib/features/tests/PyTorchFeatureConversionBenchmark.cpprelative  time/iter  iters/s
============================================================================
PyTorchFeatureConversionDenseBenchmark                       2.47us  405.12K
PyTorchFeatureConversionIdListBenchmark                      3.60us  278.07K
PyTorchFeatureConversionIdScoreListBenchmark                 4.87us  205.44K
============================================================================
============================================================================
sigrid/lib/features/tests/PyTorchFeatureConversionBenchmark.cpprelative  time/iter  iters/s
============================================================================
PyTorchFeatureConversionDenseBenchmark                       2.45us  407.39K
PyTorchFeatureConversionIdListBenchmark                      3.63us  275.56K
PyTorchFeatureConversionIdScoreListBenchmark                 4.95us  202.17K
============================================================================
============================================================================
sigrid/lib/features/tests/PyTorchFeatureConversionBenchmark.cpprelative  time/iter  iters/s
============================================================================
PyTorchFeatureConversionDenseBenchmark                       2.47us  405.49K
PyTorchFeatureConversionIdListBenchmark                      3.63us  275.58K
PyTorchFeatureConversionIdScoreListBenchmark                 4.88us  205.05K
============================================================================
============================================================================
sigrid/lib/features/tests/PyTorchFeatureConversionBenchmark.cpprelative  time/iter  iters/s
============================================================================
PyTorchFeatureConversionDenseBenchmark                       2.52us  396.13K
PyTorchFeatureConversionIdListBenchmark                      3.59us  278.29K
PyTorchFeatureConversionIdScoreListBenchmark                 4.88us  204.94K
============================================================================
============================================================================
sigrid/lib/features/tests/PyTorchFeatureConversionBenchmark.cpprelative  time/iter  iters/s
============================================================================
PyTorchFeatureConversionDenseBenchmark                       2.46us  406.77K
PyTorchFeatureConversionIdListBenchmark                      3.62us  276.17K
PyTorchFeatureConversionIdScoreListBenchmark                 4.92us  203.07K
============================================================================
```

Reviewed By: suo, hlu1

Differential Revision: D32836701

fbshipit-source-id: 6e1c3d81f1b4ee13156320263dac17f5256c1462
2022-01-05 13:49:22 -08:00
b283b1de39 Cleaning code in fbcode/caffe2/c10/core/TensorImpl.h (#70588)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70588

Test Plan: Sandcastle

Reviewed By: meyering

Differential Revision: D33399751

fbshipit-source-id: 3e507973f7a8f58635f3446650e85d0f959254c0
2022-01-05 13:40:59 -08:00
395f853770 Parallelize docker dependency builds (#70866)
Summary:
Those scripts are run on 8 vCPU instances, so passing at least `-j6` makes sense

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70866

Reviewed By: atalman

Differential Revision: D33435083

Pulled By: malfet

fbshipit-source-id: c879ed928da0b77346a92976d2fe9ad92ba01b5e
2022-01-05 13:34:27 -08:00
be298212a6 reduce igamma instantiations (#70666)
Summary:
Don't compile scalar versions of the kernel (there is no scalar overload), and combine the igamma and igammac kernels.
Igamma cubin size goes from 10 MB to 2 MB on V100.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70666

Reviewed By: malfet

Differential Revision: D33431359

Pulled By: ngimel

fbshipit-source-id: 440998f751251be274f40dd035efba08b8969192
2022-01-05 13:06:24 -08:00
6c4437118b Deprecating Python 3.6 (#70493)
Summary:
Deprecate Python 3.6 in the documentation and in CMake.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70493

Reviewed By: suo

Differential Revision: D33433118

Pulled By: atalman

fbshipit-source-id: c3adc7b75714efdb5b6acda5d4cddc068fb4a145
2022-01-05 11:46:32 -08:00
025cd69a86 [AMD] Fix some legacy hipify script (#70594)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70594

Pull Request resolved: https://github.com/facebookincubator/gloo/pull/315

Fix some outdated hipify scripts:
* python -> python3 (fb internal)
* rocblas return code
* gloo makefile for hip clang

Test Plan: Sandcastle + OSS build

Reviewed By: malfet, shintaro-iwasaki

Differential Revision: D33402839

fbshipit-source-id: 5893039451bcf77bbbb1b88d2e46ae3e39caa154
2022-01-05 11:34:25 -08:00
34c49d3d3b Document torch.quantile interpolation kwarg (#70637)
Summary:
clone of https://github.com/pytorch/pytorch/pull/59397

This PR documents the interpolation kwarg parameter added in https://github.com/pytorch/pytorch/issues/49267. Now that the forward compatibility period is over, we can expose this parameter.
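For a sorted 1-D input, the interpolation choices amount to how the two neighbouring values around the fractional index are combined. A minimal pure-Python sketch of the semantics (not the tensor implementation):

```python
import math

def quantile(sorted_vals, q, interpolation="linear"):
    # Fractional index into the sorted data; `lo`/`hi` bracket it.
    pos = q * (len(sorted_vals) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    if interpolation == "lower":
        return sorted_vals[lo]
    if interpolation == "higher":
        return sorted_vals[hi]
    if interpolation == "nearest":
        return sorted_vals[round(pos)]
    if interpolation == "midpoint":
        return (sorted_vals[lo] + sorted_vals[hi]) / 2
    # default: linear interpolation between the two neighbours
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])

vals = [1.0, 2.0, 3.0, 4.0]
print([quantile(vals, 0.5, m) for m in ("linear", "lower", "higher", "midpoint")])
# [2.5, 2.0, 3.0, 2.5]
```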

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70637

Reviewed By: jbschlosser

Differential Revision: D33411707

Pulled By: anjali411

fbshipit-source-id: f5f2d0a6739b3a855bbdf58fc671ac2f0342ce69
2022-01-05 11:02:13 -08:00
616afcf981 [jit] [shape analysis] Move constant tensors out of fused subgraphs during generalization (#70320)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70320

ghstack-source-id: 146514368

Test Plan: `buck test mode/dev-nosan //caffe2/test/cpp/jit:jit`

Reviewed By: eellison

Differential Revision: D33280508

fbshipit-source-id: fe4291d7c49f0a498b330de96b698e99f6f6a505
2022-01-05 10:19:14 -08:00
b60b1b100f Set cuDNN deterministic flag for test_conv_double_backward_cuda (#69941)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/69833

Pull Request resolved: https://github.com/pytorch/pytorch/pull/69941

Reviewed By: george-qi

Differential Revision: D33430727

Pulled By: jbschlosser

fbshipit-source-id: 4a250bd0e5460ee631730afe0ab68ba72f37d292
2022-01-05 10:05:56 -08:00
93c7504438 [PyTorch] Improve StorageImpl::set_data_ptr (#65432)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65432

There is no reason to do an extra write to the input DataPtr (via `std::swap`) before returning a new DataPtr.
ghstack-source-id: 146471376

Test Plan:
Inspected assembly for this function to verify that we are
really getting fewer instructions generated. I don't have a specific
application for this at the moment, but it's clearly better IMO.

Reviewed By: mikeiovine

Differential Revision: D31097807

fbshipit-source-id: 06ff6f5fc675df0f38b0315b4147ed959243b6d0
2022-01-05 09:46:35 -08:00
70d3b2700f [LTC] Fix stride accessors in LTCTensorImpl (#70623)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70623

Strides on a lazy tensor should only be read after calling `setup_size_properties`. This fixes a failure in hf_Longformer.

Test Plan: CI on the lazy_tensor_staging branch

Reviewed By: wconstab, alanwaketan

Differential Revision: D33410142

Pulled By: desertfire

fbshipit-source-id: ccb2ba8d258bdb88f6b51be6196563f9c4c06cbf
2022-01-05 09:31:41 -08:00
6f473c80a5 Enable fx2trt CI test (#70658)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70658

The '--exclude-distributed-test' config was intended to disable the fx2trt test in the normal Docker test suite, but the test is auto-disabled now. Remove the config.

Test Plan:
CI
https://github.com/pytorch/pytorch/actions/runs/1656375648

Reviewed By: houseroad

Differential Revision: D33417803

fbshipit-source-id: 9dfb4cbd6fa9ad18a4be989ee86d1f8a298347f9
2022-01-05 09:28:58 -08:00
4cbe140ec5 Add CI config to test USE_PER_OPERATOR_HEADERS=0 (#69907)
Summary:
The CMake build defaults to `USE_PER_OPERATOR_HEADERS = 1` which
generates extra headers in the `ATen/ops` folder that don't exist
otherwise. In particular, fb-internal builds using buck don't support
these headers and so all includes must be guarded with
`#ifdef AT_PER_OPERATOR_HEADERS`.

This adds a CI run which builds with `USE_PER_OPERATOR_HEADERS = 0` so
open source contributions don't have to wait for their PR to be
imported to find out it doesn't work in fb-internal. This flag
shouldn't affect runtime behavior, though, so I don't run any tests.

cc seemethere malfet pytorch/pytorch-dev-infra

Pull Request resolved: https://github.com/pytorch/pytorch/pull/69907

Reviewed By: malfet, atalman

Differential Revision: D33411864

Pulled By: seemethere

fbshipit-source-id: 18b34d7a83dc81cf8a6c396ba8369e1789f936e9
2022-01-05 09:18:06 -08:00
e1e43c4e71 Prevent sum overflow in broadcast_object_list (#70605)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70605

`broadcast_object_list` cast the sum of all object lengths from long to int, causing overflows.
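The failure mode is a classic 32-bit truncation; it can be reproduced with `ctypes` (the byte count below is chosen to reproduce the negative size in the error message, not taken from a real run):

```python
import ctypes

def as_c_int(n):
    # Mimics the narrowing cast from long to int in the old code.
    return ctypes.c_int32(n).value

total_bytes = 2**31 + 1231  # sum of object lengths just past 2 GiB
print(as_c_int(total_bytes))  # -2147482417: the "negative dimension" size
```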

Test Plan:
Add a Tensor with a >2GB storage requirement (in distributed_test.py) to the object broadcast.

This Tensor is only added if tests are running at Meta, as GitHub tests will OOM.

Without fix the length will overflow and the program will request a negative sized Tensor:
```
RuntimeError: Trying to create tensor with negative dimension -2147482417: [-2147482417]
```
With fix it will pass the test.

Test used on server with GPUs:

buck test  mode/dev-nosan //caffe2/test/distributed:distributed_nccl_spawn --local -- broadcast_object
buck test  mode/dev-nosan //caffe2/test/distributed:distributed_gloo_spawn --local -- broadcast_object

Reviewed By: r-barnes

Differential Revision: D33405741

fbshipit-source-id: 972165f8297b3f5d475636e6127ed4a49adacab1
2022-01-05 09:07:39 -08:00
8ba27c576c Upgrade CI to ROCm4.5.2 (#69886)
Summary:
cc jeffdaily sunway513 jithunnair-amd ROCmSupport KyleCZH

Pull Request resolved: https://github.com/pytorch/pytorch/pull/69886

Reviewed By: albanD, seemethere

Differential Revision: D33429299

Pulled By: malfet

fbshipit-source-id: c3d6d9e45e30d0149b04e59ea255d88bc0e933f2
2022-01-05 08:48:46 -08:00
20489ebdc9 Increase tensor size for mem check tests (#70603)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/70226

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70603

Reviewed By: mruberry

Differential Revision: D33410439

Pulled By: janeyx99

fbshipit-source-id: e94615ece6d0fdf230de5297118678b70f34a18c
2022-01-05 08:27:48 -08:00
1aa98c7540 [docs] multi_head_attention_forward no-batch dim support (#70590)
Summary:
No-batch-dim support was added in https://github.com/pytorch/pytorch/issues/67176

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70590

Reviewed By: VitalyFedyunin

Differential Revision: D33405283

Pulled By: jbschlosser

fbshipit-source-id: 86217d7d540184fd12f3a9096605d2b1e9aa313e
2022-01-05 08:26:25 -08:00
e228b71dae remove unnecessary skips in rsub OpInfo (#69973)
Summary:
Skips are unnecessary as https://github.com/pytorch/pytorch/issues/53797 was fixed

Thanks Lezcano for finding the same.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/69973

Reviewed By: mruberry

Differential Revision: D33161663

Pulled By: anjali411

fbshipit-source-id: 06b75bc5fc0cf90239f17835c07b86b2282ec846
2022-01-05 08:22:38 -08:00
216ae7bc91 [docs] Transformer: no batch dim support doc update (#70597)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70597

Reviewed By: VitalyFedyunin

Differential Revision: D33405284

Pulled By: jbschlosser

fbshipit-source-id: 04f37e8b9798ded7fcedac48629645843a0e3a28
2022-01-05 08:20:51 -08:00
5543b7ce16 Fix docstring for nn.Softplus (#70576)
Summary:
Fixes nn.Softplus' docstring problem reported at https://github.com/pytorch/pytorch/issues/70498.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70576

Reviewed By: VitalyFedyunin

Differential Revision: D33407444

Pulled By: albanD

fbshipit-source-id: 7f1f438afb1a1079d30e0c4741aa609c5204329f
2022-01-05 08:12:15 -08:00
657a7e74ed Fix docstring for nn.Tanh (#70577)
Summary:
Fixes nn.Tanh's docstring problem reported at https://github.com/pytorch/pytorch/issues/70498.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70577

Reviewed By: VitalyFedyunin

Differential Revision: D33408564

Pulled By: albanD

fbshipit-source-id: 2008cb55ef72b4b057d8d68e4505956aaf6cc3fa
2022-01-05 07:56:57 -08:00
adceb13da1 Copy: Avoid extra dispatch in type-mismatch case (#68950)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/68950

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33064447

Pulled By: anjali411

fbshipit-source-id: 82bf4e144c1e629e30226eedc9d26ca63cfb4431
2022-01-05 07:32:47 -08:00
e1aa5db108 Bazel: Only run ATen codegen once (#70147)
Summary:
Due to a merge conflict, the new Bazel CUDA build does something
rather obnoxious. It runs ATen codegen with `--per-operator-headers`
enabled and extracts a subset of the generated files; then it calls the
codegen again without the flag to extract the CUDA files.

This PR instead calls the codegen once but keeps track of what is
CPU and what is CUDA in separate lists.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70147

Reviewed By: VitalyFedyunin

Differential Revision: D33413020

Pulled By: malfet

fbshipit-source-id: 4b502c38a209d1aa63d715e2336df6fc5aac2212
2022-01-05 06:56:52 -08:00
1681323ddc DOC: Merge extraheader block from theme instead of override (#70187)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/70185

The extraheader block in docs/source/_templates/layout.html overrides the one from the pytorch theme. The theme block adds Google Analytics, so they were missing from the `master` documentation. This came up in PR pytorch/pytorch.github.io#899.

brianjo

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70187

Reviewed By: bdhirsh

Differential Revision: D33248466

Pulled By: malfet

fbshipit-source-id: b314916a3f0789b6617cf9ba6bd938bf5ca27242
2022-01-05 06:42:38 -08:00
aea3d3ced7 dbr quant: stop calling eager quant convert (#70247)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70247

Stops calling the eager mode quantization `convert` function
from DBR quant convert, and instead implements the module swaps
manually.  This will make it easier to support quantization types
other than static int8 in future PRs.

Test Plan:
```
python test/test_quantization.py -k DBR
```

Reviewed By: jerryzh168

Differential Revision: D33255924

Pulled By: vkuzo

fbshipit-source-id: afdfd61d71833d987bb38aa4d8c3d214f900c03e
2022-01-05 06:36:44 -08:00
4e90fa6a8c dbr quant: break up test class into multiple classes (#70246)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70246

Breaks up the large `TestQuantizeDBR` test case into
1. `TestQuantizeDBRIndividualOps` for testing functionality of ops
2. `TestQuantizeDBRMultipleOps` for testing non-fusion interactions between ops
3. `TestQuantizeDBR` for everything else

We may need to refactor this more in the future, but this should
unblock things for the near future.

Test Plan:
```
python test/test_quantization.py TestQuantizeDBR
python test/test_quantization.py TestQuantizeDBRIndividualOps
python test/test_quantization.py TestQuantizeDBRMultipleOps
```

Reviewed By: jerryzh168

Differential Revision: D33255925

Pulled By: vkuzo

fbshipit-source-id: 82db1a644867e9303453cfedffed2d81d083c9cd
2022-01-05 06:36:41 -08:00
5b20052857 dbr quant: start recording ops which are not quantizeable (#70200)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70200

Adds the logic to record not just the subgraphs which are quantizeable,
but also the set of ops (not subgraph) which are quantizeable.  This changes
the information recorded during tracing as follows (an example):

```
// before
1. subgraph of conv1 -> conv2
2. no other information about other ops

// after
1. subgraph of conv1 -> conv2
2. set of types of ops which were not quantizeable but were encountered during tracing
```

This has two uses:
1. easier development of DBR quant to cover more ops, as now the ops which are not being quantized are easier to inspect
2. easier understanding for the user of what DBR quant is doing or not doing for a model

Test Plan:
```
python test/test_quantization.py TestQuantizeDBR.test_unsupported_ops_recorded
```

Reviewed By: VitalyFedyunin

Differential Revision: D33240997

Pulled By: vkuzo

fbshipit-source-id: 3168eae286387e6cb01df3ae60dc13620fb784d5
2022-01-05 06:36:38 -08:00
80e685e2c0 dbr quant: start reusing static quant module mappings (#70196)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70196

Deletes the custom DBR static quant module mapping, and reuses
the global ones.

Test coverage for all the ops will be in future PRs.

Test Plan:
```
python test/test_quantization.py TestQuantizeDBR
```

Reviewed By: jerryzh168

Differential Revision: D33240998

Pulled By: vkuzo

fbshipit-source-id: da248b28d7b681794fa0494ff31fd065680f6fef
2022-01-05 06:35:11 -08:00
45f5a3ceb8 Fix generating files for Vulkan on Windows (#69696)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69696

Using `find` is not portable, as it won't be available on Windows, for example. We can instead use `glob` with the recursive option added in Python 3.5.
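The replacement pattern looks roughly like this (directory layout and file names are made up for the demo; the real script globs the Vulkan shader sources):

```python
import glob
import os
import tempfile

# glob with recursive=True (Python >= 3.5) replaces a shell `find`:
# `**` matches zero or more intermediate directories.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "src", "shaders"))
    for name in ("top.glsl",
                 os.path.join("src", "shaders", "deep.glsl"),
                 os.path.join("src", "notes.txt")):
        open(os.path.join(root, name), "w").close()
    matches = glob.glob(os.path.join(root, "**", "*.glsl"), recursive=True)
    rel = sorted(os.path.relpath(m, root) for m in matches)
print(rel)  # both .glsl files found, notes.txt skipped
```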

Test Plan: CircleCI

Reviewed By: xta0

Differential Revision: D32994229

fbshipit-source-id: 4a755c4313300142c051f533d0b3876dc9035da0
2022-01-05 05:32:13 -08:00
c468e35d83 [caffe2] don't use __FUNCSIG__ when building for Windows with clang (#70561)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70561

When building with strict(er) compiler warnings on Windows, clang complains that `__FUNCSIG__` is a proprietary language extension. When using clang, it seems we can use `__PRETTY_FUNCTION__` instead, like we do on other platforms. This is also in line with the logic on L100:127.

Test Plan: CI

Reviewed By: kalman5

Differential Revision: D33386400

fbshipit-source-id: d45afa92448042ddcd1f68adc7a9ef4643276b31
2022-01-04 23:44:56 -08:00
12653be434 [PyTorch] Optimize no input NVTX collection (#70133)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70133

We were creating a `stringstream` plus string concatenations via `getNvtxStr` even when there were no inputs, wasting precious time. This diff avoids the `stringstream` when there is no input, squeezing out performance: a 60% reduction in overhead.

Test Plan:
Before
```
I1214 22:48:07.964118 2971180 bench.cpp:154] Mean 0.970494
I1214 22:48:07.964139 2971180 bench.cpp:155] Median 0.969054
I1214 22:48:07.964144 2971180 bench.cpp:156] Min 0.962247
I1214 22:48:07.964148 2971180 bench.cpp:157] stddev 0.00774841
I1214 22:48:07.964154 2971180 bench.cpp:158] stddev / mean 0.00798398
```

After
```
I1214 22:59:00.039872 3437853 bench.cpp:154] Mean 0.384333
I1214 22:59:00.039896 3437853 bench.cpp:155] Median 0.384886
I1214 22:59:00.039899 3437853 bench.cpp:156] Min 0.370235
I1214 22:59:00.039902 3437853 bench.cpp:157] stddev 0.00435907
I1214 22:59:00.039907 3437853 bench.cpp:158] stddev / mean 0.0113419
```

Reviewed By: aaronenyeshi, robieta

Differential Revision: D33137501

fbshipit-source-id: ce0e8cf9aef7ea22fd8aed927e76be4ca375efc3
2022-01-04 23:40:22 -08:00
44283c2766 NNAPI: Add qint16 support via int16 (#70621)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70621

PyTorch doesn't have support for qint16 yet. Add an option to handle qint16 via the int16 & qint32 data types.

* For qint16 tensors in NNAPI, the user sends a qint32 tensor. We convert the qint32 to int16 for the converter and set the zero point and scale for nnapi
    * inputs to the model have to have fixed scale and zero point and are only supported for testing
* Added a flag use_int16_for_qint16 which will be used to maintain backwards compatibility in the converter when true qint16 is supported in PyTorch
ghstack-source-id: 146507483

Test Plan: pytest test/test_nnapi.py

Reviewed By: dreiss

Differential Revision: D33285124

fbshipit-source-id: b6376fa1bb18a0b9f6a18c545f600222b650cb66
2022-01-04 23:12:38 -08:00
10b40acbdb [PyTorch][Static Runtime] Fast aliasing in select_tensor by manual borrowing (#68122)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68122

See code comments for details; in brief, we repurpose support
for borrowing `Tensor`s in `MaybeOwned` to make the `select_tensor`
output a borrowed IValue that we have to clean up manually.

If we have any other ops that always create a new reference to an
existing Tensor, we can easily apply this same optimization.
ghstack-source-id: 146482212

Test Plan:
See perf measurements on ctr_mobile_feed local_ro net for this stack: P467203421
(local is neutral: P467267554)

--do_profile output for local_ro (updated Dec 10):

```
swolchok@devbig032 /d/u/s/f/fbcode> tail Stable.profile.txt
First iter time: 0.989023 ms
Number of operators: 2037
Total number of managed tensors: 1597
Total number of managed output tensors: 0
Total number of unmanaged values: 2568
Number of unmanaged values requiring cleanup: 2568
Number of unmanaged values not requiring cleanup: 0
Total memory managed: 50368 bytes
Total number of reused tensors: 1010
Total number of 'out' variant nodes/total number of nodes: 2001/2037 (98.2327%)
swolchok@devbig032 /d/u/s/f/fbcode> tail TMCOFastAliasing.profile.txt
First iter time: 0.994703 ms
Number of operators: 2551
Total number of managed tensors: 1146
Total number of managed output tensors: 0
Total number of unmanaged values: 4047
Number of unmanaged values requiring cleanup: 3533
Number of unmanaged values not requiring cleanup: 514
Total memory managed: 50048 bytes
Total number of reused tensors: 559
Total number of 'out' variant nodes/total number of nodes: 2001/2551 (78.4398%)
```

for local: (also Dec 10):

```
==> Stable.local.profile.txt <==
First iter time: 9.0909 ms
Number of operators: 1766
Total number of managed tensors: 1894
Total number of managed output tensors: 0
Total number of unmanaged values: 2014
Number of unmanaged values requiring cleanup: 2014
Number of unmanaged values not requiring cleanup: 0
Total memory managed: 4541440 bytes
Total number of reused tensors: 847
Total number of 'out' variant nodes/total number of nodes: 1744/1766 (98.7542%)

==> TMCOFastAliasing.local.profile.txt <==
First iter time: 7.5512 ms
Number of operators: 2378
Total number of managed tensors: 1629
Total number of managed output tensors: 0
Total number of unmanaged values: 3503
Number of unmanaged values requiring cleanup: 2891
Number of unmanaged values not requiring cleanup: 612
Total memory managed: 3949312 bytes
Total number of reused tensors: 586
Total number of 'out' variant nodes/total number of nodes: 1744/2378 (73.3389%)
```

Reviewed By: hlu1

Differential Revision: D32318674

fbshipit-source-id: a2d781105936fda2a3436d32ea22a196f82dc783
2022-01-04 22:36:13 -08:00
4d8fc8693c [PyTorch][Static Runtime] Support memory planning for torch.to() w/o requiring copying (#67223)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67223

ghstack-source-id: 146482215

Test Plan:
See perf measurements on ctr_mobile_feed local_ro net for this stack: P467203421
(local is neutral: P467267554)

Reviewed By: hlu1

Differential Revision: D31776259

fbshipit-source-id: f84fcaa05029577213f3bf2ae9d4b987b68480b3
2022-01-04 22:36:10 -08:00
1507ce90b2 [PyTorch][Static Runtime] Avoid managed output tensor DCHECK (#67221)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67221

Update memory leak checks to not require that output tensors are cleaned up.
ghstack-source-id: 146464297

Test Plan: Tests should still pass; reviewers should confirm that this is OK in principle.

Reviewed By: d1jang

Differential Revision: D31847567

fbshipit-source-id: bb7ff2f2ed701e2d7de07d8032a1281fccabd6a9
2022-01-04 22:36:07 -08:00
99a10c371f [PyTorch][Static Runtime] Fix dtype changing between iterations for to() (#67394)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67394

ghstack-source-id: 146464294

Test Plan:
Added new test, which failed but now passes.

Checked perf on ctr_mobile_feed local net (still not on recordio inputs yet), looks neutral

```
Stable, local
========================================

I1027 13:40:23.411118 2156917 PyTorchPredictorBenchLib.cpp:131] PyTorch predictor: number of prediction threads 1
I1027 13:40:48.708222 2156917 PyTorchPredictorBenchLib.cpp:249] PyTorch run finished. Milliseconds per iter: 6.16975. Iters per second: 162.081
I1027 13:41:13.915948 2156917 PyTorchPredictorBenchLib.cpp:249] PyTorch run finished. Milliseconds per iter: 6.1487. Iters per second: 162.636
I1027 13:41:38.984462 2156917 PyTorchPredictorBenchLib.cpp:249] PyTorch run finished. Milliseconds per iter: 6.11408. Iters per second: 163.557
I1027 13:42:04.138948 2156917 PyTorchPredictorBenchLib.cpp:249] PyTorch run finished. Milliseconds per iter: 6.13566. Iters per second: 162.982
I1027 13:42:29.342630 2156917 PyTorchPredictorBenchLib.cpp:249] PyTorch run finished. Milliseconds per iter: 6.14269. Iters per second: 162.795
I1027 13:42:29.342669 2156917 PyTorchPredictorBenchLib.cpp:264] Mean milliseconds per iter: 6.14218, standard deviation: 0.0202164
0

FixToDtypeChanges, local
========================================
I1027 13:44:59.632668 2176333 PyTorchPredictorBenchLib.cpp:249] PyTorch run finished. Milliseconds per iter: 6.11023. Iters per second: 163.66
I1027 13:45:24.894635 2176333 PyTorchPredictorBenchLib.cpp:249] PyTorch run finished. Milliseconds per iter: 6.16308. Iters per second: 162.257
I1027 13:45:50.275280 2176333 PyTorchPredictorBenchLib.cpp:249] PyTorch run finished. Milliseconds per iter: 6.17868. Iters per second: 161.847
I1027 13:46:15.637431 2176333 PyTorchPredictorBenchLib.cpp:249] PyTorch run finished. Milliseconds per iter: 6.18688. Iters per second: 161.632
I1027 13:46:40.670816 2176333 PyTorchPredictorBenchLib.cpp:249] PyTorch run finished. Milliseconds per iter: 6.10549. Iters per second: 163.787
I1027 13:46:40.670863 2176333 PyTorchPredictorBenchLib.cpp:264] Mean milliseconds per iter: 6.14887, standard deviation: 0.03843706
```

Reviewed By: hlu1

Differential Revision: D31972722

fbshipit-source-id: 7a445b325a29020b31dd2bd61e4171ecc2793b15
2022-01-04 22:34:49 -08:00
ab7d0df449 Support cloning CSR tensors (#70581)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70581

cc nikitaved pearu cpuhrsch

Test Plan: Imported from OSS

Reviewed By: jbschlosser

Differential Revision: D33413992

Pulled By: cpuhrsch

fbshipit-source-id: 3a576d2c2f26d1edcc8f6932b2dbe2c7c11e9593
2022-01-04 21:41:18 -08:00
d1dbcb1780 Change to use current LLVM APIs (#70625)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70625

In llvm-13, deprecated APIs were removed. These APIs were just wrappers around APIs present in llvm-9+; changed to use the underlying APIs.

Test Plan: buck build mode/opt-clang-thinlto -j 70 unicorn/topaggr:top_aggregator_server -c unicorn.hfsort="1" -c cxx.extra_cxxflags="-Wforce-no-error -fbracket-depth=300" -c cxx.profile="fbcode//fdo/autofdo/unicorn/topaggr/top_aggregator_server:autofdo" -c cxx.modules=False

Reviewed By: WenleiHe

Differential Revision: D33169593

fbshipit-source-id: c8923991b351a893ef8f6c0d01858149b63c0d33
2022-01-04 20:25:58 -08:00
f8eaebc978 Avoid adding torch::deploy interpreter library to the data section (#70208)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70208

Create a custom section ".embedded_interpreter" to store the interpreter instead of placing it in .data. This increases the amount of memory available to the other sections of the executable (such as .text/.data/.bss) by 33% (1.5GB -> 2.0GB), removes the memory limitations on the interpreter, and pays down tech debt.

Test Plan:
buck test mode/opt //caffe2/torch/csrc/deploy:test_deploy
readelf -S ~/fbcode/buck-out/gen/caffe2/torch/csrc/deploy/test_deploy
check the size of the .data section
Apply the fix and check the size of the .data section again. It should be reduced by the size of the interpreter.so

The output of `readelf -S ~/fbcode/buck-out/gen/caffe2/torch/csrc/deploy/test_deploy` is as follows. The .data section is now 0.0015415GB and the .torch_deploy_payXXX section is 0.605125GB

```
(pytorch) [sahanp@devvm4333.vll0 ~/local/fbsource/fbcode] readelf -S buck-out/gen/caffe2/torch/csrc/deploy/test_deploy
There are 55 section headers, starting at offset 0x24bac82b0:

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .interp           PROGBITS         0000000000200350  00000350
       0000000000000028  0000000000000000   A       0     0     1
  [ 2] .note.ABI-tag     NOTE             0000000000200378  00000378
       0000000000000020  0000000000000000   A       0     0     4
  [ 3] .note.gnu.build-i NOTE             0000000000200398  00000398
       0000000000000024  0000000000000000   A       0     0     4
  [ 4] .dynsym           DYNSYM           00000000002003c0  000003c0
       0000000000d07a48  0000000000000018   A       9     1     8
  [ 5] .gnu.version      VERSYM           0000000000f07e08  00d07e08
       0000000000115f86  0000000000000002   A       4     0     2
  [ 6] .gnu.version_r    VERNEED          000000000101dd90  00e1dd90
       0000000000000510  0000000000000000   A       9    15     4
  [ 7] .gnu.hash         GNU_HASH         000000000101e2a0  00e1e2a0
       00000000003b4fb0  0000000000000000   A       4     0     8
  [ 8] .hash             HASH             00000000013d3250  011d3250
       0000000000457e20  0000000000000004   A       4     0     4
  [ 9] .dynstr           STRTAB           000000000182b070  0162b070
       0000000004ef205a  0000000000000000   A       0     0     1
  [10] .rela.dyn         RELA             000000000671d0d0  0651d0d0
       0000000000110b80  0000000000000018   A       4     0     8
  [11] .rela.plt         RELA             000000000682dc50  0662dc50
       00000000000093f0  0000000000000018   A       4    35     8
  [12] .rodata           PROGBITS         0000000006837040  06637040
       00000000034067a8  0000000000000000 AMS       0     0     64
  [13] fb_build_info     PROGBITS         0000000009c3d7f0  09a3d7f0
       00000000000002ee  0000000000000000   A       0     0     16
  [14] .gcc_except_table PROGBITS         0000000009c3dae0  09a3dae0
       00000000014a9340  0000000000000000   A       0     0     4
  [15] .eh_frame_hdr     PROGBITS         000000000b0e6e20  0aee6e20
       00000000004abf54  0000000000000000   A       0     0     4
  [16] .eh_frame         PROGBITS         000000000b592d78  0b392d78
       000000000200e344  0000000000000000   A       0     0     8
  [17] .text             PROGBITS         000000000d5a2000  0d3a2000
       000000001e55944e  0000000000000000  AX       0     0     256
  [18] .init             PROGBITS         000000002bafb450  2b8fb450
       0000000000000017  0000000000000000  AX       0     0     4
  [19] .fini             PROGBITS         000000002bafb468  2b8fb468
       0000000000000009  0000000000000000  AX       0     0     4
  [20] .never_hugify     PROGBITS         000000002bafb480  2b8fb480
       0000000000000db3  0000000000000000  AX       0     0     16
  [21] text_env          PROGBITS         000000002bafc240  2b8fc240
       0000000000002e28  0000000000000000  AX       0     0     16
  [22] .plt              PROGBITS         000000002baff070  2b8ff070
       00000000000062b0  0000000000000000  AX       0     0     16
  [23] .tdata            PROGBITS         000000002bb06000  2b906000
       0000000000000b20  0000000000000000 WAT       0     0     8
  [24] .tbss             NOBITS           000000002bb06b40  2b906b20
       0000000000007cb8  0000000000000000 WAT       0     0     64
  [25] .fini_array       FINI_ARRAY       000000002bb06b20  2b906b20
       0000000000000028  0000000000000000  WA       0     0     8
  [26] .init_array       INIT_ARRAY       000000002bb06b48  2b906b48
       0000000000008878  0000000000000000  WA       0     0     8
  [27] .data.rel.ro      PROGBITS         000000002bb0f3c0  2b90f3c0
       0000000000029ce0  0000000000000000  WA       0     0     64
  [28] .ctors            PROGBITS         000000002bb390a0  2b9390a0
       0000000000000010  0000000000000000  WA       0     0     8
  [29] .dynamic          DYNAMIC          000000002bb390b0  2b9390b0
       0000000000000340  0000000000000010  WA       9     0     8
  [30] .got              PROGBITS         000000002bb393f0  2b9393f0
       000000000001f040  0000000000000000  WA       0     0     8
  [31] .bss.rel.ro       NOBITS           000000002bb58440  2b958430
       0000000000000c40  0000000000000000  WA       0     0     32
  [32] .data             PROGBITS         000000002bb5a000  2b959000
       0000000000194188  0000000000000000  WA       0     0     4096
  [33] .tm_clone_table   PROGBITS         000000002bcee188  2baed188
       0000000000000000  0000000000000000  WA       0     0     8
  [34] .probes           PROGBITS         000000002bcee188  2baed188
       0000000000000002  0000000000000000  WA       0     0     2
  [35] .got.plt          PROGBITS         000000002bcee190  2baed190
       0000000000003168  0000000000000000  WA       0     0     8
  [36] .bss              NOBITS           000000002bcf1300  2baf02f8
       00000000005214f0  0000000000000000  WA       0     0     128
  [37] .nvFatBinSegment  PROGBITS         000000002c213000  2baf1000
       0000000000002850  0000000000000000   A       0     0     8
  [38] .nv_fatbin        PROGBITS         000000002c216000  2baf4000
       0000000052baed38  0000000000000000  WA       0     0     8
  [39] .comment          PROGBITS         0000000000000000  7e6a2d38
       00000000000001dc  0000000000000000  MS       0     0     1
  [40] .debug_aranges    PROGBITS         0000000000000000  7e6a2f20
       0000000001266c00  0000000000000000           0     0     16
  [41] .debug_info       PROGBITS         0000000000000000  7f909b20
       000000007b21de49  0000000000000000           0     0     1
  [42] .debug_abbrev     PROGBITS         0000000000000000  fab27969
       000000000179f365  0000000000000000           0     0     1
  [43] .debug_line       PROGBITS         0000000000000000  fc2c6cce
       00000000176954ac  0000000000000000           0     0     1
  [44] .debug_str        PROGBITS         0000000000000000  11395c17a
       0000000039dc32b0  0000000000000001  MS       0     0     1
  [45] .debug_ranges     PROGBITS         0000000000000000  14d71f430
       0000000026a2d930  0000000000000000           0     0     16
  [46] .debug_types      PROGBITS         0000000000000000  17414cd60
       000000000b211ff5  0000000000000000           0     0     1
  [47] .debug_loc        PROGBITS         0000000000000000  17f35ed55
       000000009ca80c7e  0000000000000000           0     0     1
  [48] .debug_macinfo    PROGBITS         0000000000000000  21bddf9d3
       000000000000151c  0000000000000000           0     0     1
  [49] .note.stapsdt     NOTE             0000000000000000  21bde0ef0
       0000000000001b3c  0000000000000000           0     0     4
  [50] .debug_macro      PROGBITS         0000000000000000  21bde2a2c
       0000000000040e6a  0000000000000000           0     0     1
  [51] .torch_deploy_pay PROGBITS         0000000000000000  21be23896
       0000000026ba5d28  0000000000000000           0     0     1
  [52] .symtab           SYMTAB           0000000000000000  2429c95c0
       00000000020ce0c8  0000000000000018          54   863985     8
  [53] .shstrtab         STRTAB           0000000000000000  244a97688
       000000000000025c  0000000000000000           0     0     1
  [54] .strtab           STRTAB           0000000000000000  244a978e4
       00000000070309c6  0000000000000000           0     0     1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  l (large), p (processor specific)
```

Reviewed By: shunting314

Differential Revision: D33243703

fbshipit-source-id: 09a798113766c716297458cea7a74f074268dc82
2022-01-04 19:57:06 -08:00
2292520bdc Fix genSparseCSRTensor: generate non-trivial values for uint8 dtype. (#70580)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70580

cc nikitaved pearu cpuhrsch

Test Plan: Imported from OSS

Reviewed By: jbschlosser

Differential Revision: D33413597

Pulled By: cpuhrsch

fbshipit-source-id: 313b08e1bd96ffb8d5c7a0fda9384502325e5d08
2022-01-04 18:02:36 -08:00
29ff596dca [CUDA graphs] Changes batchnorm to increment num_batches_tracked in place for improved graph safety (#70444)
Summary:
This PR was not my worst debugging annoyance, nor my smallest in lines changed, but it has the highest `debugging annoyance/lines changed` ratio.

The current pattern
```
self.num_batches_tracked = self.num_batches_tracked + 1
```
, if captured, deletes an eagerly-allocated tensor and overwrites it with a captured tensor. Replays read from the (deallocated) original tensor's address.
This can cause
1. an IMA on graph replay
2. failure to actually increment `num_batches_tracked` during graph replay, because every replay reads from the old location without adding to it
3. numerical corruption if the allocator reassigns the original tensor's memory to some unrelated tensor
4. combinations of 1, 2, and 3, depending on global allocation patterns and if/when the BN module is called eagerly sometimes between replays

(ask me how I know).
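The difference the fix makes can be sketched without torch at all; the `Counter` class below is a hypothetical stand-in for a 0-dim tensor, used only to illustrate why in-place mutation keeps the captured address valid while reassignment orphans it:

```python
# Minimal sketch (hypothetical stand-in, no torch dependency): a CUDA graph
# capture records the address of num_batches_tracked; an in-place update keeps
# that address live, while reassignment leaves the capture pointing at a stale
# (freed) object.

class Counter:
    """Stand-in for a 0-dim tensor holding num_batches_tracked."""
    def __init__(self, value=0):
        self.value = value

    def add_(self, n):  # in-place, like Tensor.add_
        self.value += n
        return self

def step_out_of_place(module):
    # old pattern: allocates a new object; the originally captured one goes stale
    module["num_batches_tracked"] = Counter(module["num_batches_tracked"].value + 1)

def step_in_place(module):
    # new pattern: the same object (same address) is updated
    module["num_batches_tracked"].add_(1)

bn = {"num_batches_tracked": Counter(0)}
captured = bn["num_batches_tracked"]       # what a graph capture would record
step_out_of_place(bn)
assert captured is not bn["num_batches_tracked"]  # replay would read a stale address

bn = {"num_batches_tracked": Counter(0)}
captured = bn["num_batches_tracked"]
step_in_place(bn)
assert captured is bn["num_batches_tracked"] and captured.value == 1
```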

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70444

Reviewed By: albanD

Differential Revision: D33342203

Pulled By: ngimel

fbshipit-source-id: 5f201cc25030517e75af010bbaa88c452155df21
2022-01-04 17:06:46 -08:00
14457bb8cb Remove backward op for slow 3d transposed convolution (#69933)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69933

Test Plan: Imported from OSS

Reviewed By: bdhirsh

Differential Revision: D33131343

Pulled By: jbschlosser

fbshipit-source-id: 4300c66f0f4811c949f82c62d17c7b5200cd15a3
2022-01-04 16:55:43 -08:00
1adb70c6f0 Revert D33409880: [pytorch][PR] Deprecating Python 3.6
Test Plan: revert-hammer

Differential Revision:
D33409880 (d95be99561)

Original commit changeset: 4f9123398960

Original Phabricator Diff: D33409880 (d95be99561)

fbshipit-source-id: 32dc1c3c07ef99a04fab7d0fb742cf4e6c4b718a
2022-01-04 16:37:09 -08:00
8369a46417 [maskrcnn] use stable sort in mask rcnn caffe2 ops (#70510)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70510

Pull Request resolved: https://github.com/facebookresearch/detectron2/pull/3838

Pull Request resolved: https://github.com/facebookresearch/mobile-vision/pull/58

Pull Request resolved: https://github.com/fairinternal/detectron2/pull/567

D32694315 changes the implementation of sorting in NMS to stable sort, while the C2 operators use a non-stable sort. This causes test failures such as:
- mobile-vision/d2go/tests:fb_test_meta_arch_rcnn - test_export_caffe2 (d2go.tests.fb.test_meta_arch_rcnn.TestFBNetV2MaskRCNNFP32) (architecture: x86_64, buildmode: dev-nosan, buildsystem: buck, compiler: clang, sanitizer: none) https://www.internalfb.com/intern/testinfra/diagnostics/7318349463675961.562949999530318.1640814509/
- mobile-vision/d2go/tests:fb_test_meta_arch_rcnn - test_export_torchscript_mobile_c2_ops (d2go.tests.fb.test_meta_arch_rcnn.TestFBNetV2MaskRCNNFP32) (architecture: x86_64, buildmode: dev-nosan, buildsystem: buck, compiler: clang, sanitizer: none) https://www.internalfb.com/intern/testinfra/diagnostics/7318349463675961.844424980844724.1640814504/

To illustrate, in the failed test_export_caffe2 test, the inputs of BoxWithNMSLimit are:
```
(Pdb) ws.FetchBlob("246")
array([[0.01234568, 0.01234568, 0.01234568, 0.01234568, 0.01234568,
        0.01234568, 0.01234568, 0.01234568, 0.01234568, 0.01234568,
        0.01234568, 0.01234568, 0.01234568, 0.01234568, 0.01234568,
        0.01234568, 0.01234568, 0.01234568, 0.01234568, 0.01234568,
        0.01234568, 0.01234568, 0.01234568, 0.01234568, 0.01234568,
        0.01234568, 0.01234568, 0.01234568, 0.01234568, 0.01234568,
        0.01234568, 0.01234568, 0.01234568, 0.01234568, 0.01234568,
        0.01234568, 0.01234568, 0.01234568, 0.01234568, 0.01234568,
        0.01234568, 0.01234568, 0.01234568, 0.01234568, 0.01234568,
        0.01234568, 0.01234568, 0.01234568, 0.01234568, 0.01234568,
        0.01234568, 0.01234568, 0.01234568, 0.01234568, 0.01234568,
        0.01234568, 0.01234568, 0.01234568, 0.01234568, 0.01234568,
        0.01234568, 0.01234568, 0.01234568, 0.01234568, 0.01234568,
        0.01234568, 0.01234568, 0.01234568, 0.01234568, 0.01234568,
        0.01234568, 0.01234568, 0.01234568, 0.01234568, 0.01234568,
        0.01234568, 0.01234568, 0.01234568, 0.01234568, 0.01234568,
        0.01234568]], dtype=float32)
(Pdb) ws.FetchBlob("248")
array([[ 0.,  0., 10., 20.,  0.,  0., 10., 20.,  0.,  0., 10., 20.,  0.,
         0., 10., 20.,  0.,  0., 10., 20.,  0.,  0., 10., 20.,  0.,  0.,
        10., 20.,  0.,  0., 10., 20.,  0.,  0., 10., 20.,  0.,  0., 10.,
        20.,  0.,  0., 10., 20.,  0.,  0., 10., 20.,  0.,  0., 10., 20.,
         0.,  0., 10., 20.,  0.,  0., 10., 20.,  0.,  0., 10., 20.,  0.,
         0., 10., 20.,  0.,  0., 10., 20.,  0.,  0., 10., 20.,  0.,  0.,
        10., 20.,  0.,  0., 10., 20.,  0.,  0., 10., 20.,  0.,  0., 10.,
        20.,  0.,  0., 10., 20.,  0.,  0., 10., 20.,  0.,  0., 10., 20.,
         0.,  0., 10., 20.,  0.,  0., 10., 20.,  0.,  0., 10., 20.,  0.,
         0., 10., 20.,  0.,  0., 10., 20.,  0.,  0., 10., 20.,  0.,  0.,
        10., 20.,  0.,  0., 10., 20.,  0.,  0., 10., 20.,  0.,  0., 10.,
        20.,  0.,  0., 10., 20.,  0.,  0., 10., 20.,  0.,  0., 10., 20.,
         0.,  0., 10., 20.,  0.,  0., 10., 20.,  0.,  0., 10., 20.,  0.,
         0., 10., 20.,  0.,  0., 10., 20.,  0.,  0., 10., 20.,  0.,  0.,
        10., 20.,  0.,  0., 10., 20.,  0.,  0., 10., 20.,  0.,  0., 10.,
        20.,  0.,  0., 10., 20.,  0.,  0., 10., 20.,  0.,  0., 10., 20.,
         0.,  0., 10., 20.,  0.,  0., 10., 20.,  0.,  0., 10., 20.,  0.,
         0., 10., 20.,  0.,  0., 10., 20.,  0.,  0., 10., 20.,  0.,  0.,
        10., 20.,  0.,  0., 10., 20.,  0.,  0., 10., 20.,  0.,  0., 10.,
        20.,  0.,  0., 10., 20.,  0.,  0., 10., 20.,  0.,  0., 10., 20.,
         0.,  0., 10., 20.,  0.,  0., 10., 20.,  0.,  0., 10., 20.,  0.,
         0., 10., 20.,  0.,  0., 10., 20.,  0.,  0., 10., 20.,  0.,  0.,
        10., 20.,  0.,  0., 10., 20.,  0.,  0., 10., 20.,  0.,  0., 10.,
        20.,  0.,  0., 10., 20.,  0.,  0., 10., 20.,  0.,  0., 10., 20.,
         0.,  0., 10., 20.,  0.,  0., 10., 20.]], dtype=float32)
(Pdb) ws.FetchBlob("249")
array([1.], dtype=float32)
```
This contains 81 boxes (representing 81 classes) with equal scores; the stable sort returns class id 0, while the non-stable sort returns class id 50.

This diff changes the sorting to stable sort for BoxWithNMSLimit op.
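The tie-breaking behavior can be illustrated outside the C2 kernel; this is a hedged Python sketch (Python's `sorted` is guaranteed stable), not the BoxWithNMSLimit implementation:

```python
# With 81 equal scores, a stable sort by descending score preserves the
# original class order, so class id 0 wins the tie deterministically; a
# non-stable sort may surface any of the tied classes (e.g. 50).
scores = [0.01234568] * 81
order = sorted(range(len(scores)), key=lambda i: -scores[i])  # stable in Python
assert order[0] == 0               # ties keep their original order
assert order == list(range(81))    # the whole permutation is the identity
```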

Test Plan:
The D2Go (401a6b682b) tests pass after this change.
```
buck test mode/dev-nosan //mobile-vision/d2go/tests:fb_test_meta_arch_rcnn -- --run-disabled
```
https://www.internalfb.com/intern/testinfra/testrun/4785074687594820

Reviewed By: newstzpz

Differential Revision: D33355251

fbshipit-source-id: 9f3fc230b852a5e43f0e3cb8fa9093cbaf53e8b6
2022-01-04 16:33:10 -08:00
b16b444828 don't unsqueeze every stack arg if possible (#70288)
Summary:
Fixes T98738497
Use `cat` and `view` if possible, instead of unsqueezing every arg. Helps perf when there are a lot of small arguments to `stack`.
Benchmark:
```
import torch
from torch.utils.benchmark import Timer

inputs =  [torch.randn([1, 128]) for _ in range(500)]
out = torch.empty(1,500,128)
def stack_cat(inputs):
    cat_result = torch.concat(inputs, dim=1)
    return cat_result.view( [1, 500, 128])

timer_stack = Timer(stmt="torch.stack(inputs, dim=1)", globals=globals())
timer_cat = Timer(stmt="stack_cat(inputs)", globals=globals())
print("stack ", timer_stack.blocked_autorange().median)
print("cat ", timer_cat.blocked_autorange().median)
```
Before:
```
stack  0.00023390522226691247
cat  7.437262553721667e-05
```
After
```
stack  7.397504318505526e-05
cat  7.37407322973013e-05
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70288

Reviewed By: robieta, mruberry

Differential Revision: D33289789

Pulled By: ngimel

fbshipit-source-id: b57dcb8ec66e767f552c08deeba330f31ae6c3d0
2022-01-04 16:07:30 -08:00
f8f96d4858 Copy: Re-use existing neg and conj kernel implementations (#68949)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68949

This reuses the existing `neg_kernel` and `conj_kernel`
implementations for copy, saving some binary size and compile time.

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D33064390

Pulled By: anjali411

fbshipit-source-id: eb0ee94ed3db44ae828ea078ba616365f97a7ff5
2022-01-04 15:30:31 -08:00
95a1952633 add SparseXPU to dispatch key set autogradother_backends (#70443)
Summary:
According to the dispatch table computation logic, if no kernel is registered
to a certain dispatch key, the CompositeExplicitAutograd backend kernel is
used, so we need to add the SparseXPU key to the alias key pool.
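A rough, heavily simplified sketch of the fallback logic described above (the names and dict-based lookup here are illustrative, not PyTorch's actual dispatcher):

```python
# Hypothetical model of dispatch-table computation: a key with no directly
# registered kernel falls back to the CompositeExplicitAutograd kernel, which
# is why SparseXPU must be listed in the alias key pool to pick it up.
registered = {
    "CPU": "cpu_kernel",
    "CompositeExplicitAutograd": "composite_kernel",
}
autogradother_backends = {"SparseCPU", "SparseCUDA", "SparseXPU"}  # alias key pool

def compute_kernel(dispatch_key):
    if dispatch_key in registered:
        return registered[dispatch_key]
    if dispatch_key in autogradother_backends:
        return registered["CompositeExplicitAutograd"]  # fallback
    raise KeyError(dispatch_key)

assert compute_kernel("SparseXPU") == "composite_kernel"
assert compute_kernel("CPU") == "cpu_kernel"
```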

Signed-off-by: Ma, Jing1 <jing1.ma@intel.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70443

Reviewed By: jbschlosser

Differential Revision: D33406004

Pulled By: bdhirsh

fbshipit-source-id: 009037739c818676901b10465632d3fef5ba14f2
2022-01-04 15:16:46 -08:00
a60adc7f8a fractional_max_pool2d_backward: port to structured kernel (#68245)
Summary:
Ported to structured kernel the fractional_max_pool2d_backward.

Ref https://github.com/pytorch/pytorch/issues/55070

Pull Request resolved: https://github.com/pytorch/pytorch/pull/68245

Reviewed By: jbschlosser

Differential Revision: D33405521

Pulled By: bdhirsh

fbshipit-source-id: 4930e870d4025485317208df751bc3721ecdb7eb
2022-01-04 15:15:29 -08:00
7e58b1dd7b Sets device guard in _cudnn_impl functions (#70406)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/70404

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70406

Reviewed By: mruberry

Differential Revision: D33407972

Pulled By: ngimel

fbshipit-source-id: 6bf97602ea13f8eaaff95d9f412a2eeaa0e6ba10
2022-01-04 15:11:17 -08:00
6089a0f14a Extend checkout for pytorch/builder (#70644)
Summary:
https://www.torch-ci.com/minihud shows 2 recent failures due to the checkout timing out. Increasing the timeout to 30m to see if this alleviates it.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70644

Reviewed By: suo, malfet, seemethere, atalman

Differential Revision: D33413604

Pulled By: janeyx99

fbshipit-source-id: 756a7ad94c589e39b8567acbfc3e769dc0b9113f
2022-01-04 14:55:47 -08:00
7b8c43cd7c Revert "Revert D32498570: make codegen'd device guards not cuda-specific. Allow them to be used in external codegen" (#69951)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69951

This reverts commit 0ef523633fddf2d63e97d5028b00af10ff344561.

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33113543

Pulled By: bdhirsh

fbshipit-source-id: b28073ee0870b413ea9f617f27671ae5c6f3c696
2022-01-04 14:53:21 -08:00
bb5b4cceb6 Revert "Revert D32498569: allow external backend codegen to toggle whether to generate out= and inplace kernels" (#69950)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69950

This reverts commit f6cad53443704dfe5a20cc62bee14d91e3bffcaa.

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33113545

Pulled By: bdhirsh

fbshipit-source-id: d6590294662588d36c09662dea65919ad4e1e288
2022-01-04 14:52:00 -08:00
d95be99561 Deprecating Python 3.6 (#70493)
Summary:
Deprecating python 3.6 from documentation and from cmake

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70493

Reviewed By: malfet

Differential Revision: D33409880

Pulled By: atalman

fbshipit-source-id: 4f912339896096be95b344724a4d9ae88cdf1a8f
2022-01-04 14:41:27 -08:00
4d08db0cb2 Flaky tests reporting: use GITHUB_RUN_ID instead of concatenated value (#70604)
Summary:
I did not realize the WORKFLOW_ID variable in our GHA scripts concatenated RUN_ID and RUN_NUMBER.

For flaky tests collection, we should be only using RUN_ID, which makes it easier for us to write queries on the data

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70604

Reviewed By: suo

Differential Revision: D33409503

Pulled By: janeyx99

fbshipit-source-id: 932405989dc1a406dfe9da9a7f513ca127c8d436
2022-01-04 14:36:13 -08:00
0ece9a49d7 Revert D33198155: Bump version number to 7 and compile old operators with old schema
Test Plan: revert-hammer

Differential Revision:
D33198155 (d35fc409ad)

Original commit changeset: 38a1185f9ecb

Original Phabricator Diff: D33198155 (d35fc409ad)

fbshipit-source-id: 411aaeb4e047aad9202db50d4d0f2ff35bc51f9d
2022-01-04 13:44:59 -08:00
61b562206b Fix docstring for nn.ELU (#70574)
Summary:
Fixes nn.ELU's docstring problem reported at https://github.com/pytorch/pytorch/issues/70498.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70574

Reviewed By: VitalyFedyunin

Differential Revision: D33404696

Pulled By: albanD

fbshipit-source-id: 1ffcba3fdeadf88a4433e9168c42ddb252e833e9
2022-01-04 13:27:59 -08:00
9cf0de509f DispatchStub: Improve type mismatch errors (#67880)
Summary:
Currently when you register a kernel implementation to a dispatch stub,
it takes the function signature from the function pointer you pass in.
That means if you get the signature wrong, it fails at link time with a
link error instead of failing during compilation. This also means
that when registering nullptr you need to manually specify the type.

Instead, taking the type from `DispatchStub::FnPtr` means quicker time
to signal on failure and better error messages. The only downside is
you need to actually include the DispatchStub declaration which for
some CPU kernels was missing, so I've had to add them here.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67880

Reviewed By: mrshenli

Differential Revision: D33400922

Pulled By: ngimel

fbshipit-source-id: 2da22f053ef82da5db512986e5b968d97a681617
2022-01-04 11:00:47 -08:00
f64906f470 ibm z14/15 SIMD support (#66407)
Summary:
https://github.com/pytorch/pytorch/issues/66406
Implemented z/Architecture 14/15 vector SIMD additions.
So far, all types besides bfloat have their own SIMD implementation.

It has 99% coverage and currently passes the local tests.
It is concise, and the main SIMD file is only one header file.
It mostly uses template metaprogramming, but a few macros remain, with the intention of not modifying PyTorch much.
Sleef supports z15.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/66407

Reviewed By: mrshenli

Differential Revision: D33370163

Pulled By: malfet

fbshipit-source-id: 0e5a57f31b22a718cd2a9ac59753fb468cdda140
2022-01-04 09:40:18 -08:00
8dcfdf39e7 [DataPipe] Renaming FileLoader to FileOpener with deprecation warning for FileLoader (#70367)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70367

This PR renames the `FileLoaderIterDataPipe` to `FileOpenerIterDataPipe`. For the sake of not breaking many CI tests immediately, it still preserves `FileLoader` as an alias. This will allow downstream libraries/users to migrate their use cases before we fully remove all references to `FileLoader` from PyTorch.

Fixes https://github.com/pytorch/data/issues/103. More detailed discussion about this decision is also in the linked issue.

cc VitalyFedyunin ejguan NivekT pmeier Nayef211
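The rename-with-deprecated-alias pattern described above can be sketched as follows; the class bodies here are placeholders, not the actual DataPipe implementation:

```python
# Minimal sketch of keeping the old name working while warning users to migrate.
import warnings

class FileOpenerIterDataPipe:
    def __init__(self, source):
        self.source = source

class FileLoaderIterDataPipe(FileOpenerIterDataPipe):
    """Deprecated alias kept so downstream code keeps working during migration."""
    def __init__(self, *args, **kwargs):
        warnings.warn(
            "FileLoader is deprecated; use FileOpener instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        super().__init__(*args, **kwargs)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    dp = FileLoaderIterDataPipe(["a.txt"])

assert isinstance(dp, FileOpenerIterDataPipe)  # alias still behaves like the new class
assert any(issubclass(w.category, DeprecationWarning) for w in caught)
```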

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D33301648

Pulled By: NivekT

fbshipit-source-id: 59278dcd44e372df0ba2001a4eecbf9792580d0b
2022-01-04 09:14:50 -08:00
7c7eb351c3 Populate __name__ for torch.nn.modules.utils.{_single,_pair,...} (#70459)
Summary:
This helps with debug printouts and python level graph analysis.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70459

Reviewed By: wconstab

Differential Revision: D33340032

Pulled By: jansel

fbshipit-source-id: 24d3fdf31e9e5e92bb47f0db30339cf373a1d4d4
2022-01-04 08:37:12 -08:00
1150046d29 NNAPI: Add runtime flexible shapes & return shapes (#70334)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70334

* Use 0 for load time flexible shapes
* -1 for runtime flexible shapes
* NNAPI needs return shapes for flexible outputs

Test Plan: Tested via upcoming ops

Reviewed By: dreiss

Differential Revision: D33237922

fbshipit-source-id: 50afdd8e3c6401dfb79b4bc09513c9882a09e5d5
2022-01-04 08:37:09 -08:00
a825351c13 GHA Windows: Propagate exit code from .bat to calling bash script (#70011)
Summary:
The windows 1st shard was silently failing to run (more details here https://github.com/pytorch/pytorch/issues/70010) because the code to run them was never reached. It was silently failing because our CI still returned green for those workflow jobs, because the exit code from the batch script DID NOT PROPAGATE to the calling bash script.

The key here is that even though we have
```
if ERRORLEVEL 1 exit /b 1
```

The exit code 1 was NOT propagating back to the bash script, as the `exit /b 1` was within an `if` statement and the batch script was actually run in a cmd shell, so the bash script win-test.sh continued without erroring. Moving the `exit /b 1` to be standalone fixes it.

More details for this can be found in this stack overflow https://stackoverflow.com/a/55290133

Evidence that now a failure in the .bat would fail the whole job:
https://github.com/pytorch/pytorch/runs/4621483334?check_suite_focus=true

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70011

Reviewed By: seemethere, samdow

Differential Revision: D33303020

Pulled By: janeyx99

fbshipit-source-id: 8920a43fc6c4b67fecf90f3fca3908c314522cd6
2022-01-04 08:35:49 -08:00
d35fc409ad Bump version number to 7 and compile old operators with old schema (#68358)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/68358

Test Plan: Imported from OSS

Reviewed By: samdow

Differential Revision: D33198155

Pulled By: tugsbayasgalan

fbshipit-source-id: 38a1185f9ecb34a33f737ad0b060b3490956300c
2022-01-04 01:31:25 -08:00
d9106116aa nnapi: Add int32 type torchscript expressions (#70197)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70197

Test Plan:
* `pytest test/test_nnapi.py`
* Testing via ops following this commit

Reviewed By: anshuljain1, dreiss

Differential Revision: D33237917

fbshipit-source-id: f0493620f28a62ad9fe0b97b67d1e25059d50c24
2022-01-03 19:00:38 -08:00
1b66915f39 Have type_parser return const reference (#70477)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70477

Test Plan: Sandcastle

Reviewed By: cccclai

Differential Revision: D33340030

fbshipit-source-id: b2a295b7c1c01e86971f6b9bbdd7d3718a2d3f0c
2022-01-03 16:18:28 -08:00
bc3246453b Added explicit build command for Windows and clarification on obtaining (#70190)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70190

C++ build tools to readme.md

Test Plan: Imported from OSS

Reviewed By: jbschlosser

Differential Revision: D33245438

Pulled By: ikriv

fbshipit-source-id: ef863d68926bd7416d0e10d24197d19392c124de
2022-01-03 14:33:59 -08:00
1e67570f3a Drop omp simd from batch_permutation_op.cc (#70579)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70579

Fixes
```
     36 stderr: caffe2/caffe2/operators/batch_permutation_op.cc:25:1: error: loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Werror,-Wpass-failed=transform-warning]
      3 caffe2/caffe2/operators/batch_permutation_op.cc:25:1: error: loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Werror,-Wpass-failed=transform-warning]
```

Test Plan: Sandcastle

Reviewed By: meyering

Differential Revision: D33378925

fbshipit-source-id: 5ae3bfb8fadfa91a13ff0dcf5fae2ce7864ea90e
2022-01-03 08:45:50 -08:00
ab49d41bb5 [AutoAccept][Codemod][FBSourceClangFormatLinter] Daily arc lint --take CLANGFORMAT
Reviewed By: zertosh

Differential Revision: D33393329

fbshipit-source-id: 728d47e62e8d81c5243c62917d88e54c4b4a1db2
2022-01-02 17:30:39 -08:00
fa09099ba3 Codegen: TraceType only includes operators being registered (#68691)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68691

TraceType is a sharded file, so by only including specific operator
headers, we ensure that changing one (non-method) operator only needs
one shard to be re-compiled.

This also changes all the included autograd and jit headers from
including `ATen/ATen.h` to just including `ATen/core/Tensor.h`.

Test Plan: Imported from OSS

Reviewed By: gchanan

Differential Revision: D33336948

Pulled By: albanD

fbshipit-source-id: 4e40371592b9a5a7e7fcd1d8cecae11ffb873113
2022-01-02 13:09:19 -08:00
779f41a78a [quant] Add a e2e test for standalone module + custom backend_config_dict (#70152)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70152

This demonstrates that our backend_config_dict works for one of
our internal use cases

Test Plan:
python test/fx2trt/test_quant_trt.py

Imported from OSS

Reviewed By: vkuzo, raghuramank100

Differential Revision: D33205161

fbshipit-source-id: dca8570816baaf85a79f2be75378d46c3af0e454
2022-01-02 11:20:50 -08:00
ce86881afa [quant][graphmode][fx] Add qat module mapping support in backend_config_dict (#70287)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70287

This PR adds support for configuring QAT modules for fused and non-fused modules
TODO: there are some redundant configs, especially for fused op patterns; we can clean them up later

Test Plan:
python test/fx2trt/test_quant_trt.py TestQuantizeFxTRTOps

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D33274057

fbshipit-source-id: b2e6a078211320d97c41ffadd3ecedfab57e3b77
2021-12-30 23:30:34 -08:00
65faf1a7eb [fx2trt] Add version check for ProfilingVerbosity builder config (#70286)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70286

att

Test Plan:
python test/fx2trt/test_quant_trt.py

Imported from OSS

Reviewed By: soulitzer

Differential Revision: D33274058

fbshipit-source-id: c7657f9ba8b578d40d6fc1793b8b363898700eee
2021-12-30 19:59:25 -08:00
6bc06ec3c2 [PyTorch Edge][QNNPack] Tighten Step Height for Indirection Buffers (#70530)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70530

```kernel_size + (output_width * step_width - 1) * kernel_height``` is more space than needed, and ```kernel_size + (output_width - 1) * step_width * kernel_height``` is just enough.
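As a quick sanity check, the two bounds can be compared with made-up example dimensions (illustrative values only, not from a real model); algebraically, the saving per buffer is `kernel_height * (step_width - 1)` entries.

```python
# Example dimensions (illustrative only).
kernel_height, kernel_width = 3, 3
kernel_size = kernel_height * kernel_width
output_width = 10
step_width = 4

# Previous (looser) bound:
loose = kernel_size + (output_width * step_width - 1) * kernel_height
# Tightened bound from this diff:
tight = kernel_size + (output_width - 1) * step_width * kernel_height

print(loose, tight)  # 126 117
# The difference is kernel_height * (step_width - 1) regardless of output_width:
print(loose - tight)  # 9
```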

Test Plan: Phabricator Tests

Reviewed By: kimishpatel

Differential Revision: D32553599

fbshipit-source-id: 30f6d191705bcb25dc9bb7a91c6d7b99c3a348e5
2021-12-30 14:57:33 -08:00
7bfaa230be [nn] adaptive_avg_pool{1/2/3}d : Error on negative output_size (#70488)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/70232

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70488

Reviewed By: H-Huang

Differential Revision: D33367289

Pulled By: jbschlosser

fbshipit-source-id: 6b7b89d72c4e1e049ad6a0addb22a261c28ddb4c
2021-12-30 14:42:11 -08:00
e6c3aa3880 Remove backward ops for mkldnn convolution (#70467)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70467

Test Plan: Imported from OSS

Reviewed By: mikaylagawarecki

Differential Revision: D33342476

Pulled By: jbschlosser

fbshipit-source-id: 9811d02b16adea0dd1dd2500261f4b3b294d2dee
2021-12-30 14:29:22 -08:00
cfc71f56e4 [quant][fx][graphmode] Support standalone module in _convert_do_not_use (#70151)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70151

This supports converting an observed standalone module to a quantized standalone module
in the new convert flow (converting observers to quant-dequant operators)

Test Plan:
```
python test/test_quant_trt.py TestConvertFxDoNotUse
```

Imported from OSS

Reviewed By: supriyar

Differential Revision: D33205163

fbshipit-source-id: 01ea44fb2a8ffe30bec1dd5678e7a72797bafafc
2021-12-30 12:31:03 -08:00
401a6b682b add BFloat16 support for AdaptiveAvgPool2d on CPU (#56902)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/56902

Test Plan: Imported from OSS

Reviewed By: mikaylagawarecki

Differential Revision: D28836789

Pulled By: VitalyFedyunin

fbshipit-source-id: caac5e5b15190b8010bbfbc6920aa44032208ee7
2021-12-30 11:58:37 -08:00
bc40fb5639 [Reinstate] Wishart distribution (#70377)
Summary:
Implement https://github.com/pytorch/pytorch/issues/68050
Reopened the previously merged and reverted PR https://github.com/pytorch/pytorch/issues/68588; worked with neerajprad
cc neerajprad

Sorry for the confusion.

TODO:

- [x] Unit Test
- [x] Documentation
- [x] Change the constraint of matrix variables to 'torch.distributions.constraints.symmetric' once it is reviewed and merged; debug positive-definite constraints https://github.com/pytorch/pytorch/issues/68720

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70377

Reviewed By: mikaylagawarecki

Differential Revision: D33355132

Pulled By: neerajprad

fbshipit-source-id: e968c0d9a3061fb2855564b96074235e46a57b6c
2021-12-30 11:41:46 -08:00
14d3d29b16 make ProcessException pickleable (#70118)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/70116

Happy to add tests if you let me know the best place to put them.

cc VitalyFedyunin

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70118

Reviewed By: malfet

Differential Revision: D33255899

Pulled By: ejguan

fbshipit-source-id: 41d495374182eb28bb8bb421e890eca3bddc077b
2021-12-30 09:09:55 -08:00
9c742bea59 [PyTorch Edge][QNNPack] Enable Depthwise Specific Conv3d Kernel for Kernel Size 3x3x3 (#69315)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69315

Uses kernels and setup modifications from earlier diffs in this stack
ghstack-source-id: 146346780

Test Plan:
**Correctness**
- Test using QNNPack Operator-Level Test:
-- Neon Kernel: As in test plan of D32217846, all tests pass
-- SSE2 Kernel: ```buck test xplat/caffe2/aten/src/ATen/native/quantized/cpu/qnnpack:pytorch_qnnpack_test```, all tests pass
- Test by Printing Results of Model-Level Test: D32122020

**Performance**

*Operator Level tests from convolution.cc in D32217846*
||Before (V23 of D32217846, without newly added kernel)|After (V48 of D31966574, with newly added kernel)|
|---|---|---|
|depthwise 3x3x3 static|184 ms|134 ms|
|depthwise 3x3x3 runtime|181 ms|134 ms|
|depthwise 3x3x3s2 static|30 ms|22 ms|
|depthwise 3x3x3s2 runtime|30 ms|23 ms|
|depthwise 3x3x3s1x2 static|97 ms|70 ms|
|depthwise 3x3x3s1x2 runtime|96 ms|70 ms|
|depthwise 3x3x3s2x1 static|53 ms|38 ms|
|depthwise 3x3x3s2x1 runtime|53 ms|38 ms|
|depthwise 3x3x3d2 static|104 ms|74 ms|
|depthwise 3x3x3d2 runtime|103 ms|75 ms|
|depthwise 3x3x3d1x2 static|158 ms|116 ms|
|depthwise 3x3x3d1x2 runtime|157 ms|115 ms|
|depthwise 3x3x3d2x1 static|120 ms|86 ms|
|depthwise 3x3x3d2x1 runtime|120 ms|87 ms|
|depthwise 3x3x3 per channel static|182 ms|134 ms|
|depthwise 3x3x3 per channel runtime|184 ms|134 ms|
|depthwise 3x3x3s2 per channel static|30 ms|22 ms|
|depthwise 3x3x3s2 per channel runtime|31 ms|23 ms|
|depthwise 3x3x3s1x2 per channel static|95 ms|70 ms|
|depthwise 3x3x3s1x2 per channel runtime|95 ms|71 ms|
|depthwise 3x3x3s2x1 per channel static|53 ms|39 ms|
|depthwise 3x3x3s2x1 per channel runtime|55 ms|39 ms|
|depthwise 3x3x3d2 per channel static|105 ms|75 ms|
|depthwise 3x3x3d2 per channel runtime|103 ms|75 ms|
|depthwise 3x3x3d1x2 per channel static|158 ms|116 ms|
|depthwise 3x3x3d1x2 per channel runtime|158 ms|116 ms|
|depthwise 3x3x3d2x1 per channel static|118 ms|87 ms|
|depthwise 3x3x3d2x1 per channel runtime|119 ms|87 ms|

Average Change: -36.96%

(Generated with https://www.internalfb.com/intern/anp/view/?id=1371846&revision_id=291376782898627)

*Model Level Test on Synthesized Conv3d Model*

Model Details:
- 21 channels, input size: 9 x 12 x 7, kernel size: 3x3x3
- Config added in D31928710
- Model generated with https://www.internalfb.com/intern/anp/view/?id=1313660&revision_id=248658657303993

```buck run aibench:run_bench -- -b dw_conv_3d_3x3x3_big_2b.json --platform android/arm64 --framework pytorch --remote --devices Pixel-4a-11-30```

- Before (V23 of D32217846): [0.0935 ms](https://our.intern.facebook.com/intern/aibench/details/768298420366437)
- After (V48 of D31966574): [0.0665 ms](https://our.intern.facebook.com/intern/aibench/details/67271954298132)
(29% faster)
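As a sanity check, the quoted figure follows from the two latencies above:

```python
# Latencies from the aibench runs above, in milliseconds.
before_ms, after_ms = 0.0935, 0.0665

# Relative improvement over the "before" latency.
speedup = (before_ms - after_ms) / before_ms
print(f"{speedup:.0%}")  # 29%
```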

*Model Level Test on Video Model-like Inputs (provided by liyilui)*
- D33000199
- 87.5% faster

Reviewed By: kimishpatel

Differential Revision: D31966574

fbshipit-source-id: 6554a878401c1120054f6b02241456e8fb44b152
2021-12-30 08:12:10 -08:00
3d4590d16f [PyTorch Edge][QNNPack] Depthwise Conv3d mp8x27 (per-channel) Sse2 Kernel (#69314)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69314

Implementation based off of [convolution-operator-tester.h](https://www.internalfb.com/code/fbsource/[679135d62c0a64e3d0fa0c830aa062ac28f292b8]/fbcode/caffe2/aten/src/ATen/native/quantized/cpu/qnnpack/test/convolution-operator-tester.h)

Generated files (caffe2/aten/src/ATen/native/quantized/cpu/qnnpack/wrappers/q8dwconv/*) made with
- cd caffe2/aten/src/ATen/native/quantized/cpu/qnnpack
- python3 generate-wrapper.py

The math used to compute the ```w_zyxc_ptr``` is explained here:

{F681213069}
ghstack-source-id: 146346784

Test Plan: Test when used in depthwise conv3d later in this diff stack (D31966574)

Reviewed By: kimishpatel

Differential Revision: D32261231

fbshipit-source-id: 8e793696f7c3b0e7cceda88df8099f64f3c69ac4
2021-12-30 08:12:07 -08:00
821c085c9b [PyTorch Edge][QNNPack] Depthwise Conv3d mp8x27 (per channel) Neon Kernel (#69313)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69313

Allows for depthwise conv3d with 3x3x3 kernel

Implementation based heavily off of [mp8x25-neon-per-channel.c](https://www.internalfb.com/code/fbsource/[679135d62c0a64e3d0fa0c830aa062ac28f292b8]/fbcode/caffe2/aten/src/ATen/native/quantized/cpu/qnnpack/src/q8dwconv/mp8x25-neon-per-channel.c) (depthwise conv2d with 5x5 kernel)

This supports per-channel convolution, but it works for non per-channel too

Generated files (caffe2/aten/src/ATen/native/quantized/cpu/qnnpack/wrappers/q8dwconv/*) made with
- cd caffe2/aten/src/ATen/native/quantized/cpu/qnnpack
- python3 generate-wrapper.py
ghstack-source-id: 146346785

Test Plan: Test when used in depthwise conv3d later in this diff stack (D31966574)

Reviewed By: kimishpatel

Differential Revision: D32074096

fbshipit-source-id: 8111926df6ecb89d88ca810deeab87b1c072f55a
2021-12-30 08:12:04 -08:00
15d443326c [PyTorch Edge][QNNPack] Depthwise Conv3d Weight Packing (#69312)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69312

Enable packing weights to be compatible with depthwise specific conv3d kernels
ghstack-source-id: 146346778

Test Plan:
- Existing 2d weight packing uses do not break (phabricator tests)
- Test 3d weight packing when used in depthwise conv3d later in this diff stack (D31966574)

Reviewed By: kimishpatel

Differential Revision: D32045036

fbshipit-source-id: a2323f74f7d30d92d4ed91315f59539ecad729ec
2021-12-30 08:12:00 -08:00
db37fd3865 [PyTorch Edge][QNNPack] Depthwise Conv3d Indirection Buffer Setup (#69311)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69311

Enable setting up indirection buffer to be compatible with depthwise specific conv3d kernels
ghstack-source-id: 146346788

Test Plan:
- Existing 2d indirection buffer uses do not break (phabricator tests)
- Test 3d indirection buffer when used in depthwise conv3d later in this diff stack (D31966574)

Reviewed By: kimishpatel

Differential Revision: D31999533

fbshipit-source-id: a403d8dcad6e50641b9235e0b574129b2dfb5412
2021-12-30 08:11:57 -08:00
9863cd5741 [PyTorch Edge][QNNPack] Refactor Computing Step Dimensions (#69310)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69310

Extract the computation of step height and step width into a helper function and store them in the operator struct, since the same calculation was duplicated in many places before this diff.
ghstack-source-id: 146346783

Test Plan: Phabricator tests

Reviewed By: kimishpatel

Differential Revision: D32553327

fbshipit-source-id: e5bf07416f4c1ccde9975f835767392ad7a851c1
2021-12-30 08:11:54 -08:00
cea3eba617 [PyTorch Edge][QNNPack] Operator-Level Conv3d Tests (#69309)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69309

Test correctness of QNNPack Conv3d

- Add Depth dimension to ConvolutionOperatorTester
- Add tests which use it

Includes John's changes in D32388572
ghstack-source-id: 146346786

Test Plan:
Build the Test
- ```cd caffe2/aten/src/ATen/native/quantized/cpu/qnnpack```
- ```./scripts/build-android-arm64.sh```
- Test binary is outputted to ```build/android/arm64-v8a```

Run the Test
- ```test_name=convolution-test```
- ```chmod +x build/android/arm64-v8a/$test_name```
- Send the binary to android device and execute it, ex. connect to one world and ```adb push build/android/arm64-v8a/$test_name /data/local/tmp/$test_name``` then ```adb shell /data/local/tmp/$test_name```

Reviewed By: kimishpatel

Differential Revision: D32217846

fbshipit-source-id: eba200c136894461bf76b2a5416540fe8781d588
2021-12-30 08:10:34 -08:00
35251a5528 [PyTorch] Add Enum to IValue Deepcopy (#69937)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69937

This enables ```export_torch_mobile_model``` compatibility with Enum IValues

Test Plan: ModuleAPITest.DeepCopyEnum

Reviewed By: gmagogsfm

Differential Revision: D33104681

fbshipit-source-id: ca2a6d259c312487fe38dd1bed33ab6b7910bc2a
2021-12-30 07:52:22 -08:00
36db501736 softplus_backward: remove output arg (#70296)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/69042

Tested with OpInfo

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70296

Reviewed By: mikaylagawarecki

Differential Revision: D33349227

Pulled By: albanD

fbshipit-source-id: edeb35cb19ab4434d39df93d4536cb07679218b5
2021-12-30 02:16:36 -08:00
18dd5cdba5 [Operator Versioning][Test] Use hypothesis for better test input data and broader coverage (#70263)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70263

Leverage the hypothesis library, as it's a more systematic way of testing. To write a test, you need two parts:

1. A function that looks like a normal test in your test framework of choice, but with some additional arguments
2. A `given` decorator that specifies how to provide those arguments
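The two-part shape can be sketched without the library itself (a toy `given` decorator standing in for hypothesis's; the names and strategy here are illustrative, not the real hypothesis API):

```python
import random

def given(strategy, trials=50):
    """Toy stand-in for hypothesis's @given: runs the test
    repeatedly with inputs drawn from `strategy`."""
    def decorator(test_fn):
        def wrapper():
            for _ in range(trials):
                test_fn(strategy())
        return wrapper
    return decorator

# Part 2: the decorator specifies how to provide the argument ...
@given(lambda: random.randint(-1000, 1000))
# Part 1: ... to a function that looks like a normal test,
# except it takes that extra argument.
def test_roundtrip(x):
    assert int(str(x)) == x

test_roundtrip()
print("ok")
```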
ghstack-source-id: 146344955

Test Plan:
```

buck test mode/opt //caffe2/test:jit
python test/test_jit.py TestSaveLoadForOpVersion

```

Reviewed By: iseeyuan

Differential Revision: D33244389

fbshipit-source-id: c93d23f3d9575ebcb4e927a8caee42f4c3a6939d
2021-12-29 20:43:32 -08:00
c627211651 [quant][fx][graphmode][be] Change the type for output of convert to be torch.nn.Module (#69959)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69959

GraphModule is an implementation detail; we don't want to expose it in quantization APIs

Test Plan:
python test/test_quantization.py TestQuantizeFx.test_quantized_model_type

Imported from OSS

Reviewed By: supriyar

Differential Revision: D33119103

fbshipit-source-id: d8736ff08b42ee009d6cfd74dcb3f6150f71f3d2
2021-12-29 20:33:32 -08:00
fb78a31916 Add testing across mem_formats to ModuleInfos (#69317)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69317

Test Plan: Imported from OSS

Reviewed By: jbschlosser

Differential Revision: D33285780

Pulled By: mikaylagawarecki

fbshipit-source-id: 1d19293e640e5581351a9c74892dcac4bcdd3f1d
2021-12-29 14:53:27 -08:00
14f4b91f6e Add Nondeterministic Tol to gradient test in test_modules (#69402)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69402

Test Plan: Imported from OSS

Reviewed By: soulitzer

Differential Revision: D33285781

Pulled By: mikaylagawarecki

fbshipit-source-id: f1ab43173d4f558adc943a8acefc13c34cfa5cfa
2021-12-29 14:51:56 -08:00
d2abf3f981 Added antialias flag to interpolate (CPU only, bicubic) (#68819)
Summary:
Description:
- Added antialias flag to interpolate (CPU only)
  - forward and backward for bicubic mode
  - added tests

Previous PR for bilinear, https://github.com/pytorch/pytorch/pull/65142

### Benchmarks

<details>
<summary>
Forward pass, CPU. PTH interpolation vs PIL
</summary>

Cases:
- PTH RGB 3 Channels, float32 vs PIL RGB uint8 (apples vs pears)
- PTH 1 Channel, float32 vs PIL 1 Channel Float

Code: https://gist.github.com/vfdev-5/b173761a567f2283b3c649c3c0574112

```
Torch config: PyTorch built with:
  - GCC 9.3
  - C++ Version: 201402
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - CPU capability usage: AVX2
  - CUDA Runtime 11.1
  - NVCC architecture flags: -gencode;arch=compute_61,code=sm_61
  - CuDNN 8.0.5
  - Build settings: BUILD_TYPE=Release, CUDA_VERSION=11.1, CUDNN_VERSION=8.0.5, CXX_COMPILER=/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_PYTORCH_QNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.11.0, USE_CUDA=1, USE_CUDNN=1, USE_EIGEN_FOR_BLAS=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=OFF, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=0, USE_OPENMP=ON, USE_ROCM=OFF,

Num threads: 1
[------------------- Downsampling (bicubic): torch.Size([1, 3, 906, 438]) -> (320, 196) -------------------]
                                                  |  Reference, PIL 8.4.0, mode: RGB  |  1.11.0a0+gitb0bdf58
1 threads: -------------------------------------------------------------------------------------------------
      channels_first contiguous torch.float32     |                4.5                |          5.2
      channels_last non-contiguous torch.float32  |                4.5                |          5.3

Times are in milliseconds (ms).

[------------------- Downsampling (bicubic): torch.Size([1, 3, 906, 438]) -> (460, 220) -------------------]
                                                  |  Reference, PIL 8.4.0, mode: RGB  |  1.11.0a0+gitb0bdf58
1 threads: -------------------------------------------------------------------------------------------------
      channels_first contiguous torch.float32     |                5.7                |          6.4
      channels_last non-contiguous torch.float32  |                5.7                |          6.4

Times are in milliseconds (ms).

[------------------- Downsampling (bicubic): torch.Size([1, 3, 906, 438]) -> (120, 96) --------------------]
                                                  |  Reference, PIL 8.4.0, mode: RGB  |  1.11.0a0+gitb0bdf58
1 threads: -------------------------------------------------------------------------------------------------
      channels_first contiguous torch.float32     |                3.0                |          4.0
      channels_last non-contiguous torch.float32  |                2.9                |          4.1

Times are in milliseconds (ms).

[------------------ Downsampling (bicubic): torch.Size([1, 3, 906, 438]) -> (1200, 196) -------------------]
                                                  |  Reference, PIL 8.4.0, mode: RGB  |  1.11.0a0+gitb0bdf58
1 threads: -------------------------------------------------------------------------------------------------
      channels_first contiguous torch.float32     |                14.7               |          17.1
      channels_last non-contiguous torch.float32  |                14.8               |          17.2

Times are in milliseconds (ms).

[------------------ Downsampling (bicubic): torch.Size([1, 3, 906, 438]) -> (120, 1200) -------------------]
                                                  |  Reference, PIL 8.4.0, mode: RGB  |  1.11.0a0+gitb0bdf58
1 threads: -------------------------------------------------------------------------------------------------
      channels_first contiguous torch.float32     |                3.5                |          3.9
      channels_last non-contiguous torch.float32  |                3.5                |          3.9

Times are in milliseconds (ms).

[---------- Downsampling (bicubic): torch.Size([1, 1, 906, 438]) -> (320, 196) ---------]
                                 |  Reference, PIL 8.4.0, mode: F  |  1.11.0a0+gitb0bdf58
1 threads: ------------------------------------------------------------------------------
       contiguous torch.float32  |               2.4               |          1.8

Times are in milliseconds (ms).

[---------- Downsampling (bicubic): torch.Size([1, 1, 906, 438]) -> (460, 220) ---------]
                                 |  Reference, PIL 8.4.0, mode: F  |  1.11.0a0+gitb0bdf58
1 threads: ------------------------------------------------------------------------------
       contiguous torch.float32  |               3.1               |          2.2

Times are in milliseconds (ms).

[---------- Downsampling (bicubic): torch.Size([1, 1, 906, 438]) -> (120, 96) ----------]
                                 |  Reference, PIL 8.4.0, mode: F  |  1.11.0a0+gitb0bdf58
1 threads: ------------------------------------------------------------------------------
       contiguous torch.float32  |               1.6               |          1.4

Times are in milliseconds (ms).

[--------- Downsampling (bicubic): torch.Size([1, 1, 906, 438]) -> (1200, 196) ---------]
                                 |  Reference, PIL 8.4.0, mode: F  |  1.11.0a0+gitb0bdf58
1 threads: ------------------------------------------------------------------------------
       contiguous torch.float32  |               7.9               |          5.7

Times are in milliseconds (ms).

[--------- Downsampling (bicubic): torch.Size([1, 1, 906, 438]) -> (120, 1200) ---------]
                                 |  Reference, PIL 8.4.0, mode: F  |  1.11.0a0+gitb0bdf58
1 threads: ------------------------------------------------------------------------------
       contiguous torch.float32  |               1.7               |          1.3

Times are in milliseconds (ms).

```

</details>

Code is moved from torchvision: https://github.com/pytorch/vision/pull/3810 and https://github.com/pytorch/vision/pull/4208

Pull Request resolved: https://github.com/pytorch/pytorch/pull/68819

Reviewed By: mikaylagawarecki

Differential Revision: D33339117

Pulled By: jbschlosser

fbshipit-source-id: 6a0443bbba5439f52c7dbc1be819b75634cf67c4
2021-12-29 14:04:43 -08:00
Jim
2b00dbbbbc fix typos in torch/csrc/deploy/README.md (#70494)
Summary:
Fixes typo in torch/csrc/deploy/README.md

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70494

Reviewed By: mikaylagawarecki

Differential Revision: D33354431

Pulled By: H-Huang

fbshipit-source-id: b05757a795d2700eea21d7b881d87a7b239a8b52
2021-12-29 13:52:06 -08:00
8af39b7668 AdaptiveLogSoftmaxWithLoss no_batch_dim support (#69054)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69054

Test Plan: Imported from OSS

Reviewed By: jbschlosser

Differential Revision: D33200166

Pulled By: george-qi

fbshipit-source-id: 9d953744351a25f372418d2a64e8402356d1e9b7
2021-12-29 10:25:26 -08:00
0460324b9b Fix docs rendering for nn.Module.named_modules() (#70491)
Summary:
The documentation rendering for nn.Module.named_modules() is a bit broken; see the description of the last argument [here](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.named_modules).

This PR fixes that.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70491

Reviewed By: mikaylagawarecki

Differential Revision: D33349882

Pulled By: albanD

fbshipit-source-id: a46327c12e8114f7ef2055a8518c4ca9d186e669
2021-12-29 10:08:53 -08:00
fb736c77a4 Remove backward op for slow dilated 3d convolution (#70068)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70068

Test Plan: Imported from OSS

Reviewed By: zou3519

Differential Revision: D33172550

Pulled By: jbschlosser

fbshipit-source-id: 72109577c020b33e4b9807064f53f1989475d1c2
2021-12-29 09:46:19 -08:00
2c67621a19 [rnn,gru,lstm]cell : no batch dim (#70236)
Summary:
Reference: https://github.com/pytorch/pytorch/issues/60585

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70236

Reviewed By: mikaylagawarecki

Differential Revision: D33338774

Pulled By: jbschlosser

fbshipit-source-id: 7d8d00272e543b3e67060136b5d98a4baefbedd5
2021-12-29 09:27:32 -08:00
9266b2af73 [AutoAccept][Codemod][FBSourceClangFormatLinter] Daily arc lint --take CLANGFORMAT
Reviewed By: zertosh

Differential Revision: D33347489

fbshipit-source-id: d43ce53c93724f44b587bfe892534f8d13eadaca
2021-12-29 04:06:52 -08:00
1720 changed files with 124905 additions and 38045 deletions

View File

@ -1,5 +1,9 @@
build --copt=--std=c++14
build --copt=-I.
# Bazel does not support including its cc_library targets as system
# headers. We work around this for generated code
# (e.g. c10/macros/cmake_macros.h) by making the generated directory a
# system include path.
build --copt=-isystem --copt bazel-out/k8-fastbuild/bin
build --experimental_ui_max_stdouterr_bytes=2048576
@ -12,6 +16,9 @@ build:no-tty --show_progress_rate_limit 10
build:gpu --define=cuda=true
# define a separate build folder for faster switching between configs
build:gpu --platform_suffix=-gpu
# See the note on the config-less build for details about why we are
# doing this. We must also do it for the "-gpu" platform suffix.
build --copt=-isystem --copt=bazel-out/k8-fastbuild-gpu/bin
# rules_cuda configuration
build:gpu --@rules_cuda//cuda:enable_cuda
build:gpu --@rules_cuda//cuda:cuda_targets=sm_52

View File

@ -30,21 +30,7 @@ def get_processor_arch_name(gpu_version):
"cu" + gpu_version.strip("cuda") if gpu_version.startswith("cuda") else gpu_version
)
LINUX_PACKAGE_VARIANTS = OrderedDict(
manywheel=[
"3.6m",
"3.7m",
"3.8m",
"3.9m"
],
conda=dimensions.STANDARD_PYTHON_VERSIONS,
libtorch=[
"3.7m",
],
)
CONFIG_TREE_DATA = OrderedDict(
linux=(dimensions.GPU_VERSIONS, LINUX_PACKAGE_VARIANTS),
macos=([None], OrderedDict(
wheel=dimensions.STANDARD_PYTHON_VERSIONS,
conda=dimensions.STANDARD_PYTHON_VERSIONS,
@ -66,11 +52,7 @@ CONFIG_TREE_DATA = OrderedDict(
# Stop building Win+CU102, see https://github.com/pytorch/pytorch/issues/65648
[v for v in dimensions.GPU_VERSIONS if v not in dimensions.ROCM_VERSION_LABELS and v != "cuda102"],
OrderedDict(
wheel=dimensions.STANDARD_PYTHON_VERSIONS,
conda=dimensions.STANDARD_PYTHON_VERSIONS,
libtorch=[
"3.7",
],
)
),
)

View File

@ -8,9 +8,8 @@ CUDA_VERSIONS = [
]
ROCM_VERSIONS = [
"4.1",
"4.2",
"4.3.1",
"4.5.2",
]
ROCM_VERSION_LABELS = ["rocm" + v for v in ROCM_VERSIONS]
@ -20,5 +19,6 @@ GPU_VERSIONS = [None] + ["cuda" + v for v in CUDA_VERSIONS] + ROCM_VERSION_LABEL
STANDARD_PYTHON_VERSIONS = [
"3.7",
"3.8",
"3.9"
"3.9",
"3.10"
]

View File

@ -0,0 +1,103 @@
import cimodel.data.simple.util.branch_filters as branch_filters
from cimodel.data.simple.util.docker_constants import (
DOCKER_IMAGE_NDK, DOCKER_REQUIREMENT_NDK
)
class AndroidJob:
def __init__(self,
variant,
template_name,
is_master_only=True):
self.variant = variant
self.template_name = template_name
self.is_master_only = is_master_only
def gen_tree(self):
base_name_parts = [
"pytorch",
"linux",
"xenial",
"py3",
"clang5",
"android",
"ndk",
"r19c",
] + self.variant + [
"build",
]
full_job_name = "_".join(base_name_parts)
build_env_name = "-".join(base_name_parts)
props_dict = {
"name": full_job_name,
"build_environment": "\"{}\"".format(build_env_name),
"docker_image": "\"{}\"".format(DOCKER_IMAGE_NDK),
"requires": [DOCKER_REQUIREMENT_NDK]
}
if self.is_master_only:
props_dict["filters"] = branch_filters.gen_filter_dict(branch_filters.NON_PR_BRANCH_LIST)
return [{self.template_name: props_dict}]
class AndroidGradleJob:
def __init__(self,
job_name,
template_name,
dependencies,
is_master_only=True,
is_pr_only=False,
extra_props=tuple()):
self.job_name = job_name
self.template_name = template_name
self.dependencies = dependencies
self.is_master_only = is_master_only
self.is_pr_only = is_pr_only
self.extra_props = dict(extra_props)
def gen_tree(self):
props_dict = {
"name": self.job_name,
"requires": self.dependencies,
}
if self.is_master_only:
props_dict["filters"] = branch_filters.gen_filter_dict(branch_filters.NON_PR_BRANCH_LIST)
elif self.is_pr_only:
props_dict["filters"] = branch_filters.gen_filter_dict(branch_filters.PR_BRANCH_LIST)
if self.extra_props:
props_dict.update(self.extra_props)
return [{self.template_name: props_dict}]
WORKFLOW_DATA = [
AndroidJob(["x86_32"], "pytorch_linux_build", is_master_only=False),
AndroidJob(["x86_64"], "pytorch_linux_build"),
AndroidJob(["arm", "v7a"], "pytorch_linux_build"),
AndroidJob(["arm", "v8a"], "pytorch_linux_build"),
AndroidGradleJob(
"pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-build-x86_32",
"pytorch_android_gradle_build-x86_32",
["pytorch_linux_xenial_py3_clang5_android_ndk_r19c_x86_32_build"],
is_master_only=False,
is_pr_only=True),
AndroidGradleJob(
"pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-build",
"pytorch_android_gradle_build",
["pytorch_linux_xenial_py3_clang5_android_ndk_r19c_x86_32_build",
"pytorch_linux_xenial_py3_clang5_android_ndk_r19c_x86_64_build",
"pytorch_linux_xenial_py3_clang5_android_ndk_r19c_arm_v7a_build",
"pytorch_linux_xenial_py3_clang5_android_ndk_r19c_arm_v8a_build"]),
]
def get_workflow_jobs():
return [item.gen_tree() for item in WORKFLOW_DATA]

View File

@ -1,69 +0,0 @@
from cimodel.data.simple.util.docker_constants import (
DOCKER_IMAGE_GCC7,
DOCKER_REQUIREMENT_GCC7
)
def gen_job_name(phase):
job_name_parts = [
"pytorch",
"bazel",
phase,
]
return "_".join(job_name_parts)
class BazelJob:
def __init__(self, phase, extra_props=None):
self.phase = phase
self.extra_props = extra_props or {}
def gen_tree(self):
template_parts = [
"pytorch",
"linux",
"bazel",
self.phase,
]
build_env_parts = [
"pytorch",
"linux",
"xenial",
"py3.6",
"gcc7",
"bazel",
self.phase,
]
full_job_name = gen_job_name(self.phase)
build_env_name = "-".join(build_env_parts)
extra_requires = (
[gen_job_name("build")] if self.phase == "test" else
[DOCKER_REQUIREMENT_GCC7]
)
props_dict = {
"build_environment": build_env_name,
"docker_image": DOCKER_IMAGE_GCC7,
"name": full_job_name,
"requires": extra_requires,
}
props_dict.update(self.extra_props)
template_name = "_".join(template_parts)
return [{template_name: props_dict}]
WORKFLOW_DATA = [
BazelJob("build", {"resource_class": "large"}),
BazelJob("test"),
]
def get_workflow_jobs():
return [item.gen_tree() for item in WORKFLOW_DATA]

View File

@ -0,0 +1,77 @@
from cimodel.data.simple.util.docker_constants import (
DOCKER_IMAGE_NDK,
DOCKER_REQUIREMENT_NDK
)
class AndroidNightlyJob:
def __init__(self,
variant,
template_name,
extra_props=None,
with_docker=True,
requires=None,
no_build_suffix=False):
self.variant = variant
self.template_name = template_name
self.extra_props = extra_props or {}
self.with_docker = with_docker
self.requires = requires
self.no_build_suffix = no_build_suffix
def gen_tree(self):
base_name_parts = [
"pytorch",
"linux",
"xenial",
"py3",
"clang5",
"android",
"ndk",
"r19c",
] + self.variant
build_suffix = [] if self.no_build_suffix else ["build"]
full_job_name = "_".join(["nightly"] + base_name_parts + build_suffix)
build_env_name = "-".join(base_name_parts)
props_dict = {
"name": full_job_name,
"requires": self.requires,
"filters": {"branches": {"only": "nightly"}},
}
props_dict.update(self.extra_props)
if self.with_docker:
props_dict["docker_image"] = DOCKER_IMAGE_NDK
props_dict["build_environment"] = build_env_name
return [{self.template_name: props_dict}]
BASE_REQUIRES = [DOCKER_REQUIREMENT_NDK]
WORKFLOW_DATA = [
AndroidNightlyJob(["x86_32"], "pytorch_linux_build", requires=BASE_REQUIRES),
AndroidNightlyJob(["x86_64"], "pytorch_linux_build", requires=BASE_REQUIRES),
AndroidNightlyJob(["arm", "v7a"], "pytorch_linux_build", requires=BASE_REQUIRES),
AndroidNightlyJob(["arm", "v8a"], "pytorch_linux_build", requires=BASE_REQUIRES),
AndroidNightlyJob(["android_gradle"], "pytorch_android_gradle_build",
with_docker=False,
requires=[
"nightly_pytorch_linux_xenial_py3_clang5_android_ndk_r19c_x86_32_build",
"nightly_pytorch_linux_xenial_py3_clang5_android_ndk_r19c_x86_64_build",
"nightly_pytorch_linux_xenial_py3_clang5_android_ndk_r19c_arm_v7a_build",
"nightly_pytorch_linux_xenial_py3_clang5_android_ndk_r19c_arm_v8a_build"]),
AndroidNightlyJob(["x86_32_android_publish_snapshot"], "pytorch_android_publish_snapshot",
extra_props={"context": "org-member"},
with_docker=False,
requires=["nightly_pytorch_linux_xenial_py3_clang5_android_ndk_r19c_android_gradle_build"],
no_build_suffix=True),
]
def get_workflow_jobs():
return [item.gen_tree() for item in WORKFLOW_DATA]

View File

@ -11,7 +11,7 @@ def gen_docker_image_requires(image_name):
DOCKER_IMAGE_BASIC, DOCKER_REQUIREMENT_BASE = gen_docker_image(
"pytorch-linux-xenial-py3.6-gcc5.4"
"pytorch-linux-xenial-py3.7-gcc5.4"
)
DOCKER_IMAGE_CUDA_10_2, DOCKER_REQUIREMENT_CUDA_10_2 = gen_docker_image(
@ -19,7 +19,7 @@ DOCKER_IMAGE_CUDA_10_2, DOCKER_REQUIREMENT_CUDA_10_2 = gen_docker_image(
)
DOCKER_IMAGE_GCC7, DOCKER_REQUIREMENT_GCC7 = gen_docker_image(
"pytorch-linux-xenial-py3.6-gcc7"
"pytorch-linux-xenial-py3.7-gcc7"
)

.circleci/config.yml (generated, 5967 lines changed): file diff suppressed because it is too large.

View File

@ -40,6 +40,12 @@ function extract_all_from_image_name() {
done
}
# Use the same pre-built XLA test image from PyTorch/XLA
if [[ "$image" == *xla* ]]; then
echo "Using pre-built XLA test image..."
exit 0
fi
if [[ "$image" == *-xenial* ]]; then
UBUNTU_VERSION=16.04
elif [[ "$image" == *-artful* ]]; then
@ -237,22 +243,6 @@ case "$image" in
VISION=yes
ROCM_VERSION=3.9
;;
pytorch-linux-bionic-rocm4.1-py3.7)
ANACONDA_PYTHON_VERSION=3.7
GCC_VERSION=9
PROTOBUF=yes
DB=yes
VISION=yes
ROCM_VERSION=4.1
;;
pytorch-linux-bionic-rocm4.2-py3.7)
ANACONDA_PYTHON_VERSION=3.7
GCC_VERSION=9
PROTOBUF=yes
DB=yes
VISION=yes
ROCM_VERSION=4.2
;;
pytorch-linux-bionic-rocm4.3.1-py3.7)
ANACONDA_PYTHON_VERSION=3.7
GCC_VERSION=9
@ -261,6 +251,14 @@ case "$image" in
VISION=yes
ROCM_VERSION=4.3.1
;;
pytorch-linux-bionic-rocm4.5-py3.7)
ANACONDA_PYTHON_VERSION=3.7
GCC_VERSION=9
PROTOBUF=yes
DB=yes
VISION=yes
ROCM_VERSION=4.5.2
;;
*)
# Catch-all for builds that are not hardcoded.
PROTOBUF=yes
@ -306,15 +304,6 @@ fi
tmp_tag=$(basename "$(mktemp -u)" | tr '[:upper:]' '[:lower:]')
# If we are trying to use nvidia cuda image make sure it exists, otherwise use IMAGE from ghcr.io
# this logic currently only exists for ubuntu
if [[ "$image" == *cuda* && ${OS} == "ubuntu" ]]; then
IMAGE_NAME="nvidia/cuda:${CUDA_VERSION}-cudnn${CUDNN_VERSION}-devel-ubuntu${UBUNTU_VERSION}"
if ! DOCKER_CLI_EXPERIMENTAL=enabled docker manifest inspect "${IMAGE_NAME}" >/dev/null 2>/dev/null; then
IMAGE_NAME="ghcr.io/pytorch/nvidia/cuda:${CUDA_VERSION}-devel-ubuntu${UBUNTU_VERSION}"
INSTALL_CUDNN="True"
fi
fi
# Build image
# TODO: build-arg THRIFT is not turned on for any image, remove it once we confirm
@ -353,8 +342,6 @@ docker build \
--build-arg "KATEX=${KATEX:-}" \
--build-arg "ROCM_VERSION=${ROCM_VERSION:-}" \
--build-arg "PYTORCH_ROCM_ARCH=${PYTORCH_ROCM_ARCH:-gfx900;gfx906}" \
--build-arg "IMAGE_NAME=${IMAGE_NAME}" \
--build-arg "INSTALL_CUDNN=${INSTALL_CUDNN}" \
-f $(dirname ${DOCKERFILE})/Dockerfile \
-t "$tmp_tag" \
"$@" \

View File

@ -122,7 +122,7 @@ wget https://ossci-linux.s3.amazonaws.com/valgrind-${VALGRIND_VERSION}.tar.bz2
tar -xjf valgrind-${VALGRIND_VERSION}.tar.bz2
cd valgrind-${VALGRIND_VERSION}
./configure --prefix=/usr/local
make -j 4
make -j6
sudo make install
cd ../../
rm -rf valgrind_build

View File

@ -120,9 +120,9 @@ if [ -n "$ANACONDA_PYTHON_VERSION" ]; then
# Install numba only on python-3.8 or below
# For numba issue see https://github.com/pytorch/pytorch/issues/51511
if [[ $(python -c "import sys; print(int(sys.version_info < (3, 9)))") == "1" ]]; then
as_jenkins pip install --progress-bar off numba librosa>=0.6.2
as_jenkins pip install --progress-bar off numba==0.54.1 "librosa>=0.6.2,<0.9.0"
else
as_jenkins pip install --progress-bar off numba==0.49.0 librosa>=0.6.2
as_jenkins pip install --progress-bar off numba==0.49.0 "librosa>=0.6.2,<0.9.0"
fi
# Update scikit-learn to a python-3.8 compatible version

View File

@ -1,10 +0,0 @@
#!/bin/bash
sudo apt-get update
# also install ssh to avoid error of:
# --------------------------------------------------------------------------
# The value of the MCA parameter "plm_rsh_agent" was set to a path
# that could not be found:
# plm_rsh_agent: ssh : rsh
sudo apt-get install -y ssh
sudo apt-get update && apt-get install -y --no-install-recommends libcudnn8=8.2.0.53-1+cuda11.3 libcudnn8-dev=8.2.0.53-1+cuda11.3 && apt-mark hold libcudnn8

View File

@ -8,7 +8,7 @@ wget -q -O "${OPENSSL}.tar.gz" "https://ossci-linux.s3.amazonaws.com/${OPENSSL}.
tar xf "${OPENSSL}.tar.gz"
cd "${OPENSSL}"
./config --prefix=/opt/openssl -d '-Wl,--enable-new-dtags,-rpath,$(LIBRPATH)'
# NOTE: opensl errors out when built with the -j option
make install_sw
# NOTE: openssl install errors out when built with the -j option
make -j6; make install_sw
cd ..
rm -rf "${OPENSSL}"

View File

@ -14,9 +14,9 @@ install_protobuf_317() {
curl -LO "https://github.com/protocolbuffers/protobuf/releases/download/v3.17.3/protobuf-all-3.17.3.tar.gz"
tar -xvz -C "$pb_dir" --strip-components 1 -f protobuf-all-3.17.3.tar.gz
# -j2 to balance memory usage and speed.
# -j6 to balance memory usage and speed.
# naked `-j` seems to use too much memory.
pushd "$pb_dir" && ./configure && make -j2 && make -j2 check && sudo make -j2 install && sudo ldconfig
pushd "$pb_dir" && ./configure && make -j6 && make -j6 check && sudo make -j6 install && sudo ldconfig
popd
rm -rf $pb_dir
}

View File

@ -34,6 +34,9 @@ ver() {
printf "%3d%03d%03d%03d" $(echo "$1" | tr '.' ' ');
}
# Map ROCm version to AMDGPU version
declare -A AMDGPU_VERSIONS=( ["4.5.2"]="21.40.2" )
install_ubuntu() {
apt-get update
if [[ $UBUNTU_VERSION == 18.04 ]]; then
@ -51,6 +54,13 @@ install_ubuntu() {
apt-get install -y libc++1
apt-get install -y libc++abi1
if [[ $(ver $ROCM_VERSION) -ge $(ver 4.5) ]]; then
# Add amdgpu repository
UBUNTU_VERSION_NAME=`cat /etc/os-release | grep UBUNTU_CODENAME | awk -F= '{print $2}'`
local amdgpu_baseurl="https://repo.radeon.com/amdgpu/${AMDGPU_VERSIONS[$ROCM_VERSION]}/ubuntu"
echo "deb [arch=amd64] ${amdgpu_baseurl} ${UBUNTU_VERSION_NAME} main" > /etc/apt/sources.list.d/amdgpu.list
fi
ROCM_REPO="ubuntu"
if [[ $(ver $ROCM_VERSION) -lt $(ver 4.2) ]]; then
ROCM_REPO="xenial"
@ -58,7 +68,8 @@ install_ubuntu() {
# Add rocm repository
wget -qO - http://repo.radeon.com/rocm/rocm.gpg.key | apt-key add -
echo "deb [arch=amd64] http://repo.radeon.com/rocm/apt/${ROCM_VERSION} ${ROCM_REPO} main" > /etc/apt/sources.list.d/rocm.list
local rocm_baseurl="http://repo.radeon.com/rocm/apt/${ROCM_VERSION}"
echo "deb [arch=amd64] ${rocm_baseurl} ${ROCM_REPO} main" > /etc/apt/sources.list.d/rocm.list
apt-get update --allow-insecure-repositories
DEBIAN_FRONTEND=noninteractive apt-get install -y --allow-unauthenticated \
@ -95,11 +106,24 @@ install_centos() {
yum install -y epel-release
yum install -y dkms kernel-headers-`uname -r` kernel-devel-`uname -r`
if [[ $(ver $ROCM_VERSION) -ge $(ver 4.5) ]]; then
# Add amdgpu repository
local amdgpu_baseurl="https://repo.radeon.com/amdgpu/${AMDGPU_VERSIONS[$ROCM_VERSION]}/rhel/7.9/main/x86_64"
echo "[AMDGPU]" > /etc/yum.repos.d/amdgpu.repo
echo "name=AMDGPU" >> /etc/yum.repos.d/amdgpu.repo
echo "baseurl=${amdgpu_baseurl}" >> /etc/yum.repos.d/amdgpu.repo
echo "enabled=1" >> /etc/yum.repos.d/amdgpu.repo
echo "gpgcheck=1" >> /etc/yum.repos.d/amdgpu.repo
echo "gpgkey=http://repo.radeon.com/rocm/rocm.gpg.key" >> /etc/yum.repos.d/amdgpu.repo
fi
local rocm_baseurl="http://repo.radeon.com/rocm/yum/${ROCM_VERSION}"
echo "[ROCm]" > /etc/yum.repos.d/rocm.repo
echo "name=ROCm" >> /etc/yum.repos.d/rocm.repo
echo "baseurl=http://repo.radeon.com/rocm/yum/${ROCM_VERSION}" >> /etc/yum.repos.d/rocm.repo
echo "baseurl=${rocm_baseurl}" >> /etc/yum.repos.d/rocm.repo
echo "enabled=1" >> /etc/yum.repos.d/rocm.repo
echo "gpgcheck=0" >> /etc/yum.repos.d/rocm.repo
echo "gpgcheck=1" >> /etc/yum.repos.d/rocm.repo
echo "gpgkey=http://repo.radeon.com/rocm/rocm.gpg.key" >> /etc/yum.repos.d/rocm.repo
yum update -y
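The `ver` helper in this script makes dotted version strings comparable by packing each component into a fixed-width decimal field (`printf "%3d%03d%03d%03d"`), so that e.g. 4.10 sorts after 4.9. A hedged Python sketch of the same idea, not the script's actual code:

```python
def ver(version):
    """Pack a dotted version into a fixed-width string, mirroring the shell
    helper's printf "%3d%03d%03d%03d" over up to four components."""
    parts = [int(p) for p in version.split(".")]
    parts += [0] * (4 - len(parts))  # pad missing components with zeros
    return "%3d%03d%03d%03d" % tuple(parts[:4])

# Comparing the packed forms matches numeric version ordering, which a naive
# string comparison of "4.10" vs "4.9" would get wrong.
```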

View File

@ -1,14 +1,12 @@
ARG UBUNTU_VERSION
ARG CUDA_VERSION
ARG CUDNN_VERSION
ARG IMAGE_NAME
FROM ${IMAGE_NAME}
ARG UBUNTU_VERSION
ARG CUDA_VERSION
ARG CUDNN_VERSION
FROM nvidia/cuda:${CUDA_VERSION}-cudnn${CUDNN_VERSION}-devel-ubuntu${UBUNTU_VERSION}
ARG UBUNTU_VERSION
ARG CUDA_VERSION
ARG CUDNN_VERSION
ENV DEBIAN_FRONTEND noninteractive
@ -107,12 +105,5 @@ ENV CUDA_PATH /usr/local/cuda
# Install LLVM dev version (Defined in the pytorch/builder github repository)
COPY --from=pytorch/llvm:9.0.1 /opt/llvm /opt/llvm
# Hack for CUDA 11.5.0 image to install cudnn8 since cudnn8 is not included with CUDA 11.5 image
# Also note cudnn 8.2.0.53 is labeled for cuda 11.3
ARG INSTALL_CUDNN
ADD ./common/install_cudnn8.sh install_cudnn8.sh
RUN if [ -n "${INSTALL_CUDNN}" ]; then bash install_cudnn8.sh; fi
RUN rm install_cudnn8.sh
USER jenkins
CMD ["bash"]

View File

@ -1,13 +0,0 @@
FROM ubuntu:18.04
RUN apt-get update && apt-get install -y python3-pip git && rm -rf /var/lib/apt/lists/* /var/log/dpkg.log
ADD requirements.txt /requirements.txt
RUN pip3 install -r /requirements.txt
ADD gc.py /usr/bin/gc.py
ADD docker_hub.py /usr/bin/docker_hub.py
ENTRYPOINT ["/usr/bin/gc.py"]

View File

@ -1,125 +0,0 @@
#!/usr/bin/env python3
from collections import namedtuple
import boto3
import requests
import os
IMAGE_INFO = namedtuple(
"IMAGE_INFO", ("repo", "tag", "size", "last_updated_at", "last_updated_by")
)
def build_access_token(username, password):
r = requests.post(
"https://hub.docker.com/v2/users/login/",
data={"username": username, "password": password},
)
r.raise_for_status()
token = r.json().get("token")
return {"Authorization": "JWT " + token}
def list_repos(user, token):
r = requests.get("https://hub.docker.com/v2/repositories/" + user, headers=token)
r.raise_for_status()
ret = sorted(
repo["user"] + "/" + repo["name"] for repo in r.json().get("results", [])
)
if ret:
print("repos found:")
print("".join("\n\t" + r for r in ret))
return ret
def list_tags(repo, token):
r = requests.get(
"https://hub.docker.com/v2/repositories/" + repo + "/tags", headers=token
)
r.raise_for_status()
return [
IMAGE_INFO(
repo=repo,
tag=t["name"],
size=t["full_size"],
last_updated_at=t["last_updated"],
last_updated_by=t["last_updater_username"],
)
for t in r.json().get("results", [])
]
def save_to_s3(tags):
table_content = ""
client = boto3.client("s3")
for t in tags:
table_content += (
"<tr><td>{repo}</td><td>{tag}</td><td>{size}</td>"
"<td>{last_updated_at}</td><td>{last_updated_by}</td></tr>"
).format(
repo=t.repo,
tag=t.tag,
size=t.size,
last_updated_at=t.last_updated_at,
last_updated_by=t.last_updated_by,
)
html_body = """
<html>
<head>
<link rel="stylesheet"
href="https://stackpath.bootstrapcdn.com/bootstrap/4.4.1/css/bootstrap.min.css"
integrity="sha384-Vkoo8x4CGsO3+Hhxv8T/Q5PaXtkKtu6ug5TOeNV6gBiFeWPGFN9MuhOf23Q9Ifjh"
crossorigin="anonymous">
<link rel="stylesheet" type="text/css"
href="https://cdn.datatables.net/1.10.20/css/jquery.dataTables.css">
<script src="https://ajax.googleapis.com/ajax/libs/jquery/3.4.1/jquery.min.js">
</script>
<script type="text/javascript" charset="utf8"
src="https://cdn.datatables.net/1.10.20/js/jquery.dataTables.js"></script>
<title> docker image info</title>
</head>
<body>
<table class="table table-striped table-hover" id="docker">
<caption>Docker images on docker hub</caption>
<thead class="thead-dark">
<tr>
<th scope="col">repo</th>
<th scope="col">tag</th>
<th scope="col">size</th>
<th scope="col">last_updated_at</th>
<th scope="col">last_updated_by</th>
</tr>
</thead>
<tbody>
{table_content}
</tbody>
</table>
</body>
<script>
$(document).ready( function () {{
$('#docker').DataTable({{paging: false}});
}} );
</script>
</html>
""".format(
table_content=table_content
)
client.put_object(
Bucket="docker.pytorch.org",
ACL="public-read",
Key="docker_hub.html",
Body=html_body,
ContentType="text/html",
)
if __name__ == "__main__":
username = os.environ.get("DOCKER_HUB_USERNAME")
password = os.environ.get("DOCKER_HUB_PASSWORD")
token = build_access_token(username, password)
tags = []
for repo in list_repos("pytorch", token):
tags.extend(list_tags(repo, token))
save_to_s3(tags)

View File

@ -1,218 +0,0 @@
#!/usr/bin/env python3
import argparse
import boto3
import datetime
import pytz
import re
import sys
def save_to_s3(project, data):
table_content = ""
client = boto3.client("s3")
for repo, tag, window, age, pushed in data:
table_content += "<tr><td>{repo}</td><td>{tag}</td><td>{window}</td><td>{age}</td><td>{pushed}</td></tr>".format(
repo=repo, tag=tag, window=window, age=age, pushed=pushed
)
html_body = """
<html>
<head>
<link rel="stylesheet"
href="https://stackpath.bootstrapcdn.com/bootstrap/4.4.1/css/bootstrap.min.css"
integrity="sha384-Vkoo8x4CGsO3+Hhxv8T/Q5PaXtkKtu6ug5TOeNV6gBiFeWPGFN9MuhOf23Q9Ifjh"
crossorigin="anonymous">
<link rel="stylesheet" type="text/css" href="https://cdn.datatables.net/1.10.20/css/jquery.dataTables.css">
<script src="https://ajax.googleapis.com/ajax/libs/jquery/3.4.1/jquery.min.js"></script>
<script type="text/javascript" charset="utf8" src="https://cdn.datatables.net/1.10.20/js/jquery.dataTables.js"></script>
<title>{project} nightly and permanent docker image info</title>
</head>
<body>
<table class="table table-striped table-hover" id="docker">
<thead class="thead-dark">
<tr>
<th scope="col">repo</th>
<th scope="col">tag</th>
<th scope="col">keep window</th>
<th scope="col">age</th>
<th scope="col">pushed at</th>
</tr>
</thead>
<tbody>
{table_content}
</tbody>
</table>
</body>
<script>
$(document).ready( function () {{
$('#docker').DataTable({{paging: false}});
}} );
</script>
</html>
""".format(
project=project, table_content=table_content
)
# for pytorch, file can be found at
# http://ossci-docker.s3-website.us-east-1.amazonaws.com/pytorch.html
# and later one we can config docker.pytorch.org to point to the location
client.put_object(
Bucket="docker.pytorch.org",
ACL="public-read",
Key="{project}.html".format(project=project),
Body=html_body,
ContentType="text/html",
)
def repos(client):
paginator = client.get_paginator("describe_repositories")
pages = paginator.paginate(registryId="308535385114")
for page in pages:
for repo in page["repositories"]:
yield repo
def images(client, repository):
paginator = client.get_paginator("describe_images")
pages = paginator.paginate(
registryId="308535385114", repositoryName=repository["repositoryName"]
)
for page in pages:
for image in page["imageDetails"]:
yield image
parser = argparse.ArgumentParser(description="Delete old Docker tags from registry")
parser.add_argument(
"--dry-run", action="store_true", help="Dry run; print tags that would be deleted"
)
parser.add_argument(
"--debug", action="store_true", help="Debug, print ignored / saved tags"
)
parser.add_argument(
"--keep-stable-days",
type=int,
default=14,
help="Days of stable Docker tags to keep (non per-build images)",
)
parser.add_argument(
"--keep-unstable-days",
type=int,
default=1,
help="Days of unstable Docker tags to keep (per-build images)",
)
parser.add_argument(
"--filter-prefix",
type=str,
default="",
help="Only run cleanup for repositories with this prefix",
)
parser.add_argument(
"--ignore-tags",
type=str,
default="",
help="Never cleanup these tags (comma separated)",
)
args = parser.parse_args()
if not args.ignore_tags or not args.filter_prefix:
print(
"""
Missing required arguments --ignore-tags and --filter-prefix
You must specify --ignore-tags and --filter-prefix to avoid accidentally
pruning a stable Docker tag which is being actively used. This will
make you VERY SAD. So pay attention.
First, which filter-prefix do you want? The list of valid prefixes
is in jobs/private.groovy under the 'docker-registry-cleanup' job.
You probably want either pytorch or caffe2.
Second, which ignore-tags do you want? It should be whatever the most
up-to-date DockerVersion for the repository in question is. Follow
the imports of jobs/pytorch.groovy to find them.
"""
)
sys.exit(1)
client = boto3.client("ecr", region_name="us-east-1")
stable_window = datetime.timedelta(days=args.keep_stable_days)
unstable_window = datetime.timedelta(days=args.keep_unstable_days)
now = datetime.datetime.now(pytz.UTC)
ignore_tags = args.ignore_tags.split(",")
def chunks(chunkable, n):
""" Yield successive n-sized chunks from l.
"""
for i in range(0, len(chunkable), n):
yield chunkable[i: i + n]
SHA_PATTERN = re.compile(r'^[0-9a-f]{40}$')
def looks_like_git_sha(tag):
"""Returns a boolean to check if a tag looks like a git sha
For reference a sha1 is 40 characters with only 0-9a-f and contains no
"-" characters
"""
return re.match(SHA_PATTERN, tag) is not None
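The two helpers above are small enough to exercise in isolation; a self-contained restatement for illustration:

```python
import re

SHA_PATTERN = re.compile(r"^[0-9a-f]{40}$")

def looks_like_git_sha(tag):
    """True if tag is exactly 40 lowercase hex characters (a git SHA-1)."""
    return SHA_PATTERN.match(tag) is not None

def chunks(chunkable, n):
    """Yield successive n-sized chunks, as used below to respect ECR's
    100-image limit per batch_delete_image call."""
    for i in range(0, len(chunkable), n):
        yield chunkable[i:i + n]
```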
stable_window_tags = []
for repo in repos(client):
repositoryName = repo["repositoryName"]
if not repositoryName.startswith(args.filter_prefix):
continue
# Keep list of image digests to delete for this repository
digest_to_delete = []
for image in images(client, repo):
tags = image.get("imageTags")
if not isinstance(tags, (list,)) or len(tags) == 0:
continue
created = image["imagePushedAt"]
age = now - created
for tag in tags:
if any([
looks_like_git_sha(tag),
tag.isdigit(),
tag.count("-") == 4, # TODO: Remove, this no longer applies as tags are now built using a SHA1
tag in ignore_tags]):
window = stable_window
if tag in ignore_tags:
stable_window_tags.append((repositoryName, tag, "", age, created))
elif age < window:
stable_window_tags.append((repositoryName, tag, window, age, created))
else:
window = unstable_window
if tag in ignore_tags or age < window:
if args.debug:
print("Ignoring {}:{} (age: {})".format(repositoryName, tag, age))
break
else:
for tag in tags:
print("{}Deleting {}:{} (age: {})".format("(dry run) " if args.dry_run else "", repositoryName, tag, age))
digest_to_delete.append(image["imageDigest"])
if args.dry_run:
if args.debug:
print("Skipping actual deletion, moving on...")
else:
# Issue batch delete for all images to delete for this repository
# Note that as of 2018-07-25, the maximum number of images you can
# delete in a single batch is 100, so chunk our list into batches of
# 100
for c in chunks(digest_to_delete, 100):
client.batch_delete_image(
registryId="308535385114",
repositoryName=repositoryName,
imageIds=[{"imageDigest": digest} for digest in c],
)
save_to_s3(args.filter_prefix, stable_window_tags)

View File

@ -1,3 +0,0 @@
boto3
pytz
requests

View File

@ -11,9 +11,11 @@ import sys
from collections import namedtuple
import cimodel.data.binary_build_definitions as binary_build_definitions
import cimodel.data.simple.android_definitions
import cimodel.data.simple.binary_smoketest
import cimodel.data.simple.docker_definitions
import cimodel.data.simple.mobile_definitions
import cimodel.data.simple.nightly_android
import cimodel.data.simple.nightly_ios
import cimodel.data.simple.anaconda_prune_defintions
import cimodel.lib.miniutils as miniutils
@ -135,9 +137,11 @@ def generate_required_docker_images(items):
def gen_build_workflows_tree():
build_workflows_functions = [
cimodel.data.simple.android_definitions.get_workflow_jobs,
cimodel.data.simple.mobile_definitions.get_workflow_jobs,
cimodel.data.simple.binary_smoketest.get_workflow_jobs,
cimodel.data.simple.nightly_ios.get_workflow_jobs,
cimodel.data.simple.nightly_android.get_workflow_jobs,
cimodel.data.simple.anaconda_prune_defintions.get_workflow_jobs,
binary_build_definitions.get_post_upload_jobs,
binary_build_definitions.get_binary_smoke_test_jobs,
@ -194,7 +198,6 @@ YAML_SOURCES = [
File("job-specs/docker_jobs.yml"),
Header("Workflows"),
Treegen(gen_build_workflows_tree, 0),
File("workflows/workflows-ecr-gc.yml"),
File("workflows/workflows-promote.yml"),
]

View File

@ -61,7 +61,7 @@ git --no-pager log --max-count 1
popd
# Clone the Builder master repo
retry git clone -q https://github.com/pytorch/builder.git "$BUILDER_ROOT"
retry git clone -q https://github.com/pytorch/builder.git -b release/1.11 "$BUILDER_ROOT"
pushd "$BUILDER_ROOT"
echo "Using builder from "
git --no-pager log --max-count 1

View File

@ -1,10 +1,24 @@
#!/bin/bash
source /home/circleci/project/env
cat >/home/circleci/project/ci_test_script.sh <<EOL
OUTPUT_SCRIPT=${OUTPUT_SCRIPT:-/home/circleci/project/ci_test_script.sh}
# only source if file exists
if [[ -f /home/circleci/project/env ]]; then
source /home/circleci/project/env
fi
cat >"${OUTPUT_SCRIPT}" <<EOL
# =================== The following code will be executed inside Docker container ===================
set -eux -o pipefail
retry () {
"\$@" || (sleep 1 && "\$@") || (sleep 2 && "\$@")
}
# Source binary env file here if exists
if [[ -e "${BINARY_ENV_FILE:-/nofile}" ]]; then
source "${BINARY_ENV_FILE:-/nofile}"
fi
python_nodot="\$(echo $DESIRED_PYTHON | tr -d m.u)"
# Set up Python
@ -23,7 +37,16 @@ fi
EXTRA_CONDA_FLAGS=""
NUMPY_PIN=""
if [[ "\$python_nodot" = *39* ]]; then
PROTOBUF_PACKAGE="defaults::protobuf"
if [[ "\$python_nodot" = *310* ]]; then
EXTRA_CONDA_FLAGS="-c=conda-forge"
# There's an issue with conda channel priority where it'll randomly pick 1.19 over 1.20
# we set a lower boundary here just to be safe
NUMPY_PIN=">=1.21.2"
PROTOBUF_PACKAGE="protobuf>=3.19.0"
fi
if [[ "\$python_nodot" = *39* ]]; then
EXTRA_CONDA_FLAGS="-c=conda-forge"
# There's an issue with conda channel priority where it'll randomly pick 1.19 over 1.20
# we set a lower boundary here just to be safe
@ -59,7 +82,7 @@ if [[ "$PACKAGE_TYPE" == conda ]]; then
ninja \
dataclasses \
typing-extensions \
defaults::protobuf \
${PROTOBUF_PACKAGE} \
six
if [[ "$DESIRED_CUDA" == 'cpu' ]]; then
retry conda install -c pytorch -y cpuonly
@ -92,4 +115,4 @@ EOL
echo
echo
echo "The script that will run in the next step is:"
cat /home/circleci/project/ci_test_script.sh
cat "${OUTPUT_SCRIPT}"

View File

@ -5,53 +5,70 @@ export TZ=UTC
tagged_version() {
# Grabs version from either the env variable CIRCLE_TAG
# or the pytorch git described version
if [[ "$OSTYPE" == "msys" ]]; then
GIT_DESCRIBE="git --git-dir ${workdir}/p/.git describe"
if [[ "$OSTYPE" == "msys" && -z "${IS_GHA:-}" ]]; then
GIT_DIR="${workdir}/p/.git"
else
GIT_DESCRIBE="git --git-dir ${workdir}/pytorch/.git describe"
GIT_DIR="${workdir}/pytorch/.git"
fi
GIT_DESCRIBE="git --git-dir ${GIT_DIR} describe --tags --match v[0-9]*.[0-9]*.[0-9]*"
if [[ -n "${CIRCLE_TAG:-}" ]]; then
echo "${CIRCLE_TAG}"
elif ${GIT_DESCRIBE} --exact --tags >/dev/null; then
${GIT_DESCRIBE} --tags
elif [[ ! -d "${GIT_DIR}" ]]; then
echo "Abort, abort! Git dir ${GIT_DIR} does not exists!"
kill $$
elif ${GIT_DESCRIBE} --exact >/dev/null; then
${GIT_DESCRIBE}
else
return 1
fi
}
# We need to write an envfile to persist these variables to following
# steps, but the location of the envfile depends on the circleci executor
if [[ "$(uname)" == Darwin ]]; then
# macos executor (builds and tests)
workdir="/Users/distiller/project"
elif [[ "$OSTYPE" == "msys" ]]; then
# windows executor (builds and tests)
workdir="/c/w"
elif [[ -d "/home/circleci/project" ]]; then
# machine executor (binary tests)
workdir="/home/circleci/project"
else
# docker executor (binary builds)
workdir="/"
fi
envfile="$workdir/env"
touch "$envfile"
chmod +x "$envfile"
# These are only relevant for CircleCI
# TODO: Remove these later once migrated fully to GHA
if [[ -z ${IS_GHA:-} ]]; then
# We need to write an envfile to persist these variables to following
# steps, but the location of the envfile depends on the circleci executor
if [[ "$(uname)" == Darwin ]]; then
# macos executor (builds and tests)
workdir="/Users/distiller/project"
elif [[ "$OSTYPE" == "msys" ]]; then
# windows executor (builds and tests)
workdir="/c/w"
elif [[ -d "/home/circleci/project" ]]; then
# machine executor (binary tests)
workdir="/home/circleci/project"
else
# docker executor (binary builds)
workdir="/"
fi
envfile="$workdir/env"
touch "$envfile"
chmod +x "$envfile"
# Parse the BUILD_ENVIRONMENT to package type, python, and cuda
configs=($BUILD_ENVIRONMENT)
export PACKAGE_TYPE="${configs[0]}"
export DESIRED_PYTHON="${configs[1]}"
export DESIRED_CUDA="${configs[2]}"
if [[ "${BUILD_FOR_SYSTEM:-}" == "windows" ]]; then
export DESIRED_DEVTOOLSET=""
export LIBTORCH_CONFIG="${configs[3]:-}"
if [[ "$LIBTORCH_CONFIG" == 'debug' ]]; then
export DEBUG=1
# Parse the BUILD_ENVIRONMENT to package type, python, and cuda
configs=($BUILD_ENVIRONMENT)
export PACKAGE_TYPE="${configs[0]}"
export DESIRED_PYTHON="${configs[1]}"
export DESIRED_CUDA="${configs[2]}"
if [[ "${OSTYPE}" == "msys" ]]; then
export DESIRED_DEVTOOLSET=""
export LIBTORCH_CONFIG="${configs[3]:-}"
if [[ "$LIBTORCH_CONFIG" == 'debug' ]]; then
export DEBUG=1
fi
else
export DESIRED_DEVTOOLSET="${configs[3]:-}"
fi
else
export DESIRED_DEVTOOLSET="${configs[3]:-}"
envfile=${BINARY_ENV_FILE:-/tmp/env}
if [[ -n "${PYTORCH_ROOT}" ]]; then
workdir=$(dirname "${PYTORCH_ROOT}")
else
# docker executor (binary builds)
workdir="/"
fi
fi
if [[ "$PACKAGE_TYPE" == 'libtorch' ]]; then
export BUILD_PYTHONLESS=1
fi
@ -131,20 +148,24 @@ if [[ "$PACKAGE_TYPE" == libtorch ]]; then
fi
fi
cat >>"$envfile" <<EOL
cat >"$envfile" <<EOL
# =================== The following code will be executed inside Docker container ===================
export TZ=UTC
echo "Running on $(uname -a) at $(date)"
export PACKAGE_TYPE="$PACKAGE_TYPE"
export DESIRED_PYTHON="$DESIRED_PYTHON"
export DESIRED_PYTHON="${DESIRED_PYTHON:-}"
export DESIRED_CUDA="$DESIRED_CUDA"
export LIBTORCH_VARIANT="${LIBTORCH_VARIANT:-}"
export BUILD_PYTHONLESS="${BUILD_PYTHONLESS:-}"
export DESIRED_DEVTOOLSET="$DESIRED_DEVTOOLSET"
if [[ "${BUILD_FOR_SYSTEM:-}" == "windows" ]]; then
if [[ "${OSTYPE}" == "msys" ]]; then
export LIBTORCH_CONFIG="${LIBTORCH_CONFIG:-}"
export DEBUG="${DEBUG:-}"
if [[ "${LIBTORCH_CONFIG:-}" == 'debug' ]]; then
export DEBUG=1
fi
export DESIRED_DEVTOOLSET=""
else
export DESIRED_DEVTOOLSET="${DESIRED_DEVTOOLSET:-}"
fi
export DATE="$DATE"
@ -156,6 +177,7 @@ export OVERRIDE_PACKAGE_VERSION="$PYTORCH_BUILD_VERSION"
# TODO: We don't need this anymore IIUC
export TORCH_PACKAGE_NAME='torch'
export TORCH_CONDA_BUILD_FOLDER='pytorch-nightly'
export ANACONDA_USER='pytorch'
export USE_FBGEMM=1
export JAVA_HOME=$JAVA_HOME
@ -163,23 +185,6 @@ export BUILD_JNI=$BUILD_JNI
export PIP_UPLOAD_FOLDER="$PIP_UPLOAD_FOLDER"
export DOCKER_IMAGE="$DOCKER_IMAGE"
export workdir="$workdir"
export MAC_PACKAGE_WORK_DIR="$workdir"
if [[ "$OSTYPE" == "msys" ]]; then
export PYTORCH_ROOT="$workdir/p"
export BUILDER_ROOT="$workdir/b"
else
export PYTORCH_ROOT="$workdir/pytorch"
export BUILDER_ROOT="$workdir/builder"
fi
export MINICONDA_ROOT="$workdir/miniconda"
export PYTORCH_FINAL_PACKAGE_DIR="$workdir/final_pkgs"
export CIRCLE_TAG="${CIRCLE_TAG:-}"
export CIRCLE_SHA1="$CIRCLE_SHA1"
export CIRCLE_PR_NUMBER="${CIRCLE_PR_NUMBER:-}"
export CIRCLE_BRANCH="$CIRCLE_BRANCH"
export CIRCLE_WORKFLOW_ID="$CIRCLE_WORKFLOW_ID"
export USE_GOLD_LINKER="${USE_GOLD_LINKER}"
export USE_GLOO_WITH_OPENSSL="ON"
@ -187,6 +192,42 @@ export USE_WHOLE_CUDNN="${USE_WHOLE_CUDNN}"
# =================== The above code will be executed inside Docker container ===================
EOL
# nproc doesn't exist on darwin
if [[ "$(uname)" != Darwin ]]; then
# Because most Circle executors only have 20 CPUs, using more causes OOMs w/ Ninja and nvcc parallelization
MEMORY_LIMIT_MAX_JOBS=18
NUM_CPUS=$(( $(nproc) - 2 ))
# Defaults here for **binary** linux builds so they can be changed in one place
export MAX_JOBS=${MAX_JOBS:-$(( ${NUM_CPUS} > ${MEMORY_LIMIT_MAX_JOBS} ? ${MEMORY_LIMIT_MAX_JOBS} : ${NUM_CPUS} ))}
cat >>"$envfile" <<EOL
export MAX_JOBS="${MAX_JOBS}"
EOL
fi
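The `MAX_JOBS` default above takes `nproc - 2` but caps it at 18, since Ninja plus nvcc parallelism can OOM the roughly 20-CPU Circle executors. A Python sketch of the same clamp (the function name is mine, not the script's):

```python
def binary_build_max_jobs(num_cpus, memory_limit_max_jobs=18):
    """Mirror the shell default: use num_cpus - 2 workers, capped at 18
    to avoid OOMs with Ninja and nvcc parallelization."""
    return min(num_cpus - 2, memory_limit_max_jobs)
```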
if [[ -z "${IS_GHA:-}" ]]; then
cat >>"$envfile" <<EOL
export workdir="$workdir"
export MAC_PACKAGE_WORK_DIR="$workdir"
if [[ "$OSTYPE" == "msys" ]]; then
export PYTORCH_ROOT="$workdir/p"
export BUILDER_ROOT="$workdir/b"
else
export PYTORCH_ROOT="$workdir/pytorch"
export BUILDER_ROOT="$workdir/builder"
fi
export MINICONDA_ROOT="$workdir/miniconda"
export PYTORCH_FINAL_PACKAGE_DIR="$workdir/final_pkgs"
export CIRCLE_TAG="${CIRCLE_TAG:-}"
export CIRCLE_SHA1="$CIRCLE_SHA1"
export CIRCLE_PR_NUMBER="${CIRCLE_PR_NUMBER:-}"
export CIRCLE_BRANCH="$CIRCLE_BRANCH"
export CIRCLE_WORKFLOW_ID="$CIRCLE_WORKFLOW_ID"
EOL
fi
echo 'retry () {' >> "$envfile"
echo ' $* || (sleep 1 && $*) || (sleep 2 && $*) || (sleep 4 && $*) || (sleep 8 && $*)' >> "$envfile"
echo '}' >> "$envfile"
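The `retry` function appended to the envfile re-runs a failing command after sleeps of 1, 2, 4, and 8 seconds before giving up. A hedged Python sketch of the same pattern (names and signature are illustrative):

```python
import time

def retry(fn, delays=(1, 2, 4, 8)):
    """Run fn; on failure, sleep for each delay in turn and retry.
    Re-raises the last exception once all attempts are exhausted."""
    attempts = [0] + list(delays)  # the first try happens without a sleep
    last_exc = None
    for delay in attempts:
        time.sleep(delay)
        try:
            return fn()
        except Exception as exc:
            last_exc = exc
    raise last_exc
```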

View File

@ -63,6 +63,10 @@ s3_upload() {
)
}
# Install dependencies (should be a no-op if previously installed)
conda install -yq anaconda-client
pip install -q awscli
case "${PACKAGE_TYPE}" in
conda)
conda_upload

View File

@ -1,7 +1,7 @@
#!/bin/bash
set -eux -o pipefail
source "/c/w/env"
source "${BINARY_ENV_FILE:-/c/w/env}"
mkdir -p "$PYTORCH_FINAL_PACKAGE_DIR"
export CUDA_VERSION="${DESIRED_CUDA/cu/}"
@ -10,12 +10,12 @@ export SCCACHE_BUCKET=ossci-compiler-cache-windows
export NIGHTLIES_PYTORCH_ROOT="$PYTORCH_ROOT"
export VC_YEAR=2019
if [[ "${DESIRED_CUDA}" == "cu111" || "${DESIRED_CUDA}" == "cu113" ]]; then
export BUILD_SPLIT_CUDA="ON"
if [[ "${DESIRED_CUDA}" == *"cu11"* ]]; then
export BUILD_SPLIT_CUDA=ON
fi
echo "Free Space for CUDA DEBUG BUILD"
if [[ "$CIRCLECI" == 'true' ]]; then
if [[ "${CIRCLECI:-}" == 'true' ]]; then
if [[ -d "C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\Community" ]]; then
rm -rf "C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\Community"
fi
@ -47,23 +47,20 @@ if [[ "$CIRCLECI" == 'true' ]]; then
if [[ -d "C:\\Program Files (x86)\\Google" ]]; then
rm -rf "C:\\Program Files (x86)\\Google"
fi
fi
set +x
export AWS_ACCESS_KEY_ID=${CIRCLECI_AWS_ACCESS_KEY_FOR_SCCACHE_S3_BUCKET_V4:-}
export AWS_SECRET_ACCESS_KEY=${CIRCLECI_AWS_SECRET_KEY_FOR_SCCACHE_S3_BUCKET_V4:-}
set -x
if [[ "$CIRCLECI" == 'true' && -d "C:\\ProgramData\\Microsoft\\VisualStudio\\Packages\\_Instances" ]]; then
mv "C:\\ProgramData\\Microsoft\\VisualStudio\\Packages\\_Instances" .
rm -rf "C:\\ProgramData\\Microsoft\\VisualStudio\\Packages"
mkdir -p "C:\\ProgramData\\Microsoft\\VisualStudio\\Packages"
mv _Instances "C:\\ProgramData\\Microsoft\\VisualStudio\\Packages"
fi
if [[ "$CIRCLECI" == 'true' && -d "C:\\Microsoft" ]]; then
# don't use quotes here
rm -rf /c/Microsoft/AndroidNDK*
set +x
export AWS_ACCESS_KEY_ID=${CIRCLECI_AWS_ACCESS_KEY_FOR_SCCACHE_S3_BUCKET_V4:-}
export AWS_SECRET_ACCESS_KEY=${CIRCLECI_AWS_SECRET_KEY_FOR_SCCACHE_S3_BUCKET_V4:-}
set -x
if [[ -d "C:\\ProgramData\\Microsoft\\VisualStudio\\Packages\\_Instances" ]]; then
mv "C:\\ProgramData\\Microsoft\\VisualStudio\\Packages\\_Instances" .
rm -rf "C:\\ProgramData\\Microsoft\\VisualStudio\\Packages"
mkdir -p "C:\\ProgramData\\Microsoft\\VisualStudio\\Packages"
mv _Instances "C:\\ProgramData\\Microsoft\\VisualStudio\\Packages"
fi
if [[ -d "C:\\Microsoft" ]]; then
# don't use quotes here
rm -rf /c/Microsoft/AndroidNDK*
fi
fi
echo "Free space on filesystem before build:"
@@ -71,9 +68,9 @@ df -h
pushd "$BUILDER_ROOT"
if [[ "$PACKAGE_TYPE" == 'conda' ]]; then
./windows/internal/build_conda.bat
./windows/internal/build_conda.bat
elif [[ "$PACKAGE_TYPE" == 'wheel' || "$PACKAGE_TYPE" == 'libtorch' ]]; then
./windows/internal/build_wheels.bat
./windows/internal/build_wheels.bat
fi
echo "Free space on filesystem after build:"


@@ -1,7 +1,7 @@
#!/bin/bash
set -eux -o pipefail
source "/c/w/env"
source "${BINARY_ENV_FILE:-/c/w/env}"
export CUDA_VERSION="${DESIRED_CUDA/cu/}"
export VC_YEAR=2019


@@ -1,26 +1,26 @@
#!/bin/bash
set -eux -o pipefail
# This is typically blank but for CUDA 10* it'll be set to 10
windows_version_qualifier=""
windows_s3_link="https://ossci-windows.s3.amazonaws.com"
case ${CUDA_VERSION} in
10.1)
archive_version="v7.6.4.38"
windows_version_qualifier="10"
# This is typically blank but for CUDA 10* it'll be set to 10
cudnn_file_name="cudnn-${CUDA_VERSION}-windows10-x64-v7.6.4.38"
;;
10.2)
archive_version="v7.6.5.32"
windows_version_qualifier="10"
cudnn_file_name="cudnn-${CUDA_VERSION}-windows10-x64-v7.6.5.32"
;;
11.1)
archive_version="v8.0.5.39"
cudnn_file_name="cudnn-${CUDA_VERSION}-windows-x64-v8.0.5.39"
;;
11.3)
archive_version="v8.2.0.53"
cudnn_file_name="cudnn-${CUDA_VERSION}-windows-x64-v8.2.0.53"
;;
11.5)
archive_version="v8.2.0.53"
# Since cuDNN 8.3 the file names have changed
cudnn_file_name="cudnn-windows-x86_64-8.3.2.44_cuda${CUDA_VERSION}-archive"
;;
*)
echo "CUDA_VERSION: ${CUDA_VERSION} not supported yet"
@@ -29,7 +29,7 @@ case ${CUDA_VERSION} in
esac
cudnn_installer_name="cudnn_installer.zip"
cudnn_installer_link="https://ossci-windows.s3.amazonaws.com/cudnn-${CUDA_VERSION}-windows${windows_version_qualifier}-x64-${archive_version}.zip"
cudnn_installer_link="${windows_s3_link}/${cudnn_file_name}.zip"
cudnn_install_folder="C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v${CUDA_VERSION}/"
if [[ -f "${cudnn_install_folder}/include/cudnn.h" ]]; then
@@ -44,6 +44,11 @@ else
# Remove all of the directories before attempting to copy files
rm -rf "${cudnn_install_folder:?}/*"
cp -rf cudnn/cuda/* "${cudnn_install_folder}"
# Make sure the Windows PATH contains the zlib DLL
curl -k -L "${windows_s3_link}/zlib123dllx64.zip" --output "${tmp_dir}\zlib123dllx64.zip"
7z x "${tmp_dir}\zlib123dllx64.zip" -o"${tmp_dir}\zlib"
xcopy /Y "${tmp_dir}\zlib\dll_x64\*.dll" "C:\Windows\System32"
)
rm -rf "${tmp_dir}"
fi
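The case statement above pins one cuDNN archive name per CUDA version, with the naming scheme changing at cuDNN 8.3+. An illustrative sketch of that mapping; only a few branches are reproduced, with file names copied from the script:

```shell
# Sketch of the CUDA_VERSION -> cuDNN archive-name mapping from the
# install script above. Note the "cudnn-windows-x86_64-..." scheme used
# by cuDNN 8.3+ archives versus the older "cudnn-<ver>-windows..." names.
cudnn_file_for () {
  case "$1" in
    10.2) echo "cudnn-10.2-windows10-x64-v7.6.5.32" ;;
    11.3) echo "cudnn-11.3-windows-x64-v8.2.0.53" ;;
    11.5) echo "cudnn-windows-x86_64-8.3.2.44_cuda11.5-archive" ;;
    *)    echo "unsupported" ;;
  esac
}
cudnn_file_for 11.5  # cudnn-windows-x86_64-8.3.2.44_cuda11.5-archive
```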


@@ -62,5 +62,4 @@ binary_windows_params: &binary_windows_params
default: "windows-xlarge-cpu-with-nvidia-cuda"
environment:
BUILD_ENVIRONMENT: << parameters.build_environment >>
BUILD_FOR_SYSTEM: windows
JOB_EXECUTOR: <<parameters.executor>>


@@ -161,6 +161,7 @@
<<: *binary_mac_params
macos:
xcode: "12.0"
resource_class: "large"
steps:
# See Note [Workspace for CircleCI scripts] in job-specs-setup.yml
- checkout


@@ -54,61 +54,3 @@
export AWS_SECRET_ACCESS_KEY=${CIRCLECI_AWS_SECRET_KEY_FOR_DOCKER_BUILDER_V1}
set -x
cd .circleci/docker && ./build_docker.sh
docker_for_ecr_gc_build_job:
machine:
image: ubuntu-2004:202104-01
steps:
- checkout
- run:
name: build_docker_image_for_ecr_gc
no_output_timeout: "1h"
command: |
cd .circleci/ecr_gc_docker
docker build . -t 308535385114.dkr.ecr.us-east-1.amazonaws.com/gc/ecr
set +x
export AWS_ACCESS_KEY_ID=${CIRCLECI_AWS_ACCESS_KEY_FOR_DOCKER_BUILDER_V1}
export AWS_SECRET_ACCESS_KEY=${CIRCLECI_AWS_SECRET_KEY_FOR_DOCKER_BUILDER_V1}
export AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\")
export AWS_REGION=us-east-1
aws ecr get-login-password --region $AWS_REGION|docker login --username AWS \
--password-stdin $AWS_ACCOUNT_ID.dkr.ecr.$AWS_REGION.amazonaws.com
set -x
docker push $AWS_ACCOUNT_ID.dkr.ecr.$AWS_REGION.amazonaws.com/gc/ecr
ecr_gc_job:
parameters:
project:
type: string
default: "pytorch"
tags_to_keep: # comma-separated values
type: string
environment:
PROJECT: << parameters.project >>
# TODO: Remove legacy image tags once we feel comfortable with new docker image tags
IMAGE_TAG: << parameters.tags_to_keep >>
docker:
- image: 308535385114.dkr.ecr.us-east-1.amazonaws.com/gc/ecr
aws_auth:
aws_access_key_id: ${CIRCLECI_AWS_ACCESS_KEY_FOR_DOCKER_BUILDER_V1}
aws_secret_access_key: ${CIRCLECI_AWS_SECRET_KEY_FOR_DOCKER_BUILDER_V1}
steps:
- checkout
- run:
# NOTE: see 'docker_build_job' for how these tags actually get built
name: dynamically generate tags to keep
no_output_timeout: "1h"
command: |
GENERATED_IMAGE_TAG=$(\
git log --oneline --pretty='%H' .circleci/docker \
| xargs -I '{}' git rev-parse '{}:.circleci/docker' \
| paste -sd "," -)
echo "export GENERATED_IMAGE_TAG='${GENERATED_IMAGE_TAG}'" >> ${BASH_ENV}
- run:
name: garbage collecting for ecr images
no_output_timeout: "1h"
command: |
set +x
export AWS_ACCESS_KEY_ID=${CIRCLECI_AWS_ACCESS_KEY_FOR_DOCKER_BUILDER_V1}
export AWS_SECRET_ACCESS_KEY=${CIRCLECI_AWS_SECRET_KEY_FOR_DOCKER_BUILDER_V1}
set -x
/usr/bin/gc.py --filter-prefix ${PROJECT} --ignore-tags "${IMAGE_TAG},${GENERATED_IMAGE_TAG}"


@@ -27,7 +27,7 @@
pytorch_python_doc_build:
environment:
BUILD_ENVIRONMENT: pytorch-python-doc-push
DOCKER_IMAGE: "308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-xenial-py3.6-gcc5.4"
DOCKER_IMAGE: "308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-xenial-py3.7-gcc5.4"
resource_class: large
machine:
image: ubuntu-2004:202104-01
@@ -43,8 +43,8 @@
set -ex
export COMMIT_DOCKER_IMAGE=${DOCKER_IMAGE}:build-${DOCKER_TAG}-${CIRCLE_SHA1}
echo "DOCKER_IMAGE: "${COMMIT_DOCKER_IMAGE}
# turn v1.12.0rc3 into 1.12.0
tag=$(echo $CIRCLE_TAG | sed -e 's/v*\([0-9.]*\).*/\1/')
# turn v1.12.0rc3 into 1.12
tag=$(echo $CIRCLE_TAG | sed -e 's/v*\([0-9]*\.[0-9]*\).*/\1/')
target=${tag:-master}
echo "building for ${target}"
time docker pull ${COMMIT_DOCKER_IMAGE} >/dev/null
@@ -73,7 +73,7 @@
pytorch_cpp_doc_build:
environment:
BUILD_ENVIRONMENT: pytorch-cpp-doc-push
DOCKER_IMAGE: "308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-xenial-py3.6-gcc5.4"
DOCKER_IMAGE: "308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-xenial-py3.7-gcc5.4"
resource_class: large
machine:
image: ubuntu-2004:202104-01
@@ -89,9 +89,8 @@
set -ex
export COMMIT_DOCKER_IMAGE=${DOCKER_IMAGE}:build-${DOCKER_TAG}-${CIRCLE_SHA1}
echo "DOCKER_IMAGE: "${COMMIT_DOCKER_IMAGE}
# turn v1.12.0rc3 into 1.12.0
tag=$(echo $CIRCLE_TAG | sed -e 's/v*\([0-9.]*\).*/\1/')
tag=${CIRCLE_TAG:1:5}
# turn v1.12.0rc3 into 1.12
tag=$(echo $CIRCLE_TAG | sed -e 's/v*\([0-9]*\.[0-9]*\).*/\1/')
target=${tag:-master}
echo "building for ${target}"
time docker pull ${COMMIT_DOCKER_IMAGE} >/dev/null
@@ -253,7 +252,7 @@
environment:
BUILD_ENVIRONMENT: pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-build
DOCKER_IMAGE: "308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-xenial-py3-clang5-android-ndk-r19c"
PYTHON_VERSION: "3.6"
PYTHON_VERSION: "3.7"
resource_class: large
machine:
image: ubuntu-2004:202104-01
@@ -342,7 +341,7 @@
environment:
BUILD_ENVIRONMENT: pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-publish-snapshot
DOCKER_IMAGE: "308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-xenial-py3-clang5-android-ndk-r19c"
PYTHON_VERSION: "3.6"
PYTHON_VERSION: "3.7"
resource_class: large
machine:
image: ubuntu-2004:202104-01
@@ -378,7 +377,7 @@
environment:
BUILD_ENVIRONMENT: pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-build-only-x86_32
DOCKER_IMAGE: "308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-xenial-py3-clang5-android-ndk-r19c"
PYTHON_VERSION: "3.6"
PYTHON_VERSION: "3.7"
resource_class: large
machine:
image: ubuntu-2004:202104-01
@@ -660,7 +659,7 @@
pytorch_doc_test:
environment:
BUILD_ENVIRONMENT: pytorch-doc-test
DOCKER_IMAGE: "308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-xenial-py3.6-gcc5.4"
DOCKER_IMAGE: "308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-xenial-py3.7-gcc5.4"
resource_class: medium
machine:
image: ubuntu-2004:202104-01
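Both doc-push jobs above derive the docs target from the release tag with the same sed expression, turning `v1.12.0rc3` into `1.12`. A quick standalone check (the `extract_tag` wrapper name is ours, for illustration only):

```shell
# Reproduces the sed expression from the doc-push jobs above: strip a
# leading "v" and keep only the major.minor portion of a release tag.
extract_tag () {
  echo "$1" | sed -e 's/v*\([0-9]*\.[0-9]*\).*/\1/'
}
extract_tag "v1.12.0rc3"  # 1.12
extract_tag "v1.11.0"     # 1.11
```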


@@ -148,10 +148,10 @@ jobs:
command: |
set -e
is_vanilla_build() {
if [ "${BUILD_ENVIRONMENT}" == "pytorch-linux-bionic-py3.6-clang9-test" ]; then
if [ "${BUILD_ENVIRONMENT}" == "pytorch-linux-bionic-py3.7-clang9-test" ]; then
return 0
fi
if [ "${BUILD_ENVIRONMENT}" == "pytorch-linux-xenial-py3.6-gcc5.4-test" ]; then
if [ "${BUILD_ENVIRONMENT}" == "pytorch-linux-xenial-py3.7-gcc5.4-test" ]; then
return 0
fi
return 1


@@ -26,6 +26,7 @@
# (smoke tests and upload jobs do not need the pytorch repo).
binary_checkout: &binary_checkout
name: Checkout pytorch/builder repo
no_output_timeout: "30m"
command: .circleci/scripts/binary_checkout.sh
# Parses circleci arguments in a consistent way, essentially routing to the


@@ -1,34 +0,0 @@
ecr_gc:
triggers:
- schedule:
cron: "45 * * * *"
filters:
branches:
only:
- master
jobs:
- docker_for_ecr_gc_build_job
- ecr_gc_job:
name: ecr_gc_job_for_pytorch
project: pytorch
tags_to_keep: "271,262,256,278,282,291,300,323,327,347,389,401,402,403,405,a8006f9a-272d-4478-b137-d121c6f05c83,6e7b11da-a919-49e5-b2ba-da66e3d4bb0a,f990c76a-a798-42bb-852f-5be5006f8026,e43973a9-9d5a-4138-9181-a08a0fc55e2f,8fcf46ef-4a34-480b-a8ee-b0a30a4d3e59,9a3986fa-7ce7-4a36-a001-3c9bef9892e2,1bc00f11-e0f3-4e5c-859f-15937dd938cd,209062ef-ab58-422a-b295-36c4eed6e906,be76e8fd-44e2-484d-b090-07e0cc3a56f0,fff7795428560442086f7b2bb6004b65245dc11a,ab1632df-fa59-40e6-8c23-98e004f61148"
requires:
- docker_for_ecr_gc_build_job
- ecr_gc_job:
name: ecr_gc_job_for_caffe2
project: caffe2
tags_to_keep: "376,373,369,348,345,336,325,324,315,306,301,287,283,276,273,266,253,248,238,230,213"
requires:
- docker_for_ecr_gc_build_job
- ecr_gc_job:
name: ecr_gc_job_for_translate
project: translate
tags_to_keep: "8"
requires:
- docker_for_ecr_gc_build_job
- ecr_gc_job:
name: ecr_gc_job_for_tensorcomp
project: tensorcomp
tags_to_keep: "34"
requires:
- docker_for_ecr_gc_build_job


@@ -2,6 +2,8 @@ self-hosted-runner:
labels:
- linux.large
- linux.2xlarge
- linux.4xlarge
- linux.4xlarge.nvidia.gpu
- linux.8xlarge.nvidia.gpu
- linux.16xlarge.nvidia.gpu
- windows.4xlarge


@@ -16,17 +16,20 @@
"libtorch-linux-xenial-cuda11.3-py3.7-gcc7",
"linux-bionic-cuda10.2-py3.9-gcc7",
"linux-bionic-py3.7-clang9",
"linux-bionic-rocm4.5-py3.7",
"linux-docs",
"linux-docs-push",
"linux-vulkan-bionic-py3.7-clang9",
"linux-xenial-cuda11.3-py3.7-gcc7",
"linux-xenial-cuda11.3-py3.7-gcc7-bazel-test",
"linux-xenial-cuda11.3-py3.7-gcc7-no-ops",
"linux-xenial-py3-clang5-mobile-build",
"linux-xenial-py3-clang5-mobile-custom-build-static",
"linux-xenial-py3.7-clang7-asan",
"linux-xenial-py3.7-clang7-onnx",
"linux-xenial-py3.7-gcc5.4",
"linux-xenial-py3.7-gcc7",
"linux-xenial-py3.7-gcc7-no-ops",
"macos-10-15-py3-arm64",
"macos-10-15-py3-lite-interpreter-x86-64",
"macos-11-py3-x86-64",
@@ -41,6 +44,7 @@
"pytorch-linux-xenial-py3-clang5-android-ndk-r19c-build",
"pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single",
"pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single-full-jit",
"pytorch-xla-linux-bionic-py3.7-clang8",
"win-vs2019-cpu-py3",
"win-vs2019-cuda11.3-py3"
],
@@ -52,6 +56,28 @@
"ciflow/bazel": [
"linux-xenial-cuda11.3-py3.7-gcc7-bazel-test"
],
"ciflow/binaries": [
"linux-binary-conda",
"linux-binary-libtorch-cxx11-abi",
"linux-binary-libtorch-pre-cxx11",
"linux-binary-manywheel",
"windows-binary-libtorch-debug",
"windows-binary-libtorch-release",
"windows-binary-wheel"
],
"ciflow/binaries_conda": [
"linux-binary-conda"
],
"ciflow/binaries_libtorch": [
"linux-binary-libtorch-cxx11-abi",
"linux-binary-libtorch-pre-cxx11",
"windows-binary-libtorch-debug",
"windows-binary-libtorch-release"
],
"ciflow/binaries_wheel": [
"linux-binary-manywheel",
"windows-binary-wheel"
],
"ciflow/cpu": [
"caffe2-linux-xenial-py3.7-gcc5.4",
"linux-bionic-py3.7-clang9",
@@ -63,10 +89,12 @@
"linux-xenial-py3.7-clang7-onnx",
"linux-xenial-py3.7-gcc5.4",
"linux-xenial-py3.7-gcc7",
"linux-xenial-py3.7-gcc7-no-ops",
"parallelnative-linux-xenial-py3.7-gcc5.4",
"pytorch-linux-xenial-py3-clang5-android-ndk-r19c-build",
"pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single",
"pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single-full-jit",
"pytorch-xla-linux-bionic-py3.7-clang8",
"win-vs2019-cpu-py3"
],
"ciflow/cuda": [
@@ -74,6 +102,7 @@
"libtorch-linux-xenial-cuda11.3-py3.7-gcc7",
"linux-bionic-cuda10.2-py3.9-gcc7",
"linux-xenial-cuda11.3-py3.7-gcc7",
"linux-xenial-cuda11.3-py3.7-gcc7-no-ops",
"periodic-libtorch-linux-bionic-cuda11.5-py3.7-gcc7",
"periodic-libtorch-linux-xenial-cuda11.1-py3.7-gcc7",
"periodic-linux-bionic-cuda11.5-py3.7-gcc7",
@@ -84,7 +113,12 @@
"win-vs2019-cuda11.3-py3"
],
"ciflow/default": [
"linux-binary-conda",
"linux-binary-libtorch-cxx11-abi",
"linux-binary-libtorch-pre-cxx11",
"linux-binary-manywheel",
"linux-bionic-py3.7-clang9",
"linux-bionic-rocm4.5-py3.7",
"linux-docs",
"linux-vulkan-bionic-py3.7-clang9",
"linux-xenial-cuda11.3-py3.7-gcc7",
@@ -95,10 +129,14 @@
"linux-xenial-py3.7-clang7-onnx",
"linux-xenial-py3.7-gcc5.4",
"linux-xenial-py3.7-gcc7",
"linux-xenial-py3.7-gcc7-no-ops",
"pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single",
"pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single-full-jit",
"win-vs2019-cpu-py3",
"win-vs2019-cuda11.3-py3"
"win-vs2019-cuda11.3-py3",
"windows-binary-libtorch-debug",
"windows-binary-libtorch-release",
"windows-binary-wheel"
],
"ciflow/docs": [
"linux-docs"
@@ -125,17 +163,20 @@
"libtorch-linux-xenial-cuda11.3-py3.7-gcc7",
"linux-bionic-cuda10.2-py3.9-gcc7",
"linux-bionic-py3.7-clang9",
"linux-bionic-rocm4.5-py3.7",
"linux-docs",
"linux-docs-push",
"linux-vulkan-bionic-py3.7-clang9",
"linux-xenial-cuda11.3-py3.7-gcc7",
"linux-xenial-cuda11.3-py3.7-gcc7-bazel-test",
"linux-xenial-cuda11.3-py3.7-gcc7-no-ops",
"linux-xenial-py3-clang5-mobile-build",
"linux-xenial-py3-clang5-mobile-custom-build-static",
"linux-xenial-py3.7-clang7-asan",
"linux-xenial-py3.7-clang7-onnx",
"linux-xenial-py3.7-gcc5.4",
"linux-xenial-py3.7-gcc7",
"linux-xenial-py3.7-gcc7-no-ops",
"parallelnative-linux-xenial-py3.7-gcc5.4",
"periodic-libtorch-linux-bionic-cuda11.5-py3.7-gcc7",
"periodic-libtorch-linux-xenial-cuda11.1-py3.7-gcc7",
@@ -144,7 +185,8 @@
"periodic-linux-xenial-cuda11.1-py3.7-gcc7-debug",
"pytorch-linux-xenial-py3-clang5-android-ndk-r19c-build",
"pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single",
"pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single-full-jit"
"pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single-full-jit",
"pytorch-xla-linux-bionic-py3.7-clang8"
],
"ciflow/macos": [
"ios-12-5-1-arm64",
@@ -169,6 +211,9 @@
"ciflow/onnx": [
"linux-xenial-py3.7-clang7-onnx"
],
"ciflow/rocm": [
"linux-bionic-rocm4.5-py3.7"
],
"ciflow/sanitizers": [
"linux-xenial-py3.7-clang7-asan"
],
@@ -204,16 +249,19 @@
"libtorch-linux-xenial-cuda11.3-py3.7-gcc7",
"linux-bionic-cuda10.2-py3.9-gcc7",
"linux-bionic-py3.7-clang9",
"linux-bionic-rocm4.5-py3.7",
"linux-docs",
"linux-vulkan-bionic-py3.7-clang9",
"linux-xenial-cuda11.3-py3.7-gcc7",
"linux-xenial-cuda11.3-py3.7-gcc7-bazel-test",
"linux-xenial-cuda11.3-py3.7-gcc7-no-ops",
"linux-xenial-py3-clang5-mobile-build",
"linux-xenial-py3-clang5-mobile-custom-build-static",
"linux-xenial-py3.7-clang7-asan",
"linux-xenial-py3.7-clang7-onnx",
"linux-xenial-py3.7-gcc5.4",
"linux-xenial-py3.7-gcc7",
"linux-xenial-py3.7-gcc7-no-ops",
"macos-10-15-py3-arm64",
"macos-10-15-py3-lite-interpreter-x86-64",
"macos-11-py3-x86-64",
@@ -221,6 +269,7 @@
"pytorch-linux-xenial-py3-clang5-android-ndk-r19c-build",
"pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single",
"pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single-full-jit",
"pytorch-xla-linux-bionic-py3.7-clang8",
"win-vs2019-cpu-py3",
"win-vs2019-cuda11.3-py3"
],
@@ -232,6 +281,9 @@
"periodic-win-vs2019-cuda11.5-py3",
"win-vs2019-cpu-py3",
"win-vs2019-cuda11.3-py3"
],
"ciflow/xla": [
"pytorch-xla-linux-bionic-py3.7-clang8"
]
},
"version": "v1"

.github/merge_rules.json vendored Normal file

@@ -0,0 +1,20 @@
[
{
"name": "ONNX exporter",
"patterns": ["torch/onnx/**", "torch/csrc/jit/passes/onnx/**", "torch/csrc/jit/passes/onnx.*", "test/onnx/**", "docs/source/onnx.rst"],
"approved_by": ["BowenBao", "garymm"],
"mandatory_app_id": 12274
},
{
"name": "NVFuser",
"patterns": ["torch/csrc/jit/codegen/fuser/cuda/**", "torch/csrc/jit/codegen/cuda/**", "benchmarks/cpp/nvfuser/**"],
"approved_by": ["csarofeen", "ngimel"],
"mandatory_app_id": 12274
},
{
"name": "OSS CI",
"patterns": [".github/**", ".circleci/**", ".jenkins/**", "scripts/**"],
"approved_by": ["seemethere", "malfet", "suo"],
"mandatory_app_id": 12274
}
]
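The new `merge_rules.json` above maps file-glob patterns to required approvers. An illustrative sketch only: the real matcher lives in PyTorch's merge tooling, not in this file, and Python's `fnmatch` treats `*` (and hence `**`) as crossing `/` separators, which is close enough to show the idea:

```python
import fnmatch
import json

# A trimmed copy of one rule from merge_rules.json above, parsed the same
# way the real tooling would parse the file.
rules = json.loads("""
[
  {"name": "ONNX exporter",
   "patterns": ["torch/onnx/**", "test/onnx/**"],
   "approved_by": ["BowenBao", "garymm"]}
]
""")

def matching_rules(changed_file: str) -> list:
    # Return the name of every rule whose patterns cover the changed file.
    return [r["name"] for r in rules
            if any(fnmatch.fnmatchcase(changed_file, p) for p in r["patterns"])]

print(matching_rules("torch/onnx/utils.py"))  # ['ONNX exporter']
print(matching_rules("torch/nn/linear.py"))   # []
```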


@@ -5,6 +5,9 @@
#
# NOTE (Apr 5, 2021): Linux runners are currently all on amazonlinux2
#
# NOTE (Jan 5, 2021): Linux runners are all non-ephemeral to reduce the number of CreateInstances calls
# and avoid RequestLimitExceeded issues
#
# TODO: Add some documentation on how the auto-scaling works
#
# NOTE: Default values,
@@ -29,10 +32,13 @@ runner_types:
os: linux
max_available: 500
disk_size: 150
linux.4xlarge:
is_ephemeral: false
linux.4xlarge: # for binary-builds
instance_type: c5.4xlarge
os: linux
max_available: 250
disk_size: 150
is_ephemeral: false
linux.8xlarge.nvidia.gpu:
instance_type: g3.8xlarge
os: linux


@@ -10,19 +10,13 @@ architectures:
* Latest ROCM
"""
import argparse
import json
from typing import Dict, List
from typing import Dict, List, Tuple
CUDA_ARCHES = [
"10.2",
"11.1"
]
ROCM_ARCHES = [
"3.10",
"4.0"
]
CUDA_ARCHES = ["10.2", "11.1", "11.3", "11.5"]
ROCM_ARCHES = ["4.3.1", "4.5.2"]
def arch_type(arch_version: str) -> str:
@@ -36,131 +30,168 @@ def arch_type(arch_version: str) -> str:
WHEEL_CONTAINER_IMAGES = {
**{
# TODO: Re-do manylinux CUDA image tagging scheme to be similar to
# ROCM so we don't have to do this replacement
gpu_arch: f"pytorch/manylinux-cuda{gpu_arch.replace('.', '')}"
gpu_arch: f"pytorch/manylinux-builder:cuda{gpu_arch}"
for gpu_arch in CUDA_ARCHES
},
**{
gpu_arch: f"pytorch/manylinux-rocm:{gpu_arch}"
gpu_arch: f"pytorch/manylinux-builder:rocm{gpu_arch}"
for gpu_arch in ROCM_ARCHES
},
"cpu": "pytorch/manylinux-cpu"
"cpu": "pytorch/manylinux-builder:cpu",
}
CONDA_CONTAINER_IMAGES = {
**{
gpu_arch: f"pytorch/conda-builder:cuda{gpu_arch}"
for gpu_arch in CUDA_ARCHES
},
"cpu": "pytorch/conda-builder:cpu"
**{gpu_arch: f"pytorch/conda-builder:cuda{gpu_arch}" for gpu_arch in CUDA_ARCHES},
"cpu": "pytorch/conda-builder:cpu",
}
LIBTORCH_CONTAINER_IMAGES = {
PRE_CXX11_ABI = "pre-cxx11"
CXX11_ABI = "cxx11-abi"
RELEASE = "release"
DEBUG = "debug"
LIBTORCH_CONTAINER_IMAGES: Dict[Tuple[str, str], str] = {
**{
# TODO: Re-do manylinux CUDA image tagging scheme to be similar to
# ROCM so we don't have to do this replacement
(gpu_arch, "pre-cxx11"): f"pytorch/manylinux-cuda{gpu_arch.replace('.', '')}"
(gpu_arch, PRE_CXX11_ABI): f"pytorch/manylinux-builder:cuda{gpu_arch}"
for gpu_arch in CUDA_ARCHES
},
**{
(gpu_arch, "cxx11-abi"): f"pytorch/libtorch-cxx11-builder:cuda{gpu_arch}"
(gpu_arch, CXX11_ABI): f"pytorch/libtorch-cxx11-builder:cuda{gpu_arch}"
for gpu_arch in CUDA_ARCHES
},
("cpu", "pre-cxx11"): "pytorch/manylinux-cpu",
("cpu", "cxx11-abi"): "pytorch/libtorch-cxx11-builder:cpu",
("cpu", PRE_CXX11_ABI): "pytorch/manylinux-builder:cpu",
("cpu", CXX11_ABI): "pytorch/libtorch-cxx11-builder:cpu",
}
FULL_PYTHON_VERSIONS = [
"3.7",
"3.8",
"3.9",
]
FULL_PYTHON_VERSIONS = ["3.7", "3.8", "3.9", "3.10"]
def is_pull_request() -> bool:
return False
# return os.environ.get("GITHUB_HEAD_REF")
def translate_desired_cuda(gpu_arch_type: str, gpu_arch_version: str) -> str:
return {
"cpu": "cpu",
"cuda": f"cu{gpu_arch_version.replace('.', '')}",
"rocm": f"rocm{gpu_arch_version}",
}.get(gpu_arch_type, gpu_arch_version)
def snip_if(is_pr: bool, versions: List[str]) -> List[str]:
"""
Return the full list of versions, or just the latest if on a PR.
"""
return [versions[-1]] if is_pr else versions
def list_without(in_list: List[str], without: List[str]) -> List[str]:
return [item for item in in_list if item not in without]
def generate_conda_matrix(is_pr: bool) -> List[Dict[str, str]]:
return [
{
"python_version": python_version,
"gpu_arch_type": arch_type(arch_version),
"gpu_arch_version": arch_version,
"container_image": CONDA_CONTAINER_IMAGES[arch_version],
}
for python_version in snip_if(is_pr, FULL_PYTHON_VERSIONS)
def generate_conda_matrix(os: str) -> List[Dict[str, str]]:
ret: List[Dict[str, str]] = []
arches = ["cpu"]
if os == "linux":
arches += CUDA_ARCHES
elif os == "windows":
# We don't build CUDA 10.2 for Windows, see https://github.com/pytorch/pytorch/issues/65648
arches += list_without(CUDA_ARCHES, ["10.2"])
for python_version in FULL_PYTHON_VERSIONS:
# We don't currently build conda packages for rocm
for arch_version in ["cpu"] + snip_if(is_pr, CUDA_ARCHES)
]
for arch_version in arches:
gpu_arch_type = arch_type(arch_version)
gpu_arch_version = "" if arch_version == "cpu" else arch_version
ret.append(
{
"python_version": python_version,
"gpu_arch_type": gpu_arch_type,
"gpu_arch_version": gpu_arch_version,
"desired_cuda": translate_desired_cuda(
gpu_arch_type, gpu_arch_version
),
"container_image": CONDA_CONTAINER_IMAGES[arch_version],
"package_type": "conda",
"build_name": f"conda-py{python_version}-{gpu_arch_type}{gpu_arch_version}".replace(
".", "_"
),
}
)
return ret
def generate_libtorch_matrix(is_pr: bool) -> List[Dict[str, str]]:
def generate_libtorch_matrix(os: str, abi_version: str) -> List[Dict[str, str]]:
libtorch_variants = [
"shared-with-deps",
"shared-without-deps",
"static-with-deps",
"static-without-deps",
]
return [
{
"gpu_arch_type": arch_type(arch_version),
"gpu_arch_version": arch_version,
"libtorch_variant": libtorch_variant,
"devtoolset": abi_version,
"container_image": LIBTORCH_CONTAINER_IMAGES[(arch_version, abi_version)],
}
# We don't currently build libtorch for rocm
for arch_version in ["cpu"] + snip_if(is_pr, CUDA_ARCHES)
for libtorch_variant in libtorch_variants
# one of the values in the following list must be exactly
# "cxx11-abi", but the precise value of the other one doesn't
# matter
for abi_version in ["cxx11-abi", "pre-cxx11"]
]
def generate_wheels_matrix(is_pr: bool) -> List[Dict[str, str]]:
ret: List[Dict[str, str]] = []
arches = ["cpu"]
arches += snip_if(is_pr, CUDA_ARCHES)
arches += snip_if(is_pr, ROCM_ARCHES)
return [
{
"python_version": python_version,
"gpu_arch_type": arch_type(arch_version),
"gpu_arch_version": arch_version,
"container_image": WHEEL_CONTAINER_IMAGES[arch_version],
}
for python_version in snip_if(is_pr, FULL_PYTHON_VERSIONS)
for arch_version in arches
]
if os == "linux":
arches += CUDA_ARCHES
elif os == "windows":
# We don't build CUDA 10.2 for Windows, see https://github.com/pytorch/pytorch/issues/65648
arches += list_without(CUDA_ARCHES, ["10.2"])
for arch_version in arches:
for libtorch_variant in libtorch_variants:
# We don't currently build libtorch for rocm
# one of the values in the following list must be exactly
# CXX11_ABI, but the precise value of the other one doesn't
# matter
gpu_arch_type = arch_type(arch_version)
gpu_arch_version = "" if arch_version == "cpu" else arch_version
ret.append(
{
"gpu_arch_type": gpu_arch_type,
"gpu_arch_version": gpu_arch_version,
"desired_cuda": translate_desired_cuda(
gpu_arch_type, gpu_arch_version
),
"libtorch_variant": libtorch_variant,
"libtorch_config": abi_version if os == "windows" else "",
"devtoolset": abi_version if os != "windows" else "",
"container_image": LIBTORCH_CONTAINER_IMAGES[
(arch_version, abi_version)
] if os != "windows" else "",
"package_type": "libtorch",
"build_name": f"libtorch-{gpu_arch_type}{gpu_arch_version}-{libtorch_variant}-{abi_version}".replace(
".", "_"
),
}
)
return ret
def from_includes(includes: List[Dict[str, str]]) -> str:
return json.dumps({"include": includes})
def generate_wheels_matrix(os: str) -> List[Dict[str, str]]:
arches = ["cpu"]
package_type = "wheel"
if os == "linux":
arches += CUDA_ARCHES + ROCM_ARCHES
# NOTE: We only build manywheel packages for linux
package_type = "manywheel"
elif os == "windows":
# We don't build CUDA 10.2 for Windows, see https://github.com/pytorch/pytorch/issues/65648
arches += list_without(CUDA_ARCHES, ["10.2"])
ret: List[Dict[str, str]] = []
for python_version in FULL_PYTHON_VERSIONS:
for arch_version in arches:
gpu_arch_type = arch_type(arch_version)
gpu_arch_version = "" if arch_version == "cpu" else arch_version
ret.append(
{
"python_version": python_version,
"gpu_arch_type": gpu_arch_type,
"gpu_arch_version": gpu_arch_version,
"desired_cuda": translate_desired_cuda(
gpu_arch_type, gpu_arch_version
),
"container_image": WHEEL_CONTAINER_IMAGES[arch_version],
"package_type": package_type,
"build_name": f"{package_type}-py{python_version}-{gpu_arch_type}{gpu_arch_version}".replace(
".", "_"
),
}
)
return ret
def main() -> None:
parser = argparse.ArgumentParser()
parser.add_argument('mode', choices=['conda', 'libtorch', 'wheels'])
args = parser.parse_args()
is_pr = is_pull_request()
print(from_includes({
'conda': generate_conda_matrix,
'libtorch': generate_libtorch_matrix,
'wheels': generate_wheels_matrix,
}[args.mode](is_pr)))
if __name__ == "__main__":
main()
def generate_binary_build_matrix(os: str) -> List[Dict[str, str]]:
return {
"linux": [
*generate_conda_matrix(os),
*generate_libtorch_matrix(os, abi_version=PRE_CXX11_ABI),
*generate_libtorch_matrix(os, abi_version=CXX11_ABI),
*generate_wheels_matrix(os),
]
}[os]
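The generator above threads every build config through `translate_desired_cuda` to produce the `desired_cuda` field and the build-name suffix. Pulled out as a standalone sketch, with the function body copied from the diff:

```python
# Copied from generate_binary_build_matrix.py above: map a GPU arch type
# and version to the DESIRED_CUDA identifier used in build names,
# e.g. ("cuda", "11.3") -> "cu113".
def translate_desired_cuda(gpu_arch_type: str, gpu_arch_version: str) -> str:
    return {
        "cpu": "cpu",
        "cuda": f"cu{gpu_arch_version.replace('.', '')}",
        "rocm": f"rocm{gpu_arch_version}",
    }.get(gpu_arch_type, gpu_arch_version)

print(translate_desired_cuda("cuda", "11.3"))   # cu113
print(translate_desired_cuda("rocm", "4.5.2"))  # rocm4.5.2
print(translate_desired_cuda("cpu", ""))        # cpu
```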


@@ -10,6 +10,8 @@ import os
import sys
from typing_extensions import Literal
import generate_binary_build_matrix # type: ignore[import]
YamlShellBool = Literal["''", 1]
Arch = Literal["windows", "linux", "macos"]
@@ -27,9 +29,22 @@ WINDOWS_RUNNERS = {
LINUX_CPU_TEST_RUNNER = "linux.2xlarge"
# contains 1 gpu
LINUX_CUDA_TEST_RUNNER = "linux.4xlarge.nvidia.gpu"
# contains at least 2 gpus
LINUX_ROCM_TEST_RUNNER = "linux.rocm.gpu"
LINUX_RUNNERS = {
LINUX_CPU_TEST_RUNNER,
LINUX_CUDA_TEST_RUNNER,
LINUX_ROCM_TEST_RUNNER,
}
LINUX_DISTRIBUTED_GPU_RUNNERS = {
LINUX_CUDA_TEST_RUNNER : "linux.8xlarge.nvidia.gpu",
LINUX_ROCM_TEST_RUNNER : LINUX_ROCM_TEST_RUNNER,
}
LINUX_MULTIGPU_RUNNERS = {
LINUX_CUDA_TEST_RUNNER : "linux.16xlarge.nvidia.gpu",
LINUX_ROCM_TEST_RUNNER : LINUX_ROCM_TEST_RUNNER,
}
MACOS_TEST_RUNNER_10_15 = "macos-10.15"
@@ -44,6 +59,9 @@ CUDA_RUNNERS = {
WINDOWS_CUDA_TEST_RUNNER,
LINUX_CUDA_TEST_RUNNER,
}
ROCM_RUNNERS = {
LINUX_ROCM_TEST_RUNNER,
}
CPU_RUNNERS = {
WINDOWS_CPU_TEST_RUNNER,
LINUX_CPU_TEST_RUNNER,
@@ -53,6 +71,7 @@ LABEL_CIFLOW_ALL = "ciflow/all"
LABEL_CIFLOW_BAZEL = "ciflow/bazel"
LABEL_CIFLOW_CPU = "ciflow/cpu"
LABEL_CIFLOW_CUDA = "ciflow/cuda"
LABEL_CIFLOW_ROCM = "ciflow/rocm"
LABEL_CIFLOW_DOCS = "ciflow/docs"
LABEL_CIFLOW_DEFAULT = "ciflow/default"
LABEL_CIFLOW_LIBTORCH = "ciflow/libtorch"
@@ -73,6 +92,10 @@ LABEL_CIFLOW_DOCKER = "ciflow/docker"
LABEL_CIFLOW_IOS = "ciflow/ios"
LABEL_CIFLOW_MACOS = "ciflow/macos"
LABEL_CIFLOW_TRUNK = "ciflow/trunk"
LABEL_CIFLOW_BINARIES = "ciflow/binaries"
LABEL_CIFLOW_BINARIES_WHEEL = "ciflow/binaries_wheel"
LABEL_CIFLOW_BINARIES_CONDA = "ciflow/binaries_conda"
LABEL_CIFLOW_BINARIES_LIBTORCH = "ciflow/binaries_libtorch"
@dataclass
@@ -80,43 +103,15 @@ class CIFlowConfig:
# For use to enable workflows to run on pytorch/pytorch-canary
run_on_canary: bool = False
labels: Set[str] = field(default_factory=set)
trigger_action: str = 'unassigned'
trigger_actor: str = 'pytorchbot'
root_job_condition: str = ''
label_conditions: str = ''
def gen_root_job_condition(self) -> None:
# CIFlow conditions:
# - Workflow should always run on push
# - CIFLOW_DEFAULT workflows should run on PRs even if no `ciflow/` labels on PR
# - Otherwise workflow should be scheduled on all qualifying events
label_conditions = [f"contains(github.event.pull_request.labels.*.name, '{label}')" for label in sorted(self.labels)]
self.label_conditions = ' || '.join(label_conditions)
repo_condition = "github.repository_owner == 'pytorch'" if self.run_on_canary else "github.repository == 'pytorch/pytorch'"
push_event = "github.event_name == 'push'"
scheduled_event = "github.event_name == 'schedule'"
pr_updated_event = f"github.event_name == 'pull_request' && github.event.action != '{self.trigger_action}'"
if LABEL_CIFLOW_DEFAULT in self.labels:
run_with_no_labels = f"({pr_updated_event}) && " \
f"!contains(join(github.event.pull_request.labels.*.name), '{LABEL_CIFLOW_PREFIX}')"
else:
run_with_no_labels = "false"
self.root_job_condition = f"${{{{ ({repo_condition}) && (\n" \
f" ({push_event}) ||\n" \
f" ({scheduled_event}) ||\n" \
f" ({self.label_conditions}) ||\n" \
f" ({run_with_no_labels}))\n"\
f" }}}}"
def reset_root_job(self) -> None:
self.root_job_condition = ''
# Certain jobs might not want to be part of the ciflow/[all,trunk] workflow
isolated_workflow: bool = False
def __post_init__(self) -> None:
self.labels.add(LABEL_CIFLOW_ALL)
if LABEL_CIFLOW_SCHEDULED not in self.labels:
self.labels.add(LABEL_CIFLOW_TRUNK)
if not self.isolated_workflow:
self.labels.add(LABEL_CIFLOW_ALL)
if LABEL_CIFLOW_SCHEDULED not in self.labels:
self.labels.add(LABEL_CIFLOW_TRUNK)
assert all(label.startswith(LABEL_CIFLOW_PREFIX) for label in self.labels)
self.gen_root_job_condition()
@dataclass
@@ -155,6 +150,8 @@ class CIWorkflow:
# Optional fields
test_runner_type: str = ''
multigpu_runner_type: str = ''
distributed_gpu_runner_type: str = ''
ciflow_config: CIFlowConfig = field(default_factory=CIFlowConfig)
cuda_version: str = ''
docker_image_base: str = ''
@@ -163,6 +160,7 @@
build_generates_artifacts: bool = True
build_with_debug: bool = False
is_scheduled: str = ''
is_default: bool = False
num_test_shards: int = 1
only_run_smoke_tests_on_pull_request: bool = False
num_test_shards_on_pull_request: int = -1
@@ -170,6 +168,9 @@
fx2trt_test: bool = False
timeout_after: int = 240
xcode_version: str = ''
only_on_pr: bool = False
ios_arch: str = ''
ios_platform: str = ''
# The following variables will be set as environment variables,
# so it's easier for both shell and Python scripts to consume it if false is represented as the empty string.
@@ -196,6 +197,12 @@ class CIWorkflow:
if self.fx2trt_test:
self.enable_fx2trt_test = 1
self.multigpu_runner_type = LINUX_MULTIGPU_RUNNERS.get(self.test_runner_type, "linux.16xlarge.nvidia.gpu")
self.distributed_gpu_runner_type = LINUX_DISTRIBUTED_GPU_RUNNERS.get(self.test_runner_type, "linux.8xlarge.nvidia.gpu")
if LABEL_CIFLOW_DEFAULT in self.ciflow_config.labels:
self.is_default = True
# If num_test_shards_on_pull_request is not user-defined, default to num_test_shards unless we are
# only running smoke tests on the pull request.
if self.num_test_shards_on_pull_request == -1:
@@ -213,8 +220,8 @@ class CIWorkflow:
if self.arch == 'windows':
assert self.test_runner_type in WINDOWS_RUNNERS, err_message
assert LABEL_CIFLOW_ALL in self.ciflow_config.labels
assert LABEL_CIFLOW_ALL in self.ciflow_config.label_conditions
if not self.ciflow_config.isolated_workflow:
assert LABEL_CIFLOW_ALL in self.ciflow_config.labels
if self.arch == 'linux':
assert LABEL_CIFLOW_LINUX in self.ciflow_config.labels
if self.arch == 'windows':
@@ -226,6 +233,8 @@ class CIWorkflow:
assert self.test_runner_type != ''
if self.test_runner_type in CUDA_RUNNERS:
assert LABEL_CIFLOW_CUDA in self.ciflow_config.labels
if self.test_runner_type in ROCM_RUNNERS:
assert LABEL_CIFLOW_ROCM in self.ciflow_config.labels
if self.test_runner_type in CPU_RUNNERS and not self.exclude_test:
assert LABEL_CIFLOW_CPU in self.ciflow_config.labels
if self.is_scheduled:
@@ -275,6 +284,40 @@ class DockerWorkflow:
output_file.write("\n")
print(output_file_path)
@dataclass
class BinaryBuildWorkflow:
os: str
build_configs: List[Dict[str, str]]
package_type: str
# Optional fields
build_environment: str = ''
abi_version: str = ''
ciflow_config: CIFlowConfig = field(default_factory=CIFlowConfig)
is_scheduled: str = ''
def __post_init__(self) -> None:
if self.abi_version:
self.build_environment = f"{self.os}-binary-{self.package_type}-{self.abi_version}"
else:
self.build_environment = f"{self.os}-binary-{self.package_type}"
def generate_workflow_file(self, workflow_template: jinja2.Template) -> None:
output_file_path = GITHUB_DIR / f"workflows/generated-{self.build_environment}.yml"
with open(output_file_path, "w") as output_file:
GENERATED = "generated"  # Note: please keep the GENERATED variable, otherwise Phabricator will hide the whole file
output_file.writelines([f"# @{GENERATED} DO NOT EDIT MANUALLY\n"])
try:
content = workflow_template.render(asdict(self))
except Exception as e:
print(f"Failed on template: {workflow_template}", file=sys.stderr)
raise e
output_file.write(content)
if content[-1] != "\n":
output_file.write("\n")
print(output_file_path)
WINDOWS_WORKFLOWS = [
CIWorkflow(
arch="windows",
@@ -512,6 +555,37 @@ LINUX_WORKFLOWS = [
labels=set([LABEL_CIFLOW_DEFAULT, LABEL_CIFLOW_LINUX, LABEL_CIFLOW_CUDA]),
),
),
# no-ops builds test USE_PER_OPERATOR_HEADERS=0 where ATen/ops is not generated
CIWorkflow(
arch="linux",
build_environment="linux-xenial-cuda11.3-py3.7-gcc7-no-ops",
docker_image_base=f"{DOCKER_REGISTRY}/pytorch/pytorch-linux-xenial-cuda11.3-cudnn8-py3-gcc7",
test_runner_type=LINUX_CUDA_TEST_RUNNER,
exclude_test=True,
ciflow_config=CIFlowConfig(
labels=set([LABEL_CIFLOW_LINUX, LABEL_CIFLOW_CUDA]),
),
),
CIWorkflow(
arch="linux",
build_environment="linux-xenial-py3.7-gcc7-no-ops",
docker_image_base=f"{DOCKER_REGISTRY}/pytorch/pytorch-linux-xenial-py3.7-gcc7",
test_runner_type=LINUX_CPU_TEST_RUNNER,
exclude_test=True,
ciflow_config=CIFlowConfig(
labels=set([LABEL_CIFLOW_DEFAULT, LABEL_CIFLOW_LINUX, LABEL_CIFLOW_CPU]),
),
),
CIWorkflow(
arch="linux",
build_environment="linux-bionic-rocm4.5-py3.7",
docker_image_base=f"{DOCKER_REGISTRY}/pytorch/pytorch-linux-bionic-rocm4.5-py3.7",
test_runner_type=LINUX_ROCM_TEST_RUNNER,
num_test_shards=2,
ciflow_config=CIFlowConfig(
labels=set([LABEL_CIFLOW_DEFAULT, LABEL_CIFLOW_LINUX, LABEL_CIFLOW_ROCM]),
),
),
CIWorkflow(
arch="linux",
build_environment="libtorch-linux-xenial-cuda11.3-py3.7-gcc7",
@@ -586,6 +660,22 @@ LINUX_WORKFLOWS = [
),
]
XLA_WORKFLOWS = [
CIWorkflow(
arch="linux",
build_environment="pytorch-xla-linux-bionic-py3.7-clang8",
docker_image_base=f"{DOCKER_REGISTRY}/pytorch/xla_base",
test_runner_type=LINUX_CPU_TEST_RUNNER,
num_test_shards=2,
distributed_test=False,
enable_xla_test=1,
ciflow_config=CIFlowConfig(
labels={LABEL_CIFLOW_LINUX, LABEL_CIFLOW_CPU, LABEL_CIFLOW_XLA},
),
),
]
ANDROID_SHORT_WORKFLOWS = [
CIWorkflow(
arch="linux",
@@ -638,6 +728,8 @@ IOS_WORKFLOWS = [
CIWorkflow(
arch="macos",
build_environment="ios-12-5-1-arm64",
ios_arch="arm64",
ios_platform="OS",
test_runner_type=MACOS_TEST_RUNNER_10_15,
exclude_test=True,
ciflow_config=CIFlowConfig(
@@ -647,6 +739,8 @@ IOS_WORKFLOWS = [
CIWorkflow(
arch="macos",
build_environment="ios-12-5-1-arm64-coreml",
ios_arch="arm64",
ios_platform="OS",
test_runner_type=MACOS_TEST_RUNNER_10_15,
exclude_test=True,
ciflow_config=CIFlowConfig(
@@ -656,6 +750,8 @@ IOS_WORKFLOWS = [
CIWorkflow(
arch="macos",
build_environment="ios-12-5-1-arm64-full-jit",
ios_arch="arm64",
ios_platform="OS",
test_runner_type=MACOS_TEST_RUNNER_10_15,
exclude_test=True,
ciflow_config=CIFlowConfig(
@@ -665,6 +761,8 @@ IOS_WORKFLOWS = [
CIWorkflow(
arch="macos",
build_environment="ios-12-5-1-arm64-custom-ops",
ios_arch="arm64",
ios_platform="OS",
test_runner_type=MACOS_TEST_RUNNER_10_15,
exclude_test=True,
ciflow_config=CIFlowConfig(
@@ -674,6 +772,8 @@ IOS_WORKFLOWS = [
CIWorkflow(
arch="macos",
build_environment="ios-12-5-1-arm64-metal",
ios_arch="arm64",
ios_platform="OS",
test_runner_type=MACOS_TEST_RUNNER_10_15,
exclude_test=True,
ciflow_config=CIFlowConfig(
@@ -683,6 +783,8 @@ IOS_WORKFLOWS = [
CIWorkflow(
arch="macos",
build_environment="ios-12-5-1-x86-64",
ios_arch="x86_64",
ios_platform="SIMULATOR",
test_runner_type=MACOS_TEST_RUNNER_10_15,
exclude_test=True,
ciflow_config=CIFlowConfig(
@@ -692,6 +794,8 @@ IOS_WORKFLOWS = [
CIWorkflow(
arch="macos",
build_environment="ios-12-5-1-x86-64-coreml",
ios_arch="x86_64",
ios_platform="SIMULATOR",
test_runner_type=MACOS_TEST_RUNNER_10_15,
exclude_test=True,
ciflow_config=CIFlowConfig(
@@ -701,6 +805,8 @@ IOS_WORKFLOWS = [
CIWorkflow(
arch="macos",
build_environment="ios-12-5-1-x86-64-full-jit",
ios_arch="x86_64",
ios_platform="SIMULATOR",
test_runner_type=MACOS_TEST_RUNNER_10_15,
exclude_test=True,
ciflow_config=CIFlowConfig(
@@ -745,10 +851,8 @@ MACOS_WORKFLOWS = [
]
DOCKER_IMAGES = {
f"{DOCKER_REGISTRY}/pytorch/pytorch-linux-bionic-cuda10.2-cudnn7-py3.7-clang9", # for pytorch/xla
f"{DOCKER_REGISTRY}/pytorch/pytorch-linux-bionic-rocm4.1-py3.7", # for rocm
f"{DOCKER_REGISTRY}/pytorch/pytorch-linux-bionic-rocm4.2-py3.7", # for rocm
f"{DOCKER_REGISTRY}/pytorch/pytorch-linux-bionic-rocm4.3.1-py3.7", # for rocm
f"{DOCKER_REGISTRY}/pytorch/pytorch-linux-bionic-rocm4.5-py3.7", # for rocm
}
DOCKER_IMAGES.update({
@@ -761,8 +865,104 @@ DOCKER_WORKFLOWS = [
DockerWorkflow(
build_environment="docker-builds",
docker_images=sorted(DOCKER_IMAGES),
# Run weekly to ensure they can build
is_scheduled="1 * */7 * *",
# Run every Wednesday at 3:01am to ensure they can build
is_scheduled="1 3 * * 3",
),
]
class OperatingSystem:
LINUX = "linux"
WINDOWS = "windows"
LINUX_BINARY_BUILD_WORFKLOWS = [
BinaryBuildWorkflow(
os=OperatingSystem.LINUX,
package_type="manywheel",
build_configs=generate_binary_build_matrix.generate_wheels_matrix(OperatingSystem.LINUX),
ciflow_config=CIFlowConfig(
labels={LABEL_CIFLOW_DEFAULT, LABEL_CIFLOW_BINARIES, LABEL_CIFLOW_BINARIES_WHEEL},
isolated_workflow=True,
),
),
BinaryBuildWorkflow(
os=OperatingSystem.LINUX,
package_type="conda",
build_configs=generate_binary_build_matrix.generate_conda_matrix(OperatingSystem.LINUX),
ciflow_config=CIFlowConfig(
labels={LABEL_CIFLOW_DEFAULT, LABEL_CIFLOW_BINARIES, LABEL_CIFLOW_BINARIES_CONDA},
isolated_workflow=True,
),
),
BinaryBuildWorkflow(
os=OperatingSystem.LINUX,
package_type="libtorch",
abi_version=generate_binary_build_matrix.CXX11_ABI,
build_configs=generate_binary_build_matrix.generate_libtorch_matrix(
OperatingSystem.LINUX, generate_binary_build_matrix.CXX11_ABI
),
ciflow_config=CIFlowConfig(
labels={LABEL_CIFLOW_DEFAULT, LABEL_CIFLOW_BINARIES, LABEL_CIFLOW_BINARIES_LIBTORCH},
isolated_workflow=True,
),
),
BinaryBuildWorkflow(
os=OperatingSystem.LINUX,
package_type="libtorch",
abi_version=generate_binary_build_matrix.PRE_CXX11_ABI,
build_configs=generate_binary_build_matrix.generate_libtorch_matrix(
OperatingSystem.LINUX, generate_binary_build_matrix.PRE_CXX11_ABI
),
ciflow_config=CIFlowConfig(
labels={LABEL_CIFLOW_DEFAULT, LABEL_CIFLOW_BINARIES, LABEL_CIFLOW_BINARIES_LIBTORCH},
isolated_workflow=True,
),
),
]
WINDOWS_BINARY_BUILD_WORKFLOWS = [
BinaryBuildWorkflow(
os=OperatingSystem.WINDOWS,
package_type="wheel",
build_configs=generate_binary_build_matrix.generate_wheels_matrix(OperatingSystem.WINDOWS),
ciflow_config=CIFlowConfig(
labels={LABEL_CIFLOW_DEFAULT, LABEL_CIFLOW_BINARIES, LABEL_CIFLOW_BINARIES_WHEEL},
isolated_workflow=True,
),
),
# NOTE: conda binaries are currently bugged on the installation step
# See, https://github.com/pytorch/pytorch/pull/71484#issuecomment-1022617195
# BinaryBuildWorkflow(
# os=OperatingSystem.WINDOWS,
# package_type="conda",
# build_configs=generate_binary_build_matrix.generate_conda_matrix(OperatingSystem.WINDOWS),
# ciflow_config=CIFlowConfig(
# labels={LABEL_CIFLOW_DEFAULT, LABEL_CIFLOW_BINARIES, LABEL_CIFLOW_BINARIES_CONDA},
# isolated_workflow=True,
# ),
# ),
BinaryBuildWorkflow(
os=OperatingSystem.WINDOWS,
package_type="libtorch",
abi_version=generate_binary_build_matrix.RELEASE,
build_configs=generate_binary_build_matrix.generate_libtorch_matrix(
OperatingSystem.WINDOWS, generate_binary_build_matrix.RELEASE
),
ciflow_config=CIFlowConfig(
labels={LABEL_CIFLOW_DEFAULT, LABEL_CIFLOW_BINARIES, LABEL_CIFLOW_BINARIES_LIBTORCH},
isolated_workflow=True,
),
),
BinaryBuildWorkflow(
os=OperatingSystem.WINDOWS,
package_type="libtorch",
abi_version=generate_binary_build_matrix.DEBUG,
build_configs=generate_binary_build_matrix.generate_libtorch_matrix(
OperatingSystem.WINDOWS, generate_binary_build_matrix.DEBUG
),
ciflow_config=CIFlowConfig(
labels={LABEL_CIFLOW_DEFAULT, LABEL_CIFLOW_BINARIES, LABEL_CIFLOW_BINARIES_LIBTORCH},
isolated_workflow=True,
),
),
]
@@ -774,6 +974,7 @@ def main() -> None:
)
template_and_workflows = [
(jinja_env.get_template("linux_ci_workflow.yml.j2"), LINUX_WORKFLOWS),
(jinja_env.get_template("linux_ci_workflow.yml.j2"), XLA_WORKFLOWS),
(jinja_env.get_template("windows_ci_workflow.yml.j2"), WINDOWS_WORKFLOWS),
(jinja_env.get_template("bazel_ci_workflow.yml.j2"), BAZEL_WORKFLOWS),
(jinja_env.get_template("ios_ci_workflow.yml.j2"), IOS_WORKFLOWS),
@@ -781,6 +982,8 @@ def main() -> None:
(jinja_env.get_template("docker_builds_ci_workflow.yml.j2"), DOCKER_WORKFLOWS),
(jinja_env.get_template("android_ci_full_workflow.yml.j2"), ANDROID_WORKFLOWS),
(jinja_env.get_template("android_ci_workflow.yml.j2"), ANDROID_SHORT_WORKFLOWS),
(jinja_env.get_template("linux_binary_build_workflow.yml.j2"), LINUX_BINARY_BUILD_WORFKLOWS),
(jinja_env.get_template("windows_binary_build_workflow.yml.j2"), WINDOWS_BINARY_BUILD_WORKFLOWS),
]
# Delete the existing generated files first, this should align with .gitattributes file description.
existing_workflows = GITHUB_DIR.glob("workflows/generated-*")


@@ -61,6 +61,7 @@ def run_as_if_on_trunk() -> bool:
return current_workflow_triggered_by_label
def main() -> None:
INCLUDE_DEFAULT_TEST = True
TEST_RUNNER_TYPE = os.getenv('TEST_RUNNER_TYPE')
assert TEST_RUNNER_TYPE is not None
RUN_SMOKE_TESTS_ONLY_ON_PR = os.getenv('RUN_SMOKE_TESTS_ONLY_ON_PR')
@@ -99,6 +100,7 @@ def main() -> None:
configs['backwards_compat'] = {'num_shards': 1, 'runner': TEST_RUNNER_TYPE}
if os.getenv('ENABLE_XLA_TEST'):
configs['xla'] = {'num_shards': 1, 'runner': TEST_RUNNER_TYPE}
INCLUDE_DEFAULT_TEST = False
if os.getenv('ENABLE_NOARCH_TEST'):
configs['noarch'] = {'num_shards': 1, 'runner': TEST_RUNNER_TYPE}
if RUN_SMOKE_TESTS:
@@ -112,6 +114,7 @@ def main() -> None:
'runner': TEST_RUNNER_TYPE,
}
for shard in range(1, NUM_TEST_SHARDS + 1)
if INCLUDE_DEFAULT_TEST
] + [
{
'config': name,

.github/scripts/gitutils.py vendored Normal file

@@ -0,0 +1,290 @@
#!/usr/bin/env python3
from collections import defaultdict
from datetime import datetime
from typing import cast, Any, Dict, Iterator, List, Optional, Tuple, Union
import os
import re
RE_GITHUB_URL_MATCH = re.compile("^https://.*@?github.com/(.+)/(.+)$")
def get_git_remote_name() -> str:
return os.getenv("GIT_REMOTE_NAME", "origin")
def get_git_repo_dir() -> str:
from pathlib import Path
return os.getenv("GIT_REPO_DIR", str(Path(__file__).resolve().parent.parent.parent))
def fuzzy_list_to_dict(items: List[Tuple[str, str]]) -> Dict[str, List[str]]:
"""
Converts list to dict preserving elements with duplicate keys
"""
rc: Dict[str, List[str]] = defaultdict(lambda: [])
for (key, val) in items:
rc[key].append(val)
return dict(rc)
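For illustration, the helper above groups values under duplicate keys instead of overwriting them; a minimal standalone sketch of the same behavior (the patch-ids and SHAs below are made up):

```python
from collections import defaultdict
from typing import Dict, List, Tuple

def fuzzy_list_to_dict(items: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    # Same behavior as the helper above: duplicate keys accumulate their values.
    rc: Dict[str, List[str]] = defaultdict(list)
    for key, val in items:
        rc[key].append(val)
    return dict(rc)

# Two commits sharing a patch-id end up under one key.
pairs = [("patch1", "sha_a"), ("patch2", "sha_b"), ("patch1", "sha_c")]
print(fuzzy_list_to_dict(pairs))  # {'patch1': ['sha_a', 'sha_c'], 'patch2': ['sha_b']}
```

This is what lets `compute_branch_diffs` detect a patch that appears more than once on a branch (e.g. a commit plus its revert).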
def _check_output(items: List[str], encoding: str = "utf-8") -> str:
from subprocess import check_output, CalledProcessError
try:
return check_output(items).decode(encoding)
except CalledProcessError as e:
msg = f"Command `{' '.join(e.cmd)}` returned non-zero exit code {e.returncode}"
stdout = e.stdout.decode(encoding) if e.stdout is not None else ""
stderr = e.stderr.decode(encoding) if e.stderr is not None else ""
if len(stderr) == 0:
msg += f"\n{stdout}"
else:
msg += f"\nstdout:\n{stdout}\nstderr:\n{stderr}"
raise RuntimeError(msg) from e
class GitCommit:
commit_hash: str
title: str
body: str
author: str
author_date: datetime
commit_date: Optional[datetime]
def __init__(self,
commit_hash: str,
author: str,
author_date: datetime,
title: str,
body: str,
commit_date: Optional[datetime] = None) -> None:
self.commit_hash = commit_hash
self.author = author
self.author_date = author_date
self.commit_date = commit_date
self.title = title
self.body = body
def __repr__(self) -> str:
return f"{self.title} ({self.commit_hash})"
def __contains__(self, item: Any) -> bool:
return item in self.body or item in self.title
def parse_fuller_format(lines: Union[str, List[str]]) -> GitCommit:
"""
Expect commit message generated using `--format=fuller --date=unix` format, i.e.:
commit <sha1>
Author: <author>
AuthorDate: <author date>
Commit: <committer>
CommitDate: <committer date>
<title line>
<full commit message>
"""
if isinstance(lines, str):
lines = lines.split("\n")
# TODO: Handle merge commits correctly
if len(lines) > 1 and lines[1].startswith("Merge:"):
del lines[1]
assert len(lines) > 7
assert lines[0].startswith("commit")
assert lines[1].startswith("Author: ")
assert lines[2].startswith("AuthorDate: ")
assert lines[3].startswith("Commit: ")
assert lines[4].startswith("CommitDate: ")
assert len(lines[5]) == 0
return GitCommit(commit_hash=lines[0].split()[1].strip(),
author=lines[1].split(":", 1)[1].strip(),
author_date=datetime.fromtimestamp(int(lines[2].split(":", 1)[1].strip())),
commit_date=datetime.fromtimestamp(int(lines[4].split(":", 1)[1].strip())),
title=lines[6].strip(),
body="\n".join(lines[7:]),
)
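A standalone sketch of the fixed-offset parsing used above, run on a synthetic `git show --format=fuller --date=unix` message (the SHA, name, and timestamps below are made up):

```python
from datetime import datetime

sample = """commit deadbeef0123
Author:     Jane Doe <jane@example.com>
AuthorDate: 1646000000
Commit:     Jane Doe <jane@example.com>
CommitDate: 1646000100

    Fix the frobnicator

    Longer description goes here."""

lines = sample.split("\n")
# Same fixed offsets as parse_fuller_format: header occupies lines 0-4,
# line 5 is blank, line 6 is the title, and the rest is the body.
commit_hash = lines[0].split()[1].strip()
author = lines[1].split(":", 1)[1].strip()
author_date = datetime.fromtimestamp(int(lines[2].split(":", 1)[1].strip()))
title = lines[6].strip()
body = "\n".join(lines[7:])

print(commit_hash, title)  # deadbeef0123 Fix the frobnicator
```

The fixed offsets are why the function asserts on each header line first: any deviation (such as a `Merge:` line) would silently shift the title and body.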
class GitRepo:
def __init__(self, path: str, remote: str = "origin", debug: bool = False) -> None:
self.repo_dir = path
self.remote = remote
self.debug = debug
def _run_git(self, *args: Any) -> str:
if self.debug:
print(f"+ git -C {self.repo_dir} {' '.join(args)}")
return _check_output(["git", "-C", self.repo_dir] + list(args))
def revlist(self, revision_range: str) -> List[str]:
rc = self._run_git("rev-list", revision_range, "--", ".").strip()
return rc.split("\n") if len(rc) > 0 else []
def current_branch(self) -> str:
return self._run_git("symbolic-ref", "--short", "HEAD").strip()
def checkout(self, branch: str) -> None:
self._run_git('checkout', branch)
def show_ref(self, name: str) -> str:
refs = self._run_git('show-ref', '-s', name).strip().split('\n')
if not all(refs[i] == refs[0] for i in range(1, len(refs))):
raise RuntimeError(f"reference {name} is ambiguous")
return refs[0]
def rev_parse(self, name: str) -> str:
return self._run_git('rev-parse', '--verify', name).strip()
def get_merge_base(self, from_ref: str, to_ref: str) -> str:
return self._run_git('merge-base', from_ref, to_ref).strip()
def patch_id(self, ref: Union[str, List[str]]) -> List[Tuple[str, str]]:
is_list = isinstance(ref, list)
if is_list:
if len(ref) == 0:
return []
ref = " ".join(ref)
rc = _check_output(['sh', '-c', f'git -C {self.repo_dir} show {ref}|git patch-id --stable']).strip()
return [cast(Tuple[str, str], x.split(" ", 1)) for x in rc.split("\n")]
def commits_resolving_gh_pr(self, pr_num: int) -> List[str]:
owner, name = self.gh_owner_and_name()
msg = f"Pull Request resolved: https://github.com/{owner}/{name}/pull/{pr_num}"
rc = self._run_git('log', '--format=%H', '--grep', msg).strip()
return rc.split("\n") if len(rc) > 0 else []
def get_commit(self, ref: str) -> GitCommit:
return parse_fuller_format(self._run_git('show', '--format=fuller', '--date=unix', '--shortstat', ref))
def cherry_pick(self, ref: str) -> None:
self._run_git('cherry-pick', '-x', ref)
def revert(self, ref: str) -> None:
self._run_git("revert", "--no-edit", ref)
def compute_branch_diffs(self, from_branch: str, to_branch: str) -> Tuple[List[str], List[str]]:
"""
Returns lists of commits that are missing from each branch since their merge base
Might be slow if the merge base of the two branches is far in the past
"""
from_ref = self.rev_parse(from_branch)
to_ref = self.rev_parse(to_branch)
merge_base = self.get_merge_base(from_ref, to_ref)
from_commits = self.revlist(f'{merge_base}..{from_ref}')
to_commits = self.revlist(f'{merge_base}..{to_ref}')
from_ids = fuzzy_list_to_dict(self.patch_id(from_commits))
to_ids = fuzzy_list_to_dict(self.patch_id(to_commits))
for patch_id in set(from_ids).intersection(set(to_ids)):
from_values = from_ids[patch_id]
to_values = to_ids[patch_id]
if len(from_values) != len(to_values):
# Eliminate duplicate commits+reverts from the list
while len(from_values) > 0 and len(to_values) > 0:
frc = self.get_commit(from_values.pop())
toc = self.get_commit(to_values.pop())
if frc.title != toc.title or frc.author_date != toc.author_date:
raise RuntimeError(f"Unexpected differences between {frc} and {toc}")
from_commits.remove(frc.commit_hash)
to_commits.remove(toc.commit_hash)
continue
for commit in from_values:
from_commits.remove(commit)
for commit in to_values:
to_commits.remove(commit)
return (from_commits, to_commits)
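The patch-id bookkeeping above can be illustrated in isolation: any patch-id seen on both branches has already been cherry-picked, so its commits are dropped from both lists, leaving only the genuinely missing ones. The patch-ids and SHAs below are made up:

```python
# patch-id -> [commit sha] per branch, as produced by fuzzy_list_to_dict(patch_id(...))
from_ids = {"pid1": ["sha_a"], "pid2": ["sha_b"]}
to_ids = {"pid1": ["sha_x"]}

from_commits = ["sha_a", "sha_b"]
to_commits = ["sha_x"]

# Shared patch-ids mean the same change exists on both sides; drop it from both lists.
for patch_id in set(from_ids) & set(to_ids):
    for commit in from_ids[patch_id]:
        from_commits.remove(commit)
    for commit in to_ids[patch_id]:
        to_commits.remove(commit)

print(from_commits, to_commits)  # ['sha_b'] []
```

This sketch covers only the simple case where each patch-id occurs once per branch; the length-mismatch branch in `compute_branch_diffs` handles the commit-plus-revert case.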
def cherry_pick_commits(self, from_branch: str, to_branch: str) -> None:
orig_branch = self.current_branch()
self.checkout(to_branch)
from_commits, to_commits = self.compute_branch_diffs(from_branch, to_branch)
if len(from_commits) == 0:
print("Nothing to do")
self.checkout(orig_branch)
return
for commit in reversed(from_commits):
print(f"Cherry picking commit {commit}")
self.cherry_pick(commit)
self.checkout(orig_branch)
def push(self, branch: str, dry_run: bool) -> None:
if dry_run:
self._run_git("push", "--dry-run", self.remote, branch)
else:
self._run_git("push", self.remote, branch)
def head_hash(self) -> str:
return self._run_git("show-ref", "--hash", "HEAD").strip()
def remote_url(self) -> str:
return self._run_git("remote", "get-url", self.remote)
def gh_owner_and_name(self) -> Tuple[str, str]:
url = os.getenv("GIT_REMOTE_URL", None)
if url is None:
url = self.remote_url()
rc = RE_GITHUB_URL_MATCH.match(url)
if rc is None:
raise RuntimeError(f"Unexpected url format {url}")
return cast(Tuple[str, str], rc.groups())
def commit_message(self, ref: str) -> str:
return self._run_git("log", "-1", "--format=%B", ref)
def amend_commit_message(self, msg: str) -> None:
self._run_git("commit", "--amend", "-m", msg)
class PeekableIterator(Iterator[str]):
def __init__(self, val: str) -> None:
self._val = val
self._idx = -1
def peek(self) -> Optional[str]:
if self._idx + 1 >= len(self._val):
return None
return self._val[self._idx + 1]
def __iter__(self) -> "PeekableIterator":
return self
def __next__(self) -> str:
rc = self.peek()
if rc is None:
raise StopIteration
self._idx += 1
return rc
def patterns_to_regex(allowed_patterns: List[str]) -> Any:
"""
pattern is glob-like, i.e. the only special sequences it has are:
- ? - matches single character
- * - matches any non-folder separator characters
- ** - matches any characters
Assuming that patterns are free of braces and backslashes
the only characters that need to be escaped are dot and plus
"""
rc = "("
for idx, pattern in enumerate(allowed_patterns):
if idx > 0:
rc += "|"
pattern_ = PeekableIterator(pattern)
assert not any(c in pattern for c in "{}()[]\\")
for c in pattern_:
if c == ".":
rc += "\\."
elif c == "+":
rc += "\\+"
elif c == "*":
if pattern_.peek() == "*":
next(pattern_)
rc += ".+"
else:
rc += "[^/]+"
else:
rc += c
rc += ")"
return re.compile(rc)
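A compact standalone sketch of the translation documented above, with example matches (`glob_to_regex` is a hypothetical name; the rules are the same: `.` and `+` are escaped, `*` stays within a path segment, `**` crosses segments):

```python
import re

def glob_to_regex(patterns):
    # Compact sketch of the translation above, joined into one alternation.
    parts = []
    for pattern in patterns:
        out, i = "", 0
        while i < len(pattern):
            c = pattern[i]
            if c == ".":
                out += r"\."
            elif c == "+":
                out += r"\+"
            elif c == "*":
                if i + 1 < len(pattern) and pattern[i + 1] == "*":
                    out += ".+"   # '**' matches across folder separators
                    i += 1
                else:
                    out += "[^/]+"  # '*' stops at a folder separator
            else:
                out += c
            i += 1
        parts.append(out)
    return re.compile("(" + "|".join(parts) + ")")

rx = glob_to_regex(["**/test/run_test.py", "docs/*.md"])
print(bool(rx.fullmatch("repo/sub/test/run_test.py")))  # True
print(bool(rx.fullmatch("docs/a/b.md")))                # False: '*' stops at '/'
```

Note that `?` appears in the docstring but is not translated by the loop, so patterns are expected to avoid it.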


@@ -22,7 +22,7 @@ from typing import List, Any
# Team/owner labels usually start with "module: " or "oncall: ", but the following are acceptable exceptions
ACCEPTABLE_OWNER_LABELS = ["NNC", "high priority"]
GLOB_EXCEPTIONS = [
"test/run_test.py"
"**/test/run_test.py"
]
PYTORCH_ROOT = Path(__file__).resolve().parent.parent.parent
@@ -35,7 +35,7 @@ S3_RESOURCE_READ_ONLY = boto3.resource("s3", config=botocore.config.Config(signa
def get_all_test_files() -> List[Path]:
test_files = list(TEST_DIR.glob("**/test_*.py"))
test_files.extend(list(TEST_DIR.glob("**/*_test.py")))
return [f for f in test_files if any([fnmatch.fnmatch(str(f), g) for g in GLOB_EXCEPTIONS])]
return [f for f in test_files if not any([fnmatch.fnmatch(str(f), g) for g in GLOB_EXCEPTIONS])]
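The switch from `test/run_test.py` to `**/test/run_test.py` matters because `fnmatch` anchors the whole pattern against the full path: the bare relative pattern never matches an absolute path, while `*` (and hence `**`) in `fnmatch` crosses `/`. A quick check with a made-up path:

```python
import fnmatch

path = "/home/user/pytorch/test/run_test.py"

# The old pattern is anchored, so it matches only the exact relative path.
print(fnmatch.fnmatch(path, "test/run_test.py"))     # False
# With a leading '**/' the absolute prefix is absorbed and the file is excluded as intended.
print(fnmatch.fnmatch(path, "**/test/run_test.py"))  # True
```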
def get_pytorch_labels() -> Any:

.github/scripts/process_commit.py vendored Normal file

@@ -0,0 +1,106 @@
#!/usr/bin/env python3
"""
This script finds the user/pr creator responsible for labeling a PR by a commit SHA. It is used by the workflow in
'.github/workflows/pr-labels.yml'. If there exists no PR associated with the commit or the PR is properly labeled,
this script is a no-op.
Note: we ping the user only, not the reviewers, as the reviewers can sometimes be external to pytorch
with no labeling responsibility, so we don't want to bother them.
This script is based on: https://github.com/pytorch/vision/blob/main/.github/process_commit.py
"""
import sys
from typing import Any, Set, Tuple, List
import re
import os
import json
import requests
# For a PR to be properly labeled it should have release notes label and one topic label
PULL_REQUEST_EXP = "Pull Request resolved:.*pull/(.*)"
PRIMARY_LABEL_FILTER = "release notes:"
SECONDARY_LABELS = {
"topic: bc_breaking",
"topic: deprecation",
"topic: new feature",
"topic: improvements",
"topic: bug fixes",
"topic: performance",
"topic: documentation",
"topic: developer feature",
"topic: not user facing",
}
# This secondary does not require a primary
ALLOWED_ONLY_SECONDARY = {"topic: not user facing"}
PYTORCH_REPO = "https://api.github.com/repos/pytorch/pytorch"
GITHUB_TOKEN = os.environ.get('GITHUB_TOKEN')
REQUEST_HEADERS = {'Accept': 'application/vnd.github.v3+json', 'Authorization': f'token {GITHUB_TOKEN}'}
def query_pytorch(cmd: str) -> Any:
response = requests.get(f"{PYTORCH_REPO}/{cmd}", headers=REQUEST_HEADERS)
return response.json()
def get_pr_number(commit_hash: str) -> Any:
data = query_pytorch(f"commits/{commit_hash}")
if not data or (not data["commit"]["message"]):
return None
message = data["commit"]["message"]
p = re.compile(PULL_REQUEST_EXP)
result = p.search(message)
if not result:
return None
return result.group(1)
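The `PULL_REQUEST_EXP` extraction above can be exercised standalone; the sample message below mirrors the `Pull Request resolved:` trailer format (here using PR #73422 from this changeset):

```python
import re

PULL_REQUEST_EXP = "Pull Request resolved:.*pull/(.*)"

message = (
    "Also install c10d headers with .h extension (#73422)\n\n"
    "Pull Request resolved: https://github.com/pytorch/pytorch/pull/73422\n"
)
result = re.compile(PULL_REQUEST_EXP).search(message)
print(result.group(1) if result else None)  # 73422
```

Because `.` does not match newlines by default, the pattern matches within the trailer line only, and the greedy `.*` before `pull/` ensures the capture starts after the last `pull/` on that line.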
def get_pr_author_and_labels(pr_number: int) -> Tuple[str, Set[str]]:
# See https://docs.github.com/en/rest/reference/pulls#get-a-pull-request
data = query_pytorch(f"pulls/{pr_number}")
user = data["user"]["login"]
labels = {label["name"] for label in data["labels"]}
return user, labels
def get_repo_labels() -> List[str]:
collected_labels: List[str] = list()
for page in range(0, 10):
response = query_pytorch(f"labels?per_page=100&page={page}")
page_labels = list(map(lambda x: str(x["name"]), response))
if not page_labels:
break
collected_labels += page_labels
return collected_labels
def post_pytorch_comment(pr_number: int, merger: str) -> Any:
message = {'body' : f"Hey {merger}." + """
You've committed this PR, but it does not have both a 'release notes: ...' and 'topics: ...' label. \
Please add one of each to the PR. The 'release notes: ...' label should represent the part of \
PyTorch that this PR changes (fx, autograd, distributed, etc) and the 'topics: ...' label should \
represent the kind of PR it is (not user facing, new feature, bug fix, perf improvement, etc). \
The list of valid labels can be found [here](https://github.com/pytorch/pytorch/labels?q=release+notes) \
for the 'release notes: ...' and [here](https://github.com/pytorch/pytorch/labels?q=topic) for the \
'topics: ...'.
For changes that are 'topic: not user facing' there is no need for a release notes label."""}
response = requests.post(
f"{PYTORCH_REPO}/issues/{pr_number}/comments",
json.dumps(message),
headers=REQUEST_HEADERS)
return response.json()
if __name__ == "__main__":
commit_hash = sys.argv[1]
pr_number = get_pr_number(commit_hash)
if not pr_number:
sys.exit(0)
user, labels = get_pr_author_and_labels(pr_number)
repo_labels = get_repo_labels()
primary_labels = set(filter(lambda x: x.startswith(PRIMARY_LABEL_FILTER), repo_labels))
has_both_labels = bool(primary_labels.intersection(labels) and SECONDARY_LABELS.intersection(labels))
is_properly_labeled = has_both_labels or bool(ALLOWED_ONLY_SECONDARY.intersection(labels))
if not is_properly_labeled:
post_pytorch_comment(pr_number, user)


@@ -53,6 +53,7 @@ def find_current_branch(repo_path: str) -> str:
repo = git.Repo(repo_path)
name: str = repo.active_branch.name
return name
def deploy_torchbench_config(output_dir: str, config: str) -> None:
# Create test dir if needed
pathlib.Path(output_dir).mkdir(exist_ok=True)

.github/scripts/syncbranches.py vendored Executable file

@@ -0,0 +1,25 @@
#!/usr/bin/env python3
from gitutils import get_git_repo_dir, GitRepo
from typing import Any
def parse_args() -> Any:
from argparse import ArgumentParser
parser = ArgumentParser("Merge PR/branch into default branch")
parser.add_argument("--sync-branch", default="sync")
parser.add_argument("--default-branch", type=str, default="main")
parser.add_argument("--dry-run", action="store_true")
parser.add_argument("--debug", action="store_true")
return parser.parse_args()
def main() -> None:
args = parse_args()
repo = GitRepo(get_git_repo_dir(), debug=args.debug)
repo.cherry_pick_commits(args.sync_branch, args.default_branch)
repo.push(args.default_branch, args.dry_run)
if __name__ == '__main__':
main()

.github/scripts/test_gitutils.py vendored Normal file

@@ -0,0 +1,27 @@
#!/usr/bin/env python3
from gitutils import PeekableIterator
from unittest import TestCase, main
class TestPeekableIterator(TestCase):
def test_iterator(self, input_: str = "abcdef") -> None:
iter_ = PeekableIterator(input_)
for idx, c in enumerate(iter_):
self.assertEqual(c, input_[idx])
def test_is_iterable(self) -> None:
from collections.abc import Iterator
iter_ = PeekableIterator("")
self.assertTrue(isinstance(iter_, Iterator))
def test_peek(self, input_: str = "abcdef") -> None:
iter_ = PeekableIterator(input_)
for idx, c in enumerate(iter_):
if idx + 1 < len(input_):
self.assertEqual(iter_.peek(), input_[idx + 1])
else:
self.assertTrue(iter_.peek() is None)
if __name__ == '__main__':
main()

.github/scripts/trymerge.py vendored Executable file

@@ -0,0 +1,440 @@
#!/usr/bin/env python3
import json
import os
import re
from dataclasses import dataclass
from urllib.request import urlopen, Request
from urllib.error import HTTPError
from typing import cast, Any, Callable, Dict, List, Optional, Tuple, Union
from gitutils import get_git_remote_name, get_git_repo_dir, patterns_to_regex, GitRepo
GH_GET_PR_INFO_QUERY = """
query ($owner: String!, $name: String!, $number: Int!) {
repository(owner: $owner, name: $name) {
pullRequest(number: $number) {
closed
isCrossRepository
author {
login
}
title
body
headRefName
headRepository {
nameWithOwner
}
baseRefName
baseRepository {
nameWithOwner
isPrivate
defaultBranchRef {
name
}
}
mergeCommit {
oid
}
commits(first: 100) {
nodes {
commit {
author {
user {
login
}
email
name
}
oid
checkSuites(filterBy: {appId: 12274}, first: 1) {
nodes {
app {
databaseId
}
conclusion
}
}
}
}
totalCount
}
changedFiles
files(last: 100) {
nodes {
path
}
}
latestReviews(last: 100) {
nodes {
author {
login
}
state
}
totalCount
}
comments(last: 1) {
nodes {
bodyText
author {
login
}
authorAssociation
editor {
login
}
}
}
}
}
}
"""
RE_GHSTACK_HEAD_REF = re.compile(r"^(gh/[^/]+/[0-9]+/)head$")
RE_GHSTACK_SOURCE_ID = re.compile(r'^ghstack-source-id: (.+)\n?', re.MULTILINE)
RE_PULL_REQUEST_RESOLVED = re.compile(
r'Pull Request resolved: '
r'https://github.com/(?P<owner>[^/]+)/(?P<repo>[^/]+)/pull/(?P<number>[0-9]+)',
re.MULTILINE
)
RE_REVERT_CMD = re.compile(r"@pytorch(merge|)bot\s+revert\s+this")
RE_DIFF_REV = re.compile(r'^Differential Revision:.+?(D[0-9]+)', re.MULTILINE)
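The patterns above can be sanity-checked standalone; the ghstack branch name below is made up, and the Differential Revision is D34475711 from this changeset:

```python
import re

RE_GHSTACK_HEAD_REF = re.compile(r"^(gh/[^/]+/[0-9]+/)head$")
RE_DIFF_REV = re.compile(r'^Differential Revision:.+?(D[0-9]+)', re.MULTILINE)

# ghstack head branches look like gh/<user>/<stack-index>/head.
print(bool(RE_GHSTACK_HEAD_REF.match("gh/someuser/42/head")))  # True
print(bool(RE_GHSTACK_HEAD_REF.match("my-feature-branch")))    # False

body = "Test Plan: None\n\nDifferential Revision: D34475711\n"
m = RE_DIFF_REV.search(body)
print(m.group(1) if m else None)  # D34475711
```

`re.MULTILINE` makes `^` anchor at each line of the commit body rather than only at its start, which is why `RE_DIFF_REV` finds the trailer mid-message.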
def _fetch_url(url: str, *,
headers: Optional[Dict[str, str]] = None,
data: Optional[Dict[str, Any]] = None,
method: Optional[str] = None,
reader: Callable[[Any], Any] = lambda x: x.read()) -> Any:
if headers is None:
headers = {}
token = os.environ.get("GITHUB_TOKEN")
if token is not None and url.startswith('https://api.github.com/'):
headers['Authorization'] = f'token {token}'
data_ = json.dumps(data).encode() if data is not None else None
try:
with urlopen(Request(url, headers=headers, data=data_, method=method)) as conn:
return reader(conn)
except HTTPError as err:
if err.code == 403 and all(key in err.headers for key in ['X-RateLimit-Limit', 'X-RateLimit-Used']):
print(f"Rate limit exceeded: {err.headers['X-RateLimit-Used']}/{err.headers['X-RateLimit-Limit']}")
raise
def fetch_json(url: str,
params: Optional[Dict[str, Any]] = None,
data: Optional[Dict[str, Any]] = None) -> List[Dict[str, Any]]:
headers = {'Accept': 'application/vnd.github.v3+json'}
if params is not None and len(params) > 0:
url += '?' + '&'.join(f"{name}={val}" for name, val in params.items())
return cast(List[Dict[str, Any]], _fetch_url(url, headers=headers, data=data, reader=json.load))
def gh_post_comment(org: str, project: str, pr_num: int, comment: str, dry_run: bool = False) -> List[Dict[str, Any]]:
if dry_run:
print(comment)
return []
return fetch_json(f'https://api.github.com/repos/{org}/{project}/issues/{pr_num}/comments',
data={"body": comment})
def gh_add_labels(org: str, project: str, pr_num: int, labels: Union[str, List[str]]) -> None:
fetch_json(f'https://api.github.com/repos/{org}/{project}/issues/{pr_num}/labels',
data={"labels": labels})
def gh_graphql(query: str, **kwargs: Any) -> Dict[str, Any]:
rc = _fetch_url("https://api.github.com/graphql", data={"query": query, "variables": kwargs}, reader=json.load)
if "errors" in rc:
raise RuntimeError(f"GraphQL query {query} failed: {rc['errors']}")
return cast(Dict[str, Any], rc)
def gh_get_pr_info(org: str, proj: str, pr_no: int) -> Any:
rc = gh_graphql(GH_GET_PR_INFO_QUERY, name=proj, owner=org, number=pr_no)
return rc["data"]["repository"]["pullRequest"]
def parse_args() -> Any:
from argparse import ArgumentParser
parser = ArgumentParser("Merge PR into default branch")
parser.add_argument("--dry-run", action="store_true")
parser.add_argument("--revert", action="store_true")
parser.add_argument("pr_num", type=int)
return parser.parse_args()
class GitHubPR:
def __init__(self, org: str, project: str, pr_num: int) -> None:
assert isinstance(pr_num, int)
self.org = org
self.project = project
self.pr_num = pr_num
self.info = gh_get_pr_info(org, project, pr_num)
def is_closed(self) -> bool:
return bool(self.info["closed"])
def is_cross_repo(self) -> bool:
return bool(self.info["isCrossRepository"])
def base_ref(self) -> str:
return cast(str, self.info["baseRefName"])
def default_branch(self) -> str:
return cast(str, self.info["baseRepository"]["defaultBranchRef"]["name"])
def head_ref(self) -> str:
return cast(str, self.info["headRefName"])
def is_ghstack_pr(self) -> bool:
return RE_GHSTACK_HEAD_REF.match(self.head_ref()) is not None
def is_base_repo_private(self) -> bool:
return bool(self.info["baseRepository"]["isPrivate"])
def get_changed_files_count(self) -> int:
return int(self.info["changedFiles"])
def get_changed_files(self) -> List[str]:
rc = [x["path"] for x in self.info["files"]["nodes"]]
if len(rc) != self.get_changed_files_count():
raise RuntimeError("Changed file count mismatch")
return rc
def _get_reviewers(self) -> List[Tuple[str, str]]:
reviews_count = int(self.info["latestReviews"]["totalCount"])
if len(self.info["latestReviews"]["nodes"]) != reviews_count:
raise RuntimeError("Can't fetch all PR reviews")
return [(x["author"]["login"], x["state"]) for x in self.info["latestReviews"]["nodes"]]
def get_approved_by(self) -> List[str]:
return [login for (login, state) in self._get_reviewers() if state == "APPROVED"]
def get_commit_count(self) -> int:
return int(self.info["commits"]["totalCount"])
def get_pr_creator_login(self) -> str:
return cast(str, self.info["author"]["login"])
def get_committer_login(self, num: int = 0) -> str:
return cast(str, self.info["commits"]["nodes"][num]["commit"]["author"]["user"]["login"])
def get_committer_author(self, num: int = 0) -> str:
node = self.info["commits"]["nodes"][num]["commit"]["author"]
return f"{node['name']} <{node['email']}>"
def get_check_suite_conclusions(self) -> Dict[int, str]:
last_commit = self.info["commits"]["nodes"][-1]["commit"]
rc = {}
for node in last_commit["checkSuites"]["nodes"]:
rc[int(node["app"]["databaseId"])] = node["conclusion"]
return rc
def get_authors(self) -> Dict[str, str]:
rc = {}
for idx in range(self.get_commit_count()):
rc[self.get_committer_login(idx)] = self.get_committer_author(idx)
return rc
def get_author(self) -> str:
authors = self.get_authors()
if len(authors) == 1:
return next(iter(authors.values()))
return self.get_authors()[self.get_pr_creator_login()]
def get_title(self) -> str:
return cast(str, self.info["title"])
def get_body(self) -> str:
return cast(str, self.info["body"])
def get_merge_commit(self) -> Optional[str]:
mc = self.info["mergeCommit"]
return mc["oid"] if mc is not None else None
def get_pr_url(self) -> str:
return f"https://github.com/{self.org}/{self.project}/pull/{self.pr_num}"
def get_comment_body(self, num: int = -1) -> str:
return cast(str, self.info["comments"]["nodes"][num]["bodyText"])
def get_comment_author_login(self, num: int = -1) -> str:
return cast(str, self.info["comments"]["nodes"][num]["author"]["login"])
def get_comment_editor_login(self, num: int = -1) -> Optional[str]:
rc = self.info["comments"]["nodes"][num]["editor"]
return rc["login"] if rc is not None else None
def get_comment_author_association(self, num: int = -1) -> str:
return cast(str, self.info["comments"]["nodes"][num]["authorAssociation"])
def merge_ghstack_into(self, repo: GitRepo) -> None:
assert self.is_ghstack_pr()
# For ghstack, cherry-pick commits from the corresponding orig branch on the remote
orig_ref = f"{repo.remote}/{re.sub(r'/head$', '/orig', self.head_ref())}"
rev_list = repo.revlist(f"{self.default_branch()}..{orig_ref}")
for idx, rev in enumerate(reversed(rev_list)):
msg = repo.commit_message(rev)
m = RE_PULL_REQUEST_RESOLVED.search(msg)
if m is None:
raise RuntimeError(f"Could not find PR-resolved string in {msg} of ghstacked PR {self.pr_num}")
if self.org != m.group('owner') or self.project != m.group('repo'):
raise RuntimeError(f"PR {m.group('number')} resolved to wrong owner/repo pair")
pr_num = int(m.group('number'))
if pr_num != self.pr_num:
pr = GitHubPR(self.org, self.project, pr_num)
if pr.is_closed():
print(f"Skipping {idx+1} of {len(rev_list)} PR (#{pr_num}) as it's already been merged")
continue
# Raises exception if matching rule is not found
find_matching_merge_rule(pr, repo)
repo.cherry_pick(rev)
repo.amend_commit_message(re.sub(RE_GHSTACK_SOURCE_ID, "", msg))
def merge_into(self, repo: GitRepo, dry_run: bool = False) -> None:
# Raises exception if matching rule is not found
find_matching_merge_rule(self, repo)
if repo.current_branch() != self.default_branch():
repo.checkout(self.default_branch())
if not self.is_ghstack_pr():
msg = self.get_title() + "\n\n" + self.get_body()
msg += f"\nPull Request resolved: {self.get_pr_url()}\n"
repo._run_git("merge", "--squash", f"{repo.remote}/{self.head_ref()}")
repo._run_git("commit", f"--author=\"{self.get_author()}\"", "-m", msg)
else:
self.merge_ghstack_into(repo)
repo.push(self.default_branch(), dry_run)
@dataclass
class MergeRule:
name: str
patterns: List[str]
approved_by: List[str]
mandatory_app_id: Optional[int]
def read_merge_rules(repo: GitRepo) -> List[MergeRule]:
from pathlib import Path
rules_path = Path(repo.repo_dir) / ".github" / "merge_rules.json"
if not rules_path.exists():
print(f"{rules_path} does not exist, returning empty rules")
return []
with open(rules_path) as fp:
rc = json.load(fp, object_hook=lambda x: MergeRule(**x))
return cast(List[MergeRule], rc)
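`read_merge_rules` above leans on `json.load`'s `object_hook` to turn each JSON rule object directly into a `MergeRule` dataclass. A minimal self-contained sketch of that parsing step (the rule contents below are made up for illustration, not taken from the real `merge_rules.json`):

```python
import json
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class MergeRule:
    name: str
    patterns: List[str]
    approved_by: List[str]
    mandatory_app_id: Optional[int]

# Hypothetical contents of .github/merge_rules.json
raw = '''[{"name": "Docs", "patterns": ["docs/**"],
           "approved_by": ["reviewer1"], "mandatory_app_id": null}]'''

# object_hook runs on every decoded JSON object, so each rule dict
# is converted into a MergeRule before the surrounding list is returned
rules = json.loads(raw, object_hook=lambda x: MergeRule(**x))
print(rules[0].name)  # prints: Docs
```

Because the hook applies to every object, this only works while each object in the file maps one-to-one onto `MergeRule`'s fields; an extra or missing key would raise a `TypeError`.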
def find_matching_merge_rule(pr: GitHubPR, repo: GitRepo) -> MergeRule:
"""Returns the merge rule matching this PR or raises an exception"""
changed_files = pr.get_changed_files()
approved_by = set(pr.get_approved_by())
rules = read_merge_rules(repo)
for rule in rules:
rule_name = rule.name
rule_approvers_set = set(rule.approved_by)
patterns_re = patterns_to_regex(rule.patterns)
approvers_intersection = approved_by.intersection(rule_approvers_set)
# Skip this rule if it requires approvers but none of them reviewed the PR
if len(approvers_intersection) == 0 and len(rule_approvers_set) > 0:
print(f"Skipping rule {rule_name} due to no approver overlap")
continue
if rule.mandatory_app_id is not None:
cs_conclusions = pr.get_check_suite_conclusions()
mandatory_app_id = rule.mandatory_app_id
if mandatory_app_id not in cs_conclusions or cs_conclusions[mandatory_app_id] != "SUCCESS":
print(f"Skipping rule {rule_name} as mandatory app {mandatory_app_id} did not report SUCCESS in {cs_conclusions}")
continue
non_matching_files = []
for fname in changed_files:
if not patterns_re.match(fname):
non_matching_files.append(fname)
if len(non_matching_files) > 0:
print(f"Skipping rule {rule_name} due to non-matching files: {non_matching_files}")
continue
print(f"Matched rule {rule_name} for {pr.pr_num}")
return rule
raise RuntimeError(f"PR {pr.pr_num} does not match merge rules")
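The file-matching loop in `find_matching_merge_rule` can be sketched in isolation. The sketch below uses `fnmatch` as a stand-in for the repo's `patterns_to_regex` helper (which is defined elsewhere), so the glob semantics are only approximate:

```python
from fnmatch import fnmatch
from typing import List

def non_matching_files(changed_files: List[str], patterns: List[str]) -> List[str]:
    # A file is covered by the rule if at least one pattern matches it;
    # any leftover file disqualifies the rule for this PR
    return [f for f in changed_files
            if not any(fnmatch(f, p) for p in patterns)]

print(non_matching_files(["docs/source/conf.py", "torch/nn/functional.py"],
                         ["docs/*"]))
# prints: ['torch/nn/functional.py']
```

An empty result means every changed file is covered, which is the condition the real function requires before also checking approvers and the mandatory check suite.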
def try_revert(repo: GitRepo, pr: GitHubPR, dry_run: bool = False) -> None:
def post_comment(msg: str) -> None:
gh_post_comment(pr.org, pr.project, pr.pr_num, msg, dry_run=dry_run)
if not pr.is_closed():
return post_comment(f"Can't revert open PR #{pr.pr_num}")
if not RE_REVERT_CMD.match(pr.get_comment_body()):
raise RuntimeError(f"Comment {pr.get_comment_body()} does not seem to be a valid revert command")
if pr.get_comment_editor_login() is not None:
return post_comment("Don't want to revert based on edited command")
author_association = pr.get_comment_author_association()
author_login = pr.get_comment_author_login()
# For some reason, one cannot be a MEMBER of a private repo, only a CONTRIBUTOR
expected_association = "CONTRIBUTOR" if pr.is_base_repo_private() else "MEMBER"
if author_association != expected_association and author_association != "OWNER":
return post_comment(f"Will not revert as @{author_login} is not a {expected_association}, but {author_association}")
# Raises exception if matching rule is not found
find_matching_merge_rule(pr, repo)
commit_sha = pr.get_merge_commit()
if commit_sha is None:
commits = repo.commits_resolving_gh_pr(pr.pr_num)
if len(commits) == 0:
raise RuntimeError("Can't find any commits resolving PR")
commit_sha = commits[0]
msg = repo.commit_message(commit_sha)
rc = RE_DIFF_REV.search(msg)
if rc is not None:
raise RuntimeError(f"Can't revert PR that was landed via phabricator as {rc.group(1)}")
repo.checkout(pr.default_branch())
repo.revert(commit_sha)
msg = repo.commit_message("HEAD")
msg = re.sub(RE_PULL_REQUEST_RESOLVED, "", msg)
msg += f"\nReverted {pr.get_pr_url()} on behalf of @{author_login}\n"
repo.amend_commit_message(msg)
repo.push(pr.default_branch(), dry_run)
if not dry_run:
gh_add_labels(pr.org, pr.project, pr.pr_num, ["reverted"])
def main() -> None:
args = parse_args()
repo = GitRepo(get_git_repo_dir(), get_git_remote_name())
org, project = repo.gh_owner_and_name()
pr = GitHubPR(org, project, args.pr_num)
if args.revert:
try:
try_revert(repo, pr, dry_run=args.dry_run)
except Exception as e:
msg = f"Reverting PR {args.pr_num} failed due to {e}"
run_url = os.getenv("GH_RUN_URL")
if run_url is not None:
msg += f"\nRaised by {run_url}"
gh_post_comment(org, project, args.pr_num, msg, dry_run=args.dry_run)
return
if pr.is_closed():
gh_post_comment(org, project, args.pr_num, f"Can't merge closed PR #{args.pr_num}", dry_run=args.dry_run)
return
if pr.is_cross_repo():
gh_post_comment(org, project, args.pr_num, "Cross-repo merges are not supported at the moment", dry_run=args.dry_run)
return
try:
pr.merge_into(repo, dry_run=args.dry_run)
except Exception as e:
msg = f"Merge failed due to {e}"
run_url = os.getenv("GH_RUN_URL")
if run_url is not None:
msg += f"\nRaised by {run_url}"
gh_post_comment(org, project, args.pr_num, msg, dry_run=args.dry_run)
if __name__ == "__main__":
main()

View File

@ -8,8 +8,18 @@ name: !{{ build_environment }}
{%- endblock %}
on:
{%- if is_default %}
pull_request:
types: [opened, synchronize, reopened, !{{ ciflow_config.trigger_action }}]
{%- endif -%}
{%- for label in ciflow_config.labels | sort %}
{%- if loop.first %}
push:
tags:
{%- endif %}
{%- if label != "ciflow/default" %}
- '!{{ label }}/*'
{%- endif %}
{%- endfor %}
{% block build +%}
# building and testing in a single job since bazel runs only a small subset of tests
@ -18,18 +28,16 @@ on:
env:
JOB_BASE_NAME: !{{ build_environment }}-build-and-test
NUM_TEST_SHARDS: !{{ num_test_shards }}
IS_PROBOT_TRIGGER_EVENT: ${{ (github.event.action == '!{{ ciflow_config.trigger_action }}') && (github.event.assignee.login == '!{{ ciflow_config.trigger_actor }}') }}
LABEL_CONDITIONS: ${{ !{{ ciflow_config.label_conditions }} }}
if: !{{ ciflow_config.root_job_condition }}
steps:
- name: print labels
run: echo "${PR_LABELS}"
!{{ common.setup_ec2_linux() }}
!{{ common.checkout_pytorch("recursive") }}
!{{ common.checkout() }}
!{{ common.calculate_docker_image(false) }}
- name: Pull Docker image
run: |
!{{ common.pull_docker("${DOCKER_IMAGE}") }}
!{{ common.add_retry_to_env() }}
retry docker pull "${DOCKER_IMAGE}"
- name: Determine shm-size
run: |
shm_size="1g"

View File

@ -8,8 +8,18 @@ name: !{{ build_environment }}
{%- endblock %}
on:
{%- if is_default %}
pull_request:
types: [opened, synchronize, reopened, !{{ ciflow_config.trigger_action }}]
{%- endif -%}
{%- for label in ciflow_config.labels | sort %}
{%- if loop.first %}
push:
tags:
{%- endif %}
{%- if label != "ciflow/default" %}
- '!{{ label }}/*'
{%- endif %}
{%- endfor %}
{% block build +%}
# building and testing in a single job since bazel runs only a small subset of tests
@ -18,18 +28,16 @@ on:
env:
JOB_BASE_NAME: !{{ build_environment }}-build-and-test
NUM_TEST_SHARDS: !{{ num_test_shards }}
IS_PROBOT_TRIGGER_EVENT: ${{ (github.event.action == '!{{ ciflow_config.trigger_action }}') && (github.event.assignee.login == '!{{ ciflow_config.trigger_actor }}') }}
LABEL_CONDITIONS: ${{ !{{ ciflow_config.label_conditions }} }}
if: !{{ ciflow_config.root_job_condition }}
steps:
- name: print labels
run: echo "${PR_LABELS}"
!{{ common.setup_ec2_linux() }}
!{{ common.checkout_pytorch("recursive") }}
!{{ common.checkout() }}
!{{ common.calculate_docker_image(false) }}
- name: Pull Docker image
run: |
!{{ common.pull_docker("${DOCKER_IMAGE}") }}
!{{ common.add_retry_to_env() }}
retry docker pull "${DOCKER_IMAGE}"
- name: Determine shm-size
run: |
shm_size="1g"

View File

@ -8,8 +8,18 @@ name: !{{ build_environment }}
{%- endblock %}
on:
{%- if is_default %}
pull_request:
types: [opened, synchronize, reopened, !{{ ciflow_config.trigger_action }}]
{%- endif -%}
{%- for label in ciflow_config.labels | sort %}
{%- if loop.first %}
push:
tags:
{%- endif %}
{%- if label != "ciflow/default" %}
- '!{{ label }}/*'
{%- endif %}
{%- endfor %}
{% block build +%}
# building and testing in a single job since bazel runs only a small subset of tests
@ -18,18 +28,16 @@ on:
env:
JOB_BASE_NAME: !{{ build_environment }}-build-and-test
NUM_TEST_SHARDS: !{{ num_test_shards }}
IS_PROBOT_TRIGGER_EVENT: ${{ (github.event.action == '!{{ ciflow_config.trigger_action }}') && (github.event.assignee.login == '!{{ ciflow_config.trigger_actor }}') }}
LABEL_CONDITIONS: ${{ !{{ ciflow_config.label_conditions }} }}
if: !{{ ciflow_config.root_job_condition }}
steps:
- name: print labels
run: echo "${PR_LABELS}"
!{{ common.setup_ec2_linux() }}
!{{ common.checkout_pytorch("recursive") }}
!{{ common.checkout() }}
!{{ common.calculate_docker_image(false) }}
- name: Pull Docker image
run: |
!{{ common.pull_docker("${DOCKER_IMAGE}") }}
!{{ common.add_retry_to_env() }}
retry docker pull "${DOCKER_IMAGE}"
- name: Determine shm-size
run: |
shm_size="1g"

View File

@ -12,11 +12,10 @@ concurrency:
cancel-in-progress: true
{%- endmacro -%}
{%- macro pull_docker(image) -%}
{%- macro add_retry_to_env() -%}
retry () {
"$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}
retry docker pull "!{{ image }}"
{%- endmacro -%}
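The `add_retry_to_env` macro above emits a shell helper that makes up to three attempts with 1s and 2s back-offs between them. The same idea expressed in Python, as a rough sketch for illustration rather than anything the templates actually use:

```python
import subprocess
import sys
import time

def retry(cmd, delays=(1, 2)):
    # Mirrors the shell helper: one initial attempt, then one more
    # attempt per delay, sleeping between attempts
    if subprocess.call(cmd) == 0:
        return True
    for delay in delays:
        time.sleep(delay)
        if subprocess.call(cmd) == 0:
            return True
    return False

# e.g. retry a command that always succeeds (no sleeps needed)
ok = retry([sys.executable, "-c", "pass"], delays=(0, 0))
```

The fixed, short back-off is a deliberate trade-off in the templates: it papers over transient registry or network hiccups during `docker pull` without delaying a genuinely failing job for long.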
{%- macro display_ec2_information() -%}
@ -35,14 +34,20 @@ concurrency:
echo "instance-type: $(get_ec2_metadata instance-type)"
{%- endmacro -%}
{%- macro parse_ref() -%}
{%- macro parse_ref(pytorch_directory="") -%}
- name: Parse ref
{%- if pytorch_directory %}
working-directory: !{{ pytorch_directory }}
{%- endif %}
id: parse-ref
run: .github/scripts/parse_ref.py
{%- endmacro -%}
{%- macro upload_test_statistics(build_environment, when="always()") -%}
{%- macro upload_test_statistics(build_environment, when="always()", pytorch_directory="") -%}
- name: Display and upload test statistics (Click Me)
{%- if pytorch_directory %}
working-directory: !{{ pytorch_directory }}
{%- endif %}
if: !{{ when }}
# temporary hack: set CIRCLE_* vars, until we update
# tools/stats/print_test_stats.py to natively support GitHub Actions
@ -53,7 +58,7 @@ concurrency:
PR_NUMBER: ${{ github.event.pull_request.number }}
SHA1: ${{ github.event.pull_request.head.sha || github.sha }}
TAG: ${{ steps.parse-ref.outputs.tag }}
WORKFLOW_ID: '${{ github.run_id }}_${{ github.run_number }}'
WORKFLOW_ID: '${{ github.run_id }}'
shell: bash
run: |
python3 -m pip install -r requirements.txt
@ -61,6 +66,22 @@ concurrency:
python3 -m tools.stats.print_test_stats --upload-to-s3 --compare-with-s3 test
{%- endmacro -%}
{%- macro chown_dir(dir) -%}
- name: Chown artifacts
if: always()
run: |
# Ensure the working directory gets chowned back to the current user
docker run --rm -v "!{{ dir }}:/v" -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
{%- endmacro -%}
{%- macro setup_ec2_windows() -%}
!{{ display_ec2_information() }}
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: seemethere/add-github-ssh-key@v1
with:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
{%- endmacro -%}
{%- macro setup_ec2_linux() -%}
!{{ display_ec2_information() }}
- name: Log in to ECR
@ -69,17 +90,19 @@ concurrency:
AWS_MAX_ATTEMPTS: 5
run: |
AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\")
aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
!{{ add_retry_to_env() }}
retry aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
--password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com"
- name: Chown workspace
run: |
!{{ pull_docker("${ALPINE_IMAGE}") }}
!{{ add_retry_to_env() }}
retry docker pull "${ALPINE_IMAGE}"
# Ensure the working directory gets chowned back to the current user
docker run --pull=never --rm -v "$(pwd)":/v -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
- name: Clean workspace
run: |
rm -rf "${GITHUB_WORKSPACE:?}/*"
rm -f ~/.ssh/authorized_keys
rm -rf "${GITHUB_WORKSPACE}"
mkdir "${GITHUB_WORKSPACE}"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: seemethere/add-github-ssh-key@v1
with:
@ -89,8 +112,50 @@ concurrency:
env | grep '^GITHUB' > "/tmp/github_env_${GITHUB_RUN_ID}"
{%- endmacro -%}
{%- macro teardown_ec2_linux() -%}
{%- macro setup_rocm_linux() -%}
- name: Clean workspace
run: |
rm -rf "${GITHUB_WORKSPACE}"
mkdir "${GITHUB_WORKSPACE}"
- name: Set DOCKER_HOST
run: echo "DOCKER_HOST=unix:///run/user/$(id -u)/docker.sock" >> "${GITHUB_ENV}"
- name: Runner health check system info
if: always()
run: |
cat /etc/os-release || true
cat /etc/apt/sources.list.d/rocm.list || true
cat /opt/rocm/.info/version || true
whoami
- name: Runner health check rocm-smi
if: always()
run: |
rocm-smi
- name: Runner health check rocminfo
if: always()
run: |
rocminfo
- name: Runner health check GPU count
if: always()
run: |
ngpu=$(rocminfo | grep -c -E 'Name:.*\sgfx')
if [[ "x$ngpu" != "x2" && "x$ngpu" != "x4" ]]; then
echo "Failed to detect GPUs on the runner"
exit 1
fi
- name: Runner health check disconnect on failure
if: ${{ failure() }}
run: |
killall runsvc.sh
- name: Preserve github env variables for use in docker
run: |
env | grep '^GITHUB' > "/tmp/github_env_${GITHUB_RUN_ID}"
{%- endmacro -%}
{%- macro teardown_ec2_linux(pytorch_directory="") -%}
- name: Hold runner for 2 hours or until ssh sessions have drained
{%- if pytorch_directory %}
working-directory: !{{ pytorch_directory }}
{%- endif %}
# Always hold for active ssh sessions
if: always()
run: .github/scripts/wait_for_ssh_to_drain.sh
@ -109,18 +174,42 @@ concurrency:
docker system prune -af
{%- endmacro -%}
{%- macro checkout_pytorch(submodules) -%}
- name: Checkout PyTorch
{%- macro teardown_rocm_linux() -%}
- name: Kill containers, clean up images
if: always()
run: |
# ignore expansion of "docker ps -q" since it could be empty
# shellcheck disable=SC2046
docker stop $(docker ps -q) || true
# Prune all of the docker images
docker system prune -af
{%- endmacro -%}
{%- macro checkout(submodules="recursive", deep_clone=True, directory="", repository="pytorch/pytorch", branch="") -%}
- name: Checkout !{{ 'PyTorch' if repository == "pytorch/pytorch" else repository }}
uses: zhouzhuojie/checkout@05b13c9a0d21f08f6d5e64a1d5042246d13619d9
with:
{%- if deep_clone %}
# deep clone, to allow use of git merge-base
fetch-depth: 0
{%- endif %}
submodules: !{{ submodules }}
- name: Clean PyTorch checkout
{%- if repository != "pytorch/pytorch" %}
repository: !{{ repository }}
{%- endif %}
{%- if branch %}
ref: !{{ branch }}
{%- endif %}
{%- if directory %}
path: !{{ directory }}
{%- endif %}
- name: Clean !{{ 'PyTorch' if repository == "pytorch/pytorch" else repository }} checkout
run: |
# Remove any artifacts from the previous checkouts
git clean -fxd
{%- if directory%}
working-directory: !{{ directory }}
{%- endif %}
{%- endmacro -%}
{%- macro upload_downloaded_files(name, artifact_name="", use_s3=True, when="always()") -%}
@ -282,3 +371,23 @@ concurrency:
DEVELOPER_DIR: /Applications/Xcode_!{{ xcode_version }}.app/Contents/Developer
{%- endif %}
{%- endmacro -%}
{%- macro wait_and_kill_ssh_windows(pytorch_directory="") -%}
- name: Wait until all sessions have drained
shell: powershell
{%- if pytorch_directory %}
working-directory: !{{ pytorch_directory }}
{%- endif %}
if: always()
timeout-minutes: 120
run: |
.github\scripts\wait_for_ssh_to_drain.ps1
- name: Kill active ssh sessions if still around (Useful if workflow was cancelled)
shell: powershell
{%- if pytorch_directory %}
working-directory: !{{ pytorch_directory }}
{%- endif %}
if: always()
run: |
.github\scripts\kill_active_ssh_sessions.ps1
{%- endmacro -%}

View File

@ -11,7 +11,7 @@
PR_NUMBER: ${{ github.event.pull_request.number }}
SHA1: ${{ github.event.pull_request.head.sha || github.sha }}
TAG: ${{ steps.parse-ref.outputs.tag }}
WORKFLOW_ID: '${{ github.run_id }}_${{ github.run_number }}'
WORKFLOW_ID: '${{ github.run_id }}'
run: |
# The artifact file is created inside the docker container and contains the result binaries.
# Now unpack it into the project folder. The subsequent script will scan the project folder

View File

@ -40,11 +40,12 @@ jobs:
name: docker-build (${{ matrix.docker_image_short_name }})
steps:
!{{ common.setup_ec2_linux() }}
!{{ common.checkout_pytorch("recursive") }}
!{{ common.checkout() }}
!{{ common.calculate_docker_image(true) }}
- name: Pull Docker image
run: |
!{{ common.pull_docker("${DOCKER_IMAGE}") }}
!{{ common.add_retry_to_env() }}
retry docker pull "${DOCKER_IMAGE}"
!{{ common.parse_ref() }}
!{{ common.teardown_ec2_linux() }}
- name: Hold runner for 2 hours or until ssh sessions have drained

View File

@ -7,8 +7,9 @@ name: !{{ build_environment }}
{%- endblock %}
on:
{%- if is_default %}
pull_request:
types: [opened, synchronize, reopened, !{{ ciflow_config.trigger_action }}]
{%- endif -%}
{%- if is_scheduled %}
schedule:
@ -19,43 +20,89 @@ on:
- master
- release/*
{%- endif %}
{%- for label in ciflow_config.labels | sort %}
{%- if loop.first %}
tags:
{%- endif %}
{%- if label != "ciflow/default" %}
- '!{{ label }}/*'
{%- endif %}
{%- endfor %}
workflow_dispatch:
# For setup-miniconda, see https://github.com/conda-incubator/setup-miniconda/issues/179
defaults:
run:
shell: bash -x -e -l {0}
env:
BUILD_ENVIRONMENT: !{{ build_environment }}
IN_CI: 1
IS_GHA: 1
IOS_PLATFORM: !{{ ios_platform }}
IOS_ARCH: !{{ ios_arch }}
!{{ common.set_xcode_version(xcode_version) }}
jobs:
{% block build +%}
build:
# NOTE: These builds will not run successfully without running on `pytorch/pytorch` due to the limitations
# of accessing secrets from forked pull requests and iOS's dependency on secrets for its build/test
if: ${{ github.event_name == 'push' || github.event.pull_request.head.repo.full_name == github.repository }}
runs-on: macos-10.15
timeout-minutes: !{{ common.timeout_minutes }}
env:
JOB_BASE_NAME: !{{ build_environment }}-build
IOS_CERT_KEY_2022: ${{ secrets.IOS_CERT_KEY_2022 }}
IOS_CERT_SECRET: ${{ secrets.IOS_CERT_SECRET }}
IOS_DEV_TEAM_ID: ${{ secrets.IOS_DEV_TEAM_ID }}
IOS_SIGN_KEY_2022: ${{ secrets.IOS_SIGN_KEY_2022 }}
IS_PROBOT_TRIGGER_EVENT: ${{ (github.event.action == '!{{ ciflow_config.trigger_action }}') && (github.event.assignee.login == '!{{ ciflow_config.trigger_actor }}') }}
LABEL_CONDITIONS: ${{ !{{ ciflow_config.label_conditions }} }}
PR_LABELS: ${{ toJson(github.event.pull_request.labels.*.name) }}
if: !{{ ciflow_config.root_job_condition }}
steps:
- name: print labels
run: echo "${PR_LABELS}"
!{{ common.checkout_pytorch("recursive") }}
!{{ common.setup_miniconda("3.8") }}
- name: Install ios / conda Dependencies
!{{ common.checkout() }}
- name: Populate CI build options
run: |
# Most builds use the lite interpreter; builds that should not
# use it overwrite this env variable in the case statement below
echo "BUILD_LITE_INTERPRETER=1" >> "${GITHUB_ENV}"
case ${BUILD_ENVIRONMENT} in
*metal*)
echo "USE_PYTORCH_METAL=1" >> "${GITHUB_ENV}"
;;
*full_jit*)
echo "BUILD_LITE_INTERPRETER=0" >> "${GITHUB_ENV}"
;;
*custom*)
echo "SELECTED_OP_LIST=${GITHUB_WORKSPACE}/ios/TestApp/custom_build/mobilenetv2.yaml" >> "${GITHUB_ENV}"
;;
*coreml*)
echo "USE_COREML_DELEGATE=1" >> "${GITHUB_ENV}"
;;
esac
- name: Install brew dependencies
run: |
# Install dependencies
brew install libtool
conda install numpy ninja pyyaml mkl mkl-include setuptools cmake cffi requests typing_extensions --yes
- name: Install conda and dependencies
run: |
# Install conda manually; setup-miniconda alters the PATH in a way that breaks the ruby steps later on
curl --retry 3 -o "${RUNNER_TEMP}/conda.sh" https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh
chmod +x "${RUNNER_TEMP}/conda.sh"
/bin/bash "${RUNNER_TEMP}/conda.sh" -b -p "${RUNNER_TEMP}/anaconda"
echo "${RUNNER_TEMP}/anaconda/bin" >> "${GITHUB_PATH}"
# shellcheck disable=SC1091
source "${RUNNER_TEMP}/anaconda/bin/activate"
conda install -y \
cffi \
cmake \
mkl \
mkl-include \
ninja \
numpy \
pyyaml \
requests \
setuptools \
typing_extensions
- name: Run Fastlane
shell: bash -e {0}
run: |
set -x
cd ios/TestApp
@ -77,10 +124,60 @@ jobs:
rm cert.txt
- name: Build
run: |
# shellcheck disable=SC1091
source "${RUNNER_TEMP}/anaconda/bin/activate"
export TCLLIBPATH="/usr/local/lib"
python -VV
export CMAKE_PREFIX_PATH=${CONDA_PREFIX:-"$(dirname "$(which conda)")/../"}
scripts/build_ios.sh
- name: Run Build Test
run: |
PROFILE=PyTorch_CI_2022
# run the ruby build script
if ! [ -x "$(command -v xcodebuild)" ]; then
echo 'Error: xcodebuild is not installed.'
exit 1
fi
if [ "${IOS_PLATFORM}" != "SIMULATOR" ]; then
ruby scripts/xcode_build.rb -i build_ios/install -x ios/TestApp/TestApp.xcodeproj -p "${IOS_PLATFORM}" -c "${PROFILE}" -t "${IOS_DEV_TEAM_ID}"
else
ruby scripts/xcode_build.rb -i build_ios/install -x ios/TestApp/TestApp.xcodeproj -p "${IOS_PLATFORM}"
fi
{%- if ios_platform == "SIMULATOR" %}
- name: Run Simulator Tests
run: |
# shellcheck disable=SC1091
source "${RUNNER_TEMP}/anaconda/bin/activate"
pip3 install --pre torch torchvision torchaudio -f https://download.pytorch.org/whl/nightly/cpu/torch_nightly.html
# generate models for different backends
cd "${GITHUB_WORKSPACE}/ios/TestApp/benchmark"
mkdir -p ../models
if [ "${USE_COREML_DELEGATE}" == 1 ]; then
pip install coremltools==5.0b5
pip install six==1.16.0
python coreml_backend.py
else
python trace_model.py
fi
if [ "${BUILD_LITE_INTERPRETER}" == 1 ]; then
echo "Setting up the TestApp for LiteInterpreter"
ruby setup.rb --lite 1
else
echo "Setting up the TestApp for Full JIT"
ruby setup.rb
fi
cd "${GITHUB_WORKSPACE}/ios/TestApp"
instruments -s -devices
if [ "${BUILD_LITE_INTERPRETER}" == 1 ]; then
if [ "${USE_COREML_DELEGATE}" == 1 ]; then
fastlane scan --only_testing TestAppTests/TestAppTests/testCoreML
else
fastlane scan --only_testing TestAppTests/TestAppTests/testLiteInterpreter
fi
else
fastlane scan --only_testing TestAppTests/TestAppTests/testFullJIT
fi
{%- endif -%}
{% endblock +%}
!{{ common.concurrency(build_environment) }}

View File

@ -0,0 +1,245 @@
{% import 'common.yml.j2' as common %}
{%- block name -%}
# Template is at: .github/templates/linux_binary_build_workflow.yml.j2
# Generation script: .github/scripts/generate_ci_workflows.py
name: !{{ build_environment }}
{%- endblock %}
{%- macro binary_env(config) -%}
env:
PACKAGE_TYPE: !{{ config["package_type"] }}
# TODO: This is a legacy variable that we eventually want to get rid of in
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: !{{ config["desired_cuda"] }}
{%- if config["gpu_arch_version"] %}
GPU_ARCH_VERSION: !{{ config["gpu_arch_version"] }}
{%- endif %}
GPU_ARCH_TYPE: !{{ config["gpu_arch_type"] }}
DOCKER_IMAGE: !{{ config["container_image"] }}
SKIP_ALL_TESTS: 1
{%- if config["package_type"] == "libtorch" %}
LIBTORCH_VARIANT: !{{ config["libtorch_variant"] }}
DESIRED_DEVTOOLSET: !{{ config["devtoolset"] }}
{%- else %}
DESIRED_PYTHON: "!{{ config["python_version"] }}"
{%- endif %}
{%- endmacro %}
on:
push:
# NOTE: Meta Employees can trigger new nightlies using: https://fburl.com/trigger_pytorch_nightly_build
branches:
- nightly
tags:
# NOTE: Binary build pipelines should only get triggered on release candidate builds
# Release candidate tags look like: v1.11.0-rc1
- v[0-9]+.[0-9]+.[0-9]+-rc[0-9]+
{%- for label in ciflow_config.labels | sort %}
{%- if label != "ciflow/default" %}
- '!{{ label }}/*'
{%- endif %}
{%- endfor %}
workflow_dispatch:
env:
# Needed for conda builds
ALPINE_IMAGE: "308535385114.dkr.ecr.us-east-1.amazonaws.com/tool/alpine"
ANACONDA_USER: pytorch
AWS_DEFAULT_REGION: us-east-1
BINARY_ENV_FILE: /tmp/env
BUILD_ENVIRONMENT: !{{ build_environment }}
BUILDER_ROOT: /builder
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
IN_CI: 1
IS_GHA: 1
PR_LABELS: ${{ toJson(github.event.pull_request.labels.*.name) }}
PR_NUMBER: ${{ github.event.pull_request.number }}
PYTORCH_FINAL_PACKAGE_DIR: /artifacts
PYTORCH_RETRY_TEST_CASES: 1
PYTORCH_ROOT: /pytorch
SHA1: ${{ github.event.pull_request.head.sha || github.sha }}
SKIP_ALL_TESTS: 1
!{{ common.concurrency(build_environment) }}
jobs:
{%- for config in build_configs %}
!{{ config["build_name"] }}-build:
if: ${{ github.repository_owner == 'pytorch' }}
runs-on: linux.4xlarge
timeout-minutes: !{{ common.timeout_minutes }}
!{{ binary_env(config) }}
steps:
!{{ common.setup_ec2_linux() }}
!{{ common.checkout(deep_clone=False, directory="pytorch") }}
!{{ common.checkout(deep_clone=False, directory="builder", repository="pytorch/builder", branch="release/1.11") }}
{%- if config["gpu_arch_type"] == 'cuda' and config["gpu_arch_version"].startswith('11') %}
- name: Set BUILD_SPLIT_CUDA
run: |
echo "BUILD_SPLIT_CUDA='ON'" >> "$GITHUB_ENV"
{%- endif %}
- name: Pull Docker image
run: |
!{{ common.add_retry_to_env() }}
retry docker pull "${DOCKER_IMAGE}"
- name: Build PyTorch binary
run: |
set -x
mkdir -p artifacts/
container_name=$(docker run \
-e BINARY_ENV_FILE \
-e BUILDER_ROOT \
-e BUILD_ENVIRONMENT \
-e BUILD_SPLIT_CUDA \
-e DESIRED_CUDA \
-e DESIRED_DEVTOOLSET \
-e DESIRED_PYTHON \
-e GPU_ARCH_TYPE \
-e GPU_ARCH_VERSION \
-e IS_GHA \
-e LIBTORCH_VARIANT \
-e PACKAGE_TYPE \
-e PYTORCH_FINAL_PACKAGE_DIR \
-e PYTORCH_ROOT \
-e SKIP_ALL_TESTS \
--tty \
--detach \
-v "${GITHUB_WORKSPACE}/pytorch:/pytorch" \
-v "${GITHUB_WORKSPACE}/builder:/builder" \
-v "${RUNNER_TEMP}/artifacts:/artifacts" \
-w / \
"${DOCKER_IMAGE}"
)
docker exec -t -w "${PYTORCH_ROOT}" "${container_name}" bash -c "bash .circleci/scripts/binary_populate_env.sh"
docker exec -t "${container_name}" bash -c "source ${BINARY_ENV_FILE} && bash /builder/!{{ config["package_type"] }}/build.sh"
!{{ common.chown_dir("${RUNNER_TEMP}/artifacts") }}
- uses: !{{ common.upload_artifact_s3_action }}
with:
name: !{{ config["build_name"] }}
retention-days: 14
if-no-files-found: error
path:
${{ runner.temp }}/artifacts/*
!{{ common.teardown_ec2_linux("pytorch/") }}
!{{ config["build_name"] }}-test: # Testing
if: ${{ github.repository_owner == 'pytorch' }}
needs: !{{ config["build_name"] }}-build
{%- if config["gpu_arch_type"] == "cuda" %}
runs-on: linux.4xlarge.nvidia.gpu
{%- else %}
runs-on: linux.4xlarge
{%- endif %}
timeout-minutes: !{{ common.timeout_minutes }}
!{{ binary_env(config) }}
steps:
!{{ common.setup_ec2_linux() }}
- uses: seemethere/download-artifact-s3@0504774707cbc8603d7dca922e8026eb8bf3b47b
name: Download Build Artifacts
with:
name: !{{ config["build_name"] }}
path: "${{ runner.temp }}/artifacts/"
- name: Clone pytorch/pytorch
uses: actions/checkout@v2
with:
path: pytorch
submodules: recursive
- name: Clone pytorch/builder
uses: actions/checkout@v2
with:
repository: pytorch/builder
path: builder
{%- if config["gpu_arch_type"] == "cuda" %}
- name: Install nvidia driver, nvidia-docker runtime, set GPU_FLAG
working-directory: pytorch/
run: |
bash .github/scripts/install_nvidia_utils_linux.sh
echo "GPU_FLAG=--gpus all" >> "${GITHUB_ENV}"
{%- endif %}
- name: Pull Docker image
run: |
!{{ common.add_retry_to_env() }}
retry docker pull "${DOCKER_IMAGE}"
- name: Test PyTorch binary
run: |
set -x
# shellcheck disable=SC2086,SC2090
container_name=$(docker run \
${GPU_FLAG:-} \
-e BINARY_ENV_FILE \
-e BUILDER_ROOT \
-e BUILD_ENVIRONMENT \
-e BUILD_SPLIT_CUDA \
-e DESIRED_CUDA \
-e DESIRED_DEVTOOLSET \
-e DESIRED_PYTHON \
-e GPU_ARCH_TYPE \
-e GPU_ARCH_VERSION \
-e IS_GHA \
-e LIBTORCH_VARIANT \
-e PACKAGE_TYPE \
-e PYTORCH_FINAL_PACKAGE_DIR \
-e PYTORCH_ROOT \
-e SKIP_ALL_TESTS \
--tty \
--detach \
-v "${GITHUB_WORKSPACE}/pytorch:/pytorch" \
-v "${GITHUB_WORKSPACE}/builder:/builder" \
-v "${RUNNER_TEMP}/artifacts:/final_pkgs" \
-w / \
"${DOCKER_IMAGE}"
)
docker exec -t -w "${PYTORCH_ROOT}" "${container_name}" bash -c "bash .circleci/scripts/binary_populate_env.sh"
# Generate test script
docker exec -t -w "${PYTORCH_ROOT}" -e OUTPUT_SCRIPT="/run.sh" "${container_name}" bash -c "bash .circleci/scripts/binary_linux_test.sh"
docker exec -t "${container_name}" bash -c "source ${BINARY_ENV_FILE} && bash -x /run.sh"
!{{ common.teardown_ec2_linux("pytorch/") }}
!{{ config["build_name"] }}-upload: # Uploading
runs-on: linux.2xlarge # self hosted runner to download ec2 artifacts
if: ${{ github.repository_owner == 'pytorch' }}
needs: !{{ config["build_name"] }}-test
!{{ binary_env(config) }}
steps:
!{{ common.setup_ec2_linux() }}
- name: Clone pytorch/pytorch
uses: actions/checkout@v2
- uses: seemethere/download-artifact-s3@0504774707cbc8603d7dca922e8026eb8bf3b47b
name: Download Build Artifacts
with:
name: !{{ config["build_name"] }}
path: "${{ runner.temp }}/artifacts/"
- name: Set DRY_RUN (only for tagged pushes)
if: ${{ github.event_name == 'push' && (github.event.ref == 'refs/heads/nightly' || startsWith(github.event.ref, 'refs/tags/'))}}
run: |
echo "DRY_RUN=disabled" >> "$GITHUB_ENV"
- name: Set UPLOAD_CHANNEL (only for tagged pushes)
if: ${{ github.event_name == 'push' && startsWith(github.event.ref, 'refs/tags/')}}
run: |
# reference ends with an RC suffix
if [[ ${GITHUB_REF_NAME} = *-rc[0-9]* ]]; then
echo "UPLOAD_CHANNEL=test" >> "$GITHUB_ENV"
fi
- name: Upload binaries
env:
PKG_DIR: "${{ runner.temp }}/artifacts"
UPLOAD_SUBFOLDER: "${{ env.DESIRED_CUDA }}"
# When running these on pull_request events these should be blank
AWS_ACCESS_KEY_ID: ${{ secrets.AWS_PYTORCH_ACCESS_KEY_ID }}
AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_PYTORCH_SECRET_KEY }}
ANACONDA_API_TOKEN: ${{ secrets.CONDA_PYTORCHBOT_TOKEN }}
run: |
docker run --rm -i \
-e ANACONDA_API_TOKEN \
-e AWS_ACCESS_KEY_ID \
-e AWS_SECRET_ACCESS_KEY \
-e DRY_RUN \
-e PACKAGE_TYPE \
-e PKG_DIR=/artifacts \
-e UPLOAD_CHANNEL \
-e UPLOAD_SUBFOLDER \
-v "${RUNNER_TEMP}/artifacts:/artifacts" \
-v "${GITHUB_WORKSPACE}:/v" \
-w /v \
308535385114.dkr.ecr.us-east-1.amazonaws.com/tool/miniconda3:4.10.3 \
bash -c '.circleci/scripts/binary_upload.sh'
!{{ common.teardown_ec2_linux() }}
{%- endfor %}
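The `Pull Docker image` steps above wrap the pull in a `retry` shell function injected by `common.add_retry_to_env()`. A minimal standalone sketch of such a helper (the real macro lives in `common.yml.j2` and may differ in attempt count and backoff; `flaky` and the counter file are illustrative only):

```shell
# Hypothetical sketch of a retry helper like the one add_retry_to_env
# injects: rerun the command a few times with a short backoff before
# giving up.
retry () {
  "$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}

# Demo command that succeeds on the third attempt; the file acts as an
# attempt counter.
counter=/tmp/retry_demo_counter
rm -f "${counter}"
flaky () {
  echo x >> "${counter}"
  [ "$(wc -l < "${counter}")" -ge 3 ]
}
retry flaky && echo "pulled"   # -> pulled
```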

View File

@@ -7,18 +7,32 @@ name: !{{ build_environment }}
{%- endblock %}
on:
{%- if is_default %}
pull_request:
types: [opened, synchronize, reopened, !{{ ciflow_config.trigger_action }}]
{%- if is_scheduled %}
schedule:
- cron: !{{ is_scheduled }}
{%- else %}
{%- endif %}
push:
{%- if enable_doc_jobs and is_scheduled %}
tags:
# NOTE: Binary build pipelines should only get triggered on release candidate builds
# Release candidate tags look like: v1.11.0-rc1
- v[0-9]+.[0-9]+.[0-9]+-rc[0-9]+
{%- endif %}
{%- for label in ciflow_config.labels | sort %}
{%- if loop.first and not (enable_doc_jobs and is_scheduled) %}
tags:
{%- endif %}
{%- if label != "ciflow/default" %}
- '!{{ label }}/*'
{%- endif %}
{%- endfor %}
{%- if not is_scheduled and not only_on_pr %}
branches:
- master
- release/*
- fbsync
{%- endif %}
{%- if is_scheduled and not only_on_pr %}
schedule:
- cron: !{{ is_scheduled }}
{%- endif %}
workflow_dispatch:
@@ -41,6 +55,11 @@ env:
PR_NUMBER: ${{ github.event.pull_request.number }}
SHA1: ${{ github.event.pull_request.head.sha || github.sha }}
PYTORCH_RETRY_TEST_CASES: 1
{%- if enable_xla_test == 1 %}
# This is used for XLA tests only
XLA_CUDA: 0
XLA_IMAGE_TAG: v0.2
{%- endif %}
{%- if build_with_debug %}
DEBUG: 1
{%- endif %}
@@ -51,22 +70,32 @@ jobs:
build:
runs-on: linux.2xlarge
timeout-minutes: !{{ common.timeout_minutes }}
if: !{{ ciflow_config.root_job_condition }}
env:
JOB_BASE_NAME: !{{ build_environment }}-build
IS_PROBOT_TRIGGER_EVENT: ${{ (github.event.action == '!{{ ciflow_config.trigger_action }}') && (github.event.assignee.login == '!{{ ciflow_config.trigger_actor }}') }}
LABEL_CONDITIONS: ${{ !{{ ciflow_config.label_conditions }} }}
outputs:
docker_image: ${{ steps.calculate-tag.outputs.docker_image }}
steps:
- name: print labels
run: echo "${PR_LABELS}"
!{{ common.setup_ec2_linux() }}
!{{ common.checkout_pytorch("recursive") }}
!{{ common.checkout() }}
{%- if enable_xla_test == 1 %}
- name: Calculate docker image tag
id: calculate-tag
run: |
echo "XLA workflow uses pre-built test image at ${XLA_IMAGE_TAG}"
DOCKER_TAG=$(git rev-parse HEAD:.circleci/docker)
echo "DOCKER_TAG=${DOCKER_TAG}" >> "${GITHUB_ENV}"
echo "DOCKER_IMAGE=${DOCKER_IMAGE_BASE}:${XLA_IMAGE_TAG}" >> "${GITHUB_ENV}"
echo "::set-output name=docker_tag::${DOCKER_TAG}"
echo "::set-output name=docker_image::${DOCKER_IMAGE_BASE}:${XLA_IMAGE_TAG}"
{%- else %}
!{{ common.calculate_docker_image(false) }}
{%- endif %}
- name: Pull Docker image
run: |
!{{ common.pull_docker("${DOCKER_IMAGE}") }}
!{{ common.add_retry_to_env() }}
retry docker pull "${DOCKER_IMAGE}"
!{{ common.parse_ref() }}
- name: Build
env:
@@ -84,6 +113,9 @@ jobs:
-e BRANCH \
-e GITHUB_RUN_ID \
-e SCCACHE_BUCKET \
{%- if enable_xla_test == 1 %}
-e XLA_CUDA \
{%- endif %}
-e XLA_CLANG_CACHE_S3_BUCKET_NAME \
-e CUSTOM_TEST_ARTIFACT_BUILD_DIR \
-e SKIP_SCCACHE_INITIALIZATION=1 \
@@ -108,7 +140,7 @@ jobs:
SCRIBE_GRAPHQL_ACCESS_TOKEN: ${{ secrets.SCRIBE_GRAPHQL_ACCESS_TOKEN }}
BRANCH: ${{ steps.parse-ref.outputs.branch }}
TAG: ${{ steps.parse-ref.outputs.tag }}
WORKFLOW_ID: '${{ github.run_id }}_${{ github.run_number }}'
WORKFLOW_ID: '${{ github.run_id }}'
run: |
COMMIT_TIME=$(git log --max-count=1 --format=%ct || echo 0)
export COMMIT_TIME
@@ -162,8 +194,8 @@ jobs:
ENABLE_XLA_TEST: !{{ enable_xla_test }}
ENABLE_NOARCH_TEST: !{{ enable_noarch_test }}
NUM_TEST_SHARDS: !{{ num_test_shards }}
MULTIGPU_RUNNER_TYPE: linux.16xlarge.nvidia.gpu
DISTRIBUTED_GPU_RUNNER_TYPE: linux.8xlarge.nvidia.gpu
MULTIGPU_RUNNER_TYPE: !{{ multigpu_runner_type }}
DISTRIBUTED_GPU_RUNNER_TYPE: !{{ distributed_gpu_runner_type }}
NOGPU_RUNNER_TYPE: linux.2xlarge
PR_BODY: ${{ github.event.pull_request.body }}
outputs:
@@ -196,16 +228,28 @@ jobs:
NUM_TEST_SHARDS: ${{ matrix.num_shards }}
PYTORCH_IGNORE_DISABLED_ISSUES: ${{ needs.generate-test-matrix.outputs.ignore-disabled-issues }}
steps:
{%- if 'rocm' in test_runner_type %}
!{{ common.setup_rocm_linux() }}
{%- else %}
!{{ common.setup_ec2_linux() }}
!{{ common.checkout_pytorch("recursive") }}
{%- endif %}
!{{ common.checkout() }}
- name: Pull Docker image
run: |
!{{ common.pull_docker("${DOCKER_IMAGE}") }}
!{{ common.add_retry_to_env() }}
retry docker pull "${DOCKER_IMAGE}"
{%- if 'rocm' in test_runner_type %}
- name: ROCm set GPU_FLAG
if: ${{ contains(env.BUILD_ENVIRONMENT, 'rocm') && !contains(matrix.config, 'nogpu') }}
run: |
echo "GPU_FLAG=--device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon" >> "${GITHUB_ENV}"
{%- else %}
- name: Install nvidia driver, nvidia-docker runtime, set GPU_FLAG
if: ${{ contains(env.BUILD_ENVIRONMENT, 'cuda') && !contains(matrix.config, 'nogpu') }}
run: |
bash .github/scripts/install_nvidia_utils_linux.sh
echo "GPU_FLAG=--gpus all" >> "${GITHUB_ENV}"
{%- endif %}
- name: Determine shm-size
run: |
shm_size="1g"
@@ -227,7 +271,11 @@ jobs:
unzip -o artifacts.zip
- name: Output disk space left
run: |
{%- if 'rocm' in test_runner_type %}
df -H
{%- else %}
sudo df -H
{%- endif %}
!{{ common.parse_ref() }}
- name: Test
env:
@@ -245,6 +293,7 @@ jobs:
else
TEST_COMMAND=.jenkins/pytorch/test.sh
fi
{%- if 'rocm' not in test_runner_type %}
PROXY_ENV=
# NOTE: XLA multiprocessing tests appear to have issues with squid proxy, going to disable for now
# We should investigate whether or not there's a list of hostnames we can add to no_proxy to
@@ -253,6 +302,7 @@ jobs:
# shellcheck disable=SC2089
PROXY_ENV="-e http_proxy=!{{ common.squid_proxy }} -e https_proxy=!{{ common.squid_proxy }} -e no_proxy=!{{ common.squid_no_proxy }}"
fi
{%- endif %}
# detached container should get cleaned up by teardown_ec2_linux
# TODO: Stop building test binaries as part of the build phase
# Used for GPU_FLAG since that doesn't play nice
@@ -278,13 +328,20 @@ jobs:
-e PR_LABELS \
-e MAX_JOBS="$(nproc --ignore=2)" \
-e SCCACHE_BUCKET \
{%- if enable_xla_test == 1 %}
-e XLA_CUDA \
{%- endif %}
-e XLA_CLANG_CACHE_S3_BUCKET_NAME \
{%- if 'rocm' not in test_runner_type %}
${PROXY_ENV} \
{%- endif %}
--env-file="/tmp/github_env_${GITHUB_RUN_ID}" \
--ulimit stack=10485760:83886080 \
--security-opt seccomp=unconfined \
--cap-add=SYS_PTRACE \
{%- if 'rocm' not in test_runner_type %}
--ipc=host \
{%- endif %}
--shm-size="${SHM_SIZE}" \
--tty \
--detach \
@@ -294,17 +351,35 @@ jobs:
-w /var/lib/jenkins/workspace \
"${DOCKER_IMAGE}"
)
{%- if 'rocm' in test_runner_type %}
# jenkins user does not have write permission to mounted workspace; work-around by copying within container to jenkins home
docker exec -t "${container_name}" sh -c "cd .. && cp -R workspace pytorch && cd pytorch && pip install dist/*.whl && ${TEST_COMMAND}"
# copy test results back to the mounted workspace, needed sudo, resulting permissions were correct
docker exec -t "${container_name}" sh -c "cd ../pytorch && sudo cp -R test/test-reports ../workspace/test"
{%- else %}
docker exec -t "${container_name}" sh -c "sudo chown -R jenkins . && pip install dist/*.whl && ${TEST_COMMAND}"
{%- endif %}
{%- if 'rocm' not in test_runner_type %}
- name: Chown workspace
if: always()
run: |
# Ensure the working directory gets chowned back to the current user
docker run --rm -v "$(pwd)":/v -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
{%- endif %}
!{{ common.render_test_results() }}
{%- if 'rocm' in test_runner_type %}
!{{ common.upload_downloaded_files(name='linux', use_s3=False) }}
!{{ common.upload_test_reports(name='linux', artifact_name="test-reports", use_s3=False) }}
{%- else %}
!{{ common.upload_downloaded_files(name='linux') }}
!{{ common.upload_test_reports(name='linux') }}
{%- endif %}
!{{ common.upload_test_statistics(build_environment) }}
{%- if 'rocm' in test_runner_type %}
!{{ common.teardown_rocm_linux() }}
{%- else %}
!{{ common.teardown_ec2_linux() }}
{%- endif %}
{% endblock %}
{%- endif -%}
{%- if enable_doc_jobs %}
@@ -318,13 +393,14 @@ jobs:
env:
DOCKER_IMAGE: ${{ needs.build.outputs.docker_image }}
DOCS_TYPE: ${{ matrix.docs_type }}
WITH_PUSH: ${{ github.event_name == 'schedule' }}
WITH_PUSH: ${{ github.event_name == 'schedule' || startsWith(github.event.ref, 'refs/tags/v') }}
steps:
!{{ common.setup_ec2_linux() }}
!{{ common.checkout_pytorch("recursive") }}
!{{ common.checkout() }}
- name: Pull Docker image
run: |
!{{ common.pull_docker("${DOCKER_IMAGE}") }}
!{{ common.add_retry_to_env() }}
retry docker pull "${DOCKER_IMAGE}"
- uses: seemethere/download-artifact-s3@0504774707cbc8603d7dca922e8026eb8bf3b47b
name: Download PyTorch Build Artifacts
with:
@@ -334,7 +410,7 @@ jobs:
unzip -o artifacts.zip
{%- if is_scheduled %}
- name: Generate netrc (only for docs-push)
if: ${{ github.event_name == 'schedule' }}
if: ${{ github.event_name == 'schedule' || startsWith(github.event.ref, 'refs/tags/v') }}
env:
GITHUB_PYTORCHBOT_TOKEN: ${{ secrets.GH_PYTORCHBOT_TOKEN }}
run: |
@@ -347,9 +423,12 @@ jobs:
run: |
set -ex
time docker pull "${DOCKER_IMAGE}" > /dev/null
echo "${GITHUB_REF}"
# TODO: Set it correctly when workflows are scheduled on tags
target="master"
# Convert refs/tags/v1.12.0rc3 into 1.12
if [[ "${GITHUB_REF}" =~ ^refs/tags/v([0-9]+\.[0-9]+)\.* ]]; then
target="${BASH_REMATCH[1]}"
else
target="master"
fi
# detached container should get cleaned up by teardown_ec2_linux
container_name=$(docker run \
-e BUILD_ENVIRONMENT \

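The docs-push step above derives a docs branch from the pushed ref ("Convert refs/tags/v1.12.0rc3 into 1.12"). That conversion can be exercised on its own; the function name here is ours and the regex is a close equivalent of the one in the step:

```shell
# Standalone sketch of the docs-target derivation above: map a release
# tag ref to its MAJOR.MINOR docs branch, defaulting to master for any
# other ref.
ref_to_docs_target () {
  local ref="$1" target="master"
  if [[ "${ref}" =~ ^refs/tags/v([0-9]+\.[0-9]+)\. ]]; then
    target="${BASH_REMATCH[1]}"
  fi
  echo "${target}"
}

ref_to_docs_target "refs/tags/v1.12.0-rc3"   # -> 1.12
ref_to_docs_target "refs/heads/master"       # -> master
```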
View File

@@ -7,8 +7,9 @@ name: !{{ build_environment }}
{%- endblock %}
on:
{%- if is_default -%}
pull_request:
types: [opened, synchronize, reopened, !{{ ciflow_config.trigger_action }}]
{%- endif -%}
{%- if is_scheduled %}
schedule:
@@ -18,8 +19,15 @@ on:
branches:
- master
- release/*
- fbsync
{%- endif %}
{%- for label in ciflow_config.labels | sort %}
{%- if loop.first %}
tags:
{%- endif %}
{%- if label != "ciflow/default" %}
- '!{{ label }}/*'
{%- endif %}
{%- endfor %}
workflow_dispatch:
# For setup-miniconda, see https://github.com/conda-incubator/setup-miniconda/issues/179
@@ -43,14 +51,11 @@ jobs:
# For sccache access (only on non-forked PRs)
AWS_ACCESS_KEY_ID: ${{ secrets.MACOS_SCCACHE_S3_ACCESS_KEY_ID }}
AWS_SECRET_ACCESS_KEY: ${{ secrets.MACOS_SCCACHE_S3_SECRET_ACCESS_KEY }}
IS_PROBOT_TRIGGER_EVENT: ${{ (github.event.action == '!{{ ciflow_config.trigger_action }}') && (github.event.assignee.login == '!{{ ciflow_config.trigger_actor }}') }}
LABEL_CONDITIONS: ${{ !{{ ciflow_config.label_conditions }} }}
PR_LABELS: ${{ toJson(github.event.pull_request.labels.*.name) }}
if: !{{ ciflow_config.root_job_condition }}
steps:
- name: print labels
run: echo "${PR_LABELS}"
!{{ common.checkout_pytorch("recursive") }}
!{{ common.checkout() }}
!{{ common.setup_miniconda("3.8") }}
- name: Install macOS homebrew dependencies
run: |
@@ -120,7 +125,7 @@ jobs:
NUM_TEST_SHARDS: ${{ matrix.num_shards }}
PYTORCH_IGNORE_DISABLED_ISSUES: ${{ needs.generate-test-matrix.outputs.ignore-disabled-issues }}
steps:
!{{ common.checkout_pytorch("false") }}
!{{ common.checkout(submodules="false") }}
- uses: actions/download-artifact@v2
name: Download PyTorch Build Artifacts from GHA
with:

View File

@@ -0,0 +1,205 @@
{% import 'common.yml.j2' as common %}
{%- block name -%}
# Template is at: .github/templates/windows_binary_build_workflow.yml.j2
# Generation script: .github/scripts/generate_ci_workflows.py
name: !{{ build_environment }}
{%- endblock %}
{%- macro binary_env(config) -%}
env:
PYTORCH_ROOT: ${{ github.workspace }}/pytorch
BUILDER_ROOT: ${{ github.workspace }}/builder
PACKAGE_TYPE: !{{ config["package_type"] }}
# TODO: This is a legacy variable that we eventually want to get rid of in
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: !{{ config["desired_cuda"] }}
{%- if config["gpu_arch_version"] %}
GPU_ARCH_VERSION: !{{ config["gpu_arch_version"] }}
{%- endif %}
GPU_ARCH_TYPE: !{{ config["gpu_arch_type"] }}
SKIP_ALL_TESTS: 1
{%- if config["package_type"] == "libtorch" %}
{%- if config["libtorch_config"] %}
LIBTORCH_CONFIG: !{{ config["libtorch_config"] }}
{%- endif %}
LIBTORCH_VARIANT: !{{ config["libtorch_variant"] }}
{%- if config["devtoolset"] %}
DESIRED_DEVTOOLSET: !{{ config["devtoolset"] }}
{%- endif %}
# This is a dummy value for libtorch to work correctly with our batch scripts
# without this value pip does not get installed for some reason
DESIRED_PYTHON: "3.7"
{%- else %}
DESIRED_PYTHON: "!{{ config["python_version"] }}"
{%- endif %}
{%- endmacro %}
{%- macro set_runner_specific_vars() -%}
# NOTE: These environment variables are put here so that they can be applied on every job equally
# They are also here because setting them at a workflow level doesn't give us access to the
# runner.temp variable, which we need.
- name: Populate binary env
shell: bash
run: |
echo "BINARY_ENV_FILE=${RUNNER_TEMP}/env" >> "${GITHUB_ENV}"
echo "PYTORCH_FINAL_PACKAGE_DIR=${RUNNER_TEMP}/artifacts" >> "${GITHUB_ENV}"
echo "WIN_PACKAGE_WORK_DIR=${RUNNER_TEMP}" >> "${GITHUB_ENV}"
{%- endmacro %}
on:
push:
# NOTE: Meta Employees can trigger new nightlies using: https://fburl.com/trigger_pytorch_nightly_build
branches:
- nightly
tags:
# NOTE: Binary build pipelines should only get triggered on release candidate builds
# Release candidate tags look like: v1.11.0-rc1
- v[0-9]+.[0-9]+.[0-9]+-rc[0-9]+
{%- for label in ciflow_config.labels | sort %}
{%- if label != "ciflow/default" %}
- '!{{ label }}/*'
{%- endif %}
{%- endfor %}
workflow_dispatch:
env:
# Needed for conda builds
ALPINE_IMAGE: "308535385114.dkr.ecr.us-east-1.amazonaws.com/tool/alpine"
ANACONDA_USER: pytorch
AWS_DEFAULT_REGION: us-east-1
BUILD_ENVIRONMENT: !{{ build_environment }}
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
IN_CI: 1
IS_GHA: 1
PR_LABELS: ${{ toJson(github.event.pull_request.labels.*.name) }}
PR_NUMBER: ${{ github.event.pull_request.number }}
PYTORCH_RETRY_TEST_CASES: 1
SHA1: ${{ github.event.pull_request.head.sha || github.sha }}
SKIP_ALL_TESTS: 1
!{{ common.concurrency(build_environment) }}
jobs:
{%- for config in build_configs %}
!{{ config["build_name"] }}-build:
runs-on: windows.4xlarge
timeout-minutes: !{{ common.timeout_minutes }}
!{{ binary_env(config) }}
steps:
!{{ common.setup_ec2_windows() }}
!{{ set_runner_specific_vars() }}
- name: Clone pytorch/pytorch
uses: actions/checkout@v2
with:
path: ${{ env.PYTORCH_ROOT }}
submodules: recursive
- name: Clone pytorch/builder
uses: actions/checkout@v2
with:
repository: pytorch/builder
path: ${{ env.BUILDER_ROOT }}
ref: release/1.11
- name: Populate binary env
shell: bash
run: |
"${PYTORCH_ROOT}/.circleci/scripts/binary_populate_env.sh"
- name: Build PyTorch binary
shell: bash
run: |
"${PYTORCH_ROOT}/.circleci/scripts/binary_windows_build.sh"
- uses: !{{ common.upload_artifact_s3_action }}
if: always()
with:
name: !{{ config["build_name"] }}
retention-days: 14
if-no-files-found: error
path: "${{ env.PYTORCH_FINAL_PACKAGE_DIR }}"
!{{ common.wait_and_kill_ssh_windows('pytorch') }}
!{{ config["build_name"] }}-test: # Testing
if: ${{ github.repository_owner == 'pytorch' }}
needs: !{{ config["build_name"] }}-build
{%- if config["gpu_arch_type"] == "cuda" %}
runs-on: windows.8xlarge.nvidia.gpu
{%- else %}
runs-on: windows.4xlarge
{%- endif %}
timeout-minutes: !{{ common.timeout_minutes }}
!{{ binary_env(config) }}
steps:
!{{ common.setup_ec2_windows() }}
!{{ set_runner_specific_vars() }}
- uses: seemethere/download-artifact-s3@0504774707cbc8603d7dca922e8026eb8bf3b47b
name: Download Build Artifacts
with:
name: !{{ config["build_name"] }}
path: "${{ env.PYTORCH_FINAL_PACKAGE_DIR }}"
- name: Clone pytorch/pytorch
uses: actions/checkout@v2
with:
path: ${{ env.PYTORCH_ROOT }}
submodules: recursive
- name: Clone pytorch/builder
uses: actions/checkout@v2
with:
repository: pytorch/builder
path: ${{ env.BUILDER_ROOT }}
ref: release/1.11
- name: Populate binary env
shell: bash
run: |
"${PYTORCH_ROOT}/.circleci/scripts/binary_populate_env.sh"
- name: Test PyTorch binary
shell: bash
run: |
"${PYTORCH_ROOT}/.circleci/scripts/binary_windows_test.sh"
!{{ common.wait_and_kill_ssh_windows('pytorch') }}
!{{ config["build_name"] }}-upload: # Uploading
runs-on: linux.2xlarge # self hosted runner to download ec2 artifacts
if: ${{ github.repository_owner == 'pytorch' }}
needs: !{{ config["build_name"] }}-test
!{{ binary_env(config) }}
steps:
!{{ common.setup_ec2_linux() }}
- name: Clone pytorch/pytorch
uses: actions/checkout@v2
- uses: seemethere/download-artifact-s3@0504774707cbc8603d7dca922e8026eb8bf3b47b
name: Download Build Artifacts
with:
name: !{{ config["build_name"] }}
path: "${{ runner.temp }}/artifacts/"
- name: Set DRY_RUN (only for tagged pushes)
if: ${{ github.event_name == 'push' && startsWith(github.event.ref, 'refs/tags/')}}
run: |
echo "DRY_RUN=disabled" >> "$GITHUB_ENV"
- name: Set UPLOAD_CHANNEL (only for tagged pushes)
if: ${{ github.event_name == 'push' && startsWith(github.event.ref, 'refs/tags/')}}
run: |
# reference ends with an RC suffix
if [[ ${GITHUB_REF_NAME} = *-rc[0-9]* ]]; then
echo "UPLOAD_CHANNEL=test" >> "$GITHUB_ENV"
fi
- name: Upload binaries
env:
PKG_DIR: "${{ runner.temp }}/artifacts"
UPLOAD_SUBFOLDER: "${{ env.DESIRED_CUDA }}"
# When running these on pull_request events these should be blank
AWS_ACCESS_KEY_ID: ${{ secrets.AWS_PYTORCH_ACCESS_KEY_ID }}
AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_PYTORCH_SECRET_KEY }}
ANACONDA_API_TOKEN: ${{ secrets.CONDA_PYTORCHBOT_TOKEN }}
run: |
docker run --rm -i \
-e ANACONDA_API_TOKEN \
-e AWS_ACCESS_KEY_ID \
-e AWS_SECRET_ACCESS_KEY \
-e DRY_RUN \
-e PACKAGE_TYPE \
-e PKG_DIR=/artifacts \
-e UPLOAD_CHANNEL \
-e UPLOAD_SUBFOLDER \
-v "${RUNNER_TEMP}/artifacts:/artifacts" \
-v "${GITHUB_WORKSPACE}:/v" \
-w /v \
308535385114.dkr.ecr.us-east-1.amazonaws.com/tool/miniconda3:4.10.3 \
bash -c '.circleci/scripts/binary_upload.sh'
!{{ common.teardown_ec2_linux() }}
{%- endfor %}
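The `Populate binary env` step above relies on the GitHub Actions convention that every `KEY=VALUE` line appended to the file named by `$GITHUB_ENV` is exported for all subsequent steps. A rough local simulation (the `set -a`/source pair only approximates what the runner does):

```shell
# Simulate the GITHUB_ENV mechanism used by "Populate binary env".
GITHUB_ENV=$(mktemp)
RUNNER_TEMP=/tmp/runner   # stand-in for the real runner temp directory

echo "BINARY_ENV_FILE=${RUNNER_TEMP}/env" >> "${GITHUB_ENV}"
echo "PYTORCH_FINAL_PACKAGE_DIR=${RUNNER_TEMP}/artifacts" >> "${GITHUB_ENV}"

# Roughly what the runner does between steps: export each KEY=VALUE line.
set -a
. "${GITHUB_ENV}"
set +a
echo "${PYTORCH_FINAL_PACKAGE_DIR}"   # -> /tmp/runner/artifacts
```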

View File

@@ -19,17 +19,25 @@
name: !{{ build_environment }}
on:
{%- if is_default %}
pull_request:
types: [opened, synchronize, reopened, !{{ ciflow_config.trigger_action }}]
{%- if is_scheduled %}
schedule:
- cron: !{{ is_scheduled }}
{%- else %}
{%- endif %}
push:
{%- for label in ciflow_config.labels | sort %}
{%- if loop.first %}
tags:
{%- endif %}
{%- if label != "ciflow/default" %}
- '!{{ label }}/*'
{%- endif %}
{%- endfor %}
{%- if not is_scheduled %}
branches:
- master
- release/*
- fbsync
{%- else %}
schedule:
- cron: !{{ is_scheduled }}
{%- endif %}
workflow_dispatch:
@@ -72,9 +80,6 @@ jobs:
JOB_BASE_NAME: !{{ build_environment }}-build
http_proxy: "!{{ common.squid_proxy }}"
https_proxy: "!{{ common.squid_proxy }}"
IS_PROBOT_TRIGGER_EVENT: ${{ (github.event.action == '!{{ ciflow_config.trigger_action }}') && (github.event.assignee.login == '!{{ ciflow_config.trigger_actor }}') }}
LABEL_CONDITIONS: ${{ !{{ ciflow_config.label_conditions }} }}
if: !{{ ciflow_config.root_job_condition }}
steps:
- name: print labels
run: echo "${PR_LABELS}"
@@ -82,7 +87,7 @@ jobs:
uses: seemethere/add-github-ssh-key@v1
with:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
!{{ common.checkout_pytorch("recursive") }}
!{{ common.checkout() }}
!{{ common.display_ec2_information() }}
- name: Install Visual Studio 2019 toolchain
shell: powershell
@@ -114,7 +119,7 @@ jobs:
if-no-files-found: error
name: ${{ env.BUILD_ENVIRONMENT }}
path: C:\${{ github.run_id }}\build-results
!{{ wait_and_kill_ssh() }}
!{{ common.wait_and_kill_ssh_windows() }}
- name: Cleanup build-results and workspaces
if: always()
shell: bash
@@ -173,7 +178,7 @@ jobs:
uses: seemethere/add-github-ssh-key@v1
with:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
!{{ common.checkout_pytorch("recursive") }}
!{{ common.checkout() }}
- name: Install Visual Studio 2019 toolchain
shell: powershell
run: |
@@ -215,7 +220,7 @@ jobs:
!{{ common.upload_downloaded_files(name='windows') }}
!{{ common.upload_test_reports(name='windows') }}
!{{ common.render_test_results() }}
!{{ wait_and_kill_ssh() }}
!{{ common.wait_and_kill_ssh_windows() }}
!{{ common.parse_ref() }}
!{{ common.upload_test_statistics(build_environment) }}
- name: Cleanup workspace

View File

@@ -1,113 +0,0 @@
name: Build Linux Conda Packages
on:
# TODO: These are only runnable from workflow_dispatch, we need to eventually add
# a cron
# TODO: Add an on_release trigger to build on tags
workflow_dispatch:
jobs:
generate-build-matrix:
if: ${{ github.repository_owner == 'pytorch' }}
runs-on: ubuntu-18.04
outputs:
matrix: ${{ steps.set-matrix.outputs.matrix }}
container:
image: python:3.9
steps:
- name: Clone pytorch/pytorch
uses: zhouzhuojie/checkout@05b13c9a0d21f08f6d5e64a1d5042246d13619d9
- name: Generating build matrix
id: set-matrix
run: |
# outputting for debugging purposes
MATRIX=$(python .github/scripts/generate_binary_build_matrix.py conda)
echo "${MATRIX}"
echo "::set-output name=matrix::${MATRIX}"
build-conda:
if: ${{ github.repository_owner == 'pytorch' }}
needs: generate-build-matrix
runs-on: linux.2xlarge
strategy:
matrix: ${{ fromJson(needs.generate-build-matrix.outputs.matrix) }}
fail-fast: false
container:
image: ${{ matrix.container_image }}
env:
DESIRED_PYTHON: ${{ matrix.python_version }}
# TODO: This is a legacy variable that we eventually want to get rid of in
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: ${{ matrix.gpu_arch_version }}
GPU_ARCH_VERSION: ${{ matrix.GPU_ARCH_VERSION }}
GPU_ARCH_TYPE: ${{ matrix.gpu_arch_type }}
NO_BUILD_SUFFIX: true
# TODO: This is a legacy variable, we should just default all build to use
# this folder within the conda/build_pytorch.sh script
TORCH_CONDA_BUILD_FOLDER: pytorch-nightly
# TODO: Another legacy env variable that isn't useful anymore, should default
# to pytorch within the scripts directly
ANACONDA_USER: pytorch
PYTORCH_FINAL_PACKAGE_DIR: /remote
# We specify the CONDA_BLD_PATH here since conda creates extremely long paths
# for its default build path
CONDA_BLD_PATH: /build
PYTORCH_BUILD_NUMBER: 1
SKIP_ALL_TESTS: 1
steps:
- name: Clean runner workspace
run: rm -rf "$GITHUB_WORKSPACE"
- name: Clone pytorch/pytorch
uses: zhouzhuojie/checkout@05b13c9a0d21f08f6d5e64a1d5042246d13619d9
with:
path: pytorch
submodules: recursive
- name: Clone pytorch/builder
uses: zhouzhuojie/checkout@05b13c9a0d21f08f6d5e64a1d5042246d13619d9
with:
repository: pytorch/builder
path: builder
- name: Generate version string
working-directory: pytorch/
run: |
version=$(.github/scripts/generate_pytorch_version.py)
echo "Generated version: ${version}"
echo "PYTORCH_BUILD_VERSION=${version}" >> "$GITHUB_ENV"
- name: Set BUILD_SPLIT_CUDA
if: ${{ matrix.gpu_arch_type == 'cuda' && matrix.gpu_arch_version == '11.1' }}
run: |
echo "BUILD_SPLIT_CUDA=1" >> "$GITHUB_ENV"
# TODO: Remove this once we remove the need for the directories to be
# in specific locations
- name: Symlink repositories to root directory (for legacy scripts purposes)
run: |
mv "$PWD"/pytorch /pytorch
mv "$PWD"/builder /builder
# TODO: Bundle the correct build script in the base container image so
# that we don't have to do this type of specification
- name: Build PyTorch binary
run: |
/builder/conda/build_pytorch.sh
- uses: actions/upload-artifact@v2
with:
name: pytorch-conda-py${{ matrix.python_version }}-${{matrix.gpu_arch_type}}-${{ matrix.gpu_arch_version }}
path: /remote/**/*.bz2
- name: Parse ref
id: parse-ref
run: .github/scripts/parse_ref.py
- name: Display and upload binary build size statistics (Click Me)
env:
SCRIBE_GRAPHQL_ACCESS_TOKEN: ${{ secrets.SCRIBE_GRAPHQL_ACCESS_TOKEN }}
BRANCH: ${{ steps.parse-ref.outputs.branch }}
PR_NUMBER: ${{ github.event.pull_request.number }}
SHA1: ${{ github.event.pull_request.head.sha || github.sha }}
TAG: ${{ steps.parse-ref.outputs.tag }}
WORKFLOW_ID: '${{ github.run_id }}_${{ github.run_number }}'
run: |
COMMIT_TIME=$(git log --max-count=1 --format=%ct || echo 0)
export COMMIT_TIME
pip3 install requests==2.26
python3 -m tools.stats.upload_binary_size_to_scuba || exit 0
concurrency:
group: build-linux-conda-${{ github.event.pull_request.number || github.sha }}-${{ github.event_name == 'workflow_dispatch' }}
cancel-in-progress: true
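The `generate-build-matrix` job above hands its result to `build-conda` by printing a JSON matrix and emitting a `::set-output` workflow command, which the dependent job reads back through `fromJson(needs.generate-build-matrix.outputs.matrix)`. A sketch of that hand-off (the JSON below is an illustrative stand-in, not the script's real output):

```shell
# Sketch of the matrix hand-off: print the JSON for debugging, then
# expose it as a step output via the (now-deprecated) ::set-output
# workflow command.
MATRIX='{"python_version":["3.7","3.8","3.9"]}'
echo "${MATRIX}"
echo "::set-output name=matrix::${MATRIX}"
```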

View File

@@ -1,112 +0,0 @@
name: Build Linux libtorch
on:
# TODO: These are only runnable from workflow_dispatch, we need to eventually add
# a cron
# TODO: Add an on_release trigger to build on tags
workflow_dispatch:
jobs:
generate-build-matrix:
if: ${{ github.repository_owner == 'pytorch' }}
runs-on: ubuntu-18.04
outputs:
matrix: ${{ steps.set-matrix.outputs.matrix }}
container:
image: python:3.9
steps:
- name: Clone pytorch/pytorch
uses: zhouzhuojie/checkout@05b13c9a0d21f08f6d5e64a1d5042246d13619d9
- name: Generating build matrix
id: set-matrix
run: |
# outputting for debugging purposes
MATRIX=$(python .github/scripts/generate_binary_build_matrix.py libtorch)
echo "${MATRIX}"
echo "::set-output name=matrix::${MATRIX}"
build-libtorch:
if: ${{ github.repository_owner == 'pytorch' }}
needs: generate-build-matrix
runs-on: linux.2xlarge
strategy:
matrix: ${{ fromJson(needs.generate-build-matrix.outputs.matrix) }}
fail-fast: false
container:
image: ${{ matrix.container_image }}
env:
# TODO: remove this var from the libtorch builder script(s)
DESIRED_PYTHON: '3.7'
# TODO: This is a legacy variable that we eventually want to get rid of in
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: ${{ matrix.gpu_arch_version }}
GPU_ARCH_VERSION: ${{ matrix.GPU_ARCH_VERSION }}
GPU_ARCH_TYPE: ${{ matrix.gpu_arch_type }}
BUILD_PYTHONLESS: 1
LIBTORCH_VARIANT: ${{ matrix.libtorch_variant }}
# TODO: remove this and bake env var into the Docker image
DESIRED_DEVTOOLSET: ${{ matrix.devtoolset }}
PYTORCH_BUILD_NUMBER: 1
SKIP_ALL_TESTS: 1
steps:
- name: Clean runner workspace
run: rm -rf "$GITHUB_WORKSPACE"
- name: Clone pytorch/pytorch
uses: zhouzhuojie/checkout@05b13c9a0d21f08f6d5e64a1d5042246d13619d9
with:
path: pytorch
submodules: recursive
- name: Clone pytorch/builder
uses: zhouzhuojie/checkout@05b13c9a0d21f08f6d5e64a1d5042246d13619d9
with:
repository: pytorch/builder
path: builder
- name: Generate version string
working-directory: pytorch/
run: |
version=$(.github/scripts/generate_pytorch_version.py)
echo "Generated version: ${version}"
echo "PYTORCH_BUILD_VERSION=${version}" >> "$GITHUB_ENV"
- name: Set BUILD_SPLIT_CUDA
if: ${{ matrix.gpu_arch_type == 'cuda' && matrix.gpu_arch_version == '11.1' }}
run: |
echo "BUILD_SPLIT_CUDA=1" >> "$GITHUB_ENV"
# TODO: Remove this once we remove the need for the directories to be
# in specific locations
- name: Symlink repositories to root directory (for legacy scripts purposes)
run: |
ln -s "$PWD"/pytorch /pytorch
ln -s "$PWD"/builder /builder
# TODO: Bundle the correct build script in the base container image so
# that we don't have to do this type of specification
- name: Build PyTorch binary (CUDA specific)
if: ${{ matrix.gpu_arch_type == 'cuda' }}
run: |
/builder/manywheel/build.sh
- name: Build PyTorch binary (CPU specific)
if: ${{ matrix.gpu_arch_type == 'cpu' }}
run: |
/builder/manywheel/build_cpu.sh
- uses: actions/upload-artifact@v2
with:
name: pytorch-libtorch-${{ matrix.libtorch_variant }}-${{ matrix.devtoolset }}-${{matrix.gpu_arch_type}}-${{ matrix.gpu_arch_version }}
path: /remote/**/*.zip
- name: Parse ref
id: parse-ref
run: .github/scripts/parse_ref.py
- name: Display and upload binary build size statistics (Click Me)
env:
SCRIBE_GRAPHQL_ACCESS_TOKEN: ${{ secrets.SCRIBE_GRAPHQL_ACCESS_TOKEN }}
BRANCH: ${{ steps.parse-ref.outputs.branch }}
PR_NUMBER: ${{ github.event.pull_request.number }}
SHA1: ${{ github.event.pull_request.head.sha || github.sha }}
TAG: ${{ steps.parse-ref.outputs.tag }}
WORKFLOW_ID: '${{ github.run_id }}_${{ github.run_number }}'
run: |
COMMIT_TIME=$(git log --max-count=1 --format=%ct || echo 0)
export COMMIT_TIME
pip3 install requests==2.26
python3 -m tools.stats.upload_binary_size_to_scuba || exit 0
concurrency:
group: build-linux-libtorch-${{ github.event.pull_request.number || github.sha }}-${{ github.event_name == 'workflow_dispatch' }}
cancel-in-progress: true
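The `concurrency` group above keys on `github.event.pull_request.number || github.sha`: the PR number when the event has one, otherwise the commit SHA, so PRs and direct pushes land in distinct cancellation groups. A sketch of that fallback (function name is ours):

```shell
# Mimic GitHub's "pr_number || sha" fallback with shell default
# expansion: an empty PR number falls through to the SHA.
concurrency_group () {
  local prefix="$1" pr_number="$2" sha="$3" is_dispatch="$4"
  echo "${prefix}-${pr_number:-${sha}}-${is_dispatch}"
}

concurrency_group "build-linux-libtorch" "" "deadbeef" "false"
# -> build-linux-libtorch-deadbeef-false
concurrency_group "build-linux-libtorch" "71746" "deadbeef" "false"
# -> build-linux-libtorch-71746-false
```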

View File

@@ -1,111 +0,0 @@
name: Build Linux Wheels
on:
# TODO: These are only runnable from workflow_dispatch, we need to eventually add
# a cron
# TODO: Add an on_release trigger to build on tags
workflow_dispatch:
jobs:
generate-build-matrix:
if: ${{ github.repository_owner == 'pytorch' }}
runs-on: ubuntu-18.04
outputs:
matrix: ${{ steps.set-matrix.outputs.matrix }}
container:
image: python:3.9
steps:
- name: Clone pytorch/pytorch
uses: zhouzhuojie/checkout@05b13c9a0d21f08f6d5e64a1d5042246d13619d9
- name: Generating build matrix
id: set-matrix
run: |
# outputting for debugging purposes
MATRIX=$(python .github/scripts/generate_binary_build_matrix.py wheels)
echo "${MATRIX}"
echo "::set-output name=matrix::${MATRIX}"
build-wheel:
if: ${{ github.repository_owner == 'pytorch' }}
needs: generate-build-matrix
runs-on: linux.2xlarge
strategy:
matrix: ${{ fromJson(needs.generate-build-matrix.outputs.matrix) }}
fail-fast: false
container:
image: ${{ matrix.container_image }}
env:
DESIRED_PYTHON: ${{ matrix.python_version }}
# TODO: This is a legacy variable that we eventually want to get rid of in
# favor of GPU_ARCH_VERSION
DESIRED_CUDA: ${{ matrix.gpu_arch_version }}
GPU_ARCH_VERSION: ${{ matrix.GPU_ARCH_VERSION }}
GPU_ARCH_TYPE: ${{ matrix.gpu_arch_type }}
PYTORCH_BUILD_NUMBER: 1
SKIP_ALL_TESTS: 1
steps:
- name: Clean runner workspace
run: rm -rf "$GITHUB_WORKSPACE"
- name: Clone pytorch/pytorch
uses: zhouzhuojie/checkout@05b13c9a0d21f08f6d5e64a1d5042246d13619d9
with:
path: pytorch
submodules: recursive
- name: Clone pytorch/builder
uses: zhouzhuojie/checkout@05b13c9a0d21f08f6d5e64a1d5042246d13619d9
with:
repository: pytorch/builder
path: builder
- name: Generate version string
working-directory: pytorch/
run: |
version=$(.github/scripts/generate_pytorch_version.py)
echo "Generated version: ${version}"
echo "PYTORCH_BUILD_VERSION=${version}" >> "$GITHUB_ENV"
- name: Set BUILD_SPLIT_CUDA
if: ${{ matrix.gpu_arch_type == 'cuda' && matrix.gpu_arch_version == '11.1' }}
run: |
echo "BUILD_SPLIT_CUDA=1" >> "$GITHUB_ENV"
# TODO: Remove this once we remove the need for the directories to be
# in specific locations
- name: Symlink repositories to root directory (for legacy scripts purposes)
run: |
ln -s "$PWD"/pytorch /pytorch
ln -s "$PWD"/builder /builder
# TODO: Bundle the correct build script in the base container image so
# that we don't have to do this type of specification
- name: Build PyTorch binary (CUDA specific)
if: ${{ matrix.gpu_arch_type == 'cuda' }}
run: |
/builder/manywheel/build.sh
- name: Build PyTorch binary (ROCM specific)
if: ${{ matrix.gpu_arch_type == 'rocm' }}
run: |
/builder/manywheel/build_rocm.sh
- name: Build PyTorch binary (CPU specific)
if: ${{ matrix.gpu_arch_type == 'cpu' }}
run: |
/builder/manywheel/build_cpu.sh
- uses: actions/upload-artifact@v2
with:
name: pytorch-wheel-py${{ matrix.python_version }}-${{matrix.gpu_arch_type}}-${{ matrix.gpu_arch_version }}
path: /remote/**/*.whl
- name: Parse ref
id: parse-ref
run: .github/scripts/parse_ref.py
- name: Display and upload binary build size statistics (Click Me)
env:
SCRIBE_GRAPHQL_ACCESS_TOKEN: ${{ secrets.SCRIBE_GRAPHQL_ACCESS_TOKEN }}
BRANCH: ${{ steps.parse-ref.outputs.branch }}
PR_NUMBER: ${{ github.event.pull_request.number }}
SHA1: ${{ github.event.pull_request.head.sha || github.sha }}
TAG: ${{ steps.parse-ref.outputs.tag }}
WORKFLOW_ID: '${{ github.run_id }}_${{ github.run_number }}'
run: |
COMMIT_TIME=$(git log --max-count=1 --format=%ct || echo 0)
export COMMIT_TIME
pip3 install requests==2.26
python3 -m tools.stats.upload_binary_size_to_scuba || exit 0
concurrency:
group: build-linux-wheels-${{ github.event.pull_request.number || github.sha }}-${{ github.event_name == 'workflow_dispatch' }}
cancel-in-progress: true


@@ -4,13 +4,15 @@
name: caffe2-linux-xenial-py3.7-gcc5.4
on:
pull_request:
types: [opened, synchronize, reopened, unassigned]
push:
tags:
- 'ciflow/all/*'
- 'ciflow/cpu/*'
- 'ciflow/linux/*'
- 'ciflow/trunk/*'
branches:
- master
- release/*
- fbsync
workflow_dispatch:
env:
@@ -41,16 +43,8 @@ jobs:
build:
runs-on: linux.2xlarge
timeout-minutes: 240
if: ${{ (github.repository == 'pytorch/pytorch') && (
(github.event_name == 'push') ||
(github.event_name == 'schedule') ||
(contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/cpu') || contains(github.event.pull_request.labels.*.name, 'ciflow/linux') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk')) ||
(false))
}}
env:
JOB_BASE_NAME: caffe2-linux-xenial-py3.7-gcc5.4-build
IS_PROBOT_TRIGGER_EVENT: ${{ (github.event.action == 'unassigned') && (github.event.assigneed.login == 'pytorchbot') }}
LABEL_CONDITIONS: ${{ contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/cpu') || contains(github.event.pull_request.labels.*.name, 'ciflow/linux') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk') }}
outputs:
docker_image: ${{ steps.calculate-tag.outputs.docker_image }}
steps:
@@ -75,7 +69,10 @@ jobs:
AWS_MAX_ATTEMPTS: 5
run: |
AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\")
aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
retry () {
"$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}
retry aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
--password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com"
- name: Chown workspace
run: |
@@ -87,8 +84,8 @@ jobs:
docker run --pull=never --rm -v "$(pwd)":/v -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
- name: Clean workspace
run: |
rm -rf "${GITHUB_WORKSPACE:?}/*"
rm -f ~/.ssh/authorized_keys
rm -rf "${GITHUB_WORKSPACE}"
mkdir "${GITHUB_WORKSPACE}"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: seemethere/add-github-ssh-key@v1
with:
@@ -202,7 +199,7 @@ jobs:
SCRIBE_GRAPHQL_ACCESS_TOKEN: ${{ secrets.SCRIBE_GRAPHQL_ACCESS_TOKEN }}
BRANCH: ${{ steps.parse-ref.outputs.branch }}
TAG: ${{ steps.parse-ref.outputs.tag }}
WORKFLOW_ID: '${{ github.run_id }}_${{ github.run_number }}'
WORKFLOW_ID: '${{ github.run_id }}'
run: |
COMMIT_TIME=$(git log --max-count=1 --format=%ct || echo 0)
export COMMIT_TIME
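
The hunks above replace a bare `aws ecr get-login-password` call with one wrapped in a small `retry` shell function that re-runs a failed command twice with short backoff. A minimal standalone sketch of that pattern follows; the `flaky` command that fails twice before succeeding is a hypothetical stand-in for a transient failure, not part of the workflow:

```shell
# Sketch of the retry() helper introduced in the workflow hunks above:
# run a command, and on failure retry up to two more times with backoff.
retry () {
  "$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}

# Demo: a hypothetical command that fails on its first two invocations.
# Attempt count is kept in a temp file so it survives the subshells
# that retry() uses for each re-attempt.
attempts_file="$(mktemp)"
flaky () {
  n=$(cat "$attempts_file" 2>/dev/null || echo 0)
  n=$((n + 1))
  echo "$n" > "$attempts_file"
  [ "$n" -ge 3 ]
}

retry flaky && echo "succeeded after $(cat "$attempts_file") attempts"
# prints: succeeded after 3 attempts
```

Because each fallback runs in a subshell, the helper retries at most twice and preserves the final exit status, which is why the workflow can pipe its output straight into `docker login`.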


@@ -11,7 +7,7 @@ on:
- '.circleci/docker/**'
- '.github/workflows/generated-docker-builds.yml'
schedule:
- cron: 1 * */7 * *
- cron: 1 3 * * 3
concurrency:
group: docker-builds-${{ github.event.pull_request.number || github.sha }}-${{ github.event_name == 'workflow_dispatch' }}
cancel-in-progress: true
@@ -28,20 +28,16 @@ jobs:
strategy:
matrix:
include:
- docker_image_base: '308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda10.2-cudnn7-py3.7-clang9'
docker_image_short_name: 'pytorch-linux-bionic-cuda10.2-cudnn7-py3.7-clang9'
- docker_image_base: '308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda10.2-cudnn7-py3.9-gcc7'
docker_image_short_name: 'pytorch-linux-bionic-cuda10.2-cudnn7-py3.9-gcc7'
- docker_image_base: '308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda11.5-cudnn8-py3-gcc7'
docker_image_short_name: 'pytorch-linux-bionic-cuda11.5-cudnn8-py3-gcc7'
- docker_image_base: '308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-py3.7-clang9'
docker_image_short_name: 'pytorch-linux-bionic-py3.7-clang9'
- docker_image_base: '308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-rocm4.1-py3.7'
docker_image_short_name: 'pytorch-linux-bionic-rocm4.1-py3.7'
- docker_image_base: '308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-rocm4.2-py3.7'
docker_image_short_name: 'pytorch-linux-bionic-rocm4.2-py3.7'
- docker_image_base: '308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-rocm4.3.1-py3.7'
docker_image_short_name: 'pytorch-linux-bionic-rocm4.3.1-py3.7'
- docker_image_base: '308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-rocm4.5-py3.7'
docker_image_short_name: 'pytorch-linux-bionic-rocm4.5-py3.7'
- docker_image_base: '308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-xenial-cuda10.2-cudnn7-py3-gcc7'
docker_image_short_name: 'pytorch-linux-xenial-cuda10.2-cudnn7-py3-gcc7'
- docker_image_base: '308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-xenial-cuda11.1-cudnn8-py3-gcc7'
@@ -83,7 +79,10 @@ jobs:
AWS_MAX_ATTEMPTS: 5
run: |
AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\")
aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
retry () {
"$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}
retry aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
--password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com"
- name: Chown workspace
run: |
@@ -95,8 +94,8 @@ jobs:
docker run --pull=never --rm -v "$(pwd)":/v -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
- name: Clean workspace
run: |
rm -rf "${GITHUB_WORKSPACE:?}/*"
rm -f ~/.ssh/authorized_keys
rm -rf "${GITHUB_WORKSPACE}"
mkdir "${GITHUB_WORKSPACE}"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: seemethere/add-github-ssh-key@v1
with:


@@ -4,42 +4,40 @@
name: ios-12-5-1-arm64-coreml
on:
pull_request:
types: [opened, synchronize, reopened, unassigned]
push:
branches:
- master
- release/*
tags:
- 'ciflow/all/*'
- 'ciflow/ios/*'
- 'ciflow/macos/*'
- 'ciflow/trunk/*'
workflow_dispatch:
# For setup-miniconda, see https://github.com/conda-incubator/setup-miniconda/issues/179
defaults:
run:
shell: bash -x -e -l {0}
env:
BUILD_ENVIRONMENT: ios-12-5-1-arm64-coreml
IN_CI: 1
IS_GHA: 1
IOS_PLATFORM: OS
IOS_ARCH: arm64
jobs:
build:
# NOTE: These builds will not run successfully without running on `pytorch/pytorch` due to the limitations
# of accessing secrets from forked pull requests and IOS' dependency on secrets for their build/test
if: ${{ github.event_name == 'push' || github.event.pull_request.head.repo.full_name == github.repository }}
runs-on: macos-10.15
timeout-minutes: 240
env:
JOB_BASE_NAME: ios-12-5-1-arm64-coreml-build
IOS_CERT_KEY_2022: ${{ secrets.IOS_CERT_KEY_2022 }}
IOS_CERT_SECRET: ${{ secrets.IOS_CERT_SECRET }}
IOS_DEV_TEAM_ID: ${{ secrets.IOS_DEV_TEAM_ID }}
IOS_SIGN_KEY_2022: ${{ secrets.IOS_SIGN_KEY_2022 }}
IS_PROBOT_TRIGGER_EVENT: ${{ (github.event.action == 'unassigned') && (github.event.assigneed.login == 'pytorchbot') }}
LABEL_CONDITIONS: ${{ contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/ios') || contains(github.event.pull_request.labels.*.name, 'ciflow/macos') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk') }}
PR_LABELS: ${{ toJson(github.event.pull_request.labels.*.name) }}
if: ${{ (github.repository == 'pytorch/pytorch') && (
(github.event_name == 'push') ||
(github.event_name == 'schedule') ||
(contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/ios') || contains(github.event.pull_request.labels.*.name, 'ciflow/macos') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk')) ||
(false))
}}
steps:
- name: print labels
run: echo "${PR_LABELS}"
@@ -53,19 +51,52 @@ jobs:
run: |
# Remove any artifacts from the previous checkouts
git clean -fxd
- name: Setup miniconda
uses: conda-incubator/setup-miniconda@v2
with:
auto-update-conda: true
python-version: 3.8
activate-environment: build
- name: Install ios / conda Dependencies
- name: Populate CI build options
run: |
# Most builds use the lite interpreter, if certain builds shouldn't
# build the lite interpreter this env variable should get over-written
# in the following case statement
echo "BUILD_LITE_INTERPRETER=1" >> "${GITHUB_ENV}"
case ${BUILD_ENVIRONMENT} in
*metal*)
echo "USE_PYTORCH_METAL=1" >> "${GITHUB_ENV}"
;;
*full_jit*)
echo "BUILD_LITE_INTERPRETER=0" >> "${GITHUB_ENV}"
;;
*custom*)
echo "SELECTED_OP_LIST=${GITHUB_WORKSPACE}/ios/TestApp/custom_build/mobilenetv2.yaml" >> "${GITHUB_ENV}"
;;
*coreml*)
echo "USE_COREML_DELEGATE=1" >> "${GITHUB_ENV}"
;;
esac
- name: Install brew dependencies
run: |
# Install dependencies
brew install libtool
conda install numpy ninja pyyaml mkl mkl-include setuptools cmake cffi requests typing_extensions --yes
- name: Install conda and dependencies
run: |
# Install conda, setup-miniconda messes with the path that messes with the ruby stuff we do later on
curl --retry 3 -o "${RUNNER_TEMP}/conda.sh" https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh
chmod +x "${RUNNER_TEMP}/conda.sh"
/bin/bash "${RUNNER_TEMP}/conda.sh" -b -p "${RUNNER_TEMP}/anaconda"
echo "${RUNNER_TEMP}/anaconda/bin" >> "${GITHUB_PATH}"
# shellcheck disable=SC1091
source "${RUNNER_TEMP}/anaconda/bin/activate"
conda install -y \
cffi \
cmake \
mkl \
mkl-include \
ninja \
numpy \
pyyaml \
requests \
setuptools \
typing_extensions
- name: Run Fastlane
shell: bash -e {0}
run: |
set -x
cd ios/TestApp
@@ -87,11 +118,25 @@ jobs:
rm cert.txt
- name: Build
run: |
# shellcheck disable=SC1091
source "${RUNNER_TEMP}/anaconda/bin/activate"
export TCLLIBPATH="/usr/local/lib"
python -VV
export CMAKE_PREFIX_PATH=${CONDA_PREFIX:-"$(dirname "$(which conda)")/../"}
scripts/build_ios.sh
- name: Run Build Test
run: |
PROFILE=PyTorch_CI_2022
# run the ruby build script
if ! [ -x "$(command -v xcodebuild)" ]; then
echo 'Error: xcodebuild is not installed.'
exit 1
fi
if [ "${IOS_PLATFORM}" != "SIMULATOR" ]; then
ruby scripts/xcode_build.rb -i build_ios/install -x ios/TestApp/TestApp.xcodeproj -p "${IOS_PLATFORM}" -c "${PROFILE}" -t "${IOS_DEV_TEAM_ID}"
else
ruby scripts/xcode_build.rb -i build_ios/install -x ios/TestApp/TestApp.xcodeproj -p "${IOS_PLATFORM}"
fi
concurrency:
group: ios-12-5-1-arm64-coreml-${{ github.event.pull_request.number || github.sha }}-${{ github.event_name == 'workflow_dispatch' }}


@@ -4,42 +4,40 @@
name: ios-12-5-1-arm64-custom-ops
on:
pull_request:
types: [opened, synchronize, reopened, unassigned]
push:
branches:
- master
- release/*
tags:
- 'ciflow/all/*'
- 'ciflow/ios/*'
- 'ciflow/macos/*'
- 'ciflow/trunk/*'
workflow_dispatch:
# For setup-miniconda, see https://github.com/conda-incubator/setup-miniconda/issues/179
defaults:
run:
shell: bash -x -e -l {0}
env:
BUILD_ENVIRONMENT: ios-12-5-1-arm64-custom-ops
IN_CI: 1
IS_GHA: 1
IOS_PLATFORM: OS
IOS_ARCH: arm64
jobs:
build:
# NOTE: These builds will not run successfully without running on `pytorch/pytorch` due to the limitations
# of accessing secrets from forked pull requests and IOS' dependency on secrets for their build/test
if: ${{ github.event_name == 'push' || github.event.pull_request.head.repo.full_name == github.repository }}
runs-on: macos-10.15
timeout-minutes: 240
env:
JOB_BASE_NAME: ios-12-5-1-arm64-custom-ops-build
IOS_CERT_KEY_2022: ${{ secrets.IOS_CERT_KEY_2022 }}
IOS_CERT_SECRET: ${{ secrets.IOS_CERT_SECRET }}
IOS_DEV_TEAM_ID: ${{ secrets.IOS_DEV_TEAM_ID }}
IOS_SIGN_KEY_2022: ${{ secrets.IOS_SIGN_KEY_2022 }}
IS_PROBOT_TRIGGER_EVENT: ${{ (github.event.action == 'unassigned') && (github.event.assigneed.login == 'pytorchbot') }}
LABEL_CONDITIONS: ${{ contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/ios') || contains(github.event.pull_request.labels.*.name, 'ciflow/macos') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk') }}
PR_LABELS: ${{ toJson(github.event.pull_request.labels.*.name) }}
if: ${{ (github.repository == 'pytorch/pytorch') && (
(github.event_name == 'push') ||
(github.event_name == 'schedule') ||
(contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/ios') || contains(github.event.pull_request.labels.*.name, 'ciflow/macos') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk')) ||
(false))
}}
steps:
- name: print labels
run: echo "${PR_LABELS}"
@@ -53,19 +51,52 @@ jobs:
run: |
# Remove any artifacts from the previous checkouts
git clean -fxd
- name: Setup miniconda
uses: conda-incubator/setup-miniconda@v2
with:
auto-update-conda: true
python-version: 3.8
activate-environment: build
- name: Install ios / conda Dependencies
- name: Populate CI build options
run: |
# Most builds use the lite interpreter, if certain builds shouldn't
# build the lite interpreter this env variable should get over-written
# in the following case statement
echo "BUILD_LITE_INTERPRETER=1" >> "${GITHUB_ENV}"
case ${BUILD_ENVIRONMENT} in
*metal*)
echo "USE_PYTORCH_METAL=1" >> "${GITHUB_ENV}"
;;
*full_jit*)
echo "BUILD_LITE_INTERPRETER=0" >> "${GITHUB_ENV}"
;;
*custom*)
echo "SELECTED_OP_LIST=${GITHUB_WORKSPACE}/ios/TestApp/custom_build/mobilenetv2.yaml" >> "${GITHUB_ENV}"
;;
*coreml*)
echo "USE_COREML_DELEGATE=1" >> "${GITHUB_ENV}"
;;
esac
- name: Install brew dependencies
run: |
# Install dependencies
brew install libtool
conda install numpy ninja pyyaml mkl mkl-include setuptools cmake cffi requests typing_extensions --yes
- name: Install conda and dependencies
run: |
# Install conda, setup-miniconda messes with the path that messes with the ruby stuff we do later on
curl --retry 3 -o "${RUNNER_TEMP}/conda.sh" https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh
chmod +x "${RUNNER_TEMP}/conda.sh"
/bin/bash "${RUNNER_TEMP}/conda.sh" -b -p "${RUNNER_TEMP}/anaconda"
echo "${RUNNER_TEMP}/anaconda/bin" >> "${GITHUB_PATH}"
# shellcheck disable=SC1091
source "${RUNNER_TEMP}/anaconda/bin/activate"
conda install -y \
cffi \
cmake \
mkl \
mkl-include \
ninja \
numpy \
pyyaml \
requests \
setuptools \
typing_extensions
- name: Run Fastlane
shell: bash -e {0}
run: |
set -x
cd ios/TestApp
@@ -87,11 +118,25 @@ jobs:
rm cert.txt
- name: Build
run: |
# shellcheck disable=SC1091
source "${RUNNER_TEMP}/anaconda/bin/activate"
export TCLLIBPATH="/usr/local/lib"
python -VV
export CMAKE_PREFIX_PATH=${CONDA_PREFIX:-"$(dirname "$(which conda)")/../"}
scripts/build_ios.sh
- name: Run Build Test
run: |
PROFILE=PyTorch_CI_2022
# run the ruby build script
if ! [ -x "$(command -v xcodebuild)" ]; then
echo 'Error: xcodebuild is not installed.'
exit 1
fi
if [ "${IOS_PLATFORM}" != "SIMULATOR" ]; then
ruby scripts/xcode_build.rb -i build_ios/install -x ios/TestApp/TestApp.xcodeproj -p "${IOS_PLATFORM}" -c "${PROFILE}" -t "${IOS_DEV_TEAM_ID}"
else
ruby scripts/xcode_build.rb -i build_ios/install -x ios/TestApp/TestApp.xcodeproj -p "${IOS_PLATFORM}"
fi
concurrency:
group: ios-12-5-1-arm64-custom-ops-${{ github.event.pull_request.number || github.sha }}-${{ github.event_name == 'workflow_dispatch' }}


@@ -4,42 +4,40 @@
name: ios-12-5-1-arm64-full-jit
on:
pull_request:
types: [opened, synchronize, reopened, unassigned]
push:
branches:
- master
- release/*
tags:
- 'ciflow/all/*'
- 'ciflow/ios/*'
- 'ciflow/macos/*'
- 'ciflow/trunk/*'
workflow_dispatch:
# For setup-miniconda, see https://github.com/conda-incubator/setup-miniconda/issues/179
defaults:
run:
shell: bash -x -e -l {0}
env:
BUILD_ENVIRONMENT: ios-12-5-1-arm64-full-jit
IN_CI: 1
IS_GHA: 1
IOS_PLATFORM: OS
IOS_ARCH: arm64
jobs:
build:
# NOTE: These builds will not run successfully without running on `pytorch/pytorch` due to the limitations
# of accessing secrets from forked pull requests and IOS' dependency on secrets for their build/test
if: ${{ github.event_name == 'push' || github.event.pull_request.head.repo.full_name == github.repository }}
runs-on: macos-10.15
timeout-minutes: 240
env:
JOB_BASE_NAME: ios-12-5-1-arm64-full-jit-build
IOS_CERT_KEY_2022: ${{ secrets.IOS_CERT_KEY_2022 }}
IOS_CERT_SECRET: ${{ secrets.IOS_CERT_SECRET }}
IOS_DEV_TEAM_ID: ${{ secrets.IOS_DEV_TEAM_ID }}
IOS_SIGN_KEY_2022: ${{ secrets.IOS_SIGN_KEY_2022 }}
IS_PROBOT_TRIGGER_EVENT: ${{ (github.event.action == 'unassigned') && (github.event.assigneed.login == 'pytorchbot') }}
LABEL_CONDITIONS: ${{ contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/ios') || contains(github.event.pull_request.labels.*.name, 'ciflow/macos') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk') }}
PR_LABELS: ${{ toJson(github.event.pull_request.labels.*.name) }}
if: ${{ (github.repository == 'pytorch/pytorch') && (
(github.event_name == 'push') ||
(github.event_name == 'schedule') ||
(contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/ios') || contains(github.event.pull_request.labels.*.name, 'ciflow/macos') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk')) ||
(false))
}}
steps:
- name: print labels
run: echo "${PR_LABELS}"
@@ -53,19 +51,52 @@ jobs:
run: |
# Remove any artifacts from the previous checkouts
git clean -fxd
- name: Setup miniconda
uses: conda-incubator/setup-miniconda@v2
with:
auto-update-conda: true
python-version: 3.8
activate-environment: build
- name: Install ios / conda Dependencies
- name: Populate CI build options
run: |
# Most builds use the lite interpreter, if certain builds shouldn't
# build the lite interpreter this env variable should get over-written
# in the following case statement
echo "BUILD_LITE_INTERPRETER=1" >> "${GITHUB_ENV}"
case ${BUILD_ENVIRONMENT} in
*metal*)
echo "USE_PYTORCH_METAL=1" >> "${GITHUB_ENV}"
;;
*full_jit*)
echo "BUILD_LITE_INTERPRETER=0" >> "${GITHUB_ENV}"
;;
*custom*)
echo "SELECTED_OP_LIST=${GITHUB_WORKSPACE}/ios/TestApp/custom_build/mobilenetv2.yaml" >> "${GITHUB_ENV}"
;;
*coreml*)
echo "USE_COREML_DELEGATE=1" >> "${GITHUB_ENV}"
;;
esac
- name: Install brew dependencies
run: |
# Install dependencies
brew install libtool
conda install numpy ninja pyyaml mkl mkl-include setuptools cmake cffi requests typing_extensions --yes
- name: Install conda and dependencies
run: |
# Install conda, setup-miniconda messes with the path that messes with the ruby stuff we do later on
curl --retry 3 -o "${RUNNER_TEMP}/conda.sh" https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh
chmod +x "${RUNNER_TEMP}/conda.sh"
/bin/bash "${RUNNER_TEMP}/conda.sh" -b -p "${RUNNER_TEMP}/anaconda"
echo "${RUNNER_TEMP}/anaconda/bin" >> "${GITHUB_PATH}"
# shellcheck disable=SC1091
source "${RUNNER_TEMP}/anaconda/bin/activate"
conda install -y \
cffi \
cmake \
mkl \
mkl-include \
ninja \
numpy \
pyyaml \
requests \
setuptools \
typing_extensions
- name: Run Fastlane
shell: bash -e {0}
run: |
set -x
cd ios/TestApp
@@ -87,11 +118,25 @@ jobs:
rm cert.txt
- name: Build
run: |
# shellcheck disable=SC1091
source "${RUNNER_TEMP}/anaconda/bin/activate"
export TCLLIBPATH="/usr/local/lib"
python -VV
export CMAKE_PREFIX_PATH=${CONDA_PREFIX:-"$(dirname "$(which conda)")/../"}
scripts/build_ios.sh
- name: Run Build Test
run: |
PROFILE=PyTorch_CI_2022
# run the ruby build script
if ! [ -x "$(command -v xcodebuild)" ]; then
echo 'Error: xcodebuild is not installed.'
exit 1
fi
if [ "${IOS_PLATFORM}" != "SIMULATOR" ]; then
ruby scripts/xcode_build.rb -i build_ios/install -x ios/TestApp/TestApp.xcodeproj -p "${IOS_PLATFORM}" -c "${PROFILE}" -t "${IOS_DEV_TEAM_ID}"
else
ruby scripts/xcode_build.rb -i build_ios/install -x ios/TestApp/TestApp.xcodeproj -p "${IOS_PLATFORM}"
fi
concurrency:
group: ios-12-5-1-arm64-full-jit-${{ github.event.pull_request.number || github.sha }}-${{ github.event_name == 'workflow_dispatch' }}


@@ -4,42 +4,40 @@
name: ios-12-5-1-arm64-metal
on:
pull_request:
types: [opened, synchronize, reopened, unassigned]
push:
branches:
- master
- release/*
tags:
- 'ciflow/all/*'
- 'ciflow/ios/*'
- 'ciflow/macos/*'
- 'ciflow/trunk/*'
workflow_dispatch:
# For setup-miniconda, see https://github.com/conda-incubator/setup-miniconda/issues/179
defaults:
run:
shell: bash -x -e -l {0}
env:
BUILD_ENVIRONMENT: ios-12-5-1-arm64-metal
IN_CI: 1
IS_GHA: 1
IOS_PLATFORM: OS
IOS_ARCH: arm64
jobs:
build:
# NOTE: These builds will not run successfully without running on `pytorch/pytorch` due to the limitations
# of accessing secrets from forked pull requests and IOS' dependency on secrets for their build/test
if: ${{ github.event_name == 'push' || github.event.pull_request.head.repo.full_name == github.repository }}
runs-on: macos-10.15
timeout-minutes: 240
env:
JOB_BASE_NAME: ios-12-5-1-arm64-metal-build
IOS_CERT_KEY_2022: ${{ secrets.IOS_CERT_KEY_2022 }}
IOS_CERT_SECRET: ${{ secrets.IOS_CERT_SECRET }}
IOS_DEV_TEAM_ID: ${{ secrets.IOS_DEV_TEAM_ID }}
IOS_SIGN_KEY_2022: ${{ secrets.IOS_SIGN_KEY_2022 }}
IS_PROBOT_TRIGGER_EVENT: ${{ (github.event.action == 'unassigned') && (github.event.assigneed.login == 'pytorchbot') }}
LABEL_CONDITIONS: ${{ contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/ios') || contains(github.event.pull_request.labels.*.name, 'ciflow/macos') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk') }}
PR_LABELS: ${{ toJson(github.event.pull_request.labels.*.name) }}
if: ${{ (github.repository == 'pytorch/pytorch') && (
(github.event_name == 'push') ||
(github.event_name == 'schedule') ||
(contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/ios') || contains(github.event.pull_request.labels.*.name, 'ciflow/macos') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk')) ||
(false))
}}
steps:
- name: print labels
run: echo "${PR_LABELS}"
@@ -53,19 +51,52 @@ jobs:
run: |
# Remove any artifacts from the previous checkouts
git clean -fxd
- name: Setup miniconda
uses: conda-incubator/setup-miniconda@v2
with:
auto-update-conda: true
python-version: 3.8
activate-environment: build
- name: Install ios / conda Dependencies
- name: Populate CI build options
run: |
# Most builds use the lite interpreter, if certain builds shouldn't
# build the lite interpreter this env variable should get over-written
# in the following case statement
echo "BUILD_LITE_INTERPRETER=1" >> "${GITHUB_ENV}"
case ${BUILD_ENVIRONMENT} in
*metal*)
echo "USE_PYTORCH_METAL=1" >> "${GITHUB_ENV}"
;;
*full_jit*)
echo "BUILD_LITE_INTERPRETER=0" >> "${GITHUB_ENV}"
;;
*custom*)
echo "SELECTED_OP_LIST=${GITHUB_WORKSPACE}/ios/TestApp/custom_build/mobilenetv2.yaml" >> "${GITHUB_ENV}"
;;
*coreml*)
echo "USE_COREML_DELEGATE=1" >> "${GITHUB_ENV}"
;;
esac
- name: Install brew dependencies
run: |
# Install dependencies
brew install libtool
conda install numpy ninja pyyaml mkl mkl-include setuptools cmake cffi requests typing_extensions --yes
- name: Install conda and dependencies
run: |
# Install conda, setup-miniconda messes with the path that messes with the ruby stuff we do later on
curl --retry 3 -o "${RUNNER_TEMP}/conda.sh" https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh
chmod +x "${RUNNER_TEMP}/conda.sh"
/bin/bash "${RUNNER_TEMP}/conda.sh" -b -p "${RUNNER_TEMP}/anaconda"
echo "${RUNNER_TEMP}/anaconda/bin" >> "${GITHUB_PATH}"
# shellcheck disable=SC1091
source "${RUNNER_TEMP}/anaconda/bin/activate"
conda install -y \
cffi \
cmake \
mkl \
mkl-include \
ninja \
numpy \
pyyaml \
requests \
setuptools \
typing_extensions
- name: Run Fastlane
shell: bash -e {0}
run: |
set -x
cd ios/TestApp
@@ -87,11 +118,25 @@ jobs:
rm cert.txt
- name: Build
run: |
# shellcheck disable=SC1091
source "${RUNNER_TEMP}/anaconda/bin/activate"
export TCLLIBPATH="/usr/local/lib"
python -VV
export CMAKE_PREFIX_PATH=${CONDA_PREFIX:-"$(dirname "$(which conda)")/../"}
scripts/build_ios.sh
- name: Run Build Test
run: |
PROFILE=PyTorch_CI_2022
# run the ruby build script
if ! [ -x "$(command -v xcodebuild)" ]; then
echo 'Error: xcodebuild is not installed.'
exit 1
fi
if [ "${IOS_PLATFORM}" != "SIMULATOR" ]; then
ruby scripts/xcode_build.rb -i build_ios/install -x ios/TestApp/TestApp.xcodeproj -p "${IOS_PLATFORM}" -c "${PROFILE}" -t "${IOS_DEV_TEAM_ID}"
else
ruby scripts/xcode_build.rb -i build_ios/install -x ios/TestApp/TestApp.xcodeproj -p "${IOS_PLATFORM}"
fi
concurrency:
group: ios-12-5-1-arm64-metal-${{ github.event.pull_request.number || github.sha }}-${{ github.event_name == 'workflow_dispatch' }}


@@ -4,42 +4,40 @@
name: ios-12-5-1-arm64
on:
pull_request:
types: [opened, synchronize, reopened, unassigned]
push:
branches:
- master
- release/*
tags:
- 'ciflow/all/*'
- 'ciflow/ios/*'
- 'ciflow/macos/*'
- 'ciflow/trunk/*'
workflow_dispatch:
# For setup-miniconda, see https://github.com/conda-incubator/setup-miniconda/issues/179
defaults:
run:
shell: bash -x -e -l {0}
env:
BUILD_ENVIRONMENT: ios-12-5-1-arm64
IN_CI: 1
IS_GHA: 1
IOS_PLATFORM: OS
IOS_ARCH: arm64
jobs:
build:
# NOTE: These builds will not run successfully without running on `pytorch/pytorch` due to the limitations
# of accessing secrets from forked pull requests and IOS' dependency on secrets for their build/test
if: ${{ github.event_name == 'push' || github.event.pull_request.head.repo.full_name == github.repository }}
runs-on: macos-10.15
timeout-minutes: 240
env:
JOB_BASE_NAME: ios-12-5-1-arm64-build
IOS_CERT_KEY_2022: ${{ secrets.IOS_CERT_KEY_2022 }}
IOS_CERT_SECRET: ${{ secrets.IOS_CERT_SECRET }}
IOS_DEV_TEAM_ID: ${{ secrets.IOS_DEV_TEAM_ID }}
IOS_SIGN_KEY_2022: ${{ secrets.IOS_SIGN_KEY_2022 }}
IS_PROBOT_TRIGGER_EVENT: ${{ (github.event.action == 'unassigned') && (github.event.assigneed.login == 'pytorchbot') }}
LABEL_CONDITIONS: ${{ contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/ios') || contains(github.event.pull_request.labels.*.name, 'ciflow/macos') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk') }}
PR_LABELS: ${{ toJson(github.event.pull_request.labels.*.name) }}
if: ${{ (github.repository == 'pytorch/pytorch') && (
(github.event_name == 'push') ||
(github.event_name == 'schedule') ||
(contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/ios') || contains(github.event.pull_request.labels.*.name, 'ciflow/macos') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk')) ||
(false))
}}
steps:
- name: print labels
run: echo "${PR_LABELS}"
@@ -53,19 +51,52 @@ jobs:
run: |
# Remove any artifacts from the previous checkouts
git clean -fxd
- name: Setup miniconda
uses: conda-incubator/setup-miniconda@v2
with:
auto-update-conda: true
python-version: 3.8
activate-environment: build
- name: Install ios / conda Dependencies
- name: Populate CI build options
run: |
# Most builds use the lite interpreter, if certain builds shouldn't
# build the lite interpreter this env variable should get over-written
# in the following case statement
echo "BUILD_LITE_INTERPRETER=1" >> "${GITHUB_ENV}"
case ${BUILD_ENVIRONMENT} in
*metal*)
echo "USE_PYTORCH_METAL=1" >> "${GITHUB_ENV}"
;;
*full_jit*)
echo "BUILD_LITE_INTERPRETER=0" >> "${GITHUB_ENV}"
;;
*custom*)
echo "SELECTED_OP_LIST=${GITHUB_WORKSPACE}/ios/TestApp/custom_build/mobilenetv2.yaml" >> "${GITHUB_ENV}"
;;
*coreml*)
echo "USE_COREML_DELEGATE=1" >> "${GITHUB_ENV}"
;;
esac
- name: Install brew dependencies
run: |
# Install dependencies
brew install libtool
conda install numpy ninja pyyaml mkl mkl-include setuptools cmake cffi requests typing_extensions --yes
- name: Install conda and dependencies
run: |
# Install conda, setup-miniconda messes with the path that messes with the ruby stuff we do later on
curl --retry 3 -o "${RUNNER_TEMP}/conda.sh" https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh
chmod +x "${RUNNER_TEMP}/conda.sh"
/bin/bash "${RUNNER_TEMP}/conda.sh" -b -p "${RUNNER_TEMP}/anaconda"
echo "${RUNNER_TEMP}/anaconda/bin" >> "${GITHUB_PATH}"
# shellcheck disable=SC1091
source "${RUNNER_TEMP}/anaconda/bin/activate"
conda install -y \
cffi \
cmake \
mkl \
mkl-include \
ninja \
numpy \
pyyaml \
requests \
setuptools \
typing_extensions
- name: Run Fastlane
shell: bash -e {0}
run: |
set -x
cd ios/TestApp
@@ -87,11 +118,25 @@ jobs:
rm cert.txt
- name: Build
run: |
# shellcheck disable=SC1091
source "${RUNNER_TEMP}/anaconda/bin/activate"
export TCLLIBPATH="/usr/local/lib"
python -VV
export CMAKE_PREFIX_PATH=${CONDA_PREFIX:-"$(dirname "$(which conda)")/../"}
scripts/build_ios.sh
- name: Run Build Test
run: |
PROFILE=PyTorch_CI_2022
# run the ruby build script
if ! [ -x "$(command -v xcodebuild)" ]; then
echo 'Error: xcodebuild is not installed.'
exit 1
fi
if [ "${IOS_PLATFORM}" != "SIMULATOR" ]; then
ruby scripts/xcode_build.rb -i build_ios/install -x ios/TestApp/TestApp.xcodeproj -p "${IOS_PLATFORM}" -c "${PROFILE}" -t "${IOS_DEV_TEAM_ID}"
else
ruby scripts/xcode_build.rb -i build_ios/install -x ios/TestApp/TestApp.xcodeproj -p "${IOS_PLATFORM}"
fi
concurrency:
group: ios-12-5-1-arm64-${{ github.event.pull_request.number || github.sha }}-${{ github.event_name == 'workflow_dispatch' }}


@@ -4,42 +4,40 @@
name: ios-12-5-1-x86-64-coreml
on:
pull_request:
types: [opened, synchronize, reopened, unassigned]
push:
branches:
- master
- release/*
tags:
- 'ciflow/all/*'
- 'ciflow/ios/*'
- 'ciflow/macos/*'
- 'ciflow/trunk/*'
workflow_dispatch:
# For setup-miniconda, see https://github.com/conda-incubator/setup-miniconda/issues/179
defaults:
run:
shell: bash -x -e -l {0}
env:
BUILD_ENVIRONMENT: ios-12-5-1-x86-64-coreml
IN_CI: 1
IS_GHA: 1
IOS_PLATFORM: SIMULATOR
IOS_ARCH: x86_64
jobs:
build:
# NOTE: These builds will not run successfully without running on `pytorch/pytorch` due to the limitations
# of accessing secrets from forked pull requests and IOS' dependency on secrets for their build/test
if: ${{ github.event_name == 'push' || github.event.pull_request.head.repo.full_name == github.repository }}
runs-on: macos-10.15
timeout-minutes: 240
env:
JOB_BASE_NAME: ios-12-5-1-x86-64-coreml-build
IOS_CERT_KEY_2022: ${{ secrets.IOS_CERT_KEY_2022 }}
IOS_CERT_SECRET: ${{ secrets.IOS_CERT_SECRET }}
IOS_DEV_TEAM_ID: ${{ secrets.IOS_DEV_TEAM_ID }}
IOS_SIGN_KEY_2022: ${{ secrets.IOS_SIGN_KEY_2022 }}
IS_PROBOT_TRIGGER_EVENT: ${{ (github.event.action == 'unassigned') && (github.event.assigneed.login == 'pytorchbot') }}
LABEL_CONDITIONS: ${{ contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/ios') || contains(github.event.pull_request.labels.*.name, 'ciflow/macos') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk') }}
PR_LABELS: ${{ toJson(github.event.pull_request.labels.*.name) }}
if: ${{ (github.repository == 'pytorch/pytorch') && (
(github.event_name == 'push') ||
(github.event_name == 'schedule') ||
(contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/ios') || contains(github.event.pull_request.labels.*.name, 'ciflow/macos') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk')) ||
(false))
}}
steps:
- name: print labels
run: echo "${PR_LABELS}"
@@ -53,19 +51,52 @@ jobs:
run: |
# Remove any artifacts from the previous checkouts
git clean -fxd
- name: Setup miniconda
uses: conda-incubator/setup-miniconda@v2
with:
auto-update-conda: true
python-version: 3.8
activate-environment: build
- name: Install ios / conda Dependencies
- name: Populate CI build options
run: |
# Most builds use the lite interpreter; builds that should not use it
# overwrite this env variable in the case statement below
echo "BUILD_LITE_INTERPRETER=1" >> "${GITHUB_ENV}"
case ${BUILD_ENVIRONMENT} in
*metal*)
echo "USE_PYTORCH_METAL=1" >> "${GITHUB_ENV}"
;;
*full_jit*)
echo "BUILD_LITE_INTERPRETER=0" >> "${GITHUB_ENV}"
;;
*custom*)
echo "SELECTED_OP_LIST=${GITHUB_WORKSPACE}/ios/TestApp/custom_build/mobilenetv2.yaml" >> "${GITHUB_ENV}"
;;
*coreml*)
echo "USE_COREML_DELEGATE=1" >> "${GITHUB_ENV}"
;;
esac
- name: Install brew dependencies
run: |
# Install dependencies
brew install libtool
conda install numpy ninja pyyaml mkl mkl-include setuptools cmake cffi requests typing_extensions --yes
- name: Install conda and dependencies
run: |
# Install conda manually; setup-miniconda alters PATH in a way that breaks the Ruby steps later on
curl --retry 3 -o "${RUNNER_TEMP}/conda.sh" https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh
chmod +x "${RUNNER_TEMP}/conda.sh"
/bin/bash "${RUNNER_TEMP}/conda.sh" -b -p "${RUNNER_TEMP}/anaconda"
echo "${RUNNER_TEMP}/anaconda/bin" >> "${GITHUB_PATH}"
# shellcheck disable=SC1091
source "${RUNNER_TEMP}/anaconda/bin/activate"
conda install -y \
cffi \
cmake \
mkl \
mkl-include \
ninja \
numpy \
pyyaml \
requests \
setuptools \
typing_extensions
- name: Run Fastlane
shell: bash -e {0}
run: |
set -x
cd ios/TestApp
@@ -87,11 +118,58 @@ jobs:
rm cert.txt
- name: Build
run: |
# shellcheck disable=SC1091
source "${RUNNER_TEMP}/anaconda/bin/activate"
export TCLLIBPATH="/usr/local/lib"
python -VV
export CMAKE_PREFIX_PATH=${CONDA_PREFIX:-"$(dirname "$(which conda)")/../"}
scripts/build_ios.sh
- name: Run Build Test
run: |
PROFILE=PyTorch_CI_2022
# run the ruby build script
if ! [ -x "$(command -v xcodebuild)" ]; then
echo 'Error: xcodebuild is not installed.'
exit 1
fi
if [ "${IOS_PLATFORM}" != "SIMULATOR" ]; then
ruby scripts/xcode_build.rb -i build_ios/install -x ios/TestApp/TestApp.xcodeproj -p "${IOS_PLATFORM}" -c "${PROFILE}" -t "${IOS_DEV_TEAM_ID}"
else
ruby scripts/xcode_build.rb -i build_ios/install -x ios/TestApp/TestApp.xcodeproj -p "${IOS_PLATFORM}"
fi
- name: Run Simulator Tests
run: |
# shellcheck disable=SC1091
source "${RUNNER_TEMP}/anaconda/bin/activate"
pip3 install --pre torch torchvision torchaudio -f https://download.pytorch.org/whl/nightly/cpu/torch_nightly.html
# generate models for different backends
cd "${GITHUB_WORKSPACE}/ios/TestApp/benchmark"
mkdir -p ../models
if [ "${USE_COREML_DELEGATE}" == 1 ]; then
pip install coremltools==5.0b5
pip install six==1.16.0
python coreml_backend.py
else
python trace_model.py
fi
if [ "${BUILD_LITE_INTERPRETER}" == 1 ]; then
echo "Setting up the TestApp for LiteInterpreter"
ruby setup.rb --lite 1
else
echo "Setting up the TestApp for Full JIT"
ruby setup.rb
fi
cd "${GITHUB_WORKSPACE}/ios/TestApp"
instruments -s -devices
if [ "${BUILD_LITE_INTERPRETER}" == 1 ]; then
if [ "${USE_COREML_DELEGATE}" == 1 ]; then
fastlane scan --only_testing TestAppTests/TestAppTests/testCoreML
else
fastlane scan --only_testing TestAppTests/TestAppTests/testLiteInterpreter
fi
else
fastlane scan --only_testing TestAppTests/TestAppTests/testFullJIT
fi
concurrency:
group: ios-12-5-1-x86-64-coreml-${{ github.event.pull_request.number || github.sha }}-${{ github.event_name == 'workflow_dispatch' }}
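The "Populate CI build options" step repeated in these iOS workflows passes flags to later steps by appending `KEY=value` lines to the file that `$GITHUB_ENV` points at. A minimal standalone sketch of that pattern, with the runner-managed env file replaced by a temp file and `BUILD_ENVIRONMENT` set to an example value:

```shell
#!/usr/bin/env bash
# Sketch of the GITHUB_ENV pattern from the "Populate CI build options" step.
# On a real runner, GITHUB_ENV points at a file whose contents become
# environment variables in later steps; here we simulate it with a temp file.
set -euo pipefail

GITHUB_ENV="$(mktemp)"
BUILD_ENVIRONMENT="ios-12-5-1-x86-64-coreml"   # example value

# Default: build the lite interpreter; specific builds override it below.
echo "BUILD_LITE_INTERPRETER=1" >> "${GITHUB_ENV}"
case ${BUILD_ENVIRONMENT} in
  *full_jit*)
    echo "BUILD_LITE_INTERPRETER=0" >> "${GITHUB_ENV}"
    ;;
  *coreml*)
    echo "USE_COREML_DELEGATE=1" >> "${GITHUB_ENV}"
    ;;
esac

cat "${GITHUB_ENV}"
```

On a real runner the appended variables are exported into every subsequent step of the same job.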


@@ -4,42 +4,40 @@
name: ios-12-5-1-x86-64-full-jit
on:
pull_request:
types: [opened, synchronize, reopened, unassigned]
push:
branches:
- master
- release/*
tags:
- 'ciflow/all/*'
- 'ciflow/ios/*'
- 'ciflow/macos/*'
- 'ciflow/trunk/*'
workflow_dispatch:
# For setup-miniconda, see https://github.com/conda-incubator/setup-miniconda/issues/179
defaults:
run:
shell: bash -x -e -l {0}
env:
BUILD_ENVIRONMENT: ios-12-5-1-x86-64-full-jit
IN_CI: 1
IS_GHA: 1
IOS_PLATFORM: SIMULATOR
IOS_ARCH: x86_64
jobs:
build:
# NOTE: These builds will not run successfully without running on `pytorch/pytorch` due to the limitations
# of accessing secrets from forked pull requests and the iOS build's dependency on secrets for its build/test
if: ${{ github.event_name == 'push' || github.event.pull_request.head.repo.full_name == github.repository }}
runs-on: macos-10.15
timeout-minutes: 240
env:
JOB_BASE_NAME: ios-12-5-1-x86-64-full-jit-build
IOS_CERT_KEY_2022: ${{ secrets.IOS_CERT_KEY_2022 }}
IOS_CERT_SECRET: ${{ secrets.IOS_CERT_SECRET }}
IOS_DEV_TEAM_ID: ${{ secrets.IOS_DEV_TEAM_ID }}
IOS_SIGN_KEY_2022: ${{ secrets.IOS_SIGN_KEY_2022 }}
IS_PROBOT_TRIGGER_EVENT: ${{ (github.event.action == 'unassigned') && (github.event.assignee.login == 'pytorchbot') }}
LABEL_CONDITIONS: ${{ contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/ios') || contains(github.event.pull_request.labels.*.name, 'ciflow/macos') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk') }}
PR_LABELS: ${{ toJson(github.event.pull_request.labels.*.name) }}
if: ${{ (github.repository == 'pytorch/pytorch') && (
(github.event_name == 'push') ||
(github.event_name == 'schedule') ||
(contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/ios') || contains(github.event.pull_request.labels.*.name, 'ciflow/macos') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk')) ||
(false))
}}
steps:
- name: print labels
run: echo "${PR_LABELS}"
@@ -53,19 +51,52 @@ jobs:
run: |
# Remove any artifacts from the previous checkouts
git clean -fxd
- name: Setup miniconda
uses: conda-incubator/setup-miniconda@v2
with:
auto-update-conda: true
python-version: 3.8
activate-environment: build
- name: Install ios / conda Dependencies
- name: Populate CI build options
run: |
# Most builds use the lite interpreter; builds that should not use it
# overwrite this env variable in the case statement below
echo "BUILD_LITE_INTERPRETER=1" >> "${GITHUB_ENV}"
case ${BUILD_ENVIRONMENT} in
*metal*)
echo "USE_PYTORCH_METAL=1" >> "${GITHUB_ENV}"
;;
*full_jit*)
echo "BUILD_LITE_INTERPRETER=0" >> "${GITHUB_ENV}"
;;
*custom*)
echo "SELECTED_OP_LIST=${GITHUB_WORKSPACE}/ios/TestApp/custom_build/mobilenetv2.yaml" >> "${GITHUB_ENV}"
;;
*coreml*)
echo "USE_COREML_DELEGATE=1" >> "${GITHUB_ENV}"
;;
esac
- name: Install brew dependencies
run: |
# Install dependencies
brew install libtool
conda install numpy ninja pyyaml mkl mkl-include setuptools cmake cffi requests typing_extensions --yes
- name: Install conda and dependencies
run: |
# Install conda manually; setup-miniconda alters PATH in a way that breaks the Ruby steps later on
curl --retry 3 -o "${RUNNER_TEMP}/conda.sh" https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh
chmod +x "${RUNNER_TEMP}/conda.sh"
/bin/bash "${RUNNER_TEMP}/conda.sh" -b -p "${RUNNER_TEMP}/anaconda"
echo "${RUNNER_TEMP}/anaconda/bin" >> "${GITHUB_PATH}"
# shellcheck disable=SC1091
source "${RUNNER_TEMP}/anaconda/bin/activate"
conda install -y \
cffi \
cmake \
mkl \
mkl-include \
ninja \
numpy \
pyyaml \
requests \
setuptools \
typing_extensions
- name: Run Fastlane
shell: bash -e {0}
run: |
set -x
cd ios/TestApp
@@ -87,11 +118,58 @@ jobs:
rm cert.txt
- name: Build
run: |
# shellcheck disable=SC1091
source "${RUNNER_TEMP}/anaconda/bin/activate"
export TCLLIBPATH="/usr/local/lib"
python -VV
export CMAKE_PREFIX_PATH=${CONDA_PREFIX:-"$(dirname "$(which conda)")/../"}
scripts/build_ios.sh
- name: Run Build Test
run: |
PROFILE=PyTorch_CI_2022
# run the ruby build script
if ! [ -x "$(command -v xcodebuild)" ]; then
echo 'Error: xcodebuild is not installed.'
exit 1
fi
if [ "${IOS_PLATFORM}" != "SIMULATOR" ]; then
ruby scripts/xcode_build.rb -i build_ios/install -x ios/TestApp/TestApp.xcodeproj -p "${IOS_PLATFORM}" -c "${PROFILE}" -t "${IOS_DEV_TEAM_ID}"
else
ruby scripts/xcode_build.rb -i build_ios/install -x ios/TestApp/TestApp.xcodeproj -p "${IOS_PLATFORM}"
fi
- name: Run Simulator Tests
run: |
# shellcheck disable=SC1091
source "${RUNNER_TEMP}/anaconda/bin/activate"
pip3 install --pre torch torchvision torchaudio -f https://download.pytorch.org/whl/nightly/cpu/torch_nightly.html
# generate models for different backends
cd "${GITHUB_WORKSPACE}/ios/TestApp/benchmark"
mkdir -p ../models
if [ "${USE_COREML_DELEGATE}" == 1 ]; then
pip install coremltools==5.0b5
pip install six==1.16.0
python coreml_backend.py
else
python trace_model.py
fi
if [ "${BUILD_LITE_INTERPRETER}" == 1 ]; then
echo "Setting up the TestApp for LiteInterpreter"
ruby setup.rb --lite 1
else
echo "Setting up the TestApp for Full JIT"
ruby setup.rb
fi
cd "${GITHUB_WORKSPACE}/ios/TestApp"
instruments -s -devices
if [ "${BUILD_LITE_INTERPRETER}" == 1 ]; then
if [ "${USE_COREML_DELEGATE}" == 1 ]; then
fastlane scan --only_testing TestAppTests/TestAppTests/testCoreML
else
fastlane scan --only_testing TestAppTests/TestAppTests/testLiteInterpreter
fi
else
fastlane scan --only_testing TestAppTests/TestAppTests/testFullJIT
fi
concurrency:
group: ios-12-5-1-x86-64-full-jit-${{ github.event.pull_request.number || github.sha }}-${{ github.event_name == 'workflow_dispatch' }}


@@ -4,42 +4,40 @@
name: ios-12-5-1-x86-64
on:
pull_request:
types: [opened, synchronize, reopened, unassigned]
push:
branches:
- master
- release/*
tags:
- 'ciflow/all/*'
- 'ciflow/ios/*'
- 'ciflow/macos/*'
- 'ciflow/trunk/*'
workflow_dispatch:
# For setup-miniconda, see https://github.com/conda-incubator/setup-miniconda/issues/179
defaults:
run:
shell: bash -x -e -l {0}
env:
BUILD_ENVIRONMENT: ios-12-5-1-x86-64
IN_CI: 1
IS_GHA: 1
IOS_PLATFORM: SIMULATOR
IOS_ARCH: x86_64
jobs:
build:
# NOTE: These builds will not run successfully without running on `pytorch/pytorch` due to the limitations
# of accessing secrets from forked pull requests and the iOS build's dependency on secrets for its build/test
if: ${{ github.event_name == 'push' || github.event.pull_request.head.repo.full_name == github.repository }}
runs-on: macos-10.15
timeout-minutes: 240
env:
JOB_BASE_NAME: ios-12-5-1-x86-64-build
IOS_CERT_KEY_2022: ${{ secrets.IOS_CERT_KEY_2022 }}
IOS_CERT_SECRET: ${{ secrets.IOS_CERT_SECRET }}
IOS_DEV_TEAM_ID: ${{ secrets.IOS_DEV_TEAM_ID }}
IOS_SIGN_KEY_2022: ${{ secrets.IOS_SIGN_KEY_2022 }}
IS_PROBOT_TRIGGER_EVENT: ${{ (github.event.action == 'unassigned') && (github.event.assignee.login == 'pytorchbot') }}
LABEL_CONDITIONS: ${{ contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/ios') || contains(github.event.pull_request.labels.*.name, 'ciflow/macos') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk') }}
PR_LABELS: ${{ toJson(github.event.pull_request.labels.*.name) }}
if: ${{ (github.repository == 'pytorch/pytorch') && (
(github.event_name == 'push') ||
(github.event_name == 'schedule') ||
(contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/ios') || contains(github.event.pull_request.labels.*.name, 'ciflow/macos') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk')) ||
(false))
}}
steps:
- name: print labels
run: echo "${PR_LABELS}"
@@ -53,19 +51,52 @@ jobs:
run: |
# Remove any artifacts from the previous checkouts
git clean -fxd
- name: Setup miniconda
uses: conda-incubator/setup-miniconda@v2
with:
auto-update-conda: true
python-version: 3.8
activate-environment: build
- name: Install ios / conda Dependencies
- name: Populate CI build options
run: |
# Most builds use the lite interpreter; builds that should not use it
# overwrite this env variable in the case statement below
echo "BUILD_LITE_INTERPRETER=1" >> "${GITHUB_ENV}"
case ${BUILD_ENVIRONMENT} in
*metal*)
echo "USE_PYTORCH_METAL=1" >> "${GITHUB_ENV}"
;;
*full_jit*)
echo "BUILD_LITE_INTERPRETER=0" >> "${GITHUB_ENV}"
;;
*custom*)
echo "SELECTED_OP_LIST=${GITHUB_WORKSPACE}/ios/TestApp/custom_build/mobilenetv2.yaml" >> "${GITHUB_ENV}"
;;
*coreml*)
echo "USE_COREML_DELEGATE=1" >> "${GITHUB_ENV}"
;;
esac
- name: Install brew dependencies
run: |
# Install dependencies
brew install libtool
conda install numpy ninja pyyaml mkl mkl-include setuptools cmake cffi requests typing_extensions --yes
- name: Install conda and dependencies
run: |
# Install conda manually; setup-miniconda alters PATH in a way that breaks the Ruby steps later on
curl --retry 3 -o "${RUNNER_TEMP}/conda.sh" https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh
chmod +x "${RUNNER_TEMP}/conda.sh"
/bin/bash "${RUNNER_TEMP}/conda.sh" -b -p "${RUNNER_TEMP}/anaconda"
echo "${RUNNER_TEMP}/anaconda/bin" >> "${GITHUB_PATH}"
# shellcheck disable=SC1091
source "${RUNNER_TEMP}/anaconda/bin/activate"
conda install -y \
cffi \
cmake \
mkl \
mkl-include \
ninja \
numpy \
pyyaml \
requests \
setuptools \
typing_extensions
- name: Run Fastlane
shell: bash -e {0}
run: |
set -x
cd ios/TestApp
@@ -87,11 +118,58 @@ jobs:
rm cert.txt
- name: Build
run: |
# shellcheck disable=SC1091
source "${RUNNER_TEMP}/anaconda/bin/activate"
export TCLLIBPATH="/usr/local/lib"
python -VV
export CMAKE_PREFIX_PATH=${CONDA_PREFIX:-"$(dirname "$(which conda)")/../"}
scripts/build_ios.sh
- name: Run Build Test
run: |
PROFILE=PyTorch_CI_2022
# run the ruby build script
if ! [ -x "$(command -v xcodebuild)" ]; then
echo 'Error: xcodebuild is not installed.'
exit 1
fi
if [ "${IOS_PLATFORM}" != "SIMULATOR" ]; then
ruby scripts/xcode_build.rb -i build_ios/install -x ios/TestApp/TestApp.xcodeproj -p "${IOS_PLATFORM}" -c "${PROFILE}" -t "${IOS_DEV_TEAM_ID}"
else
ruby scripts/xcode_build.rb -i build_ios/install -x ios/TestApp/TestApp.xcodeproj -p "${IOS_PLATFORM}"
fi
- name: Run Simulator Tests
run: |
# shellcheck disable=SC1091
source "${RUNNER_TEMP}/anaconda/bin/activate"
pip3 install --pre torch torchvision torchaudio -f https://download.pytorch.org/whl/nightly/cpu/torch_nightly.html
# generate models for different backends
cd "${GITHUB_WORKSPACE}/ios/TestApp/benchmark"
mkdir -p ../models
if [ "${USE_COREML_DELEGATE}" == 1 ]; then
pip install coremltools==5.0b5
pip install six==1.16.0
python coreml_backend.py
else
python trace_model.py
fi
if [ "${BUILD_LITE_INTERPRETER}" == 1 ]; then
echo "Setting up the TestApp for LiteInterpreter"
ruby setup.rb --lite 1
else
echo "Setting up the TestApp for Full JIT"
ruby setup.rb
fi
cd "${GITHUB_WORKSPACE}/ios/TestApp"
instruments -s -devices
if [ "${BUILD_LITE_INTERPRETER}" == 1 ]; then
if [ "${USE_COREML_DELEGATE}" == 1 ]; then
fastlane scan --only_testing TestAppTests/TestAppTests/testCoreML
else
fastlane scan --only_testing TestAppTests/TestAppTests/testLiteInterpreter
fi
else
fastlane scan --only_testing TestAppTests/TestAppTests/testFullJIT
fi
concurrency:
group: ios-12-5-1-x86-64-${{ github.event.pull_request.number || github.sha }}-${{ github.event_name == 'workflow_dispatch' }}


@@ -4,13 +4,16 @@
name: libtorch-linux-xenial-cuda10.2-py3.7-gcc7
on:
pull_request:
types: [opened, synchronize, reopened, unassigned]
push:
tags:
- 'ciflow/all/*'
- 'ciflow/cuda/*'
- 'ciflow/libtorch/*'
- 'ciflow/linux/*'
- 'ciflow/trunk/*'
branches:
- master
- release/*
- fbsync
workflow_dispatch:
env:
@@ -41,16 +44,8 @@ jobs:
build:
runs-on: linux.2xlarge
timeout-minutes: 240
if: ${{ (github.repository == 'pytorch/pytorch') && (
(github.event_name == 'push') ||
(github.event_name == 'schedule') ||
(contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/cuda') || contains(github.event.pull_request.labels.*.name, 'ciflow/libtorch') || contains(github.event.pull_request.labels.*.name, 'ciflow/linux') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk')) ||
(false))
}}
env:
JOB_BASE_NAME: libtorch-linux-xenial-cuda10.2-py3.7-gcc7-build
IS_PROBOT_TRIGGER_EVENT: ${{ (github.event.action == 'unassigned') && (github.event.assignee.login == 'pytorchbot') }}
LABEL_CONDITIONS: ${{ contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/cuda') || contains(github.event.pull_request.labels.*.name, 'ciflow/libtorch') || contains(github.event.pull_request.labels.*.name, 'ciflow/linux') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk') }}
outputs:
docker_image: ${{ steps.calculate-tag.outputs.docker_image }}
steps:
@@ -75,7 +70,10 @@ jobs:
AWS_MAX_ATTEMPTS: 5
run: |
AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\")
aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
retry () {
"$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}
retry aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
--password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com"
- name: Chown workspace
run: |
@@ -87,8 +85,8 @@ jobs:
docker run --pull=never --rm -v "$(pwd)":/v -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
- name: Clean workspace
run: |
rm -rf "${GITHUB_WORKSPACE:?}/*"
rm -f ~/.ssh/authorized_keys
rm -rf "${GITHUB_WORKSPACE}"
mkdir "${GITHUB_WORKSPACE}"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: seemethere/add-github-ssh-key@v1
with:
@@ -202,7 +200,7 @@ jobs:
SCRIBE_GRAPHQL_ACCESS_TOKEN: ${{ secrets.SCRIBE_GRAPHQL_ACCESS_TOKEN }}
BRANCH: ${{ steps.parse-ref.outputs.branch }}
TAG: ${{ steps.parse-ref.outputs.tag }}
WORKFLOW_ID: '${{ github.run_id }}_${{ github.run_number }}'
WORKFLOW_ID: '${{ github.run_id }}'
run: |
COMMIT_TIME=$(git log --max-count=1 --format=%ct || echo 0)
export COMMIT_TIME
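The ECR login steps in this diff wrap `aws ecr get-login-password` in an inline `retry` helper that re-runs a failing command up to three times with short sleeps. A self-contained sketch of the same helper; the `flaky` demo command and its marker file are illustrative only, not part of the workflow:

```shell
#!/usr/bin/env bash
# The retry helper used by the ECR login steps: try the command, and on
# failure retry after 1s, then once more after 2s.
retry () {
  "$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}

# Demo command that fails on its first invocation and succeeds afterwards.
marker="$(mktemp -u)"
flaky () {
  if [ -e "${marker}" ]; then
    echo "ok"
  else
    touch "${marker}"   # remember that we already failed once
    return 1
  fi
}

retry flaky   # first attempt fails, the 1s retry succeeds
```

Note that the retries run in subshells, so only the command's external effects (files, network state) persist across attempts.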


@@ -4,13 +4,16 @@
name: libtorch-linux-xenial-cuda11.3-py3.7-gcc7
on:
pull_request:
types: [opened, synchronize, reopened, unassigned]
push:
tags:
- 'ciflow/all/*'
- 'ciflow/cuda/*'
- 'ciflow/libtorch/*'
- 'ciflow/linux/*'
- 'ciflow/trunk/*'
branches:
- master
- release/*
- fbsync
workflow_dispatch:
env:
@@ -41,16 +44,8 @@ jobs:
build:
runs-on: linux.2xlarge
timeout-minutes: 240
if: ${{ (github.repository == 'pytorch/pytorch') && (
(github.event_name == 'push') ||
(github.event_name == 'schedule') ||
(contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/cuda') || contains(github.event.pull_request.labels.*.name, 'ciflow/libtorch') || contains(github.event.pull_request.labels.*.name, 'ciflow/linux') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk')) ||
(false))
}}
env:
JOB_BASE_NAME: libtorch-linux-xenial-cuda11.3-py3.7-gcc7-build
IS_PROBOT_TRIGGER_EVENT: ${{ (github.event.action == 'unassigned') && (github.event.assignee.login == 'pytorchbot') }}
LABEL_CONDITIONS: ${{ contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/cuda') || contains(github.event.pull_request.labels.*.name, 'ciflow/libtorch') || contains(github.event.pull_request.labels.*.name, 'ciflow/linux') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk') }}
outputs:
docker_image: ${{ steps.calculate-tag.outputs.docker_image }}
steps:
@@ -75,7 +70,10 @@ jobs:
AWS_MAX_ATTEMPTS: 5
run: |
AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\")
aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
retry () {
"$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}
retry aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
--password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com"
- name: Chown workspace
run: |
@@ -87,8 +85,8 @@ jobs:
docker run --pull=never --rm -v "$(pwd)":/v -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
- name: Clean workspace
run: |
rm -rf "${GITHUB_WORKSPACE:?}/*"
rm -f ~/.ssh/authorized_keys
rm -rf "${GITHUB_WORKSPACE}"
mkdir "${GITHUB_WORKSPACE}"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: seemethere/add-github-ssh-key@v1
with:
@@ -202,7 +200,7 @@ jobs:
SCRIBE_GRAPHQL_ACCESS_TOKEN: ${{ secrets.SCRIBE_GRAPHQL_ACCESS_TOKEN }}
BRANCH: ${{ steps.parse-ref.outputs.branch }}
TAG: ${{ steps.parse-ref.outputs.tag }}
WORKFLOW_ID: '${{ github.run_id }}_${{ github.run_number }}'
WORKFLOW_ID: '${{ github.run_id }}'
run: |
COMMIT_TIME=$(git log --max-count=1 --format=%ct || echo 0)
export COMMIT_TIME

.github/workflows/generated-linux-binary-conda.yml (generated, vendored, new file, 8005 lines)

File diff suppressed because it is too large

File diff suppressed because it is too large

File diff suppressed because it is too large

.github/workflows/generated-linux-binary-manywheel.yml (generated, vendored, new file, 11149 lines)

File diff suppressed because it is too large


@@ -4,13 +4,16 @@
name: linux-bionic-cuda10.2-py3.9-gcc7
on:
pull_request:
types: [opened, synchronize, reopened, unassigned]
push:
tags:
- 'ciflow/all/*'
- 'ciflow/cuda/*'
- 'ciflow/linux/*'
- 'ciflow/slow/*'
- 'ciflow/trunk/*'
branches:
- master
- release/*
- fbsync
workflow_dispatch:
env:
@@ -41,16 +44,8 @@ jobs:
build:
runs-on: linux.2xlarge
timeout-minutes: 240
if: ${{ (github.repository_owner == 'pytorch') && (
(github.event_name == 'push') ||
(github.event_name == 'schedule') ||
(contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/cuda') || contains(github.event.pull_request.labels.*.name, 'ciflow/linux') || contains(github.event.pull_request.labels.*.name, 'ciflow/slow') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk')) ||
(false))
}}
env:
JOB_BASE_NAME: linux-bionic-cuda10.2-py3.9-gcc7-build
IS_PROBOT_TRIGGER_EVENT: ${{ (github.event.action == 'unassigned') && (github.event.assigneed.login == 'pytorchbot') }}
LABEL_CONDITIONS: ${{ contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/cuda') || contains(github.event.pull_request.labels.*.name, 'ciflow/linux') || contains(github.event.pull_request.labels.*.name, 'ciflow/slow') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk') }}
outputs:
docker_image: ${{ steps.calculate-tag.outputs.docker_image }}
steps:
@@ -75,7 +70,10 @@ jobs:
AWS_MAX_ATTEMPTS: 5
run: |
AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\")
aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
retry () {
"$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}
retry aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
--password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com"
- name: Chown workspace
run: |
@@ -87,8 +85,8 @@ jobs:
docker run --pull=never --rm -v "$(pwd)":/v -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
- name: Clean workspace
run: |
rm -rf "${GITHUB_WORKSPACE:?}/*"
rm -f ~/.ssh/authorized_keys
rm -rf "${GITHUB_WORKSPACE}"
mkdir "${GITHUB_WORKSPACE}"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: seemethere/add-github-ssh-key@v1
with:
@@ -202,7 +200,7 @@ jobs:
SCRIBE_GRAPHQL_ACCESS_TOKEN: ${{ secrets.SCRIBE_GRAPHQL_ACCESS_TOKEN }}
BRANCH: ${{ steps.parse-ref.outputs.branch }}
TAG: ${{ steps.parse-ref.outputs.tag }}
WORKFLOW_ID: '${{ github.run_id }}_${{ github.run_number }}'
WORKFLOW_ID: '${{ github.run_id }}'
run: |
COMMIT_TIME=$(git log --max-count=1 --format=%ct || echo 0)
export COMMIT_TIME
@@ -321,7 +319,10 @@ jobs:
AWS_MAX_ATTEMPTS: 5
run: |
AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\")
aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
retry () {
"$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}
retry aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
--password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com"
- name: Chown workspace
run: |
@@ -333,8 +334,8 @@ jobs:
docker run --pull=never --rm -v "$(pwd)":/v -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
- name: Clean workspace
run: |
rm -rf "${GITHUB_WORKSPACE:?}/*"
rm -f ~/.ssh/authorized_keys
rm -rf "${GITHUB_WORKSPACE}"
mkdir "${GITHUB_WORKSPACE}"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: seemethere/add-github-ssh-key@v1
with:
@@ -515,7 +516,7 @@ jobs:
PR_NUMBER: ${{ github.event.pull_request.number }}
SHA1: ${{ github.event.pull_request.head.sha || github.sha }}
TAG: ${{ steps.parse-ref.outputs.tag }}
WORKFLOW_ID: '${{ github.run_id }}_${{ github.run_number }}'
WORKFLOW_ID: '${{ github.run_id }}'
shell: bash
run: |
python3 -m pip install -r requirements.txt


@@ -5,12 +5,16 @@ name: linux-bionic-py3.7-clang9
on:
pull_request:
types: [opened, synchronize, reopened, unassigned]
push:
tags:
- 'ciflow/all/*'
- 'ciflow/cpu/*'
- 'ciflow/linux/*'
- 'ciflow/noarch/*'
- 'ciflow/trunk/*'
branches:
- master
- release/*
- fbsync
workflow_dispatch:
env:
@@ -41,16 +45,8 @@ jobs:
build:
runs-on: linux.2xlarge
timeout-minutes: 240
if: ${{ (github.repository == 'pytorch/pytorch') && (
(github.event_name == 'push') ||
(github.event_name == 'schedule') ||
(contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/cpu') || contains(github.event.pull_request.labels.*.name, 'ciflow/default') || contains(github.event.pull_request.labels.*.name, 'ciflow/linux') || contains(github.event.pull_request.labels.*.name, 'ciflow/noarch') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk')) ||
((github.event_name == 'pull_request' && github.event.action != 'unassigned') && !contains(join(github.event.pull_request.labels.*.name), 'ciflow/')))
}}
env:
JOB_BASE_NAME: linux-bionic-py3.7-clang9-build
IS_PROBOT_TRIGGER_EVENT: ${{ (github.event.action == 'unassigned') && (github.event.assignee.login == 'pytorchbot') }}
LABEL_CONDITIONS: ${{ contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/cpu') || contains(github.event.pull_request.labels.*.name, 'ciflow/default') || contains(github.event.pull_request.labels.*.name, 'ciflow/linux') || contains(github.event.pull_request.labels.*.name, 'ciflow/noarch') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk') }}
outputs:
docker_image: ${{ steps.calculate-tag.outputs.docker_image }}
steps:
@@ -75,7 +71,10 @@ jobs:
AWS_MAX_ATTEMPTS: 5
run: |
AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\")
aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
retry () {
"$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}
retry aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
--password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com"
- name: Chown workspace
run: |
@@ -87,8 +86,8 @@ jobs:
docker run --pull=never --rm -v "$(pwd)":/v -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
- name: Clean workspace
run: |
rm -rf "${GITHUB_WORKSPACE:?}/*"
rm -f ~/.ssh/authorized_keys
rm -rf "${GITHUB_WORKSPACE}"
mkdir "${GITHUB_WORKSPACE}"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: seemethere/add-github-ssh-key@v1
with:
@@ -202,7 +201,7 @@ jobs:
SCRIBE_GRAPHQL_ACCESS_TOKEN: ${{ secrets.SCRIBE_GRAPHQL_ACCESS_TOKEN }}
BRANCH: ${{ steps.parse-ref.outputs.branch }}
TAG: ${{ steps.parse-ref.outputs.tag }}
WORKFLOW_ID: '${{ github.run_id }}_${{ github.run_number }}'
WORKFLOW_ID: '${{ github.run_id }}'
run: |
COMMIT_TIME=$(git log --max-count=1 --format=%ct || echo 0)
export COMMIT_TIME
@@ -321,7 +320,10 @@ jobs:
AWS_MAX_ATTEMPTS: 5
run: |
AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\")
aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
retry () {
"$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}
retry aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
--password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com"
- name: Chown workspace
run: |
@@ -333,8 +335,8 @@ jobs:
docker run --pull=never --rm -v "$(pwd)":/v -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
- name: Clean workspace
run: |
rm -rf "${GITHUB_WORKSPACE:?}/*"
rm -f ~/.ssh/authorized_keys
rm -rf "${GITHUB_WORKSPACE}"
mkdir "${GITHUB_WORKSPACE}"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: seemethere/add-github-ssh-key@v1
with:
@@ -515,7 +517,7 @@ jobs:
PR_NUMBER: ${{ github.event.pull_request.number }}
SHA1: ${{ github.event.pull_request.head.sha || github.sha }}
TAG: ${{ steps.parse-ref.outputs.tag }}
WORKFLOW_ID: '${{ github.run_id }}_${{ github.run_number }}'
WORKFLOW_ID: '${{ github.run_id }}'
shell: bash
run: |
python3 -m pip install -r requirements.txt


@@ -0,0 +1,513 @@
# @generated DO NOT EDIT MANUALLY
# Template is at: .github/templates/linux_ci_workflow.yml.j2
# Generation script: .github/scripts/generate_ci_workflows.py
name: linux-bionic-rocm4.5-py3.7
on:
pull_request:
push:
tags:
- 'ciflow/all/*'
- 'ciflow/linux/*'
- 'ciflow/rocm/*'
- 'ciflow/trunk/*'
branches:
- master
- release/*
workflow_dispatch:
env:
BUILD_ENVIRONMENT: linux-bionic-rocm4.5-py3.7
DOCKER_IMAGE_BASE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-rocm4.5-py3.7
SCCACHE_BUCKET: ossci-compiler-cache-circleci-v2
XLA_CLANG_CACHE_S3_BUCKET_NAME: ossci-compiler-clang-cache-circleci-xla
TORCH_CUDA_ARCH_LIST: 5.2
IN_CI: 1
IS_GHA: 1
# This is used for the phase of adding wheel tests only, will be removed once completed
IN_WHEEL_TEST: 1
# Used for custom_operator, jit_hooks, custom_backend; see .jenkins/pytorch/build.sh
CUSTOM_TEST_ARTIFACT_BUILD_DIR: build/custom_test_artifacts
ALPINE_IMAGE: "308535385114.dkr.ecr.us-east-1.amazonaws.com/tool/alpine"
PR_LABELS: ${{ toJson(github.event.pull_request.labels.*.name) }}
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
AWS_DEFAULT_REGION: us-east-1
PR_NUMBER: ${{ github.event.pull_request.number }}
SHA1: ${{ github.event.pull_request.head.sha || github.sha }}
PYTORCH_RETRY_TEST_CASES: 1
concurrency:
group: linux-bionic-rocm4.5-py3.7-${{ github.event.pull_request.number || github.sha }}-${{ github.event_name == 'workflow_dispatch' }}
cancel-in-progress: true
jobs:
build:
runs-on: linux.2xlarge
timeout-minutes: 240
env:
JOB_BASE_NAME: linux-bionic-rocm4.5-py3.7-build
outputs:
docker_image: ${{ steps.calculate-tag.outputs.docker_image }}
steps:
- name: print labels
run: echo "${PR_LABELS}"
- name: Display EC2 information
shell: bash
run: |
set -euo pipefail
function get_ec2_metadata() {
# Pulled from instance metadata endpoint for EC2
# see https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instancedata-data-retrieval.html
category=$1
curl -fsSL "http://169.254.169.254/latest/meta-data/${category}"
}
echo "ami-id: $(get_ec2_metadata ami-id)"
echo "instance-id: $(get_ec2_metadata instance-id)"
echo "instance-type: $(get_ec2_metadata instance-type)"
- name: Log in to ECR
env:
AWS_RETRY_MODE: standard
AWS_MAX_ATTEMPTS: 5
run: |
AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\")
retry () {
"$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}
retry aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
--password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com"
- name: Chown workspace
run: |
retry () {
"$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}
retry docker pull "${ALPINE_IMAGE}"
# Ensure the working directory gets chowned back to the current user
docker run --pull=never --rm -v "$(pwd)":/v -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
- name: Clean workspace
run: |
rm -rf "${GITHUB_WORKSPACE}"
mkdir "${GITHUB_WORKSPACE}"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: seemethere/add-github-ssh-key@v1
with:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
- name: Preserve github env variables for use in docker
run: |
env | grep '^GITHUB' > "/tmp/github_env_${GITHUB_RUN_ID}"
- name: Checkout PyTorch
uses: zhouzhuojie/checkout@05b13c9a0d21f08f6d5e64a1d5042246d13619d9
with:
# deep clone, to allow use of git merge-base
fetch-depth: 0
submodules: recursive
- name: Clean PyTorch checkout
run: |
# Remove any artifacts from the previous checkouts
git clean -fxd
- name: Calculate docker image tag
id: calculate-tag
run: |
DOCKER_TAG=$(git rev-parse HEAD:.circleci/docker)
echo "DOCKER_TAG=${DOCKER_TAG}" >> "${GITHUB_ENV}"
echo "DOCKER_IMAGE=${DOCKER_IMAGE_BASE}:${DOCKER_TAG}" >> "${GITHUB_ENV}"
echo "::set-output name=docker_tag::${DOCKER_TAG}"
echo "::set-output name=docker_image::${DOCKER_IMAGE_BASE}:${DOCKER_TAG}"
- name: Check if image should be built
id: check
env:
BASE_REVISION: ${{ github.event.pull_request.base.sha || github.sha }}
run: |
set -x
# Check if image already exists, if it does then skip building it
if docker manifest inspect "${DOCKER_IMAGE_BASE}:${DOCKER_TAG}"; then
exit 0
fi
if [[ "$BASE_REVISION" = "$(git rev-parse HEAD)" ]]; then
# if we're on the base branch then use the parent commit
MERGE_BASE=$(git rev-parse HEAD~)
else
# otherwise we're on a PR, so use the most recent base commit
MERGE_BASE=$(git merge-base HEAD "$BASE_REVISION")
fi
# Covers the case where a previous tag doesn't exist for the tree
# this is only really applicable on trees that don't have `.circleci/docker` at its merge base, i.e. nightly
if ! git rev-parse "$MERGE_BASE:.circleci/docker"; then
echo "Directory '.circleci/docker' not found in commit $MERGE_BASE, you should probably rebase onto a more recent commit"
exit 1
fi
PREVIOUS_DOCKER_TAG=$(git rev-parse "$MERGE_BASE:.circleci/docker")
# If no image exists but the hash is the same as the previous hash then we should error out here
if [[ "${PREVIOUS_DOCKER_TAG}" = "${DOCKER_TAG}" ]]; then
echo "ERROR: Something has gone wrong and the previous image isn't available for the merge-base of your branch"
echo " contact the PyTorch team to restore the original images"
exit 1
fi
echo ::set-output name=rebuild::yes
- name: Build and push docker image
if: ${{ steps.check.outputs.rebuild }}
env:
DOCKER_SKIP_S3_UPLOAD: 1
working-directory: .circleci/docker
run: |
export IMAGE_NAME=${DOCKER_IMAGE_BASE#308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/}
./build_docker.sh
- name: Pull Docker image
run: |
retry () {
"$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}
retry docker pull "${DOCKER_IMAGE}"
- name: Parse ref
id: parse-ref
run: .github/scripts/parse_ref.py
- name: Build
env:
BRANCH: ${{ steps.parse-ref.outputs.branch }}
run: |
# detached container should get cleaned up by teardown_ec2_linux
container_name=$(docker run \
-e BUILD_ENVIRONMENT \
-e JOB_BASE_NAME \
-e MAX_JOBS="$(nproc --ignore=2)" \
-e AWS_DEFAULT_REGION \
-e IS_GHA \
-e PR_NUMBER \
-e SHA1 \
-e BRANCH \
-e GITHUB_RUN_ID \
-e SCCACHE_BUCKET \
-e XLA_CLANG_CACHE_S3_BUCKET_NAME \
-e CUSTOM_TEST_ARTIFACT_BUILD_DIR \
-e SKIP_SCCACHE_INITIALIZATION=1 \
-e TORCH_CUDA_ARCH_LIST \
-e PR_LABELS \
-e http_proxy="http://internal-tf-lb-20210727220640487900000002-835786077.us-east-1.elb.amazonaws.com:3128" -e https_proxy="http://internal-tf-lb-20210727220640487900000002-835786077.us-east-1.elb.amazonaws.com:3128" -e no_proxy="localhost,127.0.0.1,github.com,amazonaws.com,s3.amazonaws.com,169.254.169.254,169.254.170.2,/var/run/docker.sock" \
--env-file="/tmp/github_env_${GITHUB_RUN_ID}" \
--security-opt seccomp=unconfined \
--cap-add=SYS_PTRACE \
--tty \
--detach \
--user jenkins \
-v "${GITHUB_WORKSPACE}:/var/lib/jenkins/workspace" \
-w /var/lib/jenkins/workspace \
"${DOCKER_IMAGE}"
)
docker exec -t "${container_name}" sh -c 'sudo chown -R jenkins . && .jenkins/pytorch/build.sh'
- name: Display and upload binary build size statistics (Click Me)
# temporary hack: set CIRCLE_* vars, until we update
# tools/stats/print_test_stats.py to natively support GitHub Actions
env:
SCRIBE_GRAPHQL_ACCESS_TOKEN: ${{ secrets.SCRIBE_GRAPHQL_ACCESS_TOKEN }}
BRANCH: ${{ steps.parse-ref.outputs.branch }}
TAG: ${{ steps.parse-ref.outputs.tag }}
WORKFLOW_ID: '${{ github.run_id }}'
run: |
COMMIT_TIME=$(git log --max-count=1 --format=%ct || echo 0)
export COMMIT_TIME
pip3 install requests==2.26 boto3==1.16.34
python3 -m tools.stats.upload_binary_size_to_scuba || exit 0
- name: Chown workspace
run: |
# Ensure the working directory gets chowned back to the current user
docker run --rm -v "$(pwd)":/v -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
- name: Archive artifacts into zip
run: |
zip -1 -r artifacts.zip dist/ build/custom_test_artifacts build/lib build/bin .pytorch-test-times.json
- uses: seemethere/upload-artifact-s3@v3
name: Store PyTorch Build Artifacts on S3
with:
name: ${{ env.BUILD_ENVIRONMENT }}
retention-days: 14
if-no-files-found: error
path:
artifacts.zip
- name: Hold runner for 2 hours or until ssh sessions have drained
# Always hold for active ssh sessions
if: always()
run: .github/scripts/wait_for_ssh_to_drain.sh
- name: Chown workspace
if: always()
run: |
# Ensure the working directory gets chowned back to the current user
docker run --rm -v "$(pwd)":/v -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
- name: Kill containers, clean up images
if: always()
run: |
# ignore expansion of "docker ps -q" since it could be empty
# shellcheck disable=SC2046
docker stop $(docker ps -q) || true
# Prune all of the docker images
docker system prune -af
- name: Hold runner for 2 hours or until ssh sessions have drained
# Always hold for active ssh sessions
if: always()
run: .github/scripts/wait_for_ssh_to_drain.sh
- name: Clean up docker images
if: always()
run: |
# Prune all of the docker images
docker system prune -af
generate-test-matrix:
needs: build
runs-on: ubuntu-18.04
timeout-minutes: 240
env:
TEST_RUNNER_TYPE: linux.rocm.gpu
ENABLE_DISTRIBUTED_TEST: 1
ENABLE_JIT_LEGACY_TEST: ''
ENABLE_FX2TRT_TEST: ''
ENABLE_MULTIGPU_TEST: ''
ENABLE_NOGPU_NO_AVX_TEST: ''
ENABLE_NOGPU_NO_AVX2_TEST: ''
ENABLE_SLOW_TEST: ''
ENABLE_DOCS_TEST: ''
ENABLE_BACKWARDS_COMPAT_TEST: ''
ENABLE_XLA_TEST: ''
ENABLE_NOARCH_TEST: ''
NUM_TEST_SHARDS: 2
MULTIGPU_RUNNER_TYPE: linux.rocm.gpu
DISTRIBUTED_GPU_RUNNER_TYPE: linux.rocm.gpu
NOGPU_RUNNER_TYPE: linux.2xlarge
PR_BODY: ${{ github.event.pull_request.body }}
outputs:
matrix: ${{ steps.set-matrix.outputs.matrix }}
render-matrix: ${{ steps.set-matrix.outputs.render-matrix }}
ignore-disabled-issues: ${{ steps.set-matrix.outputs.ignore-disabled-issues }}
container:
image: python:3.9
steps:
- name: Install dependencies
run: pip install typing-extensions==3.10
- name: Clone pytorch/pytorch
uses: zhouzhuojie/checkout@05b13c9a0d21f08f6d5e64a1d5042246d13619d9
- name: Generating test matrix
id: set-matrix
run: .github/scripts/generate_pytorch_test_matrix.py
test:
needs: [build, generate-test-matrix]
strategy:
matrix: ${{ fromJson(needs.generate-test-matrix.outputs.matrix) }}
fail-fast: false
runs-on: ${{ matrix.runner }}
timeout-minutes: 240
env:
DOCKER_IMAGE: ${{ needs.build.outputs.docker_image }}
JOB_BASE_NAME: linux-bionic-rocm4.5-py3.7-test
TEST_CONFIG: ${{ matrix.config }}
SHARD_NUMBER: ${{ matrix.shard }}
NUM_TEST_SHARDS: ${{ matrix.num_shards }}
PYTORCH_IGNORE_DISABLED_ISSUES: ${{ needs.generate-test-matrix.outputs.ignore-disabled-issues }}
steps:
- name: Clean workspace
run: |
rm -rf "${GITHUB_WORKSPACE}"
mkdir "${GITHUB_WORKSPACE}"
- name: Set DOCKER_HOST
run: echo "DOCKER_HOST=unix:///run/user/$(id -u)/docker.sock" >> "${GITHUB_ENV}"
- name: Runner health check system info
if: always()
run: |
cat /etc/os-release || true
cat /etc/apt/sources.list.d/rocm.list || true
cat /opt/rocm/.info/version || true
whoami
- name: Runner health check rocm-smi
if: always()
run: |
rocm-smi
- name: Runner health check rocminfo
if: always()
run: |
rocminfo
- name: Runner health check GPU count
if: always()
run: |
ngpu=$(rocminfo | grep -c -E 'Name:.*\sgfx')
if [[ "x$ngpu" != "x2" && "x$ngpu" != "x4" ]]; then
echo "Failed to detect GPUs on the runner"
exit 1
fi
- name: Runner health check disconnect on failure
if: ${{ failure() }}
run: |
killall runsvc.sh
- name: Preserve github env variables for use in docker
run: |
env | grep '^GITHUB' > "/tmp/github_env_${GITHUB_RUN_ID}"
- name: Checkout PyTorch
uses: zhouzhuojie/checkout@05b13c9a0d21f08f6d5e64a1d5042246d13619d9
with:
# deep clone, to allow use of git merge-base
fetch-depth: 0
submodules: recursive
- name: Clean PyTorch checkout
run: |
# Remove any artifacts from the previous checkouts
git clean -fxd
- name: Pull Docker image
run: |
retry () {
"$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}
retry docker pull "${DOCKER_IMAGE}"
- name: ROCm set GPU_FLAG
if: ${{ contains(env.BUILD_ENVIRONMENT, 'rocm') && !contains(matrix.config, 'nogpu') }}
run: |
echo "GPU_FLAG=--device=/dev/mem --device=/dev/kfd --device=/dev/dri --group-add video --group-add daemon" >> "${GITHUB_ENV}"
- name: Determine shm-size
run: |
shm_size="1g"
case "${BUILD_ENVIRONMENT}" in
*cuda*)
shm_size="2g"
;;
*rocm*)
shm_size="8g"
;;
esac
echo "SHM_SIZE=${shm_size}" >> "${GITHUB_ENV}"
- uses: seemethere/download-artifact-s3@0504774707cbc8603d7dca922e8026eb8bf3b47b
name: Download PyTorch Build Artifacts
with:
name: ${{ env.BUILD_ENVIRONMENT }}
- name: Unzip artifacts
run: |
unzip -o artifacts.zip
- name: Output disk space left
run: |
df -H
- name: Parse ref
id: parse-ref
run: .github/scripts/parse_ref.py
- name: Test
env:
PR_NUMBER: ${{ github.event.pull_request.number }}
BRANCH: ${{ steps.parse-ref.outputs.branch }}
# Time out the test phase after 240 minutes
timeout-minutes: 240
run: |
set -x
if [[ $TEST_CONFIG == 'multigpu' ]]; then
TEST_COMMAND=.jenkins/pytorch/multigpu-test.sh
elif [[ $BUILD_ENVIRONMENT == *onnx* ]]; then
TEST_COMMAND=.jenkins/caffe2/test.sh
else
TEST_COMMAND=.jenkins/pytorch/test.sh
fi
# detached container should get cleaned up by teardown_ec2_linux
# TODO: Stop building test binaries as part of the build phase
# Used for GPU_FLAG since that doesn't play nice
# shellcheck disable=SC2086,SC2090
container_name=$(docker run \
${GPU_FLAG:-} \
-e BUILD_ENVIRONMENT \
-e PR_NUMBER \
-e CUSTOM_TEST_ARTIFACT_BUILD_DIR \
-e GITHUB_ACTIONS \
-e IN_CI \
-e IS_GHA \
-e BRANCH \
-e SHA1 \
-e AWS_DEFAULT_REGION \
-e IN_WHEEL_TEST \
-e SHARD_NUMBER \
-e JOB_BASE_NAME \
-e TEST_CONFIG \
-e NUM_TEST_SHARDS \
-e PYTORCH_IGNORE_DISABLED_ISSUES \
-e PYTORCH_RETRY_TEST_CASES \
-e PR_LABELS \
-e MAX_JOBS="$(nproc --ignore=2)" \
-e SCCACHE_BUCKET \
-e XLA_CLANG_CACHE_S3_BUCKET_NAME \
--env-file="/tmp/github_env_${GITHUB_RUN_ID}" \
--ulimit stack=10485760:83886080 \
--security-opt seccomp=unconfined \
--cap-add=SYS_PTRACE \
--shm-size="${SHM_SIZE}" \
--tty \
--detach \
--name="${container_name}" \
--user jenkins \
-v "${GITHUB_WORKSPACE}:/var/lib/jenkins/workspace" \
-w /var/lib/jenkins/workspace \
"${DOCKER_IMAGE}"
)
# jenkins user does not have write permission to mounted workspace; work-around by copying within container to jenkins home
docker exec -t "${container_name}" sh -c "cd .. && cp -R workspace pytorch && cd pytorch && pip install dist/*.whl && ${TEST_COMMAND}"
# copy test results back to the mounted workspace, needed sudo, resulting permissions were correct
docker exec -t "${container_name}" sh -c "cd ../pytorch && sudo cp -R test/test-reports ../workspace/test"
- name: Install render_test_results dependencies
if: always()
shell: bash
run: |
python3 -m pip install junitparser==2.1.1 rich==10.9.0
- name: "[[ Click me for rendered test results (useful for finding failing tests) ]]"
if: always()
shell: bash
# Encoding is weird on windows, just try to default to utf-8 if possible
env:
PYTHONIOENCODING: "utf-8"
run: |
python3 tools/render_junit.py test/
- name: Zip JSONs for upload
if: always()
env:
FILE_SUFFIX: '${{ github.job }}-${{ matrix.config }}-${{ matrix.shard }}-${{ matrix.num_shards }}-${{ matrix.runner }}'
run: |
# Remove any previous test jsons if they exist
rm -f test-jsons-*.zip
zip -r "test-jsons-${FILE_SUFFIX}.zip" test -i '*.json'
- uses: actions/upload-artifact@v2
name: Store Test Downloaded JSONs on Github
if: always()
with:
retention-days: 14
if-no-files-found: warn
path:
test-jsons-*.zip
- name: Zip test reports for upload
if: always()
env:
FILE_SUFFIX: '${{ github.job }}-${{ matrix.config }}-${{ matrix.shard }}-${{ matrix.num_shards }}-${{ matrix.runner }}'
run: |
# Remove any previous test reports if they exist
rm -f test-reports-*.zip
zip -r "test-reports-${FILE_SUFFIX}.zip" test -i '*.xml'
- uses: actions/upload-artifact@v2
name: Store Test Reports on Github
if: always()
with:
name: test-reports
retention-days: 14
if-no-files-found: error
path:
test-reports-*.zip
- name: Display and upload test statistics (Click Me)
if: always()
# temporary hack: set CIRCLE_* vars, until we update
# tools/stats/print_test_stats.py to natively support GitHub Actions
env:
AWS_DEFAULT_REGION: us-east-1
BRANCH: ${{ steps.parse-ref.outputs.branch }}
JOB_BASE_NAME: linux-bionic-rocm4.5-py3.7-test
PR_NUMBER: ${{ github.event.pull_request.number }}
SHA1: ${{ github.event.pull_request.head.sha || github.sha }}
TAG: ${{ steps.parse-ref.outputs.tag }}
WORKFLOW_ID: '${{ github.run_id }}'
shell: bash
run: |
python3 -m pip install -r requirements.txt
python3 -m pip install boto3==1.19.12
python3 -m tools.stats.print_test_stats --upload-to-s3 --compare-with-s3 test
- name: Kill containers, clean up images
if: always()
run: |
# ignore expansion of "docker ps -q" since it could be empty
# shellcheck disable=SC2046
docker stop $(docker ps -q) || true
# Prune all of the docker images
docker system prune -af
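The three-attempt `retry` helper that recurs throughout these workflow steps can be exercised as a standalone sketch. `flaky` below is a hypothetical demo command (not part of the workflow) that fails on its first call and succeeds afterwards:

```shell
#!/bin/bash
# The retry pattern copied from the workflow steps above:
# run the command, and on failure retry after 1s, then after 2s.
retry () {
  "$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}

# Hypothetical flaky command for illustration: fails until its
# marker file exists, mimicking a transient docker pull / ECR failure.
flaky () {
  if [ -f /tmp/retry_demo_marker ]; then
    echo "succeeded"
  else
    touch /tmp/retry_demo_marker
    return 1
  fi
}

rm -f /tmp/retry_demo_marker
retry flaky   # first attempt fails, second attempt prints "succeeded"
```

Note the retried attempts run in subshells, so the helper preserves the command's final exit status: after three failures, `retry` itself returns nonzero, which is what lets the `Log in to ECR` and `Pull Docker image` steps fail the job.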


@ -4,8 +4,15 @@
name: linux-docs-push
on:
pull_request:
types: [opened, synchronize, reopened, unassigned]
push:
tags:
# NOTE: Binary build pipelines should only get triggered on release candidate builds
# Release candidate tags look like: v1.11.0-rc1
- v[0-9]+.[0-9]+.[0-9]+-rc[0-9]+
- 'ciflow/all/*'
- 'ciflow/cpu/*'
- 'ciflow/linux/*'
- 'ciflow/scheduled/*'
schedule:
- cron: 0 0 * * *
workflow_dispatch:
@ -38,16 +45,8 @@ jobs:
build:
runs-on: linux.2xlarge
timeout-minutes: 240
if: ${{ (github.repository == 'pytorch/pytorch') && (
(github.event_name == 'push') ||
(github.event_name == 'schedule') ||
(contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/cpu') || contains(github.event.pull_request.labels.*.name, 'ciflow/linux') || contains(github.event.pull_request.labels.*.name, 'ciflow/scheduled')) ||
(false))
}}
env:
JOB_BASE_NAME: linux-docs-push-build
IS_PROBOT_TRIGGER_EVENT: ${{ (github.event.action == 'unassigned') && (github.event.assigneed.login == 'pytorchbot') }}
LABEL_CONDITIONS: ${{ contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/cpu') || contains(github.event.pull_request.labels.*.name, 'ciflow/linux') || contains(github.event.pull_request.labels.*.name, 'ciflow/scheduled') }}
outputs:
docker_image: ${{ steps.calculate-tag.outputs.docker_image }}
steps:
@ -72,7 +71,10 @@ jobs:
AWS_MAX_ATTEMPTS: 5
run: |
AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\")
aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
retry () {
"$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}
retry aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
--password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com"
- name: Chown workspace
run: |
@ -84,8 +86,8 @@ jobs:
docker run --pull=never --rm -v "$(pwd)":/v -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
- name: Clean workspace
run: |
rm -rf "${GITHUB_WORKSPACE:?}/*"
rm -f ~/.ssh/authorized_keys
rm -rf "${GITHUB_WORKSPACE}"
mkdir "${GITHUB_WORKSPACE}"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: seemethere/add-github-ssh-key@v1
with:
@ -199,7 +201,7 @@ jobs:
SCRIBE_GRAPHQL_ACCESS_TOKEN: ${{ secrets.SCRIBE_GRAPHQL_ACCESS_TOKEN }}
BRANCH: ${{ steps.parse-ref.outputs.branch }}
TAG: ${{ steps.parse-ref.outputs.tag }}
WORKFLOW_ID: '${{ github.run_id }}_${{ github.run_number }}'
WORKFLOW_ID: '${{ github.run_id }}'
run: |
COMMIT_TIME=$(git log --max-count=1 --format=%ct || echo 0)
export COMMIT_TIME
@ -256,7 +258,7 @@ jobs:
env:
DOCKER_IMAGE: ${{ needs.build.outputs.docker_image }}
DOCS_TYPE: ${{ matrix.docs_type }}
WITH_PUSH: ${{ github.event_name == 'schedule' }}
WITH_PUSH: ${{ github.event_name == 'schedule' || startsWith(github.event.ref, 'refs/tags/v') }}
steps:
- name: Display EC2 information
shell: bash
@ -277,7 +279,10 @@ jobs:
AWS_MAX_ATTEMPTS: 5
run: |
AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\")
aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
retry () {
"$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}
retry aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
--password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com"
- name: Chown workspace
run: |
@ -289,8 +294,8 @@ jobs:
docker run --pull=never --rm -v "$(pwd)":/v -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
- name: Clean workspace
run: |
rm -rf "${GITHUB_WORKSPACE:?}/*"
rm -f ~/.ssh/authorized_keys
rm -rf "${GITHUB_WORKSPACE}"
mkdir "${GITHUB_WORKSPACE}"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: seemethere/add-github-ssh-key@v1
with:
@ -322,7 +327,7 @@ jobs:
run: |
unzip -o artifacts.zip
- name: Generate netrc (only for docs-push)
if: ${{ github.event_name == 'schedule' }}
if: ${{ github.event_name == 'schedule' || startsWith(github.event.ref, 'refs/tags/v') }}
env:
GITHUB_PYTORCHBOT_TOKEN: ${{ secrets.GH_PYTORCHBOT_TOKEN }}
run: |
@ -334,9 +339,12 @@ jobs:
run: |
set -ex
time docker pull "${DOCKER_IMAGE}" > /dev/null
echo "${GITHUB_REF}"
# TODO: Set it correctly when workflows are scheduled on tags
target="master"
# Convert refs/tags/v1.12.0rc3 into 1.12
if [[ "${GITHUB_REF}" =~ ^refs/tags/v([0-9]+\.[0-9]+)\.* ]]; then
target="${BASH_REMATCH[1]}"
else
target="master"
fi
# detached container should get cleaned up by teardown_ec2_linux
container_name=$(docker run \
-e BUILD_ENVIRONMENT \
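The docs-push hunk above maps a release-candidate tag ref onto a docs target branch, falling back to `master`. A standalone bash sketch of that mapping (the `ref_to_docs_target` wrapper name is an assumption for illustration; the regex is taken from the diff):

```shell
#!/bin/bash
# Convert refs/tags/v1.12.0-rc3 into "1.12"; anything else maps to "master".
ref_to_docs_target () {
  local ref="$1" target="master"
  if [[ "$ref" =~ ^refs/tags/v([0-9]+\.[0-9]+)\. ]]; then
    # BASH_REMATCH[1] holds the captured major.minor version
    target="${BASH_REMATCH[1]}"
  fi
  echo "$target"
}

ref_to_docs_target "refs/tags/v1.12.0-rc3"   # prints 1.12
ref_to_docs_target "refs/heads/master"       # prints master
```

This relies on bash's `[[ =~ ]]` operator and `BASH_REMATCH`, matching the `${GITHUB_REF}` logic in the workflow step.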


@ -5,12 +5,16 @@ name: linux-docs
on:
pull_request:
types: [opened, synchronize, reopened, unassigned]
push:
tags:
- 'ciflow/all/*'
- 'ciflow/cpu/*'
- 'ciflow/docs/*'
- 'ciflow/linux/*'
- 'ciflow/trunk/*'
branches:
- master
- release/*
- fbsync
workflow_dispatch:
env:
@ -41,16 +45,8 @@ jobs:
build:
runs-on: linux.2xlarge
timeout-minutes: 240
if: ${{ (github.repository == 'pytorch/pytorch') && (
(github.event_name == 'push') ||
(github.event_name == 'schedule') ||
(contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/cpu') || contains(github.event.pull_request.labels.*.name, 'ciflow/default') || contains(github.event.pull_request.labels.*.name, 'ciflow/docs') || contains(github.event.pull_request.labels.*.name, 'ciflow/linux') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk')) ||
((github.event_name == 'pull_request' && github.event.action != 'unassigned') && !contains(join(github.event.pull_request.labels.*.name), 'ciflow/')))
}}
env:
JOB_BASE_NAME: linux-docs-build
IS_PROBOT_TRIGGER_EVENT: ${{ (github.event.action == 'unassigned') && (github.event.assigneed.login == 'pytorchbot') }}
LABEL_CONDITIONS: ${{ contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/cpu') || contains(github.event.pull_request.labels.*.name, 'ciflow/default') || contains(github.event.pull_request.labels.*.name, 'ciflow/docs') || contains(github.event.pull_request.labels.*.name, 'ciflow/linux') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk') }}
outputs:
docker_image: ${{ steps.calculate-tag.outputs.docker_image }}
steps:
@ -75,7 +71,10 @@ jobs:
AWS_MAX_ATTEMPTS: 5
run: |
AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\")
aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
retry () {
"$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}
retry aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
--password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com"
- name: Chown workspace
run: |
@ -87,8 +86,8 @@ jobs:
docker run --pull=never --rm -v "$(pwd)":/v -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
- name: Clean workspace
run: |
rm -rf "${GITHUB_WORKSPACE:?}/*"
rm -f ~/.ssh/authorized_keys
rm -rf "${GITHUB_WORKSPACE}"
mkdir "${GITHUB_WORKSPACE}"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: seemethere/add-github-ssh-key@v1
with:
@ -202,7 +201,7 @@ jobs:
SCRIBE_GRAPHQL_ACCESS_TOKEN: ${{ secrets.SCRIBE_GRAPHQL_ACCESS_TOKEN }}
BRANCH: ${{ steps.parse-ref.outputs.branch }}
TAG: ${{ steps.parse-ref.outputs.tag }}
WORKFLOW_ID: '${{ github.run_id }}_${{ github.run_number }}'
WORKFLOW_ID: '${{ github.run_id }}'
run: |
COMMIT_TIME=$(git log --max-count=1 --format=%ct || echo 0)
export COMMIT_TIME
@ -259,7 +258,7 @@ jobs:
env:
DOCKER_IMAGE: ${{ needs.build.outputs.docker_image }}
DOCS_TYPE: ${{ matrix.docs_type }}
WITH_PUSH: ${{ github.event_name == 'schedule' }}
WITH_PUSH: ${{ github.event_name == 'schedule' || startsWith(github.event.ref, 'refs/tags/v') }}
steps:
- name: Display EC2 information
shell: bash
@ -280,7 +279,10 @@ jobs:
AWS_MAX_ATTEMPTS: 5
run: |
AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\")
aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
retry () {
"$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}
retry aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
--password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com"
- name: Chown workspace
run: |
@ -292,8 +294,8 @@ jobs:
docker run --pull=never --rm -v "$(pwd)":/v -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
- name: Clean workspace
run: |
rm -rf "${GITHUB_WORKSPACE:?}/*"
rm -f ~/.ssh/authorized_keys
rm -rf "${GITHUB_WORKSPACE}"
mkdir "${GITHUB_WORKSPACE}"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: seemethere/add-github-ssh-key@v1
with:
@ -328,9 +330,12 @@ jobs:
run: |
set -ex
time docker pull "${DOCKER_IMAGE}" > /dev/null
echo "${GITHUB_REF}"
# TODO: Set it correctly when workflows are scheduled on tags
target="master"
# Convert refs/tags/v1.12.0rc3 into 1.12
if [[ "${GITHUB_REF}" =~ ^refs/tags/v([0-9]+\.[0-9]+)\.* ]]; then
target="${BASH_REMATCH[1]}"
else
target="master"
fi
# detached container should get cleaned up by teardown_ec2_linux
container_name=$(docker run \
-e BUILD_ENVIRONMENT \


@ -5,12 +5,16 @@ name: linux-vulkan-bionic-py3.7-clang9
on:
pull_request:
types: [opened, synchronize, reopened, unassigned]
push:
tags:
- 'ciflow/all/*'
- 'ciflow/cpu/*'
- 'ciflow/linux/*'
- 'ciflow/trunk/*'
- 'ciflow/vulkan/*'
branches:
- master
- release/*
- fbsync
workflow_dispatch:
env:
@ -41,16 +45,8 @@ jobs:
build:
runs-on: linux.2xlarge
timeout-minutes: 240
if: ${{ (github.repository == 'pytorch/pytorch') && (
(github.event_name == 'push') ||
(github.event_name == 'schedule') ||
(contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/cpu') || contains(github.event.pull_request.labels.*.name, 'ciflow/default') || contains(github.event.pull_request.labels.*.name, 'ciflow/linux') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk') || contains(github.event.pull_request.labels.*.name, 'ciflow/vulkan')) ||
((github.event_name == 'pull_request' && github.event.action != 'unassigned') && !contains(join(github.event.pull_request.labels.*.name), 'ciflow/')))
}}
env:
JOB_BASE_NAME: linux-vulkan-bionic-py3.7-clang9-build
IS_PROBOT_TRIGGER_EVENT: ${{ (github.event.action == 'unassigned') && (github.event.assigneed.login == 'pytorchbot') }}
LABEL_CONDITIONS: ${{ contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/cpu') || contains(github.event.pull_request.labels.*.name, 'ciflow/default') || contains(github.event.pull_request.labels.*.name, 'ciflow/linux') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk') || contains(github.event.pull_request.labels.*.name, 'ciflow/vulkan') }}
outputs:
docker_image: ${{ steps.calculate-tag.outputs.docker_image }}
steps:
@ -75,7 +71,10 @@ jobs:
AWS_MAX_ATTEMPTS: 5
run: |
AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\")
aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
retry () {
"$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}
retry aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
--password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com"
- name: Chown workspace
run: |
@ -87,8 +86,8 @@ jobs:
docker run --pull=never --rm -v "$(pwd)":/v -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
- name: Clean workspace
run: |
rm -rf "${GITHUB_WORKSPACE:?}/*"
rm -f ~/.ssh/authorized_keys
rm -rf "${GITHUB_WORKSPACE}"
mkdir "${GITHUB_WORKSPACE}"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: seemethere/add-github-ssh-key@v1
with:
@ -202,7 +201,7 @@ jobs:
SCRIBE_GRAPHQL_ACCESS_TOKEN: ${{ secrets.SCRIBE_GRAPHQL_ACCESS_TOKEN }}
BRANCH: ${{ steps.parse-ref.outputs.branch }}
TAG: ${{ steps.parse-ref.outputs.tag }}
WORKFLOW_ID: '${{ github.run_id }}_${{ github.run_number }}'
WORKFLOW_ID: '${{ github.run_id }}'
run: |
COMMIT_TIME=$(git log --max-count=1 --format=%ct || echo 0)
export COMMIT_TIME
@ -321,7 +320,10 @@ jobs:
AWS_MAX_ATTEMPTS: 5
run: |
AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\")
aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
retry () {
"$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}
retry aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
--password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com"
- name: Chown workspace
run: |
@ -333,8 +335,8 @@ jobs:
docker run --pull=never --rm -v "$(pwd)":/v -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
- name: Clean workspace
run: |
rm -rf "${GITHUB_WORKSPACE:?}/*"
rm -f ~/.ssh/authorized_keys
rm -rf "${GITHUB_WORKSPACE}"
mkdir "${GITHUB_WORKSPACE}"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: seemethere/add-github-ssh-key@v1
with:
@ -515,7 +517,7 @@ jobs:
PR_NUMBER: ${{ github.event.pull_request.number }}
SHA1: ${{ github.event.pull_request.head.sha || github.sha }}
TAG: ${{ steps.parse-ref.outputs.tag }}
WORKFLOW_ID: '${{ github.run_id }}_${{ github.run_number }}'
WORKFLOW_ID: '${{ github.run_id }}'
shell: bash
run: |
python3 -m pip install -r requirements.txt


@ -5,12 +5,16 @@ name: linux-xenial-cuda11.3-py3.7-gcc7-bazel-test
on:
pull_request:
types: [opened, synchronize, reopened, unassigned]
push:
tags:
- 'ciflow/all/*'
- 'ciflow/bazel/*'
- 'ciflow/cpu/*'
- 'ciflow/linux/*'
- 'ciflow/trunk/*'
branches:
- master
- release/*
- fbsync
workflow_dispatch:
env:
@ -44,14 +48,6 @@ jobs:
env:
JOB_BASE_NAME: linux-xenial-cuda11.3-py3.7-gcc7-bazel-test-build-and-test
NUM_TEST_SHARDS: 1
IS_PROBOT_TRIGGER_EVENT: ${{ (github.event.action == 'unassigned') && (github.event.assigneed.login == 'pytorchbot') }}
LABEL_CONDITIONS: ${{ contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/bazel') || contains(github.event.pull_request.labels.*.name, 'ciflow/cpu') || contains(github.event.pull_request.labels.*.name, 'ciflow/default') || contains(github.event.pull_request.labels.*.name, 'ciflow/linux') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk') }}
if: ${{ (github.repository == 'pytorch/pytorch') && (
(github.event_name == 'push') ||
(github.event_name == 'schedule') ||
(contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/bazel') || contains(github.event.pull_request.labels.*.name, 'ciflow/cpu') || contains(github.event.pull_request.labels.*.name, 'ciflow/default') || contains(github.event.pull_request.labels.*.name, 'ciflow/linux') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk')) ||
((github.event_name == 'pull_request' && github.event.action != 'unassigned') && !contains(join(github.event.pull_request.labels.*.name), 'ciflow/')))
}}
steps:
- name: print labels
run: echo "${PR_LABELS}"
@ -74,7 +70,10 @@ jobs:
AWS_MAX_ATTEMPTS: 5
run: |
AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\")
aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
retry () {
"$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}
retry aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
--password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com"
- name: Chown workspace
run: |
@ -86,8 +85,8 @@ jobs:
docker run --pull=never --rm -v "$(pwd)":/v -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
- name: Clean workspace
run: |
rm -rf "${GITHUB_WORKSPACE:?}/*"
rm -f ~/.ssh/authorized_keys
rm -rf "${GITHUB_WORKSPACE}"
mkdir "${GITHUB_WORKSPACE}"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: seemethere/add-github-ssh-key@v1
with:
@ -213,7 +212,7 @@ jobs:
PR_NUMBER: ${{ github.event.pull_request.number }}
SHA1: ${{ github.event.pull_request.head.sha || github.sha }}
TAG: ${{ steps.parse-ref.outputs.tag }}
WORKFLOW_ID: '${{ github.run_id }}_${{ github.run_number }}'
WORKFLOW_ID: '${{ github.run_id }}'
run: |
# The artifact file is created inside docker container, which contains the result binaries.
# Now unpackage it into the project folder. The subsequent script will scan project folder
@ -312,7 +311,7 @@ jobs:
PR_NUMBER: ${{ github.event.pull_request.number }}
SHA1: ${{ github.event.pull_request.head.sha || github.sha }}
TAG: ${{ steps.parse-ref.outputs.tag }}
WORKFLOW_ID: '${{ github.run_id }}_${{ github.run_number }}'
WORKFLOW_ID: '${{ github.run_id }}'
shell: bash
run: |
python3 -m pip install -r requirements.txt


@ -0,0 +1,248 @@
# @generated DO NOT EDIT MANUALLY
# Template is at: .github/templates/linux_ci_workflow.yml.j2
# Generation script: .github/scripts/generate_ci_workflows.py
name: linux-xenial-cuda11.3-py3.7-gcc7-no-ops
on:
push:
tags:
- 'ciflow/all/*'
- 'ciflow/cuda/*'
- 'ciflow/linux/*'
- 'ciflow/trunk/*'
branches:
- master
- release/*
workflow_dispatch:
env:
BUILD_ENVIRONMENT: linux-xenial-cuda11.3-py3.7-gcc7-no-ops
DOCKER_IMAGE_BASE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-xenial-cuda11.3-cudnn8-py3-gcc7
SCCACHE_BUCKET: ossci-compiler-cache-circleci-v2
XLA_CLANG_CACHE_S3_BUCKET_NAME: ossci-compiler-clang-cache-circleci-xla
TORCH_CUDA_ARCH_LIST: 5.2
IN_CI: 1
IS_GHA: 1
# This is used for the phase of adding wheel tests only, will be removed once completed
IN_WHEEL_TEST: 1
# Used for custom_operator, jit_hooks, custom_backend, see .jenkins/pytorch/build.sh
CUSTOM_TEST_ARTIFACT_BUILD_DIR: build/custom_test_artifacts
ALPINE_IMAGE: "308535385114.dkr.ecr.us-east-1.amazonaws.com/tool/alpine"
PR_LABELS: ${{ toJson(github.event.pull_request.labels.*.name) }}
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
AWS_DEFAULT_REGION: us-east-1
PR_NUMBER: ${{ github.event.pull_request.number }}
SHA1: ${{ github.event.pull_request.head.sha || github.sha }}
PYTORCH_RETRY_TEST_CASES: 1
concurrency:
group: linux-xenial-cuda11.3-py3.7-gcc7-no-ops-${{ github.event.pull_request.number || github.sha }}-${{ github.event_name == 'workflow_dispatch' }}
cancel-in-progress: true
jobs:
build:
runs-on: linux.2xlarge
timeout-minutes: 240
env:
JOB_BASE_NAME: linux-xenial-cuda11.3-py3.7-gcc7-no-ops-build
outputs:
docker_image: ${{ steps.calculate-tag.outputs.docker_image }}
steps:
- name: print labels
run: echo "${PR_LABELS}"
- name: Display EC2 information
shell: bash
run: |
set -euo pipefail
function get_ec2_metadata() {
# Pulled from instance metadata endpoint for EC2
# see https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instancedata-data-retrieval.html
category=$1
curl -fsSL "http://169.254.169.254/latest/meta-data/${category}"
}
echo "ami-id: $(get_ec2_metadata ami-id)"
echo "instance-id: $(get_ec2_metadata instance-id)"
echo "instance-type: $(get_ec2_metadata instance-type)"
- name: Log in to ECR
env:
AWS_RETRY_MODE: standard
AWS_MAX_ATTEMPTS: 5
run: |
AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\")
retry () {
"$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}
retry aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
--password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com"
- name: Chown workspace
run: |
retry () {
"$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}
retry docker pull "${ALPINE_IMAGE}"
# Ensure the working directory gets chowned back to the current user
docker run --pull=never --rm -v "$(pwd)":/v -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
- name: Clean workspace
run: |
rm -rf "${GITHUB_WORKSPACE}"
mkdir "${GITHUB_WORKSPACE}"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: seemethere/add-github-ssh-key@v1
with:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
- name: Preserve github env variables for use in docker
run: |
env | grep '^GITHUB' > "/tmp/github_env_${GITHUB_RUN_ID}"
- name: Checkout PyTorch
uses: zhouzhuojie/checkout@05b13c9a0d21f08f6d5e64a1d5042246d13619d9
with:
# deep clone, to allow use of git merge-base
fetch-depth: 0
submodules: recursive
- name: Clean PyTorch checkout
run: |
# Remove any artifacts from the previous checkouts
git clean -fxd
- name: Calculate docker image tag
id: calculate-tag
run: |
DOCKER_TAG=$(git rev-parse HEAD:.circleci/docker)
echo "DOCKER_TAG=${DOCKER_TAG}" >> "${GITHUB_ENV}"
echo "DOCKER_IMAGE=${DOCKER_IMAGE_BASE}:${DOCKER_TAG}" >> "${GITHUB_ENV}"
echo "::set-output name=docker_tag::${DOCKER_TAG}"
echo "::set-output name=docker_image::${DOCKER_IMAGE_BASE}:${DOCKER_TAG}"
- name: Check if image should be built
id: check
env:
BASE_REVISION: ${{ github.event.pull_request.base.sha || github.sha }}
run: |
set -x
# Check if image already exists, if it does then skip building it
if docker manifest inspect "${DOCKER_IMAGE_BASE}:${DOCKER_TAG}"; then
exit 0
fi
if [[ "$BASE_REVISION" = "$(git rev-parse HEAD)" ]]; then
# if we're on the base branch then use the parent commit
MERGE_BASE=$(git rev-parse HEAD~)
else
# otherwise we're on a PR, so use the most recent base commit
MERGE_BASE=$(git merge-base HEAD "$BASE_REVISION")
fi
# Covers the case where a previous tag doesn't exist for the tree
# this is only really applicable on trees that don't have `.circleci/docker` at its merge base, i.e. nightly
if ! git rev-parse "$MERGE_BASE:.circleci/docker"; then
echo "Directory '.circleci/docker' not found in commit $MERGE_BASE, you should probably rebase onto a more recent commit"
exit 1
fi
PREVIOUS_DOCKER_TAG=$(git rev-parse "$MERGE_BASE:.circleci/docker")
# If no image exists but the hash is the same as the previous hash then we should error out here
if [[ "${PREVIOUS_DOCKER_TAG}" = "${DOCKER_TAG}" ]]; then
echo "ERROR: Something has gone wrong and the previous image isn't available for the merge-base of your branch"
echo " contact the PyTorch team to restore the original images"
exit 1
fi
echo ::set-output name=rebuild::yes
- name: Build and push docker image
if: ${{ steps.check.outputs.rebuild }}
env:
DOCKER_SKIP_S3_UPLOAD: 1
working-directory: .circleci/docker
run: |
export IMAGE_NAME=${DOCKER_IMAGE_BASE#308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/}
./build_docker.sh
- name: Pull Docker image
run: |
retry () {
"$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}
retry docker pull "${DOCKER_IMAGE}"
- name: Parse ref
id: parse-ref
run: .github/scripts/parse_ref.py
- name: Build
env:
BRANCH: ${{ steps.parse-ref.outputs.branch }}
run: |
# detached container should get cleaned up by teardown_ec2_linux
container_name=$(docker run \
-e BUILD_ENVIRONMENT \
-e JOB_BASE_NAME \
-e MAX_JOBS="$(nproc --ignore=2)" \
-e AWS_DEFAULT_REGION \
-e IS_GHA \
-e PR_NUMBER \
-e SHA1 \
-e BRANCH \
-e GITHUB_RUN_ID \
-e SCCACHE_BUCKET \
-e XLA_CLANG_CACHE_S3_BUCKET_NAME \
-e CUSTOM_TEST_ARTIFACT_BUILD_DIR \
-e SKIP_SCCACHE_INITIALIZATION=1 \
-e TORCH_CUDA_ARCH_LIST \
-e PR_LABELS \
-e http_proxy="http://internal-tf-lb-20210727220640487900000002-835786077.us-east-1.elb.amazonaws.com:3128" -e https_proxy="http://internal-tf-lb-20210727220640487900000002-835786077.us-east-1.elb.amazonaws.com:3128" -e no_proxy="localhost,127.0.0.1,github.com,amazonaws.com,s3.amazonaws.com,169.254.169.254,169.254.170.2,/var/run/docker.sock" \
--env-file="/tmp/github_env_${GITHUB_RUN_ID}" \
--security-opt seccomp=unconfined \
--cap-add=SYS_PTRACE \
--tty \
--detach \
--user jenkins \
-v "${GITHUB_WORKSPACE}:/var/lib/jenkins/workspace" \
-w /var/lib/jenkins/workspace \
"${DOCKER_IMAGE}"
)
docker exec -t "${container_name}" sh -c 'sudo chown -R jenkins . && .jenkins/pytorch/build.sh'
- name: Display and upload binary build size statistics (Click Me)
# temporary hack: set CIRCLE_* vars, until we update
# tools/stats/print_test_stats.py to natively support GitHub Actions
env:
SCRIBE_GRAPHQL_ACCESS_TOKEN: ${{ secrets.SCRIBE_GRAPHQL_ACCESS_TOKEN }}
BRANCH: ${{ steps.parse-ref.outputs.branch }}
TAG: ${{ steps.parse-ref.outputs.tag }}
WORKFLOW_ID: '${{ github.run_id }}'
run: |
COMMIT_TIME=$(git log --max-count=1 --format=%ct || echo 0)
export COMMIT_TIME
pip3 install requests==2.26 boto3==1.16.34
python3 -m tools.stats.upload_binary_size_to_scuba || exit 0
- name: Chown workspace
run: |
# Ensure the working directory gets chowned back to the current user
docker run --rm -v "$(pwd)":/v -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
- name: Archive artifacts into zip
run: |
zip -1 -r artifacts.zip dist/ build/custom_test_artifacts build/lib build/bin .pytorch-test-times.json
- uses: seemethere/upload-artifact-s3@v3
name: Store PyTorch Build Artifacts on S3
with:
name: ${{ env.BUILD_ENVIRONMENT }}
retention-days: 14
if-no-files-found: error
path:
artifacts.zip
- name: Hold runner for 2 hours or until ssh sessions have drained
# Always hold for active ssh sessions
if: always()
run: .github/scripts/wait_for_ssh_to_drain.sh
- name: Chown workspace
if: always()
run: |
# Ensure the working directory gets chowned back to the current user
docker run --rm -v "$(pwd)":/v -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
- name: Kill containers, clean up images
if: always()
run: |
# ignore expansion of "docker ps -q" since it could be empty
# shellcheck disable=SC2046
docker stop $(docker ps -q) || true
# Prune all of the docker images
docker system prune -af
- name: Hold runner for 2 hours or until ssh sessions have drained
# Always hold for active ssh sessions
if: always()
run: .github/scripts/wait_for_ssh_to_drain.sh
- name: Clean up docker images
if: always()
run: |
# Prune all of the docker images
docker system prune -af
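The three-attempt `retry` helper that recurs throughout these workflow steps can be exercised standalone. The sketch below pairs it with a hypothetical `flaky` command (not from the workflow) that succeeds on its third invocation; the attempt counter lives in a temp file because the retries run in subshells.

```shell
#!/usr/bin/env bash
# The retry helper exactly as used in the workflow steps above:
# try once, then retry after 1s, then after 2s, before giving up.
retry () {
  "$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}

# Hypothetical flaky command that fails on its first two invocations.
# State is kept in a file because each retry runs in a subshell.
state=$(mktemp)
echo 0 > "$state"
flaky () {
  n=$(( $(cat "$state") + 1 ))
  echo "$n" > "$state"
  [ "$n" -ge 3 ]
}

retry flaky && echo "succeeded after $(cat "$state") attempts"
```

Because the fallbacks are subshells, the helper is safe to wrap around pipelines and commands with arguments, which is why the workflows use it for `docker pull` and the ECR login.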

@@ -5,12 +5,15 @@ name: linux-xenial-cuda11.3-py3.7-gcc7
on:
pull_request:
types: [opened, synchronize, reopened, unassigned]
push:
tags:
- 'ciflow/all/*'
- 'ciflow/cuda/*'
- 'ciflow/linux/*'
- 'ciflow/trunk/*'
branches:
- master
- release/*
- fbsync
workflow_dispatch:
env:
@@ -41,16 +44,8 @@ jobs:
build:
runs-on: linux.2xlarge
timeout-minutes: 240
if: ${{ (github.repository == 'pytorch/pytorch') && (
(github.event_name == 'push') ||
(github.event_name == 'schedule') ||
(contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/cuda') || contains(github.event.pull_request.labels.*.name, 'ciflow/default') || contains(github.event.pull_request.labels.*.name, 'ciflow/linux') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk')) ||
((github.event_name == 'pull_request' && github.event.action != 'unassigned') && !contains(join(github.event.pull_request.labels.*.name), 'ciflow/')))
}}
env:
JOB_BASE_NAME: linux-xenial-cuda11.3-py3.7-gcc7-build
IS_PROBOT_TRIGGER_EVENT: ${{ (github.event.action == 'unassigned') && (github.event.assigneed.login == 'pytorchbot') }}
LABEL_CONDITIONS: ${{ contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/cuda') || contains(github.event.pull_request.labels.*.name, 'ciflow/default') || contains(github.event.pull_request.labels.*.name, 'ciflow/linux') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk') }}
outputs:
docker_image: ${{ steps.calculate-tag.outputs.docker_image }}
steps:
@@ -75,7 +70,10 @@ jobs:
AWS_MAX_ATTEMPTS: 5
run: |
AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\")
aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
retry () {
"$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}
retry aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
--password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com"
- name: Chown workspace
run: |
@@ -87,8 +85,8 @@ jobs:
docker run --pull=never --rm -v "$(pwd)":/v -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
- name: Clean workspace
run: |
rm -rf "${GITHUB_WORKSPACE:?}/*"
rm -f ~/.ssh/authorized_keys
rm -rf "${GITHUB_WORKSPACE}"
mkdir "${GITHUB_WORKSPACE}"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: seemethere/add-github-ssh-key@v1
with:
@@ -202,7 +200,7 @@ jobs:
SCRIBE_GRAPHQL_ACCESS_TOKEN: ${{ secrets.SCRIBE_GRAPHQL_ACCESS_TOKEN }}
BRANCH: ${{ steps.parse-ref.outputs.branch }}
TAG: ${{ steps.parse-ref.outputs.tag }}
WORKFLOW_ID: '${{ github.run_id }}_${{ github.run_number }}'
WORKFLOW_ID: '${{ github.run_id }}'
run: |
COMMIT_TIME=$(git log --max-count=1 --format=%ct || echo 0)
export COMMIT_TIME
@@ -321,7 +319,10 @@ jobs:
AWS_MAX_ATTEMPTS: 5
run: |
AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\")
aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
retry () {
"$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}
retry aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
--password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com"
- name: Chown workspace
run: |
@@ -333,8 +334,8 @@ jobs:
docker run --pull=never --rm -v "$(pwd)":/v -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
- name: Clean workspace
run: |
rm -rf "${GITHUB_WORKSPACE:?}/*"
rm -f ~/.ssh/authorized_keys
rm -rf "${GITHUB_WORKSPACE}"
mkdir "${GITHUB_WORKSPACE}"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: seemethere/add-github-ssh-key@v1
with:
@@ -515,7 +516,7 @@ jobs:
PR_NUMBER: ${{ github.event.pull_request.number }}
SHA1: ${{ github.event.pull_request.head.sha || github.sha }}
TAG: ${{ steps.parse-ref.outputs.tag }}
WORKFLOW_ID: '${{ github.run_id }}_${{ github.run_number }}'
WORKFLOW_ID: '${{ github.run_id }}'
shell: bash
run: |
python3 -m pip install -r requirements.txt
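The "Check if image should be built" step in these workflows reduces to three outcomes: skip when the image already exists, fail when the image is missing but the docker tree hash is unchanged (the previously pushed image has been lost), and rebuild otherwise. A minimal sketch of that decision; the `decide` function and its flag values are illustrative, not part of the workflow:

```shell
#!/usr/bin/env bash
# Hypothetical condensation of the image-rebuild check.
decide () {
  image_exists=$1 previous_tag=$2 current_tag=$3
  if [ "$image_exists" = yes ]; then
    echo skip          # image already in the registry: nothing to do
  elif [ "$previous_tag" = "$current_tag" ]; then
    echo error         # missing image with an unchanged tag: unrecoverable
  else
    echo rebuild       # docker files changed: build and push a new image
  fi
}

decide yes abc123 abc123   # image present: skip the build
decide no  abc123 abc123   # missing image, same tag: error out
decide no  abc123 def456   # tag changed: rebuild
```

The error branch matters because a same-tag rebuild could silently diverge from the image every other job on the merge-base was tested against.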

@@ -5,12 +5,15 @@ name: linux-xenial-py3-clang5-mobile-build
on:
pull_request:
types: [opened, synchronize, reopened, unassigned]
push:
tags:
- 'ciflow/all/*'
- 'ciflow/linux/*'
- 'ciflow/mobile/*'
- 'ciflow/trunk/*'
branches:
- master
- release/*
- fbsync
workflow_dispatch:
env:
@@ -41,16 +44,8 @@ jobs:
build:
runs-on: linux.2xlarge
timeout-minutes: 240
if: ${{ (github.repository == 'pytorch/pytorch') && (
(github.event_name == 'push') ||
(github.event_name == 'schedule') ||
(contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/default') || contains(github.event.pull_request.labels.*.name, 'ciflow/linux') || contains(github.event.pull_request.labels.*.name, 'ciflow/mobile') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk')) ||
((github.event_name == 'pull_request' && github.event.action != 'unassigned') && !contains(join(github.event.pull_request.labels.*.name), 'ciflow/')))
}}
env:
JOB_BASE_NAME: linux-xenial-py3-clang5-mobile-build-build
IS_PROBOT_TRIGGER_EVENT: ${{ (github.event.action == 'unassigned') && (github.event.assigneed.login == 'pytorchbot') }}
LABEL_CONDITIONS: ${{ contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/default') || contains(github.event.pull_request.labels.*.name, 'ciflow/linux') || contains(github.event.pull_request.labels.*.name, 'ciflow/mobile') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk') }}
outputs:
docker_image: ${{ steps.calculate-tag.outputs.docker_image }}
steps:
@@ -75,7 +70,10 @@ jobs:
AWS_MAX_ATTEMPTS: 5
run: |
AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\")
aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
retry () {
"$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}
retry aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
--password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com"
- name: Chown workspace
run: |
@@ -87,8 +85,8 @@ jobs:
docker run --pull=never --rm -v "$(pwd)":/v -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
- name: Clean workspace
run: |
rm -rf "${GITHUB_WORKSPACE:?}/*"
rm -f ~/.ssh/authorized_keys
rm -rf "${GITHUB_WORKSPACE}"
mkdir "${GITHUB_WORKSPACE}"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: seemethere/add-github-ssh-key@v1
with:
@@ -202,7 +200,7 @@ jobs:
SCRIBE_GRAPHQL_ACCESS_TOKEN: ${{ secrets.SCRIBE_GRAPHQL_ACCESS_TOKEN }}
BRANCH: ${{ steps.parse-ref.outputs.branch }}
TAG: ${{ steps.parse-ref.outputs.tag }}
WORKFLOW_ID: '${{ github.run_id }}_${{ github.run_number }}'
WORKFLOW_ID: '${{ github.run_id }}'
run: |
COMMIT_TIME=$(git log --max-count=1 --format=%ct || echo 0)
export COMMIT_TIME
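The "Calculate docker image tag" step in these workflows derives `DOCKER_TAG` from `git rev-parse HEAD:.circleci/docker`, i.e. the tree-object hash of that directory, so the tag (and therefore the image) changes only when files under `.circleci/docker` change. A throwaway-repo sketch of that property, assuming `git` is available:

```shell
#!/usr/bin/env bash
set -euo pipefail
# Show that `git rev-parse HEAD:<dir>` is the tree hash of <dir>
# and is stable across commits that do not touch that directory.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email ci@example.com
git config user.name ci
mkdir -p .circleci/docker
echo "FROM ubuntu" > .circleci/docker/Dockerfile
git add -A && git commit -qm "add docker dir"
tag1=$(git rev-parse HEAD:.circleci/docker)

echo "unrelated change" > README.md
git add -A && git commit -qm "unrelated"
tag2=$(git rev-parse HEAD:.circleci/docker)

[ "$tag1" = "$tag2" ] && echo "tag unchanged: $tag1"
```

This is why unrelated PRs can reuse the cached image: their merge base resolves to the same tree hash, so the manifest-inspect check in the next step finds an existing image and skips the build.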

@@ -5,12 +5,15 @@ name: linux-xenial-py3-clang5-mobile-custom-build-static
on:
pull_request:
types: [opened, synchronize, reopened, unassigned]
push:
tags:
- 'ciflow/all/*'
- 'ciflow/linux/*'
- 'ciflow/mobile/*'
- 'ciflow/trunk/*'
branches:
- master
- release/*
- fbsync
workflow_dispatch:
env:
@@ -41,16 +44,8 @@ jobs:
build:
runs-on: linux.2xlarge
timeout-minutes: 240
if: ${{ (github.repository == 'pytorch/pytorch') && (
(github.event_name == 'push') ||
(github.event_name == 'schedule') ||
(contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/default') || contains(github.event.pull_request.labels.*.name, 'ciflow/linux') || contains(github.event.pull_request.labels.*.name, 'ciflow/mobile') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk')) ||
((github.event_name == 'pull_request' && github.event.action != 'unassigned') && !contains(join(github.event.pull_request.labels.*.name), 'ciflow/')))
}}
env:
JOB_BASE_NAME: linux-xenial-py3-clang5-mobile-custom-build-static-build
IS_PROBOT_TRIGGER_EVENT: ${{ (github.event.action == 'unassigned') && (github.event.assigneed.login == 'pytorchbot') }}
LABEL_CONDITIONS: ${{ contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/default') || contains(github.event.pull_request.labels.*.name, 'ciflow/linux') || contains(github.event.pull_request.labels.*.name, 'ciflow/mobile') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk') }}
outputs:
docker_image: ${{ steps.calculate-tag.outputs.docker_image }}
steps:
@@ -75,7 +70,10 @@ jobs:
AWS_MAX_ATTEMPTS: 5
run: |
AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\")
aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
retry () {
"$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}
retry aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
--password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com"
- name: Chown workspace
run: |
@@ -87,8 +85,8 @@ jobs:
docker run --pull=never --rm -v "$(pwd)":/v -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
- name: Clean workspace
run: |
rm -rf "${GITHUB_WORKSPACE:?}/*"
rm -f ~/.ssh/authorized_keys
rm -rf "${GITHUB_WORKSPACE}"
mkdir "${GITHUB_WORKSPACE}"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: seemethere/add-github-ssh-key@v1
with:
@@ -202,7 +200,7 @@ jobs:
SCRIBE_GRAPHQL_ACCESS_TOKEN: ${{ secrets.SCRIBE_GRAPHQL_ACCESS_TOKEN }}
BRANCH: ${{ steps.parse-ref.outputs.branch }}
TAG: ${{ steps.parse-ref.outputs.tag }}
WORKFLOW_ID: '${{ github.run_id }}_${{ github.run_number }}'
WORKFLOW_ID: '${{ github.run_id }}'
run: |
COMMIT_TIME=$(git log --max-count=1 --format=%ct || echo 0)
export COMMIT_TIME

@@ -5,12 +5,16 @@ name: linux-xenial-py3.7-clang7-asan
on:
pull_request:
types: [opened, synchronize, reopened, unassigned]
push:
tags:
- 'ciflow/all/*'
- 'ciflow/cpu/*'
- 'ciflow/linux/*'
- 'ciflow/sanitizers/*'
- 'ciflow/trunk/*'
branches:
- master
- release/*
- fbsync
workflow_dispatch:
env:
@@ -41,16 +45,8 @@ jobs:
build:
runs-on: linux.2xlarge
timeout-minutes: 240
if: ${{ (github.repository == 'pytorch/pytorch') && (
(github.event_name == 'push') ||
(github.event_name == 'schedule') ||
(contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/cpu') || contains(github.event.pull_request.labels.*.name, 'ciflow/default') || contains(github.event.pull_request.labels.*.name, 'ciflow/linux') || contains(github.event.pull_request.labels.*.name, 'ciflow/sanitizers') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk')) ||
((github.event_name == 'pull_request' && github.event.action != 'unassigned') && !contains(join(github.event.pull_request.labels.*.name), 'ciflow/')))
}}
env:
JOB_BASE_NAME: linux-xenial-py3.7-clang7-asan-build
IS_PROBOT_TRIGGER_EVENT: ${{ (github.event.action == 'unassigned') && (github.event.assigneed.login == 'pytorchbot') }}
LABEL_CONDITIONS: ${{ contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/cpu') || contains(github.event.pull_request.labels.*.name, 'ciflow/default') || contains(github.event.pull_request.labels.*.name, 'ciflow/linux') || contains(github.event.pull_request.labels.*.name, 'ciflow/sanitizers') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk') }}
outputs:
docker_image: ${{ steps.calculate-tag.outputs.docker_image }}
steps:
@@ -75,7 +71,10 @@ jobs:
AWS_MAX_ATTEMPTS: 5
run: |
AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\")
aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
retry () {
"$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}
retry aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
--password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com"
- name: Chown workspace
run: |
@@ -87,8 +86,8 @@ jobs:
docker run --pull=never --rm -v "$(pwd)":/v -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
- name: Clean workspace
run: |
rm -rf "${GITHUB_WORKSPACE:?}/*"
rm -f ~/.ssh/authorized_keys
rm -rf "${GITHUB_WORKSPACE}"
mkdir "${GITHUB_WORKSPACE}"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: seemethere/add-github-ssh-key@v1
with:
@@ -202,7 +201,7 @@ jobs:
SCRIBE_GRAPHQL_ACCESS_TOKEN: ${{ secrets.SCRIBE_GRAPHQL_ACCESS_TOKEN }}
BRANCH: ${{ steps.parse-ref.outputs.branch }}
TAG: ${{ steps.parse-ref.outputs.tag }}
WORKFLOW_ID: '${{ github.run_id }}_${{ github.run_number }}'
WORKFLOW_ID: '${{ github.run_id }}'
run: |
COMMIT_TIME=$(git log --max-count=1 --format=%ct || echo 0)
export COMMIT_TIME
@@ -321,7 +320,10 @@ jobs:
AWS_MAX_ATTEMPTS: 5
run: |
AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\")
aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
retry () {
"$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}
retry aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
--password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com"
- name: Chown workspace
run: |
@@ -333,8 +335,8 @@ jobs:
docker run --pull=never --rm -v "$(pwd)":/v -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
- name: Clean workspace
run: |
rm -rf "${GITHUB_WORKSPACE:?}/*"
rm -f ~/.ssh/authorized_keys
rm -rf "${GITHUB_WORKSPACE}"
mkdir "${GITHUB_WORKSPACE}"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: seemethere/add-github-ssh-key@v1
with:
@@ -515,7 +517,7 @@ jobs:
PR_NUMBER: ${{ github.event.pull_request.number }}
SHA1: ${{ github.event.pull_request.head.sha || github.sha }}
TAG: ${{ steps.parse-ref.outputs.tag }}
WORKFLOW_ID: '${{ github.run_id }}_${{ github.run_number }}'
WORKFLOW_ID: '${{ github.run_id }}'
shell: bash
run: |
python3 -m pip install -r requirements.txt

@@ -5,12 +5,16 @@ name: linux-xenial-py3.7-clang7-onnx
on:
pull_request:
types: [opened, synchronize, reopened, unassigned]
push:
tags:
- 'ciflow/all/*'
- 'ciflow/cpu/*'
- 'ciflow/linux/*'
- 'ciflow/onnx/*'
- 'ciflow/trunk/*'
branches:
- master
- release/*
- fbsync
workflow_dispatch:
env:
@@ -41,16 +45,8 @@ jobs:
build:
runs-on: linux.2xlarge
timeout-minutes: 240
if: ${{ (github.repository == 'pytorch/pytorch') && (
(github.event_name == 'push') ||
(github.event_name == 'schedule') ||
(contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/cpu') || contains(github.event.pull_request.labels.*.name, 'ciflow/default') || contains(github.event.pull_request.labels.*.name, 'ciflow/linux') || contains(github.event.pull_request.labels.*.name, 'ciflow/onnx') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk')) ||
((github.event_name == 'pull_request' && github.event.action != 'unassigned') && !contains(join(github.event.pull_request.labels.*.name), 'ciflow/')))
}}
env:
JOB_BASE_NAME: linux-xenial-py3.7-clang7-onnx-build
IS_PROBOT_TRIGGER_EVENT: ${{ (github.event.action == 'unassigned') && (github.event.assigneed.login == 'pytorchbot') }}
LABEL_CONDITIONS: ${{ contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/cpu') || contains(github.event.pull_request.labels.*.name, 'ciflow/default') || contains(github.event.pull_request.labels.*.name, 'ciflow/linux') || contains(github.event.pull_request.labels.*.name, 'ciflow/onnx') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk') }}
outputs:
docker_image: ${{ steps.calculate-tag.outputs.docker_image }}
steps:
@@ -75,7 +71,10 @@ jobs:
AWS_MAX_ATTEMPTS: 5
run: |
AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\")
aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
retry () {
"$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}
retry aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
--password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com"
- name: Chown workspace
run: |
@@ -87,8 +86,8 @@ jobs:
docker run --pull=never --rm -v "$(pwd)":/v -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
- name: Clean workspace
run: |
rm -rf "${GITHUB_WORKSPACE:?}/*"
rm -f ~/.ssh/authorized_keys
rm -rf "${GITHUB_WORKSPACE}"
mkdir "${GITHUB_WORKSPACE}"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: seemethere/add-github-ssh-key@v1
with:
@@ -202,7 +201,7 @@ jobs:
SCRIBE_GRAPHQL_ACCESS_TOKEN: ${{ secrets.SCRIBE_GRAPHQL_ACCESS_TOKEN }}
BRANCH: ${{ steps.parse-ref.outputs.branch }}
TAG: ${{ steps.parse-ref.outputs.tag }}
WORKFLOW_ID: '${{ github.run_id }}_${{ github.run_number }}'
WORKFLOW_ID: '${{ github.run_id }}'
run: |
COMMIT_TIME=$(git log --max-count=1 --format=%ct || echo 0)
export COMMIT_TIME
@@ -321,7 +320,10 @@ jobs:
AWS_MAX_ATTEMPTS: 5
run: |
AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\")
aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
retry () {
"$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}
retry aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
--password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com"
- name: Chown workspace
run: |
@@ -333,8 +335,8 @@ jobs:
docker run --pull=never --rm -v "$(pwd)":/v -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
- name: Clean workspace
run: |
rm -rf "${GITHUB_WORKSPACE:?}/*"
rm -f ~/.ssh/authorized_keys
rm -rf "${GITHUB_WORKSPACE}"
mkdir "${GITHUB_WORKSPACE}"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: seemethere/add-github-ssh-key@v1
with:
@@ -515,7 +517,7 @@ jobs:
PR_NUMBER: ${{ github.event.pull_request.number }}
SHA1: ${{ github.event.pull_request.head.sha || github.sha }}
TAG: ${{ steps.parse-ref.outputs.tag }}
WORKFLOW_ID: '${{ github.run_id }}_${{ github.run_number }}'
WORKFLOW_ID: '${{ github.run_id }}'
shell: bash
run: |
python3 -m pip install -r requirements.txt

@@ -5,12 +5,15 @@ name: linux-xenial-py3.7-gcc5.4
on:
pull_request:
types: [opened, synchronize, reopened, unassigned]
push:
tags:
- 'ciflow/all/*'
- 'ciflow/cpu/*'
- 'ciflow/linux/*'
- 'ciflow/trunk/*'
branches:
- master
- release/*
- fbsync
workflow_dispatch:
env:
@@ -41,16 +44,8 @@ jobs:
build:
runs-on: linux.2xlarge
timeout-minutes: 240
if: ${{ (github.repository_owner == 'pytorch') && (
(github.event_name == 'push') ||
(github.event_name == 'schedule') ||
(contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/cpu') || contains(github.event.pull_request.labels.*.name, 'ciflow/default') || contains(github.event.pull_request.labels.*.name, 'ciflow/linux') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk')) ||
((github.event_name == 'pull_request' && github.event.action != 'unassigned') && !contains(join(github.event.pull_request.labels.*.name), 'ciflow/')))
}}
env:
JOB_BASE_NAME: linux-xenial-py3.7-gcc5.4-build
IS_PROBOT_TRIGGER_EVENT: ${{ (github.event.action == 'unassigned') && (github.event.assigneed.login == 'pytorchbot') }}
LABEL_CONDITIONS: ${{ contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/cpu') || contains(github.event.pull_request.labels.*.name, 'ciflow/default') || contains(github.event.pull_request.labels.*.name, 'ciflow/linux') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk') }}
outputs:
docker_image: ${{ steps.calculate-tag.outputs.docker_image }}
steps:
@@ -75,7 +70,10 @@ jobs:
AWS_MAX_ATTEMPTS: 5
run: |
AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\")
aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
retry () {
"$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}
retry aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
--password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com"
- name: Chown workspace
run: |
@@ -87,8 +85,8 @@ jobs:
docker run --pull=never --rm -v "$(pwd)":/v -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
- name: Clean workspace
run: |
rm -rf "${GITHUB_WORKSPACE:?}/*"
rm -f ~/.ssh/authorized_keys
rm -rf "${GITHUB_WORKSPACE}"
mkdir "${GITHUB_WORKSPACE}"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: seemethere/add-github-ssh-key@v1
with:
@@ -202,7 +200,7 @@ jobs:
SCRIBE_GRAPHQL_ACCESS_TOKEN: ${{ secrets.SCRIBE_GRAPHQL_ACCESS_TOKEN }}
BRANCH: ${{ steps.parse-ref.outputs.branch }}
TAG: ${{ steps.parse-ref.outputs.tag }}
WORKFLOW_ID: '${{ github.run_id }}_${{ github.run_number }}'
WORKFLOW_ID: '${{ github.run_id }}'
run: |
COMMIT_TIME=$(git log --max-count=1 --format=%ct || echo 0)
export COMMIT_TIME
@@ -321,7 +319,10 @@ jobs:
AWS_MAX_ATTEMPTS: 5
run: |
AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\")
aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
retry () {
"$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}
retry aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
--password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com"
- name: Chown workspace
run: |
@@ -333,8 +334,8 @@ jobs:
docker run --pull=never --rm -v "$(pwd)":/v -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
- name: Clean workspace
run: |
rm -rf "${GITHUB_WORKSPACE:?}/*"
rm -f ~/.ssh/authorized_keys
rm -rf "${GITHUB_WORKSPACE}"
mkdir "${GITHUB_WORKSPACE}"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: seemethere/add-github-ssh-key@v1
with:
@@ -515,7 +516,7 @@ jobs:
PR_NUMBER: ${{ github.event.pull_request.number }}
SHA1: ${{ github.event.pull_request.head.sha || github.sha }}
TAG: ${{ steps.parse-ref.outputs.tag }}
WORKFLOW_ID: '${{ github.run_id }}_${{ github.run_number }}'
WORKFLOW_ID: '${{ github.run_id }}'
shell: bash
run: |
python3 -m pip install -r requirements.txt

@@ -0,0 +1,249 @@
# @generated DO NOT EDIT MANUALLY
# Template is at: .github/templates/linux_ci_workflow.yml.j2
# Generation script: .github/scripts/generate_ci_workflows.py
name: linux-xenial-py3.7-gcc7-no-ops
on:
pull_request:
push:
tags:
- 'ciflow/all/*'
- 'ciflow/cpu/*'
- 'ciflow/linux/*'
- 'ciflow/trunk/*'
branches:
- master
- release/*
workflow_dispatch:
env:
BUILD_ENVIRONMENT: linux-xenial-py3.7-gcc7-no-ops
DOCKER_IMAGE_BASE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-xenial-py3.7-gcc7
SCCACHE_BUCKET: ossci-compiler-cache-circleci-v2
XLA_CLANG_CACHE_S3_BUCKET_NAME: ossci-compiler-clang-cache-circleci-xla
TORCH_CUDA_ARCH_LIST: 5.2
IN_CI: 1
IS_GHA: 1
# This is used for the phase of adding wheel tests only, will be removed once completed
IN_WHEEL_TEST: 1
# Used for custom_operator, jit_hooks, custom_backend, see .jenkins/pytorch/build.sh
CUSTOM_TEST_ARTIFACT_BUILD_DIR: build/custom_test_artifacts
ALPINE_IMAGE: "308535385114.dkr.ecr.us-east-1.amazonaws.com/tool/alpine"
PR_LABELS: ${{ toJson(github.event.pull_request.labels.*.name) }}
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
AWS_DEFAULT_REGION: us-east-1
PR_NUMBER: ${{ github.event.pull_request.number }}
SHA1: ${{ github.event.pull_request.head.sha || github.sha }}
PYTORCH_RETRY_TEST_CASES: 1
concurrency:
group: linux-xenial-py3.7-gcc7-no-ops-${{ github.event.pull_request.number || github.sha }}-${{ github.event_name == 'workflow_dispatch' }}
cancel-in-progress: true
jobs:
build:
runs-on: linux.2xlarge
timeout-minutes: 240
env:
JOB_BASE_NAME: linux-xenial-py3.7-gcc7-no-ops-build
outputs:
docker_image: ${{ steps.calculate-tag.outputs.docker_image }}
steps:
- name: print labels
run: echo "${PR_LABELS}"
- name: Display EC2 information
shell: bash
run: |
set -euo pipefail
function get_ec2_metadata() {
# Pulled from instance metadata endpoint for EC2
# see https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instancedata-data-retrieval.html
category=$1
curl -fsSL "http://169.254.169.254/latest/meta-data/${category}"
}
echo "ami-id: $(get_ec2_metadata ami-id)"
echo "instance-id: $(get_ec2_metadata instance-id)"
echo "instance-type: $(get_ec2_metadata instance-type)"
- name: Log in to ECR
env:
AWS_RETRY_MODE: standard
AWS_MAX_ATTEMPTS: 5
run: |
AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\")
retry () {
"$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}
retry aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
--password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com"
- name: Chown workspace
run: |
retry () {
"$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}
retry docker pull "${ALPINE_IMAGE}"
# Ensure the working directory gets chowned back to the current user
docker run --pull=never --rm -v "$(pwd)":/v -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
- name: Clean workspace
run: |
rm -rf "${GITHUB_WORKSPACE}"
mkdir "${GITHUB_WORKSPACE}"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: seemethere/add-github-ssh-key@v1
with:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
- name: Preserve github env variables for use in docker
run: |
env | grep '^GITHUB' > "/tmp/github_env_${GITHUB_RUN_ID}"
- name: Checkout PyTorch
uses: zhouzhuojie/checkout@05b13c9a0d21f08f6d5e64a1d5042246d13619d9
with:
# deep clone, to allow use of git merge-base
fetch-depth: 0
submodules: recursive
- name: Clean PyTorch checkout
run: |
# Remove any artifacts from the previous checkouts
git clean -fxd
- name: Calculate docker image tag
id: calculate-tag
run: |
DOCKER_TAG=$(git rev-parse HEAD:.circleci/docker)
echo "DOCKER_TAG=${DOCKER_TAG}" >> "${GITHUB_ENV}"
echo "DOCKER_IMAGE=${DOCKER_IMAGE_BASE}:${DOCKER_TAG}" >> "${GITHUB_ENV}"
echo "::set-output name=docker_tag::${DOCKER_TAG}"
echo "::set-output name=docker_image::${DOCKER_IMAGE_BASE}:${DOCKER_TAG}"
- name: Check if image should be built
id: check
env:
BASE_REVISION: ${{ github.event.pull_request.base.sha || github.sha }}
run: |
set -x
# Check if image already exists, if it does then skip building it
if docker manifest inspect "${DOCKER_IMAGE_BASE}:${DOCKER_TAG}"; then
exit 0
fi
if [[ "$BASE_REVISION" = "$(git rev-parse HEAD)" ]]; then
# if we're on the base branch then use the parent commit
MERGE_BASE=$(git rev-parse HEAD~)
else
# otherwise we're on a PR, so use the most recent base commit
MERGE_BASE=$(git merge-base HEAD "$BASE_REVISION")
fi
# Covers the case where a previous tag doesn't exist for the tree
# this is only really applicable on trees that don't have `.circleci/docker` at its merge base, i.e. nightly
if ! git rev-parse "$MERGE_BASE:.circleci/docker"; then
echo "Directory '.circleci/docker' not found in commit $MERGE_BASE, you should probably rebase onto a more recent commit"
exit 1
fi
PREVIOUS_DOCKER_TAG=$(git rev-parse "$MERGE_BASE:.circleci/docker")
# If no image exists but the hash is the same as the previous hash then we should error out here
if [[ "${PREVIOUS_DOCKER_TAG}" = "${DOCKER_TAG}" ]]; then
echo "ERROR: Something has gone wrong and the previous image isn't available for the merge-base of your branch"
echo " contact the PyTorch team to restore the original images"
exit 1
fi
echo ::set-output name=rebuild::yes
- name: Build and push docker image
if: ${{ steps.check.outputs.rebuild }}
env:
DOCKER_SKIP_S3_UPLOAD: 1
working-directory: .circleci/docker
run: |
export IMAGE_NAME=${DOCKER_IMAGE_BASE#308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/}
./build_docker.sh
- name: Pull Docker image
run: |
retry () {
"$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}
retry docker pull "${DOCKER_IMAGE}"
- name: Parse ref
id: parse-ref
run: .github/scripts/parse_ref.py
- name: Build
env:
BRANCH: ${{ steps.parse-ref.outputs.branch }}
run: |
# detached container should get cleaned up by teardown_ec2_linux
container_name=$(docker run \
-e BUILD_ENVIRONMENT \
-e JOB_BASE_NAME \
-e MAX_JOBS="$(nproc --ignore=2)" \
-e AWS_DEFAULT_REGION \
-e IS_GHA \
-e PR_NUMBER \
-e SHA1 \
-e BRANCH \
-e GITHUB_RUN_ID \
-e SCCACHE_BUCKET \
-e XLA_CLANG_CACHE_S3_BUCKET_NAME \
-e CUSTOM_TEST_ARTIFACT_BUILD_DIR \
-e SKIP_SCCACHE_INITIALIZATION=1 \
-e TORCH_CUDA_ARCH_LIST \
-e PR_LABELS \
-e http_proxy="http://internal-tf-lb-20210727220640487900000002-835786077.us-east-1.elb.amazonaws.com:3128" -e https_proxy="http://internal-tf-lb-20210727220640487900000002-835786077.us-east-1.elb.amazonaws.com:3128" -e no_proxy="localhost,127.0.0.1,github.com,amazonaws.com,s3.amazonaws.com,169.254.169.254,169.254.170.2,/var/run/docker.sock" \
--env-file="/tmp/github_env_${GITHUB_RUN_ID}" \
--security-opt seccomp=unconfined \
--cap-add=SYS_PTRACE \
--tty \
--detach \
--user jenkins \
-v "${GITHUB_WORKSPACE}:/var/lib/jenkins/workspace" \
-w /var/lib/jenkins/workspace \
"${DOCKER_IMAGE}"
)
docker exec -t "${container_name}" sh -c 'sudo chown -R jenkins . && .jenkins/pytorch/build.sh'
- name: Display and upload binary build size statistics (Click Me)
# temporary hack: set CIRCLE_* vars, until we update
# tools/stats/print_test_stats.py to natively support GitHub Actions
env:
SCRIBE_GRAPHQL_ACCESS_TOKEN: ${{ secrets.SCRIBE_GRAPHQL_ACCESS_TOKEN }}
BRANCH: ${{ steps.parse-ref.outputs.branch }}
TAG: ${{ steps.parse-ref.outputs.tag }}
WORKFLOW_ID: '${{ github.run_id }}'
run: |
COMMIT_TIME=$(git log --max-count=1 --format=%ct || echo 0)
export COMMIT_TIME
pip3 install requests==2.26 boto3==1.16.34
python3 -m tools.stats.upload_binary_size_to_scuba || exit 0
- name: Chown workspace
run: |
# Ensure the working directory gets chowned back to the current user
docker run --rm -v "$(pwd)":/v -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
- name: Archive artifacts into zip
run: |
zip -1 -r artifacts.zip dist/ build/custom_test_artifacts build/lib build/bin .pytorch-test-times.json
- uses: seemethere/upload-artifact-s3@v3
name: Store PyTorch Build Artifacts on S3
with:
name: ${{ env.BUILD_ENVIRONMENT }}
retention-days: 14
if-no-files-found: error
path:
artifacts.zip
- name: Hold runner for 2 hours or until ssh sessions have drained
# Always hold for active ssh sessions
if: always()
run: .github/scripts/wait_for_ssh_to_drain.sh
- name: Chown workspace
if: always()
run: |
# Ensure the working directory gets chowned back to the current user
docker run --rm -v "$(pwd)":/v -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
- name: Kill containers, clean up images
if: always()
run: |
# ignore expansion of "docker ps -q" since it could be empty
# shellcheck disable=SC2046
docker stop $(docker ps -q) || true
# Prune all of the docker images
docker system prune -af
- name: Hold runner for 2 hours or until ssh sessions have drained
# Always hold for active ssh sessions
if: always()
run: .github/scripts/wait_for_ssh_to_drain.sh
- name: Clean up docker images
if: always()
run: |
# Prune all of the docker images
docker system prune -af
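Several steps in the workflow above define the same inline `retry` helper: run a command, and on failure retry up to two more times with a short back-off. A minimal standalone sketch of that pattern (the `flaky` command is a hypothetical stand-in for illustration, made to fail twice before succeeding):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Same helper as in the workflow steps: try, then retry after 1s, then after 2s.
retry () {
  "$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}

# Hypothetical flaky command: tracks attempts in a file (so the count survives
# the subshells retry uses) and only succeeds on the third call.
attempts_file=$(mktemp)
echo 0 > "$attempts_file"
flaky () {
  n=$(( $(cat "$attempts_file") + 1 ))
  echo "$n" > "$attempts_file"
  [ "$n" -ge 3 ]
}

retry flaky
echo "succeeded after $(cat "$attempts_file") attempts"
```

Note the retries happen in subshells, so the helper works for any external command but won't propagate variable assignments made by the retried command back to the caller.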


@@ -5,12 +5,15 @@ name: linux-xenial-py3.7-gcc7
on:
pull_request:
types: [opened, synchronize, reopened, unassigned]
push:
tags:
- 'ciflow/all/*'
- 'ciflow/cpu/*'
- 'ciflow/linux/*'
- 'ciflow/trunk/*'
branches:
- master
- release/*
- fbsync
workflow_dispatch:
env:
@@ -41,16 +44,8 @@ jobs:
build:
runs-on: linux.2xlarge
timeout-minutes: 240
if: ${{ (github.repository_owner == 'pytorch') && (
(github.event_name == 'push') ||
(github.event_name == 'schedule') ||
(contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/cpu') || contains(github.event.pull_request.labels.*.name, 'ciflow/default') || contains(github.event.pull_request.labels.*.name, 'ciflow/linux') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk')) ||
((github.event_name == 'pull_request' && github.event.action != 'unassigned') && !contains(join(github.event.pull_request.labels.*.name), 'ciflow/')))
}}
env:
JOB_BASE_NAME: linux-xenial-py3.7-gcc7-build
IS_PROBOT_TRIGGER_EVENT: ${{ (github.event.action == 'unassigned') && (github.event.assigneed.login == 'pytorchbot') }}
LABEL_CONDITIONS: ${{ contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/cpu') || contains(github.event.pull_request.labels.*.name, 'ciflow/default') || contains(github.event.pull_request.labels.*.name, 'ciflow/linux') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk') }}
outputs:
docker_image: ${{ steps.calculate-tag.outputs.docker_image }}
steps:
@@ -75,7 +70,10 @@ jobs:
AWS_MAX_ATTEMPTS: 5
run: |
AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\")
aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
retry () {
"$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}
retry aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
--password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com"
- name: Chown workspace
run: |
@@ -87,8 +85,8 @@ jobs:
docker run --pull=never --rm -v "$(pwd)":/v -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
- name: Clean workspace
run: |
rm -rf "${GITHUB_WORKSPACE:?}/*"
rm -f ~/.ssh/authorized_keys
rm -rf "${GITHUB_WORKSPACE}"
mkdir "${GITHUB_WORKSPACE}"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: seemethere/add-github-ssh-key@v1
with:
@@ -202,7 +200,7 @@ jobs:
SCRIBE_GRAPHQL_ACCESS_TOKEN: ${{ secrets.SCRIBE_GRAPHQL_ACCESS_TOKEN }}
BRANCH: ${{ steps.parse-ref.outputs.branch }}
TAG: ${{ steps.parse-ref.outputs.tag }}
WORKFLOW_ID: '${{ github.run_id }}_${{ github.run_number }}'
WORKFLOW_ID: '${{ github.run_id }}'
run: |
COMMIT_TIME=$(git log --max-count=1 --format=%ct || echo 0)
export COMMIT_TIME
@@ -321,7 +319,10 @@ jobs:
AWS_MAX_ATTEMPTS: 5
run: |
AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\")
aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
retry () {
"$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}
retry aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
--password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com"
- name: Chown workspace
run: |
@@ -333,8 +334,8 @@ jobs:
docker run --pull=never --rm -v "$(pwd)":/v -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
- name: Clean workspace
run: |
rm -rf "${GITHUB_WORKSPACE:?}/*"
rm -f ~/.ssh/authorized_keys
rm -rf "${GITHUB_WORKSPACE}"
mkdir "${GITHUB_WORKSPACE}"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: seemethere/add-github-ssh-key@v1
with:
@@ -515,7 +516,7 @@ jobs:
PR_NUMBER: ${{ github.event.pull_request.number }}
SHA1: ${{ github.event.pull_request.head.sha || github.sha }}
TAG: ${{ steps.parse-ref.outputs.tag }}
WORKFLOW_ID: '${{ github.run_id }}_${{ github.run_number }}'
WORKFLOW_ID: '${{ github.run_id }}'
shell: bash
run: |
python3 -m pip install -r requirements.txt
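The "Calculate docker image tag" step above derives the image tag from `git rev-parse HEAD:.circleci/docker`, i.e. the git *tree hash* of that directory. The tag therefore only changes when the directory's contents change, which is what lets the "Check if image should be built" step skip rebuilds. A throwaway-repo sketch of that property (assumes only that `git` is installed):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Build a scratch repo with a .circleci/docker directory.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email ci@example.com
git config user.name ci
mkdir -p .circleci/docker
echo "FROM alpine" > .circleci/docker/Dockerfile
git add . && git commit -qm "add docker dir"
tag1=$(git rev-parse HEAD:.circleci/docker)   # tree hash of the directory

# An unrelated commit does not change the directory's tree hash...
echo "unrelated" > README.md
git add . && git commit -qm "unrelated change"
tag2=$(git rev-parse HEAD:.circleci/docker)
[ "$tag1" = "$tag2" ] && echo "tag unchanged: image can be reused"
```

This is why the workflow errors out when the tag matches the merge-base's tag but no image exists: same tree hash means the image should already have been built and pushed.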


@@ -4,13 +4,14 @@
name: macos-10-15-py3-arm64
on:
pull_request:
types: [opened, synchronize, reopened, unassigned]
push:
branches:
- master
- release/*
- fbsync
tags:
- 'ciflow/all/*'
- 'ciflow/macos/*'
- 'ciflow/trunk/*'
workflow_dispatch:
# For setup-miniconda, see https://github.com/conda-incubator/setup-miniconda/issues/179
@@ -34,15 +35,7 @@ jobs:
# For sccache access (only on non-forked PRs)
AWS_ACCESS_KEY_ID: ${{ secrets.MACOS_SCCACHE_S3_ACCESS_KEY_ID }}
AWS_SECRET_ACCESS_KEY: ${{ secrets.MACOS_SCCACHE_S3_SECRET_ACCESS_KEY }}
IS_PROBOT_TRIGGER_EVENT: ${{ (github.event.action == 'unassigned') && (github.event.assigneed.login == 'pytorchbot') }}
LABEL_CONDITIONS: ${{ contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/macos') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk') }}
PR_LABELS: ${{ toJson(github.event.pull_request.labels.*.name) }}
if: ${{ (github.repository == 'pytorch/pytorch') && (
(github.event_name == 'push') ||
(github.event_name == 'schedule') ||
(contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/macos') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk')) ||
(false))
}}
steps:
- name: print labels
run: echo "${PR_LABELS}"


@@ -4,13 +4,14 @@
name: macos-10-15-py3-lite-interpreter-x86-64
on:
pull_request:
types: [opened, synchronize, reopened, unassigned]
push:
branches:
- master
- release/*
- fbsync
tags:
- 'ciflow/all/*'
- 'ciflow/macos/*'
- 'ciflow/trunk/*'
workflow_dispatch:
# For setup-miniconda, see https://github.com/conda-incubator/setup-miniconda/issues/179
@@ -36,15 +37,7 @@ jobs:
# For sccache access (only on non-forked PRs)
AWS_ACCESS_KEY_ID: ${{ secrets.MACOS_SCCACHE_S3_ACCESS_KEY_ID }}
AWS_SECRET_ACCESS_KEY: ${{ secrets.MACOS_SCCACHE_S3_SECRET_ACCESS_KEY }}
IS_PROBOT_TRIGGER_EVENT: ${{ (github.event.action == 'unassigned') && (github.event.assigneed.login == 'pytorchbot') }}
LABEL_CONDITIONS: ${{ contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/macos') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk') }}
PR_LABELS: ${{ toJson(github.event.pull_request.labels.*.name) }}
if: ${{ (github.repository == 'pytorch/pytorch') && (
(github.event_name == 'push') ||
(github.event_name == 'schedule') ||
(contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/macos') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk')) ||
(false))
}}
steps:
- name: print labels
run: echo "${PR_LABELS}"


@@ -4,13 +4,14 @@
name: macos-11-py3-x86-64
on:
pull_request:
types: [opened, synchronize, reopened, unassigned]
push:
branches:
- master
- release/*
- fbsync
tags:
- 'ciflow/all/*'
- 'ciflow/macos/*'
- 'ciflow/trunk/*'
workflow_dispatch:
# For setup-miniconda, see https://github.com/conda-incubator/setup-miniconda/issues/179
@@ -36,15 +37,7 @@ jobs:
# For sccache access (only on non-forked PRs)
AWS_ACCESS_KEY_ID: ${{ secrets.MACOS_SCCACHE_S3_ACCESS_KEY_ID }}
AWS_SECRET_ACCESS_KEY: ${{ secrets.MACOS_SCCACHE_S3_SECRET_ACCESS_KEY }}
IS_PROBOT_TRIGGER_EVENT: ${{ (github.event.action == 'unassigned') && (github.event.assigneed.login == 'pytorchbot') }}
LABEL_CONDITIONS: ${{ contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/macos') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk') }}
PR_LABELS: ${{ toJson(github.event.pull_request.labels.*.name) }}
if: ${{ (github.repository == 'pytorch/pytorch') && (
(github.event_name == 'push') ||
(github.event_name == 'schedule') ||
(contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/macos') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk')) ||
(false))
}}
steps:
- name: print labels
run: echo "${PR_LABELS}"
@@ -222,7 +215,7 @@ jobs:
PR_NUMBER: ${{ github.event.pull_request.number }}
SHA1: ${{ github.event.pull_request.head.sha || github.sha }}
TAG: ${{ steps.parse-ref.outputs.tag }}
WORKFLOW_ID: '${{ github.run_id }}_${{ github.run_number }}'
WORKFLOW_ID: '${{ github.run_id }}'
shell: bash
run: |
python3 -m pip install -r requirements.txt


@@ -4,13 +4,15 @@
name: parallelnative-linux-xenial-py3.7-gcc5.4
on:
pull_request:
types: [opened, synchronize, reopened, unassigned]
push:
tags:
- 'ciflow/all/*'
- 'ciflow/cpu/*'
- 'ciflow/linux/*'
- 'ciflow/trunk/*'
branches:
- master
- release/*
- fbsync
workflow_dispatch:
env:
@@ -41,16 +43,8 @@ jobs:
build:
runs-on: linux.2xlarge
timeout-minutes: 240
if: ${{ (github.repository == 'pytorch/pytorch') && (
(github.event_name == 'push') ||
(github.event_name == 'schedule') ||
(contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/cpu') || contains(github.event.pull_request.labels.*.name, 'ciflow/linux') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk')) ||
(false))
}}
env:
JOB_BASE_NAME: parallelnative-linux-xenial-py3.7-gcc5.4-build
IS_PROBOT_TRIGGER_EVENT: ${{ (github.event.action == 'unassigned') && (github.event.assigneed.login == 'pytorchbot') }}
LABEL_CONDITIONS: ${{ contains(github.event.pull_request.labels.*.name, 'ciflow/all') || contains(github.event.pull_request.labels.*.name, 'ciflow/cpu') || contains(github.event.pull_request.labels.*.name, 'ciflow/linux') || contains(github.event.pull_request.labels.*.name, 'ciflow/trunk') }}
outputs:
docker_image: ${{ steps.calculate-tag.outputs.docker_image }}
steps:
@@ -75,7 +69,10 @@ jobs:
AWS_MAX_ATTEMPTS: 5
run: |
AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\")
aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
retry () {
"$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}
retry aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
--password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com"
- name: Chown workspace
run: |
@@ -87,8 +84,8 @@ jobs:
docker run --pull=never --rm -v "$(pwd)":/v -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
- name: Clean workspace
run: |
rm -rf "${GITHUB_WORKSPACE:?}/*"
rm -f ~/.ssh/authorized_keys
rm -rf "${GITHUB_WORKSPACE}"
mkdir "${GITHUB_WORKSPACE}"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: seemethere/add-github-ssh-key@v1
with:
@@ -202,7 +199,7 @@ jobs:
SCRIBE_GRAPHQL_ACCESS_TOKEN: ${{ secrets.SCRIBE_GRAPHQL_ACCESS_TOKEN }}
BRANCH: ${{ steps.parse-ref.outputs.branch }}
TAG: ${{ steps.parse-ref.outputs.tag }}
WORKFLOW_ID: '${{ github.run_id }}_${{ github.run_number }}'
WORKFLOW_ID: '${{ github.run_id }}'
run: |
COMMIT_TIME=$(git log --max-count=1 --format=%ct || echo 0)
export COMMIT_TIME
@@ -321,7 +318,10 @@ jobs:
AWS_MAX_ATTEMPTS: 5
run: |
AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\")
aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
retry () {
"$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}
retry aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \
--password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com"
- name: Chown workspace
run: |
@@ -333,8 +333,8 @@ jobs:
docker run --pull=never --rm -v "$(pwd)":/v -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
- name: Clean workspace
run: |
rm -rf "${GITHUB_WORKSPACE:?}/*"
rm -f ~/.ssh/authorized_keys
rm -rf "${GITHUB_WORKSPACE}"
mkdir "${GITHUB_WORKSPACE}"
- name: "[FB EMPLOYEES] Enable SSH (Click me for login details)"
uses: seemethere/add-github-ssh-key@v1
with:
@@ -515,7 +515,7 @@ jobs:
PR_NUMBER: ${{ github.event.pull_request.number }}
SHA1: ${{ github.event.pull_request.head.sha || github.sha }}
TAG: ${{ steps.parse-ref.outputs.tag }}
WORKFLOW_ID: '${{ github.run_id }}_${{ github.run_number }}'
WORKFLOW_ID: '${{ github.run_id }}'
shell: bash
run: |
python3 -m pip install -r requirements.txt
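Steps in these workflows pass values forward two ways: appending `KEY=value` lines to the file at `$GITHUB_ENV` (exported as environment variables for later steps in the same job) and the `::set-output` workflow command (exposed as `steps.<id>.outputs.<name>`, and since deprecated by GitHub in favor of `$GITHUB_OUTPUT`). A rough local simulation of the `GITHUB_ENV` half — the sourcing step is an assumption standing in for what the Actions runner does between steps:

```shell
#!/usr/bin/env bash
set -euo pipefail

# On a real runner this path is provided per job; here we fake it.
GITHUB_ENV=$(mktemp)

# "Step 1": persist a computed value for later steps.
echo "DOCKER_TAG=abc123" >> "$GITHUB_ENV"

# "Step 2": the runner exports every KEY=value line before the step's script runs.
# Simulated by sourcing the file with allexport on (works for simple values).
set -a
. "$GITHUB_ENV"
set +a
echo "later step sees DOCKER_TAG=${DOCKER_TAG}"
```

Unlike plain `export`, which is scoped to one step's shell process, values written to `$GITHUB_ENV` survive across the job's separate step processes.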

Some files were not shown because too many files have changed in this diff.