Compare commits

..

84 Commits

Author SHA1 Message Date
67ece03c8c Disable AVX512 CPU dispatch by default (#80253) (#80356)
As it can be slower, see https://github.com/pytorch/pytorch/issues/80252
Update trunk test matrix to test AVX512 config in `nogpu_AVX512` flavor.
Kill `nogpu_noAVX` as AVX support were replaced with AVX512 when https://github.com/pytorch/pytorch/pull/61903 were landed
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80253
Approved by: https://github.com/ngimel

(cherry picked from commit 14813536a7120f1104be2270be341b7a383415c5)
2022-06-27 13:41:56 -04:00
bcfb424768 [JIT] Imbue stringbuf with C locale (#79929) (#79983)
To prevent 12345 become "12,345" if locale is not "C", as shown in the
following example:
```cpp

int main() {
  std::locale::global(std::locale("en_US.utf-8"));
  std::stringstream ss;
  ss << "12345 in " << std::locale().name()  << " locale is " << 12345 ;
  ss.imbue(std::locale("C"));
  ss << " but in C locale is " << 12345;
  std::cout << ss.str() << std::endl;
}

```

Fixes #79583

Pull Request resolved: https://github.com/pytorch/pytorch/pull/79929
Approved by: https://github.com/davidberard98

Co-authored-by: Nikita Shulga <nshulga@fb.com>
2022-06-21 21:35:03 -04:00
8186aa7d6c [DataLoader] Share seed via Distributed Store to get rid of CUDA dependency (#79829) (#79890)
Fixes #79828

In distributed environment, before this PR, DataLoader would create a Tensor holding the shared seed in RANK 0 and send the Tensor to other processes. However, when `NCCL` is used as the distributed backend, the Tensor is required to be moved to cuda before broadcasted from RANK 0 to other RANKs. And, this causes the Issue where DataLoader doesn't move the Tensor to cuda before sharing using `NCCL`.

After offline discussion with @mrshenli, we think the distributed Store is a better solution as the shared seed is just an integer value. Then, we can get rid of the dependency on NCCL and CUDA when sharing info between distributed processes for DataLoader.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79829
Approved by: https://github.com/VitalyFedyunin, https://github.com/NivekT
2022-06-20 20:16:14 -04:00
01d9324fe1 nn: Disable nested tensor by default (#79884)
Better transformers (and by extension nested tensor) are identified as a
prototype feature and should not be enabled by default for the 1.12
release.

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
2022-06-20 17:15:16 -04:00
5009086150 Fix release doc builds (#79865)
This logic were lost during last workflow migration and as result we do not have docs builds for 1.12 release candidate, see pytorch/pytorch.github.io/tree/site/docs

Hattip to @brianjo for reminding me about the issue

Pull Request resolved: https://github.com/pytorch/pytorch/pull/79865
Approved by: https://github.com/atalman, https://github.com/albanD, https://github.com/seemethere

(cherry picked from commit 2bfba840847e785b4da56498041421fc4929826b)
2022-06-20 11:27:32 -07:00
bfb6b24575 [JIT] Nested fix (#79480) (#79816)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/79480
Approved by: https://github.com/davidberard98

Co-authored-by: Elias Ellison <eellison@fb.com>
2022-06-20 06:10:09 -07:00
681a6e381c [v1.12.0] Fix non-reentrant hooks based checkpointing (#79490)
* merge fix

* Test fix

* Lint
2022-06-17 14:41:52 -07:00
92437c6b4e Revert behavior of Dropout2d on 3D inputs to 1D channel-wise dropout behavior & warn (#79611)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79549

Approved by: https://github.com/ngimel, https://github.com/albanD

Co-authored-by: Joel Benjamin Schlosser <jbschlosser@fb.com>
2022-06-17 14:35:45 -04:00
566286f9db Add Dropout1d module (#79610)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79545

Approved by: https://github.com/ngimel, https://github.com/albanD

Co-authored-by: Joel Benjamin Schlosser <jbschlosser@fb.com>
2022-06-17 14:35:08 -04:00
ac3086120d [DataLoader] Fix the world_size when distributed sharding MapDataPipe (#79524) (#79550)
Fixes #79449

Pull Request resolved: https://github.com/pytorch/pytorch/pull/79524
Approved by: https://github.com/NivekT, https://github.com/VitalyFedyunin
2022-06-15 06:23:03 -07:00
eqy
7964022214 Cherry pick tf32docs (#79537) (#79539)
* Update numerical_accuracy.rst

* Update numerical_accuracy.rst

* Update numerical_accuracy.rst

* lint
2022-06-15 06:21:18 -07:00
1d5ecdb3b9 Update PeachPy submodule (#78326)
Forked the repo, merged latest changes into pre-generated branch and
update pregenerared opcodes

Re-enabled NNPACK builds on MacOS

Picking f8ef1a3c0a  fixes https://github.com/pytorch/pytorch/issues/76094

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78326
Approved by: https://github.com/atalman, https://github.com/albanD

(cherry picked from commit fa7117c64a9cc740e71728701adb2cb2ccc143c4)
2022-06-15 06:12:50 -07:00
7eef782636 Link LazyLinalg with cusolver statically when needed (#79324) (#79522)
By copy-n-pasting the static linking logic from `libtorch_cuda` if
lazylinalg is not enabled

Pull Request resolved: https://github.com/pytorch/pytorch/pull/79324
Approved by: https://github.com/atalman

Co-authored-by: Nikita Shulga <nshulga@fb.com>
2022-06-14 08:35:57 -07:00
fa01ea406a Add docs for Python Registration (#79481)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78753

Approved by: https://github.com/ezyang, https://github.com/albanD
2022-06-14 08:09:21 -04:00
21e1282098 [CUDA graphs] Allows Adam and AdamW to be capture-safe (#77862) (#79472)
Near term fix for https://github.com/pytorch/pytorch/issues/76368.

Q. Why does the user need to request `capturable=True` in the optimizer constructor? Why can't capture safety be completely automatic?
A. We need to set up capture-safe (device-side) state variables before capture. If we don't, and step() internally detects capture is underway, it's too late: the best we could do is create a device state variable and copy the current CPU value into it, which is not something we want baked into the graph.

Q. Ok, why not just do the capture-safe approach with device-side state variables all the time?
A. It incurs several more kernel launches per parameter, which could really add up and regress cpu overhead for ungraphed step()s. If the optimizer won't be captured, we should allow step() to stick with its current cpu-side state handling.

Q. But cuda RNG is a stateful thing that maintains its state on the cpu outside of capture and replay, and we capture it automatically. Why can't we do the same thing here?
A. The graph object can handle RNG generator increments because its capture_begin, capture_end, and replay() methods can see and access generator object. But the graph object has no explicit knowledge of or access to optimizer steps in its capture scope. We could let the user tell the graph object what optimizers will be stepped in its scope, ie something like
```python
graph.will_use_optimizer(opt)
graph.capture_begin()
...
```
but that seems clunkier than an optimizer constructor arg.

I'm open to other ideas, but right now I think constructor arg is necessary and the least bad approach.

Long term, https://github.com/pytorch/pytorch/issues/71274 is a better fix.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77862
Approved by: https://github.com/ezyang
2022-06-14 08:06:07 -04:00
2d3d6f9d05 cherry-pick (#79455)
Co-authored-by: Mike Ruberry <mruberry@fb.com>
2022-06-13 18:52:31 -04:00
da93b1cbeb [CI] Turn flaky test signal to green (#79220) (#79220) (#79416)
Summary:
This implements the RFC #73573

Pull Request resolved: https://github.com/pytorch/pytorch/pull/79220
Approved by: https://github.com/suo

Test Plan: contbuild & OSS CI, see 1bc8c87322

Reviewed By: osalpekar

Differential Revision: D37059423

Pulled By: osalpekar

fbshipit-source-id: c73d326e3aca834221cd003157f960e5bc02960a

Co-authored-by: Jane Xu (Meta Employee) <janeyx@fb.com>
2022-06-13 17:15:34 -04:00
d67c72cb53 Removing cublas static linking (#79280) (#79417)
Removing cublas static linking

Test:  https://github.com/pytorch/pytorch/runs/6837323424?check_suite_focus=true

```
(base) atalman@atalman-dev-workstation-d4c889c8-2k8hl:~/whl_test/torch/lib$ ldd libtorch_cuda.so
	linux-vdso.so.1 (0x00007fffe8f6a000)
	libc10_cuda.so (0x00007f6539e6a000)
	libcudart-80664282.so.10.2 (0x00007f6539be9000)
	libnvToolsExt-3965bdd0.so.1 (0x00007f65399df000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f65397c0000)
	libc10.so (0x00007f653952f000)
	libtorch_cpu.so (0x00007f6520921000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f6520583000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f652037f000)
	libcublas.so.10 (0x00007f651c0c5000)
	librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f651bebd000)
	libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f651bb34000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f651b91c000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f651b52b000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f656aa13000)
	libgomp-a34b3233.so.1 (0x00007f651b301000)
	libcublasLt.so.10 (0x00007f651946c000)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79280
Approved by: https://github.com/seemethere
2022-06-13 16:48:13 -04:00
ef26f13df9 Install NDK 21 after GitHub update (#79024) (#79024) (#79429)
Summary:
See https://github.com/actions/virtual-environments/issues/5595

Pull Request resolved: https://github.com/pytorch/pytorch/pull/79024
Approved by: https://github.com/janeyx99

Test Plan: contbuild & OSS CI, see 0be9df4e85

Reviewed By: osalpekar

Differential Revision: D36993242

Pulled By: kit1980

fbshipit-source-id: c2e76fee4eaf0b1474cb7221721cbb798c319001

Co-authored-by: Sergii Dymchenko (Meta Employee) <sdym@fb.com>
2022-06-13 15:07:45 -04:00
4a9779aa4d [DataPipe] Correcting deprecation version (#79309)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79302

Approved by: https://github.com/ejguan

Co-authored-by: PyTorch MergeBot <pytorchmergebot@users.noreply.github.com>
2022-06-10 15:23:18 -07:00
9a94ddc081 Fix _free_weak_ref error (#79315)
Fixes #74016

This is a cherry pick of  https://github.com/pytorch/pytorch/pull/78575 into release/1.12 branch
Approved by: https://github.com/ezyang
2022-06-10 15:14:11 -07:00
dee3dc6070 MPS: add layer_norm_backward (#79189) (#79276)
Layernorm backward

Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/79189
Approved by: https://github.com/razarmehr, https://github.com/albanD
2022-06-10 10:20:43 -07:00
30fce6836f Fix jit schema_matching ignoring self resulting in wrong operator schema (#79249)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79101

Approved by: https://github.com/gmagogsfm, https://github.com/eellison
2022-06-10 11:51:42 -04:00
0f93212516 adding a quick link to nvfuser README.md in jit doc for 1.12 release (#78160) (#79221)
adding a link to github 1.12 release branch nvfuser README.md in jit doc

Note that this PR is intended to be cherry-picked by 1.12 release, we'll have a follow up PR to update the link once this PR is merged.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78160
Approved by: https://github.com/davidberard98

Co-authored-by: jjsjann123 <alex.jann2012@gmail.com>
2022-06-09 14:33:59 -04:00
eqy
585417e935 [DDP] Cherrypick support other memory formats #79060 (#79071)
* check in

* add test
2022-06-09 14:25:17 -04:00
bd93fe635e Foward fix sharding bug for DL (#79124) (#79129)
This PR solves a bug introduced by #79041

`torch.utils.data.graph_settings.apply_sharding` changes the datapipe in-place and returns `None`

It would resolve the Error in TorchData. See: https://github.com/pytorch/data/actions/runs/2461030312
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79124
Approved by: https://github.com/VitalyFedyunin
2022-06-09 14:21:33 -04:00
cc6e2d3035 Package config/template files with torchgen (#78942) (#79123)
Package config/template files with torchgen

This PR packages native_functions.yaml, tags.yaml and ATen/templates
with torchgen.

This PR:
- adds a step to setup.py to copy the relevant files over into torchgen
- adds a docstring for torchgen (so `import torchgen; help(torchgen)`
says something)
- adds a helper function in torchgen so you can get the torchgen root
directory (and figure out where the packaged files are)
- changes some scripts to explicitly pass the location of torchgen,
which will be helpful for the first item in the Future section.

Future
======

- torchgen, when invoked from the command line, should use sources
in torchgen/packaged instead of aten/src. I'm unable to do this because
people (aka PyTorch CI) invokes `python -m torchgen.gen` without
installing torchgen.
- the source of truth for all of these files should be in torchgen.
This is a bit annoying to execute on due to potential merge conflicts
and dealing with merge systems
- CI and testing. The way things are set up right now is really fragile,
we should have a CI job for torchgen.

Test Plan
=========
I ran the following locally:

```
python -m torchgen.gen -s torchgen/packaged
```
and verified that it outputted files.

Furthermore, I did a setup.py install and checked that the files are
actually being packaged with torchgen.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78942
Approved by: https://github.com/ezyang
2022-06-09 14:20:32 -04:00
127922d451 Fix sharding strategy for distributed DL (#79041) (#79063)
1. Change the sharding strategy from sharding by worker first then by rank to sharding in the order of rank then workers.
2. Change to fetch Rank and World size in main process for the sake of `spawn`.

For the change 1:
Before this PR, for the case when dataset can not be evenly divided by `worker_num * world_size`, more data will be retrieved by workers in first RANKs.
Using the following example:
- dataset size: 100
- world_size: 4
- num_worker: 2

The number of data retrieved by each rank before this PR
- Rank 0: 26
- Rank 1: 26
- Rank 2: 24
- Rank 3: 24

The number of data retrieved by each rank after this PR
- Rank 0: 25
- Rank 1: 25
- Rank 2: 25
- Rank 3: 25

For the change 2:
Before this PR, `dist` functions are invoked inside worker processes. It's fine when the worker processes are forked from the parent process. All environment variables are inherited and exposed to these `dist` functions. However, when the worker processes are spawned, they won't be able to access to these environment variables, then the dataset won't be sharded by rank.
After this PR, `_sharding_worker_init_fn` should be working for both `spawn` and `fork` case.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79041
Approved by: https://github.com/VitalyFedyunin, https://github.com/NivekT
2022-06-07 21:24:27 -04:00
4c3742be4b Add check for no grad in transformer encoder nestedtensor conversion (#78832) (#78832) (#79029)
Summary:
Before, we allowed inputs with grad to be converted to NestedTensors. Autograd attempts to find the size of the NestedTensor, but NestedTensor throws an exception for its size function. This causes all calls to nn.TransformerEncoder with grad enabled to fail.

Fix: we add a check for no grad in transformer encoder so we do not convert tensor with grad to nestedtensor.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78832
Approved by: https://github.com/cpuhrsch, https://github.com/jbschlosser

Test Plan: contbuild & OSS CI, see 1f819ee965

Reviewed By: frank-wei, mikekgfb

Differential Revision: D36907614

Pulled By: erichan1

fbshipit-source-id: 576be36530da81c1eff59ac427ae860bfb402106
2022-06-07 21:23:27 -04:00
f12a1ff7f9 [1.12][DataPipe] Disable profiler for IterDataPipe by default and add deprecation of functional DataPipe names (#79027)
* [DataPipe] Disable profiler for IterDataPipe by default

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78674

Approved by: https://github.com/VitalyFedyunin

* [DataPipe] Add function for deprecation of functional DataPipe names

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78970

Approved by: https://github.com/ejguan
2022-06-07 17:48:48 -04:00
f913b4d9fb [quant] Skip some broken tests due to hypothesis
Summary:
Some quantization tests failed when we didn't touch any code related to the tests, all of them
are using hypothesis, it's likely that hypothesis is the problem. We will skip these tests for now and
gradually remove all hypothesis tests from quantization test code, or skip running the hypothesis tests in CI

Test Plan:
ossci

Reviewers:

Subscribers:

Tasks:

Tags:

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78302

Approved by: https://github.com/suo, https://github.com/dzdang

(cherry picked from commit 716f76716a842482947efbdb54ea6bf6de3577e1)
2022-06-07 10:05:29 -07:00
9229e451b2 Guard test_sparse_csr.test_mm on CUDA11+ (#77965)
Fixes #77944

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77965
Approved by: https://github.com/albanD, https://github.com/malfet

(cherry picked from commit a8467de6fa1657a6f2b3f0b426a873c6e98ce5ce)
2022-06-07 10:02:31 -07:00
d064733915 Fix coreml ios workflow (#78356)
Which were broken by https://pypi.org/project/protobuf/4.21.0/ release
Fix by installing pinned version of coremltools with pinned version of protobuf

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78356
Approved by: https://github.com/atalman

(cherry picked from commit a4723d5a5f11f974f9d9ccd83564f3de5de818c5)
2022-06-07 09:51:50 -07:00
9d67727edf [FSDP][Docs] Fix typo in full_optim_state_dict() (#78981)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78784

Approved by: https://github.com/rohan-varma
2022-06-07 10:10:32 -04:00
ec86ed25e9 Run MPS tests (#78723)
This adds a workflow, that is executed on MacOS 12.3+ machines and runs just test_mps.py
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78723
Approved by: https://github.com/albanD, https://github.com/kulinseth

(cherry picked from commit f7ac389e71e55f84651141c01334dea668b3f90c)
2022-06-07 06:57:52 -07:00
2deba51e72 [MPS] Do not pass linker command to a compiler (#78630)
`-weak_framework` is a linker rather than a compiler option and as such
it should not be passed as CXX flag
Also, use `string(APPEND` rather than `set(FOO "$(FOO) ...)`

Likely fixes our ability to use `sccache` for MacOS CI builds, see https://github.com/pytorch/pytorch/issues/78375#issuecomment-1143697183
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78630
Approved by: https://github.com/albanD

(cherry picked from commit 634954c55c05b0c0905b2299308dd9152e08af92)
2022-06-07 06:56:57 -07:00
e9a12ec87f update mps note with more details (#78669)
Follow up to the comments in https://github.com/pytorch/pytorch/pull/77767#pullrequestreview-978807521
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78669
Approved by: https://github.com/kulinseth, https://github.com/anjali411

(cherry picked from commit b30b1f3decfd2b51ac2250b00a8ae7049143d855)
2022-06-07 06:53:59 -07:00
2a8e3ee91e Update codeowners for MPS (#78727)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78727
Approved by: https://github.com/malfet

(cherry picked from commit 48c3d8573918cf47f8091e9b5e7cea7aa0785ad4)
2022-06-07 06:52:53 -07:00
47d558e862 [MPS] Add arange_mps_out implementation (#78789)
Mostly by factoring out shader logic from `linspace_out_mps` implementation
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78789
Approved by: https://github.com/albanD, https://github.com/kulinseth
2022-06-07 06:52:17 -07:00
bc0a9abad2 MPS: Fix issues with view tensors and linspace. (#78690)
Fixes: #https://github.com/pytorch/pytorch/issues/78642, https://github.com/pytorch/pytorch/issues/78511
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78690
Approved by: https://github.com/razarmehr, https://github.com/DenisVieriu97

(cherry picked from commit 4858c56334aa2b09b1ba10d0a3547ef01edda363)
2022-06-07 06:51:17 -07:00
fa7d872ce3 MPS: add linespace op (#78570)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78570
Approved by: https://github.com/malfet

(cherry picked from commit a3bdafece3a07aea186e34abc28e2540aa078393)
2022-06-07 06:51:08 -07:00
d1d2be89fd Add test case for issue: https://github.com/pytorch/pytorch/issues/77851 (#78547)
The test works fine now.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78547
Approved by: https://github.com/kulinseth

(cherry picked from commit aa62b3e003b53a0b36e04005fe5fdc8e2dda0253)
2022-06-07 06:50:53 -07:00
0e58e3374e MPS: Implement aten::count_nonzero.dim_IntList (#78169)
- See: #77764

Implements the `aten::count_nonzero.dim_IntList` operator (as used by [torch.count_nonzero](https://pytorch.org/docs/stable/generated/torch.count_nonzero.html)) for [MPS](https://pytorch.org/blog/introducing-accelerated-pytorch-training-on-mac/).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78169
Approved by: https://github.com/malfet, https://github.com/kulinseth, https://github.com/albanD

(cherry picked from commit f42b42d3eb9af4ea1d09f00a13e9b6dc9efcc0f8)
2022-06-07 06:50:41 -07:00
e3e753161c MPS: Fix crashes in view tensors due to buffer size mismatch (#78496)
Fixes #78247, #77886

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78496
Approved by: https://github.com/albanD, https://github.com/malfet

(cherry picked from commit 017b0ae9431ae3780a4eb9bf6d8865dfcd02cd92)
2022-06-07 06:50:33 -07:00
dc2b2f09d7 Speed up test_mps from 9min to 25s
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78488

Approved by: https://github.com/kulinseth

(cherry picked from commit bde246fcc60372c0ce7ee16dd5e3dc7652a36867)
2022-06-07 06:50:20 -07:00
19ebdd7eab Remove prints and add proper asserts
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78454

Approved by: https://github.com/kulinseth

(cherry picked from commit 02551a002575d1a40d6a6c7d6c7f319ef1b3ad2f)
2022-06-07 06:50:13 -07:00
f8160b113e MPS: Fixes the as_strided_mps implementation for contiguous view operations (#78440)
Fixes https://github.com/pytorch/pytorch/issues/78107; https://github.com/pytorch/pytorch/issues/77750

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78440
Approved by: https://github.com/malfet

(cherry picked from commit d63db52349ae3cffd6f762c9027e7363a6271d27)
2022-06-07 06:50:04 -07:00
3e8119bf9a MPS: Fix the memory growing issue and BERT_pytorch network crash fix. (#78006)
Fixes #77753

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78006
Approved by: https://github.com/albanD

(cherry picked from commit cbdb694f158b8471d71822873c3ac130203cc218)
2022-06-07 06:49:56 -07:00
6660df9f22 [MPS] Fix copy_kernel_mps (#78428)
By passing `storage_offset` of source and destination Tensors
This fixes following simple usecase:
```
python3` -c "import torch;x=torch.zeros(3, 3, device='mps'); x[1, 1]=1;print(x)"
```

Add test to validate it would not regress in the future

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78428
Approved by: https://github.com/kulinseth

(cherry picked from commit 437ecfc4612b73ada1f99de94f3c79de6b08f99a)
2022-06-07 06:43:55 -07:00
8b7e19a87b MPS: Eye op (#78408)
This can be used as a reference PR was to add Op in MPS backend.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78408
Approved by: https://github.com/albanD

(cherry picked from commit 8552acbd7435eadb184e0cedc21df64d3bf30329)
2022-06-07 06:43:48 -07:00
9828013233 [mps] Do not use malloc/free in Indexing.mm (#78409)
Especially allocating just 2 int64 on heap is somewhat wasteful (and they are leaked if function returns earlier)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78409
Approved by: https://github.com/seemethere, https://github.com/kulinseth

(cherry picked from commit aefb4c9fba0edf5a71a245e4fd8f5ac1d65beeac)
2022-06-07 06:43:41 -07:00
53fc6dc3db MPS: Add adaptive max pool2d op (#78410)
Adaptive max pool 2d forward and backward with test

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78410
Approved by: https://github.com/albanD

(cherry picked from commit 2e32d5fcd8de75dc2695d940925e5be181a06b54)
2022-06-07 06:43:33 -07:00
52435c6b1f MPS: add ranked tensors for addcmul ops instead of constants and update version_check (#78354)
This is a reland of https://github.com/pytorch/pytorch/pull/78312 with syntax error and formating fixed in `MPSDevice.mm`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78354
Approved by: https://github.com/kulinseth

(cherry picked from commit 45462baf7e2ef00a9aa912e2a045b20bc3ed80d3)
2022-06-07 06:43:21 -07:00
9a66061326 Fix the MPS Heap volatility (#78230)
Fixes #77829
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78230
Approved by: https://github.com/malfet

(cherry picked from commit c8ab55b2939c4cd5cd8d2e0605fdfd09e8eff294)
2022-06-07 06:42:00 -07:00
eef0ec541e Use random seed in normal_mps_out (#78010)
Fixes #78009.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78010
Approved by: https://github.com/kulinseth

(cherry picked from commit 51c4c79e3d4e600baa6a53a2afcf99ef8db5dbe0)
2022-06-07 06:41:26 -07:00
0ffefea581 Fix typo in testname (#78258)
`test_linear2D_no_bias_backwarwd` -> `test_linear2D_no_bias_backward`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78258
Approved by: https://github.com/kulinseth, https://github.com/janeyx99

(cherry picked from commit 705082656a9dd2c9f243da01f28a195b94b24d66)
2022-06-07 06:41:18 -07:00
7e12cfb29d [MPS] Lazy initialize allocators (#78227)
Do not construct MPS allocators at load time, but rather create them
lazily when needed

This significantly reduces `libtorch.dylib` load time and prevents weird
flicker, when during import torch when Intel MacBook runs switches from
integrated to discrete graphics

Before the change `python3 -c "import timeit;import importlib;print(timeit.timeit(lambda: importlib.import_module('torch'), number=1))"` takes about 1 sec, after the change it drops down to .6 sec

Minor changes:
 - Deleted unused `__block id<MTLBuffer> buf = nil;` from
   HeapAllocatorImpl
 - Add braces for single line if statements

Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78227
Approved by: https://github.com/kulinseth, https://github.com/albanD

(cherry picked from commit 2679aa47897232827771ad7bb18e14bb4be3cae8)
2022-06-07 06:41:09 -07:00
24b9bd4398 [MPS] Add version check (#78192)
Use `instancesRespondToSelector:` to test the presence of
`optimizationLevel` in `MPSGraphCompilationDescriptor`, which according
to
https://developer.apple.com/documentation/metalperformanceshadersgraph/mpsgraphcompilationdescriptor/3922624-optimizationlevel
is only available on 12.3 or newer

This works around a limitations of `@available(macOS 12.3, *)` macro in
shared libraries dynamically loaded by apps targeting older runtime.
And deployment target for macos Python conda binaries is 10.14:
```
% otool -l `which python3`
...
Load command 9
      cmd LC_BUILD_VERSION
  cmdsize 32
 platform 1
    minos 10.14
      sdk 10.14
...
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78192
Approved by: https://github.com/atalman, https://github.com/seemethere

(cherry picked from commit b7bb34d7625d95e5088638721dcc07c2bc5e2ade)
2022-06-07 06:41:02 -07:00
5342e76039 Convert MPS Tensor data using MPSGraph API (#78092)
Fixes #78091
If you are already working on this, simply disregard this or take what may be helpful. This is my attempt at MPS-native Tensor datatype conversion. It works for everything tested ~~but is currently only implemented for MPS-to-MPS copy, not MPS-to-X or X-to-MPS, but the same approach could easily be used~~.

Before:
```python
In [5]: pt.full((40,), -10.3, device="mps")
Out[5]:
tensor([-10.3000, -10.3000, -10.3000, -10.3000, -10.3000, -10.3000, -10.3000,
        -10.3000, -10.3000, -10.3000, -10.3000, -10.3000, -10.3000, -10.3000,
        -10.3000, -10.3000, -10.3000, -10.3000, -10.3000, -10.3000, -10.3000,
        -10.3000, -10.3000, -10.3000, -10.3000, -10.3000, -10.3000, -10.3000,
        -10.3000, -10.3000, -10.3000, -10.3000, -10.3000, -10.3000, -10.3000,
        -10.3000, -10.3000, -10.3000, -10.3000, -10.3000], device='mps:0')

In [6]: pt.full((40,), -10.3, device="mps").int()
Out[6]:
tensor([-1054552883, -1054552883, -1054552883, -1054552883, -1054552883,
        -1054552883, -1054552883, -1054552883, -1054552883, -1054552883,
        -1054552883, -1054552883, -1054552883, -1054552883, -1054552883,
        -1054552883, -1054552883, -1054552883, -1054552883, -1054552883,
        -1054552883, -1054552883, -1054552883, -1054552883, -1054552883,
        -1054552883, -1054552883, -1054552883, -1054552883, -1054552883,
        -1054552883, -1054552883, -1054552883, -1054552883, -1054552883,
        -1054552883, -1054552883, -1054552883, -1054552883, -1054552883],
       device='mps:0', dtype=torch.int32)

In [7]: pt.full((40,), -10.3, device="mps").int().float()
Out[7]:
tensor([-10.3000, -10.3000, -10.3000, -10.3000, -10.3000, -10.3000, -10.3000,
        -10.3000, -10.3000, -10.3000, -10.3000, -10.3000, -10.3000, -10.3000,
        -10.3000, -10.3000, -10.3000, -10.3000, -10.3000, -10.3000, -10.3000,
        -10.3000, -10.3000, -10.3000, -10.3000, -10.3000, -10.3000, -10.3000,
        -10.3000, -10.3000, -10.3000, -10.3000, -10.3000, -10.3000, -10.3000,
        -10.3000, -10.3000, -10.3000, -10.3000, -10.3000], device='mps:0')

In [8]: pt.full((40,), -10.3, device="mps").int().float().bool()
Out[8]:
tensor([ True, False, False,  True,  True, False, False,  True,  True, False,
        False,  True,  True, False, False,  True,  True, False, False,  True,
         True, False, False,  True,  True, False, False,  True,  True, False,
        False,  True,  True, False, False,  True,  True, False, False,  True],
       device='mps:0')
```

After:
```python
In [3]: pt.full((40,), -10.3, device="mps")
Out[3]:
tensor([-10.3000, -10.3000, -10.3000, -10.3000, -10.3000, -10.3000, -10.3000,
        -10.3000, -10.3000, -10.3000, -10.3000, -10.3000, -10.3000, -10.3000,
        -10.3000, -10.3000, -10.3000, -10.3000, -10.3000, -10.3000, -10.3000,
        -10.3000, -10.3000, -10.3000, -10.3000, -10.3000, -10.3000, -10.3000,
        -10.3000, -10.3000, -10.3000, -10.3000, -10.3000, -10.3000, -10.3000,
        -10.3000, -10.3000, -10.3000, -10.3000, -10.3000], device='mps:0')

In [4]: pt.full((40,), -10.3, device="mps").int()
Out[4]:
tensor([-10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10,
        -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10,
        -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10],
       device='mps:0', dtype=torch.int32)

In [5]: pt.full((40,), -10.3, device="mps").int().float()
Out[5]:
tensor([-10., -10., -10., -10., -10., -10., -10., -10., -10., -10., -10., -10.,
        -10., -10., -10., -10., -10., -10., -10., -10., -10., -10., -10., -10.,
        -10., -10., -10., -10., -10., -10., -10., -10., -10., -10., -10., -10.,
        -10., -10., -10., -10.], device='mps:0')

In [6]: pt.full((40,), -10.3, device="mps").int().float().bool()
Out[6]:
tensor([True, True, True, True, True, True, True, True, True, True, True, True,
        True, True, True, True, True, True, True, True, True, True, True, True,
        True, True, True, True, True, True, True, True, True, True, True, True,
        True, True, True, True], device='mps:0')
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78092
Approved by: https://github.com/kulinseth, https://github.com/malfet

(cherry picked from commit a52bfe2c5d8588b8f9e83e0beecdd18a1d672d0e)
2022-06-07 06:40:54 -07:00
08d70ab718 [MPS] Fix torch.mps.is_available() (#78121)
By introducing `at:mps::is_available()` and changing `torch._C._is_mps_available` from property to memoizable callable

Also, if `_mtl_device` is released in MPSDevice destructor, shouldn't it be retained in the constructor

Looks like GitHubActions Mac runner does not have any Metal devices available, according to https://github.com/malfet/deleteme/runs/6560871657?check_suite_focus=true#step:3:15

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78121
Approved by: https://github.com/albanD

(cherry picked from commit 6244daa6a9a27463f63235d88b9f728c91243a08)
2022-06-07 06:40:23 -07:00
207bde1ee8 Add ignore for -Wunsupported-availability-guard
This failed internal builds so just upstreaming the internal fix

Signed-off-by: Eli Uriegas <eliuriegasfb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77995

Approved by: https://github.com/bigfootjon, https://github.com/malfet

(cherry picked from commit a9a99a901e953cf35edd010e17ee0ed4d2f347af)
2022-06-07 06:40:18 -07:00
51428a8f43 Fix a few issues on assert/double error/legacy constructor (#77966)
Fixes https://github.com/pytorch/pytorch/issues/77960, https://github.com/pytorch/pytorch/issues/77957, https://github.com/pytorch/pytorch/issues/77781
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77966
Approved by: https://github.com/soulitzer, https://github.com/kulinseth

(cherry picked from commit 04ac80c73a9f525322a8b622659a27ad065698ea)
2022-06-07 06:38:32 -07:00
c40f18454d [DataLoader] Apply sharding settings in dist when num_workers is 0 (#78967)
ghstack-source-id: 9c53e8c9adb3ac7c80ebf22a476385509b252511
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78950

Co-authored-by: Vitaly Fedyunin <vitaly.fedyunin@gmail.com>
2022-06-07 09:15:35 -04:00
8a5156a050 [DataPipe] Adding functional API for FileLister (#78419) (#78948)
Fixes #78263

Follow-up from pytorch/data#387. This adds a functional API `list_files()` to `FileListerDataPipe`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78419
Approved by: https://github.com/NivekT, https://github.com/ejguan

Co-authored-by: Robert Xiu <xiurobert@gmail.com>
2022-06-07 09:13:09 -04:00
04d75d2008 Make ShufflerDataPipe deterministic for persistent DL and distributed DL (#78765) (#78927)
Fixes https://github.com/pytorch/data/issues/426

This PR introduces two main changes:
- It ensures the `ShufflerDataPipe` would share the same seed across distributed processes.
- Users can reset `shuffle` for persistent workers per epoch.

Detail:
- `shared_seed` is shared across distributed and worker processes. It will seed a `shared_rng` to provide seeds to each `ShufflerDataPipe` in the pipeline
- `worker_loop` now accepts a new argument of `shared_seed` to accept this shared seed.
- The `shared_seed` is attached to `_ResumeIteration` for resetting seed per epoch for `persistent worker`
- I choose not to touch `base_seed` simply for BC issue

I used this [script](https://gist.github.com/ejguan/d88f75fa822cb696ab1bc5bc25844f47) to test the result with `world_size=4`. Please check the result in: https://gist.github.com/ejguan/6ee2d2de12ca57f9eb4b97ef5a0e300b

You can see there isn't any duplicated/missing element for each epoch. And, with the same seed, the order of data remains the same across epochs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78765
Approved by: https://github.com/VitalyFedyunin
2022-06-07 09:03:26 -04:00
2652da29ab Avoid CPU Sync in SyncBatchNorm When Capturing CUDA Graphs (#78810)
We recently updated `SyncBatchNorm` to support empty input batches.
The new code removes stats from ranks with empty inputs. However,
this change breaks CUDA graph capture as it forces CPU sync. This
commit uses `is_current_stream_capturing()` to guard the new code
path, and only run the new code when not capturing CUA Graphs. To
support empty inputs with CUDA graph capturing, we might need to
update CUDA kernels for `batch_norm_backward_elemt` and
`batch_norm_gather_stats_with_counts`. See #78656.

Fixes #78549

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78666

Approved by: https://github.com/albanD
2022-06-06 09:39:03 -04:00
aa8911885b [chalf] warn once on creating a chalf tensor (#78245) (#78710)
`chalf` is experimental as the op coverage is low.

Following script raises 6 warnings if `set_warn_always(True)` else raises only 1 warning.
```python
import torch
torch.set_warn_always(True)
device='cpu'
t = torch.randn(3, dtype=torch.chalf, device=device)
y = torch.rand(3, dtype=torch.chalf, device=device)
# Allocates new tensor for result
t + y

device='cuda'
t = torch.randn(3, dtype=torch.chalf, device=device)
y = torch.rand(3, dtype=torch.chalf, device=device)

# Allocates new tensor for result
t + y

```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78245
Approved by: https://github.com/anjali411
2022-06-03 10:51:18 -04:00
528710ec89 [DataLoader] DataLoader now automatically apply sharding to DataPipes (#78762)
ghstack-source-id: ac918b064cd09cd68a04c28238481c76b46b4010
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78631

Co-authored-by: Vitaly Fedyunin <vitaly.fedyunin@gmail.com>
2022-06-03 10:21:55 -04:00
de53f70e1d [GHA] attempt to re-enable mac test workflows (#78000) (#78749)
Our mac tests have not been running since #77645 because of
<img width="1386" alt="image" src="https://user-images.githubusercontent.com/31798555/169602783-988a265a-ce4a-41a7-8f13-3eb4615b0d6f.png">

https://github.com/pytorch/pytorch/actions/runs/2345334995
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78000
Approved by: https://github.com/malfet

Co-authored-by: Jane Xu <janeyx@fb.com>
2022-06-02 15:25:22 -04:00
39ebb3e06e fix set item to scalar tensor missing gradient info (#78746)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78246

Approved by: https://github.com/ngimel
2022-06-02 15:04:36 -04:00
fd3cc823ce [DataPipe] Lazily generate exception message for performance (#78726)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78673

Approved by: https://github.com/ejguan
2022-06-02 14:26:38 -04:00
5bb7c617f6 [docs][nn] conv: complex support note (#78351) (#78709)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78351
Approved by: https://github.com/anjali411, https://github.com/jbschlosser
2022-06-02 14:23:29 -04:00
8a627381c9 [DataLoader] Minor documentation improvement (#78548)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78404

Approved by: https://github.com/ejguan
2022-06-01 14:05:54 -04:00
f56e16a70f [ONNX] Fix typo when comparing DeviceObjType (#78085) (#78370)
#77423 Introduced a typo in

1db9be70a7/torch/onnx/symbolic_opset9.py (L5012-L5017)

where the string `DeviceObjType` was replaced with `_C.DeviceObjType`. This PR reverts the changes to the strings.

**Tested:**

With torchvision,

```
pytest test/test_onnx.py::TestONNXExporter::test_mask_rcnn
pytest -n auto test/test_onnx.py::TestONNXExporter
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78085
Approved by: https://github.com/datumbox, https://github.com/BowenBao, https://github.com/ezyang

Co-authored-by: Justin Chu <justinchuby@users.noreply.github.com>
2022-05-26 13:05:36 -07:00
c93a7f8bea Update PyTorch/XLA git clone branch name for 1.12 (#78315) 2022-05-25 16:06:39 -07:00
919b53c5e7 [Profiler] Fix segfault in AppendOnlyList (#78084) 2022-05-24 17:21:45 -04:00
2ad18abc49 [MPS] Initialize MPSDevice::_mtl_device property to nil (#78136) (#78204)
This prevents `import torch` accidentally crash on machines with no metal devices

Should prevent crashes reported in https://github.com/pytorch/pytorch/pull/77662#issuecomment-1134637986 and https://github.com/pytorch/functorch/runs/6560056366?check_suite_focus=true

Backtrace to the crash:
```
(lldb) bt
* thread #1, stop reason = signal SIGSTOP
  * frame #0: 0x00007fff7202be57 libobjc.A.dylib`objc_msgSend + 23
    frame #1: 0x000000010fd9f524 libtorch_cpu.dylib`at::mps::HeapAllocator::MPSHeapAllocatorImpl::MPSHeapAllocatorImpl() + 436
    frame #2: 0x000000010fda011d libtorch_cpu.dylib`_GLOBAL__sub_I_MPSAllocator.mm + 125
    frame #3: 0x000000010ada81e3 dyld`ImageLoaderMachO::doModInitFunctions(ImageLoader::LinkContext const&) + 535
    frame #4: 0x000000010ada85ee dyld`ImageLoaderMachO::doInitialization(ImageLoader::LinkContext const&) + 40(lldb) up
frame #1: 0x000000010fd9f524 libtorch_cpu.dylib`at::mps::HeapAllocator::MPSHeapAllocatorImpl::MPSHeapAllocatorImpl() + 436
libtorch_cpu.dylib`at::mps::HeapAllocator::MPSHeapAllocatorImpl::MPSHeapAllocatorImpl:
->  0x10fd9f524 <+436>: movq   %rax, 0x1b0(%rbx)
    0x10fd9f52b <+443>: movw   $0x0, 0x1b8(%rbx)
    0x10fd9f534 <+452>: addq   $0x8, %rsp
    0x10fd9f538 <+456>: popq   %rbx
(lldb) disassemble
 ...
    0x10fd9f514 <+420>: movq   0xf19ad15(%rip), %rsi     ; "maxBufferLength"
    0x10fd9f51b <+427>: movq   %r14, %rdi
    0x10fd9f51e <+430>: callq  *0xeaa326c(%rip)          ; (void *)0x00007fff7202be40: objc_msgSend
```

which corresponds to `[m_device maxBufferLength]` call, where `m_device` is not initialized in
2ae3c59e4b/aten/src/ATen/mps/MPSAllocator.h (L171)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78136
Approved by: https://github.com/seemethere

Co-authored-by: Nikita Shulga <nshulga@fb.com>
2022-05-24 17:03:38 -04:00
9596b999f8 Fix unit tests (#78056) 2022-05-24 17:00:57 -04:00
baabb4cb96 MPS: Add back the memory leak fixes. (#77964) (#78198)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77964
Approved by: https://github.com/albanD

Co-authored-by: Kulin Seth <kulinseth@gmail.com>
2022-05-24 16:36:16 -04:00
906a6e1df9 Fixing release rc build names (#78174) 2022-05-24 15:04:36 -04:00
974f7f8080 [1.12] Remove torch.vmap (#78021) 2022-05-23 10:30:23 -07:00
8abf37d74e ci: Pin builder to release/1.12 (#77986)
Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
2022-05-20 14:00:44 -04:00
8ff2bc0c01 Release 1.12 Install torch from test channel, Pin builder and xla repo (#77983) 2022-05-20 10:51:22 -07:00
a119b7f6d4 retry - enable NVFuser by default
Enable NVFuser in OSS.

Retry of #77213, because it was breaking torchvision tests.

Fix in #77471 has been verified by jjsjann123

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77579

Approved by: https://github.com/eellison, https://github.com/malfet, https://github.com/atalman, https://github.com/seemethere
2022-05-20 10:31:49 -07:00
6809 changed files with 334465 additions and 884769 deletions

108
.bazelrc
View File

@ -1,4 +1,4 @@
build --cxxopt=--std=c++17
build --cxxopt=--std=c++14
build --copt=-I.
# Bazel does not support including its cc_library targets as system
# headers. We work around this for generated code
@ -13,103 +13,15 @@ build:no-tty --curses no
build:no-tty --progress_report_interval 10
build:no-tty --show_progress_rate_limit 10
# Build with GPU support by default.
build --define=cuda=true
# rules_cuda configuration
build --@rules_cuda//cuda:enable_cuda
build --@rules_cuda//cuda:cuda_targets=sm_52
build --@rules_cuda//cuda:compiler=nvcc
build --repo_env=CUDA_PATH=/usr/local/cuda
# Configuration to build without GPU support
build:cpu-only --define=cuda=false
# Configuration to build with GPU support
build:gpu --define=cuda=true
# define a separate build folder for faster switching between configs
build:cpu-only --platform_suffix=-cpu-only
build:gpu --platform_suffix=-gpu
# See the note on the config-less build for details about why we are
# doing this. We must also do it for the "-cpu-only" platform suffix.
build --copt=-isystem --copt=bazel-out/k8-fastbuild-cpu-only/bin
# doing this. We must also do it for the "-gpu" platform suffix.
build --copt=-isystem --copt=bazel-out/k8-fastbuild-gpu/bin
# rules_cuda configuration
build:cpu-only --@rules_cuda//cuda:enable_cuda=False
# Definition of --config=shell
# interactive shell immediately before execution
build:shell --run_under="//tools/bazel_tools:shellwrap"
# Disable all warnings for external repositories. We don't care about
# their warnings.
build --per_file_copt=^external/@-w
# Set additional warnings to error level.
#
# Implementation notes:
# * we use file extensions to determine if we are using the C++
# compiler or the cuda compiler
# * we use ^// at the start of the regex to only permit matching
# PyTorch files. This excludes external repos.
#
# Note that because this is logically a command-line flag, it is
# considered the word on what warnings are enabled. This has the
# unfortunate consequence of preventing us from disabling an error at
# the target level because those flags will come before these flags in
# the action invocation. Instead we provide per-file exceptions after
# this.
#
# On the bright side, this means we don't have to more broadly apply
# the exceptions to an entire target.
#
# Looking for CUDA flags? We have a cu_library macro that we can edit
# directly. Look in //tools/rules:cu.bzl for details. Editing the
# macro over this has the following advantages:
# * making changes does not require discarding the Bazel analysis
# cache
# * it allows for selective overrides on individual targets since the
# macro-level opts will come earlier than target level overrides
build --per_file_copt='^//.*\.(cpp|cc)$'@-Werror=all
# The following warnings come from -Wall. We downgrade them from error
# to warnings here.
#
# sign-compare has a tremendous amount of violations in the
# codebase. It will be a lot of work to fix them, just disable it for
# now.
build --per_file_copt='^//.*\.(cpp|cc)$'@-Wno-sign-compare
# We intentionally use #pragma unroll, which is compiler specific.
build --per_file_copt='^//.*\.(cpp|cc)$'@-Wno-error=unknown-pragmas
build --per_file_copt='^//.*\.(cpp|cc)$'@-Werror=extra
# The following warnings come from -Wextra. We downgrade them from error
# to warnings here.
#
# unused-parameter-compare has a tremendous amount of violations in the
# codebase. It will be a lot of work to fix them, just disable it for
# now.
build --per_file_copt='^//.*\.(cpp|cc)$'@-Wno-unused-parameter
# missing-field-parameters has both a large number of violations in
# the codebase, but it also is used pervasively in the Python C
# API. There are a couple of catches though:
# * we use multiple versions of the Python API and hence have
# potentially multiple different versions of each relevant
# struct. They may have different numbers of fields. It will be
# unwieldy to support multiple versions in the same source file.
# * Python itself for many of these structs recommends only
# initializing a subset of the fields. We should respect the API
# usage conventions of our dependencies.
#
# Hence, we just disable this warning altogether. We may want to clean
# up some of the clear-cut cases that could be risky, but we still
# likely want to have this disabled for the most part.
build --per_file_copt='^//.*\.(cpp|cc)$'@-Wno-missing-field-initializers
build --per_file_copt='//:aten/src/ATen/RegisterCompositeExplicitAutograd\.cpp$'@-Wno-error=unused-function
build --per_file_copt='//:aten/src/ATen/RegisterCompositeImplicitAutograd\.cpp$'@-Wno-error=unused-function
build --per_file_copt='//:aten/src/ATen/RegisterMkldnnCPU\.cpp$'@-Wno-error=unused-function
build --per_file_copt='//:aten/src/ATen/RegisterNestedTensorCPU\.cpp$'@-Wno-error=unused-function
build --per_file_copt='//:aten/src/ATen/RegisterQuantizedCPU\.cpp$'@-Wno-error=unused-function
build --per_file_copt='//:aten/src/ATen/RegisterSparseCPU\.cpp$'@-Wno-error=unused-function
build --per_file_copt='//:aten/src/ATen/RegisterSparseCsrCPU\.cpp$'@-Wno-error=unused-function
build --per_file_copt='//:aten/src/ATen/RegisterNestedTensorMeta\.cpp$'@-Wno-error=unused-function
build --per_file_copt='//:aten/src/ATen/RegisterSparseMeta\.cpp$'@-Wno-error=unused-function
build --per_file_copt='//:aten/src/ATen/RegisterQuantizedMeta\.cpp$'@-Wno-error=unused-function
build --per_file_copt='//:aten/src/ATen/RegisterZeroTensor\.cpp$'@-Wno-error=unused-function
build --per_file_copt='//:torch/csrc/lazy/generated/RegisterAutogradLazy\.cpp$'@-Wno-error=unused-function
build --per_file_copt='//:torch/csrc/lazy/generated/RegisterLazy\.cpp$'@-Wno-error=unused-function
build:gpu --@rules_cuda//cuda:enable_cuda
build:gpu --@rules_cuda//cuda:cuda_targets=sm_52
build:gpu --@rules_cuda//cuda:compiler=nvcc
build:gpu --repo_env=CUDA_PATH=/usr/local/cuda

View File

@ -1,13 +1,8 @@
[pt]
is_oss=1
[buildfile]
name = BUCK.oss
includes = //tools/build_defs/select.bzl
name = BUILD.buck
[repositories]
bazel_skylib = third_party/bazel-skylib/
ovr_config = .
[download]
in_build = true
@ -15,11 +10,6 @@
[cxx]
cxxflags = -std=c++17
should_remap_host_platform = true
cpp = /usr/bin/clang
cc = /usr/bin/clang
cxx = /usr/bin/clang++
cxxpp = /usr/bin/clang++
ld = /usr/bin/clang++
[project]
default_flavors_mode=all

View File

@ -1,32 +0,0 @@
#!/bin/bash
# Work around bug where devtoolset replaces sudo and breaks it.
if [ -n "$DEVTOOLSET_VERSION" ]; then
export SUDO=/bin/sudo
else
export SUDO=sudo
fi
as_jenkins() {
# NB: unsetting the environment variables works around a conda bug
# https://github.com/conda/conda/issues/6576
# NB: Pass on PATH and LD_LIBRARY_PATH to sudo invocation
# NB: This must be run from a directory that jenkins has access to,
# works around https://github.com/conda/conda-package-handling/pull/34
$SUDO -H -u jenkins env -u SUDO_UID -u SUDO_GID -u SUDO_COMMAND -u SUDO_USER env "PATH=$PATH" "LD_LIBRARY_PATH=$LD_LIBRARY_PATH" $*
}
conda_install() {
# Ensure that the install command don't upgrade/downgrade Python
# This should be called as
# conda_install pkg1 pkg2 ... [-c channel]
as_jenkins conda install -q -n py_$ANACONDA_PYTHON_VERSION -y python="$ANACONDA_PYTHON_VERSION" $*
}
conda_run() {
as_jenkins conda run -n py_$ANACONDA_PYTHON_VERSION --no-capture-output $*
}
pip_install() {
as_jenkins conda run -n py_$ANACONDA_PYTHON_VERSION pip install --progress-bar off $*
}

View File

@ -1,27 +0,0 @@
#!/bin/bash
if [[ ${CUDNN_VERSION} == 8 ]]; then
# cuDNN license: https://developer.nvidia.com/cudnn/license_agreement
mkdir tmp_cudnn && cd tmp_cudnn
CUDNN_NAME="cudnn-linux-x86_64-8.3.2.44_cuda11.5-archive"
if [[ ${CUDA_VERSION:0:4} == "11.7" ]]; then
CUDNN_NAME="cudnn-linux-x86_64-8.5.0.96_cuda11-archive"
curl --retry 3 -OLs https://ossci-linux.s3.amazonaws.com/${CUDNN_NAME}.tar.xz
elif [[ ${CUDA_VERSION:0:4} == "11.8" ]]; then
CUDNN_NAME="cudnn-linux-x86_64-8.7.0.84_cuda11-archive"
curl --retry 3 -OLs https://developer.download.nvidia.com/compute/redist/cudnn/v8.7.0/local_installers/11.8/${CUDNN_NAME}.tar.xz
else
curl --retry 3 -OLs https://developer.download.nvidia.com/compute/redist/cudnn/v8.3.2/local_installers/11.5/${CUDNN_NAME}.tar.xz
fi
tar xf ${CUDNN_NAME}.tar.xz
cp -a ${CUDNN_NAME}/include/* /usr/include/
cp -a ${CUDNN_NAME}/include/* /usr/local/cuda/include/
cp -a ${CUDNN_NAME}/include/* /usr/include/x86_64-linux-gnu/
cp -a ${CUDNN_NAME}/lib/* /usr/local/cuda/lib64/
cp -a ${CUDNN_NAME}/lib/* /usr/lib/x86_64-linux-gnu/
cd ..
rm -rf tmp_cudnn
ldconfig
fi

View File

@ -1,29 +0,0 @@
#!/bin/bash
set -ex
source "$(dirname "${BASH_SOURCE[0]}")/common_utils.sh"
if [ -n "${UBUNTU_VERSION}" ]; then
apt update
apt-get install -y clang doxygen git graphviz nodejs npm libtinfo5
fi
# Do shallow clone of PyTorch so that we can init lintrunner in Docker build context
git clone https://github.com/pytorch/pytorch.git --depth 1
chown -R jenkins pytorch
pushd pytorch
# Install all linter dependencies
pip_install -r requirements.txt
conda_run lintrunner init
# Cache .lintbin directory as part of the Docker image
cp -r .lintbin /tmp
popd
# Node dependencies required by toc linter job
npm install -g markdown-toc
# Cleaning up
rm -rf pytorch

View File

@ -1,29 +0,0 @@
#!/bin/bash
set -ex
# "install" hipMAGMA into /opt/rocm/magma by copying after build
git clone https://bitbucket.org/icl/magma.git
pushd magma
# Fixes memory leaks of magma found while executing linalg UTs
git checkout 5959b8783e45f1809812ed96ae762f38ee701972
cp make.inc-examples/make.inc.hip-gcc-mkl make.inc
echo 'LIBDIR += -L$(MKLROOT)/lib' >> make.inc
echo 'LIB += -Wl,--enable-new-dtags -Wl,--rpath,/opt/rocm/lib -Wl,--rpath,$(MKLROOT)/lib -Wl,--rpath,/opt/rocm/magma/lib' >> make.inc
echo 'DEVCCFLAGS += --gpu-max-threads-per-block=256' >> make.inc
export PATH="${PATH}:/opt/rocm/bin"
if [[ -n "$PYTORCH_ROCM_ARCH" ]]; then
amdgpu_targets=`echo $PYTORCH_ROCM_ARCH | sed 's/;/ /g'`
else
amdgpu_targets=`rocm_agent_enumerator | grep -v gfx000 | sort -u | xargs`
fi
for arch in $amdgpu_targets; do
echo "DEVCCFLAGS += --amdgpu-target=$arch" >> make.inc
done
# hipcc with openmp flag may cause isnan() on __device__ not to be found; depending on context, compiler may attempt to match with host definition
sed -i 's/^FOPENMP/#FOPENMP/g' make.inc
make -f make.gen.hipMAGMA -j $(nproc)
LANG=C.UTF-8 make lib/libmagma.so -j $(nproc) MKLROOT=/opt/conda/envs/py_$ANACONDA_PYTHON_VERSION
make testing/testing_dgemm -j $(nproc) MKLROOT=/opt/conda/envs/py_$ANACONDA_PYTHON_VERSION
popd
mv magma /opt/rocm

View File

@ -1,48 +0,0 @@
#!/bin/bash
set -ex
if [[ -d "/usr/local/cuda/" ]]; then
with_cuda=/usr/local/cuda/
else
with_cuda=no
fi
function install_ucx() {
set -ex
git clone --recursive https://github.com/openucx/ucx.git
pushd ucx
git checkout ${UCX_COMMIT}
git submodule update --init --recursive
./autogen.sh
./configure --prefix=$UCX_HOME \
--enable-mt \
--with-cuda=$with_cuda \
--enable-profiling \
--enable-stats
time make -j
sudo make install
popd
rm -rf ucx
}
function install_ucc() {
set -ex
git clone --recursive https://github.com/openucx/ucc.git
pushd ucc
git checkout ${UCC_COMMIT}
git submodule update --init --recursive
./autogen.sh
./configure --prefix=$UCC_HOME --with-ucx=$UCX_HOME --with-cuda=$with_cuda
time make -j
sudo make install
popd
rm -rf ucc
}
install_ucx
install_ucc

View File

@ -1,34 +0,0 @@
ARG UBUNTU_VERSION
FROM ubuntu:${UBUNTU_VERSION}
ARG UBUNTU_VERSION
ENV DEBIAN_FRONTEND noninteractive
# Install common dependencies (so that this step can be cached separately)
COPY ./common/install_base.sh install_base.sh
RUN bash ./install_base.sh && rm install_base.sh
# Install user
COPY ./common/install_user.sh install_user.sh
RUN bash ./install_user.sh && rm install_user.sh
# Install conda and other packages (e.g., numpy, pytest)
ARG ANACONDA_PYTHON_VERSION
ARG CONDA_CMAKE
ENV ANACONDA_PYTHON_VERSION=$ANACONDA_PYTHON_VERSION
ENV PATH /opt/conda/envs/py_$ANACONDA_PYTHON_VERSION/bin:/opt/conda/bin:$PATH
COPY requirements-ci.txt /opt/conda/requirements-ci.txt
COPY ./common/install_conda.sh install_conda.sh
COPY ./common/common_utils.sh common_utils.sh
RUN bash ./install_conda.sh && rm install_conda.sh common_utils.sh /opt/conda/requirements-ci.txt
# Note that Docker build forbids copying file outside the build context
COPY ./common/install_linter.sh install_linter.sh
COPY ./common/common_utils.sh common_utils.sh
RUN bash ./install_linter.sh
RUN rm install_linter.sh common_utils.sh
USER jenkins
CMD ["bash"]

View File

@ -1,14 +0,0 @@
# Jenkins
The scripts in this directory are the entrypoint for testing ONNX exporter.
The environment variable `BUILD_ENVIRONMENT` is expected to be set to
the build environment you intend to test. It is a hint for the build
and test scripts to configure Caffe2 a certain way and include/exclude
tests. Docker images, they equal the name of the image itself. For
example: `py2-cuda9.0-cudnn7-ubuntu16.04`. The Docker images that are
built on Jenkins and are used in triggered builds already have this
environment variable set in their manifest. Also see
`./docker/jenkins/*/Dockerfile` and search for `BUILD_ENVIRONMENT`.
Our Jenkins installation is located at https://ci.pytorch.org/jenkins/.

View File

@ -1,19 +0,0 @@
set -ex
LOCAL_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
ROOT_DIR=$(cd "$LOCAL_DIR"/../.. && pwd)
TEST_DIR="$ROOT_DIR/test"
pytest_reports_dir="${TEST_DIR}/test-reports/python"
# Figure out which Python to use
PYTHON="$(which python)"
if [[ "${BUILD_ENVIRONMENT}" =~ py((2|3)\.?[0-9]?\.?[0-9]?) ]]; then
PYTHON=$(which "python${BASH_REMATCH[1]}")
fi
if [[ "${BUILD_ENVIRONMENT}" == *rocm* ]]; then
# HIP_PLATFORM is auto-detected by hipcc; unset to avoid build errors
unset HIP_PLATFORM
fi
mkdir -p "$pytest_reports_dir" || true

View File

@ -1,74 +0,0 @@
#!/bin/bash
# shellcheck source=./common.sh
source "$(dirname "${BASH_SOURCE[0]}")/common.sh"
if [[ ${BUILD_ENVIRONMENT} == *onnx* ]]; then
pip install click mock tabulate networkx==2.0
pip -q install --user "file:///var/lib/jenkins/workspace/third_party/onnx#egg=onnx"
fi
# Skip tests in environments where they are not built/applicable
if [[ "${BUILD_ENVIRONMENT}" == *-android* ]]; then
echo 'Skipping tests'
exit 0
fi
if [[ "${BUILD_ENVIRONMENT}" == *-rocm* ]]; then
# temporary to locate some kernel issues on the CI nodes
export HSAKMT_DEBUG_LEVEL=4
fi
# These additional packages are needed for circleci ROCm builds.
if [[ $BUILD_ENVIRONMENT == *rocm* ]]; then
# Need networkx 2.0 because bellmand_ford was moved in 2.1 . Scikit-image by
# defaults installs the most recent networkx version, so we install this lower
# version explicitly before scikit-image pulls it in as a dependency
pip install networkx==2.0
# click - onnx
pip install --progress-bar off click protobuf tabulate virtualenv mock typing-extensions
fi
################################################################################
# Python tests #
################################################################################
if [[ "$BUILD_ENVIRONMENT" == *cmake* ]]; then
exit 0
fi
# If pip is installed as root, we must use sudo.
# CircleCI docker images could install conda as jenkins user, or use the OS's python package.
PIP=$(which pip)
PIP_USER=$(stat --format '%U' $PIP)
CURRENT_USER=$(id -u -n)
if [[ "$PIP_USER" = root && "$CURRENT_USER" != root ]]; then
MAYBE_SUDO=sudo
fi
# Uninstall pre-installed hypothesis and coverage to use an older version as newer
# versions remove the timeout parameter from settings which ideep/conv_transpose_test.py uses
$MAYBE_SUDO pip -q uninstall -y hypothesis
$MAYBE_SUDO pip -q uninstall -y coverage
# "pip install hypothesis==3.44.6" from official server is unreliable on
# CircleCI, so we host a copy on S3 instead
$MAYBE_SUDO pip -q install attrs==18.1.0 -f https://s3.amazonaws.com/ossci-linux/wheels/attrs-18.1.0-py2.py3-none-any.whl
$MAYBE_SUDO pip -q install coverage==4.5.1 -f https://s3.amazonaws.com/ossci-linux/wheels/coverage-4.5.1-cp36-cp36m-macosx_10_12_x86_64.whl
$MAYBE_SUDO pip -q install hypothesis==4.57.1
##############
# ONNX tests #
##############
if [[ "$BUILD_ENVIRONMENT" == *onnx* ]]; then
pip install -q --user --no-use-pep517 "git+https://github.com/pytorch/vision.git@$(cat .github/ci_commit_pins/vision.txt)"
pip install -q --user transformers==4.25.1
pip install -q --user ninja flatbuffers==2.0 numpy==1.22.4 onnxruntime==1.14.0 beartype==0.10.4
# TODO: change this when onnx 1.13.1 is released.
pip install --no-use-pep517 'onnx @ git+https://github.com/onnx/onnx@e192ba01e438d22ca2dedd7956e28e3551626c91'
# TODO: change this when onnx-script is on testPypi
pip install 'onnx-script @ git+https://github.com/microsoft/onnx-script@a71e35bcd72537bf7572536ee57250a0c0488bf6'
# numba requires numpy <= 1.20, onnxruntime requires numpy >= 1.21.
# We don't actually need it for our tests, but it's imported if it's present, so uninstall.
pip uninstall -q --yes numba
# JIT C++ extensions require ninja, so put it into PATH.
export PATH="/var/lib/jenkins/.local/bin:$PATH"
"$ROOT_DIR/scripts/onnx/test.sh"
fi

View File

@ -1,29 +0,0 @@
#!/bin/bash
# Required environment variable: $BUILD_ENVIRONMENT
# (This is set by default in the Docker images we build, so you don't
# need to set it yourself.
# shellcheck source=./common.sh
source "$(dirname "${BASH_SOURCE[0]}")/common.sh"
# shellcheck source=./common-build.sh
source "$(dirname "${BASH_SOURCE[0]}")/common-build.sh"
echo "Clang version:"
clang --version
python tools/stats/export_test_times.py
if [ -n "$(which conda)" ]; then
export CMAKE_PREFIX_PATH=/opt/conda
fi
CC="clang" CXX="clang++" LDSHARED="clang --shared" \
CFLAGS="-fsanitize=thread" \
USE_TSAN=1 USE_CUDA=0 USE_MKLDNN=0 \
python setup.py bdist_wheel
pip_install_whl "$(echo dist/*.whl)"
print_sccache_stats
assert_git_not_dirty

View File

@ -1,58 +0,0 @@
#!/bin/bash
# Required environment variables:
# $BUILD_ENVIRONMENT (should be set by your Docker image)
if [[ "$BUILD_ENVIRONMENT" != *win-* ]]; then
# Save the absolute path in case later we chdir (as occurs in the gpu perf test)
script_dir="$( cd "$(dirname "${BASH_SOURCE[0]}")" || exit ; pwd -P )"
if which sccache > /dev/null; then
# Save sccache logs to file
sccache --stop-server > /dev/null 2>&1 || true
rm -f ~/sccache_error.log || true
function sccache_epilogue() {
echo "::group::Sccache Compilation Log"
echo '=================== sccache compilation log ==================='
python "$script_dir/print_sccache_log.py" ~/sccache_error.log 2>/dev/null || true
echo '=========== If your build fails, please take a look at the log above for possible reasons ==========='
sccache --show-stats
sccache --stop-server || true
echo "::endgroup::"
}
# Register the function here so that the error log can be printed even when
# sccache fails to start, i.e. timeout error
trap_add sccache_epilogue EXIT
if [[ -n "${SKIP_SCCACHE_INITIALIZATION:-}" ]]; then
# sccache --start-server seems to hang forever on self hosted runners for GHA
# so let's just go ahead and skip the --start-server altogether since it seems
# as though sccache still gets used even when the sscache server isn't started
# explicitly
echo "Skipping sccache server initialization, setting environment variables"
export SCCACHE_IDLE_TIMEOUT=1200
export SCCACHE_ERROR_LOG=~/sccache_error.log
export RUST_LOG=sccache::server=error
elif [[ "${BUILD_ENVIRONMENT}" == *rocm* ]]; then
SCCACHE_ERROR_LOG=~/sccache_error.log SCCACHE_IDLE_TIMEOUT=0 sccache --start-server
else
# increasing SCCACHE_IDLE_TIMEOUT so that extension_backend_test.cpp can build after this PR:
# https://github.com/pytorch/pytorch/pull/16645
SCCACHE_ERROR_LOG=~/sccache_error.log SCCACHE_IDLE_TIMEOUT=1200 RUST_LOG=sccache::server=error sccache --start-server
fi
# Report sccache stats for easier debugging
sccache --zero-stats
fi
if which ccache > /dev/null; then
# Report ccache stats for easier debugging
ccache --zero-stats
ccache --show-stats
function ccache_epilogue() {
ccache --show-stats
}
trap_add ccache_epilogue EXIT
fi
fi

View File

@ -1,28 +0,0 @@
#!/bin/bash
# Common setup for all Jenkins scripts
# shellcheck source=./common_utils.sh
source "$(dirname "${BASH_SOURCE[0]}")/common_utils.sh"
set -ex
# Required environment variables:
# $BUILD_ENVIRONMENT (should be set by your Docker image)
# Figure out which Python to use for ROCm
if [[ "${BUILD_ENVIRONMENT}" == *rocm* ]]; then
# HIP_PLATFORM is auto-detected by hipcc; unset to avoid build errors
unset HIP_PLATFORM
export PYTORCH_TEST_WITH_ROCM=1
# temporary to locate some kernel issues on the CI nodes
export HSAKMT_DEBUG_LEVEL=4
# improve rccl performance for distributed tests
export HSA_FORCE_FINE_GRAIN_PCIE=1
fi
# TODO: Renable libtorch testing for MacOS, see https://github.com/pytorch/pytorch/issues/62598
# shellcheck disable=SC2034
BUILD_TEST_LIBTORCH=0
retry () {
"$@" || (sleep 1 && "$@") || (sleep 2 && "$@")
}

View File

@ -1,236 +0,0 @@
#!/bin/bash
# Common util **functions** that can be sourced in other scripts.
# note: printf is used instead of echo to avoid backslash
# processing and to properly handle values that begin with a '-'.
log() { printf '%s\n' "$*"; }
error() { log "ERROR: $*" >&2; }
fatal() { error "$@"; exit 1; }
retry () {
"$@" || (sleep 10 && "$@") || (sleep 20 && "$@") || (sleep 40 && "$@")
}
# compositional trap taken from https://stackoverflow.com/a/7287873/23845
# appends a command to a trap
#
# - 1st arg: code to add
# - remaining args: names of traps to modify
#
trap_add() {
trap_add_cmd=$1; shift || fatal "${FUNCNAME[0]} usage error"
for trap_add_name in "$@"; do
trap -- "$(
# helper fn to get existing trap command from output
# of trap -p
extract_trap_cmd() { printf '%s\n' "$3"; }
# print existing trap command with newline
eval "extract_trap_cmd $(trap -p "${trap_add_name}")"
# print the new trap command
printf '%s\n' "${trap_add_cmd}"
)" "${trap_add_name}" \
|| fatal "unable to add to trap ${trap_add_name}"
done
}
# set the trace attribute for the above function. this is
# required to modify DEBUG or RETURN traps because functions don't
# inherit them unless the trace attribute is set
declare -f -t trap_add
function assert_git_not_dirty() {
# TODO: we should add an option to `build_amd.py` that reverts the repo to
# an unmodified state.
if [[ "$BUILD_ENVIRONMENT" != *rocm* ]] && [[ "$BUILD_ENVIRONMENT" != *xla* ]] ; then
git_status=$(git status --porcelain)
if [[ $git_status ]]; then
echo "Build left local git repository checkout dirty"
echo "git status --porcelain:"
echo "${git_status}"
exit 1
fi
fi
}
function pip_install_whl() {
# This is used to install PyTorch and other build artifacts wheel locally
# without using any network connection
python3 -mpip install --no-index --no-deps "$@"
}
function pip_install() {
# retry 3 times
# old versions of pip don't have the "--progress-bar" flag
pip install --progress-bar off "$@" || pip install --progress-bar off "$@" || pip install --progress-bar off "$@" ||\
pip install "$@" || pip install "$@" || pip install "$@"
}
function pip_uninstall() {
# uninstall 2 times
pip uninstall -y "$@" || pip uninstall -y "$@"
}
function get_exit_code() {
set +e
"$@"
retcode=$?
set -e
return $retcode
}
function get_bazel() {
if [[ $(uname) == "Darwin" ]]; then
# download bazel version
retry curl https://github.com/bazelbuild/bazel/releases/download/4.2.1/bazel-4.2.1-darwin-x86_64 -Lo tools/bazel
# verify content
echo '74d93848f0c9d592e341e48341c53c87e3cb304a54a2a1ee9cff3df422f0b23c tools/bazel' | shasum -a 256 -c >/dev/null
else
# download bazel version
retry curl https://ossci-linux.s3.amazonaws.com/bazel-4.2.1-linux-x86_64 -o tools/bazel
# verify content
echo '1a4f3a3ce292307bceeb44f459883859c793436d564b95319aacb8af1f20557c tools/bazel' | shasum -a 256 -c >/dev/null
fi
chmod +x tools/bazel
}
function install_monkeytype {
# Install MonkeyType
pip_install MonkeyType
}
function get_pinned_commit() {
cat .github/ci_commit_pins/"${1}".txt
}
function install_torchtext() {
local commit
commit=$(get_pinned_commit text)
pip_install --no-use-pep517 --user "git+https://github.com/pytorch/text.git@${commit}"
}
function install_torchvision() {
local commit
commit=$(get_pinned_commit vision)
pip_install --no-use-pep517 --user "git+https://github.com/pytorch/vision.git@${commit}"
}
function clone_pytorch_xla() {
if [[ ! -d ./xla ]]; then
git clone --recursive -b r2.0 --quiet https://github.com/pytorch/xla.git
pushd xla
# pin the xla hash so that we don't get broken by changes to xla
git checkout "$(cat ../.github/ci_commit_pins/xla.txt)"
git submodule sync
git submodule update --init --recursive
popd
fi
}
function install_filelock() {
pip_install filelock
}
function install_triton() {
local commit
if [[ "${TEST_CONFIG}" == *rocm* ]]; then
echo "skipping triton due to rocm"
else
commit=$(get_pinned_commit triton)
if [[ "${BUILD_ENVIRONMENT}" == *gcc7* ]]; then
# Trition needs gcc-9 to build
sudo apt-get install -y g++-9
CXX=g++-9 pip_install --user "git+https://github.com/openai/triton@${commit}#subdirectory=python"
elif [[ "${BUILD_ENVIRONMENT}" == *clang* ]]; then
# Trition needs <filesystem> which surprisingly is not available with clang-9 toolchain
sudo add-apt-repository -y ppa:ubuntu-toolchain-r/test
sudo apt-get install -y g++-9
CXX=g++-9 pip_install --user "git+https://github.com/openai/triton@${commit}#subdirectory=python"
else
pip_install --user "git+https://github.com/openai/triton@${commit}#subdirectory=python"
fi
pip_install --user jinja2
fi
}
function setup_torchdeploy_deps(){
conda install -y -n "py_${ANACONDA_PYTHON_VERSION}" "libpython-static=${ANACONDA_PYTHON_VERSION}"
local CC
local CXX
CC="$(which gcc)"
CXX="$(which g++)"
export CC
export CXX
pip install --upgrade pip
}
function checkout_install_torchdeploy() {
local commit
commit=$(get_pinned_commit multipy)
setup_torchdeploy_deps
pushd ..
git clone --recurse-submodules https://github.com/pytorch/multipy.git
pushd multipy
git checkout "${commit}"
python multipy/runtime/example/generate_examples.py
pip install -e .
popd
popd
}
function test_torch_deploy(){
pushd ..
pushd multipy
./multipy/runtime/build/test_deploy
popd
popd
}
function install_huggingface() {
local commit
commit=$(get_pinned_commit huggingface)
pip_install pandas
pip_install scipy
pip_install "git+https://github.com/huggingface/transformers.git@${commit}#egg=transformers"
}
function install_timm() {
local commit
commit=$(get_pinned_commit timm)
pip_install pandas
pip_install scipy
pip_install "git+https://github.com/rwightman/pytorch-image-models@${commit}"
}
function checkout_install_torchbench() {
git clone https://github.com/pytorch/benchmark torchbench
pushd torchbench
git checkout no_torchaudio
if [ "$1" ]; then
python install.py --continue_on_fail models "$@"
else
# Occasionally the installation may fail on one model but it is ok to continue
# to install and test other models
python install.py --continue_on_fail
fi
popd
}
function test_functorch() {
python test/run_test.py --functorch --verbose
}
function print_sccache_stats() {
echo 'PyTorch Build Statistics'
sccache --show-stats
if [[ -n "${OUR_GITHUB_JOB_ID}" ]]; then
sccache --show-stats --stats-format json | jq .stats \
> "sccache-stats-${BUILD_ENVIRONMENT}-${OUR_GITHUB_JOB_ID}.json"
else
echo "env var OUR_GITHUB_JOB_ID not set, will not write sccache stats to json"
fi
}

View File

@ -1,14 +0,0 @@
#!/bin/bash
# Common prelude for macos-build.sh and macos-test.sh
# shellcheck source=./common.sh
source "$(dirname "${BASH_SOURCE[0]}")/common.sh"
sysctl -a | grep machdep.cpu
# These are required for both the build job and the test job.
# In the latter to test cpp extensions.
export MACOSX_DEPLOYMENT_TARGET=10.9
export CXX=clang++
export CC=clang

View File

@ -1,19 +0,0 @@
call %SCRIPT_HELPERS_DIR%\setup_pytorch_env.bat
:: exit the batch once there's an error
if not errorlevel 0 (
echo "setup pytorch env failed"
echo %errorlevel%
exit /b
)
echo "Test functorch"
pushd test
python run_test.py --functorch --shard "%SHARD_NUMBER%" "%NUM_TEST_SHARDS%" --verbose
popd
if ERRORLEVEL 1 goto fail
:eof
exit /b 0
:fail
exit /b 1

View File

@ -1,26 +0,0 @@
if "%BUILD_ENVIRONMENT%"=="" (
set CONDA_PARENT_DIR=%CD%
) else (
set CONDA_PARENT_DIR=C:\Jenkins
)
:: Be conservative here when rolling out the new AMI with conda. This will try
:: to install conda as before if it couldn't find the conda installation. This
:: can be removed eventually after we gain enough confidence in the AMI
if not exist %CONDA_PARENT_DIR%\Miniconda3 (
set INSTALL_FRESH_CONDA=1
)
if "%INSTALL_FRESH_CONDA%"=="1" (
curl --retry 3 --retry-all-errors -k https://repo.anaconda.com/miniconda/Miniconda3-latest-Windows-x86_64.exe --output %TMP_DIR_WIN%\Miniconda3-latest-Windows-x86_64.exe
if errorlevel 1 exit /b
if not errorlevel 0 exit /b
%TMP_DIR_WIN%\Miniconda3-latest-Windows-x86_64.exe /InstallationType=JustMe /RegisterPython=0 /S /AddToPath=0 /D=%CONDA_PARENT_DIR%\Miniconda3
if errorlevel 1 exit /b
if not errorlevel 0 exit /b
)
:: Activate conda so that we can use its commands, i.e. conda, python, pip
call %CONDA_PARENT_DIR%\Miniconda3\Scripts\activate.bat %CONDA_PARENT_DIR%\Miniconda3

View File

@ -1,468 +0,0 @@
Warning
=======
Contents may be out of date. Our CircleCI workflows are gradually being migrated to Github actions.
Structure of CI
===============
setup job:
1. Does a git checkout
2. Persists CircleCI scripts (everything in `.circleci`) into a workspace. Why?
We don't always do a Git checkout on all subjobs, but we usually
still want to be able to call scripts one way or another in a subjob.
Persisting files this way lets us have access to them without doing a
checkout. This workspace is conventionally mounted on `~/workspace`
(this is distinguished from `~/project`, which is the conventional
working directory that CircleCI will default to starting your jobs
in.)
3. Write out the commit message to `.circleci/COMMIT_MSG`. This is so
we can determine in subjobs if we should actually run the jobs or
not, even if there isn't a Git checkout.
CircleCI configuration generator
================================
One may no longer make changes to the `.circleci/config.yml` file directly.
Instead, one must edit these Python scripts or files in the `verbatim-sources/` directory.
Usage
----------
1. Make changes to these scripts.
2. Run the `regenerate.sh` script in this directory and commit the script changes and the resulting change to `config.yml`.
You'll see a build failure on GitHub if the scripts don't agree with the checked-in version.
Motivation
----------
These scripts establish a single, authoritative source of documentation for the CircleCI configuration matrix.
The documentation, in the form of diagrams, is automatically generated and cannot drift out of sync with the YAML content.
Furthermore, consistency is enforced within the YAML config itself, by using a single source of data to generate
multiple parts of the file.
* Facilitates one-off culling/enabling of CI configs for testing PRs on special targets
Also see https://github.com/pytorch/pytorch/issues/17038
Future direction
----------------
### Declaring sparse config subsets
See comment [here](https://github.com/pytorch/pytorch/pull/17323#pullrequestreview-206945747):
In contrast with a full recursive tree traversal of configuration dimensions,
> in the future I think we actually want to decrease our matrix somewhat and have only a few mostly-orthogonal builds that taste as many different features as possible on PRs, plus a more complete suite on every PR and maybe an almost full suite nightly/weekly (we don't have this yet). Specifying PR jobs in the future might be easier to read with an explicit list when we come to this.
----------------
----------------
# How do the binaries / nightlies / releases work?
### What is a binary?
A binary or package (used interchangeably) is a pre-built collection of c++ libraries, header files, python bits, and other files. We build these and distribute them so that users do not need to install from source.
A **binary configuration** is a collection of
* release or nightly
* releases are stable, nightlies are beta and built every night
* python version
* linux: 3.7m (mu is wide unicode or something like that. It usually doesn't matter but you should know that it exists)
* macos: 3.7, 3.8
* windows: 3.7, 3.8
* cpu version
* cpu, cuda 9.0, cuda 10.0
* The supported cuda versions occasionally change
* operating system
* Linux - these are all built on CentOS. There haven't been any problems in the past building on CentOS and using on Ubuntu
* MacOS
* Windows - these are built on Azure pipelines
* devtoolset version (gcc compiler version)
* This only matters on Linux cause only Linux uses gcc. tldr is gcc made a backwards incompatible change from gcc 4.8 to gcc 5, because it had to change how it implemented std::vector and std::string
### Where are the binaries?
The binaries are built in CircleCI. There are nightly binaries built every night at 9pm PST (midnight EST) and release binaries corresponding to Pytorch releases, usually every few months.
We have 3 types of binary packages
* pip packages - nightlies are stored on s3 (pip install -f \<a s3 url\>). releases are stored in a pip repo (pip install torch) (ask Soumith about this)
* conda packages - nightlies and releases are both stored in a conda repo. Nighty packages have a '_nightly' suffix
* libtorch packages - these are zips of all the c++ libraries, header files, and sometimes dependencies. These are c++ only
* shared with dependencies (the only supported option for Windows)
* static with dependencies
* shared without dependencies
* static without dependencies
All binaries are built in CircleCI workflows except Windows. There are checked-in workflows (committed into the .circleci/config.yml) to build the nightlies every night. Releases are built by manually pushing a PR that builds the suite of release binaries (overwrite the config.yml to build the release)
# CircleCI structure of the binaries
Some quick vocab:
* A \**workflow** is a CircleCI concept; it is a DAG of '**jobs**'. ctrl-f 'workflows' on https://github.com/pytorch/pytorch/blob/master/.circleci/config.yml to see the workflows.
* **jobs** are a sequence of '**steps**'
* **steps** are usually just a bash script or a builtin CircleCI command. *All steps run in new environments, environment variables declared in one script DO NOT persist to following steps*
* CircleCI has a **workspace**, which is essentially a cache between steps of the *same job* in which you can store artifacts between steps.
## How are the workflows structured?
The nightly binaries have 3 workflows. We have one job (actually 3 jobs: build, test, and upload) per binary configuration
1. binary_builds
1. every day midnight EST
2. linux: https://github.com/pytorch/pytorch/blob/master/.circleci/verbatim-sources/linux-binary-build-defaults.yml
3. macos: https://github.com/pytorch/pytorch/blob/master/.circleci/verbatim-sources/macos-binary-build-defaults.yml
4. For each binary configuration, e.g. linux_conda_3.7_cpu there is a
1. binary_linux_conda_3.7_cpu_build
1. Builds the build. On linux jobs this uses the 'docker executor'.
2. Persists the package to the workspace
2. binary_linux_conda_3.7_cpu_test
1. Loads the package to the workspace
2. Spins up a docker image (on Linux), mapping the package and code repos into the docker
3. Runs some smoke tests in the docker
4. (Actually, for macos this is a step rather than a separate job)
3. binary_linux_conda_3.7_cpu_upload
1. Logs in to aws/conda
2. Uploads the package
2. update_s3_htmls
1. every day 5am EST
2. https://github.com/pytorch/pytorch/blob/master/.circleci/verbatim-sources/binary_update_htmls.yml
3. See below for what these are for and why they're needed
4. Three jobs that each examine the current contents of aws and the conda repo and update some html files in s3
3. binarysmoketests
1. every day
2. https://github.com/pytorch/pytorch/blob/master/.circleci/verbatim-sources/nightly-build-smoke-tests-defaults.yml
3. For each binary configuration, e.g. linux_conda_3.7_cpu there is a
1. smoke_linux_conda_3.7_cpu
1. Downloads the package from the cloud, e.g. using the official pip or conda instructions
2. Runs the smoke tests
## How are the jobs structured?
The jobs are in https://github.com/pytorch/pytorch/tree/master/.circleci/verbatim-sources. Jobs are made of multiple steps. There are some shared steps used by all the binaries/smokes. Steps of these jobs are all delegated to scripts in https://github.com/pytorch/pytorch/tree/master/.circleci/scripts .
* Linux jobs: https://github.com/pytorch/pytorch/blob/master/.circleci/verbatim-sources/linux-binary-build-defaults.yml
* binary_linux_build.sh
* binary_linux_test.sh
* binary_linux_upload.sh
* MacOS jobs: https://github.com/pytorch/pytorch/blob/master/.circleci/verbatim-sources/macos-binary-build-defaults.yml
* binary_macos_build.sh
* binary_macos_test.sh
* binary_macos_upload.sh
* Update html jobs: https://github.com/pytorch/pytorch/blob/master/.circleci/verbatim-sources/binary_update_htmls.yml
* These delegate from the pytorch/builder repo
* https://github.com/pytorch/builder/blob/master/cron/update_s3_htmls.sh
* https://github.com/pytorch/builder/blob/master/cron/upload_binary_sizes.sh
* Smoke jobs (both linux and macos): https://github.com/pytorch/pytorch/blob/master/.circleci/verbatim-sources/nightly-build-smoke-tests-defaults.yml
* These delegate from the pytorch/builder repo
* https://github.com/pytorch/builder/blob/master/run_tests.sh
* https://github.com/pytorch/builder/blob/master/smoke_test.sh
* https://github.com/pytorch/builder/blob/master/check_binary.sh
* Common shared code (shared across linux and macos): https://github.com/pytorch/pytorch/blob/master/.circleci/verbatim-sources/nightly-binary-build-defaults.yml
* binary_checkout.sh - checks out pytorch/builder repo. Right now this also checks out pytorch/pytorch, but it shouldn't. pytorch/pytorch should just be shared through the workspace. This can handle being run before binary_populate_env.sh
* binary_populate_env.sh - parses BUILD_ENVIRONMENT into the separate env variables that make up a binary configuration. Also sets lots of default values, the date, the version strings, the location of folders in s3, all sorts of things. This generally has to be run before other steps.
* binary_install_miniconda.sh - Installs miniconda, cross platform. Also hacks this for the update_binary_sizes job that doesn't have the right env variables
* binary_run_in_docker.sh - Takes a bash script file (the actual test code) from a hardcoded location, spins up a docker image, and runs the script inside the docker image
### **Why do the steps all refer to scripts?**
CircleCI creates a final yaml file by inlining every <<* segment, so if we were to keep all the code in the config.yml itself then the config size would go over 4 MB and cause infra problems.
### **What is binary_run_in_docker for?**
So, CircleCI has several executor types: macos, machine, and docker are the ones we use. The 'machine' executor gives you two cores on some linux vm. The 'docker' executor gives you considerably more cores (nproc was 32 instead of 2 back when I tried in February). Since the dockers are faster, we try to run everything that we can in dockers. Thus
* linux build jobs use the docker executor. Running them on the docker executor was at least 2x faster than running them on the machine executor
* linux test jobs use the machine executor in order for them to properly interface with GPUs since docker executors cannot execute with attached GPUs
* linux upload jobs use the machine executor. The upload jobs are so short that it doesn't really matter what they use
* linux smoke test jobs use the machine executor for the same reason as the linux test jobs
binary_run_in_docker.sh is a way to share the docker start-up code between the binary test jobs and the binary smoke test jobs
### **Why does binary_checkout also checkout pytorch? Why shouldn't it?**
We want all the nightly binary jobs to run on the exact same git commit, so we wrote our own checkout logic to ensure that the same commit was always picked. Later circleci changed that to use a single pytorch checkout and persist it through the workspace (they did this because our config file was too big, so they wanted to take a lot of the setup code into scripts, but the scripts needed the code repo to exist to be called, so they added a prereq step called 'setup' to checkout the code and persist the needed scripts to the workspace). The changes to the binary jobs were not properly tested, so they all broke from missing pytorch code no longer existing. We hotfixed the problem by adding the pytorch checkout back to binary_checkout, so now there's two checkouts of pytorch on the binary jobs. This problem still needs to be fixed, but it takes careful tracing of which code is being called where.
# Code structure of the binaries (circleci agnostic)
## Overview
The code that runs the binaries lives in two places, in the normal [github.com/pytorch/pytorch](http://github.com/pytorch/pytorch), but also in [github.com/pytorch/builder](http://github.com/pytorch/builder), which is a repo that defines how all the binaries are built. The relevant code is
```
# All code needed to set-up environments for build code to run in,
# but only code that is specific to the current CI system
pytorch/pytorch
- .circleci/ # Folder that holds all circleci related stuff
- config.yml # GENERATED file that actually controls all circleci behavior
- verbatim-sources # Used to generate job/workflow sections in ^
- scripts/ # Code needed to prepare circleci environments for binary build scripts
- setup.py # Builds pytorch. This is wrapped in pytorch/builder
- cmake files # used in normal building of pytorch
# All code needed to prepare a binary build, given an environment
# with all the right variables/packages/paths.
pytorch/builder
# Given an installed binary and a proper python env, runs some checks
# to make sure the binary was built the proper way. Checks things like
# the library dependencies, symbols present, etc.
- check_binary.sh
# Given an installed binary, runs python tests to make sure everything
# is in order. These should be de-duped. Right now they both run smoke
# tests, but are called from different places. Usually just call some
# import statements, but also has overlap with check_binary.sh above
- run_tests.sh
- smoke_test.sh
# Folders that govern how packages are built. See paragraphs below
- conda/
- build_pytorch.sh # Entrypoint. Delegates to proper conda build folder
- switch_cuda_version.sh # Switches activate CUDA installation in Docker
- pytorch-nightly/ # Build-folder
- manywheel/
- build_cpu.sh # Entrypoint for cpu builds
- build.sh # Entrypoint for CUDA builds
- build_common.sh # Actual build script that ^^ call into
- wheel/
- build_wheel.sh # Entrypoint for wheel builds
- windows/
- build_pytorch.bat # Entrypoint for wheel builds on Windows
```
Every type of package has an entrypoint build script that handles the all the important logic.
## Conda
Linux, MacOS and Windows use the same code flow for the conda builds.
Conda packages are built with conda-build, see https://conda.io/projects/conda-build/en/latest/resources/commands/conda-build.html
Basically, you pass `conda build` a build folder (pytorch-nightly/ above) that contains a build script and a meta.yaml. The meta.yaml specifies in what python environment to build the package in, and what dependencies the resulting package should have, and the build script gets called in the env to build the thing.
tl;dr on conda-build is
1. Creates a brand new conda environment, based off of deps in the meta.yaml
1. Note that environment variables do not get passed into this build env unless they are specified in the meta.yaml
2. If the build fails this environment will stick around. You can activate it for much easier debugging. The “General Python” section below explains what exactly a python “environment” is.
2. Calls build.sh in the environment
3. Copies the finished package to a new conda env, also specified by the meta.yaml
4. Runs some simple import tests (if specified in the meta.yaml)
5. Saves the finished package as a tarball
The build.sh we use is essentially a wrapper around `python setup.py build`, but it also manually copies in some of our dependent libraries into the resulting tarball and messes with some rpaths.
The entrypoint file `builder/conda/build_conda.sh` is complicated because
* It works for Linux, MacOS and Windows
* The mac builds used to create their own environments, since they all used to be on the same machine. Theres now a lot of extra logic to handle conda envs. This extra machinery could be removed
* It used to handle testing too, which adds more logic messing with python environments too. This extra machinery could be removed.
## Manywheels (linux pip and libtorch packages)
Manywheels are pip packages for linux distros. Note that these manywheels are not actually manylinux compliant.
`builder/manywheel/build_cpu.sh` and `builder/manywheel/build.sh` (for CUDA builds) just set different env vars and then call into `builder/manywheel/build_common.sh`
The entrypoint file `builder/manywheel/build_common.sh` is really really complicated because
* This used to handle building for several different python versions at the same time. The loops have been removed, but there's still unnecessary folders and movements here and there.
* The script is never used this way anymore. This extra machinery could be removed.
* This used to handle testing the pip packages too. This is why theres testing code at the end that messes with python installations and stuff
* The script is never used this way anymore. This extra machinery could be removed.
* This also builds libtorch packages
* This should really be separate. libtorch packages are c++ only and have no python. They should not share infra with all the python specific stuff in this file.
* There is a lot of messing with rpaths. This is necessary, but could be made much much simpler if the above issues were fixed.
## Wheels (MacOS pip and libtorch packages)
The entrypoint file `builder/wheel/build_wheel.sh` is complicated because
* The mac builds used to all run on one machine (we didnt have autoscaling mac machines till circleci). So this script handled siloing itself by setting-up and tearing-down its build env and siloing itself into its own build directory.
* The script is never used this way anymore. This extra machinery could be removed.
* This also builds libtorch packages
* Ditto the comment above. This should definitely be separated out.
Note that the MacOS Python wheels are still built in conda environments. Some of the dependencies present during build also come from conda.
## Windows Wheels (Windows pip and libtorch packages)
The entrypoint file `builder/windows/build_pytorch.bat` is complicated because
* This used to handle building for several different python versions at the same time. This is why there are loops everywhere
* The script is never used this way anymore. This extra machinery could be removed.
* This used to handle testing the pip packages too. This is why theres testing code at the end that messes with python installations and stuff
* The script is never used this way anymore. This extra machinery could be removed.
* This also builds libtorch packages
* This should really be separate. libtorch packages are c++ only and have no python. They should not share infra with all the python specific stuff in this file.
Note that the Windows Python wheels are still built in conda environments. Some of the dependencies present during build also come from conda.
## General notes
### Note on run_tests.sh, smoke_test.sh, and check_binary.sh
* These should all be consolidated
* These must run on all OS types: MacOS, Linux, and Windows
* These all run smoke tests at the moment. They inspect the packages some, maybe run a few import statements. They DO NOT run the python tests nor the cpp tests. The idea is that python tests on master and PR merges will catch all breakages. All these tests have to do is make sure the special binary machinery didnt mess anything up.
* There are separate run_tests.sh and smoke_test.sh because one used to be called by the smoke jobs and one used to be called by the binary test jobs (see circleci structure section above). This is still true actually, but these could be united into a single script that runs these checks, given an installed pytorch package.
### Note on libtorch
Libtorch packages are built in the wheel build scripts: manywheel/build_*.sh for linux and build_wheel.sh for mac. There are several things wrong with this
* Its confusing. Most of those scripts deal with python specifics.
* The extra conditionals everywhere severely complicate the wheel build scripts
* The process for building libtorch is different from the official instructions (a plain call to cmake, or a call to a script)
### Note on docker images / Dockerfiles
All linux builds occur in docker images. The docker images are
* pytorch/conda-cuda
* Has ALL CUDA versions installed. The script pytorch/builder/conda/switch_cuda_version.sh sets /usr/local/cuda to a symlink to e.g. /usr/local/cuda-10.0 to enable different CUDA builds
* Also used for cpu builds
* pytorch/manylinux-cuda90
* pytorch/manylinux-cuda100
* Also used for cpu builds
The Dockerfiles are available in pytorch/builder, but there is no circleci job or script to build these docker images, and they cannot be run locally (unless you have the correct local packages/paths). Only Soumith can build them right now.
### General Python
* This is still a good explanation of python installations https://caffe2.ai/docs/faq.html#why-do-i-get-import-errors-in-python-when-i-try-to-use-caffe2
# How to manually rebuild the binaries
tl;dr make a PR that looks like https://github.com/pytorch/pytorch/pull/21159
Sometimes we want to push a change to master and then rebuild all of today's binaries after that change. As of May 30, 2019 there isn't a way to manually run a workflow in the UI. You can manually re-run a workflow, but it will use the exact same git commits as the first run and will not include any changes. So we have to make a PR and then force circleci to run the binary workflow instead of the normal tests. The above PR is an example of how to do this; essentially you copy-paste the binarybuilds workflow steps into the default workflow steps. If you need to point the builder repo to a different commit then you'd need to change https://github.com/pytorch/pytorch/blob/master/.circleci/scripts/binary_checkout.sh#L42-L45 to checkout what you want.
## How to test changes to the binaries via .circleci
Writing PRs that test the binaries is annoying, since the default circleci jobs that run on PRs are not the jobs that you want to run. Likely, changes to the binaries will touch something under .circleci/ and require that .circleci/config.yml be regenerated (.circleci/config.yml controls all .circleci behavior, and is generated using `.circleci/regenerate.sh` in python 3.7). But you also need to manually hardcode the binary jobs that you want to test into the .circleci/config.yml workflow, so you should actually make at least two commits, one for your changes and one to temporarily hardcode jobs. See https://github.com/pytorch/pytorch/pull/22928 as an example of how to do this.
```sh
# Make your changes
touch .circleci/verbatim-sources/nightly-binary-build-defaults.yml
# Regenerate the yaml, has to be in python 3.7
.circleci/regenerate.sh
# Make a commit
git add .circleci *
git commit -m "My real changes"
git push origin my_branch
# Now hardcode the jobs that you want in the .circleci/config.yml workflows section
# Also eliminate ensure-consistency and should_run_job checks
# e.g. https://github.com/pytorch/pytorch/commit/2b3344bfed8772fe86e5210cc4ee915dee42b32d
# Make a commit you won't keep
git add .circleci
git commit -m "[DO NOT LAND] testing binaries for above changes"
git push origin my_branch
# Now you need to make some changes to the first commit.
git rebase -i HEAD~2 # mark the first commit as 'edit'
# Make the changes
touch .circleci/verbatim-sources/nightly-binary-build-defaults.yml
.circleci/regenerate.sh
# Ammend the commit and recontinue
git add .circleci
git commit --amend
git rebase --continue
# Update the PR, need to force since the commits are different now
git push origin my_branch --force
```
The advantage of this flow is that you can make new changes to the base commit and regenerate the .circleci without having to re-write which binary jobs you want to test on. The downside is that all updates will be force pushes.
## How to build a binary locally
### Linux
You can build Linux binaries locally easily using docker.
```sh
# Run the docker
# Use the correct docker image, pytorch/conda-cuda used here as an example
#
# -v path/to/foo:path/to/bar makes path/to/foo on your local machine (the
# machine that you're running the command on) accessible to the docker
# container at path/to/bar. So if you then run `touch path/to/bar/baz`
# in the docker container then you will see path/to/foo/baz on your local
# machine. You could also clone the pytorch and builder repos in the docker.
#
# If you know how, add ccache as a volume too and speed up everything
docker run \
-v your/pytorch/repo:/pytorch \
-v your/builder/repo:/builder \
-v where/you/want/packages/to/appear:/final_pkgs \
-it pytorch/conda-cuda /bin/bash
# Export whatever variables are important to you. All variables that you'd
# possibly need are in .circleci/scripts/binary_populate_env.sh
# You should probably always export at least these 3 variables
export PACKAGE_TYPE=conda
export DESIRED_PYTHON=3.7
export DESIRED_CUDA=cpu
# Call the entrypoint
# `|& tee foo.log` just copies all stdout and stderr output to foo.log
# The builds generate lots of output so you probably need this when
# building locally.
/builder/conda/build_pytorch.sh |& tee build_output.log
```
**Building CUDA binaries on docker**
You can build CUDA binaries on CPU only machines, but you can only run CUDA binaries on CUDA machines. This means that you can build a CUDA binary on a docker on your laptop if you so choose (though its gonna take a long time).
For Facebook employees, ask about beefy machines that have docker support and use those instead of your laptop; it will be 5x as fast.
### MacOS
Theres no easy way to generate reproducible hermetic MacOS environments. If you have a Mac laptop then you can try emulating the .circleci environments as much as possible, but you probably have packages in /usr/local/, possibly installed by brew, that will probably interfere with the build. If youre trying to repro an error on a Mac build in .circleci and you cant seem to repro locally, then my best advice is actually to iterate on .circleci :/
But if you want to try, then Id recommend
```sh
# Create a new terminal
# Clear your LD_LIBRARY_PATH and trim as much out of your PATH as you
# know how to do
# Install a new miniconda
# First remove any other python or conda installation from your PATH
# Always install miniconda 3, even if building for Python <3
new_conda="~/my_new_conda"
conda_sh="$new_conda/install_miniconda.sh"
curl -o "$conda_sh" https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh
chmod +x "$conda_sh"
"$conda_sh" -b -p "$MINICONDA_ROOT"
rm -f "$conda_sh"
export PATH="~/my_new_conda/bin:$PATH"
# Create a clean python env
# All MacOS builds use conda to manage the python env and dependencies
# that are built with, even the pip packages
conda create -yn binary python=2.7
conda activate binary
# Export whatever variables are important to you. All variables that you'd
# possibly need are in .circleci/scripts/binary_populate_env.sh
# You should probably always export at least these 3 variables
export PACKAGE_TYPE=conda
export DESIRED_PYTHON=3.7
export DESIRED_CUDA=cpu
# Call the entrypoint you want
path/to/builder/wheel/build_wheel.sh
```
N.B. installing a brand new miniconda is important. This has to do with how conda installations work. See the “General Python” section above, but tldr; is that
1. You make the conda command accessible by prepending `path/to/conda_root/bin` to your PATH.
2. You make a new env and activate it, which then also gets prepended to your PATH. Now you have `path/to/conda_root/envs/new_env/bin:path/to/conda_root/bin:$PATH`
3. Now say you (or some code that you ran) call python executable `foo`
1. if you installed `foo` in `new_env`, then `path/to/conda_root/envs/new_env/bin/foo` will get called, as expected.
2. But if you forgot to installed `foo` in `new_env` but happened to previously install it in your root conda env (called base), then unix/linux will still find `path/to/conda_root/bin/foo` . This is dangerous, since `foo` can be a different version than you want; `foo` can even be for an incompatible python version!
Newer conda versions and proper python hygiene can prevent this, but just install a new miniconda to be safe.
### Windows
TODO: fill in

View File

@ -57,7 +57,7 @@ WINDOWS_LIBTORCH_CONFIG_VARIANTS = [
class TopLevelNode(ConfigNode):
def __init__(self, node_name, config_tree_data, smoke):
super().__init__(None, node_name)
super(TopLevelNode, self).__init__(None, node_name)
self.config_tree_data = config_tree_data
self.props["smoke"] = smoke
@ -68,7 +68,7 @@ class TopLevelNode(ConfigNode):
class OSConfigNode(ConfigNode):
def __init__(self, parent, os_name, gpu_versions, py_tree):
super().__init__(parent, os_name)
super(OSConfigNode, self).__init__(parent, os_name)
self.py_tree = py_tree
self.props["os_name"] = os_name
@ -80,7 +80,7 @@ class OSConfigNode(ConfigNode):
class PackageFormatConfigNode(ConfigNode):
def __init__(self, parent, package_format, python_versions):
super().__init__(parent, package_format)
super(PackageFormatConfigNode, self).__init__(parent, package_format)
self.props["python_versions"] = python_versions
self.props["package_format"] = package_format
@ -97,7 +97,7 @@ class PackageFormatConfigNode(ConfigNode):
class LinuxGccConfigNode(ConfigNode):
def __init__(self, parent, gcc_config_variant):
super().__init__(parent, "GCC_CONFIG_VARIANT=" + str(gcc_config_variant))
super(LinuxGccConfigNode, self).__init__(parent, "GCC_CONFIG_VARIANT=" + str(gcc_config_variant))
self.props["gcc_config_variant"] = gcc_config_variant
@ -122,7 +122,7 @@ class LinuxGccConfigNode(ConfigNode):
class WindowsLibtorchConfigNode(ConfigNode):
def __init__(self, parent, libtorch_config_variant):
super().__init__(parent, "LIBTORCH_CONFIG_VARIANT=" + str(libtorch_config_variant))
super(WindowsLibtorchConfigNode, self).__init__(parent, "LIBTORCH_CONFIG_VARIANT=" + str(libtorch_config_variant))
self.props["libtorch_config_variant"] = libtorch_config_variant
@ -132,7 +132,7 @@ class WindowsLibtorchConfigNode(ConfigNode):
class ArchConfigNode(ConfigNode):
def __init__(self, parent, gpu):
super().__init__(parent, get_processor_arch_name(gpu))
super(ArchConfigNode, self).__init__(parent, get_processor_arch_name(gpu))
self.props["gpu"] = gpu
@ -142,7 +142,7 @@ class ArchConfigNode(ConfigNode):
class PyVersionConfigNode(ConfigNode):
def __init__(self, parent, pyver):
super().__init__(parent, pyver)
super(PyVersionConfigNode, self).__init__(parent, pyver)
self.props["pyver"] = pyver
@ -158,7 +158,7 @@ class PyVersionConfigNode(ConfigNode):
class LinkingVariantConfigNode(ConfigNode):
def __init__(self, parent, linking_variant):
super().__init__(parent, linking_variant)
super(LinkingVariantConfigNode, self).__init__(parent, linking_variant)
def get_children(self):
return [DependencyInclusionConfigNode(self, v) for v in DEPS_INCLUSION_DIMENSIONS]
@ -166,6 +166,6 @@ class LinkingVariantConfigNode(ConfigNode):
class DependencyInclusionConfigNode(ConfigNode):
def __init__(self, parent, deps_variant):
super().__init__(parent, deps_variant)
super(DependencyInclusionConfigNode, self).__init__(parent, deps_variant)
self.props["libtorch_variant"] = "-".join([self.parent.get_label(), self.get_label()])

View File

@ -4,7 +4,6 @@ CUDA_VERSIONS = [
"102",
"113",
"116",
"117",
]
ROCM_VERSIONS = [

View File

@ -12,7 +12,7 @@ def get_major_pyver(dotted_version):
class TreeConfigNode(ConfigNode):
def __init__(self, parent, node_name, subtree):
super().__init__(parent, self.modify_label(node_name))
super(TreeConfigNode, self).__init__(parent, self.modify_label(node_name))
self.subtree = subtree
self.init2(node_name)
@ -28,7 +28,7 @@ class TreeConfigNode(ConfigNode):
class TopLevelNode(TreeConfigNode):
def __init__(self, node_name, subtree):
super().__init__(None, node_name, subtree)
super(TopLevelNode, self).__init__(None, node_name, subtree)
# noinspection PyMethodMayBeStatic
def child_constructor(self):
@ -75,7 +75,6 @@ class ExperimentalFeatureConfigNode(TreeConfigNode):
"vulkan": VulkanConfigNode,
"parallel_tbb": ParallelTBBConfigNode,
"crossref": CrossRefConfigNode,
"dynamo": DynamoConfigNode,
"parallel_native": ParallelNativeConfigNode,
"onnx": ONNXConfigNode,
"libtorch": LibTorchConfigNode,
@ -180,14 +179,6 @@ class CrossRefConfigNode(TreeConfigNode):
return ImportantConfigNode
class DynamoConfigNode(TreeConfigNode):
def init2(self, node_name):
self.props["is_dynamo"] = node_name
def child_constructor(self):
return ImportantConfigNode
class ParallelNativeConfigNode(TreeConfigNode):
def modify_label(self, label):
return "PARALLELNATIVE=" + str(label)

View File

@ -240,7 +240,6 @@ def instantiate_configs(only_slow_gradcheck):
is_xla = fc.find_prop("is_xla") or False
is_asan = fc.find_prop("is_asan") or False
is_crossref = fc.find_prop("is_crossref") or False
is_dynamo = fc.find_prop("is_dynamo") or False
is_onnx = fc.find_prop("is_onnx") or False
is_pure_torch = fc.find_prop("is_pure_torch") or False
is_vulkan = fc.find_prop("is_vulkan") or False
@ -287,9 +286,6 @@ def instantiate_configs(only_slow_gradcheck):
if is_crossref:
parms_list_ignored_for_docker_image.append("crossref")
if is_dynamo:
parms_list_ignored_for_docker_image.append("dynamo")
if is_onnx:
parms_list.append("onnx")
python_version = fc.find_prop("pyver")

View File

@ -1,5 +1,4 @@
from cimodel.data.simple.util.versions import MultiPartVersion
from cimodel.data.simple.util.branch_filters import gen_filter_dict_exclude
import cimodel.lib.miniutils as miniutils
XCODE_VERSION = MultiPartVersion([12, 5, 1])
@ -12,7 +11,7 @@ class ArchVariant:
def render(self):
extra_parts = [self.custom_build_name] if len(self.custom_build_name) > 0 else []
return "-".join([self.name] + extra_parts).replace("_", "-")
return "_".join([self.name] + extra_parts)
def get_platform(arch_variant_name):
@ -26,25 +25,30 @@ class IOSJob:
self.is_org_member_context = is_org_member_context
self.extra_props = extra_props
def gen_name_parts(self):
version_parts = self.xcode_version.render_dots_or_parts("-")
build_variant_suffix = self.arch_variant.render()
def gen_name_parts(self, with_version_dots):
version_parts = self.xcode_version.render_dots_or_parts(with_version_dots)
build_variant_suffix = "_".join([self.arch_variant.render(), "build"])
return [
"pytorch",
"ios",
] + version_parts + [
build_variant_suffix,
]
def gen_job_name(self):
return "-".join(self.gen_name_parts())
return "_".join(self.gen_name_parts(False))
def gen_tree(self):
platform_name = get_platform(self.arch_variant.name)
props_dict = {
"name": self.gen_job_name(),
"build_environment": self.gen_job_name(),
"build_environment": "-".join(self.gen_name_parts(True)),
"ios_arch": self.arch_variant.name,
"ios_platform": platform_name,
"name": self.gen_job_name(),
}
if self.is_org_member_context:
@ -53,28 +57,30 @@ class IOSJob:
if self.extra_props:
props_dict.update(self.extra_props)
props_dict["filters"] = gen_filter_dict_exclude()
return [{"pytorch_ios_build": props_dict}]
WORKFLOW_DATA = [
IOSJob(XCODE_VERSION, ArchVariant("x86_64"), is_org_member_context=False, extra_props={
"lite_interpreter": miniutils.quote(str(int(True)))}),
# IOSJob(XCODE_VERSION, ArchVariant("arm64"), extra_props={
# "lite_interpreter": miniutils.quote(str(int(True)))}),
# IOSJob(XCODE_VERSION, ArchVariant("arm64", "metal"), extra_props={
# "use_metal": miniutils.quote(str(int(True))),
# "lite_interpreter": miniutils.quote(str(int(True)))}),
# IOSJob(XCODE_VERSION, ArchVariant("arm64", "custom-ops"), extra_props={
# "op_list": "mobilenetv2.yaml",
# "lite_interpreter": miniutils.quote(str(int(True)))}),
IOSJob(XCODE_VERSION, ArchVariant("x86_64", "full_jit"), is_org_member_context=False, extra_props={
"lite_interpreter": miniutils.quote(str(int(False)))}),
IOSJob(XCODE_VERSION, ArchVariant("arm64"), extra_props={
"lite_interpreter": miniutils.quote(str(int(True)))}),
IOSJob(XCODE_VERSION, ArchVariant("arm64", "metal"), extra_props={
"use_metal": miniutils.quote(str(int(True))),
"lite_interpreter": miniutils.quote(str(int(True)))}),
IOSJob(XCODE_VERSION, ArchVariant("arm64", "full_jit"), extra_props={
"lite_interpreter": miniutils.quote(str(int(False)))}),
IOSJob(XCODE_VERSION, ArchVariant("arm64", "custom"), extra_props={
"op_list": "mobilenetv2.yaml",
"lite_interpreter": miniutils.quote(str(int(True)))}),
IOSJob(XCODE_VERSION, ArchVariant("x86_64", "coreml"), is_org_member_context=False, extra_props={
"use_coreml": miniutils.quote(str(int(True))),
"lite_interpreter": miniutils.quote(str(int(True)))}),
# IOSJob(XCODE_VERSION, ArchVariant("arm64", "coreml"), extra_props={
# "use_coreml": miniutils.quote(str(int(True))),
# "lite_interpreter": miniutils.quote(str(int(True)))}),
IOSJob(XCODE_VERSION, ArchVariant("arm64", "coreml"), extra_props={
"use_coreml": miniutils.quote(str(int(True))),
"lite_interpreter": miniutils.quote(str(int(True)))}),
]

View File

@ -11,14 +11,10 @@ class MacOsJob:
non_phase_parts = ["pytorch", "macos", self.os_version, "py3"]
extra_name_list = [name for name, exist in self.extra_props.items() if exist]
full_job_name_list = (
non_phase_parts
+ extra_name_list
+ [
"build" if self.is_build else None,
"test" if self.is_test else None,
]
)
full_job_name_list = non_phase_parts + extra_name_list + [
'build' if self.is_build else None,
'test' if self.is_test else None,
]
full_job_name = "_".join(list(filter(None, full_job_name_list)))
@ -45,8 +41,10 @@ WORKFLOW_DATA = [
"10_13",
is_build=True,
is_test=True,
extra_props=tuple({"lite_interpreter": True}.items()),
),
extra_props=tuple({
"lite_interpreter": True
}.items()),
)
]

View File

@ -15,7 +15,7 @@ class IOSNightlyJob:
def get_phase_name(self):
return "upload" if self.is_upload else "build"
def get_common_name_pieces(self, sep):
def get_common_name_pieces(self, with_version_dots):
extra_name_suffix = [self.get_phase_name()] if self.is_upload else []
@ -24,7 +24,7 @@ class IOSNightlyJob:
common_name_pieces = [
"ios",
] + extra_name + [
] + ios_definitions.XCODE_VERSION.render_dots_or_parts(sep) + [
] + ios_definitions.XCODE_VERSION.render_dots_or_parts(with_version_dots) + [
"nightly",
self.variant,
"build",
@ -33,14 +33,14 @@ class IOSNightlyJob:
return common_name_pieces
def gen_job_name(self):
return "_".join(["pytorch"] + self.get_common_name_pieces(None))
return "_".join(["pytorch"] + self.get_common_name_pieces(False))
def gen_tree(self):
build_configs = BUILD_CONFIGS_FULL_JIT if self.is_full_jit else BUILD_CONFIGS
extra_requires = [x.gen_job_name() for x in build_configs] if self.is_upload else []
props_dict = {
"build_environment": "-".join(["libtorch"] + self.get_common_name_pieces(".")),
"build_environment": "-".join(["libtorch"] + self.get_common_name_pieces(True)),
"requires": extra_requires,
"context": "org-member",
"filters": {"branches": {"only": "nightly"}},

View File

@ -12,9 +12,6 @@ PR_BRANCH_LIST = [
RC_PATTERN = r"/v[0-9]+(\.[0-9]+)*-rc[0-9]+/"
MAC_IOS_EXCLUSION_LIST = ["nightly", "postnightly"]
def gen_filter_dict(
branches_list=NON_PR_BRANCH_LIST,
tags_list=None
@ -29,11 +26,3 @@ def gen_filter_dict(
if tags_list is not None:
filter_dict["tags"] = {"only": tags_list}
return filter_dict
def gen_filter_dict_exclude(branches_list=MAC_IOS_EXCLUSION_LIST):
return {
"branches": {
"ignore": branches_list,
},
}

View File

@ -1,6 +1,3 @@
from typing import Optional
class MultiPartVersion:
def __init__(self, parts, prefix=""):
self.parts = parts
@ -16,11 +13,14 @@ class MultiPartVersion:
else:
return [self.prefix]
def render_dots_or_parts(self, sep: Optional[str] = None):
if sep is None:
return self.prefixed_parts()
def render_dots(self):
return ".".join(self.prefixed_parts())
def render_dots_or_parts(self, with_dots):
if with_dots:
return [self.render_dots()]
else:
return [sep.join(self.prefixed_parts())]
return self.prefixed_parts()
class CudaVersion(MultiPartVersion):

807
.circleci/config.yml generated
View File

@ -47,7 +47,7 @@ commands:
- run:
name: "Calculate docker image hash"
command: |
DOCKER_TAG=$(git rev-parse HEAD:.ci/docker)
DOCKER_TAG=$(git rev-parse HEAD:.circleci/docker)
echo "DOCKER_TAG=${DOCKER_TAG}" >> "${BASH_ENV}"
designate_upload_channel:
@ -174,6 +174,46 @@ commands:
echo "This is not a pull request, skipping..."
fi
upload_binary_size_for_android_build:
description: "Upload binary size data for Android build"
parameters:
build_type:
type: string
default: ""
artifacts:
type: string
default: ""
steps:
- run:
name: "Binary Size - Install Dependencies"
no_output_timeout: "5m"
command: |
retry () {
$* || (sleep 1 && $*) || (sleep 2 && $*) || (sleep 4 && $*) || (sleep 8 && $*)
}
retry pip3 install requests
- run:
name: "Binary Size - Untar Artifacts"
no_output_timeout: "5m"
command: |
# The artifact file is created inside docker container, which contains the result binaries.
# Now unpackage it into the project folder. The subsequent script will scan project folder
# to locate result binaries and report their sizes.
# If artifact file is not provided it assumes that the project folder has been mounted in
# the docker during build and already contains the result binaries, so this step can be skipped.
export ARTIFACTS="<< parameters.artifacts >>"
if [ -n "${ARTIFACTS}" ]; then
tar xf "${ARTIFACTS}" -C ~/project
fi
- run:
name: "Binary Size - Upload << parameters.build_type >>"
no_output_timeout: "5m"
command: |
cd ~/project
export ANDROID_BUILD_TYPE="<< parameters.build_type >>"
export COMMIT_TIME=$(git log --max-count=1 --format=%ct || echo 0)
python3 -m tools.stats.upload_binary_size_to_scuba android
##############################################################################
# Binary build (nightlies nightly build) defaults
# The binary builds use the docker executor b/c at time of writing the machine
@ -401,6 +441,245 @@ binary_windows_params: &binary_windows_params
# Job specs
##############################################################################
jobs:
binary_linux_build:
<<: *binary_linux_build_params
steps:
- checkout
- calculate_docker_image_tag
- run:
<<: *binary_checkout
- run:
<<: *binary_populate_env
- run:
name: Build
no_output_timeout: "1h"
command: |
source "/pytorch/.circleci/scripts/binary_linux_build.sh"
# Preserve build log
if [ -f /pytorch/build/.ninja_log ]; then
cp /pytorch/build/.ninja_log /final_pkgs
fi
- run:
name: Output binary sizes
no_output_timeout: "1m"
command: |
ls -lah /final_pkgs
- run:
name: upload build & binary data
no_output_timeout: "5m"
command: |
source /env
cd /pytorch && export COMMIT_TIME=$(git log --max-count=1 --format=%ct || echo 0)
python3 -mpip install requests && \
SCRIBE_GRAPHQL_ACCESS_TOKEN=${SCRIBE_GRAPHQL_ACCESS_TOKEN} \
python3 -m tools.stats.upload_binary_size_to_scuba || exit 0
- persist_to_workspace:
root: /
paths: final_pkgs
- store_artifacts:
path: /final_pkgs
# This should really just be another step of the binary_linux_build job above.
# This isn't possible right now b/c the build job uses the docker executor
# (otherwise they'd be really really slow) but this one uses the macine
# executor (b/c we have to run the docker with --runtime=nvidia and we can't do
# that on the docker executor)
binary_linux_test:
<<: *binary_linux_test_upload_params
machine:
image: ubuntu-2004:202104-01
steps:
# See Note [Workspace for CircleCI scripts] in job-specs-setup.yml
- checkout
- attach_workspace:
at: /home/circleci/project
- setup_linux_system_environment
- setup_ci_environment
- run:
<<: *binary_checkout
- run:
<<: *binary_populate_env
- run:
name: Prepare test code
no_output_timeout: "1h"
command: .circleci/scripts/binary_linux_test.sh
- run:
<<: *binary_run_in_docker
binary_upload:
parameters:
package_type:
type: string
description: "What type of package we are uploading (eg. wheel, libtorch, conda)"
default: "wheel"
upload_subfolder:
type: string
description: "What subfolder to put our package into (eg. cpu, cudaX.Y, etc.)"
default: "cpu"
docker:
- image: continuumio/miniconda3
environment:
- DRY_RUN: disabled
- PACKAGE_TYPE: "<< parameters.package_type >>"
- UPLOAD_SUBFOLDER: "<< parameters.upload_subfolder >>"
steps:
- attach_workspace:
at: /tmp/workspace
- checkout
- designate_upload_channel
- run:
name: Install dependencies
no_output_timeout: "1h"
command: |
conda install -yq anaconda-client
pip install -q awscli
- run:
name: Do upload
no_output_timeout: "1h"
command: |
AWS_ACCESS_KEY_ID="${PYTORCH_BINARY_AWS_ACCESS_KEY_ID}" \
AWS_SECRET_ACCESS_KEY="${PYTORCH_BINARY_AWS_SECRET_ACCESS_KEY}" \
ANACONDA_API_TOKEN="${CONDA_PYTORCHBOT_TOKEN}" \
.circleci/scripts/binary_upload.sh
# Nighlty build smoke tests defaults
# These are the second-round smoke tests. These make sure that the binaries are
# correct from a user perspective, testing that they exist from the cloud are
# are runnable. Note that the pytorch repo is never cloned into these jobs
##############################################################################
smoke_linux_test:
<<: *binary_linux_test_upload_params
machine:
image: ubuntu-2004:202104-01
steps:
- checkout
- calculate_docker_image_tag
- setup_linux_system_environment
- setup_ci_environment
- run:
<<: *binary_checkout
- run:
<<: *binary_populate_env
- run:
name: Test
no_output_timeout: "1h"
command: |
set -ex
cat >/home/circleci/project/ci_test_script.sh \<<EOL
# The following code will be executed inside Docker container
set -eux -o pipefail
/builder/smoke_test.sh
# The above code will be executed inside Docker container
EOL
- run:
<<: *binary_run_in_docker
smoke_mac_test:
<<: *binary_linux_test_upload_params
macos:
xcode: "12.0"
steps:
- checkout
- run:
<<: *binary_checkout
- run:
<<: *binary_populate_env
- brew_update
- run:
<<: *binary_install_miniconda
- run:
name: Build
no_output_timeout: "1h"
command: |
set -ex
source "/Users/distiller/project/env"
export "PATH=$workdir/miniconda/bin:$PATH"
# TODO unbuffer and ts this, but it breaks cause miniconda overwrites
# tclsh. But unbuffer and ts aren't that important so they're just
# disabled for now
./builder/smoke_test.sh
binary_mac_build:
<<: *binary_mac_params
macos:
xcode: "12.0"
resource_class: "large"
steps:
# See Note [Workspace for CircleCI scripts] in job-specs-setup.yml
- checkout
- run:
<<: *binary_checkout
- run:
<<: *binary_populate_env
- brew_update
- run:
<<: *binary_install_miniconda
- run:
name: Build
no_output_timeout: "90m"
command: |
# Do not set -u here; there is some problem with CircleCI
# variable expansion with PROMPT_COMMAND
set -ex -o pipefail
script="/Users/distiller/project/pytorch/.circleci/scripts/binary_macos_build.sh"
cat "$script"
source "$script"
- run:
name: Test
no_output_timeout: "1h"
command: |
# Do not set -u here; there is some problem with CircleCI
# variable expansion with PROMPT_COMMAND
set -ex -o pipefail
script="/Users/distiller/project/pytorch/.circleci/scripts/binary_macos_test.sh"
cat "$script"
source "$script"
- persist_to_workspace:
root: /Users/distiller/project
paths: final_pkgs
- store_artifacts:
path: /Users/distiller/project/final_pkgs
binary_macos_arm64_build:
<<: *binary_mac_params
macos:
xcode: "12.3.0"
steps:
# See Note [Workspace for CircleCI scripts] in job-specs-setup.yml
- checkout
- run:
<<: *binary_checkout
- run:
<<: *binary_populate_env
- brew_update
- run:
<<: *binary_install_miniconda
- run:
name: Build
no_output_timeout: "90m"
command: |
# Do not set -u here; there is some problem with CircleCI
# variable expansion with PROMPT_COMMAND
set -ex -o pipefail
export CROSS_COMPILE_ARM64=1
script="/Users/distiller/project/pytorch/.circleci/scripts/binary_macos_build.sh"
cat "$script"
source "$script"
- persist_to_workspace:
root: /Users/distiller/project
paths: final_pkgs
- store_artifacts:
path: /Users/distiller/project/final_pkgs
binary_ios_build:
<<: *pytorch_ios_params
macos:
@ -445,6 +724,90 @@ jobs:
cat "$script"
source "$script"
binary_windows_build:
<<: *binary_windows_params
parameters:
build_environment:
type: string
default: ""
executor:
type: string
default: "windows-xlarge-cpu-with-nvidia-cuda"
executor: <<parameters.executor>>
steps:
# See Note [Workspace for CircleCI scripts] in job-specs-setup.yml
- checkout
- run:
<<: *binary_checkout
- run:
<<: *binary_populate_env
- run:
name: Build
no_output_timeout: "1h"
command: |
set -eux -o pipefail
script="/c/w/p/.circleci/scripts/binary_windows_build.sh"
cat "$script"
source "$script"
- persist_to_workspace:
root: "C:/w"
paths: final_pkgs
- store_artifacts:
path: C:/w/final_pkgs
binary_windows_test:
<<: *binary_windows_params
parameters:
build_environment:
type: string
default: ""
executor:
type: string
default: "windows-medium-cpu-with-nvidia-cuda"
executor: <<parameters.executor>>
steps:
- checkout
- attach_workspace:
at: c:/users/circleci/project
- run:
<<: *binary_checkout
- run:
<<: *binary_populate_env
- run:
name: Test
no_output_timeout: "1h"
command: |
set -eux -o pipefail
script="/c/w/p/.circleci/scripts/binary_windows_test.sh"
cat "$script"
source "$script"
smoke_windows_test:
<<: *binary_windows_params
parameters:
build_environment:
type: string
default: ""
executor:
type: string
default: "windows-medium-cpu-with-nvidia-cuda"
executor: <<parameters.executor>>
steps:
- checkout
- run:
<<: *binary_checkout
- run:
<<: *binary_populate_env
- run:
name: Test
no_output_timeout: "1h"
command: |
set -eux -o pipefail
export TEST_NIGHTLY_PACKAGE=1
script="/c/w/p/.circleci/scripts/binary_windows_test.sh"
cat "$script"
source "$script"
anaconda_prune:
parameters:
packages:
@ -499,6 +862,95 @@ jobs:
pushd /tmp/workspace
git push -u origin "<< parameters.branch >>"
pytorch_python_doc_build:
environment:
BUILD_ENVIRONMENT: pytorch-python-doc-push
DOCKER_IMAGE: "308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-xenial-py3.7-gcc5.4"
resource_class: large
machine:
image: ubuntu-2004:202104-01
steps:
- checkout
- calculate_docker_image_tag
- setup_linux_system_environment
- setup_ci_environment
- run:
name: Doc Build and Push
no_output_timeout: "1h"
command: |
set -ex
export COMMIT_DOCKER_IMAGE=${DOCKER_IMAGE}:build-${DOCKER_TAG}-${CIRCLE_SHA1}
echo "DOCKER_IMAGE: "${COMMIT_DOCKER_IMAGE}
# turn v1.12.0rc3 into 1.12
tag=$(echo $CIRCLE_TAG | sed -e 's/v*\([0-9]*\.[0-9]*\).*/\1/')
target=${tag:-main}
echo "building for ${target}"
time docker pull ${COMMIT_DOCKER_IMAGE} >/dev/null
export id=$(docker run --env-file "${BASH_ENV}" --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -t -d -w /var/lib/jenkins ${COMMIT_DOCKER_IMAGE})
export COMMAND='((echo "sudo chown -R jenkins workspace && cd workspace && '"export CIRCLE_SHA1='$CIRCLE_SHA1'"' && . ./.circleci/scripts/python_doc_push_script.sh docs/'$target' '$target' site") | docker exec -u jenkins -i "$id" bash) 2>&1'
echo ${COMMAND} > ./command.sh && unbuffer bash ./command.sh | ts
mkdir -p ~/workspace/build_artifacts
docker cp $id:/var/lib/jenkins/workspace/pytorch.github.io/docs/main ~/workspace/build_artifacts
docker cp $id:/var/lib/jenkins/workspace/pytorch.github.io /tmp/workspace
# Save the docs build so we can debug any problems
export DEBUG_COMMIT_DOCKER_IMAGE=${COMMIT_DOCKER_IMAGE}-debug
docker commit "$id" ${DEBUG_COMMIT_DOCKER_IMAGE}
time docker push ${DEBUG_COMMIT_DOCKER_IMAGE}
- persist_to_workspace:
root: /tmp/workspace
paths:
- .
- store_artifacts:
path: ~/workspace/build_artifacts/main
destination: docs
pytorch_cpp_doc_build:
environment:
BUILD_ENVIRONMENT: pytorch-cpp-doc-push
DOCKER_IMAGE: "308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-xenial-py3.7-gcc5.4"
resource_class: large
machine:
image: ubuntu-2004:202104-01
steps:
- checkout
- calculate_docker_image_tag
- setup_linux_system_environment
- setup_ci_environment
- run:
name: Doc Build and Push
no_output_timeout: "1h"
command: |
set -ex
export COMMIT_DOCKER_IMAGE=${DOCKER_IMAGE}:build-${DOCKER_TAG}-${CIRCLE_SHA1}
echo "DOCKER_IMAGE: "${COMMIT_DOCKER_IMAGE}
# turn v1.12.0rc3 into 1.12
tag=$(echo $CIRCLE_TAG | sed -e 's/v*\([0-9]*\.[0-9]*\).*/\1/')
target=${tag:-main}
echo "building for ${target}"
time docker pull ${COMMIT_DOCKER_IMAGE} >/dev/null
export id=$(docker run --env-file "${BASH_ENV}" --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -t -d -w /var/lib/jenkins ${COMMIT_DOCKER_IMAGE})
export COMMAND='((echo "sudo chown -R jenkins workspace && cd workspace && '"export CIRCLE_SHA1='$CIRCLE_SHA1'"' && . ./.circleci/scripts/cpp_doc_push_script.sh docs/"$target" main") | docker exec -u jenkins -i "$id" bash) 2>&1'
echo ${COMMAND} > ./command.sh && unbuffer bash ./command.sh | ts
mkdir -p ~/workspace/build_artifacts
docker cp $id:/var/lib/jenkins/workspace/cppdocs/ /tmp/workspace
# Save the docs build so we can debug any problems
export DEBUG_COMMIT_DOCKER_IMAGE=${COMMIT_DOCKER_IMAGE}-debug
docker commit "$id" ${DEBUG_COMMIT_DOCKER_IMAGE}
time docker push ${DEBUG_COMMIT_DOCKER_IMAGE}
- persist_to_workspace:
root: /tmp/workspace
paths:
- .
pytorch_macos_10_15_py3_build:
environment:
BUILD_ENVIRONMENT: pytorch-macos-10.15-py3-arm64-build
@ -512,6 +964,7 @@ jobs:
no_output_timeout: "1h"
command: |
set -e
export IN_CI=1
export CROSS_COMPILE_ARM64=1
export JOB_BASE_NAME=$CIRCLE_JOB
@ -526,8 +979,8 @@ jobs:
export AWS_SECRET_ACCESS_KEY=${CIRCLECI_AWS_SECRET_KEY_FOR_SCCACHE_S3_BUCKET_V4}
set -x
chmod a+x .ci/pytorch/macos-build.sh
unbuffer .ci/pytorch/macos-build.sh 2>&1 | ts
chmod a+x .jenkins/pytorch/macos-build.sh
unbuffer .jenkins/pytorch/macos-build.sh 2>&1 | ts
- persist_to_workspace:
root: /Users/distiller/workspace/
@ -549,6 +1002,7 @@ jobs:
no_output_timeout: "1h"
command: |
set -e
export IN_CI=1
export JOB_BASE_NAME=$CIRCLE_JOB
# Install sccache
@ -562,206 +1016,14 @@ jobs:
export AWS_SECRET_ACCESS_KEY=${CIRCLECI_AWS_SECRET_KEY_FOR_SCCACHE_S3_BUCKET_V4}
set -x
chmod a+x .ci/pytorch/macos-build.sh
unbuffer .ci/pytorch/macos-build.sh 2>&1 | ts
chmod a+x .jenkins/pytorch/macos-build.sh
unbuffer .jenkins/pytorch/macos-build.sh 2>&1 | ts
- persist_to_workspace:
root: /Users/distiller/workspace/
paths:
- miniconda3
mac_build:
parameters:
build-environment:
type: string
description: Top-level label for what's being built/tested.
xcode-version:
type: string
default: "13.3.1"
description: What xcode version to build with.
build-generates-artifacts:
type: boolean
default: true
description: if the build generates build artifacts
python-version:
type: string
default: "3.8"
macos:
xcode: << parameters.xcode-version >>
resource_class: medium
environment:
BUILD_ENVIRONMENT: << parameters.build-environment >>
AWS_REGION: us-east-1
steps:
- checkout
- run_brew_for_macos_build
- run:
name: Install sccache
command: |
sudo curl --retry 3 https://s3.amazonaws.com/ossci-macos/sccache_v2.15 --output /usr/local/bin/sccache
sudo chmod +x /usr/local/bin/sccache
echo "export SCCACHE_BUCKET=ossci-compiler-cache-circleci-v2" >> "${BASH_ENV}"
echo "export SCCACHE_S3_KEY_PREFIX=${GITHUB_WORKFLOW}" >> "${BASH_ENV}"
set +x
echo "export AWS_ACCESS_KEY_ID=${CIRCLECI_AWS_ACCESS_KEY_FOR_SCCACHE_S3_BUCKET_V4}" >> "${BASH_ENV}"
echo "export AWS_SECRET_ACCESS_KEY=${CIRCLECI_AWS_SECRET_KEY_FOR_SCCACHE_S3_BUCKET_V4}" >> "${BASH_ENV}"
set -x
- run:
name: Get workflow job id
command: |
echo "export OUR_GITHUB_JOB_ID=${CIRCLE_WORKFLOW_JOB_ID}" >> "${BASH_ENV}"
- run:
name: Build
command: |
set -x
git submodule sync
git submodule update --init --recursive --depth 1 --jobs 0
export PATH="/usr/local/bin:$PATH"
export WORKSPACE_DIR="${HOME}/workspace"
mkdir -p "${WORKSPACE_DIR}"
MINICONDA_URL="https://repo.anaconda.com/miniconda/Miniconda3-py38_4.12.0-MacOSX-x86_64.sh"
if [ << parameters.python-version >> == 3.9.12 ]; then
MINICONDA_URL="https://repo.anaconda.com/miniconda/Miniconda3-py39_4.12.0-MacOSX-x86_64.sh"
fi
# If a local installation of conda doesn't exist, we download and install conda
if [ ! -d "${WORKSPACE_DIR}/miniconda3" ]; then
mkdir -p "${WORKSPACE_DIR}"
curl --retry 3 ${MINICONDA_URL} -o "${WORKSPACE_DIR}"/miniconda3.sh
bash "${WORKSPACE_DIR}"/miniconda3.sh -b -p "${WORKSPACE_DIR}"/miniconda3
fi
export PATH="${WORKSPACE_DIR}/miniconda3/bin:$PATH"
# shellcheck disable=SC1091
source "${WORKSPACE_DIR}"/miniconda3/bin/activate
brew link --force libomp
echo "export CMAKE_PREFIX_PATH=${CONDA_PREFIX:-"$(dirname "$(which conda)")/../"}" >> "${BASH_ENV}"
.ci/pytorch/macos-build.sh
- when:
condition: << parameters.build-generates-artifacts >>
steps:
- run:
name: Archive artifacts into zip
command: |
zip -1 -r artifacts.zip dist/ build/.ninja_log build/compile_commands.json .pytorch-test-times.json
cp artifacts.zip /Users/distiller/workspace
- persist_to_workspace:
root: /Users/distiller/workspace/
paths:
- miniconda3
- artifacts.zip
- store_artifacts:
path: /Users/distiller/project/artifacts.zip
mac_test:
parameters:
build-environment:
type: string
shard-number:
type: string
num-test-shards:
type: string
xcode-version:
type: string
test-config:
type: string
default: 'default'
macos:
xcode: << parameters.xcode-version >>
environment:
GIT_DEFAULT_BRANCH: 'master'
BUILD_ENVIRONMENT: << parameters.build-environment >>
TEST_CONFIG: << parameters.test-config >>
SHARD_NUMBER: << parameters.shard-number >>
NUM_TEST_SHARDS: << parameters.num-test-shards >>
PYTORCH_RETRY_TEST_CASES: 1
PYTORCH_OVERRIDE_FLAKY_SIGNAL: 1
steps:
- checkout
- attach_workspace:
at: ~/workspace
- run_brew_for_macos_build
- run:
name: Test
no_output_timeout: "2h"
command: |
set -x
git submodule sync --recursive
git submodule update --init --recursive
mv ~/workspace/artifacts.zip .
unzip artifacts.zip
export IN_CI=1
COMMIT_MESSAGES=$(git cherry -v "origin/${GIT_DEFAULT_BRANCH:-master}")
export PATH="/usr/local/bin:$PATH"
export WORKSPACE_DIR="${HOME}/workspace"
mkdir -p "${WORKSPACE_DIR}"
export PATH="${WORKSPACE_DIR}/miniconda3/bin:$PATH"
source "${WORKSPACE_DIR}"/miniconda3/bin/activate
# sanitize the input commit message and PR body here:
# trim all new lines from commit messages to avoid issues with batch environment
# variable copying. see https://github.com/pytorch/pytorch/pull/80043#issuecomment-1167796028
COMMIT_MESSAGES="${COMMIT_MESSAGES//[$'\n\r']}"
# then trim all special characters like single and double quotes to avoid unescaped inputs to
# wreak havoc internally
export COMMIT_MESSAGES="${COMMIT_MESSAGES//[\'\"]}"
python3 -mpip install dist/*.whl
.ci/pytorch/macos-test.sh
- run:
name: Copy files for uploading test stats
command: |
# copy into a parent folder test-reports because we can't use CIRCLEI_BUILD_NUM in path when persisting to workspace
mkdir -p test-reports/test-reports_${CIRCLE_BUILD_NUM}/test/test-reports
cp -r test/test-reports test-reports/test-reports_${CIRCLE_BUILD_NUM}/test/test-reports
- store_test_results:
path: test/test-reports
- persist_to_workspace:
root: /Users/distiller/project/
paths:
- test-reports
upload_test_stats:
machine: # executor type
image: ubuntu-2004:202010-01 # # recommended linux image - includes Ubuntu 20.04, docker 19.03.13, docker-compose 1.27.4
steps:
- checkout
- attach_workspace:
at: ~/workspace
- run:
name: upload
command: |
set -ex
if [ -z ${AWS_ACCESS_KEY_FOR_OSSCI_ARTIFACT_UPLOAD} ]; then
echo "No credentials found, cannot upload test stats (are you on a fork?)"
exit 0
fi
cp -r ~/workspace/test-reports/* ~/project
pip3 install requests==2.26 rockset==1.0.3 boto3==1.19.12
export AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_FOR_OSSCI_ARTIFACT_UPLOAD}
export AWS_SECRET_ACCESS_KEY=${AWS_SECRET_KEY_FOR_OSSCI_ARTIFACT_UPLOAD}
# i dont know how to get the run attempt number for reruns so default to 1
python3 -m tools.stats.upload_test_stats --workflow-run-id "${CIRCLE_WORKFLOW_JOB_ID}" --workflow-run-attempt 1 --head-branch << pipeline.git.branch >> --circleci
pytorch_macos_10_13_py3_test:
environment:
BUILD_ENVIRONMENT: pytorch-macos-10.13-py3-test
@ -777,10 +1039,27 @@ jobs:
no_output_timeout: "1h"
command: |
set -e
export IN_CI=1
export JOB_BASE_NAME=$CIRCLE_JOB
chmod a+x .ci/pytorch/macos-test.sh
unbuffer .ci/pytorch/macos-test.sh 2>&1 | ts
chmod a+x .jenkins/pytorch/macos-test.sh
unbuffer .jenkins/pytorch/macos-test.sh 2>&1 | ts
- run:
name: Report results
no_output_timeout: "5m"
command: |
set -ex
source /Users/distiller/workspace/miniconda3/bin/activate
python3 -m pip install boto3==1.19.12
export IN_CI=1
export JOB_BASE_NAME=$CIRCLE_JOB
# Using the same IAM user to write stats to our OSS bucket
export AWS_ACCESS_KEY_ID=${CIRCLECI_AWS_ACCESS_KEY_FOR_SCCACHE_S3_BUCKET_V4}
export AWS_SECRET_ACCESS_KEY=${CIRCLECI_AWS_SECRET_KEY_FOR_SCCACHE_S3_BUCKET_V4}
python -m tools.stats.print_test_stats --upload-to-s3 --compare-with-s3 test
when: always
- store_test_results:
path: test/test-reports
@ -799,10 +1078,11 @@ jobs:
no_output_timeout: "1h"
command: |
set -e
export IN_CI=1
export BUILD_LITE_INTERPRETER=1
export JOB_BASE_NAME=$CIRCLE_JOB
chmod a+x ${HOME}/project/.ci/pytorch/macos-lite-interpreter-build-test.sh
unbuffer ${HOME}/project/.ci/pytorch/macos-lite-interpreter-build-test.sh 2>&1 | ts
chmod a+x ${HOME}/project/.jenkins/pytorch/macos-lite-interpreter-build-test.sh
unbuffer ${HOME}/project/.jenkins/pytorch/macos-lite-interpreter-build-test.sh 2>&1 | ts
- store_test_results:
path: test/test-reports
@ -888,6 +1168,9 @@ jobs:
output_image=$docker_image_libtorch_android_x86_32-gradle
docker commit "$id_x86_32" ${output_image}
time docker push ${output_image}
- upload_binary_size_for_android_build:
build_type: prebuilt
artifacts: /home/circleci/workspace/build_android_artifacts/artifacts.tgz
- store_artifacts:
path: ~/workspace/build_android_artifacts/artifacts.tgz
destination: artifacts.tgz
@ -963,6 +1246,9 @@ jobs:
output_image=${docker_image_libtorch_android_x86_32}-gradle
docker commit "$id" ${output_image}
time docker push ${output_image}
- upload_binary_size_for_android_build:
build_type: prebuilt-single
artifacts: /home/circleci/workspace/build_android_x86_32_artifacts/artifacts.tgz
- store_artifacts:
path: ~/workspace/build_android_x86_32_artifacts/artifacts.tgz
destination: artifacts.tgz
@ -972,43 +1258,10 @@ jobs:
macos:
xcode: "12.5.1"
steps:
- run:
name: checkout with retry
command: |
checkout() {
set -ex
# Workaround old docker images with incorrect $HOME
# check https://github.com/docker/docker/issues/2968 for details
if [ "${HOME}" = "/" ]
then
export HOME=$(getent passwd $(id -un) | cut -d: -f6)
fi
mkdir -p ~/.ssh
echo 'github.com ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAq2A7hRGmdnm9tUDbO9IDSwBK6TbQa+PXYPCPy6rbTrTtw7PHkccKrpp0yVhp5HdEIcKr6pLlVDBfOLX9QUsyCOV0wzfjIJNlGEYsdlLJizHhbn2mUjvSAHQqZETYP81eFzLQNnPHt4EVVUh7VfDESU84KezmD5QlWpXLmvU31/yMf+Se8xhHTvKSCZIFImWwoG6mbUoWf9nzpIoaSjB+weqqUUmpaaasXVal72J+UX2B+2RPW3RcT0eOzQgqlJL3RKrTJvdsjE3JEAvGq3lGHSZXy28G3skua2SmVi/w4yCE6gbODqnTWlg7+wC604ydGXA8VJiS5ap43JXiUFFAaQ==
' >> ~/.ssh/known_hosts
# use git+ssh instead of https
git config --global url."ssh://git@github.com".insteadOf "https://github.com" || true
git config --global gc.auto 0 || true
echo 'Cloning git repository'
mkdir -p '/Users/distiller/project'
cd '/Users/distiller/project'
git clone "$CIRCLE_REPOSITORY_URL" .
echo 'Checking out branch'
git checkout --force -B "$CIRCLE_BRANCH" "$CIRCLE_SHA1"
git --no-pager log --no-color -n 1 --format='HEAD is now at %h %s'
}
retry () {
$* || (sleep 1 && $*) || (sleep 2 && $*) || (sleep 4 && $*) || (sleep 8 && $*)
}
retry checkout
- checkout
- run_brew_for_ios_build
- run:
name: Setup Fastlane
name: Run Fastlane
no_output_timeout: "1h"
command: |
set -e
@ -1016,17 +1269,32 @@ jobs:
cd ${PROJ_ROOT}/ios/TestApp
# install fastlane
sudo gem install bundler && bundle install
# install certificates
echo ${IOS_CERT_KEY_2022} >> cert.txt
base64 --decode cert.txt -o Certificates.p12
rm cert.txt
bundle exec fastlane install_root_cert
bundle exec fastlane install_dev_cert
# install the provisioning profile
PROFILE=PyTorch_CI_2022.mobileprovision
PROVISIONING_PROFILES=~/Library/MobileDevice/Provisioning\ Profiles
mkdir -pv "${PROVISIONING_PROFILES}"
cd "${PROVISIONING_PROFILES}"
echo ${IOS_SIGN_KEY_2022} >> cert.txt
base64 --decode cert.txt -o ${PROFILE}
rm cert.txt
- run:
name: Build
no_output_timeout: "1h"
command: |
set -e
export IN_CI=1
WORKSPACE=/Users/distiller/workspace
PROJ_ROOT=/Users/distiller/project
export TCLLIBPATH="/usr/local/lib"
# Install conda
curl --retry 3 -o ~/conda.sh https://repo.anaconda.com/miniconda/Miniconda3-py39_4.12.0-MacOSX-x86_64.sh
curl --retry 3 -o ~/conda.sh https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh
chmod +x ~/conda.sh
/bin/bash ~/conda.sh -b -p ~/anaconda
export PATH="~/anaconda/bin:${PATH}"
@ -1037,7 +1305,7 @@ jobs:
$* || (sleep 1 && $*) || (sleep 2 && $*) || (sleep 4 && $*) || (sleep 8 && $*)
}
retry conda install numpy ninja pyyaml mkl mkl-include setuptools cmake requests typing-extensions --yes
retry conda install numpy ninja pyyaml mkl mkl-include setuptools cmake cffi requests typing_extensions --yes
# sync submodules
cd ${PROJ_ROOT}
@ -1073,12 +1341,18 @@ jobs:
command: |
set -e
PROJ_ROOT=/Users/distiller/project
PROFILE=PyTorch_CI_2022
# run the ruby build script
if ! [ -x "$(command -v xcodebuild)" ]; then
echo 'Error: xcodebuild is not installed.'
exit 1
fi
ruby ${PROJ_ROOT}/scripts/xcode_build.rb -i ${PROJ_ROOT}/build_ios/install -x ${PROJ_ROOT}/ios/TestApp/TestApp.xcodeproj -p ${IOS_PLATFORM}
echo ${IOS_DEV_TEAM_ID}
if [ ${IOS_PLATFORM} != "SIMULATOR" ]; then
ruby ${PROJ_ROOT}/scripts/xcode_build.rb -i ${PROJ_ROOT}/build_ios/install -x ${PROJ_ROOT}/ios/TestApp/TestApp.xcodeproj -p ${IOS_PLATFORM} -c ${PROFILE} -t ${IOS_DEV_TEAM_ID}
else
ruby ${PROJ_ROOT}/scripts/xcode_build.rb -i ${PROJ_ROOT}/build_ios/install -x ${PROJ_ROOT}/ios/TestApp/TestApp.xcodeproj -p ${IOS_PLATFORM}
fi
if ! [ "$?" -eq "0" ]; then
echo 'xcodebuild failed!'
exit 1
@ -1101,13 +1375,12 @@ jobs:
cd ${PROJ_ROOT}/ios/TestApp/benchmark
mkdir -p ../models
if [ ${USE_COREML_DELEGATE} == 1 ]; then
pip install coremltools==5.0b5 protobuf==3.20.1
pip install coremltools==5.0b5
pip install six
python coreml_backend.py
else
cd "${PROJ_ROOT}"
python test/mobile/model_test/gen_test_model.py ios-test
python trace_model.py
fi
cd "${PROJ_ROOT}/ios/TestApp/benchmark"
if [ ${BUILD_LITE_INTERPRETER} == 1 ]; then
echo "Setting up the TestApp for LiteInterpreter"
ruby setup.rb --lite 1
@ -1115,10 +1388,10 @@ jobs:
echo "Setting up the TestApp for Full JIT"
ruby setup.rb
fi
cd "${PROJ_ROOT}/ios/TestApp"
# instruments -s -devices
if [ "${BUILD_LITE_INTERPRETER}" == 1 ]; then
if [ "${USE_COREML_DELEGATE}" == 1 ]; then
cd ${PROJ_ROOT}/ios/TestApp
instruments -s -devices
if [ ${BUILD_LITE_INTERPRETER} == 1 ]; then
if [ ${USE_COREML_DELEGATE} == 1 ]; then
fastlane scan --only_testing TestAppTests/TestAppTests/testCoreML
else
fastlane scan --only_testing TestAppTests/TestAppTests/testLiteInterpreter
@ -1151,7 +1424,7 @@ jobs:
docker cp /home/circleci/project/. $id:/var/lib/jenkins/workspace
export COMMAND='((echo "sudo chown -R jenkins workspace && cd workspace && .ci/pytorch/build.sh") | docker exec -u jenkins -i "$id" bash) 2>&1'
export COMMAND='((echo "sudo chown -R jenkins workspace && cd workspace && .jenkins/pytorch/build.sh") | docker exec -u jenkins -i "$id" bash) 2>&1'
echo ${COMMAND} > ./command.sh && unbuffer bash ./command.sh | ts
@ -1197,9 +1470,9 @@ jobs:
trap "retrieve_test_reports" ERR
if [[ ${BUILD_ENVIRONMENT} == *"multigpu"* ]]; then
export COMMAND='((echo "sudo chown -R jenkins workspace && cd workspace && .ci/pytorch/multigpu-test.sh") | docker exec -u jenkins -i "$id" bash) 2>&1'
export COMMAND='((echo "sudo chown -R jenkins workspace && cd workspace && .jenkins/pytorch/multigpu-test.sh") | docker exec -u jenkins -i "$id" bash) 2>&1'
else
export COMMAND='((echo "sudo chown -R jenkins workspace && cd workspace && .ci/pytorch/test.sh") | docker exec -u jenkins -i "$id" bash) 2>&1'
export COMMAND='((echo "sudo chown -R jenkins workspace && cd workspace && .jenkins/pytorch/test.sh") | docker exec -u jenkins -i "$id" bash) 2>&1'
fi
echo ${COMMAND} > ./command.sh && unbuffer bash ./command.sh | ts
@ -1221,6 +1494,30 @@ jobs:
python3 -m pip install requests
python3 ./.circleci/scripts/trigger_azure_pipeline.py
pytorch_doc_test:
environment:
BUILD_ENVIRONMENT: pytorch-doc-test
DOCKER_IMAGE: "308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-xenial-py3.7-gcc5.4"
resource_class: medium
machine:
image: ubuntu-2004:202104-01
steps:
- checkout
- calculate_docker_image_tag
- setup_linux_system_environment
- setup_ci_environment
- run:
name: Doc test
no_output_timeout: "30m"
command: |
set -ex
export COMMIT_DOCKER_IMAGE=${DOCKER_IMAGE}:build-${DOCKER_TAG}-${CIRCLE_SHA1}
echo "DOCKER_IMAGE: "${COMMIT_DOCKER_IMAGE}
time docker pull ${COMMIT_DOCKER_IMAGE} >/dev/null
export id=$(docker run --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -t -d -w /var/lib/jenkins ${COMMIT_DOCKER_IMAGE})
export COMMAND='((echo "sudo chown -R jenkins workspace && cd workspace && . ./.jenkins/pytorch/docs-test.sh") | docker exec -u jenkins -i "$id" bash) 2>&1'
echo ${COMMAND} > ./command.sh && unbuffer bash ./command.sh | ts
# update_s3_htmls job
# These jobs create html files for every cpu/cu## folder in s3. The html
# files just store the names of all the files in that folder (which are
@ -1322,12 +1619,12 @@ jobs:
exit 0
fi
# Covers the case where a previous tag doesn't exist for the tree
# this is only really applicable on trees that don't have `.ci/docker` at its merge base, i.e. nightly
if ! git rev-parse "$(git merge-base HEAD << pipeline.git.base_revision >>):.ci/docker"; then
echo "Directory '.ci/docker' not found in tree << pipeline.git.base_revision >>, you should probably rebase onto a more recent commit"
# this is only really applicable on trees that don't have `.circleci/docker` at its merge base, i.e. nightly
if ! git rev-parse "$(git merge-base HEAD << pipeline.git.base_revision >>):.circleci/docker"; then
echo "Directory '.circleci/docker' not found in tree << pipeline.git.base_revision >>, you should probably rebase onto a more recent commit"
exit 1
fi
PREVIOUS_DOCKER_TAG=$(git rev-parse "$(git merge-base HEAD << pipeline.git.base_revision >>):ci/docker")
PREVIOUS_DOCKER_TAG=$(git rev-parse "$(git merge-base HEAD << pipeline.git.base_revision >>):.circleci/docker")
# If no image exists but the hash is the same as the previous hash then we should error out here
if [[ "${PREVIOUS_DOCKER_TAG}" = "${DOCKER_TAG}" ]]; then
echo "ERROR: Something has gone wrong and the previous image isn't available for the merge-base of your branch"
@ -1342,7 +1639,7 @@ jobs:
export AWS_ACCESS_KEY_ID=${CIRCLECI_AWS_ACCESS_KEY_FOR_DOCKER_BUILDER_V1}
export AWS_SECRET_ACCESS_KEY=${CIRCLECI_AWS_SECRET_KEY_FOR_DOCKER_BUILDER_V1}
set -x
cd .ci/docker && ./build_docker.sh
cd .circleci/docker && ./build_docker.sh
##############################################################################
# Workflows
##############################################################################

View File

@ -53,7 +53,7 @@ dependencies {
implementation 'androidx.appcompat:appcompat:1.0.0'
implementation 'com.facebook.fbjni:fbjni-java-only:0.2.2'
implementation 'com.google.code.findbugs:jsr305:3.0.1'
implementation 'com.facebook.soloader:nativeloader:0.10.4'
implementation 'com.facebook.soloader:nativeloader:0.10.1'
implementation 'junit:junit:' + rootProject.junitVersion
implementation 'androidx.test:core:' + rootProject.coreVersion

View File

@ -33,7 +33,7 @@ function extract_all_from_image_name() {
if [ "x${name}" = xpy ]; then
vername=ANACONDA_PYTHON_VERSION
fi
# skip non-conforming fields such as "pytorch", "linux" or "bionic" without version string
# skip non-conforming fields such as "pytorch", "linux" or "xenial" without version string
if [ -n "${name}" ]; then
extract_version_from_image_name "${name}" "${vername}"
fi
@ -46,12 +46,14 @@ if [[ "$image" == *xla* ]]; then
exit 0
fi
if [[ "$image" == *-bionic* ]]; then
if [[ "$image" == *-xenial* ]]; then
UBUNTU_VERSION=16.04
elif [[ "$image" == *-artful* ]]; then
UBUNTU_VERSION=17.10
elif [[ "$image" == *-bionic* ]]; then
UBUNTU_VERSION=18.04
elif [[ "$image" == *-focal* ]]; then
UBUNTU_VERSION=20.04
elif [[ "$image" == *-jammy* ]]; then
UBUNTU_VERSION=22.04
elif [[ "$image" == *ubuntu* ]]; then
extract_version_from_image_name ubuntu UBUNTU_VERSION
elif [[ "$image" == *centos* ]]; then
@ -68,84 +70,113 @@ else
fi
DOCKERFILE="${OS}/Dockerfile"
# When using ubuntu - 22.04, start from Ubuntu docker image, instead of nvidia/cuda docker image.
if [[ "$image" == *cuda* && "$UBUNTU_VERSION" != "22.04" ]]; then
if [[ "$image" == *cuda* ]]; then
DOCKERFILE="${OS}-cuda/Dockerfile"
elif [[ "$image" == *rocm* ]]; then
DOCKERFILE="${OS}-rocm/Dockerfile"
elif [[ "$image" == *linter* ]]; then
# Use a separate Dockerfile for linter to keep a small image size
DOCKERFILE="linter/Dockerfile"
fi
# CMake 3.18 is needed to support CUDA17 language variant
CMAKE_VERSION=3.18.5
if [[ "$image" == *xenial* ]] || [[ "$image" == *bionic* ]]; then
CMAKE_VERSION=3.13.5
fi
_UCX_COMMIT=31e74cac7bee0ef66bef2af72e7d86d9c282e5ab
_UCC_COMMIT=1c7a7127186e7836f73aafbd7697bbc274a77eee
TRAVIS_DL_URL_PREFIX="https://s3.amazonaws.com/travis-python-archives/binaries/ubuntu/14.04/x86_64"
# It's annoying to rename jobs every time you want to rewrite a
# configuration, so we hardcode everything here rather than do it
# from scratch
case "$image" in
pytorch-linux-bionic-cuda11.6-cudnn8-py3-gcc7)
CUDA_VERSION=11.6.2
CUDNN_VERSION=8
ANACONDA_PYTHON_VERSION=3.10
GCC_VERSION=7
PROTOBUF=yes
DB=yes
VISION=yes
KATEX=yes
UCX_COMMIT=${_UCX_COMMIT}
UCC_COMMIT=${_UCC_COMMIT}
CONDA_CMAKE=yes
;;
pytorch-linux-bionic-cuda11.7-cudnn8-py3-gcc7)
CUDA_VERSION=11.7.0
CUDNN_VERSION=8
ANACONDA_PYTHON_VERSION=3.10
GCC_VERSION=7
PROTOBUF=yes
DB=yes
VISION=yes
KATEX=yes
UCX_COMMIT=${_UCX_COMMIT}
UCC_COMMIT=${_UCC_COMMIT}
CONDA_CMAKE=yes
;;
pytorch-linux-bionic-cuda11.8-cudnn8-py3-gcc7)
CUDA_VERSION=11.8.0
CUDNN_VERSION=8
ANACONDA_PYTHON_VERSION=3.10
GCC_VERSION=7
PROTOBUF=yes
DB=yes
VISION=yes
KATEX=yes
UCX_COMMIT=${_UCX_COMMIT}
UCC_COMMIT=${_UCC_COMMIT}
CONDA_CMAKE=yes
;;
pytorch-linux-focal-py3-clang7-asan)
ANACONDA_PYTHON_VERSION=3.9
CLANG_VERSION=7
PROTOBUF=yes
DB=yes
VISION=yes
CONDA_CMAKE=yes
;;
pytorch-linux-focal-py3-clang10-onnx)
pytorch-linux-xenial-py3.8)
ANACONDA_PYTHON_VERSION=3.8
CLANG_VERSION=10
GCC_VERSION=7
# Do not install PROTOBUF, DB, and VISION as a test
;;
pytorch-linux-xenial-py3.7-gcc5.4)
ANACONDA_PYTHON_VERSION=3.7
GCC_VERSION=5
PROTOBUF=yes
DB=yes
VISION=yes
CONDA_CMAKE=yes
KATEX=yes
;;
pytorch-linux-focal-py3-clang7-android-ndk-r19c)
pytorch-linux-xenial-py3.7-gcc7.2)
ANACONDA_PYTHON_VERSION=3.7
GCC_VERSION=7
# Do not install PROTOBUF, DB, and VISION as a test
;;
pytorch-linux-xenial-py3.7-gcc7)
ANACONDA_PYTHON_VERSION=3.7
GCC_VERSION=7
PROTOBUF=yes
DB=yes
VISION=yes
;;
pytorch-linux-xenial-cuda10.2-cudnn7-py3-gcc7)
CUDA_VERSION=10.2
CUDNN_VERSION=7
ANACONDA_PYTHON_VERSION=3.7
GCC_VERSION=7
PROTOBUF=yes
DB=yes
VISION=yes
KATEX=yes
;;
pytorch-linux-xenial-cuda11.3-cudnn8-py3-gcc7)
CUDA_VERSION=11.3.0 # Deviating from major.minor to conform to nvidia's Docker image names
CUDNN_VERSION=8
TENSORRT_VERSION=8.0.1.6
ANACONDA_PYTHON_VERSION=3.7
GCC_VERSION=7
PROTOBUF=yes
DB=yes
VISION=yes
KATEX=yes
;;
pytorch-linux-bionic-cuda11.3-cudnn8-py3-clang9)
CUDA_VERSION=11.3.0 # Deviating from major.minor to conform to nvidia's Docker image names
CUDNN_VERSION=8
TENSORRT_VERSION=8.0.1.6
ANACONDA_PYTHON_VERSION=3.7
CLANG_VERSION=9
PROTOBUF=yes
DB=yes
VISION=yes
KATEX=yes
;;
pytorch-linux-bionic-cuda11.6-cudnn8-py3-gcc7)
CUDA_VERSION=11.6.0
CUDNN_VERSION=8
ANACONDA_PYTHON_VERSION=3.7
GCC_VERSION=7
PROTOBUF=yes
DB=yes
VISION=yes
KATEX=yes
;;
pytorch-linux-xenial-py3-clang5-asan)
ANACONDA_PYTHON_VERSION=3.7
CLANG_VERSION=5.0
PROTOBUF=yes
DB=yes
VISION=yes
;;
pytorch-linux-xenial-py3-clang7-asan)
ANACONDA_PYTHON_VERSION=3.7
CLANG_VERSION=7
PROTOBUF=yes
DB=yes
VISION=yes
;;
pytorch-linux-xenial-py3-clang7-onnx)
ANACONDA_PYTHON_VERSION=3.7
CLANG_VERSION=7
PROTOBUF=yes
DB=yes
VISION=yes
;;
pytorch-linux-xenial-py3-clang5-android-ndk-r19c)
ANACONDA_PYTHON_VERSION=3.7
CLANG_VERSION=5.0
LLVMDEV=yes
PROTOBUF=yes
ANDROID=yes
@ -153,25 +184,21 @@ case "$image" in
GRADLE_VERSION=6.8.3
NINJA_VERSION=1.9.0
;;
pytorch-linux-bionic-py3.8-clang9)
ANACONDA_PYTHON_VERSION=3.8
CLANG_VERSION=9
pytorch-linux-xenial-py3.7-clang7)
ANACONDA_PYTHON_VERSION=3.7
CLANG_VERSION=7
PROTOBUF=yes
DB=yes
VISION=yes
VULKAN_SDK_VERSION=1.2.162.1
SWIFTSHADER=yes
CONDA_CMAKE=yes
;;
pytorch-linux-bionic-py3.11-clang9)
ANACONDA_PYTHON_VERSION=3.11
pytorch-linux-bionic-py3.7-clang9)
ANACONDA_PYTHON_VERSION=3.7
CLANG_VERSION=9
PROTOBUF=yes
DB=yes
VISION=yes
VULKAN_SDK_VERSION=1.2.162.1
SWIFTSHADER=yes
CONDA_CMAKE=yes
;;
pytorch-linux-bionic-py3.8-gcc9)
ANACONDA_PYTHON_VERSION=3.8
@ -179,70 +206,49 @@ case "$image" in
PROTOBUF=yes
DB=yes
VISION=yes
CONDA_CMAKE=yes
;;
pytorch-linux-focal-rocm-n-1-py3)
ANACONDA_PYTHON_VERSION=3.8
pytorch-linux-bionic-cuda10.2-cudnn7-py3.7-clang9)
CUDA_VERSION=10.2
CUDNN_VERSION=7
ANACONDA_PYTHON_VERSION=3.7
CLANG_VERSION=9
PROTOBUF=yes
DB=yes
VISION=yes
;;
pytorch-linux-bionic-cuda10.2-cudnn7-py3.9-gcc7)
CUDA_VERSION=10.2
CUDNN_VERSION=7
ANACONDA_PYTHON_VERSION=3.9
GCC_VERSION=7
PROTOBUF=yes
DB=yes
VISION=yes
;;
pytorch-linux-bionic-rocm5.0-py3.7)
ANACONDA_PYTHON_VERSION=3.7
GCC_VERSION=9
PROTOBUF=yes
DB=yes
VISION=yes
ROCM_VERSION=5.3
NINJA_VERSION=1.9.0
CONDA_CMAKE=yes
ROCM_VERSION=5.0
;;
pytorch-linux-focal-rocm-n-py3)
ANACONDA_PYTHON_VERSION=3.8
pytorch-linux-bionic-rocm5.1-py3.7)
ANACONDA_PYTHON_VERSION=3.7
GCC_VERSION=9
PROTOBUF=yes
DB=yes
VISION=yes
ROCM_VERSION=5.4.2
NINJA_VERSION=1.9.0
CONDA_CMAKE=yes
ROCM_VERSION=5.1.1
;;
pytorch-linux-focal-py3.8-gcc7)
ANACONDA_PYTHON_VERSION=3.8
pytorch-linux-focal-py3.7-gcc7)
ANACONDA_PYTHON_VERSION=3.7
CMAKE_VERSION=3.12.4 # To make sure XNNPACK is enabled for the BACKWARDS_COMPAT_TEST used with this image
GCC_VERSION=7
PROTOBUF=yes
DB=yes
VISION=yes
KATEX=yes
CONDA_CMAKE=yes
;;
pytorch-linux-jammy-cuda11.6-cudnn8-py3.8-clang12)
ANACONDA_PYTHON_VERSION=3.8
CUDA_VERSION=11.6
CUDNN_VERSION=8
CLANG_VERSION=12
PROTOBUF=yes
DB=yes
VISION=yes
;;
pytorch-linux-jammy-cuda11.7-cudnn8-py3.8-clang12)
ANACONDA_PYTHON_VERSION=3.8
CUDA_VERSION=11.7
CUDNN_VERSION=8
CLANG_VERSION=12
PROTOBUF=yes
DB=yes
VISION=yes
;;
pytorch-linux-jammy-cuda11.8-cudnn8-py3.8-clang12)
ANACONDA_PYTHON_VERSION=3.8
CUDA_VERSION=11.8
CUDNN_VERSION=8
CLANG_VERSION=12
PROTOBUF=yes
DB=yes
VISION=yes
;;
pytorch-linux-focal-linter)
# TODO: Use 3.9 here because of this issue https://github.com/python/mypy/issues/13627.
# We will need to update mypy version eventually, but that's for another day. The task
# would be to upgrade mypy to 1.0.0 with Python 3.11
ANACONDA_PYTHON_VERSION=3.9
CONDA_CMAKE=yes
;;
*)
# Catch-all for builds that are not hardcoded.
@ -259,10 +265,6 @@ case "$image" in
fi
if [[ "$image" == *rocm* ]]; then
extract_version_from_image_name rocm ROCM_VERSION
NINJA_VERSION=1.9.0
fi
if [[ "$image" == *centos7* ]]; then
NINJA_VERSION=1.10.2
fi
if [[ "$image" == *gcc* ]]; then
extract_version_from_image_name gcc GCC_VERSION
@ -282,6 +284,12 @@ case "$image" in
;;
esac
# Set Jenkins UID and GID if running Jenkins
if [ -n "${JENKINS:-}" ]; then
JENKINS_UID=$(id -u jenkins)
JENKINS_GID=$(id -g jenkins)
fi
tmp_tag=$(basename "$(mktemp -u)" | tr '[:upper:]' '[:lower:]')
#when using cudnn version 8 install it separately from cuda
@ -298,12 +306,17 @@ fi
docker build \
--no-cache \
--progress=plain \
--build-arg "TRAVIS_DL_URL_PREFIX=${TRAVIS_DL_URL_PREFIX}" \
--build-arg "BUILD_ENVIRONMENT=${image}" \
--build-arg "PROTOBUF=${PROTOBUF:-}" \
--build-arg "THRIFT=${THRIFT:-}" \
--build-arg "LLVMDEV=${LLVMDEV:-}" \
--build-arg "DB=${DB:-}" \
--build-arg "VISION=${VISION:-}" \
--build-arg "EC2=${EC2:-}" \
--build-arg "JENKINS=${JENKINS:-}" \
--build-arg "JENKINS_UID=${JENKINS_UID:-}" \
--build-arg "JENKINS_GID=${JENKINS_GID:-}" \
--build-arg "UBUNTU_VERSION=${UBUNTU_VERSION}" \
--build-arg "CENTOS_VERSION=${CENTOS_VERSION}" \
--build-arg "DEVTOOLSET_VERSION=${DEVTOOLSET_VERSION}" \
@ -323,11 +336,8 @@ docker build \
--build-arg "NINJA_VERSION=${NINJA_VERSION:-}" \
--build-arg "KATEX=${KATEX:-}" \
--build-arg "ROCM_VERSION=${ROCM_VERSION:-}" \
--build-arg "PYTORCH_ROCM_ARCH=${PYTORCH_ROCM_ARCH:-gfx906}" \
--build-arg "PYTORCH_ROCM_ARCH=${PYTORCH_ROCM_ARCH:-gfx900;gfx906}" \
--build-arg "IMAGE_NAME=${IMAGE_NAME}" \
--build-arg "UCX_COMMIT=${UCX_COMMIT}" \
--build-arg "UCC_COMMIT=${UCC_COMMIT}" \
--build-arg "CONDA_CMAKE=${CONDA_CMAKE}" \
-f $(dirname ${DOCKERFILE})/Dockerfile \
-t "$tmp_tag" \
"$@" \

View File

@ -35,6 +35,9 @@ if [[ -z "${GITHUB_ACTIONS}" ]]; then
trap "docker logout ${registry}" EXIT
fi
# export EC2=1
# export JENKINS=1
# Try to pull the previous image (perhaps we can reuse some layers)
# if [ -n "${last_tag}" ]; then
# docker pull "${image}:${last_tag}" || true
@ -43,15 +46,7 @@ fi
# Build new image
./build.sh ${IMAGE_NAME} -t "${image}:${tag}"
# Only push if `DOCKER_SKIP_PUSH` = false
if [ "${DOCKER_SKIP_PUSH:-true}" = "false" ]; then
# Only push if docker image doesn't exist already.
# ECR image tags are immutable so this will avoid pushing if only just testing if the docker jobs work
# NOTE: The only workflow that should push these images should be the docker-builds.yml workflow
if ! docker manifest inspect "${image}:${tag}" >/dev/null 2>/dev/null; then
docker push "${image}:${tag}"
fi
fi
docker push "${image}:${tag}"
if [ -z "${DOCKER_SKIP_S3_UPLOAD:-}" ]; then
trap "rm -rf ${IMAGE_NAME}:${tag}.tar" EXIT

View File

@ -11,72 +11,66 @@ ENV PYTORCH_ROCM_ARCH ${PYTORCH_ROCM_ARCH}
# Install required packages to build Caffe2
# Install common dependencies (so that this step can be cached separately)
COPY ./common/install_base.sh install_base.sh
ARG EC2
ADD ./common/install_base.sh install_base.sh
RUN bash ./install_base.sh && rm install_base.sh
# Update CentOS git version
RUN yum -y remove git
RUN yum -y remove git-*
RUN yum -y install https://packages.endpoint.com/rhel/7/os/x86_64/endpoint-repo-1.9-1.x86_64.rpm || \
(yum -y install https://packages.endpointdev.com/rhel/7/os/x86_64/endpoint-repo-1.9-1.x86_64.rpm && \
sed -i "s/packages.endpoint/packages.endpointdev/" /etc/yum.repos.d/endpoint.repo)
RUN yum -y install https://packages.endpoint.com/rhel/7/os/x86_64/endpoint-repo-1.9-1.x86_64.rpm
RUN yum install -y git
# Install devtoolset
ARG DEVTOOLSET_VERSION
COPY ./common/install_devtoolset.sh install_devtoolset.sh
ADD ./common/install_devtoolset.sh install_devtoolset.sh
RUN bash ./install_devtoolset.sh && rm install_devtoolset.sh
ENV BASH_ENV "/etc/profile"
# (optional) Install non-default glibc version
ARG GLIBC_VERSION
COPY ./common/install_glibc.sh install_glibc.sh
ADD ./common/install_glibc.sh install_glibc.sh
RUN if [ -n "${GLIBC_VERSION}" ]; then bash ./install_glibc.sh; fi
RUN rm install_glibc.sh
# Install user
COPY ./common/install_user.sh install_user.sh
ADD ./common/install_user.sh install_user.sh
RUN bash ./install_user.sh && rm install_user.sh
# Install conda and other packages (e.g., numpy, pytest)
ENV PATH /opt/conda/bin:$PATH
ARG ANACONDA_PYTHON_VERSION
ARG CONDA_CMAKE
ENV ANACONDA_PYTHON_VERSION=$ANACONDA_PYTHON_VERSION
ENV PATH /opt/conda/envs/py_$ANACONDA_PYTHON_VERSION/bin:/opt/conda/bin:$PATH
COPY requirements-ci.txt /opt/conda/requirements-ci.txt
COPY ./common/install_conda.sh install_conda.sh
COPY ./common/common_utils.sh common_utils.sh
RUN bash ./install_conda.sh && rm install_conda.sh common_utils.sh /opt/conda/requirements-ci.txt
ADD requirements-ci.txt /opt/conda/requirements-ci.txt
ADD ./common/install_conda.sh install_conda.sh
RUN bash ./install_conda.sh && rm install_conda.sh
RUN rm /opt/conda/requirements-ci.txt
# (optional) Install protobuf for ONNX
ARG PROTOBUF
COPY ./common/install_protobuf.sh install_protobuf.sh
ADD ./common/install_protobuf.sh install_protobuf.sh
RUN if [ -n "${PROTOBUF}" ]; then bash ./install_protobuf.sh; fi
RUN rm install_protobuf.sh
ENV INSTALLED_PROTOBUF ${PROTOBUF}
# (optional) Install database packages like LMDB and LevelDB
ARG DB
COPY ./common/install_db.sh install_db.sh
ADD ./common/install_db.sh install_db.sh
RUN if [ -n "${DB}" ]; then bash ./install_db.sh; fi
RUN rm install_db.sh
ENV INSTALLED_DB ${DB}
# (optional) Install vision packages like OpenCV and ffmpeg
ARG VISION
COPY ./common/install_vision.sh install_vision.sh
ADD ./common/install_vision.sh install_vision.sh
RUN if [ -n "${VISION}" ]; then bash ./install_vision.sh; fi
RUN rm install_vision.sh
ENV INSTALLED_VISION ${VISION}
# Install rocm
ARG ROCM_VERSION
COPY ./common/install_rocm.sh install_rocm.sh
ADD ./common/install_rocm.sh install_rocm.sh
RUN bash ./install_rocm.sh
RUN rm install_rocm.sh
COPY ./common/install_rocm_magma.sh install_rocm_magma.sh
RUN bash ./install_rocm_magma.sh
RUN rm install_rocm_magma.sh
ENV PATH /opt/rocm/bin:$PATH
ENV PATH /opt/rocm/hcc/bin:$PATH
ENV PATH /opt/rocm/hip/bin:$PATH
@ -88,18 +82,18 @@ ENV LC_ALL en_US.utf8
# (optional) Install non-default CMake version
ARG CMAKE_VERSION
COPY ./common/install_cmake.sh install_cmake.sh
ADD ./common/install_cmake.sh install_cmake.sh
RUN if [ -n "${CMAKE_VERSION}" ]; then bash ./install_cmake.sh; fi
RUN rm install_cmake.sh
# (optional) Install non-default Ninja version
ARG NINJA_VERSION
COPY ./common/install_ninja.sh install_ninja.sh
ADD ./common/install_ninja.sh install_ninja.sh
RUN if [ -n "${NINJA_VERSION}" ]; then bash ./install_ninja.sh; fi
RUN rm install_ninja.sh
# Install ccache/sccache (do this last, so we get priority in PATH)
COPY ./common/install_cache.sh install_cache.sh
ADD ./common/install_cache.sh install_cache.sh
ENV PATH /opt/cache/bin:$PATH
RUN bash ./install_cache.sh && rm install_cache.sh

View File

@ -15,22 +15,11 @@ install_ubuntu() {
elif [[ "$UBUNTU_VERSION" == "20.04"* ]]; then
cmake3="cmake=3.16*"
maybe_libiomp_dev=""
elif [[ "$UBUNTU_VERSION" == "22.04"* ]]; then
cmake3="cmake=3.22*"
maybe_libiomp_dev=""
else
cmake3="cmake=3.5*"
maybe_libiomp_dev="libiomp-dev"
fi
if [[ "$CLANG_VERSION" == 12 ]]; then
maybe_libomp_dev="libomp-12-dev"
elif [[ "$CLANG_VERSION" == 10 ]]; then
maybe_libomp_dev="libomp-10-dev"
else
maybe_libomp_dev=""
fi
# TODO: Remove this once nvidia package repos are back online
# Comment out nvidia repositories to prevent them from getting apt-get updated, see https://github.com/pytorch/pytorch/issues/74968
# shellcheck disable=SC2046
@ -40,12 +29,10 @@ install_ubuntu() {
apt-get update
# TODO: Some of these may not be necessary
ccache_deps="asciidoc docbook-xml docbook-xsl xsltproc"
deploy_deps="libffi-dev libbz2-dev libreadline-dev libncurses5-dev libncursesw5-dev libgdbm-dev libsqlite3-dev uuid-dev tk-dev"
numpy_deps="gfortran"
apt-get install -y --no-install-recommends \
$ccache_deps \
$numpy_deps \
${deploy_deps} \
${cmake3} \
apt-transport-https \
autoconf \
@ -62,35 +49,15 @@ install_ubuntu() {
libjpeg-dev \
libasound2-dev \
libsndfile-dev \
${maybe_libomp_dev} \
software-properties-common \
wget \
sudo \
vim \
jq \
libtool \
vim \
unzip \
gdb
vim
# Should resolve issues related to various apt package repository cert issues
# see: https://github.com/pytorch/pytorch/issues/65931
apt-get install -y libgnutls30
# cuda-toolkit does not work with gcc-11.2.0 which is default in Ubunutu 22.04
# see: https://github.com/NVlabs/instant-ngp/issues/119
if [[ "$UBUNTU_VERSION" == "22.04"* ]]; then
apt-get install -y g++-10
update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-10 30
update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-10 30
update-alternatives --install /usr/bin/gcov gcov /usr/bin/gcov-10 30
# https://www.spinics.net/lists/libreoffice/msg07549.html
sudo rm -rf /usr/lib/gcc/x86_64-linux-gnu/11
wget https://github.com/gcc-mirror/gcc/commit/2b2d97fc545635a0f6aa9c9ee3b017394bc494bf.patch -O noexecpt.patch
sudo patch /usr/include/c++/10/bits/range_access.h noexecpt.patch
fi
# Cleanup package manager
apt-get autoclean && apt-get clean
rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
@ -129,9 +96,7 @@ install_centos() {
opencv-devel \
sudo \
wget \
vim \
unzip \
gdb
vim
# Cleanup
yum clean all
@ -157,7 +122,7 @@ esac
# Install Valgrind separately since the apt-get version is too old.
mkdir valgrind_build && cd valgrind_build
VALGRIND_VERSION=3.20.0
VALGRIND_VERSION=3.16.1
wget https://ossci-linux.s3.amazonaws.com/valgrind-${VALGRIND_VERSION}.tar.bz2
tar -xjf valgrind-${VALGRIND_VERSION}.tar.bz2
cd valgrind-${VALGRIND_VERSION}

View File

@ -5,9 +5,7 @@ set -ex
install_ubuntu() {
echo "Preparing to build sccache from source"
apt-get update
# libssl-dev will not work as it is upgraded to libssl3 in Ubuntu-22.04.
# Instead use lib and headers from OpenSSL1.1 installed in `install_openssl.sh``
apt-get install -y cargo
apt-get install -y cargo pkg-config libssl-dev
echo "Checking out sccache repo"
git clone https://github.com/pytorch/sccache
cd sccache
@ -48,9 +46,7 @@ fi
chmod a+x /opt/cache/bin/sccache
function write_sccache_stub() {
# Unset LD_PRELOAD for ps because of asan + ps issues
# https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90589
printf "#!/bin/sh\nif [ \$(env -u LD_PRELOAD ps -p \$PPID -o comm=) != sccache ]; then\n exec sccache $(which $1) \"\$@\"\nelse\n exec $(which $1) \"\$@\"\nfi" > "/opt/cache/bin/$1"
printf "#!/bin/sh\nif [ \$(ps -p \$PPID -o comm=) != sccache ]; then\n exec sccache $(which $1) \"\$@\"\nelse\n exec $(which $1) \"\$@\"\nfi" > "/opt/cache/bin/$1"
chmod a+x "/opt/cache/bin/$1"
}

View File

@ -13,9 +13,6 @@ if [ -n "$CLANG_VERSION" ]; then
sudo apt-get install -y --no-install-recommends gpg-agent
wget --no-check-certificate -O - https://apt.llvm.org/llvm-snapshot.gpg.key | sudo apt-key add -
apt-add-repository "deb http://apt.llvm.org/bionic/ llvm-toolchain-bionic-${CLANG_VERSION} main"
elif [[ $UBUNTU_VERSION == 22.04 ]]; then
# work around ubuntu apt-get conflicts
sudo apt-get -y -f install
fi
sudo apt-get update

View File

@ -5,19 +5,7 @@ set -ex
[ -n "$CMAKE_VERSION" ]
# Remove system cmake install so it won't get used instead
ID=$(grep -oP '(?<=^ID=).+' /etc/os-release | tr -d '"')
case "$ID" in
ubuntu)
apt-get remove cmake -y
;;
centos)
yum remove cmake -y
;;
*)
echo "Unable to determine OS..."
exit 1
;;
esac
apt-get remove cmake -y
# Turn 3.6.3 into v3.6
path=$(echo "${CMAKE_VERSION}" | sed -e 's/\([0-9].[0-9]\+\).*/v\1/')

View File

@ -24,12 +24,26 @@ if [ -n "$ANACONDA_PYTHON_VERSION" ]; then
mkdir -p /opt/conda
chown jenkins:jenkins /opt/conda
source "$(dirname "${BASH_SOURCE[0]}")/common_utils.sh"
# Work around bug where devtoolset replaces sudo and breaks it.
if [ -n "$DEVTOOLSET_VERSION" ]; then
SUDO=/bin/sudo
else
SUDO=sudo
fi
as_jenkins() {
# NB: unsetting the environment variables works around a conda bug
# https://github.com/conda/conda/issues/6576
# NB: Pass on PATH and LD_LIBRARY_PATH to sudo invocation
# NB: This must be run from a directory that jenkins has access to,
# works around https://github.com/conda/conda-package-handling/pull/34
$SUDO -H -u jenkins env -u SUDO_UID -u SUDO_GID -u SUDO_COMMAND -u SUDO_USER env "PATH=$PATH" "LD_LIBRARY_PATH=$LD_LIBRARY_PATH" $*
}
pushd /tmp
wget -q "${BASE_URL}/${CONDA_FILE}"
# NB: Manually invoke bash per https://github.com/conda/conda/issues/10431
as_jenkins bash "${CONDA_FILE}" -b -f -p "/opt/conda"
chmod +x "${CONDA_FILE}"
as_jenkins ./"${CONDA_FILE}" -b -f -p "/opt/conda"
popd
# NB: Don't do this, rely on the rpath to get it right
@ -41,40 +55,37 @@ if [ -n "$ANACONDA_PYTHON_VERSION" ]; then
# Ensure we run conda in a directory that jenkins has write access to
pushd /opt/conda
# Prevent conda from updating to 4.14.0, which causes docker build failures
# See https://hud.pytorch.org/pytorch/pytorch/commit/754d7f05b6841e555cea5a4b2c505dd9e0baec1d
# Uncomment the below when resolved to track the latest conda update
# as_jenkins conda update -y -n base conda
# Track latest conda update
as_jenkins conda update -y -n base conda
# Install correct Python version
as_jenkins conda create -n py_$ANACONDA_PYTHON_VERSION -y python="$ANACONDA_PYTHON_VERSION"
as_jenkins conda install -y python="$ANACONDA_PYTHON_VERSION"
conda_install() {
# Ensure that the install command don't upgrade/downgrade Python
# This should be called as
# conda_install pkg1 pkg2 ... [-c channel]
as_jenkins conda install -q -y python="$ANACONDA_PYTHON_VERSION" $*
}
pip_install() {
as_jenkins pip install --progress-bar off $*
}
# Install PyTorch conda deps, as per https://github.com/pytorch/pytorch README
CONDA_COMMON_DEPS="astunparse pyyaml mkl=2021.4.0 mkl-include=2021.4.0 setuptools"
if [ "$ANACONDA_PYTHON_VERSION" = "3.11" ]; then
# DO NOT install cmake here as it would install a version newer than 3.10, but
# we want to pin to version 3.10.
if [ "$ANACONDA_PYTHON_VERSION" = "3.9" ]; then
# Install llvm-8 as it is required to compile llvmlite-0.30.0 from source
# TODO: Stop using `-c malfet`
conda_install numpy=1.23.5 ${CONDA_COMMON_DEPS} llvmdev=8.0.0 -c malfet
elif [ "$ANACONDA_PYTHON_VERSION" = "3.10" ]; then
# Install llvm-8 as it is required to compile llvmlite-0.30.0 from source
conda_install numpy=1.21.2 ${CONDA_COMMON_DEPS} llvmdev=8.0.0
elif [ "$ANACONDA_PYTHON_VERSION" = "3.9" ]; then
# Install llvm-8 as it is required to compile llvmlite-0.30.0 from source
conda_install numpy=1.19.2 ${CONDA_COMMON_DEPS} llvmdev=8.0.0
conda_install numpy=1.19.2 astunparse pyyaml mkl mkl-include setuptools cffi future six llvmdev=8.0.0
elif [ "$ANACONDA_PYTHON_VERSION" = "3.8" ]; then
# Install llvm-8 as it is required to compile llvmlite-0.30.0 from source
conda_install numpy=1.18.5 ${CONDA_COMMON_DEPS} llvmdev=8.0.0
conda_install numpy=1.18.5 astunparse pyyaml mkl mkl-include setuptools cffi future six llvmdev=8.0.0
elif [ "$ANACONDA_PYTHON_VERSION" = "3.7" ]; then
# DO NOT install dataclasses if installing python-3.7, since its part of python-3.7 core packages
conda_install numpy=1.18.5 astunparse pyyaml mkl mkl-include setuptools cffi future six typing_extensions
else
# Install `typing-extensions` for 3.7
conda_install numpy=1.18.5 ${CONDA_COMMON_DEPS} typing-extensions
fi
# Use conda cmake in some cases. Conda cmake will be newer than our supported
# min version (3.5 for xenial and 3.10 for bionic), so we only do it in those
# following builds that we know should use conda. Specifically, Ubuntu bionic
# and focal cannot find conda mkl with stock cmake, so we need a cmake from conda
if [ -n "${CONDA_CMAKE}" ]; then
conda_install cmake
conda_install numpy=1.18.5 astunparse pyyaml mkl mkl-include setuptools cffi future six dataclasses typing_extensions
fi
# Magma package names are concatenation of CUDA major and minor ignoring revision
@ -83,6 +94,9 @@ if [ -n "$ANACONDA_PYTHON_VERSION" ]; then
conda_install magma-cuda$(TMP=${CUDA_VERSION/./};echo ${TMP%.*[0-9]}) -c pytorch
fi
# TODO: This isn't working atm
conda_install nnpack -c killeent
# Install some other packages, including those needed for Python test reporting
pip_install -r /opt/conda/requirements-ci.txt

View File

@ -0,0 +1,18 @@
#!/bin/bash
if [[ ${CUDNN_VERSION} == 8 ]]; then
# cuDNN license: https://developer.nvidia.com/cudnn/license_agreement
mkdir tmp_cudnn && cd tmp_cudnn
CUDNN_NAME="cudnn-linux-x86_64-8.3.2.44_cuda11.5-archive"
curl -OLs https://developer.download.nvidia.com/compute/redist/cudnn/v8.3.2/local_installers/11.5/${CUDNN_NAME}.tar.xz
tar xf ${CUDNN_NAME}.tar.xz
cp -a ${CUDNN_NAME}/include/* /usr/include/
cp -a ${CUDNN_NAME}/include/* /usr/local/cuda/include/
cp -a ${CUDNN_NAME}/include/* /usr/include/x86_64-linux-gnu/
cp -a ${CUDNN_NAME}/lib/* /usr/local/cuda/lib64/
cp -a ${CUDNN_NAME}/lib/* /usr/lib/x86_64-linux-gnu/
cd ..
rm -rf tmp_cudnn
ldconfig
fi

View File

@ -7,18 +7,16 @@ if [ -n "$KATEX" ]; then
# Ignore error if gpg-agent doesn't exist (for Ubuntu 16.04)
apt-get install -y gpg-agent || :
curl --retry 3 -sL https://deb.nodesource.com/setup_12.x | sudo -E bash -
curl -sL https://deb.nodesource.com/setup_12.x | sudo -E bash -
sudo apt-get install -y nodejs
curl --retry 3 -sS https://dl.yarnpkg.com/debian/pubkey.gpg | sudo apt-key add -
curl -sS https://dl.yarnpkg.com/debian/pubkey.gpg | sudo apt-key add -
echo "deb https://dl.yarnpkg.com/debian/ stable main" | sudo tee /etc/apt/sources.list.d/yarn.list
apt-get update
apt-get install -y --no-install-recommends yarn
yarn global add katex --prefix /usr/local
sudo apt-get -y install doxygen
apt-get autoclean && apt-get clean
rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

View File

@ -10,7 +10,5 @@ cd "${OPENSSL}"
./config --prefix=/opt/openssl -d '-Wl,--enable-new-dtags,-rpath,$(LIBRPATH)'
# NOTE: openssl install errors out when built with the -j option
make -j6; make install_sw
# Link the ssl libraries to the /usr/lib folder.
sudo ln -s /opt/openssl/lib/lib* /usr/lib
cd ..
rm -rf "${OPENSSL}"

View File

@ -12,7 +12,7 @@ install_protobuf_317() {
# g++: error: ./../lib64/crti.o: No such file or directory
ln -s /usr/lib64 "$pb_dir/lib64"
curl -LO "https://github.com/protocolbuffers/protobuf/releases/download/v3.17.3/protobuf-all-3.17.3.tar.gz" --retry 3
curl -LO "https://github.com/protocolbuffers/protobuf/releases/download/v3.17.3/protobuf-all-3.17.3.tar.gz"
tar -xvz -C "$pb_dir" --strip-components 1 -f protobuf-all-3.17.3.tar.gz
# -j6 to balance memory usage and speed.
# naked `-j` seems to use too much memory.

View File

@ -2,12 +2,40 @@
set -ex
install_magma() {
# "install" hipMAGMA into /opt/rocm/magma by copying after build
git clone https://bitbucket.org/icl/magma.git
pushd magma
# Fixes memory leaks of magma found while executing linalg UTs
git checkout 5959b8783e45f1809812ed96ae762f38ee701972
cp make.inc-examples/make.inc.hip-gcc-mkl make.inc
echo 'LIBDIR += -L$(MKLROOT)/lib' >> make.inc
echo 'LIB += -Wl,--enable-new-dtags -Wl,--rpath,/opt/rocm/lib -Wl,--rpath,$(MKLROOT)/lib -Wl,--rpath,/opt/rocm/magma/lib' >> make.inc
echo 'DEVCCFLAGS += --gpu-max-threads-per-block=256' >> make.inc
export PATH="${PATH}:/opt/rocm/bin"
if [[ -n "$PYTORCH_ROCM_ARCH" ]]; then
amdgpu_targets=`echo $PYTORCH_ROCM_ARCH | sed 's/;/ /g'`
else
amdgpu_targets=`rocm_agent_enumerator | grep -v gfx000 | sort -u | xargs`
fi
for arch in $amdgpu_targets; do
echo "DEVCCFLAGS += --amdgpu-target=$arch" >> make.inc
done
# hipcc with openmp flag may cause isnan() on __device__ not to be found; depending on context, compiler may attempt to match with host definition
sed -i 's/^FOPENMP/#FOPENMP/g' make.inc
make -f make.gen.hipMAGMA -j $(nproc)
LANG=C.UTF-8 make lib/libmagma.so -j $(nproc) MKLROOT=/opt/conda
make testing/testing_dgemm -j $(nproc) MKLROOT=/opt/conda
popd
mv magma /opt/rocm
}
ver() {
printf "%3d%03d%03d%03d" $(echo "$1" | tr '.' ' ');
}
# Map ROCm version to AMDGPU version
declare -A AMDGPU_VERSIONS=( ["5.0"]="21.50" ["5.1.1"]="22.10.1" ["5.2"]="22.20" )
declare -A AMDGPU_VERSIONS=( ["4.5.2"]="21.40.2" ["5.0"]="21.50" ["5.1.1"]="22.10.1" )
install_ubuntu() {
apt-get update
@ -29,12 +57,7 @@ install_ubuntu() {
if [[ $(ver $ROCM_VERSION) -ge $(ver 4.5) ]]; then
# Add amdgpu repository
UBUNTU_VERSION_NAME=`cat /etc/os-release | grep UBUNTU_CODENAME | awk -F= '{print $2}'`
local amdgpu_baseurl
if [[ $(ver $ROCM_VERSION) -ge $(ver 5.3) ]]; then
amdgpu_baseurl="https://repo.radeon.com/amdgpu/${ROCM_VERSION}/ubuntu"
else
amdgpu_baseurl="https://repo.radeon.com/amdgpu/${AMDGPU_VERSIONS[$ROCM_VERSION]}/ubuntu"
fi
local amdgpu_baseurl="https://repo.radeon.com/amdgpu/${AMDGPU_VERSIONS[$ROCM_VERSION]}/ubuntu"
echo "deb [arch=amd64] ${amdgpu_baseurl} ${UBUNTU_VERSION_NAME} main" > /etc/apt/sources.list.d/amdgpu.list
fi
@ -43,10 +66,6 @@ install_ubuntu() {
ROCM_REPO="xenial"
fi
if [[ $(ver $ROCM_VERSION) -ge $(ver 5.3) ]]; then
ROCM_REPO="${UBUNTU_VERSION_NAME}"
fi
# Add rocm repository
wget -qO - http://repo.radeon.com/rocm/rocm.gpg.key | apt-key add -
local rocm_baseurl="http://repo.radeon.com/rocm/apt/${ROCM_VERSION}"
@ -70,6 +89,8 @@ install_ubuntu() {
DEBIAN_FRONTEND=noninteractive apt-get install -y --allow-unauthenticated ${MIOPENKERNELS}
fi
install_magma
# Cleanup
apt-get autoclean && apt-get clean
rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
@ -87,16 +108,7 @@ install_centos() {
if [[ $(ver $ROCM_VERSION) -ge $(ver 4.5) ]]; then
# Add amdgpu repository
local amdgpu_baseurl
if [[ $OS_VERSION == 9 ]]; then
amdgpu_baseurl="https://repo.radeon.com/amdgpu/${AMDGPU_VERSIONS[$ROCM_VERSION]}/rhel/9.0/main/x86_64"
else
if [[ $(ver $ROCM_VERSION) -ge $(ver 5.3) ]]; then
amdgpu_baseurl="https://repo.radeon.com/amdgpu/${ROCM_VERSION}/rhel/7.9/main/x86_64"
else
amdgpu_baseurl="https://repo.radeon.com/amdgpu/${AMDGPU_VERSIONS[$ROCM_VERSION]}/rhel/7.9/main/x86_64"
fi
fi
local amdgpu_baseurl="https://repo.radeon.com/amdgpu/${AMDGPU_VERSIONS[$ROCM_VERSION]}/rhel/7.9/main/x86_64"
echo "[AMDGPU]" > /etc/yum.repos.d/amdgpu.repo
echo "name=AMDGPU" >> /etc/yum.repos.d/amdgpu.repo
echo "baseurl=${amdgpu_baseurl}" >> /etc/yum.repos.d/amdgpu.repo
@ -123,6 +135,8 @@ install_centos() {
rocprofiler-dev \
roctracer-dev
install_magma
# Cleanup
yum clean all
rm -rf /var/cache/yum

View File

@ -22,12 +22,5 @@ chown jenkins:jenkins /usr/local
# TODO: Maybe we shouldn't
echo 'jenkins ALL=(ALL) NOPASSWD:ALL' > /etc/sudoers.d/jenkins
# Work around bug where devtoolset replaces sudo and breaks it.
if [ -n "$DEVTOOLSET_VERSION" ]; then
SUDO=/bin/sudo
else
SUDO=sudo
fi
# Test that sudo works
$SUDO -u jenkins $SUDO -v
sudo -u jenkins sudo -v

View File

@ -36,7 +36,12 @@ flatbuffers==2.0
#Pinned versions: 2.0
#test that import:
hypothesis==5.35.1
#future #this breaks linux-bionic-rocm4.5-py3.7
#Description: compatibility layer between python 2 and python 3
#Pinned versions:
#test that import:
hypothesis==4.53.2
# Pin hypothesis to avoid flakiness: https://github.com/pytorch/pytorch/issues/31136
#Description: advanced library for generating parametrized tests
#Pinned versions: 3.44.6, 4.53.2
@ -47,7 +52,7 @@ junitparser==2.1.1
#Pinned versions: 2.1.1
#test that import:
librosa>=0.6.2 ; python_version < "3.11"
librosa>=0.6.2
#Description: A python package for music and audio analysis
#Pinned versions: >=0.6.2
#test that import: test_spectral_ops.py
@ -75,17 +80,17 @@ librosa>=0.6.2 ; python_version < "3.11"
#Pinned versions:
#test that import:
mypy==0.960
mypy==0.812
# Pin MyPy version because new errors are likely to appear with each release
#Description: linter
#Pinned versions: 0.960
#Pinned versions: 0.812
#test that import: test_typing.py, test_type_hints.py
networkx==2.6.3
#networkx
#Description: creation, manipulation, and study of
#the structure, dynamics, and functions of complex networks
#Pinned versions: 2.6.3 (latest version that works with Python 3.7+)
#test that import: functorch
#Pinned versions: 2.0
#test that import:
#ninja
#Description: build system. Note that it install from
@ -95,7 +100,6 @@ networkx==2.6.3
numba==0.49.0 ; python_version < "3.9"
numba==0.54.1 ; python_version == "3.9"
numba==0.55.2 ; python_version == "3.10"
#Description: Just-In-Time Compiler for Numerical Functions
#Pinned versions: 0.54.1, 0.49.0, <=0.49.1
#test that import: test_numba_integration.py
@ -119,19 +123,14 @@ numba==0.55.2 ; python_version == "3.10"
#Pinned versions: 1.9.0
#test that import:
opt-einsum==3.3
#Description: Python library to optimize tensor contraction order, used in einsum
#Pinned versions: 3.3
#test that import: test_linalg.py
#pillow
#Description: Python Imaging Library fork
#Pinned versions:
#test that import:
protobuf==3.20.2
#protobuf
#Description: Googles data interchange format
#Pinned versions: 3.20.1
#Pinned versions:
#test that import: test_tensorboard.py
psutil
@ -144,26 +143,6 @@ pytest
#Pinned versions:
#test that import: test_typing.py, test_cpp_extensions_aot.py, run_test.py
pytest-xdist
#Description: plugin for running pytest in parallel
#Pinned versions:
#test that import:
pytest-shard
#Description: plugin spliting up tests in pytest
#Pinned versions:
#test that import:
pytest-flakefinder==1.1.0
#Description: plugin for rerunning tests a fixed number of times in pytest
#Pinned versions: 1.1.0
#test that import:
pytest-rerunfailures
#Description: plugin for rerunning failure tests in pytest
#Pinned versions:
#test that import:
#pytest-benchmark
#Description: fixture for benchmarking code
#Pinned versions: 3.2.3
@ -174,16 +153,6 @@ pytest-rerunfailures
#Pinned versions:
#test that import:
xdoctest==1.1.0
#Description: runs doctests in pytest
#Pinned versions: 1.1.0
#test that import:
pygments==2.12.0
#Description: support doctest highlighting
#Pinned versions: 2.12.0
#test that import: the doctests
#PyYAML
#Description: data serialization format
#Pinned versions:
@ -209,9 +178,7 @@ scikit-image
#Pinned versions: 0.20.3
#test that import:
scipy==1.6.3 ; python_version < "3.10"
scipy==1.8.1 ; python_version == "3.10"
scipy==1.9.3 ; python_version == "3.11"
scipy==1.6.3
# Pin SciPy because of failing distribution tests (see #60347)
#Description: scientific python
#Pinned versions: 1.6.3
@ -243,18 +210,3 @@ unittest-xml-reporting<=3.2.0,>=2.0.0
#Description: saves unit test results to xml
#Pinned versions:
#test that import:
lintrunner==0.9.2
#Description: all about linters
#Pinned versions: 0.9.2
#test that import:
rockset==1.0.3
#Description: queries Rockset
#Pinned versions: 1.0.3
#test that import:
ghstack==0.7.1
#Description: ghstack tool
#Pinned versions: 0.7.1
#test that import:

View File

@ -10,98 +10,81 @@ ARG CUDA_VERSION
ENV DEBIAN_FRONTEND noninteractive
# Install common dependencies (so that this step can be cached separately)
COPY ./common/install_base.sh install_base.sh
ARG EC2
ADD ./common/install_base.sh install_base.sh
RUN bash ./install_base.sh && rm install_base.sh
# Install user
COPY ./common/install_user.sh install_user.sh
ADD ./common/install_user.sh install_user.sh
RUN bash ./install_user.sh && rm install_user.sh
# Install katex
ARG KATEX
COPY ./common/install_docs_reqs.sh install_docs_reqs.sh
RUN bash ./install_docs_reqs.sh && rm install_docs_reqs.sh
ADD ./common/install_katex.sh install_katex.sh
RUN bash ./install_katex.sh && rm install_katex.sh
# Install conda and other packages (e.g., numpy, pytest)
ENV PATH /opt/conda/bin:$PATH
ARG ANACONDA_PYTHON_VERSION
ENV ANACONDA_PYTHON_VERSION=$ANACONDA_PYTHON_VERSION
ENV PATH /opt/conda/envs/py_$ANACONDA_PYTHON_VERSION/bin:/opt/conda/bin:$PATH
ARG CONDA_CMAKE
COPY requirements-ci.txt /opt/conda/requirements-ci.txt
COPY ./common/install_conda.sh install_conda.sh
COPY ./common/common_utils.sh common_utils.sh
RUN bash ./install_conda.sh && rm install_conda.sh common_utils.sh /opt/conda/requirements-ci.txt
ADD requirements-ci.txt /opt/conda/requirements-ci.txt
ADD ./common/install_conda.sh install_conda.sh
RUN bash ./install_conda.sh && rm install_conda.sh
RUN rm /opt/conda/requirements-ci.txt
# Install gcc
ARG GCC_VERSION
COPY ./common/install_gcc.sh install_gcc.sh
ADD ./common/install_gcc.sh install_gcc.sh
RUN bash ./install_gcc.sh && rm install_gcc.sh
# Install clang
ARG CLANG_VERSION
COPY ./common/install_clang.sh install_clang.sh
ADD ./common/install_clang.sh install_clang.sh
RUN bash ./install_clang.sh && rm install_clang.sh
# (optional) Install protobuf for ONNX
ARG PROTOBUF
COPY ./common/install_protobuf.sh install_protobuf.sh
ADD ./common/install_protobuf.sh install_protobuf.sh
RUN if [ -n "${PROTOBUF}" ]; then bash ./install_protobuf.sh; fi
RUN rm install_protobuf.sh
ENV INSTALLED_PROTOBUF ${PROTOBUF}
# (optional) Install database packages like LMDB and LevelDB
ARG DB
COPY ./common/install_db.sh install_db.sh
ADD ./common/install_db.sh install_db.sh
RUN if [ -n "${DB}" ]; then bash ./install_db.sh; fi
RUN rm install_db.sh
ENV INSTALLED_DB ${DB}
# (optional) Install vision packages like OpenCV and ffmpeg
ARG VISION
COPY ./common/install_vision.sh install_vision.sh
ADD ./common/install_vision.sh install_vision.sh
RUN if [ -n "${VISION}" ]; then bash ./install_vision.sh; fi
RUN rm install_vision.sh
ENV INSTALLED_VISION ${VISION}
# (optional) Install UCC
ARG UCX_COMMIT
ARG UCC_COMMIT
ENV UCX_COMMIT $UCX_COMMIT
ENV UCC_COMMIT $UCC_COMMIT
ENV UCX_HOME /usr
ENV UCC_HOME /usr
ADD ./common/install_ucc.sh install_ucc.sh
RUN if [ -n "${UCX_COMMIT}" ] && [ -n "${UCC_COMMIT}" ]; then bash ./install_ucc.sh; fi
RUN rm install_ucc.sh
COPY ./common/install_openssl.sh install_openssl.sh
ADD ./common/install_openssl.sh install_openssl.sh
ENV OPENSSL_ROOT_DIR /opt/openssl
RUN bash ./install_openssl.sh
ENV OPENSSL_DIR /opt/openssl
# (optional) Install non-default CMake version
ARG CMAKE_VERSION
COPY ./common/install_cmake.sh install_cmake.sh
ADD ./common/install_cmake.sh install_cmake.sh
RUN if [ -n "${CMAKE_VERSION}" ]; then bash ./install_cmake.sh; fi
RUN rm install_cmake.sh
# Install ccache/sccache (do this last, so we get priority in PATH)
COPY ./common/install_cache.sh install_cache.sh
ADD ./common/install_cache.sh install_cache.sh
ENV PATH /opt/cache/bin:$PATH
# See https://github.com/pytorch/pytorch/issues/82174
# TODO(sdym@fb.com):
# check if this is needed after full off Xenial migration
ENV CARGO_NET_GIT_FETCH_WITH_CLI true
RUN bash ./install_cache.sh && rm install_cache.sh
ENV CMAKE_CUDA_COMPILER_LAUNCHER=/opt/cache/bin/sccache
# Add jni.h for java host build
COPY ./common/install_jni.sh install_jni.sh
COPY ./java/jni.h jni.h
ADD ./common/install_jni.sh install_jni.sh
ADD ./java/jni.h jni.h
RUN bash ./install_jni.sh && rm install_jni.sh
# Install Open MPI for CUDA
COPY ./common/install_openmpi.sh install_openmpi.sh
ADD ./common/install_openmpi.sh install_openmpi.sh
RUN if [ -n "${CUDA_VERSION}" ]; then bash install_openmpi.sh; fi
RUN rm install_openmpi.sh
@ -119,14 +102,9 @@ COPY --from=pytorch/llvm:9.0.1 /opt/llvm /opt/llvm
# Install CUDNN
ARG CUDNN_VERSION
ARG CUDA_VERSION
COPY ./common/install_cudnn.sh install_cudnn.sh
ADD ./common/install_cudnn.sh install_cudnn.sh
RUN if [ "${CUDNN_VERSION}" -eq 8 ]; then bash install_cudnn.sh; fi
RUN rm install_cudnn.sh
# Delete /usr/local/cuda-11.X/cuda-11.X symlinks
RUN if [ -h /usr/local/cuda-11.6/cuda-11.6 ]; then rm /usr/local/cuda-11.6/cuda-11.6; fi
RUN if [ -h /usr/local/cuda-11.7/cuda-11.7 ]; then rm /usr/local/cuda-11.7/cuda-11.7; fi
USER jenkins
CMD ["bash"]

View File

@ -11,63 +11,59 @@ ARG PYTORCH_ROCM_ARCH
ENV PYTORCH_ROCM_ARCH ${PYTORCH_ROCM_ARCH}
# Install common dependencies (so that this step can be cached separately)
COPY ./common/install_base.sh install_base.sh
ARG EC2
ADD ./common/install_base.sh install_base.sh
RUN bash ./install_base.sh && rm install_base.sh
# Install clang
ARG LLVMDEV
ARG CLANG_VERSION
COPY ./common/install_clang.sh install_clang.sh
ADD ./common/install_clang.sh install_clang.sh
RUN bash ./install_clang.sh && rm install_clang.sh
# Install user
COPY ./common/install_user.sh install_user.sh
ADD ./common/install_user.sh install_user.sh
RUN bash ./install_user.sh && rm install_user.sh
# Install conda and other packages (e.g., numpy, pytest)
ENV PATH /opt/conda/bin:$PATH
ARG ANACONDA_PYTHON_VERSION
ARG CONDA_CMAKE
ENV ANACONDA_PYTHON_VERSION=$ANACONDA_PYTHON_VERSION
ENV PATH /opt/conda/envs/py_$ANACONDA_PYTHON_VERSION/bin:/opt/conda/bin:$PATH
COPY requirements-ci.txt /opt/conda/requirements-ci.txt
COPY ./common/install_conda.sh install_conda.sh
COPY ./common/common_utils.sh common_utils.sh
RUN bash ./install_conda.sh && rm install_conda.sh common_utils.sh /opt/conda/requirements-ci.txt
ADD requirements-ci.txt /opt/conda/requirements-ci.txt
ADD ./common/install_conda.sh install_conda.sh
RUN bash ./install_conda.sh && rm install_conda.sh
RUN rm /opt/conda/requirements-ci.txt
# Install gcc
ARG GCC_VERSION
COPY ./common/install_gcc.sh install_gcc.sh
ADD ./common/install_gcc.sh install_gcc.sh
RUN bash ./install_gcc.sh && rm install_gcc.sh
# (optional) Install protobuf for ONNX
ARG PROTOBUF
COPY ./common/install_protobuf.sh install_protobuf.sh
ADD ./common/install_protobuf.sh install_protobuf.sh
RUN if [ -n "${PROTOBUF}" ]; then bash ./install_protobuf.sh; fi
RUN rm install_protobuf.sh
ENV INSTALLED_PROTOBUF ${PROTOBUF}
# (optional) Install database packages like LMDB and LevelDB
ARG DB
COPY ./common/install_db.sh install_db.sh
ADD ./common/install_db.sh install_db.sh
RUN if [ -n "${DB}" ]; then bash ./install_db.sh; fi
RUN rm install_db.sh
ENV INSTALLED_DB ${DB}
# (optional) Install vision packages like OpenCV and ffmpeg
ARG VISION
COPY ./common/install_vision.sh install_vision.sh
ADD ./common/install_vision.sh install_vision.sh
RUN if [ -n "${VISION}" ]; then bash ./install_vision.sh; fi
RUN rm install_vision.sh
ENV INSTALLED_VISION ${VISION}
# Install rocm
ARG ROCM_VERSION
COPY ./common/install_rocm.sh install_rocm.sh
ADD ./common/install_rocm.sh install_rocm.sh
RUN bash ./install_rocm.sh
RUN rm install_rocm.sh
COPY ./common/install_rocm_magma.sh install_rocm_magma.sh
RUN bash ./install_rocm_magma.sh
RUN rm install_rocm_magma.sh
ENV PATH /opt/rocm/bin:$PATH
ENV PATH /opt/rocm/hcc/bin:$PATH
ENV PATH /opt/rocm/hip/bin:$PATH
@ -79,18 +75,18 @@ ENV LC_ALL C.UTF-8
# (optional) Install non-default CMake version
ARG CMAKE_VERSION
COPY ./common/install_cmake.sh install_cmake.sh
ADD ./common/install_cmake.sh install_cmake.sh
RUN if [ -n "${CMAKE_VERSION}" ]; then bash ./install_cmake.sh; fi
RUN rm install_cmake.sh
# (optional) Install non-default Ninja version
ARG NINJA_VERSION
COPY ./common/install_ninja.sh install_ninja.sh
ADD ./common/install_ninja.sh install_ninja.sh
RUN if [ -n "${NINJA_VERSION}" ]; then bash ./install_ninja.sh; fi
RUN rm install_ninja.sh
# Install ccache/sccache (do this last, so we get priority in PATH)
COPY ./common/install_cache.sh install_cache.sh
ADD ./common/install_cache.sh install_cache.sh
ENV PATH /opt/cache/bin:$PATH
RUN bash ./install_cache.sh && rm install_cache.sh

View File

@ -6,87 +6,67 @@ ARG UBUNTU_VERSION
ENV DEBIAN_FRONTEND noninteractive
ARG CLANG_VERSION
# Install common dependencies (so that this step can be cached separately)
COPY ./common/install_base.sh install_base.sh
ARG EC2
ADD ./common/install_base.sh install_base.sh
RUN bash ./install_base.sh && rm install_base.sh
# Install clang
ARG LLVMDEV
COPY ./common/install_clang.sh install_clang.sh
ARG CLANG_VERSION
ADD ./common/install_clang.sh install_clang.sh
RUN bash ./install_clang.sh && rm install_clang.sh
# (optional) Install thrift.
ARG THRIFT
COPY ./common/install_thrift.sh install_thrift.sh
ADD ./common/install_thrift.sh install_thrift.sh
RUN if [ -n "${THRIFT}" ]; then bash ./install_thrift.sh; fi
RUN rm install_thrift.sh
ENV INSTALLED_THRIFT ${THRIFT}
# Install user
COPY ./common/install_user.sh install_user.sh
ADD ./common/install_user.sh install_user.sh
RUN bash ./install_user.sh && rm install_user.sh
# Install katex
ARG KATEX
COPY ./common/install_docs_reqs.sh install_docs_reqs.sh
RUN bash ./install_docs_reqs.sh && rm install_docs_reqs.sh
ADD ./common/install_katex.sh install_katex.sh
RUN bash ./install_katex.sh && rm install_katex.sh
# Install conda and other packages (e.g., numpy, pytest)
ENV PATH /opt/conda/bin:$PATH
ARG ANACONDA_PYTHON_VERSION
ARG CONDA_CMAKE
ENV ANACONDA_PYTHON_VERSION=$ANACONDA_PYTHON_VERSION
ENV PATH /opt/conda/envs/py_$ANACONDA_PYTHON_VERSION/bin:/opt/conda/bin:$PATH
COPY requirements-ci.txt /opt/conda/requirements-ci.txt
COPY ./common/install_conda.sh install_conda.sh
COPY ./common/common_utils.sh common_utils.sh
RUN bash ./install_conda.sh && rm install_conda.sh common_utils.sh /opt/conda/requirements-ci.txt
ADD requirements-ci.txt /opt/conda/requirements-ci.txt
ADD ./common/install_conda.sh install_conda.sh
RUN bash ./install_conda.sh && rm install_conda.sh
RUN rm /opt/conda/requirements-ci.txt
# Install gcc
ARG GCC_VERSION
COPY ./common/install_gcc.sh install_gcc.sh
ADD ./common/install_gcc.sh install_gcc.sh
RUN bash ./install_gcc.sh && rm install_gcc.sh
# Install lcov for C++ code coverage
COPY ./common/install_lcov.sh install_lcov.sh
ADD ./common/install_lcov.sh install_lcov.sh
RUN bash ./install_lcov.sh && rm install_lcov.sh
# Install cuda and cudnn
ARG CUDA_VERSION
RUN wget -q https://raw.githubusercontent.com/pytorch/builder/main/common/install_cuda.sh -O install_cuda.sh
RUN bash ./install_cuda.sh ${CUDA_VERSION} && rm install_cuda.sh
ENV DESIRED_CUDA ${CUDA_VERSION}
ENV PATH /usr/local/nvidia/bin:/usr/local/cuda/bin:$PATH
# (optional) Install UCC
ARG UCX_COMMIT
ARG UCC_COMMIT
ENV UCX_COMMIT $UCX_COMMIT
ENV UCC_COMMIT $UCC_COMMIT
ENV UCX_HOME /usr
ENV UCC_HOME /usr
ADD ./common/install_ucc.sh install_ucc.sh
RUN if [ -n "${UCX_COMMIT}" ] && [ -n "${UCC_COMMIT}" ]; then bash ./install_ucc.sh; fi
RUN rm install_ucc.sh
# (optional) Install protobuf for ONNX
ARG PROTOBUF
COPY ./common/install_protobuf.sh install_protobuf.sh
ADD ./common/install_protobuf.sh install_protobuf.sh
RUN if [ -n "${PROTOBUF}" ]; then bash ./install_protobuf.sh; fi
RUN rm install_protobuf.sh
ENV INSTALLED_PROTOBUF ${PROTOBUF}
# (optional) Install database packages like LMDB and LevelDB
ARG DB
COPY ./common/install_db.sh install_db.sh
ADD ./common/install_db.sh install_db.sh
RUN if [ -n "${DB}" ]; then bash ./install_db.sh; fi
RUN rm install_db.sh
ENV INSTALLED_DB ${DB}
# (optional) Install vision packages like OpenCV and ffmpeg
ARG VISION
COPY ./common/install_vision.sh install_vision.sh
ADD ./common/install_vision.sh install_vision.sh
RUN if [ -n "${VISION}" ]; then bash ./install_vision.sh; fi
RUN rm install_vision.sh
ENV INSTALLED_VISION ${VISION}
@ -95,9 +75,9 @@ ENV INSTALLED_VISION ${VISION}
ARG ANDROID
ARG ANDROID_NDK
ARG GRADLE_VERSION
COPY ./common/install_android.sh install_android.sh
COPY ./android/AndroidManifest.xml AndroidManifest.xml
COPY ./android/build.gradle build.gradle
ADD ./common/install_android.sh install_android.sh
ADD ./android/AndroidManifest.xml AndroidManifest.xml
ADD ./android/build.gradle build.gradle
RUN if [ -n "${ANDROID}" ]; then bash ./install_android.sh; fi
RUN rm install_android.sh
RUN rm AndroidManifest.xml
@ -106,49 +86,42 @@ ENV INSTALLED_ANDROID ${ANDROID}
# (optional) Install Vulkan SDK
ARG VULKAN_SDK_VERSION
COPY ./common/install_vulkan_sdk.sh install_vulkan_sdk.sh
ADD ./common/install_vulkan_sdk.sh install_vulkan_sdk.sh
RUN if [ -n "${VULKAN_SDK_VERSION}" ]; then bash ./install_vulkan_sdk.sh; fi
RUN rm install_vulkan_sdk.sh
# (optional) Install swiftshader
ARG SWIFTSHADER
COPY ./common/install_swiftshader.sh install_swiftshader.sh
ADD ./common/install_swiftshader.sh install_swiftshader.sh
RUN if [ -n "${SWIFTSHADER}" ]; then bash ./install_swiftshader.sh; fi
RUN rm install_swiftshader.sh
# (optional) Install non-default CMake version
ARG CMAKE_VERSION
COPY ./common/install_cmake.sh install_cmake.sh
ADD ./common/install_cmake.sh install_cmake.sh
RUN if [ -n "${CMAKE_VERSION}" ]; then bash ./install_cmake.sh; fi
RUN rm install_cmake.sh
# (optional) Install non-default Ninja version
ARG NINJA_VERSION
COPY ./common/install_ninja.sh install_ninja.sh
ADD ./common/install_ninja.sh install_ninja.sh
RUN if [ -n "${NINJA_VERSION}" ]; then bash ./install_ninja.sh; fi
RUN rm install_ninja.sh
COPY ./common/install_openssl.sh install_openssl.sh
ADD ./common/install_openssl.sh install_openssl.sh
RUN bash ./install_openssl.sh
ENV OPENSSL_ROOT_DIR /opt/openssl
ENV OPENSSL_DIR /opt/openssl
RUN rm install_openssl.sh
# Install ccache/sccache (do this last, so we get priority in PATH)
COPY ./common/install_cache.sh install_cache.sh
ADD ./common/install_cache.sh install_cache.sh
ENV PATH /opt/cache/bin:$PATH
RUN bash ./install_cache.sh && rm install_cache.sh
# Add jni.h for java host build
COPY ./common/install_jni.sh install_jni.sh
COPY ./java/jni.h jni.h
ADD ./common/install_jni.sh install_jni.sh
ADD ./java/jni.h jni.h
RUN bash ./install_jni.sh && rm install_jni.sh
# Install Open MPI for CUDA
COPY ./common/install_openmpi.sh install_openmpi.sh
RUN if [ -n "${CUDA_VERSION}" ]; then bash install_openmpi.sh; fi
RUN rm install_openmpi.sh
# Include BUILD_ENVIRONMENT environment variable in image
ARG BUILD_ENVIRONMENT
ENV BUILD_ENVIRONMENT ${BUILD_ENVIRONMENT}
@ -156,10 +129,5 @@ ENV BUILD_ENVIRONMENT ${BUILD_ENVIRONMENT}
# Install LLVM dev version (Defined in the pytorch/builder github repository)
COPY --from=pytorch/llvm:9.0.1 /opt/llvm /opt/llvm
# AWS specific CUDA build guidance
ENV TORCH_CUDA_ARCH_LIST Maxwell
ENV TORCH_NVCC_FLAGS "-Xfatbin -compress-all"
ENV CUDA_PATH /usr/local/cuda
USER jenkins
CMD ["bash"]

View File

@ -70,7 +70,6 @@ class Header(object):
for line in filter(None, lines):
output_filehandle.write(line + "\n")
def _for_all_items(items, functor) -> None:
if isinstance(items, list):
for item in items:
@ -79,7 +78,6 @@ def _for_all_items(items, functor) -> None:
item_type, item = next(iter(items.items()))
functor(item_type, item)
def filter_master_only_jobs(items):
def _is_main_or_master_item(item):
filters = item.get('filters', None)
@ -118,7 +116,6 @@ def filter_master_only_jobs(items):
_for_all_items(items, _save_requires_if_master)
return _do_filtering(items)
def generate_required_docker_images(items):
required_docker_images = set()
@ -134,7 +131,6 @@ def generate_required_docker_images(items):
_for_all_items(items, _requires_docker_image)
return required_docker_images
def gen_build_workflows_tree():
build_workflows_functions = [
cimodel.data.simple.mobile_definitions.get_workflow_jobs,

View File

@ -56,13 +56,13 @@ else
echo "Can't tell what to checkout"
exit 1
fi
retry git submodule update --init --recursive
retry git submodule update --init --recursive --jobs 0
echo "Using Pytorch from "
git --no-pager log --max-count 1
popd
# Clone the Builder master repo
retry git clone -q https://github.com/pytorch/builder.git -b release/2.0 "$BUILDER_ROOT"
retry git clone -q https://github.com/pytorch/builder.git -b release/1.12 "$BUILDER_ROOT"
pushd "$BUILDER_ROOT"
echo "Using builder from "
git --no-pager log --max-count 1

View File

@ -31,9 +31,9 @@ fi
conda_sh="$workdir/install_miniconda.sh"
if [[ "$(uname)" == Darwin ]]; then
curl --retry 3 --retry-all-errors -o "$conda_sh" https://repo.anaconda.com/miniconda/Miniconda3-py39_4.12.0-MacOSX-x86_64.sh
curl --retry 3 -o "$conda_sh" https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh
else
curl --retry 3 --retry-all-errors -o "$conda_sh" https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
curl --retry 3 -o "$conda_sh" https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
fi
chmod +x "$conda_sh"
"$conda_sh" -b -p "$MINICONDA_ROOT"

View File

@ -8,21 +8,21 @@ PROJ_ROOT=/Users/distiller/project
export TCLLIBPATH="/usr/local/lib"
# Install conda
curl --retry 3 -o ~/conda.sh https://repo.anaconda.com/miniconda/Miniconda3-py39_4.12.0-MacOSX-x86_64.sh
curl --retry 3 -o ~/conda.sh https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh
chmod +x ~/conda.sh
/bin/bash ~/conda.sh -b -p ~/anaconda
export PATH="~/anaconda/bin:${PATH}"
source ~/anaconda/bin/activate
# Install dependencies
conda install numpy ninja pyyaml mkl mkl-include setuptools cmake requests typing-extensions --yes
conda install numpy ninja pyyaml mkl mkl-include setuptools cmake cffi requests typing_extensions --yes
conda install -c conda-forge valgrind --yes
export CMAKE_PREFIX_PATH=${CONDA_PREFIX:-"$(dirname $(which conda))/../"}
# sync submodules
cd ${PROJ_ROOT}
git submodule sync
git submodule update --init --recursive
git submodule update --init --recursive --jobs 0
# run build script
chmod a+x ${PROJ_ROOT}/scripts/build_ios.sh

View File

@ -1,19 +1,30 @@
#!/bin/bash
set -ex -o pipefail
if ! [ "$IOS_PLATFORM" == "SIMULATOR" ]; then
exit 0
fi
echo ""
echo "DIR: $(pwd)"
PROJ_ROOT=/Users/distiller/project
cd ${PROJ_ROOT}/ios/TestApp
# install fastlane
sudo gem install bundler && bundle install
# install certificates
echo "${IOS_CERT_KEY_2022}" >> cert.txt
base64 --decode cert.txt -o Certificates.p12
rm cert.txt
bundle exec fastlane install_root_cert
bundle exec fastlane install_dev_cert
# install the provisioning profile
PROFILE=PyTorch_CI_2022.mobileprovision
PROVISIONING_PROFILES=~/Library/MobileDevice/Provisioning\ Profiles
mkdir -pv "${PROVISIONING_PROFILES}"
cd "${PROVISIONING_PROFILES}"
echo "${IOS_SIGN_KEY_2022}" >> cert.txt
base64 --decode cert.txt -o ${PROFILE}
rm cert.txt
# run the ruby build script
if ! [ -x "$(command -v xcodebuild)" ]; then
echo 'Error: xcodebuild is not installed.'
exit 1
fi
ruby ${PROJ_ROOT}/scripts/xcode_build.rb -i ${PROJ_ROOT}/build_ios/install -x ${PROJ_ROOT}/ios/TestApp/TestApp.xcodeproj -p ${IOS_PLATFORM}
PROFILE=PyTorch_CI_2022
ruby ${PROJ_ROOT}/scripts/xcode_build.rb -i ${PROJ_ROOT}/build_ios/install -x ${PROJ_ROOT}/ios/TestApp/TestApp.xcodeproj -p ${IOS_PLATFORM} -c ${PROFILE} -t ${IOS_DEV_TEAM_ID}

View File

@ -33,7 +33,7 @@ fi
cp ${PROJ_ROOT}/LICENSE ${ZIP_DIR}/
# zip the library
export DATE="$(date -u +%Y%m%d)"
export IOS_NIGHTLY_BUILD_VERSION="2.0.0.${DATE}"
export IOS_NIGHTLY_BUILD_VERSION="1.12.0.${DATE}"
if [ "${BUILD_LITE_INTERPRETER}" == "1" ]; then
# libtorch_lite_ios_nightly_1.11.0.20210810.zip
ZIPFILE="libtorch_lite_ios_nightly_${IOS_NIGHTLY_BUILD_VERSION}.zip"
@ -47,7 +47,7 @@ echo "${IOS_NIGHTLY_BUILD_VERSION}" > version.txt
zip -r ${ZIPFILE} install src version.txt LICENSE
# upload to aws
# Install conda then 'conda install' awscli
curl --retry 3 -o ~/conda.sh https://repo.anaconda.com/miniconda/Miniconda3-py39_4.12.0-MacOSX-x86_64.sh
curl --retry 3 -o ~/conda.sh https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh
chmod +x ~/conda.sh
/bin/bash ~/conda.sh -b -p ~/anaconda
export PATH="~/anaconda/bin:${PATH}"

View File

@ -38,12 +38,8 @@ fi
EXTRA_CONDA_FLAGS=""
NUMPY_PIN=""
PROTOBUF_PACKAGE="defaults::protobuf"
if [[ "\$python_nodot" = *311* ]]; then
# Numpy is yet not avaiable on default conda channel
EXTRA_CONDA_FLAGS="-c=malfet"
fi
if [[ "\$python_nodot" = *310* ]]; then
EXTRA_CONDA_FLAGS="-c=conda-forge"
# There's an issue with conda channel priority where it'll randomly pick 1.19 over 1.20
# we set a lower boundary here just to be safe
NUMPY_PIN=">=1.21.2"
@ -51,12 +47,15 @@ if [[ "\$python_nodot" = *310* ]]; then
fi
if [[ "\$python_nodot" = *39* ]]; then
EXTRA_CONDA_FLAGS="-c=conda-forge"
# There's an issue with conda channel priority where it'll randomly pick 1.19 over 1.20
# we set a lower boundary here just to be safe
NUMPY_PIN=">=1.20"
fi
if [[ "$DESIRED_CUDA" == "cu116" ]]; then
EXTRA_CONDA_FLAGS="-c=conda-forge"
fi
# Move debug wheels out of the the package dir so they don't get installed
mkdir -p /tmp/debug_final_pkgs
@ -79,27 +78,29 @@ if [[ "$PACKAGE_TYPE" == conda ]]; then
set +u
retry conda install \${EXTRA_CONDA_FLAGS} -yq \
"numpy\${NUMPY_PIN}" \
future \
mkl>=2018 \
ninja \
dataclasses \
typing-extensions \
${PROTOBUF_PACKAGE}
${PROTOBUF_PACKAGE} \
six
if [[ "$DESIRED_CUDA" == 'cpu' ]]; then
retry conda install -c pytorch -y cpuonly
else
cu_ver="${DESIRED_CUDA:2:2}.${DESIRED_CUDA:4}"
CUDA_PACKAGE="pytorch-cuda"
PYTORCH_CHANNEL="pytorch"
if [[ "\${TORCH_CONDA_BUILD_FOLDER}" == "pytorch-nightly" ]]; then
PYTORCH_CHANNEL="pytorch-nightly"
# DESIRED_CUDA is in format cu90 or cu102
if [[ "${#DESIRED_CUDA}" == 4 ]]; then
cu_ver="${DESIRED_CUDA:2:1}.${DESIRED_CUDA:3}"
else
cu_ver="${DESIRED_CUDA:2:2}.${DESIRED_CUDA:4}"
fi
retry conda install \${EXTRA_CONDA_FLAGS} -yq -c nvidia -c "\${PYTORCH_CHANNEL}" "pytorch-cuda=\${cu_ver}"
retry conda install \${EXTRA_CONDA_FLAGS} -yq -c nvidia -c pytorch "cudatoolkit=\${cu_ver}"
fi
conda install \${EXTRA_CONDA_FLAGS} -y "\$pkg" --offline
)
elif [[ "$PACKAGE_TYPE" != libtorch ]]; then
pip install "\$pkg" --extra-index-url "https://download.pytorch.org/whl/nightly/${DESIRED_CUDA}"
retry pip install -q numpy protobuf typing-extensions
pip install "\$pkg"
retry pip install -q future numpy protobuf typing-extensions six
fi
if [[ "$PACKAGE_TYPE" == libtorch ]]; then
pkg="\$(ls /final_pkgs/*-latest.zip)"

View File

@ -4,7 +4,7 @@ set -eux -o pipefail
source "${BINARY_ENV_FILE:-/Users/distiller/project/env}"
mkdir -p "$PYTORCH_FINAL_PACKAGE_DIR"
if [[ -z "${GITHUB_ACTIONS:-}" ]]; then
if [[ -z "${IS_GHA:-}" ]]; then
export PATH="${workdir:-${HOME}}/miniconda/bin:${PATH}"
fi

View File

@ -5,7 +5,7 @@ export TZ=UTC
tagged_version() {
# Grabs version from either the env variable CIRCLE_TAG
# or the pytorch git described version
if [[ "$OSTYPE" == "msys" && -z "${GITHUB_ACTIONS:-}" ]]; then
if [[ "$OSTYPE" == "msys" && -z "${IS_GHA:-}" ]]; then
GIT_DIR="${workdir}/p/.git"
else
GIT_DIR="${workdir}/pytorch/.git"
@ -23,12 +23,50 @@ tagged_version() {
fi
}
envfile=${BINARY_ENV_FILE:-/tmp/env}
if [[ -n "${PYTORCH_ROOT}" ]]; then
workdir=$(dirname "${PYTORCH_ROOT}")
# These are only relevant for CircleCI
# TODO: Remove these later once migrated fully to GHA
if [[ -z ${IS_GHA:-} ]]; then
# We need to write an envfile to persist these variables to following
# steps, but the location of the envfile depends on the circleci executor
if [[ "$(uname)" == Darwin ]]; then
# macos executor (builds and tests)
workdir="/Users/distiller/project"
elif [[ "$OSTYPE" == "msys" ]]; then
# windows executor (builds and tests)
workdir="/c/w"
elif [[ -d "/home/circleci/project" ]]; then
# machine executor (binary tests)
workdir="/home/circleci/project"
else
# docker executor (binary builds)
workdir="/"
fi
envfile="$workdir/env"
touch "$envfile"
chmod +x "$envfile"
# Parse the BUILD_ENVIRONMENT to package type, python, and cuda
configs=($BUILD_ENVIRONMENT)
export PACKAGE_TYPE="${configs[0]}"
export DESIRED_PYTHON="${configs[1]}"
export DESIRED_CUDA="${configs[2]}"
if [[ "${OSTYPE}" == "msys" ]]; then
export DESIRED_DEVTOOLSET=""
export LIBTORCH_CONFIG="${configs[3]:-}"
if [[ "$LIBTORCH_CONFIG" == 'debug' ]]; then
export DEBUG=1
fi
else
export DESIRED_DEVTOOLSET="${configs[3]:-}"
fi
else
# docker executor (binary builds)
workdir="/"
envfile=${BINARY_ENV_FILE:-/tmp/env}
if [[ -n "${PYTORCH_ROOT}" ]]; then
workdir=$(dirname "${PYTORCH_ROOT}")
else
# docker executor (binary builds)
workdir="/"
fi
fi
if [[ "$PACKAGE_TYPE" == 'libtorch' ]]; then
@ -59,7 +97,7 @@ PIP_UPLOAD_FOLDER='nightly/'
# We put this here so that OVERRIDE_PACKAGE_VERSION below can read from it
export DATE="$(date -u +%Y%m%d)"
#TODO: We should be pulling semver version from the base version.txt
BASE_BUILD_VERSION="2.0.0.dev$DATE"
BASE_BUILD_VERSION="1.12.0.dev$DATE"
# Change BASE_BUILD_VERSION to git tag when on a git tag
# Use 'git -C' to make doubly sure we're in the correct directory for checking
# the git tag
@ -76,11 +114,6 @@ if [[ "$(uname)" == 'Darwin' ]] || [[ "$PACKAGE_TYPE" == conda ]]; then
else
export PYTORCH_BUILD_VERSION="${BASE_BUILD_VERSION}+$DESIRED_CUDA"
fi
if [[ -n "${PYTORCH_EXTRA_INSTALL_REQUIREMENTS:-}" ]]; then
export PYTORCH_BUILD_VERSION="${PYTORCH_BUILD_VERSION}-with-pypi-cudnn"
fi
export PYTORCH_BUILD_NUMBER=1
@ -92,11 +125,11 @@ if [[ "$PACKAGE_TYPE" == libtorch ]]; then
POSSIBLE_JAVA_HOMES+=(/usr/lib/jvm/java-8-openjdk-amd64)
POSSIBLE_JAVA_HOMES+=(/Library/Java/JavaVirtualMachines/*.jdk/Contents/Home)
# Add the Windows-specific JNI path
POSSIBLE_JAVA_HOMES+=("$PWD/pytorch/.circleci/windows-jni/")
POSSIBLE_JAVA_HOMES+=("$PWD/.circleci/windows-jni/")
for JH in "${POSSIBLE_JAVA_HOMES[@]}" ; do
if [[ -e "$JH/include/jni.h" ]] ; then
# Skip if we're not on Windows but haven't found a JAVA_HOME
if [[ "$JH" == "$PWD/pytorch/.circleci/windows-jni/" && "$OSTYPE" != "msys" ]] ; then
if [[ "$JH" == "$PWD/.circleci/windows-jni/" && "$OSTYPE" != "msys" ]] ; then
break
fi
echo "Found jni.h under $JH"
@ -129,9 +162,9 @@ if [[ "${OSTYPE}" == "msys" ]]; then
else
export DESIRED_DEVTOOLSET="${DESIRED_DEVTOOLSET:-}"
fi
export PYTORCH_EXTRA_INSTALL_REQUIREMENTS="${PYTORCH_EXTRA_INSTALL_REQUIREMENTS:-}"
export DATE="$DATE"
export NIGHTLIES_DATE_PREAMBLE=1.14.0.dev
export NIGHTLIES_DATE_PREAMBLE=1.12.0.dev
export PYTORCH_BUILD_VERSION="$PYTORCH_BUILD_VERSION"
export PYTORCH_BUILD_NUMBER="$PYTORCH_BUILD_NUMBER"
export OVERRIDE_PACKAGE_VERSION="$PYTORCH_BUILD_VERSION"
@ -167,7 +200,7 @@ if [[ "$(uname)" != Darwin ]]; then
EOL
fi
if [[ -z "${GITHUB_ACTIONS:-}" ]]; then
if [[ -z "${IS_GHA:-}" ]]; then
cat >>"$envfile" <<EOL
export workdir="$workdir"
export MAC_PACKAGE_WORK_DIR="$workdir"

View File

@ -14,12 +14,6 @@ UPLOAD_CHANNEL=${UPLOAD_CHANNEL:-nightly}
UPLOAD_SUBFOLDER=${UPLOAD_SUBFOLDER:-cpu}
UPLOAD_BUCKET="s3://pytorch"
BACKUP_BUCKET="s3://pytorch-backup"
BUILD_NAME=${BUILD_NAME:-}
# this is temporary change to upload pypi-cudnn builds to separate folder
if [[ ${BUILD_NAME} == *with-pypi-cudnn* ]]; then
UPLOAD_SUBFOLDER="${UPLOAD_SUBFOLDER}_pypi_cudnn"
fi
DRY_RUN=${DRY_RUN:-enabled}
# Don't actually do work unless explicit
@ -30,11 +24,6 @@ if [[ "${DRY_RUN}" = "disabled" ]]; then
AWS_S3_CP="aws s3 cp"
fi
# Sleep 2 minutes between retries for conda upload
retry () {
"$@" || (sleep 5m && "$@") || (sleep 5m && "$@") || (sleep 5m && "$@") || (sleep 5m && "$@")
}
do_backup() {
local backup_dir
backup_dir=$1
@ -48,14 +37,13 @@ do_backup() {
conda_upload() {
(
set -x
retry \
${ANACONDA} \
upload \
${PKG_DIR}/*.tar.bz2 \
-u "pytorch-${UPLOAD_CHANNEL}" \
--label main \
--no-progress \
--force
upload \
${PKG_DIR}/*.tar.bz2 \
-u "pytorch-${UPLOAD_CHANNEL}" \
--label main \
--no-progress \
--force
)
}

View File

@ -6,9 +6,9 @@ mkdir -p "$PYTORCH_FINAL_PACKAGE_DIR"
export CUDA_VERSION="${DESIRED_CUDA/cu/}"
export USE_SCCACHE=1
export SCCACHE_BUCKET=ossci-compiler-cache
export SCCACHE_BUCKET=ossci-compiler-cache-windows
export SCCACHE_IGNORE_SERVER_IO_ERROR=1
export VC_YEAR=2022
export VC_YEAR=2019
if [[ "${DESIRED_CUDA}" == *"cu11"* ]]; then
export BUILD_SPLIT_CUDA=ON

View File

@ -4,7 +4,7 @@ set -eux -o pipefail
source "${BINARY_ENV_FILE:-/c/w/env}"
export CUDA_VERSION="${DESIRED_CUDA/cu/}"
export VC_YEAR=2022
export VC_YEAR=2019
pushd "$BUILDER_ROOT"

View File

@ -20,11 +20,6 @@ do
touch "$file" || true
done < <(find /var/lib/jenkins/.gradle -type f -print0)
# Patch pocketfft (as Android does not have aligned_alloc even if compiled with c++17
if [ -f ~/workspace/third_party/pocketfft/pocketfft_hdronly.h ]; then
sed -i -e "s/#if __cplusplus >= 201703L/#if 0/" ~/workspace/third_party/pocketfft/pocketfft_hdronly.h
fi
export GRADLE_LOCAL_PROPERTIES=~/workspace/android/local.properties
rm -f $GRADLE_LOCAL_PROPERTIES
echo "sdk.dir=/opt/android/sdk" >> $GRADLE_LOCAL_PROPERTIES
@ -83,7 +78,7 @@ if [[ "${BUILD_ENVIRONMENT}" == *-gradle-build-only-x86_32* ]]; then
GRADLE_PARAMS+=" -PABI_FILTERS=x86"
fi
if [ -n "${GRADLE_OFFLINE:-}" ]; then
if [ -n "{GRADLE_OFFLINE:-}" ]; then
GRADLE_PARAMS+=" --offline"
fi

View File

@ -51,6 +51,8 @@ git clone https://github.com/pytorch/cppdocs
set -ex
sudo apt-get -y install doxygen
# Generate ATen files
pushd "${pt_checkout}"
pip install -r requirements.txt

View File

@ -1,5 +1,5 @@
set "DRIVER_DOWNLOAD_LINK=https://s3.amazonaws.com/ossci-windows/452.39-data-center-tesla-desktop-win10-64bit-international.exe"
curl --retry 3 --retry-all-errors -kL %DRIVER_DOWNLOAD_LINK% --output 452.39-data-center-tesla-desktop-win10-64bit-international.exe
curl --retry 3 -kL %DRIVER_DOWNLOAD_LINK% --output 452.39-data-center-tesla-desktop-win10-64bit-international.exe
if errorlevel 1 exit /b 1
start /wait 452.39-data-center-tesla-desktop-win10-64bit-international.exe -s -noreboot

View File

@ -1,47 +0,0 @@
#!/bin/bash
# =================== The following code **should** be executed inside Docker container ===================
# Install dependencies
sudo apt-get -y update
sudo apt-get -y install expect-dev
# This is where the local pytorch install in the docker image is located
pt_checkout="/var/lib/jenkins/workspace"
source "$pt_checkout/.ci/pytorch/common_utils.sh"
echo "functorch_doc_push_script.sh: Invoked with $*"
set -ex
version=${DOCS_VERSION:-nightly}
echo "version: $version"
# Build functorch docs
pushd $pt_checkout/functorch/docs
pip -q install -r requirements.txt
make html
popd
git clone https://github.com/pytorch/functorch -b gh-pages --depth 1 functorch_ghpages
pushd functorch_ghpages
if [ $version == "master" ]; then
version=nightly
fi
git rm -rf "$version" || true
mv "$pt_checkout/functorch/docs/build/html" "$version"
git add "$version" || true
git status
git config user.email "soumith+bot@pytorch.org"
git config user.name "pytorchbot"
# If there aren't changes, don't make a commit; push is no-op
git commit -m "Generate Python docs from pytorch/pytorch@${GITHUB_SHA}" || true
git status
if [[ "${WITH_PUSH:-}" == true ]]; then
git push -u origin gh-pages
fi
popd
# =================== The above code **should** be executed inside Docker container ===================

View File

@ -7,7 +7,7 @@ sudo apt-get -y install expect-dev
# This is where the local pytorch install in the docker image is located
pt_checkout="/var/lib/jenkins/workspace"
source "$pt_checkout/.ci/pytorch/common_utils.sh"
source "$pt_checkout/.jenkins/pytorch/common_utils.sh"
echo "python_doc_push_script.sh: Invoked with $*"
@ -77,9 +77,6 @@ pushd pytorch.github.io
export LC_ALL=C
export PATH=/opt/conda/bin:$PATH
if [ -n $ANACONDA_PYTHON_VERSION ]; then
export PATH=/opt/conda/envs/py_$ANACONDA_PYTHON_VERSION/bin:$PATH
fi
rm -rf pytorch || true
@ -138,10 +135,6 @@ git commit -m "Generate Python docs from pytorch/pytorch@${GITHUB_SHA}" || true
git status
if [[ "${WITH_PUSH:-}" == true ]]; then
# push to a temp branch first to trigger CLA check and satisfy branch protections
git push -u origin HEAD:pytorchbot/temp-branch-py -f
git push -u origin HEAD^:pytorchbot/base -f
sleep 30
git push -u origin "${branch}"
fi

View File

@ -32,7 +32,7 @@ if ! command -v aws >/dev/null; then
fi
if [ -n "${USE_CUDA_DOCKER_RUNTIME:-}" ]; then
DRIVER_FN="NVIDIA-Linux-x86_64-515.76.run"
DRIVER_FN="NVIDIA-Linux-x86_64-510.60.02.run"
wget "https://s3.amazonaws.com/ossci-linux/nvidia_driver/$DRIVER_FN"
sudo /bin/bash "$DRIVER_FN" -s --no-drm || (sudo cat /var/log/nvidia-installer.log && false)
nvidia-smi
@ -40,8 +40,8 @@ if [ -n "${USE_CUDA_DOCKER_RUNTIME:-}" ]; then
# Taken directly from https://github.com/NVIDIA/nvidia-docker
# Add the package repositories
distribution=$(. /etc/os-release;echo "$ID$VERSION_ID")
curl -s -L --retry 3 --retry-all-errors https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L --retry 3 --retry-all-errors "https://nvidia.github.io/nvidia-docker/${distribution}/nvidia-docker.list" | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L "https://nvidia.github.io/nvidia-docker/${distribution}/nvidia-docker.list" | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
retry sudo apt-get update -qq
# Necessary to get the `--gpus` flag to function within docker
@ -66,6 +66,7 @@ add_to_env_file() {
esac
}
add_to_env_file IN_CI 1
add_to_env_file CI_MASTER "${CI_MASTER:-}"
add_to_env_file COMMIT_SOURCE "${CIRCLE_BRANCH:-}"
add_to_env_file BUILD_ENVIRONMENT "${BUILD_ENVIRONMENT}"

View File

@ -2,7 +2,7 @@
set -eux -o pipefail
# Set up CircleCI GPG keys for apt, if needed
curl --retry 3 --retry-all-errors -s -L https://packagecloud.io/circleci/trusty/gpgkey | sudo apt-key add -
curl --retry 3 -s -L https://packagecloud.io/circleci/trusty/gpgkey | sudo apt-key add -
# Stop background apt updates. Hypothetically, the kill should not
# be necessary, because stop is supposed to send a kill signal to

View File

@ -0,0 +1,65 @@
# https://developercommunity.visualstudio.com/t/install-specific-version-of-vs-component/1142479
# Where to find the links: https://docs.microsoft.com/en-us/visualstudio/releases/2019/history#release-dates-and-build-numbers
# BuildTools from S3
$VS_DOWNLOAD_LINK = "https://s3.amazonaws.com/ossci-windows/vs${env:VS_VERSION}_BuildTools.exe"
$COLLECT_DOWNLOAD_LINK = "https://aka.ms/vscollect.exe"
$VS_INSTALL_ARGS = @("--nocache","--quiet","--wait", "--add Microsoft.VisualStudio.Workload.VCTools",
"--add Microsoft.Component.MSBuild",
"--add Microsoft.VisualStudio.Component.Roslyn.Compiler",
"--add Microsoft.VisualStudio.Component.TextTemplating",
"--add Microsoft.VisualStudio.Component.VC.CoreIde",
"--add Microsoft.VisualStudio.Component.VC.Redist.14.Latest",
"--add Microsoft.VisualStudio.ComponentGroup.NativeDesktop.Core",
"--add Microsoft.VisualStudio.Component.VC.Tools.x86.x64",
"--add Microsoft.VisualStudio.ComponentGroup.NativeDesktop.Win81")
if (${env:INSTALL_WINDOWS_SDK} -eq "1") {
$VS_INSTALL_ARGS += "--add Microsoft.VisualStudio.Component.Windows10SDK.19041"
}
if (Test-Path "${env:ProgramFiles(x86)}\Microsoft Visual Studio\Installer\vswhere.exe") {
$VS_VERSION_major = [int] ${env:VS_VERSION}.split(".")[0]
$existingPath = & "${env:ProgramFiles(x86)}\Microsoft Visual Studio\Installer\vswhere.exe" -products "Microsoft.VisualStudio.Product.BuildTools" -version "[${env:VS_VERSION}, ${env:VS_VERSION_major + 1})" -property installationPath
if (($existingPath -ne $null) -and (!${env:CIRCLECI})) {
echo "Found correctly versioned existing BuildTools installation in $existingPath"
exit 0
}
$pathToRemove = & "${env:ProgramFiles(x86)}\Microsoft Visual Studio\Installer\vswhere.exe" -products "Microsoft.VisualStudio.Product.BuildTools" -property installationPath
}
echo "Downloading VS installer from S3."
curl.exe --retry 3 -kL $VS_DOWNLOAD_LINK --output vs_installer.exe
if ($LASTEXITCODE -ne 0) {
echo "Download of the VS 2019 Version ${env:VS_VERSION} installer failed"
exit 1
}
if ($pathToRemove -ne $null) {
echo "Uninstalling $pathToRemove."
$VS_UNINSTALL_ARGS = @("uninstall", "--installPath", "`"$pathToRemove`"", "--quiet","--wait")
$process = Start-Process "${PWD}\vs_installer.exe" -ArgumentList $VS_UNINSTALL_ARGS -NoNewWindow -Wait -PassThru
$exitCode = $process.ExitCode
if (($exitCode -ne 0) -and ($exitCode -ne 3010)) {
echo "Original BuildTools uninstall failed with code $exitCode"
exit 1
}
echo "Other versioned BuildTools uninstalled."
}
echo "Installing Visual Studio version ${env:VS_VERSION}."
$process = Start-Process "${PWD}\vs_installer.exe" -ArgumentList $VS_INSTALL_ARGS -NoNewWindow -Wait -PassThru
Remove-Item -Path vs_installer.exe -Force
$exitCode = $process.ExitCode
if (($exitCode -ne 0) -and ($exitCode -ne 3010)) {
echo "VS 2019 installer exited with code $exitCode, which should be one of [0, 3010]."
curl.exe --retry 3 -kL $COLLECT_DOWNLOAD_LINK --output Collect.exe
if ($LASTEXITCODE -ne 0) {
echo "Download of the VS Collect tool failed."
exit 1
}
Start-Process "${PWD}\Collect.exe" -NoNewWindow -Wait -PassThru
New-Item -Path "C:\w\build-results" -ItemType "directory" -Force
Copy-Item -Path "${env:TEMP}\vslogs.zip" -Destination "C:\w\build-results\"
exit 1
}

View File

@ -0,0 +1,5 @@
$CMATH_DOWNLOAD_LINK = "https://raw.githubusercontent.com/microsoft/STL/12c684bba78f9b032050526abdebf14f58ca26a3/stl/inc/cmath"
$VC14_28_INSTALL_PATH="C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.28.29910\include"
curl.exe --retry 3 -kL $CMATH_DOWNLOAD_LINK --output "$home\cmath"
Move-Item -Path "$home\cmath" -Destination "$VC14_28_INSTALL_PATH" -Force

View File

@ -0,0 +1,70 @@
#!/bin/bash
set -eux -o pipefail
case ${CUDA_VERSION} in
10.2)
cuda_installer_name="cuda_10.2.89_441.22_win10"
cuda_install_packages="nvcc_10.2 cuobjdump_10.2 nvprune_10.2 cupti_10.2 cublas_10.2 cublas_dev_10.2 cudart_10.2 cufft_10.2 cufft_dev_10.2 curand_10.2 curand_dev_10.2 cusolver_10.2 cusolver_dev_10.2 cusparse_10.2 cusparse_dev_10.2 nvgraph_10.2 nvgraph_dev_10.2 npp_10.2 npp_dev_10.2 nvrtc_10.2 nvrtc_dev_10.2 nvml_dev_10.2"
;;
11.3)
cuda_installer_name="cuda_11.3.0_465.89_win10"
cuda_install_packages="thrust_11.3 nvcc_11.3 cuobjdump_11.3 nvprune_11.3 nvprof_11.3 cupti_11.3 cublas_11.3 cublas_dev_11.3 cudart_11.3 cufft_11.3 cufft_dev_11.3 curand_11.3 curand_dev_11.3 cusolver_11.3 cusolver_dev_11.3 cusparse_11.3 cusparse_dev_11.3 npp_11.3 npp_dev_11.3 nvrtc_11.3 nvrtc_dev_11.3 nvml_dev_11.3"
;;
11.6)
cuda_installer_name="cuda_11.6.0_511.23_windows"
cuda_install_packages="thrust_11.6 nvcc_11.6 cuobjdump_11.6 nvprune_11.6 nvprof_11.6 cupti_11.6 cublas_11.6 cublas_dev_11.6 cudart_11.6 cufft_11.6 cufft_dev_11.6 curand_11.6 curand_dev_11.6 cusolver_11.6 cusolver_dev_11.6 cusparse_11.6 cusparse_dev_11.6 npp_11.6 npp_dev_11.6 nvrtc_11.6 nvrtc_dev_11.6 nvml_dev_11.6"
;;
*)
echo "CUDA_VERSION $CUDA_VERSION is not supported yet"
exit 1
;;
esac
if [[ -f "/c/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v${CUDA_VERSION}/bin/nvcc.exe" ]]; then
echo "Existing CUDA v${CUDA_VERSION} installation found, skipping install"
else
tmp_dir=$(mktemp -d)
(
# no need to popd after, the subshell shouldn't affect the parent shell
pushd "${tmp_dir}"
cuda_installer_link="https://ossci-windows.s3.amazonaws.com/${cuda_installer_name}.exe"
curl --retry 3 -kLO $cuda_installer_link
7z x ${cuda_installer_name}.exe -o${cuda_installer_name}
pushd ${cuda_installer_name}
mkdir cuda_install_logs
set +e
# This breaks for some reason if you quote cuda_install_packages
# shellcheck disable=SC2086
./setup.exe -s ${cuda_install_packages} -loglevel:6 -log:"$(pwd -W)/cuda_install_logs"
set -e
if [[ ! -f "/c/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v${CUDA_VERSION}/bin/nvcc.exe" ]]; then
echo "CUDA installation failed"
mkdir -p /c/w/build-results
7z a "c:\\w\\build-results\\cuda_install_logs.7z" cuda_install_logs
exit 1
fi
)
rm -rf "${tmp_dir}"
fi
if [[ -f "/c/Program Files/NVIDIA Corporation/NvToolsExt/bin/x64/nvToolsExt64_1.dll" ]]; then
echo "Existing nvtools installation found, skipping install"
else
# create tmp dir for download
tmp_dir=$(mktemp -d)
(
# no need to popd after, the subshell shouldn't affect the parent shell
pushd "${tmp_dir}"
curl --retry 3 -kLO https://ossci-windows.s3.amazonaws.com/NvToolsExt.7z
7z x NvToolsExt.7z -oNvToolsExt
mkdir -p "C:/Program Files/NVIDIA Corporation/NvToolsExt"
cp -r NvToolsExt/* "C:/Program Files/NVIDIA Corporation/NvToolsExt/"
)
rm -rf "${tmp_dir}"
fi

View File

@ -0,0 +1,48 @@
#!/bin/bash
set -eux -o pipefail
windows_s3_link="https://ossci-windows.s3.amazonaws.com"
case ${CUDA_VERSION} in
10.2)
cudnn_file_name="cudnn-${CUDA_VERSION}-windows10-x64-v7.6.5.32"
;;
11.3)
# Use cudnn8.3 with hard-coded cuda11.3 version
cudnn_file_name="cudnn-windows-x86_64-8.3.2.44_cuda11.5-archive"
;;
11.6)
# Use cudnn8.3 with hard-coded cuda11.5 version
cudnn_file_name="cudnn-windows-x86_64-8.3.2.44_cuda11.5-archive"
;;
*)
echo "CUDA_VERSION: ${CUDA_VERSION} not supported yet"
exit 1
;;
esac
cudnn_installer_name="cudnn_installer.zip"
cudnn_installer_link="${windows_s3_link}/${cudnn_file_name}.zip"
cudnn_install_folder="C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v${CUDA_VERSION}/"
if [[ -f "${cudnn_install_folder}/include/cudnn.h" ]]; then
echo "Existing cudnn installation found, skipping install..."
else
tmp_dir=$(mktemp -d)
(
pushd "${tmp_dir}"
curl --retry 3 -o "${cudnn_installer_name}" "$cudnn_installer_link"
7z x "${cudnn_installer_name}" -ocudnn
# Use '${var:?}/*' to avoid potentially expanding to '/*'
# Remove all of the directories before attempting to copy files
rm -rf "${cudnn_install_folder:?}/*"
cp -rf cudnn/cuda/* "${cudnn_install_folder}"
#Make sure windows path contains zlib dll
curl -k -L "${windows_s3_link}/zlib123dllx64.zip" --output "${tmp_dir}\zlib123dllx64.zip"
7z x "${tmp_dir}\zlib123dllx64.zip" -o"${tmp_dir}\zlib"
xcopy /Y "${tmp_dir}\zlib\dll_x64\*.dll" "C:\Windows\System32"
)
rm -rf "${tmp_dir}"
fi

View File

@ -6,7 +6,7 @@ commands:
- run:
name: "Calculate docker image hash"
command: |
DOCKER_TAG=$(git rev-parse HEAD:.ci/docker)
DOCKER_TAG=$(git rev-parse HEAD:.circleci/docker)
echo "DOCKER_TAG=${DOCKER_TAG}" >> "${BASH_ENV}"
designate_upload_channel:
@ -132,3 +132,43 @@ commands:
else
echo "This is not a pull request, skipping..."
fi
upload_binary_size_for_android_build:
description: "Upload binary size data for Android build"
parameters:
build_type:
type: string
default: ""
artifacts:
type: string
default: ""
steps:
- run:
name: "Binary Size - Install Dependencies"
no_output_timeout: "5m"
command: |
retry () {
$* || (sleep 1 && $*) || (sleep 2 && $*) || (sleep 4 && $*) || (sleep 8 && $*)
}
retry pip3 install requests
- run:
name: "Binary Size - Untar Artifacts"
no_output_timeout: "5m"
command: |
# The artifact file is created inside docker container, which contains the result binaries.
# Now unpackage it into the project folder. The subsequent script will scan project folder
# to locate result binaries and report their sizes.
# If artifact file is not provided it assumes that the project folder has been mounted in
# the docker during build and already contains the result binaries, so this step can be skipped.
export ARTIFACTS="<< parameters.artifacts >>"
if [ -n "${ARTIFACTS}" ]; then
tar xf "${ARTIFACTS}" -C ~/project
fi
- run:
name: "Binary Size - Upload << parameters.build_type >>"
no_output_timeout: "5m"
command: |
cd ~/project
export ANDROID_BUILD_TYPE="<< parameters.build_type >>"
export COMMIT_TIME=$(git log --max-count=1 --format=%ct || echo 0)
python3 -m tools.stats.upload_binary_size_to_scuba android

View File

@ -1,4 +1,243 @@
jobs:
binary_linux_build:
<<: *binary_linux_build_params
steps:
- checkout
- calculate_docker_image_tag
- run:
<<: *binary_checkout
- run:
<<: *binary_populate_env
- run:
name: Build
no_output_timeout: "1h"
command: |
source "/pytorch/.circleci/scripts/binary_linux_build.sh"
# Preserve build log
if [ -f /pytorch/build/.ninja_log ]; then
cp /pytorch/build/.ninja_log /final_pkgs
fi
- run:
name: Output binary sizes
no_output_timeout: "1m"
command: |
ls -lah /final_pkgs
- run:
name: upload build & binary data
no_output_timeout: "5m"
command: |
source /env
cd /pytorch && export COMMIT_TIME=$(git log --max-count=1 --format=%ct || echo 0)
python3 -mpip install requests && \
SCRIBE_GRAPHQL_ACCESS_TOKEN=${SCRIBE_GRAPHQL_ACCESS_TOKEN} \
python3 -m tools.stats.upload_binary_size_to_scuba || exit 0
- persist_to_workspace:
root: /
paths: final_pkgs
- store_artifacts:
path: /final_pkgs
# This should really just be another step of the binary_linux_build job above.
# This isn't possible right now b/c the build job uses the docker executor
# (otherwise they'd be really really slow) but this one uses the macine
# executor (b/c we have to run the docker with --runtime=nvidia and we can't do
# that on the docker executor)
binary_linux_test:
<<: *binary_linux_test_upload_params
machine:
image: ubuntu-2004:202104-01
steps:
# See Note [Workspace for CircleCI scripts] in job-specs-setup.yml
- checkout
- attach_workspace:
at: /home/circleci/project
- setup_linux_system_environment
- setup_ci_environment
- run:
<<: *binary_checkout
- run:
<<: *binary_populate_env
- run:
name: Prepare test code
no_output_timeout: "1h"
command: .circleci/scripts/binary_linux_test.sh
- run:
<<: *binary_run_in_docker
binary_upload:
parameters:
package_type:
type: string
description: "What type of package we are uploading (eg. wheel, libtorch, conda)"
default: "wheel"
upload_subfolder:
type: string
description: "What subfolder to put our package into (eg. cpu, cudaX.Y, etc.)"
default: "cpu"
docker:
- image: continuumio/miniconda3
environment:
- DRY_RUN: disabled
- PACKAGE_TYPE: "<< parameters.package_type >>"
- UPLOAD_SUBFOLDER: "<< parameters.upload_subfolder >>"
steps:
- attach_workspace:
at: /tmp/workspace
- checkout
- designate_upload_channel
- run:
name: Install dependencies
no_output_timeout: "1h"
command: |
conda install -yq anaconda-client
pip install -q awscli
- run:
name: Do upload
no_output_timeout: "1h"
command: |
AWS_ACCESS_KEY_ID="${PYTORCH_BINARY_AWS_ACCESS_KEY_ID}" \
AWS_SECRET_ACCESS_KEY="${PYTORCH_BINARY_AWS_SECRET_ACCESS_KEY}" \
ANACONDA_API_TOKEN="${CONDA_PYTORCHBOT_TOKEN}" \
.circleci/scripts/binary_upload.sh
# Nighlty build smoke tests defaults
# These are the second-round smoke tests. These make sure that the binaries are
# correct from a user perspective, testing that they exist from the cloud are
# are runnable. Note that the pytorch repo is never cloned into these jobs
##############################################################################
smoke_linux_test:
<<: *binary_linux_test_upload_params
machine:
image: ubuntu-2004:202104-01
steps:
- checkout
- calculate_docker_image_tag
- setup_linux_system_environment
- setup_ci_environment
- run:
<<: *binary_checkout
- run:
<<: *binary_populate_env
- run:
name: Test
no_output_timeout: "1h"
command: |
set -ex
cat >/home/circleci/project/ci_test_script.sh \<<EOL
# The following code will be executed inside Docker container
set -eux -o pipefail
/builder/smoke_test.sh
# The above code will be executed inside Docker container
EOL
- run:
<<: *binary_run_in_docker
smoke_mac_test:
<<: *binary_linux_test_upload_params
macos:
xcode: "12.0"
steps:
- checkout
- run:
<<: *binary_checkout
- run:
<<: *binary_populate_env
- brew_update
- run:
<<: *binary_install_miniconda
- run:
name: Build
no_output_timeout: "1h"
command: |
set -ex
source "/Users/distiller/project/env"
export "PATH=$workdir/miniconda/bin:$PATH"
# TODO unbuffer and ts this, but it breaks cause miniconda overwrites
# tclsh. But unbuffer and ts aren't that important so they're just
# disabled for now
./builder/smoke_test.sh
binary_mac_build:
<<: *binary_mac_params
macos:
xcode: "12.0"
resource_class: "large"
steps:
# See Note [Workspace for CircleCI scripts] in job-specs-setup.yml
- checkout
- run:
<<: *binary_checkout
- run:
<<: *binary_populate_env
- brew_update
- run:
<<: *binary_install_miniconda
- run:
name: Build
no_output_timeout: "90m"
command: |
# Do not set -u here; there is some problem with CircleCI
# variable expansion with PROMPT_COMMAND
set -ex -o pipefail
script="/Users/distiller/project/pytorch/.circleci/scripts/binary_macos_build.sh"
cat "$script"
source "$script"
- run:
name: Test
no_output_timeout: "1h"
command: |
# Do not set -u here; there is some problem with CircleCI
# variable expansion with PROMPT_COMMAND
set -ex -o pipefail
script="/Users/distiller/project/pytorch/.circleci/scripts/binary_macos_test.sh"
cat "$script"
source "$script"
- persist_to_workspace:
root: /Users/distiller/project
paths: final_pkgs
- store_artifacts:
path: /Users/distiller/project/final_pkgs
binary_macos_arm64_build:
<<: *binary_mac_params
macos:
xcode: "12.3.0"
steps:
# See Note [Workspace for CircleCI scripts] in job-specs-setup.yml
- checkout
- run:
<<: *binary_checkout
- run:
<<: *binary_populate_env
- brew_update
- run:
<<: *binary_install_miniconda
- run:
name: Build
no_output_timeout: "90m"
command: |
# Do not set -u here; there is some problem with CircleCI
# variable expansion with PROMPT_COMMAND
set -ex -o pipefail
export CROSS_COMPILE_ARM64=1
script="/Users/distiller/project/pytorch/.circleci/scripts/binary_macos_build.sh"
cat "$script"
source "$script"
- persist_to_workspace:
root: /Users/distiller/project
paths: final_pkgs
- store_artifacts:
path: /Users/distiller/project/final_pkgs
binary_ios_build:
<<: *pytorch_ios_params
macos:
@ -43,6 +282,90 @@ jobs:
cat "$script"
source "$script"
binary_windows_build:
<<: *binary_windows_params
parameters:
build_environment:
type: string
default: ""
executor:
type: string
default: "windows-xlarge-cpu-with-nvidia-cuda"
executor: <<parameters.executor>>
steps:
# See Note [Workspace for CircleCI scripts] in job-specs-setup.yml
- checkout
- run:
<<: *binary_checkout
- run:
<<: *binary_populate_env
- run:
name: Build
no_output_timeout: "1h"
command: |
set -eux -o pipefail
script="/c/w/p/.circleci/scripts/binary_windows_build.sh"
cat "$script"
source "$script"
- persist_to_workspace:
root: "C:/w"
paths: final_pkgs
- store_artifacts:
path: C:/w/final_pkgs
binary_windows_test:
<<: *binary_windows_params
parameters:
build_environment:
type: string
default: ""
executor:
type: string
default: "windows-medium-cpu-with-nvidia-cuda"
executor: <<parameters.executor>>
steps:
- checkout
- attach_workspace:
at: c:/users/circleci/project
- run:
<<: *binary_checkout
- run:
<<: *binary_populate_env
- run:
name: Test
no_output_timeout: "1h"
command: |
set -eux -o pipefail
script="/c/w/p/.circleci/scripts/binary_windows_test.sh"
cat "$script"
source "$script"
smoke_windows_test:
<<: *binary_windows_params
parameters:
build_environment:
type: string
default: ""
executor:
type: string
default: "windows-medium-cpu-with-nvidia-cuda"
executor: <<parameters.executor>>
steps:
- checkout
- run:
<<: *binary_checkout
- run:
<<: *binary_populate_env
- run:
name: Test
no_output_timeout: "1h"
command: |
set -eux -o pipefail
export TEST_NIGHTLY_PACKAGE=1
script="/c/w/p/.circleci/scripts/binary_windows_test.sh"
cat "$script"
source "$script"
anaconda_prune:
parameters:
packages:

View File

@ -33,12 +33,12 @@
exit 0
fi
# Covers the case where a previous tag doesn't exist for the tree
# this is only really applicable on trees that don't have `.ci/docker` at its merge base, i.e. nightly
if ! git rev-parse "$(git merge-base HEAD << pipeline.git.base_revision >>):.ci/docker"; then
echo "Directory '.ci/docker' not found in tree << pipeline.git.base_revision >>, you should probably rebase onto a more recent commit"
# this is only really applicable on trees that don't have `.circleci/docker` at its merge base, i.e. nightly
if ! git rev-parse "$(git merge-base HEAD << pipeline.git.base_revision >>):.circleci/docker"; then
echo "Directory '.circleci/docker' not found in tree << pipeline.git.base_revision >>, you should probably rebase onto a more recent commit"
exit 1
fi
PREVIOUS_DOCKER_TAG=$(git rev-parse "$(git merge-base HEAD << pipeline.git.base_revision >>):ci/docker")
PREVIOUS_DOCKER_TAG=$(git rev-parse "$(git merge-base HEAD << pipeline.git.base_revision >>):.circleci/docker")
# If no image exists but the hash is the same as the previous hash then we should error out here
if [[ "${PREVIOUS_DOCKER_TAG}" = "${DOCKER_TAG}" ]]; then
echo "ERROR: Something has gone wrong and the previous image isn't available for the merge-base of your branch"
@ -53,4 +53,4 @@
export AWS_ACCESS_KEY_ID=${CIRCLECI_AWS_ACCESS_KEY_FOR_DOCKER_BUILDER_V1}
export AWS_SECRET_ACCESS_KEY=${CIRCLECI_AWS_SECRET_KEY_FOR_DOCKER_BUILDER_V1}
set -x
cd .ci/docker && ./build_docker.sh
cd .circleci/docker && ./build_docker.sh

View File

@ -24,6 +24,95 @@
pushd /tmp/workspace
git push -u origin "<< parameters.branch >>"
pytorch_python_doc_build:
environment:
BUILD_ENVIRONMENT: pytorch-python-doc-push
DOCKER_IMAGE: "308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-xenial-py3.7-gcc5.4"
resource_class: large
machine:
image: ubuntu-2004:202104-01
steps:
- checkout
- calculate_docker_image_tag
- setup_linux_system_environment
- setup_ci_environment
- run:
name: Doc Build and Push
no_output_timeout: "1h"
command: |
set -ex
export COMMIT_DOCKER_IMAGE=${DOCKER_IMAGE}:build-${DOCKER_TAG}-${CIRCLE_SHA1}
echo "DOCKER_IMAGE: "${COMMIT_DOCKER_IMAGE}
# turn v1.12.0rc3 into 1.12
tag=$(echo $CIRCLE_TAG | sed -e 's/v*\([0-9]*\.[0-9]*\).*/\1/')
target=${tag:-main}
echo "building for ${target}"
time docker pull ${COMMIT_DOCKER_IMAGE} >/dev/null
export id=$(docker run --env-file "${BASH_ENV}" --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -t -d -w /var/lib/jenkins ${COMMIT_DOCKER_IMAGE})
export COMMAND='((echo "sudo chown -R jenkins workspace && cd workspace && '"export CIRCLE_SHA1='$CIRCLE_SHA1'"' && . ./.circleci/scripts/python_doc_push_script.sh docs/'$target' '$target' site") | docker exec -u jenkins -i "$id" bash) 2>&1'
echo ${COMMAND} > ./command.sh && unbuffer bash ./command.sh | ts
mkdir -p ~/workspace/build_artifacts
docker cp $id:/var/lib/jenkins/workspace/pytorch.github.io/docs/main ~/workspace/build_artifacts
docker cp $id:/var/lib/jenkins/workspace/pytorch.github.io /tmp/workspace
# Save the docs build so we can debug any problems
export DEBUG_COMMIT_DOCKER_IMAGE=${COMMIT_DOCKER_IMAGE}-debug
docker commit "$id" ${DEBUG_COMMIT_DOCKER_IMAGE}
time docker push ${DEBUG_COMMIT_DOCKER_IMAGE}
- persist_to_workspace:
root: /tmp/workspace
paths:
- .
- store_artifacts:
path: ~/workspace/build_artifacts/main
destination: docs
pytorch_cpp_doc_build:
environment:
BUILD_ENVIRONMENT: pytorch-cpp-doc-push
DOCKER_IMAGE: "308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-xenial-py3.7-gcc5.4"
resource_class: large
machine:
image: ubuntu-2004:202104-01
steps:
- checkout
- calculate_docker_image_tag
- setup_linux_system_environment
- setup_ci_environment
- run:
name: Doc Build and Push
no_output_timeout: "1h"
command: |
set -ex
export COMMIT_DOCKER_IMAGE=${DOCKER_IMAGE}:build-${DOCKER_TAG}-${CIRCLE_SHA1}
echo "DOCKER_IMAGE: "${COMMIT_DOCKER_IMAGE}
# turn v1.12.0rc3 into 1.12
tag=$(echo $CIRCLE_TAG | sed -e 's/v*\([0-9]*\.[0-9]*\).*/\1/')
target=${tag:-main}
echo "building for ${target}"
time docker pull ${COMMIT_DOCKER_IMAGE} >/dev/null
export id=$(docker run --env-file "${BASH_ENV}" --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -t -d -w /var/lib/jenkins ${COMMIT_DOCKER_IMAGE})
export COMMAND='((echo "sudo chown -R jenkins workspace && cd workspace && '"export CIRCLE_SHA1='$CIRCLE_SHA1'"' && . ./.circleci/scripts/cpp_doc_push_script.sh docs/"$target" main") | docker exec -u jenkins -i "$id" bash) 2>&1'
echo ${COMMAND} > ./command.sh && unbuffer bash ./command.sh | ts
mkdir -p ~/workspace/build_artifacts
docker cp $id:/var/lib/jenkins/workspace/cppdocs/ /tmp/workspace
# Save the docs build so we can debug any problems
export DEBUG_COMMIT_DOCKER_IMAGE=${COMMIT_DOCKER_IMAGE}-debug
docker commit "$id" ${DEBUG_COMMIT_DOCKER_IMAGE}
time docker push ${DEBUG_COMMIT_DOCKER_IMAGE}
- persist_to_workspace:
root: /tmp/workspace
paths:
- .
pytorch_macos_10_15_py3_build:
environment:
BUILD_ENVIRONMENT: pytorch-macos-10.15-py3-arm64-build
@ -37,6 +126,7 @@
no_output_timeout: "1h"
command: |
set -e
export IN_CI=1
export CROSS_COMPILE_ARM64=1
export JOB_BASE_NAME=$CIRCLE_JOB
@ -51,8 +141,8 @@
export AWS_SECRET_ACCESS_KEY=${CIRCLECI_AWS_SECRET_KEY_FOR_SCCACHE_S3_BUCKET_V4}
set -x
chmod a+x .ci/pytorch/macos-build.sh
unbuffer .ci/pytorch/macos-build.sh 2>&1 | ts
chmod a+x .jenkins/pytorch/macos-build.sh
unbuffer .jenkins/pytorch/macos-build.sh 2>&1 | ts
- persist_to_workspace:
root: /Users/distiller/workspace/
@ -74,6 +164,7 @@
no_output_timeout: "1h"
command: |
set -e
export IN_CI=1
export JOB_BASE_NAME=$CIRCLE_JOB
# Install sccache
@ -87,206 +178,14 @@
export AWS_SECRET_ACCESS_KEY=${CIRCLECI_AWS_SECRET_KEY_FOR_SCCACHE_S3_BUCKET_V4}
set -x
chmod a+x .ci/pytorch/macos-build.sh
unbuffer .ci/pytorch/macos-build.sh 2>&1 | ts
chmod a+x .jenkins/pytorch/macos-build.sh
unbuffer .jenkins/pytorch/macos-build.sh 2>&1 | ts
- persist_to_workspace:
root: /Users/distiller/workspace/
paths:
- miniconda3
mac_build:
parameters:
build-environment:
type: string
description: Top-level label for what's being built/tested.
xcode-version:
type: string
default: "13.3.1"
description: What xcode version to build with.
build-generates-artifacts:
type: boolean
default: true
description: if the build generates build artifacts
python-version:
type: string
default: "3.8"
macos:
xcode: << parameters.xcode-version >>
resource_class: medium
environment:
BUILD_ENVIRONMENT: << parameters.build-environment >>
AWS_REGION: us-east-1
steps:
- checkout
- run_brew_for_macos_build
- run:
name: Install sccache
command: |
sudo curl --retry 3 https://s3.amazonaws.com/ossci-macos/sccache_v2.15 --output /usr/local/bin/sccache
sudo chmod +x /usr/local/bin/sccache
echo "export SCCACHE_BUCKET=ossci-compiler-cache-circleci-v2" >> "${BASH_ENV}"
echo "export SCCACHE_S3_KEY_PREFIX=${GITHUB_WORKFLOW}" >> "${BASH_ENV}"
set +x
echo "export AWS_ACCESS_KEY_ID=${CIRCLECI_AWS_ACCESS_KEY_FOR_SCCACHE_S3_BUCKET_V4}" >> "${BASH_ENV}"
echo "export AWS_SECRET_ACCESS_KEY=${CIRCLECI_AWS_SECRET_KEY_FOR_SCCACHE_S3_BUCKET_V4}" >> "${BASH_ENV}"
set -x
- run:
name: Get workflow job id
command: |
echo "export OUR_GITHUB_JOB_ID=${CIRCLE_WORKFLOW_JOB_ID}" >> "${BASH_ENV}"
- run:
name: Build
command: |
set -x
git submodule sync
git submodule update --init --recursive --depth 1 --jobs 0
export PATH="/usr/local/bin:$PATH"
export WORKSPACE_DIR="${HOME}/workspace"
mkdir -p "${WORKSPACE_DIR}"
MINICONDA_URL="https://repo.anaconda.com/miniconda/Miniconda3-py38_4.12.0-MacOSX-x86_64.sh"
if [ << parameters.python-version >> == 3.9.12 ]; then
MINICONDA_URL="https://repo.anaconda.com/miniconda/Miniconda3-py39_4.12.0-MacOSX-x86_64.sh"
fi
# If a local installation of conda doesn't exist, we download and install conda
if [ ! -d "${WORKSPACE_DIR}/miniconda3" ]; then
mkdir -p "${WORKSPACE_DIR}"
curl --retry 3 ${MINICONDA_URL} -o "${WORKSPACE_DIR}"/miniconda3.sh
bash "${WORKSPACE_DIR}"/miniconda3.sh -b -p "${WORKSPACE_DIR}"/miniconda3
fi
export PATH="${WORKSPACE_DIR}/miniconda3/bin:$PATH"
# shellcheck disable=SC1091
source "${WORKSPACE_DIR}"/miniconda3/bin/activate
brew link --force libomp
echo "export CMAKE_PREFIX_PATH=${CONDA_PREFIX:-"$(dirname "$(which conda)")/../"}" >> "${BASH_ENV}"
.ci/pytorch/macos-build.sh
- when:
condition: << parameters.build-generates-artifacts >>
steps:
- run:
name: Archive artifacts into zip
command: |
zip -1 -r artifacts.zip dist/ build/.ninja_log build/compile_commands.json .pytorch-test-times.json
cp artifacts.zip /Users/distiller/workspace
- persist_to_workspace:
root: /Users/distiller/workspace/
paths:
- miniconda3
- artifacts.zip
- store_artifacts:
path: /Users/distiller/project/artifacts.zip
mac_test:
parameters:
build-environment:
type: string
shard-number:
type: string
num-test-shards:
type: string
xcode-version:
type: string
test-config:
type: string
default: 'default'
macos:
xcode: << parameters.xcode-version >>
environment:
GIT_DEFAULT_BRANCH: 'master'
BUILD_ENVIRONMENT: << parameters.build-environment >>
TEST_CONFIG: << parameters.test-config >>
SHARD_NUMBER: << parameters.shard-number >>
NUM_TEST_SHARDS: << parameters.num-test-shards >>
PYTORCH_RETRY_TEST_CASES: 1
PYTORCH_OVERRIDE_FLAKY_SIGNAL: 1
steps:
- checkout
- attach_workspace:
at: ~/workspace
- run_brew_for_macos_build
- run:
name: Test
no_output_timeout: "2h"
command: |
set -x
git submodule sync --recursive
git submodule update --init --recursive
mv ~/workspace/artifacts.zip .
unzip artifacts.zip
export IN_CI=1
COMMIT_MESSAGES=$(git cherry -v "origin/${GIT_DEFAULT_BRANCH:-master}")
export PATH="/usr/local/bin:$PATH"
export WORKSPACE_DIR="${HOME}/workspace"
mkdir -p "${WORKSPACE_DIR}"
export PATH="${WORKSPACE_DIR}/miniconda3/bin:$PATH"
source "${WORKSPACE_DIR}"/miniconda3/bin/activate
# sanitize the input commit message and PR body here:
# trim all new lines from commit messages to avoid issues with batch environment
# variable copying. see https://github.com/pytorch/pytorch/pull/80043#issuecomment-1167796028
COMMIT_MESSAGES="${COMMIT_MESSAGES//[$'\n\r']}"
# then trim all special characters like single and double quotes to avoid unescaped inputs to
# wreak havoc internally
export COMMIT_MESSAGES="${COMMIT_MESSAGES//[\'\"]}"
python3 -mpip install dist/*.whl
.ci/pytorch/macos-test.sh
- run:
name: Copy files for uploading test stats
command: |
# copy into a parent folder test-reports because we can't use CIRCLEI_BUILD_NUM in path when persisting to workspace
mkdir -p test-reports/test-reports_${CIRCLE_BUILD_NUM}/test/test-reports
cp -r test/test-reports test-reports/test-reports_${CIRCLE_BUILD_NUM}/test/test-reports
- store_test_results:
path: test/test-reports
- persist_to_workspace:
root: /Users/distiller/project/
paths:
- test-reports
upload_test_stats:
machine: # executor type
image: ubuntu-2004:202010-01 # # recommended linux image - includes Ubuntu 20.04, docker 19.03.13, docker-compose 1.27.4
steps:
- checkout
- attach_workspace:
at: ~/workspace
- run:
name: upload
command: |
set -ex
if [ -z ${AWS_ACCESS_KEY_FOR_OSSCI_ARTIFACT_UPLOAD} ]; then
echo "No credentials found, cannot upload test stats (are you on a fork?)"
exit 0
fi
cp -r ~/workspace/test-reports/* ~/project
pip3 install requests==2.26 rockset==1.0.3 boto3==1.19.12
export AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_FOR_OSSCI_ARTIFACT_UPLOAD}
export AWS_SECRET_ACCESS_KEY=${AWS_SECRET_KEY_FOR_OSSCI_ARTIFACT_UPLOAD}
# i dont know how to get the run attempt number for reruns so default to 1
python3 -m tools.stats.upload_test_stats --workflow-run-id "${CIRCLE_WORKFLOW_JOB_ID}" --workflow-run-attempt 1 --head-branch << pipeline.git.branch >> --circleci
pytorch_macos_10_13_py3_test:
environment:
BUILD_ENVIRONMENT: pytorch-macos-10.13-py3-test
@ -302,10 +201,27 @@
no_output_timeout: "1h"
command: |
set -e
export IN_CI=1
export JOB_BASE_NAME=$CIRCLE_JOB
chmod a+x .ci/pytorch/macos-test.sh
unbuffer .ci/pytorch/macos-test.sh 2>&1 | ts
chmod a+x .jenkins/pytorch/macos-test.sh
unbuffer .jenkins/pytorch/macos-test.sh 2>&1 | ts
- run:
name: Report results
no_output_timeout: "5m"
command: |
set -ex
source /Users/distiller/workspace/miniconda3/bin/activate
python3 -m pip install boto3==1.19.12
export IN_CI=1
export JOB_BASE_NAME=$CIRCLE_JOB
# Using the same IAM user to write stats to our OSS bucket
export AWS_ACCESS_KEY_ID=${CIRCLECI_AWS_ACCESS_KEY_FOR_SCCACHE_S3_BUCKET_V4}
export AWS_SECRET_ACCESS_KEY=${CIRCLECI_AWS_SECRET_KEY_FOR_SCCACHE_S3_BUCKET_V4}
python -m tools.stats.print_test_stats --upload-to-s3 --compare-with-s3 test
when: always
- store_test_results:
path: test/test-reports
@ -324,10 +240,11 @@
no_output_timeout: "1h"
command: |
set -e
export IN_CI=1
export BUILD_LITE_INTERPRETER=1
export JOB_BASE_NAME=$CIRCLE_JOB
chmod a+x ${HOME}/project/.ci/pytorch/macos-lite-interpreter-build-test.sh
unbuffer ${HOME}/project/.ci/pytorch/macos-lite-interpreter-build-test.sh 2>&1 | ts
chmod a+x ${HOME}/project/.jenkins/pytorch/macos-lite-interpreter-build-test.sh
unbuffer ${HOME}/project/.jenkins/pytorch/macos-lite-interpreter-build-test.sh 2>&1 | ts
- store_test_results:
path: test/test-reports
@ -413,6 +330,9 @@
output_image=$docker_image_libtorch_android_x86_32-gradle
docker commit "$id_x86_32" ${output_image}
time docker push ${output_image}
- upload_binary_size_for_android_build:
build_type: prebuilt
artifacts: /home/circleci/workspace/build_android_artifacts/artifacts.tgz
- store_artifacts:
path: ~/workspace/build_android_artifacts/artifacts.tgz
destination: artifacts.tgz
@ -488,6 +408,9 @@
output_image=${docker_image_libtorch_android_x86_32}-gradle
docker commit "$id" ${output_image}
time docker push ${output_image}
- upload_binary_size_for_android_build:
build_type: prebuilt-single
artifacts: /home/circleci/workspace/build_android_x86_32_artifacts/artifacts.tgz
- store_artifacts:
path: ~/workspace/build_android_x86_32_artifacts/artifacts.tgz
destination: artifacts.tgz
@ -497,43 +420,10 @@
macos:
xcode: "12.5.1"
steps:
- run:
name: checkout with retry
command: |
checkout() {
set -ex
# Workaround old docker images with incorrect $HOME
# check https://github.com/docker/docker/issues/2968 for details
if [ "${HOME}" = "/" ]
then
export HOME=$(getent passwd $(id -un) | cut -d: -f6)
fi
mkdir -p ~/.ssh
echo 'github.com ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAq2A7hRGmdnm9tUDbO9IDSwBK6TbQa+PXYPCPy6rbTrTtw7PHkccKrpp0yVhp5HdEIcKr6pLlVDBfOLX9QUsyCOV0wzfjIJNlGEYsdlLJizHhbn2mUjvSAHQqZETYP81eFzLQNnPHt4EVVUh7VfDESU84KezmD5QlWpXLmvU31/yMf+Se8xhHTvKSCZIFImWwoG6mbUoWf9nzpIoaSjB+weqqUUmpaaasXVal72J+UX2B+2RPW3RcT0eOzQgqlJL3RKrTJvdsjE3JEAvGq3lGHSZXy28G3skua2SmVi/w4yCE6gbODqnTWlg7+wC604ydGXA8VJiS5ap43JXiUFFAaQ==
' >> ~/.ssh/known_hosts
# use git+ssh instead of https
git config --global url."ssh://git@github.com".insteadOf "https://github.com" || true
git config --global gc.auto 0 || true
echo 'Cloning git repository'
mkdir -p '/Users/distiller/project'
cd '/Users/distiller/project'
git clone "$CIRCLE_REPOSITORY_URL" .
echo 'Checking out branch'
git checkout --force -B "$CIRCLE_BRANCH" "$CIRCLE_SHA1"
git --no-pager log --no-color -n 1 --format='HEAD is now at %h %s'
}
retry () {
$* || (sleep 1 && $*) || (sleep 2 && $*) || (sleep 4 && $*) || (sleep 8 && $*)
}
retry checkout
- checkout
- run_brew_for_ios_build
- run:
name: Setup Fastlane
name: Run Fastlane
no_output_timeout: "1h"
command: |
set -e
@ -541,17 +431,32 @@
cd ${PROJ_ROOT}/ios/TestApp
# install fastlane
sudo gem install bundler && bundle install
# install certificates
echo ${IOS_CERT_KEY_2022} >> cert.txt
base64 --decode cert.txt -o Certificates.p12
rm cert.txt
bundle exec fastlane install_root_cert
bundle exec fastlane install_dev_cert
# install the provisioning profile
PROFILE=PyTorch_CI_2022.mobileprovision
PROVISIONING_PROFILES=~/Library/MobileDevice/Provisioning\ Profiles
mkdir -pv "${PROVISIONING_PROFILES}"
cd "${PROVISIONING_PROFILES}"
echo ${IOS_SIGN_KEY_2022} >> cert.txt
base64 --decode cert.txt -o ${PROFILE}
rm cert.txt
- run:
name: Build
no_output_timeout: "1h"
command: |
set -e
export IN_CI=1
WORKSPACE=/Users/distiller/workspace
PROJ_ROOT=/Users/distiller/project
export TCLLIBPATH="/usr/local/lib"
# Install conda
curl --retry 3 -o ~/conda.sh https://repo.anaconda.com/miniconda/Miniconda3-py39_4.12.0-MacOSX-x86_64.sh
curl --retry 3 -o ~/conda.sh https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh
chmod +x ~/conda.sh
/bin/bash ~/conda.sh -b -p ~/anaconda
export PATH="~/anaconda/bin:${PATH}"
@ -562,7 +467,7 @@
$* || (sleep 1 && $*) || (sleep 2 && $*) || (sleep 4 && $*) || (sleep 8 && $*)
}
retry conda install numpy ninja pyyaml mkl mkl-include setuptools cmake requests typing-extensions --yes
retry conda install numpy ninja pyyaml mkl mkl-include setuptools cmake cffi requests typing_extensions --yes
# sync submodules
cd ${PROJ_ROOT}
@ -598,12 +503,18 @@
command: |
set -e
PROJ_ROOT=/Users/distiller/project
PROFILE=PyTorch_CI_2022
# run the ruby build script
if ! [ -x "$(command -v xcodebuild)" ]; then
echo 'Error: xcodebuild is not installed.'
exit 1
fi
ruby ${PROJ_ROOT}/scripts/xcode_build.rb -i ${PROJ_ROOT}/build_ios/install -x ${PROJ_ROOT}/ios/TestApp/TestApp.xcodeproj -p ${IOS_PLATFORM}
echo ${IOS_DEV_TEAM_ID}
if [ ${IOS_PLATFORM} != "SIMULATOR" ]; then
ruby ${PROJ_ROOT}/scripts/xcode_build.rb -i ${PROJ_ROOT}/build_ios/install -x ${PROJ_ROOT}/ios/TestApp/TestApp.xcodeproj -p ${IOS_PLATFORM} -c ${PROFILE} -t ${IOS_DEV_TEAM_ID}
else
ruby ${PROJ_ROOT}/scripts/xcode_build.rb -i ${PROJ_ROOT}/build_ios/install -x ${PROJ_ROOT}/ios/TestApp/TestApp.xcodeproj -p ${IOS_PLATFORM}
fi
if ! [ "$?" -eq "0" ]; then
echo 'xcodebuild failed!'
exit 1
@ -626,13 +537,12 @@
cd ${PROJ_ROOT}/ios/TestApp/benchmark
mkdir -p ../models
if [ ${USE_COREML_DELEGATE} == 1 ]; then
pip install coremltools==5.0b5 protobuf==3.20.1
pip install coremltools==5.0b5
pip install six
python coreml_backend.py
else
cd "${PROJ_ROOT}"
python test/mobile/model_test/gen_test_model.py ios-test
python trace_model.py
fi
cd "${PROJ_ROOT}/ios/TestApp/benchmark"
if [ ${BUILD_LITE_INTERPRETER} == 1 ]; then
echo "Setting up the TestApp for LiteInterpreter"
ruby setup.rb --lite 1
@ -640,10 +550,10 @@
echo "Setting up the TestApp for Full JIT"
ruby setup.rb
fi
cd "${PROJ_ROOT}/ios/TestApp"
# instruments -s -devices
if [ "${BUILD_LITE_INTERPRETER}" == 1 ]; then
if [ "${USE_COREML_DELEGATE}" == 1 ]; then
cd ${PROJ_ROOT}/ios/TestApp
instruments -s -devices
if [ ${BUILD_LITE_INTERPRETER} == 1 ]; then
if [ ${USE_COREML_DELEGATE} == 1 ]; then
fastlane scan --only_testing TestAppTests/TestAppTests/testCoreML
else
fastlane scan --only_testing TestAppTests/TestAppTests/testLiteInterpreter
@ -676,7 +586,7 @@
docker cp /home/circleci/project/. $id:/var/lib/jenkins/workspace
export COMMAND='((echo "sudo chown -R jenkins workspace && cd workspace && .ci/pytorch/build.sh") | docker exec -u jenkins -i "$id" bash) 2>&1'
export COMMAND='((echo "sudo chown -R jenkins workspace && cd workspace && .jenkins/pytorch/build.sh") | docker exec -u jenkins -i "$id" bash) 2>&1'
echo ${COMMAND} > ./command.sh && unbuffer bash ./command.sh | ts
@ -722,9 +632,9 @@
trap "retrieve_test_reports" ERR
if [[ ${BUILD_ENVIRONMENT} == *"multigpu"* ]]; then
export COMMAND='((echo "sudo chown -R jenkins workspace && cd workspace && .ci/pytorch/multigpu-test.sh") | docker exec -u jenkins -i "$id" bash) 2>&1'
export COMMAND='((echo "sudo chown -R jenkins workspace && cd workspace && .jenkins/pytorch/multigpu-test.sh") | docker exec -u jenkins -i "$id" bash) 2>&1'
else
export COMMAND='((echo "sudo chown -R jenkins workspace && cd workspace && .ci/pytorch/test.sh") | docker exec -u jenkins -i "$id" bash) 2>&1'
export COMMAND='((echo "sudo chown -R jenkins workspace && cd workspace && .jenkins/pytorch/test.sh") | docker exec -u jenkins -i "$id" bash) 2>&1'
fi
echo ${COMMAND} > ./command.sh && unbuffer bash ./command.sh | ts
@ -745,3 +655,27 @@
set -e
python3 -m pip install requests
python3 ./.circleci/scripts/trigger_azure_pipeline.py
pytorch_doc_test:
environment:
BUILD_ENVIRONMENT: pytorch-doc-test
DOCKER_IMAGE: "308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-xenial-py3.7-gcc5.4"
resource_class: medium
machine:
image: ubuntu-2004:202104-01
steps:
- checkout
- calculate_docker_image_tag
- setup_linux_system_environment
- setup_ci_environment
- run:
name: Doc test
no_output_timeout: "30m"
command: |
set -ex
export COMMIT_DOCKER_IMAGE=${DOCKER_IMAGE}:build-${DOCKER_TAG}-${CIRCLE_SHA1}
echo "DOCKER_IMAGE: "${COMMIT_DOCKER_IMAGE}
time docker pull ${COMMIT_DOCKER_IMAGE} >/dev/null
export id=$(docker run --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -t -d -w /var/lib/jenkins ${COMMIT_DOCKER_IMAGE})
export COMMAND='((echo "sudo chown -R jenkins workspace && cd workspace && . ./.jenkins/pytorch/docs-test.sh") | docker exec -u jenkins -i "$id" bash) 2>&1'
echo ${COMMAND} > ./command.sh && unbuffer bash ./command.sh | ts

View File

@ -3,14 +3,11 @@
InheritParentConfig: true
Checks: '
bugprone-*,
-bugprone-easily-swappable-parameters,
-bugprone-forward-declaration-namespace,
-bugprone-macro-parentheses,
-bugprone-lambda-function-name,
-bugprone-reserved-identifier,
-bugprone-swapped-arguments,
cppcoreguidelines-*,
-cppcoreguidelines-avoid-do-while,
-cppcoreguidelines-avoid-magic-numbers,
-cppcoreguidelines-avoid-non-const-global-variables,
-cppcoreguidelines-interfaces-global-init,
@ -29,11 +26,8 @@ cppcoreguidelines-*,
-facebook-hte-RelativeInclude,
hicpp-exception-baseclass,
hicpp-avoid-goto,
misc-unused-alias-decls,
misc-unused-using-decls,
modernize-*,
-modernize-concat-nested-namespaces,
-modernize-macro-to-enum,
-modernize-return-braced-init-list,
-modernize-use-auto,
-modernize-use-default-member-init,
@ -43,9 +37,9 @@ modernize-*,
performance-*,
-performance-noexcept-move-constructor,
-performance-unnecessary-value-param,
readability-container-size-empty,
'
HeaderFilterRegex: '^(c10/(?!test)|torch/csrc/(?!deploy/interpreter/cpython)).*$'
HeaderFilterRegex: 'torch/csrc/(?!deploy/interpreter/cpython).*'
AnalyzeTemporaryDtors: false
WarningsAsErrors: '*'
CheckOptions:
...

11
.flake8
View File

@ -11,12 +11,8 @@ ignore =
# these ignores are from flake8-bugbear; please fix!
B007,B008,
# these ignores are from flake8-comprehensions; please fix!
C407
per-file-ignores =
__init__.py: F401
torch/utils/cpp_extension.py: B950
torchgen/api/types/__init__.py: F401,F403
torchgen/executorch/api/types/__init__.py: F401,F403
C400,C401,C402,C403,C404,C405,C407,C411,C413,C414,C415
per-file-ignores = __init__.py: F401 torch/utils/cpp_extension.py: B950
optional-ascii-coding = True
exclude =
./.git,
@ -26,9 +22,6 @@ exclude =
./docs/caffe2,
./docs/cpp/src,
./docs/src,
./functorch/docs,
./functorch/examples,
./functorch/notebooks,
./scripts,
./test/generated_type_hints_smoketest.py,
./third_party,

View File

@ -18,13 +18,7 @@ cc11aaaa60aadf28e3ec278bce26a42c1cd68a4f
e3900d2ba5c9f91a24a9ce34520794c8366d5c54
# 2021-04-21 Removed all unqualified `type: ignore`
75024e228ca441290b6a1c2e564300ad507d7af6
# 2021-04-30 [PyTorch] Autoformat c10
44cc873fba5e5ffc4d4d4eef3bd370b653ce1ce1
# 2021-05-14 Removed all versionless Python shebangs
2e26976ad3b06ce95dd6afccfdbe124802edf28f
# 2021-06-07 Strictly typed everything in `.github` and `tools`
737d920b21db9b4292d056ee1329945990656304
# 2022-06-09 Apply clang-format to ATen headers
95b15c266baaf989ef7b6bbd7c23a2d90bacf687
# 2022-06-11 [lint] autoformat test/cpp and torch/csrc
30fb2c4abaaaa966999eab11674f25b18460e609

View File

@ -5,8 +5,6 @@ about: Tracking incidents for PyTorch's CI infra.
> NOTE: Remember to label this issue with "`ci: sev`"
**MERGE BLOCKING** <!-- remove this line if you don't want this SEV to block merges -->
## Current Status
*Status could be: preemptive, ongoing, mitigated, closed. Also tell people if they need to take action to fix it (i.e. rebase)*.

View File

@ -1,61 +0,0 @@
name: 🐛 torch.compile Bug Report
description: Create a report to help us reproduce and fix the bug
labels: ["oncall: pt2"]
body:
- type: markdown
attributes:
value: >
#### Before submitting a bug, please make sure the issue hasn't been already addressed by searching through [the
existing and past issues](https://github.com/pytorch/pytorch/issues)
It's likely that your bug will be resolved by checking our FAQ or troubleshooting guide [documentation](https://pytorch.org/docs/master/dynamo/index.html)
- type: textarea
attributes:
label: 🐛 Describe the bug
description: |
Please provide a clear and concise description of what the bug is.
placeholder: |
A clear and concise description of what the bug is.
validations:
required: false
- type: textarea
attributes:
label: Error logs
description: |
Please provide the error you're seeing
placeholder: |
Error...
validations:
required: false
- type: textarea
attributes:
label: Minified repro
description: |
Please run the minifier on your example and paste the minified code below
Learn more here https://pytorch.org/docs/master/dynamo/troubleshooting.html
placeholder: |
env TORCHDYNAMO_REPRO_AFTER="aot" python your_model.py
or
env TORCHDYNAMO_REPRO_AFTER="dynamo" python your_model.py
import torch
...
# torch version: 2.0.....
class Repro(torch.nn.Module)
validations:
required: false
- type: textarea
attributes:
label: Versions
description: |
Please run the following and paste the output below.
```sh
wget https://raw.githubusercontent.com/pytorch/pytorch/master/torch/utils/collect_env.py
# For security purposes, please check the contents of collect_env.py before running it.
python collect_env.py
```
validations:
required: true

View File

@ -1 +1 @@
Fixes #ISSUE_NUMBER
Fixes #ISSUE_NUMBER

View File

@ -5,19 +5,12 @@ self-hosted-runner:
- linux.large
- linux.2xlarge
- linux.4xlarge
- linux.12xlarge
- linux.24xlarge
- linux.4xlarge.nvidia.gpu
- linux.8xlarge.nvidia.gpu
- linux.16xlarge.nvidia.gpu
- linux.g5.4xlarge.nvidia.gpu
- windows.4xlarge
- windows.8xlarge.nvidia.gpu
- windows.g5.4xlarge.nvidia.gpu
- bm-runner
- linux.rocm.gpu
- macos-m1-12
- macos-m1-13
- macos-12-xl
- macos-12
- macos12.3-m1

View File

@ -37,10 +37,12 @@ runs:
shell: bash
env:
BRANCH: ${{ inputs.branch }}
JOB_BASE_NAME: ${{ inputs.build-environment }}-build-and-test
BUILD_ENVIRONMENT: pytorch-linux-xenial-py3-clang5-android-ndk-r19c-${{ inputs.arch-for-build-env }}-build"
AWS_DEFAULT_REGION: us-east-1
PR_NUMBER: ${{ github.event.pull_request.number }}
SHA1: ${{ github.event.pull_request.head.sha || github.sha }}
CUSTOM_TEST_ARTIFACT_BUILD_DIR: build/custom_test_artifacts
SCCACHE_BUCKET: ossci-compiler-cache-circleci-v2
DOCKER_IMAGE: ${{ inputs.docker-image }}
MATRIX_ARCH: ${{ inputs.arch }}
@ -50,12 +52,16 @@ runs:
export container_name
container_name=$(docker run \
-e BUILD_ENVIRONMENT \
-e JOB_BASE_NAME \
-e MAX_JOBS="$(nproc --ignore=2)" \
-e AWS_DEFAULT_REGION \
-e IS_GHA \
-e PR_NUMBER \
-e SHA1 \
-e BRANCH \
-e GITHUB_RUN_ID \
-e SCCACHE_BUCKET \
-e CUSTOM_TEST_ARTIFACT_BUILD_DIR \
-e SKIP_SCCACHE_INITIALIZATION=1 \
--env-file="/tmp/github_env_${GITHUB_RUN_ID}" \
--security-opt seccomp=unconfined \
@ -66,11 +72,11 @@ runs:
-w /var/lib/jenkins/workspace \
"${DOCKER_IMAGE}"
)
git submodule sync && git submodule update -q --init --recursive --depth 1
git submodule sync && git submodule update -q --init --recursive --depth 1 --jobs 0
docker cp "${GITHUB_WORKSPACE}/." "${container_name}:/var/lib/jenkins/workspace"
(echo "sudo chown -R jenkins . && .ci/pytorch/build.sh && find ${BUILD_ROOT} -type f -name "*.a" -or -name "*.o" -delete" | docker exec -u jenkins -i "${container_name}" bash) 2>&1
(echo "sudo chown -R jenkins . && .jenkins/pytorch/build.sh && find ${BUILD_ROOT} -type f -name "*.a" -or -name "*.o" -delete" | docker exec -u jenkins -i "${container_name}" bash) 2>&1
# Copy install binaries back
mkdir -p "${GITHUB_WORKSPACE}/build_android_install_${MATRIX_ARCH}"
docker cp "${container_name}:/var/lib/jenkins/workspace/build_android/install" "${GITHUB_WORKSPACE}/build_android_install_${MATRIX_ARCH}"
echo "container_id=${container_name}" >> "${GITHUB_OUTPUT}"
echo "::set-output name=container_id::${container_name}"

Some files were not shown because too many files have changed in this diff Show More