Compare commits

126 Commits

Author SHA1 Message Date
4ff3872a20 [v.1.5.0] Ensure linearIndex of advanced indexing backwards is contig… (#36962)
* [v.1.5.0] Ensure linearIndex of advanced indexing backwards is contiguous.

This is a more straightforward solution to the problem than https://github.com/pytorch/pytorch/pull/36957; I don't know about the relative performance.

Fixes: #36956

ghstack-source-id: 43c48eaee7232cd3ed2b108edbbee24c11e8321a
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36959

* Fix test.
2020-04-20 19:59:38 -04:00
d7bdffabed [v1.5 Patch] Disable flaky test_backward_node_failure_python_udf test in dist_autograd_test.py
This test is flaky on 1.5 release branch. Below is a failed CI run:
https://app.circleci.com/pipelines/github/pytorch/pytorch/157331/workflows/b3e0bd6b-6c55-4d14-bde8-96b8345cf9e2/jobs/5190025
2020-04-20 14:25:32 -04:00
9ba0a89489 Overwrite bazel if /usr/bin/bazel already exists. 2020-04-20 14:24:42 -04:00
c164fbccb1 Add TorchServe 2020-04-19 21:44:32 -07:00
9a51e477ac make simple executor the default for OSS 2020-04-17 20:00:53 -04:00
375566fb78 Handle log_sigmoid(out=) properly.
Fixes: https://github.com/pytorch/pytorch/issues/36499

Changes:
1) Moves some bindings from LegacyNNDefinitions to Activation so all of log_sigmoid lives together
2) Properly handle non-contiguous / incorrectly sized out parameters to log_sigmoid.  This is done by copying from a buffer if necessary.
3) Require that the internal buffer (different from 2)) is contiguous.  This should always be the case because it's always created internally.
4) Adds a test
2020-04-17 15:43:35 -04:00
dfdc788076 Fix incorrect merge of #34136.
If you look at https://github.com/pytorch/pytorch/pull/34136/, you will notice a commit (80c15c087c) that didn't get merged.
This is to address that, to avoid crashing on remainder when the rhs is 0.

ghstack-source-id: e805e290bd4b7d3165fd78d4e537e56e4c459162
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36760
2020-04-17 15:42:20 -04:00
9e6ef814cc [v1.5.0] Print keyword-only arg symbol for function signature suggestions.
Fixes: https://github.com/pytorch/pytorch/issues/36773

ghstack-source-id: 6b08839ffc8b228e9533a47b7fd034367fc93dec
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36780
2020-04-17 15:42:04 -04:00
31461800f6 Migrate release CI jobs to CircleCI for Windows (v1.5 Release) (#36658)
* Migrate release CI jobs to CircleCI for Windows (v1.5 Release)

* Fix comments
2020-04-16 12:18:27 -04:00
Jie
e741839b0e Fixing SyncBN dgrad (#36382)
Summary:
Previous PR https://github.com/pytorch/pytorch/issues/22248, which provides support for variadic batch size across processes, doesn't account for the mean_dy/mean_dy_xmu on the backward path, which produces a wrong dgrad.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36382

Differential Revision: D20984446

Pulled By: ngimel

fbshipit-source-id: 80066eee83760b275d61e2cdd4e86facca5577fd
2020-04-16 10:58:16 -04:00
8eb39c9cfd [CI] fix test_distributed for python 3.8+ (#36542)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36542

Python 3.8 set the default multiprocessing start mode to spawn, but we
need fork in these tests, otherwise there are some pickling issues.
Test: Ensure that these tests succeed when run with python 3.8
ghstack-source-id: 102093824

Test Plan: Ensure success with python 3.8

Differential Revision: D21007753

fbshipit-source-id: 4b39844c6ba76a53293c0dfde7c98ec5a78fe113
2020-04-16 10:54:57 -04:00
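
Note on the commit above: a minimal sketch of asking for the `fork` start method explicitly through a context instead of relying on the platform default. It assumes only the standard `multiprocessing` module; `pick_context`, `_square` and `run_in_child` are hypothetical names and are not taken from the PyTorch test suite. The Python 3.8 change of default to `spawn` applies to macOS, and `fork` does not exist on Windows, which is why the sketch falls back to the default context.

```python
import multiprocessing as mp


def pick_context(preferred="fork"):
    """Return a context for `preferred` if the platform offers it,
    otherwise the platform's default context (hypothetical helper)."""
    if preferred in mp.get_all_start_methods():
        return mp.get_context(preferred)
    return mp.get_context()


def _square(x):
    # Module-level function so it also works under spawn, where the
    # target has to be picklable by reference.
    return x * x


def run_in_child(value):
    """Run _square in one worker process created from the chosen context."""
    ctx = pick_context("fork")
    with ctx.Pool(processes=1) as pool:
        return pool.apply(_square, (value,))
```

Using a context keeps the choice local to the test; calling `multiprocessing.set_start_method` would change it for the whole process.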
b5e4c0993d Add a warning for Single-Process Multi-GPU DDP 2020-04-15 19:08:24 -04:00
6bc6832bda fix syntax 2020-04-15 19:00:11 -04:00
593594839c Update docs for 1.5 to remove Python 2 references (#36338)
* Remove python 2 from jit.rst

* Remove python 2 from jit_language_reference.rst

* Remove python 2 from multiprocessing.rst

* Remove python 2 from named_tensor.rst

* Remove python 2 from multiprocessing.rst

* Remove python 2 from windows.rst

* Update multiprocessing.rst

* Remove python 2 from notes/multiprocessing.rst
2020-04-14 15:57:02 -07:00
cf65c8ef15 Fix torch.min docs (#36319)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36319

On the way to resolving #35216.
This is a fix for just the master branch but once this goes in,
I'll send a cherry-pick to release/1.5

The problem is that we were not calling `format` on a string that had
templates (e.g., '{input}', '{dim}'). This change makes it so that we
call format on the entire docstring for `torch.min`.

Test Plan:
- The `torch.max` docs are OK:
https://pytorch.org/docs/master/torch.html#torch.max and don't need
changing.
- `torch.min` docs, before this change: see second screenshot in #35216.
- after this change: <Insert link here on github>

![image](https://user-images.githubusercontent.com/5652049/78921702-4e2acc00-7a63-11ea-9ea0-89636ff6fb0a.png)

Differential Revision: D20946702

Pulled By: zou3519

fbshipit-source-id: a1a28707e41136a9bb170c8a4191786cf037a0c2
2020-04-13 19:03:03 -04:00
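
Note on the commit above: a small sketch of the failure mode it describes, where a docstring containing `{input}`-style templates is published without `format` being called on it. The names `common_args`, `raw_doc` and `rendered` and the argument descriptions are made up for illustration; they are not the actual contents of the PyTorch docs source.

```python
# Hypothetical shared argument descriptions, keyed by template name.
common_args = {
    "input": "input (Tensor): the input tensor.",
    "dim": "dim (int): the dimension to reduce.",
}

# A docstring with templates. Published as-is, readers would see the
# literal text "{input}" and "{dim}".
raw_doc = """min(input, dim) -> Tensor

Args:
    {input}
    {dim}
"""

# Calling format on the entire docstring substitutes every template.
rendered = raw_doc.format(**common_args)
```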
ca0dc1fcdc skip test in 3.8 because of inconsistent regex 2020-04-10 11:06:47 -07:00
b58f89b2e4 Use counter instead of vector of futures in _parallel_run (#36159) (#36334)
Summary:
This should be faster than allocating one mutex, flag and condition variable per task.

Using `std::atomic<size_t>` to count remaining tasks is not sufficient,
because modifying the remaining counter and signalling the condition variable must happen atomically,
otherwise `wait()` might get invoked after `notify_one()` was called.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36159

Test Plan: CI

Differential Revision: D20905411

Pulled By: malfet

fbshipit-source-id: facaf599693649c3f43edafc49f369e90d2f60de
(cherry picked from commit 986a8fdd6a18d9110f8bde59361967139450966b)
Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

Co-authored-by: Nikita Shulga <nshulga@fb.com>
2020-04-09 14:08:57 -07:00
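
Note on the commit above: the counter-plus-condition-variable idea can be sketched in Python with `threading.Condition`. This is an illustration of the pattern under stated assumptions, not a port of `_parallel_run`; `TaskCounter` and `parallel_run` are hypothetical names. The counter is only modified while holding the lock that the waiter also uses, so the decrement and the notification cannot be separated from the waiter's check.

```python
import threading


class TaskCounter:
    """One shared counter guarded by one condition variable."""

    def __init__(self, total):
        self._remaining = total
        self._cond = threading.Condition()

    def task_done(self):
        # Decrement and notify under the same lock the waiter holds
        # while checking the counter.
        with self._cond:
            self._remaining -= 1
            if self._remaining == 0:
                self._cond.notify_all()

    def wait(self):
        with self._cond:
            # wait_for re-checks the predicate, so a notification sent
            # before wait() was entered is not lost.
            self._cond.wait_for(lambda: self._remaining == 0)


def parallel_run(funcs):
    """Run each callable in its own thread and return results in order."""
    counter = TaskCounter(len(funcs))
    results = [None] * len(funcs)

    def worker(index, fn):
        try:
            results[index] = fn()
        finally:
            counter.task_done()

    for index, fn in enumerate(funcs):
        threading.Thread(target=worker, args=(index, fn)).start()
    counter.wait()
    return results
```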
87b6685c6b repr and _*state_dict for qRNN (#31540)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31540

Fixes #31468

Test Plan: Imported from OSS

Differential Revision: D19205894

Pulled By: z-a-f

fbshipit-source-id: 80c36f74aa20a125ea8d74a54e9905576f1bc6d7
2020-04-09 12:26:56 -04:00
f746f1b746 Revert "Avoid clone for sparse tensors during accumulation of grads. (#33427)"
This reverts commit b185359fb4ba4dcb0c048fd1d049da23eff88b27.
2020-04-09 11:33:55 -04:00
1379415150 Revert "AccumulateGrad: ensure sparse tensor indices and values refcount is always 1 (#34559)"
This reverts commit 2ce9513b0c8894987f6d42bfb57ff95b22e32c95.
2020-04-09 11:33:55 -04:00
7d638d2596 [v1.5.0] fix is_float_scale_factor warning (python and c++) (#36274)
* fix is_float_scale_factor warning

* fix python impl

Co-authored-by: Robin Lobel <divide@divideconcept.net>
Co-authored-by: Will Feng <willfeng@fb.com>
2020-04-09 11:31:13 -04:00
bad005d331 .circleci: Add binary builds/tests to run on release branches (#36283)
Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
2020-04-08 16:37:24 -07:00
16d8a52407 [pytorch] Add error when PyTorch used with Python 2 (#36151)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36151

Python 2 has reached end-of-life and is no longer supported by PyTorch. To avoid confusing behavior when trying to use PyTorch with Python 2, detect this case early and fail with a clear message.  This commit covers `import torch` only and not C++  for now.

Test Plan: waitforsandcastle

Reviewed By: dreiss

Differential Revision: D20894381

fbshipit-source-id: a1073b7a648e07cf10cda5a99a2cf4eee5a89230
2020-04-08 18:55:58 -04:00
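
Note on the commit above: a hedged sketch of an early interpreter-version check of the kind the message describes. `require_python3` is a hypothetical helper and the message text is illustrative; the actual check in `import torch` may be written differently. For such a check to run at all on Python 2, the file containing it must itself avoid Python 3 only syntax.

```python
import sys


def require_python3(version_info=None):
    """Fail early with a clear message when run on Python 2."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only the major version component.
    if tuple(version_info)[:1] < (3,):
        raise RuntimeError(
            "Python 2 has reached end-of-life and is no longer supported."
        )
```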
a33b264588 Revert "Update docs for 1.5 to remove Python 2 references (#36116)"
This reverts commit 63dcd9eccc90136afdfb5d8130077ff1e917ba2e.
2020-04-08 18:51:13 -04:00
3a67e00889 [1.5 cherrypick] C++ Adam optimizer - corrected messages for check of default options (#36245)
* Corrected messages for check of default options

* Added 0<= betas < 1 range check, match python messages for check of betas

Co-authored-by: meganset <meganset@gmail.com>
2020-04-08 18:06:16 -04:00
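
Note on the commit above: the `0 <= betas < 1` range check, sketched in Python. `check_betas` is a hypothetical helper; the message wording is modelled on the Python optimizer's style and is an assumption here, not a quote from the C++ change.

```python
def check_betas(betas):
    """Validate that every beta lies in the half-open range [0, 1)."""
    for index, beta in enumerate(betas):
        if not 0.0 <= beta < 1.0:
            raise ValueError(
                "Invalid beta parameter at index {}: {}".format(index, beta)
            )
    return betas
```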
6bd039551d Remove determine_from from test/run_test.py (#36256)
Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
2020-04-08 14:58:23 -07:00
b6c3058d61 Exclude torch/csrc/cuda/*nccl* from clang-tidy (#36251)
Since the workflow configures pytorch with `USE_NCCL` set to 0, we cannot tidy those files

(cherry picked from commit e172a6ef920b6838b67eb8f0020d78031df8cde5)
Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

Co-authored-by: Nikita Shulga <nshulga@fb.com>
2020-04-08 13:37:16 -07:00
ed908b4fbc [release/1.5] Move all nccl from torch_python to torch_cuda (#36229)
* Remove dead code

`THCPModule_useNccl()` doesn't seem to be used anywhere

* Move all nccl calls from `torch_python` to `torch_cuda`

Because `torch_python` is supposed to be a thin wrapper around torch

This ensures API parity between C++ and Python, as well as reduces `torch_python` binary size

Co-authored-by: Nikita Shulga <nshulga@fb.com>
2020-04-08 10:39:20 -07:00
b66e0af58b s/repo.continuum.io/repo.anaconda.com/
Followup after  https://github.com/pytorch/pytorch/pull/36201

Per https://github.com/conda/conda/issues/6886  `repo.anaconda.com` should have been used since Feb 2019

Test Plan: CI
2020-04-08 13:05:04 -04:00
bf8a5ede96 [ONNX] fix size for opset 11 (#35984)
Summary:
Fixing size, as the aten op has been updated to support 0 inputs
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35984

Reviewed By: hl475

Differential Revision: D20858214

Pulled By: houseroad

fbshipit-source-id: 8ad0a0174a569455e89da6798eed403c8b162a47
2020-04-08 11:50:59 -04:00
c2bc5c56c5 Use repo.anaconda.com instead of repo.continuum.io (#36201)
Summary:
Per https://github.com/conda/conda/issues/6886  `repo.anaconda.com` should have been used since Feb 2019
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36201

Test Plan: CI

Differential Revision: D20910667

Pulled By: malfet

fbshipit-source-id: 3a191e2cae293e6f96dbb323853e84c07cd7aabc
2020-04-08 08:39:52 -07:00
db3c3ed662 Move test to test_jit_py3.py 2020-04-08 11:15:33 -04:00
9de4770bbd [v1.5.0] Group libraries in TOC and add PyTorch Elastic
Move XLA out of Notes and group with other libraries. Also adds link to PyTorch Elastic.
2020-04-08 11:08:39 -04:00
911a2a6b63 [BugFix] Fix compare_exchange_weak in DispatchStub.h (#35794)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35794

### Summary

As PyTorch has been in production on iOS for about a week, we've spotted a few crashes (90 out of 20.3k) related to DispatchStub.h. The major part of the crash log is pasted below (full crash information can be found at `bunnylol logview 1d285dc9172c877b679d0f8539da58f0`):

```
FBCameraFramework void at::native::DispatchStub<void (*)(at::TensorIterator&, c10::Scalar), at::native::add_stub>::operator()<at::TensorIterator&, c10::Scalar&>(c10::DeviceType, at::TensorIterator&, c10::Scalar&)(DispatchStub.h:0)
+FBCameraFramework at::native::add(at::Tensor const&, at::Tensor const&, c10::Scalar)(BinaryOps.cpp:53)
+FBCameraFramework at::CPUType::add_Tensor(at::Tensor const&, at::Tensor const&, c10::Scalar)(CPUType.cpp:55)
+FBCameraFramework at::add(at::Tensor const&, at::Tensor const&, c10::Scalar)(Functions.h:1805)
+FBCameraFramework [inlined] c10::intrusive_ptr<c10::TensorImpl, c10::UndefinedTensorImpl>::intrusive_ptr(c10::intrusive_ptr<c10::TensorImpl, c10::UndefinedTensorImpl>&&)(intrusive_ptr.h:0)
+FBCameraFramework [inlined] c10::intrusive_ptr<c10::TensorImpl, c10::UndefinedTensorImpl>::intrusive_ptr(c10::intrusive_ptr<c10::TensorImpl, c10::UndefinedTensorImpl>&&)(intrusive_ptr.h:221)
+FBCameraFramework [inlined] at::Tensor::Tensor(at::Tensor&&)(TensorBody.h:93)
+FBCameraFramework [inlined] at::Tensor::Tensor(at::Tensor&&)(TensorBody.h:93)
+FBCameraFramework c10::detail::WrapRuntimeKernelFunctor_<(anonymous namespace)::$_3, at::Tensor, c10::guts::typelist::typelist<at::Tensor, at::Tensor, c10::Scalar> >::operator()(at::Tensor, at::Tensor, c10::Scalar)(kernel_lambda.h:23)
+FBCameraFramework [inlined] c10::guts::infer_function_traits<c10::detail::WrapRuntimeKernelFunctor_<(anonymous namespace)::$_3, at::Tensor, c10::guts::typelist::typelist<at::Tensor, at::Tensor, c10::Scalar> > >::type::return_type c10::detail::call_functor_with_args_from_stack_<c10::detail::WrapRuntimeKernelFunctor_<(anonymous namespace)::$_3, at::Tensor, c10::guts::typelist::typelist<at::Tensor, at::Tensor, c10::Scalar> >, false, 0ul, 1ul, 2ul>(c10::detail::WrapRuntimeKernelFunctor_<(anonymous namespace)::$_3, at::Tensor, c10::guts::typelist::typelist<at::Tensor, at::Tensor, c10::Scalar> >*, std::__1::vector<c10::IValue, c10::detail::WrapRuntimeKernelFunctor_<(anonymous namespace)::$_3, at::Tensor, c10::guts::typelist::typelist<at::Tensor, at::Tensor, c10::Scalar> >*::allocator<std::__1::vector> >*, c10::detail::WrapRuntimeKernelFunctor_<(anonymous namespace)::$_3, at::Tensor, c10::guts::typelist::typelist<at::Tensor, at::Tensor, c10::Scalar> >*::integer_sequence<unsigned long, 0ul, 1ul, 2ul>)(kernel_functor.h:210)
+FBCameraFramework [inlined] c10::guts::infer_function_traits<c10::detail::WrapRuntimeKernelFunctor_<(anonymous namespace)::$_3, at::Tensor, c10::guts::typelist::typelist<at::Tensor, at::Tensor, c10::Scalar> > >::type::return_type c10::detail::call_functor_with_args_from_stack<c10::detail::WrapRuntimeKernelFunctor_<(anonymous namespace)::$_3, at::Tensor, c10::guts::typelist::typelist<at::Tensor, at::Tensor, c10::Scalar> >, false>(c10::detail::WrapRuntimeKernelFunctor_<(anonymous namespace)::$_3, at::Tensor, c10::guts::typelist::typelist<at::Tensor, at::Tensor, c10::Scalar> >*, std::__1::vector<c10::IValue, c10::detail::WrapRuntimeKernelFunctor_<(anonymous namespace)::$_3, at::Tensor, c10::guts::typelist::typelist<at::Tensor, at::Tensor, c10::Scalar> >*::allocator<std::__1::vector> >*)(kernel_functor.h:218)
+FBCameraFramework c10::detail::make_boxed_from_unboxed_functor<c10::detail::WrapRuntimeKernelFunctor_<(anonymous namespace)::$_3, at::Tensor, c10::guts::typelist::typelist<at::Tensor, at::Tensor, c10::Scalar> >, false, void>::call(c10::OperatorKernel*, c10::OperatorHandle const&, std::__1::vector<c10::IValue, std::__1::allocator<c10::IValue> >*)(kernel_functor.h:250)
+FBCameraFramework [inlined] (anonymous namespace)::variable_fallback_kernel(c10::OperatorHandle const&, std::__1::vector<c10::IValue, std::__1::allocator<c10::IValue> >*)(VariableFallbackKernel.cpp:32)
+FBCameraFramework void c10::KernelFunction::make_boxed_function<&((anonymous namespace)::variable_fallback_kernel(c10::OperatorHandle const&, std::__1::vector<c10::IValue, std::__1::allocator<c10::IValue> >*))>(c10::OperatorKernel*, c10::OperatorHandle const&, std::__1::vector<c10::IValue, std::__1::allocator<c10::IValue> >*)(KernelFunction_impl.h:21)
+FBCameraFramework torch::jit::mobile::InterpreterState::run(std::__1::vector<c10::IValue, std::__1::allocator<c10::IValue> >&)(interpreter.cpp:0)
+FBCameraFramework torch::jit::mobile::Function::run(std::__1::vector<c10::IValue, std::__1::allocator<c10::IValue> >&) const(function.cpp:59)
+FBCameraFramework torch::jit::mobile::Module::run_method(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::vector<c10::IValue, std::__1::allocator<c10::IValue> >)(module.cpp:51)
+FBCameraFramework [inlined] torch::jit::mobile::Module::forward(std::__1::vector<c10::IValue, std::__1::allocator<c10::IValue> >)(module.h:28)
```
The problem is that `compare_exchange_weak` is not guaranteed to succeed in one shot, as described in [C++ Concurrency in Action (2nd Edition)](https://livebook.manning.com/book/c-plus-plus-concurrency-in-action-second-edition/chapter-5/79). This might leave `cpu_dispatch_ptr` as a null pointer in concurrent situations, thus leading to the crash. As suggested in the book, because of spurious failures, `compare_exchange_weak` is typically used in a loop. There is also a [stackoverflow discussion](https://stackoverflow.com/questions/25199838/understanding-stdatomiccompare-exchange-weak-in-c11) about this. Feel free to drop comments below if there is a better option.

### The original PR

- [Enhance DispatchStub to be thread safe from a TSAN point of view](https://github.com/pytorch/pytorch/pull/32148)

### Test Plan

- Keep observing the crash reports in QE

Test Plan: Imported from OSS

Differential Revision: D20808751

Pulled By: xta0

fbshipit-source-id: 52f5c865b70c59b332ef9f0865315e76d97f6eaa
2020-04-08 10:56:07 -04:00
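
Note on the commit above: Python has no `compare_exchange_weak`, so the following is a toy simulation of the retry-loop pattern and not the DispatchStub fix itself. `SimulatedAtomic` is a hypothetical stand-in for `std::atomic` that fails spuriously a fixed number of times, and `publish_once` is a hypothetical helper showing why a single attempt can leave the cell empty while a loop does not.

```python
import threading


class SimulatedAtomic:
    """Toy stand-in for std::atomic<T> with scheduled spurious failures."""

    def __init__(self, value, spurious_failures=1):
        self._value = value
        self._lock = threading.Lock()
        self._spurious_left = spurious_failures

    def load(self):
        with self._lock:
            return self._value

    def compare_exchange_weak(self, expected, desired):
        """Return (success, observed). May fail even if value == expected."""
        with self._lock:
            observed = self._value
            if observed != expected:
                return False, observed  # genuine failure: value differs
            if self._spurious_left > 0:
                self._spurious_left -= 1
                return False, observed  # spurious failure: nothing stored
            self._value = desired
            return True, observed


def publish_once(cell, candidate):
    """Install `candidate` if the cell is empty, retrying spurious failures."""
    while True:
        ok, observed = cell.compare_exchange_weak(None, candidate)
        if ok:
            return candidate
        if observed is not None:
            return observed  # someone else already published a value
        # observed is None: the failure was spurious, so try again
```

In C++ the usual alternatives are the same loop around `compare_exchange_weak`, or `compare_exchange_strong`, which does not fail spuriously.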
60375bcfdf [1.5.0] Attempt to fix the pytorch_cpp_doc_push build by pinning breathe. 2020-04-08 10:54:56 -04:00
63dcd9eccc Update docs for 1.5 to remove Python 2 references (#36116) 2020-04-07 16:03:44 -07:00
e8236d2ed4 fix max_pool2d cuda version Dimension out of range issue(#36046) (#36095)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/36095

Test Plan: Imported from OSS

Differential Revision: D20876733

Pulled By: glaringlee

fbshipit-source-id: a2b92fd2dd0254c5443af469e3fb2faa2323e5c9
2020-04-07 18:52:21 -04:00
0058b1bb7e [1.5 cherrypick][JIT] Fix fake_range() 2020-04-07 18:47:22 -04:00
419283e291 Improve C++ API autograd and indexing docs (#35777)
Summary:
This PR adds docs for the following components:
1. Tensor autograd APIs (such as `is_leaf` / `backward` / `detach` / `detach_` / `retain_grad` / `grad` / `register_hook` / `remove_hook`)
2. Autograd APIs: `torch::autograd::backward` / `grad` / `Function` / `AutogradContext`, `torch::NoGradGuard` / `torch::AutoGradMode`
3. Tensor indexing
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35777

Differential Revision: D20810616

Pulled By: yf225

fbshipit-source-id: 60526ec0c5b051021901d89bc3b56861c68758e8
2020-04-07 18:37:27 -04:00
0e6f6ba218 [pytorch] Remove python2 support from tests and torch.jit (#35042) (#36162)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35042

Removing python2 tests and some compat code in torch.jit. Check if dependent projects and external tests have any issues after these changes.

Test Plan: waitforsandcastle

Reviewed By: suo, seemethere

Differential Revision: D18942633

fbshipit-source-id: d76cc41ff20bee147dd8d44d70563c10d8a95a35
(cherry picked from commit 8240db11e193b0334a60a33d9fc907ebc6ba6987)
Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

Co-authored-by: Orion Reblitz-Richardson <orionr@fb.com>
2020-04-07 13:55:50 -07:00
ec8dbaf920 Add more alternative filters in places people forgot to add them. (#36082) (#36148)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36082

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Differential Revision: D20874618

Pulled By: ezyang

fbshipit-source-id: b6f12100a247564428eb7272f803a03c9cad3a97
(cherry picked from commit 449a4ca3408774ed961f1702ca31a549f5818b80)
Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

Co-authored-by: Edward Yang <ezyang@fb.com>
2020-04-07 09:59:33 -07:00
7e168d134f Pin Sphinx to 2.4.4 (take 2), fix docs CIs (#36072)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36072

Update to https://github.com/pytorch/pytorch/pull/36065/ which was
almost there

Test Plan: - Wait for CI

Differential Revision: D20871661

Pulled By: zou3519

fbshipit-source-id: 2bf5ce382e879aafd232700ff1c0d61fc17ea52d
2020-04-07 10:54:36 -04:00
6daae58871 Remove __nv_relfatbin section from nccl_static library (#35907)
Test Plan: CI

(cherry picked from commit 04e06b419990328157f0e2108a95b2848f66d75f)
Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

Co-authored-by: Nikita Shulga <nshulga@fb.com>
2020-04-06 16:57:03 -07:00
fee0ff1bf6 May fix TopKTypeConfig<at::Half> without an additional Bitfield specialization 2020-04-06 19:41:17 -04:00
deaf3b65cf Compile THCTensorTopK per dtype.
ROCm builds fail inconsistently on this file by timing out.

ghstack-source-id: 4a8f22731aa82c02d464a8cba522e856afbe49b8
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36074
2020-04-06 19:41:17 -04:00
dca9c2501d Revert "Revert "Fix handling of non-finite values in topk (#35253)" (#35582)"
This reverts commit dacdbc22d195f80e0b529b4e9111c8ca9a172914.
2020-04-06 19:41:17 -04:00
842cd47416 Refactor and turn on C++ API parity test in CI
gh-metadata: pytorch pytorch 35190 gh/yf225/106/head
2020-04-06 15:40:35 -04:00
a30b49085c Move NewModuleTest and NewCriterionTest from test_nn.py to common_nn.py (#35189)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35189

Test Plan: Imported from OSS

Differential Revision: D20588197

Pulled By: yf225

fbshipit-source-id: 5a28159b653895678c250cbc0c1ddd51bc7a3123
2020-04-06 15:40:35 -04:00
82626f8ad9 More generic dedupe MKL fix (#35966)
* Stop linking against MKL

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Perform test for build size

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* fixup

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* One more MSVC fix

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Revert "Perform test for build size"

This reverts commit 8b5ed8eac81cc880b5cedb33cb3b86f584abacb7.
2020-04-06 11:50:48 -07:00
27fddfda4f Use std::abs instead of abs in lbfgs.cpp (#35974)
Summary:
This supersedes https://github.com/pytorch/pytorch/pull/35698.

`abs` is a C-style function that takes only an integral argument;
`std::abs` is overloaded and can be applied to both integral and floating point types

This PR also increases `kBatchSize` in `test_optimizer_xor` function in `test/cpp/api/optim.cpp` to fix `OptimTest.XORConvergence_LBFGS` failure under ASAN.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35974

Test Plan: CI

Reviewed By: pbelevich

Differential Revision: D20853570

Pulled By: yf225

fbshipit-source-id: 6135588df2426c5b974e4e097b416955d1907bd4
2020-04-06 14:50:18 -04:00
7ecf6a1c10 [release/1.5] Bump libtorch to 3.7, remove python2 (#36080)
* .circleci: Remove Python 2.7 builds, switch libtorch to 3.7

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

* .circleci: Bump libtorch builds to 3.7

The image is actually using Python 3.7.2 so we should reflect that
within our circleci configs

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
(cherry picked from commit b3f2572aaf83d1f5383369187f6263e6f926103b)
Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
2020-04-06 11:10:48 -07:00
beb07a44c4 Ports integer division callsite cleanup 2020-04-02 20:17:31 -04:00
a01c3bd1fe [BC] Fix the BC test for 1.5 (#35733)
* [BC] Fix the BC test for 1.5

* Skip RRef

* Skip more

* Skip more

* Fix whitelist

* Fix whitelist
2020-04-02 19:36:18 -04:00
ffd010f8a0 Make test_leaky_relu_inplace_with_neg_slope device-generic and skipIfRocm. (#35816)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35816

Fixes https://github.com/pytorch/pytorch/issues/35689.

Test Plan: Imported from OSS

Differential Revision: D20796656

Pulled By: gchanan

fbshipit-source-id: 474790fe07899d9944644f6b3d7a15db1c2b96db
2020-04-02 17:05:23 -04:00
8ad59f03a8 Skip ROCm test in test/test_cpp_extensions_aot.py (#35838)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35838

It may be flaky.

Test Plan: Imported from OSS

Differential Revision: D20807409

Pulled By: gchanan

fbshipit-source-id: f085d05bcb6a04d304f3cd048c38d2e8453125d6
2020-04-02 17:04:54 -04:00
ed3640df68 Fix another case of float2::x and float2::y may not be the same on ROCm (#35785)
Summary:
This is another case of the issue fixed in https://github.com/pytorch/pytorch/pull/35783. Mirroring 35786.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35785

Differential Revision: D20800317

Pulled By: ezyang

fbshipit-source-id: de5f32839755d5ff5aefff8408df69adbab4d0a1
2020-04-02 17:01:27 -04:00
fb88942f6c Fix typo 2020-04-02 13:53:13 -04:00
5d05c51887 Refactored rpc docs (#35109)
Summary:
Reorganize as per jlin27 's comments. Screenshots added in comments.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35109

Differential Revision: D20788774

Pulled By: rohan-varma

fbshipit-source-id: 7d64be70ef76ed6ff303d05d39c338293c234766
2020-04-02 13:53:13 -04:00
df5986fbf3 [1.5 Release] Disabled complex tensor construction (#35579)
* disabled complex tensor construction

* minor

* doc fix

* added docs back and updated complex dtype check

* removed test_complex.py

* removed complexfloat reg test

* debug
2020-04-01 11:11:05 -04:00
165403f614 [v1.5.0] float2::x and float2::y may not be the same as float on ROCm (#35593)
Summary:
This causes ambiguity and can be triggered sometimes (e.g., by https://github.com/pytorch/pytorch/issues/35217). Explicitly convert them to float.

    error: conditional expression is ambiguous; 'const
    hip_impl::Scalar_accessor<float, Native_vec_, 0>' can be converted to
    'float' and vice versa
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35593

Differential Revision: D20735663

Pulled By: ezyang

fbshipit-source-id: ae6a38a08e59821bae13eb0b9f9bdf21a008d5c0
2020-03-31 19:58:40 -04:00
fbf18c34ff ports disabling imag 2020-03-31 18:55:45 -04:00
84f806c821 ports real and imag fixes 2020-03-31 13:34:39 -04:00
94139a7d95 Add warnings that amp is incomplete in 1.5 2020-03-31 10:49:45 -04:00
75e36186b2 [v1.5.0] Fix Caffe2 mobile compilation
Ports #35288
2020-03-30 17:17:59 -04:00
f4a0b406dd Warn a known autograd issue on XLA backend. 2020-03-30 17:16:39 -04:00
e884e720f0 [Windows] make torch_cuda's forced link also work for CMake
Was only working for ninja
2020-03-30 17:13:51 -04:00
dacdbc22d1 Revert "Fix handling of non-finite values in topk (#35253)" (#35582)
This reverts commit b12579da5398ff23b421332e21e18dc619a0b960.

This patch in-and-of itself looks fine, but it's causing some AMP tests to fail.
2020-03-27 17:44:03 -07:00
2a789cd0e0 [C++ API Parity] [Optimizers] Merged Optimizer and LossClosureOptimizer (#34957)
Summary:
1. Removed LossClosureOptimizer, and merged Optimizer into OptimizerBase (and renamed the merged class to Optimizer)
2. Merged the LBFGS-specific serialize test function and the generic test_serialize_optimizer function.
3. BC-compatibility serialization test for LBFGS
4. Removed mentions of parameters_ in optimizer.cpp, de-virtualize all functions
5. Made defaults_ an optional argument in all optimizers except SGD

**TODO**: add BC-breaking notes for this PR

Pull Request resolved: https://github.com/pytorch/pytorch/pull/34957

Test Plan: Imported from GitHub, without a `Test Plan:` line.

Differential Revision: D20678162

Pulled By: yf225

fbshipit-source-id: 74e062e42d86dc118f0fbaddd794e438b2eaf35a
2020-03-27 12:30:29 -04:00
f9b010f399 enforce rref JIT pickling to be in the scope of rpc calls (#34689)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34689

RRef JIT pickling is only allowed inside RPC calls. This is enforced by adding a thread-local variable isInRpcCall and setting it to true when converting RPC requests or responses to messages, before calling JIT::pickle(). Inside JIT::pickle(), pickling an RRef is allowed only when isInRpcCall is true.
ghstack-source-id: 100481001

Test Plan: unit tests

Differential Revision: D20429826

fbshipit-source-id: dbc04612ed15de5d6c7d75a4732041ccd4ef3f8c
2020-03-27 11:13:01 -04:00
55614ff306 Enforce rref python pickling to be in the scope of RPC call (#34755)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34755

This diff disallows pickling an RRef with the Python pickler. An RRef can only be pickled in the scope of an RPC call, using _InternalRPCPickler.
ghstack-source-id: 100481337

Test Plan: unit tests

Differential Revision: D20453806

fbshipit-source-id: ebd4115ee01457ba6958cde805afd0a87c686612
2020-03-27 11:12:36 -04:00
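
Note on the two RRef pickling commits above: a minimal sketch of scoping pickling to an RPC call with a thread-local flag. Everything here is hypothetical (`in_rpc_call`, `rpc_call_scope`, `FakeRRef`); it illustrates the guard, and is not the real RRef or `_InternalRPCPickler` code.

```python
import threading
from contextlib import contextmanager

# Per-thread state, so one thread being inside an RPC call does not
# enable pickling on another thread.
_state = threading.local()


def in_rpc_call():
    return getattr(_state, "in_rpc_call", False)


@contextmanager
def rpc_call_scope():
    """Mark the current thread as being inside an RPC call."""
    previous = in_rpc_call()
    _state.in_rpc_call = True
    try:
        yield
    finally:
        _state.in_rpc_call = previous


class FakeRRef:
    """Hypothetical remote reference that refuses ordinary pickling."""

    def __init__(self, owner, value_id):
        self.owner = owner
        self.value_id = value_id

    def __reduce__(self):
        # pickle calls this hook; reject it outside an RPC scope.
        if not in_rpc_call():
            raise RuntimeError(
                "FakeRRef can only be pickled in the scope of an RPC call"
            )
        return (FakeRRef, (self.owner, self.value_id))
```

Restoring the previous flag value in `finally` keeps nested scopes and exceptions from leaving the flag set.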
b12579da53 Fix handling of non-finite values in topk (#35253)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/34191

`at::native::radixSelect` basically uses integer comparison which creates a defined ordering of non-finite float values. This isn't compatible with IEEE float comparison, so mixing the two leads to unwritten values in the output.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35253

Differential Revision: D20645554

Pulled By: ezyang

fbshipit-source-id: 651bcb1742ed67086ec89cc318d862caae65b981
2020-03-27 10:53:18 -04:00
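
Note on the commit above: a sketch of why integer comparison of float bit patterns and IEEE comparison disagree on non-finite values. The bit transform below is the common radix-sort trick for float32 and is an assumption for illustration; it is not copied from `at::native::radixSelect`. `float_to_ordered_key` is a hypothetical name.

```python
import math
import struct


def float_to_ordered_key(x):
    """Map a float32 to an unsigned integer whose integer order is total."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    if bits & 0x80000000:
        return bits ^ 0xFFFFFFFF  # negative: flip every bit
    return bits | 0x80000000      # non-negative: set the sign bit


nan, inf = math.nan, math.inf

# Integer order places NaN above +inf ...
nan_above_inf_as_int = float_to_ordered_key(nan) > float_to_ordered_key(inf)
# ... while every IEEE comparison involving NaN is false.
nan_above_inf_as_float = nan > inf
```

A selection pass that uses the integer order and a later pass that uses float comparison can therefore disagree about which elements qualify, which matches the mismatch the commit message describes.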
920e3eb761 Making sure all tensors in torch.cat sequence have the same dtype. (#35150)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35150

Fixes #35014

Test Plan: Imported from OSS

Differential Revision: D20578589

Pulled By: z-a-f

fbshipit-source-id: edeaef133d1cf5152dcbafab2b969f1424ee2836
2020-03-26 16:49:11 -04:00
bec01e755a Renaming: MultiLabelMarginLossFuncOptions -> MultilabelMarginLossFuncOptions, MultiLabelSoftMarginLossFuncOptions -> MultilabelSoftMarginLossFuncOptions
gh-metadata: pytorch pytorch 35163 gh/yf225/104/head
2020-03-26 14:31:21 -04:00
6a880e1bc9 Add inplace tests for several torch::nn modules / functionals
gh-metadata: pytorch pytorch 35147 gh/yf225/101/head
2020-03-26 14:31:21 -04:00
fa86e32a4e Fix F::interpolate and torch::nn::Upsample implementation
gh-metadata: pytorch pytorch 35025 gh/yf225/100/head
2020-03-26 14:31:21 -04:00
5aabaf2b18 Fix fractional_max_pool3d_with_indices implementation
gh-metadata: pytorch pytorch 35024 gh/yf225/99/head
2020-03-26 14:31:21 -04:00
4a707e8f95 Fix Conv and ConvTranspose implementation
gh-metadata: pytorch pytorch 35023 gh/yf225/98/head
2020-03-26 14:31:21 -04:00
db127b21eb Fix AdaptiveAvgPool{2,3}d and AdaptiveMaxPool{2,3}d implementation
gh-metadata: pytorch pytorch 35022 gh/yf225/97/head
2020-03-26 14:31:21 -04:00
45313cd9e1 [1.5 cherrypick] [C++ API Parity] Add xor_convergence test for lbfgs (#35440)
* add xor_convergence test for lbfgs

* increased batchsize to 6

* minor

* increased batch size

Co-authored-by: anjali411 <chourdiaanjali123@gmail.com>
2020-03-26 14:22:55 -04:00
df531973e1 [ONNX] update producer version (#35059)
Summary:
Updating producer version
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35059

Reviewed By: hl475

Differential Revision: D20585173

Pulled By: houseroad

fbshipit-source-id: af0c4e3860beb899548466ea99be2050150f905d
2020-03-26 13:56:57 -04:00
9e3c577caa Fix torch.mm export to ONNX (#34661)
Summary:
torch.mm is exported as the Gemm operator in ONNX, and both have an optional input: out.
out is considered broadcastable in Gemm, and during graph optimization the optional input (out) would get selected. Since out is optional, when it is not defined in torch.mm this results in the following exception:
IndexError: vector::_M_range_check: __n (which is 2) >= this->size() (which is 2)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34661

Reviewed By: hl475

Differential Revision: D20496398

Pulled By: houseroad

fbshipit-source-id: e677aef0a6aefb1f83a54033153aaabe5c23bc0f
2020-03-26 13:55:18 -04:00
5357b8e4d9 .circleci: Remove python 2 binary builds (#35475)
Python 2 is EOL soon so we're dropping support as of v1.5.0

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
2020-03-26 10:50:34 -07:00
0f23d23db4 Add docs to resize_ and resize_as_ (#35392)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35392

Test Plan: Imported from OSS

Differential Revision: D20650097

Pulled By: VitalyFedyunin

fbshipit-source-id: cff4f555d355dfee42394f6070fe3e466949aeb5
2020-03-26 12:23:04 -04:00
7c24280a3f Add docs about memory format (#34818)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34818

Test Plan: Imported from OSS

Differential Revision: D20601336

Pulled By: VitalyFedyunin

fbshipit-source-id: d34ad226be950bf134c6b383a4810ea6aa75599e
2020-03-26 12:23:04 -04:00
7100f0be13 ports true_divide method variant to 1.5 (#35390)
Co-authored-by: Mike Ruberry <mruberry@devfair044.maas>
2020-03-26 11:50:00 -04:00
f7f611c2ec torch.cat: disallow inputs on different devices (#35053)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/35045
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35053

Differential Revision: D20545517

Pulled By: ngimel

fbshipit-source-id: eee3fc87c7e578ff44d69d5ce6f92a8f496fa97b
2020-03-26 10:58:33 -04:00
acb982d0b0 Add TORCH_CUDA_API to FilterDescriptor (#35131)
Summary:
`FilterDescriptor` is missing a `TORCH_CUDA_API`, so this symbol is not exported from `torch_cuda.so`, and users could have trouble building cpp_extension when using cudnn.

cc: ptrblck
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35131

Differential Revision: D20604439

Pulled By: ezyang

fbshipit-source-id: c57414fc8a9df9cb1e910e2ec0a48cfdbe7d1779
2020-03-26 10:57:59 -04:00
aa8b7ad989 Fix thread_local initialization in C10 WarningHandler. (#34822)
Summary:
The Windows + MSVC-specific bug discussed here: https://github.com/pytorch/pytorch/issues/19394 and fixed here: https://github.com/pytorch/pytorch/issues/22405 still appears in C10's warning handler class. This results in a crash if a user attempts to run code which would print a warning when that code is running inside a thread created by a DLL. This PR applies a similar fix to that of https://github.com/pytorch/pytorch/issues/22405.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34822

Test Plan:
* Tested locally by running CodecverseWorkbench Unity app with patched build.
* CI

Differential Revision: D20627971

Pulled By: HapeMask

fbshipit-source-id: 64dfca531ed7eebbe9e0ecac3d3d4d025c683883
2020-03-25 20:02:45 -07:00
2d403ed8be Add python exception handling catch block to resolve deadlock (#35283) (#35402)
Summary:
Note: This PR has been merged into master after the 1.5.0 branch cut at
36e3c00 (see original PR: #35283). This PR is to cherry pick it into 1.5.

---- Original Commit Description Follows ---

Pull Request resolved: https://github.com/pytorch/pytorch/pull/35283

https://github.com/pytorch/pytorch/issues/34260

Deadlock on destructing py::error_already_set.

There are request callback impls in Python, where Python exceptions
could be thrown. The GIL must be held while releasing the py::objects
owned by a Python exception.

Differential Revision: D7753253

fbshipit-source-id: 4bfaaaf027e4254f5e3fedaca80228c8b4282e39

Co-authored-by: Shihao Xu <shihaoxu@fb.com>
2020-03-25 17:05:18 -07:00
c25a664f77 Trying pinning pyyaml and setuptools on macos to older version (#35296) (#35400)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35296

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Differential Revision: D20624843

Pulled By: ezyang

fbshipit-source-id: 9028f1dd62d0c25e916eb4927fd8dd6acbd88886
(cherry picked from commit 3f896ef7435201b2c3f51851f80dc674dfadfd40)
Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

Co-authored-by: Edward Yang <ezyang@fb.com>
2020-03-25 16:04:06 -07:00
ab660ae394 Fix Tensor __radd__ type hint issue (#35231)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35231

Fixes #35213

(Note: this ignores all push blocking failures!)

Test Plan: `mypy -c "import torch; ten = torch.tensor([1.0, 2.0, 3.0]); print(7 + ten)"` should not produce any warnings

Differential Revision: D20604924

Pulled By: pbelevich

fbshipit-source-id: 53a293a99b3f2ab6ca5516b31f3a92f67eb67a39
2020-03-25 18:37:07 -04:00
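For context on why `7 + ten` involves `__radd__` at all: `int.__add__` returns `NotImplemented` for an unknown right operand, so Python falls back to the right operand's reflected method, and a type checker needs that method to be annotated. The sketch below uses a hypothetical `Vec` class as a stand-in; it is not PyTorch's actual type stub and only illustrates the shape of such an annotation.

```python
from typing import List, Union

Number = Union[int, float]


class Vec:
    """Hypothetical tensor-like type used only for illustration."""

    def __init__(self, values: List[float]) -> None:
        self.values = list(values)

    def __add__(self, other: Number) -> "Vec":
        return Vec([v + other for v in self.values])

    def __radd__(self, other: Number) -> "Vec":
        # Reached for `7 + vec` once int.__add__ has returned NotImplemented.
        return self.__add__(other)


result = 7 + Vec([1.0, 2.0, 3.0])
print(result.values)
```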
3c476a8858 PyTorch should always depend on future (#35057) (#35412)
Summary:
Because `past` is used in `caffe2.python.core`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35057

Test Plan: CI

Differential Revision: D20547042

Pulled By: malfet

fbshipit-source-id: cad2123c7b88271fea37f21e616df551075383a8
(cherry picked from commit d3f5045bf55e4a5dfb53ceccb6130e4e408cf466)
Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

Co-authored-by: Nikita Shulga <nshulga@fb.com>
2020-03-25 14:54:26 -07:00
651fa88645 Load all DLLs in the lib directory for Windows (v.1.5.0) 2020-03-25 16:23:22 -04:00
565c3400b4 Update view op list. 2020-03-25 16:14:08 -04:00
3e332778b4 non blocking copy from #35144 2020-03-25 14:54:41 -04:00
f598738920 UBSAN deliberate float to int fix 2020-03-25 11:24:30 -04:00
4c6bfa0187 [1.5 cherrypick][JIT] Namespaces for TorchBind 2020-03-25 11:23:03 -04:00
6f25003682 [1.5 cherrypick][JIT] BC shim for TorchBind classes 2020-03-25 11:23:03 -04:00
752c129fa1 Update docs about DP and DDP for CUDA (#35063)
Summary:
We should recommend DDP instead of DP. Hope we can also cherry-pick this for 1.5
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35063

Differential Revision: D20549621

Pulled By: ngimel

fbshipit-source-id: 86b1b2134664065cc6070ea4212895f993eaf543
2020-03-25 11:18:17 -04:00
fb59a9caca .circleci: Change default CUDA for pip, cu101 -> cu102 (#35310)
So that packages are correctly marked when looking through the html
pages.

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
2020-03-24 15:05:25 -07:00
4d30dbdd35 Pin XLA CI to use r1.5 release branch. 2020-03-24 17:54:31 -04:00
b7f4a1a397 .circleci: Switch master to release/1.5 for git merge (#35320)
Since we're on a release branch we'll need to fix this up to do a merge
for release/1.5 instead of master.

TODO: In the future we should have a dynamic way of gathering the base
branch for PRs.

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
2020-03-24 14:52:24 -07:00
afda1dc943 Revert "Fix AdaptiveAvgPool{2,3}d and AdaptiveMaxPool{2,3}d implementation"
This reverts commit e2184ba08352d730d7165455c14f783b3e54082a.
2020-03-24 14:09:18 -04:00
d506ae882b Revert "Fix Conv and ConvTranspose implementation"
This reverts commit 88778854546b08bc6dd9f68e0a64311902c7d30c.
2020-03-24 14:09:18 -04:00
36e5abe531 Revert "Fix fractional_max_pool3d_with_indices implementation"
This reverts commit b89eb7c654b846fb3391cf4cc5aeb536cc41f1d7.
2020-03-24 14:09:18 -04:00
6e6f62230e Revert "Fix F::interpolate and torch::nn::Upsample implementation"
This reverts commit 75148df1f56c91f54965b530d606a6b9a4c8e269.
2020-03-24 14:09:18 -04:00
5d15577e6c Revert "Add inplace tests for several torch::nn modules / functionals"
This reverts commit 48590d6a9b939fb8097e4f2108872721ea5a516f.
2020-03-24 14:09:18 -04:00
6aa5298c5c Revert "Renaming: MultiLabelMarginLossFuncOptions -> MultilabelMarginLossFuncOptions, MultiLabelSoftMarginLossFuncOptions -> MultilabelSoftMarginLossFuncOptions"
This reverts commit 5ca901431886d60687275b9a310eac5b5aeba02f.
2020-03-24 14:09:18 -04:00
f3df13725b Revert "[1.5 cherrypick] [C++ API Parity] Add xor_convergence test for lbfgs (#35113)"
This reverts commit 246b824644c3731b00be6119f69795afd4eac9b6.
2020-03-24 14:08:56 -04:00
4eee3caa11 [release/1.5] .circleci: Fix unbound CIRCLE_TAG variable (#35242)
Was failing when trying to execute this script on a non-tag

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
2020-03-23 16:21:44 -07:00
4d96463130 Updating fbgemm 2020-03-23 13:31:24 -07:00
246b824644 [1.5 cherrypick] [C++ API Parity] Add xor_convergence test for lbfgs (#35113)
* add xor_convergence test for lbfgs

* increased batchsize to 6

* minor

* increased batch size
2020-03-23 16:00:57 -04:00
5ca9014318 Renaming: MultiLabelMarginLossFuncOptions -> MultilabelMarginLossFuncOptions, MultiLabelSoftMarginLossFuncOptions -> MultilabelSoftMarginLossFuncOptions 2020-03-23 15:55:18 -04:00
48590d6a9b Add inplace tests for several torch::nn modules / functionals
gh-metadata: pytorch pytorch 35147 gh/yf225/101/head
2020-03-23 15:55:18 -04:00
75148df1f5 Fix F::interpolate and torch::nn::Upsample implementation
gh-metadata: pytorch pytorch 35025 gh/yf225/100/head
2020-03-23 15:55:18 -04:00
b89eb7c654 Fix fractional_max_pool3d_with_indices implementation
gh-metadata: pytorch pytorch 35024 gh/yf225/99/head
2020-03-23 15:55:18 -04:00
8877885454 Fix Conv and ConvTranspose implementation
gh-metadata: pytorch pytorch 35023 gh/yf225/98/head
2020-03-23 15:55:18 -04:00
e2184ba083 Fix AdaptiveAvgPool{2,3}d and AdaptiveMaxPool{2,3}d implementation
gh-metadata: pytorch pytorch 35022 gh/yf225/97/head
2020-03-23 15:55:18 -04:00
8ef47ad2f0 Updating fbgemm 2020-03-23 10:08:52 -07:00
6725b6f503 .cirlceci: Refactor how to grab the tagged version
Discovered that the upload scripts do not do well when there's no
pytorch repository to actually do git operations on.

CircleCI however provides a nice environment variable with the name of
the current tag so let's just use that when it's available and fall back
on the git describe functionality if that fails.

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
2020-03-19 16:34:57 -07:00
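The fallback described in this commit (use the CI-provided tag when present, otherwise ask git) might look roughly like the sketch below. The function name `resolve_tag`, the exact `git describe` flags, and the `/pytorch` default are assumptions made for illustration; the real logic lives in the CircleCI shell scripts, which are not shown here.

```python
import os
import subprocess
from typing import Mapping, Optional


def resolve_tag(env: Mapping[str, str], repo_dir: str = "/pytorch") -> Optional[str]:
    """Prefer CIRCLE_TAG; otherwise fall back to git describe (assumed flags)."""
    tag = env.get("CIRCLE_TAG", "")
    if tag:
        return tag
    cmd = ["git", "--git-dir", os.path.join(repo_dir, ".git"),
           "describe", "--tags", "--exact-match"]
    try:
        out = subprocess.run(cmd, capture_output=True, text=True, check=True)
    except (OSError, subprocess.CalledProcessError):
        # No git binary, no repository at repo_dir, or no tag on this commit.
        return None
    return out.stdout.strip() or None
```

Passing the environment in as a mapping (for example `resolve_tag(os.environ)`) keeps the function easy to exercise without a real CI run.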
bcd3f6da1a .circleci: Remove quotes from --git-dir
git doesn't handle the escaped quotes correctly, so let's just leave them
out altogether.

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
2020-03-19 15:39:31 -07:00
0b3d2f7b7d .circleci: Make sure to add .git to --git-dir
--git-dir only works when it points directly to a .git folder

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
2020-03-19 15:28:23 -07:00
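The point about `--git-dir` needing the `.git` folder itself, rather than the checkout root, can be sketched as a small path helper. The helper names below are hypothetical and the `describe --tags` arguments are an assumption; only the `--git-dir <root>/.git` shape comes from the commit message.

```python
import posixpath
from typing import List


def git_dir_for(repo_root: str) -> str:
    """Return the path to hand to `git --git-dir` for a checkout root."""
    root = repo_root.rstrip("/") or "/"
    if posixpath.basename(root) == ".git":
        return root  # already points at the .git folder
    return posixpath.join(root, ".git")


def describe_argv(repo_root: str) -> List[str]:
    # Arguments are passed as a list, so no shell quoting is involved.
    return ["git", "--git-dir", git_dir_for(repo_root), "describe", "--tags"]


print(describe_argv("/pytorch"))
```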
f522651a7e .circleci: Switch git -C -> git --git-dir
Older versions of git do not contain the '-C' flag so let's switch to a
flag that is pre-historic and will run on any version of RHEL that is
still supported in the modern era.

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
2020-03-19 15:22:44 -07:00
01c8ef2757 .circleci: One more -C to add to get correct git info
Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
2020-03-19 15:08:02 -07:00
7cfe68ce3a .circleci: Hardcode directory to /pytorch to ensure git
Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
2020-03-19 14:54:57 -07:00
6f3120c6b9 .circleci: Ensure describe happens in pytorch repo
Found an issue where the git describe wasn't properly executed since the
binary_populate_env.sh script was being executed from a different
directory.

'git -C' forces the describe to run in the running directory for the
script which should contain the correct git information

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
2020-03-19 14:24:18 -07:00
7674 changed files with 276192 additions and 1104368 deletions


@ -1,63 +0,0 @@
# PyTorch CI Builds Pipeline on Azure DevOps
#
# This pipeline:
# 1) builds PyTorch on select configurations
# 2) runs only TestTorch unit tests.
stages:
- stage: 'Build'
displayName: 'Build PyTorch'
jobs:
- template: job_templates/build-verify-publish-template-unix.yml
parameters:
name: ubuntu_1804_CPU_docker
pool: 'PyTorch-Linux-CPU'
container_endpoint: pytorchms.azurecr.io
build_stage: True
is_ci_build: True
os: ubuntu
cuda: cpu
customMatrixes:
Py_38:
configuration: ubuntu_1804_py_38_cpu
container_image: pytorchms.azurecr.io/ubuntu_1804_py_38_cpu_dev
- template: job_templates/build-verify-publish-template-unix.yml
parameters:
name: ubuntu_1804_GPU_docker
pool: 'PyTorch-Linux-GPU'
container_endpoint: pytorchms.azurecr.io
build_stage: True
is_ci_build: True
os: ubuntu
cuda: gpu
customMatrixes:
Py_39_CUDA_112_cuDNN_810:
configuration: ubuntu_1804_py_39_cuda_112_cudnn_810
container_image: pytorchms.azurecr.io/ubuntu_1804_py_39_cuda_112_cudnn_8_dev
CUDA_VERSION: 112
- template: job_templates/build-verify-publish-template-win.yml
parameters:
name: windows_2019_CPU
pool: 'PyTorch-Win-CPU'
build_stage: True
is_ci_build: True
os: windows
cuda: cpu
customMatrixes:
Py_37:
configuration: windows_2019_py_37_cpu
- template: job_templates/build-verify-publish-template-win.yml
parameters:
name: windows_2019_GPU
pool: 'PyTorch-Win-GPU'
build_stage: True
is_ci_build: True
os: windows
cuda: gpu
customMatrixes:
Py_38_CUDA_102_cuDNN_765:
configuration: windows_2019_py_38_cuda_102_cudnn_765
CUDA_VERSION: 102


@ -1,82 +0,0 @@
# PyTorch Daily Builds Pipeline on Azure DevOps
#
# This pipeline:
# 1) builds PyTorch on all available configurations
# 2) runs all PyTorch unit tests
stages:
- stage: 'BuildTest'
displayName: 'Build and Test PyTorch'
jobs:
- template: job_templates/build-verify-publish-template-unix.yml
parameters:
name: ubuntu_1804_CPU_docker
pool: 'PyTorch-Linux-CPU'
container_endpoint: pytorchms.azurecr.io
build_stage: True
is_daily_build: True
os: ubuntu
cuda: cpu
customMatrixes:
Py_38:
configuration: ubuntu_1804_py_38_cpu
container_image: pytorchms.azurecr.io/ubuntu_1804_py_38_cpu_dev
Py_37:
configuration: ubuntu_1804_py_37_cpu
container_image: pytorchms.azurecr.io/ubuntu_1804_py_37_cpu_dev
- template: job_templates/build-verify-publish-template-unix.yml
parameters:
name: ubuntu_1804_GPU_docker
pool: 'PyTorch-Linux-GPU'
container_endpoint: pytorchms.azurecr.io
build_stage: True
is_daily_build: True
os: ubuntu
cuda: gpu
customMatrixes:
Py_39_CUDA_112_cuDNN_810:
configuration: ubuntu_1804_py_39_cuda_112_cudnn_810
container_image: pytorchms.azurecr.io/ubuntu_1804_py_39_cuda_112_cudnn_8_dev
CUDA_VERSION: 112
Py_38_CUDA_102_cuDNN_810:
configuration: ubuntu_1804_py_38_cuda_102_cudnn_810
container_image: pytorchms.azurecr.io/ubuntu_1804_py_38_cuda_102_cudnn_8_dev
CUDA_VERSION: 102
Py_37_CUDA_101_cuDNN_765:
configuration: ubuntu_1804_py_37_cuda_101_cudnn_765
container_image: pytorchms.azurecr.io/ubuntu_1804_py_37_cuda_101_cudnn_7_dev
CUDA_VERSION: 101
- template: job_templates/build-verify-publish-template-win.yml
parameters:
name: windows_2019_CPU
pool: 'PyTorch-Win-CPU'
build_stage: True
is_daily_build: True
os: windows
cuda: cpu
customMatrixes:
Py_38:
configuration: windows_2019_py_38_cpu
Py_37:
configuration: windows_2019_py_37_cpu
- template: job_templates/build-verify-publish-template-win.yml
parameters:
name: windows_2019_GPU
pool: 'PyTorch-Win-GPU'
build_stage: True
is_daily_build: True
os: windows
cuda: gpu
customMatrixes:
Py_39_CUDA_112_cuDNN_810:
configuration: windows_2019_py_39_cuda_112_cudnn_810
CUDA_VERSION: 112
Py_38_CUDA_102_cuDNN_765:
configuration: windows_2019_py_38_cuda_102_cudnn_765
CUDA_VERSION: 102
Py_37_CUDA_101_cuDNN_764:
configuration: windows_2019_py_37_cuda_101_cudnn_764
CUDA_VERSION: 101


@ -1,134 +0,0 @@
# PyTorch build steps template with Unix images Azure DevOps Instances
#
# This build depends on 3 parameters set as environment variables in the pipeline:
# - AZURE_DEVOPS_CLI_PAT: Secret var for authenticating to Azure DevOps
# - AZURE_DEVOPS_ARTIFACTS_ORGANIZATION: Azure Artifacts Organization name to publish artifacts
# - AZURE_DEVOPS_ARTIFACTS_PROJECT: Azure Artifacts Project name to publish artifacts
parameters:
name: ''
pool: ''
container_endpoint: ''
os: ''
cuda: ''
is_ci_build: False
is_official_build: False
is_daily_build: False
build_stage: False
verify_stage: False
publish_stage: False
customMatrixes: ''
jobs:
- job: ${{parameters.name}}
timeoutInMinutes: 300
strategy:
matrix:
${{ insert }}: ${{parameters.customMatrixes}}
pool:
name: ${{ parameters.pool}}
variables:
DECODE_PERCENTS: false
container:
image: $[variables['container_image']]
endpoint: ${{parameters.container_endpoint}}
steps:
# Build stage
- ${{ if eq(parameters.build_stage, 'True') }}:
# Set up environment variables for specific pipeline build
- template: set-environment-variables.yml
parameters:
os: ${{ parameters.os}}
cuda: ${{ parameters.cuda}}
is_official_build: ${{ parameters.is_official_build}}
# Sync and update PyTorch submodules
- bash: git submodule update --init --recursive --jobs 0
displayName: Update PyTorch submodules
# Build PyTorch and run unit tests - no packaging
- ${{ if or(eq(parameters.is_ci_build, 'True'), eq(parameters.is_daily_build, 'True')) }}:
# Build PyTorch from source in develop mode
- bash: python setup.py develop
displayName: Build PyTorch from source
- ${{ if eq(parameters.is_ci_build, 'True') }}:
# Run TestTorch unit tests to demonstrate successful PyTorch build
- bash: python test/test_torch.py TestTorch
displayName: Run TestTorch unit tests
- ${{ if eq(parameters.is_daily_build, 'True') }}:
# Run all unit tests to demonstrate successful PyTorch build
- bash: python test/run_test.py --continue-through-error --exclude-jit-executor --verbose
displayName: Run all unit tests
# Run ComponentGovernance
- task: ComponentGovernanceComponentDetection@0
inputs:
scanType: 'Register'
verbosity: 'Verbose'
alertWarningLevel: 'High'
# Build PyTorch and produce artifacts for verification stage
- ${{ if eq(parameters.is_official_build, 'True') }}:
# Build PyTorch from source in install mode and exclude test binaries
- bash: python setup.py install
displayName: Build PyTorch from source without test binaries
# Package PyTorch Wheel
- bash: python setup.py bdist_wheel
displayName: Package PyTorch Wheel
# Publish PyTorch Wheel
- task: PublishPipelineArtifact@1
inputs:
targetPath: $(Build.SourcesDirectory)/dist/
artifactName: Build_$(Build.BuildNumber)_$(configuration)
displayName: Publish PyTorch Wheel to Pipeline Artifacts
# Verification stage
- ${{ if eq(parameters.verify_stage, 'True') }}:
# Download PyTorch Wheel
- task: DownloadPipelineArtifact@2
inputs:
artifact: Build_$(Build.BuildNumber)_$(configuration)
path: $(Build.SourcesDirectory)/verify
displayName: Download PyTorch Wheel
# Install PyTorch Wheel on Linux
- bash: python -m pip install $(Build.SourcesDirectory)/verify/torch*linux*.whl
displayName: Install PyTorch Wheel
# Ensure PyTorch installed correctly from produced wheel
- bash: |
cd $(Build.SourcesDirectory)/verify
python -c "import torch; print('Installed Torch version: ' + torch.__version__)"
displayName: Check PyTorch correctly installed from wheel
# Publishing stage
- ${{ if eq(parameters.publish_stage, 'True') }}:
# Download PyTorch Wheel
- task: DownloadPipelineArtifact@2
inputs:
artifact: Build_$(Build.BuildNumber)_$(configuration)
path: $(Build.SourcesDirectory)/publish
displayName: Download PyTorch Wheel
# Publish wheel to Azure Artifacts
# The flag continueOnError=true is needed as the artifact to be published
# may already exist, because the artifact is differentiated based on the
# last commit date.
- bash: |
export TORCH_VERSION=$(head -c 5 ./version.txt)
export LAST_COMMIT=$(git rev-parse --short HEAD)
export LAST_COMMIT_DATE=$(git log -1 --pretty=%ad --date=format:%Y%m%d)
cd $(Build.SourcesDirectory)/publish
export TORCH_WHEEL=$(echo torch*linux*whl)
az extension add -n azure-devops
echo $ADOTOKEN | az devops login
az artifacts universal publish --organization $AZURE_DEVOPS_ARTIFACTS_ORGANIZATION --project $AZURE_DEVOPS_ARTIFACTS_PROJECT --scope project --feed "PyTorch" --name $TORCH_WHEEL --description "PyTorch Official Build Artifact" --version $TORCH_VERSION-$LAST_COMMIT_DATE-$LAST_COMMIT --path .
env:
ADOTOKEN: $(AZURE_DEVOPS_CLI_PAT)
continueOnError: true
displayName: Upload PyTorch Official Build package to Azure Artifacts
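The `--version` argument in the publish step above is assembled from three pieces: the first five bytes of `version.txt`, the last commit date, and the short commit hash. A minimal Python sketch of that composition follows; the function name and the sample values are hypothetical.

```python
def artifact_version(version_txt: str, commit_date: str, short_sha: str) -> str:
    """Mirror `$TORCH_VERSION-$LAST_COMMIT_DATE-$LAST_COMMIT` from the publish step."""
    torch_version = version_txt[:5]  # counterpart of `head -c 5 ./version.txt`
    return f"{torch_version}-{commit_date}-{short_sha}"


# Hypothetical inputs, chosen only to show the resulting format.
print(artifact_version("1.5.0a0\n", "20200326", "9e3c577"))
```

One consequence of the fixed five-character cut is that a version with a two-digit component, such as `1.10.0`, would be truncated to `1.10.`, so the scheme implicitly assumes single-digit version components.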


@ -1,150 +0,0 @@
# PyTorch build steps template with Windows images Azure DevOps Instances
#
# This build depends on 3 parameters set as environment variables in the pipeline:
# - AZURE_DEVOPS_CLI_PAT: Secret var for authenticating to Azure DevOps
# - AZURE_DEVOPS_ARTIFACTS_ORGANIZATION: Azure Artifacts Organization name to publish artifacts
# - AZURE_DEVOPS_ARTIFACTS_PROJECT: Azure Artifacts Project name to publish artifacts
parameters:
name: ''
pool: ''
os: ''
cuda: ''
is_ci_build: False
is_official_build: False
is_daily_build: False
build_stage: False
verify_stage: False
publish_stage: False
customMatrixes: ''
jobs:
- job: ${{parameters.name}}
timeoutInMinutes: 300
strategy:
matrix:
${{ insert }}: ${{parameters.customMatrixes}}
pool:
name: ${{ parameters.pool}}
variables:
CMAKE_GENERATOR: Ninja
PACKAGE_PDBS: 0
steps:
# Prepare for PyTorch build on Windows
- template: prepare-build-template.yml
parameters:
configuration: $(configuration)
build_stage: ${{ parameters.build_stage}}
# Build Stage
- ${{ if eq(parameters.build_stage, 'True') }}:
# Set up environment variables for specific pipeline build
- template: set-environment-variables.yml
parameters:
os: ${{ parameters.os}}
cuda: ${{ parameters.cuda}}
is_official_build: ${{ parameters.is_official_build}}
# Sync and update PyTorch submodules
- script: git submodule update --init --recursive --jobs 0
displayName: Update PyTorch submodules
# Build PyTorch and run unit tests - no packaging
- ${{ if or(eq(parameters.is_ci_build, 'True'), eq(parameters.is_daily_build, 'True')) }}:
# Build PyTorch from source in develop mode with Ninja
- script: call activate $(configuration) && python setup.py develop
displayName: Build PyTorch from source
- ${{ if eq(parameters.is_ci_build, 'True') }}:
# Run TestTorch unit tests to demonstrate successful PyTorch build
- script: call activate $(configuration) && python test\test_torch.py TestTorch
displayName: Run TestTorch unit tests
- ${{ if eq(parameters.is_daily_build, 'True') }}:
# Run all unit tests to demonstrate successful PyTorch build
- script: call activate $(configuration) && python test/run_test.py --continue-through-error --exclude-jit-executor --verbose
displayName: Run all unit tests
# Run ComponentGovernance
- task: ComponentGovernanceComponentDetection@0
inputs:
scanType: 'Register'
verbosity: 'Verbose'
alertWarningLevel: 'High'
# Build PyTorch and produce artifacts for verification stage
- ${{ if eq(parameters.is_official_build, 'True') }}:
# Build PyTorch from source in install mode with Ninja and exclude test binaries
- script: call activate $(configuration) && python setup.py install
displayName: Build PyTorch from source without test binaries
# Package PyTorch Wheel
- script: call activate $(configuration) && python setup.py bdist_wheel
displayName: Package PyTorch Wheel
# Publish PyTorch Wheel
- task: PublishPipelineArtifact@1
inputs:
targetPath: $(Build.SourcesDirectory)\dist\
artifactName: Build_$(Build.BuildNumber)_$(configuration)
displayName: Publish PyTorch Wheel to Pipeline Artifacts
# Verification Stage
- ${{ if eq(parameters.verify_stage, 'True') }}:
# Download PyTorch Wheel
- task: DownloadPipelineArtifact@2
inputs:
artifact: Build_$(Build.BuildNumber)_$(configuration)
path: $(Build.SourcesDirectory)\verify
displayName: Download PyTorch Wheel
# Install PyTorch Wheel on Windows
- script: |
call activate $(configuration)
cd $(Build.SourcesDirectory)\verify
dir torch*win*.whl /b > whl.txt
set /p whl= < whl.txt
python -m pip install %whl%
displayName: Install PyTorch Wheel
# Ensure PyTorch installed correctly from produced wheel
- script: |
call activate $(configuration)
cd $(Build.SourcesDirectory)\verify
python -c "import torch; print('Installed Torch version: ' + torch.__version__)"
displayName: Check PyTorch correctly installed from wheel
# Publishing stage
- ${{ if eq(parameters.publish_stage, 'True') }}:
# Download PyTorch Wheel
- task: DownloadPipelineArtifact@2
inputs:
artifact: Build_$(Build.BuildNumber)_$(configuration)
path: $(Build.SourcesDirectory)\publish
displayName: Download PyTorch Wheel
# Set up Azure Artifacts for Windows
# The pip install --upgrade command is a bug fix for Azure CLI on Windows
# More info: https://github.com/Azure/azure-cli/issues/16858
- script: |
pip install --upgrade pip --target \opt\az\lib\python3.6\site-packages\
az extension add -n azure-devops
displayName: Set up Azure Artifacts download on Windows
# Publish wheel to Azure Artifacts
# The flag continueOnError=true is needed as the artifact to be published
# may already exist, because the artifact is differentiated based on the
# last commit date.
- script: |
set /p TORCH_VERSION= < version.txt
cd $(Build.SourcesDirectory)\publish
git rev-parse --short HEAD > last_commit.txt && set /p LAST_COMMIT= < last_commit.txt
git log -1 --pretty=%ad --date=format:%Y%m%d > last_commit_date.txt && set /p LAST_COMMIT_DATE= < last_commit_date.txt
dir torch*win*.whl /b > whl.txt && set /p TORCH_WHEEL= < whl.txt
echo %ADOTOKEN% | az devops login
az artifacts universal publish --organization %AZURE_DEVOPS_ARTIFACTS_ORGANIZATION% --project %AZURE_DEVOPS_ARTIFACTS_PROJECT% --scope project --feed "PyTorch" --name %TORCH_WHEEL% --description "PyTorch Official Build Artifact" --version %TORCH_VERSION:~0,5%-%LAST_COMMIT_DATE%-%LAST_COMMIT% --path .
env:
ADOTOKEN: $(AZURE_DEVOPS_CLI_PAT)
continueOnError: true
displayName: Upload PyTorch nightly package to Azure Artifacts


@ -1,17 +0,0 @@
dependencies:
- python=PYTHON_VERSION
- numpy
- ninja
- pyyaml
- mkl
- mkl-include
- setuptools
- cmake
- cffi
- typing_extensions
- future
- six
- requests
- dataclasses
- pip:
- -r ../../requirements.txt


@ -1,26 +0,0 @@
parameters:
name: ''
pool: ''
customMatrixes: ''
jobs:
- job: ${{parameters.name}}
timeoutInMinutes: 600
strategy:
matrix:
${{ insert }}: ${{parameters.customMatrixes}}
pool:
name: ${{ parameters.pool}}
steps:
# Clone PyTorch Tests repository
- bash: |
B64_PAT=$(echo -n ":$_ADOTOKEN" | base64)
git -c http.extraHeader="Authorization: Basic ${B64_PAT}" clone $(AZURE_DEVOPS_PYTORCH_TESTS_REPO_URL)
cd pytorch_tests
git checkout $(PYTORCH_TESTS_CHECKOUT_BRANCH)
env:
_ADOTOKEN: $(AZURE_DEVOPS_CLI_PAT)
displayName: Clone PyTorch Tests repo
- bash: |
bash $(Build.SourcesDirectory)/pytorch_tests/webapp/notify_webapp.sh
displayName: Notify Webapp


@ -1,62 +0,0 @@
# Build prepare steps for PyTorch on Azure DevOps to build from source.
# These steps are shared between the normal build process and Semmle security scan tasks
parameters:
build_stage: False
configuration: ''
steps:
# End Python tasks that may be lingering over from previous runs
# Note: If python.exe isn't currently running, exit code becomes 128,
# which fails the run. Here exit code is set to 0 to avoid failed run.
- script: |
taskkill /f /im python.exe
IF %ERRORLEVEL% EQU 128 exit 0
displayName: End previous Python processes
# Clean up env directory in conda for fresh builds and set up conda environment YAML
- powershell: |
Remove-Item 'C:\Miniconda\envs' -Recurse -ErrorAction Ignore
$env:PYTHON_VERSION = $env:SYSTEM_JOBNAME.Substring(3,1) + '.' + $env:SYSTEM_JOBNAME.Substring(4,1)
(Get-Content .azure_pipelines\job_templates\common-packages.yml) -replace 'PYTHON_VERSION', $env:PYTHON_VERSION | Out-File -encoding ASCII .azure_pipelines\job_templates\common-packages.yml
displayName: Clean up previous environments and Set up conda environment YAML
# Make conda environment and install required packages
- script: |
call conda clean --all -y
call conda env create -n $(configuration) --file .azure_pipelines\job_templates\common-packages.yml
call activate $(configuration)
call conda install -c conda-forge libuv=1.39
displayName: Set up conda environment for building from source
- ${{ if eq(parameters.build_stage, 'True') }}:
# Install MKL
- script: |
rmdir /s /q mkl
del mkl_2020.2.254.7z
curl https://s3.amazonaws.com/ossci-windows/mkl_2020.2.254.7z -k -O
7z x -aoa mkl_2020.2.254.7z -omkl
displayName: Install MKL
# Install sccache and randomtemp
# Related PyTorch GitHub issue: https://github.com/pytorch/pytorch/issues/25393
# Related fix: https://github.com/pytorch/builder/pull/448/
- script: |
mkdir .\tmp_bin
curl -k https://s3.amazonaws.com/ossci-windows/sccache.exe --output .\tmp_bin\sccache.exe
curl -k https://s3.amazonaws.com/ossci-windows/sccache-cl.exe --output .\tmp_bin\sccache-cl.exe
copy .\tmp_bin\sccache.exe .\tmp_bin\nvcc.exe
curl -kL https://github.com/peterjc123/randomtemp-rust/releases/download/v0.3/randomtemp.exe --output .\tmp_bin\randomtemp.exe
displayName: Install sccache and randomtemp
condition: not(eq(variables.CUDA_VERSION, ''))
# CUDA 11.2's CUB directory conflicts with CUDA 10.2 and 10.1
# builds, where CUDA 11.2's CUB is injected into non-CUDA
# 11.2 builds.
- powershell: Remove-Item "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2\include\cub" -Recurse -ErrorAction Ignore
displayName: Remove conflicting CUB from CUDA installation
condition: not(eq(variables.CUDA_VERSION, ''))
- powershell: Copy-Item -Path "F:\cuda_11_2\cub\" -Destination "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2\include" -Recurse
displayName: Copy CUDA CUB for CUDA 11.2 build
condition: eq(variables.CUDA_VERSION, '112')
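The PowerShell step in this template derives `PYTHON_VERSION` from two fixed character positions of the job name. Assuming `SYSTEM_JOBNAME` holds a matrix key such as `Py_38` (an assumption based on the matrix names used in the pipelines above), the same indexing can be sketched in Python; the function name is hypothetical.

```python
def python_version_from_job(job_name: str) -> str:
    """Mirror Substring(3,1) + '.' + Substring(4,1) from the PowerShell step."""
    return job_name[3] + "." + job_name[4]


for name in ("Py_38", "Py_39_CUDA_112_cuDNN_810"):
    print(name, "->", python_version_from_job(name))
```

Because exactly one character is taken for the minor version, a key like `Py_310` would yield `3.1` rather than `3.10`; the approach only holds while minor versions stay single-digit.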


@ -1,61 +0,0 @@
# PyTorch build steps template with Unix images Azure DevOps Instances
#
# This build depends on 5 parameters set as environment variables in the pipeline:
# - AZURE_DEVOPS_CLI_PAT: Secret var for authenticating to Azure DevOps
# - AZURE_STORAGE_KEY: Secret var for authenticating to Azure Storage
# - _TS_CLONE_P, _TS_P, _TS_SM_P: Secret vars for specific unit tests
parameters:
name: ''
pool: ''
container_endpoint: ''
customMatrixes: ''
jobs:
- job: ${{parameters.name}}
timeoutInMinutes: 600
strategy:
matrix:
${{ insert }}: ${{parameters.customMatrixes}}
pool:
name: ${{ parameters.pool}}
variables:
DECODE_PERCENTS: false
steps:
# Don't check out repo contents to save time and CPU compute. Environment variables
# related to checkout branch such as $(BUILD_SOURCEBRANCH) are still available.
- checkout: none
# Delete pytorch_tests repo from previous builds if exists
- bash: rm -rf pytorch_tests/
displayName: Delete pytorch_tests repo from previous builds if exists
# Clone PyTorch Tests repository
- bash: |
B64_PAT=$(echo -n ":$_ADOTOKEN" | base64)
git -c http.extraHeader="Authorization: Basic ${B64_PAT}" clone $(AZURE_DEVOPS_PYTORCH_TESTS_REPO_URL)
cd pytorch_tests
git checkout $(PYTORCH_TESTS_CHECKOUT_BRANCH)
env:
_ADOTOKEN: $(AZURE_DEVOPS_CLI_PAT)
displayName: Clone PyTorch Tests repo
# Run PyTorch Unit Tests
- bash: bash $(Build.SourcesDirectory)/pytorch_tests/scripts/linux/run.sh
env:
_AZURE_STORAGE_KEY: $(AZURE_STORAGE_KEY)
_TS_CLONE_P: $(TS_CLONE_PASSWORD)
_TS_P: $(TS_PAT)
_TS_SM_P: $(TS_SM_PAT)
_AZUREML_CLONE_PASSWORD: $(AZUREML_CLONE_PASSWORD)
_SPPASSWORD: $(SPPASSWORD)
displayName: Run PyTorch Unit Tests
# Test results are available outside the docker container since
# the current directory is mounted as a volume of the container.
- task: PublishTestResults@2
condition: always()
inputs:
testResultsFiles: '**/test-*.xml'
testRunTitle: 'Publish test results for Python'
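The clone step in this template authenticates by base64-encoding the string `:<PAT>` (empty user name, token as password) and passing it as an HTTP Basic `Authorization` header. A minimal Python sketch of that encoding follows; the function name is hypothetical and the token shown is a placeholder, not a real credential.

```python
import base64


def basic_auth_header(pat: str) -> str:
    """Build the header value used with `git -c http.extraHeader=...`."""
    # Empty user name, PAT as password: the same ":$_ADOTOKEN" layout as the bash step.
    token = base64.b64encode(f":{pat}".encode("utf-8")).decode("ascii")
    return f"Authorization: Basic {token}"


print(basic_auth_header("abc"))
```

Unlike GNU `base64`, which wraps long output at 76 columns by default, `b64encode` never inserts line breaks, which matters if the encoded value is long enough to wrap.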


@ -1,57 +0,0 @@
# PyTorch build steps template with Windows images Azure DevOps Instances
#
# This build depends on 5 parameters set as environment variables in the pipeline:
# - AZURE_DEVOPS_CLI_PAT: Secret var for authenticating to Azure DevOps
# - AZURE_STORAGE_KEY: Secret var for authenticating to Azure Storage
# - _TS_CLONE_P, _TS_P, _TS_SM_P: Secret vars for specific unit tests
parameters:
name: ''
pool: ''
customMatrixes: ''
jobs:
- job: ${{parameters.name}}
timeoutInMinutes: 600
strategy:
matrix:
${{ insert }}: ${{parameters.customMatrixes}}
pool:
name: ${{ parameters.pool}}
steps:
# Don't check out repo contents to save time and CPU compute. Environment variables
# related to checkout branch such as $(BUILD_SOURCEBRANCH) are still available.
- checkout: none
# Delete pytorch_tests repo from previous builds if exists
- script: if exist "pytorch_tests/" rmdir "pytorch_tests/" /q /s
displayName: Delete pytorch_tests repo from previous builds if exists
# Clone PyTorch Tests repository
- powershell: |
$env:B64Pat = [Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes(":$env:_ADOTOKEN"))
git -c http.extraHeader="Authorization: Basic $env:B64Pat" clone $env:AZURE_DEVOPS_pytorch_tests_REPO_URL
cd pytorch_tests
git checkout $(PYTORCH_TESTS_CHECKOUT_BRANCH)
env:
_ADOTOKEN: $(AZURE_DEVOPS_CLI_PAT)
displayName: Clone PyTorch Tests repo
# Run PyTorch Unit Tests
- script: call $(Build.SourcesDirectory)\pytorch_tests\scripts\windows\run.bat
env:
_ADOTOKEN: $(AZURE_DEVOPS_CLI_PAT)
_AZURE_STORAGE_KEY: $(AZURE_STORAGE_KEY)
_TS_CLONE_P: $(TS_CLONE_PASSWORD)
_TS_P: $(TS_PAT)
_TS_SM_P: $(TS_SM_PAT)
displayName: Run PyTorch Unit Tests
# Test results are available outside the docker container since
# the current directory is mounted as a volume of the container.
- task: PublishTestResults@2
condition: always()
inputs:
testResultsFiles: '**\test-*.xml'
testRunTitle: 'Publish test results for Python'


@ -1,131 +0,0 @@
# Set environment variables for specific configurations
parameters:
is_official_build: False
os: ''
cuda: ''
steps:
# Environment configuration steps for Ubuntu builds
- ${{ if contains(parameters.os, 'ubuntu') }}:
# Set configuration specific build flags
- ${{ if eq(parameters.is_official_build, True) }}:
- bash: |
echo "##vso[task.setvariable variable=INSTALL_TEST;]0"
echo "##vso[task.setvariable variable=PYTORCH_BUILD_NUMBER;]1"
export PYTORCH_VERSION=$(head -c 5 ./version.txt)
echo "##vso[task.setvariable variable=PYTORCH_BUILD_VERSION;]$PYTORCH_VERSION.dev"
displayName: Set configuration-specific build flags
# Set PyTorch CPU/GPU build flags.
- ${{ if contains(parameters.cuda, 'cpu') }}:
- bash: |
echo "##vso[task.setvariable variable=USE_CUDA;]0"
echo "##vso[task.setvariable variable=PYTORCH_BUILD_VERSION;]$(PYTORCH_BUILD_VERSION).cpu"
displayName: Set CUDA-specific build flag for CPU builds
- ${{ if contains(parameters.cuda, 'gpu') }}:
- bash: |
echo "##vso[task.setvariable variable=USE_CUDA;]1"
echo "##vso[task.setvariable variable=PYTORCH_BUILD_VERSION;]$(PYTORCH_BUILD_VERSION).cu$(CUDA_VERSION)"
displayName: Set CUDA-specific build flag for GPU builds
# Set MKL environment variables
- bash: |
echo "##vso[task.setvariable variable=CMAKE_LIBRARY_PATH;]/opt/intel/lib:$CMAKE_LIBRARY_PATH"
echo "##vso[task.setvariable variable=CMAKE_INCLUDE_PATH;]/opt/intel/include:$CMAKE_INCLUDE_PATH"
displayName: Set MKL paths
# View current environment variables
- bash:
printenv
displayName: Show environment variables
# Environment configuration steps for Windows builds
- ${{ if contains(parameters.os, 'windows') }}:
# Set Conda Lib Path
- powershell: Write-Host "##vso[task.setvariable variable=CONDA_LIB_PATH;]C:\Miniconda\envs\$(configuration)\Library\bin"
displayName: Set Conda Lib Path
# Set configuration specific build flags
- ${{ if eq(parameters.is_official_build, True) }}:
- powershell: |
Write-Host "##vso[task.setvariable variable=INSTALL_TEST;]0"
Write-Host "##vso[task.setvariable variable=PYTORCH_BUILD_NUMBER;]1"
Set-Variable -Name PYTORCH_VERSION -Value (Get-Content .\version.txt).Substring(0,5)
Write-Host "##vso[task.setvariable variable=PYTORCH_BUILD_VERSION;]$PYTORCH_VERSION.dev"
displayName: Set configuration-specific build flags
# Set PyTorch CPU/GPU build flags.
- ${{ if contains(parameters.cuda, 'cpu') }}:
- powershell: |
Write-Host "##vso[task.setvariable variable=USE_CUDA;]0"
Write-Host "##vso[task.setvariable variable=PYTORCH_BUILD_VERSION;]$(PYTORCH_BUILD_VERSION).cpu"
displayName: Set CUDA-specific build flag for CPU build
- ${{ if contains(parameters.cuda, 'gpu') }}:
- powershell: |
Write-Host "##vso[task.setvariable variable=USE_CUDA;]1"
Write-Host "##vso[task.setvariable variable=PYTORCH_BUILD_VERSION;]$(PYTORCH_BUILD_VERSION).cu$(CUDA_VERSION)"
displayName: Set CUDA-specific build flag for GPU build
# Set CUDA 11.2, 10.2 or 10.1 specific build flags
- ${{ if eq(parameters.cuda, 'gpu') }}:
- powershell: |
Write-Host "##vso[task.setvariable variable=TORCH_CUDA_ARCH_LIST;]3.7+PTX;5.0;6.0;6.1;7.0;7.5;8.0;8.6"
Write-Host "##vso[task.setvariable variable=CUDA_PATH;]C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2\"
displayName: Set CUDA 11.2 specific build flags
condition: eq(variables.CUDA_VERSION, '112')
- powershell: |
Write-Host "##vso[task.setvariable variable=TORCH_CUDA_ARCH_LIST;]3.7+PTX;5.0;6.0;6.1;7.0;7.5"
Write-Host "##vso[task.setvariable variable=CUDA_PATH;]C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\"
displayName: Set CUDA 10.2 specific build flags
condition: eq(variables.CUDA_VERSION, '102')
- powershell: |
Write-Host "##vso[task.setvariable variable=TORCH_CUDA_ARCH_LIST;]3.7+PTX;5.0;6.0;6.1;7.0;7.5"
Write-Host "##vso[task.setvariable variable=CUDA_PATH;]C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\"
displayName: Set CUDA 10.1 specific build flags
condition: eq(variables.CUDA_VERSION, '101')
- powershell: |
Write-Host "##vso[task.setvariable variable=CUDA_BIN_PATH;]$env:CUDA_PATH\bin\"
Write-Host "##vso[task.setvariable variable=CUDNN_ROOT;]$env:CUDA_PATH"
Write-Host "##vso[task.setvariable variable=CUDNN_INCLUDE_DIR;]$env:CUDA_PATH\include\"
Write-Host "##vso[task.setvariable variable=CUDNN_LIBRARY;]$env:CUDA_PATH\lib\x64\"
Write-Host "##vso[task.prependpath]$env:CUDA_PATH\bin"
Write-Host "##vso[task.setvariable variable=TORCH_NVCC_FLAGS;]-Xfatbin -compress-all --no-host-device-move-forward"
Write-Host "##vso[task.setvariable variable=THRUST_IGNORE_CUB_VERSION_CHECK;]1"
Write-Host "##vso[task.setvariable variable=NVTOOLSEXT_PATH;]C:\Program Files\NVIDIA Corporation\NvToolsExt\"
displayName: Set CUDA environment variables
- powershell: |
copy "$(CUDA_BIN_PATH)\cusparse*64_*.dll*" $(Build.SourcesDirectory)\torch\lib
copy "$(CUDA_BIN_PATH)\cublas*64_*.dll*" $(Build.SourcesDirectory)\torch\lib
copy "$(CUDA_BIN_PATH)\cudart*64_*.dll*" $(Build.SourcesDirectory)\torch\lib
copy "$(CUDA_BIN_PATH)\curand*64_*.dll*" $(Build.SourcesDirectory)\torch\lib
copy "$(CUDA_BIN_PATH)\cufft*64_*.dll*" $(Build.SourcesDirectory)\torch\lib
copy "$(CUDA_BIN_PATH)\cusolver*64_*.dll*" $(Build.SourcesDirectory)\torch\lib
copy "$(CUDA_BIN_PATH)\cudnn*64_*.dll*" $(Build.SourcesDirectory)\torch\lib
copy "$(CUDA_BIN_PATH)\nvrtc*64_*.dll*" $(Build.SourcesDirectory)\torch\lib
copy "C:\Program Files\NVIDIA Corporation\NvToolsExt\bin\x64\nvToolsExt64_1.dll*" $(Build.SourcesDirectory)\torch\lib
copy "$(CONDA_LIB_PATH)\libiomp*5md.dll" $(Build.SourcesDirectory)\torch\lib
copy "$(CONDA_LIB_PATH)\uv.dll" $(Build.SourcesDirectory)\torch\lib
displayName: Copy CUDA/cuDNN/libomp/libuv dlls to torch\lib
# Set MKL, sccache and randomtemp environment variables
- powershell: |
Write-Host "##vso[task.setvariable variable=CMAKE_INCLUDE_PATH;]$(Build.SourcesDirectory)\mkl\include"
Write-Host "##vso[task.setvariable variable=CMAKE_LIBRARY_PATH;]$(Build.SourcesDirectory)\mkl\lib;$env:CMAKE_LIBRARY_PATH"
Write-Host "##vso[task.setvariable variable=ADDITIONAL_PATH;]$(Build.SourcesDirectory)\tmp_bin"
Write-Host "##vso[task.setvariable variable=SCCACHE_IDLE_TIMEOUT;]1500"
Write-Host "##vso[task.setvariable variable=RANDOMTEMP_EXECUTABLE;]$(Build.SourcesDirectory)\tmp_bin\nvcc.exe"
Write-Host "##vso[task.setvariable variable=CUDA_NVCC_EXECUTABLE;]$(Build.SourcesDirectory)\tmp_bin\randomtemp.exe"
Write-Host "##vso[task.setvariable variable=RANDOMTEMP_BASEDIR;]$(Build.SourcesDirectory)\tmp_bin"
displayName: Set MKL, sccache and randomtemp environment variables
# View current environment variables
- script:
set
displayName: Show environment variables
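
The version-string steps in this template can be summarized in a few lines. The following is a minimal Python sketch, not part of the pipeline: `compose_build_version` is a hypothetical helper and the sample `version.txt` contents are assumed. It mirrors the official-build path only: take the first five characters of `version.txt`, append `.dev`, then append `.cpu` or `.cu<CUDA_VERSION>`.

```python
def compose_build_version(version_txt: str, cuda: str, cuda_version: str = "") -> str:
    """Mimic how the template assembles PYTORCH_BUILD_VERSION for official builds."""
    # `head -c 5` (bash) and `.Substring(0,5)` (PowerShell) keep the first five characters.
    base = version_txt[:5]
    version = base + ".dev"
    # The template tests `contains(parameters.cuda, ...)`, hence the substring checks.
    if "cpu" in cuda:
        version += ".cpu"
    if "gpu" in cuda:
        version += ".cu" + cuda_version
    return version


# Sample inputs are assumptions, not values taken from the repository.
cpu_version = compose_build_version("1.9.0a0\n", "cpu")
gpu_version = compose_build_version("1.9.0a0\n", "gpu", "112")
print(cpu_version)
print(gpu_version)
```

Because the slice is fixed at five characters, a version such as `1.10.0` would be cut to `1.10.`; that is a property of the fixed-width slice, so a pipeline that has to handle two-digit minor versions would probably need to parse the version instead.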

@ -1,14 +0,0 @@
# Main logic to initiate wait for PR artifact to be ready
steps:
- task: InvokeRESTAPI@1
  displayName: 'Wait for job success and wheel ready'
  timeoutInMinutes: 60
  inputs:
    connectionType: 'connectedServiceName'
    serviceConnection: circleciconn
    method: 'POST'
    headers: '{"Content-Type":"application/json", "BranchName":"$(_TARGET_BRANCH_TO_CHECK)", "JobName":"$(TARGET_CIRCLECI_BUILD_PR)", "PRNumber":"$(_TARGET_PR_NUMBER)", "TargetCommit":"$(_TARGET_COMMIT)", "PlanUrl":"$(System.CollectionUri)", "ProjectId":"$(System.TeamProjectId)", "HubName":"$(System.HostType)", "PlanId":"$(System.PlanId)", "JobId":"$(System.JobId)", "TimelineId":"$(System.TimelineId)", "TaskInstanceId":"$(System.TaskInstanceId)", "AuthToken":"$(System.AccessToken)"}'
    body: ''
    urlSuffix: 'api/JobStatus'
    waitForCompletion: true

@ -1,92 +0,0 @@
# Initiate 11 agentless-server waiting jobs to check on the
# status of PR artifact builds, for a maximum wait time of
# 11*60 min=660 mins. These jobs will pass immediately
# once targeted CircleCI build is ready.
jobs:
- job: checkjob1
pool: server
timeoutInMinutes: 60
continueOnError: true
steps:
- template: wheel-wait-job-template.yml
- job: checkjob2
pool: server
timeoutInMinutes: 60
dependsOn: checkjob1
continueOnError: true
steps:
- template: wheel-wait-job-template.yml
- job: checkjob3
pool: server
timeoutInMinutes: 60
dependsOn: checkjob2
continueOnError: true
steps:
- template: wheel-wait-job-template.yml
- job: checkjob4
pool: server
timeoutInMinutes: 60
dependsOn: checkjob3
continueOnError: true
steps:
- template: wheel-wait-job-template.yml
- job: checkjob5
pool: server
timeoutInMinutes: 60
dependsOn: checkjob4
continueOnError: true
steps:
- template: wheel-wait-job-template.yml
- job: checkjob6
pool: server
timeoutInMinutes: 60
dependsOn: checkjob5
continueOnError: true
steps:
- template: wheel-wait-job-template.yml
- job: checkjob7
pool: server
timeoutInMinutes: 60
dependsOn: checkjob6
continueOnError: true
steps:
- template: wheel-wait-job-template.yml
- job: checkjob8
pool: server
timeoutInMinutes: 60
dependsOn: checkjob7
continueOnError: true
steps:
- template: wheel-wait-job-template.yml
- job: checkjob9
pool: server
timeoutInMinutes: 60
dependsOn: checkjob8
continueOnError: true
steps:
- template: wheel-wait-job-template.yml
- job: checkjob10
pool: server
timeoutInMinutes: 60
dependsOn: checkjob9
continueOnError: true
steps:
- template: wheel-wait-job-template.yml
- job: checkjob11
pool: server
timeoutInMinutes: 60
dependsOn: checkjob10
continueOnError: true
steps:
- template: wheel-wait-job-template.yml
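
The chain above repeats one job definition eleven times, each depending on the previous one. As a rough illustration only, the same structure can be generated programmatically; `make_wait_jobs` is a hypothetical helper and is not something the pipeline uses. It also makes the maximum wait time from the header comment easy to check.

```python
def make_wait_jobs(count: int = 11, timeout_minutes: int = 60) -> list:
    """Build a chain of agentless wait jobs, each depending on the previous one."""
    jobs = []
    for index in range(1, count + 1):
        job = {
            "job": f"checkjob{index}",
            "pool": "server",
            "timeoutInMinutes": timeout_minutes,
        }
        # The first job has no predecessor; every later job waits for the one before it.
        if index > 1:
            job["dependsOn"] = f"checkjob{index - 1}"
        job["continueOnError"] = True
        job["steps"] = [{"template": "wheel-wait-job-template.yml"}]
        jobs.append(job)
    return jobs


wait_jobs = make_wait_jobs()
# Jobs run strictly one after another, so the worst case is the sum of all timeouts.
max_wait_minutes = sum(job["timeoutInMinutes"] for job in wait_jobs)
print(len(wait_jobs), max_wait_minutes)
```

With the defaults this yields 11 jobs and a worst-case wait of 660 minutes, matching the `11*60 min=660 mins` figure in the comment.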

@ -1,60 +0,0 @@
# PyTorch Nightly PyTorch Tests Builds Pipeline on Azure DevOps
#
# This pipeline runs custom PyTorch unit-tests on nightly
# PyTorch wheels.
stages:
- stage: 'NightlyCustomTests'
displayName: 'Run custom unit tests on PyTorch wheels'
jobs:
- template: job_templates/pytorch-template-unix.yml
parameters:
name: ubuntu_1804_CPU_docker
pool: $(BUILD_POOL_LIN_1)
customMatrixes:
Nightly_Custom_Tests:
_DOCKER_IMAGE: $(DOCKER_IMAGE_LIN_1)
_PYTHON_VERSION: $(PYTHON_VERSION_LIN_1)
_CUDA_BUILD_VERSION: $(CUDA_BUILD_VERSION_LIN_1)
_RUN_TESTS: $(RUN_TESTS_LIN)
- template: job_templates/pytorch-template-unix.yml
parameters:
name: ubuntu_1804_GPU_docker
pool: $(BUILD_POOL_LIN_2)
customMatrixes:
Nightly_Custom_Tests:
_DOCKER_IMAGE: $(DOCKER_IMAGE_LIN_2)
_PYTHON_VERSION: $(PYTHON_VERSION_LIN_2)
_CUDA_BUILD_VERSION: $(CUDA_BUILD_VERSION_LIN_2)
_RUN_TESTS: $(RUN_TESTS_LIN)
- template: job_templates/pytorch-template-win.yml
parameters:
name: windows_2019_CPU
pool: $(BUILD_POOL_WIN_1)
customMatrixes:
Nightly_Custom_Tests:
_PYTHON_VERSION: $(PYTHON_VERSION_WIN_1)
_CUDA_BUILD_VERSION: $(CUDA_BUILD_VERSION_WIN_1)
_RUN_TESTS: $(RUN_TESTS_WIN)
- template: job_templates/pytorch-template-win.yml
parameters:
name: windows_2019_GPU
pool: $(BUILD_POOL_WIN_2)
customMatrixes:
Nightly_Custom_Tests:
_PYTHON_VERSION: $(PYTHON_VERSION_WIN_2)
_CUDA_BUILD_VERSION: $(CUDA_BUILD_VERSION_WIN_2)
_RUN_TESTS: $(RUN_TESTS_WIN)
- stage: 'NotifyWebapp'
displayName: 'Notify Webapp that pipeline is finished'
dependsOn: NightlyCustomTests
condition: succeededOrFailed()
jobs:
- template: job_templates/notify-webapp-template.yml
parameters:
name: ubuntu_1804_CPU
pool: $(BUILD_POOL_LIN_1)

@ -1,62 +0,0 @@
# PyTorch PR PyTorch Tests Builds Pipeline on Azure DevOps
#
# This pipeline:
# 1) ensures that CircleCI builds for a given PR
# have finished, and that its artifacts are
# ready for download
# 2) runs custom PyTorch unit-tests on PyTorch
# wheels generated during PR builds.
resources:
webhooks:
- webhook: GitHubPyTorchPRTrigger
connection: GitHubPyTorchPRTriggerConnection
filters:
- path: repositoryName
value: pytorch_tests
stages:
- stage: 'EnsureArtifactsReady'
displayName: 'Ensure PyTorch PR Artifacts are ready'
jobs:
- template: job_templates/wheel-wait-template.yml
variables:
_TARGET_BRANCH_TO_CHECK: ${{parameters.GitHubPyTorchPRTrigger.TARGET_BRANCH_TO_CHECK_AZ_DEVOPS_PR}}
_TARGET_PR_NUMBER: ${{parameters.GitHubPyTorchPRTrigger.PR_NUMBER}}
_TARGET_COMMIT: ${{parameters.GitHubPyTorchPRTrigger.TARGET_COMMIT}}
- stage: 'PRCustomTests'
displayName: 'Run custom unit tests on PyTorch wheels'
dependsOn: EnsureArtifactsReady
condition: succeeded()
jobs:
- template: job_templates/pytorch-template-unix.yml
parameters:
name: ubuntu_1804_GPU_docker
pool: $(BUILD_POOL_PR)
customMatrixes:
PR_Custom_Tests:
_PYTHON_VERSION: $(PYTHON_VERSION_PR)
_CUDA_BUILD_VERSION: $(CUDA_BUILD_VERSION_PR)
_TARGET_CIRCLECI_BUILD: $(TARGET_CIRCLECI_BUILD_PR)
_TARGET_BRANCH_TO_CHECK: ${{parameters.GitHubPyTorchPRTrigger.TARGET_BRANCH_TO_CHECK_AZ_DEVOPS_PR}}
_TARGET_PR_NUMBER: ${{parameters.GitHubPyTorchPRTrigger.PR_NUMBER}}
_TARGET_COMMIT: ${{parameters.GitHubPyTorchPRTrigger.TARGET_COMMIT}}
_DOCKER_IMAGE: $(DOCKER_IMAGE_PR)
_RUN_TESTS: $(RUN_TESTS_PR)
- stage: 'NotifyWebapp'
displayName: 'Notify Webapp that pipeline is finished'
dependsOn: PRCustomTests
condition: succeededOrFailed()
jobs:
- template: job_templates/notify-webapp-template.yml
parameters:
name: ubuntu_1804_CPU
pool: $(BUILD_POOL_LIN_1)
customMatrixes:
PR_Notify_WebApp:
_TARGET_CIRCLECI_BUILD: $(TARGET_CIRCLECI_BUILD_PR)
_TARGET_BRANCH_TO_CHECK: ${{parameters.GitHubPyTorchPRTrigger.TARGET_BRANCH_TO_CHECK_AZ_DEVOPS_PR}}
_TARGET_PR_NUMBER: ${{parameters.GitHubPyTorchPRTrigger.PR_NUMBER}}
_TARGET_COMMIT: ${{parameters.GitHubPyTorchPRTrigger.TARGET_COMMIT}}

@ -1,224 +0,0 @@
# PyTorch Official Builds Pipeline on Azure DevOps
#
# This pipeline:
# 1) builds PyTorch on all available configurations
# 2) verifies PyTorch artifacts by installing them in a clean environment
# and checking torch.__version__
# 3) publishes official PyTorch artifacts to Azure DevOps Artifacts for consumption
stages:
- stage: 'Build'
displayName: 'Build PyTorch'
jobs:
- template: job_templates/build-verify-publish-template-unix.yml
parameters:
name: ubuntu_1804_CPU_docker
pool: 'PyTorch-Linux-CPU'
container_endpoint: pytorchms.azurecr.io
build_stage: True
is_official_build: True
os: ubuntu
cuda: cpu
customMatrixes:
Py_38:
configuration: ubuntu_1804_py_38_cpu
container_image: pytorchms.azurecr.io/ubuntu_1804_py_38_cpu_dev
Py_37:
configuration: ubuntu_1804_py_37_cpu
container_image: pytorchms.azurecr.io/ubuntu_1804_py_37_cpu_dev
- template: job_templates/build-verify-publish-template-unix.yml
parameters:
name: ubuntu_1804_GPU_docker
pool: 'PyTorch-Linux-GPU'
container_endpoint: pytorchms.azurecr.io
build_stage: True
is_official_build: True
os: ubuntu
cuda: gpu
customMatrixes:
Py_39_CUDA_112_cuDNN_810:
configuration: ubuntu_1804_py_39_cuda_112_cudnn_810
container_image: pytorchms.azurecr.io/ubuntu_1804_py_39_cuda_112_cudnn_8_dev
CUDA_VERSION: 112
Py_38_CUDA_102_cuDNN_810:
configuration: ubuntu_1804_py_38_cuda_102_cudnn_810
container_image: pytorchms.azurecr.io/ubuntu_1804_py_38_cuda_102_cudnn_8_dev
CUDA_VERSION: 102
Py_37_CUDA_101_cuDNN_765:
configuration: ubuntu_1804_py_37_cuda_101_cudnn_765
container_image: pytorchms.azurecr.io/ubuntu_1804_py_37_cuda_101_cudnn_7_dev
CUDA_VERSION: 101
- template: job_templates/build-verify-publish-template-win.yml
parameters:
name: windows_2019_CPU
pool: 'PyTorch-Win-CPU'
build_stage: True
is_official_build: True
os: windows
cuda: cpu
customMatrixes:
Py_38:
configuration: windows_2019_py_38_cpu
Py_37:
configuration: windows_2019_py_37_cpu
- template: job_templates/build-verify-publish-template-win.yml
parameters:
name: windows_2019_GPU
pool: 'PyTorch-Win-GPU'
build_stage: True
is_official_build: True
os: windows
cuda: gpu
customMatrixes:
Py_39_CUDA_112_cuDNN_810:
configuration: windows_2019_py_39_cuda_112_cudnn_810
CUDA_VERSION: 112
Py_38_CUDA_102_cuDNN_765:
configuration: windows_2019_py_38_cuda_102_cudnn_765
CUDA_VERSION: 102
Py_37_CUDA_101_cuDNN_764:
configuration: windows_2019_py_37_cuda_101_cudnn_764
CUDA_VERSION: 101
- stage: 'Verify'
displayName: 'Verify PyTorch wheels'
dependsOn: Build
condition: succeeded()
jobs:
- template: job_templates/build-verify-publish-template-unix.yml
parameters:
name: ubuntu_1804_CPU_docker
pool: 'PyTorch-Linux-CPU'
container_endpoint: pytorchms.azurecr.io
verify_stage: True
is_official_build: True
customMatrixes:
Py_38:
configuration: ubuntu_1804_py_38_cpu
container_image: pytorchms.azurecr.io/ubuntu_1804_py_38_cpu_dev
Py_37:
configuration: ubuntu_1804_py_37_cpu
container_image: pytorchms.azurecr.io/ubuntu_1804_py_37_cpu_dev
- template: job_templates/build-verify-publish-template-unix.yml
parameters:
name: ubuntu_1804_GPU_docker
pool: 'PyTorch-Linux-GPU'
container_endpoint: pytorchms.azurecr.io
verify_stage: True
is_official_build: True
customMatrixes:
Py_39_CUDA_112_cuDNN_810:
configuration: ubuntu_1804_py_39_cuda_112_cudnn_810
container_image: pytorchms.azurecr.io/ubuntu_1804_py_39_cuda_112_cudnn_8_dev
CUDA_VERSION: 112
Py_38_CUDA_102_cuDNN_810:
configuration: ubuntu_1804_py_38_cuda_102_cudnn_810
container_image: pytorchms.azurecr.io/ubuntu_1804_py_38_cuda_102_cudnn_8_dev
CUDA_VERSION: 102
Py_37_CUDA_101_cuDNN_765:
configuration: ubuntu_1804_py_37_cuda_101_cudnn_765
container_image: pytorchms.azurecr.io/ubuntu_1804_py_37_cuda_101_cudnn_7_dev
CUDA_VERSION: 101
- template: job_templates/build-verify-publish-template-win.yml
parameters:
name: windows_2019_CPU
pool: 'PyTorch-Win-CPU'
verify_stage: True
is_official_build: True
customMatrixes:
Py_38:
configuration: windows_2019_py_38_cpu
Py_37:
configuration: windows_2019_py_37_cpu
- template: job_templates/build-verify-publish-template-win.yml
parameters:
name: windows_2019_GPU
pool: 'PyTorch-Win-GPU'
verify_stage: True
is_official_build: True
customMatrixes:
Py_39_CUDA_112_cuDNN_810:
configuration: windows_2019_py_39_cuda_112_cudnn_810
CUDA_VERSION: 112
Py_38_CUDA_102_cuDNN_765:
configuration: windows_2019_py_38_cuda_102_cudnn_765
CUDA_VERSION: 102
Py_37_CUDA_101_cuDNN_764:
configuration: windows_2019_py_37_cuda_101_cudnn_764
CUDA_VERSION: 101
- stage: 'Publish'
displayName: 'Publish PyTorch wheels'
dependsOn: Verify
condition: succeeded()
jobs:
- template: job_templates/build-verify-publish-template-unix.yml
parameters:
name: ubuntu_1804_CPU_docker
pool: 'PyTorch-Linux-CPU'
container_endpoint: pytorchms.azurecr.io
publish_stage: True
is_official_build: True
customMatrixes:
Py_38:
configuration: ubuntu_1804_py_38_cpu
container_image: pytorchms.azurecr.io/ubuntu_1804_py_38_cpu_dev
Py_37:
configuration: ubuntu_1804_py_37_cpu
container_image: pytorchms.azurecr.io/ubuntu_1804_py_37_cpu_dev
- template: job_templates/build-verify-publish-template-unix.yml
parameters:
name: ubuntu_1804_GPU_docker
pool: 'PyTorch-Linux-GPU'
container_endpoint: pytorchms.azurecr.io
publish_stage: True
is_official_build: True
customMatrixes:
Py_39_CUDA_112_cuDNN_810:
configuration: ubuntu_1804_py_39_cuda_112_cudnn_810
container_image: pytorchms.azurecr.io/ubuntu_1804_py_39_cuda_112_cudnn_8_dev
CUDA_VERSION: 112
Py_38_CUDA_102_cuDNN_810:
configuration: ubuntu_1804_py_38_cuda_102_cudnn_810
container_image: pytorchms.azurecr.io/ubuntu_1804_py_38_cuda_102_cudnn_8_dev
CUDA_VERSION: 102
Py_37_CUDA_101_cuDNN_765:
configuration: ubuntu_1804_py_37_cuda_101_cudnn_765
container_image: pytorchms.azurecr.io/ubuntu_1804_py_37_cuda_101_cudnn_7_dev
CUDA_VERSION: 101
- template: job_templates/build-verify-publish-template-win.yml
parameters:
name: windows_2019_CPU
pool: 'PyTorch-Win-CPU'
publish_stage: True
is_official_build: True
customMatrixes:
Py_38:
configuration: windows_2019_py_38_cpu
Py_37:
configuration: windows_2019_py_37_cpu
- template: job_templates/build-verify-publish-template-win.yml
parameters:
name: windows_2019_GPU
pool: 'PyTorch-Win-GPU'
publish_stage: True
is_official_build: True
customMatrixes:
Py_39_CUDA_112_cuDNN_810:
configuration: windows_2019_py_39_cuda_112_cudnn_810
CUDA_VERSION: 112
Py_38_CUDA_102_cuDNN_765:
configuration: windows_2019_py_38_cuda_102_cudnn_765
CUDA_VERSION: 102
Py_37_CUDA_101_cuDNN_764:
configuration: windows_2019_py_37_cuda_101_cudnn_764
CUDA_VERSION: 101

@ -1,13 +0,0 @@
build --copt=--std=c++14
build --copt=-I.
build --copt=-isystem --copt bazel-out/k8-fastbuild/bin
# Configuration to disable tty features for environments like CI
build:no-tty --curses no
build:no-tty --progress_report_interval 10
build:no-tty --show_progress_rate_limit 10
# Configuration to build with GPU support
build:gpu --define=cuda=true
# define a separate build folder for faster switching between configs
build:gpu --platform_suffix=-gpu

@ -1 +0,0 @@
4.2.1

@ -31,7 +31,7 @@ Usage
1. Make changes to these scripts.
2. Run the `regenerate.sh` script in this directory and commit the script changes and the resulting change to `config.yml`.
You'll see a build failure on GitHub if the scripts don't agree with the checked-in version.
You'll see a build failure on TravisCI if the scripts don't agree with the checked-in version.
Motivation
@ -55,7 +55,7 @@ Future direction
See comment [here](https://github.com/pytorch/pytorch/pull/17323#pullrequestreview-206945747):
In contrast with a full recursive tree traversal of configuration dimensions,
> in the future I think we actually want to decrease our matrix somewhat and have only a few mostly-orthogonal builds that taste as many different features as possible on PRs, plus a more complete suite on every PR and maybe an almost full suite nightly/weekly (we don't have this yet). Specifying PR jobs in the future might be easier to read with an explicit list when we come to this.
> in the future future I think we actually want to decrease our matrix somewhat and have only a few mostly-orthogonal builds that taste as many different features as possible on PRs, plus a more complete suite on every PR and maybe an almost full suite nightly/weekly (we don't have this yet). Specifying PR jobs in the future might be easier to read with an explicit list when we come to this.
----------------
----------------
@ -71,9 +71,9 @@ A **binary configuration** is a collection of
* release or nightly
* releases are stable, nightlies are beta and built every night
* python version
* linux: 3.5m, 3.6m 3.7m (mu is wide unicode or something like that. It usually doesn't matter but you should know that it exists)
* macos: 3.6, 3.7, 3.8
* windows: 3.6, 3.7, 3.8
* linux: 2.7m, 2.7mu, 3.5m, 3.6m 3.7m (mu is wide unicode or something like that. It usually doesn't matter but you should know that it exists)
* macos: 2.7, 3.5, 3.6, 3.7
* windows: 3.5, 3.6, 3.7
* cpu version
* cpu, cuda 9.0, cuda 10.0
* The supported cuda versions occasionally change
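
A binary configuration is effectively one point in the cross product of these dimensions. The sketch below is only an illustration under assumptions: it uses three of the dimensions visible in this hunk (release or nightly, the macos Python versions from one side of the diff, and the cpu/cuda list), whereas the real README lists more dimensions and the actual generator lives in the `.circleci` scripts.

```python
from itertools import product

# Values copied from the list above; the real matrix has additional dimensions.
build_types = ["release", "nightly"]
python_versions = ["3.6", "3.7", "3.8"]
cpu_versions = ["cpu", "cuda 9.0", "cuda 10.0"]

# One dictionary per binary configuration.
configurations = [
    {"build_type": build_type, "python": python, "cpu_version": cpu_version}
    for build_type, python, cpu_version in product(
        build_types, python_versions, cpu_versions
    )
]
print(len(configurations))
```

Even with only three dimensions the matrix reaches 2 * 3 * 3 = 18 entries.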
@ -90,7 +90,7 @@ The binaries are built in CircleCI. There are nightly binaries built every night
We have 3 types of binary packages
* pip packages - nightlies are stored on s3 (pip install -f \<a s3 url\>). releases are stored in a pip repo (pip install torch) (ask Soumith about this)
* pip packages - nightlies are stored on s3 (pip install -f <a s3 url>). releases are stored in a pip repo (pip install torch) (ask Soumith about this)
* conda packages - nightlies and releases are both stored in a conda repo. Nighty packages have a '_nightly' suffix
* libtorch packages - these are zips of all the c++ libraries, header files, and sometimes dependencies. These are c++ only
* shared with dependencies (the only supported option for Windows)
@ -104,16 +104,16 @@ All binaries are built in CircleCI workflows except Windows. There are checked-i
Some quick vocab:
* A \**workflow** is a CircleCI concept; it is a DAG of '**jobs**'. ctrl-f 'workflows' on https://github.com/pytorch/pytorch/blob/master/.circleci/config.yml to see the workflows.
* A\**workflow** is a CircleCI concept; it is a DAG of '**jobs**'. ctrl-f 'workflows' on\https://github.com/pytorch/pytorch/blob/master/.circleci/config.yml to see the workflows.
* **jobs** are a sequence of '**steps**'
* **steps** are usually just a bash script or a builtin CircleCI command. *All steps run in new environments, environment variables declared in one script DO NOT persist to following steps*
* **steps** are usually just a bash script or a builtin CircleCI command.* All steps run in new environments, environment variables declared in one script DO NOT persist to following steps*
* CircleCI has a **workspace**, which is essentially a cache between steps of the *same job* in which you can store artifacts between steps.
## How are the workflows structured?
The nightly binaries have 3 workflows. We have one job (actually 3 jobs: build, test, and upload) per binary configuration
1. binary_builds
1. binarybuilds
1. every day midnight EST
2. linux: https://github.com/pytorch/pytorch/blob/master/.circleci/verbatim-sources/linux-binary-build-defaults.yml
3. macos: https://github.com/pytorch/pytorch/blob/master/.circleci/verbatim-sources/macos-binary-build-defaults.yml
@ -144,7 +144,7 @@ The nightly binaries have 3 workflows. We have one job (actually 3 jobs: build,
## How are the jobs structured?
The jobs are in https://github.com/pytorch/pytorch/tree/master/.circleci/verbatim-sources. Jobs are made of multiple steps. There are some shared steps used by all the binaries/smokes. Steps of these jobs are all delegated to scripts in https://github.com/pytorch/pytorch/tree/master/.circleci/scripts .
The jobs are in https://github.com/pytorch/pytorch/tree/master/.circleci/verbatim-sources . Jobs are made of multiple steps. There are some shared steps used by all the binaries/smokes. Steps of these jobs are all delegated to scripts in https://github.com/pytorch/pytorch/tree/master/.circleci/scripts .
* Linux jobs: https://github.com/pytorch/pytorch/blob/master/.circleci/verbatim-sources/linux-binary-build-defaults.yml
* binary_linux_build.sh
@ -178,7 +178,8 @@ CircleCI creates a final yaml file by inlining every <<* segment, so if we were
So, CircleCI has several executor types: macos, machine, and docker are the ones we use. The 'machine' executor gives you two cores on some linux vm. The 'docker' executor gives you considerably more cores (nproc was 32 instead of 2 back when I tried in February). Since the dockers are faster, we try to run everything that we can in dockers. Thus
* linux build jobs use the docker executor. Running them on the docker executor was at least 2x faster than running them on the machine executor
* linux test jobs use the machine executor in order for them to properly interface with GPUs since docker executors cannot execute with attached GPUs
* linux test jobs use the machine executor and spin up their own docker. Why this nonsense? It's cause we run nvidia-docker for our GPU tests; any code that calls into the CUDA runtime needs to be run on nvidia-docker. To run a nvidia-docker you need to install some nvidia packages on the host machine and then call docker with the '—runtime nvidia' argument. CircleCI doesn't support this, so we have to do it ourself.
* This is not just a mere inconvenience. **This blocks all of our linux tests from using more than 2 cores.** But there is nothing that we can do about it, but wait for a fix on circleci's side. Right now, we only run some smoke tests (some simple imports) on the binaries, but this also affects non-binary test jobs.
* linux upload jobs use the machine executor. The upload jobs are so short that it doesn't really matter what they use
* linux smoke test jobs use the machine executor for the same reason as the linux test jobs
@ -204,7 +205,7 @@ TODO: fill in stuff
## Overview
The code that runs the binaries lives in two places, in the normal [github.com/pytorch/pytorch](http://github.com/pytorch/pytorch), but also in [github.com/pytorch/builder](http://github.com/pytorch/builder), which is a repo that defines how all the binaries are built. The relevant code is
The code that runs the binaries lives in two places, in the normal [github.com/pytorch/pytorch](http://github.com/pytorch/pytorch), but also in [github.com/pytorch/builder](http://github.com/pytorch/builder) , which is a repo that defines how all the binaries are built. The relevant code is
```
@ -260,7 +261,7 @@ Linux, MacOS and Windows use the same code flow for the conda builds.
Conda packages are built with conda-build, see https://conda.io/projects/conda-build/en/latest/resources/commands/conda-build.html
Basically, you pass `conda build` a build folder (pytorch-nightly/ above) that contains a build script and a meta.yaml. The meta.yaml specifies in what python environment to build the package in, and what dependencies the resulting package should have, and the build script gets called in the env to build the thing.
tl;dr on conda-build is
tldr; on conda-build is
1. Creates a brand new conda environment, based off of deps in the meta.yaml
1. Note that environment variables do not get passed into this build env unless they are specified in the meta.yaml
@ -270,7 +271,7 @@ tl;dr on conda-build is
4. Runs some simple import tests (if specified in the meta.yaml)
5. Saves the finished package as a tarball
The build.sh we use is essentially a wrapper around `python setup.py build`, but it also manually copies in some of our dependent libraries into the resulting tarball and messes with some rpaths.
The build.sh we use is essentially a wrapper around ```python setup.py build``` , but it also manually copies in some of our dependent libraries into the resulting tarball and messes with some rpaths.
The entrypoint file `builder/conda/build_conda.sh` is complicated because
@ -343,6 +344,7 @@ All linux builds occur in docker images. The docker images are
* Has ALL CUDA versions installed. The script pytorch/builder/conda/switch_cuda_version.sh sets /usr/local/cuda to a symlink to e.g. /usr/local/cuda-10.0 to enable different CUDA builds
* Also used for cpu builds
* pytorch/manylinux-cuda90
* pytorch/manylinux-cuda92
* pytorch/manylinux-cuda100
* Also used for cpu builds
@ -354,15 +356,15 @@ The Dockerfiles are available in pytorch/builder, but there is no circleci job o
# How to manually rebuild the binaries
tl;dr make a PR that looks like https://github.com/pytorch/pytorch/pull/21159
tldr; make a PR that looks like https://github.com/pytorch/pytorch/pull/21159
Sometimes we want to push a change to master and then rebuild all of today's binaries after that change. As of May 30, 2019 there isn't a way to manually run a workflow in the UI. You can manually re-run a workflow, but it will use the exact same git commits as the first run and will not include any changes. So we have to make a PR and then force circleci to run the binary workflow instead of the normal tests. The above PR is an example of how to do this; essentially you copy-paste the binarybuilds workflow steps into the default workflow steps. If you need to point the builder repo to a different commit then you'd need to change https://github.com/pytorch/pytorch/blob/master/.circleci/scripts/binary_checkout.sh#L42-L45 to checkout what you want.
## How to test changes to the binaries via .circleci
Writing PRs that test the binaries is annoying, since the default circleci jobs that run on PRs are not the jobs that you want to run. Likely, changes to the binaries will touch something under .circleci/ and require that .circleci/config.yml be regenerated (.circleci/config.yml controls all .circleci behavior, and is generated using `.circleci/regenerate.sh` in python 3.7). But you also need to manually hardcode the binary jobs that you want to test into the .circleci/config.yml workflow, so you should actually make at least two commits, one for your changes and one to temporarily hardcode jobs. See https://github.com/pytorch/pytorch/pull/22928 as an example of how to do this.
Writing PRs that test the binaries is annoying, since the default circleci jobs that run on PRs are not the jobs that you want to run. Likely, changes to the binaries will touch something under .circleci/ and require that .circleci/config.yml be regenerated (.circleci/config.yml controls all .circleci behavior, and is generated using ```.circleci/regenerate.sh``` in python 3.7). But you also need to manually hardcode the binary jobs that you want to test into the .circleci/config.yml workflow, so you should actually make at least two commits, one for your changes and one to temporarily hardcode jobs. See https://github.com/pytorch/pytorch/pull/22928 as an example of how to do this.
```sh
```
# Make your changes
touch .circleci/verbatim-sources/nightly-binary-build-defaults.yml
@ -407,7 +409,7 @@ The advantage of this flow is that you can make new changes to the base commit a
You can build Linux binaries locally easily using docker.
```sh
```
# Run the docker
# Use the correct docker image, pytorch/conda-cuda used here as an example
#
@ -417,6 +419,8 @@ You can build Linux binaries locally easily using docker.
# in the docker container then you will see path/to/foo/baz on your local
# machine. You could also clone the pytorch and builder repos in the docker.
#
# If you're building a CUDA binary then use `nvidia-docker run` instead, see below.
#
# If you know how, add ccache as a volume too and speed up everything
docker run \
-v your/pytorch/repo:/pytorch \
@ -440,7 +444,9 @@ export DESIRED_CUDA=cpu
**Building CUDA binaries on docker**
You can build CUDA binaries on CPU-only machines, but you can only run CUDA binaries on CUDA machines. This means that you can build a CUDA binary on a docker on your laptop if you so choose (though it's gonna take a long time).
To build a CUDA binary you need to use `nvidia-docker run` instead of just `docker run` (or you can manually pass `--runtime=nvidia`). This adds some needed libraries and things to build CUDA stuff.
You can build CUDA binaries on CPU-only machines, but you can only run CUDA binaries on CUDA machines. This means that you can build a CUDA binary on a docker on your laptop if you so choose (though it's gonna take a loong time).
For Facebook employees, ask about beefy machines that have docker support and use those instead of your laptop; it will be 5x as fast.
@ -450,7 +456,7 @@ There's no easy way to generate reproducible hermetic MacOS environments. If y
But if you want to try, then I'd recommend
```sh
```
# Create a new terminal
# Clear your LD_LIBRARY_PATH and trim as much out of your PATH as you
# know how to do


@ -5,6 +5,9 @@ for "smoketest" builds.
Each subclass of ConfigNode represents a layer of the configuration hierarchy.
These tree nodes encapsulate the logic for whether a branch of the hierarchy
should be "pruned".
In addition to generating config.yml content, the tree is also traversed
to produce a visualization of config dimensions.
"""
from collections import OrderedDict
@ -25,17 +28,16 @@ DEPS_INCLUSION_DIMENSIONS = [
]
def get_processor_arch_name(gpu_version):
return "cpu" if not gpu_version else (
"cu" + gpu_version.strip("cuda") if gpu_version.startswith("cuda") else gpu_version
)
def get_processor_arch_name(cuda_version):
return "cpu" if not cuda_version else "cu" + cuda_version
LINUX_PACKAGE_VARIANTS = OrderedDict(
manywheel=[
"3.5m",
"3.6m",
"3.7m",
"3.8m",
"3.9m"
],
conda=dimensions.STANDARD_PYTHON_VERSIONS,
libtorch=[
@ -44,7 +46,7 @@ LINUX_PACKAGE_VARIANTS = OrderedDict(
)
CONFIG_TREE_DATA = OrderedDict(
linux=(dimensions.GPU_VERSIONS, LINUX_PACKAGE_VARIANTS),
linux=(dimensions.CUDA_VERSIONS, LINUX_PACKAGE_VARIANTS),
macos=([None], OrderedDict(
wheel=dimensions.STANDARD_PYTHON_VERSIONS,
conda=dimensions.STANDARD_PYTHON_VERSIONS,
@ -52,28 +54,18 @@ CONFIG_TREE_DATA = OrderedDict(
"3.7",
],
)),
macos_arm64=([None], OrderedDict(
wheel=[
"3.8",
"3.9",
],
conda=[
"3.8",
"3.9",
windows=(dimensions.CUDA_VERSIONS, OrderedDict(
wheel=dimensions.STANDARD_PYTHON_VERSIONS,
conda=dimensions.STANDARD_PYTHON_VERSIONS,
libtorch=[
"3.7",
],
)),
windows=(
[v for v in dimensions.GPU_VERSIONS if v not in dimensions.ROCM_VERSION_LABELS],
OrderedDict(
wheel=dimensions.STANDARD_PYTHON_VERSIONS,
conda=dimensions.STANDARD_PYTHON_VERSIONS,
libtorch=[
"3.7",
],
)
),
)
CONFIG_TREE_DATA_NO_WINDOWS = CONFIG_TREE_DATA.copy()
CONFIG_TREE_DATA_NO_WINDOWS.pop("windows")
# GCC config variants:
#
# All the nightlies (except libtorch with new gcc ABI) are built with devtoolset7,
@ -108,12 +100,12 @@ class TopLevelNode(ConfigNode):
class OSConfigNode(ConfigNode):
def __init__(self, parent, os_name, gpu_versions, py_tree):
def __init__(self, parent, os_name, cuda_versions, py_tree):
super(OSConfigNode, self).__init__(parent, os_name)
self.py_tree = py_tree
self.props["os_name"] = os_name
self.props["gpu_versions"] = gpu_versions
self.props["cuda_versions"] = cuda_versions
def get_children(self):
return [PackageFormatConfigNode(self, k, v) for k, v in self.py_tree.items()]
@ -126,14 +118,13 @@ class PackageFormatConfigNode(ConfigNode):
self.props["python_versions"] = python_versions
self.props["package_format"] = package_format
def get_children(self):
if self.find_prop("os_name") == "linux":
return [LinuxGccConfigNode(self, v) for v in LINUX_GCC_CONFIG_VARIANTS[self.find_prop("package_format")]]
elif self.find_prop("os_name") == "windows" and self.find_prop("package_format") == "libtorch":
return [WindowsLibtorchConfigNode(self, v) for v in WINDOWS_LIBTORCH_CONFIG_VARIANTS]
else:
return [ArchConfigNode(self, v) for v in self.find_prop("gpu_versions")]
return [ArchConfigNode(self, v) for v in self.find_prop("cuda_versions")]
class LinuxGccConfigNode(ConfigNode):
@ -143,22 +134,14 @@ class LinuxGccConfigNode(ConfigNode):
self.props["gcc_config_variant"] = gcc_config_variant
def get_children(self):
gpu_versions = self.find_prop("gpu_versions")
cuda_versions = self.find_prop("cuda_versions")
# XXX devtoolset7 on CUDA 9.0 is temporarily disabled
# see https://github.com/pytorch/pytorch/issues/20066
if self.find_prop("gcc_config_variant") == 'devtoolset7':
gpu_versions = filter(lambda x: x != "cuda_90", gpu_versions)
cuda_versions = filter(lambda x: x != "90", cuda_versions)
# XXX disabling conda rocm build since docker images are not there
if self.find_prop("package_format") == 'conda':
gpu_versions = filter(lambda x: x not in dimensions.ROCM_VERSION_LABELS, gpu_versions)
# XXX libtorch rocm build is temporarily disabled
if self.find_prop("package_format") == 'libtorch':
gpu_versions = filter(lambda x: x not in dimensions.ROCM_VERSION_LABELS, gpu_versions)
return [ArchConfigNode(self, v) for v in gpu_versions]
return [ArchConfigNode(self, v) for v in cuda_versions]
class WindowsLibtorchConfigNode(ConfigNode):
@ -168,14 +151,14 @@ class WindowsLibtorchConfigNode(ConfigNode):
self.props["libtorch_config_variant"] = libtorch_config_variant
def get_children(self):
return [ArchConfigNode(self, v) for v in self.find_prop("gpu_versions")]
return [ArchConfigNode(self, v) for v in self.find_prop("cuda_versions")]
class ArchConfigNode(ConfigNode):
def __init__(self, parent, gpu):
super(ArchConfigNode, self).__init__(parent, get_processor_arch_name(gpu))
def __init__(self, parent, cu):
super(ArchConfigNode, self).__init__(parent, get_processor_arch_name(cu))
self.props["gpu"] = gpu
self.props["cu"] = cu
def get_children(self):
return [PyVersionConfigNode(self, v) for v in self.find_prop("python_versions")]
@ -188,6 +171,8 @@ class PyVersionConfigNode(ConfigNode):
self.props["pyver"] = pyver
def get_children(self):
smoke = self.find_prop("smoke")
package_format = self.find_prop("package_format")
os_name = self.find_prop("os_name")
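The two spellings of `get_processor_arch_name` in this file's diff take different inputs: one receives labels such as `cuda102` or `rocm4.2`, the other receives bare CUDA numbers such as `102`. The sketch below restates both side by side. The function names are hypothetical, and the label variant slices off the literal `cuda` prefix instead of calling `str.strip("cuda")`, which removes any of the characters c, u, d, a from both ends rather than a prefix.

```python
def arch_name_from_gpu_label(gpu_version):
    """Variant that takes labels such as "cuda102" or "rocm4.2"."""
    if not gpu_version:
        return "cpu"
    if gpu_version.startswith("cuda"):
        # Slice off the literal prefix rather than stripping characters.
        return "cu" + gpu_version[len("cuda"):]
    # Non-CUDA labels (for example ROCm) are passed through unchanged.
    return gpu_version


def arch_name_from_cuda_number(cuda_version):
    """Variant that takes bare CUDA numbers such as "102"."""
    return "cpu" if not cuda_version else "cu" + cuda_version


for label in (None, "cuda102", "rocm4.2"):
    print(label, "->", arch_name_from_gpu_label(label))
for number in (None, "92", "102"):
    print(number, "->", arch_name_from_cuda_number(number))
```

Both variants agree on the CPU case and on CUDA builds; only the label variant can name a ROCm build.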


@ -1,15 +1,15 @@
from collections import OrderedDict
import cimodel.data.simple.util.branch_filters as branch_filters
import cimodel.data.binary_build_data as binary_build_data
import cimodel.lib.conf_tree as conf_tree
import cimodel.lib.miniutils as miniutils
class Conf(object):
def __init__(self, os, gpu_version, pydistro, parms, smoke, libtorch_variant, gcc_config_variant, libtorch_config_variant):
def __init__(self, os, cuda_version, pydistro, parms, smoke, libtorch_variant, gcc_config_variant, libtorch_config_variant):
self.os = os
self.gpu_version = gpu_version
self.cuda_version = cuda_version
self.pydistro = pydistro
self.parms = parms
self.smoke = smoke
@ -18,7 +18,7 @@ class Conf(object):
self.libtorch_config_variant = libtorch_config_variant
def gen_build_env_parms(self):
elems = [self.pydistro] + self.parms + [binary_build_data.get_processor_arch_name(self.gpu_version)]
elems = [self.pydistro] + self.parms + [binary_build_data.get_processor_arch_name(self.cuda_version)]
if self.gcc_config_variant is not None:
elems.append(str(self.gcc_config_variant))
if self.libtorch_config_variant is not None:
@ -27,19 +27,7 @@ class Conf(object):
def gen_docker_image(self):
if self.gcc_config_variant == 'gcc5.4_cxx11-abi':
if self.gpu_version is None:
return miniutils.quote("pytorch/libtorch-cxx11-builder:cpu")
else:
return miniutils.quote(
f"pytorch/libtorch-cxx11-builder:{self.gpu_version}"
)
if self.pydistro == "conda":
if self.gpu_version is None:
return miniutils.quote("pytorch/conda-builder:cpu")
else:
return miniutils.quote(
f"pytorch/conda-builder:{self.gpu_version}"
)
return miniutils.quote("pytorch/pytorch-binary-docker-image-ubuntu16.04:latest")
docker_word_substitution = {
"manywheel": "manylinux",
@ -49,12 +37,9 @@ class Conf(object):
docker_distro_prefix = miniutils.override(self.pydistro, docker_word_substitution)
# The cpu nightlies are built on the pytorch/manylinux-cuda102 docker image
# TODO cuda images should consolidate into tag-base images similar to rocm
alt_docker_suffix = "cuda102" if not self.gpu_version else (
"rocm:" + self.gpu_version.strip("rocm") if self.gpu_version.startswith("rocm") else self.gpu_version)
docker_distro_suffix = alt_docker_suffix if self.pydistro != "conda" else (
"cuda" if alt_docker_suffix.startswith("cuda") else "rocm")
return miniutils.quote("pytorch/" + docker_distro_prefix + "-" + docker_distro_suffix)
alt_docker_suffix = self.cuda_version or "102"
docker_distro_suffix = "" if self.pydistro == "conda" else alt_docker_suffix
return miniutils.quote("pytorch/" + docker_distro_prefix + "-cuda" + docker_distro_suffix)
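As a rough illustration of the `cuda_version`-based naming in `gen_docker_image` above, the following sketch reproduces only the string logic. `override` is a hypothetical stand-in for `miniutils.override`, the substitution table contains only the `manywheel` entry that is visible in the diff, and the surrounding `miniutils.quote` call is left out.

```python
def override(word, substitutions):
    # Hypothetical stand-in for miniutils.override: map the word if a
    # substitution exists, otherwise return it unchanged.
    return substitutions.get(word, word)


def docker_image_for(pydistro, cuda_version):
    # Only the "manywheel" entry is visible in the diff; any other
    # entries of the real table are deliberately not reproduced here.
    prefix = override(pydistro, {"manywheel": "manylinux"})
    # CPU builds (cuda_version is None) fall back to the CUDA 10.2 image;
    # conda images carry no version suffix at all.
    suffix = "" if pydistro == "conda" else (cuda_version or "102")
    return "pytorch/" + prefix + "-cuda" + suffix


print(docker_image_for("manywheel", None))
print(docker_image_for("manywheel", "92"))
print(docker_image_for("conda", "101"))
```

Under these assumptions a CPU manywheel build resolves to `pytorch/manylinux-cuda102`, and every conda build resolves to `pytorch/conda-cuda`, the image the README uses as its example.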
def get_name_prefix(self):
return "smoke" if self.smoke else "binary"
@ -79,89 +64,67 @@ class Conf(object):
job_def = OrderedDict()
job_def["name"] = self.gen_build_name(phase, nightly)
job_def["build_environment"] = miniutils.quote(" ".join(self.gen_build_env_parms()))
job_def["requires"] = ["setup"]
if self.smoke:
job_def["requires"] = [
"update_s3_htmls",
]
job_def["filters"] = branch_filters.gen_filter_dict(
branches_list=["postnightly"],
)
job_def["requires"].append("update_s3_htmls_for_nightlies")
job_def["requires"].append("update_s3_htmls_for_nightlies_devtoolset7")
job_def["filters"] = {"branches": {"only": "postnightly"}}
else:
filter_branch = r"/.*/"
job_def["filters"] = branch_filters.gen_filter_dict(
branches_list=[filter_branch],
tags_list=[branch_filters.RC_PATTERN],
)
filter_branches = ["nightly"]
# we only want to add the release branch filter if we aren't
# uploading
if phase not in ["upload"]:
filter_branches.append(r"/release\/.*/")
job_def["filters"] = {
"branches": {
"only": filter_branches
},
# Will run on tags like v1.5.0-rc1, etc.
"tags": {
# Using a raw string here to avoid having to escape
# anything
"only": r"/v[0-9]+(\.[0-9]+)*-rc[0-9]+/"
}
}
if self.libtorch_variant:
job_def["libtorch_variant"] = miniutils.quote(self.libtorch_variant)
if phase == "test":
if not self.smoke:
job_def["requires"] = [self.gen_build_name("build", nightly)]
if not (self.smoke and self.os == "macos") and self.os != "windows":
job_def["requires"].append(self.gen_build_name("build", nightly))
if not (self.smoke and self.os == "macos"):
job_def["docker_image"] = self.gen_docker_image()
# fix this. only works on cuda not rocm
if self.os != "windows" and self.gpu_version:
if self.cuda_version:
job_def["use_cuda_docker_runtime"] = miniutils.quote("1")
else:
if self.os == "linux" and phase != "upload":
job_def["docker_image"] = self.gen_docker_image()
if phase == "test":
if self.gpu_version:
if self.os == "windows":
job_def["executor"] = "windows-with-nvidia-gpu"
else:
job_def["resource_class"] = "gpu.medium"
if self.cuda_version:
job_def["resource_class"] = "gpu.medium"
if phase == "upload":
job_def["context"] = "org-member"
job_def["requires"] = ["setup", self.gen_build_name(upload_phase_dependency, nightly)]
os_name = miniutils.override(self.os, {"macos": "mac"})
job_name = "_".join([self.get_name_prefix(), os_name, phase])
return {job_name : job_def}
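The branch and tag filters built in `gen_workflow_job` above are plain strings or slash-delimited regular expressions. A quick way to sanity-check which names they select is sketched below. It assumes that a filter wrapped in `/.../` is a regex that must match the whole name and that any other filter is an exact string comparison; `circleci_filter_matches` is a hypothetical helper, not part of the repository.

```python
import re

RC_TAG_FILTER = r"/v[0-9]+(\.[0-9]+)*-rc[0-9]+/"
RELEASE_BRANCH_FILTER = r"/release\/.*/"


def circleci_filter_matches(filter_spec, name):
    # Assumption: "/.../" denotes a full-match regex, anything else is
    # compared literally.
    if len(filter_spec) >= 2 and filter_spec.startswith("/") and filter_spec.endswith("/"):
        return re.fullmatch(filter_spec[1:-1], name) is not None
    return filter_spec == name


for tag in ("v1.5.0-rc1", "v1.5.0", "1.5.0-rc1"):
    print(tag, circleci_filter_matches(RC_TAG_FILTER, tag))
for branch in ("release/1.5", "nightly"):
    print(branch, circleci_filter_matches(RELEASE_BRANCH_FILTER, branch))
```

With this reading, release-candidate tags such as `v1.5.0-rc1` are selected, while a final tag such as `v1.5.0` is not.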
def gen_upload_job(self, phase, requires_dependency):
"""Generate binary_upload job for configuration
Output looks similar to:
- binary_upload:
name: binary_linux_manywheel_3_7m_cu113_devtoolset7_nightly_upload
context: org-member
requires: binary_linux_manywheel_3_7m_cu113_devtoolset7_nightly_test
filters:
branches:
only:
- nightly
tags:
only: /v[0-9]+(\\.[0-9]+)*-rc[0-9]+/
package_type: manywheel
upload_subfolder: cu113
"""
return {
"binary_upload": OrderedDict({
"name": self.gen_build_name(phase, nightly=True),
"context": "org-member",
"requires": [self.gen_build_name(
requires_dependency,
nightly=True
)],
"filters": branch_filters.gen_filter_dict(
branches_list=["nightly"],
tags_list=[branch_filters.RC_PATTERN],
),
"package_type": self.pydistro,
"upload_subfolder": binary_build_data.get_processor_arch_name(
self.gpu_version,
),
})
}
def get_root(smoke, name):
return binary_build_data.TopLevelNode(
name,
binary_build_data.CONFIG_TREE_DATA,
smoke,
)
if smoke:
return binary_build_data.TopLevelNode(
name,
binary_build_data.CONFIG_TREE_DATA_NO_WINDOWS,
smoke,
)
else:
return binary_build_data.TopLevelNode(
name,
binary_build_data.CONFIG_TREE_DATA,
smoke,
)
def gen_build_env_list(smoke):
@ -173,10 +136,10 @@ def gen_build_env_list(smoke):
for c in config_list:
conf = Conf(
c.find_prop("os_name"),
c.find_prop("gpu"),
c.find_prop("cu"),
c.find_prop("package_format"),
[c.find_prop("pyver")],
c.find_prop("smoke") and not (c.find_prop("os_name") == "macos_arm64"), # don't test arm64
c.find_prop("smoke"),
c.find_prop("libtorch_variant"),
c.find_prop("gcc_config_variant"),
c.find_prop("libtorch_config_variant"),
@ -185,35 +148,24 @@ def gen_build_env_list(smoke):
return newlist
def predicate_exclude_macos(config):
return config.os == "linux" or config.os == "windows"
def predicate_exclude_nonlinux_and_libtorch(config):
return config.os == "linux"
def get_nightly_uploads():
configs = gen_build_env_list(False)
mylist = []
for conf in configs:
phase_dependency = "test" if predicate_exclude_macos(conf) else "build"
mylist.append(conf.gen_upload_job("upload", phase_dependency))
phase_dependency = "test" if predicate_exclude_nonlinux_and_libtorch(conf) else "build"
mylist.append(conf.gen_workflow_job("upload", phase_dependency, nightly=True))
return mylist
def get_post_upload_jobs():
return [
{
"update_s3_htmls": {
"name": "update_s3_htmls",
"context": "org-member",
"filters": branch_filters.gen_filter_dict(
branches_list=["postnightly"],
),
},
},
]
def get_nightly_tests():
configs = gen_build_env_list(False)
filtered_configs = filter(predicate_exclude_macos, configs)
filtered_configs = filter(predicate_exclude_nonlinux_and_libtorch, configs)
tests = []
for conf_options in filtered_configs:
@ -228,9 +180,7 @@ def get_jobs(toplevel_key, smoke):
configs = gen_build_env_list(smoke)
phase = "build" if toplevel_key == "binarybuilds" else "test"
for build_config in configs:
# don't test for macos_arm64 as it's cross compiled
if phase != "test" or build_config.os != "macos_arm64":
jobs_list.append(build_config.gen_workflow_job(phase, nightly=True))
jobs_list.append(build_config.gen_workflow_job(phase, nightly=True))
return jobs_list


@ -0,0 +1,91 @@
from cimodel.lib.conf_tree import ConfigNode, XImportant
from cimodel.lib.conf_tree import Ver
CONFIG_TREE_DATA = [
(Ver("ubuntu", "16.04"), [
([Ver("clang", "7")], [XImportant("onnx_main_py3.6"),
XImportant("onnx_ort1_py3.6"),
XImportant("onnx_ort2_py3.6")]),
]),
]
class TreeConfigNode(ConfigNode):
def __init__(self, parent, node_name, subtree):
super(TreeConfigNode, self).__init__(parent, self.modify_label(node_name))
self.subtree = subtree
self.init2(node_name)
# noinspection PyMethodMayBeStatic
def modify_label(self, label):
return str(label)
def init2(self, node_name):
pass
def get_children(self):
return [self.child_constructor()(self, k, v) for (k, v) in self.subtree]
def is_build_only(self):
if str(self.find_prop("language_version")) == "onnx_main_py3.6" or \
str(self.find_prop("language_version")) == "onnx_ort1_py3.6" or \
str(self.find_prop("language_version")) == "onnx_ort2_py3.6":
return False
return set(str(c) for c in self.find_prop("compiler_version")).intersection({
"clang3.8",
"clang3.9",
"clang7",
"android",
}) or self.find_prop("distro_version").name == "macos"
def is_test_only(self):
if str(self.find_prop("language_version")) == "onnx_ort1_py3.6" or \
str(self.find_prop("language_version")) == "onnx_ort2_py3.6":
return True
return False
class TopLevelNode(TreeConfigNode):
def __init__(self, node_name, subtree):
super(TopLevelNode, self).__init__(None, node_name, subtree)
# noinspection PyMethodMayBeStatic
def child_constructor(self):
return DistroConfigNode
class DistroConfigNode(TreeConfigNode):
def init2(self, node_name):
self.props["distro_version"] = node_name
# noinspection PyMethodMayBeStatic
def child_constructor(self):
return CompilerConfigNode
class CompilerConfigNode(TreeConfigNode):
def init2(self, node_name):
self.props["compiler_version"] = node_name
# noinspection PyMethodMayBeStatic
def child_constructor(self):
return LanguageConfigNode
class LanguageConfigNode(TreeConfigNode):
def init2(self, node_name):
self.props["language_version"] = node_name
self.props["build_only"] = self.is_build_only()
self.props["test_only"] = self.is_test_only()
def child_constructor(self):
return ImportantConfigNode
class ImportantConfigNode(TreeConfigNode):
def init2(self, node_name):
self.props["important"] = True
def get_children(self):
return []
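The node classes above rely on two behaviours of `cimodel.lib.conf_tree` that are not shown in this diff: a property set on one layer is visible from every node below it via `find_prop`, and `dfs` returns the leaf configurations. The sketch below is a minimal stand-in written under those assumptions; `ConfigNode`, `dfs` and `Level` here are simplified illustrations, not the repository's implementations.

```python
class ConfigNode:
    def __init__(self, parent, node_name):
        self.parent = parent
        self.node_name = node_name
        self.props = {}

    def get_children(self):
        return []

    def find_prop(self, propname):
        # Look on this node first, then walk up through the ancestors.
        node = self
        while node is not None:
            if propname in node.props:
                return node.props[propname]
            node = node.parent
        return None


def dfs(node):
    """Collect leaf nodes in depth-first order."""
    children = node.get_children()
    if not children:
        return [node]
    leaves = []
    for child in children:
        leaves.extend(dfs(child))
    return leaves


class Level(ConfigNode):
    """Generic layer: records its name under prop_key, then expands subtree."""

    def __init__(self, parent, node_name, prop_key, subtree):
        super().__init__(parent, str(node_name))
        self.props[prop_key] = node_name
        self.subtree = subtree

    def get_children(self):
        return [Level(self, name, key, sub) for (name, key, sub) in self.subtree]


root = Level(None, "ubuntu16.04", "distro_version", [
    ("clang7", "compiler_version", [
        ("onnx_main_py3.6", "language_version", []),
        ("onnx_ort1_py3.6", "language_version", []),
    ]),
])

for leaf in dfs(root):
    print(leaf.find_prop("distro_version"),
          leaf.find_prop("compiler_version"),
          leaf.find_prop("language_version"))
```

Each leaf sees the distro and compiler chosen higher up, which is what lets `is_build_only` query `compiler_version` and `distro_version` from a language-level node.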


@ -0,0 +1,175 @@
from collections import OrderedDict
import cimodel.data.dimensions as dimensions
import cimodel.lib.conf_tree as conf_tree
from cimodel.lib.conf_tree import Ver
import cimodel.lib.miniutils as miniutils
from cimodel.data.caffe2_build_data import CONFIG_TREE_DATA, TopLevelNode
from dataclasses import dataclass
DOCKER_IMAGE_PATH_BASE = "308535385114.dkr.ecr.us-east-1.amazonaws.com/caffe2/"
DOCKER_IMAGE_VERSION = "345"
@dataclass
class Conf:
language: str
distro: Ver
# There could be multiple compiler versions configured (e.g. nvcc
# for gpu files and host compiler (gcc/clang) for cpu files)
compilers: [Ver]
build_only: bool
test_only: bool
is_important: bool
@property
def compiler_names(self):
return [c.name for c in self.compilers]
# TODO: Eventually we can probably just remove the cudnn7 everywhere.
def get_cudnn_insertion(self):
omit = self.language == "onnx_main_py3.6" \
or self.language == "onnx_ort1_py3.6" \
or self.language == "onnx_ort2_py3.6" \
or set(self.compiler_names).intersection({"android", "mkl", "clang"}) \
or str(self.distro) in ["ubuntu14.04", "macos10.13"]
return [] if omit else ["cudnn7"]
def get_build_name_root_parts(self):
return [
"caffe2",
self.language,
] + self.get_build_name_middle_parts()
def get_build_name_middle_parts(self):
return [str(c) for c in self.compilers] + self.get_cudnn_insertion() + [str(self.distro)]
def construct_phase_name(self, phase):
root_parts = self.get_build_name_root_parts()
build_name_substitutions = {
"onnx_ort1_py3.6": "onnx_main_py3.6",
"onnx_ort2_py3.6": "onnx_main_py3.6",
}
if phase == "build":
root_parts = [miniutils.override(r, build_name_substitutions) for r in root_parts]
return "_".join(root_parts + [phase]).replace(".", "_")
def get_platform(self):
platform = self.distro.name
if self.distro.name != "macos":
platform = "linux"
return platform
def gen_docker_image(self):
lang_substitutions = {
"onnx_main_py3.6": "py3.6",
"onnx_ort1_py3.6": "py3.6",
"onnx_ort2_py3.6": "py3.6",
"cmake": "py3",
}
lang = miniutils.override(self.language, lang_substitutions)
parts = [lang] + self.get_build_name_middle_parts()
return miniutils.quote(DOCKER_IMAGE_PATH_BASE + "-".join(parts) + ":" + str(DOCKER_IMAGE_VERSION))
def gen_workflow_params(self, phase):
parameters = OrderedDict()
lang_substitutions = {
"onnx_py3": "onnx-py3",
"onnx_main_py3.6": "onnx-main-py3.6",
"onnx_ort1_py3.6": "onnx-ort1-py3.6",
"onnx_ort2_py3.6": "onnx-ort2-py3.6",
}
lang = miniutils.override(self.language, lang_substitutions)
parts = [
"caffe2",
lang,
] + self.get_build_name_middle_parts() + [phase]
build_env_name = "-".join(parts)
parameters["build_environment"] = miniutils.quote(build_env_name)
if "ios" in self.compiler_names:
parameters["build_ios"] = miniutils.quote("1")
if phase == "test":
# TODO cuda should not be considered a compiler
if "cuda" in self.compiler_names:
parameters["use_cuda_docker_runtime"] = miniutils.quote("1")
if self.distro.name != "macos":
parameters["docker_image"] = self.gen_docker_image()
if self.build_only:
parameters["build_only"] = miniutils.quote("1")
if phase == "test":
resource_class = "large" if "cuda" not in self.compiler_names else "gpu.medium"
parameters["resource_class"] = resource_class
return parameters
def gen_workflow_job(self, phase):
job_def = OrderedDict()
job_def["name"] = self.construct_phase_name(phase)
job_def["requires"] = ["setup"]
if phase == "test":
job_def["requires"].append(self.construct_phase_name("build"))
job_name = "caffe2_" + self.get_platform() + "_test"
else:
job_name = "caffe2_" + self.get_platform() + "_build"
if not self.is_important:
job_def["filters"] = {"branches": {"only": ["master", r"/ci-all\/.*/", r"/release\/.*/"]}}
job_def.update(self.gen_workflow_params(phase))
return {job_name : job_def}
def get_root():
return TopLevelNode("Caffe2 Builds", CONFIG_TREE_DATA)
def instantiate_configs():
config_list = []
root = get_root()
found_configs = conf_tree.dfs(root)
for fc in found_configs:
c = Conf(
language=fc.find_prop("language_version"),
distro=fc.find_prop("distro_version"),
compilers=fc.find_prop("compiler_version"),
build_only=fc.find_prop("build_only"),
test_only=fc.find_prop("test_only"),
is_important=fc.find_prop("important"),
)
config_list.append(c)
return config_list
def get_workflow_jobs():
configs = instantiate_configs()
x = []
for conf_options in configs:
phases = ["build"]
if not conf_options.build_only:
phases = dimensions.PHASES
if conf_options.test_only:
phases = ["test"]
for phase in phases:
x.append(conf_options.gen_workflow_job(phase))
return x
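To make the job naming and phase selection above concrete, here is a small sketch of `construct_phase_name` and the phase loop in `get_workflow_jobs`, rewritten as free functions. It assumes that `Ver` objects render as name plus version (for example `clang7`, `ubuntu16.04`) and that the cuDNN insertion is empty, as it is for the ONNX languages; `phases_for` is a hypothetical name.

```python
BUILD_NAME_SUBSTITUTIONS = {
    "onnx_ort1_py3.6": "onnx_main_py3.6",
    "onnx_ort2_py3.6": "onnx_main_py3.6",
}


def construct_phase_name(language, compilers, distro, phase):
    parts = ["caffe2", language] + list(compilers) + [distro]
    if phase == "build":
        # The ORT variants have no build of their own; their build name
        # is mapped onto the onnx_main build.
        parts = [BUILD_NAME_SUBSTITUTIONS.get(p, p) for p in parts]
    return "_".join(parts + [phase]).replace(".", "_")


def phases_for(build_only, test_only, all_phases=("build", "test")):
    phases = ["build"]
    if not build_only:
        phases = list(all_phases)
    if test_only:
        phases = ["test"]
    return phases


print(construct_phase_name("onnx_ort1_py3.6", ["clang7"], "ubuntu16.04", "test"))
print(construct_phase_name("onnx_ort1_py3.6", ["clang7"], "ubuntu16.04", "build"))
print(phases_for(build_only=False, test_only=True))
```

Because a test job requires `construct_phase_name("build")`, the `onnx_ort1` and `onnx_ort2` test jobs end up depending on the single `onnx_main` build job.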


@ -1,24 +1,15 @@
PHASES = ["build", "test"]
CUDA_VERSIONS = [
None, # cpu build
"92",
"101",
"102",
"111",
"113",
]
ROCM_VERSIONS = [
"4.0.1",
"4.1",
"4.2",
]
ROCM_VERSION_LABELS = ["rocm" + v for v in ROCM_VERSIONS]
GPU_VERSIONS = [None] + ["cuda" + v for v in CUDA_VERSIONS] + ROCM_VERSION_LABELS
STANDARD_PYTHON_VERSIONS = [
"3.5",
"3.6",
"3.7",
"3.8",
"3.9"
"3.8"
]


@ -3,67 +3,61 @@ from cimodel.lib.conf_tree import ConfigNode, X, XImportant
CONFIG_TREE_DATA = [
("xenial", [
(None, [
X("3.5"),
X("nightly"),
]),
("gcc", [
("5.4", [ # This whole subtree rebases to master and then builds
XImportant("3.6"),
("3.6", [
("important", [X(True)]),
("parallel_tbb", [X(True)]),
("parallel_native", [X(True)]),
]),
]),
# TODO: bring back libtorch test
("7", [X("3.6")]),
]),
("clang", [
("5", [
XImportant("3.6"), # This is actually the ASAN build
]),
("7", [
("3.6", [
("asan", [
(True, [
("shard_test", [XImportant(True)]),
]),
]),
("onnx", [XImportant(True)]),
("xla", [XImportant(True)]),
]),
]),
]),
("cuda", [
("10.2", [
("3.6", [
# Builds are needed for slow_gradcheck
('build_only', [X(True)]),
("slow_gradcheck", [
# If you update this slow gradcheck, you should
# also update docker_definitions.py to make sure
# the docker image matches the config used here
(True, [
('shard_test', [XImportant(True)]),
]),
]),
# UNCOMMENT THE BELOW TO REENABLE LIBTORCH
# ("libtorch", [
# (True, [
# ('build_only', [X(True)]),
# ]),
# ]),
]),
]),
]),
]),
("bionic", [
("clang", [
("9", [
# Note there are magic strings here
# https://github.com/pytorch/pytorch/blob/master/.jenkins/pytorch/build.sh#L21
# and
# https://github.com/pytorch/pytorch/blob/master/.jenkins/pytorch/build.sh#L143
# and
# https://github.com/pytorch/pytorch/blob/master/.jenkins/pytorch/build.sh#L153
# (from https://github.com/pytorch/pytorch/pull/17323#discussion_r259453144)
X("3.6"),
]),
("9.2", [X("3.6")]),
("10.1", [X("3.6")]),
("10.2", [
XImportant("3.6"),
("3.6", [
("xla", [XImportant(True)]),
("vulkan", [XImportant(True)]),
("libtorch", [XImportant(True)])
]),
]),
]),
# @jithunnair-amd believes Jenkins builds are sufficient
# ("rocm", [
# ("3.9", [
# ("3.6", [
# ('build_only', [XImportant(True)]),
# ]),
# ]),
# ]),
("android", [
("r19c", [
("3.6", [
("android_abi", [XImportant("x86_32")]),
("android_abi", [X("x86_64")]),
("android_abi", [X("arm-v7a")]),
("android_abi", [X("arm-v8a")]),
])
]),
]),
]),
]
@ -107,7 +101,6 @@ class DistroConfigNode(TreeConfigNode):
next_nodes = {
"xenial": XenialCompilerConfigNode,
"bionic": BionicCompilerConfigNode,
}
return next_nodes[distro]
@ -116,8 +109,6 @@ class PyVerConfigNode(TreeConfigNode):
def init2(self, node_name):
self.props["pyver"] = node_name
self.props["abbreviated_pyver"] = get_major_pyver(node_name)
if node_name == "3.9":
self.props["abbreviated_pyver"] = "py3.9"
# noinspection PyMethodMayBeStatic
def child_constructor(self):
@ -132,44 +123,16 @@ class ExperimentalFeatureConfigNode(TreeConfigNode):
experimental_feature = self.find_prop("experimental_feature")
next_nodes = {
"asan": AsanConfigNode,
"xla": XlaConfigNode,
"mlc": MLCConfigNode,
"vulkan": VulkanConfigNode,
"parallel_tbb": ParallelTBBConfigNode,
"noarch": NoarchConfigNode,
"parallel_native": ParallelNativeConfigNode,
"onnx": ONNXConfigNode,
"libtorch": LibTorchConfigNode,
"important": ImportantConfigNode,
"build_only": BuildOnlyConfigNode,
"shard_test": ShardTestConfigNode,
"cuda_gcc_override": CudaGccOverrideConfigNode,
"coverage": CoverageConfigNode,
"pure_torch": PureTorchConfigNode,
"slow_gradcheck": SlowGradcheckConfigNode,
"android_abi": AndroidAbiConfigNode,
}
return next_nodes[experimental_feature]
class SlowGradcheckConfigNode(TreeConfigNode):
def init2(self, node_name):
self.props["is_slow_gradcheck"] = True
def child_constructor(self):
return ExperimentalFeatureConfigNode
class PureTorchConfigNode(TreeConfigNode):
def modify_label(self, label):
return "PURE_TORCH=" + str(label)
def init2(self, node_name):
self.props["is_pure_torch"] = node_name
def child_constructor(self):
return ImportantConfigNode
class XlaConfigNode(TreeConfigNode):
def modify_label(self, label):
return "XLA=" + str(label)
@ -180,50 +143,6 @@ class XlaConfigNode(TreeConfigNode):
def child_constructor(self):
return ImportantConfigNode
class MLCConfigNode(TreeConfigNode):
def modify_label(self, label):
return "MLC=" + str(label)
def init2(self, node_name):
self.props["is_mlc"] = node_name
def child_constructor(self):
return ImportantConfigNode
class AsanConfigNode(TreeConfigNode):
def modify_label(self, label):
return "Asan=" + str(label)
def init2(self, node_name):
self.props["is_asan"] = node_name
def child_constructor(self):
return ExperimentalFeatureConfigNode
class ONNXConfigNode(TreeConfigNode):
def modify_label(self, label):
return "Onnx=" + str(label)
def init2(self, node_name):
self.props["is_onnx"] = node_name
def child_constructor(self):
return ImportantConfigNode
class VulkanConfigNode(TreeConfigNode):
def modify_label(self, label):
return "Vulkan=" + str(label)
def init2(self, node_name):
self.props["is_vulkan"] = node_name
def child_constructor(self):
return ImportantConfigNode
class ParallelTBBConfigNode(TreeConfigNode):
def modify_label(self, label):
return "PARALLELTBB=" + str(label)
@ -234,15 +153,6 @@ class ParallelTBBConfigNode(TreeConfigNode):
def child_constructor(self):
return ImportantConfigNode
class NoarchConfigNode(TreeConfigNode):
def init2(self, node_name):
self.props["is_noarch"] = node_name
def child_constructor(self):
return ImportantConfigNode
class ParallelNativeConfigNode(TreeConfigNode):
def modify_label(self, label):
return "PARALLELNATIVE=" + str(label)
@ -253,7 +163,6 @@ class ParallelNativeConfigNode(TreeConfigNode):
def child_constructor(self):
return ImportantConfigNode
class LibTorchConfigNode(TreeConfigNode):
def modify_label(self, label):
return "BUILD_TEST_LIBTORCH=" + str(label)
@ -262,41 +171,16 @@ class LibTorchConfigNode(TreeConfigNode):
self.props["is_libtorch"] = node_name
def child_constructor(self):
return ExperimentalFeatureConfigNode
return ImportantConfigNode
class AndroidAbiConfigNode(TreeConfigNode):
class CudaGccOverrideConfigNode(TreeConfigNode):
def init2(self, node_name):
self.props["cuda_gcc_override"] = node_name
def child_constructor(self):
return ExperimentalFeatureConfigNode
class BuildOnlyConfigNode(TreeConfigNode):
def init2(self, node_name):
self.props["build_only"] = node_name
def child_constructor(self):
return ExperimentalFeatureConfigNode
class ShardTestConfigNode(TreeConfigNode):
def init2(self, node_name):
self.props["shard_test"] = node_name
self.props["android_abi"] = node_name
def child_constructor(self):
return ImportantConfigNode
class CoverageConfigNode(TreeConfigNode):
def init2(self, node_name):
self.props["is_coverage"] = node_name
def child_constructor(self):
return ExperimentalFeatureConfigNode
class ImportantConfigNode(TreeConfigNode):
def modify_label(self, label):
return "IMPORTANT=" + str(label)
@ -309,6 +193,7 @@ class ImportantConfigNode(TreeConfigNode):
class XenialCompilerConfigNode(TreeConfigNode):
def modify_label(self, label):
return label or "<unspecified>"
@ -321,19 +206,6 @@ class XenialCompilerConfigNode(TreeConfigNode):
return XenialCompilerVersionConfigNode if self.props["compiler_name"] else PyVerConfigNode
class BionicCompilerConfigNode(TreeConfigNode):
def modify_label(self, label):
return label or "<unspecified>"
def init2(self, node_name):
self.props["compiler_name"] = node_name
# noinspection PyMethodMayBeStatic
def child_constructor(self):
return BionicCompilerVersionConfigNode if self.props["compiler_name"] else PyVerConfigNode
class XenialCompilerVersionConfigNode(TreeConfigNode):
def init2(self, node_name):
self.props["compiler_version"] = node_name
@ -341,12 +213,3 @@ class XenialCompilerVersionConfigNode(TreeConfigNode):
# noinspection PyMethodMayBeStatic
def child_constructor(self):
return PyVerConfigNode
class BionicCompilerVersionConfigNode(TreeConfigNode):
def init2(self, node_name):
self.props["compiler_version"] = node_name
# noinspection PyMethodMayBeStatic
def child_constructor(self):
return PyVerConfigNode
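`CONFIG_TREE_DATA` in this file is a nested list of `(name, children)` pairs, and each chain from a distro down to a leaf is one build configuration. The sketch below enumerates those chains for a small excerpt of the tree. The definitions of `X` and `XImportant` are assumptions about `cimodel.lib.conf_tree` (a leaf is a value with an empty subtree, and `XImportant` adds an `important` child), and `leaf_paths` is a hypothetical helper.

```python
def X(val):
    # Assumed shape: a leaf is a value with no subtree.
    return val, ()


def XImportant(name):
    # Assumed shape: shorthand for a node with a single "important" child.
    return name, [("important", [X(True)])]


def leaf_paths(subtree, prefix=()):
    """Yield one tuple of names per root-to-leaf chain."""
    for name, children in subtree:
        path = prefix + (name,)
        if children:
            yield from leaf_paths(children, path)
        else:
            yield path


TREE = [
    ("xenial", [
        ("gcc", [
            ("5.4", [XImportant("3.6")]),
            ("7", [X("3.6")]),
        ]),
    ]),
]

for path in leaf_paths(TREE):
    print(path)
```

Each layer class in the file (`DistroConfigNode`, the compiler nodes, `PyVerConfigNode`, the experimental-feature nodes) consumes one element of such a chain and records it as a property.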


@ -1,13 +1,19 @@
from collections import OrderedDict
from dataclasses import dataclass, field
from typing import List, Optional
from cimodel.data.pytorch_build_data import TopLevelNode, CONFIG_TREE_DATA
import cimodel.data.dimensions as dimensions
import cimodel.lib.conf_tree as conf_tree
import cimodel.lib.miniutils as miniutils
from cimodel.data.pytorch_build_data import CONFIG_TREE_DATA, TopLevelNode
from cimodel.data.simple.util.branch_filters import gen_filter_dict, RC_PATTERN
from cimodel.data.simple.util.docker_constants import gen_docker_image
from dataclasses import dataclass, field
from typing import List, Optional
DOCKER_IMAGE_PATH_BASE = "308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/"
# ARE YOU EDITING THIS NUMBER? MAKE SURE YOU READ THE GUIDANCE AT THE
# TOP OF .circleci/config.yml
DOCKER_IMAGE_VERSION = "f990c76a-a798-42bb-852f-5be5006f8026"
@dataclass
@ -17,25 +23,17 @@ class Conf:
parms_list_ignored_for_docker_image: Optional[List[str]] = None
pyver: Optional[str] = None
cuda_version: Optional[str] = None
rocm_version: Optional[str] = None
# TODO expand this to cover all the USE_* that we want to test for
# tensorrt, leveldb, lmdb, redis, opencv, mkldnn, ideep, etc.
# (from https://github.com/pytorch/pytorch/pull/17323#discussion_r259453608)
is_xla: bool = False
is_vulkan: bool = False
is_pure_torch: bool = False
restrict_phases: Optional[List[str]] = None
gpu_resource: Optional[str] = None
dependent_tests: List = field(default_factory=list)
parent_build: Optional["Conf"] = None
parent_build: Optional['Conf'] = None
is_libtorch: bool = False
is_important: bool = False
parallel_backend: Optional[str] = None
build_only: bool = False
@staticmethod
def is_test_phase(phase):
return "test" in phase
# TODO: Eliminate the special casing for docker paths
# In the short term, we *will* need to support special casing as docker images are merged for caffe2 and pytorch
@ -48,47 +46,31 @@ class Conf:
leading.append("pytorch")
if self.is_xla and not for_docker:
leading.append("xla")
if self.is_vulkan and not for_docker:
leading.append("vulkan")
if self.is_libtorch and not for_docker:
leading.append("libtorch")
if self.is_pure_torch and not for_docker:
leading.append("pure_torch")
if self.parallel_backend is not None and not for_docker:
leading.append(self.parallel_backend)
cuda_parms = []
if self.cuda_version:
cudnn = "cudnn8" if self.cuda_version.startswith("11.") else "cudnn7"
cuda_parms.extend(["cuda" + self.cuda_version, cudnn])
if self.rocm_version:
cuda_parms.extend([f"rocm{self.rocm_version}"])
cuda_parms.extend(["cuda" + self.cuda_version, "cudnn7"])
result = leading + ["linux", self.distro] + cuda_parms + self.parms
if not for_docker and self.parms_list_ignored_for_docker_image is not None:
if (not for_docker and self.parms_list_ignored_for_docker_image is not None):
result = result + self.parms_list_ignored_for_docker_image
return result
def gen_docker_image_path(self):
parms_source = self.parent_build or self
base_build_env_name = "-".join(parms_source.get_parms(True))
image_name, _ = gen_docker_image(base_build_env_name)
return miniutils.quote(image_name)
def gen_docker_image_requires(self):
parms_source = self.parent_build or self
base_build_env_name = "-".join(parms_source.get_parms(True))
_, requires = gen_docker_image(base_build_env_name)
return miniutils.quote(requires)
return miniutils.quote(DOCKER_IMAGE_PATH_BASE + base_build_env_name + ":" + str(DOCKER_IMAGE_VERSION))
def get_build_job_name_pieces(self, build_or_test):
return self.get_parms(False) + [build_or_test]
def gen_build_name(self, build_or_test):
return (
("_".join(map(str, self.get_build_job_name_pieces(build_or_test))))
.replace(".", "_")
.replace("-", "_")
)
return ("_".join(map(str, self.get_build_job_name_pieces(build_or_test)))).replace(".", "_").replace("-", "_")
def get_dependents(self):
return self.dependent_tests or []
@ -100,28 +82,22 @@ class Conf:
build_env_name = "-".join(map(str, build_job_name_pieces))
parameters["build_environment"] = miniutils.quote(build_env_name)
parameters["docker_image"] = self.gen_docker_image_path()
if Conf.is_test_phase(phase) and self.gpu_resource:
if phase == "test" and self.gpu_resource:
parameters["use_cuda_docker_runtime"] = miniutils.quote("1")
if Conf.is_test_phase(phase):
if phase == "test":
resource_class = "large"
if self.gpu_resource:
resource_class = "gpu." + self.gpu_resource
if self.rocm_version is not None:
resource_class = "pytorch/amd-gpu"
parameters["resource_class"] = resource_class
if phase == "build" and self.rocm_version is not None:
parameters["resource_class"] = "xlarge"
if hasattr(self, 'filters'):
parameters['filters'] = self.filters
if self.build_only:
parameters['build_only'] = miniutils.quote(str(int(True)))
return parameters
def gen_workflow_job(self, phase):
# All jobs require the setup job
job_def = OrderedDict()
job_def["name"] = self.gen_build_name(phase)
job_def["requires"] = ["setup"]
if Conf.is_test_phase(phase):
if phase == "test":
# TODO When merging the caffe2 and pytorch jobs, it might be convenient for a while to make a
# caffe2 test job dependent on a pytorch build job. This way we could quickly dedup the repeated
@ -129,89 +105,69 @@ class Conf:
# pytorch build job (from https://github.com/pytorch/pytorch/pull/17323#discussion_r259452641)
dependency_build = self.parent_build or self
job_def["requires"] = [dependency_build.gen_build_name("build")]
job_def["requires"].append(dependency_build.gen_build_name("build"))
job_name = "pytorch_linux_test"
else:
job_name = "pytorch_linux_build"
job_def["requires"] = [self.gen_docker_image_requires()]
if not self.is_important:
job_def["filters"] = gen_filter_dict()
# If you update this, update
# caffe2_build_definitions.py too
job_def["filters"] = {"branches": {"only": ["master", r"/ci-all\/.*/", r"/release\/.*/"]}}
job_def.update(self.gen_workflow_params(phase))
return {job_name: job_def}
return {job_name : job_def}
# TODO This is a hack to special case some configs just for the workflow list
class HiddenConf(object):
def __init__(self, name, parent_build=None, filters=None):
def __init__(self, name, parent_build=None):
self.name = name
self.parent_build = parent_build
self.filters = filters
def gen_workflow_job(self, phase):
return {
self.gen_build_name(phase): {
"requires": [self.parent_build.gen_build_name("build")],
"filters": self.filters,
}
}
return {self.gen_build_name(phase): {"requires": [self.parent_build.gen_build_name("build")]}}
def gen_build_name(self, _):
return self.name
class DocPushConf(object):
def __init__(self, name, parent_build=None, branch="master"):
self.name = name
self.parent_build = parent_build
self.branch = branch
def gen_workflow_job(self, phase):
return {
"pytorch_doc_push": {
"name": self.name,
"branch": self.branch,
"requires": [self.parent_build],
"context": "org-member",
"filters": gen_filter_dict(branches_list=["nightly"],
tags_list=RC_PATTERN)
}
}
# TODO Convert these to graph nodes
def gen_dependent_configs(xenial_parent_config):
extra_parms = [
(["multigpu"], "large"),
(["NO_AVX2"], "medium"),
(["NO_AVX", "NO_AVX2"], "medium"),
(["slow"], "medium"),
(["nogpu"], None),
]
configs = []
for parms, gpu in extra_parms:
c = Conf(
xenial_parent_config.distro,
["py3"] + parms,
pyver="3.6",
cuda_version=xenial_parent_config.cuda_version,
restrict_phases=["test"],
gpu_resource=gpu,
parent_build=xenial_parent_config,
is_important=xenial_parent_config.is_important,
)
configs.append(c)
return configs
def gen_docs_configs(xenial_parent_config):
configs = []
configs.append(
HiddenConf(
"pytorch_python_doc_build",
parent_build=xenial_parent_config,
filters=gen_filter_dict(branches_list=["master", "nightly"],
tags_list=RC_PATTERN),
)
)
configs.append(
DocPushConf(
"pytorch_python_doc_push",
parent_build="pytorch_python_doc_build",
branch="site",
)
)
for x in ["pytorch_python_doc_push", "pytorch_cpp_doc_push"]:
configs.append(HiddenConf(x, parent_build=xenial_parent_config))
configs.append(
HiddenConf(
"pytorch_cpp_doc_build",
parent_build=xenial_parent_config,
filters=gen_filter_dict(branches_list=["master", "nightly"],
tags_list=RC_PATTERN),
)
)
configs.append(
DocPushConf(
"pytorch_cpp_doc_push",
parent_build="pytorch_cpp_doc_build",
branch="master",
)
)
return configs
@ -225,31 +181,21 @@ def gen_tree():
return configs_list
def instantiate_configs(only_slow_gradcheck):
def instantiate_configs():
config_list = []
root = get_root()
found_configs = conf_tree.dfs(root)
restrict_phases = None
for fc in found_configs:
restrict_phases = None
distro_name = fc.find_prop("distro_name")
compiler_name = fc.find_prop("compiler_name")
compiler_version = fc.find_prop("compiler_version")
is_xla = fc.find_prop("is_xla") or False
is_asan = fc.find_prop("is_asan") or False
is_coverage = fc.find_prop("is_coverage") or False
is_noarch = fc.find_prop("is_noarch") or False
is_onnx = fc.find_prop("is_onnx") or False
is_pure_torch = fc.find_prop("is_pure_torch") or False
is_vulkan = fc.find_prop("is_vulkan") or False
is_slow_gradcheck = fc.find_prop("is_slow_gradcheck") or False
parms_list_ignored_for_docker_image = []
if only_slow_gradcheck ^ is_slow_gradcheck:
continue
python_version = None
if compiler_name == "cuda" or compiler_name == "android":
python_version = fc.find_prop("pyver")
@ -258,14 +204,9 @@ def instantiate_configs(only_slow_gradcheck):
parms_list = ["py" + fc.find_prop("pyver")]
cuda_version = None
rocm_version = None
if compiler_name == "cuda":
cuda_version = fc.find_prop("compiler_version")
elif compiler_name == "rocm":
rocm_version = fc.find_prop("compiler_version")
restrict_phases = ["build", "test1", "test2", "caffe2_test"]
elif compiler_name == "android":
android_ndk_version = fc.find_prop("compiler_version")
# TODO: do we need clang to compile host binaries like protoc?
@ -279,43 +220,19 @@ def instantiate_configs(only_slow_gradcheck):
gcc_version = compiler_name + (fc.find_prop("compiler_version") or "")
parms_list.append(gcc_version)
if is_asan:
parms_list.append("asan")
python_version = fc.find_prop("pyver")
parms_list[0] = fc.find_prop("abbreviated_pyver")
# TODO: This is a nasty special case
if compiler_name == "clang" and not is_xla:
parms_list.append("asan")
python_version = fc.find_prop("pyver")
parms_list[0] = fc.find_prop("abbreviated_pyver")
if is_coverage:
parms_list_ignored_for_docker_image.append("coverage")
python_version = fc.find_prop("pyver")
if is_noarch:
parms_list_ignored_for_docker_image.append("noarch")
if is_onnx:
parms_list.append("onnx")
python_version = fc.find_prop("pyver")
parms_list[0] = fc.find_prop("abbreviated_pyver")
restrict_phases = ["build", "ort_test1", "ort_test2"]
if cuda_version:
cuda_gcc_version = fc.find_prop("cuda_gcc_override") or "gcc7"
parms_list.append(cuda_gcc_version)
if cuda_version in ["9.2", "10", "10.1", "10.2"]:
# TODO The gcc version is orthogonal to CUDA version?
parms_list.append("gcc7")
is_libtorch = fc.find_prop("is_libtorch") or False
is_important = fc.find_prop("is_important") or False
parallel_backend = fc.find_prop("parallel_backend") or None
build_only = fc.find_prop("build_only") or False
shard_test = fc.find_prop("shard_test") or False
# TODO: fix pure_torch python test packaging issue.
if shard_test:
restrict_phases = ["build"] if restrict_phases is None else restrict_phases
restrict_phases.extend(["test1", "test2"])
if build_only or is_pure_torch:
restrict_phases = ["build"]
if is_slow_gradcheck:
parms_list_ignored_for_docker_image.append("old")
parms_list_ignored_for_docker_image.append("gradcheck")
gpu_resource = None
if cuda_version and cuda_version != "10":
@ -327,49 +244,32 @@ def instantiate_configs(only_slow_gradcheck):
parms_list_ignored_for_docker_image,
python_version,
cuda_version,
rocm_version,
is_xla,
is_vulkan,
is_pure_torch,
restrict_phases,
gpu_resource,
is_libtorch=is_libtorch,
is_important=is_important,
parallel_backend=parallel_backend,
build_only=build_only,
)
# run docs builds on "pytorch-linux-xenial-py3.6-gcc5.4". Docs builds
# should run on a CPU-only build that runs on all PRs.
# XXX should this be updated to a more modern build? Projects are
# beginning to drop python3.6
if (
distro_name == "xenial"
and fc.find_prop("pyver") == "3.6"
and cuda_version is None
and parallel_backend is None
and not is_vulkan
and not is_pure_torch
and compiler_name == "gcc"
and fc.find_prop("compiler_version") == "5.4"
):
c.filters = gen_filter_dict(branches_list=r"/.*/",
tags_list=RC_PATTERN)
if distro_name == 'xenial' and fc.find_prop("pyver") == '3.6' \
and cuda_version is None \
and parallel_backend is None \
and compiler_name == 'gcc' \
and fc.find_prop('compiler_version') == '5.4':
c.dependent_tests = gen_docs_configs(c)
if (
compiler_name != "clang"
and not rocm_version
and not is_libtorch
and not is_vulkan
and not is_pure_torch
and not is_noarch
and not is_slow_gradcheck
and not only_slow_gradcheck
and not build_only
):
distributed_test = Conf(
c.gen_build_name("") + "distributed",
if cuda_version == "10.1" and python_version == "3.6" and not is_libtorch:
c.dependent_tests = gen_dependent_configs(c)
if (compiler_name == "gcc"
and compiler_version == "5.4"
and not is_libtorch
and parallel_backend is None):
bc_breaking_check = Conf(
"backward-compatibility-check",
[],
is_xla=False,
restrict_phases=["test"],
@ -377,16 +277,16 @@ def instantiate_configs(only_slow_gradcheck):
is_important=True,
parent_build=c,
)
c.dependent_tests.append(distributed_test)
c.dependent_tests.append(bc_breaking_check)
config_list.append(c)
return config_list
def get_workflow_jobs(only_slow_gradcheck=False):
def get_workflow_jobs():
config_list = instantiate_configs(only_slow_gradcheck)
config_list = instantiate_configs()
x = []
for conf_options in config_list:
@ -396,7 +296,7 @@ def get_workflow_jobs(only_slow_gradcheck=False):
for phase in phases:
# TODO why does this not have a test?
if Conf.is_test_phase(phase) and conf_options.cuda_version == "10":
if phase == "test" and conf_options.cuda_version == "10":
continue
x.append(conf_options.gen_workflow_job(phase))

@ -1,28 +0,0 @@
from collections import OrderedDict
from cimodel.data.simple.util.branch_filters import gen_filter_dict
from cimodel.lib.miniutils import quote
CHANNELS_TO_PRUNE = ["pytorch-nightly", "pytorch-test"]
PACKAGES_TO_PRUNE = "pytorch torchvision torchaudio torchtext ignite torchcsprng"
def gen_workflow_job(channel: str):
return OrderedDict(
{
"anaconda_prune": OrderedDict(
{
"name": f"anaconda-prune-{channel}",
"context": quote("org-member"),
"packages": quote(PACKAGES_TO_PRUNE),
"channel": channel,
"filters": gen_filter_dict(branches_list=["postnightly"]),
}
)
}
)
def get_workflow_jobs():
return [gen_workflow_job(channel) for channel in CHANNELS_TO_PRUNE]

@ -1,119 +0,0 @@
import cimodel.data.simple.util.branch_filters as branch_filters
from cimodel.data.simple.util.docker_constants import (
DOCKER_IMAGE_NDK, DOCKER_REQUIREMENT_NDK
)
import cimodel.lib.miniutils as miniutils
class AndroidJob:
def __init__(self,
variant,
template_name,
is_master_only=True):
self.variant = variant
self.template_name = template_name
self.is_master_only = is_master_only
def gen_tree(self):
base_name_parts = [
"pytorch",
"linux",
"xenial",
"py3",
"clang5",
"android",
"ndk",
"r19c",
] + self.variant + [
"build",
]
full_job_name = "_".join(base_name_parts)
build_env_name = "-".join(base_name_parts)
props_dict = {
"name": full_job_name,
"build_environment": "\"{}\"".format(build_env_name),
"docker_image": "\"{}\"".format(DOCKER_IMAGE_NDK),
"requires": [DOCKER_REQUIREMENT_NDK]
}
if self.is_master_only:
props_dict["filters"] = branch_filters.gen_filter_dict(branch_filters.NON_PR_BRANCH_LIST)
return [{self.template_name: props_dict}]
class AndroidGradleJob:
def __init__(self,
job_name,
template_name,
dependencies,
is_master_only=True,
is_pr_only=False,
extra_props=tuple()):
self.job_name = job_name
self.template_name = template_name
self.dependencies = dependencies
self.is_master_only = is_master_only
self.is_pr_only = is_pr_only
self.extra_props = dict(extra_props)
def gen_tree(self):
props_dict = {
"name": self.job_name,
"requires": self.dependencies,
}
if self.is_master_only:
props_dict["filters"] = branch_filters.gen_filter_dict(branch_filters.NON_PR_BRANCH_LIST)
elif self.is_pr_only:
props_dict["filters"] = branch_filters.gen_filter_dict(branch_filters.PR_BRANCH_LIST)
if self.extra_props:
props_dict.update(self.extra_props)
return [{self.template_name: props_dict}]
WORKFLOW_DATA = [
AndroidJob(["x86_32"], "pytorch_linux_build", is_master_only=False),
AndroidJob(["x86_64"], "pytorch_linux_build"),
AndroidJob(["arm", "v7a"], "pytorch_linux_build"),
AndroidJob(["arm", "v8a"], "pytorch_linux_build"),
AndroidGradleJob(
"pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-build-x86_32",
"pytorch_android_gradle_build-x86_32",
["pytorch_linux_xenial_py3_clang5_android_ndk_r19c_x86_32_build"],
is_master_only=False,
is_pr_only=True),
AndroidGradleJob(
"pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single",
"pytorch_android_gradle_custom_build_single",
[DOCKER_REQUIREMENT_NDK],
is_master_only=False,
is_pr_only=True),
AndroidGradleJob(
"pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single-full-jit",
"pytorch_android_gradle_custom_build_single",
[DOCKER_REQUIREMENT_NDK],
is_master_only=False,
is_pr_only=True,
extra_props=tuple({
"lite_interpreter": miniutils.quote(str(int(False)))
}.items())),
AndroidGradleJob(
"pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-build",
"pytorch_android_gradle_build",
["pytorch_linux_xenial_py3_clang5_android_ndk_r19c_x86_32_build",
"pytorch_linux_xenial_py3_clang5_android_ndk_r19c_x86_64_build",
"pytorch_linux_xenial_py3_clang5_android_ndk_r19c_arm_v7a_build",
"pytorch_linux_xenial_py3_clang5_android_ndk_r19c_arm_v8a_build"]),
]
def get_workflow_jobs():
return [item.gen_tree() for item in WORKFLOW_DATA]

@ -1,69 +0,0 @@
from cimodel.data.simple.util.docker_constants import (
DOCKER_IMAGE_GCC7,
DOCKER_REQUIREMENT_GCC7
)
def gen_job_name(phase):
job_name_parts = [
"pytorch",
"bazel",
phase,
]
return "_".join(job_name_parts)
class BazelJob:
def __init__(self, phase, extra_props=None):
self.phase = phase
self.extra_props = extra_props or {}
def gen_tree(self):
template_parts = [
"pytorch",
"linux",
"bazel",
self.phase,
]
build_env_parts = [
"pytorch",
"linux",
"xenial",
"py3.6",
"gcc7",
"bazel",
self.phase,
]
full_job_name = gen_job_name(self.phase)
build_env_name = "-".join(build_env_parts)
extra_requires = (
[gen_job_name("build")] if self.phase == "test" else
[DOCKER_REQUIREMENT_GCC7]
)
props_dict = {
"build_environment": build_env_name,
"docker_image": DOCKER_IMAGE_GCC7,
"name": full_job_name,
"requires": extra_requires,
}
props_dict.update(self.extra_props)
template_name = "_".join(template_parts)
return [{template_name: props_dict}]
WORKFLOW_DATA = [
BazelJob("build", {"resource_class": "large"}),
BazelJob("test"),
]
def get_workflow_jobs():
return [item.gen_tree() for item in WORKFLOW_DATA]

@ -1,193 +0,0 @@
"""
TODO: Refactor circleci/cimodel/data/binary_build_data.py to generate this file
instead of doing one offs here
Binary builds (subset, to smoke test that they'll work)
NB: If you modify this file, you need to also modify
the binary_and_smoke_tests_on_pr variable in
pytorch-ci-hud to adjust the allowed build list
at https://github.com/ezyang/pytorch-ci-hud/blob/master/src/BuildHistoryDisplay.js
Note:
This binary build is currently broken, see https://github.com/pytorch/pytorch/issues/16710
- binary_linux_conda_3_6_cu90_devtoolset7_build
- binary_linux_conda_3_6_cu90_devtoolset7_test
TODO
we should test a libtorch cuda build, but they take too long
- binary_linux_libtorch_3_6m_cu90_devtoolset7_static-without-deps_build
"""
import cimodel.lib.miniutils as miniutils
import cimodel.data.simple.util.branch_filters
class SmoketestJob:
def __init__(self,
template_name,
build_env_parts,
docker_image,
job_name,
is_master_only=False,
requires=None,
has_libtorch_variant=False,
extra_props=None):
self.template_name = template_name
self.build_env_parts = build_env_parts
self.docker_image = docker_image
self.job_name = job_name
self.is_master_only = is_master_only
self.requires = requires or []
self.has_libtorch_variant = has_libtorch_variant
self.extra_props = extra_props or {}
def gen_tree(self):
props_dict = {
"build_environment": " ".join(self.build_env_parts),
"name": self.job_name,
"requires": self.requires,
}
if self.docker_image:
props_dict["docker_image"] = self.docker_image
if self.is_master_only:
props_dict["filters"] = cimodel.data.simple.util.branch_filters.gen_filter_dict()
if self.has_libtorch_variant:
props_dict["libtorch_variant"] = "shared-with-deps"
props_dict.update(self.extra_props)
return [{self.template_name: props_dict}]
WORKFLOW_DATA = [
SmoketestJob(
"binary_linux_build",
["manywheel", "3.7m", "cu102", "devtoolset7"],
"pytorch/manylinux-cuda102",
"binary_linux_manywheel_3_7m_cu102_devtoolset7_build",
is_master_only=True,
),
SmoketestJob(
"binary_linux_build",
["libtorch", "3.7m", "cpu", "devtoolset7"],
"pytorch/manylinux-cuda102",
"binary_linux_libtorch_3_7m_cpu_devtoolset7_shared-with-deps_build",
is_master_only=True,
has_libtorch_variant=True,
),
SmoketestJob(
"binary_linux_build",
["libtorch", "3.7m", "cpu", "gcc5.4_cxx11-abi"],
"pytorch/pytorch-binary-docker-image-ubuntu16.04:latest",
"binary_linux_libtorch_3_7m_cpu_gcc5_4_cxx11-abi_shared-with-deps_build",
is_master_only=False,
has_libtorch_variant=True,
),
SmoketestJob(
"binary_mac_build",
["wheel", "3.7", "cpu"],
None,
"binary_macos_wheel_3_7_cpu_build",
is_master_only=True,
),
# This job has an average run time of 3 hours o.O
# Now only running this on master to reduce overhead
SmoketestJob(
"binary_mac_build",
["libtorch", "3.7", "cpu"],
None,
"binary_macos_libtorch_3_7_cpu_build",
is_master_only=True,
),
SmoketestJob(
"binary_windows_build",
["libtorch", "3.7", "cpu", "debug"],
None,
"binary_windows_libtorch_3_7_cpu_debug_build",
is_master_only=True,
),
SmoketestJob(
"binary_windows_build",
["libtorch", "3.7", "cpu", "release"],
None,
"binary_windows_libtorch_3_7_cpu_release_build",
is_master_only=True,
),
SmoketestJob(
"binary_windows_build",
["wheel", "3.7", "cu102"],
None,
"binary_windows_wheel_3_7_cu102_build",
is_master_only=True,
),
SmoketestJob(
"binary_windows_test",
["libtorch", "3.7", "cpu", "debug"],
None,
"binary_windows_libtorch_3_7_cpu_debug_test",
is_master_only=True,
requires=["binary_windows_libtorch_3_7_cpu_debug_build"],
),
SmoketestJob(
"binary_windows_test",
["libtorch", "3.7", "cpu", "release"],
None,
"binary_windows_libtorch_3_7_cpu_release_test",
is_master_only=False,
requires=["binary_windows_libtorch_3_7_cpu_release_build"],
),
SmoketestJob(
"binary_windows_test",
["wheel", "3.7", "cu102"],
None,
"binary_windows_wheel_3_7_cu102_test",
is_master_only=True,
requires=["binary_windows_wheel_3_7_cu102_build"],
extra_props={
"executor": "windows-with-nvidia-gpu",
},
),
SmoketestJob(
"binary_linux_test",
["manywheel", "3.7m", "cu102", "devtoolset7"],
"pytorch/manylinux-cuda102",
"binary_linux_manywheel_3_7m_cu102_devtoolset7_test",
is_master_only=True,
requires=["binary_linux_manywheel_3_7m_cu102_devtoolset7_build"],
extra_props={
"resource_class": "gpu.medium",
"use_cuda_docker_runtime": miniutils.quote((str(1))),
},
),
SmoketestJob(
"binary_linux_test",
["libtorch", "3.7m", "cpu", "devtoolset7"],
"pytorch/manylinux-cuda102",
"binary_linux_libtorch_3_7m_cpu_devtoolset7_shared-with-deps_test",
is_master_only=True,
requires=["binary_linux_libtorch_3_7m_cpu_devtoolset7_shared-with-deps_build"],
has_libtorch_variant=True,
),
SmoketestJob(
"binary_linux_test",
["libtorch", "3.7m", "cpu", "gcc5.4_cxx11-abi"],
"pytorch/pytorch-binary-docker-image-ubuntu16.04:latest",
"binary_linux_libtorch_3_7m_cpu_gcc5_4_cxx11-abi_shared-with-deps_test",
is_master_only=True,
requires=["binary_linux_libtorch_3_7m_cpu_gcc5_4_cxx11-abi_shared-with-deps_build"],
has_libtorch_variant=True,
),
]
def get_workflow_jobs():
return [item.gen_tree() for item in WORKFLOW_DATA]

@ -1,56 +0,0 @@
from collections import OrderedDict
from cimodel.lib.miniutils import quote
from cimodel.data.simple.util.branch_filters import gen_filter_dict, RC_PATTERN
# TODO: make this generated from a matrix rather than just a static list
IMAGE_NAMES = [
"pytorch-linux-bionic-cuda10.2-cudnn7-py3.9-gcc7",
"pytorch-linux-bionic-py3.6-clang9",
"pytorch-linux-bionic-cuda10.2-cudnn7-py3.6-clang9",
"pytorch-linux-bionic-py3.8-gcc9",
"pytorch-linux-xenial-cuda10.2-cudnn7-py3-gcc7",
"pytorch-linux-xenial-cuda11.1-cudnn8-py3-gcc7",
"pytorch-linux-xenial-cuda11.3-cudnn8-py3-gcc7",
"pytorch-linux-xenial-py3-clang5-android-ndk-r19c",
"pytorch-linux-xenial-py3-clang5-asan",
"pytorch-linux-xenial-py3-clang7-asan",
"pytorch-linux-xenial-py3-clang7-onnx",
"pytorch-linux-xenial-py3.8",
"pytorch-linux-xenial-py3.6-clang7",
"pytorch-linux-xenial-py3.6-gcc5.4", # this one is used in doc builds
"pytorch-linux-xenial-py3.6-gcc7.2",
"pytorch-linux-xenial-py3.6-gcc7",
"pytorch-linux-bionic-rocm4.1-py3.6",
"pytorch-linux-bionic-rocm4.2-py3.6",
"pytorch-linux-bionic-rocm4.3.1-py3.6",
]
# This entry should be an element from the list above
# This should contain the image matching the "slow_gradcheck" entry in
# pytorch_build_data.py
SLOW_GRADCHECK_IMAGE_NAME = "pytorch-linux-xenial-cuda10.2-cudnn7-py3-gcc7"
def get_workflow_jobs(only_slow_gradcheck=False):
"""Generates a list of docker image build definitions"""
ret = []
for image_name in IMAGE_NAMES:
if only_slow_gradcheck and image_name != SLOW_GRADCHECK_IMAGE_NAME:
continue
parameters = OrderedDict({
"name": quote(f"docker-{image_name}"),
"image_name": quote(image_name),
})
if image_name == "pytorch-linux-xenial-py3.6-gcc5.4":
# pushing documentation on tags requires CircleCI to also
# build all the dependencies on tags, including this docker image
parameters['filters'] = gen_filter_dict(branches_list=r"/.*/",
tags_list=RC_PATTERN)
ret.append(OrderedDict(
{
"docker_build_job": parameters
}
))
return ret

@ -1,82 +0,0 @@
from cimodel.data.simple.util.versions import MultiPartVersion
import cimodel.lib.miniutils as miniutils
XCODE_VERSION = MultiPartVersion([12, 5, 1])
class ArchVariant:
def __init__(self, name, custom_build_name=""):
self.name = name
self.custom_build_name = custom_build_name
def render(self):
extra_parts = [self.custom_build_name] if len(self.custom_build_name) > 0 else []
return "_".join([self.name] + extra_parts)
def get_platform(arch_variant_name):
return "SIMULATOR" if arch_variant_name == "x86_64" else "OS"
class IOSJob:
def __init__(self, xcode_version, arch_variant, is_org_member_context=True, extra_props=None):
self.xcode_version = xcode_version
self.arch_variant = arch_variant
self.is_org_member_context = is_org_member_context
self.extra_props = extra_props
def gen_name_parts(self, with_version_dots):
version_parts = self.xcode_version.render_dots_or_parts(with_version_dots)
build_variant_suffix = "_".join([self.arch_variant.render(), "build"])
return [
"pytorch",
"ios",
] + version_parts + [
build_variant_suffix,
]
def gen_job_name(self):
return "_".join(self.gen_name_parts(False))
def gen_tree(self):
platform_name = get_platform(self.arch_variant.name)
props_dict = {
"build_environment": "-".join(self.gen_name_parts(True)),
"ios_arch": self.arch_variant.name,
"ios_platform": platform_name,
"name": self.gen_job_name(),
}
if self.is_org_member_context:
props_dict["context"] = "org-member"
if self.extra_props:
props_dict.update(self.extra_props)
return [{"pytorch_ios_build": props_dict}]
WORKFLOW_DATA = [
IOSJob(XCODE_VERSION, ArchVariant("x86_64"), is_org_member_context=False, extra_props={
"lite_interpreter": miniutils.quote(str(int(True)))}),
IOSJob(XCODE_VERSION, ArchVariant("x86_64", "full_jit"), is_org_member_context=False, extra_props={
"lite_interpreter": miniutils.quote(str(int(False)))}),
IOSJob(XCODE_VERSION, ArchVariant("arm64"), extra_props={
"lite_interpreter": miniutils.quote(str(int(True)))}),
IOSJob(XCODE_VERSION, ArchVariant("arm64", "metal"), extra_props={
"use_metal": miniutils.quote(str(int(True))),
"lite_interpreter": miniutils.quote(str(int(True)))}),
IOSJob(XCODE_VERSION, ArchVariant("arm64", "full_jit"), extra_props={
"lite_interpreter": miniutils.quote(str(int(False)))}),
IOSJob(XCODE_VERSION, ArchVariant("arm64", "custom"), extra_props={
"op_list": "mobilenetv2.yaml",
"lite_interpreter": miniutils.quote(str(int(True)))}),
]
def get_workflow_jobs():
return [item.gen_tree() for item in WORKFLOW_DATA]

@ -1,52 +0,0 @@
class MacOsJob:
def __init__(self, os_version, is_build=False, is_test=False, extra_props=tuple()):
# extra_props is a tuple because mutable data structures are not recommended
# as argument defaults.
self.os_version = os_version
self.is_build = is_build
self.is_test = is_test
self.extra_props = dict(extra_props)
def gen_tree(self):
non_phase_parts = ["pytorch", "macos", self.os_version, "py3"]
extra_name_list = [name for name, exist in self.extra_props.items() if exist]
full_job_name_list = non_phase_parts + extra_name_list + [
'build' if self.is_build else None,
'test' if self.is_test else None,
]
full_job_name = "_".join(list(filter(None, full_job_name_list)))
test_build_dependency = "_".join(non_phase_parts + ["build"])
extra_dependencies = [test_build_dependency] if self.is_test else []
job_dependencies = extra_dependencies
# Yes we name the job after itself, it needs a non-empty value in here
# for the YAML output to work.
props_dict = {"requires": job_dependencies, "name": full_job_name}
return [{full_job_name: props_dict}]
WORKFLOW_DATA = [
MacOsJob("10_15", is_build=True),
MacOsJob("10_13", is_build=True),
MacOsJob(
"10_13",
is_build=False,
is_test=True,
),
MacOsJob(
"10_13",
is_build=True,
is_test=True,
extra_props=tuple({
"lite_interpreter": True
}.items()),
)
]
def get_workflow_jobs():
return [item.gen_tree() for item in WORKFLOW_DATA]

@ -1,86 +0,0 @@
"""
PyTorch Mobile PR builds (use linux host toolchain + mobile build options)
"""
import cimodel.lib.miniutils as miniutils
import cimodel.data.simple.util.branch_filters
from cimodel.data.simple.util.docker_constants import (
DOCKER_IMAGE_ASAN,
DOCKER_REQUIREMENT_ASAN,
DOCKER_IMAGE_NDK,
DOCKER_REQUIREMENT_NDK
)
class MobileJob:
def __init__(
self,
docker_image,
docker_requires,
variant_parts,
is_master_only=False):
self.docker_image = docker_image
self.docker_requires = docker_requires
self.variant_parts = variant_parts
self.is_master_only = is_master_only
def gen_tree(self):
non_phase_parts = [
"pytorch",
"linux",
"xenial",
"py3",
"clang5",
"mobile",
] + self.variant_parts
full_job_name = "_".join(non_phase_parts)
build_env_name = "-".join(non_phase_parts)
props_dict = {
"build_environment": build_env_name,
"build_only": miniutils.quote(str(int(True))),
"docker_image": self.docker_image,
"requires": self.docker_requires,
"name": full_job_name,
}
if self.is_master_only:
props_dict["filters"] = cimodel.data.simple.util.branch_filters.gen_filter_dict()
return [{"pytorch_linux_build": props_dict}]
WORKFLOW_DATA = [
MobileJob(
DOCKER_IMAGE_ASAN,
[DOCKER_REQUIREMENT_ASAN],
["build"]
),
# Use LLVM-DEV toolchain in android-ndk-r19c docker image
MobileJob(
DOCKER_IMAGE_NDK,
[DOCKER_REQUIREMENT_NDK],
["custom", "build", "dynamic"]
),
MobileJob(
DOCKER_IMAGE_NDK,
[DOCKER_REQUIREMENT_NDK],
["custom", "build", "static"]
),
# Use LLVM-DEV toolchain in android-ndk-r19c docker image
# Most of this CI is already covered by "mobile-custom-build-dynamic" job
MobileJob(
DOCKER_IMAGE_NDK,
[DOCKER_REQUIREMENT_NDK],
["code", "analysis"],
True
),
]
def get_workflow_jobs():
return [item.gen_tree() for item in WORKFLOW_DATA]

@ -1,77 +0,0 @@
from cimodel.data.simple.util.docker_constants import (
DOCKER_IMAGE_NDK,
DOCKER_REQUIREMENT_NDK
)
class AndroidNightlyJob:
def __init__(self,
variant,
template_name,
extra_props=None,
with_docker=True,
requires=None,
no_build_suffix=False):
self.variant = variant
self.template_name = template_name
self.extra_props = extra_props or {}
self.with_docker = with_docker
self.requires = requires
self.no_build_suffix = no_build_suffix
def gen_tree(self):
base_name_parts = [
"pytorch",
"linux",
"xenial",
"py3",
"clang5",
"android",
"ndk",
"r19c",
] + self.variant
build_suffix = [] if self.no_build_suffix else ["build"]
full_job_name = "_".join(["nightly"] + base_name_parts + build_suffix)
build_env_name = "-".join(base_name_parts)
props_dict = {
"name": full_job_name,
"requires": self.requires,
"filters": {"branches": {"only": "nightly"}},
}
props_dict.update(self.extra_props)
if self.with_docker:
props_dict["docker_image"] = DOCKER_IMAGE_NDK
props_dict["build_environment"] = build_env_name
return [{self.template_name: props_dict}]
BASE_REQUIRES = [DOCKER_REQUIREMENT_NDK]
WORKFLOW_DATA = [
AndroidNightlyJob(["x86_32"], "pytorch_linux_build", requires=BASE_REQUIRES),
AndroidNightlyJob(["x86_64"], "pytorch_linux_build", requires=BASE_REQUIRES),
AndroidNightlyJob(["arm", "v7a"], "pytorch_linux_build", requires=BASE_REQUIRES),
AndroidNightlyJob(["arm", "v8a"], "pytorch_linux_build", requires=BASE_REQUIRES),
AndroidNightlyJob(["android_gradle"], "pytorch_android_gradle_build",
with_docker=False,
requires=[
"nightly_pytorch_linux_xenial_py3_clang5_android_ndk_r19c_x86_32_build",
"nightly_pytorch_linux_xenial_py3_clang5_android_ndk_r19c_x86_64_build",
"nightly_pytorch_linux_xenial_py3_clang5_android_ndk_r19c_arm_v7a_build",
"nightly_pytorch_linux_xenial_py3_clang5_android_ndk_r19c_arm_v8a_build"]),
AndroidNightlyJob(["x86_32_android_publish_snapshot"], "pytorch_android_publish_snapshot",
extra_props={"context": "org-member"},
with_docker=False,
requires=["nightly_pytorch_linux_xenial_py3_clang5_android_ndk_r19c_android_gradle_build"],
no_build_suffix=True),
]
def get_workflow_jobs():
return [item.gen_tree() for item in WORKFLOW_DATA]

@ -1,71 +0,0 @@
import cimodel.data.simple.ios_definitions as ios_definitions
import cimodel.lib.miniutils as miniutils
class IOSNightlyJob:
def __init__(self,
variant,
is_upload=False):
self.variant = variant
self.is_upload = is_upload
def get_phase_name(self):
return "upload" if self.is_upload else "build"
def get_common_name_pieces(self, with_version_dots):
extra_name_suffix = [self.get_phase_name()] if self.is_upload else []
common_name_pieces = [
"ios",
] + ios_definitions.XCODE_VERSION.render_dots_or_parts(with_version_dots) + [
"nightly",
self.variant,
"build",
] + extra_name_suffix
return common_name_pieces
def gen_job_name(self):
return "_".join(["pytorch"] + self.get_common_name_pieces(False))
def gen_tree(self):
extra_requires = [x.gen_job_name() for x in BUILD_CONFIGS] if self.is_upload else []
props_dict = {
"build_environment": "-".join(["libtorch"] + self.get_common_name_pieces(True)),
"requires": extra_requires,
"context": "org-member",
"filters": {"branches": {"only": "nightly"}},
}
if not self.is_upload:
props_dict["ios_arch"] = self.variant
props_dict["ios_platform"] = ios_definitions.get_platform(self.variant)
props_dict["name"] = self.gen_job_name()
props_dict["use_metal"] = miniutils.quote(str(int(True)))
props_dict["use_coreml"] = miniutils.quote(str(int(True)))
template_name = "_".join([
"binary",
"ios",
self.get_phase_name(),
])
return [{template_name: props_dict}]
BUILD_CONFIGS = [
IOSNightlyJob("x86_64"),
IOSNightlyJob("arm64"),
]
WORKFLOW_DATA = BUILD_CONFIGS + [
IOSNightlyJob("binary", is_upload=True),
]
def get_workflow_jobs():
return [item.gen_tree() for item in WORKFLOW_DATA]

@ -1,27 +0,0 @@
NON_PR_BRANCH_LIST = [
"master",
r"/ci-all\/.*/",
r"/release\/.*/",
]
PR_BRANCH_LIST = [
r"/gh\/.*\/head/",
r"/pull\/.*/",
]
RC_PATTERN = r"/v[0-9]+(\.[0-9]+)*-rc[0-9]+/"
def gen_filter_dict(
branches_list=NON_PR_BRANCH_LIST,
tags_list=None
):
"""Generates a filter dictionary for use with CircleCI's job filter"""
filter_dict = {
"branches": {
"only": branches_list,
},
}
if tags_list is not None:
filter_dict["tags"] = {"only": tags_list}
return filter_dict
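
A minimal usage sketch of the removed `gen_filter_dict` helper, re-indented so it runs on its own. The two calls mirror the `master_only` and `nightly_only` branches of `WindowsJob.gen_tree`; the variable names `default_filter` and `nightly_filter` are illustrative only and do not appear in the source.

```python
NON_PR_BRANCH_LIST = [
    "master",
    r"/ci-all\/.*/",
    r"/release\/.*/",
]

RC_PATTERN = r"/v[0-9]+(\.[0-9]+)*-rc[0-9]+/"


def gen_filter_dict(branches_list=NON_PR_BRANCH_LIST, tags_list=None):
    """Generates a filter dictionary for use with CircleCI's job filter"""
    filter_dict = {"branches": {"only": branches_list}}
    # The "tags" key is only emitted when a tag filter is requested.
    if tags_list is not None:
        filter_dict["tags"] = {"only": tags_list}
    return filter_dict


# Illustrative names, not taken from the source.
default_filter = gen_filter_dict()
nightly_filter = gen_filter_dict(branches_list=["nightly"], tags_list=RC_PATTERN)

print(default_filter)
print(nightly_filter)
```

With no arguments the result carries only a `branches` entry; passing `tags_list` adds a `tags` entry alongside it.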

@ -1,33 +0,0 @@
AWS_DOCKER_HOST = "308535385114.dkr.ecr.us-east-1.amazonaws.com"
def gen_docker_image(container_type):
return (
"/".join([AWS_DOCKER_HOST, "pytorch", container_type]),
f"docker-{container_type}",
)
def gen_docker_image_requires(image_name):
return [f"docker-{image_name}"]
DOCKER_IMAGE_BASIC, DOCKER_REQUIREMENT_BASE = gen_docker_image(
"pytorch-linux-xenial-py3.6-gcc5.4"
)
DOCKER_IMAGE_CUDA_10_2, DOCKER_REQUIREMENT_CUDA_10_2 = gen_docker_image(
"pytorch-linux-xenial-cuda10.2-cudnn7-py3-gcc7"
)
DOCKER_IMAGE_GCC7, DOCKER_REQUIREMENT_GCC7 = gen_docker_image(
"pytorch-linux-xenial-py3.6-gcc7"
)
def gen_mobile_docker(specifier):
container_type = "pytorch-linux-xenial-py3-clang5-" + specifier
return gen_docker_image(container_type)
DOCKER_IMAGE_ASAN, DOCKER_REQUIREMENT_ASAN = gen_mobile_docker("asan")
DOCKER_IMAGE_NDK, DOCKER_REQUIREMENT_NDK = gen_mobile_docker("android-ndk-r19c")
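
As a sketch of what these helpers return, the snippet below re-indents `gen_docker_image` and `gen_mobile_docker` and unpacks one result. The names `image` and `requirement` are illustrative; the source binds the same pair to `DOCKER_IMAGE_NDK` and `DOCKER_REQUIREMENT_NDK`.

```python
AWS_DOCKER_HOST = "308535385114.dkr.ecr.us-east-1.amazonaws.com"


def gen_docker_image(container_type):
    # Returns (full image path, name of the job that builds the image).
    return (
        "/".join([AWS_DOCKER_HOST, "pytorch", container_type]),
        f"docker-{container_type}",
    )


def gen_mobile_docker(specifier):
    container_type = "pytorch-linux-xenial-py3-clang5-" + specifier
    return gen_docker_image(container_type)


# Illustrative names for the two tuple elements.
image, requirement = gen_mobile_docker("android-ndk-r19c")
print(image)
print(requirement)
```

The second element is what the Android nightly jobs list under `requires` through `BASE_REQUIRES`.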

@ -1,34 +0,0 @@
class MultiPartVersion:
def __init__(self, parts, prefix=""):
self.parts = parts
self.prefix = prefix
def prefixed_parts(self):
"""
Prepends the first element of the version list
with the prefix string.
"""
if self.parts:
return [self.prefix + str(self.parts[0])] + [str(part) for part in self.parts[1:]]
else:
return [self.prefix]
def render_dots(self):
return ".".join(self.prefixed_parts())
def render_dots_or_parts(self, with_dots):
if with_dots:
return [self.render_dots()]
else:
return self.prefixed_parts()
class CudaVersion(MultiPartVersion):
def __init__(self, major, minor):
self.major = major
self.minor = minor
super().__init__([self.major, self.minor], "cuda")
def __str__(self):
return f"{self.major}.{self.minor}"
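
The rendering rules of `MultiPartVersion` are easier to see with concrete values. The sketch below re-indents the two classes and exercises them; the `"xcode"` prefix in the last line is an assumption chosen for illustration, since the definition of `XCODE_VERSION` is not shown in this diff.

```python
class MultiPartVersion:
    def __init__(self, parts, prefix=""):
        self.parts = parts
        self.prefix = prefix

    def prefixed_parts(self):
        # Only the first element receives the prefix.
        if self.parts:
            return [self.prefix + str(self.parts[0])] + [str(p) for p in self.parts[1:]]
        else:
            return [self.prefix]

    def render_dots(self):
        return ".".join(self.prefixed_parts())

    def render_dots_or_parts(self, with_dots):
        if with_dots:
            return [self.render_dots()]
        else:
            return self.prefixed_parts()


class CudaVersion(MultiPartVersion):
    def __init__(self, major, minor):
        self.major = major
        self.minor = minor
        super().__init__([self.major, self.minor], "cuda")

    def __str__(self):
        return f"{self.major}.{self.minor}"


cuda = CudaVersion(10, 2)
print(cuda.render_dots())                 # cuda10.2
print(cuda.render_dots_or_parts(False))   # ['cuda10', '2']
print(str(cuda))                          # 10.2

# Hypothetical prefix, used only to show the empty-parts case.
empty = MultiPartVersion([], "xcode")
print(empty.prefixed_parts())             # ['xcode']
```

Note that `render_dots_or_parts` always returns a list, so callers such as `get_common_name_pieces` can concatenate it with other name pieces in either mode.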

@ -1,160 +0,0 @@
import cimodel.lib.miniutils as miniutils
from cimodel.data.simple.util.branch_filters import gen_filter_dict, RC_PATTERN, NON_PR_BRANCH_LIST
from cimodel.data.simple.util.versions import CudaVersion
class WindowsJob:
def __init__(
self,
test_index,
vscode_spec,
cuda_version,
force_on_cpu=False,
multi_gpu=False,
master_only=False,
nightly_only=False,
master_and_nightly=False
):
self.test_index = test_index
self.vscode_spec = vscode_spec
self.cuda_version = cuda_version
self.force_on_cpu = force_on_cpu
self.multi_gpu = multi_gpu
self.master_only = master_only
self.nightly_only = nightly_only
self.master_and_nightly = master_and_nightly
def gen_tree(self):
base_phase = "build" if self.test_index is None else "test"
numbered_phase = (
base_phase if self.test_index is None else base_phase + str(self.test_index)
)
key_parts = ["pytorch", "windows", base_phase]
if self.multi_gpu:
key_parts.append('multigpu')
key_name = "_".join(key_parts)
cpu_forcing_name_parts = ["on", "cpu"] if self.force_on_cpu else []
target_arch = self.cuda_version.render_dots() if self.cuda_version else "cpu"
python_version = "3.8"
base_name_parts = [
"pytorch",
"windows",
self.vscode_spec.render(),
"py" + python_version.replace(".", ""),
target_arch,
]
prerequisite_jobs = []
if base_phase == "test":
prerequisite_jobs.append("_".join(base_name_parts + ["build"]))
if self.cuda_version:
self.cudnn_version = 8 if self.cuda_version.major == 11 else 7
arch_env_elements = (
["cuda" + str(self.cuda_version.major) + "." + str(self.cuda_version.minor)]
if self.cuda_version
else ["cpu"]
)
build_environment_string = "-".join(
["pytorch", "win"]
+ self.vscode_spec.get_elements()
+ arch_env_elements
+ ["py" + python_version.split(".")[0]]
)
is_running_on_cuda = bool(self.cuda_version) and not self.force_on_cpu
if self.multi_gpu:
props_dict = {"requires": prerequisite_jobs}
else:
props_dict = {
"build_environment": build_environment_string,
"python_version": miniutils.quote(python_version),
"vs_version": miniutils.quote("16.8.6"),
"vc_version": miniutils.quote(self.vscode_spec.dotted_version()),
"vc_year": miniutils.quote(str(self.vscode_spec.year)),
"vc_product": self.vscode_spec.get_product(),
"use_cuda": miniutils.quote(str(int(is_running_on_cuda))),
"requires": prerequisite_jobs,
}
if self.master_only:
props_dict[
"filters"
] = gen_filter_dict()
elif self.nightly_only:
props_dict[
"filters"
] = gen_filter_dict(branches_list=["nightly"], tags_list=RC_PATTERN)
elif self.master_and_nightly:
props_dict[
"filters"
] = gen_filter_dict(branches_list=NON_PR_BRANCH_LIST + ["nightly"], tags_list=RC_PATTERN)
name_parts = base_name_parts + cpu_forcing_name_parts + [numbered_phase]
if not self.multi_gpu:
if base_phase == "test":
test_name = "-".join(["pytorch", "windows", numbered_phase])
props_dict["test_name"] = test_name
if is_running_on_cuda:
props_dict["executor"] = "windows-with-nvidia-gpu"
props_dict["cuda_version"] = (
miniutils.quote(str(self.cuda_version))
if self.cuda_version
else "cpu"
)
props_dict["name"] = "_".join(name_parts)
return [{key_name: props_dict}]
class VcSpec:
def __init__(self, year, version_elements=None, hide_version=False):
self.year = year
self.version_elements = version_elements or []
self.hide_version = hide_version
def get_elements(self):
if self.hide_version:
return [self.prefixed_year()]
return [self.prefixed_year()] + self.version_elements
def get_product(self):
return "BuildTools"
def dotted_version(self):
return ".".join(self.version_elements)
def prefixed_year(self):
return "vs" + str(self.year)
def render(self):
return "_".join(self.get_elements())
_VC2019 = VcSpec(2019)
WORKFLOW_DATA = [
# VS2019 CUDA-10.2
WindowsJob(None, _VC2019, CudaVersion(10, 2), master_only=True),
# VS2019 CUDA-10.2 force on cpu
WindowsJob(1, _VC2019, CudaVersion(10, 2), force_on_cpu=True, master_only=True),
# TODO: This test is disabled due to https://github.com/pytorch/pytorch/issues/59724
# WindowsJob('_azure_multi_gpu', _VC2019, CudaVersion(11, 1), multi_gpu=True, master_and_nightly=True),
]
def get_windows_workflows():
return [item.gen_tree() for item in WORKFLOW_DATA]
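
A small sketch of how `VcSpec` contributes to job names and build-environment strings. The class is re-indented from the lines above, with `get_product` left out for brevity; the version elements `["14", "28"]` are hypothetical values, since the workflow data here only ever constructs `VcSpec(2019)`.

```python
class VcSpec:
    def __init__(self, year, version_elements=None, hide_version=False):
        self.year = year
        self.version_elements = version_elements or []
        self.hide_version = hide_version

    def get_elements(self):
        # hide_version drops the version from names but not from dotted_version().
        if self.hide_version:
            return [self.prefixed_year()]
        return [self.prefixed_year()] + self.version_elements

    def dotted_version(self):
        return ".".join(self.version_elements)

    def prefixed_year(self):
        return "vs" + str(self.year)

    def render(self):
        return "_".join(self.get_elements())


plain = VcSpec(2019)
versioned = VcSpec(2019, ["14", "28"])                  # hypothetical version
hidden = VcSpec(2019, ["14", "28"], hide_version=True)  # hypothetical version

print(plain.render())            # vs2019
print(versioned.render())        # vs2019_14_28
print(hidden.render())           # vs2019
print(hidden.dotted_version())   # 14.28
```

For `VcSpec(2019)` the `vc_version` property therefore renders as an empty string, which `miniutils.quote` then wraps in quotes.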

@ -1,7 +1,5 @@
from collections import OrderedDict
import cimodel.lib.miniutils as miniutils
LIST_MARKER = "- "
INDENTATION_WIDTH = 2
@ -31,8 +29,7 @@ def render(fh, data, depth, is_list_member=False):
tuples.sort()
for i, (k, v) in enumerate(tuples):
if not v:
continue
# If this dict is itself a list member, the first key gets prefixed with a list marker
list_marker_prefix = LIST_MARKER if is_list_member and not i else ""
@ -46,7 +43,5 @@ def render(fh, data, depth, is_list_member=False):
render(fh, v, depth, True)
else:
# use empty quotes to denote an empty string value instead of blank space
modified_data = miniutils.quote(data) if data == "" else data
list_member_prefix = indentation + LIST_MARKER if is_list_member else ""
fh.write(list_member_prefix + str(modified_data) + "\n")
fh.write(list_member_prefix + str(data) + "\n")

@ -0,0 +1,84 @@
"""
This module encapsulates dependencies on pygraphviz
"""
import colorsys
import cimodel.lib.conf_tree as conf_tree
def rgb2hex(rgb_tuple):
def to_hex(f):
return "%02x" % int(f * 255)
return "#" + "".join(map(to_hex, list(rgb_tuple)))
def handle_missing_graphviz(f):
"""
If the user has not installed pygraphviz, this causes
calls to the draw() method of the returned object to do nothing.
"""
try:
import pygraphviz # noqa: F401
return f
except ModuleNotFoundError:
class FakeGraph:
def draw(self, *args, **kwargs):
pass
return lambda _: FakeGraph()
@handle_missing_graphviz
def generate_graph(toplevel_config_node):
"""
Traverses the graph once first just to find the max depth
"""
config_list = conf_tree.dfs(toplevel_config_node)
max_depth = 0
for config in config_list:
max_depth = max(max_depth, config.get_depth())
# color the nodes using the max depth
from pygraphviz import AGraph
dot = AGraph()
def node_discovery_callback(node, sibling_index, sibling_count):
depth = node.get_depth()
sat_min, sat_max = 0.1, 0.6
sat_range = sat_max - sat_min
saturation_fraction = sibling_index / float(sibling_count - 1) if sibling_count > 1 else 1
saturation = sat_min + sat_range * saturation_fraction
# TODO Use a hash of the node label to determine the color
hue = depth / float(max_depth + 1)
rgb_tuple = colorsys.hsv_to_rgb(hue, saturation, 1)
this_node_key = node.get_node_key()
dot.add_node(
this_node_key,
label=node.get_label(),
style="filled",
# fillcolor=hex_color + ":orange",
fillcolor=rgb2hex(rgb_tuple),
penwidth=3,
color=rgb2hex(colorsys.hsv_to_rgb(hue, saturation, 0.9))
)
def child_callback(node, child):
this_node_key = node.get_node_key()
child_node_key = child.get_node_key()
dot.add_edge((this_node_key, child_node_key))
conf_tree.dfs_recurse(toplevel_config_node, lambda x: None, node_discovery_callback, child_callback)
return dot
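
The color conversion used for node fills can be checked without pygraphviz. This sketch re-indents `rgb2hex` and feeds it a few values; the variable names `red` and `grey` are illustrative. One detail worth noticing is that `int()` truncates rather than rounds, so a channel value of 0.5 maps to `7f`, not `80`.

```python
import colorsys


def rgb2hex(rgb_tuple):
    def to_hex(f):
        # int() truncates, so 0.5 * 255 = 127.5 becomes 127 (0x7f).
        return "%02x" % int(f * 255)

    return "#" + "".join(map(to_hex, list(rgb_tuple)))


# Illustrative inputs: fully saturated hue 0 is pure red.
red = rgb2hex(colorsys.hsv_to_rgb(0.0, 1.0, 1.0))
grey = rgb2hex((0.5, 0.5, 0.5))

print(red)    # #ff0000
print(grey)   # #7f7f7f
```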

@ -1,17 +0,0 @@
#!/bin/bash -xe
YAML_FILENAME=verbatim-sources/workflows-pytorch-ge-config-tests.yml
DIFF_TOOL=meld
# Allows this script to be invoked from any directory:
cd $(dirname "$0")
pushd ..
$DIFF_TOOL $YAML_FILENAME <(./codegen_validation/normalize_yaml_fragment.py < $YAML_FILENAME)
popd

@ -1,24 +0,0 @@
#!/usr/bin/env python3
import os
import sys
import yaml
# Need to import modules that lie on an upward-relative path
sys.path.append(os.path.join(sys.path[0], '..'))
import cimodel.lib.miniyaml as miniyaml
def regurgitate(depth, use_pyyaml_formatter=False):
data = yaml.safe_load(sys.stdin)
if use_pyyaml_formatter:
output = yaml.dump(data, sort_keys=True)
sys.stdout.write(output)
else:
miniyaml.render(sys.stdout, data, depth)
if __name__ == "__main__":
regurgitate(3)

@ -1,15 +0,0 @@
#!/bin/bash -xe
YAML_FILENAME=$1
# Allows this script to be invoked from any directory:
cd $(dirname "$0")
pushd ..
TEMP_FILENAME=$(mktemp)
cat $YAML_FILENAME | ./codegen_validation/normalize_yaml_fragment.py > $TEMP_FILENAME
mv $TEMP_FILENAME $YAML_FILENAME
popd

File diff suppressed because it is too large

@ -12,20 +12,8 @@ each image as the `BUILD_ENVIRONMENT` environment variable.
See `build.sh` for valid build environments (it's the giant switch).
Docker builds are now defined with `.circleci/cimodel/data/simple/docker_definitions.py`
## Contents
* `build.sh` -- dispatch script to launch all builds
* `common` -- scripts used to execute individual Docker build stages
* `ubuntu-cuda` -- Dockerfile for Ubuntu image with CUDA support for nvidia-docker
## Usage
```bash
# Build a specific image
./build.sh pytorch-linux-bionic-py3.8-gcc9 -t myimage:latest
# Set flags (see build.sh) and build image
sudo bash -c 'PROTOBUF=1 ./build.sh pytorch-linux-bionic-py3.8-gcc9 -t myimage:latest'
```

@ -20,8 +20,10 @@ buildscript {
}
dependencies {
classpath 'com.android.tools.build:gradle:4.1.2'
classpath 'com.vanniktech:gradle-maven-publish-plugin:0.14.2'
classpath 'com.android.tools.build:gradle:3.3.2'
classpath "com.jfrog.bintray.gradle:gradle-bintray-plugin:1.8.0"
classpath "com.github.dcendents:android-maven-gradle-plugin:2.1"
classpath "org.jfrog.buildinfo:build-info-extractor-gradle:4.9.8"
}
}

@ -10,64 +10,21 @@ if [ -z "${image}" ]; then
exit 1
fi
function extract_version_from_image_name() {
eval export $2=$(echo "${image}" | perl -n -e"/$1(\d+(\.\d+)?(\.\d+)?)/ && print \$1")
if [ "x${!2}" = x ]; then
echo "variable '$2' not correctly parsed from image='$image'"
exit 1
fi
}
# TODO: Generalize
OS="ubuntu"
DOCKERFILE="${OS}/Dockerfile"
if [[ "$image" == *-cuda* ]]; then
DOCKERFILE="${OS}-cuda/Dockerfile"
fi
function extract_all_from_image_name() {
# split $image into an array on '-'
keep_IFS="$IFS"
IFS="-"
declare -a parts=($image)
IFS="$keep_IFS"
unset keep_IFS
for part in "${parts[@]}"; do
name=$(echo "${part}" | perl -n -e"/([a-zA-Z]+)\d+(\.\d+)?(\.\d+)?/ && print \$1")
vername="${name^^}_VERSION"
# "py" is the odd one out, needs this special case
if [ "x${name}" = xpy ]; then
vername=ANACONDA_PYTHON_VERSION
fi
# skip non-conforming fields such as "pytorch", "linux" or "xenial" without version string
if [ -n "${name}" ]; then
extract_version_from_image_name "${name}" "${vername}"
fi
done
}
if [[ "$image" == *-xenial* ]]; then
if [[ "$image" == *-trusty* ]]; then
UBUNTU_VERSION=14.04
elif [[ "$image" == *-xenial* ]]; then
UBUNTU_VERSION=16.04
elif [[ "$image" == *-artful* ]]; then
UBUNTU_VERSION=17.10
elif [[ "$image" == *-bionic* ]]; then
UBUNTU_VERSION=18.04
elif [[ "$image" == *-focal* ]]; then
UBUNTU_VERSION=20.04
elif [[ "$image" == *ubuntu* ]]; then
extract_version_from_image_name ubuntu UBUNTU_VERSION
elif [[ "$image" == *centos* ]]; then
extract_version_from_image_name centos CENTOS_VERSION
fi
if [ -n "${UBUNTU_VERSION}" ]; then
OS="ubuntu"
elif [ -n "${CENTOS_VERSION}" ]; then
OS="centos"
else
echo "Unable to derive operating system base..."
exit 1
fi
DOCKERFILE="${OS}/Dockerfile"
if [[ "$image" == *cuda* ]]; then
DOCKERFILE="${OS}-cuda/Dockerfile"
elif [[ "$image" == *rocm* ]]; then
DOCKERFILE="${OS}-rocm/Dockerfile"
fi
TRAVIS_DL_URL_PREFIX="https://s3.amazonaws.com/travis-python-archives/binaries/ubuntu/14.04/x86_64"
@ -76,15 +33,45 @@ TRAVIS_DL_URL_PREFIX="https://s3.amazonaws.com/travis-python-archives/binaries/u
# configuration, so we hardcode everything here rather than do it
# from scratch
case "$image" in
pytorch-linux-xenial-py3.8)
ANACONDA_PYTHON_VERSION=3.8
CMAKE_VERSION=3.10.3
pytorch-linux-bionic-clang9-thrift-llvmdev)
CLANG_VERSION=9
THRIFT=yes
LLVMDEV=yes
PROTOBUF=yes
;;
pytorch-linux-xenial-py2.7.9)
TRAVIS_PYTHON_VERSION=2.7.9
GCC_VERSION=7
# Do not install PROTOBUF, DB, and VISION as a test
;;
pytorch-linux-xenial-py2.7)
TRAVIS_PYTHON_VERSION=2.7
GCC_VERSION=7
PROTOBUF=yes
DB=yes
VISION=yes
;;
pytorch-linux-xenial-py3.5)
TRAVIS_PYTHON_VERSION=3.5
GCC_VERSION=7
# Do not install PROTOBUF, DB, and VISION as a test
;;
pytorch-linux-xenial-py3.8)
# TODO: This is a hack, get rid of this as soon as you get rid of the travis downloads
TRAVIS_DL_URL_PREFIX="https://s3.amazonaws.com/travis-python-archives/binaries/ubuntu/16.04/x86_64"
TRAVIS_PYTHON_VERSION=3.8
GCC_VERSION=7
# Do not install PROTOBUF, DB, and VISION as a test
;;
pytorch-linux-xenial-py3.6-gcc4.8)
ANACONDA_PYTHON_VERSION=3.6
GCC_VERSION=4.8
PROTOBUF=yes
DB=yes
VISION=yes
;;
pytorch-linux-xenial-py3.6-gcc5.4)
ANACONDA_PYTHON_VERSION=3.6
CMAKE_VERSION=3.10.3
GCC_VERSION=5
PROTOBUF=yes
DB=yes
@ -93,45 +80,71 @@ case "$image" in
;;
pytorch-linux-xenial-py3.6-gcc7.2)
ANACONDA_PYTHON_VERSION=3.6
CMAKE_VERSION=3.10.3
GCC_VERSION=7
# Do not install PROTOBUF, DB, and VISION as a test
;;
pytorch-linux-xenial-py3.6-gcc7)
ANACONDA_PYTHON_VERSION=3.6
CMAKE_VERSION=3.10.3
GCC_VERSION=7
PROTOBUF=yes
DB=yes
VISION=yes
;;
pytorch-linux-xenial-pynightly)
TRAVIS_PYTHON_VERSION=nightly
GCC_VERSION=7
PROTOBUF=yes
DB=yes
VISION=yes
;;
pytorch-linux-xenial-cuda9-cudnn7-py2)
CUDA_VERSION=9.0
CUDNN_VERSION=7
ANACONDA_PYTHON_VERSION=2.7
PROTOBUF=yes
DB=yes
VISION=yes
;;
pytorch-linux-xenial-cuda9-cudnn7-py3)
CUDA_VERSION=9.0
CUDNN_VERSION=7
ANACONDA_PYTHON_VERSION=3.6
PROTOBUF=yes
DB=yes
VISION=yes
;;
pytorch-linux-xenial-cuda9.2-cudnn7-py3-gcc7)
CUDA_VERSION=9.2
CUDNN_VERSION=7
ANACONDA_PYTHON_VERSION=3.6
GCC_VERSION=7
PROTOBUF=yes
DB=yes
VISION=yes
;;
pytorch-linux-xenial-cuda10-cudnn7-py3-gcc7)
CUDA_VERSION=10.0
CUDNN_VERSION=7
ANACONDA_PYTHON_VERSION=3.6
GCC_VERSION=7
PROTOBUF=yes
DB=yes
VISION=yes
;;
pytorch-linux-xenial-cuda10.1-cudnn7-py3-gcc7)
CUDA_VERSION=10.1
CUDNN_VERSION=7
ANACONDA_PYTHON_VERSION=3.6
GCC_VERSION=7
PROTOBUF=yes
DB=yes
VISION=yes
KATEX=yes
;;
pytorch-linux-xenial-cuda10.2-cudnn7-py3-gcc7)
CUDA_VERSION=10.2
CUDNN_VERSION=7
ANACONDA_PYTHON_VERSION=3.6
CMAKE_VERSION=3.10.3
GCC_VERSION=7
PROTOBUF=yes
DB=yes
VISION=yes
KATEX=yes
;;
pytorch-linux-xenial-cuda11.1-cudnn8-py3-gcc7)
CUDA_VERSION=11.1
CUDNN_VERSION=8
ANACONDA_PYTHON_VERSION=3.6
CMAKE_VERSION=3.10.3
GCC_VERSION=7
PROTOBUF=yes
DB=yes
VISION=yes
KATEX=yes
;;
pytorch-linux-xenial-cuda11.3-cudnn8-py3-gcc7)
CUDA_VERSION=11.3.0 # Deviating from major.minor to conform to nvidia's Docker image names
CUDNN_VERSION=8
ANACONDA_PYTHON_VERSION=3.6
CMAKE_VERSION=3.10.3
GCC_VERSION=7
PROTOBUF=yes
DB=yes
@ -141,23 +154,6 @@ case "$image" in
pytorch-linux-xenial-py3-clang5-asan)
ANACONDA_PYTHON_VERSION=3.6
CLANG_VERSION=5.0
CMAKE_VERSION=3.10.3
PROTOBUF=yes
DB=yes
VISION=yes
;;
pytorch-linux-xenial-py3-clang7-asan)
ANACONDA_PYTHON_VERSION=3.6
CLANG_VERSION=7
CMAKE_VERSION=3.10.3
PROTOBUF=yes
DB=yes
VISION=yes
;;
pytorch-linux-xenial-py3-clang7-onnx)
ANACONDA_PYTHON_VERSION=3.6
CLANG_VERSION=7
CMAKE_VERSION=3.10.3
PROTOBUF=yes
DB=yes
VISION=yes
@ -165,125 +161,21 @@ case "$image" in
pytorch-linux-xenial-py3-clang5-android-ndk-r19c)
ANACONDA_PYTHON_VERSION=3.6
CLANG_VERSION=5.0
CMAKE_VERSION=3.10.3
LLVMDEV=yes
PROTOBUF=yes
ANDROID=yes
ANDROID_NDK_VERSION=r19c
GRADLE_VERSION=6.8.3
GRADLE_VERSION=4.10.3
CMAKE_VERSION=3.7.0
NINJA_VERSION=1.9.0
;;
pytorch-linux-xenial-py3.6-clang7)
ANACONDA_PYTHON_VERSION=3.6
CMAKE_VERSION=3.10.3
CLANG_VERSION=7
PROTOBUF=yes
DB=yes
VISION=yes
;;
pytorch-linux-bionic-py3.6-clang9)
ANACONDA_PYTHON_VERSION=3.6
CLANG_VERSION=9
PROTOBUF=yes
DB=yes
VISION=yes
VULKAN_SDK_VERSION=1.2.162.1
SWIFTSHADER=yes
;;
pytorch-linux-bionic-py3.8-gcc9)
ANACONDA_PYTHON_VERSION=3.8
GCC_VERSION=9
PROTOBUF=yes
DB=yes
VISION=yes
;;
pytorch-linux-bionic-cuda10.2-cudnn7-py3.6-clang9)
CUDA_VERSION=10.2
CUDNN_VERSION=7
ANACONDA_PYTHON_VERSION=3.6
CLANG_VERSION=9
PROTOBUF=yes
DB=yes
VISION=yes
;;
pytorch-linux-bionic-cuda10.2-cudnn7-py3.9-gcc7)
CUDA_VERSION=10.2
CUDNN_VERSION=7
ANACONDA_PYTHON_VERSION=3.9
GCC_VERSION=7
PROTOBUF=yes
DB=yes
VISION=yes
;;
pytorch-linux-bionic-cuda11.0-cudnn8-py3.6-gcc9)
CUDA_VERSION=11.0
CUDNN_VERSION=8
ANACONDA_PYTHON_VERSION=3.6
GCC_VERSION=9
PROTOBUF=yes
DB=yes
VISION=yes
ROCM_VERSION=3.9
;;
pytorch-linux-bionic-rocm4.1-py3.6)
ANACONDA_PYTHON_VERSION=3.6
GCC_VERSION=9
PROTOBUF=yes
DB=yes
VISION=yes
ROCM_VERSION=4.1
;;
pytorch-linux-bionic-rocm4.2-py3.6)
ANACONDA_PYTHON_VERSION=3.6
GCC_VERSION=9
PROTOBUF=yes
DB=yes
VISION=yes
ROCM_VERSION=4.2
;;
pytorch-linux-bionic-rocm4.3.1-py3.6)
ANACONDA_PYTHON_VERSION=3.6
GCC_VERSION=9
PROTOBUF=yes
DB=yes
VISION=yes
ROCM_VERSION=4.3.1
;;
*)
# Catch-all for builds that are not hardcoded.
PROTOBUF=yes
DB=yes
VISION=yes
echo "image '$image' did not match an existing build configuration"
if [[ "$image" == *xenial* ]]; then
CMAKE_VERSION=3.10.3
fi
if [[ "$image" == *py* ]]; then
extract_version_from_image_name py ANACONDA_PYTHON_VERSION
fi
if [[ "$image" == *cuda* ]]; then
extract_version_from_image_name cuda CUDA_VERSION
extract_version_from_image_name cudnn CUDNN_VERSION
fi
if [[ "$image" == *rocm* ]]; then
extract_version_from_image_name rocm ROCM_VERSION
fi
if [[ "$image" == *gcc* ]]; then
extract_version_from_image_name gcc GCC_VERSION
fi
if [[ "$image" == *clang* ]]; then
extract_version_from_image_name clang CLANG_VERSION
fi
if [[ "$image" == *devtoolset* ]]; then
extract_version_from_image_name devtoolset DEVTOOLSET_VERSION
fi
if [[ "$image" == *glibc* ]]; then
extract_version_from_image_name glibc GLIBC_VERSION
fi
if [[ "$image" == *cmake* ]]; then
extract_version_from_image_name cmake CMAKE_VERSION
fi
;;
esac
# Set Jenkins UID and GID if running Jenkins
@ -292,14 +184,11 @@ if [ -n "${JENKINS:-}" ]; then
JENKINS_GID=$(id -g jenkins)
fi
tmp_tag=$(basename "$(mktemp -u)" | tr '[:upper:]' '[:lower:]')
tmp_tag="tmp-$(cat /dev/urandom | tr -dc 'a-z' | fold -w 32 | head -n 1)"
# Build image
# TODO: build-arg THRIFT is not turned on for any image, remove it once we confirm
# it's no longer needed.
docker build \
--no-cache \
--progress=plain \
--build-arg "TRAVIS_DL_URL_PREFIX=${TRAVIS_DL_URL_PREFIX}" \
--build-arg "BUILD_ENVIRONMENT=${image}" \
--build-arg "PROTOBUF=${PROTOBUF:-}" \
@ -312,36 +201,23 @@ docker build \
--build-arg "JENKINS_UID=${JENKINS_UID:-}" \
--build-arg "JENKINS_GID=${JENKINS_GID:-}" \
--build-arg "UBUNTU_VERSION=${UBUNTU_VERSION}" \
--build-arg "CENTOS_VERSION=${CENTOS_VERSION}" \
--build-arg "DEVTOOLSET_VERSION=${DEVTOOLSET_VERSION}" \
--build-arg "GLIBC_VERSION=${GLIBC_VERSION}" \
--build-arg "CLANG_VERSION=${CLANG_VERSION}" \
--build-arg "ANACONDA_PYTHON_VERSION=${ANACONDA_PYTHON_VERSION}" \
--build-arg "TRAVIS_PYTHON_VERSION=${TRAVIS_PYTHON_VERSION}" \
--build-arg "GCC_VERSION=${GCC_VERSION}" \
--build-arg "CUDA_VERSION=${CUDA_VERSION}" \
--build-arg "CUDNN_VERSION=${CUDNN_VERSION}" \
--build-arg "ANDROID=${ANDROID}" \
--build-arg "ANDROID_NDK=${ANDROID_NDK_VERSION}" \
--build-arg "GRADLE_VERSION=${GRADLE_VERSION}" \
--build-arg "VULKAN_SDK_VERSION=${VULKAN_SDK_VERSION}" \
--build-arg "SWIFTSHADER=${SWIFTSHADER}" \
--build-arg "CMAKE_VERSION=${CMAKE_VERSION:-}" \
--build-arg "NINJA_VERSION=${NINJA_VERSION:-}" \
--build-arg "KATEX=${KATEX:-}" \
--build-arg "ROCM_VERSION=${ROCM_VERSION:-}" \
-f $(dirname ${DOCKERFILE})/Dockerfile \
-t "$tmp_tag" \
"$@" \
.
# NVIDIA dockers for RC releases use tag names like `11.0-cudnn8-devel-ubuntu18.04-rc`,
# for this case we will set UBUNTU_VERSION to `18.04-rc` so that the Dockerfile could
# find the correct image. As a result, here we have to replace the
# "$UBUNTU_VERSION" == "18.04-rc"
# with
# "$UBUNTU_VERSION" == "18.04"
UBUNTU_VERSION=$(echo ${UBUNTU_VERSION} | sed 's/-rc$//')
function drun() {
docker run --rm "$tmp_tag" $*
}
@ -359,6 +235,19 @@ if [[ "$OS" == "ubuntu" ]]; then
fi
fi
if [ -n "$TRAVIS_PYTHON_VERSION" ]; then
if [[ "$TRAVIS_PYTHON_VERSION" != nightly ]]; then
if !(drun python --version 2>&1 | grep -qF "Python $TRAVIS_PYTHON_VERSION"); then
echo "TRAVIS_PYTHON_VERSION=$TRAVIS_PYTHON_VERSION, but:"
drun python --version
exit 1
fi
else
echo "Please manually check nightly is OK:"
drun python --version
fi
fi
if [ -n "$ANACONDA_PYTHON_VERSION" ]; then
if !(drun python --version 2>&1 | grep -qF "Python $ANACONDA_PYTHON_VERSION"); then
echo "ANACONDA_PYTHON_VERSION=$ANACONDA_PYTHON_VERSION, but:"

@ -13,7 +13,7 @@ retry () {
#until we find a way to reliably reuse previous build, this last_tag is not in use
# last_tag="$(( CIRCLE_BUILD_NUM - 1 ))"
tag="${DOCKER_TAG}"
tag="${CIRCLE_WORKFLOW_ID}"
registry="308535385114.dkr.ecr.us-east-1.amazonaws.com"
@ -46,7 +46,4 @@ trap "docker logout ${registry}" EXIT
docker push "${image}:${tag}"
docker save -o "${IMAGE_NAME}:${tag}.tar" "${image}:${tag}"
if [ -z "${DOCKER_SKIP_S3_UPLOAD:-}" ]; then
aws s3 cp "${IMAGE_NAME}:${tag}.tar" "s3://ossci-linux-build/pytorch/base/${IMAGE_NAME}:${tag}.tar" --acl public-read
fi
aws s3 cp "${IMAGE_NAME}:${tag}.tar" "s3://ossci-linux-build/pytorch/base/${IMAGE_NAME}:${tag}.tar" --acl public-read

@ -1,93 +0,0 @@
ARG CENTOS_VERSION
FROM centos:${CENTOS_VERSION}
ARG CENTOS_VERSION
# Install required packages to build Caffe2
# Install common dependencies (so that this step can be cached separately)
ARG EC2
ADD ./common/install_base.sh install_base.sh
RUN bash ./install_base.sh && rm install_base.sh
# Install devtoolset
ARG DEVTOOLSET_VERSION
ADD ./common/install_devtoolset.sh install_devtoolset.sh
RUN bash ./install_devtoolset.sh && rm install_devtoolset.sh
ENV BASH_ENV "/etc/profile"
# (optional) Install non-default glibc version
ARG GLIBC_VERSION
ADD ./common/install_glibc.sh install_glibc.sh
RUN if [ -n "${GLIBC_VERSION}" ]; then bash ./install_glibc.sh; fi
RUN rm install_glibc.sh
# Install user
ADD ./common/install_user.sh install_user.sh
RUN bash ./install_user.sh && rm install_user.sh
# Install conda and other packages (e.g., numpy, coverage, pytest)
ENV PATH /opt/conda/bin:$PATH
ARG ANACONDA_PYTHON_VERSION
ADD ./common/install_conda.sh install_conda.sh
RUN bash ./install_conda.sh && rm install_conda.sh
# (optional) Install protobuf for ONNX
ARG PROTOBUF
ADD ./common/install_protobuf.sh install_protobuf.sh
RUN if [ -n "${PROTOBUF}" ]; then bash ./install_protobuf.sh; fi
RUN rm install_protobuf.sh
ENV INSTALLED_PROTOBUF ${PROTOBUF}
# (optional) Install database packages like LMDB and LevelDB
ARG DB
ADD ./common/install_db.sh install_db.sh
RUN if [ -n "${DB}" ]; then bash ./install_db.sh; fi
RUN rm install_db.sh
ENV INSTALLED_DB ${DB}
# (optional) Install vision packages like OpenCV and ffmpeg
ARG VISION
ADD ./common/install_vision.sh install_vision.sh
RUN if [ -n "${VISION}" ]; then bash ./install_vision.sh; fi
RUN rm install_vision.sh
ENV INSTALLED_VISION ${VISION}
# Install rocm
ARG ROCM_VERSION
ADD ./common/install_rocm.sh install_rocm.sh
RUN bash ./install_rocm.sh
RUN rm install_rocm.sh
ENV PATH /opt/rocm/bin:$PATH
ENV PATH /opt/rocm/hcc/bin:$PATH
ENV PATH /opt/rocm/hip/bin:$PATH
ENV PATH /opt/rocm/opencl/bin:$PATH
ENV PATH /opt/rocm/llvm/bin:$PATH
ENV MAGMA_HOME /opt/rocm/magma
ENV LANG en_US.utf8
ENV LC_ALL en_US.utf8
# (optional) Install non-default CMake version
ARG CMAKE_VERSION
ADD ./common/install_cmake.sh install_cmake.sh
RUN if [ -n "${CMAKE_VERSION}" ]; then bash ./install_cmake.sh; fi
RUN rm install_cmake.sh
# (optional) Install non-default Ninja version
ARG NINJA_VERSION
ADD ./common/install_ninja.sh install_ninja.sh
RUN if [ -n "${NINJA_VERSION}" ]; then bash ./install_ninja.sh; fi
RUN rm install_ninja.sh
# Install ccache/sccache (do this last, so we get priority in PATH)
ADD ./common/install_cache.sh install_cache.sh
ENV PATH /opt/cache/bin:$PATH
RUN bash ./install_cache.sh && rm install_cache.sh
# Include BUILD_ENVIRONMENT environment variable in image
ARG BUILD_ENVIRONMENT
ENV BUILD_ENVIRONMENT ${BUILD_ENVIRONMENT}
USER jenkins
CMD ["bash"]

@ -4,15 +4,13 @@ set -ex
[ -n "${ANDROID_NDK}" ]
_https_amazon_aws=https://ossci-android.s3.amazonaws.com
apt-get update
apt-get install -y --no-install-recommends autotools-dev autoconf unzip
apt-get autoclean && apt-get clean
rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
pushd /tmp
curl -Os --retry 3 $_https_amazon_aws/android-ndk-${ANDROID_NDK}-linux-x86_64.zip
curl -Os --retry 3 https://dl.google.com/android/repository/android-ndk-${ANDROID_NDK}-linux-x86_64.zip
popd
_ndk_dir=/opt/ndk
mkdir -p "$_ndk_dir"
@ -47,22 +45,43 @@ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/
# Installing android sdk
# https://github.com/circleci/circleci-images/blob/staging/android/Dockerfile.m4
_tmp_sdk_zip=/tmp/android-sdk-linux.zip
_sdk_version=sdk-tools-linux-3859397.zip
_android_home=/opt/android/sdk
rm -rf $_android_home
sudo mkdir -p $_android_home
curl --silent --show-error --location --fail --retry 3 --output /tmp/android-sdk-linux.zip $_https_amazon_aws/android-sdk-linux-tools3859397-build-tools2803-2902-platforms28-29.zip
sudo unzip -q $_tmp_sdk_zip -d $_android_home
rm $_tmp_sdk_zip
curl --silent --show-error --location --fail --retry 3 --output /tmp/$_sdk_version https://dl.google.com/android/repository/$_sdk_version
sudo unzip -q /tmp/$_sdk_version -d $_android_home
rm /tmp/$_sdk_version
sudo chmod -R 777 $_android_home
export ANDROID_HOME=$_android_home
export ADB_INSTALL_TIMEOUT=120
export PATH="${ANDROID_HOME}/tools:${ANDROID_HOME}/tools/bin:${ANDROID_HOME}/platform-tools:${PATH}"
export PATH="${ANDROID_HOME}/emulator:${ANDROID_HOME}/tools:${ANDROID_HOME}/tools/bin:${ANDROID_HOME}/platform-tools:${PATH}"
echo "PATH:${PATH}"
alias sdkmanager="$ANDROID_HOME/tools/bin/sdkmanager"
sudo mkdir ~/.android && sudo echo '### User Sources for Android SDK Manager' > ~/.android/repositories.cfg
sudo chmod -R 777 ~/.android
yes | sdkmanager --licenses
yes | sdkmanager --update
sdkmanager \
"tools" \
"platform-tools" \
"emulator"
sdkmanager \
"build-tools;28.0.3" \
"build-tools;29.0.2"
sdkmanager \
"platforms;android-28" \
"platforms;android-29"
sdkmanager --list
# Installing Gradle
echo "GRADLE_VERSION:${GRADLE_VERSION}"
@ -70,7 +89,8 @@ _gradle_home=/opt/gradle
sudo rm -rf $_gradle_home
sudo mkdir -p $_gradle_home
curl --silent --output /tmp/gradle.zip --retry 3 $_https_amazon_aws/gradle-${GRADLE_VERSION}-bin.zip
wget --no-verbose --output-document=/tmp/gradle.zip \
"https://services.gradle.org/distributions/gradle-${GRADLE_VERSION}-bin.zip"
sudo unzip -q /tmp/gradle.zip -d $_gradle_home
rm /tmp/gradle.zip
@ -99,7 +119,7 @@ echo "ndk.dir=/opt/ndk" >> $GRADLE_LOCAL_PROPERTIES
chown -R jenkins /var/lib/jenkins/gradledeps
chgrp -R jenkins /var/lib/jenkins/gradledeps
sudo -H -u jenkins $GRADLE_HOME/bin/gradle -Pandroid.useAndroidX=true -p /var/lib/jenkins/gradledeps -g /var/lib/jenkins/.gradle --refresh-dependencies --debug --stacktrace assemble
sudo -H -u jenkins $GRADLE_HOME/bin/gradle -p /var/lib/jenkins/gradledeps -g /var/lib/jenkins/.gradle --refresh-dependencies --debug --stacktrace assemble
chown -R jenkins /var/lib/jenkins/.gradle
chgrp -R jenkins /var/lib/jenkins/.gradle

@ -2,122 +2,74 @@
set -ex
install_ubuntu() {
# NVIDIA dockers for RC releases use tag names like `11.0-cudnn8-devel-ubuntu18.04-rc`,
# for this case we will set UBUNTU_VERSION to `18.04-rc` so that the Dockerfile could
# find the correct image. As a result, here we have to check for
# "$UBUNTU_VERSION" == "18.04"*
# instead of
# "$UBUNTU_VERSION" == "18.04"
if [[ "$UBUNTU_VERSION" == "18.04"* ]]; then
cmake3="cmake=3.10*"
else
cmake3="cmake=3.5*"
fi
if [[ "$UBUNTU_VERSION" == "14.04" ]]; then
# cmake 2 is too old
cmake3=cmake3
else
cmake3=cmake
fi
# Install common dependencies
apt-get update
# TODO: Some of these may not be necessary
ccache_deps="asciidoc docbook-xml docbook-xsl xsltproc"
numpy_deps="gfortran"
apt-get install -y --no-install-recommends \
$ccache_deps \
$numpy_deps \
${cmake3} \
apt-transport-https \
autoconf \
automake \
build-essential \
ca-certificates \
curl \
git \
libatlas-base-dev \
libc6-dbg \
libiomp-dev \
libyaml-dev \
libz-dev \
libjpeg-dev \
libasound2-dev \
libsndfile-dev \
software-properties-common \
sudo \
wget \
vim
if [[ "$UBUNTU_VERSION" == "18.04" ]]; then
cmake3="cmake=3.10*"
else
cmake3="${cmake3}=3.5*"
fi
# Cleanup package manager
apt-get autoclean && apt-get clean
rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
}
install_centos() {
# Need EPEL for many packages we depend on.
# See http://fedoraproject.org/wiki/EPEL
yum --enablerepo=extras install -y epel-release
ccache_deps="asciidoc docbook-dtds docbook-style-xsl libxslt"
numpy_deps="gcc-gfortran"
# Note: protobuf-c-{compiler,devel} on CentOS are too old to be used
# for Caffe2. That said, we still install them to make sure the build
# system opts to build/use protoc and libprotobuf from third-party.
yum install -y \
$ccache_deps \
$numpy_deps \
autoconf \
automake \
bzip2 \
cmake \
cmake3 \
curl \
gcc \
gcc-c++ \
gflags-devel \
git \
glibc-devel \
glibc-headers \
glog-devel \
hiredis-devel \
libstdc++-devel \
libsndfile-devel \
make \
opencv-devel \
sudo \
wget \
vim
# Cleanup
yum clean all
rm -rf /var/cache/yum
rm -rf /var/lib/yum/yumdb
rm -rf /var/lib/yum/history
}
# Install base packages depending on the base OS
ID=$(grep -oP '(?<=^ID=).+' /etc/os-release | tr -d '"')
case "$ID" in
ubuntu)
install_ubuntu
;;
centos)
install_centos
;;
*)
echo "Unable to determine OS..."
exit 1
;;
esac
# Install common dependencies
apt-get update
# TODO: Some of these may not be necessary
# TODO: libiomp also gets installed by conda, aka there's a conflict
ccache_deps="asciidoc docbook-xml docbook-xsl xsltproc"
numpy_deps="gfortran"
apt-get install -y --no-install-recommends \
$ccache_deps \
$numpy_deps \
${cmake3} \
apt-transport-https \
autoconf \
automake \
build-essential \
ca-certificates \
curl \
git \
libatlas-base-dev \
libc6-dbg \
libiomp-dev \
libyaml-dev \
libz-dev \
libjpeg-dev \
libasound2-dev \
libsndfile-dev \
python \
python-dev \
python-setuptools \
python-wheel \
software-properties-common \
sudo \
wget \
vim
# Install Valgrind separately since the apt-get version is too old.
mkdir valgrind_build && cd valgrind_build
VALGRIND_VERSION=3.16.1
if ! wget http://valgrind.org/downloads/valgrind-${VALGRIND_VERSION}.tar.bz2
if ! wget http://valgrind.org/downloads/valgrind-3.14.0.tar.bz2
then
wget https://sourceware.org/ftp/valgrind/valgrind-${VALGRIND_VERSION}.tar.bz2
wget https://sourceware.org/ftp/valgrind/valgrind-3.14.0.tar.bz2
fi
tar -xjf valgrind-${VALGRIND_VERSION}.tar.bz2
cd valgrind-${VALGRIND_VERSION}
tar -xjf valgrind-3.14.0.tar.bz2
cd valgrind-3.14.0
./configure --prefix=/usr/local
make -j 4
make
sudo make install
cd ../../
rm -rf valgrind_build
alias valgrind="/usr/local/bin/valgrind"
# TODO: THIS IS A HACK!!!
# distributed nccl(2) tests are a bit busted, see https://github.com/pytorch/pytorch/issues/5877
if dpkg -s libnccl-dev; then
apt-get remove -y libnccl-dev libnccl2 --allow-change-held-packages
fi
# Cleanup package manager
apt-get autoclean && apt-get clean
rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

View File

@ -2,51 +2,17 @@
set -ex
install_ubuntu() {
echo "Preparing to build sccache from source"
apt-get update
apt-get install -y cargo pkg-config libssl-dev
echo "Checking out sccache repo"
git clone https://github.com/pytorch/sccache
cd sccache
echo "Building sccache"
cargo build --release
cp target/release/sccache /opt/cache/bin
echo "Cleaning up"
cd ..
rm -rf sccache
apt-get remove -y cargo rustc
apt-get autoclean && apt-get clean
}
install_binary() {
echo "Downloading sccache binary from S3 repo"
curl --retry 3 https://s3.amazonaws.com/ossci-linux/sccache -o /opt/cache/bin/sccache
}
mkdir -p /opt/cache/bin
mkdir -p /opt/cache/lib
sed -e 's|PATH="\(.*\)"|PATH="/opt/cache/bin:\1"|g' -i /etc/environment
export PATH="/opt/cache/bin:$PATH"
# Setup compiler cache
if [ -n "$ROCM_VERSION" ]; then
curl --retry 3 http://repo.radeon.com/misc/.sccache_amd/sccache -o /opt/cache/bin/sccache
else
ID=$(grep -oP '(?<=^ID=).+' /etc/os-release | tr -d '"')
case "$ID" in
ubuntu)
install_ubuntu
;;
*)
install_binary
;;
esac
fi
curl --retry 3 https://s3.amazonaws.com/ossci-linux/sccache -o /opt/cache/bin/sccache
chmod a+x /opt/cache/bin/sccache
function write_sccache_stub() {
printf "#!/bin/sh\nif [ \$(ps -p \$PPID -o comm=) != sccache ]; then\n exec sccache $(which $1) \"\$@\"\nelse\n exec $(which $1) \"\$@\"\nfi" > "/opt/cache/bin/$1"
printf "#!/bin/sh\nexec sccache $(which $1) \$*" > "/opt/cache/bin/$1"
chmod a+x "/opt/cache/bin/$1"
}
@ -54,12 +20,8 @@ write_sccache_stub cc
write_sccache_stub c++
write_sccache_stub gcc
write_sccache_stub g++
# NOTE: See specific ROCM_VERSION case below.
if [ "x$ROCM_VERSION" = x ]; then
write_sccache_stub clang
write_sccache_stub clang++
fi
write_sccache_stub clang
write_sccache_stub clang++
if [ -n "$CUDA_VERSION" ]; then
# TODO: This is a workaround for the fact that PyTorch's FindCUDA
@ -68,50 +30,6 @@ if [ -n "$CUDA_VERSION" ]; then
# where CUDA is installed. Instead, we install an nvcc symlink outside
# of the PATH, and set CUDA_NVCC_EXECUTABLE so that we make use of it.
write_sccache_stub nvcc
mv /opt/cache/bin/nvcc /opt/cache/lib/
fi
if [ -n "$ROCM_VERSION" ]; then
# ROCm compiler is hcc or clang. However, it is commonly invoked via hipcc wrapper.
# hipcc will call either hcc or clang using an absolute path starting with /opt/rocm,
# causing the /opt/cache/bin to be skipped. We must create the sccache wrappers
# directly under /opt/rocm while also preserving the original compiler names.
# Note symlinks will chain as follows: [hcc or clang++] -> clang -> clang-??
# Final link in symlink chain must point back to original directory.
# Original compiler is moved one directory deeper. Wrapper replaces it.
function write_sccache_stub_rocm() {
OLDCOMP=$1
COMPNAME=$(basename $OLDCOMP)
TOPDIR=$(dirname $OLDCOMP)
WRAPPED="$TOPDIR/original/$COMPNAME"
mv "$OLDCOMP" "$WRAPPED"
printf "#!/bin/sh\nexec sccache $WRAPPED \"\$@\"" > "$OLDCOMP"
chmod a+x "$OLDCOMP"
}
if [[ -e "/opt/rocm/hcc/bin/hcc" ]]; then
# ROCm 3.3 or earlier.
mkdir /opt/rocm/hcc/bin/original
write_sccache_stub_rocm /opt/rocm/hcc/bin/hcc
write_sccache_stub_rocm /opt/rocm/hcc/bin/clang
write_sccache_stub_rocm /opt/rocm/hcc/bin/clang++
# Fix last link in symlink chain, clang points to versioned clang in prior dir
pushd /opt/rocm/hcc/bin/original
ln -s ../$(readlink clang)
popd
elif [[ -e "/opt/rocm/llvm/bin/clang" ]]; then
# ROCm 3.5 and beyond.
mkdir /opt/rocm/llvm/bin/original
write_sccache_stub_rocm /opt/rocm/llvm/bin/clang
write_sccache_stub_rocm /opt/rocm/llvm/bin/clang++
# Fix last link in symlink chain, clang points to versioned clang in prior dir
pushd /opt/rocm/llvm/bin/original
ln -s ../$(readlink clang)
popd
else
echo "Cannot find ROCm compiler."
exit 1
fi
printf "#!/bin/sh\nexec sccache $(which nvcc) \"\$@\"" > /opt/cache/lib/nvcc
chmod a+x /opt/cache/lib/nvcc
fi
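The stub writer in this hunk appears in two forms: one forwards compiler arguments with `\$*`, the other with `\"\$@\"`. A minimal sketch of why the difference matters, using hypothetical helper names (`count_args`, `forward_star`, `forward_at`) that are not part of the script:

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; they are not part of the script above.
count_args() { echo "$#"; }

# Unquoted $* re-splits every argument on whitespace.
forward_star() { count_args $*; }

# Quoted "$@" passes each argument through unchanged.
forward_at() { count_args "$@"; }

# A compiler flag whose value contains a space:
forward_star "-DNAME=a b" -c
forward_at "-DNAME=a b" -c
```

The first call reports three arguments and the second reports two, so a wrapper built on `$*` would hand the compiler a different command line than the caller intended whenever an argument contains whitespace.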

View File

@ -4,9 +4,6 @@ set -ex
[ -n "$CMAKE_VERSION" ]
# Remove system cmake install so it won't get used instead
apt-get remove cmake -y
# Turn 3.6.3 into v3.6
path=$(echo "${CMAKE_VERSION}" | sed -e 's/\([0-9].[0-9]\+\).*/v\1/')
file="cmake-${CMAKE_VERSION}-Linux-x86_64.tar.gz"
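The removed `sed` expression that turns `3.6.3` into `v3.6` uses `\+`, which is a GNU extension to basic regular expressions. As a sketch only, and assuming the version always has three dot-separated components, the same result could be obtained with POSIX parameter expansion; the helper name `cmake_series` is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper; assumes a three-component version such as 3.6.3.
cmake_series() {
  # ${1%.*} strips the shortest suffix matching ".*", i.e. the last component.
  echo "v${1%.*}"
}

cmake_series 3.6.3
cmake_series 3.10.2
```

With a two-component input such as `3.6` this sketch would drop the minor version too, so it is not a drop-in replacement for every case the `sed` form accepts.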

View File

@ -24,20 +24,13 @@ if [ -n "$ANACONDA_PYTHON_VERSION" ]; then
mkdir /opt/conda
chown jenkins:jenkins /opt/conda
# Work around bug where devtoolset replaces sudo and breaks it.
if [ -n "$DEVTOOLSET_VERSION" ]; then
SUDO=/bin/sudo
else
SUDO=sudo
fi
as_jenkins() {
# NB: unsetting the environment variables works around a conda bug
# https://github.com/conda/conda/issues/6576
# NB: Pass on PATH and LD_LIBRARY_PATH to sudo invocation
# NB: This must be run from a directory that jenkins has access to,
# works around https://github.com/conda/conda-package-handling/pull/34
$SUDO -H -u jenkins env -u SUDO_UID -u SUDO_GID -u SUDO_COMMAND -u SUDO_USER env "PATH=$PATH" "LD_LIBRARY_PATH=$LD_LIBRARY_PATH" $*
sudo -H -u jenkins env -u SUDO_UID -u SUDO_GID -u SUDO_COMMAND -u SUDO_USER env "PATH=$PATH" "LD_LIBRARY_PATH=$LD_LIBRARY_PATH" $*
}
pushd /tmp
@ -56,10 +49,10 @@ if [ -n "$ANACONDA_PYTHON_VERSION" ]; then
pushd /opt/conda
# Track latest conda update
as_jenkins conda update -y -n base conda
as_jenkins conda update -n base conda
# Install correct Python version
as_jenkins conda install -y python="$ANACONDA_PYTHON_VERSION"
as_jenkins conda install python="$ANACONDA_PYTHON_VERSION"
conda_install() {
# Ensure that the install command doesn't upgrade/downgrade Python
@ -69,68 +62,31 @@ if [ -n "$ANACONDA_PYTHON_VERSION" ]; then
}
# Install PyTorch conda deps, as per https://github.com/pytorch/pytorch README
# DO NOT install cmake here as it would install a version newer than 3.10, but
# we want to pin to version 3.10.
SCIPY_VERSION=1.1.0
if [ "$ANACONDA_PYTHON_VERSION" = "3.9" ]; then
# Install llvm-8 as it is required to compile llvmlite-0.30.0 from source
conda_install numpy=1.19.2 astunparse pyyaml mkl mkl-include setuptools cffi future six llvmdev=8.0.0 -c conda-forge
SCIPY_VERSION=1.6.0
elif [ "$ANACONDA_PYTHON_VERSION" = "3.8" ]; then
# Install llvm-8 as it is required to compile llvmlite-0.30.0 from source
conda_install numpy=1.18.5 astunparse pyyaml mkl mkl-include setuptools cffi future six llvmdev=8.0.0
elif [ "$ANACONDA_PYTHON_VERSION" = "3.7" ]; then
# DO NOT install dataclasses if installing python-3.7, since it's part of python-3.7 core packages
conda_install numpy=1.18.5 astunparse pyyaml mkl mkl-include setuptools cffi future six typing_extensions
else
conda_install numpy=1.18.5 astunparse pyyaml mkl mkl-include setuptools cffi future six dataclasses typing_extensions
fi
if [[ "$CUDA_VERSION" == 10.2* ]]; then
conda_install magma-cuda102 -c pytorch
elif [[ "$CUDA_VERSION" == 11.0* ]]; then
conda_install magma-cuda110 -c pytorch
elif [[ "$CUDA_VERSION" == 11.1* ]]; then
conda_install magma-cuda111 -c pytorch
elif [[ "$CUDA_VERSION" == 11.3* ]]; then
conda_install magma-cuda113 -c pytorch
# DO NOT install cmake here as it would install a version newer than 3.5, but
# we want to pin to version 3.5.
conda_install numpy pyyaml mkl mkl-include setuptools cffi typing future six
if [[ "$CUDA_VERSION" == 9.0* ]]; then
conda_install magma-cuda90 -c pytorch
elif [[ "$CUDA_VERSION" == 9.1* ]]; then
conda_install magma-cuda91 -c pytorch
elif [[ "$CUDA_VERSION" == 9.2* ]]; then
conda_install magma-cuda92 -c pytorch
elif [[ "$CUDA_VERSION" == 10.0* ]]; then
conda_install magma-cuda100 -c pytorch
elif [[ "$CUDA_VERSION" == 10.1* ]]; then
conda_install magma-cuda101 -c pytorch
fi
# TODO: This isn't working atm
conda_install nnpack -c killeent
# Install some other packages, including those needed for Python test reporting
# Install some other packages
# TODO: Why is scipy pinned
# Pin MyPy version because new errors are likely to appear with each release
# Pin hypothesis to avoid flakiness: https://github.com/pytorch/pytorch/issues/31136
# Pin coverage so we can use COVERAGE_RCFILE
as_jenkins pip install --progress-bar off pytest \
scipy==$SCIPY_VERSION \
scikit-image \
psutil \
unittest-xml-reporting \
boto3==1.16.34 \
coverage==5.5 \
hypothesis==4.53.2 \
expecttest==0.1.3 \
mypy==0.812 \
tb-nightly
# Install numba only on python-3.8 or below
# For numba issue see https://github.com/pytorch/pytorch/issues/51511
if [[ $(python -c "import sys; print(int(sys.version_info < (3, 9)))") == "1" ]]; then
as_jenkins pip install --progress-bar off numba librosa>=0.6.2
else
as_jenkins pip install --progress-bar off numba==0.49.0 librosa>=0.6.2
fi
# Update scikit-learn to a python-3.8 compatible version
if [[ $(python -c "import sys; print(int(sys.version_info >= (3, 8)))") == "1" ]]; then
as_jenkins pip install --progress-bar off -U scikit-learn
else
# Pinned scikit-learn due to https://github.com/scikit-learn/scikit-learn/issues/14485 (affects gcc 5.5 only)
as_jenkins pip install --progress-bar off scikit-learn==0.20.3
fi
# numba & llvmlite are pinned because of https://github.com/numba/numba/issues/4368
# scikit-learn is pinned because of
# https://github.com/scikit-learn/scikit-learn/issues/14485 (affects gcc 5.5
# only)
as_jenkins pip install --progress-bar off pytest scipy==1.1.0 scikit-learn==0.20.3 scikit-image librosa>=0.6.2 psutil numba==0.46.0 llvmlite==0.30.0
popd
fi
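Several `pip install` lines in this hunk pass `librosa>=0.6.2` without quotes. In a POSIX shell an unquoted `>` starts a redirection, so the version constraint may never reach pip. A small sketch of the effect, where `show_args` is a hypothetical stand-in for `pip install` and the temporary directory is only there to keep the demonstration contained:

```shell
#!/bin/sh
# Hypothetical stand-in for `pip install`; it just prints its arguments.
show_args() { printf '%s\n' "$@"; }

workdir=$(mktemp -d)
cd "$workdir"

# Unquoted: the shell parses ">" as a redirection, so show_args receives
# only "librosa" and its output lands in a file named "=0.6.2".
show_args librosa>=0.6.2

# Quoted: the full requirement specifier is passed as one argument.
show_args "librosa>=0.6.2"

ls "$workdir"
```

Quoting the specifier, in the style of `"dill>=0.3.1"`, avoids the stray file and keeps the constraint intact.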

View File

@ -2,6 +2,23 @@
set -ex
# This function installs protobuf 2.6
install_protobuf_26() {
pb_dir="/usr/temp_pb_install_dir"
mkdir -p $pb_dir
# On the nvidia/cuda:9-cudnn7-devel-centos7 image we need this symlink or
# else it will fail with
# g++: error: ./../lib64/crti.o: No such file or directory
ln -s /usr/lib64 "$pb_dir/lib64"
curl -LO "https://github.com/google/protobuf/releases/download/v2.6.1/protobuf-2.6.1.tar.gz"
tar -xvz -C "$pb_dir" --strip-components 1 -f protobuf-2.6.1.tar.gz
pushd "$pb_dir" && ./configure && make && make check && sudo make install && sudo ldconfig
popd
rm -rf $pb_dir
}
install_ubuntu() {
apt-get update
apt-get install -y --no-install-recommends \
@ -34,16 +51,11 @@ install_centos() {
}
# Install base packages depending on the base OS
ID=$(grep -oP '(?<=^ID=).+' /etc/os-release | tr -d '"')
case "$ID" in
ubuntu)
install_ubuntu
;;
centos)
install_centos
;;
*)
echo "Unable to determine OS..."
exit 1
;;
esac
if [ -f /etc/lsb-release ]; then
install_ubuntu
elif [ -f /etc/os-release ]; then
install_centos
else
echo "Unable to determine OS..."
exit 1
fi

View File

@ -7,15 +7,10 @@ if [ -n "$GCC_VERSION" ]; then
# Need the official toolchain repo to get alternate packages
add-apt-repository ppa:ubuntu-toolchain-r/test
apt-get update
if [ "$UBUNTU_VERSION" = "16.04" -a "$GCC_VERSION" = "5" ]; then
apt-get install -y g++-5=5.4.0-6ubuntu1~16.04.12
else
apt-get install -y g++-$GCC_VERSION
fi
apt-get install -y g++-$GCC_VERSION
update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-"$GCC_VERSION" 50
update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-"$GCC_VERSION" 50
update-alternatives --install /usr/bin/gcov gcov /usr/bin/gcov-"$GCC_VERSION" 50
# Cleanup package manager
apt-get autoclean && apt-get clean

View File

@ -1,8 +0,0 @@
#!/bin/bash
set -ex
git clone --branch v1.15 https://github.com/linux-test-project/lcov.git
pushd lcov
sudo make install # will be installed in /usr/local/bin/lcov
popd

View File

@ -1,10 +0,0 @@
#!/bin/bash
sudo apt-get update
# also install ssh to avoid error of:
# --------------------------------------------------------------------------
# The value of the MCA parameter "plm_rsh_agent" was set to a path
# that could not be found:
# plm_rsh_agent: ssh : rsh
sudo apt-get install -y ssh
sudo apt-get install -y --allow-downgrades --allow-change-held-packages openmpi-bin libopenmpi-dev

View File

@ -1,14 +0,0 @@
#!/bin/bash
set -ex
OPENSSL=openssl-1.1.1k
wget -q -O "${OPENSSL}.tar.gz" "https://www.openssl.org/source/${OPENSSL}.tar.gz"
tar xf "${OPENSSL}.tar.gz"
cd "${OPENSSL}"
./config --prefix=/opt/openssl -d '-Wl,--enable-new-dtags,-rpath,$(LIBRPATH)'
# NOTE: openssl errors out when built with the -j option
make install_sw
cd ..
rm -rf "${OPENSSL}"

View File

@ -2,8 +2,8 @@
set -ex
# This function installs protobuf 3.17
install_protobuf_317() {
# This function installs protobuf 2.6
install_protobuf_26() {
pb_dir="/usr/temp_pb_install_dir"
mkdir -p $pb_dir
@ -12,45 +12,45 @@ install_protobuf_317() {
# g++: error: ./../lib64/crti.o: No such file or directory
ln -s /usr/lib64 "$pb_dir/lib64"
curl -LO "https://github.com/protocolbuffers/protobuf/releases/download/v3.17.3/protobuf-all-3.17.3.tar.gz"
tar -xvz -C "$pb_dir" --strip-components 1 -f protobuf-all-3.17.3.tar.gz
# -j2 to balance memory usage and speed.
# naked `-j` seems to use too much memory.
pushd "$pb_dir" && ./configure && make -j2 && make -j2 check && sudo make -j2 install && sudo ldconfig
curl -LO "https://github.com/google/protobuf/releases/download/v2.6.1/protobuf-2.6.1.tar.gz"
tar -xvz -C "$pb_dir" --strip-components 1 -f protobuf-2.6.1.tar.gz
pushd "$pb_dir" && ./configure && make && make check && sudo make install && sudo ldconfig
popd
rm -rf $pb_dir
}
install_ubuntu() {
# Ubuntu 14.04 has cmake 2.8.12 as the default option, so we will
# Ubuntu 14.04 ships with protobuf 2.5, but ONNX needs protobuf >= 2.6
# so we install that here if on 14.04
# Ubuntu 14.04 also has cmake 2.8.12 as the default option, so we will
# install cmake3 here and use cmake3.
apt-get update
if [[ "$UBUNTU_VERSION" == 14.04 ]]; then
apt-get install -y --no-install-recommends cmake3
install_protobuf_26
else
apt-get install -y --no-install-recommends \
libprotobuf-dev \
protobuf-compiler
fi
# Cleanup
apt-get autoclean && apt-get clean
rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
install_protobuf_317
}
install_centos() {
install_protobuf_317
# Centos7 ships with protobuf 2.5, but ONNX needs protobuf >= 2.6
# so we always install that here
install_protobuf_26
}
# Install base packages depending on the base OS
ID=$(grep -oP '(?<=^ID=).+' /etc/os-release | tr -d '"')
case "$ID" in
ubuntu)
install_ubuntu
;;
centos)
install_centos
;;
*)
echo "Unable to determine OS..."
exit 1
;;
esac
if [ -f /etc/lsb-release ]; then
install_ubuntu
elif [ -f /etc/os-release ]; then
install_centos
else
echo "Unable to determine OS..."
exit 1
fi
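This hunk shows two ways of choosing the installer: reading the `ID=` field from `/etc/os-release`, and testing for the existence of `/etc/lsb-release` before `/etc/os-release`. The file-existence form depends on check order, since Ubuntu also ships `/etc/os-release`. A sketch of the `ID`-based approach follows; the helper name `detect_os_id` and its optional path argument are assumptions added so the example can run against a sample file, and `sed` is used instead of `grep -P` purely for portability:

```shell
#!/bin/sh
# Hypothetical helper; the optional path argument exists only to make the sketch testable.
detect_os_id() {
  os_release="${1:-/etc/os-release}"
  # Print the value of the ID= line and drop surrounding double quotes.
  sed -n 's/^ID=//p' "$os_release" | tr -d '"'
}

# Sample file standing in for /etc/os-release on a CentOS image.
sample=$(mktemp)
printf 'NAME="CentOS Linux"\nID="centos"\nID_LIKE="rhel fedora"\n' > "$sample"

case "$(detect_os_id "$sample")" in
  ubuntu) echo "would call install_ubuntu" ;;
  centos) echo "would call install_centos" ;;
  *) echo "Unable to determine OS..." ;;
esac
```

The `^ID=` anchor keeps the `ID_LIKE=` line from matching, so only the distribution identifier is returned.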

View File

@ -1,127 +0,0 @@
#!/bin/bash
set -ex
install_magma() {
# "install" hipMAGMA into /opt/rocm/magma by copying after build
git clone https://bitbucket.org/icl/magma.git -b magma_ctrl_launch_bounds
pushd magma
# The branch "magma_ctrl_launch_bounds" carries a fix on top of the commit below, so the commented-out checkout is kept for reference.
#git checkout 878b1ce02e9cfe4a829be22c8f911e9c0b6bd88f
# Work around non-ASCII characters in certain magma sources; remove this after upstream magma fixes this.
perl -i.bak -pe 's/[^[:ascii:]]//g' sparse/control/magma_zfree.cpp
perl -i.bak -pe 's/[^[:ascii:]]//g' sparse/control/magma_zsolverinfo.cpp
cp make.inc-examples/make.inc.hip-gcc-mkl make.inc
echo 'LIBDIR += -L$(MKLROOT)/lib' >> make.inc
echo 'LIB += -Wl,--enable-new-dtags -Wl,--rpath,/opt/rocm/lib -Wl,--rpath,$(MKLROOT)/lib -Wl,--rpath,/opt/rocm/magma/lib' >> make.inc
echo 'DEVCCFLAGS += --amdgpu-target=gfx803 --amdgpu-target=gfx900 --amdgpu-target=gfx906 --amdgpu-target=gfx908 --gpu-max-threads-per-block=256' >> make.inc
# hipcc with openmp flag may cause isnan() on __device__ not to be found; depending on context, compiler may attempt to match with host definition
sed -i 's/^FOPENMP/#FOPENMP/g' make.inc
export PATH="${PATH}:/opt/rocm/bin"
make -f make.gen.hipMAGMA -j $(nproc)
make lib/libmagma.so -j $(nproc) MKLROOT=/opt/conda
make testing/testing_dgemm -j $(nproc) MKLROOT=/opt/conda
popd
mv magma /opt/rocm
}
ver() {
printf "%3d%03d%03d%03d" $(echo "$1" | tr '.' ' ');
}
install_ubuntu() {
apt-get update
if [[ $UBUNTU_VERSION == 18.04 ]]; then
# gpg-agent is not available by default on 18.04
apt-get install -y --no-install-recommends gpg-agent
fi
apt-get install -y kmod
apt-get install -y wget
# Need the libc++1 and libc++abi1 libraries to allow torch._C to load at runtime
apt-get install -y libc++1
apt-get install -y libc++abi1
ROCM_REPO="ubuntu"
if [[ $(ver $ROCM_VERSION) -lt $(ver 4.2) ]]; then
ROCM_REPO="xenial"
fi
# Add rocm repository
wget -qO - http://repo.radeon.com/rocm/rocm.gpg.key | apt-key add -
echo "deb [arch=amd64] http://repo.radeon.com/rocm/apt/${ROCM_VERSION} ${ROCM_REPO} main" > /etc/apt/sources.list.d/rocm.list
apt-get update --allow-insecure-repositories
DEBIAN_FRONTEND=noninteractive apt-get install -y --allow-unauthenticated \
rocm-dev \
rocm-utils \
rocm-libs \
rccl \
rocprofiler-dev \
roctracer-dev
# precompiled miopen kernels added in ROCm 3.5; search for all unversioned packages
# if search fails it will abort this script; use true to avoid case where search fails
MIOPENKERNELS=$(apt-cache search --names-only miopenkernels | awk '{print $1}' | grep -F -v . || true)
if [[ "x${MIOPENKERNELS}" = x ]]; then
echo "miopenkernels package not available"
else
DEBIAN_FRONTEND=noninteractive apt-get install -y --allow-unauthenticated ${MIOPENKERNELS}
fi
install_magma
# Cleanup
apt-get autoclean && apt-get clean
rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
}
install_centos() {
yum update -y
yum install -y kmod
yum install -y wget
yum install -y openblas-devel
yum install -y epel-release
yum install -y dkms kernel-headers-`uname -r` kernel-devel-`uname -r`
echo "[ROCm]" > /etc/yum.repos.d/rocm.repo
echo "name=ROCm" >> /etc/yum.repos.d/rocm.repo
echo "baseurl=http://repo.radeon.com/rocm/yum/${ROCM_VERSION}" >> /etc/yum.repos.d/rocm.repo
echo "enabled=1" >> /etc/yum.repos.d/rocm.repo
echo "gpgcheck=0" >> /etc/yum.repos.d/rocm.repo
yum update -y
yum install -y \
rocm-dev \
rocm-utils \
rocm-libs \
rccl \
rocprofiler-dev \
roctracer-dev
install_magma
# Cleanup
yum clean all
rm -rf /var/cache/yum
rm -rf /var/lib/yum/yumdb
rm -rf /var/lib/yum/history
}
# Install Python packages depending on the base OS
ID=$(grep -oP '(?<=^ID=).+' /etc/os-release | tr -d '"')
case "$ID" in
ubuntu)
install_ubuntu
;;
centos)
install_centos
;;
*)
echo "Unable to determine OS..."
exit 1
;;
esac
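The `ver` helper in the script above turns a dotted version into a single padded number so that `-lt` can compare versions numerically. A sketch of the same idea follows; the name `ver_key` is hypothetical, and the first field uses `%d` rather than `%3d` so the result carries no leading spaces, which is a deliberate deviation from the original:

```shell
#!/bin/sh
# Sketch of the zero-padding idea behind ver(); the name ver_key is hypothetical.
ver_key() {
  # Split on dots; printf treats missing numeric arguments as 0,
  # so "4.2" becomes 4, 2, 0, 0 and is printed as 4002000000.
  printf "%d%03d%03d%03d" $(echo "$1" | tr '.' ' ')
}

if [ "$(ver_key 4.1)" -lt "$(ver_key 4.2)" ]; then
  echo "4.1 is older than 4.2"
fi
if [ "$(ver_key 3.10)" -gt "$(ver_key 3.9)" ]; then
  echo "3.10 is newer than 3.9"
fi
```

A plain string comparison would order `3.10` before `3.9`; padding each component to three digits is what makes the numeric test agree with version order, provided no component exceeds 999.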

View File

@ -1,24 +0,0 @@
#!/bin/bash
set -ex
[ -n "${SWIFTSHADER}" ]
retry () {
$* || (sleep 1 && $*) || (sleep 2 && $*) || (sleep 4 && $*) || (sleep 8 && $*)
}
_https_amazon_aws=https://ossci-android.s3.amazonaws.com
# SwiftShader
_swiftshader_dir=/var/lib/jenkins/swiftshader
_swiftshader_file_targz=swiftshader-abe07b943-prebuilt.tar.gz
mkdir -p $_swiftshader_dir
_tmp_swiftshader_targz="/tmp/${_swiftshader_file_targz}"
curl --silent --show-error --location --fail --retry 3 \
--output "${_tmp_swiftshader_targz}" "$_https_amazon_aws/${_swiftshader_file_targz}"
tar -C "${_swiftshader_dir}" -xzf "${_tmp_swiftshader_targz}"
export VK_ICD_FILENAMES="${_swiftshader_dir}/build/Linux/vk_swiftshader_icd.json"
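The `retry` helper defined in this script chains five attempts with sleeps of 1, 2, 4 and 8 seconds and forwards its command via unquoted `$*`. A loop-based sketch with the same attempt count and backoff is shown below; it is an illustrative rewrite rather than the script's code, and `RETRY_BASE_DELAY` is an assumed variable introduced only so the delay can be shortened when experimenting:

```shell
#!/bin/sh
# Hypothetical rewrite of retry(); RETRY_BASE_DELAY is an assumed knob, not in the original.
retry() {
  delay="${RETRY_BASE_DELAY:-1}"
  attempt=1
  # "$@" keeps each argument intact, unlike unquoted $*.
  until "$@"; do
    if [ "$attempt" -ge 5 ]; then
      return 1
    fi
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}

# Use a zero delay so the demonstration finishes immediately.
RETRY_BASE_DELAY=0
retry true && echo "succeeded"
```

With the default base delay of 1 the sleeps between the five attempts are 1, 2, 4 and 8 seconds, matching the chained form.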

View File

@ -0,0 +1,97 @@
#!/bin/bash
set -ex
as_jenkins() {
# NB: Preserve PATH and LD_LIBRARY_PATH changes
sudo -H -u jenkins env "PATH=$PATH" "LD_LIBRARY_PATH=$LD_LIBRARY_PATH" $*
}
if [ -n "$TRAVIS_PYTHON_VERSION" ]; then
mkdir -p /opt/python
chown jenkins:jenkins /opt/python
# Download Python binary from Travis
pushd tmp
as_jenkins wget --quiet ${TRAVIS_DL_URL_PREFIX}/python-$TRAVIS_PYTHON_VERSION.tar.bz2
# NB: The tarball also comes with /home/travis virtualenv that we
# don't care about. (Maybe we should, but we've worked around the
# "how do I install to python" issue by making this entire directory
# user-writable "lol")
# NB: Relative ordering of opt/python and flags matters
as_jenkins tar xjf python-$TRAVIS_PYTHON_VERSION.tar.bz2 --strip-components=2 --directory /opt/python opt/python
popd
echo "/opt/python/$TRAVIS_PYTHON_VERSION/lib" > /etc/ld.so.conf.d/travis-python.conf
ldconfig
sed -e 's|PATH="\(.*\)"|PATH="/opt/python/'"$TRAVIS_PYTHON_VERSION"'/bin:\1"|g' -i /etc/environment
export PATH="/opt/python/$TRAVIS_PYTHON_VERSION/bin:$PATH"
python --version
pip --version
# Install pip from source.
# The python-pip package on Ubuntu Trusty is old
# and upon install numpy doesn't use the binary
# distribution, and fails to compile it from source.
pushd tmp
as_jenkins curl -L -O https://pypi.python.org/packages/11/b6/abcb525026a4be042b486df43905d6893fb04f05aac21c32c638e939e447/pip-9.0.1.tar.gz
as_jenkins tar zxf pip-9.0.1.tar.gz
pushd pip-9.0.1
as_jenkins python setup.py install
popd
rm -rf pip-9.0.1*
popd
# Install pip packages
as_jenkins pip install --upgrade pip
pip --version
if [[ "$TRAVIS_PYTHON_VERSION" == nightly ]]; then
# These two packages have broken Cythonizations uploaded
# to PyPi, see:
#
# - https://github.com/numpy/numpy/issues/10500
# - https://github.com/yaml/pyyaml/issues/117
#
# Furthermore, the released version of Cython does not
# have these issues fixed.
#
# While we are waiting on fixes for these, we build
# from Git for now. Feel free to delete this conditional
# branch if things start working again (you may need
# to do this if these packages regress on Git HEAD.)
as_jenkins pip install git+https://github.com/cython/cython.git
as_jenkins pip install git+https://github.com/numpy/numpy.git
as_jenkins pip install git+https://github.com/yaml/pyyaml.git
else
as_jenkins pip install numpy pyyaml
fi
as_jenkins pip install \
future \
hypothesis \
protobuf \
pytest \
pillow \
typing
as_jenkins pip install mkl mkl-devel
# SciPy does not support Python 3.7 or Python 2.7.9
if [[ "$TRAVIS_PYTHON_VERSION" != nightly ]] && [[ "$TRAVIS_PYTHON_VERSION" != "2.7.9" ]]; then
as_jenkins pip install scipy==1.1.0 scikit-image librosa>=0.6.2
fi
# Install psutil for dataloader tests
as_jenkins pip install psutil
# Install dill for serialization tests
as_jenkins pip install "dill>=0.3.1"
# Cleanup package manager
apt-get autoclean && apt-get clean
rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
fi

View File

@ -2,6 +2,23 @@
set -ex
# This function installs protobuf 2.6
install_protobuf_26() {
pb_dir="/usr/temp_pb_install_dir"
mkdir -p $pb_dir
# On the nvidia/cuda:9-cudnn7-devel-centos7 image we need this symlink or
# else it will fail with
# g++: error: ./../lib64/crti.o: No such file or directory
ln -s /usr/lib64 "$pb_dir/lib64"
curl -LO "https://github.com/google/protobuf/releases/download/v2.6.1/protobuf-2.6.1.tar.gz"
tar -xvz -C "$pb_dir" --strip-components 1 -f protobuf-2.6.1.tar.gz
pushd "$pb_dir" && ./configure && make && make check && sudo make install && sudo ldconfig
popd
rm -rf $pb_dir
}
install_ubuntu() {
apt-get update
apt-get install -y --no-install-recommends \
@ -30,16 +47,11 @@ install_centos() {
}
# Install base packages depending on the base OS
ID=$(grep -oP '(?<=^ID=).+' /etc/os-release | tr -d '"')
case "$ID" in
ubuntu)
install_ubuntu
;;
centos)
install_centos
;;
*)
echo "Unable to determine OS..."
exit 1
;;
esac
if [ -f /etc/lsb-release ]; then
install_ubuntu
elif [ -f /etc/os-release ]; then
install_centos
else
echo "Unable to determine OS..."
exit 1
fi

View File

@ -1,24 +0,0 @@
#!/bin/bash
set -ex
[ -n "${VULKAN_SDK_VERSION}" ]
retry () {
$* || (sleep 1 && $*) || (sleep 2 && $*) || (sleep 4 && $*) || (sleep 8 && $*)
}
_vulkansdk_dir=/var/lib/jenkins/vulkansdk
_tmp_vulkansdk_targz=/tmp/vulkansdk.tar.gz
curl \
--silent \
--show-error \
--location \
--fail \
--retry 3 \
--output "${_tmp_vulkansdk_targz}" "https://ossci-android.s3.amazonaws.com/vulkansdk-linux-x86_64-${VULKAN_SDK_VERSION}.tar.gz"
mkdir -p "${_vulkansdk_dir}"
tar -C "${_vulkansdk_dir}" -xzf "${_tmp_vulkansdk_targz}" --strip-components 1
rm -rf "${_tmp_vulkansdk_targz}"

View File

@ -24,7 +24,7 @@ ARG KATEX
ADD ./common/install_katex.sh install_katex.sh
RUN bash ./install_katex.sh && rm install_katex.sh
# Install conda and other packages (e.g., numpy, coverage, pytest)
# Install conda
ENV PATH /opt/conda/bin:$PATH
ARG ANACONDA_PYTHON_VERSION
ADD ./common/install_conda.sh install_conda.sh
@ -35,10 +35,11 @@ ARG GCC_VERSION
ADD ./common/install_gcc.sh install_gcc.sh
RUN bash ./install_gcc.sh && rm install_gcc.sh
# Install clang
ARG CLANG_VERSION
ADD ./common/install_clang.sh install_clang.sh
RUN bash ./install_clang.sh && rm install_clang.sh
# Install non-standard Python versions (via Travis binaries)
ARG TRAVIS_PYTHON_VERSION
ENV PATH /opt/python/$TRAVIS_PYTHON_VERSION/bin:$PATH
ADD ./common/install_travis_python.sh install_travis_python.sh
RUN bash ./install_travis_python.sh && rm install_travis_python.sh
# (optional) Install protobuf for ONNX
ARG PROTOBUF
@ -61,16 +62,6 @@ RUN if [ -n "${VISION}" ]; then bash ./install_vision.sh; fi
RUN rm install_vision.sh
ENV INSTALLED_VISION ${VISION}
ADD ./common/install_openssl.sh install_openssl.sh
ENV OPENSSL_ROOT_DIR /opt/openssl
RUN bash ./install_openssl.sh
# (optional) Install non-default CMake version
ARG CMAKE_VERSION
ADD ./common/install_cmake.sh install_cmake.sh
RUN if [ -n "${CMAKE_VERSION}" ]; then bash ./install_cmake.sh; fi
RUN rm install_cmake.sh
# Install ccache/sccache (do this last, so we get priority in PATH)
ADD ./common/install_cache.sh install_cache.sh
ENV PATH /opt/cache/bin:$PATH
@ -82,11 +73,6 @@ ADD ./common/install_jni.sh install_jni.sh
ADD ./java/jni.h jni.h
RUN bash ./install_jni.sh && rm install_jni.sh
# Install Open MPI for CUDA
ADD ./common/install_openmpi.sh install_openmpi.sh
RUN if [ -n "${CUDA_VERSION}" ]; then bash install_openmpi.sh; fi
RUN rm install_openmpi.sh
# Include BUILD_ENVIRONMENT environment variable in image
ARG BUILD_ENVIRONMENT
ENV BUILD_ENVIRONMENT ${BUILD_ENVIRONMENT}
@ -95,8 +81,5 @@ ENV BUILD_ENVIRONMENT ${BUILD_ENVIRONMENT}
ENV TORCH_CUDA_ARCH_LIST Maxwell
ENV TORCH_NVCC_FLAGS "-Xfatbin -compress-all"
# Install LLVM dev version (Defined in the pytorch/builder github repository)
COPY --from=pytorch/llvm:9.0.1 /opt/llvm /opt/llvm
USER jenkins
CMD ["bash"]

View File

@ -1,92 +0,0 @@
ARG UBUNTU_VERSION
FROM ubuntu:${UBUNTU_VERSION}
ARG UBUNTU_VERSION
ENV DEBIAN_FRONTEND noninteractive
# Install common dependencies (so that this step can be cached separately)
ARG EC2
ADD ./common/install_base.sh install_base.sh
RUN bash ./install_base.sh && rm install_base.sh
# Install clang
ARG LLVMDEV
ARG CLANG_VERSION
ADD ./common/install_clang.sh install_clang.sh
RUN bash ./install_clang.sh && rm install_clang.sh
# Install user
ADD ./common/install_user.sh install_user.sh
RUN bash ./install_user.sh && rm install_user.sh
# Install conda and other packages (e.g., numpy, coverage, pytest)
ENV PATH /opt/conda/bin:$PATH
ARG ANACONDA_PYTHON_VERSION
ADD ./common/install_conda.sh install_conda.sh
RUN bash ./install_conda.sh && rm install_conda.sh
# Install gcc
ARG GCC_VERSION
ADD ./common/install_gcc.sh install_gcc.sh
RUN bash ./install_gcc.sh && rm install_gcc.sh
# (optional) Install protobuf for ONNX
ARG PROTOBUF
ADD ./common/install_protobuf.sh install_protobuf.sh
RUN if [ -n "${PROTOBUF}" ]; then bash ./install_protobuf.sh; fi
RUN rm install_protobuf.sh
ENV INSTALLED_PROTOBUF ${PROTOBUF}
# (optional) Install database packages like LMDB and LevelDB
ARG DB
ADD ./common/install_db.sh install_db.sh
RUN if [ -n "${DB}" ]; then bash ./install_db.sh; fi
RUN rm install_db.sh
ENV INSTALLED_DB ${DB}
# (optional) Install vision packages like OpenCV and ffmpeg
ARG VISION
ADD ./common/install_vision.sh install_vision.sh
RUN if [ -n "${VISION}" ]; then bash ./install_vision.sh; fi
RUN rm install_vision.sh
ENV INSTALLED_VISION ${VISION}
# Install rocm
ARG ROCM_VERSION
ADD ./common/install_rocm.sh install_rocm.sh
RUN bash ./install_rocm.sh
RUN rm install_rocm.sh
ENV PATH /opt/rocm/bin:$PATH
ENV PATH /opt/rocm/hcc/bin:$PATH
ENV PATH /opt/rocm/hip/bin:$PATH
ENV PATH /opt/rocm/opencl/bin:$PATH
ENV PATH /opt/rocm/llvm/bin:$PATH
ENV MAGMA_HOME /opt/rocm/magma
ENV LANG C.UTF-8
ENV LC_ALL C.UTF-8
# (optional) Install non-default CMake version
ARG CMAKE_VERSION
ADD ./common/install_cmake.sh install_cmake.sh
RUN if [ -n "${CMAKE_VERSION}" ]; then bash ./install_cmake.sh; fi
RUN rm install_cmake.sh
# (optional) Install non-default Ninja version
ARG NINJA_VERSION
ADD ./common/install_ninja.sh install_ninja.sh
RUN if [ -n "${NINJA_VERSION}" ]; then bash ./install_ninja.sh; fi
RUN rm install_ninja.sh
# Install ccache/sccache (do this last, so we get priority in PATH)
ADD ./common/install_cache.sh install_cache.sh
ENV PATH /opt/cache/bin:$PATH
RUN bash ./install_cache.sh && rm install_cache.sh
# Include BUILD_ENVIRONMENT environment variable in image
ARG BUILD_ENVIRONMENT
ENV BUILD_ENVIRONMENT ${BUILD_ENVIRONMENT}
USER jenkins
CMD ["bash"]

View File

@ -33,7 +33,7 @@ ARG KATEX
ADD ./common/install_katex.sh install_katex.sh
RUN bash ./install_katex.sh && rm install_katex.sh
# Install conda and other packages (e.g., numpy, coverage, pytest)
# Install conda
ENV PATH /opt/conda/bin:$PATH
ARG ANACONDA_PYTHON_VERSION
ADD ./common/install_conda.sh install_conda.sh
@ -44,9 +44,12 @@ ARG GCC_VERSION
ADD ./common/install_gcc.sh install_gcc.sh
RUN bash ./install_gcc.sh && rm install_gcc.sh
# Install lcov for C++ code coverage
ADD ./common/install_lcov.sh install_lcov.sh
RUN bash ./install_lcov.sh && rm install_lcov.sh
# Install non-standard Python versions (via Travis binaries)
ARG TRAVIS_PYTHON_VERSION
ARG TRAVIS_DL_URL_PREFIX
ENV PATH /opt/python/$TRAVIS_PYTHON_VERSION/bin:$PATH
ADD ./common/install_travis_python.sh install_travis_python.sh
RUN bash ./install_travis_python.sh && rm install_travis_python.sh
# (optional) Install protobuf for ONNX
ARG PROTOBUF
@ -82,18 +85,6 @@ RUN rm AndroidManifest.xml
RUN rm build.gradle
ENV INSTALLED_ANDROID ${ANDROID}
# (optional) Install Vulkan SDK
ARG VULKAN_SDK_VERSION
ADD ./common/install_vulkan_sdk.sh install_vulkan_sdk.sh
RUN if [ -n "${VULKAN_SDK_VERSION}" ]; then bash ./install_vulkan_sdk.sh; fi
RUN rm install_vulkan_sdk.sh
# (optional) Install swiftshader
ARG SWIFTSHADER
ADD ./common/install_swiftshader.sh install_swiftshader.sh
RUN if [ -n "${SWIFTSHADER}" ]; then bash ./install_swiftshader.sh; fi
RUN rm install_swiftshader.sh
# (optional) Install non-default CMake version
ARG CMAKE_VERSION
ADD ./common/install_cmake.sh install_cmake.sh
@ -106,10 +97,6 @@ ADD ./common/install_ninja.sh install_ninja.sh
RUN if [ -n "${NINJA_VERSION}" ]; then bash ./install_ninja.sh; fi
RUN rm install_ninja.sh
ADD ./common/install_openssl.sh install_openssl.sh
RUN bash ./install_openssl.sh
ENV OPENSSL_ROOT_DIR /opt/openssl
# Install ccache/sccache (do this last, so we get priority in PATH)
ADD ./common/install_cache.sh install_cache.sh
ENV PATH /opt/cache/bin:$PATH
@ -124,8 +111,5 @@ RUN bash ./install_jni.sh && rm install_jni.sh
ARG BUILD_ENVIRONMENT
ENV BUILD_ENVIRONMENT ${BUILD_ENVIRONMENT}
# Install LLVM dev version (Defined in the pytorch/builder github repository)
COPY --from=pytorch/llvm:9.0.1 /opt/llvm /opt/llvm
USER jenkins
CMD ["bash"]

View File

@ -1,10 +1,10 @@
FROM ubuntu:18.04
FROM ubuntu:16.04
RUN apt-get update && apt-get install -y python3-pip git && rm -rf /var/lib/apt/lists/* /var/log/dpkg.log
RUN apt-get update && apt-get install -y python-pip && rm -rf /var/lib/apt/lists/* /var/log/dpkg.log
ADD requirements.txt /requirements.txt
RUN pip3 install -r /requirements.txt
RUN pip install -r /requirements.txt
ADD gc.py /usr/bin/gc.py

View File

@ -1,4 +1,4 @@
#!/usr/bin/env python3
#!/usr/bin/env python
from collections import namedtuple

View File

@ -1,10 +1,9 @@
#!/usr/bin/env python3
#!/usr/bin/env python
import argparse
import boto3
import datetime
import boto3
import pytz
import re
import sys
@ -88,9 +87,6 @@ parser = argparse.ArgumentParser(description="Delete old Docker tags from regist
parser.add_argument(
"--dry-run", action="store_true", help="Dry run; print tags that would be deleted"
)
parser.add_argument(
"--debug", action="store_true", help="Debug, print ignored / saved tags"
)
parser.add_argument(
"--keep-stable-days",
type=int,
@ -148,19 +144,7 @@ def chunks(chunkable, n):
""" Yield successive n-sized chunks from chunkable.
"""
for i in range(0, len(chunkable), n):
yield chunkable[i: i + n]
SHA_PATTERN = re.compile(r'^[0-9a-f]{40}$')
def looks_like_git_sha(tag):
"""Returns a boolean to check if a tag looks like a git sha
For reference a sha1 is 40 characters with only 0-9a-f and contains no
"-" characters
"""
return re.match(SHA_PATTERN, tag) is not None
yield chunkable[i : i + n]
stable_window_tags = []
@ -171,48 +155,48 @@ for repo in repos(client):
# Keep list of image digests to delete for this repository
digest_to_delete = []
print(repositoryName)
for image in images(client, repo):
tags = image.get("imageTags")
if not isinstance(tags, (list,)) or len(tags) == 0:
continue
tag = tags[0]
created = image["imagePushedAt"]
age = now - created
for tag in tags:
if any([
looks_like_git_sha(tag),
tag.isdigit(),
tag.count("-") == 4, # TODO: Remove, this no longer applies as tags are now built using a SHA1
tag in ignore_tags]):
window = stable_window
if tag in ignore_tags:
stable_window_tags.append((repositoryName, tag, "", age, created))
elif age < window:
stable_window_tags.append((repositoryName, tag, window, age, created))
else:
window = unstable_window
if tag in ignore_tags or age < window:
if args.debug:
print("Ignoring {}:{} (age: {})".format(repositoryName, tag, age))
break
# New images built on CircleCI use the workflow ID as their tag, which contains four "-" characters
if tag.isdigit() or tag.count("-") == 4 or tag in ignore_tags:
window = stable_window
if tag in ignore_tags:
stable_window_tags.append((repositoryName, tag, "", age, created))
elif age < window:
stable_window_tags.append((repositoryName, tag, window, age, created))
else:
for tag in tags:
print("{}Deleting {}:{} (age: {})".format("(dry run) " if args.dry_run else "", repositoryName, tag, age))
digest_to_delete.append(image["imageDigest"])
if args.dry_run:
if args.debug:
print("Skipping actual deletion, moving on...")
else:
# Issue batch delete for all images to delete for this repository
# Note that as of 2018-07-25, the maximum number of images you can
# delete in a single batch is 100, so chunk our list into batches of
# 100
for c in chunks(digest_to_delete, 100):
client.batch_delete_image(
registryId="308535385114",
repositoryName=repositoryName,
imageIds=[{"imageDigest": digest} for digest in c],
)
window = unstable_window
save_to_s3(args.filter_prefix, stable_window_tags)
if tag in ignore_tags:
print("Ignoring tag {} (age: {})".format(tag, age))
continue
if age < window:
print("Not deleting manifest for tag {} (age: {})".format(tag, age))
continue
if args.dry_run:
print("(dry run) Deleting manifest for tag {} (age: {})".format(tag, age))
else:
print("Deleting manifest for tag {} (age: {})".format(tag, age))
digest_to_delete.append(image["imageDigest"])
# Issue batch delete for all images to delete for this repository
# Note that as of 2018-07-25, the maximum number of images you can
# delete in a single batch is 100, so chunk our list into batches of
# 100
for c in chunks(digest_to_delete, 100):
client.batch_delete_image(
registryId="308535385114",
repositoryName=repositoryName,
imageIds=[{"imageDigest": digest} for digest in c],
)
save_to_s3(args.filter_prefix, stable_window_tags)
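
The tag handling in this hunk can be exercised without touching ECR. The sketch below copies `looks_like_git_sha` and `chunks` from the diff and adds `is_stable_tag`, a hypothetical helper (not part of the repository) that restates the `any([...])` test. It is an illustration under those assumptions, not the script itself.

```python
import re

SHA_PATTERN = re.compile(r"^[0-9a-f]{40}$")


def looks_like_git_sha(tag):
    """Return True when the tag is 40 lowercase hex characters."""
    return SHA_PATTERN.match(tag) is not None


def chunks(chunkable, n):
    """Yield successive n-sized slices of chunkable."""
    for i in range(0, len(chunkable), n):
        yield chunkable[i : i + n]


def is_stable_tag(tag, ignore_tags=()):
    # Hypothetical helper: mirrors the any([...]) condition that selects
    # the stable retention window for a tag.
    return (
        looks_like_git_sha(tag)
        or tag.isdigit()
        or tag.count("-") == 4
        or tag in ignore_tags
    )


if __name__ == "__main__":
    digests = ["sha256:%04d" % i for i in range(250)]
    print([len(batch) for batch in chunks(digests, 100)])
```

Splitting 250 digests this way gives batches of 100, 100 and 50, which stays within the 100-image limit mentioned in the comment above the `batch_delete_image` call.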

View File

@ -6,22 +6,13 @@ Please see README.md in this directory for details.
"""
import os
import shutil
import sys
from collections import namedtuple
import shutil
from collections import namedtuple, OrderedDict
import cimodel.data.binary_build_definitions as binary_build_definitions
import cimodel.data.pytorch_build_definitions as pytorch_build_definitions
import cimodel.data.simple.android_definitions
import cimodel.data.simple.binary_smoketest
import cimodel.data.simple.docker_definitions
import cimodel.data.simple.ios_definitions
import cimodel.data.simple.macos_definitions
import cimodel.data.simple.mobile_definitions
import cimodel.data.simple.nightly_android
import cimodel.data.simple.nightly_ios
import cimodel.data.simple.anaconda_prune_defintions
import cimodel.data.windows_build_definitions as windows_build_definitions
import cimodel.data.binary_build_definitions as binary_build_definitions
import cimodel.data.caffe2_build_definitions as caffe2_build_definitions
import cimodel.lib.miniutils as miniutils
import cimodel.lib.miniyaml as miniyaml
@ -30,7 +21,6 @@ class File(object):
"""
Verbatim copy the contents of a file into config.yml
"""
def __init__(self, filename):
self.filename = filename
@ -39,7 +29,7 @@ class File(object):
shutil.copyfileobj(fh, output_filehandle)
class FunctionGen(namedtuple("FunctionGen", "function depth")):
class FunctionGen(namedtuple('FunctionGen', 'function depth')):
__slots__ = ()
@ -49,14 +39,15 @@ class Treegen(FunctionGen):
"""
def write(self, output_filehandle):
miniyaml.render(output_filehandle, self.function(), self.depth)
build_dict = OrderedDict()
self.function(build_dict)
miniyaml.render(output_filehandle, build_dict, self.depth)
class Listgen(FunctionGen):
"""
Insert the content of a YAML list into config.yml
"""
def write(self, output_filehandle):
miniyaml.render(output_filehandle, self.function(), self.depth)
@ -66,6 +57,7 @@ def horizontal_rule():
class Header(object):
def __init__(self, title, summary=None):
self.title = title
self.summary_lines = summary or []
@ -78,104 +70,6 @@ class Header(object):
for line in filter(None, lines):
output_filehandle.write(line + "\n")
def filter_master_only_jobs(items):
    def _for_all_items(items, functor) -> None:
        if isinstance(items, list):
            for item in items:
                _for_all_items(item, functor)
        if isinstance(items, dict) and len(items) == 1:
            item_type, item = next(iter(items.items()))
            functor(item_type, item)

    def _is_master_item(item):
        filters = item.get('filters', None)
        branches = filters.get('branches', None) if filters is not None else None
        branches_only = branches.get('only', None) if branches is not None else None
        return 'master' in branches_only if branches_only is not None else False

    master_deps = set()

    def _save_requires_if_master(item_type, item):
        requires = item.get('requires', None)
        item_name = item.get("name", None)
        if not isinstance(requires, list):
            return
        if _is_master_item(item) or item_name in master_deps:
            master_deps.update([n.strip('"') for n in requires])

    def _do_filtering(items):
        if isinstance(items, list):
            rc = [_do_filtering(item) for item in items]
            return [item for item in rc if len(item if item is not None else []) > 0]
        assert isinstance(items, dict) and len(items) == 1
        item_type, item = next(iter(items.items()))
        item_name = item.get("name", None)
        item_name = item_name.strip('"') if item_name is not None else None
        if not _is_master_item(item) and item_name not in master_deps:
            return None
        if 'filters' in item:
            item = item.copy()
            item.pop('filters')
        return {item_type: item}

    # Scan the dependencies twice to pick up nested required jobs,
    # i.e. jobs required by the jobs that master-only jobs depend on
    _for_all_items(items, _save_requires_if_master)
    _for_all_items(items, _save_requires_if_master)
    return _do_filtering(items)
def gen_build_workflows_tree():
    build_workflows_functions = [
        cimodel.data.simple.docker_definitions.get_workflow_jobs,
        pytorch_build_definitions.get_workflow_jobs,
        cimodel.data.simple.macos_definitions.get_workflow_jobs,
        cimodel.data.simple.android_definitions.get_workflow_jobs,
        cimodel.data.simple.ios_definitions.get_workflow_jobs,
        cimodel.data.simple.mobile_definitions.get_workflow_jobs,
        cimodel.data.simple.binary_smoketest.get_workflow_jobs,
        cimodel.data.simple.nightly_ios.get_workflow_jobs,
        cimodel.data.simple.nightly_android.get_workflow_jobs,
        cimodel.data.simple.anaconda_prune_defintions.get_workflow_jobs,
        windows_build_definitions.get_windows_workflows,
        binary_build_definitions.get_post_upload_jobs,
        binary_build_definitions.get_binary_smoke_test_jobs,
    ]
    build_jobs = [f() for f in build_workflows_functions]
    master_build_jobs = filter_master_only_jobs(build_jobs)
    binary_build_functions = [
        binary_build_definitions.get_binary_build_jobs,
        binary_build_definitions.get_nightly_tests,
        binary_build_definitions.get_nightly_uploads,
    ]
    slow_gradcheck_jobs = [
        pytorch_build_definitions.get_workflow_jobs,
        cimodel.data.simple.docker_definitions.get_workflow_jobs,
    ]
    return {
        "workflows": {
            "binary_builds": {
                "when": r"<< pipeline.parameters.run_binary_tests >>",
                "jobs": [f() for f in binary_build_functions],
            },
            "build": {
                "when": r"<< pipeline.parameters.run_build >>",
                "jobs": build_jobs,
            },
            "master_build": {
                "when": r"<< pipeline.parameters.run_master_build >>",
                "jobs": master_build_jobs,
            },
            "slow_gradcheck_build": {
                "when": r"<< pipeline.parameters.run_slow_gradcheck_build >>",
                "jobs": [f(only_slow_gradcheck=True) for f in slow_gradcheck_jobs],
            },
        }
    }
# Order of this list matters to the generated config.yml.
YAML_SOURCES = [
@ -183,22 +77,42 @@ YAML_SOURCES = [
File("commands.yml"),
File("nightly-binary-build-defaults.yml"),
Header("Build parameters"),
File("build-parameters/pytorch-build-params.yml"),
File("build-parameters/binary-build-params.yml"),
File("build-parameters/promote-build-params.yml"),
File("pytorch-build-params.yml"),
File("caffe2-build-params.yml"),
File("binary-build-params.yml"),
Header("Job specs"),
File("job-specs/pytorch-job-specs.yml"),
File("job-specs/binary-job-specs.yml"),
File("job-specs/job-specs-custom.yml"),
File("job-specs/job-specs-promote.yml"),
File("job-specs/binary_update_htmls.yml"),
File("job-specs/binary-build-tests.yml"),
File("job-specs/docker_jobs.yml"),
Header("Workflows"),
Treegen(gen_build_workflows_tree, 0),
File("workflows/workflows-scheduled-ci.yml"),
File("workflows/workflows-ecr-gc.yml"),
File("workflows/workflows-promote.yml"),
File("pytorch-job-specs.yml"),
File("caffe2-job-specs.yml"),
File("binary-job-specs.yml"),
File("job-specs-setup.yml"),
File("job-specs-custom.yml"),
File("binary_update_htmls.yml"),
File("binary-build-tests.yml"),
File("docker_jobs.yml"),
File("workflows.yml"),
File("workflows-setup-job.yml"),
File("windows-build-test.yml"),
Listgen(pytorch_build_definitions.get_workflow_jobs, 3),
File("workflows-pytorch-macos-builds.yml"),
File("workflows-pytorch-android-gradle-build.yml"),
File("workflows-pytorch-ios-builds.yml"),
File("workflows-pytorch-mobile-builds.yml"),
File("workflows-pytorch-ge-config-tests.yml"),
Listgen(caffe2_build_definitions.get_workflow_jobs, 3),
File("workflows-binary-builds-smoke-subset.yml"),
Listgen(binary_build_definitions.get_binary_smoke_test_jobs, 3),
Listgen(binary_build_definitions.get_binary_build_jobs, 3),
File("workflows-nightly-ios-binary-builds.yml"),
File("workflows-nightly-android-binary-builds.yml"),
Header("Nightly tests"),
Listgen(binary_build_definitions.get_nightly_tests, 3),
File("workflows-nightly-uploads-header.yml"),
Listgen(binary_build_definitions.get_nightly_uploads, 3),
File("workflows-s3-html.yml"),
File("workflows-docker-builder.yml"),
File("workflows-ecr-gc.yml"),
]
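
The double scan in `filter_master_only_jobs` is easier to follow on a small input. The sketch below is a simplified restatement, not the repository code: it assumes a flat list of single-key job dicts (the original also walks nested lists), and `is_master_only`, `master_dependencies` and `EXAMPLE` are hypothetical names introduced for illustration.

```python
def is_master_only(job):
    """True when the job's branch filter names master."""
    filters = job.get("filters") or {}
    branches = filters.get("branches") or {}
    only = branches.get("only")
    return only is not None and "master" in only


def master_dependencies(entries, passes=2):
    """Collect names required, directly or indirectly, by master-only jobs.

    `entries` is a flat list of single-key dicts such as
    {"build_job": {"name": ..., "requires": [...], "filters": {...}}}.
    """
    deps = set()
    for _ in range(passes):
        for entry in entries:
            job = next(iter(entry.values()))
            requires = job.get("requires")
            if not isinstance(requires, list):
                continue
            if is_master_only(job) or job.get("name") in deps:
                deps.update(name.strip('"') for name in requires)
    return deps


# Hypothetical three-job chain: upload (master only) -> test -> build.
EXAMPLE = [
    {"build_job": {"name": "build"}},
    {"test_job": {"name": "test", "requires": ["build"]}},
    {
        "upload_job": {
            "name": "upload",
            "requires": ["test"],
            "filters": {"branches": {"only": ["master"]}},
        }
    },
]

if __name__ == "__main__":
    print(sorted(master_dependencies(EXAMPLE, passes=1)))
    print(sorted(master_dependencies(EXAMPLE, passes=2)))
```

With this ordering a single pass only finds `test`, because `test` is visited before `upload` marks it as required. The second pass then adds `build`, which is the nested case the comment in the hunk refers to.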

View File

@ -1,5 +0,0 @@
cd $PSScriptRoot;
$NewFile = New-TemporaryFile;
python generate_config_yml.py > $NewFile.name
(Get-Content $NewFile.name -Raw).TrimEnd().Replace("`r`n","`n") | Set-Content config.yml -Force
Remove-Item $NewFile.name

View File

@ -1,17 +1,8 @@
#!/bin/bash -e
#!/bin/bash -xe
# Allows this script to be invoked from any directory:
cd "$(dirname "$0")"
UNCOMMIT_CHANGE=$(git status -s | grep " config.yml" | wc -l | xargs)
if [[ $UNCOMMIT_CHANGE != 0 ]]; then
OLD_FILE=$(mktemp)
cp config.yml "$OLD_FILE"
echo "Uncommitted change detected in .circleci/config.yml"
echo "It has been backed up to $OLD_FILE"
fi
cd $(dirname "$0")
NEW_FILE=$(mktemp)
./generate_config_yml.py > "$NEW_FILE"
cp "$NEW_FILE" config.yml
echo "New config generated in .circleci/config.yml"
./generate_config_yml.py > $NEW_FILE
cp $NEW_FILE config.yml

View File

@ -33,11 +33,6 @@ else
export BUILDER_ROOT="$workdir/builder"
fi
# Try to extract PR number from branch if not already set
if [[ -z "${CIRCLE_PR_NUMBER:-}" ]]; then
CIRCLE_PR_NUMBER="$(echo ${CIRCLE_BRANCH} | sed -E -n 's/pull\/([0-9]*).*/\1/p')"
fi
# Clone the Pytorch branch
retry git clone https://github.com/pytorch/pytorch.git "$PYTORCH_ROOT"
pushd "$PYTORCH_ROOT"
@ -55,13 +50,13 @@ else
echo "Can't tell what to checkout"
exit 1
fi
retry git submodule update --init --recursive --jobs 0
retry git submodule update --init --recursive
echo "Using Pytorch from "
git --no-pager log --max-count 1
popd
# Clone the Builder master repo
retry git clone -q https://github.com/pytorch/builder.git -b release/1.10 "$BUILDER_ROOT"
retry git clone -q https://github.com/pytorch/builder.git "$BUILDER_ROOT"
pushd "$BUILDER_ROOT"
echo "Using builder from "
git --no-pager log --max-count 1

View File

@ -15,14 +15,13 @@ export PATH="~/anaconda/bin:${PATH}"
source ~/anaconda/bin/activate
# Install dependencies
conda install numpy ninja pyyaml mkl mkl-include setuptools cmake cffi requests typing_extensions --yes
conda install -c conda-forge valgrind --yes
conda install numpy ninja pyyaml mkl mkl-include setuptools cmake cffi typing requests --yes
export CMAKE_PREFIX_PATH=${CONDA_PREFIX:-"$(dirname $(which conda))/../"}
# sync submodules
cd ${PROJ_ROOT}
git submodule sync
git submodule update --init --recursive --jobs 0
git submodule update --init --recursive
# run build script
chmod a+x ${PROJ_ROOT}/scripts/build_ios.sh
@ -31,12 +30,8 @@ cat ${PROJ_ROOT}/scripts/build_ios.sh
echo "########################################################"
echo "IOS_ARCH: ${IOS_ARCH}"
echo "IOS_PLATFORM: ${IOS_PLATFORM}"
echo "USE_PYTORCH_METAL: ${USE_PYTORCH_METAL}"
echo "USE_COREML_DELEGATE: ${USE_COREML_DELEGATE}"
export IOS_ARCH=${IOS_ARCH}
export IOS_PLATFORM=${IOS_PLATFORM}
export USE_PYTORCH_METAL=${USE_PYTORCH_METAL}
export USE_COREML_DELEGATE=${USE_COREML_DELEGATE}
unbuffer ${PROJ_ROOT}/scripts/build_ios.sh 2>&1 | ts
#store the binary

View File

@ -8,23 +8,22 @@ cd ${PROJ_ROOT}/ios/TestApp
# install fastlane
sudo gem install bundler && bundle install
# install certificates
echo "${IOS_CERT_KEY_2022}" >> cert.txt
echo "${IOS_CERT_KEY}" >> cert.txt
base64 --decode cert.txt -o Certificates.p12
rm cert.txt
bundle exec fastlane install_root_cert
bundle exec fastlane install_dev_cert
bundle exec fastlane install_cert
# install the provisioning profile
PROFILE=PyTorch_CI_2022.mobileprovision
PROFILE=TestApp_CI.mobileprovision
PROVISIONING_PROFILES=~/Library/MobileDevice/Provisioning\ Profiles
mkdir -pv "${PROVISIONING_PROFILES}"
cd "${PROVISIONING_PROFILES}"
echo "${IOS_SIGN_KEY_2022}" >> cert.txt
echo "${IOS_SIGN_KEY}" >> cert.txt
base64 --decode cert.txt -o ${PROFILE}
rm cert.txt
# run the ruby build script
if ! [ -x "$(command -v xcodebuild)" ]; then
echo 'Error: xcodebuild is not installed.'
exit 1
fi
PROFILE=PyTorch_CI_2022
ruby ${PROJ_ROOT}/scripts/xcode_build.rb -i ${PROJ_ROOT}/build_ios/install -x ${PROJ_ROOT}/ios/TestApp/TestApp.xcodeproj -p ${IOS_PLATFORM} -c ${PROFILE} -t ${IOS_DEV_TEAM_ID} -f Accelerate,MetalPerformanceShaders,CoreML
fi
PROFILE=TestApp_CI
ruby ${PROJ_ROOT}/scripts/xcode_build.rb -i ${PROJ_ROOT}/build_ios/install -x ${PROJ_ROOT}/ios/TestApp/TestApp.xcodeproj -p ${IOS_PLATFORM} -c ${PROFILE} -t ${IOS_DEV_TEAM_ID}

View File

@ -14,7 +14,7 @@ mkdir -p ${ZIP_DIR}/src
cp -R ${ARTIFACTS_DIR}/arm64/include ${ZIP_DIR}/install/
# build a FAT binary
cd ${ZIP_DIR}/install/lib
target_libs=(libc10.a libclog.a libcpuinfo.a libeigen_blas.a libpthreadpool.a libpytorch_qnnpack.a libtorch_cpu.a libtorch.a libXNNPACK.a)
target_libs=(libc10.a libclog.a libcpuinfo.a libeigen_blas.a libpytorch_qnnpack.a libtorch_cpu.a libtorch.a libXNNPACK.a)
for lib in ${target_libs[*]}
do
if [ -f "${ARTIFACTS_DIR}/x86_64/lib/${lib}" ] && [ -f "${ARTIFACTS_DIR}/arm64/lib/${lib}" ]; then
@ -22,28 +22,21 @@ do
lipo -create "${libs[@]}" -o ${ZIP_DIR}/install/lib/${lib}
fi
done
# for nnpack, we only support arm64 build
cp ${ARTIFACTS_DIR}/arm64/lib/libnnpack.a ./
lipo -i ${ZIP_DIR}/install/lib/*.a
# copy the umbrella header and license
cp ${PROJ_ROOT}/ios/LibTorch-Lite.h ${ZIP_DIR}/src/
cp ${PROJ_ROOT}/ios/LibTorch.h ${ZIP_DIR}/src/
cp ${PROJ_ROOT}/LICENSE ${ZIP_DIR}/
# zip the library
export DATE="$(date -u +%Y%m%d)"
export IOS_NIGHTLY_BUILD_VERSION="1.10.0.${DATE}"
# libtorch_lite_ios_nightly_1.10.0.20210810.zip
ZIPFILE="libtorch_lite_ios_nightly_${IOS_NIGHTLY_BUILD_VERSION}.zip"
ZIPFILE=libtorch_ios_nightly_build.zip
cd ${ZIP_DIR}
#for testing
touch version.txt
echo "${IOS_NIGHTLY_BUILD_VERSION}" > version.txt
echo $(date +%s) > version.txt
zip -r ${ZIPFILE} install src version.txt LICENSE
# upload to aws
# Install conda then 'conda install' awscli
curl --retry 3 -o ~/conda.sh https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh
chmod +x ~/conda.sh
/bin/bash ~/conda.sh -b -p ~/anaconda
export PATH="~/anaconda/bin:${PATH}"
source ~/anaconda/bin/activate
conda install -c conda-forge awscli --yes
brew install awscli
set +x
export AWS_ACCESS_KEY_ID=${AWS_S3_ACCESS_KEY_FOR_PYTORCH_BINARY_UPLOAD}
export AWS_SECRET_ACCESS_KEY=${AWS_S3_ACCESS_SECRET_FOR_PYTORCH_BINARY_UPLOAD}
@ -51,14 +44,3 @@ set +x
# echo "AWS KEY: ${AWS_ACCESS_KEY_ID}"
# echo "AWS SECRET: ${AWS_SECRET_ACCESS_KEY}"
aws s3 cp ${ZIPFILE} s3://ossci-ios-build/ --acl public-read
# create a new LibTorch-Lite-Nightly.podspec from the template
echo "cp ${PROJ_ROOT}/ios/LibTorch-Lite-Nightly.podspec.template ${PROJ_ROOT}/ios/LibTorch-Lite-Nightly.podspec"
cp ${PROJ_ROOT}/ios/LibTorch-Lite-Nightly.podspec.template ${PROJ_ROOT}/ios/LibTorch-Lite-Nightly.podspec
# update pod version
sed -i '' -e "s/IOS_NIGHTLY_BUILD_VERSION/${IOS_NIGHTLY_BUILD_VERSION}/g" ${PROJ_ROOT}/ios/LibTorch-Lite-Nightly.podspec
cat ${PROJ_ROOT}/ios/LibTorch-Lite-Nightly.podspec
# push the new LibTorch-Lite-Nightly.podspec to CocoaPods
pod trunk push --verbose --allow-warnings --use-libraries --skip-import-validation ${PROJ_ROOT}/ios/LibTorch-Lite-Nightly.podspec

View File

@ -4,31 +4,27 @@ echo "RUNNING ON $(uname -a) WITH $(nproc) CPUS AND $(free -m)"
set -eux -o pipefail
source /env
# Because most Circle executors only have 20 CPUs, using more causes OOMs w/ Ninja and nvcc parallelization
MEMORY_LIMIT_MAX_JOBS=18
NUM_CPUS=$(( $(nproc) - 2 ))
# Defaults here for **binary** linux builds so they can be changed in one place
export MAX_JOBS=${MAX_JOBS:-$(( ${NUM_CPUS} > ${MEMORY_LIMIT_MAX_JOBS} ? ${MEMORY_LIMIT_MAX_JOBS} : ${NUM_CPUS} ))}
if [[ "${DESIRED_CUDA}" == "cu111" || "${DESIRED_CUDA}" == "cu113" ]]; then
export BUILD_SPLIT_CUDA="ON"
fi
# Defaults here so they can be changed in one place
export MAX_JOBS=12
# Parse the parameters
if [[ "$PACKAGE_TYPE" == 'conda' ]]; then
build_script='conda/build_pytorch.sh'
elif [[ "$DESIRED_CUDA" == cpu ]]; then
build_script='manywheel/build_cpu.sh'
elif [[ "$DESIRED_CUDA" == *"rocm"* ]]; then
build_script='manywheel/build_rocm.sh'
else
build_script='manywheel/build.sh'
fi
if [[ "$CIRCLE_BRANCH" == "master" ]] || [[ "$CIRCLE_BRANCH" == release/* ]]; then
export BUILD_DEBUG_INFO=1
# We want to call unbuffer, which calls tclsh which finds the expect
# package. The expect was installed by yum into /usr/bin so we want to
# find /usr/bin/tclsh, but this is shadowed by /opt/conda/bin/tclsh in
# the conda docker images, so we prepend it to the path here.
if [[ "$PACKAGE_TYPE" == 'conda' ]]; then
mkdir /just_tclsh_bin
ln -s /usr/bin/tclsh /just_tclsh_bin/tclsh
export PATH=/just_tclsh_bin:$PATH
fi
# Build the package
SKIP_ALL_TESTS=1 "/builder/$build_script"
SKIP_ALL_TESTS=1 unbuffer "/builder/$build_script" | ts
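
The `MAX_JOBS` default on one side of this hunk reduces to "CPU count minus two, capped at 18, unless the caller already set a value". A minimal Python sketch of that arithmetic follows; `max_jobs` and its parameters are hypothetical names, and the empty-string handling assumes the usual `${VAR:-default}` behaviour of treating an empty value as unset.

```python
def max_jobs(nproc, env_max_jobs=None, memory_limit_max_jobs=18):
    """Mimic MAX_JOBS=${MAX_JOBS:-min(nproc - 2, MEMORY_LIMIT_MAX_JOBS)}."""
    # ${VAR:-default} falls back when VAR is unset or empty.
    if env_max_jobs:
        return int(env_max_jobs)
    num_cpus = nproc - 2
    return min(num_cpus, memory_limit_max_jobs)


if __name__ == "__main__":
    for cpus in (8, 20, 64):
        print(cpus, max_jobs(cpus))
```

The other side of the hunk hard-codes `MAX_JOBS=12` instead, so this calculation only applies to the variant that defines `MEMORY_LIMIT_MAX_JOBS`.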

View File

@ -5,13 +5,14 @@ cat >/home/circleci/project/ci_test_script.sh <<EOL
# =================== The following code will be executed inside Docker container ===================
set -eux -o pipefail
python_nodot="\$(echo $DESIRED_PYTHON | tr -d m.u)"
# Set up Python
if [[ "$PACKAGE_TYPE" == conda ]]; then
retry conda create -qyn testenv python="$DESIRED_PYTHON"
source activate testenv >/dev/null
elif [[ "$DESIRED_PYTHON" == 2.7mu ]]; then
export PATH="/opt/python/cp27-cp27mu/bin:\$PATH"
elif [[ "$PACKAGE_TYPE" != libtorch ]]; then
python_nodot="\$(echo $DESIRED_PYTHON | tr -d m.u)"
python_path="/opt/python/cp\$python_nodot-cp\${python_nodot}"
# Prior to Python 3.8 paths were suffixed with an 'm'
if [[ -d "\${python_path}/bin" ]]; then
@ -21,23 +22,6 @@ elif [[ "$PACKAGE_TYPE" != libtorch ]]; then
fi
fi
EXTRA_CONDA_FLAGS=""
NUMPY_PIN=""
if [[ "\$python_nodot" = *39* ]]; then
EXTRA_CONDA_FLAGS="-c=conda-forge"
# There's an issue with conda channel priority where it'll randomly pick 1.19 over 1.20
# we set a lower boundary here just to be safe
NUMPY_PIN=">=1.20"
fi
if [[ "$DESIRED_CUDA" == "cu112" ]]; then
EXTRA_CONDA_FLAGS="-c=conda-forge"
fi
# Move debug wheels out of the package dir so they don't get installed
mkdir -p /tmp/debug_final_pkgs
mv /final_pkgs/debug-*.zip /tmp/debug_final_pkgs || echo "no debug packages to move"
# Install the package
# These network calls should not have 'retry's because they are installing
# locally and aren't actually network calls
@ -46,37 +30,23 @@ mv /final_pkgs/debug-*.zip /tmp/debug_final_pkgs || echo "no debug packages to m
# conda build scripts themselves. These should really be consolidated
pkg="/final_pkgs/\$(ls /final_pkgs)"
if [[ "$PACKAGE_TYPE" == conda ]]; then
(
# For some reason conda likes to re-activate the conda environment when attempting this install
# which means that a deactivate is run and some variables might not exist when that happens,
# namely CONDA_MKL_INTERFACE_LAYER_BACKUP from libblas so let's just ignore unbound variables when
# it comes to the conda installation commands
set +u
retry conda install \${EXTRA_CONDA_FLAGS} -yq \
"numpy\${NUMPY_PIN}" \
future \
mkl>=2018 \
ninja \
dataclasses \
typing-extensions \
defaults::protobuf \
six
if [[ "$DESIRED_CUDA" == 'cpu' ]]; then
retry conda install -c pytorch -y cpuonly
conda install -y "\$pkg" --offline
if [[ "$DESIRED_CUDA" == 'cpu' ]]; then
retry conda install -y cpuonly -c pytorch
fi
retry conda install -yq future numpy protobuf six
if [[ "$DESIRED_CUDA" != 'cpu' ]]; then
# DESIRED_CUDA is in format cu90 or cu102
if [[ "${#DESIRED_CUDA}" == 4 ]]; then
cu_ver="${DESIRED_CUDA:2:1}.${DESIRED_CUDA:3}"
else
# DESIRED_CUDA is in format cu90 or cu102
if [[ "${#DESIRED_CUDA}" == 4 ]]; then
cu_ver="${DESIRED_CUDA:2:1}.${DESIRED_CUDA:3}"
else
cu_ver="${DESIRED_CUDA:2:2}.${DESIRED_CUDA:4}"
fi
retry conda install \${EXTRA_CONDA_FLAGS} -yq -c nvidia -c pytorch "cudatoolkit=\${cu_ver}"
cu_ver="${DESIRED_CUDA:2:2}.${DESIRED_CUDA:4}"
fi
conda install \${EXTRA_CONDA_FLAGS} -y "\$pkg" --offline
)
retry conda install -yq -c pytorch "cudatoolkit=\${cu_ver}"
fi
elif [[ "$PACKAGE_TYPE" != libtorch ]]; then
pip install "\$pkg"
retry pip install -q future numpy protobuf typing-extensions six
retry pip install -q future numpy protobuf six
fi
if [[ "$PACKAGE_TYPE" == libtorch ]]; then
pkg="\$(ls /final_pkgs/*-latest.zip)"
@ -86,7 +56,6 @@ fi
# Test the package
/builder/check_binary.sh
# =================== The above code will be executed inside Docker container ===================
EOL
echo

View File

@ -0,0 +1,37 @@
#!/bin/bash
# Do NOT set -x
source /home/circleci/project/env
set -eu -o pipefail
set +x
declare -x "AWS_ACCESS_KEY_ID=${PYTORCH_BINARY_AWS_ACCESS_KEY_ID}"
declare -x "AWS_SECRET_ACCESS_KEY=${PYTORCH_BINARY_AWS_SECRET_ACCESS_KEY}"
#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!
# DO NOT TURN -x ON BEFORE THIS LINE
#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!
set -eux -o pipefail
export PATH="$MINICONDA_ROOT/bin:$PATH"
# This gets set in binary_populate_env.sh, but let's have a sane default just in case
PIP_UPLOAD_FOLDER=${PIP_UPLOAD_FOLDER:-nightly}
# TODO: Combine CONDA_UPLOAD_CHANNEL and PIP_UPLOAD_FOLDER into one variable
# The only difference is the trailing slash
# Strip trailing slashes if there
CONDA_UPLOAD_CHANNEL=$(echo "${PIP_UPLOAD_FOLDER}" | sed 's:/*$::')
# Upload the package to the final location
pushd /home/circleci/project/final_pkgs
if [[ "$PACKAGE_TYPE" == conda ]]; then
retry conda install -yq anaconda-client
anaconda -t "${CONDA_PYTORCHBOT_TOKEN}" upload "$(ls)" -u "pytorch-${CONDA_UPLOAD_CHANNEL}" --label main --no-progress --force
elif [[ "$PACKAGE_TYPE" == libtorch ]]; then
retry pip install -q awscli
s3_dir="s3://pytorch/libtorch/${PIP_UPLOAD_FOLDER}${DESIRED_CUDA}/"
for pkg in $(ls); do
retry aws s3 cp "$pkg" "$s3_dir" --acl public-read
done
else
retry pip install -q awscli
s3_dir="s3://pytorch/whl/${PIP_UPLOAD_FOLDER}${DESIRED_CUDA}/"
retry aws s3 cp "$(ls)" "$s3_dir" --acl public-read
fi
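
The upload destinations in this script are built by plain string concatenation, so the trailing slash on `PIP_UPLOAD_FOLDER` matters. The sketch below mirrors the `sed 's:/*$::'` call and the two `s3_dir` assignments; `conda_channel` and `s3_dir` are hypothetical helper names, and nothing here contacts S3 or Anaconda.

```python
def conda_channel(pip_upload_folder):
    """Mimic sed 's:/*$::' by dropping all trailing slashes."""
    return pip_upload_folder.rstrip("/")


def s3_dir(package_type, pip_upload_folder, desired_cuda):
    """Build the destination prefix the way the script concatenates it."""
    kind = "libtorch" if package_type == "libtorch" else "whl"
    return f"s3://pytorch/{kind}/{pip_upload_folder}{desired_cuda}/"


if __name__ == "__main__":
    print("pytorch-" + conda_channel("nightly/"))
    print(s3_dir("manywheel", "nightly/", "cu101"))
    print(s3_dir("manywheel", "nightly", "cpu"))
```

With `nightly/` the wheel prefix is `s3://pytorch/whl/nightly/cu101/`. With the in-script fallback `nightly`, which has no trailing slash, the same concatenation gives `s3://pytorch/whl/nightlycpu/`, so the fallback appears to rely on `binary_populate_env.sh` having set the slash-terminated value.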

View File

@ -14,10 +14,6 @@ chmod +x "$build_script"
# Build
cat >"$build_script" <<EOL
export PATH="$workdir/miniconda/bin:$PATH"
if [[ "$CIRCLE_BRANCH" == "nightly" ]]; then
export USE_PYTORCH_METAL_EXPORT=1
export USE_COREML_DELEGATE=1
fi
if [[ "$PACKAGE_TYPE" == conda ]]; then
"$workdir/builder/conda/build_pytorch.sh"
else

View File

@ -20,9 +20,9 @@ if [[ "$PACKAGE_TYPE" == libtorch ]]; then
unzip "$pkg" -d /tmp
cd /tmp/libtorch
elif [[ "$PACKAGE_TYPE" == conda ]]; then
conda install -y "$pkg"
conda install -y "$pkg" --offline
else
pip install "$pkg" -v
pip install "$pkg" --no-index --no-dependencies -v
fi
# Test

View File

@ -0,0 +1,37 @@
#!/bin/bash
# Do NOT set -x
set -eu -o pipefail
set +x
export AWS_ACCESS_KEY_ID="${PYTORCH_BINARY_AWS_ACCESS_KEY_ID}"
export AWS_SECRET_ACCESS_KEY="${PYTORCH_BINARY_AWS_SECRET_ACCESS_KEY}"
#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!
# DO NOT TURN -x ON BEFORE THIS LINE
#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!
set -eux -o pipefail
source "/Users/distiller/project/env"
export "PATH=$workdir/miniconda/bin:$PATH"
# This gets set in binary_populate_env.sh, but let's have a sane default just in case
PIP_UPLOAD_FOLDER=${PIP_UPLOAD_FOLDER:-nightly}
# TODO: Combine CONDA_UPLOAD_CHANNEL and PIP_UPLOAD_FOLDER into one variable
# The only difference is the trailing slash
# Strip trailing slashes if there
CONDA_UPLOAD_CHANNEL=$(echo "${PIP_UPLOAD_FOLDER}" | sed 's:/*$::')
pushd "$workdir/final_pkgs"
if [[ "$PACKAGE_TYPE" == conda ]]; then
retry conda install -yq anaconda-client
retry anaconda -t "${CONDA_PYTORCHBOT_TOKEN}" upload "$(ls)" -u "pytorch-${CONDA_UPLOAD_CHANNEL}" --label main --no-progress --force
elif [[ "$PACKAGE_TYPE" == libtorch ]]; then
retry pip install -q awscli
s3_dir="s3://pytorch/libtorch/${PIP_UPLOAD_FOLDER}${DESIRED_CUDA}/"
for pkg in $(ls); do
retry aws s3 cp "$pkg" "$s3_dir" --acl public-read
done
else
retry pip install -q awscli
s3_dir="s3://pytorch/whl/${PIP_UPLOAD_FOLDER}${DESIRED_CUDA}/"
retry aws s3 cp "$(ls)" "$s3_dir" --acl public-read
fi

View File

@ -62,30 +62,18 @@ if [[ -z "$DOCKER_IMAGE" ]]; then
if [[ "$PACKAGE_TYPE" == conda ]]; then
export DOCKER_IMAGE="pytorch/conda-cuda"
elif [[ "$DESIRED_CUDA" == cpu ]]; then
export DOCKER_IMAGE="pytorch/manylinux-cpu"
export DOCKER_IMAGE="pytorch/manylinux-cuda100"
else
export DOCKER_IMAGE="pytorch/manylinux-cuda${DESIRED_CUDA:2}"
fi
fi
USE_GOLD_LINKER="OFF"
# GOLD linker can not be used if CUPTI is statically linked into PyTorch, see https://github.com/pytorch/pytorch/issues/57744
if [[ ${DESIRED_CUDA} == "cpu" ]]; then
USE_GOLD_LINKER="ON"
fi
USE_WHOLE_CUDNN="OFF"
# Link whole cuDNN for CUDA-11.1 to include fp16 fast kernels
if [[ "$(uname)" == "Linux" && "${DESIRED_CUDA}" == "cu111" ]]; then
USE_WHOLE_CUDNN="ON"
fi
# Default to nightly, since that's where this normally uploads to
PIP_UPLOAD_FOLDER='nightly/'
# We put this here so that OVERRIDE_PACKAGE_VERSION below can read from it
export DATE="$(date -u +%Y%m%d)"
#TODO: We should be pulling semver version from the base version.txt
BASE_BUILD_VERSION="1.10.0.dev$DATE"
BASE_BUILD_VERSION="1.5.0.dev$DATE"
# Change BASE_BUILD_VERSION to git tag when on a git tag
# Use 'git -C' to make doubly sure we're in the correct directory for checking
# the git tag
@ -93,11 +81,11 @@ if tagged_version >/dev/null; then
# Switch upload folder to 'test/' if we are on a tag
PIP_UPLOAD_FOLDER='test/'
# Grab git tag, remove prefixed v and remove everything after -
# Used to clean up tags that are for release candidates like v1.6.0-rc1
# Turns tag v1.6.0-rc1 -> 1.6.0
# Used to clean up tags that are for release candidates like v1.5.0-rc1
# Turns tag v1.5.0-rc1 -> 1.5.0
BASE_BUILD_VERSION="$(tagged_version | sed -e 's/^v//' -e 's/-.*$//')"
fi
if [[ "$(uname)" == 'Darwin' ]] || [[ "$PACKAGE_TYPE" == conda ]]; then
if [[ "$(uname)" == 'Darwin' ]] || [[ "$DESIRED_CUDA" == "cu102" ]] || [[ "$PACKAGE_TYPE" == conda ]]; then
export PYTORCH_BUILD_VERSION="${BASE_BUILD_VERSION}"
else
export PYTORCH_BUILD_VERSION="${BASE_BUILD_VERSION}+$DESIRED_CUDA"
@ -112,14 +100,8 @@ if [[ "$PACKAGE_TYPE" == libtorch ]]; then
POSSIBLE_JAVA_HOMES+=(/usr/local)
POSSIBLE_JAVA_HOMES+=(/usr/lib/jvm/java-8-openjdk-amd64)
POSSIBLE_JAVA_HOMES+=(/Library/Java/JavaVirtualMachines/*.jdk/Contents/Home)
# Add the Windows-specific JNI path
POSSIBLE_JAVA_HOMES+=("$PWD/.circleci/windows-jni/")
for JH in "${POSSIBLE_JAVA_HOMES[@]}" ; do
if [[ -e "$JH/include/jni.h" ]] ; then
# Skip if we're not on Windows but haven't found a JAVA_HOME
if [[ "$JH" == "$PWD/.circleci/windows-jni/" && "$OSTYPE" != "msys" ]] ; then
break
fi
echo "Found jni.h under $JH"
JAVA_HOME="$JH"
BUILD_JNI=ON
@ -148,7 +130,7 @@ if [[ "${BUILD_FOR_SYSTEM:-}" == "windows" ]]; then
fi
export DATE="$DATE"
export NIGHTLIES_DATE_PREAMBLE=1.10.0.dev
export NIGHTLIES_DATE_PREAMBLE=1.5.0.dev
export PYTORCH_BUILD_VERSION="$PYTORCH_BUILD_VERSION"
export PYTORCH_BUILD_NUMBER="$PYTORCH_BUILD_NUMBER"
export OVERRIDE_PACKAGE_VERSION="$PYTORCH_BUILD_VERSION"
@ -179,11 +161,6 @@ export CIRCLE_TAG="${CIRCLE_TAG:-}"
export CIRCLE_SHA1="$CIRCLE_SHA1"
export CIRCLE_PR_NUMBER="${CIRCLE_PR_NUMBER:-}"
export CIRCLE_BRANCH="$CIRCLE_BRANCH"
export CIRCLE_WORKFLOW_ID="$CIRCLE_WORKFLOW_ID"
export USE_GOLD_LINKER="${USE_GOLD_LINKER}"
export USE_GLOO_WITH_OPENSSL="ON"
export USE_WHOLE_CUDNN="${USE_WHOLE_CUDNN}"
# =================== The above code will be executed inside Docker container ===================
EOL
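
The version string assembled in this script comes from two steps visible in the hunks: stripping a release-candidate tag down to its base version, and appending `+$DESIRED_CUDA` for builds that are neither macOS nor conda. The Python sketch below restates those two steps only; the function names are hypothetical, and lines of the script that fall outside the displayed hunks may adjust the result further.

```python
import re


def base_version_from_tag(tag):
    """Mimic sed -e 's/^v//' -e 's/-.*$//' on a git tag."""
    without_v = re.sub(r"^v", "", tag)
    return re.sub(r"-.*$", "", without_v)


def build_version(base, desired_cuda, uname="Linux", package_type="manywheel"):
    """Append the +<cuda> local version unless on Darwin or building conda."""
    if uname == "Darwin" or package_type == "conda":
        return base
    return f"{base}+{desired_cuda}"


if __name__ == "__main__":
    base = base_version_from_tag("v1.5.0-rc1")
    print(base, build_version(base, "cu101"))
```

One side of the diff also skips the suffix when `DESIRED_CUDA` is `cu102`; that extra condition is left out of the sketch.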

View File

@ -19,7 +19,7 @@ chmod +x /home/circleci/project/ci_test_script.sh
VOLUME_MOUNTS="-v /home/circleci/project/:/circleci_stuff -v /home/circleci/project/final_pkgs:/final_pkgs -v ${PYTORCH_ROOT}:/pytorch -v ${BUILDER_ROOT}:/builder"
# Run the docker
if [ -n "${USE_CUDA_DOCKER_RUNTIME:-}" ]; then
export id=$(docker run --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --gpus all ${VOLUME_MOUNTS} -t -d "${DOCKER_IMAGE}")
export id=$(docker run --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --runtime=nvidia ${VOLUME_MOUNTS} -t -d "${DOCKER_IMAGE}")
else
export id=$(docker run --cap-add=SYS_PTRACE --security-opt seccomp=unconfined ${VOLUME_MOUNTS} -t -d "${DOCKER_IMAGE}")
fi


@ -1,98 +0,0 @@
#!/usr/bin/env bash
set -euo pipefail
PACKAGE_TYPE=${PACKAGE_TYPE:-conda}
PKG_DIR=${PKG_DIR:-/tmp/workspace/final_pkgs}
# Designates whether to submit as a release candidate or a nightly build
# Value should be `test` when uploading release candidates
# currently set within `designate_upload_channel`
UPLOAD_CHANNEL=${UPLOAD_CHANNEL:-nightly}
# Designates what subfolder to put packages into
UPLOAD_SUBFOLDER=${UPLOAD_SUBFOLDER:-cpu}
UPLOAD_BUCKET="s3://pytorch"
BACKUP_BUCKET="s3://pytorch-backup"
DRY_RUN=${DRY_RUN:-enabled}
# Don't actually do work unless explicit
ANACONDA="true anaconda"
AWS_S3_CP="aws s3 cp --dryrun"
if [[ "${DRY_RUN}" = "disabled" ]]; then
ANACONDA="anaconda"
AWS_S3_CP="aws s3 cp"
fi
do_backup() {
local backup_dir
backup_dir=$1
(
pushd /tmp/workspace
set -x
${AWS_S3_CP} --recursive . "${BACKUP_BUCKET}/${CIRCLE_TAG}/${backup_dir}/"
)
}
conda_upload() {
(
set -x
${ANACONDA} \
upload \
${PKG_DIR}/*.tar.bz2 \
-u "pytorch-${UPLOAD_CHANNEL}" \
--label main \
--no-progress \
--force
)
}
s3_upload() {
local extension
local pkg_type
extension="$1"
pkg_type="$2"
s3_dir="${UPLOAD_BUCKET}/${pkg_type}/${UPLOAD_CHANNEL}/${UPLOAD_SUBFOLDER}/"
(
for pkg in ${PKG_DIR}/*.${extension}; do
(
set -x
${AWS_S3_CP} --no-progress --acl public-read "${pkg}" "${s3_dir}"
)
done
)
}
case "${PACKAGE_TYPE}" in
conda)
conda_upload
# Fetch platform (e.g. win-64, linux-64, etc.) from index file
# Because there's no actual conda command to read this
subdir=$(\
tar -xOf ${PKG_DIR}/*.bz2 info/index.json \
| grep subdir \
| cut -d ':' -f2 \
| sed -e 's/[[:space:]]//' -e 's/"//g' -e 's/,//' \
)
BACKUP_DIR="conda/${subdir}"
;;
libtorch)
s3_upload "zip" "libtorch"
BACKUP_DIR="libtorch/${UPLOAD_CHANNEL}/${UPLOAD_SUBFOLDER}"
;;
# wheel can refer to either wheel or manywheel
*wheel)
s3_upload "whl" "whl"
BACKUP_DIR="whl/${UPLOAD_CHANNEL}/${UPLOAD_SUBFOLDER}"
;;
*)
echo "ERROR: unknown package type: ${PACKAGE_TYPE}"
exit 1
;;
esac
# CIRCLE_TAG is defined by upstream circleci,
# this can be changed to recognize tagged versions
if [[ -n "${CIRCLE_TAG:-}" ]]; then
do_backup "${BACKUP_DIR}"
fi
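The dry-run guard in the removed upload script relies on `true` ignoring its arguments and exiting 0, so `true anaconda ...` never reaches the real tool. A minimal sketch of that pattern in isolation, assuming a hypothetical `UPLOADER` variable and `run_upload` function (neither is part of the original script), with `echo` standing in for the real upload command:

```shell
#!/usr/bin/env bash
# Sketch of the dry-run guard: prefix the real command with `true`
# so it becomes a no-op unless DRY_RUN is explicitly disabled.
set -euo pipefail

DRY_RUN=${DRY_RUN:-enabled}

# Hypothetical stand-in for a command with side effects.
UPLOADER="true echo"
if [[ "${DRY_RUN}" = "disabled" ]]; then
    UPLOADER="echo"
fi

run_upload() {
    # Intentionally unquoted so that "true echo" splits into two words.
    ${UPLOADER} "uploading $1"
}

# Prints nothing by default; prints the message when DRY_RUN=disabled.
run_upload "pkg.tar.bz2"
```

Defaulting to the no-op means a misconfigured job fails safe: nothing is uploaded unless the caller opts in.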


@ -5,65 +5,18 @@ source "/c/w/env"
mkdir -p "$PYTORCH_FINAL_PACKAGE_DIR"
export CUDA_VERSION="${DESIRED_CUDA/cu/}"
export VC_YEAR=2017
export USE_SCCACHE=1
export SCCACHE_BUCKET=ossci-compiler-cache-windows
export NIGHTLIES_PYTORCH_ROOT="$PYTORCH_ROOT"
export VC_YEAR=2019
if [[ "${DESIRED_CUDA}" == "cu111" || "${DESIRED_CUDA}" == "cu113" ]]; then
export BUILD_SPLIT_CUDA="ON"
fi
echo "Free Space for CUDA DEBUG BUILD"
if [[ "$CIRCLECI" == 'true' ]]; then
if [[ -d "C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\Community" ]]; then
rm -rf "C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\Community"
fi
if [[ -d "C:\\Program Files (x86)\\Microsoft Visual Studio 14.0" ]]; then
rm -rf "C:\\Program Files (x86)\\Microsoft Visual Studio 14.0"
fi
if [[ -d "C:\\Program Files (x86)\\Microsoft.NET" ]]; then
rm -rf "C:\\Program Files (x86)\\Microsoft.NET"
fi
if [[ -d "C:\\Program Files\\dotnet" ]]; then
rm -rf "C:\\Program Files\\dotnet"
fi
if [[ -d "C:\\Program Files (x86)\\dotnet" ]]; then
rm -rf "C:\\Program Files (x86)\\dotnet"
fi
if [[ -d "C:\\Program Files (x86)\\Microsoft SQL Server" ]]; then
rm -rf "C:\\Program Files (x86)\\Microsoft SQL Server"
fi
if [[ -d "C:\\Program Files (x86)\\Xamarin" ]]; then
rm -rf "C:\\Program Files (x86)\\Xamarin"
fi
if [[ -d "C:\\Program Files (x86)\\Google" ]]; then
rm -rf "C:\\Program Files (x86)\\Google"
fi
fi
set +x
export AWS_ACCESS_KEY_ID=${CIRCLECI_AWS_ACCESS_KEY_FOR_SCCACHE_S3_BUCKET_V4:-}
export AWS_SECRET_ACCESS_KEY=${CIRCLECI_AWS_SECRET_KEY_FOR_SCCACHE_S3_BUCKET_V4:-}
set -x
if [[ "$CIRCLECI" == 'true' && -d "C:\\ProgramData\\Microsoft\\VisualStudio\\Packages\\_Instances" ]]; then
mv "C:\\ProgramData\\Microsoft\\VisualStudio\\Packages\\_Instances" .
rm -rf "C:\\ProgramData\\Microsoft\\VisualStudio\\Packages"
mkdir -p "C:\\ProgramData\\Microsoft\\VisualStudio\\Packages"
mv _Instances "C:\\ProgramData\\Microsoft\\VisualStudio\\Packages"
fi
if [[ "$CIRCLECI" == 'true' && -d "C:\\Microsoft" ]]; then
# don't use quotes here
rm -rf /c/Microsoft/AndroidNDK*
if [[ "$CIRCLECI" == 'true' && -d "C:\\Program Files (x86)\\Microsoft Visual Studio\\2019" ]]; then
rm -rf "C:\\Program Files (x86)\\Microsoft Visual Studio\\2019"
fi
echo "Free space on filesystem before build:"


@ -1,13 +0,0 @@
#!/bin/bash
set -eux -o pipefail
source "/c/w/env"
export CUDA_VERSION="${DESIRED_CUDA/cu/}"
export VC_YEAR=2019
pushd "$BUILDER_ROOT"
./windows/internal/smoke_test.bat
popd


@ -0,0 +1,37 @@
#!/bin/bash
set -eu -o pipefail
set +x
declare -x "AWS_ACCESS_KEY_ID=${PYTORCH_BINARY_AWS_ACCESS_KEY_ID}"
declare -x "AWS_SECRET_ACCESS_KEY=${PYTORCH_BINARY_AWS_SECRET_ACCESS_KEY}"
#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!
# DO NOT TURN -x ON BEFORE THIS LINE
#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!#!
set -eux -o pipefail
source "/env"
# This gets set in binary_populate_env.sh, but let's have a sane default just in case
PIP_UPLOAD_FOLDER=${PIP_UPLOAD_FOLDER:-nightly/}
# TODO: Combine CONDA_UPLOAD_CHANNEL and PIP_UPLOAD_FOLDER into one variable
# The only difference is the trailing slash
# Strip trailing slashes, if any
CONDA_UPLOAD_CHANNEL=$(echo "${PIP_UPLOAD_FOLDER}" | sed 's:/*$::')
pushd /root/workspace/final_pkgs
# Upload the package to the final location
if [[ "$PACKAGE_TYPE" == conda ]]; then
retry conda install -yq anaconda-client
anaconda -t "${CONDA_PYTORCHBOT_TOKEN}" upload "$(ls)" -u "pytorch-${CONDA_UPLOAD_CHANNEL}" --label main --no-progress --force
elif [[ "$PACKAGE_TYPE" == libtorch ]]; then
retry conda install -c conda-forge -yq awscli
s3_dir="s3://pytorch/libtorch/${PIP_UPLOAD_FOLDER}${DESIRED_CUDA}/"
for pkg in $(ls); do
retry aws s3 cp "$pkg" "$s3_dir" --acl public-read
done
else
retry conda install -c conda-forge -yq awscli
s3_dir="s3://pytorch/whl/${PIP_UPLOAD_FOLDER}${DESIRED_CUDA}/"
retry aws s3 cp "$(ls)" "$s3_dir" --acl public-read
fi
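The `sed 's:/*$::'` expression above derives the conda channel from the pip folder by removing any run of trailing slashes. A small sketch of that behavior, wrapped in a hypothetical `strip_trailing_slashes` helper that does not exist in the script:

```shell
#!/usr/bin/env bash
# Hypothetical helper: drop every trailing slash from a folder name.
# Uses ':' as the sed delimiter so the slash needs no escaping.
strip_trailing_slashes() {
    echo "$1" | sed 's:/*$::'
}

strip_trailing_slashes "nightly/"   # one trailing slash
strip_trailing_slashes "test///"    # several trailing slashes
strip_trailing_slashes "nightly"    # nothing to strip
```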


@ -1,44 +1,15 @@
#!/usr/bin/env bash
set -eux -o pipefail
env
echo "BUILD_ENVIRONMENT:$BUILD_ENVIRONMENT"
export ANDROID_NDK_HOME=/opt/ndk
export ANDROID_NDK=/opt/ndk
export ANDROID_HOME=/opt/android/sdk
# Must be in sync with GRADLE_VERSION in docker image for android
# https://github.com/pietern/pytorch-dockerfiles/blob/master/build.sh#L155
export GRADLE_VERSION=6.8.3
export GRADLE_VERSION=4.10.3
export GRADLE_HOME=/opt/gradle/gradle-$GRADLE_VERSION
export GRADLE_PATH=$GRADLE_HOME/bin/gradle
# touch gradle cache files to prevent expiration
while IFS= read -r -d '' file
do
touch "$file" || true
done < <(find /var/lib/jenkins/.gradle -type f -print0)
export GRADLE_LOCAL_PROPERTIES=~/workspace/android/local.properties
rm -f $GRADLE_LOCAL_PROPERTIES
echo "sdk.dir=/opt/android/sdk" >> $GRADLE_LOCAL_PROPERTIES
echo "ndk.dir=/opt/ndk" >> $GRADLE_LOCAL_PROPERTIES
echo "cmake.dir=/usr/local" >> $GRADLE_LOCAL_PROPERTIES
retry () {
$* || (sleep 1 && $*) || (sleep 2 && $*) || (sleep 4 && $*) || (sleep 8 && $*)
}
# Run custom build script
if [[ "${BUILD_ENVIRONMENT}" == *-gradle-custom-build* ]]; then
# Install torch & torchvision - used to download & dump used ops from test model.
retry pip install torch torchvision --progress-bar off
exec "$(dirname "${BASH_SOURCE[0]}")/../../android/build_test_app_custom.sh" armeabi-v7a
fi
# Run default build
BUILD_ANDROID_INCLUDE_DIR_x86=~/workspace/build_android/install/include
BUILD_ANDROID_LIB_DIR_x86=~/workspace/build_android/install/lib
@ -73,6 +44,9 @@ ln -s ${BUILD_ANDROID_INCLUDE_DIR_arm_v8a} ${JNI_INCLUDE_DIR}/arm64-v8a
ln -s ${BUILD_ANDROID_LIB_DIR_arm_v8a} ${JNI_LIBS_DIR}/arm64-v8a
fi
env
echo "BUILD_ENVIRONMENT:$BUILD_ENVIRONMENT"
GRADLE_PARAMS="-p android assembleRelease --debug --stacktrace"
if [[ "${BUILD_ENVIRONMENT}" == *-gradle-build-only-x86_32* ]]; then
GRADLE_PARAMS+=" -PABI_FILTERS=x86"
@ -82,6 +56,20 @@ if [ -n "${GRADLE_OFFLINE:-}" ]; then
GRADLE_PARAMS+=" --offline"
fi
# touch gradle cache files to prevent expiration
while IFS= read -r -d '' file
do
touch "$file" || true
done < <(find /var/lib/jenkins/.gradle -type f -print0)
env
export GRADLE_LOCAL_PROPERTIES=~/workspace/android/local.properties
rm -f $GRADLE_LOCAL_PROPERTIES
echo "sdk.dir=/opt/android/sdk" >> $GRADLE_LOCAL_PROPERTIES
echo "ndk.dir=/opt/ndk" >> $GRADLE_LOCAL_PROPERTIES
echo "cmake.dir=/usr/local" >> $GRADLE_LOCAL_PROPERTIES
$GRADLE_PATH $GRADLE_PARAMS
find . -type f -name "*.a" -exec ls -lh {} \;
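The `retry` helper removed from this Android build script runs a command up to five times, sleeping 1, 2, 4, and 8 seconds between attempts. A loop-based sketch of the same schedule follows; the function name `retry_with_backoff` and the `RETRY_BASE_DELAY` variable are assumptions for illustration, and `"$@"` is used instead of `$*` so that arguments containing spaces survive:

```shell
#!/usr/bin/env bash
# Sketch: retry a command with exponentially growing sleeps.
# RETRY_BASE_DELAY is a hypothetical knob (seconds before the 2nd attempt).
retry_with_backoff() {
    local max_attempts=5
    local delay=${RETRY_BASE_DELAY:-1}
    local attempt=1
    while true; do
        if "$@"; then
            return 0
        fi
        if (( attempt >= max_attempts )); then
            # All attempts failed; report failure to the caller.
            return 1
        fi
        sleep "$delay"
        delay=$(( delay * 2 ))
        attempt=$(( attempt + 1 ))
    done
}

# A command that succeeds on the first try never sleeps.
retry_with_backoff true
```

Unlike the chained `||` form, the loop keeps the attempt count and delay in one place, which makes either easier to change.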


@ -10,27 +10,18 @@ pt_checkout="/var/lib/jenkins/workspace"
# Since we're cat-ing this file, we need to escape all $'s
echo "cpp_doc_push_script.sh: Invoked with $*"
# for statements like ${1:-${DOCS_INSTALL_PATH:-docs/}}
# the order of operations goes:
# 1. Check if there's an argument $1
# 2. If no argument check for environment var DOCS_INSTALL_PATH
# 3. If no environment var fall back to default 'docs/'
# NOTE: It might seem weird to gather the second argument before gathering the first argument
# but since DOCS_INSTALL_PATH can be derived from DOCS_VERSION it's probably better to
# try and gather it first, just so we don't potentially break people who rely on this script
# Argument 2: What version of the Python API docs we are building.
version="${2:-${DOCS_VERSION:-master}}"
if [ -z "$version" ]; then
echo "error: cpp_doc_push_script.sh: version (arg2) not specified"
# Argument 1: Where to copy the built documentation for Python API to
# (pytorch.github.io/$install_path)
install_path="$1"
if [ -z "$install_path" ]; then
echo "error: cpp_doc_push_script.sh: install_path (arg1) not specified"
exit 1
fi
# Argument 1: Where to copy the built documentation for Python API to
# (pytorch.github.io/$install_path)
install_path="${1:-${DOCS_INSTALL_PATH:-docs/${DOCS_VERSION}}}"
if [ -z "$install_path" ]; then
echo "error: cpp_doc_push_script.sh: install_path (arg1) not specified"
# Argument 2: What version of the Python API docs we are building.
version="$2"
if [ -z "$version" ]; then
echo "error: cpp_doc_push_script.sh: version (arg2) not specified"
exit 1
fi
@ -39,7 +30,13 @@ if [ "$version" == "master" ]; then
is_master_doc=true
fi
echo "install_path: $install_path version: $version"
# Argument 3: (optional) If present, we will NOT do any pushing. Used for testing.
dry_run=false
if [ "$3" != "" ]; then
dry_run=true
fi
echo "install_path: $install_path version: $version dry_run: $dry_run"
# ======================== Building PyTorch C++ API Docs ========================
@ -56,22 +53,31 @@ sudo apt-get -y install doxygen
# Generate ATen files
pushd "${pt_checkout}"
pip install -r requirements.txt
time python -m tools.codegen.gen \
time python aten/src/ATen/gen.py \
-s aten/src/ATen \
-d build/aten/src/ATen
-d build/aten/src/ATen \
aten/src/ATen/Declarations.cwrap \
aten/src/THCUNN/generic/THCUNN.h \
aten/src/ATen/nn.yaml \
aten/src/ATen/native/native_functions.yaml
# Copy some required files
cp aten/src/ATen/common_with_cwrap.py tools/shared/cwrap_common.py
cp torch/_utils_internal.py tools/shared
# Generate PyTorch files
time python tools/setup_helpers/generate_code.py \
--declarations-path build/aten/src/ATen/Declarations.yaml \
--native-functions-path aten/src/ATen/native/native_functions.yaml \
--nn-path aten/src/
# Build the docs
pushd docs/cpp
pip install -r requirements.txt
pip install breathe==4.13.0 bs4 lxml six
pip install --no-cache-dir -e "git+https://github.com/pytorch/pytorch_sphinx_theme.git#egg=pytorch_sphinx_theme"
pip install "exhale>=0.2.1"
pip install sphinx==2.4.4
# Uncomment once it is fixed
# pip install -r requirements.txt
time make VERBOSE=1 html -j
popd
@ -97,8 +103,24 @@ git status
git config user.email "soumith+bot@pytorch.org"
git config user.name "pytorchbot"
# If there aren't changes, don't make a commit; push is no-op
git commit -m "Generate C++ docs from pytorch/pytorch@$CIRCLE_SHA1" || true
git commit -m "Automatic sync on $(date)" || true
git status
if [ "$dry_run" = false ]; then
echo "Pushing to https://github.com/pytorch/cppdocs"
set +x
/usr/bin/expect <<DONE
spawn git push -u origin master
expect "Username*"
send "pytorchbot\n"
expect "Password*"
send "$::env(GITHUB_PYTORCHBOT_TOKEN)\n"
expect eof
DONE
set -x
else
echo "Skipping push due to dry_run"
fi
popd
# =================== The above code **should** be executed inside Docker container ===================
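The comment block in this script describes the lookup order behind `${1:-${DOCS_INSTALL_PATH:-docs/}}`: positional argument first, then the environment variable, then the literal default. A minimal sketch of that fallback chain, using a hypothetical `resolve_install_path` function that is not part of the script:

```shell
#!/usr/bin/env bash
# Sketch of nested default expansion.
# ${var:-fallback} uses the fallback when var is unset OR empty.
resolve_install_path() {
    # $1 wins, then DOCS_INSTALL_PATH, then the literal default.
    echo "${1:-${DOCS_INSTALL_PATH:-docs/}}"
}

unset DOCS_INSTALL_PATH
resolve_install_path                 # neither argument nor env var

DOCS_INSTALL_PATH="docs/1.5.0"
resolve_install_path                 # env var only
resolve_install_path "docs/master"   # explicit argument overrides env var
```

Because `:-` treats an empty string like an unset value, passing `""` as the argument also falls through to the environment variable.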


@ -1,8 +0,0 @@
set "DRIVER_DOWNLOAD_LINK=https://s3.amazonaws.com/ossci-windows/452.39-data-center-tesla-desktop-win10-64bit-international.exe"
curl --retry 3 -kL %DRIVER_DOWNLOAD_LINK% --output 452.39-data-center-tesla-desktop-win10-64bit-international.exe
if errorlevel 1 exit /b 1
start /wait 452.39-data-center-tesla-desktop-win10-64bit-international.exe -s -noreboot
if errorlevel 1 exit /b 1
del 452.39-data-center-tesla-desktop-win10-64bit-international.exe || ver > NUL


@ -5,7 +5,7 @@ set -eu -o pipefail
export ANDROID_NDK_HOME=/opt/ndk
export ANDROID_HOME=/opt/android/sdk
export GRADLE_VERSION=6.8.3
export GRADLE_VERSION=4.10.3
export GRADLE_HOME=/opt/gradle/gradle-$GRADLE_VERSION
export GRADLE_PATH=$GRADLE_HOME/bin/gradle
@ -35,9 +35,7 @@ else
echo "ndk.dir=/opt/ndk" >> $GRADLE_LOCAL_PROPERTIES
echo "SONATYPE_NEXUS_USERNAME=${SONATYPE_NEXUS_USERNAME}" >> $GRADLE_PROPERTIES
echo "mavenCentralRepositoryUsername=${SONATYPE_NEXUS_USERNAME}" >> $GRADLE_PROPERTIES
echo "SONATYPE_NEXUS_PASSWORD=${SONATYPE_NEXUS_PASSWORD}" >> $GRADLE_PROPERTIES
echo "mavenCentralRepositoryPassword=${SONATYPE_NEXUS_PASSWORD}" >> $GRADLE_PROPERTIES
echo "signing.keyId=${ANDROID_SIGN_KEY}" >> $GRADLE_PROPERTIES
echo "signing.password=${ANDROID_SIGN_PASS}" >> $GRADLE_PROPERTIES


@ -7,33 +7,22 @@ sudo apt-get -y install expect-dev
# This is where the local pytorch install in the docker image is located
pt_checkout="/var/lib/jenkins/workspace"
source "$pt_checkout/.jenkins/pytorch/common_utils.sh"
echo "python_doc_push_script.sh: Invoked with $*"
set -ex
# for statements like ${1:-${DOCS_INSTALL_PATH:-docs/}}
# the order of operations goes:
# 1. Check if there's an argument $1
# 2. If no argument check for environment var DOCS_INSTALL_PATH
# 3. If no environment var fall back to default 'docs/'
# NOTE: It might seem weird to gather the second argument before gathering the first argument
# but since DOCS_INSTALL_PATH can be derived from DOCS_VERSION it's probably better to
# try and gather it first, just so we don't potentially break people who rely on this script
# Argument 2: What version of the docs we are building.
version="${2:-${DOCS_VERSION:-master}}"
if [ -z "$version" ]; then
echo "error: python_doc_push_script.sh: version (arg2) not specified"
# Argument 1: Where to copy the built documentation to
# (pytorch.github.io/$install_path)
install_path="$1"
if [ -z "$install_path" ]; then
echo "error: python_doc_push_script.sh: install_path (arg1) not specified"
exit 1
fi
# Argument 1: Where to copy the built documentation to
# (pytorch.github.io/$install_path)
install_path="${1:-${DOCS_INSTALL_PATH:-docs/${DOCS_VERSION}}}"
if [ -z "$install_path" ]; then
echo "error: python_doc_push_script.sh: install_path (arg1) not specified"
# Argument 2: What version of the docs we are building.
version="$2"
if [ -z "$version" ]; then
echo "error: python_doc_push_script.sh: version (arg2) not specified"
exit 1
fi
@ -43,36 +32,21 @@ if [ "$version" == "master" ]; then
fi
# Argument 3: The branch to push to. Usually is "site"
branch="${3:-${DOCS_BRANCH:-site}}"
branch="$3"
if [ -z "$branch" ]; then
echo "error: python_doc_push_script.sh: branch (arg3) not specified"
exit 1
fi
echo "install_path: $install_path version: $version"
# Argument 4: (optional) If present, we will NOT do any pushing. Used for testing.
dry_run=false
if [ "$4" != "" ]; then
dry_run=true
fi
echo "install_path: $install_path version: $version dry_run: $dry_run"
build_docs () {
set +e
set -o pipefail
make $1 2>&1 | tee /tmp/docs_build.txt
code=$?
if [ $code -ne 0 ]; then
set +x
echo =========================
grep "WARNING:" /tmp/docs_build.txt
echo =========================
echo Docs build failed. If the failure is not clear, scan back in the log
echo for any WARNINGS or for the line "build finished with problems"
echo "(tried to echo the WARNINGS above the ==== line)"
echo =========================
fi
set -ex
return $code
}
git clone https://github.com/pytorch/pytorch.github.io -b $branch --depth 1
git clone https://github.com/pytorch/pytorch.github.io -b $branch
pushd pytorch.github.io
export LC_ALL=C
@ -80,38 +54,26 @@ export PATH=/opt/conda/bin:$PATH
rm -rf pytorch || true
# Install TensorBoard in python 3 so torch.utils.tensorboard classes render
pip install -q https://s3.amazonaws.com/ossci-linux/wheels/tensorboard-1.14.0a0-py3-none-any.whl
# Get all the documentation sources, put them in one place
pushd "$pt_checkout"
git clone https://github.com/pytorch/vision
pushd vision
conda install -q pillow
time python setup.py install
popd
pushd docs
rm -rf source/torchvision
cp -a ../vision/docs/source source/torchvision
# Build the docs
pip -q install -r requirements.txt
pip -q install -r requirements.txt || true
if [ "$is_master_doc" = true ]; then
build_docs html
[ $? -eq 0 ] || exit $?
make coverage
# Now we have the coverage report, we need to make sure it is empty.
# Count the number of lines in the file and turn that number into a variable
# $lines. The `cut -f1 ...` is to only parse the number, not the filename
# Skip the report header by subtracting 2: the header will be output even if
# there are no undocumented items.
#
# Also: see docs/source/conf.py for "coverage_ignore*" items, which should
# be documented then removed from there.
lines=$(wc -l build/coverage/python.txt 2>/dev/null |cut -f1 -d' ')
undocumented=$(($lines - 2))
if [ $undocumented -lt 0 ]; then
echo coverage output not found
exit 1
elif [ $undocumented -gt 0 ]; then
echo undocumented objects found:
cat build/coverage/python.txt
exit 1
fi
make html
else
# skip coverage, format for stable or tags
build_docs html-stable
[ $? -eq 0 ] || exit $?
make html-stable
fi
# Move them into the docs repo
@ -120,6 +82,14 @@ popd
git rm -rf "$install_path" || true
mv "$pt_checkout/docs/build/html" "$install_path"
# Add the version handler by search and replace.
# XXX: Consider moving this to the docs Makefile or site build
if [ "$is_master_doc" = true ]; then
find "$install_path" -name "*.html" -print0 | xargs -0 perl -pi -w -e "s@master\s+\((\d\.\d\.[A-Fa-f0-9]+\+[A-Fa-f0-9]+)\s+\)@<a href='http://pytorch.org/docs/versions.html'>\1 \&#x25BC</a>@g"
else
find "$install_path" -name "*.html" -print0 | xargs -0 perl -pi -w -e "s@master\s+\((\d\.\d\.[A-Fa-f0-9]+\+[A-Fa-f0-9]+)\s+\)@<a href='http://pytorch.org/docs/versions.html'>$version \&#x25BC</a>@g"
fi
# Prevent Google from indexing $install_path/_modules. This folder contains
# generated source files.
# NB: the following only works on gnu sed. The sed shipped with mac os is different.
@ -131,8 +101,24 @@ git status
git config user.email "soumith+bot@pytorch.org"
git config user.name "pytorchbot"
# If there aren't changes, don't make a commit; push is no-op
git commit -m "Generate Python docs from pytorch/pytorch@$CIRCLE_SHA1" || true
git commit -m "auto-generating sphinx docs" || true
git status
if [ "$dry_run" = false ]; then
echo "Pushing to pytorch.github.io:$branch"
set +x
/usr/bin/expect <<DONE
spawn git push origin $branch
expect "Username*"
send "pytorchbot\n"
expect "Password*"
send "$::env(GITHUB_PYTORCHBOT_TOKEN)\n"
expect eof
DONE
set -x
else
echo "Skipping push due to dry_run"
fi
popd
# =================== The above code **should** be executed inside Docker container ===================
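The coverage check removed from this script counts the lines of `build/coverage/python.txt` and subtracts the two header lines, treating any remainder as undocumented objects. A sketch of that counting step follows; the `count_undocumented` function and the sample report contents are made up for illustration:

```shell
#!/usr/bin/env bash
# Sketch: count report lines beyond a fixed two-line header.
count_undocumented() {
    local report=$1
    local lines
    # A missing report is an error, not "zero undocumented items".
    [ -f "$report" ] || return 1
    # Reading from stdin makes wc print only the number, no filename.
    lines=$(wc -l < "$report")
    echo $(( lines - 2 ))
}

# Sample report: two header lines plus one (invented) undocumented entry.
report=$(mktemp)
printf 'Undocumented Python objects\n===========================\ntorch.foo\n' > "$report"
count_undocumented "$report"
rm -f "$report"
```

Redirecting the file into `wc` avoids the `cut -f1 -d' '` step, since no filename is printed.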


@ -1,103 +1,81 @@
#!/usr/bin/env bash
set -ex -o pipefail
# Set up NVIDIA docker repo
curl -s -L --retry 3 https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
echo "deb https://nvidia.github.io/libnvidia-container/ubuntu16.04/amd64 /" | sudo tee -a /etc/apt/sources.list.d/nvidia-docker.list
echo "deb https://nvidia.github.io/nvidia-container-runtime/ubuntu16.04/amd64 /" | sudo tee -a /etc/apt/sources.list.d/nvidia-docker.list
echo "deb https://nvidia.github.io/nvidia-docker/ubuntu16.04/amd64 /" | sudo tee -a /etc/apt/sources.list.d/nvidia-docker.list
# Remove unnecessary sources
sudo rm -f /etc/apt/sources.list.d/google-chrome.list
sudo rm -f /etc/apt/heroku.list
sudo rm -f /etc/apt/openjdk-r-ubuntu-ppa-xenial.list
sudo rm -f /etc/apt/partner.list
# To increase the network reliability, let apt decide which mirror is best to use
sudo sed -i -e 's/http:\/\/.*archive/mirror:\/\/mirrors/' -e 's/\/ubuntu\//\/mirrors.txt/' /etc/apt/sources.list
retry () {
$* || $* || $* || $* || $*
}
# Method adapted from here: https://askubuntu.com/questions/875213/apt-get-to-retry-downloading
# (with use of tee to avoid permissions problems)
# This is better than retrying the whole apt-get command
echo "APT::Acquire::Retries \"3\";" | sudo tee /etc/apt/apt.conf.d/80-retries
retry sudo apt-get update -qq
retry sudo apt-get -y install \
sudo apt-get -y update
sudo apt-get -y remove linux-image-generic linux-headers-generic linux-generic docker-ce
# WARNING: Docker version is hardcoded here; you must update the
# version number below for docker-ce and nvidia-docker2 to get newer
# versions of Docker. We hardcode these numbers because we kept
# getting broken CI when Docker would update their docker version,
# and nvidia-docker2 would be out of date for a day until they
# released a newer version of their package.
#
# How to figure out what the correct versions of these packages are?
# My preferred method is to start a Docker instance of the correct
# Ubuntu version (e.g., docker run -it ubuntu:16.04) and then ask
# apt what the packages you need are. Note that the CircleCI image
# comes with Docker.
sudo apt-get -y install \
linux-headers-$(uname -r) \
linux-image-generic \
moreutils \
docker-ce=5:18.09.4~3-0~ubuntu-xenial \
nvidia-container-runtime=2.0.0+docker18.09.4-1 \
nvidia-docker2=2.0.3+docker18.09.4-1 \
expect-dev
echo "== DOCKER VERSION =="
docker version
sudo pkill -SIGHUP dockerd
if ! command -v aws >/dev/null; then
retry sudo pip3 -q install awscli==1.19.64
fi
retry () {
$* || $* || $* || $* || $*
}
retry sudo pip -q install awscli==1.16.35
if [ -n "${USE_CUDA_DOCKER_RUNTIME:-}" ]; then
DRIVER_FN="NVIDIA-Linux-x86_64-460.39.run"
DRIVER_FN="NVIDIA-Linux-x86_64-440.59.run"
wget "https://s3.amazonaws.com/ossci-linux/nvidia_driver/$DRIVER_FN"
sudo /bin/bash "$DRIVER_FN" -s --no-drm || (sudo cat /var/log/nvidia-installer.log && false)
nvidia-smi
# Taken directly from https://github.com/NVIDIA/nvidia-docker
# Add the package repositories
distribution=$(. /etc/os-release;echo "$ID$VERSION_ID")
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L "https://nvidia.github.io/nvidia-docker/${distribution}/nvidia-docker.list" | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
retry sudo apt-get update -qq
# Necessary to get the `--gpus` flag to function within docker
retry sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker
else
# Explicitly remove nvidia docker apt repositories if not building for cuda
sudo rm -rf /etc/apt/sources.list.d/nvidia-docker.list
fi
add_to_env_file() {
local name=$1
local value=$2
case "$value" in
*\ *)
# BASH_ENV should be set by CircleCI
echo "${name}='${value}'" >> "${BASH_ENV:-/tmp/env}"
;;
*)
echo "${name}=${value}" >> "${BASH_ENV:-/tmp/env}"
;;
esac
}
add_to_env_file IN_CI 1
add_to_env_file CI_MASTER "${CI_MASTER:-}"
add_to_env_file COMMIT_SOURCE "${CIRCLE_BRANCH:-}"
add_to_env_file BUILD_ENVIRONMENT "${BUILD_ENVIRONMENT}"
add_to_env_file CIRCLE_PULL_REQUEST "${CIRCLE_PULL_REQUEST}"
if [[ "${BUILD_ENVIRONMENT}" == *-build ]]; then
add_to_env_file SCCACHE_BUCKET ossci-compiler-cache-circleci-v2
SCCACHE_MAX_JOBS=$(( $(nproc) - 1 ))
MEMORY_LIMIT_MAX_JOBS=8 # the "large" resource class on CircleCI has 32 CPU cores, if we use all of them we'll OOM
MAX_JOBS=$(( ${SCCACHE_MAX_JOBS} > ${MEMORY_LIMIT_MAX_JOBS} ? ${MEMORY_LIMIT_MAX_JOBS} : ${SCCACHE_MAX_JOBS} ))
add_to_env_file MAX_JOBS "${MAX_JOBS}"
echo "declare -x IN_CIRCLECI=1" > /home/circleci/project/env
echo "declare -x COMMIT_SOURCE=${CIRCLE_BRANCH:-}" >> /home/circleci/project/env
echo "declare -x SCCACHE_BUCKET=ossci-compiler-cache-circleci-v2" >> /home/circleci/project/env
if [ -n "${USE_CUDA_DOCKER_RUNTIME:-}" ]; then
add_to_env_file TORCH_CUDA_ARCH_LIST 5.2
echo "declare -x TORCH_CUDA_ARCH_LIST=5.2" >> /home/circleci/project/env
fi
export SCCACHE_MAX_JOBS=`expr $(nproc) - 1`
export MEMORY_LIMIT_MAX_JOBS=8 # the "large" resource class on CircleCI has 32 CPU cores, if we use all of them we'll OOM
export MAX_JOBS=$(( ${SCCACHE_MAX_JOBS} > ${MEMORY_LIMIT_MAX_JOBS} ? ${MEMORY_LIMIT_MAX_JOBS} : ${SCCACHE_MAX_JOBS} ))
echo "declare -x MAX_JOBS=${MAX_JOBS}" >> /home/circleci/project/env
if [[ "${BUILD_ENVIRONMENT}" == *xla* ]]; then
# This IAM user allows write access to S3 bucket for sccache & bazels3cache
set +x
add_to_env_file XLA_CLANG_CACHE_S3_BUCKET_NAME "${XLA_CLANG_CACHE_S3_BUCKET_NAME:-}"
add_to_env_file AWS_ACCESS_KEY_ID "${CIRCLECI_AWS_ACCESS_KEY_FOR_SCCACHE_AND_XLA_BAZEL_S3_BUCKET_V2:-}"
add_to_env_file AWS_SECRET_ACCESS_KEY "${CIRCLECI_AWS_SECRET_KEY_FOR_SCCACHE_AND_XLA_BAZEL_S3_BUCKET_V2:-}"
echo "declare -x XLA_CLANG_CACHE_S3_BUCKET_NAME=${XLA_CLANG_CACHE_S3_BUCKET_NAME:-}" >> /home/circleci/project/env
echo "declare -x AWS_ACCESS_KEY_ID=${CIRCLECI_AWS_ACCESS_KEY_FOR_SCCACHE_AND_XLA_BAZEL_S3_BUCKET_V2:-}" >> /home/circleci/project/env
echo "declare -x AWS_SECRET_ACCESS_KEY=${CIRCLECI_AWS_SECRET_KEY_FOR_SCCACHE_AND_XLA_BAZEL_S3_BUCKET_V2:-}" >> /home/circleci/project/env
set -x
else
# This IAM user allows write access to S3 bucket for sccache
set +x
add_to_env_file XLA_CLANG_CACHE_S3_BUCKET_NAME "${XLA_CLANG_CACHE_S3_BUCKET_NAME:-}"
add_to_env_file AWS_ACCESS_KEY_ID "${CIRCLECI_AWS_ACCESS_KEY_FOR_SCCACHE_S3_BUCKET_V4:-}"
add_to_env_file AWS_SECRET_ACCESS_KEY "${CIRCLECI_AWS_SECRET_KEY_FOR_SCCACHE_S3_BUCKET_V4:-}"
echo "declare -x XLA_CLANG_CACHE_S3_BUCKET_NAME=${XLA_CLANG_CACHE_S3_BUCKET_NAME:-}" >> /home/circleci/project/env
echo "declare -x AWS_ACCESS_KEY_ID=${CIRCLECI_AWS_ACCESS_KEY_FOR_SCCACHE_S3_BUCKET_V4:-}" >> /home/circleci/project/env
echo "declare -x AWS_SECRET_ACCESS_KEY=${CIRCLECI_AWS_SECRET_KEY_FOR_SCCACHE_S3_BUCKET_V4:-}" >> /home/circleci/project/env
set -x
fi
fi
@ -106,7 +84,5 @@ fi
set +x
export AWS_ACCESS_KEY_ID=${CIRCLECI_AWS_ACCESS_KEY_FOR_ECR_READ_WRITE_V4:-}
export AWS_SECRET_ACCESS_KEY=${CIRCLECI_AWS_SECRET_KEY_FOR_ECR_READ_WRITE_V4:-}
export AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\")
export AWS_REGION=us-east-1
aws ecr get-login-password --region $AWS_REGION|docker login --username AWS --password-stdin $AWS_ACCOUNT_ID.dkr.ecr.$AWS_REGION.amazonaws.com
eval $(aws ecr get-login --region us-east-1 --no-include-email)
set -x
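The `MAX_JOBS` computation in this script takes one less than the core count and caps it at a memory-driven limit, using the arithmetic ternary `a > b ? b : a` as a minimum. A sketch with explicit inputs instead of `nproc`, using a hypothetical `capped_jobs` function:

```shell
#!/usr/bin/env bash
# Sketch: min(cores - 1, limit) via bash arithmetic expansion.
capped_jobs() {
    local cores=$1
    local limit=$2
    local candidate=$(( cores - 1 ))
    echo $(( candidate > limit ? limit : candidate ))
}

capped_jobs 32 8   # many cores: the memory limit wins
capped_jobs 4 8    # few cores: cores - 1 wins
```

In the script itself the first input would come from `$(nproc)` and the second from `MEMORY_LIMIT_MAX_JOBS`.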


@ -33,7 +33,7 @@ systemctl list-units --all | cat
sudo pkill apt-get || true
# For even better luck, purge unattended-upgrades
sudo apt-get purge -y unattended-upgrades || true
sudo apt-get purge -y unattended-upgrades
cat /etc/apt/sources.list


@ -1,140 +0,0 @@
# Documentation: https://docs.microsoft.com/en-us/rest/api/azure/devops/build/?view=azure-devops-rest-6.0
import re
import json
import os
import sys
import requests
import time
AZURE_PIPELINE_BASE_URL = "https://aiinfra.visualstudio.com/PyTorch/"
AZURE_DEVOPS_PAT_BASE64 = os.environ.get("AZURE_DEVOPS_PAT_BASE64_SECRET", "")
PIPELINE_ID = "911"
PROJECT_ID = "0628bce4-2d33-499e-bac5-530e12db160f"
TARGET_BRANCH = os.environ.get("CIRCLE_BRANCH", "master")
TARGET_COMMIT = os.environ.get("CIRCLE_SHA1", "")
build_base_url = AZURE_PIPELINE_BASE_URL + "_apis/build/builds?api-version=6.0"
s = requests.Session()
s.headers.update({"Authorization": "Basic " + AZURE_DEVOPS_PAT_BASE64})
def submit_build(pipeline_id, project_id, source_branch, source_version):
    print("Submitting build for branch: " + source_branch)
    print("Commit SHA1: ", source_version)
    run_build_raw = s.post(build_base_url, json={
        "definition": {"id": pipeline_id},
        "project": {"id": project_id},
        "sourceBranch": source_branch,
        "sourceVersion": source_version
    })
    try:
        run_build_json = run_build_raw.json()
    except json.decoder.JSONDecodeError as e:
        print(e)
        print("Failed to parse the response. Check if the Azure DevOps PAT is incorrect or expired.")
        sys.exit(-1)
    build_id = run_build_json['id']
    print("Submitted build: " + str(build_id))
    print("Build URL: " + run_build_json['url'])
    return build_id
def get_build(_id):
    get_build_url = AZURE_PIPELINE_BASE_URL + f"/_apis/build/builds/{_id}?api-version=6.0"
    get_build_raw = s.get(get_build_url)
    return get_build_raw.json()
def get_build_logs(_id):
    get_build_logs_url = AZURE_PIPELINE_BASE_URL + f"/_apis/build/builds/{_id}/logs?api-version=6.0"
    get_build_logs_raw = s.get(get_build_logs_url)
    return get_build_logs_raw.json()
def get_log_content(url):
    resp = s.get(url)
    return resp.text
def wait_for_build(_id):
    build_detail = get_build(_id)
    build_status = build_detail['status']
    while build_status == 'notStarted':
        print('Waiting for run to start: ' + str(_id))
        sys.stdout.flush()
        try:
            build_detail = get_build(_id)
            build_status = build_detail['status']
        except Exception as e:
            print("Error getting build")
            print(e)
        time.sleep(30)
    print("Build started: ", str(_id))
    handled_logs = set()
    while build_status == 'inProgress':
        try:
            print("Waiting for log: " + str(_id))
            logs = get_build_logs(_id)
        except Exception as e:
            print("Error fetching logs")
            print(e)
            time.sleep(30)
            continue
        for log in logs['value']:
            log_id = log['id']
            if log_id in handled_logs:
                continue
            handled_logs.add(log_id)
            print('Fetching log: \n' + log['url'])
            try:
                log_content = get_log_content(log['url'])
                print(log_content)
            except Exception as e:
                print("Error getting log content")
                print(e)
            sys.stdout.flush()
        build_detail = get_build(_id)
        build_status = build_detail['status']
        time.sleep(30)
    build_result = build_detail['result']
    print("Build status: " + build_status)
    print("Build result: " + build_result)
    return build_status, build_result
if __name__ == '__main__':
    # Convert the branch name for Azure DevOps
    match = re.search(r'pull/(\d+)', TARGET_BRANCH)
    if match is not None:
        pr_num = match.group(1)
        SOURCE_BRANCH = f'refs/pull/{pr_num}/head'
    else:
        SOURCE_BRANCH = f'refs/heads/{TARGET_BRANCH}'
    MAX_RETRY = 2
    retry = MAX_RETRY
    while retry > 0:
        build_id = submit_build(PIPELINE_ID, PROJECT_ID, SOURCE_BRANCH, TARGET_COMMIT)
        build_status, build_result = wait_for_build(build_id)
        if build_result != 'succeeded':
            retry = retry - 1
            if retry > 0:
                print("Retrying... remaining attempt: " + str(retry))
                # Wait a bit before retrying
                time.sleep((MAX_RETRY - retry) * 120)
                continue
            else:
                print("No more chance to retry. Giving up.")
                sys.exit(-1)
        else:
            break


@ -0,0 +1,87 @@
import glob
import json
import logging
import os
import os.path
import re
import sys
import time
import requests
def get_size(file_dir):
    try:
        # we expect exactly one file here; if not, something is wrong
        file_name = glob.glob(os.path.join(file_dir, "*"))[0]
        return os.stat(file_name).st_size
    except:
        logging.exception(f"error getting file from: {file_dir}")
        return 0
def build_message(size):
    pkg_type, py_ver, cu_ver, *_ = os.environ.get("BUILD_ENVIRONMENT", "").split() + [
        None,
        None,
        None,
    ]
    os_name = os.uname()[0].lower()
    if os_name == "darwin":
        os_name = "macos"
    return {
        "normal": {
            "os": os_name,
            "pkg_type": pkg_type,
            "py_ver": py_ver,
            "cu_ver": cu_ver,
            "pr": os.environ.get("CIRCLE_PR_NUMBER"),
            "build_num": os.environ.get("CIRCLE_BUILD_NUM"),
            "sha1": os.environ.get("CIRCLE_SHA1"),
            "branch": os.environ.get("CIRCLE_BRANCH"),
        },
        "int": {
            "time": int(time.time()),
            "size": size,
            "commit_time": int(os.environ.get("COMMIT_TIME", "0")),
        },
    }
def send_message(message):
    access_token = os.environ.get("SCRIBE_GRAPHQL_ACCESS_TOKEN")
    if not access_token:
        raise ValueError("Can't find access token from environment variable")
    url = "https://graph.facebook.com/scribe_logs"
    r = requests.post(
        url,
        data={
            "access_token": access_token,
            "logs": json.dumps(
                [
                    {
                        "category": "perfpipe_pytorch_binary_size",
                        "message": json.dumps(message),
                        "line_escape": False,
                    }
                ]
            ),
        },
    )
    print(r.text)
    r.raise_for_status()
if __name__ == "__main__":
    file_dir = os.environ.get(
        "PYTORCH_FINAL_PACKAGE_DIR", "/home/circleci/project/final_pkgs"
    )
    if len(sys.argv) == 2:
        file_dir = sys.argv[1]
    print("checking dir: " + file_dir)
    size = get_size(file_dir)
    if size != 0:
        try:
            send_message(build_message(size))
        except:
            logging.exception("can't send message")

Some files were not shown because too many files have changed in this diff.