28 Commits

Author SHA1 Message Date
b2eb0e8c6a docker: Use miniforge, install from pip (#134274)
Switch installation of the pytorch package to be installed from our download.pytorch.org sources which are better maintained.

As well, switching over the miniconda installation to a miniforge installation in order to ensure backwards compat for users expecting to have the conda package manager installed.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134274
Approved by: https://github.com/malfet, https://github.com/atalman

Co-authored-by: atalman <atalman@fb.com>
2024-08-22 23:20:22 +00:00
ee140a198f Revert "[Port][Quant][Inductor] Bug fix: mutation nodes not handled correctly for QLinearPointwiseBinaryPT2E (#128591)"
This reverts commit 03e8a4cf45ee45611de77b55b515a8936f60ce31.

Reverted https://github.com/pytorch/pytorch/pull/128591 on behalf of https://github.com/atalman due to Contains release only changes should not be landed ([comment](https://github.com/pytorch/pytorch/pull/128591#issuecomment-2168308233))
2024-06-14 15:51:00 +00:00
03e8a4cf45 [Port][Quant][Inductor] Bug fix: mutation nodes not handled correctly for QLinearPointwiseBinaryPT2E (#128591)
Port #127592 from main to release/2.4

------
Fixes #127402

- Revert some changes to `ir.MutationOutput` and inductor/test_flex_attention.py
- Add checks of mutation for QLinearPointwiseBinaryPT2E

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127592
Approved by: https://github.com/leslie-fang-intel, https://github.com/Chillee

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128591
Approved by: https://github.com/jgong5, https://github.com/Chillee
2024-06-14 09:31:38 +00:00
7e059b3c95 Add a call to validate docker images after build step is complete (#127768)
Adds validation to docker images. As discussed here: https://github.com/pytorch/pytorch/issues/125879
Pull Request resolved: https://github.com/pytorch/pytorch/pull/127768
Approved by: https://github.com/huydhn, https://github.com/Skylion007
2024-06-06 20:25:39 +00:00
257d40ba2e Docker release - push nightly tags only for amd64 builds (#125845)
Fixes failure: https://github.com/pytorch/pytorch/actions/runs/9014006158/job/24765880791#step:12:43
```
Unable to find image 'ghcr.io/pytorch/pytorch-nightly:2.4.0.dev20240509-runtime' locally
2.4.0.dev20240509-runtime: Pulling from pytorch/pytorch-nightly
docker: no matching manifest for linux/amd64 in the manifest list entries.
```
This CPU image does not exist for amd64 and is not uploaded to Docker Hub, hence don't tag it.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/125845
Approved by: https://github.com/malfet, https://github.com/huydhn
2024-05-09 16:42:15 +00:00
b29d77b54f Separate arm64 and amd64 docker builds (#125617)
Fixes https://github.com/pytorch/pytorch/issues/125094

Please note: the Docker CUDA 12.4 failure is an existing issue, related to the Docker image not being available on GitLab:
```
docker.io/nvidia/cuda:12.4.0-cudnn8-devel-ubuntu22.04: docker.io/nvidia/cuda:12.4.0-cudnn8-devel-ubuntu22.04: not found
```
 https://github.com/pytorch/pytorch/actions/runs/8974959068/job/24648540236?pr=125617

Here is the reference issue: https://gitlab.com/nvidia/container-images/cuda/-/issues/225

Tracked on our side: https://github.com/pytorch/builder/issues/1811
Pull Request resolved: https://github.com/pytorch/pytorch/pull/125617
Approved by: https://github.com/huydhn, https://github.com/malfet
2024-05-07 11:50:54 +00:00
de25718300 [release] Docker Release build trigger on rc for testing (#117849)
Enable triggering the Docker Release builds on RC, using the test channel in this case. Hence the following logic is applied:
1. On RC trigger use test channel and upload to pytorch-test : https://github.com/orgs/pytorch/packages/container/package/pytorch-test
2. On Final RC use prod channel and upload to pytorch : https://github.com/orgs/pytorch/packages/container/package/pytorch
3. Nightly: https://github.com/orgs/pytorch/packages/container/package/pytorch-nightly
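The channel selection listed above can be sketched as a small function. This is only an illustration: the tag patterns (`v2.2.0-rc1` for a release candidate, `v2.2.0` for a final tag) and the name `select_channel` are assumptions and are not taken from the workflow file.

```python
import re
from typing import Tuple

# Hypothetical tag patterns; the real workflow may match refs differently.
RC_TAG = re.compile(r"^v\d+\.\d+\.\d+-rc\d+$")
FINAL_TAG = re.compile(r"^v\d+\.\d+\.\d+$")


def select_channel(ref_name: str) -> Tuple[str, str]:
    """Return (install channel, GHCR package name) for a ref name."""
    if RC_TAG.match(ref_name):
        # Release candidate: test channel, uploaded to pytorch-test.
        return "test", "pytorch-test"
    if FINAL_TAG.match(ref_name):
        # Final tag: prod channel, uploaded to pytorch.
        return "prod", "pytorch"
    # Anything else is treated as a nightly build.
    return "nightly", "pytorch-nightly"
```

For example, `select_channel("v2.2.0-rc1")` would pick the test channel and the `pytorch-test` package under these assumed patterns.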

Pull Request resolved: https://github.com/pytorch/pytorch/pull/117849
Approved by: https://github.com/malfet
2024-01-19 15:01:46 +00:00
5cf481d1ac [CI] Explicitly specify read-all permissions on the token (#117290)
Would be nice to have it

Pull Request resolved: https://github.com/pytorch/pytorch/pull/117290
Approved by: https://github.com/seemethere, https://github.com/osalpekar, https://github.com/huydhn, https://github.com/atalman
2024-01-12 19:15:54 +00:00
3e9bb8d4de Run docker release build on final tag (#117131)
To be successful, the docker release workflow needs to run on the final tag, after the releases to conda and PyPI are complete.

Please refer to: https://github.com/pytorch/pytorch/blob/main/Dockerfile#L76

Pull Request resolved: https://github.com/pytorch/pytorch/pull/117131
Approved by: https://github.com/huydhn, https://github.com/seemethere, https://github.com/malfet
2024-01-10 21:00:45 +00:00
ea7f2de6f3 [docker] Fix typo in docker-release workflow (#116191)
Fix copy-paste typo in docker-release workflow. Follow-up to https://github.com/pytorch/pytorch/pull/116097

Pull Request resolved: https://github.com/pytorch/pytorch/pull/116191
Approved by: https://github.com/malfet
2023-12-20 16:44:36 +00:00
0bd5a3fed7 [releng] Docker release Refactor Push nightly tags step. Move cuda and cudnn version to docker tag rather then name (#116097)
Follow up after : https://github.com/pytorch/pytorch/pull/116070

This PR does 2 things.

1. Refactor Push nightly tags step, don't need to extract CUDA_VERSION anymore. New tag should be in this format: ``${PYTORCH_VERSION}-cuda$(CUDA_VERSION_SHORT)-cudnn$(CUDNN_VERSION)-runtime``
2. Move cuda$(CUDA_VERSION_SHORT)-cudnn$(CUDNN_VERSION) from docker name to tag
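The tag format described above can be illustrated with a short helper. This is a sketch under stated assumptions: the function name `image_tag` is hypothetical, and `CUDA_VERSION_SHORT` is assumed to be the major.minor part of the full CUDA version.

```python
def image_tag(pytorch_version: str, cuda_version: str,
              cudnn_version: str, image_type: str = "runtime") -> str:
    """Build a tag like '<version>-cuda<short>-cudnn<n>-<type>'."""
    # Assumption: CUDA_VERSION_SHORT is major.minor, e.g. "12.1" from "12.1.1".
    cuda_short = ".".join(cuda_version.split(".")[:2])
    return f"{pytorch_version}-cuda{cuda_short}-cudnn{cudnn_version}-{image_type}"
```

With these assumptions, `image_tag("2.1.2", "12.1.1", "8", "devel")` yields `2.1.2-cuda12.1-cudnn8-devel`, which matches the naming used on Docker Hub.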

Pull Request resolved: https://github.com/pytorch/pytorch/pull/116097
Approved by: https://github.com/jeanschmidt
2023-12-19 13:53:08 +00:00
368a0c06d4 [releng] Docker Official release make sure cuda version is part of image name (#116070)
Follow up on https://github.com/pytorch/pytorch/pull/115949

Change docker build image name:
``pytorch:2.1.2-devel`` -> ``2.1.2-cuda12.1-cudnn8-devel`` and ``2.1.2-cuda11.8-cudnn8-devel``

Ref: https://github.com/orgs/pytorch/packages/container/package/pytorch-nightly

Naming will be same as in https://hub.docker.com/r/pytorch/pytorch/tags
Pull Request resolved: https://github.com/pytorch/pytorch/pull/116070
Approved by: https://github.com/huydhn, https://github.com/seemethere
2023-12-19 00:58:15 +00:00
7b6210e8a4 Use matrix generate script for docker release workflows (#115949)
Enable builds for both supported CUDA versions in the docker release, rather than building only one version.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115949
Approved by: https://github.com/huydhn
2023-12-18 20:20:59 +00:00
9b3cb1c66c Fix environment condition for docker-release.yml
As those are run on nightlies and release tags, the environment should be set accordingly.

Also simplify `WITH_PUSH` condition.

Should fix https://github.com/pytorch/pytorch/actions/runs/7156407285/job/19494049140
2023-12-10 14:09:39 -08:00
36b3e1789a Docker release build don't include build suffix in the release (#112046)
This build is used in release as far as I know. For release we don't need suffix.

Test in Release:
```
python3 .github/scripts/generate_pytorch_version.py
2.1.1+cpu
python3 .github/scripts/generate_pytorch_version.py --no-build-suffix
2.1.1
```

Test with nightly:
```
python3 .github/scripts/generate_pytorch_version.py --no-build-suffix
2.2.0.dev20231025
```

With suffix:
```
python3 .github/scripts/generate_pytorch_version.py
2.2.0.dev20231025+cpu
```
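The effect of `--no-build-suffix` on the outputs shown above can be sketched as follows. This is not the script's actual implementation, only an illustration that assumes the suffix is always a `+<label>` local version segment; the name `strip_build_suffix` is hypothetical.

```python
def strip_build_suffix(version: str) -> str:
    """Drop a local version label such as '+cpu', if present."""
    # Everything after the first '+' is treated as the build suffix.
    return version.split("+", 1)[0]
```

Applied to the examples above, `2.1.1+cpu` becomes `2.1.1` and `2.2.0.dev20231025+cpu` becomes `2.2.0.dev20231025`.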
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112046
Approved by: https://github.com/huydhn
2023-10-25 19:40:01 +00:00
9a365fe914 Use docker-build env to access GHCR_PAT (#107655)
This will restrict the access to GHCR_PAT to only [docker-build](https://github.com/pytorch/pytorch/settings/environments/1258682414/edit) env.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107655
Approved by: https://github.com/clee2000, https://github.com/atalman
2023-08-23 23:45:41 +00:00
1ab883797a [BE] Dedup hardcoded triton versions (#96580)
Define it once in `.ci/docker/triton_version.txt` and use everywhere.

Also, patch version defined in `triton/__init__.py` as currently it always returns `2.0.0` even if package name is `2.1.0`

Followup after https://github.com/pytorch/pytorch/pull/95896 where version needed to be updated in 4+ places
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96580
Approved by: https://github.com/huydhn
2023-03-12 20:00:48 +00:00
30b968f60d Revert "[BE] Dedup hardcoded triton versions (#96580)"
This reverts commit c131e51e6248cf04135db317040b5be3ab944d41.

Reverted https://github.com/pytorch/pytorch/pull/96580 on behalf of https://github.com/malfet due to Forgot to fix lint
2023-03-12 19:37:52 +00:00
c131e51e62 [BE] Dedup hardcoded triton versions (#96580)
Define it once in `.ci/docker/triton_version.txt` and use everywhere.

Also, patch version defined in `triton/__init__.py` as currently it always returns `2.0.0` even if package name is `2.1.0`

Followup after https://github.com/pytorch/pytorch/pull/95896 where version needed to be updated in 4+ places
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96580
Approved by: https://github.com/huydhn
2023-03-12 16:56:04 +00:00
76cac70939 new triton main pin (#95896)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95896
Approved by: https://github.com/jansel, https://github.com/malfet
2023-03-10 06:30:41 +00:00
a064ce1939 Pin setup-buildx-action version. Fix Docker build (#94734)
This pins setup-buildx-action version.
Our Docker builds were fixed by https://github.com/pytorch/pytorch/pull/92702 on Jan 25-26.
However, the setup-buildx-action update on Jan 27 broke these builds again.
This PR pins the version of setup-buildx-action and fixes Docker builds for nightly.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94734
Approved by: https://github.com/jeanschmidt
2023-02-13 16:58:44 +00:00
c6cba1865f [Docker] Install Trition deps (#90841)
Triton needs a working gcc, so install one from apt
Also, copy `ptxas` and `cuda.h` from conda to `/usr/local/cuda`
Add `torchaudio` to the matrix
Fix typo in workflow file

Fixes https://github.com/pytorch/pytorch/issues/90377

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90841
Approved by: https://github.com/ngimel
2022-12-16 06:35:43 +00:00
074278f393 [CI] Push latest and hash+CUDAver tags (#88971)
For nightly docker build to simulate the behavior of `push_nightly_docker_ghcr.yml`

Tested in https://github.com/pytorch/pytorch/actions/runs/3465221336/jobs/5787694933

Fixes https://github.com/pytorch/pytorch/issues/88833

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88971
Approved by: https://github.com/seemethere
2022-11-14 21:54:46 +00:00
4f18739bf0 Fix Docker image generation (#88741)
Pass install channel when building nightly images
Pass `TRITON_VERSION` argument to install triton for nightly images

Fix `generate_pytorch_version.py` to work with unannotated tags and avoid failures like the following:
```
% git checkout nightly
% ./.github/scripts/generate_pytorch_version.py

fatal: No annotated tags can describe '93f15b1b54ca5fb4a7ca9c21a813b4b86ebaeafa'.
However, there were unannotated tags: try --tags.
Traceback (most recent call last):
  File "/Users/nshulga/git/pytorch/pytorch-release/./.github/scripts/generate_pytorch_version.py", line 120, in <module>
    main()
  File "/Users/nshulga/git/pytorch/pytorch-release/./.github/scripts/generate_pytorch_version.py", line 115, in main
    print(version_obj.get_release_version())
  File "/Users/nshulga/git/pytorch/pytorch-release/./.github/scripts/generate_pytorch_version.py", line 75, in get_release_version
    if not get_tag():
  File "/Users/nshulga/git/pytorch/pytorch-release/./.github/scripts/generate_pytorch_version.py", line 37, in get_tag
    dirty_tag = subprocess.check_output(
  File "/Users/nshulga/miniforge3/lib/python3.9/subprocess.py", line 424, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/Users/nshulga/miniforge3/lib/python3.9/subprocess.py", line 528, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['git', 'describe']' returned non-zero exit status 128.
```
After the change, nightly is reported as follows (due to an autolabelling issue,
which should be fixed by https://github.com/pytorch/test-infra/pull/1047):
```
 % ./.github/scripts/generate_pytorch_version.py
ciflow/inductor/26921+cpu
```

Even for tagged release commits version generation was wrong:
```
% git checkout release/1.13
% ./.github/scripts/generate_pytorch_version.py
ciflow/periodic/79617-4848-g7c98e70d44+cpu
```
After the fix, it is as expected:
```
% ./.github/scripts/generate_pytorch_version.py
1.13.0+cpu
```
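The traceback above comes from plain `git describe`, which only considers annotated tags. A minimal sketch of a more tolerant lookup is shown here, assuming `--tags` is passed so lightweight tags count too, and `--exact-match` so that only a tag pointing directly at HEAD is returned. The function name and the injectable `run` parameter are illustrative; this is not necessarily how `generate_pytorch_version.py` was changed.

```python
import subprocess
from typing import Callable, Optional


def get_exact_tag(
    run: Callable[..., bytes] = subprocess.check_output,
) -> Optional[str]:
    """Return the tag pointing at HEAD, or None if HEAD is untagged."""
    try:
        out = run(
            ["git", "describe", "--tags", "--exact-match"],
            stderr=subprocess.DEVNULL,  # keep git's "fatal:" noise out of logs
        )
    except subprocess.CalledProcessError:
        # git exits non-zero when no tag matches HEAD exactly.
        return None
    return out.decode("ascii").strip()
```

Returning `None` instead of raising lets the caller fall back to a nightly-style version when the checkout is not on a tagged release commit.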
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88741
Approved by: https://github.com/dagitses, https://github.com/msaroufim
2022-11-10 00:06:31 +00:00
ab9a19a95b [BE] Move setup-ssh step ahead of clone PyTorch (#88715)
It allows one to SSH faster rather than having to wait for repo clone to
finish.

I.e. right now one usually has to wait a few minutes before the PyTorch clone is finished, but with this change you can SSH ahead of time (thanks to `setup-ssh` being a composite action).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88715
Approved by: https://github.com/clee2000, https://github.com/izaitsevfb
2022-11-09 06:55:22 +00:00
894c4218dd ci: Just use regular checkout (#86824)
checkout-pytorch seems to have issues and is purpose made for our PR
testing and appears to conflict with what we're trying to do for binary
builds.

For builds like
https://github.com/pytorch/pytorch/actions/runs/3207520052/jobs/5242479607
there is a confusion over where the reference is pulled and I believe it is
root caused by the checkout logic in checkout-pytorch.

So with that in mind I suggest we just use the upstream checkout action
for this job

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86824
Approved by: https://github.com/atalman
2022-10-13 00:24:02 +00:00
fe89cd6c57 [BE] Use reusable workflows from test-infra (#86035)
Instead of local copies, use workflows checked into test-infra by https://github.com/pytorch/test-infra/pull/783

Thought about deleting the actions later, but if I understand how GHA merges work, older PRs merged onto these changes should not cause any problems, as they will immediately reference actions from test-infra.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86035
Approved by: https://github.com/kit1980
2022-10-01 17:21:31 +00:00
1b437718a3 ci: Add workflow to build official docker images with multiarch (#83437)
Resolves https://github.com/pytorch/pytorch/issues/80764

Signed-off-by: Eli Uriegas <seemethere101@gmail.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83437
Approved by: https://github.com/ZainRizvi, https://github.com/malfet
2022-08-16 22:45:20 +00:00