fd40516923
Update GH org references (#6998)
...
Signed-off-by: Olatunji Ruwase <olruwase@microsoft.com>
Signed-off-by: Logan Adams <loadams@microsoft.com>
Signed-off-by: Fabien Dupont <fdupont@redhat.com>
Co-authored-by: Fabien Dupont <fabiendupont@fabiendupont.fr>
2025-02-05 00:56:50 +00:00
7f3d669b40
Remove Duplicate Declaration of pandas in Dockerfile (#6959)
...
### Description
This pull request removes the redundant installation of `pandas` from
the `Dockerfile`.
It was previously declared twice, and this update eliminates the
duplicate entry, improving the clarity and maintainability of the
`Dockerfile`.
018ece5af2/docker/Dockerfile (L124)
018ece5af2/docker/Dockerfile (L135)
### Changes
Removed the duplicate pandas installation line from the `RUN pip
install` command.
2025-01-17 17:44:49 +00:00
a1b0c35a1d
Switch what versions of python are supported (#5676)
...
Add support for testing compilation with python 3.11/3.12.
Also add the dockerfiles used to build those images.
---------
Co-authored-by: Michael Wyatt <michael.wyatt@snowflake.com>
2024-11-06 20:37:52 -08:00
15ed83a9a6
Update dockerfile with updated versions (#4780)
...
Fixes #4763
2023-12-07 19:25:57 +00:00
e31b40411f
fix: remove unnecessary # punct in the second sed command (#4061)
2023-07-31 16:58:31 +00:00
45cecc05fb
fix "ERROR: failed to solve: nvidia/cuda:11.7.0-devel-ubuntu18.04: docker.io/nvidia/cuda:11.7.0-devel-ubuntu18.04: not found" (#3930)
...
Update Nvidia docker version.
Fix "ERROR: failed to solve: nvidia/cuda:11.7.0-devel-ubuntu18.04: docker.io/nvidia/cuda:11.7.0-devel-ubuntu18.04: not found"
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
2023-07-13 12:27:29 -07:00
a65f6b9e9b
Update Dockerfile with newer cuda and torch. (#3716)
...
* Add non-interactive setting; the interactive prompt was causing issues for some users
* Update pytorch version too
2023-06-09 12:31:03 -07:00
ab1d2f826b
Update Dockerfile (#3298)
...
line 98 should be
curl -O https://bootstrap.pypa.io/pip/3.6/get-pip.py && \
to avoid
#16 106.9 ERROR: This script does not work on Python 3.6 The minimum supported Python version is 3.7. Please use https://bootstrap.pypa.io/pip/3.6/get-pip.py instead.
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
2023-04-20 12:06:07 -07:00
7e2103f8e2
Use rocm/pytorch:latest (#2613)
2022-12-15 14:03:09 -08:00
7bcb4fabeb
Enable CG headers on ROCm (#1821)
...
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
2022-03-11 12:06:41 -08:00
ac71a1a461
[docker] simplify and update rocm dockerfile (#1819)
2022-03-09 15:23:27 -08:00
c3c8d5dd93
AMD support (#1430)
...
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Jithun Nair <jithun.nair@amd.com>
Co-authored-by: rraminen <rraminen@amd.com>
Co-authored-by: Jeff Daily <jeff.daily@amd.com>
Co-authored-by: okakarpa <okakarpa@amd.com>
Co-authored-by: rraminen <rraminen@amd.com>
Co-authored-by: Jithun Nair <37884920+jithunnair-amd@users.noreply.github.com>
Co-authored-by: Jeff Daily <jeff.daily@amd.com>
Co-authored-by: okakarpa <okakarpa@amd.com>
Co-authored-by: Ramya Ramineni <62723901+rraminen@users.noreply.github.com>
2022-03-03 01:53:35 +00:00
599258f979
ZeRO 3 Offload (#834)
...
* Squash stage3 v1 (#146)
Co-authored-by: Samyam <samyamr@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Samyam Rajbhandari <samyamr@microsoft.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Shaden Smith <Shaden.Smith@microsoft.com>
Co-authored-by: Shaden Smith <ShadenTSmith@gmail.com>
Co-authored-by: eltonzheng <eltonz@microsoft.com>
* Fix correctness bug (#147)
* formatting fix (#150)
* stage3 bugfix (API) update and simplified FP16 Z3 tests (#151)
* fp16 Z3 API update and bugfix
* revert debug change
* ZeRO-3 detach and race condition bugfixes (#149)
* trying out ZeRO-3 race condition fix
* CUDA sync instead of stream
* reduction stream sync
* remove commented code
* Fix optimizer state_dict KeyError (#148)
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
* fix for smaller SGS sizes, ensures each grad is backed by unique tensors (#152)
* Simplifying the logic for getting averaged gradients (#153)
* skip for now
* Z3 Docs redux (#154)
* removing some TODOs and commented code (#155)
* New Z3 defaults (#156)
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
* formatting
* megatron external params
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Shaden Smith <Shaden.Smith@microsoft.com>
Co-authored-by: Shaden Smith <ShadenTSmith@gmail.com>
Co-authored-by: eltonzheng <eltonz@microsoft.com>
2021-03-08 12:54:54 -08:00
b29229bf52
update docker image and bump DSE
2020-09-10 17:18:18 +00:00