Mirror of https://github.com/deepspeedai/DeepSpeed.git (synced 2025-10-20 15:33:51 +08:00)
Train {GPT, LLaMA, Phi}-like models (or any model) at ultra-low cost with DeepSpeed Universal Checkpointing (UCP). UCP abstracts away the complexities of saving and loading model states. See the arXiv paper, blog, and tutorial in this PR for details.

---------

Co-authored-by: Masahiro Tanaka <mtanaka@microsoft.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Logan Adams <loadams@microsoft.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
175 KiB
984x519px
