Mirror of https://github.com/deepspeedai/DeepSpeed.git, synced 2025-10-20 15:33:51 +08:00
* bf16 updates
* Got bf16 working
* fp32 reduction; flattened tensors
* bf16+zero_stage_1 first cut
* finish zero_stage 1 sharding
* Matching fp16 with debugging codes
* Matching loss with fp16
* Fix gradient clipping
* bf16 gradient clipping fix
* bf16 checkpoint save/load
* Unscale grad norm
* Fix grad norm scaling
* Enable loading fp16_zero_1 into bf16_zero_1 engine and vice versa
* Fix clip_grad key error
* Reduce tied weight gradients
* Fix grad norm for moe
* Reduce specified gradients
* Use O(n) instead of O(n^2)
* Remove optimizer restriction for bf16
* Link bf16 & fp32 params
* Clip gradients of last stage tied weights
* Simplify tied weights reduction logic
* Also clip all tp rank parameters
* lp to hp mapping
* Link lp/hp/optim state; Refresh links after checkpoint load
* Remove debug print
* Remove debug print
* Simplify zero_grad logic
* fp32 accessors
* Fix update bug

Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
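Several items above ("Link bf16 & fp32 params", "lp to hp mapping", "fp32 accessors") revolve around the standard master-weights pattern for mixed precision: low-precision (lp) bf16 parameters run forward/backward, while linked high-precision (hp) fp32 copies receive the gradients and the optimizer step. Below is a minimal PyTorch sketch of that linkage, not DeepSpeed's actual internals; the `_hp_param` attribute name is illustrative.

```python
import torch

# Minimal sketch of the lp -> hp master-weights pattern: bf16 ("lp")
# parameters do the compute, linked fp32 ("hp") copies take the update.
# `_hp_param` is an illustrative attribute, not DeepSpeed's API.
model = torch.nn.Linear(4, 4).bfloat16()

hp_params = []
for lp in model.parameters():
    hp = lp.detach().clone().float()  # fp32 master copy
    lp._hp_param = hp                 # link lp -> hp
    hp_params.append(hp)

optimizer = torch.optim.SGD(hp_params, lr=0.1)

# One training step: compute in bf16, update in fp32, copy back.
loss = model(torch.randn(2, 4, dtype=torch.bfloat16)).sum()
loss.backward()

for lp in model.parameters():
    lp._hp_param.grad = lp.grad.float()  # hand the fp32 copy its gradient
optimizer.step()
optimizer.zero_grad(set_to_none=True)

with torch.no_grad():
    for lp in model.parameters():
        lp.copy_(lp._hp_param)  # refresh bf16 params from fp32 masters
        lp.grad = None
```

Keeping an explicit per-parameter link, rather than positional bookkeeping, is what makes it possible to re-establish lp/hp/optimizer-state associations after a checkpoint load, as the commit list mentions.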
.gitignore (32 lines, 384 B, Plaintext)
*.pyc
.idea/
*~
*.swp
*.log
deepspeed/git_version_info_installed.py
__pycache__

# Build + installation data
build/
dist/
*.so
deepspeed.egg-info/
build.txt

# Website
docs/_site/
docs/build
docs/code-docs/source/_build
docs/code-docs/_build
docs/code-docs/build
.sass-cache/
.jekyll-cache/
.jekyll-metadata

# Testing data
tests/unit/saved_checkpoint/

# Dev/IDE data
.vscode
.theia