* Cleanup: context parallel
* Feat: cleanup
* Feat: concept guide
* Fix: rename + version check
* Style
* Fix: add to namespace in a test
* Fix: add skip_if on dataclass tests
* Fix: proper version for version check
* Feat: add tests and cleanup
* Fix: properly version check added tests
* Feat: address comments
* Fix: add both shift_labels and labels so model.forward calculates the loss
* Fix: remove import, improve comment
* Fix: final checks
* Fix: style
* Fix: style
* add support for SwanLabTracker and update related documentation
* add emoji in FRAMEWORK
* apply style corrections and quality checks
* add support for SwanLabTracker in tests
* fix bug in test_tracking
* add regional compilation to cli tools and env vars
* added seq parallel to Gaudi docs
* explain that lm_head is also compiled separately (see the sketch after this block of commits)
* style
* docstring
* style
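For orientation on the regional compilation commits above: rather than one `torch.compile` over the whole model, each repeated transformer block is compiled individually (so identical blocks reuse the compiled artifact) and `lm_head` is compiled separately. A minimal sketch, assuming a hypothetical GPT-style layout (`model.transformer.h`, `model.lm_head`); Accelerate's actual helper may differ:

```python
import torch
import torch.nn as nn


def compile_regions(model: nn.Module) -> nn.Module:
    # Compile each repeated block individually: torch.compile caches by
    # code object, so structurally identical blocks reuse one compiled
    # artifact, cutting cold-start compile time versus whole-model compile.
    for i, block in enumerate(model.transformer.h):
        model.transformer.h[i] = torch.compile(block)
    # lm_head has a different signature/shape than the blocks, so it is
    # compiled separately rather than folded into a block region.
    model.lm_head = torch.compile(model.lm_head)
    return model
```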
* Feat: enable FULL_STATE_DICT in config
* Feat: support FSDP2 FULL_STATE_DICT
* Refactor: remove deprecated save/load_state_dict
* Docs: add FULL_STATE_DICT as supported to docs
* Feat: update tests
* Feat: change Accelerator.get_state_dict() to use new api
* Feat: initial conversion tool draft
* Feat: add value mapping to conversion tool
* Refactor: move from os to pathlib
* Feat: add first tests
* Feat: more tests
* Feat: minor fixes + dataclass conversions
* Feat: more remapping
* Fix: namespace has no attribute version + style
* Fix: offload params behavior
* Feat: add option to only rename keys in the config file
* Fix: wrong attr name
* Fix: partially resolve comments
* Feat: work on config command + minor fixes to reflect changes
* Refactor: style + quality
* Feat: fsdp2 initial work (a sharding sketch follows this commit list)
* Feat: some cleanups and first running fsdp2
* Fix: version checks + mixed precision policy
* Refactor: style + quality
* Remove obsolete todos
* Feat: grad norm clipping
* Fix: tests + rename attrs
* Refactor: style + quality
* Fix: None object is not iterable
* Fix: default cpu_offload for fsdp2
* Fix: cpu offload now behaves correctly
* Feat: apply_activation_checkpointing
* Fix: append to models
* Feat: start on concept guide
* wip: concept guide
* Fix: toctree
* cleanup of the concept guide
* Fix: minor fixes + mp
* Fix: quality + | to union
* Feat: backwards compatibility + args cleanup
* Fix: style + quality
* Feat: enable dropping refs when getting named params
* Fix: memory footprint with fsdp2
* Feat: cpu ram efficient loading
* Fix: mp
* Fix: don't warn about sync_modules if fsdp version is 1
* Refactor: minor changes
* Small fixes + refactors
* Feat: docs + cleanup
* Feat: saving works (not sure about optim)
* More loading/saving work
* Feat: disable local_state_dict for fsdp2
* Fix: fsdp2 convergence
* Feat: working comparison script
* Feat: memory tracking fsdp2
* Feat: memory visualizer
* Feat: more work on benchmark
* Fix: raise error if model + optimizer aren't prepared together
* Minor fixes
* Style
* More warnings
* Fix: reshard_after_forward vs sharding_strategy conflict
* Refactor: clean up accelerator
* Feat: more testing in fsdp2 benchmark
* Fix: memory visualizer
* Untested: support load/save_state
* Feat: concept guide improvements
* Refactor: concept guide
* Feat: benchmark works
* Feat: more work on fsdp2 benchmark
* Fix: note syntax
* Fix: small fixes + make original tests work
* Fix: grad scaling
* Feat: reshard after forward tests
* Feat: backward prefetch tests
* Feat: tests for fsdp2
* Refactor: minor fixes
* Feat: fsdp_utils docstrings
* Feat: autodoc fsdp.md
* Docs: get_module_children_bottom_up
* Fix: remove unused images
* Refactor: benchmark cleanup
* Fix: docs
* Feat: final doc changes
* Fix: torch.distributed has no attribute tensor
* Fix: style
* Feat: tests include version in failures
* Fix: force the benchmark model to load in fp32
* Fix: rename runs
* Feat: last minor fixes
* Feat: new benchmark images
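For context on the FSDP2 commits above, the sharding flow they build on looks roughly like the following. A minimal sketch, assuming torch >= 2.6 (where `fully_shard` and `MixedPrecisionPolicy` are exposed under `torch.distributed.fsdp`) and a hypothetical `model.layers` layout; Accelerate wires this up from its FSDP plugin rather than by hand:

```python
import torch
import torch.distributed as dist
from torch.distributed.fsdp import MixedPrecisionPolicy, fully_shard

dist.init_process_group(backend="nccl")  # e.g. launched with torchrun
torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

model = build_model()  # hypothetical: returns an nn.Module with .layers

mp = MixedPrecisionPolicy(param_dtype=torch.bfloat16, reduce_dtype=torch.float32)

# Shard bottom-up: each transformer layer becomes its own FSDP2 unit and
# is resharded after forward to free memory; the root call then wraps
# whatever is left (embeddings, lm_head) as the outermost unit.
for layer in model.layers:
    fully_shard(layer, mp_policy=mp, reshard_after_forward=True)
fully_shard(model, mp_policy=mp)

# In the training loop, after backward(): FSDP2 parameters are DTensors,
# so plain clip_grad_norm_ works without an FSDP1-style wrapper:
# torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
```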
* feat: Add no_ssh multinode launcher option for deepspeed
* fix: Add CLI hints and brief documentation, add SLURM launcher, and ensure that DeepSpeed 0.14.5 is used for no_ssh
* Bookmark
* bookmark
* Add torchao base example
* Currently broken
* Clean
* DDP variant working
* FSDP as well
* Works for all but zero3
* Bookmark: currently zero3 is underperforming
* Bookmark
* Another diff
* Fin
* Fin
* Add req huggingface suite
* update tests for fp8/torchao/ddp
* Log FP8 backend used and adjust typing
* add documentation for convert_to_float8_training (see the sketch after this block of commits)
* Rename to convert_model_to_fp8_ao
* Call super init
* Add types
* Clean
* Use filter_first_and_last_linear_layers
* Update usage guide docs
* Actually loop through the zero stages
* Clean
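For context on the torchao commits above: `convert_to_float8_training` swaps eligible `nn.Linear` modules for FP8 training variants, and the first and last linear layers are typically kept in higher precision because FP8 there tends to hurt quality the most for the least speedup (the idea behind `filter_first_and_last_linear_layers`). A minimal sketch against torchao's float8 API; the tiny model and filter below are illustrative, not Accelerate's internals:

```python
import torch.nn as nn
from torchao.float8 import convert_to_float8_training

model = nn.Sequential(
    nn.Linear(1024, 4096), nn.ReLU(),
    nn.Linear(4096, 4096), nn.ReLU(),
    nn.Linear(4096, 1024),
).to("cuda")

first, last = "0", "4"  # fully qualified names of the first/last Linear


def module_filter_fn(mod: nn.Module, fqn: str) -> bool:
    # Convert only Linear layers, skipping the first and last ones so
    # they stay in their original precision.
    return isinstance(mod, nn.Linear) and fqn not in (first, last)


convert_to_float8_training(model, module_filter_fn=module_filter_fn)
```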
* Add cross-entropy example in the gradient accumulation docs (see the sketch after this commit list)
* add example of logs
* correct skeleton code
* replace gather_for_metrics with gather
* batch_size -> per_device_batch_size
* remove main_process_only=True
* add autoregressive example in examples/
* Update docs/source/usage_guides/gradient_accumulation.md
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* ruff format
* add grad accum test
* update docs
* Update examples/by_feature/gradient_accumulation_for_autoregressive_models.py
Co-authored-by: Zach Mueller <muellerzr@gmail.com>
* update tests
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Zach Mueller <muellerzr@gmail.com>
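The point of that example, in brief: with token-level cross-entropy, averaging the loss per micro-batch biases gradient accumulation whenever micro-batches contain different numbers of target tokens; summing the loss and normalizing once by the total token count fixes this. A minimal plain-PyTorch sketch of the idea (Accelerate's `accumulate()` helper and its internal loss scaling are deliberately left out; `-100` is the usual ignored-label value):

```python
import torch.nn.functional as F


def accumulated_step(model, micro_batches, optimizer):
    # Count target tokens across *all* micro-batches first, so every
    # micro-batch's loss is normalized by the same total.
    total_tokens = sum(
        (mb["labels"][..., 1:] != -100).sum() for mb in micro_batches
    )
    optimizer.zero_grad()
    for mb in micro_batches:
        logits = model(mb["input_ids"]).logits
        # Shift so each position predicts the next token (autoregressive).
        shift_logits = logits[..., :-1, :].contiguous()
        shift_labels = mb["labels"][..., 1:].contiguous()
        loss = F.cross_entropy(
            shift_logits.view(-1, shift_logits.size(-1)),
            shift_labels.view(-1),
            ignore_index=-100,
            reduction="sum",  # sum, not mean: we normalize globally below
        ) / total_tokens
        loss.backward()  # gradients accumulate across micro-batches
    optimizer.step()
```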
* [WIP] FEAT Decorator to purge accelerate env vars
In some circumstances, calling certain classes or functions can result
in accelerate env vars being set and not being cleaned up afterwards. As
an example, when calling:
TrainingArguments(fp16=True, ...)
The following env var will be set:
ACCELERATE_MIXED_PRECISION=fp16
This can affect subsequent code, since the env var takes precedence over
TrainingArguments(fp16=False). This is especially relevant for unit
testing, where we want to avoid individual tests having side effects
on one another. Decorate the unit test function or whole class
with this decorator to ensure that after each test, the env vars are
cleaned up. This works for both unittest.TestCase and normal
classes (pytest); it also works when decorating the parent class.
In its current state, this PR adds the new decorator and tests it, but
the decorator is not yet applied to potentially problematic functions or
classes. A sketch of the approach follows this commit list.
* Linter
* Refactor code to be more readable
---------
Co-authored-by: Zach Mueller <muellerzr@gmail.com>
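A minimal sketch of what such a decorator can look like (the name `purge_accelerate_env_vars` and the exact restore strategy are illustrative, not necessarily what the PR ships):

```python
import functools
import os


def purge_accelerate_env_vars(test_func_or_cls):
    """Restore ACCELERATE_* env vars to their pre-test state after each test."""
    if isinstance(test_func_or_cls, type):
        # Decorating a class: wrap every test_* method so it works for
        # unittest.TestCase subclasses and plain pytest classes alike,
        # including methods inherited from a decorated parent class.
        for name in dir(test_func_or_cls):
            if name.startswith("test_"):
                method = getattr(test_func_or_cls, name)
                setattr(test_func_or_cls, name, purge_accelerate_env_vars(method))
        return test_func_or_cls

    @functools.wraps(test_func_or_cls)
    def wrapper(*args, **kwargs):
        # Snapshot the ACCELERATE_* vars that exist before the test runs.
        snapshot = {k: v for k, v in os.environ.items() if k.startswith("ACCELERATE_")}
        try:
            return test_func_or_cls(*args, **kwargs)
        finally:
            # Remove vars the test added, then restore modified originals.
            for key in [k for k in os.environ if k.startswith("ACCELERATE_")]:
                del os.environ[key]
            os.environ.update(snapshot)

    return wrapper
```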
* rebase
* Update torch v
* Rename
* Prop to docs
* Actually reverse states
* Rebase fully
* Restore old state
* Keep as load()
* No need for explicit anymore
* Check NumPy version: np.dtypes was added in 1.25
* Clean up diff
* Fix hang
* Bookmark
* Migratory
* Uncomment
* Rm name to model for now
* Rm container
* Left: test
* Allow only wrapping one model
* Add warning but only ref once
* Refine
* Update src/accelerate/accelerator.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Finish stas nits
* Clean
* Fixup test + test writing
* Fully working
* Fin
* Nit
* Quality
* Update src/accelerate/accelerator.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Actionable error
* Make note of when its enabled
* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Merge tests
* Merge
* Add currently broken test script
* Push the working implementation
* Fin
* Add guards for user behavior
* Test nits
* TODO: finish knowledge distillation example
* Update tests/deepspeed/test_deepspeed_multiple_model.py
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
* Allow for dict-like interface
* Get rid of disable
* Uncomment
* Complete rewrite to force a dict to be used
* Working tests/fin
* Use name as stas suggestion
* Clean
* docnit
* toctree
* toctree
* Missing ref
* Put in break
* Smaller diff
* Make note on how to use ZeRO init
* Make note about accelerator ds plugin
* More docnits
* Apply suggestions from code review
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
* Prevent users from passing another DeepSpeed plugin to a second Accelerator
* Raise NotImplementedError + make a note about why no params
* Apply suggestions from code review from Stas
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Add deepspeed_plugins arg + update doc (see the sketch at the end of this commit list)
* Plugin -> plugins
* Change enable() -> select()
* Update ref properly + test
* Be consistent, model1,model2...
* first_, second_
* A few more auto values
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
---------
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
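To make the commits above concrete: the dict-based interface lets one `Accelerator` manage several DeepSpeed configs, with one plugin active at a time via `select()`. A rough sketch of the intended usage pattern as described by these commits (knowledge distillation is the motivating case; the config paths and `build_models` helper are placeholders, and the exact selection call may differ):

```python
from accelerate import Accelerator
from accelerate.utils import DeepSpeedPlugin

# One plugin per model; the dict keys are user-chosen names.
deepspeed_plugins = {
    "student": DeepSpeedPlugin(hf_ds_config="student_zero3.json"),
    "teacher": DeepSpeedPlugin(hf_ds_config="teacher_zero3.json"),
}
accelerator = Accelerator(deepspeed_plugins=deepspeed_plugins)

student_model, teacher_model = build_models()  # hypothetical helper

# The first plugin in the dict starts out active; prepare the student.
student = accelerator.prepare(student_model)

# Switch the active plugin before preparing the second model.
deepspeed_plugins["teacher"].select()
teacher = accelerator.prepare(teacher_model)
```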