320 Commits

SHA1 Message Date
62ede1ed2a CP docs typos fixed (#3761) 2025-09-05 12:23:33 +02:00
6891c57072 Feat: context parallel v2.0 (#3700)
* Cleanup: context parallel

* Feat: cleanup

* Feat: concept guide

* Fix: rename + version check

* Style

* Fix: add to namespace in a test

* Fix: add skip_if on dataclass tests

* Fix: proper version for version check

* Feat: add tests and cleanup

* Fix: properly version check added tests

* Feat: address comments

* Fix: add both shift_labels and labels to make the model.forward calculate loss

* Fix: remove import, improve comment

* Fix: final checks

* Fix: style

* Fix: style
2025-08-05 16:17:13 +02:00
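One fix above passes both `labels` and pre-shifted `shift_labels` into `model.forward` so the loss is computed inside the model on each sequence shard. A minimal sketch of that call pattern, assuming a prepared transformers-style causal LM `model` whose loss function accepts a `shift_labels` override (all tensors here are illustrative):

```python
import torch

input_ids = torch.randint(0, 100, (1, 8))  # toy batch
labels = input_ids.clone()

# Pre-shift once on the full sequence: position i predicts token i+1,
# with the final position ignored (-100 is the ignore index).
shift_labels = torch.full_like(labels, -100)
shift_labels[:, :-1] = labels[:, 1:]

# With context parallelism the sequence dimension is sharded, so passing
# both variants lets model.forward compute the loss locally per shard.
outputs = model(input_ids=input_ids, labels=labels, shift_labels=shift_labels)
loss = outputs.loss
```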
e2cc537db8 trackio (#3669)
* trackio

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Abubakar Abid <abubakar@huggingface.co>

* seven -> eight

* Add trackio as a real tracker instead

* Sort

* Style

* Style

* Remove step

* Disable trackio on Python < 3.10

* Update src/accelerate/tracking.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* More style

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Abubakar Abid <abubakar@huggingface.co>
2025-07-15 17:17:49 +02:00
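Since trackio is registered as a regular tracker here, the standard Accelerate tracking flow applies. A short sketch, assuming "trackio" is the tracker's string identifier (project name and metrics are placeholders; per the commits above it is disabled on Python < 3.10):

```python
from accelerate import Accelerator

accelerator = Accelerator(log_with="trackio")  # tracker added by this PR
accelerator.init_trackers("my-project")        # placeholder project name

for step in range(10):
    loss = 1.0 / (step + 1)                    # placeholder metric
    accelerator.log({"loss": loss}, step=step)

accelerator.end_training()
```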
5987d79a53 Update gradient_accumulation.md (#3649) 2025-06-23 11:58:31 +02:00
6597dae780 Integrate SwanLab for offline/online experiment tracking for Accelerate (#3605)
* add support for SwanLabTracker and update related documentation

* add emoji in FRAMEWORK

* apply the style corrections and quality control

* add support for SwanLabTracker in tests

* fix bug in test_tracking
2025-06-18 15:42:29 +02:00
2eaf5cdbbc remove ipex.optimize in accelerate (#3608)
* remove ipex.optimize in accelerate

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* fix mis-style

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* Update intel_cpu.md

* Update launch.py

* fix comments

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* add logging

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* Update launch.py

* Apply style fixes

---------

Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-06-17 11:08:19 +02:00
ee2f48c2c3 [docs] no hard-coded cuda in the ddp documentation (#3589)
* make device-agnostic

* refactor
2025-05-27 11:16:42 +02:00
e6e717589e Add regional compilation to cli tools and env vars (#3572)
* add regional compilation to cli tools and env vars

* added seq parallel to gaudi docs

* explain that lm_head is also compiled separately

* style

* docstring

* style
2025-05-15 11:30:27 +02:00
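For context, regional compilation applies `torch.compile` to each repeated block of the model separately instead of to the whole graph, which cuts cold-start compilation time. A sketch using the `compile_regions` helper introduced by this line of work (exact API assumed from the commit messages):

```python
from accelerate.utils import compile_regions
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")

# Each repeated transformer block is compiled once and the artifact is
# reused; per the commit above, lm_head is compiled as its own region.
model = compile_regions(model)
```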
32874257f3 Add Gaudi doc (#3537)
* Add Gaudi doc

* Address comment from review

* Remove point about region compilation

---------

Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>
2025-05-13 18:27:33 +02:00
3524a504c8 update path (#3561) 2025-05-13 13:57:29 +02:00
7b5774ac55 Dynamo regional compilation (#3529) 2025-05-12 09:49:29 +02:00
7013365791 fix typos (#3549) 2025-05-08 14:10:12 +02:00
d02e51cc21 Update big_modeling.md for layerwise casting (#3548)
* Update big_modeling.md for layerwise casting

* doc fix
2025-05-06 09:50:53 +02:00
0af45bf1e8 Fix logic in accelerator.prepare + IPEX for 2+ nn.Models and/or optim.Optimizers (#3517)
* Fix logic in _prepare_ipex

* Add caution about prepare in IPEX docs

* Add suggested workaround to IPEX docs

* Revert unnecessary change

* Update docs/source/usage_guides/ipex.md

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Remove double space

* Simplify logical checks for IPEX availability

* Revert unnecessary change

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-04-25 17:31:36 +02:00
8fb073536a [FSDP2] Enable FULL_STATE_DICT (#3527)
* Feat: enable FULL_STATE_DICT in config

* Feat: support FSDP2 FULL_STATE_DICT

* Refactor: remove deprecated save/load_state_dict

* Docs: add FULL_STATE_DICT as supported to docs

* Feat: update tests

* Feat: change Accelerator.get_state_dict() to use new api
2025-04-23 18:03:45 +02:00
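With FULL_STATE_DICT supported under FSDP2, the updated `Accelerator.get_state_dict()` gathers the sharded parameters into one unsharded state dict. A minimal sketch, assuming an existing `accelerator` and a `model` prepared with an FSDP2 plugin:

```python
import torch

# Collects the full, unsharded state dict from the FSDP2-wrapped model.
state_dict = accelerator.get_state_dict(model)
if accelerator.is_main_process:
    torch.save(state_dict, "model_state.pt")
```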
4f35cf713c Solve link error in internal_mechanism documentation (#3506) (#3507)
* Solve link error in internal_mechanism (#3506)

* Link correctly to documentation (#3506)
2025-04-23 17:47:25 +02:00
8c0a29626d Update low_precision_training.md (#3488) 2025-04-08 11:39:58 +02:00
67a768be07 remove use_xpu to fix ut issues, we don't need this since XPU is OOB supported now (#3460)
* remove use_xpu to fix ut issues, we don't need this since XPU is OOB supported now

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>

* add deprecate warnings

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* fix

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

---------

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
2025-04-01 11:55:37 +02:00
d7c741a6bc Initial FSDP2 support (#3394)
* Feat: initial conversion tool draft

* Feat: add value mapping to conversion tool

* Refactor: move from os to pathlib

* Feat: add first tests

* Feat: more tests

* Feat: minor fixes + dataclass conversions

* Feat: more remapping

* Fix: namespace has no attribute version + style

* Fix: offload params behavior

* Feat: add option to only rename keys in the config file

* Fix: wrong attr name

* Fix: partially resolve comments

* Feat: work on config command + minor fixes to reflect changes

* Refactor: style + quality

* Feat: fsdp2 initial work

* Feat: some cleanups and first running fsdp2

* Fix: version checks + mixed precision policy

* Refactor: style + quality

* Remove obsolete todos

* Feat: grad norm clipping

* Fix: tests + rename attrs

* Refactor: style + quality

* Fix: None object is not iterable

* Fix: default cpu_offload for fsdp2

* Fix: cpu offload now behaves correctly

* Feat: apply_activation_checkpointing

* Fix: append to models

* Feat: start on concept guide

* wip: concept guide

* Fix: toctree

* cleanup of the concept guide

* Fix: minor fixes + mp

* Fix: quality + | to union

* Feat: backwards compatibility + args cleanup

* Fix: style + quality

* Feat: enable dropping refs when getting named params

* Fix: memory footprint with fsdp2

* Feat: cpu ram efficient loading

* Fix: mp

* Fix: don't warn about sync_modules if fsdp version is 1

* Refactor: minor changes

* Small fixes + refactors

* Feat: docs + cleanup

* Feat: saving works (not sure about optim)

* More loading/saving work

* Feat: disable local_state_dict for fsdp2

* Fix: fsdp2 convergence

* Feat: working comparison script

* Feat: memory tracking fsdp2

* Feat: memory visualizer

* Feat: more work on benchmark

* Fix: raise error if model+optimizer aren't prepared together

* Minor fixes

* Style

* More warnings

* Fix: reshard_after_forward vs sharding_strategy conflict

* Refactor: clean up accelerator

* Feat: more testing in fsdp2 benchmark

* Fix: memory visualizer

* Untested: support load/save_state

* Feat: concept guide improvements

* Refactor: concept guide

* Feat: benchmark works

* Feat: more work on fsdp2 benchmark

* Fix: note syntax

* Fix: small fixes + make original tests work

* Fix: grad scaling

* Feat: reshard after forward tests

* Feat: backward prefetch tests

* Feat: tests for fsdp2

* Refactor: minor fixes

* Feat: fsdp_utils docstrings

* Feat: autodoc fsdp.md

* Docs: get_module_children_bottom_up

* Fix: remove unused images

* Refactor: benchmark cleanup

* Fix: docs

* Feat: final doc changes

* Fix: torch.distributed has no attribute tensor

* Fix: style

* Feat: tests include version in failures

* Fix: benchmark force model to load in fp32

* Fix: rename runs

* Feat: last minor fixes

* Feat: new benchmark images
2025-03-27 15:01:18 -04:00
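A sketch of opting into FSDP2 programmatically, assuming the `fsdp_version` field on the plugin is the switch this work introduces (config-file users would set the equivalent key; `model` and `optimizer` are placeholders):

```python
from accelerate import Accelerator, FullyShardedDataParallelPlugin

# fsdp_version=2 selects the new code path; note from the commits above
# that reshard_after_forward replaces FSDP1's sharding_strategy.
fsdp_plugin = FullyShardedDataParallelPlugin(fsdp_version=2)
accelerator = Accelerator(fsdp_plugin=fsdp_plugin)

# Per "raise error if model+optimizer aren't prepared together" above,
# both objects go through a single prepare() call.
model, optimizer = accelerator.prepare(model, optimizer)
```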
6de900e10a feat: Add no_ssh and slurm multinode launcher options for deepspeed (#3329)
* feat: Add no_ssh multinode launcher option for deepspeed

* fix: Add CLI hints and brief documentation, add slurm launcher, and ensure that deepspeed version 0.14.5 is used for nossh
2025-03-20 10:33:00 -04:00
ac3749dc11 Add Tecorigin SDAA accelerator support (#3330)
Co-authored-by: siqi <siqi@tecorigin.com>
2025-03-05 10:11:21 +01:00
90f81986b9 minor doc fixes (#3365) 2025-02-25 15:52:26 +01:00
fa26dc6156 add missing import (#3396) 2025-02-25 11:07:14 +01:00
8039158d71 Torchao float8 training (#3348)
* Bookmark

* bookmark

* Add torchao base example

* Currently broken

* Clean

* DDP variant working

* FSDP as well

* Works for all but zero3

* Bookmark: currently zero3 is underperforming

* Bookmark

* Another diff

* Fin

* Fin

* Add req huggingface suite

* update tests for fp8/torchao/ddp

* Log FP8 backend used and adjust typing

* add documentation for convert_to_float8_training

* Rename to convert_model_to_fp8_ao

* Call super init

* Add types

* Clean

* Use filter_first_and_last_linear_layers

* Update usage guide docs

* Actually loop through the zero stages

* Clean
2025-02-17 11:51:47 -05:00
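The work above builds on torchao's float8 training conversion, with helpers to keep the first and last linear layers (which are most precision-sensitive) out of float8. A rough sketch of the underlying torchao call, with the filter logic simplified for illustration:

```python
import torch
from torchao.float8 import convert_to_float8_training

model = torch.nn.Sequential(torch.nn.Linear(8, 8))  # placeholder model

def module_filter_fn(module, fqn):
    # Convert only nn.Linear layers, and keep lm_head (the last linear
    # layer in a causal LM) in higher precision; an illustrative stand-in
    # for accelerate's filter_first_and_last_linear_layers helper.
    return isinstance(module, torch.nn.Linear) and "lm_head" not in fqn

convert_to_float8_training(model, module_filter_fn=module_filter_fn)
```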
e34db4d0d2 enable xpu (#3397) 2025-02-17 17:41:50 +01:00
5cc99e6e02 fix: typos in documentation files (#3388)
* Update test_scheduler.py

* Update test_big_modeling.py

* Update test_state_checkpointing.py

* Update test_script.py

* Update cli.md

* Update quicktour.md
2025-02-10 13:11:50 -05:00
ce63623421 works for fp8 with deepspeed (#3361)
* works for fp8 with deepspeed

* Add tests

2025-02-10 09:31:15 -05:00
f076495580 deepspeed github repo move (#3376) 2025-02-03 13:52:08 -05:00
b13aadcb67 Bye bye torch <2 (#3331)
* Bye bye torch <2

* Add 2.6.0 dl args

* Rm require fsdp

* Adjust imports + 2.0 specific modeling code

* Bring back is_bf16
2025-01-09 12:11:08 -05:00
acfbf72a7f Give example on how to handle gradient accumulation with cross-entropy (#3193)
* Add cross-entropy example in the gradient accumulation docs

* add example of logs

* correct skeleton code

* replace gather_for_metrics with gather

* batch_size -> per_device_batch_size

* remove main_process_only=True

* add autoregressive example in examples/

* Update docs/source/usage_guides/gradient_accumulation.md

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* ruff format

* add grad accum test

* update docs

* Update examples/by_feature/gradient_accumulation_for_autoregressive_models.py

Co-authored-by: Zach Mueller <muellerzr@gmail.com>

* update tests

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Zach Mueller <muellerzr@gmail.com>
2024-12-24 12:26:45 +01:00
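The crux of the example added here: with token-level cross-entropy, letting Accelerate average the loss per micro-batch and divide by the number of accumulation steps over-weights micro-batches that contain fewer real tokens, so the example sums the loss and normalizes by the total token count instead. A condensed sketch (`batches`, `model`, `optimizer`, and `accelerator` are assumed to already exist; -100 marks padding):

```python
import torch

loss_fn = torch.nn.CrossEntropyLoss(reduction="sum")  # sum, not mean

# Total number of next-token targets across the accumulated micro-batches
# (the full example also gathers this count across processes).
num_items = sum((b["labels"][..., 1:] != -100).sum() for b in batches)

for batch in batches:
    with accelerator.accumulate(model):
        logits = model(batch["input_ids"]).logits
        shift_logits = logits[..., :-1, :].contiguous()
        shift_labels = batch["labels"][..., 1:].contiguous()
        loss = loss_fn(shift_logits.view(-1, shift_logits.size(-1)),
                       shift_labels.view(-1)) / num_items
        # backward() divides by gradient_accumulation_steps internally,
        # so the example multiplies it back out to keep the sum intact.
        accelerator.backward(loss * accelerator.gradient_accumulation_steps)
        optimizer.step()
        optimizer.zero_grad()
```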
3e62fbb09c [docs] no hard-coding cuda (#3270)
* no hard-coding cuda

* Update docs/source/usage_guides/big_modeling.md

Co-authored-by: Zach Mueller <muellerzr@gmail.com>

* update device_type

---------

Co-authored-by: Zach Mueller <muellerzr@gmail.com>
2024-12-10 21:32:10 -05:00
cb8b7c637a Fixed typos for Tutorials and Guides docs (#3274) 2024-12-06 10:39:45 -05:00
aa16d69561 [docs] use real path for checkpoint (#3220)
* fix bug

* update
2024-12-06 10:39:29 -05:00
51fd482d6e [docs] update set-seed (#3228)
* update set-seed

* update comment
2024-12-06 10:38:59 -05:00
dd68af886a Update troubleshooting.md (#3259)
I think set_breakpoint and check_breakpoint have been renamed to set_trigger and check_trigger
2024-12-02 13:41:10 -05:00
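For reference, the renamed pair implements cross-process early stopping: any single process can set the trigger, and every process subsequently observes it. A minimal sketch with a placeholder training loop:

```python
import torch
from accelerate import Accelerator

accelerator = Accelerator()

for batch in dataloader:              # placeholder dataloader
    loss = training_step(batch)       # placeholder step returning a tensor
    if torch.isnan(loss):             # may fire on one process only
        accelerator.set_trigger()
    if accelerator.check_trigger():   # True on every process once set
        break
```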
1f508a6df6 Update deferring_execution.md (#3262) 2024-12-02 13:40:33 -05:00
29be478862 [WIP] FEAT Decorator to purge accelerate env vars (#3252)
* [WIP] FEAT Decorator to purge accelerate env vars

In some circumstances, calling certain classes or functions can result
in accelerate env vars being set and not being cleaned up afterwards. As
an example, when calling:

TrainingArguments(fp16=True, ...)

The following env var will be set:

ACCELERATE_MIXED_PRECISION=fp16

This can affect subsequent code, since the env var takes precedence over
TrainingArguments(fp16=False). This is especially relevant for unit
testing, where we want to prevent individual tests from having side
effects on one another. Decorate the unit test function or whole class
with this decorator to ensure that after each test, the env vars are
cleaned up. This works for both unittest.TestCase and normal
classes (pytest); it also works when decorating the parent class.

In its current state, this PR adds the new decorator and tests it, but
the decorator is not yet applied to potentially problematic functions or
classes.

* Linter

* Refactor code to be more readable

2024-11-25 12:04:56 -05:00
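A minimal sketch of what such a decorator can look like (illustrative only: the name `purge_accelerate_env_vars` is hypothetical and the utility actually added by the PR may differ in details):

```python
import functools
import os

PREFIX = "ACCELERATE_"

def purge_accelerate_env_vars(func):
    """Hypothetical: snapshot ACCELERATE_* env vars before the wrapped
    test runs and restore exactly that snapshot afterwards."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        saved = {k: v for k, v in os.environ.items() if k.startswith(PREFIX)}
        try:
            return func(*args, **kwargs)
        finally:
            # Drop anything the test added, then restore the snapshot.
            for key in [k for k in os.environ if k.startswith(PREFIX)]:
                del os.environ[key]
            os.environ.update(saved)
    return wrapper
```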
069743775e [docs] add instruction to install bnb on non-cuda devices (#3227)
* add bnb installation link

* add period

* add xpu comment and fix some bugs

* style fix
2024-11-20 16:58:46 -05:00
8ad2b3b8e7 [docs] update code in tracking documentation (#3235)
* update example code

* revert
2024-11-20 10:04:07 -05:00
cf169a1ae6 enable find_executable_batch_size on XPU (#3236)
* enable on XPU

* Update src/accelerate/utils/memory.py

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>

---------

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
2024-11-19 12:29:05 -05:00
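`find_executable_batch_size` retries a function with a halved batch size whenever it hits an out-of-memory error; this change extends the OOM detection to XPU. Typical usage (training internals elided):

```python
from accelerate.utils import find_executable_batch_size

@find_executable_batch_size(starting_batch_size=128)
def train(batch_size):
    # On OOM (now detected on XPU as well as CUDA), the decorator frees
    # memory, halves batch_size, and calls this function again.
    ...

train()  # batch_size is injected by the decorator; don't pass it yourself
```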
bf4572b6ce [Utils] align_module_device (#3204)
* implement align_module

* add docs

* move to modeling utils, integrate into existing source code

* update source, expose through utils

* Suggested docstring

Co-authored-by: Zach Mueller <muellerzr@gmail.com>

* Rewrite for readability, add try finally

Co-authored-by: Zach Mueller <muellerzr@gmail.com>

* Use try-finally when aligning with hook

Co-authored-by: Zach Mueller <muellerzr@gmail.com>

* apply style

* improve get_state_dict_from_offload readability

* Update docstring

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>

* rename to align_module_device, update docstring

---------

Co-authored-by: Zach Mueller <muellerzr@gmail.com>
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
2024-11-01 09:05:50 -04:00
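`align_module_device` is a context manager that temporarily moves a module's parameters onto a given execution device (handling hook-offloaded weights too) and moves them back on exit, using try/finally as the review comments note. A brief sketch with a placeholder module:

```python
import torch
from accelerate.utils import align_module_device

module = torch.nn.Linear(4, 4)  # placeholder module
inputs = torch.randn(1, 4)

# Parameters live on the requested device only inside this block.
with align_module_device(module, execution_device="cpu"):
    outputs = module(inputs)
```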
497eb3cf86 fix bug (#3166) 2024-10-31 09:08:20 -04:00
4dda5797bd [docs] use nn.Module instead of tensor as model (#3157)
* use nn.Module instead of tensor

Signed-off-by: Lin, Fanli <fanli.lin@intel.com>

* fix neptune

---------

Signed-off-by: Lin, Fanli <fanli.lin@intel.com>
2024-10-23 12:23:16 -04:00
c809f8e45c [docs] update neptune API (#3181) 2024-10-23 12:14:52 -04:00
735dfa3018 [Utils] has_offloaded_params (#3188)
* implement has_offloaded_params

* update docstring

* expose to utils

* add docs

* apply style, quality

* add tests
2024-10-23 16:44:02 +02:00
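`has_offloaded_params` reports whether a module's weights are managed by an offloading AlignDevicesHook, which is worth checking before reading or mutating weights directly. A tiny sketch:

```python
import torch
from accelerate.utils import has_offloaded_params

module = torch.nn.Linear(4, 4)

# False here: no offloading hook is attached to this plain module.
print(has_offloaded_params(module))
```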
6f79b63b86 Trigger weights_only=True by default for all compatible objects (#3036)
* rebase

* Update torch v

* Rename

* Prop to docs

* Actually reverse states

* Rebase fully

* Restore old state

* Keep as load()

* No need for explicit anymore

* Check numpy version, dtypes was added in 1.25

* Clean up diff

* Fix hang
2024-10-10 14:08:24 -04:00
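The change mirrors PyTorch's safer deserialization option: restrict `torch.load` to tensors and plain containers instead of running arbitrary unpickling. In plain PyTorch terms, this is the behavior Accelerate's `load()` now defaults to for compatible objects:

```python
import torch

torch.save({"weights": torch.ones(2)}, "checkpoint.pt")

# weights_only=True refuses to unpickle arbitrary Python objects.
state = torch.load("checkpoint.pt", weights_only=True)
```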
fb68cb9d0e Refactor scaler to util (#3142)
* Refactor scaler to util

* Document

* Use the distributed_type directly
2024-10-08 11:07:01 -04:00
d4d6b6e7f5 fix tip brackets typo (#3129) 2024-10-07 09:34:24 -04:00
e9e5a73fcc POC: multiple model/configuration DeepSpeed support (#3097)
* Bookmark

* Migratory

* Uncomment

* Rm name to model for now

* Rm container

* Left: test

* Allow only wrapping one model

* Add warning but only ref once

* Refine

* Update src/accelerate/accelerator.py

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Finish stas nits

* Clean

* Fixup test + test writing

* Fully working

* Fin

* Nit

* Quality

* Update src/accelerate/accelerator.py

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Actionable error

* Make note of when its enabled

* Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Merge tests

* Merge

* Add currently broken test script

* Push the working implementation

* Fin

* Add guards for user behavior

* Test nits

* TODO: finish knowledge distillation example

* Update tests/deepspeed/test_deepspeed_multiple_model.py

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>

* Allow for dict-like interface

* Get rid of disable

* Uncomment

* Complete rewrite to force a dict to be used

* Working tests/fin

* Use name as stas suggestion

* Clean

* docnit

* toctree

* toctree

* Missing ref

* Put in break

* Smaller diff

* Make note on how to use zeroinit

* Make note about accelerator ds plugin

* More docnits

* Apply suggestions from code review

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>

* Limit users to not pass in another ds plugin to another accelerator

* not implemented err + Make a note about why no params

* Apply suggestions from code review from Stas

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Add deepspeed_plugins arg + update doc

* Plugin -> plugins

* Change enable() -> select()

* Update ref properly + test

* Be consistent, model1,model2...

* first_, second_

* A few more auto values

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

---------

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
2024-09-13 07:28:06 -04:00
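Putting the bullet points together: the final interface takes a dict of named DeepSpeed plugins and a way to select the active one before preparing each model. A sketch under that reading (the names "student"/"teacher", the ZeRO stages, and the exact select call are taken or inferred from the commit messages, so verify against the docs):

```python
from accelerate import Accelerator
from accelerate.utils import DeepSpeedPlugin

# One named plugin per model/configuration, per the dict-interface commits.
deepspeed_plugins = {
    "student": DeepSpeedPlugin(zero_stage=2),
    "teacher": DeepSpeedPlugin(zero_stage=3),
}
accelerator = Accelerator(deepspeed_plugins=deepspeed_plugins)

# Select the active plugin before preparing the corresponding model
# (the "enable() -> select()" rename above; assumed to live on state).
accelerator.state.select_deepspeed_plugin("student")
student = accelerator.prepare(student_model)  # student_model is a placeholder
```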
79a8426416 🚨🚨🚨 The Great Deprecation 🚨🚨🚨 (#3098)
* The great purge

* Clean

* Some more fixings

* Some more deprecations Benjamin found

* Fix kwarghandler test
2024-09-12 21:12:32 -04:00