* Initial test
* Try on push
* Only wf dispatch now
* keep trying
* Try again
* Try again
* source activate?
* Force bash
* Source activate accelerate to make it get the env properly
* try using nightly docker
* Try this?
* Try this?
* Try this, proper output
* Try this, proper output
* Try via full conda activate(?)
* rm conda
* TE FP8 tests
* add ao
* ao in setup too
* actually include fp8 deps
* FP8 docker image, use newer version
* Update docker image to take in input
* Test
* prior month
* igpu?
* Use only last 2 digits of year
* Build rest
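The docker-tag commits above ("prior month", "Use only last 2 digits of year") point at a CalVer-style image tag derived from the date. A minimal sketch of that computation; the function name and `YY.MM` tag format are assumptions for illustration, not the actual CI code:

```python
from datetime import date

def nightly_tag(today: date) -> str:
    """Build a zero-padded YY.MM tag from the month before `today`.

    Hypothetical helper: the real workflow may compute the tag differently.
    """
    year, month = today.year, today.month - 1
    if month == 0:  # January rolls back to December of the prior year
        year, month = year - 1, 12
    return f"{year % 100:02d}.{month:02d}"

print(nightly_tag(date(2024, 1, 15)))  # prior month of Jan 2024 -> 23.12
```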
* Apply style fixes
---------
Co-authored-by: Zach Mueller <muellerzr@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* Feat: initial conversion tool draft
* Feat: add value mapping to conversion tool
* Refactor: move from os to pathlib
* Feat: add first tests
* Feat: more tests
* Feat: minor fixes + dataclass conversions
* Feat: more remapping
* Fix: namespace has no attribute version + style
* Fix: offload params behavior
* Feat: add option to only rename keys in the config file to
* Fix: wrong attr name
* Fix: partially resolve comments
* Feat: work on config command + minor fixes to reflect changes
* Refactor: style + quality
* Feat: fsdp2 initial work
* Feat: some cleanups and first running fsdp2
* Fix: version checks + mixed precision policy
* Refactor: style + quality
* Remove obsolete todos
* Feat: grad norm clipping
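Grad norm clipping scales every gradient by `min(1, max_norm / total_norm)` so the global L2 norm never exceeds the cap. A framework-free sketch of just that arithmetic, not the FSDP2-aware implementation the commit refers to:

```python
import math

def clip_grad_norm(grads, max_norm):
    """Clip a flat list of gradient values by their global L2 norm.

    Illustrative only: real implementations operate on (sharded) tensors
    and must reduce the norm across ranks.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm:
        return grads, total_norm
    scale = max_norm / total_norm
    return [g * scale for g in grads], total_norm

clipped, norm = clip_grad_norm([3.0, 4.0], 1.0)
print(norm)      # 5.0
print(clipped)   # roughly [0.6, 0.8]
```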
* Fix: tests + rename attrs
* Refactor: style + quality
* Fix: None object is not iterable
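The "'NoneType' object is not iterable" error class usually comes from looping over an attribute that defaults to `None`. A common defensive pattern, shown here as a generic illustration rather than the actual fix in this codebase:

```python
def gather_modules(models=None):
    # `for model in models:` raises TypeError when models is None;
    # falling back to an empty list turns the loop into a no-op.
    collected = []
    for model in (models or []):
        collected.append(model)
    return collected

print(gather_modules())        # []
print(gather_modules(["m1"]))  # ['m1']
```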
* Fix: default cpu_offload for fsdp2
* Fix: cpu offload now behaves correctly
* Feat: apply_activation_checkpointing
* Fix: append to models
* Feat: start on concept guide
* wip: concept guide
* Fix: toctree
* cleanup of the concept guide
* Fix: minor fixes + mp
* Fix: quality + | to union
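The "| to union" change reflects that `X | Y` annotation syntax only parses on Python 3.10+; older interpreters need the `typing` module spellings. A minimal sketch (the function here is invented for illustration):

```python
from typing import Optional, Union

# `def clip_value(value: int | float, limit: float | None = None)` is a
# SyntaxError before Python 3.10; these spellings work on older versions too.
def clip_value(value: Union[int, float], limit: Optional[float] = None) -> float:
    if limit is None:
        return float(value)
    return float(min(value, limit))

print(clip_value(5, 3.0))  # 3.0
```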
* Feat: backwards compatibility + args cleanup
* Fix: style + quality
* Feat: enable dropping refs when getting named params
* Fix: memory footprint with fsdp2
* Feat: cpu ram efficient loading
* Fix: mp
* Fix: don't warn about sync_modules if FSDP version is 1
* Refactor: minor changes
* Small fixes + refactors
* Feat: docs + cleanup
* Feat: saving works (not sure about optim)
* More loading/saving work
* Feat: disable local_state_dict for fsdp2
* Fix: fsdp2 convergence
* Feat: working comparison script
* Feat: memory tracking fsdp2
* Feat: memory visualizer
* Feat: more work on benchmark
* Fix: raise error if model + optimizer aren't prepared together
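The rationale for the check above: FSDP2 shards parameters in place, so an optimizer built before sharding holds references to parameters that no longer exist. A hypothetical sketch of such a guard; the function, error type, and argument encoding are all invented for illustration:

```python
class PreparationError(RuntimeError):
    pass

def check_prepared_together(args):
    """Hypothetical guard: a model and its optimizer must be prepared in
    the same call, since sharding replaces the parameters the optimizer
    would otherwise keep referencing. `args` is a list of (kind, obj)."""
    has_model = any(kind == "model" for kind, _ in args)
    has_optimizer = any(kind == "optimizer" for kind, _ in args)
    if has_model != has_optimizer:
        raise PreparationError(
            "FSDP2 requires preparing the model and optimizer together."
        )

check_prepared_together([("model", "m"), ("optimizer", "o")])  # ok
```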
* Minor fixes
* Style
* More warnings
* Fix: reshard_after_forward vs sharding_strategy conflict
* Refactor: clean up accelerator
* Feat: more testing in fsdp2 benchmark
* Fix: memory visualizer
* Untested: support load/save_state
* Feat: concept guide improvements
* Refactor: concept guide
* Feat: benchmark works
* Feat: more work on fsdp2 benchmark
* Fix: note syntax
* Fix: small fixes + make original tests work
* Fix: grad scaling
* Feat: reshard after forward tests
* Feat: backward prefetch tests
* Feat: tests for fsdp2
* Refactor: minor fixes
* Feat: fsdp_utils docstrings
* Feat: autodoc fsdp.md
* Docs: get_module_children_bottom_up
* Fix: remove unused images
* Refactor: benchmark cleanup
* Fix: docs
* Feat: final doc changes
* Fix: torch.distributed has no attribute tensor
* Fix: style
* Feat: tests include version in failures
* Fix: benchmark force model to load in fp32
* Fix: rename runs
* Feat: last minor fixes
* Feat: new benchmark images
* Bookmark
* bookmark
* Add torchao base example
* Currently broken
* Clean
* DDP variant working
* FSDP as well
* Works for all but zero3
* Bookmark: currently zero3 is underperforming
* Bookmark
* Another diff
* Fin
* Fin
* Add req huggingface suite
* update tests for fp8/torchao/ddp
* Log FP8 backend used and adjust typing
* add documentation for convert_to_float8_training
* Rename to convert_model_to_fp8_ao
* Call super init
* Add types
* Clean
* Use filter_first_and_last_linear_layers
* Update usage guide docs
* Actually loop through the zero stages
* Clean
* MNT Upgrade ruff to 0.6.4
Currently used version, 0.2.1, is quite old at this point.
Not a lot needed to be changed:
- Change ruff version in setup.py
- Remove deprecated ignore-init-module-imports option for ruff
- Type comparison should use is and not ==
- Use f-string instead of % formatting
- Some line wrapping and empty lines
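Two of the listed ruff fixes can be shown in a minimal snippet (illustrative, not the actual diffs from this PR):

```python
value = 3

# Type comparison: ruff flags `type(value) == int`; identity (`is`) is the
# idiomatic check, since a class object is a singleton.
assert type(value) is int

# Formatting: f-strings replace %-style formatting.
message = f"value={value}"
assert message == "value=%d" % value
print(message)  # value=3
```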
* Oops
* Working version rebased from main
* kwargs
* Clean
* Fix more nits
* Fin
* Delay autocast flag
* Enable FP8 autocast during eval only if specified
* Fin
* Rm comment
* All done
* Zero3 works!
* Let the wrapper come off during unwrap_model
* Add import check
* Migrate all to benchmarks folder and make TE import check work
* Add readme
* Add README to benchmarks folder
* Update CLI to now include fp8 args
* Add test config for 0_34
* Finish adding to config yaml
* Write docs
* Expound docs w/ FP8
* Add to toctree