accelerate

mirror of https://github.com/huggingface/accelerate.git synced 2025-10-21 11:09:55 +08:00

Author	SHA1	Message	Date
Marc Sun	ff872f5f71	bump to 1.11.0dev0	2025-08-07 12:58:08 +02:00
Marc Sun	7ecc2d7f39	bump to v1.10.0-release	2025-07-16 16:26:03 +00:00
Pedro Cuenca	e2cc537db8	trackio (#3669 ) * trackio * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Abubakar Abid <abubakar@huggingface.co> * seven -> eight * Add trackio as a real tracker instead * Sort * Style * Style * Remove step * Disable trackio on Python < 3.10 * Update src/accelerate/tracking.py Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * More style --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Abubakar Abid <abubakar@huggingface.co>	2025-07-15 17:17:49 +02:00
Marc Sun	a16d2bb3c1	bump to v1.9.0dev	2025-06-19 15:13:41 +02:00
Shaohon Chen	6597dae780	Integrate SwanLab for offline/online experiment tracking for Accelerate (#3605 ) * add support for SwanLabTracker and update related documentation * add emoji in FRAMWORK * apply the style corrections and quality control * add support for SwanLabTracker in tests * fix bug in test_tracking	2025-06-18 15:42:29 +02:00
Marc Sun	417bc52965	bump to v1.8.0dev	2025-05-15 12:02:44 +02:00
Zach Mueller	583b26db3c	Add FP8 runners + tweak building FP8 image (#3493 ) * Initial test * Try on push * Only wf dispatch now * keep trying * Try again * Try again * source activate? * Force bash * Source activate accelerate to make it get the env propelry * try using nightly docker * Try this? * Try this? * Try this, proper output * Try this, proper output * Try via full conda activate(?) * rm conda * te fp8 tests * add ao * ao in setup too * actually include fp8 deps * FP8 docker image, use newer version * Update docker image to take in input * Test * prior month * igpu? * Use only last 2 digits of year * Build rest * Apply style fixes --------- Co-authored-by: [[ -z $EMAIL ]] && read -e -p "Enter your email (for git configuration): " EMAIL <muellerzr@gmail.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2025-04-15 11:39:43 +02:00
Marc Sun	9642a1ac81	bump to v1.7.0dev	2025-04-01 13:55:11 +02:00
cyyever	3169339f5b	Bump ruff to 0.11.2 (#3471 ) * ruff format * Bump ruff to 0.11.2	2025-04-01 11:57:06 +02:00
cyyever	a0edc8dcf2	Apply ruff py39 fixes (#3461 ) * Apply ruff py39 fixes * Ruff format	2025-03-31 19:10:08 +02:00
Luiz F. G. dos Santos	f648feba97	Add `log_artifact`, `log_artifacts` and `log_figure` capabilities to the MLflowTracker. (#3419 ) * Added artifacts and figure tracking at MLFlow tracker * Added `log_artifact` to the MLFlowTracker * Remove changes * Added artifacts, artifacts and figure tracking at MLFlow tracker * Improved the docstring * added require_mlflow function at test_utils * add test for MLflowTracker * Bit of litting * Refactor to a more robust test * Revised the test asserts to something more robust. * Removed incorrect import and some litting. * removed commented code * initiate tracker using Accelerator * Added mlflow and matplotlib to setup.py. Guarded and decoredated the functions that required them. * Guarded mlflow import * added matplotlib required warning. * ran style and quality	2025-03-12 18:11:29 +01:00
Zach Mueller	14fc61eeac	Bump to 1.6.0.dev0	2025-03-12 10:13:18 -04:00
Ilyas Moutawwakil	d9e6af8773	HPU support (#3378 ) * init * style * is_hpu_available * fix * import habana_frameworks.torch.distributed.hccl * style * test * initialize dist proc group * revert * set backend to hccl only if hccl initialization sets a local rank * force backend hccl and multi_hpu type when sure of distributed launch * style * pass accelerator tests * pas big modeling tests with bigger atol/rtol for accelerators * fix hpu device count and skip tests requiring hpu:x * hpu autocast * hpu rng_state * hpu launch * hpu special device placement * hpu launch * rng state * distributed data loop tests * enforce non contiguity after device memory allocation * pass fsdp tests * enforce pt_hpu_lazy_mode=0 when fsdp testing * pass cli tests * pass and document grad sync tests * pass kwargs handler and autocast tests * memory utils * found source of int64 errors * skip some modeling utils tests * enable int64 * skip optimizer tests * pass checkpointing tests * pass accelerator tests with safetensors main * more hpu stuff * style * remove PT_HPU_LAZY_MODE and PT_ENABLE_INT64_SUPPORT as they should be in the testing environment * start testing on gaudi2 * support fp16 on gaudi2 * add testing order * custom hpu fsdp env dict * fix torch trace malloc * test ddp half precision comm hooks * fix * fix * remove lower bound for hpu * use 0.72 as lower bound * lower lower bound * order deepspeed tests * fix * deepspeed_use_hpu * assert non lazy mode with offloaded optimizer * make patching torch with habana frameworks the default * less of require_non_hpu * skip test_multi_device_merge_fsdp_weights for now as it halts * skip another flaky test * format * use habana_visible_modules * patch torch hpu device count * avoid setting HABANA_VISIBLE_MODULES * don't play with habana visible devices/modules * only with hpu * fixes and skips * skip * fix device ids and add some todos * skip offloading with generate() * fix * reduced atol/rtol for hpu * fix * tag deepspeed tests that should run first * enable a test path that was skipped * revert a test that was customized for gaudi1 * some patching to enable HABANA_VISIBLE_MODULES * fix zero3 test * misc * test DTensor TP * remove gaudi1 * test * style * comment * pass pad_across_processes * require_fp16 * pass memory utils test * test_ddp_comm_hook * skip half precision comm hooks on hpu * fix * is_fp16_available * fp16 * tp as part of integration tests * fix * write_basic_config * safetensors * local sgd and masked_fill_fwd_i64 * fix num_processes in test_load_states_by_steps * fp8 support * test * fix * add a workflow * Update src/accelerate/accelerator.py * review comments * ci * style * comments * test * habana_frameworks.torch * patch device count * fix * fix * require_fp8 * fix * fix * gaudi 1 * remove unnecessary * fixed maskd fill error in transformers * style * balanced_memory pass on hpu * remove for now * run first * Apply suggestions from code review * style after merge * Update src/accelerate/accelerator.py Co-authored-by: Zach Mueller <muellerzr@gmail.com> * Update src/accelerate/utils/transformer_engine.py Co-authored-by: Zach Mueller <muellerzr@gmail.com> * empty cache review comments * test_scirpt.py error messages * AccelerateTestCase for accelerator state cleanup * test * add gaudi1 workflow * fp8 avilability * fix * reduce batch size * concurrency * check cuda as well * nits and comments * mark fsdp tests that require_fp16 * style * mark deepspeed fp16 tests * update image * fix * updated * better msgs * skip pippy * test * test on 2 device * support up to 1% relative error in test_accelerate * skip hpu fp16 * allow for 1 byte differene * revert torch_device change * style * skip memory release since it's flaky * add accelerator state cleanup to fixture * fix * atol * fix * more rtol * equal grad test * revert * pass pippy on gaudi2 and skip on gaudi1 * enable sd 1.5 test with require fp16 * added warning on memory release * don't log warning in memory release as it requires PartialState to be initialized * Apply suggestions from code review --------- Co-authored-by: Zach Mueller <muellerzr@gmail.com>	2025-03-11 11:16:57 -04:00
Zach Mueller	65356780d4	[Dev] Update release directions (#3352 ) * Update release directions * Update directions and makefile to account for testpypi fun	2025-01-21 08:59:43 -05:00
[[ -z $EMAIL ]] && read -e -p "Enter your email (for git configuration): " EMAIL	78b8126bff	v1.4.0.dev0	2025-01-17 10:36:00 -05:00
Zach Mueller	b13aadcb67	Bye bye torch <2 (#3331 ) * Bye bye torch <1 * Add 2.6.0 dl args * Rm require fsdp * Adjust imports + 2.0 specific modeling code * Bring back is_bf16	2025-01-09 12:11:08 -05:00
[[ -z $EMAIL ]] && read -e -p "Enter your email (for git configuration): " EMAIL	5f96369161	v1.2.0.dev	2024-11-20 19:24:51 -05:00
Zach Mueller	85f35647db	🚨 🚨 🚨 Goodbye Python 3.8! 🚨 🚨 🚨 (#3194 )	2024-10-24 10:16:47 -04:00
[[ -z $EMAIL ]] && read -e -p "Enter your email (for git configuration): " EMAIL	52581c3f01	Change version	2024-10-09 10:50:12 -04:00
Benjamin Bossan	3fd02e60dc	MAINT: Upgrade ruff to v0.6.4 (#3095 ) * MNT Upgrade ruff to 0.6.4 Currently used version, 0.2.1, is quite old at this point. Not a lot needed to be changed: - Change ruff version in setup.py - Remove deprecated ignore-init-module-imports option for ruff - Type comparison should use is and not == - Use f-string instead of % formatting - Some line wrapping and empty lines * Oops	2024-09-10 10:43:37 -04:00
[[ -z $EMAIL ]] && read -e -p "Enter your email (for git configuration): " EMAIL	b5235f21d8	0.35.0.dev	2024-09-02 18:18:42 -04:00
byi8220	ad3f574a3b	Add early support for `torchdata.stateful_dataloader.StatefulDataLoader` within the `Accelerator` (#2895 ) * temporary commit * checkout? * dataloader wrapper * tmp * weird failing test * trying multiple inheritance * DataLoaderAdapter * make style * Some dark magic dynamic reflection (for backwards compat) * typo * some tests * more mixin stuff * maybe found broken test? * this is a very invasive feature * i think the feature is done? * add xpu support (#2864) * better tests * discovered a bug * maybe fixed bug? * make style * hopefully this is PR ready * properly skip tests * parameterize * temporary commit * checkout? * dataloader wrapper * tmp * weird failing test * trying multiple inheritance * DataLoaderAdapter * make style * Some dark magic dynamic reflection (for backwards compat) * typo * some tests * more mixin stuff * maybe found broken test? * this is a very invasive feature * i think the feature is done? * better tests * discovered a bug * maybe fixed bug? * make style * hopefully this is PR ready * properly skip tests * parameterize * Update src/accelerate/utils/dataclasses.py Co-authored-by: Zach Mueller <muellerzr@gmail.com> * Update src/accelerate/data_loader.py Co-authored-by: Zach Mueller <muellerzr@gmail.com> * merge conflicts * move imports * make style * merges are breaking tests * fix test name * Require safetensors>=0.4.3 * undo last commit * minor style * address pr comments * Torchdata version 0.8.0 is stable now * added docs and require torchdata>=0.8.0 for testing * test base_dataloader attr doesn't cause infinite recursion * address pr * replace super().__iter__ with self.base_dataloader.__iter__ --------- Co-authored-by: Fanli Lin <fanli.lin@intel.com> Co-authored-by: Zach Mueller <muellerzr@gmail.com>	2024-08-22 08:43:45 -04:00
Zach Mueller	52fae0960c	Add end_training/destroy_pg to everything and unpin numpy (#3030 ) * Add end_training/destroy_pg to everything * Carry over to AcceleratorState * If forked, ignore * More numpy fun * Skip only init	2024-08-20 10:40:12 -04:00
Marc Sun	cd5698bb32	update version to 0.34.dev0 (#3007 )	2024-08-12 12:13:37 -04:00
Zach Mueller	dc3b5ad82e	Fix deepspeed tests (#3003 ) * Unpin deepspeed * Include proper branch for docker image * Properly working * Revert all other changes	2024-08-09 15:35:25 -04:00
byi8220	32f368ec3f	Require safetensors>=0.4.3 (#2957 )	2024-07-29 07:35:34 -04:00
nullquant	3ebbe573ad	Add huggingface_hub version to setup.py (#2932 )	2024-07-15 10:11:41 -04:00
[[ -z $EMAIL ]] && read -e -p "Enter your email (for git configuration): " EMAIL	947f64ee62	Version update	2024-07-03 13:27:34 -04:00
Zach Mueller	1f7a79b428	Potentially fix tests (#2862 ) * Potentially fix tests * Try again with numpy sub 2	2024-06-18 11:38:30 +02:00
Zach Mueller	7141881b1f	Push new release version	2024-06-07 10:05:51 -04:00
Zach Mueller	4ba436eccc	Introduce shard-merging util for FSDP (#2772 ) * Initial commit * Now to test * Store false * Slight tweaks * Fix naming * Got it all working with tests * Use not for safetensors arg * rm change * Add docs * Adjust based on Marc's feedback * Specify just weights * Update tests to include CLI and swap namings * Fin * Rm unused * Rm again	2024-05-16 13:49:50 -04:00
Zach Mueller	5b3a7f3892	Update setup.py + test falures found during release	2024-05-03 10:40:25 -04:00
Zach Mueller	c7e5e41b8c	Segment out a deepspeed docker image (#2707 ) * Segment out a deepspeed docker image * Update readme * Keep pinned ds	2024-04-29 11:25:22 -04:00
Zach Mueller	6af157ea93	Add diffusers to req (#2711 )	2024-04-25 08:31:54 -04:00
Zach Mueller	f478201c28	Pin DS...again.. (#2679 )	2024-04-16 12:07:59 -04:00
Zach Mueller	16488be9a4	Update version	2024-04-05 13:11:05 -04:00
Zach Mueller	7531e8c13e	Unpin hub (#2625 )	2024-04-04 10:33:49 -04:00
Zach Mueller	f579d9550d	Pin hub for tests (#2608 )	2024-04-02 10:58:17 -04:00
Zach Mueller	dd62fc90ce	Unpin deepspeed (#2570 )	2024-03-21 09:42:03 -04:00
Zach Mueller	ee163b66fb	Update version	2024-03-12 11:55:22 -04:00
Zach Mueller	e70e3c87de	Overdue email change... (#2534 )	2024-03-08 12:55:42 -05:00
Benjamin Bossan	f20445d4ac	Fix the pytest version to be less than 8.0.1 (#2461 ) * Fix the pytest version to be less than 8.0.0 We're getting errors such as: > /opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/transformers/testing_utils.py:129: in <module> > from _pytest.doctest import ( > E ImportError: cannot import name 'import_path' from '_pytest.doctest' (/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/_pytest/doctest.py) * Update setup.py Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com> --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>	2024-02-23 16:03:29 -05:00
Aarni Koskela	13e79ccfab	Enable more Ruff lints & fix issues (#2419 ) * Remove antiquated flake8 and isort configuration * Bump to Ruff 0.2.1 * Explain ruff options * Autofix Ruff B010 (static `setattr`) * Autofix Ruff B009 (static `getattr`) * Enable Ruff UP (not UP007); auto-fix * Fix remaining Ruff UP complaints * Fix a couple more format calls	2024-02-14 08:59:42 -05:00
Zach Mueller	5318bc7733	Dev version	2024-02-13 10:04:34 -05:00
Zach Mueller	b3d2111708	Version 0.28.0.dev	2024-02-09 10:51:07 -05:00
Zach Mueller	c3aec59b12	Migrate pippy examples over and run tests (#2424 ) * Migrate examples over * Finish updating doc * torchpippy * Readme review nits * Mention gather op in examples	2024-02-09 10:01:56 -05:00
Aarni Koskela	0e1ee4b92d	Use Ruff for formatting too (#2400 ) Co-authored-by: Zach Mueller <muellerzr@gmail.com>	2024-02-06 08:18:18 -05:00
Zach Mueller	d8a64cb79d	Unpin (#2418 )	2024-02-06 08:00:33 -05:00
Zach Mueller	7ba64e632c	Revert "[don't merge yet] unpin torch (#2406 )" (#2407 ) This reverts commit 8b770a7dabd957ae54f1abb028d1ce53db6cf4d4.	2024-02-01 10:13:15 -05:00
Yih-Dar	8b770a7dab	[don't merge yet] unpin torch (#2406 ) * unpin torch * unpin torch --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-02-01 09:56:16 -05:00

1 2 3 4

154 Commits