frozenleaves/trl - trl - Gitea: Git for Me

mirror of https://github.com/huggingface/trl.git synced 2025-10-21 11:33:51 +08:00

Author	SHA1	Message	Date
Pramodith Ballapuram	8e2d5516ca	Add accuracy reward (#4270 ) Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com>	2025-10-15 18:01:07 -06:00
Albert Villanova del Moral	f45e86571b	Fix CI ImportError for 'require_torch_gpu_if_bnb_not_multi_backend_enabled' (#4253 )	2025-10-10 16:13:22 +02:00
Albert Villanova del Moral	a944890ff1	Fix callable annotations (#4216 )	2025-10-08 21:21:21 +02:00
Albert Villanova del Moral	45ee98b05e	Replace unittest with pytest (#4188 )	2025-10-06 11:14:54 +02:00
jiqing-feng	20cc58d777	ℹ️ Enable XPU for vLLM client (#4031 )	2025-09-17 22:06:25 -06:00
Quentin Gallouédec	78f1a928ce	🗑️ Remove deprecated `AlignPropTrainer`, `DDPOTrainer` and `IterativeSFTTrainer` (#4068 )	2025-09-15 09:56:41 -06:00
Albert Villanova del Moral	e8b8499f1f	Remove redundant 'None' from docstrings (#4058 )	2025-09-11 08:16:34 +02:00
Quentin Gallouédec	7233b981ce	🧹 Clean SFT tests (#3922 )	2025-08-20 07:36:03 -07:00
Quentin Gallouédec	f5b1ed24a0	⏳ Replaced `unittest.TestCase` with `TrlTestCase` that handles tmp dir (#3863 )	2025-08-12 12:37:19 -07:00
CarlosArguilar	1fb115daff	✋ Prevent NCCL Device Conflicts Between vLLM Server and Trainers (#3762 ) Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com>	2025-07-24 23:16:15 -06:00
Yao Matrix	be93a0c30c	enable vllm c-s tests on XPU (#3445 ) Signed-off-by: Matrix Yao <matrix.yao@intel.com> Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>	2025-05-19 11:55:57 +02:00
Quentin Gallouédec	999acd53ec	🕺 Migrate setup configuration from `setup.py` to `setup.cfg` and make `rich` an optional dep (#3403 ) Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>	2025-05-02 11:03:57 -07:00
Quentin Gallouédec	6cdd24a360	🦾 Test vLLM client-server (#3277 )	2025-04-10 18:29:04 -07:00
Quentin Gallouédec	9df19e8a75	📜 Fix license and copyrights (#3264 )	2025-04-08 15:22:58 -07:00
Quentin Gallouédec	b6bcafb8bb	🏃 Fix and make CI faster (#3160 )	2025-04-08 06:12:08 -07:00
Clara Pohland	6067e2a669	BCOTrainer version upgrade fixes (#2867 ) Co-authored-by: Clara Luise Pohland <clara-luise.pohland@telekom.de>	2025-03-24 10:55:00 +01:00
Quentin Gallouédec	1d23ecc36f	©️ Update copyrights year (#2547 ) * happy new year * fix wandb import sort	2025-01-07 14:53:09 +01:00
Iaroslav Omelianenko	763738f457	☄️ Update Comet integration to include LogCompletionsCallback and Trainer.evaluation_loop() (#2501 ) * Implemented integration with Comet in `LogCompletionsCallback`. Implemented related integration test. * Implemented integration with Comet in `CPOTrainer.evaluation_loop()` during logging of `game_log` table. * Implemented integration with Comet in `CPOTrainer.evaluation_loop()` during logging of `game_log` table. * Implemented integration with Comet in `DPOTrainer.evaluation_loop()` during logging of `game_log` table. * Implemented integration with Comet in `BCOTrainer.evaluation_loop()` during logging of `game_log` table. * Implemented integration with Comet in `KTOTrainer.evaluation_loop()` during logging of `game_log` table. * Implemented integration with Comet in `ORPOTrainer.evaluation_loop()` during logging of `game_log` table.	2024-12-28 18:35:01 +01:00
Quentin Gallouédec	9410874787	©️ Copyrights update (#2454 ) * First changes * Other files * Finally * rm comment * fix nashmd * Fix example * Fix example [ci skip]	2024-12-10 10:40:00 +01:00
August Moharrami	6578fdc101	🔀 Add `MergeModelCallBack` (#2282 ) * Create mergekit_utils.py * adding mergekit as an optional dependancy * adding MergeModel to callbacks * adding mergekit_utils dependencies to callbacks * setting lower bound for mergekit * setting mergekit lower band to 0.0.5.1 * adding support for MergeModelCallBack __init__.py * adding support for mergemodelcallback * mergemodelcallback tests * Update callbacks.py * Update __init__.py * Update __init__.py * Update test_callbacks.py * Update trl/trainer/callbacks.py removing ## from docs Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Update trl/trainer/callbacks.py removing ## from docs Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Update trl/trainer/callbacks.py Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * using different dataset for tests Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Update trl/mergekit_utils.py adding types Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Update trl/mergekit_utils.py Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Apply suggestions from code review Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * replacing get_last_checkpoint * renaming Merge to merge_models * setting mergers default value to linear * removing unnecessary docs and comments * adding docstring to Mergeconfig * adding mergekits link to docstring * precommit * removing duplicated import * typos in mergekit_utils docstring * fixing tests * making mergemodelcallback tests optional * Make import optional * minor * use tmp dir in test * sort * Add import error checks for mergekit extra * use a common _merge_and_maybe_push method and compat with windows path * debug windows * Update dependencies for mergekit and add test dependencies * Add assertion to check if merged folder exists in the last checkpoint * Fix temporary directory cleanup in test_callbacks.py * Add sys import and skip test for Python versions below 3.10 due to cleanup errors with temp dir * revert change for debug --------- Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> Co-authored-by: Quentin Gallouédec <quentin.gallouedec@huggingface.co> Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>	2024-11-21 14:06:45 +01:00
Quentin Gallouédec	5626806aef	🧲 Use our own `require_bitsandbytes` (#2370 ) * use our own require_bitsandbytes * rephrase	2024-11-20 11:51:05 +01:00
Quentin Gallouédec	b80c1a6fb8	🎲 Move random judges in testing utilities (#2365 ) * Update judges and testing utilities * Update judges in test files * Update judges in test files	2024-11-18 18:43:18 +01:00
Quentin Gallouédec	73c3970c1f	🙅 Ensure dependency optionality (#2301 ) * Add conditional check for LLMBlender availability in test_judges.py * Fix import issues and update test requirements * Remove unused imports * Add require_peft decorator to test cases * Fix import_utils module to use correct package name for llm_blender	2024-10-31 22:37:49 +01:00
Quentin Gallouédec	07f0e687cb	Use `transformers` utilities when possible (#2064 ) * use transformers' availability functions * require from transformers * rm file * fix no peft * fix import * don't alter _peft_available * fix require_diffusers * style * transformers>=4.40 and add back `is_liger_kernel_available`	2024-09-16 15:56:49 +02:00
Fanli Lin	d47220f299	make cuda-only tests device-agnostic (#2044 ) * update code * update * fix style --------- Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>	2024-09-13 14:23:12 +02:00
Kashif Rasul	d57e4b7265	[Online-DPO] fixes to the training scripts and setup.py (#1997 ) * fixes * fixed typo * add tests for liger * fix imports * class name	2024-08-30 22:05:14 +02:00
Edward Beeching	346c99d222	Adds VLM Training support to SFTTrainer + VSFT script (#1518 ) * adds option to skip dataset preparation in SFTTrainer * before changing the template * adds support for new schema * a few fixes to data collator to support new schema * updates args * precommit * adds sys prompt to chat template and other fixes * updates template, fixes collator for multiple images * precommit * rename vsft to vstf_llava * adding integration tests * adds integration test for vsft * precommit * adds back chat template * docs * typo * adds eval, precommit * adds peft launch args * formatting * fixes no deps tests by checking if PIL lib exists * Update __init__.py --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2024-04-11 15:35:59 +02:00
Younes Belkada	ef209e311f	[`core` / tests ] v1 slow tests (#1218 ) * v1 slow tests * nit * add qlora tests for DPO * add decorator * release memory + log reports * report to none to avoid seg fault issues * update setup * fix * add exampel testing * fix nit * change temp filename * add workflow file * fix comment * add slack push script * more tests for DPO * add dpo example tests * another makefile command * fix * add paths + clean up * nit * Update slow-tests.yml * trigger tests * up * up * more fixes * fix * final fixes * minor fixes * oops * add more text * fix * more * trigger CI * up * fix * remove * run the tests on 2 GPUs only * final fix SFT * revert config files + address comments * fix * add Phi * final fixes * final fix	2024-01-17 10:17:57 +01:00
Younes Belkada	d116887ed4	[`DPOTrainer`] Fix peft + DPO + bf16 if one uses `generate_during_eval` or pre-computed logits (#1203 ) * fix peft + DPO + bf16 * fix * revert old behaviour * fix tests * fix * fix * fix * fix	2024-01-09 09:35:50 +01:00
Younes Belkada	c2884b5096	[`Tests`] Add non optional packages tests (#974 ) * add non-peft tests * change name * test * change * fix test	2023-11-09 15:01:46 +01:00
Abhilash Majumder	ec9e76623e	[Feature] Enable Intel XPU support (#839 ) * enable xpu support * fix bug * review commits * fix style * add xou decorator * refactor review commit * fix test * review commit * fix test * Update benchmark.yml (#856) * Standardise example scripts (#842) * Standardise example scripts * fix plotting script * Rename run_xxx to xxx * Fix doc --------- Co-authored-by: Costa Huang <costa.huang@outlook.com> * Fix version check in import_utils.py (#853) * dont use get_peft_model if model is already peft (#857) * merge conflict * add xou decorator * resolve * resolves * upstream * refactor and precommit * fix new tests * add device mapping for xpu --------- Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com> Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> Co-authored-by: Costa Huang <costa.huang@outlook.com> Co-authored-by: Adam Pauls <adpauls@gmail.com> Co-authored-by: abhishek thakur <1183441+abhishekkrthakur@users.noreply.github.com>	2023-10-31 10:15:35 +01:00
Nathan Lambert	ad8d50e30d	init custom eval loop for further DPO evals (#766 ) * init * run * Update custom eval loop to aid DPO debugging (#770) * sample_during_eval -> generate_during_eval * Remove unused return_tokens * Add import utils for W&B, prevent test fails * Optimize dataloader random batch selection * Separate prompt and response in logs Makes it much easier to quickly read the starts of the generations * Simplify logging * reset eval steps * manual merge fixes * revert merge * remove self.max_length * style * fix max_length --------- Co-authored-by: Tom Aarsen <37621491+tomaarsen@users.noreply.github.com>	2023-09-26 08:09:15 -07:00
Younes Belkada	0610711dda	[`core`] refactor peft API (#231 ) * refactor peft API * update gpt2 peft script * refactor * few fixes * fix bug * make style * update docs * more update * fix docs * fix issues and add tests * make style * update dcos	2023-03-21 13:35:21 +01:00
Younes Belkada	03d9844730	Let's support naive Pipeline Parallelism (#210 ) * add fixes in to support PP * add same logic for enc-dec * add more checks * fix 20b issues * clean up * update scripts * dp safety checker * added multi gpu tests * fix order * change * fix script	2023-03-15 08:28:52 +01:00
Edward Beeching	679f29d408	`peft` integration (#163 ) * adds a hacky peft example * fixes bug due to missing "prepare_model_for_training" * Formatting * adds peft to requirements * Update trl/trainer/ppo_trainer.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * gpt neo runs * changes requested on the PR * style * updates to prepare_model_for_int8_training PEFT PR https://github.com/huggingface/peft/pull/105 * updates to prepare_model_for_int8_training PEFT PR https://github.com/huggingface/peft/pull/105 * adds missing 8-bit attribute to modeling base * adds lr to example script * adds missing train to trainer * disables caching temporarily while I debug something * debugging issues with unstable training * Fix peft + int8 (#170) * add fix * another fix * Auto stash before merge of "peft-example" and "origin/peft-example" * adds peft model types to modeling base * reduces memory usage using adapters and no ref model. * adds support for EleutherAI/gpt-neox-20b * example for peft finetune of cm model * removes hacky research code * fixing the rebase and some typos * style * style2 * adds gradient checkpointing to base model * cleans up comments * moves config and other pretrained_model properties to __init__ * make style * added tests * change dependency * Update .github/workflows/tests.yml * fix test * fix style and failing tests * make quality * revert change * rm unneeded change * revert changes * rm changes * rm changes * rm uneeded change * Update trl/models/modeling_base.py * revert uneeded changes * make style * adapt suggestions * fix tests * attempt to fix * fix * fix * add no peft test * revert * remove unneded check * more tests * fix logic * add `save_pretrained` support * fix quality * clean up * clean up * stronger test * refactor comments * make style * attempt to add non-peft tests * remove test runner * format * fix test * move `train` on top * fix peft import * make quality * fixes typo * adds peft example to docs --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: younesbelakda <younesbelkada@gmail.com>	2023-03-07 15:08:21 +01:00

35 Commits