frozenleaves/trl

mirror of https://github.com/huggingface/trl.git synced 2025-10-21 02:53:59 +08:00

Author	SHA1	Message	Date
Albert Villanova del Moral	45ee98b05e	Replace unittest with pytest (#4188)	2025-10-06 11:14:54 +02:00
Quentin Gallouédec	304eaf8053	🛠️ Fix CI (#4076)	2025-09-13 12:38:48 -06:00
Kashif Rasul	206964ce16	🎢 [Callbacks] BEMA (#3855) Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com>	2025-08-14 13:54:52 -07:00
Quentin Gallouédec	f5b1ed24a0	⏳ Replaced `unittest.TestCase` with `TrlTestCase` that handles tmp dir (#3863)	2025-08-12 12:37:19 -07:00
Quentin Gallouédec	9df19e8a75	📜 Fix license and copyrights (#3264)	2025-04-08 15:22:58 -07:00
Quentin Gallouédec	2fe2337067	🏃 Migrate CI to self-hosted runners (#3174)	2025-03-29 11:56:44 -07:00
Quentin Gallouédec	2106b31298	👴 Update `tokenizer` parameter to `processing_class` in tests (#2828)	2025-02-11 11:46:26 +01:00
Quentin Gallouédec	1d23ecc36f	©️ Update copyrights year (#2547) * happy new year * fix wandb import sort	2025-01-07 14:53:09 +01:00
Iaroslav Omelianenko	763738f457	☄️ Update Comet integration to include LogCompletionsCallback and Trainer.evaluation_loop() (#2501) * Implemented integration with Comet in `LogCompletionsCallback`. Implemented related integration test. * Implemented integration with Comet in `CPOTrainer.evaluation_loop()` during logging of `game_log` table. * Implemented integration with Comet in `CPOTrainer.evaluation_loop()` during logging of `game_log` table. * Implemented integration with Comet in `DPOTrainer.evaluation_loop()` during logging of `game_log` table. * Implemented integration with Comet in `BCOTrainer.evaluation_loop()` during logging of `game_log` table. * Implemented integration with Comet in `KTOTrainer.evaluation_loop()` during logging of `game_log` table. * Implemented integration with Comet in `ORPOTrainer.evaluation_loop()` during logging of `game_log` table.	2024-12-28 18:35:01 +01:00
Quentin Gallouédec	9410874787	©️ Copyrights update (#2454) * First changes * Other files * Finally * rm comment * fix nashmd * Fix example * Fix example [ci skip]	2024-12-10 10:40:00 +01:00
Quentin Gallouédec	453db5cd79	🤏 New models for tests (#2287) * first commit * uncomment * other tests adaptations * Remove unused variable in test_setup_chat_format * Remove unused import statement * style * Add Bart model * Update BCOTrainerTester class in test_bco_trainer.py * Update model IDs and tokenizers in test files * Add new models and processors * Update model IDs in test files * Fix formatting issue in test_dataset_formatting.py * Refactor dataset formatting in test_dataset_formatting.py * Fix dataset sequence length in SFTTrainerTester * Remove tokenizer * Remove print statement * Add reward_model_path and sft_model_path to PPO trainer * Fix tokenizer padding issue * Add chat template for testing purposes in PaliGemma model * Update PaliGemma model and chat template * Increase learning rate to speed up test * Update model names in run_dpo.sh and run_sft.sh scripts * Update model and dataset names * Fix formatting issue in test_dataset_formatting.py * Fix formatting issue in test_dataset_formatting.py * Remove unused chat template * Update model generation script * additional models * Update model references in test files * Remove unused imports in test_online_dpo_trainer.py * Add is_llm_blender_available import and update reward_tokenizer * Refactor test_online_dpo_trainer.py: Move skipped test case decorator * remove models without chat templates * Update model names in scripts and tests * Update model_id in test_modeling_value_head.py * Update model versions in test files * Fix formatting issue in test_dataset_formatting.py * Update embedding model ID in BCOTrainerTester * Update test_online_dpo_trainer.py with reward model changes * Update expected formatted text in test_dataset_formatting.py * Add reward_tokenizer to TestOnlineDPOTrainer * fix tests * Add SIMPLE_CHAT_TEMPLATE to T5 tokenizer * Fix dummy_text format in test_rloo_trainer.py * Skip outdated test for chatML data collator * Add new vision language models * Commented out unused model IDs in test_vdpo_trainer * Update model and vision configurations in generate_tiny_models.py and test_dpo_trainer.py * Update model and tokenizer references * Don't push if it already exists * Add comment explaining test skip * Fix model_exists function call and add new models * Update LlavaForConditionalGeneration model and processor * `qgallouedec` -> `trl-internal-testing`	2024-11-25 16:31:56 +01:00
August Moharrami	6578fdc101	🔀 Add `MergeModelCallBack` (#2282) * Create mergekit_utils.py * adding mergekit as an optional dependancy * adding MergeModel to callbacks * adding mergekit_utils dependencies to callbacks * setting lower bound for mergekit * setting mergekit lower band to 0.0.5.1 * adding support for MergeModelCallBack __init__.py * adding support for mergemodelcallback * mergemodelcallback tests * Update callbacks.py * Update __init__.py * Update __init__.py * Update test_callbacks.py * Update trl/trainer/callbacks.py removing ## from docs Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Update trl/trainer/callbacks.py removing ## from docs Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Update trl/trainer/callbacks.py Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * using different dataset for tests Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Update trl/mergekit_utils.py adding types Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Update trl/mergekit_utils.py Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Apply suggestions from code review Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * replacing get_last_checkpoint * renaming Merge to merge_models * setting mergers default value to linear * removing unnecessary docs and comments * adding docstring to Mergeconfig * adding mergekits link to docstring * precommit * removing duplicated import * typos in mergekit_utils docstring * fixing tests * making mergemodelcallback tests optional * Make import optional * minor * use tmp dir in test * sort * Add import error checks for mergekit extra * use a common _merge_and_maybe_push method and compat with windows path * debug windows * Update dependencies for mergekit and add test dependencies * Add assertion to check if merged folder exists in the last checkpoint * Fix temporary directory cleanup in test_callbacks.py * Add sys import and skip test for Python versions below 3.10 due to cleanup errors with temp dir * revert change for debug --------- Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> Co-authored-by: Quentin Gallouédec <quentin.gallouedec@huggingface.co> Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>	2024-11-21 14:06:45 +01:00
Kashif Rasul	b8c9d9c7bc	⚖️ Add `use_soft_judge` option to `WinRateCallback` (#2347) * add `use_soft_judge` option to WinRateCallback * formatting * Update trl/trainer/callbacks.py Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> * renamed soft_win_rate to avg_win_prob * Update trl/trainer/callbacks.py Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> * fix tests * keep orignal * formatting * Update tests/test_callbacks.py Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> * Update trl/trainer/callbacks.py Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> * Update tests/test_callbacks.py Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> * Update tests/test_callbacks.py * fix test --------- Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2024-11-15 15:49:43 +01:00
Quentin Gallouédec	73c3970c1f	🙅 Ensure dependency optionality (#2301) * Add conditional check for LLMBlender availability in test_judges.py * Fix import issues and update test requirements * Remove unused imports * Add require_peft decorator to test cases * Fix import_utils module to use correct package name for llm_blender	2024-10-31 22:37:49 +01:00
Quentin Gallouédec	d843b3dadd	Use `processing_class` instead of `tokenizer` in `LogCompletionsCallback` (#2261)	2024-10-22 09:35:04 +02:00
lewtun	b169e1030d	Add table for WinRateCallback (#2116) * Add table for WinRateCallback * Fix tests * Apply suggestions from code review Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> * Refactor * Remove super * Clean * Clean * Apply suggestions from code review Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> --------- Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2024-09-26 19:28:44 +02:00
Quentin Gallouédec	10c2f63b2a	`training_args` for all `TrainingArguments` (#2082)	2024-09-19 15:03:47 +02:00
lewtun	4d8267610f	Use wrapped model for reference completions in `WinRateCallback` and set default `freq` to `eval_steps` in `LogCompletionsCallback` (#2074) * Use wrapped model for reference completions * Add unit test for LoRA * Apply suggestions from code review Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> * Fix quality --------- Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2024-09-18 13:55:49 +02:00
Quentin Gallouédec	40f05226de	Standardizing datasets for testing (#2065) * zen dataset * Update dataset test bco * some tests * Simple chat template * bco * xpo * kto * gkd * trainer_args * sft * online dpo * orpo * zen script	2024-09-14 22:34:15 +02:00
lewtun	88bede66fc	Standardise API for `WinRateCallback` and `LogCompletionsCallback` (#2061) * Use wrapped model * Make WinRateCallback work * Make LogCompletions work * Make LogCompletions work * Fix scripts * Fix path * Refactor * Remove padding * Refactor * Fix docs * Fix scripts * Fix TLDR template * Use explicit args * Fix callback import * Add docstring	2024-09-13 17:38:42 +02:00

20 Commits