frozenleaves/trl

mirror of https://github.com/huggingface/trl.git synced 2025-10-21 11:33:51 +08:00

Author	SHA1	Message	Date
Quentin Gallouédec	9df19e8a75	📜 Fix license and copyrights (#3264)	2025-04-08 15:22:58 -07:00
Quentin Gallouédec	1d23ecc36f	©️ Update copyrights year (#2547) * happy new year * fix wandb import sort	2025-01-07 14:53:09 +01:00
Quentin Gallouédec	ca850be0a2	🕹️ CLI refactor (#2380 ) * Refactor main function in dpo.py * Update setup.py and add cli.py * Add examples to package data * style * Refactor setup.py file * Add new file t.py * Move dpo to package * Update MANIFEST.in and setup.py, refactor trl/cli.py * Add __init__.py to trl/scripts directory * Add license header to __init__.py * File moved instruction * Add Apache License and update file path * Move dpo.py to new location * Refactor CLI and DPO script * Refactor import structure in scripts package * env * rm config from chat arg * rm old cli * chat init * test cli [skip ci] * Add `datast_config_name` to `ScriptArguments` (#2440) * add missing arg * Add test cases for 'trl sft' and 'trl dpo' commands * Add sft.py script and update cli.py to include sft command * Move sft script * chat * style [ci skip] * kto * rm example config * first step on doc * see #2442 * see #2443 * fix chat windows * ©️ Copyrights update (#2454) * First changes * Other files * Finally * rm comment * fix nashmd * Fix example * Fix example [ci skip] * 💬 Fix chat for windows (#2443) * fix chat for windows * add some tests back * Revert "add some tests back" This reverts commit 350aef52f53f8cf34fccd7ad0f78a3dd63867e06. * 🆔 Add `datast_config` to `ScriptArguments` (#2440) * datast_config_name * Update trl/utils.py [ci skip] * sort import * typo [ci skip] * Trigger CI * Rename `dataset_config_name` to `dataset_config` * 🏎 Fix deepspeed preparation of `ref_model` in `OnlineDPOTrainer` (#2417) * Remove unused deepspeed code * add model prep back * add deepspeed even if it doesn't work * rm old code * Fix config name * Remove `make dev` in favor of `pip install -e .[dev]` * Update script paths and remove old symlink related things * Fix chat script path [ci skip] * style	2024-12-13 17:52:23 +01:00
Quentin Gallouédec	460e780265	👯 Standardize `model_args` (#2442) * `model_config` -> `model_args` * sort	2024-12-10 12:51:20 +01:00
Quentin Gallouédec	6a05feff02	🆔 Add `datast_config` to `ScriptArguments` (#2440) * datast_config_name * Update trl/utils.py [ci skip] * sort import * typo [ci skip] * Trigger CI * Rename `dataset_config_name` to `dataset_config`	2024-12-10 11:09:26 +01:00
Quentin Gallouédec	9410874787	©️ Copyrights update (#2454) * First changes * Other files * Finally * rm comment * fix nashmd * Fix example * Fix example [ci skip]	2024-12-10 10:40:00 +01:00
Quentin Gallouédec	e155cb8a66	⛓️‍💥 Don't use `eval_dataset` in scripts when no eval strategy (#2270)	2024-10-28 11:40:51 +01:00
lewtun	a67f2143c3	Update SFT examples (#2244)	2024-10-17 14:11:46 +02:00
Quentin Gallouédec	7e394b03e8	🎭 Deprecate `[SFT/DPO/Reward]ScriptArguments` in favour of `ScriptArguments` (#2145 ) * `DPOScriptArguments` to `ScriptArguments` * use dataset_train_split * Use scriptarguments * dataset names in command lines * use `ScriptArguments` everywhere * ignore biais buffer to end * remove in v0.13 * rm comment * update test commands * Update docs/source/rloo_trainer.md * Update tests/test_rloo_trainer.py * Added dataset_train_split argument to ppo.py and rloo.py * update scripts with dataset_train_split	2024-10-14 11:14:58 +02:00
Quentin Gallouédec	47d08a9626	Rename trainer arg `tokenizer` to `processing_class` (#2162)	2024-10-07 09:39:32 +02:00
Quentin Gallouédec	a9cffc7caf	Default `dataset_text_field` to `"text"` (#2078 ) * clarify ConstantLengthDataset usage * dont provide dataset text field when formatting func is provided * kto maybe_apply_chat_template * default text field * doc * remove maybe_apply_chat_template from kto example * dataset text field always a str * remove `dataset_text_field="text"` * update doc	2024-10-04 10:55:47 +02:00
Quentin Gallouédec	c00722ce0a	🃏 Model card for TRL (#2123 ) * template and util * test for online dpo * template in package_data * template in manifest * standardize push_to_hub * wandb badge and quick start * bco * xpo * simplify `create_model_card` * cpo * kto * dpo * gkd * orpo * style * nash-md * alignprop * bco citation * citation template * cpo citation * ddpo * fix alignprop * dpo * gkd citation * kto * online dpo citation * orpo citation * citation in utils * optional citation * reward * optional trainer citation * sft * remove add_model_tags bco * Remove unnecessary code for adding model tags * Fix model tag issue and update URL format * Remove unused code for adding model tags * Add citation for XPOTrainer * Remove unused code in SFTTrainer * Add model card generation in RLOOTrainer * Remove unused import and method call in reward_trainer.py * Add model card generation * Remove unused code and update error message in ORPOTrainer class * Add import statements and create model card in IterativeSFTTrainer * Add dataset name to push_to_hub() call * Update trainer.push_to_hub() dataset names * script args * test * better doc * fix tag test * fix test tag * Add tags parameter to create_model_card method * doc * script args * Update trl/templates/model_card.md Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * unittest's `assertIn` instead of `assert` * Update trl/templates/model_card.md Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> --------- Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>	2024-09-27 15:23:05 +02:00
Quentin Gallouédec	5368be1e1e	🧹 Style (#2132) * drop `# flake8: noqa` in examples * `__init__.py` * fix init * unwrap_model_for_generation * ignore import violation in init	2024-09-26 21:02:48 +02:00
Quentin Gallouédec	9af4734178	♻️ Standardize `script_args` (#2130)	2024-09-26 15:23:42 +02:00
Quentin Gallouédec	32d9d34eb1	Standardize pushing to Hub in examples (#2126)	2024-09-26 10:00:51 +02:00
Quentin Gallouédec	4c0c98d950	Standardize dataset naming (#2081) * `ds`, `raw_dataset` etc -> `dataset` * Update docs/source/detoxifying_a_lm.mdx	2024-09-19 08:59:28 +02:00
lewtun	a8fd6dcd17	Remove `RichProgressCallback` from examples (#2053) * Disable RichProgressCallback by default in examples * Nuke rich * Clean	2024-09-11 16:51:05 +02:00
Quentin Gallouédec	7ddef5c158	Make use of `trust_remote_code` consistent (#1806) Co-authored-by: Quentin Gallouédec <quentin.gallouedec@huggingface.co>	2024-07-10 18:26:11 +02:00
Alvaro Bartolome	78045dedc8	Fix `TRL_USE_RICH` environment variable handling (#1808) * Add `strtobool` custom implementation from `distutils` * Fix `TRL_USE_RICH` handling via `strtobool` * Run `make precommit`	2024-07-07 19:59:26 -04:00
Kashif Rasul	b6af2edc93	add model_init_kwargs to training_args (#1787)	2024-07-03 08:29:16 +02:00
Juyoung Suk	9956091112	Add dataset_text_field in examples/scripts/sft.py (#1758)	2024-06-21 11:01:08 +02:00
Kashif Rasul	f30daa4225	[SFT] add SFT Trainer Config dataclass (#1530 ) * initial SFT Config * remove pdb * fix chat_template * undo formatting * add back removed commits * fix the tests * add back options to SftScriptArguments * use sft_script_args * Update trl/commands/cli_utils.py Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Update trl/commands/cli_utils.py Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * rename SFTScriptArguments and split names * formatting docstrings * docstring --------- Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>	2024-04-23 11:55:13 +02:00
Younes Belkada	e04432d5e3	FIX: make the train / test fields modulable (#1551) * make the train / test fields modulable * format * fix --output_dir issue	2024-04-18 11:33:30 +02:00
Younes Belkada	a2aa0f0b09	FEAT: Add CLIs in TRL ! (#1419 ) * CLI V1 * v1 CLI * add rich enhancmeents * revert unindented change * some comments * cleaner CLI * fix * fix * remove print callback * move to cli instead of trl_cli * revert unneeded changes * fix test * Update trl/commands/sft.py Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com> * remove redundant strings * fix import issue * fix other issues * add packing * add config parser * some refactor * cleaner * add example config yaml file * small refactor * change a bit the logic * fix issues here and there * add CLI in docs * move to examples/sft * remove redundant licenses * make it work on dpo * set to None * switch to accelerate and fix many things * add docs * more docs * added tests * doc clarification * more docs * fix CI for windows and python 3.8 * fix * attempt to fix CI * fix? * test * fix * tweak? * fix * test * another test * fix * test * fix * fix * fix * skip tests for windows * test @lvwerra approach * make dev * revert unneeded changes * fix sft dpo * optimize a bit * address final comments * update docs * final comment --------- Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>	2024-03-18 12:20:54 +01:00
Aarni Koskela	9bc478ecbb	pre-commit: replace linters + formatters with Ruff; fix some issues (#1300) * pre-commit: replace linters + formatters with Ruff * Don't use bare except * Clean up `noqa`s * Enable Ruff UP; apply auto-fixes * Enable Ruff B; apply fixes * Enable Ruff T with exceptions * Enable Ruff C (complexity); autofix * Upgrade Ruff to 0.2.0	2024-02-15 04:37:41 +01:00
Costa Huang	3843cfc32f	Fix SFT tuner (#1278)	2024-01-26 17:49:50 +01:00
Costa Huang	9a71e67be9	Remove tyro (#1176 ) * refactor * Remove tyro in `ppo.py` * quick update * update default args * quick push * precommit * refactor * quick change * remove tyro * quick change * precommit * quick change * fix hello_world * remove docstring diffences * add `module load cuda/12.1` * push changes * precommit * make dpo runnable * fix circular import * quick fix * refactor * quick update * path change * update plots * fix docs * quick change * Update trl/trainer/model_config.py Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Update trl/trainer/model_config.py Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Update trl/trainer/utils.py Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Update examples/scripts/dpo.py Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * address comments. use attn_implementation * precommit * remove duplicate code * update peft.py * fix test no op dep * Update trl/trainer/utils.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * precommit * add docs --------- Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2024-01-26 07:51:15 -08:00
Younes Belkada	ef209e311f	[`core` / tests ] v1 slow tests (#1218 ) * v1 slow tests * nit * add qlora tests for DPO * add decorator * release memory + log reports * report to none to avoid seg fault issues * update setup * fix * add exampel testing * fix nit * change temp filename * add workflow file * fix comment * add slack push script * more tests for DPO * add dpo example tests * another makefile command * fix * add paths + clean up * nit * Update slow-tests.yml * trigger tests * up * up * more fixes * fix * final fixes * minor fixes * oops * add more text * fix * more * trigger CI * up * fix * remove * run the tests on 2 GPUs only * final fix SFT * revert config files + address comments * fix * add Phi * final fixes * final fix	2024-01-17 10:17:57 +01:00
Chris Cates	18a33ffcd3	SFT Tokenizer Fix (#1142)	2023-12-27 10:25:56 +01:00
Thomas Capelle	c0ce52ab26	consistency on log (#1084)	2023-12-12 10:58:21 +01:00
lewtun	8f0fc4c8f7	Add args to SFT example (#1079)	2023-12-11 16:16:47 +01:00
Younes Belkada	cbc6c9bb3e	[`core` / `DDP`] Fix RM trainer + DDP + quantization + propagate `gradient_checkpointing_kwargs` in SFT & DPO (#912 ) * make use of forward hooks * correctly delete attributes * fix RM DPP issues * revert unneeded changes * more fixes * fix diff * fix * propagate to SFT * Update examples/scripts/reward_modeling.py * propagate the fix on DPO trainer * add to example scripts * trigger CI	2023-10-31 18:50:17 +01:00
Abhilash Majumder	ec9e76623e	[Feature] Enable Intel XPU support (#839 ) * enable xpu support * fix bug * review commits * fix style * add xou decorator * refactor review commit * fix test * review commit * fix test * Update benchmark.yml (#856) * Standardise example scripts (#842) * Standardise example scripts * fix plotting script * Rename run_xxx to xxx * Fix doc --------- Co-authored-by: Costa Huang <costa.huang@outlook.com> * Fix version check in import_utils.py (#853) * dont use get_peft_model if model is already peft (#857) * merge conflict * add xou decorator * resolve * resolves * upstream * refactor and precommit * fix new tests * add device mapping for xpu --------- Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com> Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> Co-authored-by: Costa Huang <costa.huang@outlook.com> Co-authored-by: Adam Pauls <adpauls@gmail.com> Co-authored-by: abhishek thakur <1183441+abhishekkrthakur@users.noreply.github.com>	2023-10-31 10:15:35 +01:00
lewtun	ddd318865b	Standardise example scripts (#842) * Standardise example scripts * fix plotting script * Rename run_xxx to xxx * Fix doc --------- Co-authored-by: Costa Huang <costa.huang@outlook.com>	2023-10-11 17:28:15 +02:00

34 Commits