9df19e8a75
📜 Fix license and copyrights ( #3264 )
2025-04-08 15:22:58 -07:00
1d23ecc36f
©️ Update copyrights year ( #2547 )
...
* happy new year
* fix wandb import sort
2025-01-07 14:53:09 +01:00
ca850be0a2
🕹️ CLI refactor ( #2380 )
...
* Refactor main function in dpo.py
* Update setup.py and add cli.py
* Add examples to package data
* style
* Refactor setup.py file
* Add new file t.py
* Move dpo to package
* Update MANIFEST.in and setup.py, refactor trl/cli.py
* Add __init__.py to trl/scripts directory
* Add license header to __init__.py
* File moved instruction
* Add Apache License and update file path
* Move dpo.py to new location
* Refactor CLI and DPO script
* Refactor import structure in scripts package
* env
* rm config from chat arg
* rm old cli
* chat init
* test cli [skip ci]
* Add `datast_config_name` to `ScriptArguments` (#2440 )
* add missing arg
* Add test cases for 'trl sft' and 'trl dpo' commands
* Add sft.py script and update cli.py to include sft command
* Move sft script
* chat
* style [ci skip]
* kto
* rm example config
* first step on doc
* see #2442
* see #2443
* fix chat windows
* ©️ Copyrights update (#2454 )
* First changes
* Other files
* Finally
* rm comment
* fix nashmd
* Fix example
* Fix example [ci skip]
* 💬 Fix chat for windows (#2443 )
* fix chat for windows
* add some tests back
* Revert "add some tests back"
This reverts commit 350aef52f53f8cf34fccd7ad0f78a3dd63867e06.
* 🆔 Add `datast_config` to `ScriptArguments` (#2440 )
* datast_config_name
* Update trl/utils.py [ci skip]
* sort import
* typo [ci skip]
* Trigger CI
* Rename `dataset_config_name` to `dataset_config`
* 🏎 Fix deepspeed preparation of `ref_model` in `OnlineDPOTrainer` (#2417 )
* Remove unused deepspeed code
* add model prep back
* add deepspeed even if it doesn't work
* rm old code
* Fix config name
* Remove `make dev` in favor of `pip install -e .[dev]`
* Update script paths and remove old symlink related things
* Fix chat script path [ci skip]
* style
2024-12-13 17:52:23 +01:00
460e780265
👯 Standardize `model_args` ( #2442 )
...
* `model_config` -> `model_args`
* sort
2024-12-10 12:51:20 +01:00
6a05feff02
🆔 Add `datast_config` to `ScriptArguments` ( #2440 )
...
* datast_config_name
* Update trl/utils.py [ci skip]
* sort import
* typo [ci skip]
* Trigger CI
* Rename `dataset_config_name` to `dataset_config`
2024-12-10 11:09:26 +01:00
9410874787
©️ Copyrights update ( #2454 )
...
* First changes
* Other files
* Finally
* rm comment
* fix nashmd
* Fix example
* Fix example [ci skip]
2024-12-10 10:40:00 +01:00
e155cb8a66
⛓️ 💥 Don't use `eval_dataset` in scripts when no eval strategy ( #2270 )
2024-10-28 11:40:51 +01:00
a67f2143c3
Update SFT examples ( #2244 )
2024-10-17 14:11:46 +02:00
7e394b03e8
🎭 Deprecate `[SFT/DPO/Reward]ScriptArguments` in favour of `ScriptArguments` ( #2145 )
...
* `DPOScriptArguments` to `ScriptArguments`
* use dataset_train_split
* Use scriptarguments
* dataset names in command lines
* use `ScriptArguments` everywhere
* ignore biais buffer to end
* remove in v0.13
* rm comment
* update test commands
* Update docs/source/rloo_trainer.md
* Update tests/test_rloo_trainer.py
* Added dataset_train_split argument to ppo.py and rloo.py
* update scripts with dataset_train_split
2024-10-14 11:14:58 +02:00
47d08a9626
Rename trainer arg `tokenizer` to `processing_class` ( #2162 )
2024-10-07 09:39:32 +02:00
a9cffc7caf
Default `dataset_text_field` to `"text"` ( #2078 )
...
* clarify ConstantLengthDataset usage
* dont provide dataset text field when formatting func is provided
* kto maybe_apply_chat_template
* default text field
* doc
* remove maybe_apply_chat_template from kto example
* dataset text field always a str
* remove `dataset_text_field="text"`
* update doc
2024-10-04 10:55:47 +02:00
c00722ce0a
🃏 Model card for TRL ( #2123 )
...
* template and util
* test for online dpo
* template in package_data
* template in manifest
* standardize push_to_hub
* wandb badge and quick start
* bco
* xpo
* simplify `create_model_card`
* cpo
* kto
* dpo
* gkd
* orpo
* style
* nash-md
* alignprop
* bco citation
* citation template
* cpo citation
* ddpo
* fix alignprop
* dpo
* gkd citation
* kto
* online dpo citation
* orpo citation
* citation in utils
* optional citation
* reward
* optional trainer citation
* sft
* remove add_model_tags bco
* Remove unnecessary code for adding model tags
* Fix model tag issue and update URL format
* Remove unused code for adding model tags
* Add citation for XPOTrainer
* Remove unused code in SFTTrainer
* Add model card generation in RLOOTrainer
* Remove unused import and method call in reward_trainer.py
* Add model card generation
* Remove unused code and update error message in ORPOTrainer class
* Add import statements and create model card in IterativeSFTTrainer
* Add dataset name to push_to_hub() call
* Update trainer.push_to_hub() dataset names
* script args
* test
* better doc
* fix tag test
* fix test tag
* Add tags parameter to create_model_card method
* doc
* script args
* Update trl/templates/model_card.md
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com >
* unittest's `assertIn` instead of `assert`
* Update trl/templates/model_card.md
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com >
---------
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com >
2024-09-27 15:23:05 +02:00
5368be1e1e
🧹 Style ( #2132 )
...
* drop `# flake8: noqa` in examples
* `__init__.py`
* fix init
* unwrap_model_for_generation
* ignore import violation in init
2024-09-26 21:02:48 +02:00
9af4734178
♻️ Standardize `script_args` ( #2130 )
2024-09-26 15:23:42 +02:00
32d9d34eb1
Standardize pushing to Hub in examples ( #2126 )
2024-09-26 10:00:51 +02:00
4c0c98d950
Standardize dataset naming ( #2081 )
...
* `ds`, `raw_dataset` etc -> `dataset`
* Update docs/source/detoxifying_a_lm.mdx
2024-09-19 08:59:28 +02:00
a8fd6dcd17
Remove `RichProgressCallback` from examples ( #2053 )
...
* Disable RichProgressCallback by default in examples
* Nuke rich
* Clean
2024-09-11 16:51:05 +02:00
7ddef5c158
Make use of `trust_remote_code` consistent ( #1806 )
...
Co-authored-by: Quentin Gallouédec <quentin.gallouedec@huggingface.co >
2024-07-10 18:26:11 +02:00
78045dedc8
Fix `TRL_USE_RICH` environment variable handling ( #1808 )
...
* Add `strtobool` custom implementation from `distutils`
* Fix `TRL_USE_RICH` handling via `strtobool`
* Run `make precommit`
2024-07-07 19:59:26 -04:00
b6af2edc93
add model_init_kwargs to training_args ( #1787 )
2024-07-03 08:29:16 +02:00
9956091112
Add dataset_text_field in examples/scripts/sft.py ( #1758 )
2024-06-21 11:01:08 +02:00
f30daa4225
[SFT] add SFT Trainer Config dataclass ( #1530 )
...
* initial SFT Config
* remove pdb
* fix chat_template
* undo formatting
* add back removed commits
* fix the tests
* add back options to SftScriptArguments
* use sft_script_args
* Update trl/commands/cli_utils.py
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com >
* Update trl/commands/cli_utils.py
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com >
* rename SFTScriptArguments and split names
* formatting docstrings
* docstring
---------
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com >
2024-04-23 11:55:13 +02:00
e04432d5e3
FIX: make the train / test fields modulable ( #1551 )
...
* make the train / test fields modulable
* format
* fix --output_dir issue
2024-04-18 11:33:30 +02:00
a2aa0f0b09
FEAT: Add CLIs in TRL ! ( #1419 )
...
* CLI V1
* v1 CLI
* add rich enhancmeents
* revert unindented change
* some comments
* cleaner CLI
* fix
* fix
* remove print callback
* move to cli instead of trl_cli
* revert unneeded changes
* fix test
* Update trl/commands/sft.py
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com >
* remove redundant strings
* fix import issue
* fix other issues
* add packing
* add config parser
* some refactor
* cleaner
* add example config yaml file
* small refactor
* change a bit the logic
* fix issues here and there
* add CLI in docs
* move to examples/sft
* remove redundant licenses
* make it work on dpo
* set to None
* switch to accelerate and fix many things
* add docs
* more docs
* added tests
* doc clarification
* more docs
* fix CI for windows and python 3.8
* fix
* attempt to fix CI
* fix?
* test
* fix
* tweak?
* fix
* test
* another test
* fix
* test
* fix
* fix
* fix
* skip tests for windows
* test @lvwerra approach
* make dev
* revert unneeded changes
* fix sft dpo
* optimize a bit
* address final comments
* update docs
* final comment
---------
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com >
2024-03-18 12:20:54 +01:00
9bc478ecbb
pre-commit: replace linters + formatters with Ruff; fix some issues ( #1300 )
...
* pre-commit: replace linters + formatters with Ruff
* Don't use bare except
* Clean up `noqa`s
* Enable Ruff UP; apply auto-fixes
* Enable Ruff B; apply fixes
* Enable Ruff T with exceptions
* Enable Ruff C (complexity); autofix
* Upgrade Ruff to 0.2.0
2024-02-15 04:37:41 +01:00
3843cfc32f
Fix SFT tuner ( #1278 )
2024-01-26 17:49:50 +01:00
9a71e67be9
Remove tyro ( #1176 )
...
* refactor
* Remove tyro in `ppo.py`
* quick update
* update default args
* quick push
* precommit
* refactor
* quick change
* remove tyro
* quick change
* precommit
* quick change
* fix hello_world
* remove docstring diffences
* add `module load cuda/12.1`
* push changes
* precommit
* make dpo runnable
* fix circular import
* quick fix
* refactor
* quick update
* path change
* update plots
* fix docs
* quick change
* Update trl/trainer/model_config.py
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com >
* Update trl/trainer/model_config.py
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com >
* Update trl/trainer/utils.py
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com >
* Update examples/scripts/dpo.py
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com >
* address comments. use attn_implementation
* precommit
* remove duplicate code
* update peft.py
* fix test no op dep
* Update trl/trainer/utils.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
* Apply suggestions from code review
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com >
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
* precommit
* add docs
---------
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com >
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
2024-01-26 07:51:15 -08:00
ef209e311f
[`core` / tests ] v1 slow tests ( #1218 )
...
* v1 slow tests
* nit
* add qlora tests for DPO
* add decorator
* release memory + log reports
* report to none to avoid seg fault issues
* update setup
* fix
* add exampel testing
* fix nit
* change temp filename
* add workflow file
* fix comment
* add slack push script
* more tests for DPO
* add dpo example tests
* another makefile command
* fix
* add paths + clean up
* nit
* Update slow-tests.yml
* trigger tests
* up
* up
* more fixes
* fix
* final fixes
* minor fixes
* oops
* add more text
* fix
* more
* trigger CI
* up
* fix
* remove
* run the tests on 2 GPUs only
* final fix SFT
* revert config files + address comments
* fix
* add Phi
* final fixes
* final fix
2024-01-17 10:17:57 +01:00
18a33ffcd3
SFT Tokenizer Fix ( #1142 )
2023-12-27 10:25:56 +01:00
c0ce52ab26
consistency on log ( #1084 )
2023-12-12 10:58:21 +01:00
8f0fc4c8f7
Add args to SFT example ( #1079 )
2023-12-11 16:16:47 +01:00
cbc6c9bb3e
[`core` / `DDP`] Fix RM trainer + DDP + quantization + propagate `gradient_checkpointing_kwargs` in SFT & DPO ( #912 )
...
* make use of forward hooks
* correctly delete attributes
* fix RM DPP issues
* revert unneeded changes
* more fixes
* fix diff
* fix
* propagate to SFT
* Update examples/scripts/reward_modeling.py
* propagate the fix on DPO trainer
* add to example scripts
* trigger CI
2023-10-31 18:50:17 +01:00
ec9e76623e
[Feature] Enable Intel XPU support ( #839 )
...
* enable xpu support
* fix bug
* review commits
* fix style
* add xou decorator
* refactor review commit
* fix test
* review commit
* fix test
* Update benchmark.yml (#856 )
* Standardise example scripts (#842 )
* Standardise example scripts
* fix plotting script
* Rename run_xxx to xxx
* Fix doc
---------
Co-authored-by: Costa Huang <costa.huang@outlook.com >
* Fix version check in import_utils.py (#853 )
* dont use get_peft_model if model is already peft (#857 )
* merge conflict
* add xou decorator
* resolve
* resolves
* upstream
* refactor and precommit
* fix new tests
* add device mapping for xpu
---------
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com >
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com >
Co-authored-by: Costa Huang <costa.huang@outlook.com >
Co-authored-by: Adam Pauls <adpauls@gmail.com >
Co-authored-by: abhishek thakur <1183441+abhishekkrthakur@users.noreply.github.com >
2023-10-31 10:15:35 +01:00
ddd318865b
Standardise example scripts ( #842 )
...
* Standardise example scripts
* fix plotting script
* Rename run_xxx to xxx
* Fix doc
---------
Co-authored-by: Costa Huang <costa.huang@outlook.com >
2023-10-11 17:28:15 +02:00