9a71e67be9
Remove tyro (#1176)
...
* refactor
* Remove tyro in `ppo.py`
* quick update
* update default args
* quick push
* precommit
* refactor
* quick change
* remove tyro
* quick change
* precommit
* quick change
* fix hello_world
* remove docstring differences
* add `module load cuda/12.1`
* push changes
* precommit
* make dpo runnable
* fix circular import
* quick fix
* refactor
* quick update
* path change
* update plots
* fix docs
* quick change
* Update trl/trainer/model_config.py
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
* Update trl/trainer/model_config.py
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
* Update trl/trainer/utils.py
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
* Update examples/scripts/dpo.py
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
* address comments. use attn_implementation
* precommit
* remove duplicate code
* update peft.py
* fix test no op dep
* Update trl/trainer/utils.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* precommit
* add docs
---------
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2024-01-26 07:51:15 -08:00
cbc6c9bb3e
[core / DDP] Fix RM trainer + DDP + quantization + propagate gradient_checkpointing_kwargs in SFT & DPO (#912)
...
* make use of forward hooks
* correctly delete attributes
* fix RM DDP issues
* revert unneeded changes
* more fixes
* fix diff
* fix
* propagate to SFT
* Update examples/scripts/reward_modeling.py
* propagate the fix on DPO trainer
* add to example scripts
* trigger CI
2023-10-31 18:50:17 +01:00
ec9e76623e
[Feature] Enable Intel XPU support (#839)
...
* enable xpu support
* fix bug
* review commits
* fix style
* add xpu decorator
* refactor review commit
* fix test
* review commit
* fix test
* Update benchmark.yml (#856)
* Standardise example scripts (#842)
* Standardise example scripts
* fix plotting script
* Rename run_xxx to xxx
* Fix doc
---------
Co-authored-by: Costa Huang <costa.huang@outlook.com>
* Fix version check in import_utils.py (#853)
* don't use get_peft_model if model is already peft (#857)
* merge conflict
* add xpu decorator
* resolve
* resolves
* upstream
* refactor and precommit
* fix new tests
* add device mapping for xpu
---------
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
Co-authored-by: Costa Huang <costa.huang@outlook.com>
Co-authored-by: Adam Pauls <adpauls@gmail.com>
Co-authored-by: abhishek thakur <1183441+abhishekkrthakur@users.noreply.github.com>
2023-10-31 10:15:35 +01:00
0a5aee7d99
[reward_modeling] Cleaning example script (#882)
...
* remove load in repeated multiple times & truncation
* trigger CI
2023-10-19 16:00:20 +02:00
f91fb2bda2
remove duplicate key in reward_modeling.py (#890)
2023-10-18 23:45:18 +02:00
ddd318865b
Standardise example scripts (#842)
...
* Standardise example scripts
* fix plotting script
* Rename run_xxx to xxx
* Fix doc
---------
Co-authored-by: Costa Huang <costa.huang@outlook.com>
2023-10-11 17:28:15 +02:00