frozenleaves/trl

mirror of https://github.com/huggingface/trl.git synced 2025-10-21 02:53:59 +08:00

Author	SHA1	Message	Date
Costa Huang	9a71e67be9	Remove tyro (#1176) * refactor * Remove tyro in `ppo.py` * quick update * update default args * quick push * precommit * refactor * quick change * remove tyro * quick change * precommit * quick change * fix hello_world * remove docstring diffences * add `module load cuda/12.1` * push changes * precommit * make dpo runnable * fix circular import * quick fix * refactor * quick update * path change * update plots * fix docs * quick change * Update trl/trainer/model_config.py Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Update trl/trainer/model_config.py Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Update trl/trainer/utils.py Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Update examples/scripts/dpo.py Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * address comments. use attn_implementation * precommit * remove duplicate code * update peft.py * fix test no op dep * Update trl/trainer/utils.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * precommit * add docs --------- Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2024-01-26 07:51:15 -08:00
Younes Belkada	cbc6c9bb3e	[`core` / `DDP`] Fix RM trainer + DDP + quantization + propagate `gradient_checkpointing_kwargs` in SFT & DPO (#912) * make use of forward hooks * correctly delete attributes * fix RM DPP issues * revert unneeded changes * more fixes * fix diff * fix * propagate to SFT * Update examples/scripts/reward_modeling.py * propagate the fix on DPO trainer * add to example scripts * trigger CI	2023-10-31 18:50:17 +01:00
Abhilash Majumder	ec9e76623e	[Feature] Enable Intel XPU support (#839) * enable xpu support * fix bug * review commits * fix style * add xou decorator * refactor review commit * fix test * review commit * fix test * Update benchmark.yml (#856) * Standardise example scripts (#842) * Standardise example scripts * fix plotting script * Rename run_xxx to xxx * Fix doc --------- Co-authored-by: Costa Huang <costa.huang@outlook.com> * Fix version check in import_utils.py (#853) * dont use get_peft_model if model is already peft (#857) * merge conflict * add xou decorator * resolve * resolves * upstream * refactor and precommit * fix new tests * add device mapping for xpu --------- Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com> Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> Co-authored-by: Costa Huang <costa.huang@outlook.com> Co-authored-by: Adam Pauls <adpauls@gmail.com> Co-authored-by: abhishek thakur <1183441+abhishekkrthakur@users.noreply.github.com>	2023-10-31 10:15:35 +01:00
Gaetan LOPEZ LATOUCHE	0a5aee7d99	[reward_modeling] Cleaning example script (#882) * remove load in repeated multiple times & truncation * trigger CI	2023-10-19 16:00:20 +02:00
Costa Huang	f91fb2bda2	remove duplicate key in `reward_modeling.py` (#890)	2023-10-18 23:45:18 +02:00
lewtun	ddd318865b	Standardise example scripts (#842) * Standardise example scripts * fix plotting script * Rename run_xxx to xxx * Fix doc --------- Co-authored-by: Costa Huang <costa.huang@outlook.com>	2023-10-11 17:28:15 +02:00

6 Commits