🃏 Model card for TRL (#2123)

mirror of https://github.com/huggingface/trl.git synced 2025-10-21 02:53:59 +08:00

* template and util

* test for online dpo

* template in package_data

* template in manifest

* standardize push_to_hub

* wandb badge and quick start

* bco

* xpo

* simplify `create_model_card`

* cpo

* kto

* dpo

* gkd

* orpo

* style

* nash-md

* alignprop

* bco citation

* citation template

* cpo citation

* ddpo

* fix alignprop

* dpo

* gkd citation

* kto

* online dpo citation

* orpo citation

* citation in utils

* optional citation

* reward

* optional trainer citation

* sft

* remove add_model_tags bco

* Remove unnecessary code for adding model tags

* Fix model tag issue and update URL format

* Remove unused code for adding model tags

* Add citation for XPOTrainer

* Remove unused code in SFTTrainer

* Add model card generation in RLOOTrainer

* Remove unused import and method call in reward_trainer.py

* Add model card generation

* Remove unused code and update error message in ORPOTrainer class

* Add import statements and create model card in IterativeSFTTrainer

* Add dataset name to push_to_hub() call

* Update trainer.push_to_hub() dataset names

* script args

* test

* better doc

* fix tag test

* fix test tag

* Add tags parameter to create_model_card method

* doc

* script args

* Update trl/templates/model_card.md

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

* unittest's `assertIn` instead of `assert`

* Update trl/templates/model_card.md

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

---------

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

Author: Quentin Gallouédec
Date: 2024-09-27 15:23:05 +02:00
Committed by: GitHub
Parent: 124189c86a
Commit: c00722ce0a

42 changed files with 1032 additions and 254 deletions

									
examples/scripts/reward_modeling.py (2 lines changed)

@@ -130,4 +130,4 @@ if __name__ == "__main__":
     # Save and push to hub
     trainer.save_model(training_args.output_dir)
     if training_args.push_to_hub:
-        trainer.push_to_hub()
+        trainer.push_to_hub(dataset_name=script_args.dataset_name)
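The change above passes the dataset name from the script arguments into `push_to_hub`, presumably so it can appear in the generated model card. The following is a minimal, self-contained sketch of that pattern, not TRL's implementation: `SketchTrainer`, its `create_model_card` body, the card wording, and the dataset name `my-org/my-preference-data` are all hypothetical stand-ins chosen for illustration, and nothing is uploaded.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SketchTrainer:
    """Hypothetical stand-in for a trainer; this is NOT TRL's actual class."""

    model_name: str
    pushed_cards: List[str] = field(default_factory=list)

    def create_model_card(self, dataset_name: Optional[str] = None) -> str:
        # Build a tiny card; the dataset sentence is added only when a name is given.
        lines = [f"# Model card for {self.model_name}"]
        if dataset_name is not None:
            lines.append(f"Trained on the `{dataset_name}` dataset.")
        return "\n".join(lines)

    def push_to_hub(self, dataset_name: Optional[str] = None) -> str:
        # A real trainer would upload files here; this sketch only records the card.
        card = self.create_model_card(dataset_name=dataset_name)
        self.pushed_cards.append(card)
        return card


trainer = SketchTrainer(model_name="reward-model-sketch")

# Old call style: no dataset information reaches the card.
card_without = trainer.push_to_hub()

# New call style: the dataset name is forwarded and recorded in the card.
card_with = trainer.push_to_hub(dataset_name="my-org/my-preference-data")
print(card_with)
```

Keeping `dataset_name` optional means existing calls without arguments keep working, which matches the one-line nature of the change in the example script.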
