Commit Graph

5 Commits

Author SHA1 Message Date
9df19e8a75 📜 Fix license and copyrights (#3264) 2025-04-08 15:22:58 -07:00
1d23ecc36f ©️ Update copyrights year (#2547)
* happy new year

* fix wandb import sort
2025-01-07 14:53:09 +01:00
52d213173f 🚜 Use field in dataclasses (#2494)
* in hh-rlhf-helpful-base

* delete tokenize ds

* dataset scripts

* alignprop

* judge tldr

* ddpo

* zen

* sft video

* literal to choices

* chat

* script args

* alignprop

* bco

* better help format

* cpo

* ddpo

* whether or not -> whether

* dpo

* dont set the possible values

* `Optional[...]` to ... or `None`

* xpo

* gkd

* kto

* nash

* online dpo

* Fix typo in learning rate help message

* orpo

* more ... or `None`

* model config

* ppo

* prm

* reward

* rloo

* sft

* online policy config

* make style
2025-01-06 18:29:09 +01:00
9410874787 ©️ Copyrights update (#2454)
* First changes

* Other files

* Finally

* rm comment

* fix nashmd

* Fix example

* Fix example [ci skip]
2024-12-10 10:40:00 +01:00
43df3a485a 🧳 Move zen generation script and fix tests (#2393)
* Move zen

* step -> stepwise_supervision

* Fix train_test_split shuffle issue

* Fix tests

* Update tests/test_sft_trainer.py

Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>

* Fix typo in key name

---------

Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
2024-11-26 14:08:06 +01:00