208e9f7df7
📏 `torch_dtype` to `dtype` everywhere ( #4000 )
...
Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com>
2025-09-03 15:45:37 -06:00
ed9b78a5f7
🗳️ Remove `logging_steps` parameter for simpler setup ( #3612 )
2025-06-18 13:52:21 +02:00
9df19e8a75
📜 Fix license and copyrights ( #3264 )
2025-04-08 15:22:58 -07:00
1d23ecc36f
©️ Update copyrights year ( #2547 )
...
* happy new year
* fix wandb import sort
2025-01-07 14:53:09 +01:00
9410874787
©️ Copyrights update ( #2454 )
...
* First changes
* Other files
* Finally
* rm comment
* fix nashmd
* Fix example
* Fix example [ci skip]
2024-12-10 10:40:00 +01:00
c10cc8995b
🗝️ Update type hints ( #2399 )
...
* New type hint structure
* Update type hints
* Delete wrong file
* Remove dict import
2024-11-26 20:37:27 +01:00
fb1b48fdbe
Remove `max_length` from `RewardDataCollatorWithPadding` ( #2119 )
2024-09-26 09:59:12 +02:00
4c92ba5769
©️ Copyrights ( #2063 )
...
* copyrights
* fail if missing
2024-09-13 14:18:47 +02:00
54f806b6ff
Standardize `dataset_num_proc` usage ( #1925 )
...
* uniform dataset_num_proc
* num_proc in shuffle
* Update examples/datasets/anthropic_hh.py
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
* Update examples/scripts/ppo.py
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
* Update examples/scripts/ppo.py
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
---------
Co-authored-by: Quentin Gallouédec <quentin.gallouedec@huggingface.co>
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
2024-08-13 15:10:39 +02:00
c8cef79e6c
arXiv to HF Papers ( #1870 )
2024-07-24 21:06:57 +02:00
3c0a10b1ae
fix dataset load error ( #1670 )
...
Signed-off-by: Wang, Yi <yi.a.wang@intel.com>
2024-05-27 14:52:20 +02:00
a02513c3b7
Apply deprecated `evaluation_strategy` ( #1559 )
...
* Deprecate
* Update tests/test_dpo_trainer.py
---------
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
2024-05-23 12:48:00 +02:00
2a2676e7ec
set seed in sft/dpo/reward_modeling to make results reproducible ( #1357 )
...
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
2024-02-23 11:12:45 +01:00
5c7bfbc8d9
[`examples`] Big refactor of examples and documentation ( #509 )
...
* added sfttrainer and rmtrainer example scripts.
* added a few lines in the documentation.
* moved notebooks.
* delete `examples/summarization`
* remove from docs as well
* refactor sentiment tuning
* more refactoring.
* updated docs for multi-adapter RL.
* add research projects folder
* more refactor
* refactor docs.
* refactor structure
* add correct scripts all over the place
* final touches
* final touches
* updated documentation from feedback.
2023-07-14 12:00:56 +02:00