|
28bba8c6b1
|
Added SFT LoRA notebook (#4244)
Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com>
Co-authored-by: Albert Villanova del Moral <8515462+albertvillanova@users.noreply.github.com>
|
2025-10-20 11:24:54 +02:00 |
|
|
8e2d5516ca
|
Add accuracy reward (#4270)
Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com>
|
2025-10-15 18:01:07 -06:00 |
|
|
56cb6ccf76
|
Fix typo in Colab link (#4276)
|
2025-10-14 18:51:17 +02:00 |
|
|
49c8f14b06
|
Add Qwen3-VL notebooks (SFT, GRPO) (#4275)
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
|
2025-10-14 18:45:01 +02:00 |
|
|
bcd059a384
|
Remove obsolete research_projects directory (#4243)
Co-authored-by: behroozazarkhalili <ermiaazarkhalili>
Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com>
|
2025-10-10 10:15:47 -05:00 |
|
|
c38cb69ec7
|
🧘 Enhance markdown style (#4235)
|
2025-10-09 13:49:44 -05:00 |
|
|
65eb45c32b
|
Apply style and revert change in sft_video_llm example (#4214)
|
2025-10-06 13:07:18 -06:00 |
|
|
ae6837f8d4
|
Removed tokenizer/processor creation from example scripts (#4211)
|
2025-10-06 18:40:18 +02:00 |
|
|
251fdb228a
|
📌 Pin vLLM version (#4122)
|
2025-09-23 08:02:30 -06:00 |
|
|
3c8d7209f1
|
👁️ Add VLM support to RLOO trainer (#4067)
Co-authored-by: behroozazarkhalili <ermiaazarkhalili>
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com>
|
2025-09-18 21:54:06 -06:00 |
|
|
45e59f77ea
|
⌨️ Pin num2words (#4094)
Co-authored-by: sergiopaniego <sergiopaniegoblanco@gmail.com>
|
2025-09-16 08:48:09 -06:00 |
|
|
4bd4acf172
|
🏞️ Context Parallelism benchmark guide (#4075)
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
|
2025-09-16 08:46:12 -06:00 |
|
|
78f1a928ce
|
🗑️ Remove deprecated AlignPropTrainer, DDPOTrainer and IterativeSFTTrainer (#4068)
|
2025-09-15 09:56:41 -06:00 |
|
|
9955ee7eaa
|
🐳 Docker update + Simplify Jobs doc (#3931)
Co-authored-by: sergiopaniego <sergiopaniegoblanco@gmail.com>
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
|
2025-09-13 18:35:55 -06:00 |
|
|
e8b8499f1f
|
Remove redundant 'None' from docstrings (#4058)
|
2025-09-11 08:16:34 +02:00 |
|
|
a647e5a78a
|
🗜 Hotfix: avoid passing quantization_config=None (#4019)
|
2025-09-09 14:50:15 -06:00 |
|
|
af82b38482
|
⚖️ Remove average_tokens_across_devices default replacement (#4039)
|
2025-09-09 07:39:12 -06:00 |
|
|
1b799a23c1
|
🥓 [docs] add CP docs (#3994)
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
Co-authored-by: Sergio Paniego Blanco <sergiopaniegoblanco@gmail.com>
Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com>
|
2025-09-08 21:46:22 -06:00 |
|
|
c9484b161f
|
Align docstring parameters with function definitions (#4017)
|
2025-09-07 10:40:09 +02:00 |
|
|
d1bf56020d
|
⚖️ Add vLLM server mode and VLM support to OnlineDPOTrainer (#3783)
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
Co-authored-by: Sergio Paniego Blanco <sergiopaniegoblanco@gmail.com>
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com>
|
2025-09-05 16:58:49 -06:00 |
|
|
0c69fd2867
|
👷 Added Kernels on the Hub x TRL guide (#3969)
Co-authored-by: vb <vaibhavs10@gmail.com>
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
|
2025-09-04 15:37:02 +02:00 |
|
|
208e9f7df7
|
📏 torch_dtype to dtype everywhere (#4000)
Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com>
|
2025-09-03 15:45:37 -06:00 |
|
|
8aa0eed816
|
ℹ️ Validate examples on xpu (#3897)
Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
|
2025-08-29 10:56:57 -07:00 |
|
|
e7b37d4e8d
|
🔥 [Refactor] RLOOTrainer (#3801)
Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com>
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
Co-authored-by: Edward Beeching <edbeeching@users.noreply.github.com>
|
2025-08-29 09:27:28 -06:00 |
|
|
0c91515b58
|
🧭 HF jobs x TRL guide (#3890)
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com>
|
2025-08-26 21:44:29 -07:00 |
|
|
251c0488c8
|
📦 Wrapping the main execution code to avoid multi-processing issues from vLLM (#3932)
Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
|
2025-08-21 12:45:13 -07:00 |
|
|
48d7ecc67b
|
🗑️ Deprecate setup_chat_format (#3929)
|
2025-08-20 14:06:23 -07:00 |
|
|
8793a46760
|
🧾 Use logger.warning instead of warnings.warn (#3923)
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
|
2025-08-20 09:20:09 -07:00 |
|
|
0227d68e50
|
🌓 SFTTrainer for VLM: Support for prompt-completion data (#3907)
|
2025-08-18 16:46:17 -07:00 |
|
|
44e6c153a5
|
🔮 Native VLM support for SFTTrainer (#3862)
Co-authored-by: Sergio Paniego Blanco <sergiopaniegoblanco@gmail.com>
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
|
2025-08-12 20:43:00 -07:00 |
|
|
cb95323429
|
👋 Remove --bf16 value in scripts (#3869)
|
2025-08-07 12:25:36 -07:00 |
|
|
2fb7090231
|
👁️ From AutoModelForVision2Seq to AutoModelForImageTextToText (#3836)
|
2025-08-07 08:00:16 -07:00 |
|
|
17393b8c82
|
🌺 OpenAI GPT OSS & Harmony support (#3848)
Co-authored-by: Shirin Yamani <75791599+shirinyamani@users.noreply.github.com>
Co-authored-by: Sergio Paniego Blanco <sergiopaniegoblanco@gmail.com>
|
2025-08-05 09:44:59 -07:00 |
|
|
3ae60cd1b4
|
Add GSPO script examples (VLM/LLM) (#3810)
|
2025-07-30 20:07:23 -06:00 |
|
|
25ce0f31ae
|
🐙 Add MPO VLM example script (#3799)
|
2025-07-29 20:52:32 -06:00 |
|
|
fcd3e0fd15
|
🌋 [GRPO] add support for pixel_attention_mask (SmolVLM2) and image_sizes (LLaVa-Next) (#3760)
Co-authored-by: sergiopaniego <sergiopaniego@users.noreply.huggingface.co>
Co-authored-by: sergiopaniego <sergiopaniegoblanco@gmail.com>
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com>
|
2025-07-28 16:28:29 -06:00 |
|
|
2f4cb38f28
|
📐 Fix CI and GeometricMixtureWrapper (#3779)
|
2025-07-26 16:15:08 -06:00 |
|
|
a043fd74a3
|
Add uv scripts headers (#3767)
|
2025-07-25 07:48:40 -07:00 |
|
|
56f4201db6
|
👁️ [GRPO] Add VLM training capabilities to the trainer (#3072)
|
2025-07-22 20:31:08 -07:00 |
|
|
dffd1acb94
|
👋 Remove --bf16 flag from training scripts (#3724)
Co-authored-by: Shirin Yamani <75791599+shirinyamani@users.noreply.github.com>
|
2025-07-11 18:20:15 -07:00 |
|
|
43e6b24e70
|
Remove deprecated processor.tokenizer (#3720)
|
2025-07-11 15:46:34 -06:00 |
|
|
ab331bfd56
|
Update dpo_vlm.py (#3629)
Co-authored-by: Shirin Yamani <75791599+shirinyamani@users.noreply.github.com>
|
2025-06-24 13:56:34 +02:00 |
|
|
b40c959c00
|
fixing num_processes (#3637)
|
2025-06-24 13:42:58 +02:00 |
|
|
ed9b78a5f7
|
🗳️ Remove logging_steps parameter for simpler setup (#3612)
|
2025-06-18 13:52:21 +02:00 |
|
|
8a235a9b71
|
Fix Typo in Documentation and Notebook; Improve Library Installation Comment (#3593)
|
2025-06-15 16:46:41 +02:00 |
|
|
3d077fd3de
|
Add support for IterableDataset in DPO Trainer (#3559)
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
|
2025-06-12 13:06:34 +02:00 |
|
|
1314aac502
|
ℹ️ Unify autocast behavior to torch.autocast and make it cover XPU (#3541)
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com>
|
2025-06-10 09:13:00 +02:00 |
|
|
c7e3f096a5
|
[GKD] fix the gkd script (#3497)
|
2025-05-26 20:22:15 +02:00 |
|
|
31bf3f9244
|
Fix typo (#3489)
|
2025-05-24 13:24:15 +02:00 |
|
|
45f4c58832
|
✌️ Add support for FSDP2 (#3317)
Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com>
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
|
2025-05-06 08:29:11 +02:00 |
|