frozenleaves/trl

mirror of https://github.com/huggingface/trl.git synced 2025-10-20 18:43:52 +08:00

Author	SHA1	Message	Date
Sergio Paniego Blanco	28bba8c6b1	Added SFT LoRA notebook (#4244 ) Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com> Co-authored-by: Albert Villanova del Moral <8515462+albertvillanova@users.noreply.github.com>	2025-10-20 11:24:54 +02:00
Pramodith Ballapuram	8e2d5516ca	Add accuracy reward (#4270 ) Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com>	2025-10-15 18:01:07 -06:00
Sergio Paniego Blanco	56cb6ccf76	Fix typo in Colab link (#4276 )	2025-10-14 18:51:17 +02:00
Sergio Paniego Blanco	49c8f14b06	Add Qwen3-VL notebooks (SFT, GRPO) (#4275 ) Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-10-14 18:45:01 +02:00
Behrooz Azarkhalili	bcd059a384	Remove obsolete research_projects directory (#4243 ) Co-authored-by: behroozazarkhalili <ermiaazarkhalili> Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com>	2025-10-10 10:15:47 -05:00
Quentin Gallouédec	c38cb69ec7	🧘 Enhance markdown style (#4235 )	2025-10-09 13:49:44 -05:00
Quentin Gallouédec	65eb45c32b	Apply style and revert change in `sft_video_llm` example (#4214 )	2025-10-06 13:07:18 -06:00
Sergio Paniego Blanco	ae6837f8d4	Removed tokenizer/processor creation from example scripts (#4211 )	2025-10-06 18:40:18 +02:00
Quentin Gallouédec	251fdb228a	📌 Pin vLLM version (#4122 )	2025-09-23 08:02:30 -06:00
Behrooz Azarkhalili	3c8d7209f1	👁️ Add VLM support to RLOO trainer (#4067 ) Co-authored-by: behroozazarkhalili <ermiaazarkhalili> Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com>	2025-09-18 21:54:06 -06:00
lewtun	45e59f77ea	⌨️ Pin num2words (#4094 ) Co-authored-by: sergiopaniego <sergiopaniegoblanco@gmail.com>	2025-09-16 08:48:09 -06:00
Sergio Paniego Blanco	4bd4acf172	🏞️ Context Parallelism benchmark guide (#4075 ) Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com> Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2025-09-16 08:46:12 -06:00
Quentin Gallouédec	78f1a928ce	🗑️ Remove deprecated `AlignPropTrainer`, `DDPOTrainer` and `IterativeSFTTrainer` (#4068 )	2025-09-15 09:56:41 -06:00
Quentin Gallouédec	9955ee7eaa	🐳 Docker update + Simplify Jobs doc (#3931 ) Co-authored-by: sergiopaniego <sergiopaniegoblanco@gmail.com> Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>	2025-09-13 18:35:55 -06:00
Albert Villanova del Moral	e8b8499f1f	Remove redundant 'None' from docstrings (#4058 )	2025-09-11 08:16:34 +02:00
Quentin Gallouédec	a647e5a78a	🗜 Hotfix: avoid passing `quantization_config=None` (#4019 )	2025-09-09 14:50:15 -06:00
Quentin Gallouédec	af82b38482	⚖️ Remove `average_tokens_across_devices` default replacement (#4039 )	2025-09-09 07:39:12 -06:00
Kashif Rasul	1b799a23c1	🥓 [docs] add CP docs (#3994 ) Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> Co-authored-by: Sergio Paniego Blanco <sergiopaniegoblanco@gmail.com> Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com>	2025-09-08 21:46:22 -06:00
Albert Villanova del Moral	c9484b161f	Align docstring parameters with function definitions (#4017 )	2025-09-07 10:40:09 +02:00
johann	d1bf56020d	⚖️ Add vLLM server mode and VLM support to OnlineDPOTrainer (#3783 ) Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com> Co-authored-by: Sergio Paniego Blanco <sergiopaniegoblanco@gmail.com> Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com>	2025-09-05 16:58:49 -06:00
Sergio Paniego Blanco	0c69fd2867	👷 Added Kernels on the Hub x TRL guide (#3969 ) Co-authored-by: vb <vaibhavs10@gmail.com> Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com> Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>	2025-09-04 15:37:02 +02:00
Sergio Paniego Blanco	208e9f7df7	📏 `torch_dype` to `dtype` everywhere (#4000 ) Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com>	2025-09-03 15:45:37 -06:00
Yao Matrix	8aa0eed816	ℹ️ Validate examples on xpu (#3897 ) Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Signed-off-by: YAO Matrix <matrix.yao@intel.com>	2025-08-29 10:56:57 -07:00
Shirin Yamani	e7b37d4e8d	🔥 [Refactor] RLOOTrainer (#3801 ) Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com> Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Edward Beeching <edbeeching@users.noreply.github.com>	2025-08-29 09:27:28 -06:00
Sergio Paniego Blanco	0c91515b58	🧭 HF jobs x TRL guide (#3890 ) Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com> Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com>	2025-08-26 21:44:29 -07:00
kaixuanliu	251c0488c8	📦 Wrapping the main execution code to avoid multi-processing issues from vLLM (#3932 ) Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>	2025-08-21 12:45:13 -07:00
Quentin Gallouédec	48d7ecc67b	🗑️ Deprecate `setup_chat_format` (#3929 )	2025-08-20 14:06:23 -07:00
Quentin Gallouédec	8793a46760	🧾 Use `logger.warning` instead of `warnings.warn` (#3923 ) Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>	2025-08-20 09:20:09 -07:00
Quentin Gallouédec	0227d68e50	🌓 SFTTrainer for VLM: Support for prompt-completion data (#3907 )	2025-08-18 16:46:17 -07:00
Quentin Gallouédec	44e6c153a5	🔮 Native VLM support for `SFTTrainer` (#3862 ) Co-authored-by: Sergio Paniego Blanco <sergiopaniegoblanco@gmail.com> Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>	2025-08-12 20:43:00 -07:00
Sergio Paniego Blanco	cb95323429	👋 Remove `--bf16` value in scripts (#3869 )	2025-08-07 12:25:36 -07:00
Quentin Gallouédec	2fb7090231	👁️ From `AutoModelForVision2Seq` to `AutoModelForImageTextToText` (#3836 )	2025-08-07 08:00:16 -07:00
Quentin Gallouédec	17393b8c82	🌺 OpenAI GPT OSS & Harmony support (#3848 ) Co-authored-by: Shirin Yamani <75791599+shirinyamani@users.noreply.github.com> Co-authored-by: Sergio Paniego Blanco <sergiopaniegoblanco@gmail.com>	2025-08-05 09:44:59 -07:00
Sergio Paniego Blanco	3ae60cd1b4	Add GSPO script examples (VLM/LLM) (#3810 )	2025-07-30 20:07:23 -06:00
Sergio Paniego Blanco	25ce0f31ae	🐙 Add MPO VLM example script (#3799 )	2025-07-29 20:52:32 -06:00
Kashif Rasul	fcd3e0fd15	🌋 [GRPO] add support for `pixel_attention_mask` (SmolVLM2) and `image_sizes` (LLaVa-Next) (#3760 ) Co-authored-by: sergiopaniego <sergiopaniego@users.noreply.huggingface.co> Co-authored-by: sergiopaniego <sergiopaniegoblanco@gmail.com> Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com>	2025-07-28 16:28:29 -06:00
Quentin Gallouédec	2f4cb38f28	📐 Fix CI and `GeometricMixtureWrapper` (#3779 )	2025-07-26 16:15:08 -06:00
Quentin Lhoest	a043fd74a3	Add uv scripts headers (#3767 )	2025-07-25 07:48:40 -07:00
Marc Kassubeck	56f4201db6	👁️ [GRPO] Add VLM training capabilities to the trainer (#3072 )	2025-07-22 20:31:08 -07:00
Quentin Gallouédec	dffd1acb94	👋 Remove `--bf16` flag from training scripts (#3724 ) Co-authored-by: Shirin Yamani <75791599+shirinyamani@users.noreply.github.com>	2025-07-11 18:20:15 -07:00
Qizhi Chen	43e6b24e70	Remove deprecated `processor.tokenizer` (#3720 )	2025-07-11 15:46:34 -06:00
Wei Han (Henry)	ab331bfd56	Update dpo_vlm.py (#3629 ) Co-authored-by: Shirin Yamani <75791599+shirinyamani@users.noreply.github.com>	2025-06-24 13:56:34 +02:00
Shirin Yamani	b40c959c00	fixing num_processes (#3637 )	2025-06-24 13:42:58 +02:00
Quentin Gallouédec	ed9b78a5f7	🗳️ Remove `logging_steps` parameter from for simpler setup (#3612 )	2025-06-18 13:52:21 +02:00
FT	8a235a9b71	Fix Typo in Documentation and Notebook; Improve Library Installation Comment (#3593 )	2025-06-15 16:46:41 +02:00
Tony Wu	3d077fd3de	Add support for `IterableDataset` in DPO Trainer (#3559 ) Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>	2025-06-12 13:06:34 +02:00
Yao Matrix	1314aac502	ℹ️ Unify autocast behavior to `torch.autocast` and make it cover XPU (#3541 ) Signed-off-by: YAO Matrix <matrix.yao@intel.com> Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com> Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com>	2025-06-10 09:13:00 +02:00
Kashif Rasul	c7e3f096a5	[GKD] fix the gkd script (#3497 )	2025-05-26 20:22:15 +02:00
Nikolai Kummer	31bf3f9244	Fix typo (#3489 )	2025-05-24 13:24:15 +02:00
lewtun	45f4c58832	✌️ Add support for FSDP2 (#3317 ) Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com> Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2025-05-06 08:29:11 +02:00


354 Commits