| 49c8f14b06 | Add Qwen3-VL notebooks (SFT, GRPO) (#4275); Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> | 2025-10-14 18:45:01 +02:00 |
| cefbacb30e | Fix style with make precommit (#4265) | 2025-10-14 12:13:15 +02:00 |
| fae245a062 | Use FutureWarning instead of DeprecationWarning (#4266) | 2025-10-14 12:12:03 +02:00 |
| 2aa9506c69 | Fix docstring interlinks (#4221) | 2025-10-13 13:40:24 +02:00 |
| d6eeb290d9 | Raise deprecation warning for Python 3.9 (#4226) | 2025-10-13 11:06:09 +02:00 |
| 1684ef279a | Fix Python version check for skipping tests on Python 3.13.8 (#4246) | 2025-10-10 17:41:24 +02:00 |
| aab21eb5e7 | Include chat_template_kwargs in apply_chat_template (#4233); Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com> | 2025-10-10 10:39:29 -05:00 |
| b997a31981 | [Online-DPO] fix the completion_len == max_new_tokens crash (#4193); Co-authored-by: Albert Villanova del Moral <8515462+albertvillanova@users.noreply.github.com>; Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> | 2025-10-10 17:21:01 +02:00 |
| 86d1963cc1 | Fix CI slow test AttributeError: 'TestSFTTrainerSlow' object has no attribute 'addCleanup' (#4255) | 2025-10-10 17:19:53 +02:00 |
| 039d526d24 | Deprecate unused dataset_formatting module (#4242); Co-authored-by: behroozazarkhalili <ermiaazarkhalili>; Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com> | 2025-10-10 10:16:18 -05:00 |
| bcd059a384 | Remove obsolete research_projects directory (#4243); Co-authored-by: behroozazarkhalili <ermiaazarkhalili>; Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com> | 2025-10-10 10:15:47 -05:00 |
| 0e57b4a9df | 🧺 [3/N] Refactor _generate in GRPO/RLOO: Rely on generator for prompt truncation (#4153) | 2025-10-10 10:02:11 -05:00 |
| 98488e0946 | Fix CI slow test ValueError: Unknown loss type: dapo (#4254) | 2025-10-10 16:37:02 +02:00 |
| f45e86571b | Fix CI ImportError for 'require_torch_gpu_if_bnb_not_multi_backend_enabled' (#4253) | 2025-10-10 16:13:22 +02:00 |
| f5827928a0 | Install peft from main for CI tests with dev dependencies (#4250) | 2025-10-10 16:12:15 +02:00 |
| f853e091ea | Fix CI CUDA out of memory errors by improving GPU memory management (#4238) | 2025-10-10 09:49:45 +02:00 |
| 803ec0d856 | Fix CI slow test ValueError: Backward pass should have cleared tracker of all tensors (#4236); Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>; Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com> | 2025-10-10 09:28:34 +02:00 |
| 7a0a615d50 | Warnings pointing to RFC (#4224) | 2025-10-09 17:05:36 -06:00 |
| c38cb69ec7 | 🧘 Enhance markdown style (#4235) | 2025-10-09 13:49:44 -05:00 |
| 68ef15c686 | Remove unused log_example_reports.py script (#4241); Co-authored-by: behroozazarkhalili <ermiaazarkhalili> | 2025-10-09 09:18:48 -07:00 |
| 3dd7fc2850 | Fix CI IndentationError for Python 3.13.8 (#4240) | 2025-10-09 15:46:41 +02:00 |
| 51ced65153 | Replace setup with pyproject in CI tests paths (#4230) | 2025-10-09 09:35:08 +02:00 |
| 4bb883a6e6 | Update CI Docker image to pytorch/pytorch:2.8.0 (#4232) | 2025-10-09 08:09:15 +02:00 |
| f7846321e7 | Remove unused Path import in __init__.py (#4227) | 2025-10-08 21:30:54 +02:00 |
| a944890ff1 | Fix callable annotations (#4216) | 2025-10-08 21:21:21 +02:00 |
| 521db3520a | Fix CI unittest asserts (#4234) | 2025-10-08 21:18:41 +02:00 |
| e2c97a805a | Exclude vllm dependencies from dev extra (#4229) | 2025-10-08 18:14:23 +02:00 |
| d1d0407d3c | 🏷️ Account for token_type_ids in DataCollatorForVisionLanguageModeling (#4190) | 2025-10-08 09:34:48 -06:00 |
| 824ff8c73e | Add Efficient Online Training with GRPO and vLLM in TRL to community tutorials (#4219) | 2025-10-08 12:59:04 +02:00 |
| f15399d3d3 | Fix entropy and accuracy calculation for prompt_tuning techniques. (#4196) | 2025-10-08 09:42:19 +01:00 |
| cc578b6b14 | 🧺 [2/N] Refactor _generate in GRPO/RLOO: Use prompt_ids from generation (#4152) | 2025-10-07 12:11:34 -06:00 |
| 30cf68a97b | 🎨 Support mixing image+text and text-only examples (#4203); Co-authored-by: Sergio Paniego Blanco <sergiopaniegoblanco@gmail.com> | 2025-10-07 10:21:10 -06:00 |
| 452284b8dc | Add trainers taxonomy to docs (#4195) | 2025-10-07 16:06:30 +02:00 |
| 6be53e19bc | [DOCS] fix prose in lora guide (#4217) | 2025-10-07 10:40:37 +02:00 |
| 3080fc1bd7 | Fix LoRA params in Python in LoRA without regret (#4215) | 2025-10-07 09:56:04 +02:00 |
| 5d870955f8 | Fix prompt-completion labeling with add_generation_prompt and warning (#4201); Co-authored-by: behroozazarkhalili <ermiaazarkhalili>; Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>; Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com> | 2025-10-06 18:35:50 -06:00 |
| 8265800abf | Fix trl-internal-testing/tiny-DbrxForCausalLM (#4213) | 2025-10-06 15:11:16 -06:00 |
| 65eb45c32b | Apply style and revert change in sft_video_llm example (#4214) | 2025-10-06 13:07:18 -06:00 |
| ae6837f8d4 | Removed tokenizer/processor creation from example scripts (#4211) | 2025-10-06 18:40:18 +02:00 |
| 56a8f1128b | Replace setup with pyproject and fix packaging unintended modules (#4194) | 2025-10-06 17:45:44 +02:00 |
| 529101537f | Remove Optional from processing_class in PPOTrainer (#4212) | 2025-10-06 16:04:06 +02:00 |
| 0588b1f01d | Updated vLLM integration guide (#4162); Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> | 2025-10-06 15:57:17 +02:00 |
| 45ee98b05e | Replace unittest with pytest (#4188) | 2025-10-06 11:14:54 +02:00 |
| 3800a6ecc7 | Hotfix: Exclude transformers 4.57.0 for Python 3.9 (#4209); Co-authored-by: Sergio Paniego Blanco <sergiopaniegoblanco@gmail.com> | 2025-10-06 11:13:21 +02:00 |
| 7ad9ce8acc | Remove tokenizer creation from sft example script (#4197) | 2025-10-06 11:04:20 +02:00 |
| 0c2dc14014 | Remove custome_container for building the docs (#4198) | 2025-10-06 08:31:58 +02:00 |
| ced8b337ba | [DOCS/FIX] lora without regrets - fix lr (#4207) | 2025-10-06 08:23:11 +02:00 |
| 1eff7da9e0 | [DOCS] Lora without regret (#4181); Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>; Co-authored-by: sergiopaniego <sergiopaniegoblanco@gmail.com>; Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>; Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com> | 2025-10-03 20:40:37 +02:00 |
| 1cbfb00b6a | Replace remaining trainer.tokenizer with trainer.processing_class in GRPO test (#4192) | 2025-10-03 09:08:53 +02:00 |
| e086f073cf | 🌡️ Have vLLM return processed (temperature scaled) log probs (#4163); Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com>; Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> | 2025-10-01 11:58:13 -06:00 |