Commit Graph

1902 Commits

SHA1 Message Date
ef40c047aa Replace unittest skipTest with pytest.skip (#4263) 2025-10-15 18:15:28 +02:00
7e0adbc552 Fix CI dev test TypeError: unexpected keyword argument 'load_in_4bit' (#4262) 2025-10-15 18:14:49 +02:00
773afd9314 💰 RichProgressCallback enhancement (#4245) 2025-10-15 09:39:17 -06:00
966b397201 Fix CI slow test OSError: You are trying to access a gated repo (#4283) 2025-10-15 16:11:11 +02:00
927cf6ba46 Fix docstrings with Sphinx 'deprecated' directive (#4279) 2025-10-15 10:39:12 +02:00
56cb6ccf76 Fix typo in Colab link (#4276) 2025-10-14 18:51:17 +02:00
49c8f14b06 Add Qwen3-VL notebooks (SFT, GRPO) (#4275)
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-10-14 18:45:01 +02:00
cefbacb30e Fix style with make precommit (#4265) 2025-10-14 12:13:15 +02:00
fae245a062 Use FutureWarning instead of DeprecationWarning (#4266) 2025-10-14 12:12:03 +02:00
2aa9506c69 Fix docstring interlinks (#4221) 2025-10-13 13:40:24 +02:00
d6eeb290d9 Raise deprecation warning for Python 3.9 (#4226) 2025-10-13 11:06:09 +02:00
1684ef279a Fix Python version check for skipping tests on Python 3.13.8 (#4246) 2025-10-10 17:41:24 +02:00
aab21eb5e7 Include chat_template_kwargs in apply_chat_template (#4233)
Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com>
2025-10-10 10:39:29 -05:00
b997a31981 [Online-DPO] fix the completion_len == max_new_tokens crash (#4193)
Co-authored-by: Albert Villanova del Moral <8515462+albertvillanova@users.noreply.github.com>
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
2025-10-10 17:21:01 +02:00
86d1963cc1 Fix CI slow test AttributeError: 'TestSFTTrainerSlow' object has no attribute 'addCleanup' (#4255) 2025-10-10 17:19:53 +02:00
039d526d24 Deprecate unused dataset_formatting module (#4242)
Co-authored-by: behroozazarkhalili <ermiaazarkhalili>
Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com>
2025-10-10 10:16:18 -05:00
bcd059a384 Remove obsolete research_projects directory (#4243)
Co-authored-by: behroozazarkhalili <ermiaazarkhalili>
Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com>
2025-10-10 10:15:47 -05:00
0e57b4a9df 🧺 [3/N] Refactor _generate in GRPO/RLOO: Rely on generator for prompt truncation (#4153) 2025-10-10 10:02:11 -05:00
98488e0946 Fix CI slow test ValueError: Unknown loss type: dapo (#4254) 2025-10-10 16:37:02 +02:00
f45e86571b Fix CI ImportError for 'require_torch_gpu_if_bnb_not_multi_backend_enabled' (#4253) 2025-10-10 16:13:22 +02:00
f5827928a0 Install peft from main for CI tests with dev dependencies (#4250) 2025-10-10 16:12:15 +02:00
f853e091ea Fix CI CUDA out of memory errors by improving GPU memory management (#4238) 2025-10-10 09:49:45 +02:00
803ec0d856 Fix CI slow test ValueError: Backward pass should have cleared tracker of all tensors (#4236)
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
2025-10-10 09:28:34 +02:00
7a0a615d50 Warnings pointing to RFC (#4224) 2025-10-09 17:05:36 -06:00
c38cb69ec7 🧘 Enhance markdown style (#4235) 2025-10-09 13:49:44 -05:00
68ef15c686 Remove unused log_example_reports.py script (#4241)
Co-authored-by: behroozazarkhalili <ermiaazarkhalili>
2025-10-09 09:18:48 -07:00
3dd7fc2850 Fix CI IndentationError for Python 3.13.8 (#4240) 2025-10-09 15:46:41 +02:00
51ced65153 Replace setup with pyproject in CI tests paths (#4230) 2025-10-09 09:35:08 +02:00
4bb883a6e6 Update CI Docker image to pytorch/pytorch:2.8.0 (#4232) 2025-10-09 08:09:15 +02:00
f7846321e7 Remove unused Path import in __init__.py (#4227) 2025-10-08 21:30:54 +02:00
a944890ff1 Fix callable annotations (#4216) 2025-10-08 21:21:21 +02:00
521db3520a Fix CI unittest asserts (#4234) 2025-10-08 21:18:41 +02:00
e2c97a805a Exclude vllm dependencies from dev extra (#4229) 2025-10-08 18:14:23 +02:00
d1d0407d3c 🏷️ Account for token_type_ids in DataCollatorForVisionLanguageModeling (#4190) 2025-10-08 09:34:48 -06:00
824ff8c73e Add Efficient Online Training with GRPO and vLLM in TRL to community tutorials (#4219) 2025-10-08 12:59:04 +02:00
f15399d3d3 Fix entropy and accuracy calculation for prompt_tuning techniques. (#4196) 2025-10-08 09:42:19 +01:00
cc578b6b14 🧺 [2/N] Refactor _generate in GRPO/RLOO: Use prompt_ids from generation (#4152) 2025-10-07 12:11:34 -06:00
30cf68a97b 🎨 Support mixing image+text and text-only examples (#4203)
Co-authored-by: Sergio Paniego Blanco <sergiopaniegoblanco@gmail.com>
2025-10-07 10:21:10 -06:00
452284b8dc Add trainers taxonomy to docs (#4195) 2025-10-07 16:06:30 +02:00
6be53e19bc [DOCS] fix prose in lora guide (#4217) 2025-10-07 10:40:37 +02:00
3080fc1bd7 Fix LoRA params in Python in LoRA without regret (#4215) 2025-10-07 09:56:04 +02:00
5d870955f8 Fix prompt-completion labeling with add_generation_prompt and warning (#4201)
Co-authored-by: behroozazarkhalili <ermiaazarkhalili>
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com>
2025-10-06 18:35:50 -06:00
8265800abf Fix trl-internal-testing/tiny-DbrxForCausalLM (#4213) 2025-10-06 15:11:16 -06:00
65eb45c32b Apply style and revert change in sft_video_llm example (#4214) 2025-10-06 13:07:18 -06:00
ae6837f8d4 Removed tokenizer/processor creation from example scripts (#4211) 2025-10-06 18:40:18 +02:00
56a8f1128b Replace setup with pyproject and fix packaging unintended modules (#4194) 2025-10-06 17:45:44 +02:00
529101537f Remove Optional from processing_class in PPOTrainer (#4212) 2025-10-06 16:04:06 +02:00
0588b1f01d Updated vLLM integration guide (#4162)
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
2025-10-06 15:57:17 +02:00
45ee98b05e Replace unittest with pytest (#4188) 2025-10-06 11:14:54 +02:00
3800a6ecc7 Hotfix: Exclude transformers 4.57.0 for Python 3.9 (#4209)
Co-authored-by: Sergio Paniego Blanco <sergiopaniegoblanco@gmail.com>
2025-10-06 11:13:21 +02:00