trl/source at remove-best-of-n-sampler

frozenleaves/trl

mirror of https://github.com/huggingface/trl.git synced 2025-10-20 18:43:52 +08:00

behroozazarkhalili 56180a5e26 Remove BestOfNSampler class

This removes the BestOfNSampler class as it's considered out of scope for the TRL library. The class provided a "Best of N" sampling strategy for PPO fine-tuning, but this functionality is not a core part of TRL's reinforcement learning training capabilities.

Changes:
- Removed implementation: trl/extras/best_of_n_sampler.py
- Removed tests: tests/test_best_of_n_sampler.py
- Removed documentation: docs/source/best_of_n.md
- Removed example notebook: examples/notebooks/best_of_n.ipynb
- Updated imports: trl/__init__.py, trl/extras/__init__.py
- Updated documentation references: docs/source/_toctree.yml, docs/source/example_overview.md, examples/notebooks/README.md

2025-10-11 06:52:30 -07:00

_toctree.yml

Remove BestOfNSampler class

2025-10-11 06:52:30 -07:00

bco_trainer.md

🧘 Enhance markdown style (#4235)

2025-10-09 13:49:44 -05:00

callbacks.md

🧶 feat: Add WeaveCallback for W&B Weave integration (#4089)

2025-09-17 18:10:45 -06:00

clis.md

🧘 Enhance markdown style (#4235)

2025-10-09 13:49:44 -05:00

community_tutorials.md

🧘 Enhance markdown style (#4235)

2025-10-09 13:49:44 -05:00

cpo_trainer.md

🧘 Enhance markdown style (#4235)

2025-10-09 13:49:44 -05:00

customization.md

🧘 Enhance markdown style (#4235)

2025-10-09 13:49:44 -05:00

data_utils.md

♻️ Reuse multimodal message preparation from SFTTrainer in GRPOTrainer (#3919)

2025-08-20 10:04:54 -07:00

dataset_formats.md

🧘 Enhance markdown style (#4235)

2025-10-09 13:49:44 -05:00

deepspeed_integration.md

💡 Replace <Tip> with new markdown syntax (#4161)

2025-09-29 10:48:00 -06:00

distributing_training.md

🧘 Enhance markdown style (#4235)

2025-10-09 13:49:44 -05:00

dpo_trainer.md

🧘 Enhance markdown style (#4235)

2025-10-09 13:49:44 -05:00

example_overview.md

Remove BestOfNSampler class

2025-10-11 06:52:30 -07:00

experimental.md

💡 Replace <Tip> with new markdown syntax (#4161)

2025-09-29 10:48:00 -06:00

gkd_trainer.md

🧘 Enhance markdown style (#4235)

2025-10-09 13:49:44 -05:00

grpo_trainer.md

🧘 Enhance markdown style (#4235)

2025-10-09 13:49:44 -05:00

how_to_train.md

Fixed some typos and added small details about trackio to docs (#3965)

2025-08-27 17:57:19 +02:00

index.md

🧘 Enhance markdown style (#4235)

2025-10-09 13:49:44 -05:00

installation.md

🧘 Enhance markdown style (#4235)

2025-10-09 13:49:44 -05:00

jobs_training.md

🧘 Enhance markdown style (#4235)

2025-10-09 13:49:44 -05:00

judges.md

💡 Replace <Tip> with new markdown syntax (#4161)

2025-09-29 10:48:00 -06:00

kernels_hub.md

🧘 Enhance markdown style (#4235)

2025-10-09 13:49:44 -05:00

kto_trainer.md

🧘 Enhance markdown style (#4235)

2025-10-09 13:49:44 -05:00

liger_kernel_integration.md

🧘 Enhance markdown style (#4235)

2025-10-09 13:49:44 -05:00

logging.md

🧘 Enhance markdown style (#4235)

2025-10-09 13:49:44 -05:00

lora_without_regret.md

🧘 Enhance markdown style (#4235)

2025-10-09 13:49:44 -05:00

model_utils.md

💬 Fix setup_chat_format and add clone_chat_template (#3404)

2025-06-15 15:59:42 +02:00

models.md

🧘 Enhance markdown style (#4235)

2025-10-09 13:49:44 -05:00

multi_adapter_rl.md

🧘 Enhance markdown style (#4235)

2025-10-09 13:49:44 -05:00

nash_md_trainer.md

🧘 Enhance markdown style (#4235)

2025-10-09 13:49:44 -05:00

online_dpo_trainer.md

🧘 Enhance markdown style (#4235)

2025-10-09 13:49:44 -05:00

orpo_trainer.md

🧘 Enhance markdown style (#4235)

2025-10-09 13:49:44 -05:00

others.md

🪪 Adds a more fine-grained profiling context (#2975)

2025-02-27 21:58:39 +01:00

paper_index.md

🧘 Enhance markdown style (#4235)

2025-10-09 13:49:44 -05:00

peft_integration.md

Remove obsolete research_projects directory (#4243)

2025-10-10 10:15:47 -05:00

ppo_trainer.md

🧘 Enhance markdown style (#4235)

2025-10-09 13:49:44 -05:00

prm_trainer.md

🧘 Enhance markdown style (#4235)

2025-10-09 13:49:44 -05:00

quickstart.md

🎁 RewardTrainer refactor (#4093)

2025-09-30 15:13:45 -06:00

reducing_memory_usage.md

🧘 Enhance markdown style (#4235)

2025-10-09 13:49:44 -05:00

reward_trainer.md

🧘 Enhance markdown style (#4235)

2025-10-09 13:49:44 -05:00

rewards.md

🔥 [Refactor] RLOOTrainer (#3801)

2025-08-29 09:27:28 -06:00

rloo_trainer.md

🧘 Enhance markdown style (#4235)

2025-10-09 13:49:44 -05:00

script_utils.md

🎚️ Add dataset mixer (#3791)

2025-08-11 20:14:50 -07:00

sentiment_tuning.md

🧘 Enhance markdown style (#4235)

2025-10-09 13:49:44 -05:00

sft_trainer.md

🧘 Enhance markdown style (#4235)

2025-10-09 13:49:44 -05:00

speeding_up_training.md

🧘 Enhance markdown style (#4235)

2025-10-09 13:49:44 -05:00

trackio_integration.md

🎁 RewardTrainer refactor (#4093)

2025-09-30 15:13:45 -06:00

unsloth_integration.md

🦥 Unsloth Docs update (#3955)

2025-08-26 20:17:21 -07:00

use_model.md

Fixed some typos and added small details about trackio to docs (#3965)

2025-08-27 17:57:19 +02:00

using_llama_models.md

🧘 Enhance markdown style (#4235)

2025-10-09 13:49:44 -05:00

vllm_integration.md

🧘 Enhance markdown style (#4235)

2025-10-09 13:49:44 -05:00

xpo_trainer.md

🧘 Enhance markdown style (#4235)

2025-10-09 13:49:44 -05:00