This PR removes the BestOfNSampler class because it is considered out of scope for the TRL library. The class provided a "Best of N" sampling strategy for PPO fine-tuning, but this functionality is not a core part of TRL's reinforcement learning training capabilities.
Changes:
- Removed implementation: trl/extras/best_of_n_sampler.py
- Removed tests: tests/test_best_of_n_sampler.py
- Removed documentation: docs/source/best_of_n.md
- Removed example notebook: examples/notebooks/best_of_n.ipynb
- Updated imports: trl/__init__.py, trl/extras/__init__.py
- Updated documentation references: docs/source/_toctree.yml, docs/source/example_overview.md, examples/notebooks/README.md
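
For anyone who relied on the class, the general "Best of N" idea is small enough to keep in user code. The following is a minimal sketch, not the removed implementation: `best_of_n`, `generate`, and `score` are hypothetical names introduced here for illustration, and the sketch assumes you supply your own sampling and reward functions.

```python
from typing import Callable, List


def best_of_n(
    prompt: str,
    generate: Callable[[str], str],      # hypothetical: samples one completion for a prompt
    score: Callable[[str, str], float],  # hypothetical: reward for a (prompt, completion) pair
    n: int = 4,
) -> str:
    """Sample n completions and return the one with the highest score."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Draw n independent candidates from the caller-supplied sampler.
    candidates: List[str] = [generate(prompt) for _ in range(n)]
    # Keep the candidate that the caller-supplied reward function ranks highest.
    return max(candidates, key=lambda completion: score(prompt, completion))
```

In practice `generate` would wrap a model's sampling call and `score` would wrap a reward model; both are left to the caller so the sketch does not depend on any TRL API.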