trl/docs/source/rewards.md
Pramodith Ballapuram 8e2d5516ca Add accuracy reward (#4270)
Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com>
2025-10-15 18:01:07 -06:00

Reward Functions

This module provides ready-made reward functions, intended primarily for use with the [GRPOTrainer] and [RLOOTrainer].

accuracy_reward

autodoc rewards.accuracy_reward
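
To give a feel for what an accuracy-style reward computes, here is a minimal sketch. It is not the library's implementation: `sketch_accuracy_reward` is a hypothetical name, the `completions` and `solution` arguments are assumed, completions are assumed to be plain strings, and the comparison is a plain string match on the last non-empty line rather than whatever answer verification the library performs.

```python
def sketch_accuracy_reward(completions, solution, **kwargs):
    """Hypothetical stand-in: 1.0 if the final line equals the reference, else 0.0."""
    rewards = []
    for completion, reference in zip(completions, solution):
        # Assumption: the answer is the last non-empty line of the completion.
        lines = [line.strip() for line in completion.splitlines() if line.strip()]
        answer = lines[-1] if lines else ""
        rewards.append(1.0 if answer == reference.strip() else 0.0)
    return rewards


completions = ["Let me add.\n4", "I think it is\n5", ""]
solution = ["4", "4", "4"]
rewards = sketch_accuracy_reward(completions, solution)
print(rewards)
```

The sketch returns one float per completion, so a wrong or empty answer scores 0.0 rather than raising an error.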

think_format_reward

autodoc rewards.think_format_reward
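
A format reward of this kind can be approximated with a regular expression. The sketch below is illustrative only: `sketch_think_format_reward` and `THINK_PATTERN` are hypothetical names, completions are assumed to be plain strings, and the assumed format is a completion that starts with `<think>`, contains no second opening tag, and closes the block with `</think>` before the answer.

```python
import re

# Assumed format: "<think>...</think>" at the very start, with no further "<think>" tag.
THINK_PATTERN = re.compile(r"^<think>(?!.*<think>)(.*?)</think>.*$", re.DOTALL)


def sketch_think_format_reward(completions, **kwargs):
    """Hypothetical stand-in: 1.0 if the completion follows the assumed format, else 0.0."""
    return [1.0 if THINK_PATTERN.match(text) else 0.0 for text in completions]


samples = [
    "<think>2 + 2 = 4</think>The answer is 4.",           # well-formed
    "The answer is 4.",                                    # no think block
    "<think>first<think>nested</think>The answer is 4.",  # second opening tag
]
format_rewards = sketch_think_format_reward(samples)
print(format_rewards)
```

The negative lookahead is what rejects the third sample: a second `<think>` anywhere after the first one fails the match.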

get_soft_overlong_punishment

autodoc rewards.get_soft_overlong_punishment
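
The `get_` prefix suggests a factory that returns a reward function. The sketch below assumes a piecewise-linear length penalty: no penalty up to a soft limit, a penalty that grows linearly to -1 at the hard limit, and -1 beyond it. The names `sketch_soft_overlong_punishment`, `max_completion_len`, `soft_punish_cache`, and `completion_ids` are assumptions made for illustration, not a statement of the library's signature.

```python
def sketch_soft_overlong_punishment(max_completion_len, soft_punish_cache):
    """Hypothetical factory: returns a reward function that penalizes long completions."""
    # Completions up to this length are not penalized at all.
    soft_limit = max_completion_len - soft_punish_cache

    def reward_fn(completion_ids, **kwargs):
        rewards = []
        for ids in completion_ids:
            length = len(ids)
            if length <= soft_limit:
                rewards.append(0.0)
            elif length <= max_completion_len:
                # Linear ramp from 0 at the soft limit down to -1 at the hard limit.
                rewards.append((soft_limit - length) / soft_punish_cache)
            else:
                rewards.append(-1.0)
        return rewards

    return reward_fn


penalty = sketch_soft_overlong_punishment(max_completion_len=100, soft_punish_cache=20)
length_rewards = penalty([[0] * 50, [0] * 90, [0] * 100, [0] * 120])
print(length_rewards)
```

Returning a closure keeps the two length settings out of the per-call arguments, so the resulting function takes only the batch and any extra keyword arguments.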