Judges
Warning
TRL Judges is an experimental API which is subject to change at any time.
TRL provides judges to easily compare two completions.
Before using them, install the required dependencies by running:
pip install trl[judges]
Using the provided judges
TRL provides several judges out of the box. For example, you can use the [HfPairwiseJudge] to compare two completions using a pre-trained model from the Hugging Face model hub:
from trl import HfPairwiseJudge
judge = HfPairwiseJudge()
judge.judge(
    prompts=["What is the capital of France?", "What is the biggest planet in the solar system?"],
    completions=[["Paris", "Lyon"], ["Saturn", "Jupiter"]],
)  # Outputs: [0, 1]
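The list returned by judge holds, for each prompt, the index of the preferred completion. As a minimal sketch of how those indices might be consumed, the plain-Python helper below turns them into chosen/rejected records. The name to_preference_pairs is hypothetical and not part of TRL. The sketch hard-codes the [0, 1] result shown above so that it runs without a model, and it assumes that any value other than 0 or 1 marks a judgment that should be skipped.

```python
def to_preference_pairs(prompts, completions, ranks):
    """Build chosen/rejected records from pairwise judge output.

    Hypothetical helper for illustration; not part of TRL.
    """
    records = []
    for prompt, pair, winner in zip(prompts, completions, ranks):
        if winner not in (0, 1):
            # Assumption: anything that is not a valid index is skipped.
            continue
        records.append(
            {"prompt": prompt, "chosen": pair[winner], "rejected": pair[1 - winner]}
        )
    return records


prompts = ["What is the capital of France?", "What is the biggest planet in the solar system?"]
completions = [["Paris", "Lyon"], ["Saturn", "Jupiter"]]
ranks = [0, 1]  # the judge output shown above, hard-coded for this sketch

pairs = to_preference_pairs(prompts, completions, ranks)
for record in pairs:
    print(record)
```

Each record pairs the prompt with the preferred and the rejected completion, which is one common shape for preference data.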
Define your own judge
To define your own judge, we provide several base classes that you can subclass. For rank-based judges, subclass [BaseRankJudge] and implement the [BaseRankJudge.judge] method. For pairwise judges, subclass [BasePairwiseJudge] and implement the [BasePairwiseJudge.judge] method. If you want to define a judge that doesn't fit into these categories, subclass [BaseJudge] and implement the [BaseJudge.judge] method.
As an example, let's define a pairwise judge that prefers shorter completions:
from trl import BasePairwiseJudge

class PrefersShorterJudge(BasePairwiseJudge):
    def judge(self, prompts, completions, shuffle_order=False):
        # Return 0 when the first completion is shorter, otherwise 1.
        return [0 if len(completion[0]) < len(completion[1]) else 1 for completion in completions]
You can then use this judge as follows:
judge = PrefersShorterJudge()
judge.judge(
    prompts=["What is the capital of France?", "What is the biggest planet in the solar system?"],
    completions=[["Paris", "The capital of France is Paris."], ["Jupiter is the biggest planet in the solar system.", "Jupiter"]],
)  # Outputs: [0, 1]
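The rank-based case mentioned earlier can be sketched the same way. The example below is an illustration under two assumptions: that a rank judge returns, for each prompt, the completion indices ordered from best to worst, and that judge takes the same prompts, completions, and shuffle_order arguments as the pairwise judge. ShortestFirstRankJudge is a hypothetical name, and the fallback base class only exists so the sketch runs when TRL is not installed.

```python
from abc import ABC, abstractmethod

try:
    from trl import BaseRankJudge
except ImportError:
    # Stand-in so the sketch runs without TRL; this is not the real class.
    class BaseRankJudge(ABC):
        @abstractmethod
        def judge(self, prompts, completions, shuffle_order=True):
            ...


class ShortestFirstRankJudge(BaseRankJudge):
    """Hypothetical rank judge that orders completions from shortest to longest."""

    def judge(self, prompts, completions, shuffle_order=True):
        # For each prompt, sort the completion indices by completion length.
        # Assumption: index lists are ordered from best to worst.
        return [
            sorted(range(len(candidates)), key=lambda i: len(candidates[i]))
            for candidates in completions
        ]


judge = ShortestFirstRankJudge()
ranks = judge.judge(
    prompts=["What is the capital of France?"],
    completions=[["Paris", "The capital of France is Paris.", "Paris."]],
)
print(ranks)  # [[0, 2, 1]]
```

Because a rank judge receives a list of any length per prompt, the same class handles two, three, or more completions without changes.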
Provided judges
PairRMJudge
autodoc PairRMJudge
HfPairwiseJudge
autodoc HfPairwiseJudge
OpenAIPairwiseJudge
autodoc OpenAIPairwiseJudge
AllTrueJudge
autodoc AllTrueJudge
Base classes
BaseJudge
autodoc BaseJudge
BaseBinaryJudge
autodoc BaseBinaryJudge
BaseRankJudge
autodoc BaseRankJudge
BasePairwiseJudge
autodoc BasePairwiseJudge