Judges

Warning

TRL Judges is an experimental API which is subject to change at any time.

TRL provides judges to easily compare two completions.

Install the required dependencies by running:

pip install "trl[judges]"

Using the provided judges

TRL provides several judges out of the box. For example, you can use the [HfPairwiseJudge] to compare two completions using a pre-trained model from the Hugging Face model hub:

from trl import HfPairwiseJudge

judge = HfPairwiseJudge()
judge.judge(
    prompts=["What is the capital of France?", "What is the biggest planet in the solar system?"],
    completions=[["Paris", "Lyon"], ["Saturn", "Jupiter"]],
)  # Outputs: [0, 1]
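Each entry in the returned list is the index of the preferred completion for the corresponding prompt. As a minimal sketch of how such output might be consumed, the helper below computes how often a given completion slot is preferred. The name `win_rate` is hypothetical and not part of TRL, and the hard-coded list stands in for real judge output so the snippet runs without loading a model.

```python
def win_rate(preferences, index=0):
    """Fraction of prompts for which the completion at `index` was preferred."""
    if not preferences:
        raise ValueError("preferences must not be empty")
    return sum(1 for p in preferences if p == index) / len(preferences)


# Stand-in for the list returned by judge.judge(...); not produced by a real model here.
preferences = [0, 1]
print(win_rate(preferences, index=0))  # 0.5
```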

Define your own judge

To define your own judge, TRL provides several base classes that you can subclass. For rank-based judges, subclass [BaseRankJudge] and implement the [BaseRankJudge.judge] method. For pairwise judges, subclass [BasePairwiseJudge] and implement the [BasePairwiseJudge.judge] method. If your judge doesn't fit either category, subclass [BaseJudge] and implement the [BaseJudge.judge] method.

As an example, let's define a pairwise judge that prefers shorter completions:

from trl import BasePairwiseJudge

class PrefersShorterJudge(BasePairwiseJudge):
    def judge(self, prompts, completions, shuffle_order=False):
        # Return 0 if the first completion is shorter, otherwise 1.
        return [0 if len(completion[0]) < len(completion[1]) else 1 for completion in completions]

You can then use this judge as follows:

judge = PrefersShorterJudge()
judge.judge(
    prompts=["What is the capital of France?", "What is the biggest planet in the solar system?"],
    completions=[["Paris", "The capital of France is Paris."], ["Jupiter is the biggest planet in the solar system.", "Jupiter"]],
)  # Outputs: [0, 1]
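The same pattern can be applied to rank-based judges. The sketch below is an illustration under stated assumptions, not TRL's own implementation: it assumes a rank judge returns, for each prompt, the completion indices ordered from best to worst. To stay runnable without TRL installed, it subclasses a local stand-in (`RankJudgeStandIn`, a hypothetical name); with TRL available you would subclass [BaseRankJudge] instead.

```python
from abc import ABC, abstractmethod


class RankJudgeStandIn(ABC):
    """Hypothetical stand-in mirroring the assumed shape of TRL's BaseRankJudge."""

    @abstractmethod
    def judge(self, prompts, completions, shuffle_order=True):
        ...


class ShortestFirstRankJudge(RankJudgeStandIn):
    def judge(self, prompts, completions, shuffle_order=True):
        # For each prompt, return completion indices ordered from shortest (best)
        # to longest (worst). sorted() is stable, so ties keep their input order.
        # shuffle_order is accepted for interface compatibility and ignored here.
        return [
            sorted(range(len(candidates)), key=lambda i: len(candidates[i]))
            for candidates in completions
        ]


judge = ShortestFirstRankJudge()
ranks = judge.judge(
    prompts=["What is the capital of France?"],
    completions=[["The capital of France is Paris.", "Paris", "It is Paris."]],
)
print(ranks)  # [[1, 2, 0]]
```

Here `[1, 2, 0]` means the completion at index 1 is ranked best, followed by index 2, then index 0.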

Provided judges

PairRMJudge

autodoc PairRMJudge

HfPairwiseJudge

autodoc HfPairwiseJudge

OpenAIPairwiseJudge

autodoc OpenAIPairwiseJudge

AllTrueJudge

autodoc AllTrueJudge

Base classes

BaseJudge

autodoc BaseJudge

BaseBinaryJudge

autodoc BaseBinaryJudge

BaseRankJudge

autodoc BaseRankJudge

BasePairwiseJudge

autodoc BasePairwiseJudge
