Release: 0.18.1

📚 Fix doc building by removing vLLM from dev dependencies in `setup.cfg` (#3511)
📎 Fix clip ratio logging (#3506)
2025-10-21 02:53:59 +08:00 · 2025-05-29 19:01:07 +00:00 · 2025-05-29 18:49:51 +00:00 · 2025-05-29 18:49:41 +00:00
3 changed files with 3 additions and 4 deletions
--- a/setup.cfg
+++ b/setup.cfg
@@ -1,6 +1,6 @@
 [metadata]
 name = trl
-version = 0.18.0
+version = 0.18.1
 description = Train transformer language models with reinforcement learning.
 long_description = file: README.md
 long_description_content_type = text/markdown
@@ -89,7 +89,6 @@ dev =
    %(quantization)s
    %(scikit)s
    %(test)s
-    %(vllm)s
    %(vlm)s

 [options.entry_points]
--- a/trl/__init__.py
+++ b/trl/__init__.py
@@ -12,7 +12,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.

-__version__ = "0.18.0"
+__version__ = "0.18.1"

 from typing import TYPE_CHECKING

--- a/trl/trainer/grpo_trainer.py
+++ b/trl/trainer/grpo_trainer.py
@@ -1229,7 +1229,7 @@ class GRPOTrainer(Trainer):
        # Identify sequences that terminated with EOS and log their lengths
        agg_terminated_with_eos = self.accelerator.gather(is_eos.any(dim=1))
        term_completion_lengths = agg_completion_lengths[agg_terminated_with_eos]
-        clipped_completions_ratio = 1 - len(term_completion_lengths) / len(completion_lengths)
+        clipped_completions_ratio = 1 - len(term_completion_lengths) / len(agg_completion_lengths)
        self._metrics[mode]["completions/clipped_ratio"].append(clipped_completions_ratio)
        if len(term_completion_lengths) == 0:  # edge case where no terminated sequences are found
            term_completion_lengths = torch.zeros(1, device=device)
Author	SHA1	Message	Date
Quentin Gallouédec	2c49300910	Release: 0.18.1	2025-05-29 19:01:07 +00:00
Quentin Gallouédec	e530486c26	📚 Fix doc building by removing vLLM from dev dependencies in `setup.cfg` (#3511)	2025-05-29 18:49:51 +00:00
Quentin Gallouédec	1bae58c292	📎 Fix clip ratio logging (#3506)	2025-05-29 18:49:41 +00:00
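
The clip ratio fix in `grpo_trainer.py` changes only the denominator, so a small worked example may help show why it matters. The sketch below is not TRL code: it uses plain Python lists in place of tensors, a hypothetical `gather` helper in place of `self.accelerator.gather`, and invented lengths and EOS flags. It assumes, as the variable names suggest, that `completion_lengths` holds only the local process's completions while `agg_completion_lengths` holds the completions gathered from all processes.

```python
# Illustrative sketch only: plain lists stand in for tensors, and `gather`
# is a hypothetical stand-in for accelerator.gather (concatenate all shards).
def gather(per_process):
    return [x for shard in per_process for x in shard]


def clipped_ratio(terminated_lengths, total_count):
    # Fraction of completions that did NOT end with EOS (i.e. were truncated).
    return 1 - len(terminated_lengths) / total_count


# Two simulated processes with four completions each (made-up data).
local_lengths = [[12, 64, 30, 64], [64, 64, 8, 64]]
local_has_eos = [[True, False, True, False], [False, False, True, False]]

agg_lengths = gather(local_lengths)
agg_has_eos = gather(local_has_eos)

# Lengths of completions that terminated with EOS, across all processes.
term_lengths = [n for n, eos in zip(agg_lengths, agg_has_eos) if eos]

# Before the fix: gathered numerator, local denominator (process 0 here).
buggy = clipped_ratio(term_lengths, len(local_lengths[0]))
# After the fix: numerator and denominator both cover all processes.
fixed = clipped_ratio(term_lengths, len(agg_lengths))

print(f"buggy={buggy}, fixed={fixed}")
```

With this data, 3 of the 8 gathered completions end with EOS, so the consistent ratio is 1 - 3/8. Dividing by the local count of 4 instead gives 1 - 3/4, which understates truncation. With more terminated sequences than local completions, the mismatched version would even go negative.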