
chore(python-deps): bump trl from 1.4.0 to 1.5.0#5964

Open
dependabot[bot] wants to merge 2 commits into main from dependabot/uv/trl-1.5.0

dependabot[bot] commented on behalf of github on May 27, 2026

Bumps trl from 1.4.0 to 1.5.0.

Release notes

Sourced from trl's releases.

v1.5.0

Features

Even more training chat templates

Three more model families gain training-compatible templates with {% generation %} markers (so assistant_only_loss=True just works):

Final logits softcapping for async GRPO

The chunked LM-head path used by AsyncGRPOTrainer now supports models that use final_logit_softcapping (notably Gemma 2). _ChunkedLogProbFunction applies logit_scale, optional tanh-based softcapping, and temperature consistently in both the forward and backward passes, so softcapped models are no longer rejected.

by @mlarnouhet in huggingface/trl#5691
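For readers unfamiliar with softcapping, here is a minimal standard-library sketch of tanh-based softcapping followed by a log-softmax. It is not TRL's `_ChunkedLogProbFunction`; the function names, the example cap of 30.0, and the order of operations (scale, then softcap, then temperature) are assumptions made for illustration.

```python
import math


def transform_logits(logits, logit_scale=1.0, softcap=None, temperature=1.0):
    """Scale, optionally softcap, then apply temperature (assumed order)."""
    out = []
    for z in logits:
        z = z * logit_scale
        if softcap is not None:
            # tanh-based softcapping squashes values into (-softcap, softcap)
            z = softcap * math.tanh(z / softcap)
        out.append(z / temperature)
    return out


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]


raw = [100.0, 0.0, -100.0]
capped = transform_logits(raw, softcap=30.0)  # 30.0 is an illustrative cap
log_probs = log_softmax(capped)
```

Because tanh is bounded, every capped logit stays strictly inside the cap, which is why the transform has to be applied identically wherever log-probabilities are computed.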

KTO ↔ DPO alignment continues

Two more cycles closer to KTO graduation:

Trainer telemetry (opt-out)

_BaseTrainer.__init__ now emits a single anonymous huggingface_hub.send_telemetry ping per trainer instantiation, so we can finally see which trainers / model families / distributed backends are actually being used in practice and prioritize accordingly.

The payload is intentionally minimal: TRL version, trainer class name, model architecture, PEFT yes/no, distributed backend (deepspeed/fsdp/ddp/none), bucketed world size, device type, and GPU model when available. It contains no user data, dataset names, model paths, or hyperparameter values, and it is never sent in CI, in offline mode, or when HF_HUB_DISABLE_TELEMETRY is set.

See usage_stats.md for what's collected and how to opt out.

by @qgallouedec in huggingface/trl#5758
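One way to opt out from Python is to set the `HF_HUB_DISABLE_TELEMETRY` environment variable named in the notes above. The sketch below sets it before any Hugging Face import as a cautious choice. The `telemetry_disabled` helper is hypothetical, and the set of values it treats as "true" is an assumption; usage_stats.md remains the authority on opt-out behavior.

```python
import os

# Set the flag early, before importing trl or huggingface_hub,
# so it is already in place whenever the libraries read it.
os.environ["HF_HUB_DISABLE_TELEMETRY"] = "1"


def telemetry_disabled(env=os.environ):
    """Hypothetical helper: report whether the opt-out flag looks set.

    The accepted truthy spellings here are an assumption for illustration.
    """
    value = env.get("HF_HUB_DISABLE_TELEMETRY", "")
    return value.upper() in {"1", "ON", "YES", "TRUE"}
```

In a CI job or container, the same variable can be exported in the environment instead of being set in code.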

Other

Fixes

... (truncated)

Commits
  • bd1e73f Release: v1.5 (#5835)
  • fb9cb79 Add Qwen3.5 Think/NoThink training chat templates with generation markers (#5...
  • 9e80cab Fix OpenRewardSpec omitting task‑scoped tools during rollout binding (fixes...
  • 7877695 Migrate tests to Qwen3.5 Think/NoThink fixtures (#5821)
  • 0fcc5e2 Add tiny Qwen3.5 Think/NoThink fixture generation scripts (#5819)
  • 43bd8f5 Align KTO with DPO: Align _compute_loss_liger flow (#5816)
  • cc4a0ff Align and simplify the stable training scripts (#5812)
  • 4711a21 Fix metric_for_best_model for trainer-specific eval metrics (#5811)
  • 909d090 Fix generate_batch: inference tensors block inplace ops in background thread ...
  • d0e8b8c Align KTO with DPO: Align compute_loss flow (#5810)
  • Additional commits viewable in compare view

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps [trl](https://github.com/huggingface/trl) from 1.4.0 to 1.5.0.
- [Release notes](https://github.com/huggingface/trl/releases)
- [Changelog](https://github.com/huggingface/trl/blob/main/RELEASE.md)
- [Commits](huggingface/trl@v1.4.0...v1.5.0)

---
updated-dependencies:
- dependency-name: trl
  dependency-version: 1.5.0
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

dependabot[bot] commented on behalf of github on May 27, 2026

Labels

The following labels could not be found: type/dependencies. Please create it before Dependabot can add it to a pull request.

Please fix the above issues or remove invalid values from dependabot.yml.

dependabot[bot] added the python label (Pull requests that update python code) on May 27, 2026

devin-ai-integration[bot] left a comment

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no bugs or issues to report.

Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
@github-actions

Constraint-dependencies updated

Updated pyproject.toml constraint for trl to >=1.5.0 and regenerated uv.lock.
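To confirm locally that an environment satisfies the new `>=1.5.0` constraint, a rough check along these lines may help. The helpers `parse_version` and `satisfies_minimum` are hypothetical and deliberately simplified: they compare only the numeric major.minor.patch components and ignore pre-release or local version suffixes.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def parse_version(text):
    """Extract (major, minor, patch) as ints; a missing patch counts as 0."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part or 0) for part in match.groups())


def satisfies_minimum(installed, minimum="1.5.0"):
    """Simplified '>=' check on numeric components only."""
    return parse_version(installed) >= parse_version(minimum)


try:
    installed = version("trl")  # reads installed package metadata
except PackageNotFoundError:
    installed = None  # trl is not installed in this environment

if installed is not None:
    print(installed, satisfies_minimum(installed))
```

Comparing integer tuples rather than raw strings avoids the classic mistake of ranking "1.10.0" below "1.5.0".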
