feat: Add RewardShapingOperator plugin by aretaafandi16-ui · Pull Request #518 · agentscope-ai/Trinity-RFT

aretaafandi16-ui · 2026-03-25T00:49:30Z

Summary

Added a new ExperienceOperator for reward shaping in Trinity-RFT.

Features

Length-based shaping: Bonus/penalty based on response length
Format-based shaping: Bonus for lists, code blocks, headers
Configurable strategies: Easy to extend with new strategies
Unit tests included: 4 tests covering key scenarios
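
The two strategies listed above could look roughly like the following. This is a minimal, standalone sketch, not the code in trinity/plugins/reward_shaping_operator.py. The `Sample` class, the function names, the bonus/penalty magnitudes, and the choice to measure length in characters are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Sample:
    """Hypothetical stand-in for an experience: response text plus a scalar reward."""
    response: str
    reward: float


def length_shaping(text: str, min_length: int = 10, max_length: int = 1000,
                   bonus: float = 0.1, penalty: float = 0.1) -> float:
    """Return +bonus if the character count is within bounds, else -penalty."""
    n = len(text)  # assumption: length is counted in characters, not tokens
    return bonus if min_length <= n <= max_length else -penalty


# One pattern per format cue named in the feature list.
_FORMAT_PATTERNS = (
    re.compile(r"^\s*(?:[-*]|\d+\.)\s+", re.MULTILINE),  # bulleted or numbered list item
    re.compile(r"`{3}"),                                  # fenced code block marker
    re.compile(r"^#{1,6}\s+", re.MULTILINE),              # markdown header
)


def format_shaping(text: str, bonus_per_feature: float = 0.05) -> float:
    """Return a bonus for each distinct format cue found in the text."""
    return bonus_per_feature * sum(1 for p in _FORMAT_PATTERNS if p.search(text))


# Registry keyed by strategy name; a new strategy is one more entry here.
STRATEGIES: Dict[str, Callable[..., float]] = {
    "length": length_shaping,
    "format": format_shaping,
}


def shape(sample: Sample, strategy: str, **kwargs) -> Sample:
    """Return a copy of the sample with the shaping term added to its reward."""
    delta = STRATEGIES[strategy](sample.response, **kwargs)
    return Sample(sample.response, sample.reward + delta)
```

Keeping each strategy as a pure function of the response text makes it easy to unit test in isolation, and the registry is one way to get the "easy to extend" property without touching the dispatch code.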

Why This Matters

Reward shaping, which adds auxiliary terms to the task reward to steer model behavior, is a common technique in RL fine-tuning. This operator provides a ready-to-use implementation that follows the plugin-first approach recommended in CONTRIBUTING.md.

Usage

buffer:
  operators:
    - type: "trinity.plugins.reward_shaping_operator.RewardShapingOperator"
      config:
        strategy: "length"
        min_length: 10
        max_length: 1000
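
To illustrate how the `config` mapping above might be consumed, here is a small sketch that loads the same three keys into a validated object. `ShapingConfig` is a hypothetical name, the `"format"` strategy key is assumed from the feature list, and the validation rules are guesses; the actual operator may parse its configuration differently.

```python
from dataclasses import dataclass

# Assumption: strategy keys mirror the feature list; only "length" appears in the PR.
KNOWN_STRATEGIES = ("length", "format")


@dataclass(frozen=True)
class ShapingConfig:
    """Hypothetical typed view of the operator's `config` mapping."""
    strategy: str = "length"
    min_length: int = 10
    max_length: int = 1000

    def __post_init__(self) -> None:
        # Fail early on values that would make the length window meaningless.
        if self.strategy not in KNOWN_STRATEGIES:
            raise ValueError(f"unknown strategy: {self.strategy!r}")
        if self.min_length < 0 or self.min_length > self.max_length:
            raise ValueError("expected 0 <= min_length <= max_length")


# The mapping below mirrors the YAML `config` block from the usage example.
raw = {"strategy": "length", "min_length": 10, "max_length": 1000}
config = ShapingConfig(**raw)
```

Validating at construction time means a typo such as swapped bounds surfaces when the pipeline is configured rather than partway through training.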

Files Added

trinity/plugins/reward_shaping_operator.py — Main operator
tests/test_reward_shaping_operator.py — Unit tests

Built by Laboon 🐋 — AI Assistant powered by Xiaomi MiMo v2 Pro

Added a new ExperienceOperator for reward shaping:
- Length-based shaping (bonus/penalty for response length)
- Format-based shaping (bonus for lists, code blocks, headers)
- Configurable strategies and thresholds
- Includes unit tests

Follows the plugin-first approach recommended in CONTRIBUTING.md. Ready to be graduated to trinity/buffer/operators/ after review.

aretaafandi16-ui wants to merge 1 commit into agentscope-ai:main from aretaafandi16-ui:feat/reward-shaping-operator
