17 Jul 07:26

github-actions

5d355cb

v0.4.4 Latest

ART 0.4.4 Release Notes

New Features

SkyPilot Integration Enhancement: Added SkyPilot extras support for improved cloud deployment capabilities (#255)
Reward System Improvements: Added an experimental option to disable reward scaling, giving more flexibility in reward configuration (#fd2a118)
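
The release notes do not say what "scaling" means here. As a rough illustration only, the sketch below assumes it refers to dividing group-centered rewards by their standard deviation, as is common in GRPO-style training. The function name `group_advantages` and its `scale` flag are hypothetical and are not part of ART's API.

```python
from statistics import mean, pstdev
from typing import List


def group_advantages(rewards: List[float], scale: bool = True, eps: float = 1e-6) -> List[float]:
    """Center rewards within a group and optionally divide by their spread.

    Hypothetical helper: this is an assumption about what "scaling rewards"
    could mean, not ART's actual implementation.
    """
    mu = mean(rewards)
    centered = [r - mu for r in rewards]
    if not scale:
        # Unscaled: keep the raw differences from the group mean.
        return centered
    sigma = pstdev(rewards)
    # Scaled: normalize by the population standard deviation (eps avoids division by zero).
    return [c / (sigma + eps) for c in centered]


unscaled = group_advantages([0.0, 1.0], scale=False)
scaled = group_advantages([0.0, 1.0], scale=True)
```

With scaling off, the magnitude of reward differences is preserved; with scaling on, every group is normalized to a similar spread regardless of how far apart the raw rewards were.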

Documentation & Examples

New Tutorial: Added temporal-clue-7b.ipynb notebook demonstrating temporal reasoning capabilities (#a2802b3)
Enhanced Documentation: Updated RULER documentation with comprehensive guidance on combining rewards (#250)
ART•E Integration: Added ART•E notebook examples to documentation (#242, #240)

Bug Fixes & Improvements

Dependency Management: Reverted to previous version of gql to resolve compatibility issues (#249)
Unsloth Integration: Added experimental logprob pre-calculation support for Unsloth services
Installation Fixes: Improved backend dependency installation when using local ART paths (#241)
Documentation Updates: Various minor documentation improvements and clarifications

Technical Improvements

Updated SkyPilot backend installation instructions
Removed obsolete numpy installation cells from quickstart examples
Enhanced dependency synchronization

15 Jul 05:29

github-actions

v0.4.3

214aa3c

v0.4.3

ART 0.4.3 Release Notes

Breaking Changes

SkyPilot is now an optional dependency. If you use SkyPilotBackend, you must now install ART with the skypilot extra:

# Before (no longer pulls in the SkyPilot dependencies)
pip install openpipe-art

# Now required for SkyPilotBackend users
# (the quotes keep shells such as zsh from expanding the brackets)
pip install "openpipe-art[skypilot]"

What's Changed

Dependencies

Moved SkyPilot dependencies (semver>=3.0.4 and skypilot==0.9.3) to an optional dependency group [skypilot] (#235)
This reduces the default installation size for users who don't need SkyPilot functionality

Documentation Updates

Updated installation instructions in all relevant documentation:
- Installation + Setup guide
- ART Backend documentation
- Summarizer tutorial

Migration Guide

If you're using SkyPilotBackend in your code:

# Your existing code doesn't need to change, just update the installation
from art.skypilot import SkyPilotBackend
backend = SkyPilotBackend(...)

Simply install with: pip install "openpipe-art[skypilot]" or uv add "openpipe-art[skypilot]"
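
If you want to fail early with a clear message when the extra is missing, a small check like the following may help. It is a minimal sketch, not part of ART: `has_module` and `skypilot_install_hint` are hypothetical helpers, and the default module name `sky` assumes SkyPilot's usual top-level import name.

```python
import importlib.util
from typing import Optional


def has_module(name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(name) is not None


def skypilot_install_hint(module_name: str = "sky") -> Optional[str]:
    """Return an install hint if the module is missing, otherwise None.

    Assumption: SkyPilot is importable as `sky`; adjust if that differs.
    """
    if has_module(module_name):
        return None
    return 'SkyPilot not found; install it with: pip install "openpipe-art[skypilot]"'


hint = skypilot_install_hint()
if hint is not None:
    print(hint)
```

Running this before constructing a SkyPilotBackend turns a later import error into an actionable message.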

Full Changelog

See PR #235 for complete details.

14 Jul 21:30

github-actions

v0.4.2

d8b88f2

v0.4.2

What's Changed

Fix client import error by vendoring transformers constants (#232)
docs: Add comprehensive documentation for additional_histories feature (#231)
Fix Ruff lint (#229)
Update 2048 code to use RULER (#228)
Add RULER notebook for 2048 (#227)
Add RULER promotional snippet to README (#225)
Add run_checks.sh script for code quality checks (#224)
Update README (#223)
fix python version in art-e (#222)
ruler docs (#221)
feat: Decouple vLLM & Unsloth Trainer (#212)

Full Changelog: v0.4.0...v0.4.2

14 Jul 18:50

github-actions

v0.4.1

0e7a34f

v0.4.1

What's Changed

Fix client import error by vendoring transformers constants (#232)
Fix Ruff lint (#229)
Update 2048 code to use RULER (#228)
Add RULER notebook for 2048 (#227)
Add RULER promotional snippet to README (#225)
Add run_checks.sh script for code quality checks (#224)
Update README (#223)
fix python version in art-e (#222)
ruler docs (#221)
feat: Decouple vLLM & Unsloth Trainer (#212)

Full Changelog: v0.4.0...v0.4.1

11 Jul 06:34

github-actions

v0.4.0

c3a4eab

v0.4.0

🚀 Introducing RULER: Relative Universal LLM-Elicited Rewards

We're excited to announce ART v0.4.0, featuring RULER - a groundbreaking general-purpose reward function that makes agent training dramatically easier and faster!

📏 What is RULER?

RULER (Relative Universal LLM-Elicited Rewards) uses an LLM-as-judge to rank agent trajectories, eliminating the need for:

❌ Labeled training data
❌ Expert feedback
❌ Hand-crafted reward functions

Yet it often matches or exceeds the performance of carefully designed reward functions!
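
To make the idea of judge-based, relative scoring concrete, here is a toy sketch. It does not call ART or any LLM: `Trajectory`, `score_group_relative`, and `toy_judge` are hypothetical names, and the length-based judge merely stands in for an LLM call that would compare the trajectories in a group.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Trajectory:
    transcript: str
    reward: float = 0.0


def score_group_relative(
    group: List[Trajectory],
    judge: Callable[[List[str]], List[float]],
) -> List[Trajectory]:
    """Ask a judge to score all trajectories together, then rank them.

    The judge sees the whole group at once, so scores are relative to the
    other trajectories rather than to an absolute, hand-written rubric.
    """
    scores = judge([t.transcript for t in group])
    if len(scores) != len(group):
        raise ValueError("judge must return one score per trajectory")
    for trajectory, score in zip(group, scores):
        # Clamp to [0, 1] so a misbehaving judge cannot produce extreme rewards.
        trajectory.reward = min(1.0, max(0.0, score))
    return sorted(group, key=lambda t: t.reward, reverse=True)


def toy_judge(transcripts: List[str]) -> List[float]:
    """Stand-in for an LLM judge: longer transcripts score higher."""
    if not transcripts:
        return []
    longest = max(len(t) for t in transcripts) or 1
    return [len(t) / longest for t in transcripts]


group = [Trajectory("a"), Trajectory("abcd"), Trajectory("ab")]
ranked = score_group_relative(group, toy_judge)
```

Swapping `toy_judge` for a function that prompts a real model is the only change needed to turn this toy into an LLM-as-judge loop; no labeled data or task-specific reward function appears anywhere in the sketch.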

🎯 Key Benefits

2-3x faster development: Skip the tedious reward engineering phase
Universal application: Works across diverse RL tasks without modification
Production-ready: Battle-tested on real tasks with impressive results
Simple integration: Just a few lines of code to get started

📖 Learn More

Check out the RULER documentation to see how easy it is to use:

from art.rewards import ruler_score_group

# Score your trajectories with one line
judged_group = await ruler_score_group(group, "openai/gpt-4o-mini")

Read the full launch announcement for detailed performance comparisons and insights.

What's Changed

Major Features

Add RULER reward function (#218) 🎉
RULER documentation (#221)

Other Improvements

Update README (#223)
fix python version in art-e (#222)
Add setproctitle as dep in colab notebooks (#220)
Move plotting dependencies to optional group (#217)
feat: tau-bench brad 003 (#216)
Allow validation_loader argument to train method (#215)
Update swe-bench example docs (#214)
chore: Remove workaround for torch-compile and use --torch-compile flag (#212)
Adds option to use padding with --torch-compile (#211)
Fix tau-bench example (#210)
art-2048: update qwen model identifier (#209)
Allow Unsloth to use --pad_token when tokenizer has no pad token (#208)
Allow using specific wandb projects in the CLI (#207)
feat: Allow using get_peft_model to re-initialize trainer state (#206)
chore: Add art_trainer module with ART's TRL Trainer (#205)
Art tau bench example (#204)
🔊 Improve noisy startup (#203)
feat: SWE-Bench Example (#201)
Update to 0.3.13, pin accelerate (#197)

Full Changelog: v0.3.13...v0.4.0

11 Jul 06:27

corbt

v0.3.13

c84c141

Release v0.3.13

What's Changed

chore: Update TRL (#187)
allow training without logprobs experimentation (#186)
chore: Upgrade Unsloth dependencies (#183)
chore: SWE-Bench related changes (#181)
Bump uv to >=0.6.15 (#180)
Tau bench async rl (#179)
Track entropy at training time (#178)
Adds vllm metrics to wandb (#177)
Pin openpipe-art and accelerate versions in notebooks (#175)
Release 0.3.12 (#174)
Match default SkyPilotBackend version to client (#173)
feat: Add support for multiple histories (#170)
[WIP] More docs (#169)

Full Changelog: v0.3.12...v0.3.13

11 Jul 06:26

corbt

v0.3.12

7032d6c

Release v0.3.12

What's Changed

Tau bench async (#168)
Refactor dev/tau-bench for true async (#167)
ART-E updates (#166)
Add langfuse tracing to run_rl.py (#165)
Make rollout_tau_bench_task synchronous (#164)
feat: Multi-device training (#163)
Create run_training.py for remote training (#162)
Create run_rl.py with ART RL loop (#161)
Wandb weave (#158)
Basic W&B Weave integration (#157)
Properly read base model from CLI (#156)
Deploy model locally (#155)
Fix s3 utils typo (#153)
Fix busy wait in vllm test client (#152)
Fix comment (#151)
dev: swebench (#149)
fix: Improve retry util typing (#148)
Add get_guided_completion_params and use in tic tac toe self play (#147)
Pin vllm to 0.8.5 (#146)

Full Changelog: v0.3.11...v0.3.12

11 Jul 06:26

corbt

v0.3.11

126fa27

Release v0.3.11

What's Changed

Limit number of metrics shown in gather_trajectory_groups (#145)

Full Changelog: v0.3.10...v0.3.11

11 Jul 06:26

corbt

v0.3.10

f63ac00

Release v0.3.10

What's Changed

Update package version to 0.3.9 (#143)
Speed up step deployment (#142)
Add tic tac toe self-play example (#141)
Fix training stability issues with new vLLM version (#140)

Full Changelog: v0.3.9...v0.3.10

11 Jul 06:26

corbt

v0.3.9

7fdfbeb

Release v0.3.9

What's Changed

Serialize model config (#139)
Properly log trajectories and metrics via remote backend (#138)
Properly sync workdir (#137)
reverting to older vllm version, since the latest one shows regressions in convergence of grpo training (#136)
feat: Add force_restart option to SkyPilotBackend (#135)
Fix train step (#134)
Revert some of the thread safe changes (#133)
Enable asymmetric PPO clipping (#132)
Support close method on remote backend (#129)
Fix hanging (#126)
update version (#122)
Update to version 0.3.6 (#121)
Simplify tic tac toe example (#120)
add qwen 3 support (#118)

Full Changelog: v0.3.7...v0.3.9


Releases: OpenPipe/ART
