TsingZ0/AP2O

Introduction

This is the official implementation of our paper "AP2O-Coder: Human-Inspired Progressive Optimization to Fix LLM Code Errors", accepted at AAAI'26.

Figures (see the repository page): Adaptive Progressive Preference Optimization, Coding Error Reduction, Data Efficiency, Reward Curves
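For background on the preference-optimization side of the method: the sketch below is a standard DPO-style preference loss for a single (chosen, rejected) pair, not AP2O-Coder's actual loss or its adaptive/progressive scheduling (for those, see the paper). All names here are illustrative.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) preference pair.

    Each argument is the summed token log-probability of a completion
    under the trainable policy or the frozen reference model; `beta`
    scales the implicit reward. (Illustrative only; not the AP2O loss.)
    """
    # Implicit reward margin: how much more the policy favors the
    # chosen completion over the rejected one, relative to the
    # reference model's preference.
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    # Negative log-sigmoid of the scaled margin: zero margin gives
    # log(2); the loss shrinks as the policy separates the pair.
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))
```

In a code-fixing setting such as this one, the preference pairs would presumably be drawn from the self-generated data, with passing solutions as "chosen" and error-producing ones as "rejected"; the paper describes how AP2O organizes them progressively.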

Requirements

  • deepspeed 0.17.2
  • python 3.11.11
  • torch 2.7.0
  • trl 0.14.0
  • transformers 4.51.3
  • vllm 0.9.2

Usage

To run the preference-data self-generation and preference-optimization pipeline, use:

sh pipe-qwen2.5-coder.sh
