---
license: apache-2.0
datasets:
  - Skywork/Skywork-Reward-Preference-80K-v0.1
base_model:
  - Qwen/Qwen2-7B-Instruct
---

## Introduction

Con-J-Qwen2-7B (learning the generative **J**udge using self-generated **Con**trastive judgments) is a generative judge built on the Qwen2-7B-Instruct model and trained on the Skywork/Skywork-Reward-Preference-80K-v0.1 preference dataset.

Con-J-Qwen2-7B is trained from preference data. We prompt the pre-trained Qwen2-7B-Instruct model to generate positive and negative judgments, each supported by a rationale in natural language. The self-generated contrastive judgment pairs are then used to train the generative judge with Direct Preference Optimization (DPO). In this way, Con-J learns to act as a generative judge that provides accurate judgments with supporting rationales.
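As a conceptual sketch of how a pair of contrastive judgments becomes a DPO training example (the function and field names below are hypothetical; the actual construction is done by the released script `construct_dpo_dataset_for_critic_model.py`):

```python
# Conceptual sketch only: turn a pair of self-generated judgments into a DPO example.
# Function and field names are hypothetical illustrations, not the released pipeline.
def build_dpo_example(judge_prompt: str, positive_judgment: str, negative_judgment: str) -> dict:
    """Pair a judgment that agrees with the human preference label (chosen)
    with one that contradicts it (rejected)."""
    return {
        "prompt": judge_prompt,         # judge prompt: the question plus the two candidate answers
        "chosen": positive_judgment,    # rationale + verdict consistent with the preference label
        "rejected": negative_judgment,  # rationale + verdict contradicting the preference label
    }
```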

## Usage

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Model/Con-J-Qwen2-7B"
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_name)

question = "What is the range of the numeric output of a sigmoid node in a neural network?"
answer1 = "The output of a sigmoid node is bounded between -1 and 1."
answer2 = "The output of a sigmoid node is bounded between 0 and 1."

# Judge prompt (in Chinese): "As an evaluation expert, given a question and two candidate
# answers, select the one that is better in terms of coherence, accuracy, coverage, and
# overall quality. Output your judgment in JSON, where '原因' (reason) is your explanation
# and '更好的回答' (better answer) is the integer 1 or 2."
CON_J_PROMPT = """作为一个评价专家,给定一个问题和它的两个可能的回答,请选出哪一个回答在连贯性、准确性、覆盖度和上述定义的整体质量方面最为符合。请用JSON格式输出你的判断, 其中"原因"是你提供的解释,"更好的回答"是整数类型的1或2,例如{{"原因": "你的解释", "更好的回答": 1}}。以下是问题和候选回答的内容:
    \n问题:{instruction}
回答1:{output_1}
回答2:{output_2}"""
user_prompt = CON_J_PROMPT.format(instruction=question, output_1=answer1, output_2=answer2)
system_prompt = ""
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_prompt},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([prompt], return_tensors="pt").to(model.device)

# Generate the judgment for the given prompt
with torch.no_grad():
    generated_ids = model.generate(inputs.input_ids, do_sample=False, max_new_tokens=2048)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]

# response: {"原因": "回答1中的-1是错误的,因为sigmoid函数的实际输出范围是0到1,而不是包括-1。回答2准确地描述了sigmoid函数的输出范围是0到1。",\n "更好的回答": 2}
# (Translation: "The -1 in Answer 1 is wrong, because the sigmoid's actual output range is
#  0 to 1 and does not include -1. Answer 2 correctly describes the range as 0 to 1."
#  Better answer: 2)
```
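The model returns its judgment as a JSON string. Assuming the output is well-formed JSON (greedy decoding makes this likely but does not guarantee it), it can be parsed as follows; this helper is an illustrative addition, not part of the released example:

```python
import json

# Parse the judge's JSON verdict. The keys are the Chinese field names requested
# in the prompt: "原因" (reason) and "更好的回答" (better answer, 1 or 2).
judgment = json.loads(response)
print("Rationale:", judgment["原因"])
print("Preferred answer:", judgment["更好的回答"])  # 2 for the example above
```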

## Performance

| Model | Infinity-Preference | UltraFeedback | PKU-SafeRLHF | Reward-Bench: Chat | Reward-Bench: Chat-H | Reward-Bench: Safety | Reward-Bench: Reasoning |
|---|---|---|---|---|---|---|---|
| Llama3.1-8B | 59.0 | 62.9 | 66.4 | 80.7 | 49.8 | 64.0 | 68.1 |
| Llama3.1-70B | 64.0 | 71.4 | 67.6 | 97.2 | 70.2 | 82.8 | 86.0 |
| Qwen2-7B | 59.0 | 64.5 | 67.2 | 91.3 | 44.8 | 73.6 | 69.0 |
| Qwen2.5-72B | 70.0 | 66.0 | 58.7 | 86.6 | 61.4 | 74.5 | 90.7 |
| Auto-J | 69.0 | 63.9 | 66.9 | 93.0 | 40.0 | 65.5 | 50.5 |
| Prometheus 2 | 68.0 | 63.3 | 63.0 | 85.5 | 49.1 | 77.1 | 76.5 |
| GPT-4o | 75.0 | 72.2 | 69.6 | 95.3 | 74.3 | 87.6 | 86.9 |
| Con-J (ours) | 81.0 | 73.0 | 68.4 | 91.3 | 79.6 | 88.0 | 87.1 |

## Training Scripts

The training of Con-J is based on a modified version of OpenRLHF. The training scripts are available in `Code/run_scripts/`. Training Con-J involves the following steps:

```bash
task_name="Skywork-Reward-Preference-80K-v0.1"
cd run_scripts/Qwen2/
# repeated sampling
sh vllm_inference_best_of_n.sh 8 $task_name
# hint-driven sampling
sh vllm_inference_all.sh $task_name
# dataset filtering and construction
python ../../examples/construct_dpo_dataset_for_critic_model.py --task $task_name
# contrastive training
sh train_dpo.sh $task_name
# inference and evaluation
sh vllm_inference2.sh $task_name $task_name
```

To enable Con-J training, download the base model Qwen/Qwen2-7B-Instruct and the dataset Skywork/Skywork-Reward-Preference-80K-v0.1 to the locations expected by the training scripts. The downloaded dataset can then be preprocessed by running:

```bash
python preprocess_dataset.py
```
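If the base model and dataset are not yet available locally, they can be fetched from the Hugging Face Hub before running the steps above, for example with `huggingface_hub` (a minimal sketch; the `local_dir` values are placeholders and should match the paths the run scripts expect):

```python
from huggingface_hub import snapshot_download

# Download the base model and the preference dataset.
# The local_dir paths are placeholders; point them to the locations
# expected by the training scripts.
snapshot_download(repo_id="Qwen/Qwen2-7B-Instruct",
                  local_dir="Qwen2-7B-Instruct")
snapshot_download(repo_id="Skywork/Skywork-Reward-Preference-80K-v0.1",
                  repo_type="dataset",
                  local_dir="Skywork-Reward-Preference-80K-v0.1")
```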

## Reference

Coming soon.