This repository contains the official implementation of the paper:
CoPL: Collaborative Preference Learning for Personalizing LLMs
Youngbin Choi, Seunghyuk Cho, Minjong Lee, MoonJeong Park, Yesong Ko, Jungseul Ok, Dongwoo Kim
EMNLP, 2025
Paper: arXiv:2503.01658
CoPL is a collaborative preference learning framework for personalizing LLMs: it learns user-specific preference representations and adapts to unseen users. The project consists of four stages:
- Dataset Generation - Generate user preference data
- User Representation Learning - Learn user embeddings
- RM Training - Train reward models
- Unseen Adaptation - Adapt to new users
# Install dependencies from requirements.txt
pip install -r requirements.txt
# Install custom PEFT package with MoLE support
cd peft-main
pip install -e .
cd ..

Generate user preference data using the following commands:
# PLM dataset
python data/datagen_plm.py --tokenizer_name google/gemma-2b-it --num_users 10000 --n_context 16 --seed 1111
python data/datagen_plm.py --tokenizer_name google/gemma-2b-it --num_users 10000 --n_context 16 --seed 1111 --AVG
# TLDR dataset
python data/datagen_tldr.py --tokenizer_name google/gemma-2b-it --num_users 10000 --n_context 8 --seed 1111
python data/datagen_tldr.py --tokenizer_name google/gemma-2b-it --num_users 10000 --n_context 8 --seed 1111 --AVG
# UltraFeedback-P dataset
python data/datagen_ufp.py --other_subsets UF-P-2 --tokenizer_name google/gemma-2b --model_name google/gemma-2b --num_users 10000
python data/datagen_ufp.py --other_subsets UF-P-4 --tokenizer_name google/gemma-2b --model_name google/gemma-2b --num_users 10000
python data/datagen_ufp.py --other_subsets UF-P-2 --tokenizer_name google/gemma-2b --model_name google/gemma-2b --num_users 10000 --AVG
python data/datagen_ufp.py --other_subsets UF-P-4 --tokenizer_name google/gemma-2b --model_name google/gemma-2b --num_users 10000 --AVG

The data is stored in the following format:
# Data format
{
'user': user_id,
'context': [(positive_item, negative_item), ...],
'context_unseen': [(positive_item, negative_item), ...],
'target': [(positive_item, negative_item), ...],
'user_type': user_type
}
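For a quick sanity check after generation, a record can be inspected with standard `pickle`. This is a minimal sketch, assuming the generation scripts write a pickled list of dicts in the format above; the file name mirrors the RM training example below and may differ in your setup.

```python
# Minimal sketch: inspect one record of a generated dataset.
# Assumes a pickled list of dicts in the format shown above; the file
# name mirrors the RM training example and is not guaranteed.
import pickle

with open("dataset/UF-P-2-10000-ALL.pkl", "rb") as f:
    data = pickle.load(f)

record = data[0]
print(record["user"], record["user_type"])
print(len(record["context"]), "context pairs")
print(len(record["context_unseen"]), "held-out context pairs")
print(len(record["target"]), "target pairs")
```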
Learn user embeddings using the CoPLGCF model:

python train_CoPL_gcf.py \
--data_path dataset/your_data.pkl \
--hidden_dim 512 \
--l 4 \
--num_epoch 100 \
--learning_rate 1e-4 \
--wandbon True
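After training, the learned embeddings feed into the RM stage. Below is a minimal sketch for loading them, assuming the script saves a `(num_users, hidden_dim)` tensor at the path used by the RM training command further down:

```python
# Minimal sketch: load learned user embeddings for the RM stage.
# Assumes train_CoPL_gcf.py saves a (num_users, hidden_dim) float tensor;
# the path matches --user_embeds_path in the RM training command below.
import torch

user_emb = torch.load("gcf_user_embeds/UF-P-2-ALL-user_emb.pt", map_location="cpu")
print(user_emb.shape)   # expected (num_users, hidden_dim), e.g. (10000, 512)
user_42 = user_emb[42]  # embedding that conditions the reward model for user 42
```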
Train personalized reward models using user embeddings:

bash scripts/CoPL_rm_train.sh

Or run directly:
torchrun --nproc_per_node=4 --master_port 4788 train_CoPL_rm.py \
--model_name google/gemma-2b-it \
--data_path dataset/UF-P-2-10000-ALL.pkl \
--user_embeds_path gcf_user_embeds/UF-P-2-ALL-user_emb.pt \
--log_dir logs/CoPL_RM \
--bf16 True \
--per_device_train_batch_size 16 \
--learning_rate 5e-5 \
--lora_r 8 \
--num_experts 8 \
--max_steps 1500 \
--deepspeed scripts/ds_config.json
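The reward model is trained on per-user preference pairs. The sketch below illustrates the standard pairwise Bradley-Terry objective typically used for reward-model training; function names and shapes are assumptions, not the exact code in `train_CoPL_rm.py`.

```python
# Illustrative sketch of a pairwise (Bradley-Terry) preference loss.
# Names and shapes are assumptions; see train_CoPL_rm.py for the actual training loop.
import torch
import torch.nn.functional as F

def preference_loss(reward_chosen: torch.Tensor,
                    reward_rejected: torch.Tensor) -> torch.Tensor:
    # reward_*: (batch,) scalar rewards for the preferred / rejected responses,
    # produced by the user-conditioned reward model
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()

# Toy batch of 4 preference pairs.
chosen = torch.randn(4, requires_grad=True)
rejected = torch.randn(4, requires_grad=True)
loss = preference_loss(chosen, rejected)
loss.backward()
```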
Perform adaptation for new users:

python unseen_user_adaptation.py \
--data_path dataset/your_data.pkl \
--unseen_data_path dataset/unseen_data.pkl \
--model_path gcf_models/your_model.pt \
--save_path unseen_embeddings.pt \
--hidden_dim 512 \
--l 4

Adaptation Method (see the sketch after this list):
- User embedding generation using 2-hop neighbor information
- Inference based on the preference patterns of existing (seen) users
- Embedding aggregation through softmax-weighted averaging
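The aggregation step can be pictured as follows. This is a minimal sketch of softmax-weighted averaging over seen-user embeddings; the neighbor scoring and all names are assumptions, not the exact logic in `unseen_user_adaptation.py`.

```python
# Illustrative sketch: build an unseen user's embedding as a softmax-weighted
# average of seen users' embeddings. Scores would come from how well each seen
# user's preference pattern agrees with the new user's observed comparisons;
# names and scoring are assumptions, not the actual implementation.
import torch

def adapt_unseen_user(seen_user_emb: torch.Tensor,   # (num_seen_users, hidden_dim)
                      scores: torch.Tensor,          # (num_seen_users,) agreement scores
                      temperature: float = 1.0) -> torch.Tensor:
    weights = torch.softmax(scores / temperature, dim=0)  # (num_seen_users,)
    return weights @ seen_user_emb                        # (hidden_dim,)

# Example: aggregate 10,000 seen-user embeddings of dimension 512.
new_emb = adapt_unseen_user(torch.randn(10000, 512), torch.randn(10000))
print(new_emb.shape)  # torch.Size([512])
```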
CoPL/
├── models/
│ ├── CoPL_gcf.py # User representation learning model
│ ├── CoPL_rm.py # Reward model
│ └── baselines/ # TODO: add baseline models
├── scripts/
│ ├── CoPL_rm_train.sh # RM training script
│ └── ds_config.json # DeepSpeed configuration
├── train_CoPL_gcf.py # User representation learning
├── train_CoPL_rm.py # Reward model training
├── unseen_user_adaptation.py # New user adaptation
Our implementation is inspired by and builds upon the following works:
- MoCLE: Mixture of Cluster-conditional LoRA Experts for Vision-language Instruction Tuning. Our MoLE architecture implementation is based on their code.
- VPL-LLM: Understanding Hidden Context in Preference Learning. Our preference learning framework is based on their code.
If you find this work useful for your research, please cite our paper:
@inproceedings{choi2025copl,
title={CoPL: Collaborative Preference Learning for Personalizing LLMs},
author={Youngbin Choi and Seunghyuk Cho and Minjong Lee and MoonJeong Park and Yesong Ko and Jungseul Ok and Dongwoo Kim},
booktitle={The 2025 Conference on Empirical Methods in Natural Language Processing},
year={2025},
url={https://arxiv.org/abs/2503.01658}
}

If you have questions or suggestions about the project, please create an issue.