Website: vectorinstitute.github.io/Factual-Preference-Alignment  |  Paper: arxiv.org/abs/2601.03027  |  Dataset: Hugging Face
Factuality-aware Direct Preference Optimization is a research and engineering framework for studying and improving factual alignment in preference-optimized Large Language Models (LLMs).
The project introduces F-DPO, a factuality-aware extension of Direct Preference Optimization (DPO) that incorporates:
- Explicit factuality supervision
- Synthetic hallucination inversion
- Margin-based factual penalties
The repository provides end-to-end infrastructure for:
- Dataset construction
- Multi-model preference fine-tuning
- Automated factuality evaluation
All components are config-driven, reproducible, and aligned with the Vector Institute AI Engineering Template.
- Binary factuality supervision integrated into preference learning
- Synthetic hallucination inversion pairs
- Δ-margin factual penalties for controllable hallucination suppression
- Fully config-driven data, training, and evaluation pipelines
- Multi-model × multi-Δ benchmarking at scale
aixpert/
│
├── src/aixpert/
│   ├── config/              # Central config.yaml
│   ├── data_construction/   # 8-stage factual dataset pipeline
│   ├── training/            # Original-DPO & F-DPO training
│   ├── evaluation/          # GPT-4o-mini judge evaluation
│   └── utils/               # Shared helpers
│
├── README.md
└── pyproject.toml
Standard DPO aligns models to human preferences, but does not explicitly discourage hallucinated yet preferred responses.
F-DPO introduces a factuality-aware margin:
- Each preference tuple includes binary factuality indicators (h_w, h_l)
- A penalty λ is applied when the preferred response is less factual
- Optimization pressure shifts toward factually correct preferences
➡️ Result: Lower hallucination rates without sacrificing preference alignment (see the loss sketch below)
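As a rough illustration of the idea, here is a minimal sketch under assumed notation, not the repository's implementation; it folds the penalty weight λ and the margin Δ into a single `delta` argument and treats the factuality labels as tensors.

```python
import torch
import torch.nn.functional as F

def f_dpo_loss(policy_chosen_logps, policy_rejected_logps,
               ref_chosen_logps, ref_rejected_logps,
               h_w, h_l, beta=0.1, delta=10.0):
    """Sketch of a factuality-aware DPO loss (not the repository's exact code).

    h_w / h_l: binary factuality indicators (1 = factual, 0 = hallucinated)
    for the chosen and rejected responses. The preference margin is reduced
    by delta whenever the chosen response is less factual than the rejected
    one, shifting optimization pressure toward factually correct preferences.
    """
    # Implicit DPO rewards relative to the frozen reference model
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)

    # Factual penalty is active only when h_w < h_l (preferred answer is less factual)
    penalty = delta * torch.clamp(h_l - h_w, min=0.0)

    return -F.logsigmoid(chosen_rewards - rejected_rewards - penalty).mean()
```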
This repository contains a complete eight-stage pipeline for converting the Skywork Reward-Preference-80K dataset into balanced, factuality-aware DPO datasets.
| Stage | Description |
|---|---|
| 1 | Skywork extraction & de-duplication |
| 2 | Preference pair conversion |
| 3 | Binary factuality scoring (GPT-4o-mini) |
| 4 | Canonical DPO transformation |
| 5 | Synthetic hallucination generation |
| 6 | Dataset merging |
| 7 | Balanced bucket construction |
| 8 | Optional preference flipping |
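To make the pipeline output concrete, a single training example can be pictured as the record below. The field names are illustrative assumptions, not the pipeline's exact schema.

```python
# Illustrative factuality-aware preference record (hypothetical field names;
# the actual schema is defined by the data_construction pipeline).
example_record = {
    "prompt": "Who wrote 'On the Origin of Species'?",
    "chosen": "Charles Darwin wrote 'On the Origin of Species', first published in 1859.",
    "rejected": "It was written by Alfred Russel Wallace in 1871.",  # deliberate hallucination
    "h_w": 1,  # binary factuality label for the chosen response
    "h_l": 0,  # binary factuality label for the rejected response
    "source": "skywork",  # e.g. "synthetic_inversion" for stage-5 pairs
}
```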
All paths and parameters are defined in:
src/aixpert/config/config.yaml
Every component – datasets, models, hyperparameters, outputs, and evaluation – is controlled via:
src/aixpert/config/config.yaml
Loaded using:
from utils.config_loader import load_config
cfg = load_config()
This enables:
- Full reproducibility
- Multi-model automation
- Zero hard-coded paths
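As a hedged illustration of the config-driven workflow, the snippet below extends the loader call above with a few hypothetical keys; consult config.yaml for the real structure.

```python
from utils.config_loader import load_config

cfg = load_config()

# Hypothetical keys shown for illustration only; the real names live in config.yaml.
model_id = cfg["models"]["gemma2-9b"]["hf_id"]    # e.g. "google/gemma-2-9b-it"
delta_values = cfg["training"]["delta_values"]    # e.g. [5, 10, 20]
eval_output = cfg["evaluation"]["results_path"]   # e.g. "eval_results.json"
```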
python -m aixpert.training.run_dpo_training \
  --model "google/gemma-2-9b-it"
Trains standard DPO using Skywork preferences.
python -m aixpert.training.run_factual_training \
  --model_id "google/gemma-2-9b-it" \
  --short "gemma2-9b" \
  --delta 10
Each Δ value produces a separate fine-tuned model.
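For multi-model × multi-Δ benchmarking, the command above can be wrapped in a small sweep script. The sketch below is illustrative: the model list and Δ grid are assumptions and should be taken from config.yaml in practice.

```python
import subprocess

# Illustrative sweep; models and delta values should come from config.yaml.
MODELS = [
    ("google/gemma-2-9b-it", "gemma2-9b"),
    ("Qwen/Qwen2.5-7B-Instruct", "qwen2.5-7b"),
]
DELTAS = [5, 10, 20]

for model_id, short in MODELS:
    for delta in DELTAS:
        subprocess.run(
            [
                "python", "-m", "aixpert.training.run_factual_training",
                "--model_id", model_id,
                "--short", short,
                "--delta", str(delta),
            ],
            check=True,  # stop the sweep if any run fails
        )
```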
Evaluation is performed using GPT-4o-mini as an LLM-as-a-Judge.
| Metric | Meaning |
|---|---|
| factuality | Mean factual score |
| halluc_rate | % of outputs scoring below the factuality threshold |
| win_rate | Win rate of the Δ-model vs. the baseline |
| count | Prompts evaluated |
Run evaluation:
python -m aixpert.evaluation.evaluations.run_all_evaluations
Outputs:
eval_results.json
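The layout of eval_results.json is not specified here; assuming it maps each run to the four metrics above, a minimal summary script could look like this.

```python
import json

# Assumes eval_results.json maps run names to the metrics listed above
# (factuality, halluc_rate, win_rate, count); the real schema may differ.
with open("eval_results.json") as f:
    results = json.load(f)

for run, m in sorted(results.items()):
    print(f"{run}: factuality={m['factuality']}, halluc_rate={m['halluc_rate']}, "
          f"win_rate={m['win_rate']}, n={m['count']}")
```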
- Gemma-2 (2B, 9B)
- Qwen-2.5 / Qwen-3
- LLaMA-3.x
- Any TRL-compatible causal LLM
Models are registered centrally in config.yaml.
- Hugging Face TRL – DPO reference implementation
- Unsloth – QLoRA optimization
- BitsAndBytes – 4-bit quantization
- Flash-Attention-2
- Weights & Biases – experiment tracking
- Accelerate – multi-GPU orchestration
This project builds upon and extends the Skywork Reward-Preference-80K dataset.
We do not claim ownership of the Skywork dataset. All credit belongs to the original authors.
If you use this repository, please cite Skywork:
@article{liu2024skywork,
title={Skywork-Reward: Bag of Tricks for Reward Modeling in LLMs},
author={Liu, Chris Yuhao and Zeng, Liang and Liu, Jiacai and Yan, Rui and He, Jujie and Wang, Chaojie and Yan, Shuicheng and Liu, Yang and Zhou, Yahui},
journal={arXiv preprint arXiv:2410.18451},
year={2024}
}
For dataset-related concerns, please contact the Skywork authors via their paper or Hugging Face repository.
If you find this code or dataset useful for your research, please consider citing:
@article{FactualAlignment2026,
title={Reducing Hallucinations in LLMs via Factuality-Aware Preference Learning},
author={Sindhuja Chaduvula and Ahmed Radwan and Azib Farooq and Yani Ioannou and Shaina Raza},
journal={arXiv preprint arXiv:2601.03027},
year={2026}
}
For questions, collaborations, or issues:
- Open a GitHub Issue
- Or contact the maintainers via the Vector Institute
⚡ Factuality-aware Direct Preference Optimization reduces hallucinations and improves factuality.
Resources used in preparing this research were provided, in part, by the Province of Ontario, the Government of Canada through CIFAR, and companies sponsoring the Vector Institute. This research was funded by the European Unionβs Horizon Europe research and innovation programme under the AIXPERT project (Grant Agreement No. 101214389), which aims to develop an agentic, multi-layered, GenAI-powered framework for creating explainable, accountable, and transparent AI systems.
We invite researchers and practitioners to build upon this framework.
