The framework consists of three steps: first, supervised fine-tuning (SFT); second, reward model (RM) training; and third, PPO fine-tuning with human feedback data (RLHF).
- SFT.py: This code fine-tunes a large language model (GPT-Neo 1.3B) on a question-answering task using the SQuAD v2 dataset. It prepares the data by combining each question, context, and answer into a single prompt. The model is loaded with memory-efficient techniques (quantization and LoRA), and training is run with the Hugging Face Trainer, which handles the training loop and evaluation. The result is a version of the base model specialized for answering questions from a given context (see the first sketch after this list).
- RewardModel.py: This code implements a reward model for RLHF. It uses the Anthropic/hh-rlhf dataset, which contains pairs of chosen and rejected responses. The model, based on DeBERTa-v3, is trained to predict a scalar reward that distinguishes preferred from non-preferred responses. The training process includes options for debugging and for logging with Weights & Biases (wandb) (see the second sketch after this list).
- RLHF.py: This code implements the RLHF stage using PPO. It loads the pre-trained SFT model and the reward model, then prepares them for PPO training. The SQuAD v2 dataset is reused, preprocessed for the RLHF task. A PPO configuration specifies parameters such as the learning rate, batch size, and KL divergence target. The goal is to fine-tune the language model to maximize the reward while a KL divergence constraint keeps it close to the original model (see the third sketch after this list).
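The first sketch below illustrates the SFT stage: GPT-Neo 1.3B loaded in 8-bit, LoRA adapters attached, and SQuAD v2 examples formatted as question/context/answer prompts for the Hugging Face Trainer. The prompt format, LoRA settings, hyperparameters, and output directory are illustrative assumptions, not the exact values used in SFT.py.

```python
# Minimal SFT sketch (assumed hyperparameters and prompt format).
from datasets import load_dataset
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import (AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_name = "EleutherAI/gpt-neo-1.3B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token

# Load the base model in 8-bit to save memory, then attach LoRA adapters.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)
lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # GPT-Neo attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Combine question, context, and answer into a single training prompt.
def format_example(example):
    answer = example["answers"]["text"][0] if example["answers"]["text"] else "unanswerable"
    text = f"Question: {example['question']}\nContext: {example['context']}\nAnswer: {answer}"
    return tokenizer(text, truncation=True, max_length=512)

dataset = load_dataset("squad_v2")
tokenized = dataset.map(format_example, remove_columns=dataset["train"].column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="sft-gpt-neo-squad2",   # assumed output path
        per_device_train_batch_size=4,
        gradient_accumulation_steps=4,
        num_train_epochs=1,
        learning_rate=2e-4,
        fp16=True,
        logging_steps=50,
    ),
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
trainer.evaluate()
```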
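The second sketch shows pairwise reward-model training on Anthropic/hh-rlhf: a DeBERTa-v3 sequence-classification head with a single output acts as the scalar reward, and a Bradley-Terry style loss pushes the chosen response's reward above the rejected one's. The base checkpoint, batch size, learning rate, and subset size are assumptions; wandb logging is omitted here.

```python
# Minimal pairwise reward-model sketch (assumed loss form and hyperparameters).
import torch
import torch.nn.functional as F
from datasets import load_dataset
from torch.utils.data import DataLoader
from transformers import AutoModelForSequenceClassification, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"
model_name = "microsoft/deberta-v3-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# A single regression head produces the scalar reward.
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=1).to(device)

# hh-rlhf provides "chosen" and "rejected" response texts for each prompt.
dataset = load_dataset("Anthropic/hh-rlhf", split="train").select(range(2000))  # small subset for illustration

def collate(batch):
    chosen = tokenizer([b["chosen"] for b in batch], truncation=True, max_length=512,
                       padding=True, return_tensors="pt")
    rejected = tokenizer([b["rejected"] for b in batch], truncation=True, max_length=512,
                         padding=True, return_tensors="pt")
    return chosen, rejected

loader = DataLoader(dataset, batch_size=4, shuffle=True, collate_fn=collate)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

model.train()
for chosen, rejected in loader:
    chosen = {k: v.to(device) for k, v in chosen.items()}
    rejected = {k: v.to(device) for k, v in rejected.items()}
    r_chosen = model(**chosen).logits.squeeze(-1)      # reward for the preferred response
    r_rejected = model(**rejected).logits.squeeze(-1)  # reward for the rejected response
    # Pairwise ranking loss: maximize the margin between chosen and rejected rewards.
    loss = -F.logsigmoid(r_chosen - r_rejected).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```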
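The third sketch outlines the PPO stage using TRL's classic PPOTrainer API (pre-0.12): the SFT model becomes the policy (with a value head), a frozen copy serves as the KL reference, and the reward model scores generated answers. The checkpoint paths, reward wiring via a text-classification pipeline, and all hyperparameters are illustrative assumptions.

```python
# PPO (RLHF) sketch with TRL's classic PPOTrainer API; paths and hyperparameters are assumed.
import torch
from datasets import load_dataset
from transformers import AutoTokenizer, pipeline
from trl import AutoModelForCausalLMWithValueHead, PPOConfig, PPOTrainer

sft_path = "sft-gpt-neo-squad2"    # assumed output of the SFT stage
reward_path = "reward-deberta-hh"  # assumed output of the reward-model stage

tokenizer = AutoTokenizer.from_pretrained(sft_path)
tokenizer.pad_token = tokenizer.eos_token

# Policy with a value head for PPO; a frozen copy acts as the KL reference model.
policy = AutoModelForCausalLMWithValueHead.from_pretrained(sft_path)
ref_model = AutoModelForCausalLMWithValueHead.from_pretrained(sft_path)

# Reward model wrapped as a pipeline returning a raw scalar score.
reward_pipe = pipeline("text-classification", model=reward_path, function_to_apply="none")

# Build RLHF queries from SQuAD v2: question + context, answer left for the policy to generate.
def build_query(example):
    prompt = f"Question: {example['question']}\nContext: {example['context']}\nAnswer:"
    example["input_ids"] = tokenizer(prompt, truncation=True, max_length=384)["input_ids"]
    return example

dataset = load_dataset("squad_v2", split="train").map(build_query)
dataset.set_format(type="torch", columns=["input_ids"])

config = PPOConfig(
    learning_rate=1.41e-5,
    batch_size=16,
    mini_batch_size=4,
    init_kl_coef=0.2,
    target=6.0,  # adaptive KL controller target
)
ppo_trainer = PPOTrainer(config=config, model=policy, ref_model=ref_model,
                         tokenizer=tokenizer, dataset=dataset)

generation_kwargs = {"max_new_tokens": 48, "do_sample": True,
                     "pad_token_id": tokenizer.eos_token_id}

for batch in ppo_trainer.dataloader:
    queries = list(batch["input_ids"])
    responses = ppo_trainer.generate(queries, return_prompt=False, **generation_kwargs)
    texts = [tokenizer.decode(q) + tokenizer.decode(r) for q, r in zip(queries, responses)]
    rewards = [torch.tensor(out["score"]) for out in reward_pipe(texts, truncation=True)]
    # One PPO step: maximize reward while the KL penalty keeps the policy near the reference.
    ppo_trainer.step(queries, responses, rewards)
```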
The basic flow of the framework can be seen below: