JasonZhujp/RELIEF

RELIEF

This repository contains the PyTorch implementation of the paper RELIEF: Reinforcement Learning Empowered Graph Feature Prompt Tuning, which has been accepted at SIGKDD 2025.

Inspired by the marginal effect of increasing prompt token length on performance improvement in LLMs, we propose RELIEF, a REinforcement LearnIng Empowered graph Feature prompt tuning method. Our goal is to enhance the performance of pre-trained GNN models on downstream tasks by incorporating only necessary and lightweight feature prompts into input graphs.

Installation

The following packages are required under Python 3.9.

pytorch 2.0.1
torch-geometric 2.3.1
torch-cluster 1.6.3+pt20cu117
torch-scatter 2.1.2+pt20cu117
torch-sparse 0.6.18+pt20cu117
torch-spline-conv 1.2.2+pt20cu117
rdkit 2022.3.4
scikit-learn 1.2.0
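
As a convenience, the pinned versions above can be checked against the active environment. The following is a minimal sketch and not part of the repository: the helper names `base_version` and `check_environment` are hypothetical, and the pip distribution names (for example `torch` for pytorch, `rdkit`) are assumptions that may differ depending on how the packages were installed.

```python
from importlib import metadata

# Pinned versions copied from the list above.
# Assumption: these keys are the installed distribution names
# (e.g. "pytorch" is normally installed as "torch").
REQUIRED = {
    "torch": "2.0.1",
    "torch-geometric": "2.3.1",
    "torch-cluster": "1.6.3+pt20cu117",
    "torch-scatter": "2.1.2+pt20cu117",
    "torch-sparse": "0.6.18+pt20cu117",
    "torch-spline-conv": "1.2.2+pt20cu117",
    "rdkit": "2022.3.4",
    "scikit-learn": "1.2.0",
}


def base_version(version):
    """Drop a local build tag such as '+pt20cu117' before comparing."""
    return version.split("+", 1)[0]


def check_environment(required=REQUIRED, lookup=metadata.version):
    """Return {package: problem} for every missing or mismatched package."""
    problems = {}
    for name, wanted in required.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if base_version(found) != base_version(wanted):
            problems[name] = f"found {found}, expected {wanted}"
    return problems
```

Calling `check_environment()` returns an empty dictionary when every listed package matches its pinned base version. Only the part before `+` is compared, so a different CUDA build tag is not reported.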

Experiments

Graph Classification

  • Datasets: For graph-level tasks, we use chemical datasets to pre-train the GNN model. For downstream graph classification, we use eight binary classification datasets for molecular property prediction. All datasets are taken from the paper Strategies for Pre-training Graph Neural Networks and are available in its repository. Please download the chemistry dataset, unzip it while retaining the dataset directory, and place it directly under the graph_level folder.

  • Pre-training: We follow the training steps from the papers Strategies for Pre-training Graph Neural Networks and Graph Contrastive Learning with Augmentations to obtain four pre-trained GIN models, one for each of the Deep Graph Infomax, Attribute Masking, Context Prediction and Graph Contrastive Learning strategies. Pre-trained models are available in the repository of an existing baseline work.

  • Baselines: Fine-tuning, GPF, GPF-plus, SUPTsoft, SUPThard and All in One.

  • Running: Each of the eight downstream datasets is paired with each of the four pre-trained GIN models, forming 32 graph classification tasks. You can reproduce the experimental results presented in our paper by running graph_level/run.sh, which provides the arguments and hyper-parameter settings.
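
The 32 graph classification tasks are the Cartesian product of datasets and pre-training strategies. The sketch below only illustrates that count; it is not how graph_level/run.sh is organised. The strategy names come from the Pre-training item above, while the dataset identifiers are an assumption based on the eight molecular benchmarks used in Strategies for Pre-training Graph Neural Networks, and `task_grid` is a hypothetical helper.

```python
from itertools import product

# Strategy names as listed in the Pre-training item above.
PRETRAIN_STRATEGIES = [
    "Deep Graph Infomax",
    "Attribute Masking",
    "Context Prediction",
    "Graph Contrastive Learning",
]

# Assumption: identifiers for the eight downstream molecular datasets;
# the names expected by run.sh may be spelled differently.
DATASETS = ["bbbp", "tox21", "toxcast", "sider", "clintox", "muv", "hiv", "bace"]


def task_grid(datasets, strategies):
    """Pair every downstream dataset with every pre-trained model."""
    return list(product(datasets, strategies))


tasks = task_grid(DATASETS, PRETRAIN_STRATEGIES)
print(len(tasks))  # 8 datasets x 4 strategies = 32
```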

Node Classification

  • Datasets: For node-level tasks, we use Cora, Citeseer, Pubmed and Amazon-Co-Buy (Computers and Photo), which are already saved as processed (SVD) datasets in the node_level/dataset folder. Since we extend feature prompt tuning approaches to node-level tasks by operating on the induced k-hop subgraphs of the target nodes, we provide the code for splitting training, validation, and testing subgraphs in node_level/preprocess.py. Because fine-tuning and the other prompt-based methods do not incorporate subgraph-related designs, the generated file saved in node_level/subgraph/split_data also keeps the node indices of the training, validation, and testing node sets.

  • Pre-training: We use the two edge-level pre-training strategies employed in two pioneering works, GPPT and GraphPrompt. GPPT uses masked edge prediction, a binary classification pretext task, whereas GraphPrompt uses contrastive learning, which determines positive and negative node pairs based on edge connectivity. You can first pre-train GIN models by running node_level/pretrain_gppt.py and node_level/pretrain_gprompt.py, or directly use the models we provide in the pretrained_models directory.

  • Baselines: Fine-tuning, GPPT, GraphPrompt, GPF, GPF-plus, SUPTsoft and SUPThard.

  • Running: For each downstream dataset, the two pre-trained GIN models are used, forming 10 node classification tasks. You can reproduce the experimental results presented in our paper by running node_level/run.sh.
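
To illustrate what an induced k-hop subgraph of a target node is, here is a minimal pure-Python sketch. It is not the code in node_level/preprocess.py: the function name `k_hop_subgraph`, the edge-list input format, and the undirected-graph assumption are all illustrative choices.

```python
from collections import deque


def k_hop_subgraph(edges, target, k):
    """Return (nodes, induced_edges) for the k-hop neighbourhood of `target`.

    `edges` is an iterable of (u, v) pairs treated as undirected.
    The induced subgraph keeps every edge whose endpoints are both
    within k hops of the target, not only the edges the search traversed.
    """
    edges = list(edges)

    # Build an undirected adjacency map.
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    # Breadth-first search that stops expanding at distance k.
    distance = {target: 0}
    queue = deque([target])
    while queue:
        u = queue.popleft()
        if distance[u] == k:
            continue
        for v in adjacency.get(u, ()):
            if v not in distance:
                distance[v] = distance[u] + 1
                queue.append(v)

    nodes = set(distance)
    induced = sorted(
        {(min(u, v), max(u, v)) for u, v in edges if u in nodes and v in nodes}
    )
    return sorted(nodes), induced


# Triangle 0-1-2 with a tail 2-3; the 1-hop subgraph around node 0
# also keeps the edge (1, 2) because both endpoints are included.
nodes, sub_edges = k_hop_subgraph([(0, 1), (0, 2), (1, 2), (2, 3)], target=0, k=1)
print(nodes, sub_edges)
```

When working with PyTorch Geometric tensors, `torch_geometric.utils.k_hop_subgraph` offers comparable functionality on an `edge_index`; whether the repository relies on it is not stated here.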
