GRENADE: Graph-Centric Language Model for Self-Supervised Representation Learning on Text-Attributed Graphs
This repository contains the code for the paper "GRENADE: Graph-Centric Language Model for Self-Supervised Representation Learning on Text-Attributed Graphs", accepted at EMNLP Findings 2023.
Install the required packages using the following command:
pip install -r requirements.txt
The dataset is available here. Each project folder (ogbl-citation2, ogbn-arxiv, and ogbn-products) contains several key files:
- {project}-ogbn.torch: The dataset file, including the adjacency matrix, node classification labels, and split information.
- {project}_text.csv or X.all.txt: The raw text content for each node.
- mrr_edges.torch: The edges used for the link prediction task.
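Before training, it can help to confirm that a downloaded project folder has the files listed above. The following is a minimal sketch, not part of the repository: the helper names `expected_files` and `missing_files` are hypothetical, and it assumes that `{project}` is replaced by the folder name and that all three kinds of file sit directly inside that folder.

```python
from pathlib import Path


def expected_files(data_root, project):
    """Return candidate paths for the key files of one project folder.

    Assumption: `{project}` in the file names equals the folder name.
    """
    folder = Path(data_root) / project
    return {
        "graph": [folder / f"{project}-ogbn.torch"],
        # The raw text may be stored under either of these names.
        "text": [folder / f"{project}_text.csv", folder / "X.all.txt"],
        "link_prediction_edges": [folder / "mrr_edges.torch"],
    }


def missing_files(data_root, project):
    """List the roles for which none of the candidate files exists."""
    return [
        role
        for role, candidates in expected_files(data_root, project).items()
        if not any(path.is_file() for path in candidates)
    ]
```

For example, `missing_files("data", "ogbn-arxiv")` returns an empty list when every expected file is present under `data/ogbn-arxiv`; here `data` is a placeholder for wherever the dataset was unpacked.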
To run self-supervised training:
cd scripts
sh ssl_train.sh
The evaluation includes the following tasks:
- MLP node classification
- GraphSAGE node classification
- Link Prediction
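The README does not state which metric the link prediction task reports; the file name mrr_edges.torch suggests Mean Reciprocal Rank (MRR), so the sketch below illustrates that metric under this assumption. It is not the repository's evaluation code. The function name `mean_reciprocal_rank` and the input layout (one positive score paired with a list of negative scores) are hypothetical, and ties are resolved in favor of the positive edge.

```python
def mean_reciprocal_rank(examples):
    """Compute MRR over (positive_score, negative_scores) pairs.

    The rank of each positive edge is 1 plus the number of negative
    edges that score strictly higher (ties favor the positive edge).
    """
    reciprocal_ranks = []
    for positive_score, negative_scores in examples:
        rank = 1 + sum(1 for score in negative_scores if score > positive_score)
        reciprocal_ranks.append(1.0 / rank)
    if not reciprocal_ranks:
        raise ValueError("at least one example is required")
    return sum(reciprocal_ranks) / len(reciprocal_ranks)
```

With two examples whose positive edges rank first and third, the function returns (1 + 1/3) / 2, roughly 0.667.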
To run the evaluation:
cd scripts
sh eval.sh
The node embeddings checkpoint is available here.
If you use this code for your research, please cite our paper:
@misc{li2023grenade,
  title={GRENADE: Graph-Centric Language Model for Self-Supervised Representation Learning on Text-Attributed Graphs},
  author={Yichuan Li and Kaize Ding and Kyumin Lee},
  year={2023},
  eprint={2310.15109},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}