This repository contains TensorFlow and PyTorch code and datasets for the paper:
Lianghao Xia, Chao Huang, Yong Xu, Jiashu Zhao, Dawei Yin, Jimmy Xiangji Huang (2022). Hypergraph Contrastive Collaborative Filtering. In SIGIR'22, Madrid, Spain, July 11-15, 2022. (Paper in arXiv; Paper in ACM.)
Hypergraph Contrastive Collaborative Filtering (HCCF) devises a parameterized hypergraph neural network together with hypergraph-graph contrastive learning, to relieve the over-smoothing issue of conventional graph neural networks and to address the sparse and skewed data distributions common in collaborative filtering.
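For intuition, here is a minimal sketch of a cross-view InfoNCE contrastive objective between graph-view and hypergraph-view embeddings; the function and variable names, shapes, and the loss weighting are illustrative assumptions, not the repository's actual API (see the paper for the exact formulation):

```python
import torch
import torch.nn.functional as F

def cross_view_infonce(z_graph, z_hyper, temp=0.1):
    # z_graph, z_hyper: [num_nodes, dim] embeddings of the same nodes from the
    # local graph view and the global hypergraph view (illustrative names).
    z1 = F.normalize(z_graph, dim=1)
    z2 = F.normalize(z_hyper, dim=1)
    logits = z1 @ z2.t() / temp                           # cosine similarity / temperature
    labels = torch.arange(z1.size(0), device=z1.device)   # positives: same node in both views
    return F.cross_entropy(logits, labels)

# e.g. total = bpr_loss + ssl_reg * (cross_view_infonce(u_g, u_h) + cross_view_infonce(i_g, i_h))
```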
If you want to use our code and datasets in your research, please cite:
```
@inproceedings{hccf2022,
  author    = {Xia, Lianghao and
               Huang, Chao and
               Xu, Yong and
               Zhao, Jiashu and
               Yin, Dawei and
               Huang, Jimmy Xiangji},
  title     = {Hypergraph Contrastive Collaborative Filtering},
  booktitle = {Proceedings of the 45th International {ACM} {SIGIR} Conference on
               Research and Development in Information Retrieval, {SIGIR} 2022,
               Madrid, Spain, July 11-15, 2022},
  year      = {2022},
}
```
The code of HCCF is implemented and tested under the following development environments:
TensorFlow:
- python=3.6.12
- tensorflow=1.14.0
- numpy=1.16.0
- scipy=1.5.2
PyTorch:
- python=3.10.4
- torch=1.11.0
- numpy=1.22.3
- scipy=1.7.3
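One plausible way to set up the PyTorch environment (the conda/pip commands below are our assumption; any equivalent installation of the listed versions works):

```
conda create -n hccf python=3.10.4
conda activate hccf
pip install torch==1.11.0 numpy==1.22.3 scipy==1.7.3
```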
We utilized three datasets to evaluate HCCF: Yelp, MovieLens, and Amazon. Following the common settings of implicit feedback, if a user has rated an item, the corresponding element of the interaction matrix is set to 1, otherwise 0.
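As a sketch, such a binary interaction matrix can be built with SciPy roughly as follows (the arrays and shapes are hypothetical, for illustration only):

```python
import numpy as np
from scipy.sparse import csr_matrix

# hypothetical arrays of observed (user, item) interaction pairs
users = np.array([0, 0, 1, 2])
items = np.array([1, 3, 2, 0])

# implicit feedback: every observed pair becomes 1; all other entries stay 0
ones = np.ones(len(users), dtype=np.float32)
interactions = csr_matrix((ones, (users, items)), shape=(3, 4))
```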
Please unzip the datasets first. You also need to create the `History/` and `Models/` directories (e.g. `mkdir History Models`). The commands to train HCCF on the Yelp/MovieLens/Amazon datasets are given below; they specify the hyperparameter settings that produce the results reported in the paper.
- Yelp: `python labcode_efficient.py --data yelp --temp 1 --ssl_reg 1e-4`
- MovieLens: `python labcode_efficient.py --data ml10m --temp 0.1 --ssl_reg 1e-6 --keepRate 1.0 --reg 1e-3`
- Amazon: `python labcode_efficient.py --data amazon --temp 0.1 --ssl_reg 1e-7 --reg 1e-2`
For the PyTorch version, switch your working directory to `torchVersion/` and run `python Main.py`. The implementation has been improved in the torch code, so you may need to adjust the hyperparameter settings. If you want to run HCCF on other datasets and your dataset is sparse, we suggest using the simplified version `torchVersion/Model_sparse.py`. To do so, change the imported module in `torchVersion/Main.py` from `Model` to `Model_sparse` (see the sketch after the configurations below). For the datasets used in this paper, we recommend the following configurations:
- Yelp: `python Main.py --data yelp --reg 0 --ssl 0.2 --temp 0.1 --keep 1.0`
- MovieLens: `python Main.py --data ml10m --reg 1e-6 --ssl 0.5 --temp 0.1`
- Amazon: `python Main.py --data amazon --reg 1e-6 --ssl 0.2 --temp 0.1`
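The import swap mentioned above might look like this in `torchVersion/Main.py` (the exact imported names are an assumption; check the actual import line in `Main.py`):

```python
# Before (default dense variant):
# from Model import Model
# After (simplified sparse variant):
from Model_sparse import Model
```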
Important arguments:
- `reg`: the weight for weight-decay regularization. We tune this hyperparameter from the set {1e-2, 1e-3, 1e-4, 1e-5}.
- `ssl_reg`: the weight for the hypergraph-graph contrastive learning loss. The value is tuned from 1e-2 to 1e-8.
- `temp`: the temperature factor in the InfoNCE loss of our contrastive learning. The value is selected from {10, 3, 1, 0.3, 0.1}.
- `keepRate`: the rate at which edges are kept in graph dropout, tuned from {0.25, 0.5, 0.75, 1.0}.
- `leaky`: the slope of the LeakyReLU activation function, recommended to be tuned from {1.0, 0.5, 0.1}.
- `mult`: a hyperparameter to manually scale the embedding magnitude of the hypergraph NN in `Model_sparse.py`. Empirically, you can tune this parameter for the simplified version of HCCF from {1e-2, 1e-1, 1}.
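For illustration, here is a minimal sketch of how `keepRate`-style edge dropout on a sparse adjacency could be implemented; this is our illustrative version, not necessarily what the repository does:

```python
import torch

def edge_dropout(adj, keep_rate):
    # Keep each edge with probability keep_rate; rescale the kept edge weights
    # by 1/keep_rate so expected message magnitudes stay unchanged.
    if keep_rate >= 1.0:
        return adj
    adj = adj.coalesce()
    vals, idx = adj.values(), adj.indices()
    mask = torch.rand(vals.size(0), device=vals.device) < keep_rate
    return torch.sparse_coo_tensor(idx[:, mask], vals[mask] / keep_rate, adj.shape)
```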
This research is supported by research grants from the Department of Computer Science & Musketeers Foundation Institute of Data Science at the University of Hong Kong, and the Natural Sciences & Engineering Research Council (NSERC) of Canada.