Towards Sufficient GPU-accelerated Dynamic Graph Management: Survey and Experiment


This is the artifact of our paper "Towards Sufficient GPU-accelerated Dynamic Graph Management: Survey and Experiment" for VLDB 2025.

Environment Requirements

  • g++ 9.4 or higher (C++14 support required)
  • CUDA 11.3 or higher
  • A modern NVIDIA GPU (we tested on cards with the Volta and Ampere architectures)
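A quick way to check the installed toolchain and driver versions before building (nvcc must be on your PATH):

g++ --version
nvcc --version
nvidia-smi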

Dataset

Due to GitHub's file size limit, the preprocessed datasets are hosted externally. You can download them from this Google Drive link and put the unzipped files into the dataset folder.
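For example, assuming the downloaded archive is named dataset.zip (the actual file name may differ):

mkdir -p ./dataset
unzip dataset.zip -d ./dataset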

Here are some publicly available sources of our evaluated datasets: road, Wiki, Patent, Pokec, LiveJournal, Stack, and Orkut are from SNAP; Graph500 is produced by the Kronecker generator of the Graph 500 benchmark; LDBC-SF30 and LDBC-SF100 are from the LDBC Social Network Benchmark; and uk-2005 is from the Laboratory for Web Algorithms.

If you want to run with your own datasets, use the following input format, which is similar to the .txt format of the SNAP datasets.

vertex_number edge_number
1 2
1 3
1 4
......

For a graph with N edges, the edge-list file contains N+1 lines. The first line gives the number of vertices and the number of edges. Each of the following N lines describes one edge as a source vertex ID followed by a destination vertex ID.
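If your raw edge list has no header line, a minimal sketch like the following can prepend one, assuming vertex IDs are contiguous and start at 1 so that the largest ID equals the vertex count (file names are illustrative):

# count edges and the largest vertex ID, then prepend the header line
E=$(wc -l < raw_edges.txt)
V=$(awk '{ if ($1 > m) m = $1; if ($2 > m) m = $2 } END { print m }' raw_edges.txt)
{ echo "$V $E"; cat raw_edges.txt; } > my_graph.txt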

Code Structure

The project directory contains one folder per evaluated system (cuSTINGER, faimgraph, gunrock, hornet, LPMA, and GPMA) and a dataset folder for the input graphs.

Compile

Before compiling, make sure you have modified the corresponding Makefile or CMakeLists.txt to specify the CUDA arch and gencode flags for your NVIDIA GPU architecture. Important parameters such as the NVCC path and the include paths should also be set there.
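The exact variable names differ across the systems' build files, but the relevant entries typically look like the following sketch (paths and variable names are illustrative; sm_70 targets Volta and sm_80 targets Ampere):

# illustrative Makefile fragment; adapt names and paths to each system's build files
NVCC     = /usr/local/cuda/bin/nvcc
GENCODE  = -gencode arch=compute_70,code=sm_70 -gencode arch=compute_80,code=sm_80
INCLUDES = -I/usr/local/cuda/include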

For cuSTINGER, faimgraph, gunrock, and hornet, which use CMake:

cd [path to a system]
mkdir build && cd build
cmake ..
make -j8

For LPMA and GPMA, which use a plain Makefile:

cd [path to a system]
make -j8

Run the code

Update operations

To run update tests, use the following command template:

cd [path to executable files]
./update [data_graph] [batch_size]

The first argument data_graph indicates the location of the data graph file. The second argument batch_size instructs the system to generate update batches of this fixed size, which should be between $10^4$ and $10^8$.

For example, to run faimgraph's update on Orkut with a batch size of $10^5$:

cd ./faimgraph/build
./update ../../dataset/orkut.txt 100000

Query primitives and analytic workloads

To run query primitives and analytic workloads:

cd [path to executable files]
./[operation_name] [data_graph] [options]

We implemented 5 query primitives and 3 analytic workloads on top of the aforementioned open-source versions of the evaluated systems: edge check (./findE), neighborhood scan (./getNeighor), 2-hop neighbor scan (./get2Hop), cycle detection (./findCycle), clique detection (./findClique), breadth-first search (./bfs), PageRank (./pr), and betweenness centrality (./bc). The names of the corresponding executable files are given in parentheses.

To run ./findE, ./getNeighor, ./get2Hop, ./findCycle, and ./findClique, one extra option is required: either uniform or biased. The former instructs the program to sample vertices or vertex pairs with equal probability to generate a uniform workload; the latter assigns a higher sampling probability to high-degree vertices.
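For example, assuming the same directory layout as in the update example above (the dataset path is illustrative):

cd ./faimgraph/build
./findE ../../dataset/orkut.txt uniform
./getNeighor ../../dataset/orkut.txt biased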

To run a breadth-first search (./bfs), you should provide a random seed to generate starting vertices. For example:

./bfs ../../dataset/orkut.txt 9527

To run betweenness centrality (./bc), an argument k is needed to control the sample size used to estimate betweenness centrality. k should not be larger than the number of distinct vertices in the given data graph.
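For example, with an illustrative sample size of k = 64:

./bc ../../dataset/orkut.txt 64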

To run PageRank (./pr), three arguments are required: (1) damping_factor, a probability between 0 and 1 (0.85 by default); (2) max_iter, the maximum number of iterations; and (3) tol, the error tolerance used to check convergence.

For example, to run PageRank with faimgraph on Orkut:

./pr ../../dataset/orkut.txt 0.85 100 0.0000001
