GALG

This repository contains GALG, a graph-based artificial intelligence approach that links addresses for user tracking on TLS-encrypted traffic.

GALG combines a graph auto-encoder with adversarial training to learn user embeddings that capture both semantics and distributions. Building on a new theory, link generation, GALG can link all the addresses of target users using knowledge of address-service links.

The work is introduced in the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD 2022):

Tianyu Cui, Gang Xiong, Chang Liu, Junzheng Shi, Peipei Fu, Gaopeng Gou. GALG: Linking Addresses in Tracking Ecosystem Using Graph Autoencoder with Link Generation. European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD), 2022.

Note: this code is based on GAE, ARGA, GAT, and Link Prediction Experiments. Many thanks to the authors.

Requirements

python 3
TensorFlow (1.0 or later)
gensim
networkx
scikit-learn
scipy

Run

python main.py

Data

For privacy reasons, we provide only the public dataset used in the paper.

CSTNET: A public dataset collected from March to July 2018 on the China Science and Technology Network (CSTNET).

To use your own data, make sure its format matches data/cstnet.json and set the data path in main.py.
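One rough way to compare a custom file against data/cstnet.json is to reduce each file to a nesting signature and check that the two agree. The sketch below is not part of the repository: the helper names `shape` and `same_shape` are hypothetical, and the check only samples the first element at each level, so it can miss differences deeper in the data. The sample strings are made up and do not reflect the actual layout of cstnet.json.

```python
import json


def shape(value, depth=3):
    """Return a short nesting signature such as 'dict[list[str]]'.

    Only the first value of each container is inspected, so this is a
    coarse sanity check rather than a schema validation.
    """
    if isinstance(value, dict):
        if not value or depth == 0:
            return "dict"
        first = next(iter(value.values()))
        return "dict[" + shape(first, depth - 1) + "]"
    if isinstance(value, list):
        if not value or depth == 0:
            return "list"
        return "list[" + shape(value[0], depth - 1) + "]"
    return type(value).__name__


def same_shape(path_a, path_b, depth=3):
    """Load two JSON files and compare their nesting signatures."""
    with open(path_a, encoding="utf-8") as fa, open(path_b, encoding="utf-8") as fb:
        return shape(json.load(fa), depth) == shape(json.load(fb), depth)


# Made-up in-memory samples, used only to show what the signature looks like.
sample_a = json.loads('{"k1": ["x", "y"], "k2": ["z"]}')
sample_b = json.loads('[{"k1": "x"}]')
signature_a = shape(sample_a)
signature_b = shape(sample_b)
print(signature_a)
print(signature_b)
```

With real files, a call such as `same_shape("data/cstnet.json", "data/my_data.json")` would return True only when the two signatures agree; the second path is a placeholder.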

Models

You can choose between the following models:

GALG: Graph Auto-Encoder for Link Generation
VGALG: Variational Graph Auto-Encoder for Link Generation
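Both models build on graph auto-encoders. In the standard GAE formulation, a link between two nodes is scored by the sigmoid of the inner product of their embeddings. The sketch below illustrates only that decoding step; it is not taken from the GALG code, and the embeddings and the function name `link_probability` are made up for illustration. How GALG itself produces and regularizes the embeddings is described in the paper.

```python
import math


def link_probability(z_u, z_v):
    """Inner-product decoder: sigmoid(z_u . z_v), as in the standard GAE."""
    dot = sum(a * b for a, b in zip(z_u, z_v))
    return 1.0 / (1.0 + math.exp(-dot))


# Hypothetical 2-dimensional embeddings for three addresses.
embeddings = {
    "addr_a": [1.0, 0.5],
    "addr_b": [0.9, 0.6],
    "addr_c": [-1.0, -0.4],
}

p_ab = link_probability(embeddings["addr_a"], embeddings["addr_b"])
p_ac = link_probability(embeddings["addr_a"], embeddings["addr_c"])

# Embeddings pointing in similar directions score above 0.5,
# embeddings pointing in opposite directions score below 0.5.
print(round(p_ab, 3), round(p_ac, 3))
```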

Utils

We provide utilities for extended experiments on user tracking and link generation:

baselines: Link prediction methods adapted to the link generation framework.

The link prediction methods include:

(Variational) Graph Auto-Encoders: An end-to-end trainable graph convolutional network model for unsupervised learning on graphs
Adversarially Regularized (Variational) Graph Autoencoder: An adversarial graph embedding framework for robust graph embedding learning
Node2Vec/DeepWalk: A skip-gram based approach to learning node embeddings from random walks within a given graph
Spectral Clustering: Using spectral embeddings to create node representations from an adjacency matrix
Heuristics: Common Neighbors, Jaccard, and Preferential Attachment
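The three heuristics in the list above can be sketched in a few lines of plain Python. This is an illustration of the standard definitions, not the code in utils/baselines; the toy address-service edges and the node names are hypothetical. For two addresses in an address-service graph, the common neighbors are the services both addresses connect to.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def common_neighbors(adj, u, v):
    """Number of neighbors shared by u and v."""
    return len(adj[u] & adj[v])


def jaccard(adj, u, v):
    """Shared neighbors divided by the size of the combined neighborhood."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0


def preferential_attachment(adj, u, v):
    """Product of the two node degrees."""
    return len(adj[u]) * len(adj[v])


# Hypothetical address-service links.
edges = [
    ("addr_a", "svc_1"), ("addr_a", "svc_2"),
    ("addr_b", "svc_1"), ("addr_b", "svc_2"),
    ("addr_c", "svc_3"),
]
adj = build_adjacency(edges)
addresses = sorted(n for n in adj if n.startswith("addr_"))

# Score every candidate address pair with all three heuristics.
scores = {
    (u, v): {
        "common_neighbors": common_neighbors(adj, u, v),
        "jaccard": jaccard(adj, u, v),
        "preferential_attachment": preferential_attachment(adj, u, v),
    }
    for u, v in combinations(addresses, 2)
}
for pair, result in scores.items():
    print(pair, result)
```

In this toy graph, addr_a and addr_b share both of their services, so they receive the highest Common Neighbors and Jaccard scores, while addr_c shares nothing with either.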
