- entities.dict: one entity per line, given as an index and an entity name separated by "\t".
- relations.dict: one relation per line, given as an index and a relation name separated by "\t".
- triplets.txt: one triplet per line, given as a head entity name, a relation name, and a tail entity name separated by "\t".
- content_emb.npy: a NumPy array of shape [number of entities, dimension of text embeddings].
- num_feats.npy: a NumPy array of shape [number of entities, dimension of numeric features].
- node2lab.csv: a CSV file with labels for paper entities, containing entity indices (column "Index") and their scores (column "label").
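
Before training, it can help to check that the files agree with each other. The following is a minimal sketch, not part of this repository: the function names `read_dict` and `check_dataset` are hypothetical, and it assumes that the CSV header uses exactly the column names "Index" and "label" and that the labels are numeric.

```python
import csv
from pathlib import Path

import numpy as np


def read_dict(path):
    """Read an 'index<TAB>name' file into a {name: index} mapping."""
    mapping = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                continue  # skip blank lines
            idx, name = line.split("\t", 1)
            mapping[name] = int(idx)
    return mapping


def check_dataset(data_dir):
    """Load the dataset files and check that they are mutually consistent."""
    data_dir = Path(data_dir)
    entities = read_dict(data_dir / "entities.dict")
    relations = read_dict(data_dir / "relations.dict")

    # Every triplet must use names that appear in the two dictionaries.
    num_triplets = 0
    with open(data_dir / "triplets.txt", encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                continue
            head, rel, tail = line.split("\t")
            assert head in entities, f"unknown entity: {head}"
            assert tail in entities, f"unknown entity: {tail}"
            assert rel in relations, f"unknown relation: {rel}"
            num_triplets += 1

    # Both feature arrays need one row per entity.
    content_emb = np.load(data_dir / "content_emb.npy")
    num_feats = np.load(data_dir / "num_feats.npy")
    assert content_emb.shape[0] == len(entities)
    assert num_feats.shape[0] == len(entities)

    # Assumed header: "Index,label"; labels are parsed as floats.
    with open(data_dir / "node2lab.csv", newline="", encoding="utf-8") as f:
        labels = {int(r["Index"]): float(r["label"]) for r in csv.DictReader(f)}
    known_indices = set(entities.values())
    assert all(i in known_indices for i in labels), "label for unknown entity index"

    return {
        "num_entities": len(entities),
        "num_relations": len(relations),
        "num_triplets": num_triplets,
        "text_dim": content_emb.shape[1],
        "numeric_dim": num_feats.shape[1],
        "num_labels": len(labels),
    }
```

Calling `check_dataset` on a dataset directory returns a small summary dictionary, or raises an `AssertionError` naming the first inconsistency it finds.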
bash scripts/run_kge_rpp.sh 0 CombineLiteralAll_X 100 10 # 100 dimensions and 10 epochs
Here, X is the base KGE model (e.g. TransE, DistMult, or ComplEx).
bash scripts/run_rr_rpp.sh CombineLiteralAll_X 100 # 100 dimensions
- Convert the data into the required format (e.g. entities.dict, relations.dict, etc.).
- Create a directory under "datasets/" and put all the files in it.
- Create scripts following the sample scripts "scripts/run_**.sh".
- Run the scripts.
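
The first two steps could be sketched as follows. This is only an illustration under stated assumptions: `write_dataset` is a hypothetical helper, not a function from this repository. It assumes that indices are zero-based and assigned in sorted name order, and that row i of each array belongs to the entity with index i. Check these conventions against the provided datasets before relying on them.

```python
import csv
from pathlib import Path

import numpy as np


def write_dataset(out_dir, triples, content_emb, num_feats, labels):
    """Write the six dataset files into out_dir.

    triples:     list of (head name, relation name, tail name)
    content_emb: {entity name: text embedding vector}
    num_feats:   {entity name: numeric feature vector}
    labels:      {entity name: score}, only for labelled (paper) entities
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Assumption: zero-based indices in sorted name order.
    entity_names = sorted({e for h, _, t in triples for e in (h, t)})
    relation_names = sorted({r for _, r, _ in triples})
    entity_to_idx = {name: i for i, name in enumerate(entity_names)}

    with open(out_dir / "entities.dict", "w", encoding="utf-8") as f:
        for name, idx in entity_to_idx.items():
            f.write(f"{idx}\t{name}\n")
    with open(out_dir / "relations.dict", "w", encoding="utf-8") as f:
        for idx, name in enumerate(relation_names):
            f.write(f"{idx}\t{name}\n")
    with open(out_dir / "triplets.txt", "w", encoding="utf-8") as f:
        for head, rel, tail in triples:
            f.write(f"{head}\t{rel}\t{tail}\n")

    # Assumption: row i of each array describes the entity with index i.
    np.save(out_dir / "content_emb.npy",
            np.stack([np.asarray(content_emb[n], dtype=float) for n in entity_names]))
    np.save(out_dir / "num_feats.npy",
            np.stack([np.asarray(num_feats[n], dtype=float) for n in entity_names]))

    with open(out_dir / "node2lab.csv", "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["Index", "label"])
        for name, score in labels.items():
            writer.writerow([entity_to_idx[name], score])

    return entity_to_idx
```

For example, `write_dataset("datasets/my_dataset", triples, content_emb, num_feats, labels)` would create the directory (the name `my_dataset` is a placeholder) and fill it, after which the sample scripts can be copied and pointed at the new dataset.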
Our code is mainly based on PyKEEN. Thanks to its authors for developing and sharing the library!