This repository contains the code needed to reproduce the results reported in our paper.
- An OS capable of running Java and Python code
- At least 16 GB of RAM
- OpenJRE 8 / Oracle JRE 8 or newer
- Python 3
- The following PyPI packages:
  - matplotlib
  - networkx
  - numpy
Download from the official website or get the code on GitHub.
$ python3 -m pip install -r requirements.txt
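For reference, a minimal requirements file covering the packages listed above would contain nothing more than:

matplotlib
networkx
numpy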
This module contains classes that wrap rules from the AMIE+ standard output in an object-oriented data structure.
Example usage:
from amie_rule_wrapper import AMIERule, NoAMIERuleInLineError

amie_rules = set()
with open(amie_output_fn, "r") as f:
    for line in f:
        try:
            amie_rule = AMIERule(line)
            amie_rules.add(amie_rule)
        except NoAMIERuleInLineError:
            # The line does not contain a rule (e.g. AMIE+ log output); skip it.
            continue
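To illustrate what such a wrapper has to parse: AMIE+ prints one rule per line as tab-separated columns (the rule itself, followed by metrics such as head coverage, standard confidence, and PCA confidence). The following is a minimal, hypothetical sketch of parsing one such line; it is not the actual AMIERule implementation:

# Hypothetical sketch only -- not the repository's AMIERule class.
# Assumes the usual AMIE+ tab-separated output columns:
# rule, head coverage, std. confidence, PCA confidence, ...
def parse_amie_line(line):
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 4 or "=>" not in fields[0]:
        raise ValueError("no AMIE rule in this line")
    body, head = (part.strip() for part in fields[0].split("=>", 1))
    return {
        "body": body,  # body atoms, e.g. "?a <p1> ?b"
        "head": head,  # head atom, e.g. "?a <p2> ?b"
        "head_coverage": float(fields[1]),
        "std_confidence": float(fields[2]),
        "pca_confidence": float(fields[3]),
    }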
This module implements our approach as described in the paper.
It takes the output of AMIE+ as input and produces a dictionary of relation pairs together with their confidences of being synonymous, written to the same directory as the AMIE+ output.
Currently, it also requires the relation2id.txt mapping from the benchmark as input for parallelization.
For usage details and more options, see:
$ python3 amie_synonyms.py -h
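The resulting *.synonyms.dict file can be inspected directly. The sketch below assumes it is a pickled Python dictionary mapping relation pairs to synonym confidences; the exact serialization used by amie_synonyms.py may differ:

import pickle

# Assumption: the .synonyms.dict file is a pickled dict mapping
# relation pairs to confidence scores; adapt if the format differs.
dict_fn = ("wikidata-20181221TN-1k_2000_50/"
           "wikidata-20181221TN-1k_2000_50.new.tsv.amie-output.synonyms.dict")
with open(dict_fn, "rb") as f:
    synonyms = pickle.load(f)

# Show the ten candidate pairs with the highest synonym confidence.
for pair, confidence in sorted(synonyms.items(), key=lambda kv: kv[1], reverse=True)[:10]:
    print(pair, confidence)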
This module contains code for the classification and evaluation of the results produced by amie_synonyms.py.
It produces precision-recall or precision@topK plots (depending on the experiment) in the same directory as the input.
Currently, it also requires the relation2id.txt mapping from the benchmark as input for parallelization.
For usage details and more options, see:
$ python3 evaluate_synonyms.py -h
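For reference, the precision@topK metric can be stated in a few lines; the sketch below is illustrative only and does not reproduce the module's actual classification logic:

# Illustrative sketch of precision@topK; evaluate_synonyms.py implements
# its own classification and plotting on top of this idea.
def precision_at_k(ranked_pairs, gold_pairs, k):
    # Fraction of the top-k ranked candidate pairs that are gold synonyms.
    hits = sum(1 for pair in ranked_pairs[:k] if pair in gold_pairs)
    return hits / k

# Toy example: three ranked candidate pairs, two of which are correct.
ranked = [("birthPlace", "placeOfBirth"), ("spouse", "partner"), ("author", "writer")]
gold = {("birthPlace", "placeOfBirth"), ("author", "writer")}
print(precision_at_k(ranked, gold, 2))  # -> 0.5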
The procedure for each experiment given a benchmark (B) consists of three steps:
- Mining association rules using AMIE+ with (B) as input, producing the rules as output.
- Calculating synonymous relation pairs with their confidences, using our approach with the rules as input.
- Classifying and evaluating the results and plotting the precision-recall or precision@topK curves for the experiment, using the list calculated in the previous step.
For each experiment, we provide the exact shell commands for each step to reproduce our results.
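Schematically, the three steps chain together as follows, where B stands for the benchmark directory and file prefix and GOLD for the gold-standard file; the exact invocations with all options we used are listed per experiment below:

$ java -jar amie_plus.jar -optimcb -optimfh -minhc 0.005 B/B.tsv |& tee B/B.tsv.amie-output
$ python3 amie_synonyms.py -r B/relation2id.txt -a B/B.tsv.amie-output
$ python3 evaluate_synonyms.py -r B/relation2id.txt -g B/GOLD -s B/B.tsv.amie-output.synonyms.dict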
For evaluation, we focused on the Wikidata and DBpedia datasets from our previous work in order to extend our pool of baselines. See the paper / repository for details on the samples.
The rules used for the experiments are available here: https://cloudstorage.tu-braunschweig.de/public?folderID=MjRQc2FHd3VYUThuWEQ5V3E2am1p
The gold-standard datasets for the evaluation are also available for download: https://doi.org/10.6084/m9.figshare.11343785.v1
$ java -jar amie_plus.jar -optimcb -optimfh -minhc 0.005 wikidata-20181221TN-1k_2000_50/wikidata-20181221TN-1k_2000_50.new.tsv |& tee wikidata-20181221TN-1k_2000_50/wikidata-20181221TN-1k_2000_50.new.tsv.amie-output
$ python3 amie_synonyms.py -r wikidata-20181221TN-1k_2000_50/relation2id.txt -a wikidata-20181221TN-1k_2000_50/wikidata-20181221TN-1k_2000_50.new.tsv.amie-output -p 28 -n jaccard --min-std 0.05 --max-diff-std-hc 1.0
$ python3 evaluate_synonyms.py -r wikidata-20181221TN-1k_2000_50/relation2id.txt -g wikidata-20181221TN-1k_2000_50/synonyms_uri_filtered.txt -b wikidata-20181221TN-1k_2000_50/baselines -s wikidata-20181221TN-1k_2000_50/wikidata-20181221TN-1k_2000_50.new.tsv.amie-output.synonyms.dict
$ java -jar amie_plus.jar -optimcb -optimfh -minhc 0.005 dbpedia-201610N-1k-filtered/dbpedia-201610N-1k-filtered.tsv |& tee dbpedia-201610N-1k-filtered/dbpedia-201610N-1k-filtered.tsv.amie-output
$ python3 amie_synonyms.py -r dbpedia-201610N-1k-filtered/relation2id.txt -a dbpedia-201610N-1k-filtered/dbpedia-201610N-1k-filtered.tsv.amie-output -p 28 -n jaccard --min-std 0.05 --max-diff-std-hc 1.0
$ python3 evaluate_synonyms.py -r dbpedia-201610N-1k-filtered/relation2id.txt -s dbpedia-201610N-1k-filtered/dbpedia-201610N-1k-filtered.tsv.amie-output.synonyms.dict -g dbpedia-201610N-1k-filtered/synonyms_uri_combined.txt -b dbpedia-201610N-1k-filtered/baselines -c simple -p 28 -f precision_topk