Learning Word Relatedness over Time

This repository provides the data and implementation of the paper:

Learning Word Relatedness over Time
Guy D. Rosin, Eytan Adar and Kira Radinsky
EMNLP 2017
https://arxiv.org/abs/1707.08081

Code

The main folder contains:

code for creating word embeddings using word2vec, either from a single corpus (word2vec_model_alltime.py), or from a temporal corpus (models_builder.py)
a framework for running and evaluating various types of ML classifiers (classifier.py)
a peak detection algorithm that we used (peak_detection.py)
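
As an illustration of what a temporal corpus involves, the sketch below groups tokenized sentences by year, so that one embedding model could be trained per time slice. It is not taken from models_builder.py: the function name `group_sentences_by_year`, the `(year, text)` input layout, and the naive sentence splitting are all assumptions made for this example.

```python
from collections import defaultdict


def group_sentences_by_year(documents):
    """Group tokenized sentences by publication year.

    `documents` is an iterable of (year, text) pairs. This layout is
    hypothetical and is not the format used by models_builder.py.
    """
    buckets = defaultdict(list)
    for year, text in documents:
        # Naive sentence splitting on periods, for illustration only.
        for sentence in text.split("."):
            tokens = sentence.lower().split()
            if tokens:
                buckets[year].append(tokens)
    return dict(buckets)


# Toy documents, invented for this example.
docs = [
    (1990, "Iraq invades Kuwait. Oil prices rise."),
    (1990, "The Gulf crisis deepens."),
    (2003, "Coalition forces enter Iraq."),
]
by_year = group_sentences_by_year(docs)
```

Each `by_year[year]` list could then be handed to a word2vec trainer that accepts an iterable of token lists (gensim's `Word2Vec` is one such implementation) to obtain one model per year.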

The search folder contains code for temporal query expansion, in particular:

searching the New York Times archive, using Apache Solr, and evaluating search results (temporal_search.py)
performing temporal query expansion, where the query is either a single entity (qe_single_entity.py) or multiple entities (qe_multiple_entities.py)
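
The idea behind temporal query expansion can be sketched as follows: for a query entity and a target year, pick the words closest to the entity in that year's embedding space and add them to the query. This is a minimal sketch under stated assumptions, not the logic of qe_single_entity.py. The function `expand_query`, the `embeddings_by_year` dictionary layout, and the toy vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_query(entity, year, embeddings_by_year, k=2):
    """Return the k words most similar to `entity` in the given year.

    `embeddings_by_year` maps year -> {word: vector}; this structure is
    an assumption made for the example.
    """
    vectors = embeddings_by_year[year]
    query_vec = vectors[entity]
    scored = [
        (cosine(query_vec, vec), word)
        for word, vec in vectors.items()
        if word != entity
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [word for _, word in scored[:k]]


# Toy two-dimensional vectors, invented for illustration only.
toy = {
    1991: {"iraq": [1.0, 0.0], "kuwait": [0.9, 0.1],
           "oil": [0.6, 0.4], "olympics": [0.0, 1.0]},
    2003: {"iraq": [0.0, 1.0], "baghdad": [0.1, 0.9],
           "kuwait": [0.9, 0.1], "coalition": [0.3, 0.7]},
}
terms_1991 = expand_query("iraq", 1991, toy)
terms_2003 = expand_query("iraq", 2003, toy)
```

With these toy vectors, the same entity yields different expansion terms depending on the year, which is the behavior a temporal model is meant to capture.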

Data

The data folder contains:

Relations, in the format of: <entity1, entity2, start_year, end_year, relation_type>
Binary relations that were generated from the relations file, in the format of: <entity1, entity2, year, true/false>
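
The two formats above are related: a relation that holds over a year range can be unrolled into one true/false row per year. The sketch below shows one plausible way to do this. The actual generation procedure is not described here, so the function `to_binary_relations`, the inclusive year bounds, the explicit `first_year`/`last_year` window, and the sample record are all assumptions.

```python
def to_binary_relations(relations, first_year, last_year):
    """Expand relation tuples into per-year binary rows.

    Each input tuple is (entity1, entity2, start_year, end_year, relation_type).
    Each output tuple is (entity1, entity2, year, holds). Treating start_year
    and end_year as inclusive is an assumption.
    """
    rows = []
    for entity1, entity2, start_year, end_year, _relation_type in relations:
        for year in range(first_year, last_year + 1):
            holds = start_year <= year <= end_year
            rows.append((entity1, entity2, year, holds))
    return rows


# Placeholder record, not taken from the repository's data files.
relations = [("Entity A", "Entity B", 1993, 1995, "example_relation")]
binary = to_binary_relations(relations, 1992, 1996)
```

Here the pair is labeled false for 1992 and 1996 and true for 1993 through 1995.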
