Collaborators: Nima Afshar & Nika Shahabi
This project adapts the paper "Matching the Blanks: Distributional Similarity for Relation Learning" [1] by Baldini Soares et al. to the Persian language.
The project has two phases. First, we compare different architectures for extracting a relation representation from the output of a deep transformer model. Second, we try a semi-supervised approach that treats relation classification as a distance-learning problem. We use the PERLEX [2] dataset for the relation classification task; it is an expert-translated Persian version of SemEval-2010 Task 8 [3].
Figure from [1].
We tested the six architectures proposed in the paper on our dataset; the figure below visualizes them. Architecture names take the form "I-O", where "I" is the entity span identification method applied to the input tokens, and "O" is the fixed-length relation representation method applied to the output of the transformer model.
Entity span identification methods:
- Standard Input: keep the default input tokens of the BERT model.
- Positional Embedding: The positional embedding index is set to one for all tokens in the first entity, two for all tokens in the second entity, and zero for all other tokens.
- Entity Marker tokens: Two special tokens named [E1] and [/E1] are added before and after the first entity tokens. In the same way, special tokens [E2] and [/E2] are added before and after the second entity tokens.
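
The two non-default input methods above can be sketched in plain Python. This is a minimal illustration, not the project's code: the function names `add_entity_markers` and `entity_type_ids`, the `(start, end)` span convention (end exclusive), and the English example sentence are all assumptions made for the example. It also assumes the two entity spans are non-empty and do not overlap.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end exclusive (assumed convention)


def add_entity_markers(tokens: List[str], e1: Span, e2: Span) -> List[str]:
    """Wrap the first entity in [E1]...[/E1] and the second in [E2]...[/E2]."""
    out: List[str] = []
    for i, tok in enumerate(tokens):
        # Emit closing markers before opening ones so adjacent spans nest correctly.
        if i == e1[1]:
            out.append("[/E1]")
        if i == e2[1]:
            out.append("[/E2]")
        if i == e1[0]:
            out.append("[E1]")
        if i == e2[0]:
            out.append("[E2]")
        out.append(tok)
    # Close a span that runs to the end of the sentence.
    if e1[1] == len(tokens):
        out.append("[/E1]")
    if e2[1] == len(tokens):
        out.append("[/E2]")
    return out


def entity_type_ids(n_tokens: int, e1: Span, e2: Span) -> List[int]:
    """Return 1 for first-entity tokens, 2 for second-entity tokens, 0 elsewhere."""
    ids = [0] * n_tokens
    for i in range(*e1):
        ids[i] = 1
    for i in range(*e2):
        ids[i] = 2
    return ids


tokens = ["Ali", "was", "born", "in", "Tehran"]  # illustrative sentence
marked = add_entity_markers(tokens, (0, 1), (4, 5))
# marked == ["[E1]", "Ali", "[/E1]", "was", "born", "in", "[E2]", "Tehran", "[/E2]"]
ids = entity_type_ids(len(tokens), (0, 1), (4, 5))
# ids == [1, 0, 0, 0, 2]
```

In a real pipeline the marker strings would also have to be registered with the tokenizer as special tokens so that they are not split into word pieces.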
Fixed-length relation representation methods:
- [CLS]: The embedding of the special token [CLS] is used as the relation embedding.
- Entity mention pooling: Max-pooling is applied to the embeddings of all tokens in each entity to get the entity embedding. The relation representation is the concatenation of the first and second entity embeddings.
- Entity start state: The relation between two entities is represented by concatenating the final hidden states of their respective start tokens. This method requires Entity Marker tokens as the entity span identification method, so the start tokens are [E1] and [E2].
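
The three output-side methods above can be sketched as follows. This is a hedged illustration only: it treats the transformer output as a plain list of per-token vectors, assumes [CLS] sits at position 0, and uses hypothetical function names (`cls_representation`, `mention_pooling`, `entity_start_state`) that do not come from the project.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
Span = Tuple[int, int]  # (start, end) token indices, end exclusive (assumed convention)


def cls_representation(hidden: Sequence[Vector]) -> Vector:
    """[CLS]: take the hidden state at position 0, where [CLS] is assumed to be."""
    return list(hidden[0])


def max_pool(hidden: Sequence[Vector], span: Span) -> Vector:
    """Element-wise maximum over the hidden states of one entity span."""
    start, end = span
    return [max(column) for column in zip(*hidden[start:end])]


def mention_pooling(hidden: Sequence[Vector], e1: Span, e2: Span) -> Vector:
    """Entity mention pooling: concatenate the max-pooled entity embeddings."""
    return max_pool(hidden, e1) + max_pool(hidden, e2)


def entity_start_state(hidden: Sequence[Vector], e1_marker: int, e2_marker: int) -> Vector:
    """Entity start state: concatenate the hidden states at the [E1] and [E2] positions."""
    return list(hidden[e1_marker]) + list(hidden[e2_marker])


# Toy "transformer output": 4 tokens, hidden size 2 (made-up numbers).
hidden = [[0.0, 1.0], [2.0, -1.0], [1.0, 3.0], [0.5, 0.5]]
cls_vec = cls_representation(hidden)                 # [0.0, 1.0]
pooled = mention_pooling(hidden, (1, 3), (3, 4))     # [2.0, 3.0, 0.5, 0.5]
starts = entity_start_state(hidden, 1, 3)            # [2.0, -1.0, 0.5, 0.5]
```

Note that the [CLS] representation has the hidden size of the model, while the other two have twice that size because they concatenate two vectors.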
Figure from [1].
The paper makes the following assumption: if the same pair of entities appears in two different sentences, those two sentences probably express similar relations. The entities can therefore be replaced with BLANK tokens in both sentences, and the sentences can be fed to a model that returns a distance value. This value measures how similar the relation in the first sentence is to the relation in the second.
- This method is not implemented yet.
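
Since this phase is not implemented in the project, the following is only a rough sketch of the idea described above, under stated assumptions. The names `blank_entities` and `match_probability`, the `[BLANK]` token string, the `blank_prob` parameter, and the choice of a sigmoid over the dot product as the similarity score are all illustrative assumptions, not the project's design.

```python
import math
import random
from typing import List, Optional, Sequence, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end exclusive (assumed convention)


def blank_entities(
    tokens: List[str],
    e1: Span,
    e2: Span,
    blank_prob: float = 1.0,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Replace each entity span with a single [BLANK] token with probability blank_prob."""
    rng = rng or random.Random(0)
    blanked = [span for span in (e1, e2) if rng.random() < blank_prob]
    out: List[str] = []
    for i, tok in enumerate(tokens):
        covering = [s for s in blanked if s[0] <= i < s[1]]
        if not covering:
            out.append(tok)
        elif i == covering[0][0]:
            out.append("[BLANK]")  # one placeholder per blanked span
    return out


def match_probability(r1: Sequence[float], r2: Sequence[float]) -> float:
    """Score two relation representations: sigmoid of their dot product (an assumption)."""
    score = sum(a * b for a, b in zip(r1, r2))
    return 1.0 / (1.0 + math.exp(-score))


sentence = ["Ali", "Rezaei", "lives", "in", "Tehran"]  # illustrative sentence
blanked_sentence = blank_entities(sentence, (0, 2), (4, 5))
# blanked_sentence == ["[BLANK]", "lives", "in", "[BLANK]"]
neutral = match_probability([0.0, 0.0], [1.0, 1.0])
# neutral == 0.5, because the dot product is zero
```

With `blank_prob` below 1.0, some entity mentions are kept, which prevents a model from relying only on the sentence context or only on the entity names.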