nikolaStanojkovski/NLP_Twitter_Tasks

NLP Twitter Tasks


Seven NLP Tasks With Twitter Datasets



Data Science Project by:

Nikola Stanojkovski, Ines Lesnovska


Context

The experimental landscape in natural language processing for social media is too fragmented. Each year, new shared tasks and datasets are proposed, ranging from classics like sentiment analysis to irony detection or emoji prediction. As a result, it is unclear what the current state of the art is, as there is neither a standardized evaluation protocol nor a strong set of baselines trained on such domain-specific data. The purpose of this dataset is to present an evaluation framework consisting of seven heterogeneous Twitter-specific classification tasks.

Content

This dataset consists of seven heterogeneous Twitter tasks, all framed as multi-class tweet classification. Each dataset is presented in the same format, with fixed training, validation and test splits.
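Because every task shares one format and fixed splits, a single loader can serve all seven. The sketch below is only an illustration: the file names (`train_text.txt`, `train_labels.txt`, and so on), the one-tweet-per-line layout, and the helper names `load_split` and `load_task` are assumptions, not something this README specifies. Adjust them to the actual files.

```python
from pathlib import Path

# Assumed split names; the README only says there are fixed
# training, validation and test splits.
SPLITS = ("train", "val", "test")


def _read_lines(path):
    """Read a UTF-8 text file and return its lines without the trailing newline."""
    with open(path, encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in handle]


def load_split(task_dir, split):
    """Return (texts, labels) for one split of one task.

    Assumes two hypothetical files per split:
    <split>_text.txt (one tweet per line) and
    <split>_labels.txt (one integer label per line).
    """
    task_dir = Path(task_dir)
    texts = _read_lines(task_dir / f"{split}_text.txt")
    labels = [int(line) for line in _read_lines(task_dir / f"{split}_labels.txt") if line.strip()]
    if len(texts) != len(labels):
        raise ValueError(f"{split}: {len(texts)} texts but {len(labels)} labels")
    return texts, labels


def load_task(task_dir):
    """Load all fixed splits of one task into a dict keyed by split name."""
    return {split: load_split(task_dir, split) for split in SPLITS}
```

With this layout, `load_task("datasets/irony")["train"]` would return the training tweets and labels for one task (the directory name is hypothetical as well).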

Acknowledgements

  • Francesco Barbieri
  • Jose Camacho-Collados
  • Leonardo Neves
  • Luis Espinosa-Anke

Inspiration

  • Emotion Recognition
  • Emoji Prediction
  • Irony Detection
  • Hate Speech Detection
  • Offensive Language Identification
  • Sentiment Analysis
  • Stance Detection

All the work is presented in the file "Prediction.ipynb"


A short walkthrough of the project is given in the file "Presentation.ppt"


For each NLP task, several models were tried and the one that gave the best results was chosen
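One way to picture this selection step is to score every candidate on the validation split and keep the best. The following is a minimal sketch under stated assumptions: the candidates are plain callables mapping a tweet to a label, accuracy is used as the score, and the names `accuracy` and `select_best` are hypothetical. The notebook may select its models differently.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if not y_true:
        raise ValueError("empty label list")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def select_best(candidates, val_texts, val_labels, score=accuracy):
    """Score each candidate on the validation split and return the best.

    candidates: dict mapping a model name to a predict(text) -> label callable
                (an assumed interface, chosen for illustration only).
    Returns (best_name, scores), where scores maps each name to its score.
    """
    scores = {
        name: score(val_labels, [predict(text) for text in val_texts])
        for name, predict in candidates.items()
    }
    best_name = max(scores, key=scores.get)
    return best_name, scores


# Toy usage with two stand-in "models".
candidates = {
    "always_zero": lambda text: 0,
    "keyword": lambda text: 1 if "good" in text else 0,
}
best, scores = select_best(
    candidates,
    ["good day", "bad day", "good vibes"],
    [1, 0, 1],
)
```

Selecting on the validation split, rather than the test split, keeps the test scores an honest estimate of how the chosen model generalizes.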


For each NLP task, the most appropriate evaluation metrics were chosen
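This README does not list the metrics chosen for each task. As one illustrative example, macro-averaged F1 is a common choice for tweet classification with imbalanced classes, because every class counts equally regardless of its size. The sketch below is a plain-Python version written for illustration; the function name `macro_f1` is an assumption and this is not necessarily the metric or implementation used in the notebook.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        raise ValueError("empty label lists")
    per_class = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); cannot be zero-denominator here
        # because each label occurs at least once, but guard anyway.
        per_class.append(2 * tp / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


# Class 0: F1 = 2/3, class 1: F1 = 4/5, so the macro average is 11/15.
example = macro_f1([0, 0, 1, 1], [0, 1, 1, 1])
```

On the same toy labels, plain accuracy is 3/4, which shows how the two metrics can disagree once errors are concentrated in one class.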
