Skip to content

msheikh24/datascience

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

16 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

datascience

The SemEval-2016 dataset was downloaded using their recommended python script for Natural Language Processing (NLP) task of Sentiment Analysis. The instructions can be downloaded from https://github.com/aritter/twitter_download. I had to comment line 36 in download_tweets_api.py because script was crashing.

#uid = fields[1]

The raw downloaded dataset stats are:

Environment Task Point number of Tweets
DEV A Three 2000
DEV_TEST A Three 2000
TRAIN A Three 6000
TEST A Three 20,633
DEV B,D Two 1325
DEV_TEST B,D Two 1417
TRAIN B,D Two 4346
TEST B,D Two 10,552
DEV C,E Five 2000
DEV_TEST C,E Five 2000
TRAIN C,E Five 6000
TEST C,E Five 20,632

About

Data for Graduate Project (SemEval-2016)

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published