-
Notifications
You must be signed in to change notification settings - Fork 3
FTS152/NLP-Project-2-Offensive-Tweet-Classification-SemEval-2019-Task6
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
required Python3 To run this program, need to put training data and testing data in project2_data/ for BERT/ and stanfordNER.py, corresponding model file is required. Preprocessing: joinTest.py: combine test data and test labels processed.py: generate cased and uncased data with lemmatization load.py: generate preprocessed data(tokenization, filter stopwords and non-alphanumeric symbols) removing_emoji.py: remove emojis Domain_Knowledge_based: classify.py: used models: MultinomialNB, KNN, SVM, DecisionTree, RandomForest, LogisticRegression profanityFilter.py: detect offensive language by a list of common profanity words by Google spacyNER.py: NER tagger by SpaCy stanfordNER.py: NER tagger by stanford NN_based: train model and do prediction by neural network methods used models: CNN, BiLSTM, BiGRU BERT: training.py: train cased or uncased model testing.py: do prediction
About
Identifying and Categorizing Offensive Language in Social Media
Resources
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published