Skip to content

A solution for humor detection in binary data, using python and some classification algorithms such as Naive Bayes, KNN, SVM, Decision Trees.

Notifications You must be signed in to change notification settings

AncaElena10/Humor-Detection

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

7 Commits
 
 
 
 
 
 

Repository files navigation

I have provided a solution for humor detection in binary data.

Features extracted:
-> remove word with length less than 3
-> text cleaning (bring the words to their 'complete' form; that means removing the words with aportrofe (like haven't, it will become have not)
-> stop word (but just a few of them, I don't want to extract all the stop words because some of them can give the text a negative sentiment like 'not' or 'no' and if I remove them, then the sentence will automatically become a positive one, which is wrong.
-> punctuation
-> stem words
-> lemm words
-> n-grams (2, 3 words)

Algorithms used:
-> Logistic Regression
-> Naive Bayes (Bernoulli, Multinomial)
-> SVM (LinearSVC, SVC (with kernel = linear or rbf), NuSVC (with kernel = linear or rbf))
-> DecisionTree
-> RandomForest
-> KNeighbors

The best values of C (LogisticRegression), random_state (DecisionTree, RandomForest), nu (NuSVC), n_neighbors (KNeighbors) were chosen with GridSearchCV.

                  clean  sw   punct  stem  lemm   ngrams  all
C                 1000   100  1000   100   1000   100     0.001 
nu*(1)            0.3    0.3  0.3    0.3   0.3    0.7     0.7 
nu*(2)            0.3    0.3  0.3    0.3   0.3    0.3     0.3 
n_neighbors       11     21   17     19    13     21      21 
random_state(1)*  1      12   13     18    15     12      12 
random_state(2)*  15     10   9      11    7      8       15

nu*(1) - NuSVC rbf
nu*(2) - NuSVC linear
random_state(1)* - RandomForest
random_state(2)* - DecisionTree

About

A solution for humor detection in binary data, using python and some classification algorithms such as Naive Bayes, KNN, SVM, Decision Trees.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages