Skip to content

A conceptual analysis to use domain adaptation for classification of documents as "Sports" "Politics" and "Technology"

Notifications You must be signed in to change notification settings

adityaghosh/Domain-Adaptation

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Domain-Adaptation

Read the complete report here

A conceptual analysis to use domain adaptation for classification of documents as "Sports" "Politics" and "Technology"

To run the program please follow the below steps.

1. Download 20Newsgroup dataset @ http://qwone.com/~jason/20Newsgroups/20news-19997.tar.gz

2. Extract the data set and copy the folder "20_Newsgroup" to this location

3. Download BBC News dataset @ http://mlg.ucd.ie/files/datasets/bbc-fulltext.zip

4. Extract the data set and copy the folder "bbc_full" to this location

5. Please make sure the following packages are installed.
	a. Python 2.7.3 or higher
	b. numpy - pip install numpy
	c. scipy - pip install scipy
	d. scikit learn - pip install scikit-learn

6. We need some preprocessing for the data first, to do this open a terminal pointing to this location and enter the following command.
	make clean-input

7. Now you can run the code.
   (The program will take a few seconds to load data and start), 
   Command for LogisticRegression with Stochastic Gradient Descent:
	make domain-adaptation-max-ent
		OR
   Command for Multinomial Naive Bayes:
	make domain-adaptation-naive-bayes

About

A conceptual analysis to use domain adaptation for classification of documents as "Sports" "Politics" and "Technology"

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published