Built a classifier in Python that categorizes whether articles from the local newspaper "Listin Diario" belong to the sports section or to another section. RapidMiner was used to preprocess the data and create the models (Naive Bayes and SVM). A Java application reads articles from the main page of the newspaper's site and passes them to the better-performing of the two models for classification.
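
To illustrate the sports-versus-other decision, here is a minimal sketch of a Naive Bayes text classifier using only the Python standard library. It is not the project's actual model, which was built in RapidMiner; the `NaiveBayes` class, the `tokenize` helper, and the four training sentences are all hypothetical and invented for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-záéíóúñü]+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.doc_counts = Counter(labels)        # documents per label
        self.word_counts = defaultdict(Counter)  # word counts per label
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in tokenize(text):
                if word in self.vocab:  # skip words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)


# Toy training data, invented for illustration only.
train_texts = [
    "El equipo ganó el partido de béisbol con un jonrón",
    "El lanzador dominó el juego y el equipo avanzó a la final",
    "El gobierno anunció un nuevo presupuesto para educación",
    "El banco central subió la tasa de interés",
]
train_labels = ["sports", "sports", "other", "other"]

model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("El equipo perdió el partido"))   # sports
print(model.predict("El presupuesto del gobierno"))   # other
```

Working in log space avoids numeric underflow on long articles, and the add-one smoothing keeps a word seen under only one label from zeroing out the other label's score.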
Python scripts were created to retrieve the articles via the RSS feed. Technologies used: Python, NLTK, RapidMiner, Java.
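
The RSS retrieval step could look roughly like the sketch below, which uses only the Python standard library and assumes a standard RSS 2.0 layout (`item` elements with `title`, `link`, and `description`). The function names `parse_rss` and `fetch_rss` and the sample feed are hypothetical; the original scripts and the newspaper's real feed URL are not shown here.

```python
import urllib.request
import xml.etree.ElementTree as ET


def parse_rss(xml_text):
    """Return a list of {title, link, description} dicts from RSS 2.0 XML."""
    root = ET.fromstring(xml_text)
    articles = []
    for item in root.iter("item"):
        articles.append({
            # findtext returns None for a missing element, so fall back to "".
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
            "description": (item.findtext("description") or "").strip(),
        })
    return articles


def fetch_rss(url, timeout=10):
    """Download a feed and parse it. The URL is supplied by the caller."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_rss(response.read())


# Sample feed, invented for illustration; no network access is needed.
SAMPLE_FEED = """<rss version="2.0"><channel>
  <title>Sample feed</title>
  <item><title>Titular uno</title><link>https://example.com/1</link>
        <description>Resumen uno</description></item>
  <item><title>Titular dos</title><link>https://example.com/2</link></item>
</channel></rss>"""

articles = parse_rss(SAMPLE_FEED)
for article in articles:
    print(article["title"], article["link"])
```

Keeping parsing separate from downloading means the parser can be checked against a saved feed without touching the network.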