- Extract all the questions and answers with the java tag.
- Use Mallet to apply LDA (Latent Dirichlet Allocation) to these posts.
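
The extraction step could be sketched roughly as follows. This is a minimal sketch, not the project's actual code, and it rests on several assumptions: that the posts file follows the Stack Exchange data dump layout (one `row` element per post, with `Id`, `Tags`, and `ParentId` attributes), that tags are stored in the angle-bracket form `<java><xml>`, and that every answer appears after its question in the file. The class name `JavaPostFilter` and the method `javaPostIds` are hypothetical.

```java
import java.io.Reader;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: streams a posts file and keeps java-tagged questions
// together with the answers that point at them.
public class JavaPostFilter {

    /** Returns the Ids of questions tagged java and of the answers to those questions. */
    public static List<String> javaPostIds(Reader xml) {
        Set<String> javaQuestions = new HashSet<>();
        List<String> kept = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(xml);
            while (reader.hasNext()) {
                // Only <row> start tags carry post data (assumed dump layout).
                if (reader.next() != XMLStreamConstants.START_ELEMENT
                        || !"row".equals(reader.getLocalName())) {
                    continue;
                }
                String id = reader.getAttributeValue(null, "Id");
                String tags = reader.getAttributeValue(null, "Tags");
                String parentId = reader.getAttributeValue(null, "ParentId");
                if (tags != null && tags.contains("<java>")) {
                    // Post carrying the java tag (assumed to be a question).
                    javaQuestions.add(id);
                    kept.add(id);
                } else if (parentId != null && javaQuestions.contains(parentId)) {
                    // Answer whose parent question was kept earlier in the stream.
                    kept.add(id);
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed posts XML", e);
        }
        return kept;
    }
}
```

Streaming with StAX avoids loading the whole file into memory, which matters if the posts file is large. Matching on `<java>` including both brackets keeps tags such as `javascript` from being picked up by accident.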

The lib folder includes all libraries required to run the project:

- jsoup-1.8.3.jar
- lucene-analyzers-common-5.2.1.jar
- lucene-core-5.2.1.jar
- lucene-highlighter-5.2.1.jar
- lucene-queryparser-5.2.1.jar

The Input folder includes all input data:

- Badges.xml
- java_all.txt
- JavaPosts.xml
- Tags.xml
- topic_topic.txt
- wordtopic.txt