kln-courses/corpustextanalysis

title: Corpus and Text Analysis
subtitle: Text Analytics, day 2
place: Karen Blixens Vej 4 (room 27.0.09)
time: November 11, 2016, 9 AM to 4 PM.
instructor: Kristoffer L. Nielbo  (KLN)
contact: kln@cas.au.dk

Text Analytics

Text analytics (roughly synonymous with text mining) is a heterogeneous research field concerned with extracting meaningful patterns from unstructured, text-heavy data. These patterns are typically extracted by applying statistical learning (i.e., machine learning) to data sets drawn from large non-relational databases. In this one-day introductory course, we will walk through a generic text analytics pipeline, with particular focus on the tools available for data preparation and for modeling and analysis.
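As a rough illustration of what such a pipeline can look like, the sketch below runs a data preparation step (lowercasing, tokenization, stopword removal) followed by a very simple analysis step (cosine similarity between bag-of-words vectors). It is not taken from the course material: the toy corpus, the stopword list, and the function names `prepare`, `bag_of_words`, and `cosine` are all assumptions made for this example, and it uses only the Python standard library.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus and stopword list, for illustration only.
DOCS = [
    "The cat sat on the mat.",
    "A cat and a dog sat together.",
    "Stock markets fell sharply on Monday.",
]
STOPWORDS = {"the", "a", "and", "on"}


def prepare(text):
    """Data preparation: lowercase, keep alphabetic tokens, drop stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def bag_of_words(tokens):
    """Represent a document as a mapping from term to frequency."""
    return Counter(tokens)


def cosine(u, v):
    """Cosine similarity between two term-frequency mappings."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Modeling/analysis: compare documents by their shared vocabulary.
vectors = [bag_of_words(prepare(d)) for d in DOCS]
print(cosine(vectors[0], vectors[1]))  # documents 1 and 2 share "cat" and "sat"
print(cosine(vectors[0], vectors[2]))  # documents 1 and 3 share no terms
```

In practice, the preparation step is usually richer (stemming or lemmatization, handling of numbers and punctuation), and the analysis step would be a clustering, classification, or sentiment model rather than a single pairwise similarity.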

Program

| Time | Content | Instructor |
|---|---|---|
| 09:00-10:00 | Text Analytics | KLN |
| 10:00-10:30 | Generic Tools | KLN |
| 10:30-11:00 | break | |
| 11:00-12:00 | Data Preparation | KLN |
| 12:00-12:30 | Concerns about Preprocessing | Munksgaard |
| 12:30-13:30 | lunch break | |
| 13:30-14:00 | Sentiments | KLN |
| 14:00-14:15 | break | |
| 14:15-15:00 | Clustering | KLN |
| 15:00-15:45 | Classification | KLN |
| 15:45-16:00 | course evaluation | |

Reading material

Blei, D. M. (2012). Probabilistic topic models. Communications of the ACM, 55(4), 77–84.

Brücher, H., Knolmayer, G., & Mittermayer, M.-A. (2002). Document classification methods for organizing explicit knowledge. Institut für Wirtschaftsinformatik der Universität Bern.

Radovanović, M., & Ivanović, M. (2008). Text mining: Approaches and applications. Novi Sad J. Math, 38(3), 227–234.

Reagan, A., Tivnan, B., Williams, J. R., Danforth, C. M., & Dodds, P. S. (2015). Benchmarking sentiment analysis methods for large-scale texts: A case for using continuum-scored words and word shift graphs. arXiv Preprint arXiv:1512.00531.

Tangherlini, T. R., & Leonard, P. (2013). Trawling in the Sea of the Great Unread: Sub-corpus topic modeling and Humanities research. Poetics, 41(6), 725–749.

Other

Installing R and Python beforehand is not required, but participants will benefit from having both available.
