This course teaches theories and techniques commonly used in the practice of data science. The primary focus is on text analysis, covering text parsing, language models, sequence estimation, vector space models, and distributional semantics, as well as statistical approaches including cluster analysis and supervised learning. Modern topics such as cloud computing, big data analysis, and data visualization are also discussed. Introductory courses in computer programming and in probability and statistics are recommended as prerequisites. All exercises and homework assignments assume programming in Python. Students are expected to present their work on the final project in groups towards the end of the term.