Mining Massive Datasets

Stanford University CS246

What is the course about?

In this course, you will learn many of the interesting algorithms that have been developed for efficiently processing large amounts of data in order to extract simple and useful models of that data. These techniques are often used to predict properties of future instances of the same sort of data, or simply to make sense of the data already available. Many people view data mining, or "big data," as machine learning. There are indeed some techniques for processing large datasets that can be considered machine learning, and we shall cover a number of these. But there are also many algorithms and ideas for dealing with big data that are not usually classified as machine learning, and we shall cover many of these as well.

Instructors of the course

Jure Leskovec
Anand Rajaraman
Jeff Ullman

Course outline (edX)

Be aware that the outline of the course on edX differs from that of CS246.

MapReduce
Link Analysis (PageRank)
Locality-Sensitive Hashing
Distance Measures and Nearest-Neighbor Learning
Frequent Itemset Analysis
Social-Network Graphs
Algorithms for Data Streams
Recommendation Systems
Dimensionality Reduction
Clustering
Computational Advertising
Machine Learning
More on MapReduce Algorithms
More on Locality-Sensitive Hashing
More on Link Analysis

Course materials

You can download the textbook through this link

Self-study tool

If you are a student and want to test your knowledge, you are welcome to use the Gradiance Online Accelerated Learning tool. Register here and use the class token 1EDD8A1D to join the "omnibus class" for the MMDS book.
