This is the GitHub repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]
Updated May 8, 2024 - Scala
SparkTDA is a package for Apache Spark that provides topological data analysis functionality.
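A comprehensive walkthrough of the basic algorithms in Spark MLlib, the machine learning library of the Spark big data framework, with a complete set of test files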
Basics of Big Data and Machine Learning using Apache Spark and Scala
PCARD Ensemble classifier for Big Data
Distributed version of RELIEF-F algorithm for Apache Spark.
😅 A topic model of reddit.com/r/EmojiPasta trained with Spark and an LDA model (NSFW). Trigger warning: the r/emojipasta subreddit posts controversial content, and the crawled data is included only to give visibility into topic modeling of that content. Unfortunately, it also contains discriminatory speech, which must be called out!
Anomaly Detection with Spark Machine Learning
(Class) Master's thesis source code. "A Distributed Recommender System on Apache Spark"