A curated list of awesome Apache Spark packages and resources.
R on Apache Spark (SparkR) tutorials for Big Data analysis and Machine Learning as IPython / Jupyter notebooks
Azure Databricks - Advent of 2020 Blogposts
SparkR workshop for the Jornadas de Usuarios de R (R users' conference)
Analysis of the safety (311) datasets published by Azure Open Datasets for Chicago, Boston and New York City, using SparkR, Spark SQL and Azure Databricks, with visualization in ggplot2 and leaflet. The focus is on descriptive analytics, visualization, clustering, time series forecasting and anomaly detection.
Slides and lab material for the talk R for HPC and big data at http://rsummer.data-analysis.at
Practice and workshop on Big Data and cloud computing using Docker containers and OpenNebula. Covers HDFS, Hadoop and Spark + R.
Big Data workshop with Apache Spark + R from Databricks cloud
This repository contains intermediate-level code snippets useful for data cleaning, exploratory analysis, handling missing data points, outlier detection and various visualization techniques using the graphics, ggplot2, tidycharts and ggExtra packages. Also, in a particular part of the script you can get basic information about…
A curated list of essential cheatsheets for data analysis, visualization and machine learning using R or Python
Fit a Cubist regression model on StackOverflow data and make predictions in a distributed manner with SparkR
Self-service modeling and analysis tool based on the R language and big data. It integrates SparkR, Rserve and the MLlib machine learning library.