Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
Updated Dec 24, 2024 - Scala
A Scala kernel for Jupyter
Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
This is the GitHub repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]
Qubole Sparklens tool for performance tuning Apache Spark
Qbeast-spark: DataSource enabling multi-dimensional indexing and efficient data sampling. Big Data, free from the unnecessary!
Spark Structured Streaming / Kafka / Cassandra / Elastic
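Spark SQL implementations of ItemCF, UserCF, and Swing: recommender systems, recommendation algorithms, collaborative filtering
Movie recommendation system and movie recommendation engine built with Spark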
Spark Connector to read and write with Pulsar
Data cleaning, pre-processing, and Analytics on a million movies using Spark and Scala.
Apache Spark Course Material
Apache Spark is a fast, in-memory data processing engine with elegant and expressive development APIs that let data workers efficiently execute streaming, machine learning, or SQL workloads requiring fast iterative access to datasets. This project contains sample Spark programs written in Scala.
Apache Spark 3 - Structured Streaming Course Material
A library for loading Thrift data into Spark SQL