# Fast BI using Spark and Druid
This project is aimed at two classes of users (a short usage sketch follows the links below):
- Users of Druid who want SQL access to their indexes and want to use traditional BI tools such as Tableau with Druid
- Spark and Hive users who find the performance of their interactive BI queries painfully slow
- Quick Start
- Using Tableau with Sparkline
- The Druid project
- Spark
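The core idea is that a Druid index is exposed to Spark as an ordinary SQL table, so BI-style aggregations can be pushed down to Druid. The following is a minimal sketch only: the table, column, and option names (`org.sparklinedata.druid`, `sourceDataframe`, `timeDimensionColumn`, `druidDatasource`, `druidHost`) are illustrative; the authoritative DDL and options are documented in the User Guide pages linked below, and the project's own context setup is covered under "Sparkline SQLContext Options".

```scala
// Minimal sketch, using a plain SparkSession; option names are illustrative.
import org.apache.spark.sql.SparkSession

object DruidQuerySketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("sparkline-druid-sketch")
      .getOrCreate()

    // Hypothetical DDL: register a Druid-backed table via the Sparkline data source.
    // See "Defining a DataSource on a Flattened Dataset" and "Druid Datasource Options".
    spark.sql(
      """CREATE TEMPORARY TABLE orderLineItemPartSupplier
        |USING org.sparklinedata.druid
        |OPTIONS (
        |  sourceDataframe "orderLineItemPartSupplierBase",
        |  timeDimensionColumn "l_shipdate",
        |  druidDatasource "tpch",
        |  druidHost "localhost"
        |)""".stripMargin)

    // A typical BI-style aggregation over TPCH lineitem columns; the planner
    // is expected to push this down to Druid rather than scan raw data in Spark.
    spark.sql(
      """SELECT l_returnflag, l_linestatus, SUM(l_extendedprice) AS revenue
        |FROM orderLineItemPartSupplier
        |GROUP BY l_returnflag, l_linestatus""".stripMargin)
      .show()
  }
}
```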
## Indexing
- Indexing TPCH data as an example.
- Setting up Druid.
- Sample data set for TPCH.
## Querying data from Spark
- Set up Thrift Server connections so you can use SQuirreL SQL, RazorSQL, Zeppelin, or Tableau against the datasets.
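As a quick illustration of such a connection, here is a hedged sketch of talking to the Thrift Server over JDBC from Scala. It assumes the server is listening on the default HiveServer2 port 10000 on localhost and that the Hive JDBC driver is on the classpath; the same `jdbc:hive2://` URL is what GUI tools like SQuirreL SQL, RazorSQL, Zeppelin, or Tableau (via the Spark SQL/Hive connectors) would be configured with.

```scala
// Minimal sketch: connect to a (Sparkline) Spark Thrift Server over the
// HiveServer2 JDBC protocol and list the registered tables.
// Host, port, and database are placeholders for your deployment.
import java.sql.DriverManager

object ThriftServerSketch {
  def main(args: Array[String]): Unit = {
    // Hive JDBC driver (org.apache.hive:hive-jdbc must be on the classpath).
    Class.forName("org.apache.hive.jdbc.HiveDriver")

    val conn = DriverManager.getConnection("jdbc:hive2://localhost:10000/default", "", "")
    try {
      val stmt = conn.createStatement()
      val rs = stmt.executeQuery("SHOW TABLES")
      while (rs.next()) {
        println(rs.getString(1))
      }
    } finally {
      conn.close()
    }
  }
}
```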
- Overview
- Quick Start
- User Guide
- [Defining a DataSource on a Flattened Dataset](https://github.com/SparklineData/spark-druid-olap/wiki/Defining-a-Druid-DataSource-on-a-Flattened-Dataset)
- Defining a Star Schema
- Sample Queries
- Approximate Count and Spatial Queries
- Druid Datasource Options
- Sparkline SQLContext Options
- Using Tableau with Sparkline
- How to debug a Query Plan?
- Running the ThriftServer with Sparklinedata components
- [Setting up multiple Sparkline ThriftServers - Load Balancing & HA](https://github.com/SparklineData/spark-druid-olap/wiki/Setting-up-multiple-Sparkline-ThriftServers-(Load-Balancing-&-HA))
- Runtime Views
- Sparkline SQL extensions
- Sparkline Pluggable Modules
- Dev. Guide
- Reference Architectures
- Releases
- Cluster Spinup Tool
- TPCH Benchmark