Simple and Distributed Machine Learning
Updated Feb 6, 2025 - Scala
Enabling Continuous Data Processing with Apache Spark and Azure Event Hubs
Capture deep metrics on one or all assets within a Databricks workspace
Apache Spark Connector for Azure Cosmos DB
Lakehouse storage system benchmark
DeltaOMS is a solution that helps build a centralized repository of Delta transaction logs and associated operational metrics and statistics for your Delta Lakehouse. Unity Catalog is supported in the v0.7.0-rc1 release. Documentation: https://databrickslabs.github.io/delta-oms/v0.7.0-rc1/
CARTO Analytics Toolbox for Databricks provides geospatial functionality built on the GeoMesa Spark SQL capabilities.
A Spark data source for reading Microsoft Excel files
The OctopuFS library helps manage cloud storage, specifically ADLS Gen2. It lets you operate on files (moving, copying, setting ACLs) in a very efficient manner. It is designed to work on Databricks, but should work on other platforms as well.
Machine learning examples using Spark MLlib and Databricks
GangliaExport is a lightweight package that provides an alternative way to monitor Databricks cluster utilization using the Ganglia web service on the driver node of each cluster. The Ganglia metrics can be exported to any Spark data source format and then used to analyze cluster usage and avoid costly idle compute.
End-to-end Kafka Streaming Examples on Databricks with Evolving Avro Schemas.
Sample project for Scala applications with dbx and a CI/CD setup based on GitHub Actions.
Link prediction is the task of predicting future connections in a graph. In this project, it means predicting whether two authors will collaborate on a future paper, given the graph of authors who have collaborated on at least one paper together.
A Quality Spark DQ Library
Ready2019_WTH_DatabricksIntroML
Sample code for working with Kafka & Protobuf in Databricks
Azure data pipeline for a real-estate dataset, with a three-layer structure (unbound, silver, gold) and an automatic hourly trigger for consistent updates.