A lightweight, comprehensive solution for managing Delta tables, built on polars and deltalake.
Updated Jan 1, 2025 - Python
🍺 A data engineering project showcasing an ELT pipeline using modern technologies such as delta-rs and Apache Airflow.
A comprehensive ETL pipeline and sales analysis project leveraging Microsoft Azure and PySpark, designed to optimize e-commerce sales by providing actionable insights through detailed data analysis.
An open-source Python library for simplifying local testing of Databricks workflows that use PySpark and Delta tables.
Implementing Change Data Capture for Seamless Fintech Data Migration
Databricks & Blueprint Hackathon: using Databricks, Spark Structured Streaming, Delta, and Azure DevOps to build automated deployment of notebooks and jobs.
This project builds a cloud-based pipeline to extract NYC taxi data from an API and store it in Azure Data Lake Storage (ADLS). Databricks and PySpark are used to transform the data through the medallion architecture (Bronze → Silver → Gold). Delta Lake ensures reliable storage, and Power BI provides visual insights for data-driven decision-making.
On-premises data lake architecture with Trino, Delta tables, and Hive Metastore