The open source high performance ELT framework powered by Apache Arrow
Updated Nov 5, 2024 - Go
Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
Flow-based programming for JavaScript
Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.
An end-to-end GoodReads Data Pipeline for Building Data Lake, Data Warehouse and Analytics Platform.
This repository is a getting started guide to Singer.
Making data lake work for time series
A scalable, general-purpose micro-framework for defining dataflows. This repository has been moved to www.github.com/dagworks-inc/hamilton
A simplified, lightweight ETL Framework based on Apache Spark
Extract, Transform, Load: Any SQL Database in 4 lines of Code.
SeaTunnel is a distributed, high-performance data integration platform for the synchronization and transformation of massive data (offline & real-time).
Knowledge Graph Toolkit
A tool for building feature stores.
A modern data marketplace that makes collaboration among diverse users (like business, analysts and engineers) easier, increasing efficiency and agility in data projects on AWS.
Bender - Serverless ETL Framework
Configurable Extract, Transform, and Load
A visual ETL development and debugging tool for big data