The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
Updated Jul 31, 2025 - Python
Postgres to Elasticsearch/OpenSearch sync
Ecommerce Realtime Data Pipeline (Data Modeling, Workflow Orchestration, Change Data Capture, Analytical Database and Dashboarding)
Repo for the CDC with Debezium blog post
Slowly Changing Dimension Type 2 in Hive query language using the exclusive join technique with ORC Hive tables, plus a performance comparison of partitioned and clustered Hive tables
Sample project that describes how you can handle schema within your Django application.
Example pipeline to stream the data changes from RDBMS to Apache Iceberg tables
Keeps an RDB table in sync with a Hive structured store, with Kafka added as a buffer between the two.
This is a tryout I prepared to demonstrate CDC (change data capture) using MySQL, Maxwell and Kafka.
Lightweight CDC patterns for SQLite
Change Data Capture (CDC) tool from any source(s) to any target
A provider-agnostic framework to evaluate ordinary CDC (Change Data Capture) features
Transactional change feeds for SQLite
The Yelp Data Pipeline processes business reviews using Python, Kafka, AWS (DynamoDB, S3, Redshift), PySpark, AWS Lambda, and Power BI. It supports real-time streaming, CDC, daily batch processing, and data visualization for insights into customer sentiment, business performance, and industry trends.
This project creates a data stream from MySQL using replication protocols and ingests it into Kafka. You can use it to build an event-driven system.
Data decoding, encoding, conversion, and translation utilities.
Showcasing CDC with the PostgreSQL pglogical plugin and custom scripts.
Distributed change data capture (CDC) framework for Google BigQuery
Real-time data engineering pipeline for an American hiring platform
This project shows how to capture changes from a Postgres database and stream them into Kafka
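
For readers new to the topic, here is a minimal sketch of one common lightweight CDC approach: triggers that append every insert, update, and delete to a change-log table, which a consumer then polls by sequence number. It is not taken from any repository listed above. The table names (`users`, `users_changes`), the trigger names, and the `read_changes` helper are all hypothetical and chosen only for illustration.

```python
import sqlite3

# In-memory database so the sketch is self-contained.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: a source table plus a change-log table that
# triggers append to on every insert, update, and delete.
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);

CREATE TABLE users_changes (
    seq    INTEGER PRIMARY KEY AUTOINCREMENT,  -- monotonically increasing position
    op     TEXT NOT NULL,                      -- 'I', 'U', or 'D'
    row_id INTEGER NOT NULL,
    name   TEXT                                -- new value; NULL for deletes
);

CREATE TRIGGER users_ai AFTER INSERT ON users BEGIN
    INSERT INTO users_changes (op, row_id, name) VALUES ('I', NEW.id, NEW.name);
END;

CREATE TRIGGER users_au AFTER UPDATE ON users BEGIN
    INSERT INTO users_changes (op, row_id, name) VALUES ('U', NEW.id, NEW.name);
END;

CREATE TRIGGER users_ad AFTER DELETE ON users BEGIN
    INSERT INTO users_changes (op, row_id, name) VALUES ('D', OLD.id, NULL);
END;
""")


def read_changes(conn, after_seq=0):
    """Return change events with seq greater than after_seq, oldest first."""
    cur = conn.execute(
        "SELECT seq, op, row_id, name FROM users_changes "
        "WHERE seq > ? ORDER BY seq",
        (after_seq,),
    )
    return cur.fetchall()


# Ordinary writes against the source table; the triggers record them.
conn.execute("INSERT INTO users (name) VALUES ('ada')")
conn.execute("UPDATE users SET name = 'Ada' WHERE id = 1")
conn.execute("DELETE FROM users WHERE id = 1")
conn.commit()

changes = read_changes(conn)
for event in changes:
    print(event)
```

A consumer would remember the last `seq` it processed and pass it as `after_seq` on the next poll. Log-based tools such as Debezium or Maxwell read the database's replication log instead of using triggers, which avoids adding write overhead to the source tables.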