Avalanche is an automatic materialization process designed to transform raw, semi-structured data into structured, relational tables. It ensures data completeness while providing a streamlined approach to managing Snowflake Kafka connector-based schemas. It is a Python-based solution that integrates with Snowflake, Kafka, and other data sources to facilitate the transformation and loading of data into structured formats. While not strictly required, Avalanche was conceptualized at WW Tech as part of a broader stack like the one below.
The recommendation for a production setup is similar to the stack above, and this document therefore assumes familiarity with Kafka and the Snowflake Kafka Connector. For a non-standard setup, refer to config/sample_nyc_taxi_data.yaml for guidance.
The purpose of Avalanche is to transform raw, semi-structured data into relational, structured data. While doing so, it also checks for data completeness.
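To make the transformation concrete, here is a minimal illustrative sketch (not Avalanche's actual code). The Snowflake Kafka connector lands each message as two VARIANT columns, RECORD_METADATA and RECORD_CONTENT, and materialization flattens the payload into typed columns; the field names (order_id, status, total) are hypothetical:

```python
import json

# A raw row as the Snowflake Kafka connector lands it: RECORD_METADATA carries
# Kafka coordinates, RECORD_CONTENT carries the semi-structured payload.
raw_row = {
    "RECORD_METADATA": {"topic": "orders", "partition": 0, "offset": 42},
    "RECORD_CONTENT": json.dumps(
        {"order_id": 1001, "status": "SHIPPED", "total": "19.99"}
    ),
}

def materialize(row: dict) -> dict:
    """Flatten one semi-structured record into typed, relational columns."""
    content = json.loads(row["RECORD_CONTENT"])
    return {
        "order_id": int(content["order_id"]),
        "status": content["status"],
        "total": float(content["total"]),
        # Kafka coordinates are kept so completeness (no gaps in offsets)
        # can be verified downstream.
        "_kafka_offset": row["RECORD_METADATA"]["offset"],
    }

print(materialize(raw_row))
```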
This is a one-time setup in which the base Avalanche system tables are created. These tables serve as the foundation for all Avalanche deployments. This step is executed using the initialize_system.py module and is required only once per new system setup.
Refer to docs/system_initialization.md for details on how to run this script.
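As an illustration of what initialization amounts to, the sketch below creates a bookkeeping table once against a Snowflake account using the standard snowflake-connector-python API. The connection parameters and the table definition here are hypothetical; the real procedure is documented in docs/system_initialization.md:

```python
import snowflake.connector  # pip install snowflake-connector-python

# Hypothetical connection parameters; in practice these come from your
# environment/config as described in docs/system_initialization.md.
conn = snowflake.connector.connect(
    account="my_account",
    user="avalanche_admin",
    password="***",
    warehouse="AVALANCHE_WH",
    database="AVALANCHE_DB",
    schema="SYSTEM",
)

# One-time creation of a (hypothetical) system table that deployments build on.
conn.cursor().execute(
    """
    CREATE TABLE IF NOT EXISTS AVALANCHE_DEPLOYMENTS (
        deployment_name STRING,
        source STRING,
        created_at TIMESTAMP_LTZ DEFAULT CURRENT_TIMESTAMP()
    )
    """
)
conn.close()
```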
Once the system is initialized, you are ready to deploy the Avalanche service. Deployments are the core of Avalanche's functionality, allowing it to process data from various sources and materialize it into structured tables in Snowflake. Avalanche deployments are designed to materialize RAW tables (Snowflake Kafka connector-based schemas) into structured, queryable data tables. Each deployment is containerized and can be grouped by source (e.g., replicating an order transactions database).
Refer to docs/avalanche_service_deployment.md for details on how to deploy the Avalanche service.
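Conceptually, the core of a deployment is a recurring flatten-and-load pass over a RAW table. The sketch below is illustrative rather than Avalanche's implementation: the table and column names are hypothetical, while the RECORD_CONTENT/RECORD_METADATA VARIANT columns and the colon path syntax are standard Snowflake Kafka connector and Snowflake SQL conventions:

```python
import snowflake.connector

# Hypothetical incremental materialization of one RAW Kafka-connector table
# into a structured table; ':' extracts fields from VARIANT columns.
MATERIALIZE_SQL = """
INSERT INTO STRUCTURED.ORDERS (order_id, status, total, kafka_offset)
SELECT
    RECORD_CONTENT:order_id::NUMBER,
    RECORD_CONTENT:status::STRING,
    RECORD_CONTENT:total::FLOAT,
    RECORD_METADATA:offset::NUMBER
FROM RAW.ORDERS_TOPIC
WHERE RECORD_METADATA:offset > %(last_offset)s
"""

def run_materialization(
    conn: "snowflake.connector.SnowflakeConnection", last_offset: int
) -> None:
    """One incremental pass: load everything past the last processed offset."""
    conn.cursor().execute(MATERIALIZE_SQL, {"last_offset": last_offset})
```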
This section provides instructions for setting up a local development environment for Avalanche. It is designed to help developers get started quickly with Avalanche development and testing. The setup relies heavily on the make command to automate dependency installation, environment variable generation, and configuration.
Refer to docs/local_development_environment_setup.md for details on how to set up a local development environment.
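For local testing, a natural first step is to load one of the sample configs shipped with the repo, such as the NYC taxi example referenced earlier. A minimal sketch, assuming PyYAML is installed and making no assumptions about the file's actual keys:

```python
import yaml  # pip install pyyaml

# Load a sample deployment config from the repo; its schema is defined by
# Avalanche itself, so this sketch only parses and pretty-prints it.
with open("config/sample_nyc_taxi_data.yaml") as f:
    config = yaml.safe_load(f)

print(yaml.safe_dump(config, sort_keys=False))
```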
Avalanche in the wild
Avalanche is currently used at WW to ingest terabytes of data, supporting over 1400 topics spanning multiple data sources: Postgres, MySQL, Oracle, MongoDB, and schematized application events.
This section provides references to additional components in the recommended stack:
- Debezium Connector
- Snowflake Kafka Connector
- Apache Kafka Documentation
- Confluent Kafka - quick start guide
- Confluent Kafka Schema Registry
Thanks to all the people who have contributed to this project! Maintainers:
Star Contributors:
Want to contribute? For the time being, the best way is to open an issue in the repo, and we will get back to you.