tm-cheska-peralta/airflow-ci-test

DE Project Workflow Template

Main workflow template for jumpstarting DE projects. Consists of the following files:

  1. Makefile (workflow.mk) to auto-import setups from the templates below
  2. pre-commit config, linters, and PR templates (.github/)
  3. Readme and .gitignore

System Design

The DE Workflow template is the main repository for creating Data Engineering templates. It uses a multi-repo setup in which developers can mix and match templates to fit the needs of their project.

  graph LR
    DWT[de-workflow-template]
    DDT[dwt-dagster-template]
    DAT[dwt-airflow-template]
    DTT[dwt-terraform-template]
    DS[dbt-starter]
    DCT[dwt-ci-template]
    GS[github-starter]
    subgraph Data Orchestrator
      DDT
      DAT
    end
    subgraph Infrastructure as Code
      DTT
    end
    subgraph CI/CD
      DCT
    end
    subgraph Data Transformation
      DS
    end
    subgraph Documentation
      GS
    end
    GS--make readme-template-->DWT
    DDT--make dagster-->DWT
    DAT--make airflow-->DWT
    DTT--make gcp-terraform-->DWT
    DTT--make aws-terraform-->DWT
    DCT--make cloudbuild-->DWT
    DCT--make codepipeline-->DWT
    DS--make dbt-->DWT

The system revolves around a Makefile that runs scripts to set up the templates automatically. Ideally, a working project can be created from scratch by running just a few make commands.
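As a rough illustration of chaining several make commands, the sketch below prints (but does not run) one command per requested template target. The target names are taken from this README; the `plan_setup` function and the `WORKFLOW_MK` override are hypothetical helpers, not part of the template.

```shell
# Print one make command per requested template target (dry run only).
# plan_setup and WORKFLOW_MK are hypothetical names used for illustration.
plan_setup() {
    mk="${WORKFLOW_MK:-workflow.mk}"   # Makefile to pass to make via -f
    for target in "$@"; do
        printf 'make %s -f %s\n' "$target" "$mk"
    done
}
```

For example, `plan_setup dagster gcp-terraform dbt` prints the three commands in the order they would be run, so the sequence can be reviewed before executing anything.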


Getting started

  1. Create a new repo using this template (or click this link)
  2. Ensure that the following are installed on your system:
    - direnv
    - git
  3. Choose from the available setup commands below
  4. Update README.md
    make readme-template -f workflow.mk
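One possible way to check the prerequisites from step 2 before running any setup command is sketched below. The `missing_tools` helper is a hypothetical name; it relies only on the standard `command -v` shell builtin.

```shell
# Print the name of every given tool that is not found on PATH.
# Returns 0 when all tools are present, 1 otherwise.
# missing_tools is a hypothetical helper, not part of the template.
missing_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            rc=1
        fi
    done
    return "$rc"
}
```

Running `missing_tools direnv git` prints nothing and succeeds when both tools are installed.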

Setup commands

  • Use the primary make command listed below for the template you need.
  • Follow the template's initialization instructions, referenced at the end of each section.

Setting up your orchestrator

Dagster

Note: the commands below will create a dagster/ directory.

# initialize a Dagster setup
make dagster -f workflow.mk

Airflow

Note: the commands below will create an airflow/ directory. (NEED CONFIRMATION - Dev Notes)

# initialize an Airflow setup
make airflow -f workflow.mk

# initialize an Airflow setup w/ DAG Builder
make airflow -f workflow.mk add_dag_builder=1

For further instructions, go to the Dagster Template Repository or the Airflow Template Repository. More details on Airflow DAG Builder.

Setting up your Infrastructure as Code tool

Terraform

Note: the commands below will create a terraform/ directory.

# For GCP setups,
make gcp-terraform -f workflow.mk

# For AWS setups,
make aws-terraform -f workflow.mk

Then, follow terraform/README.md for the initial Terraform setup.

Setting up your CI/CD tools

IMPORTANT NOTE: The Cloud Build and CodePipeline templates require that their respective Terraform template and your selected orchestrator template are already installed.

Note: the commands below will create a ci/ directory and will create or append files in the terraform/ directory.
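The ordering requirement above could be guarded with a small check like the following sketch. It assumes that a template counts as installed when its directory (terraform/, airflow/, or dagster/) exists in the project root; that heuristic and the `ci_prereqs_ok` name are assumptions, not something the templates provide.

```shell
# Succeed only if terraform/ and the orchestrator directory both exist.
# Assumption: directory presence means the template has been installed.
# ci_prereqs_ok is a hypothetical helper, not part of the template.
ci_prereqs_ok() {
    orchestrator="$1"     # airflow or dagster
    root="${2:-.}"        # project root, defaults to the current directory
    for dir in terraform "$orchestrator"; do
        if [ ! -d "$root/$dir" ]; then
            echo "missing $dir/ : run its make target first" >&2
            return 1
        fi
    done
    return 0
}
```

A typical use would be `ci_prereqs_ok airflow && make cloudbuild cloud-platform=gcp orchestrator=airflow -f workflow.mk`.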

Cloud Build as CI

# DO THIS FIRST
make gcp-terraform -f workflow.mk

# for Airflow Project
make cloudbuild cloud-platform=gcp orchestrator=airflow -f workflow.mk

# for Dagster Project
make cloudbuild cloud-platform=gcp orchestrator=dagster -f workflow.mk

Then, follow the instructions found in the Cloud Build README to set up the triggers.

CodePipeline as CI

Note: The CodePipeline template is currently only available for Dagster projects.

# DO THIS FIRST
make aws-terraform -f workflow.mk

# for Dagster Project
make codepipeline cloud-platform=aws orchestrator=dagster -f workflow.mk

Then, follow the instructions found in the CodePipeline README.
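The supported CI and orchestrator combinations described in this section can be summarized in a small lookup. The sketch below only prints the matching make command and rejects combinations this README does not list; `ci_command` is a hypothetical helper name.

```shell
# Print the make command for a supported CI/orchestrator pair.
# Supported pairs are the ones listed in this README:
#   cloudbuild   (GCP): airflow, dagster
#   codepipeline (AWS): dagster only
# ci_command is a hypothetical helper, not part of the template.
ci_command() {
    ci="$1"
    orchestrator="$2"
    case "$ci:$orchestrator" in
        cloudbuild:airflow|cloudbuild:dagster) platform=gcp ;;
        codepipeline:dagster)                  platform=aws ;;
        *)
            echo "unsupported combination: $ci with $orchestrator" >&2
            return 1
            ;;
    esac
    printf 'make %s cloud-platform=%s orchestrator=%s -f workflow.mk\n' \
        "$ci" "$platform" "$orchestrator"
}
```

For instance, `ci_command codepipeline airflow` fails with an error message, reflecting that the CodePipeline template is currently available only for Dagster projects.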

Setting up your dbt

Note: the command below will create a dbt/ directory

make dbt -f workflow.mk

Then, follow the instructions in the dbt-starter README to set up the dbt adapter and environment configurations.


Cleanup

Once the project is set up, you may remove the following files and directories from the project directory.

rm workflow.mk
rm terraform.mk
rm ci.mk
rm terraform/README.md
rm -rf terraform/docs/
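Not every project will contain all of these files (for example, terraform.mk is presumably only present if a Terraform template was installed). A minimal sketch of a cleanup that skips whatever is missing follows; the `cleanup_setup_files` name is hypothetical, and the file list is copied from the commands above.

```shell
# Remove the setup helper files listed above, skipping any that are absent.
# cleanup_setup_files is a hypothetical helper, not part of the template.
cleanup_setup_files() {
    root="${1:-.}"    # project root, defaults to the current directory
    for f in workflow.mk terraform.mk ci.mk terraform/README.md; do
        if [ -f "$root/$f" ]; then
            rm -- "$root/$f"
            echo "removed $f"
        fi
    done
    if [ -d "$root/terraform/docs" ]; then
        rm -rf -- "$root/terraform/docs"
        echo "removed terraform/docs/"
    fi
    return 0
}
```

Other files under terraform/ are left untouched.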

Template Repos

These are the repositories for the underlying templates used by the DE Workflow Template.

  • You may choose to use these templates directly at your discretion
  • Additional information regarding the templates can be found in their respective repository
  • Data orchestrator templates
  • Infrastructure as code templates
  • CI/CD templates
  • Data transformation templates
  • Documentation templates

Other Resources
