Detailed tutorial for installing Apache Airflow on IBM Cloud

KissConsult/Apache-Airflow

Get Apache Airflow on IBM Cloud

This tutorial deploys Apache Airflow on an IBM Cloud Kubernetes cluster.

  • Prerequisites:
    • You need an IBM Cloud account. If you do not have one, you can register here.
  1. Provision a new Kubernetes cluster. If you already have one, skip to step 2.
  2. Deploy the IBM Cloud Block Storage plug-in. If you already have it, skip to step 3.
  3. Deploy Apache Airflow.

Step 1 Provision a new Kubernetes Cluster

  • Click the Catalog button at the top

  • Select Service from the left in the catalog

  • Search for Kubernetes Service and click on it

  • On the Kubernetes deployment page, specify the deployment details

  • Choose a plan, standard or free. The free plan has only one worker node and no subnet. To provision a standard cluster, you need to upgrade your account to Pay-As-You-Go

    • To upgrade to a Pay-As-You-Go account, complete the following steps:

    • In the console, go to Manage > Account.

    • Select Account settings, and click Add credit card.

    • Enter your payment information, click Next, and submit your information

  • Choose classic or VPC infrastructure. Read the docs and choose the type that suits you best

  • Decide on your deployment's location parameters. For more information, visit Locations

    • Choose a Geography (continent)
    • Choose Single zone or Multizone. With Single zone, your data is kept in only one datacenter. With Multizone, your data is kept on multiple sites, which protects it against the failure of a single site
    • Choose a Worker zone if using Single zone, or a Metro if using Multizone
      • If you wish to use Multizone, set up your account with VRF or enable VLAN spanning
      • If no Virtual LAN (VLAN) is available at your selected location, a new VLAN will be created for you
  • Choose a Worker node setup or use the preselected one, and set the number of worker nodes per zone

  • Choose a Master Service Endpoint. In VRF-enabled accounts, you can choose private-only to make your master accessible on the private network or via VPN tunnel. Choose public-only to make your master publicly accessible. When you have a VRF-enabled account, your cluster is set up by default to use both private and public endpoints. For more information, visit endpoints.

  • Give the cluster a name

  • Add any desired tags to your cluster. For more information, visit tags

  • Click Create

  • Wait for your cluster to be provisioned

  • Your cluster is ready for use
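
The console steps in Step 1 can also be approximated from the command line. The following is a minimal sketch, not part of the original tutorial. It assumes the IBM Cloud CLI with the Kubernetes Service plug-in is installed and that you are logged in with `ibmcloud login`. The cluster name, zone, flavor, and worker count are placeholder values, and `create_cluster_cmd` is a hypothetical helper name. The script only prints the `ibmcloud ks cluster create classic` command so that you can review it before running it yourself.

```shell
#!/bin/sh
# Placeholder values (assumptions, not taken from the tutorial).
# Replace them with a name, zone, and flavor that are valid for your account.
CLUSTER_NAME="${CLUSTER_NAME:-airflow-cluster}"
ZONE="${ZONE:-dal10}"
FLAVOR="${FLAVOR:-b3c.4x16}"
WORKERS="${WORKERS:-3}"

# Print (do not run) the command that would create a standard classic cluster.
create_cluster_cmd() {
  printf 'ibmcloud ks cluster create classic --name %s --zone %s --flavor %s --workers %s\n' \
    "$CLUSTER_NAME" "$ZONE" "$FLAVOR" "$WORKERS"
}

create_cluster_cmd
```

Printing the command instead of executing it is a deliberate choice here: creating a standard cluster incurs charges, so it is worth checking the values first.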

Step 2 Deploy the IBM Cloud Block Storage plug-in

The Block Storage plug-in provides persistent, high-performance iSCSI storage that you can add to your apps by using Kubernetes Persistent Volumes (PVs).

  • Click the Catalog button at the top

  • Select Software from the catalog

  • Search for IBM Cloud Block Storage plug-in and click on it

  • On the application page, click the dot next to the cluster you wish to use

  • Click Enter or Select Namespace and choose the default namespace or use a custom one (if you get an error, wait 30 minutes for the cluster to finalize)

  • Give a name to this workspace

  • Click Install and wait for the deployment to finish
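
To confirm from a shell with `kubectl` access that the plug-in has registered its storage classes, a check along these lines may help. This is a sketch rather than an official procedure. It assumes the plug-in's storage classes carry the `ibmc-block` name prefix, and `count_block_storage_classes` is a hypothetical helper name.

```shell
#!/bin/sh
# Count the storage classes whose name starts with "ibmc-block".
# Assumption: the plug-in names its classes ibmc-block-<tier>.
count_block_storage_classes() {
  kubectl get storageclass --no-headers 2>/dev/null | grep -c '^ibmc-block'
}
```

Calling `count_block_storage_classes` after the installation finishes should print a number greater than zero. A result of `0` suggests that the plug-in is not ready yet or that `kubectl` is not pointed at the right cluster.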

Step 3 Deploy Apache Airflow

In this step we will deploy Apache Airflow on our cluster.

  • Click the Catalog button at the top

  • Select Software from the left in the catalog

  • Search for Apache Airflow and click on it

  • On the application page, click the dot next to the cluster you just created, or choose an existing one

  • Click Enter or Select Namespace and choose one of the default namespaces or use a custom one

  • Give a unique name to your workspace

  • Select the resource group you want to use. Resource groups are used for access control and billing purposes. For more information, visit resource groups

  • Here you can add tags to your Apache Airflow workspace, which will affect your deployment. For more information, visit tags

  • Click Parameters with default values. You can set deployment values or use the default ones

  • Tick the box next to the agreements and click Install

  • Your Apache Airflow workspace will start installing. Wait a couple of minutes for the deployment to finish

  • Your Apache Airflow workspace has been successfully deployed

Verify Apache Airflow installation

  • Go to Resources in your browser

  • Click on Clusters

  • Click on your cluster

  • You are now at your cluster's overview. Click Actions at the top right, then click Web terminal in the dropdown menu

  • Click Install, then wait a couple of minutes

  • Click Actions

  • Click Web terminal, and a terminal will open

  • Type the following commands in the terminal, replacing NAMESPACE with the namespace you chose during the deployment setup:

$ kubectl get ns

$ kubectl get pod -n NAMESPACE -o wide

$ kubectl get service -n NAMESPACE

  • Your running Apache Airflow services will be visible
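
When the pod list is long, it can be easier to print only the pods that are not yet healthy. The sketch below builds on the `kubectl get pod` command from the verification steps. `pods_not_running` is a hypothetical helper, and it assumes the default column layout of `kubectl get pods --no-headers` (NAME, READY, STATUS, and so on). It looks only at the STATUS column, so a pod that is Running but not yet Ready will not be listed.

```shell
#!/bin/sh
# NAMESPACE is the namespace chosen during the deployment setup.
NAMESPACE="${NAMESPACE:-default}"

# Print the name of every pod whose STATUS column is neither Running nor Completed.
pods_not_running() {
  kubectl get pods -n "$NAMESPACE" --no-headers 2>/dev/null |
    awk '$3 != "Running" && $3 != "Completed" { print $1 }'
}
```

An empty result from `pods_not_running` means every pod in the namespace reports Running or Completed.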

You have successfully deployed Apache Airflow on IBM Cloud!
