This repository has been archived by the owner on May 27, 2022. It is now read-only.

SPDX-License-Identifier: Apache-2.0
Copyright (c) 2019-2021 Intel Corporation

Converged Edge Experience Kits

Purpose

The Converged Edge Experience Kit is a refreshed repository of Ansible* playbooks for automated deployment of Converged Edge Reference Architectures.

The Converged Edge Experience Kit introduces the following capabilities:

  1. Wide range of deployments from individual building blocks to full end-to-end reference deployments
  2. Minimal to near-zero user intervention. Typically, the user provides the details of the nodes that constitute the Smart Edge Open edge cluster and executes the deployment script
  3. More advanced deployments can be customized in the form of Ansible* group and host variables. This mode requires users with in-depth knowledge and expertise of the subject edge deployment
  4. Enablement of end-to-end multi-cluster deployments such as Near Edge and On-premises reference architectures

Converged Edge Experience Kit explained

The Converged Edge Experience Kit repository is organized as detailed in the following structure:

├── cloud
├── flavors
├── inventory
│   ├── automated
│   └── default
│       ├── group_vars
│       │   └── all
│       │       └── 10-default.yml
│       └── host_vars
├── playbooks
│   ├── infrastructure.yml
│   ├── kubernetes.yml
│   └── applications.yml
├── roles
│   ├── applications
│   ├── infrastructure
│   ├── kubernetes
│   └── telemetry
├── scripts
├── tasks
├── inventory.yml
├── network_edge_cleanup.yml
└── deploy.py
  • flavors: variable definitions for the predefined deployment flavors
  • inventory: definition of default & generated Ansible* variables
  • inventory/default/group_vars/all/10-default.yml: definition of default variables for all deployments
  • inventory/automated: inventory files that were automatically generated by the deployment helper script
  • playbooks: Ansible* playbooks for infrastructure, Kubernetes and applications
  • roles: Ansible roles for infrastructure, Kubernetes, applications and telemetry
  • scripts: utility scripts
  • inventory.yml: definition of the clusters, their controller & edge nodes and respective deployment flavors
  • deploy.py: the deployment helper script

The inventory file

The inventory file defines the group of physical nodes that constitute the edge cluster to be deployed by the Converged Edge Experience Kits. The YAML inventory format allows multiple edge clusters to be deployed in a single command run. Multiple clusters must be separated by the YAML document separator, three dashes (---).

NOTE: For multi-cluster deployments, the user must assign distinct names to the controller and edge nodes, i.e., no hostname may be repeated.

The following variables must be defined:

  • cluster_name: a name for the Smart Edge Open edge cluster deployment. Use underscores (_) instead of spaces.
  • flavor: the deployment flavor applicable for the Smart Edge Open edge deployment as defined in the Deployment flavors document.
  • single_node_deployment: if set to true, a single-node cluster is deployed. The inventory must then satisfy the following conditions:
    • IP address (ansible_host) for both controller and node must be the same
    • controller_group and edgenode_group groups must contain exactly one host
  • limit -- OPTIONAL: constrains the deployment to a specific Ansible* group, e.g., controller, edgenode, edgenode_vca_group or just a particular hostname. This is passed as a --limit command-line option when executing ansible-playbook.
  • ansible_user: the user account used for the deployment. This user becomes the Kubernetes cluster admin.
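
The single-node conditions above can be checked before running a deployment. The following is a minimal sketch, not part of the CEEK: it assumes the inventory has already been parsed into a Python dictionary with the same layout as inventory.yml, and the helper names check_single_node and make_inventory are hypothetical.

```python
def check_single_node(inventory):
    """Return a list of problems with a single-node inventory (empty if OK)."""
    problems = []
    if not inventory["all"]["vars"].get("single_node_deployment"):
        return problems  # the conditions only apply to single-node clusters

    controllers = inventory["controller_group"]["hosts"] or {}
    nodes = inventory["edgenode_group"]["hosts"] or {}
    if len(controllers) != 1 or len(nodes) != 1:
        problems.append("controller_group and edgenode_group must each contain exactly one host")
        return problems

    ctrl_ip = next(iter(controllers.values()))["ansible_host"]
    node_ip = next(iter(nodes.values()))["ansible_host"]
    if ctrl_ip != node_ip:
        problems.append("ansible_host must be the same for controller and node")
    return problems


def make_inventory(ctrl_ip, node_ip, single_node=True):
    """Build a small inventory dictionary for illustration (hypothetical helper)."""
    return {
        "all": {"vars": {"single_node_deployment": single_node}},
        "controller_group": {"hosts": {"node.openness.org": {"ansible_host": ctrl_ip}}},
        "edgenode_group": {"hosts": {"node.openness.org": {"ansible_host": node_ip}}},
    }


ok = check_single_node(make_inventory("10.102.227.234", "10.102.227.234"))
bad = check_single_node(make_inventory("10.102.227.234", "10.102.227.11"))
```

Here ok is an empty list, while bad contains one message about the mismatched ansible_host values.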

Sample Deployment Definitions

Single Cluster Deployment

Set the single_node_deployment flag to false in the inventory file, and provide the controller node name under controller_group and the edge node names under edgenode_group.

Example:

---
all:
  vars:
    cluster_name: 5g_near_edge
    flavor: cera_5g_near_edge
    single_node_deployment: false
    limit:
controller_group:
  hosts:
    ctrl.openness.org:
      ansible_host: 10.102.227.154
      ansible_user: openness
edgenode_group:
  hosts:
    node01.openness.org:
      ansible_host: 10.102.227.11
      ansible_user: openness
    node02.openness.org:
      ansible_host: 10.102.227.79
      ansible_user: openness
edgenode_vca_group:
  hosts:
ptp_master:
  hosts:
ptp_slave_group:
  hosts:

Single-node Cluster Deployment

Set the single_node_deployment flag to true in the inventory file, and provide the same node name under both controller_group and edgenode_group.

Example:

---
all:
  vars:
    cluster_name: 5g_central_office
    flavor: cera_5g_central_office
    single_node_deployment: true   
    limit:  
controller_group:
  hosts:
    node.openness.org:
      ansible_host: 10.102.227.234
      ansible_user: openness
edgenode_group:
  hosts:
    node.openness.org:
      ansible_host: 10.102.227.234
      ansible_user: openness
edgenode_vca_group:
  hosts:
ptp_master:
  hosts:
ptp_slave_group:
  hosts:

Multi-cluster deployment

Provide multiple cluster YAML specifications in inventory.yml, separated by the YAML document separator, three dashes (---). Each node name must be used only once across the inventory file, i.e., node names must be distinct.

Example:

---
all:
  vars:
    cluster_name: 5g_near_edge
    flavor: cera_5g_near_edge
    single_node_deployment: true 
    limit:
controller_group:
  hosts:
    node.openness01.org:
      ansible_host: 10.102.227.154
      ansible_user: openness
edgenode_group:
  hosts:
    node.openness01.org:
      ansible_host: 10.102.227.154
      ansible_user: openness
edgenode_vca_group:
  hosts:
ptp_master:
  hosts:
ptp_slave_group:
  hosts:
---
all:
  vars:
    cluster_name: 5g_central_office
    flavor: cera_5g_central_office
    single_node_deployment: true   
    limit:  
controller_group:
  hosts:
    node.openness02.org:
      ansible_host: 10.102.227.234
      ansible_user: openness
edgenode_group:
  hosts:
    node.openness02.org:
      ansible_host: 10.102.227.234
      ansible_user: openness
edgenode_vca_group:
  hosts:
ptp_master:
  hosts:
ptp_slave_group:
  hosts:
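
The distinct-name rule can be verified with a few lines of code. This is a minimal sketch under the assumption that each YAML document of inventory.yml has been loaded into a dictionary. The function name repeated_hostnames is hypothetical. A host that appears in both controller_group and edgenode_group of the same single-node cluster is counted once, so only repetition across clusters is reported.

```python
from collections import Counter


def repeated_hostnames(clusters):
    """Return hostnames that appear in more than one cluster definition."""
    counts = Counter()
    for cluster in clusters:
        names = set()  # de-duplicate within one cluster first
        for group, body in cluster.items():
            if group == "all":
                continue  # "all" holds variables, not hosts
            names.update((body.get("hosts") or {}).keys())
        counts.update(names)
    return sorted(name for name, seen in counts.items() if seen > 1)


def single_node_cluster(hostname, ip):
    """Build a single-node cluster dictionary for illustration (hypothetical helper)."""
    host = {hostname: {"ansible_host": ip, "ansible_user": "openness"}}
    return {
        "all": {"vars": {"single_node_deployment": True}},
        "controller_group": {"hosts": dict(host)},
        "edgenode_group": {"hosts": dict(host)},
        "edgenode_vca_group": {"hosts": None},  # empty group, as in the examples
    }


distinct = repeated_hostnames([
    single_node_cluster("node.openness01.org", "10.102.227.154"),
    single_node_cluster("node.openness02.org", "10.102.227.234"),
])
clashing = repeated_hostnames([
    single_node_cluster("node.openness01.org", "10.102.227.154"),
    single_node_cluster("node.openness01.org", "10.102.227.234"),
])
```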

Deployment customization

The deploy.py script creates a new inventory for each cluster to be deployed in the inventory/automated directory. These inventories are based on inventory/default: all directories and files are symlinked. Additionally, relevant flavor files are symlinked.

Customizations made to inventory/default/group_vars and inventory/default/host_vars affect every deployment performed by deploy.py, because these files are symlinked, not copied. Therefore, it is a good place to provide changes relevant to the nodes of the cluster.

Customizing kernel, grub parameters, and tuned profile & variables per host

CEEKs allow a user to customize kernel, grub parameters, and tuned profiles by leveraging Ansible's feature of host_vars.

NOTE: The inventory/default/group_vars/[edgenode|controller|edgenode_vca]_group directories contain variables applicable to the respective groups. These variables can be overridden on a per-node basis in inventory/default/host_vars. The inventory/default/group_vars/all directory contains cluster-wide variables.

CEEKs contain an inventory/default/host_vars/ directory. To customize a node, create a subdirectory named after the node's inventory name and place a YAML file named 10-default.yml in it (e.g., node01/10-default.yml). Variables defined in this file override the roles' default values.

NOTE: Although parameters such as the kernel can be customized, the hosts must have a clean CentOS* 7.9.2009 operating system installed from a minimal ISO image before they are deployed with the Ansible scripts. The OS must not have any user customizations.

To override a default value, place the variable's name and new value in the host's vars file. For example, the following contents of inventory/default/host_vars/node01/10-default.yml skip kernel customization on that node:

kernel_skip: true
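
Creating such a file can also be scripted. The following is a minimal sketch, not part of the CEEK: write_host_vars is a hypothetical helper, and the example writes into a temporary directory instead of a real inventory/default directory so that it can be run safely anywhere.

```python
import json
import tempfile
from pathlib import Path


def write_host_vars(inventory_dir, node_name, overrides):
    """Write overrides to <inventory_dir>/host_vars/<node_name>/10-default.yml."""
    target = Path(inventory_dir) / "host_vars" / node_name / "10-default.yml"
    target.parent.mkdir(parents=True, exist_ok=True)
    # json.dumps renders booleans as true/false and quotes strings;
    # both forms are valid YAML scalars.
    lines = [f"{key}: {json.dumps(value)}" for key, value in overrides.items()]
    target.write_text("\n".join(lines) + "\n")
    return target


# A temporary directory stands in for inventory/default in this example.
with tempfile.TemporaryDirectory() as tmp:
    path = write_host_vars(tmp, "node01", {"kernel_skip": True})
    written = path.read_text()
    relative = path.relative_to(tmp).as_posix()
```

In a real checkout, the first argument would be the inventory/default directory and node01 would be replaced with the node's name from inventory.yml.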

The following are several common customization scenarios.

IP address range allocation for various CNIs and interfaces

The Converged Edge Experience Kits deployment reserves a set of IP address ranges for the different CNIs and interfaces. The server or host IP address must not conflict with these default allocations. If a server must use an IP address that falls within a range used by the default Smart Edge Open deployment, the default range used by Smart Edge Open must be changed.

The following files specify the CIDRs for CNIs and interfaces. These IP address ranges, allocated and used by default, are listed for reference:

flavors/media-analytics-vca/all.yml:19:vca_cidr: "172.32.1.0/12"
inventory/default/group_vars/all/10-default.yml:90:calico_cidr: "10.245.0.0/16"
inventory/default/group_vars/all/10-default.yml:93:flannel_cidr: "10.244.0.0/16"
inventory/default/group_vars/all/10-default.yml:96:weavenet_cidr: "10.32.0.0/12"
inventory/default/group_vars/all/10-default.yml:99:kubeovn_cidr: "10.16.0.0/16,100.64.0.0/16,10.96.0.0/12"
roles/kubernetes/cni/kubeovn/controlplane/templates/crd_local.yml.j2:13:  cidrBlock: "192.168.{{ loop.index0 + 1 }}.0/24"

The 192.168.*.* range is used for SR-IOV and Interface Service IP address allocation in the Kube-OVN CNI, so a server IP address that conflicts with this range is not allowed. Avoid the entire address range defined by the netmask, as it may conflict with routing rules.

For example, if the server or host IP address must be in 192.168.*.* while this range is used by default for SR-IOV interfaces in Smart Edge Open, the cidrBlock value in the roles/kubernetes/cni/kubeovn/controlplane/templates/crd_local.yml.j2 file can be changed to 192.167.{{ loop.index0 + 1 }}.0/24 so that a different IP segment is used for SR-IOV interfaces.
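
A host address can be compared against the default ranges before deployment. The sketch below uses Python's standard ipaddress module with the CIDR values listed above. The function name conflicting_ranges and the kubeovn_local_cidr key are hypothetical, and treating the whole of 192.168.0.0/16 as reserved is a conservative assumption based on the templated per-interface /24 blocks.

```python
import ipaddress

# Default ranges copied from the files listed above.
DEFAULT_CIDRS = {
    "vca_cidr": "172.32.1.0/12",
    "calico_cidr": "10.245.0.0/16",
    "flannel_cidr": "10.244.0.0/16",
    "weavenet_cidr": "10.32.0.0/12",
    "kubeovn_cidr": "10.16.0.0/16,100.64.0.0/16,10.96.0.0/12",
    # Assumption: cover all 192.168.<n>.0/24 blocks generated by crd_local.yml.j2.
    "kubeovn_local_cidr": "192.168.0.0/16",
}


def conflicting_ranges(host_ip, cidrs=DEFAULT_CIDRS):
    """Return (variable name, network) pairs whose range contains host_ip."""
    address = ipaddress.ip_address(host_ip)
    hits = []
    for name, value in cidrs.items():
        for cidr in value.split(","):
            # strict=False accepts values with host bits set, e.g. 172.32.1.0/12.
            network = ipaddress.ip_network(cidr.strip(), strict=False)
            if address in network:
                hits.append((name, str(network)))
    return hits


flannel_hit = conflicting_ranges("10.244.3.7")
sriov_hit = conflicting_ranges("192.168.1.10")
no_hit = conflicting_ranges("10.0.0.5")
```

Only the ranges of the CNIs that are actually deployed matter for a given cluster, so a hit for an unused CNI may be acceptable.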

Default values

Here are several default values:

# --- machine_setup/custom_kernel
kernel_skip: false  # use this variable to disable custom kernel installation for host

kernel_repo_url: http://linuxsoft.cern.ch/cern/centos/7.9.2009/rt/CentOS-RT.repo
kernel_repo_key: http://linuxsoft.cern.ch/cern/centos/7.9.2009/os/x86_64/RPM-GPG-KEY-cern
kernel_package: kernel-rt-kvm
kernel_devel_package: kernel-rt-devel
kernel_version: 3.10.0-1160.11.1.rt56.1145.el7.x86_64

kernel_dependencies_urls: []
kernel_dependencies_packages: []

# --- machine_setup/grub
hugepage_size: "2M" # Or 1G
hugepage_amount: "5000"

default_grub_params: "hugepagesz={{ hugepage_size }} hugepages={{ hugepage_amount }} intel_iommu=on iommu=pt"
additional_grub_params: ""

# --- machine_setup/configure_tuned
tuned_skip: false   # use this variable to skip tuned profile configuration for host
tuned_packages:
  - tuned-2.11.0-9.el7
  - http://ftp.scientificlinux.org/linux/scientific/7/x86_64/os/Packages/tuned-profiles-realtime-2.11.0-9.el7.noarch.rpm
tuned_profile: realtime
tuned_vars: |
  isolated_cores=2-3
  nohz=on
  nohz_full=2-3

Use different realtime kernel (3.10.0-1062)

By default, kernel-rt-kvm-3.10.0-1160.11.1.rt56.1145.el7.x86_64 from the built-in repository is installed.

To use another version (e.g., kernel-rt-kvm-3.10.0-1062.9.1.rt56.1033.el7.x86_64), create a host_vars file for the host with the following content:

kernel_version: 3.10.0-1062.9.1.rt56.1033.el7.x86_64

Use different non-rt kernel (3.10.0-1062)

The CEEK installs a real-time kernel by default. However, the non-rt kernel is present in the official CentOS repository. Therefore, to use a different non-rt kernel, the following overrides must be applied:

kernel_repo_url: ""                           # package is in default repository, no need to add new repository
kernel_package: kernel                        # instead of kernel-rt-kvm
kernel_devel_package: kernel-devel            # instead of kernel-rt-devel
kernel_version: 3.10.0-1062.el7.x86_64

dpdk_kernel_devel: ""  # kernel-devel is in the repository, no need for url with RPM

# Since we're not using the rt kernel, tuned-profiles-realtime is not needed, but we want to keep tuned 2.11
tuned_packages:
- http://linuxsoft.cern.ch/scientific/7x/x86_64/os/Packages/tuned-2.11.0-8.el7.noarch.rpm
tuned_profile: balanced
tuned_vars: ""

Use tuned 2.9

tuned_packages:
- tuned-2.9.0-1.el7fdp
- tuned-profiles-realtime-2.9.0-1.el7fdp

Default kernel and configure tuned

kernel_skip: true     # skip kernel customization altogether

# update tuned to 2.11 but don't install tuned-profiles-realtime since we're not using rt kernel
tuned_packages:
- http://linuxsoft.cern.ch/scientific/7x/x86_64/os/Packages/tuned-2.11.0-8.el7.noarch.rpm
tuned_profile: balanced
tuned_vars: ""

Change amount of HugePages

hugepage_amount: "1000"   # default is 5000

Change size of HugePages

hugepage_size: "1G"   # default is 2M

Change amount and size of HugePages

hugepage_amount: "10"   # default is 5000
hugepage_size: "1G"     # default is 2M
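
When changing both values, it helps to check how much memory the combination reserves. The following is a small sketch of that arithmetic, where hugepage_total_mib is a hypothetical helper name. It assumes the two sizes named in the defaults (2M and 1G) and counts them as 2 MiB and 1024 MiB.

```python
# Page sizes accepted by the hugepage_size variable, expressed in MiB.
PAGE_SIZE_MIB = {"2M": 2, "1G": 1024}


def hugepage_total_mib(hugepage_size, hugepage_amount):
    """Return the total memory reserved by the given HugePage settings, in MiB."""
    return PAGE_SIZE_MIB[hugepage_size] * int(hugepage_amount)


default_total = hugepage_total_mib("2M", "5000")  # the default settings
custom_total = hugepage_total_mib("1G", "10")     # the override shown above
```

With the defaults, 5000 pages of 2M reserve 10000 MiB. The override of 10 pages of 1G reserves 10240 MiB.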

Remove input output memory management unit (IOMMU) from grub params

default_grub_params: "hugepagesz={{ hugepage_size }} hugepages={{ hugepage_amount }}"

Add custom GRUB parameter

additional_grub_params: "debug"

Configure OVS-DPDK in kube-ovn

By default, OVS-DPDK is disabled (because calico is set as the default CNI). To enable it, set the following flag:

kubeovn_dpdk: true

NOTE: This flag should be set in roles/kubernetes/cni/kubeovn/common/defaults/main.yml or added to inventory/default/group_vars/all/10-default.yml.

Additionally, HugePages in the OVS pod can be adjusted once default HugePage settings are changed.

kubeovn_dpdk_socket_mem: "1024,0" # Amount of hugepages reserved for OVS per NUMA node (node 0, node 1, ...) in MB
kubeovn_dpdk_hugepage_size: "2Mi" # Default size of hugepages, can be 2Mi or 1Gi
kubeovn_dpdk_hugepages: "1Gi"     # Total amount of hugepages that can be used by OVS-OVN pod

NOTE: If the machine has multiple NUMA nodes, remember that HugePages must be allocated for each NUMA node. For example, if a machine has two NUMA nodes, kubeovn_dpdk_socket_mem: "1024,1024" or similar should be specified.

NOTE: If kubeovn_dpdk_socket_mem is changed, set the value of kubeovn_dpdk_hugepages to be equal to or greater than the sum of kubeovn_dpdk_socket_mem values. For example, for kubeovn_dpdk_socket_mem: "1024,1024", set kubeovn_dpdk_hugepages to at least 2Gi (equal to 2048 MB).

NOTE: kubeovn_dpdk_socket_mem, kubeovn_dpdk_pmd_cpu_mask, and kubeovn_dpdk_lcore_mask can be set on a per-node basis, but the HugePage amount allocated with kubeovn_dpdk_socket_mem cannot be greater than kubeovn_dpdk_hugepages, which is the same for the whole cluster.
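
The relationship between kubeovn_dpdk_socket_mem and kubeovn_dpdk_hugepages described in the notes above can be expressed as a short check. This is a sketch with hypothetical function names. Like the note above, it treats the MB values of kubeovn_dpdk_socket_mem and the Mi/Gi units of kubeovn_dpdk_hugepages as directly comparable (1Gi = 1024 MB).

```python
def socket_mem_total_mb(socket_mem):
    """Sum the per-NUMA-node values of a kubeovn_dpdk_socket_mem string."""
    return sum(int(part) for part in socket_mem.split(","))


def quantity_to_mb(quantity):
    """Convert a value such as '1Gi' or '512Mi' to MB (Mi is treated as MB)."""
    for suffix, factor in (("Gi", 1024), ("Mi", 1)):
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    raise ValueError(f"unsupported quantity: {quantity}")


def hugepages_sufficient(socket_mem, hugepages):
    """True if kubeovn_dpdk_hugepages covers the sum of kubeovn_dpdk_socket_mem."""
    return quantity_to_mb(hugepages) >= socket_mem_total_mb(socket_mem)


defaults_ok = hugepages_sufficient("1024,0", "1Gi")       # the default values
two_numa_low = hugepages_sufficient("1024,1024", "1Gi")   # too small for two nodes
two_numa_ok = hugepages_sufficient("1024,1024", "2Gi")
```

When socket memory is set per node, the check should be run with the largest per-node sum, because kubeovn_dpdk_hugepages applies to the whole cluster.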

OVS pod memory requests and limits are configured by:

kubeovn_dpdk_resources_requests: "1Gi" # OVS-OVN pod RAM memory (requested)
kubeovn_dpdk_resources_limits: "1Gi"   # OVS-OVN pod RAM memory (limit)

CPU settings can be configured using:

kubeovn_dpdk_pmd_cpu_mask: "0x4" # DPDK PMD CPU mask
kubeovn_dpdk_lcore_mask: "0x2"   # DPDK lcore mask
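
Both masks are hexadecimal bitmasks in which bit n selects CPU core n, so the default 0x4 selects core 2 and 0x2 selects core 1. The following sketch, with a hypothetical helper name, derives a mask from a list of core numbers:

```python
def cpu_mask(cores):
    """Return the hexadecimal bitmask that selects the given CPU cores."""
    mask = 0
    for core in cores:
        mask |= 1 << core  # set the bit that corresponds to this core
    return hex(mask)


pmd_default = cpu_mask([2])      # matches the default kubeovn_dpdk_pmd_cpu_mask
lcore_default = cpu_mask([1])    # matches the default kubeovn_dpdk_lcore_mask
two_pmd_cores = cpu_mask([2, 3])
```

For example, to run PMD threads on cores 2 and 3, the computed value "0xc" would be assigned to kubeovn_dpdk_pmd_cpu_mask.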

Adding new CNI plugins for Kubernetes (Network Edge)

  • The role that handles CNI deployment must be placed in the roles/kubernetes/cni/ directory (e.g., roles/kubernetes/cni/kube-ovn/).
  • Subroles for control plane and node (if needed) should be placed in the controlplane/ and node/ directories (e.g., roles/kubernetes/cni/kube-ovn/{controlplane,node}).
  • If some tasks are common to both the control plane and the node, an additional common sub-role can be created (e.g., roles/kubernetes/cni/sriov/common).

NOTE: The automatic inclusion of the common role should be handled by Ansible mechanisms (e.g., usage of meta's dependencies or the include_role module).

  • Name of the main role must be added to the available_kubernetes_cnis variable in roles/kubernetes/cni/defaults/main.yml.
  • If additional requirements must be checked before running the playbook (to avoid errors during execution), they can be placed in the roles/kubernetes/cni/tasks/precheck.yml file, which is included as a pre_task in plays for both Edge Controller and Edge Node.
    The following are basic prechecks that are currently executed:
    • Check if any CNI is requested (i.e., kubernetes_cnis is not empty).
    • Check if sriov is not requested as primary (first on the list) or standalone (only on the list).
    • Check if calico is requested as a primary (first on the list).
    • Check if kubeovn is requested as a primary (first on the list).
    • Check if the requested CNI is available (check if some CNI is requested that isn't present in the available_kubernetes_cnis list).
  • CNI roles should be as self-contained as possible (unless necessary, CNI-specific tasks should not be present in kubernetes/{controlplane,node,common} or openness/network_edge/{controlplane,node}).
  • If the CNI needs a custom Smart Edge Open service (e.g., Interface Service in case of kube-ovn), it can be added to openness/network_edge/{controlplane,node}.
    Preferably, such tasks would be contained in a separate task file (e.g., roles/openness/controlplane/tasks/kube-ovn.yml) and executed only if the CNI is requested. For example:
    - name: deploy interface service for kube-ovn
      include_tasks: kube-ovn.yml
      when: "'kubeovn' in kubernetes_cnis"
  • If the CNI is used as an additional CNI (with Multus*), the network attachment definition must be supplied (refer to Multus docs for more info).
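
The basic prechecks listed above can be summarized as plain logic. The sketch below is an illustration only and is not the content of precheck.yml. The function name precheck_cnis is hypothetical, and it assumes that "requested as a primary" means that calico or kubeovn, when present in the list, must be the first entry.

```python
def precheck_cnis(requested, available):
    """Return a list of precheck failures for the requested CNI list."""
    if not requested:
        return ["no CNI requested"]

    errors = []
    # sriov first on the list covers both the primary and the standalone case.
    if requested[0] == "sriov":
        errors.append("sriov cannot be primary or standalone")
    # Assumption: calico and kubeovn are only valid as the primary CNI.
    for name in ("calico", "kubeovn"):
        if name in requested and requested[0] != name:
            errors.append(f"{name} must be the primary CNI")
    for name in requested:
        if name not in available:
            errors.append(f"{name} is not in available_kubernetes_cnis")
    return errors


# Example list of available CNIs, chosen for illustration.
AVAILABLE = ["calico", "kubeovn", "flannel", "weavenet", "sriov"]

valid = precheck_cnis(["calico", "sriov"], AVAILABLE)
sriov_first = precheck_cnis(["sriov", "calico"], AVAILABLE)
unknown = precheck_cnis(["calico", "mycni"], AVAILABLE)
```

A new CNI role passes the last check only after its name has been added to the available_kubernetes_cnis variable.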