Intel Technology Enabling for OpenShift Architecture and Working Scope

Overview

The Intel Technology Enabling for OpenShift project provides technologies for provisioning Intel Data Center hardware features on the Red Hat OpenShift Container Platform (RHOCP). The project also includes the technology to deploy and manage Intel Enterprise AI End-to-End (E2E) solutions, along with the reference workloads related to these features.

Intel AI hardware and optimized software solutions are integrated into Red Hat OpenShift AI for ease of provisioning and configuration. The Habana AI Operator is used to provision Intel® Gaudi® accelerators and is released on the Red Hat Ecosystem Catalog.

A CI/CD pipeline based on Red Hat Distributed CI* (DCI) is leveraged to enable and test this E2E solution with each RHOCP release, so that new features and improvements are promptly available.

The Open Platform for Enterprise AI (OPEA) RAG workloads are used to validate and optimize Intel Enterprise AI E2E solutions.

Intel Technology Enabling for OpenShift Architecture

Figure 1 Intel Technology Enabling for OpenShift Architecture

Architecture Options

OpenShift 4 is an operator-focused platform. Operators are the key technology for provisioning and managing hardware and software resources on OpenShift. The architecture must be highly extensible and configurable to support provisioning various hardware features and end-to-end solutions for user scenarios.

There are two architecture options: the General Operators (horizontal) architecture and the monolithic Special Resource Operator (vertical) architecture. See Figure 2, Operator architecture options.

The General Operators architecture divides the provisioning stack into several layers of functions, with each General Operator handling one layer. To provision a specific feature, the particular custom resources (CRs) defined by the custom resource definitions (CRDs) need to be created and managed by the operators on the different layers. This design is based on layered abstraction: upper-layer functions normally depend on lower-layer functions.
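
As an illustration of the layered approach, the minimal sketch below shows separate CRs, each owned by a different general operator, cooperating to provision one accelerator. The API versions are taken from the respective upstream projects and may differ between releases:

```yaml
# Layer: node labelling, handled by the NFD Operator
apiVersion: nfd.openshift.io/v1
kind: NodeFeatureDiscovery
metadata:
  name: nfd-instance
  namespace: openshift-nfd
---
# Layer: kernel driver lifecycle, handled by the KMM Operator
apiVersion: kmm.sigs.x-k8s.io/v1beta1
kind: Module
metadata:
  name: intel-dgpu
---
# Layer: resource advertisement, handled by the Intel Device Plugins Operator
apiVersion: deviceplugin.intel.com/v1
kind: GpuDevicePlugin
metadata:
  name: gpudeviceplugin-sample
```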

The Special Resource Operator architecture implements a monolithic operator for each feature. Normally, each monolithic operator is independent of the others. The operator includes all the functions needed to provision the specific feature according to a single CR from the operator's CRD.
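
By contrast, a monolithic operator typically exposes a single CR that drives the whole stack for one feature. The kind and fields below are hypothetical and only illustrate the shape of such a CR:

```yaml
# Hypothetical monolithic CR: one object covers driver installation,
# device plugin deployment, and monitoring for a single accelerator.
apiVersion: example.intel.com/v1alpha1
kind: AcceleratorStack
metadata:
  name: intel-dgpu-stack
spec:
  driver:
    image: registry.example.com/dgpu-driver:latest  # placeholder image
  devicePlugin:
    sharedDevNum: 1
  monitoring:
    enabled: true
```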

Figure 2 Operator architecture options

To provision various hardware features and different end-to-end solutions, the General Operators architecture is a good option. This architecture is more extensible and configurable. Each operator is designed for a specific layer of function. All the features, including new ones, can leverage the same operator just by adding a new CR (or modifying an existing CR) from the operator's CRD. A new operator can be added to support a new function without interfering with the other operators. Furthermore, since each operator is normally designed for a general function, it can be implemented as a general Kubernetes upstream solution, which can be downstreamed and used by all Kubernetes platforms.
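
For example, a new hardware feature can often be supported without any new operator, simply by adding one more CR that an existing operator reconciles. The sketch below is hedged: the label name and PCI device ID are placeholders. It uses the upstream NFD NodeFeatureRule CRD to label nodes carrying a new device:

```yaml
# Illustrative NodeFeatureRule: the NFD Operator already running on the
# cluster picks up this CR and labels matching nodes; no new operator needed.
apiVersion: nfd.k8s-sigs.io/v1alpha1
kind: NodeFeatureRule
metadata:
  name: intel-new-device-rule
spec:
  rules:
    - name: intel.new.device
      labels:
        "intel.feature.node.kubernetes.io/new-device": "true"  # placeholder label
      matchFeatures:
        - feature: pci.device
          matchExpressions:
            vendor: {op: In, value: ["8086"]}  # Intel PCI vendor ID
            device: {op: In, value: ["abcd"]}  # placeholder device ID
```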

With the Special Resource Operator architecture, the operator is designed for a specific hardware feature. All the functions needed to provision the feature are handled by a single monolithic operator. Normally, only a single CR needs to be created to manage the software stack that provisions the specialized hardware feature. Compared with the General Operators architecture, implementation, maintenance, and configuration are easier.

So, if the focus is just a single hardware feature, the monolithic Special Resource Operator architecture is the better option. If multiple hardware features need to be supported, the General Operators architecture is preferred.

Because it provisions several accelerators, the Intel Technology Enabling for OpenShift project uses the General Operators architecture, as shown in Figure 1.

Upstream First Policy

Before diving deep into the architecture, the upstream first policy should be introduced.

Although Intel Technology Enabling for OpenShift is a downstream project for RHOCP, the upstream first policy is embraced by this project.

Firstly, general Kubernetes upstream solutions and projects are preferred by this project over RHOCP-specific solutions.

Secondly, all bug fixes and new feature implementations must be made in the relevant upstream project and then downstreamed to RHOCP before they can be used by this project. None of the upstream projects are forked, and no project-specific patches are maintained in this project.

Thirdly, all project development, maintenance, and support efforts follow the general open source process on GitHub. Any contribution, including feature requests, bug reports, and pull requests, is welcome.

In this way, the project can fully contribute to and leverage the open source community to build a consolidated Intel technology-based ecosystem that helps users adopt these technologies in data center and edge computing workload solutions.

Figure 3 illustrates the open source community ecosystem in which the Intel Technology Enabling for OpenShift project reinforces the upstream first policy.

Figure 3 Intel Technology Enabling for OpenShift Ecosystem

Architecture Description

To provision Intel hardware features on RHOCP, the following open source projects are used:

  • Node Feature Discovery (NFD), NFD Operator
    The NFD Operator is used to deploy and manage NFD on RHOCP. NFD automatically detects hardware resources and labels the nodes for the hardware provisioning operation. See these details for how NFD and the NFD Operator are used by this project.

  • Machine Config Operator (MCO)
    MCO is used to configure Red Hat Enterprise Linux CoreOS (RHCOS) on the nodes. Since RHCOS is the only supported operating system for RHOCP, MCO is the only RHOCP-specific solution used in this project. See these details for how MCO is used by this project.

  • Kernel Module Management (KMM), KMM Operator
    The KMM Operator is used to manage the lifecycle of Intel Data Center GPU driver containers. KMM is the upstream project for the KMM Operator. See these details for how the KMM Operator is used to manage the pre-built Intel Data Center GPU driver containers in this project.

  • Intel® Data Center GPU Driver for OpenShift*, Intel GPU Drivers
    The Intel Data Center GPU Driver for OpenShift project is used to build and release Intel Data Center GPU driver container images for RHOCP.

  • Intel Device Plugins Operator
    Intel Device Plugins are used to advertise Intel hardware features (resources) to an RHOCP cluster, so that workloads running in pods deployed on the cluster can leverage these features. The Intel Device Plugins Operator handles the deployment and lifecycle of these device plugins. A sketch of how the NFD labels, the KMM-managed driver container, and the device plugin fit together is shown after this list.

  • Habana AI Operator
    This operator is used to provision Intel® Gaudi® accelerators with OpenShift. It leverages the General Operators, including the NFD Operator, MCO, and the KMM Operator.
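
The sketch below shows how these operators cooperate to provision an Intel Data Center GPU. It is a minimal, hedged example: container images are placeholders, and the exact fields and API versions depend on the operator releases in use.

```yaml
# KMM Module: load the GPU kernel driver only on nodes that NFD has labelled.
apiVersion: kmm.sigs.x-k8s.io/v1beta1
kind: Module
metadata:
  name: intel-dgpu
  namespace: openshift-kmm
spec:
  moduleLoader:
    container:
      modprobe:
        moduleName: i915
      kernelMappings:
        - regexp: '^.*\.x86_64$'
          containerImage: <intel-dgpu-driver-container-image>  # placeholder
  selector:
    intel.feature.node.kubernetes.io/gpu: "true"  # label applied via NFD
---
# GpuDevicePlugin: advertise the GPU as an allocatable cluster resource.
apiVersion: deviceplugin.intel.com/v1
kind: GpuDevicePlugin
metadata:
  name: gpudeviceplugin-sample
spec:
  sharedDevNum: 1
  nodeSelector:
    intel.feature.node.kubernetes.io/gpu: "true"
---
# A workload pod can then request the advertised resource.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-workload
spec:
  containers:
    - name: app
      image: <workload-image>  # placeholder
      resources:
        limits:
          gpu.intel.com/i915: 1
```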

End-to-End Solutions

Intel end-to-end solutions depend on the specific hardware features. The inference part of the Intel Enterprise AI E2E solution has been included in Intel Technology Enabling for OpenShift since the 1.0.0 release. The whole Intel Enterprise AI E2E solution will be added step by step in future releases.

Intel AI Inference E2E Solution for OpenShift

The Intel AI inference E2E solution for RHOCP is built upon Intel Data Center GPU Flex Series provisioning and Intel® Xeon® processors. The following two AI inference modes are tested with Intel Data Center GPU card provisioning in this release:

  • Interactive Mode

Open Data Hub (ODH) and Red Hat OpenShift Data Science (RHODS) provide an Intel OpenVINO based Jupyter Notebook to help users interactively debug inference applications or optimize models on RHOCP using Intel Data Center GPU cards and Intel Xeon processors.

  • Deployment Mode

The OpenVINO™ Toolkit and Operator provide the OpenVINO Model Server (OVMS) for users to deploy their inference workloads using Intel Data Center GPU cards and Intel Xeon processors on RHOCP clusters in cloud or edge environments, as sketched below.
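
As a hedged illustration of the deployment mode, the sketch below uses a plain Kubernetes Deployment rather than the operator-managed path; the model name, model path, and replica count are placeholders. It shows an OVMS workload requesting the GPU resource advertised by the Intel device plugin:

```yaml
# Minimal OVMS Deployment requesting an Intel Data Center GPU.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ovms-sample
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ovms-sample
  template:
    metadata:
      labels:
        app: ovms-sample
    spec:
      containers:
        - name: ovms
          image: openvino/model_server:latest
          args:
            - --model_name=sample-model    # placeholder model name
            - --model_path=/models/sample  # placeholder model path
            - --target_device=GPU          # run inference on the Intel GPU
            - --port=9000
          resources:
            limits:
              gpu.intel.com/i915: 1        # resource exposed by the GPU device plugin
```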

See details about the Intel AI inference end-to-end solution.

Reference Workload

Reference workloads can be used by customers as a reference for using one or multiple Intel technologies in their solutions. The Intel Technology Enabling for OpenShift project is also driven by these workloads: starting from a workload, gaps in the end-to-end solutions and the provisioning stack can be identified easily, and general solutions to bridge those gaps can be discussed effectively among all the stakeholders in the open source community.

Open Platform for Enterprise AI (OPEA) is the major reference workload for Intel Enterprise AI; it will be added starting from the 1.3.1 release.