Add AudioQnA example via GMC (#597)
* add AudioQnA example via GMC.
Signed-off-by: zhlsunshine <huailong.zhang@intel.com>

* add more information for e2e test scripts.
Signed-off-by: zhlsunshine <huailong.zhang@intel.com>

* fix bug in e2e test scripts.
Signed-off-by: zhlsunshine <huailong.zhang@intel.com>
zhlsunshine authored Aug 16, 2024
1 parent 039014f commit c86cf85
Showing 5 changed files with 412 additions and 0 deletions.
74 changes: 74 additions & 0 deletions AudioQnA/kubernetes/README.md
@@ -0,0 +1,74 @@
# Deploy AudioQnA in Kubernetes Cluster on Xeon and Gaudi

This document outlines the deployment process for an AudioQnA application utilizing the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline components on Intel Xeon servers and Gaudi machines.

The AudioQnA service leverages a Kubernetes operator called genai-microservices-connector (GMC). GMC connects microservices into pipelines based on the specification in a pipeline YAML file, and additionally lets the user dynamically control which model is used by a service such as an LLM or embedder. The underlying pipeline language also supports external services that may be running elsewhere, in a public or private cloud.

Install GMC in your Kubernetes cluster, if you have not already done so, by following the steps in the "Getting Started" section of the [GMC Install](https://github.com/opea-project/GenAIInfra/tree/main/microservices-connector) guide. We will soon publish images to Docker Hub, at which point no builds will be required, simplifying installation.


The AudioQnA application is defined as a Custom Resource (CR) file that the GMC operator acts upon. The operator first checks whether the microservices listed in the CR YAML file are running; if not, it starts them and then connects them. When the AudioQnA pipeline is ready, the service endpoint details are returned, letting you use the application. If you run `kubectl get pods`, you will see all the component microservices, in particular `asr`, `tts`, and `llm`.
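For example, once the pipeline has been deployed you can list the component pods (using the `audioqa` namespace created later in this guide):

```sh
kubectl get pods -n audioqa
```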


## Using prebuilt images

AudioQnA uses the following prebuilt images if you choose a Xeon deployment (an optional pre-pull sketch follows the list):

- tgi-service: ghcr.io/huggingface/text-generation-inference:1.4
- llm: opea/llm-tgi:latest
- asr: opea/asr:latest
- whisper: opea/whisper:latest
- tts: opea/tts:latest
- speecht5: opea/speecht5:latest
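
Optionally, you can pre-pull these images onto your worker nodes before deployment; a minimal sketch, assuming Docker is the container runtime on the node:

```sh
docker pull ghcr.io/huggingface/text-generation-inference:1.4
docker pull opea/llm-tgi:latest
docker pull opea/asr:latest
docker pull opea/whisper:latest
docker pull opea/tts:latest
docker pull opea/speecht5:latest
```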


Should you desire to use the Gaudi accelerator, alternate images are used for the TGI, Whisper, and SpeechT5 services.
For Gaudi:

- tgi-service: ghcr.io/huggingface/tgi-gaudi:1.2.1
- whisper-gaudi: opea/whisper-gaudi:latest
- speecht5-gaudi: opea/speecht5-gaudi:latest

> [NOTE]
> Please refer to the [Xeon README](https://github.com/opea-project/GenAIExamples/blob/main/AudioQnA/docker/xeon/README.md) or [Gaudi README](https://github.com/opea-project/GenAIExamples/blob/main/AudioQnA/docker/gaudi/README.md) to build the OPEA images. These too will be available on Docker Hub soon to simplify use.

## Deploy AudioQnA pipeline

This involves deploying the AudioQnA custom resource. Use `audioQnA_xeon.yaml` or, if you have a Gaudi cluster, `audioQnA_gaudi.yaml`.

1. Create namespace and deploy application
```sh
kubectl create ns audioqa
kubectl apply -f $(pwd)/audioQnA_xeon.yaml
```

2. GMC will reconcile the AudioQnA custom resource and get all related components/services ready. Check whether the services are up.

```sh
kubectl get service -n audioqa
```
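
If you prefer to block until every pod in the namespace reports Ready, a minimal sketch (the timeout is an assumption; adjust it for your cluster and model download times):

```sh
kubectl wait --for=condition=Ready pods --all -n audioqa --timeout=300s
```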

3. Retrieve the application access URL

```sh
kubectl get gmconnectors.gmc.opea.io -n audioqa
NAME URL READY AGE
audioqa http://router-service.audioqa.svc.cluster.local:8080 6/0/6 5m
```

4. Deploy a client pod to test the application

```sh
kubectl create deployment client-test -n audioqa --image=python:3.8.13 -- sleep infinity
```
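
Optionally wait for the client pod to become Ready before using it; a minimal sketch:

```sh
kubectl wait --for=condition=Ready pod -l app=client-test -n audioqa --timeout=120s
```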

5. Access the application using the above URL from the client pod

```sh
export CLIENT_POD=$(kubectl get pod -n audioqa -l app=client-test -o jsonpath={.items..metadata.name})
export accessUrl=$(kubectl get gmc -n audioqa -o jsonpath="{.items[?(@.metadata.name=='audioqa')].status.accessUrl}")
kubectl exec "$CLIENT_POD" -n audioqa -- curl $accessUrl -X POST -d '{"byte_str": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA", "parameters":{"max_new_tokens":64, "do_sample": true, "streaming":false}}' -H 'Content-Type: application/json'
```
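
The `byte_str` in the request body is base64-encoded WAV audio, and the pipeline returns the synthesized spoken answer in a `byte_str` field of its JSON response. A sketch for building the request from your own recording and saving the answer to a file, assuming `jq` and GNU `base64` are available and the hypothetical file `my_question.wav` holds your question:

```sh
# Encode a local WAV file for the request payload (-w 0 disables line wrapping; GNU base64)
BYTE_STR=$(base64 -w 0 my_question.wav)

# Send the request from the client pod and decode the spoken answer locally
kubectl exec "$CLIENT_POD" -n audioqa -- curl $accessUrl -s -X POST \
  -d "{\"byte_str\": \"$BYTE_STR\", \"parameters\":{\"max_new_tokens\":64, \"do_sample\": true, \"streaming\":false}}" \
  -H 'Content-Type: application/json' | jq -r .byte_str | base64 -d > answer.wav
```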

> [NOTE]
> You can remove your AudioQnA pipeline by executing standard Kubernetes `kubectl` commands to remove the custom resource. Verify it was removed by executing `kubectl get pods` in the `audioqa` namespace.
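
A minimal cleanup sketch, assuming the Xeon variant and the client pod created above:

```sh
kubectl delete -f $(pwd)/audioQnA_xeon.yaml
kubectl delete deployment client-test -n audioqa
kubectl delete ns audioqa
```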
58 changes: 58 additions & 0 deletions AudioQnA/kubernetes/audioQnA_gaudi.yaml
@@ -0,0 +1,58 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: gmc.opea.io/v1alpha3
kind: GMConnector
metadata:
  labels:
    app.kubernetes.io/name: gmconnector
    app.kubernetes.io/managed-by: kustomize
    gmc/platform: gaudi
  name: audioqa
  namespace: audioqa
spec:
  routerConfig:
    name: router
    serviceName: router-service
  nodes:
    root:
      routerType: Sequence
      steps:
        - name: Asr
          internalService:
            serviceName: asr-svc
            config:
              endpoint: /v1/audio/transcriptions
              ASR_ENDPOINT: whisper-gaudi-svc
        - name: WhisperGaudi
          internalService:
            serviceName: whisper-gaudi-svc
            config:
              endpoint: /v1/asr
            isDownstreamService: true
        - name: Llm
          data: $response
          internalService:
            serviceName: llm-svc
            config:
              endpoint: /v1/chat/completions
              TGI_LLM_ENDPOINT: tgi-gaudi-svc
        - name: TgiGaudi
          internalService:
            serviceName: tgi-gaudi-svc
            config:
              endpoint: /generate
            isDownstreamService: true
        - name: Tts
          data: $response
          internalService:
            serviceName: tts-svc
            config:
              endpoint: /v1/audio/speech
              TTS_ENDPOINT: speecht5-gaudi-svc
        - name: SpeechT5Gaudi
          internalService:
            serviceName: speecht5-gaudi-svc
            config:
              endpoint: /v1/tts
            isDownstreamService: true
58 changes: 58 additions & 0 deletions AudioQnA/kubernetes/audioQnA_xeon.yaml
@@ -0,0 +1,58 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: gmc.opea.io/v1alpha3
kind: GMConnector
metadata:
  labels:
    app.kubernetes.io/name: gmconnector
    app.kubernetes.io/managed-by: kustomize
    gmc/platform: xeon
  name: audioqa
  namespace: audioqa
spec:
  routerConfig:
    name: router
    serviceName: router-service
  nodes:
    root:
      routerType: Sequence
      steps:
        - name: Asr
          internalService:
            serviceName: asr-svc
            config:
              endpoint: /v1/audio/transcriptions
              ASR_ENDPOINT: whisper-svc
        - name: Whisper
          internalService:
            serviceName: whisper-svc
            config:
              endpoint: /v1/asr
            isDownstreamService: true
        - name: Llm
          data: $response
          internalService:
            serviceName: llm-svc
            config:
              endpoint: /v1/chat/completions
              TGI_LLM_ENDPOINT: tgi-svc
        - name: Tgi
          internalService:
            serviceName: tgi-svc
            config:
              endpoint: /generate
            isDownstreamService: true
        - name: Tts
          data: $response
          internalService:
            serviceName: tts-svc
            config:
              endpoint: /v1/audio/speech
              TTS_ENDPOINT: speecht5-svc
        - name: SpeechT5
          internalService:
            serviceName: speecht5-svc
            config:
              endpoint: /v1/tts
            isDownstreamService: true
111 changes: 111 additions & 0 deletions AudioQnA/tests/test_gmc_on_gaudi.sh
@@ -0,0 +1,111 @@
#!/bin/bash
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

set -xe
USER_ID=$(whoami)
LOG_PATH=/home/$(whoami)/logs
MOUNT_DIR=/home/$USER_ID/.cache/huggingface/hub
IMAGE_REPO=${IMAGE_REPO:-}

function install_audioqa() {
    kubectl create ns $APP_NAMESPACE
    sed -i "s|namespace: audioqa|namespace: $APP_NAMESPACE|g" ./audioQnA_gaudi.yaml
    kubectl apply -f ./audioQnA_gaudi.yaml

    # Wait until the router service is ready
    echo "Waiting for the audioqa router service to be ready..."
    wait_until_pod_ready "audioqa router" $APP_NAMESPACE "router-service"
    output=$(kubectl get pods -n $APP_NAMESPACE)
    echo $output
}

function validate_audioqa() {
    # deploy a client pod for testing
    kubectl create deployment client-test -n $APP_NAMESPACE --image=python:3.8.13 -- sleep infinity

    # wait for the client pod to be ready
    wait_until_pod_ready "client-test" $APP_NAMESPACE "client-test"
    # give the services time to finish initializing
    sleep 60

    kubectl get pods -n $APP_NAMESPACE
    # send a request to the audioqa pipeline
    export CLIENT_POD=$(kubectl get pod -n $APP_NAMESPACE -l app=client-test -o jsonpath={.items..metadata.name})
    echo "$CLIENT_POD"
    accessUrl=$(kubectl get gmc -n $APP_NAMESPACE -o jsonpath="{.items[?(@.metadata.name=='audioqa')].status.accessUrl}")
    byte_str=$(kubectl exec "$CLIENT_POD" -n $APP_NAMESPACE -- curl $accessUrl -s -X POST -d '{"byte_str": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA", "parameters":{"max_new_tokens":64, "do_sample": true, "streaming":false}}' -H 'Content-Type: application/json' | jq .byte_str)
    echo "$byte_str" > $LOG_PATH/curl_audioqa.log
    if [ -z "$byte_str" ]; then
        echo "audioqa failed, please check the logs in ${LOG_PATH}!"
        exit 1
    fi
    echo "Audioqa response check succeeded!"
}

function wait_until_pod_ready() {
    echo "Waiting for the $1 to be ready..."
    max_retries=30
    retry_count=0
    while ! is_pod_ready $2 $3; do
        if [ $retry_count -ge $max_retries ]; then
            echo "$1 is not ready after waiting for a significant amount of time"
            get_gmc_controller_logs
            exit 1
        fi
        echo "$1 is not ready yet. Retrying in 10 seconds..."
        sleep 10
        output=$(kubectl get pods -n $2)
        echo $output
        retry_count=$((retry_count + 1))
    done
}

function is_pod_ready() {
    if [ "$2" == "gmc-controller" ]; then
        pod_status=$(kubectl get pods -n $1 -o jsonpath='{.items[].status.conditions[?(@.type=="Ready")].status}')
    else
        pod_status=$(kubectl get pods -n $1 -l app=$2 -o jsonpath='{.items[].status.conditions[?(@.type=="Ready")].status}')
    fi
    if [ "$pod_status" == "True" ]; then
        return 0
    else
        return 1
    fi
}

function get_gmc_controller_logs() {
    # Fetch the name of the pod with the label control-plane=gmc-controller in the specified namespace
    pod_name=$(kubectl get pods -n $SYSTEM_NAMESPACE -l control-plane=gmc-controller -o jsonpath='{.items[0].metadata.name}')

    # Check if the pod name was found
    if [ -z "$pod_name" ]; then
        echo "No pod found with app-name gmc-controller in namespace $SYSTEM_NAMESPACE"
        return 1
    fi

    # Get the logs of the found pod
    echo "Fetching logs for pod $pod_name in namespace $SYSTEM_NAMESPACE..."
    kubectl logs $pod_name -n $SYSTEM_NAMESPACE
}

if [ $# -eq 0 ]; then
    echo "Usage: $0 <function_name>"
    exit 1
fi

case "$1" in
    install_AudioQnA)
        pushd AudioQnA/kubernetes
        install_audioqa
        popd
        ;;
    validate_AudioQnA)
        pushd AudioQnA/kubernetes
        validate_audioqa
        popd
        ;;
    *)
        echo "Unknown function: $1"
        ;;
esac
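
The test script is driven by a function-name argument and expects to be run from the repository root (it pushd's into `AudioQnA/kubernetes`). A usage sketch, noting that `APP_NAMESPACE` and `SYSTEM_NAMESPACE` are referenced by the script but must be exported by the caller; the values below are assumptions:

```sh
export APP_NAMESPACE=audioqa-test     # assumed: any unique namespace name works
export SYSTEM_NAMESPACE=system        # assumed: namespace where the GMC controller runs
bash AudioQnA/tests/test_gmc_on_gaudi.sh install_AudioQnA
bash AudioQnA/tests/test_gmc_on_gaudi.sh validate_AudioQnA
```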