Migrate checkbox-dss-validation to checkbox/contrib (New) #1524

Merged
merged 16 commits into from
Oct 7, 2024
3 changes: 3 additions & 0 deletions .github/CODEOWNERS
@@ -4,3 +4,6 @@ contrib/checkbox-provider-ce-oem @canonical/oem-qa
# OEM SWE x86 team
contrib/pc-sanity @canonical/oem-swe-x86
.github/workflows/tox-contrib-pc-sanity.yaml @canonical/oem-swe-x86
# Solutions QA team
contrib/checkbox-dss-validation @canonical/solutions-qa
.github/workflows/testflinger-contrib-dss-regression.yaml @canonical/solutions-qa
60 changes: 60 additions & 0 deletions .github/workflows/testflinger-contrib-dss-regression.yaml
@@ -0,0 +1,60 @@
name: Data Science Stack (DSS) Regression Testing
on:
  workflow_dispatch:
  # schedule:
  #   - cron: "0 7 * * 1" # every Monday 07:00 UTC
  # push:
  #   branches:
  #     - main
  # pull_request:
  #   branches:
  #     - main

env:
  BRANCH: ${{ github.head_ref || github.ref_name }}

jobs:
  regression-tests:
    name: Regression tests
    runs-on: [testflinger]
    defaults:
      run:
        working-directory: contrib/checkbox-dss-validation
    strategy:
      matrix:
        dss_channel:
          - latest/stable
          - latest/edge
        queue:
          - dell-precision-3470-c30322 #ADL iGPU + NVIDIA GPU
          - dell-precision-5680-c31665 #RPL iGPU + Arc Pro A60M dGPU
    steps:
      - name: Check out code
        uses: actions/checkout@v4
      - name: Build job file from template with maas2 provisioning
        if: ${{ matrix.queue == 'dell-precision-3470-c30322' }}
        env:
          PROVISION_DATA: "distro: jammy"
        run: |
          sed -e "s|REPLACE_BRANCH|${BRANCH}|" \
              -e "s|REPLACE_QUEUE|${{ matrix.queue }}|" \
              -e "s|REPLACE_PROVISION_DATA|${PROVISION_DATA}|" \
              -e "s|REPLACE_DSS_CHANNEL|${{ matrix.dss_channel }}|" \
              ${GITHUB_WORKSPACE}/testflinger/job-def.yaml > \
              ${GITHUB_WORKSPACE}/job.yaml
      - name: Build job file from template with oemscript provisioning
        if: ${{ matrix.queue == 'dell-precision-5680-c31665' }}
        env:
          PROVISION_DATA: "url: http://10.102.196.9/somerville/Platforms/jellyfish-muk/X96_A00/dell-bto-jammy-jellyfish-muk-X96-20230419-19_A00.iso"
        run: |
          sed -e "s|REPLACE_BRANCH|${BRANCH}|" \
              -e "s|REPLACE_QUEUE|${{ matrix.queue }}|" \
              -e "s|REPLACE_PROVISION_DATA|${PROVISION_DATA}|" \
              -e "s|REPLACE_DSS_CHANNEL|${{ matrix.dss_channel }}|" \
              ${GITHUB_WORKSPACE}/testflinger/job-def.yaml > \
              ${GITHUB_WORKSPACE}/job.yaml
      - name: Submit testflinger job
        uses: canonical/testflinger/.github/actions/submit@main
        with:
          poll: true
          job-path: ${GITHUB_WORKSPACE}/job.yaml
85 changes: 85 additions & 0 deletions contrib/checkbox-dss-validation/README.md
@@ -0,0 +1,85 @@
# Welcome to the Checkbox DSS project!

This repository contains the Checkbox DSS Provider (test cases and test plans for validating Intel GPU support in the [Data Science Stack](https://documentation.ubuntu.com/data-science-stack/en/latest/)) as well as everything that is required to build the `checkbox-dss` snap.

# Requirements

- Ubuntu Jammy (22.04)
- Supported hardware platforms:
  - Intel platforms with recent GPU (>= Broadwell)

# Installation

Install the Checkbox runtime, then build and install the DSS provider snap:

```shell
sudo snap install --classic snapcraft
sudo snap install checkbox22
lxd init --auto
git clone https://github.com/canonical/checkbox
cd checkbox/contrib/checkbox-dss-validation
snapcraft
sudo snap install --dangerous --classic ./checkbox-dss_2.0_amd64.snap
```

Make sure that the provider service is running and active:

```shell
systemctl status snap.checkbox-dss.remote-slave.service
```
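
The same check can be scripted, for example in provisioning automation. The following is a minimal sketch and not part of the snap: `wait_for_service` is a hypothetical helper name, and the retry count and delay are arbitrary defaults. It relies only on `systemctl is-active --quiet`, which exits with status 0 when the unit is active.

```shell
# Hypothetical helper (not shipped with checkbox-dss): poll a systemd unit
# until it is active, giving up after a number of attempts.
wait_for_service() {
    unit="$1"
    attempts="${2:-10}"   # assumed default: 10 attempts
    delay="${3:-3}"       # assumed default: 3 seconds between attempts
    while [ "$attempts" -gt 0 ]; do
        # --quiet suppresses output; the exit status is 0 only when active
        if systemctl is-active --quiet "$unit"; then
            return 0
        fi
        attempts=$((attempts - 1))
        sleep "$delay"
    done
    echo "$unit did not become active" >&2
    return 1
}

# Example:
# wait_for_service snap.checkbox-dss.remote-slave.service
```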

# Install dependencies

Some tests need additional dependencies. A helper script is available to install them:

```shell
checkbox-dss.install-deps
```

By default this will install the `data-science-stack` snap from the `latest/stable`
channel. To instead install from `latest/edge` use:

```shell
checkbox-dss.install-deps --dss-snap-channel=latest/edge
```
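
Based on the `install-deps` script in this change, only the exact argument `--dss-snap-channel=latest/edge` selects the edge channel; any other value, or no argument at all, installs from `latest/stable`. A minimal sketch of that selection logic, where `resolve_dss_channel` is a hypothetical name used only for illustration:

```shell
# Mirrors the channel selection in bin/install-deps.
# resolve_dss_channel is a hypothetical name; the script itself inlines this.
resolve_dss_channel() {
    if [ "${1:-}" = "--dss-snap-channel=latest/edge" ]; then
        echo "latest/edge"
    else
        echo "latest/stable"
    fi
}

resolve_dss_channel                                  # prints latest/stable
resolve_dss_channel --dss-snap-channel=latest/edge   # prints latest/edge
resolve_dss_channel --dss-snap-channel=latest/beta   # prints latest/stable
```

Note the last case: an unrecognized channel is not rejected, it silently falls back to stable.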

# Automated Run

To run the test plans:

```shell
checkbox-dss.validate-intel-gpu
```

# Cleanup

WARNING: The following step removes kubectl and microk8s from your machine. If you wish to keep them, do not run it.

To clean up and uninstall the dependencies installed for the tests, run:

```shell
checkbox-dss.remove-deps
```

This will also remove the `data-science-stack` snap as well as any notebook servers
that are managed by `dss`.

# Develop the Checkbox DSS provider

Since snaps are immutable, the scripts and test cases inside an installed snap cannot be modified in place. Fortunately, Checkbox provides a way to side-load a provider on the device under test (DUT).

Therefore, if you want to edit a job definition, a script or a test plan, run the following commands on the DUT:

```shell
cd $HOME
git clone https://github.com/canonical/checkbox
mkdir /var/tmp/checkbox-providers
cp -r $HOME/checkbox/contrib/checkbox-dss-validation/checkbox-provider-dss /var/tmp/checkbox-providers/
```

You can then modify the content of the provider in `/var/tmp/checkbox-providers/checkbox-provider-dss/`, and it's this version that will be used when you run the tests.
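
If the copy step is likely to be repeated, it can be wrapped so that a re-run does not fail on the existing directory or overwrite local edits. This is only a sketch under assumptions: `sideload_provider` is a hypothetical helper, not something shipped with Checkbox, and it deliberately leaves an existing side-loaded copy untouched.

```shell
# Hypothetical helper: copy a provider into the side-load directory once.
sideload_provider() {
    src="$1"
    dest_root="${2:-/var/tmp/checkbox-providers}"
    name=$(basename "$src")

    if [ ! -d "$src" ]; then
        echo "provider directory not found: $src" >&2
        return 1
    fi
    # Keep an existing copy so that local modifications are not lost
    if [ -e "$dest_root/$name" ]; then
        echo "keeping existing side-loaded copy: $dest_root/$name"
        return 0
    fi
    mkdir -p "$dest_root"
    cp -r "$src" "$dest_root/"
}

# Example (paths taken from the steps above):
# sideload_provider "$HOME/checkbox/contrib/checkbox-dss-validation/checkbox-provider-dss"
```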

Please refer to the [Checkbox documentation] on side-loading providers for more information.

[Checkbox]: https://checkbox.readthedocs.io/
[Checkbox documentation]: https://checkbox.readthedocs.io/en/latest/side-loading.html
5 changes: 5 additions & 0 deletions contrib/checkbox-dss-validation/bin/checkbox-cli-wrapper
@@ -0,0 +1,5 @@
#!/bin/sh

# wrapper around the checkbox-cli
# use absolute path in order to not use system checkbox-cli (from deb packages)
exec /snap/checkbox22/current/bin/checkbox-cli "$@"
58 changes: 58 additions & 0 deletions contrib/checkbox-dss-validation/bin/configure
@@ -0,0 +1,58 @@
#!/usr/bin/env python3
# Copyright 2018-2022 Canonical Ltd.
# All rights reserved.
#
# Written by:
#   Maciej Kisielewski <maciej.kisielewski@canonical.com>
#   Sylvain Pineau <sylvain.pineau@canonical.com>
import os
import re
import sys

sys.path.append(os.path.expandvars("$SNAP/usr/lib/python3/dist-packages"))
sitepkgpath = "$SNAP/lib/python3.10/site-packages"
sys.path.append(os.path.expandvars(sitepkgpath))

sys.path.append(os.path.expandvars(
    "/snap/checkbox22/current/usr/lib/python3/dist-packages"))
runtimepath = "/snap/checkbox22/current/lib/python3.10/site-packages"
sys.path.append(os.path.expandvars(runtimepath))

try:
    from checkbox_support.snap_utils.config import update_configuration
    from checkbox_support.snap_utils.config import print_checkbox_conf
except ImportError:
    msg = """
checkbox-support not found!
You need to install the checkbox22 snap:

snap install checkbox22
"""
    print(os.path.expandvars(msg), file=sys.stderr)
    sys.exit(1)


def main():
    # we need to run as root to be able to write to /var/snap/...
    if os.geteuid() != 0:
        print('You have to run this command with sudo')
        return

    if len(sys.argv) > 1 and sys.argv[1] == '-l':
        print_checkbox_conf()
        return

    key_re = re.compile(r"^(?:[A-Z0-9]+_?)*[A-Z](?:_?[A-Z0-9])*$")
    vars_to_set = dict()
    for pair in sys.argv[1:]:
        k, _, v = pair.partition('=')
        if not key_re.match(k) or not v:
            raise SystemExit("'%s' is not a valid configuration entry. "
                             "Should be KEY=val" % pair)
        k = k.replace('_', '-').lower()
        vars_to_set[k] = v
    update_configuration(vars_to_set)


if __name__ == '__main__':
    main()
56 changes: 56 additions & 0 deletions contrib/checkbox-dss-validation/bin/install-deps
@@ -0,0 +1,56 @@
#!/bin/bash
set -e

echo -e "\nStep 1/5: Installing microk8s snap"
sudo snap install microk8s --channel 1.28/stable --classic

USER=$(id -nu ${SNAP_UID})
HOME=${SNAP_REAL_HOME}

# microk8s commands run from tests are run without sudo
sudo usermod -a -G microk8s $USER
# Directory needed for sharing microk8s config with kubectl snap
mkdir -p $HOME/.kube

echo -e "\nStep 2/5: Configuring microk8s addons"
sudo microk8s status --wait-ready
# Give microk8s another minute to stabilize
# to avoid intermittent failures when
# enabling hostpath-storage
echo "Giving microk8s a minute to stabilize..."
sleep 60
sudo microk8s enable hostpath-storage
sudo microk8s enable dns
sudo microk8s enable rbac

echo "Waiting for microk8s addons to become ready..."
sudo microk8s.kubectl wait \
    --for=condition=available \
    --timeout 1800s \
    -n kube-system \
    deployment/coredns \
    deployment/hostpath-provisioner
sudo microk8s.kubectl -n kube-system rollout status ds/calico-node

# This is needed to overcome the following bug within microk8s:
# https://github.com/canonical/microk8s/issues/4453
echo -e "\nStep 3/5: Installing kubectl snap"
sudo snap install kubectl --classic --channel=1.29/stable
# hack as redirecting stdout anywhere but /dev/null throws a permission denied error
# see: https://forum.snapcraft.io/t/eksctl-cannot-write-to-stdout/17254/4
sudo microk8s.kubectl config view --raw | tee $HOME/.kube/config > /dev/null

# intel_gpu_top command used for host-level GPU check
# jq used for cases where jsonpath is insufficient for parsing json results
echo -e "\nStep 4/5: Installing intel-gpu-tools and jq"
# Set DEBIAN_FRONTEND on the sudo command line so it reaches apt
sudo DEBIAN_FRONTEND=noninteractive apt install -y intel-gpu-tools jq

echo -e "\nStep 5/5: Installing data-science-stack snap"
optional_arg=$1
if [ "${optional_arg}" = "--dss-snap-channel=latest/edge" ]; then
    echo "Installing from edge"
    sudo snap install data-science-stack --channel latest/edge
else
    echo "Installing from stable"
    sudo snap install data-science-stack --channel latest/stable
fi
53 changes: 53 additions & 0 deletions contrib/checkbox-dss-validation/bin/install-full-deps
@@ -0,0 +1,53 @@
#!/bin/bash

RELEASE=$(lsb_release -sc)
REQUIRED_DEPENDENCIES="make clinfo"
## XXX: g++ version is really a factor of the cuda toolkit and we shouldn't
# depend on what is the release default
case ${RELEASE} in
    jammy )
        REQUIRED_DEPENDENCIES="${REQUIRED_DEPENDENCIES} g++ ocl-icd-libopencl1 build-essential libpng-dev libboost-all-dev libva-dev unzip cmake"
        ;;
    noble )
        REQUIRED_DEPENDENCIES="${REQUIRED_DEPENDENCIES} g++ ocl-icd-libopencl1 build-essential libpng-dev libboost-all-dev libva-dev unzip cmake"
        ;;
    * )
        # Stop here instead of continuing the install on an unsupported release
        echo "Unsupported OS version. Use Ubuntu jammy or noble" >&2
        exit 1
        ;;
esac

# Downloads for OpenCL version 24.13.29138.7
IGC_CORE_DEB="https://github.com/intel/intel-graphics-compiler/releases/download/igc-1.0.16510.2/intel-igc-core_1.0.16510.2_amd64.deb"
IGC_OCL_DEB="https://github.com/intel/intel-graphics-compiler/releases/download/igc-1.0.16510.2/intel-igc-opencl_1.0.16510.2_amd64.deb"
LEVEL_ZERO_DDEB="https://github.com/intel/compute-runtime/releases/download/24.13.29138.7/intel-level-zero-gpu-dbgsym_1.3.29138.7_amd64.ddeb"
LEVEL_ZERO_DEB="https://github.com/intel/compute-runtime/releases/download/24.13.29138.7/intel-level-zero-gpu_1.3.29138.7_amd64.deb"
OPENCL_DDEB="https://github.com/intel/compute-runtime/releases/download/24.13.29138.7/intel-opencl-icd-dbgsym_24.13.29138.7_amd64.ddeb"
OPENCL_DEB="https://github.com/intel/compute-runtime/releases/download/24.13.29138.7/intel-opencl-icd_24.13.29138.7_amd64.deb"
GMMLIB_DEB="https://github.com/intel/compute-runtime/releases/download/24.13.29138.7/libigdgmm12_22.3.18_amd64.deb"
CHECKSUM_URL="https://github.com/intel/compute-runtime/releases/download/24.13.29138.7/ww13.sum"

# Install other needed packages
sudo DEBIAN_FRONTEND=noninteractive apt-get -y update
sudo DEBIAN_FRONTEND=noninteractive apt-get install -y ${REQUIRED_DEPENDENCIES}

# Fetch and install the requested OPENCL version
wget ${IGC_CORE_DEB}
wget ${IGC_OCL_DEB}
wget ${LEVEL_ZERO_DDEB}
wget ${LEVEL_ZERO_DEB}
wget ${OPENCL_DDEB}
wget ${OPENCL_DEB}
wget ${GMMLIB_DEB}
wget ${CHECKSUM_URL}
sha256sum -c *.sum
sudo DEBIAN_FRONTEND=noninteractive dpkg -i *.deb *.ddeb

COMPUTE_SAMPLES_DIR=/tmp/compute-samples
git clone https://github.com/mckees/compute-samples $COMPUTE_SAMPLES_DIR
cd $COMPUTE_SAMPLES_DIR && ./scripts/install/install_ubuntu_20_04.sh
mkdir build
cd build && cmake ..
cmake --build .
cmake --build . --target install

exit 0
9 changes: 9 additions & 0 deletions contrib/checkbox-dss-validation/bin/remove-deps
@@ -0,0 +1,9 @@
#!/bin/bash

for node in $(sudo microk8s.kubectl get nodes -o name); do
    sudo microk8s.kubectl drain --ignore-daemonsets --delete-emptydir-data "${node}"
done
sudo snap remove microk8s --purge
sudo snap remove kubectl --purge
sudo snap remove data-science-stack --purge
sudo delgroup microk8s
4 changes: 4 additions & 0 deletions contrib/checkbox-dss-validation/bin/shell-wrapper
@@ -0,0 +1,4 @@
#!/bin/bash

echo "$SNAP_NAME runtime shell, type 'exit' to quit the session"
exec bash
18 changes: 18 additions & 0 deletions contrib/checkbox-dss-validation/bin/validate-intel-gpu
@@ -0,0 +1,18 @@
#!/usr/bin/env -S checkbox-cli-wrapper remote 127.0.0.1
[launcher]
app_id = com.canonical.contrib.dss-validation:checkbox
launcher_version = 1
stock_reports = text, submission_files

[test plan]
unit = com.canonical.contrib::dss-validation
forced = yes

[test selection]
forced = yes

[ui]
type = silent
auto_retry = yes
max_attempts = 10
delay_before_retry = 10