Merge pull request #433 from flatcar/sayan/add-nvidia-test
Add tests for the NVIDIA GPU drivers
sayanchowdhury authored Jun 16, 2023
2 parents e6e0047 + 4cd37d7 commit 4771cb7
Showing 2 changed files with 54 additions and 1 deletion.
3 changes: 2 additions & 1 deletion CHANGELOG.md
@@ -11,6 +11,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
- `DefaultUser` parameter when registering a test to use a user different from `core` ([#424](https://github.com/flatcar/mantle/pull/424))
- `systemd.sysext.custom-oem` for testing the activation of the OEM sysext image ([#423](https://github.com/flatcar/mantle/pull/423))
- Kubernetes 1.27 tests ([#441](https://github.com/flatcar/mantle/pull/441))
- Tests for the installation/integrity of the NVIDIA drivers ([#433](https://github.com/flatcar/mantle/pull/433))

### Changed

@@ -67,7 +68,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
- Ignition v3 support and tests ([#301](https://github.com/flatcar-linux/mantle/pull/301), [#311](https://github.com/flatcar-linux/mantle/pull/311))
- Butane config support ([#318](https://github.com/flatcar-linux/mantle/pull/318))
- GCP: support testing with GVNIC ([#322](https://github.com/flatcar-linux/mantle/pull/322))
- `networkd` Ignition translation test ([#344](https://github.com/flatcar-linux/mantle/pull/334))
- kola test `cl.misc.falco` that tests falco kmod building ([#339](https://github.com/flatcar-linux/mantle/pull/339))
- Kubernetes test for release 1.24.1 ([#337](https://github.com/flatcar-linux/mantle/pull/337))
- Added storage abstraction for Equinix Metal tests (SSH can be used in addition to Google Cloud Storage) ([#340](https://github.com/flatcar-linux/mantle/pull/340))
52 changes: 52 additions & 0 deletions kola/tests/misc/nvidia.go
@@ -0,0 +1,52 @@
package misc

import (
	"bytes"
	"fmt"
	"time"

	"github.com/coreos/pkg/capnslog"
	"github.com/flatcar/mantle/kola"
	"github.com/flatcar/mantle/kola/cluster"
	"github.com/flatcar/mantle/kola/register"
	"github.com/flatcar/mantle/util"
)

const (
	CmdTimeout = time.Second * 300
)

var plog = capnslog.NewPackageLogger("github.com/flatcar/mantle", "kola/tests/misc")

func init() {
	register.Register(&register.Test{
		Name:        "cl.misc.nvidia",
		Run:         verifyNvidiaInstallation,
		ClusterSize: 1,
		Distros:     []string{"cl"},
		// Verifies the NVIDIA driver installation; limited to Azure for now.
		Platforms:     []string{"azure"},
		Architectures: []string{"amd64"},
		Flags:         []register.Flag{register.NoEnableSelinux},
	})
}

func verifyNvidiaInstallation(c cluster.TestCluster) {
	if kola.AzureOptions.Size != "Standard_NC6s_v3" {
		c.Skip("skipping due to wrong instance size")
	}
	m := c.Machines()[0]

	nvidiaStatusRetry := func() error {
		out, err := c.SSH(m, "systemctl is-active nvidia.service")
		if !bytes.Contains(out, []byte("inactive")) {
			return fmt.Errorf("nvidia.service: %q: %v", out, err)
		}
		return nil
	}

	if err := util.Retry(40, 15*time.Second, nvidiaStatusRetry); err != nil {
		c.Fatal(err)
	}
	c.AssertCmdOutputContains(m, "/opt/bin/nvidia-smi", "Tesla")
}
