CDI supports automating OS image import, poll, and update, keeping OS images up to date according to the given `schedule`. The first time a `DataImportCron` is scheduled, the controller imports the source image. On any following scheduled poll, if the source image digest (sha256) has changed, the controller imports it to a new source in the `DataImportCron` namespace and updates the managed `DataSource` to point to the newly created source. A garbage collector (`garbageCollect: Outdated`, enabled by default) is responsible for keeping the last `importsToKeep` (3 by default) imported sources per `DataImportCron` and deleting older ones.
See the design doc in the kubevirt/community design proposals for more details.
```yaml
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataImportCron
metadata:
  name: fedora-image-import-cron
  namespace: golden-images
spec:
  template:
    spec:
      source:
        registry:
          url: "docker://quay.io/kubevirt/fedora-cloud-registry-disk-demo:latest"
          pullMethod: node
          certConfigMap: some-certs
      storage:
        resources:
          requests:
            storage: 5Gi
        storageClassName: hostpath-provisioner
  schedule: "30 1 * * 1"
  garbageCollect: Outdated
  importsToKeep: 2
  managedDataSource: fedora
```
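Once the cron has run, the imported sources and the managed `DataSource` can be inspected with standard `kubectl` commands. A minimal sketch, using the resource names from the example above (the generated source names are chosen by the controller, so the exact output will differ):

```bash
# List the sources imported so far in the DataImportCron namespace
kubectl get datavolumes -n golden-images

# Show the managed DataSource; its spec.source should point at the
# most recently imported source
kubectl get datasource fedora -n golden-images -o yaml
```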
A `DataVolume` can use a `sourceRef` referring to a `DataSource` instead of a `source`, so whenever it is created it will use the latest imported source, similarly to specifying `dv.spec.source`.
```yaml
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: fedora-ref
  namespace: golden-images
spec:
  sourceRef:
    kind: DataSource
    name: fedora
  storage:
    resources:
      requests:
        storage: 5Gi
    storageClassName: hostpath-provisioner
```
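As a usage sketch, such a `DataVolume` can simply be applied and watched until it is ready (assuming the manifest above is saved as `fedora-ref.yaml`, a hypothetical file name):

```bash
# Create the DataVolume; CDI resolves the sourceRef to the latest imported source
kubectl apply -f fedora-ref.yaml

# Watch progress until the DataVolume reports Succeeded
kubectl get datavolume fedora-ref -n golden-images -w
```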
With `pullMethod: node`, import from an OpenShift `imageStream` instead of a `url` is also supported:
```yaml
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataImportCron
metadata:
  name: rhel8-image-import-cron
  namespace: openshift-virtualization-os-images
spec:
  template:
    spec:
      source:
        registry:
          imageStream: rhel8-is
          pullMethod: node
      storage:
        resources:
          requests:
            storage: 5Gi
        storageClassName: hostpath-provisioner
  schedule: "0 0 * * 5"
  importsToKeep: 4
  managedDataSource: rhel8
```
Currently the `ImageStream` is assumed to be in the same namespace as the `DataImportCron`.
To create an `ImageStream` one can use, for example:

- `oc import-image rhel8-is -n openshift-virtualization-os-images --from=registry.redhat.io/rhel8/rhel-guest-image --scheduled --confirm`
- `oc set image-lookup rhel8-is -n openshift-virtualization-os-images`

Or on CRC:

- `oc import-image cirros-is -n openshift-virtualization-os-images --from=kubevirt/cirros-container-disk-demo --scheduled --confirm`
- `oc set image-lookup cirros-is -n openshift-virtualization-os-images`
More information on image streams is available in the OpenShift documentation.
A `PVC` from any namespace can also be the source of a `DataImportCron`. The source digest is based on the `PVC` `UID`, which is polled according to the schedule, so when a new `PVC` is detected it will be imported.
```yaml
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataImportCron
metadata:
  name: pvc-import-cron
  namespace: ns1
spec:
  template:
    spec:
      source:
        pvc:
          name: my-pvc
          namespace: ns2
  ...
```
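To verify that the cron has picked up the latest `PVC`, its status can be inspected; a minimal sketch assuming the example above (status fields and condition names may vary by CDI version):

```bash
# Inspect the cron's status, including its conditions and the
# currently imported source it tracks
kubectl get dataimportcron pvc-import-cron -n ns1 -o yaml
```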
A `DataImportCron` source can be maintained in one of two formats:

- `PersistentVolumeClaim`
- `VolumeSnapshot`

DataImportCron was originally designed to maintain only PVC sources. However, for certain storage types, snapshot sources are known to scale better. Some details and examples can be found in the clone-from-volumesnapshot-source documentation.
This provisioner-specific information is kept on the `StorageProfile` object of each provisioner, in the `dataImportCronSourceFormat` field (possible values are `snapshot`/`pvc`), which tells the DataImportCron which type of source is preferred for the provisioner. Some provisioners, like Ceph RBD, are opted in automatically.
To opt in manually, one must edit the `StorageProfile`:
```yaml
apiVersion: cdi.kubevirt.io/v1beta1
kind: StorageProfile
metadata:
  ...
spec:
  dataImportCronSourceFormat: snapshot
```
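As a one-liner, the same opt-in can be applied with a patch; a sketch, assuming the `StorageProfile` is named after your storage class, here the hypothetical `my-storage-class`:

```bash
# Switch the preferred DataImportCron source format for this provisioner to snapshots
kubectl patch storageprofile my-storage-class --type merge \
  -p '{"spec": {"dataImportCronSourceFormat": "snapshot"}}'
```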
To ensure a smooth transition, existing DataImportCrons can be switched to maintaining snapshots instead of PVCs by updating their corresponding storage profiles.
Unless specified explicitly, DataImportCrons, similarly to PVCs, will be provisioned using the default virt/k8s storage class. In previous versions, an admin had to actively delete the old sources upon a change of the storage class (either explicitly, by editing the DataImportCron, or via a cluster-wide change of the default storage class). Today, the controller performs this cleanup automatically. However, changing the storage class should be a conscious decision, and in some cases (e.g. complex CI setups) it is advised to specify it explicitly, to avoid exercising a different storage class for golden images throughout installation; such a flip-flop could be costly and, in some cases, outright surprising to cluster admins.
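To pin the storage class explicitly, set `storageClassName` in the cron's template `storage` section, as in the examples above. Alternatively, a cluster-wide default for virtualization workloads can be designated via the `storageclass.kubevirt.io/is-default-virt-class` annotation; a sketch, assuming a hypothetical storage class named `my-virt-sc`:

```bash
# Mark my-virt-sc as the default storage class for virt workloads;
# DataImportCrons without an explicit storageClassName will use it
kubectl annotate storageclass my-virt-sc storageclass.kubevirt.io/is-default-virt-class=true
```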