Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v1.14.0 Backup error when emptyDir pod volumes are attached #7929

Closed
phoenix-bjoern opened this issue Jun 26, 2024 · 7 comments · Fixed by #7967
Closed

v1.14.0 Backup error when emptyDir pod volumes are attached #7929

phoenix-bjoern opened this issue Jun 26, 2024 · 7 comments · Fixed by #7967

Comments

@phoenix-bjoern
Copy link

What steps did you take and what happened:
Backing up a StatefulSet with an emptyDir attached reports an error during backup:

message: /fail to get PVC for pod integration/redis-master-0 error: /volume integration/redis-tmp-conf has no PVC associated with it
message: /fail to get PVC for pod integration/redis-master-0 error: /volume integration/tmp has no PVC associated with it
message: /fail to get PVC for pod integration/redis-master-0 error: /volume integration/redis-data has no PVC associated with it
name: /redis-master-0 message: /Error backing up item error: /volume integration/redis-tmp-conf has no PVC associated with it
name: /redis-master-0 message: /Error backing up item error: /volume integration/tmp has no PVC associated with it
name: /redis-master-0 message: /Error backing up item error: /volume integration/redis-data has no PVC associated with it

This is the (shortend) resource YAML:

apiVersion: v1
kind: Pod
metadata:
  ...
  name: redis-master-0
  namespace: integration
spec:
  containers:
    - args:
        - '-c'
        - /opt/bitnami/scripts/start-scripts/start-master.sh
      command:
        - /bin/bash
      env:
        - name: BITNAMI_DEBUG
          value: 'false'
        - name: REDIS_REPLICATION_MODE
          value: master
        - name: ALLOW_EMPTY_PASSWORD
          value: 'yes'
        - name: REDIS_TLS_ENABLED
          value: 'no'
        - name: REDIS_PORT
          value: '6379'
      image: docker.io/bitnami/redis:6.2.7-debian-11-r11
      imagePullPolicy: IfNotPresent
      ...
      volumeMounts:
        - mountPath: /opt/bitnami/scripts/start-scripts
          name: start-scripts
        - mountPath: /health
          name: health
        - mountPath: /data
          name: redis-data
        - mountPath: /opt/bitnami/redis/mounted-etc
          name: config
        - mountPath: /opt/bitnami/redis/etc/
          name: redis-tmp-conf
        - mountPath: /tmp
          name: tmp
        - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
          name: kube-api-access-lpnb6
          readOnly: true
  ...
  volumes:
    - configMap:
        defaultMode: 493
        name: redis-scripts
      name: start-scripts
    - configMap:
        defaultMode: 493
        name: redis-health
      name: health
    - configMap:
        defaultMode: 420
        name: redis-configuration
      name: config
    - emptyDir: {}
      name: redis-tmp-conf
    - emptyDir: {}
      name: tmp
    - emptyDir: {}
      name: redis-data

What did you expect to happen:

The code to extend the volume policies seems to expect a PVC:

return nil, errors.Errorf("volume %s/%s has no PVC associated with it", pod.Namespace, vol.Name)

Reporting a missing PVC for an emptyDir doesn't make sense. Instead emptyDirs should be skipped by the code.

The following information will help us better understand what's going on:

The new Volume Policy example provides a snippet for emptyDirs: https://velero.io/docs/main/resource-filtering/#yaml-template

So adding this to a Volume Policy resolves the issue:

- conditions:
    volumeTypes:
      - emptyDir
      - configmap
  action:
    type: skip

Maybe it is intended to make this mandatory. But I can not imagine a situation where it makes sense to actually backup an emptyDir. And adding "ConfigMap" is non-sense, as ConfigMaps are resources which are backed up by Velero anyway (if not excluded explicitly).

If the core team decides that every user should add this condition to their VolumePolicies, I'm fine with closing this issue. But maybe this mandatory change should be mentioned in the release notes then. However, I would propose to not throw an error for emptyDir volumes as such scratch volumes are never be expected to be backed up.

@blackpiglet
Copy link
Contributor

What's the version of Velero you are using?

@blackpiglet blackpiglet added the Needs info Waiting for information label Jun 26, 2024
@blackpiglet blackpiglet self-assigned this Jun 26, 2024
@phoenix-bjoern
Copy link
Author

phoenix-bjoern commented Jun 26, 2024

@blackpiglet Velero v1.14.0 as mentioned in the title ;-)

@phoenix-bjoern phoenix-bjoern changed the title v1.14.0 v1.14.0 Backup error when emptyDir pod volumes are attached Jun 26, 2024
@phoenix-bjoern
Copy link
Author

I apologize, I was wrong in regards to the volume policy (maybe I was misguided by the documentation here):

The problem is actually in the shouldIncludeVolumeInBackup where volume types like hostPath, Secrets and ConfigMaps get excluded, but not emptyDir: https://github.com/vmware-tanzu/velero/blame/c827fd0c6b55aa0068c6d99024c6ca5eca78ffa3/internal/volumehelper/volume_policy_helper.go#L214
The volume policy for the types does not apply as the util method GetPVCForPodVolume instantly returns an error ( https://github.com/vmware-tanzu/velero/blame/c827fd0c6b55aa0068c6d99024c6ca5eca78ffa3/internal/volumehelper/volume_policy_helper.go#L146) and the method exits.

So, IMHO this is a bug as an emptyDir will never have a PVC. The bug can be resolved by extending shouldIncludeVolumeInBackup and add smth like this:

	// cannot backup emptyDir volumes as they are not mounted into /var/lib/kubelet/pods
	// and therefore not accessible to the node agent daemon set.
	if vol.EmptyDir != nil {
		includeVolumeInBackup = false
	}

@blackpiglet blackpiglet added Needs investigation Area/Filters and removed Needs info Waiting for information labels Jun 27, 2024
@reitermarkus
Copy link

Also seeing the same errors on 0.14.0 when using a resource policy.

@phoenix-bjoern
Copy link
Author

@blackpiglet I've tested your fix in the latest velero/velero:release-1.14-dev image and can confirm the issue has been resolved. Thank you!

@DanielJuravski
Copy link

Hi @blackpiglet, any ETA for 1.14.1 release?

@blackpiglet
Copy link
Contributor

The plan is to release it this August, but the detailed date is not set yet.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants