"unable to locate ObjectStore plugin named velero.io/aws" #608

Closed
absurd-wombat opened this issue Jul 24, 2024 · 1 comment

Comments

@absurd-wombat

What steps did you take and what happened:
After upgrading the Velero Helm chart, restoring namespaces/resources from backups no longer works. Rolling back to the previous version works as a temporary fix.

Velero helm chart was installed via helmfile. I upgraded the chart version from 5.2.1 (velero version 1.12.3) to 7.1.2 (velero version 1.14.0) and the velero-plugin-for-aws from 1.8.2 to 1.10.0. I also removed the initContainer for the velero-plugin-for-csi (this is built in for chart versions >= 7.0.0):

  - name: velero
    namespace: velero
    chart: vmware-tanzu/velero
    version: 7.1.2
    values:
      - resources: # the following values were originally taken over from the official chart: requests: CPU: 500m, Memory: 128Mi; limits: CPU: 1, Memory: 512Mi
          requests:
            cpu: 500m
            memory: 500Mi
          limits:
            memory: 1Gi
      - configuration:
          backupStorageLocation:
          - name: default
            provider: aws
            bucket: {{ .Values.awsBucketName }}
            config:
              region: eu-central-1
          volumeSnapshotLocation:
          - name: default
            provider: aws
            config:
              region: eu-central-1
          logFormat: json
          features: EnableCSI
        serviceAccount:
          server:
            annotations:
              eks.amazonaws.com/role-arn: {{ .Values.awsVeleroRole }}
        credentials:
          useSecret: false
        securityContext:
          fsGroup: 65534
        initContainers:
        - name: velero-plugin-for-aws
          image: velero/velero-plugin-for-aws:v1.10.0
          volumeMounts:
          - mountPath: /target
            name: plugins

I created a backup of a namespace using velero:
velero create backup <backup name> --include-namespaces=<namespace name> --ttl 12h --wait

When I attempted to restore the namespace from the backup, the velero pod crashed and the restore phase changed to "Failed" (see the command output section below):
kubectl delete namespace <namespace name>
velero restore create --from-backup <backup name>

Logs on the velero server show an error in getting the velero-plugin-for-aws:
velero server --log-level debug

ERRO[0007] Current BackupStorageLocations available/unavailable/unknown: 0/1/0, BackupStorageLocation "default" is unavailable: unable to locate ObjectStore plugin named velero.io/aws) controller=backup-storage-location logSource="pkg/controller/backup_storage_location_controller.go:178"

What did you expect to happen:
Expected the namespace to be recreated from the backup

The output of the following commands will help us better understand what's going on:
Following command is recommended after running a "velero restore create" to monitor progress
velero restore describe <restore name>
above command returns the following output:

Name:         manual-pgp-api-test-2024-07-24-velero-v1.14.0-20240724121031
Namespace:    velero
Labels:       <none>
Annotations:  <none>

Phase:  Failed (run 'velero restore logs manual-pgp-api-test-2024-07-24-velero-v1.14.0-20240724121031' for more information)

Started:    2024-07-24 12:10:31 +0200 CEST
Completed:  2024-07-24 12:11:04 +0200 CEST

Backup:  manual-pgp-api-test-2024-07-24-velero-v1.14.0

Namespaces:
  Included:  all namespaces found in the backup
  Excluded:  <none>

Resources:
  Included:        *
  Excluded:        nodes, events, events.events.k8s.io, backups.velero.io, restores.velero.io, resticrepositories.velero.io, csinodes.storage.k8s.io, volumeattachments.storage.k8s.io, backuprepositories.velero.io
  Cluster-scoped:  auto

Namespace mappings:  <none>

Label selector:  <none>

Or label selector:  <none>

Restore PVs:  auto

Existing Resource Policy:   <none>
ItemOperationTimeout:       4h0m0s

Preserve Service NodePorts:  auto

Uploader config:

Following command is also recommended after running a "velero restore create" to monitor progress:
velero restore logs <restore name>
This returns the following:
An error occurred: file not found

The aws plugin appears to have been loaded by velero.
velero plugin get
returns a list of plugins that includes velero.io/aws as both ObjectStore and VolumeSnapshotter:

NAME                                            KIND
velero.io/crd-remap-version                     BackupItemAction
velero.io/crd-remap-version                     BackupItemAction
velero.io/pod                                   BackupItemAction
velero.io/pod                                   BackupItemAction
velero.io/pv                                    BackupItemAction
velero.io/pv                                    BackupItemAction
velero.io/service-account                       BackupItemAction
velero.io/service-account                       BackupItemAction
velero.io/csi-pvc-backupper                     BackupItemActionV2
velero.io/csi-volumesnapshot-backupper          BackupItemActionV2
velero.io/csi-volumesnapshotclass-backupper     BackupItemActionV2
velero.io/csi-volumesnapshotcontent-backupper   BackupItemActionV2
velero.io/csi-volumesnapshot-delete             DeleteItemAction
velero.io/csi-volumesnapshotcontent-delete      DeleteItemAction
velero.io/dataupload-delete                     DeleteItemAction
velero.io/aws                                   ObjectStore
velero.io/add-pv-from-pvc                       RestoreItemAction
velero.io/add-pv-from-pvc                       RestoreItemAction
velero.io/add-pvc-from-pod                      RestoreItemAction
velero.io/add-pvc-from-pod                      RestoreItemAction
velero.io/admission-webhook-configuration       RestoreItemAction
velero.io/admission-webhook-configuration       RestoreItemAction
velero.io/apiservice                            RestoreItemAction
velero.io/apiservice                            RestoreItemAction
velero.io/change-image-name                     RestoreItemAction
velero.io/change-image-name                     RestoreItemAction
velero.io/change-pvc-node-selector              RestoreItemAction
velero.io/change-pvc-node-selector              RestoreItemAction
velero.io/change-storage-class                  RestoreItemAction
velero.io/change-storage-class                  RestoreItemAction
velero.io/cluster-role-bindings                 RestoreItemAction
velero.io/cluster-role-bindings                 RestoreItemAction
velero.io/crd-preserve-fields                   RestoreItemAction
velero.io/crd-preserve-fields                   RestoreItemAction
velero.io/dataupload                            RestoreItemAction
velero.io/dataupload                            RestoreItemAction
velero.io/init-restore-hook                     RestoreItemAction
velero.io/init-restore-hook                     RestoreItemAction
velero.io/job                                   RestoreItemAction
velero.io/job                                   RestoreItemAction
velero.io/pod                                   RestoreItemAction
velero.io/pod                                   RestoreItemAction
velero.io/pod-volume-restore                    RestoreItemAction
velero.io/pod-volume-restore                    RestoreItemAction
velero.io/role-bindings                         RestoreItemAction
velero.io/role-bindings                         RestoreItemAction
velero.io/secret                                RestoreItemAction
velero.io/secret                                RestoreItemAction
velero.io/service                               RestoreItemAction
velero.io/service                               RestoreItemAction
velero.io/service-account                       RestoreItemAction
velero.io/service-account                       RestoreItemAction
velero.io/csi-pvc-restorer                      RestoreItemActionV2
velero.io/csi-volumesnapshot-restorer           RestoreItemActionV2
velero.io/csi-volumesnapshotclass-restorer      RestoreItemActionV2
velero.io/csi-volumesnapshotcontent-restorer    RestoreItemActionV2
velero.io/aws                                   VolumeSnapshotter

Anything else you would like to add:
Increasing limits on the namespace/pods did not help the issue.

Environment:

  • helm version (use helm version): 3.15.3
  • helmfile version: 0.157.0
  • helm chart version and app version (use helm list -n <YOUR NAMESPACE>): chart-version: velero-7.1.2, app-version: 1.14.0
  • Kubernetes version (use kubectl version): v1.29.4-eks
  • Kubernetes installer & version:
  • Cloud provider or hardware configuration: aws
  • OS (e.g. from /etc/os-release):
@absurd-wombat
Author

It turned out that the cause was still a memory problem after all. From the release notes for v1.13.0:

Velero introduces the informer cache which is enabled by default. The informer cache improves the 
restore performance but may cause higher memory consumption. Increase the memory limit of the 
Velero pod or disable the informer cache by specifying the --disable-informer-cache option when 
installing Velero if you get the OOM error. 

This means we had to either raise the memory limit of the pod drastically above what we were used to, or disable the informer cache by setting the disable-informer-cache option to true.
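
For anyone hitting the same thing, here is a rough sketch of how the two options might look in the helmfile values from the original post. This is not a verified configuration: the 2Gi limit is an arbitrary illustrative number (the value that actually worked is not stated here), and `configuration.disableInformerCache` is assumed to be the chart value that maps to the `--disable-informer-cache` server flag, so check the values.yaml of your chart version before relying on it. Only the changed keys are shown, and either option alone may be enough.

```yaml
values:
  # Option A: give the velero pod more memory headroom for the informer cache.
  - resources:
      requests:
        cpu: 500m
        memory: 500Mi
      limits:
        memory: 2Gi   # illustrative value only, not taken from the issue
  # Option B: turn the informer cache off (slower restores, lower memory use).
  - configuration:
      disableInformerCache: true   # assumed chart key for --disable-informer-cache
```

If the restore still fails after this, checking whether the velero container was last terminated with reason OOMKilled would confirm or rule out memory as the cause.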
