create cephfs pvc with error 'Operation not permitted' #1818

Closed
deadjoker opened this issue Dec 30, 2020 · 26 comments
Labels: component/cephfs (Issues related to CephFS), wontfix (This will not be worked on)

Comments

@deadjoker

deadjoker commented Dec 30, 2020

Describe the bug

I deployed ceph-csi in Kubernetes and use CephFS to provide PVCs.
PVC creation fails when I use a normal Ceph user but succeeds if I use the admin Ceph user.

Environment details

  • Image/version of Ceph CSI driver : v3.2.0
  • OS version: Ubuntu 20.04.1
  • Kernel version : 5.4.0-58
  • Mounter used for mounting PVC (for cephfs it's fuse or kernel; for rbd it's
    krbd or rbd-nbd) : kernel
  • Kubernetes cluster version : v1.20.0
  • Containerd version: 1.4.3
  • Ceph cluster version : 14.2.8

Steps to reproduce

Steps to reproduce the behavior:

  1. create ceph user
    ceph auth caps client.k8sfs mon 'allow r' mgr 'allow rw' mds 'allow rw' osd 'allow rw tag cephfs *=*'
  2. download yaml from https://github.com/ceph/ceph-csi/tree/release-v3.2/deploy/cephfs/kubernetes
  3. modify ceph information in csi-config-map.yaml
  4. add kms-config.yaml and create from it
---
apiVersion: v1
kind: ConfigMap
data:
  config.json: |-
    {}
metadata:
  name: ceph-csi-encryption-kms-config
  5. add secret.yaml and create from it
---
apiVersion: v1
kind: Secret
metadata:
  name: csi-cephfs-secret
  namespace: ceph-csi
stringData:
  # Required for statically provisioned volumes
  #userID: <plaintext ID>
  #userKey: <Ceph auth key corresponding to ID above>

  # Required for dynamically provisioned volumes
  adminID: k8sfs
  adminKey: AQDuM+xfXz0zNRAAnxeJaWdmR2J5I/QxMR9gLQ==
  6. add storage class
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-cephfs-sc
provisioner: cephfs.csi.ceph.com
parameters:
  # String representing a Ceph cluster to provision storage from.
  # Should be unique across all Ceph clusters in use for provisioning,
  # cannot be greater than 36 bytes in length, and should remain immutable for
  # the lifetime of the StorageClass in use.
  # Ensure to create an entry in the config map named ceph-csi-config, based on
  # csi-config-map-sample.yaml, to accompany the string chosen to
  # represent the Ceph cluster in clusterID below
  clusterID: d9693b9b-8988-44bb-8bf9-ccb2c2733eec

  # CephFS filesystem name into which the volume shall be created
  fsName: cephfs

  # (optional) Ceph pool into which volume data shall be stored
  # pool: cephfs_data

  # (optional) Comma separated string of Ceph-fuse mount options.
  # For eg:
  # fuseMountOptions: debug

  # (optional) Comma separated string of Cephfs kernel mount options.
  # Check man mount.ceph for mount options. For eg:
  # kernelMountOptions: readdir_max_bytes=1048576,norbytes

  # The secrets have to contain user and/or Ceph admin credentials.
  csi.storage.k8s.io/provisioner-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/provisioner-secret-namespace: ceph-csi
  csi.storage.k8s.io/controller-expand-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/controller-expand-secret-namespace: ceph-csi
  csi.storage.k8s.io/node-stage-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/node-stage-secret-namespace: ceph-csi

  # (optional) The driver can use either ceph-fuse (fuse) or
  # ceph kernelclient (kernel).
  # If omitted, default volume mounter will be used - this is
  # determined by probing for ceph-fuse and mount.ceph
  mounter: kernel
reclaimPolicy: Retain
allowVolumeExpansion: true
mountOptions:
  - debug
  7. create pvc
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: csi-cephfs-pvc
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
  storageClassName: csi-cephfs-sc

Actual results

# kubectl get sc
NAME            PROVISIONER           RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
csi-cephfs-sc   cephfs.csi.ceph.com   Retain          Immediate           true                   7h1m

# kubectl get pvc
NAME             STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS    AGE
csi-cephfs-pvc   Pending                                      csi-cephfs-sc   57m

# kubectl get pv
No resources found

Expected behavior

PVC should be created successfully and bound to a PV.

Logs

If the issue is in PVC creation, deletion, cloning please attach complete logs
of below containers.

  • csi-provisioner and csi-rbdplugin/csi-cephfsplugin container logs from the
    provisioner pod.
I1230 09:46:02.448025       1 controller.go:1317] provision "default/csi-cephfs-pvc" class "csi-cephfs-sc": started
I1230 09:46:02.448273       1 event.go:282] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"csi-cephfs-pvc", UID:"4bd0ecf9-d613-4f8f-998d-d0b204e8352d", APIVersion:"v1", ResourceVersion:"1814335", FieldPath:""}): type: 'Normal' reason: 'Provisioning' External provisioner is provisioning volume for claim "default/csi-cephfs-pvc"
I1230 09:46:02.448202       1 controller.go:573] CreateVolumeRequest {Name:pvc-4bd0ecf9-d613-4f8f-998d-d0b204e8352d CapacityRange:required_bytes:5368709120  VolumeCapabilities:[mount:<mount_flags:"debug" > access_mode:<mode:MULTI_NODE_MULTI_WRITER > ] Parameters:map[clusterID:d9693b9b-8988-44bb-8bf9-ccb2c2733eec csi.storage.k8s.io/controller-expand-secret-name:csi-cephfs-secret csi.storage.k8s.io/controller-expand-secret-namespace:ceph-csi csi.storage.k8s.io/node-stage-secret-name:csi-cephfs-secret csi.storage.k8s.io/node-stage-secret-namespace:ceph-csi csi.storage.k8s.io/provisioner-secret-name:csi-cephfs-secret csi.storage.k8s.io/provisioner-secret-namespace:ceph-csi fsName:cephfs mounter:kernel] Secrets:map[] VolumeContentSource:<nil> AccessibilityRequirements:<nil> XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
I1230 09:46:02.467093       1 connection.go:182] GRPC call: /csi.v1.Controller/CreateVolume
I1230 09:46:02.467124       1 connection.go:183] GRPC request: {"capacity_range":{"required_bytes":5368709120},"name":"pvc-4bd0ecf9-d613-4f8f-998d-d0b204e8352d","parameters":{"clusterID":"d9693b9b-8988-44bb-8bf9-ccb2c2733eec","fsName":"cephfs","mounter":"kernel"},"secrets":"***stripped***","volume_capabilities":[{"AccessType":{"Mount":{"mount_flags":["debug"]}},"access_mode":{"mode":5}}]}
I1230 09:46:02.473398       1 connection.go:185] GRPC response: {}
I1230 09:46:02.473461       1 connection.go:186] GRPC error: rpc error: code = Internal desc = rados: ret=-1, Operation not permitted
I1230 09:46:02.473515       1 controller.go:645] CreateVolume failed, supports topology = false, node selected false => may reschedule = false => state = Finished: rpc error: code = Internal desc = rados: ret=-1, Operation not permitted
I1230 09:46:02.473588       1 controller.go:1084] Final error received, removing PVC 4bd0ecf9-d613-4f8f-998d-d0b204e8352d from claims in progress
W1230 09:46:02.473608       1 controller.go:943] Retrying syncing claim "4bd0ecf9-d613-4f8f-998d-d0b204e8352d", failure 16
E1230 09:46:02.473644       1 controller.go:966] error syncing claim "4bd0ecf9-d613-4f8f-998d-d0b204e8352d": failed to provision volume with StorageClass "csi-cephfs-sc": rpc error: code = Internal desc = rados: ret=-1, Operation not permitted
I1230 09:46:02.473699       1 event.go:282] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"csi-cephfs-pvc", UID:"4bd0ecf9-d613-4f8f-998d-d0b204e8352d", APIVersion:"v1", ResourceVersion:"1814335", FieldPath:""}): type: 'Warning' reason: 'ProvisioningFailed' failed to provision volume with StorageClass "csi-cephfs-sc": rpc error: code = Internal desc = rados: ret=-1, Operation not permitted
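
Editor's sketch: most of the log above is request bookkeeping, and the useful part is the handful of lines that carry the gRPC error. A small filter like the following can pull those lines out of a longer log. The function name filter_provision_errors is made up for illustration, and the Deployment, container, and namespace names in the usage comment are assumptions that may not match a given cluster.

```shell
#!/bin/sh
# Reads csi-provisioner log lines on stdin and keeps only the lines that
# report a failed gRPC call or a failed provisioning attempt.
filter_provision_errors() {
  grep -E 'GRPC error:|CreateVolume failed|ProvisioningFailed'
}

# Assumed usage; the Deployment, container, and namespace names below are
# guesses and may differ in your cluster:
#   kubectl -n ceph-csi logs deploy/csi-cephfsplugin-provisioner -c csi-provisioner \
#     | filter_provision_errors
```

The csi-provisioner sidecar only relays the error returned by the driver, so the csi-cephfsplugin container in the same pod is probably the better place to look for the failing call itself.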

Additional context

the ceph user 'k8sfs' caps:

client.k8sfs
	key: AQDuM+xfXz0zNRAAnxeJaWdmR2J5I/QxMR9gLQ==
	caps: [mds] allow rw
	caps: [mgr] allow rw
	caps: [mon] allow r
	caps: [osd] allow rw tag cephfs *=*

This user is also able to create subvolumes and subvolume groups.

# ceph --id k8sfs fs subvolume create cephfs test 
# ceph --id k8sfs fs subvolume ls cephfs
[
    {
        "name": "test"
    }
]

# ceph --id k8sfs fs subvolumegroup ls cephfs
[
    {
        "name": "_nogroup"
    }, 
    {
        "name": "csi"
    }
]

# ceph --id k8sfs fs subvolumegroup create cephfs testgroup
# ceph --id k8sfs fs subvolumegroup ls cephfs
[
    {
        "name": "_nogroup"
    }, 
    {
        "name": "csi"
    }, 
    {
        "name": "testgroup"
    }
]

# ceph --id k8sfs fs subvolume create cephfs testsubvolume csi
# ceph --id k8sfs fs subvolume ls cephfs csi
[
    {
        "name": "testsubvolume"
    }, 
    {
        "name": "csi-vol-eac5a168-4a70-11eb-b23a-8e1756c5ca33"
    }
]

The 'csi' subvolumegroup was created when I used the admin keyring in ceph-csi.

@sgissi

sgissi commented Feb 21, 2021

I came across the same issue. The CephFS user is able to create subvolumegroups and subvolumes, but it fails in the provisioner. A user with full admin rights works without problems. I couldn't find where the call to RADOS is made, so I can't tell which permission is missing or which action causes the problem.

@humblec
Collaborator

humblec commented Feb 22, 2021

@deadjoker @sgissi these are the capabilities we require for the user in a Ceph cluster for Ceph CSI to perform its actions: https://github.com/ceph/ceph-csi/blob/master/docs/capabilities.md . If you still face issues even after granting these permissions, please report back!

@deadjoker
Author

@humblec I followed these docs and still get this error.
See my step 1.

@humblec
Collaborator

humblec commented Feb 22, 2021

Thanks @deadjoker for confirming the setup. @yati1998 are we missing any capabilities in the doc?

@Yuggupta27
Contributor

Hi @deadjoker ,
As per the steps mentioned by you, the user creation is done as per the node plugin capabilities, and the cephFS Provisioner capabilities seem to be missing. This might be the reason why you are unable to provision a volume via the cephfs-provisioner.
Unlike rbd, cephfs has separate capability requirements for node plugin and provisioner as mentioned here.
For solving the issue, you can try creating separate cephfs-plugin and cephfs-provisioner secrets.
Feel free to reach out if the issue still persists :)

@deadjoker
Author

Hi @Yuggupta27
Here are the secrets in my cluster environment.

kubectl get secret -n ceph-csi
NAME                                 TYPE                                  DATA   AGE
cephfs-csi-nodeplugin-token-sx9v2    kubernetes.io/service-account-token   3      97d
cephfs-csi-provisioner-token-xxnrd   kubernetes.io/service-account-token   3      97d
csi-cephfs-secret                    Opaque                                2      97d
default-token-ccmsh                  kubernetes.io/service-account-token   3      105d

Should I use a new Ceph ID with the capabilities

"mon", "allow r",
"mgr", "allow rw",
"osd", "allow rw tag cephfs metadata=*"

and create a csi-cephfs-provisioner-secret for the provisioner?
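
Editor's sketch: splitting the credentials as suggested above would mean two Secret objects, with the StorageClass's provisioner-secret-name and controller-expand-secret-name pointing at one and node-stage-secret-name pointing at the other. The snippet below only renders such manifests so they can be reviewed before applying. The function name render_csi_secret, the Secret names, the Ceph IDs, and the placeholder keys are all hypothetical. Using adminID/adminKey for both Secrets is an assumption carried over from the secret.yaml in the original post and should be checked against the ceph-csi docs.

```shell
#!/bin/sh
# Prints a Secret manifest for ceph-csi on stdout.
# Arguments: <secret name> <namespace> <Ceph ID without "client."> <Ceph key>
render_csi_secret() {
  cat <<EOF
---
apiVersion: v1
kind: Secret
metadata:
  name: $1
  namespace: $2
stringData:
  adminID: $3
  adminKey: $4
EOF
}

# Hypothetical names and placeholder keys, for illustration only.
render_csi_secret csi-cephfs-provisioner-secret ceph-csi csi-cephfs-provisioner '<provisioner key>'
render_csi_secret csi-cephfs-node-secret ceph-csi csi-cephfs-node '<node plugin key>'
```

Later comments in this thread report that splitting the users with these caps did not by itself make provisioning work, so this shows the layout only, not a confirmed fix.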

@alamsyahho

alamsyahho commented May 14, 2021

@deadjoker Did you manage to get the issue resolved? I ran into a very similar error as well and am not sure yet how to resolve it.

@deadjoker
Author

@alamsyahho I have not resolved this issue yet. I'm using the admin account instead.

@alamsyahho

Understood. I will probably have to use the admin account for csi-cephfs as well, then. Thanks for your reply.

@github-actions

github-actions bot commented Sep 4, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs. Thank you for your contributions.

github-actions bot added the wontfix label on Sep 4, 2021
@github-actions

This issue has been automatically closed due to inactivity. Please re-open if this still requires investigation.

@Raboo

Raboo commented Oct 19, 2021

This is still very valid.
I have ceph-csi installed via Rook, and I'm using the Rook scripts to create the Ceph clients:

client.csi-cephfs-node
	caps: [mds] allow rw
	caps: [mgr] allow rw
	caps: [mon] allow r
	caps: [osd] allow rw tag cephfs *=*
client.csi-cephfs-provisioner
	caps: [mgr] allow rw
	caps: [mon] allow r
	caps: [osd] allow rw tag cephfs metadata=*

Trying to provision a CephFS subvolumegroup doesn't work using csi-cephfs-provisioner. However, if I tell the StorageClass to use admin, it works, so either something is missing from these caps or the code does something different when admin is used.

Update: the csi-cephfs-provisioner is able to create subvolume groups

[root@kw-02000cccea2b /]# ceph -n client.csi-cephfs-provisioner --key xxx== -m v2:10.3.60.25:3300 fs subvolumegroup create cephfs test cephfs_data
[root@kw-02000cccea2b /]# ceph -n client.csi-cephfs-provisioner --key xxx== -m v2:10.3.60.25:3300 fs subvolumegroup ls cephfs
[
    {
        "name": "test"
    }
]

@Raboo

Raboo commented Oct 20, 2021

Weirdly enough, this still fails if I give the csi-cephfs-provisioner client the same caps as admin, but it works if I use the admin client.

[client.csi-cephfs-provisioner]
	caps mds = "allow *"
	caps mgr = "allow *"
	caps mon = "allow *"
	caps osd = "allow *"

Madhu-1 reopened this on Oct 20, 2021
github-actions bot removed the wontfix label on Oct 20, 2021
@github-actions

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs. Thank you for your contributions.

github-actions bot added the wontfix label on Nov 19, 2021
@Raboo

Raboo commented Nov 22, 2021

I still wasn't able to solve the problem; I simply worked around it using client.admin, like some other people here.

github-actions bot removed the wontfix label on Nov 22, 2021
@github-actions

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs. Thank you for your contributions.

github-actions bot added the wontfix label on Dec 22, 2021
@github-actions

This issue has been automatically closed due to inactivity. Please re-open if this still requires investigation.

@sf98723

sf98723 commented Mar 2, 2022

@deadjoker, the Ceph capability requirements you provided, from the following link, have to be used in the userID section of the secret, and apply to static provisioning only. The following example explains the meaning of the userID and adminID sections.

If you expect dynamic provisioning behaviour, you have to provide an admin user account, for reasons that are not well documented.

I've faced this issue over the past months: only the client.admin user worked. When I created another admin user, say "client.admin123", with the same capabilities, it didn't work. A few posts are related to this problem, this one for example.

In the last few days, users at work asked us to provide dynamic provisioning for our K8S/Ceph environments.

So, I've tried this evening with an "up to date" config :

  • Kubernetes v1.23.1
  • CSI v3.5.1 (not Helm),
  • and Ceph 16.2.6 (ee28fb57e47e9f88813e24bbf4c14496ca299d31) pacific (stable)

I again created an alternative admin account with the same caps as client.admin and inserted its credentials at adminID. It now works with an alternative admin user!

Here is the user definition and caps for information :

client.admink8s
key: AQBB4............jKSb9Kbjg==
caps: [mds] allow
caps: [mgr] allow *
caps: [mon] allow *
caps: [osd] allow *

Very insecure... We do not want to expose an admin token in the clear in Kubernetes, as we don't use protected secrets yet. At the very least, it would be appreciated not to require write capabilities for the monitors.

Can the development team clarify in the docs directory the minimal caps for an "admin" user for dynamic provisioning? Or explain why it has to be a full admin with write caps for the Ceph mons?

@humblec? I will also look at the code and the detailed Ceph caps in the next few days.

Thanks a lot,

@drummerglen

drummerglen commented May 19, 2022

Hi guys,

I encountered this problem too, but I have resolved it.
The key point was that the adminID and adminKey in the Secret file must be those of admin (client.admin in the Ceph cluster).
Once I re-applied the Secret YAML file, csi-cephfs-sc worked!

I found the doc in ceph-csi/docs/capabilities.md.
There seem to be some issues with user privileges configured through Ceph (using ceph auth caps client.xxx mon 'allow r' osd....mds...mgr...). It doesn't work!

Here is the change

apiVersion: v1
kind: Secret
metadata:
  name: csi-cephfs-secret
  namespace: ceph-csi
stringData:
  # Required for statically provisioned volumes
  #userID: <plaintext ID>
  #userKey:

  # Required for dynamically provisioned volumes
  adminID: k8sfs  # <-- here should be admin (client.admin)
  adminKey: xxxxxxxxxxxxxxxxxxxxxxxxxxxxx==

@Raboo

Raboo commented May 19, 2022

@drummerglen It's not resolved. Your "solution" is exactly what everyone else did to work around the problem, and nothing new. It's even written in the original post:

... but succeed if I use admin ceph user.

It's not a solution/resolution to run as admin/superuser/god-mode, it's just a temporary work-around.
Privilege separation is there for a reason, mainly to reduce the risk of malicious abuse or errors made by code or humans.

@drummerglen

@Raboo Oops, sorry I didn't read every comment. May I ask if any version has resolved this issue?

@Raboo

Raboo commented May 24, 2022

@drummerglen no I don't think so. It seems very hard to figure out why this is happening and probably doesn't affect the majority of the users.

@drummerglen

@Raboo My ceph cluster was deployed by cephadm running on docker. I have no idea if it is the problem.

@alfredogotamagmail

Hi, as of today the issue has still not been resolved. Is a fix in progress, or does nobody really care about this issue? It is very concerning that we need to expose our Ceph superuser credentials to the ceph-csi client; a slight human or backend error might jeopardize the whole Ceph cluster.

@alepiazza

alepiazza commented Jul 23, 2023

Hi, I am unsure if the issue is the same but you might want to look at #2687.
We faced similar issues in crafting the correct caps to let the Ceph provisioner use credentials with restricted access, such as avoiding allow * in all caps or restricting the permissions to a path, fs, or volumes.
The caps suggested at the end of the above issue are working for us, but unfortunately the docs have not been updated yet.

@saeed1gorji

saeed1gorji commented Nov 24, 2024

I was also able to fix this with a non-admin user, using the solution @alepiazza pointed out (this comment), which is now reflected in the updated document.

Here's a simple summary of the things I did on the ceph cluster side to create a filesystem, a subvolumegroup for it, and a user with needed permissions:

# Set variables to use in the commands
USER=k8s
FS_NAME=cephfs-k8s
SUB_VOL=csi
# The actual commands
ceph fs volume create $FS_NAME
ceph fs subvolumegroup create $FS_NAME $SUB_VOL
ceph auth get-or-create client.$USER \
  mgr "allow rw" \
  osd "allow rw tag cephfs metadata=$FS_NAME, allow rw tag cephfs data=$FS_NAME" \
  mds "allow r fsname=$FS_NAME path=/volumes, allow rws fsname=$FS_NAME path=/volumes/$SUB_VOL" \
  mon "allow r fsname=$FS_NAME"

I used the helm chart to install cephfs on the k8s cluster. User credentials were set in adminID and adminKey fields. Tested dynamic provisioning and it worked great.
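
Editor's sketch: one visible difference between the csi-cephfs-provisioner caps listed earlier in this thread and the caps above is that the working user has an mds section. The helper below reports which daemon types have no caps entry at all in a caps listing. The function name missing_cap_sections is made up for illustration. It only detects absent sections and cannot tell whether the caps that are present are sufficient, so it is a quick sanity check rather than a diagnosis.

```shell
#!/bin/sh
# Reads a caps listing on stdin (either the "caps: [mds] allow rw" style or
# the 'caps mds = "allow rw"' style, both of which appear in this thread) and
# prints the daemon types that have no caps entry at all, one per line.
missing_cap_sections() {
  caps="$(cat)"
  for section in mds mgr mon osd; do
    if ! printf '%s\n' "$caps" | grep -Eq "caps:?[[:space:]]*\[?${section}\]?[[:space:]]"; then
      printf '%s\n' "$section"
    fi
  done
}

# Assumed usage against a live cluster (client.k8s is the user created above):
#   ceph auth get client.k8s | missing_cap_sections
```

Fed the client.csi-cephfs-provisioner listing from earlier in the thread, it prints mds. For the user created above, it should print nothing.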
