
Unable to Create Deployment for Memcached example using operator-sdk #3590

Closed · atef23 opened this issue Jul 29, 2020 · 21 comments

Labels: kind/documentation · lifecycle/rotten · triage/needs-information

atef23 commented Jul 29, 2020

Type of question

Setting up a new project with operator-sdk version 0.19.

Question

What did you do?
Created a new Memcached Golang operator using the operator-sdk quickstart guide here:

https://sdk.operatorframework.io/docs/golang/quickstart/

and the stock controller from the operator-sdk source:
https://github.com/operator-framework/operator-sdk/blob/master/example/memcached-operator/memcached_controller.go.tmpl

What did you expect to see?
A Deployment created upon applying the CR for the Memcached CRD.

What did you see instead? Under which circumstances?

The Memcached CRD gets created, and a pod running the controller with my Memcached operator image comes up. But when I apply the CR to create a Memcached instance, which should create a Deployment, I see no events or errors, and no Deployment gets created either.

Environment

  • operator-sdk version:
    v0.19.0

  • Kubernetes version information:

4.3.1

  • Kubernetes cluster kind:

Additional context
I was able to get this working when creating an operator using the old scaffolding from operator-sdk v0.17.

[Screenshot from 2020-07-28 20-17-17]

I see the instances of the Memcached operator, but the controller doesn't create the Deployment. Am I missing something about registering the controller with the manager?

Thanks,

Atef

camilamacedo86 (Contributor) commented Jul 29, 2020

Hi @atef23,

Could you please run the following commands and add the output here?

  • kubectl logs deployment.apps/memcached-operator-controller-manager -n memcached-operator-system -c manager
  • kubectl get all
  • kubectl get all -n memcached-operator-system

PS: I am guessing that you are looking for the CR's Deployment in the operator's namespace. If you ran kubectl apply -f config/samples/<cr-file> without passing -n memcached-operator-system, then in OCP it will be created in the current namespace, which is the one that kubectl get all returns.

atef23 (Author) commented Jul 29, 2020

@camilamacedo86 Thanks for providing those commands. Here is the error I see repeatedly from the reconcile loop:

2020-07-29T01:12:22.390Z  INFO  controllers.Memcached  Creating a new Deployment  {"memcached": "memcached/memcached-sample", "Deployment.Namespace": "memcached", "Deployment.Name": "memcached-sample"}
2020-07-29T01:12:22.456Z  ERROR  controllers.Memcached  Failed to create new Deployment  {"memcached": "memcached/memcached-sample", "Deployment.Namespace": "memcached", "Deployment.Name": "memcached-sample", "error": "deployments.apps \"memcached-sample\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>"}
github.com/go-logr/zapr.(*zapLogger).Error
    /go/pkg/mod/github.com/go-logr/zapr@v0.1.0/zapr.go:128
github.com/example-inc/memcached-operator/controllers.(*MemcachedReconciler).Reconcile
    /workspace/controllers/memcached_controller.go:75
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.6.0/pkg/internal/controller/controller.go:256
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.6.0/pkg/internal/controller/controller.go:232
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.6.0/pkg/internal/controller/controller.go:211
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1
    /go/pkg/mod/k8s.io/apimachinery@v0.18.2/pkg/util/wait/wait.go:155
k8s.io/apimachinery/pkg/util/wait.BackoffUntil
    /go/pkg/mod/k8s.io/apimachinery@v0.18.2/pkg/util/wait/wait.go:156
k8s.io/apimachinery/pkg/util/wait.JitterUntil
    /go/pkg/mod/k8s.io/apimachinery@v0.18.2/pkg/util/wait/wait.go:133
k8s.io/apimachinery/pkg/util/wait.Until
    /go/pkg/mod/k8s.io/apimachinery@v0.18.2/pkg/util/wait/wait.go:90
2020-07-29T01:12:22.456Z  ERROR  controller-runtime.controller  Reconciler error  {"controller": "memcached", "request": "memcached/memcached-sample", "error": "deployments.apps \"memcached-sample\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>"}
github.com/go-logr/zapr.(*zapLogger).Error
    /go/pkg/mod/github.com/go-logr/zapr@v0.1.0/zapr.go:128
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.6.0/pkg/internal/controller/controller.go:258
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.6.0/pkg/internal/controller/controller.go:232
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.6.0/pkg/internal/controller/controller.go:211
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1
    /go/pkg/mod/k8s.io/apimachinery@v0.18.2/pkg/util/wait/wait.go:155
k8s.io/apimachinery/pkg/util/wait.BackoffUntil
    /go/pkg/mod/k8s.io/apimachinery@v0.18.2/pkg/util/wait/wait.go:156
k8s.io/apimachinery/pkg/util/wait.JitterUntil
    /go/pkg/mod/k8s.io/apimachinery@v0.18.2/pkg/util/wait/wait.go:133
k8s.io/apimachinery/pkg/util/wait.Until
    /go/pkg/mod/k8s.io/apimachinery@v0.18.2/pkg/util/wait/wait.go:90

atef23 (Author) commented Jul 29, 2020

The output from oc get all:

[atefaziz@localhost ~]$ oc get all
NAME READY STATUS RESTARTS AGE
pod/memcached-operator-controller-manager-59787dfdd4-88hn7 2/2 Running 0 68m

NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/memcached-operator-controller-manager-metrics-service ClusterIP 172.30.150.61 8443/TCP 68m

NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/memcached-operator-controller-manager 1/1 1 1 68m

NAME DESIRED CURRENT READY AGE
replicaset.apps/memcached-operator-controller-manager-59787dfdd4 1 1 1 68m

atef23 (Author) commented Jul 29, 2020

@camilamacedo86 Here is my source for reference:

https://github.com/atef23/memcached-operator

I took the controller from the example controller in the operator-sdk repository; here it is in my repo:

https://github.com/atef23/memcached-operator/blob/master/controllers/memcached_controller.go

So I wonder if it's something related to RBAC policy?

camilamacedo86 (Contributor) commented Jul 29, 2020

Could you please try adding the following RBAC marker on top of your Reconcile method:

// +kubebuilder:rbac:groups=apps,resources=deployments/finalizers,verbs=get;create;update;patch;delete

Then run make manifests, build and push a new image, and redo the test. Please let us know if it works for you.
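For reference, here is a minimal sketch of where such markers live in the scaffolded controller. This is an illustration, not code from this project: the package layout, struct fields, and the cache.example.com API group follow the v0.19 quickstart scaffolding, and the Reconcile body is elided. Running make manifests has controller-gen turn these markers into rules in config/rbac/role.yaml.

```go
package controllers

import (
	"github.com/go-logr/logr"
	"k8s.io/apimachinery/pkg/runtime"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// MemcachedReconciler reconciles a Memcached object (fields as scaffolded by
// the quickstart; illustrative sketch only).
type MemcachedReconciler struct {
	client.Client
	Log    logr.Logger
	Scheme *runtime.Scheme
}

// The kubebuilder markers directly above Reconcile are read by controller-gen
// on `make manifests` and become rules in config/rbac/role.yaml.
// +kubebuilder:rbac:groups=cache.example.com,resources=memcacheds,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups=cache.example.com,resources=memcacheds/status,verbs=get;update;patch
// +kubebuilder:rbac:groups=apps,resources=deployments,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups=apps,resources=deployments/finalizers,verbs=get;create;update;patch;delete
func (r *MemcachedReconciler) Reconcile(req ctrl.Request) (ctrl.Result, error) {
	// ... reconcile logic from the quickstart goes here ...
	return ctrl.Result{}, nil
}
```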

atef23 (Author) commented Jul 29, 2020

@camilamacedo86 I added that, and now I get this error:

I0729 01:33:38.745704       1 request.go:621] Throttling request took 1.006031601s, request: GET:https://172.30.0.1:443/apis/autoscaling.openshift.io/v1?timeout=32s
2020-07-29T01:33:39.750Z  INFO  controller-runtime.metrics  metrics server is starting to listen  {"addr": "127.0.0.1:8080"}
2020-07-29T01:33:39.750Z  INFO  setup  starting manager
I0729 01:33:39.750919       1 leaderelection.go:242] attempting to acquire leader lease  memcached2/f1c5ece8.example.com...
2020-07-29T01:33:39.750Z  INFO  controller-runtime.manager  starting metrics server  {"path": "/metrics"}
I0729 01:33:39.843538       1 leaderelection.go:252] successfully acquired lease memcached2/f1c5ece8.example.com
2020-07-29T01:33:39.843Z  INFO  controller-runtime.controller  Starting EventSource  {"controller": "memcached", "source": "kind source: /, Kind="}
2020-07-29T01:33:39.843Z  DEBUG  controller-runtime.manager.events  Normal  {"object": {"kind":"ConfigMap","namespace":"memcached2","name":"f1c5ece8.example.com","uid":"d1dd9b99-85f1-41bf-bed3-6778b70a66d0","apiVersion":"v1","resourceVersion":"68902"}, "reason": "LeaderElection", "message": "memcached-operator-controller-manager-59787dfdd4-95ntg_b93a68fe-20f1-4a23-8430-7a284216c314 became leader"}
2020-07-29T01:33:39.943Z  INFO  controller-runtime.controller  Starting EventSource  {"controller": "memcached", "source": "kind source: /, Kind="}
E0729 01:33:39.946783       1 reflector.go:178] pkg/mod/k8s.io/client-go@v0.18.2/tools/cache/reflector.go:125: Failed to list *v1.Deployment: deployments.apps is forbidden: User "system:serviceaccount:memcached2:default" cannot list resource "deployments" in API group "apps" at the cluster scope

Do I need to add a different clusterrole to the service account?

atef23 (Author) commented Jul 29, 2020

@camilamacedo86 I added this role to the manager-rolebinding:

roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin

That works. Do I need to give it the cluster-admin role, or can I get away with granting fewer permissions?

camilamacedo86 (Contributor) commented:

Hi @atef23,

The steps provided expect the logged-in user to have admin permissions. See how to grant yourself cluster-admin privileges, or log in as an admin. (We need to clarify this in the docs.)

Projects are now cluster-scoped by default, which means the logged-in user needs cluster-level permissions to apply the ClusterRoles. However, users are still able to customize their projects to run namespace-scoped. See https://master.sdk.operatorframework.io/docs/building-operators/golang/operator-scope/
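As a rough illustration of that namespace-scoped option (a sketch against the controller-runtime v0.6.x manager.Options used by the v0.19 scaffolding; the operator-scope doc above is the authoritative reference), the manager in the scaffolded main.go can be limited to a single namespace:

```go
// Excerpt-style sketch of main.go: setting Namespace limits the manager's
// cache and watches to one namespace, so the operator only needs Roles and
// RoleBindings there instead of ClusterRoles. Variables such as scheme,
// metricsAddr, enableLeaderElection, and setupLog come from the scaffold.
mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
	Scheme:             scheme,
	MetricsBindAddress: metricsAddr,
	LeaderElection:     enableLeaderElection,
	LeaderElectionID:   "f1c5ece8.example.com",
	Namespace:          "memcached-operator-system", // watch only this namespace
})
if err != nil {
	setupLog.Error(err, "unable to start manager")
	os.Exit(1)
}
```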

atef23 (Author) commented Jul 29, 2020

@camilamacedo86 I'll take a look there. Thanks for your help with this. I'll close this issue.

camilamacedo86 (Contributor) commented Jul 29, 2020

Let's keep this open to make it clear that this point needs to be addressed in the docs.

camilamacedo86 added the kind/documentation label on Jul 29, 2020
camilamacedo86 self-assigned this on Jul 30, 2020
estroz added the triage/needs-information label on Aug 3, 2020
estroz modified the milestones: Backlog, v1.0.0 on Aug 3, 2020
joelanford (Member) commented:

I'm going to move this out of the 1.0.0 milestone. A few comments and clarifications on the above discussion though.

  1. This seems to be an issue of permissions that the operator's service account is lacking, NOT an issue with the operator installer's permissions. While the installer does need permission to create CRDs, namespaces, cluster roles, cluster role bindings, etc., that doesn't seem to be the problem here.
  2. @atef23 I took a brief look, and it seems like your project has all of the RBAC configuration necessary for the operator to run correctly. Are you deploying this operator with make deploy, or are you deploying it via the bundle you generated? If via the bundle, there have been some bugs fixed since 0.19 that resolve RBAC issues in generated packagemanifests and bundles.

Can you retry the latest quickstart using v1.0.0-alpha.2 or a build from master?

joelanford modified the milestones: v1.0.0, Backlog on Aug 6, 2020
atef23 (Author) commented Aug 10, 2020

@joelanford Thanks for taking a look. I was deploying it from a bundle I generated using make bundle. I'll try out a build with the SDK from master.

gaodan-fang commented Sep 5, 2020

@atef23 Did you solve the problem? I also hit this issue, and I am using the v1.0.0-alpha.2 SDK:

pkg/mod/k8s.io/client-go@v0.18.2/tools/cache/reflector.go:125: Failed to list *v1.Deployment: deployments.apps is forbidden: User "system:serviceaccount:memcached2:default" cannot list resource "deployments" in API group "apps" at the cluster scope

camilamacedo86 removed their assignment on Sep 7, 2020
sundar-cs commented Nov 6, 2020

Hi,

I see the same issue as described by atef23 on July 29th, but in my case the logs show no errors (RBAC or otherwise).
That is, the operator log says Successfully Reconciled, but no pod gets created.

The link being followed is:
https://sdk.operatorframework.io/docs/building-operators/golang/quickstart/

Operator-sdk version is 1.1.0

This is on KOPS Kubernetes cluster on AWS.

I have tried this in both the default namespace and the memcached-operator-system namespace; both behave the same way.
kubectl apply -f config/samples/cache_v1_memcached.yaml -n memcached-operator-system
(or without the -n option)

The controller log says:

2020-11-06T04:50:11.042Z DEBUG controller-runtime.controller Successfully Reconciled {"controller": "memcached", "request": "default/memcached-sample"}
2020-11-06T05:00:59.962Z DEBUG controller-runtime.controller Successfully Reconciled {"controller": "memcached", "request": "default/memcached-sample"}
2020-11-06T05:01:18.546Z DEBUG controller-runtime.controller Successfully Reconciled {"controller": "memcached", "request": "memcached-operator-system/memcached-sample"}

kubectl get crd -A
NAME CREATED AT
memcacheds.cache.example.com 2020-11-06T04:44:06Z

kubectl get all -n memcached-operator-system
NAME READY STATUS RESTARTS AGE
pod/memcached-operator-controller-manager-56fdb6d8cd-sb92x 2/2 Running 0 29m

NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/memcached-operator-controller-manager-metrics-service ClusterIP 100.66.126.98 8443/TCP 30m

NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/memcached-operator-controller-manager 1/1 1 1 30m

NAME DESIRED CURRENT READY AGE
replicaset.apps/memcached-operator-controller-manager-56fdb6d8cd 1 1 1 29m

That is, only the operator's own objects are shown, and not the resources that the CR was supposed to generate.

However this shows the object:

kubectl get Memcached
NAME AGE
memcached-sample 5m

Isn't it supposed to create a pod resource?
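For context on what should happen here, below is a rough, illustrative sketch (not the actual tutorial code) of the create-if-missing branch the quickstart controller runs inside Reconcile. Names such as ensureDeployment and deploymentForMemcached are hypothetical, the cachev1/appsv1/errors/types import aliases are assumed from the scaffolded controller file, and the calls match the controller-runtime v0.6-era client. The controller creates a Deployment owned by the CR, and that Deployment's ReplicaSet is what brings up the memcached pods; if the log only reports Successfully Reconciled and no Deployment ever appears, this create branch is likely never reached.

```go
// Sketch (illustrative only) of the create-if-missing branch from the
// quickstart's Reconcile logic.
func (r *MemcachedReconciler) ensureDeployment(ctx context.Context, m *cachev1.Memcached) error {
	found := &appsv1.Deployment{}
	err := r.Get(ctx, types.NamespacedName{Name: m.Name, Namespace: m.Namespace}, found)
	if err != nil && errors.IsNotFound(err) {
		dep := r.deploymentForMemcached(m) // hypothetical helper that builds the Deployment spec
		// The owner reference ties the Deployment's lifetime to the CR and is
		// also what triggers the blockOwnerDeletion/finalizers RBAC error
		// reported earlier in this thread when permissions are missing.
		if err := ctrl.SetControllerReference(m, dep, r.Scheme); err != nil {
			return err
		}
		return r.Create(ctx, dep) // the Deployment's ReplicaSet then creates the memcached pods
	}
	return err
}
```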

openshift-bot commented:

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

openshift-ci-robot added the lifecycle/stale label on Feb 4, 2021
openshift-bot commented:

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

openshift-ci-robot removed the lifecycle/stale label on Mar 6, 2021
openshift-ci-robot added the lifecycle/rotten label on Mar 6, 2021
openshift-bot commented:

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

openshift-ci-robot commented:

@openshift-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

kyp0717 commented Jan 15, 2022

I am having the same issue where no pods were created. In fact, the deployment is having issues.

aashishprasad99 commented:

I am having the same issue too, @kyp0717. Did you find a resolution?

HariSK20 commented:

I was able to fix it by adding the following line above the Reconcile function, which adds the necessary permissions to rbac/role.yaml:

//+kubebuilder:rbac:groups=apps,resources=deployments,verbs=get;list;watch;create;update;patch;delete
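For completeness, a sketch of why that verb set matches what the controller needs, assuming the quickstart's SetupWithManager wiring; cachev1 and appsv1 are the usual import aliases from the scaffolded controller and are assumptions here, not code from this thread. Remember to re-run make manifests so the marker ends up in role.yaml.

```go
// The controller both creates Deployments and watches the ones it owns, so
// its service account needs get/list/watch/create/update/patch/delete on
// deployments.apps, which is exactly what the marker above generates.
func (r *MemcachedReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&cachev1.Memcached{}).   // reconcile Memcached CRs
		Owns(&appsv1.Deployment{}).  // watch Deployments created by this controller
		Complete(r)
}
```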
