Che-Operator/DWO - NPE if provisioning two public endpoints on k8s #21078

Closed
nils-mosbach opened this issue Jan 26, 2022 · 9 comments
Labels: area/devworkspace-operator, kind/bug, severity/P1

Comments

@nils-mosbach

Describe the bug

Che Operator keeps crashing while deploying new devworkspaces that contain public ports.

It seems that this is caused by handling DevWorkspaceRoutings.

2022-01-26T13:34:51.968Z	INFO	controllers.DevWorkspaceRouting	Reconciling DevWorkspaceRouting	{"Request.Namespace": "dev-studio-workspace-n-m-company-com-zsobsb", "Request.Name": "routing-workspace81fba7d2a4c94c1b", "devworkspace_id": "workspace81fba7d2a4c94c1b"}
panic: runtime error: invalid memory address or nil pointer dereference

We have a workspace that contains ~5 docker images for applications like a database that's used for development. Two ports are made public.
Sometimes the workspace does start, but only after a couple of minutes and multiple operator restarts. This happens on 7.42 and on the next branch.

Che version

next (development version)

Steps to reproduce

Create a new devworkspace that has multiple ports made public.

Expected behavior

Workspace should start.

Runtime

Kubernetes (vanilla)

Screenshots

No response

Installation method

chectl/latest

Environment

Linux

Eclipse Che Logs

2022-01-26T13:34:00.571Z	INFO	Binary info 	{"Go version": "go1.16.12"}
2022-01-26T13:34:00.571Z	INFO	Binary info 	{"OS": "linux", "Arch": "amd64"}
2022-01-26T13:34:00.571Z	INFO	Address 	{"Metrics": ":60000"}
2022-01-26T13:34:00.571Z	INFO	Address 	{"Probe": ":6789"}
2022-01-26T13:34:00.571Z	INFO	Operator is running on 	{"Infrastructure": "Kubernetes"}
I0126 13:34:01.623604       1 request.go:668] Waited for 1.045884114s due to client-side throttling, not priority and fairness, request: GET:https://10.43.0.1:443/apis/cert-manager.io/v1beta1?timeout=32s
time="2022-01-26T13:34:02Z" level=info msg="Limit cache by selector: app.kubernetes.io/part-of=che.eclipse.org"
2022-01-26T13:34:05.083Z	INFO	controller-runtime.metrics	metrics server is starting to listen	{"addr": ":60000"}
time="2022-01-26T13:34:07Z" level=info msg="Use 'terminationGracePeriodSeconds' 20 sec. from operator deployment."
time="2022-01-26T13:34:07Z" level=info msg="Set up process signal handler"
2022-01-26T13:34:09.619Z	INFO	setup	DevWorkspace support enabled.
2022-01-26T13:34:09.619Z	INFO	setup	starting manager
I0126 13:34:09.619719       1 leaderelection.go:243] attempting to acquire leader lease dev-studio/e79b08a4.org.eclipse.che...
2022-01-26T13:34:09.619Z	INFO	controller-runtime.manager	starting metrics server	{"path": "/metrics"}
I0126 13:34:51.844648       1 leaderelection.go:253] successfully acquired lease dev-studio/e79b08a4.org.eclipse.che
2022-01-26T13:34:51.845Z	INFO	controller-runtime.manager.controller.checluster	Starting EventSource	{"reconciler group": "org.eclipse.che", "reconciler kind": "CheCluster", "source": "kind source: /, Kind="}
2022-01-26T13:34:51.845Z	INFO	controller-runtime.manager.controller.checlusterrestore-controller	Starting EventSource	{"reconciler group": "org.eclipse.che", "reconciler kind": "CheClusterRestore", "source": "kind source: /, Kind="}
2022-01-26T13:34:51.844Z	INFO	controller-runtime.manager.controller.namespace	Starting EventSource	{"reconciler group": "", "reconciler kind": "Namespace", "source": "kind source: /, Kind="}
2022-01-26T13:34:51.845Z	INFO	controller-runtime.manager.controller.checluster	Starting EventSource	{"reconciler group": "org.eclipse.che", "reconciler kind": "CheCluster", "source": "kind source: /, Kind="}
2022-01-26T13:34:51.845Z	INFO	controller-runtime.manager.controller.checluster	Starting EventSource	{"reconciler group": "org.eclipse.che", "reconciler kind": "CheCluster", "source": "kind source: /, Kind="}
2022-01-26T13:34:51.845Z	INFO	controller-runtime.manager.controller.devworkspacerouting	Starting EventSource	{"reconciler group": "controller.devfile.io", "reconciler kind": "DevWorkspaceRouting", "source": "kind source: /, Kind="}
2022-01-26T13:34:51.845Z	INFO	controller-runtime.manager.controller.namespace	Starting EventSource	{"reconciler group": "", "reconciler kind": "Namespace", "source": "kind source: /, Kind="}
2022-01-26T13:34:51.845Z	INFO	controller-runtime.manager.controller.checluster	Starting EventSource	{"reconciler group": "org.eclipse.che", "reconciler kind": "CheCluster", "source": "kind source: /, Kind="}
2022-01-26T13:34:51.845Z	INFO	controller-runtime.manager.controller.checlusterbackup-controller	Starting EventSource	{"reconciler group": "org.eclipse.che", "reconciler kind": "CheClusterBackup", "source": "kind source: /, Kind="}
2022-01-26T13:34:51.845Z	INFO	controller-runtime.manager.controller.checluster	Starting EventSource	{"reconciler group": "org.eclipse.che", "reconciler kind": "CheCluster", "source": "kind source: /, Kind="}
2022-01-26T13:34:51.845Z	INFO	controller-runtime.manager.controller.devworkspacerouting	Starting EventSource	{"reconciler group": "controller.devfile.io", "reconciler kind": "DevWorkspaceRouting", "source": "kind source: /, Kind="}
2022-01-26T13:34:51.845Z	INFO	controller-runtime.manager.controller.checlusterrestore-controller	Starting EventSource	{"reconciler group": "org.eclipse.che", "reconciler kind": "CheClusterRestore", "source": "kind source: /, Kind="}
2022-01-26T13:34:51.845Z	INFO	controller-runtime.manager.controller.checluster	Starting EventSource	{"reconciler group": "org.eclipse.che", "reconciler kind": "CheCluster", "source": "kind source: /, Kind="}
2022-01-26T13:34:51.845Z	INFO	controller-runtime.manager.controller.checlusterbackup-controller	Starting EventSource	{"reconciler group": "org.eclipse.che", "reconciler kind": "CheClusterBackup", "source": "kind source: /, Kind="}
2022-01-26T13:34:51.845Z	INFO	controller-runtime.manager.controller.checluster	Starting EventSource	{"reconciler group": "org.eclipse.che", "reconciler kind": "CheCluster", "source": "kind source: /, Kind="}
2022-01-26T13:34:51.845Z	INFO	controller-runtime.manager.controller.namespace	Starting EventSource	{"reconciler group": "", "reconciler kind": "Namespace", "source": "kind source: /, Kind="}
2022-01-26T13:34:51.845Z	INFO	controller-runtime.manager.controller.devworkspacerouting	Starting EventSource	{"reconciler group": "controller.devfile.io", "reconciler kind": "DevWorkspaceRouting", "source": "kind source: /, Kind="}
2022-01-26T13:34:51.845Z	INFO	controller-runtime.manager.controller.checluster	Starting EventSource	{"reconciler group": "org.eclipse.che", "reconciler kind": "CheCluster", "source": "kind source: /, Kind="}
2022-01-26T13:34:51.845Z	INFO	controller-runtime.manager.controller.namespace	Starting EventSource	{"reconciler group": "", "reconciler kind": "Namespace", "source": "kind source: /, Kind="}
2022-01-26T13:34:51.845Z	INFO	controller-runtime.manager.controller.checluster	Starting EventSource	{"reconciler group": "org.eclipse.che", "reconciler kind": "CheCluster", "source": "kind source: /, Kind="}
2022-01-26T13:34:51.845Z	INFO	controller-runtime.manager.controller.checluster	Starting EventSource	{"reconciler group": "org.eclipse.che", "reconciler kind": "CheCluster", "source": "kind source: /, Kind="}
2022-01-26T13:34:51.845Z	INFO	controller-runtime.manager.controller.checluster	Starting EventSource	{"reconciler group": "org.eclipse.che", "reconciler kind": "CheCluster", "source": "kind source: /, Kind="}
2022-01-26T13:34:51.845Z	INFO	controller-runtime.manager.controller.checlusterbackup-controller	Starting Controller	{"reconciler group": "org.eclipse.che", "reconciler kind": "CheClusterBackup"}
2022-01-26T13:34:51.845Z	INFO	controller-runtime.manager.controller.checluster	Starting EventSource	{"reconciler group": "org.eclipse.che", "reconciler kind": "CheCluster", "source": "kind source: /, Kind="}
2022-01-26T13:34:51.845Z	INFO	controller-runtime.manager.controller.checluster	Starting EventSource	{"reconciler group": "org.eclipse.che", "reconciler kind": "CheCluster", "source": "kind source: /, Kind="}
2022-01-26T13:34:51.845Z	INFO	controller-runtime.manager.controller.namespace	Starting Controller	{"reconciler group": "", "reconciler kind": "Namespace"}
2022-01-26T13:34:51.845Z	INFO	controller-runtime.manager.controller.checlusterrestore-controller	Starting Controller	{"reconciler group": "org.eclipse.che", "reconciler kind": "CheClusterRestore"}
2022-01-26T13:34:51.845Z	INFO	controller-runtime.manager.controller.checluster	Starting EventSource	{"reconciler group": "org.eclipse.che", "reconciler kind": "CheCluster", "source": "kind source: /, Kind="}
2022-01-26T13:34:51.845Z	INFO	controller-runtime.manager.controller.checluster	Starting EventSource	{"reconciler group": "org.eclipse.che", "reconciler kind": "CheCluster", "source": "kind source: /, Kind="}
2022-01-26T13:34:51.845Z	INFO	controller-runtime.manager.controller.devworkspacerouting	Starting EventSource	{"reconciler group": "controller.devfile.io", "reconciler kind": "DevWorkspaceRouting", "source": "kind source: /, Kind="}
2022-01-26T13:34:51.845Z	INFO	controller-runtime.manager.controller.checluster	Starting Controller	{"reconciler group": "org.eclipse.che", "reconciler kind": "CheCluster"}
2022-01-26T13:34:51.845Z	INFO	controller-runtime.manager.controller.devworkspacerouting	Starting Controller	{"reconciler group": "controller.devfile.io", "reconciler kind": "DevWorkspaceRouting"}
2022-01-26T13:34:51.845Z	INFO	controller-runtime.manager.controller.checluster	Starting EventSource	{"reconciler group": "org.eclipse.che", "reconciler kind": "CheCluster", "source": "kind source: /, Kind="}
2022-01-26T13:34:51.845Z	INFO	controller-runtime.manager.controller.checluster	Starting EventSource	{"reconciler group": "org.eclipse.che", "reconciler kind": "CheCluster", "source": "kind source: /, Kind="}
2022-01-26T13:34:51.846Z	INFO	controller-runtime.manager.controller.checluster	Starting EventSource	{"reconciler group": "org.eclipse.che", "reconciler kind": "CheCluster", "source": "kind source: /, Kind="}
2022-01-26T13:34:51.846Z	INFO	controller-runtime.manager.controller.checluster	Starting EventSource	{"reconciler group": "org.eclipse.che", "reconciler kind": "CheCluster", "source": "kind source: /, Kind="}
2022-01-26T13:34:51.846Z	INFO	controller-runtime.manager.controller.checluster	Starting EventSource	{"reconciler group": "org.eclipse.che", "reconciler kind": "CheCluster", "source": "kind source: /, Kind="}
2022-01-26T13:34:51.846Z	INFO	controller-runtime.manager.controller.checluster	Starting EventSource	{"reconciler group": "org.eclipse.che", "reconciler kind": "CheCluster", "source": "kind source: /, Kind="}
2022-01-26T13:34:51.846Z	INFO	controller-runtime.manager.controller.checluster	Starting EventSource	{"reconciler group": "org.eclipse.che", "reconciler kind": "CheCluster", "source": "kind source: /, Kind="}
2022-01-26T13:34:51.846Z	INFO	controller-runtime.manager.controller.checluster	Starting EventSource	{"reconciler group": "org.eclipse.che", "reconciler kind": "CheCluster", "source": "kind source: /, Kind="}
2022-01-26T13:34:51.846Z	INFO	controller-runtime.manager.controller.checluster	Starting Controller	{"reconciler group": "org.eclipse.che", "reconciler kind": "CheCluster"}
2022-01-26T13:34:51.949Z	INFO	controller-runtime.manager.controller.checlusterrestore-controller	Starting workers	{"reconciler group": "org.eclipse.che", "reconciler kind": "CheClusterRestore", "worker count": 1}
2022-01-26T13:34:51.950Z	INFO	controller-runtime.manager.controller.checlusterbackup-controller	Starting workers	{"reconciler group": "org.eclipse.che", "reconciler kind": "CheClusterBackup", "worker count": 1}
2022-01-26T13:34:51.950Z	INFO	controller-runtime.manager.controller.namespace	Starting workers	{"reconciler group": "", "reconciler kind": "Namespace", "worker count": 1}
2022-01-26T13:34:51.968Z	INFO	controller-runtime.manager.controller.devworkspacerouting	Starting workers	{"reconciler group": "controller.devfile.io", "reconciler kind": "DevWorkspaceRouting", "worker count": 1}
2022-01-26T13:34:51.968Z	INFO	controller-runtime.manager.controller.checluster	Starting workers	{"reconciler group": "org.eclipse.che", "reconciler kind": "CheCluster", "worker count": 1}
2022-01-26T13:34:51.968Z	INFO	controller-runtime.manager.controller.checluster	Starting workers	{"reconciler group": "org.eclipse.che", "reconciler kind": "CheCluster", "worker count": 1}
2022-01-26T13:34:51.968Z	INFO	controllers.DevWorkspaceRouting	Reconciling DevWorkspaceRouting	{"Request.Namespace": "dev-studio-workspace-n-m-company-com-zsobsb", "Request.Name": "routing-workspace81fba7d2a4c94c1b", "devworkspace_id": "workspace81fba7d2a4c94c1b"}
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x1682e96]
goroutine 1661 [running]:
github.com/devfile/devworkspace-operator/pkg/config.ExperimentalFeaturesEnabled(...)
	/che-operator/vendor/github.com/devfile/devworkspace-operator/pkg/config/sync.go:90
github.com/devfile/devworkspace-operator/pkg/provision/sync.printDiff(0x1fd60a8, 0xc001293380, 0x1fd60a8, 0xc001293500, 0x1fb7bd8, 0xc0005e6eb0)
	/che-operator/vendor/github.com/devfile/devworkspace-operator/pkg/provision/sync/sync.go:128 +0x36
github.com/devfile/devworkspace-operator/pkg/provision/sync.SyncObjectWithCluster(0x1fd60a8, 0xc001293380, 0x1fc3f80, 0xc0003f8550, 0x0, 0x0, 0xc00022c070, 0x1fb7bd8, 0xc0005e6eb0, 0x1fadce8, ...)
	/che-operator/vendor/github.com/devfile/devworkspace-operator/pkg/provision/sync/sync.go:69 +0x487
github.com/devfile/devworkspace-operator/controllers/controller/devworkspacerouting.(*DevWorkspaceRoutingReconciler).syncIngresses(0xc000e48100, 0xc000ac0d80, 0xc000026000, 0x3, 0x4, 0x1, 0xc00105af00, 0x1, 0x1, 0x0, ...)
	/che-operator/vendor/github.com/devfile/devworkspace-operator/controllers/controller/devworkspacerouting/sync_ingresses.go:58 +0x3d1
github.com/devfile/devworkspace-operator/controllers/controller/devworkspacerouting.(*DevWorkspaceRoutingReconciler).Reconcile(0xc000e48100, 0x1fadd58, 0xc00118dd10, 0xc000abeb10, 0x2d, 0xc000abec00, 0x21, 0xc00118dd10, 0xc000a92000, 0x1bac900, ...)
	/che-operator/vendor/github.com/devfile/devworkspace-operator/controllers/controller/devworkspacerouting/devworkspacerouting_controller.go:206 +0x11a5
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc00034b180, 0x1fadcb0, 0xc000e48580, 0x1b535c0, 0xc00003a3c0)
	/che-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:298 +0x30d
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc00034b180, 0x1fadcb0, 0xc000e48580, 0x1e1b000)
	/che-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:253 +0x205
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2(0xc000fe2010, 0xc00034b180, 0x1fadcb0, 0xc000e48580)
	/che-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:214 +0x6b
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2
	/che-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:210 +0x425
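
The trace above ends in `config.ExperimentalFeaturesEnabled`, reached through `sync.printDiff`. One common way for an accessor like this to hit a nil pointer is a package-level configuration that is never initialized when the package is consumed as a library. The sketch below illustrates that failure mode and a defensive variant. It is not the DWO source: `OperatorConfig`, `SetupConfig`, and both accessor functions are hypothetical names chosen for illustration.

```go
package main

import "fmt"

// OperatorConfig is a hypothetical stand-in for an operator's global settings.
type OperatorConfig struct {
	EnableExperimentalFeatures *bool
}

// internalConfig stays nil until SetupConfig is called. A program that only
// imports the package as a library may never call it (assumption).
var internalConfig *OperatorConfig

// SetupConfig installs the global configuration.
func SetupConfig(cfg *OperatorConfig) { internalConfig = cfg }

// unsafeExperimentalFeaturesEnabled dereferences the global without a check
// and panics with a nil pointer dereference if the config was never set.
func unsafeExperimentalFeaturesEnabled() bool {
	return *internalConfig.EnableExperimentalFeatures
}

// safeExperimentalFeaturesEnabled falls back to false when the config or the
// field is missing instead of panicking.
func safeExperimentalFeaturesEnabled() bool {
	if internalConfig == nil || internalConfig.EnableExperimentalFeatures == nil {
		return false
	}
	return *internalConfig.EnableExperimentalFeatures
}

// panics reports whether calling f panics.
func panics(f func()) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	f()
	return false
}

func main() {
	// Config not initialized: the unchecked accessor panics, the guarded one does not.
	fmt.Println(panics(func() { unsafeExperimentalFeaturesEnabled() })) // true
	fmt.Println(safeExperimentalFeaturesEnabled())                      // false

	enabled := true
	SetupConfig(&OperatorConfig{EnableExperimentalFeatures: &enabled})
	fmt.Println(safeExperimentalFeaturesEnabled()) // true
}
```

In a controller, an unrecovered panic of this kind terminates the whole operator process, which would match the repeated restarts described in this report.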

Additional context

No response

@nils-mosbach nils-mosbach added the kind/bug Outline of a bug - must adhere to the bug report template. label Jan 26, 2022
@che-bot che-bot added the status/need-triage An issue that needs to be prioritized by the curator responsible for the triage. See https://github. label Jan 26, 2022
@l0rd
Contributor

l0rd commented Jan 26, 2022

@nils-mosbach thank you for reporting this issue. Can you share the devfile you are using to create the workspace?

cc @tolusha @skabashnyuk @metlos: labelling this as area/devworkspace-che-operator as it seems related to DevWorkspace routing. Feel free to change it if that's not the case.

@l0rd l0rd added severity/P1 Has a major impact to usage or development of the system. and removed status/need-triage An issue that needs to be prioritized by the curator responsible for the triage. See https://github. labels Jan 26, 2022
@nils-mosbach
Author

nils-mosbach commented Jan 27, 2022

@l0rd I've tried to create a simple example that causes the issue in our case.

https://github.com/nils-mosbach/devfile.io-demo-che-21078

It seems that Che-Operator crashes if the devfile contains a second public endpoint.

This one works: devfile.works.yaml
This one fails: devfile.yaml

We're on next channel due to a required change in the devworkspace operator, so if we should try out something, please let me know.

By the way: I'm currently looking at your work on che-code. Incredible! That's actually something a lot of our developers have been waiting for. :) Thanks!

@l0rd
Contributor

l0rd commented Jan 27, 2022

> By the way: I'm currently looking at your work on che-code. Incredible! That's actually something a lot of our developers have been waiting for. :) Thanks!

cc @benoitf

@nils-mosbach
Author

Renaming this since it happens consistently when the DevWorkspace Operator implementation in che-operator tries to handle a second public endpoint/ingress.

@nils-mosbach nils-mosbach changed the title from "Che-Operator - Reconciling DevWorkspaceRouting - NPE" to "Che-Operator/DWO - NPE if provisioning two public endpoints on k8s" on Feb 10, 2022
@amisevsk
Contributor

amisevsk commented Feb 10, 2022

@nils-mosbach looking at the controller logs, you appear to be hitting an embarrassing bug on my part (devfile/devworkspace-operator#766).

One issue we recently found is that next images aren't being correctly rolled out when DWO is installed via Operator. Could you try manually rolling out a new DWO deployment via

kubectl rollout restart -n openshift-operators deployment/devworkspace-controller-manager

or by deleting the devworkspace-controller-manager-xxxxx-xxxxx pod directly, and see if that resolves the issue?

@amisevsk
Contributor

Created DWO issue for next catalog rollouts since I realized I hadn't done that yet: devfile/devworkspace-operator#779

@nils-mosbach
Author

That's good news. Upgraded devworkspace-controller to latest next tag. Issue seems to be resolved. Thanks a lot!

Currently che-operator:next pulls devworkspace-controller:v0.12.3. Any plans to release v0.12.4 in the next week or two? It would be nice if that change made it into the next stable version of Che.

Anyway: I really like where this is going. Startup time, devfile 2 api and the whole architecture shift. Nice! :)
Closing this issue...

@amisevsk
Contributor

I'm glad it's resolved for you, this is the sort of bug I never want to ship!

> Currently che-operator:next pulls devworkspace-controller:v0.12.3. Any plans to release v0.12.4 in the next week or two? It would be nice if that change made it into the next stable version of Che.

Ah apologies, I think I really confused myself in explaining this 😄. The bug you were hitting actually does not impact the DevWorkspace Operator directly, and instead impacts other operators that use DWO as a library (the Che Operator in this case). The way Che creates ingresses and services for a DevWorkspace is by running an instance of DWO's DevWorkspaceRouting reconciler and plugging in its own implementation of how ingresses/services are defined (in code, defined here, where solver is implemented on the Che Operator side).
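
The plug-in arrangement described above, where a generic routing reconciler comes from a library and the consuming operator supplies its own rules for ingresses and services, can be sketched as follows. This is a simplified illustration under stated assumptions, not the DWO or Che Operator API: `RoutingSolver`, `Reconciler`, `Endpoint`, `hostPerEndpointSolver`, and the `example.test` domain are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// Endpoint is a hypothetical, simplified public endpoint of a workspace.
type Endpoint struct {
	Name string
	Port int
}

// RoutingSolver (hypothetical) decides how endpoints are exposed.
// The library defines the interface; each consuming operator implements it.
type RoutingSolver interface {
	IngressHosts(workspaceID string, endpoints []Endpoint) []string
}

// Reconciler stands in for the library's generic reconcile loop.
// It knows nothing about hostnames and delegates that to the solver.
type Reconciler struct {
	Solver RoutingSolver
}

// Reconcile returns the ingress hosts the solver wants for a workspace.
func (r *Reconciler) Reconcile(workspaceID string, endpoints []Endpoint) ([]string, error) {
	if r.Solver == nil {
		return nil, errors.New("no routing solver configured")
	}
	return r.Solver.IngressHosts(workspaceID, endpoints), nil
}

// hostPerEndpointSolver is the consuming operator's own implementation:
// one host per public endpoint under a base domain.
type hostPerEndpointSolver struct {
	baseDomain string
}

func (s hostPerEndpointSolver) IngressHosts(workspaceID string, endpoints []Endpoint) []string {
	hosts := make([]string, 0, len(endpoints))
	for _, e := range endpoints {
		hosts = append(hosts, fmt.Sprintf("%s-%s.%s", workspaceID, e.Name, s.baseDomain))
	}
	return hosts
}

func main() {
	r := &Reconciler{Solver: hostPerEndpointSolver{baseDomain: "example.test"}}
	hosts, err := r.Reconcile("workspace1", []Endpoint{
		{Name: "web", Port: 8080},
		{Name: "admin", Port: 8081},
	})
	if err != nil {
		fmt.Println("reconcile failed:", err)
		return
	}
	for _, h := range hosts {
		fmt.Println(h)
	}
}
```

With this split, a defect in the shared reconcile path affects every operator that embeds the library, even when the library's own operator initializes things differently and never hits it.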

DWO v0.12.3 already includes this fix as a cherry-pick, and the fix is also used in Che (where it fixes the bug you were hitting) since eclipse-che/che-operator#1306.

I'm only now noting the date you created this issue and had assumed it was new -- this bug was present in Che next on 26 Jan, and was fixed in Che next on 2 Feb.

@nils-mosbach
Author

No worries, happens to the best of us :).

Haven’t seen the cherry pick. Runs like a charm now, thanks a lot.

4 participants