When creating a new workspace, the UI is stuck on "Loading...", though the workspace is successfully created #10501
Comments
As far as I understand, the issue might be in how the Kubernetes infrastructure works and in the latest changes to the tracking of workspace status.
@magbj @garagatyi yes, it's a known problem with Che on K8S.
@garagatyi @eivantsov It sounds great that you are adding in the reconnect ability for workspaces. In the meantime, by setting this for the nginx ingress controller deployment, I was able to make it automatically switch from "Loading..." to the IDE, though it did not use the nicer process of the disconnected IDE.
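The setting itself does not appear above; a plausible candidate for this symptom is lengthening the nginx-ingress proxy timeouts so the IDE's WebSocket survives the workspace start. A minimal sketch only; the ingress name "che-ingress" and the timeout values are assumptions, not taken from the comment:

# Hypothetical sketch: raise nginx-ingress proxy timeouts (in seconds) on the
# Che ingress so the long-lived IDE WebSocket is not cut off during startup.
$ kubectl annotate ingress che-ingress \
    nginx.ingress.kubernetes.io/proxy-read-timeout=3600 \
    nginx.ingress.kubernetes.io/proxy-send-timeout=3600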
@magbj an interesting workaround! But doesn't it mean that we would need to change the k8s infrastructure configuration to run Che on it?
I've managed to stand up CHE whilst bumping postgres. Deployed on AWS in a private subnet across 3 AZ's, I see the following in the browser console:

index.ts:145 Error: Failed to run the workspace: "Waiting for ingress 'ingressle14ialw' reached timeout"
    at index.ts:307
    at che-json-rpc-master-api.ts:198
    at json-rpc-client.ts:191
    at Array.forEach (<anonymous>)
    at e.processNotification (json-rpc-client.ts:190)
    at e.processResponse (json-rpc-client.ts:177)
    at json-rpc-client.ts:94
    at websocket-client.ts:107
    at Array.forEach (<anonymous>)
    at e.callHandlers (websocket-client.ts:107)
(anonymous) @ index.ts:145
vendor-73784b1bca.js:28856

I think this may point back to my own mishandling of ALB rules and requisite annotations; but even after making the necessary updates to the subnets, it fails shortly thereafter.

Before annotations:

4m 40m 13 che-ingress.15513722adda5a3f Ingress Warning ERROR aws-alb-ingress-controller error parsing annotations: Retrieval of subnets failed to resolve 2 qualified subnets. Subnets must contain the kubernetes.io/cluster/<cluster name> tag with a value of shared or owned and the kubernetes.io/role/internal-elb tag signifying it should be used for ALBs. Additionally, there must be at least 2 subnets with unique availability zones as required by ALBs. Either tag subnets to meet this requirement or use the subnets annotation on the ingress resource to explicitly call out what subnets to use for ALB creation. The subnets that did resolve were [].

After annotations:

12m 13m 2 che-ingress.15513b806be04b34 Ingress Normal UPDATE aws-alb-ingress-controller Ingress default/che-ingress
13m 13m 1 che-ingress.15513b845eae6530 Ingress Normal CREATE aws-alb-ingress-controller 6d494ba2-default-cheingres-b25b created
13m 13m 2 che-ingress.15513b84912da5e5 Ingress Normal CREATE aws-alb-ingress-controller 6d494ba2-02b495ee1e76880abde target group created
13m 13m 1 che-ingress.15513b84b7caa419 Ingress Normal CREATE aws-alb-ingress-controller 80 listener created
13m 13m 1 che-ingress.15513b84b93efaa3 Ingress Normal CREATE aws-alb-ingress-controller 1 rule created
12m 12m 1 che-ingress.15513b896375cbaf Ingress Normal MODIFY aws-alb-ingress-controller 6d494ba2-default-cheingres-b25b tags modified
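The ERROR event above spells out the requirement: the private subnets need the kubernetes.io/cluster/<cluster name> tag (value shared or owned) and the kubernetes.io/role/internal-elb tag. A minimal sketch of tagging them, assuming the cluster name eclipse-che from the controller args below; the subnet IDs are placeholders, not from the original report:

# Hypothetical sketch: tag two private subnets so the ALB ingress controller
# can discover them for internal load balancer creation.
$ aws ec2 create-tags \
    --resources subnet-aaaa1111 subnet-bbbb2222 \
    --tags Key=kubernetes.io/cluster/eclipse-che,Value=shared \
           Key=kubernetes.io/role/internal-elb,Value=1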
$ kubectl describe po -n kube-system alb-ingress-controller-5596d9bf8-lk7hm
Name: alb-ingress-controller-5596d9bf8-lk7hm
Namespace: kube-system
Node: ip-192-168-118-34.us-west-2.compute.internal/192.168.118.34
Start Time: Mon, 03 Sep 2018 15:43:03 -0400
Labels: app=alb-ingress-controller
pod-template-hash=115285694
Annotations: <none>
Status: Running
IP: 192.168.124.250
Controlled By: ReplicaSet/alb-ingress-controller-5596d9bf8
Containers:
server:
Container ID: docker://96f7d1e877276743c8d39581f55870369d08c1b8e0ae0413fdf6b4b9b552702e
Image: quay.io/coreos/alb-ingress-controller:1.0-beta.6
Image ID: docker-pullable://quay.io/coreos/alb-ingress-controller@sha256:1c934a32ee5e3aad925dbe0ff37cb50ae04d99e33c4d878186d603c1901ad644
Port: <none>
Host Port: <none>
Args:
/server
--ingress-class=alb
--cluster-name=eclipse-che
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 255
Started: Tue, 04 Sep 2018 21:30:38 -0400
Finished: Tue, 04 Sep 2018 21:30:40 -0400
Ready: False
Restart Count: 14
Environment:
AWS_REGION: us-west-2
POD_NAME: alb-ingress-controller-5596d9bf8-lk7hm (v1:metadata.name)
POD_NAMESPACE: kube-system (v1:metadata.namespace)
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from alb-ingress-token-hwvl7 (ro)
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
Volumes:
alb-ingress-token-hwvl7:
Type: Secret (a volume populated by a Secret)
SecretName: alb-ingress-token-hwvl7
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Pulling 46m (x5 over 1d) kubelet, ip-192-168-118-34.us-west-2.compute.internal pulling image "quay.io/coreos/alb-ingress-controller:1.0-beta.6"
Normal Pulled 46m (x5 over 1d) kubelet, ip-192-168-118-34.us-west-2.compute.internal Successfully pulled image "quay.io/coreos/alb-ingress-controller:1.0-beta.6"
Normal Created 46m (x5 over 1d) kubelet, ip-192-168-118-34.us-west-2.compute.internal Created container
Normal Started 46m (x5 over 1d) kubelet, ip-192-168-118-34.us-west-2.compute.internal Started container
Warning BackOff 3m (x201 over 48m) kubelet, ip-192-168-118-34.us-west-2.compute.internal Back-off restarting failed container
$ kubectl get events --sort-by=.metadata.creationTimestamp
$ kubectl logs -n kube-system $(kubectl get po -n kube-system | egrep -o alb-[a-zA-Z0-9-]+)
-------------------------------------------------------------------------------
AWS ALB Ingress controller
Release: 1.0-beta.6
Build: git-f740c293
Repository: https://github.com/kubernetes-sigs/aws-alb-ingress-controller
-------------------------------------------------------------------------------
I0905 01:30:38.069833 1 flags.go:132] Watching for Ingress class: alb
W0905 01:30:38.070148 1 client_config.go:552] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
I0905 01:30:38.070387 1 main.go:159] Creating API client for https://10.100.0.1:443
I0905 01:30:38.085917 1 main.go:203] Running in Kubernetes cluster version v1.10 (v1.10.3) - git (clean) commit 2bba0127d85d5a46ab4b778548be28623b32d0b0 - platform linux/amd64
I0905 01:30:38.086664 1 alb.go:85] ALB resource names will be prefixed with 6d494ba2
I0905 01:30:38.094112 1 alb.go:158] Starting AWS ALB Ingress controller
I0905 01:30:39.295400 1 leaderelection.go:185] attempting to acquire leader lease kube-system/ingress-controller-leader-alb...
I0905 01:30:39.305289 1 leaderelection.go:194] successfully acquired lease kube-system/ingress-controller-leader-alb
I0905 01:30:39.305322 1 status.go:152] new leader elected: alb-ingress-controller-5596d9bf8-lk7hm
I0905 01:30:39.540996 1 albingresses.go:77] Building list of existing ALBs
I0905 01:30:39.719512 1 albingresses.go:85] Fetching information on 1 ALBs
E0905 01:30:40.076241 1 albingresses.go:211] Failed to find related managed instance SG. Was it deleted from AWS? Error: Didn't find exactly 1 matching (managed) instance SG. Found 3
I0905 01:30:40.076929 1 albingresses.go:101] Assembled 0 ingresses from existing AWS resources in 535.905766ms
F0905 01:30:40.076977 1 albingresses.go:103] Assembled 0 ingresses from 1 load balancers

$ kubectl version --short
Client Version: v1.10.3
Server Version: v1.10.3
$ eksctl version
2018-09-04T21:24:25-04:00 [ℹ] versionInfo = map[string]string{"builtAt":"2018-08-31T14:44:02Z", "gitCommit":"0578d6cd44d8c5a4ebe17825db882ad194f0bee4", "gitTag":"0.1.1"}
$ helm version --short
Client: v2.10.0+g9ad53aa
Server: v2.10.0+g9ad53aa

Ultimately, I've not been able to get as far as @magbj (noted in #10370), but I'm pretty close. I guess I have to take a step back here and ask: does CHE still have issues running atop K8S (EKS), or has anyone managed to get this working end-to-end in a live environment? Locally, of course, it all runs beautifully.
I think the problem is related to the security group rules. Try to do what is described here: kubernetes-sigs/aws-load-balancer-controller#236 (comment)
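For context, the controller log above failed with "Didn't find exactly 1 matching (managed) instance SG. Found 3", which points at leftover managed security groups from earlier ALB attempts. A rough way to list candidates for cleanup; the VPC ID is a placeholder, and which of the listed groups are safe to delete is for the linked comment to answer:

# Hypothetical sketch: enumerate security groups in the cluster VPC so
# duplicate controller-managed instance SGs can be spotted and removed.
$ aws ec2 describe-security-groups \
    --filters Name=vpc-id,Values=vpc-xxxxxxxx \
    --query 'SecurityGroups[].[GroupId,GroupName]' \
    --output table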
@magbj issue #10365 is fixed, so, hopefully, you won't need any additional configuration to have the IDE load after a workspace start. What do you think, can we close this issue? @ramene your issue with the volumes is not related to the topic. If you are still looking for help from the community, I would recommend opening a new issue and describing your difficulties there. Maybe @magbj would be able to help, since it seems that he managed to run Che on a similar setup successfully.
Thanks @antonbabenko, I'd not seen the issue you referenced; I'll stand it back up, futz with the SGs, and report back. @garagatyi, I'm eager to get to the point of the IDE loading after a workspace start, and my apologies for lumping multiple issues together; I'll remove this bit and open a new issue for AWS EFS and storage classes accordingly. I may simply have to default to using the same storage class as @magbj. I appreciate you chiming in nonetheless.
@ramene NP, just trying to split issues, since it helps in not mixing discussions.
Description
After selecting to create a workspace, and then either selecting "Open IDE" from the creation screen or selecting the workspace, the UI is stuck showing "Loading...".
The workspace is still successfully created in the background, though. As soon as it is in a running state, I can always bring it up by reloading the browser. The UI does not discover the change in state, however, and does not bring things up automatically.
Earlier in my testing, I used to get an IDE screen that was "disconnected", and when the workspace was sufficiently provisioned, the status and bootstrap log started to show. I am not sure what changed so that this no longer appears and only the loading page is shown.
This is what it looks like when hanging:
If I do a browser refresh, everything comes up:
Deployed into AWS in a private subnet (1 AZ/subnet), running on top of AWS EKS/Kubernetes
ELB for incoming traffic; HTTPS with SSL termination on Nginx.
Using Nginx Ingress Controller (0.16.2).
Keycloak version: 3.4.3.Final
Eclipse Che version: 6.8.0-SNAPSHOT
Created a storage class for EBS/GP2 (a sketch follows below)
All ports/IPs are accessible within the private subnet
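A minimal sketch of the EBS/GP2 storage class mentioned above; the class name is an assumption, not taken from the report:

# Hypothetical sketch: a StorageClass backed by the in-tree AWS EBS
# provisioner, creating gp2 volumes for workspace PVCs.
$ kubectl apply -f - <<EOF
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp2
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
EOF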
I see the same behavior in both Chrome and Firefox.
This is what I see in the browser (Chrome) console: