Camel devfiles are not able to start because of OOMKilled #18939

Closed
5 of 22 tasks
Katka92 opened this issue Feb 1, 2021 · 15 comments
Labels: area/devfile-registry, area/plugin-registry, area/samples, kind/bug, severity/P1

@Katka92
Contributor

Katka92 commented Feb 1, 2021

Describe the bug

The workspace fails to start and an OOMKilled message is shown. This happens with both the Apache Camel K and the Apache Camel based on Spring Boot devfiles.

Che version

  • latest
  • nightly
  • other: please specify 7.25.0

Steps to reproduce

  1. Go to Dashboard, click Get Started menu item, open Get Started tab.
  2. Select Apache Camel K project or Apache Camel based on Spring Boot devfile.
  3. Wait for the error to appear.

Expected behavior

Example devfiles should work out-of-the-box.

Runtime

  • kubernetes (include output of kubectl version)
  • Openshift (include output of oc version) 4.7
  • minikube (include output of minikube version and kubectl version)
  • minishift (include output of minishift version and oc version)
  • docker-desktop + K8S (include output of docker version and kubectl version)
  • other: (please specify) Hosted Che

Screenshots

Screenshot from 2021-02-01 11-48-29

Installation method

  • chectl
    • provide a full command that was used to deploy Eclipse Che (including the output)
    • provide an output of chectl version command
  • OperatorHub
  • I don't know

Environment

  • my computer
    • Windows
    • Linux
    • macOS
  • Cloud
    • Amazon
    • Azure
    • GCE
    • other (please specify)
  • other: please specify - Linux OpenStack

Eclipse Che Logs

Error part:

2021-02-01 14:12:03,334[aceSharedPool-1]  [WARN ] [.i.k.KubernetesInternalRuntime 257]  - Failed to start Kubernetes runtime of workspace workspaceftoo1v6yq5wxqg7t.
org.eclipse.che.api.workspace.server.spi.InfrastructureException: The following containers have terminated:
vscode-apache-camelccf: reason = 'OOMKilled', exit code = 137, message = 'null'
	at org.eclipse.che.workspace.infrastructure.kubernetes.namespace.KubernetesDeployments.handleStartingPodStatus(KubernetesDeployments.java:465)
	at org.eclipse.che.workspace.infrastructure.kubernetes.namespace.KubernetesDeployments$2.eventReceived(KubernetesDeployments.java:378)
	at org.eclipse.che.workspace.infrastructure.kubernetes.namespace.KubernetesDeployments$2.eventReceived(KubernetesDeployments.java:375)
	at io.fabric8.kubernetes.client.utils.WatcherToggle.eventReceived(WatcherToggle.java:49)
	at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$1.onMessage(WatchConnectionManager.java:237)
	at okhttp3.internal.ws.RealWebSocket.onReadMessage(RealWebSocket.java:323)
	at okhttp3.internal.ws.WebSocketReader.readMessageFrame(WebSocketReader.java:219)
	at okhttp3.internal.ws.WebSocketReader.processNextFrame(WebSocketReader.java:105)
	at okhttp3.internal.ws.RealWebSocket.loopReader(RealWebSocket.java:274)
	at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:214)
	at okhttp3.RealCall$AsyncCall.execute(RealCall.java:203)
	at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.base/java.lang.Thread.run(Unknown Source)
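
For readers decoding the log above: an exit code greater than 128 conventionally means the process was terminated by a signal, with the signal number being the code minus 128. The following is a minimal Python sketch of that arithmetic; the helper name `describe_exit_code` is made up for illustration and is not part of Che or Kubernetes.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Explain an exit code using the 128 + signal-number convention."""
    if code > 128:
        signum = code - 128
        try:
            # Map the number to a symbolic name where the platform knows it.
            name = signal.Signals(signum).name
        except ValueError:
            name = f"signal {signum}"
        return f"killed by {name}"
    return f"exited with status {code}"


# 137 is the exit code reported for the vscode-apache-camel container.
print(describe_exit_code(137))
```

On Linux, 137 - 128 = 9 maps to SIGKILL, which is the signal the kernel OOM killer sends when a container exceeds its memory limit.
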
@Katka92 added the kind/bug and area/plugins labels Feb 1, 2021
@ibuziuk
Member

ibuziuk commented Feb 1, 2021

@Katka92 could you please provide screenshots / logs from vanilla Che 7.25.0 (not Hosted Che)?

@ibuziuk added the severity/P1 label and removed the area/plugins label Feb 1, 2021
@ibuziuk
Member

ibuziuk commented Feb 1, 2021

cc: @apupier

@Katka92
Contributor Author

Katka92 commented Feb 1, 2021

@ibuziuk Screenshot and logs from vanilla Che 7.25.0:
Screenshot from 2021-02-01 16-01-38

2021-02-01 15:00:45,002[aceSharedPool-2]  [WARN ] [.i.k.KubernetesInternalRuntime 257]  - Failed to start Kubernetes runtime of workspace workspacedtnorlebw6suf0wz.
org.eclipse.che.api.workspace.server.spi.InfrastructureException: The following containers have terminated:
vscode-apache-camel6sk: reason = 'OOMKilled', exit code = 137, message = 'null'
	at org.eclipse.che.workspace.infrastructure.kubernetes.namespace.KubernetesDeployments.handleStartingPodStatus(KubernetesDeployments.java:465)
	at org.eclipse.che.workspace.infrastructure.kubernetes.namespace.KubernetesDeployments$2.eventReceived(KubernetesDeployments.java:378)
	at org.eclipse.che.workspace.infrastructure.kubernetes.namespace.KubernetesDeployments$2.eventReceived(KubernetesDeployments.java:375)
	at io.fabric8.kubernetes.client.utils.WatcherToggle.eventReceived(WatcherToggle.java:49)
	at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$1.onMessage(WatchConnectionManager.java:237)
	at okhttp3.internal.ws.RealWebSocket.onReadMessage(RealWebSocket.java:323)
	at okhttp3.internal.ws.WebSocketReader.readMessageFrame(WebSocketReader.java:219)
	at okhttp3.internal.ws.WebSocketReader.processNextFrame(WebSocketReader.java:105)
	at okhttp3.internal.ws.RealWebSocket.loopReader(RealWebSocket.java:274)
	at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:214)
	at okhttp3.RealCall$AsyncCall.execute(RealCall.java:203)
	at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.base/java.lang.Thread.run(Unknown Source)

@ibuziuk
Member

ibuziuk commented Feb 1, 2021

adding plugin / devfile registry labels for investigation

@ibuziuk
Member

ibuziuk commented Feb 1, 2021

@apupier might be related to eclipse-che/che-plugin-registry@22c01b2

@apupier
Contributor

apupier commented Feb 1, 2021

@apupier might be related to eclipse/che-plugin-registry@22c01b2

I do not think so. The OOM is on the vscode-apache-camelXXX container, so it is with VS Code Language Support for Apache Camel, not the VS Code Tooling for Camel K.

Also, why would it not have been caught by the PR? I thought there was a check that the workspaces are starting.

@apupier
Contributor

apupier commented Feb 1, 2021

So it would be eclipse-che/che-plugin-registry@fcc5dbb

@apupier
Contributor

apupier commented Feb 1, 2021

Reproducing the OOM on che.openshift.io, so Che 7.24, using this gist https://gist.githubusercontent.com/apupier/91dcedc79405899cdbe1e9fbd5285e5f/raw/8ac74a51d607a17d2d8207d581f83612156143f8/meta.yaml
and this updated devfile:

apiVersion: 1.0.0
metadata:
  name: apache-camel-springboot-uf1gn
attributes:
  persistVolumes: 'false'
projects:
  - name: fuse-rest-http-booster
    source:
      location: 'https://github.com/jboss-fuse/fuse-rest-http-booster'
      type: git
      branch: master
components:
  - id: redhat/vscode-xml/latest
    memoryLimit: 150Mi
    type: chePlugin
  - memoryLimit: 260Mi
    type: chePlugin
    reference: 'https://raw.githubusercontent.com/apupier/che-plugin-registry/18204-upgradeVSCodeApacheCamelLanguageSupport/v3/plugins/redhat/vscode-apache-camel/0.0.28/meta.yaml'
  - id: redhat/java/latest
    memoryLimit: 1360Mi
    type: chePlugin
  - mountSources: true
    endpoints:
      - name: 8080-tcp
        port: 8080
    memoryLimit: 512Mi
    type: dockerimage
    volumes:
      - name: m2
        containerPath: /home/user/.m2
    image: 'quay.io/eclipse/che-java8-maven:7.20.0'
    alias: maven
    env:
      - value: ''
        name: MAVEN_CONFIG
      - value: '-XX:MaxRAMPercentage=50.0 -XX:+UseParallelGC -XX:MinHeapFreeRatio=10 -XX:MaxHeapFreeRatio=20 -XX:GCTimeRatio=4 -XX:AdaptiveSizePolicyWeight=90 -Dsun.zip.disableMemoryMapping=true -Xms20m -Djava.security.egd=file:/dev/./urandom'
        name: MAVEN_OPTS
      - value: '-XX:MaxRAMPercentage=50.0 -XX:+UseParallelGC -XX:MinHeapFreeRatio=10 -XX:MaxHeapFreeRatio=20 -XX:GCTimeRatio=4 -XX:AdaptiveSizePolicyWeight=90 -Dsun.zip.disableMemoryMapping=true -Xms20m -Djava.security.egd=file:/dev/./urandom'
        name: JAVA_OPTS
      - value: '-XX:MaxRAMPercentage=50.0 -XX:+UseParallelGC -XX:MinHeapFreeRatio=10 -XX:MaxHeapFreeRatio=20 -XX:GCTimeRatio=4 -XX:AdaptiveSizePolicyWeight=90 -Dsun.zip.disableMemoryMapping=true -Xms20m -Djava.security.egd=file:/dev/./urandom'
        name: JAVA_TOOL_OPTIONS
commands:
  - name: Debug remote java application
    actions:
      - referenceContent: |
          {
          "version": "0.2.0",
          "configurations": [
            {
              "type": "java",
              "name": "Debug (Attach) - Remote",
              "request": "attach",
              "hostName": "localhost",
              "port": 5005
            }]
          }
        type: vscode-launch
  - name: build the project
    actions:
      - workdir: '${CHE_PROJECTS_ROOT}/fuse-rest-http-booster'
        type: exec
        command: mvn clean install
        component: maven
  - name: run the services
    actions:
      - workdir: '${CHE_PROJECTS_ROOT}/fuse-rest-http-booster'
        type: exec
        command: 'mvn spring-boot:run -DskipTests'
        component: maven
  - name: run the services (debugging enabled)
    actions:
      - workdir: '${CHE_PROJECTS_ROOT}/fuse-rest-http-booster'
        type: exec
        command: 'mvn spring-boot:run -DskipTests -Drun.jvmArguments="-Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=5005"'
        component: maven

Unless one of the other plugins now has more space or che.openshift.io is providing more space, we are doomed. We were already at the maximum possible memory.
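
To make the "max of possible memory" point concrete, here is a rough Python sketch that adds up the `memoryLimit` values declared in the devfile above. It only counts the four components listed there; containers that Che adds on its own (such as the editor) are not included, and the overall quota on che.openshift.io is not stated in this thread, so the total is a lower bound rather than the full picture.

```python
import re

# memoryLimit entries copied from the devfile above, in order of appearance:
# vscode-xml, vscode-apache-camel, java, and the maven container.
DEVFILE_LIMITS = """
memoryLimit: 150Mi
memoryLimit: 260Mi
memoryLimit: 1360Mi
memoryLimit: 512Mi
"""


def total_mib(text: str) -> int:
    """Sum every `memoryLimit: <n>Mi` entry found in the text."""
    return sum(int(n) for n in re.findall(r"memoryLimit:\s*(\d+)Mi", text))


total = total_mib(DEVFILE_LIMITS)
print(f"declared component limits: {total}Mi")
```

Any increase for the Camel plugin has to come out of whatever headroom remains between this sum and the workspace quota.
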

@apupier
Contributor

apupier commented Feb 1, 2021

Unless there is a bug that explains why it is taking more memory. But how can that be investigated? How can a heap dump be retrieved to see what is taking the memory in the container?
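
One possible starting point, before any heap dump: a JVM heap dump only covers the Java heap, while the OOM killer acts on the memory of the whole container, so it can help to look at the container's own cgroup counters first. The sketch below reads the standard kernel files for cgroup v2 and v1. It assumes a Python interpreter is available in the sidecar, which is not confirmed here; the same files can be read with `cat` otherwise. Which files exist depends on the cgroup version and kernel, so missing ones are reported as `None`.

```python
from pathlib import Path

# Candidate files per metric: cgroup v2 path first, then the cgroup v1 path.
CANDIDATES = {
    "current": ["/sys/fs/cgroup/memory.current",
                "/sys/fs/cgroup/memory/memory.usage_in_bytes"],
    "limit": ["/sys/fs/cgroup/memory.max",
              "/sys/fs/cgroup/memory/memory.limit_in_bytes"],
    "peak": ["/sys/fs/cgroup/memory.peak",
             "/sys/fs/cgroup/memory/memory.max_usage_in_bytes"],
}


def read_first(paths):
    """Return the stripped content of the first readable file, else None."""
    for p in paths:
        try:
            return Path(p).read_text().strip()
        except OSError:
            continue
    return None


def to_mib(raw):
    """Convert a byte count string to MiB; pass through None or 'max'."""
    if raw is None or not raw.isdigit():
        return raw
    return round(int(raw) / 2**20, 1)


def cgroup_memory():
    """Collect current usage, limit, and peak usage for this container."""
    return {key: to_mib(read_first(paths)) for key, paths in CANDIDATES.items()}


for key, value in cgroup_memory().items():
    print(f"{key}: {value}")
```

If the peak value sits close to the limit while the current value is much lower, that would point to a short-lived spike (for example during startup) rather than steady growth.
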

@apupier
Contributor

apupier commented Feb 1, 2021

I increased the memory for VS Code Language Support for Camel to 350M. It is starting and working (and I removed the Java plugin).

Running top in its container gives:

  PID USER      PR  NI    VIRT    RES  %CPU  %MEM     TIME+ S COMMAND
    1 user      20   0  320.4m  53.1m   0.0   0.2   0:05.63 S /remote-endpoint/plugin-remote-endpoint
   22 user      20   0 1608.9m  86.4m   0.0   0.3   0:05.07 S  `- java -jar /tmp/vscode-unpacked/redhat.vscod+
   57 user      20   0    1.6m   0.5m   0.0   0.0   0:00.01 S /bin/sh
   64 user      20   0    1.8m   0.8m   0.0   0.0   0:00.09 R  `- top

What can the 1608.9m represent for the Camel Language Server jar? 1.6G? It sounds impossible, both given the limit set and because in practice it is taking less than 70M.

Is the 320.4m for the remote endpoint the one that is relevant for the limit? (300Mi is failing, 350Mi is working)
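
As background for the question above: in `top`, VIRT is the virtual address space a process has reserved and RES is the memory actually resident in RAM. A container memory limit is enforced on memory that is really in use (resident pages plus page cache), not on reserved address space, and a JVM typically reserves far more address space than it touches. The Python sketch below sums both columns from the rows of the `top` output above; summing RES is only an approximation, since shared pages can be counted once per process.

```python
# Rows copied from the top output above (tree markers removed).
TOP_ROWS = """\
1 user 20 0 320.4m 53.1m 0.0 0.2 0:05.63 S /remote-endpoint/plugin-remote-endpoint
22 user 20 0 1608.9m 86.4m 0.0 0.3 0:05.07 S java -jar /tmp/vscode-unpacked/redhat.vscod+
57 user 20 0 1.6m 0.5m 0.0 0.0 0:00.01 S /bin/sh
64 user 20 0 1.8m 0.8m 0.0 0.0 0:00.09 R top
"""

VIRT_COLUMN = 4  # zero-based index of VIRT in each row
RES_COLUMN = 5   # zero-based index of RES in each row


def column_total(rows: str, index: int) -> float:
    """Sum a column whose values are megabyte figures with an 'm' suffix."""
    total = 0.0
    for line in rows.splitlines():
        fields = line.split()
        total += float(fields[index].rstrip("m"))
    return round(total, 1)


virt = column_total(TOP_ROWS, VIRT_COLUMN)
res = column_total(TOP_ROWS, RES_COLUMN)
print(f"VIRT total: {virt}m, RES total: {res}m")
```

The RES total at the time of this snapshot is well under 300Mi, which may mean the limit is hit by a temporary peak earlier in startup; nothing in this thread confirms that, so treat it as a hypothesis to check.
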

@ibuziuk
Member

ibuziuk commented Feb 2, 2021

@ericwill any recommendations ^?

apupier added a commit to apupier/che-plugin-registry that referenced this issue Feb 2, 2021
apupier added a commit to eclipse-che/che-devfile-registry that referenced this issue Feb 2, 2021
eclipse-che/che#18939

-adjusting for Camel on SpringBoot because we are close to max capacity
for the devfile
- no need to adjust for Camel K stack, picking the default is enough

Signed-off-by: Aurélien Pupier <apupier@redhat.com>
@apupier apupier added this to the 7.26 milestone Feb 2, 2021
@Katka92
Contributor Author

Katka92 commented Feb 2, 2021

Also why it would not have been caught by the PR? I thought there was a check that the workspaces are starting.

@apupier Happy path in the eclipse/che repo tests if a workspace is able to start. It's not going through all the devfiles in the Getting Started page as it would increase the run time of a test by more than an hour. It's covering a happy path, it's not meant to test everything. It is using its own devfile with pre-set commands etc. to make the execution of the test as fast as possible.

We have a pre-release suite that currently contains several devfile tests, but unfortunately, this one is not covered yet.

ericwill pushed a commit to eclipse-che/che-devfile-registry that referenced this issue Feb 2, 2021
…Mi (#332)

eclipse-che/che#18939

-adjusting for Camel on SpringBoot because we are close to max capacity
for the devfile
- no need to adjust for Camel K stack, picking the default is enough

Signed-off-by: Aurélien Pupier <apupier@redhat.com>
ericwill pushed a commit to eclipse-che/che-plugin-registry that referenced this issue Feb 2, 2021
ericwill pushed a commit to eclipse-che/che-plugin-registry that referenced this issue Feb 2, 2021
ericwill pushed a commit to eclipse-che/che-devfile-registry that referenced this issue Feb 2, 2021
…Mi (#332)

eclipse-che/che#18939

-adjusting for Camel on SpringBoot because we are close to max capacity
for the devfile
- no need to adjust for Camel K stack, picking the default is enough

Signed-off-by: Aurélien Pupier <apupier@redhat.com>
@apupier
Contributor

apupier commented Feb 2, 2021

fixed in 7.26 and 7.25.1 (no milestone 7.25.1?)

Created https://issues.redhat.com/browse/FUSETOOLS2-965 to investigate why it is requiring so much memory (but unless we have more precise pointers, I fear it won't have a high priority).

@apupier apupier closed this as completed Feb 2, 2021
@ibuziuk
Member

ibuziuk commented Feb 2, 2021

@apupier thank you!

@ScrewTSW
Member

ScrewTSW commented Feb 8, 2021

verified working on 7.25.2
