Catch log: "Container Sandbox: Unsupported syscall setsockopt" from Google Cloud Run #1739

Closed
attakei opened this issue Feb 3, 2020 · 29 comments
Labels: area: Cloud Run (A Cloud Run related issue); area: compatibility (Issue related to (Linux) kernel compatibility); priority: p2 (Normal priority); type: bug (Something isn't working)

@attakei

attakei commented Feb 3, 2020

I don't know if this is the right place to report this kind of issue, as it is related to gVisor but running on top of GKE.

Description

I am trying to use the nginx-unit image (https://hub.docker.com/r/nginx/unit) on Google Cloud Run.
However, when the container runs, the kill command fails.

In container process

This image runs entrypoint.sh, which performs four steps in shell:

  1. Run a background process.
  2. Inject configuration into the process.
  3. Stop the background process with the kill command.
  4. Run the foreground process.

Currently, when running an application container based on the vendor's official image, the kill command is not accepted and the service is not available.

Cloud Run outputs this log while the container is running:

Container Sandbox: Unsupported syscall setsockopt(0xb,0x6,0x9,0x3ee1608589cc,0x4,0x29910fc86500). It is very likely that you can safely ignore this message and that this is not the cause of any error you might be troubleshooting. Please, refer to https://gvisor.dev/c/linux/amd64/setsockopt for more information.

Reproduce steps

Build the image from the repository and run a service from that image.
https://gitlab.com/attakei-sandbox/gvisor-issue-setsockopt

I saw the logs from a service in the Iowa region (GCP).
Please see the exported CSV log from GCP.

Information from other environments

Local docker engine

Runs normally.

$ docker version
Client:
 Version:           19.03.5-ce
 API version:       1.40
 Go version:        go1.13.4
 Git commit:        633a0ea838
 Built:             Fri Nov 15 03:19:09 2019
 OS/Arch:           linux/amd64
 Experimental:      false

Server:
 Engine:
  Version:          19.03.5-ce
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.13.4
  Git commit:       633a0ea838
  Built:            Fri Nov 15 03:17:51 2019
  OS/Arch:          linux/amd64
  Experimental:     true
 containerd:
  Version:          v1.3.2.m
  GitCommit:        d50db0a42053864a270f648048f9a8b4f24eced3.m
 runc:
  Version:          1.0.0-rc10
  GitCommit:        dc9208a3303feef5b3839f4323d9beb36df0a9dd
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683

Local docker engine with runsc

Runs normally.

$ runsc --version
runsc version release-20200127.0-51-g02997af5abd6
spec: 1.0.1-dev

@ianlewis added the area: compatibility, priority: p2, type: bug, and area: Cloud Run labels on Feb 6, 2020
@coronaction

coronaction commented May 15, 2020

I am also getting a similar message on Google Cloud Run for a container running a Java program wrapped in the Quarkus framework. I'm happy to provide additional info if I know what is of interest for this case. Just let me know.

Some messages:

  1. Container Sandbox: Unsupported syscall setsockopt(0xae,0x0,0xb,0x3e6ff77fc1d4,0x4,0x0)
  2. Container Sandbox: Unsupported syscall setsockopt(0xae,0x29,0x31,0x3e6ff77fd7b4,0x4,0x4)
  3. Container Sandbox: Unsupported syscall setsockopt(0xae,0x29,0x12,0x3e6ff77fd7bc,0x4,0x4)

@pebo

pebo commented May 29, 2020

The latest official Node version, v12.17.0, triggers setsockopt warnings on Cloud Run, e.g.

Container Sandbox: Unsupported syscall setsockopt(0x13,0x6,0x6,0x3ea340cbc70c,0x4,0x1cc3929404b1).
Container Sandbox: Unsupported syscall setsockopt(0x1b,0x6,0x6,0x3ea340cbc70c,0x4,0x1cc3929404b1)

Would it be possible to suppress these warnings, as Cloud Logging gets spammed?

@didier-durand
Contributor

Who are you asking this question: the gVisor team or me?

@pebo

pebo commented May 29, 2020

@didier-durand You probably got notified because you've subscribed to this issue.

I guess it's a feature request to the gVisor team. It would be nice to be able to suppress warnings (e.g. for socket options that are not or cannot be implemented in gVisor).

@didier-durand
Contributor

I agree with you but then we have to be able to select this option from the Google Cloud UI if we need to remove this logging.

@fevernova90

Same for me, getting this syslog spamming my whole Logging Stack. Running Cloud Run.

@Sytten

Sytten commented Jun 24, 2020

Also having this issue, but different code:

Container Sandbox: Unsupported syscall setsockopt(0x13,0x6,0x6,0x3ef9e6878e2c,0x4,0xab9dd404b1)

I am using nodejs 12.18.

@shvgn

shvgn commented Jul 5, 2020

I use k6.io to run load tests on my service in Cloud Run. What I see is that roughly 2% of TCP connections fail. In the Cloud Run logs I see lots of these (along with membarrier):

Container Sandbox: Unsupported syscall setsockopt(0x17,0x6,0x6,0x3e6f301f9734,0x4,0x0). It is very likely that you can safely ignore this ...

k6 warnings during the test run:

...
WARN[0629] Request Failed      error="Get \"https://<...>.run.app/<...>\": dial tcp <...>:443: i/o timeout"
WARN[0629] Request Failed      error="Get \"https://<...>.run.app/<...>\": unexpected EOF"
WARN[0629] Request Failed      error="Post \"https://<...>.run.app/<...>\": dial tcp <...>:443: i/o timeout"
WARN[0630] Request Failed      error="Get \"https://<...>.run.app/<...>\": unexpected EOF"
WARN[0630] Request Failed      error="Post \"https://<...>.run.app/<...>\": write tcp <...>:62719-><...>:443: write: broken pipe"
WARN[0631] Request Failed      error="Post \"https://<...>.run.app/<...>\": dial tcp <...>:443: i/o timeout"
WARN[0631] Request Failed      error="Post \"https://<...>.run.app/<...>\": dial tcp <...>:443: i/o timeout"
WARN[0632] Request Failed      error="Post \"https://<...>.run.app/<...>\": dial tcp <...>:443: i/o timeout"
WARN[0632] Request Failed      error="Post \"https://<...>.run.app/<...>\": unexpected EOF"
WARN[0632] Request Failed      error="Post \"https://<...>.run.app/<...>\": unexpected EOF"
WARN[0633] Request Failed      error="Post \"https://<...>.run.app/<...>\": unexpected EOF"
WARN[0633] Request Failed      error="Get \"https://<...>.run.app/<...>\": dial tcp <...>:443: i/o timeout"
WARN[0633] Request Failed      error="Get \"https://<...>.run.app/<...>\": dial tcp <...>:443: i/o timeout"
WARN[0633] Request Failed      error="Post \"https://<...>.run.app/<...>\": dial tcp <...>:443: i/o timeout"
WARN[0634] Request Failed      error="Post \"https://<...>.run.app/<...>\": dial tcp <...>:443: i/o timeout"
WARN[0634] Request Failed      error="Post \"https://<...>.run.app/<...>\": EOF"

The service base image is node:14-alpine.

@iangudger
Contributor

This seems to be SOL_TCP, TCP_KEEPCNT which was fixed in 4b9652d.
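
The arguments in these log lines follow the setsockopt(2) signature: the socket descriptor comes first, then the option level, then the option name. That makes it possible to map a message back to a named socket option. The following is a minimal Python sketch of that lookup, not part of gVisor or Cloud Run. The helper name `decode_setsockopt` is hypothetical, and the constant table is a deliberately partial assumption copied from the Linux headers for amd64, covering only options that show up in this thread.

```python
import re

# Option levels and names as defined in the Linux headers (amd64).
# Partial table for illustration only; extend it as needed.
LEVELS = {0x0: "SOL_IP", 0x1: "SOL_SOCKET", 0x6: "SOL_TCP", 0x29: "SOL_IPV6"}
OPTNAMES = {
    "SOL_SOCKET": {0xD: "SO_LINGER"},
    "SOL_IP": {0xA: "IP_MTU_DISCOVER", 0xB: "IP_RECVERR"},
    "SOL_TCP": {0x6: "TCP_KEEPCNT", 0x9: "TCP_DEFER_ACCEPT"},
    "SOL_IPV6": {0x12: "IPV6_MULTICAST_HOPS", 0x31: "IPV6_RECVPKTINFO"},
}

# Captures the comma-separated hex arguments inside setsockopt(...).
PATTERN = re.compile(r"Unsupported syscall setsockopt\(([^)]*)\)")


def decode_setsockopt(message):
    """Return (level, optname) for a log message, or None if it does not match.

    Values missing from the table are returned as hex strings.
    """
    match = PATTERN.search(message)
    if match is None:
        return None
    args = [int(arg, 16) for arg in match.group(1).split(",")]
    if len(args) < 3:
        return None
    level = LEVELS.get(args[1], hex(args[1]))
    optname = OPTNAMES.get(level, {}).get(args[2], hex(args[2]))
    return level, optname


line = "Container Sandbox: Unsupported syscall setsockopt(0x13,0x6,0x6,0x3ea340cbc70c,0x4,0x1cc3929404b1)."
print(decode_setsockopt(line))  # ('SOL_TCP', 'TCP_KEEPCNT')
```

Under this table, the message from the original report, `setsockopt(0xb,0x6,0x9,...)`, decodes to `('SOL_TCP', 'TCP_DEFER_ACCEPT')`. The option name is an enumerated value whose meaning depends on the level, so the same number at a different level is a different option.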

@RtypeStudios

RtypeStudios commented Jul 13, 2020

I have thousands of these appearing. I'm using .NET Core on this image:

FROM mcr.microsoft.com/dotnet/core/aspnet:3.1.2-alpine3.11

Would be great to filter these out as it makes log reading a bit difficult.

[Image attachment]

@AndreiIgna

@iangudger is there something we can do after that fix?

Having the same problem, it's quite hard to see something useful in logs when this line is duplicated so many times
[Screenshot 2020-08-11 at 20 13 59]

gVisor is referenced in the log https://gvisor.dev/c/linux/amd64/setsockopt

@ytnobody

ytnobody commented Aug 14, 2020

I saw this too. Today there was a "Container Sandbox: Unsupported syscall membarrier" log on Google Cloud Run.

[Screenshot 2020-08-14 12 08 20]

This happens about 1 to 5 times a day.

I deployed a container image based on golang:1.12-alpine.

@iangudger
Contributor

@RtypeStudios Can you post the full log line? You cut off the important part.

@AndreiIgna Your logs are about a different socket option (SOL_IP, IP_MTU_DISCOVER). That is tracked in #1643.

@ytnobody Your logs are about a different syscall entirely (membarrier). Please see the compatibility note in the log line that you posted. membarrier is being tracked in #267.

@nlacasse Has 4b9652d rolled out to Cloud Run yet?

@pebo

pebo commented Sep 30, 2020

We get "warnings" logged for Cloud Run containers running a JVM app with ktor / netty and google libraries for accessing BQ and GCS.

Is there an issue tracking: Container Sandbox: Unsupported syscall setsockopt(0x13,0x0,0xb,0x3ed13c7f9974,0x4,0x2c1) ?

@vojkny

vojkny commented Nov 20, 2020

Similar thing here: it is spamming my logs, making it hard to see what is relevant.

@marcelsauer4711

Getting the same message in the logs. Java Spring application.

{
  "textPayload": "Container Sandbox: Unsupported syscall setsockopt(0xc9,0x29,0x12,0x3dfefc9fd864,0x4,0x3). It is very likely that you can safely ignore this message and that this is not the cause of any error you might be troubleshooting. Please, refer to https://gvisor.dev/c/linux/amd64/setsockopt for more information.",
  "insertId": "5fbbb75400091587f1e993e7",
  "resource": {
    "type": "cloud_run_revision",
    "labels": {
      "revision_name": "helloworld-24fjz",
      "project_id": "xxx",
      "configuration_name": "helloworld",
      "location": "europe-west1",
      "service_name": "helloworld"
    }
  },
  "timestamp": "2020-11-23T13:21:24.595316477Z",
  "severity": "DEBUG",
  "labels": {
    "instanceId": "xxx"
  },
  "logName": "xxx",
  "receiveTimestamp": "2020-11-23T13:21:24.783347593Z"
}

@sshcherbakov

sshcherbakov commented Jan 28, 2021

Here's an error from my Java Spring Boot based gRPC server on Cloud Run:

Container Sandbox: Unsupported syscall setsockopt(0x6,0x29,0x31,0x3efbd9dfc3a4,0x4,0x0). It is very likely that you can safely ignore this message and that this is not the cause of any error you might be troubleshooting. Please, refer to https://gvisor.dev/c/linux/amd64/setsockopt for more information

Not sure, but that sounds like: SO_DEBUG, SO_DONTROUTE, SO_BROADCAST socket options, right?
Which ones in particular are not supported by gVisor, SO_DEBUG?

@RtypeStudios

RtypeStudios commented Feb 3, 2021

@iangudger So sorry, just saw your request for more information. I'm still getting thousands of them.

2021-02-03 18:09:04.374 AWST Container Sandbox: Unsupported syscall setsockopt(0x12e,0x1,0xd,0x3e23363fefb8,0x8,0x3e234cef5490).
It is very likely that you can safely ignore this message and
that this is not the cause of any error you might be troubleshooting. Please, refer to
https://gvisor.dev/c/linux/amd64/setsockopt for more information.

2021-02-03 18:09:04.375 AWST Container Sandbox: Unsupported syscall setsockopt(0x12e,0x1,0xd,0x3e23363fefb8,0x8,0x3e234cf1a688).
It is very likely that you can safely ignore this message and
that this is not the cause of any error you might be troubleshooting. Please, refer to
https://gvisor.dev/c/linux/amd64/setsockopt for more information.

2021-02-03 18:09:05.489 AWST Container Sandbox: Unsupported syscall setsockopt(0x12f,0x1,0xd,0x3e2345fffa38,0x8,0x0).
It is very likely that you can safely ignore this message and
that this is not the cause of any error you might be troubleshooting. Please, refer to
https://gvisor.dev/c/linux/amd64/setsockopt for more information.

@sshcherbakov

0xd and 0x31 have only the first bit (SO_DEBUG) in common.
Am I completely off here?

@RtypeStudios

It seems to be the same in the errors I have posted, but different from the errors others have posted. I have no idea what they mean :)

@sshcherbakov

Sorry, the error I listed didn't affect the functionality of my gRPC server (the cause of the malfunction was something else), so please ignore it.

@louis030195

I don't know if I should create a new issue:

Container Sandbox: Unsupported syscall sched_getattr(0x37d,0x3e045912c300,0x38,0x0,0x1,0x3e045912c300). It is very likely that you can safely ignore this message and that this is not the cause of any error you might be troubleshooting. Please, refer to https://gvisor.dev/docs/user_guide/compatibility/linux/amd64/sched_getattr for more information.

The documentation page does not exist.
I suspect this is caused by the use of playwright.dev (Python API) or maybe beautifulsoup.

FROM mcr.microsoft.com/playwright:focal

dependencies

google-cloud
google-cloud-firestore
google-cloud-storage
Flask[async]==2.0.2
gunicorn==20.1.0
beautifulsoup4
playwright
requests
fire
tqdm
pandas
openai
scraperapi-sdk
parsel
aiologger

@johnf1004

Did anyone ever figure out how to suppress these warning messages?

@petehannam

@johnf1004 use the gen2 execution environment: https://cloud.google.com/run/docs/about-execution-environments

@johnf1004

Nice, thank you @petehannam !

Works nicely when deploying through the console. Any idea if it's possible to specify the gen2 environment with gcloud? I'm looking through the flags for gcloud run deploy and can't see anything.

@petehannam

@johnf1004 The documentation has details on how to run it via the command line:

gcloud beta run deploy --image IMAGE_URL --execution-environment gen2

@vojkny

vojkny commented May 31, 2022

Note that I am avoiding gen2 because of slower cold starts.

@kevinGC
Collaborator

kevinGC commented Jun 10, 2022

This issue sort of straddles the line between gVisor and its downstream consumers. I believe that Cloud Run and others allow for logs to be filtered via terms such as NOT "Unsupported syscall setsockopt".

From gVisor's perspective, the unsupported syscall logs are important. In the rare cases where unsupported syscalls do affect program behavior, the logs are an important debugging tool. We don't want to remove them, as when things do break they will be extra difficult to debug both for users and for us.

Please do file specific issues if you're getting major logspam or application behavior is affected. For now, this issue seems to have become a catchall and I think we should have users file specific bugs for specific messages.
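
The term-based filtering mentioned in the comment above can also be applied to logs after they have been exported. The following is a minimal Python sketch that runs locally and does not call any Cloud Logging API. It assumes entries shaped like the JSON entry posted earlier in this thread, with a `textPayload` string field. The function name `drop_unsupported_syscall` is hypothetical.

```python
def drop_unsupported_syscall(entries, needle="Unsupported syscall setsockopt"):
    """Return only the entries whose textPayload does not contain `needle`.

    Entries without a textPayload field (or with a null one) are kept.
    """
    return [
        entry for entry in entries
        if needle not in (entry.get("textPayload") or "")
    ]


# Hypothetical exported entries; only the first is a sandbox warning.
entries = [
    {"textPayload": "Container Sandbox: Unsupported syscall setsockopt(0xc9,0x29,0x12,0x3dfefc9fd864,0x4,0x3).", "severity": "DEBUG"},
    {"textPayload": "request handled", "severity": "INFO"},
    {"severity": "INFO"},
]
for entry in drop_unsupported_syscall(entries):
    print(entry)
```

Keeping the match string as a parameter lets the same helper hide other messages, such as the membarrier one, without discarding the original log data.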

@kevinGC closed this as completed Jun 10, 2022
@kevinGC
Collaborator

kevinGC commented Jun 10, 2022

For anyone coming across this in the future: if you're seeing the Unsupported syscall message and it either is (1) affecting application behavior or (2) logspamming like crazy, please open an issue for your particular message. I'm closing this one because it's too many issues clumped together and it's not clear which need addressing.
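
To see which particular message dominates before opening a specific issue, the log lines can be grouped by syscall. The following is a minimal Python sketch, assuming the messages are available as plain strings (for example, from an exported log). The helper name `tally_unsupported` is hypothetical. For setsockopt it keys on the second and third arguments (level and option name), since the descriptor and pointer arguments vary from call to call.

```python
import re
from collections import Counter

# Matches the syscall name and, when present, its parenthesised arguments.
PATTERN = re.compile(r"Unsupported syscall (\w+)(?:\(([^)]*)\))?")


def tally_unsupported(messages):
    """Count messages per syscall; setsockopt is split by (level, optname)."""
    counts = Counter()
    for message in messages:
        match = PATTERN.search(message)
        if match is None:
            continue
        name, raw_args = match.group(1), match.group(2)
        key = (name,)
        if name == "setsockopt" and raw_args:
            args = [arg.strip() for arg in raw_args.split(",")]
            if len(args) >= 3:
                key = (name, args[1], args[2])
        counts[key] += 1
    return counts


messages = [
    "Container Sandbox: Unsupported syscall setsockopt(0x13,0x6,0x6,0x3ea340cbc70c,0x4,0x1cc3929404b1).",
    "Container Sandbox: Unsupported syscall setsockopt(0x1b,0x6,0x6,0x3ea340cbc70c,0x4,0x1cc3929404b1)",
    "Container Sandbox: Unsupported syscall membarrier",
]
for key, count in tally_unsupported(messages).most_common():
    print(count, key)
```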
