
Twoliter's build of krane appears to be linking against libc #403

Closed
cbgbt opened this issue Oct 21, 2024 · 5 comments · Fixed by #405

Comments

@cbgbt
Contributor

cbgbt commented Oct 21, 2024

Twoliter fails to interact with OCI images on hosts whose libc is older than the one in our cross-build environment (defined to some degree here; the build of krane occurs here).

When the error occurs, it looks like this:

[2024-10-21T13:16:37Z INFO  twoliter::project::lock::image] Resolving dependency image dependency 'bottlerocket-core-kit-3.0.0@public.ecr.aws/bottlerocket/bottlerocket-core-kit:v3.0.0'.
Error: Failed to run operation with image tool:
 command: /proc/9302/fd/9 manifest public.ecr.aws/bottlerocket/bottlerocket-core-kit:v3.0.0

We need to:

  • Fix the go build of krane to link against musl
  • Fix the logging in the oci-tool so that errors from executing the tool surface its stdout/stderr in our logs
@cbgbt
Contributor Author

cbgbt commented Oct 21, 2024

It would be great if we could combo this with #398, or even provide a Makefile target to execute cross builds locally, so that we can test more easily.

@sam-berning
Contributor

I hacked together a branch of twoliter that builds krane statically and ran that on my machine that has this issue, and it's still hitting the same problem. So I'm not sure this is a linking issue.

I also tried to improve the error message by including stdout and the exit status as well, but it's still not very helpful:

[2024-10-28T22:44:08Z INFO  twoliter::project::lock::image] Resolving dependency image dependency 'bottlerocket-core-kit-3.0.0@public.ecr.aws/bottlerocket/bottlerocket-core-kit:v3.0.0'.
Error: Failed to run operation with image tool: status: signal: 9 (SIGKILL) stderr:  stdout: 
 command: /proc/2573/fd/9 manifest public.ecr.aws/bottlerocket/bottlerocket-core-kit:v3.0.0

The only bit of info we get from that is that the process exited due to a SIGKILL from somewhere, but that could be a number of things. I'll keep looking into it.

@cbgbt
Contributor Author

cbgbt commented Oct 29, 2024

This may be due to our use of pentacle to create sealed anonymous files on Linux. I wonder if the kernel headers present at build time may influence us here.

Per this comment, some of the sealing behavior is kernel-version dependent, though pentacle should be resilient to missing features. Here's where the seals are added. You could try running this with the log level set to trace to see if you get anything from this function call.

@sam-berning
Contributor

Yeah, it looks like this might be related. If I strace the twoliter update command on a system that it works on, the F_ADD_SEALS syscalls look like this:

fcntl(9, F_ADD_SEALS, F_SEAL_EXEC)      = 0
fcntl(9, F_ADD_SEALS, 0)                = 0
fcntl(9, F_ADD_SEALS, F_SEAL_SEAL|F_SEAL_SHRINK|F_SEAL_GROW|F_SEAL_WRITE) = 0
fcntl(9, F_GET_SEALS)                   = 0x3f (seals F_SEAL_SEAL|F_SEAL_SHRINK|F_SEAL_GROW|F_SEAL_WRITE|F_SEAL_FUTURE_WRITE|F_SEAL_EXEC)

compared to a system that twoliter update fails on:

fcntl(9, F_ADD_SEALS, 0x20 /* F_SEAL_??? */) = -1 EINVAL (Invalid argument)
fcntl(9, F_ADD_SEALS, 0)                = 0
fcntl(9, F_ADD_SEALS, F_SEAL_SEAL|F_SEAL_SHRINK|F_SEAL_GROW|F_SEAL_WRITE) = 0
fcntl(9, F_GET_SEALS)                   = 0xf (seals F_SEAL_SEAL|F_SEAL_SHRINK|F_SEAL_GROW|F_SEAL_WRITE)

So the binary never gets sealed with F_SEAL_EXEC, which means that the exec bits can be changed.

@sam-berning
Contributor

I've seen twoliter update fail at different points in the process (sometimes on the first krane manifest, sometimes it succeeds at krane manifest but fails at krane config), but it always exits with SIGKILL.
