Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docker: Fix userdata bind mount path inside the container #2222

Merged

Conversation

bpradipt
Copy link
Member

For docker provider we use systemd container which mounts /run as a tmpfs volume.
This mounting of /run happens after docker has bind mounted the userdata file inside the container and consequently the userdata file is no longer accessible.

This commit changes the path of the userdata file to not use /run for the docker provider.

@bpradipt bpradipt requested a review from a team as a code owner December 23, 2024 17:52
@bpradipt
Copy link
Member Author

I'm not sure if there is a better way to solve the issue. Open for suggestion

Copy link
Collaborator

@mkulke mkulke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's probably better to change the default mount path for the ci disk rather than adding an exception for docker, no? it doesn't need to be in /run I think, it can be in /mnt or wherever

@bpradipt
Copy link
Member Author

it's probably better to change the default mount path for the ci disk rather than adding an exception for docker, no? it doesn't need to be in /run I think, it can be in /mnt or wherever

Yeah, having a uniform one is better. I was not sure if there is a hard dependency on /run for any platform. Let me just use /media as the mount path for the ci disk and check through ci if anything breaks. That should help.

@bpradipt
Copy link
Member Author

One challenge is how to make /media mountpath available early in the boot cycle since root (/) is RO. Only the following are available via systemd - https://freedesktop.org/wiki/Software/systemd/APIFileSystems/

@bpradipt bpradipt force-pushed the docker-userdata-path branch from 467f0a3 to 34f2851 Compare January 1, 2025 17:01
@bpradipt bpradipt added the test_e2e_libvirt Run Libvirt e2e tests label Jan 1, 2025
@bpradipt bpradipt requested a review from mkulke January 2, 2025 06:26
@bpradipt
Copy link
Member Author

bpradipt commented Jan 2, 2025

One challenge is how to make /media mountpath available early in the boot cycle since root (/) is RO. Only the following are available via systemd - https://freedesktop.org/wiki/Software/systemd/APIFileSystems/

Used /var/tmp/media/cidata. /var is tmpfs mounted for mkosi and also works for docker provider

@bpradipt
Copy link
Member Author

bpradipt commented Jan 6, 2025

The CI failures may be fixed via #2226

@bpradipt bpradipt requested a review from a team January 7, 2025 04:45
@wainersm
Copy link
Member

wainersm commented Jan 7, 2025

Hi @bpradipt !

I finally managed to test this fix. It seems to fix the original problem: /var/tmp/media/cidata/user-data is present and the process-user-data service is up and running just fine.

However, I'm still unable to get a pod running. The problem that I'm seeing is related with image pull on guest:

=== RUN   TestDockerCreateSimplePod/SimplePeerPod_test/PodVM_is_created
    assessment_runner.go:389: unexpected warning/error event(s): Error: failed to create containerd container: error unpacking image: content digest sha256:9fa9226be034e47923c0457d916aa68474cdfb23af8d4525e9baeebc4760977a: not foundError: failed to create containerd container: error unpacking image: failed to extract layer sha256:1e604deea57dbda554a168861cff1238f93b8c6c69c863c43aed37d9d99c5fed: failed to get reader from content store: content digest sha256:9fa9226be034e47923c0457d916aa68474cdfb23af8d4525e9baeebc4760977a: not found

Above is a snippet of e2e test. I got a similar error when launching another pod from a different container image manually.

@bpradipt
Copy link
Member Author

bpradipt commented Jan 8, 2025

Hi @bpradipt !

I finally managed to test this fix. It seems to fix the original problem: /var/tmp/media/cidata/user-data is present and the process-user-data service is up and running just fine.

However, I'm still unable to get a pod running. The problem that I'm seeing is related with image pull on guest:

=== RUN   TestDockerCreateSimplePod/SimplePeerPod_test/PodVM_is_created
    assessment_runner.go:389: unexpected warning/error event(s): Error: failed to create containerd container: error unpacking image: content digest sha256:9fa9226be034e47923c0457d916aa68474cdfb23af8d4525e9baeebc4760977a: not foundError: failed to create containerd container: error unpacking image: failed to extract layer sha256:1e604deea57dbda554a168861cff1238f93b8c6c69c863c43aed37d9d99c5fed: failed to get reader from content store: content digest sha256:9fa9226be034e47923c0457d916aa68474cdfb23af8d4525e9baeebc4760977a: not found

Above is a snippet of e2e test. I got a similar error when launching another pod from a different container image manually.

@wainersm you are hitting the known nydus-snapshotter issue. Workaround is documented here - https://github.com/confidential-containers/cloud-api-adaptor/blob/main/src/cloud-api-adaptor/docs/troubleshooting/nydus-snapshotter.md#pod-creation-fails-with-error-unpacking-image
Alternatively you can use crio runtime which doesn't have this issue.

@wainersm
Copy link
Member

wainersm commented Jan 8, 2025

Hi @bpradipt !
I finally managed to test this fix. It seems to fix the original problem: /var/tmp/media/cidata/user-data is present and the process-user-data service is up and running just fine.
However, I'm still unable to get a pod running. The problem that I'm seeing is related with image pull on guest:

=== RUN   TestDockerCreateSimplePod/SimplePeerPod_test/PodVM_is_created
    assessment_runner.go:389: unexpected warning/error event(s): Error: failed to create containerd container: error unpacking image: content digest sha256:9fa9226be034e47923c0457d916aa68474cdfb23af8d4525e9baeebc4760977a: not foundError: failed to create containerd container: error unpacking image: failed to extract layer sha256:1e604deea57dbda554a168861cff1238f93b8c6c69c863c43aed37d9d99c5fed: failed to get reader from content store: content digest sha256:9fa9226be034e47923c0457d916aa68474cdfb23af8d4525e9baeebc4760977a: not found

Above is a snippet of e2e test. I got a similar error when launching another pod from a different container image manually.

@wainersm you are hitting the known nydus-snapshotter issue. Workaround is documented here - https://github.com/confidential-containers/cloud-api-adaptor/blob/main/src/cloud-api-adaptor/docs/troubleshooting/nydus-snapshotter.md#pod-creation-fails-with-error-unpacking-image Alternatively you can use crio runtime which doesn't have this issue.

Interesting that I almost never hit that problem when running in a fresh environment like the one setup by the e2e testing provisioner. Let's see how it goes on CI...

Anyway, with the workaround above I could launch a simple pod \o/

@wainersm
Copy link
Member

wainersm commented Jan 8, 2025

One challenge is how to make /media mountpath available early in the boot cycle since root (/) is RO. Only the following are available via systemd - https://freedesktop.org/wiki/Software/systemd/APIFileSystems/

I found in https://systemd.io/TEMPORARY_DIRECTORIES/ that use of /var/tmp has some security implications that we should be aware of. I read that doc fully and concluded we should be fine, however, I'm not an expert so better a 2nd check would be nice.

Copy link
Member

@wainersm wainersm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested it locally and it worked out. It seems that we safe with using /var/tmp (keep in min I'm not an expert on the topic).

@bpradipt
Copy link
Member Author

bpradipt commented Jan 8, 2025

Hi @bpradipt !
I finally managed to test this fix. It seems to fix the original problem: /var/tmp/media/cidata/user-data is present and the process-user-data service is up and running just fine.
However, I'm still unable to get a pod running. The problem that I'm seeing is related with image pull on guest:

=== RUN   TestDockerCreateSimplePod/SimplePeerPod_test/PodVM_is_created
    assessment_runner.go:389: unexpected warning/error event(s): Error: failed to create containerd container: error unpacking image: content digest sha256:9fa9226be034e47923c0457d916aa68474cdfb23af8d4525e9baeebc4760977a: not foundError: failed to create containerd container: error unpacking image: failed to extract layer 

Interesting that I almost never hit that problem when running in a fresh environment like the one setup by the e2e testing provisioner. Let's see how it goes on CI...

This issue is pretty consistent with docker provider. With other providers it's occasional.

Anyway, with the workaround above I could launch a simple pod \o/

@bpradipt bpradipt force-pushed the docker-userdata-path branch 2 times, most recently from ef78fc0 to bfd2a8b Compare January 9, 2025 04:12
@bpradipt
Copy link
Member Author

bpradipt commented Jan 9, 2025

One challenge is how to make /media mountpath available early in the boot cycle since root (/) is RO. Only the following are available via systemd - https://freedesktop.org/wiki/Software/systemd/APIFileSystems/

I found in https://systemd.io/TEMPORARY_DIRECTORIES/ that use of /var/tmp has some security implications that we should be aware of. I read that doc fully and concluded we should be fine, however, I'm not an expert so better a 2nd check would be nice.

Seems ok. But would be good to hear comment from @mkulke ..

@mkulke
Copy link
Collaborator

mkulke commented Jan 9, 2025

Seems ok. But would be good to hear comment from @mkulke ..

I mean generally the path sounds a bit odd, why var? why tmp? Can't we create a /media/cidata path statically on the rootfs at image-build time?

@bpradipt
Copy link
Member Author

bpradipt commented Jan 9, 2025

Seems ok. But would be good to hear comment from @mkulke ..

I mean generally the path sounds a bit odd, why var? why tmp? Can't we create a /media/cidata path statically on the rootfs at image-build time?

Yeah, that's a possibility as well. I wanted to avoid changing the rootfs partition if there was an alternative. And it turned out systemd already creates those in tmpfs. Let me know what you prefer?

@mkulke
Copy link
Collaborator

mkulke commented Jan 9, 2025

Seems ok. But would be good to hear comment from @mkulke ..

I mean generally the path sounds a bit odd, why var? why tmp? Can't we create a /media/cidata path statically on the rootfs at image-build time?

Yeah, that's a possibility as well. I wanted to avoid changing the rootfs partition if there was an alternative. And it turned out systemd already creates those in tmpfs. Let me know what you prefer?

personally, I'd prefer a /media/cidata mount point.

@bpradipt
Copy link
Member Author

bpradipt commented Jan 9, 2025

Seems ok. But would be good to hear comment from @mkulke ..

I mean generally the path sounds a bit odd, why var? why tmp? Can't we create a /media/cidata path statically on the rootfs at image-build time?

Yeah, that's a possibility as well. I wanted to avoid changing the rootfs partition if there was an alternative. And it turned out systemd already creates those in tmpfs. Let me know what you prefer?

personally, I'd prefer a /media/cidata mount point.

@mkulke I added a new test commit, can you please check and let me know if the approach looks fine?

@bpradipt bpradipt force-pushed the docker-userdata-path branch from b9f118d to 31b8228 Compare January 9, 2025 14:42
This commit changes the path of the userdata file to /media/cidata
from `/run/media/cidata` to fix an issue with the docker provider.

For docker provider we use systemd container which mounts /run
as a tmpfs volume.
This mounting of /run happens after docker has bind mounted the
userdata file inside the container and consequently the userdata file is no
longer accessible.

The podvm rootfs has a precreated /media/cidata folder to mount the
config-drive with userdata.

Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
@bpradipt bpradipt force-pushed the docker-userdata-path branch from 31b8228 to 3370381 Compare January 9, 2025 16:08
@bpradipt bpradipt requested a review from mkulke January 9, 2025 16:08
@bpradipt bpradipt merged commit e215e88 into confidential-containers:main Jan 10, 2025
42 checks passed
@bpradipt bpradipt deleted the docker-userdata-path branch January 10, 2025 03:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
test_e2e_libvirt Run Libvirt e2e tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants