packer: Enable support for adding nvidia gpu to podvm image #1602
Conversation
@katexochen @mkulke @snir911 @surajssd ptal and let me know what you think about this interim method to enable some additional features via the podvm image.
To my understanding, we plan to remove these images at the end of this release cycle, so I'm not sure why we are adding new features. What are the security considerations of mounting blobfuse? How is this secure in the CC attacker model?
Right and hence I want to hear everyone's thoughts. I can start exploring how to have similar functionality with mkosi.
Blob can have encrypted data which will be decrypted inside the CVM. Blob may also have non-encrypted data for processing by the pod.
IMO it's ok to conduct experiments like this until we formally declare the packer-related code deprecated and freeze it. I'm not entirely sure we're committed to dropping packer, as it might be required for certain archs. We should reach and document a consensus about that. What I think we have agreed on so far: we're committed to providing mkosi-based image builds, possibly as a reference architecture; what we do with the existing packer-based infra still needs to be decided (e.g. deprecate it, or move it to a downstream project after 0.9). We should keep discussing this topic in #1326. This PR probably covers 2 aspects:
I think we can look at those individually. For 1) I think we should probably consider a similar mechanism for configuring mkosi builds; maybe we can have something comparable there. In any case, 2) looks intriguing. I'll have to check those out. I guess they will also set the bar for what we have to support with fedora-based mkosi builds to make a switch feasible.
Continuously raising the bar for mkosi images by continuing to add features to the legacy approach does not seem reasonable to me.
Let me summarise a few things here. We can also discuss the way forward in Slack and the community meeting.
I think moving the packer image build templates out to a separate project (as suggested by Magnus) is a reasonable approach to balance flexibility, maintenance and the upstream test support matrix.
admittedly that's annoying, but I think that's tolerable if temporary. We probably want to decide on a concrete timeline/criteria for when to switch over in a coming community call, so there's less ambiguity. OTOH, at this point of the project, I also wouldn't want to curb/hamper exciting tinkering, like the GPU integration, by enforcing a sanctioned-only way to do images. It's a balance, I guess.
@bpradipt can we add addon support to the mkosi build process as well? What are the blockers to supporting the GPU artifacts for mkosi based image builds?
- Tested a build without an addon => built successfully, was able to start a pod with it on AKS
- Tested a build with gpu addon => built successfully, couldn't start pod due to missing gpu quota
So, in general things seem to at least not break. A couple of questions/comments:
- afaict we have no test coverage for those, so the addon feature or individual addons might be broken inadvertently by future code changes. Manual testing is very cumbersome and, in the GPU case, depends on available quota.
- do we intend to support non-confidential GPU instances in the project?
- does it make sense to introduce a "*-contrib" structure (maybe not a repo, can be a folder) in which we can stage experimental features that are not supported/systematically tested yet?
podvm/addons/gpu/README.md (outdated)

```yaml
  labels:
    app: test
  annotations:
    io.katacontainers.config.hypervisor.machine_type: Standard_NC4as_T4_v3
```
Suggested change:

```yaml
    io.katacontainers.config.hypervisor.machine_type: Standard_NC4as_T4_v3
    io.containerd.cri.runtime-handler: kata-remote
```
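For reference, a minimal pod spec combining the annotations under discussion might look as follows. The pod name, container image, and command are illustrative assumptions; `runtimeClassName: kata-remote` follows the later remark that the runtime class is part of the pod spec itself:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-test                  # illustrative name
  labels:
    app: test
  annotations:
    # instance type CAA uses to spin up the podvm
    io.katacontainers.config.hypervisor.machine_type: Standard_NC4as_T4_v3
    # runtime handler hint discussed in this thread
    io.containerd.cri.runtime-handler: kata-remote
spec:
  runtimeClassName: kata-remote   # peer-pods runtime class
  containers:
  - name: cuda
    image: nvidia/cuda:12.2.0-base-ubuntu22.04   # illustrative image
    command: ["nvidia-smi"]
```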
don't we need this?
In my testing it was not needed. The annotation will be used by CAA to spin up the specific instance, and the runtimeclass is part of the pod spec itself.
I think it will break once we switch to kata#main. AFAIU the nydus-snapshotter requires this.
This is next in my plan: to include both GPU and opa for mkosi based image builds. The only catch is testing the Fedora packages, as most of my GPU testing has been with either RHEL or Ubuntu.
Yeah. I don't have a good idea on how that should be done. This applies to packer (which will be removed going forward) and mkosi as well.
I think having this option will be good, as access to confidential GPU is hard and only available with Azure today. Having the option to use non-confidential GPU brings two benefits imho: the ability for users to try their AI workloads with peer-pods (non-confidential), and the ability for developers to continue working on integrating the common building blocks for GPU support.
I have removed the […]. Both gpu and opa addons need to be enabled for mkosi builds; this is next on my todo list.
> Yeah. I don't have a good idea on how that should be done. This applies to packer (which will be removed going forward) and mkosi as well.
Is it ok if the restructuring is taken up later, after more discussion on the best way forward - like a separate directory or a project?
I'm not sure I grok the intent of this proposal yet. Is it purely provisional, or is it intended to be a future-proof image extension mechanism?
In general, we probably want to reduce the configurable surface of podvm images to make testing more manageable and have some confidence about working configurations. This PR would go in the opposite direction. Maybe it's warranted for GPU-enabled podvm images, as a special case. But then I would cover this specific case with some `make INCLUDE_GPU_STACK=1 image` parameterization, rather than introducing a generic addon mechanism.
I would not say future-proof, but at least a workable method to add extensions. I am open to other ideas for extensibility.
Agree. I would start with the premise that anything which is turned on by default should be testable in CI.
Do we need such an extension mechanism, or is it sufficient to document how an image can be customized? In systemd there are sysext and confext: https://www.freedesktop.org/software/systemd/man/latest/systemd-sysext.html to extend os images in a systematic fashion (like updating a bundled container runtime or system library), but it's not a package manager; things easily break if conflicting system extensions are applied.
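To make the sysext idea concrete, here is a minimal sketch of the file layout such an extension needs. The extension name `podvm-gpu` and the shipped binary are illustrative, not from the PR; only the `extension-release` convention is from the systemd-sysext documentation:

```shell
set -eu
EXT=podvm-gpu
mkdir -p "$EXT/usr/bin" "$EXT/usr/lib/extension-release.d"

# sysext requires an extension-release file named after the extension image;
# ID=_any lets it apply regardless of the host distribution.
printf 'ID=_any\n' > "$EXT/usr/lib/extension-release.d/extension-release.$EXT"

# An overlay payload: a file the extension would add under /usr at merge time.
printf '#!/bin/sh\necho gpu-tool\n' > "$EXT/usr/bin/gpu-tool"
chmod +x "$EXT/usr/bin/gpu-tool"

# On a systemd host one would then pack this tree into an image
# (e.g. with mksquashfs), place it in /var/lib/extensions/ and run:
#   systemd-sysext merge
```

Note that, as pointed out above, nothing is executed when the extension is merged; it only overlays files onto `/usr`.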
Hmm, the challenge is the packer config. Would it be possible, via documentation alone, to provide the changes to the packer hcl?
Are you suggesting enabling this option in the podvm image to allow extensions?
I don't think it gets a lot easier to customize a build than what is supported right now in packer + hcl, if you copy a script and execute it:

```hcl
provisioner "file" {
  source      = "/install-some-deps.sh"
  destination = "~/install-some-deps.sh"
}

provisioner "shell" {
  remote_folder = "~"
  inline = [
    "sudo bash ~/install-some-deps.sh"
  ]
}
```
No, indeed. sysext/confext are only for binaries and configs; nothing is being executed here. rpms/debs you would install in the mkosi or packer build process. sysext would merely allow extending an image by overlaying some files, without interfering with the build process itself. Again, I probably don't understand the use case well enough to understand the requirements atm. Like, who would want to extend a podvm image with what? Is an extension something like a debug image, i.e. a fixed, known-by-us build-time option? Or is an extension something unforeseeable, like a bespoke monitoring agent that a customer has to install and configure in their environment? In that case both packer and mkosi provide convenient ways to do this kind of customization; we can hardly improve on those in terms of ux.
Let me ask a different question - what if we add gpu and opa in the default podvm image? And for extensions, we can rethink in the context of mkosi and not spend too much time doing it with packer. Wdyt?
yes, as build-time options those do make sense to me. There can be a separate discussion about which features are supposed to be opt-in vs opt-out once we port these to mkosi.
@mkulke I changed this PR to add support for NVIDIA GPUs in packer based podvm images, and updated the main description as well. NVIDIA gpu support is enabled by default, meaning the packer built images will have the NVIDIA gpu drivers and should be usable depending on the instance type selected. Some additional work is required going forward, which I plan to handle in subsequent PRs.
that makes sense to me, thanks for accommodating.

nit: I doubt we need the `addon` folder. the `enable_nvidia_gpu` flag would be enough. since we copy the necessary scripts on the build image anyway, we can just check for the ENV in `misc-settings.sh` and invoke `nvidia_gpu/setup.sh` if necessary.
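The env-gated hook suggested here could be sketched as follows. The file names (`misc-settings.sh`, `nvidia_gpu/setup.sh`) and the `ENABLE_NVIDIA_GPU` variable come from the thread; the wiring and the stand-in addon script are assumptions to keep the sketch self-contained:

```shell
set -eu

# Stand-in for the real addon script, so this sketch runs on its own.
mkdir -p nvidia_gpu
printf '#!/bin/sh\necho "installing nvidia stack"\n' > nvidia_gpu/setup.sh
chmod +x nvidia_gpu/setup.sh

# In misc-settings.sh, the addon would be gated on the build-time variable
# (enabled by default, matching the behaviour described in this PR):
if [ "${ENABLE_NVIDIA_GPU:-yes}" = "yes" ]; then
    ./nvidia_gpu/setup.sh
fi
```

Running with `ENABLE_NVIDIA_GPU=no` would skip the addon entirely, which matches the opt-out flow shown in the commit message.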
No problem. I'm glad we could arrive at a consensus.
I thought about whether to have a separate folder or not, and felt it's cleaner to have a separate folder and keep the config files under it, to accommodate support for other GPUs (e.g. Intel ones) or other accelerators requiring podvm image customisation. The naming is a challenge though. I thought about misc, contrib etc. but then went with addons, since there is already misc-settings :-).
NVIDIA GPU support is enabled by default when building packer based image for aws and azure.

A default build of podvm in aws took 10 min:

```
==> Wait completed after 10 minutes 16 seconds
==> Builds finished. The artifacts of successful builds are:
--> peer-pod-ubuntu.amazon-ebs.ubuntu: AMIs were created:
us-east-2: ami-0463ae5aa8d5b3606
rm -fr toupload

real    10m34.352s
user    0m18.919s
sys     0m10.044s
```

If you want to disable, then run with ENABLE_NVIDIA_GPU=no. For example:

```
cd azure/image
PODVM_DISTRO=ubuntu ENABLE_NVIDIA_GPU=no make image
```

This results in the qcow2/setup_addons.sh script executing addons/nvidia_gpu/setup.sh to set up the NVIDIA drivers, libraries and prestart hook in the podvm image.

Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
LGTM overall, added minor comment, thanks!
This commit enables NVIDIA gpu support for the aws and azure podvm images.
NVIDIA GPU support is enabled by default when building packer based
image for aws and azure
A default build of podvm in aws took 10 min
If you want to disable, then run with ENABLE_NVIDIA_GPU=no
For example:

```
cd azure/image
PODVM_DISTRO=ubuntu ENABLE_NVIDIA_GPU=no make image
```
Some additional work is required going forward, which I plan to handle in subsequent PRs.