-
Notifications
You must be signed in to change notification settings - Fork 190
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for CNI-configured VM network interfaces. #257
Conversation
@@ -80,11 +80,14 @@ cd ~ | |||
# overlay | |||
# * firecracker-containerd, an alternative containerd binary that includes the | |||
# firecracker VM lifecycle plugin and API | |||
# * tc-redirect-tap and other CNI dependencies that enable VMs to start with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to double-check: did you run through the quickstart guide on a real instance?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've run all the commands, but only on my dev instance inside a container. I'll try on a fresh instance with the exact setup/commands here too to be safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, I'm glad I did this (tried just running it on a fresh i3.metal w/ Debian AMI) because it revealed a bunch of issues, some pre-existing, some new:
- The pre-existing line
exec newgrp docker
appears to just end the script (I guess since it's exec'ing to a different process?).- I got rid of this line and use
sg
to run with the docker group when it's needed, let me know if there's a better way.
- I got rid of this line and use
- I forgot during my previous runs that I was using my fixed
ptp
plugin that didn't have the DNS issue I found.- I have the PR out to fix that now, but for now I added a note to the getting-started guide that DNS resolution may not work until that is fixed in
ptp
. - I will create an issue to follow up on updating the getting-started guide if my
ptp
fix is not merged by the time this commit is merged.
- I have the PR out to fix that now, but for now I added a note to the getting-started guide that DNS resolution may not work until that is fixed in
- I had not tested with the image-builder rootfs (just the default one), which revealed the need for adding an empty
/etc/hosts
file to the image-builder rootfs (though Noah coincidentally just added that to his unrelated PR)- I added that to this PR too
- The PTP plugin seems to fail to add iptables rules that allow the VM outbound internet access, though the VM is capable of pinging the IPs on the host itself. I highly suspect this is due to the fact that I ran this on the host (previously just inside a container) where there are all sorts of iptables rules added automatically by docker.
- I added a note to the getting-started guide that while the demo setup should give the VM access to host IPs, it may or may not grant the VM outbound internet access depending on the details of the user's host network.
I don't think I can do it in this PR, but I think we should probably add an example test that just runs the quickstart script so we don't get into this situation again. I'll add an issue for following up on that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The pre-existing line exec newgrp docker appears to just end the script (I guess since it's exec'ing to a different process?).
Are you running this as a script or as an interactive session? When exec newgrp docker
is run in an interactive session, it spawns a new shell that you can continue to use for commands.
I think sg
is fine too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a note to the getting-started guide that while the demo setup should give the VM access to host IPs, it may or may not grant the VM outbound internet access depending on the details of the user's host network.
Is this something we can fix for the quickstart since we're specifying exactly what should be run on the host?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given this is just a demo a users would inevitably need to customize for their own network/use-case anyways [...]
It is, but it's also meant to be as much of a working setup as we can make it. If it's not something that you think you can address right now, can you open an issue to track it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My concern is that it's probably just impossible for this to universally work; there's always going to be some particular network setup on some host that's going to result in the VM not having outbound access. But I agree we should try to minimize those cases as much as possible; we should just also leave the disclaimer in too I think.
I just tried out adding the CNI firewall
plugin. I think it actually uncovered a bug in firewall
because at first I was not able to use it due to the fact that its delete method doesn't conform to the CNI spec; it returns an error when there was no previous result (which the CNI spec explicitly says it shouldn't do) which causes us to hit the error on this line.
We can probably fix that issue upstream, but in the meantime I tried with a forked go-sdk that just logs that error instead of returning it. Then, using firewall
actually does allow the VM to get outbound access as expected even on the more complicated network present on the i3.metal host.
Given all of the above, I'll follow up by:
- Trying to address the issue with the
firewall
plugin upstream - Adding an issue to include the
firewall
plugin in this demo-setup once it has been fixed. - Opening an issue in the Go SDK to add an optional "Force" parameter or something that allows users to ignore errors returned by delete methods (obviously it will default to false), since apparently it's not that uncommon to find plugins that don't perfectly conform to the spec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My concern is that it's probably just impossible for this to universally work; there's always going to be some particular network setup on some host that's going to result in the VM not having outbound access.
Right, I'm not saying that the quickstart needs to account for every situation, but rather that it accounts for itself. Assuming someone follows the quickstart guide exactly, it should work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, that's why I updated the getting-started guide to specify that the expected end result is that the host IPs are accessible and that the outside internet may be accessible but might not depending on the details of the host network. I've added a similar note to the quickstart guide too now.
Once this PR is merged I'll create an issue for updating to use the firewall
plugin, then once I fix the plugin upstream we can reduce the number of cases where outbound internet access doesn't work (but will still need to leave the qualification anyways).
I also created an issue to run the quickstart as a test case, which we should do no matter what: #263
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created issue for the firewall plugin here: #265
ca09e85
to
f1ce526
Compare
Signed-off-by: Erik Sipsma <sipsma@amazon.com>
…ependabot/go_modules/github.com/go-openapi/runtime-0.19.21 Bump github.com/go-openapi/runtime from 0.19.20 to 0.19.21
Signed-off-by: Erik Sipsma sipsma@amazon.com
Closes #14
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.