OSv hangs when booting on XEN #345
I wonder if that is still the case. I have never run OSv on Xen, but a recent paper measuring the performance of many microservice apps - https://biblio.ugent.be/publication/8582433/file/8582438.pdf - suggests that OSv runs properly on Xen. Can someone verify it? |
@wkozaczuk Hi! This problem happens to me too. I was using Xen 4.9 on Ubuntu 18.04. OSv was blocked in the blkfront_connect function (more precisely, at the first call to bread in vfs_bio.cc; it seems that the thread was put to sleep and never woken up). |
Hi @yfliuu, Unfortunately, I do not have much experience with XEN. Besides trying to run it locally on my laptop with Ubuntu as Dom0 (as you are), I used to play with OSv and managed to deploy it to EC2 Xen instances without any problems, at least a year or so ago (it would be nice to verify that it still works). But I guess this type of XEN setup is different, with potentially slightly different para-virtual drivers. I think this is what I saw when I compared boot messages between AWS Xen and local Xen. But I think you might have just nailed a problem: I wonder if we have a bug where we mis-detect a device. It would be nice to figure it out. I am not sure when I will have time to look at it. We welcome patches :-) BTW, the author of this paper tested OSv quite extensively on XEN not that long ago - https://biblio.ugent.be/publication/8582433/file/8582438. He seems to have been running his tests on XenServer 7.5, but probably with a different setup than mine and yours. |
There is this part extracting XEN PCI support - 2d9f0e4. But I doubt it is the one that broke anything, as it is dated 3 years after this issue was created. |
reproduced this in my environment:
build command: for every conf_drivers_* arg after xen above, except pvscsi, I also tried a build with each enabled, same result (no networking)
I'm happy to help test changes (in fact, I'm going to rebase on top of the changes that were recently pushed 20 mins ago and see if that helps) but I'm quite far from OS dev in terms of knowledge base so won't be much help there. edit: the new blkio changes didn't resolve the issue |
Unfortunately, I have almost no XEN experience. Also, I do not have a machine with a XEN host handy. Can you run the simplest app - 'Hello World' in C (
and send me the output? Also, there are a number of XEN-related patches from Spirent (see #1197) that have never been merged upstream and that may fix the issue you are encountering. |
Output is here. Used
I'll give these a try later this week, thanks! |
After a (somewhat rough) merge of those xen-related patches, output is here. Same result, except now I have no output in the console. |
These two threads seem to be key:
It looks like the thread I am afraid I need to set up a Xen machine to reproduce this issue. Meanwhile, can you also send a dump of the detailed qemu command ( |
Also, as I mentioned, I am not a XEN expert, and the last time I used it was 4-5 years ago with Ubuntu. I presume you are running Xen on Fedora 39 as the host, so I guess I can follow these setup instructions - https://docs.fedoraproject.org/en-US/quick-docs/virtualization-getting-started/#_xen and https://wiki.xenproject.org/wiki/Fedora_Host_Installation? |
When I get back home tonight! |
@wkozaczuk I'm not actually using fedora for the host, only for the build. My host is XCP-NG 8.3, using the latest patches, and with xen 4.17.4, Linux 4.19.19 (I'm not 100% on the 19.19, I am 100% on it being 19.x and pretty sure that x is 19). My build system is fedora 39, latest packages installed, nothing special about it other than that it's not dom0. |
Logs from the serial console (over qemu's telnet). Seems like the console driver no longer works though (I had thought it was my hack job pulling in those xen patches that had broken the console, apparently not). I'll see if I can find the commit that broke the console this weekend (it worked at one point IIRC). EDIT: pulled latest main, no change |
I have configured a machine with Fedora 39 and installed Xen on it. That way I can run Xen with fedora on Dom0:
So I can build OSv on the same Fedora version. However, when I try to run OSv I do not see any output (probably VNC is misconfigured), and the xl utility also seems to get stuck trying to create a new domain and a new qemu process until I kill it. I am not able to connect with gdb, nor can I connect with telnet (which port are you using?). In my case, OSv probably crashes earlier and as a result the xl utility keeps trying to restart it. Here are the xl and qemu commands I grabbed with ps while they were running (just to compare with what you see) and the content of the xl config file:
Also, the changes I have made to
I wonder what is different besides you using XCP-NG. |
Adding
|
My As for gdb, this is a bigger difference between XCP-NG and fedora: the gdb remote ( Once you have |
OK. I have made some progress. It looks like the resulting
I am still figuring out gdbsx - in my case it fails like so (no idea yet why):
|
For that gdbsx error, I suspect that you'll have to either build your own xen kernel, or use XCP. Easier option is probably to rebuild the distro rpm if it doesn't enable this hypercall by default (probably under an "enable gdbsx remote debug server" or something like that in the menuconfig). |
I have made some progress. It seems the issue can be narrowed down to the I have not gotten to the bottom of it except to confirm that indeed it looks like OSv never receives relevant interrupt right after trying to read 1st block in Here is what works after applying this patch:
and this change to
and build, for example, lighttpd:
|
I should have also made clear that with a raw image and without the change to
|
I can confirm that commenting out those lines works the same way in XCP. This gives me an idea on how I could "get it working" (as opposed to fixing it properly, ie "does it work at any level of performance"): have it ignore the xen blk device in favour of the IDE one. Pretty sure this could work, although I'm not 100% since I don't remember if I used xl or XCP for the test where I had no network but did have a useable disk. Something to try this weekend perhaps. |
I also wanted to point out that the same Xen block device works on old AWS EC2 Xen-based instances. I have tested it many times, most recently at the beginning of this year. |
You should fix /var/tmp or specify a different path to lighttpd.conf. lighttpd 1.4.54 was released in May 2019, over 5 years ago. The latest lighttpd release is lighttpd 1.4.76, so you're 22 (!!!) releases behind the latest lighttpd stable release. |
In the last few days, I tried to dig deeper and debug this issue more, but I still fell short of finding the root cause. But I wanted to share all the findings, so maybe someone else may notice something and find the problem or continue research with what I have found. To debug the problem (I still have not found an easy way to fix the gdbsx problem), I have put numerous To better understand what is going on, I decided to bypass
sudo xl create -c /tmp/osv_raw.xl #1st scenario
sudo xl create -c /tmp/osv_img.xl #2nd scenario In both cases, I bypassed the loop device and the whole In the first scenario (raw image), OSv would probe both devices, but eventually crash in the Lines 102 to 107 in 3c35189
In the second scenario (qcow2 image), OSv would also probe both devices, but instead of crashing it would hang, also in the It is also worth mentioning that a raw or qcow2 image with To make sense of the information in the console logs I linked, I want to briefly explain (with my limited understanding) the device bootstrap process on Xen and what some key lines mean. More or less, the device discovery process on Xen is initiated in the I have annotated the linked console output with lines like
or
to capture interactions between OSv and Xen hypervisor to list directory (1st example) or read a node value (2nd example) in the Xen tree. One can see that in both scenarios (raw and qcow2), the
and the trailing slash in the id (see 1st line). I wonder if we have a bug in this parsing logic. I have tried to force it to remove the trailing It is also worth mentioning that many people reported Xen worked at various points in time and it also works on AWS old Xen-based EC2 instances:
Could it be a configuration issue, or a bug in how we implement block device discovery that only affects certain versions of the Xen runtime? |
I find it interesting that the behaviour differs depending on the disk format, raw vs qcow2. Theoretically, qemu should present the device no differently when using one vs the other, right? That is, the "this is a block device, its capacity is x, sector size y, etc." info presented to the guest should be identical between the two: how could the guest use the fact that the underlying storage is qcow2, much less care? It would only matter to qemu, since it needs to know the format if there is one. Note: for my own tests, I used a loop device backed by a raw image that I had created ahead of time. This almost definitely affects how qemu uses the disk (I mean, it could still use the same POSIX file IO calls as on a file, but it may also try to ioctl on it, dunno) but again it shouldn't change how the guest sees it. |
I have added an extra bit of logging in
Interestingly, those seem to correspond to most writes, except that the first two are missing (is this a symptom of something wrong?):
Also, OSv never gets similar callbacks for the networking device. Full new console logs are here for both raw and qcow2 images. |
OK, I have found the culprit - a bug in setting up or handling the block device with the multi-page ring buffer scheme. In other words, if I force the block device initialization code to use 1 for To be precise, I do not know what and where exactly this bug is, but I have narrowed down when it occurs. Possibly it is somewhere in For more info about the multi-page buffer scheme, see the "Request Transport Parameters" section in https://xenbits.xen.org/docs/unstable/hypercall/x86_64/include,public,io,blkif.h.html. BTW I bet that in all the cases where the block device works on Xen, the Xen runtime was configured or reported |
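To make the multi-page ring idea in the comment above more concrete, here is a minimal sketch of how the number of request slots grows with the number of ring pages, and how a front end might cap the page count. It is not OSv or Xen code: the header size, the slot size, and the helper names (`ring_slots`, `negotiate_ring_pages`) are assumptions chosen for illustration, and the real layout should be taken from blkif.h and ring.h.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Illustrative sizes only (assumptions, not read from the OSv or Xen sources).
constexpr std::size_t kPageSize = 4096;
constexpr std::size_t kRingHeaderBytes = 64;   // assumed shared-ring header size
constexpr std::size_t kSlotBytes = 112;        // assumed size of one request slot

// Largest power of two that is <= n; n must be non-zero.
inline std::size_t round_down_pow2(std::size_t n) {
    assert(n > 0);
    std::size_t p = 1;
    while (p <= n / 2) {
        p *= 2;
    }
    return p;
}

// Number of request slots that fit in a ring spanning ring_pages pages,
// rounded down to a power of two so that indices can wrap with a mask.
inline std::size_t ring_slots(std::size_t ring_pages) {
    assert(ring_pages > 0);
    const std::size_t usable = ring_pages * kPageSize - kRingHeaderBytes;
    return round_down_pow2(usable / kSlotBytes);
}

// Hypothetical negotiation: use no more pages than either side allows,
// and never fewer than one.
inline std::size_t negotiate_ring_pages(std::size_t backend_max,
                                        std::size_t frontend_max) {
    return std::max<std::size_t>(1, std::min(backend_max, frontend_max));
}
```

Under these assumed sizes, capping the front-end limit at 1 (as in the workaround described above) keeps the whole ring inside a single page, at the cost of fewer in-flight requests.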
That exact suggestion got my application to boot and function normally, thanks!
I see your bet and double it. |
I have pinpointed the exact cause - non-contiguous ring buffer allocation which matters when the buffer is longer than 1 page. This patch illustrates it:
diff --git a/bsd/sys/dev/xen/blkfront/blkfront.cc b/bsd/sys/dev/xen/blkfront/blkfront.cc
index 08fcab3ab..4e3cd7219 100644
--- a/bsd/sys/dev/xen/blkfront/blkfront.cc
+++ b/bsd/sys/dev/xen/blkfront/blkfront.cc
@@ -98,6 +98,8 @@ static int blkif_completion(struct xb_command *);
static void blkif_free(struct xb_softc *);
static void blkif_queue_cb(void *, bus_dma_segment_t *, int, int);
+extern "C" void* alloc_contiguous_aligned(size_t size, size_t align);
+
#define GRANT_INVALID_REF 0
/* Control whether runtime update of vbds is enabled. */
@@ -1125,8 +1127,11 @@ setup_blkring(struct xb_softc *sc)
int error;
int i;
- sring = (blkif_sring_t *)malloc(sc->ring_pages * PAGE_SIZE, M_XENBLOCKFRONT,
- M_NOWAIT|M_ZERO);
+ //sring = (blkif_sring_t *)malloc(sc->ring_pages * PAGE_SIZE, M_XENBLOCKFRONT,
+ // M_NOWAIT|M_ZERO);
+ sring = (blkif_sring_t *)alloc_contiguous_aligned(sc->ring_pages * PAGE_SIZE, PAGE_SIZE);
+ memset(sring, 0, sc->ring_pages * PAGE_SIZE);
+
if (sring == NULL) {
xenbus_dev_fatal(sc->xb_dev, ENOMEM, "allocating shared ring");
return (ENOMEM);
I will prepare a pull request to fix this 10-year-old issue. |
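As a user-space analogy to the allocation change in the patch above, the following sketch allocates a page-aligned, multi-page buffer and zeroes it only after checking that the allocation succeeded. It is an illustration under stated assumptions, not the OSv fix: `alloc_zeroed_ring` and `is_page_aligned` are hypothetical names, and `std::aligned_alloc` (C++17) only guarantees alignment and virtual contiguity, whereas the kernel-side allocator used in the patch is what provides physically contiguous memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

constexpr std::size_t kPageSize = 4096;  // assumed page size

// Hypothetical helper: allocate ring_pages pages, aligned to a page boundary,
// and zero them. Returns nullptr on failure; release with std::free().
inline void* alloc_zeroed_ring(std::size_t ring_pages) {
    assert(ring_pages > 0);
    const std::size_t bytes = ring_pages * kPageSize;
    // aligned_alloc requires the size to be a multiple of the alignment,
    // which holds here because bytes is a whole number of pages.
    void* ring = std::aligned_alloc(kPageSize, bytes);
    if (ring == nullptr) {
        return nullptr;  // check for failure before touching the memory
    }
    std::memset(ring, 0, bytes);
    return ring;
}

// True if p starts exactly on a page boundary.
inline bool is_page_aligned(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % kPageSize == 0;
}
```

One design point worth noting: the sketch performs the null check before the `memset`, so an allocation failure cannot lead to a write through a null pointer.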
that's definitely part of it, I'm not sure it's the whole story though: a second disk attached to the VM causes it to block, waiting for presumably something different this time, see logs and gdb here |
Please send me the xl configuration file so that I can recreate it.
|
I don't have one, unfortunately: this is done with XCP, and the tooling doesn't use xl configs, using its own config database instead. Best I can do is the qemu command line:
Note that I have zero control over the disk and network parameters, with the only exception being what model of NIC device to use. The disks are effectively loop devices, passed to qemu as type=raw (as you can see above), using something similar to qemu-nbd to do the I/O on the disk image. No control at all here over the model or bus type, it's all hardcoded in XCP. |
I just got home and tried this xl config which specifies 2 disks (the 2nd is raw):
And I can successfully boot OSv with both disks probed:
I wonder if there is some configuration difference on your side. Just in case my most recent full patch:
|
I also wonder if the issue possibly relates to the disk size - 32768MB? Can you try a smaller disk? |
I think there was some ephemeral problem in my environment because after reverting the ring_pages=1 change and applying the diff above (including the netfront one), I'm up and running with 2 disks (where the size of the second is > 64GB)! edit to add: it appears there's something wrong with rofs, not sure if it's related:
|
This stack trace is identical to what I reported here - #1033. I will try to reproduce it in my environment. |
I cannot reproduce it in my environment. Would you mind connecting with gdbsx/gdb and collecting stack traces of all threads? |
Also, after you capture that, could you please apply this patch and see if the problem goes away? |
I was able to reproduce your issue with a Java image that has enough files to make the dir entries table exceed 4K (1 page). And I can confirm my patch fixes it. |
Still want me to grab those stack traces? |
Nope. But please confirm if my latest patch fixes the issue with ROFS on your side. Just wanted to be double sure. |
Will do this weekend, thanks for helping to investigate and solve this! |
Device drivers often use ring buffers to share data between guest and host and therefore require allocation of contiguous area of memory. This is the case for virtio devices as well as Xen front-end para-virtual devices. The relevant code where xen ring buffers are allocated comes from FreeBSD and uses malloc with the arena parameter which unfortunately is ignored in OSv. To fix this we use explicit memory::alloc_phys_contiguous_aligned() to allocate contiguous area of memory for both netfront and blkfront devices.
What is interesting, this bug would only show up in cases when number of ring buffer pages was greater than 1 (for details see discussion here - cloudius-systems#345 (comment)). For example this would not happen on EC2 Xen instances but it would on XCP and Xen server with Fedora on DOM0. Also note that netfront rx and tx buffer use single page ring buffer, but we also change the code to be consistent with blkfront.
Fixes cloudius-systems#345
Signed-off-by: Waldemar Kozaczuk <jwkozaczuk@gmail.com>
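The observation that the bug only shows up with more than one ring page can be illustrated with a small sketch that maps slot indices to the page they fall in. The header and slot sizes and the helper names are assumptions for illustration only, not values taken from the OSv or Xen sources.

```cpp
#include <cassert>
#include <cstddef>

// Illustrative sizes only (assumptions, not read from the OSv or Xen sources).
constexpr std::size_t kPageSize = 4096;
constexpr std::size_t kRingHeaderBytes = 64;   // assumed shared-ring header size
constexpr std::size_t kSlotBytes = 112;        // assumed size of one ring slot

// Byte offset of a slot from the start of the ring buffer.
inline std::size_t slot_offset(std::size_t slot) {
    return kRingHeaderBytes + slot * kSlotBytes;
}

// Index of the page that holds the last byte of the given slot.
inline std::size_t last_page_of_slot(std::size_t slot) {
    return (slot_offset(slot) + kSlotBytes - 1) / kPageSize;
}

// True if a ring with this many slots reaches past its first page, i.e. the
// driver indexes it as one flat array that crosses a page boundary.
inline bool ring_spans_multiple_pages(std::size_t slots) {
    assert(slots > 0);
    return last_page_of_slot(slots - 1) > 0;
}
```

With these assumed sizes, a 32-slot ring never leaves page 0, so how the pages behind it are laid out cannot matter, while a 64-slot ring is indexed across a page boundary and depends on the buffer being one contiguous area.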
@kevinross I have added a new Wiki page - https://github.com/cloudius-systems/osv/wiki/Running-OSv-on-Xen - to try to better document how to run OSv on XEN. Feel free to improve it and specifically add some info about running OSv on XCP-NG. |
Apologies for the late follow-up, life got busy. Re: rofs? This works now! |
And wiki page updated with a short guide to boot OSv in a VM, along with serial console access and GDB debugging. |
Thanks! Have you updated this one - https://github.com/cloudius-systems/osv/wiki/Running-OSv-on-Xen or created a new Wiki? |
Np! Updated the one you linked |
When I start the guest with workaround for #344 applied, it hangs after printing the version line:
The main thread is blocked by blkfront setup: