LKL does not support clone #155
Note https://github.com/lkl/linux/blob/58dc2025bf469d880d76250e682dd8e4ed225a6b/arch/lkl/kernel/syscalls.c#L64.
Thanks. It looks as if setting …
With that in mind, I believe that we should:
This should also make it possible for lthreads to be aware of their Linux tid.
I tried enabling the clone system call. Unfortunately, it then crashes in …
@davidchisnall where exactly does it crash? Is it due to an access check?
Oh, never mind, my clone userspace wrapper was misaligning the stack.
Okay, it looks as if it ends up in LKL's … That works correctly for …
@davidchisnall, the problem that you will now face is that the kernel scheduler will want to run these (now kernel-visible) user-level threads. This is inconsistent with the lthread scheduler being in control of their scheduling. You could modify the kernel threads that represent user-level threads to return immediately, but then you will still have the overhead of LKL doing many spurious context switches.
How does this work for kernel threads? These are already created via the same code path and we have a lot of them.
When a userspace lthread does a system call, it assumes the identity of a unique host task (kernel thread) inside the kernel. After the system call has been executed, we have the LKL scheduler context-switch to all pending kernel tasks before returning to userspace. IIRC the host tasks that represent the userspace lthreads are never selected by the kernel scheduler for execution. Perhaps it will be enough if you simply create the lthread/host task mapping at clone time (and not when the first system call is invoked).
I think I am still a bit confused. When a new kernel thread is created, it goes into the …
Also, which direction of mapping are you talking about? The Linux task structure contains the lthread ID of the thread, which is set when the lthread is created in the LKL arch for any lthread created via the …
The kernel scheduler's … LKL retrieves the …
Thanks, that makes sense. To check I understand how this all fits together: LKL has a notion of a 'host task', which is a task that is externally scheduled, but still has LKL state associated with it. When an lthread that was not created by LKL calls … Because LKL has a single-process model, all of these threads are assumed to be threads of the init process (their task is looked up by pid [thread ID] in the init pid namespace [process]). When a thread returns from a syscall, the … I think we should be able to create …
As far as I can tell, … Does that make sense, or have I missed anything important?
Yes, that makes sense to me. One minor thing is that our threads don't share the parent pid of init, but rather have a different host parent task; otherwise the kernel wouldn't deliver certain signals to pid=1.
I have started working on this in the wip-clone branch. Current status:
It seems to be nearly there, but I have not yet been able to diagnose the cause of the deadlock. In the test case, the timer thread is firing and delivering ticks, but nothing else happens after the first attempt at a syscall. With tracing enabled, we see this with 8 ethreads:
With 1 ethread, everything is serialised and we see this:
It appears as if one lthread enters …
@prp, do you know how LKL's …
Looking at the preprocessed source for syscalls.c, it appears that …
This implements two new LKL hooks.

The first hook creates an lthread with a specific initial register state (to capture the returns-twice behaviour of clone, along with the caller's ability to define the stack and TLS addresses). The new thread is immediately associated with the Linux task structure (normally, lthreads are associated with Linux tasks lazily when they perform their first system call).

The second hook destroys a thread, in response to an exit system call. This is somewhat complicated, because LKL never returns to this thread and the thread's stack may be deallocated by the time we exit it. The lthread scheduler does not provide an easy mechanism for killing a thread without that thread running. We can add one eventually, but for now we create a temporary stack that lthreads can use during teardown and make them run the teardown from there.

Disable the access02 test. It is spuriously passing and this change makes it fail. See #277 for more information.

Fixes #155
This was fixed in #259.
To fix the layering, we need to return to musl creating threads via the `clone` system call. Currently, LKL does not implement `clone` at all. We need to provide an implementation that handles the flags required for `pthread_create`. The correct change in LKL may simply be to provide a host_ops hook that handles `clone` entirely in the LKL consumer. The musl implementation of `pthread_create` depends on the following `clone` flags:

- `CLONE_VM`: Share the address space with the parent. In a single-address-space world, we cannot support anything other than this.
- `CLONE_FS`: Share a filesystem namespace with the parent. In a single-process world, this is the obvious thing to do.
- `CLONE_FILES`: Share a file descriptor table with the parent. We probably want to support not having this so that our init process can have a separate FD table.
- `CLONE_SIGHAND`: Share signal handlers. We probably want to support not having this so that our init process can have separate signal handlers.
- `CLONE_THREAD`: Place the new thread in the parent's thread group, so it shares the parent's TGID and gets its own TID. It would be nice to support both variations of this so that we can have a distinct PID for init.
- `CLONE_SYSVSEM`: Share ownership of SysV semaphores with the parent. It doesn't matter too much whether we support this, because our init process shouldn't use SysV IPC.
- `CLONE_SETTLS`: Set the TLS pointer. Should simply set the %fs base value.
- `CLONE_PARENT_SETTID`: Store the child's thread ID in the parent. Should be easy to support.
- `CLONE_CHILD_CLEARTID`: Clear the child's thread ID at the given address on exit and wake a futex there. See "Intercept `clone` to handle `CLONE_CHILD_CLEARTID`" (#154).
- `CLONE_DETACHED`: Has no effect in modern Linux; safe to ignore.