Support the clone system call. #259

davidchisnall · 2020-05-18T09:26:19Z

This implements two new LKL hooks. The first creates an lthread with a
specific initial register state (to capture the returns-twice behaviour
of clone, along with the caller's ability to define the stack and TLS
addresses). The new thread is immediately associated with the Linux task
structure (normally, lthreads are associated with Linux tasks lazily
when they perform a system call).
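
As background on the semantics the first hook has to preserve, here is a minimal sketch of clone with a caller-supplied stack. It uses the glibc `clone()` wrapper on an ordinary Linux host, not the SGX-LKL hook itself, and it is not the test added under tests/basic/clone. The helper names (`child_main`, `clone_child_ran_on_caller_stack`) and the stack size are assumptions made for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/wait.h>

#define CHILD_STACK_SIZE (64 * 1024) /* arbitrary size for the sketch */

/* Filled in by the child; visible to the parent because of CLONE_VM. */
struct child_report {
    uintptr_t stack_probe;
};

/* Child entry point: record the address of a local variable, which
 * lives on whatever stack the child was started on. */
static int child_main(void *arg)
{
    struct child_report *report = arg;
    int local = 0;
    report->stack_probe = (uintptr_t)&local;
    return 0;
}

/* Returns 0 if the child ran on the stack the caller supplied. */
int clone_child_ran_on_caller_stack(void)
{
    char *stack = malloc(CHILD_STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* The stack grows downwards on x86-64, so clone() is given the top. */
    char *stack_top = stack + CHILD_STACK_SIZE;
    assert(((uintptr_t)stack_top & 15) == 0); /* malloc gives 16-byte alignment */

    struct child_report report = { 0 };
    pid_t pid = clone(child_main, stack_top, CLONE_VM | SIGCHLD, &report);
    if (pid == -1) {
        free(stack);
        return -1;
    }

    int status = 0;
    if (waitpid(pid, &status, 0) != pid) {
        free(stack);
        return -1;
    }

    int ok = WIFEXITED(status) && WEXITSTATUS(status) == 0 &&
             report.stack_probe >= (uintptr_t)stack &&
             report.stack_probe < (uintptr_t)stack_top;
    free(stack);
    return ok ? 0 : -1;
}
```

Calling `clone_child_ran_on_caller_stack()` from a test program should return 0 when the child's locals land inside the buffer the caller allocated, which is the property the initial register state has to carry over.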

The second hook destroys a thread. This is done in response to an exit
system call. This is somewhat complicated, because LKL never returns to
this thread and the thread's stack may be deallocated by the time we
exit it.

The lthread scheduler has no easy way to kill a thread without that
thread running. We can add one eventually, but for now we create a
temporary stack that lthreads can use during teardown and make them run
the teardown from there.
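
The idea of running teardown from a temporary stack can be sketched outside lthreads with the glibc ucontext API. This is only an illustration under stated assumptions: the real change uses the lthread scheduler's own context switching, and every name below (`worker_entry`, `teardown_entry`, `run_teardown_demo`) is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <ucontext.h>

#define DEMO_STACK_SIZE (64 * 1024) /* arbitrary size for the sketch */

static ucontext_t main_ctx, worker_ctx, teardown_ctx;
static char *worker_stack;                   /* stack owned by the "thread" */
static char teardown_stack[DEMO_STACK_SIZE]; /* temporary stack for teardown */
static int teardown_ran;

/* Runs on teardown_stack, so releasing the worker's own stack is safe. */
static void teardown_entry(void)
{
    free(worker_stack);
    worker_stack = NULL;
    teardown_ran = 1;
    /* Returning resumes main_ctx through uc_link. */
}

/* Stands in for a thread that calls exit: it never returns to its own
 * stack and instead jumps to the teardown context. */
static void worker_entry(void)
{
    setcontext(&teardown_ctx);
}

/* Returns 0 if the teardown ran and the worker stack was released. */
int run_teardown_demo(void)
{
    worker_stack = malloc(DEMO_STACK_SIZE);
    if (worker_stack == NULL)
        return -1;
    teardown_ran = 0;

    if (getcontext(&teardown_ctx) == -1)
        return -1;
    teardown_ctx.uc_stack.ss_sp = teardown_stack;
    teardown_ctx.uc_stack.ss_size = sizeof teardown_stack;
    teardown_ctx.uc_link = &main_ctx;
    makecontext(&teardown_ctx, teardown_entry, 0);

    if (getcontext(&worker_ctx) == -1)
        return -1;
    worker_ctx.uc_stack.ss_sp = worker_stack;
    worker_ctx.uc_stack.ss_size = DEMO_STACK_SIZE;
    worker_ctx.uc_link = &main_ctx;
    makecontext(&worker_ctx, worker_entry, 0);

    /* Run the worker; control comes back here after the teardown returns. */
    if (swapcontext(&main_ctx, &worker_ctx) == -1)
        return -1;

    assert(teardown_ran);
    return worker_stack == NULL ? 0 : -1;
}
```

The point of the design is that the code freeing the stack never executes on the stack being freed, which is why a separate teardown stack is needed while the scheduler cannot kill a thread that is not running.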

Fixes #155

davidchisnall · 2020-05-18T09:27:16Z

Don't merge this yet: the LKL commit will change once lsds/lkl#1 is merged. I'll force-push to this branch to update the LKL submodule once that's done.

davidchisnall · 2020-05-18T09:31:28Z

Note that this doesn't yet fix #154.

prp

LGTM, some minor comments. I see CI failures though...

src/include/enclave/lthread.h

src/lkl/posix-host.c

src/sched/lthread.c

tests/basic/clone/clone.s

davidchisnall · 2020-05-18T12:10:50Z

CI is failing because the weak symbol for lkl_syscall isn't found. It is found for me, so I'm not entirely sure what's going on here...
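
For readers unfamiliar with weak symbols: the comment above does not show how lkl_syscall is declared, so the following is general background only, using the GCC/Clang `weak` attribute. The hook name `demo_optional_hook` and its signature are hypothetical and are not taken from LKL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical optional hook. Because the declaration is weak, the
 * program still links when no definition is provided; the symbol's
 * address is then NULL at run time. */
extern long demo_optional_hook(long no, long *params) __attribute__((weak));

/* Returns 1 if some object file or library supplied a definition. */
int demo_hook_available(void)
{
    return demo_optional_hook != NULL;
}

/* Calls the hook if it was linked in, otherwise returns the fallback. */
long demo_call_hook_or_default(long no, long *params, long fallback)
{
    if (demo_optional_hook != NULL)
        return demo_optional_hook(no, params);
    return fallback;
}

/* Fails loudly when the definition is missing, which is one way a
 * "weak symbol isn't found" problem can surface at run time. */
void demo_require_hook(void)
{
    assert(demo_optional_hook != NULL && "demo_optional_hook was not linked in");
}
```

When no definition of `demo_optional_hook` is linked, `demo_hook_available()` returns 0 and `demo_call_hook_or_default()` returns the fallback. Whether a definition is found can depend on link order and on which libraries are present, so it may differ between a local build and CI.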

SeanTAllen · 2020-05-18T12:30:32Z

tests/basic/clone/Dockerfile

@@ -0,0 +1,11 @@
+FROM alpine:3.6 AS builder


Out of curiosity, given that 3.6 is rather dated: is there a "no later than X" dependency for sgx-lkl that I'm not aware of?

prp · 2020-05-18

We should use a consistent Alpine version in all tests. @letmaik?

letmaik · 2020-05-18

I opened #260. For this PR, it's fine to leave it at an arbitrary version. The maximum supported currently is 3.10.

tests/basic/clone/clone.c

tests/basic/clone/clone.s

SeanTAllen · 2020-05-19T19:57:50Z

I started looking into the failing access02 ltp test. Here's how to replicate.

Start from a fresh checkout of the clone branch.
make DEBUG=true
cd tests/ltp/ltp-batch1
Edit ../batch.mk and comment out the commands for the run-hw and run-sw targets (those commands run all tests):

run-hw: $(ROOT_FS)
        #@${LTP_TEST_SCRIPT} run-hw

run-sw: $(ROOT_FS)
        #@${LTP_TEST_SCRIPT} run-sw

make clean
make DEBUG=true
make run

At this point, you are prepped to run the single test for HW mode:

SGXLKL_VERBOSE=1 SGXLKL_KERNEL_VERBOSE=0 ../../../build/sgx-lkl-run-oe --hw-debug sgxlkl-miniroot-fs.img /ltp/testcases/kernel/syscalls/access/access02

for SW mode:

SGXLKL_VERBOSE=1 SGXLKL_KERNEL_VERBOSE=0 ../../../build/sgx-lkl-run-oe --sw-debug sgxlkl-miniroot-fs.img /ltp/testcases/kernel/syscalls/access/access02

The results I get:

HW:

tst_test.c:1106: INFO: Timeout per run is 0h 05m 00s
tst_test.c:1125: INFO: No fork support
access02.c:144: PASS: access(file_f, F_OK) as root behaviour is correct.
access02.c:144: PASS: access(file_f, F_OK) as nobody behaviour is correct.
access02.c:144: PASS: access(file_r, R_OK) as root behaviour is correct.
access02.c:144: PASS: access(file_r, R_OK) as nobody behaviour is correct.
access02.c:144: PASS: access(file_w, W_OK) as root behaviour is correct.
access02.c:144: PASS: access(file_w, W_OK) as nobody behaviour is correct.
bad count while changing owner
[[ SGX-LKL ]] FAIL: Kernel panic! Run DEBUG build with SGXLKL_KERNEL_VERBOSE=1 for more information. Aborting...
2020-05-19T19:52:36.000000Z [(H)ERROR] tid(0x7fe8a8ff1700) | :OE_ENCLAVE_ABORTING [/home/sean/openenclave-sgxlkl.git/host/calls.c:oe_call_enclave_function_by_table_id:91]
[ SGX-LKL ] ethread (4: 19) [ SGX-LKL ] FAIL: sgxlkl_ethread_init() failed (id=4 result=19 (OE_ENCLAVE_ABORTING))

SW:

It hangs after printing access02.c:144: PASS: access(file_w, W_OK) as nobody behaviour is correct.

This is somewhat different than what I am seeing in CI, so, before I proceed any further, @davidchisnall can you try recreating the above?

The source of the failing test is https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/syscalls/access/access02.c

davidchisnall · 2020-05-20T10:59:58Z

Thanks @SeanTAllen. I actually see something more interesting when I try to reproduce this locally:

tst_test.c:1106: INFO: Timeout per run is 0h 05m 00s
tst_test.c:1125: INFO: No fork support
access02.c:144: PASS: access(file_f, F_OK) as root behaviour is correct.
access02.c:144: PASS: access(file_f, F_OK) as nobody behaviour is correct.
access02.c:144: PASS: access(file_r, R_OK) as root behaviour is correct.
access02.c:144: PASS: access(file_r, R_OK) as nobody behaviour is correct.
access02.c:144: PASS: access(file_w, W_OK) as root behaviour is correct.
access02.c:144: PASS: access(file_w, W_OK) as nobody behaviour is correct.
Created new host task 7f2fc02f4540 (for 7f2fbeada4c0)

So it appears that we're hitting the code path for creating a new clone'd task (for a test that shouldn't be calling clone). I'll take a look.

This implements two new LKL hooks. The first creates an lthread with a specific initial register state (to capture the returns-twice behaviour of clone, along with the caller's ability to define the stack and TLS addresses). The new thread is immediately associated with the Linux task structure (normally, lthreads are associated with Linux tasks lazily when they perform a system call).

The second hook destroys a thread. This is done in response to an exit system call. This is somewhat complicated, because LKL never returns to this thread and the thread's stack may be deallocated by the time we exit it.

The lthread scheduler has no easy way to kill a thread without that thread running. We can add one eventually, but for now we create a temporary stack that lthreads can use during teardown and make them run the teardown from there.

Disable access02 test. It is spuriously passing and this makes it fail. See #277 for more information.

Fixes #155

davidchisnall requested review from prp and mikbras May 18, 2020 09:26

davidchisnall mentioned this pull request May 18, 2020

Intercept clone to handle CLONE_CHILD_CLEARTID #154

Closed

prp approved these changes May 18, 2020


SeanTAllen reviewed May 18, 2020


tests/basic/clone/clone.c (outdated, resolved)

davidchisnall commented May 18, 2020


tests/basic/clone/clone.s (outdated, resolved)

davidchisnall force-pushed the clone branch 3 times, most recently from 6fe9619 to 23f27cf Compare May 19, 2020 16:28

davidchisnall force-pushed the clone branch from 23f27cf to bc4416c Compare May 20, 2020 13:26

davidchisnall merged commit 24b7865 into oe_port May 20, 2020

davidchisnall deleted the clone branch May 20, 2020 14:46

SeanTAllen added a commit that referenced this pull request May 20, 2020

Fix missed LKL submodule bump from PR #259

b8a3b6d

SeanTAllen added a commit that referenced this pull request May 20, 2020

Fix missed LKL submodule bump from PR #259

97d2ac4

davidchisnall mentioned this pull request May 26, 2020

LKL does not support clone #155

Closed
