This repository has been archived by the owner on May 6, 2020. It is now read-only.

wshd can't run in clear container #342

Closed
wuzhy opened this issue Jul 19, 2017 · 8 comments

Comments

@wuzhy

wuzhy commented Jul 19, 2017

Hi,

I tried to run wshd in a Clear Container, but it failed. Has anyone hit this issue before, or does anyone know a workaround? Thanks.

Here is the log:
sh-4.1# strace ./wshd --run /share
execve("./wshd", ["./wshd", "--run", "/share"], [/* 8 vars */]) = 0
uname({sys="Linux", node="1f0511ff4ebce1762a45a656dff21e6f8ba01da5509c5d7c5572dbc9492a684e", ...}) = 0
brk(0) = 0x1eb1000
brk(0x1eb2180) = 0x1eb2180
arch_prctl(ARCH_SET_FS, 0x1eb1860) = 0
brk(0x1ed3180) = 0x1ed3180
brk(0x1ed4000) = 0x1ed4000
stat("/share", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
socket(PF_FILE, SOCK_STREAM, 0) = 3
unlink("/share/wshd.sock") = 0
bind(3, {sa_family=AF_FILE, path="/share/wshd.sock"}, 110) = -1 ENXIO (No such device or address)
dup(2) = 4
fcntl(4, F_GETFL) = 0x8002 (flags O_RDWR|O_LARGEFILE)
fstat(4, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 1), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fb6f6323000
lseek(4, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)
write(4, "bind: No such device or address\n", 32bind: No such device or address
) = 32
close(4) = 0
munmap(0x7fb6f6323000, 4096) = 0
brk(0x1ed3000) = 0x1ed3000
exit_group(1) = ?
sh-4.1#
sh-4.1# uname -a
Linux 1f0511ff4ebce1762a45a656dff21e6f8ba01da5509c5d7c5572dbc9492a684e 4.5.0-50.container #1 SMP Mon Oct 24 22:24:01 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
sh-4.1# cat /etc/system-release
CentOS release 6.4 (Final)
sh-4.1#

@grahamwhaley
Contributor

Hi @wuzhy
We've not tried wshd, I'm afraid, so we have no direct experience.
Seeing:

unlink("/share/wshd.sock") = 0

in the logs makes me think of an issue with 9pfs and unlinked files (see intel/cc-oci-runtime#152 for a similar thread).

In that case, we verified the issue by mounting a RAM-backed filesystem on /tmp to see whether that worked around the problem:

# mount -t ramfs -o size=20M ramfs /tmp

I don't see how /share is placed inside your container, but my guess is that it was mounted via a docker command request and is thus a 9p mount. Can you check that for us with a mount command? And if possible, can you verify using a trick similar to the one above?
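
To check whether a given directory accepts a Unix socket at all, without involving wshd, a small probe can repeat the unlink() and bind() sequence seen in the strace log. This is only a sketch: the function name `try_bind_unix_socket` and the file name `probe.sock` are made up for illustration and are not part of wshd.

```python
import errno
import os
import socket


def try_bind_unix_socket(directory, name="probe.sock"):
    """Repeat the unlink() + bind() sequence from the strace log.

    Returns None if bind() succeeds, or the symbolic errno name
    (for example "ENXIO") if it fails. `name` is a hypothetical
    file name chosen for this probe.
    """
    path = os.path.join(directory, name)
    try:
        os.unlink(path)  # remove a stale socket file first
    except FileNotFoundError:
        pass

    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.bind(path)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        sock.close()

    os.unlink(path)  # clean up the probe socket file
    return None
```

Running `print(try_bind_unix_socket("/share"))` and `print(try_bind_unix_socket("/tmp"))` inside the container should show whether the failure follows the filesystem rather than wshd itself.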

@wuzhy
Author

wuzhy commented Jul 19, 2017

@grahamwhaley Yes, the dir /share is mounted via docker run ... -v /root/wsh/share:/share ....

hyperShared on /share type 9p (rw,sync,dirsync,nodev,relatime,trans=virtio)
sh-4.1# ls /share
wsh wsh.tar.gz wshd
sh-4.1#
sh-4.1# mount -t ramfs -o size=20M ramfs /share
sh-4.1# mount
hyperShared on /share type 9p (rw,sync,dirsync,nodev,relatime,trans=virtio)
ramfs on /share type ramfs (rw,relatime,size=20M)
sh-4.1# ls /share

The original files and directories in /share are no longer visible.
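
The mount output above lists two entries for /share, and the later ramfs entry hides the 9p one, which is why the directory looks empty. A minimal sketch of reading that output follows. It assumes `mount` prints entries in the order they were mounted and uses the `X on Y type Z (opts)` line format shown above; the helper name `effective_fs_type` is hypothetical, and it only matches exact mount points, not paths below them.

```python
import re

# Matches lines in the style printed by `mount`, for example:
# "ramfs on /share type ramfs (rw,relatime,size=20M)"
MOUNT_LINE = re.compile(
    r"^(?P<src>\S+) on (?P<target>\S+) type (?P<fstype>\S+) \((?P<opts>[^)]*)\)$"
)


def effective_fs_type(mount_output, target):
    """Return the filesystem type of the last mount listed for `target`.

    Assumes later entries for the same mount point shadow earlier
    ones, so the last match is the one whose contents are visible.
    Returns None if `target` is not listed as a mount point.
    """
    fstype = None
    for line in mount_output.splitlines():
        match = MOUNT_LINE.match(line.strip())
        if match and match.group("target") == target:
            fstype = match.group("fstype")
    return fstype
```

Fed the two-line output above, this reports ramfs as the filesystem in effect on /share, so the socket ends up on ramfs rather than on the 9p share.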

@sboeuf
Contributor

sboeuf commented Jul 19, 2017

@wuzhy What do you mean by "now"? Have you seen a recent change?

@wuzhy
Author

wuzhy commented Jul 19, 2017

@sboeuf I mean that after the command "mount -t ramfs -o size=20M ramfs /share" is run, there are no files or directories under "/share".

@wuzhy
Author

wuzhy commented Jul 19, 2017

@grahamwhaley Yes, the method you mentioned works around it. But has this 9p issue still not been fixed upstream?

sh-4.1# strace ./wshd --run /share
execve("./wshd", ["./wshd", "--run", "/share"], [/* 8 vars */]) = 0
uname({sys="Linux", node="c58b86197c50fe5a26184bee4ab27928cc74575ae9360e347c6d712e89c39432", ...}) = 0
brk(0) = 0x16c7000
brk(0x16c8180) = 0x16c8180
arch_prctl(ARCH_SET_FS, 0x16c7860) = 0
brk(0x16e9180) = 0x16e9180
brk(0x16ea000) = 0x16ea000
stat("/share", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
socket(PF_FILE, SOCK_STREAM, 0) = 3
unlink("/share/wshd.sock") = 0
bind(3, {sa_family=AF_FILE, path="/share/wshd.sock"}, 110) = 0
listen(3, 5) = 0
close(0) = 0
close(1) = 0
close(2) = 0
rt_sigprocmask(SIG_BLOCK, [CHLD], NULL, 8) = 0
signalfd4(-1, [CHLD], 8, O_NONBLOCK|O_CLOEXEC) = 0
select(1024, [0 3], NULL, NULL, NULL

@dlespiau
Contributor

Is your plan to use an AF_UNIX socket shared between the host and guest? I can't see that working with two different kernels (host and guest). The guest kernel doesn't know about the kernel objects from the host. It works with containers because they share the same kernel.

Or am I missing something?
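
As a loose analogy on a single machine (not the host/guest case itself): the socket file is only a name, and the listening object lives in the kernel that created it. The sketch below binds a socket, closes it without unlinking, and then tries to connect to the leftover file. The function name `connect_to_orphaned_socket_file` is made up for illustration.

```python
import os
import socket
import tempfile


def connect_to_orphaned_socket_file():
    """Bind a Unix socket, close it without unlinking, then connect.

    Returns (file_still_exists, error_name), where error_name is the
    exception class raised by connect(), or None if it succeeded.
    """
    directory = tempfile.mkdtemp()
    path = os.path.join(directory, "demo.sock")

    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(path)
    server.listen(1)
    server.close()  # the kernel socket goes away; the file remains

    file_still_exists = os.path.exists(path)
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        client.connect(path)
        error_name = None
    except OSError as exc:
        error_name = type(exc).__name__
    finally:
        client.close()
        os.unlink(path)
        os.rmdir(directory)
    return file_still_exists, error_name
```

The file is still visible on disk, yet connect() is refused because no kernel socket stands behind it. A guest seeing a host-created socket file through a shared directory is presumably in a comparable position.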

@wuzhy
Author

wuzhy commented Jul 20, 2017

Yes, we use wshd to communicate between guest and host. Is there any other way to replace it?

@wuzhy
Author

wuzhy commented Aug 6, 2017

This issue has been fixed, so I'm closing it now.

@wuzhy wuzhy closed this as completed Aug 6, 2017