Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Kernel panic with "BUG: unable to handle kernel NULL pointer dereference at (null)" #80

Closed
leoh0 opened this issue May 16, 2018 · 0 comments

Comments

@leoh0
Copy link

leoh0 commented May 16, 2018

Description

A kernel panic occurs when two nodes connect with vxlan.

Steps to reproduce the issue:

Just follow this section. booting-and-initialising-os-images

$ make all

# boot master
$ ./boot.sh

# login to master
$ ./ssh_into_kubelet.sh 192.168.65.11 
# launch master
linuxkit-025000000009:/# kubeadm-init.sh

# join to master
$ ./boot.sh 1 192.168.65.11:6443 --token 43gcdz.40q62te3f1xprg7r --discovery-token-ca-cert-hash sha256:d79e5239a534ae0296410e0fdfa532664b92c65eda5dad7c2924d3ad05cb7313

# and kernel panic occurs

Describe the results you received:

When vxlan is connected then kernel panic is occured.

[  197.502285] BUG: unable to handle kernel NULL pointer dereference at           (null)
[  197.503112] IP:           (null)
[  197.503630] PGD 8000000035950067 P4D 8000000035950067 PUD 3598e067 PMD 0
[  197.504438] Oops: 0010 [#1] SMP PTI
[  197.504815] Modules linked in: dummy vport_vxlan openvswitch xfrm_user xfrm_algo
[  197.505716] CPU: 0 PID: 2246 Comm: weaver Not tainted 4.14.32-linuxkit #1
[  197.506486] Hardware name:   BHYVE, BIOS 1.00 03/14/2014
[  197.507043] task: ffff9a3bb7715040 task.stack: ffffaeb081c64000
[  197.507862] RIP: 0010:          (null)
[  197.508239] RSP: 0018:ffffaeb081c67788 EFLAGS: 00010286
[  197.508784] RAX: ffffffff9d83a080 RBX: ffff9a3bb8214000 RCX: 00000000000005aa
[  197.509473] RDX: ffff9a3bb54b4200 RSI: 0000000000000000 RDI: ffff9a3bb5480400
[  197.510234] RBP: ffffaeb081c67880 R08: 0000000000000006 R09: 0000000000000002
[  197.510980] R10: 0000000000000000 R11: ffff9a3bb5469300 R12: ffff9a3bb54b4200
[  197.511763] R13: ffff9a3bb54804a8 R14: ffff9a3bb5469300 R15: ffff9a3bb8214040
[  197.512323] FS:  0000000003795880(0000) GS:ffff9a3bbe600000(0000) knlGS:0000000000000000
[  197.513038] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  197.513666] CR2: 0000000000000000 CR3: 00000000380b2004 CR4: 00000000000606b0
[  197.514417] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  197.515142] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  197.515865] Call Trace:
[  197.516147]  ? vxlan_xmit_one+0x4af/0x837
[  197.516568]  ? vxlan_xmit+0xb2f/0xb5a
[  197.516969]  ? vxlan_xmit+0xb2f/0xb5a
[  197.517427]  ? skb_network_protocol+0x55/0xb3
[  197.517884]  ? dev_hard_start_xmit+0xd0/0x194
[  197.518441]  ? dev_hard_start_xmit+0xd0/0x194
[  197.518938]  ? __dev_queue_xmit+0x47c/0x5c4
[  197.519437]  ? do_execute_actions+0x99/0x1069 [openvswitch]
[  197.519952]  ? do_execute_actions+0x99/0x1069 [openvswitch]
[  197.520454]  ? slab_post_alloc_hook.isra.52+0xa/0x1a
[  197.520999]  ? __kmalloc+0xc1/0xd3
[  197.521371]  ? ovs_execute_actions+0x77/0xfd [openvswitch]
[  197.521897]  ? ovs_execute_actions+0x77/0xfd [openvswitch]
[  197.522469]  ? ovs_packet_cmd_execute+0x1bb/0x230 [openvswitch]
[  197.523063]  ? genl_family_rcv_msg+0x2db/0x349
[  197.523433]  ? genl_rcv_msg+0x4e/0x69
[  197.523787]  ? genlmsg_multicast_allns+0xf1/0xf1
[  197.524209]  ? netlink_rcv_skb+0x97/0xe8
[  197.524700]  ? genl_rcv+0x24/0x31
[  197.525137]  ? netlink_unicast+0x11a/0x1b5
[  197.525676]  ? netlink_sendmsg+0x2e2/0x308
[  197.526192]  ? sock_sendmsg+0x2d/0x3c
[  197.526754]  ? SYSC_sendto+0xfc/0x138
[  197.527233]  ? do_syscall_64+0x69/0x79
[  197.527681]  ? entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[  197.528326] Code:  Bad RIP value.
[  197.528660] RIP:           (null) RSP: ffffaeb081c67788
[  197.529281] CR2: 0000000000000000
[  197.529634] ---[ end trace 391521052893e451 ]---
[  197.533425] Kernel panic - not syncing: Fatal exception in interrupt
[  197.534602] Kernel Offset: 0x1b000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[  197.538715] Rebooting in 10 seconds..
[  207.586797] ACPI MEMORY or I/O RESET_REG.
FATA[0211] Cannot run hyperkit: exit status 2

Describe the results you expected:

without kernel panic

Additional information you deem important (e.g. issue happens only occasionally):

I just test linuxkit/linux@6a3c946 commit can fix this problem which is in 4.14.40 kernel.

leoh0 added a commit to leoh0/kubernetes that referenced this issue May 19, 2018
Fixes: linuxkit#80

Signed-off-by: Eohyung Lee <liquidnuker@gmail.com>
@leoh0 leoh0 closed this as completed May 23, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant