scst_usr_cleanup thread causes soft lockup and scst_usr_released threads in D state #279

Open · ChengyuanLiCY opened this issue Feb 8, 2025 · 2 comments

@ChengyuanLiCY

We noticed that the scst_usr_cleanup kernel thread puts a CPU into a soft lockup state, and that scst_usr_released threads end up in the D state on our storage nodes. This typically happens when an SCST user-space program exits unexpectedly, i.e. all of its open fds on /dev/scst_user are released at the same time when the process exits.

The issue is seen on both SCST v3.6 and v3.8, and the OS is Ubuntu 22.04 with kernel 5.15.0-26.
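
For illustration, the trigger condition can be sketched roughly as below. This is only a hypothetical skeleton, assuming the handler has already registered its devices; the scst_user registration ioctls are omitted, and the point is merely that an abrupt exit releases all of the handler's /dev/scst_user fds at once.

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
        int i, fd[8];

        /* A real handler would register a virtual device on each fd via
         * the scst_user interface; that part is omitted here. */
        for (i = 0; i < 8; i++) {
                fd[i] = open("/dev/scst_user", O_RDWR);
                if (fd[i] < 0) {
                        perror("open /dev/scst_user");
                        return 1;
                }
        }

        /* Simulate an unexpected crash: no close(), no cleanup. All the
         * fds are released by the kernel at the same time during exit,
         * which is when the scst_usr_released threads pile up. */
        abort();
}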

The error logs:

root     226734  0.0  0.0      0     0 ?        D    05:36   0:00 [scst_usr_releas]
root     226736  0.0  0.0      0     0 ?        D    05:36   0:00 [scst_usr_releas]
root     226775  0.0  0.0      0     0 ?        D    05:36   0:00 [scst_usr_releas]
root     226776  0.0  0.0      0     0 ?        D    05:36   0:00 [scst_usr_releas]
root     226777  0.0  0.0      0     0 ?        D    05:36   0:00 [scst_usr_releas]
root     226778  0.0  0.0      0     0 ?        D    05:36   0:00 [scst_usr_releas]

# cat /proc/226776/stack
[<0>] msleep+0x2d/0x40
[<0>] scst_acg_del_lun+0xa5/0x1e0 [scst]
[<0>] scst_unregister_virtual_device+0x11c/0x2b0 [scst]
[<0>] dev_user_exit_dev.isra.0+0x101/0x1e0 [scst_user]
[<0>] __dev_user_release+0x13/0x30 [scst_user]
[<0>] kthread+0x127/0x150
[<0>] ret_from_fork+0x1f/0x30
[Wed Jan 22 05:37:21 2025] rcu: INFO: rcu_sched self-detected stall on CPU
[Wed Jan 22 05:37:21 2025] rcu:         31-....: (14998 ticks this GP) idle=a83/1/0x4000000000000000 softirq=753634/753634 fqs=6302
[Wed Jan 22 05:37:21 2025]      (t=15000 jiffies g=2826113 q=405490)
[Wed Jan 22 05:37:21 2025] NMI backtrace for cpu 31
[Wed Jan 22 05:37:21 2025] CPU: 31 PID: 4156 Comm: scst_usr_cleanu Kdump: loaded Tainted: P        W  OEL    5.15.0-26-generic #26
[Wed Jan 22 05:37:21 2025] Hardware name: QCT D52B-1U/D52B-1U, BIOS 3B17.EBY05 02/08/2022
[Wed Jan 22 05:37:21 2025] Call Trace:
[Wed Jan 22 05:37:21 2025]  <IRQ>
[Wed Jan 22 05:37:21 2025]  show_stack+0x52/0x58
[Wed Jan 22 05:37:21 2025]  dump_stack_lvl+0x4a/0x5f
[Wed Jan 22 05:37:21 2025]  dump_stack+0x10/0x12
[Wed Jan 22 05:37:21 2025]  nmi_cpu_backtrace.cold+0x4d/0x93
[Wed Jan 22 05:37:21 2025]  ? lapic_can_unplug_cpu+0x80/0x80
[Wed Jan 22 05:37:21 2025]  nmi_trigger_cpumask_backtrace+0xe4/0xf0
[Wed Jan 22 05:37:21 2025]  arch_trigger_cpumask_backtrace+0x19/0x20
[Wed Jan 22 05:37:21 2025]  trigger_single_cpu_backtrace+0x44/0x4b
[Wed Jan 22 05:37:21 2025]  rcu_dump_cpu_stacks+0x102/0x145
[Wed Jan 22 05:37:21 2025]  print_cpu_stall.cold+0x2f/0xe2
[Wed Jan 22 05:37:21 2025]  check_cpu_stall+0x105/0x270
[Wed Jan 22 05:37:21 2025]  rcu_sched_clock_irq+0x9a/0x240
[Wed Jan 22 05:37:21 2025]  update_process_times+0x94/0xd0
[Wed Jan 22 05:37:21 2025]  ? tick_nohz_handler+0xb0/0xb0
[Wed Jan 22 05:37:21 2025]  tick_sched_handle+0x29/0x60
[Wed Jan 22 05:37:21 2025]  ? tick_nohz_handler+0xb0/0xb0
[Wed Jan 22 05:37:21 2025]  tick_sched_timer+0x88/0xb0
[Wed Jan 22 05:37:21 2025]  __hrtimer_run_queues+0x106/0x210
[Wed Jan 22 05:37:21 2025]  ? clockevents_program_event+0xa6/0x120
[Wed Jan 22 05:37:21 2025]  hrtimer_interrupt+0x101/0x220
[Wed Jan 22 05:37:21 2025]  __sysvec_apic_timer_interrupt+0x5e/0xd0
[Wed Jan 22 05:37:21 2025]  sysvec_apic_timer_interrupt+0x77/0x90
[Wed Jan 22 05:37:21 2025]  </IRQ>
[Wed Jan 22 05:37:21 2025]  <TASK>
[Wed Jan 22 05:37:21 2025]  asm_sysvec_apic_timer_interrupt+0x12/0x20
[Wed Jan 22 05:37:21 2025] RIP: 0010:_raw_spin_lock_bh+0x1b/0x30
[Wed Jan 22 05:37:21 2025] Code: e5 e8 39 ee 3f ff 5d c3 0f 1f 80 00 00 00 00 0f 1f 44 00 00 65 81 05 90 f3 8e 6d 00 02 00 00 31 c0 ba 01 00 00 00 f0 0f b1 17 <75> 01 c3 55 89 c6 48 89 e5 e8 37 e9 3f ff 66 90 5d c3 0f 1f 00 0f
[Wed Jan 22 05:37:21 2025] RSP: 0018:ffff9b915fd73e20 EFLAGS: 00000246
[Wed Jan 22 05:37:21 2025] RAX: 0000000000000000 RBX: 0000000000000000 RCX: dead000000000122
[Wed Jan 22 05:37:21 2025] RDX: 0000000000000001 RSI: ffff8f383a2c5640 RDI: ffff8f383a2c56b8
[Wed Jan 22 05:37:21 2025] RBP: ffff9b915fd73e68 R08: 0000000000000000 R09: 0000000000000101
[Wed Jan 22 05:37:21 2025] R10: 0000000000000068 R11: 0000000000000000 R12: 0000000000000009
[Wed Jan 22 05:37:21 2025] R13: ffff8f383a2c56d0 R14: ffff8f383a2c56b8 R15: 0000000000000130
[Wed Jan 22 05:37:21 2025]  ? sgv_pool_flush+0x44/0x180 [scst]
[Wed Jan 22 05:37:21 2025]  dev_user_cleanup_thread+0x150/0x5a0 [scst_user]
[Wed Jan 22 05:37:21 2025]  ? wait_woken+0x70/0x70
[Wed Jan 22 05:37:21 2025]  ? dev_user_get_cmd_to_user+0x1f0/0x1f0 [scst_user]
[Wed Jan 22 05:37:21 2025]  kthread+0x127/0x150
[Wed Jan 22 05:37:21 2025]  ? set_kthread_struct+0x50/0x50
[Wed Jan 22 05:37:21 2025]  ret_from_fork+0x1f/0x30
[Wed Jan 22 05:37:21 2025]  </TASK>
[Wed Jan 22 05:37:31 2025] rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 31-... } 15047 jiffies s: 17481 root: 0x4/.
[Wed Jan 22 05:37:31 2025] rcu: blocking rcu_node structures (internal RCU debug): l=1:30-44:0x2/.
[Wed Jan 22 05:37:31 2025] Task dump for CPU 31:
[Wed Jan 22 05:37:31 2025] task:scst_usr_cleanu state:R  running task     stack:    0 pid: 4156 ppid:     2 flags:0x00004008
[Wed Jan 22 05:37:31 2025] Call Trace:
[Wed Jan 22 05:37:31 2025]  <TASK>
[Wed Jan 22 05:37:31 2025]  ? asm_sysvec_apic_timer_interrupt+0x12/0x20
[Wed Jan 22 05:37:31 2025]  ? sysvec_apic_timer_interrupt+0x4e/0x90
[Wed Jan 22 05:37:31 2025]  ? dev_user_get_next_cmd+0x54/0x3d0 [scst_user]
[Wed Jan 22 05:37:31 2025]  ? _raw_spin_unlock_bh+0x1e/0x20
[Wed Jan 22 05:37:31 2025]  ? sgv_pool_flush+0x117/0x180 [scst]
[Wed Jan 22 05:37:31 2025]  ? dev_user_cleanup_thread+0x150/0x5a0 [scst_user]
[Wed Jan 22 05:37:31 2025]  ? wait_woken+0x70/0x70
[Wed Jan 22 05:37:31 2025]  ? dev_user_get_cmd_to_user+0x1f0/0x1f0 [scst_user]
[Wed Jan 22 05:37:31 2025]  ? kthread+0x127/0x150
[Wed Jan 22 05:37:31 2025]  ? set_kthread_struct+0x50/0x50
[Wed Jan 22 05:37:31 2025]  ? ret_from_fork+0x1f/0x30
[Wed Jan 22 05:37:31 2025]  </TASK>
[Wed Jan 22 05:37:46 2025] watchdog: BUG: soft lockup - CPU#31 stuck for 82s! [scst_usr_cleanu:4156]
@ChengyuanLiCY (Author)

We saw the RCU stall messages together with the scst_usr_cleanup kernel thread soft lockup, and in the dumped kernel stacks the problematic function is always sgv_pool_flush():

/*
 * sgv_pool_flush() - flushes the SGV pool.
 *
 * Flushes, i.e. frees, all the cached entries in the SGV pool.
 */
void sgv_pool_flush(struct sgv_pool *pool)
{
        int i;

        TRACE_ENTRY();

        for (i = 0; i < pool->max_caches; i++) {
                struct sgv_pool_obj *obj;

                spin_lock_bh(&pool->sgv_pool_lock);

                while (!list_empty(&pool->recycling_lists[i])) {
                        obj = list_first_entry(&pool->recycling_lists[i],
                                struct sgv_pool_obj, recycling_list_entry);

                        __sgv_purge_from_cache(obj);

                        spin_unlock_bh(&pool->sgv_pool_lock);

                        EXTRACHECKS_BUG_ON(obj->owner_pool != pool);
                        sgv_dtor_and_free(obj);

                        spin_lock_bh(&pool->sgv_pool_lock);
                }
                spin_unlock_bh(&pool->sgv_pool_lock);
        }

        TRACE_EXIT();
        return;
}
EXPORT_SYMBOL_GPL(sgv_pool_flush);
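
The inner loop above drops and re-acquires sgv_pool_lock on every iteration but never yields the CPU, so if the recycling list does not drain, the thread spins until the watchdog and RCU stall detectors fire. As an untested idea (a sketch, not a proposed patch), a cond_resched() in the window where the lock is not held would at least let the scheduler and RCU make progress, although it would not explain why the list keeps refilling:

                while (!list_empty(&pool->recycling_lists[i])) {
                        obj = list_first_entry(&pool->recycling_lists[i],
                                struct sgv_pool_obj, recycling_list_entry);

                        __sgv_purge_from_cache(obj);

                        spin_unlock_bh(&pool->sgv_pool_lock);

                        EXTRACHECKS_BUG_ON(obj->owner_pool != pool);
                        sgv_dtor_and_free(obj);

                        /* Untested: yield here, while sgv_pool_lock is not
                         * held, so a long flush cannot monopolize the CPU
                         * and trigger the soft lockup / RCU stall reports.
                         * This does not address what keeps the recycling
                         * list from draining. */
                        cond_resched();

                        spin_lock_bh(&pool->sgv_pool_lock);
                }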

@ChengyuanLiCY (Author)

Meanwhile, the D state of the scst_usr_released threads happens in scst_wait_for_tgt_devs(), where they keep going back into msleep().

I noticed SCST commit f59fe00, which says it fixes an RCU stall. However, the only change is using msleep() instead of mdelay(). msleep() calls schedule() and puts the process into an uninterruptible sleep (D state), while mdelay() busy-loops. Even though it was mdelay(100), i.e. only 100 ms per iteration, the RCU stall messages are usually dumped only after 20 seconds or more, which means that while (scst_tgt_devs_cmds(tgt_dev_list) > 0) keeps returning true forever. After the change to msleep(), the process sits in the D state instead of causing an RCU stall, which is exactly what we see in our environment.

static void scst_wait_for_tgt_devs(struct list_head *tgt_dev_list)
{
        while (scst_tgt_devs_cmds(tgt_dev_list) > 0)
                msleep(100);
}

int scst_acg_del_lun(struct scst_acg *acg, uint64_t lun,
                     bool gen_report_luns_changed)
{
        int res = 0;
        struct scst_acg_dev *acg_dev;
        struct scst_tgt_dev *tgt_dev, *tt;
        struct list_head tgt_dev_list;

        TRACE_ENTRY();

        lockdep_assert_held(&scst_mutex);

        acg_dev = __scst_acg_del_lun(acg, lun, &tgt_dev_list,
                                     &gen_report_luns_changed);
        if (acg_dev == NULL) {
                PRINT_ERROR("Device is not found in group %s", acg->acg_name);
                res = -EINVAL;
                goto out;
        }

        if (gen_report_luns_changed)
                scst_report_luns_changed(acg);

        mutex_unlock(&scst_mutex);

        scst_wait_for_tgt_devs(&tgt_dev_list);
        synchronize_rcu();

        mutex_lock(&scst_mutex);

        list_for_each_entry_safe(tgt_dev, tt, &tgt_dev_list,
                                 extra_tgt_dev_list_entry) {
                scst_free_tgt_dev(tgt_dev);
        }
        scst_free_acg_dev(acg_dev);

out:
        TRACE_EXIT_RES(res);
        return res;
}
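
To narrow down why scst_tgt_devs_cmds() never drops to zero, one hedged diagnostic idea (not a fix; the 60-second threshold is arbitrary and PRINT_WARNING is assumed to be available, PRINT_ERROR as used above would do as well) is to make the wait loop report how long it has been waiting and how many commands are still outstanding:

/* Untested diagnostic variant of the wait loop: periodically log how many
 * commands are still outstanding, so the stuck commands can be correlated
 * with the rest of the kernel log. */
static void scst_wait_for_tgt_devs(struct list_head *tgt_dev_list)
{
        unsigned long waited_ms = 0;
        int cmds;

        while ((cmds = scst_tgt_devs_cmds(tgt_dev_list)) > 0) {
                msleep(100);
                waited_ms += 100;
                if (waited_ms % 60000 == 0)
                        PRINT_WARNING("still waiting for %d commands after %lu ms",
                                      cmds, waited_ms);
        }
}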
