rcu: Don't redump the stalled CPU where RCU GP kthread last ran
author Zhen Lei <thunder.leizhen@huawei.com>
Wed, 12 Jul 2023 15:15:57 +0000 (23:15 +0800)
committer Frederic Weisbecker <frederic@kernel.org>
Mon, 11 Sep 2023 19:46:54 +0000 (21:46 +0200)
rcu_dump_cpu_stacks() already dumps the stacks of all stalled CPUs.
If the CPU on which the RCU GP kthread last ran is itself stalled, its
stack does not need to be dumped again: the corresponding backtrace can
be found by searching the log for the printed CPU ID.

For example:
[   87.328275] rcu: rcu_sched kthread starved for ... ->cpu=3  <--------|
... ...                                                                 |
[   89.385007] NMI backtrace for cpu 3                         <--------|
[   89.385179] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 5.10.0+ #22 <--|
[   89.385188] Hardware name: linux,dummy-virt (DT)
[   89.385196] pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--)
[   89.385204] pc : arch_cpu_idle+0x40/0xc0
[   89.385211] lr : arch_cpu_idle+0x2c/0xc0
... ...
[   89.385566] Call trace:
[   89.385574]  arch_cpu_idle+0x40/0xc0
[   89.385581]  default_idle_call+0x100/0x450
[   89.385589]  cpuidle_idle_call+0x2f8/0x460
[   89.385596]  do_idle+0x1dc/0x3d0
[   89.385604]  cpu_startup_entry+0x5c/0xb0
[   89.385613]  secondary_start_kernel+0x35c/0x520

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
kernel/rcu/tree_stall.h

index b5ce0580074e740d24e3a2bb3d936f5c877e49e3..fc04a8d7ce96ebc9107f0e9e52407151267a2c2f 100644
@@ -534,12 +534,14 @@ static void rcu_check_gp_kthread_starvation(void)
                       data_race(READ_ONCE(rcu_state.gp_state)),
                       gpk ? data_race(READ_ONCE(gpk->__state)) : ~0, cpu);
                if (gpk) {
+                       struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
+
                        pr_err("\tUnless %s kthread gets sufficient CPU time, OOM is now expected behavior.\n", rcu_state.name);
                        pr_err("RCU grace-period kthread stack dump:\n");
                        sched_show_task(gpk);
                        if (cpu_is_offline(cpu)) {
                                pr_err("RCU GP kthread last ran on offline CPU %d.\n", cpu);
-                       } else  {
+                       } else if (!(data_race(READ_ONCE(rdp->mynode->qsmask)) & rdp->grpmask)) {
                                pr_err("Stack dump where RCU GP kthread last ran:\n");
                                dump_cpu_task(cpu);
                        }