sched/cpuidle: Comment about timers requirements VS idle handler
authorFrederic Weisbecker <frederic@kernel.org>
Tue, 14 Nov 2023 19:38:39 +0000 (14:38 -0500)
committerPeter Zijlstra <peterz@infradead.org>
Wed, 15 Nov 2023 08:57:51 +0000 (09:57 +0100)
Add missing explanation concerning IRQs re-enablement constraints in
the cpuidle path against timers.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lkml.kernel.org/r/20231114193840.4041-2-frederic@kernel.org
kernel/sched/idle.c

index 565f8374ddbbf747f04a3a8e7b9d656cb1e0210a..31231925f1ece276c96269a2516d4d00ff011420 100644 (file)
@@ -258,6 +258,36 @@ static void do_idle(void)
        while (!need_resched()) {
                rmb();
 
+               /*
+                * Interrupts shouldn't be re-enabled from that point on until
+                * the CPU sleeping instruction is reached. Otherwise an interrupt
+                * may fire and queue a timer that would be ignored until the CPU
+                * wakes from the sleeping instruction. And testing need_resched()
+                * doesn't tell about pending needed timer reprogram.
+                *
+                * Several cases to consider:
+                *
+                * - SLEEP-UNTIL-PENDING-INTERRUPT based instructions such as
+                *   "wfi" or "mwait" are fine because they can be entered with
+                *   interrupt disabled.
+                *
+                * - sti;mwait() couple is fine because the interrupts are
+                *   re-enabled only upon the execution of mwait, leaving no gap
+                *   in-between.
+                *
+                * - ROLLBACK based idle handlers with the sleeping instruction
+                *   called with interrupts enabled are NOT fine. In this scheme
+                *   when the interrupt detects it has interrupted an idle handler,
+                *   it rolls back to its beginning which performs the
+                *   need_resched() check before re-executing the sleeping
+                *   instruction. This can leak a pending needed timer reprogram.
+                *   If such a scheme is really mandatory due to the lack of an
+                *   appropriate CPU sleeping instruction, then a FAST-FORWARD
+                *   must instead be applied: when the interrupt detects it has
+                *   interrupted an idle handler, it must resume to the end of
+                *   this idle handler so that the generic idle loop is iterated
+                *   again to reprogram the tick.
+                */
                local_irq_disable();
 
                if (cpu_is_offline(cpu)) {