sched/timers: Explain why idle task schedules out on remote timer enqueue
author Frederic Weisbecker <frederic@kernel.org>
Tue, 14 Nov 2023 19:38:40 +0000 (14:38 -0500)
committer Peter Zijlstra <peterz@infradead.org>
Wed, 15 Nov 2023 08:57:52 +0000 (09:57 +0100)
Testing showed that trying to avoid this round trip to schedule() brings
little value. Add a comment explaining why.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lkml.kernel.org/r/20231114193840.4041-3-frederic@kernel.org
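
The fetch_or idiom that the new comment refers to can be sketched in user-space C11. This is only an illustration under stated assumptions, not the kernel implementation: the DEMO_* flag bits, struct demo_thread_info and both helper names are hypothetical stand-ins, and the flags word is modeled with <stdatomic.h> rather than the kernel's own atomics. The first helper mirrors the shape of the existing check (set the need-resched bit and report whether the target was polling). The second shows the "0" value variant, which only reads the flags atomically and so would not force a trip through schedule().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag bits; the real kernel values are per-architecture. */
#define DEMO_NEED_RESCHED (1u << 0)
#define DEMO_POLLING      (1u << 1)

/* Hypothetical stand-in for the idle task's thread_info flags word. */
struct demo_thread_info {
	atomic_uint flags;
};

/*
 * Atomically set the need-resched bit and return true if the target
 * was NOT polling on its flags, i.e. the caller must send an IPI.
 */
static bool demo_set_nr_and_not_polling(struct demo_thread_info *ti)
{
	unsigned int old = atomic_fetch_or(&ti->flags, DEMO_NEED_RESCHED);

	return !(old & DEMO_POLLING);
}

/*
 * Variant with a "0" value: the flags are left unchanged, and the
 * atomic read-modify-write only tells the caller whether an IPI is
 * needed to kick the target out of the non-polling idle state.
 */
static bool demo_not_polling(struct demo_thread_info *ti)
{
	unsigned int old = atomic_fetch_or(&ti->flags, 0u);

	return !(old & DEMO_POLLING);
}
```

In this sketch a polling target gets no IPI in either variant. The difference is that the first variant leaves the need-resched bit set, which is what forces the exit from the idle loop described in the comment.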
kernel/sched/core.c

index f5f4495d1768d16303871e54690e0762c43eee64..2de77a6d5ef8fde0f66262164db0111618667af1 100644
@@ -1131,6 +1131,28 @@ static void wake_up_idle_cpu(int cpu)
        if (cpu == smp_processor_id())
                return;
 
+       /*
+        * Set TIF_NEED_RESCHED and send an IPI if in the non-polling
+        * part of the idle loop. This forces an exit from the idle loop
+        * and a round trip to schedule(). This could be optimized,
+        * because a simple new idle loop iteration is enough to
+        * re-evaluate the next tick, provided that some tick nohz
+        * functions are re-ordered to follow TIF_NR_POLLING
+        * clearing:
+        *
+        * - On most archs, a simple fetch_or on ti::flags with a
+        *   "0" value would be enough to know if an IPI needs to be sent.
+        *
+        * - x86 needs to perform a last need_resched() check between
+        *   monitor and mwait which doesn't take timers into account.
+        *   There, a dedicated TIF_TIMER flag would be required to
+        *   fetch_or here and be checked along with TIF_NEED_RESCHED
+        *   before mwait().
+        *
+        * However, remote timer enqueue is not such a frequent event
+        * and testing of the above solutions did not show much
+        * benefit.
+        */
        if (set_nr_and_not_polling(rq->idle))
                smp_send_reschedule(cpu);
        else