sched,rcu,tracing: Avoid tracing before in_nmi() is correct
author Peter Zijlstra <peterz@infradead.org>
Wed, 12 Feb 2020 20:01:16 +0000 (21:01 +0100)
committer Thomas Gleixner <tglx@linutronix.de>
Tue, 19 May 2020 13:51:18 +0000 (15:51 +0200)
If a tracer is invoked before in_nmi() becomes true, the tracer cannot
detect that it is being called from NMI context and so cannot behave
correctly.

Therefore change nmi_{enter,exit}() to use __preempt_count_{add,sub}()
as the normal preempt_count_{add,sub}() have a (desired) function
trace entry.

This fixes a potential issue with the current code: when the function
tracer has stack tracing enabled, __trace_stack() will malfunction when
it hits the preempt_count_add() function entry from NMI context.
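
The ordering problem can be modelled in user space. The sketch below is
not kernel code: the model_*() names and the tracer callback are
hypothetical, and the bit layout only loosely follows
include/linux/preempt.h (assumed here: HARDIRQ_SHIFT 16, NMI_SHIFT 20,
NMI_BITS 4). It shows that a hook running at function entry of the
counting helper observes in_nmi() as false, while the raw variant gives
the hook no chance to run during the transition.

```c
#include <assert.h>

/* Illustrative bit layout; the values are assumptions for this model. */
#define HARDIRQ_SHIFT	16
#define HARDIRQ_OFFSET	(1U << HARDIRQ_SHIFT)
#define NMI_SHIFT	20
#define NMI_BITS	4
#define NMI_OFFSET	(1U << NMI_SHIFT)
#define NMI_MASK	(((1U << NMI_BITS) - 1) << NMI_SHIFT)

static unsigned int model_count;	/* stands in for preempt_count */
static int tracer_saw_nmi = -1;		/* -1: the tracer never ran */

static unsigned int model_in_nmi(void)
{
	return model_count & NMI_MASK;
}

/* Hypothetical tracer callback: records whether it saw NMI context. */
static void model_tracer(void)
{
	tracer_saw_nmi = model_in_nmi() != 0;
}

/*
 * Models preempt_count_add(): a real function, so the function tracer
 * hooks its entry before the count has been changed.
 */
static void model_traced_add(unsigned int val)
{
	model_tracer();
	model_count += val;
}

/* Models __preempt_count_add(): plain arithmetic, no trace hook. */
static void model_raw_add(unsigned int val)
{
	model_count += val;
}

/*
 * Models the counting step of nmi_enter() starting from a zero count.
 * Returns what the tracer observed: 0 (not in NMI), 1 (in NMI), or
 * -1 if the tracer was never invoked.
 */
static int model_nmi_enter(int use_traced_add)
{
	model_count = 0;
	tracer_saw_nmi = -1;
	assert(model_in_nmi() != NMI_MASK);	/* models the BUG_ON() */
	if (use_traced_add)
		model_traced_add(NMI_OFFSET + HARDIRQ_OFFSET);
	else
		model_raw_add(NMI_OFFSET + HARDIRQ_OFFSET);
	return tracer_saw_nmi;
}
```

In this model, model_nmi_enter(1) returns 0: the tracer ran while the
count still said "not in NMI". model_nmi_enter(0) returns -1, because
the raw add gives the tracer no entry point, and model_in_nmi() is
non-zero once it returns.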

Suggested-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Link: https://lkml.kernel.org/r/20200505134101.434193525@linutronix.de
diff --git a/include/linux/hardirq.h b/include/linux/hardirq.h
index a043ad826c6779973f1390d45c93dfd1c0c3ada2..621556efe45f92846dd5b4d34de26d0190a57505 100644
--- a/include/linux/hardirq.h
+++ b/include/linux/hardirq.h
@@ -65,6 +65,15 @@ extern void irq_exit(void);
 #define arch_nmi_exit()                do { } while (0)
 #endif
 
+/*
+ * NMI vs Tracing
+ * --------------
+ *
+ * We must not land in a tracer until (or after) we've changed preempt_count
+ * such that in_nmi() becomes true. To that effect all NMI C entry points must
+ * be marked 'notrace' and call nmi_enter() as soon as possible.
+ */
+
 /*
  * nmi_enter() can nest up to 15 times; see NMI_BITS.
  */
@@ -75,7 +84,7 @@ extern void irq_exit(void);
                lockdep_off();                                  \
                ftrace_nmi_enter();                             \
                BUG_ON(in_nmi() == NMI_MASK);                   \
-               preempt_count_add(NMI_OFFSET + HARDIRQ_OFFSET); \
+               __preempt_count_add(NMI_OFFSET + HARDIRQ_OFFSET);       \
                rcu_nmi_enter();                                \
                lockdep_hardirq_enter();                        \
        } while (0)
@@ -85,7 +94,7 @@ extern void irq_exit(void);
                lockdep_hardirq_exit();                         \
                rcu_nmi_exit();                                 \
                BUG_ON(!in_nmi());                              \
-               preempt_count_sub(NMI_OFFSET + HARDIRQ_OFFSET); \
+               __preempt_count_sub(NMI_OFFSET + HARDIRQ_OFFSET);       \
                ftrace_nmi_exit();                              \
                lockdep_on();                                   \
                printk_nmi_exit();                              \