entry: Respect changes to system call number by trace_sys_enter()
author André Rösti <an.roesti@gmail.com>
Mon, 11 Mar 2024 21:17:04 +0000 (21:17 +0000)
committer Thomas Gleixner <tglx@linutronix.de>
Tue, 12 Mar 2024 12:23:32 +0000 (13:23 +0100)
When a probe is registered at the trace_sys_enter() tracepoint, and that
probe changes the system call number, the old system call still gets
executed.  This worked correctly until commit b6ec41346103 ("core/entry:
Report syscall correctly for trace and audit"), which removed the
re-evaluation of the syscall number after the trace point.

Restore the original semantics by re-evaluating the system call number
after trace_sys_enter().

The performance impact of this re-evaluation is minimal because it only
takes place when a trace point is active, and compared to the actual trace
point overhead, the read from a cache-hot variable is negligible.

Fixes: b6ec41346103 ("core/entry: Report syscall correctly for trace and audit")
Signed-off-by: André Rösti <an.roesti@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20240311211704.7262-1-an.roesti@gmail.com
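
The stale-value problem described above can be illustrated with a small
userspace sketch. This is not kernel code: struct fake_regs, fake_get_nr(),
fake_probe() and the two enter_*() helpers are hypothetical stand-ins for
struct pt_regs, syscall_get_nr(), a probe attached to trace_sys_enter(), and
the tracepoint branch of syscall_trace_enter(). The syscall numbers 1 and 2
are arbitrary.

```c
/* Hypothetical stand-in for struct pt_regs: only the syscall number slot. */
struct fake_regs {
	long nr;
};

/* Hypothetical stand-in for syscall_get_nr(current, regs). */
long fake_get_nr(const struct fake_regs *regs)
{
	return regs->nr;
}

/*
 * Hypothetical probe attached to the tracepoint. It rewrites the syscall
 * number in the registers: syscall 1 is redirected to syscall 2.
 */
void fake_probe(struct fake_regs *regs, long syscall)
{
	if (syscall == 1)
		regs->nr = 2;
}

/*
 * Behaviour without the re-evaluation: the local copy is read before the
 * probe runs and never refreshed, so the caller dispatches the old number.
 */
long enter_without_reread(struct fake_regs *regs)
{
	long syscall = fake_get_nr(regs);

	fake_probe(regs, syscall);
	return syscall;		/* stale if the probe changed regs->nr */
}

/*
 * Behaviour with the re-evaluation: the number is read again after the
 * probe, so a change made by the probe is respected.
 */
long enter_with_reread(struct fake_regs *regs)
{
	long syscall = fake_get_nr(regs);

	fake_probe(regs, syscall);
	syscall = fake_get_nr(regs);	/* pick up the probe's change */
	return syscall;
}
```

With regs.nr initialised to 1, enter_without_reread() returns 1 even though
the probe stored 2 in the registers, while enter_with_reread() returns 2.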
diff --git a/kernel/entry/common.c b/kernel/entry/common.c
index 88cb3c88aaa5c6c1ca271c1cc95dcd60cdd8ca9c..90843cc38588065ee5c52f8549d6a32c69bdf102 100644
--- a/kernel/entry/common.c
+++ b/kernel/entry/common.c
@@ -57,8 +57,14 @@ long syscall_trace_enter(struct pt_regs *regs, long syscall,
        /* Either of the above might have changed the syscall number */
        syscall = syscall_get_nr(current, regs);
 
-       if (unlikely(work & SYSCALL_WORK_SYSCALL_TRACEPOINT))
+       if (unlikely(work & SYSCALL_WORK_SYSCALL_TRACEPOINT)) {
                trace_sys_enter(regs, syscall);
+               /*
+                * Probes or BPF hooks in the tracepoint may have changed the
+                * system call number as well.
+                */
+               syscall = syscall_get_nr(current, regs);
+       }
 
        syscall_enter_audit(regs, syscall);