fs/proc: do_task_stat: move thread_group_cputime_adjusted() outside of lock_task_sigh...
author Oleg Nesterov <oleg@redhat.com>
Tue, 23 Jan 2024 15:33:55 +0000 (16:33 +0100)
committer Andrew Morton <akpm@linux-foundation.org>
Thu, 8 Feb 2024 05:20:33 +0000 (21:20 -0800)
Patch series "fs/proc: do_task_stat: use sig->stats_".

do_task_stat() has the same problem that getrusage() had before "getrusage:
use sig->stats_lock rather than lock_task_sighand()": a hard lockup.  If
NR_CPUS threads call lock_task_sighand() at the same time and the process
has NR_THREADS threads, spin_lock_irq() will spin with irqs disabled for
O(NR_CPUS * NR_THREADS) time.
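
To make the bound above concrete, here is a minimal back-of-the-envelope
sketch in C.  It is not kernel code: the function name is hypothetical, and
it assumes a fair (FIFO) lock where every holder performs exactly one full
walk over the thread list before releasing it.

```c
#include <assert.h>

/*
 * Hypothetical helper, not a kernel API.
 *
 * Assumptions: nr_cpus callers contend for one lock at the same instant,
 * the lock is handed over in FIFO order, and each holder walks all
 * nr_threads threads while holding it.  The last caller in the queue then
 * spins (with irqs disabled, in the kernel case) while the nr_cpus - 1
 * callers ahead of it each complete one walk.
 */
static unsigned long long worst_case_wait_iterations(unsigned int nr_cpus,
                                                     unsigned int nr_threads)
{
        assert(nr_cpus > 0);
        return (unsigned long long)(nr_cpus - 1) * nr_threads;
}
```

Under these assumptions, 64 contending CPUs and a 10000-thread process give
630000 loop iterations of spinning for the unluckiest caller, which is the
O(NR_CPUS * NR_THREADS) growth described above.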

This patch (of 3):

thread_group_cputime() does its own locking, so we can safely move
thread_group_cputime_adjusted(), which does another for_each_thread loop,
outside of the ->siglock protected section.

Not only does this remove for_each_thread() from the critical section with
irqs disabled, it also removes another case where stats_lock is taken with
siglock held.  We want to remove this dependency so that we can then change
the users of stats_lock to not disable irqs.
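
The restructuring can be illustrated with a small user-space model.  This is
only a sketch under stated assumptions: struct mock_lock, struct mock_group
and every function below are hypothetical stand-ins, not kernel APIs, and
the "lock" merely counts how many loop iterations run while it is held.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for ->siglock: records work done while held. */
struct mock_lock {
        int held;               /* 1 while "locked" */
        long steps_held;        /* loop iterations executed while held */
};

/* Hypothetical stand-in for the thread group being reported on. */
struct mock_group {
        struct mock_lock siglock;
        const long *thread_utime;       /* per-thread user time */
        size_t nr_threads;
        long gtime;                     /* cheap field read under the lock */
};

static void mock_acquire(struct mock_lock *l) { assert(!l->held); l->held = 1; }
static void mock_release(struct mock_lock *l) { assert(l->held); l->held = 0; }

/* Models the per-thread walk done by thread_group_cputime_adjusted(). */
static long sum_thread_utime(struct mock_group *g)
{
        long total = 0;

        for (size_t i = 0; i < g->nr_threads; i++) {
                total += g->thread_utime[i];
                if (g->siglock.held)
                        g->siglock.steps_held++;
        }
        return total;
}

/* Old shape: the thread walk runs inside the locked section. */
static long stat_before(struct mock_group *g, long *gtime)
{
        long utime;

        mock_acquire(&g->siglock);
        utime = sum_thread_utime(g);
        *gtime = g->gtime;
        mock_release(&g->siglock);
        return utime;
}

/* New shape: only the cheap field read stays under the lock. */
static long stat_after(struct mock_group *g, long *gtime)
{
        mock_acquire(&g->siglock);
        *gtime = g->gtime;
        mock_release(&g->siglock);
        return sum_thread_utime(g);
}
```

Both variants return the same totals, but after stat_after() the
steps_held counter stays at zero, mirroring how the patch keeps the
for_each_thread walk out of the ->siglock critical section.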

Link: https://lkml.kernel.org/r/20240123153313.GA21832@redhat.com
Link: https://lkml.kernel.org/r/20240123153355.GA21854@redhat.com
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Dylan Hatch <dylanbhatch@google.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
fs/proc/array.c

index ff08a8957552add31a8fdf98e202f8380d519e50..45ba91863808435d84c676d6beaca95effd50cb7 100644
@@ -511,7 +511,7 @@ static int do_task_stat(struct seq_file *m, struct pid_namespace *ns,
 
        sigemptyset(&sigign);
        sigemptyset(&sigcatch);
-       cutime = cstime = utime = stime = 0;
+       cutime = cstime = 0;
        cgtime = gtime = 0;
 
        if (lock_task_sighand(task, &flags)) {
@@ -546,7 +546,6 @@ static int do_task_stat(struct seq_file *m, struct pid_namespace *ns,
 
                        min_flt += sig->min_flt;
                        maj_flt += sig->maj_flt;
-                       thread_group_cputime_adjusted(task, &utime, &stime);
                        gtime += sig->gtime;
 
                        if (sig->flags & (SIGNAL_GROUP_EXIT | SIGNAL_STOP_STOPPED))
@@ -562,10 +561,13 @@ static int do_task_stat(struct seq_file *m, struct pid_namespace *ns,
 
        if (permitted && (!whole || num_threads < 2))
                wchan = !task_is_running(task);
-       if (!whole) {
+
+       if (whole) {
+               thread_group_cputime_adjusted(task, &utime, &stime);
+       } else {
+               task_cputime_adjusted(task, &utime, &stime);
                min_flt = task->min_flt;
                maj_flt = task->maj_flt;
-               task_cputime_adjusted(task, &utime, &stime);
                gtime = task_gtime(task);
        }