perf tools: Ignore deleted cgroups
author Namhyung Kim <namhyung@kernel.org>
Thu, 9 May 2024 18:22:35 +0000 (11:22 -0700)
committer Arnaldo Carvalho de Melo <acme@redhat.com>
Fri, 10 May 2024 13:52:46 +0000 (10:52 -0300)
On large systems, cgroups can be created and deleted often.  That means
perf tools can race with cgroup removal: a cgroup may be deleted after
perf gets its name but before perf opens it.

I got a report that 'perf stat' with many cgroups failed quite often due
to the missing cgroups on such a large machine.

I think we can ignore such cgroups when expanding events and use id 0 if
reading the cgroup id fails.  IIUC 0 is not a valid cgroup id so it
won't update event counts for the failed cgroups.
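
The two behaviours described above can be sketched outside of perf as
follows.  This is only an illustration under stated assumptions: struct
cg_entry, open_existing(), read_entry_id() and assign_ids() are
hypothetical names, not perf code, and the inode number merely stands in
for a real cgroup id.

```c
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical stand-in for a cgroup entry; not perf's struct cgroup. */
struct cg_entry {
	const char *path;
	uint64_t id;
};

/*
 * Open every path that still exists and skip the ones that vanished
 * between listing and opening.  Returns the number of fds stored.
 */
static int open_existing(const char *const *paths, int n, int *fds)
{
	int opened = 0;

	for (int i = 0; i < n; i++) {
		int fd = open(paths[i], O_RDONLY);

		if (fd < 0) {
			/* the directory can go away in the meantime */
			fprintf(stderr, "skipping %s\n", paths[i]);
			continue;
		}
		fds[opened++] = fd;
	}
	return opened;
}

/* Assumption: the inode number is used purely as an illustrative id. */
static int read_entry_id(struct cg_entry *e)
{
	struct stat st;

	if (stat(e->path, &st) < 0)
		return -1;
	e->id = (uint64_t)st.st_ino;
	return 0;
}

/*
 * Assign ids to all entries.  An entry whose id cannot be read gets
 * id 0 instead of aborting the whole run.  Returns how many entries
 * fell back to 0.
 */
static int assign_ids(struct cg_entry *entries, int n)
{
	int missing = 0;

	for (int i = 0; i < n; i++) {
		if (read_entry_id(&entries[i]) < 0) {
			fprintf(stderr, "no id for %s, using 0\n",
				entries[i].path);
			entries[i].id = 0;
			missing++;
		}
	}
	return missing;
}
```

In both helpers a missing entry is reported and skipped rather than
treated as fatal, so one deleted directory does not fail the whole run.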

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240509182235.2319599-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
tools/perf/util/bpf_counter_cgroup.c
tools/perf/util/cgroup.c

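diff --git a/tools/perf/util/bpf_counter_cgroup.c b/tools/perf/util/bpf_counter_cgroup.c
index 1c82377ed78b99941df27d86afc8579ea5bfdf01..ea29c372f339705ddaa36c8b25d9ad5f0922088b 100644
--- a/tools/perf/util/bpf_counter_cgroup.c
+++ b/tools/perf/util/bpf_counter_cgroup.c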
@@ -136,9 +136,8 @@ static int bperf_load_program(struct evlist *evlist)
                cgrp = evsel->cgrp;
 
                if (read_cgroup_id(cgrp) < 0) {
-                       pr_err("Failed to get cgroup id\n");
-                       err = -1;
-                       goto out;
+                       pr_debug("Failed to get cgroup id for %s\n", cgrp->name);
+                       cgrp->id = 0;
                }
 
                map_fd = bpf_map__fd(skel->maps.cgrp_idx);
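diff --git a/tools/perf/util/cgroup.c b/tools/perf/util/cgroup.c
index fcb50905849969b380cea44432bda5fa76cf0d01..0f759dd96db710fd7ad37ad97fbaef1741e669ee 100644
--- a/tools/perf/util/cgroup.c
+++ b/tools/perf/util/cgroup.c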
@@ -465,9 +465,11 @@ int evlist__expand_cgroup(struct evlist *evlist, const char *str,
                name = cn->name + prefix_len;
                if (name[0] == '/' && name[1])
                        name++;
+
+               /* the cgroup can go away in the meantime */
                cgrp = cgroup__new(name, open_cgroup);
                if (cgrp == NULL)
-                       goto out_err;
+                       continue;
 
                leader = NULL;
                evlist__for_each_entry(orig_list, pos) {