perf mem/c2c: Set PERF_SAMPLE_WEIGHT for LOAD_STORE events
author Ravi Bangoria <ravi.bangoria@amd.com>
Thu, 6 Oct 2022 15:39:42 +0000 (21:09 +0530)
committer Arnaldo Carvalho de Melo <acme@redhat.com>
Thu, 6 Oct 2022 19:29:32 +0000 (16:29 -0300)
Currently perf sets the PERF_SAMPLE_WEIGHT flag only for mem load events.
Set it for the combined load-store event as well, which enables recording
of load latency by default on architectures that do not support an
independent mem load event.

Also document the missing -W option in the perf-record man page.
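
The argument handling described above can be sketched in isolation as
follows. This is a simplified illustration, not the code in builtin-mem.c
or builtin-c2c.c: build_rec_argv(), OP_LOAD, OP_STORE and the
has_ldst_event parameter are hypothetical stand-ins for the perf
internals, and the sketch assumes that the separate load event already
gets -W, as the message states.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for perf's MEM_OPERATION_* bits. */
enum { OP_LOAD = 1 << 0, OP_STORE = 1 << 1 };

/*
 * Build a reduced "perf record" argument vector.
 *
 * has_ldst_event: true when the arch exposes one combined load-store
 *                 event instead of independent load and store events.
 * Returns the number of arguments written (excluding the NULL
 * terminator), or 0 if the buffer is too small.
 */
static size_t build_rec_argv(const char **rec_argv, size_t cap,
                             bool has_ldst_event, unsigned int operation)
{
        size_t i = 0;

        if (cap < 3)
                return 0;

        rec_argv[i++] = "record";

        if (has_ldst_event &&
            (operation & OP_LOAD) && (operation & OP_STORE)) {
                /* Combined event: request sample weight (load latency). */
                rec_argv[i++] = "-W";
        } else if (operation & OP_LOAD) {
                /* Independent load event: weight was already requested. */
                rec_argv[i++] = "-W";
        }

        rec_argv[i] = NULL;
        return i;
}
```

With the combined event and both operations selected, the resulting
vector contains -W, so the weight is recorded without the user having to
pass it explicitly. A store-only request leaves it out.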

Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ali Saidi <alisaidi@amazon.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: https://lore.kernel.org/r/20221006153946.7816-5-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
tools/perf/Documentation/perf-record.txt
tools/perf/builtin-c2c.c
tools/perf/builtin-mem.c

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 378f497f4be327f3b1fa89e6936bf9e08bebb083..e41ae950fdc3b682e87b1f6d233021b782a0bdf7 100644
@@ -411,6 +411,7 @@ is enabled for all the sampling events. The sampled branch type is the same for
 The various filters must be specified as a comma separated list: --branch-filter any_ret,u,k
 Note that this feature may not be available on all processors.
 
+-W::
 --weight::
 Enable weightened sampling. An additional weight is recorded per sample and can be
 displayed with the weight and local_weight sort keys.  This currently works for TSX
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index f35a47b2dbe4984c849083c761038e797a690aa0..a9190458d2d50015cbe997ed9cab63b8b51d984a 100644
@@ -3281,6 +3281,7 @@ static int perf_c2c__record(int argc, const char **argv)
                 */
                if (e->tag) {
                        e->record = true;
+                       rec_argv[i++] = "-W";
                } else {
                        e = perf_mem_events__ptr(PERF_MEM_EVENTS__LOAD);
                        e->record = true;
diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
index 9e435fd2350326a628f9311f85be3cb33871e5f2..f7dd8216de72e46dc4b8d46807f43759bd6e29eb 100644
@@ -122,6 +122,7 @@ static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
            (mem->operation & MEM_OPERATION_LOAD) &&
            (mem->operation & MEM_OPERATION_STORE)) {
                e->record = true;
+               rec_argv[i++] = "-W";
        } else {
                if (mem->operation & MEM_OPERATION_LOAD) {
                        e = perf_mem_events__ptr(PERF_MEM_EVENTS__LOAD);