sched/cputime: Improve cputime_adjust()
author Oleg Nesterov <oleg@redhat.com>
Tue, 19 May 2020 17:25:06 +0000 (19:25 +0200)
committer Peter Zijlstra <peterz@infradead.org>
Mon, 15 Jun 2020 12:10:00 +0000 (14:10 +0200)
commit 3dc167ba5729ddd2d8e3fa1841653792c295d3f1
tree d3348dfe2edc313740bfd0b348d91d36726f9cc1
parent b3a9e3b9622ae10064826dccb4f7a52bd88c7407

People report that utime and stime from /proc/<pid>/stat become very
wrong when the numbers are big enough, especially if you watch these
counters incrementally.

Specifically, the current implementation of stime*rtime/total results
in a saw-tooth function on top of the desired line, where the teeth
grow in size as the values become larger. In other words, it has a
relative error.

The result is that, when watching incrementally as time progresses
(for large values), we'll see periods of pure stime or utime increase,
irrespective of the actual ratio we're striving for.
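
A rough userspace sketch of how a relative error of this kind can arise.
This is NOT the removed scale_stime(); lossy_scale_sketch() is a
hypothetical scaler that drops low bits from rtime and total until the
product fits in 64 bits, and exact_scale_sketch() is a reference that
assumes the GCC/Clang unsigned __int128 extension. Both names are made
up for illustration.

```c
#include <stdint.h>

/* Reference: full 128-bit intermediate, so no bits are lost.
 * Assumes total != 0 and that the quotient fits in 64 bits. */
static uint64_t exact_scale_sketch(uint64_t stime, uint64_t rtime,
				   uint64_t total)
{
	unsigned __int128 prod = (unsigned __int128)stime * rtime;

	return (uint64_t)(prod / total);
}

/* Hypothetical lossy scaler: halve rtime and total together until
 * stime * rtime cannot overflow 64 bits. Every halving discards a
 * low bit of rtime, so the discarded amount grows with the values. */
static uint64_t lossy_scale_sketch(uint64_t stime, uint64_t rtime,
				   uint64_t total)
{
	while (stime != 0 && rtime > UINT64_MAX / stime) {
		rtime >>= 1;
		total >>= 1;
	}

	/* Degenerate case for this sketch: nothing left to divide by. */
	if (total == 0)
		return 0;

	return stime * rtime / total;
}
```

With stime = 2^40, rtime = 2^41 + 1000 and total = 2^42, the lossy
variant shifts the extra 1000 out of rtime entirely, so small
increments of rtime leave the result flat until enough has accumulated
to survive the shift. Seen incrementally, that is a staircase on top of
the desired line.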

Replace scale_stime() with a math64.h helper, mul_u64_u64_div_u64(),
which is far more accurate. This also allows architectures to override
the implementation -- for instance they can opt for the old algorithm
if this new one turns out to be too expensive for them.
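
For illustration only, a minimal userspace sketch of what a helper with
these semantics computes, namely a * b / c without losing bits in the
intermediate product. It assumes the GCC/Clang unsigned __int128
extension and is not the code added to lib/math/div64.c or the x86
override; the _sketch names are hypothetical, and total = stime + utime
is an assumption about how the caller forms the divisor.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for mul_u64_u64_div_u64(): computes a * b / c
 * with a 128-bit intermediate. The caller must ensure c != 0 and that
 * the quotient fits in 64 bits; otherwise the result is truncated. */
static uint64_t mul_u64_u64_div_u64_sketch(uint64_t a, uint64_t b,
					   uint64_t c)
{
	unsigned __int128 prod;

	assert(c != 0);
	prod = (unsigned __int128)a * b;	/* 64x64 -> 128, no overflow */

	return (uint64_t)(prod / c);
}

/* Scale rtime by the ratio stime / (stime + utime).
 * Assumption: total is the sum of the two sampled counters. */
static uint64_t scaled_stime_sketch(uint64_t stime, uint64_t utime,
				    uint64_t rtime)
{
	return mul_u64_u64_div_u64_sketch(stime, rtime, stime + utime);
}
```

For example, scaled_stime_sketch(1, 3, 400) yields 100, a quarter of
rtime, matching the 1:3 stime:utime ratio.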

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200519172506.GA317395@hirez.programming.kicks-ass.net
arch/x86/include/asm/div64.h
include/linux/math64.h
kernel/sched/cputime.c
lib/math/div64.c