Paul E. McKenney [Thu, 24 Feb 2022 17:38:46 +0000 (09:38 -0800)]
Merge branches 'exp.2022.02.24a', 'fixes.2022.02.14a', 'rcu_barrier.2022.02.08a', 'rcu-tasks.2022.02.08a', 'rt.2022.02.01b', 'torture.2022.02.01b' and 'torturescript.2022.02.08a' into HEAD
exp.2022.02.24a: Expedited grace-period updates.
fixes.2022.02.14a: Miscellaneous fixes.
rcu_barrier.2022.02.08a: Make rcu_barrier() no longer exclude CPU hotplug.
rcu-tasks.2022.02.08a: RCU-tasks updates.
rt.2022.02.01b: Real-time-related updates.
torture.2022.02.01b: Torture-test updates.
torturescript.2022.02.08a: Torture-test scripting updates.
Yury Norov [Sun, 23 Jan 2022 18:38:53 +0000 (10:38 -0800)]
rcu: Replace cpumask_weight with cpumask_empty where appropriate
In some places, RCU code calls cpumask_weight() to check if any bit of a
given cpumask is set. We can do it more efficiently with cpumask_empty()
because cpumask_empty() stops traversing the cpumask as soon as it finds
first set bit, while cpumask_weight() counts all bits unconditionally.
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Ingo Molnar [Sat, 15 Jan 2022 00:16:55 +0000 (16:16 -0800)]
rcu: Remove __read_mostly annotations from rcu_scheduler_active externs
Remove the __read_mostly attributes from the rcu_scheduler_active
extern declarations, because these attributes are ignored for
prototypes and we'd have to include the full <linux/cache.h> header
to gain this functionally pointless attribute defined.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Ingo Molnar [Sat, 15 Jan 2022 00:07:28 +0000 (16:07 -0800)]
rcu: Uninline multi-use function: finish_rcuwait()
This is a rarely used function, so uninlining its 3 instructions
is probably a win or a wash - but the main motivation is to
make <linux/rcuwait.h> independent of task_struct details.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Tue, 4 Jan 2022 18:34:34 +0000 (10:34 -0800)]
rcu: Mark writes to the rcu_segcblist structure's ->flags field
KCSAN reports data races between the rcu_segcblist_clear_flags() and
rcu_segcblist_set_flags() functions, though misreporting the latter
as a call to rcu_segcblist_is_enabled() from call_rcu(). This commit
converts the updates of this field to WRITE_ONCE(), relying on the
resulting unmarked reads to continue to detect buggy concurrent writes
to this field.
Reported-by: Zhouyi Zhou <zhouzhouyi@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Zqiang [Sun, 26 Dec 2021 00:52:04 +0000 (08:52 +0800)]
kasan: Record work creation stack trace with interrupts enabled
Recording the work creation stack trace for KASAN reports in
call_rcu() is expensive, due to unwinding the stack, but also
due to acquiring depot_lock inside stackdepot (which may be contended).
Because calling kasan_record_aux_stack_noalloc() does not require
interrupts to already be disabled, this may unnecessarily extend
the time with interrupts disabled.
Therefore, move calling kasan_record_aux_stack() before the section
with interrupts disabled.
Acked-by: Marco Elver <elver@google.com>
Signed-off-by: Zqiang <qiang1.zhang@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Sat, 18 Dec 2021 17:30:33 +0000 (09:30 -0800)]
rcu: Inline __call_rcu() into call_rcu()
Because __call_rcu() is invoked only by call_rcu(), this commit inlines
the former into the latter.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
David Woodhouse [Wed, 8 Dec 2021 23:41:53 +0000 (23:41 +0000)]
rcu: Add mutex for rcu boost kthread spawning and affinity setting
As we handle parallel CPU bringup, we will need to take care to avoid
spawning multiple boost threads, or race conditions when setting their
affinity. Spotted by Paul McKenney.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Uladzislau Rezki (Sony) [Wed, 1 Dec 2021 09:20:53 +0000 (10:20 +0100)]
rcu: Fix description of kvfree_rcu()
The kvfree_rcu() header comment's description of the "ptr" parameter
is unclear, therefore rephrase it to make it clear that it is a pointer
to the memory to eventually be passed to kvfree().
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Wed, 1 Dec 2021 14:27:59 +0000 (06:27 -0800)]
MAINTAINERS: Add Frederic and Neeraj to their RCU files
Adding Frederic as an RCU maintainer for kernel/rcu/tree_nocb.h given his
work with offloading and de-offloading callbacks from CPUs. Also adding
Neeraj for kernel/rcu/tasks.h given his focused work on RCU Tasks Trace.
As in I am reasonably certain that each understands the full contents
of the corresponding file.
Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Paul E. McKenney [Tue, 1 Feb 2022 16:23:46 +0000 (08:23 -0800)]
rcutorture: Provide non-power-of-two Tasks RCU scenarios
This commit adjusts RUDE01 to 3 CPUs and TRACE01 to 5 CPUs in order to
test Tasks RCU's ability to handle non-power-of-two numbers of CPUs.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Mon, 31 Jan 2022 23:03:36 +0000 (15:03 -0800)]
rcutorture: Test SRCU size transitions
Thie commit adds kernel boot parameters to the SRCU-N and SRCU-P
rcutorture scenarios to cause SRCU-N to test contention-based resizing
and SRCU-P to test init_srcu_struct()-time resizing. Note that this
also tests never-resizing because the contention-based resizing normally
takes some minutes to make the shift.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Thu, 27 Jan 2022 17:39:15 +0000 (09:39 -0800)]
torture: Make torture.sh help message match reality
This commit fixes a couple of typos: s/--doall/--do-all/ and
s/--doallmodconfig/--do-allmodconfig/.
[ paulmck: Add Fixes: supplied by Paul Menzel. ]
Fixes: a115a775a8d5 ("torture: Add "make allmodconfig" to torture.sh")
Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Thu, 3 Feb 2022 00:34:40 +0000 (16:34 -0800)]
rcu-tasks: Set ->percpu_enqueue_shift to zero upon contention
Currently, call_rcu_tasks_generic() sets ->percpu_enqueue_shift to
order_base_2(nr_cpu_ids) upon encountering sufficient contention.
This does not shift to use of non-CPU-0 callback queues as intended, but
rather continues using only CPU 0's queue. Although this does provide
some decrease in contention due to spreading work over multiple locks,
it is not the dramatic decrease that was intended.
This commit therefore makes call_rcu_tasks_generic() set
->percpu_enqueue_shift to 0.
Reported-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Wed, 2 Feb 2022 23:42:36 +0000 (15:42 -0800)]
rcu-tasks: Use order_base_2() instead of ilog2()
The ilog2() function can be used to generate a shift count, but it will
generate the same count for a power of two as for one greater than a power
of two. This results in shift counts that are larger than necessary for
systems with a power-of-two number of CPUs because the CPUs are numbered
from zero, so that the maximum CPU number is one less than that power
of two.
This commit therefore substitutes order_base_2(), which appears to have
been designed for exactly this use case.
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Fri, 10 Dec 2021 21:44:17 +0000 (13:44 -0800)]
rcu: Create and use an rcu_rdp_cpu_online()
The pattern "rdp->grpmask & rcu_rnp_online_cpus(rnp)" occurs frequently
in RCU code in order to determine whether rdp->cpu is online from an
RCU perspective. This commit therefore creates an rcu_rdp_cpu_online()
function to replace it.
[ paulmck: Apply kernel test robot unused-variable feedback. ]
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Tue, 14 Dec 2021 21:35:17 +0000 (13:35 -0800)]
rcu: Make rcu_barrier() no longer block CPU-hotplug operations
This commit removes the cpus_read_lock() and cpus_read_unlock() calls
from rcu_barrier(), thus allowing CPUs to come and go during the course
of rcu_barrier() execution. Posting of the ->barrier_head callbacks does
synchronize with portions of RCU's CPU-hotplug notifiers, but these locks
are held for short time periods on both sides. Thus, full CPU-hotplug
operations could both start and finish during the execution of a given
rcu_barrier() invocation.
Additional synchronization is provided by a global ->barrier_lock.
Since the ->barrier_lock is only used during rcu_barrier() execution and
during onlining/offlining a CPU, the contention for this lock should
be low. It might be tempting to make use of a per-CPU lock just on
general principles, but straightforward attempts to do this have the
problems shown below.
Initial state: 3 CPUs present, CPU 0 and CPU1 do not have
any callback and CPU2 has callbacks.
1. CPU0 calls rcu_barrier().
2. CPU1 starts offlining for CPU2. CPU1 calls
rcutree_migrate_callbacks(). rcu_barrier_entrain() is called
from rcutree_migrate_callbacks(), with CPU2's rdp->barrier_lock.
It does not entrain ->barrier_head for CPU2, as rcu_barrier()
on CPU0 hasn't started the barrier sequence (by calling
rcu_seq_start(&rcu_state.barrier_sequence)) yet.
3. CPU0 starts new barrier sequence. It iterates over
CPU0 and CPU1, after acquiring their per-cpu ->barrier_lock
and finds 0 segcblist length. It updates ->barrier_seq_snap
for CPU0 and CPU1 and continues loop iteration to CPU2.
for_each_possible_cpu(cpu) {
raw_spin_lock_irqsave(&rdp->barrier_lock, flags);
if (!rcu_segcblist_n_cbs(&rdp->cblist)) {
WRITE_ONCE(rdp->barrier_seq_snap, gseq);
raw_spin_unlock_irqrestore(&rdp->barrier_lock, flags);
rcu_barrier_trace(TPS("NQ"), cpu, rcu_state.barrier_sequence);
continue;
}
4. rcutree_migrate_callbacks() completes execution on CPU1.
Segcblist len for CPU2 becomes 0.
5. The loop iteration on CPU0, checks rcu_segcblist_n_cbs(&rdp->cblist)
for CPU2 and completes the loop iteration after setting
->barrier_seq_snap.
6. As there isn't any ->barrier_head callback entrained; at
this point, rcu_barrier() in CPU0 returns.
7. The callbacks, which migrated from CPU2 to CPU1, execute.
Straightforward per-CPU locking is also subject to the following race
condition noted by Boqun Feng:
1. CPU0 calls rcu_barrier(), starting a new barrier sequence by invoking
rcu_seq_start() and init_completion(), but does not yet initialize
rcu_state.barrier_cpu_count.
2. CPU1 starts offlining for CPU2, calling rcutree_migrate_callbacks(),
which in turn calls rcu_barrier_entrain() holding CPU2's.
rdp->barrier_lock. It then entrains ->barrier_head for CPU2
and atomically increments rcu_state.barrier_cpu_count, which is
unfortunately not yet initialized to the value 2.
3. The just-entrained RCU callback is invoked. It atomically
decrements rcu_state.barrier_cpu_count and sees that it is
now zero. This callback therefore invokes complete().
4. CPU0 continues executing rcu_barrier(), but is not blocked
by its call to wait_for_completion(). This results in rcu_barrier()
returning before all pre-existing callbacks have been invoked,
which is a bug.
Therefore, synchronization is provided by rcu_state.barrier_lock,
which is also held across the initialization sequence, especially the
rcu_seq_start() and the atomic_set() that sets rcu_state.barrier_cpu_count
to the value 2. In addition, this lock is held when entraining the
rcu_barrier() callback, when deciding whether or not a CPU has callbacks
that rcu_barrier() must wait on, when setting the ->qsmaskinitnext for
incoming CPUs, and when migrating callbacks from a CPU that is going
offline.
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Co-developed-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Signed-off-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Tue, 14 Dec 2021 21:15:18 +0000 (13:15 -0800)]
rcu: Rework rcu_barrier() and callback-migration logic
This commit reworks rcu_barrier() and callback-migration logic to
permit allowing rcu_barrier() to run concurrently with CPU-hotplug
operations. The key trick is for callback migration to check to see if
an rcu_barrier() is in flight, and, if so, enqueue the ->barrier_head
callback on its behalf.
This commit adds synchronization with RCU's CPU-hotplug notifiers. Taken
together, this will permit a later commit to remove the cpus_read_lock()
and cpus_read_unlock() calls from rcu_barrier().
[ paulmck: Updated per kbuild test robot feedback. ]
[ paulmck: Updated per reviews session with Neeraj, Frederic, Uladzislau, and Boqun. ]
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Sat, 11 Dec 2021 00:25:20 +0000 (16:25 -0800)]
rcu: Refactor rcu_barrier() empty-list handling
This commit saves a few lines by checking first for an empty callback
list. If the callback list is empty, then that CPU is taken care of,
regardless of its online or nocb state. Also simplify tracing accordingly
and fold a few lines together.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
David Woodhouse [Tue, 16 Feb 2021 15:04:34 +0000 (15:04 +0000)]
rcu: Kill rnp->ofl_seq and use only rcu_state.ofl_lock for exclusion
If we allow architectures to bring APs online in parallel, then we end
up requiring rcu_cpu_starting() to be reentrant. But currently, the
manipulation of rnp->ofl_seq is not thread-safe.
However, rnp->ofl_seq is also fairly much pointless anyway since both
rcu_cpu_starting() and rcu_report_dead() hold rcu_state.ofl_lock for
fairly much the whole time that rnp->ofl_seq is set to an odd number
to indicate that an operation is in progress.
So drop rnp->ofl_seq completely, and use only rcu_state.ofl_lock.
This has a couple of minor complexities: lockdep will complain when we
take rcu_state.ofl_lock, and currently accepts the 'excuse' of having
an odd value in rnp->ofl_seq. So switch it to an arch_spinlock_t to
avoid that false positive complaint. Since we're killing rnp->ofl_seq
of course that 'excuse' has to be changed too, so make it check for
arch_spin_is_locked(rcu_state.ofl_lock).
There's no arch_spin_lock_irqsave() so we have to manually save and
restore local interrupts around the locking.
At Paul's request based on Neeraj's analysis, make rcu_gp_init not just
wait but *exclude* any CPU online/offline activity, which was fairly
much true already by virtue of it holding rcu_state.ofl_lock.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Wed, 26 Jan 2022 05:08:55 +0000 (21:08 -0800)]
torture: Change KVM environment variable to RCUTORTURE
The torture-test scripting's long-standing use of KVM as the environment
variable tracking the pathname of the rcutorture directory now conflicts
with allmodconfig builds due to the virt/kvm/Makefile.kvm file's use
of this as a makefile variable. This commit therefore changes the
torture-test scripting from KVM to RCUTORTURE, avoiding the name conflict.
Reported-by: Zhouyi Zhou <zhouzhouyi@gmail.com>
Tested-by: Zhouyi Zhou <zhouzhouyi@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Tue, 18 Jan 2022 23:40:49 +0000 (15:40 -0800)]
torture: Make kvm-find-errors.sh notice missing vmlinux file
Currently, an obtuse compiler diagnostic can fool kvm-find-errors.sh
into believing that the build was successful. This commit therefore
adds a check for a missing vmlinux file. Note that in the case of
repeated torture-test scenarios ("--configs '2*TREE01'"), the vmlinux
file will only be present in the first directory, that is, in TREE01
but not TREE01.2.
Link: https://lore.kernel.org/lkml/36bd91e4-8eda-5677-7fde-40295932a640@molgen.mpg.de/
Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Tue, 28 Dec 2021 05:21:35 +0000 (21:21 -0800)]
torture: Print only one summary line per run
The torture.sh scripts currently duplicates the summary lines, getting
one during the run phase and one during the summary phase of each run.
This commit therefore removes the run phase from consideration so as to
get only one summary line per run.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Tue, 21 Dec 2021 04:24:25 +0000 (20:24 -0800)]
torture: Make kvm-remote.sh try multiple times to download tarball
This commit ups the retries for downloading the build-product tarball
to a given remote system from once to five times, the better to handle
transient network failures.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Sat, 18 Dec 2021 00:14:31 +0000 (16:14 -0800)]
torture: Compress KCSAN as well as KASAN vmlinux files
Compressing KASAN vmlinux files reduces torture.sh res file size from
about 100G to about 50G, which is good, but the KCSAN vmlinux files
are also large. Compressing them reduces their size from about 700M to
about 100M (but of course your mileage may vary). This commit therefore
compresses both KASAN and KCSAN vmlinux files.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Mon, 6 Dec 2021 17:13:37 +0000 (09:13 -0800)]
torture: Indicate which torture.sh runs' bugs are all KCSAN reports
This commit further improves torture.sh run summaries by indicating
which runs' "Bugs:" counts are all KCSAN reports, and further printing
an additional end-of-run summary line when all errors reported in all
runs were KCSAN reports.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Sun, 5 Dec 2021 05:00:55 +0000 (21:00 -0800)]
torture: Make kvm.sh summaries note runs having only KCSAN reports
Runs having only KCSAN reports will normally print a summary line
containing only a "Bugs:" entry. However, these bugs might or might
not be KCSAN reports. This commit therefore flags runs in which all the
"Bugs:" entries are KCSAN reports.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Sat, 4 Dec 2021 21:53:24 +0000 (13:53 -0800)]
torture: Output per-failed-run summary lines from torture.sh
Currently, torture.sh lists the failed runs, but it is up to the user
to work out what failed. This is especially annoying for KCSAN runs,
where RCU's tighter definitions result in failures being reported for
other parts of the kernel. This commit therefore outputs "Summary:"
lines for each failed run, allowing the user to more quickly identify
which failed runs need focused attention.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Thu, 2 Dec 2021 19:24:05 +0000 (11:24 -0800)]
torture: Allow four-digit repetition numbers for --configs parameter
In a clear-cut case of "not thinking big enough", kvm.sh limits the
multipliers for torture-test scenarios to three digits. Although this is
large enough for any single system that I have ever run rcutorture on,
it does become a problem when you want to use kvm-remote.sh to run as
many instances of TREE09 as fit on a set of 20 systems with 80 CPUs each.
Yes, one could simply say "--configs '800*TREE09 800*TREE09'", but this
commit removes the need for that sort of hacky workaround by permitting
four-digit repetition numbers, thus allowing "--configs '1600*TREE09'".
Five-digit repetition numbers remain off the menu. Should they ever
really be needed, they can easily be added!
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Thu, 2 Dec 2021 03:19:13 +0000 (19:19 -0800)]
torture: Drop trailing ^M from console output
Console logs can sometimes have trailing control-M characters, which the
forward-progress evaluation code in kvm-recheck-rcu.sh passes through
to the user output. Which does not cause a technical problem, but which
can look ugly. This commit therefore strips the control-M characters.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Fri, 28 Jan 2022 04:29:10 +0000 (20:29 -0800)]
rcutorture: Enable limited callback-flooding tests of SRCU
This commit allows up to 50,000 callbacks worth of callback-flooding
tests of SRCU. The goal of this change is to exercise Tree SRCU's
ability to transition from SRCU_SIZE_SMALL to SRCU_SIZE_BIG triggered
by callback-queue-time lock contention.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Mon, 3 Jan 2022 14:07:09 +0000 (06:07 -0800)]
torture: Wake up kthreads after storing task_struct pointer
Currently, _torture_create_kthread() uses kthread_run() to create
torture-test kthreads, which means that the resulting task_struct
pointer is stored after the newly created kthread has been marked
runnable. This in turn can cause spurious failure of checks for
code being run by a particular kthread. This commit therefore changes
_torture_create_kthread() to use kthread_create(), then to do an explicit
wake_up_process() after the task_struct pointer has been stored.
Reported-by: Frederic Weisbecker <frederic@kernel.org>
Reviewed-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Tue, 28 Dec 2021 23:59:38 +0000 (15:59 -0800)]
rcutorture: Fix rcu_fwd_mutex deadlock
The rcu_torture_fwd_cb_hist() function acquires rcu_fwd_mutex, but is
invoked from rcutorture_oom_notify() function, which hold this same
mutex across this call. This commit fixes the resulting deadlock.
Reported-by: kernel test robot <oliver.sang@intel.com>
Tested-by: Oliver Sang <oliver.sang@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Fri, 17 Dec 2021 23:05:05 +0000 (15:05 -0800)]
rcutorture: Add end-of-test check to rcu_torture_fwd_prog() loop
The second and subsequent forward-progress kthreads loop waiting for
the first forward-progress kthread to start the next test interval.
Unfortunately, if the test ends while one of those kthreads is waiting,
the test will hang. This hang occurs because that wait loop fails to
check for the end of the test. This commit therefore adds an end-of-test
check to that wait loop.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Fri, 17 Dec 2021 20:33:53 +0000 (12:33 -0800)]
rcutorture: Make rcu_fwd_cb_nodelay be a counter
Back when only one rcutorture kthread could do forward-progress testing,
it was just fine for rcu_fwd_cb_nodelay to be a non-atomic bool. It was
set at the start of forward-progress testing and cleared at the end.
But now that there are multiple threads, the value can be cleared while
one of the threads is still doing forward-progress testing. This commit
therefore makes rcu_fwd_cb_nodelay be an atomic counter, replacing the
WRITE_ONCE() operations with atomic_inc() and atomic_dec().
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Thu, 16 Dec 2021 23:36:02 +0000 (15:36 -0800)]
rcutorture: Increase visibility of forward-progress hangs
This commit adds a few pr_alert() calls to rcutorture's forward-progress
testing in order to better diagnose shutdown-time hangs.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Thu, 16 Dec 2021 20:23:31 +0000 (12:23 -0800)]
torture: Distinguish kthread stopping and being asked to stop
Right now, if a given kthread (call it "kthread") realizes that it needs
to stop, "Stopping kthread" is written to the console. When the cleanup
code decides that it is time to stop that kthread, "Stopping kthread
tasks" is written to the console. These two events might happen in
either order, especially in the case of time-based torture-test shutdown.
But it is hard to distinguish these, especially for those unfamiliar with
the torture tests. This commit therefore changes the first case from
"Stopping kthread" to "kthread is stopping" to make things more clear.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Mon, 6 Dec 2021 23:12:14 +0000 (15:12 -0800)]
rcutorture: Print message before invoking ->cb_barrier()
The various ->cb_barrier() functions, for example, rcu_barrier(),
sometimes cause rcutorture hangs. But currently, the last console
message is the unenlightening "Stopping rcu_torture_stats". This commit
therefore prints a message of the form "rcu_torture_cleanup: Invoking
rcu_barrier+0x0/0x1e0()" to help point people in the right direction.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Zqiang [Tue, 25 Jan 2022 02:47:44 +0000 (10:47 +0800)]
rcu: Add per-CPU rcuc task dumps to RCU CPU stall warnings
When the rcutree.use_softirq kernel boot parameter is set to zero, all
RCU_SOFTIRQ processing is carried out by the per-CPU rcuc kthreads.
If these kthreads are being starved, quiescent states will not be
reported, which in turn means that the grace period will not end, which
can in turn trigger RCU CPU stall warnings. This commit therefore dumps
stack traces of stalled CPUs' rcuc kthreads, which can help identify
what is preventing those kthreads from running.
Suggested-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Reviewed-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Signed-off-by: Zqiang <qiang1.zhang@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Fri, 21 Jan 2022 20:40:08 +0000 (12:40 -0800)]
rcu: Don't deboost before reporting expedited quiescent state
Currently rcu_preempt_deferred_qs_irqrestore() releases rnp->boost_mtx
before reporting the expedited quiescent state. Under heavy real-time
load, this can result in this function being preempted before the
quiescent state is reported, which can in turn prevent the expedited grace
period from completing. Tim Murray reports that the resulting expedited
grace periods can take hundreds of milliseconds and even more than one
second, when they should normally complete in less than a millisecond.
This was fine given that there were no particular response-time
constraints for synchronize_rcu_expedited(), as it was designed
for throughput rather than latency. However, some users now need
sub-100-millisecond response-time constratints.
This patch therefore follows Neeraj's suggestion (seconded by Tim and
by Uladzislau Rezki) of simply reversing the two operations.
Reported-by: Tim Murray <timmurray@google.com>
Reported-by: Joel Fernandes <joelaf@google.com>
Reported-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Reviewed-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Tested-by: Tim Murray <timmurray@google.com>
Cc: Todd Kjos <tkjos@google.com>
Cc: Sandeep Patil <sspatil@google.com>
Cc: <stable@vger.kernel.org> # 5.4.x
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Alison Chaiken [Tue, 11 Jan 2022 23:32:53 +0000 (15:32 -0800)]
rcu: Update documentation regarding kthread_prio cmdline parameter
Inform readers that the priority of RCU no-callback threads will also
be boosted.
Signed-off-by: Alison Chaiken <achaiken@aurora.tech>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Alison Chaiken [Tue, 11 Jan 2022 23:32:52 +0000 (15:32 -0800)]
rcu: Elevate priority of offloaded callback threads
When CONFIG_PREEMPT_RT=y, the rcutree.kthread_prio command-line
parameter signals initialization code to boost the priority of rcuc
callbacks to the designated value. With the additional
CONFIG_RCU_NOCB_CPU=y configuration and an additional rcu_nocbs
command-line parameter, the callbacks on the listed cores are
offloaded to new rcuop kthreads that are not pinned to the cores whose
post-grace-period work is performed. While the rcuop kthreads perform
the same function as the rcuc kthreads they offload, the kthread_prio
parameter only boosts the priority of the rcuc kthreads. Fix this
inconsistency by elevating rcuop kthreads to the same priority as the rcuc
kthreads.
Signed-off-by: Alison Chaiken <achaiken@aurora.tech>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Alison Chaiken [Tue, 11 Jan 2022 23:32:51 +0000 (15:32 -0800)]
rcu: Make priority of grace-period thread consistent
The priority of RCU grace period threads is set to kthread_prio when
they are launched from rcu_spawn_gp_kthread(). The same is not true
of rcu_spawn_one_nocb_kthread(). Accordingly, add priority elevation
to rcu_spawn_one_nocb_kthread().
Signed-off-by: Alison Chaiken <achaiken@aurora.tech>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Alison Chaiken [Tue, 11 Jan 2022 23:32:50 +0000 (15:32 -0800)]
rcu: Move kthread_prio bounds-check to a separate function
Move the bounds-check of the kthread_prio cmdline parameter to a new
function in order to faciliate a different callsite.
Signed-off-by: Alison Chaiken <achaiken@aurora.tech>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Zqiang [Tue, 28 Dec 2021 16:05:10 +0000 (00:05 +0800)]
rcu: Create per-cpu rcuc kthreads only when rcutree.use_softirq=0
The per-CPU "rcuc" kthreads are used only by kernels booted with
rcutree.use_softirq=0, but they are nevertheless unconditionally created
by kernels built with CONFIG_RCU_BOOST=y. This results in "rcuc"
kthreads being created that are never actually used. This commit
therefore refrains from creating these kthreads unless the kernel
is actually booted with rcutree.use_softirq=0.
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Zqiang <qiang1.zhang@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Neeraj Upadhyay [Mon, 13 Dec 2021 07:02:09 +0000 (12:32 +0530)]
rcu: Remove unused rcu_state.boost
Signed-off-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Neeraj Upadhyay [Sat, 11 Dec 2021 17:01:39 +0000 (22:31 +0530)]
rcu/nocb: Handle concurrent nocb kthreads creation
When multiple CPUs in the same nocb gp/cb group concurrently
come online, they might try to concurrently create the same
rcuog kthread. Fix this by using nocb gp CPU's spawn mutex to
provide mutual exclusion for the rcuog kthread creation code.
[ paulmck: Whitespace fixes per kernel test robot feedback. ]
Acked-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Mon, 13 Dec 2021 19:05:07 +0000 (11:05 -0800)]
rcu: Mark accesses to boost_starttime
The boost_starttime shared variable has conflicting unmarked C-language
accesses, which are dangerous at best. This commit therefore adds
appropriate marking. This was found by KCSAN.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Tue, 14 Dec 2021 05:00:02 +0000 (21:00 -0800)]
rcu: Mark ->expmask access in synchronize_rcu_expedited_wait()
This commit adds a READ_ONCE() to an access to the rcu_node structure's
->expmask field to prevent compiler mischief. Detected by KCSAN.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Neeraj Upadhyay [Mon, 13 Dec 2021 06:10:24 +0000 (11:40 +0530)]
rcu/exp: Fix check for idle context in rcu_exp_handler
For PREEMPT_RCU, the rcu_exp_handler() function checks
whether the current CPU is in idle, by calling
rcu_dynticks_curr_cpu_in_eqs(). However, rcu_exp_handler()
is called in IPI handler context. So, it should be checking
the idle context using rcu_is_cpu_rrupt_from_idle(). Fix this
by using rcu_is_cpu_rrupt_from_idle() instead of
rcu_dynticks_curr_cpu_in_eqs(). Non-preempt configuration
already uses the correct check.
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Wed, 26 Jan 2022 18:42:58 +0000 (10:42 -0800)]
rcu-tasks: Fix computation of CPU-to-list shift counts
The ->percpu_enqueue_shift field is used to map from the running CPU
number to the index of the corresponding callback list. This mapping
can change at runtime in response to varying callback load, resulting
in varying levels of contention on the callback-list locks.
Unfortunately, the initial value of this field is correct only if the
system happens to have a power-of-two number of CPUs, otherwise the
callbacks from the high-numbered CPUs can be placed into the callback list
indexed by 1 (rather than 0), and those index-1 callbacks will be ignored.
This can result in soft lockups and hangs.
This commit therefore corrects this mapping, adding one to this shift
count as needed for systems having odd numbers of CPUs.
Fixes: 7a30871b6a27 ("rcu-tasks: Introduce ->percpu_enqueue_shift for dynamic queue selection")
Reported-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Reported-by: Martin Lau <kafai@fb.com>
Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Linus Torvalds [Sun, 23 Jan 2022 08:12:53 +0000 (10:12 +0200)]
Linux 5.17-rc1
Linus Torvalds [Sun, 23 Jan 2022 06:14:21 +0000 (08:14 +0200)]
Merge tag 'perf-tools-for-v5.17-2022-01-22' of git://git./linux/kernel/git/acme/linux
Pull more perf tools updates from Arnaldo Carvalho de Melo:
- Fix printing 'phys_addr' in 'perf script'.
- Fix failure to add events with 'perf probe' in ppc64 due to not
removing leading dot (ppc64 ABIv1).
- Fix cpu_map__item() python binding building.
- Support event alias in form foo-bar-baz, add pmu-events and
parse-event tests for it.
- No need to setup affinities when starting a workload or attaching to
a pid.
- Use path__join() to compose a path instead of ad-hoc snprintf()
equivalent.
- Override attr->sample_period for non-libpfm4 events.
- Use libperf cpumap APIs instead of accessing the internal state
directly.
- Sync x86 arch prctl headers and files changed by the new
set_mempolicy_home_node syscall with the kernel sources.
- Remove duplicate include in cpumap.h.
- Remove redundant err variable.
* tag 'perf-tools-for-v5.17-2022-01-22' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
perf tools: Remove redundant err variable
perf test: Add parse-events test for aliases with hyphens
perf test: Add pmu-events test for aliases with hyphens
perf parse-events: Support event alias in form foo-bar-baz
perf evsel: Override attr->sample_period for non-libpfm4 events
perf cpumap: Remove duplicate include in cpumap.h
perf cpumap: Migrate to libperf cpumap api
perf python: Fix cpu_map__item() building
perf script: Fix printing 'phys_addr' failure issue
tools headers UAPI: Sync files changed by new set_mempolicy_home_node syscall
tools headers UAPI: Sync x86 arch prctl headers with the kernel sources
perf machine: Use path__join() to compose a path instead of snprintf(dir, '/', filename)
perf evlist: No need to setup affinities when disabling events for pid targets
perf evlist: No need to setup affinities when enabling events for pid targets
perf stat: No need to setup affinities when starting a workload
perf affinity: Allow passing a NULL arg to affinity__cleanup()
perf probe: Fix ppc64 'perf probe add events failed' case
Linus Torvalds [Sun, 23 Jan 2022 06:07:02 +0000 (08:07 +0200)]
Merge tag 'trace-v5.17-3' of git://git./linux/kernel/git/rostedt/linux-trace
Pull ftrace fix from Steven Rostedt:
"Fix s390 breakage from sorting mcount tables.
The latest merge of the tracing tree sorts the mcount table at build
time. But s390 appears to do things differently (like always) and
replaces the sorted table back to the original unsorted one. As the
ftrace algorithm depends on it being sorted, bad things happen when it
is not, and s390 experienced those bad things.
Add a new config to tell the boot if the mcount table is sorted or
not, and allow s390 to opt out of it"
* tag 'trace-v5.17-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
ftrace: Fix assuming build time sort works for s390
Steven Rostedt (Google) [Sat, 22 Jan 2022 14:17:10 +0000 (09:17 -0500)]
ftrace: Fix assuming build time sort works for s390
To speed up the boot process, as mcount_loc needs to be sorted for ftrace
to work properly, sorting it at build time is more efficient than boot up
and can save milliseconds of time. Unfortunately, this change broke s390
as it will modify the mcount_loc location after the sorting takes place
and will put back the unsorted locations. Since the sorting is skipped at
boot up if it is believed that it was sorted at run time, ftrace can crash
as its algorithms are dependent on the list being sorted.
Add a new config BUILDTIME_MCOUNT_SORT that is set when
BUILDTIME_TABLE_SORT but not if S390 is set. Use this config to determine
if sorting should take place at boot up.
Link: https://lore.kernel.org/all/yt9dee51ctfn.fsf@linux.ibm.com/
Fixes: 72b3942a173c ("scripts: ftrace - move the sort-processing in ftrace_init")
Reported-by: Sven Schnelle <svens@linux.ibm.com>
Tested-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Linus Torvalds [Sun, 23 Jan 2022 04:32:29 +0000 (06:32 +0200)]
Merge tag 'kbuild-fixes-v5.17' of git://git./linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
- Bring include/uapi/linux/nfc.h into the UAPI compile-test coverage
- Revert the workaround of CONFIG_CC_IMPLICIT_FALLTHROUGH
- Fix build errors in certs/Makefile
* tag 'kbuild-fixes-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
certs: Fix build error when CONFIG_MODULE_SIG_KEY is empty
certs: Fix build error when CONFIG_MODULE_SIG_KEY is PKCS#11 URI
Revert "Makefile: Do not quote value for CONFIG_CC_IMPLICIT_FALLTHROUGH"
usr/include/Makefile: add linux/nfc.h to the compile-test coverage
Linus Torvalds [Sun, 23 Jan 2022 04:20:44 +0000 (06:20 +0200)]
Merge tag 'bitmap-5.17-rc1' of git://github.com/norov/linux
Pull bitmap updates from Yury Norov:
- introduce for_each_set_bitrange()
- use find_first_*_bit() instead of find_next_*_bit() where possible
- unify for_each_bit() macros
* tag 'bitmap-5.17-rc1' of git://github.com/norov/linux:
vsprintf: rework bitmap_list_string
lib: bitmap: add performance test for bitmap_print_to_pagebuf
bitmap: unify find_bit operations
mm/percpu: micro-optimize pcpu_is_populated()
Replace for_each_*_bit_from() with for_each_*_bit() where appropriate
find: micro-optimize for_each_{set,clear}_bit()
include/linux: move for_each_bit() macros from bitops.h to find.h
cpumask: replace cpumask_next_* with cpumask_first_* where appropriate
tools: sync tools/bitmap with mother linux
all: replace find_next{,_zero}_bit with find_first{,_zero}_bit where appropriate
cpumask: use find_first_and_bit()
lib: add find_first_and_bit()
arch: remove GENERIC_FIND_FIRST_BIT entirely
include: move find.h from asm_generic to linux
bitops: move find_bit_*_le functions from le.h to find.h
bitops: protect find_first_{,zero}_bit properly
Minghao Chi [Wed, 12 Jan 2022 08:01:09 +0000 (08:01 +0000)]
perf tools: Remove redundant err variable
Return value from perf_event__process_tracing_data() directly instead
of taking this in another redundant variable.
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/20220112080109.666800-1-chi.minghao@zte.com.cn
Signed-off-by: CGEL ZTE <cgel.zte@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
John Garry [Mon, 17 Jan 2022 15:10:15 +0000 (23:10 +0800)]
perf test: Add parse-events test for aliases with hyphens
Add a test which allows us to test parsing an event alias with hyphens.
Since these events typically do not exist on most host systems, add the
alias to the fake pmu.
Function perf_pmu__test_parse_init() has terms added to match known test
aliases.
Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Qi Liu <liuqi115@huawei.com>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: linuxarm@huawei.com
Link: https://lore.kernel.org/r/1642432215-234089-4-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
John Garry [Mon, 17 Jan 2022 15:10:14 +0000 (23:10 +0800)]
perf test: Add pmu-events test for aliases with hyphens
Add a test for aliases with hyphens in the name to ensure that the
pmu-events tables are as expects. There should be no reason why these sort
of aliases would be treated differently, but no harm in checking.
Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Qi Liu <liuqi115@huawei.com>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: linuxarm@huawei.com
Link: https://lore.kernel.org/r/1642432215-234089-3-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
John Garry [Mon, 17 Jan 2022 15:10:13 +0000 (23:10 +0800)]
perf parse-events: Support event alias in form foo-bar-baz
Event aliasing for events whose name in the form foo-bar-baz is not
supported, while foo-bar, foo_bar_baz, and other combinations are, i.e.
two hyphens are not supported.
The HiSilicon D06 platform has events in such form:
$ ./perf list sdir-home-migrate
List of pre-defined events (to be used in -e):
uncore hha:
sdir-home-migrate
[Unit: hisi_sccl,hha]
$ sudo ./perf stat -e sdir-home-migrate
event syntax error: 'sdir-home-migrate'
\___ parser error
Run 'perf list' for a list of valid events
Usage: perf stat [<options>] [<command>]
-e, --event <event>event selector. use 'perf list' to list available events
To support, add an extra PMU event symbol type for "baz", and add a new
rule in the bison file.
Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Qi Liu <liuqi115@huawei.com>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: linuxarm@huawei.com
Link: https://lore.kernel.org/r/1642432215-234089-2-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
German Gomez [Tue, 18 Jan 2022 14:40:54 +0000 (14:40 +0000)]
perf evsel: Override attr->sample_period for non-libpfm4 events
A previous patch preventing "attr->sample_period" values from being
overridden in pfm events changed a related behaviour in arm-spe.
Before said patch:
perf record -c 10000 -e arm_spe_0// -- sleep 1
Would yield an SPE event with period=10000. After the patch, the period
in "-c 10000" was being ignored because the arm-spe code initializes
sample_period to a non-zero value.
This patch restores the previous behaviour for non-libpfm4 events.
Fixes: ae5dcc8abe31 (“perf record: Prevent override of attr->sample_period for libpfm4 events”)
Reported-by: Chase Conklin <chase.conklin@arm.com>
Signed-off-by: German Gomez <german.gomez@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org
Link: http://lore.kernel.org/lkml/20220118144054.2541-1-german.gomez@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Lv Ruyi [Mon, 17 Jan 2022 08:37:30 +0000 (08:37 +0000)]
perf cpumap: Remove duplicate include in cpumap.h
Remove all but the first include of stdbool.h from cpumap.h.
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Lv Ruyi <lv.ruyi@zte.com.cn>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220117083730.863200-1-lv.ruyi@zte.com.cn
Signed-off-by: CGEL ZTE <cgel.zte@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ian Rogers [Sat, 22 Jan 2022 04:58:10 +0000 (20:58 -0800)]
perf cpumap: Migrate to libperf cpumap api
Switch from directly accessing the perf_cpu_map to using the appropriate
libperf API when possible. Using the API simplifies the job of
refactoring use of perf_cpu_map.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: André Almeida <andrealmeid@collabora.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miaoqian Lin <linmq006@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: http://lore.kernel.org/lkml/20220122045811.3402706-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ian Rogers [Sat, 22 Jan 2022 04:58:09 +0000 (20:58 -0800)]
perf python: Fix cpu_map__item() building
Value should be built as an integer.
Switch some uses of perf_cpu_map to use the library API.
Fixes: 6d18804b963b78dc ("perf cpumap: Give CPUs their own type")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: André Almeida <andrealmeid@collabora.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miaoqian Lin <linmq006@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: http://lore.kernel.org/lkml/20220122045811.3402706-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Yao Jin [Fri, 21 Jan 2022 06:59:54 +0000 (14:59 +0800)]
perf script: Fix printing 'phys_addr' failure issue
Perf script was failed to print the phys_addr for SPE profiling.
One 'dummy' event is added by SPE profiling but it doesn't have PHYS_ADDR
attribute set, perf script then exits with error.
Now referring to 'addr', use evsel__do_check_stype() to check the type.
Before:
# perf record -e arm_spe_0/branch_filter=0,ts_enable=1,pa_enable=1,load_filter=1,jitter=0,\
store_filter=0,min_latency=0,event_filter=2/ -p
4064384 -- sleep 3
# perf script -F pid,tid,addr,phys_addr
Samples for 'dummy:u' event do not have PHYS_ADDR attribute set. Cannot print 'phys_addr' field.
After:
# perf record -e arm_spe_0/branch_filter=0,ts_enable=1,pa_enable=1,load_filter=1,jitter=0,\
store_filter=0,min_latency=0,event_filter=2/ -p
4064384 -- sleep 3
# perf script -F pid,tid,addr,phys_addr
4064384/
4064384 ffff802f921be0d0 2f921be0d0
4064384/
4064384 ffff802f921be0d0 2f921be0d0
Reviewed-by: German Gomez <german.gomez@arm.com>
Signed-off-by: Yao Jin <jinyao5@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20220121065954.2121900-1-liwei391@huawei.com
Signed-off-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Masahiro Yamada [Thu, 20 Jan 2022 19:22:05 +0000 (04:22 +0900)]
certs: Fix build error when CONFIG_MODULE_SIG_KEY is empty
Since
b8c96a6b466c ("certs: simplify $(srctree)/ handling and remove
config_filename macro"), when CONFIG_MODULE_SIG_KEY is empty,
signing_key.x509 fails to build:
CERT certs/signing_key.x509
Usage: extract-cert <source> <dest>
make[1]: *** [certs/Makefile:78: certs/signing_key.x509] Error 2
make: *** [Makefile:1831: certs] Error 2
Pass "" to the first argument of extract-cert to fix the build error.
Link: https://lore.kernel.org/linux-kbuild/20220120094606.2skuyb26yjlnu66q@lion.mk-sys.cz/T/#u
Fixes: b8c96a6b466c ("certs: simplify $(srctree)/ handling and remove config_filename macro")
Reported-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Tested-by: Michal Kubecek <mkubecek@suse.cz>
Masahiro Yamada [Thu, 20 Jan 2022 19:22:04 +0000 (04:22 +0900)]
certs: Fix build error when CONFIG_MODULE_SIG_KEY is PKCS#11 URI
When CONFIG_MODULE_SIG_KEY is PKCS#11 URL (pkcs11:*), signing_key.x509
fails to build:
certs/Makefile:77: *** target pattern contains no '%'. Stop.
Due to the typo, $(X509_DEP) contains a colon.
Fix it.
Fixes: b8c96a6b466c ("certs: simplify $(srctree)/ handling and remove config_filename macro")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Masahiro Yamada [Thu, 20 Jan 2022 05:31:00 +0000 (14:31 +0900)]
Revert "Makefile: Do not quote value for CONFIG_CC_IMPLICIT_FALLTHROUGH"
This reverts commit
cd8c917a56f20f48748dd43d9ae3caff51d5b987.
Commit
129ab0d2d9f3 ("kbuild: do not quote string values in
include/config/auto.conf") provided the final solution.
Now reverting the temporary workaround.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Dmitry V. Levin [Mon, 3 Jan 2022 01:24:02 +0000 (04:24 +0300)]
usr/include/Makefile: add linux/nfc.h to the compile-test coverage
As linux/nfc.h userspace compilation was finally fixed by commits
79b69a83705e ("nfc: uapi: use kernel size_t to fix user-space builds")
and
7175f02c4e5f ("uapi: fix linux/nfc.h userspace compilation errors"),
there is no need to keep the compile-test exception for it in
usr/include/Makefile.
Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Linus Torvalds [Sat, 22 Jan 2022 09:28:23 +0000 (11:28 +0200)]
Merge branch 'akpm' (patches from Andrew)
Merge yet more updates from Andrew Morton:
"This is the post-linux-next queue. Material which was based on or
dependent upon material which was in -next.
69 patches.
Subsystems affected by this patch series: mm (migration and zsmalloc),
sysctl, proc, and lib"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (69 commits)
mm: hide the FRONTSWAP Kconfig symbol
frontswap: remove support for multiple ops
mm: mark swap_lock and swap_active_head static
frontswap: simplify frontswap_register_ops
frontswap: remove frontswap_test
mm: simplify try_to_unuse
frontswap: remove the frontswap exports
frontswap: simplify frontswap_init
frontswap: remove frontswap_curr_pages
frontswap: remove frontswap_shrink
frontswap: remove frontswap_tmem_exclusive_gets
frontswap: remove frontswap_writethrough
mm: remove cleancache
lib/stackdepot: always do filter_irq_stacks() in stack_depot_save()
lib/stackdepot: allow optional init and stack_table allocation by kvmalloc()
proc: remove PDE_DATA() completely
fs: proc: store PDE()->data into inode->i_private
zsmalloc: replace get_cpu_var with local_lock
zsmalloc: replace per zpage lock with pool->migrate_lock
locking/rwlocks: introduce write_lock_nested
...
Linus Torvalds [Sat, 22 Jan 2022 09:12:26 +0000 (11:12 +0200)]
Merge tag '5.17-rc-part2-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
- multichannel fixes, addressing additional reconnect and DFS scenarios
- reenabling fscache support (indexing rewrite, metadata caching e.g.)
- send additional version information during NTLMSSP negotiate to
improve debugging
- fix for a mount race
- DFS fixes
- fix for a memory leak for stable
* tag '5.17-rc-part2-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: update internal module number
smb3: send NTLMSSP version information
cifs: Support fscache indexing rewrite
cifs: cifs_ses_mark_for_reconnect should also update reconnect bits
cifs: update tcpStatus during negotiate and sess setup
cifs: make status checks in version independent callers
cifs: remove repeated state change in dfs tree connect
cifs: fix the cifs_reconnect path for DFS
cifs: remove unused variable ses_selected
cifs: protect all accesses to chan_* with chan_lock
cifs: fix the connection state transitions with multichannel
cifs: check reconnects for channels of active tcons too
smb3: add new defines from protocol specification
cifs: serialize all mount attempts
cifs: quirk for STATUS_OBJECT_NAME_INVALID returned for non-ASCII dfs refs
cifs: alloc_path_with_tree_prefix: do not append sep. if the path is empty
cifs: clean up an inconsistent indenting
cifs: free ntlmsspblob allocated in negotiate
Linus Torvalds [Sat, 22 Jan 2022 09:04:27 +0000 (11:04 +0200)]
Merge tag 'xfs-5.17-merge-7' of git://git./fs/xfs/xfs-linux
Pull xfs fixes from Darrick Wong:
"One of the patches removes some dead code from xfs_ioctl32.h and the
other fixes broken workqueue flushing in the inode garbage collector.
- Minor cleanup of ioctl32 cruft
- Clean up open coded inodegc workqueue function calls"
* tag 'xfs-5.17-merge-7' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: flush inodegc workqueue tasks before cancel
xfs: remove unused xfs_ioctl32.h declarations
Linus Torvalds [Sat, 22 Jan 2022 08:59:32 +0000 (10:59 +0200)]
Merge tag 'fscache-fixes-
20220121' of git://git./linux/kernel/git/dhowells/linux-fs
Pull more fscache updates from David Howells:
"A set of fixes and minor updates for the fscache rewrite:
- Fix mishandling of volume collisions (the wait condition is
inverted and so it was only waiting if the volume collision was
already resolved).
- Fix miscalculation of whether there's space available in
cachefiles.
- Make sure a default cache name is set on a cache if the user hasn't
set one by the time they bind the cache.
- Adjust the way the backing inode is presented in tracepoints, add a
tracepoint for mkdir and trace directory lookup.
- Add a tracepoint for failure to set the active file mark.
- Add an explanation of the checks made on the backing filesystem.
- Check that the backing filesystem supports tmpfile.
- Document how the page-release cancellation of the read-skip
optimisation works.
And I've included a change for netfslib:
- Make ops->init_rreq() optional"
* tag 'fscache-fixes-
20220121' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
netfs: Make ops->init_rreq() optional
fscache: Add a comment explaining how page-release optimisation works
cachefiles: Check that the backing filesystem supports tmpfiles
cachefiles: Explain checks in a comment
cachefiles: Trace active-mark failure
cachefiles: Make some tracepoint adjustments
cachefiles: set default tag name if it's unspecified
cachefiles: Calculate the blockshift in terms of bytes, not pages
fscache: Fix the volume collision wait condition
Linus Torvalds [Sat, 22 Jan 2022 08:43:07 +0000 (10:43 +0200)]
Merge tag 'folio-5.17a' of git://git.infradead.org/users/willy/pagecache
Pull more folio updates from Matthew Wilcox:
"Three small folio patches.
One bug fix, one patch pulled forward from the patches destined for
5.18 and then a patch to make use of that functionality"
* tag 'folio-5.17a' of git://git.infradead.org/users/willy/pagecache:
filemap: Use folio_put_refs() in filemap_free_folio()
mm: Add folio_put_refs()
pagevec: Initialise folio_batch->percpu_pvec_drained
Linus Torvalds [Sat, 22 Jan 2022 08:24:02 +0000 (10:24 +0200)]
Merge tag 'scsi-misc' of git://git./linux/kernel/git/jejb/scsi
Pull more SCSI updates from James Bottomley:
"This series is all the stragglers that didn't quite make the first
merge window pull. It's mostly minor updates and bug fixes of merge
window code"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: nsp_cs: Check of ioremap return value
scsi: ufs: ufs-mediatek: Fix error checking in ufs_mtk_init_va09_pwr_ctrl()
scsi: ufs: Modify Tactive time setting conditions
scsi: efct: Remove useless DMA-32 fallback configuration
scsi: message: fusion: mptctl: Use dma_alloc_coherent()
scsi: message: fusion: mptsas: Use dma_alloc_coherent()
scsi: message: fusion: Use dma_alloc_coherent() in mptsas_exp_repmanufacture_info()
scsi: message: fusion: mptbase: Use dma_alloc_coherent()
scsi: message: fusion: Use dma_alloc_coherent() in mpt_alloc_fw_memory()
scsi: message: fusion: Remove usage of the deprecated "pci-dma-compat.h" API
scsi: megaraid: Avoid mismatched storage type sizes
scsi: hisi_sas: Remove unused variable and check in hisi_sas_send_ata_reset_each_phy()
scsi: aic79xx: Remove redundant error variable
scsi: pm80xx: Port reset timeout error handling correction
scsi: mpi3mr: Fix formatting problems in some kernel-doc comments
scsi: mpi3mr: Fix some spelling mistakes
scsi: mpt3sas: Update persistent trigger pages from sysfs interface
scsi: core: Fix scsi_mode_select() interface
scsi: aacraid: Fix spelling of "its"
scsi: qedf: Fix potential dereference of NULL pointer
Linus Torvalds [Sat, 22 Jan 2022 08:22:11 +0000 (10:22 +0200)]
Merge tag 'ata-5.17-rc1-part2' of git://git./linux/kernel/git/dlemoal/libata
Pull ATA fix from Damien Le Moal:
"A single patch to fix a compilation error in the pata_octeon_cf driver
(mips architecture), from me"
* tag 'ata-5.17-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
ata: pata_octeon_cf: fix call to trace_ata_bmdma_stop()
Linus Torvalds [Sat, 22 Jan 2022 08:15:41 +0000 (10:15 +0200)]
Merge tag 'thermal-5.17-rc1-2' of git://git./linux/kernel/git/rafael/linux-pm
Pull more thermal control updates from Rafael Wysocki:
"Add device IDs for Raptor Lake to the int340x thermal control driver
(Srinivas Pandruvada)"
* tag 'thermal-5.17-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
thermal: int340x: Add Raptor Lake PCI device id
thermal: int340x: Support Raptor Lake
Linus Torvalds [Sat, 22 Jan 2022 08:09:51 +0000 (10:09 +0200)]
Merge tag 'acpi-5.17-rc1-3' of git://git./linux/kernel/git/rafael/linux-pm
Pull extra ACPI updates from Rafael Wysocki:
"These fix and clean up the ACPI CPPC driver on top of the recent
changes in it merged previously and add some new device IDs to the
ACPI DPTF driver.
Specifics:
- Fix a recently introduced endianness-related issue in the ACPI CPPC
library and clean it up on top of that (Rafael Wysocki)
- Add new device IDs for the Raptor Lake SoC to the ACPI DPTF driver
(Srinivas Pandruvada)"
* tag 'acpi-5.17-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: DPTF: Support Raptor Lake
ACPI: CPPC: Drop redundant local variable from cpc_read()
ACPI: CPPC: Fix up I/O port access in cpc_read()
Linus Torvalds [Sat, 22 Jan 2022 07:52:17 +0000 (09:52 +0200)]
Merge tag 'devicetree-fixes-for-5.17-1' of git://git./linux/kernel/git/robh/linux
Pull devicetree fixes and cleanups from Rob Herring:
- Fix a regression when probing a child device reusing the parent
device's DT node pointer
- Refactor of_parse_phandle*() variants to static inlines
- Drop Enric Balletbo i Serra as a maintainer
- Fix DT schemas with arrays incorrectly encoded as a matrix
- Drop unneeded pinctrl properties from schemas
- Add SPI peripheral schema to SPI based displays
- Clean-up several schema examples
- Clean-up trivial-devices.yaml comments
- Add missing, in use vendor prefixes: Wingtech, Thundercomm, Huawei,
F(x)tec, 8devices
* tag 'devicetree-fixes-for-5.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
dt-bindings: google,cros-ec: drop Enric Balletbo i Serra from maintainers
dt-bindings: display: bridge: drop Enric Balletbo i Serra from maintainers
of: Check 'of_node_reused' flag on of_match_device()
of: property: define of_property_read_u{8,16,32,64}_array() unconditionally
of: base: make small of_parse_phandle() variants static inline
dt-bindings: mfd: cirrus,madera: Fix 'interrupts' in example
dt-bindings: Fix array schemas encoded as matrices
dt-bindings: Drop unnecessary pinctrl properties
dt-bindings: rtc: st,stm32-rtc: Make each example a separate entry
dt-bindings: mmc: arm,pl18x: Make each example a separate entry
dt-bindings: display: Add SPI peripheral schema to SPI based displays
scripts/dtc: dtx_diff: remove broken example from help text
dt-bindings: trivial-devices: fix double spaces in comments
dt-bindings: trivial-devices: fix swapped comments
dt-bindings: vendor-prefixes: add Wingtech
dt-bindings: vendor-prefixes: add Thundercomm
dt-bindings: vendor-prefixes: add Huawei
dt-bindings: vendor-prefixes: add F(x)tec
dt-bindings: vendor-prefixes: add 8devices
dt-bindings: power: reset: gpio-restart: Correct default priority
Linus Torvalds [Sat, 22 Jan 2022 07:40:01 +0000 (09:40 +0200)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm
Pull more kvm updates from Paolo Bonzini:
"Generic:
- selftest compilation fix for non-x86
- KVM: avoid warning on s390 in mark_page_dirty
x86:
- fix page write-protection bug and improve comments
- use binary search to lookup the PMU event filter, add test
- enable_pmu module parameter support for Intel CPUs
- switch blocked_vcpu_on_cpu_lock to raw spinlock
- cleanups of blocked vCPU logic
- partially allow KVM_SET_CPUID{,2} after KVM_RUN (5.16 regression)
- various small fixes"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (46 commits)
docs: kvm: fix WARNINGs from api.rst
selftests: kvm/x86: Fix the warning in lib/x86_64/processor.c
selftests: kvm/x86: Fix the warning in pmu_event_filter_test.c
kvm: selftests: Do not indent with spaces
kvm: selftests: sync uapi/linux/kvm.h with Linux header
selftests: kvm: add amx_test to .gitignore
KVM: SVM: Nullify vcpu_(un)blocking() hooks if AVIC is disabled
KVM: SVM: Move svm_hardware_setup() and its helpers below svm_x86_ops
KVM: SVM: Drop AVIC's intermediate avic_set_running() helper
KVM: VMX: Don't do full kick when handling posted interrupt wakeup
KVM: VMX: Fold fallback path into triggering posted IRQ helper
KVM: VMX: Pass desired vector instead of bool for triggering posted IRQ
KVM: VMX: Don't do full kick when triggering posted interrupt "fails"
KVM: SVM: Skip AVIC and IRTE updates when loading blocking vCPU
KVM: SVM: Use kvm_vcpu_is_blocking() in AVIC load to handle preemption
KVM: SVM: Remove unnecessary APICv/AVIC update in vCPU unblocking path
KVM: SVM: Don't bother checking for "running" AVIC when kicking for IPIs
KVM: SVM: Signal AVIC doorbell iff vCPU is in guest mode
KVM: x86: Remove defunct pre_block/post_block kvm_x86_ops hooks
KVM: x86: Unexport LAPIC's switch_to_{hv,sw}_timer() helpers
...
Linus Torvalds [Sat, 22 Jan 2022 07:37:31 +0000 (09:37 +0200)]
Merge tag 'for-5.17/parisc-2' of git://git./linux/kernel/git/deller/parisc-linux
Pull more parisc architecture updates from Helge Deller:
"Fixes and enhancements:
- a memory leak fix in an error path in pdc_stable (Miaoqian Lin)
- two compiler warning fixes in the TOC code
- added autodetection for currently used console type (serial or
graphics) which inserts console=<type> if it's missing"
* tag 'for-5.17/parisc-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: pdc_stable: Fix memory leak in pdcs_register_pathentries
parisc: Fix missing prototype for 'toc_intr' warning in toc.c
parisc: Autodetect default output device and set console= kernel parameter
parisc: Use safer strscpy() in setup_cmdline()
parisc: Add visible flag to toc_stack variable
Linus Torvalds [Sat, 22 Jan 2022 07:34:49 +0000 (09:34 +0200)]
Merge tag 'riscv-for-linus-5.17-mw1' of git://git./linux/kernel/git/riscv/linux
Pull more RISC-V updates from Palmer Dabbelt:
- Support for sv48 paging
- Hart ID mappings are now sparse, which enables more CPUs to come up
on systems with sparse hart IDs
- A handful of cleanups and fixes
* tag 'riscv-for-linus-5.17-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (27 commits)
RISC-V: nommu_virt: Drop unused SLAB_MERGE_DEFAULT
RISC-V: Remove redundant err variable
riscv: dts: sifive unmatched: Add gpio poweroff
riscv: canaan: remove useless select of non-existing config SYSCON
RISC-V: Do not use cpumask data structure for hartid bitmap
RISC-V: Move spinwait booting method to its own config
RISC-V: Move the entire hart selection via lottery to SMP
RISC-V: Use __cpu_up_stack/task_pointer only for spinwait method
RISC-V: Do not print the SBI version during HSM extension boot print
RISC-V: Avoid using per cpu array for ordered booting
riscv: default to CONFIG_RISCV_SBI_V01=n
riscv: fix boolconv.cocci warnings
riscv: Explicit comment about user virtual address space size
riscv: Use pgtable_l4_enabled to output mmu_type in cpuinfo
riscv: Implement sv48 support
asm-generic: Prepare for riscv use of pud_alloc_one and pud_free
riscv: Allow to dynamically define VA_BITS
riscv: Introduce functions to switch pt_ops
riscv: Split early kasan mapping to prepare sv48 introduction
riscv: Move KASAN mapping next to the kernel mapping
...
Linus Torvalds [Sat, 22 Jan 2022 07:22:10 +0000 (09:22 +0200)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux
Pull arm64 fixes/cleanups from Catalin Marinas:
"Some fixes that turned up during the merge window:
- Add brackets to the io_stop_wc macro
- Avoid -Warray-bounds warning with the LSE atomics inline asm
- Apply __ro_after_init to memory_limit"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: mm: apply __ro_after_init to memory_limit
arm64: atomics: lse: Dereference matching size
asm-generic: Add missing brackets for io_stop_wc macro
Linus Torvalds [Sat, 22 Jan 2022 07:02:57 +0000 (09:02 +0200)]
Merge tag 'docs-5.17-2' of git://git.lwn.net/linux
Pull documentation fixes from Jonathan Corbet:
"Three small documentation fixes"
* tag 'docs-5.17-2' of git://git.lwn.net/linux:
Documentation: fix firewire.rst ABI file path error
docs: ftrace: fix ambiguous sentence
docs: staging/tee.rst: fix two typos found while reading
Christoph Hellwig [Sat, 22 Jan 2022 06:15:14 +0000 (22:15 -0800)]
mm: hide the FRONTSWAP Kconfig symbol
Select FRONTSWAP from ZSWAP instead of prompting for it.
Link: https://lkml.kernel.org/r/20211224062246.1258487-14-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Vitaly Wool <vitaly.wool@konsulko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christoph Hellwig [Sat, 22 Jan 2022 06:15:10 +0000 (22:15 -0800)]
frontswap: remove support for multiple ops
There is only a single instance of frontswap ops in the kernel, so
simplify the frontswap code by removing support for multiple operations.
Link: https://lkml.kernel.org/r/20211224062246.1258487-13-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Vitaly Wool <vitaly.wool@konsulko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christoph Hellwig [Sat, 22 Jan 2022 06:15:07 +0000 (22:15 -0800)]
mm: mark swap_lock and swap_active_head static
swap_lock and swap_active_head are only used in swapfile.c, so mark them
static.
Link: https://lkml.kernel.org/r/20211224062246.1258487-12-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Vitaly Wool <vitaly.wool@konsulko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christoph Hellwig [Sat, 22 Jan 2022 06:15:04 +0000 (22:15 -0800)]
frontswap: simplify frontswap_register_ops
Given that frontswap_register_ops must be called from built-in code,
there is no need to handle the case of swapfiles coming online before or
during it, so delete the code that deals with that case.
Link: https://lkml.kernel.org/r/20211224062246.1258487-11-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Vitaly Wool <vitaly.wool@konsulko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christoph Hellwig [Sat, 22 Jan 2022 06:15:01 +0000 (22:15 -0800)]
frontswap: remove frontswap_test
frontswap_test is unused now, remove it.
Link: https://lkml.kernel.org/r/20211224062246.1258487-10-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Vitaly Wool <vitaly.wool@konsulko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christoph Hellwig [Sat, 22 Jan 2022 06:14:57 +0000 (22:14 -0800)]
mm: simplify try_to_unuse
Remove the unused frontswap and pages_to_unuse arguments, and mark the
function static now that the caller in frontswap is gone.
[akpm@linux-foundation.org: fix shmem_unuse() stub, per Matthew]
Link: https://lkml.kernel.org/r/20211224062246.1258487-9-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Vitaly Wool <vitaly.wool@konsulko.com>
Cc: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christoph Hellwig [Sat, 22 Jan 2022 06:14:54 +0000 (22:14 -0800)]
frontswap: remove the frontswap exports
None of the frontswap API is called from modular code.
Link: https://lkml.kernel.org/r/20211224062246.1258487-8-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Vitaly Wool <vitaly.wool@konsulko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christoph Hellwig [Sat, 22 Jan 2022 06:14:51 +0000 (22:14 -0800)]
frontswap: simplify frontswap_init
Just use IS_ENABLED() and remove the __frontswap_init indirection.
Also remove the unused export.
Link: https://lkml.kernel.org/r/20211224062246.1258487-7-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Vitaly Wool <vitaly.wool@konsulko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christoph Hellwig [Sat, 22 Jan 2022 06:14:47 +0000 (22:14 -0800)]
frontswap: remove frontswap_curr_pages
frontswap_curr_pages is never called, so remove it.
Link: https://lkml.kernel.org/r/20211224062246.1258487-6-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Vitaly Wool <vitaly.wool@konsulko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christoph Hellwig [Sat, 22 Jan 2022 06:14:44 +0000 (22:14 -0800)]
frontswap: remove frontswap_shrink
frontswap_shrink is never called, so remove it.
Link: https://lkml.kernel.org/r/20211224062246.1258487-5-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Vitaly Wool <vitaly.wool@konsulko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christoph Hellwig [Sat, 22 Jan 2022 06:14:41 +0000 (22:14 -0800)]
frontswap: remove frontswap_tmem_exclusive_gets
frontswap_tmem_exclusive_gets is never called, so remove it.
Link: https://lkml.kernel.org/r/20211224062246.1258487-4-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Vitaly Wool <vitaly.wool@konsulko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christoph Hellwig [Sat, 22 Jan 2022 06:14:38 +0000 (22:14 -0800)]
frontswap: remove frontswap_writethrough
frontswap_writethrough is never called, so remove it.
Link: https://lkml.kernel.org/r/20211224062246.1258487-3-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Vitaly Wool <vitaly.wool@konsulko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christoph Hellwig [Sat, 22 Jan 2022 06:14:34 +0000 (22:14 -0800)]
mm: remove cleancache
Patch series "remove Xen tmem leftovers".
Since the removal of the Xen tmem driver in 2019, the cleancache hooks
are entirely unused, as are large parts of frontswap. This series
against linux-next (with the folio changes included) removes
cleancaches, and cuts down frontswap to the bits actually used by zswap.
This patch (of 13):
The cleancache subsystem is unused since the removal of Xen tmem driver
in commit
814bbf49dcd0 ("xen: remove tmem driver").
[akpm@linux-foundation.org: remove now-unreachable code]
Link: https://lkml.kernel.org/r/20211224062246.1258487-1-hch@lst.de
Link: https://lkml.kernel.org/r/20211224062246.1258487-2-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Vitaly Wool <vitaly.wool@konsulko.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Marco Elver [Sat, 22 Jan 2022 06:14:31 +0000 (22:14 -0800)]
lib/stackdepot: always do filter_irq_stacks() in stack_depot_save()
The non-interrupt portion of interrupt stack traces before interrupt
entry is usually arbitrary. Therefore, saving stack traces of
interrupts (that include entries before interrupt entry) to stack depot
leads to unbounded stackdepot growth.
As such, use of filter_irq_stacks() is a requirement to ensure
stackdepot can efficiently deduplicate interrupt stacks.
Looking through all current users of stack_depot_save(), none (except
KASAN) pass the stack trace through filter_irq_stacks() before passing
it on to stack_depot_save().
Rather than adding filter_irq_stacks() to all current users of
stack_depot_save(), it became clear that stack_depot_save() should
simply do filter_irq_stacks().
Link: https://lkml.kernel.org/r/20211130095727.2378739-1-elver@google.com
Signed-off-by: Marco Elver <elver@google.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Vijayanand Jitta <vjitta@codeaurora.org>
Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org>
Cc: Imran Khan <imran.f.khan@oracle.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vlastimil Babka [Sat, 22 Jan 2022 06:14:27 +0000 (22:14 -0800)]
lib/stackdepot: allow optional init and stack_table allocation by kvmalloc()
Currently, enabling CONFIG_STACKDEPOT means its stack_table will be
allocated from memblock, even if stack depot ends up not actually used.
The default size of stack_table is 4MB on 32-bit, 8MB on 64-bit.
This is fine for use-cases such as KASAN which is also a config option
and has overhead on its own. But it's an issue for functionality that
has to be actually enabled on boot (page_owner) or depends on hardware
(GPU drivers) and thus the memory might be wasted. This was raised as
an issue [1] when attempting to add stackdepot support for SLUB's debug
object tracking functionality. It's common to build kernels with
CONFIG_SLUB_DEBUG and enable slub_debug on boot only when needed, or
create only specific kmem caches with debugging for testing purposes.
It would thus be more efficient if stackdepot's table was allocated only
when actually going to be used. This patch thus makes the allocation
(and whole stack_depot_init() call) optional:
- Add a CONFIG_STACKDEPOT_ALWAYS_INIT flag to keep using the current
well-defined point of allocation as part of mem_init(). Make
CONFIG_KASAN select this flag.
- Other users have to call stack_depot_init() as part of their own init
when it's determined that stack depot will actually be used. This may
depend on both config and runtime conditions. Convert current users
which are page_owner and several in the DRM subsystem. Same will be
done for SLUB later.
- Because the init might now be called after the boot-time memblock
allocation has given all memory to the buddy allocator, change
stack_depot_init() to allocate stack_table with kvmalloc() when
memblock is no longer available. Also handle allocation failure by
disabling stackdepot (could have theoretically happened even with
memblock allocation previously), and don't unnecessarily align the
memblock allocation to its own size anymore.
[1] https://lore.kernel.org/all/CAMuHMdW=eoVzM1Re5FVoEN87nKfiLmM2+Ah7eNu2KXEhCvbZyA@mail.gmail.com/
Link: https://lkml.kernel.org/r/20211013073005.11351-1-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Marco Elver <elver@google.com> # stackdepot
Cc: Marco Elver <elver@google.com>
Cc: Vijayanand Jitta <vjitta@codeaurora.org>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Oliver Glitta <glittao@gmail.com>
Cc: Imran Khan <imran.f.khan@oracle.com>
From: Colin Ian King <colin.king@canonical.com>
Subject: lib/stackdepot: fix spelling mistake and grammar in pr_err message
There is a spelling mistake of the work allocation so fix this and
re-phrase the message to make it easier to read.
Link: https://lkml.kernel.org/r/20211015104159.11282-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
From: Vlastimil Babka <vbabka@suse.cz>
Subject: lib/stackdepot: allow optional init and stack_table allocation by kvmalloc() - fixup
On FLATMEM, we call page_ext_init_flatmem_late() just before
kmem_cache_init() which means stack_depot_init() (called by page owner
init) will not recognize properly it should use kvmalloc() and not
memblock_alloc(). memblock_alloc() will also not issue a warning and
return a block memory that can be invalid and cause kernel page fault when
saving stacks, as reported by the kernel test robot [1].
Fix this by moving page_ext_init_flatmem_late() below kmem_cache_init() so
that slab_is_available() is true during stack_depot_init(). SPARSEMEM
doesn't have this issue, as it doesn't do page_ext_init_flatmem_late(),
but a different page_ext_init() even later in the boot process.
Thanks to Mike Rapoport for pointing out the FLATMEM init ordering issue.
While at it, also actually resolve a checkpatch warning in stack_depot_init()
from DRM CI, which was supposed to be in the original patch already.
[1] https://lore.kernel.org/all/
20211014085450.GC18719@xsang-OptiPlex-9020/
Link: https://lkml.kernel.org/r/6abd9213-19a9-6d58-cedc-2414386d2d81@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: kernel test robot <oliver.sang@intel.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
From: Vlastimil Babka <vbabka@suse.cz>
Subject: lib/stackdepot: allow optional init and stack_table allocation by kvmalloc() - fixup3
Due to
cd06ab2fd48f ("drm/locking: add backtrace for locking contended
locks without backoff") landing recently to -next adding a new stack depot
user in drivers/gpu/drm/drm_modeset_lock.c we need to add an appropriate
call to stack_depot_init() there as well.
Link: https://lkml.kernel.org/r/2a692365-cfa1-64f2-34e0-8aa5674dce5e@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Naresh Kamboju <naresh.kamboju@linaro.org>
Cc: Marco Elver <elver@google.com>
Cc: Vijayanand Jitta <vjitta@codeaurora.org>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Oliver Glitta <glittao@gmail.com>
Cc: Imran Khan <imran.f.khan@oracle.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
From: Vlastimil Babka <vbabka@suse.cz>
Subject: lib/stackdepot: allow optional init and stack_table allocation by kvmalloc() - fixup4
Due to
4e66934eaadc ("lib: add reference counting tracking
infrastructure") landing recently to net-next adding a new stack depot
user in lib/ref_tracker.c we need to add an appropriate call to
stack_depot_init() there as well.
Link: https://lkml.kernel.org/r/45c1b738-1a2f-5b5f-2f6d-86fab206d01c@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Cc: Jiri Slab <jirislaby@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>