Paul E. McKenney [Fri, 3 Feb 2023 00:33:43 +0000 (16:33 -0800)]
Merge branches 'doc.2023.01.05a', 'fixes.2023.01.23a', 'kvfree.2023.01.03a', 'srcu.2023.01.03a', 'srcu-always.2023.02.02a', 'tasks.2023.01.03a', 'torture.2023.01.05a' and 'torturescript.2023.01.03a' into HEAD
doc.2023.01.05a: Documentation update.
fixes.2023.01.23a: Miscellaneous fixes.
kvfree.2023.01.03a: kvfree_rcu() updates.
srcu.2023.01.03a: SRCU updates.
srcu-always.2023.02.02a: Finish making SRCU be unconditionally available.
tasks.2023.01.03a: Tasks-RCU updates.
torture.2023.01.05a: Torture-test updates.
torturescript.2023.01.03a: Torture-test scripting updates.
Uladzislau Rezki (Sony) [Wed, 1 Feb 2023 15:08:07 +0000 (16:08 +0100)]
rcu/kvfree: Add kvfree_rcu_mightsleep() and kfree_rcu_mightsleep()
The kvfree_rcu() and kfree_rcu() APIs are hazardous in that if you forget
the second argument, it works, but might sleep. This sleeping can be a
correctness bug from atomic contexts, and even in non-atomic contexts
it might introduce unacceptable latencies. This commit therefore adds
kvfree_rcu_mightsleep() and kfree_rcu_mightsleep(), which will replace
the single-argument kvfree_rcu() and kfree_rcu(), respectively.
This commit enables a series of commits that switch from single-argument
kvfree_rcu() and kfree_rcu() to their _mightsleep() counterparts. Once
all of these commits land, the single-argument versions will be removed.
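
The hazard described above can be illustrated with a userspace model. This is only a sketch under stated assumptions: the `_model` names, `struct foo`, and the `sleeps` counter are hypothetical stand-ins and not the kernel implementation, and the one-argument path is modeled in its worst case, where it always blocks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's struct rcu_head. */
struct rcu_head_model {
    struct rcu_head_model *next;
};

struct foo {
    int data;
    struct rcu_head_model rh;   /* needed only by the two-argument form */
};

static struct rcu_head_model *deferred;  /* objects awaiting a grace period */
static int sleeps;                       /* times a caller had to block */

/* Two-argument form: queue through the embedded head, never blocks. */
static void kfree_rcu_model(struct foo *p)
{
    assert(p != NULL);
    p->rh.next = deferred;
    deferred = &p->rh;
}

/* No embedded head: worst case, wait for a grace period right here. */
static void kfree_rcu_mightsleep_model(void *p)
{
    sleeps++;   /* stands in for a blocking wait for a grace period */
    free(p);
}

/* Model the end of a grace period: free everything that was queued. */
static int end_grace_period_model(void)
{
    int freed = 0;

    while (deferred) {
        struct rcu_head_model *next = deferred->next;
        struct foo *p = (struct foo *)((char *)deferred -
                                       offsetof(struct foo, rh));

        free(p);
        deferred = next;
        freed++;
    }
    return freed;
}
```

In this model the two-argument call returns immediately and the object is freed later, whereas the _mightsleep() call makes the blocking visible in its name.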
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Wed, 23 Nov 2022 02:22:42 +0000 (18:22 -0800)]
kernel/notifier: Remove CONFIG_SRCU
Now that the SRCU Kconfig option is unconditionally selected, there is
no longer any point in conditional compilation based on CONFIG_SRCU.
Therefore, remove the #ifdef.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: "Michał Mirosław" <mirq-linux@rere.qmqm.pl>
Cc: Borislav Petkov <bp@suse.de>
Cc: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: John Ogness <john.ogness@linutronix.de>
Paul E. McKenney [Wed, 23 Nov 2022 02:10:15 +0000 (18:10 -0800)]
init: Remove "select SRCU"
Now that the SRCU Kconfig option is unconditionally selected, there is
no longer any point in selecting it. Therefore, remove the "select SRCU"
Kconfig statements.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Aaron Tomlin <atomlin@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: John Ogness <john.ogness@linutronix.de>
Paul E. McKenney [Wed, 23 Nov 2022 01:53:17 +0000 (17:53 -0800)]
fs/quota: Remove "select SRCU"
Now that the SRCU Kconfig option is unconditionally selected, there is
no longer any point in selecting it. Therefore, remove the "select SRCU"
Kconfig statements.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Jan Kara <jack@suse.com>
Acked-by: Jan Kara <jack@suse.cz>
Reviewed-by: John Ogness <john.ogness@linutronix.de>
Paul E. McKenney [Wed, 23 Nov 2022 01:52:19 +0000 (17:52 -0800)]
fs/notify: Remove "select SRCU"
Now that the SRCU Kconfig option is unconditionally selected, there is
no longer any point in selecting it. Therefore, remove the "select SRCU"
Kconfig statements.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: <linux-fsdevel@vger.kernel.org>
Acked-by: Jan Kara <jack@suse.cz>
Reviewed-by: John Ogness <john.ogness@linutronix.de>
Paul E. McKenney [Wed, 23 Nov 2022 01:49:29 +0000 (17:49 -0800)]
fs/btrfs: Remove "select SRCU"
Now that the SRCU Kconfig option is unconditionally selected, there is
no longer any point in selecting it. Therefore, remove the "select SRCU"
Kconfig statements.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: David Sterba <dsterba@suse.com>
Cc: <linux-btrfs@vger.kernel.org>
Acked-by: David Sterba <dsterba@suse.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: John Ogness <john.ogness@linutronix.de>
Paul E. McKenney [Wed, 23 Nov 2022 02:20:28 +0000 (18:20 -0800)]
fs: Remove CONFIG_SRCU
Now that the SRCU Kconfig option is unconditionally selected, there is
no longer any point in conditional compilation based on CONFIG_SRCU.
Therefore, remove the #ifdef and throw away the #else clause.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: <linux-fsdevel@vger.kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: John Ogness <john.ogness@linutronix.de>
Paul E. McKenney [Wed, 23 Nov 2022 01:48:03 +0000 (17:48 -0800)]
drivers/pci/controller: Remove "select SRCU"
Now that the SRCU Kconfig option is unconditionally selected, there is
no longer any point in selecting it. Therefore, remove the "select SRCU"
Kconfig statements.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Lorenzo Pieralisi <lpieralisi@kernel.org>
Cc: Rob Herring <robh@kernel.org>
Cc: "Krzysztof Wilczyński" <kw@linux.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: <linux-pci@vger.kernel.org>
Acked-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Reviewed-by: John Ogness <john.ogness@linutronix.de>
Paul E. McKenney [Wed, 23 Nov 2022 01:25:33 +0000 (17:25 -0800)]
drivers/net: Remove "select SRCU"
Now that the SRCU Kconfig option is unconditionally selected, there is
no longer any point in selecting it. Therefore, remove the "select SRCU"
Kconfig statements.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: <netdev@vger.kernel.org>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: John Ogness <john.ogness@linutronix.de>
Paul E. McKenney [Wed, 23 Nov 2022 01:24:14 +0000 (17:24 -0800)]
drivers/md: Remove "select SRCU"
Now that the SRCU Kconfig option is unconditionally selected, there is
no longer any point in selecting it. Therefore, remove the "select SRCU"
Kconfig statements.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Song Liu <song@kernel.org>
Cc: Alasdair Kergon <agk@redhat.com>
Cc: Mike Snitzer <snitzer@kernel.org>
Cc: <dm-devel@redhat.com>
Cc: <linux-raid@vger.kernel.org>
Acked-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: John Ogness <john.ogness@linutronix.de>
Paul E. McKenney [Wed, 23 Nov 2022 01:22:10 +0000 (17:22 -0800)]
drivers/hwtracing/stm: Remove "select SRCU"
Now that the SRCU Kconfig option is unconditionally selected, there is
no longer any point in selecting it. Therefore, remove the "select SRCU"
Kconfig statements.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: <linux-stm32@st-md-mailman.stormreply.com>
Cc: <linux-arm-kernel@lists.infradead.org>
Acked-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Reviewed-by: John Ogness <john.ogness@linutronix.de>
Paul E. McKenney [Wed, 23 Nov 2022 00:59:03 +0000 (16:59 -0800)]
drivers/dax: Remove "select SRCU"
Now that the SRCU Kconfig option is unconditionally selected, there is
no longer any point in selecting it. Therefore, remove the "select SRCU"
Kconfig statements.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: <nvdimm@lists.linux.dev>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: John Ogness <john.ogness@linutronix.de>
Paul E. McKenney [Wed, 23 Nov 2022 00:52:36 +0000 (16:52 -0800)]
drivers/base: Remove CONFIG_SRCU
Now that the SRCU Kconfig option is unconditionally selected, there is
no longer any point in conditional compilation based on CONFIG_SRCU.
Therefore, remove the #ifdef and throw away the #else clause.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Reviewed-by: John Ogness <john.ogness@linutronix.de>
Joel Fernandes (Google) [Thu, 12 Jan 2023 00:52:23 +0000 (00:52 +0000)]
rcu: Disable laziness if lazy-tracking says so
During suspend, we see failures to suspend 1 in 300-500 suspends.
Looking closer, it appears that asynchronous RCU callbacks are being
queued as lazy even though synchronous callbacks are expedited. These
delays do not appear to be welcomed by the suspend/resume code, as
evidenced by these occasional suspend failures.
This commit modifies call_rcu() to check rcu_async_should_hurry(),
which returns true during suspend or in-kernel boot, and to ignore the
lazy hint in that case.
[ paulmck: Alphabetize local variables. ]
Ignoring the lazy hint makes the 3000 suspend/resume cycles pass
reliably on a 12th gen 12-core Intel CPU, and there is some evidence
that it also slightly speeds up boot performance.
Fixes: 3cb278e73be5 ("rcu: Make call_rcu() lazy to save power")
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Joel Fernandes (Google) [Thu, 12 Jan 2023 00:52:22 +0000 (00:52 +0000)]
rcu: Track laziness during boot and suspend
Boot and suspend/resume should not be slowed down in kernels built with
CONFIG_RCU_LAZY=y. In particular, suspend can sometimes fail in such
kernels.
This commit therefore adds rcu_async_hurry(), rcu_async_relax(), and
rcu_async_should_hurry() functions that track whether or not either
a boot or a suspend/resume operation is in progress. This will
enable a later commit to refrain from laziness during those times.
Export rcu_async_should_hurry(), rcu_async_hurry(), and rcu_async_relax()
for later use by rcutorture.
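
The tracking described in this commit can be modeled in userspace roughly as follows. This is a sketch that assumes a simple nesting counter that starts out nonzero to represent boot; the `_model` names and `queue_lazily_model()` are hypothetical and are not the kernel functions themselves.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Nonzero while boot or suspend/resume is in progress (assumed design).
 * Starts at 1 because the model begins life "booting". */
static atomic_int hurry_nesting = 1;

/* Enter a phase during which laziness is unwelcome. */
static void rcu_async_hurry_model(void)
{
    atomic_fetch_add(&hurry_nesting, 1);
}

/* Leave such a phase; calls must pair with an earlier hurry (or boot). */
static void rcu_async_relax_model(void)
{
    int old = atomic_fetch_sub(&hurry_nesting, 1);

    assert(old > 0);
}

static bool rcu_async_should_hurry_model(void)
{
    return atomic_load(&hurry_nesting) != 0;
}

/* Caller side: honor the lazy hint only when nobody asked to hurry. */
static bool queue_lazily_model(bool lazy_hint)
{
    return lazy_hint && !rcu_async_should_hurry_model();
}
```

A counter rather than a flag lets boot and suspend/resume phases overlap or nest without one of them prematurely re-enabling laziness.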
[ paulmck: Apply feedback from Steve Rostedt. ]
Fixes: 3cb278e73be5 ("rcu: Make call_rcu() lazy to save power")
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Zqiang [Wed, 21 Dec 2022 19:15:43 +0000 (11:15 -0800)]
rcu: Remove redundant call to rcu_boost_kthread_setaffinity()
The rcu_boost_kthread_setaffinity() function is invoked at
rcutree_online_cpu() and rcutree_offline_cpu() time, early in the online
timeline and late in the offline timeline, respectively. It is also
invoked from rcutree_dead_cpu(), however, in the absence of userspace
manipulations (for which userspace must take responsibility), this call
is redundant with that from rcutree_offline_cpu(). This redundancy can
be demonstrated by printing out the relevant cpumasks.
This commit therefore removes the call to rcu_boost_kthread_setaffinity()
from rcutree_dead_cpu().
Signed-off-by: Zqiang <qiang1.zhang@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Joel Fernandes (Google) [Sun, 1 Jan 2023 06:15:55 +0000 (06:15 +0000)]
torture: Fix hang during kthread shutdown phase
During rcutorture shutdown, the rcu_torture_cleanup() function calls
torture_cleanup_begin(), which sets the fullstop global variable to
FULLSTOP_RMMOD. This causes the rcutorture threads for readers and
fakewriters to exit all of their "while" loops and start shutting down.
They then call torture_kthread_stopping(), which in turn waits for
kthread_stop() to be called. However, rcu_torture_cleanup() has
not yet called kthread_stop() on those threads, and before it gets a
chance to do so, multiple instances of torture_kthread_stopping() invoke
schedule_timeout_interruptible(1) in a tight loop. Tracing confirms that
TIMER_SOFTIRQ can then continuously execute timer callbacks. If that
TIMER_SOFTIRQ preempts the task executing rcu_torture_cleanup(), that
task might never invoke kthread_stop().
This commit improves this situation by increasing the timeout passed to
schedule_timeout_interruptible() from one jiffy to 1/20th of a second.
This change prevents TIMER_SOFTIRQ from monopolizing its CPU, thus
allowing rcu_torture_cleanup() to carry out the needed kthread_stop()
invocations. Testing has shown 100 runs of TREE07 passing reliably,
as opposed to the tens-of-percent failure rates seen beforehand.
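
The effect of the longer timeout can be worked out with a little arithmetic. The sketch below assumes that the new timeout is expressed as HZ / 20 jiffies; the helper names and the HZ values used with them are illustrative only.

```c
#include <assert.h>

/* Convert a jiffy count to milliseconds for a given tick rate. */
static unsigned int jiffies_to_ms_model(unsigned int jiffies, unsigned int hz)
{
    assert(hz > 0);
    return jiffies * 1000u / hz;
}

/* Timer wakeups per second for one thread looping on this timeout. */
static unsigned int wakeups_per_sec_model(unsigned int timeout_jiffies,
                                          unsigned int hz)
{
    assert(timeout_jiffies > 0);
    return hz / timeout_jiffies;
}
```

With HZ=1000, a one-jiffy timeout means up to 1000 timer wakeups per second per stopping kthread, while HZ / 20 is 50 jiffies (50 ms), or about 20 wakeups per second. Multiplied across many reader and fakewriter threads, that difference is what leaves the CPU free for the cleanup task.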
Cc: Paul McKenney <paulmck@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Zhouyi Zhou <zhouzhouyi@gmail.com>
Cc: <stable@vger.kernel.org> # 6.0.x
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Tested-by: Zhouyi Zhou <zhouzhouyi@gmail.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Fri, 16 Dec 2022 17:47:28 +0000 (09:47 -0800)]
rcutorture: Drop sparse lock-acquisition annotations
The sparse __acquires() and __releases() annotations provide very
little value. The argument is ignored, so sparse cannot tell the
differences between acquiring one lock and releasing another on the one
hand and acquiring and releasing a given lock on the other. In addition,
lockdep annotations provide much more precision, for but one example,
actually knowing which lock is held.
This commit therefore removes the __acquires() and __releases()
annotations from rcutorture.
Reported-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Joel Fernandes (Google) [Tue, 13 Dec 2022 20:48:39 +0000 (20:48 +0000)]
locktorture: Make the rt_boost factor a tunable
The rt boosting in locktorture has a factor variable that is currently large
enough that boosting only happens once every minute or so. Add a tunable to
reduce the factor so that boosting happens more often, to test paths and arrive
at failure modes earlier. With this change, I can set the factor to about 50
and have the boosting happen every 10 seconds or so.
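
How such a factor can control boost frequency is sketched below. This is an assumption about the general shape of the check (a random draw taken modulo the writer count times the factor), not a copy of the locktorture code, and the `_model` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Boost when the random draw lands on a multiple of nwriters * factor. */
static bool should_boost_model(unsigned long rnd, unsigned int nwriters,
                               unsigned int factor)
{
    assert(nwriters > 0 && factor > 0);
    return rnd % ((unsigned long)nwriters * factor) == 0;
}

/* Average number of draws between boosts, for uniformly random draws. */
static unsigned long draws_per_boost_model(unsigned int nwriters,
                                           unsigned int factor)
{
    assert(nwriters > 0 && factor > 0);
    return (unsigned long)nwriters * factor;
}
```

With 8 writers and a factor of 50, about one draw in 400 triggers a boost. The wall-clock interval also depends on how often each writer reaches the check, which this sketch does not model.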
Tested with boot parameters:
locktorture.torture_type=mutex_lock
locktorture.onoff_interval=1
locktorture.nwriters_stress=8
locktorture.stutter=0
locktorture.rt_boost=1
locktorture.rt_boost_factor=50
locktorture.nlocks=3
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Joel Fernandes (Google) [Tue, 13 Dec 2022 20:48:38 +0000 (20:48 +0000)]
locktorture: Allow non-rtmutex lock types to be boosted
Currently, RT boosting is done only for rtmutex_lock. However, with proxy
execution, mutex_lock also participates in priorities. To exercise the
testing better, add RT boosting to other lock testing types as well,
using a new knob (rt_boost).
Tested with boot parameters:
locktorture.torture_type=mutex_lock
locktorture.onoff_interval=1
locktorture.nwriters_stress=8
locktorture.stutter=0
locktorture.rt_boost=1
locktorture.rt_boost_factor=1
locktorture.nlocks=3
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Tue, 8 Nov 2022 16:18:06 +0000 (08:18 -0800)]
refscale: Add tests using SLAB_TYPESAFE_BY_RCU
This commit adds three read-side-only tests of three use cases featuring
SLAB_TYPESAFE_BY_RCU: One using per-object reference counting, one using
per-object locking, and one using per-object sequence locking.
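
The per-object reference-counting use case follows a general pattern: because type-safe memory may be reused for another object of the same type at any time, a reader first acquires a reference and only then rechecks the object's identity. The userspace sketch below illustrates that pattern with C11 atomics; the structure and function names are hypothetical and this is not the refscale code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct obj_model {
    atomic_int refcnt;   /* 0 means the slot is free or being reused */
    atomic_int key;      /* identity, which changes when the slot is reused */
};

/* Take a reference unless the count has already dropped to zero. */
static bool get_unless_zero_model(atomic_int *ref)
{
    int old = atomic_load(ref);

    while (old != 0) {
        /* On failure, old is reloaded and the loop rechecks it. */
        if (atomic_compare_exchange_weak(ref, &old, old + 1))
            return true;
    }
    return false;
}

static void put_ref_model(atomic_int *ref)
{
    int old = atomic_fetch_sub(ref, 1);

    assert(old > 0);
}

/* Acquire a stable reference only if the slot still holds want_key. */
static bool acquire_if_key_model(struct obj_model *p, int want_key)
{
    if (!get_unless_zero_model(&p->refcnt))
        return false;               /* slot is currently free */
    if (atomic_load(&p->key) != want_key) {
        put_ref_model(&p->refcnt);  /* slot was reused for another object */
        return false;
    }
    return true;
}
```

The identity recheck must come after the reference is held, because only the reference prevents the slot from being reused again.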
[ paulmck: Apply feedback from kernel test robot. ]
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Zhen Lei [Thu, 24 Nov 2022 06:22:03 +0000 (14:22 +0800)]
doc: Fix htmldocs build warnings of stallwarn.rst
Documentation/RCU/stallwarn.rst:
401: WARNING: Literal block expected; none found.
428: WARNING: Literal block expected; none found.
445: WARNING: Literal block expected; none found.
459: WARNING: Literal block expected; none found.
468: WARNING: Literal block expected; none found.
The literal block needs to be indented, so this commit adds two spaces
to each line.
In addition, ':', which is used as a boundary in the literal block, is
replaced by '|'.
Link: https://lore.kernel.org/linux-next/20221123163255.48653674@canb.auug.org.au/
Fixes: 3d2788ba4573 ("doc: Document CONFIG_RCU_CPU_STALL_CPUTIME=y stall information")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Tested-by: Akira Yokosawa <akiyks@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Akira Yokosawa [Wed, 23 Nov 2022 09:29:00 +0000 (18:29 +0900)]
docs/RCU/rcubarrier: Right-adjust line numbers in code snippets
Line numbers in code snippets in rcubarrier.rst have been left adjusted
since commit
4af498306ffd ("doc: Convert to rcubarrier.txt to ReST").
This might have been because right adjusting them had confused Sphinx.
The rules around a literal block in reST are:
- Need a blank line above it.
- A line with the same indent level as the line above it is regarded
as the end of it.
Those line numbers can be right adjusted by keeping indents at two-
digit numbers. While at it, add some spaces between the column of line
numbers and the code area for better readability.
Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Akira Yokosawa [Wed, 23 Nov 2022 09:23:09 +0000 (18:23 +0900)]
docs/RCU/rcubarrier: Adjust 'Answer' parts of QQs as definition-lists
The "Answer" parts of QQs deviate from the proper format of definition-lists
as described at [1] and are not rendered as such.
Adjust them.
Link: [1] https://docutils.sourceforge.io/docs/ref/rst/restructuredtext.html#definition-lists
Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Zhen Lei [Sat, 19 Nov 2022 09:25:07 +0000 (17:25 +0800)]
doc: Document CONFIG_RCU_CPU_STALL_CPUTIME=y stall information
This commit documents the additional RCU CPU stall warning output
produced by kernels built with CONFIG_RCU_CPU_STALL_CPUTIME=y or booted
with rcupdate.rcu_cpu_stall_cputime=1.
[ paulmck: Apply wordsmithing. ]
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Sat, 5 Nov 2022 01:00:52 +0000 (18:00 -0700)]
doc: Update whatisRCU.rst
This commit updates whatisRCU.rst with wordsmithing and updates provoked
by the passage of time.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Fri, 4 Nov 2022 23:55:03 +0000 (16:55 -0700)]
doc: Update rcu.rst URL to RCU publications
Also add the more recent thicket of Google Documents.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Fri, 4 Nov 2022 23:45:55 +0000 (16:45 -0700)]
doc: Update UP.rst
This commit updates UP.rst to reflect changes over the past few years,
including the advent of userspace RCU libraries for constrained systems.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Fri, 4 Nov 2022 22:34:03 +0000 (15:34 -0700)]
doc: Update torture.rst
This commit updates torture.rst with wordsmithing and the addition of a
few more scripts.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Fri, 4 Nov 2022 21:39:32 +0000 (14:39 -0700)]
doc: Update stallwarn.rst
This commit updates stallwarn.rst to reflect RCU additions and changes
over the past few years.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Fri, 4 Nov 2022 21:16:48 +0000 (14:16 -0700)]
doc: Update rcu.rst
This commit provides a couple of updates based on the inexorable passage
of time.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Fri, 4 Nov 2022 21:03:15 +0000 (14:03 -0700)]
doc: Update and wordsmith rculist_nulls.rst
Do some wordsmithing and breaking up of RCU readers.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Thu, 15 Dec 2022 04:53:00 +0000 (20:53 -0800)]
rcu: Permit string-valued Kconfig options in kvm.sh
This commit upgrades the kvm.sh script's --kconfig parameter to accept
string-valued Kconfig options with double-quoted string values.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Thu, 15 Dec 2022 00:37:27 +0000 (16:37 -0800)]
torture: Permit double-quoted-string Kconfig options
Currently, the presence of any quoted-string Kconfig option in the
scenario files or the CFcommon file (aside from the special-cased
CONFIG_INITRAMFS_SOURCE option) will result in an "improperly set"
diagnostic. This commit updates configcheck.sh to strip double quotes
in order to permit string-valued Kconfig options to be handled correctly.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Tiezhu Yang [Wed, 23 Nov 2022 01:03:28 +0000 (09:03 +0800)]
selftests: rcutorture: Use "grep -E" instead of "egrep"
The latest version of grep is deprecating the egrep command, so that
its output contains warnings as follows:
egrep: warning: egrep is obsolescent; using grep -E
Fix this using "grep -E" instead.
sed -i "s/egrep/grep -E/g" `grep egrep -rwl tools/testing/selftests/rcutorture`
Here are the steps to install the latest grep:
wget http://ftp.gnu.org/gnu/grep/grep-3.8.tar.gz
tar xf grep-3.8.tar.gz
cd grep-3.8 && ./configure && make
sudo make install
export PATH=/usr/local/bin:$PATH
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Mon, 21 Nov 2022 03:56:27 +0000 (19:56 -0800)]
torture: make kvm-find-errors.sh check for compressed vmlinux files
Under some conditions, a given run's vmlinux file will be compressed,
so that it is named vmlinux.xz rather than vmlinux. In such cases,
kvm-find-errors.sh will complain about the nonexistence of vmlinux.
This commit therefore causes kvm-find-errors.sh to check for vmlinux.xz
as well as for vmlinux.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Mon, 7 Nov 2022 04:58:15 +0000 (20:58 -0800)]
refscale: Provide for initialization failure
Current tests all have init() functions that are guaranteed to succeed.
But upcoming tests will need to allocate memory, thus possibly failing.
This commit therefore handles init() function failure.
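
One way to handle a fallible init() function is sketched below in userspace C. It assumes that init() returns a bool and may be absent; the structure, the helper names, and the -1 return value are hypothetical stand-ins rather than the refscale definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct ref_scale_ops_model {
    bool (*init)(void);      /* returns false on failure; may be NULL */
    void (*cleanup)(void);   /* may be NULL */
    const char *name;
};

static int cleanups_run;

static bool init_ok_model(void) { return true; }
static bool init_fails_model(void) { return false; } /* e.g. allocation failed */
static void count_cleanup_model(void) { cleanups_run++; }

/* Returns 0 on success, or -1 when init() reports failure. */
static int ref_scale_start_model(const struct ref_scale_ops_model *ops)
{
    assert(ops != NULL);
    if (ops->init && !ops->init())
        return -1;   /* nothing was set up, so skip test and cleanup */
    /* The measurement itself would run here. */
    if (ops->cleanup)
        ops->cleanup();
    return 0;
}
```

The key design point is that a failed init() must short-circuit both the test body and the matching cleanup.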
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Mon, 7 Nov 2022 02:16:14 +0000 (18:16 -0800)]
torture: Seed torture_random_state on CPU
The DEFINE_TORTURE_RANDOM_PERCPU() macro defines per-CPU random-number
generators for torture testing, but the seeds for each CPU's instance
will be identical if they are first used at the same time. This commit
therefore adds the CPU number to the mix when reseeding.
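
The problem and the fix can be illustrated with a small userspace generator. The sketch assumes that reseeding adds a clock reading to the state; the LCG constants are Knuth's MMIX values chosen for illustration, not the constants used by the torture-test code, and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct rand_state_model {
    uint64_t state;
};

/* Reseed from the clock, optionally mixing in the CPU number. */
static void reseed_model(struct rand_state_model *r, uint64_t clock_now,
                         unsigned int cpu, bool mix_cpu)
{
    assert(r != 0);
    r->state += clock_now + (mix_cpu ? cpu : 0);
}

/* One step of a 64-bit linear congruential generator. */
static uint64_t next_model(struct rand_state_model *r)
{
    r->state = r->state * 6364136223846793005ULL + 1442695040888963407ULL;
    return r->state >> 33;
}
```

Two per-CPU instances reseeded with the same clock value produce identical streams unless something CPU-specific, such as the CPU number, is mixed in.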
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Zqiang [Sat, 3 Dec 2022 02:25:03 +0000 (10:25 +0800)]
rcu-tasks: Handle queue-shrink/callback-enqueue race condition
The rcu_tasks_need_gpcb() function determines whether or not: (1) There are
callbacks needing another grace period, (2) There are callbacks ready
to be invoked, and (3) It would be a good time to shrink back down to a
single-CPU callback list. This third case is interesting because some
other CPU might be adding new callbacks, which might suddenly make this
a very bad time to be shrinking.
This is currently handled by requiring call_rcu_tasks_generic() to
enqueue callbacks under the protection of rcu_read_lock() and requiring
rcu_tasks_need_gpcb() to wait for an RCU grace period to elapse before
finalizing the transition. This works well in practice.
Unfortunately, the current code assumes that a grace period whose end is
detected by the poll_state_synchronize_rcu() in the second "if" condition
actually ended before the earlier code counted the callbacks queued on
CPUs other than CPU 0 (local variable "ncbsnz"). Given the current code,
it is possible that a long-delayed call_rcu_tasks_generic() invocation
will queue a callback on a non-zero CPU after these CPUs have had their
callbacks counted and zero has been stored to ncbsnz. Such a callback
would trigger the WARN_ON_ONCE() in the second "if" statement.
To see this, consider the following sequence of events:
o CPU 0 invokes rcu_tasks_one_gp(), and counts fewer than
rcu_task_collapse_lim callbacks. It sees at least one
callback queued on some other CPU, thus setting ncbsnz
to a non-zero value.
o CPU 1 invokes call_rcu_tasks_generic() and loads 42 from
->percpu_enqueue_lim. It therefore decides to enqueue its
callback onto CPU 1's callback list, but is delayed.
o CPU 0 sees that rcu_task_cb_adjust is non-zero and that the number
of callbacks does not exceed rcu_task_collapse_lim. It therefore
checks percpu_enqueue_lim, and sees that its value is greater
than the value one. CPU 0 therefore starts the shift back
to a single callback list. It sets ->percpu_enqueue_lim to 1,
but CPU 1 has already read the old value of 42. It also gets
a grace-period state value from get_state_synchronize_rcu().
o CPU 0 sees that ncbsnz is non-zero in its second "if" statement,
so it declines to finalize the shrink operation.
o CPU 0 again invokes rcu_tasks_one_gp(), and counts fewer than
rcu_task_collapse_lim callbacks. It also sees that there are
no callbacks queued on any other CPU, and thus sets ncbsnz to zero.
o CPU 1 resumes execution and enqueues its callback onto its own
list. This invalidates the value of ncbsnz.
o CPU 0 sees that rcu_task_cb_adjust is non-zero and that the number
of callbacks does not exceed rcu_task_collapse_lim. It therefore
checks percpu_enqueue_lim, but sees that its value is already
unity. It therefore does not get a new grace-period state value.
o CPU 0 sees that rcu_task_cb_adjust is non-zero, ncbsnz is zero,
and that poll_state_synchronize_rcu() says that the grace period
has completed. It therefore finalizes the shrink operation,
setting ->percpu_dequeue_lim to the value one.
o CPU 0 does a debug check, scanning the other CPUs' callback lists.
It sees that CPU 1's list has a callback, so it (rightly)
triggers the WARN_ON_ONCE(). After all, the new value of
->percpu_dequeue_lim says to not bother looking at CPU 1's
callback list, which means that this callback will never be
invoked. This can result in hangs and maybe even OOMs.
Based on long experience with rcutorture, this is an extremely
low-probability race condition, but it really can happen, especially in
preemptible kernels or within guest OSes.
This commit therefore checks for completion of the grace period
before counting callbacks. With this change, in the above failure
scenario CPU 0 would know not to prematurely end the shrink operation
because the grace period would not have completed before the count
operation started.
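
The ordering change can be modeled deterministically in userspace. The sketch below collapses the scenario to one delayed enqueuer on CPU 1 that is still inside its RCU read-side critical section, so the grace period cannot complete until it resumes. All names are hypothetical and this is a model of the reasoning, not the kernel code.

```c
#include <assert.h>
#include <stdbool.h>

struct shrink_model {
    bool reader_active;     /* CPU 1 is still inside its RCU reader */
    unsigned int cpu1_cbs;  /* callbacks queued on CPU 1's list */
};

/* The grace period ends only once the delayed enqueuer leaves its reader. */
static bool gp_done_model(const struct shrink_model *m)
{
    return !m->reader_active;
}

/* CPU 1 resumes: it enqueues on its own list, then exits its reader. */
static void cpu1_resume_model(struct shrink_model *m)
{
    assert(m->reader_active);
    m->cpu1_cbs++;
    m->reader_active = false;
}

/* One shrink decision; returns true if the shrink was finalized. */
static bool shrink_pass_model(struct shrink_model *m, bool poll_before_count)
{
    bool gpdone = false;
    unsigned int ncbsnz;

    if (poll_before_count)
        gpdone = gp_done_model(m);  /* new order: sample, then count */
    ncbsnz = m->cpu1_cbs;           /* count callbacks on non-zero CPUs */
    cpu1_resume_model(m);           /* the race: enqueue just after count */
    if (!poll_before_count)
        gpdone = gp_done_model(m);  /* old order: count, then sample */
    return ncbsnz == 0 && gpdone;
}

/* A finalized shrink with a callback left on CPU 1 strands that callback. */
static bool callback_stranded_model(const struct shrink_model *m,
                                    bool finalized)
{
    return finalized && m->cpu1_cbs > 0;
}
```

With the old order, the model finalizes the shrink even though CPU 1 now holds a callback. With the grace-period check moved ahead of the count, the same interleaving leaves the shrink unfinalized, so the callback is still found on a later pass.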
[ paulmck: Adjust grace-period end rather than adding RCU reader. ]
[ paulmck: Avoid spurious WARN_ON_ONCE() with ->percpu_dequeue_lim check. ]
Signed-off-by: Zqiang <qiang1.zhang@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Zqiang [Wed, 30 Nov 2022 23:45:33 +0000 (07:45 +0800)]
rcu-tasks: Make rude RCU-Tasks work well with CPU hotplug
The synchronize_rcu_tasks_rude() function invokes rcu_tasks_rude_wait_gp()
to wait one rude RCU-tasks grace period. The rcu_tasks_rude_wait_gp()
function in turn checks if there is only a single online CPU. If so, it
will immediately return, because a call to synchronize_rcu_tasks_rude()
is by definition a grace period on a single-CPU system. (We could
have blocked!)
Unfortunately, this check uses num_online_cpus() without synchronization,
which can result in too-short grace periods. To see this, consider the
following scenario:
        CPU0                                    CPU1 (going offline)
                                                migration/1 task:
                                                cpu_stopper_thread
                                                 -> take_cpu_down
                                                    -> _cpu_disable
                                                     (dec __num_online_cpus)
                                                    ->cpuhp_invoke_callback
                                                          preempt_disable
                                                          access old_data0
        task1
        del old_data0                                     .....
        synchronize_rcu_tasks_rude()
        task1 schedule out
        ....
        task2 schedule in
        rcu_tasks_rude_wait_gp()
            ->__num_online_cpus == 1
              ->return
        ....
        task1 schedule in
        ->free old_data0
                                                          preempt_enable
When CPU1 decrements __num_online_cpus, its value becomes 1. However,
CPU1 has not finished going offline, and will take one last trip through
the scheduler and the idle loop before it actually stops executing
instructions. Because synchronize_rcu_tasks_rude() is mostly used for
tracing, and because both the scheduler and the idle loop can be traced,
this means that CPU0's prematurely ended grace period might disrupt the
tracing on CPU1. Given that this disruption might include CPU1 executing
instructions in memory that was just now freed (and maybe reallocated),
this is a matter of some concern.
This commit therefore removes that problematic single-CPU check from the
rcu_tasks_rude_wait_gp() function. This dispenses with the single-CPU
optimization, but there is no evidence indicating that this optimization
is important. In addition, synchronize_rcu_tasks_generic() contains a
similar optimization (albeit only for early boot), which also splats.
(As in exactly why are you invoking synchronize_rcu_tasks_rude() so
early in boot, anyway???)
It is OK for the synchronize_rcu_tasks_rude() function's check to be
unsynchronized because the only time that this check can evaluate to
true is when there is only a single CPU running with preemption
disabled.
While in the area, this commit also fixes a minor bug in which a
call to synchronize_rcu_tasks_rude() would instead be attributed to
synchronize_rcu_tasks().
[ paulmck: Add "synchronize_" prefix and "()" suffix. ]
Signed-off-by: Zqiang <qiang1.zhang@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Frederic Weisbecker [Fri, 25 Nov 2022 13:55:00 +0000 (14:55 +0100)]
rcu-tasks: Fix synchronize_rcu_tasks() VS zap_pid_ns_processes()
RCU Tasks and PID-namespace unshare can interact in do_exit() in a
complicated circular dependency:
1) TASK A calls unshare(CLONE_NEWPID), this creates a new PID namespace
that every subsequent child of TASK A will belong to. But TASK A
doesn't itself belong to that new PID namespace.
2) TASK A forks() and creates TASK B. TASK A stays attached to its PID
namespace (let's say PID_NS1) and TASK B is the first task belonging
to the new PID namespace created by unshare() (let's call it PID_NS2).
3) Since TASK B is the first task attached to PID_NS2, it becomes the
PID_NS2 child reaper.
4) TASK A forks() again and creates TASK C which gets attached to PID_NS2.
Note how TASK C has TASK A as a parent (belonging to PID_NS1) but has
TASK B (belonging to PID_NS2) as a pid_namespace child_reaper.
5) TASK B exits and since it is the child reaper for PID_NS2, it has to
kill all other tasks attached to PID_NS2, and wait for all of them to
die before getting reaped itself (zap_pid_ns_processes()).
6) TASK A calls synchronize_rcu_tasks() which leads to
synchronize_srcu(&tasks_rcu_exit_srcu).
7) TASK B is waiting for TASK C to get reaped. But TASK B is under a
tasks_rcu_exit_srcu SRCU critical section (exit_notify() is between
exit_tasks_rcu_start() and exit_tasks_rcu_finish()), blocking TASK A.
8) TASK C exits and since TASK A is its parent, it waits for it to reap
TASK C, but it can't because TASK A waits for TASK B that waits for
TASK C.
Pid_namespace semantics can hardly be changed at this point. But the
coverage of tasks_rcu_exit_srcu can be reduced instead.
The current task is assumed not to be concurrently reapable at this
stage of exit_notify() and therefore tasks_rcu_exit_srcu can be
temporarily relaxed without breaking its constraints, providing a way
out of the deadlock scenario.
[ paulmck: Fix build failure by adding additional declaration. ]
Fixes: 3f95aa81d265 ("rcu: Make TASKS_RCU handle tasks that are almost done exiting")
Reported-by: Pengfei Xu <pengfei.xu@intel.com>
Suggested-by: Boqun Feng <boqun.feng@gmail.com>
Suggested-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Suggested-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Frederic Weisbecker [Fri, 25 Nov 2022 13:54:59 +0000 (14:54 +0100)]
rcu-tasks: Remove preemption disablement around srcu_read_[un]lock() calls
Ever since the following commit:
5a41344a3d83 ("srcu: Simplify __srcu_read_unlock() via this_cpu_dec()")
SRCU no longer relies on preemption being disabled in order to modify
the per-CPU counter. And even before that commit, the disabling was done
from within the API itself.
Therefore, and after further checking, it appears to be safe to remove
the preemption disablement around __srcu_read_[un]lock() in
exit_tasks_rcu_start() and exit_tasks_rcu_finish().
Suggested-by: Boqun Feng <boqun.feng@gmail.com>
Suggested-by: Paul E. McKenney <paulmck@kernel.org>
Suggested-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Frederic Weisbecker [Fri, 25 Nov 2022 13:54:58 +0000 (14:54 +0100)]
rcu-tasks: Improve comments explaining tasks_rcu_exit_srcu purpose
Make sure we don't need to look again into the depths of git blame in
order not to miss a subtle part about how rcu-tasks is dealing with
exiting tasks.
Suggested-by: Boqun Feng <boqun.feng@gmail.com>
Suggested-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Suggested-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Zqiang [Mon, 21 Nov 2022 15:01:50 +0000 (23:01 +0800)]
rcu-tasks: Use accurate runstart time for RCU Tasks boot-time testing
Currently, test_rcu_tasks_callback() reads from the jiffies counter only
once when this function is invoked. This introduces inaccuracies because
of the latencies induced by the synchronize_rcu_tasks*() invocations.
This commit therefore re-reads the jiffies counter at the beginning
of each test, thus avoiding penalizing later tests for the latencies
induced by earlier tests.
In other words, the jiffies counter is re-fetched as the runstart time
at the start of each RCU Tasks test.
Signed-off-by: Zqiang <qiang1.zhang@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Wed, 21 Dec 2022 16:32:51 +0000 (08:32 -0800)]
srcu: Update comment after the index flip
Because there is not guaranteed to be a full memory barrier between
the ->srcu_unlock_count increment of an srcu_read_unlock() and the
->srcu_lock_count increment of the next srcu_read_lock(), this next
srcu_read_lock() is not guaranteed to see the effect of the index flip
just prior to this comment. However, this next srcu_read_lock() will
execute a full memory barrier, so the srcu_read_lock() after that is
guaranteed to see that index flip.
This guarantee is illustrated by the following diagram of events and
the litmus test following that.
------------------------------------------------------------------------
READER                  UPDATER
-------------           ----------
                           // idx is initially 0.
                        srcu_flip() {
                           smp_mb();
// RSCS
srcu_read_unlock() {
   smp_mb();
                           idx++; // P
                           smp_mb(); // QQ
                        }
                        srcu_readers_unlock_idx(0) {
   ,--counted------------  count all unlock[0]; // Q
   |
   unlock[0]++; // X
}
                           smp_mb();
srcu_read_lock() {
   READ(idx) = 0;           ,---- count all lock[0]; // contributes imbalance of 1.
   lock[0]++; ----counted   |
   smp_mb(); // PP      }   |
}                           |
                            |
// RSCS not going to affect above scan
                            |
srcu_read_unlock() {        |
   smp_mb();                |
   unlock[0]++;             |
}                           |
                           /
                          /
srcu_read_lock() {       |
   READ(idx); // Y -----cannot be counted because of P (has to sample idx as 1)
   lock[1]++;
   ...
}
------------------------------------------------------------------------
This makes it similar to the store buffer pattern. Using X, Y, P and Q
annotated above, we get:
------------------------------------------------------------------------
READER                  UPDATER
X (write)               P (write)
smp_mb(); //PP          smp_mb(); //QQ
Y (read)                Q (read)
------------------------------------------------------------------------
ASCII art courtesy of Joel Fernandes.
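The store-buffer argument can be checked with a small user-space
enumeration. This sketch assumes that the two smp_mb() calls make the
four accesses behave as if sequentially consistent, so it simply tries
every interleaving that respects program order. The function and
variable names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

static int count_bits(unsigned v)
{
	int n = 0;

	for (; v; v >>= 1)
		n += v & 1;
	return n;
}

/*
 * Returns true if some interleaving lets both reads miss the other
 * side's write, which is the outcome the barriers are meant to forbid.
 */
static bool sb_both_reads_miss(void)
{
	unsigned order;
	int step;

	/* Bit i of "order" set means that step i belongs to the reader. */
	for (order = 0; order < 16; order++) {
		int x = 0, p = 0;	/* variables written by X and P */
		int y = -1, q = -1;	/* values observed by Y and Q */
		int reader_pc = 0, updater_pc = 0;

		if (count_bits(order) != 2)
			continue;	/* each side runs exactly two steps */
		for (step = 0; step < 4; step++) {
			if (order & (1u << step)) {
				if (reader_pc++ == 0)
					x = 1;	/* X: reader's write */
				else
					y = p;	/* Y: reader's read */
			} else {
				if (updater_pc++ == 0)
					p = 1;	/* P: updater's write */
				else
					q = x;	/* Q: updater's read */
			}
		}
		if (y == 0 && q == 0)
			return true;
	}
	return false;
}
```

Under that assumption no interleaving has both Y and Q miss, which
matches the claim that either the reader samples the new index or the
updater counts the reader's unlock.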
Reported-by: Joel Fernandes <joel@joelfernandes.org>
Reported-by: Boqun Feng <boqun.feng@gmail.com>
Reported-by: Frederic Weisbecker <frederic@kernel.org>
Reported-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Wed, 14 Dec 2022 18:50:30 +0000 (10:50 -0800)]
srcu: Yet more detail for srcu_readers_active_idx_check() comments
The comment in srcu_readers_active_idx_check() following the smp_mb()
is out of date, hailing from a simpler time when preemption was disabled
across the bulk of __srcu_read_lock(). The fact that preemption was
disabled meant that the number of tasks that had fetched the old index
but not yet incremented counters was limited by the number of CPUs.
In our more complex modern times, the number of CPUs is no longer a limit.
This commit therefore updates this comment, additionally giving more
memory-ordering detail.
[ paulmck: Apply Nt->Nc feedback from Joel Fernandes. ]
Reported-by: Boqun Feng <boqun.feng@gmail.com>
Reported-by: Frederic Weisbecker <frederic@kernel.org>
Reported-by: "Joel Fernandes (Google)" <joel@joelfernandes.org>
Reported-by: Neeraj Upadhyay <neeraj.iitr10@gmail.com>
Reported-by: Uladzislau Rezki <urezki@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Pingfan Liu [Wed, 23 Nov 2022 13:56:37 +0000 (21:56 +0800)]
srcu: Remove needless rcu_seq_done() check while holding read lock
The srcu_gp_start_if_needed() function now read-holds the srcu_struct
whose grace period is being started, which means that the corresponding
SRCU grace period cannot end. This in turn means that the SRCU
grace-period sequence number returned by rcu_seq_snap() cannot expire
during this time. And that means that the calls to rcu_seq_done() in
srcu_funnel_exp_start() and srcu_funnel_gp_start() can never return true.
This commit therefore removes these rcu_seq_done() checks, but adds checks
in kernels built with CONFIG_PROVE_RCU=y that splat if rcu_seq_done()
does somehow return true.
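The snapshot-versus-done relationship can be illustrated with a
user-space model. This is a sketch, not the kernel code: it assumes a
two-bit state field in the low-order bits of the sequence number, omits
all memory barriers, and uses hypothetical helper names modeled on
rcu_seq_snap() and rcu_seq_done().

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define SEQ_CTR_SHIFT	2	/* assumed width of the state field */
#define SEQ_STATE_MASK	((1UL << SEQ_CTR_SHIFT) - 1)

/* Wrap-safe "a >= b" for unsigned long sequence numbers. */
static bool seq_cmp_ge(unsigned long a, unsigned long b)
{
	return ULONG_MAX / 2 >= a - b;
}

/* Sequence value that marks the end of a full grace period from now. */
static unsigned long seq_snap(unsigned long cur)
{
	return (cur + 2 * SEQ_STATE_MASK + 1) & ~SEQ_STATE_MASK;
}

/* Has the sequence number reached the snapshot? */
static bool seq_done(unsigned long cur, unsigned long snap)
{
	return seq_cmp_ge(cur, snap);
}
```

For example, a snapshot taken while the sequence number is 1 (a grace
period in progress) is 8, and seq_done() stays false until the sequence
number reaches 8. If a reader prevents the grace period from ending, the
sequence number cannot get there.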
[ paulmck: Rearrange checks to handle kernels built with lockdep. ]
Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: rcu@vger.kernel.org
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Fri, 25 Nov 2022 16:42:02 +0000 (08:42 -0800)]
rcu: Add test code for semaphore-like SRCU readers
This commit adds trivial test code for srcu_down_read() and
srcu_up_read().
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Wed, 23 Nov 2022 23:49:55 +0000 (15:49 -0800)]
rcu: Add srcu_down_read() and srcu_up_read()
A pair of matching srcu_read_lock() and srcu_read_unlock() invocations
must take place within the same context, for example, within the same
task. Otherwise, lockdep complains, as is the right thing to do for
most use cases.
However, there are use cases involving asynchronous I/O where the
SRCU reader needs to begin on one task and end on another. This commit
therefore supplies the semaphore-like srcu_down_read() and srcu_up_read(),
which act like srcu_read_lock() and srcu_read_unlock(), but permit
srcu_up_read() to be invoked in a different context than the matching
srcu_down_read().
Neither srcu_down_read() nor srcu_up_read() may be invoked from an
NMI handler.
Reported-by: Jan Kara <jack@suse.cz>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Tested-by: Amir Goldstein <amir73il@gmail.com>
Pingfan Liu [Wed, 16 Nov 2022 01:52:44 +0000 (09:52 +0800)]
srcu: Fix the comparision in srcu_invl_snp_seq()
A grace-period sequence number contains two fields: counter and
state. SRCU_SNP_INIT_SEQ provides a guaranteed invalid value for
grace-period sequence numbers in newly allocated srcu_node structures'
->srcu_have_cbs[] and ->srcu_gp_seq_needed_exp fields. The point of the
comparison in srcu_invl_snp_seq() is not to detect invalid grace-period
sequence numbers in general, but rather to detect a newly allocated
srcu_node structure whose ->srcu_have_cbs[] and ->srcu_gp_seq_needed_exp
fields need to be brought into line with the srcu_struct structure's
->srcu_gp_seq field.
This commit therefore causes srcu_invl_snp_seq() to compare both fields
of the specified grace-period sequence number.
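The difference between the two comparisons can be sketched as follows.
This is a user-space illustration under assumptions: the two-bit state
mask, the marker value 0x2, and the premise that the earlier check
looked only at the state field are not taken from this commit message,
and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define SEQ_STATE_MASK	0x3UL	/* assumed: low two bits hold the state */
#define SNP_INIT_SEQ	0x2UL	/* assumed stand-in for SRCU_SNP_INIT_SEQ */

/* Compares only the state field, so it also matches 0x6, 0xa, ... */
static bool invl_state_only(unsigned long s)
{
	return (s & SEQ_STATE_MASK) == SNP_INIT_SEQ;
}

/* Compares both fields, so it matches only the initial marker value. */
static bool invl_both_fields(unsigned long s)
{
	return s == SNP_INIT_SEQ;
}
```

With these definitions the value 0x6 (counter 1, state 2) is flagged by
the state-only check but not by the full comparison.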
Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: <rcu@vger.kernel.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Pingfan Liu [Wed, 16 Nov 2022 01:52:43 +0000 (09:52 +0800)]
srcu: Fix a misspelling in comment
s/srcu_gq_seq/srcu_gp_seq/
Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: <rcu@vger.kernel.org>
Reviewed-by: Mukesh Ojha <quic_mojha@quicinc.com>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Pingfan Liu [Mon, 31 Oct 2022 01:52:37 +0000 (09:52 +0800)]
srcu: Delegate work to the boot cpu if using SRCU_SIZE_SMALL
Commit 994f706872e6 ("srcu: Make Tree SRCU able to operate without
snp_node array") assumes that CPU 0 is always online. However, there
really are situations when some other CPU is the boot CPU, for example,
when booting a kdump kernel with the maxcpus=1 boot parameter.
On PowerPC, the kdump kernel can hang as follows:
...
[ 1.740036] systemd[1]: Hostname set to <xyz.com>
[ 243.686240] INFO: task systemd:1 blocked for more than 122 seconds.
[ 243.686264] Not tainted 6.1.0-rc1 #1
[ 243.686272] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 243.686281] task:systemd state:D stack:0 pid:1 ppid:0 flags:0x00042000
[ 243.686296] Call Trace:
[ 243.686301] [c000000016657640] [c000000016657670] 0xc000000016657670 (unreliable)
[ 243.686317] [c000000016657830] [c00000001001dec0] __switch_to+0x130/0x220
[ 243.686333] [c000000016657890] [c000000010f607b8] __schedule+0x1f8/0x580
[ 243.686347] [c000000016657940] [c000000010f60bb4] schedule+0x74/0x140
[ 243.686361] [c0000000166579b0] [c000000010f699b8] schedule_timeout+0x168/0x1c0
[ 243.686374] [c000000016657a80] [c000000010f61de8] __wait_for_common+0x148/0x360
[ 243.686387] [c000000016657b20] [c000000010176bb0] __flush_work.isra.0+0x1c0/0x3d0
[ 243.686401] [c000000016657bb0] [c0000000105f2768] fsnotify_wait_marks_destroyed+0x28/0x40
[ 243.686415] [c000000016657bd0] [c0000000105f21b8] fsnotify_destroy_group+0x68/0x160
[ 243.686428] [c000000016657c40] [c0000000105f6500] inotify_release+0x30/0xa0
[ 243.686440] [c000000016657cb0] [c0000000105751a8] __fput+0xc8/0x350
[ 243.686452] [c000000016657d00] [c00000001017d524] task_work_run+0xe4/0x170
[ 243.686464] [c000000016657d50] [c000000010020e94] do_notify_resume+0x134/0x140
[ 243.686478] [c000000016657d80] [c00000001002eb18] interrupt_exit_user_prepare_main+0x198/0x270
[ 243.686493] [c000000016657de0] [c00000001002ec60] syscall_exit_prepare+0x70/0x180
[ 243.686505] [c000000016657e10] [c00000001000bf7c] system_call_vectored_common+0xfc/0x280
[ 243.686520] --- interrupt: 3000 at 0x7fffa47d5ba4
[ 243.686528] NIP: 00007fffa47d5ba4 LR: 0000000000000000 CTR: 0000000000000000
[ 243.686538] REGS: c000000016657e80 TRAP: 3000 Not tainted (6.1.0-rc1)
[ 243.686548] MSR: 800000000000d033 <SF,EE,PR,ME,IR,DR,RI,LE> CR: 42044440 XER: 00000000
[ 243.686572] IRQMASK: 0
[ 243.686572] GPR00: 0000000000000006 00007ffffa606710 00007fffa48e7200 0000000000000000
[ 243.686572] GPR04: 0000000000000002 000000000000000a 0000000000000000 0000000000000001
[ 243.686572] GPR08: 000001000c172dd0 0000000000000000 0000000000000000 0000000000000000
[ 243.686572] GPR12: 0000000000000000 00007fffa4ff4bc0 0000000000000000 0000000000000000
[ 243.686572] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 243.686572] GPR20: 0000000132dfdc50 000000000000000e 0000000000189375 0000000000000000
[ 243.686572] GPR24: 00007ffffa606ae0 0000000000000005 000001000c185490 000001000c172570
[ 243.686572] GPR28: 000001000c172990 000001000c184850 000001000c172e00 00007fffa4fedd98
[ 243.686683] NIP [00007fffa47d5ba4] 0x7fffa47d5ba4
[ 243.686691] LR [0000000000000000] 0x0
[ 243.686698] --- interrupt: 3000
[ 243.686708] INFO: task kworker/u16:1:24 blocked for more than 122 seconds.
[ 243.686717] Not tainted 6.1.0-rc1 #1
[ 243.686724] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 243.686733] task:kworker/u16:1 state:D stack:0 pid:24 ppid:2 flags:0x00000800
[ 243.686747] Workqueue: events_unbound fsnotify_mark_destroy_workfn
[ 243.686758] Call Trace:
[ 243.686762] [c0000000166736e0] [c00000004fd91000] 0xc00000004fd91000 (unreliable)
[ 243.686775] [c0000000166738d0] [c00000001001dec0] __switch_to+0x130/0x220
[ 243.686788] [c000000016673930] [c000000010f607b8] __schedule+0x1f8/0x580
[ 243.686801] [c0000000166739e0] [c000000010f60bb4] schedule+0x74/0x140
[ 243.686814] [c000000016673a50] [c000000010f699b8] schedule_timeout+0x168/0x1c0
[ 243.686827] [c000000016673b20] [c000000010f61de8] __wait_for_common+0x148/0x360
[ 243.686840] [c000000016673bc0] [c000000010210840] __synchronize_srcu.part.0+0xa0/0xe0
[ 243.686855] [c000000016673c30] [c0000000105f2c64] fsnotify_mark_destroy_workfn+0xc4/0x1a0
[ 243.686868] [c000000016673ca0] [c000000010174ea8] process_one_work+0x2a8/0x570
[ 243.686882] [c000000016673d40] [c000000010175208] worker_thread+0x98/0x5e0
[ 243.686895] [c000000016673dc0] [c0000000101828d4] kthread+0x124/0x130
[ 243.686908] [c000000016673e10] [c00000001000cd40] ret_from_kernel_thread+0x5c/0x64
[ 366.566274] INFO: task systemd:1 blocked for more than 245 seconds.
[ 366.566298] Not tainted 6.1.0-rc1 #1
[ 366.566305] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 366.566314] task:systemd state:D stack:0 pid:1 ppid:0 flags:0x00042000
[ 366.566329] Call Trace:
...
The above splat occurs because PowerPC really does use maxcpus=1
instead of nr_cpus=1 in the kernel command line. Consequently, the
(quite possibly non-zero) kdump CPU is the only online CPU in the kdump
kernel. SRCU unconditionally queues an sdp->work on CPU 0, for which no
worker thread has been created, so sdp->work will never be executed and
__synchronize_srcu() will never complete.
This commit therefore replaces CPU ID 0 with get_boot_cpu_id() in key
places in Tree SRCU. Since the CPU indicated by get_boot_cpu_id()
is guaranteed to be online, this avoids the above splat.
Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: rcu@vger.kernel.org
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Zqiang [Thu, 10 Nov 2022 07:30:13 +0000 (15:30 +0800)]
srcu: Release early_srcu resources when no longer in use
Kernels built with the CONFIG_TREE_SRCU Kconfig option set and then
booted with rcupdate.rcu_self_test=1 and srcutree.convert_to_big=1 will
test Tree SRCU during early boot. The early_srcu structure's srcu_node
array will be allocated when init_srcu_struct_fields() is invoked,
but this early_srcu structure is no longer used once the test completes.
This commit therefore invokes cleanup_srcu_struct() to free that srcu_node
array.
Signed-off-by: Zqiang <qiang1.zhang@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Uladzislau Rezki (Sony) [Wed, 14 Dec 2022 12:06:30 +0000 (13:06 +0100)]
rcu/kvfree: Split ready for reclaim objects from a batch
This patch splits the lists of objects so as to avoid sending any
through RCU that have already been queued for more than one grace
period. These long-term-resident objects are immediately freed.
The remaining short-term-resident objects are queued for later freeing
using queue_rcu_work().
This change avoids delaying workqueue handlers with synchronize_rcu()
invocations. Yes, workqueue handlers are designed to handle blocking,
but avoiding blocking when unnecessary improves performance during
low-memory situations.
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Uladzislau Rezki (Sony) [Wed, 14 Dec 2022 12:06:29 +0000 (13:06 +0100)]
rcu/kvfree: Carefully reset number of objects in krcp
The schedule_delayed_monitor_work() function relies on the count of
objects queued into any given kfree_rcu_cpu structure. This count is
used to determine how quickly to schedule passing these objects to RCU.
There are three pipes where pointers can be placed. When any pipe is
offloaded, the kfree_rcu_cpu structure's ->count counter is set to zero,
which is wrong because the other pipes might still be non-empty.
This commit therefore maintains per-pipe counters, and introduces a
krc_count() helper to access the aggregate value of those counters.
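The per-pipe accounting can be sketched in user space as follows. This
is a minimal model under assumptions: the split into one list pipe and
two page-based pipes, the field names, and the helper names are
hypothetical, and the kernel's atomic counters are replaced by plain
integers.

```c
#include <assert.h>

#define N_BULK_PIPES 2	/* assumed: two page-based pipes plus one list */

struct krc_model {
	int head_count;			/* objects on the list pipe */
	int bulk_count[N_BULK_PIPES];	/* objects on each page-based pipe */
};

/* Aggregate of all per-pipe counters, in the spirit of krc_count(). */
static int krc_model_count(const struct krc_model *k)
{
	int sum = k->head_count;

	for (int i = 0; i < N_BULK_PIPES; i++)
		sum += k->bulk_count[i];
	return sum;
}

/* Offloading one pipe zeroes only that pipe's counter. */
static void krc_model_offload_bulk(struct krc_model *k, int pipe)
{
	assert(pipe >= 0 && pipe < N_BULK_PIPES);
	k->bulk_count[pipe] = 0;
}
```

With counts of 3, 5 and 7, offloading the first page-based pipe leaves
an aggregate of 10 rather than the incorrect zero that a single shared
counter would report.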
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Uladzislau Rezki (Sony) [Fri, 2 Dec 2022 13:18:37 +0000 (14:18 +0100)]
rcu/kvfree: Use READ_ONCE() when access to krcp->head
The need_offload_krc() function is now lock-free, which gives the
compiler freedom to load old values when using plain C-language loads
from the kfree_rcu_cpu structure's ->head pointer. This commit therefore
applies READ_ONCE() to these loads.
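A user-space sketch of such a marked load is shown below. The macros
are simplified stand-ins for the kernel's READ_ONCE() and WRITE_ONCE()
(they rely on the GCC/Clang __typeof__ extension and skip the kernel's
extra type checks), and the structure and function names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins: force exactly one volatile access. */
#define READ_ONCE(x)		(*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val)	(*(volatile __typeof__(x) *)&(x) = (val))

struct node {
	struct node *next;
};

struct krc_head_model {
	struct node *head;	/* may be updated concurrently */
};

/* Lockless check: a single load that the compiler may not cache. */
static int need_offload_model(struct krc_head_model *k)
{
	return READ_ONCE(k->head) != NULL;
}
```

The updater side would pair this with WRITE_ONCE(k->head, ptr) so that
both the store and the lockless load are marked accesses.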
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Uladzislau Rezki (Sony) [Tue, 29 Nov 2022 15:58:22 +0000 (16:58 +0100)]
rcu/kvfree: Use a polled API to speedup a reclaim process
Currently all objects placed into a batch wait for a full grace period
to elapse after that batch is ready to send to RCU. However, this
can unnecessarily delay freeing of the first objects that were added
to the batch. After all, several RCU grace periods might have elapsed
since those objects were added, and if so, there is no point in further
deferring their freeing.
This commit therefore adds per-page grace-period snapshots which are
obtained from get_state_synchronize_rcu(). When the batch is ready
to be passed to call_rcu(), each page's snapshot is checked by passing
it to poll_state_synchronize_rcu(). If a given page's RCU grace period
has already elapsed, its objects are freed immediately by kvfree_rcu_bulk().
Otherwise, these objects are freed after a call to synchronize_rcu().
This approach requires that the pages be traversed in reverse order,
that is, the oldest ones first.
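The per-page snapshot check can be sketched with a toy user-space
model. Everything here is an assumption made for illustration: the
grace-period state is reduced to a single counter, no grace period is
ever in flight when a snapshot is taken, and the toy_*() helpers merely
mimic the roles of get_state_synchronize_rcu() and
poll_state_synchronize_rcu().

```c
#include <assert.h>
#include <stdbool.h>

static unsigned long gp_completed;	/* toy count of completed GPs */

/* Toy snapshot: cookie for the next full grace period. */
static unsigned long toy_get_state(void)
{
	return gp_completed + 1;
}

/* Toy poll: has the grace period named by the cookie elapsed? */
static bool toy_poll_state(unsigned long cookie)
{
	return gp_completed >= cookie;
}

struct toy_page {
	unsigned long gp_snap;	/* snapshot taken when the page was queued */
	int nr_objs;		/* pointers stored in the page */
};

/* Walk pages oldest first; count objects that can be freed right away. */
static int toy_count_ready(const struct toy_page *pages, int n)
{
	int ready = 0;

	for (int i = 0; i < n; i++) {
		if (!toy_poll_state(pages[i].gp_snap))
			break;	/* newer pages cannot be ready either */
		ready += pages[i].nr_objs;
	}
	return ready;
}
```

Walking from the oldest page allows the scan to stop at the first page
whose grace period has not yet elapsed.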
Test example:
kvm.sh --memory 10G --torture rcuscale --allcpus --duration 1 \
--kconfig CONFIG_NR_CPUS=64 \
--kconfig CONFIG_RCU_NOCB_CPU=y \
--kconfig CONFIG_RCU_NOCB_CPU_DEFAULT_ALL=y \
--kconfig CONFIG_RCU_LAZY=n \
--bootargs "rcuscale.kfree_rcu_test=1 rcuscale.kfree_nthreads=16 \
rcuscale.holdoff=20 rcuscale.kfree_loops=10000 \
torture.disable_onoff_at_boot" --trust-make
Before this commit:
Total time taken by all kfree'ers:
8535693700 ns, loops: 10000, batches: 1188, memory footprint: 2248MB
Total time taken by all kfree'ers:
8466933582 ns, loops: 10000, batches: 1157, memory footprint: 2820MB
Total time taken by all kfree'ers:
5375602446 ns, loops: 10000, batches: 1130, memory footprint: 6502MB
Total time taken by all kfree'ers:
7523283832 ns, loops: 10000, batches: 1006, memory footprint: 3343MB
Total time taken by all kfree'ers:
6459171956 ns, loops: 10000, batches: 1150, memory footprint: 6549MB
After this commit:
Total time taken by all kfree'ers:
8560060176 ns, loops: 10000, batches: 1787, memory footprint: 61MB
Total time taken by all kfree'ers:
8573885501 ns, loops: 10000, batches: 1777, memory footprint: 93MB
Total time taken by all kfree'ers:
8320000202 ns, loops: 10000, batches: 1727, memory footprint: 66MB
Total time taken by all kfree'ers:
8552718794 ns, loops: 10000, batches: 1790, memory footprint: 75MB
Total time taken by all kfree'ers:
8601368792 ns, loops: 10000, batches: 1724, memory footprint: 62MB
The reduction in memory footprint is well in excess of an order of
magnitude.
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Uladzislau Rezki (Sony) [Tue, 29 Nov 2022 15:58:21 +0000 (16:58 +0100)]
rcu/kvfree: Move need_offload_krc() out of krcp->lock
The need_offload_krc() function currently holds the krcp->lock in order
to safely check krcp->head. This commit removes the need for this lock
in that function by updating the krcp->head pointer using the WRITE_ONCE()
macro so that readers can carry out lockless loads of that pointer.
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Uladzislau Rezki (Sony) [Tue, 29 Nov 2022 15:58:20 +0000 (16:58 +0100)]
rcu/kvfree: Move bulk/list reclaim to separate functions
The kvfree_rcu() code maintains lists of pages of pointers, but also a
singly linked list, with the latter being used when memory allocation
fails. Traversal of these two types of lists is currently open coded.
This commit simplifies the code by providing kvfree_rcu_bulk() and
kvfree_rcu_list() functions, respectively, to traverse these two types
of lists. This patch does not introduce any functional change.
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Uladzislau Rezki (Sony) [Tue, 29 Nov 2022 15:58:19 +0000 (16:58 +0100)]
rcu/kvfree: Switch to a generic linked list API
This commit improves the readability and maintainability of the
kvfree_rcu() code by switching from an open-coded linked list to
the standard Linux-kernel circular doubly linked list. This patch
does not introduce any functional change.
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Uladzislau Rezki (Sony) [Tue, 25 Oct 2022 14:46:12 +0000 (16:46 +0200)]
rcu: Refactor kvfree_call_rcu() and high-level helpers
Currently, kvfree_call_rcu() takes an offset within a structure as
its second parameter, so a helper such as kvfree_rcu_arg_2() has to
convert the rcu_head and the pointer to be freed into an offset in order
to pass it. This leads to an extra conversion on macro entry.
Instead of converting, refactor the code so that the pointer to be
freed is passed directly to kvfree_call_rcu().
This patch does not make any functional change and is transparent to
all kvfree_rcu() users.
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Tue, 20 Dec 2022 02:02:20 +0000 (18:02 -0800)]
rcu: Allow expedited RCU CPU stall warnings to dump task stacks
This commit introduces the rcupdate.rcu_exp_stall_task_details kernel
boot parameter, which causes expedited RCU CPU stall warnings to dump
the stacks of any tasks blocking the current expedited grace period.
Reported-by: David Howells <dhowells@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Tue, 20 Dec 2022 01:02:20 +0000 (17:02 -0800)]
rcu: Test synchronous RCU grace periods at the end of rcu_init()
This commit tests synchronize_rcu() and synchronize_rcu_expedited()
at the end of rcu_init(), in addition to the test already at the
beginning of that function. These tests are run only in kernels built
with CONFIG_PROVE_RCU=y.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Zqiang [Thu, 15 Dec 2022 03:57:55 +0000 (11:57 +0800)]
rcu: Make rcu_blocking_is_gp() stop early-boot might_sleep()
Currently, rcu_blocking_is_gp() invokes might_sleep() even during early
boot when interrupts are disabled and before the scheduler is scheduling.
This is at best an accident waiting to happen. Therefore, this commit
moves that might_sleep() under an rcu_scheduler_active check in order
to ensure that might_sleep() is not invoked unless sleeping might actually
happen.
Signed-off-by: Zqiang <qiang1.zhang@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Fri, 16 Dec 2022 23:55:48 +0000 (15:55 -0800)]
rcu: Suppress smp_processor_id() complaint in synchronize_rcu_expedited_wait()
The normal grace period's RCU CPU stall warnings are invoked from the
scheduling-clock interrupt handler, and can thus invoke smp_processor_id()
with impunity, which allows them to directly invoke dump_cpu_task().
In contrast, the expedited grace period's RCU CPU stall warnings are
invoked from process context, which causes the dump_cpu_task() function's
calls to smp_processor_id() to complain bitterly in debug kernels.
This commit therefore causes synchronize_rcu_expedited_wait() to disable
preemption around its call to dump_cpu_task().
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Wed, 14 Dec 2022 19:41:44 +0000 (11:41 -0800)]
rcu: Make RCU_LOCKDEP_WARN() avoid early lockdep checks
Currently, RCU_LOCKDEP_WARN() checks the condition before checking
to see if lockdep is still enabled. This is necessary to avoid the
false-positive splats fixed by commit 3066820034b5dd ("rcu: Reject
RCU_LOCKDEP_WARN() false positives"). However, the current state can
result in false-positive splats during early boot before lockdep is fully
initialized. This commit therefore checks debug_lockdep_rcu_enabled()
both before and after checking the condition, thus avoiding both sets
of false-positive error reports.
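The check-before-and-after ordering can be sketched in user space as
follows. This is a model only: toy_lockdep_enabled() stands in for
debug_lockdep_rcu_enabled(), the counters exist purely for
illustration, and the kernel macro's warn-once logic is omitted.

```c
#include <assert.h>
#include <stdbool.h>

static bool lockdep_enabled;	/* toy stand-in for lockdep's state */
static int cond_evaluations;	/* how often the condition was evaluated */
static int warnings;		/* how many warnings were issued */

static bool toy_lockdep_enabled(void)
{
	return lockdep_enabled;
}

static bool toy_cond(bool c)
{
	cond_evaluations++;
	return c;
}

/* Check "enabled" both before and after evaluating the condition. */
#define TOY_LOCKDEP_WARN(c)						\
	do {								\
		if (toy_lockdep_enabled() && toy_cond(c) &&		\
		    toy_lockdep_enabled())				\
			warnings++;					\
	} while (0)
```

The first check keeps the condition from being evaluated at all while
lockdep is not yet enabled, and the second rejects a condition that was
evaluated while lockdep was being disabled.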
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Reported-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Paul E. McKenney [Fri, 25 Nov 2022 16:43:10 +0000 (08:43 -0800)]
rcu: Upgrade header comment for poll_state_synchronize_rcu()
This commit emphasizes the possibility of concurrent calls to
synchronize_rcu() and synchronize_rcu_expedited() causing one or
the other of the two grace periods to be lost from the viewpoint of
poll_state_synchronize_rcu().
If you cannot afford to lose grace periods this way, you should
instead use the _full() variants of the polled RCU API, for
example, poll_state_synchronize_rcu_full().
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Mon, 14 Nov 2022 17:40:19 +0000 (09:40 -0800)]
rcu: Throttle callback invocation based on number of ready callbacks
Currently, rcu_do_batch() sizes its batches based on the total number
of callbacks in the callback list. This can result in some strange
choices, for example, if there were 12,800 callbacks in the list, but
only 200 were ready to invoke, RCU would invoke 100 at a time (12,800
shifted down by seven bits).
A more measured approach would use the number that were actually ready
to invoke, an approach that has become feasible only recently given the
per-segment ->seglen counts in ->cblist.
This commit therefore bases the batch limit on the number of callbacks
ready to invoke instead of on the total number of callbacks.
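The arithmetic in the example above can be sketched as follows. This
is a user-space illustration: batch_limit() is a hypothetical helper,
and the floor of 10 used in the usage note is an assumed value rather
than one stated in this commit message.

```c
#include <assert.h>

/* Batch limit: the larger of a floor and the count shifted down. */
static long batch_limit(long count, long floor, int shift)
{
	long scaled = count >> shift;

	return scaled > floor ? scaled : floor;
}
```

With a shift of seven bits, batch_limit(12800, 10, 7) yields 100, which
is the old behavior when sizing by the total. Sizing by the 200 ready
callbacks instead gives batch_limit(200, 10, 7), which falls back to the
assumed floor of 10.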
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Mon, 7 Nov 2022 00:33:38 +0000 (16:33 -0800)]
rcu: Consolidate initialization and CPU-hotplug code
This commit consolidates the initialization and CPU-hotplug code at
the end of kernel/rcu/tree.c. This is strictly a code-motion commit.
No functionality has changed.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Zhao Mengmeng [Wed, 19 Oct 2022 12:36:50 +0000 (08:36 -0400)]
rcu: Use hlist_nulls_next_rcu() in hlist_nulls_add_tail_rcu()
In commit 8dbd76e79a16 ("tcp/dccp: fix possible race
__inet_lookup_established()"), function hlist_nulls_add_tail_rcu() was
added back, but the local variable *last* is of type hlist_nulls_node,
so use hlist_nulls_next_rcu() instead of hlist_next_rcu().
Signed-off-by: Zhao Mengmeng <zhaomengmeng@kylinos.cn>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Fri, 4 Nov 2022 20:06:54 +0000 (13:06 -0700)]
doc: Update rcu_dereference.rst
This commit updates rcu_dereference.rst to reflect RCU additions and
changes over the past few years.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Fri, 4 Nov 2022 18:44:45 +0000 (11:44 -0700)]
doc: Update rcubarrier.rst
This commit updates rcubarrier.rst to reflect RCU additions and changes
over the past few years.
[ paulmck: Apply Stephen Rothwell feedback. ]
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Fri, 4 Nov 2022 18:00:14 +0000 (11:00 -0700)]
doc: Update NMI-RCU.rst
This commit updates NMI-RCU.rst to highlight the ancient heritage of
the example code and to discourage wanton compiler "optimizations".
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Fri, 4 Nov 2022 17:53:25 +0000 (10:53 -0700)]
doc: Further updates to RCU's lockdep.rst
This commit wordsmiths RCU's lockdep.rst.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Linus Torvalds [Sun, 25 Dec 2022 21:41:39 +0000 (13:41 -0800)]
Linux 6.2-rc1
Steven Rostedt (Google) [Tue, 20 Dec 2022 18:45:19 +0000 (13:45 -0500)]
treewide: Convert del_timer*() to timer_shutdown*()
Due to several bugs caused by timers being re-armed after they are
shut down and just before they are freed, a new timer state called
"shutdown" was added. Once a timer is set to this state, it can no
longer be re-armed.
The following script was run to find all the trivial locations where
del_timer() or del_timer_sync() is called in the same function that the
object holding the timer is freed. It also ignores any locations where
the timer->function is modified between the del_timer*() and the free(),
as that is not considered a "trivial" case.
This was created by using a coccinelle script and the following
commands:
$ cat timer.cocci
@@
expression ptr, slab;
identifier timer, rfield;
@@
(
- del_timer(&ptr->timer);
+ timer_shutdown(&ptr->timer);
|
- del_timer_sync(&ptr->timer);
+ timer_shutdown_sync(&ptr->timer);
)
... when strict
when != ptr->timer
(
kfree_rcu(ptr, rfield);
|
kmem_cache_free(slab, ptr);
|
kfree(ptr);
)
$ spatch timer.cocci . > /tmp/t.patch
$ patch -p1 < /tmp/t.patch
Link: https://lore.kernel.org/lkml/20221123201306.823305113@linutronix.de/
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Pavel Machek <pavel@ucw.cz> [ LED ]
Acked-by: Kalle Valo <kvalo@kernel.org> [ wireless ]
Acked-by: Paolo Abeni <pabeni@redhat.com> [ networking ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 23 Dec 2022 22:44:08 +0000 (14:44 -0800)]
Merge tag 'spi-fix-v6.2-rc1' of git://git./linux/kernel/git/broonie/spi
Pull spi fix from Mark Brown:
"One driver specific change here which handles the case where a SPI
device for some reason tries to change the bus speed during a message
on fsl_spi hardware, this should be very unusual"
* tag 'spi-fix-v6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: fsl_spi: Don't change speed while chipselect is active
Linus Torvalds [Fri, 23 Dec 2022 22:38:00 +0000 (14:38 -0800)]
Merge tag 'regulator-fix-v6.2-rc1' of git://git./linux/kernel/git/broonie/regulator
Pull regulator fixes from Mark Brown:
"Two core fixes here, one for a long standing race which some Qualcomm
systems have started triggering with their UFS driver and another
fixing a problem with supply lookup introduced by the fixes for devm
related use after free issues that were introduced in this merge
window"
* tag 'regulator-fix-v6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: core: fix deadlock on regulator enable
regulator: core: Fix resolve supply lookup issue
Linus Torvalds [Fri, 23 Dec 2022 21:56:41 +0000 (13:56 -0800)]
Merge tag 'coccinelle-6.2' of git://git./linux/kernel/git/jlawall/linux
Pull coccicheck update from Julia Lawall:
"Modernize use of grep in coccicheck:
Use 'grep -E' instead of 'egrep'"
* tag 'coccinelle-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/jlawall/linux:
scripts: coccicheck: use "grep -E" instead of "egrep"
Linus Torvalds [Fri, 23 Dec 2022 20:00:24 +0000 (12:00 -0800)]
Merge tag 'hardening-v6.2-rc1-fixes' of git://git./linux/kernel/git/kees/linux
Pull kernel hardening fixes from Kees Cook:
- Fix CFI failure with KASAN (Sami Tolvanen)
- Fix LKDTM + CFI under GCC 7 and 8 (Kristina Martsenko)
- Limit CONFIG_ZERO_CALL_USED_REGS to Clang > 15.0.6 (Nathan
Chancellor)
- Ignore "contents" argument in LoadPin's LSM hook handling
- Fix paste-o in /sys/kernel/warn_count API docs
- Use READ_ONCE() consistently for oops/warn limit reading
* tag 'hardening-v6.2-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
cfi: Fix CFI failure with KASAN
exit: Use READ_ONCE() for all oops/warn limit reads
security: Restrict CONFIG_ZERO_CALL_USED_REGS to gcc or clang > 15.0.6
lkdtm: cfi: Make PAC test work with GCC 7 and 8
docs: Fix path paste-o for /sys/kernel/warn_count
LoadPin: Ignore the "contents" argument of the LSM hooks
Linus Torvalds [Fri, 23 Dec 2022 19:55:54 +0000 (11:55 -0800)]
Merge tag 'pstore-v6.2-rc1-fixes' of git://git./linux/kernel/git/kees/linux
Pull pstore fixes from Kees Cook:
- Switch pmsg_lock to an rt_mutex to avoid priority inversion (John
Stultz)
- Correctly assign mem_type property (Luca Stefani)
* tag 'pstore-v6.2-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
pstore: Properly assign mem_type property
pstore: Make sure CONFIG_PSTORE_PMSG selects CONFIG_RT_MUTEXES
pstore: Switch pmsg_lock to an rt_mutex to avoid priority inversion
Linus Torvalds [Fri, 23 Dec 2022 19:44:20 +0000 (11:44 -0800)]
Merge tag 'dma-mapping-2022-12-23' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping fixes from Christoph Hellwig:
"Fix up the sound code to not pass __GFP_COMP to the non-coherent DMA
allocator, as it copes with that just as badly as the coherent
allocator, and then add a check to make sure no one passes the flag
ever again"
* tag 'dma-mapping-2022-12-23' of git://git.infradead.org/users/hch/dma-mapping:
dma-mapping: reject GFP_COMP for noncoherent allocations
ALSA: memalloc: don't use GFP_COMP for non-coherent dma allocations
Linus Torvalds [Fri, 23 Dec 2022 19:39:18 +0000 (11:39 -0800)]
Merge tag '9p-for-6.2-rc1' of https://github.com/martinetd/linux
Pull 9p updates from Dominique Martinet:
- improve p9_check_errors to check buffer size instead of msize when
possible (e.g. not zero-copy)
- some more syzbot and KCSAN fixes
- minor headers include cleanup
* tag '9p-for-6.2-rc1' of https://github.com/martinetd/linux:
9p/client: fix data race on req->status
net/9p: fix response size check in p9_check_errors()
net/9p: distinguish zero-copy requests
9p/xen: do not memcpy header into req->rc
9p: set req refcount to zero to avoid uninitialized usage
9p/net: Remove unneeded idr.h #include
9p/fs: Remove unneeded idr.h #include
Linus Torvalds [Fri, 23 Dec 2022 19:15:48 +0000 (11:15 -0800)]
Merge tag 'sound-6.2-rc1-2' of git://git./linux/kernel/git/tiwai/sound
Pull more sound updates from Takashi Iwai:
"A few more updates for 6.2: most of changes are about ASoC
device-specific fixes.
- Lots of ASoC Intel AVS extensions and refactoring
- Quirks for ASoC Intel SOF as well as regression fixes
- ASoC Mediatek and Rockchip fixes
- Intel HD-audio HDMI workarounds
- Usual HD- and USB-audio device-specific quirks"
* tag 'sound-6.2-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (54 commits)
ALSA: usb-audio: Add new quirk FIXED_RATE for JBL Quantum810 Wireless
ALSA: azt3328: Remove the unused function snd_azf3328_codec_outl()
ASoC: lochnagar: Fix unused lochnagar_of_match warning
ASoC: Intel: Add HP Stream 8 to bytcr_rt5640.c
ASoC: SOF: mediatek: initialize panic_info to zero
ASoC: rt5670: Remove unbalanced pm_runtime_put()
ASoC: Intel: bytcr_rt5640: Add quirk for the Advantech MICA-071 tablet
ASoC: Intel: soc-acpi: update codec addr on 0C11/0C4F product
ASoC: rockchip: spdif: Add missing clk_disable_unprepare() in rk_spdif_runtime_resume()
ASoC: wm8994: Fix potential deadlock
ASoC: mediatek: mt8195: add sof be ops to check audio active
ASoC: SOF: Revert: "core: unregister clients and machine drivers in .shutdown"
ASoC: SOF: Intel: pci-tgl: unblock S5 entry if DMA stop has failed"
ALSA: hda/hdmi: fix stream-id config keep-alive for rt suspend
ALSA: hda/hdmi: set default audio parameters for KAE silent-stream
ALSA: hda/hdmi: fix i915 silent stream programming flow
ALSA: hda: Error out if invalid stream is being setup
ASoC: dt-bindings: fsl-sai: Reinstate i.MX93 SAI compatible string
ASoC: soc-pcm.c: Clear DAIs parameters after stream_active is updated
ASoC: codecs: wcd-clsh: Remove the unused function
...
Linus Torvalds [Fri, 23 Dec 2022 19:09:44 +0000 (11:09 -0800)]
Merge tag 'drm-next-2022-12-23' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Holiday fixes!
Two batches from amd, and one group of i915 changes.
amdgpu:
- Spelling fix
- BO pin fix
- Properly handle polaris 10/11 overlap asics
- GMC9 fix
- SR-IOV suspend fix
- DCN 3.1.4 fix
- KFD userptr locking fix
- SMU13.x fixes
- GDS/GWS/OA handling fix
- Reserved VMID handling fixes
- FRU EEPROM fix
- BO validation fixes
- Avoid large variable on the stack
- S0ix fixes
- SMU 13.x fixes
- VCN fix
- Add missing fence reference
amdkfd:
- Fix init vm error handling
- Fix double release of compute pasid
i915:
- Documentation fixes
- OA-perf related fix
- VLV/CHV HDMI/DP audio fix
- Display DDI/Transcoder fix
- Migrate fixes"
* tag 'drm-next-2022-12-23' of git://anongit.freedesktop.org/drm/drm: (39 commits)
drm/amdgpu: grab extra fence reference for drm_sched_job_add_dependency
drm/amdgpu: enable VCN DPG for GC IP v11.0.4
drm/amdgpu: skip mes self test after s0i3 resume for MES IP v11.0
drm/amd/pm: correct the fan speed retrieving in PWM for some SMU13 asics
drm/amd/pm: bump SMU13.0.0 driver_if header to version 0x34
drm/amdgpu: skip MES for S0ix as well since it's part of GFX
drm/amd/pm: avoid large variable on kernel stack
drm/amdkfd: Fix double release compute pasid
drm/amdkfd: Fix kfd_process_device_init_vm error handling
drm/amd/pm: update SMU13.0.0 reported maximum shader clock
drm/amd/pm: correct SMU13.0.0 pstate profiling clock settings
drm/amd/pm: enable GPO dynamic control support for SMU13.0.7
drm/amd/pm: enable GPO dynamic control support for SMU13.0.0
drm/amdgpu: revert "generally allow over-commit during BO allocation"
drm/amdgpu: Remove unnecessary domain argument
drm/amdgpu: Fix size validation for non-exclusive domains (v4)
drm/amdgpu: Check if fru_addr is not NULL (v2)
drm/i915/ttm: consider CCS for backup objects
drm/i915/migrate: fix corner case in CCS aux copying
drm/amdgpu: rework reserved VMID handling
...
Linus Torvalds [Fri, 23 Dec 2022 18:49:45 +0000 (10:49 -0800)]
Merge tag 'mips_6.2_1' of git://git./linux/kernel/git/mips/linux
Pull MIPS fixes from Thomas Bogendoerfer:
"Fixes due to DT changes"
* tag 'mips_6.2_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
MIPS: dts: bcm63268: Add missing properties to the TWD node
MIPS: ralink: mt7621: avoid to init common ralink reset controller
Linus Torvalds [Fri, 23 Dec 2022 18:45:00 +0000 (10:45 -0800)]
Merge tag 'mm-hotfixes-stable-2022-12-22-14-34' of git://git./linux/kernel/git/akpm/mm
Pull hotfixes from Andrew Morton:
"Eight fixes, all cc:stable. One is for gcov and the remainder are MM"
* tag 'mm-hotfixes-stable-2022-12-22-14-34' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
gcov: add support for checksum field
test_maple_tree: add test for mas_spanning_rebalance() on insufficient data
maple_tree: fix mas_spanning_rebalance() on insufficient data
hugetlb: really allocate vma lock for all sharable vmas
kmsan: export kmsan_handle_urb
kmsan: include linux/vmalloc.h
mm/mempolicy: fix memory leak in set_mempolicy_home_node system call
mm, mremap: fix mremap() expanding vma with addr inside vma
Luca Stefani [Thu, 22 Dec 2022 13:10:49 +0000 (14:10 +0100)]
pstore: Properly assign mem_type property
If mem-type is specified in the device tree
it would end up overriding the record_size
field instead of populating mem_type.
As record_size is currently parsed after the
improper assignment with default size 0 it
continued to work as expected regardless of the
value found in the device tree.
Simply changing the target field of the struct
is enough to get mem-type working as expected.
Fixes: 9d843e8fafc7 ("pstore: Add mem_type property DT parsing support")
Cc: stable@vger.kernel.org
Signed-off-by: Luca Stefani <luca@osomprivacy.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20221222131049.286288-1-luca@osomprivacy.com
John Stultz [Wed, 21 Dec 2022 05:18:55 +0000 (05:18 +0000)]
pstore: Make sure CONFIG_PSTORE_PMSG selects CONFIG_RT_MUTEXES
In commit 76d62f24db07 ("pstore: Switch pmsg_lock to an rt_mutex
to avoid priority inversion") I changed a lock to an rt_mutex.
However, it's possible that CONFIG_RT_MUTEXES is not enabled,
which then results in a build failure, as the 0day bot detected:
https://lore.kernel.org/linux-mm/202212211244.TwzWZD3H-lkp@intel.com/
Thus this patch changes CONFIG_PSTORE_PMSG to select
CONFIG_RT_MUTEXES, which ensures the build will not fail.
Cc: Wei Wang <wvw@google.com>
Cc: Midas Chien <midaschieh@google.com>
Cc: Connor O'Brien <connoro@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Anton Vorontsov <anton@enomsg.org>
Cc: Colin Cross <ccross@android.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: kernel test robot <lkp@intel.com>
Cc: kernel-team@android.com
Fixes: 76d62f24db07 ("pstore: Switch pmsg_lock to an rt_mutex to avoid priority inversion")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: John Stultz <jstultz@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20221221051855.15761-1-jstultz@google.com
Sami Tolvanen [Thu, 22 Dec 2022 22:57:47 +0000 (22:57 +0000)]
cfi: Fix CFI failure with KASAN
When CFI_CLANG and KASAN are both enabled, LLVM doesn't generate a
CFI type hash for asan.module_ctor functions in translation units
where CFI is disabled, which leads to a CFI failure during boot when
do_ctors calls the affected constructors:
CFI failure at do_basic_setup+0x64/0x90 (target:
asan.module_ctor+0x0/0x28; expected type: 0xa540670c)
Specifically, this happens because CFI is disabled for
kernel/cfi.c. There's no reason to keep CFI disabled here anymore, so
fix the failure by not filtering out CC_FLAGS_CFI for the file.
Note that https://reviews.llvm.org/rG3b14862f0a96 fixed the issue
where LLVM didn't emit CFI type hashes for any sanitizer constructors,
but now type hashes are emitted correctly for TUs that use CFI.
Link: https://github.com/ClangBuiltLinux/linux/issues/1742
Fixes: 89245600941e ("cfi: Switch to -fsanitize=kcfi")
Reported-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20221222225747.3538676-1-samitolvanen@google.com
Linus Torvalds [Thu, 22 Dec 2022 19:22:31 +0000 (11:22 -0800)]
Merge tag 'scsi-misc' of git://git./linux/kernel/git/jejb/scsi
Pull more SCSI updates from James Bottomley:
"Mostly small bug fixes and small updates.
The only things of note are a qla2xxx fix for crash on hotplug and
timeout and the addition of a user exposed abstraction layer for
persistent reservation error return handling (which necessitates the
conversion of nvme.c as well as SCSI)"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: qla2xxx: Fix crash when I/O abort times out
nvme: Convert NVMe errors to PR errors
scsi: sd: Convert SCSI errors to PR errors
scsi: core: Rename status_byte to sg_status_byte
block: Add error codes for common PR failures
scsi: sd: sd_zbc: Trace zone append emulation
scsi: libfc: Include the correct header
Linus Torvalds [Thu, 22 Dec 2022 19:17:34 +0000 (11:17 -0800)]
Merge tag 'afs-next-20221222' of git://git./linux/kernel/git/dhowells/linux-fs
Pull afs update from David Howells:
"A fix for a couple of missing resource counter decrements, two small
cleanups of now-unused bits of code and a patch to remove writepage
support from afs"
* tag 'afs-next-20221222' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
afs: Stop implementing ->writepage()
afs: remove afs_cache_netfs and afs_zap_permits() declarations
afs: remove variable nr_servers
afs: Fix lost servers_outstanding count
Linus Torvalds [Thu, 22 Dec 2022 19:07:29 +0000 (11:07 -0800)]
Merge tag 'perf-tools-for-v6.2-2-2022-12-22' of git://git./linux/kernel/git/acme/linux
Pull more perf tools updates from Arnaldo Carvalho de Melo:
"perf tools fixes and improvements:
- Don't stop building perf if python setuptools isn't installed, just
disable the affected perf feature.
- Remove explicit reference to python 2.x devel files, that warning
is about python-devel, no matter what version, being unavailable
and thus disabling the linking with libpython.
- Don't use -Werror=switch-enum when building the python support that
handles libtraceevent enumerations, as there is no good way to test
if some specific enum entry is available with the libtraceevent
installed on the system.
- Introduce 'perf lock contention' --type-filter and --lock-filter,
to filter by lock type and lock name:
$ sudo ./perf lock record -a -- ./perf bench sched messaging
$ sudo ./perf lock contention -E 5 -Y spinlock
contended total wait max wait avg wait type caller
802 1.26 ms 11.73 us 1.58 us spinlock __wake_up_common_lock+0x62
13 787.16 us 105.44 us 60.55 us spinlock remove_wait_queue+0x14
12 612.96 us 78.70 us 51.08 us spinlock prepare_to_wait+0x27
114 340.68 us 12.61 us 2.99 us spinlock try_to_wake_up+0x1f5
83 226.38 us 9.15 us 2.73 us spinlock folio_lruvec_lock_irqsave+0x5e
$ sudo ./perf lock contention -l
contended total wait max wait avg wait address symbol
57 1.11 ms 42.83 us 19.54 us ffff9f4140059000
15 280.88 us 23.51 us 18.73 us ffffffff9d007a40 jiffies_lock
1 20.49 us 20.49 us 20.49 us ffffffff9d0d50c0 rcu_state
1 9.02 us 9.02 us 9.02 us ffff9f41759e9ba0
$ sudo ./perf lock contention -L jiffies_lock,rcu_state
contended total wait max wait avg wait type caller
15 280.88 us 23.51 us 18.73 us spinlock tick_sched_do_timer+0x93
1 20.49 us 20.49 us 20.49 us spinlock __softirqentry_text_start+0xeb
$ sudo ./perf lock contention -L ffff9f4140059000
contended total wait max wait avg wait type caller
38 779.40 us 42.83 us 20.51 us spinlock worker_thread+0x50
11 216.30 us 39.87 us 19.66 us spinlock queue_work_on+0x39
8 118.13 us 20.51 us 14.77 us spinlock kthread+0xe5
- Fix splitting CC into compiler and options when checking if an
option is present in clang to build the python binding, needed in
systems such as yocto that set CC to, e.g.: "gcc --sysroot=/a/b/c".
- Refresh metrics and events for Intel systems: alderlake,
alderlake-n, bonnell, broadwell, broadwellde, broadwellx,
cascadelakex, elkhartlake, goldmont, goldmontplus, haswell,
haswellx, icelake, icelakex, ivybridge, ivytown, jaketown,
knightslanding, meteorlake, nehalemep, nehalemex, sandybridge,
sapphirerapids, silvermont, skylake, skylakex, snowridgex,
tigerlake, westmereep-dp, westmereep-sp, westmereex.
- Add vendor events files (JSON) for AMD Zen 4, from sections
2.1.15.4 "Core Performance Monitor Counters", 2.1.15.5 "L3 Cache
Performance Monitor Counters" and Section 7.1 "Fabric Performance
Monitor Counter (PMC) Events" in the Processor Programming
Reference (PPR) for AMD Family 19h Model 11h Revision B1
processors.
This constitutes events which capture op dispatch, execution and
retirement, branch prediction, L1 and L2 cache activity, TLB
activity, L3 cache activity and data bandwidth for various links
and interfaces in the Data Fabric.
- Also, from the same PPR are metrics taken from Section 2.1.15.2
"Performance Measurement", including pipeline utilization, which
are new to Zen 4 processors and useful for finding performance
bottlenecks by analyzing activity at different stages of the
pipeline.
- Greatly improve the 'srcline', 'srcline_from', 'srcline_to' and
'srcfile' sort keys performance by postponing calling the external
addr2line utility to the collapse phase of histogram bucketing.
- Fix 'perf test' "all PMU test" to skip parametrized events, which
require setup and are not supported by this test.
- Update tools/ copies of kernel headers: features,
disabled-features, fscrypt.h, i915_drm.h, msr-index.h, powerpc
syscall table and kvm.h.
- Add .DELETE_ON_ERROR special Makefile target to clean up partially
updated files on error.
- Simplify the mksyscalltbl script for arm64 by not running the host
compiler to create the syscall table; do it all with just the
shell script.
- Further fixes to honour quiet mode (-q)"
* tag 'perf-tools-for-v6.2-2-2022-12-22' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (67 commits)
perf python: Fix splitting CC into compiler and options
perf scripting python: Don't be strict at handling libtraceevent enumerations
perf arm64: Simplify mksyscalltbl
perf build: Remove explicit reference to python 2.x devel files
perf vendor events amd: Add Zen 4 mapping
perf vendor events amd: Add Zen 4 metrics
perf vendor events amd: Add Zen 4 uncore events
perf vendor events amd: Add Zen 4 core events
perf vendor events intel: Refresh westmereex events
perf vendor events intel: Refresh westmereep-sp events
perf vendor events intel: Refresh westmereep-dp events
perf vendor events intel: Refresh tigerlake metrics and events
perf vendor events intel: Refresh snowridgex events
perf vendor events intel: Refresh skylakex metrics and events
perf vendor events intel: Refresh skylake metrics and events
perf vendor events intel: Refresh silvermont events
perf vendor events intel: Refresh sapphirerapids metrics and events
perf vendor events intel: Refresh sandybridge metrics and events
perf vendor events intel: Refresh nehalemex events
perf vendor events intel: Refresh nehalemep events
...
Arnaldo Carvalho de Melo [Thu, 22 Dec 2022 13:56:25 +0000 (10:56 -0300)]
perf python: Fix splitting CC into compiler and options
Noticed this build failure on archlinux:base when building with clang:
clang-14: error: optimization flag '-ffat-lto-objects' is not supported [-Werror,-Wignored-optimization-argument]
In tools/perf/util/setup.py we check if clang supports that option, but
since commit 3cad53a6f9cdbafa ("perf python: Account for multiple words
in CC") this got broken as in the common case where CC="clang":
>>> cc="clang"
>>> print(cc.split()[0])
clang
>>> option="-ffat-lto-objects"
>>> print(str(cc.split()[1:]) + option)
[]-ffat-lto-objects
>>>
And then the Popen will call clang with that bogus option name that in
turn will not produce the b"unknown argument" or b"is not supported"
that this function uses to detect if the option is not available and
thus later on clang will be called with an unknown/unsupported option.
Fix it by checking whether there really are options in the provided CC
variable, and if so override 'cc' with the first token and append the
options to the 'option' variable.
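The splitting described above can be sketched as follows. This is a minimal illustration, not the actual tools/perf/util/setup.py code; the helper name split_cc() is hypothetical and exists only for this example:

```python
def split_cc(cc, option):
    """Split a CC string into (compiler, option string to probe).

    Hypothetical helper: only when CC really contains extra words are
    they joined as text and placed in front of the probed option.
    """
    tokens = cc.split()
    if len(tokens) > 1:
        # e.g. CC="gcc --sysroot=/a/b/c": keep the sysroot with the option
        return tokens[0], " ".join(tokens[1:]) + " " + option
    # Common case, e.g. CC="clang": the option is left untouched, so no
    # stringified empty list ("[]") is glued in front of it.
    return tokens[0], option


plain = split_cc("clang", "-ffat-lto-objects")
with_sysroot = split_cc("gcc --sysroot=/a/b/c", "-ffat-lto-objects")
```

With CC="clang" the probed option stays "-ffat-lto-objects" rather than "[]-ffat-lto-objects", so the compiler's "unknown argument" or "is not supported" diagnostics can be matched again.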
Fixes: 3cad53a6f9cdbafa ("perf python: Account for multiple words in CC")
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Fangrui Song <maskray@google.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Keeping <john@metanate.com>
Cc: Khem Raj <raj.khem@gmail.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Link: http://lore.kernel.org/lkml/Y6Rq5F5NI0v1QQHM@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
David Howells [Fri, 18 Nov 2022 07:57:27 +0000 (07:57 +0000)]
afs: Stop implementing ->writepage()
We're trying to get rid of the ->writepage() hook[1]. Stop afs from using
it by unlocking the page and calling afs_writepages_region() rather than
folio_write_one().
A flag is passed to afs_writepages_region() to indicate that it should only
write a single region so that we don't flush the entire file in
->write_begin(), but do add other dirty data to the region being written to
try and reduce the number of RPC ops.
This requires ->migrate_folio() to be implemented, so point that at
filemap_migrate_folio() for files and also for symlinks and directories.
This can be tested by turning on the afs_folio_dirty tracepoint and then
doing something like:
xfs_io -c "w 2223 7000" -c "w 15000 22222" -c "w 23 7" /afs/my/test/foo
and then looking in the trace to see if the write at position 15000 gets
stored before page 0 gets dirtied for the write at position 23.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Christoph Hellwig <hch@lst.de>
cc: Matthew Wilcox <willy@infradead.org>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/20221113162902.883850-1-hch@lst.de/
Link: https://lore.kernel.org/r/166876785552.222254.4403222906022558715.stgit@warthog.procyon.org.uk/
Gaosheng Cui [Fri, 9 Sep 2022 07:03:53 +0000 (15:03 +0800)]
afs: remove afs_cache_netfs and afs_zap_permits() declarations
afs_zap_permits() has been removed since
commit be080a6f43c4 ("afs: Overhaul permit caching").
afs_cache_netfs has been removed since
commit 523d27cda149 ("afs: Convert afs to use the new fscache API").
So remove their declarations from the header file.
Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/20220909070353.1160228-1-cuigaosheng1@huawei.com/
Colin Ian King [Thu, 20 Oct 2022 17:39:23 +0000 (18:39 +0100)]
afs: remove variable nr_servers
Variable nr_servers is no longer being used; the last reference
to it was removed in commit 45df8462730d ("afs: Fix server list handling"),
so clean up the code by removing it.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/20221020173923.21342-1-colin.i.king@gmail.com/
David Howells [Wed, 21 Dec 2022 14:30:48 +0000 (14:30 +0000)]
afs: Fix lost servers_outstanding count
The afs_fs_probe_dispatcher() work function is passed a count on
net->servers_outstanding when it is scheduled (which may come via its
timer). This is passed back to the work_item, passed to the timer or
dropped at the end of the dispatcher function.
But, at the top of the dispatcher function, there are two checks which
skip the rest of the function: if the network namespace is being destroyed
or if there are no fileservers to probe. These two return paths, however,
do not drop the count passed to the dispatcher, and so, sometimes, the
destruction of a network namespace, such as induced by rmmod of the kafs
module, may get stuck in afs_purge_servers(), waiting for
net->servers_outstanding to become zero.
Fix this by adding the missing decrements in afs_fs_probe_dispatcher().
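The counting rule behind this fix can be shown with a toy userspace model. It is only an illustration of "every return path must drop the count it was handed", not the kafs code; ToyNet, schedule_dispatcher() and probe_dispatcher() are hypothetical names:

```python
class ToyNet:
    """Toy stand-in for a network namespace with an outstanding counter."""

    def __init__(self, live=True, servers=()):
        self.live = live                    # False while being destroyed
        self.probe_queue = list(servers)    # fileservers waiting for a probe
        self.servers_outstanding = 0


def schedule_dispatcher(net):
    """Scheduling the dispatcher hands it one count."""
    net.servers_outstanding += 1


def probe_dispatcher(net):
    """Consume the count handed in, on every return path."""
    if not net.live or not net.probe_queue:
        # Early exit: without this decrement the count would be lost and
        # teardown would wait forever for the counter to reach zero.
        net.servers_outstanding -= 1
        return "skipped"
    net.probe_queue.pop(0)
    net.servers_outstanding -= 1            # normal path drops it at the end
    return "probed"


dying = ToyNet(live=False, servers=["fs1"])
schedule_dispatcher(dying)
dying_result = probe_dispatcher(dying)

idle = ToyNet(live=True)
schedule_dispatcher(idle)
idle_result = probe_dispatcher(idle)
```

In this model both early exits leave servers_outstanding at zero, so a teardown that waits for the counter to drain would not get stuck.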
Fixes: f6cbb368bcb0 ("afs: Actively poll fileservers to maintain NAT or firewall openings")
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/167164544917.2072364.3759519569649459359.stgit@warthog.procyon.org.uk/
Takashi Iwai [Thu, 22 Dec 2022 08:18:38 +0000 (09:18 +0100)]
Merge tag 'asoc-v6.2-3' of https://git./linux/kernel/git/broonie/sound into for-linus
ASoC: Updates for v6.2
Some more small fixes and board quirks that came in since my last
update, the main one being the fixes from Kai for issues around the
attempts to get kexec working well on SOF based systems.