rcu: Kill rnp->ofl_seq and use only rcu_state.ofl_lock for exclusion
author David Woodhouse <dwmw@amazon.co.uk>
Tue, 16 Feb 2021 15:04:34 +0000 (15:04 +0000)
committer Paul E. McKenney <paulmck@kernel.org>
Tue, 8 Feb 2022 18:11:41 +0000 (10:11 -0800)
commit 82980b1622d97017053c6792382469d7dc26a486
tree 86c9c34f1c04754f375294a3b1f3f5dc3426ad02
parent da123016ca8cb5697366c0b2dd55059b976e67e4

If we allow architectures to bring APs online in parallel, then we end
up requiring rcu_cpu_starting() to be reentrant. But currently, the
manipulation of rnp->ofl_seq is not thread-safe.

However, rnp->ofl_seq is largely redundant anyway, since both
rcu_cpu_starting() and rcu_report_dead() hold rcu_state.ofl_lock for
almost the whole time that rnp->ofl_seq is set to an odd number
to indicate that an operation is in progress.

So drop rnp->ofl_seq completely, and use only rcu_state.ofl_lock.

This has a couple of minor complexities: lockdep will complain when we
take rcu_state.ofl_lock, and currently accepts the 'excuse' of having
an odd value in rnp->ofl_seq. So switch rcu_state.ofl_lock to an
arch_spinlock_t to avoid that false-positive complaint. Since we're
killing rnp->ofl_seq, that 'excuse' has to change too, so make it check
arch_spin_is_locked(&rcu_state.ofl_lock) instead.
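
The replacement 'excuse' can be sketched as a small userspace model.
This is an illustration only, not the kernel code: every model_* name
below is a hypothetical stand-in (model_ofl_lock for
rcu_state.ofl_lock, model_online_mask for the leaf node's mask of
online CPUs), and a C11 atomic_bool takes the place of the
arch_spinlock_t.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
static atomic_bool model_ofl_lock;       /* stands in for rcu_state.ofl_lock */
static unsigned long model_online_mask;  /* stands in for the online-CPU mask */

/* Spin until the lock is acquired. */
static void model_lock(void)
{
	bool expected = false;

	while (!atomic_compare_exchange_weak(&model_ofl_lock, &expected, true))
		expected = false;	/* a failed exchange overwrites 'expected' */
}

static void model_unlock(void)
{
	atomic_store(&model_ofl_lock, false);
}

/* Analogue of arch_spin_is_locked(): true if anyone holds the lock. */
static bool model_is_locked(void)
{
	return atomic_load(&model_ofl_lock);
}

/*
 * Model of the lockdep check: accept the CPU as online if its bit is
 * set, or if an online/offline operation may be in flight, which is
 * now signalled by the lock being held rather than by an odd sequence
 * number.
 */
static bool model_lockdep_cpu_online(int cpu)
{
	if (model_online_mask & (1UL << cpu))
		return true;
	return model_is_locked();
}
```

In this model the lock-held test cannot tell which CPU holds the lock,
so the check is deliberately permissive while any online/offline
operation is in progress.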

There's no arch_spin_lock_irqsave() so we have to manually save and
restore local interrupts around the locking.
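
The save/lock/unlock/restore ordering can be sketched in userspace as
follows. This is a model under stated assumptions, not the patch
itself: the model_* functions are hypothetical stand-ins for
local_irq_save(), local_irq_restore(), arch_spin_lock() and
arch_spin_unlock(), interrupt state is simulated with a plain bool, and
model_online_mask is an invented placeholder for the state that the
real functions update.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
static atomic_flag model_ofl_lock = ATOMIC_FLAG_INIT;
static bool model_irqs_enabled = true;	/* simulated interrupt state */
static unsigned long model_online_mask;	/* state protected by the lock */

/* Disable simulated interrupts and return the previous state. */
static unsigned long model_local_irq_save(void)
{
	unsigned long flags = model_irqs_enabled;

	model_irqs_enabled = false;
	return flags;
}

static void model_local_irq_restore(unsigned long flags)
{
	model_irqs_enabled = (flags != 0);
}

static void model_spin_lock(void)
{
	while (atomic_flag_test_and_set_explicit(&model_ofl_lock,
						 memory_order_acquire))
		;	/* spin until the holder releases the lock */
}

static void model_spin_unlock(void)
{
	atomic_flag_clear_explicit(&model_ofl_lock, memory_order_release);
}

/* Mark a CPU online: interrupts off first, then the lock. */
static void model_cpu_starting(int cpu)
{
	unsigned long flags = model_local_irq_save();

	model_spin_lock();
	assert(!model_irqs_enabled);	/* lock is only held with irqs off */
	model_online_mask |= 1UL << cpu;
	model_spin_unlock();
	model_local_irq_restore(flags);	/* release in reverse order */
}

/* Mark a CPU offline using the same ordering. */
static void model_report_dead(int cpu)
{
	unsigned long flags = model_local_irq_save();

	model_spin_lock();
	model_online_mask &= ~(1UL << cpu);
	model_spin_unlock();
	model_local_irq_restore(flags);
}
```

Restoring the saved state, rather than unconditionally re-enabling
interrupts, keeps the sketch correct for callers that already had
interrupts disabled.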

At Paul's request, based on Neeraj's analysis, make rcu_gp_init() not
just wait for but *exclude* any CPU online/offline activity, which was
largely true already by virtue of it holding rcu_state.ofl_lock.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
kernel/rcu/tree.c
kernel/rcu/tree.h