locking/qspinlock: Use atomic_try_cmpxchg_relaxed() in xchg_tail()
author Uros Bizjak <ubizjak@gmail.com>
Thu, 21 Mar 2024 19:52:47 +0000 (20:52 +0100)
committer Ingo Molnar <mingo@kernel.org>
Thu, 11 Apr 2024 13:14:54 +0000 (15:14 +0200)
Use atomic_try_cmpxchg_relaxed(*ptr, &old, new) instead of
atomic_cmpxchg_relaxed(*ptr, old, new) == old in xchg_tail().

The x86 CMPXCHG instruction reports success in the ZF flag,
so this change saves a compare after CMPXCHG.

No functional change intended.

Since this code requires NR_CPUS >= 16k, I have tested it
by unconditionally setting _Q_PENDING_BITS to 1 in
<asm-generic/qspinlock_types.h>.
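
For reference, the new loop can be sketched in user space with the
GCC/Clang __atomic builtins, which already have try-cmpxchg semantics:
on failure, __atomic_compare_exchange_n() writes the observed value back
into the expected-value argument, so no separate compare or reload is
needed. The helper name and mask value below are illustrative only, not
kernel code:

```c
/*
 * User-space sketch (not kernel code) of the new xchg_tail() loop.
 * The mask mirrors the _Q_PENDING_BITS == 8 layout: locked + pending
 * bytes in the low 16 bits, tail in the upper 16 bits.
 */
#define _Q_LOCKED_PENDING_MASK	0x0000ffffu

static unsigned int xchg_tail_sketch(unsigned int *lockval, unsigned int tail)
{
	unsigned int old = __atomic_load_n(lockval, __ATOMIC_RELAXED);
	unsigned int new;

	do {
		new = (old & _Q_LOCKED_PENDING_MASK) | tail;
		/*
		 * On failure, 'old' is refreshed with the current value of
		 * *lockval, matching atomic_try_cmpxchg_relaxed().
		 */
	} while (!__atomic_compare_exchange_n(lockval, &old, new, 0,
					      __ATOMIC_RELAXED,
					      __ATOMIC_RELAXED));

	return old;	/* previous lock word, as xchg_tail() returns */
}
```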

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Waiman Long <longman@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20240321195309.484275-1-ubizjak@gmail.com
kernel/locking/qspinlock.c

index ebe6b8ec7cb380da9d62e09f1b7ab98f339d35a8..1df5fef8a656164ea04554ba8abb222cdad80d0c 100644 (file)
@@ -220,21 +220,18 @@ static __always_inline void clear_pending_set_locked(struct qspinlock *lock)
  */
 static __always_inline u32 xchg_tail(struct qspinlock *lock, u32 tail)
 {
-       u32 old, new, val = atomic_read(&lock->val);
+       u32 old, new;
 
-       for (;;) {
-               new = (val & _Q_LOCKED_PENDING_MASK) | tail;
+       old = atomic_read(&lock->val);
+       do {
+               new = (old & _Q_LOCKED_PENDING_MASK) | tail;
                /*
                 * We can use relaxed semantics since the caller ensures that
                 * the MCS node is properly initialized before updating the
                 * tail.
                 */
-               old = atomic_cmpxchg_relaxed(&lock->val, val, new);
-               if (old == val)
-                       break;
+       } while (!atomic_try_cmpxchg_relaxed(&lock->val, &old, new));
 
-               val = old;
-       }
        return old;
 }
 #endif /* _Q_PENDING_BITS == 8 */