locking/atomic/x86: Introduce arch_try_cmpxchg64
author    Uros Bizjak <ubizjak@gmail.com>
          Sun, 15 May 2022 18:42:04 +0000 (20:42 +0200)
committer Peter Zijlstra <peterz@infradead.org>
          Tue, 17 May 2022 22:08:28 +0000 (00:08 +0200)
commit    c2df0a6af177b6c06a859806a876f92b072dc624
tree      9be48ac4856d48aa86e5c64c6afe27a87c107f5e
parent    0aa7be05d83cc584da0782405e8007e351dfb6cc
locking/atomic/x86: Introduce arch_try_cmpxchg64

Introduce arch_try_cmpxchg64 for 64-bit and 32-bit targets to improve
code using cmpxchg64.  On 64-bit targets, the generated assembly improves
from:

  ab: 89 c8                 mov    %ecx,%eax
  ad: 48 89 4c 24 60        mov    %rcx,0x60(%rsp)
  b2: 83 e0 fd              and    $0xfffffffd,%eax
  b5: 89 54 24 64           mov    %edx,0x64(%rsp)
  b9: 88 44 24 60           mov    %al,0x60(%rsp)
  bd: 48 89 c8              mov    %rcx,%rax
  c0: c6 44 24 62 f2        movb   $0xf2,0x62(%rsp)
  c5: 48 8b 74 24 60        mov    0x60(%rsp),%rsi
  ca: f0 49 0f b1 34 24     lock cmpxchg %rsi,(%r12)
  d0: 48 39 c1              cmp    %rax,%rcx
  d3: 75 cf                 jne    a4 <t+0xa4>

to:

  b3: 89 c2                 mov    %eax,%edx
  b5: 48 89 44 24 60        mov    %rax,0x60(%rsp)
  ba: 83 e2 fd              and    $0xfffffffd,%edx
  bd: 89 4c 24 64           mov    %ecx,0x64(%rsp)
  c1: 88 54 24 60           mov    %dl,0x60(%rsp)
  c5: c6 44 24 62 f2        movb   $0xf2,0x62(%rsp)
  ca: 48 8b 54 24 60        mov    0x60(%rsp),%rdx
  cf: f0 48 0f b1 13        lock cmpxchg %rdx,(%rbx)
  d4: 75 d5                 jne    ab <t+0xab>

where a move and a compare after cmpxchg are saved.  The improvement
for 32-bit targets is even more noticeable, because the dual-word
compare after cmpxchg8b is eliminated.
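
To illustrate the caller-side difference, a sketch of a typical update
loop (not code from this patch; 'v' and 'FLAG' are made-up names).
cmpxchg64() returns the current value, so the caller must compare it
against the expected value itself, which is the extra mov/cmp in the
first dump; try_cmpxchg64() returns a bool and writes the observed
value back into 'old' on failure, so the branch can use the ZF that
lock cmpxchg already set:

  u64 old = READ_ONCE(*v), cur, new;

  /* before: open-coded compare loop around cmpxchg64() */
  for (;;) {
          new = old | FLAG;
          cur = cmpxchg64(v, old, new);
          if (cur == old)
                  break;
          old = cur;
  }

  /* after: try_cmpxchg64() updates 'old' on failure */
  do {
          new = old | FLAG;
  } while (!try_cmpxchg64(v, &old, new));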

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220515184205.103089-3-ubizjak@gmail.com
arch/x86/include/asm/cmpxchg_32.h
arch/x86/include/asm/cmpxchg_64.h
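
For reference, a rough sketch of the shape the new helpers take
(assumed from the description above, not copied from the patch): the
64-bit header can forward to the existing word-sized
arch_try_cmpxchg(), while the 32-bit header needs a cmpxchg8b sequence
whose ZF output becomes the boolean result, avoiding the dual-word
compare:

  /* cmpxchg_64.h sketch: 64-bit is just the word-sized op */
  #define arch_try_cmpxchg64(ptr, po, n)                          \
  ({                                                              \
          BUILD_BUG_ON(sizeof(*(ptr)) != 8);                      \
          arch_try_cmpxchg((ptr), (po), (n));                     \
  })

  /* cmpxchg_32.h sketch: cmpxchg8b with a ZF output */
  static inline bool __try_cmpxchg64(volatile u64 *ptr, u64 *pold,
                                     u64 new)
  {
          bool success;
          u64 old = *pold;

          asm volatile(LOCK_PREFIX "cmpxchg8b %[ptr]"
                       CC_SET(z)
                       : CC_OUT(z) (success),
                         [ptr] "+m" (*ptr),
                         "+A" (old)
                       : "b" ((u32)new),
                         "c" ((u32)(new >> 32))
                       : "memory");
          if (unlikely(!success))
                  *pold = old;
          return success;
  }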