crypto: x86/sha256-avx2 - add missing vzeroupper
author Eric Biggers <ebiggers@google.com>
Sat, 6 Apr 2024 00:26:09 +0000 (20:26 -0400)
committer Herbert Xu <herbert@gondor.apana.org.au>
Fri, 12 Apr 2024 07:07:52 +0000 (15:07 +0800)
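Since sha256_transform_rorx() uses ymm registers, execute vzeroupper
before returning from it.  This is necessary to avoid reducing the
performance of SSE code that runs afterwards: on some CPUs, leaving the
upper halves of the ymm registers dirty incurs a penalty when legacy
SSE instructions are executed.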

Fixes: d34a460092d8 ("crypto: sha256 - Optimized sha256 x86_64 routine using AVX2's RORX instructions")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
arch/x86/crypto/sha256-avx2-asm.S

index 9918212faf914ffc61c5801809294ade209f03d4..0ffb072be9561545a021d9390fc467b2c57549f6 100644
@@ -716,6 +716,7 @@ SYM_TYPED_FUNC_START(sha256_transform_rorx)
        popq    %r13
        popq    %r12
        popq    %rbx
+       vzeroupper
        RET
 SYM_FUNC_END(sha256_transform_rorx)
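
For readers less familiar with the pattern, the same rule can be sketched in C with compiler intrinsics. This is only an illustration and not kernel code: sum8_avx2() and sum8() are hypothetical helpers, and the sketch assumes GCC or Clang on x86-64. The AVX2 path touches ymm registers and therefore calls _mm256_zeroupper(), which emits vzeroupper, before returning. Compilers usually insert this instruction automatically in C code; hand-written assembly such as sha256_transform_rorx() has to do it explicitly.

```c
#include <immintrin.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the kernel): sum eight 32-bit integers
 * using a 256-bit ymm load.  Assumes GCC or Clang on x86-64.
 */
__attribute__((target("avx2")))
static uint32_t sum8_avx2(const uint32_t *v)
{
	/* The 256-bit load dirties the upper half of a ymm register. */
	__m256i x = _mm256_loadu_si256((const __m256i *)v);
	__m128i lo = _mm256_castsi256_si128(x);
	__m128i hi = _mm256_extracti128_si256(x, 1);
	__m128i s = _mm_add_epi32(lo, hi);
	uint32_t r;

	/* Horizontal add of the four remaining lanes. */
	s = _mm_add_epi32(s, _mm_shuffle_epi32(s, 0x4E));
	s = _mm_add_epi32(s, _mm_shuffle_epi32(s, 0xB1));
	r = (uint32_t)_mm_cvtsi128_si32(s);

	/* Same job as the vzeroupper added by the patch: clear the upper
	 * ymm halves so later SSE code does not pay a transition penalty. */
	_mm256_zeroupper();
	return r;
}

/* Hypothetical dispatcher: use the AVX2 path only if the CPU has it. */
uint32_t sum8(const uint32_t *v)
{
	uint32_t r = 0;
	int i;

	if (__builtin_cpu_supports("avx2"))
		return sum8_avx2(v);

	/* Scalar fallback for CPUs without AVX2. */
	for (i = 0; i < 8; i++)
		r += v[i];
	return r;
}
```

Calling sum8() on the array {1, 2, 3, 4, 5, 6, 7, 8} returns 36 on either path. The result does not depend on the vzeroupper; as in the patch, the instruction only affects the performance of SSE code that runs after the function returns.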