From: Palmer Dabbelt
Date: Thu, 18 Jan 2024 02:07:11 +0000 (-0800)
Subject: Merge patch series "riscv: Add fine-tuned checksum functions"
X-Git-Url: http://git.maquefel.me/?a=commitdiff_plain;h=c640868491105d53899f9f8e613acd4aa06cef68;p=linux.git

Merge patch series "riscv: Add fine-tuned checksum functions"

Charlie Jenkins says:

Each architecture generally implements fine-tuned checksum functions to
leverage the instruction set. This patch adds the main checksum
functions that are used in networking. Tested on QEMU, this series
allows the CHECKSUM_KUNIT tests to complete an average of 50.9% faster.

This patch makes heavy use of the Zbb extension using alternatives
patching.

To test this patch, enable the configs for KUNIT, then CHECKSUM_KUNIT.

I have attempted to make these functions as optimal as possible, but I
have not run anything on actual riscv hardware. My performance testing
has been limited to inspecting the assembly, running the algorithms on
x86 hardware, and running in QEMU.

ip_fast_csum is a relatively small function, so even though it is
possible to read 64 bits at a time on compatible hardware, the
bottleneck becomes the cleanup and setup code, and loading 32 bits at a
time is actually faster.

* b4-shazam-merge:
  kunit: Add tests for csum_ipv6_magic and ip_fast_csum
  riscv: Add checksum library
  riscv: Add checksum header
  riscv: Add static key for misaligned accesses
  asm-generic: Improve csum_fold

Link: https://lore.kernel.org/r/20240108-optimize_checksum-v15-0-1c50de5f2167@rivosinc.com
Signed-off-by: Palmer Dabbelt
---

c640868491105d53899f9f8e613acd4aa06cef68
diff --cc arch/riscv/lib/Makefile
index c8a6787d58273,2aa1a4ad361fb..bd6e6c1b0497b
--- a/arch/riscv/lib/Makefile
+++ b/arch/riscv/lib/Makefile
@@@ -6,10 -6,8 +6,11 @@@ lib-y += memmove.
  lib-y                          += strcmp.o
  lib-y                          += strlen.o
  lib-y                          += strncmp.o
+ lib-y                          += csum.o
 +ifeq ($(CONFIG_MMU), y)
- lib-y                          += uaccess.o
 +lib-$(CONFIG_RISCV_ISA_V)      += uaccess_vector.o
 +endif
+ lib-$(CONFIG_MMU)              += uaccess.o
  lib-$(CONFIG_64BIT)            += tishift.o
  lib-$(CONFIG_RISCV_ISA_ZICBOZ) += clear_page.o
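
The remark in the commit message about ip_fast_csum reading 32 bits at a time can be illustrated with a small portable userspace sketch. This is not the code added by the series (which lives in the riscv checksum header and library and uses Zbb through alternatives patching). The names `fold32` and `ip_header_csum32` are hypothetical, and the sketch only shows the general shape: accumulate 32-bit words in a wider register, then fold the carries back down to 16 bits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: fold a 32-bit ones'-complement partial sum
 * down to 16 bits and return its complement. */
static uint16_t fold32(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);	/* may carry into bit 16 */
	sum = (sum & 0xffff) + (sum >> 16);	/* absorb that carry */
	return (uint16_t)~sum;
}

/* Hypothetical helper: checksum an IPv4 header of `ihl` 32-bit words,
 * loading 32 bits per iteration into a 64-bit accumulator. */
static uint16_t ip_header_csum32(const void *iph, unsigned int ihl)
{
	const unsigned char *p = iph;
	uint64_t acc = 0;
	unsigned int i;

	assert(ihl >= 5 && ihl <= 15);		/* valid IPv4 header lengths */

	for (i = 0; i < ihl; i++) {
		uint32_t w;

		memcpy(&w, p + 4 * i, sizeof(w));	/* alignment-safe load */
		acc += w;
	}

	/* Fold 64 -> 32 bits with end-around carry, then 32 -> 16. */
	acc = (acc & 0xffffffffu) + (acc >> 32);
	acc = (acc & 0xffffffffu) + (acc >> 32);
	return fold32((uint32_t)acc);
}
```

Running the helper over a header whose checksum field is already filled in should return 0. Running it over a header with the field zeroed returns the value to store there in native byte order. With only 5 to 15 words to sum, the loop is short enough that setup and folding dominate, which is consistent with the commit message's observation that wider loads do not pay off here.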