iommu: Fix a boundary issue to avoid performance drop
authorXiang Chen <chenxiang66@hisilicon.com>
Thu, 25 Mar 2021 03:38:24 +0000 (11:38 +0800)
committerJoerg Roedel <jroedel@suse.de>
Wed, 7 Apr 2021 08:23:58 +0000 (10:23 +0200)
After commit 862c3715de8f ("iommu: Switch gather->end to the
inclusive end"), performance drops from 1600+K IOPS to 1200K on our
Kunpeng ARM64 platform.
We find that a range [start1, end1) is actually adjacent to the range
[end1, end2), but after the change the two are treated as disjoint,
so more TLB syncs are issued and more time is spent on them.
Fix the boundary check so adjacent ranges are merged again, avoiding
the performance drop.

Fixes: 862c3715de8f ("iommu: Switch gather->end to the inclusive end")
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/1616643504-120688-1-git-send-email-chenxiang66@hisilicon.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
include/linux/iommu.h

index 5e7fe519430af43207eee4d97c742be65ce99f7f..9ca6e6b8084dcf58b5e7d5b8b754b1a7fc4a2f0e 100644 (file)
@@ -547,7 +547,7 @@ static inline void iommu_iotlb_gather_add_page(struct iommu_domain *domain,
         * structure can be rewritten.
         */
        if (gather->pgsize != size ||
-           end < gather->start || start > gather->end) {
+           end + 1 < gather->start || start > gather->end + 1) {
                if (gather->pgsize)
                        iommu_iotlb_sync(domain, gather);
                gather->pgsize = size;