swiotlb: Fix alignment checks when both allocation and DMA masks are present
author	Will Deacon <will@kernel.org>
Fri, 8 Mar 2024 15:28:27 +0000 (15:28 +0000)
committer	Christoph Hellwig <hch@lst.de>
Wed, 13 Mar 2024 18:39:27 +0000 (11:39 -0700)
Nicolin reports that swiotlb buffer allocations fail for an NVME device
behind an IOMMU using 64KiB pages. This is because we end up with a
minimum allocation alignment of 64KiB (for the IOMMU to map the buffer
safely) but a minimum DMA alignment mask corresponding to a 4KiB NVME
page (i.e. preserving the 4KiB page offset from the original allocation).
If the original address is not 4KiB-aligned, the allocation will fail
because swiotlb_search_pool_area() erroneously compares these unmasked
bits with the 64KiB-aligned candidate allocation.

Tweak swiotlb_search_pool_area() so that the DMA alignment mask is
reduced based on the required alignment of the allocation.
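For illustration only (not part of the patch), the mask arithmetic for the
reported configuration can be modelled in a small standalone program. The
64KiB IOMMU granule, the 4KiB NVME page and the example address are
assumptions taken from the description above; IO_TLB_SHIFT mirrors the
kernel's 2KiB slot size.

	#include <stdio.h>

	#define IO_TLB_SHIFT	11
	#define IO_TLB_SIZE	(1u << IO_TLB_SHIFT)	/* 2KiB swiotlb slots */

	int main(void)
	{
		unsigned int alloc_align_mask = 0xffff;	/* 64KiB IOMMU granule - 1 */
		unsigned int min_align_mask = 0x0fff;	/* 4KiB NVME page - 1 */
		unsigned long orig_addr = 0x12345800;	/* not 4KiB-aligned */
		unsigned int old_mask, new_mask;

		/* Old behaviour: only the slot-offset bits are stripped. */
		old_mask = min_align_mask & ~(IO_TLB_SIZE - 1);		/* 0x800 */

		/*
		 * A 64KiB-aligned candidate has bits 0-15 clear, so comparing
		 * it against bit 11 of orig_addr can never succeed.
		 */
		printf("old mask %#x: orig bits %#lx -> %s\n", old_mask,
		       orig_addr & old_mask,
		       (orig_addr & old_mask) ? "never matches a 64KiB-aligned slot"
					      : "ok");

		/* New behaviour: also ignore bits covered by the allocation alignment. */
		alloc_align_mask |= IO_TLB_SIZE - 1;
		new_mask = min_align_mask & ~alloc_align_mask;		/* 0x0 */

		printf("new mask %#x: orig bits %#lx -> %s\n", new_mask,
		       orig_addr & new_mask,
		       (orig_addr & new_mask) ? "mismatch" : "slot is acceptable");

		return 0;
	}
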

Fixes: 82612d66d51d ("iommu: Allow the dma-iommu api to use bounce buffers")
Link: https://lore.kernel.org/r/cover.1707851466.git.nicolinc@nvidia.com
Reported-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
kernel/dma/swiotlb.c

index a3645a9ae68e83b863fbe9bf15fca776603836c7..f212943e51ca5f93eb554c263d5ed2fa98bd45f7 100644
@@ -1003,8 +1003,7 @@ static int swiotlb_search_pool_area(struct device *dev, struct io_tlb_pool *pool
        dma_addr_t tbl_dma_addr =
                phys_to_dma_unencrypted(dev, pool->start) & boundary_mask;
        unsigned long max_slots = get_max_slots(boundary_mask);
-       unsigned int iotlb_align_mask =
-               dma_get_min_align_mask(dev) & ~(IO_TLB_SIZE - 1);
+       unsigned int iotlb_align_mask = dma_get_min_align_mask(dev);
        unsigned int nslots = nr_slots(alloc_size), stride;
        unsigned int offset = swiotlb_align_offset(dev, orig_addr);
        unsigned int index, slots_checked, count = 0, i;
@@ -1015,6 +1014,14 @@ static int swiotlb_search_pool_area(struct device *dev, struct io_tlb_pool *pool
        BUG_ON(!nslots);
        BUG_ON(area_index >= pool->nareas);
 
+       /*
+        * Ensure that the allocation is at least slot-aligned and update
+        * 'iotlb_align_mask' to ignore bits that will be preserved when
+        * offsetting into the allocation.
+        */
+       alloc_align_mask |= (IO_TLB_SIZE - 1);
+       iotlb_align_mask &= ~alloc_align_mask;
+
        /*
         * For mappings with an alignment requirement don't bother looping to
         * unaligned slots once we found an aligned one.