drm/amdkfd: enable subsequent retry fault
author Philip Yang <Philip.Yang@amd.com>
Tue, 20 Apr 2021 19:13:59 +0000 (15:13 -0400)
committer Alex Deucher <alexander.deucher@amd.com>
Thu, 29 Apr 2021 03:36:05 +0000 (23:36 -0400)
After draining a stale retry fault, or after failing to validate the
range for recovery, the fault address must be removed from the fault
filter ring so that a subsequent retry interrupt on the same address
can be handled. Otherwise the retry fault is not processed for recovery
until the timeout has passed.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drivers/gpu/drm/amd/amdkfd/kfd_svm.c

index 00d759b257f4335061b4e40687b555a2d902bbaa..d9111fea724b3a64253366c17c929f3f96e2ec57 100644
@@ -2363,8 +2363,10 @@ retry_write_locked:
 
        mutex_lock(&prange->migrate_mutex);
 
-       if (svm_range_skip_recover(prange))
+       if (svm_range_skip_recover(prange)) {
+               amdgpu_gmc_filter_faults_remove(adev, addr, pasid);
                goto out_unlock_range;
+       }
 
        timestamp = ktime_to_us(ktime_get()) - prange->validate_timestamp;
        /* skip duplicate vm fault on different pages of same range */
@@ -2426,6 +2428,7 @@ out:
 
        if (r == -EAGAIN) {
                pr_debug("recover vm fault later\n");
+               amdgpu_gmc_filter_faults_remove(adev, addr, pasid);
                r = 0;
        }
        return r;
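
To illustrate why the removal matters, here is a minimal, self-contained sketch of a timeout-based fault filter. It is a simplified model written for this note, not the amdgpu implementation: the names `fault_filter`, `filter_faults`, `filter_faults_remove`, `FILTER_SIZE` and `FILTER_TIMEOUT` are hypothetical, and the real filter ring in the driver may be organized differently. The sketch only shows the behavior the commit message describes: a recorded (address, pasid) pair suppresses later faults until it times out, unless it is explicitly removed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model; not the amdgpu data structures. */
#define FILTER_SIZE    8
#define FILTER_TIMEOUT 1000 /* arbitrary time units, assumed for this sketch */

struct fault_entry {
	bool     valid;
	uint64_t addr;
	uint16_t pasid;
	uint64_t timestamp;
};

struct fault_filter {
	struct fault_entry ring[FILTER_SIZE];
	unsigned int next; /* next slot to overwrite */
};

/*
 * Returns true if (addr, pasid) was recorded less than FILTER_TIMEOUT ago,
 * meaning the caller should drop the fault as a duplicate. Otherwise the
 * fault is recorded in the ring and false is returned.
 */
static bool filter_faults(struct fault_filter *f, uint64_t addr,
			  uint16_t pasid, uint64_t now)
{
	struct fault_entry *e;
	unsigned int i;

	for (i = 0; i < FILTER_SIZE; i++) {
		e = &f->ring[i];
		if (e->valid && e->addr == addr && e->pasid == pasid &&
		    now - e->timestamp < FILTER_TIMEOUT)
			return true;
	}

	/* Not filtered: remember this fault, overwriting the oldest slot. */
	e = &f->ring[f->next];
	e->valid = true;
	e->addr = addr;
	e->pasid = pasid;
	e->timestamp = now;
	f->next = (f->next + 1) % FILTER_SIZE;
	return false;
}

/*
 * Drops every entry matching (addr, pasid) so that the next fault on the
 * same address is handled immediately instead of waiting for the timeout.
 */
static void filter_faults_remove(struct fault_filter *f, uint64_t addr,
				 uint16_t pasid)
{
	unsigned int i;

	for (i = 0; i < FILTER_SIZE; i++) {
		struct fault_entry *e = &f->ring[i];

		if (e->valid && e->addr == addr && e->pasid == pasid)
			e->valid = false;
	}
}
```

In this model, a handler that bails out early (the skip-recover path) or defers recovery (the -EAGAIN path) without calling `filter_faults_remove` would leave the entry in place, and every retry on that address would be dropped as a duplicate until the timeout expired. That is the situation the two added calls in the patch are meant to avoid.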