drm/amdgpu: Wipe all VRAM on free when RAS is enabled
author Felix Kuehling <Felix.Kuehling@amd.com>
Tue, 25 Jan 2022 15:51:49 +0000 (10:51 -0500)
committer Alex Deucher <alexander.deucher@amd.com>
Thu, 27 Jan 2022 20:48:41 +0000 (15:48 -0500)
On GPUs with RAS, poison can propagate between processes if VRAM is not
cleared when it is freed or allocated. The reason is that not all write
accesses clear RAS poison, but 32-byte writes by the SDMA engine do.
Clearing memory in the background when it is freed should avoid a major
performance impact. KFD has already been doing this for a long time.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c

index 5661b82d84d4641f0ce06ba4f1ad97c7aaac1b03..ec29365d108d9c1bc229f2c422e0b10749d0f6df 100644
@@ -575,6 +575,9 @@ int amdgpu_bo_create(struct amdgpu_device *adev,
        if (!amdgpu_bo_support_uswc(bo->flags))
                bo->flags &= ~AMDGPU_GEM_CREATE_CPU_GTT_USWC;
 
+       if (adev->ras_enabled)
+               bo->flags |= AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE;
+
        bo->tbo.bdev = &adev->mman.bdev;
        if (bp->domain & (AMDGPU_GEM_DOMAIN_GWS | AMDGPU_GEM_DOMAIN_OA |
                          AMDGPU_GEM_DOMAIN_GDS))
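
The effect of the added lines can be illustrated with a small userspace model. This is only a sketch, not kernel code: the `model_*` types and functions and the `MODEL_CREATE_WIPE_ON_RELEASE` bit are hypothetical stand-ins for `struct amdgpu_device`, `struct amdgpu_bo` and `AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE`, and the bit position is arbitrary. It assumes the release path checks the flag to decide whether to clear the buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE;
 * the bit position here is illustrative only. */
#define MODEL_CREATE_WIPE_ON_RELEASE (UINT64_C(1) << 0)

/* Hypothetical, heavily reduced models of the device and buffer object. */
struct model_device { bool ras_enabled; };
struct model_bo { uint64_t flags; };

/* Mirrors the added hunk: when RAS is enabled, the wipe flag is forced
 * on regardless of what the caller requested. */
static void model_bo_create(const struct model_device *dev,
                            struct model_bo *bo, uint64_t requested_flags)
{
    assert(dev != NULL && bo != NULL);
    bo->flags = requested_flags;
    if (dev->ras_enabled)
        bo->flags |= MODEL_CREATE_WIPE_ON_RELEASE;
}

/* Assumed release-side check: wipe only if the flag is set. */
static bool model_bo_needs_wipe(const struct model_bo *bo)
{
    return (bo->flags & MODEL_CREATE_WIPE_ON_RELEASE) != 0;
}
```

In this model, a buffer created on a RAS-enabled device always reports that it needs a wipe, while on a device without RAS the wipe happens only if the caller asked for it explicitly.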