drm/amdgpu: add entity error check in amdgpu_ctx_get_entity
author ZhenGuo Yin <zhenguo.yin@amd.com>
Thu, 11 May 2023 09:29:20 +0000 (17:29 +0800)
committer Alex Deucher <alexander.deucher@amd.com>
Thu, 15 Jun 2023 15:37:55 +0000 (11:37 -0400)
[Why]
The UMD is not aware of an entity error and will keep submitting jobs
to the entity that is in the error state.

[How]
Add an entity error check when getting the entity from the ctx, so that
amdgpu_ctx_get_entity() returns the entity's error instead of the entity.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: ZhenGuo Yin <zhenguo.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c

index 3ccd709ae76a37c8b99f04e6eb372e336a3ae283..0dc9c655c4fbdbd9dbd3224508fac806dbd99dfc 100644
@@ -432,6 +432,7 @@ int amdgpu_ctx_get_entity(struct amdgpu_ctx *ctx, u32 hw_ip, u32 instance,
                          u32 ring, struct drm_sched_entity **entity)
 {
        int r;
+       struct drm_sched_entity *ctx_entity;
 
        if (hw_ip >= AMDGPU_HW_IP_NUM) {
                DRM_ERROR("unknown HW IP type: %d\n", hw_ip);
@@ -455,7 +456,14 @@ int amdgpu_ctx_get_entity(struct amdgpu_ctx *ctx, u32 hw_ip, u32 instance,
                        return r;
        }
 
-       *entity = &ctx->entities[hw_ip][ring]->entity;
+       ctx_entity = &ctx->entities[hw_ip][ring]->entity;
+       r = drm_sched_entity_error(ctx_entity);
+       if (r) {
+               DRM_DEBUG("error entity %p\n", ctx_entity);
+               return r;
+       }
+
+       *entity = ctx_entity;
        return 0;
 }
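
The pattern this patch introduces can be illustrated outside the kernel with a minimal user-space sketch. Everything named mock_* below is a hypothetical stand-in and not part of amdgpu or the DRM scheduler: mock_entity_error() only models the idea behind drm_sched_entity_error() (return 0 for a healthy entity, a negative errno otherwise), and the error values used are illustrative assumptions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MOCK_NUM_RINGS 2

/* Hypothetical stand-in for struct drm_sched_entity; only the error state is modelled. */
struct mock_entity {
	int error;	/* 0 while healthy, negative errno once the entity has failed */
};

/* Hypothetical stand-in for a context that owns one entity per ring. */
struct mock_ctx {
	struct mock_entity entities[MOCK_NUM_RINGS];
};

/* Models the role of drm_sched_entity_error(): report the stored error. */
static int mock_entity_error(const struct mock_entity *entity)
{
	return entity->error;
}

/*
 * Look up the entity for a ring. As in the patch, the output pointer is
 * written only after the error check passes, so a caller never receives
 * an entity that is already in the error state.
 */
static int mock_ctx_get_entity(struct mock_ctx *ctx, unsigned int ring,
			       struct mock_entity **entity)
{
	struct mock_entity *ctx_entity;
	int r;

	assert(ctx != NULL && entity != NULL);

	if (ring >= MOCK_NUM_RINGS)
		return -EINVAL;

	ctx_entity = &ctx->entities[ring];
	r = mock_entity_error(ctx_entity);
	if (r)
		return r;	/* propagate the entity error to the submitter */

	*entity = ctx_entity;
	return 0;
}
```

A caller that treats a non-zero return as "do not submit" stops queuing work on the failed entity, which is the behavior the [Why] section asks for.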