From: Andrey Grodzovsky Date: Fri, 21 May 2021 20:41:22 +0000 (-0400) Subject: drm/amdgpu: Fix crash when hot unplug in BACO X-Git-Url: http://git.maquefel.me/?a=commitdiff_plain;h=e1543d83ed55120a860cbaad9e5421afc44c36ff;p=linux.git drm/amdgpu: Fix crash when hot unplug in BACO Problem: When device goes into runtime suspend due to prolonged inactivity (e.g. BACO sleep) and then hot unplugged, PCI core will try to wake up the device as part of unplug process. Since the device is gone all HW programming during rpm resume fails leading to a bad SW state later during pci remove handling. Fix: Use a flag we use for PCIe error recovery to avoid accessing registres. This allows to successfully complete rpm resume sequence and finish pci remove. v2: Renamed HW access block flag Signed-off-by: Andrey Grodzovsky Reviewed-by: Alex Deucher Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1081 Link: https://patchwork.freedesktop.org/patch/msgid/20210521204122.762288-2-andrey.grodzovsky@amd.com --- diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index 3a0890ca816a0..e8bbcde0c6999 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -1557,6 +1557,10 @@ static int amdgpu_pmops_runtime_resume(struct device *dev) if (!adev->runpm) return -EINVAL; + /* Avoids registers access if device is physically gone */ + if (!pci_device_is_present(adev->pdev)) + adev->no_hw_access = true; + if (amdgpu_device_supports_px(drm_dev)) { drm_dev->switch_power_state = DRM_SWITCH_POWER_CHANGING;