Evan Quan [Fri, 7 Aug 2020 09:01:47 +0000 (17:01 +0800)]
drm/amd/powerplay: correct UVD/VCE PG state on custom pptable uploading
The UVD/VCE PG state is managed by UVD and VCE IP. It's error-prone to
assume the bootup state in SMU based on the dpm status.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Fri, 7 Aug 2020 07:03:40 +0000 (15:03 +0800)]
drm/amd/powerplay: correct Vega20 cached smu feature state
Correct the cached smu feature state on pp_features sysfs
setting.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Liu ChengZhe [Thu, 6 Aug 2020 06:54:08 +0000 (14:54 +0800)]
drm/amdgpu: Skip some registers config for SRIOV
Some registers are not accessible to virtual function setup, so
skip their initialization when in VF-SRIOV mode.
v2: move SRIOV VF check into specify functions;
modify commit description and comment.
Signed-off-by: Liu ChengZhe <ChengZhe.Liu@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jay Cornwall [Fri, 24 Jul 2020 23:58:48 +0000 (16:58 -0700)]
drm/amdkfd: Fix spurious debug exception on gfx10
s_barrier triggers a debug exception when issued with PRIV=1,
DEBUG_EN=1. This causes spurious notifications to rocm-gdb.
Clear MODE before issuing s_barrier and restore MODE afterwards
in the context restore handler.
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Tested-by: Laurent Morichetti <laurent.morichetti@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Felix Kuehling [Fri, 7 Aug 2020 22:23:56 +0000 (18:23 -0400)]
Revert "drm/amdkfd: Unify gfx9/gfx10 context save area layouts"
This reverts commit
0a5baee415000a3e18730ac98e19d046c3cebbe6.
The change introduced a regression on some chips. Reverting until
a proper solution can be found.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Felix Kuehling [Fri, 7 Aug 2020 22:22:27 +0000 (18:22 -0400)]
Revert "drm/amdkfd: Fix spurious debug exception on gfx10"
This reverts commit
ea368183ae900e376b66d3f23da22acde48e385a.
Needed due to conflicts when reverting "drm/amdkfd: Unify gfx9/gfx10
context save area layouts".
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Christophe JAILLET [Sun, 9 Aug 2020 20:34:06 +0000 (22:34 +0200)]
drm: amdgpu: Use the correct size when allocating memory
When '*sgt' is allocated, we must allocated 'sizeof(**sgt)' bytes instead
of 'sizeof(*sg)'.
The sizeof(*sg) is bigger than sizeof(**sgt) so this wastes memory but
it won't lead to corruption.
Fixes: f44ffd677fb3 ("drm/amdgpu: add support for exporting VRAM using DMA-buf v3")
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sandeep Raghuraman [Thu, 6 Aug 2020 17:22:20 +0000 (22:52 +0530)]
drm/amdgpu: Fix bug where DPM is not enabled after hibernate and resume
Reproducing bug report here:
After hibernating and resuming, DPM is not enabled. This remains the case
even if you test hibernate using the steps here:
https://www.kernel.org/doc/html/latest/power/basic-pm-debugging.html
I debugged the problem, and figured out that in the file hardwaremanager.c,
in the function, phm_enable_dynamic_state_management(), the check
'if (!hwmgr->pp_one_vf && smum_is_dpm_running(hwmgr) && !amdgpu_passthrough(adev) && adev->in_suspend)'
returns true for the hibernate case, and false for the suspend case.
This means that for the hibernate case, the AMDGPU driver doesn't enable DPM
(even though it should) and simply returns from that function.
In the suspend case, it goes ahead and enables DPM, even though it doesn't need to.
I debugged further, and found out that in the case of suspend, for the
CIK/Hawaii GPUs, smum_is_dpm_running(hwmgr) returns false, while in the case of
hibernate, smum_is_dpm_running(hwmgr) returns true.
For CIK, the ci_is_dpm_running() function calls the ci_is_smc_ram_running() function,
which is ultimately used to determine if DPM is currently enabled or not,
and this seems to provide the wrong answer.
I've changed the ci_is_dpm_running() function to instead use the same method that
some other AMD GPU chips do (e.g Fiji), which seems to read the voltage controller.
I've tested on my R9 390 and it seems to work correctly for both suspend and
hibernate use cases, and has been stable so far.
Bug: https://bugzilla.kernel.org/show_bug.cgi?id=208839
Signed-off-by: Sandeep Raghuraman <sandy.8925@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Dennis Li [Tue, 4 Aug 2020 04:32:13 +0000 (12:32 +0800)]
drm/amdgpu: unlock mutex on error
Make sure to unlock the mutex when error happen
v2:
1. correct syntax error in the commit comments
2. remove change-Id
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Wed, 5 Aug 2020 09:24:41 +0000 (17:24 +0800)]
drm/amd/powerplay: put VCN/JPEG into PG ungate state before dpm table setup(V3)
As VCN related dpm table setup needs VCN be in PG ungate state. Same logics
applies to JPEG.
V2: fix paste typo
V3: code cosmetic
Signed-off-by: Evan Quan <evan.quan@amd.com>
Tested-by: Matt Coffin <mcoffin13@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Mon, 3 Aug 2020 03:15:14 +0000 (11:15 +0800)]
drm/amd/powerplay: update swSMU VCN/JPEG PG logics
Add lock protections and avoid unnecessary actions
if the PG state is already the same as required.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Tested-by: Matt Coffin <mcoffin13@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Likun Gao [Thu, 6 Aug 2020 09:37:28 +0000 (17:37 +0800)]
drm/amdgpu: use mode1 reset by default for sienna_cichlid
Swith default gpu reset method for sienna_cichlid to MODE1 reset.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Nicholas Kazlauskas [Tue, 28 Jul 2020 15:08:02 +0000 (11:08 -0400)]
drm/amd/display: Drop dm_determine_update_type_for_commit
[Why]
This was added in the past to solve the issue of not knowing when
to stall for medium and full updates in DM.
Since DC is ultimately decides what requires bandwidth changes we
wanted to make use of it directly to determine this.
The problem is that we can't actually pass any of the stream or surface
updates into DC global validation, so we don't actually check if the new
configuration is valid - we just validate the old existing config
instead and stall for outstanding commits to finish.
There's also the problem of grabbing the DRM private object for
pageflips which can lead to page faults in the case where commits
execute out of order and free a DRM private object state that was
still required for commit tail.
[How]
Now that we reset the plane in DM with the same conditions DC checks
we can have planes go through DC validation and we know when we need
to check and stall based on whether the stream or planes changed.
We mark lock_and_validation_needed whenever we've done this, so just
go back to using that instead of dm_determine_update_type_for_commit.
Since we'll skip resetting the plane for a pageflip we will no longer
grab the DRM private object for pageflips as well, avoiding the
page fault issued caused by pageflipping under load with commits
executing out of order.
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Nicholas Kazlauskas [Tue, 28 Jul 2020 14:48:21 +0000 (10:48 -0400)]
drm/amd/display: Reset plane for anything that's not a FAST update
[Why]
MEDIUM or FULL updates can require global validation or affect
bandwidth. By treating these all simply as surface updates we aren't
actually passing this through DC global validation.
[How]
There's currently no way to pass surface updates through DC global
validation, nor do I think it's a good idea to change the interface
to accept these.
DC global validation itself is currently stateless, and we can move
our update type checking to be stateless as well by duplicating DC
surface checks in DM based on DRM properties.
We wanted to rely on DC automatically determining this since DC knows
best, but DM is ultimately what fills in everything into DC plane
state so it does need to know as well.
There are basically only three paths that we exercise in DM today:
1) Cursor (async update)
2) Pageflip (fast update)
3) Full pipe programming (medium/full updates)
Which means that anything that's more than a pageflip really needs to
go down path #3.
So this change duplicates all the surface update checks based on DRM
state instead inside of should_reset_plane().
Next step is dropping dm_determine_update_type_for_commit and we no
longer require the old DC state at all for global validation.
Optimization can come later so we don't reset DC planes at all for
MEDIUM udpates and avoid validation, but we might require some extra
checks in DM to achieve this.
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Hersen Wu <hersenxs.wu@amd.com>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Nicholas Kazlauskas [Thu, 6 Aug 2020 19:48:10 +0000 (15:48 -0400)]
drm/amd/display: Use validated tiling_flags and tmz_surface in commit_tail
[Why]
So we're not racing with userspace or deadlocking DM.
[How]
These flags are now stored on dm_plane_state itself and acquried and
validated during commit_check, so just use those instead.
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Nicholas Kazlauskas [Tue, 28 Jul 2020 14:03:10 +0000 (10:03 -0400)]
drm/amd/display: Avoid using unvalidated tiling_flags and tmz_surface in prepare_planes
[Why]
We're racing with userspace as the flags could potentially change
from when we acquired and validated them in commit_check.
[How]
We unfortunately can't drop this function in its entirety from
prepare_planes since we don't know the afb->address at commit_check
time yet.
So instead of querying new tiling_flags and tmz_surface use the ones
from the plane_state directly.
While we're at it, also update the force_disable_dcc option based
on the state from atomic check.
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Nicholas Kazlauskas [Tue, 28 Jul 2020 13:59:53 +0000 (09:59 -0400)]
drm/amd/display: Reset plane when tiling flags change
[Why]
Enabling or disable DCC or switching between tiled and linear formats
can require bandwidth updates.
They're currently skipping all DC validation by being treated as purely
surface updates.
[How]
Treat tiling_flag changes (which encode DCC state) as a condition for
resetting the plane.
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Hersen Wu <hersenxs.wu@amd.com>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Nicholas Kazlauskas [Tue, 28 Jul 2020 13:44:26 +0000 (09:44 -0400)]
drm/amd/display: Store tiling_flags and tmz_surface on dm_plane_state
[Why]
Store these in advance so we can reuse them later in commit_tail without
having to reserve the fbo again.
These will also be used for checking for tiling changes when deciding
to reset the plane or not.
[How]
This change should mostly be a refactor. Only commit check is affected
for now and I'll drop the get_fb_info calls in prepare_planes and
commit_tail after.
This runs a prepass loop once we think that all planes have been added
to the context and replaces the get_fb_info calls with accessing the
dm_plane_state instead.
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Likun Gao [Thu, 6 Aug 2020 06:41:06 +0000 (14:41 +0800)]
drm/amd/powerplay: update driver if file for sienna_cichlid
Update drive if file for sienna_cichlid.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Pierre-Eric Pelloux-Prayer [Thu, 30 Jul 2020 13:54:59 +0000 (15:54 +0200)]
drm/amdgpu: new ids flag for tmz (v2)
Allows UMD to know if TMZ is supported and enabled.
This commit also bumps KMS_DRIVER_MINOR because if we don't
UMD can't tell if "ids_flags & AMDGPU_IDS_FLAGS_TMZ == 0" means
"tmz is not enabled" or "tmz may be enabled but the kernel doesn't
report it".
v2: use amdgpu_is_tmz() and reworded commit message.
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Thu, 30 Jul 2020 07:28:40 +0000 (15:28 +0800)]
drm/amd/powerplay: add control method to bypass metrics cache on Vega12
As for the gpu metric export, metrics cache makes no sense. It's up to
user to decide how often the metrics should be retrieved.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Thu, 30 Jul 2020 07:24:08 +0000 (15:24 +0800)]
drm/amd/powerplay: add control method to bypass metrics cache on Vega20
As for the gpu metric export, metrics cache makes no sense. It's up to
user to decide how often the metrics should be retrieved.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Thu, 30 Jul 2020 07:02:11 +0000 (15:02 +0800)]
drm/amd/powerplay: add control method to bypass metrics cache on Renoir
As for the gpu metric export, metrics cache makes no sense. It's up to
user to decide how often the metrics should be retrieved.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Thu, 30 Jul 2020 07:09:57 +0000 (15:09 +0800)]
drm/amd/powerplay: add control method to bypass metrics cache on Sienna Cichlid
As for the gpu metric export, metrics cache makes no sense. It's up to
user to decide how often the metrics should be retrieved.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Thu, 30 Jul 2020 06:55:32 +0000 (14:55 +0800)]
drm/amd/powerplay: add control method to bypass metrics cache on Navi10
As for the gpu metric export, metrics cache makes no sense. It's up to
user to decide how often the metrics should be retrieved.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Thu, 30 Jul 2020 06:31:21 +0000 (14:31 +0800)]
drm/amd/powerplay: add control method to bypass metrics cache on Arcturus
As for the gpu metric export, metrics cache makes no sense. It's up to
user to decide how often the metrics should be retrieved.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Thu, 30 Jul 2020 04:39:58 +0000 (12:39 +0800)]
drm/amd/powerplay: add Vega12 support for gpu metrics export
Add Vega12 gpu metrics export interface.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Thu, 30 Jul 2020 04:23:42 +0000 (12:23 +0800)]
drm/amd/powerplay: add Vega20 support for gpu metrics export
Add Vega20 gpu metrics export interface.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Thu, 30 Jul 2020 03:40:07 +0000 (11:40 +0800)]
drm/amd/powerplay: enable gpu_metrics export on legacy powerplay routines
Enable gpu_metrics support on legacy powerplay routines.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Mon, 27 Jul 2020 08:24:46 +0000 (16:24 +0800)]
drm/amd/powerplay: add Renoir support for gpu metrics export(V2)
Add Renoir gpu metrics export interface.
V2: use memcpy to make code more compact
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Mon, 27 Jul 2020 02:00:47 +0000 (10:00 +0800)]
drm/amd/powerplay: add Sienna Cichlid support for gpu metrics export
Add Sienna Cichlid gpu metrics export interface.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Fri, 24 Jul 2020 09:24:34 +0000 (17:24 +0800)]
drm/amd/powerplay: add Navi1x support for gpu metrics export
Add Navi1x gpu metrics export interface.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Fri, 24 Jul 2020 09:47:03 +0000 (17:47 +0800)]
drm/amd/powerplay: update the data structure for NV12 SmuMetrics
Although it does not bring any problem for now, the coming gpu
metrics interface needs to handle them differently based on the
asic type.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Fri, 24 Jul 2020 02:42:39 +0000 (10:42 +0800)]
drm/amd/powerplay: add Arcturus support for gpu metrics export
Add Arcturus gpu metrics export interface.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Fri, 24 Jul 2020 10:39:33 +0000 (18:39 +0800)]
drm/amd/powerplay: implement SMU V11 common APIs for retrieving link speed/width
This will be shared around all SMU V11 asics.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Thu, 23 Jul 2020 10:03:35 +0000 (18:03 +0800)]
drm/amd/powerplay: add new sysfs interface for retrieving gpu metrics(V2)
A new interface for UMD to retrieve gpu metrics data.
V2: rich the documentation
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Thu, 23 Jul 2020 08:07:01 +0000 (16:07 +0800)]
drm/amd/powerplay: define an universal data structure for gpu metrics (V4)
Thus we can provide an interface for UMD to retrieve gpu metrics data.
V2: better naming and comments
V3: two structures created for dGPU and APU separately
V4: add driver attached timestamp
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Colin Ian King [Wed, 5 Aug 2020 12:15:27 +0000 (13:15 +0100)]
drm/amdgpu: fix spelling mistake "paramter" -> "parameter"
There is a spelling mistake in a dev_warn message. Fix it.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Tue, 4 Aug 2020 08:58:30 +0000 (16:58 +0800)]
drm/amd/powerplay: grant Arcturus softmin/max setting on latest PM firmware
For Arcturus, the softmin/max settings from driver are permitted on the
latest(54.26 later) SMU firmware. Thus enabling them in driver.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Philip Yang [Mon, 27 Jul 2020 13:06:18 +0000 (09:06 -0400)]
drm/amdkfd: option to disable system mem limit
If multiple process share system memory through /dev/shm, KFD allocate
memory should not fail if it reaches the system memory limit because
one copy of physical system memory are shared by multiple process.
Add module parameter no_system_mem_limit to provide user option to
disable system memory limit check at runtime using sysfs or during
driver module init using kernel boot argument. By default the system
memory limit is on.
Print out debug message to warn user if KFD allocate memory failed
because system memory reaches limit.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Tianjia Zhang [Sun, 2 Aug 2020 11:15:36 +0000 (19:15 +0800)]
drm/amd/display: Fix wrong return value in dm_update_plane_state()
On an error exit path, a negative error code should be returned
instead of a positive return value.
Fixes: 9e869063b0021 ("drm/amd/display: Move iteration out of dm_update_planes")
Cc: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rikard Falkeborn [Tue, 4 Aug 2020 20:06:55 +0000 (22:06 +0200)]
drm/amd/display: Constify dcn30_res_pool_funcs
The only usage of dcn30_res_pool_funcs is to assign its address to a
const pointer. Make it const to allow the compiler to put it in
read-only memory.
Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rikard Falkeborn [Tue, 4 Aug 2020 20:06:54 +0000 (22:06 +0200)]
drm/amd/display: Constify dcn21_res_pool_funcs
The only usage of dcn21_res_pool_funcs is to assign its address to a
const pointer. Make it const to allow the compiler to put it in
read-only memory.
Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rikard Falkeborn [Tue, 4 Aug 2020 20:06:53 +0000 (22:06 +0200)]
drm/amd/display: Constify dcn20_res_pool_funcs
The only usage of dcn20_res_pool_funcs is to assign its address to a
const pointer. Make it const to allow the compiler to put it in
read-only memory.
Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Dan Carpenter [Mon, 3 Aug 2020 14:35:19 +0000 (17:35 +0300)]
drm/amd/display: Indent an if statement
The if statement wasn't indented so it's confusing.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Wed, 29 Jul 2020 17:14:17 +0000 (13:14 -0400)]
drm/amdgpu: move vram usage by vbios to mman (v2)
It's related to the memory manager so move it there.
v2: inline the structure
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Wed, 29 Jul 2020 17:02:25 +0000 (13:02 -0400)]
drm/amdgpu: move IP discovery data to mman
It's related to the memory manager so move it there.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Wed, 29 Jul 2020 16:53:56 +0000 (12:53 -0400)]
drm/amdgpu: move stolen memory from gmc to mman
It's more related to memory management than memory
controller.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 28 Jul 2020 19:35:56 +0000 (15:35 -0400)]
drm/amdgpu/gmc: disable keep_stolen_vga_memory on arcturus
I suspect the only reason this was set was to avoid touching
the display related registers on arcturus. Someone should
double check this on arcturus with S3.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 28 Jul 2020 22:34:50 +0000 (18:34 -0400)]
drm/amdgpu: drop the CPU pointers for the stolen vga bos
We never use them.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 28 Jul 2020 22:30:14 +0000 (18:30 -0400)]
drm/amdgpu/gmc10: switch to using amdgpu_gmc_get_vbios_allocations
The new helper centralizes the logic in one place.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 28 Jul 2020 22:29:55 +0000 (18:29 -0400)]
drm/amdgpu/gmc9: switch to using amdgpu_gmc_get_vbios_allocations
The new helper centralizes the logic in one place.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 28 Jul 2020 22:29:39 +0000 (18:29 -0400)]
drm/amdgpu/gmc8: switch to using amdgpu_gmc_get_vbios_allocations
The new helper centralizes the logic in one place.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 28 Jul 2020 22:29:20 +0000 (18:29 -0400)]
drm/amdgpu/gmc7: switch to using amdgpu_gmc_get_vbios_allocations
The new helper centralizes the logic in one place.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 28 Jul 2020 22:27:46 +0000 (18:27 -0400)]
drm/amdgpu/gmc6: switch to using amdgpu_gmc_get_vbios_allocations
The new helper centralizes the logic in one place.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 28 Jul 2020 19:04:52 +0000 (15:04 -0400)]
drm/amdgpu/gmc: add new helper to get the FB size used by pre-OS console
This adds a new gmc callback to get the size reserved by the pre-OS
console and provides a helper function for use by gmc IP drivers.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 28 Jul 2020 22:05:11 +0000 (18:05 -0400)]
drm/amdgpu: add support for extended stolen vga memory
This will allow us to split the allocation for systems
where we have to keep the stolen memory around to avoid
S3 issues. This way we don't waste as much memory and
still avoid any screen artifacts during the bios to
driver transition.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 28 Jul 2020 21:55:30 +0000 (17:55 -0400)]
drm/amdgpu: move keep stolen memory check into gmc core
Rather than leaving this as a gmc v9 specific hack.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 28 Jul 2020 21:46:00 +0000 (17:46 -0400)]
drm/amdgpu: move stolen vga bo from amdgpu to amdgpu.gmc
Since that is where we store the other data related to
the stolen vga memory.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 28 Jul 2020 18:10:46 +0000 (14:10 -0400)]
drm/amdgpu: use a define for the memory size of the vga emulator
Rather than open coding it everywhere.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 28 Jul 2020 17:57:20 +0000 (13:57 -0400)]
drm/amdgpu: use create_at for the stolen pre-OS buffer
Should be functionally the same since nothing else is
allocated at that point, but let's be exact.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 28 Jul 2020 21:38:29 +0000 (17:38 -0400)]
drm/amdgpu: handle bo size 0 in amdgpu_bo_create_kernel_at (v2)
Just return early to match other bo_create functions.
v2: check if the bo_ptr is NULL rather than checking the size.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> (v1)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Thu, 30 Jul 2020 19:21:33 +0000 (15:21 -0400)]
drm/amdgpu/smu: rework i2c adpater registration
The i2c init/fini functions just register the i2c adapter.
There is no need to call them during hw init/fini. They only
need to be called once per driver init/fini. The previous
behavior broke runtime pm because we unregistered the i2c
adapter during suspend.
Tested-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Aric Cyr [Mon, 27 Jul 2020 14:53:38 +0000 (10:53 -0400)]
drm/amd/display: 3.2.97
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Anthony Koo [Sat, 25 Jul 2020 01:37:56 +0000 (21:37 -0400)]
drm/amd/display: [FW Promotion] Release 0.0.27
| [Header Changes]
| - Reworked the FW versioning to include hotfix
| and test bits
Signed-off-by: Anthony Koo <Anthony.Koo@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alvin Lee [Wed, 22 Jul 2020 04:32:14 +0000 (00:32 -0400)]
drm/amd/display: Separate pipe disconnect from rest of progrmaming
[Why]
When changing pixel formats for HDR (e.g. ARGB -> FP16)
there are configurations that change from 2 pipes to 1 pipe.
In these cases, it seems that disconnecting MPCC and doing
a surface update at the same time(after unlocking) causes
some registers to be updated slightly faster than others
after unlocking (e.g. if the pixel format is updated to FP16
before the new surface address is programmed, we get
corruption on the screen because the pixel formats aren't
matching). We separate disconnecting MPCC from the rest
of the pipe programming sequence to prevent this.
[How]
Move MPCC disconnect into separate operation than the
rest of the pipe programming.
Signed-off-by: Alvin Lee <alvin.lee2@amd.com>
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Victor Lu [Tue, 21 Jul 2020 16:08:34 +0000 (12:08 -0400)]
drm/amd/display: Add debugfs for forcing stream timing sync
[why]
There's currently no method to enable multi-stream synchronization from
userspace and we don't check the VSDB bits to know whether or not
specific displays should have the feature enable.
[how]
Add a debugfs entry that controls a new DM debug option,
"force_timing_sync". This debug option will set on any newly created
stream following the change to the debug option.
Expose a new interface from DC that performs the timing sync and a helper
to the "force_timing_sync" debugfs that iterates over the current streams
and modifies the current synchornization state and grouping.
Example usage to force a resync (from an X based desktop):
echo 1 > /sys/kernel/debug/dri/0/amdgpu_dm_force_timing_sync
xset dpms force off && xset dpms force on
Signed-off-by: Victor Lu <victorchengchi.lu@amd.com>
Reviewed-by: Aurabindo Jayamohanan Pillai <Aurabindo.Pillai@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Acked-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Igor Kravchenko [Fri, 24 Jul 2020 15:10:40 +0000 (11:10 -0400)]
drm/amd/display: Display goes blank after inst
[why]
Display goes blank after driver installation.
Aux tuning parameters must be used for 2.x only.
Wrong dc_golden_table offset was used.
[How]
Implement a new enc3_hw_init function without VBIOS constants usage to
be called for 3.x
Calculate dc_golden_table offset using sum of
base dce_info offset and golden table offset
Signed-off-by: Igor Kravchenko <Igor.Kravchenko@amd.com>
Reviewed-by: Aric Cyr <Aric.Cyr@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
George Shen [Fri, 17 Jul 2020 17:19:27 +0000 (13:19 -0400)]
drm/amd/display: Change null plane state swizzle mode to 4kb_s
[Why]
During SetPathMode and UpdatePlanes, the plane state can be null. We default
to linear swizzle mode when plane state is null. This resulted in bandwidth
validation failing when trying to set 8K60 mode (which previously passed validation
during rebuild timing list).
[How]
Change the default swizzle mode from linear to 4kb_s and update pitch accordingly.
Signed-off-by: George Shen <george.shen@amd.com>
Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
JinZe.Xu [Tue, 21 Jul 2020 09:52:41 +0000 (17:52 +0800)]
drm/amd/display: Use helper function to check for HDMI signal
[How]
Use dc_is_hdmi_signal to determine signal type.
Signed-off-by: JinZe.Xu <JinZe.Xu@amd.com>
Reviewed-by: Charlene Liu <Charlene.Liu@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Aric Cyr [Thu, 23 Jul 2020 17:06:23 +0000 (13:06 -0400)]
drm/amd/display: AMD OUI (DPCD 0x00300) skipped on some sink
[Why]
Sink OUI supported cap is not set so driver skips programming it.
[How]
Revert the change the skips OUI programming if the cap is not set
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Reviewed-by: Anthony Koo <Anthony.Koo@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Eryk Brol [Fri, 19 Jun 2020 18:07:03 +0000 (14:07 -0400)]
drm/amd/display: Comments on how to use DSC debugfs some entries
[why]
Some of the DSC debugfs read enteries are missing comments
explaining how to use and how to comprehend the results.
Signed-off-by: Eryk Brol <eryk.brol@amd.com>
Signed-off-by: Mikita Lipski <mikita.lipski@amd.com>
Reviewed-by: Mikita Lipski <Mikita.Lipski@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Harry Wentland [Tue, 30 Jun 2020 15:16:05 +0000 (11:16 -0400)]
drm/amd/display: Fix logger context
[Why&How]
use correct logger context
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Roman Li <Roman.Li@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Eryk Brol [Fri, 19 Jun 2020 18:02:38 +0000 (14:02 -0400)]
drm/amd/display: DSC Bit target rate debugfs write entry
[Why]
We need to be able to specify bits per pixel for DSC on any
connector.
[How]
Overwrite computed DSC target rate in dsc_cfg, with requested value.
Overwrites for both SST and MST connectors, but in different places, but the process is identical. Overwrites only if DSC is decided to be enabled on that connector.
Signed-off-by: Eryk Brol <eryk.brol@amd.com>
Signed-off-by: Mikita Lipski <mikita.lipski@amd.com>
Reviewed-by: Mikita Lipski <Mikita.Lipski@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Dmytro Laktyushkin [Fri, 26 Jun 2020 18:30:29 +0000 (14:30 -0400)]
drm/amd/display: populate new dml variable
Signed-off-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Reviewed-by: Eric Bernstein <Eric.Bernstein@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Igor Kravchenko [Mon, 20 Jul 2020 00:45:28 +0000 (20:45 -0400)]
drm/amd/display: Read VBIOS Golden Settings Tbl
[Why]
For ver.4.4 and higher VBIOS contains default setting table.
{How]
Read Golden Settings Table from VBIOS, apply Aux tuning parameters.
Signed-off-by: Igor Kravchenko <Igor.Kravchenko@amd.com>
Reviewed-by: Aric Cyr <Aric.Cyr@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Eric Bernstein [Mon, 20 Jul 2020 23:18:43 +0000 (19:18 -0400)]
drm/amd/display: Use parameter for call to set output mux
Signed-off-by: Eric Bernstein <eric.bernstein@amd.com>
Reviewed-by: Chris Park <Chris.Park@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Eric Bernstein [Mon, 20 Jul 2020 20:10:22 +0000 (16:10 -0400)]
drm/amd/display: Update virtual stream encoder
Signed-off-by: Eric Bernstein <eric.bernstein@amd.com>
Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Eryk Brol [Wed, 17 Jun 2020 19:28:04 +0000 (15:28 -0400)]
drm/amd/display: DSC Slice height debugfs write entry
[Why]
We need to be able to specify slice height for any connector's DSC
[How]
Overwrite computed parameters in dsc_cfg, with the value needed/
Overwrites for both SST and MST connectors, but in different places, but the process is identical. Overwrites only if DSC is decided to be enabled on that connector.
Signed-off-by: Eryk Brol <eryk.brol@amd.com>
Signed-off-by: Mikita Lipski <mikita.lipski@amd.com>
Reviewed-by: Mikita Lipski <Mikita.Lipski@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
John Clements [Mon, 3 Aug 2020 07:52:52 +0000 (15:52 +0800)]
drm/amdgpu: added RAS EEPROM device support check
updated RAS EEPROM init/threshold sequences to check for device support
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
John Clements [Mon, 3 Aug 2020 06:24:50 +0000 (14:24 +0800)]
drm/amdgpu: enable RAS support for sienna cichlid
enabled GECC error injection and query support
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Monk Liu [Mon, 27 Jul 2020 07:20:12 +0000 (15:20 +0800)]
drm/amdgpu: introduce a new parameter to configure how many KCQ we want(v5)
what:
the MQD's save and restore of KCQ (kernel compute queue)
cost lots of clocks during world switch which impacts a lot
to multi-VF performance
how:
introduce a paramter to control the number of KCQ to avoid
performance drop if there is no kernel compute queue needed
notes:
this paramter only affects gfx 8/9/10
v2:
refine namings
v3:
choose queues for each ring to that try best to cross pipes evenly.
v4:
fix indentation
some cleanupsin the gfx_compute_queue_acquire()
v5:
further fix on indentations
more cleanupsin gfx_compute_queue_acquire()
TODO:
in the future we will let hypervisor driver to set this paramter
automatically thus no need for user to configure it through
modprobe in virtual machine
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Guchun Chen [Mon, 27 Jul 2020 07:51:05 +0000 (15:51 +0800)]
drm/amdgpu: update eeprom once specifying one bigger threshold(v3)
During driver's probe, when it hits bad gpu tag in eeprom i2c
init calling(the tag was set when reported bad page reaches
bad page threshold in last driver's working loop), there are
some strategys to deal with the cases:
1. when the module parameter amdgpu_bad_page_threshold = 0,
that means page retirement feature is disabled, so just resetting
the eeprom is fine.
2. When amdgpu_bad_page_threshold is not 0, and moreover, user
sets one bigger valid data in order to make current boot up
succeeds, correct eeprom header tag and do not break booting.
3. For other cases, driver's probe will be broken.
v2: Just update eeprom header tag instead of resetting the whole
table header when user sets one bigger threshold data.
v3: Use dev_info/dev_err to print PCI device information, which
helps in mGPU case.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Guchun Chen [Mon, 27 Jul 2020 06:56:27 +0000 (14:56 +0800)]
drm/amdgpu: disable page reservation when amdgpu_bad_page_threshold = 0
When amdgpu_bad_page_threshold = 0, bad page reservation stuffs
are skipped in either UMC ECC irq or page retirement calling of
sync flood isr.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Guchun Chen [Fri, 31 Jul 2020 07:06:32 +0000 (15:06 +0800)]
drm/amdgpu: decouple sysfs creating of bad page node
Bad page information should not be exposed by sysfs when
bad page retirement is disabled, so decouple it from ras
sysfs group creating, and add one guard before creating.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Guchun Chen [Fri, 31 Jul 2020 06:39:57 +0000 (14:39 +0800)]
drm/amdgpu: add one definition for RAS's sysfs/debugfs name(v2)
Add one definition for the RAS module's FS name. It's used
in both debugfs and sysfs cases.
v2: Use static variable instead of macro definition.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Guchun Chen [Wed, 22 Jul 2020 02:37:01 +0000 (10:37 +0800)]
drm/amdgpu: restore ras flags when user resets eeprom(v2)
RAS flags needs to be cleaned as well when user requires
one clean eeprom.
v2: RAS flags shall be restored after eeprom reset succeeds.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Guchun Chen [Thu, 23 Jul 2020 08:20:02 +0000 (16:20 +0800)]
drm/amdgpu: break GPU recovery once it's in bad state(v4)
When GPU executes recovery and retriving bad GPU tag
from external eerpom device, the recovery will be broken
and error message is printed as well for user's awareness.
v2: Refine warning message in threshold reaching case, and
fix spelling typo.
v3: Fix explicit calling of bad gpu.
v4: Rename function names.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Guchun Chen [Thu, 23 Jul 2020 08:05:00 +0000 (16:05 +0800)]
drm/amdgpu: schedule ras recovery when reaching bad page threshold(v2)
Once the bad page saved to eeprom reaches the configured
threshold, ras recovery will be issued to notify user.
v2: Fix spelling typo.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Guchun Chen [Thu, 23 Jul 2020 07:50:42 +0000 (15:50 +0800)]
drm/amdgpu: skip bad page reservation once issuing from eeprom write
Once the ras recovery is issued from eeprom write itself,
bad page reservation should be ignored, otherwise, recursive
calling of writting to eeprom would happen.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Guchun Chen [Thu, 23 Jul 2020 07:42:19 +0000 (15:42 +0800)]
drm/amdgpu: break driver init process when it's bad GPU(v5)
When retrieving bad gpu tag from eeprom, GPU init should
fail as the GPU needs to be retired for further check.
v2: Fix spelling typo, correct the condition to detect
bad gpu tag and refine error message.
v3: Refine function argument name.
v4: Fix missing check of returning value of i2c
initialization error case.
v5: Use dev_err to print PCI information in dmesg instead
of DRM_ERROR.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Guchun Chen [Thu, 23 Jul 2020 07:35:53 +0000 (15:35 +0800)]
drm/amdgpu: add bad gpu tag definition
This tag will be hired for bad gpu detection in eeprom's access.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Guchun Chen [Wed, 22 Jul 2020 02:00:27 +0000 (10:00 +0800)]
drm/amdgpu: validate bad page threshold in ras(v3)
Bad page threshold value should be valid in the range between
-1 and max records length of eeprom. It could determine when
saved bad pages exceed threshold value, and proceed corresponding
actions.
v2: When using the default typical value, it should be min
value between typical value and eeprom max records length.
v3: drop the case of setting bad_page_cnt_threshold to be
0xFFFFFFFF, as it confuses user.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Guchun Chen [Tue, 21 Jul 2020 10:02:00 +0000 (18:02 +0800)]
drm/amdgpu: add bad page count threshold in module parameter(v3)
bad_page_threshold could be configured to enable/disable the
associated bad page retirement feature in RAS.
When it's -1, ras will use typical bad page failure value to
handle bad page retirement.
When it's 0, disable bad page retirement, and no bad page
will be recorded and saved.
For other valid value, driver will use this manual value
as the threshold value of totoal bad pages.
v2: correct documentation of this parameter.
v3: remove confused statement in documentation.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Mukul Joshi [Thu, 30 Jul 2020 22:04:33 +0000 (18:04 -0400)]
drm/amdkfd: Replace bitmask with event idx in SMI event msg
Event bitmask is a 64-bit mask with only 1 bit set. Sending this
event bitmask in KFD SMI event message is both wasteful of memory
and potentially limiting to only 64 events. Instead send event
index in SMI event message.
Please note this change does not break the ABI for the two event
types defined so far. The new index is identical to the mask used
before.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Thu, 30 Jul 2020 15:02:30 +0000 (11:02 -0400)]
Revert "drm/amdgpu: Fix NULL dereference in dpm sysfs handlers"
This regressed some working configurations so revert it. Will
fix this properly for 5.9 and backport then.
This reverts commit
38e0c89a19fd13f28d2b4721035160a3e66e270b.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Jiansong Chen [Thu, 30 Jul 2020 10:09:47 +0000 (18:09 +0800)]
drm/amdgpu: enable GFXOFF for navy_flounder
Enable GFXOFF for navy_flounder.
Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Liu ChengZhe [Fri, 24 Jul 2020 07:55:33 +0000 (15:55 +0800)]
drm amdgpu: Skip tmr load for SRIOV
1. For Navi12, CHIP_SIENNA_CICHLID, skip tmr load operation;
2. Check pointer before release firmware.
v2: use CHIP_SIENNA_CICHLID instead
v3: remove local "bool ret"; fix grammer issue
v4: use my name instead of "root"
v5: fix grammer issue and indent issue
Signed-off-by: Liu ChengZhe <ChengZhe.Liu@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Liu ChengZhe [Fri, 24 Jul 2020 09:22:15 +0000 (17:22 +0800)]
drm/amdgpu: fix PSP autoload twice in FLR
Assigning false to block->status.hw overwrites PSP's previous
hardware status, which causes the PSP to Resume operation after
hardware init.
Remove this assignment and let the PSP execute Resume operation
when it is told to.
v2: Remove the braces.
v3: Modify the description.
Signed-off-by: Liu ChengZhe <ChengZhe.Liu@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Daniel Vetter [Mon, 27 Jul 2020 21:30:18 +0000 (23:30 +0200)]
drm/amdgpu/dc: Stop dma_resv_lock inversion in commit_tail
Trying to grab dma_resv_lock while in commit_tail before we've done
all the code that leads to the eventual signalling of the vblank event
(which can be a dma_fence) is deadlock-y. Don't do that.
Here the solution is easy because just grabbing locks to read
something races anyway. We don't need to bother, READ_ONCE is
equivalent. And avoids the locking issue.
v2: Also take into account tmz_surface boolean, plus just delete the
old code.
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: linux-rdma@vger.kernel.org
Cc: amd-gfx@lists.freedesktop.org
Cc: intel-gfx@lists.freedesktop.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>