linux.git
11 months agodrm/amd/display: Remove redundant include file
Alex Hung [Mon, 15 Apr 2024 19:47:37 +0000 (13:47 -0600)]
drm/amd/display: Remove redundant include file

This fixes 1 PW.INCLUDE_RECURSION reported by Coverity.

"./drivers/gpu/drm/amd/amdgpu/../display/dc/dc_types.h"
includes itself: dc_types.h -> dal_types.h -> dc_types.h

Acked-by: Wayne Lin <wayne.lin@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amd/display: ASSERT when failing to find index by plane/stream id
Alex Hung [Mon, 22 Apr 2024 18:02:17 +0000 (12:02 -0600)]
drm/amd/display: ASSERT when failing to find index by plane/stream id

[WHY]
find_disp_cfg_idx_by_plane_id and find_disp_cfg_idx_by_stream_id returns
an array index and they return -1 when not found; however, -1 is not a
valid index number.

[HOW]
When this happens, call ASSERT(), and return a positive number (which is
fewer than callers' array size) instead.

This fixes 4 OVERRUN and 2 NEGATIVE_RETURNS issues reported by Coverity.

Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Acked-by: Wayne Lin <wayne.lin@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amd/display: Do not return negative stream id for array
Alex Hung [Mon, 22 Apr 2024 19:43:14 +0000 (13:43 -0600)]
drm/amd/display: Do not return negative stream id for array

[WHY]
resource_stream_to_stream_idx returns an array index and it return -1
when not found; however, -1 is not a valid array index number.

[HOW]
When this happens, call ASSERT(), and return a zero instead.

This fixes an OVERRUN and an NEGATIVE_RETURNS issues reported by Coverity.

Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Acked-by: Wayne Lin <wayne.lin@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amd/display: Fix overlapping copy within dml_core_mode_programming
Hersen Wu [Tue, 23 Apr 2024 14:57:37 +0000 (10:57 -0400)]
drm/amd/display: Fix overlapping copy within dml_core_mode_programming

[WHY]
&mode_lib->mp.Watermark and &locals->Watermark are
the same address. memcpy may lead to unexpected behavior.

[HOW]
memmove should be used.

Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Acked-by: Wayne Lin <wayne.lin@amd.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Hersen Wu <hersenxs.wu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amd/display: Skip finding free audio for unknown engine_id
Alex Hung [Mon, 22 Apr 2024 19:52:27 +0000 (13:52 -0600)]
drm/amd/display: Skip finding free audio for unknown engine_id

[WHY]
ENGINE_ID_UNKNOWN = -1 and can not be used as an array index. Plus, it
also means it is uninitialized and does not need free audio.

[HOW]
Skip and return NULL.

This fixes 2 OVERRUN issues reported by Coverity.

Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Acked-by: Wayne Lin <wayne.lin@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amd/display: Check pipe offset before setting vblank
Alex Hung [Tue, 23 Apr 2024 00:07:17 +0000 (18:07 -0600)]
drm/amd/display: Check pipe offset before setting vblank

pipe_ctx has a size of MAX_PIPES so checking its index before accessing
the array.

This fixes an OVERRUN issue reported by Coverity.

Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Acked-by: Wayne Lin <wayne.lin@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdkfd: Enable SQ watchpoint for gfx10
Lancelot SIX [Wed, 3 Apr 2024 09:21:24 +0000 (10:21 +0100)]
drm/amdkfd: Enable SQ watchpoint for gfx10

There are new control registers introduced in gfx10 used to configure
hardware watchpoints triggered by SMEM instructions:
SQ_WATCH{0,1,2,3}_{CNTL_ADDR_HI,ADDR_L}.

Those registers work in a similar way as the TCP_WATCH* registers
currently used for gfx9 and above.

This patch adds support to program the SQ_WATCH registers for gfx10.

The SQ_WATCH?_CNTL.MASK field has one bit more than
TCP_WATCH?_CNTL.MASK, so SQ watchpoints can have a finer granularity
than TCP_WATCH watchpoints.  In this patch, we keep the capabilities
advertised to the debugger unchanged
(HSA_DBG_WATCH_ADDR_MASK_*_BIT_GFX10) as this reflects what both TCP and
SQ watchpoints can do and both watchpoints are configured together.

Signed-off-by: Lancelot SIX <lancelot.six@amd.com>
Reviewed-by: Jonathan Kim <jonathan.kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amd/display: Check index msg_id before read or write
Alex Hung [Thu, 18 Apr 2024 19:27:43 +0000 (13:27 -0600)]
drm/amd/display: Check index msg_id before read or write

[WHAT]
msg_id is used as an array index and it cannot be a negative value, and
therefore cannot be equal to MOD_HDCP_MESSAGE_ID_INVALID (-1).

[HOW]
Check whether msg_id is valid before reading and setting.

This fixes 4 OVERRUN issues reported by Coverity.

Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Acked-by: Wayne Lin <wayne.lin@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu: Fix buffer size in gfx_v9_4_3_init_ cp_compute_microcode() and rlc_microc...
Srinivasan Shanmugam [Thu, 25 Apr 2024 05:52:32 +0000 (11:22 +0530)]
drm/amdgpu: Fix buffer size in gfx_v9_4_3_init_ cp_compute_microcode() and rlc_microcode()

The function gfx_v9_4_3_init_microcode in gfx_v9_4_3.c was generating
about potential truncation of output when using the snprintf function.
The issue was due to the size of the buffer 'ucode_prefix' being too
small to accommodate the maximum possible length of the string being
written into it.

The string being written is "amdgpu/%s_mec.bin" or "amdgpu/%s_rlc.bin",
where %s is replaced by the value of 'chip_name'. The length of this
string without the %s is 16 characters. The warning message indicated
that 'chip_name' could be up to 29 characters long, resulting in a total
of 45 characters, which exceeds the buffer size of 30 characters.

To resolve this issue, the size of the 'ucode_prefix' buffer has been
reduced from 30 to 15. This ensures that the maximum possible length of
the string being written into the buffer will not exceed its size, thus
preventing potential buffer overflow and truncation issues.

Fixes the below with gcc W=1:
drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c: In function ‘gfx_v9_4_3_early_init’:
drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:379:52: warning: ‘%s’ directive output may be truncated writing up to 29 bytes into a region of size 23 [-Wformat-truncation=]
  379 |         snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_rlc.bin", chip_name);
      |                                                    ^~
......
  439 |         r = gfx_v9_4_3_init_rlc_microcode(adev, ucode_prefix);
      |                                                 ~~~~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:379:9: note: ‘snprintf’ output between 16 and 45 bytes into a destination of size 30
  379 |         snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_rlc.bin", chip_name);
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:413:52: warning: ‘%s’ directive output may be truncated writing up to 29 bytes into a region of size 23 [-Wformat-truncation=]
  413 |         snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_mec.bin", chip_name);
      |                                                    ^~
......
  443 |         r = gfx_v9_4_3_init_cp_compute_microcode(adev, ucode_prefix);
      |                                                        ~~~~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:413:9: note: ‘snprintf’ output between 16 and 45 bytes into a destination of size 30
  413 |         snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_mec.bin", chip_name);
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Fixes: 86301129698b ("drm/amdgpu: split gc v9_4_3 functionality from gc v9_0")
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Suggested-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amd/display: Add NULL pointer check for kzalloc
Hersen Wu [Mon, 22 Apr 2024 16:27:34 +0000 (12:27 -0400)]
drm/amd/display: Add NULL pointer check for kzalloc

[Why & How]
Check return pointer of kzalloc before using it.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Acked-by: Wayne Lin <wayne.lin@amd.com>
Signed-off-by: Hersen Wu <hersenxs.wu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amd/display: Handle Y carry-over in VCP X.Y calculation
George Shen [Thu, 16 Sep 2021 23:55:39 +0000 (19:55 -0400)]
drm/amd/display: Handle Y carry-over in VCP X.Y calculation

Theoretically rare corner case where ceil(Y) results in rounding up to
an integer. If this happens, the 1 should be carried over to the X
value.

CC: stable@vger.kernel.org
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: George Shen <george.shen@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu: Fix ras mode2 reset failure in ras aca mode
YiPeng Chai [Wed, 24 Apr 2024 02:21:30 +0000 (10:21 +0800)]
drm/amdgpu: Fix ras mode2 reset failure in ras aca mode

Fix ras mode2 reset failure in ras aca mode.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu: fix double free err_addr pointer warnings
Bob Zhou [Tue, 23 Apr 2024 05:32:39 +0000 (13:32 +0800)]
drm/amdgpu: fix double free err_addr pointer warnings

In amdgpu_umc_bad_page_polling_timeout, the amdgpu_umc_handle_bad_pages
will be run many times so that double free err_addr in some special case.
So set the err_addr to NULL to avoid the warnings.

Signed-off-by: Bob Zhou <bob.zhou@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu: initialize the last_jump_jiffies in atom_exec_context
Jesse Zhang [Thu, 25 Apr 2024 02:04:08 +0000 (10:04 +0800)]
drm/amdgpu: initialize the last_jump_jiffies in atom_exec_context

The parameter "last_jump_jiffies" should be initialized
before being used in the function atom_op_jump.

Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu: add check before free wb entry
Jesse Zhang [Wed, 24 Apr 2024 04:59:22 +0000 (12:59 +0800)]
drm/amdgpu: add check before free wb entry

Check if ring is not a mes queue before freeing the wb entry,
because we only allocate a wb entry when it's not a mes queue.

Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu: add return result for amdgpu_i2c_{get/put}_byte
Bob Zhou [Wed, 24 Apr 2024 07:24:05 +0000 (15:24 +0800)]
drm/amdgpu: add return result for amdgpu_i2c_{get/put}_byte

After amdgpu_i2c_get_byte fail, amdgpu_i2c_put_byte shouldn't be
conducted to put wrong value.
So return and check the i2c transfer result.

Signed-off-by: Bob Zhou <bob.zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu: add error handle to avoid out-of-bounds
Bob Zhou [Tue, 23 Apr 2024 08:58:11 +0000 (16:58 +0800)]
drm/amdgpu: add error handle to avoid out-of-bounds

if the sdma_v4_0_irq_id_to_seq return -EINVAL, the process should
be stop to avoid out-of-bounds read, so directly return -EINVAL.

Signed-off-by: Bob Zhou <bob.zhou@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu: Initialize timestamp for some legacy SOCs
Ma Jun [Mon, 22 Apr 2024 02:07:51 +0000 (10:07 +0800)]
drm/amdgpu: Initialize timestamp for some legacy SOCs

Initialize the interrupt timestamp for some legacy SOCs
to fix the coverity issue "Uninitialized scalar variable"

Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu: Use new interface to reserve bad page
YiPeng Chai [Mon, 8 Apr 2024 06:48:50 +0000 (14:48 +0800)]
drm/amdgpu: Use new interface to reserve bad page

Use new interface to reserve bad page.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu: Fix address translation defect
YiPeng Chai [Wed, 3 Apr 2024 06:11:41 +0000 (14:11 +0800)]
drm/amdgpu: Fix address translation defect

retired_page is page frame and should be expanded
to the full address when querying status.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdkfd: Enforce queue BO's adev
Harish Kasiviswanathan [Mon, 22 Apr 2024 21:03:46 +0000 (17:03 -0400)]
drm/amdkfd: Enforce queue BO's adev

Queue buffer, though it is in system memory, has to be created using the
correct amdgpu device. Enforce this as the BO needs to mapped to the
GART for MES Hardware scheduler to access it.

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amd/display: Increase SAT_UPDATE_PENDING timeout
Dmytro Laktyushkin [Thu, 24 Jun 2021 14:50:45 +0000 (10:50 -0400)]
drm/amd/display: Increase SAT_UPDATE_PENDING timeout

Headless dp 2.0 will take longer to update.

Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amd/display: Add some missing HDMI registers for DCN3x
Rodrigo Siqueira [Tue, 9 Apr 2024 21:25:23 +0000 (15:25 -0600)]
drm/amd/display: Add some missing HDMI registers for DCN3x

This commit add some missing HDMI control registers to DCN3x.

Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu: support ACA logging ecc errors
YiPeng Chai [Fri, 22 Mar 2024 07:03:07 +0000 (15:03 +0800)]
drm/amdgpu: support ACA logging ecc errors

support ACA logging ecc errors.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu: add poison consumption handler
YiPeng Chai [Mon, 11 Mar 2024 09:52:37 +0000 (17:52 +0800)]
drm/amdgpu: add poison consumption handler

Add poison consumption handler.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu: prepare to handle pasid poison consumption
YiPeng Chai [Mon, 22 Apr 2024 09:40:03 +0000 (17:40 +0800)]
drm/amdgpu: prepare to handle pasid poison consumption

Prepare to handle pasid poison consumption.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu: retire bad pages for umc v12_0
YiPeng Chai [Mon, 18 Mar 2024 06:24:13 +0000 (14:24 +0800)]
drm/amdgpu: retire bad pages for umc v12_0

Retire bad pages for umc v12_0.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu: add condition check for amdgpu_umc_fill_error_record
YiPeng Chai [Fri, 22 Mar 2024 04:06:01 +0000 (12:06 +0800)]
drm/amdgpu: add condition check for amdgpu_umc_fill_error_record

Add condition check for amdgpu_umc_fill_error_record.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu: Add delay work to retire bad pages
YiPeng Chai [Mon, 22 Apr 2024 09:38:54 +0000 (17:38 +0800)]
drm/amdgpu: Add delay work to retire bad pages

Add delay work to retire bad pages.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu: umc v12_0 logs ecc errors
YiPeng Chai [Mon, 18 Mar 2024 03:48:07 +0000 (11:48 +0800)]
drm/amdgpu: umc v12_0 logs ecc errors

1. umc v12_0 logs ecc errors.
2. Reserve newly detected ecc error pages.
3. Add tag for bad pages, so that they can
   be retired later.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu: umc v12_0 converts error address
YiPeng Chai [Mon, 11 Mar 2024 09:29:26 +0000 (17:29 +0800)]
drm/amdgpu: umc v12_0 converts error address

Umc v12_0 converts error address.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu: add interface to update umc v12_0 ecc status
YiPeng Chai [Tue, 19 Mar 2024 02:14:16 +0000 (10:14 +0800)]
drm/amdgpu: add interface to update umc v12_0 ecc status

Add interface to update umc v12_0 ecc status.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu: add poison creation handler
YiPeng Chai [Mon, 11 Mar 2024 09:02:40 +0000 (17:02 +0800)]
drm/amdgpu: add poison creation handler

Add poison creation handler.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu: prepare for logging ecc errors
YiPeng Chai [Mon, 22 Apr 2024 09:37:36 +0000 (17:37 +0800)]
drm/amdgpu: prepare for logging ecc errors

Prepare for logging ecc errors.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu: add message fifo to handle RAS poison events
YiPeng Chai [Tue, 19 Mar 2024 01:57:58 +0000 (09:57 +0800)]
drm/amdgpu: add message fifo to handle RAS poison events

Add message fifo to handle RAS poison events.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu: Using uninitialized value *size when calling amdgpu_vce_cs_reloc
Jesse Zhang [Wed, 24 Apr 2024 09:10:46 +0000 (17:10 +0800)]
drm/amdgpu: Using uninitialized value *size when calling amdgpu_vce_cs_reloc

Initialize the size before calling amdgpu_vce_cs_reloc, such as case 0x03000001.
V2: To really improve the handling we would actually
   need to have a separate value of 0xffffffff.(Christian)

Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amd/display: Add TMDS DC balancer control
Rodrigo Siqueira [Tue, 9 Apr 2024 21:14:49 +0000 (15:14 -0600)]
drm/amd/display: Add TMDS DC balancer control

Add TMDS balancer control to the list of available encoder registers for
DCN 30.

Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amd/display: Remove unnecessary NULL check in dcn20_set_input_transfer_func
Srinivasan Shanmugam [Tue, 23 Apr 2024 03:14:40 +0000 (08:44 +0530)]
drm/amd/display: Remove unnecessary NULL check in dcn20_set_input_transfer_func

This commit removes an unnecessary NULL check in the
`dcn20_set_input_transfer_func` function in the `dcn20_hwseq.c` file.
The variable `tf` is assigned the address of
`plane_state->in_transfer_func` unconditionally, so it can never be
`NULL`. Therefore, the check `if (tf == NULL)` is unnecessary and has
been removed.

The plane_state->in_transfer_func itself cannot be NULL because it's a
structure, not a pointer. When we do tf =
&plane_state->in_transfer_func;, we're getting the address of that
structure, which will always be valid as long as plane_state itself is
not NULL.

we've already checked if plane_state is NULL with the line if (dpp_base
== NULL || plane_state == NULL) return false;. So, if the code execution
gets to the point where tf = &plane_state->in_transfer_func; is called,
plane_state is guaranteed to be not NULL, and therefore tf will also not
be NULL.

drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn20/dcn20_hwseq.c
    1094 bool dcn20_set_input_transfer_func(struct dc *dc,
    1095                                 struct pipe_ctx *pipe_ctx,
    1096                                 const struct dc_plane_state *plane_state)
    1097 {
    1098         struct dce_hwseq *hws = dc->hwseq;
    1099         struct dpp *dpp_base = pipe_ctx->plane_res.dpp;
    1100         const struct dc_transfer_func *tf = NULL;
                                                ^^^^^^^^^ This assignment is not necessary now.

    1101         bool result = true;
    1102         bool use_degamma_ram = false;
    1103
    1104         if (dpp_base == NULL || plane_state == NULL)
    1105                 return false;
    1106
    1107         hws->funcs.set_shaper_3dlut(pipe_ctx, plane_state);
    1108         hws->funcs.set_blend_lut(pipe_ctx, plane_state);
    1109
    1110         tf = &plane_state->in_transfer_func;
                 ^^^^^
Before there was an if statement but now tf is assigned unconditionally

    1111
--> 1112         if (tf == NULL) {
                 ^^^^^^^^^^^^^^^^^
so these conditions are impossible.

    1113                 dpp_base->funcs->dpp_set_degamma(dpp_base,
    1114                                 IPP_DEGAMMA_MODE_BYPASS);
    1115                 return true;
    1116         }
    1117
    1118         if (tf->type == TF_TYPE_HWPWL || tf->type == TF_TYPE_DISTRIBUTED_POINTS)
    1119                 use_degamma_ram = true;
    1120
    1121         if (use_degamma_ram == true) {
    1122                 if (tf->type == TF_TYPE_HWPWL)
    1123                         dpp_base->funcs->dpp_program_degamma_pwl(dpp_base,

Fixes the below Smatch static checker warning:
drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn20/dcn20_hwseq.c:1112 dcn20_set_input_transfer_func() warn: address of 'plane_state->in_transfer_func' is non-NULL

Fixes: 285a7054bf81 ("drm/amd/display: Remove plane and stream pointers from dc scratch")
Cc: Wenjing Liu <wenjing.liu@amd.com>
Cc: Tom Chung <chiahsuan.chung@amd.com>
Cc: Alvin Lee <alvin.lee2@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Cc: Roman Li <roman.li@amd.com>
Cc: Qingqing Zhuo <Qingqing.Zhuo@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Suggested-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu/mes11: Use a separate fence per transaction
Alex Deucher [Fri, 19 Apr 2024 03:30:46 +0000 (23:30 -0400)]
drm/amdgpu/mes11: Use a separate fence per transaction

We can't use a shared fence location because each transaction
should be considered independently.

Reviewed-by: Shaoyun.liu <shaoyunl@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amd/display: Add missing dwb registers
Rodrigo Siqueira [Tue, 9 Apr 2024 21:11:20 +0000 (15:11 -0600)]
drm/amd/display: Add missing dwb registers

DCN3.0 supports some specific DWB debug registers that are not exposed
yet. This commit just adds the missing registers.

Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amd/display: use mpcc_count to log MPC state
Melissa Wen [Fri, 12 Apr 2024 16:39:09 +0000 (13:39 -0300)]
drm/amd/display: use mpcc_count to log MPC state

According to [1]:
```
DTN only logs 'pipe_count' instances of MPCC. However in some cases
there are different number of MPCC than DPP (pipe_count).
```

As DTN log still relies on pipe_count to print mpcc state, switch to
mpcc_count in all occurrences.

[1] https://lore.kernel.org/amd-gfx/20240328195047.2843715-39-Roman.Li@amd.com/

Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu: add a spinlock to wb allocation
Alex Deucher [Fri, 19 Apr 2024 22:03:32 +0000 (18:03 -0400)]
drm/amdgpu: add a spinlock to wb allocation

As we use wb slots more dynamically, we need to lock
access to avoid racing on allocation or free.

Reviewed-by: Shaoyun.liu <shaoyunl@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu: update fw_share for VCN5
Sonny Jiang [Tue, 23 Apr 2024 16:52:25 +0000 (12:52 -0400)]
drm/amdgpu: update fw_share for VCN5

kmd_fw_shared changed in VCN5

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amd/display: Remove duplicated function signature from dcn3.01 DCCG
David Tadokoro [Thu, 22 Feb 2024 14:19:00 +0000 (11:19 -0300)]
drm/amd/display: Remove duplicated function signature from dcn3.01 DCCG

In the header file dc/dcn301/dcn301_dccg.h, the function dccg301_create
is declared twice, so remove duplication.

Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: David Tadokoro <davidbtadokoro@usp.br>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu: Fix VRAM memory accounting
Mukul Joshi [Tue, 23 Apr 2024 18:40:37 +0000 (14:40 -0400)]
drm/amdgpu: Fix VRAM memory accounting

Subtract the VRAM pinned memory when checking for available memory
in amdgpu_amdkfd_reserve_mem_limit function since that memory is not
available for use.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu: update jpeg max decode resolution
Sathishkumar S [Thu, 11 Apr 2024 19:43:06 +0000 (01:13 +0530)]
drm/amdgpu: update jpeg max decode resolution

jpeg ip version v2.1 and higher supports 16kx16k resolution decode

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amd/display: Fix division by zero in setup_dsc_config
Jose Fernandez [Mon, 22 Apr 2024 14:35:44 +0000 (08:35 -0600)]
drm/amd/display: Fix division by zero in setup_dsc_config

When slice_height is 0, the division by slice_height in the calculation
of the number of slices will cause a division by zero driver crash. This
leaves the kernel in a state that requires a reboot. This patch adds a
check to avoid the division by zero.

The stack trace below is for the 6.8.4 Kernel. I reproduced the issue on
a Z16 Gen 2 Lenovo Thinkpad with a Apple Studio Display monitor
connected via Thunderbolt. The amdgpu driver crashed with this exception
when I rebooted the system with the monitor connected.

kernel: ? die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434 arch/x86/kernel/dumpstack.c:447)
kernel: ? do_trap (arch/x86/kernel/traps.c:113 arch/x86/kernel/traps.c:154)
kernel: ? setup_dsc_config (drivers/gpu/drm/amd/amdgpu/../display/dc/dsc/dc_dsc.c:1053) amdgpu
kernel: ? do_error_trap (./arch/x86/include/asm/traps.h:58 arch/x86/kernel/traps.c:175)
kernel: ? setup_dsc_config (drivers/gpu/drm/amd/amdgpu/../display/dc/dsc/dc_dsc.c:1053) amdgpu
kernel: ? exc_divide_error (arch/x86/kernel/traps.c:194 (discriminator 2))
kernel: ? setup_dsc_config (drivers/gpu/drm/amd/amdgpu/../display/dc/dsc/dc_dsc.c:1053) amdgpu
kernel: ? asm_exc_divide_error (./arch/x86/include/asm/idtentry.h:548)
kernel: ? setup_dsc_config (drivers/gpu/drm/amd/amdgpu/../display/dc/dsc/dc_dsc.c:1053) amdgpu
kernel: dc_dsc_compute_config (drivers/gpu/drm/amd/amdgpu/../display/dc/dsc/dc_dsc.c:1109) amdgpu

After applying this patch, the driver no longer crashes when the monitor
is connected and the system is rebooted. I believe this is the same
issue reported for 3113.

Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Jose Fernandez <josef@netflix.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3113
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amd/display: Add missing debug registers for DCN2/3/3.1
Rodrigo Siqueira [Tue, 9 Apr 2024 20:35:19 +0000 (14:35 -0600)]
drm/amd/display: Add missing debug registers for DCN2/3/3.1

This commit add some missing debug registers for DPCS and RDPC debug.

Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu: add ip dump for each ip in devcoredump
Sunil Khatri [Tue, 16 Apr 2024 11:49:41 +0000 (17:19 +0530)]
drm/amdgpu: add ip dump for each ip in devcoredump

Add ip dump for each ip of the asic in the
devcoredump for all the ips where a callback
is registered for register dump.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu: dump ip state before reset for each ip
Sunil Khatri [Tue, 16 Apr 2024 11:26:23 +0000 (16:56 +0530)]
drm/amdgpu: dump ip state before reset for each ip

Invoke the dump_ip_state function for each ip before
the asic resets and save the register values for
debugging via devcoredump.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu: add support for gfx v10 print
Sunil Khatri [Tue, 16 Apr 2024 11:18:01 +0000 (16:48 +0530)]
drm/amdgpu: add support for gfx v10 print

Add support to print ip information to be
used to print registers in devcoredump
buffer.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu: add protype for print ip state
Sunil Khatri [Tue, 16 Apr 2024 11:00:50 +0000 (16:30 +0530)]
drm/amdgpu: add protype for print ip state

Add the protoype for print ip state to be used
to print the registers in devcoredump during
a gpu reset.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu: add support of gfx10 register dump
Sunil Khatri [Tue, 16 Apr 2024 10:44:12 +0000 (16:14 +0530)]
drm/amdgpu: add support of gfx10 register dump

Adding gfx10 gc registers to be used for register
dump via devcoredump during a gpu reset.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu: add prototype for ip dump
Sunil Khatri [Mon, 1 Apr 2024 10:28:38 +0000 (15:58 +0530)]
drm/amdgpu: add prototype for ip dump

Add the prototype to dump ip registers
for all ips of different asics and set
them to NULL for now. Based on the
requirement add a function pointer for
each of them.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu: Add interface to reserve bad page
YiPeng Chai [Fri, 29 Mar 2024 06:42:59 +0000 (14:42 +0800)]
drm/amdgpu: Add interface to reserve bad page

Add interface to reserve bad page.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu: Fix uninitialized variable warnings
Ma Jun [Mon, 22 Apr 2024 06:47:52 +0000 (14:47 +0800)]
drm/amdgpu: Fix uninitialized variable warnings

return 0 to avoid returning an uninitialized variable r

Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu/mes: fix use-after-free issue
Jack Xiao [Mon, 22 Apr 2024 08:22:54 +0000 (16:22 +0800)]
drm/amdgpu/mes: fix use-after-free issue

Delete fence fallback timer to fix the ramdom
use-after-free issue.

v2: move to amdgpu_mes.c

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Lijo Lazar <lijo.lazar@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu/sdma5.2: use legacy HDP flush for SDMA2/3
Alex Deucher [Mon, 15 Apr 2024 01:20:56 +0000 (21:20 -0400)]
drm/amdgpu/sdma5.2: use legacy HDP flush for SDMA2/3

This avoids a potential conflict with firmwares with the newer
HDP flush mechanism.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu: Update CGCG settings for GFXIP 9.4.3
Rajneesh Bhardwaj [Mon, 22 Apr 2024 01:06:26 +0000 (21:06 -0400)]
drm/amdgpu: Update CGCG settings for GFXIP 9.4.3

Tune coarse grain clock gating idle threshold and rlc idle timeout to
achieve better kernel launch latency.

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agoRevert "drm/amd/display: Add fallback configuration when set DRR"
Rodrigo Siqueira [Mon, 22 Apr 2024 13:43:55 +0000 (07:43 -0600)]
Revert "drm/amd/display: Add fallback configuration when set DRR"

This reverts commit d76c0a23b557c6ebb3fac32548100d76a1e0ce23.

This change must be reverted since it caused soft hangs when changing
the refresh rate to 122 & 144Hz when using a 7000 series GPU.

Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Reported-by: Mark Broadworth <Mark.Broadworth@amd.com>
Cc: Daniel Wheeler <daniel.wheeler@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu: Fix snprintf buffer size in smu_v14_0_init_microcode
Srinivasan Shanmugam [Fri, 19 Apr 2024 09:18:03 +0000 (14:48 +0530)]
drm/amdgpu: Fix snprintf buffer size in smu_v14_0_init_microcode

This commit addresses buffer overflow in the smu_v14_0_init_microcode
function. The issue was about the snprintf function writing more bytes
into the fw_name buffer than it can hold.

The line of code is:

snprintf(fw_name, sizeof(fw_name), "amdgpu/%s.bin", ucode_prefix);

Here, snprintf is used to write a formatted string into fw_name. The
format is "amdgpu/%s.bin", where %s is a placeholder for the string
ucode_prefix. The sizeof(fw_name) argument tells snprintf the maximum
number of bytes it can write into fw_name, including the
null-terminating character. In the original code, fw_name is an array of
30 characters.

The string "amdgpu/%s.bin" could be up to 41 bytes long, which exceeds
the 30 bytes allocated for fw_name. This is because %s could be replaced
by ucode_prefix, which can be up to 29 characters long. Adding the 12
characters from "amdgpu/" and ".bin", the total length could be 41
characters.

To address this, the size of ucode_prefix has been reduced to 15
characters. This ensures that the maximum length of the string written into
fw_name does not exceed its capacity.

smu_13/14 etc. don't follow legacy scheme ie., amdgpu_ucode_legacy_naming

Fixes the below with gcc W=1:
drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu14/smu_v14_0.c: In function ‘smu_v14_0_init_microcode’:
drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu14/smu_v14_0.c:80:52: warning: ‘%s’ directive output may be truncated writing up to 29 bytes into a region of size 23 [-Wformat-truncation=]
   80 |         snprintf(fw_name, sizeof(fw_name), "amdgpu/%s.bin", ucode_prefix);
      |                                                    ^~       ~~~~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu14/smu_v14_0.c:80:9: note: ‘snprintf’ output between 12 and 41 bytes into a destination of size 30
   80 |         snprintf(fw_name, sizeof(fw_name), "amdgpu/%s.bin", ucode_prefix);
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Fixes: fe6cd9152464 ("drm/amd/swsmu: add smu14 ip support")
Cc: Li Ma <li.ma@amd.com>
Cc: Likun Gao <Likun.Gao@amd.com>
Cc: Lijo Lazar <lijo.lazar@amd.com>
Cc: Kenneth Feng <kenneth.feng@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Suggested-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu: replace tmz flag into buffer flag
Frank Min [Wed, 10 Apr 2024 13:13:25 +0000 (21:13 +0800)]
drm/amdgpu: replace tmz flag into buffer flag

Replace tmz flag into buffer flag to make it easier to understand
and extend

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Frank Min <Frank.Min@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu: init microcode chip name from ip versions
Le Ma [Wed, 17 Apr 2024 09:57:52 +0000 (17:57 +0800)]
drm/amdgpu: init microcode chip name from ip versions

To adapt to different gc versions in gfx_v9_4_3.c file.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu: Fix the ring buffer size for queue VM flush
Prike Liang [Mon, 25 Mar 2024 07:33:34 +0000 (15:33 +0800)]
drm/amdgpu: Fix the ring buffer size for queue VM flush

Here are the corrections needed for the queue ring buffer size
calculation for the following cases:
- Remove the KIQ VM flush ring usage.
- Add the invalidate TLBs packet for gfx10 and gfx11 queue.
- There's no VM flush and PFP sync, so remove the gfx9 real
  ring and compute ring buffer usage.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdkfd: Add VRAM accounting for SVM migration
Mukul Joshi [Thu, 18 Apr 2024 19:13:58 +0000 (15:13 -0400)]
drm/amdkfd: Add VRAM accounting for SVM migration

Do VRAM accounting when doing migrations to vram to make sure
there is enough available VRAM and migrating to VRAM doesn't evict
other possible non-unified memory BOs. If migrating to VRAM fails,
driver can fall back to using system memory seamlessly.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amd/pm: Restore config space after reset
Lijo Lazar [Fri, 12 Apr 2024 07:41:14 +0000 (13:11 +0530)]
drm/amd/pm: Restore config space after reset

During mode-2 reset, pci config space registers are affected at device
side. However, certain platforms have switches which assign virtual BAR
addresses and returns the same even after device is reset. This
affects pci_restore_state() as it doesn't issue another config write, if
the value read is same as the saved value.

Add a workaround to write saved config space values from driver side.
Presently, these switches are in platforms with SMU v13.0.6 SOCs, hence
restrict the workaround only to those.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu/umsch: don't execute umsch test when GPU is in reset/suspend
Lang Yu [Fri, 19 Apr 2024 07:40:08 +0000 (15:40 +0800)]
drm/amdgpu/umsch: don't execute umsch test when GPU is in reset/suspend

umsch test needs full GPU functionality(e.g., VM update, TLB flush,
possibly buffer moving under memory pressure) which may be not ready
under these states. Just skip it to avoid potential issues.

Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdkfd: Fix rescheduling of restore worker
Felix Kuehling [Fri, 19 Apr 2024 17:25:58 +0000 (13:25 -0400)]
drm/amdkfd: Fix rescheduling of restore worker

Handle the case that the restore worker was already scheduled by another
eviction while the restore was in progress.

Fixes: 9a1c1339abf9 ("drm/amdkfd: Run restore_workers on freezable WQs")
Signed-off-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Tested-by: Yunxiang Li <Yunxiang.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu: Update BO eviction priorities
Felix Kuehling [Thu, 18 Apr 2024 17:56:42 +0000 (13:56 -0400)]
drm/amdgpu: Update BO eviction priorities

Make SVM BOs more likely to get evicted than other BOs. These BOs
opportunistically use available VRAM, but can fall back relatively
seamlessly to system memory. It also avoids SVM migrations evicting
other, more important BOs as they will evict other SVM allocations
first.

Signed-off-by: Felix Kuehling <felix.kuehling@amd.com>
Acked-by: Mukul Joshi <mukul.joshi@amd.com>
Tested-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amd/display: Remove duplicate dcn32/dcn32_clk_mgr.h header
Jiapeng Chong [Fri, 19 Apr 2024 02:18:47 +0000 (10:18 +0800)]
drm/amd/display: Remove duplicate dcn32/dcn32_clk_mgr.h header

./drivers/gpu/drm/amd/display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c: dcn32/dcn32_clk_mgr.h is included more than once.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=8789
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu/vcn: fix unitialized variable warnings
Pierre-Eric Pelloux-Prayer [Thu, 18 Apr 2024 18:06:08 +0000 (20:06 +0200)]
drm/amdgpu/vcn: fix unitialized variable warnings

Avoid returning an uninitialized value if we never enter the loop.
This case should never be hit in practice, but returning 0 doesn't
hurt.

The same fix is applied to the 4 places using the same pattern.

v2: - fixed typos in commit message (Alex)
    - use "return 0;" before the done label instead of initializing
      r to 0

Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu/mes11: print MES opcodes rather than numbers
Alex Deucher [Sat, 30 Mar 2024 13:46:49 +0000 (09:46 -0400)]
drm/amdgpu/mes11: print MES opcodes rather than numbers

Makes it easier to review the logs when there are MES
errors.

v2: use dbg for emitted, add helpers for fetching strings
v3: fix missing commas (Harish)
v4: drop command prefixes (Felix)
v5: squash in bounds fix (Jun)

Acked-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed by Shaoyun.liu <Shaoyun.liu@amd.com> (v2)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu/vpe: fix vpe dpm setup failed
Peyton Lee [Fri, 19 Apr 2024 06:07:39 +0000 (14:07 +0800)]
drm/amdgpu/vpe: fix vpe dpm setup failed

The vpe dpm settings should be done before firmware is loaded.
Otherwise, the frequency cannot be successfully raised.

Signed-off-by: Peyton Lee <peytolee@amd.com>
Reviewed-by: Lang Yu <lang.yu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu: Assign correct bits for SDMA HDP flush
Lijo Lazar [Wed, 10 Apr 2024 14:00:46 +0000 (19:30 +0530)]
drm/amdgpu: Assign correct bits for SDMA HDP flush

HDP Flush request bit can be kept unique per AID, and doesn't need to be
unique SOC-wide. Assign only bits 10-13 for SDMA v4.4.2.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amd/swsmu: add if condition for smu v14.0.1
Li Ma [Wed, 17 Apr 2024 12:42:44 +0000 (20:42 +0800)]
drm/amd/swsmu: add if condition for smu v14.0.1

smu v14.0.1 re-used smu v14.0.0

Signed-off-by: Li Ma <li.ma@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu/pm: Print od status info
Ma Jun [Tue, 16 Apr 2024 07:35:25 +0000 (15:35 +0800)]
drm/amdgpu/pm: Print od status info

Print the od status info if it's not supported.

Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu: Support setting reset_method at runtime
Stanley.Yang [Fri, 12 Apr 2024 02:36:25 +0000 (10:36 +0800)]
drm/amdgpu: Support setting reset_method at runtime

In order to support more test cases, support user
change reset_method at runtime.

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdgpu/pm: Remove gpu_od if it's an empty directory
Ma Jun [Tue, 16 Apr 2024 09:30:12 +0000 (17:30 +0800)]
drm/amdgpu/pm: Remove gpu_od if it's an empty directory

gpu_od should be removed if it's an empty directory

Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Reported-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Suggested-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agodrm/amdkfd: demote unsupported device messages to dev_info
Alex Deucher [Fri, 19 Apr 2024 13:57:56 +0000 (09:57 -0400)]
drm/amdkfd: demote unsupported device messages to dev_info

It's not really an error since the devices don't support
the necessary hardware functionality.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3331
Tested-by: Paul Menzel <pmenzel@molgen.mpg.de>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 months agoBackmerge tag 'v6.9-rc5' into drm-next
Dave Airlie [Mon, 22 Apr 2024 04:35:22 +0000 (14:35 +1000)]
Backmerge tag 'v6.9-rc5' into drm-next

Linux 6.9-rc5

I've had a persistent msm failure on clang, and the fix is in fixes
so just pull it back to fix that.

Signed-off-by: Dave Airlie <airlied@redhat.com>
11 months agoMerge tag 'drm-misc-next-2024-04-19' of https://gitlab.freedesktop.org/drm/misc/kerne...
Dave Airlie [Mon, 22 Apr 2024 02:29:17 +0000 (12:29 +1000)]
Merge tag 'drm-misc-next-2024-04-19' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next

drm-misc-next for v6.10-rc1:

UAPI Changes:
- Add SIZE_HINTS property for cursor planes.

Cross-subsystem Changes:

Core Changes:
- Document the requirements and expectations of adding new
  driver-specific properties.
- Assorted small fixes to ttm.
- More Kconfig fixes.
- Add struct drm_edid_product_id and helpers.
- Use drm device based logging in more drm functions.
- Fixes for drm-panic, and option to test it.
- Assorted small fixes and updates to edid.
- Add drm_crtc_vblank_crtc and use it in vkms, nouveau.

Driver Changes:
- Assorted small fixes and improvements to bridge/imx8mp-hdmi-tx, nouveau, ast, qaic, lima, vc4, bridge/anx7625, mipi-dsi.
- Add drm panic to simpledrm, mgag200, imx, ast.
- Use dev_err_probe in bridge/panel drivers.
- Add Innolux G121X1-L03, LG sw43408 panels.
- Use struct drm_edid in i915 bios parsing.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/2dc1b7c6-1743-4ddd-ad42-36f700234fbe@linux.intel.com
11 months agoMerge tag 'amd-drm-next-6.10-2024-04-19' of https://gitlab.freedesktop.org/agd5f...
Dave Airlie [Mon, 22 Apr 2024 00:51:39 +0000 (10:51 +1000)]
Merge tag 'amd-drm-next-6.10-2024-04-19' of https://gitlab.freedesktop.org/agd5f/linux into drm-next

amd-drm-next-6.10-2024-04-19:

amdgpu:
- DC resource allocation logic updates
- DC IPS fixes
- DC YUV fixes
- DMCUB fixes
- DML2 fixes
- Devcoredump updates
- USB-C DSC fix
- Misc display code cleanups
- PSR fixes
- MES timeout fix
- RAS updates
- UAF fix in VA IOCTL
- Fix visible VRAM handling during faults
- Fix IP discovery handling during PCI rescans
- Misc code cleanups
- PSP 14 updates
- More runtime PM code rework
- SMU 14.0.2 support
- GPUVM page fault redirection to secondary IH rings for IH 6.x
- Suspend/resume fixes
- SR-IOV fixes

amdkfd:
- Fix eviction fence handling
- Fix leak in GPU memory allocation failure case
- DMABuf import handling fix

radeon:
- Silence UBSAN warnings related to flexible arrays

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240419224332.2938259-1-alexander.deucher@amd.com
11 months agoLinux 6.9-rc5
Linus Torvalds [Sun, 21 Apr 2024 19:35:54 +0000 (12:35 -0700)]
Linux 6.9-rc5

11 months agoMerge tag 'char-misc-6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh...
Linus Torvalds [Sun, 21 Apr 2024 17:32:58 +0000 (10:32 -0700)]
Merge tag 'char-misc-6.9-rc5' of git://git./linux/kernel/git/gregkh/char-misc

Pull char / misc driver fixes from Greg KH:
 "Here are some small char/misc and other driver fixes for 6.9-rc5.
  Included in here are the following:

   - binder driver fix for reported problem

   - speakup crash fix

   - mei driver fixes for reported problems

   - comdei driver fix

   - interconnect driver fixes

   - rtsx driver fix

   - peci.h kernel doc fix

  All of these have been in linux-next for over a week with no reported
  problems"

* tag 'char-misc-6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  peci: linux/peci.h: fix Excess kernel-doc description warning
  binder: check offset alignment in binder_get_object()
  comedi: vmk80xx: fix incomplete endpoint checking
  mei: vsc: Unregister interrupt handler for system suspend
  Revert "mei: vsc: Call wake_up() in the threaded IRQ handler"
  misc: rtsx: Fix rts5264 driver status incorrect when card removed
  mei: me: disable RPL-S on SPS and IGN firmwares
  speakup: Avoid crash on very long word
  interconnect: Don't access req_list while it's being manipulated
  interconnect: qcom: x1e80100: Remove inexistent ACV_PERF BCM

11 months agoMerge tag 'driver-core-6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 21 Apr 2024 17:30:21 +0000 (10:30 -0700)]
Merge tag 'driver-core-6.9-rc5' of git://git./linux/kernel/git/gregkh/driver-core

Pull kernfs bugfix and documentation update from Greg KH:
 "Here are two changes for 6.9-rc5 that deal with "driver core" stuff,
  that do the following:

   - sysfs reference leak fix

   - embargoed-hardware-issues.rst update for Power

  Both of these have been in linux-next for over a week with no reported
  issues"

* tag 'driver-core-6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  Documentation: embargoed-hardware-issues.rst: Add myself for Power
  fs: sysfs: Fix reference leak in sysfs_break_active_protection()

11 months agoMerge tag 'tty-6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Linus Torvalds [Sun, 21 Apr 2024 17:27:01 +0000 (10:27 -0700)]
Merge tag 'tty-6.9-rc5' of git://git./linux/kernel/git/gregkh/tty

Pull tty/serial driver fixes from Greg KH:
 "Here are some small tty and serial driver fixes for 6.9-rc5 that
  resolve a bunch of reported problems. Included in here are:

   - MAINTAINERS and .mailmap update for Richard Genoud

   - serial core regression fixes from 6.9-rc1 changes

   - pci id cleanups

   - serial core crash fix

   - stm32 driver fixes

   - 8250 driver fixes

  All of these have been in linux-next for a while with no reported
  problems"

* tag 'tty-6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  serial: stm32: Reset .throttled state in .startup()
  serial: stm32: Return IRQ_NONE in the ISR if no handling happend
  serial: core: Fix missing shutdown and startup for serial base port
  serial: core: Clearing the circular buffer before NULLifying it
  MAINTAINERS: mailmap: update Richard Genoud's email address
  serial/pmac_zilog: Remove flawed mitigation for rx irq flood
  serial: 8250_pci: Remove redundant PCI IDs
  serial: core: Fix regression when runtime PM is not enabled
  serial: mxs-auart: add spinlock around changing cts state
  serial: 8250_dw: Revert: Do not reclock if already at correct rate
  serial: 8250_lpc18xx: disable clks on error in probe()

11 months agoMerge tag 'usb-6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Linus Torvalds [Sun, 21 Apr 2024 17:23:27 +0000 (10:23 -0700)]
Merge tag 'usb-6.9-rc5' of git://git./linux/kernel/git/gregkh/usb

Pull USB / Thunderbolt driver fixes from Greg KH:
 "Here are some small USB and Thunderbolt driver fixes for 6.9-rc5.
  Included in here are:

   - MAINTAINER file update for invalid email address

   - usb-serial device id updates

   - typec driver fixes

   - thunderbolt / usb4 driver fixes

   - usb core shutdown fixes

   - cdc-wdm driver revert for reported problem in -rc1

   - usb gadget driver fixes

   - xhci driver fixes

  All of these have been in linux-next for a while with no reported
  problems"

* tag 'usb-6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (25 commits)
  USB: serial: option: add Telit FN920C04 rmnet compositions
  usb: dwc3: ep0: Don't reset resource alloc flag
  Revert "usb: cdc-wdm: close race between read and workqueue"
  USB: serial: option: add Rolling RW101-GL and RW135-GL support
  USB: serial: option: add Lonsung U8300/U9300 product
  USB: serial: option: add support for Fibocom FM650/FG650
  USB: serial: option: support Quectel EM060K sub-models
  USB: serial: option: add Fibocom FM135-GL variants
  usb: misc: onboard_usb_hub: Disable the USB hub clock on failure
  thunderbolt: Avoid notify PM core about runtime PM resume
  thunderbolt: Fix wake configurations after device unplug
  usb: dwc2: host: Fix dereference issue in DDMA completion flow.
  usb: typec: mux: it5205: Fix ChipID value typo
  MAINTAINERS: Drop Li Yang as their email address stopped working
  usb: gadget: fsl: Initialize udc before using it
  usb: Disable USB3 LPM at shutdown
  usb: gadget: f_ncm: Fix UAF ncm object at re-bind after usb ep transport error
  usb: typec: tcpm: Correct the PDO counting in pd_set
  usb: gadget: functionfs: Wait for fences before enqueueing DMABUF
  usb: gadget: functionfs: Fix inverted DMA fence direction
  ...

11 months agoMerge tag 'sched_urgent_for_v6.9_rc5' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 21 Apr 2024 16:39:36 +0000 (09:39 -0700)]
Merge tag 'sched_urgent_for_v6.9_rc5' of git://git./linux/kernel/git/tip/tip

Pull scheduler fix from Borislav Petkov:

 - Add a missing memory barrier in the concurrency ID mm switching

* tag 'sched_urgent_for_v6.9_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched: Add missing memory barrier in switch_mm_cid

11 months agoMerge tag 'x86_urgent_for_v6.9_rc5' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 21 Apr 2024 16:36:12 +0000 (09:36 -0700)]
Merge tag 'x86_urgent_for_v6.9_rc5' of git://git./linux/kernel/git/tip/tip

Pull x86 fixes from Borislav Petkov:

 - Fix CPU feature dependencies of GFNI, VAES, and VPCLMULQDQ

 - Print the correct error code when FRED reports a bad event type

 - Add a FRED-specific INT80 handler without the special dances that
   need to happen in the current one

 - Enable the using-the-default-return-thunk-but-you-should-not warning
   only on configs which actually enable those special return thunks

 - Check the proper feature flags when selecting BHI retpoline
   mitigation

* tag 'x86_urgent_for_v6.9_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/cpufeatures: Fix dependencies for GFNI, VAES, and VPCLMULQDQ
  x86/fred: Fix incorrect error code printout in fred_bad_type()
  x86/fred: Fix INT80 emulation for FRED
  x86/retpolines: Enable the default thunk warning only on relevant configs
  x86/bugs: Fix BHI retpoline check

11 months agoMerge tag 'block-6.9-20240420' of git://git.kernel.dk/linux
Linus Torvalds [Sat, 20 Apr 2024 18:28:02 +0000 (11:28 -0700)]
Merge tag 'block-6.9-20240420' of git://git.kernel.dk/linux

Pull block fixes from Jens Axboe:
 "Just two minor fixes that should go into the 6.9 kernel release, one
  fixing a regression with partition scanning errors, and one fixing a
  WARN_ON() that can get triggered if we race with a timer"

* tag 'block-6.9-20240420' of git://git.kernel.dk/linux:
  blk-iocost: do not WARN if iocg was already offlined
  block: propagate partition scanning errors to the BLKRRPART ioctl

11 months agoMerge tag 'email' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds [Sat, 20 Apr 2024 18:17:22 +0000 (11:17 -0700)]
Merge tag 'email' of git://git./linux/kernel/git/jejb/scsi

Pull email address update from James Bottomley:
 "My IBM email has stopped working, so update to a working email
  address"

* tag 'email' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  MAINTAINERS: update to working email address

11 months agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Sat, 20 Apr 2024 18:10:51 +0000 (11:10 -0700)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "This is a bit on the large side, mostly due to two changes:

   - Changes to disable some broken PMU virtualization (see below for
     details under "x86 PMU")

   - Clean up SVM's enter/exit assembly code so that it can be compiled
     without OBJECT_FILES_NON_STANDARD. This fixes a warning "Unpatched
     return thunk in use. This should not happen!" when running KVM
     selftests.

  Everything else is small bugfixes and selftest changes:

   - Fix a mostly benign bug in the gfn_to_pfn_cache infrastructure
     where KVM would allow userspace to refresh the cache with a bogus
     GPA. The bug has existed for quite some time, but was exposed by a
     new sanity check added in 6.9 (to ensure a cache is either
     GPA-based or HVA-based).

   - Drop an unused param from gfn_to_pfn_cache_invalidate_start() that
     got left behind during a 6.9 cleanup.

   - Fix a math goof in x86's hugepage logic for
     KVM_SET_MEMORY_ATTRIBUTES that results in an array overflow
     (detected by KASAN).

   - Fix a bug where KVM incorrectly clears root_role.direct when
     userspace sets guest CPUID.

   - Fix a dirty logging bug in the where KVM fails to write-protect
     SPTEs used by a nested guest, if KVM is using Page-Modification
     Logging and the nested hypervisor is NOT using EPT.

  x86 PMU:

   - Drop support for virtualizing adaptive PEBS, as KVM's
     implementation is architecturally broken without an obvious/easy
     path forward, and because exposing adaptive PEBS can leak host LBRs
     to the guest, i.e. can leak host kernel addresses to the guest.

   - Set the enable bits for general purpose counters in
     PERF_GLOBAL_CTRL at RESET time, as done by both Intel and AMD
     processors.

   - Disable LBR virtualization on CPUs that don't support LBR
     callstacks, as KVM unconditionally uses
     PERF_SAMPLE_BRANCH_CALL_STACK when creating the perf event, and
     would fail on such CPUs.

  Tests:

   - Fix a flaw in the max_guest_memory selftest that results in it
     exhausting the supply of ucall structures when run with more than
     256 vCPUs.

   - Mark KVM_MEM_READONLY as supported for RISC-V in
     set_memory_region_test"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (30 commits)
  KVM: Drop unused @may_block param from gfn_to_pfn_cache_invalidate_start()
  KVM: selftests: Add coverage of EPT-disabled to vmx_dirty_log_test
  KVM: x86/mmu: Fix and clarify comments about clearing D-bit vs. write-protecting
  KVM: x86/mmu: Remove function comments above clear_dirty_{gfn_range,pt_masked}()
  KVM: x86/mmu: Write-protect L2 SPTEs in TDP MMU when clearing dirty status
  KVM: x86/mmu: Precisely invalidate MMU root_role during CPUID update
  KVM: VMX: Disable LBR virtualization if the CPU doesn't support LBR callstacks
  perf/x86/intel: Expose existence of callback support to KVM
  KVM: VMX: Snapshot LBR capabilities during module initialization
  KVM: x86/pmu: Do not mask LVTPC when handling a PMI on AMD platforms
  KVM: x86: Snapshot if a vCPU's vendor model is AMD vs. Intel compatible
  KVM: x86: Stop compiling vmenter.S with OBJECT_FILES_NON_STANDARD
  KVM: SVM: Create a stack frame in __svm_sev_es_vcpu_run()
  KVM: SVM: Save/restore args across SEV-ES VMRUN via host save area
  KVM: SVM: Save/restore non-volatile GPRs in SEV-ES VMRUN via host save area
  KVM: SVM: Clobber RAX instead of RBX when discarding spec_ctrl_intercepted
  KVM: SVM: Drop 32-bit "support" from __svm_sev_es_vcpu_run()
  KVM: SVM: Wrap __svm_sev_es_vcpu_run() with #ifdef CONFIG_KVM_AMD_SEV
  KVM: SVM: Create a stack frame in __svm_vcpu_run() for unwinding
  KVM: SVM: Remove a useless zeroing of allocated memory
  ...

11 months agoMerge tag 'powerpc-6.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc...
Linus Torvalds [Sat, 20 Apr 2024 18:06:42 +0000 (11:06 -0700)]
Merge tag 'powerpc-6.9-3' of git://git./linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:

 - Fix wireguard loading failure on pre-Power10 due to Power10 crypto
   routines

 - Fix papr-vpd selftest failure due to missing variable initialization

 - Avoid unnecessary get/put in spapr_tce_platform_iommu_attach_dev()

Thanks to Geetika Moolchandani, Jason Gunthorpe, Michal Suchánek, Nathan
Lynch, and Shivaprasad G Bhat.

* tag 'powerpc-6.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  selftests/powerpc/papr-vpd: Fix missing variable initialization
  powerpc/crypto/chacha-p10: Fix failure on non Power10
  powerpc/iommu: Refactor spapr_tce_platform_iommu_attach_dev()

11 months agoMerge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 20 Apr 2024 17:36:02 +0000 (10:36 -0700)]
Merge tag 'clk-fixes-for-linus' of git://git./linux/kernel/git/clk/linux

Pull clk fixes from Stephen Boyd:
 "A couple clk driver fixes, a build fix, and a deadlock fix:

   - Mediatek mt7988 has broken PCIe because the wrong parent is used

   - Mediatek clk drivers may deadlock when registering their clks
     because the clk provider device is repeatedly runtime PM resumed
     and suspended during probe and clk registration.

     Resuming the clk provider device deadlocks with an ABBA deadlock
     due to genpd_lock and the clk prepare_lock. The fix is to keep the
     device runtime resumed while registering clks.

   - Another runtime PM related deadlock, this time with disabling
     unused clks during late init.

     We get an ABBA deadlock where a device is runtime PM resuming (or
     suspending) while the disabling of unused clks is happening in
     parallel. That runtime PM action calls into the clk framework and
     tries to grab the clk prepare_lock while the disabling of unused
     clks holds the prepare_lock and is waiting for that runtime PM
     action to complete.

     The fix is to runtime resume all the clk provider devices before
     grabbing the clk prepare_lock during disable unused.

   - A build fix to provide an empty devm_clk_rate_exclusive_get()
     function when CONFIG_COMMON_CLK=n"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: mediatek: mt7988-infracfg: fix clocks for 2nd PCIe port
  clk: mediatek: Do a runtime PM get on controllers during probe
  clk: Get runtime PM before walking tree for clk_summary
  clk: Get runtime PM before walking tree during disable_unused
  clk: Initialize struct clk_core kref earlier
  clk: Don't hold prepare_lock when calling kref_put()
  clk: Remove prepare_lock hold assertion in __clk_release()
  clk: Provide !COMMON_CLK dummy for devm_clk_rate_exclusive_get()

11 months agoMAINTAINERS: update to working email address
James Bottomley [Sat, 20 Apr 2024 12:34:09 +0000 (08:34 -0400)]
MAINTAINERS: update to working email address

jejb@linux.ibm.com no longer works.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
11 months agoMerge tag 'perf-tools-fixes-for-v6.9-2024-04-19' of git://git.kernel.org/pub/scm...
Linus Torvalds [Fri, 19 Apr 2024 23:34:10 +0000 (16:34 -0700)]
Merge tag 'perf-tools-fixes-for-v6.9-2024-04-19' of git://git./linux/kernel/git/perf/perf-tools

Pull perf tools fixes from Namhyung Kim:
 "A random set of small bug fixes:

   - Fix perf annotate TUI when used with data type profiling

   - Work around BPF verifier about sighand lock checking

  And a set of kernel header synchronization"

* tag 'perf-tools-fixes-for-v6.9-2024-04-19' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
  tools/include: Sync arm64 asm/cputype.h with the kernel sources
  tools/include: Sync asm-generic/bitops/fls.h with the kernel sources
  tools/include: Sync x86 asm/msr-index.h with the kernel sources
  tools/include: Sync x86 asm/irq_vectors.h with the kernel sources
  tools/include: Sync x86 CPU feature headers with the kernel sources
  tools/include: Sync uapi/sound/asound.h with the kernel sources
  tools/include: Sync uapi/linux/kvm.h and asm/kvm.h with the kernel sources
  tools/include: Sync uapi/linux/fs.h with the kernel sources
  tools/include: Sync uapi/drm/i915_drm.h with the kernel sources
  perf lock contention: Add a missing NULL check
  perf annotate: Make sure to call symbol__annotate2() in TUI

11 months agoMerge tag 'hardening-v6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees...
Linus Torvalds [Fri, 19 Apr 2024 21:10:11 +0000 (14:10 -0700)]
Merge tag 'hardening-v6.9-rc5' of git://git./linux/kernel/git/kees/linux

Pull hardening fixes from Kees Cook:

 - Correctly disable UBSAN configs in configs/hardening (Nathan
   Chancellor)

 - Add missing signed integer overflow trap types to arm64 handler

* tag 'hardening-v6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  ubsan: Add awareness of signed integer overflow traps
  configs/hardening: Disable CONFIG_UBSAN_SIGNED_WRAP
  configs/hardening: Fix disabling UBSAN configurations

11 months agoMerge tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg...
Linus Torvalds [Fri, 19 Apr 2024 21:02:21 +0000 (14:02 -0700)]
Merge tag 'for-linus-iommufd' of git://git./linux/kernel/git/jgg/iommufd

Pull iommufd fixes from Jason Gunthorpe:
 "Two fixes for the selftests:

   - CONFIG_IOMMUFD_TEST needs CONFIG_IOMMUFD_DRIVER to work

   - The kconfig fragment sshould include fault injection so the fault
     injection test can work"

* tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd:
  iommufd: Add config needed for iommufd_fail_nth
  iommufd: Add missing IOMMUFD_DRIVER kconfig for the selftest

11 months agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Linus Torvalds [Fri, 19 Apr 2024 20:46:44 +0000 (13:46 -0700)]
Merge tag 'for-linus' of git://git./linux/kernel/git/rdma/rdma

Pull rdma fixes from Jason Gunthorpe:

 - Add a missing mutex_destroy() in rxe

 - Enhance the debugging print for cm_destroy failures to help debug
   these

 - Fix mlx5 MAD processing in cases where multiport devices are running
   in switchedev mode

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  RDMA/mlx5: Fix port number for counter query in multi-port configuration
  RDMA/cm: Print the old state when cm_destroy_id gets timeout
  RDMA/rxe: Fix the problem "mutex_destroy missing"

11 months agoMerge tag '9p-fixes-for-6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 19 Apr 2024 20:36:28 +0000 (13:36 -0700)]
Merge tag '9p-fixes-for-6.9-rc5' of git://git./linux/kernel/git/ericvh/v9fs

Pull fs/9p fixes from Eric Van Hensbergen:
 "This contains a reversion of one of the original 6.9 patches which
  seems to have been the cause of most of the instability. It also
  incorporates several fixes to legacy support and cache fixes.

  There are few additional changes to improve stability, but I want
  another week of testing before sending them upstream"

* tag '9p-fixes-for-6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
  fs/9p: drop inodes immediately on non-.L too
  fs/9p: Revert "fs/9p: fix dups even in uncached mode"
  fs/9p: remove erroneous nlink init from legacy stat2inode
  9p: explicitly deny setlease attempts
  fs/9p: fix the cache always being enabled on files with qid flags
  fs/9p: translate O_TRUNC into OTRUNC
  fs/9p: only translate RWX permissions for plain 9P2000