linux.git
14 months agoMerge tag 'drm-misc-fixes-for-v6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel...
Dave Airlie [Fri, 26 Jan 2024 18:12:14 +0000 (04:12 +1000)]
Merge tag 'drm-misc-fixes-for-v6.8-rc2' of git://git./linux/kernel/git/daeinki/drm-exynos into drm-fixes

One regression fixup to samsung-dsim.c module
- The FORCE_STOP_STATE bit is ineffective for forcing DSI link into LP-11 mode,
  causing timing issues and potential bridge failures.
  This patch reverts previous commits and corrects this issue.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Inki Dae <inki.dae@samsung.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240126141130.15512-1-inki.dae@samsung.com
14 months agoRevert "nouveau: push event block/allowing out of the fence context"
Dave Airlie [Fri, 26 Jan 2024 18:04:34 +0000 (04:04 +1000)]
Revert "nouveau: push event block/allowing out of the fence context"

This reverts commit eacabb5462717a52fccbbbba458365a4f5e61f35.

This commit causes some regressions in desktop usage, this will
reintroduce the original deadlock in DRI_PRIME situations, I've
got an idea to fix it by offloading to a workqueue in a different
spot, however this code has a race condition where we sometimes
miss interrupts so I'd like to fix that as well.

Cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
14 months agoMerge tag 'drm-intel-fixes-2024-01-26' of git://anongit.freedesktop.org/drm/drm-intel...
Dave Airlie [Fri, 26 Jan 2024 17:58:24 +0000 (03:58 +1000)]
Merge tag 'drm-intel-fixes-2024-01-26' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

- PSR fix for HSW

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZbPGBL9lj4DxxIW1@jlahtine-mobl.ger.corp.intel.com
14 months agoMerge tag 'drm-misc-fixes-2024-01-26' of git://anongit.freedesktop.org/drm/drm-misc...
Dave Airlie [Fri, 26 Jan 2024 17:56:02 +0000 (03:56 +1000)]
Merge tag 'drm-misc-fixes-2024-01-26' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

Plenty of ivpu fixes to improve the general stability and debugging, a
suspend fix for the anx7625 bridge, a revert to fix an initialization
order bug between i915 and simpledrm and a documentation warning fix for
dp_mst.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/tp77e5fokigup6cgmpq6mtg46kzdw2dpze6smpnwfoml4kmwpq@bo6mbkezpkle
14 months agodrm: bridge: samsung-dsim: Don't use FORCE_STOP_STATE
Michael Walle [Mon, 13 Nov 2023 16:43:44 +0000 (17:43 +0100)]
drm: bridge: samsung-dsim: Don't use FORCE_STOP_STATE

The FORCE_STOP_STATE bit is unsuitable to force the DSI link into LP-11
mode. It seems the bridge internally queues DSI packets and when the
FORCE_STOP_STATE bit is cleared, they are sent in close succession
without any useful timing (this also means that the DSI lanes won't go
into LP-11 mode). The length of this gibberish varies between 1ms and
5ms. This sometimes breaks an attached bridge (TI SN65DSI84 in this
case). In our case, the bridge will fail in about 1 per 500 reboots.

The FORCE_STOP_STATE handling was introduced to have the DSI lanes in
LP-11 state during the .pre_enable phase. But as it turns out, none of
this is needed at all. Between samsung_dsim_init() and
samsung_dsim_set_display_enable() the lanes are already in LP-11 mode.
The code as it was before commit 20c827683de0 ("drm: bridge:
samsung-dsim: Fix init during host transfer") and 0c14d3130654 ("drm:
bridge: samsung-dsim: Fix i.MX8M enable flow to meet spec") was correct
in this regard.

This patch basically reverts both commits. It was tested on an i.MX8M
SoC with an SN65DSI84 bridge. The signals were probed and the DSI
packets were decoded during initialization and link start-up. After this
patch the first DSI packet on the link is a VSYNC packet and the timing
is correct.

Command mode between .pre_enable and .enable was also briefly tested by
a quick hack. There was no DSI link partner which would have responded,
but it was made sure the DSI packet was send on the link. As a side
note, the command mode seems to just work in HS mode. I couldn't find
that the bridge will handle commands in LP mode.

Fixes: 20c827683de0 ("drm: bridge: samsung-dsim: Fix init during host transfer")
Fixes: 0c14d3130654 ("drm: bridge: samsung-dsim: Fix i.MX8M enable flow to meet spec")
Signed-off-by: Michael Walle <mwalle@kernel.org>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231113164344.1612602-1-mwalle@kernel.org
14 months agodrm/sched: Drain all entities in DRM sched run job worker
Matthew Brost [Wed, 24 Jan 2024 21:08:11 +0000 (13:08 -0800)]
drm/sched: Drain all entities in DRM sched run job worker

All entities must be drained in the DRM scheduler run job worker to
avoid the following case. An entity found that is ready, no job found
ready on entity, and run job worker goes idle with other entities + jobs
ready. Draining all ready entities (i.e. loop over all ready entities)
in the run job worker ensures all job that are ready will be scheduled.

Cc: Thorsten Leemhuis <regressions@leemhuis.info>
Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Closes: https://lore.kernel.org/all/CABXGCsM2VLs489CH-vF-1539-s3in37=bwuOWtoeeE+q26zE+Q@mail.gmail.com/
Reported-and-tested-by: Mario Limonciello <mario.limonciello@amd.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3124
Link: https://lore.kernel.org/all/20240123021155.2775-1-mario.limonciello@amd.com/
Reported-and-tested-by: Vlastimil Babka <vbabka@suse.cz>
Closes: https://lore.kernel.org/dri-devel/05ddb2da-b182-4791-8ef7-82179fd159a8@amd.com/T/#m0c31d4d1b9ae9995bb880974c4f1dbaddc33a48a
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240124210811.1639040-1-matthew.brost@intel.com
14 months agoMerge tag 'amd-drm-fixes-6.8-2024-01-25' of https://gitlab.freedesktop.org/agd5f...
Dave Airlie [Fri, 26 Jan 2024 02:39:51 +0000 (12:39 +1000)]
Merge tag 'amd-drm-fixes-6.8-2024-01-25' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes

amd-drm-fixes-6.8-2024-01-25:

amdgpu:
- AC/DC power supply tracking fix
- Don't show invalid vram vendor data
- SMU 13.0.x fixes
- GART fix for umr on systems without VRAM
- GFX 10/11 UNORD_DISPATCH fixes
- IPS display fixes (required for S0ix on some platforms)
- Misc fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240125221503.5019-1-alexander.deucher@amd.com
14 months agoMerge tag 'drm-xe-fixes-2024-01-25' of https://gitlab.freedesktop.org/drm/xe/kernel...
Dave Airlie [Thu, 25 Jan 2024 20:09:13 +0000 (06:09 +1000)]
Merge tag 'drm-xe-fixes-2024-01-25' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes

Driver Changes:
- Make an ops struct static
- Fix an implicit 0 to NULL conversion
- A couple of 32-bit fixes
- A migration coherency fix for Lunar Lake.
- An error path vm id leak fix
- Remove PVC references in kunit tests

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZbIb7l0EhpVp5cXE@fedora
14 months agodrm/amd/display: "Enable IPS by default"
Roman Li [Tue, 23 Jan 2024 20:18:24 +0000 (15:18 -0500)]
drm/amd/display: "Enable IPS by default"

[Why]
IPS was temporary disabled due to instability.
It was fixed in dmub firmware and with:
- "drm/amd/display: Add IPS checks before dcn register access"
- "drm/amd/display: Disable ips before dc interrupt setting"

[How]
Enable IPS by default.
Disable IPS if 0x800 bit set in amdgpu.dcdebugmask module params

Signed-off-by: Roman Li <Roman.Li@amd.com>
Tested-by: Mark Broadworth <mark.broadworth@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
14 months agodrm/amd: Add a DC debug mask for IPS
Roman Li [Tue, 23 Jan 2024 20:14:28 +0000 (15:14 -0500)]
drm/amd: Add a DC debug mask for IPS

For debugging IPS-related issues, expose a new debug mask
that allows to disable IPS.
Usage:
amdgpu.dcdebugmask=0x800

Signed-off-by: Roman Li <Roman.Li@amd.com>
Tested-by: Mark Broadworth <mark.broadworth@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
14 months agodrm/amd/display: Disable ips before dc interrupt setting
Roman Li [Mon, 22 Jan 2024 22:45:41 +0000 (17:45 -0500)]
drm/amd/display: Disable ips before dc interrupt setting

[Why]
While in IPS2 an access to dcn registers is not allowed.
If interrupt results in dc call, we should disable IPS.

[How]
Safeguard register access in IPS2 by disabling idle optimization
before calling dc interrupt setting api.

Signed-off-by: Roman Li <Roman.Li@amd.com>
Tested-by: Mark Broadworth <mark.broadworth@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
14 months agodrm/amd/display: Replay + IPS + ABM in Full Screen VPB
ChunTao Tso [Mon, 8 Jan 2024 05:46:59 +0000 (13:46 +0800)]
drm/amd/display: Replay + IPS + ABM in Full Screen VPB

[Why]
Because ABM will wait VStart to start getting histogram data,
 it will cause we can't enter IPS while full screnn video playing.

[How]
Modify the panel refresh rate to the maximun multiple of current
 refresh rate.

Reviewed-by: Dennis Chan <dennis.chan@amd.com>
Acked-by: Roman Li <roman.li@amd.com>
Signed-off-by: ChunTao Tso <chuntao.tso@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
14 months agodrm/amd/display: Add IPS checks before dcn register access
Roman Li [Tue, 9 Jan 2024 22:31:33 +0000 (17:31 -0500)]
drm/amd/display: Add IPS checks before dcn register access

[Why]
With IPS enabled a system hangs once PSR is active.
PSR active triggers transition to IPS2 state.
While in IPS2 an access to dcn registers results in hard hang.
Existing check doesn't cover for PSR sequence.

[How]
Safeguard register access by disabling idle optimization in atomic commit
and crtc scanout. It will be re-enabled on next vblank.

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Acked-by: Roman Li <roman.li@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
14 months agodrm/amd/display: Add Replay IPS register for DMUB command table
Alvin Lee [Tue, 2 Jan 2024 17:02:39 +0000 (12:02 -0500)]
drm/amd/display: Add Replay IPS register for DMUB command table

- Introduce a new Replay mode for DMUB version 0.0.199.0

Reviewed-by: Martin Leung <martin.leung@amd.com>
Acked-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alvin Lee <alvin.lee2@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
14 months agodrm/amd/display: Allow IPS2 during Replay
Nicholas Kazlauskas [Thu, 7 Dec 2023 19:12:03 +0000 (14:12 -0500)]
drm/amd/display: Allow IPS2 during Replay

[Why & How]
Add regkey to block video playback in IPS2 by default

Allow idle optimizations in the same spot we allow Replay for
video playback usecases.

Avoid sending it when there's an external display connected by
modifying the allow idle checks to check for active non-eDP screens.

Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Acked-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
14 months agodrm/amdgpu/gfx11: set UNORD_DISPATCH in compute MQDs
Alex Deucher [Fri, 19 Jan 2024 17:32:59 +0000 (12:32 -0500)]
drm/amdgpu/gfx11: set UNORD_DISPATCH in compute MQDs

This needs to be set to 1 to avoid a potential deadlock in
the GC 10.x and newer.  On GC 9.x and older, this needs
to be set to 0. This can lead to hangs in some mixed
graphics and compute workloads. Updated firmware is also
required for AQL.

Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
14 months agodrm/amdgpu/gfx10: set UNORD_DISPATCH in compute MQDs
Alex Deucher [Fri, 19 Jan 2024 17:23:55 +0000 (12:23 -0500)]
drm/amdgpu/gfx10: set UNORD_DISPATCH in compute MQDs

This needs to be set to 1 to avoid a potential deadlock in
the GC 10.x and newer.  On GC 9.x and older, this needs
to be set to 0.  This can lead to hangs in some mixed
graphics and compute workloads.  Updated firmware is also
required for AQL.

Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
14 months agodrm/amd/amdgpu: Assign GART pages to AMD device mapping
Tom St Denis [Wed, 17 Jan 2024 17:47:37 +0000 (12:47 -0500)]
drm/amd/amdgpu: Assign GART pages to AMD device mapping

This allows kernel mapped pages like the PDB and PTB to be
read via the iomem debugfs when there is no vram in the system.

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.7.x
14 months agodrm/amd/pm: Fetch current power limit from FW
Lijo Lazar [Thu, 18 Jan 2024 08:55:35 +0000 (14:25 +0530)]
drm/amd/pm: Fetch current power limit from FW

Power limit of SMUv13.0.6 SOCs can be updated by out-of-band ways. Fetch
the limit from firmware instead of using cached values.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.7.x
14 months agodrm/amdgpu: Fix null pointer dereference
Hawking Zhang [Mon, 22 Jan 2024 09:38:23 +0000 (17:38 +0800)]
drm/amdgpu: Fix null pointer dereference

amdgpu_reg_state_sysfs_fini could be invoked at the
time when asic_func is even not initialized, i.e.,
amdgpu_discovery_init fails for some reason.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
14 months agodrm/amdgpu: Show vram vendor only if available
Lijo Lazar [Sat, 20 Jan 2024 08:18:09 +0000 (13:48 +0530)]
drm/amdgpu: Show vram vendor only if available

Ony if vram vendor info is available, show in sysfs.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.7.x
14 months agodrm/amd/pm: update the power cap setting
Kenneth Feng [Fri, 19 Jan 2024 08:12:00 +0000 (16:12 +0800)]
drm/amd/pm: update the power cap setting

update the power cap setting for smu_v13.0.0/smu_v13.0.7

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2356
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
14 months agodrm/amdgpu: Avoid fetching vram vendor information
Lijo Lazar [Sat, 20 Jan 2024 08:02:51 +0000 (13:32 +0530)]
drm/amdgpu: Avoid fetching vram vendor information

For GFX 9.4.3 APUs, the current method of fetching vram vendor
information is not reliable. Avoid fetching the information.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.7.x
14 months agodrm/amdgpu/pm: Fix the power source flag error
Ma Jun [Wed, 17 Jan 2024 06:35:29 +0000 (14:35 +0800)]
drm/amdgpu/pm: Fix the power source flag error

The power source flag should be updated when
[1] System receives an interrupt indicating that the power source
has changed.
[2] System resumes from suspend or runtime suspend

Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
14 months agodrm/amd/display: Fix uninitialized variable usage in core_link_ 'read_dpcd() & write_...
Srinivasan Shanmugam [Wed, 17 Jan 2024 03:11:52 +0000 (08:41 +0530)]
drm/amd/display: Fix uninitialized variable usage in core_link_ 'read_dpcd() & write_dpcd()' functions

The 'status' variable in 'core_link_read_dpcd()' &
'core_link_write_dpcd()' was uninitialized.

Thus, initializing 'status' variable to 'DC_ERROR_UNEXPECTED' by default.

Fixes the below:
drivers/gpu/drm/amd/amdgpu/../display/dc/link/protocols/link_dpcd.c:226 core_link_read_dpcd() error: uninitialized symbol 'status'.
drivers/gpu/drm/amd/amdgpu/../display/dc/link/protocols/link_dpcd.c:248 core_link_write_dpcd() error: uninitialized symbol 'status'.

Cc: stable@vger.kernel.org
Cc: Jerry Zuo <jerry.zuo@amd.com>
Cc: Jun Lei <Jun.Lei@amd.com>
Cc: Wayne Lin <Wayne.Lin@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
14 months agodrm/amd/pm: udpate smu v13.0.6 message permission
Yang Wang [Fri, 19 Jan 2024 03:32:41 +0000 (11:32 +0800)]
drm/amd/pm: udpate smu v13.0.6 message permission

update smu v13.0.6 message to allow guest driver set gfx clock.

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
14 months agoaccel/ivpu: Improve recovery and reset support
Jacek Lawrynowicz [Mon, 22 Jan 2024 12:09:45 +0000 (13:09 +0100)]
accel/ivpu: Improve recovery and reset support

  - Synchronize job submission with reset/recovery using reset_lock
  - Always print recovery reason and call diagnose_failure()
  - Don't allow for autosupend during recovery
  - Prevent immediate autosuspend after reset/recovery
  - Prevent force_recovery for issuing TDR when device is suspended
  - Reset VPU instead triggering recovery after changing debugfs params

Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Wachowski, Karol <karol.wachowski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240122120945.1150728-4-jacek.lawrynowicz@linux.intel.com
14 months agoaccel/ivpu: Improve stability of ivpu_submit_ioctl()
Jacek Lawrynowicz [Mon, 22 Jan 2024 12:09:44 +0000 (13:09 +0100)]
accel/ivpu: Improve stability of ivpu_submit_ioctl()

- Wake up the device as late as possible
- Remove job reference counting in order to simplify the code
- Don't put jobs that are not fully submitted on submitted_jobs_xa in
  order to avoid potential races with reset/recovery

Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Wachowski, Karol <karol.wachowski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240122120945.1150728-3-jacek.lawrynowicz@linux.intel.com
14 months agoaccel/ivpu: Fix dev open/close races with unbind
Jacek Lawrynowicz [Mon, 22 Jan 2024 12:09:43 +0000 (13:09 +0100)]
accel/ivpu: Fix dev open/close races with unbind

  - Add context_list_lock to synchronize user context addition/removal
  - Use drm_dev_enter() to prevent unbinding the device during ivpu_open()
    and vpu address allocation

Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Wachowski, Karol <karol.wachowski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240122120945.1150728-2-jacek.lawrynowicz@linux.intel.com
14 months agodrm/i915/psr: Only allow PSR in LPSP mode on HSW non-ULT
Ville Syrjälä [Thu, 18 Jan 2024 21:21:31 +0000 (23:21 +0200)]
drm/i915/psr: Only allow PSR in LPSP mode on HSW non-ULT

On HSW non-ULT (or at least on Dell Latitude E6540) external displays
start to flicker when we enable PSR on the eDP. We observe a much higher
SR and PC6 residency than should be possible with an external display,
and indeen much higher than what we observe with eDP disabled and
only the external display enabled. Looks like the hardware is somehow
ignoring the fact that the external display is active during PSR.

I wasn't able to redproduce this on my HSW ULT machine, or BDW.
So either there's something specific about this particular laptop
(eg. some unknown firmware thing) or the issue is limited to just
non-ULT HSW systems. All known registers that could affect this
look perfectly reasonable on the affected machine.

As a workaround let's unmask the LPSP event to prevent PSR entry
except while in LPSP mode (only pipe A + eDP active). This
will prevent PSR entry entirely when multiple pipes are active.
The one slight downside is that we now also prevent PSR entry
when driving eDP with pipe B or C, but I think that's a reasonable
tradeoff to avoid having to implement a more complex workaround.

Cc: stable@vger.kernel.org
Fixes: 783d8b80871f ("drm/i915/psr: Re-enable PSR1 on hsw/bdw")
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/10092
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240118212131.31868-1-ville.syrjala@linux.intel.com
Reviewed-by: Jouni Högander <jouni.hogander@intel.com>
(cherry picked from commit 94501c3ca6400e463ff6cc0c9cf4a2feb6a9205d)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
14 months agoMerge tag 'exynos-drm-fixes-for-v6.8-rc2' of git://git.kernel.org/pub/scm/linux/kerne...
Dave Airlie [Thu, 25 Jan 2024 04:22:10 +0000 (14:22 +1000)]
Merge tag 'exynos-drm-fixes-for-v6.8-rc2' of git://git./linux/kernel/git/daeinki/drm-exynos into drm-fixes

Several fixups
   - Minor fix in `drm/exynos: gsc: gsc_runtime_resume`
     . The patch ensures `clk_disable_unprepare()` is called on the first
       element of `ctx->clocks` array.
       This issue was identified by the Linux Verification Center.

   - Fix excessive stack usage in `fimd_win_set_pixfmt()` in `drm/exynos`
     . The issue, highlighted by gcc, involved an unnecessary on-stack copy of
       the large `exynos_drm_plane` structure, now replaced with a pointer.

   - Fix an incorrect type issue in `exynos_drm_fimd.c` module
     . Addresses an incorrect type issue in `fimd_commit()` within the
       `exynos_drm_fimd.c` The problem was reported by the kernel test robot[1].

     [1] https://lore.kernel.org/oe-kbuild-all/202312140930.Me9yWf8F-lkp@intel.com/

   - Fix a typo in the dt-bindings for `samsung,exynos-mixer`
     . Changes 'regs' to the correct property name 'reg' in the dt-bindings
       documentation for `samsung,exynos-mixer`

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Inki Dae <inki.dae@samsung.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240122072407.39546-1-inki.dae@samsung.com
14 months agoMerge tag 'drm-misc-next-fixes-2024-01-19' of git://anongit.freedesktop.org/drm/drm...
Dave Airlie [Thu, 25 Jan 2024 04:19:52 +0000 (14:19 +1000)]
Merge tag 'drm-misc-next-fixes-2024-01-19' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

A null pointer dereference fix for v3d and a protection fault fix for
ttm.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/5zrphn2nhxnwillxlmo6ap3zh7qjt3jgydlm5sntuc4fzvwhpo@hznprx2bjyi7
14 months agoMerge tag 'drm-intel-next-fixes-2024-01-19' of git://anongit.freedesktop.org/drm...
Dave Airlie [Thu, 25 Jan 2024 04:19:12 +0000 (14:19 +1000)]
Merge tag 'drm-intel-next-fixes-2024-01-19' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

- DSI sequence revert to fix GitLab #10071 and DP test-pattern fix
- Drop -Wstringop-overflow (broken on GCC11)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZaozNnAGhu6Ec6cb@jlahtine-mobl.ger.corp.intel.com
14 months agoRevert "drivers/firmware: Move sysfb_init() from device_initcall to subsys_initcall_sync"
Thomas Zimmermann [Tue, 23 Jan 2024 12:09:26 +0000 (13:09 +0100)]
Revert "drivers/firmware: Move sysfb_init() from device_initcall to subsys_initcall_sync"

This reverts commit 60aebc9559492cea6a9625f514a8041717e3a2e4.

Commit 60aebc9559492cea ("drivers/firmware: Move sysfb_init() from
device_initcall to subsys_initcall_sync") messes up initialization order
of the graphics drivers and leads to blank displays on some systems. So
revert the commit.

To make the display drivers fully independent from initialization
order requires to track framebuffer memory by device and independently
from the loaded drivers. The kernel currently lacks the infrastructure
to do so.

Reported-by: Jaak Ristioja <jaak@ristioja.ee>
Closes: https://lore.kernel.org/dri-devel/ZUnNi3q3yB3zZfTl@P70.localdomain/T/#t
Reported-by: Huacai Chen <chenhuacai@loongson.cn>
Closes: https://lore.kernel.org/dri-devel/20231108024613.2898921-1-chenhuacai@loongson.cn/
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/10133
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Javier Martinez Canillas <javierm@redhat.com>
Cc: Thorsten Leemhuis <regressions@leemhuis.info>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: stable@vger.kernel.org # v6.5+
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240123120937.27736-1-tzimmermann@suse.de
14 months agodrm/xe: Remove PVC from xe_wa kunit tests
Lucas De Marchi [Tue, 23 Jan 2024 03:12:42 +0000 (19:12 -0800)]
drm/xe: Remove PVC from xe_wa kunit tests

Since the PCI IDs for PVC weren't added to the xe driver, the xe_wa
tests should not try to create a fake PVC device since they can't find
the right PCI ID. Fix bugs when running kunit:

# xe_wa_gt: ASSERTION FAILED at drivers/gpu/drm/xe/tests/xe_wa_test.c:111
Expected ret == 0, but
    ret == -19 (0xffffffffffffffed)
[FAILED] PVC (B0)
# xe_wa_gt: ASSERTION FAILED at drivers/gpu/drm/xe/tests/xe_wa_test.c:111
Expected ret == 0, but
    ret == -19 (0xffffffffffffffed)
[FAILED] PVC (B1)
# xe_wa_gt: ASSERTION FAILED at drivers/gpu/drm/xe/tests/xe_wa_test.c:111
Expected ret == 0, but
    ret == -19 (0xffffffffffffffed)
[FAILED] PVC (C0)

Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240123031242.3548724-1-lucas.demarchi@intel.com
(cherry picked from commit ab5ae65fb25d06c38a6617a628b964828adb4786)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
14 months agodrm/xe/vm: bugfix in xe_vm_create_ioctl
Moti Haimovski [Mon, 22 Jan 2024 10:24:24 +0000 (12:24 +0200)]
drm/xe/vm: bugfix in xe_vm_create_ioctl

Fix xe_vm_create_ioctl routine not freeing the vm-id allocated to it
when the function fails.

Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Moti Haimovski <mhaimovski@habana.ai>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240122102424.4008095-1-mhaimovski@habana.ai
(cherry picked from commit f6bf0424cadc27d7cf6a049d2db960e4b52fa513)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
14 months agodrm/xe/xe2: Use XE_CACHE_WB pat index
Himal Prasad Ghimiray [Fri, 19 Jan 2024 04:18:26 +0000 (09:48 +0530)]
drm/xe/xe2: Use XE_CACHE_WB pat index

The pat table entry associated with XE_CACHE_WB is coherent whereas
XE_CACHE_NONE is non coherent. Migration expects the coherency
with cpu therefore use the coherent entry XE_CACHE_WB for
buffers not supporting compression. For read/write to flat ccs region
the issue is not related to coherency with cpu. The hardware expects
the pat index associated with GPUVA for indirect access to be
compression enabled hence use XE_CACHE_NONE_COMPRESSION.

v2
- Fix the argument to emit_pte, pass the bool directly. (Thomas)

v3
- Rebase
- Update commit message (Matt)

v4
- Add a Fixes: tag. (Thomas)

Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Fixes: 65ef8dbad1db ("drm/xe/xe2: Update emit_pte to use compression enabled PAT index")
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240119041826.1670496-1-himal.prasad.ghimiray@intel.com
(cherry picked from commit 6a02867560f77328ae5637b70b06704b140aafa6)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
14 months agodrm/xe/display: Avoid calling readq()
Lucas De Marchi [Fri, 19 Jan 2024 00:16:10 +0000 (16:16 -0800)]
drm/xe/display: Avoid calling readq()

readq() is not available in 32bits and i915_gem_object_read_from_page()
is supposed to allow reading arbitrary sizes determined by the `size`
argument. Currently the only caller only passes a size == 8 so the
second problem is not that big. Migrate to calling
memcpy()/memcpy_fromio() to allow possible changes in the display side
and to fix the build on 32b architectures.

v2: Use memcpy/memcpy_fromio directly rather than using iosys-map with
    the same size == 8 bytes restriction (Matt Roper)

Fixes: 44e694958b95 ("drm/xe/display: Implement display support")
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240119001612.2991381-4-lucas.demarchi@intel.com
(cherry picked from commit 406663f777bee53e9ad93dc080c333d4655ab7de)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
14 months agodrm/xe/mmio: Cast to u64 when printing
Lucas De Marchi [Fri, 19 Jan 2024 00:16:09 +0000 (16:16 -0800)]
drm/xe/mmio: Cast to u64 when printing

resource_size_t uses %pa format in printk since the size varies
depending on build options. However to keep the io_size/physical_size
addition in the same call we can't pass the address without adding yet
another variable in these function. Simply cast it to u64 and keep using
%llx.

Fixes: 286089ce6929 ("drm/xe: Improve vram info debug printing")
Cc: Oak Zeng <oak.zeng@intel.com>
Cc: Michael J. Ruhl <michael.j.ruhl@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240119001612.2991381-3-lucas.demarchi@intel.com
(cherry picked from commit 6d8d038364d8ec573e9dc0872e17bee1e5f12490)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
14 months agodrm/xe: Use _ULL for u64 division
Lucas De Marchi [Fri, 19 Jan 2024 00:16:08 +0000 (16:16 -0800)]
drm/xe: Use _ULL for u64 division

Use DIV_ROUND_UP_ULL() so it also works on 32bit build.

Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240119001612.2991381-2-lucas.demarchi@intel.com
(cherry picked from commit 7b5bdb447b14930b9ef3e39bd301937889c60c96)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
14 months agodrm/xe: Use a NULL pointer instead of 0.
Thomas Hellström [Wed, 17 Jan 2024 13:40:48 +0000 (14:40 +0100)]
drm/xe: Use a NULL pointer instead of 0.

The last argument of xe_pcode_read() is a pointer. Use NULL instead of 0.

Fixes: 92d44a422d0d ("drm/xe/hwmon: Expose card reactive critical power")
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Francois Dugast <francois.dugast@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240117134048.165425-6-thomas.hellstrom@linux.intel.com
(cherry picked from commit 79f8eacbdf9dad7ead39b3319e31e12d4dc6529e)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
14 months agodrm/xe/dmabuf: Make xe_dmabuf_ops static
Thomas Hellström [Wed, 17 Jan 2024 13:40:44 +0000 (14:40 +0100)]
drm/xe/dmabuf: Make xe_dmabuf_ops static

It is not referenced outside of the xe_dma_buf.c source file.

Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Francois Dugast <francois.dugast@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240117134048.165425-2-thomas.hellstrom@linux.intel.com
(cherry picked from commit e2dc52f849f8694bdabb75127164c9df622af459)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
14 months agodrm/dp_mst: Separate @failing_port list in drm_dp_mst_atomic_check_mgr() comment
Bagas Sanjaya [Tue, 14 Nov 2023 08:10:33 +0000 (15:10 +0700)]
drm/dp_mst: Separate @failing_port list in drm_dp_mst_atomic_check_mgr() comment

Stephen Rothwell reported htmldocs warnings when merging drm-intel
tree:

Documentation/gpu/drm-kms-helpers:296: drivers/gpu/drm/display/drm_dp_mst_topology.c:5484: ERROR: Unexpected indentation.
Documentation/gpu/drm-kms-helpers:296: drivers/gpu/drm/display/drm_dp_mst_topology.c:5488: WARNING: Block quote ends without a blank line; unexpected unindent.

Separate @failing_port return value list by surrounding it with a
blank line to fix above warnings.

Fixes: 1cd0a5ea427931 ("drm/dp_mst: Factor out a helper to check the atomic state of a topology manager")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Closes: https://lore.kernel.org/linux-next/20231114141715.6f435118@canb.auug.org.au/
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231114081033.27343-1-bagasdotme@gmail.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
14 months agodrm/bridge: anx7625: Ensure bridge is suspended in disable()
Hsin-Yi Wang [Thu, 18 Jan 2024 01:58:14 +0000 (17:58 -0800)]
drm/bridge: anx7625: Ensure bridge is suspended in disable()

Similar to commit 26db46bc9c67 ("drm/bridge: parade-ps8640: Ensure bridge
is suspended in .post_disable()"). Add a mutex to ensure that aux transfer
won't race with atomic_disable by holding the PM reference and prevent
the bridge from suspend.

Also we need to use pm_runtime_put_sync_suspend() to suspend the bridge
instead of idle with pm_runtime_put_sync().

Fixes: 3203e497eb76 ("drm/bridge: anx7625: Synchronously run runtime suspend.")
Fixes: adca62ec370c ("drm/bridge: anx7625: Support reading edid through aux channel")
Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org>
Tested-by: Xuxin Xiong <xuxinxiong@huaqin.corp-partner.google.com>
Reviewed-by: Pin-yen Lin <treapking@chromium.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20240118015916.2296741-1-hsinyi@chromium.org
14 months agoaccel/ivpu: Disable PLL after VPU IP reset during FLR
Jacek Lawrynowicz [Tue, 24 Oct 2023 16:53:53 +0000 (18:53 +0200)]
accel/ivpu: Disable PLL after VPU IP reset during FLR

IP reset has to followed by ivpu_pll_disable() to properly enter
reset state.

Fixes: 828d63042aec ("accel/ivpu: Don't enter d0i3 during FLR")
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231024165353.761507-1-stanislaw.gruszka@linux.intel.com
14 months agoaccel/ivpu: Deprecate DRM_IVPU_PARAM_CONTEXT_PRIORITY param
Wachowski, Karol [Mon, 15 Jan 2024 13:44:34 +0000 (14:44 +0100)]
accel/ivpu: Deprecate DRM_IVPU_PARAM_CONTEXT_PRIORITY param

DRM_IVPU_PARAM_CONTEXT_PRIORITY has been deprecated because it
has been replaced with DRM_IVPU_JOB_PRIORITY levels set with
submit IOCTL and was unused anyway.

Signed-off-by: Wachowski, Karol <karol.wachowski@intel.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240115134434.493839-10-jacek.lawrynowicz@linux.intel.com
14 months agoaccel/ivpu: Improve buffer object debug logs
Jacek Lawrynowicz [Mon, 15 Jan 2024 13:44:33 +0000 (14:44 +0100)]
accel/ivpu: Improve buffer object debug logs

Make debug logs more readable and consistent:
  - don't print handle as it is not always available for all buffers
  - use hashed ivpu_bo ptr as main buffer identifier
  - remove unused fields from ivpu_bo_print_info()

Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Wachowski, Karol <karol.wachowski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240115134434.493839-9-jacek.lawrynowicz@linux.intel.com
14 months agoaccel/ivpu: Disable buffer sharing among VPU contexts
Jacek Lawrynowicz [Mon, 15 Jan 2024 13:44:32 +0000 (14:44 +0100)]
accel/ivpu: Disable buffer sharing among VPU contexts

This was not supported properly. A buffer was imported to another VPU
context as a separate buffer object with duplicated sgt.
Both exported and imported buffers could be DMA mapped causing a double
mapping on the same device.

Buffers imported from another VPU context will now just increase
reference count, leaving only a single sgt, fixing the problem above.
Buffers still can't be shared among VPU contexts because each has its
own MMU mapping and ivpu_bo only supports single MMU mappings.

The solution would be to use a mapping list as in panfrost or etnaviv
drivers and it will be implemented in future if required.

Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Andrzej Kacprowski <andrzej.kacprowski@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240115134434.493839-8-jacek.lawrynowicz@linux.intel.com
14 months agoaccel/ivpu: Free buffer sgt on unbind
Jacek Lawrynowicz [Mon, 15 Jan 2024 13:44:31 +0000 (14:44 +0100)]
accel/ivpu: Free buffer sgt on unbind

Call dma_unmap() on all buffers before the VPU is unbinded to avoid
"device driver has pending DMA allocations while released from device"
warning when DMA-API debug is enabled.

Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240115134434.493839-7-jacek.lawrynowicz@linux.intel.com
14 months agoaccel/ivpu: Fix for missing lock around drm_gem_shmem_vmap()
Jacek Lawrynowicz [Mon, 15 Jan 2024 13:44:30 +0000 (14:44 +0100)]
accel/ivpu: Fix for missing lock around drm_gem_shmem_vmap()

drm_gem_shmem_vmap/vunmap requires dma resv lock to be held.
This was missed during conversion to shmem helper.

Fixes: 8d88e4cdce4f ("accel/ivpu: Use GEM shmem helper for all buffers")
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240115134434.493839-6-jacek.lawrynowicz@linux.intel.com
14 months agoaccel/ivpu: Add diagnostic messages when VPU fails to boot or suspend
Wachowski, Karol [Mon, 15 Jan 2024 13:44:29 +0000 (14:44 +0100)]
accel/ivpu: Add diagnostic messages when VPU fails to boot or suspend

Make boot/suspend failure debugging easier by dumping FW logs and error
registers.

Signed-off-by: Wachowski, Karol <karol.wachowski@intel.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240115134434.493839-5-jacek.lawrynowicz@linux.intel.com
14 months agoaccel/ivpu: Add debug prints for MMU map/unmap operations
Wachowski, Karol [Mon, 15 Jan 2024 13:44:28 +0000 (14:44 +0100)]
accel/ivpu: Add debug prints for MMU map/unmap operations

It is common need to be able to see IOVA/physical to VPU addresses
mappings. Especially when debugging different kind of memory related
issues. Lack of such logs forces user to modify and recompile KMD manually.

This commit adds those logs under MMU debug mask which can be turned on
dynamically with module param during KMD load.

Signed-off-by: Wachowski, Karol <karol.wachowski@intel.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240115134434.493839-4-jacek.lawrynowicz@linux.intel.com
14 months agoaccel/ivpu: Call diagnose failure in ivpu_mmu_cmdq_sync()
Wachowski, Karol [Mon, 15 Jan 2024 13:44:27 +0000 (14:44 +0100)]
accel/ivpu: Call diagnose failure in ivpu_mmu_cmdq_sync()

Check for possible failure reasons in the buttress.
Some errors (like external abort) should have corresponding buttress errors
registers set indicating the real reason of failure.

Signed-off-by: Wachowski, Karol <karol.wachowski@intel.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240115134434.493839-3-jacek.lawrynowicz@linux.intel.com
14 months agoaccel/ivpu: Dump MMU events in case of VPU boot timeout
Wachowski, Karol [Mon, 15 Jan 2024 13:44:26 +0000 (14:44 +0100)]
accel/ivpu: Dump MMU events in case of VPU boot timeout

Add ivpu_mmu_evtq_dump() function that dumps existing MMU events from
MMU event queue. Call this function if VPU boot failed.

Previously MMU events were only checked in interrupt handler, but if VPU
failed to boot due to MMU faults, those faults were missed because of
interrupts not yet being enabled. This will allow checking potential
fault reason of VPU not booting.

Signed-off-by: Wachowski, Karol <karol.wachowski@intel.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240115134434.493839-2-jacek.lawrynowicz@linux.intel.com
14 months agoMerge v6.8-rc1 into drm-misc-fixes
Maxime Ripard [Mon, 22 Jan 2024 08:44:15 +0000 (09:44 +0100)]
Merge v6.8-rc1 into drm-misc-fixes

Let's kickstart the 6.8 fix cycle.

Signed-off-by: Maxime Ripard <mripard@kernel.org>
14 months agodrm/exynos: gsc: minor fix for loop iteration in gsc_runtime_resume
Fedor Pchelkin [Wed, 20 Dec 2023 09:53:15 +0000 (12:53 +0300)]
drm/exynos: gsc: minor fix for loop iteration in gsc_runtime_resume

Do not forget to call clk_disable_unprepare() on the first element of
ctx->clocks array.

Found by Linux Verification Center (linuxtesting.org).

Fixes: 8b7d3ec83aba ("drm/exynos: gsc: Convert driver to IPP v2 core API")
Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
14 months agodrm/exynos: fix accidental on-stack copy of exynos_drm_plane
Arnd Bergmann [Thu, 14 Dec 2023 12:32:15 +0000 (13:32 +0100)]
drm/exynos: fix accidental on-stack copy of exynos_drm_plane

gcc rightfully complains about excessive stack usage in the fimd_win_set_pixfmt()
function:

drivers/gpu/drm/exynos/exynos_drm_fimd.c: In function 'fimd_win_set_pixfmt':
drivers/gpu/drm/exynos/exynos_drm_fimd.c:750:1: error: the frame size of 1032 bytes is larger than 1024 byte
drivers/gpu/drm/exynos/exynos5433_drm_decon.c: In function 'decon_win_set_pixfmt':
drivers/gpu/drm/exynos/exynos5433_drm_decon.c:381:1: error: the frame size of 1032 bytes is larger than 1024 bytes

There is really no reason to copy the large exynos_drm_plane
structure to the stack before using one of its members, so just
use a pointer instead.

Fixes: 6f8ee5c21722 ("drm/exynos: fimd: Make plane alpha configurable")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
14 months agodrm/exynos: fix incorrect type issue
Inki Dae [Fri, 15 Dec 2023 00:38:19 +0000 (09:38 +0900)]
drm/exynos: fix incorrect type issue

Fix incorrect type issue in fimd_commit() of exynos_drm_fimd.c module.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202312140930.Me9yWf8F-lkp@intel.com/
Signed-off-by: Inki Dae <inki.dae@samsung.com>
14 months agoLinux 6.8-rc1
Linus Torvalds [Sun, 21 Jan 2024 22:11:32 +0000 (14:11 -0800)]
Linux 6.8-rc1

14 months agoMerge tag 'bcachefs-2024-01-21' of https://evilpiepirate.org/git/bcachefs
Linus Torvalds [Sun, 21 Jan 2024 22:01:12 +0000 (14:01 -0800)]
Merge tag 'bcachefs-2024-01-21' of https://evilpiepirate.org/git/bcachefs

Pull more bcachefs updates from Kent Overstreet:
 "Some fixes, Some refactoring, some minor features:

   - Assorted prep work for disk space accounting rewrite

   - BTREE_TRIGGER_ATOMIC: after combining our trigger callbacks, this
     makes our trigger context more explicit

   - A few fixes to avoid excessive transaction restarts on
     multithreaded workloads: fstests (in addition to ktest tests) are
     now checking slowpath counters, and that's shaking out a few bugs

   - Assorted tracepoint improvements

   - Starting to break up bcachefs_format.h and move on disk types so
     they're with the code they belong to; this will make room to start
     documenting the on disk format better.

   - A few minor fixes"

* tag 'bcachefs-2024-01-21' of https://evilpiepirate.org/git/bcachefs: (46 commits)
  bcachefs: Improve inode_to_text()
  bcachefs: logged_ops_format.h
  bcachefs: reflink_format.h
  bcachefs; extents_format.h
  bcachefs: ec_format.h
  bcachefs: subvolume_format.h
  bcachefs: snapshot_format.h
  bcachefs: alloc_background_format.h
  bcachefs: xattr_format.h
  bcachefs: dirent_format.h
  bcachefs: inode_format.h
  bcachefs; quota_format.h
  bcachefs: sb-counters_format.h
  bcachefs: counters.c -> sb-counters.c
  bcachefs: comment bch_subvolume
  bcachefs: bch_snapshot::btime
  bcachefs: add missing __GFP_NOWARN
  bcachefs: opts->compression can now also be applied in the background
  bcachefs: Prep work for variable size btree node buffers
  bcachefs: grab s_umount only if snapshotting
  ...

14 months agoMerge tag 'timers-core-2024-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 21 Jan 2024 19:14:40 +0000 (11:14 -0800)]
Merge tag 'timers-core-2024-01-21' of git://git./linux/kernel/git/tip/tip

Pull timer updates from Thomas Gleixner:
 "Updates for time and clocksources:

   - A fix for the idle and iowait time accounting vs CPU hotplug.

     The time is reset on CPU hotplug which makes the accumulated
     systemwide time jump backwards.

   - Assorted fixes and improvements for clocksource/event drivers"

* tag 'timers-core-2024-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  tick-sched: Fix idle and iowait sleeptime accounting vs CPU hotplug
  clocksource/drivers/ep93xx: Fix error handling during probe
  clocksource/drivers/cadence-ttc: Fix some kernel-doc warnings
  clocksource/drivers/timer-ti-dm: Fix make W=n kerneldoc warnings
  clocksource/timer-riscv: Add riscv_clock_shutdown callback
  dt-bindings: timer: Add StarFive JH8100 clint
  dt-bindings: timer: thead,c900-aclint-mtimer: separate mtime and mtimecmp regs

14 months agoMerge tag 'powerpc-6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc...
Linus Torvalds [Sun, 21 Jan 2024 19:04:29 +0000 (11:04 -0800)]
Merge tag 'powerpc-6.8-2' of git://git./linux/kernel/git/powerpc/linux

Pull powerpc fixes from Aneesh Kumar:

 - Increase default stack size to 32KB for Book3S

Thanks to Michael Ellerman.

* tag 'powerpc-6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/64s: Increase default stack size to 32KB

14 months agobcachefs: Improve inode_to_text()
Kent Overstreet [Sun, 21 Jan 2024 17:19:01 +0000 (12:19 -0500)]
bcachefs: Improve inode_to_text()

Add line breaks - inode_to_text() is now much easier to read.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
14 months agobcachefs: logged_ops_format.h
Kent Overstreet [Sun, 21 Jan 2024 07:57:45 +0000 (02:57 -0500)]
bcachefs: logged_ops_format.h

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
14 months agobcachefs: reflink_format.h
Kent Overstreet [Sun, 21 Jan 2024 07:54:47 +0000 (02:54 -0500)]
bcachefs: reflink_format.h

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
14 months agobcachefs; extents_format.h
Kent Overstreet [Sun, 21 Jan 2024 07:51:56 +0000 (02:51 -0500)]
bcachefs; extents_format.h

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
14 months agobcachefs: ec_format.h
Kent Overstreet [Sun, 21 Jan 2024 07:47:14 +0000 (02:47 -0500)]
bcachefs: ec_format.h

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
14 months agobcachefs: subvolume_format.h
Kent Overstreet [Sun, 21 Jan 2024 07:42:53 +0000 (02:42 -0500)]
bcachefs: subvolume_format.h

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
14 months agobcachefs: snapshot_format.h
Kent Overstreet [Sun, 21 Jan 2024 07:41:06 +0000 (02:41 -0500)]
bcachefs: snapshot_format.h

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
14 months agobcachefs: alloc_background_format.h
Kent Overstreet [Sun, 21 Jan 2024 05:01:52 +0000 (00:01 -0500)]
bcachefs: alloc_background_format.h

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
14 months agobcachefs: xattr_format.h
Kent Overstreet [Sun, 21 Jan 2024 04:59:15 +0000 (23:59 -0500)]
bcachefs: xattr_format.h

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
14 months agobcachefs: dirent_format.h
Kent Overstreet [Sun, 21 Jan 2024 04:57:10 +0000 (23:57 -0500)]
bcachefs: dirent_format.h

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
14 months agobcachefs: inode_format.h
Kent Overstreet [Sun, 21 Jan 2024 04:55:39 +0000 (23:55 -0500)]
bcachefs: inode_format.h

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
14 months agobcachefs; quota_format.h
Kent Overstreet [Sun, 21 Jan 2024 04:53:52 +0000 (23:53 -0500)]
bcachefs; quota_format.h

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
14 months agobcachefs: sb-counters_format.h
Kent Overstreet [Sun, 21 Jan 2024 04:50:56 +0000 (23:50 -0500)]
bcachefs: sb-counters_format.h

bcachefs_format.h has gotten too big; let's do some organizing.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
14 months agobcachefs: counters.c -> sb-counters.c
Kent Overstreet [Sun, 21 Jan 2024 04:46:35 +0000 (23:46 -0500)]
bcachefs: counters.c -> sb-counters.c

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
14 months agobcachefs: comment bch_subvolume
Kent Overstreet [Sun, 21 Jan 2024 04:44:17 +0000 (23:44 -0500)]
bcachefs: comment bch_subvolume

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
14 months agobcachefs: bch_snapshot::btime
Kent Overstreet [Sun, 21 Jan 2024 04:35:41 +0000 (23:35 -0500)]
bcachefs: bch_snapshot::btime

Add a field to bch_snapshot for creation time; this will be important
when we start exposing the snapshot tree to userspace.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
14 months agobcachefs: add missing __GFP_NOWARN
Kent Overstreet [Wed, 17 Jan 2024 22:16:07 +0000 (17:16 -0500)]
bcachefs: add missing __GFP_NOWARN

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
14 months agobcachefs: opts->compression can now also be applied in the background
Kent Overstreet [Tue, 16 Jan 2024 21:20:21 +0000 (16:20 -0500)]
bcachefs: opts->compression can now also be applied in the background

The "apply this compression method in the background" paths now use the
compression option if background_compression is not set; this means that
setting or changing the compression option will cause existing data to
be compressed accordingly in the background.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
14 months agobcachefs: Prep work for variable size btree node buffers
Kent Overstreet [Tue, 16 Jan 2024 18:29:59 +0000 (13:29 -0500)]
bcachefs: Prep work for variable size btree node buffers

bcachefs btree nodes are big - typically 256k - and btree roots are
pinned in memory. As we're now up to 18 btrees, we now have significant
memory overhead in mostly empty btree roots.

And in the future we're going to start enforcing that certain btree node
boundaries exist, to solve lock contention issues - analagous to XFS's
AGIs.

Thus, we need to start allocating smaller btree node buffers when we
can. This patch changes code that refers to the filesystem constant
c->opts.btree_node_size to refer to the btree node buffer size -
btree_buf_bytes() - where appropriate.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
14 months agobcachefs: grab s_umount only if snapshotting
Su Yue [Mon, 15 Jan 2024 02:21:25 +0000 (10:21 +0800)]
bcachefs: grab s_umount only if snapshotting

When I was testing mongodb over bcachefs with compression,
there is a lockdep warning when snapshotting mongodb data volume.

$ cat test.sh
prog=bcachefs

$prog subvolume create /mnt/data
$prog subvolume create /mnt/data/snapshots

while true;do
    $prog subvolume snapshot /mnt/data /mnt/data/snapshots/$(date +%s)
    sleep 1s
done

$ cat /etc/mongodb.conf
systemLog:
  destination: file
  logAppend: true
  path: /mnt/data/mongod.log

storage:
  dbPath: /mnt/data/

lockdep reports:
[ 3437.452330] ======================================================
[ 3437.452750] WARNING: possible circular locking dependency detected
[ 3437.453168] 6.7.0-rc7-custom+ #85 Tainted: G            E
[ 3437.453562] ------------------------------------------------------
[ 3437.453981] bcachefs/35533 is trying to acquire lock:
[ 3437.454325] ffffa0a02b2b1418 (sb_writers#10){.+.+}-{0:0}, at: filename_create+0x62/0x190
[ 3437.454875]
               but task is already holding lock:
[ 3437.455268] ffffa0a02b2b10e0 (&type->s_umount_key#48){.+.+}-{3:3}, at: bch2_fs_file_ioctl+0x232/0xc90 [bcachefs]
[ 3437.456009]
               which lock already depends on the new lock.

[ 3437.456553]
               the existing dependency chain (in reverse order) is:
[ 3437.457054]
               -> #3 (&type->s_umount_key#48){.+.+}-{3:3}:
[ 3437.457507]        down_read+0x3e/0x170
[ 3437.457772]        bch2_fs_file_ioctl+0x232/0xc90 [bcachefs]
[ 3437.458206]        __x64_sys_ioctl+0x93/0xd0
[ 3437.458498]        do_syscall_64+0x42/0xf0
[ 3437.458779]        entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 3437.459155]
               -> #2 (&c->snapshot_create_lock){++++}-{3:3}:
[ 3437.459615]        down_read+0x3e/0x170
[ 3437.459878]        bch2_truncate+0x82/0x110 [bcachefs]
[ 3437.460276]        bchfs_truncate+0x254/0x3c0 [bcachefs]
[ 3437.460686]        notify_change+0x1f1/0x4a0
[ 3437.461283]        do_truncate+0x7f/0xd0
[ 3437.461555]        path_openat+0xa57/0xce0
[ 3437.461836]        do_filp_open+0xb4/0x160
[ 3437.462116]        do_sys_openat2+0x91/0xc0
[ 3437.462402]        __x64_sys_openat+0x53/0xa0
[ 3437.462701]        do_syscall_64+0x42/0xf0
[ 3437.462982]        entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 3437.463359]
               -> #1 (&sb->s_type->i_mutex_key#15){+.+.}-{3:3}:
[ 3437.463843]        down_write+0x3b/0xc0
[ 3437.464223]        bch2_write_iter+0x5b/0xcc0 [bcachefs]
[ 3437.464493]        vfs_write+0x21b/0x4c0
[ 3437.464653]        ksys_write+0x69/0xf0
[ 3437.464839]        do_syscall_64+0x42/0xf0
[ 3437.465009]        entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 3437.465231]
               -> #0 (sb_writers#10){.+.+}-{0:0}:
[ 3437.465471]        __lock_acquire+0x1455/0x21b0
[ 3437.465656]        lock_acquire+0xc6/0x2b0
[ 3437.465822]        mnt_want_write+0x46/0x1a0
[ 3437.465996]        filename_create+0x62/0x190
[ 3437.466175]        user_path_create+0x2d/0x50
[ 3437.466352]        bch2_fs_file_ioctl+0x2ec/0xc90 [bcachefs]
[ 3437.466617]        __x64_sys_ioctl+0x93/0xd0
[ 3437.466791]        do_syscall_64+0x42/0xf0
[ 3437.466957]        entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 3437.467180]
               other info that might help us debug this:

[ 3437.469670] 2 locks held by bcachefs/35533:
               other info that might help us debug this:

[ 3437.467507] Chain exists of:
                 sb_writers#10 --> &c->snapshot_create_lock --> &type->s_umount_key#48

[ 3437.467979]  Possible unsafe locking scenario:

[ 3437.468223]        CPU0                    CPU1
[ 3437.468405]        ----                    ----
[ 3437.468585]   rlock(&type->s_umount_key#48);
[ 3437.468758]                                lock(&c->snapshot_create_lock);
[ 3437.469030]                                lock(&type->s_umount_key#48);
[ 3437.469291]   rlock(sb_writers#10);
[ 3437.469434]
                *** DEADLOCK ***

[ 3437.469670] 2 locks held by bcachefs/35533:
[ 3437.469838]  #0: ffffa0a02ce00a88 (&c->snapshot_create_lock){++++}-{3:3}, at: bch2_fs_file_ioctl+0x1e3/0xc90 [bcachefs]
[ 3437.470294]  #1: ffffa0a02b2b10e0 (&type->s_umount_key#48){.+.+}-{3:3}, at: bch2_fs_file_ioctl+0x232/0xc90 [bcachefs]
[ 3437.470744]
               stack backtrace:
[ 3437.470922] CPU: 7 PID: 35533 Comm: bcachefs Kdump: loaded Tainted: G            E      6.7.0-rc7-custom+ #85
[ 3437.471313] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
[ 3437.471694] Call Trace:
[ 3437.471795]  <TASK>
[ 3437.471884]  dump_stack_lvl+0x57/0x90
[ 3437.472035]  check_noncircular+0x132/0x150
[ 3437.472202]  __lock_acquire+0x1455/0x21b0
[ 3437.472369]  lock_acquire+0xc6/0x2b0
[ 3437.472518]  ? filename_create+0x62/0x190
[ 3437.472683]  ? lock_is_held_type+0x97/0x110
[ 3437.472856]  mnt_want_write+0x46/0x1a0
[ 3437.473025]  ? filename_create+0x62/0x190
[ 3437.473204]  filename_create+0x62/0x190
[ 3437.473380]  user_path_create+0x2d/0x50
[ 3437.473555]  bch2_fs_file_ioctl+0x2ec/0xc90 [bcachefs]
[ 3437.473819]  ? lock_acquire+0xc6/0x2b0
[ 3437.474002]  ? __fget_files+0x2a/0x190
[ 3437.474195]  ? __fget_files+0xbc/0x190
[ 3437.474380]  ? lock_release+0xc5/0x270
[ 3437.474567]  ? __x64_sys_ioctl+0x93/0xd0
[ 3437.474764]  ? __pfx_bch2_fs_file_ioctl+0x10/0x10 [bcachefs]
[ 3437.475090]  __x64_sys_ioctl+0x93/0xd0
[ 3437.475277]  do_syscall_64+0x42/0xf0
[ 3437.475454]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 3437.475691] RIP: 0033:0x7f2743c313af
======================================================

In __bch2_ioctl_subvolume_create(), we grab s_umount unconditionally
and unlock it at the end of the function. There is a comment
"why do we need this lock?" about the lock coming from
commit 42d237320e98 ("bcachefs: Snapshot creation, deletion")
The reason is that __bch2_ioctl_subvolume_create() calls
sync_inodes_sb() which enforce locked s_umount to writeback all dirty
nodes before doing snapshot works.

Fix it by read locking s_umount for snapshotting only and unlocking
s_umount after sync_inodes_sb().

Signed-off-by: Su Yue <glass.su@suse.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
14 months agobcachefs: kvfree bch_fs::snapshots in bch2_fs_snapshots_exit
Su Yue [Tue, 16 Jan 2024 11:05:37 +0000 (19:05 +0800)]
bcachefs: kvfree bch_fs::snapshots in bch2_fs_snapshots_exit

bch_fs::snapshots is allocated by kvzalloc in __snapshot_t_mut.
It should be freed by kvfree not kfree.
Or umount will triger:

[  406.829178 ] BUG: unable to handle page fault for address: ffffe7b487148008
[  406.830676 ] #PF: supervisor read access in kernel mode
[  406.831643 ] #PF: error_code(0x0000) - not-present page
[  406.832487 ] PGD 0 P4D 0
[  406.832898 ] Oops: 0000 [#1] PREEMPT SMP PTI
[  406.833512 ] CPU: 2 PID: 1754 Comm: umount Kdump: loaded Tainted: G           OE      6.7.0-rc7-custom+ #90
[  406.834746 ] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
[  406.835796 ] RIP: 0010:kfree+0x62/0x140
[  406.836197 ] Code: 80 48 01 d8 0f 82 e9 00 00 00 48 c7 c2 00 00 00 80 48 2b 15 78 9f 1f 01 48 01 d0 48 c1 e8 0c 48 c1 e0 06 48 03 05 56 9f 1f 01 <48> 8b 50 08 48 89 c7 f6 c2 01 0f 85 b0 00 00 00 66 90 48 8b 07 f6
[  406.837810 ] RSP: 0018:ffffb9d641607e48 EFLAGS: 00010286
[  406.838213 ] RAX: ffffe7b487148000 RBX: ffffb9d645200000 RCX: ffffb9d641607dc4
[  406.838738 ] RDX: 000065bb00000000 RSI: ffffffffc0d88b84 RDI: ffffb9d645200000
[  406.839217 ] RBP: ffff9a4625d00068 R08: 0000000000000001 R09: 0000000000000001
[  406.839650 ] R10: 0000000000000001 R11: 000000000000001f R12: ffff9a4625d4da80
[  406.840055 ] R13: ffff9a4625d00000 R14: ffffffffc0e2eb20 R15: 0000000000000000
[  406.840451 ] FS:  00007f0a264ffb80(0000) GS:ffff9a4e2d500000(0000) knlGS:0000000000000000
[  406.840851 ] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  406.841125 ] CR2: ffffe7b487148008 CR3: 000000018c4d2000 CR4: 00000000000006f0
[  406.841464 ] Call Trace:
[  406.841583 ]  <TASK>
[  406.841682 ]  ? __die+0x1f/0x70
[  406.841828 ]  ? page_fault_oops+0x159/0x470
[  406.842014 ]  ? fixup_exception+0x22/0x310
[  406.842198 ]  ? exc_page_fault+0x1ed/0x200
[  406.842382 ]  ? asm_exc_page_fault+0x22/0x30
[  406.842574 ]  ? bch2_fs_release+0x54/0x280 [bcachefs]
[  406.842842 ]  ? kfree+0x62/0x140
[  406.842988 ]  ? kfree+0x104/0x140
[  406.843138 ]  bch2_fs_release+0x54/0x280 [bcachefs]
[  406.843390 ]  kobject_put+0xb7/0x170
[  406.843552 ]  deactivate_locked_super+0x2f/0xa0
[  406.843756 ]  cleanup_mnt+0xba/0x150
[  406.843917 ]  task_work_run+0x59/0xa0
[  406.844083 ]  exit_to_user_mode_prepare+0x197/0x1a0
[  406.844302 ]  syscall_exit_to_user_mode+0x16/0x40
[  406.844510 ]  do_syscall_64+0x4e/0xf0
[  406.844675 ]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
[  406.844907 ] RIP: 0033:0x7f0a2664e4fb

Signed-off-by: Su Yue <glass.su@suse.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
14 months agobcachefs: bios must be 512 byte algined
Kent Overstreet [Tue, 16 Jan 2024 16:38:04 +0000 (11:38 -0500)]
bcachefs: bios must be 512 byte algined

Fixes: 023f9ac9f70f bcachefs: Delete dio read alignment check
Reported-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
14 months agobcachefs: remove redundant variable tmp
Colin Ian King [Tue, 16 Jan 2024 11:07:23 +0000 (11:07 +0000)]
bcachefs: remove redundant variable tmp

The variable tmp is being assigned a value but it isn't being
read afterwards. The assignment is redundant and so tmp can be
removed.

Cleans up clang scan build warning:
warning: Although the value stored to 'ret' is used in the enclosing
expression, the value is never actually read from 'ret'
[deadcode.DeadStores]

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
14 months agobcachefs: Improve trace_trans_restart_relock
Kent Overstreet [Tue, 16 Jan 2024 01:40:06 +0000 (20:40 -0500)]
bcachefs: Improve trace_trans_restart_relock

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
14 months agobcachefs: Fix excess transaction restarts in __bchfs_fallocate()
Kent Overstreet [Tue, 16 Jan 2024 01:37:23 +0000 (20:37 -0500)]
bcachefs: Fix excess transaction restarts in __bchfs_fallocate()

drop_locks_do() should not be used in a fastpath without first trying
the do in nonblocking mode - the unlock and relock will cause excessive
transaction restarts and potentially livelocking with other threads that
are contending for the same locks.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
14 months agobcachefs: extents_to_bp_state
Kent Overstreet [Mon, 15 Jan 2024 23:19:52 +0000 (18:19 -0500)]
bcachefs: extents_to_bp_state

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
14 months agobcachefs: bkey_and_val_eq()
Kent Overstreet [Mon, 15 Jan 2024 23:08:32 +0000 (18:08 -0500)]
bcachefs: bkey_and_val_eq()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
14 months agobcachefs: Better journal tracepoints
Kent Overstreet [Mon, 15 Jan 2024 22:59:51 +0000 (17:59 -0500)]
bcachefs: Better journal tracepoints

Factor out bch2_journal_bufs_to_text(), and use it in the
journal_entry_full() tracepoint; when we can't get a journal reservation
we need to know the outstanding journal entry sizes to know if the
problem is due to excessive flushing.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
14 months agobcachefs: Print size of superblock with space allocated
Kent Overstreet [Mon, 15 Jan 2024 22:57:44 +0000 (17:57 -0500)]
bcachefs: Print size of superblock with space allocated

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
14 months agobcachefs: Avoid flushing the journal in the discard path
Kent Overstreet [Mon, 15 Jan 2024 22:56:22 +0000 (17:56 -0500)]
bcachefs: Avoid flushing the journal in the discard path

When issuing discards, we may need to flush the journal if there's too
many buckets that can't be discarded until a journal flush.

But the heuristic was bad; we should be comparing the number of buckets
that need to flushes against the number of free buckets, not the number
of buckets we saw.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
14 months agobcachefs: Improve move_extent tracepoint
Kent Overstreet [Mon, 15 Jan 2024 20:33:39 +0000 (15:33 -0500)]
bcachefs: Improve move_extent tracepoint

Also print out the data_opts, so that we can see what specifically is
being done to an extent.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
14 months agobcachefs: Add missing bch2_moving_ctxt_flush_all()
Kent Overstreet [Mon, 15 Jan 2024 20:06:43 +0000 (15:06 -0500)]
bcachefs: Add missing bch2_moving_ctxt_flush_all()

This fixes a bug with rebalance IOs getting stuck with reads completed,
but writes never being issued.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
14 months agobcachefs: Re-add move_extent_write tracepoint
Kent Overstreet [Mon, 15 Jan 2024 20:04:40 +0000 (15:04 -0500)]
bcachefs: Re-add move_extent_write tracepoint

It appears this was accidentally deleted at some point - also, do a bit
of cleanup.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
14 months agobcachefs: bch2_kthread_io_clock_wait() no longer sleeps until full amount
Kent Overstreet [Mon, 15 Jan 2024 19:15:26 +0000 (14:15 -0500)]
bcachefs: bch2_kthread_io_clock_wait() no longer sleeps until full amount

Drop t he loop in bch2_kthread_io_clock_wait(): this allows the code
that uses it to be woken up for other reasons, and fixes a bug where
rebalance wouldn't wake up when a scan was requested.

This raises the possibility of spurious wakeups, but callers should
always be able to handle that reasonably well.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
14 months agobcachefs: Add .val_to_text() for KEY_TYPE_cookie
Kent Overstreet [Mon, 15 Jan 2024 19:15:03 +0000 (14:15 -0500)]
bcachefs: Add .val_to_text() for KEY_TYPE_cookie

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
14 months agobcachefs: Don't pass memcmp() as a pointer
Kent Overstreet [Mon, 15 Jan 2024 19:12:43 +0000 (14:12 -0500)]
bcachefs: Don't pass memcmp() as a pointer

Some (buggy!) compilers have issues with this.

Fixes: https://github.com/koverstreet/bcachefs/issues/625
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
14 months agoMerge tag 'header_cleanup-2024-01-20' of https://evilpiepirate.org/git/bcachefs
Linus Torvalds [Sun, 21 Jan 2024 18:21:43 +0000 (10:21 -0800)]
Merge tag 'header_cleanup-2024-01-20' of https://evilpiepirate.org/git/bcachefs

Pull header fix from Kent Overstreet:
 "Just one small fixup for the RT build"

* tag 'header_cleanup-2024-01-20' of https://evilpiepirate.org/git/bcachefs:
  spinlock: Fix failing build for PREEMPT_RT

14 months agobcachefs: Reduce would_deadlock restarts
Kent Overstreet [Thu, 11 Jan 2024 04:47:04 +0000 (23:47 -0500)]
bcachefs: Reduce would_deadlock restarts

We don't have to take locks in any particular ordering - we'll make
forward progress just fine - but if we try to stick to an ordering, it
can help to avoid excessive would_deadlock transaction restarts.

This tweaks the reflink path to take extents btree locks in the right
order.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>