linux.git
17 months agoMerge tag 'block-6.9-20240503' of git://git.kernel.dk/linux
Linus Torvalds [Fri, 3 May 2024 16:33:59 +0000 (09:33 -0700)]
Merge tag 'block-6.9-20240503' of git://git.kernel.dk/linux

Pull block fixes from Jens Axboe:
 "Nothing major in here - an nvme pull request with mostly auth/tcp
  fixes, and a single fix for ublk not setting segment count and size
  limits"

* tag 'block-6.9-20240503' of git://git.kernel.dk/linux:
  nvme-tcp: strict pdu pacing to avoid send stalls on TLS
  nvmet: fix nvme status code when namespace is disabled
  nvmet-tcp: fix possible memory leak when tearing down a controller
  nvme: cancel pending I/O if nvme controller is in terminal state
  nvmet-auth: replace pr_debug() with pr_err() to report an error.
  nvmet-auth: return the error code to the nvmet_auth_host_hash() callers
  nvme: find numa distance only if controller has valid numa id
  ublk: remove segment count and size limits
  nvme: fix warn output about shared namespaces without CONFIG_NVME_MULTIPATH

17 months agoMerge tag 'sound-6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai...
Linus Torvalds [Fri, 3 May 2024 16:24:46 +0000 (09:24 -0700)]
Merge tag 'sound-6.9-rc7' of git://git./linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "As usual in a late stage, we received a fair amount of fixes for ASoC,
  and it became bigger than wished. But all fixes are rather device-
  specific, and they look pretty safe to apply.

  A major par of changes are series of fixes for ASoC meson and SOF
  drivers as well as for Realtek and Cirrus codecs. In addition, recent
  emu10k1 regression fixes and usual HD-audio quirks are included"

* tag 'sound-6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (46 commits)
  ALSA: hda/realtek: Fix build error without CONFIG_PM
  ALSA: hda/realtek: Fix conflicting PCI SSID 17aa:386f for Lenovo Legion models
  ALSA: hda/realtek - Set GPIO3 to default at S4 state for Thinkpad with ALC1318
  ALSA: hda: intel-sdw-acpi: fix usage of device_get_named_child_node()
  ALSA: hda: intel-dsp-config: harden I2C/I2S codec detection
  ASoC: cs35l56: fix usages of device_get_named_child_node()
  ASoC: da7219-aad: fix usage of device_get_named_child_node()
  ASoC: meson: cards: select SND_DYNAMIC_MINORS
  ASoC: meson: axg-tdm: add continuous clock support
  ASoC: meson: axg-tdm-interface: manage formatters in trigger
  ASoC: meson: axg-card: make links nonatomic
  ASoC: meson: axg-fifo: use threaded irq to check periods
  ALSA: hda/realtek: Fix mute led of HP Laptop 15-da3001TU
  ALSA: emu10k1: make E-MU FPGA writes potentially more reliable
  ALSA: emu10k1: fix E-MU dock initialization
  ALSA: emu10k1: use mutex for E-MU FPGA access locking
  ALSA: emu10k1: move the whole GPIO event handling to the workqueue
  ALSA: emu10k1: factor out snd_emu1010_load_dock_firmware()
  ALSA: emu10k1: fix E-MU card dock presence monitoring
  ASoC: rt715-sdca: volume step modification
  ...

17 months agoMerge tag 'drm-fixes-2024-05-03' of https://gitlab.freedesktop.org/drm/kernel
Linus Torvalds [Fri, 3 May 2024 16:16:36 +0000 (09:16 -0700)]
Merge tag 'drm-fixes-2024-05-03' of https://gitlab.freedesktop.org/drm/kernel

Pull drm fixes from Dave Airlie:
 "Weekly fixes, mostly made up from amdgpu and some panel changes.

  Otherwise xe, nouveau, vmwgfx and a couple of others, all seems pretty
  on track.

  amdgpu:
   - Fix VRAM memory accounting
   - DCN 3.1 fixes
   - DCN 2.0 fix
   - DCN 3.1.5 fix
   - DCN 3.5 fix
   - DCN 3.2.1 fix
   - DP fixes
   - Seamless boot fix
   - Fix call order in amdgpu_ttm_move()
   - Fix doorbell regression
   - Disable panel replay temporarily

  amdkfd:
   - Flush wq before creating kfd process

  xe:
   - Fix UAF on rebind worker
   - Fix ADL-N display integration

  imagination:
   - fix page-count macro

  nouveau:
   - avoid page-table allocation failures
   - fix firmware memory allocation

  panel:
   - ili9341: avoid OF for device properties; respect deferred probe;
     fix usage of errno codes

  ttm:
   - fix status output

  vmwgfx:
   - fix legacy display unit
   - fix read length in fence signalling"

* tag 'drm-fixes-2024-05-03' of https://gitlab.freedesktop.org/drm/kernel: (25 commits)
  drm/xe/display: Fix ADL-N detection
  drm/panel: ili9341: Use predefined error codes
  drm/panel: ili9341: Respect deferred probe
  drm/panel: ili9341: Correct use of device property APIs
  drm/xe/vm: prevent UAF in rebind_work_func()
  drm/amd/display: Disable panel replay by default for now
  drm/amdgpu: fix doorbell regression
  drm/amdkfd: Flush the process wq before creating a kfd_process
  drm/amd/display: Disable seamless boot on 128b/132b encoding
  drm/amd/display: Fix DC mode screen flickering on DCN321
  drm/amd/display: Add VCO speed parameter for DCN31 FPU
  drm/amdgpu: once more fix the call oder in amdgpu_ttm_move() v2
  drm/amd/display: Allocate zero bw after bw alloc enable
  drm/amd/display: Fix incorrect DSC instance for MST
  drm/amd/display: Atom Integrated System Info v2_2 for DCN35
  drm/amd/display: Add dtbclk access to dcn315
  drm/amd/display: Ensure that dmcub support flag is set for DCN20
  drm/amd/display: Handle Y carry-over in VCP X.Y calculation
  drm/amdgpu: Fix VRAM memory accounting
  drm/vmwgfx: Fix invalid reads in fence signaled events
  ...

17 months agoMerge tag 'spi-fix-v6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Linus Torvalds [Fri, 3 May 2024 16:12:28 +0000 (09:12 -0700)]
Merge tag 'spi-fix-v6.9-rc6' of git://git./linux/kernel/git/broonie/spi

Pull spi fixes from Mark Brown:
 "A few small fixes for v6.9,

  The core fix is for issues with reuse of a spi_message in the case
  where we've got queued messages (a relatively rare occurrence with
  modern code so it wasn't noticed in testing).

  We also avoid an issue with the Kunpeng driver by simply removing the
  debug interface that could trigger it, and address issues with
  confusing and corrupted output when printing the IP version of the AXI
  SPI engine"

* tag 'spi-fix-v6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: fix null pointer dereference within spi_sync
  spi: hisi-kunpeng: Delete the dump interface of data registers in debugfs
  spi: axi-spi-engine: fix version format string

17 months agoMerge tag 'drm-misc-fixes-2024-05-02' of https://gitlab.freedesktop.org/drm/misc...
Dave Airlie [Fri, 3 May 2024 01:16:27 +0000 (11:16 +1000)]
Merge tag 'drm-misc-fixes-2024-05-02' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes

Short summary of fixes pull:

imagination:
- fix page-count macro

nouveau:
- avoid page-table allocation failures
- fix firmware memory allocation

panel:
- ili9341: avoid OF for device properties; respect deferred probe; fix
usage of errno codes

ttm:
- fix status output

vmwgfx:
- fix legacy display unit
- fix read length in fence signalling

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240502192117.GA12158@linux.fritz.box
17 months agoMerge tag 'drm-xe-fixes-2024-05-02' of https://gitlab.freedesktop.org/drm/xe/kernel...
Dave Airlie [Fri, 3 May 2024 01:04:52 +0000 (11:04 +1000)]
Merge tag 'drm-xe-fixes-2024-05-02' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes

- Fix UAF on rebind worker
- Fix ADL-N display integration

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/6bontwst3mbxozs6u3ad5n3g5zmaucrngbfwv4hkfhpscnwlym@wlwjgjx6pwue
17 months agoMerge tag 'amd-drm-fixes-6.9-2024-05-01' of https://gitlab.freedesktop.org/agd5f...
Dave Airlie [Fri, 3 May 2024 00:43:37 +0000 (10:43 +1000)]
Merge tag 'amd-drm-fixes-6.9-2024-05-01' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes

amd-drm-fixes-6.9-2024-05-01:

amdgpu:
- Fix VRAM memory accounting
- DCN 3.1 fixes
- DCN 2.0 fix
- DCN 3.1.5 fix
- DCN 3.5 fix
- DCN 3.2.1 fix
- DP fixes
- Seamless boot fix
- Fix call order in amdgpu_ttm_move()
- Fix doorbell regression
- Disable panel replay temporarily

amdkfd:
- Flush wq before creating kfd process

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240501135054.1919108-1-alexander.deucher@amd.com
17 months agoMerge tag 'for-6.9-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave...
Linus Torvalds [Thu, 2 May 2024 17:49:12 +0000 (10:49 -0700)]
Merge tag 'for-6.9-rc6-tag' of git://git./linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:

 - set correct ram_bytes when splitting ordered extent. This can be
   inconsistent on-disk but harmless as it's not used for calculations
   and it's only advisory for compression

 - fix lockdep splat when taking cleaner mutex in qgroups disable ioctl

 - fix missing mutex unlock on error path when looking up sys chunk for
   relocation

* tag 'for-6.9-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: set correct ram_bytes when splitting ordered extent
  btrfs: take the cleaner_mutex earlier in qgroup disable
  btrfs: add missing mutex_unlock in btrfs_relocate_sys_chunks()

17 months agoMerge tag 's390-6.9-6' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Linus Torvalds [Thu, 2 May 2024 17:43:35 +0000 (10:43 -0700)]
Merge tag 's390-6.9-6' of git://git./linux/kernel/git/s390/linux

Pull s390 fixes from Alexander Gordeev:

 - The function __storage_key_init_range() expects the end address to be
   the first byte outside the range to be initialized. Fix the callers
   that provide the last byte within the range instead.

 - 3270 Channel Command Word (CCW) may contain zero data address in case
   there is no data in the request. Add data availability check to avoid
   erroneous non-zero value as result of virt_to_dma32(NULL) application
   in cases there is no data

 - Add missing CFI directives for an unwinder to restore the return
   address in the vDSO assembler code

 - NUL-terminate kernel buffer when duplicating user space memory region
   on Channel IO (CIO) debugfs write inject

 - Fix wrong format string in zcrypt debug output

 - Return -EBUSY code when a CCA card is temporarily unavailabile

 - Restore a loop that retries derivation of a protected key from a
   secure key in cases the low level reports temporarily unavailability
   with -EBUSY code

* tag 's390-6.9-6' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/paes: Reestablish retry loop in paes
  s390/zcrypt: Use EBUSY to indicate temp unavailability
  s390/zcrypt: Handle ep11 cprb return code
  s390/zcrypt: Fix wrong format string in debug feature printout
  s390/cio: Ensure the copied buf is NUL terminated
  s390/vdso: Add CFI for RA register to asm macro vdso_func
  s390/3270: Fix buffer assignment
  s390/mm: Fix clearing storage keys for huge pages
  s390/mm: Fix storage key clearing for guest huge pages

17 months agoMerge tag 'xtensa-20240502' of https://github.com/jcmvbkbc/linux-xtensa
Linus Torvalds [Thu, 2 May 2024 17:41:28 +0000 (10:41 -0700)]
Merge tag 'xtensa-20240502' of https://github.com/jcmvbkbc/linux-xtensa

Pull xtensa fixes from Max Filippov:

 - fix unused variable warning caused by empty flush_dcache_page()
   definition

 - fix stack unwinding on windowed noMMU XIP configurations

 - fix Coccinelle warning 'opportunity for min()' in xtensa ISS platform
   code

* tag 'xtensa-20240502' of https://github.com/jcmvbkbc/linux-xtensa:
  xtensa: remove redundant flush_dcache_page and ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE macros
  tty: xtensa/iss: Use min() to fix Coccinelle warning
  xtensa: fix MAKE_PC_FROM_RA second argument

17 months agodrm/xe/display: Fix ADL-N detection
Lucas De Marchi [Thu, 25 Apr 2024 18:16:09 +0000 (11:16 -0700)]
drm/xe/display: Fix ADL-N detection

Contrary to i915, in xe ADL-N is kept as a different platform, not a
subplatform of ADL-P. Since the display side doesn't need to
differentiate between P and N, i.e. IS_ALDERLAKE_P_N() is never called,
just fixup the compat header to check for both P and N.

Moving ADL-N to be a subplatform would be more complex as the firmware
loading in xe only handles platforms, not subplatforms, as going forward
the direction is to check on IP version rather than
platforms/subplatforms.

Fix warning when initializing display:

xe 0000:00:02.0: [drm:intel_pch_type [xe]] Found Alder Lake PCH
------------[ cut here ]------------
xe 0000:00:02.0: drm_WARN_ON(!((dev_priv)->info.platform == XE_ALDERLAKE_S) && !((dev_priv)->info.platform == XE_ALDERLAKE_P))

And wrong paths being taken on the display side.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240425181610.2704633-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 6a2a90cba12b42eb96c2af3426b77ceb4be31df2)
Fixes: 44e694958b95 ("drm/xe/display: Implement display support")
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
17 months agoMerge tag 'firewire-fixes-6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Thu, 2 May 2024 16:05:21 +0000 (09:05 -0700)]
Merge tag 'firewire-fixes-6.9-rc6' of git://git./linux/kernel/git/ieee1394/linux1394

Pull firewire fixes from Takashi Sakamoto:
 "Two driver fixes:

   - The firewire-ohci driver for 1394 OHCI hardware does not fill time
     stamp for response packet when handling asynchronous transaction to
     local destination. This brings an inconvenience that the response
     packet is not equivalent between the transaction to local and
     remote. It is fixed by fulfilling the time stamp with hardware
     time. The fix should be applied to Linux kernel v6.5 or later as
     well.

   - The nosy driver for Texas Instruments TSB12LV21A (PCILynx) has
     long-standing issue about the behaviour when user space application
     passes less size of buffer than expected. It is fixed by returning
     zero according to the convention of UNIX-like systems"

* tag 'firewire-fixes-6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394:
  firewire: ohci: fulfill timestamp for some local asynchronous transaction
  firewire: nosy: ensure user_length is taken into account when fetching packet contents

17 months agoMerge tag 'thermal-6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael...
Linus Torvalds [Thu, 2 May 2024 16:01:27 +0000 (09:01 -0700)]
Merge tag 'thermal-6.9-rc7' of git://git./linux/kernel/git/rafael/linux-pm

Pull thermal control fixes from Rafael Wysocki:
 "Fix a memory leak and a few locking issues (that may cause the kernel
  to crash in principle if all goes wrong) in the thermal debug code
  introduced during the 6.8 development cycle"

* tag 'thermal-6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  thermal/debugfs: Prevent use-after-free from occurring after cdev removal
  thermal/debugfs: Fix two locking issues with thermal zone debug
  thermal/debugfs: Free all thermal zone debug memory on zone removal

17 months agoMerge tag 'net-6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Thu, 2 May 2024 15:51:47 +0000 (08:51 -0700)]
Merge tag 'net-6.9-rc7' of git://git./linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from bpf.

  Relatively calm week, likely due to public holiday in most places. No
  known outstanding regressions.

  Current release - regressions:

   - rxrpc: fix wrong alignmask in __page_frag_alloc_align()

   - eth: e1000e: change usleep_range to udelay in PHY mdic access

  Previous releases - regressions:

   - gro: fix udp bad offset in socket lookup

   - bpf: fix incorrect runtime stat for arm64

   - tipc: fix UAF in error path

   - netfs: fix a potential infinite loop in extract_user_to_sg()

   - eth: ice: ensure the copied buf is NUL terminated

   - eth: qeth: fix kernel panic after setting hsuid

  Previous releases - always broken:

   - bpf:
       - verifier: prevent userspace memory access
       - xdp: use flags field to disambiguate broadcast redirect

   - bridge: fix multicast-to-unicast with fraglist GSO

   - mptcp: ensure snd_nxt is properly initialized on connect

   - nsh: fix outer header access in nsh_gso_segment().

   - eth: bcmgenet: fix racing registers access

   - eth: vxlan: fix stats counters.

  Misc:

   - a bunch of MAINTAINERS file updates"

* tag 'net-6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (45 commits)
  MAINTAINERS: mark MYRICOM MYRI-10G as Orphan
  MAINTAINERS: remove Ariel Elior
  net: gro: add flush check in udp_gro_receive_segment
  net: gro: fix udp bad offset in socket lookup by adding {inner_}network_offset to napi_gro_cb
  ipv4: Fix uninit-value access in __ip_make_skb()
  s390/qeth: Fix kernel panic after setting hsuid
  vxlan: Pull inner IP header in vxlan_rcv().
  tipc: fix a possible memleak in tipc_buf_append
  tipc: fix UAF in error path
  rxrpc: Clients must accept conn from any address
  net: core: reject skb_copy(_expand) for fraglist GSO skbs
  net: bridge: fix multicast-to-unicast with fraglist GSO
  mptcp: ensure snd_nxt is properly initialized on connect
  e1000e: change usleep_range to udelay in PHY mdic access
  net: dsa: mv88e6xxx: Fix number of databases for 88E6141 / 88E6341
  cxgb4: Properly lock TX queue for the selftest.
  rxrpc: Fix using alignmask being zero for __page_frag_alloc_align()
  vxlan: Add missing VNI filter counter update in arp_reduce().
  vxlan: Fix racy device stats updates.
  net: qede: use return from qede_parse_actions()
  ...

17 months agoMerge commit '50abcc179e0c9ca667feb223b26ea406d5c4c556' of git://git.infradead.org...
Jens Axboe [Thu, 2 May 2024 13:22:51 +0000 (07:22 -0600)]
Merge commit '50abcc179e0c9ca667feb223b26ea406d5c4c556' of git://git.infradead.org/nvme into block-6.9

Pull NVMe fixes from Keith.

* git://git.infradead.org/nvme:
  nvme-tcp: strict pdu pacing to avoid send stalls on TLS
  nvmet: fix nvme status code when namespace is disabled
  nvmet-tcp: fix possible memory leak when tearing down a controller
  nvme: cancel pending I/O if nvme controller is in terminal state
  nvmet-auth: replace pr_debug() with pr_err() to report an error.
  nvmet-auth: return the error code to the nvmet_auth_host_hash() callers
  nvme: find numa distance only if controller has valid numa id
  nvme: fix warn output about shared namespaces without CONFIG_NVME_MULTIPATH

17 months agoMAINTAINERS: mark MYRICOM MYRI-10G as Orphan
Jakub Kicinski [Tue, 30 Apr 2024 23:35:32 +0000 (16:35 -0700)]
MAINTAINERS: mark MYRICOM MYRI-10G as Orphan

Chris's email address bounces and lore hasn't seen an email
from anyone with his name for almost a decade.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240430233532.1356982-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
17 months agoMAINTAINERS: remove Ariel Elior
Jakub Kicinski [Tue, 30 Apr 2024 23:33:05 +0000 (16:33 -0700)]
MAINTAINERS: remove Ariel Elior

aelior@marvell.com bounces, we haven't seen Ariel on lore
since March 2022.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20240430233305.1356105-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
17 months agoMerge branch 'net-gro-add-flush-flush_id-checks-and-fix-wrong-offset-in-udp'
Paolo Abeni [Thu, 2 May 2024 09:03:21 +0000 (11:03 +0200)]
Merge branch 'net-gro-add-flush-flush_id-checks-and-fix-wrong-offset-in-udp'

Richard Gobert says:

====================
net: gro: add flush/flush_id checks and fix wrong offset in udp

This series fixes a bug in the complete phase of UDP in GRO, in which
socket lookup fails due to using network_header when parsing encapsulated
packets. The fix is to add network_offset and inner_network_offset to
napi_gro_cb and use these offsets for socket lookup.

In addition p->flush/flush_id should be checked in all UDP flows. The
same logic from tcp_gro_receive is applied for all flows in
udp_gro_receive_segment. This prevents packets with mismatching network
headers (flush/flush_id turned on) from merging in UDP GRO.

The original series includes a change to vxlan test which adds the local
parameter to prevent similar future bugs. I plan to submit it separately to
net-next.

This series is part of a previously submitted series to net-next:
https://lore.kernel.org/all/20240408141720.98832-1-richardbgobert@gmail.com/

v3 -> v4:
 - Store network offsets, and use them only in udp_gro_complete flows
 - Correct commit hash used in Fixes tag
 - v3:
 https://lore.kernel.org/netdev/20240424163045.123528-1-richardbgobert@gmail.com/

v2 -> v3:
 - Add network_offsets and fix udp bug in a single commit to make backporting easier
 - Write to inner_network_offset in {inet,ipv6}_gro_receive
 - Use network_offsets union in tcp[46]_gro_complete as well
 - v2:
 https://lore.kernel.org/netdev/20240419153542.121087-1-richardbgobert@gmail.com/

v1 -> v2:
 - Use network_offsets instead of p_poff param as suggested by Willem
 - Check flush before postpull, and for all UDP GRO flows
 - v1:
 https://lore.kernel.org/netdev/20240412152120.115067-1-richardbgobert@gmail.com/
====================

Link: https://lore.kernel.org/r/20240430143555.126083-1-richardbgobert@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
17 months agonet: gro: add flush check in udp_gro_receive_segment
Richard Gobert [Tue, 30 Apr 2024 14:35:55 +0000 (16:35 +0200)]
net: gro: add flush check in udp_gro_receive_segment

GRO-GSO path is supposed to be transparent and as such L3 flush checks are
relevant to all UDP flows merging in GRO. This patch uses the same logic
and code from tcp_gro_receive, terminating merge if flush is non zero.

Fixes: e20cf8d3f1f7 ("udp: implement GRO for plain UDP sockets.")
Signed-off-by: Richard Gobert <richardbgobert@gmail.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
17 months agonet: gro: fix udp bad offset in socket lookup by adding {inner_}network_offset to...
Richard Gobert [Tue, 30 Apr 2024 14:35:54 +0000 (16:35 +0200)]
net: gro: fix udp bad offset in socket lookup by adding {inner_}network_offset to napi_gro_cb

Commits a602456 ("udp: Add GRO functions to UDP socket") and 57c67ff ("udp:
additional GRO support") introduce incorrect usage of {ip,ipv6}_hdr in the
complete phase of gro. The functions always return skb->network_header,
which in the case of encapsulated packets at the gro complete phase, is
always set to the innermost L3 of the packet. That means that calling
{ip,ipv6}_hdr for skbs which completed the GRO receive phase (both in
gro_list and *_gro_complete) when parsing an encapsulated packet's _outer_
L3/L4 may return an unexpected value.

This incorrect usage leads to a bug in GRO's UDP socket lookup.
udp{4,6}_lib_lookup_skb functions use ip_hdr/ipv6_hdr respectively. These
*_hdr functions return network_header which will point to the innermost L3,
resulting in the wrong offset being used in __udp{4,6}_lib_lookup with
encapsulated packets.

This patch adds network_offset and inner_network_offset to napi_gro_cb, and
makes sure both are set correctly.

To fix the issue, network_offsets union is used inside napi_gro_cb, in
which both the outer and the inner network offsets are saved.

Reproduction example:

Endpoint configuration example (fou + local address bind)

    # ip fou add port 6666 ipproto 4
    # ip link add name tun1 type ipip remote 2.2.2.1 local 2.2.2.2 encap fou encap-dport 5555 encap-sport 6666 mode ipip
    # ip link set tun1 up
    # ip a add 1.1.1.2/24 dev tun1

Netperf TCP_STREAM result on net-next before patch is applied:

net-next main, GRO enabled:
    $ netperf -H 1.1.1.2 -t TCP_STREAM -l 5
    Recv   Send    Send
    Socket Socket  Message  Elapsed
    Size   Size    Size     Time     Throughput
    bytes  bytes   bytes    secs.    10^6bits/sec

    131072  16384  16384    5.28        2.37

net-next main, GRO disabled:
    $ netperf -H 1.1.1.2 -t TCP_STREAM -l 5
    Recv   Send    Send
    Socket Socket  Message  Elapsed
    Size   Size    Size     Time     Throughput
    bytes  bytes   bytes    secs.    10^6bits/sec

    131072  16384  16384    5.01     2745.06

patch applied, GRO enabled:
    $ netperf -H 1.1.1.2 -t TCP_STREAM -l 5
    Recv   Send    Send
    Socket Socket  Message  Elapsed
    Size   Size    Size     Time     Throughput
    bytes  bytes   bytes    secs.    10^6bits/sec

    131072  16384  16384    5.01     2877.38

Fixes: a6024562ffd7 ("udp: Add GRO functions to UDP socket")
Signed-off-by: Richard Gobert <richardbgobert@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
17 months agoipv4: Fix uninit-value access in __ip_make_skb()
Shigeru Yoshida [Tue, 30 Apr 2024 12:39:45 +0000 (21:39 +0900)]
ipv4: Fix uninit-value access in __ip_make_skb()

KMSAN reported uninit-value access in __ip_make_skb() [1].  __ip_make_skb()
tests HDRINCL to know if the skb has icmphdr. However, HDRINCL can cause a
race condition. If calling setsockopt(2) with IP_HDRINCL changes HDRINCL
while __ip_make_skb() is running, the function will access icmphdr in the
skb even if it is not included. This causes the issue reported by KMSAN.

Check FLOWI_FLAG_KNOWN_NH on fl4->flowi4_flags instead of testing HDRINCL
on the socket.

Also, fl4->fl4_icmp_type and fl4->fl4_icmp_code are not initialized. These
are union in struct flowi4 and are implicitly initialized by
flowi4_init_output(), but we should not rely on specific union layout.

Initialize these explicitly in raw_sendmsg().

[1]
BUG: KMSAN: uninit-value in __ip_make_skb+0x2b74/0x2d20 net/ipv4/ip_output.c:1481
 __ip_make_skb+0x2b74/0x2d20 net/ipv4/ip_output.c:1481
 ip_finish_skb include/net/ip.h:243 [inline]
 ip_push_pending_frames+0x4c/0x5c0 net/ipv4/ip_output.c:1508
 raw_sendmsg+0x2381/0x2690 net/ipv4/raw.c:654
 inet_sendmsg+0x27b/0x2a0 net/ipv4/af_inet.c:851
 sock_sendmsg_nosec net/socket.c:730 [inline]
 __sock_sendmsg+0x274/0x3c0 net/socket.c:745
 __sys_sendto+0x62c/0x7b0 net/socket.c:2191
 __do_sys_sendto net/socket.c:2203 [inline]
 __se_sys_sendto net/socket.c:2199 [inline]
 __x64_sys_sendto+0x130/0x200 net/socket.c:2199
 do_syscall_64+0xd8/0x1f0 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x6d/0x75

Uninit was created at:
 slab_post_alloc_hook mm/slub.c:3804 [inline]
 slab_alloc_node mm/slub.c:3845 [inline]
 kmem_cache_alloc_node+0x5f6/0xc50 mm/slub.c:3888
 kmalloc_reserve+0x13c/0x4a0 net/core/skbuff.c:577
 __alloc_skb+0x35a/0x7c0 net/core/skbuff.c:668
 alloc_skb include/linux/skbuff.h:1318 [inline]
 __ip_append_data+0x49ab/0x68c0 net/ipv4/ip_output.c:1128
 ip_append_data+0x1e7/0x260 net/ipv4/ip_output.c:1365
 raw_sendmsg+0x22b1/0x2690 net/ipv4/raw.c:648
 inet_sendmsg+0x27b/0x2a0 net/ipv4/af_inet.c:851
 sock_sendmsg_nosec net/socket.c:730 [inline]
 __sock_sendmsg+0x274/0x3c0 net/socket.c:745
 __sys_sendto+0x62c/0x7b0 net/socket.c:2191
 __do_sys_sendto net/socket.c:2203 [inline]
 __se_sys_sendto net/socket.c:2199 [inline]
 __x64_sys_sendto+0x130/0x200 net/socket.c:2199
 do_syscall_64+0xd8/0x1f0 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x6d/0x75

CPU: 1 PID: 15709 Comm: syz-executor.7 Not tainted 6.8.0-11567-gb3603fcb79b1 #25
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-1.fc39 04/01/2014

Fixes: 99e5acae193e ("ipv4: Fix potential uninit variable access bug in __ip_make_skb()")
Reported-by: syzkaller <syzkaller@googlegroups.com>
Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
Link: https://lore.kernel.org/r/20240430123945.2057348-1-syoshida@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
17 months agodrm/panel: ili9341: Use predefined error codes
Andy Shevchenko [Thu, 25 Apr 2024 14:26:19 +0000 (17:26 +0300)]
drm/panel: ili9341: Use predefined error codes

In one case the -1 is returned which is quite confusing code for
the wrong device ID, in another the ret is returning instead of
plain 0 that also confusing as readed may ask the possible meaning
of positive codes, which are never the case there. Convert both
to use explicit predefined error codes to make it clear what's going
on there.

Fixes: 5a04227326b0 ("drm/panel: Add ilitek ili9341 panel driver")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Reviewed-by: Sui Jingfeng <sui.jingfeng@linux.dev>
Link: https://lore.kernel.org/r/20240425142706.2440113-4-andriy.shevchenko@linux.intel.com
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20240425142706.2440113-4-andriy.shevchenko@linux.intel.com
17 months agodrm/panel: ili9341: Respect deferred probe
Andy Shevchenko [Thu, 25 Apr 2024 14:26:18 +0000 (17:26 +0300)]
drm/panel: ili9341: Respect deferred probe

GPIO controller might not be available when driver is being probed.
There are plenty of reasons why, one of which is deferred probe.

Since GPIOs are optional, return any error code we got to the upper
layer, including deferred probe. With that in mind, use dev_err_probe()
in order to avoid spamming the logs.

Fixes: 5a04227326b0 ("drm/panel: Add ilitek ili9341 panel driver")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Reviewed-by: Sui Jingfeng <sui.jingfeng@linux.dev>
Link: https://lore.kernel.org/r/20240425142706.2440113-3-andriy.shevchenko@linux.intel.com
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20240425142706.2440113-3-andriy.shevchenko@linux.intel.com
17 months agodrm/panel: ili9341: Correct use of device property APIs
Andy Shevchenko [Thu, 25 Apr 2024 14:26:17 +0000 (17:26 +0300)]
drm/panel: ili9341: Correct use of device property APIs

It seems driver missed the point of proper use of device property APIs.
Correct this by updating headers and calls respectively.

Fixes: 5a04227326b0 ("drm/panel: Add ilitek ili9341 panel driver")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://lore.kernel.org/r/20240425142706.2440113-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20240425142706.2440113-2-andriy.shevchenko@linux.intel.com
17 months agos390/qeth: Fix kernel panic after setting hsuid
Alexandra Winter [Tue, 30 Apr 2024 09:10:04 +0000 (11:10 +0200)]
s390/qeth: Fix kernel panic after setting hsuid

Symptom:
When the hsuid attribute is set for the first time on an IQD Layer3
device while the corresponding network interface is already UP,
the kernel will try to execute a napi function pointer that is NULL.

Example:
---------------------------------------------------------------------------
[ 2057.572696] illegal operation: 0001 ilc:1 [#1] SMP
[ 2057.572702] Modules linked in: af_iucv qeth_l3 zfcp scsi_transport_fc sunrpc nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6
nft_reject nft_ct nf_tables_set nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables libcrc32c nfnetlink ghash_s390 prng xts aes_s390 des_s390 de
s_generic sha3_512_s390 sha3_256_s390 sha512_s390 vfio_ccw vfio_mdev mdev vfio_iommu_type1 eadm_sch vfio ext4 mbcache jbd2 qeth_l2 bridge stp llc dasd_eckd_mod qeth dasd_mod
 qdio ccwgroup pkey zcrypt
[ 2057.572739] CPU: 6 PID: 60182 Comm: stress_client Kdump: loaded Not tainted 4.18.0-541.el8.s390x #1
[ 2057.572742] Hardware name: IBM 3931 A01 704 (LPAR)
[ 2057.572744] Krnl PSW : 0704f00180000000 0000000000000002 (0x2)
[ 2057.572748]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:3 PM:0 RI:0 EA:3
[ 2057.572751] Krnl GPRS: 0000000000000004 0000000000000000 00000000a3b008d8 0000000000000000
[ 2057.572754]            00000000a3b008d8 cb923a29c779abc5 0000000000000000 00000000814cfd80
[ 2057.572756]            000000000000012c 0000000000000000 00000000a3b008d8 00000000a3b008d8
[ 2057.572758]            00000000bab6d500 00000000814cfd80 0000000091317e46 00000000814cfc68
[ 2057.572762] Krnl Code:#0000000000000000: 0000                illegal
                         >0000000000000002: 0000                illegal
                          0000000000000004: 0000                illegal
                          0000000000000006: 0000                illegal
                          0000000000000008: 0000                illegal
                          000000000000000a: 0000                illegal
                          000000000000000c: 0000                illegal
                          000000000000000e: 0000                illegal
[ 2057.572800] Call Trace:
[ 2057.572801] ([<00000000ec639700>] 0xec639700)
[ 2057.572803]  [<00000000913183e2>] net_rx_action+0x2ba/0x398
[ 2057.572809]  [<0000000091515f76>] __do_softirq+0x11e/0x3a0
[ 2057.572813]  [<0000000090ce160c>] do_softirq_own_stack+0x3c/0x58
[ 2057.572817] ([<0000000090d2cbd6>] do_softirq.part.1+0x56/0x60)
[ 2057.572822]  [<0000000090d2cc60>] __local_bh_enable_ip+0x80/0x98
[ 2057.572825]  [<0000000091314706>] __dev_queue_xmit+0x2be/0xd70
[ 2057.572827]  [<000003ff803dd6d6>] afiucv_hs_send+0x24e/0x300 [af_iucv]
[ 2057.572830]  [<000003ff803dd88a>] iucv_send_ctrl+0x102/0x138 [af_iucv]
[ 2057.572833]  [<000003ff803de72a>] iucv_sock_connect+0x37a/0x468 [af_iucv]
[ 2057.572835]  [<00000000912e7e90>] __sys_connect+0xa0/0xd8
[ 2057.572839]  [<00000000912e9580>] sys_socketcall+0x228/0x348
[ 2057.572841]  [<0000000091514e1a>] system_call+0x2a6/0x2c8
[ 2057.572843] Last Breaking-Event-Address:
[ 2057.572844]  [<0000000091317e44>] __napi_poll+0x4c/0x1d8
[ 2057.572846]
[ 2057.572847] Kernel panic - not syncing: Fatal exception in interrupt
-------------------------------------------------------------------------------------------

Analysis:
There is one napi structure per out_q: card->qdio.out_qs[i].napi
The napi.poll functions are set during qeth_open().

Since
commit 1cfef80d4c2b ("s390/qeth: Don't call dev_close/dev_open (DOWN/UP)")
qeth_set_offline()/qeth_set_online() no longer call dev_close()/
dev_open(). So if qeth_free_qdio_queues() cleared
card->qdio.out_qs[i].napi.poll while the network interface was UP and the
card was offline, they are not set again.

Reproduction:
chzdev -e $devno layer2=0
ip link set dev $network_interface up
echo 0 > /sys/bus/ccwgroup/devices/0.0.$devno/online
echo foo > /sys/bus/ccwgroup/devices/0.0.$devno/hsuid
echo 1 > /sys/bus/ccwgroup/devices/0.0.$devno/online
-> Crash (can be enforced e.g. by af_iucv connect(), ip link down/up, ...)

Note that a Completion Queue (CQ) is only enabled or disabled, when hsuid
is set for the first time or when it is removed.

Workarounds:
- Set hsuid before setting the device online for the first time
or
- Use chzdev -d $devno; chzdev $devno hsuid=xxx; chzdev -e $devno;
to set hsuid on an existing device. (this will remove and recreate the
network interface)

Fix:
There is no need to free the output queues when a completion queue is
added or removed.
card->qdio.state now indicates whether the inbound buffer pool and the
outbound queues are allocated.
card->qdio.c_q indicates whether a CQ is allocated.

Fixes: 1cfef80d4c2b ("s390/qeth: Don't call dev_close/dev_open (DOWN/UP)")
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240430091004.2265683-1-wintera@linux.ibm.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
17 months agoALSA: hda/realtek: Fix build error without CONFIG_PM
Takashi Iwai [Thu, 2 May 2024 06:24:42 +0000 (08:24 +0200)]
ALSA: hda/realtek: Fix build error without CONFIG_PM

The alc_spec.power_hook is defined only with CONFIG_PM, and the recent
fix overlooked it, resulting in a build error without CONFIG_PM.
Fix it with the simple ifdef and set __maybe_unused for the function.

We may drop the whole CONFIG_PM dependency there, but it should be
done in a separate cleanup patch later.

Fixes: 1e707769df07 ("ALSA: hda/realtek - Set GPIO3 to default at S4 state for Thinkpad with ALC1318")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202405012104.Dr7h318W-lkp@intel.com/
Message-ID: <20240502062442.30545-1-tiwai@suse.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
17 months agovxlan: Pull inner IP header in vxlan_rcv().
Guillaume Nault [Tue, 30 Apr 2024 16:50:13 +0000 (18:50 +0200)]
vxlan: Pull inner IP header in vxlan_rcv().

Ensure the inner IP header is part of skb's linear data before reading
its ECN bits. Otherwise we might read garbage.
One symptom is the system erroneously logging errors like
"vxlan: non-ECT from xxx.xxx.xxx.xxx with TOS=xxxx".

Similar bugs have been fixed in geneve, ip_tunnel and ip6_tunnel (see
commit 1ca1ba465e55 ("geneve: make sure to pull inner header in
geneve_rx()") for example). So let's reuse the same code structure for
consistency. Maybe we'll can add a common helper in the future.

Fixes: d342894c5d2f ("vxlan: virtual extensible lan")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://lore.kernel.org/r/1239c8db54efec341dd6455c77e0380f58923a3c.1714495737.git.gnault@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agotipc: fix a possible memleak in tipc_buf_append
Xin Long [Tue, 30 Apr 2024 14:03:38 +0000 (10:03 -0400)]
tipc: fix a possible memleak in tipc_buf_append

__skb_linearize() doesn't free the skb when it fails, so move
'*buf = NULL' after __skb_linearize(), so that the skb can be
freed on the err path.

Fixes: b7df21cf1b79 ("tipc: skb_linearize the head skb when reassembling msgs")
Reported-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Link: https://lore.kernel.org/r/90710748c29a1521efac4f75ea01b3b7e61414cf.1714485818.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agotipc: fix UAF in error path
Paolo Abeni [Tue, 30 Apr 2024 13:53:37 +0000 (15:53 +0200)]
tipc: fix UAF in error path

Sam Page (sam4k) working with Trend Micro Zero Day Initiative reported
a UAF in the tipc_buf_append() error path:

BUG: KASAN: slab-use-after-free in kfree_skb_list_reason+0x47e/0x4c0
linux/net/core/skbuff.c:1183
Read of size 8 at addr ffff88804d2a7c80 by task poc/8034

CPU: 1 PID: 8034 Comm: poc Not tainted 6.8.2 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
1.16.0-debian-1.16.0-5 04/01/2014
Call Trace:
 <IRQ>
 __dump_stack linux/lib/dump_stack.c:88
 dump_stack_lvl+0xd9/0x1b0 linux/lib/dump_stack.c:106
 print_address_description linux/mm/kasan/report.c:377
 print_report+0xc4/0x620 linux/mm/kasan/report.c:488
 kasan_report+0xda/0x110 linux/mm/kasan/report.c:601
 kfree_skb_list_reason+0x47e/0x4c0 linux/net/core/skbuff.c:1183
 skb_release_data+0x5af/0x880 linux/net/core/skbuff.c:1026
 skb_release_all linux/net/core/skbuff.c:1094
 __kfree_skb linux/net/core/skbuff.c:1108
 kfree_skb_reason+0x12d/0x210 linux/net/core/skbuff.c:1144
 kfree_skb linux/./include/linux/skbuff.h:1244
 tipc_buf_append+0x425/0xb50 linux/net/tipc/msg.c:186
 tipc_link_input+0x224/0x7c0 linux/net/tipc/link.c:1324
 tipc_link_rcv+0x76e/0x2d70 linux/net/tipc/link.c:1824
 tipc_rcv+0x45f/0x10f0 linux/net/tipc/node.c:2159
 tipc_udp_recv+0x73b/0x8f0 linux/net/tipc/udp_media.c:390
 udp_queue_rcv_one_skb+0xad2/0x1850 linux/net/ipv4/udp.c:2108
 udp_queue_rcv_skb+0x131/0xb00 linux/net/ipv4/udp.c:2186
 udp_unicast_rcv_skb+0x165/0x3b0 linux/net/ipv4/udp.c:2346
 __udp4_lib_rcv+0x2594/0x3400 linux/net/ipv4/udp.c:2422
 ip_protocol_deliver_rcu+0x30c/0x4e0 linux/net/ipv4/ip_input.c:205
 ip_local_deliver_finish+0x2e4/0x520 linux/net/ipv4/ip_input.c:233
 NF_HOOK linux/./include/linux/netfilter.h:314
 NF_HOOK linux/./include/linux/netfilter.h:308
 ip_local_deliver+0x18e/0x1f0 linux/net/ipv4/ip_input.c:254
 dst_input linux/./include/net/dst.h:461
 ip_rcv_finish linux/net/ipv4/ip_input.c:449
 NF_HOOK linux/./include/linux/netfilter.h:314
 NF_HOOK linux/./include/linux/netfilter.h:308
 ip_rcv+0x2c5/0x5d0 linux/net/ipv4/ip_input.c:569
 __netif_receive_skb_one_core+0x199/0x1e0 linux/net/core/dev.c:5534
 __netif_receive_skb+0x1f/0x1c0 linux/net/core/dev.c:5648
 process_backlog+0x101/0x6b0 linux/net/core/dev.c:5976
 __napi_poll.constprop.0+0xba/0x550 linux/net/core/dev.c:6576
 napi_poll linux/net/core/dev.c:6645
 net_rx_action+0x95a/0xe90 linux/net/core/dev.c:6781
 __do_softirq+0x21f/0x8e7 linux/kernel/softirq.c:553
 do_softirq linux/kernel/softirq.c:454
 do_softirq+0xb2/0xf0 linux/kernel/softirq.c:441
 </IRQ>
 <TASK>
 __local_bh_enable_ip+0x100/0x120 linux/kernel/softirq.c:381
 local_bh_enable linux/./include/linux/bottom_half.h:33
 rcu_read_unlock_bh linux/./include/linux/rcupdate.h:851
 __dev_queue_xmit+0x871/0x3ee0 linux/net/core/dev.c:4378
 dev_queue_xmit linux/./include/linux/netdevice.h:3169
 neigh_hh_output linux/./include/net/neighbour.h:526
 neigh_output linux/./include/net/neighbour.h:540
 ip_finish_output2+0x169f/0x2550 linux/net/ipv4/ip_output.c:235
 __ip_finish_output linux/net/ipv4/ip_output.c:313
 __ip_finish_output+0x49e/0x950 linux/net/ipv4/ip_output.c:295
 ip_finish_output+0x31/0x310 linux/net/ipv4/ip_output.c:323
 NF_HOOK_COND linux/./include/linux/netfilter.h:303
 ip_output+0x13b/0x2a0 linux/net/ipv4/ip_output.c:433
 dst_output linux/./include/net/dst.h:451
 ip_local_out linux/net/ipv4/ip_output.c:129
 ip_send_skb+0x3e5/0x560 linux/net/ipv4/ip_output.c:1492
 udp_send_skb+0x73f/0x1530 linux/net/ipv4/udp.c:963
 udp_sendmsg+0x1a36/0x2b40 linux/net/ipv4/udp.c:1250
 inet_sendmsg+0x105/0x140 linux/net/ipv4/af_inet.c:850
 sock_sendmsg_nosec linux/net/socket.c:730
 __sock_sendmsg linux/net/socket.c:745
 __sys_sendto+0x42c/0x4e0 linux/net/socket.c:2191
 __do_sys_sendto linux/net/socket.c:2203
 __se_sys_sendto linux/net/socket.c:2199
 __x64_sys_sendto+0xe0/0x1c0 linux/net/socket.c:2199
 do_syscall_x64 linux/arch/x86/entry/common.c:52
 do_syscall_64+0xd8/0x270 linux/arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x6f/0x77 linux/arch/x86/entry/entry_64.S:120
RIP: 0033:0x7f3434974f29
Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48
89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d
01 f0 ff ff 73 01 c3 48 8b 0d 37 8f 0d 00 f7 d8 64 89 01 48
RSP: 002b:00007fff9154f2b8 EFLAGS: 00000212 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f3434974f29
RDX: 00000000000032c8 RSI: 00007fff9154f300 RDI: 0000000000000003
RBP: 00007fff915532e0 R08: 00007fff91553360 R09: 0000000000000010
R10: 0000000000000000 R11: 0000000000000212 R12: 000055ed86d261d0
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
 </TASK>

In the critical scenario, either the relevant skb is freed or its
ownership is transferred into a frag_lists. In both cases, the cleanup
code must not free it again: we need to clear the skb reference earlier.

Fixes: 1149557d64c9 ("tipc: eliminate unnecessary linearization of incoming buffers")
Cc: stable@vger.kernel.org
Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-23852
Acked-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/752f1ccf762223d109845365d07f55414058e5a3.1714484273.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agorxrpc: Clients must accept conn from any address
Jeffrey Altman [Fri, 19 Apr 2024 16:30:57 +0000 (13:30 -0300)]
rxrpc: Clients must accept conn from any address

The find connection logic of Transarc's Rx was modified in the mid-1990s
to support multi-homed servers which might send a response packet from
an address other than the destination address in the received packet.
The rules for accepting a packet by an Rx initiator (RX_CLIENT_CONNECTION)
were altered to permit acceptance of a packet from any address provided
that the port number was unchanged and all of the connection identifiers
matched (Epoch, CID, SecurityClass, ...).

This change applies the same rules to the Linux implementation which makes
it consistent with IBM AFS 3.6, Arla, OpenAFS and AuriStorFS.

Fixes: 17926a79320a ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both")
Signed-off-by: Jeffrey Altman <jaltman@auristor.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Marc Dionne <marc.dionne@auristor.com>
Link: https://lore.kernel.org/r/20240419163057.4141728-1-marc.dionne@auristor.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agoMerge tag 'asoc-fix-v6.9-rc6' of https://git.kernel.org/pub/scm/linux/kernel/git...
Takashi Iwai [Wed, 1 May 2024 16:05:13 +0000 (18:05 +0200)]
Merge tag 'asoc-fix-v6.9-rc6' of https://git./linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v6.9

This is much larger than is ideal, partly due to your holiday but also
due to several vendors having come in with relatively large fixes at
similar times.  It's all driver specific stuff.

The meson fixes from Jerome fix some rare timing issues with blocking
operations happening in triggers, plus the continuous clock support
which fixes clocking for some platforms.  The SOF series from Peter
builds to the fix to avoid spurious resets of ChainDMA which triggered
errors in cleanup paths with both PulseAudio and PipeWire, and there's
also some simple new debugfs files from Pierre which make support a lot
eaiser.

17 months agoMerge tag 'regulator-fix-v6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Wed, 1 May 2024 15:58:56 +0000 (08:58 -0700)]
Merge tag 'regulator-fix-v6.9-rc6' of git://git./linux/kernel/git/broonie/regulator

Pull regulator fixes from Mark Brown:
 "There's a few simple driver specific fixes here, plus some core
  cleanups from Matti which fix issues found with client drivers due to
  the API being confusing.

  The two fixes for the stubs provide more constructive behaviour with
  !REGULATOR configurations, issues were noticed with some hwmon drivers
  which would otherwise have needed confusing bodges in the users.

  The irq_helpers fix to duplicate the provided name for the interrupt
  controller was found because a driver got this wrong and it's again a
  case where the core is the sensible place to put the fix"

* tag 'regulator-fix-v6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: change devm_regulator_get_enable_optional() stub to return Ok
  regulator: change stubbed devm_regulator_get_enable to return Ok
  regulator: vqmmc-ipq4019: fix module autoloading
  regulator: qcom-refgen: fix module autoloading
  regulator: mt6360: De-capitalize devicetree regulator subnodes
  regulator: irq_helpers: duplicate IRQ name

17 months agodrm/xe/vm: prevent UAF in rebind_work_func()
Matthew Auld [Tue, 23 Apr 2024 07:47:23 +0000 (08:47 +0100)]
drm/xe/vm: prevent UAF in rebind_work_func()

We flush the rebind worker during the vm close phase, however in places
like preempt_fence_work_func() we seem to queue the rebind worker
without first checking if the vm has already been closed.  The concern
here is the vm being closed with the worker flushed, but then being
rearmed later, which looks like potential uaf, since there is no actual
refcounting to track the queued worker. We can't take the vm->lock here
in preempt_rebind_work_func() to first check if the vm is closed since
that will deadlock, so instead flush the worker again when the vm
refcount reaches zero.

v2:
 - Grabbing vm->lock in the preempt worker creates a deadlock, so
   checking the closed state is tricky. Instead flush the worker when
   the refcount reaches zero. It should be impossible to queue the
   preempt worker without already holding vm ref.

Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1676
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1591
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1364
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1304
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1249
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240423074721.119633-4-matthew.auld@intel.com
(cherry picked from commit 3d44d67c441a9fe6f81a1d705f7de009a32a5b35)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
17 months agodrm/amd/display: Disable panel replay by default for now
Mario Limonciello [Tue, 30 Apr 2024 14:53:23 +0000 (09:53 -0500)]
drm/amd/display: Disable panel replay by default for now

Panel replay was enabled by default in commit 5950efe25ee0
("drm/amd/display: Enable Panel Replay for static screen use case"), but
it isn't working properly at least on some BOE and AUO panels.  Instead
of being static the screen is solid black when active.  As it's a new
feature that was just introduced that regressed VRR disable it for now
so that problem can be properly root caused.

Cc: Tom Chung <chiahsuan.chung@amd.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3344
Fixes: 5950efe25ee0 ("drm/amd/display: Enable Panel Replay for static screen use case")
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
17 months agonet: core: reject skb_copy(_expand) for fraglist GSO skbs
Felix Fietkau [Sat, 27 Apr 2024 18:24:19 +0000 (20:24 +0200)]
net: core: reject skb_copy(_expand) for fraglist GSO skbs

SKB_GSO_FRAGLIST skbs must not be linearized, otherwise they become
invalid. Return NULL if such an skb is passed to skb_copy or
skb_copy_expand, in order to prevent a crash on a potential later
call to skb_gso_segment.

Fixes: 3a1296a38d0c ("net: Support GRO/GSO fraglist chaining.")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 months agonet: bridge: fix multicast-to-unicast with fraglist GSO
Felix Fietkau [Sat, 27 Apr 2024 18:24:18 +0000 (20:24 +0200)]
net: bridge: fix multicast-to-unicast with fraglist GSO

Calling skb_copy on a SKB_GSO_FRAGLIST skb is not valid, since it returns
an invalid linearized skb. This code only needs to change the ethernet
header, so pskb_copy is the right function to call here.

Fixes: 6db6f0eae605 ("bridge: multicast to unicast")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 months agonvme-tcp: strict pdu pacing to avoid send stalls on TLS
Hannes Reinecke [Thu, 18 Apr 2024 10:39:45 +0000 (12:39 +0200)]
nvme-tcp: strict pdu pacing to avoid send stalls on TLS

TLS requires a strict pdu pacing via MSG_EOR to signal the end
of a record and subsequent encryption. If we do not set MSG_EOR
at the end of a sequence the record won't be closed, encryption
doesn't start, and we end up with a send stall as the message
will never be passed on to the TCP layer.
So do not check for the queue status when TLS is enabled but
rather make the MSG_MORE setting dependent on the current
request only.

Signed-off-by: Hannes Reinecke <hare@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Keith Busch <kbusch@kernel.org>
17 months agonvmet: fix nvme status code when namespace is disabled
Sagi Grimberg [Sun, 28 Apr 2024 09:25:40 +0000 (12:25 +0300)]
nvmet: fix nvme status code when namespace is disabled

If the user disabled a nvmet namespace, it is removed from the subsystem
namespaces list. When nvmet processes a command directed to an nsid that
was disabled, it cannot differentiate between a nsid that is disabled
vs. a non-existent namespace, and resorts to return NVME_SC_INVALID_NS
with the dnr bit set.

This translates to a non-retryable status for the host, which translates
to a user error. We should expect disabled namespaces to not cause an
I/O error in a multipath environment.

Address this by searching a configfs item for the namespace nvmet failed
to find, and if we found one, conclude that the namespace is disabled
(perhaps temporarily). Return NVME_SC_INTERNAL_PATH_ERROR in this case
and keep DNR bit cleared.

Reported-by: Jirong Feng <jirong.feng@easystack.cn>
Tested-by: Jirong Feng <jirong.feng@easystack.cn>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
17 months agonvmet-tcp: fix possible memory leak when tearing down a controller
Sagi Grimberg [Sun, 28 Apr 2024 08:49:49 +0000 (11:49 +0300)]
nvmet-tcp: fix possible memory leak when tearing down a controller

When we teardown the controller, we wait for pending I/Os to complete
(sq->ref on all queues to drop to zero) and then we go over the commands,
and free their command buffers in case they are still fetching data from
the host (e.g. processing nvme writes) and have yet to take a reference
on the sq.

However, we may miss the case where commands have failed before executing
and are queued for sending a response, but will never occur because the
queue socket is already down. In this case we may miss deallocating command
buffers.

Solve this by freeing all commands buffers as nvmet_tcp_free_cmd_buffers is
idempotent anyways.

Reported-by: Yi Zhang <yi.zhang@redhat.com>
Tested-by: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
17 months agonvme: cancel pending I/O if nvme controller is in terminal state
Nilay Shroff [Thu, 25 Apr 2024 14:03:00 +0000 (19:33 +0530)]
nvme: cancel pending I/O if nvme controller is in terminal state

While I/O is running, if the pci bus error occurs then
in-flight I/O can not complete. Worst, if at this time,
user (logically) hot-unplug the nvme disk then the
nvme_remove() code path can't forward progress until
in-flight I/O is cancelled. So these sequence of events
may potentially hang hot-unplug code path indefinitely.
This patch helps cancel the pending/in-flight I/O from the
nvme request timeout handler in case the nvme controller
is in the terminal (DEAD/DELETING/DELETING_NOIO) state and
that helps nvme_remove() code path forward progress and
finish successfully.

Link: https://lore.kernel.org/all/199be893-5dfa-41e5-b6f2-40ac90ebccc4@linux.ibm.com/
Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Keith Busch <kbusch@kernel.org>
17 months agonvmet-auth: replace pr_debug() with pr_err() to report an error.
Maurizio Lombardi [Wed, 10 Apr 2024 09:48:42 +0000 (11:48 +0200)]
nvmet-auth: replace pr_debug() with pr_err() to report an error.

In nvmet_auth_host_hash(), if a mismatch is detected in the hash length
the kernel should print an error.

Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
17 months agonvmet-auth: return the error code to the nvmet_auth_host_hash() callers
Maurizio Lombardi [Wed, 10 Apr 2024 09:48:41 +0000 (11:48 +0200)]
nvmet-auth: return the error code to the nvmet_auth_host_hash() callers

If the nvmet_auth_host_hash() function fails, the error code should
be returned to its callers.

Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
17 months agonvme: find numa distance only if controller has valid numa id
Nilay Shroff [Tue, 16 Apr 2024 08:19:23 +0000 (13:49 +0530)]
nvme: find numa distance only if controller has valid numa id

On system where native nvme multipath is configured and iopolicy
is set to numa but the nvme controller numa node id is undefined
or -1 (NUMA_NO_NODE) then avoid calculating node distance for
finding optimal io path. In such case we may access numa distance
table with invalid index and that may potentially refer to incorrect
memory. So this patch ensures that if the nvme controller numa node
id is -1 then instead of calculating node distance for finding optimal
io path, we set the numa node distance of such controller to default 10
(LOCAL_DISTANCE).

Link: https://lore.kernel.org/all/20240413090614.678353-1-nilay@linux.ibm.com/
Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
17 months agos390/paes: Reestablish retry loop in paes
Harald Freudenberger [Thu, 25 Apr 2024 14:29:48 +0000 (16:29 +0200)]
s390/paes: Reestablish retry loop in paes

With commit ed6776c96c60 ("s390/crypto: remove retry
loop with sleep from PAES pkey invocation") the retry
loop to retry derivation of a protected key from a
secure key has been removed. This was based on the
assumption that theses retries are not needed any
more as proper retries are done in the zcrypt layer.

However, tests have revealed that there exist some
cases with master key change in the HSM and immediately
(< 1 second) attempt to derive a protected key from a
secure key with exact this HSM may eventually fail.

The low level functions in zcrypt_ccamisc.c and
zcrypt_ep11misc.c detect and report this temporary
failure and report it to the caller as -EBUSY. The
re-established retry loop in the paes implementation
catches exactly this -EBUSY and eventually may run
some retries.

Fixes: ed6776c96c60 ("s390/crypto: remove retry loop with sleep from PAES pkey invocation")
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Ingo Franzki <ifranzki@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
17 months agos390/zcrypt: Use EBUSY to indicate temp unavailability
Harald Freudenberger [Thu, 25 Apr 2024 14:22:51 +0000 (16:22 +0200)]
s390/zcrypt: Use EBUSY to indicate temp unavailability

Use -EBUSY instead of -EAGAIN in zcrypt_ccamisc.c
in cases where the CCA card returns 8/2290 to indicate
a temporarily unavailability of this function.

Fixes: ed6776c96c60 ("s390/crypto: remove retry loop with sleep from PAES pkey invocation")
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Ingo Franzki <ifranzki@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
17 months agos390/zcrypt: Handle ep11 cprb return code
Harald Freudenberger [Mon, 25 Mar 2024 08:59:19 +0000 (09:59 +0100)]
s390/zcrypt: Handle ep11 cprb return code

An EP11 reply cprb contains a field ret_code which may
hold an error code different than the error code stored
in the payload of the cprb. As of now all the EP11 misc
functions do not evaluate this field but focus on the
error code in the payload.

Before checking the payload error, first the cprb error
field should be evaluated which is introduced with this
patch.

If the return code value 0x000c0003 is seen, this
indicates a busy situation which is reflected by
-EBUSY in the zcrpyt_ep11misc.c low level function.
A higher level caller should consider to retry after
waiting a dedicated duration (say 1 second).

Fixes: ed6776c96c60 ("s390/crypto: remove retry loop with sleep from PAES pkey invocation")
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Ingo Franzki <ifranzki@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
17 months agos390/zcrypt: Fix wrong format string in debug feature printout
Harald Freudenberger [Mon, 25 Mar 2024 08:43:53 +0000 (09:43 +0100)]
s390/zcrypt: Fix wrong format string in debug feature printout

Fix wrong format string debug feature: %04x was used
to print out a 32 bit value. - changed to %08x.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Ingo Franzki <ifranzki@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
17 months agospi: fix null pointer dereference within spi_sync
Mans Rullgard [Tue, 30 Apr 2024 18:27:05 +0000 (19:27 +0100)]
spi: fix null pointer dereference within spi_sync

If spi_sync() is called with the non-empty queue and the same spi_message
is then reused, the complete callback for the message remains set while
the context is cleared, leading to a null pointer dereference when the
callback is invoked from spi_finalize_current_message().

With function inlining disabled, the call stack might look like this:

  _raw_spin_lock_irqsave from complete_with_flags+0x18/0x58
  complete_with_flags from spi_complete+0x8/0xc
  spi_complete from spi_finalize_current_message+0xec/0x184
  spi_finalize_current_message from spi_transfer_one_message+0x2a8/0x474
  spi_transfer_one_message from __spi_pump_transfer_message+0x104/0x230
  __spi_pump_transfer_message from __spi_transfer_message_noqueue+0x30/0xc4
  __spi_transfer_message_noqueue from __spi_sync+0x204/0x248
  __spi_sync from spi_sync+0x24/0x3c
  spi_sync from mcp251xfd_regmap_crc_read+0x124/0x28c [mcp251xfd]
  mcp251xfd_regmap_crc_read [mcp251xfd] from _regmap_raw_read+0xf8/0x154
  _regmap_raw_read from _regmap_bus_read+0x44/0x70
  _regmap_bus_read from _regmap_read+0x60/0xd8
  _regmap_read from regmap_read+0x3c/0x5c
  regmap_read from mcp251xfd_alloc_can_err_skb+0x1c/0x54 [mcp251xfd]
  mcp251xfd_alloc_can_err_skb [mcp251xfd] from mcp251xfd_irq+0x194/0xe70 [mcp251xfd]
  mcp251xfd_irq [mcp251xfd] from irq_thread_fn+0x1c/0x78
  irq_thread_fn from irq_thread+0x118/0x1f4
  irq_thread from kthread+0xd8/0xf4
  kthread from ret_from_fork+0x14/0x28

Fix this by also setting message->complete to NULL when the transfer is
complete.

Fixes: ae7d2346dc89 ("spi: Don't use the message queue if possible in spi_sync")
Signed-off-by: Mans Rullgard <mans@mansr.com>
Link: https://lore.kernel.org/r/20240430182705.13019-1-mans@mansr.com
Signed-off-by: Mark Brown <broonie@kernel.org>
17 months agodrm/amdgpu: fix doorbell regression
Shashank Sharma [Mon, 29 Apr 2024 12:29:47 +0000 (14:29 +0200)]
drm/amdgpu: fix doorbell regression

This patch adds a missed handling of PL domain doorbell while
handling VRAM faults.

Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Fixes: a6ff969fe9cb ("drm/amdgpu: fix visible VRAM handling during faults")
Reviewed-by: Christian Koenig <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
17 months agodrm/amdkfd: Flush the process wq before creating a kfd_process
Lancelot SIX [Wed, 10 Apr 2024 13:14:13 +0000 (14:14 +0100)]
drm/amdkfd: Flush the process wq before creating a kfd_process

There is a race condition when re-creating a kfd_process for a process.
This has been observed when a process under the debugger executes
exec(3).  In this scenario:
- The process executes exec.
 - This will eventually release the process's mm, which will cause the
   kfd_process object associated with the process to be freed
   (kfd_process_free_notifier decrements the reference count to the
   kfd_process to 0).  This causes kfd_process_ref_release to enqueue
   kfd_process_wq_release to the kfd_process_wq.
- The debugger receives the PTRACE_EVENT_EXEC notification, and tries to
  re-enable AMDGPU traps (KFD_IOC_DBG_TRAP_ENABLE).
 - When handling this request, KFD tries to re-create a kfd_process.
   This eventually calls kfd_create_process and kobject_init_and_add.

At this point the call to kobject_init_and_add can fail because the
old kfd_process.kobj has not been freed yet by kfd_process_wq_release.

This patch proposes to avoid this race by making sure to drain
kfd_process_wq before creating a new kfd_process object.  This way, we
know that any cleanup task is done executing when we reach
kobject_init_and_add.

Signed-off-by: Lancelot SIX <lancelot.six@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
17 months agodrm/amd/display: Disable seamless boot on 128b/132b encoding
Sung Joon Kim [Thu, 18 Apr 2024 20:59:36 +0000 (16:59 -0400)]
drm/amd/display: Disable seamless boot on 128b/132b encoding

[why]
preOS will not support display mode programming and link training
for UHBR rates.

[how]
If we detect a sink that's UHBR capable, disable seamless boot

Reviewed-by: Anthony Koo <anthony.koo@amd.com>
Acked-by: Wayne Lin <wayne.lin@amd.com>
Signed-off-by: Sung Joon Kim <sungjoon.kim@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
17 months agomptcp: ensure snd_nxt is properly initialized on connect
Paolo Abeni [Mon, 29 Apr 2024 18:00:31 +0000 (20:00 +0200)]
mptcp: ensure snd_nxt is properly initialized on connect

Christoph reported a splat hinting at a corrupted snd_una:

  WARNING: CPU: 1 PID: 38 at net/mptcp/protocol.c:1005 __mptcp_clean_una+0x4b3/0x620 net/mptcp/protocol.c:1005
  Modules linked in:
  CPU: 1 PID: 38 Comm: kworker/1:1 Not tainted 6.9.0-rc1-gbbeac67456c9 #59
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
  Workqueue: events mptcp_worker
  RIP: 0010:__mptcp_clean_una+0x4b3/0x620 net/mptcp/protocol.c:1005
  Code: be 06 01 00 00 bf 06 01 00 00 e8 a8 12 e7 fe e9 00 fe ff ff e8
   8e 1a e7 fe 0f b7 ab 3e 02 00 00 e9 d3 fd ff ff e8 7d 1a e7 fe
   <0f> 0b 4c 8b bb e0 05 00 00 e9 74 fc ff ff e8 6a 1a e7 fe 0f 0b e9
  RSP: 0018:ffffc9000013fd48 EFLAGS: 00010293
  RAX: 0000000000000000 RBX: ffff8881029bd280 RCX: ffffffff82382fe4
  RDX: ffff8881003cbd00 RSI: ffffffff823833c3 RDI: 0000000000000001
  RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
  R10: 0000000000000000 R11: fefefefefefefeff R12: ffff888138ba8000
  R13: 0000000000000106 R14: ffff8881029bd908 R15: ffff888126560000
  FS:  0000000000000000(0000) GS:ffff88813bd00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f604a5dae38 CR3: 0000000101dac002 CR4: 0000000000170ef0
  Call Trace:
   <TASK>
   __mptcp_clean_una_wakeup net/mptcp/protocol.c:1055 [inline]
   mptcp_clean_una_wakeup net/mptcp/protocol.c:1062 [inline]
   __mptcp_retrans+0x7f/0x7e0 net/mptcp/protocol.c:2615
   mptcp_worker+0x434/0x740 net/mptcp/protocol.c:2767
   process_one_work+0x1e0/0x560 kernel/workqueue.c:3254
   process_scheduled_works kernel/workqueue.c:3335 [inline]
   worker_thread+0x3c7/0x640 kernel/workqueue.c:3416
   kthread+0x121/0x170 kernel/kthread.c:388
   ret_from_fork+0x44/0x50 arch/x86/kernel/process.c:147
   ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:243
   </TASK>

When fallback to TCP happens early on a client socket, snd_nxt
is not yet initialized and any incoming ack will copy such value
into snd_una. If the mptcp worker (dumbly) tries mptcp-level
re-injection after such ack, that would unconditionally trigger a send
buffer cleanup using 'bad' snd_una values.

We could easily disable re-injection for fallback sockets, but such
dumb behavior already helped catching a few subtle issues and a very
low to zero impact in practice.

Instead address the issue always initializing snd_nxt (and write_seq,
for consistency) at connect time.

Fixes: 8fd738049ac3 ("mptcp: fallback in case of simultaneous connect")
Cc: stable@vger.kernel.org
Reported-by: Christoph Paasch <cpaasch@apple.com>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/485
Tested-by: Christoph Paasch <cpaasch@apple.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240429-upstream-net-20240429-mptcp-snd_nxt-init-connect-v1-1-59ceac0a7dcb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agodrm/amd/display: Fix DC mode screen flickering on DCN321
Leo Ma [Thu, 11 Apr 2024 21:17:04 +0000 (17:17 -0400)]
drm/amd/display: Fix DC mode screen flickering on DCN321

[Why && How]
Screen flickering saw on 4K@60 eDP with high refresh rate external
monitor when booting up in DC mode. DC Mode Capping is disabled
which caused wrong UCLK being used.

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Acked-by: Wayne Lin <wayne.lin@amd.com>
Signed-off-by: Leo Ma <hanghong.ma@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
17 months agodrm/amd/display: Add VCO speed parameter for DCN31 FPU
Rodrigo Siqueira [Thu, 18 Apr 2024 17:19:03 +0000 (11:19 -0600)]
drm/amd/display: Add VCO speed parameter for DCN31 FPU

Add VCO speed parameters in the bounding box array.

Acked-by: Wayne Lin <wayne.lin@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
17 months agoe1000e: change usleep_range to udelay in PHY mdic access
Vitaly Lifshits [Mon, 29 Apr 2024 17:10:40 +0000 (10:10 -0700)]
e1000e: change usleep_range to udelay in PHY mdic access

This is a partial revert of commit 6dbdd4de0362 ("e1000e: Workaround
for sporadic MDI error on Meteor Lake systems"). The referenced commit
used usleep_range inside the PHY access routines, which are sometimes
called from an atomic context. This can lead to a kernel panic in some
scenarios, such as cable disconnection and reconnection on vPro systems.

Solve this by changing the usleep_range calls back to udelay.

Fixes: 6dbdd4de0362 ("e1000e: Workaround for sporadic MDI error on Meteor Lake systems")
Cc: stable@vger.kernel.org
Reported-by: Jérôme Carretero <cJ@zougloub.eu>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218740
Closes: https://lore.kernel.org/lkml/a7eb665c74b5efb5140e6979759ed243072cb24a.camel@zougloub.eu/
Co-developed-by: Sasha Neftin <sasha.neftin@intel.com>
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Signed-off-by: Vitaly Lifshits <vitaly.lifshits@intel.com>
Tested-by: Dima Ruinskiy <dima.ruinskiy@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240429171040.1152516-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agodrm/amdgpu: once more fix the call oder in amdgpu_ttm_move() v2
Christian König [Thu, 21 Mar 2024 10:32:02 +0000 (11:32 +0100)]
drm/amdgpu: once more fix the call oder in amdgpu_ttm_move() v2

This reverts drm/amdgpu: fix ftrace event amdgpu_bo_move always move
on same heap. The basic problem here is that after the move the old
location is simply not available any more.

Some fixes were suggested, but essentially we should call the move
notification before actually moving things because only this way we have
the correct order for DMA-buf and VM move notifications as well.

Also rework the statistic handling so that we don't update the eviction
counter before the move.

v2: add missing NULL check

Signed-off-by: Christian König <christian.koenig@amd.com>
Fixes: 94aeb4117343 ("drm/amdgpu: fix ftrace event amdgpu_bo_move always move on same heap")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3171
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
CC: stable@vger.kernel.org
17 months agodrm/amd/display: Allocate zero bw after bw alloc enable
Meenakshikumar Somasundaram [Wed, 10 Apr 2024 14:46:35 +0000 (10:46 -0400)]
drm/amd/display: Allocate zero bw after bw alloc enable

[Why]
During DP tunnel creation, CM preallocates BW and reduces
estimated BW of other DPIA. CM release preallocation only
when allocation is complete. Display mode validation logic
validates timings based on bw available per host router.
In multi display setup, this causes bw allocation failure
when allocation greater than estimated bw.

[How]
Do zero alloc to make the CM to release preallocation and
update estimated BW correctly for all DPIAs per host router.

Reviewed-by: PeiChen Huang <peichen.huang@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Meenakshikumar Somasundaram <meenakshikumar.somasundaram@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
17 months agodrm/amd/display: Fix incorrect DSC instance for MST
Hersen Wu [Tue, 13 Feb 2024 19:26:06 +0000 (14:26 -0500)]
drm/amd/display: Fix incorrect DSC instance for MST

[Why] DSC debugfs, such as dp_dsc_clock_en_read,
use aconnector->dc_link to find pipe_ctx for display.
Displays connected to MST hub share the same dc_link.
DSC instance is from pipe_ctx. This causes incorrect
DSC instance for display connected to MST hub.

[How] Add aconnector->sink check to find pipe_ctx.

CC: stable@vger.kernel.org
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Hersen Wu <hersenxs.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
17 months agodrm/amd/display: Atom Integrated System Info v2_2 for DCN35
Gabe Teeger [Tue, 9 Apr 2024 14:38:58 +0000 (10:38 -0400)]
drm/amd/display: Atom Integrated System Info v2_2 for DCN35

New request from KMD/VBIOS in order to support new UMA carveout
model. This fixes a null dereference from accessing
Ctx->dc_bios->integrated_info while it was NULL.

DAL parses through the BIOS and extracts the necessary
integrated_info but was missing a case for the new BIOS
version 2.3.

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Gabe Teeger <gabe.teeger@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
17 months agonet: dsa: mv88e6xxx: Fix number of databases for 88E6141 / 88E6341
Marek Behún [Mon, 29 Apr 2024 13:38:32 +0000 (15:38 +0200)]
net: dsa: mv88e6xxx: Fix number of databases for 88E6141 / 88E6341

The Topaz family (88E6141 and 88E6341) only support 256 Forwarding
Information Tables.

Fixes: a75961d0ebfd ("net: dsa: mv88e6xxx: Add support for ethernet switch 88E6341")
Fixes: 1558727a1c1b ("net: dsa: mv88e6xxx: Add support for ethernet switch 88E6141")
Signed-off-by: Marek Behún <kabel@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20240429133832.9547-1-kabel@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agodrm/amd/display: Add dtbclk access to dcn315
Swapnil Patel [Wed, 3 Apr 2024 01:07:46 +0000 (21:07 -0400)]
drm/amd/display: Add dtbclk access to dcn315

[Why & How]

Currently DCN315 clk manager is missing code to enable/disable dtbclk.
Because of this, "optimized_required" flag is constantly set
and this prevents FreeSync from engaging for certain high bandwidth
display Modes which require DTBCLK.

Reviewed-by: Dmytro Laktyushkin <dmytro.laktyushkin@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Swapnil Patel <swapnil.patel@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
17 months agocxgb4: Properly lock TX queue for the selftest.
Sebastian Andrzej Siewior [Mon, 29 Apr 2024 09:11:47 +0000 (11:11 +0200)]
cxgb4: Properly lock TX queue for the selftest.

The selftest for the driver sends a dummy packet and checks if the
packet will be received properly as it should be. The regular TX path
and the selftest can use the same network queue so locking is required
and was missing in the selftest path. This was addressed in the commit
cited below.
Unfortunately locking the TX queue requires BH to be disabled which is
not the case in selftest path which is invoked in process context.
Lockdep should be complaining about this.

Use __netif_tx_lock_bh() for TX queue locking.

Fixes: c650e04898072 ("cxgb4: Fix race between loopback and normal Tx path")
Reported-by: "John B. Wyatt IV" <jwyatt@redhat.com>
Closes: https://lore.kernel.org/all/Zic0ot5aGgR-V4Ks@thinkpad2021/
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/20240429091147.YWAaal4v@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agorxrpc: Fix using alignmask being zero for __page_frag_alloc_align()
Yunsheng Lin [Sun, 28 Apr 2024 11:16:38 +0000 (19:16 +0800)]
rxrpc: Fix using alignmask being zero for __page_frag_alloc_align()

rxrpc_alloc_data_txbuf() may be called with data_align being
zero in none_alloc_txbuf() and rxkad_alloc_txbuf(), data_align
is supposed to be an order-based alignment value, but zero is
not a valid order-based alignment value, and '~(data_align - 1)'
doesn't result in a valid mask-based alignment value for
__page_frag_alloc_align().

Fix it by passing a valid order-based alignment value in
none_alloc_txbuf() and rxkad_alloc_txbuf().

Also use page_frag_alloc_align() expecting an order-based
alignment value in rxrpc_alloc_data_txbuf() to avoid doing the
alignment converting operation and to catch possible invalid
alignment value in the future. Remove the 'if (data_align)'
checking too, as it is always true for a valid order-based
alignment value.

Fixes: 6b2536462fd4 ("rxrpc: Fix use of changed alignment param to page_frag_alloc_align()")
Fixes: 49489bb03a50 ("rxrpc: Do zerocopy using MSG_SPLICE_PAGES and page frags")
CC: Alexander Duyck <alexander.duyck@gmail.com>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Acked-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20240428111640.27306-1-linyunsheng@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agodrm/amd/display: Ensure that dmcub support flag is set for DCN20
Rodrigo Siqueira [Thu, 11 Apr 2024 23:38:08 +0000 (17:38 -0600)]
drm/amd/display: Ensure that dmcub support flag is set for DCN20

In the DCN20 resource initialization, ensure that DMCUB support starts
configured as true.

Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
17 months agodrm/amd/display: Handle Y carry-over in VCP X.Y calculation
George Shen [Thu, 16 Sep 2021 23:55:39 +0000 (19:55 -0400)]
drm/amd/display: Handle Y carry-over in VCP X.Y calculation

Theoretically rare corner case where ceil(Y) results in rounding up to
an integer. If this happens, the 1 should be carried over to the X
value.

CC: stable@vger.kernel.org
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: George Shen <george.shen@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
17 months agodrm/amdgpu: Fix VRAM memory accounting
Mukul Joshi [Tue, 23 Apr 2024 18:40:37 +0000 (14:40 -0400)]
drm/amdgpu: Fix VRAM memory accounting

Subtract the VRAM pinned memory when checking for available memory
in amdgpu_amdkfd_reserve_mem_limit function since that memory is not
available for use.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
17 months agoublk: remove segment count and size limits
Uday Shankar [Tue, 30 Apr 2024 21:16:24 +0000 (15:16 -0600)]
ublk: remove segment count and size limits

ublk_drv currently creates block devices with the default max_segments
and max_segment_size limits of BLK_MAX_SEGMENTS (128) and
BLK_MAX_SEGMENT_SIZE (65536) respectively. These defaults can
artificially constrain the I/O size seen by the ublk server - for
example, suppose that the ublk server has configured itself to accept
I/Os up to 1M and the application is also issuing 1M sized I/Os. If the
I/O buffer used by the application is backed by 4K pages, the buffer
could consist of up to 1M / 4K = 256 physically discontiguous segments
(even if the buffer is virtually contiguous). As such, the I/O could
exceed the default max_segments limit and get split. This can cause
unnecessary performance issues if the ublk server is optimized to handle
1M I/Os. The block layer's segment count/size limits exist to model
hardware constraints which don't exist in ublk_drv's case, so just
remove those limits for the block devices created by ublk_drv.

Signed-off-by: Uday Shankar <ushankar@purestorage.com>
Reviewed-by: Riley Thomasson <riley@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20240430211623.2802036-1-ushankar@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
17 months agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Tue, 30 Apr 2024 19:40:41 +0000 (12:40 -0700)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm

Pull kvm fix from Paolo Bonzini:
 "A pretty straightforward fix for a NULL pointer dereference, plus the
  accompanying reproducer"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: selftests: Add test for uaccesses to non-existent vgic-v2 CPUIF
  KVM: arm64: vgic-v2: Check for non-NULL vCPU in vgic_v2_parse_attr()

17 months agoMerge tag 'kvmarm-fixes-6.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmar...
Paolo Bonzini [Tue, 30 Apr 2024 17:50:55 +0000 (13:50 -0400)]
Merge tag 'kvmarm-fixes-6.9-2' of git://git./linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 fixes for 6.9, part #2

- Fix + test for a NULL dereference resulting from unsanitised user
  input in the vgic-v2 device attribute accessors

17 months agodrm/vmwgfx: Fix invalid reads in fence signaled events
Zack Rusin [Thu, 25 Apr 2024 19:27:48 +0000 (15:27 -0400)]
drm/vmwgfx: Fix invalid reads in fence signaled events

Correctly set the length of the drm_event to the size of the structure
that's actually used.

The length of the drm_event was set to the parent structure instead of
to the drm_vmw_event_fence which is supposed to be read. drm_read
uses the length parameter to copy the event to the user space thus
resuling in oob reads.

Signed-off-by: Zack Rusin <zack.rusin@broadcom.com>
Fixes: 8b7de6aa8468 ("vmwgfx: Rework fence event action")
Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-23566
Cc: David Airlie <airlied@gmail.com>
CC: Daniel Vetter <daniel@ffwll.ch>
Cc: Zack Rusin <zack.rusin@broadcom.com>
Cc: Broadcom internal kernel review list <bcm-kernel-feedback-list@broadcom.com>
Cc: dri-devel@lists.freedesktop.org
Cc: linux-kernel@vger.kernel.org
Cc: <stable@vger.kernel.org> # v3.4+
Reviewed-by: Maaz Mombasawala <maaz.mombasawala@broadcom.com>
Reviewed-by: Martin Krastev <martin.krastev@broadcom.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240425192748.1761522-1-zack.rusin@broadcom.com
17 months agodrm/nouveau/gsp: Use the sg allocator for level 2 of radix3
Lyude Paul [Mon, 29 Apr 2024 18:23:09 +0000 (14:23 -0400)]
drm/nouveau/gsp: Use the sg allocator for level 2 of radix3

Currently we allocate all 3 levels of radix3 page tables using
nvkm_gsp_mem_ctor(), which uses dma_alloc_coherent() for allocating all of
the relevant memory. This can end up failing in scenarios where the system
has very high memory fragmentation, and we can't find enough contiguous
memory to allocate level 2 of the page table.

Currently, this can result in runtime PM issues on systems where memory
fragmentation is high - as we'll fail to allocate the page table for our
suspend/resume buffer:

  kworker/10:2: page allocation failure: order:7, mode:0xcc0(GFP_KERNEL),
  nodemask=(null),cpuset=/,mems_allowed=0
  CPU: 10 PID: 479809 Comm: kworker/10:2 Not tainted
  6.8.6-201.ChopperV6.fc39.x86_64 #1
  Hardware name: SLIMBOOK Executive/Executive, BIOS N.1.10GRU06 02/02/2024
  Workqueue: pm pm_runtime_work
  Call Trace:
   <TASK>
   dump_stack_lvl+0x64/0x80
   warn_alloc+0x165/0x1e0
   ? __alloc_pages_direct_compact+0xb3/0x2b0
   __alloc_pages_slowpath.constprop.0+0xd7d/0xde0
   __alloc_pages+0x32d/0x350
   __dma_direct_alloc_pages.isra.0+0x16a/0x2b0
   dma_direct_alloc+0x70/0x270
   nvkm_gsp_radix3_sg+0x5e/0x130 [nouveau]
   r535_gsp_fini+0x1d4/0x350 [nouveau]
   nvkm_subdev_fini+0x67/0x150 [nouveau]
   nvkm_device_fini+0x95/0x1e0 [nouveau]
   nvkm_udevice_fini+0x53/0x70 [nouveau]
   nvkm_object_fini+0xb9/0x240 [nouveau]
   nvkm_object_fini+0x75/0x240 [nouveau]
   nouveau_do_suspend+0xf5/0x280 [nouveau]
   nouveau_pmops_runtime_suspend+0x3e/0xb0 [nouveau]
   pci_pm_runtime_suspend+0x67/0x1e0
   ? __pfx_pci_pm_runtime_suspend+0x10/0x10
   __rpm_callback+0x41/0x170
   ? __pfx_pci_pm_runtime_suspend+0x10/0x10
   rpm_callback+0x5d/0x70
   ? __pfx_pci_pm_runtime_suspend+0x10/0x10
   rpm_suspend+0x120/0x6a0
   pm_runtime_work+0x98/0xb0
   process_one_work+0x171/0x340
   worker_thread+0x27b/0x3a0
   ? __pfx_worker_thread+0x10/0x10
   kthread+0xe5/0x120
   ? __pfx_kthread+0x10/0x10
   ret_from_fork+0x31/0x50
   ? __pfx_kthread+0x10/0x10
   ret_from_fork_asm+0x1b/0x30

Luckily, we don't actually need to allocate coherent memory for the page
table thanks to being able to pass the GPU a radix3 page table for
suspend/resume data. So, let's rewrite nvkm_gsp_radix3_sg() to use the sg
allocator for level 2. We continue using coherent allocations for lvl0 and
1, since they only take a single page.

V2:
* Don't forget to actually jump to the next scatterlist when we reach the
  end of the scatterlist we're currently on when writing out the page table
  for level 2

Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: stable@vger.kernel.org
Reviewed-by: Ben Skeggs <bskeggs@nvidia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240429182318.189668-2-lyude@redhat.com
17 months agodrm/nouveau/firmware: Fix SG_DEBUG error with nvkm_firmware_ctor()
Lyude Paul [Mon, 29 Apr 2024 18:23:08 +0000 (14:23 -0400)]
drm/nouveau/firmware: Fix SG_DEBUG error with nvkm_firmware_ctor()

Currently, enabling SG_DEBUG in the kernel will cause nouveau to hit a
BUG() on startup:

  kernel BUG at include/linux/scatterlist.h:187!
  invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
  CPU: 7 PID: 930 Comm: (udev-worker) Not tainted 6.9.0-rc3Lyude-Test+ #30
  Hardware name: MSI MS-7A39/A320M GAMING PRO (MS-7A39), BIOS 1.I0 01/22/2019
  RIP: 0010:sg_init_one+0x85/0xa0
  Code: 69 88 32 01 83 e1 03 f6 c3 03 75 20 a8 01 75 1e 48 09 cb 41 89 54
  24 08 49 89 1c 24 41 89 6c 24 0c 5b 5d 41 5c e9 7b b9 88 00 <0f> 0b 0f 0b
  0f 0b 48 8b 05 5e 46 9a 01 eb b2 66 66 2e 0f 1f 84 00
  RSP: 0018:ffffa776017bf6a0 EFLAGS: 00010246
  RAX: 0000000000000000 RBX: ffffa77600d87000 RCX: 000000000000002b
  RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffffa77680d87000
  RBP: 000000000000e000 R08: 0000000000000000 R09: 0000000000000000
  R10: ffff98f4c46aa508 R11: 0000000000000000 R12: ffff98f4c46aa508
  R13: ffff98f4c46aa008 R14: ffffa77600d4a000 R15: ffffa77600d4a018
  FS:  00007feeb5aae980(0000) GS:ffff98f5c4dc0000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f22cb9a4520 CR3: 00000001043ba000 CR4: 00000000003506f0
  Call Trace:
   <TASK>
   ? die+0x36/0x90
   ? do_trap+0xdd/0x100
   ? sg_init_one+0x85/0xa0
   ? do_error_trap+0x65/0x80
   ? sg_init_one+0x85/0xa0
   ? exc_invalid_op+0x50/0x70
   ? sg_init_one+0x85/0xa0
   ? asm_exc_invalid_op+0x1a/0x20
   ? sg_init_one+0x85/0xa0
   nvkm_firmware_ctor+0x14a/0x250 [nouveau]
   nvkm_falcon_fw_ctor+0x42/0x70 [nouveau]
   ga102_gsp_booter_ctor+0xb4/0x1a0 [nouveau]
   r535_gsp_oneinit+0xb3/0x15f0 [nouveau]
   ? srso_return_thunk+0x5/0x5f
   ? srso_return_thunk+0x5/0x5f
   ? nvkm_udevice_new+0x95/0x140 [nouveau]
   ? srso_return_thunk+0x5/0x5f
   ? srso_return_thunk+0x5/0x5f
   ? ktime_get+0x47/0xb0
   ? srso_return_thunk+0x5/0x5f
   nvkm_subdev_oneinit_+0x4f/0x120 [nouveau]
   nvkm_subdev_init_+0x39/0x140 [nouveau]
   ? srso_return_thunk+0x5/0x5f
   nvkm_subdev_init+0x44/0x90 [nouveau]
   nvkm_device_init+0x166/0x2e0 [nouveau]
   nvkm_udevice_init+0x47/0x70 [nouveau]
   nvkm_object_init+0x41/0x1c0 [nouveau]
   nvkm_ioctl_new+0x16a/0x290 [nouveau]
   ? __pfx_nvkm_client_child_new+0x10/0x10 [nouveau]
   ? __pfx_nvkm_udevice_new+0x10/0x10 [nouveau]
   nvkm_ioctl+0x126/0x290 [nouveau]
   nvif_object_ctor+0x112/0x190 [nouveau]
   nvif_device_ctor+0x23/0x60 [nouveau]
   nouveau_cli_init+0x164/0x640 [nouveau]
   nouveau_drm_device_init+0x97/0x9e0 [nouveau]
   ? srso_return_thunk+0x5/0x5f
   ? pci_update_current_state+0x72/0xb0
   ? srso_return_thunk+0x5/0x5f
   nouveau_drm_probe+0x12c/0x280 [nouveau]
   ? srso_return_thunk+0x5/0x5f
   local_pci_probe+0x45/0xa0
   pci_device_probe+0xc7/0x270
   really_probe+0xe6/0x3a0
   __driver_probe_device+0x87/0x160
   driver_probe_device+0x1f/0xc0
   __driver_attach+0xec/0x1f0
   ? __pfx___driver_attach+0x10/0x10
   bus_for_each_dev+0x88/0xd0
   bus_add_driver+0x116/0x220
   driver_register+0x59/0x100
   ? __pfx_nouveau_drm_init+0x10/0x10 [nouveau]
   do_one_initcall+0x5b/0x320
   do_init_module+0x60/0x250
   init_module_from_file+0x86/0xc0
   idempotent_init_module+0x120/0x2b0
   __x64_sys_finit_module+0x5e/0xb0
   do_syscall_64+0x83/0x160
   ? srso_return_thunk+0x5/0x5f
   entry_SYSCALL_64_after_hwframe+0x71/0x79
  RIP: 0033:0x7feeb5cc20cd
  Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89
  f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0
  ff ff 73 01 c3 48 8b 0d 1b cd 0c 00 f7 d8 64 89 01 48
  RSP: 002b:00007ffcf220b2c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
  RAX: ffffffffffffffda RBX: 000055fdd2916aa0 RCX: 00007feeb5cc20cd
  RDX: 0000000000000000 RSI: 000055fdd29161e0 RDI: 0000000000000035
  RBP: 00007ffcf220b380 R08: 00007feeb5d8fb20 R09: 00007ffcf220b310
  R10: 000055fdd2909dc0 R11: 0000000000000246 R12: 000055fdd29161e0
  R13: 0000000000020000 R14: 000055fdd29203e0 R15: 000055fdd2909d80
   </TASK>

We hit this when trying to initialize firmware of type
NVKM_FIRMWARE_IMG_DMA because we allocate our memory with
dma_alloc_coherent, and DMA allocations can't be turned back into memory
pages - which a scatterlist needs in order to map them.

So, fix this by allocating the memory with vmalloc instead().

V2:
* Fixup explanation as the prior one was bogus

Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: stable@vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20240429182318.189668-1-lyude@redhat.com
17 months agoALSA: hda/realtek: Fix conflicting PCI SSID 17aa:386f for Lenovo Legion models
Takashi Iwai [Tue, 30 Apr 2024 16:32:04 +0000 (18:32 +0200)]
ALSA: hda/realtek: Fix conflicting PCI SSID 17aa:386f for Lenovo Legion models

Unfortunately both Lenovo Legion Pro 7 16ARX8H and Legion 7i 16IAX7
got the very same PCI SSID while the hardware implementations are
completely different (the former is with TI TAS2781 codec while the
latter is with Cirrus CS35L41 codec).  The former model got broken by
the recent fix for the latter model.

For addressing the regression, check the codec SSID and apply the
proper quirk for each model now.

Fixes: 24b6332c2d4f ("ALSA: hda: Add Lenovo Legion 7i gen7 sound quirk")
Cc: <stable@vger.kernel.org>
Link: https://bugzilla.suse.com/show_bug.cgi?id=1223462
Message-ID: <20240430163206.5200-1-tiwai@suse.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
17 months agoALSA: hda/realtek - Set GPIO3 to default at S4 state for Thinkpad with ALC1318
Kailang Yang [Tue, 30 Apr 2024 09:15:53 +0000 (17:15 +0800)]
ALSA: hda/realtek - Set GPIO3 to default at S4 state for Thinkpad with ALC1318

There is a chance of damaging the IC when S4 resume.
Add safe mode for no stream to disable GPIO3.
Thinkpad with ALC1318 platform need to add this workaround.

Signed-off-by: Kailang Yang <kailang@realtek.com>
Link: https://lore.kernel.org/r/a853dc4f0a4e412381d5f60565181247@realtek.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
17 months agoMerge tag 'for-v6.9-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux...
Linus Torvalds [Tue, 30 Apr 2024 16:15:25 +0000 (09:15 -0700)]
Merge tag 'for-v6.9-rc' of git://git./linux/kernel/git/sre/linux-power-supply

Pull power supply fixes from Sebastian Reichel:

 - mt6360_charger: Fix of_match for usb-otg-vbus regulator

 - rt9455: Fix unused-const-variable for !CONFIG_USB_PHY

* tag 'for-v6.9-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply:
  power: supply: mt6360_charger: Fix of_match for usb-otg-vbus regulator
  power: rt9455: hide unused rt9455_boost_voltage_values

17 months agoMerge tag 'platform-drivers-x86-v6.9-4' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Tue, 30 Apr 2024 16:06:05 +0000 (09:06 -0700)]
Merge tag 'platform-drivers-x86-v6.9-4' of git://git./linux/kernel/git/pdx86/platform-drivers-x86

Pull x86 platform driver fix from Ilpo Järvinen:

 - Add Grand Ridge to HPM CPU list

* tag 'platform-drivers-x86-v6.9-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
  platform/x86: ISST: Add Grand Ridge to HPM CPU list

17 months agoMerge tag 'pinctrl-v6.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw...
Linus Torvalds [Tue, 30 Apr 2024 15:50:58 +0000 (08:50 -0700)]
Merge tag 'pinctrl-v6.9-2' of git://git./linux/kernel/git/linusw/linux-pinctrl

Pull pin control fixes from Linus Walleij:

 - Fix a double-free in the pinctrl_enable() errorpath

 - Fix a refcount leak in pinctrl_dt_to_map()

 - Fix selecting the GPIO pin control state and the UART3 pin config
   group in the Intel Baytrail driver

 - Fix readback of schmitt trigger status in the Mediatek Paris driver,
   along with some semantic pin config issues in this driver

 - Fix a pin suffix typo in the Meson A1 driver

 - Fix an erroneous register offset in he Aspeed G6 driver

 - Fix an inconsistent lock state and the interrupt type on resume in
   the Renesas RZG2L driver

 - Fix some minor confusion in the Renesas DT bindings

* tag 'pinctrl-v6.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: renesas: rzg2l: Configure the interrupt type on resume
  pinctrl: devicetree: fix refcount leak in pinctrl_dt_to_map()
  pinctrl: baytrail: Add pinconf group for uart3
  pinctrl: baytrail: Fix selecting gpio pinctrl state
  pinctrl: mediatek: paris: Rework support for PIN_CONFIG_{INPUT,OUTPUT}_ENABLE
  pinctrl: mediatek: paris: Fix PIN_CONFIG_INPUT_SCHMITT_ENABLE readback
  pinctrl: core: delete incorrect free in pinctrl_enable()
  pinctrl/meson: fix typo in PDM's pin name
  pinctrl: pinctrl-aspeed-g6: Fix register offset for pinconf of GPIOR-T
  pinctrl: renesas: rzg2l: Execute atomically the interrupt configuration
  dt-bindings: pinctrl: renesas,rzg2l-pinctrl: Allow 'input' and 'output-enable' properties

17 months agoASoC: meson: tdm fixes
Mark Brown [Tue, 30 Apr 2024 14:36:23 +0000 (23:36 +0900)]
ASoC: meson: tdm fixes

Merge series from Jerome Brunet <jbrunet@baylibre.com>:

This patchset fixes 2 problems on TDM which both find a solution
by properly implementing the .trigger() callback for the TDM backend.

ATM, enabling the TDM formatters is done by the .prepare() callback
because handling the formatter is slow due to necessary calls to CCF.

The first problem affects the TDMIN. Because .prepare() is called on DPCM
backend first, the formatter are started before the FIFOs and this may
cause a random channel shifts if the TDMIN use multiple lanes with more
than 2 slots per lanes. Using trigger() allows to set the FE/BE order,
solving the problem.

There has already been an attempt to fix this 3y ago [1] and reverted [2]
It triggered a 'sleep in irq' error on the period IRQ. The solution is
to just use the bottom half of threaded IRQ. This is patch #1. Patch #2
and #3 remain mostly the same as 3y ago.

For TDMOUT, the problem is on pause. ATM pause only stops the FIFO and
the TDMOUT just starves. When it does, it will actually repeat the last
sample continuously. Depending on the platform, if there is no high-pass
filter on the analog path, this may translate to a constant position of
the speaker membrane. There is no audible glitch but it may damage the
speaker coil.

Properly stopping the TDMOUT in pause solves the problem. There is
behaviour change associated with that fix. Clocks used to be continuous
on pause because of the problem above. They will now be gated on pause by
default, as they should. The last change introduce the proper support for
continuous clocks, if needed.

[1]: https://lore.kernel.org/linux-amlogic/20211020114217.133153-1-jbrunet@baylibre.com
[2]: https://lore.kernel.org/linux-amlogic/20220421155725.2589089-1-narmstrong@baylibre.com

17 months agobtrfs: set correct ram_bytes when splitting ordered extent
Qu Wenruo [Mon, 15 Apr 2024 22:37:00 +0000 (08:07 +0930)]
btrfs: set correct ram_bytes when splitting ordered extent

[BUG]
When running generic/287, the following file extent items can be
generated:

        item 16 key (258 EXTENT_DATA 2682880) itemoff 15305 itemsize 53
                generation 9 type 1 (regular)
                extent data disk byte 1378414592 nr 462848
                extent data offset 0 nr 462848 ram 2097152
                extent compression 0 (none)

Note that file extent item is not a compressed one, but its ram_bytes is
way larger than its disk_num_bytes.

According to btrfs on-disk scheme, ram_bytes should match disk_num_bytes
if it's not a compressed one.

[CAUSE]
Since commit b73a6fd1b1ef ("btrfs: split partial dio bios before
submit"), for partial dio writes, we would split the ordered extent.

However the function btrfs_split_ordered_extent() doesn't update the
ram_bytes even it has already shrunk the disk_num_bytes.

Originally the function btrfs_split_ordered_extent() is only introduced
for zoned devices in commit d22002fd37bd ("btrfs: zoned: split ordered
extent when bio is sent"), but later commit b73a6fd1b1ef ("btrfs: split
partial dio bios before submit") makes non-zoned btrfs affected.

Thankfully for un-compressed file extent, we do not really utilize the
ram_bytes member, thus it won't cause any real problem.

[FIX]
Also update btrfs_ordered_extent::ram_bytes inside
btrfs_split_ordered_extent().

Fixes: d22002fd37bd ("btrfs: zoned: split ordered extent when bio is sent")
CC: stable@vger.kernel.org # 5.15+
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
17 months agoMerge tag 'wq-for-6.9-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Mon, 29 Apr 2024 22:57:37 +0000 (15:57 -0700)]
Merge tag 'wq-for-6.9-rc6-fixes' of git://git./linux/kernel/git/tj/wq

Pull workqueue fixes from Tejun Heo:
 "Two doc update patches and the following three fixes:

   - On single node systems, the default pool is used but the
     node_nr_active for the default pool was set to min_active. This
     effectively limited the max concurrency of unbound pools on single
     node systems to 8 causing performance regressions on some
     workloads. Fixed by setting the default pool's node_nr_active to
     max_active.

   - wq_update_node_max_active() could trigger divide-by-zero if the
     intersection between the allowed CPUs for an unbound workqueue and
     online CPUs becomes empty.

   - When kick_pool() was trying to repatriate a worker to a CPU in its
     pod by setting task->wake_cpu, it didn't consider whether the CPU
     being selected is online or not which obviously can lead to
     subobtimal behaviors. On s390, this triggered a crash in arch code.
     The workqueue patch removes the gross misbehavior but doesn't fix
     the crash completely as there's a race window in which CPUs can go
     down after wake_cpu is set. Need to decide whether the fix should
     be on the core or arch side"

* tag 'wq-for-6.9-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: Fix divide error in wq_update_node_max_active()
  workqueue: The default node_nr_active should have its max set to max_active
  workqueue: Fix selection of wake_cpu in kick_pool()
  docs/zh_CN: core-api: Update translation of workqueue.rst to 6.9-rc1
  Documentation/core-api: Update events_freezable_power references.

17 months agoMerge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds [Mon, 29 Apr 2024 21:36:53 +0000 (14:36 -0700)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi

Pull SCSI fix from James Bottomley:
 "Minor core fix to prevent the sd driver printing the stream count
  every time we rescan and instead print only if it's changed"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: sd: Only print updates to permanent stream count

17 months agoMerge tag 'nfsd-6.9-6' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Linus Torvalds [Mon, 29 Apr 2024 21:22:24 +0000 (14:22 -0700)]
Merge tag 'nfsd-6.9-6' of git://git./linux/kernel/git/cel/linux

Pull nfsd fix from Chuck Lever:

 - Avoid freeing unallocated memory (v6.7 regression)

* tag 'nfsd-6.9-6' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  NFSD: Fix nfsd4_encode_fattr4() crasher

17 months agoMerge tag 'nfs-for-6.9-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Linus Torvalds [Mon, 29 Apr 2024 19:07:37 +0000 (12:07 -0700)]
Merge tag 'nfs-for-6.9-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client fixes from Trond Myklebust:

 - Fix an Oops in xs_tcp_tls_setup_socket

 - Fix an Oops due to missing error handling in nfs_net_init()

* tag 'nfs-for-6.9-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  nfs: Handle error of rpc_proc_register() in nfs_net_init().
  SUNRPC: add a missing rpc_stat for TCP TLS

17 months agoMerge tag 'bcachefs-2024-04-29' of https://evilpiepirate.org/git/bcachefs
Linus Torvalds [Mon, 29 Apr 2024 18:04:40 +0000 (11:04 -0700)]
Merge tag 'bcachefs-2024-04-29' of https://evilpiepirate.org/git/bcachefs

Pull bcachefs fixes from Kent Overstreet:
 "Tiny set of fixes this time"

* tag 'bcachefs-2024-04-29' of https://evilpiepirate.org/git/bcachefs:
  bcachefs: fix integer conversion bug
  bcachefs: btree node scan now fills in sectors_written
  bcachefs: Remove accidental debug assert

17 months agos390/cio: Ensure the copied buf is NUL terminated
Bui Quang Minh [Wed, 24 Apr 2024 14:44:22 +0000 (21:44 +0700)]
s390/cio: Ensure the copied buf is NUL terminated

Currently, we allocate a lbuf-sized kernel buffer and copy lbuf from
userspace to that buffer. Later, we use scanf on this buffer but we don't
ensure that the string is terminated inside the buffer, this can lead to
OOB read when using scanf. Fix this issue by using memdup_user_nul instead.

Fixes: a4f17cc72671 ("s390/cio: add CRW inject functionality")
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Link: https://lore.kernel.org/r/20240424-fix-oob-read-v2-5-f1f1b53a10f4@gmail.com
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
17 months agoMerge tag 'erofs-for-6.9-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Mon, 29 Apr 2024 15:52:08 +0000 (08:52 -0700)]
Merge tag 'erofs-for-6.9-rc7-fixes' of git://git./linux/kernel/git/xiang/erofs

Pull erofs fixes from Gao Xiang:
 "Three fixes related to EROFS fscache mode. The most important two
  patches fix calling kill_block_super() in bdev-based mode instead of
  kill_anon_super(). The remaining patch is an informative one.

  Summary:

   - Better error message when prepare_ondemand_read failed

   - Fix unmount of bdev-based mode if CONFIG_EROFS_FS_ONDEMAND is on"

* tag 'erofs-for-6.9-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
  erofs: reliably distinguish block based and fscache mode
  erofs: get rid of erofs_fs_context
  erofs: modify the error message when prepare_ondemand_read failed

17 months agoALSA: hda: intel-sdw-acpi: fix usage of device_get_named_child_node()
Pierre-Louis Bossart [Fri, 26 Apr 2024 15:27:31 +0000 (10:27 -0500)]
ALSA: hda: intel-sdw-acpi: fix usage of device_get_named_child_node()

The documentation for device_get_named_child_node() mentions this
important point:

"
The caller is responsible for calling fwnode_handle_put() on the
returned fwnode pointer.
"

Add fwnode_handle_put() to avoid a leaked reference.

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Fixes: 08c2a4bc9f2a ("ALSA: hda: move Intel SoundWire ACPI scan to dedicated module")
Message-ID: <20240426152731.38420-1-pierre-louis.bossart@linux.intel.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
17 months agobounds: Use the right number of bits for power-of-two CONFIG_NR_CPUS
Matthew Wilcox (Oracle) [Mon, 29 Apr 2024 14:47:51 +0000 (15:47 +0100)]
bounds: Use the right number of bits for power-of-two CONFIG_NR_CPUS

bits_per() rounds up to the next power of two when passed a power of
two.  This causes crashes on some machines and configurations.

Reported-by: Михаил Новоселов <m.novosyolov@rosalinux.ru>
Tested-by: Ильфат Гаптрахманов <i.gaptrakhmanov@rosalinux.ru>
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3347
Link: https://lore.kernel.org/all/1c978cf1-2934-4e66-e4b3-e81b04cb3571@rosalinux.ru/
Fixes: f2d5dcb48f7b (bounds: support non-power-of-two CONFIG_NR_CPUS)
Cc: <stable@vger.kernel.org>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 months agoALSA: hda: intel-dsp-config: harden I2C/I2S codec detection
Pierre-Louis Bossart [Fri, 26 Apr 2024 15:28:18 +0000 (10:28 -0500)]
ALSA: hda: intel-dsp-config: harden I2C/I2S codec detection

The SOF driver is selected whenever specific I2C/I2S HIDs are reported
as 'present' in the ACPI DSDT. In some cases, an HID is reported but
the hardware does not actually rely on I2C/I2S.  This false positive
leads to an invalid selection of the SOF driver and as a result an
invalid topology is loaded.

This patch hardens the detection with a check that the NHLT table is
consistent with the report of an I2S-based codec in DSDT. This table
should expose at least one SSP endpoint configured for an I2S-codec
connection.

Tested on Huawei Matebook D14 (NBLB-WAX9N) using an HDaudio codec with
an invalid ES8336 ACPI HID reported:

[    7.858249] snd_hda_intel 0000:00:1f.3: DSP detected with PCI class/subclass/prog-if info 0x040380
[    7.858312] snd_hda_intel 0000:00:1f.3: snd_intel_dsp_find_config: no valid SSP found for HID ESSX8336, skipped

Reported-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Tested-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Closes: https://github.com/thesofproject/linux/issues/4934
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com>
Message-ID: <20240426152818.38443-1-pierre-louis.bossart@linux.intel.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
17 months agoASoC: cs35l56: fix usages of device_get_named_child_node()
Pierre-Louis Bossart [Fri, 26 Apr 2024 15:29:39 +0000 (10:29 -0500)]
ASoC: cs35l56: fix usages of device_get_named_child_node()

The documentation for device_get_named_child_node() mentions this
important point:

"
The caller is responsible for calling fwnode_handle_put() on the
returned fwnode pointer.
"

Add fwnode_handle_put() to avoid leaked references.

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Link: https://lore.kernel.org/r/20240426152939.38471-1-pierre-louis.bossart@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
17 months agoASoC: da7219-aad: fix usage of device_get_named_child_node()
Pierre-Louis Bossart [Fri, 26 Apr 2024 15:30:33 +0000 (10:30 -0500)]
ASoC: da7219-aad: fix usage of device_get_named_child_node()

The documentation for device_get_named_child_node() mentions this
important point:

"
The caller is responsible for calling fwnode_handle_put() on the
returned fwnode pointer.
"

Add fwnode_handle_put() to avoid a leaked reference.

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Link: https://lore.kernel.org/r/20240426153033.38500-1-pierre-louis.bossart@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
17 months agoASoC: meson: cards: select SND_DYNAMIC_MINORS
Jerome Brunet [Fri, 26 Apr 2024 13:41:47 +0000 (15:41 +0200)]
ASoC: meson: cards: select SND_DYNAMIC_MINORS

Amlogic sound cards do create a lot of pcm interfaces, possibly more than
8. Some pcm interfaces are internal (like DPCM backends and c2c) and not
exposed to userspace.

Those interfaces still increase the number passed to snd_find_free_minor(),
which eventually exceeds 8 causing -EBUSY error on card registration if
CONFIG_SND_DYNAMIC_MINORS=n and the interface is exposed to userspace.

select CONFIG_SND_DYNAMIC_MINORS for Amlogic cards to avoid the problem.

Fixes: 7864a79f37b5 ("ASoC: meson: add axg sound card support")
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Link: https://lore.kernel.org/r/20240426134150.3053741-1-jbrunet@baylibre.com
Signed-off-by: Mark Brown <broonie@kernel.org>
17 months agoASoC: meson: axg-tdm: add continuous clock support
Jerome Brunet [Fri, 26 Apr 2024 15:29:41 +0000 (17:29 +0200)]
ASoC: meson: axg-tdm: add continuous clock support

Some devices may need the clocks running, even while paused.
Add support for this use case.

Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Link: https://lore.kernel.org/r/20240426152946.3078805-5-jbrunet@baylibre.com
Signed-off-by: Mark Brown <broonie@kernel.org>
17 months agoASoC: meson: axg-tdm-interface: manage formatters in trigger
Jerome Brunet [Fri, 26 Apr 2024 15:29:40 +0000 (17:29 +0200)]
ASoC: meson: axg-tdm-interface: manage formatters in trigger

So far, the formatters have been reset/enabled using the .prepare()
callback. This was done in this callback because walking the formatters use
a mutex. A mutex is used because formatter handling require dealing
possibly slow clock operation.

With the support of non-atomic, .trigger() callback may be used which also
allows to properly enable and disable formatters on start but also
pause/resume.

This solve a random shift on TDMIN as well repeated samples on for TDMOUT.

Fixes: d60e4f1e4be5 ("ASoC: meson: add tdm interface driver")
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Link: https://lore.kernel.org/r/20240426152946.3078805-4-jbrunet@baylibre.com
Signed-off-by: Mark Brown <broonie@kernel.org>
17 months agoASoC: meson: axg-card: make links nonatomic
Jerome Brunet [Fri, 26 Apr 2024 15:29:39 +0000 (17:29 +0200)]
ASoC: meson: axg-card: make links nonatomic

Non atomic operations need to be performed in the trigger callback
of the TDM interfaces. Those are BEs but what matters is the nonatomic
flag of the FE in the DPCM context. Just set nonatomic for everything so,
at least, what is done is clear.

Fixes: 7864a79f37b5 ("ASoC: meson: add axg sound card support")
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Link: https://lore.kernel.org/r/20240426152946.3078805-3-jbrunet@baylibre.com
Signed-off-by: Mark Brown <broonie@kernel.org>
17 months agoASoC: meson: axg-fifo: use threaded irq to check periods
Jerome Brunet [Fri, 26 Apr 2024 15:29:38 +0000 (17:29 +0200)]
ASoC: meson: axg-fifo: use threaded irq to check periods

With the AXG audio subsystem, there is a possible random channel shift on
TDM capture, when the slot number per lane is more than 2, and there is
more than one lane used.

The problem has been there since the introduction of the axg audio support
but such scenario is pretty uncommon. This is why there is no loud
complains about the problem.

Solving the problem require to make the links non-atomic and use the
trigger() callback to start FEs and BEs in the appropriate order.

This was tried in the past and reverted because it caused the block irq to
sleep while atomic. However, instead of reverting, the solution is to call
snd_pcm_period_elapsed() in a non atomic context.

Use the bottom half of a threaded IRQ to do so.

Fixes: 6dc4fa179fb8 ("ASoC: meson: add axg fifo base driver")
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Link: https://lore.kernel.org/r/20240426152946.3078805-2-jbrunet@baylibre.com
Signed-off-by: Mark Brown <broonie@kernel.org>
17 months agoMerge branch 'vxlan-stats'
David S. Miller [Mon, 29 Apr 2024 12:39:15 +0000 (13:39 +0100)]
Merge branch 'vxlan-stats'

Guillaume Nault says:

====================
vxlan: Fix vxlan counters.

Like most virtual devices, vxlan needs special care when updating its
netdevice counters. This is done in patch 1. Patch 2 just adds a
missing VNI counter update (found while working on patch 1).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
17 months agovxlan: Add missing VNI filter counter update in arp_reduce().
Guillaume Nault [Fri, 26 Apr 2024 15:27:19 +0000 (17:27 +0200)]
vxlan: Add missing VNI filter counter update in arp_reduce().

VXLAN stores per-VNI statistics using vxlan_vnifilter_count().
These statistics were not updated when arp_reduce() failed its
pskb_may_pull() call.

Use vxlan_vnifilter_count() to update the VNI counter when that
happens.

Fixes: 4095e0e1328a ("drivers: vxlan: vnifilter: per vni stats")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 months agovxlan: Fix racy device stats updates.
Guillaume Nault [Fri, 26 Apr 2024 15:27:17 +0000 (17:27 +0200)]
vxlan: Fix racy device stats updates.

VXLAN devices update their stats locklessly. Therefore these counters
should either be stored in per-cpu data structures or the updates
should be done using atomic increments.

Since the net_device_core_stats infrastructure is already used in
vxlan_rcv(), use it for the other rx_dropped and tx_dropped counter
updates. Update the other counters atomically using DEV_STATS_INC().

Fixes: d342894c5d2f ("vxlan: virtual extensible lan")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 months agoALSA: hda/realtek: Fix mute led of HP Laptop 15-da3001TU
Aman Dhoot [Mon, 22 Apr 2024 12:38:23 +0000 (18:08 +0530)]
ALSA: hda/realtek: Fix mute led of HP Laptop 15-da3001TU

This patch simply add SND_PCI_QUIRK for HP Laptop 15-da3001TU to fixed
mute led of laptop.

Signed-off-by: Aman Dhoot <amandhoot12@gmail.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/CAMTp=B+3NG65Z684xMwHqdXDJhY+DJK-kuSw4adn6xwnG+b5JA@mail.gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>