git.maquefel.me Git - linux.git/log

wifi: iwlwifi: fix a memory corruption

iwl_fw_ini_trigger_tlv::data is a pointer to a __le32, which means that
if we copy to iwl_fw_ini_trigger_tlv::data + offset while offset is in
bytes, we'll write past the buffer.

Cc: stable@vger.kernel.org
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218233
Fixes: cf29c5b66b9f ("iwlwifi: dbg_ini: implement time point handling")
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://msgid.link/20240111150610.2d2b8b870194.I14ed76505a5cf87304e0c9cc05cc0ae85ed3bf91@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

wifi: mac80211: fix potential sta-link leak

When a station is allocated, links are added but not
set to valid yet (e.g. during connection to an AP MLD),
we might remove the station without ever marking links
valid, and leak them. Fix that.

Fixes: cb71f1d136a6 ("wifi: mac80211: add sta link addition/removal")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Reviewed-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://msgid.link/20240111181514.6573998beaf8.I09ac2e1d41c80f82a5a616b8bd1d9d8dd709a6a6@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

wifi: cfg80211/mac80211: remove dependency on non-existing option

Commit ffbd0c8c1e7f ("wifi: mac80211: add an element parsing unit test")
and commit 730eeb17bbdd ("wifi: cfg80211: add first kunit tests, for
element defrag") add new configs that depend on !KERNEL_6_2, but the config
option KERNEL_6_2 does not exist in the tree. This dependency is used for
handling backporting to restrict the option to certain kernels but this
really should not be carried around the mainline kernel tree.

Clean up this needless dependency on the non-existing option KERNEL_6_2.

Link: https://lore.kernel.org/lkml/CAKXUXMyfrM6amOR7Ysim3WNQ-Ckf9HJDqRhAoYmLXujo1UV+yA@mail.gmail.com/
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

wifi: cfg80211: fix missing interfaces when dumping

The nl80211_dump_interface() supports resumption
in case nl80211_send_iface() doesn't have the
resources to complete its work.

The logic would store the progress as iteration
offsets for rdev and wdev loops.

However the logic did not properly handle
resumption for non-last rdev. Assuming a system
with 2 rdevs, with 2 wdevs each, this could
happen:

dump(cb=[0, 0]):
  if_start=cb[1] (=0)
  send rdev0.wdev0 -> ok
  send rdev0.wdev1 -> yield
  cb[1] = 1

dump(cb=[0, 1]):
  if_start=cb[1] (=1)
  send rdev0.wdev1 -> ok
  // since if_start=1 the rdev0.wdev0 got skipped
  // through if_idx < if_start
  send rdev1.wdev1 -> ok

The if_start needs to be reset back to 0 upon wdev
loop end.

The problem is actually hard to hit on a desktop,
and even on most routers. The prerequisites for
this manifesting was:
- more than 1 wiphy
- a few handful of interfaces
- dump without rdev or wdev filter

I was seeing this with 4 wiphys 9 interfaces each.
It'd miss 6 interfaces from the last wiphy
reported to userspace.

Signed-off-by: Michal Kazior <michal@plume.com>
Link: https://msgid.link/20240116142340.89678-1-kazikcz@gmail.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

wifi: ath11k: rely on mac80211 debugfs handling for vif

mac80211 started to delete debugfs entries in certain cases, causing a
ath11k to crash when it tried to delete the entries later. Fix this by
relying on mac80211 to delete the entries when appropriate and adding
them from the vif_add_debugfs handler.

Fixes: 0a3d898ee9a8 ("wifi: mac80211: add/remove driver debugfs entries as appropriate")
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218364
Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://msgid.link/20240115101805.1277949-1-benjamin@sipsolutions.net

wifi: p54: fix GCC format truncation warning with wiphy->fw_version

GCC 13.2 warns:

drivers/net/wireless/intersil/p54/fwio.c:128:34: warning: '%s' directive output may be truncated writing up to 39 bytes into a region of size 32 [-Wformat-truncation=]
drivers/net/wireless/intersil/p54/fwio.c:128:33: note: directive argument in the range [0, 16777215]
drivers/net/wireless/intersil/p54/fwio.c:128:33: note: directive argument in the range [0, 255]
drivers/net/wireless/intersil/p54/fwio.c:127:17: note: 'snprintf' output between 7 and 52 bytes into a destination of size 32

The issue here is that wiphy->fw_version is 32 bytes and in theory the string
we try to place there can be 39 bytes. wiphy->fw_version is used for providing
the firmware version to user space via ethtool, so not really important.
fw_version in theory can be 24 bytes but in practise it's shorter, so even if
print only 19 bytes via ethtool there should not be any practical difference.

I did consider removing fw_var from the string altogether or making the maximum
length for fw_version 19 bytes, but chose this approach as it was the least
intrusive.

Compile tested only.

Signed-off-by: Kalle Valo <kvalo@kernel.org>
Acked-by: Christian Lamparter <chunkeey@gmail.com> # Tested with Dell 1450 USB
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://msgid.link/20231219162516.898205-1-kvalo@kernel.org

net: ethernet: cortina: Drop TSO support

The recent change to allow large frames without hardware checksumming
slotted in software checksumming in the driver if hardware could not
do it.

This will however upset TSO (TCP Segment Offloading). Typical
error dumps includes this:

skb len=2961 headroom=222 headlen=66 tailroom=0
(...)
WARNING: CPU: 0 PID: 956 at net/core/dev.c:3259 skb_warn_bad_offload+0x7c/0x108
gemini-ethernet-port: caps=(0x0000010000154813, 0x00002007ffdd7889)

And the packets do not go through.

The TSO implementation is bogus: a TSO enabled driver must propagate
the skb_shinfo(skb)->gso_size value to the TSO engine on the NIC.

Drop the size check and TSO offloading features for now: this
needs to be fixed up properly.

After this ethernet works fine on Gemini devices with a direct connected
PHY such as D-Link DNS-313.

Also tested to still be working with a DSA switch using the Gemini
ethernet as conduit interface.

Link: https://lore.kernel.org/netdev/CANn89iJLfxng1sYL5Zk0mknXpyYQPCp83m3KgD2KJ2_hKCpEUg@mail.gmail.com/
Suggested-by: Eric Dumazet <edumazet@google.com>
Fixes: d4d0c5b4d279 ("net: ethernet: cortina: Handle large frames")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: stmmac: fix ethtool per-queue statistics

Fix per-queue statistics for devices with more than one queue.

The output data pointer is currently reset in each loop iteration,
effectively summing all queue statistics in the first four u64 values.

The summary values are not even labeled correctly. For example, if eth0 has
2 queues, ethtool -S eth0 shows:

     q0_tx_pkt_n: 374 (actually tx_pkt_n over all queues)
     q0_tx_irq_n: 23  (actually tx_normal_irq_n over all queues)
     q1_tx_pkt_n: 462 (actually rx_pkt_n over all queues)
     q1_tx_irq_n: 446 (actually rx_normal_irq_n over all queues)
     q0_rx_pkt_n: 0
     q0_rx_irq_n: 0
     q1_rx_pkt_n: 0
     q1_rx_irq_n: 0

Fixes: 133466c3bbe1 ("net: stmmac: use per-queue 64 bit statistics where necessary")
Cc: stable@vger.kernel.org
Signed-off-by: Petr Tesarik <petr@tesarici.cz>
Reviewed-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

ip6_tunnel: fix NEXTHDR_FRAGMENT handling in ip6_tnl_parse_tlv_enc_lim()

syzbot pointed out [1] that NEXTHDR_FRAGMENT handling is broken.

Reading frag_off can only be done if we pulled enough bytes
to skb->head. Currently we might access garbage.

[1]
BUG: KMSAN: uninit-value in ip6_tnl_parse_tlv_enc_lim+0x94f/0xbb0
ip6_tnl_parse_tlv_enc_lim+0x94f/0xbb0
ipxip6_tnl_xmit net/ipv6/ip6_tunnel.c:1326 [inline]
ip6_tnl_start_xmit+0xab2/0x1a70 net/ipv6/ip6_tunnel.c:1432
__netdev_start_xmit include/linux/netdevice.h:4940 [inline]
netdev_start_xmit include/linux/netdevice.h:4954 [inline]
xmit_one net/core/dev.c:3548 [inline]
dev_hard_start_xmit+0x247/0xa10 net/core/dev.c:3564
__dev_queue_xmit+0x33b8/0x5130 net/core/dev.c:4349
dev_queue_xmit include/linux/netdevice.h:3134 [inline]
neigh_connected_output+0x569/0x660 net/core/neighbour.c:1592
neigh_output include/net/neighbour.h:542 [inline]
ip6_finish_output2+0x23a9/0x2b30 net/ipv6/ip6_output.c:137
ip6_finish_output+0x855/0x12b0 net/ipv6/ip6_output.c:222
NF_HOOK_COND include/linux/netfilter.h:303 [inline]
ip6_output+0x323/0x610 net/ipv6/ip6_output.c:243
dst_output include/net/dst.h:451 [inline]
ip6_local_out+0xe9/0x140 net/ipv6/output_core.c:155
ip6_send_skb net/ipv6/ip6_output.c:1952 [inline]
ip6_push_pending_frames+0x1f9/0x560 net/ipv6/ip6_output.c:1972
rawv6_push_pending_frames+0xbe8/0xdf0 net/ipv6/raw.c:582
rawv6_sendmsg+0x2b66/0x2e70 net/ipv6/raw.c:920
inet_sendmsg+0x105/0x190 net/ipv4/af_inet.c:847
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg net/socket.c:745 [inline]
____sys_sendmsg+0x9c2/0xd60 net/socket.c:2584
___sys_sendmsg+0x28d/0x3c0 net/socket.c:2638
__sys_sendmsg net/socket.c:2667 [inline]
__do_sys_sendmsg net/socket.c:2676 [inline]
__se_sys_sendmsg net/socket.c:2674 [inline]
__x64_sys_sendmsg+0x307/0x490 net/socket.c:2674
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0x44/0x110 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x63/0x6b

Uninit was created at:
slab_post_alloc_hook+0x129/0xa70 mm/slab.h:768
slab_alloc_node mm/slub.c:3478 [inline]
__kmem_cache_alloc_node+0x5c9/0x970 mm/slub.c:3517
__do_kmalloc_node mm/slab_common.c:1006 [inline]
__kmalloc_node_track_caller+0x118/0x3c0 mm/slab_common.c:1027
kmalloc_reserve+0x249/0x4a0 net/core/skbuff.c:582
pskb_expand_head+0x226/0x1a00 net/core/skbuff.c:2098
__pskb_pull_tail+0x13b/0x2310 net/core/skbuff.c:2655
pskb_may_pull_reason include/linux/skbuff.h:2673 [inline]
pskb_may_pull include/linux/skbuff.h:2681 [inline]
ip6_tnl_parse_tlv_enc_lim+0x901/0xbb0 net/ipv6/ip6_tunnel.c:408
ipxip6_tnl_xmit net/ipv6/ip6_tunnel.c:1326 [inline]
ip6_tnl_start_xmit+0xab2/0x1a70 net/ipv6/ip6_tunnel.c:1432
__netdev_start_xmit include/linux/netdevice.h:4940 [inline]
netdev_start_xmit include/linux/netdevice.h:4954 [inline]
xmit_one net/core/dev.c:3548 [inline]
dev_hard_start_xmit+0x247/0xa10 net/core/dev.c:3564
__dev_queue_xmit+0x33b8/0x5130 net/core/dev.c:4349
dev_queue_xmit include/linux/netdevice.h:3134 [inline]
neigh_connected_output+0x569/0x660 net/core/neighbour.c:1592
neigh_output include/net/neighbour.h:542 [inline]
ip6_finish_output2+0x23a9/0x2b30 net/ipv6/ip6_output.c:137
ip6_finish_output+0x855/0x12b0 net/ipv6/ip6_output.c:222
NF_HOOK_COND include/linux/netfilter.h:303 [inline]
ip6_output+0x323/0x610 net/ipv6/ip6_output.c:243
dst_output include/net/dst.h:451 [inline]
ip6_local_out+0xe9/0x140 net/ipv6/output_core.c:155
ip6_send_skb net/ipv6/ip6_output.c:1952 [inline]
ip6_push_pending_frames+0x1f9/0x560 net/ipv6/ip6_output.c:1972
rawv6_push_pending_frames+0xbe8/0xdf0 net/ipv6/raw.c:582
rawv6_sendmsg+0x2b66/0x2e70 net/ipv6/raw.c:920
inet_sendmsg+0x105/0x190 net/ipv4/af_inet.c:847
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg net/socket.c:745 [inline]
____sys_sendmsg+0x9c2/0xd60 net/socket.c:2584
___sys_sendmsg+0x28d/0x3c0 net/socket.c:2638
__sys_sendmsg net/socket.c:2667 [inline]
__do_sys_sendmsg net/socket.c:2676 [inline]
__se_sys_sendmsg net/socket.c:2674 [inline]
__x64_sys_sendmsg+0x307/0x490 net/socket.c:2674
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0x44/0x110 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x63/0x6b

CPU: 0 PID: 7345 Comm: syz-executor.3 Not tainted 6.7.0-rc8-syzkaller-00024-gac865f00af29 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023

Fixes: fbfa743a9d2a ("ipv6: fix ip6_tnl_parse_tlv_enc_lim()")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

rxrpc: Fix skbuff cleanup of call's recvmsg_queue and rx_oos_queue

Fix rxrpc_cleanup_ring() to use rxrpc_purge_queue() rather than
skb_queue_purge() so that the count of outstanding skbuffs is correctly
updated when a failed call is cleaned up.

Without this rmmod may hang waiting for rxrpc_n_rx_skbs to become zero.

Fixes: 5d7edbc9231e ("rxrpc: Get rid of the Rx ring")
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: linux-afs@lists.infradead.org
cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxbf_gige: Enable the GigE port in mlxbf_gige_open

At the moment, the GigE port is enabled in the mlxbf_gige_probe
function. If the mlxbf_gige_open is not executed, this could cause
pause frames to increase in the case where there is high backgroud
traffic. This results in clogging the port.
So move enabling the OOB port to mlxbf_gige_open.

Fixes: f92e1869d74e ("Add Mellanox BlueField Gigabit Ethernet driver")
Reviewed-by: David Thompson <davthompson@nvidia.com>
Signed-off-by: Asmaa Mnebhi <asmaa@nvidia.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxbf_gige: Fix intermittent no ip issue

Although the link is up, there is no ip assigned on setups with high background
traffic. Nothing is transmitted nor received. The RX error count keeps on
increasing. After several minutes, the RX error count stagnates and the
GigE interface finally gets an ip.

The issue is that mlxbf_gige_rx_init() is called before phy_start().
As soon as the RX DMA is enabled in mlxbf_gige_rx_init(), the RX CI reaches the max
of 128, and becomes equal to RX PI. RX CI doesn't decrease since the code hasn't
ran phy_start yet.
Bring the PHY up before starting the RX.

Fixes: f92e1869d74e ("Add Mellanox BlueField Gigabit Ethernet driver")
Reviewed-by: David Thompson <davthompson@nvidia.com>
Signed-off-by: Asmaa Mnebhi <asmaa@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/sched: act_ct: fix skb leak and crash on ooo frags

act_ct adds skb->users before defragmentation. If frags arrive in order,
the last frag's reference is reset in:

  inet_frag_reasm_prepare
    skb_morph

which is not straightforward.

However when frags arrive out of order, nobody unref the last frag, and
all frags are leaked. The situation is even worse, as initiating packet
capture can lead to a crash[0] when skb has been cloned and shared at the
same time.

Fix the issue by removing skb_get() before defragmentation. act_ct
returns TC_ACT_CONSUMED when defrag failed or in progress.

[0]:
[  843.804823] ------------[ cut here ]------------
[  843.809659] kernel BUG at net/core/skbuff.c:2091!
[  843.814516] invalid opcode: 0000 [#1] PREEMPT SMP
[  843.819296] CPU: 7 PID: 0 Comm: swapper/7 Kdump: loaded Tainted: G S 6.7.0-rc3 #2
[  843.824107] Hardware name: XFUSION 1288H V6/BC13MBSBD, BIOS 1.29 11/25/2022
[  843.828953] RIP: 0010:pskb_expand_head+0x2ac/0x300
[  843.833805] Code: 8b 70 28 48 85 f6 74 82 48 83 c6 08 bf 01 00 00 00 e8 38 bd ff ff 8b 83 c0 00 00 00 48 03 83 c8 00 00 00 e9 62 ff ff ff 0f 0b <0f> 0b e8 8d d0 ff ff e9 b3 fd ff ff 81 7c 24 14 40 01 00 00 4c 89
[  843.843698] RSP: 0018:ffffc9000cce07c0 EFLAGS: 00010202
[  843.848524] RAX: 0000000000000002 RBX: ffff88811a211d00 RCX: 0000000000000820
[  843.853299] RDX: 0000000000000640 RSI: 0000000000000000 RDI: ffff88811a211d00
[  843.857974] RBP: ffff888127d39518 R08: 00000000bee97314 R09: 0000000000000000
[  843.862584] R10: 0000000000000000 R11: ffff8881109f0000 R12: 0000000000000880
[  843.867147] R13: ffff888127d39580 R14: 0000000000000640 R15: ffff888170f7b900
[  843.871680] FS:  0000000000000000(0000) GS:ffff889ffffc0000(0000) knlGS:0000000000000000
[  843.876242] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  843.880778] CR2: 00007fa42affcfb8 CR3: 000000011433a002 CR4: 0000000000770ef0
[  843.885336] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  843.889809] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  843.894229] PKRU: 55555554
[  843.898539] Call Trace:
[  843.902772]  <IRQ>
[  843.906922]  ? __die_body+0x1e/0x60
[  843.911032]  ? die+0x3c/0x60
[  843.915037]  ? do_trap+0xe2/0x110
[  843.918911]  ? pskb_expand_head+0x2ac/0x300
[  843.922687]  ? do_error_trap+0x65/0x80
[  843.926342]  ? pskb_expand_head+0x2ac/0x300
[  843.929905]  ? exc_invalid_op+0x50/0x60
[  843.933398]  ? pskb_expand_head+0x2ac/0x300
[  843.936835]  ? asm_exc_invalid_op+0x1a/0x20
[  843.940226]  ? pskb_expand_head+0x2ac/0x300
[  843.943580]  inet_frag_reasm_prepare+0xd1/0x240
[  843.946904]  ip_defrag+0x5d4/0x870
[  843.950132]  nf_ct_handle_fragments+0xec/0x130 [nf_conntrack]
[  843.953334]  tcf_ct_act+0x252/0xd90 [act_ct]
[  843.956473]  ? tcf_mirred_act+0x516/0x5a0 [act_mirred]
[  843.959657]  tcf_action_exec+0xa1/0x160
[  843.962823]  fl_classify+0x1db/0x1f0 [cls_flower]
[  843.966010]  ? skb_clone+0x53/0xc0
[  843.969173]  tcf_classify+0x24d/0x420
[  843.972333]  tc_run+0x8f/0xf0
[  843.975465]  __netif_receive_skb_core+0x67a/0x1080
[  843.978634]  ? dev_gro_receive+0x249/0x730
[  843.981759]  __netif_receive_skb_list_core+0x12d/0x260
[  843.984869]  netif_receive_skb_list_internal+0x1cb/0x2f0
[  843.987957]  ? mlx5e_handle_rx_cqe_mpwrq_rep+0xfa/0x1a0 [mlx5_core]
[  843.991170]  napi_complete_done+0x72/0x1a0
[  843.994305]  mlx5e_napi_poll+0x28c/0x6d0 [mlx5_core]
[  843.997501]  __napi_poll+0x25/0x1b0
[  844.000627]  net_rx_action+0x256/0x330
[  844.003705]  __do_softirq+0xb3/0x29b
[  844.006718]  irq_exit_rcu+0x9e/0xc0
[  844.009672]  common_interrupt+0x86/0xa0
[  844.012537]  </IRQ>
[  844.015285]  <TASK>
[  844.017937]  asm_common_interrupt+0x26/0x40
[  844.020591] RIP: 0010:acpi_safe_halt+0x1b/0x20
[  844.023247] Code: ff 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 65 48 8b 04 25 00 18 03 00 48 8b 00 a8 08 75 0c 66 90 0f 00 2d 81 d0 44 00 fb f4 <fa> c3 0f 1f 00 89 fa ec 48 8b 05 ee 88 ed 00 a9 00 00 00 80 75 11
[  844.028900] RSP: 0018:ffffc90000533e70 EFLAGS: 00000246
[  844.031725] RAX: 0000000000004000 RBX: 0000000000000001 RCX: 0000000000000000
[  844.034553] RDX: ffff889ffffc0000 RSI: ffffffff828b7f20 RDI: ffff88a090f45c64
[  844.037368] RBP: ffff88a0901a2800 R08: ffff88a090f45c00 R09: 00000000000317c0
[  844.040155] R10: 00ec812281150475 R11: ffff889fffff0e04 R12: ffffffff828b7fa0
[  844.042962] R13: ffffffff828b7f20 R14: 0000000000000001 R15: 0000000000000000
[  844.045819]  acpi_idle_enter+0x7b/0xc0
[  844.048621]  cpuidle_enter_state+0x7f/0x430
[  844.051451]  cpuidle_enter+0x2d/0x40
[  844.054279]  do_idle+0x1d4/0x240
[  844.057096]  cpu_startup_entry+0x2a/0x30
[  844.059934]  start_secondary+0x104/0x130
[  844.062787]  secondary_startup_64_no_verify+0x16b/0x16b
[  844.065674]  </TASK>

Fixes: b57dc7c13ea9 ("net/sched: Introduce action ct")
Signed-off-by: Tao Liu <taoliu828@163.com>
Link: https://lore.kernel.org/r/20231228081457.936732-1-taoliu828@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'net-6.7-rc9' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
"Including fixes from wireless and netfilter.

  We haven't accumulated much over the break. If it wasn't for the
  uninterrupted stream of fixes for Intel drivers this PR would be very
  slim. There was a handful of user reports, however, either they stood
  out because of the lower traffic or users have had more time to test
  over the break. The ones which are v6.7-relevant should be wrapped up.

  Current release - regressions:

   - Revert "net: ipv6/addrconf: clamp preferred_lft to the minimum
     required", it caused issues on networks where routers send prefixes
     with preferred_lft=0

   - wifi:
      - iwlwifi: pcie: don't synchronize IRQs from IRQ, prevent deadlock
      - mac80211: fix re-adding debugfs entries during reconfiguration

  Current release - new code bugs:

   - tcp: print AO/MD5 messages only if there are any keys

  Previous releases - regressions:

   - virtio_net: fix missing dma unmap for resize, prevent OOM

  Previous releases - always broken:

   - mptcp: prevent tcp diag from closing listener subflows

   - nf_tables:
      - set transport header offset for egress hook, fix IPv4 mangling
      - skip set commit for deleted/destroyed sets, avoid double deactivation

   - nat: make sure action is set for all ct states, fix openvswitch
     matching on ICMP packets in related state

   - eth: mlxbf_gige: fix receive hang under heavy traffic

   - eth: r8169: fix PCI error on system resume for RTL8168FP

   - net: add missing getsockopt(SO_TIMESTAMPING_NEW) and cmsg handling"

* tag 'net-6.7-rc9' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (52 commits)
  net/tcp: Only produce AO/MD5 logs if there are any keys
  net: Implement missing SO_TIMESTAMPING_NEW cmsg support
  bnxt_en: Remove mis-applied code from bnxt_cfg_ntp_filters()
  net: ravb: Wait for operating mode to be applied
  asix: Add check for usbnet_get_endpoints
  octeontx2-af: Re-enable MAC TX in otx2_stop processing
  octeontx2-af: Always configure NIX TX link credits based on max frame size
  net/smc: fix invalid link access in dumping SMC-R connections
  net/qla3xxx: fix potential memleak in ql_alloc_buffer_queues
  virtio_net: fix missing dma unmap for resize
  igc: Fix hicredit calculation
  ice: fix Get link status data length
  i40e: Restore VF MSI-X state during PCI reset
  i40e: fix use-after-free in i40e_aqc_add_filters()
  net: Save and restore msg_namelen in sock_sendmsg
  netfilter: nft_immediate: drop chain reference counter on error
  netfilter: nf_nat: fix action not being set for all ct states
  net: bcmgenet: Fix FCS generation for fragmented skbuffs
  mptcp: prevent tcp diag from closing listener subflows
  MAINTAINERS: add Geliang as reviewer for MPTCP
  ...

x86/csum: clean up `csum_partial' further

Commit 688eb8191b47 ("x86/csum: Improve performance of `csum_partial`")
ended up improving the code generation for the IP csum calculations, and
in particular special-casing the 40-byte case that is a hot case for
IPv6 headers.

It then had _another_ special case for the 64-byte unrolled loop, which
did two chains of 32-byte blocks, which allows modern CPU's to improve
performance by doing the chains in parallel thanks to renaming the carry
flag.

This just unifies the special cases and combines them into just one
single helper the 40-byte csum case, and replaces the 64-byte case by a
80-byte case that just does that single helper twice. It avoids having
all these different versions of inline assembly, and actually improved
performance further in my tests.

There was never anything magical about the 64-byte unrolled case, even
though it happens to be a common size (and typically is the cacheline
size).

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

x86/csum: Remove unnecessary odd handling

The special case for odd aligned buffers is unnecessary and mostly
just adds overhead. Aligned buffers is the expectations, and even for
unaligned buffer, the only case that was helped is if the buffer was
1-byte from word aligned which is ~1/7 of the cases. Overall it seems
highly unlikely to be worth to extra branch.

It was left in the previous perf improvement patch because I was
erroneously comparing the exact output of `csum_partial(...)`, but
really we only need `csum_fold(csum_partial(...))` to match so its
safe to remove.

All csum kunit tests pass.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Laight <david.laight@aculab.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'platform-drivers-x86-v6.7-7' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86

Pull x86 platform driver fix from Ilpo Järvinen:
"Unfortunately the P2SB deadlock fix broke some older HW and we need
  some time to figure out the best way to fix the issue so reverting the
  deadlock fix for now"

* tag 'platform-drivers-x86-v6.7-7' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
  Revert "platform/x86: p2sb: Allow p2sb_bar() calls during PCI device probe"

Merge tag 'sound-6.7-final' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
"It became more than wished, partly because of vacations. But all
  changes are fairly device-specific and should be safe to apply:

   - A regression fix for Oops at ASoC HD-audio probe

   - A series of TAS2781 HD-audio codec fixes

   - A random build regression fix with SPI helpers

   - Minor endianness fix for USB-audio mixer code

   - ASoC FSL driver error handling fix

   - ASoC Mediatek driver register fix

   - A series of ASoC meson g12a driver fixes

   - A few usual HD-audio oneliner quirks"

* tag 'sound-6.7-final' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda/realtek: Fix mute and mic-mute LEDs for HP ProBook 440 G6
  ASoC: meson: g12a-tohdmitx: Fix event generation for S/PDIF mux
  ASoC: meson: g12a-toacodec: Fix event generation
  ASoC: meson: g12a-tohdmitx: Validate written enum values
  ASoC: meson: g12a-toacodec: Validate written enum values
  ASoC: SOF: Intel: hda-codec: Delay the codec device registration
  ALSA: hda: cs35l41: fix building without CONFIG_SPI
  ALSA: hda/realtek: fix mute/micmute LEDs for a HP ZBook
  ALSA: hda/realtek: enable SND_PCI_QUIRK for hp pavilion 14-ec1xxx series
  ASoC: mediatek: mt8186: fix AUD_PAD_TOP register and offset
  ALSA: scarlett2: Convert meter levels from little-endian
  ALSA: hda/tas2781: remove sound controls in unbind
  ALSA: hda/tas2781: move set_drv_data outside tasdevice_init
  ALSA: hda/tas2781: fix typos in comment
  ALSA: hda/tas2781: do not use regcache
  ASoC: fsl_rpmsg: Fix error handler with pm_runtime_enable

Merge tag 'drm-fixes-2024-01-04' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
"These were from over the holiday period, mainly i915, a couple of
  qaic, bridge and an mgag200.

  qaic:
   - fix GEM import
   - add quirk for soc version

  bridge:
   - parade-ps8640, ti-sn65dsi86: fix aux reads bounds

  mgag200:
   - fix gamma LUT init

  i915:
   - Fix bogus DPCD rev usage for DP phy test pattern setup
   - Fix handling of MMIO triggered reports in the OA buffer"

* tag 'drm-fixes-2024-01-04' of git://anongit.freedesktop.org/drm/drm:
  drm/i915/perf: Update handling of MMIO triggered reports
  drm/i915/dp: Fix passing the correct DPCD_REV for drm_dp_set_phy_test_pattern
  drm/mgag200: Fix gamma lut not initialized for G200ER, G200EV, G200SE
  drm/bridge: ps8640: Fix size mismatch warning w/ len
  drm/bridge: ti-sn65dsi86: Never store more than msg->size bytes in AUX xfer
  drm/bridge: parade-ps8640: Never store more than msg->size bytes in AUX xfer
  accel/qaic: Implement quirk for SOC_HW_VERSION
  accel/qaic: Fix GEM import path code

net/tcp: Only produce AO/MD5 logs if there are any keys

User won't care about inproper hash options in the TCP header if they
don't use neither TCP-AO nor TCP-MD5. Yet, those logs can add up in
syslog, while not being a real concern to the host admin:
> kernel: TCP: TCP segment has incorrect auth options set for XX.20.239.12.54681->XX.XX.90.103.80 [S]

Keep silent and avoid logging when there aren't any keys in the system.

Side-note: I also defined static_branch_tcp_*() helpers to avoid more
ifdeffery, going to remove more ifdeffery further with their help.

Reported-by: Christian Kujau <lists@nerdbynature.de>
Closes: https://lore.kernel.org/all/f6b59324-1417-566f-a976-ff2402718a8d@nerdbynature.de/
Signed-off-by: Dmitry Safonov <dima@arista.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Fixes: 2717b5adea9e ("net/tcp: Add tcp_hash_fail() ratelimited logs")
Link: https://lore.kernel.org/r/20240104-tcp_hash_fail-logs-v1-1-ff3e1f6f9e72@arista.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2024-01-03 (i40e, ice, igc)

This series contains updates to i40e, ice, and igc drivers.

Ke Xiao fixes use after free for unicast filters on i40e.

Andrii restores VF MSI-X flag after PCI reset on i40e.

Paul corrects admin queue link status structure to fulfill firmware
expectations for ice.

Rodrigo Cataldo corrects value used for hicredit calculations on igc.

* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  igc: Fix hicredit calculation
  ice: fix Get link status data length
  i40e: Restore VF MSI-X state during PCI reset
  i40e: fix use-after-free in i40e_aqc_add_filters()
====================

Link: https://lore.kernel.org/r/20240103193254.822968-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: Implement missing SO_TIMESTAMPING_NEW cmsg support

Commit 9718475e6908 ("socket: Add SO_TIMESTAMPING_NEW") added the new
socket option SO_TIMESTAMPING_NEW. However, it was never implemented in
__sock_cmsg_send thus breaking SO_TIMESTAMPING cmsg for platforms using
SO_TIMESTAMPING_NEW.

Fixes: 9718475e6908 ("socket: Add SO_TIMESTAMPING_NEW")
Link: https://lore.kernel.org/netdev/6a7281bf-bc4a-4f75-bb88-7011908ae471@app.fastmail.com/
Signed-off-by: Thomas Lange <thomas@corelatus.se>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20240104085744.49164-1-thomas@corelatus.se
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Revert "platform/x86: p2sb: Allow p2sb_bar() calls during PCI device probe"

This reverts commit b28ff7a7c3245d7f62acc20f15b4361292fe4117.

The commit introduced P2SB device scan and resource cache during the
boot process to avoid deadlock. But it caused detection failure of
IDE controllers on old systems [1]. The IDE controllers on old systems
and P2SB devices on newer systems have same PCI DEVFN. It is suspected
the confusion between those two is the failure cause. Revert the change
at this moment until the proper solution gets ready.

Link: https://lore.kernel.org/platform-driver-x86/CABq1_vjfyp_B-f4LAL6pg394bP6nDFyvg110TOLHHb0x4aCPeg@mail.gmail.com/T/#m07b30468d9676fc5e3bb2122371121e4559bb383
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Link: https://lore.kernel.org/r/20240104114050.3142690-1-shinichiro.kawasaki@wdc.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

bnxt_en: Remove mis-applied code from bnxt_cfg_ntp_filters()

The 2 lines to check for the BNXT_HWRM_PF_UNLOAD_SP_EVENT bit was
mis-applied to bnxt_cfg_ntp_filters() and should have been applied to
bnxt_sp_task().

Fixes: 19241368443f ("bnxt_en: Send PF driver unload notification to all VFs.")
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ravb: Wait for operating mode to be applied

CSR.OPS bits specify the current operating mode and (according to
documentation) they are updated by HW when the operating mode change
request is processed. To comply with this check CSR.OPS before proceeding.

Commit introduces ravb_set_opmode() that does all the necessities for
setting the operating mode (set CCC.OPC (and CCC.GAC, CCC.CSEL, if any) and
wait for CSR.OPS) and call it where needed. This should comply with all the
HW manuals requirements as different manual variants specify that different
modes need to be checked in CSR.OPS when setting CCC.OPC.

If gPTP active in config mode is supported and it needs to be enabled, the
CCC.GAC and CCC.CSEL needs to be configured along with CCC.OPC in the same
write access. For this, ravb_set_opmode() allows passing GAC and CSEL as
part of opmode and the function updates accordingly CCC register.

Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper")
Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>

asix: Add check for usbnet_get_endpoints

Add check for usbnet_get_endpoints() and return the error if it fails
in order to transfer the error.

Fixes: 16626b0cc3d5 ("asix: Add a new driver for the AX88172A")
Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>

octeontx2-af: Re-enable MAC TX in otx2_stop processing

During QoS scheduling testing with multiple strict priority flows, the
netdev tx watchdog timeout routine is invoked when a low priority QoS
queue doesn't get a chance to transmit the packets because other high
priority flows are completely subscribing the transmit link. The netdev
tx watchdog timeout routine will stop MAC RX and TX functionality in
otx2_stop() routine before cleanup of HW TX queues which results in SMQ
flush errors because the packets belonging to low priority queues will
never gets flushed since MAC TX is disabled. This patch fixes the issue
by re-enabling MAC TX to ensure the packets in HW pipeline gets flushed
properly.

Fixes: a7faa68b4e7f ("octeontx2-af: Start/Stop traffic in CGX along with NPC")
Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

octeontx2-af: Always configure NIX TX link credits based on max frame size

Currently the NIX TX link credits are initialized based on the max frame
size that can be transmitted on a link but when the MTU is changed, the
NIX TX link credits are reprogrammed by the SW based on the new MTU value.
Since SMQ max packet length is programmed to max frame size by default,
there is a chance that NIX TX may stall while sending a max frame sized
packet on the link with insufficient credits to send the packet all at
once. This patch avoids stall issue by not changing the link credits
dynamically when the MTU is changed.

Fixes: 1c74b89171c3 ("octeontx2-af: Wait for TX link idle for credits change")
Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Signed-off-by: Nithin Kumar Dabilpuram <ndabilpuram@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ALSA: hda/realtek: Fix mute and mic-mute LEDs for HP ProBook 440 G6

LEDs in 'HP ProBook 440 G6' laptop are controlled by ALC236 codec.
Enable already existing quirk 'ALC236_FIXUP_HP_MUTE_LED_MICMUTE_VREF'
to fix mute and mic-mute LEDs.

Signed-off-by: Siddhesh Dharme <siddheshdharme18@gmail.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20240104060736.5149-1-siddheshdharme18@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>

Merge tag 'asoc-fix-v6.7-rc8' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v6.7

I recently got a LibreTech Sapphire board for my CI and while
integrating it found and fixed some issues, including crashes for the
enum validation. There's also a couple of patches adding quirks for
another x86 laptop from Hans and an error handling fix for the Freescale
rpmsg driver.

Merge tag 'nf-24-01-03' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net:

1) Fix nat packets in the related state in OVS, from Brad Cowie.

2) Drop chain reference counter on error path in case chain binding
   fails.

* tag 'nf-24-01-03' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  netfilter: nft_immediate: drop chain reference counter on error
  netfilter: nf_nat: fix action not being set for all ct states
====================

Link: https://lore.kernel.org/r/20240103113001.137936-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'drm-misc-fixes-2024-01-03' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

drm-misc-fixes for v6.7 final:
- 2 small qaic fixes.
- Fixes for overflow in aux xfer.
- Fix uninitialised gamma lut in gmag200.
- Small compiler warning fix for backports of a ps8640 fix.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/9ba866b4-3144-47a9-a2c0-7313c67249d7@linux.intel.com

Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2023-12-27 (igc)

This series contains updates to igc driver only.

Kurt Kanzenbach resolves issues around VLAN ntuple rules; correctly
reporting back added rules and checking for valid values.

* '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  igc: Check VLAN EtherType mask
  igc: Check VLAN TCI mask
  igc: Report VLAN EtherType matching back to user
====================

Link: https://lore.kernel.org/r/20231227210041.3035055-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2023-12-27 (ice, i40e)

This series contains updates to ice and i40e drivers.

Katarzyna changes message to no longer be reported as error under
certain conditions as it can be expected on ice.

Ngai-Mint ensures VSI is always closed when stopping interface to
prevent NULL pointer dereference for ice.

Arkadiusz corrects reporting of phase offset value for ice.

Sudheer corrects checking on ADQ filters to prevent invalid values on
i40e.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  i40e: Fix filter input checks to prevent config with invalid values
  ice: dpll: fix phase offset value
  ice: Shut down VSI with "link-down-on-close" enabled
  ice: Fix link_down_on_close message
====================

Link: https://lore.kernel.org/r/20231227182541.3033124-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/smc: fix invalid link access in dumping SMC-R connections

A crash was found when dumping SMC-R connections. It can be reproduced
by following steps:

- environment: two RNICs on both sides.
- run SMC-R between two sides, now a SMC_LGR_SYMMETRIC type link group
  will be created.
- set the first RNIC down on either side and link group will turn to
  SMC_LGR_ASYMMETRIC_LOCAL then.
- run 'smcss -R' and the crash will be triggered.

BUG: kernel NULL pointer dereference, address: 0000000000000010
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 8000000101fdd067 P4D 8000000101fdd067 PUD 10ce46067 PMD 0
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 3 PID: 1810 Comm: smcss Kdump: loaded Tainted: G W   E      6.7.0-rc6+ #51
RIP: 0010:__smc_diag_dump.constprop.0+0x36e/0x620 [smc_diag]
Call Trace:
  <TASK>
  ? __die+0x24/0x70
  ? page_fault_oops+0x66/0x150
  ? exc_page_fault+0x69/0x140
  ? asm_exc_page_fault+0x26/0x30
  ? __smc_diag_dump.constprop.0+0x36e/0x620 [smc_diag]
  smc_diag_dump_proto+0xd0/0xf0 [smc_diag]
  smc_diag_dump+0x26/0x60 [smc_diag]
  netlink_dump+0x19f/0x320
  __netlink_dump_start+0x1dc/0x300
  smc_diag_handler_dump+0x6a/0x80 [smc_diag]
  ? __pfx_smc_diag_dump+0x10/0x10 [smc_diag]
  sock_diag_rcv_msg+0x121/0x140
  ? __pfx_sock_diag_rcv_msg+0x10/0x10
  netlink_rcv_skb+0x5a/0x110
  sock_diag_rcv+0x28/0x40
  netlink_unicast+0x22a/0x330
  netlink_sendmsg+0x240/0x4a0
  __sock_sendmsg+0xb0/0xc0
  ____sys_sendmsg+0x24e/0x300
  ? copy_msghdr_from_user+0x62/0x80
  ___sys_sendmsg+0x7c/0xd0
  ? __do_fault+0x34/0x1a0
  ? do_read_fault+0x5f/0x100
  ? do_fault+0xb0/0x110
  __sys_sendmsg+0x4d/0x80
  do_syscall_64+0x45/0xf0
  entry_SYSCALL_64_after_hwframe+0x6e/0x76

When the first RNIC is set down, the lgr->lnk[0] will be cleared and an
asymmetric link will be allocated in lgr->link[SMC_LINKS_PER_LGR_MAX - 1]
by smc_llc_alloc_alt_link(). Then when we try to dump SMC-R connections
in __smc_diag_dump(), the invalid lgr->lnk[0] will be accessed, resulting
in this issue. So fix it by accessing the right link.

Fixes: f16a7dd5cf27 ("smc: netlink interface for SMC sockets")
Reported-by: henaumars <henaumars@sina.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=7616
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Reviewed-by: Tony Lu <tonylu@linux.alibaba.com>
Link: https://lore.kernel.org/r/1703662835-53416-1-git-send-email-guwen@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/qla3xxx: fix potential memleak in ql_alloc_buffer_queues

When dma_alloc_coherent() fails, we should free qdev->lrg_buf
to prevent potential memleak.

Fixes: 1357bfcf7106 ("qla3xxx: Dynamically size the rx buffer queue based on the MTU.")
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Link: https://lore.kernel.org/r/20231227070227.10527-1-dinghao.liu@zju.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch '200GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2023-12-26 (idpf)

This series contains updates to idpf driver only.

Alexander resolves issues in singleq mode to prevent corrupted frames
and leaking skbs.

Pavan prevents extra padding on RSS struct causing load failure due to
unexpected size.

* '200GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
idpf: avoid compiler introduced padding in virtchnl2_rss_key struct
idpf: fix corrupted frames and skb leaks in singleq mode
====================

Link: https://lore.kernel.org/r/20231226174125.2632875-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

virtio_net: fix missing dma unmap for resize

For rq, we have three cases getting buffers from virtio core:

1. virtqueue_get_buf{,_ctx}
2. virtqueue_detach_unused_buf
3. callback for virtqueue_resize

But in commit 295525e29a5b("virtio_net: merge dma operations when
filling mergeable buffers"), I missed the dma unmap for the #3 case.

That will leak some memory, because I did not release the pages referred
by the unused buffers.

If we do such script, we will make the system OOM.

    while true
    do
            ethtool -G ens4 rx 128
            ethtool -G ens4 rx 256
            free -m
    done

Fixes: 295525e29a5b ("virtio_net: merge dma operations when filling mergeable buffers")
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://lore.kernel.org/r/20231226094333.47740-1-xuanzhuo@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'pci-v6.7-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci

Pull PCI fixes from Bjorn Helgaas:

- Revert an ASPM patch that caused an unintended reboot when resuming
   after suspend (Bjorn Helgaas)

- Orphan Cadence PCIe IP (Bjorn Helgaas)

* tag 'pci-v6.7-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
  MAINTAINERS: Orphan Cadence PCIe IP
  Revert "PCI/ASPM: Remove pcie_aspm_pm_state_change()"

Merge tag 'apparmor-pr-2024-01-03' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor

Pull apparmor fix from John Johansen:
"Detect that the source mount is not in the namespace and if it isn't
  don't use it as a source path match.

  This prevent apparmor from applying the attach_disconnected flag to
  move_mount() source which prevents detached mounts from appearing as /
  when applying mount mediation, which is not only incorrect but could
  result in bad policy being generated"

* tag 'apparmor-pr-2024-01-03' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor:
  apparmor: Fix move_mount mediation by detecting if source is detached

apparmor: Fix move_mount mediation by detecting if source is detached

Prevent move_mount from applying the attach_disconnected flag
to move_mount(). This prevents detached mounts from appearing
as / when applying mount mediation, which is not only incorrect
but could result in bad policy being generated.

Basic mount rules like
  allow mount,
  allow mount options=(move) -> /target/,

will allow detached mounts, allowing older policy to continue
to function. New policy gains the ability to specify `detached` as
a source option
  allow mount detached -> /target/,

In addition make sure support of move_mount is advertised as
a feature to userspace so that applications that generate policy
can respond to the addition.

Note: this fixes mediation of move_mount when a detached mount is used,
      it does not fix the broader regression of apparmor mediation of
      mounts under the new mount api.

Link: https://lore.kernel.org/all/68c166b8-5b4d-4612-8042-1dee3334385b@leemhuis.info/T/#mb35fdde37f999f08f0b02d58dc1bf4e6b65b8da2
Fixes: 157a3537d6bc ("apparmor: Fix regression in mount mediation")
Reviewed-by: Georgia Garcia <georgia.garcia@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>

Merge tag 'efi-urgent-for-v6.7-3' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi

Pull EFI fix from Ard Biesheuvel:

- Ensure that the KASLR load flag is set in boot_params when loading
the kernel randomized directly from the EFI stub

* tag 'efi-urgent-for-v6.7-3' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
efi/x86: Fix the missing KASLR_FLAG bit in boot_params->hdr.loadflags

Merge tag 'trace-v6.7-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing fixes from Steven Rostedt:

- Fix a NULL kernel dereference in set_gid() on tracefs mounting.

   When tracefs is mounted with "gid=1000", it will update the existing
   dentries to have the new gid. The tracefs_inode which is retrieved by
   a container_of(dentry->d_inode) has flags to see if the inode belongs
   to the eventfs system.

   The issue that was fixed was if getdents() was called on tracefs that
   was previously mounted, and was not closed. It will leave a "cursor
   dentry" in the subdirs list of the current dentries that set_gid()
   walks. On a remount of tracefs, the container_of(dentry->d_inode)
   will dereference a NULL pointer and cause a crash when referenced.

   Simply have a check for dentry->d_inode to see if it is NULL and if
   so, skip that entry.

- Fix the bits of the eventfs_inode structure.

   The "is_events" bit was taken from the nr_entries field, but the
   nr_entries field wasn't updated to be 30 bits and was still 31.
   Including the "is_freed" bit this would use 33 bits which would make
   the structure use another integer for just one bit.

* tag 'trace-v6.7-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  eventfs: Fix bitwise fields for "is_events"
  tracefs: Check for dentry->d_inode exists in set_gid()

Merge tag 'bcachefs-2024-01-01' of https://evilpiepirate.org/git/bcachefs

Pull bcachefs from Kent Overstreet:
"More bcachefs bugfixes for 6.7, and forwards compatibility work:

   - fix for a nasty extents + snapshot interaction, reported when
     reflink of a snapshotted file wouldn't complete but turned out to
     be a more general bug

   - fix for an invalid free in dio write path when iov vector was
     longer than our inline vector

   - fix for a buffer overflow in the nocow write path -
     BCH_REPLICAS_MAX doesn't actually limit the number of pointers in
     an extent when cached pointers are included

   - RO snapshots are actually RO now

   - And, a new superblock section to avoid future breakage when the
     disk space acounting rewrite rolls out: the new superblock section
     describes versions that need work to downgrade, where the work
     required is a list of recovery passes and errors to silently fix"

* tag 'bcachefs-2024-01-01' of https://evilpiepirate.org/git/bcachefs:
  bcachefs: make RO snapshots actually RO
  bcachefs: bch_sb_field_downgrade
  bcachefs: bch_sb.recovery_passes_required
  bcachefs: Add persistent identifiers for recovery passes
  bcachefs: prt_bitflags_vector()
  bcachefs: move BCH_SB_ERRS() to sb-errors_types.h
  bcachefs: fix buffer overflow in nocow write path
  bcachefs: DARRAY_PREALLOCATED()
  bcachefs: Switch darray to kvmalloc()
  bcachefs: Factor out darray resize slowpath
  bcachefs: fix setting version_upgrade_complete
  bcachefs: fix invalid free in dio write path
  bcachefs: Fix extents iteration + snapshots interaction

igc: Fix hicredit calculation

According to the Intel Software Manual for I225, Section 7.5.2.7,
hicredit should be multiplied by the constant link-rate value, 0x7736.

Currently, the old constant link-rate value, 0x7735, from the boards
supported on igb are being used, most likely due to a copy'n'paste, as
the rest of the logic is the same for both drivers.

Update hicredit accordingly.

Fixes: 1ab011b0bf07 ("igc: Add support for CBS offloading")
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: Rodrigo Cataldo <rodrigo.cadore@l-acoustics.com>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Tested-by: Naama Meir <naamax.meir@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

ice: fix Get link status data length

Get link status version 2 (opcode 0x0607) is returning an error because FW
expects a data length of 56 bytes, and this is causing the driver to fail
probe.

Update the get link status version 2 data length to 56 bytes by adding 5
byte reserved5 field to the end of struct ice_aqc_get_link_status_data and
passing it as parameter to offsetofend() to the fix error.

Fixes: 2777d24ec6d1 ("ice: Add ice_get_link_status_datalen")
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

i40e: Restore VF MSI-X state during PCI reset

During a PCI FLR the MSI-X Enable flag in the VF PCI MSI-X capability
register will be cleared. This can lead to issues when a VF is
assigned to a VM because in these cases the VF driver receives no
indication of the PF PCI error/reset and additionally it is incapable
of restoring the cleared flag in the hypervisor configuration space
without fully reinitializing the driver interrupt functionality.

Since the VF driver is unable to easily resolve this condition on its own,
restore the VF MSI-X flag during the PF PCI reset handling.

Fixes: 19b7960b2da1 ("i40e: implement split PCI error reset handler")
Co-developed-by: Karen Ostrowska <karen.ostrowska@intel.com>
Signed-off-by: Karen Ostrowska <karen.ostrowska@intel.com>
Co-developed-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Andrii Staikov <andrii.staikov@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

ASoC: meson: g12a-tohdmitx: Fix event generation for S/PDIF mux

When a control changes value the return value from _put() should be 1 so
we get events generated to userspace notifying applications of the change.
While the I2S mux gets this right the S/PDIF mux does not, fix the return
value.

Fixes: c8609f3870f7 ("ASoC: meson: add g12a tohdmitx control")
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20240103-meson-enum-val-v1-4-424af7a8fb91@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: meson: g12a-toacodec: Fix event generation

When a control changes value the return value from _put() should be 1 so
we get events generated to userspace notifying applications of the change.
We are checking if there has been a change and exiting early if not but we
are not providing the correct return value in the latter case, fix this.

Fixes: af2618a2eee8 ("ASoC: meson: g12a: add internal DAC glue driver")
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20240103-meson-enum-val-v1-3-424af7a8fb91@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: meson: g12a-tohdmitx: Validate written enum values

When writing to an enum we need to verify that the value written is valid
for the enumeration, the helper function snd_soc_item_enum_to_val() doesn't
do it since it needs to return an unsigned (and in any case we'd need to
check the return value).

Fixes: c8609f3870f7 ("ASoC: meson: add g12a tohdmitx control")
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20240103-meson-enum-val-v1-2-424af7a8fb91@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: meson: g12a-toacodec: Validate written enum values

When writing to an enum we need to verify that the value written is valid
for the enumeration, the helper function snd_soc_item_enum_to_val() doesn't
do it since it needs to return an unsigned (and in any case we'd need to
check the return value).

Fixes: af2618a2eee8 ("ASoC: meson: g12a: add internal DAC glue driver")
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20240103-meson-enum-val-v1-1-424af7a8fb91@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>

i40e: fix use-after-free in i40e_aqc_add_filters()

Commit 3116f59c12bd ("i40e: fix use-after-free in
i40e_sync_filters_subtask()") avoided use-after-free issues,
by increasing refcount during update the VSI filter list to
the HW. However, it missed the unicast situation.

When deleting an unicast FDB entry, the i40e driver will release
the mac_filter, and i40e_service_task will concurrently request
firmware to add the mac_filter, which will lead to the following
use-after-free issue.

Fix again for both netdev->uc and netdev->mc.

BUG: KASAN: use-after-free in i40e_aqc_add_filters+0x55c/0x5b0 [i40e]
Read of size 2 at addr ffff888eb3452d60 by task kworker/8:7/6379

CPU: 8 PID: 6379 Comm: kworker/8:7 Kdump: loaded Tainted: G
Workqueue: i40e i40e_service_task [i40e]
Call Trace:
dump_stack+0x71/0xab
print_address_description+0x6b/0x290
kasan_report+0x14a/0x2b0
i40e_aqc_add_filters+0x55c/0x5b0 [i40e]
i40e_sync_vsi_filters+0x1676/0x39c0 [i40e]
i40e_service_task+0x1397/0x2bb0 [i40e]
process_one_work+0x56a/0x11f0
worker_thread+0x8f/0xf40
kthread+0x2a0/0x390
ret_from_fork+0x1f/0x40

Allocated by task 21948:
kasan_kmalloc+0xa6/0xd0
kmem_cache_alloc_trace+0xdb/0x1c0
i40e_add_filter+0x11e/0x520 [i40e]
i40e_addr_sync+0x37/0x60 [i40e]
__hw_addr_sync_dev+0x1f5/0x2f0
i40e_set_rx_mode+0x61/0x1e0 [i40e]
dev_uc_add_excl+0x137/0x190
i40e_ndo_fdb_add+0x161/0x260 [i40e]
rtnl_fdb_add+0x567/0x950
rtnetlink_rcv_msg+0x5db/0x880
netlink_rcv_skb+0x254/0x380
netlink_unicast+0x454/0x610
netlink_sendmsg+0x747/0xb00
sock_sendmsg+0xe2/0x120
__sys_sendto+0x1ae/0x290
__x64_sys_sendto+0xdd/0x1b0
do_syscall_64+0xa0/0x370
entry_SYSCALL_64_after_hwframe+0x65/0xca

Freed by task 21948:
__kasan_slab_free+0x137/0x190
kfree+0x8b/0x1b0
__i40e_del_filter+0x116/0x1e0 [i40e]
i40e_del_mac_filter+0x16c/0x300 [i40e]
i40e_addr_unsync+0x134/0x1b0 [i40e]
__hw_addr_sync_dev+0xff/0x2f0
i40e_set_rx_mode+0x61/0x1e0 [i40e]
dev_uc_del+0x77/0x90
rtnl_fdb_del+0x6a5/0x860
rtnetlink_rcv_msg+0x5db/0x880
netlink_rcv_skb+0x254/0x380
netlink_unicast+0x454/0x610
netlink_sendmsg+0x747/0xb00
sock_sendmsg+0xe2/0x120
__sys_sendto+0x1ae/0x290
__x64_sys_sendto+0xdd/0x1b0
do_syscall_64+0xa0/0x370
entry_SYSCALL_64_after_hwframe+0x65/0xca

Fixes: 3116f59c12bd ("i40e: fix use-after-free in i40e_sync_filters_subtask()")
Fixes: 41c445ff0f48 ("i40e: main driver core")
Signed-off-by: Ke Xiao <xiaoke@sangfor.com.cn>
Signed-off-by: Ding Hui <dinghui@sangfor.com.cn>
Cc: Di Zhu <zhudi2@huawei.com>
Reviewed-by: Jan Sokolowski <jan.sokolowski@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

ASoC: SOF: Intel: hda-codec: Delay the codec device registration

The current code flow is:
1. snd_hdac_device_register()
2. set parameters needed by the hdac driver
3. request_codec_module()
   the hdac driver is probed at this point

During boot the codec drivers are not loaded when the hdac device is
registered, it is going to be probed later when loading the codec module,
which point the parameters are set.

On module remove/insert
rmmod snd_sof_pci_intel_tgl
modprobe snd_sof_pci_intel_tgl

The codec module remains loaded and the driver will be probed when the
hdac device is created right away, before the parameters for the driver
has been configured:

1. snd_hdac_device_register()
   the hdac driver is probed at this point
2. set parameters needed by the hdac driver
3. request_codec_module()
   will be a NOP as the module is already loaded

Move the snd_hdac_device_register() later, to be done right before
requesting the codec module to make sure that the parameters are all set
before the device is created:

1. set parameters needed by the hdac driver
2. snd_hdac_device_register()
3. request_codec_module()

This way at the hdac driver probe all parameters will be set in all cases.

Link: https://github.com/thesofproject/linux/issues/4731
Fixes: a0575b4add21 ("ASoC: hdac_hda: Conditionally register dais for HDMI and Analog")
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Link: https://lore.kernel.org/r/20231207095425.19597-1-peter.ujfalusi@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/ZYvUIxtrqBQZbNlC@shine.dominikbrodowski.net
Link: https://bugzilla.kernel.org/show_bug.cgi?id=218304
Signed-off-by: Takashi Iwai <tiwai@suse.de>

net: Save and restore msg_namelen in sock_sendmsg

Commit 86a7e0b69bd5 ("net: prevent rewrite of msg_name in
sock_sendmsg()") made sock_sendmsg save the incoming msg_name pointer
and restore it before returning, to insulate the caller against
msg_name being changed by the called code. If the address length
was also changed however, we may return with an inconsistent structure
where the length doesn't match the address, and attempts to reuse it may
lead to lost packets.

For example, a kernel that doesn't have commit 1c5950fc6fe9 ("udp6: fix
potential access to stale information") will replace a v4 mapped address
with its ipv4 equivalent, and shorten namelen accordingly from 28 to 16.
If the caller attempts to reuse the resulting msg structure, it will have
the original ipv6 (v4 mapped) address but an incorrect v4 length.

Fixes: 86a7e0b69bd5 ("net: prevent rewrite of msg_name in sock_sendmsg()")
Signed-off-by: Marc Dionne <marc.dionne@auristor.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ALSA: hda: cs35l41: fix building without CONFIG_SPI

When CONFIG_SPI is disabled, the driver produces unused-variable warning:

sound/pci/hda/cs35l41_hda_property.c: In function 'generic_dsd_config':
sound/pci/hda/cs35l41_hda_property.c:181:28: error: unused variable 'spi' [-Werror=unused-variable]
  181 |         struct spi_device *spi;
      |                            ^~~
sound/pci/hda/cs35l41_hda_property.c:180:27: error: unused variable 'cs_gpiod' [-Werror=unused-variable]
  180 |         struct gpio_desc *cs_gpiod;
      |                           ^~~~~~~~

Avoid these by turning the preprocessor contionals into equivalent C code,
which also helps readability.

Fixes: 916d051730ae ("ALSA: hda: cs35l41: Only add SPI CS GPIO if SPI is enabled in kernel")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20240103102606.3742476-1-arnd@kernel.org
Signed-off-by: Takashi Iwai <tiwai@suse.de>

netfilter: nft_immediate: drop chain reference counter on error

In the init path, nft_data_init() bumps the chain reference counter,
decrement it on error by following the error path which calls
nft_data_release() to restore it.

Fixes: 4bedf9eee016 ("netfilter: nf_tables: fix chain binding transaction logic")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netfilter: nf_nat: fix action not being set for all ct states

This fixes openvswitch's handling of nat packets in the related state.

In nf_ct_nat_execute(), which is called from nf_ct_nat(), ICMP/ICMPv6
packets in the IP_CT_RELATED or IP_CT_RELATED_REPLY state, which have
not been dropped, will follow the goto, however the placement of the
goto label means that updating the action bit field will be bypassed.

This causes ovs_nat_update_key() to not be called from ovs_ct_nat()
which means the openvswitch match key for the ICMP/ICMPv6 packet is not
updated and the pre-nat value will be retained for the key, which will
result in the wrong openflow rule being matched for that packet.

Move the goto label above where the action bit field is being set so
that it is updated in all cases where the packet is accepted.

Fixes: ebddb1404900 ("net: move the nat function to nf_nat_ovs for ovs and tc")
Signed-off-by: Brad Cowie <brad@faucet.nz>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

Merge tag 'drm-intel-fixes-2023-12-28' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

drm/i915 fixes for v6.7-rc8:
- Fix bogus DPCD rev usage for DP phy test pattern setup
- Fix handling of MMIO triggered reports in the OA buffer

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87cyuqk26k.fsf@intel.com

net: bcmgenet: Fix FCS generation for fragmented skbuffs

The flag DMA_TX_APPEND_CRC was only written to the first DMA descriptor
in the TX path, where each descriptor corresponds to a single skbuff
fragment (or the skbuff head). This led to packets with no FCS appearing
on the wire if the kernel allocated the packet in fragments, which would
always happen when using PACKET_MMAP/TPACKET (cf. tpacket_fill_skb() in
net/af_packet.c).

Fixes: 1c1008c793fa ("net: bcmgenet: add main driver file")
Signed-off-by: Adrian Cinal <adriancinal1@gmail.com>
Acked-by: Doug Berger <opendmb@gmail.com>
Acked-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20231228135638.1339245-1-adriancinal1@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'mptcp-new-reviewer-and-prevent-a-warning'

Matthieu Baerts says:

====================
mptcp: new reviewer and prevent a warning

Patch 1 adds MPTCP long time contributor -- Geliang Tang -- as a new
reviewer for the project. Thanks!

Patch 2 prevents a warning when TCP Diag is used to close internal MPTCP
listener subflows. This is a correction for a patch introduced in v6.4
which was fixing an issue from v5.17.
====================

Link: https://lore.kernel.org/r/20231226-upstream-net-20231226-mptcp-prevent-warn-v1-0-1404dcc431ea@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

mptcp: prevent tcp diag from closing listener subflows

The MPTCP protocol does not expect that any other entity could change
the first subflow status when such socket is listening.
Unfortunately the TCP diag interface allows aborting any TCP socket,
including MPTCP listeners subflows. As reported by syzbot, that trigger
a WARN() and could lead to later bigger trouble.

The MPTCP protocol needs to do some MPTCP-level cleanup actions to
properly shutdown the listener. To keep the fix simple, prevent
entirely the diag interface from stopping such listeners.

We could refine the diag callback in a later, larger patch targeting
net-next.

Fixes: 57fc0f1ceaa4 ("mptcp: ensure listener is unhashed before updating the sk status")
Cc: stable@vger.kernel.org
Reported-by: <syzbot+5a01c3a666e726bc8752@syzkaller.appspotmail.com>
Closes: https://lore.kernel.org/netdev/0000000000004f4579060c68431b@google.com/
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20231226-upstream-net-20231226-mptcp-prevent-warn-v1-2-1404dcc431ea@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

MAINTAINERS: add Geliang as reviewer for MPTCP

For a long time now, Geliang has contributed to a lot of code and
reviews related to MPTCP. So let's reflect that in the MAINTAINERS file.

This should also encourage patch submitters to add him to the CC list.

Acked-by: Geliang Tang <geliang.tang@linux.dev>
Acked-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20231226-upstream-net-20231226-mptcp-prevent-warn-v1-1-1404dcc431ea@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

MAINTAINERS: Update mvpp2 driver email

I no longer use mw@semihalf.com email. Update
mvpp2 driver entry with my alternative address.

Signed-off-by: Marcin Wojtas <marcin.s.wojtas@gmail.com>
Link: https://lore.kernel.org/r/20231225225245.1606-1-marcin.s.wojtas@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

sfc: fix a double-free bug in efx_probe_filters

In efx_probe_filters, the channel->rps_flow_id is freed in a
efx_for_each_channel marco  when success equals to 0.
However, after the following call chain:

ef100_net_open
  |-> efx_probe_filters
  |-> ef100_net_stop
        |-> efx_remove_filters

The channel->rps_flow_id is freed again in the efx_for_each_channel of
efx_remove_filters, triggering a double-free bug.

Fixes: a9dc3d5612ce ("sfc_ef100: RX filter table management and related gubbins")
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Zhipeng Lu <alexious@zju.edu.cn>
Link: https://lore.kernel.org/r/20231225112915.3544581-1-alexious@zju.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

MAINTAINERS: Orphan Cadence PCIe IP

Tom Joseph <tjoseph@cadence.com> is listed as the maintainer of the Cadence
PCIe IP, but email to that address bounces and lore has no correspondence
from Tom in the past two years
(https://lore.kernel.org/all/?q=f%3Atjoseph).

Mark the Cadence IP orphaned and add Tom to CREDITS.

Link: https://lore.kernel.org/r/20240102182157.GA1732664@bhelgaas
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

Revert "PCI/ASPM: Remove pcie_aspm_pm_state_change()"

This reverts commit 08d0cc5f34265d1a1e3031f319f594bd1970976c.

Michael reported that when attempting to resume from suspend to RAM on ASUS
mini PC PN51-BB757MDE1 (DMI model: MINIPC PN51-E1), 08d0cc5f3426
("PCI/ASPM: Remove pcie_aspm_pm_state_change()") caused a 12-second delay
with no output, followed by a reboot.

Workarounds include:

  - Reverting 08d0cc5f3426 ("PCI/ASPM: Remove pcie_aspm_pm_state_change()")
  - Booting with "pcie_aspm=off"
  - Booting with "pcie_aspm.policy=performance"
  - "echo 0 | sudo tee /sys/bus/pci/devices/0000:03:00.0/link/l1_aspm"
    before suspending
  - Connecting a USB flash drive

Link: https://lore.kernel.org/r/20240102232550.1751655-1-helgaas@kernel.org
Fixes: 08d0cc5f3426 ("PCI/ASPM: Remove pcie_aspm_pm_state_change()")
Reported-by: Michael Schaller <michael@5challer.de>
Link: https://lore.kernel.org/r/76c61361-b8b4-435f-a9f1-32b716763d62@5challer.de
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: <stable@vger.kernel.org>

Revert "net: ipv6/addrconf: clamp preferred_lft to the minimum required"

The commit had a bug and might not have been the right approach anyway.

Fixes: 629df6701c8a ("net: ipv6/addrconf: clamp preferred_lft to the minimum required")
Fixes: ec575f885e3e ("Documentation: networking: explain what happens if temp_prefered_lft is too small or too large")
Reported-by: Dan Moulding <dan@danm.net>
Closes: https://lore.kernel.org/netdev/20231221231115.12402-1-dan@danm.net/
Link: https://lore.kernel.org/netdev/CAMMLpeTdYhd=7hhPi2Y7pwdPCgnnW5JYh-bu3hSc7im39uxnEA@mail.gmail.com/
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20231230043252.10530-1-alexhenrie24@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

eventfs: Fix bitwise fields for "is_events"

A flag was needed to denote which eventfs_inode was the "events"
directory, so a bit was taken from the "nr_entries" field, as there's not
that many entries, and 2^30 is plenty. But the bit number for nr_entries
was not updated to reflect the bit taken from it, which would add an
unnecessary integer to the structure.

Link: https://lore.kernel.org/linux-trace-kernel/20240102151832.7ca87275@gandalf.local.home
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Fixes: 7e8358edf503e ("eventfs: Fix file and directory uid and gid ownership")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

tracefs: Check for dentry->d_inode exists in set_gid()

If a getdents() is called on the tracefs directory but does not get all
the files, it can leave a "cursor" dentry in the d_subdirs list of tracefs
dentry. This cursor dentry does not have a d_inode for it. Before
referencing tracefs_inode from the dentry, the d_inode must first be
checked if it has content. If not, then it's not a tracefs_inode and can
be ignored.

The following caused a crash:

#define getdents64(fd, dirp, count) syscall(SYS_getdents64, fd, dirp, count)
#define BUF_SIZE 256
#define TDIR "/tmp/file0"

int main(void)
{
char buf[BUF_SIZE];
int fd;
        int n;

        mkdir(TDIR, 0777);
mount(NULL, TDIR, "tracefs", 0, NULL);
        fd = openat(AT_FDCWD, TDIR, O_RDONLY);
        n = getdents64(fd, buf, BUF_SIZE);
        ret = mount(NULL, TDIR, NULL, MS_NOSUID|MS_REMOUNT|MS_RELATIME|MS_LAZYTIME,
    "gid=1000");
return 0;
}

That's because the 256 BUF_SIZE was not big enough to read all the
dentries of the tracefs file system and it left a "cursor" dentry in the
subdirs of the tracefs root inode. Then on remounting with "gid=1000",
it would cause an iteration of all dentries which hit:

ti = get_tracefs(dentry->d_inode);
if (ti && (ti->flags & TRACEFS_EVENT_INODE))
eventfs_update_gid(dentry, gid);

Which crashed because of the dereference of the cursor dentry which had a NULL
d_inode.

In the subdir loop of the dentry lookup of set_gid(), if a child has a
NULL d_inode, simply skip it.

Link: https://lore.kernel.org/all/20240102135637.3a21fb10@gandalf.local.home/
Link: https://lore.kernel.org/linux-trace-kernel/20240102151249.05da244d@gandalf.local.home
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Fixes: 7e8358edf503e ("eventfs: Fix file and directory uid and gid ownership")
Reported-by: "Ubisectech Sirius" <bugreport@ubisectech.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

efi/x86: Fix the missing KASLR_FLAG bit in boot_params->hdr.loadflags

When KASLR is enabled, the KASLR_FLAG bit in boot_params->hdr.loadflags
should be set to 1 to propagate KASLR status from compressed kernel to
kernel, just as the choose_random_location() function does.

Currently, when the kernel is booted via the EFI stub, the KASLR_FLAG
bit in boot_params->hdr.loadflags is not set, even though it should be.
This causes some functions, such as kernel_randomize_memory(), not to
execute as expected. Fix it.

Fixes: a1b87d54f4e4 ("x86/efistub: Avoid legacy decompressor when doing EFI boot")
Signed-off-by: Yuntao Wang <ytcoode@gmail.com>
[ardb: drop 'else' branch clearing KASLR_FLAG]
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>

ALSA: hda/realtek: fix mute/micmute LEDs for a HP ZBook

There is a HP ZBook which using ALC236 codec and need the
ALC236_FIXUP_HP_MUTE_LED_MICMUTE_VREF quirk to make mute LED
and micmute LED work.

[ confirmed that the new entries are for new models that have no
proper name, so the strings are left as "HP" which will be updated
eventually later -- tiwai ]

Signed-off-by: Andy Chi <andy.chi@canonical.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20240102024916.19093-1-andy.chi@canonical.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>

selftests: bonding: do not set port down when adding to bond

Similar to commit be809424659c ("selftests: bonding: do not set port down
before adding to bond"). The bond-arp-interval-causes-panic test failed
after commit a4abfa627c38 ("net: rtnetlink: Enslave device before bringing
it up") as the kernel will set the port down _after_ adding to bond if setting
port down specifically.

Fix it by removing the link down operation when adding to bond.

Fixes: 2ffd57327ff1 ("selftests: bonding: cause oops in bond_rr_gen_slave_id")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Tested-by: Benjamin Poirier <benjamin.poirier@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

connector: Fix proc_event_num_listeners count not cleared

When we register a cn_proc listening event, the proc_event_num_listener
variable will be incremented by one, but if PROC_CN_MCAST_IGNORE is
not called, the count will not decrease.
This will cause the proc_*_connector function to take the wrong path.
It will reappear when the forkstat tool exits via ctrl + c.
We solve this problem by determining whether
there are still listeners to clear proc_event_num_listener.

Signed-off-by: wangkeqi <wangkeqiwang@didiglobal.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: linux/phy.h: fix Excess kernel-doc description warning

Remove the @phy_timer: line to prevent the kernel-doc warning:

include/linux/phy.h:768: warning: Excess struct member 'phy_timer' description in 'phy_device'

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: netdev@vger.kernel.org
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Implement missing getsockopt(SO_TIMESTAMPING_NEW)

Commit 9718475e6908 ("socket: Add SO_TIMESTAMPING_NEW") added the new
socket option SO_TIMESTAMPING_NEW. Setting the option is handled in
sk_setsockopt(), querying it was not handled in sk_getsockopt(), though.

Following remarks on an earlier submission of this patch, keep the old
behavior of getsockopt(SO_TIMESTAMPING_OLD) which returns the active
flags even if they actually have been set through SO_TIMESTAMPING_NEW.

The new getsockopt(SO_TIMESTAMPING_NEW) is stricter, returning flags
only if they have been set through the same option.

Fixes: 9718475e6908 ("socket: Add SO_TIMESTAMPING_NEW")
Link: https://lore.kernel.org/lkml/20230703175048.151683-1-jthinz@mailbox.tu-berlin.de/
Link: https://lore.kernel.org/netdev/0d7cddc9-03fa-43db-a579-14f3e822615b@app.fastmail.com/
Signed-off-by: Jörn-Thorben Hinz <jthinz@mailbox.tu-berlin.de>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: qrtr: ns: Return 0 if server port is not present

When a 'DEL_CLIENT' message is received from the remote, the corresponding
server port gets deleted. A DEL_SERVER message is then announced for this
server. As part of handling the subsequent DEL_SERVER message, the name-
server attempts to delete the server port which results in a '-ENOENT' error.
The return value from server_del() is then propagated back to qrtr_ns_worker,
causing excessive error prints.
To address this, return 0 from control_cmd_del_server() without checking the
return value of server_del(), since the above scenario is not an error case
and hence server_del() doesn't have any other error return value.

Signed-off-by: Sarannya Sasikumar <quic_sarannya@quicinc.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

bcachefs: make RO snapshots actually RO

Add checks to all the VFS paths for "are we in a RO snapshot?".

Note - we don't check this when setting inode options via our xattr
interface, since those generally only affect data placement, not
contents of data.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Reported-by: "Carl E. Thompson" <list-bcachefs@carlthompson.net>

bcachefs: bch_sb_field_downgrade

Add a new superblock section that contains a list of
{ minor version, recovery passes, errors_to_fix }

that is - a list of recovery passes that must be run when downgrading
past a given version, and a list of errors to silently fix.

The upcoming disk accounting rewrite is not going to be fully
compatible: we're going to have to regenerate accounting both when
upgrading to the new version, and also from downgrading from the new
version, since the new method of doing disk space accounting is a
completely different architecture based on deltas, and synchronizing
them for every jounal entry write to maintain compatibility is going to
be too expensive and impractical.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: bch_sb.recovery_passes_required

Add two new superblock fields. Since the main section of the superblock
is now fully, we have to add a new variable length section for them -
bch_sb_field_ext.

- recovery_passes_requried: recovery passes that must be run on the
next mount
- errors_silent: errors that will be silently fixed

These are to improve upgrading and dwongrading: these fields won't be
cleared until after recovery successfully completes, so there won't be
any issues with crashing partway through an upgrade or a downgrade.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Add persistent identifiers for recovery passes

The next patch will start to refer to recovery passes from the
superblock; naturally, we now need identifiers that don't change, since
the existing enum is in the order in which they are run and is not
fixed.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: prt_bitflags_vector()

similar to prt_bitflags(), but for ulong arrays

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: move BCH_SB_ERRS() to sb-errors_types.h

we need BCH_SB_ERR_MAX in bcachefs.h

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: fix buffer overflow in nocow write path

BCH_REPLICAS_MAX isn't the actual maximum number of pointers in an
extent, it's the maximum number of dirty pointers.

We don't have a real restriction on the number of cached pointers, and
we don't want a fixed size array here anyways - so switch to
DARRAY_PREALLOCATED().

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Reported-and-tested-by: Daniel J Blueman <daniel@quora.org>

bcachefs: DARRAY_PREALLOCATED()

Add support to darray for preallocating some number of elements.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Switch darray to kvmalloc()

We sometimes use darrays for quite large buffers - the btree write
buffer in particular needs large buffers, since it must be sized to hold
all the write buffer keys outstanding in the journal.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: Factor out darray resize slowpath

Move the slowpath (actually growing the darray) to an out-of-line
function; also, add some helpers for the upcoming btree write buffer
rewrite.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: fix setting version_upgrade_complete

If a superblock write hasn't happened (i.e. we never had to go rw), then
c->sb.version will be out of date w.r.t. c->disk_sb.sb->version.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bcachefs: fix invalid free in dio write path

turns out iterate_iovec() mutates __iov, we need to save our own copy

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Reported-by: Marcin Mirosław <marcin@mejor.pl>

bcachefs: Fix extents iteration + snapshots interaction

peek_upto() checks against the end position and bails out before
FILTER_SNAPSHOTS checks; this is because if we end up at a different
inode number than the original search key none of the keys we see might
be visibile in the current snapshot - we might be looking at inode in a
completely different subvolume.

But this is broken, because when we're iterating over extents we're
checking against the extent start position to decide when to bail out,
and the extent start position isn't monotonically increasing until after
we've run FILTER_SNAPSHOTS.

Fix this by adding a simple inode number check where the old bailout
check was, and moving the main check to the correct position.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Reported-by: "Carl E. Thompson" <list-bcachefs@carlthompson.net>

MAINTAINERS: step down as TJA11XX C45 maintainer

I am stepping down as TJA11XX C45 maintainer.
Andrei Botila will take the responsibility to maintain and improve the
support for TJA11XX C45 PHYs.

Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8169: Fix PCI error on system resume

Some r8168 NICs stop working upon system resume:

[  688.051096] r8169 0000:02:00.1 enp2s0f1: rtl_ep_ocp_read_cond == 0 (loop: 10, delay: 10000).
[  688.175131] r8169 0000:02:00.1 enp2s0f1: Link is Down
...
[  691.534611] r8169 0000:02:00.1 enp2s0f1: PCI error (cmd = 0x0407, status_errs = 0x0000)

Not sure if it's related, but those NICs have a BMC device at function
0:
02:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd. Realtek RealManage BMC [10ec:816e] (rev 1a)

Trial and error shows that increase the loop wait on
rtl_ep_ocp_read_cond to 30 can eliminate the issue, so let
rtl8168ep_driver_start() to wait a bit longer.

Fixes: e6d6ca6e1204 ("r8169: Add support for another RTL8168FP")
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/tcp_sigpool: Use kref_get_unless_zero()

The freeing and re-allocation of algorithm are protected by cpool_mutex,
so it doesn't fix an actual use-after-free, but avoids a deserved
refcount_warn_saturate() warning.

A trivial fix for the racy behavior.

Fixes: 8c73b26315aa ("net/tcp: Prepare tcp_md5sig_pool for TCP-AO")
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: sched: em_text: fix possible memory leak in em_text_destroy()

m->data needs to be freed when em_text_destroy is called.

Fixes: d675c989ed2d ("[PKT_SCHED]: Packet classification based on textsearch (ematch)")
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

Linux 6.7-rc8

get_maintainer: remove stray punctuation when cleaning file emails

When parsing emails from .yaml files in particular, stray punctuation
such as a leading '-' can end up in the name.  For example, consider a
common YAML section such as:

  maintainers:
    - devicetree@vger.kernel.org

This would previously be processed by get_maintainer.pl as:

  - <devicetree@vger.kernel.org>

Make the logic in clean_file_emails more robust by deleting any
sub-names which consist of common single punctuation marks before
proceeding to the best-effort name extraction logic.  The output is then
correct:

  devicetree@vger.kernel.org

Some additional comments are added to the function to make things
clearer to future readers.

Link: https://lore.kernel.org/all/0173e76a36b3a9b4e7f324dd3a36fd4a9757f302.camel@perches.com/
Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

get_maintainer: correctly parse UTF-8 encoded names in files

While the script correctly extracts UTF-8 encoded names from the
MAINTAINERS file, the regular expressions damage my name when parsing
from .yaml files.  Fix this by replacing the Latin-1-compatible regular
expressions with the unicode property matcher \p{L}, which matches on
any letter according to the Unicode General Category of letters.

The proposed solution only works if the script uses proper string
encoding from the outset, so instruct Perl to unconditionally open all
files with UTF-8 encoding.  This should be safe, as the entire source
tree is either UTF-8 or ASCII encoded anyway.  See [1] for a detailed
analysis.

Furthermore, to prevent the \w expression from matching non-ASCII when
checking for whether a name should be escaped with quotes, add the /a
flag to the regular expression.  The escaping logic was duplicated in
two places, so it has been factored out into its own function.

The original issue was also identified on the tools mailing list [2].
This should solve the observed side effects there as well.

Link: https://lore.kernel.org/all/dzn6uco4c45oaa3ia4u37uo5mlt33obecv7gghj2l756fr4hdh@mt3cprft3tmq/
Link: https://lore.kernel.org/tools/20230726-gush-slouching-a5cd41@meerkat/
Signed-off-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'trace-v6.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing fixes from Steven Rostedt:

- Fix readers that are blocked on the ring buffer when buffer_percent
   is 100%. They are supposed to wake up when the buffer is full, but
   because the sub-buffer that the writer is on is never considered
   "dirty" in the calculation, dirty pages will never equal nr_pages.
   Add +1 to the dirty count in order to count for the sub-buffer that
   the writer is on.

- When a reader is blocked on the "snapshot_raw" file, it is to be
   woken up when a snapshot is done and be able to read the snapshot
   buffer. But because the snapshot swaps the buffers (the main one with
   the snapshot one), and the snapshot reader is waiting on the old
   snapshot buffer, it was not woken up (because it is now on the main
   buffer after the swap). Worse yet, when it reads the buffer after a
   snapshot, it's not reading the snapshot buffer, it's reading the live
   active main buffer.

   Fix this by forcing a wakeup of all readers on the snapshot buffer
   when a new snapshot happens, and then update the buffer that the
   reader is reading to be back on the snapshot buffer.

- Fix the modification of the direct_function hash. There was a race
   when new functions were added to the direct_function hash as when it
   moved function entries from the old hash to the new one, a direct
   function trace could be hit and not see its entry.

   This is fixed by allocating the new hash, copy all the old entries
   onto it as well as the new entries, and then use rcu_assign_pointer()
   to update the new direct_function hash with it.

   This also fixes a memory leak in that code.

- Fix eventfs ownership

* tag 'trace-v6.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  ftrace: Fix modification of direct_function hash while in use
  tracing: Fix blocked reader of snapshot buffer
  ring-buffer: Fix wake ups when buffer_percent is set to 100
  eventfs: Fix file and directory uid and gid ownership

locking/osq_lock: Clarify osq_wait_next()

Directly return NULL or 'next' instead of breaking out of the loop.

Signed-off-by: David Laight <david.laight@aculab.com>
[ Split original patch into two independent parts - Linus ]
Link: https://lore.kernel.org/lkml/7c8828aec72e42eeb841ca0ee3397e9a@AcuMS.aculab.com/
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

locking/osq_lock: Clarify osq_wait_next() calling convention

osq_wait_next() is passed 'prev' from osq_lock() and NULL from
osq_unlock() but only needs the 'cpu' value to write to lock->tail.

Just pass prev->cpu or OSQ_UNLOCKED_VAL instead.

Should have no effect on the generated code since gcc manages to assume
that 'prev != NULL' due to an earlier dereference.

Signed-off-by: David Laight <david.laight@aculab.com>
[ Changed 'old' to 'old_cpu' by request from Waiman Long - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

locking/osq_lock: Move the definition of optimistic_spin_node into osq_lock.c

struct optimistic_spin_node is private to the implementation.
Move it into the C file to ensure nothing is accessing it.

Signed-off-by: David Laight <david.laight@aculab.com>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>