linux.git
20 months agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Jakub Kicinski [Thu, 8 Feb 2024 23:20:37 +0000 (15:20 -0800)]
Merge git://git./linux/kernel/git/netdev/net

Cross-merge networking fixes after downstream PR.

No conflicts.

Adjacent changes:

drivers/net/ethernet/stmicro/stmmac/common.h
  38cc3c6dcc09 ("net: stmmac: protect updates of 64-bit statistics counters")
  fd5a6a71313e ("net: stmmac: est: Per Tx-queue error count for HLBF")
  c5c3e1bfc9e0 ("net: stmmac: Offload queueMaxSDU from tc-taprio")

drivers/net/wireless/microchip/wilc1000/netdev.c
  c9013880284d ("wifi: fill in MODULE_DESCRIPTION()s for wilc1000")
  328efda22af8 ("wifi: wilc1000: do not realloc workqueue everytime an interface is added")

net/unix/garbage.c
  11498715f266 ("af_unix: Remove io_uring code for GC.")
  1279f9d9dec2 ("af_unix: Call kfree_skb() for dead unix_(sk)->oob_skb in GC.")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agoMerge tag 'net-6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Thu, 8 Feb 2024 23:09:29 +0000 (15:09 -0800)]
Merge tag 'net-6.8-rc4' of git://git./linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from WiFi and netfilter.

  Current release - regressions:

   - nic: intel: fix old compiler regressions

   - netfilter: ipset: missing gc cancellations fixed

  Current release - new code bugs:

   - netfilter: ctnetlink: fix filtering for zone 0

  Previous releases - regressions:

   - core: fix from address in memcpy_to_iter_csum()

   - netfilter: nfnetlink_queue: un-break NF_REPEAT

   - af_unix: fix memory leak for dead unix_(sk)->oob_skb in GC.

   - devlink: avoid potential loop in devlink_rel_nested_in_notify_work()

   - iwlwifi:
       - mvm: fix a battery life regression
       - fix double-free bug

   - mac80211: fix waiting for beacons logic

   - nic: nfp: flower: prevent re-adding mac index for bonded port

  Previous releases - always broken:

   - rxrpc: fix generation of serial numbers to skip zero

   - tipc: check the bearer type before calling tipc_udp_nl_bearer_add()

   - tunnels: fix out of bounds access when building IPv6 PMTU error

   - nic: hv_netvsc: register VF in netvsc_probe if NET_DEVICE_REGISTER
     missed

   - nic: atlantic: fix DMA mapping for PTP hwts ring

  Misc:

   - selftests: more fixes to deal with very slow hosts"

* tag 'net-6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (80 commits)
  netfilter: nft_set_pipapo: remove scratch_aligned pointer
  netfilter: nft_set_pipapo: add helper to release pcpu scratch area
  netfilter: nft_set_pipapo: store index in scratch maps
  netfilter: nft_set_rbtree: skip end interval element from gc
  netfilter: nfnetlink_queue: un-break NF_REPEAT
  netfilter: nf_tables: use timestamp to check for set element timeout
  netfilter: nft_ct: reject direction for ct id
  netfilter: ctnetlink: fix filtering for zone 0
  s390/qeth: Fix potential loss of L3-IP@ in case of network issues
  netfilter: ipset: Missing gc cancellations fixed
  octeontx2-af: Initialize maps.
  net: ethernet: ti: cpsw: enable mac_managed_pm to fix mdio
  net: ethernet: ti: cpsw_new: enable mac_managed_pm to fix mdio
  netfilter: nft_set_pipapo: remove static in nft_pipapo_get()
  netfilter: nft_compat: restrict match/target protocol to u16
  netfilter: nft_compat: reject unused compat flag
  netfilter: nft_compat: narrow down revision to unsigned 8-bits
  net: intel: fix old compiler regressions
  MAINTAINERS: Maintainer change for rds
  selftests: cmsg_ipv6: repeat the exact packet
  ...

20 months agoMerge tag 'pinctrl-v6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw...
Linus Torvalds [Thu, 8 Feb 2024 23:07:06 +0000 (15:07 -0800)]
Merge tag 'pinctrl-v6.8-2' of git://git./linux/kernel/git/linusw/linux-pinctrl

Pull pinctrl fix from Linus Walleij:
 "A single fix for the AMD driver which affects developer laptops, the
  pinctrl/GPIO driver won't probe on some systems"

* tag 'pinctrl-v6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: amd: Add IRQF_ONESHOT to the interrupt request

20 months agoMerge tag 'nf-24-02-08' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Paolo Abeni [Thu, 8 Feb 2024 11:56:39 +0000 (12:56 +0100)]
Merge tag 'nf-24-02-08' of git://git./linux/kernel/git/netfilter/nf

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net:

1) Narrow down target/match revision to u8 in nft_compat.

2) Bail out with unused flags in nft_compat.

3) Restrict layer 4 protocol to u16 in nft_compat.

4) Remove static in pipapo get command that slipped through when
   reducing set memory footprint.

5) Follow up incremental fix for the ipset performance regression,
   this includes the missing gc cancellation, from Jozsef Kadlecsik.

6) Allow to filter by zone 0 in ctnetlink, do not interpret zone 0
   as no filtering, from Felix Huettner.

7) Reject direction for NFT_CT_ID.

8) Use timestamp to check for set element expiration while transaction
   is handled to prevent garbage collection from removing set elements
   that were just added by this transaction. Packet path and netlink
   dump/get path still use current time to check for expiration.

9) Restore NF_REPEAT in nfnetlink_queue, from Florian Westphal.

10) map_index needs to be percpu and per-set, not just percpu.
    At this time its possible for a pipapo set to fill the all-zero part
    with ones and take the 'might have bits set' as 'start-from-zero' area.
    From Florian Westphal. This includes three patches:

    - Change scratchpad area to a structure that provides space for a
      per-set-and-cpu toggle and uses it of the percpu one.

    - Add a new free helper to prepare for the next patch.

    - Remove the scratch_aligned pointer and makes AVX2 implementation
      use the exact same memory addresses for read/store of the matching
      state.

netfilter pull request 24-02-08

* tag 'nf-24-02-08' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  netfilter: nft_set_pipapo: remove scratch_aligned pointer
  netfilter: nft_set_pipapo: add helper to release pcpu scratch area
  netfilter: nft_set_pipapo: store index in scratch maps
  netfilter: nft_set_rbtree: skip end interval element from gc
  netfilter: nfnetlink_queue: un-break NF_REPEAT
  netfilter: nf_tables: use timestamp to check for set element timeout
  netfilter: nft_ct: reject direction for ct id
  netfilter: ctnetlink: fix filtering for zone 0
  netfilter: ipset: Missing gc cancellations fixed
  netfilter: nft_set_pipapo: remove static in nft_pipapo_get()
  netfilter: nft_compat: restrict match/target protocol to u16
  netfilter: nft_compat: reject unused compat flag
  netfilter: nft_compat: narrow down revision to unsigned 8-bits
====================

Link: https://lore.kernel.org/r/20240208112834.1433-1-pablo@netfilter.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
20 months agonetfilter: nft_set_pipapo: remove scratch_aligned pointer
Florian Westphal [Thu, 8 Feb 2024 09:31:29 +0000 (10:31 +0100)]
netfilter: nft_set_pipapo: remove scratch_aligned pointer

use ->scratch for both avx2 and the generic implementation.

After previous change the scratch->map member is always aligned properly
for AVX2, so we can just use scratch->map in AVX2 too.

The alignoff delta is stored in the scratchpad so we can reconstruct
the correct address to free the area again.

Fixes: 7400b063969b ("nft_set_pipapo: Introduce AVX2-based lookup implementation")
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
20 months agonetfilter: nft_set_pipapo: add helper to release pcpu scratch area
Florian Westphal [Wed, 7 Feb 2024 20:52:47 +0000 (21:52 +0100)]
netfilter: nft_set_pipapo: add helper to release pcpu scratch area

After next patch simple kfree() is not enough anymore, so add
a helper for it.

Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
20 months agonetfilter: nft_set_pipapo: store index in scratch maps
Florian Westphal [Wed, 7 Feb 2024 20:52:46 +0000 (21:52 +0100)]
netfilter: nft_set_pipapo: store index in scratch maps

Pipapo needs a scratchpad area to keep state during matching.
This state can be large and thus cannot reside on stack.

Each set preallocates percpu areas for this.

On each match stage, one scratchpad half starts with all-zero and the other
is inited to all-ones.

At the end of each stage, the half that starts with all-ones is
always zero.  Before next field is tested, pointers to the two halves
are swapped, i.e.  resmap pointer turns into fill pointer and vice versa.

After the last field has been processed, pipapo stashes the
index toggle in a percpu variable, with assumption that next packet
will start with the all-zero half and sets all bits in the other to 1.

This isn't reliable.

There can be multiple sets and we can't be sure that the upper
and lower half of all set scratch map is always in sync (lookups
can be conditional), so one set might have swapped, but other might
not have been queried.

Thus we need to keep the index per-set-and-cpu, just like the
scratchpad.

Note that this bug fix is incomplete, there is a related issue.

avx2 and normal implementation might use slightly different areas of the
map array space due to the avx2 alignment requirements, so
m->scratch (generic/fallback implementation) and ->scratch_aligned
(avx) may partially overlap. scratch and scratch_aligned are not distinct
objects, the latter is just the aligned address of the former.

After this change, write to scratch_align->map_index may write to
scratch->map, so this issue becomes more prominent, we can set to 1
a bit in the supposedly-all-zero area of scratch->map[].

A followup patch will remove the scratch_aligned and makes generic and
avx code use the same (aligned) area.

Its done in a separate change to ease review.

Fixes: 3c4287f62044 ("nf_tables: Add set type for arbitrary concatenation of ranges")
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
20 months agonetfilter: nft_set_rbtree: skip end interval element from gc
Pablo Neira Ayuso [Wed, 7 Feb 2024 17:49:51 +0000 (18:49 +0100)]
netfilter: nft_set_rbtree: skip end interval element from gc

rbtree lazy gc on insert might collect an end interval element that has
been just added in this transactions, skip end interval elements that
are not yet active.

Fixes: f718863aca46 ("netfilter: nft_set_rbtree: fix overlap expiration walk")
Cc: stable@vger.kernel.org
Reported-by: lonial con <kongln9170@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
20 months agonetfilter: nfnetlink_queue: un-break NF_REPEAT
Florian Westphal [Tue, 6 Feb 2024 16:54:18 +0000 (17:54 +0100)]
netfilter: nfnetlink_queue: un-break NF_REPEAT

Only override userspace verdict if the ct hook returns something
other than ACCEPT.

Else, this replaces NF_REPEAT (run all hooks again) with NF_ACCEPT
(move to next hook).

Fixes: 6291b3a67ad5 ("netfilter: conntrack: convert nf_conntrack_update to netfilter verdicts")
Reported-by: l.6diay@passmail.com
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
20 months agonetfilter: nf_tables: use timestamp to check for set element timeout
Pablo Neira Ayuso [Mon, 5 Feb 2024 23:11:40 +0000 (00:11 +0100)]
netfilter: nf_tables: use timestamp to check for set element timeout

Add a timestamp field at the beginning of the transaction, store it
in the nftables per-netns area.

Update set backend .insert, .deactivate and sync gc path to use the
timestamp, this avoids that an element expires while control plane
transaction is still unfinished.

.lookup and .update, which are used from packet path, still use the
current time to check if the element has expired. And .get path and dump
also since this runs lockless under rcu read size lock. Then, there is
async gc which also needs to check the current time since it runs
asynchronously from a workqueue.

Fixes: c3e1b005ed1c ("netfilter: nf_tables: add set element timeout support")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
20 months agonetfilter: nft_ct: reject direction for ct id
Pablo Neira Ayuso [Mon, 5 Feb 2024 13:59:24 +0000 (14:59 +0100)]
netfilter: nft_ct: reject direction for ct id

Direction attribute is ignored, reject it in case this ever needs to be
supported

Fixes: 3087c3f7c23b ("netfilter: nft_ct: Add ct id support")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
20 months agonetfilter: ctnetlink: fix filtering for zone 0
Felix Huettner [Mon, 5 Feb 2024 09:59:59 +0000 (09:59 +0000)]
netfilter: ctnetlink: fix filtering for zone 0

previously filtering for the default zone would actually skip the zone
filter and flush all zones.

Fixes: eff3c558bb7e ("netfilter: ctnetlink: support filtering by zone")
Reported-by: Ilya Maximets <i.maximets@ovn.org>
Closes: https://lore.kernel.org/netdev/2032238f-31ac-4106-8f22-522e76df5a12@ovn.org/
Signed-off-by: Felix Huettner <felix.huettner@mail.schwarz>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
20 months agos390/qeth: Fix potential loss of L3-IP@ in case of network issues
Alexandra Winter [Tue, 6 Feb 2024 08:58:49 +0000 (09:58 +0100)]
s390/qeth: Fix potential loss of L3-IP@ in case of network issues

Symptom:
In case of a bad cable connection (e.g. dirty optics) a fast sequence of
network DOWN-UP-DOWN-UP could happen. UP triggers recovery of the qeth
interface. In case of a second DOWN while recovery is still ongoing, it
can happen that the IP@ of a Layer3 qeth interface is lost and will not
be recovered by the second UP.

Problem:
When registration of IP addresses with Layer 3 qeth devices fails, (e.g.
because of bad address format) the respective IP address is deleted from
its hash-table in the driver. If registration fails because of a ENETDOWN
condition, the address should stay in the hashtable, so a subsequent
recovery can restore it.

3caa4af834df ("qeth: keep ip-address after LAN_OFFLINE failure")
fixes this for registration failures during normal operation, but not
during recovery.

Solution:
Keep L3-IP address in case of ENETDOWN in qeth_l3_recover_ip(). For
consistency with qeth_l3_add_ip() we also keep it in case of EADDRINUSE,
i.e. for some reason the card already/still has this address registered.

Fixes: 4a71df50047f ("qeth: new qeth device driver")
Cc: stable@vger.kernel.org
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Link: https://lore.kernel.org/r/20240206085849.2902775-1-wintera@linux.ibm.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
20 months agonetfilter: ipset: Missing gc cancellations fixed
Jozsef Kadlecsik [Sun, 4 Feb 2024 15:26:42 +0000 (16:26 +0100)]
netfilter: ipset: Missing gc cancellations fixed

The patch fdb8e12cc2cc ("netfilter: ipset: fix performance regression
in swap operation") missed to add the calls to gc cancellations
at the error path of create operations and at module unload. Also,
because the half of the destroy operations now executed by a
function registered by call_rcu(), neither NFNL_SUBSYS_IPSET mutex
or rcu read lock is held and therefore the checking of them results
false warnings.

Fixes: 97f7cf1cd80e ("netfilter: ipset: fix performance regression in swap operation")
Reported-by: syzbot+52bbc0ad036f6f0d4a25@syzkaller.appspotmail.com
Reported-by: Brad Spengler <spender@grsecurity.net>
Reported-by: Стас Ничипорович <stasn77@gmail.com>
Tested-by: Brad Spengler <spender@grsecurity.net>
Tested-by: Стас Ничипорович <stasn77@gmail.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
20 months agoocteontx2-af: Initialize maps.
Ratheesh Kannoth [Tue, 6 Feb 2024 02:40:00 +0000 (08:10 +0530)]
octeontx2-af: Initialize maps.

kmalloc_array() without __GFP_ZERO flag does not initialize
memory to zero. This causes issues. Use kcalloc() for maps and
bitmap_zalloc() for bitmaps.

Fixes: dd7842878633 ("octeontx2-af: Add new devlink param to configure maximum usable NIX block LFs")
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
Reviewed-by: Brett Creeley <bcreeley@amd.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240206024000.1070260-1-rkannoth@marvell.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
20 months agoMerge branch 'cpsw-enable-mac_managed_pm-to-fix-mdio'
Paolo Abeni [Thu, 8 Feb 2024 10:33:22 +0000 (11:33 +0100)]
Merge branch 'cpsw-enable-mac_managed_pm-to-fix-mdio'

Sinthu Raja says:

====================
CPSW: enable mac_managed_pm to fix mdio

This patch fix the resume/suspend issue on CPSW interface.

Reference from the foloowing patchwork:
https://lore.kernel.org/netdev/20221014144729.1159257-2-shenwei.wang@nxp.com/T/

V1: https://patchwork.kernel.org/project/netdevbpf/patch/20240122083414.6246-1-sinthu.raja@ti.com/
V2: https://patchwork.kernel.org/project/netdevbpf/patch/20240122093326.7618-1-sinthu.raja@ti.com/
====================

Link: https://lore.kernel.org/r/20240206005928.15703-1-sinthu.raja@ti.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
20 months agonet: ethernet: ti: cpsw: enable mac_managed_pm to fix mdio
Sinthu Raja [Tue, 6 Feb 2024 00:59:28 +0000 (06:29 +0530)]
net: ethernet: ti: cpsw: enable mac_managed_pm to fix mdio

The below commit  introduced a WARN when phy state is not in the states:
PHY_HALTED, PHY_READY and PHY_UP.
commit 744d23c71af3 ("net: phy: Warn about incorrect mdio_bus_phy_resume() state")

When cpsw resumes, there have port in PHY_NOLINK state, so the below
warning comes out. Set mac_managed_pm be true to tell mdio that the phy
resume/suspend is managed by the mac, to fix the following warning:

WARNING: CPU: 0 PID: 965 at drivers/net/phy/phy_device.c:326 mdio_bus_phy_resume+0x140/0x144
CPU: 0 PID: 965 Comm: sh Tainted: G           O       6.1.46-g247b2535b2 #1
Hardware name: Generic AM33XX (Flattened Device Tree)
 unwind_backtrace from show_stack+0x18/0x1c
 show_stack from dump_stack_lvl+0x24/0x2c
 dump_stack_lvl from __warn+0x84/0x15c
 __warn from warn_slowpath_fmt+0x1a8/0x1c8
 warn_slowpath_fmt from mdio_bus_phy_resume+0x140/0x144
 mdio_bus_phy_resume from dpm_run_callback+0x3c/0x140
 dpm_run_callback from device_resume+0xb8/0x2b8
 device_resume from dpm_resume+0x144/0x314
 dpm_resume from dpm_resume_end+0x14/0x20
 dpm_resume_end from suspend_devices_and_enter+0xd0/0x924
 suspend_devices_and_enter from pm_suspend+0x2e0/0x33c
 pm_suspend from state_store+0x74/0xd0
 state_store from kernfs_fop_write_iter+0x104/0x1ec
 kernfs_fop_write_iter from vfs_write+0x1b8/0x358
 vfs_write from ksys_write+0x78/0xf8
 ksys_write from ret_fast_syscall+0x0/0x54
Exception stack(0xe094dfa8 to 0xe094dff0)
dfa0:                   00000004 005c3fb8 00000001 005c3fb8 00000004 00000001
dfc0: 00000004 005c3fb8 b6f6bba0 00000004 00000004 0059edb8 00000000 00000000
dfe0: 00000004 bed918f0 b6f09bd3 b6e89a66

Cc: <stable@vger.kernel.org> # v6.0+
Fixes: 744d23c71af3 ("net: phy: Warn about incorrect mdio_bus_phy_resume() state")
Fixes: fba863b81604 ("net: phy: make PHY PM ops a no-op if MAC driver manages PHY PM")
Signed-off-by: Sinthu Raja <sinthu.raja@ti.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
20 months agonet: ethernet: ti: cpsw_new: enable mac_managed_pm to fix mdio
Sinthu Raja [Tue, 6 Feb 2024 00:59:27 +0000 (06:29 +0530)]
net: ethernet: ti: cpsw_new: enable mac_managed_pm to fix mdio

The below commit  introduced a WARN when phy state is not in the states:
PHY_HALTED, PHY_READY and PHY_UP.
commit 744d23c71af3 ("net: phy: Warn about incorrect mdio_bus_phy_resume() state")

When cpsw_new resumes, there have port in PHY_NOLINK state, so the below
warning comes out. Set mac_managed_pm be true to tell mdio that the phy
resume/suspend is managed by the mac, to fix the following warning:

WARNING: CPU: 0 PID: 965 at drivers/net/phy/phy_device.c:326 mdio_bus_phy_resume+0x140/0x144
CPU: 0 PID: 965 Comm: sh Tainted: G           O       6.1.46-g247b2535b2 #1
Hardware name: Generic AM33XX (Flattened Device Tree)
 unwind_backtrace from show_stack+0x18/0x1c
 show_stack from dump_stack_lvl+0x24/0x2c
 dump_stack_lvl from __warn+0x84/0x15c
 __warn from warn_slowpath_fmt+0x1a8/0x1c8
 warn_slowpath_fmt from mdio_bus_phy_resume+0x140/0x144
 mdio_bus_phy_resume from dpm_run_callback+0x3c/0x140
 dpm_run_callback from device_resume+0xb8/0x2b8
 device_resume from dpm_resume+0x144/0x314
 dpm_resume from dpm_resume_end+0x14/0x20
 dpm_resume_end from suspend_devices_and_enter+0xd0/0x924
 suspend_devices_and_enter from pm_suspend+0x2e0/0x33c
 pm_suspend from state_store+0x74/0xd0
 state_store from kernfs_fop_write_iter+0x104/0x1ec
 kernfs_fop_write_iter from vfs_write+0x1b8/0x358
 vfs_write from ksys_write+0x78/0xf8
 ksys_write from ret_fast_syscall+0x0/0x54
Exception stack(0xe094dfa8 to 0xe094dff0)
dfa0:                   00000004 005c3fb8 00000001 005c3fb8 00000004 00000001
dfc0: 00000004 005c3fb8 b6f6bba0 00000004 00000004 0059edb8 00000000 00000000
dfe0: 00000004 bed918f0 b6f09bd3 b6e89a66

Cc: <stable@vger.kernel.org> # v6.0+
Fixes: 744d23c71af3 ("net: phy: Warn about incorrect mdio_bus_phy_resume() state")
Fixes: fba863b81604 ("net: phy: make PHY PM ops a no-op if MAC driver manages PHY PM")
Signed-off-by: Sinthu Raja <sinthu.raja@ti.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
20 months agonetfilter: nft_set_pipapo: remove static in nft_pipapo_get()
Pablo Neira Ayuso [Fri, 2 Feb 2024 09:09:34 +0000 (10:09 +0100)]
netfilter: nft_set_pipapo: remove static in nft_pipapo_get()

This has slipped through when reducing memory footprint for set
elements, remove it.

Fixes: 9dad402b89e8 ("netfilter: nf_tables: expose opaque set element as struct nft_elem_priv")
Reported-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
20 months agoMerge tag 'v6.8-p3' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Linus Torvalds [Thu, 8 Feb 2024 06:12:14 +0000 (06:12 +0000)]
Merge tag 'v6.8-p3' of git://git./linux/kernel/git/herbert/crypto-2.6

Pull crypto fixes from Herbert Xu:
 "Fix regressions in cbc and algif_hash, as well as an older
  NULL-pointer dereference in ccp"

* tag 'v6.8-p3' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: algif_hash - Remove bogus SGL free on zero-length error path
  crypto: cbc - Ensure statesize is zero
  crypto: ccp - Fix null pointer dereference in __sev_platform_shutdown_locked

20 months agoMerge tag 'percpu-for-6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/denni...
Linus Torvalds [Thu, 8 Feb 2024 06:08:37 +0000 (06:08 +0000)]
Merge tag 'percpu-for-6.8-rc4' of git://git./linux/kernel/git/dennis/percpu

Pull percpu fix from Dennis Zhou:

 - fix riscv wrong size passed to local_flush_tlb_range_asid()

* tag 'percpu-for-6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu:
  riscv: Fix wrong size passed to local_flush_tlb_range_asid()

20 months agoMerge branch 'net-more-factorization-in-cleanup_net-paths'
Jakub Kicinski [Thu, 8 Feb 2024 02:55:15 +0000 (18:55 -0800)]
Merge branch 'net-more-factorization-in-cleanup_net-paths'

Eric Dumazet says:

====================
net: more factorization in cleanup_net() paths

This series is inspired by recent syzbot reports hinting to RTNL and
workqueue abuses.

rtnl_lock() is unfair to (single threaded) cleanup_net(), because
many threads can cause contention on it.

This series adds a new (struct pernet_operations) method,
so that cleanup_net() can hold RTNL longer once it finally
acquires it.

It also factorizes unregister_netdevice_many(), to further
reduce stalls in cleanup_net().

Link: https://lore.kernel.org/netdev/CANn89iLJrrJs+6Vc==Un4rVKcpV0Eof4F_4w1_wQGxUCE2FWAg@mail.gmail.com/T/#u
https://lore.kernel.org/netdev/170688415193.5216.10499830272732622816@kwain/
====================

Link: https://lore.kernel.org/r/20240206144313.2050392-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agoxfrm: interface: use exit_batch_rtnl() method
Eric Dumazet [Tue, 6 Feb 2024 14:43:12 +0000 (14:43 +0000)]
xfrm: interface: use exit_batch_rtnl() method

exit_batch_rtnl() is called while RTNL is held,
and devices to be unregistered can be queued in the dev_kill_list.

This saves one rtnl_lock()/rtnl_unlock() pair per netns
and one unregister_netdevice_many() call.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Antoine Tenart <atenart@kernel.org>
Link: https://lore.kernel.org/r/20240206144313.2050392-17-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agobridge: use exit_batch_rtnl() method
Eric Dumazet [Tue, 6 Feb 2024 14:43:11 +0000 (14:43 +0000)]
bridge: use exit_batch_rtnl() method

exit_batch_rtnl() is called while RTNL is held,
and devices to be unregistered can be queued in the dev_kill_list.

This saves one rtnl_lock()/rtnl_unlock() pair per netns
and one unregister_netdevice_many() call.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Antoine Tenart <atenart@kernel.org>
Link: https://lore.kernel.org/r/20240206144313.2050392-16-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agoip_tunnel: use exit_batch_rtnl() method
Eric Dumazet [Tue, 6 Feb 2024 14:43:10 +0000 (14:43 +0000)]
ip_tunnel: use exit_batch_rtnl() method

exit_batch_rtnl() is called while RTNL is held,
and devices to be unregistered can be queued in the dev_kill_list.

This saves one rtnl_lock()/rtnl_unlock() pair
and one unregister_netdevice_many() call.

This patch takes care of ipip, ip_vti, and ip_gre tunnels.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Antoine Tenart <atenart@kernel.org>
Link: https://lore.kernel.org/r/20240206144313.2050392-15-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agosit: use exit_batch_rtnl() method
Eric Dumazet [Tue, 6 Feb 2024 14:43:09 +0000 (14:43 +0000)]
sit: use exit_batch_rtnl() method

exit_batch_rtnl() is called while RTNL is held,
and devices to be unregistered can be queued in the dev_kill_list.

This saves one rtnl_lock()/rtnl_unlock() pair
and one unregister_netdevice_many() call.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Antoine Tenart <atenart@kernel.org>
Link: https://lore.kernel.org/r/20240206144313.2050392-14-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agoip6_vti: use exit_batch_rtnl() method
Eric Dumazet [Tue, 6 Feb 2024 14:43:08 +0000 (14:43 +0000)]
ip6_vti: use exit_batch_rtnl() method

exit_batch_rtnl() is called while RTNL is held,
and devices to be unregistered can be queued in the dev_kill_list.

This saves one rtnl_lock()/rtnl_unlock() pair
and one unregister_netdevice_many() call.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Antoine Tenart <atenart@kernel.org>
Link: https://lore.kernel.org/r/20240206144313.2050392-13-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agoip6_tunnel: use exit_batch_rtnl() method
Eric Dumazet [Tue, 6 Feb 2024 14:43:07 +0000 (14:43 +0000)]
ip6_tunnel: use exit_batch_rtnl() method

exit_batch_rtnl() is called while RTNL is held,
and devices to be unregistered can be queued in the dev_kill_list.

This saves one rtnl_lock()/rtnl_unlock() pair
and one unregister_netdevice_many() call.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Antoine Tenart <atenart@kernel.org>
Link: https://lore.kernel.org/r/20240206144313.2050392-12-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agoip6_gre: use exit_batch_rtnl() method
Eric Dumazet [Tue, 6 Feb 2024 14:43:06 +0000 (14:43 +0000)]
ip6_gre: use exit_batch_rtnl() method

exit_batch_rtnl() is called while RTNL is held,
and devices to be unregistered can be queued in the dev_kill_list.

This saves one rtnl_lock()/rtnl_unlock() pair
and one unregister_netdevice_many() call.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Antoine Tenart <atenart@kernel.org>
Link: https://lore.kernel.org/r/20240206144313.2050392-11-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agovxlan: use exit_batch_rtnl() method
Eric Dumazet [Tue, 6 Feb 2024 14:43:05 +0000 (14:43 +0000)]
vxlan: use exit_batch_rtnl() method

exit_batch_rtnl() is called while RTNL is held,
and devices to be unregistered can be queued in the dev_kill_list.

This saves one rtnl_lock()/rtnl_unlock() pair per netns
and one unregister_netdevice_many() call.

v4: (Paolo feedback : https://netdev-3.bots.linux.dev/vmksft-net/results/453141/17-udpgro-fwd-sh/stdout )
  - Changed vxlan_destroy_tunnels() to use vxlan_dellink()
    instead of unregister_netdevice_queue to propely remove
    devices from vn->vxlan_list.
  - vxlan_destroy_tunnels() can simply iterate one list (vn->vxlan_list)
    to find all devices in the most efficient way.
  - Moved sanity checks in a separate vxlan_exit_net() method.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Antoine Tenart <atenart@kernel.org>
Link: https://lore.kernel.org/r/20240206144313.2050392-10-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agoipv4: add __unregister_nexthop_notifier()
Eric Dumazet [Tue, 6 Feb 2024 14:43:04 +0000 (14:43 +0000)]
ipv4: add __unregister_nexthop_notifier()

unregister_nexthop_notifier() assumes the caller does not hold rtnl.

We need in the following patch to use it from a context
already holding rtnl.

Add __unregister_nexthop_notifier().

unregister_nexthop_notifier() becomes a wrapper.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Antoine Tenart <atenart@kernel.org>
Link: https://lore.kernel.org/r/20240206144313.2050392-9-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agogtp: use exit_batch_rtnl() method
Eric Dumazet [Tue, 6 Feb 2024 14:43:03 +0000 (14:43 +0000)]
gtp: use exit_batch_rtnl() method

exit_batch_rtnl() is called while RTNL is held,
and devices to be unregistered can be queued in the dev_kill_list.

This saves one rtnl_lock()/rtnl_unlock() pair per netns
and one unregister_netdevice_many() call per netns.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Antoine Tenart <atenart@kernel.org>
Link: https://lore.kernel.org/r/20240206144313.2050392-8-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agogeneve: use exit_batch_rtnl() method
Eric Dumazet [Tue, 6 Feb 2024 14:43:02 +0000 (14:43 +0000)]
geneve: use exit_batch_rtnl() method

exit_batch_rtnl() is called while RTNL is held,
and devices to be unregistered can be queued in the dev_kill_list.

This saves one rtnl_lock()/rtnl_unlock() pair,
and one unregister_netdevice_many() call.

Note: it should be possible to remove the synchronize_net()
call from geneve_sock_release() in a future patch.

v4: move WARN_ON_ONCE(!list_empty(&gn->sock_list))
   into geneve_exit_net(), after devices have been unregistered.
   (Antoine Tenart feedback)

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Antoine Tenart <atenart@kernel.org>
Link: https://lore.kernel.org/r/20240206144313.2050392-7-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agobonding: use exit_batch_rtnl() method
Eric Dumazet [Tue, 6 Feb 2024 14:43:01 +0000 (14:43 +0000)]
bonding: use exit_batch_rtnl() method

exit_batch_rtnl() is called while RTNL is held,
and devices to be unregistered can be queued in the dev_kill_list.

This saves one rtnl_lock()/rtnl_unlock() pair,
and one unregister_netdevice_many() call.

v2: Added bond_net_pre_exit() method to make sure bond_destroy_sysfs()
    is called before we unregister the devices in bond_net_exit_batch_rtnl
 (Antoine Tenart : https://lore.kernel.org/netdev/170688415193.5216.10499830272732622816@kwain/)

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Jay Vosburgh <j.vosburgh@gmail.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Reviewed-by: Antoine Tenart <atenart@kernel.org>
Link: https://lore.kernel.org/r/20240206144313.2050392-6-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agobareudp: use exit_batch_rtnl() method
Eric Dumazet [Tue, 6 Feb 2024 14:43:00 +0000 (14:43 +0000)]
bareudp: use exit_batch_rtnl() method

exit_batch_rtnl() is called while RTNL is held,
and devices to be unregistered can be queued in the dev_kill_list.

This saves one rtnl_lock()/rtnl_unlock() pair,
and one unregister_netdevice_many() call.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Antoine Tenart <atenart@kernel.org>
Link: https://lore.kernel.org/r/20240206144313.2050392-5-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agonexthop: convert nexthop_net_exit_batch to exit_batch_rtnl method
Eric Dumazet [Tue, 6 Feb 2024 14:42:59 +0000 (14:42 +0000)]
nexthop: convert nexthop_net_exit_batch to exit_batch_rtnl method

exit_batch_rtnl() is called while RTNL is held.

This saves one rtnl_lock()/rtnl_unlock() pair.

We also need to create nexthop_net_exit()
to make sure net->nexthop.devhash is not freed too soon,
otherwise we will not be able to unregister netdev
from exit_batch_rtnl() methods.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Antoine Tenart <atenart@kernel.org>
Link: https://lore.kernel.org/r/20240206144313.2050392-4-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agonet: add exit_batch_rtnl() method
Eric Dumazet [Tue, 6 Feb 2024 14:42:57 +0000 (14:42 +0000)]
net: add exit_batch_rtnl() method

Many (struct pernet_operations)->exit_batch() methods have
to acquire rtnl.

In presence of rtnl mutex pressure, this makes cleanup_net()
very slow.

This patch adds a new exit_batch_rtnl() method to reduce
number of rtnl acquisitions from cleanup_net().

exit_batch_rtnl() handlers are called while rtnl is locked,
and devices to be killed can be queued in a list provided
as their second argument.

A single unregister_netdevice_many() is called right
before rtnl is released.

exit_batch_rtnl() handlers are called before ->exit() and
->exit_batch() handlers.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Antoine Tenart <atenart@kernel.org>
Link: https://lore.kernel.org/r/20240206144313.2050392-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agoMerge branch 'mt7530-dsa-subdriver-improvements-act-ii'
Jakub Kicinski [Thu, 8 Feb 2024 02:53:55 +0000 (18:53 -0800)]
Merge branch 'mt7530-dsa-subdriver-improvements-act-ii'

Arınç ÜNAL says:

====================
MT7530 DSA Subdriver Improvements Act II

This is the second patch series with the goal of simplifying the MT7530 DSA
subdriver and improving support for MT7530, MT7531, and the switch on the
MT7988 SoC.

I have done a simple ping test to confirm basic communication on all switch
ports on MCM and standalone MT7530, and MT7531 switch with this patch
series applied.

MT7621 Unielec, MCM MT7530:

rgmii-only-gmac0-mt7621-unielec-u7621-06-16m.dtb
gmac0-and-gmac1-mt7621-unielec-u7621-06-16m.dtb

tftpboot 0x80008000 mips-uzImage.bin; tftpboot 0x83000000 mips-rootfs.cpio.uboot; tftpboot 0x83f00000 $dtb; bootm 0x80008000 0x83000000 0x83f00000

MT7622 Bananapi, MT7531:

gmac0-and-gmac1-mt7622-bananapi-bpi-r64.dtb

tftpboot 0x40000000 arm64-Image; tftpboot 0x45000000 arm64-rootfs.cpio.uboot; tftpboot 0x4a000000 $dtb; booti 0x40000000 0x45000000 0x4a000000

MT7623 Bananapi, standalone MT7530:

rgmii-only-gmac0-mt7623n-bananapi-bpi-r2.dtb
gmac0-and-gmac1-mt7623n-bananapi-bpi-r2.dtb

tftpboot 0x80008000 arm-zImage; tftpboot 0x83000000 arm-rootfs.cpio.uboot; tftpboot 0x83f00000 $dtb; bootz 0x80008000 0x83000000 0x83f00000

This patch series is the continuation of the patch series linked below.

https://lore.kernel.org/r/20230522121532.86610-1-arinc.unal@arinc9.com

Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
====================

Link: https://lore.kernel.org/r/20240206-for-netnext-mt7530-improvements-2-v5-0-d7d92a185cb1@arinc9.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agonet: dsa: mt7530: do not clear config->supported_interfaces
Arınç ÜNAL [Mon, 5 Feb 2024 22:08:08 +0000 (01:08 +0300)]
net: dsa: mt7530: do not clear config->supported_interfaces

There's no need to clear the config->supported_interfaces bitmap before
reporting the supported interfaces as all bits in the bitmap will already
be initialized to zero when the phylink_config structure is allocated. The
"config" pointer points to &dp->phylink_config, and "dp" is allocated by
dsa_port_touch() with kzalloc(), so all its fields are filled with zeroes.

There's no code that would change the bitmap beforehand. Remove it.

Acked-by: Daniel Golle <daniel@makrotopia.org>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Link: https://lore.kernel.org/r/20240206-for-netnext-mt7530-improvements-2-v5-7-d7d92a185cb1@arinc9.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agonet: dsa: mt7530: correct port capabilities of MT7988
Arınç ÜNAL [Mon, 5 Feb 2024 22:08:07 +0000 (01:08 +0300)]
net: dsa: mt7530: correct port capabilities of MT7988

On the switch on the MT7988 SoC, as shown in Block Diagram 8.1.1.3 on page
125 of "MT7988A Wi-Fi 7 Generation Router Platform: Datasheet (Open
Version) v0.1", there are only 4 PHYs. That's port 0 to 3. Set the case for
ports which connect to switch PHYs to '0 ... 3'.

Port 4 and 5 are not used at all in this design.

Link: https://wiki.banana-pi.org/Banana_Pi_BPI-R4#Documents
Acked-by: Daniel Golle <daniel@makrotopia.org>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Link: https://lore.kernel.org/r/20240206-for-netnext-mt7530-improvements-2-v5-6-d7d92a185cb1@arinc9.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agonet: dsa: mt7530: remove pad_setup function pointer
Arınç ÜNAL [Mon, 5 Feb 2024 22:08:06 +0000 (01:08 +0300)]
net: dsa: mt7530: remove pad_setup function pointer

The pad_setup function pointer was introduced with 88bdef8be9f6 ("net: dsa:
mt7530: Extend device data ready for adding a new hardware"). It was being
used to set up the core clock and port 6 of the MT7530 switch, and pll of
the MT7531 switch.

All of these were moved to more appropriate locations, and it was never
used for the switch on the MT7988 SoC. Therefore, this function pointer
hasn't got a use anymore. Remove it.

Acked-by: Daniel Golle <daniel@makrotopia.org>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Link: https://lore.kernel.org/r/20240206-for-netnext-mt7530-improvements-2-v5-5-d7d92a185cb1@arinc9.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agonet: dsa: mt7530: call port 6 setup from mt7530_mac_config()
Arınç ÜNAL [Mon, 5 Feb 2024 22:08:05 +0000 (01:08 +0300)]
net: dsa: mt7530: call port 6 setup from mt7530_mac_config()

mt7530_pad_clk_setup() is called if port 6 is enabled. It used to do more
things than setting up port 6. That part was moved to more appropriate
locations, mt7530_setup() and mt7530_pll_setup().

Now that all it does is set up port 6, rename it to mt7530_setup_port6(),
and move it to a more appropriate location, under mt7530_mac_config().

Change mt7530_setup_port6() to void as there're no error cases.

Leave an empty mt7530_pad_clk_setup() to satisfy the pad_setup function
pointer.

This is the code path for setting up the ports before:

dsa_switch_ops :: phylink_mac_config() -> mt753x_phylink_mac_config()
-> mt753x_mac_config()
   -> mt753x_info :: mac_port_config() -> mt7530_mac_config()
      -> mt7530_setup_port5()
-> mt753x_pad_setup()
   -> mt753x_info :: pad_setup() -> mt7530_pad_clk_setup()

This is after:

dsa_switch_ops :: phylink_mac_config() -> mt753x_phylink_mac_config()
-> mt753x_mac_config()
   -> mt753x_info :: mac_port_config() -> mt7530_mac_config()
      -> mt7530_setup_port5()
      -> mt7530_setup_port6()

Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Link: https://lore.kernel.org/r/20240206-for-netnext-mt7530-improvements-2-v5-4-d7d92a185cb1@arinc9.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agonet: dsa: mt7530: simplify mt7530_pad_clk_setup()
Arınç ÜNAL [Mon, 5 Feb 2024 22:08:04 +0000 (01:08 +0300)]
net: dsa: mt7530: simplify mt7530_pad_clk_setup()

This code is from before this driver was converted to phylink API. Phylink
deals with the unsupported interface cases before mt7530_pad_clk_setup() is
run. Therefore, the default case would never run. However, it must be
defined nonetheless to handle all the remaining enumeration values, the
phy-modes.

Switch to if statement for RGMII and return which simplifies the code and
saves an indent.

Set P6_INTF_MODE, which is the three least significant bits of the
MT7530_P6ECR register, to 0 for RGMII even though it will already be 0
after reset. This is to keep supporting dynamic reconfiguration of the port
in the case the interface changes from TRGMII to RGMII.

Disable the TRGMII clocks for all cases. They will be enabled if TRGMII is
being used.

Read XTAL after checking for RGMII as it's only needed for the TRGMII
interface mode.

Reviewed-by: Daniel Golle <daniel@makrotopia.org>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/20240206-for-netnext-mt7530-improvements-2-v5-3-d7d92a185cb1@arinc9.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agonet: dsa: mt7530: move XTAL check to mt7530_setup()
Arınç ÜNAL [Mon, 5 Feb 2024 22:08:03 +0000 (01:08 +0300)]
net: dsa: mt7530: move XTAL check to mt7530_setup()

The crystal frequency concerns the switch core. The frequency should be
checked when the switch is being set up so the driver can reject the
unsupported hardware earlier and without requiring port 6 to be used.

Move it to mt7530_setup(). Drop the unnecessary function printing.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Link: https://lore.kernel.org/r/20240206-for-netnext-mt7530-improvements-2-v5-2-d7d92a185cb1@arinc9.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agonet: dsa: mt7530: empty default case on mt7530_setup_port5()
Arınç ÜNAL [Mon, 5 Feb 2024 22:08:02 +0000 (01:08 +0300)]
net: dsa: mt7530: empty default case on mt7530_setup_port5()

There're two code paths for setting up port 5:

mt7530_setup()
-> mt7530_setup_port5()

mt753x_phylink_mac_config()
-> mt753x_mac_config()
   -> mt7530_mac_config()
      -> mt7530_setup_port5()

On the first code path, priv->p5_intf_sel is either set to
P5_INTF_SEL_PHY_P0 or P5_INTF_SEL_PHY_P4 when mt7530_setup_port5() is run.

On the second code path, priv->p5_intf_sel is set to P5_INTF_SEL_GMAC5 when
mt7530_setup_port5() is run.

Empty the default case which will never run but is needed nonetheless to
handle all the remaining enumeration values.

Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Link: https://lore.kernel.org/r/20240206-for-netnext-mt7530-improvements-2-v5-1-d7d92a185cb1@arinc9.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agor8169: remove setting LED default trigger, this is done by LED core now
Heiner Kallweit [Mon, 5 Feb 2024 21:54:08 +0000 (22:54 +0100)]
r8169: remove setting LED default trigger, this is done by LED core now

After 1c75c424bd43 ("leds: class: If no default trigger is given, make
hw_control trigger the default trigger") this line isn't needed any
longer.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/3a9cd1a1-40ad-487d-8b1e-6bf255419232@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agoMerge tag 'mlx5-updates-2024-02-01' of git://git.kernel.org/pub/scm/linux/kernel...
Jakub Kicinski [Thu, 8 Feb 2024 02:34:34 +0000 (18:34 -0800)]
Merge tag 'mlx5-updates-2024-02-01' of git://git./linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2024-02-01

1) IPSec global stats for xfrm and mlx5
2) XSK memory improvements for non-linear SKBs
3) Software steering debug dump to use seq_file ops
4) Various code clean-ups

* tag 'mlx5-updates-2024-02-01' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5e: XDP, Exclude headroom and tailroom from memory calculations
  net/mlx5e: XSK, Exclude tailroom from non-linear SKBs memory calculations
  net/mlx5: DR, Change SWS usage to debug fs seq_file interface
  net/mlx5: Change missing SyncE capability print to debug
  net/mlx5: Remove initial segmentation duplicate definitions
  net/mlx5: Return specific error code for timeout on wait_fw_init
  net/mlx5: SF, Stop waiting for FW as teardown was called
  net/mlx5: remove fw reporter dump option for non PF
  net/mlx5: remove fw_fatal reporter dump option for non PF
  net/mlx5: Rename mlx5_sf_dev_remove
  Documentation: Fix counter name of mlx5 vnic reporter
  net/mlx5e: Delete obsolete IPsec code
  net/mlx5e: Connect mlx5 IPsec statistics with XFRM core
  xfrm: get global statistics from the offloaded device
  xfrm: generalize xdo_dev_state_update_curlft to allow statistics update
====================

Link: https://lore.kernel.org/r/20240206005527.1353368-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agoMerge branch 'selftests-bonding-use-slowwait-when-waiting'
Jakub Kicinski [Thu, 8 Feb 2024 02:26:22 +0000 (18:26 -0800)]
Merge branch 'selftests-bonding-use-slowwait-when-waiting'

Hangbin Liu says:

====================
selftests: bonding: use slowwait when waiting

There are a lot waitings in bonding tests use sleep. Let's replace them with
slowwait(added in the first patch). This could save much test time. e.g.

bond-break-lacpdu-tx.sh
  before: 0m16.346s
  after: 0m2.824s

bond_options.sh
  before: 9m25.299s
  after: 6m14.439s

bond-lladdr-target.sh
  before: 0m7.090s
  after: 0m6.148s

In total, we could save about 180 seconds.
====================

Link: https://lore.kernel.org/r/20240205130048.282087-1-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agoselftests: bonding: use slowwait instead of hard code sleep
Hangbin Liu [Mon, 5 Feb 2024 13:00:48 +0000 (21:00 +0800)]
selftests: bonding: use slowwait instead of hard code sleep

Use slowwait instead of hard code sleep for bonding tests.

In function setup_prepare(), the client_create() will be called after
server_create(). So I think there is no need to sleep in server_create()
and remove it.

For lab_lib.sh, remove bonding module may affect other running bonding tests.
And some test env may buildin bond which can't be removed. The bonding
link should be removed by lag_reset_network() or netns delete.

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/20240205130048.282087-5-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agoselftests: bonding: reduce garp_test/arp_validate test time
Hangbin Liu [Mon, 5 Feb 2024 13:00:47 +0000 (21:00 +0800)]
selftests: bonding: reduce garp_test/arp_validate test time

The purpose of grat_arp is testing commit 9949e2efb54e ("bonding: fix
send_peer_notif overflow"). As the send_peer_notif was defined to u8,
to overflow it, we need to

send_peer_notif = num_peer_notif * peer_notif_delay = num_grat_arp * peer_notify_delay / miimon > 255
  (kernel)           (kernel parameter)                   (user parameter)

e.g. 30 (num_grat_arp) * 1000 (peer_notify_delay) / 100 (miimon) > 255.

Which need 30s to complete sending garp messages. To save the testing time,
the only way is reduce the miimon number. Something like
30 (num_grat_arp) * 100 (peer_notify_delay) / 10 (miimon) > 255.

To save more time, the 50 num_grat_arp testing could be removed.

The arp_validate_test also need to check the mii_status, which sleep
too long. Use slowwait to save some time.

For other connection checkings, make sure active slave changed first.

Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/20240205130048.282087-4-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agoselftests: bonding: use tc filter to check if LACP was sent
Hangbin Liu [Mon, 5 Feb 2024 13:00:46 +0000 (21:00 +0800)]
selftests: bonding: use tc filter to check if LACP was sent

Use tc filter to check if LACP was sent, which is accurate and save
more time.

No need to remove bonding module as some test env may buildin bonding.
And the bond link has been deleted.

Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/20240205130048.282087-3-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agoselftests/net/forwarding: add slowwait functions
Hangbin Liu [Mon, 5 Feb 2024 13:00:45 +0000 (21:00 +0800)]
selftests/net/forwarding: add slowwait functions

Add slowwait functions to wait for some operations that may need a long time
to finish. The busywait executes the cmd too fast, which is kind of wasting
cpu in this scenario. At the same time, if shell debugging is enabled with
`set -x`. the busywait will output too much logs.

Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/20240205130048.282087-2-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agonet/smc: change the term virtual ISM to Emulated-ISM
Wen Gu [Mon, 5 Feb 2024 03:33:17 +0000 (11:33 +0800)]
net/smc: change the term virtual ISM to Emulated-ISM

According to latest release of SMCv2.1[1], the term 'virtual ISM' has
been changed to 'Emulated-ISM' to avoid the ambiguity of the word
'virtual' in different contexts. So the names or comments in the code
need be modified accordingly.

[1] https://www.ibm.com/support/pages/node/7112343

Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com>
Link: https://lore.kernel.org/r/20240205033317.127269-1-guwen@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agoMerge branch 'net-phy-realtek-complete-5gbps-support-and-replace-private-constants'
Jakub Kicinski [Thu, 8 Feb 2024 02:19:55 +0000 (18:19 -0800)]
Merge branch 'net-phy-realtek-complete-5gbps-support-and-replace-private-constants'

Heiner Kallweit says:

====================
net: phy: realtek: complete 5Gbps support and replace private constants

Realtek maps standard C45 registers to vendor-specific registers which
can be accessed via C22 w/o MMD. For an unknown reason C22 MMD access
to C45 registers isn't supported for integrated PHY's.
However the vendor-specific registers preserve the format of the C45
registers, so we can use standard constants. First two patches are
cherry-picked from a series posted by Marek some time ago.

RTL8126 supports 5Gbps, therefore add the missing 5Gbps support to
rtl822x_config_aneg().
====================

Link: https://lore.kernel.org/r/31a83fd9-90ce-402a-84c7-d5c20540b730@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agonet: phy: realtek: add 5Gbps support to rtl822x_config_aneg()
Heiner Kallweit [Sun, 4 Feb 2024 14:18:50 +0000 (15:18 +0100)]
net: phy: realtek: add 5Gbps support to rtl822x_config_aneg()

RTL8126 as an evolution of RTL8125 supports 5Gbps. rtl822x_config_aneg()
is used by the PHY driver for the integrated PHY, therefore add 5Gbps
support to it.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/5644ab50-e3e9-477c-96db-05cd5bdc2563@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agonet: phy: realtek: use generic MDIO constants
Marek Behún [Sun, 4 Feb 2024 14:17:53 +0000 (15:17 +0100)]
net: phy: realtek: use generic MDIO constants

Drop the ad-hoc MDIO constants used in the driver and use generic
constants instead.

Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/732a70d6-4191-4aae-8862-3716b062aa9e@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agonet: mdio: add 2.5g and 5g related PMA speed constants
Marek Behún [Sun, 4 Feb 2024 14:16:46 +0000 (15:16 +0100)]
net: mdio: add 2.5g and 5g related PMA speed constants

Add constants indicating 2.5g and 5g ability in the MMD PMA speed
register.

Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/98e15038-d96c-442f-93e4-410100d27866@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agonetfilter: nft_compat: restrict match/target protocol to u16
Pablo Neira Ayuso [Thu, 1 Feb 2024 23:05:23 +0000 (00:05 +0100)]
netfilter: nft_compat: restrict match/target protocol to u16

xt_check_{match,target} expects u16, but NFTA_RULE_COMPAT_PROTO is u32.

NLA_POLICY_MAX(NLA_BE32, 65535) cannot be used because .max in
nla_policy is s16, see 3e48be05f3c7 ("netlink: add attribute range
validation to policy").

Fixes: 0ca743a55991 ("netfilter: nf_tables: add compatibility layer for x_tables")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
20 months agonetfilter: nft_compat: reject unused compat flag
Pablo Neira Ayuso [Thu, 1 Feb 2024 22:33:29 +0000 (23:33 +0100)]
netfilter: nft_compat: reject unused compat flag

Flag (1 << 0) is ignored is set, never used, reject it it with EINVAL
instead.

Fixes: 0ca743a55991 ("netfilter: nf_tables: add compatibility layer for x_tables")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
20 months agonetfilter: nft_compat: narrow down revision to unsigned 8-bits
Pablo Neira Ayuso [Thu, 1 Feb 2024 21:58:36 +0000 (22:58 +0100)]
netfilter: nft_compat: narrow down revision to unsigned 8-bits

xt_find_revision() expects u8, restrict it to this datatype.

Fixes: 0ca743a55991 ("netfilter: nf_tables: add compatibility layer for x_tables")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
20 months agodpll: check that pin is registered in __dpll_pin_unregister()
Jiri Pirko [Tue, 6 Feb 2024 07:48:53 +0000 (08:48 +0100)]
dpll: check that pin is registered in __dpll_pin_unregister()

Similar to what is done in dpll_device_unregister(), add assertion to
__dpll_pin_unregister() to make sure driver does not try to unregister
non-registered pin.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Link: https://lore.kernel.org/r/20240206074853.345744-1-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agoMerge tag 'wireless-2024-02-06' of git://git.kernel.org/pub/scm/linux/kernel/git...
Jakub Kicinski [Wed, 7 Feb 2024 18:34:51 +0000 (10:34 -0800)]
Merge tag 'wireless-2024-02-06' of git://git./linux/kernel/git/wireless/wireless

Kalle Valo says:

====================
wireless fixes for v6.8-rc4

This time we have unusually large wireless pull request. Several
functionality fixes to both stack and iwlwifi. Lots of fixes to
warnings, especially to MODULE_DESCRIPTION().

* tag 'wireless-2024-02-06' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: (31 commits)
  wifi: mt76: mt7996: fix fortify warning
  wifi: brcmfmac: Adjust n_channels usage for __counted_by
  wifi: iwlwifi: do not announce EPCS support
  wifi: iwlwifi: exit eSR only after the FW does
  wifi: iwlwifi: mvm: fix a battery life regression
  wifi: mac80211: accept broadcast probe responses on 6 GHz
  wifi: mac80211: adding missing drv_mgd_complete_tx() call
  wifi: mac80211: fix waiting for beacons logic
  wifi: mac80211: fix unsolicited broadcast probe config
  wifi: mac80211: initialize SMPS mode correctly
  wifi: mac80211: fix driver debugfs for vif type change
  wifi: mac80211: set station RX-NSS on reconfig
  wifi: mac80211: fix RCU use in TDLS fast-xmit
  wifi: mac80211: improve CSA/ECSA connection refusal
  wifi: cfg80211: detect stuck ECSA element in probe resp
  wifi: iwlwifi: remove extra kernel-doc
  wifi: fill in MODULE_DESCRIPTION()s for mt76 drivers
  wifi: fill in MODULE_DESCRIPTION()s for wilc1000
  wifi: fill in MODULE_DESCRIPTION()s for wl18xx
  wifi: fill in MODULE_DESCRIPTION()s for p54spi
  ...
====================

Link: https://lore.kernel.org/r/20240206095722.CD9D2C433F1@smtp.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agoMerge tag 'loongarch-fixes-6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Wed, 7 Feb 2024 18:06:16 +0000 (18:06 +0000)]
Merge tag 'loongarch-fixes-6.8-2' of git://git./linux/kernel/git/chenhuacai/linux-loongson

Pull LoongArch fixes from Huacai Chen:
 "Fix acpi_core_pic[] array overflow, fix earlycon parameter if KASAN
  enabled, disable UBSAN instrumentation for vDSO build, and two Kconfig
  cleanups"

* tag 'loongarch-fixes-6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
  LoongArch: vDSO: Disable UBSAN instrumentation
  LoongArch: Fix earlycon parameter if KASAN enabled
  LoongArch: Change acpi_core_pic[NR_CPUS] to acpi_core_pic[MAX_CORE_PIC]
  LoongArch: Select HAVE_ARCH_SECCOMP to use the common SECCOMP menu
  LoongArch: Select ARCH_ENABLE_THP_MIGRATION instead of redefining it

20 months agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Wed, 7 Feb 2024 17:52:16 +0000 (17:52 +0000)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "x86 guest:

   - Avoid false positive for check that only matters on AMD processors

  x86:

   - Give a hint when Win2016 might fail to boot due to XSAVES &&
     !XSAVEC configuration

   - Do not allow creating an in-kernel PIT unless an IOAPIC already
     exists

  RISC-V:

   - Allow ISA extensions that were enabled for bare metal in 6.8 (Zbc,
     scalar and vector crypto, Zfh[min], Zihintntl, Zvfh[min], Zfa)

  S390:

   - fix CC for successful PQAP instruction

   - fix a race when creating a shadow page"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  x86/coco: Define cc_vendor without CONFIG_ARCH_HAS_CC_PLATFORM
  x86/kvm: Fix SEV check in sev_map_percpu_data()
  KVM: x86: Give a hint when Win2016 might fail to boot due to XSAVES erratum
  KVM: x86: Check irqchip mode before create PIT
  KVM: riscv: selftests: Add Zfa extension to get-reg-list test
  RISC-V: KVM: Allow Zfa extension for Guest/VM
  KVM: riscv: selftests: Add Zvfh[min] extensions to get-reg-list test
  RISC-V: KVM: Allow Zvfh[min] extensions for Guest/VM
  KVM: riscv: selftests: Add Zihintntl extension to get-reg-list test
  RISC-V: KVM: Allow Zihintntl extension for Guest/VM
  KVM: riscv: selftests: Add Zfh[min] extensions to get-reg-list test
  RISC-V: KVM: Allow Zfh[min] extensions for Guest/VM
  KVM: riscv: selftests: Add vector crypto extensions to get-reg-list test
  RISC-V: KVM: Allow vector crypto extensions for Guest/VM
  KVM: riscv: selftests: Add scaler crypto extensions to get-reg-list test
  RISC-V: KVM: Allow scalar crypto extensions for Guest/VM
  KVM: riscv: selftests: Add Zbc extension to get-reg-list test
  RISC-V: KVM: Allow Zbc extension for Guest/VM
  KVM: s390: fix cc for successful PQAP
  KVM: s390: vsie: fix race during shadow creation

20 months agoMerge tag 'nfsd-6.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Linus Torvalds [Wed, 7 Feb 2024 17:48:15 +0000 (17:48 +0000)]
Merge tag 'nfsd-6.8-3' of git://git./linux/kernel/git/cel/linux

Pull nfsd fix from Chuck Lever:

 - Address a deadlock regression in RELEASE_LOCKOWNER

* tag 'nfsd-6.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  nfsd: don't take fi_lock in nfsd_break_deleg_cb()

20 months agonet: intel: fix old compiler regressions
Jesse Brandeburg [Tue, 6 Feb 2024 02:29:06 +0000 (18:29 -0800)]
net: intel: fix old compiler regressions

The kernel build regressions/improvements email contained a couple of
issues with old compilers (in fact all the reports were on different
architectures, but all gcc 5.5) and the FIELD_PREP() and FIELD_GET()
conversions. They're all because an integer #define that should have
been declared as unsigned, was shifted to the point that it could set
the sign bit.

The fix just involves making sure the defines use the "U" identifier on
the constants to make sure they're unsigned. Should make the checkers
happier.

Confirmed with objdump before/after that there is no change to the
binaries.

Issues were reported as follows:
./drivers/net/ethernet/intel/ice/ice_base.c:238:7: note: in expansion of macro 'FIELD_GET'
      (FIELD_GET(GLINT_CTL_ITR_GRAN_25_M, regval) == ICE_ITR_GRAN_US))
       ^
./include/linux/compiler_types.h:435:38: error: call to '__compiletime_assert_1093' declared with attribute error: FIELD_GET: mask is not constant
drivers/net/ethernet/intel/ice/ice_nvm.c:709:16: note: in expansion of macro ‘FIELD_GET’
  orom->major = FIELD_GET(ICE_OROM_VER_MASK, combo_ver);
                ^
./include/linux/compiler_types.h:435:38: error: call to ‘__compiletime_assert_796’ declared with attribute error: FIELD_GET: mask is not constant
drivers/net/ethernet/intel/ice/ice_common.c:945:18: note: in expansion of macro ‘FIELD_GET’
  u8 max_agg_bw = FIELD_GET(GL_PWR_MODE_CTL_CAR_MAX_BW_M,
                  ^
./include/linux/compiler_types.h:435:38: error: call to ‘__compiletime_assert_420’ declared with attribute error: FIELD_GET: mask is not constant
drivers/net/ethernet/intel/i40e/i40e_dcb.c:458:8: note: in expansion of macro ‘FIELD_GET’
  oui = FIELD_GET(I40E_LLDP_TLV_OUI_MASK, ouisubtype);
        ^

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Closes: https://lore.kernel.org/lkml/d03e90ca-8485-4d1b-5ec1-c3398e0e8da@linux-m68k.org/ #i40e #ice
Fixes: 62589808d73b ("i40e: field get conversion")
Fixes: 5a259f8e0baf ("ice: field get conversion")
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Link: https://lore.kernel.org/r/20240206022906.2194214-1-jesse.brandeburg@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agotsnep: Use devm_platform_get_and_ioremap_resource() in tsnep_probe()
Markus Elfring [Mon, 5 Feb 2024 12:43:14 +0000 (13:43 +0100)]
tsnep: Use devm_platform_get_and_ioremap_resource() in tsnep_probe()

A wrapper function is available since the commit 890cc39a879906b63912482dfc41944579df2dc6
("drivers: provide devm_platform_get_and_ioremap_resource()").
Thus reuse existing functionality instead of keeping duplicate source code.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Tested-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Link: https://lore.kernel.org/r/29e9dc0f-5597-4fee-be5c-25a5ab4fe2dc@web.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agonet: Do not return value from init_dummy_netdev()
Amit Cohen [Mon, 5 Feb 2024 10:30:22 +0000 (12:30 +0200)]
net: Do not return value from init_dummy_netdev()

init_dummy_netdev() always returns zero and all the callers do not check
the returned value. Set the function to not return value, as it is not
really used today.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240205103022.440946-1-amcohen@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agoMAINTAINERS: Maintainer change for rds
Allison Henderson [Mon, 5 Feb 2024 19:03:43 +0000 (12:03 -0700)]
MAINTAINERS: Maintainer change for rds

At this point, Santosh has moved onto other things and I am happy
to take over the role of rds maintainer. Update the MAINTAINERS
accordingly.

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Link: https://lore.kernel.org/r/20240205190343.112436-1-allison.henderson@oracle.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agoMerge branch 'net-eee-network-driver-cleanups'
Jakub Kicinski [Wed, 7 Feb 2024 17:03:40 +0000 (09:03 -0800)]
Merge branch 'net-eee-network-driver-cleanups'

Russell King says:

====================
net: eee network driver cleanups

Since commit d1420bb99515 ("net: phy: improve generic EEE ethtool
functions") changed phylib to set eee->eee_active and eee->eee_enabled,
overriding anything that drivers have set these to prior to calling
phy_ethtool_get_eee().

Therefore, drivers setting these members becomes redundant, since
phylib overwrites the values they set. This series finishes off
Heiner's work in the referenced commit by removing these redundant
writes in various drivers and any associated code or structure members
that become unnecessary.
====================

Link: https://lore.kernel.org/r/Zb9/O81fVAZw4ANr@shell.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agonet: dsa: b53: remove eee_enabled/eee_active in b53_get_mac_eee()
Russell King (Oracle) [Sun, 4 Feb 2024 12:13:28 +0000 (12:13 +0000)]
net: dsa: b53: remove eee_enabled/eee_active in b53_get_mac_eee()

b53_get_mac_eee() sets both eee_enabled and eee_active, and then
returns zero.

dsa_slave_get_eee(), which calls this function, will then continue to
call phylink_ethtool_get_eee(), which will return -EOPNOTSUPP if there
is no PHY present, otherwise calling phy_ethtool_get_eee() which in
turn will call genphy_c45_ethtool_get_eee().

genphy_c45_ethtool_get_eee() will overwrite eee_enabled and eee_active
with its own interpretation from the PHYs settings and negotiation
result.

Thus, when there is no PHY, dsa_slave_get_eee() will fail with
-EOPNOTSUPP, meaning eee_enabled and eee_active will not be returned to
userspace. When there is a PHY, eee_enabled and eee_active will be
overwritten by phylib, making the setting of these members in
b53_get_mac_eee() entirely unnecessary.

Remove this code, thus simplifying b53_get_mac_eee().

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/E1rWbNI-002cCz-4x@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agonet: bcmasp: remove eee_enabled/eee_active in bcmasp_get_eee()
Russell King (Oracle) [Sun, 4 Feb 2024 12:13:22 +0000 (12:13 +0000)]
net: bcmasp: remove eee_enabled/eee_active in bcmasp_get_eee()

bcmasp_get_eee() sets edata->eee_active and edata->eee_enabled from
its own copy, and then calls phy_ethtool_get_eee() which in turn will
call genphy_c45_ethtool_get_eee().

genphy_c45_ethtool_get_eee() will overwrite eee_enabled and eee_active
with its own interpretation from the PHYs settings and negotiation
result.

Therefore, setting these members in bcmasp_get_eee() is redundant, and
can be removed. This also makes intf->eee.eee_active unnecessary, so
remove this and use a local variable where appropriate.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1rWbNC-002cCt-W7@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agonet: bcmgenet: remove eee_enabled/eee_active in bcmgenet_get_eee()
Russell King (Oracle) [Sun, 4 Feb 2024 12:13:17 +0000 (12:13 +0000)]
net: bcmgenet: remove eee_enabled/eee_active in bcmgenet_get_eee()

bcmgenet_get_eee() sets edata->eee_active and edata->eee_enabled from
its own copy, and then calls phy_ethtool_get_eee() which in turn will
call genphy_c45_ethtool_get_eee().

genphy_c45_ethtool_get_eee() will overwrite eee_enabled and eee_active
with its own interpretation from the PHYs settings and negotiation
result.

Therefore, setting these members in bcmgenet_get_eee() is redundant,
and can be removed. This also makes priv->eee.eee_active unnecessary,
so remove this and use a local variable where appropriate.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1rWbN7-002cCn-RO@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agonet: fec: remove eee_enabled/eee_active in fec_enet_get_eee()
Russell King (Oracle) [Sun, 4 Feb 2024 12:13:12 +0000 (12:13 +0000)]
net: fec: remove eee_enabled/eee_active in fec_enet_get_eee()

fec_enet_get_eee() sets edata->eee_active and edata->eee_enabled from
its own copy, and then calls phy_ethtool_get_eee() which in turn will
call genphy_c45_ethtool_get_eee().

genphy_c45_ethtool_get_eee() will overwrite eee_enabled and eee_active
with its own interpretation from the PHYs settings and negotiation
result.

Therefore, setting these members in fec_enet_get_eee() is redundant.
Remove this, and remove the setting of fep->eee.eee_active member which
becomes a write-only variable.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/E1rWbN2-002cCh-MY@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agonet: sxgbe: remove eee_enabled/eee_active in sxgbe_get_eee()
Russell King (Oracle) [Sun, 4 Feb 2024 12:13:07 +0000 (12:13 +0000)]
net: sxgbe: remove eee_enabled/eee_active in sxgbe_get_eee()

sxgbe_get_eee() sets edata->eee_active and edata->eee_enabled from its
own copy, and then calls phy_ethtool_get_eee() which in turn will call
genphy_c45_ethtool_get_eee().

genphy_c45_ethtool_get_eee() will overwrite eee_enabled and eee_active
with its own interpretation from the PHYs settings and negotiation
result.

Therefore, setting these members in sxgbe_get_eee() is redundant.
Remove this, and remove the priv->eee_active member which then becomes
a write-only variable.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1rWbMx-002cCb-IU@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agonet: stmmac: remove eee_enabled/eee_active in stmmac_ethtool_op_get_eee()
Russell King (Oracle) [Sun, 4 Feb 2024 12:13:02 +0000 (12:13 +0000)]
net: stmmac: remove eee_enabled/eee_active in stmmac_ethtool_op_get_eee()

stmmac_ethtool_op_get_eee() sets both eee_enabled and eee_active, and
then goes on to call phylink_ethtool_get_eee().

phylink_ethtool_get_eee() will return -EOPNOTSUPP if there is no PHY
present, otherwise calling phy_ethtool_get_eee() which in turn will call
genphy_c45_ethtool_get_eee().

genphy_c45_ethtool_get_eee() will overwrite eee_enabled and eee_active
with its own interpretation from the PHYs settings and negotiation
result.

Thus, when there is no PHY, stmmac_ethtool_op_get_eee() will fail with
-EOPNOTSUPP, meaning eee_enabled and eee_active will not be returned to
userspace. When there is a PHY, eee_enabled and eee_active will be
overwritten by phylib, making the setting of these members in
stmmac_ethtool_op_get_eee() entirely unnecessary.

Remove this code, thus simplifying stmmac_ethtool_op_get_eee().

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
Link: https://lore.kernel.org/r/E1rWbMs-002cCV-EE@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agoMerge branch 'net-phy-c22-c45-enumeration'
David S. Miller [Wed, 7 Feb 2024 13:47:07 +0000 (13:47 +0000)]
Merge branch 'net-phy-c22-c45-enumeration'

From: Andrew Lunn <andrew@lunn.ch>
To: Heiner Kallweit <hkallweit1@gmail.com>,
 Russell King <linux@armlinux.org.uk>,
 "David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
 Jakub Kicinski <kuba@kernel.org>,
Paolo Abeni <pabeni@redhat.com>,
 Florian Fainelli <f.fainelli@gmail.com>,
 Vladimir Oltean <olteanv@gmail.com>
Cc: netdev@vger.kernel.org,
Tim Menninger <tmenninger@purestorage.com>,
 Andrew Lunn <andrew@lunn.ch>

====================
net: Unify C22 and C45 error handling during bus enumeration

When enumerating an MDIO bus, an MDIO bus driver can return -ENODEV to
a C22 read transaction to indicate there is no device at that address
on the bus. Enumeration will then continue with the next address on
the bus.

Modify C45 enumeration so that it also accepts -ENODEV and moves to
the next address on the bus, rather than consider -ENODEV as a fatal
error.

Convert the mv88e6xxx driver to return -ENODEV rather than 0xffff on
read for families which do not support C45 bus transactions. This is
more efficient, since enumeration will scan multiple devices at one
address when 0xffff is returned, where as -EONDEV immediately jumps to
the next address on the bus.
====================

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
20 months agonet: dsa: mv88e6xxx: Return -ENODEV when C45 not supported
Andrew Lunn [Sun, 4 Feb 2024 23:14:15 +0000 (17:14 -0600)]
net: dsa: mv88e6xxx: Return -ENODEV when C45 not supported

MDIO bus drivers can return -ENODEV when they know the bus does not
have a device at the given address, e.g. because of hardware
limitation. One such limitation is that the bus does not support C45
at all. This is more efficient than returning 0xffff, since it
immediately stops the probing on the given address, where as further
reads can be made when 0xffff is returned.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
20 months agonet: phy: c45 scanning: Don't consider -ENODEV fatal
Andrew Lunn [Sun, 4 Feb 2024 23:14:14 +0000 (17:14 -0600)]
net: phy: c45 scanning: Don't consider -ENODEV fatal

When scanning the MDIO bus for C22 devices, the driver returning
-ENODEV is not considered fatal, it just indicates the MDIO bus master
knows there is no device at that address, maybe because of hardware
limitation.

Make the C45 scan code act on -ENODEV the same way, to make C22 and
C45 more uniform.

It is expected all reads for a given address will return -ENODEV, so
within get_phy_c45_ids() only the first place a read occurs has been
changed.

Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
20 months agonetdevsim: make nsim_bus const
Ricardo B. Marliere [Sun, 4 Feb 2024 20:16:34 +0000 (17:16 -0300)]
netdevsim: make nsim_bus const

Now that the driver core can properly handle constant struct bus_type,
move the nsim_bus variable to be a constant structure as well,
placing it into read-only memory which can not be modified at runtime.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
20 months agoselftests: cmsg_ipv6: repeat the exact packet
Jakub Kicinski [Sun, 4 Feb 2024 16:56:18 +0000 (08:56 -0800)]
selftests: cmsg_ipv6: repeat the exact packet

cmsg_ipv6 test requests tcpdump to capture 4 packets,
and sends until tcpdump quits. Only the first packet
is "real", however, and the rest are basic UDP packets.
So if tcpdump doesn't start in time it will miss
the real packet and only capture the UDP ones.

This makes the test fail on slow machine (no KVM or with
debug enabled) 100% of the time, while it passes in fast
environments.

Repeat the "real" / expected packet.

Fixes: 9657ad09e1fa ("selftests: net: test IPV6_TCLASS")
Fixes: 05ae83d5a4a2 ("selftests: net: test IPV6_HOPLIMIT")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
20 months agonet: tipc: remove redundant 'bool' from CONFIG_TIPC_{MEDIA_UDP,CRYPTO}
Masahiro Yamada [Sun, 4 Feb 2024 13:12:26 +0000 (22:12 +0900)]
net: tipc: remove redundant 'bool' from CONFIG_TIPC_{MEDIA_UDP,CRYPTO}

The 'bool' is already specified for these options.

The second 'bool' under the help message is redundant.

While I am here, I moved 'default y' above, as it is common to place
the help text last.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
20 months agonet: ethernet: remove duplicated CONFIG_SUNGEM_PHY entry
Masahiro Yamada [Sun, 4 Feb 2024 12:31:51 +0000 (21:31 +0900)]
net: ethernet: remove duplicated CONFIG_SUNGEM_PHY entry

Both drivers/net/Kconfig and drivers/net/ethernet/Kconfig contain the
same config entry:

  config SUNGEM_PHY
          tristate

Commit f860b0522f65 ("drivers/net: Kconfig and Makefile cleanup") moved
SUNGEM_PHY from drivers/net/Kconfig to drivers/net/ethernet/Kconfig.

Shortly after it was applied, commit 19e2f6fe9601 ("net: Fix sungem_phy
sharing.") added the second one to drivers/net/Kconfig.

I kept the one in drivers/net/Kconfig because this CONFIG option controls
the compilation of drivers/net/sungem_phy.c.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
20 months agonetlabel: cleanup struct netlbl_lsm_catmap
George Guo [Sun, 4 Feb 2024 02:35:31 +0000 (10:35 +0800)]
netlabel: cleanup struct netlbl_lsm_catmap

Simplify the code from macro NETLBL_CATMAP_MAPTYPE to u64, and fix
warning "Macros with complex values should be enclosed in parentheses"
on "#define NETLBL_CATMAP_BIT (NETLBL_CATMAP_MAPTYPE)0x01", which is
modified to "#define NETLBL_CATMAP_BIT ((u64)0x01)".

Signed-off-by: George Guo <guodongtai@kylinos.cn>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
20 months agonet: stmmac: protect updates of 64-bit statistics counters
Petr Tesarik [Sat, 3 Feb 2024 19:09:27 +0000 (20:09 +0100)]
net: stmmac: protect updates of 64-bit statistics counters

As explained by a comment in <linux/u64_stats_sync.h>, write side of struct
u64_stats_sync must ensure mutual exclusion, or one seqcount update could
be lost on 32-bit platforms, thus blocking readers forever. Such lockups
have been observed in real world after stmmac_xmit() on one CPU raced with
stmmac_napi_poll_tx() on another CPU.

To fix the issue without introducing a new lock, split the statics into
three parts:

1. fields updated only under the tx queue lock,
2. fields updated only during NAPI poll,
3. fields updated only from interrupt context,

Updates to fields in the first two groups are already serialized through
other locks. It is sufficient to split the existing struct u64_stats_sync
so that each group has its own.

Note that tx_set_ic_bit is updated from both contexts. Split this counter
so that each context gets its own, and calculate their sum to get the total
value in stmmac_get_ethtool_stats().

For the third group, multiple interrupts may be processed by different CPUs
at the same time, but interrupts on the same CPU will not nest. Move fields
from this group to a newly created per-cpu struct stmmac_pcpu_stats.

Fixes: 133466c3bbe1 ("net: stmmac: use per-queue 64 bit statistics where necessary")
Link: https://lore.kernel.org/netdev/Za173PhviYg-1qIn@torres.zugschlus.de/t/
Cc: stable@vger.kernel.org
Signed-off-by: Petr Tesarik <petr@tesarici.cz>
Reviewed-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
20 months agoMerge tag 'for-6.8-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave...
Linus Torvalds [Wed, 7 Feb 2024 08:21:32 +0000 (08:21 +0000)]
Merge tag 'for-6.8-rc3-tag' of git://git./linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:

 - two fixes preventing deletion and manual creation of subvolume qgroup

 - unify error code returned for unknown send flags

 - fix assertion during subvolume creation when anonymous device could
   be allocated by other thread (e.g. due to backref walk)

* tag 'for-6.8-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: do not ASSERT() if the newly created subvolume already got read
  btrfs: forbid deleting live subvol qgroup
  btrfs: forbid creating subvol qgroups
  btrfs: send: return EOPNOTSUPP on unknown flags

20 months agotg3: convert EEE handling to use linkmode bitmaps
Heiner Kallweit [Sat, 3 Feb 2024 21:12:50 +0000 (22:12 +0100)]
tg3: convert EEE handling to use linkmode bitmaps

Convert EEE handling to use linkmode bitmaps. This prepares for
removing the legacy bitmaps from struct ethtool_keee.
No functional change intended.

Note: The change to mii_eee_cap1_mod_linkmode_t(tp->eee.advertised, val)
in tg3_phy_autoneg_cfg() isn't completely obvious, but it doesn't change
the current functionality.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/0652b910-6bcc-421f-8769-38f7dae5037e@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agoMerge branch 'net-phy-add-and-use-helper-phy_advertise_eee_all'
Jakub Kicinski [Wed, 7 Feb 2024 02:58:47 +0000 (18:58 -0800)]
Merge branch 'net-phy-add-and-use-helper-phy_advertise_eee_all'

Heiner Kallweit says:

====================
net: phy: add and use helper phy_advertise_eee_all

Per default phylib preserves the EEE advertising at the time of
phy probing. The EEE advertising can be changed from user space,
in addition this helper allows to set the EEE advertising to all
supported modes from drivers in kernel space.
====================

Link: https://lore.kernel.org/r/0d886510-b2b7-43f2-b8a6-fb770d97266d@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agor8169: use new helper phy_advertise_eee_all
Heiner Kallweit [Sat, 3 Feb 2024 19:54:48 +0000 (20:54 +0100)]
r8169: use new helper phy_advertise_eee_all

Use new helper phy_advertise_eee_all() to simplify the code.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/ddedd82e-55da-4db5-acc6-9407c03f168c@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agonet: phy: add helper phy_advertise_eee_all
Heiner Kallweit [Sat, 3 Feb 2024 19:53:15 +0000 (20:53 +0100)]
net: phy: add helper phy_advertise_eee_all

Per default phylib preserves the EEE advertising at the time of
phy probing. The EEE advertising can be changed from user space,
in addition this helper allows to set the EEE advertising to all
supported modes from drivers in kernel space.

Suggested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20bfc471-aeeb-4ae4-ba09-7d6d4be6b86a@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agoMerge branch 'add-support-for-encoding-multi-attr-to-ynl'
Jakub Kicinski [Wed, 7 Feb 2024 02:56:22 +0000 (18:56 -0800)]
Merge branch 'add-support-for-encoding-multi-attr-to-ynl'

Alessandro Marcolini says:

====================
Add support for encoding multi-attr to ynl

This patchset add the support for encoding multi-attr attributes, making
it possible to use ynl with qdisc which have this kind of attributes
(e.g: taprio, ets).

Patch 1 corrects two docstrings in nlspec.py
Patch 2 adds the multi-attr attribute to taprio entry
Patch 3 adds the support for encoding multi-attr

Some examples of what is now possible with the ynl cli:

- Add a taprio qdisc

  --do newqdisc --create --json '{
  "family":1, "ifindex":4, "handle":65536, "parent":4294967295, "info":0,
   "kind":"taprio",
   "stab":{
       "base": {
         "cell-log": 0,
         "size-log": 0,
         "cell-align": 0,
         "overhead": 31,
         "linklayer": 0,
         "mpu": 0,
         "mtu": 0,
         "tsize": 0
       }
   },
   "options":{
       "priomap": {
           "num-tc": 3,
           "prio-tc-map": "01010101010101010101010101010101",
           "hw": 0,
           "count": "0100010002000000000000000000000000000000000000000000000000000000",
           "offset": "0100020003000000000000000000000000000000000000000000000000000000"
       },
       "sched-clockid":11,
       "sched-entry-list": {"entry": [
           {"index":0, "cmd":0, "gate-mask":1, "interval":300000},
           {"index":1, "cmd":0, "gate-mask":2, "interval":300000},
           {"index":2, "cmd":0, "gate-mask":4, "interval":400000} ]
       },
       "sched-base-time":1528743495910289987, "flags": 1
   }
  }'

- Add an ets qdisc

--create --json '{
"family":1, "ifindex":4, "handle":65536, "parent":4294967295, "kind":"ets",
"options":{
    "nbands":6,
    "nstrict":3,
    "quanta":{
        "quanta-band": [3500, 3000, 2500]
    },
    "priomap":{
        "priomap-band":[0, 1, 1, 1, 2, 3, 4, 5]
    }
}
}'
====================

Link: https://lore.kernel.org/r/cover.1706962013.git.alessandromarcolini99@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agotools: ynl: add support for encoding multi-attr
Alessandro Marcolini [Sat, 3 Feb 2024 13:16:53 +0000 (14:16 +0100)]
tools: ynl: add support for encoding multi-attr

Multi-attr elements could not be encoded because of missing logic in the
ynl code. Enable encoding of these attributes by checking if the
attribute is a multi-attr and if the value to be processed is a list.

This has been tested both with the taprio and ets qdisc which contain
this kind of attributes.

Signed-off-by: Alessandro Marcolini <alessandromarcolini99@gmail.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/c5bc9f5797168dbf7a4379c42f38d5de8ac7f38a.1706962013.git.alessandromarcolini99@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agodoc: netlink: specs: tc: add multi-attr to tc-taprio-sched-entry
Alessandro Marcolini [Sat, 3 Feb 2024 13:16:52 +0000 (14:16 +0100)]
doc: netlink: specs: tc: add multi-attr to tc-taprio-sched-entry

Add multi-attr attribute to tc-taprio-sched-entry to specify multiple
entries.

Signed-off-by: Alessandro Marcolini <alessandromarcolini99@gmail.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/0ba5088ea715103a2bce83b12e2dcbdaa08da6ac.1706962013.git.alessandromarcolini99@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agotools: ynl: correct typo and docstring
Alessandro Marcolini [Sat, 3 Feb 2024 13:16:51 +0000 (14:16 +0100)]
tools: ynl: correct typo and docstring

Correct typo in SpecAttr docstring. Changed SpecSubMessageFormat
docstring.

Signed-off-by: Alessandro Marcolini <alessandromarcolini99@gmail.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/6ab1dea7fb1f635c0d8b237f03a49eaa448c2bf4.1706962013.git.alessandromarcolini99@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agomlx4: Address spelling errors
Simon Horman [Mon, 5 Feb 2024 11:51:57 +0000 (11:51 +0000)]
mlx4: Address spelling errors

Address spelling errors flagged by codespell.

This patch follows-up on an earlier patch by Colin Ian King,
which addressed a spelling error in a user-visible log message [1].
This patch includes that change.

[1] https://lore.kernel.org/netdev/20231209225135.4055334-1-colin.i.king@gmail.com/

This patch is intended to cover all files under
drivers/net/ethernet/mellanox/mlx4

Signed-off-by: Simon Horman <horms@kernel.org>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/r/20240205-mlx5-codespell-v1-1-63b86dffbb61@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agoppp_async: limit MRU to 64K
Eric Dumazet [Mon, 5 Feb 2024 17:10:04 +0000 (17:10 +0000)]
ppp_async: limit MRU to 64K

syzbot triggered a warning [1] in __alloc_pages():

WARN_ON_ONCE_GFP(order > MAX_PAGE_ORDER, gfp)

Willem fixed a similar issue in commit c0a2a1b0d631 ("ppp: limit MRU to 64K")

Adopt the same sanity check for ppp_async_ioctl(PPPIOCSMRU)

[1]:

 WARNING: CPU: 1 PID: 11 at mm/page_alloc.c:4543 __alloc_pages+0x308/0x698 mm/page_alloc.c:4543
Modules linked in:
CPU: 1 PID: 11 Comm: kworker/u4:0 Not tainted 6.8.0-rc2-syzkaller-g41bccc98fb79 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023
Workqueue: events_unbound flush_to_ldisc
pstate: 204000c5 (nzCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
 pc : __alloc_pages+0x308/0x698 mm/page_alloc.c:4543
 lr : __alloc_pages+0xc8/0x698 mm/page_alloc.c:4537
sp : ffff800093967580
x29: ffff800093967660 x28: ffff8000939675a0 x27: dfff800000000000
x26: ffff70001272ceb4 x25: 0000000000000000 x24: ffff8000939675c0
x23: 0000000000000000 x22: 0000000000060820 x21: 1ffff0001272ceb8
x20: ffff8000939675e0 x19: 0000000000000010 x18: ffff800093967120
x17: ffff800083bded5c x16: ffff80008ac97500 x15: 0000000000000005
x14: 1ffff0001272cebc x13: 0000000000000000 x12: 0000000000000000
x11: ffff70001272cec1 x10: 1ffff0001272cec0 x9 : 0000000000000001
x8 : ffff800091c91000 x7 : 0000000000000000 x6 : 000000000000003f
x5 : 00000000ffffffff x4 : 0000000000000000 x3 : 0000000000000020
x2 : 0000000000000008 x1 : 0000000000000000 x0 : ffff8000939675e0
Call trace:
  __alloc_pages+0x308/0x698 mm/page_alloc.c:4543
  __alloc_pages_node include/linux/gfp.h:238 [inline]
  alloc_pages_node include/linux/gfp.h:261 [inline]
  __kmalloc_large_node+0xbc/0x1fc mm/slub.c:3926
  __do_kmalloc_node mm/slub.c:3969 [inline]
  __kmalloc_node_track_caller+0x418/0x620 mm/slub.c:4001
  kmalloc_reserve+0x17c/0x23c net/core/skbuff.c:590
  __alloc_skb+0x1c8/0x3d8 net/core/skbuff.c:651
  __netdev_alloc_skb+0xb8/0x3e8 net/core/skbuff.c:715
  netdev_alloc_skb include/linux/skbuff.h:3235 [inline]
  dev_alloc_skb include/linux/skbuff.h:3248 [inline]
  ppp_async_input drivers/net/ppp/ppp_async.c:863 [inline]
  ppp_asynctty_receive+0x588/0x186c drivers/net/ppp/ppp_async.c:341
  tty_ldisc_receive_buf+0x12c/0x15c drivers/tty/tty_buffer.c:390
  tty_port_default_receive_buf+0x74/0xac drivers/tty/tty_port.c:37
  receive_buf drivers/tty/tty_buffer.c:444 [inline]
  flush_to_ldisc+0x284/0x6e4 drivers/tty/tty_buffer.c:494
  process_one_work+0x694/0x1204 kernel/workqueue.c:2633
  process_scheduled_works kernel/workqueue.c:2706 [inline]
  worker_thread+0x938/0xef4 kernel/workqueue.c:2787
  kthread+0x288/0x310 kernel/kthread.c:388
  ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:860

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-and-tested-by: syzbot+c5da1f087c9e4ec6c933@syzkaller.appspotmail.com
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20240205171004.1059724-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agodevlink: avoid potential loop in devlink_rel_nested_in_notify_work()
Jiri Pirko [Mon, 5 Feb 2024 17:11:14 +0000 (18:11 +0100)]
devlink: avoid potential loop in devlink_rel_nested_in_notify_work()

In case devlink_rel_nested_in_notify_work() can not take the devlink
lock mutex. Convert the work to delayed work and in case of reschedule
do it jiffie later and avoid potential looping.

Suggested-by: Paolo Abeni <pabeni@redhat.com>
Fixes: c137743bce02 ("devlink: introduce object and nested devlink relationship infra")
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20240205171114.338679-1-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agoaf_unix: Call kfree_skb() for dead unix_(sk)->oob_skb in GC.
Kuniyuki Iwashima [Sat, 3 Feb 2024 18:31:49 +0000 (10:31 -0800)]
af_unix: Call kfree_skb() for dead unix_(sk)->oob_skb in GC.

syzbot reported a warning [0] in __unix_gc() with a repro, which
creates a socketpair and sends one socket's fd to itself using the
peer.

  socketpair(AF_UNIX, SOCK_STREAM, 0, [3, 4]) = 0
  sendmsg(4, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\360", iov_len=1}],
          msg_iovlen=1, msg_control=[{cmsg_len=20, cmsg_level=SOL_SOCKET,
                                      cmsg_type=SCM_RIGHTS, cmsg_data=[3]}],
          msg_controllen=24, msg_flags=0}, MSG_OOB|MSG_PROBE|MSG_DONTWAIT|MSG_ZEROCOPY) = 1

This forms a self-cyclic reference that GC should finally untangle
but does not due to lack of MSG_OOB handling, resulting in memory
leak.

Recently, commit 11498715f266 ("af_unix: Remove io_uring code for
GC.") removed io_uring's dead code in GC and revealed the problem.

The code was executed at the final stage of GC and unconditionally
moved all GC candidates from gc_candidates to gc_inflight_list.
That papered over the reported problem by always making the following
WARN_ON_ONCE(!list_empty(&gc_candidates)) false.

The problem has been there since commit 2aab4b969002 ("af_unix: fix
struct pid leaks in OOB support") added full scm support for MSG_OOB
while fixing another bug.

To fix this problem, we must call kfree_skb() for unix_sk(sk)->oob_skb
if the socket still exists in gc_candidates after purging collected skb.

Then, we need to set NULL to oob_skb before calling kfree_skb() because
it calls last fput() and triggers unix_release_sock(), where we call
duplicate kfree_skb(u->oob_skb) if not NULL.

Note that the leaked socket remained being linked to a global list, so
kmemleak also could not detect it.  We need to check /proc/net/protocol
to notice the unfreed socket.

[0]:
WARNING: CPU: 0 PID: 2863 at net/unix/garbage.c:345 __unix_gc+0xc74/0xe80 net/unix/garbage.c:345
Modules linked in:
CPU: 0 PID: 2863 Comm: kworker/u4:11 Not tainted 6.8.0-rc1-syzkaller-00583-g1701940b1a02 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/25/2024
Workqueue: events_unbound __unix_gc
RIP: 0010:__unix_gc+0xc74/0xe80 net/unix/garbage.c:345
Code: 8b 5c 24 50 e9 86 f8 ff ff e8 f8 e4 22 f8 31 d2 48 c7 c6 30 6a 69 89 4c 89 ef e8 97 ef ff ff e9 80 f9 ff ff e8 dd e4 22 f8 90 <0f> 0b 90 e9 7b fd ff ff 48 89 df e8 5c e7 7c f8 e9 d3 f8 ff ff e8
RSP: 0018:ffffc9000b03fba0 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffffc9000b03fc10 RCX: ffffffff816c493e
RDX: ffff88802c02d940 RSI: ffffffff896982f3 RDI: ffffc9000b03fb30
RBP: ffffc9000b03fce0 R08: 0000000000000001 R09: fffff52001607f66
R10: 0000000000000003 R11: 0000000000000002 R12: dffffc0000000000
R13: ffffc9000b03fc10 R14: ffffc9000b03fc10 R15: 0000000000000001
FS:  0000000000000000(0000) GS:ffff8880b9400000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005559c8677a60 CR3: 000000000d57a000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 process_one_work+0x889/0x15e0 kernel/workqueue.c:2633
 process_scheduled_works kernel/workqueue.c:2706 [inline]
 worker_thread+0x8b9/0x12a0 kernel/workqueue.c:2787
 kthread+0x2c6/0x3b0 kernel/kthread.c:388
 ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x1b/0x30 arch/x86/entry/entry_64.S:242
 </TASK>

Reported-by: syzbot+fa3ef895554bdbfd1183@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=fa3ef895554bdbfd1183
Fixes: 2aab4b969002 ("af_unix: fix struct pid leaks in OOB support")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20240203183149.63573-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agoMerge branch 'nfc-hci-save-a-few-bytes-of-memory-when-registering-a-nfc_llc-engine'
Paolo Abeni [Tue, 6 Feb 2024 14:36:09 +0000 (15:36 +0100)]
Merge branch 'nfc-hci-save-a-few-bytes-of-memory-when-registering-a-nfc_llc-engine'

Christophe says:

====================
nfc: hci: Save a few bytes of memory when registering a 'nfc_llc' engine

nfc_llc_register() calls pass a string literal as the 'name' parameter.

So kstrdup_const() can be used instead of kfree() to avoid a memory
allocation in such cases.
====================

Link: https://lore.kernel.org/r/cover.1706946099.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
20 months agonfc: hci: Save a few bytes of memory when registering a 'nfc_llc' engine
Christophe JAILLET [Sat, 3 Feb 2024 07:51:04 +0000 (08:51 +0100)]
nfc: hci: Save a few bytes of memory when registering a 'nfc_llc' engine

nfc_llc_register() calls pass a string literal as the 'name' parameter.

So kstrdup_const() can be used instead of kfree() to avoid a memory
allocation in such cases.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>