linux.git
20 months ago bnxt_en: Save user configured filters in a lookup list
Pavan Chebbi [Mon, 5 Feb 2024 22:31:57 +0000 (14:31 -0800)]
bnxt_en: Save user configured filters in a lookup list

Driver needs to maintain a lookup list of all the user configured
filters. This is required in order to reconfigure these filters upon
interface toggle. We can look up this list to follow the order in
which they should be re-applied.

Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/20240205223202.25341-9-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months ago bnxt_en: Add separate function to delete the filter structure
Pavan Chebbi [Mon, 5 Feb 2024 22:31:56 +0000 (14:31 -0800)]
bnxt_en: Add separate function to delete the filter structure

Since we are going to do filter deletion at multiple places in the
upcoming patches, add a function that does the deletion.  Future patches
add more code into this function.

Since we are passing the address of the filter base to free the
entire filter structure, add a comment to make sure that the base
is always at the beginning of the structure.

Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/20240205223202.25341-8-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months ago bnxt_en: Add drop action support for ntuple
Vikas Gupta [Mon, 5 Feb 2024 22:31:55 +0000 (14:31 -0800)]
bnxt_en: Add drop action support for ntuple

Add drop action for protocols TCP/UDP/ICMP
1) Drop action for TCP/UDP is supported via flow type
   tcp4/udp4/tcp6/udp6.
2) Drop action for ICMPV4/ICMPV6/wildcard is supported
   via flow type ipv4/ipv6.
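
As a rough illustration of how such drop rules might be requested from userspace, the sketch below only prints the ethtool invocations instead of running them. The helper name drop_rule_cmd, the device eth0, the port and the protocol number are placeholders; it also assumes the ethtool(8) spelling of the generic flow types (ip4/ip6) and that "action -1" requests a drop.

```shell
# Dry-run sketch: print the ethtool invocation rather than run it.
# drop_rule_cmd, "eth0" and the match values are illustrative placeholders.
drop_rule_cmd() {
    local dev="$1" flow="$2"
    shift 2
    # Remaining arguments are match fields; "action -1" asks for a drop.
    echo "ethtool -N $dev flow-type $flow $* action -1"
}

drop_rule_cmd eth0 tcp4 dst-port 80   # TCP over IPv4
drop_rule_cmd eth0 ip4 l4proto 1      # ICMPv4 through the generic ip4 flow type
```

Whether a given match combination is accepted depends on the device and firmware, so treat the printed commands as a starting point only.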

Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/20240205223202.25341-7-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months ago bnxt_en: Enhance ethtool ntuple support for ip flows besides TCP/UDP
Vikas Gupta [Mon, 5 Feb 2024 22:31:54 +0000 (14:31 -0800)]
bnxt_en: Enhance ethtool ntuple support for ip flows besides TCP/UDP

Enable flow type ipv4/ipv6
1) for protocols ICMPV4 and ICMPV6.
2) for wildcard match. Wildcard matches to TCP/UDP/ICMP.
   Note that IPPROTO_RAW (255), a reserved protocol, is
   used to represent the wildcard.

Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/20240205223202.25341-6-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months ago bnxt_en: implement fully specified 5-tuple masks
Edwin Peer [Mon, 5 Feb 2024 22:31:53 +0000 (14:31 -0800)]
bnxt_en: implement fully specified 5-tuple masks

Support subfield masking for IP addresses and ports. Previously, only
entire fields could be included or excluded in NTUPLE filters.
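
To make "subfield masking" concrete, here is a minimal sketch of a masked comparison on a port number. It is not the driver's code; masked_match is a hypothetical helper, and in this sketch a set mask bit means "compare this bit". Check ethtool(8) for how its "m" argument is interpreted before reusing these masks on a command line.

```shell
# Sketch only: a set mask bit means "this bit takes part in the match".
# masked_match is a hypothetical helper, not part of ethtool or the driver.
masked_match() {
    local pkt="$1" val="$2" mask="$3"
    [ $(( $pkt & $mask )) -eq $(( $val & $mask )) ]
}

# Ports 8000-8063 share their upper ten bits, so one rule covers the range.
masked_match 8010 8000 0xffc0 && echo "8010 matches"
masked_match 8064 8000 0xffc0 || echo "8064 does not match"
```

With whole-field matching only, the same range would need one rule per port.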

Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://lore.kernel.org/r/20240205223202.25341-5-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months ago bnxt_en: Support ethtool -n to display ether filters.
Michael Chan [Mon, 5 Feb 2024 22:31:52 +0000 (14:31 -0800)]
bnxt_en: Support ethtool -n to display ether filters.

Implement ETHTOOL_GRXCLSRULE for the user defined ether filters.  Use
the common functions to walk the L2 filter hash table.

Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://lore.kernel.org/r/20240205223202.25341-4-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months ago bnxt_en: Add ethtool -N support for ether filters.
Michael Chan [Mon, 5 Feb 2024 22:31:51 +0000 (14:31 -0800)]
bnxt_en: Add ethtool -N support for ether filters.

Add ETHTOOL_SRXCLSRLINS and ETHTOOL_SRXCLSRLDEL support for inserting
and deleting L2 ether filter rules.  Destination MAC address and
optional VLAN are supported for each filter entry.  This is currently
supported only on older BCM573XX and BCM574XX chips.
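
A hedged sketch of what inserting and deleting such a rule could look like from the command line. The helpers only print the commands; the helper names, device, MAC address, VLAN ID, queue and rule ID are placeholders, and the field names follow ethtool(8) as far as I know.

```shell
# Dry-run sketch; device, MAC, VLAN, queue and rule ID are placeholders.
ether_rule_add_cmd() {
    local dev="$1" mac="$2" queue="$3" vlan="$4"
    # The VLAN match is optional, mirroring the commit description.
    echo "ethtool -N $dev flow-type ether dst $mac${vlan:+ vlan $vlan} action $queue"
}

ether_rule_del_cmd() {
    # $1: device, $2: rule ID as reported when listing the rules.
    echo "ethtool -N $1 delete $2"
}

ether_rule_add_cmd eth0 00:11:22:33:44:55 2
ether_rule_add_cmd eth0 00:11:22:33:44:55 2 100
ether_rule_del_cmd eth0 5
```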

Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/20240205223202.25341-3-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months ago bnxt_en: Use firmware provided maximum filter counts.
Michael Chan [Mon, 5 Feb 2024 22:31:50 +0000 (14:31 -0800)]
bnxt_en: Use firmware provided maximum filter counts.

While individual filter structures are allocated as needed, there is an
array to keep track of the software filter IDs that we allocate ahead
of time.  Rather than relying on a fixed maximum filter count to
allocate this array, get the maximum from the firmware when available.

Move these filter related maximum counts queried from the firmware to the
bnxt_hw_resc struct.  If the firmware is not providing these maximum
counts, fall back to the hard-coded constant.
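
The fallback described above amounts to a simple selection. A minimal sketch, assuming a firmware value of zero (or no value) means "not provided"; DEFAULT_MAX_FILTERS and pick_max_filters are hypothetical names, and the number is not taken from the driver.

```shell
# Hypothetical default; the driver's real constant is not shown in this log.
DEFAULT_MAX_FILTERS=4096

pick_max_filters() {
    local fw_max="${1:-0}"
    # Treat a missing or zero firmware value as "not provided".
    if [ "$fw_max" -gt 0 ]; then
        echo "$fw_max"
    else
        echo "$DEFAULT_MAX_FILTERS"
    fi
}

pick_max_filters 1024   # firmware reported a maximum
pick_max_filters 0      # firmware reported nothing, use the default
```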

Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://lore.kernel.org/r/20240205223202.25341-2-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months ago selftests: tc-testing: add mirred to block tdc tests
Victor Nogueira [Fri, 2 Feb 2024 02:07:26 +0000 (23:07 -0300)]
selftests: tc-testing: add mirred to block tdc tests

Add 8 new mirred tdc tests that target mirred to block:

- Add mirred mirror to egress block action
- Add mirred mirror to ingress block action
- Add mirred redirect to egress block action
- Add mirred redirect to ingress block action
- Try to add mirred action with both dev and block
- Try to add mirred action without specifying neither dev nor block
- Replace mirred redirect to dev action with redirect to block
- Replace mirred redirect to block action with mirror to dev

Signed-off-by: Victor Nogueira <victor@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20240202020726.529170-1-victor@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months ago net: emaclite: Use devm_platform_get_and_ioremap_resource() in xemaclite_of_probe()
Markus Elfring [Mon, 5 Feb 2024 13:44:20 +0000 (14:44 +0100)]
net: emaclite: Use devm_platform_get_and_ioremap_resource() in xemaclite_of_probe()

A wrapper function is available since the commit 890cc39a8799
("drivers: provide devm_platform_get_and_ioremap_resource()").
Thus reuse existing functionality instead of keeping duplicate source code.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Link: https://lore.kernel.org/r/f87065d0-e398-4ffa-bfa4-9ff99d73f206@web.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months ago ethernet: wiznet: Use devm_platform_get_and_ioremap_resource() in w5300_hw_probe()
Markus Elfring [Mon, 5 Feb 2024 13:22:32 +0000 (14:22 +0100)]
ethernet: wiznet: Use devm_platform_get_and_ioremap_resource() in w5300_hw_probe()

A wrapper function is available since the commit 890cc39a8799
("drivers: provide devm_platform_get_and_ioremap_resource()").
Thus reuse existing functionality instead of keeping duplicate source code.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Link: https://lore.kernel.org/r/46f64db3-3f8f-4c6c-8d70-38daeefccac1@web.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months ago selftests: forwarding: Add missing multicast routing config entries
Ido Schimmel [Thu, 8 Feb 2024 16:55:38 +0000 (18:55 +0200)]
selftests: forwarding: Add missing multicast routing config entries

The two tests that make use of multicast routing (router.sh and
router_multicast.sh) are currently failing in the netdev CI because the
kernel is missing multicast routing support.

Fix by adding the required config entries.

Fixes: 6d4efada3b82 ("selftests: forwarding: Add multicast routing test")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240208165538.1303021-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months ago Merge branch 'for-io_uring-add-napi-busy-polling-support'
Jakub Kicinski [Fri, 9 Feb 2024 18:01:12 +0000 (10:01 -0800)]
Merge branch 'for-io_uring-add-napi-busy-polling-support'

Merge netdev bits of io_uring busy polling support.

Jens Axboe says:

====================
io_uring: add napi busy polling support

I finally got around to testing this patchset in its current form, and
results look fine to me. It Works. Using the basic ping/pong test that's
part of the liburing addition, without enabling NAPI I get:

Stock settings, no NAPI, 100k packets:

 rtt(us) min/avg/max/mdev = 31.730/37.006/87.960/0.497

 and with -t10 -b enabled:

 rtt(us) min/avg/max/mdev = 23.250/29.795/63.511/1.203

In short, this patchset lets NAPI be enabled per io_uring, rather
than needing to enable it globally. This allows targeted NAPI usage with
io_uring.

Here's Stefan's v15 posting, which predates this one:

https://lore.kernel.org/io-uring/20230608163839.2891748-1-shr@devkernel.io/
====================

Link: https://lore.kernel.org/r/20240206163422.646218-1-axboe@kernel.dk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months ago net: add napi_busy_loop_rcu()
Stefan Roesch [Tue, 6 Feb 2024 16:30:04 +0000 (09:30 -0700)]
net: add napi_busy_loop_rcu()

This adds the napi_busy_loop_rcu() function. This function assumes that
the calling function is already holding the rcu read lock and
napi_busy_loop() does not need to take the rcu read lock. Add a
NAPI_F_NO_SCHED flag, which tells __napi_busy_loop() to abort if we
need to reschedule rather than drop the RCU read lock and reschedule.

Signed-off-by: Stefan Roesch <shr@devkernel.io>
Link: https://lore.kernel.org/r/20230608163839.2891748-3-shr@devkernel.io
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months ago net: split off __napi_busy_poll from napi_busy_poll
Stefan Roesch [Tue, 6 Feb 2024 16:30:03 +0000 (09:30 -0700)]
net: split off __napi_busy_poll from napi_busy_poll

This splits off the key part of the napi_busy_poll function into its own
function, __napi_busy_poll, and changes the prefer_busy_poll bool to be
flag based to allow passing in more flags in the future.

This is done in preparation for an additional napi_busy_poll() function,
that doesn't take the rcu_read_lock(). The new function is introduced
in the next patch.

Signed-off-by: Stefan Roesch <shr@devkernel.io>
Link: https://lore.kernel.org/r/20230608163839.2891748-2-shr@devkernel.io
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months ago Merge branch 'wan-t7x-fastboot'
David S. Miller [Fri, 9 Feb 2024 12:07:49 +0000 (12:07 +0000)]
Merge branch 'wan-t7x-fastboot'

Jinjian Song says:

====================
net: wwan: t7xx: Add fastboot interface

Add support for t7xx WWAN device firmware flashing & coredump collection
using fastboot interface.

Fastboot protocol commands sent through the /dev/wwan0fastboot0 WWAN port
support firmware flashing and coredump collection; userspace gets the device
mode from /sys/bus/pci/devices/${bdf}/t7xx_mode.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
20 months ago net: wwan: t7xx: Add fastboot WWAN port
Jinjian Song [Mon, 5 Feb 2024 10:22:30 +0000 (18:22 +0800)]
net: wwan: t7xx: Add fastboot WWAN port

On early detection of a WWAN device in fastboot mode, the driver sets
up the CLDMA0 HW tx/rx queues for raw data transfer and then creates a
fastboot port for userspace.

An application can use this port to flash firmware and collect
core dumps using fastboot protocol commands.
E.g., flash firmware through fastboot port:
 - "download:%08x": write data to memory with the download size.
 - "flash:%s": write the previously downloaded image to the named partition.
 - "reboot": reboot the device.
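
The command strings listed above can be built with printf. The sketch below only formats them; the helper names, the payload size and the partition name are placeholders, and how the strings are actually written to the fastboot port is not shown in this log.

```shell
# Sketch: format fastboot command strings; nothing is sent to a device.
fastboot_download_cmd() {
    # $1: payload size in bytes, rendered as 8 lowercase hex digits.
    printf 'download:%08x' "$1"
}

fastboot_flash_cmd() {
    # $1: partition name (placeholder value below).
    printf 'flash:%s' "$1"
}

fastboot_download_cmd 1048576; echo
fastboot_flash_cmd firmware; echo
echo reboot
```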

Link: https://android.googlesource.com/platform/system/core/+/refs/heads/main/fastboot/README.md
Signed-off-by: Jinjian Song <jinjian.song@fibocom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
20 months ago net: wwan: t7xx: Infrastructure for early port configuration
Jinjian Song [Mon, 5 Feb 2024 10:22:29 +0000 (18:22 +0800)]
net: wwan: t7xx: Infrastructure for early port configuration

To support cases such as FW update or Core dump, the t7xx
device is capable of signaling the host that a special port
needs to be created before the handshake phase.

Add the infrastructure required to create the early ports,
which also requires a different configuration of the CLDMA queues.

Based on the v5 version of the following series:
'net: wwan: t7xx: fw flashing & coredump support'
(https://patchwork.kernel.org/project/netdevbpf/patch/3777bb382f4b0395cb594a602c5c79dbab86c9e0.1674307425.git.m.chetan.kumar@linux.intel.com/)

Signed-off-by: Jinjian Song <jinjian.song@fibocom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
20 months ago net: wwan: t7xx: Add sysfs attribute for device state machine
Jinjian Song [Mon, 5 Feb 2024 10:22:28 +0000 (18:22 +0800)]
net: wwan: t7xx: Add sysfs attribute for device state machine

Add support for userspace to get/set the device mode. The device's state
machine changes between unknown, ready, reset and fastboot.

Get the device state mode:
 - 'cat /sys/bus/pci/devices/${bdf}/t7xx_mode'

Set the device state mode:
 - reset(cold reset): 'echo reset > /sys/bus/pci/devices/${bdf}/t7xx_mode'
 - fastboot: 'echo fastboot_switching > /sys/bus/pci/devices/${bdf}/t7xx_mode'
Reload the driver to get the new device state after a set operation.
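
A small wrapper around the attribute might look like the sketch below. The function names are hypothetical, and the attribute path is passed in as an argument (on a real system it would be the t7xx_mode file under /sys/bus/pci/devices/${bdf}/) so the helpers can be tried against an ordinary file.

```shell
# $1 is the path to the t7xx_mode attribute; it is a parameter so the
# helpers can be exercised against a plain file as well.
t7xx_get_mode() {
    cat "$1"
}

t7xx_set_mode() {
    local attr="$1" mode="$2"
    # Only the two writable values named in the commit message are accepted.
    case "$mode" in
        reset|fastboot_switching) echo "$mode" > "$attr" ;;
        *) echo "unsupported mode: $mode" >&2; return 1 ;;
    esac
}
```

Rejecting unknown strings in the wrapper is a choice made for this sketch; the driver's own handling of unexpected input is not described in the commit message.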

Signed-off-by: Jinjian Song <jinjian.song@fibocom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
20 months ago wwan: core: Add WWAN fastboot port type
Jinjian Song [Mon, 5 Feb 2024 10:22:27 +0000 (18:22 +0800)]
wwan: core: Add WWAN fastboot port type

Add a new WWAN port that connects to the device fastboot protocol
interface.

Signed-off-by: Jinjian Song <jinjian.song@fibocom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
20 months ago Merge branch 'netconsole-userdata-append'
David S. Miller [Fri, 9 Feb 2024 10:23:46 +0000 (10:23 +0000)]
Merge branch 'netconsole-userdata-append'

Matthew Wood says:

====================
netconsole: Add userdata append support

Add the ability to add custom userdata to every outbound netconsole message
as a collection of key/value pairs, allowing users to add metadata to every
netconsole message which can be used for tagging, filtering, and
aggregating log messages.

In a previous patch series the ability to prepend the uname release was
added towards the goals above. This patch series builds on that
idea to allow any userdata, keyed by a user provided name, to be
included in netconsole messages.

If CONFIG_NETCONSOLE_DYNAMIC is enabled an additional userdata
directory will be presented in the netconsole configfs tree, allowing
the addition of userdata entries.

    /sys/kernel/config/netconsole/
        <target>/
            enabled
            release
            dev_name
            local_port
            remote_port
            local_ip
            remote_ip
            local_mac
            remote_mac
            userdata/
                <key>/
                    value
                <key>/
                    value
                ...

v1->v2:
 * Updated netconsole_target docs, kdoc is now clean
v2->v3:
 * Remove inline keyword from to_userdat* functions
 * Break up some lines that exceeded 80 chars
 * Replace typos and remove {} from single line if statement
 * Remove unused variable

Testing for this series is as follows:

Build every patch without CONFIG_NETCONSOLE_DYNAMIC, and also build
with CONFIG_NETCONSOLE_DYNAMIC enabled for every patch after the config
option was added.

Test Userdata configfs

    # Adding userdata
    cd /sys/kernel/config/netconsole/ && mkdir cmdline0 && cd cmdline0
    mkdir userdata/release && echo hotfix1 > userdata/release/value
    preview=$(for f in `ls userdata`; do echo $f=$(cat userdata/$f/value); done)
    [[ "$preview" == $'release=hotfix1' ]] && echo pass || echo fail
    mkdir userdata/testing && echo something > userdata/testing/value
    preview=$(for f in `ls userdata`; do echo $f=$(cat userdata/$f/value); done)
    [[ "$preview" == $'release=hotfix1\ntesting=something' ]] && echo pass || echo fail
    #
    # Removing Userdata
    rmdir userdata/testing
    preview=$(for f in `ls userdata`; do echo $f=$(cat userdata/$f/value); done)
    [[ "$preview" == $'release=hotfix1' ]] && echo pass || echo fail
    rmdir userdata/release
    preview=$(for f in `ls userdata`; do echo $f=$(cat userdata/$f/value); done)
    [[ "$preview" == $'' ]] && echo pass || echo fail
    #
    # Adding userdata key with too large of directory name [<54 chars]
    mkdir userdata/testing12345678901234567890123456789012345678901234567890
    [[ $? == 1 ]] && echo pass || echo fail
    #
    # Adding userdata value with too large of value [<200 chars]
    mkdir userdata/testing
    echo `for i in {1..201};do printf "%s" "v";done` > userdata/testing/value
    [[ $? == 1 ]] && echo pass || echo fail
    rmdir userdata/testing

- Output:

    pass
    pass
    pass
    pass
    pass
    mkdir: cannot create directory ‘cmdline0/userdata/testing12345678901234567890123456789012345678901234567890’: File name too long
    pass
    bash: echo: write error: Message too long
    pass

Test netconsole messages (w/ msg fragmentation)

    echo `for i in {1..996};do printf "%s" "v";done` > /dev/kmsg

- Output:

    6.7.0-rc8-virtme,12,484,84321212,-,ncfrag=0/997;vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv6.7.0-rc8-virtme,12,484,84321212,-,ncfrag=952/997;vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv

Test empty userdatum

    cd /sys/kernel/config/netconsole/ && mkdir cmdline0
    mkdir cmdline0/userdata/empty
    echo test > /dev/kmsg
    rmdir cmdline0/userdata/empty

- Output:

Test netconsole messages (w/o userdata fragmentation)

    cd /sys/kernel/config/netconsole/ && mkdir cmdline0
    mkdir cmdline0/userdata/release && echo hotfix1 > cmdline0/userdata/release/value
    mkdir cmdline0/userdata/testing && echo something > cmdline0/userdata/testing/value
    echo test > /dev/kmsg
    rmdir cmdline0/userdata/release
    rmdir cmdline0/userdata/testing
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
20 months ago net: netconsole: append userdata to fragmented netconsole messages
Matthew Wood [Sun, 4 Feb 2024 23:27:39 +0000 (15:27 -0800)]
net: netconsole: append userdata to fragmented netconsole messages

Regardless of whether the original message body or formatted userdata
exceeds the MAX_PRINT_CHUNK, append userdata to the netconsole message
starting with the first chunk that has available space after writing the
body.
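
To picture the fragmentation, the sketch below treats the message body and the userdata as one payload and cuts it into ncfrag-tagged chunks, using the header layout from the test output in the cover letter. It is an illustration, not the kernel logic: ncfrag_split is a hypothetical helper, the chunk size is a free parameter here (in the driver the room per fragment depends on MAX_PRINT_CHUNK and the header length), and it assumes an ASCII payload and that the ncfrag total covers body plus userdata.

```shell
# Sketch: split message text plus userdata into ncfrag-tagged chunks.
# Assumptions: ASCII payload; "chunk" is the payload room per fragment.
ncfrag_split() {
    local header="$1" payload="$2" chunk="$3"
    local total="${#payload}" off=0
    while [ "$off" -lt "$total" ]; do
        printf '%s,ncfrag=%d/%d;' "$header" "$off" "$total"
        # tail -c +N starts at byte N (1-based); head -c keeps one chunk.
        printf '%s' "$payload" | tail -c "+$((off + 1))" | head -c "$chunk"
        printf '\n'
        off=$((off + chunk))
    done
}

nl='
'
ncfrag_split "6.7.0-rc8-virtme,12,484,84321212,-" "test${nl}release=hotfix1${nl}" 8
```

Because the userdata simply follows the body in the payload, it starts in whichever chunk still has room after the body, which is the behaviour the commit message describes.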

Co-developed-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Matthew Wood <thepacketgeek@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
20 months ago net: netconsole: append userdata to netconsole messages
Matthew Wood [Sun, 4 Feb 2024 23:27:38 +0000 (15:27 -0800)]
net: netconsole: append userdata to netconsole messages

Append userdata to outgoing unfragmented (<1000 bytes) netconsole messages.
When sending messages the userdata string is already formatted and stored
in netconsole_target->userdata_complete.

Always write the outgoing message to buf, so userdata can be appended in
a standard fashion. This is a change from only using buf when the
release needs to be prepended to the message.

Co-developed-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Matthew Wood <thepacketgeek@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
20 months ago net: netconsole: cache userdata formatted string in netconsole_target
Matthew Wood [Sun, 4 Feb 2024 23:27:37 +0000 (15:27 -0800)]
net: netconsole: cache userdata formatted string in netconsole_target

Store a formatted string for userdata that will be appended to netconsole
messages. The string has a capacity of 4KB, as calculated by the userdatum
entry length of 256 bytes and a max of 16 userdata entries.

Update the stored netconsole_target->userdata_complete string with the new
formatted userdata values when a userdatum is created, edited, or
removed. Each userdata entry contains a trailing newline, which will be
formatted as such in netconsole messages::

    6.7.0-rc8-virtme,12,500,1646292204,-;test
    release=foo
    something=bar
    6.7.0-rc8-virtme,12,500,1646292204,-;another test
    release=foo
    something=bar

Enforcement of MAX_USERDATA_ITEMS is done in userdatum_make_item;
update_userdata will not check for this case but will skip any userdata
children over the limit of MAX_USERDATA_ITEMS.

If a userdata entry/dir is created but no value is provided, that entry
will be skipped. This is in part because update_userdata() can't be
called in userdatum_make_item() since the item will not have been added
to the userdata config_group children yet. To preserve the experience of
adding an empty userdata that doesn't show up in the netconsole
messages, purposefully skip empty userdata items even when
update_userdata() can be called.
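
The formatting rules above can be mimicked from userspace by walking a userdata directory. This is a sketch of the described behaviour rather than the kernel's update_userdata(): format_userdata is a hypothetical helper, entries are visited in glob order (which may differ from the kernel's child list order), and it assumes items over the limit are dropped whether or not they are empty.

```shell
MAX_USERDATA_ITEMS=16   # limit quoted in the commit message

# $1: a userdata directory laid out as <key>/value.
format_userdata() {
    local dir="$1" count=0 entry key value
    for entry in "$dir"/*/; do
        [ -f "${entry}value" ] || continue
        # Children beyond the item limit are ignored.
        [ "$count" -lt "$MAX_USERDATA_ITEMS" ] || break
        count=$((count + 1))
        value=$(cat "${entry}value")
        # An entry that was created but never given a value is skipped.
        [ -n "$value" ] || continue
        key=$(basename "$entry")
        printf '%s=%s\n' "$key" "$value"
    done
}
```

Pointing the function at /sys/kernel/config/netconsole/<target>/userdata should give a preview similar to the "preview" loops in the series' test notes.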

Co-developed-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Matthew Wood <thepacketgeek@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
20 months ago net: netconsole: add a userdata config_group member to netconsole_target
Matthew Wood [Sun, 4 Feb 2024 23:27:36 +0000 (15:27 -0800)]
net: netconsole: add a userdata config_group member to netconsole_target

Create configfs machinery for netconsole userdata appending, which depends
on CONFIG_NETCONSOLE_DYNAMIC (for configfs interface). Add a userdata
config_group to netconsole_target for managing userdata entries as a tree
under the netconsole configfs subsystem. Directory names created under the
userdata directory become userdatum keys; the userdatum value is the
content of the value file.

Include the minimum-viable-changes for userdata configfs config_group.
init_target_config_group() ties in the complete configfs machinery to
avoid unused func/variable errors during build. Initializing the
netconsole_target->group is moved to init_target_config_group, which
will also init and add the userdata config_group.

Each userdatum entry has a limit of 256 bytes (54 for
the key/directory, 200 for the value, and 2 for '=' and '\n'
characters), which is enforced by the configfs functions for updating
the userdata config_group.
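
The size budget can be checked with a few lines of arithmetic. The variable names are illustrative; the 16-entry limit and the 4KB total come from the commit that caches the formatted userdata string.

```shell
MAX_KEY_LEN=54      # key / directory name
MAX_VALUE_LEN=200   # contents of the value file
SEPARATORS=2        # '=' and '\n'
MAX_ITEMS=16        # entry limit quoted elsewhere in the series

ENTRY_LEN=$((MAX_KEY_LEN + MAX_VALUE_LEN + SEPARATORS))
TOTAL_LEN=$((ENTRY_LEN * MAX_ITEMS))
echo "per entry: $ENTRY_LEN bytes, all entries: $TOTAL_LEN bytes"
```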

When a new netconsole_target is created, initialize the userdata
config_group and add it as a default group for netconsole_target
config_group, allowing the userdata configfs sub-tree to be presented
in the netconsole configfs tree under the userdata directory.

Co-developed-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Matthew Wood <thepacketgeek@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
20 months ago net: netconsole: add docs for appending netconsole user data
Matthew Wood [Sun, 4 Feb 2024 23:27:35 +0000 (15:27 -0800)]
net: netconsole: add docs for appending netconsole user data

Add a new User Data section to the netconsole docs to describe the
appending of user data capability (for netconsole dynamic configuration)
with usage and netconsole output examples.

Co-developed-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Matthew Wood <thepacketgeek@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
20 months ago net: netconsole: move newline trimming to function
Matthew Wood [Sun, 4 Feb 2024 23:27:34 +0000 (15:27 -0800)]
net: netconsole: move newline trimming to function

Move newline trimming logic from `dev_name_store()` to a new function
(trim_newline()) for shared use in netconsole.c

Signed-off-by: Matthew Wood <thepacketgeek@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
20 months ago net: netconsole: move netconsole_target config_item to config_group
Matthew Wood [Sun, 4 Feb 2024 23:27:33 +0000 (15:27 -0800)]
net: netconsole: move netconsole_target config_item to config_group

In order to support a nested userdata config_group in later patches,
use a config_group for netconsole_target instead of a
config_item. It's a no-op functionality-wise, since
config_group maintains all features of a config_item via the cg_item
member.

Signed-off-by: Matthew Wood <thepacketgeek@gmail.com>
Reviewed-by: Breno Leitao <leitao@debian.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
20 months ago net: netconsole: cleanup formatting lints
Matthew Wood [Sun, 4 Feb 2024 23:27:32 +0000 (15:27 -0800)]
net: netconsole: cleanup formatting lints

Address checkpatch lint suggestions in preparation for later changes

Signed-off-by: Matthew Wood <thepacketgeek@gmail.com>
Reviewed-by: Breno Leitao <leitao@debian.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
20 months ago ethtool: do not use rtnl in ethnl_default_dumpit()
Eric Dumazet [Wed, 7 Feb 2024 15:35:14 +0000 (15:35 +0000)]
ethtool: do not use rtnl in ethnl_default_dumpit()

for_each_netdev_dump() can be used with RCU protection,
no need for rtnl if we are going to use dev_hold()/dev_put().

Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20240207153514.3640952-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months ago Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next...
Jakub Kicinski [Fri, 9 Feb 2024 03:08:40 +0000 (19:08 -0800)]
Merge branch '10GbE' of git://git./linux/kernel/git/tnguy/next-queue

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2024-02-06 (ixgbe)

This series contains updates to ixgbe driver only.

Jedrzej continues cleanup work from conversion away from ixgbe_status;
s32 values are changed to int, various style issues are addressed, and
some return statements refactored to address some smatch warnings.

* '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  ixgbe: Clarify the values of the returning status
  ixgbe: Rearrange args to fix reverse Christmas tree
  ixgbe: Convert ret val type from s32 to int
====================

Link: https://lore.kernel.org/r/20240206214054.1002919-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months ago Merge branch 'add-hw-checksum-offload-support-for-rz-g2l-gbethernet-ip'
Jakub Kicinski [Fri, 9 Feb 2024 03:06:39 +0000 (19:06 -0800)]
Merge branch 'add-hw-checksum-offload-support-for-rz-g2l-gbethernet-ip'

Biju Das says:

====================
Add HW checksum offload support for RZ/G2L GbEthernet IP

This patch series aims to add HW checksum offload supported by TOE module
found on the RZ/G2L Gb ethernet IP.

TOE has hardware support for calculating IP header and TCP/UDP/ICMP
checksum for both IPv4 and IPv6.

For Rx, the 4-byte result of the checksum calculation is attached to the
Ethernet frames. The first 2 bytes are the result of the IPv4 header
checksum and the next 2 bytes are the TCP/UDP/ICMP checksum.

If a frame does not have checksum error, 0x0000 is attached as checksum
calculation result. For unsupported frames 0xFFFF is attached as checksum
calculation result. In case of an IPv6 packet, IPv4 checksum is always set
to 0xFFFF.

For Tx, the result of checksum calculation is set to the checksum field of
each IPv4 Header/TCP/UDP/ICMP of ethernet frames. For the unsupported
frames, those fields are not changed. If a transmission frame is a UDPv4
frame and its checksum value in the UDP header field is 0x0000, TOE does
not calculate the checksum for the UDP part of this frame, as it is an
optional function as per the standards.

Add Tx/Rx checksum offload supported by TOE for IPv4 and TCP/UDP protocols.

Results of iperf3 in Mbps

RZ/V2L:
TCP(Tx/Rx) results with checksum offload Enabled: {921,932}
TCP(Tx/Rx) results with checksum offload Disabled: {867,612}

UDP(Tx/Rx) results with checksum offload Enabled: {950,946}
UDP(Tx/Rx) results with checksum offload Disabled: {952,920}

RZ/G2L:
TCP(Tx/Rx) results with checksum offload Enabled: {920,936}
TCP(Tx/Rx) results with checksum offload Disabled: {871,626}

UDP(Tx/Rx) results with checksum offload Enabled: {953,950}
UDP(Tx/Rx) results with checksum offload Disabled: {954,920}

RZ/G2LC:
TCP(Tx/Rx) results with checksum offload Enabled: {927,936}
TCP(Tx/Rx) results with checksum offload Disabled: {889,626}

UDP(Tx/Rx) results with checksum offload Enabled: {950,946}
UDP(Tx/Rx) results with checksum offload Disabled: {949,944}
====================

Link: https://lore.kernel.org/r/20240207092838.160627-1-biju.das.jz@bp.renesas.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months ago ravb: Add Tx checksum offload support for GbEth
Biju Das [Wed, 7 Feb 2024 09:28:38 +0000 (09:28 +0000)]
ravb: Add Tx checksum offload support for GbEth

TOE has hardware support for calculating IP header and TCP/UDP/ICMP
checksum for both IPv4 and IPv6.

Add Tx checksum offload supported by TOE for IPv4 and TCP/UDP.

For Tx, the result of checksum calculation is set to the checksum field of
each IPv4 Header/TCP/UDP/ICMP of ethernet frames. For the unsupported
frames, those fields are not changed. If a transmission frame is a UDPv4
frame and its checksum value in the UDP header field is 0x0000, TOE does
not calculate the checksum for the UDP part of this frame, as it is an
optional function as per the standards.

We can test this functionality with the commands below:

ethtool -K eth0 tx on --> to turn on Tx checksum offload
ethtool -K eth0 tx off --> to turn off Tx checksum offload

Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Link: https://lore.kernel.org/r/20240207092838.160627-3-biju.das.jz@bp.renesas.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months ago ravb: Add Rx checksum offload support for GbEth
Biju Das [Wed, 7 Feb 2024 09:28:37 +0000 (09:28 +0000)]
ravb: Add Rx checksum offload support for GbEth

TOE has hardware support for calculating IP header and TCP/UDP/ICMP
checksum for both IPv4 and IPv6.

Add Rx checksum offload supported by TOE for IPv4 and TCP/UDP protocols.

For Rx, the 4-byte result of the checksum calculation is attached to the
Ethernet frames. The first 2 bytes are the result of the IPv4 header
checksum and the next 2 bytes are the TCP/UDP/ICMP checksum.

If a frame does not have checksum error, 0x0000 is attached as checksum
calculation result. For unsupported frames 0xFFFF is attached as checksum
calculation result. In case of an IPv6 packet, IPv4 checksum is always set
to 0xFFFF.
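
A minimal userspace sketch of how such a result could be interpreted, not the driver's actual Rx path. It assumes the 4 bytes sit at the very end of the received buffer, and that any value other than 0x0000 or 0xFFFF indicates a checksum error; both points are assumptions, as are all names used here. Because only 0x0000 and 0xFFFF are compared, the byte order of the two 16-bit fields does not matter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum csum_status { CSUM_OK, CSUM_UNSUPPORTED, CSUM_ERROR };

struct rx_csum_model {
	enum csum_status ipv4_hdr;	/* first 2 bytes of the result */
	enum csum_status l4;		/* next 2 bytes: TCP/UDP/ICMP */
};

static enum csum_status classify(uint16_t v)
{
	if (v == 0x0000)
		return CSUM_OK;
	if (v == 0xffff)
		return CSUM_UNSUPPORTED;
	return CSUM_ERROR;	/* assumption: anything else is a mismatch */
}

/* len covers the frame plus the assumed 4-byte trailer. */
static struct rx_csum_model parse_rx_csum(const uint8_t *frame, size_t len)
{
	struct rx_csum_model r;
	uint16_t ip, l4;

	assert(frame != NULL && len >= 4);
	memcpy(&ip, frame + len - 4, sizeof(ip));
	memcpy(&l4, frame + len - 2, sizeof(l4));
	r.ipv4_hdr = classify(ip);
	r.l4 = classify(l4);
	return r;
}
```

For an IPv6 packet with a good TCP checksum, this model reports the IPv4 part as unsupported and the L4 part as OK.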

We can test this functionality with the commands below:

ethtool -K eth0 rx on --> to turn on Rx checksum offload
ethtool -K eth0 rx off --> to turn off Rx checksum offload

Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Link: https://lore.kernel.org/r/20240207092838.160627-2-biju.das.jz@bp.renesas.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agonetxen_nic: remove redundant assignment to variable capability
Colin Ian King [Tue, 6 Feb 2024 11:50:49 +0000 (11:50 +0000)]
netxen_nic: remove redundant assignment to variable capability

The variable capability is being assigned a value that is never
read and is being re-assigned later. The assignment is redundant and
can be removed. Also remove empty line before assignment to capability.

Cleans up clang scan build warning:
drivers/net/ethernet/qlogic/netxen/netxen_nic_init.c:1189:2: warning:
Value stored to 'capability' is never read [deadcode.DeadStores]

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240206115049.1879389-1-colin.i.king@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agonet-procfs: use xarray iterator to implement /proc/net/dev
Eric Dumazet [Wed, 7 Feb 2024 16:53:18 +0000 (16:53 +0000)]
net-procfs: use xarray iterator to implement /proc/net/dev

In commit 759ab1edb56c ("net: store netdevs in an xarray")
Jakub added net->dev_by_index to map ifindex to netdevices.

We can get rid of the old hash table (net->dev_index_head),
one patch at a time, if performance is acceptable.

This patch replaces unpleasant code with something more readable.

As a bonus, /proc/net/dev gets netdevices sorted by their ifindex.
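
The ordering bonus follows from walking an index-keyed structure in ascending key order. A minimal userspace sketch of that idea, assuming a plain array stands in for net->dev_by_index; struct netdev_model and next_dev_from() are hypothetical names, not kernel APIs:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a net device; not struct net_device. */
struct netdev_model {
	int ifindex;
	const char *name;
};

/*
 * Return the first populated slot at or after *pos and leave *pos on it,
 * or NULL when no device is left. slots[i] holds the device whose ifindex
 * is i, so a forward walk visits devices in ascending ifindex order.
 */
static const struct netdev_model *
next_dev_from(const struct netdev_model *const *slots, size_t nslots,
	      size_t *pos)
{
	assert(slots != NULL && pos != NULL);
	for (; *pos < nslots; (*pos)++) {
		if (slots[*pos] != NULL)
			return slots[*pos];
	}
	return NULL;
}
```

A caller resumes the walk by incrementing *pos after each returned device, which mirrors how a seq_file style iterator continues from the last position.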

Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20240207165318.3814525-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agobnxt: convert EEE handling to use linkmode bitmaps
Heiner Kallweit [Wed, 7 Feb 2024 16:47:35 +0000 (17:47 +0100)]
bnxt: convert EEE handling to use linkmode bitmaps

Convert EEE handling to use linkmode bitmaps. This prepares for removing
the legacy bitmaps from struct ethtool_keee. No functional change
intended. When replacing _bnxt_fw_to_ethtool_adv_spds() with
_bnxt_fw_to_linkmode(), remove the fw_pause argument because it's
always passed as 0.

Note:
There's a discussion on whether the underlying implementation is correct,
but it's independent of this mechanical conversion w/o functional change.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/9123bf18-a0d0-404e-a7c4-d6c466b4c5e8@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agoqed: remove duplicated assignment to variable opaque_fid
Colin Ian King [Mon, 5 Feb 2024 21:55:30 +0000 (21:55 +0000)]
qed: remove duplicated assignment to variable opaque_fid

Variable opaque_fid is being assigned twice with the same value
in two identical statements. Remove the redundant first assignment.

Cleans up clang scan build warning:
drivers/net/ethernet/qlogic/qed/qed_rdma.c:1796:2: warning: Value
stored to 'opaque_fid' is never read [deadcode.DeadStores]

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240205215530.1851115-1-colin.i.king@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agoxirc2ps_cs: remove redundant assignment to variable okay, clean up freespace
Colin Ian King [Mon, 5 Feb 2024 21:36:43 +0000 (21:36 +0000)]
xirc2ps_cs: remove redundant assignment to variable okay, clean up freespace

The variable okay is being initialized with a value that is never
read, it is being re-assigned later on. The initialization is
redundant and can be removed.  Also clean up assignment to
variable freespace using an assignment and mask operation.

Cleans up clang scan build warning:
drivers/net/ethernet/xircom/xirc2ps_cs.c:1244:5: warning: Value stored
to 'okay' is never read [deadcode.DeadStores]

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240205213643.1850420-1-colin.i.king@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agonet: dsa: tag_sja1105: remove "inline" keyword
Vladimir Oltean [Tue, 6 Feb 2024 11:29:27 +0000 (13:29 +0200)]
net: dsa: tag_sja1105: remove "inline" keyword

The convention is to not use the "inline" keyword for functions in C
files, but to let the compiler choose.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240206112927.4134375-2-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agonet: dsa: remove "inline" from dsa_user_netpoll_send_skb()
Vladimir Oltean [Tue, 6 Feb 2024 11:29:26 +0000 (13:29 +0200)]
net: dsa: remove "inline" from dsa_user_netpoll_send_skb()

The convention is to not use "inline" functions in C files, and let the
compiler decide whether to inline or not.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240206112927.4134375-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agonet: dsa: b53: unexport and move b53_eee_enable_set()
Vladimir Oltean [Tue, 6 Feb 2024 11:25:27 +0000 (13:25 +0200)]
net: dsa: b53: unexport and move b53_eee_enable_set()

After commit f86ad77faf24 ("net: dsa: bcm_sf2: Utilize b53_{enable,
disable}_port"), bcm_sf2.c no longer calls b53_eee_enable_set(), and all
its callers are in b53_common.c.

We also need to move it, because it is called within b53_common.c before
its definition, and we want to avoid forward declarations.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20240206112527.4132299-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agoptp: ocp: add Adva timecard support
Sagi Maimon [Mon, 5 Feb 2024 15:30:46 +0000 (17:30 +0200)]
ptp: ocp: add Adva timecard support

Adding support for the Adva timecard.
The card uses different drivers to provide access to the
firmware SPI flash (Altera based).
Other parts of the code are the same and could be reused.

Signed-off-by: Sagi Maimon <maimon.sagi@gmail.com>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://lore.kernel.org/r/20240205153046.3642-1-maimon.sagi@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agonet/sun3_82586: Avoid reading past buffer in debug output
Kees Cook [Tue, 6 Feb 2024 16:16:54 +0000 (08:16 -0800)]
net/sun3_82586: Avoid reading past buffer in debug output

Since NUM_XMIT_BUFFS is always 1, building m68k with sun3_defconfig and
-Warray-bounds makes this build warning visible[1]:

drivers/net/ethernet/i825xx/sun3_82586.c: In function 'sun3_82586_timeout':
drivers/net/ethernet/i825xx/sun3_82586.c:990:122: warning: array subscript 1 is above array bounds of 'volatile struct transmit_cmd_struct *[1]' [-Warray-bounds=]
  990 |                 printk("%s: command-stats: %04x %04x\n",dev->name,swab16(p->xmit_cmds[0]->cmd_status),swab16(p->xmit_cmds[1]->cmd_status));
      |                                                                                                               ~~~~~~~~~~~~^~~
...
drivers/net/ethernet/i825xx/sun3_82586.c:156:46: note: while referencing 'xmit_cmds'
  156 |         volatile struct transmit_cmd_struct *xmit_cmds[NUM_XMIT_BUFFS];

Avoid accessing index 1 since it doesn't exist.
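
One general way to keep such debug output inside the array is to bound the loop by the array size. The sketch below is a userspace illustration under that assumption and is not the change made to sun3_82586.c; struct transmit_cmd_model and format_cmd_stats() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define NUM_XMIT_BUFFS 1

/* Hypothetical stand-in for the driver's transmit command block. */
struct transmit_cmd_model {
	uint16_t cmd_status;
};

/*
 * Format the status of every existing command slot into out.
 * The loop bound is the array size, so no slot past the end is read.
 * Returns the number of characters written, or -1 on truncation.
 */
static int format_cmd_stats(char *out, size_t outlen,
			    struct transmit_cmd_model *const cmds[NUM_XMIT_BUFFS])
{
	size_t used = 0;
	size_t i;

	assert(out != NULL && outlen > 0 && cmds != NULL);
	out[0] = '\0';
	for (i = 0; i < NUM_XMIT_BUFFS; i++) {
		int n = snprintf(out + used, outlen - used, "%s%04x",
				 i ? " " : "",
				 (unsigned int)cmds[i]->cmd_status);

		if (n < 0 || (size_t)n >= outlen - used)
			return -1;	/* output failed or was truncated */
		used += (size_t)n;
	}
	return (int)used;
}
```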

Link: https://github.com/KSPP/linux/issues/325
Cc: Sam Creasey <sammy@sammy.net>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Simon Horman <horms@kernel.org> # build-tested
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20240206161651.work.876-kees@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Jakub Kicinski [Thu, 8 Feb 2024 23:20:37 +0000 (15:20 -0800)]
Merge git://git./linux/kernel/git/netdev/net

Cross-merge networking fixes after downstream PR.

No conflicts.

Adjacent changes:

drivers/net/ethernet/stmicro/stmmac/common.h
  38cc3c6dcc09 ("net: stmmac: protect updates of 64-bit statistics counters")
  fd5a6a71313e ("net: stmmac: est: Per Tx-queue error count for HLBF")
  c5c3e1bfc9e0 ("net: stmmac: Offload queueMaxSDU from tc-taprio")

drivers/net/wireless/microchip/wilc1000/netdev.c
  c9013880284d ("wifi: fill in MODULE_DESCRIPTION()s for wilc1000")
  328efda22af8 ("wifi: wilc1000: do not realloc workqueue everytime an interface is added")

net/unix/garbage.c
  11498715f266 ("af_unix: Remove io_uring code for GC.")
  1279f9d9dec2 ("af_unix: Call kfree_skb() for dead unix_(sk)->oob_skb in GC.")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agoMerge tag 'net-6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Thu, 8 Feb 2024 23:09:29 +0000 (15:09 -0800)]
Merge tag 'net-6.8-rc4' of git://git./linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from WiFi and netfilter.

  Current release - regressions:

   - nic: intel: fix old compiler regressions

   - netfilter: ipset: missing gc cancellations fixed

  Current release - new code bugs:

   - netfilter: ctnetlink: fix filtering for zone 0

  Previous releases - regressions:

   - core: fix from address in memcpy_to_iter_csum()

   - netfilter: nfnetlink_queue: un-break NF_REPEAT

   - af_unix: fix memory leak for dead unix_(sk)->oob_skb in GC.

   - devlink: avoid potential loop in devlink_rel_nested_in_notify_work()

   - iwlwifi:
       - mvm: fix a battery life regression
       - fix double-free bug

   - mac80211: fix waiting for beacons logic

   - nic: nfp: flower: prevent re-adding mac index for bonded port

  Previous releases - always broken:

   - rxrpc: fix generation of serial numbers to skip zero

   - tipc: check the bearer type before calling tipc_udp_nl_bearer_add()

   - tunnels: fix out of bounds access when building IPv6 PMTU error

   - nic: hv_netvsc: register VF in netvsc_probe if NET_DEVICE_REGISTER
     missed

   - nic: atlantic: fix DMA mapping for PTP hwts ring

  Misc:

   - selftests: more fixes to deal with very slow hosts"

* tag 'net-6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (80 commits)
  netfilter: nft_set_pipapo: remove scratch_aligned pointer
  netfilter: nft_set_pipapo: add helper to release pcpu scratch area
  netfilter: nft_set_pipapo: store index in scratch maps
  netfilter: nft_set_rbtree: skip end interval element from gc
  netfilter: nfnetlink_queue: un-break NF_REPEAT
  netfilter: nf_tables: use timestamp to check for set element timeout
  netfilter: nft_ct: reject direction for ct id
  netfilter: ctnetlink: fix filtering for zone 0
  s390/qeth: Fix potential loss of L3-IP@ in case of network issues
  netfilter: ipset: Missing gc cancellations fixed
  octeontx2-af: Initialize maps.
  net: ethernet: ti: cpsw: enable mac_managed_pm to fix mdio
  net: ethernet: ti: cpsw_new: enable mac_managed_pm to fix mdio
  netfilter: nft_set_pipapo: remove static in nft_pipapo_get()
  netfilter: nft_compat: restrict match/target protocol to u16
  netfilter: nft_compat: reject unused compat flag
  netfilter: nft_compat: narrow down revision to unsigned 8-bits
  net: intel: fix old compiler regressions
  MAINTAINERS: Maintainer change for rds
  selftests: cmsg_ipv6: repeat the exact packet
  ...

20 months agoMerge tag 'pinctrl-v6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw...
Linus Torvalds [Thu, 8 Feb 2024 23:07:06 +0000 (15:07 -0800)]
Merge tag 'pinctrl-v6.8-2' of git://git./linux/kernel/git/linusw/linux-pinctrl

Pull pinctrl fix from Linus Walleij:
 "A single fix for the AMD driver which affects developer laptops, the
  pinctrl/GPIO driver won't probe on some systems"

* tag 'pinctrl-v6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: amd: Add IRQF_ONESHOT to the interrupt request

20 months agoMerge tag 'nf-24-02-08' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Paolo Abeni [Thu, 8 Feb 2024 11:56:39 +0000 (12:56 +0100)]
Merge tag 'nf-24-02-08' of git://git./linux/kernel/git/netfilter/nf

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net:

1) Narrow down target/match revision to u8 in nft_compat.

2) Bail out with unused flags in nft_compat.

3) Restrict layer 4 protocol to u16 in nft_compat.

4) Remove static in pipapo get command that slipped through when
   reducing set memory footprint.

5) Follow up incremental fix for the ipset performance regression,
   this includes the missing gc cancellation, from Jozsef Kadlecsik.

6) Allow filtering by zone 0 in ctnetlink, do not interpret zone 0
   as no filtering, from Felix Huettner.

7) Reject direction for NFT_CT_ID.

8) Use timestamp to check for set element expiration while transaction
   is handled to prevent garbage collection from removing set elements
   that were just added by this transaction. Packet path and netlink
   dump/get path still use current time to check for expiration.

9) Restore NF_REPEAT in nfnetlink_queue, from Florian Westphal.

10) map_index needs to be percpu and per-set, not just percpu.
    At this time it's possible for a pipapo set to fill the all-zero part
    with ones and take the 'might have bits set' as 'start-from-zero' area.
    From Florian Westphal. This includes three patches:

    - Change scratchpad area to a structure that provides space for a
      per-set-and-cpu toggle and uses it instead of the percpu one.

    - Add a new free helper to prepare for the next patch.

    - Remove the scratch_aligned pointer and make the AVX2 implementation
      use the exact same memory addresses for read/store of the matching
      state.

netfilter pull request 24-02-08

* tag 'nf-24-02-08' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  netfilter: nft_set_pipapo: remove scratch_aligned pointer
  netfilter: nft_set_pipapo: add helper to release pcpu scratch area
  netfilter: nft_set_pipapo: store index in scratch maps
  netfilter: nft_set_rbtree: skip end interval element from gc
  netfilter: nfnetlink_queue: un-break NF_REPEAT
  netfilter: nf_tables: use timestamp to check for set element timeout
  netfilter: nft_ct: reject direction for ct id
  netfilter: ctnetlink: fix filtering for zone 0
  netfilter: ipset: Missing gc cancellations fixed
  netfilter: nft_set_pipapo: remove static in nft_pipapo_get()
  netfilter: nft_compat: restrict match/target protocol to u16
  netfilter: nft_compat: reject unused compat flag
  netfilter: nft_compat: narrow down revision to unsigned 8-bits
====================

Link: https://lore.kernel.org/r/20240208112834.1433-1-pablo@netfilter.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
20 months agonetfilter: nft_set_pipapo: remove scratch_aligned pointer
Florian Westphal [Thu, 8 Feb 2024 09:31:29 +0000 (10:31 +0100)]
netfilter: nft_set_pipapo: remove scratch_aligned pointer

Use ->scratch for both avx2 and the generic implementation.

After previous change the scratch->map member is always aligned properly
for AVX2, so we can just use scratch->map in AVX2 too.

The alignoff delta is stored in the scratchpad so we can reconstruct
the correct address to free the area again.
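
The idea of storing the alignment delta next to the data can be sketched in userspace C. This is an illustration only: struct scratch_model, SCRATCH_ALIGN and both helpers are hypothetical, and calloc()/free() stand in for the kernel allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SCRATCH_ALIGN 32u	/* assumed alignment wanted for the map */

struct scratch_model {
	uint32_t align_off;	/* bytes between the raw allocation and this struct */
	unsigned long map[4];
};

/* Allocate with slack, then place the struct so that ->map is aligned. */
static struct scratch_model *scratch_alloc(void)
{
	const size_t off = offsetof(struct scratch_model, map);
	struct scratch_model *s;
	uintptr_t raw, map_addr, start;
	void *mem;

	mem = calloc(1, sizeof(*s) + SCRATCH_ALIGN);
	if (mem == NULL)
		return NULL;
	raw = (uintptr_t)mem;
	map_addr = (raw + off + SCRATCH_ALIGN - 1) & ~(uintptr_t)(SCRATCH_ALIGN - 1);
	start = map_addr - off;
	assert(start >= raw && start - raw < SCRATCH_ALIGN);
	s = (struct scratch_model *)start;
	s->align_off = (uint32_t)(start - raw);
	return s;
}

/* Undo the shift recorded at allocation time before freeing. */
static void scratch_free(struct scratch_model *s)
{
	if (s != NULL)
		free((uint8_t *)s - s->align_off);
}
```

Keeping the offset inside the object means a single pointer is enough for both lookups and teardown, which is why a separate aligned pointer becomes unnecessary in this model.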

Fixes: 7400b063969b ("nft_set_pipapo: Introduce AVX2-based lookup implementation")
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
20 months agonetfilter: nft_set_pipapo: add helper to release pcpu scratch area
Florian Westphal [Wed, 7 Feb 2024 20:52:47 +0000 (21:52 +0100)]
netfilter: nft_set_pipapo: add helper to release pcpu scratch area

After the next patch, a simple kfree() is not enough anymore, so add
a helper for it.

Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
20 months agonetfilter: nft_set_pipapo: store index in scratch maps
Florian Westphal [Wed, 7 Feb 2024 20:52:46 +0000 (21:52 +0100)]
netfilter: nft_set_pipapo: store index in scratch maps

Pipapo needs a scratchpad area to keep state during matching.
This state can be large and thus cannot reside on stack.

Each set preallocates percpu areas for this.

On each match stage, one scratchpad half starts with all-zero and the other
is inited to all-ones.

At the end of each stage, the half that starts with all-ones is
always zero.  Before next field is tested, pointers to the two halves
are swapped, i.e.  resmap pointer turns into fill pointer and vice versa.

After the last field has been processed, pipapo stashes the
index toggle in a percpu variable, on the assumption that the next packet
will start with the all-zero half, and sets all bits in the other to 1.

This isn't reliable.

There can be multiple sets and we can't be sure that the upper
and lower halves of every set's scratch map are always in sync (lookups
can be conditional), so one set might have swapped, but another might
not have been queried.

Thus we need to keep the index per-set-and-cpu, just like the
scratchpad.

Note that this bug fix is incomplete, there is a related issue.

avx2 and normal implementation might use slightly different areas of the
map array space due to the avx2 alignment requirements, so
m->scratch (generic/fallback implementation) and ->scratch_aligned
(avx) may partially overlap. scratch and scratch_aligned are not distinct
objects, the latter is just the aligned address of the former.

After this change, a write to scratch_aligned->map_index may write to
scratch->map, so this issue becomes more prominent: we can set to 1
a bit in the supposedly-all-zero area of scratch->map[].

A followup patch will remove scratch_aligned and make the generic and
avx code use the same (aligned) area.

It's done in a separate change to ease review.

Fixes: 3c4287f62044 ("nf_tables: Add set type for arbitrary concatenation of ranges")
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
20 months agonetfilter: nft_set_rbtree: skip end interval element from gc
Pablo Neira Ayuso [Wed, 7 Feb 2024 17:49:51 +0000 (18:49 +0100)]
netfilter: nft_set_rbtree: skip end interval element from gc

rbtree lazy gc on insert might collect an end interval element that has
just been added in this transaction. Skip end interval elements that
are not yet active.

Fixes: f718863aca46 ("netfilter: nft_set_rbtree: fix overlap expiration walk")
Cc: stable@vger.kernel.org
Reported-by: lonial con <kongln9170@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
20 months agonetfilter: nfnetlink_queue: un-break NF_REPEAT
Florian Westphal [Tue, 6 Feb 2024 16:54:18 +0000 (17:54 +0100)]
netfilter: nfnetlink_queue: un-break NF_REPEAT

Only override userspace verdict if the ct hook returns something
other than ACCEPT.

Else, this replaces NF_REPEAT (run all hooks again) with NF_ACCEPT
(move to next hook).
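
A minimal sketch of the verdict selection described above, written as a userspace model rather than the nfnetlink_queue code. The enum mirrors the usual netfilter verdict numbering but is local to this example; final_verdict() and final_verdict_buggy() are hypothetical names.

```c
#include <assert.h>

/* Local model of netfilter verdicts; not the kernel's definitions. */
enum verdict_model {
	V_DROP = 0,
	V_ACCEPT = 1,
	V_STOLEN = 2,
	V_QUEUE = 3,
	V_REPEAT = 4,
};

/* Behaviour before the fix: the ct hook result always wins. */
static enum verdict_model final_verdict_buggy(enum verdict_model user,
					      enum verdict_model ct_hook)
{
	(void)user;
	return ct_hook;
}

/* Behaviour after the fix: only a non-ACCEPT hook result overrides. */
static enum verdict_model final_verdict(enum verdict_model user,
					enum verdict_model ct_hook)
{
	assert(user <= V_REPEAT && ct_hook <= V_REPEAT);
	if (ct_hook != V_ACCEPT)
		return ct_hook;
	return user;
}
```

In this model a userspace NF_REPEAT survives an ACCEPT from the hook, while a DROP from the hook still takes precedence.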

Fixes: 6291b3a67ad5 ("netfilter: conntrack: convert nf_conntrack_update to netfilter verdicts")
Reported-by: l.6diay@passmail.com
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
20 months agonetfilter: nf_tables: use timestamp to check for set element timeout
Pablo Neira Ayuso [Mon, 5 Feb 2024 23:11:40 +0000 (00:11 +0100)]
netfilter: nf_tables: use timestamp to check for set element timeout

Add a timestamp field at the beginning of the transaction, store it
in the nftables per-netns area.

Update set backend .insert, .deactivate and sync gc path to use the
timestamp. This prevents an element from expiring while the control plane
transaction is still unfinished.

.lookup and .update, which are used from the packet path, still use the
current time to check if the element has expired. So do the .get path and
dump, since they run lockless under the rcu read side lock. Finally, async
gc also needs to check the current time since it runs asynchronously from
a workqueue.

Fixes: c3e1b005ed1c ("netfilter: nf_tables: add set element timeout support")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
20 months agonetfilter: nft_ct: reject direction for ct id
Pablo Neira Ayuso [Mon, 5 Feb 2024 13:59:24 +0000 (14:59 +0100)]
netfilter: nft_ct: reject direction for ct id

The direction attribute is ignored; reject it in case this ever needs to
be supported.

Fixes: 3087c3f7c23b ("netfilter: nft_ct: Add ct id support")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
20 months agonetfilter: ctnetlink: fix filtering for zone 0
Felix Huettner [Mon, 5 Feb 2024 09:59:59 +0000 (09:59 +0000)]
netfilter: ctnetlink: fix filtering for zone 0

Previously, filtering for the default zone would actually skip the zone
filter and flush all zones.

Fixes: eff3c558bb7e ("netfilter: ctnetlink: support filtering by zone")
Reported-by: Ilya Maximets <i.maximets@ovn.org>
Closes: https://lore.kernel.org/netdev/2032238f-31ac-4106-8f22-522e76df5a12@ovn.org/
Signed-off-by: Felix Huettner <felix.huettner@mail.schwarz>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
20 months agos390/qeth: Fix potential loss of L3-IP@ in case of network issues
Alexandra Winter [Tue, 6 Feb 2024 08:58:49 +0000 (09:58 +0100)]
s390/qeth: Fix potential loss of L3-IP@ in case of network issues

Symptom:
In case of a bad cable connection (e.g. dirty optics) a fast sequence of
network DOWN-UP-DOWN-UP could happen. UP triggers recovery of the qeth
interface. In case of a second DOWN while recovery is still ongoing, it
can happen that the IP@ of a Layer3 qeth interface is lost and will not
be recovered by the second UP.

Problem:
When registration of IP addresses with Layer 3 qeth devices fails (e.g.
because of a bad address format), the respective IP address is deleted from
its hashtable in the driver. If registration fails because of an ENETDOWN
condition, the address should stay in the hashtable, so a subsequent
recovery can restore it.

3caa4af834df ("qeth: keep ip-address after LAN_OFFLINE failure")
fixes this for registration failures during normal operation, but not
during recovery.

Solution:
Keep L3-IP address in case of ENETDOWN in qeth_l3_recover_ip(). For
consistency with qeth_l3_add_ip() we also keep it in case of EADDRINUSE,
i.e. for some reason the card already/still has this address registered.
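
The decision described in the solution can be sketched as a small predicate. This is a userspace illustration, not the body of qeth_l3_recover_ip(); should_drop_ip() is a hypothetical helper and rc follows the usual convention of 0 for success and a negative errno for failure.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Decide whether an address should be removed from the driver's table
 * after a registration attempt returned rc.
 */
static bool should_drop_ip(int rc)
{
	assert(rc <= 0);
	switch (rc) {
	case 0:			/* registered fine */
	case -ENETDOWN:		/* link is down, a later recovery can retry */
	case -EADDRINUSE:	/* the card already has this address */
		return false;
	default:		/* e.g. bad address format */
		return true;
	}
}
```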

Fixes: 4a71df50047f ("qeth: new qeth device driver")
Cc: stable@vger.kernel.org
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Link: https://lore.kernel.org/r/20240206085849.2902775-1-wintera@linux.ibm.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
20 months agonetfilter: ipset: Missing gc cancellations fixed
Jozsef Kadlecsik [Sun, 4 Feb 2024 15:26:42 +0000 (16:26 +0100)]
netfilter: ipset: Missing gc cancellations fixed

The patch fdb8e12cc2cc ("netfilter: ipset: fix performance regression
in swap operation") failed to add the calls to gc cancellations
in the error path of create operations and at module unload. Also,
because half of the destroy operations are now executed by a
function registered by call_rcu(), neither the NFNL_SUBSYS_IPSET mutex
nor the rcu read lock is held, and therefore checking them results
in false warnings.

Fixes: 97f7cf1cd80e ("netfilter: ipset: fix performance regression in swap operation")
Reported-by: syzbot+52bbc0ad036f6f0d4a25@syzkaller.appspotmail.com
Reported-by: Brad Spengler <spender@grsecurity.net>
Reported-by: Стас Ничипорович <stasn77@gmail.com>
Tested-by: Brad Spengler <spender@grsecurity.net>
Tested-by: Стас Ничипорович <stasn77@gmail.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
20 months agoocteontx2-af: Initialize maps.
Ratheesh Kannoth [Tue, 6 Feb 2024 02:40:00 +0000 (08:10 +0530)]
octeontx2-af: Initialize maps.

kmalloc_array() without __GFP_ZERO flag does not initialize
memory to zero. This causes issues. Use kcalloc() for maps and
bitmap_zalloc() for bitmaps.
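
A userspace analogue of the same point: malloc() leaves memory uninitialized while calloc() zeroes it. The sketch below uses calloc() as a stand-in for the kernel helpers named above; map_alloc_zeroed() and bitmap_alloc_zeroed() are hypothetical names, and the uint16_t element type is an arbitrary choice for the example.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/* Zero-initialized array of n map entries, or NULL on failure. */
static uint16_t *map_alloc_zeroed(size_t n)
{
	assert(n > 0);
	return calloc(n, sizeof(uint16_t));
}

/* Zero-initialized bitmap large enough for nbits bits, or NULL. */
static unsigned long *bitmap_alloc_zeroed(size_t nbits)
{
	size_t words;

	assert(nbits > 0);
	words = (nbits + BITS_PER_ULONG - 1) / BITS_PER_ULONG;
	return calloc(words, sizeof(unsigned long));
}
```

Both helpers return memory that must be released with free().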

Fixes: dd7842878633 ("octeontx2-af: Add new devlink param to configure maximum usable NIX block LFs")
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
Reviewed-by: Brett Creeley <bcreeley@amd.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240206024000.1070260-1-rkannoth@marvell.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
20 months agoMerge branch 'cpsw-enable-mac_managed_pm-to-fix-mdio'
Paolo Abeni [Thu, 8 Feb 2024 10:33:22 +0000 (11:33 +0100)]
Merge branch 'cpsw-enable-mac_managed_pm-to-fix-mdio'

Sinthu Raja says:

====================
CPSW: enable mac_managed_pm to fix mdio

This patch fixes the resume/suspend issue on the CPSW interface.

Reference from the following patchwork:
https://lore.kernel.org/netdev/20221014144729.1159257-2-shenwei.wang@nxp.com/T/

V1: https://patchwork.kernel.org/project/netdevbpf/patch/20240122083414.6246-1-sinthu.raja@ti.com/
V2: https://patchwork.kernel.org/project/netdevbpf/patch/20240122093326.7618-1-sinthu.raja@ti.com/
====================

Link: https://lore.kernel.org/r/20240206005928.15703-1-sinthu.raja@ti.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
20 months agonet: ethernet: ti: cpsw: enable mac_managed_pm to fix mdio
Sinthu Raja [Tue, 6 Feb 2024 00:59:28 +0000 (06:29 +0530)]
net: ethernet: ti: cpsw: enable mac_managed_pm to fix mdio

The commit below introduced a WARN when the phy state is not one of
PHY_HALTED, PHY_READY or PHY_UP:
commit 744d23c71af3 ("net: phy: Warn about incorrect mdio_bus_phy_resume() state")

When cpsw resumes, there is a port in PHY_NOLINK state, so the warning
below is triggered. Set mac_managed_pm to true to tell mdio that the phy
resume/suspend is managed by the mac, which fixes the following warning:

WARNING: CPU: 0 PID: 965 at drivers/net/phy/phy_device.c:326 mdio_bus_phy_resume+0x140/0x144
CPU: 0 PID: 965 Comm: sh Tainted: G           O       6.1.46-g247b2535b2 #1
Hardware name: Generic AM33XX (Flattened Device Tree)
 unwind_backtrace from show_stack+0x18/0x1c
 show_stack from dump_stack_lvl+0x24/0x2c
 dump_stack_lvl from __warn+0x84/0x15c
 __warn from warn_slowpath_fmt+0x1a8/0x1c8
 warn_slowpath_fmt from mdio_bus_phy_resume+0x140/0x144
 mdio_bus_phy_resume from dpm_run_callback+0x3c/0x140
 dpm_run_callback from device_resume+0xb8/0x2b8
 device_resume from dpm_resume+0x144/0x314
 dpm_resume from dpm_resume_end+0x14/0x20
 dpm_resume_end from suspend_devices_and_enter+0xd0/0x924
 suspend_devices_and_enter from pm_suspend+0x2e0/0x33c
 pm_suspend from state_store+0x74/0xd0
 state_store from kernfs_fop_write_iter+0x104/0x1ec
 kernfs_fop_write_iter from vfs_write+0x1b8/0x358
 vfs_write from ksys_write+0x78/0xf8
 ksys_write from ret_fast_syscall+0x0/0x54
Exception stack(0xe094dfa8 to 0xe094dff0)
dfa0:                   00000004 005c3fb8 00000001 005c3fb8 00000004 00000001
dfc0: 00000004 005c3fb8 b6f6bba0 00000004 00000004 0059edb8 00000000 00000000
dfe0: 00000004 bed918f0 b6f09bd3 b6e89a66

Cc: <stable@vger.kernel.org> # v6.0+
Fixes: 744d23c71af3 ("net: phy: Warn about incorrect mdio_bus_phy_resume() state")
Fixes: fba863b81604 ("net: phy: make PHY PM ops a no-op if MAC driver manages PHY PM")
Signed-off-by: Sinthu Raja <sinthu.raja@ti.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
20 months agonet: ethernet: ti: cpsw_new: enable mac_managed_pm to fix mdio
Sinthu Raja [Tue, 6 Feb 2024 00:59:27 +0000 (06:29 +0530)]
net: ethernet: ti: cpsw_new: enable mac_managed_pm to fix mdio

The commit below introduced a WARN when the phy state is not one of
PHY_HALTED, PHY_READY or PHY_UP:
commit 744d23c71af3 ("net: phy: Warn about incorrect mdio_bus_phy_resume() state")

When cpsw_new resumes, there is a port in PHY_NOLINK state, so the warning
below is triggered. Set mac_managed_pm to true to tell mdio that the phy
resume/suspend is managed by the mac, which fixes the following warning:

WARNING: CPU: 0 PID: 965 at drivers/net/phy/phy_device.c:326 mdio_bus_phy_resume+0x140/0x144
CPU: 0 PID: 965 Comm: sh Tainted: G           O       6.1.46-g247b2535b2 #1
Hardware name: Generic AM33XX (Flattened Device Tree)
 unwind_backtrace from show_stack+0x18/0x1c
 show_stack from dump_stack_lvl+0x24/0x2c
 dump_stack_lvl from __warn+0x84/0x15c
 __warn from warn_slowpath_fmt+0x1a8/0x1c8
 warn_slowpath_fmt from mdio_bus_phy_resume+0x140/0x144
 mdio_bus_phy_resume from dpm_run_callback+0x3c/0x140
 dpm_run_callback from device_resume+0xb8/0x2b8
 device_resume from dpm_resume+0x144/0x314
 dpm_resume from dpm_resume_end+0x14/0x20
 dpm_resume_end from suspend_devices_and_enter+0xd0/0x924
 suspend_devices_and_enter from pm_suspend+0x2e0/0x33c
 pm_suspend from state_store+0x74/0xd0
 state_store from kernfs_fop_write_iter+0x104/0x1ec
 kernfs_fop_write_iter from vfs_write+0x1b8/0x358
 vfs_write from ksys_write+0x78/0xf8
 ksys_write from ret_fast_syscall+0x0/0x54
Exception stack(0xe094dfa8 to 0xe094dff0)
dfa0:                   00000004 005c3fb8 00000001 005c3fb8 00000004 00000001
dfc0: 00000004 005c3fb8 b6f6bba0 00000004 00000004 0059edb8 00000000 00000000
dfe0: 00000004 bed918f0 b6f09bd3 b6e89a66

Cc: <stable@vger.kernel.org> # v6.0+
Fixes: 744d23c71af3 ("net: phy: Warn about incorrect mdio_bus_phy_resume() state")
Fixes: fba863b81604 ("net: phy: make PHY PM ops a no-op if MAC driver manages PHY PM")
Signed-off-by: Sinthu Raja <sinthu.raja@ti.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
20 months agonetfilter: nft_set_pipapo: remove static in nft_pipapo_get()
Pablo Neira Ayuso [Fri, 2 Feb 2024 09:09:34 +0000 (10:09 +0100)]
netfilter: nft_set_pipapo: remove static in nft_pipapo_get()

This has slipped through when reducing memory footprint for set
elements, remove it.

Fixes: 9dad402b89e8 ("netfilter: nf_tables: expose opaque set element as struct nft_elem_priv")
Reported-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
20 months agoMerge tag 'v6.8-p3' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Linus Torvalds [Thu, 8 Feb 2024 06:12:14 +0000 (06:12 +0000)]
Merge tag 'v6.8-p3' of git://git./linux/kernel/git/herbert/crypto-2.6

Pull crypto fixes from Herbert Xu:
 "Fix regressions in cbc and algif_hash, as well as an older
  NULL-pointer dereference in ccp"

* tag 'v6.8-p3' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: algif_hash - Remove bogus SGL free on zero-length error path
  crypto: cbc - Ensure statesize is zero
  crypto: ccp - Fix null pointer dereference in __sev_platform_shutdown_locked

20 months agoMerge tag 'percpu-for-6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/denni...
Linus Torvalds [Thu, 8 Feb 2024 06:08:37 +0000 (06:08 +0000)]
Merge tag 'percpu-for-6.8-rc4' of git://git./linux/kernel/git/dennis/percpu

Pull percpu fix from Dennis Zhou:

 - fix riscv wrong size passed to local_flush_tlb_range_asid()

* tag 'percpu-for-6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu:
  riscv: Fix wrong size passed to local_flush_tlb_range_asid()

20 months agoMerge branch 'net-more-factorization-in-cleanup_net-paths'
Jakub Kicinski [Thu, 8 Feb 2024 02:55:15 +0000 (18:55 -0800)]
Merge branch 'net-more-factorization-in-cleanup_net-paths'

Eric Dumazet says:

====================
net: more factorization in cleanup_net() paths

This series is inspired by recent syzbot reports hinting at RTNL and
workqueue abuses.

rtnl_lock() is unfair to (single threaded) cleanup_net(), because
many threads can cause contention on it.

This series adds a new (struct pernet_operations) method,
so that cleanup_net() can hold RTNL longer once it finally
acquires it.

It also factorizes unregister_netdevice_many(), to further
reduce stalls in cleanup_net().

Link: https://lore.kernel.org/netdev/CANn89iLJrrJs+6Vc==Un4rVKcpV0Eof4F_4w1_wQGxUCE2FWAg@mail.gmail.com/T/#u
https://lore.kernel.org/netdev/170688415193.5216.10499830272732622816@kwain/
====================

Link: https://lore.kernel.org/r/20240206144313.2050392-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agoxfrm: interface: use exit_batch_rtnl() method
Eric Dumazet [Tue, 6 Feb 2024 14:43:12 +0000 (14:43 +0000)]
xfrm: interface: use exit_batch_rtnl() method

exit_batch_rtnl() is called while RTNL is held,
and devices to be unregistered can be queued in the dev_kill_list.

This saves one rtnl_lock()/rtnl_unlock() pair per netns
and one unregister_netdevice_many() call.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Antoine Tenart <atenart@kernel.org>
Link: https://lore.kernel.org/r/20240206144313.2050392-17-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agobridge: use exit_batch_rtnl() method
Eric Dumazet [Tue, 6 Feb 2024 14:43:11 +0000 (14:43 +0000)]
bridge: use exit_batch_rtnl() method

exit_batch_rtnl() is called while RTNL is held,
and devices to be unregistered can be queued in the dev_kill_list.

This saves one rtnl_lock()/rtnl_unlock() pair per netns
and one unregister_netdevice_many() call.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Antoine Tenart <atenart@kernel.org>
Link: https://lore.kernel.org/r/20240206144313.2050392-16-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agoip_tunnel: use exit_batch_rtnl() method
Eric Dumazet [Tue, 6 Feb 2024 14:43:10 +0000 (14:43 +0000)]
ip_tunnel: use exit_batch_rtnl() method

exit_batch_rtnl() is called while RTNL is held,
and devices to be unregistered can be queued in the dev_kill_list.

This saves one rtnl_lock()/rtnl_unlock() pair
and one unregister_netdevice_many() call.

This patch takes care of ipip, ip_vti, and ip_gre tunnels.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Antoine Tenart <atenart@kernel.org>
Link: https://lore.kernel.org/r/20240206144313.2050392-15-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agosit: use exit_batch_rtnl() method
Eric Dumazet [Tue, 6 Feb 2024 14:43:09 +0000 (14:43 +0000)]
sit: use exit_batch_rtnl() method

exit_batch_rtnl() is called while RTNL is held,
and devices to be unregistered can be queued in the dev_kill_list.

This saves one rtnl_lock()/rtnl_unlock() pair
and one unregister_netdevice_many() call.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Antoine Tenart <atenart@kernel.org>
Link: https://lore.kernel.org/r/20240206144313.2050392-14-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agoip6_vti: use exit_batch_rtnl() method
Eric Dumazet [Tue, 6 Feb 2024 14:43:08 +0000 (14:43 +0000)]
ip6_vti: use exit_batch_rtnl() method

exit_batch_rtnl() is called while RTNL is held,
and devices to be unregistered can be queued in the dev_kill_list.

This saves one rtnl_lock()/rtnl_unlock() pair
and one unregister_netdevice_many() call.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Antoine Tenart <atenart@kernel.org>
Link: https://lore.kernel.org/r/20240206144313.2050392-13-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agoip6_tunnel: use exit_batch_rtnl() method
Eric Dumazet [Tue, 6 Feb 2024 14:43:07 +0000 (14:43 +0000)]
ip6_tunnel: use exit_batch_rtnl() method

exit_batch_rtnl() is called while RTNL is held,
and devices to be unregistered can be queued in the dev_kill_list.

This saves one rtnl_lock()/rtnl_unlock() pair
and one unregister_netdevice_many() call.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Antoine Tenart <atenart@kernel.org>
Link: https://lore.kernel.org/r/20240206144313.2050392-12-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agoip6_gre: use exit_batch_rtnl() method
Eric Dumazet [Tue, 6 Feb 2024 14:43:06 +0000 (14:43 +0000)]
ip6_gre: use exit_batch_rtnl() method

exit_batch_rtnl() is called while RTNL is held,
and devices to be unregistered can be queued in the dev_kill_list.

This saves one rtnl_lock()/rtnl_unlock() pair
and one unregister_netdevice_many() call.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Antoine Tenart <atenart@kernel.org>
Link: https://lore.kernel.org/r/20240206144313.2050392-11-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agovxlan: use exit_batch_rtnl() method
Eric Dumazet [Tue, 6 Feb 2024 14:43:05 +0000 (14:43 +0000)]
vxlan: use exit_batch_rtnl() method

exit_batch_rtnl() is called while RTNL is held,
and devices to be unregistered can be queued in the dev_kill_list.

This saves one rtnl_lock()/rtnl_unlock() pair per netns
and one unregister_netdevice_many() call.

v4: (Paolo feedback : https://netdev-3.bots.linux.dev/vmksft-net/results/453141/17-udpgro-fwd-sh/stdout )
  - Changed vxlan_destroy_tunnels() to use vxlan_dellink()
    instead of unregister_netdevice_queue() to properly remove
    devices from vn->vxlan_list.
  - vxlan_destroy_tunnels() can simply iterate one list (vn->vxlan_list)
    to find all devices in the most efficient way.
  - Moved sanity checks in a separate vxlan_exit_net() method.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Antoine Tenart <atenart@kernel.org>
Link: https://lore.kernel.org/r/20240206144313.2050392-10-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agoipv4: add __unregister_nexthop_notifier()
Eric Dumazet [Tue, 6 Feb 2024 14:43:04 +0000 (14:43 +0000)]
ipv4: add __unregister_nexthop_notifier()

unregister_nexthop_notifier() assumes the caller does not hold rtnl.

The following patch needs to call it from a context
that already holds rtnl.

Add __unregister_nexthop_notifier().

unregister_nexthop_notifier() becomes a wrapper.
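
The locked/unlocked split described above can be sketched in user space as
follows. This is a minimal model, not the kernel code: the model_* names,
the boolean lock, and the single notifier slot are all hypothetical
stand-ins for rtnl and the nexthop notifier chain.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the rtnl mutex: only tracks whether it is held. */
static bool model_lock_held;
static int model_lock_acquisitions;

static void model_lock(void)
{
	model_lock_held = true;
	model_lock_acquisitions++;
}

static void model_unlock(void)
{
	model_lock_held = false;
}

/* Hypothetical single notifier slot standing in for a notifier chain. */
static const void *registered_notifier;

/* Double-underscore variant: the caller must already hold the lock. */
static int __model_unregister_notifier(const void *nb)
{
	if (!model_lock_held)
		return -1;	/* locking contract violated */
	if (registered_notifier != nb)
		return -2;	/* this notifier is not registered */
	registered_notifier = NULL;
	return 0;
}

/* Plain variant: a thin wrapper that takes the lock itself. */
static int model_unregister_notifier(const void *nb)
{
	int err;

	model_lock();
	err = __model_unregister_notifier(nb);
	model_unlock();
	return err;
}
```

A caller that already holds the lock uses the double-underscore variant
directly; every other caller goes through the wrapper, so the locking rule
lives in one place.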

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Antoine Tenart <atenart@kernel.org>
Link: https://lore.kernel.org/r/20240206144313.2050392-9-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agogtp: use exit_batch_rtnl() method
Eric Dumazet [Tue, 6 Feb 2024 14:43:03 +0000 (14:43 +0000)]
gtp: use exit_batch_rtnl() method

exit_batch_rtnl() is called while RTNL is held,
and devices to be unregistered can be queued in the dev_kill_list.

This saves one rtnl_lock()/rtnl_unlock() pair per netns
and one unregister_netdevice_many() call per netns.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Antoine Tenart <atenart@kernel.org>
Link: https://lore.kernel.org/r/20240206144313.2050392-8-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agogeneve: use exit_batch_rtnl() method
Eric Dumazet [Tue, 6 Feb 2024 14:43:02 +0000 (14:43 +0000)]
geneve: use exit_batch_rtnl() method

exit_batch_rtnl() is called while RTNL is held,
and devices to be unregistered can be queued in the dev_kill_list.

This saves one rtnl_lock()/rtnl_unlock() pair,
and one unregister_netdevice_many() call.

Note: it should be possible to remove the synchronize_net()
call from geneve_sock_release() in a future patch.

v4: move WARN_ON_ONCE(!list_empty(&gn->sock_list))
   into geneve_exit_net(), after devices have been unregistered.
   (Antoine Tenart feedback)

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Antoine Tenart <atenart@kernel.org>
Link: https://lore.kernel.org/r/20240206144313.2050392-7-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agobonding: use exit_batch_rtnl() method
Eric Dumazet [Tue, 6 Feb 2024 14:43:01 +0000 (14:43 +0000)]
bonding: use exit_batch_rtnl() method

exit_batch_rtnl() is called while RTNL is held,
and devices to be unregistered can be queued in the dev_kill_list.

This saves one rtnl_lock()/rtnl_unlock() pair,
and one unregister_netdevice_many() call.

v2: Added bond_net_pre_exit() method to make sure bond_destroy_sysfs()
    is called before we unregister the devices in bond_net_exit_batch_rtnl
 (Antoine Tenart : https://lore.kernel.org/netdev/170688415193.5216.10499830272732622816@kwain/)

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Jay Vosburgh <j.vosburgh@gmail.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Reviewed-by: Antoine Tenart <atenart@kernel.org>
Link: https://lore.kernel.org/r/20240206144313.2050392-6-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agobareudp: use exit_batch_rtnl() method
Eric Dumazet [Tue, 6 Feb 2024 14:43:00 +0000 (14:43 +0000)]
bareudp: use exit_batch_rtnl() method

exit_batch_rtnl() is called while RTNL is held,
and devices to be unregistered can be queued in the dev_kill_list.

This saves one rtnl_lock()/rtnl_unlock() pair,
and one unregister_netdevice_many() call.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Antoine Tenart <atenart@kernel.org>
Link: https://lore.kernel.org/r/20240206144313.2050392-5-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agonexthop: convert nexthop_net_exit_batch to exit_batch_rtnl method
Eric Dumazet [Tue, 6 Feb 2024 14:42:59 +0000 (14:42 +0000)]
nexthop: convert nexthop_net_exit_batch to exit_batch_rtnl method

exit_batch_rtnl() is called while RTNL is held.

This saves one rtnl_lock()/rtnl_unlock() pair.

We also need to create nexthop_net_exit()
to make sure net->nexthop.devhash is not freed too soon,
otherwise we will not be able to unregister netdev
from exit_batch_rtnl() methods.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Antoine Tenart <atenart@kernel.org>
Link: https://lore.kernel.org/r/20240206144313.2050392-4-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agonet: add exit_batch_rtnl() method
Eric Dumazet [Tue, 6 Feb 2024 14:42:57 +0000 (14:42 +0000)]
net: add exit_batch_rtnl() method

Many (struct pernet_operations)->exit_batch() methods have
to acquire rtnl.

In presence of rtnl mutex pressure, this makes cleanup_net()
very slow.

This patch adds a new exit_batch_rtnl() method to reduce
number of rtnl acquisitions from cleanup_net().

exit_batch_rtnl() handlers are called while rtnl is locked,
and devices to be killed can be queued in a list provided
as their second argument.

A single unregister_netdevice_many() is called right
before rtnl is released.
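
As a rough user-space illustration of the batching described above (not
the kernel implementation), the sketch below lets every handler queue its
devices on a shared kill list while a single lock acquisition is counted,
followed by a single batched unregister. All model_* names, the array-based
lists, and the device numbering are hypothetical.

```c
#include <stddef.h>

#define MODEL_MAX 16

/* Hypothetical array-based stand-ins for the kernel's linked lists. */
struct model_net_list { int ids[MODEL_MAX]; size_t n; };
struct model_kill_list { int devs[MODEL_MAX]; size_t n; };

/* Counters standing in for rtnl_lock() and unregister_netdevice_many(). */
static int model_rtnl_acquisitions;
static int model_unregister_many_calls;

/* Handlers receive the kill list as their second argument. */
typedef void (*model_exit_batch_rtnl_t)(const struct model_net_list *nets,
					struct model_kill_list *kill);

static void model_queue_dev(struct model_kill_list *kill, int dev)
{
	if (kill->n < MODEL_MAX)
		kill->devs[kill->n++] = dev;
}

/* Two example handlers: each queues one device per dying namespace. */
static void model_tunnel_exit_batch_rtnl(const struct model_net_list *nets,
					 struct model_kill_list *kill)
{
	for (size_t i = 0; i < nets->n; i++)
		model_queue_dev(kill, 1000 + nets->ids[i]);
}

static void model_bridge_exit_batch_rtnl(const struct model_net_list *nets,
					 struct model_kill_list *kill)
{
	for (size_t i = 0; i < nets->n; i++)
		model_queue_dev(kill, 2000 + nets->ids[i]);
}

/* Returns how many devices the single batched unregister would process. */
static size_t model_cleanup_net(const struct model_net_list *nets,
				const model_exit_batch_rtnl_t *ops,
				size_t nr_ops)
{
	struct model_kill_list kill = { .n = 0 };

	model_rtnl_acquisitions++;	/* one lock for all handlers */
	for (size_t i = 0; i < nr_ops; i++)
		ops[i](nets, &kill);	/* handlers only queue devices */
	model_unregister_many_calls++;	/* one batched unregister */
	/* the lock would be released here */
	return kill.n;
}
```

With two handlers and two namespaces, the model counts one lock
acquisition and one unregister call for four queued devices, where
per-handler locking would have needed two of each.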

exit_batch_rtnl() handlers are called before ->exit() and
->exit_batch() handlers.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Antoine Tenart <atenart@kernel.org>
Link: https://lore.kernel.org/r/20240206144313.2050392-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agoMerge branch 'mt7530-dsa-subdriver-improvements-act-ii'
Jakub Kicinski [Thu, 8 Feb 2024 02:53:55 +0000 (18:53 -0800)]
Merge branch 'mt7530-dsa-subdriver-improvements-act-ii'

Arınç ÜNAL says:

====================
MT7530 DSA Subdriver Improvements Act II

This is the second patch series with the goal of simplifying the MT7530 DSA
subdriver and improving support for MT7530, MT7531, and the switch on the
MT7988 SoC.

I have done a simple ping test to confirm basic communication on all switch
ports on MCM and standalone MT7530, and MT7531 switch with this patch
series applied.

MT7621 Unielec, MCM MT7530:

rgmii-only-gmac0-mt7621-unielec-u7621-06-16m.dtb
gmac0-and-gmac1-mt7621-unielec-u7621-06-16m.dtb

tftpboot 0x80008000 mips-uzImage.bin; tftpboot 0x83000000 mips-rootfs.cpio.uboot; tftpboot 0x83f00000 $dtb; bootm 0x80008000 0x83000000 0x83f00000

MT7622 Bananapi, MT7531:

gmac0-and-gmac1-mt7622-bananapi-bpi-r64.dtb

tftpboot 0x40000000 arm64-Image; tftpboot 0x45000000 arm64-rootfs.cpio.uboot; tftpboot 0x4a000000 $dtb; booti 0x40000000 0x45000000 0x4a000000

MT7623 Bananapi, standalone MT7530:

rgmii-only-gmac0-mt7623n-bananapi-bpi-r2.dtb
gmac0-and-gmac1-mt7623n-bananapi-bpi-r2.dtb

tftpboot 0x80008000 arm-zImage; tftpboot 0x83000000 arm-rootfs.cpio.uboot; tftpboot 0x83f00000 $dtb; bootz 0x80008000 0x83000000 0x83f00000

This patch series is the continuation of the patch series linked below.

https://lore.kernel.org/r/20230522121532.86610-1-arinc.unal@arinc9.com

Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
====================

Link: https://lore.kernel.org/r/20240206-for-netnext-mt7530-improvements-2-v5-0-d7d92a185cb1@arinc9.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agonet: dsa: mt7530: do not clear config->supported_interfaces
Arınç ÜNAL [Mon, 5 Feb 2024 22:08:08 +0000 (01:08 +0300)]
net: dsa: mt7530: do not clear config->supported_interfaces

There's no need to clear the config->supported_interfaces bitmap before
reporting the supported interfaces as all bits in the bitmap will already
be initialized to zero when the phylink_config structure is allocated. The
"config" pointer points to &dp->phylink_config, and "dp" is allocated by
dsa_port_touch() with kzalloc(), so all its fields are filled with zeroes.
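
The reasoning above can be checked with a small user-space sketch, using
calloc() as a stand-in for kzalloc(). The struct layout, the model_* names,
and the bitmap size are hypothetical and do not mirror the real
phylink_config or dsa_port definitions.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_NR_INTERFACES 40	/* hypothetical number of interface modes */
#define MODEL_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define MODEL_BITMAP_LONGS \
	((MODEL_NR_INTERFACES + MODEL_BITS_PER_LONG - 1) / MODEL_BITS_PER_LONG)

struct model_config {
	unsigned long supported_interfaces[MODEL_BITMAP_LONGS];
};

struct model_port {
	int index;
	struct model_config config;	/* embedded, so zeroed with the port */
};

/* calloc() zero-fills the whole port, including the embedded bitmap. */
static struct model_port *model_port_alloc(int index)
{
	struct model_port *p = calloc(1, sizeof(*p));

	if (p)
		p->index = index;
	return p;
}

static bool model_bitmap_empty(const unsigned long *map, size_t longs)
{
	for (size_t i = 0; i < longs; i++)
		if (map[i])
			return false;
	return true;
}

static void model_set_bit(unsigned int bit, unsigned long *map)
{
	map[bit / MODEL_BITS_PER_LONG] |= 1UL << (bit % MODEL_BITS_PER_LONG);
}
```

Because the bitmap is embedded in a zero-filled allocation, it is empty
before the first model_set_bit() call, so an explicit clear beforehand
would be redundant in this model.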

There's no code that would change the bitmap beforehand. Remove it.

Acked-by: Daniel Golle <daniel@makrotopia.org>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Link: https://lore.kernel.org/r/20240206-for-netnext-mt7530-improvements-2-v5-7-d7d92a185cb1@arinc9.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agonet: dsa: mt7530: correct port capabilities of MT7988
Arınç ÜNAL [Mon, 5 Feb 2024 22:08:07 +0000 (01:08 +0300)]
net: dsa: mt7530: correct port capabilities of MT7988

On the switch on the MT7988 SoC, as shown in Block Diagram 8.1.1.3 on page
125 of "MT7988A Wi-Fi 7 Generation Router Platform: Datasheet (Open
Version) v0.1", there are only 4 PHYs. That's port 0 to 3. Set the case for
ports which connect to switch PHYs to '0 ... 3'.
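
For readers unfamiliar with the '0 ... 3' notation, it is a GNU C case
range, accepted by GCC and Clang. The helper below is a hypothetical
illustration of such a range and is not the driver's actual function.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: reports whether a port number falls in the range
 * that connects to the switch PHYs (ports 0 to 3 in this design).
 * "case 0 ... 3" is a GNU extension, not ISO C.
 */
static bool model_port_has_switch_phy(int port)
{
	switch (port) {
	case 0 ... 3:
		return true;
	default:
		return false;
	}
}
```

The spaces around the ellipsis matter: GCC documents that writing the
range without them can be misparsed when the bounds are integer constants.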

Ports 4 and 5 are not used at all in this design.

Link: https://wiki.banana-pi.org/Banana_Pi_BPI-R4#Documents
Acked-by: Daniel Golle <daniel@makrotopia.org>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Link: https://lore.kernel.org/r/20240206-for-netnext-mt7530-improvements-2-v5-6-d7d92a185cb1@arinc9.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agonet: dsa: mt7530: remove pad_setup function pointer
Arınç ÜNAL [Mon, 5 Feb 2024 22:08:06 +0000 (01:08 +0300)]
net: dsa: mt7530: remove pad_setup function pointer

The pad_setup function pointer was introduced with 88bdef8be9f6 ("net: dsa:
mt7530: Extend device data ready for adding a new hardware"). It was being
used to set up the core clock and port 6 of the MT7530 switch, and pll of
the MT7531 switch.

All of these were moved to more appropriate locations, and it was never
used for the switch on the MT7988 SoC. Therefore, this function pointer
hasn't got a use anymore. Remove it.

Acked-by: Daniel Golle <daniel@makrotopia.org>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Link: https://lore.kernel.org/r/20240206-for-netnext-mt7530-improvements-2-v5-5-d7d92a185cb1@arinc9.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agonet: dsa: mt7530: call port 6 setup from mt7530_mac_config()
Arınç ÜNAL [Mon, 5 Feb 2024 22:08:05 +0000 (01:08 +0300)]
net: dsa: mt7530: call port 6 setup from mt7530_mac_config()

mt7530_pad_clk_setup() is called if port 6 is enabled. It used to do more
things than setting up port 6. That part was moved to more appropriate
locations, mt7530_setup() and mt7530_pll_setup().

Now that all it does is set up port 6, rename it to mt7530_setup_port6(),
and move it to a more appropriate location, under mt7530_mac_config().

Change mt7530_setup_port6() to void as there're no error cases.

Leave an empty mt7530_pad_clk_setup() to satisfy the pad_setup function
pointer.

This is the code path for setting up the ports before:

dsa_switch_ops :: phylink_mac_config() -> mt753x_phylink_mac_config()
-> mt753x_mac_config()
   -> mt753x_info :: mac_port_config() -> mt7530_mac_config()
      -> mt7530_setup_port5()
-> mt753x_pad_setup()
   -> mt753x_info :: pad_setup() -> mt7530_pad_clk_setup()

This is after:

dsa_switch_ops :: phylink_mac_config() -> mt753x_phylink_mac_config()
-> mt753x_mac_config()
   -> mt753x_info :: mac_port_config() -> mt7530_mac_config()
      -> mt7530_setup_port5()
      -> mt7530_setup_port6()

Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Link: https://lore.kernel.org/r/20240206-for-netnext-mt7530-improvements-2-v5-4-d7d92a185cb1@arinc9.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agonet: dsa: mt7530: simplify mt7530_pad_clk_setup()
Arınç ÜNAL [Mon, 5 Feb 2024 22:08:04 +0000 (01:08 +0300)]
net: dsa: mt7530: simplify mt7530_pad_clk_setup()

This code is from before this driver was converted to phylink API. Phylink
deals with the unsupported interface cases before mt7530_pad_clk_setup() is
run. Therefore, the default case would never run. However, it must be
defined nonetheless to handle all the remaining enumeration values, the
phy-modes.

Switch to an if statement for RGMII that returns early, which simplifies
the code and saves an indent.

Set P6_INTF_MODE, which is the three least significant bits of the
MT7530_P6ECR register, to 0 for RGMII even though it will already be 0
after reset. This is to keep supporting dynamic reconfiguration of the port
in the case the interface changes from TRGMII to RGMII.

Disable the TRGMII clocks for all cases. They will be enabled if TRGMII is
being used.

Read XTAL after checking for RGMII as it's only needed for the TRGMII
interface mode.

Reviewed-by: Daniel Golle <daniel@makrotopia.org>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/20240206-for-netnext-mt7530-improvements-2-v5-3-d7d92a185cb1@arinc9.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agonet: dsa: mt7530: move XTAL check to mt7530_setup()
Arınç ÜNAL [Mon, 5 Feb 2024 22:08:03 +0000 (01:08 +0300)]
net: dsa: mt7530: move XTAL check to mt7530_setup()

The crystal frequency concerns the switch core. The frequency should be
checked when the switch is being set up so the driver can reject the
unsupported hardware earlier and without requiring port 6 to be used.

Move it to mt7530_setup(). Drop the unnecessary function printing.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Link: https://lore.kernel.org/r/20240206-for-netnext-mt7530-improvements-2-v5-2-d7d92a185cb1@arinc9.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agonet: dsa: mt7530: empty default case on mt7530_setup_port5()
Arınç ÜNAL [Mon, 5 Feb 2024 22:08:02 +0000 (01:08 +0300)]
net: dsa: mt7530: empty default case on mt7530_setup_port5()

There're two code paths for setting up port 5:

mt7530_setup()
-> mt7530_setup_port5()

mt753x_phylink_mac_config()
-> mt753x_mac_config()
   -> mt7530_mac_config()
      -> mt7530_setup_port5()

On the first code path, priv->p5_intf_sel is either set to
P5_INTF_SEL_PHY_P0 or P5_INTF_SEL_PHY_P4 when mt7530_setup_port5() is run.

On the second code path, priv->p5_intf_sel is set to P5_INTF_SEL_GMAC5 when
mt7530_setup_port5() is run.

Empty the default case which will never run but is needed nonetheless to
handle all the remaining enumeration values.

Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Link: https://lore.kernel.org/r/20240206-for-netnext-mt7530-improvements-2-v5-1-d7d92a185cb1@arinc9.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agor8169: remove setting LED default trigger, this is done by LED core now
Heiner Kallweit [Mon, 5 Feb 2024 21:54:08 +0000 (22:54 +0100)]
r8169: remove setting LED default trigger, this is done by LED core now

After 1c75c424bd43 ("leds: class: If no default trigger is given, make
hw_control trigger the default trigger") this line isn't needed any
longer.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/3a9cd1a1-40ad-487d-8b1e-6bf255419232@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agoMerge tag 'mlx5-updates-2024-02-01' of git://git.kernel.org/pub/scm/linux/kernel...
Jakub Kicinski [Thu, 8 Feb 2024 02:34:34 +0000 (18:34 -0800)]
Merge tag 'mlx5-updates-2024-02-01' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2024-02-01

1) IPSec global stats for xfrm and mlx5
2) XSK memory improvements for non-linear SKBs
3) Software steering debug dump to use seq_file ops
4) Various code clean-ups

* tag 'mlx5-updates-2024-02-01' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5e: XDP, Exclude headroom and tailroom from memory calculations
  net/mlx5e: XSK, Exclude tailroom from non-linear SKBs memory calculations
  net/mlx5: DR, Change SWS usage to debug fs seq_file interface
  net/mlx5: Change missing SyncE capability print to debug
  net/mlx5: Remove initial segmentation duplicate definitions
  net/mlx5: Return specific error code for timeout on wait_fw_init
  net/mlx5: SF, Stop waiting for FW as teardown was called
  net/mlx5: remove fw reporter dump option for non PF
  net/mlx5: remove fw_fatal reporter dump option for non PF
  net/mlx5: Rename mlx5_sf_dev_remove
  Documentation: Fix counter name of mlx5 vnic reporter
  net/mlx5e: Delete obsolete IPsec code
  net/mlx5e: Connect mlx5 IPsec statistics with XFRM core
  xfrm: get global statistics from the offloaded device
  xfrm: generalize xdo_dev_state_update_curlft to allow statistics update
====================

Link: https://lore.kernel.org/r/20240206005527.1353368-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agoMerge branch 'selftests-bonding-use-slowwait-when-waiting'
Jakub Kicinski [Thu, 8 Feb 2024 02:26:22 +0000 (18:26 -0800)]
Merge branch 'selftests-bonding-use-slowwait-when-waiting'

Hangbin Liu says:

====================
selftests: bonding: use slowwait when waiting

Many waits in the bonding tests use sleep. Let's replace them with
slowwait (added in the first patch). This can save a lot of test time, e.g.

bond-break-lacpdu-tx.sh
  before: 0m16.346s
  after: 0m2.824s

bond_options.sh
  before: 9m25.299s
  after: 6m14.439s

bond-lladdr-target.sh
  before: 0m7.090s
  after: 0m6.148s

In total, we could save about 180 seconds.
====================

Link: https://lore.kernel.org/r/20240205130048.282087-1-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agoselftests: bonding: use slowwait instead of hard code sleep
Hangbin Liu [Mon, 5 Feb 2024 13:00:48 +0000 (21:00 +0800)]
selftests: bonding: use slowwait instead of hard code sleep

Use slowwait instead of hard-coded sleep for bonding tests.

In setup_prepare(), client_create() is called after server_create(),
so there is no need to sleep in server_create(). Remove the sleep.

For lag_lib.sh, removing the bonding module may affect other running
bonding tests, and some test environments build bonding in, so it cannot
be removed. The bonding link should be removed by lag_reset_network() or
by deleting the netns.

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/20240205130048.282087-5-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agoselftests: bonding: reduce garp_test/arp_validate test time
Hangbin Liu [Mon, 5 Feb 2024 13:00:47 +0000 (21:00 +0800)]
selftests: bonding: reduce garp_test/arp_validate test time

The purpose of grat_arp is to test commit 9949e2efb54e ("bonding: fix
send_peer_notif overflow"). As send_peer_notif was defined as u8,
to overflow it, we need

send_peer_notif = num_peer_notif * peer_notif_delay = num_grat_arp * peer_notify_delay / miimon > 255
  (kernel)           (kernel parameter)                   (user parameter)

e.g. 30 (num_grat_arp) * 1000 (peer_notify_delay) / 100 (miimon) > 255.

This needs 30s to finish sending the garp messages. The only way to save
testing time is to reduce the miimon value, for example
30 (num_grat_arp) * 100 (peer_notify_delay) / 10 (miimon) > 255.

To save more time, the 50 num_grat_arp testing could be removed.
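
The arithmetic above can be worked through in a few lines of C. This is
only a sketch of the formula as written in this message, with hypothetical
model_* function names; it is not the bonding driver's code, and it assumes
a nonzero miimon.

```c
#include <stdint.h>

/* Full-width value of num_grat_arp * peer_notify_delay / miimon. */
static unsigned int model_send_peer_notif(unsigned int num_grat_arp,
					  unsigned int peer_notify_delay,
					  unsigned int miimon)
{
	return num_grat_arp * peer_notify_delay / miimon;
}

/* The same value stored in a u8: anything above 255 wraps modulo 256. */
static uint8_t model_send_peer_notif_u8(unsigned int num_grat_arp,
					unsigned int peer_notify_delay,
					unsigned int miimon)
{
	return (uint8_t)model_send_peer_notif(num_grat_arp,
					      peer_notify_delay, miimon);
}
```

Both parameter sets from the text give 300 (30 * 1000 / 100 and
30 * 100 / 10), which a u8 would wrap to 300 - 256 = 44, so the faster
settings still exercise the overflow.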

The arp_validate_test also needs to check the mii_status, and it sleeps
too long. Use slowwait to save some time.

For other connection checks, make sure the active slave has changed first.

Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/20240205130048.282087-4-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agoselftests: bonding: use tc filter to check if LACP was sent
Hangbin Liu [Mon, 5 Feb 2024 13:00:46 +0000 (21:00 +0800)]
selftests: bonding: use tc filter to check if LACP was sent

Use a tc filter to check if LACP was sent, which is accurate and saves
more time.

There is no need to remove the bonding module, as some test environments
may have bonding built in, and the bond link has already been deleted.

Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/20240205130048.282087-3-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agoselftests/net/forwarding: add slowwait functions
Hangbin Liu [Mon, 5 Feb 2024 13:00:45 +0000 (21:00 +0800)]
selftests/net/forwarding: add slowwait functions

Add slowwait functions to wait for operations that may need a long time
to finish. busywait executes the cmd too fast, which wastes CPU in this
scenario. Also, if shell debugging is enabled with `set -x`, busywait
outputs too many logs.
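
The idea of polling with a pause between attempts can be sketched in C as
below. The selftest helper itself is a shell function, so this is only an
analogy under assumptions: the model_* names, the millisecond parameters,
and the example predicates are hypothetical, and the code relies on POSIX
clock_gettime() and nanosleep().

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

typedef bool (*model_cond_t)(void *arg);

static long model_elapsed_ms(const struct timespec *start)
{
	struct timespec now;

	clock_gettime(CLOCK_MONOTONIC, &now);
	return (now.tv_sec - start->tv_sec) * 1000L +
	       (now.tv_nsec - start->tv_nsec) / 1000000L;
}

/*
 * Poll cond() until it returns true or timeout_ms has passed, sleeping
 * interval_ms between attempts instead of spinning.
 * Returns 0 on success and -1 on timeout.
 */
static int model_slowwait(long timeout_ms, long interval_ms,
			  model_cond_t cond, void *arg)
{
	struct timespec start;
	struct timespec pause = {
		.tv_sec = interval_ms / 1000,
		.tv_nsec = (interval_ms % 1000) * 1000000L,
	};

	clock_gettime(CLOCK_MONOTONIC, &start);
	for (;;) {
		if (cond(arg))
			return 0;
		if (model_elapsed_ms(&start) >= timeout_ms)
			return -1;
		nanosleep(&pause, NULL);
	}
}

/* Example predicate: becomes true on the third call. */
static bool model_true_on_third_call(void *arg)
{
	int *calls = arg;

	return ++(*calls) >= 3;
}

/* Example predicate: never becomes true, so the wait times out. */
static bool model_never(void *arg)
{
	(void)arg;
	return false;
}
```

A larger interval means fewer evaluations of the condition, which is the
trade-off the message describes: less CPU use and less `set -x` output at
the cost of slightly later detection.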

Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/20240205130048.282087-2-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agonet/smc: change the term virtual ISM to Emulated-ISM
Wen Gu [Mon, 5 Feb 2024 03:33:17 +0000 (11:33 +0800)]
net/smc: change the term virtual ISM to Emulated-ISM

According to the latest release of SMCv2.1[1], the term 'virtual ISM' has
been changed to 'Emulated-ISM' to avoid the ambiguity of the word
'virtual' in different contexts. So the names and comments in the code
need to be modified accordingly.

[1] https://www.ibm.com/support/pages/node/7112343

Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com>
Link: https://lore.kernel.org/r/20240205033317.127269-1-guwen@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agoMerge branch 'net-phy-realtek-complete-5gbps-support-and-replace-private-constants'
Jakub Kicinski [Thu, 8 Feb 2024 02:19:55 +0000 (18:19 -0800)]
Merge branch 'net-phy-realtek-complete-5gbps-support-and-replace-private-constants'

Heiner Kallweit says:

====================
net: phy: realtek: complete 5Gbps support and replace private constants

Realtek maps standard C45 registers to vendor-specific registers which
can be accessed via C22 w/o MMD. For an unknown reason C22 MMD access
to C45 registers isn't supported for integrated PHYs.
However the vendor-specific registers preserve the format of the C45
registers, so we can use standard constants. First two patches are
cherry-picked from a series posted by Marek some time ago.

RTL8126 supports 5Gbps, therefore add the missing 5Gbps support to
rtl822x_config_aneg().
====================

Link: https://lore.kernel.org/r/31a83fd9-90ce-402a-84c7-d5c20540b730@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agonet: phy: realtek: add 5Gbps support to rtl822x_config_aneg()
Heiner Kallweit [Sun, 4 Feb 2024 14:18:50 +0000 (15:18 +0100)]
net: phy: realtek: add 5Gbps support to rtl822x_config_aneg()

RTL8126, as an evolution of RTL8125, supports 5Gbps. rtl822x_config_aneg()
is used by the PHY driver for the integrated PHY, therefore add 5Gbps
support to it.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/5644ab50-e3e9-477c-96db-05cd5bdc2563@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
20 months agonet: phy: realtek: use generic MDIO constants
Marek Behún [Sun, 4 Feb 2024 14:17:53 +0000 (15:17 +0100)]
net: phy: realtek: use generic MDIO constants

Drop the ad-hoc MDIO constants used in the driver and use generic
constants instead.

Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/732a70d6-4191-4aae-8862-3716b062aa9e@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>