Wen Gu [Tue, 19 Dec 2023 14:26:10 +0000 (22:26 +0800)]
net/smc: support SMCv2.x supplemental features negotiation
This patch adds SMCv2.x supplemental features negotiation. Supported
SMCv2.x supplemental features are represented by feature_mask in FCE
field. The negotiation process is as follows.
    Server                                        Client
                Proposal(features(c-mask bits))
      <-----------------------------------------
                Accept(features(s-mask bits))
      ----------------------------------------->
                Confirm(features(s&c-mask bits))
      <-----------------------------------------
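The exchange above amounts to intersecting two bitmasks. A minimal sketch in C, assuming a 16-bit mask and made-up feature bits (FEAT_X, FEAT_Y, FEAT_Z and both function names are hypothetical, not the kernel's identifiers):

```c
#include <stdint.h>

/* Hypothetical feature bits, for illustration only. */
#define FEAT_X (1u << 0)
#define FEAT_Y (1u << 1)
#define FEAT_Z (1u << 2)

/* Accept: the server answers with the bits it supports (s-mask). */
uint16_t accept_features(uint16_t server_supported)
{
	return server_supported;
}

/* Confirm: the client keeps only bits present in both masks (s&c-mask). */
uint16_t confirm_features(uint16_t client_mask, uint16_t server_mask)
{
	return client_mask & server_mask;
}
```

With a client proposing FEAT_X|FEAT_Y and a server accepting with FEAT_Y|FEAT_Z, only FEAT_Y would be confirmed.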
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Reviewed-and-tested-by: Wenjia Zhang <wenjia@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wen Gu [Tue, 19 Dec 2023 14:26:09 +0000 (22:26 +0800)]
net/smc: unify the structs of accept or confirm message for v1 and v2
The structs of CLC accept and confirm messages for SMCv1 and SMCv2 are
separately defined and often cast to each other in the code, which may
increase the risk of errors caused by future divergence of them. So
unify them into one struct for better maintainability.
Suggested-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wen Gu [Tue, 19 Dec 2023 14:26:08 +0000 (22:26 +0800)]
net/smc: introduce sub-functions for smc_clc_send_confirm_accept()
There is a large if-else block in smc_clc_send_confirm_accept() and it
is better to split it into two sub-functions.
Suggested-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wen Gu [Tue, 19 Dec 2023 14:26:07 +0000 (22:26 +0800)]
net/smc: rename some 'fce' to 'fce_v2x' for clarity
Rename some functions and variables that have 'fce' in their name but
are used for SMCv2.1 to 'fce_v2x' for clarity.
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marek Behún [Tue, 19 Dec 2023 16:24:15 +0000 (17:24 +0100)]
net: sfp: fix PHY discovery for FS SFP-10G-T module
Commit 2f3ce7a56c6e ("net: sfp: rework the RollBall PHY waiting code")
changed the long wait before accessing RollBall / FS modules into
probing for PHY every 1 second, and trying 25 times.
Wei Lei reports that this does not work correctly on FS modules: when
initializing, they may report values different from 0xffff in PHY ID
registers for some MMDs, causing get_phy_c45_ids() to find some bogus
MMD.
Fix this by adding the module_t_wait member back, and setting it to 4
seconds for FS modules.
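The idea of a per-module settle time can be sketched as follows. This is an illustrative model only: the struct, its field names and both helpers are hypothetical, and it assumes the 1-second, 25-try probing from the earlier commit stays in place after the initial wait.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-module quirk data; field names are illustrative. */
struct module_quirk {
	uint32_t t_wait_ms;     /* settle time before the first PHY probe */
	unsigned int retries;   /* number of probe attempts after that */
	uint32_t retry_gap_ms;  /* spacing between attempts */
};

/* Assumed values for FS modules: wait 4 s, then probe every 1 s, 25 times. */
const struct module_quirk fs_quirk = { 4000, 25, 1000 };

/* True once the module has had time to settle, so its PHY ID
 * registers are no longer expected to hold bogus values. */
bool phy_probe_allowed(const struct module_quirk *q, uint32_t ms_since_init)
{
	return ms_since_init >= q->t_wait_ms;
}

/* Time of the n-th probe attempt (0-based), or UINT32_MAX if out of retries. */
uint32_t probe_time_ms(const struct module_quirk *q, unsigned int n)
{
	if (n >= q->retries)
		return UINT32_MAX;
	return q->t_wait_ms + n * q->retry_gap_ms;
}
```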
Fixes: 2f3ce7a56c6e ("net: sfp: rework the RollBall PHY waiting code")
Reported-by: Wei Lei <quic_leiwei@quicinc.com>
Signed-off-by: Marek Behún <kabel@kernel.org>
Tested-by: Lei Wei <quic_leiwei@quicinc.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 23 Dec 2023 01:18:59 +0000 (01:18 +0000)]
Merge branch 'dpaa2-switch-small-improvements'
Ioana Ciornei says:
====================
dpaa2-switch: small improvements
This patch set consists of a series of small improvements on the
dpaa2-switch driver ranging from adding some more verbosity when
encountering errors to reorganizing code to be easily extensible.
Changes in v3:
- 4/8: removed the fixes tag and moved it to the commit message
- 5/8: specified that there is no user-visible effect
- 6/8: removed the initialization of the err variable
Changes in v2:
- No changes to the actual diff, only rephrased some commit messages and
added more information.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ioana Ciornei [Tue, 19 Dec 2023 11:59:33 +0000 (13:59 +0200)]
dpaa2-switch: cleanup the egress flood of an unused FDB
In case a DPAA2 switch interface joins a bridge, the FDB used on the
port will be changed to the one associated with the bridge. What this
means exactly is that any VLAN installed on the port will need to be
removed and then installed back so that it points to the new FDB.
Once this is done, the previous FDB will become unused (no VLAN to
point to it). Even though no traffic will reach this FDB, it's best to
just clean up the state of the FDB by zeroing its egress flood domain.
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ioana Ciornei [Tue, 19 Dec 2023 11:59:32 +0000 (13:59 +0200)]
dpaa2-switch: move a check to the prechangeupper stage
Two different DPAA2 switch ports from two different DPSW instances
cannot be under the same bridge. Instead of checking for this
unsupported configuration in the CHANGEUPPER event, check it as early as
possible in the PRECHANGEUPPER one.
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ioana Ciornei [Tue, 19 Dec 2023 11:59:31 +0000 (13:59 +0200)]
dpaa2-switch: reorganize the [pre]changeupper events
Create separate functions, dpaa2_switch_port_prechangeupper and
dpaa2_switch_port_changeupper, to be called directly when a DPSW port
changes its upper device.
This way we are not open-coding everything in the main event callback
and we can easily extend it, for example, with bond offload.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ioana Ciornei [Tue, 19 Dec 2023 11:59:30 +0000 (13:59 +0200)]
dpaa2-switch: do not clear any interrupts automatically
The DPSW object has multiple event sources multiplexed over the same
IRQ. The driver has the capability to configure only some of these
events to trigger the IRQ.
The dpsw_get_irq_status() can clear events automatically based on the
value stored in the 'status' variable passed to it. We don't want that
to happen because we could get into a situation when we are clearing
more events than we actually handled.
Just resort to manually clearing the events that we handled. Also, since
status is not used on the out path we remove its initialization to zero.
This change does not have a user-visible effect because the dpaa2-switch
driver enables and handles all the DPSW events which exist at the
moment.
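The "clear only what was handled" approach can be sketched like this. The event bit positions and the function name are hypothetical; the real DPSW values and firmware calls are not shown in this log.

```c
#include <stdint.h>

/* Hypothetical event bits, for illustration only. */
#define EV_LINK_CHANGED     (1u << 0)
#define EV_ENDPOINT_CHANGED (1u << 1)

/*
 * Walk the pending events, act on the ones we know, and return only those
 * bits so that the caller clears exactly what was handled.
 */
uint32_t handle_events(uint32_t pending)
{
	uint32_t handled = 0;

	if (pending & EV_LINK_CHANGED) {
		/* ... react to the link change ... */
		handled |= EV_LINK_CHANGED;
	}
	if (pending & EV_ENDPOINT_CHANGED) {
		/* ... react to the endpoint change ... */
		handled |= EV_ENDPOINT_CHANGED;
	}
	return handled;
}
```

An event bit the handler does not know about stays pending instead of being silently dropped.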
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ioana Ciornei [Tue, 19 Dec 2023 11:59:29 +0000 (13:59 +0200)]
dpaa2-switch: add ENDPOINT_CHANGED to the irq_mask
Commit 84cba72956fd ("dpaa2-switch: integrate the MAC endpoint support")
added support for MAC endpoints in the dpaa2-switch driver but omitted
to add the ENDPOINT_CHANGED irq to the list of interrupt sources. Fix
this by adding the event to the mask passed to the dpsw_set_irq_mask()
firmware API, so that it can also raise an interrupt.
There is no user visible impact even without this patch since whenever a
switch interface is connected/disconnected from an endpoint both events
are set (LINK_CHANGED and ENDPOINT_CHANGED) and, luckily, the
LINK_CHANGED event could actually raise the interrupt and thus get the
MAC/PHY SW configuration started.
Even with this, it's better to just not rely on undocumented firmware
behavior which can change.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ioana Ciornei [Tue, 19 Dec 2023 11:59:28 +0000 (13:59 +0200)]
dpaa2-switch: print an error when the vlan is already configured
Print a netdev error when we hit a case in which a specific VLAN is
already configured on the port. While at it, change the already existing
netdev_warn into an _err for consistency purposes.
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ioana Ciornei [Tue, 19 Dec 2023 11:59:27 +0000 (13:59 +0200)]
dpaa2-switch: declare the netdev as IFF_LIVE_ADDR_CHANGE capable
There is no restriction around the change of the MAC address on the
switch ports, thus declare the interface netdevs IFF_LIVE_ADDR_CHANGE
capable.
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ioana Ciornei [Tue, 19 Dec 2023 11:59:26 +0000 (13:59 +0200)]
dpaa2-switch: set interface MAC address only on endpoint change
There is no point in updating the MAC address of a switch interface each
time the link state changes; this only needs to happen in case the
endpoint changes (the switch interface is [dis]connected from/to a MAC).
Just move the call to dpaa2_switch_port_set_mac_addr() under
DPSW_IRQ_EVENT_ENDPOINT_CHANGED.
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 23 Dec 2023 01:01:20 +0000 (01:01 +0000)]
Merge branch 'am65-cpsw-preemption-coalescing'
Roger Quadros says:
====================
net: ethernet: am65-cpsw: Add mqprio, frame preemption & coalescing
This series adds mqprio qdisc offload in channel mode,
Frame Preemption MAC merge support and RX/TX coalescing
for AM65 CPSW driver.
In v11 the following changes were made:
- Fix patch "net: ethernet: ti: am65-cpsw: add mqprio qdisc offload in channel mode"
by including units.h
Changelog information in each patch file.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Grygorii Strashko [Tue, 19 Dec 2023 10:58:05 +0000 (12:58 +0200)]
net: ethernet: ti: am65-cpsw: add sw tx/rx irq coalescing based on hrtimers
Add SW IRQ coalescing based on hrtimers for TX and RX data path which
can be enabled by ethtool commands:
- RX coalescing
  ethtool -C eth1 rx-usecs 50
- TX coalescing can be enabled per TX queue
  - by default, enables coalescing for TX0
    ethtool -C eth1 tx-usecs 50
  - configure TX0
    ethtool -Q eth0 queue_mask 1 --coalesce tx-usecs 100
  - configure TX1
    ethtool -Q eth0 queue_mask 2 --coalesce tx-usecs 100
  - configure TX0 and TX1
    ethtool -Q eth0 queue_mask 3 --coalesce tx-usecs 100 --coalesce tx-usecs 100
  show configuration for TX0 and TX1:
    ethtool -Q eth0 queue_mask 3 --show-coalesce
Compared to gro_flush_timeout and napi_defer_hard_irqs, this patch
allows IRQ coalescing to be enabled separately for the RX path.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Roger Quadros [Tue, 19 Dec 2023 10:58:04 +0000 (12:58 +0200)]
net: ethernet: ti: am65-cpsw-qos: Add Frame Preemption MAC Merge support
Add driver support for viewing / changing the MAC Merge sublayer
parameters and seeing the verification state machine's current state
via ethtool.
As hardware does not support interrupt notification for verification
events we resort to polling on link up. On link up we try a couple of
times for verification success and if unsuccessful then give up.
The Frame Preemption feature is described in the Technical Reference
Manual [1] in section:
12.3.1.4.6.7 Intersperced Express Traffic (IET – P802.3br/D2.0)
Due to Silicon Errata i2208 [2] we limit the minimum IET fragment size to
124 (excluding 4 bytes mCRC).
[1] AM62x TRM - https://www.ti.com/lit/ug/spruiv7a/spruiv7a.pdf
[2] AM62x Silicon Errata - https://www.ti.com/lit/er/sprz487c/sprz487c.pdf
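For context, IEEE 802.3br expresses the minimum fragment size through addFragSize as 64 * (1 + addFragSize) octets including the mCRC, so a minimum of 124 corresponds to addFragSize = 1. A small sketch of that arithmetic (the macro and function names are hypothetical, not the driver's):

```c
#include <stdint.h>

#define FRAG_UNIT 64u  /* octets per addFragSize step, mCRC included */
#define MCRC_LEN   4u  /* octets of mCRC, excluded from the reported size */

/* Minimum fragment size (without mCRC) for a given addFragSize value. */
uint32_t add_frag_to_min_size(uint32_t add_frag_size)
{
	return FRAG_UNIT * (1u + add_frag_size) - MCRC_LEN;
}
```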
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Grygorii Strashko [Tue, 19 Dec 2023 10:58:03 +0000 (12:58 +0200)]
net: ethernet: ti: am65-cpsw: add mqprio qdisc offload in channel mode
This patch adds MQPRIO Qdisc offload in full 'channel' mode which allows
not only setting up pri:tc mapping, but also configuring TX shapers
(rate-limiting) on external port FIFOs.
The MQPRIO Qdisc offload is expected to work with or without VLAN/priority
tagged packets.
The CPSW external Port FIFO has 8 Priority queues. The rate-limit can be
set for each of these priority queues. Which Priority queue a packet is
assigned to depends on PN_REG_TX_PRI_MAP register which maps header
priority to switch priority.
The header priority of a packet is assigned via the RX_PRI_MAP_REG which
maps packet priority to header priority.
The packet priority is either the VLAN priority (for VLAN tagged packets)
or the thread/channel offset.
For simplicity, we assign the same priority queue to all queues of a
Traffic Class so it can be rate-limited correctly.
Configuration example:
ethtool -L eth1 tx 5
ethtool --set-priv-flags eth1 p0-rx-ptype-rrobin off
tc qdisc add dev eth1 parent root handle 100: mqprio num_tc 3 \
map 0 0 1 2 0 0 0 0 0 0 0 0 0 0 0 0 \
queues 1@0 1@1 1@2 hw 1 mode channel \
shaper bw_rlimit min_rate 0 100mbit 200mbit max_rate 0 101mbit 202mbit
tc qdisc replace dev eth2 handle 100: parent root mqprio num_tc 1 \
map 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 queues 1@0 hw 1
ip link add link eth1 name eth1.100 type vlan id 100
ip link set eth1.100 type vlan egress 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7
In the above example two ports share the same TX CPPI queue 0 for low
priority traffic. 3 traffic classes are defined for eth1 and mapped to:
TC0 - low priority, TX CPPI queue 0 -> ext Port 1 fifo0, no rate limit
TC1 - prio 2, TX CPPI queue 1 -> ext Port 1 fifo1, CIR=100Mbit/s, EIR=1Mbit/s
TC2 - prio 3, TX CPPI queue 2 -> ext Port 1 fifo2, CIR=200Mbit/s, EIR=2Mbit/s
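The CIR/EIR figures listed above are consistent with CIR = min_rate and EIR = max_rate - min_rate. The following sketch only reproduces that arithmetic; it is inferred from the numbers in the example, not taken from the driver, and the struct and function names are hypothetical.

```c
/* Hypothetical holder for the two shaper rates, in Mbit/s. */
struct tc_shaper {
	unsigned int cir_mbit;  /* committed information rate */
	unsigned int eir_mbit;  /* excess information rate */
};

/* Derive CIR/EIR from the mqprio min_rate/max_rate of one traffic class. */
struct tc_shaper shaper_from_rates(unsigned int min_rate_mbit,
				   unsigned int max_rate_mbit)
{
	struct tc_shaper s;

	s.cir_mbit = min_rate_mbit;
	/* Guard against max < min so the subtraction cannot wrap. */
	s.eir_mbit = max_rate_mbit > min_rate_mbit ?
		     max_rate_mbit - min_rate_mbit : 0;
	return s;
}
```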
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Roger Quadros [Tue, 19 Dec 2023 10:58:02 +0000 (12:58 +0200)]
net: ethernet: am65-cpsw: Move register definitions to header file
Move register definitions to header file. No functional change.
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Roger Quadros [Tue, 19 Dec 2023 10:58:01 +0000 (12:58 +0200)]
net: ethernet: ti: am65-cpsw: Move code to avoid forward declaration
Move this code around to avoid forward declaration.
No functional change.
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Roger Quadros [Tue, 19 Dec 2023 10:58:00 +0000 (12:58 +0200)]
net: ethernet: am65-cpsw: cleanup TAPRIO handling
Handle offloading commands using switch-case in
am65_cpsw_setup_taprio().
Move checks to am65_cpsw_taprio_replace().
Use NL_SET_ERR_MSG_MOD for error messages.
Change error message from "Failed to set cycle time extension"
to "cycle time extension not supported"
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Roger Quadros [Tue, 19 Dec 2023 10:57:59 +0000 (12:57 +0200)]
net: ethernet: am65-cpsw: Rename TI_AM65_CPSW_TAS to TI_AM65_CPSW_QOS
We will use this Kconfig option to not only enable TAS/EST offload
but also other QoS features like Multiqueue priority descriptors
and MAC-Merge/Frame Preemption. TI_AM65_CPSW_QOS seems a more
appropriate Kconfig option name than TI_AM65_CPSW_TAS.
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Roger Quadros [Tue, 19 Dec 2023 10:57:58 +0000 (12:57 +0200)]
net: ethernet: am65-cpsw: Build am65-cpsw-qos only if required
Build am65-cpsw-qos only if CONFIG_TI_AM65_CPSW_TAS is enabled.
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Tue, 19 Dec 2023 10:57:57 +0000 (12:57 +0200)]
selftests: forwarding: ethtool_mm: fall back to aggregate if device does not report pMAC stats
Some devices do not support individual 'pmac' and 'emac' stats.
For such devices, resort to 'aggregate' stats.
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Tue, 19 Dec 2023 10:57:56 +0000 (12:57 +0200)]
selftests: forwarding: ethtool_mm: support devices with higher rx-min-frag-size
Some devices have errata due to which they cannot report ETH_ZLEN (60)
in the rx-min-frag-size. This was foreseen of course, and lldpad has
logic that when we request it to advertise addFragSize 0, it will round
it up to the lowest value that is _actually_ supported by the hardware.
The problem is that the selftest expects lldpad to report back to us the
same value as we requested.
Make the selftest smarter by figuring out on its own what is a
reasonable value to expect.
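One way to picture "a reasonable value to expect" is to round the device's rx-min-frag-size up to the next addFragSize step. The selftest itself is a shell script and its exact logic is not reproduced here; this is a sketch in C with a hypothetical function name, and it does not clamp the result to the valid addFragSize range.

```c
#include <stdint.h>

/* Smallest addFragSize whose minimum fragment size covers rx_min_frag_size. */
uint32_t expected_add_frag_size(uint32_t rx_min_frag_size)
{
	uint32_t with_mcrc = rx_min_frag_size + 4u;  /* add back the mCRC */
	uint32_t units = (with_mcrc + 63u) / 64u;    /* round up to 64-octet units */

	return units > 0 ? units - 1u : 0;
}
```

A device reporting 60 (ETH_ZLEN) maps to 0, while one that can only report 124 maps to 1.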
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 23 Dec 2023 00:26:32 +0000 (00:26 +0000)]
Merge branch 'net-selftests-unique-namespace-last-part'
Hangbin Liu says:
====================
Convert net selftests to run in unique namespace (last part)
Here is the last part of converting net selftests to run in unique namespaces.
This part converts all remaining tests. After the conversion, we can run the net
selftests in parallel, e.g.
# ./run_kselftest.sh -n -t net:reuseport_bpf
TAP version 13
1..1
# selftests: net: reuseport_bpf
ok 1 selftests: net: reuseport_bpf
mod 10...
# Socket 0: 0
# Socket 1: 1
...
# Socket 4: 19
# Testing filter add without bind...
# SUCCESS
# ./run_kselftest.sh -p -n -t net:cmsg_so_mark.sh -t net:cmsg_time.sh -t net:cmsg_ipv6.sh
TAP version 13
1..3
# selftests: net: cmsg_so_mark.sh
ok 1 selftests: net: cmsg_so_mark.sh
# selftests: net: cmsg_time.sh
ok 2 selftests: net: cmsg_time.sh
# selftests: net: cmsg_ipv6.sh
ok 3 selftests: net: cmsg_ipv6.sh
# ./run_kselftest.sh -p -n -c net
TAP version 13
1..95
# selftests: net: reuseport_bpf_numa
ok 3 selftests: net: reuseport_bpf_numa
# selftests: net: reuseport_bpf_cpu
ok 2 selftests: net: reuseport_bpf_cpu
# selftests: net: sk_bind_sendto_listen
ok 9 selftests: net: sk_bind_sendto_listen
# selftests: net: reuseaddr_conflict
ok 5 selftests: net: reuseaddr_conflict
...
Here is the part 1 link:
https://lore.kernel.org/netdev/20231202020110.362433-1-liuhangbin@gmail.com
part 2 link:
https://lore.kernel.org/netdev/20231206070801.1691247-1-liuhangbin@gmail.com
part 3 link:
https://lore.kernel.org/netdev/20231213060856.4030084-1-liuhangbin@gmail.com
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Hangbin Liu [Tue, 19 Dec 2023 09:48:56 +0000 (17:48 +0800)]
kselftest/runner.sh: add netns support
Add a variable RUN_IN_NETNS for when the user wants to run all the selected
tests in namespaces in parallel. With this, we can save a lot of testing time.
Note that some tests may not be suitable to run in a namespace, e.g.
net/drop_monitor_tests.sh, as the dwdump needs to be run in init ns.
I also added another parameter -p to make all the logs reported separately
instead of mixing them in the stdout or output.log.
Nit: the NUM in run_one is not used, rename it to test_num.
Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hangbin Liu [Tue, 19 Dec 2023 09:48:55 +0000 (17:48 +0800)]
selftests/net: convert pmtu.sh to run it in unique namespace
The pmtu test uses /bin/sh, so we need to source ./lib.sh instead of lib.sh.
Here is the test result after conversion.
# ./pmtu.sh
TEST: ipv4: PMTU exceptions [ OK ]
TEST: ipv4: PMTU exceptions - nexthop objects [ OK ]
TEST: ipv6: PMTU exceptions [ OK ]
TEST: ipv6: PMTU exceptions - nexthop objects [ OK ]
...
TEST: ipv4: list and flush cached exceptions - nexthop objects [ OK ]
TEST: ipv6: list and flush cached exceptions [ OK ]
TEST: ipv6: list and flush cached exceptions - nexthop objects [ OK ]
TEST: ipv4: PMTU exception w/route replace [ OK ]
TEST: ipv4: PMTU exception w/route replace - nexthop objects [ OK ]
TEST: ipv6: PMTU exception w/route replace [ OK ]
TEST: ipv6: PMTU exception w/route replace - nexthop objects [ OK ]
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hangbin Liu [Tue, 19 Dec 2023 09:48:54 +0000 (17:48 +0800)]
selftests/net: use unique netns name for setup_loopback.sh setup_veth.sh
setup_loopback.sh and setup_veth.sh use their own way to create namespaces,
so just redefine server_ns/client_ns with unique names.
At the same time, update the namespace names in gro.sh and toeplitz.sh.
As I don't have an environment to run toeplitz.sh, only the gro test result is shown here.
# ./gro.sh
running test ipv4 data
Expected {200 }, Total 1 packets
Received {200 }, Total 1 packets.
...
Gro::large test passed.
All Tests Succeeded!
Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hangbin Liu [Tue, 19 Dec 2023 09:48:53 +0000 (17:48 +0800)]
selftests/net: convert xfrm_policy.sh to run it in unique namespace
Here is the test result after conversion.
# ./xfrm_policy.sh
PASS: policy before exception matches
PASS: ping to .254 bypassed ipsec tunnel (exceptions)
PASS: direct policy matches (exceptions)
PASS: policy matches (exceptions)
PASS: ping to .254 bypassed ipsec tunnel (exceptions and block policies)
PASS: direct policy matches (exceptions and block policies)
PASS: policy matches (exceptions and block policies)
PASS: ping to .254 bypassed ipsec tunnel (exceptions and block policies after hresh changes)
PASS: direct policy matches (exceptions and block policies after hresh changes)
PASS: policy matches (exceptions and block policies after hresh changes)
PASS: ping to .254 bypassed ipsec tunnel (exceptions and block policies after hthresh change in ns3)
PASS: direct policy matches (exceptions and block policies after hthresh change in ns3)
PASS: policy matches (exceptions and block policies after hthresh change in ns3)
PASS: ping to .254 bypassed ipsec tunnel (exceptions and block policies after htresh change to normal)
PASS: direct policy matches (exceptions and block policies after htresh change to normal)
PASS: policy matches (exceptions and block policies after htresh change to normal)
PASS: policies with repeated htresh change
Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hangbin Liu [Tue, 19 Dec 2023 09:48:52 +0000 (17:48 +0800)]
selftests/net: convert stress_reuseport_listen.sh to run it in unique namespace
Here is the test result after conversion.
# ./stress_reuseport_listen.sh
listen 24000 socks took 0.47714
Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hangbin Liu [Tue, 19 Dec 2023 09:48:51 +0000 (17:48 +0800)]
selftests/net: convert rtnetlink.sh to run it in unique namespace
When running the test in a namespace, debugfs may not be loaded automatically.
So add a check to make sure debugfs is loaded. Here is the test result
after conversion.
# ./rtnetlink.sh
PASS: policy routing
PASS: route get
...
PASS: address proto IPv4
PASS: address proto IPv6
Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hangbin Liu [Tue, 19 Dec 2023 09:48:50 +0000 (17:48 +0800)]
selftests/net: convert netns-name.sh to run it in unique namespace
This test will move the device to netns 1. Add a new test_ns to do this.
Here is the test result after conversion.
# ./netns-name.sh
netns-name.sh [ OK ]
Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hangbin Liu [Tue, 19 Dec 2023 09:48:49 +0000 (17:48 +0800)]
selftests/net: convert gre_gso.sh to run it in unique namespace
Here is the test result after conversion.
# ./gre_gso.sh
TEST: GREv6/v4 - copy file w/ TSO [ OK ]
TEST: GREv6/v4 - copy file w/ GSO [ OK ]
TEST: GREv6/v6 - copy file w/ TSO [ OK ]
TEST: GREv6/v6 - copy file w/ GSO [ OK ]
Tests passed: 4
Tests failed: 0
Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiapeng Chong [Tue, 19 Dec 2023 05:54:04 +0000 (13:54 +0800)]
selftests/net: remove unneeded semicolon
No functional modification involved.
./tools/testing/selftests/net/tcp_ao/setsockopt-closed.c:121:2-3: Unneeded semicolon.
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=7771
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry Safonov [Tue, 19 Dec 2023 02:03:05 +0000 (02:03 +0000)]
selftest/tcp-ao: Rectify out-of-tree build
Trivial fix for out-of-tree build that I wasn't testing previously:
1. Create a directory for library object files, fixes:
> gcc lib/kconfig.c -Wall -O2 -g -D_GNU_SOURCE -fno-strict-aliasing -I ../../../../../usr/include/ -iquote /tmp/kselftest/kselftest/net/tcp_ao/lib -I ../../../../include/ -o /tmp/kselftest/kselftest/net/tcp_ao/lib/kconfig.o -c
> Assembler messages:
> Fatal error: can't create /tmp/kselftest/kselftest/net/tcp_ao/lib/kconfig.o: No such file or directory
> make[1]: *** [Makefile:46: /tmp/kselftest/kselftest/net/tcp_ao/lib/kconfig.o] Error 1
2. Include $(KHDR_INCLUDES) that's exported by selftests/Makefile, fixes:
> In file included from lib/kconfig.c:6:
> lib/aolib.h:320:45: warning: ‘struct tcp_ao_add’ declared inside parameter list will not be visible outside of this definition or declaration
> 320 | extern int test_prepare_key_sockaddr(struct tcp_ao_add *ao, const char *alg,
> | ^~~~~~~~~~
...
3. While at it, clean up $(KSFT_KHDR_INSTALL): it's not needed anymore
since commit f2745dc0ba3d ("selftests: stop using KSFT_KHDR_INSTALL")
4. Also, while at it, drop the .DEFAULT_GOAL definition: it has a
self-explanatory comment that was valid when I made these selftests
compile on a local v4.19 kernel, but it is not needed since
commit 8ce72dc32578 ("selftests: fix headers_install circular dependency")
Fixes: cfbab37b3da0 ("selftests/net: Add TCP-AO library")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202312190645.q76MmHyq-lkp@intel.com/
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jonathan Corbet [Tue, 19 Dec 2023 00:28:13 +0000 (17:28 -0700)]
tipc: Remove some excess struct member documentation
Remove documentation for nonexistent struct members, addressing these
warnings:
./net/tipc/link.c:228: warning: Excess struct member 'media_addr' description in 'tipc_link'
./net/tipc/link.c:228: warning: Excess struct member 'timer' description in 'tipc_link'
./net/tipc/link.c:228: warning: Excess struct member 'refcnt' description in 'tipc_link'
./net/tipc/link.c:228: warning: Excess struct member 'proto_msg' description in 'tipc_link'
./net/tipc/link.c:228: warning: Excess struct member 'pmsg' description in 'tipc_link'
./net/tipc/link.c:228: warning: Excess struct member 'backlog_limit' description in 'tipc_link'
./net/tipc/link.c:228: warning: Excess struct member 'exp_msg_count' description in 'tipc_link'
./net/tipc/link.c:228: warning: Excess struct member 'reset_rcv_checkpt' description in 'tipc_link'
./net/tipc/link.c:228: warning: Excess struct member 'transmitq' description in 'tipc_link'
./net/tipc/link.c:228: warning: Excess struct member 'snt_nxt' description in 'tipc_link'
./net/tipc/link.c:228: warning: Excess struct member 'deferred_queue' description in 'tipc_link'
./net/tipc/link.c:228: warning: Excess struct member 'unacked_window' description in 'tipc_link'
./net/tipc/link.c:228: warning: Excess struct member 'next_out' description in 'tipc_link'
./net/tipc/link.c:228: warning: Excess struct member 'long_msg_seq_no' description in 'tipc_link'
./net/tipc/link.c:228: warning: Excess struct member 'bc_rcvr' description in 'tipc_link'
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jonathan Corbet [Tue, 19 Dec 2023 00:26:26 +0000 (17:26 -0700)]
net: skbuff: Remove some excess struct-member documentation
Remove documentation for nonexistent structure members, addressing these
warnings:
./include/linux/skbuff.h:1063: warning: Excess struct member 'sp' description in 'sk_buff'
./include/linux/skbuff.h:1063: warning: Excess struct member 'nf_bridge' description in 'sk_buff'
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 22 Dec 2023 22:15:35 +0000 (22:15 +0000)]
Merge branch 'tcp-refactor-bhash2'
Kuniyuki Iwashima says:
====================
tcp: Refactor bhash2 and remove sk_bind2_node.
This series refactors code around bhash2 and removes some bhash2-specific
fields: sock.sk_bind2_node and inet_timewait_sock.tw_bind2_node.
patch 1 : optimise bind() for non-wildcard v4-mapped-v6 address
patch 2 - 4 : optimise bind() conflict tests
patch 5 - 12 : Link bhash2 to bhash and unlink sk from bhash2 to
remove sk_bind2_node
Patch 8 will trigger a false-positive checkpatch error.
v2: resend of https://lore.kernel.org/netdev/20231213082029.35149-1-kuniyu@amazon.com/
* Rebase on latest net-next
* Patch 11
* Add change in inet_diag_dump_icsk() for recent bhash dump patch
v1: https://lore.kernel.org/netdev/20231023190255.39190-1-kuniyu@amazon.com/
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Kuniyuki Iwashima [Tue, 19 Dec 2023 00:18:33 +0000 (09:18 +0900)]
tcp: Remove dead code and fields for bhash2.
Now all sockets including TIME_WAIT are linked to bhash2 using
sock_common.skc_bind_node.
We no longer use inet_bind2_bucket.deathrow, sock.sk_bind2_node,
and inet_timewait_sock.tw_bind2_node.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kuniyuki Iwashima [Tue, 19 Dec 2023 00:18:32 +0000 (09:18 +0900)]
tcp: Link sk and twsk to tb2->owners using skc_bind_node.
Now we can use sk_bind_node/tw_bind_node for bhash2, which means
we need not link TIME_WAIT sockets separately.
The dead code and sk_bind2_node will be removed in the next patch.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kuniyuki Iwashima [Tue, 19 Dec 2023 00:18:31 +0000 (09:18 +0900)]
tcp: Unlink sk from bhash.
Now we do not use tb->owners and can unlink sockets from bhash.
sk_bind_node/tw_bind_node are available for bhash2 and will be
used in the following patch.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kuniyuki Iwashima [Tue, 19 Dec 2023 00:18:30 +0000 (09:18 +0900)]
tcp: Check hlist_empty(&tb->bhash2) instead of hlist_empty(&tb->owners).
We use hlist_empty(&tb->owners) to check if the bhash bucket has a socket.
We can check the child bhash2 buckets instead.
For this to work, the bhash2 bucket must be freed before the bhash bucket.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kuniyuki Iwashima [Tue, 19 Dec 2023 00:18:29 +0000 (09:18 +0900)]
tcp: Iterate tb->bhash2 in inet_csk_bind_conflict().
Sockets in bhash are also linked to bhash2, but TIME_WAIT sockets
are linked separately in tb2->deathrow.
Let's replace tb->owners iteration in inet_csk_bind_conflict() with
two iterations over tb2->owners and tb2->deathrow.
This can be done safely under bhash's lock because socket insertion/
deletion in bhash2 happens with bhash's lock held.
Note that twsk_for_each_bound_bhash() will be removed later.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kuniyuki Iwashima [Tue, 19 Dec 2023 00:18:28 +0000 (09:18 +0900)]
tcp: Rearrange tests in inet_csk_bind_conflict().
The following patch adds code in the !inet_use_bhash2_on_bind(sk)
case in inet_csk_bind_conflict().
To avoid adding another nesting level and to keep the change clean, this
patch rearranges the tests in inet_csk_bind_conflict().
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kuniyuki Iwashima [Tue, 19 Dec 2023 00:18:27 +0000 (09:18 +0900)]
tcp: Link bhash2 to bhash.
bhash2 added a new member sk_bind2_node in struct sock to link
sockets to bhash2 in addition to bhash.
bhash is still needed to search conflicting sockets efficiently
from a port for the wildcard address. However, bhash itself need
not have sockets.
If we link each bhash2 bucket to the corresponding bhash bucket,
we can iterate the same set of the sockets from bhash2 via bhash.
This patch links bhash2 to bhash only, and the actual use will be
in the later patches. Finally, we will remove sk_bind2_node.
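The layout described above can be pictured with a minimal userspace sketch. It assumes plain singly linked lists instead of the kernel's hlist, and every name in it (sock_stub, bind_bucket, bind2_bucket, link_bind2_bucket, add_owner, count_owners) is illustrative rather than the kernel's actual definition.

```c
#include <stddef.h>

/* Hypothetical stand-in for a socket bound to an (address, port) pair. */
struct sock_stub {
	struct sock_stub *next;             /* next socket in the same bhash2 bucket */
};

/* Simplified bhash2 bucket: keyed by address + port, owns the sockets. */
struct bind2_bucket {
	struct bind2_bucket *next_in_bhash; /* link into the parent bhash bucket */
	struct sock_stub *owners;
};

/* Simplified bhash bucket: keyed by port only, holds bhash2 buckets. */
struct bind_bucket {
	struct bind2_bucket *bhash2;
};

/* Attach a bhash2 bucket to its parent bhash bucket. */
void link_bind2_bucket(struct bind_bucket *tb, struct bind2_bucket *tb2)
{
	tb2->next_in_bhash = tb->bhash2;
	tb->bhash2 = tb2;
}

/* Bind a socket: it is linked to the bhash2 bucket only. */
void add_owner(struct bind2_bucket *tb2, struct sock_stub *sk)
{
	sk->next = tb2->owners;
	tb2->owners = sk;
}

/* Visit every socket on the port by walking bhash -> bhash2 -> owners. */
size_t count_owners(const struct bind_bucket *tb)
{
	size_t n = 0;

	for (const struct bind2_bucket *tb2 = tb->bhash2; tb2; tb2 = tb2->next_in_bhash)
		for (const struct sock_stub *sk = tb2->owners; sk; sk = sk->next)
			n++;
	return n;
}
```

In this sketch the port-level bucket never stores a socket itself, yet a walk through its child buckets still reaches every socket on the port, which is the property the series relies on.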
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kuniyuki Iwashima [Tue, 19 Dec 2023 00:18:26 +0000 (09:18 +0900)]
tcp: Rename tb in inet_bind2_bucket_(init|create)().
Later, we no longer link sockets to bhash. Instead, each bhash2
bucket is linked to the corresponding bhash bucket.
Then, we pass the bhash bucket to bhash2 allocation functions as
tb. However, tb is already used in inet_bind2_bucket_create() and
inet_bind2_bucket_init() as the bhash2 bucket.
To make the following diff clear, let's use tb2 for the bhash2 bucket
there.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kuniyuki Iwashima [Tue, 19 Dec 2023 00:18:25 +0000 (09:18 +0900)]
tcp: Save address type in inet_bind2_bucket.
inet_bind2_bucket_addr_match() and inet_bind2_bucket_match_addr_any()
are called for each bhash2 bucket to check conflicts. Thus, we call
ipv6_addr_any() and ipv6_addr_v4mapped() over and over during bind().
Let's avoid calling them by saving the address type in inet_bind2_bucket.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kuniyuki Iwashima [Tue, 19 Dec 2023 00:18:24 +0000 (09:18 +0900)]
tcp: Save v4 address as v4-mapped-v6 in inet_bind2_bucket.v6_rcv_saddr.
In bhash2, IPv4/IPv6 addresses are saved in two union members,
which complicate address checks in inet_bind2_bucket_addr_match()
and inet_bind2_bucket_match_addr_any() considering uninitialised
memory and v4-mapped-v6 conflicts.
Let's simplify that by saving IPv4 address as v4-mapped-v6 address
and defining tb2.rcv_saddr as tb2.v6_rcv_saddr.s6_addr32[3].
Then, we can compare v6 address as is, and after checking v4-mapped-v6,
we can compare v4 address easily. Also, we can remove tb2->family.
Note these functions will be further refactored in the next patch.
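As a rough illustration of the idea above, the following userspace sketch stores an IPv4 address as ::ffff:a.b.c.d and compares against it. The type addr128 and the helpers v4_to_mapped, is_v4mapped, bucket_match_v6 and bucket_match_v4 are hypothetical names, not kernel functions, and addresses are kept as raw bytes to sidestep byte-order questions.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 128-bit address in network byte order. */
struct addr128 {
	uint8_t b[16];
};

/* Store an IPv4 address (4 bytes, network order) as ::ffff:a.b.c.d. */
struct addr128 v4_to_mapped(const uint8_t v4[4])
{
	struct addr128 a;

	memset(a.b, 0, 10);
	a.b[10] = 0xff;
	a.b[11] = 0xff;
	memcpy(&a.b[12], v4, 4);   /* the last 32-bit word carries the v4 address */
	return a;
}

/* True if the address has the ::ffff:0:0/96 prefix. */
bool is_v4mapped(const struct addr128 *a)
{
	static const uint8_t prefix[12] = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xff, 0xff };

	return memcmp(a->b, prefix, sizeof(prefix)) == 0;
}

/* IPv6 socket: compare the full 16 bytes as is. */
bool bucket_match_v6(const struct addr128 *bucket, const struct addr128 *sk_addr)
{
	return memcmp(bucket->b, sk_addr->b, sizeof(bucket->b)) == 0;
}

/* IPv4 socket: the bucket must be v4-mapped, then only the last word matters. */
bool bucket_match_v4(const struct addr128 *bucket, const uint8_t v4[4])
{
	return is_v4mapped(bucket) && memcmp(&bucket->b[12], v4, 4) == 0;
}
```

With a single storage format there is no second union member left uninitialised, and no separate family field is needed to know which member is valid.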
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kuniyuki Iwashima [Tue, 19 Dec 2023 00:18:23 +0000 (09:18 +0900)]
tcp: Rearrange tests in inet_bind2_bucket_(addr_match|match_addr_any)().
The protocol family tests in inet_bind2_bucket_addr_match() and
inet_bind2_bucket_match_addr_any() are ordered as follows.
if (sk->sk_family != tb2->family)
else if (sk->sk_family == AF_INET6)
else
This patch rearranges them so that AF_INET6 socket is handled first
to make the following patch tidy, where tb2->family will be removed.
if (sk->sk_family == AF_INET6)
else if (tb2->family == AF_INET6)
else
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kuniyuki Iwashima [Tue, 19 Dec 2023 00:18:22 +0000 (09:18 +0900)]
tcp: Use bhash2 for v4-mapped-v6 non-wildcard address.
While checking port availability in bind() or listen(), we used only
bhash for all v4-mapped-v6 addresses. But there is no good reason not
to use bhash2 for v4-mapped-v6 non-wildcard addresses.
Let's do it by returning true in inet_use_bhash2_on_bind(). Then, we
also need to add a test in inet_bind2_bucket_match_addr_any() so that
::ffff:X.X.X.X will match with 0.0.0.0.
Note that sk->sk_rcv_saddr is initialised for v4-mapped-v6 sk in
__inet6_bind().
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Mon, 18 Dec 2023 13:30:22 +0000 (13:30 +0000)]
selftests/net: Fix various spelling mistakes in TCP-AO tests
There are a handful of spelling mistakes in test messages in the
TCP-AO selftests. Fix these.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Dmitry Safonov <dima@arista.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Suman Ghosh [Mon, 18 Dec 2023 18:02:58 +0000 (23:32 +0530)]
octeontx2-af: Fix a double free issue
There was a memory leak during error handling in function
npc_mcam_rsrcs_init().
Fixes: dd7842878633 ("octeontx2-af: Add new devlink param to configure maximum usable NIX block LFs")
Suggested-by: Simon Horman <horms@kernel.org>
Signed-off-by: Suman Ghosh <sumang@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 22 Dec 2023 12:09:52 +0000 (12:09 +0000)]
Merge branch '1GbE' of git://git./linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
intel: use bitfield operations
Jesse Brandeburg says:
After repeatedly getting review comments on new patches, and sporadic
patches to fix parts of our drivers, we should just convert the Intel code
to use FIELD_PREP() and FIELD_GET(). It's then "common" in the code and
hopefully future change-sets will see the context and do-the-right-thing.
This conversion was done with a coccinelle script which is mentioned in the
commit messages. Generally there were only a couple conversions that were
"undone" after the automatic changes because they tried to convert a
non-contiguous mask.
Patch 1 is required at the beginning of this series to fix a "forever"
issue in the e1000e driver that fails the compilation test after conversion
because the shift / mask was out of range.
The second patch just adds all the new #includes in one go.
The patch titled: "ice: fix pre-shifted bit usage" is needed to allow the
use of the FIELD_* macros and fix up the unexpected "shifts included"
defines found while creating this series.
The rest are the conversion to use FIELD_PREP()/FIELD_GET(), and the
occasional leXX_{get,set,encode}_bits() call, as suggested by Alex.
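To make the conversion concrete, here is a small userspace sketch of what a mask-only field accessor does compared with open-coded shift and mask. The kernel's real FIELD_PREP() and FIELD_GET() are macros in <linux/bitfield.h> with compile-time checks; the functions field_lsb, field_prep and field_get below, and the RX_LEN_* register layout, are hypothetical stand-ins for illustration only.

```c
#include <stdint.h>

/* Hypothetical register field occupying bits 4..13 (a contiguous mask). */
#define RX_LEN_MASK  0x00003ff0u
#define RX_LEN_SHIFT 4

/* Lowest set bit of the mask; multiplying by it equals shifting left.
 * The mask must be non-zero. */
uint32_t field_lsb(uint32_t mask)
{
	return mask & (~mask + 1u);
}

/* Place val into the field described by mask. */
uint32_t field_prep(uint32_t mask, uint32_t val)
{
	return (val * field_lsb(mask)) & mask;
}

/* Extract the field described by mask from reg. */
uint32_t field_get(uint32_t mask, uint32_t reg)
{
	return (reg & mask) / field_lsb(mask);
}

/* The open-coded form that needs a separate, matching shift constant. */
uint32_t open_coded_get(uint32_t reg)
{
	return (reg & RX_LEN_MASK) >> RX_LEN_SHIFT;
}
```

Only the mask has to be defined, so a mismatched shift constant cannot creep in. The approach presumes a contiguous mask, which is consistent with the few non-contiguous cases in the series being left unconverted.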
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Paolo Abeni [Thu, 21 Dec 2023 21:17:23 +0000 (22:17 +0100)]
Merge git://git./linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.
Adjacent changes:
drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
23c93c3b6275 ("bnxt_en: do not map packet buffers twice")
6d1add95536b ("bnxt_en: Modify TX ring indexing logic.")
tools/testing/selftests/net/Makefile
2258b666482d ("selftests: add vlan hw filter tests")
a0bc96c0cd6e ("selftests: net: verify fq per-band packet limit")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
David Ahern [Tue, 19 Dec 2023 03:07:42 +0000 (20:07 -0700)]
net/ipv6: Remove gc_link warn on in fib6_info_release
A revert of 3dec89b14d37 ("net/ipv6: Remove expired routes with a separated
list of routes") was sent for net-next. Revert the remainder of 5a08d0065a915
which added a warn on if a fib entry is still on the gc_link list
to avoid compile failures when net is merged to net-next
Signed-off-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20231219030742.25715-1-dsahern@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Linus Torvalds [Thu, 21 Dec 2023 17:15:37 +0000 (09:15 -0800)]
Merge tag 'net-6.7-rc7' of git://git./linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from WiFi and bpf.
Current release - regressions:
- bpf: syzkaller found null ptr deref in unix_bpf proto add
- eth: i40e: fix ST code value for clause 45
Previous releases - regressions:
- core: return error from sk_stream_wait_connect() if sk_wait_event()
fails
- ipv6: revert remove expired routes with a separated list of routes
- wifi rfkill:
- set GPIO direction
- fix crash with WED rx support enabled
- bluetooth:
- fix deadlock in vhci_send_frame
- fix use-after-free in bt_sock_recvmsg
- eth: mlx5e: fix a race in command alloc flow
- eth: ice: fix PF with enabled XDP going no-carrier after reset
- eth: bnxt_en: do not map packet buffers twice
Previous releases - always broken:
- core:
- check vlan filter feature in vlan_vids_add_by_dev() and
vlan_vids_del_by_dev()
- check dev->gso_max_size in gso_features_check()
- mptcp: fix inconsistent state on fastopen race
- phy: skip LED triggers on PHYs on SFP modules
- eth: mlx5e:
- fix double free of encap_header
- fix slab-out-of-bounds in mlx5_query_nic_vport_mac_list()"
* tag 'net-6.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (69 commits)
net: check dev->gso_max_size in gso_features_check()
kselftest: rtnetlink.sh: use grep_fail when expecting the cmd fail
net/ipv6: Revert remove expired routes with a separated list of routes
net: avoid build bug in skb extension length calculation
net: ethernet: mtk_wed: fix possible NULL pointer dereference in mtk_wed_wo_queue_tx_clean()
net: stmmac: fix incorrect flag check in timestamp interrupt
selftests: add vlan hw filter tests
net: check vlan filter feature in vlan_vids_add_by_dev() and vlan_vids_del_by_dev()
net: hns3: add new maintainer for the HNS3 ethernet driver
net: mana: select PAGE_POOL
net: ks8851: Fix TX stall caused by TX buffer overrun
ice: Fix PF with enabled XDP going no-carrier after reset
ice: alter feature support check for SRIOV and LAG
ice: stop trashing VF VSI aggregator node ID information
mailmap: add entries for Geliang Tang
mptcp: fill in missing MODULE_DESCRIPTION()
mptcp: fix inconsistent state on fastopen race
selftests: mptcp: join: fix subflow_send_ack lookup
net: phy: skip LED triggers on PHYs on SFP modules
bpf: Add missing BPF_LINK_TYPE invocations
...
Christian Marangi [Sun, 17 Dec 2023 23:25:08 +0000 (00:25 +0100)]
net: phy: at803x: replace msleep(1) with usleep_range
Replace msleep(1) with usleep_range as suggested by timers-howto guide.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20231217232508.26470-1-ansuelsmth@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Christian Marangi [Sun, 17 Dec 2023 23:27:39 +0000 (00:27 +0100)]
net: phy: at803x: remove extra space after cast
Remove extra space after cast as reported by checkpatch to keep code
clean.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20231217232739.27065-1-ansuelsmth@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Paolo Abeni [Thu, 21 Dec 2023 11:27:28 +0000 (12:27 +0100)]
Merge tag 'for-netdev' of https://git./linux/kernel/git/bpf/bpf
Daniel Borkmann says:
====================
pull-request: bpf 2023-12-21
Hi David, hi Jakub, hi Paolo, hi Eric,
The following pull-request contains BPF updates for your *net* tree.
We've added 3 non-merge commits during the last 5 day(s) which contain
a total of 4 files changed, 45 insertions(+).
The main changes are:
1) Fix a syzkaller splat which triggered an oob issue in bpf_link_show_fdinfo(),
from Jiri Olsa.
2) Fix another syzkaller-found issue which triggered a NULL pointer dereference
in BPF sockmap for unconnected unix sockets, from John Fastabend.
bpf-for-netdev
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
bpf: Add missing BPF_LINK_TYPE invocations
bpf: sockmap, test for unconnected af_unix sock
bpf: syzkaller found null ptr deref in unix_bpf proto add
====================
Link: https://lore.kernel.org/r/20231221104844.1374-1-daniel@iogearbox.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Eric Dumazet [Tue, 19 Dec 2023 12:53:31 +0000 (12:53 +0000)]
net: check dev->gso_max_size in gso_features_check()
Some drivers might misbehave if TSO packets get too big.
GVE for instance uses a 16bit field in its TX descriptor,
and will do bad things if a packet is bigger than 2^16 bytes.
Linux TCP stack honors dev->gso_max_size, but there are
other ways for oversized packets to reach an ndo_start_xmit()
handler: virtio_net, af_packet, GRO...
Add a generic check in gso_features_check() and fall back
to GSO when needed.
gso_max_size was added in the blamed commit.
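The shape of such a generic check can be sketched in userspace C as follows. This is an assumption-laden illustration, not the kernel code: dev_stub, pkt_stub, FEATURE_GSO_MASK and gso_features_check_sketch are hypothetical names, and the exact comparison used in the kernel may differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bit: "the device may segment this packet in hardware". */
#define FEATURE_GSO_MASK 0x1u

struct dev_stub {
	uint32_t gso_max_size;   /* largest packet the device can segment itself */
};

struct pkt_stub {
	uint32_t len;            /* total packet length in bytes */
	bool is_gso;             /* packet still needs segmentation */
};

/* Return the feature set usable for this packet. If the packet is larger
 * than the device limit, clear the hardware segmentation bits so the stack
 * segments it in software before it reaches the driver. */
uint32_t gso_features_check_sketch(const struct pkt_stub *pkt,
				   const struct dev_stub *dev,
				   uint32_t features)
{
	if (pkt->is_gso && pkt->len > dev->gso_max_size)
		return features & ~FEATURE_GSO_MASK;
	return features;
}
```

Placing the test on the common transmit path covers every producer of large packets at once, instead of relying on each one to honor the device limit.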
Fixes: 82cc1a7a5687 ("[NET]: Add per-connection option to set max TSO frame size")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20231219125331.4127498-1-edumazet@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Hangbin Liu [Tue, 19 Dec 2023 06:57:37 +0000 (14:57 +0800)]
kselftest: rtnetlink.sh: use grep_fail when expecting the cmd fail
run_cmd_grep_fail should be used when the command is expected to fail.
Otherwise ret is set to 1 and the whole test returns 1 on exit, which makes
the result report fail when run via run_kselftest.sh.
Before fix:
# ./rtnetlink.sh -t kci_test_addrlft
PASS: preferred_lft addresses have expired
# echo $?
1
After fix:
# ./rtnetlink.sh -t kci_test_addrlft
PASS: preferred_lft addresses have expired
# echo $?
0
Fixes: 9c2a19f71515 ("kselftest: rtnetlink.sh: add verbose flag")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20231219065737.1725120-1-liuhangbin@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
David Ahern [Tue, 19 Dec 2023 03:02:43 +0000 (20:02 -0700)]
net/ipv6: Revert remove expired routes with a separated list of routes
This reverts commit 3dec89b14d37ee635e772636dad3f09f78f1ab87.
The commit has some race conditions given how expires is managed on a
fib6_info in relation to gc start, adding the entry to the gc list and
setting the timer value leading to UAF. Revert the commit and try again
in a later release.
Fixes: 3dec89b14d37 ("net/ipv6: Remove expired routes with a separated list of routes")
Cc: Kui-Feng Lee <thinker.li@gmail.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20231219030243.25687-1-dsahern@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Paolo Abeni [Thu, 21 Dec 2023 07:34:08 +0000 (08:34 +0100)]
Merge branch '100GbE' of git://git./linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2023-12-18 (ice)
This series contains updates to ice driver only.
Jake stops clearing of needed aggregator information.
Dave adds a check for LAG device support before initializing the
associated event handler.
Larysa restores accounting of XDP queues in TC configurations.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
ice: Fix PF with enabled XDP going no-carrier after reset
ice: alter feature support check for SRIOV and LAG
ice: stop trashing VF VSI aggregator node ID information
====================
Link: https://lore.kernel.org/r/20231218192708.3397702-1-anthony.l.nguyen@intel.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Thomas Weißschuh [Mon, 18 Dec 2023 17:06:54 +0000 (18:06 +0100)]
net: avoid build bug in skb extension length calculation
GCC seems to incorrectly fail to evaluate skb_ext_total_length() at
compile time under certain conditions.
The issue even occurs if all values in skb_ext_type_len[] are "0",
ruling out the possibility of an actual overflow.
As the patch has been in mainline since v6.6 without triggering the
problem it seems to be a very uncommon occurrence.
As the issue only occurs when -fno-tree-loop-im is specified as part of
CFLAGS_GCOV, disable the BUILD_BUG_ON() only when building with coverage
reporting enabled.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202312171924.4FozI5FG-lkp@intel.com/
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/lkml/487cfd35-fe68-416f-9bfd-6bb417f98304@app.fastmail.com/
Fixes: 5d21d0a65b57 ("net: generalize calculation of skb extensions length")
Cc: <stable@vger.kernel.org>
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20231218-net-skbuff-build-bug-v1-1-eefc2fb0a7d3@weissschuh.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Lorenzo Bianconi [Sun, 17 Dec 2023 15:37:40 +0000 (16:37 +0100)]
net: ethernet: mtk_wed: fix possible NULL pointer dereference in mtk_wed_wo_queue_tx_clean()
In order to avoid a NULL pointer dereference, check entry->buf pointer before running
skb_free_frag in mtk_wed_wo_queue_tx_clean routine.
Fixes: 799684448e3e ("net: ethernet: mtk_wed: introduce wed wo support")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/3c1262464d215faa8acebfc08869798c81c96f4a.1702827359.git.lorenzo@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Linus Torvalds [Tue, 19 Dec 2023 23:26:59 +0000 (15:26 -0800)]
posix-timers: Get rid of [COMPAT_]SYS_NI() uses
Only the posix timer system calls use this (when the posix timer support
is disabled, which does not actually happen in any normal case), because
they had debug code to print out a warning about missing system calls.
Get rid of that special case, and just use the standard COND_SYSCALL
interface that creates weak system call stubs that return -ENOSYS for
when the system call does not exist.
This fixes a kCFI issue with the SYS_NI() hackery:
CFI failure at int80_emulation+0x67/0xb0 (target: sys_ni_posix_timers+0x0/0x70; expected type: 0xb02b34d9)
WARNING: CPU: 0 PID: 48 at int80_emulation+0x67/0xb0
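The weak-stub fallback mentioned above can be illustrated in userspace with a GCC/Clang weak symbol. This is only a sketch of the general technique, not how COND_SYSCALL is implemented; sys_optional_timer_create and call_optional are hypothetical names.

```c
#include <errno.h>

/* Default stub for an optional entry point. Because it is weak, a strong
 * definition of the same symbol in another object file would replace it at
 * link time; if none exists, callers get -ENOSYS. */
__attribute__((weak)) long sys_optional_timer_create(void)
{
	return -ENOSYS;
}

/* Callers need no special casing for the "feature compiled out" case. */
long call_optional(void)
{
	return sys_optional_timer_create();
}
```

Since no strong definition is linked into this sketch, call_optional() returns -ENOSYS.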
Reported-by: kernel test robot <oliver.sang@intel.com>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Tested-by: Sami Tolvanen <samitolvanen@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Thu, 21 Dec 2023 05:09:47 +0000 (21:09 -0800)]
Merge tag '6.7-rc6-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:
- two multichannel reconnect fixes, one fixing an important refcounting
problem that can lead to umount problems
- atime fix
- five fixes for various potential OOB accesses, including a CVE fix,
and two additional fixes for problems pointed out by Robert Morris's
fuzzing investigation
* tag '6.7-rc6-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: do not let cifs_chan_update_iface deallocate channels
cifs: fix a pending undercount of srv_count
fs: cifs: Fix atime update check
smb: client: fix potential OOB in smb2_dump_detail()
smb: client: fix potential OOB in cifs_dump_detail()
smb: client: fix OOB in smbCalcSize()
smb: client: fix OOB in SMB2_query_info_init()
smb: client: fix OOB in cifsd when receiving compounded resps
Linus Torvalds [Thu, 21 Dec 2023 00:12:39 +0000 (16:12 -0800)]
Merge tag 's390-6.7-4' of git://git./linux/kernel/git/s390/linux
Pull s390 fixes from Alexander Gordeev:
- Fix virtual vs physical address confusion in Storage Class Memory
(SCM) block device driver.
- Fix saving and restoring of FPU kernel context, which could lead to
corruption of vector registers 8-15
- Update defconfigs
* tag 's390-6.7-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390: update defconfigs
s390/vx: fix save/restore of fpu kernel context
s390/scm: fix virtual vs physical address confusion
Linus Torvalds [Thu, 21 Dec 2023 00:06:40 +0000 (16:06 -0800)]
Merge tag 'soc-fixes-6.7-2' of git://git./linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
"There are only a handful of bugfixes this time, which feels almost too
small, so I hope we are not missing something important.
- One more mediatek dts warning fix after the previous larger set,
this should finally result in a clean defconfig build.
- TI OMAP dts fixes for a spurious hang on am335x and invalid data on
DRA7
- One DTS fix for ethernet on Orange Pi Zero (Allwinner H616)
- A regression fix for ti-sysc interconnect target module driver to
not access registers after reset if srst_udelay quirk is needed
- Reset controller driver fixes for a crash during error handling and
a build warning"
* tag 'soc-fixes-6.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
arm64: dts: mediatek: mt8395-genio-1200-evk: add interrupt-parent for mt6360
ARM: dts: Fix occasional boot hang for am3 usb
reset: Fix crash when freeing non-existent optional resets
ARM: OMAP2+: Fix null pointer dereference and memory leak in omap_soc_device_init
ARM: dts: dra7: Fix DRA7 L3 NoC node register size
bus: ti-sysc: Flush posted write only after srst_udelay
reset: hisilicon: hi6220: fix Wvoid-pointer-to-enum-cast warning
arm64: dts: allwinner: h616: update emac for Orange Pi Zero 3
Linus Torvalds [Wed, 20 Dec 2023 23:58:18 +0000 (15:58 -0800)]
Merge tag 'platform-drivers-x86-v6.7-5' of git://git./linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform drivers fixes from Ilpo Järvinen:
- Fan reporting on some ThinkPads
- Laptop 13 spurious keypresses while suspended
- Intel PMC correction to avoid crash
* tag 'platform-drivers-x86-v6.7-5' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
platform/x86/amd/pmc: Disable keyboard wakeup on AMD Framework 13
platform/x86/amd/pmc: Move keyboard wakeup disablement detection to pmc-quirks
platform/x86/amd/pmc: Only run IRQ1 firmware version check on Cezanne
platform/x86/amd/pmc: Move platform defines to header
platform/x86/intel/pmc: Fix hang in pmc_core_send_ltr_ignore()
platform/x86: thinkpad_acpi: fix for incorrect fan reporting on some ThinkPad systems
Linus Torvalds [Wed, 20 Dec 2023 20:04:03 +0000 (12:04 -0800)]
Merge tag 'ovl-fixes-6.7-rc7' of git://git./linux/kernel/git/overlayfs/vfs
Pull overlayfs fix from Amir Goldstein:
"Fix a regression from this merge window"
* tag 'ovl-fixes-6.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs:
ovl: fix dentry reference leak after changes to underlying layers
Linus Torvalds [Wed, 20 Dec 2023 19:24:28 +0000 (11:24 -0800)]
Merge tag 'bcachefs-2023-12-19' of https://evilpiepirate.org/git/bcachefs
Pull more bcachefs fixes from Kent Overstreet:
- Fix a deadlock in the data move path with nocow locks (vs. update in
place writes); when trylock failed we were incorrectly waiting for in
flight ios to flush.
- Fix reporting of NFS file handle length
- Fix early error path in bch2_fs_alloc() - list head wasn't being
initialized early enough
- Make sure correct (hardware accelerated) crc modules get loaded
- Fix a rare overflow in the btree split path, when the packed bkey
format grows and all the keys have no value (LRU btree).
- Fix error handling in the sector allocator
This was causing writes to spuriously fail in multidevice setups, and
another bug meant that the errors weren't being logged, only reported
via fsync.
* tag 'bcachefs-2023-12-19' of https://evilpiepirate.org/git/bcachefs:
bcachefs: Fix bch2_alloc_sectors_start_trans() error handling
bcachefs; guard against overflow in btree node split
bcachefs: btree_node_u64s_with_format() takes nr keys
bcachefs: print explicit recovery pass message only once
bcachefs: improve modprobe support by providing softdeps
bcachefs: fix invalid memory access in bch2_fs_alloc() error path
bcachefs: Fix determining required file handle length
bcachefs: Fix nocow locks deadlock
Linus Torvalds [Wed, 20 Dec 2023 19:16:50 +0000 (11:16 -0800)]
Merge tag 'nfsd-6.7-2' of git://git./linux/kernel/git/cel/linux
Pull nfsd fixes from Chuck Lever:
- Address a few recently-introduced issues
* tag 'nfsd-6.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
SUNRPC: Revert 5f7fc5d69f6e92ec0b38774c387f5cf7812c5806
NFSD: Revert 738401a9bd1ac34ccd5723d69640a4adbb1a4bc0
NFSD: Revert 6c41d9a9bd0298002805758216a9c44e38a8500d
nfsd: hold nfsd_mutex across entire netlink operation
nfsd: call nfsd_last_thread() before final nfsd_put()
Linus Torvalds [Wed, 20 Dec 2023 19:01:28 +0000 (11:01 -0800)]
Merge tag 'dm-6.7/dm-fixes-3' of git://git./linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
- DM raid target (and MD raid) fix for reconfig_mutex MD deadlock that
should have been merged along with recent v6.7-rc6 MD fixes (see MD
related commits: f2d87a759f68^..b39113349de6)
- DM integrity target fix to avoid modifying immutable biovec in the
integrity_metadata() edge case where kmalloc fails.
- Fix drivers/md/Kconfig so DM_AUDIT depends on BLK_DEV_DM.
- Update DM entry in MAINTAINERS to remove stale info.
* tag 'dm-6.7/dm-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
MAINTAINERS: remove stale info for DEVICE-MAPPER
dm audit: fix Kconfig so DM_AUDIT depends on BLK_DEV_DM
dm-integrity: don't modify bio's immutable bio_vec in integrity_metadata()
dm-raid: delay flushing event_work() after reconfig_mutex is released
Macpaul Lin [Fri, 15 Dec 2023 07:32:52 +0000 (15:32 +0800)]
arm64: dts: mediatek: mt8395-genio-1200-evk: add interrupt-parent for mt6360
This patch fixes the warning introduced by the mt6360 node in
mt8395-genio-1200-evk.dts.
arch/arm64/boot/dts/mediatek/mt8195.dtsi:464.4-27: Warning (interrupts_property): /soc/i2c@11d01000/pmic@34:#interrupt-cells: size is (8), expected multiple of 16
Add a missing 'interrupt-parent' to fix this warning.
Fixes: f2b543a191b6 ("arm64: dts: mediatek: add device-tree for Genio 1200 EVK board")
Reported-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/linux-devicetree/20231212214737.230115-1-arnd@kernel.org/
Signed-off-by: Macpaul Lin <macpaul.lin@mediatek.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Arnd Bergmann [Wed, 20 Dec 2023 12:04:38 +0000 (12:04 +0000)]
Merge tag 'am3-usb-hang-fix-signed' of git://git./linux/kernel/git/tmlind/linux-omap into arm/fixes
Fix for occasional boot hang for am335x USB
A fix for occasional boot hang for am335x USB that I've only recently
started noticing.
This can be merged naturally whenever suitable. This issue has been seen
with other similar SoCs earlier and has clearly existed for a long time.
* tag 'am3-usb-hang-fix-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: dts: Fix occasional boot hang for am3 usb
Link: https://lore.kernel.org/r/pull-1703071616-395333@atomide.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Arnd Bergmann [Wed, 20 Dec 2023 12:02:25 +0000 (12:02 +0000)]
Merge tag 'omap-for-v6.7/fixes-signed' of git://git./linux/kernel/git/tmlind/linux-omap into arm/fixes
Fixes for omaps
A few fixes for omaps:
- A regression fix for ti-sysc interconnect target module driver to not access
registers after reset if srst_udelay quirk is needed
- DRA7 L3 NoC node register size fix
* tag 'omap-for-v6.7/fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: OMAP2+: Fix null pointer dereference and memory leak in omap_soc_device_init
ARM: dts: dra7: Fix DRA7 L3 NoC node register size
bus: ti-sysc: Flush posted write only after srst_udelay
Link: https://lore.kernel.org/r/pull-1702037799-781982@atomide.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
David S. Miller [Wed, 20 Dec 2023 11:50:13 +0000 (11:50 +0000)]
Merge branch 'net-sched-tc-drop-reason'
Victor Nogueira says:
====================
net: sched: Make tc-related drop reason more flexible for remaining qdiscs
This patch builds on Daniel's patch[1] to add initial support of tc drop
reason. The main goal is to distinguish between policy and error drops for
the remainder of the egress qdiscs (other than clsact).
The drop reason is set by cls_api and act_api in the tc skb cb in case
any error occurred in the data path.
Also add new skb drop reasons that are idiosyncratic to TC.
[1] https://lore.kernel.org/all/20231009092655.22025-1-daniel@iogearbox.net
Changes in V5:
- Drop "EXT_" from cookie error's drop reason name in doc
Changes in V4:
- Condense all the cookie drop reasons into one
Changes in V3:
- Removed duplicate assignment
- Rename function tc_skb_cb_drop_reason to tcf_get_drop_reason
- Move zone field upwards in struct tc_skb_cb to move hole to the end of
the struct
Changes in V2:
- Dropped RFC tag
- Removed check for drop reason being overwritten by filter in cls_api.c
- Simplified logic and removed function tcf_init_drop_reason
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Victor Nogueira [Sat, 16 Dec 2023 20:44:36 +0000 (17:44 -0300)]
net: sched: Add initial TC error skb drop reasons
Continue expanding Daniel's patch by adding new skb drop reasons that
are idiosyncratic to TC.
More specifically:
- SKB_DROP_REASON_TC_COOKIE_ERROR: An error occurred whilst
processing a tc ext cookie.
- SKB_DROP_REASON_TC_CHAIN_NOTFOUND: tc chain lookup failed.
- SKB_DROP_REASON_TC_RECLASSIFY_LOOP: tc exceeded max reclassify loop
iterations
Signed-off-by: Victor Nogueira <victor@mojatatu.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Victor Nogueira [Sat, 16 Dec 2023 20:44:35 +0000 (17:44 -0300)]
net: sched: Make tc-related drop reason more flexible for remaining qdiscs
Incrementing on Daniel's patch[1], make tc-related drop reason more
flexible for remaining qdiscs - that is, all qdiscs aside from clsact.
In essence, the drop reason will be set by cls_api and act_api in case
any error occurred in the data path. With that, we can give the user more
detailed information so that they can distinguish between a policy drop
or an error drop.
[1] https://lore.kernel.org/all/20231009092655.22025-1-daniel@iogearbox.net
Signed-off-by: Victor Nogueira <victor@mojatatu.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Victor Nogueira [Sat, 16 Dec 2023 20:44:34 +0000 (17:44 -0300)]
net: sched: Move drop_reason to struct tc_skb_cb
Move drop_reason from struct tcf_result to skb cb - more specifically to
struct tc_skb_cb. With that, we'll be able to also set the drop reason for
the remaining qdiscs (aside from clsact) that do not have access to
tcf_result when time comes to set the skb drop reason.
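A minimal userspace sketch of the idea, assuming a simplified packet type with a 48-byte control block: the classifier records the reason in the per-packet scratch area, and a qdisc that never sees a tcf_result reads it back from there. All names (skb_stub, tc_cb_stub, drop_reason_stub, set_drop_reason, get_drop_reason, classify_stub, enqueue_stub) are hypothetical and the enum values are illustrative.

```c
#include <stdint.h>

/* Hypothetical subset of drop reasons. */
enum drop_reason_stub {
	DROP_NOT_SPECIFIED = 0,
	DROP_QDISC_POLICY,        /* the qdisc chose to drop (policy) */
	DROP_TC_CHAIN_NOTFOUND,   /* error: tc chain lookup failed */
};

/* tc's private view of the per-packet control block. */
struct tc_cb_stub {
	uint32_t drop_reason;
};

/* Simplified packet: a scratch area shared by whichever layer owns the packet. */
struct skb_stub {
	union {
		unsigned char raw[48];
		struct tc_cb_stub tc;
	} cb;
};

void set_drop_reason(struct skb_stub *skb, enum drop_reason_stub reason)
{
	skb->cb.tc.drop_reason = (uint32_t)reason;
}

enum drop_reason_stub get_drop_reason(const struct skb_stub *skb)
{
	return (enum drop_reason_stub)skb->cb.tc.drop_reason;
}

/* Classifier stand-in: on an error in the data path it records why. */
int classify_stub(struct skb_stub *skb, int chain_exists)
{
	if (!chain_exists) {
		set_drop_reason(skb, DROP_TC_CHAIN_NOTFOUND);
		return -1;
	}
	return 0;
}

/* Qdisc stand-in: it has no tcf_result, only the packet itself. */
enum drop_reason_stub enqueue_stub(struct skb_stub *skb, int chain_exists,
				   int over_limit)
{
	set_drop_reason(skb, DROP_QDISC_POLICY);   /* default: policy drop */
	if (classify_stub(skb, chain_exists) < 0)
		return get_drop_reason(skb);       /* error drop, set by classifier */
	if (over_limit)
		return get_drop_reason(skb);       /* policy drop */
	return DROP_NOT_SPECIFIED;                 /* packet accepted */
}
```

Because the reason travels with the packet, any layer that ends up dropping it can report whether the cause was policy or an error.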
Signed-off-by: Victor Nogueira <victor@mojatatu.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Heiner Kallweit [Sat, 16 Dec 2023 19:58:10 +0000 (20:58 +0100)]
r8169: add support for LED's on RTL8168/RTL8101
This adds support for the LEDs on most chip versions. Excluded are
the old non-PCIe versions and RTL8125. RTL8125 has a different LED
register layout; support for it will follow later.
LEDs can be controlled from userspace using the netdev LED trigger.
Tested on RTL8168h.
Note: The driver can't know which LEDs are actually physically
wired. Therefore not every LED device may represent a physically
available LED.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 20 Dec 2023 11:27:21 +0000 (11:27 +0000)]
Merge branch 'bridge-mdb-bulk-delete'
Ido Schimmel says:
====================
Add MDB bulk deletion support
This patchset adds MDB bulk deletion support, allowing user space to
request the deletion of matching entries instead of dumping the entire
MDB and issuing a separate deletion request for each matching entry.
Support is added in both the bridge and VXLAN drivers in a similar
fashion to the existing FDB bulk deletion support.
The parameters according to which bulk deletion can be performed are
similar to the FDB ones, namely: Destination port, VLAN ID, state (e.g.,
"permanent"), routing protocol, source / destination VNI, destination IP
and UDP port. Flushing based on flags (e.g., "offload", "fast_leave",
"added_by_star_ex", "blocked") is not currently supported, but can be
added in the future, if a use case arises.
Patch #1 adds a new uAPI attribute to allow specifying the state mask
according to which bulk deletion will be performed, if any.
Patch #2 adds a new policy according to which bulk deletion requests
(with 'NLM_F_BULK' flag set) will be parsed.
Patches #3-#4 add a new NDO for MDB bulk deletion and invoke it from the
rtnetlink code when a bulk deletion request is made.
Patches #5-#6 implement the MDB bulk deletion NDO in the bridge and
VXLAN drivers, respectively.
Patch #7 allows user space to issue MDB bulk deletion requests by no
longer rejecting the 'NLM_F_BULK' flag when it is set in 'RTM_DELMDB'
requests.
Patches #8-#9 add selftests for both drivers, for both good and bad
flows.
iproute2 changes can be found here [1].
[1] https://github.com/idosch/iproute2/tree/submit/mdb_flush_v1
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
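The idea behind the patchset, deleting every entry that matches a filter in one request instead of one request per entry, can be sketched in userspace C as below. The table layout, the filter struct and the function names are hypothetical, and treating a zero VLAN ID as "do not filter on VLAN" is an assumption of this sketch; only the destination port and VLAN ID parameters are taken from the cover letter.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical MDB entry and flush filter. A zero field in the filter
 * means "do not filter on this parameter". */
struct sketch_mdb_entry {
	int port_ifindex;
	uint16_t vid;
};

struct sketch_mdb_flush_desc {
	int port_ifindex;
	uint16_t vid;
};

static int sketch_mdb_match(const struct sketch_mdb_entry *e,
			    const struct sketch_mdb_flush_desc *d)
{
	if (d->port_ifindex && e->port_ifindex != d->port_ifindex)
		return 0;
	if (d->vid && e->vid != d->vid)
		return 0;
	return 1;
}

/* Remove all matching entries in place and return how many remain. */
static size_t sketch_mdb_flush(struct sketch_mdb_entry *tbl, size_t n,
			       const struct sketch_mdb_flush_desc *d)
{
	size_t kept = 0;

	for (size_t i = 0; i < n; i++) {
		if (!sketch_mdb_match(&tbl[i], d))
			tbl[kept++] = tbl[i];
	}
	return kept;
}
```

A filter with all fields zero matches every entry, so it empties the table in a single call.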
Ido Schimmel [Sun, 17 Dec 2023 08:32:44 +0000 (10:32 +0200)]
selftests: vxlan_mdb: Add MDB bulk deletion test
Add test cases to verify the behavior of the MDB bulk deletion
functionality in the VXLAN driver.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Petr Machata <petrm@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Sun, 17 Dec 2023 08:32:43 +0000 (10:32 +0200)]
selftests: bridge_mdb: Add MDB bulk deletion test
Add test cases to verify the behavior of the MDB bulk deletion
functionality in the bridge driver.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Sun, 17 Dec 2023 08:32:42 +0000 (10:32 +0200)]
rtnetlink: bridge: Enable MDB bulk deletion
Now that both the common code as well as individual drivers support MDB
bulk deletion, allow user space to make such requests.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Sun, 17 Dec 2023 08:32:41 +0000 (10:32 +0200)]
vxlan: mdb: Add MDB bulk deletion support
Implement MDB bulk deletion support in the VXLAN driver, allowing MDB
entries to be deleted in bulk according to provided parameters.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Sun, 17 Dec 2023 08:32:40 +0000 (10:32 +0200)]
bridge: mdb: Add MDB bulk deletion support
Implement MDB bulk deletion support in the bridge driver, allowing MDB
entries to be deleted in bulk according to provided parameters.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Sun, 17 Dec 2023 08:32:39 +0000 (10:32 +0200)]
rtnetlink: bridge: Invoke MDB bulk deletion when needed
Invoke the new MDB bulk deletion device operation when the 'NLM_F_BULK'
flag is set in the netlink message header.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Sun, 17 Dec 2023 08:32:38 +0000 (10:32 +0200)]
net: Add MDB bulk deletion device operation
Add MDB net device operation that will be invoked by rtnetlink code in
response to received 'RTM_DELMDB' messages with the 'NLM_F_BULK' flag
set. Subsequent patches will implement the operation in the bridge and
VXLAN drivers.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Sun, 17 Dec 2023 08:32:37 +0000 (10:32 +0200)]
rtnetlink: bridge: Use a different policy for MDB bulk delete
For MDB bulk delete we will need to validate 'MDBA_SET_ENTRY'
differently compared to regular delete. Specifically, allow the ifindex
to be zero (in case not filtering on bridge port) and force the address
to be zero as bulk delete based on address is not supported.
Do that by introducing a new policy and choosing the correct policy
based on the presence of the 'NLM_F_BULK' flag in the netlink message
header. Use nlmsg_parse() for strict validation.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
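The validation difference described above can be sketched as a function that picks its rules from the bulk flag. The entry layout, the function names and the -1 error value are hypothetical. SKETCH_NLM_F_BULK is a local stand-in for 'NLM_F_BULK', with its value assumed to mirror the uAPI header, and the rule that a regular delete needs a non-zero ifindex is inferred from the commit text rather than stated in it.

```c
#include <stddef.h>

#define SKETCH_NLM_F_BULK 0x200	/* stand-in for NLM_F_BULK; value is an assumption */

/* Hypothetical, simplified view of the entry carried in the request. */
struct sketch_mdb_entry {
	int ifindex;
	unsigned char addr[16];
};

static int sketch_addr_is_zero(const struct sketch_mdb_entry *e)
{
	for (size_t i = 0; i < sizeof(e->addr); i++) {
		if (e->addr[i])
			return 0;
	}
	return 1;
}

/* Returns 0 if the entry is acceptable for this kind of delete, -1 otherwise. */
static int sketch_validate_del(unsigned int nlmsg_flags,
			       const struct sketch_mdb_entry *e)
{
	if (nlmsg_flags & SKETCH_NLM_F_BULK) {
		/* Bulk: ifindex may be zero, the address must be zero. */
		return sketch_addr_is_zero(e) ? 0 : -1;
	}
	/* Regular delete (assumed rule): a specific port is required. */
	return e->ifindex != 0 ? 0 : -1;
}
```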
Ido Schimmel [Sun, 17 Dec 2023 08:32:36 +0000 (10:32 +0200)]
bridge: add MDB state mask uAPI attribute
Currently, the 'state' field in 'struct br_port_msg' can be set to 1 if
the MDB entry is permanent or 0 if it is temporary. Additional states
might be added in the future.
In a similar fashion to 'NDA_NDM_STATE_MASK', add an MDB state mask uAPI
attribute that will allow the upcoming bulk deletion API to bulk delete
MDB entries with a certain state or any state.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
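One plausible reading of the state mask, sketched below, is that only the bits set in the mask take part in the comparison, so a zero mask matches any state. The constants and the function name are hypothetical; the values 1 for permanent and 0 for temporary follow the commit text, while the exact comparison used by the kernel is an assumption.

```c
#include <stdint.h>

#define SKETCH_MDB_TEMPORARY 0	/* per the commit text: 0 means temporary */
#define SKETCH_MDB_PERMANENT 1	/* per the commit text: 1 means permanent */

/* Compare only the bits selected by state_mask; a mask of 0 matches any state. */
static int sketch_state_matches(uint8_t entry_state, uint8_t state,
				uint8_t state_mask)
{
	return (entry_state & state_mask) == (state & state_mask);
}
```

With a mask of 1 and a state of 1 only permanent entries match, with a mask of 1 and a state of 0 only temporary entries match, and with a mask of 0 both match.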
Lai Peter Jun Ann [Mon, 18 Dec 2023 07:51:32 +0000 (15:51 +0800)]
net: stmmac: fix incorrect flag check in timestamp interrupt
The driver should continue to get the timestamp if the
STMMAC_FLAG_EXT_SNAPSHOT_EN flag is set.
Fixes: aa5513f5d95f ("net: stmmac: replace the ext_snapshot_en field with a flag")
Cc: <stable@vger.kernel.org> # 6.6
Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com>
Signed-off-by: Lai Peter Jun Ann <jun.ann.lai@intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
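A small sketch of how such a check can go wrong when a boolean field is replaced by a flag bit, assuming the bug was an inverted early-exit test. The bit position and both function names are hypothetical; only the flag's role (keep reading the timestamp when it is set) comes from the commit message.

```c
#define SKETCH_FLAG_EXT_SNAPSHOT_EN (1u << 0)	/* hypothetical bit position */

/* Assumed buggy form: bails out exactly when the flag IS set. */
static int sketch_skip_timestamp_buggy(unsigned int flags)
{
	return (flags & SKETCH_FLAG_EXT_SNAPSHOT_EN) != 0;
}

/* Intended form: bails out only when the flag is NOT set,
 * so the timestamp is still read when the flag is set. */
static int sketch_skip_timestamp_fixed(unsigned int flags)
{
	return !(flags & SKETCH_FLAG_EXT_SNAPSHOT_EN);
}
```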
Wang Jinchao [Mon, 18 Dec 2023 07:04:07 +0000 (15:04 +0800)]
octeontx2-af: insert space after include
Maintain consistent formatting: insert a space after #include.
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Wang Jinchao <wangjinchao@xfusion.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 20 Dec 2023 11:12:12 +0000 (11:12 +0000)]
Merge tag 'for-net-2023-12-15' of git://git./linux/kernel/git/bluetooth/bluetooth
Luiz Augusto von Dentz says:
====================
bluetooth pull request for net:
- Add encryption key size check when acting as peripheral
- Shut up false-positive build warning
- Send reject if L2CAP command request is corrupted
- Fix Use-After-Free in bt_sock_recvmsg
- Fix not notifying when connection encryption changes
- Fix not checking if HCI_OP_INQUIRY has been sent
- Fix address type send over to the MGMT interface
- Fix deadlock in vhci_send_frame
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Kent Overstreet [Tue, 19 Dec 2023 22:16:34 +0000 (17:16 -0500)]
bcachefs: Fix bch2_alloc_sectors_start_trans() error handling
When we fail to allocate because of insufficient open buckets, we don't
want to retry from the full set of devices - we just want to retry in
blocking mode.
But if the retry in blocking mode fails with a different error code, we
end up squashing the -BCH_ERR_open_buckets_empty error with an error
that makes us think we won't be able to allocate (insufficient_devices)
- which is incorrect when we didn't try to allocate from the full set of
devices, and causes the write to fail.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 18 Dec 2023 04:31:26 +0000 (23:31 -0500)]
bcachefs: guard against overflow in btree node split
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 18 Dec 2023 04:20:59 +0000 (23:20 -0500)]
bcachefs: btree_node_u64s_with_format() takes nr keys
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Linus Torvalds [Tue, 19 Dec 2023 20:25:43 +0000 (12:25 -0800)]
Merge tag 'trace-v6.7-rc6' of git://git./linux/kernel/git/trace/linux-trace
Pull tracing fix from Steven Rostedt:
"While working on the ring buffer, I found one more bug with the
timestamp code, and the fix for this removed the need for the final
64-bit cmpxchg!
The ring buffer events hold a "delta" from the previous event. If it
is determined that the delta can not be calculated, it falls back to
adding an absolute timestamp value. The way to know if the delta can
be used is via two stored timestamps in the per-cpu buffer meta data:
before_stamp and write_stamp
The before_stamp is written by every event before it tries to allocate
its space on the ring buffer. The write_stamp is written after it
allocates its space and knows that nothing came in after it read the
previous before_stamp and write_stamp and the two matched.
A previous fix
dd9394257078 ("ring-buffer: Do not try to put back
write_stamp") removed putting back the write_stamp to match the
before_stamp so that the next event could use the delta, but races
were found where the two would match, but not be for the previous
event.
It was determined to allow the event reservation to not have a valid
write_stamp when it is finished, and this fixed a lot of races.
The last use of the 64-bit timestamp cmpxchg depended on the
write_stamp being valid after an interruption. But this is no longer
the case, as if an event is interrupted by a softirq that writes an
event, and that event gets interrupted by a hardirq or NMI and that
writes an event, then the softirq could finish its reservation without
a valid write_stamp.
In the slow path of the event reservation, a delta can still be used
if the write_stamp is valid. Instead of using a cmpxchg against the
write stamp, the before_stamp needs to be read again to validate the
write_stamp. The cmpxchg is not needed.
This updates the slowpath to validate the write_stamp by comparing it
to the before_stamp and removes all rb_time_cmpxchg() as there are no
more users of that function.
The removal of the 32-bit updates of rb_time_t will be done in the
next merge window"
* tag 'trace-v6.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
ring-buffer: Fix slowpath of interrupted event
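The validation described in the pull request, using a delta only when write_stamp still agrees with before_stamp, can be sketched in single-threaded C as below. This is a simplified model under stated assumptions: the struct and function names are hypothetical, there is no concurrency or memory ordering, and the extra check that the stamp is not ahead of the current time is an assumption added to keep the subtraction meaningful. It is not the ring buffer's actual slow path.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-CPU metadata holding the two stored timestamps. */
struct sketch_stamps {
	uint64_t before_stamp;	/* written before an event reserves space */
	uint64_t write_stamp;	/* written after the reservation succeeded */
};

/* Decide whether the event at time ts may store a delta from the previous
 * event. Returns true and fills *delta when write_stamp is valid; returns
 * false when the caller should fall back to an absolute timestamp. */
static bool sketch_event_delta(const struct sketch_stamps *s, uint64_t ts,
			       uint64_t *delta)
{
	/* A mismatch means another event came in between, so write_stamp
	 * cannot be trusted to describe the previous event. */
	if (s->before_stamp != s->write_stamp)
		return false;

	/* Assumed sanity check: never compute a negative delta. */
	if (s->write_stamp > ts)
		return false;

	*delta = ts - s->write_stamp;
	return true;
}
```

The point of the sketch is that a plain re-read and comparison is enough to validate write_stamp; no compare-and-exchange on the 64-bit stamp appears anywhere in it.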