Samuel Thibault [Mon, 6 May 2024 21:53:35 +0000 (23:53 +0200)]
l2tp: Support several sockets with same IP/port quadruple
Some l2tp providers will use 1701 as origin port and open several
tunnels for the same origin and target. On the Linux side, this
may mean opening several sockets, but then trafic will go to only
one of them, losing the trafic for the tunnel of the other socket
(or leaving it up to userland, consuming a lot of cpu%).
This can also happen when the l2tp provider uses a cluster, and
load-balancing happens to migrate from one origin IP to another one,
for which a socket was already established. Managing reassigning
tunnels from one socket to another would be very hairy for userland.
Lastly, as documented in l2tpconfig(1), as client it may be necessary
to use 1701 as origin port for odd firewalls reasons, which could
prevent from establishing several tunnels to a l2tp server, for the
same reason: trafic would get only on one of the two sockets.
With the V2 protocol it is however easy to route trafic to the proper
tunnel, by looking up the tunnel number in the network namespace. This
fixes the three cases altogether.
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Link: https://lore.kernel.org/r/20240506215336.1470009-1-samuel.thibault@ens-lyon.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jakub Kicinski [Thu, 9 May 2024 02:09:38 +0000 (19:09 -0700)]
Merge tag 'wireless-next-2024-05-08' of git://git./linux/kernel/git/wireless/wireless-next
Kalle Valo says:
====================
wireless-next patches for v6.10
The third, and most likely the last, "new features" pull request for
v6.10 with changes both in stack and in drivers. In ath12k and rtw89
we disabled Wireless Extensions just like with iwlwifi earlier. Wi-Fi
7 devices will not support Wireless Extensions (WEXT) anymore so if
someone is still using the legacy WEXT interface it's time to switch
to nl80211 now!
We merged wireless into wireless-next as we decided not to send a
wireless pull request to v6.9 this late in the cycle. Also an
immutable branch with MHI subsystem was merged to get ath11k and
ath12k hibernation working.
Major changes:
mac80211/cfg80211
* handle color change per link
mt76
* mt7921 LED control
* mt7925 EHT radiotap support
* mt7920e PCI support
ath12k
* debugfs support
* dfs_simulate_radar debugfs file
* disable Wireless Extensions
* suspend and hibernation support
* ACPI support
* refactoring in preparation of multi-link support
ath11k
* support hibernation (required changes in qrtr and MHI subsystems)
* ieee80211-freq-limit Device Tree property support
ath10k
* firmware-name Device Tree property support
rtw89
* complete features of new WiFi 7 chip 8922AE including BT-coexistence
and WoWLAN
* use BIOS ACPI settings to set TX power and channels
* disable Wireless Extensios on Wi-Fi 7 devices
iwlwifi
* block_esr debugfs file
* support again firmware API 90 (was reverted earlier)
* provide channel survey information for Automatic Channel Selection (ACS)
* tag 'wireless-next-2024-05-08' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (214 commits)
wifi: mwl8k: initialize cmd->addr[] properly
wifi: iwlwifi: Ensure prph_mac dump includes all addresses
wifi: iwlwifi: mvm: don't request statistics in restart
wifi: iwlwifi: mvm: exit EMLSR if secondary link is not used
wifi: iwlwifi: mvm: add beacon template version 14
wifi: iwlwifi: mvm: align UATS naming with firmware
wifi: iwlwifi: Force SCU_ACTIVE for specific platforms
wifi: iwlwifi: mvm: record and return channel survey information
wifi: iwlwifi: mvm: add the firmware API for channel survey
wifi: iwlwifi: mvm: Fix race in scan completion
wifi: iwlwifi: mvm: Add a print for invalid link pair due to bandwidth
wifi: iwlwifi: mvm: add a debugfs for reading EMLSR blocking reasons
wifi: iwlwifi: mvm: Add active EMLSR blocking reasons prints
wifi: iwlwifi: bump FW API to 90 for BZ/SC devices
wifi: iwlwifi: mvm: fix primary link setting
wifi: iwlwifi: mvm: use already determined cmd_id
wifi: iwlwifi: mvm: don't reset link selection during restart
wifi: iwlwifi: Print EMLSR states name
wifi: iwlwifi: mvm: Block EMLSR when a p2p/softAP vif is active
wifi: iwlwifi: mvm: fix typo in debug print
...
====================
Link: https://lore.kernel.org/r/20240508120726.85A10C113CC@smtp.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 9 May 2024 01:59:51 +0000 (18:59 -0700)]
Merge branch 'netdevsim-add-napi-support'
David Wei says:
====================
netdevsim: add NAPI support
Add NAPI support to netdevsim and register its Rx queues with NAPI
instances. Then add a selftest using the new netdev Python selftest
infra to exercise the existing Netdev Netlink API, specifically the
queue-get API.
This expands test coverage and further fleshes out netdevsim as a test
device. It's still my goal to make it useful for testing things like
flow steering and ZC Rx.
====================
Link: https://lore.kernel.org/r/20240507163228.2066817-1-dw@davidwei.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
David Wei [Tue, 7 May 2024 16:32:28 +0000 (09:32 -0700)]
net: selftest: add test for netdev netlink queue-get API
Add a selftest for netdev generic netlink. For now there is only a
single test that exercises the `queue-get` API.
The test works with netdevsim by default or with a real device by
setting NETIF.
Add a timeout param to cmd() since ethtool -L can take a long time on
real devices.
Signed-off-by: David Wei <dw@davidwei.uk>
Link: https://lore.kernel.org/r/20240507163228.2066817-3-dw@davidwei.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
David Wei [Tue, 7 May 2024 16:32:27 +0000 (09:32 -0700)]
netdevsim: add NAPI support
Add NAPI support to netdevim, similar to veth.
* Add a nsim_rq rx queue structure to hold a NAPI instance and a skb
queue.
* During xmit, store the skb in the peer skb queue and schedule NAPI.
* During napi_poll(), drain the skb queue and pass up the stack.
* Add assoc between rxq and NAPI instance using netif_queue_set_napi().
Signed-off-by: David Wei <dw@davidwei.uk>
Link: https://lore.kernel.org/r/20240507163228.2066817-2-dw@davidwei.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Willem de Bruijn [Tue, 7 May 2024 15:40:58 +0000 (11:40 -0400)]
selftests: drv-net: add checksum tests
Run tools/testing/selftest/net/csum.c as part of drv-net.
This binary covers multiple scenarios, based on arguments given,
for both IPv4 and IPv6:
- Accept UDP correct checksum
- Detect UDP invalid checksum
- Accept TCP correct checksum
- Detect TCP invalid checksum
- Transmit UDP: basic checksum offload
- Transmit UDP: zero checksum conversion
The test direction is reversed between receive and transmit tests, so
that the NIC under test is always the local machine.
In total this adds up to 12 testcases, with more to follow. For
conciseness, I replaced individual functions with a function factory.
Also detect hardware offload feature availability using Ethtool
netlink and skip tests when either feature is off. This need may be
common for offload feature tests and eventually deserving of a thin
wrapper in lib.py.
Missing are the PF_PACKET based send tests ('-P'). These use
virtio_net_hdr to program hardware checksum offload. Which requires
looking up the local MAC address and (harder) the MAC of the next hop.
I'll have to give it some though how to do that robustly and where
that code would belong.
Tested:
make -C tools/testing/selftests/ \
TARGETS="drivers/net drivers/net/hw" \
install INSTALL_PATH=/tmp/ksft
cd /tmp/ksft
sudo NETIF=ens4 REMOTE_TYPE=ssh \
REMOTE_ARGS="root@10.40.0.2" \
LOCAL_V4="10.40.0.1" \
REMOTE_V4="10.40.0.2" \
./run_kselftest.sh -t drivers/net/hw:csum.py
Signed-off-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20240507154216.501111-1-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Tue, 7 May 2024 12:17:48 +0000 (12:17 +0000)]
phonet: no longer hold RTNL in route_dumpit()
route_dumpit() already relies on RCU, RTNL is not needed.
Also change return value at the end of a dump.
This allows NLMSG_DONE to be appended to the current
skb at the end of a dump, saving a couple of recvmsg()
system calls.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Remi Denis-Courmont <courmisch@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240507121748.416287-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Tue, 7 May 2024 18:41:44 +0000 (18:41 +0000)]
net: annotate data-races around dev->if_port
Various ndo_set_config() methods can change dev->if_port
dev->if_port is going to be read locklessly from
rtnl_fill_link_ifmap().
Add corresponding WRITE_ONCE() on writer sides.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240507184144.1230469-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Tue, 7 May 2024 13:27:17 +0000 (13:27 +0000)]
net: dst_cache: minor optimization in dst_cache_set_ip6()
There is no need to use this_cpu_ptr(dst_cache->cache) twice.
Compiler is unable to optimize the second call, because of
per-cpu constraints.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/r/20240507132717.627518-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Tue, 7 May 2024 13:20:00 +0000 (13:20 +0000)]
net: dst_cache: annotate data-races around dst_cache->reset_ts
dst_cache->reset_ts is read or written locklessly,
add READ_ONCE() and WRITE_ONCE() annotations.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/r/20240507132000.614591-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Donald Hunter [Tue, 7 May 2024 10:36:03 +0000 (11:36 +0100)]
netlink/specs: Add VF attributes to rt_link spec
Add support for retrieving VFs as part of link info. For example:
./tools/net/ynl/cli.py --spec Documentation/netlink/specs/rt_link.yaml \
--do getlink --json '{"ifi-index": 38, "ext-mask": ["vf", "skip-stats"]}'
{'address': 'b6:75:91:f2:64:65',
[snip]
'vfinfo-list': {'info': [{'broadcast': b'\xff\xff\xff\xff\xff\xff\x00\x00'
b'\x00\x00\x00\x00\x00\x00\x00\x00'
b'\x00\x00\x00\x00\x00\x00\x00\x00'
b'\x00\x00\x00\x00\x00\x00\x00\x00',
'link-state': {'link-state': 'auto', 'vf': 0},
'mac': {'mac': b'\x00\x00\x00\x00\x00\x00\x00\x00'
b'\x00\x00\x00\x00\x00\x00\x00\x00'
b'\x00\x00\x00\x00\x00\x00\x00\x00'
b'\x00\x00\x00\x00\x00\x00\x00\x00',
'vf': 0},
'rate': {'max-tx-rate': 0,
'min-tx-rate': 0,
'vf': 0},
'rss-query-en': {'setting': 0, 'vf': 0},
'spoofchk': {'setting': 0, 'vf': 0},
'trust': {'setting': 0, 'vf': 0},
'tx-rate': {'rate': 0, 'vf': 0},
'vlan': {'qos': 0, 'vf': 0, 'vlan': 0},
'vlan-list': {'info': [{'qos': 0,
'vf': 0,
'vlan': 0,
'vlan-proto': 0}]}},
{'broadcast': b'\xff\xff\xff\xff\xff\xff\x00\x00'
b'\x00\x00\x00\x00\x00\x00\x00\x00'
b'\x00\x00\x00\x00\x00\x00\x00\x00'
b'\x00\x00\x00\x00\x00\x00\x00\x00',
'link-state': {'link-state': 'auto', 'vf': 1},
'mac': {'mac': b'\x00\x00\x00\x00\x00\x00\x00\x00'
b'\x00\x00\x00\x00\x00\x00\x00\x00'
b'\x00\x00\x00\x00\x00\x00\x00\x00'
b'\x00\x00\x00\x00\x00\x00\x00\x00',
'vf': 1},
'rate': {'max-tx-rate': 0,
'min-tx-rate': 0,
'vf': 1},
'rss-query-en': {'setting': 0, 'vf': 1},
'spoofchk': {'setting': 0, 'vf': 1},
'trust': {'setting': 0, 'vf': 1},
'tx-rate': {'rate': 0, 'vf': 1},
'vlan': {'qos': 0, 'vf': 1, 'vlan': 0},
'vlan-list': {'info': [{'qos': 0,
'vf': 1,
'vlan': 0,
'vlan-proto': 0}]}}]},
'xdp': {'attached': 0}}
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://lore.kernel.org/r/20240507103603.23017-1-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alexandru Gagniuc [Tue, 7 May 2024 02:47:57 +0000 (21:47 -0500)]
dt-bindings: net: ipq4019-mdio: add IPQ9574 compatible
Add a compatible property specific to IPQ9574. This should be used
along with the IPQ4019 compatible. This second compatible serves the
same purpose as the ipq{5,6,8} compatibles. This is to indicate that
the clocks properties are required.
Signed-off-by: Alexandru Gagniuc <mr.nuke.me@gmail.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20240507024758.2810514-1-mr.nuke.me@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Lukasz Majewski [Tue, 7 May 2024 09:11:55 +0000 (11:11 +0200)]
test: hsr: Call cleanup_all_ns when hsr_redbox.sh script exits
Without this change the created netns instances are not cleared after
this script execution. To fix this problem the cleanup_all_ns function
from ../lib.sh is called.
Signed-off-by: Lukasz Majewski <lukma@denx.de>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joel Granados [Tue, 7 May 2024 08:40:39 +0000 (10:40 +0200)]
ax25: Remove superfuous "return" from ax25_ds_set_timer
Remove the explicit call to "return" in the void ax25_ds_set_timer
function that was introduced in
78a7b5dbc060 ("ax.25: x.25: Remove the
now superfluous sentinel elements from ctl_table array").
Signed-off-by: Joel Granados <j.granados@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Mikhalitsyn [Mon, 6 May 2024 14:14:44 +0000 (16:14 +0200)]
ipvs: allow some sysctls in non-init user namespaces
Let's make all IPVS sysctls writtable even when
network namespace is owned by non-initial user namespace.
Let's make a few sysctls to be read-only for non-privileged users:
- sync_qlen_max
- sync_sock_size
- run_estimation
- est_cpulist
- est_nice
I'm trying to be conservative with this to prevent
introducing any security issues in there. Maybe,
we can allow more sysctls to be writable, but let's
do this on-demand and when we see real use-case.
This patch is motivated by user request in the LXC
project [1]. Having this can help with running some
Kubernetes [2] or Docker Swarm [3] workloads inside the system
containers.
Link: https://github.com/lxc/lxc/issues/4278
Link: https://github.com/kubernetes/kubernetes/blob/b722d017a34b300a2284b890448e5a605f21d01e/pkg/proxy/ipvs/proxier.go#L103
Link: https://github.com/moby/libnetwork/blob/3797618f9a38372e8107d8c06f6ae199e1133ae8/osl/namespace_linux.go#L682
Cc: Julian Anastasov <ja@ssi.bg>
Cc: Simon Horman <horms@verge.net.au>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Jozsef Kadlecsik <kadlec@netfilter.org>
Cc: Florian Westphal <fw@strlen.de>
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Mikhalitsyn [Mon, 6 May 2024 14:14:43 +0000 (16:14 +0200)]
ipvs: add READ_ONCE barrier for ipvs->sysctl_amemthresh
Cc: Julian Anastasov <ja@ssi.bg>
Cc: Simon Horman <horms@verge.net.au>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Jozsef Kadlecsik <kadlec@netfilter.org>
Cc: Florian Westphal <fw@strlen.de>
Suggested-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christian Marangi [Mon, 6 May 2024 12:32:46 +0000 (14:32 +0200)]
net: stmmac: dwmac-ipq806x: account for rgmii-txid/rxid/id phy-mode
Currently the ipq806x dwmac driver is almost always used attached to the
CPU port of a switch and phy-mode was always set to "rgmii" or "sgmii".
Some device came up with a special configuration where the PHY is
directly attached to the GMAC port and in those case phy-mode needs to
be set to "rgmii-id" to make the PHY correctly work and receive packets.
Since the driver supports only "rgmii" and "sgmii" mode, when "rgmii-id"
(or variants) mode is set, the mode is rejected and probe fails.
Add support also for these phy-modes to correctly setup PHYs that requires
delay applied to tx/rx.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Oleksij Rempel [Mon, 6 May 2024 10:32:05 +0000 (12:32 +0200)]
net: bridge: switchdev: Improve error message for port_obj_add/del functions
Enhance the error reporting mechanism in the switchdev framework to
provide more informative and user-friendly error messages.
Following feedback from users struggling to understand the implications
of error messages like "failed (err=-28) to add object (id=2)", this
update aims to clarify what operation failed and how this might impact
the system or network.
With this change, error messages now include a description of the failed
operation, the specific object involved, and a brief explanation of the
potential impact on the system. This approach helps administrators and
developers better understand the context and severity of errors,
facilitating quicker and more effective troubleshooting.
Example of the improved logging:
[ 70.516446] ksz-switch spi0.0 uplink: Failed to add Port Multicast
Database entry (object id=2) with error: -ENOSPC (-28).
[ 70.516446] Failure in updating the port's Multicast Database could
lead to multicast forwarding issues.
[ 70.516446] Current HW/SW setup lacks sufficient resources.
This comprehensive update includes handling for a range of switchdev
object IDs, ensuring that most operations within the switchdev framework
benefit from clearer error reporting.
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peilin He [Tue, 7 May 2024 07:41:03 +0000 (15:41 +0800)]
net/ipv4: add tracepoint for icmp_send
Introduce a tracepoint for icmp_send, which can help users to get more
detail information conveniently when icmp abnormal events happen.
1. Giving an usecase example:
=============================
When an application experiences packet loss due to an unreachable UDP
destination port, the kernel will send an exception message through the
icmp_send function. By adding a trace point for icmp_send, developers or
system administrators can obtain detailed information about the UDP
packet loss, including the type, code, source address, destination address,
source port, and destination port. This facilitates the trouble-shooting
of UDP packet loss issues especially for those network-service
applications.
2. Operation Instructions:
==========================
Switch to the tracing directory.
cd /sys/kernel/tracing
Filter for destination port unreachable.
echo "type==3 && code==3" > events/icmp/icmp_send/filter
Enable trace event.
echo 1 > events/icmp/icmp_send/enable
3. Result View:
================
udp_client_erro-11370 [002] ...s.12 124.728002:
icmp_send: icmp_send: type=3, code=3.
From 127.0.0.1:41895 to 127.0.0.1:6666 ulen=23
skbaddr=
00000000589b167a
Signed-off-by: Peilin He <he.peilin@zte.com.cn>
Signed-off-by: xu xin <xu.xin16@zte.com.cn>
Reviewed-by: Yunkai Zhang <zhang.yunkai@zte.com.cn>
Cc: Yang Yang <yang.yang29@zte.com.cn>
Cc: Liu Chun <liu.chun2@zte.com.cn>
Cc: Xuexin Jiang <jiang.xuexin@zte.com.cn>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 8 May 2024 09:35:11 +0000 (10:35 +0100)]
Merge branch 'ksz-dcb-dscp'
Oleksij Rempel says:
====================
add DCB and DSCP support for KSZ switches
This patch series is aimed at improving support for DCB (Data Center
Bridging) and DSCP (Differentiated Services Code Point) on KSZ switches.
The main goal is to introduce global DSCP and PCP (Priority Code Point)
mapping support, addressing the limitation of KSZ switches not having
per-port DSCP priority mapping. This involves extending the DSA
framework with new callbacks for managing trust settings for global DSCP
and PCP maps. Additionally, we introduce IEEE 802.1q helpers for default
configurations, benefiting other drivers too.
Change logs are in separate patches.
Compared to v6 this series includes some new patches for DSCP global
mapping support and QoS selftest script for KSZ9477 switches.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Oleksij Rempel [Fri, 3 May 2024 13:13:51 +0000 (15:13 +0200)]
selftests: microchip: add test for QoS support on KSZ9477 switch family
Add tests covering following functionality on KSZ9477 switch family:
- default port priority
- global DSCP to Internal Priority Mapping
- apptrust configuration
This script was tested on KSZ9893R
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Oleksij Rempel [Fri, 3 May 2024 13:13:50 +0000 (15:13 +0200)]
net: dsa: microchip: add support DSCP priority mapping
Microchip KSZ and LAN variants do not have per port DSCP priority
configuration. Instead there is a global DSCP mapping table.
This patch provides write access to this global DSCP map. In case entry
is "deleted", we map corresponding DSCP entry to a best effort prio,
which is expected to be the default priority for all untagged traffic.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Oleksij Rempel [Fri, 3 May 2024 13:13:49 +0000 (15:13 +0200)]
net: dsa: add support switches global DSCP priority mapping
Some switches like Microchip KSZ variants do not support per port DSCP
priority configuration. Instead there is a global DSCP mapping table.
To handle it, we will accept set/del request to any of user ports to
make global configuration and update dcb app entries for all other
ports.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Oleksij Rempel [Fri, 3 May 2024 13:13:48 +0000 (15:13 +0200)]
net: dsa: microchip: let DCB code do PCP and DSCP policy configuration
802.1P (PCP) and DiffServ (DSCP) are handled now by DCB code. Let it do
all needed initial configuration.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Oleksij Rempel [Fri, 3 May 2024 13:13:47 +0000 (15:13 +0200)]
net: dsa: microchip: init predictable IPV to queue mapping for all non KSZ8xxx variants
Init priority to queue mapping in the way as it shown in IEEE 802.1Q
mapping example.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Oleksij Rempel [Fri, 3 May 2024 13:13:46 +0000 (15:13 +0200)]
net: dsa: microchip: enable ETS support for KSZ989X variants
I tested ETS support on KSZ9893, so it should work other KSZ989X
variants too, which was till not listed as support.
With this change we now officially not support only ksz8 family of
chips.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Oleksij Rempel [Fri, 3 May 2024 13:13:45 +0000 (15:13 +0200)]
net: dsa: microchip: dcb: add special handling for KSZ88X3 family
KSZ88X3 switches have different behavior on different ports:
- It seems to be not possible to disable VLAN PCP classification on port
2. It means, as soon as mutliqueue support is enabled, frames with
VLAN tag will get PCP prios. This behavior do not affect Port 1 -
it is possible to disable PCP prios.
- DSCP classification is not working on Port 2.
Since there are still usable configuration combinations, I added some
quirks to make sure user will get appropriate error message if not
possible configuration is chosen.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Oleksij Rempel [Fri, 3 May 2024 13:13:44 +0000 (15:13 +0200)]
net: dsa: microchip: add support for different DCB app configurations
Add DCB support to configure app trust sources and default port priority.
Following commands can be used for testing:
dcb apptrust set dev lan1 order pcp dscp
dcb app replace dev lan1 default-prio 3
Since it is not possible to configure DSCP-Prio mapping per port, this
patch provide only ability to read switch global dscp-prio mapping and
way to enable/disable app trust for DSCP.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Oleksij Rempel [Fri, 3 May 2024 13:13:43 +0000 (15:13 +0200)]
net: dsa: microchip: add multi queue support for KSZ88X3 variants
KSZ88X3 switches support up to 4 queues. Rework ksz8795_set_prio_queue()
to support KSZ8795 and KSZ88X3 families of switches.
Per default, configure KSZ88X3 to use one queue, since it need special
handling due to priority related errata. Errata handling is implemented
in a separate patch.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Oleksij Rempel [Fri, 3 May 2024 13:13:42 +0000 (15:13 +0200)]
net: add IEEE 802.1q specific helpers
IEEE 802.1q specification provides recommendation and examples which can
be used as good default values for different drivers.
This patch implements mapping examples documented in IEEE 802.1Q-2022 in
Annex I "I.3 Traffic type to traffic class mapping" and IETF DSCP naming
and mapping DSCP to Traffic Type inspired by RFC8325.
This helpers will be used in followup patches for dsa/microchip DCB
implementation.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Oleksij Rempel [Fri, 3 May 2024 13:13:41 +0000 (15:13 +0200)]
net: dsa: microchip: add IPV information support
Most of Microchip KSZ switches use Internal Priority Value associated
with every frame. For example, it is possible to map any VLAN PCP or
DSCP value to IPV and at the end, map IPV to a queue.
Since amount of IPVs is not equal to amount of queues, add this
information and make use of it in some functions.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Oleksij Rempel [Fri, 3 May 2024 13:13:40 +0000 (15:13 +0200)]
net: dsa: add support for DCB get/set apptrust configuration
Add DCB support to get/set trust configuration for different packet
priority information sources. Some switch allow to chose different
source of packet priority classification. For example on KSZ switches it
is possible to configure VLAN PCP and/or DSCP sources.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Wed, 8 May 2024 02:08:34 +0000 (19:08 -0700)]
Merge branch '100GbE' of git://git./linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2024-05-06 (ice)
This series contains updates to ice driver only.
Paul adds support for additional E830 devices and adjusts naming for
existing E830 devices.
Marcin commonizes a couple of TC setup calls to reduce duplicated code.
Mateusz adds ice_vsi_cfg_params into ice_vsi to consolidate info.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
ice: refactor struct ice_vsi_cfg_params to be inside of struct ice_vsi
ice: Deduplicate tc action setup
ice: update E830 device ids and comments
ice: add additional E830 device ids
====================
Link: https://lore.kernel.org/r/20240506170827.948682-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Mon, 6 May 2024 14:39:39 +0000 (14:39 +0000)]
net: usb: sr9700: stop lying about skb->truesize
Some usb drivers set small skb->truesize and break
core networking stacks.
In this patch, I removed one of the skb->truesize override.
I also replaced one skb_clone() by an allocation of a fresh
and small skb, to get minimally sized skbs, like we did
in commit
1e2c61172342 ("net: cdc_ncm: reduce skb truesize
in rx path") and
4ce62d5b2f7a ("net: usb: ax88179_178a:
stop lying about skb->truesize")
Fixes: c9b37458e956 ("USB2NET : SR9700 : One chip USB 1.1 USB2NET SR9700Device Driver Support")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20240506143939.3673865-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Mon, 6 May 2024 14:23:58 +0000 (14:23 +0000)]
net: usb: smsc75xx: stop lying about skb->truesize
Some usb drivers try to set small skb->truesize and break
core networking stacks.
In this patch, I removed one of the skb->truesize override.
I also replaced one skb_clone() by an allocation of a fresh
and small skb, to get minimally sized skbs, like we did
in commit
1e2c61172342 ("net: cdc_ncm: reduce skb truesize
in rx path") and
4ce62d5b2f7a ("net: usb: ax88179_178a:
stop lying about skb->truesize")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Steve Glendinning <steve.glendinning@shawell.net>
Link: https://lore.kernel.org/r/20240506142358.3657918-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Mon, 6 May 2024 13:55:46 +0000 (13:55 +0000)]
usb: aqc111: stop lying about skb->truesize
Some usb drivers try to set small skb->truesize and break
core networking stacks.
I replace one skb_clone() by an allocation of a fresh
and small skb, to get minimally sized skbs, like we did
in commit
1e2c61172342 ("net: cdc_ncm: reduce skb truesize
in rx path") and
4ce62d5b2f7a ("net: usb: ax88179_178a:
stop lying about skb->truesize")
Fixes: 361459cd9642 ("net: usb: aqc111: Implement RX data path")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20240506135546.3641185-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
John Hubbard [Mon, 6 May 2024 19:02:04 +0000 (12:02 -0700)]
selftests/net: fix uninitialized variables
When building with clang, via:
make LLVM=1 -C tools/testing/selftest
...clang warns about three variables that are not initialized in all
cases:
1) The opt_ipproto_off variable is used uninitialized if "testname" is
not "ip". Willem de Bruijn pointed out that this is an actual bug, and
suggested the fix that I'm using here (thanks!).
2) The addr_len is used uninitialized, but only in the assert case,
which bails out, so this is harmless.
3) The family variable in add_listener() is only used uninitialized in
the error case (neither IPv4 nor IPv6 is specified), so it's also
harmless.
Fix by initializing each variable.
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Acked-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20240506190204.28497-1-jhubbard@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Florian Fainelli [Mon, 6 May 2024 17:50:40 +0000 (10:50 -0700)]
lib: Allow for the DIM library to be modular
Allow the Dynamic Interrupt Moderation (DIM) library to be built as a
module. This is particularly useful in an Android GKI (Google Kernel
Image) configuration where everything is built as a module, including
Ethernet controller drivers. Having to build DIMLIB into the kernel
image with potentially no user is wasteful.
Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://lore.kernel.org/r/20240506175040.410446-1-florian.fainelli@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Mon, 6 May 2024 12:30:32 +0000 (12:30 +0000)]
mptcp: fix possible NULL dereferences
subflow_add_reset_reason(skb, ...) can fail.
We can not assume mptcp_get_ext(skb) always return a non NULL pointer.
syzbot reported:
general protection fault, probably for non-canonical address 0xdffffc0000000003: 0000 [#1] PREEMPT SMP KASAN PTI
KASAN: null-ptr-deref in range [0x0000000000000018-0x000000000000001f]
CPU: 0 PID: 5098 Comm: syz-executor132 Not tainted
6.9.0-rc6-syzkaller-01478-gcdc74c9d06e7 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
RIP: 0010:subflow_v6_route_req+0x2c7/0x490 net/mptcp/subflow.c:388
Code: 8d 7b 07 48 89 f8 48 c1 e8 03 42 0f b6 04 20 84 c0 0f 85 c0 01 00 00 0f b6 43 07 48 8d 1c c3 48 83 c3 18 48 89 d8 48 c1 e8 03 <42> 0f b6 04 20 84 c0 0f 85 84 01 00 00 0f b6 5b 01 83 e3 0f 48 89
RSP: 0018:
ffffc9000362eb68 EFLAGS:
00010206
RAX:
0000000000000003 RBX:
0000000000000018 RCX:
ffff888022039e00
RDX:
0000000000000000 RSI:
0000000000000000 RDI:
0000000000000000
RBP:
ffff88807d961140 R08:
ffffffff8b6cb76b R09:
1ffff1100fb2c230
R10:
dffffc0000000000 R11:
ffffed100fb2c231 R12:
dffffc0000000000
R13:
ffff888022bfe273 R14:
ffff88802cf9cc80 R15:
ffff88802ad5a700
FS:
0000555587ad2380(0000) GS:
ffff8880b9400000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
00007f420c3f9720 CR3:
0000000022bfc000 CR4:
00000000003506f0
DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
DR3:
0000000000000000 DR6:
00000000fffe0ff0 DR7:
0000000000000400
Call Trace:
<TASK>
tcp_conn_request+0xf07/0x32c0 net/ipv4/tcp_input.c:7180
tcp_rcv_state_process+0x183c/0x4500 net/ipv4/tcp_input.c:6663
tcp_v6_do_rcv+0x8b2/0x1310 net/ipv6/tcp_ipv6.c:1673
tcp_v6_rcv+0x22b4/0x30b0 net/ipv6/tcp_ipv6.c:1910
ip6_protocol_deliver_rcu+0xc76/0x1570 net/ipv6/ip6_input.c:438
ip6_input_finish+0x186/0x2d0 net/ipv6/ip6_input.c:483
NF_HOOK+0x3a4/0x450 include/linux/netfilter.h:314
NF_HOOK+0x3a4/0x450 include/linux/netfilter.h:314
__netif_receive_skb_one_core net/core/dev.c:5625 [inline]
__netif_receive_skb+0x1ea/0x650 net/core/dev.c:5739
netif_receive_skb_internal net/core/dev.c:5825 [inline]
netif_receive_skb+0x1e8/0x890 net/core/dev.c:5885
tun_rx_batched+0x1b7/0x8f0 drivers/net/tun.c:1549
tun_get_user+0x2f35/0x4560 drivers/net/tun.c:2002
tun_chr_write_iter+0x113/0x1f0 drivers/net/tun.c:2048
call_write_iter include/linux/fs.h:2110 [inline]
new_sync_write fs/read_write.c:497 [inline]
vfs_write+0xa84/0xcb0 fs/read_write.c:590
ksys_write+0x1a0/0x2c0 fs/read_write.c:643
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Fixes: 3e140491dd80 ("mptcp: support rstreason for passive reset")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Link: https://lore.kernel.org/r/20240506123032.3351895-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Florian Westphal [Mon, 6 May 2024 11:43:16 +0000 (13:43 +0200)]
selftests: netfilter: conntrack_tcp_unreplied.sh: wait for initial connection attempt
Netdev CI reports occasional failures with this test
("ERROR: ns2-dX6bUE did not pick up tcp connection from peer").
Add explicit busywait call until the initial connection attempt shows
up in conntrack rather than a one-shot 'must exist' check.
Signed-off-by: Florian Westphal <fw@strlen.de>
Link: https://lore.kernel.org/r/20240506114320.12178-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Mon, 6 May 2024 10:28:12 +0000 (10:28 +0000)]
net: annotate writes on dev->mtu from ndo_change_mtu()
Simon reported that ndo_change_mtu() methods were never
updated to use WRITE_ONCE(dev->mtu, new_mtu) as hinted
in commit
501a90c94510 ("inet: protect against too small
mtu values.")
We read dev->mtu without holding RTNL in many places,
with READ_ONCE() annotations.
It is time to take care of ndo_change_mtu() methods
to use corresponding WRITE_ONCE()
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Simon Horman <horms@kernel.org>
Closes: https://lore.kernel.org/netdev/20240505144608.GB67882@kernel.org/
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Shannon Nelson <shannon.nelson@amd.com>
Link: https://lore.kernel.org/r/20240506102812.3025432-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jeff Johnson [Sun, 5 May 2024 20:09:31 +0000 (13:09 -0700)]
net: dccp: Fix ccid2_rtt_estimator() kernel-doc
make C=1 reports:
warning: Function parameter or struct member 'mrtt' not described in 'ccid2_rtt_estimator'
So document the 'mrtt' parameter.
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240505-ccid2_rtt_estimator-kdoc-v1-1-09231fcb9145@quicinc.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Matthias Schiffer [Thu, 2 May 2024 11:13:01 +0000 (13:13 +0200)]
net: phy: marvell: add support for MV88E6250 family internal PHYs
The embedded PHYs of the
88E6250 family switches are very basic - they
do not even have an Extended Address / Page register.
This adds support for the PHYs to the driver to set up PHY interrupts
and retrieve error stats. To deal with PHYs without a page register,
"simple" variants of all stat handling functions are introduced.
The code should work with all
88E6250 family switches (6250/6220/6071/
6070/6020). The PHY ID 0x01410db0 was read from a
88E6020, under the
assumption that all switches of this family use the same ID. The spec
only lists the prefix 0x01410c00 and leaves the last 10 bits as reserved,
but that seems too unspecific to be useful, as it would cover several
existing PHY IDs already supported by the driver; therefore, the ID read
from the actual hardware is used.
Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/0695f699cd942e6e06da9d30daeedfd47785bc01.1714643285.git.matthias.schiffer@ew.tq-group.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Matthias Schiffer [Thu, 2 May 2024 11:13:00 +0000 (13:13 +0200)]
net: phy: marvell: constify marvell_hw_stats
The list of stat registers is read-only, so we can declare it as const.
Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/24d7a2f39e0c4c94466e8ad43228fdd798053f3a.1714643285.git.matthias.schiffer@ew.tq-group.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Dan Carpenter [Sat, 4 May 2024 11:38:15 +0000 (14:38 +0300)]
wifi: mwl8k: initialize cmd->addr[] properly
This loop is supposed to copy the mac address to cmd->addr but the
i++ increment is missing so it copies everything to cmd->addr[0] and
only the last address is recorded.
Fixes: 22bedad3ce11 ("net: convert multicast list to list_head")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://msgid.link/b788be9a-15f5-4cca-a3fe-79df4c8ce7b2@moroto.mountain
Paolo Abeni [Tue, 7 May 2024 09:42:03 +0000 (11:42 +0200)]
Merge branch 'remove-rtnl-lock-protection-of-cvq'
Daniel Jurgens says:
====================
Remove RTNL lock protection of CVQ
Currently the buffer used for control VQ commands is protected by the
RTNL lock. Previously this wasn't a major concern because the control VQ
was only used during device setup and user interaction. With the recent
addition of dynamic interrupt moderation the control VQ may be used
frequently during normal operation.
This series removes the RNTL lock dependency by introducing a mutex
to protect the control buffer and writing SGs to the control VQ.
v6:
- Rebased over new stats code.
- Added comment to cvq_lock, init the mutex unconditionally,
and replaced some duplicate code with a goto.
- Fixed minor grammer errors, checkpatch warnings, and clarified
a comment.
v5:
- Changed cvq_lock to a mutex.
- Changed dim_lock to mutex, because it's held taking
the cvq_lock.
- Use spin/mutex_lock/unlock vs guard macros.
v4:
- Protect dim_enabled with same lock as well intr_coal.
- Rename intr_coal_lock to dim_lock.
- Remove some scoped_guard where the error path doesn't
have to be in the lock.
v3:
- Changed type of _offloads to __virtio16 to fix static
analysis warning.
- Moved a misplaced hunk to the correct patch.
v2:
- New patch to only process the provided queue in
virtnet_dim_work
- New patch to lock per queue rx coalescing structure.
====================
Link: https://lore.kernel.org/r/20240503202445.1415560-1-danielj@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Daniel Jurgens [Fri, 3 May 2024 20:24:45 +0000 (23:24 +0300)]
virtio_net: Remove rtnl lock protection of command buffers
The rtnl lock is no longer needed to protect the control buffer and
command VQ.
Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Heng Qi <hengqi@linux.alibaba.com>
Tested-by: Heng Qi <hengqi@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Daniel Jurgens [Fri, 3 May 2024 20:24:44 +0000 (23:24 +0300)]
virtio_net: Add a lock for per queue RX coalesce
Once the RTNL locking around the control buffer is removed there can be
contention on the per queue RX interrupt coalescing data. Use a mutex
per queue. A mutex is required because virtnet_send_command can sleep.
Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
Reviewed-by: Heng Qi <hengqi@linux.alibaba.com>
Tested-by: Heng Qi <hengqi@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Daniel Jurgens [Fri, 3 May 2024 20:24:43 +0000 (23:24 +0300)]
virtio_net: Do DIM update for specified queue only
Since we no longer have to hold the RTNL lock here just do updates for
the specified queue.
Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
Reviewed-by: Heng Qi <hengqi@linux.alibaba.com>
Tested-by: Heng Qi <hengqi@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Daniel Jurgens [Fri, 3 May 2024 20:24:42 +0000 (23:24 +0300)]
virtio_net: Add a lock for the command VQ.
The command VQ will no longer be protected by the RTNL lock. Use a
mutex to protect the control buffer header and the VQ.
Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Heng Qi <hengqi@linux.alibaba.com>
Tested-by: Heng Qi <hengqi@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Daniel Jurgens [Fri, 3 May 2024 20:24:41 +0000 (23:24 +0300)]
virtio_net: Remove command data from control_buf
Allocate memory for the data when it's used. Ideally the struct could
be on the stack, but we can't DMA stack memory. With this change only
the header and status memory are shared between commands, which will
allow using a tighter lock than RTNL.
Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Heng Qi <hengqi@linux.alibaba.com>
Tested-by: Heng Qi <hengqi@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Daniel Jurgens [Fri, 3 May 2024 20:24:40 +0000 (23:24 +0300)]
virtio_net: Store RSS setting in virtnet_info
Stop storing RSS setting in the control buffer. This is prep work for
removing RTNL lock protection of the control buffer.
Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Heng Qi <hengqi@linux.alibaba.com>
Tested-by: Heng Qi <hengqi@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Arınç ÜNAL [Tue, 30 Apr 2024 05:01:33 +0000 (08:01 +0300)]
net: dsa: mt7530: detect PHY muxing when PHY is defined on switch MDIO bus
Currently, the MT7530 DSA subdriver configures the MT7530 switch to provide
direct access to switch PHYs, meaning, the switch PHYs listen on the MDIO
bus the switch listens on. The PHY muxing feature makes use of this.
This is problematic as the PHY may be attached before the switch is
initialised, in which case, the PHY will fail to be attached.
Since commit
91374ba537bd ("net: dsa: mt7530: support OF-based registration
of switch MDIO bus"), we can describe the switch PHYs on the MDIO bus of
the switch on the device tree. Extend the check to detect PHY muxing when
the PHY is defined on the MDIO bus of the switch on the device tree.
When the PHY is described this way, the switch will be initialised first,
then the switch MDIO bus will be registered. Only after these steps, the
PHY will be attached.
Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Reviewed-by: Daniel Golle <daniel@makrotopia.org>
Link: https://lore.kernel.org/r/20240430-b4-for-netnext-mt7530-use-switch-mdio-bus-for-phy-muxing-v2-1-9104d886d0db@arinc9.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Paolo Abeni [Tue, 7 May 2024 09:14:53 +0000 (11:14 +0200)]
Merge branch 'rtnetlink-more-rcu-conversions-for-rtnl_fill_ifinfo'
Eric Dumazet says:
====================
rtnetlink: more rcu conversions for rtnl_fill_ifinfo()
We want to no longer rely on RTNL for "ip link show" command.
This is a long road, this series takes care of some parts.
====================
Link: https://lore.kernel.org/r/20240503192059.3884225-1-edumazet@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Eric Dumazet [Fri, 3 May 2024 19:20:59 +0000 (19:20 +0000)]
rtnetlink: allow rtnl_fill_link_netnsid() to run under RCU protection
We want to be able to run rtnl_fill_ifinfo() under RCU protection
instead of RTNL in the future.
All rtnl_link_ops->get_link_net() methods already using dev_net()
are ready. I added READ_ONCE() annotations on others.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Eric Dumazet [Fri, 3 May 2024 19:20:58 +0000 (19:20 +0000)]
rtnetlink: do not depend on RTNL in rtnl_xdp_prog_skb()
dev->xdp_prog is protected by RCU, we can lift RTNL requirement
from rtnl_xdp_prog_skb().
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Eric Dumazet [Fri, 3 May 2024 19:20:57 +0000 (19:20 +0000)]
rtnetlink: do not depend on RTNL in rtnl_fill_proto_down()
Change dev_change_proto_down() and dev_change_proto_down_reason()
to write once on dev->proto_down and dev->proto_down_reason.
Then rtnl_fill_proto_down() can use READ_ONCE() annotations
and run locklessly.
rtnl_proto_down_size() should assume worst case,
because readng dev->proto_down_reason multiple
times would be racy without RTNL in the future.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Eric Dumazet [Fri, 3 May 2024 19:20:56 +0000 (19:20 +0000)]
rtnetlink: do not depend on RTNL for many attributes
Following device fields can be read locklessly
in rtnl_fill_ifinfo() :
type, ifindex, operstate, link_mode, mtu, min_mtu, max_mtu, group,
promiscuity, allmulti, num_tx_queues, gso_max_segs, gso_max_size,
gro_max_size, gso_ipv4_max_size, gro_ipv4_max_size, tso_max_size,
tso_max_segs, num_rx_queues.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Eric Dumazet [Fri, 3 May 2024 19:20:55 +0000 (19:20 +0000)]
net: write once on dev->allmulti and dev->promiscuity
In the following patch we want to read dev->allmulti
and dev->promiscuity locklessly from rtnl_fill_ifinfo()
In this patch I change __dev_set_promiscuity() and
__dev_set_allmulti() to write these fields (and dev->flags)
only if they succeed, with WRITE_ONCE() annotations.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Eric Dumazet [Fri, 3 May 2024 19:20:54 +0000 (19:20 +0000)]
rtnetlink: do not depend on RTNL for IFLA_TXQLEN output
rtnl_fill_ifinfo() can read dev->tx_queue_len locklessly,
granted we add corresponding READ_ONCE()/WRITE_ONCE() annotations.
Add missing READ_ONCE(dev->tx_queue_len) in teql_enqueue()
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Eric Dumazet [Fri, 3 May 2024 19:20:53 +0000 (19:20 +0000)]
rtnetlink: do not depend on RTNL for IFLA_IFNAME output
We can use netdev_copy_name() to no longer rely on RTNL
to fetch dev->name.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Eric Dumazet [Fri, 3 May 2024 19:20:52 +0000 (19:20 +0000)]
rtnetlink: do not depend on RTNL for IFLA_QDISC output
dev->qdisc can be read using RCU protection.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Paolo Abeni [Tue, 7 May 2024 09:08:16 +0000 (11:08 +0200)]
Merge branch 'net-qede-don-t-restrict-error-codes'
says:
====================
net: qede: don't restrict error codes
This series fixes the qede driver, so that when a helper function fails,
then the callee should return the returned error code, instead just
assuming that the error is eg. -EINVAL.
The patches in this series, reduces the change of future bugs, so new
error codes can be returned from the helpers, without having to update
the call sites.
This is a follow-up to my recent series "net: qede: avoid overruling
error codes", which fixed the cases where the implicit assumption of
failing with specific error codes had been broken.
https://lore.kernel.org/netdev/
20240426091227.78060-1-ast@fiberby.net/
Asbjørn Sloth Tønnesen (3):
net: qede: use return from qede_parse_actions() for flow_spec
net: qede: use return from qede_flow_spec_validate_unused()
net: qede: use return from qede_flow_parse_ports()
.../net/ethernet/qlogic/qede/qede_filter.c | 27 ++++++++++++-------
1 file changed, 18 insertions(+), 9 deletions(-)
====================
Link: https://lore.kernel.org/r/20240503105505.839342-1-ast@fiberby.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Asbjørn Sloth Tønnesen [Fri, 3 May 2024 10:55:03 +0000 (10:55 +0000)]
net: qede: use return from qede_flow_parse_ports()
When calling qede_flow_parse_ports(), then the
return code was only used for a non-zero check,
and then -EINVAL was returned.
qede_flow_parse_ports() can currently fail with:
* -EINVAL
This patch changes qede_flow_parse_v{4,6}_common() to
use the actual return code from qede_flow_parse_ports(),
so it's no longer assumed that all errors are -EINVAL.
Only compile tested.
Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Asbjørn Sloth Tønnesen [Fri, 3 May 2024 10:55:02 +0000 (10:55 +0000)]
net: qede: use return from qede_flow_spec_validate_unused()
When calling qede_flow_spec_validate_unused() then
the return code was only used for a non-zero check,
and then -EOPNOTSUPP was returned.
qede_flow_spec_validate_unused() can currently fail with:
* -EOPNOTSUPP
This patch changes qede_flow_spec_to_rule() to use the
actual return code from qede_flow_spec_validate_unused(),
so it's no longer assumed that all errors are -EOPNOTSUPP.
Only compile tested.
Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Asbjørn Sloth Tønnesen [Fri, 3 May 2024 10:55:01 +0000 (10:55 +0000)]
net: qede: use return from qede_parse_actions() for flow_spec
In qede_flow_spec_to_rule(), when calling
qede_parse_actions() then the return code
was only used for a non-zero check, and then
-EINVAL was returned.
qede_parse_actions() can currently fail with:
* -EINVAL
* -EOPNOTSUPP
Commit
319a1d19471e ("flow_offload: check for
basic action hw stats type") broke the implicit
assumption that it could only fail with -EINVAL,
by changing it to return -EOPNOTSUPP, when hardware
stats are requested.
However AFAICT it's not possible to trigger
qede_parse_actions() to return -EOPNOTSUPP, when
called from qede_flow_spec_to_rule(), as hardware
stats can't be requested by ethtool_rx_flow_rule_create().
This patch changes qede_flow_spec_to_rule() to use
the actual return code from qede_parse_actions(),
so it's no longer assumed that all errors are -EINVAL.
Only compile tested.
Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jakub Kicinski [Tue, 7 May 2024 02:14:56 +0000 (19:14 -0700)]
Merge tag 'ipsec-next-2024-05-03' of git://git./linux/kernel/git/klassert/ipsec-next
Steffen Klassert says:
====================
pull request (net-next): ipsec-next 2024-05-03
1) Remove Obsolete UDP_ENCAP_ESPINUDP_NON_IKE Support.
This was defined by an early version of an IETF draft
that did not make it to a standard.
2) Introduce direction attribute for xfrm states.
xfrm states have a direction, a stsate can be used
either for input or output packet processing.
Add a direction to xfrm states to make it clear
for what a xfrm state is used.
* tag 'ipsec-next-2024-05-03' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next:
xfrm: Restrict SA direction attribute to specific netlink message types
xfrm: Add dir validation to "in" data path lookup
xfrm: Add dir validation to "out" data path lookup
xfrm: Add Direction to the SA in or out
udpencap: Remove Obsolete UDP_ENCAP_ESPINUDP_NON_IKE Support
====================
Link: https://lore.kernel.org/r/20240503082732.2835810-1-steffen.klassert@secunet.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Shi-Sheng Yang [Thu, 2 May 2024 15:47:40 +0000 (23:47 +0800)]
mptcp: fix typos in comments
This patch fixes the spelling mistakes in comments.
The changes were generated using codespell and reviewed manually.
eariler -> earlier
greceful -> graceful
Signed-off-by: Shi-Sheng Yang <fourcolor4c@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240502154740.249839-1-fourcolor4c@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Simon Horman [Fri, 3 May 2024 11:11:58 +0000 (12:11 +0100)]
octeontx2-pf: Treat truncation of IRQ name as an error
According to GCC, the constriction of irq_name in otx2_open()
may, theoretically, be truncated.
This patch takes the approach of treating such a situation as an error
which it detects by making use of the return value of snprintf, which is
the total number of bytes, excluding the trailing '\0', that would have
been written.
Based on the approach taken to a similar problem in
commit
54b909436ede ("rtc: fix snprintf() checking in is_rtc_hctosys()")
Flagged by gcc-13 W=1 builds as:
.../otx2_pf.c:1933:58: warning: 'snprintf' output may be truncated before the last format character [-Wformat-truncation=]
1933 | snprintf(irq_name, NAME_SIZE, "%s-rxtx-%d", pf->netdev->name,
| ^
.../otx2_pf.c:1933:17: note: 'snprintf' output between 8 and 33 bytes into a destination of size 32
1933 | snprintf(irq_name, NAME_SIZE, "%s-rxtx-%d", pf->netdev->name,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1934 | qidx);
| ~~~~~
Compile tested only.
Tested-by: Geetha sowjanya <gakula@marvell.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240503-octeon2-pf-irq_name-truncation-v2-1-91099177b942@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Dr. David Alan Gilbert [Fri, 3 May 2024 00:18:22 +0000 (01:18 +0100)]
atm/fore200e: Delete unused 'fore200e_boards'
This list looks like it's been unused since the OF conversion in
2008 in
commit
826b6cfcd5d4 ("fore200e: Convert over to pure OF driver.")
This also means we can remove the 'entry' member for the list.
Build tested only.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Reviewed-by: Breno Leitao <leitao@debian.org>
Link: https://lore.kernel.org/r/20240503001822.183061-1-linux@treblig.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Shailend Chand [Wed, 1 May 2024 23:25:49 +0000 (23:25 +0000)]
gve: Implement queue api
The new netdev queue api is implemented for gve.
Tested-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com>
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Signed-off-by: Shailend Chand <shailend@google.com>
Link: https://lore.kernel.org/all/20240501232549.1327174-11-shailend@google.com/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Mateusz Polchlopek [Fri, 19 Apr 2024 09:11:04 +0000 (05:11 -0400)]
ice: refactor struct ice_vsi_cfg_params to be inside of struct ice_vsi
Refactor struct ice_vsi_cfg_params to be embedded into struct ice_vsi.
Prior to that the members of the struct were scattered around ice_vsi,
and were copy-pasted for purposes of reinit.
Now we have struct handy, and it is easier to have something sticky
in the flags field.
Suggested-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Vaishnavi Tipireddy <vaishnavi.tipireddy@intel.com>
Signed-off-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Marcin Szycik [Mon, 15 Apr 2024 08:49:07 +0000 (10:49 +0200)]
ice: Deduplicate tc action setup
ice_tc_setup_redirect_action() and ice_tc_setup_mirror_action() are almost
identical, except for setting filter action. Reduce them to one function
with an extra param, which handles both cases.
Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Paul Greenwalt [Fri, 29 Mar 2024 01:07:08 +0000 (21:07 -0400)]
ice: update E830 device ids and comments
Update existing E830 device ids and comments to align with new naming 'C'
for 100G and 'CC' for 200G.
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Paul Greenwalt [Fri, 29 Mar 2024 01:07:07 +0000 (21:07 -0400)]
ice: add additional E830 device ids
Add support for additional E830 device ids which are supported by the
driver:
- 0x12D5: Intel(R) Ethernet Controller E830-C for backplane
- 0x12D8: Intel(R) Ethernet Controller E830-C for QSFP
- 0x12DA: Intel(R) Ethernet Controller E830-C for SFP
- 0x12DC: Intel(R) Ethernet Controller E830-XXV for backplane
- 0x12DD: Intel(R) Ethernet Controller E830-XXV for QSFP
- 0x12DE: Intel(R) Ethernet Controller E830-XXV for SFP
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Daniel Gabay [Mon, 6 May 2024 07:04:19 +0000 (10:04 +0300)]
wifi: iwlwifi: Ensure prph_mac dump includes all addresses
In prph_mac_iter, ensure that all required addresses are dumped
even if a read fails. Currently, if a read fails, the region dump
is stopped, preventing the creation of prph_mac.lst.
By dumping all addresses even if a read fails, we can accurately
determine which addresses were successfully read and which were not.
Signed-off-by: Daniel Gabay <daniel.gabay@intel.com>
Reviewed-by: Eilon Rinat <eilon.rinat@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://msgid.link/20240506095953.31fa9ce91a1c.Ia0c86f70c7a6874c15ffc6f8235aa88530208546@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Miri Korenblit [Mon, 6 May 2024 07:04:18 +0000 (10:04 +0300)]
wifi: iwlwifi: mvm: don't request statistics in restart
During restart mac80211 notifies the driver about the association,
(if we was associated before the restart) which causes the driver to
request statistics from the FW. This causes to an immediate exit from
EMLSR after the restart is done, when the statistics notif is handled.
(too low TPT). There is no point in requesting statistics wnyway, since
the FW just started and don't have any.
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Link: https://msgid.link/20240506095953.16638dec9f7b.I093514312179bae566ad8d73ffb0355c6eee288a@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Miri Korenblit [Mon, 6 May 2024 07:04:17 +0000 (10:04 +0300)]
wifi: iwlwifi: mvm: exit EMLSR if secondary link is not used
Exit EMLSR mode if the secondary link is not used enough for Rx/Tx
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Link: https://msgid.link/20240506095953.99ad1d71e9b9.Ide825433488ec809773efdc36937e3089d0012df@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
striebit [Mon, 6 May 2024 07:04:16 +0000 (10:04 +0300)]
wifi: iwlwifi: mvm: add beacon template version 14
In version 14 tim_size became the offset of the
broadcast TWT IE.
Signed-off-by: striebit <shaul.triebitz@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://msgid.link/20240506095953.76957de93810.I2c718b0d648f2559fe1337df39915c5e772856bc@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Johannes Berg [Mon, 6 May 2024 07:04:15 +0000 (10:04 +0300)]
wifi: iwlwifi: mvm: align UATS naming with firmware
The firmware has different names for this, which is confusing
as even the convention of having the firmware name in a comment
after the struct definition wasn't met here. Fix the naming,
but keep UATS in some of it since that's the BIOS name.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://msgid.link/20240506095953.b0dfe17d5f44.I8f5f5a831c7b934ce3140f838315827c018103bb@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Daniel Gabay [Mon, 6 May 2024 07:04:14 +0000 (10:04 +0300)]
wifi: iwlwifi: Force SCU_ACTIVE for specific platforms
Firmware 0x2F7 assert observed in Dell platforms when using GL HW.
This issue is mitigated by setting SCU_FORCE_ACTIVE during platform
low power states.
Driver shall indicate firmware to force SCU active by setting bit 29
in context info prph scratch control flags.
This mitigation is limited to Dell platforms with GL HW only.
Signed-off-by: Daniel Gabay <daniel.gabay@intel.com>
Reviewed-by: Ofer Kimelman <ofer.kimelman@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://msgid.link/20240506095953.3d0c56c2bb1a.I97d9da402890d2085b5698666cceffc417b6b6df@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Benjamin Berg [Mon, 6 May 2024 07:04:13 +0000 (10:04 +0300)]
wifi: iwlwifi: mvm: record and return channel survey information
While doing a passive scan, the firmware will report per-channel survey
information. This information is primarily useful for hostapd when doing
an ACS (Automatic Channel Selection). Collect this information and add
it to the result set when getting the survey information.
Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://msgid.link/20240506095953.9287591a5999.I54a3f9f6480d3694e67eea1cb4f5853beace2780@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Benjamin Berg [Mon, 6 May 2024 07:04:12 +0000 (10:04 +0300)]
wifi: iwlwifi: mvm: add the firmware API for channel survey
When requested, the firmware can return per-channel survey information
generally used for ACS (automatic channel selection). Add the API for
this, which consists of a flag and a new channel survey notification.
Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://msgid.link/20240506095953.1facde532676.I3864ac4bc0fecb7fd5136e85c07585ab7100234b@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Ilan Peer [Mon, 6 May 2024 07:04:11 +0000 (10:04 +0300)]
wifi: iwlwifi: mvm: Fix race in scan completion
The move of the scan complete notification handling to the wiphy worker
introduced a race between scan complete notification and scan abort:
- The wiphy lock is held, e.g., for rfkill handling etc.
- Scan complete notification is received but not handled yet.
- Scan abort is triggered, and scan abort is sent to the FW. Once the
scan abort command is sent successfully, the flow synchronously waits
for the scan complete notification. However, as the scan complete
notification was already received but not processed yet, this hangs for
a second and continues leaving the scan status in an inconsistent
state.
- Once scan complete handling is started (when the wiphy lock is not held)
since the scan status is not an inconsistent state, a warning is issued
and the scan complete notification is not handled.
To fix this issue, switch back the scan complete notification to be
asynchronously handling, and only move the link selection logic to
a worker (which was the original reason for the move to use wiphy lock).
While at it, refactor some prints to improve debug data.
Fixes: 07bf5297d392 ("wifi: iwlwifi: mvm: Implement new link selection algorithm")
Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://msgid.link/20240506095953.1f484a86324b.I63ed445a47f144546948c74ae6df85587fdb4ce3@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Yedidya Benshimol [Mon, 6 May 2024 07:04:10 +0000 (10:04 +0300)]
wifi: iwlwifi: mvm: Add a print for invalid link pair due to bandwidth
When validating a link pair for EMLSR, add a print for invalid link
pair due to bandwidth
Signed-off-by: Yedidya Benshimol <yedidya.ben.shimol@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://msgid.link/20240506095953.9e57ad898cf4.Id8edfd5e3774ea6475d5f4178ab7ea75a870ef95@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Yedidya Benshimol [Mon, 6 May 2024 07:04:09 +0000 (10:04 +0300)]
wifi: iwlwifi: mvm: add a debugfs for reading EMLSR blocking reasons
Add a reading for all active EMLSR blocking reasons for testing
purposes.
Signed-off-by: Yedidya Benshimol <yedidya.ben.shimol@intel.com>
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://msgid.link/20240506095953.6d494a335e81.Ic0fa6a9636e3c1a3b1420e85e704a19d4a56e8d9@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Yedidya Benshimol [Mon, 6 May 2024 07:04:08 +0000 (10:04 +0300)]
wifi: iwlwifi: mvm: Add active EMLSR blocking reasons prints
Upon adding/removing an EMLSR blocking reason add to the print
the EMLSR disabling mask
Signed-off-by: Yedidya Benshimol <yedidya.ben.shimol@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://msgid.link/20240506095953.1e34fe2c3e51.Ia7db0392d81818ceb70a7b199d3f5fa8a4ad198d@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Miri Korenblit [Mon, 6 May 2024 07:04:07 +0000 (10:04 +0300)]
wifi: iwlwifi: bump FW API to 90 for BZ/SC devices
Start supporting API version 90 for new devices.
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://msgid.link/20240506095953.4e4b19128b56.I2f9196191f1ea78e96e92f9db8ecb3cc9bbfd9b3@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Miri Korenblit [Mon, 6 May 2024 07:04:06 +0000 (10:04 +0300)]
wifi: iwlwifi: mvm: fix primary link setting
mvmvif::primary link holds the ID and not a bitmap. Fix this
Fixes: 07bf5297d392 ("wifi: iwlwifi: mvm: Implement new link selection algorithm")
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Link: https://msgid.link/20240506095953.779bf6949053.Ia9297991ff2fdc82ae7c730e0069e2dd6e5f2902@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Johannes Berg [Mon, 6 May 2024 07:04:05 +0000 (10:04 +0300)]
wifi: iwlwifi: mvm: use already determined cmd_id
In iwl_mvm_rs_fw_rate_init() we have a variable cmd_id that
holds the command ID, so we can just use that instead of the
various calculations of it.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://msgid.link/20240506095953.f894ede03b26.I18f03c272b1c0807767f2713f3ffbb2941c57d9b@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Johannes Berg [Sun, 5 May 2024 06:19:59 +0000 (09:19 +0300)]
wifi: iwlwifi: mvm: don't reset link selection during restart
After restart, we might want to end up with the same config
as before, even for multi-link/EMLSR. Therefore, don't reset
the stored link selection result in that case.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://msgid.link/20240505091420.e81db303f1dc.Ie8267082f623d14376a2052d222e18da6545f34b@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Daniel Gabay [Sun, 5 May 2024 06:19:58 +0000 (09:19 +0300)]
wifi: iwlwifi: Print EMLSR states name
This is useful for debug instead of looking for the hex value.
Signed-off-by: Daniel Gabay <daniel.gabay@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://msgid.link/20240505091420.f3509cf652f2.Ic086b6b2132ffe249b3c4bdd24c673ce7fd1b614@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Yedidya Benshimol [Sun, 5 May 2024 06:19:57 +0000 (09:19 +0300)]
wifi: iwlwifi: mvm: Block EMLSR when a p2p/softAP vif is active
When there's an active link in a non-station vif, the station vif is
not allowed to enter EMLSR
Note that blocking EMLSR by calling iwl_mvm_block_esr() we will schedule
an exit from EMLSR worker, but the worker cannot run before the
activation of the non-BSS link, as ieee80211_remain_on_channel already
holds the wiphy mutex.
Handle that by explicitly calling ieee80211_set_active_links()
to leave EMLSR, and then doing iwl_mvm_block_esr() only for
consistency and to avoid re-entering it before ready.
Note that a call to ieee80211_set_active_links requires to release the
mvm mutex, but that's ok since we still hold the wiphy lock. The only
thing that might race here is the ESR_MODE_NOTIF, so this changes its
handler to run under the wiphy lock.
Signed-off-by: Yedidya Benshimol <yedidya.ben.shimol@intel.com>
Co-developed-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://msgid.link/20240505091420.916193759f8a.Idf3a3caf5cdc3e69c81710b7ceb57e87f2de87e4@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Miri Korenblit [Sun, 5 May 2024 06:19:56 +0000 (09:19 +0300)]
wifi: iwlwifi: mvm: fix typo in debug print
Change EMSLR to EMLSR
Fixes: 6cf7df9f013f ("wifi: iwlwifi: mvm: Add helper functions to update EMLSR status")
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://msgid.link/20240505091420.db629302bfdc.I135e28b89fab3b614ad8758c0305834934f8c0af@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Johannes Berg [Sun, 5 May 2024 06:19:55 +0000 (09:19 +0300)]
wifi: iwlwifi: mvm: exit EMLSR when CSA happens
If CSA is happening, then exit EMLSR to keep the better link,
which is the primary link unless that's doing the CSA with
quiet. This is done because we can't transmit the OMN frame
on a quiet link, but want to exit EMLSR during CSA for better
beacon reception, so we can follow the switch accurately.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://msgid.link/20240505091420.3ffff9577f08.I2620971fa5aef789e0d4a588def4c2621e8bed5b@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Yedidya Benshimol [Sun, 5 May 2024 06:19:54 +0000 (09:19 +0300)]
wifi: iwlwifi: mvm: Disable/enable EMLSR due to link's bandwidth/band
Enable EMLSR when bandwidth settings meet the criteria in
both band and width, otherwise disable.
Signed-off-by: Yedidya Benshimol <yedidya.ben.shimol@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://msgid.link/20240505091420.4e473d4f7f5c.I3adf5619b60bfba8af0cd7eae9dac947419603b6@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Miri Korenblit [Sun, 5 May 2024 06:19:53 +0000 (09:19 +0300)]
wifi: iwlwifi: mvm: avoid always prefering single-link
The new link selection algorithm uses defaults values for BSS load if
the BSS Load element was not published by the AP.
For 6 GHz, that value is 0. So if the best link is 6 GHz, the EMLSR
grade to always be equal to the grade of the best link,
and then the best link grade is getting a bonus of 10 percent, meaning
that we will never activate EMLSR.
Change the logic to not give a bonus for the best link.
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://msgid.link/20240505091420.4614e6891dbd.Ie40eae0dd99d82ba60dea5b6dbcd42dcdf16b90d@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Miri Korenblit [Sun, 5 May 2024 06:19:52 +0000 (09:19 +0300)]
wifi: iwlwifi: mvm: trigger link selection upon TTLM start/end
When non default TTLM is applied, mac80211 may force us to use a specific
link (For example, if the only active link becomes a dormant link,
mac80211 will pick the first usable link and set it as active).
When default TTLM is applied, we have new usable links that we might want
to select. Therefore, trigger MLO scan and link selection upon change in
TTLM.
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Reviewed-by: Ilan Peer <ilan.peer@intel.com>
Link: https://msgid.link/20240505091420.ed2b386566a8.I0168e61da86b2027633743aaf5d97e483991f0dc@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Miri Korenblit [Sun, 5 May 2024 06:19:51 +0000 (09:19 +0300)]
wifi: iwlwifi: mvm: consider FWs recommendation for EMLSR
FW sends a notification indicating whether activating EMLSR mode is
recommended or not.
Support the notification and enter EMLSR only if recommended.
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://msgid.link/20240505091420.2fd3387882eb.I7a8a5b24658744ed732bfc03b1872c9298483d62@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Miri Korenblit [Sun, 5 May 2024 06:19:50 +0000 (09:19 +0300)]
wifi: iwlwifi: mvm: Activate EMLSR based on traffic volume
Adjust EMLSR activation to account for traffic levels. By
tracking the number of RX/TX MPDUs, EMLSR will be activated only when
traffic volume meets the required threshold.
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://msgid.link/20240505091420.9480f99ac8fc.If9eb946e929a39e10fe5f4638bc8bc3f8976edf1@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>