In of_mdiobus_register(), we should call of_node_put() for 'child'
escaped out of for_each_available_child_of_node().
Fixes: 66bdede495c7 ("of_mdio: Fix broken PHY IRQ in case of probe deferral") Co-developed-by: Miaoqian Lin <linmq006@gmail.com> Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Signed-off-by: Liang He <windhl@126.com> Link: https://lore.kernel.org/r/20220913125659.3331969-1-windhl@126.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The Kconfig symbol depended on MMU but was dropped by the commit acad3fe650a5 ("drm/hisilicon: Removed the dependency on the mmu")
because it already had as a dependency ARM64 that already selects MMU.
But later, commit a0f25a6bb319 ("drm/hisilicon/hibmc: Allow to be built
if COMPILE_TEST is enabled") allowed the driver to be built for non-ARM64
when COMPILE_TEST is set but that could lead to unmet direct dependencies
and linking errors.
Prevent a kconfig warning when MMU is not enabled by making
DRM_HISI_HIBMC depend on MMU.
The commit feeb07d0ca5a ("drm/hisilicon/hibmc: Make CONFIG_DRM_HISI_HIBMC
depend on ARM64") made the driver Kconfig symbol to depend on ARM64 since
it only supports that architecture and loading the module on others would
lead to incorrect video modes being used.
But it also prevented the driver to be built on other architectures which
is useful to have compile test coverage when doing subsystem wide changes.
Make the dependency instead to be (ARM64 || COMPILE_TEST), so the driver
is buildable when the CONFIG_COMPILE_TEST option is enabled.
Trying to get the channel from the tx_queue variable here is wrong
because we can only be here if tx_queue is NULL, so we shouldn't
dereference it. As the above comment in the code says, this is very
unlikely to happen, but it's wrong anyway so let's fix it.
I hit this issue because of a different bug that caused tx_queue to be
NULL. If that happens, this is the error message that we get here:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
[...]
RIP: 0010:efx_hard_start_xmit+0x153/0x170 [sfc]
Fixes: 12804793b17c ("sfc: decouple TXQ type from label") Reported-by: Tianhao Zhao <tizhao@redhat.com> Signed-off-by: Íñigo Huguet <ihuguet@redhat.com> Acked-by: Edward Cree <ecree.xilinx@gmail.com> Link: https://lore.kernel.org/r/20220914111135.21038-1-ihuguet@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
In legacy interrupt mode the tx_channel_offset was hardcoded to 1, but
that's not correct if efx_sepparate_tx_channels is false. In that case,
the offset is 0 because the tx queues are in the single existing channel
at index 0, together with the rx queue.
Without this fix, as soon as you try to send any traffic, it tries to
get the tx queues from an uninitialized channel getting these errors:
WARNING: CPU: 1 PID: 0 at drivers/net/ethernet/sfc/tx.c:540 efx_hard_start_xmit+0x12e/0x170 [sfc]
[...]
RIP: 0010:efx_hard_start_xmit+0x12e/0x170 [sfc]
[...]
Call Trace:
<IRQ>
dev_hard_start_xmit+0xd7/0x230
sch_direct_xmit+0x9f/0x360
__dev_queue_xmit+0x890/0xa40
[...]
BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
[...]
RIP: 0010:efx_hard_start_xmit+0x153/0x170 [sfc]
[...]
Call Trace:
<IRQ>
dev_hard_start_xmit+0xd7/0x230
sch_direct_xmit+0x9f/0x360
__dev_queue_xmit+0x890/0xa40
[...]
While converting max_tx_rate from bytes to Mbps, this value was set to 0,
if the original value was lower than 125000 bytes (1 Mbps). This would
cause no transmission rate limiting to occur. This happened due to lack of
check of max_tx_rate against the 1 Mbps value for max_tx_rate and the
following division by 125000. Fix this issue by adding a helper
i40e_bw_bytes_to_mbits() which sets max_tx_rate to minimum usable value of
50 Mbps, if its value is less than 1 Mbps, otherwise do the required
conversion by dividing by 125000.
Max MTU sent to VF is set to 0 during memory allocation. It cause
that max MTU on VF is changed to IAVF_MAX_RXBUFFER and does not
depend on data from HW.
Set max_mtu field in virtchnl_vf_resource struct to inform
VF in GET_VF_RESOURCES msg what size should be max frame.
Fixes: dab86afdbbd1 ("i40e/i40evf: Change the way we limit the maximum frame size for Rx") Signed-off-by: Michal Jaron <michalx.jaron@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
After setting port VLAN and MTU to 9000 on VF with ice driver there
was an iavf error
"PF returned error -5 (IAVF_ERR_PARAM) to our request 6".
During queue configuration, VF's max packet size was set to
IAVF_MAX_RXBUFFER but on ice max frame size was smaller by VLAN_HLEN
due to making some space for port VLAN as VF is not aware whether it's
in a port VLAN. This mismatch in sizes caused ice to reject queue
configuration with ERR_PARAM error. Proper max_mtu is sent from ice PF
to VF with GET_VF_RESOURCES msg but VF does not look at this.
In iavf change max_frame from IAVF_MAX_RXBUFFER to max_mtu
received from pf with GET_VF_RESOURCES msg to make vf's
max_frame_size dependent from pf. Add check if received max_mtu is
not in eligible range then set it to IAVF_MAX_RXBUFFER.
Fixes: dab86afdbbd1 ("i40e/i40evf: Change the way we limit the maximum frame size for Rx") Signed-off-by: Michal Jaron <michalx.jaron@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The MDIO gateway (GW) lock in BlueField-2 GIGE logic is
set after read. This patch adds logic to make sure the
lock is always cleared at the end of each MDIO transaction.
Fix bad page state, free inappropriate page in handling dummy
descriptor. iavf_build_skb now has to check not only if rx_buffer is
NULL but also if size is zero, same thing in iavf_clean_rx_irq.
Without this patch driver would free page that will be used
by napi_build_skb.
Fixes: a9f49e006030 ("iavf: Fix handling of dummy receive descriptors") Signed-off-by: Norbert Zulinski <norbertx.zulinski@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
OpenWrt's UML with 5.15 was producing odd errors/warnings during preinit
part of the early userspace portion:
|[ 0.000000] Kernel command line: ubd0=root.img root=98:0 console=tty
|[...]
|[ 0.440000] random: jshn: uninitialized urandom read (4 bytes read)
|[ 0.460000] random: jshn: uninitialized urandom read (4 bytes read)
|/etc/preinit: line 47: can't create /dev/tty: No such device or address
|/etc/preinit: line 48: can't create /dev/tty: No such device or address
|/etc/preinit: line 58: can't open /dev/tty: No such device or address
|[...] repeated many times
That "/dev/tty" came from the command line (which is automatically
added if no console= parameter was specified for the uml binary).
The TLDP project tells the following about the /dev/tty:
<https://tldp.org/HOWTO/Text-Terminal-HOWTO-7.html#ss7.3>
| /dev/tty stands for the controlling terminal (if any) for the current
| process.[...]
| /dev/tty is something like a link to the actually terminal device[..]
The "(if any)" is important here, since it's possible for processes to
not have a controlling terminal.
I think this was a simple typo and the author wanted tty0 there.
CC: Thomas Meyer <thomas@m3y3r.de> Fixes: d7ffac33631b ("um: stdio_console: Make preferred console") Signed-off-by: Christian Lamparter <chunkeey@gmail.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 0060c8783330 ("net: stmmac: implement support for passive mode
converters via dt") has changed the plat->interface field semantics from
containing the PHY-mode to specifying the MAC-PCS interface mode. Due to
that the loongson32 platform code will leave the phylink interface
uninitialized with the PHY-mode intended by the means of the actual
platform setup. The commit-author most likely has just missed the
arch-specific code to fix. Let's mend the Loongson32 platform code then by
assigning the PHY-mode to the phy_interface field of the STMMAC platform
data.
Fixes: 0060c8783330 ("net: stmmac: implement support for passive mode converters via dt") Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru> Signed-off-by: Keguang Zhang <keguang.zhang@gmail.com> Tested-by: Keguang Zhang <keguang.zhang@gmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
Netdev drivers are expected to call dev_{uc,mc}_sync() in their
ndo_set_rx_mode method and dev_{uc,mc}_unsync() in their ndo_stop method.
This is mentioned in the kerneldoc for those dev_* functions.
The team driver calls dev_{uc,mc}_unsync() during ndo_uninit instead of
ndo_stop. This is ineffective because address lists (dev->{uc,mc}) have
already been emptied in unregister_netdevice_many() before ndo_uninit is
called. This mistake can result in addresses being leftover on former team
ports after a team device has been deleted; see test_LAG_cleanup() in the
last patch in this series.
Add unsync calls at their expected location, team_close().
v3:
* When adding or deleting a port, only sync/unsync addresses if the team
device is up. In other cases, it is taken care of at the right time by
ndo_open/ndo_set_rx_mode/ndo_stop.
Fixes: 3d249d4ca7d0 ("net: introduce ethernet teaming device") Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Netdev drivers are expected to call dev_{uc,mc}_sync() in their
ndo_set_rx_mode method and dev_{uc,mc}_unsync() in their ndo_stop method.
This is mentioned in the kerneldoc for those dev_* functions.
The bonding driver calls dev_{uc,mc}_unsync() during ndo_uninit instead of
ndo_stop. This is ineffective because address lists (dev->{uc,mc}) have
already been emptied in unregister_netdevice_many() before ndo_uninit is
called. This mistake can result in addresses being leftover on former bond
slaves after a bond has been deleted; see test_LAG_cleanup() in the last
patch in this series.
Add unsync calls, via bond_hw_addr_flush(), at their expected location,
bond_close().
Add dev_mc_add() call to bond_open() to match the above change.
v3:
* When adding or deleting a slave, only sync/unsync, add/del addresses if
the bond is up. In other cases, it is taken care of at the right time by
ndo_open/ndo_set_rx_mode/ndo_stop.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
There are already a few definitions of arrays containing
MULTICAST_LACPDU_ADDR and the next patch will add one more use. These all
contain the same constant data so define one common instance for all
bonding code.
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of: 86247aba599e ("net: bonding: Unsync device addresses on ndo_stop") Signed-off-by: Sasha Levin <sashal@kernel.org>
Commit 8f394da36a36 ("scsi: qla2xxx: Drop TARGET_SCF_LOOKUP_LUN_FROM_TAG")
made the __qlt_24xx_handle_abts() function return early if
tcm_qla2xxx_find_cmd_by_tag() didn't find a command, but it missed to clean
up the allocated memory for the management command.
Link: https://lore.kernel.org/r/20220914024924.695604-1-rafaelmendsr@gmail.com Fixes: 8f394da36a36 ("scsi: qla2xxx: Drop TARGET_SCF_LOOKUP_LUN_FROM_TAG") Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Rafael Mendonca <rafaelmendsr@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The Aquantia datasheet notes that after issuing a Processor-Intensive
MDIO operation, like changing the low-power state of the device, the
driver should wait for the operation to finish before issuing a new MDIO
command.
The new aqr107_wait_processor_intensive_op() function is added which can
be used after these kind of MDIO operations. At the moment, we are only
adding it at the end of the suspend/resume calls.
The issue was identified on a board featuring the AQR113C PHY, on
which commands like 'ip link (..) up / down' issued without any delays
between them would render the link on the PHY to remain down.
The issue was easy to reproduce with a one-liner:
$ ip link set dev ethX down; ip link set dev ethX up; \
ip link set dev ethX down; ip link set dev ethX up;
__flow_hash_consistentify() wrongly swaps ipv4 addresses in few cases.
This function is indirectly used by __skb_get_hash_symmetric(), which is
used to fanout packets in AF_PACKET.
Intrusion detection systems may be impacted by this issue.
__flow_hash_consistentify() computes the addresses difference then swaps
them if the difference is negative. In few cases src - dst and dst - src
are both negative.
The following snippet mimics __flow_hash_consistentify():
In the first case the addresses differences are always < 0, therefore
__flow_hash_consistentify() always swaps, thus dst->src and src->dst
packets have differents hashes.
Fixes: c3f8324188fa8 ("net: Add full IPv6 addresses to flow_keys") Signed-off-by: Ludovic Cintrat <ludovic.cintrat@gatewatcher.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
If an AF_PACKET socket is used to send packets through ipvlan and the
default xmit function of the AF_PACKET socket is changed from
dev_queue_xmit() to packet_direct_xmit() via setsockopt() with the option
name of PACKET_QDISC_BYPASS, the skb->mac_header may not be reset and
remains as the initial value of 65535, this may trigger slab-out-of-bounds
bugs as following:
=================================================================
UG: KASAN: slab-out-of-bounds in ipvlan_xmit_mode_l2+0xdb/0x330 [ipvlan]
PU: 2 PID: 1768 Comm: raw_send Kdump: loaded Not tainted 6.0.0-rc4+ #6
ardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1.fc33
all Trace:
print_address_description.constprop.0+0x1d/0x160
print_report.cold+0x4f/0x112
kasan_report+0xa3/0x130
ipvlan_xmit_mode_l2+0xdb/0x330 [ipvlan]
ipvlan_start_xmit+0x29/0xa0 [ipvlan]
__dev_direct_xmit+0x2e2/0x380
packet_direct_xmit+0x22/0x60
packet_snd+0x7c9/0xc40
sock_sendmsg+0x9a/0xa0
__sys_sendto+0x18a/0x230
__x64_sys_sendto+0x74/0x90
do_syscall_64+0x3b/0x90
entry_SYSCALL_64_after_hwframe+0x63/0xcd
The root cause is:
1. packet_snd() only reset skb->mac_header when sock->type is SOCK_RAW
and skb->protocol is not specified as in packet_parse_headers()
2. packet_direct_xmit() doesn't reset skb->mac_header as dev_queue_xmit()
In this case, skb->mac_header is 65535 when ipvlan_xmit_mode_l2() is
called. So when ipvlan_xmit_mode_l2() gets mac header with eth_hdr() which
use "skb->head + skb->mac_header", out-of-bound access occurs.
This patch replaces eth_hdr() with skb_eth_hdr() in ipvlan_xmit_mode_l2()
and reset mac header in multicast to solve this out-of-bound bug.
Fixes: 2ad7bf363841 ("ipvlan: Initial check-in of the IPVLAN driver.") Signed-off-by: Lu Wei <luwei32@huawei.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
The underlying hardware may or may not allow reading of the head or tail
registers and it really makes no difference if we use the software
cached values. So, always used the software cached values.
Fixes: 9c6c12595b73 ("i40e: Detection and recovery of TX queue hung logic moved to service_task from tx_timeout") Signed-off-by: Brett Creeley <brett.creeley@intel.com> Co-developed-by: Norbert Zulinski <norbertx.zulinski@intel.com> Signed-off-by: Norbert Zulinski <norbertx.zulinski@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
In the IDC callback that is accessed when the aux drivers request a reset,
the function to unplug the aux devices is called. This function is also
called in the ice_prepare_for_reset function. This double call is causing
a "scheduling while atomic" BUG.
The correct place to unplug the aux devices for a reset is in the
prepare_for_reset function, as this is a common place for all reset flows.
It also has built in protection from being called twice in a single reset
instance before the aux devices are replugged.
Fixes: f9f5301e7e2d4 ("ice: Register auxiliary device to provide RDMA") Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Helena Anna Dubel <helena.anna.dubel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
nf_osf_find() incorrectly returns true on mismatch, this leads to
copying uninitialized memory area in nft_osf which can be used to leak
stale kernel stack data to userspace.
CTCP messages should only be at the start of an IRC message, not
anywhere within it.
While the helper only decodes packes in the ORIGINAL direction, its
possible to make a client send a CTCP message back by empedding one into
a PING request. As-is, thats enough to make the helper believe that it
saw a CTCP message.
ct_sip_next_header and ct_sip_get_header return an absolute
value of matchoff, not a shift from current dataoff.
So dataoff should be assigned matchoff, not incremented by it.
This issue can be seen in the scenario when there are multiple
Contact headers and the first one is using a hostname and other headers
use IP addresses. In this case, ct_sip_walk_headers will work as follows:
The first ct_sip_get_header call to will find the first Contact header
but will return -1 as the header uses a hostname. But matchoff will
be changed to the offset of this header. After that, dataoff should be
set to matchoff, so that the next ct_sip_get_header call find the next
Contact header. But instead of assigning dataoff to matchoff, it is
incremented by it, which is not correct, as matchoff is an absolute
value of the offset. So on the next call to the ct_sip_get_header,
dataoff will be incorrect, and the next Contact header may not be
found at all.
We should call of_node_put() for the reference returned by
of_parse_phandle() in fail path or when it is not used anymore.
Here we only need to move the of_node_put() before the check.
Fixes: d70241913413 ("dmaengine: ti: k3-udma: Add glue layer for non DMAengine users") Signed-off-by: Liang He <windhl@126.com> Acked-by: Peter Ujfalusi <peter.ujfalusi@gmail.com> Link: https://lore.kernel.org/r/20220720073234.1255474-1-windhl@126.com Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
We've found the AUX channel to be less reliable with PCLK_EDP at a
higher rate (typically 25 MHz). This is especially important on systems
with PSR-enabled panels (like Gru-Kevin), since we make heavy, constant
use of AUX.
According to Rockchip, using any rate other than 24 MHz can cause
"problems between syncing the PHY an PCLK", which leads to all sorts of
unreliabilities around register operations.
Fixes: d67a38c5a623 ("arm64: dts: rockchip: move core edp from rk3399-kevin to shared chromebook") Reviewed-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: zain wang <wzz@rock-chips.com> Signed-off-by: Brian Norris <briannorris@chromium.org> Link: https://lore.kernel.org/r/20220830131212.v2.1.I98d30623f13b785ca77094d0c0fd4339550553b6@changeid Signed-off-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
Add callbacks for atomic_destroy_state, atomic_duplicate_state and
atomic_reset to restore functionality of the DSI driver: this solves
vblank timeouts when another bridge is present in the chain.
Fixes: 7f6335c6a258 ("drm/mediatek: Modify dsi funcs to atomic operations") Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Tested-by: Chen-Yu Tsai <wenst@chromium.org> Reviewed-by: Nícolas F. R. A. Prado <nfraprado@collabora.com> Tested-by: Nícolas F. R. A. Prado <nfraprado@collabora.com> Link: https://patchwork.kernel.org/project/linux-mediatek/patch/20220721172727.14624-1-angelogioacchino.delregno@collabora.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The Gru-Bob board does not have a pull-up resistor on its
WLAN_HOST_WAKE# pin, but Kevin does. The production/vendor kernel
specified the pin configuration correctly as a pull-up, but this didn't
get ported correctly to upstream.
This means Bob's WLAN_HOST_WAKE# pin is floating, causing inconsistent
wakeup behavior.
Note that bt_host_wake_l has a similar dynamic, but apparently the
upstream choice was to redundantly configure both internal and external
pull-up on Kevin (see the "Kevin has an external pull up" comment in
rk3399-gru.dtsi). This doesn't cause any functional problem, although
it's perhaps wasteful.
SCMI Reset protocol specification allows the asynchronous reset request
only when an autonomous reset action is specified. Reset requests based
on explicit assert/deassert of signals should not be served
asynchronously.
Current implementation will instead issue an asynchronous request in any
case, as long as the reset domain had advertised to support asynchronous
resets.
Avoid requesting the asynchronous resets when the reset action is not
of the autonomous type, even if the target reset domain does, in general,
support the asynchronous requests.
Accessing reset domains descriptors by the index upon the SCMI drivers
requests through the SCMI reset operations interface can potentially
lead to out-of-bound violations if the SCMI driver misbehave.
Add an internal consistency check before any such domains descriptors
accesses.
xfs_repair catches fork size/format mismatches, but the in-kernel
verifier doesn't, leading to null pointer failures when attempting
to perform operations on the fork. This can occur in the
xfs_dir_is_empty() where the in-memory fork format does not match
the size and so the fork data pointer is accessed incorrectly.
Note: this causes new failures in xfs/348 which is testing mode vs
ftype mismatches. We now detect a regular file that has been changed
to a directory or symlink mode as being corrupt because the data
fork is for a symlink or directory should be in local form when
there are only 3 bytes of data in the data fork. Hence the inode
verify for the regular file now fires w/ -EFSCORRUPTED because
the inode fork format does not match the format the corrupted mode
says it should be in.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
For some reason commit 9a5280b312e2e ("xfs: reorder iunlink remove
operation in xfs_ifree") replaced a jump to the exit path in the
event of an xfs_difree() error with a direct return, which skips
releasing the perag reference acquired at the top of the function.
Restore the original code to drop the reference on error.
Fixes: 9a5280b312e2e ("xfs: reorder iunlink remove operation in xfs_ifree") Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The O_TMPFILE creation implementation creates a specific order of
operations for inode allocation/freeing and unlinked list
modification. Currently both are serialised by the AGI, so the order
doesn't strictly matter as long as the are both in the same
transaction.
However, if we want to move the unlinked list insertions largely out
from under the AGI lock, then we have to be concerned about the
order in which we do unlinked list modification operations.
O_TMPFILE creation tells us this order is inode allocation/free,
then unlinked list modification.
Change xfs_ifree() to use this same ordering on unlinked list
removal. This way we always guarantee that when we enter the
iunlinked list removal code from this path, we already have the AGI
locked and we don't have to worry about lock nesting AGI reads
inside unlink list locks because it's already locked and attached to
the transaction.
We can do this safely as the inode freeing and unlinked list removal
are done in the same transaction and hence are atomic operations
with respect to log recovery.
Reported-by: Frank Hofmann <fhofmann@cloudflare.com> Fixes: 298f7bec503f ("xfs: pin inode backing buffer to the inode log item") Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Due to undocumented, hysterical raisins on x86, the CFI jump-table
sections in .text are needlessly aligned to PMD_SIZE in the vmlinux
linker script. When compiling a CFI-enabled arm64 kernel with a 64KiB
page-size, a PMD maps 512MiB of virtual memory and so the .text section
increases to a whopping 940MiB and blows the final Image up to 960MiB.
Others report a link failure.
Since the CFI jump-table requires only instruction alignment, reduce the
alignment directives to function alignment for parity with other parts
of the .text section. This reduces the size of the .text section for the
aforementioned 64KiB page size arm64 kernel to 19MiB for a much more
reasonable total Image size of 39MiB.
Cc: Sami Tolvanen <samitolvanen@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: "Mohan Rao .vanimina" <mailtoc.mohanrao@gmail.com> Cc: Kees Cook <keescook@chromium.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/all/CAL_GTzigiNOMYkOPX1KDnagPhJtFNqSK=1USNbS0wUL4PW6-Uw@mail.gmail.com/ Fixes: cf68fffb66d6 ("add support for Clang CFI") Reviewed-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20220922215715.13345-1-will@kernel.org Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
cpufreq_get_hw_max_freq() returns max frequency in kHz as *unsigned int*,
while freq_inv_set_max_ratio() gets passed this frequency in Hz as 'u64'.
Multiplying max frequency by 1000 can potentially result in overflow --
multiplying by 1000ULL instead should avoid that...
Found by Linux Verification Center (linuxtesting.org) with the SVACE static
analysis tool.
Fixes: cd0ed03a8903 ("arm64: use activity monitors for frequency invariance") Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Link: https://lore.kernel.org/r/01493d64-2bce-d968-86dc-11a122a9c07d@omp.ru Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Inject #UD when emulating XSETBV if CR4.OSXSAVE is not set. This also
covers the "XSAVE not supported" check, as setting CR4.OSXSAVE=1 #GPs if
XSAVE is not supported (and userspace gets to keep the pieces if it
forces incoherent vCPU state).
Add a comment to kvm_emulate_xsetbv() to call out that the CPU checks
CR4.OSXSAVE before checking for intercepts. AMD'S APM implies that #UD
has priority (says that intercepts are checked before #GP exceptions),
while Intel's SDM says nothing about interception priority. However,
testing on hardware shows that both AMD and Intel CPUs prioritize the #UD
over interception.
Fixes: 02d4160fbd76 ("x86: KVM: add xsetbv to the emulator") Cc: stable@vger.kernel.org Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220824033057.3576315-4-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 5a836bf6b09f ("mm: slub: move flush_cpu_slab() invocations
__free_slab() invocations out of IRQ context") moved all flush_cpu_slab()
invocations to the global workqueue to avoid a problem related
with deactivate_slab()/__free_slab() being called from an IRQ context
on PREEMPT_RT kernels.
When the flush_all_cpu_locked() function is called from a task context
it may happen that a workqueue with WQ_MEM_RECLAIM bit set ends up
flushing the global workqueue, this will cause a dependency issue.
In create_unique_id(), kmalloc(, GFP_KERNEL) can fail due to
out-of-memory, if it fails, return errno correctly rather than
triggering panic via BUG_ON();
The following happened on an i.MX25 using flexcan with many packets on
the bus:
The rx-offload queue reached a length more than skb_queue_len_max. In
can_rx_offload_offload_one() the drop variable was set to true which
made the call to .mailbox_read() (here: flexcan_mailbox_read()) to
_always_ return ERR_PTR(-ENOBUFS) and drop the rx'ed CAN frame. So
can_rx_offload_offload_one() returned ERR_PTR(-ENOBUFS), too.
can_rx_offload_irq_offload_fifo() looks as follows:
| while (1) {
| skb = can_rx_offload_offload_one(offload, 0);
| if (IS_ERR(skb))
| continue;
| if (!skb)
| break;
| ...
| }
The flexcan driver wrongly always returns ERR_PTR(-ENOBUFS) if drop is
requested, even if there is no CAN frame pending. As the i.MX25 is a
single core CPU, while the rx-offload processing is active, there is
no thread to process packets from the offload queue. So the queue
doesn't get any shorter and this results is a tight loop.
Instead of always returning ERR_PTR(-ENOBUFS) if drop is requested,
return NULL if no CAN frame is pending.
Changes since v1: https://lore.kernel.org/all/20220810144536.389237-1-u.kleine-koenig@pengutronix.de
- don't break in can_rx_offload_irq_offload_fifo() in case of an error,
return NULL in flexcan_mailbox_read() in case of no pending CAN frame
instead
We were failing to call kasan_malloc() from __kmalloc_*track_caller()
which was causing us to sometimes fail to produce KASAN error reports
for allocations made using e.g. devm_kcalloc(), as the KASAN poison was
not being initialized. Fix it.
riscv has an equivalent of arm bug fixed by 653d48b22166 ("arm: fix
really nasty sigreturn bug"); if signal gets caught by an interrupt that
hits when we have the right value in a0 (-513), *and* another signal
gets delivered upon sigreturn() (e.g. included into the blocked mask for
the first signal and posted while the handler had been running), the
syscall restart logics will see regs->cause equal to EXC_SYSCALL (we are
in a syscall, after all) and a0 already restored to its original value
(-513, which happens to be -ERESTARTNOINTR) and assume that we need to
apply the usual syscall restart logics.
When running gpio test on nxp-ls1028 platform with below command
gpiomon --num-events=3 --rising-edge gpiochip1 25
There will be a warning trace as below:
Call trace:
free_irq+0x204/0x360
lineevent_free+0x64/0x70
gpio_ioctl+0x598/0x6a0
__arm64_sys_ioctl+0xb4/0x100
invoke_syscall+0x5c/0x130
......
el0t_64_sync+0x1a0/0x1a4
The reason of this issue is that calling request_threaded_irq()
function failed, and then lineevent_free() is invoked to release
the resource. Since the lineevent_state::irq was already set, so
the subsequent invocation of free_irq() would trigger the above
warning call trace. To fix this issue, set the lineevent_state::irq
after the IRQ register successfully.
Fixes: 468242724143 ("gpiolib: cdev: refactor lineevent cleanup into lineevent_free") Cc: stable@vger.kernel.org Signed-off-by: Meng Li <Meng.Li@windriver.com> Reviewed-by: Kent Gibson <warthog618@gmail.com> Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We now remove the device's debugfs entries when unbinding the driver.
This now causes a NULL-pointer dereference on module exit because the
platform devices are unregistered *after* the global debugfs directory
has been recursively removed. Fix it by unregistering the devices first.
We currently check the MokSBState variable to decide whether we should
treat UEFI secure boot as being disabled, even if the firmware thinks
otherwise. This is used by shim to indicate that it is not checking
signatures on boot images. In the kernel, we use this to relax lockdown
policies.
However, in cases where shim is not even being used, we don't want this
variable to interfere with lockdown, given that the variable may be
non-volatile and therefore persist across a reboot. This means setting
it once will persistently disable lockdown checks on a given system.
So switch to the mirrored version of this variable, called MokSBStateRT,
which is supposed to be volatile, and this is something we can check.
Cc: <stable@vger.kernel.org> # v4.19+ Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Reviewed-by: Peter Jones <pjones@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When booting the x86 kernel via EFI using the LoadImage/StartImage boot
services [as opposed to the deprecated EFI handover protocol], the setup
header is taken from the image directly, and given that EFI's LoadImage
has no Linux/x86 specific knowledge regarding struct bootparams or
struct setup_header, any absolute addresses in the setup header must
originate from the file and not from a prior loading stage.
Since we cannot generally predict where LoadImage() decides to load an
image (*), such absolute addresses must be treated as suspect: even if a
prior boot stage intended to make them point somewhere inside the
[signed] image, there is no way to validate that, and if they point at
an arbitrary location in memory, the setup_data nodes will not be
covered by any signatures or TPM measurements either, and could be made
to contain an arbitrary sequence of SETUP_xxx nodes, which could
interfere quite badly with the early x86 boot sequence.
(*) Note that, while LoadImage() does take a buffer/size tuple in
addition to a device path, which can be used to provide the image
contents directly, it will re-allocate such images, as the memory
footprint of an image is generally larger than the PE/COFF file
representation.
Add support for Maple Ridge discrete USB4 host controller from Intel
which has a single USB4 port (versus the already supported dual port
Maple Ridge USB4 host controller).
Cc: stable@vger.kernel.org Signed-off-by: Gil Fine <gil.fine@intel.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On some DWC3 controllers (e.g. Rockchip SoCs), the DWC3 core
doesn't support 64-bit DMA address width. In this case, this
driver should use the default 32-bit mask. Otherwise, the DWC3
controller will break if it runs on above 4GB physical memory
environment.
This patch reads the DWC_USB3_AWIDTH bits of GHWPARAMS0 which
used for the DMA address width, and only configure 64-bit DMA
mask if the DWC_USB3_AWIDTH is 64.
Fixes: 45d39448b4d0 ("usb: dwc3: support 64 bit DMA in platform driver") Cc: stable <stable@kernel.org> Reviewed-by: Sven Peter <sven@svenpeter.dev> Signed-off-by: William Wu <william.wu@rock-chips.com> Link: https://lore.kernel.org/r/20220901083446.3799754-1-william.wu@rock-chips.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit d725d20e81c2 ("media: flexcop-usb: sanity checking of endpoint
type") tried to add an endpoint type sanity check for the single
isochronous endpoint but instead broke the driver by checking the wrong
descriptor or random data beyond the last endpoint descriptor.
Make sure to check the right endpoint descriptor.
Fixes: d725d20e81c2 ("media: flexcop-usb: sanity checking of endpoint type") Cc: Oliver Neukum <oneukum@suse.com> Cc: stable@vger.kernel.org # 5.9 Reported-by: Dongliang Mu <mudongliangabcd@gmail.com> Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20220822151027.27026-1-johan@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
1) The cleaner kthread tries to start a transaction to delete an unused
block group, but the metadata reservation can not be satisfied right
away, so a reservation ticket is created and it starts the async
metadata reclaim task (fs_info->async_reclaim_work);
2) Writeback for all the filler inodes with an i_size of 2K starts
(generic/562 creates a lot of 2K files with the goal of filling
metadata space). We try to create an inline extent for them, but we
fail when trying to insert the inline extent with -ENOSPC (at
cow_file_range_inline()) - since this is not critical, we fallback
to non-inline mode (back to cow_file_range()), reserve extents, create
extent maps and create the ordered extents;
3) An unmount starts, enters close_ctree();
4) The async reclaim task is flushing stuff, entering the flush states one
by one, until it reaches RUN_DELAYED_IPUTS. There it runs all current
delayed iputs.
After running the delayed iputs and before calling
btrfs_wait_on_delayed_iputs(), one or more ordered extents complete,
and btrfs_add_delayed_iput() is called for each one through
btrfs_finish_ordered_io() -> btrfs_put_ordered_extent(). This results
in bumping fs_info->nr_delayed_iputs from 0 to some positive value.
So the async reclaim task blocks at btrfs_wait_on_delayed_iputs() waiting
for fs_info->nr_delayed_iputs to become 0;
5) The current transaction is committed by the transaction kthread, we then
start unpinning extents and end up calling btrfs_try_granting_tickets()
through unpin_extent_range(), since we released some space.
This results in satisfying the ticket created by the cleaner kthread at
step 1, waking up the cleaner kthread;
6) At close_ctree() we ask the cleaner kthread to park;
7) The cleaner kthread starts the transaction, deletes the unused block
group, and then calls kthread_should_park(), which returns true, so it
parks. And at this point we have the delayed iputs added by the
completion of the ordered extents still pending;
8) Then later at close_ctree(), when we call:
cancel_work_sync(&fs_info->async_reclaim_work);
We hang forever, since the cleaner was parked and no one else can run
delayed iputs after that, while the reclaim task is waiting for the
remaining delayed iputs to be completed.
Fix this by waiting for all ordered extents to complete and running the
delayed iputs before attempting to stop the async reclaim tasks. Note that
we can not wait for ordered extents with btrfs_wait_ordered_roots() (or
other similar functions) because that waits for the BTRFS_ORDERED_COMPLETE
flag to be set on an ordered extent, but the delayed iput is added after
that, when doing the final btrfs_put_ordered_extent(). So instead wait for
the work queues used for executing ordered extent completion to be empty,
which works because we do the final put on an ordered extent at
btrfs_finish_ordered_io() (while we are in the unmount context).
Fixes: d6fd0ae25c6495 ("Btrfs: fix missing delayed iputs on unmount") CC: stable@vger.kernel.org # 5.15+ Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
During early unmount, at close_ctree(), we try to stop the block group
reclaim task with cancel_work_sync(), but that may hang if the block group
reclaim task is currently at btrfs_relocate_block_group() waiting for the
flag BTRFS_FS_UNFINISHED_DROPS to be cleared from fs_info->flags. During
unmount we only clear that flag later, after trying to stop the block
group reclaim task.
Fix that by clearing BTRFS_FS_UNFINISHED_DROPS before trying to stop the
block group reclaim task and after setting BTRFS_FS_CLOSING_START, so that
if the reclaim task is waiting on that bit, it will stop immediately after
being woken, because it sees the filesystem is closing (with a call to
btrfs_fs_closing()), and then returns immediately with -EINTR.
Fixes: 31e70e527806c5 ("btrfs: fix hang during unmount when block group reclaim task is running") CC: stable@vger.kernel.org # 5.15+ Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Just as with the 5570 (and the other Dell laptops), this enables the two
subwoofer speakers on the Dell Precision 5530 together with the main
ones, significantly increasing the audio quality. I've tested this
myself on a 5530 and can confirm it's working as expected.
The ASUS G15 2022 (GA503R) series laptop has the same node-to-DAC pairs
as early models and the G14, this includes bass speakers which are by
default mapped incorrectly to the 0x06 node.
The Dell Precision 5570 uses the same 4-speakers-on-ALC289 just like the
previous Precision 5560. I replicated that patch onto this one, and can
confirm that the audio is much better (the woofers are now working);
I've tested it on my Dell Precision 5570.
During the code change to add the support for devres-managed card
instance, we put an explicit kfree(card) call at the error path in
snd_card_new(). This is needed for the early error path before the
card is initialized with the device, but is rather superfluous and
causes a double-free at the error path after the card instance is
initialized, as the destructor of the card object already contains a
kfree() call.
This patch fixes the double-free situation by removing the superfluous
kfree(). Meanwhile we need to call kfree() explicitly for the early
error path, so it's added there instead.
For ARM processor, unaligned access to device memory is not allowed.
Method memcpy does not take care of alignment.
USB detection failure with the unaligned address of memory access, with
below kernel crash. To fix the unaligned address the kernel panic issue,
replace memcpy with memcpy_toio method.
The Lenovo OneLink+ Dock contains two VL812 USB3.0 controllers:
17ef:1018 upstream
17ef:1019 downstream
Those two controllers both have problems with some USB3.0 devices,
particularly self-powered ones. Typical error messages include:
Timeout while waiting for setup device command
device not accepting address X, error -62
unable to enumerate USB device
By process of elimination the controllers themselves were identified as
the cause of the problem. Through trial and error the issue was solved
by using USB_QUIRK_RESET_RESUME for both chips.
Relocate the pullups_connected check until after it is ensured that there
are no runtime PM transitions. If another context triggered the DWC3
core's runtime resume, it may have already enabled the Run/Stop. Do not
re-run the entire pullup sequence again, as it may issue a core soft
reset while Run/Stop is already set.
This patch depends on
commit 69e131d1ac4e ("usb: dwc3: gadget: Prevent repeat pullup()")
If the GEVNTCOUNT indicates events in the event buffer, the driver needs
to acknowledge them before the controller can halt. Simply let the
interrupt handler acknowledges the remaining event generated by the
controller while polling for DSTS.DEVCTLHLT. This avoids disabling irq
and taking care of race condition between the interrupt handlers and
pullup().
Don't do soft-disconnect if it's previously done. Likewise, don't do
soft-connect if the device is currently connected and running. It would
break normal operation.
Currently the caller of pullup() (udc's sysfs soft_connect) only checks
if it had initiated disconnect to prevent repeating soft-disconnect. It
doesn't check for soft-connect. To be safe, let's keep the check here
regardless whether the udc core is fixed.
It is recommended by the Synopsis databook to issue a DCTL.CSftReset
when reconnecting from a device-initiated disconnect routine. This
resolves issues with enumeration during fast composition switching
cases, which result in an unknown device on the host.
There is a race present where the DWC3 runtime resume runs in parallel
to the UDC unbind sequence. This will eventually lead to a possible
scenario where we are enabling the run/stop bit, without a valid
composition defined.
Thread#1 (handling UDC unbind):
usb_gadget_remove_driver()
-->usb_gadget_disconnect()
-->dwc3_gadget_pullup(0)
--> continue UDC unbind sequence
-->Thread#2 is running in parallel here
Thread#2 (handing next cable connect)
__dwc3_set_mode()
-->pm_runtime_get_sync()
-->dwc3_gadget_resume()
-->dwc->gadget_driver is NOT NULL yet
-->dwc3_gadget_run_stop(1)
--> _dwc3gadget_start()
...
Fix this by tracking the pullup disable routine, and avoiding resuming
of the DWC3 gadget. Once the UDC is re-binded, that will trigger the
pullup enable routine, which would handle enabling the DWC3 gadget.
The new r8188eu driver doesn't actually support devices with vendor ID 0bda
and product ID f179[0][1][2], remove the ID so owners of these devices
don't have to blacklist the staging driver.
Move common IP init before GMC init so that HDP gets
remapped before GMC init which uses it.
This fixes the Unsupported Request error reported through
AER during driver load. The error happens as a write happens
to the remap offset before real remapping is done.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373
The error was unnoticed before and got visible because of the commit
referenced below. This doesn't fix anything in the commit below, rather
fixes the issue in amdgpu exposed by the commit. The reference is only
to associate this commit with below one so that both go together.
Fixes: 8795e182b02d ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()") Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
We want to be able to call virt data exchange conditionally
after gmc sw init to reserve bad pages as early as possible.
Since this is a conditional call, we will need
to call it again unconditionally later in the init sequence.
Refactor the data exchange function so it can be
called multiple times without re-initializing the work item.
v2: Cleaned up the code. Kept the original call to init_exchange_data()
inside early init to initialize the work item, afterwards call
exchange_data() when needed.
Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com>
Reviewed By: Shaoyun.liu <Shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
syzbot is hitting percpu_rwsem_assert_held(&cpu_hotplug_lock) warning at
cpuset_attach() [1], for commit 4f7e7236435ca0ab ("cgroup: Fix
threadgroup_rwsem <-> cpus_read_lock() deadlock") missed that
cpuset_attach() is also called from cgroup_attach_task_all().
Add cpus_read_lock() like what cgroup_procs_write_start() does.
Flush the CPU caches when memory is reclaimed from an SEV guest (where
reclaim also includes it being unmapped from KVM's memslots). Due to lack
of coherency for SEV encrypted memory, failure to flush results in silent
data corruption if userspace is malicious/broken and doesn't ensure SEV
guest memory is properly pinned and unpinned.
Cache coherency is not enforced across the VM boundary in SEV (AMD APM
vol.2 Section 15.34.7). Confidential cachelines, generated by confidential
VM guests have to be explicitly flushed on the host side. If a memory page
containing dirty confidential cachelines was released by VM and reallocated
to another user, the cachelines may corrupt the new user at a later time.
KVM takes a shortcut by assuming all confidential memory remain pinned
until the end of VM lifetime. Therefore, KVM does not flush cache at
mmu_notifier invalidation events. Because of this incorrect assumption and
the lack of cache flushing, malicous userspace can crash the host kernel:
creating a malicious VM and continuously allocates/releases unpinned
confidential memory pages when the VM is running.
Add cache flush operations to mmu_notifier operations to ensure that any
physical memory leaving the guest VM get flushed. In particular, hook
mmu_notifier_invalidate_range_start and mmu_notifier_release events and
flush cache accordingly. The hook after releasing the mmu lock to avoid
contention with other vCPUs.
If we set XFRM security policy by calling setsockopt with option
IPV6_XFRM_POLICY, the policy will be stored in 'sock_policy' in 'sock'
struct. However tcp_v6_send_response doesn't look up dst_entry with the
actual socket but looks up with tcp control socket. This may cause a
problem that a RST packet is sent without ESP encryption & peer's TCP
socket can't receive it.
This patch will make the function look up dest_entry with actual socket,
if the socket has XFRM policy(sock_policy), so that the TCP response
packet via this function can be encrypted, & aligned on the encrypted
TCP socket.
Tested: We encountered this problem when a TCP socket which is encrypted
in ESP transport mode encryption, receives challenge ACK at SYN_SENT
state. After receiving challenge ACK, TCP needs to send RST to
establish the socket at next SYN try. But the RST was not encrypted &
peer TCP socket still remains on ESTABLISHED state.
So we verified this with test step as below.
[Test step]
1. Making a TCP state mismatch between client(IDLE) & server(ESTABLISHED).
2. Client tries a new connection on the same TCP ports(src & dst).
3. Server will return challenge ACK instead of SYN,ACK.
4. Client will send RST to server to clear the SOCKET.
5. Client will retransmit SYN to server on the same TCP ports.
[Expected result]
The TCP connection should be established.
Cc: Maciej Żenczykowski <maze@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Sehee Lee <seheele@google.com> Signed-off-by: Sewook Seo <sewookseo@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In pxa3xx_gcu_write, a count parameter of type size_t is passed to words of
type int. Then, copy_from_user() may cause a heap overflow because it is used
as the third argument of copy_from_user().
The L0 symbol exists in System.map, but not in .tmp_System.map. When
"cmp -s System.map .tmp_System.map" will show "Inconsistent kallsyms
data" error message in link-vmlinux.sh script.
Enabling panfrost GPU OPP with dynamic regulator will make OPP
responsible to enable and configure it.
Unfortunately OPP configure and enable the regulator when an OPP
is asked to be set, which is not the case during
panfrost_devfreq_init().
This leave the regulator unconfigured and if no GPU load is
triggered, no OPP is asked to be set which make the regulator framework
switching it off during regulator_late_cleanup() without
noticing and therefore make the board hang as any access to GPU
memory space make bus locks up.
Call dev_pm_opp_set_opp() with the recommend OPP in
panfrost_devfreq_init() to enable the regulator, this will properly
configure and enable the regulator and will avoid any switch off
by regulator_late_cleanup().
When trying to get a file lock on an AFS file, the server may return
UAEAGAIN to indicate that the lock is already held. This is currently
translated by the default path to -EREMOTEIO.
Translate it instead to -EAGAIN so that we know we can retry it.
Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeffrey E Altman <jaltman@auristor.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/166075761334.3533338.2591992675160918098.stgit@warthog.procyon.org.uk/ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
AZA HW may send a burst read/write request crossing 4K memory boundary.
The 4KB boundary is not guaranteed by Tegra HDA HW. Make SW change to
include the flag AZX_DCAPS_4K_BDLE_BOUNDARY to align BDLE to 4K
boundary.