When a reset was triggered on one VF with i40e_reset_vf
global PF state __I40E_VF_DISABLE was set on a PF until
the reset finished. If immediately after triggering reset
on one VF there is a request to reset on another
it will cause a hang on VF side because VF will be notified
of incoming reset but the reset will never happen because
of this global state, we will get such error message:
[ +4.890195] iavf 0000:86:02.1: Never saw reset
and VF will hang waiting for the reset to be triggered.
Fix this by introducing new VF state I40E_VF_STATE_RESETTING
that will be set on a VF if it is currently resetting instead of
the global __I40E_VF_DISABLE PF state.
Fixes: 3ba9bcb4b68f ("i40e: add locking around VF reset") Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20221024100526.1874914-2-jacob.e.keller@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
the driver would fail to setup this setting on X722
device since it was using the mask on the register
dedicated for X710 devices.
Apply a different mask on the register when setting the
RSS hash for the X722 device.
When displaying the flow types enabled via ethtool:
ethtool -n $pf rx-flow-hash tcp4|tcp6|udp4|udp6
the driver would print wrong values for X722 device.
Fix this issue by testing masks for X722 device in
i40e_get_rss_hash_opts function.
Fixes: eb0dd6e4a3b3 ("i40e: Allow RSS Hash set with less than four parameters") Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com> Signed-off-by: Michal Jaron <michalx.jaron@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20221024100526.1874914-1-jacob.e.keller@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Another syzbot report [1] with no reproducer hints
at a bug in ip6_gre tunnel (dev:ip6gretap0)
Since ipv6 mcast code makes sure to read dev->mtu once
and applies a sanity check on it (see commit b9b312a7a451
"ipv6: mcast: better catch silly mtu values"), a remaining
possibility is that a layer is able to set dev->mtu to
an underflowed value (high order bit set).
This could happen indeed in ip6gre_tnl_link_config_route(),
ip6_tnl_link_config() and ipip6_tunnel_bind_dev()
Make sure to sanitize mtu value in a local variable before
it is written once on dev->mtu, as lockless readers could
catch wrong temporary value.
Fixes: c12b395a4664 ("gre: Support GRE over IPv6") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20221024020124.3756833-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The testcase "stat_all_metrics.sh" verifies perf stat result for all the
metric events present in perf list. It runs perf metric events with
various commands and expects non-empty metric result.
Incase of powerpc:hv-24x7 events, some of the event count can be 0 based
on system configuration. And if that event used as denominator in divide
equation, it can cause divide by 0 error. The current nest_metric.json
file creating divide by 0 issue for some of the metric events, which
results in failure of the "stat_all_metrics.sh" test case.
Most of the metrics events have cycles or an event which expect to have
a larger value as denominator, so adding 1 to the denominator of the
metric expression as a fix.
VIDIOC_S_FBUF is by definition a scary ioctl, which is why only root
can use it. But at least check if the framebuffer parameters match that
of one of the framebuffer created by vivid, and reject anything else.
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Fixes: ef834f7836ec ([media] vivid: add the video capture and output parts) Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Hybrid sleep is currently hardcoded to only operate with S3 even
on systems that might not support it.
Instead of assuming this mode is what the user wants to use, for
hybrid sleep follow the setting of `mem_sleep_current` which
will respect mem_sleep_default kernel command line and policy
decisions made by the presence of the FADT low power idle bit.
Fixes: 81d45bdf8913 ("PM / hibernate: Untangle power_down()") Reported-and-tested-by: kolAflash <kolAflash@kolahilft.de> Link: https://bugzilla.kernel.org/show_bug.cgi?id=216574 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The commit 1149108e2fbf ("can: mscan: improve clock API use") only
adds put_clock() in mpc5xxx_can_remove() function, forgetting to add
put_clock() in the error handling code.
Fix this bug by adding put_clock() in the error handling code.
Fixes: 1149108e2fbf ("can: mscan: improve clock API use") Signed-off-by: Dongliang Mu <dzm91@hust.edu.cn> Link: https://lore.kernel.org/all/20221024133828.35881-1-mkl@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
If the number of pages from the userptr BO differs from the SG BO then the
allocated memory for the SG table doesn't get freed before returning
-EINVAL, which may lead to a memory leak in some error paths. Fix this by
checking the number of pages before allocating memory for the SG table.
Fixes: 264fb4d332f5 ("drm/amdgpu: Add multi-GPU DMA mapping helpers") Signed-off-by: Rafael Mendonca <rafaelmendsr@gmail.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
As Shakeel explains the commit under Fixes had the unintended
side-effect of no longer pre-loading the cached memory allowance.
Even tho we previously dropped the first packet received when
over memory limit - the consecutive ones would get thru by using
the cache. The charging was happening in batches of 128kB, so
we'd let in 128kB (truesize) worth of packets per one drop.
After the change we no longer force charge, there will be no
cache filling side effects. This causes significant drops and
connection stalls for workloads which use a lot of page cache,
since we can't reclaim page cache under GFP_NOWAIT.
Some of the latency can be recovered by improving SACK reneg
handling but nowhere near enough to get back to the pre-5.15
performance (the application I'm experimenting with still
sees 5-10x worst latency).
Apply the suggested workaround of using GFP_ATOMIC. We will now
be more permissive than previously as we'll drop _no_ packets
in softirq when under pressure. But I can't think of any good
and simple way to address that within networking.
This commit fixes a bug that can cause a TCP data sender to repeatedly
defer RTOs when encountering SACK reneging.
The bug is that when we're in fast recovery in a scenario with SACK
reneging, every time we get an ACK we call tcp_check_sack_reneging()
and it can note the apparent SACK reneging and rearm the RTO timer for
srtt/2 into the future. In some SACK reneging scenarios that can
happen repeatedly until the receive window fills up, at which point
the sender can't send any more, the ACKs stop arriving, and the RTO
fires at srtt/2 after the last ACK. But that can take far too long
(O(10 secs)), since the connection is stuck in fast recovery with a
low cwnd that cannot grow beyond ssthresh, even if more bandwidth is
available.
This fix changes the logic in tcp_check_sack_reneging() to only rearm
the RTO timer if data is cumulatively ACKed, indicating forward
progress. This avoids this kind of nearly infinite loop of RTO timer
re-arming. In addition, this meets the goals of
tcp_check_sack_reneging() in handling Windows TCP behavior that looks
temporarily like SACK reneging but is not really.
Many thanks to Jakub Kicinski and Neil Spring, who reported this issue
and provided critical packet traces that enabled root-causing this
issue. Also, many thanks to Jakub Kicinski for testing this fix.
Fixes: 5ae344c949e7 ("tcp: reduce spurious retransmits due to transient SACK reneging") Reported-by: Jakub Kicinski <kuba@kernel.org> Reported-by: Neil Spring <ntspring@fb.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Cc: Yuchung Cheng <ycheng@google.com> Tested-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/20221021170821.1093930-1-ncardwell.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The type of sk_rcvbuf and sk_sndbuf in struct sock is int, and
in tcp_add_backlog(), the variable limit is caculated by adding
sk_rcvbuf, sk_sndbuf and 64 * 1024, it may exceed the max value
of int and overflow. This patch reduces the limit budget by
halving the sndbuf to solve this issue since ACK packets are much
smaller than the payload.
Fixes: c9c3321257e1 ("tcp: add tcp_add_backlog()") Signed-off-by: Lu Wei <luwei32@huawei.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
If packet is going to be coalesced, sk_sndbuf/sk_rcvbuf values
are not used. Defer their access to the point we need them.
Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of: ec791d8149ff ("tcp: fix a signed-integer-overflow bug in tcp_add_backlog()") Signed-off-by: Sasha Levin <sashal@kernel.org>
When the ops_init() interface is invoked to initialize the net, but
ops->init() fails, data is released. However, the ptr pointer in
net->gen is invalid. In this case, when nfqnl_nf_hook_drop() is invoked
to release the net, invalid address access occurs.
The process is as follows:
setup_net()
ops_init()
data = kzalloc(...) ---> alloc "data"
net_assign_generic() ---> assign "date" to ptr in net->gen
...
ops->init() ---> failed
...
kfree(data); ---> ptr in net->gen is invalid
...
ops_exit_list()
...
nfqnl_nf_hook_drop()
*q = nfnl_queue_pernet(net) ---> q is invalid
The following is the Call Trace information:
BUG: KASAN: use-after-free in nfqnl_nf_hook_drop+0x264/0x280
Read of size 8 at addr ffff88810396b240 by task ip/15855
Call Trace:
<TASK>
dump_stack_lvl+0x8e/0xd1
print_report+0x155/0x454
kasan_report+0xba/0x1f0
nfqnl_nf_hook_drop+0x264/0x280
nf_queue_nf_hook_drop+0x8b/0x1b0
__nf_unregister_net_hook+0x1ae/0x5a0
nf_unregister_net_hooks+0xde/0x130
ops_exit_list+0xb0/0x170
setup_net+0x7ac/0xbd0
copy_net_ns+0x2e6/0x6b0
create_new_namespaces+0x382/0xa50
unshare_nsproxy_namespaces+0xa6/0x1c0
ksys_unshare+0x3a4/0x7e0
__x64_sys_unshare+0x2d/0x40
do_syscall_64+0x35/0x80
entry_SYSCALL_64_after_hwframe+0x46/0xb0
</TASK>
Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 17869 Comm: syz-executor.2 Not tainted 6.1.0-rc1-syzkaller-00010-gbb1a1146467a-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/22/2022
Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
value changed: 0xffff88812971ce00 -> 0x0000000000000000
Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 5859 Comm: syz-executor.3 Not tainted 6.0.0-syzkaller-12189-g19d17ab7c68b-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/22/2022
Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
NIC is stopped with rtnl_lock held, and during the stop it cancels the
'service_task' work and free irqs.
However, if CONFIG_MACSEC is set, rtnl_lock is acquired both from
aq_nic_service_task and aq_linkstate_threaded_isr. Then a deadlock
happens if aq_nic_stop tries to cancel/disable them when they've already
started their execution.
As the deadlock is caused by rtnl_lock, it causes many other processes
to stall, not only atlantic related stuff.
Fix it by introducing a mutex that protects each NIC's macsec related
data, and locking it instead of the rtnl_lock from the service task and
the threaded IRQ.
Before this patch, all macsec data was protected with rtnl_lock, but
maybe not all of it needs to be protected. With this new mutex, further
efforts can be made to limit the protected data only to that which
requires it. However, probably it doesn't worth it because all macsec's
data accesses are infrequent, and almost all are done from macsec_ops
or ethtool callbacks, called holding rtnl_lock, so macsec_mutex won't
never be much contended.
The issue appeared repeteadly attaching and deattaching the NIC to a
bond interface. Doing that after this patch I cannot reproduce the bug.
Fixes: 62c1c2e606f6 ("net: atlantic: MACSec offload skeleton") Reported-by: Li Liang <liali@redhat.com> Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Íñigo Huguet <ihuguet@redhat.com> Reviewed-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
The offset 12 (bit-rate) of EEPROM SFP DAC (passive) cables is expected
to be in the range 0x64 to 0x68. However, the 5 meter and 7 meter Molex
passive cables have the rate ceiling 0x78 at offset 12.
Add a quirk for Molex passive cables to extend the rate ceiling to 0x78.
Fixes: abf0a1c2b26a ("amd-xgbe: Add support for SFP+ modules") Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com> Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The current XGBE code assumes that offset 6 of EEPROM SFP DAC (passive)
cables is NULL. However, some cables (the 5 meter and 7 meter Molex
passive cables) have non-zero data at offset 6. Fix the logic by moving
the passive cable check above the active checks, so as not to be
improperly identified as an active cable. This will fix the issue for
any passive cable that advertises 1000Base-CX in offset 6.
Fixes: abf0a1c2b26a ("amd-xgbe: Add support for SFP+ modules") Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com> Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
When a console stack dump is initiated with CONFIG_GCOV_PROFILE_ALL
enabled, show_trace_log_lvl() gets out of sync with the ORC unwinder,
causing the stack trace to show all text addresses as unreliable:
This happens when the compiled code for show_stack() has a single word
on the stack, and doesn't use a tail call to show_stack_log_lvl().
(CONFIG_GCOV_PROFILE_ALL=y is the only known case of this.) Then the
__unwind_start() skip logic hits an off-by-one bug and fails to unwind
all the way to the intended starting frame.
Fix it by reverting the following commit:
f1d9a2abff66 ("x86/unwind/orc: Don't skip the first frame for inactive tasks")
The original justification for that commit no longer exists. That
original issue was later fixed in a different way, with the following
commit:
f2ac57a4c49d ("x86/unwind/orc: Fix inactive tasks with stack pointer in %sp on GCC 10 compiled kernels")
Fixes: f1d9a2abff66 ("x86/unwind/orc: Don't skip the first frame for inactive tasks") Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com>
[jpoimboe: rewrite commit log] Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The `macb_resume`/`macb_suspend` methods already call the
`phylink_start`/`phylink_stop` methods during their execution so
explicitly say that the PM of the PHY is done by MAC by using the
`mac_managed_pm` flag of the `struct phylink_config`.
This also fixes the warning message issued during resume:
WARNING: CPU: 0 PID: 237 at drivers/net/phy/phy_device.c:323 mdio_bus_phy_resume+0x144/0x148
In hinic_vf_func_init(), if VF fails to register information with PF
through the MBOX, the MBOX callback function of VF is released once. But
it is released again in hinic_init_hwdev(). Remove one.
The value of lli_credit_cnt is incorrectly assigned, fix it.
Fixes: a0337c0dee68 ("hinic: add support to set and get irq coalesce") Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
If phy_device_register() fails, phy_device_free() need be called to
put refcount, so memory of phy device and device name can be freed
in callback function.
If get_phy_device() fails, mdiobus_unregister() need be called,
or it will cause warning in mdiobus_free() and kobject is leaked.
It was caused by srv->listener that might be set to null by
tipc_topsrv_stop() in net .exit whereas it's still used in
tipc_topsrv_accept() worker.
srv->listener is protected by srv->idr_lock in tipc_topsrv_stop(), so add
a check for srv->listener under srv->idr_lock in tipc_topsrv_accept() to
avoid the null-ptr-deref. To ensure the lsock is not released during the
tipc_topsrv_accept(), move sock_release() after tipc_topsrv_work_stop()
where it's waiting until the tipc_topsrv_accept worker to be done.
Note that sk_callback_lock is used to protect sk->sk_user_data instead of
srv->listener, and it should check srv in tipc_topsrv_listener_data_ready()
instead. This also ensures that no more tipc_topsrv_accept worker will be
started after tipc_conn_close() is called in tipc_topsrv_stop() where it
sets sk->sk_user_data to null.
Fixes: 0ef897be12b8 ("tipc: separate topology server listener socket from subcsriber sockets") Reported-by: syzbot+c5ce866a8d30f4be0651@syzkaller.appspotmail.com Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Jon Maloy <jmaloy@redhat.com> Link: https://lore.kernel.org/r/4eee264380c409c61c6451af1059b7fb271a7e7b.1666120790.git.lucien.xin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
clear_cpu_cap(&boot_cpu_data) is very similar to setup_clear_cpu_cap()
except that the latter also sets a bit in 'cpu_caps_cleared' which
later clears the same cap in secondary cpus, which is likely what is
meant here.
Fixes: 47125db27e47 ("perf/x86/intel/lbr: Support Architectural LBR") Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Link: https://lkml.kernel.org/r/20220718141123.136106-2-mlevitsk@redhat.com Signed-off-by: Sasha Levin <sashal@kernel.org>
If device_register() fails in snd_ac97_dev_register(), it should
call put_device() to give up reference, or the name allocated in
dev_set_name() is leaked.
The 'chip_np' returned by of_get_next_child() with refcount decremented,
of_node_put() need be called in error path to decrease the refcount.
Fixes: bfc618fcc3f1 ("mtd: rawnand: intel: Read the chip-select line from the correct OF node") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Reviewed-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20220924131010.957117-1-yangyingliang@huawei.com Signed-off-by: Sasha Levin <sashal@kernel.org>
Add 'volatile' to iounmap()'s argument to prevent build warnings.
This make it the same as other major architectures.
Placates these warnings: (12 such warnings)
../drivers/video/fbdev/riva/fbdev.c: In function 'rivafb_probe':
../drivers/video/fbdev/riva/fbdev.c:2067:42: error: passing argument 1 of 'iounmap' discards 'volatile' qualifier from pointer target type [-Werror=discarded-qualifiers]
2067 | iounmap(default_par->riva.PRAMIN);
In commit 97886d9dcd86 ("sched: Migration changes for core scheduling"),
sched_group_cookie_match() was added to help determine if a cookie
matches the core state.
However, while it iterates the SMT group, it fails to actually use the
RQ for each of the CPUs iterated, use cpu_rq(cpu) instead of rq to fix
things.
Fixes: 97886d9dcd86 ("sched: Migration changes for core scheduling") Signed-off-by: Lin Shengwang <linshengwang1@huawei.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20221008022709.642-1-linshengwang1@huawei.com Signed-off-by: Sasha Levin <sashal@kernel.org>
Due to the implementation of how SIGTRAP are delivered if
perf_event_attr::sigtrap is set, we've noticed 3 issues:
1. Missing SIGTRAP due to a race with event_sched_out() (more
details below).
2. Hardware PMU events being disabled due to returning 1 from
perf_event_overflow(). The only way to re-enable the event is
for user space to first "properly" disable the event and then
re-enable it.
3. The inability to automatically disable an event after a
specified number of overflows via PERF_EVENT_IOC_REFRESH.
The worst of the 3 issues is problem (1), which occurs when a
pending_disable is "consumed" by a racing event_sched_out(), observed
as follows:
In the above case, not only is that particular SIGTRAP missed, but also
all future SIGTRAPs because 'event_limit' is not reset back to 1.
To fix, rework pending delivery of SIGTRAP via IRQ-work by introduction
of a separate 'pending_sigtrap', no longer using 'event_limit' and
'pending_disable' for its delivery.
Additionally; and different to Marco's proposed patch:
- recognise that pending_disable effectively duplicates oncpu for
the case where it is set. As such, change the irq_work handler to
use ->oncpu to target the event and use pending_* as boolean toggles.
- observe that SIGTRAP targets the ctx->task, so the context switch
optimization that carries contexts between tasks is invalid. If
the irq_work were delayed enough to hit after a context switch the
SIGTRAP would be delivered to the wrong task.
- observe that if the event gets scheduled out
(rotation/migration/context-switch/...) the irq-work would be
insufficient to deliver the SIGTRAP when the event gets scheduled
back in (the irq-work might still be pending on the old CPU).
Therefore have event_sched_out() convert the pending sigtrap into a
task_work which will deliver the signal at return_to_user.
Fixes: 97ba62b27867 ("perf: Add support for SIGTRAP on perf events") Reported-by: Dmitry Vyukov <dvyukov@google.com> Debugged-by: Dmitry Vyukov <dvyukov@google.com> Reported-by: Marco Elver <elver@google.com> Debugged-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Marco Elver <elver@google.com> Tested-by: Marco Elver <elver@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Update HDMI volatile registers list as DMA, Channel Selection registers,
vbit control registers are being reflected by hardware DP port
disconnection.
This update is required to fix no display and no sound issue observed
after reconnecting TAMA/SANWA DP cables.
Once DP cable is unplugged, DMA control registers are being reset by
hardware, however at second plugin, new dma control values does not
updated to the dma hardware registers since new register value and
cached values at the time of first plugin are same.
Fixes: 7cb37b7bd0d3 ("ASoC: qcom: Add support for lpass hdmi driver") Signed-off-by: Srinivasa Rao Mandadapu <quic_srivasam@quicinc.com> Reported-by: Kuogee Hsieh <quic_khsieh@quicinc.com> Link: https://lore.kernel.org/r/1665637711-13300-1-git-send-email-quic_srivasam@quicinc.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
It's required by vm_userspace_mem_region_add() that memory size
should be aligned to host page size. However, one guest page is
provided by memslot_modification_stress_test. It triggers failure
in the scenario of 64KB-page-size-host and 4KB-page-size-guest,
as the following messages indicate.
# ./memslot_modification_stress_test
Testing guest mode: PA-bits:40, VA-bits:48, 4K pages
guest physical test memory: [0xffbfff0000, 0xffffff0000)
Finished creating vCPUs
Started all vCPUs
==== Test Assertion Failure ====
lib/kvm_util.c:824: vm_adjust_num_guest_pages(vm->mode, npages) == npages
pid=5712 tid=5712 errno=0 - Success
1 0x0000000000404eeb: vm_userspace_mem_region_add at kvm_util.c:822
2 0x0000000000401a5b: add_remove_memslot at memslot_modification_stress_test.c:82
3 (inlined by) run_test at memslot_modification_stress_test.c:110
4 0x0000000000402417: for_each_guest_mode at guest_modes.c:100
5 0x00000000004016a7: main at memslot_modification_stress_test.c:187
6 0x0000ffffb8cd4383: ?? ??:0
7 0x0000000000401827: _start at :?
Number of guest pages is not compatible with the host. Try npages=16
Fix the issue by providing 16 guest pages to the memory slot for this
particular combination of 64KB-page-size-host and 4KB-page-size-guest
on aarch64.
The mode_valid field in drm_connector_helper_funcs is expected to be of
type:
enum drm_mode_status (* mode_valid) (struct drm_connector *connector,
struct drm_display_mode *mode);
The mismatched return type breaks forward edge kCFI since the underlying
function definition does not match the function hook definition.
The return type of mdp4_lvds_connector_mode_valid should be changed from
int to enum drm_mode_status.
The "height" and "width" values come from the user so the "height * width"
multiplication can overflow.
Link: https://lore.kernel.org/r/YxBBCRnm3mmvaiuR@kili Fixes: a49d25364dfb ("staging/atomisp: Add support for the Intel IPU v2") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The doc says the I²C device's name is used if devname is NULL, but
actually the I²C device driver's name is used.
Fixes: 0658293012af ("media: v4l: subdev: Add a function to set an I²C sub-device's name") Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Before switching back to the right partition in mmc_blk_reset there used
to be a check if hw_reset was even supported. This return value
was removed, so there is no reason to check. Furthermore ensure
part_curr is not falsely set to a valid value on reset or
partition switch error.
As part of this change the code paths of mmc_blk_reset calls were checked
to ensure no commands are issued after a failed mmc_blk_reset directly
without going through the block layer.
Fixes: fefdd3c91e0a ("mmc: core: Drop superfluous validations in mmc_hw|sw_reset()") Cc: stable@vger.kernel.org Signed-off-by: Christian Loehle <cloehle@hyperstone.com> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Link: https://lore.kernel.org/r/e91be6199d04414a91e20611c81bfe1d@hyperstone.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
LTS 5.15 pulled in several lpfc "SLI Path split" patches. The Path
Split mods were a 14-patch set, which refactors the driver from
to split the sli-3 hw (now eol) from the sli-4 hw and use sli4
structures natively. The patches are highly inter-related.
Given only some of the patches were included, it created a situation
where FLOGI's fail, thus SLI Ports can't start communication.
Reverting this patch as its one of the partial Path Split patches
that was included.
Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
LTS 5.15 pulled in several lpfc "SLI Path split" patches. The Path
Split mods were a 14-patch set, which refactors the driver from
to split the sli-3 hw (now eol) from the sli-4 hw and use sli4
structures natively. The patches are highly inter-related.
Given only some of the patches were included, it created a situation
where FLOGI's fail, thus SLI Ports can't start communication.
Reverting this patch as its one of the partial Path Split patches
that was included.
Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
LTS 5.15 pulled in several lpfc "SLI Path split" patches. The Path
Split mods were a 14-patch set, which refactors the driver from
to split the sli-3 hw (now eol) from the sli-4 hw and use sli4
structures natively. The patches are highly inter-related.
Given only some of the patches were included, it created a situation
where FLOGI's fail, thus SLI Ports can't start communication.
Reverting this patch as its one of the partial Path Split patches
that was included.
NOTE: fixed a git revert error which caused a new line to be inserted:
line 5755 of lpfc_scsi.c in lpfc_queuecommand
+ atomic_inc(&ndlp->cmd_pending);
Removed the line
Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
LTS 5.15 pulled in several lpfc "SLI Path split" patches. The Path
Split mods were a 14-patch set, which refactors the driver from
to split the sli-3 hw (now eol) from the sli-4 hw and use sli4
structures natively. The patches are highly inter-related.
Given only some of the patches were included, it created a situation
where FLOGI's fail, thus SLI Ports can't start communication.
Reverting this patch as its a fix specific to the Path Split patches,
which were partially included and now being pulled from 5.15.
Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
LTS 5.15 pulled in several lpfc "SLI Path split" patches. The Path
Split mods were a 14-patch set, which refactors the driver from
to split the sli-3 hw (now eol) from the sli-4 hw and use sli4
structures natively. The patches are highly inter-related.
Given only some of the patches were included, it created a situation
where FLOGI's fail, thus SLI Ports can't start communication.
Reverting this patch as its a fix specific to the Path Split patches,
which were partially included and now being pulled from 5.15.
Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
LTS 5.15 pulled in several lpfc "SLI Path split" patches. The Path
Split mods were a 14-patch set, which refactors the driver from
to split the sli-3 hw (now eol) from the sli-4 hw and use sli4
structures natively. The patches are highly inter-related.
Given only some of the patches were included, it created a situation
where FLOGI's fail, thus SLI Ports can't start communication.
Reverting this patch as its a fix specific to the Path Split patches,
which were partially included and now being pulled from 5.15.
Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
For some exception types the instruction address points behind the
instruction that caused the exception. Take that into account and add
the missing exception table entry.
For some exception types the instruction address points behind the
instruction that caused the exception. Take that into account and add
the missing exception table entry.
For modules, names from kallsyms__parse() contain the module name which
meant that module symbols did not match exactly by name.
Fix by matching the name string up to the separating tab character.
Fixes: 1b36c03e356936d6 ("perf record: Add support for using symbols in address filters") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20221026072736.2982-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since commit d9820ff ("ARC: mm: switch pgtable_t back to struct page *")
a memory leakage problem occurs. Memory allocated for page table entries
not released during process termination. This issue can be reproduced by
a small program that allocates a large amount of memory. After several
runs, you'll see that the amount of free memory has reduced and will
continue to reduce after each run. All ARC CPUs are effected by this
issue. The issue was introduced since the kernel stable release v5.15-rc1.
As described in commit d9820ff after switch pgtable_t back to struct
page *, a pointer to "struct page" and appropriate functions are used to
allocate and free a memory page for PTEs, but the pmd_pgtable macro hasn't
changed and returns the direct virtual address from the PMD (PGD) entry.
Than this address used as a parameter in the __pte_free() and as a result
this function couldn't release memory page allocated for PTEs.
Fix this issue by changing the pmd_pgtable macro and returning pointer to
struct page.
Fixes: d9820ff76f95 ("ARC: mm: switch pgtable_t back to struct page *") Cc: Mike Rapoport <rppt@kernel.org> Cc: <stable@vger.kernel.org> # 5.15.x Signed-off-by: Pavel Kozlov <pavel.kozlov@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Syzkaller managed to trigger concurrent calls to
kernfs_remove_by_name_ns() for the same file resulting in
a KASAN detected use-after-free. The race occurs when the root
node is freed during kernfs_drain().
To prevent this acquire an additional reference for the root
of the tree that is removed before calling __kernfs_remove().
Found by syzkaller with the following reproducer (slab_nomerge is
required):
The buggy address belongs to the object at ffff888008880780
which belongs to the cache kernfs_node_cache of size 128
The buggy address is located 112 bytes inside of
128-byte region [ffff888008880780, ffff888008880800)
Memory state around the buggy address: ffff888008880680: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb ffff888008880700: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
>ffff888008880780: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^ ffff888008880800: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb ffff888008880880: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
==================================================================
The signal_read(), action_read(), and action_write() callbacks have been
assuming Signal0 is requested without checking. This results in requests
for Signal1 returning data for Signal0. This patch fixes these
oversights by properly checking for the Signal's id in the respective
callbacks and handling accordingly based on the particular Signal
requested. The trig_inverted member of the mchp_tc_data is removed as
superfluous.
The core issues the warning "drop HS400 support since no 8-bit bus" when
one of the ESDHC_FLAG_HS400* flags is set on a non 8bit capable host. To
avoid this warning set these flags only on hosts that actually can do
8bit, i.e. have bus-width = <8> set in the device tree.
Enhanced Strobe (ES) does not work correctly on the ASUS 1100 series of
devices. Jasper Lake eMMCs (pci_id 8086:4dc4) are supposed to support
ES. There are also two system families under the series, thus this is
being scoped to the ASUS BIOS.
The failing ES prevents the installer from writing to disk. Falling back
to HS400 without ES fixes the issue.
Signed-off-by: Patrick Thompson <ptf@google.com> Fixes: 315e3bd7ac19 ("mmc: sdhci-pci: Add support for Intel JSL") Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20221013210017.3751025-1-ptf@google.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
SDIO tuple is only allocated for standard SDIO card, especially it causes
memory corruption issues when the non-standard SDIO card has removed, which
is because the card device's reference counter does not increase for it at
sdio_init_func(), but all SDIO card device reference counter gets decreased
at sdio_release_func().
Fixes: 6f51be3d37df ("sdio: allow non-standard SDIO cards") Signed-off-by: Matthew Ma <mahongwei@zeku.com> Reviewed-by: Weizhao Ouyang <ouyangweizhao@zeku.com> Reviewed-by: John Wang <wangdayu@zeku.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20221014034951.2300386-1-ouyangweizhao@zeku.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
REGMAP_MMIO is not user-configurable, so we can only satisfy this
dependency by enabling some other Kconfig symbol that properly 'select's
it. Use select like everybody else.
Noticed when trying to enable this driver for compile testing.
cti_enable_hw() and cti_disable_hw() are called from an atomic context
so shouldn't use runtime PM because it can result in a sleep when
communicating with firmware.
Since commit 3c6656337852 ("Revert "firmware: arm_scmi: Add clock
management to the SCMI power domain""), this causes a hang on Juno when
running the Perf Coresight tests or running this command:
perf record -e cs_etm//u -- ls
This was also missed until the revert commit because pm_runtime_put()
was called with the wrong device until commit 692c9a499b28 ("coresight:
cti: Correct the parameter for pm_runtime_put")
With lock and scheduler debugging enabled the following is output:
coresight cti_sys0: cti_enable_hw -- dev:cti_sys0 parent: 20020000.cti
BUG: sleeping function called from invalid context at drivers/base/power/runtime.c:1151
in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 330, name: perf-exec
preempt_count: 2, expected: 0
RCU nest depth: 0, expected: 0
INFO: lockdep is turned off.
irq event stamp: 0
hardirqs last enabled at (0): [<0000000000000000>] 0x0
hardirqs last disabled at (0): [<ffff80000822b394>] copy_process+0xa0c/0x1948
softirqs last enabled at (0): [<ffff80000822b394>] copy_process+0xa0c/0x1948
softirqs last disabled at (0): [<0000000000000000>] 0x0
CPU: 3 PID: 330 Comm: perf-exec Not tainted 6.0.0-00053-g042116d99298 #7
Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform, BIOS EDK II Sep 13 2022
Call trace:
dump_backtrace+0x134/0x140
show_stack+0x20/0x58
dump_stack_lvl+0x8c/0xb8
dump_stack+0x18/0x34
__might_resched+0x180/0x228
__might_sleep+0x50/0x88
__pm_runtime_resume+0xac/0xb0
cti_enable+0x44/0x120
coresight_control_assoc_ectdev+0xc0/0x150
coresight_enable_path+0xb4/0x288
etm_event_start+0x138/0x170
etm_event_add+0x48/0x70
event_sched_in.isra.122+0xb4/0x280
merge_sched_in+0x1fc/0x3d0
visit_groups_merge.constprop.137+0x16c/0x4b0
ctx_sched_in+0x114/0x1f0
perf_event_sched_in+0x60/0x90
ctx_resched+0x68/0xb0
perf_event_exec+0x138/0x508
begin_new_exec+0x52c/0xd40
load_elf_binary+0x6b8/0x17d0
bprm_execve+0x360/0x7f8
do_execveat_common.isra.47+0x218/0x238
__arm64_sys_execve+0x48/0x60
invoke_syscall+0x4c/0x110
el0_svc_common.constprop.4+0xfc/0x120
do_el0_svc+0x34/0xc0
el0_svc+0x40/0x98
el0t_64_sync_handler+0x98/0xc0
el0t_64_sync+0x170/0x174
Fix the issue by removing the runtime PM calls completely. They are not
needed here because it must have already been done when building the
path for a trace.
Fixes: 835d722ba10a ("coresight: cti: Initial CoreSight CTI Driver") Cc: stable <stable@kernel.org> Reported-by: Aishwarya TCV <Aishwarya.TCV@arm.com> Reported-by: Cristian Marussi <Cristian.Marussi@arm.com> Suggested-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: James Clark <james.clark@arm.com> Reviewed-by: Mike Leach <mike.leach@linaro.org> Tested-by: Mike Leach <mike.leach@linaro.org>
[ Fix build warnings ] Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20221025131032.1149459-1-suzuki.poulose@arm.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Device-managed resources allocated post component bind must be tied to
the lifetime of the aggregate DRM device or they will not necessarily be
released when binding of the aggregate device is deferred.
This is specifically true for the DP IRQ, which will otherwise remain
requested so that the next bind attempt fails when requesting the IRQ a
second time.
Since commit c3bf8e21b38a ("drm/msm/dp: Add eDP support via aux_bus")
this can happen when the aux-bus panel driver has not yet been loaded so
that probe is deferred.
Fix this by tying the device-managed lifetime of the DP IRQ to the DRM
device so that it is released when bind fails.
Add the missing sanity check on the bridge counter to avoid corrupting
data beyond the fixed-sized bridge array in case there are ever more
than eight bridges.
Add the missing sanity check on the bridge counter to avoid corrupting
data beyond the fixed-sized bridge array in case there are ever more
than eight bridges.
In the S2idle suspend/resume phase the gfxoff is keeping functional so
some IP blocks will be likely to reinitialize at gfxoff entry and that
will result in failing to program GC registers.Therefore, let disallow
gfxoff until AMDGPU IPs reinitialized completely.
Signed-off-by: Prike Liang <Prike.Liang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 5.15.x Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
One of the sysfs values reported for supported_speeds was not valid (20Gb/s
reported instead of 64Gb/s). Instead of driver internal speed mask
definition, use speed mask defined in transport_fc for reporting
host->supported_speeds.
Back in 2014, the LQI was saved in the skb control buffer (skb->cb, or
mac_cb(skb)) without any actual reset of this area prior to its use.
As part of a useful rework of the use of this region, 32edc40ae65c
("ieee802154: change _cb handling slightly") introduced mac_cb_init() to
basically memset the cb field to 0. In particular, this new function got
called at the beginning of mac802154_parse_frame_start(), right before
the location where the buffer got actually filled.
What went through unnoticed however, is the fact that the very first
helper called by device drivers in the receive path already used this
area to save the LQI value for later extraction. Resetting the cb field
"so late" led to systematically zeroing the LQI.
If we consider the reset of the cb field needed, we can make it as soon
as we get an skb from a device driver, right before storing the LQI,
as is the very first time we need to write something there.
unshare_sighand should only access oldsighand->action
while holding oldsighand->siglock, to make sure that
newsighand->action is in a consistent state.
If "interp_elf_ex" fails to allocate memory in load_elf_binary(),
the program will take the "out_free_ph" error handing path,
resulting in "interpreter" file resource is not released.
Fix it by adding an error handing path "out_free_file", which will
release the file resource when "interp_elf_ex" failed to allocate
memory.
Fixes: 0693ffebcfe5 ("fs/binfmt_elf.c: allocate less for static executable") Signed-off-by: Li Zetao <lizetao1@huawei.com> Reviewed-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20221024154421.982230-1-lizetao1@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 46573fd6369f ("cpufreq: intel_pstate: hybrid: Rework HWP
calibration") attempted to use the information from CPPC (the nominal
performance in particular) to obtain the scaling factor allowing the
frequency to be computed if the HWP performance level of the given CPU
is known or vice versa.
However, it turns out that on some platforms this doesn't work, because
the CPPC information on them does not align with the contents of the
MSR_HWP_CAPABILITIES registers.
This basically means that the only way to make intel_pstate work on all
of the hybrid platforms to date is to use the observation that on all
of them the scaling factor between the HWP performance levels and
frequency for P-cores is 78741 (approximately 100000/1.27). For
E-cores it is 100000, which is the same as for all of the non-hybrid
"core" platforms and does not require any changes.
Accordingly, make intel_pstate use 78741 as the scaling factor between
HWP performance levels and frequency for P-cores on all hybrid platforms
and drop the dependency of the HWP calibration code on CPPC.
Some of the MSR accesses in intel_pstate are carried out on the CPU
that is running the code, but the values coming from them are used
for the performance scaling of the other CPUs.
This is problematic, for example, on hybrid platforms where
MSR_TURBO_RATIO_LIMIT for P-cores and E-cores is different, so the
values read from it on a P-core are generally not applicable to E-cores
and the other way around.
For this reason, make the driver access all MSRs on the target CPU on
platforms using the "core" pstate_funcs callbacks which is the case for
all of the hybrid platforms released to date. For this purpose, pass
a CPU argument to the ->get_max(), ->get_max_physical(), ->get_min()
and ->get_turbo() pstate_funcs callbacks and from there pass it to
rdmsrl_on_cpu() or rdmsrl_safe_on_cpu() to access the MSR on the target
CPU.
The iio_triggered_buffer_setup_ext() was changed by
commit 15097c7a1adc ("iio: buffer: wrap all buffer attributes into iio_dev_attr")
to silently expect that all attributes given in buffer_attrs array are
device-attributes. This expectation was not forced by the API - and some
drivers did register attributes created by IIO_CONST_ATTR().
The added attribute "wrapping" does not copy the pointer to stored
string constant and when the sysfs file is read the kernel will access
to invalid location.
Change the IIO_CONST_ATTRs from the driver to IIO_DEVICE_ATTR in order
to prevent the invalid memory access.
Currently, every time the device wakes up from sleep, the
iio_chan array is reallocated, leaking the previous one
until the device is removed (basically never).
Move the allocation to the probe function to avoid this.
Signed-off-by: Cosmin Tanislav <cosmin.tanislav@analog.com> Fixes: f110f3188e563 ("iio: temperature: Add support for LTC2983") Cc: <Stable@vger.kernel.org> Link: https://lore.kernel.org/r/20221014123724.1401011-2-demonsingur@gmail.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
tsl2583 probe() uses devm_iio_device_register() and calling
iio_device_unregister() causes the unregister to occur twice. s
Switch to iio_device_register() instead of devm_iio_device_register()
in probe to avoid the device managed cleanup.
The iio_utils uses a digit calculation in order to know length of the
file name containing a buffer number. The digit calculation does not
work for number 0.
This leads to allocation of one character too small buffer for the
file-name when file name contains value '0'. (Eg. buffer0).
Fix digit calculation by returning one digit to be present for number
'0'.
Endpoints are normally deleted from the bandwidth list when they are
dropped, before the virt device is freed.
If xHC host is dying or being removed then the endpoints aren't dropped
cleanly due to functions returning early to avoid interacting with a
non-accessible host controller.
So check and delete endpoints that are still on the bandwidth list when
freeing the virt device.
Solves a list_del corruption kernel crash when unbinding xhci-pci,
caused by xhci_mem_cleanup() when it later tried to delete already freed
endpoints from the bandwidth list.
This only affects hosts that use software bandwidth checking, which
currenty is only the xHC in intel Panther Point PCH (Ivy Bridge)
Cc: stable@vger.kernel.org Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20221024142720.4122053-5-mathias.nyman@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
For optimal power consumption of USB4 routers the XHCI PCIe endpoint
used for tunneling must be in D3. Historically this is accomplished
by a long list of PCIe IDs that correspond to these endpoints because
the xhci_hcd driver will not default to allowing runtime PM for all
devices.
As both AMD and Intel have released new products with new XHCI controllers
this list continues to grow. In reviewing the XHCI specification v1.2 on
page 607 there is already a requirement that the PCI power management
states D3hot and D3cold must be supported.
In the quirk list, use this to indicate that runtime PM should be allowed
on XHCI controllers. The following controllers are known to be xHC 1.2 and
dropped explicitly:
* AMD Yellow Carp
* Intel Alder Lake
* Intel Meteor Lake
* Intel Raptor Lake
[keep PCI ID for Alder Lake PCH for recently added quirk -Mathias]
Systems based on Alder Lake P see significant boot time delay if
boot firmware tries to control usb ports in unexpected link states.
This is seen with self-powered usb devices that survive in U3 link
suspended state over S5.
A more generic solution to power off ports at shutdown was attempted in
commit 83810f84ecf1 ("xhci: turn off port power in shutdown")
but it caused regression.
Add host specific XHCI_RESET_TO_DEFAULT quirk which will reset host and
ports back to default state in shutdown.
Originally the absence of the marvell,nand-keep-config property caused
the setup_data_interface function to be provided. However when
setup_data_interface was moved into nand_controller_ops the logic was
unintentionally inverted. Update the logic so that only if the
marvell,nand-keep-config property is present the bootloader NAND config
kept.
Cc: stable@vger.kernel.org Fixes: 7a08dbaedd36 ("mtd: rawnand: Move ->setup_data_interface() to nand_controller_ops") Signed-off-by: Tony O'Brien <tony.obrien@alliedtelesis.co.nz> Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20220927024728.28447-1-chris.packham@alliedtelesis.co.nz Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This appears to fix the error:
"xhci_hcd <address>; ERROR Transfer event TRB DMA ptr not part of
current TD ep_index 2 comp_code 13" that appear spuriously (or pretty
often) when using a r8152 USB3 ethernet adapter with integrated hub.
ASM1042 reports as a 0.96 controller, but appears to behave more like 1.0
Inspired by this email thread: https://markmail.org/thread/7vzqbe7t6du6qsw3
When port is connected and then disconnected, the state stays as
configured. Which is incorrect as the port is no longer configured,
but in a not attached state.
The gadget driver may have a certain expectation of how the request
completion flow should be from to its configuration. Make sure the
controller driver respect that. That is, don't set IMI (Interrupt on
Missed Isoc) when usb_request->no_interrupt is set. Also, the driver
should only set IMI to the last TRB of a chain.
When servicing a transfer completion event, the dwc3 driver will reclaim
TRBs of started requests up to the request associated with the interrupt
event. Currently we don't check for interrupt due to missed isoc, and
the driver may attempt to reclaim TRBs beyond the associated event. This
causes invalid memory access when the hardware still owns the TRB. If
there's a missed isoc TRB with IMI (interrupt on missed isoc), make sure
to stop servicing further.
Note that only the last TRB of chained TRBs has its status updated with
missed isoc.
Fixes: 72246da40f37 ("usb: Introduce DesignWare USB3 DRD Driver") Cc: stable@vger.kernel.org Reported-by: Jeff Vanhoof <jdv1029@gmail.com> Reported-by: Dan Vacura <w36195@motorola.com> Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com> Reviewed-by: Jeff Vanhoof <jdv1029@gmail.com> Tested-by: Jeff Vanhoof <jdv1029@gmail.com> Link: https://lore.kernel.org/r/b29acbeab531b666095dfdafd8cb5c7654fbb3e1.1666735451.git.Thinh.Nguyen@synopsys.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In uvc_video_encode_isoc_sg, the uvc_request's sg list is
incorrectly being populated leading to corrupt video being
received by the remote end. When building the sg list the
usage of buf->sg's 'dma_length' field is not correct and
instead its 'length' field should be used.
If there is a transmission error the buffer will be returned too early,
causing a memory fault as subsequent requests for that buffer are still
queued up to be sent. Refactor the error handling to wait for the final
request to come in before reporting back the buffer to userspace for all
transfer types (bulk/isoc/isoc_sg). This ensures userspace knows if the
frame was successfully sent.
NVIDIA Jetson devices in Force Recovery mode (RCM) do not support
suspending, ie. flashing fails if the device has been suspended. The
devices are still visible in lsusb and seem to work otherwise, making
the issue hard to debug. This has been discovered in various forum
posts, eg. [1].
The patch has been tested on NVIDIA Jetson AGX Xavier, but I'm adding
all the Jetson models listed in [2] on the assumption that they all
behave similarly.
With char becoming unsigned by default, and with `char` alone being
ambiguous and based on architecture, signed chars need to be marked
explicitly as such. This fixes warnings like:
With char becoming unsigned by default, and with `char` alone being
ambiguous and based on architecture, signed chars need to be marked
explicitly as such. This fixes warnings like:
M-Audio Fast Track C400 and C600 devices (0763:2030 and 0763:2031,
respectively) seem requiring the explicit setup for the implicit
feedback mode. This patch adds the quirk entries for those.
Instead just use del_timer_sync() which will wait for the timer to finish
before continuing. No need to check if the timer is active or not when
doing so.
This doesn't fix the race of a possible re-arming of the timer, but at
least it won't use the data that has just been freed.
kvaser_usb uses completions to signal when a response event is received
for outgoing commands.
However, it uses init_completion() to reinitialize the start_comp and
stop_comp completions before sending the start/stop commands.
In case the device sends the corresponding response just before the
actual command is sent, complete() may be called concurrently with
init_completion() which is not safe.
This might be triggerable even with a properly functioning device by
stopping the interface (CMD_STOP_CHIP) just after it goes bus-off (which
also causes the driver to send CMD_STOP_CHIP when restart-ms is off),
but that was not tested.
Fix the issue by using reinit_completion() instead.
Fixes: 080f40a6fa28 ("can: kvaser_usb: Add support for Kvaser CAN/USB devices") Tested-by: Jimmy Assarsson <extja@kvaser.com> Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi> Signed-off-by: Jimmy Assarsson <extja@kvaser.com> Link: https://lore.kernel.org/all/20221010185237.319219-2-extja@kvaser.com Cc: stable@vger.kernel.org Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It is not allowed to call kfree_skb() from hardware interrupt context
or with interrupts being disabled. The skb is unlinked from the queue,
so it can be freed after spin_unlock_irqrestore().
Fixes: 9d71dd0c7009 ("can: add support of SAE J1939 protocol") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Acked-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://lore.kernel.org/all/20221027091237.2290111-1-yangyingliang@huawei.com Cc: stable@vger.kernel.org
[mkl: adjust subject] Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This was missed in c3ed222745d9 ("NFSv4: Fix free of uninitialized
nfs4_label on referral lookup.") and causes a panic when mounting
with '-o trunkdiscovery':