Linus Torvalds [Wed, 20 Dec 2023 20:04:03 +0000 (12:04 -0800)]
Merge tag 'ovl-fixes-6.7-rc7' of git://git./linux/kernel/git/overlayfs/vfs
Pull overlayfs fix from Amir Goldstein:
"Fix a regression from this merge window"
* tag 'ovl-fixes-6.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs:
ovl: fix dentry reference leak after changes to underlying layers
Linus Torvalds [Wed, 20 Dec 2023 19:24:28 +0000 (11:24 -0800)]
Merge tag 'bcachefs-2023-12-19' of https://evilpiepirate.org/git/bcachefs
Pull more bcachefs fixes from Kent Overstreet:
- Fix a deadlock in the data move path with nocow locks (vs. update in
place writes); when trylock failed we were incorrectly waiting for in
flight ios to flush.
- Fix reporting of NFS file handle length
- Fix early error path in bch2_fs_alloc() - list head wasn't being
initialized early enough
- Make sure correct (hardware accelerated) crc modules get loaded
- Fix a rare overflow in the btree split path, when the packed bkey
format grows and all the keys have no value (LRU btree).
- Fix error handling in the sector allocator
This was causing writes to spuriously fail in multidevice setups, and
another bug meant that the errors weren't being logged, only reported
via fsync.
* tag 'bcachefs-2023-12-19' of https://evilpiepirate.org/git/bcachefs:
bcachefs: Fix bch2_alloc_sectors_start_trans() error handling
bcachefs; guard against overflow in btree node split
bcachefs: btree_node_u64s_with_format() takes nr keys
bcachefs: print explicit recovery pass message only once
bcachefs: improve modprobe support by providing softdeps
bcachefs: fix invalid memory access in bch2_fs_alloc() error path
bcachefs: Fix determining required file handle length
bcachefs: Fix nocow locks deadlock
Linus Torvalds [Wed, 20 Dec 2023 19:16:50 +0000 (11:16 -0800)]
Merge tag 'nfsd-6.7-2' of git://git./linux/kernel/git/cel/linux
Pull nfsd fixes from Chuck Lever:
- Address a few recently-introduced issues
* tag 'nfsd-6.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
SUNRPC: Revert 5f7fc5d69f6e92ec0b38774c387f5cf7812c5806
NFSD: Revert 738401a9bd1ac34ccd5723d69640a4adbb1a4bc0
NFSD: Revert 6c41d9a9bd0298002805758216a9c44e38a8500d
nfsd: hold nfsd_mutex across entire netlink operation
nfsd: call nfsd_last_thread() before final nfsd_put()
Linus Torvalds [Wed, 20 Dec 2023 19:01:28 +0000 (11:01 -0800)]
Merge tag 'dm-6.7/dm-fixes-3' of git://git./linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
- DM raid target (and MD raid) fix for reconfig_mutex MD deadlock that
should have been merged along with recent v6.7-rc6 MD fixes (see MD
related commits: f2d87a759f68^..b39113349de6)
- DM integrity target fix to avoid modifying immutable biovec in the
integrity_metadata() edge case where kmalloc fails.
- Fix drivers/md/Kconfig so DM_AUDIT depends on BLK_DEV_DM.
- Update DM entry in MAINTAINERS to remove stale info.
* tag 'dm-6.7/dm-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
MAINTAINERS: remove stale info for DEVICE-MAPPER
dm audit: fix Kconfig so DM_AUDIT depends on BLK_DEV_DM
dm-integrity: don't modify bio's immutable bio_vec in integrity_metadata()
dm-raid: delay flushing event_work() after reconfig_mutex is released
Kent Overstreet [Tue, 19 Dec 2023 22:16:34 +0000 (17:16 -0500)]
bcachefs: Fix bch2_alloc_sectors_start_trans() error handling
When we fail to allocate because of insufficient open buckets, we don't
want to retry from the full set of devices - we just want to retry in
blocking mode.
But if the retry in blocking mode fails with a different error code, we
end up squashing the -BCH_ERR_open_buckets_empty error with an error
that makes us think we won't be able to allocate (insufficient_devices)
- which is incorrect when we didn't try to allocate from the full set of
devices, and causes the write to fail.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 18 Dec 2023 04:31:26 +0000 (23:31 -0500)]
bcachefs; guard against overflow in btree node split
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 18 Dec 2023 04:20:59 +0000 (23:20 -0500)]
bcachefs: btree_node_u64s_with_format() takes nr keys
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Linus Torvalds [Tue, 19 Dec 2023 20:25:43 +0000 (12:25 -0800)]
Merge tag 'trace-v6.7-rc6' of git://git./linux/kernel/git/trace/linux-trace
Pull tracing fix from Steven Rostedt:
"While working on the ring buffer, I found one more bug with the
timestamp code, and the fix for this removed the need for the final
64-bit cmpxchg!
The ring buffer events hold a "delta" from the previous event. If it
is determined that the delta can not be calculated, it falls back to
adding an absolute timestamp value. The way to know if the delta can
be used is via two stored timestamps in the per-cpu buffer meta data:
before_stamp and write_stamp
The before_stamp is written by every event before it tries to allocate
its space on the ring buffer. The write_stamp is written after it
allocates its space and knows that nothing came in after it read the
previous before_stamp and write_stamp and the two matched.
A previous fix dd9394257078 ("ring-buffer: Do not try to put back
write_stamp") removed putting back the write_stamp to match the
before_stamp so that the next event could use the delta, but races
were found where the two would match, but not be for the previous
event.
It was determined to allow the event reservation to not have a valid
write_stamp when it is finished, and this fixed a lot of races.
The last use of the 64-bit timestamp cmpxchg depended on the
write_stamp being valid after an interruption. But this is no longer
the case, as if an event is interrupted by a softirq that writes an
event, and that event gets interrupted by a hardirq or NMI and that
writes an event, then the softirq could finish its reservation without
a valid write_stamp.
In the slow path of the event reservation, a delta can still be used
if the write_stamp is valid. Instead of using a cmpxchg against the
write stamp, the before_stamp needs to be read again to validate the
write_stamp. The cmpxchg is not needed.
This updates the slowpath to validate the write_stamp by comparing it
to the before_stamp and removes all rb_time_cmpxchg() as there are no
more users of that function.
The removal of the 32-bit updates of rb_time_t will be done in the
next merge window"
* tag 'trace-v6.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
ring-buffer: Fix slowpath of interrupted event
Linus Torvalds [Tue, 19 Dec 2023 20:19:25 +0000 (12:19 -0800)]
Merge tag 'arc-6.7-fixes' of git://git./linux/kernel/git/vgupta/arc
Pull ARC fixes from Vineet Gupta:
- build error for hugetlb, sparse and smatch fixes
- removal of VIPT aliasing cache code
* tag 'arc-6.7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARC: add hugetlb definitions
ARC: fix smatch warning
ARC: fix spare error
ARC: mm: retire support for aliasing VIPT D$
ARC: entry: move ARCompact specific bits out of entry.h
ARC: entry: SAVE_ABI_CALLEE_REG: ISA/ABI specific helper
Steven Rostedt (Google) [Tue, 19 Dec 2023 04:07:12 +0000 (23:07 -0500)]
ring-buffer: Fix slowpath of interrupted event
To synchronize the timestamps with the ring buffer reservation, there are
two timestamps that are saved in the buffer meta data.
1. before_stamp
2. write_stamp
When the two are equal, the write_stamp is considered valid, as in, it may
be used to calculate the delta of the next event as the write_stamp is the
timestamp of the previous reserved event on the buffer.
This is done by the following:
/*A*/   w = current position on the ring buffer
        before = before_stamp
        after = write_stamp
        ts = read current timestamp
        if (before != after) {
                write_stamp is not valid, force adding an absolute
                timestamp.
        }
/*B*/   before_stamp = ts
/*C*/   write = local_add_return(event length, position on ring buffer)
        if (w == write - event length) {
                /* Nothing interrupted between A and C */
/*E*/           write_stamp = ts;
                delta = ts - after
                /*
                 * If nothing interrupted again,
                 * before_stamp == write_stamp and write_stamp
                 * can be used to calculate the delta for
                 * events that come in after this one.
                 */
        } else {
                /*
                 * The slow path!
                 * Was interrupted between A and C.
                 */
This is the place that there's a bug. We currently have:
        after = write_stamp
        ts = read current timestamp
/*F*/   if (write == current position on the ring buffer &&
            after < ts && cmpxchg(write_stamp, after, ts)) {
                delta = ts - after;
        } else {
                delta = 0;
        }
The assumption is that if the current position on the ring buffer hasn't
moved between C and F, then it also was not interrupted, and that the last
event written has a timestamp that matches the write_stamp. That is the
write_stamp is valid.
But this may not be the case:
If a task context event was interrupted by softirq between B and C.
And the softirq wrote an event that got interrupted by a hard irq between
C and E.
and the hard irq wrote an event (does not need to be interrupted)
We have:
/*B*/   before_stamp = ts of normal context
   ---> interrupted by softirq
        /*B*/ before_stamp = ts of softirq context
          ---> interrupted by hardirq
                /*B*/ before_stamp = ts of hard irq context
                /*E*/ write_stamp = ts of hard irq context
                /* matches and write_stamp valid */
          <----
        /*E*/ write_stamp = ts of softirq context
        /* No longer matches before_stamp, write_stamp is not valid! */
   <---
        w != write - length, go to slow path
// Right now the order of events in the ring buffer is:
//
// |-- softirq event --|-- hard irq event --|-- normal context event --|
//
        after = write_stamp (this is the ts of softirq)
        ts = read current timestamp
        if (write == current position on the ring buffer [true] &&
            after < ts [true] && cmpxchg(write_stamp, after, ts) [true]) {
                delta = ts - after  [Wrong!]
The delta is to be between the hard irq event and the normal context
event, but the above logic made the delta between the softirq event and
the normal context event, where the hard irq event is between the two. This
will shift all the remaining event timestamps on the sub-buffer
incorrectly.
The write_stamp is only valid if it matches the before_stamp. The cmpxchg
does nothing to help this.
Instead, the following logic can be done to fix this:
        before = before_stamp
        ts = read current timestamp
        before_stamp = ts
        after = write_stamp
        if (write == current position on the ring buffer &&
            after == before && after < ts) {
                delta = ts - after
        } else {
                delta = 0;
        }
The above will only use the write_stamp if it still matches before_stamp
and was tested to not have changed since C.
As a bonus, with this logic we do not need any 64-bit cmpxchg() at all!
This means the 32-bit rb_time_t workaround can finally be removed. But
that's for a later time.
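The corrected slow-path check can be sketched as a small standalone C model. This is only an illustration under simplifying assumptions: it is single-threaded, the struct rb_model type and the slowpath_delta() helper are hypothetical names, and the caller passes in the timestamp instead of reading a clock. The real ring buffer code uses its own types and per-CPU accessors.

```c
#include <stdint.h>

/* Hypothetical, single-threaded model of the per-CPU buffer metadata. */
struct rb_model {
	uint64_t before_stamp;	/* written by every event before it reserves space */
	uint64_t write_stamp;	/* written after an uninterrupted reservation */
	uint64_t write_pos;	/* current write position on the ring buffer */
};

/*
 * Slow path of an interrupted reservation.
 * 'write' is the position this event saw at step C, 'ts' is the
 * timestamp the caller just read. Returns the delta to store, or 0
 * when the write_stamp cannot be trusted.
 */
static uint64_t slowpath_delta(struct rb_model *rb, uint64_t write, uint64_t ts)
{
	uint64_t before = rb->before_stamp;
	uint64_t after;

	rb->before_stamp = ts;
	after = rb->write_stamp;

	/* write_stamp is only valid if it still matches before_stamp. */
	if (write == rb->write_pos && after == before && after < ts)
		return ts - after;
	return 0;
}
```

In the nested softirq/hardirq scenario above, before_stamp holds the hard irq timestamp while write_stamp holds the softirq timestamp, so the comparison fails and the sketch returns 0 instead of computing a delta against the wrong event.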
Link: https://lore.kernel.org/linux-trace-kernel/20231218175229.58ec3daf@gandalf.local.home/
Link: https://lore.kernel.org/linux-trace-kernel/20231218230712.3a76b081@gandalf.local.home
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Fixes: dd93942570789 ("ring-buffer: Do not try to put back write_stamp")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Linus Torvalds [Tue, 19 Dec 2023 00:47:21 +0000 (16:47 -0800)]
Merge tag 'hid-for-linus-2023121901' of git://git./linux/kernel/git/hid/hid
Pull HID fixes from Jiri Kosina:
- fix for division by zero in Nintendo driver when generic joycon is
attached, reported and fixed by SteamOS folks (Guilherme G. Piccoli)
- GCC-7 build fix (which is a good cleanup anyway) for Nintendo driver
(Ryan McClelland)
* tag 'hid-for-linus-2023121901' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
HID: nintendo: Prevent divide-by-zero on code
HID: nintendo: fix initializer element is not constant error
Chuck Lever [Mon, 18 Dec 2023 22:05:40 +0000 (17:05 -0500)]
SUNRPC: Revert 5f7fc5d69f6e92ec0b38774c387f5cf7812c5806
Guillaume says:
> I believe commit 5f7fc5d69f6e ("SUNRPC: Resupply rq_pages from
> node-local memory") in Linux 6.5+ is incorrect. It passes
> unconditionally rq_pool->sp_id as the NUMA node.
>
> While the comment in the svc_pool declaration in sunrpc/svc.h says
> that sp_id is also the NUMA node id, it might not be the case if
> the svc is created using svc_create_pooled(). svc_create_pooled()
> can use the per-cpu pool mode therefore in this case sp_id would
> be the cpu id.
Fix this by reverting now. At a later point this minor optimization,
and the deceptive labeling of the sp_id field, can be revisited.
Reported-by: Guillaume Morin <guillaume@morinfr.org>
Closes: https://lore.kernel.org/linux-nfs/ZYC9rsno8qYggVt9@bender.morinfr.org/T/#u
Fixes: 5f7fc5d69f6e ("SUNRPC: Resupply rq_pages from node-local memory")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Guilherme G. Piccoli [Tue, 5 Dec 2023 21:15:51 +0000 (18:15 -0300)]
HID: nintendo: Prevent divide-by-zero on code
It was reported [0] that adding a generic joycon to the system caused
a kernel crash on Steam Deck, with the below panic spew:
divide error: 0000 [#1] PREEMPT SMP NOPTI
[...]
Hardware name: Valve Jupiter/Jupiter, BIOS F7A0119 10/24/2023
RIP: 0010:nintendo_hid_event+0x340/0xcc1 [hid_nintendo]
[...]
Call Trace:
[...]
? exc_divide_error+0x38/0x50
? nintendo_hid_event+0x340/0xcc1 [hid_nintendo]
? asm_exc_divide_error+0x1a/0x20
? nintendo_hid_event+0x307/0xcc1 [hid_nintendo]
hid_input_report+0x143/0x160
hidp_session_run+0x1ce/0x700 [hidp]
Since it's a divide-by-0 error, by tracking the code for potential
denominator issues, we've spotted 2 places in which this could happen;
so let's guard against the possibility and log in the kernel if the
condition happens. This is especially useful since some of the data that
fills the denominators is read from the joycon HW in some cases,
increasing the potential for flaws.
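The guard described above follows a common pattern, sketched here in plain C. This is not the driver code: scale_reading() and its parameters are hypothetical, and a userspace fprintf() stands in for whatever kernel logging the driver uses.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: scale a raw reading by scale/divisor.
 * The divisor may come from device-supplied data, so it is checked
 * before use; a zero divisor is logged and yields 0 instead of a trap.
 */
static int64_t scale_reading(int32_t raw, int32_t scale, int32_t divisor)
{
	if (divisor == 0) {
		fprintf(stderr, "scale_reading: zero divisor, reporting 0\n");
		return 0;
	}
	/* Widen before multiplying so the product cannot overflow. */
	return ((int64_t)raw * scale) / divisor;
}
```

Returning a neutral value and logging keeps the event path alive when the hardware reports unexpected calibration data, which is the behaviour the commit message asks for.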
[0] https://github.com/ValveSoftware/SteamOS/issues/1070
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Tested-by: Sam Lantinga <slouken@libsdl.org>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
Linus Torvalds [Mon, 18 Dec 2023 19:11:09 +0000 (11:11 -0800)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Two medium sized fixes, both in drivers.
The UFS one adds parsing of clock info structures, which is required
by some host drivers and the aacraid one reverts the IRQ affinity
mapping patch which has been causing regressions noted in kernel
bugzilla 217599"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: ufs: core: Store min and max clk freq from OPP table
Revert "scsi: aacraid: Reply queue mapping to CPUs based on IRQ affinity"
Linus Torvalds [Mon, 18 Dec 2023 18:59:57 +0000 (10:59 -0800)]
Merge tag 'spi-fix-v6.7-rc7' of git://git./linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"A few bigger things here, the main one being that there were changes
to the atmel driver in this cycle which made it possible to kill
transfers being used for filesystem I/O which turned out to be very
disruptive, the series of patches here undoes that and hardens things
up further.
There's also a few smaller driver specific changes, the main one being
to revert a change that duplicated delays"
* tag 'spi-fix-v6.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: atmel: Fix clock issue when using devices with different polarities
spi: spi-imx: correctly configure burst length when using dma
spi: cadence: revert "Add SPI transfer delays"
spi: atmel: Prevent spi transfers from being killed
spi: atmel: Drop unused defines
spi: atmel: Do not cancel a transfer upon any signal
Mike Snitzer [Wed, 13 Dec 2023 19:49:12 +0000 (14:49 -0500)]
MAINTAINERS: remove stale info for DEVICE-MAPPER
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Mike Snitzer [Wed, 13 Dec 2023 19:46:19 +0000 (14:46 -0500)]
dm audit: fix Kconfig so DM_AUDIT depends on BLK_DEV_DM
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Mikulas Patocka [Tue, 5 Dec 2023 15:39:16 +0000 (16:39 +0100)]
dm-integrity: don't modify bio's immutable bio_vec in integrity_metadata()
__bio_for_each_segment assumes that the first struct bio_vec argument
doesn't change - it calls "bio_advance_iter_single((bio), &(iter),
(bvl).bv_len)" to advance the iterator. Unfortunately, the dm-integrity
code changes the bio_vec with "bv.bv_len -= pos". When this code path
is taken, the iterator would be out of sync and dm-integrity would
report errors. This happens if the machine is out of memory and
"kmalloc" fails.
Fix this bug by making a copy of "bv" and changing the copy instead.
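The failure mode and the fix can be illustrated with a toy iterator in C. Everything here is a hypothetical stand-in (struct vec, struct iter, for_each_chunk() and walk() are not block layer APIs); the only property borrowed from the description above is that the loop advances by the length stored in the loop variable.

```c
#include <stddef.h>

struct vec { size_t off; size_t len; };		/* stand-in for a bio_vec */
struct iter { size_t done; size_t size; };	/* bytes consumed, bytes left */

/* Current chunk: at most 'chunk' bytes starting at the iterator position. */
static struct vec iter_vec(const struct iter *it, size_t chunk)
{
	struct vec v = { it->done, it->size < chunk ? it->size : chunk };
	return v;
}

/* The advance step uses (bv).len, so the loop body must not change bv. */
#define for_each_chunk(bv, it, chunk)					\
	for (; (it).size && ((bv) = iter_vec(&(it), (chunk)), 1);	\
	     (it).done += (bv).len, (it).size -= (bv).len)

/*
 * Walk 'total' bytes in chunks, skipping the first 'pos' bytes of each
 * chunk. Returns the number of chunks visited and stores the number of
 * payload bytes in *payload.
 */
static size_t walk(size_t total, size_t chunk, size_t pos, size_t *payload)
{
	struct iter it = { 0, total };
	struct vec bv;
	size_t visits = 0;

	*payload = 0;
	for_each_chunk(bv, it, chunk) {
		struct vec bv_copy = bv;	/* trim the copy, never bv */

		if (pos < bv_copy.len) {
			bv_copy.off += pos;
			bv_copy.len -= pos;
		}
		*payload += bv_copy.len;
		visits++;
	}
	return visits;
}
```

If the body trimmed bv itself, each step would advance by the shortened length and the walk would revisit bytes it had already covered; trimming bv_copy keeps the iterator in sync.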
Fixes: 7eada909bfd7 ("dm: add integrity target")
Cc: stable@vger.kernel.org # v4.12+
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Yu Kuai [Fri, 24 Nov 2023 07:59:53 +0000 (15:59 +0800)]
dm-raid: delay flushing event_work() after reconfig_mutex is released
After commit db5e653d7c9f ("md: delay choosing sync action to
md_start_sync()"), md_start_sync() will hold 'reconfig_mutex', however,
in order to make sure event_work is done, __md_stop() will flush
workqueue with reconfig_mutex grabbed, hence if sync_work is still
pending, deadlock will be triggered.
Fortunately, former patches to fix stopping sync_thread already make sure
all sync_work is done already, hence such deadlock is not possible
anymore. However, in order not to confuse people with this
implicit dependency, delay flushing event_work to dm-raid where
'reconfig_mutex' is not held, and add some comments to emphasize that
the workqueue can't be flushed with 'reconfig_mutex'.
Fixes: db5e653d7c9f ("md: delay choosing sync action to md_start_sync()")
Depends-on: f52f5c71f3d4 ("md: fix stopping sync thread")
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Acked-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Chuck Lever [Sat, 16 Dec 2023 17:12:50 +0000 (12:12 -0500)]
NFSD: Revert 738401a9bd1ac34ccd5723d69640a4adbb1a4bc0
There's nothing wrong with this commit, but this is dead code now
that nothing triggers a CB_GETATTR callback. It can be re-introduced
once the issues with handling conflicting GETATTRs are resolved.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Chuck Lever [Sat, 16 Dec 2023 16:57:43 +0000 (11:57 -0500)]
NFSD: Revert 6c41d9a9bd0298002805758216a9c44e38a8500d
For some reason, the wait_on_bit() in nfsd4_deleg_getattr_conflict()
is waiting forever, preventing a clean server shutdown. The
requesting client might also hang waiting for a reply to the
conflicting GETATTR.
Invoking wait_on_bit() in an nfsd thread context is a hazard. The
correct fix is to replace this wait_on_bit() call site with a
mechanism that defers the conflicting GETATTR until the CB_GETATTR
completes or is known to have failed.
That will require some surgery and extended testing and it's late
in the v6.7-rc cycle, so I'm reverting now in favor of trying again
in a subsequent kernel release.
This is my fault: I should have recognized the ramifications of
calling wait_on_bit() in here before accepting this patch.
Thanks to Dai Ngo <dai.ngo@oracle.com> for diagnosing the issue.
Reported-by: Wolfgang Walter <linux-nfs@stwm.de>
Closes: https://lore.kernel.org/linux-nfs/e3d43ecdad554fbdcaa7181833834f78@stwm.de/
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Ryan McClelland [Thu, 14 Dec 2023 17:25:41 +0000 (09:25 -0800)]
HID: nintendo: fix initializer element is not constant error
With gcc-7 builds, an error happens with the controller button values being
defined as const. Change to a define.
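The underlying C rule is that a const-qualified object is not a constant expression, so it cannot portably appear in a static initializer. A minimal sketch of the pattern follows; the names BTN_A_BIT, BTN_B_BIT and button_map are hypothetical and are not the driver's identifiers.

```c
/*
 * Form that older compilers such as gcc-7 reject with
 * "initializer element is not constant":
 *
 *   static const unsigned int BTN_A_BIT = 1u << 3;
 *   static const unsigned int button_map[] = { BTN_A_BIT };
 *
 * A macro expands to an integer constant expression, which every
 * conforming compiler accepts in a static initializer.
 */
#define BTN_A_BIT (1u << 3)
#define BTN_B_BIT (1u << 2)

static const unsigned int button_map[] = { BTN_A_BIT, BTN_B_BIT };
```

An enum would work equally well for small integer values; the define is shown because that is the approach the commit message names.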
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202312141227.C2h1IzfI-lkp@intel.com/
Signed-off-by: Ryan McClelland <rymcclel@gmail.com>
Reviewed-by: Daniel J. Ogorchock <djogorchock@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
Kent Overstreet [Sun, 17 Dec 2023 20:41:03 +0000 (15:41 -0500)]
bcachefs: print explicit recovery pass message only once
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Linus Torvalds [Sun, 17 Dec 2023 23:19:28 +0000 (15:19 -0800)]
Linux 6.7-rc6
Linus Torvalds [Sun, 17 Dec 2023 22:03:11 +0000 (14:03 -0800)]
Merge tag 'perf_urgent_for_v6.7_rc6' of git://git./linux/kernel/git/tip/tip
Pull perf fix from Borislav Petkov:
- Avoid iterating over newly created group leader event's siblings
because there are none, and thus prevent a lockdep splat
* tag 'perf_urgent_for_v6.7_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf: Fix perf_event_validate_size() lockdep splat
Linus Torvalds [Sun, 17 Dec 2023 17:27:36 +0000 (09:27 -0800)]
Merge tag 'for-6.7-rc5-tag' of git://git./linux/kernel/git/kdave/linux
Pull btrfs fix from David Sterba:
"One more fix that verifies that the snapshot source is a root, same
check is also done in user space but should be done by the ioctl as
well"
* tag 'for-6.7-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: do not allow non subvolume root targets for snapshot
Linus Torvalds [Sun, 17 Dec 2023 17:24:06 +0000 (09:24 -0800)]
Merge tag 'soundwire-6.7-fixes' of git://git./linux/kernel/git/vkoul/soundwire
Pull soundwire fixes from Vinod Koul:
- Null pointer dereference for multi link in core
- AC timing fix in intel driver
* tag 'soundwire-6.7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire:
soundwire: intel_ace2x: fix AC timing setting for ACE2.x
soundwire: stream: fix NULL pointer dereference for multi_link
Linus Torvalds [Sun, 17 Dec 2023 17:19:27 +0000 (09:19 -0800)]
Merge tag 'phy-fixes-6.7' of git://git./linux/kernel/git/phy/linux-phy
Pull phy fixes from Vinod Koul:
- register offset fix for TI driver
- mediatek driver minimal supported frequency fix
- negative error code in probe fix for sunplus driver
* tag 'phy-fixes-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy:
phy: sunplus: return negative error code in sp_usb_phy_probe
phy: mediatek: mipi: mt8183: fix minimal supported frequency
phy: ti: gmii-sel: Fix register offset when parent is not a syscon node
Linus Torvalds [Sun, 17 Dec 2023 17:11:32 +0000 (09:11 -0800)]
Merge tag 'dmaengine-fix-6.7' of git://git./linux/kernel/git/vkoul/dmaengine
Pull dmaengine fixes from Vinod Koul:
- SPI PDMA data fix for TI k3-psil drivers
- suspend fix, pointer check, logic for arbitration fix and channel
leak fix in fsl-edma driver
- couple of fixes in idxd driver for GRPCFG descriptions and int_handle
field handling
- single fix for stm32 driver for bitfield overflow
* tag 'dmaengine-fix-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine:
dmaengine: fsl-edma: fix DMA channel leak in eDMAv4
dmaengine: fsl-edma: fix wrong pointer check in fsl_edma3_attach_pd()
dmaengine: idxd: Fix incorrect descriptions for GRPCFG register
dmaengine: idxd: Protect int_handle field in hw descriptor
dmaengine: stm32-dma: avoid bitfield overflow assertion
dmaengine: fsl-edma: Add judgment on enabling round robin arbitration
dmaengine: fsl-edma: Do not suspend and resume the masked dma channel when the system is sleeping
dmaengine: ti: k3-psil-am62a: Fix SPI PDMA data
dmaengine: ti: k3-psil-am62: Fix SPI PDMA data
Linus Torvalds [Sun, 17 Dec 2023 17:07:34 +0000 (09:07 -0800)]
Merge tag 'cxl-fixes-6.7-rc6' of git://git./linux/kernel/git/cxl/cxl
Pull CXL (Compute Express Link) fixes from Dan Williams:
"A collection of CXL fixes.
The touch outside of drivers/cxl/ is for a helper that allocates
physical address space. Device hotplug tests showed that the driver
failed to utilize (skipped over) valid capacity when allocating a new
memory region. Outside of that, new tests uncovered a small crop of
lockdep reports.
There are also some miscellaneous error path and leak fixups that are
not urgent, but useful to clean up now.
- Fix alloc_free_mem_region()'s scan for address space, prevent false
negative out-of-space events
- Fix sleeping lock acquisition from CXL trace event (atomic context)
- Fix put_device() leak for the new CXL PMU driver
- Fix wrong pointer freed on error path
- Fixup several lockdep reports (missing lock hold) from new
assertion in cxl_num_decoders_committed() and new tests"
* tag 'cxl-fixes-6.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
cxl/pmu: Ensure put_device on pmu devices
cxl/cdat: Free correct buffer on checksum error
cxl/hdm: Fix dpa translation locking
kernel/resource: Increment by align value in get_free_mem_region()
cxl: Add cxl_num_decoders_committed() usage to cxl_test
cxl/memdev: Hold region_rwsem during inject and clear poison ops
cxl/core: Always hold region_rwsem while reading poison lists
cxl/hdm: Fix a benign lockdep splat
Linus Torvalds [Sun, 17 Dec 2023 17:02:20 +0000 (09:02 -0800)]
Merge tag 'edac_urgent_for_v6.7_rc6' of git://git./linux/kernel/git/ras/ras
Pull EDAC fix from Borislav Petkov:
- A single fix for the EDAC Versal driver to read out register fields
properly
* tag 'edac_urgent_for_v6.7_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
EDAC/versal: Read num_csrows and num_chans using the correct bitfield macro
Linus Torvalds [Sun, 17 Dec 2023 16:50:00 +0000 (08:50 -0800)]
Merge tag 'powerpc-6.7-5' of git://git./linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- Fix a bug where heavy VAS (accelerator) usage could race with
partition migration and prevent the migration from completing.
- Update MAINTAINERS to add Aneesh & Naveen.
Thanks to Haren Myneni.
* tag 'powerpc-6.7-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
MAINTAINERS: powerpc: Add Aneesh & Naveen
powerpc/pseries/vas: Migration suspend waits for no in-progress open windows
Amir Goldstein [Sun, 17 Dec 2023 09:08:52 +0000 (11:08 +0200)]
ovl: fix dentry reference leak after changes to underlying layers
syzbot exercised the forbidden practice of moving the workdir under
lowerdir while overlayfs is mounted and tripped a dentry reference leak.
Fixes: c63e56a4a652 ("ovl: do not open/llseek lower file with upper sb_writers held")
Reported-and-tested-by: syzbot+8608bb4553edb8c78f41@syzkaller.appspotmail.com
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Linus Torvalds [Sun, 17 Dec 2023 00:57:55 +0000 (16:57 -0800)]
Merge tag 'clk-fixes-for-linus' of git://git./linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
"A handful of clk fixes, mostly in the rockchip clk driver:
- Fix a clk name, clk parent, and a register for a clk gate in the
Rockchip rk3128 clk driver
- Add a PLL frequency on Rockchip rk3568 to fix some display
artifacts
- Fix a kbuild dependency for Qualcomm's SM_CAMCC_8550 symbol so that
it isn't possible to select the associated GCC driver"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: rockchip: rk3128: Fix SCLK_SDMMC's clock name
clk: rockchip: rk3128: Fix aclk_peri_src's parent
clk: qcom: Fix SM_CAMCC_8550 dependencies
clk: rockchip: rk3128: Fix HCLK_OTG gate register
clk: rockchip: rk3568: Add PLL rate for 292.5MHz
Linus Torvalds [Sat, 16 Dec 2023 18:40:51 +0000 (10:40 -0800)]
Merge tag 'trace-v6.7-rc5' of git://git./linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:
- Fix eventfs to check creating new files for events with names greater
than NAME_MAX. The eventfs lookup needs to check the return result of
simple_lookup().
- Fix the ring buffer to check the proper max data size. Events must be
able to fit on the ring buffer sub-buffer, if it cannot, then it
fails to be written and the logic to add the event is avoided. The
code to check if an event can fit failed to add the possible absolute
timestamp which may make the event not be able to fit. This causes
the ring buffer to go into an infinite loop trying to find a
sub-buffer that would fit the event. Luckily, there's a check that
will bail out if it looped over 1000 times and it also warns.
The real fix is not to add the absolute timestamp to an event that is
starting at the beginning of a sub-buffer because it uses the
sub-buffer timestamp.
Avoiding the timestamp at the start of the sub-buffer allows
events that pass the first check to always find a sub-buffer that they
can fit on.
- Have large events that do not fit on a trace_seq to print "LINE TOO
BIG" like it does for the trace_pipe instead of what it does now
which is to silently drop the output.
- Fix a memory leak of forgetting to free the spare page that is saved
by a trace instance.
- Update the size of the snapshot buffer when the main buffer is
updated if the snapshot buffer is allocated.
- Fix ring buffer timestamp logic by removing all the places that tried
to put the before_stamp back to the write stamp so that the next
event doesn't add an absolute timestamp. But each of these updates
added a race where by making the two timestamp equal, it was
validating the write_stamp so that it can be incorrectly used for
calculating the delta of an event.
- There's a temp buffer used for printing the event that was using the
event data size for allocation when it needed to use the size of the
entire event (meta-data and payload data)
- For hardening, use "%.*s" for printing the trace_marker output, to
limit the amount that is printed by the size of the event. This was
discovered by development that added a bug that truncated the '\0'
and caused a crash.
- Fix a use-after-free bug in the use of the histogram files when an
instance is being removed.
- Remove a useless update in the rb_try_to_discard of the write_stamp.
The before_stamp was already changed to force the next event to add
an absolute timestamp that the write_stamp is not used. But the
write_stamp is modified again using an unneeded 64-bit cmpxchg.
- Fix several races in the 32-bit implementation of the
rb_time_cmpxchg() that does a 64-bit cmpxchg.
- While looking at fixing the 64-bit cmpxchg, I noticed that because
the ring buffer uses normal cmpxchg, and this can be done in NMI
context, there's some architectures that do not have a working
cmpxchg in NMI context. For these architectures, fail recording
events that happen in NMI context.
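The "%.*s" hardening item in the list above relies on standard printf precision semantics: with a precision, at most that many bytes are read from the string argument, so the data does not need a terminating '\0'. A minimal userspace sketch, where format_marker() is a hypothetical helper and not a tracing function:

```c
#include <stdio.h>

/*
 * Format at most 'len' bytes of 'data' into 'out'.
 * 'data' need not be NUL-terminated as long as it holds 'len' bytes.
 * Returns the number of characters the full output requires.
 */
static int format_marker(char *out, size_t outsz, const char *data, int len)
{
	return snprintf(out, outsz, "%.*s", len, data);
}
```

Passing the event's payload size as the precision bounds the read even if a bug elsewhere dropped the terminator.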
* tag 'trace-v6.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
ring-buffer: Do not record in NMI if the arch does not support cmpxchg in NMI
ring-buffer: Have rb_time_cmpxchg() set the msb counter too
ring-buffer: Fix 32-bit rb_time_read() race with rb_time_cmpxchg()
ring-buffer: Fix a race in rb_time_cmpxchg() for 32 bit archs
ring-buffer: Remove useless update to write_stamp in rb_try_to_discard()
ring-buffer: Do not try to put back write_stamp
tracing: Fix uaf issue when open the hist or hist_debug file
tracing: Add size check when printing trace_marker output
ring-buffer: Have saved event hold the entire event
ring-buffer: Do not update before stamp when switching sub-buffers
tracing: Update snapshot buffer on resize if it is allocated
ring-buffer: Fix memory leak of free page
eventfs: Fix events beyond NAME_MAX blocking tasks
tracing: Have large events show up as '[LINE TOO BIG]' instead of nothing
ring-buffer: Fix writing to the buffer with max_data_size
Linus Torvalds [Sat, 16 Dec 2023 03:59:03 +0000 (19:59 -0800)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- Arm CMN perf: fix the DTC allocation failure path which can end up
erroneously clearing live counters
- arm64/mm: fix hugetlb handling of the dirty page state leading to a
continuous fault loop in user on hardware without dirty bit
management (DBM). That's caused by the dirty+writeable information
not being properly preserved across a series of mprotect(PROT_NONE),
mprotect(PROT_READ|PROT_WRITE)
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: mm: Always make sw-dirty PTEs hw-dirty in pte_modify
perf/arm-cmn: Fail DTC counter allocation correctly
Linus Torvalds [Sat, 16 Dec 2023 03:48:47 +0000 (19:48 -0800)]
Merge tag 'pci-v6.7-fixes-1' of git://git./linux/kernel/git/pci/pci
Pull pci fixes from Bjorn Helgaas:
- Limit Max_Read_Request_Size (MRRS) on some MIPS Loongson systems
because they don't all support MRRS > 256, and firmware doesn't
always initialize it correctly, which meant some PCIe devices didn't
work (Jiaxun Yang)
- Add and use pci_enable_link_state_locked() to prevent potential
deadlocks in vmd and qcom drivers (Johan Hovold)
- Revert recent (v6.5) acpiphp resource assignment changes that fixed
issues with hot-adding devices on a root bus or with large BARs, but
introduced new issues with GPU initialization and hot-adding SCSI
disks in QEMU VMs (Bjorn Helgaas)
* tag 'pci-v6.7-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
Revert "PCI: acpiphp: Reassign resources on bridge if necessary"
PCI/ASPM: Add pci_disable_link_state_locked() lockdep assert
PCI/ASPM: Clean up __pci_disable_link_state() 'sem' parameter
PCI: qcom: Clean up ASPM comment
PCI: qcom: Fix potential deadlock when enabling ASPM
PCI: vmd: Fix potential deadlock when enabling ASPM
PCI/ASPM: Add pci_enable_link_state_locked()
PCI: loongson: Limit MRRS to 256
Josef Bacik [Fri, 15 Dec 2023 15:01:44 +0000 (10:01 -0500)]
btrfs: do not allow non subvolume root targets for snapshot
Our btrfs subvolume snapshot <source> <destination> utility enforces
that <source> is the root of the subvolume, however this isn't enforced
in the kernel. Update the kernel to also enforce this limitation to
avoid problems with other users of this ioctl that don't have the
appropriate checks in place.
Reported-by: Martin Michaelis <code@mgjm.de>
CC: stable@vger.kernel.org # 4.14+
Reviewed-by: Neal Gompa <neal@gompa.dev>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Jens Axboe [Fri, 15 Dec 2023 20:40:57 +0000 (13:40 -0700)]
cred: get rid of CONFIG_DEBUG_CREDENTIALS
This code is rarely (never?) enabled by distros, and it hasn't caught
anything in decades. Let's kill off this legacy debug code.
Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jens Axboe [Fri, 15 Dec 2023 20:24:10 +0000 (13:24 -0700)]
cred: switch to using atomic_long_t
There are multiple ways to grab references to credentials, and the only
protection we have against overflowing it is the memory required to do
so.
With memory sizes only moving in one direction, let's bump the reference
count to 64-bit and move it outside the realm of feasibly overflowing.
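A rough sketch of why the width matters. The demo_ref32 and demo_ref64
types below are made up for illustration and are not the kernel's struct
cred or its atomic helpers; the point is only that a 32-bit count can wrap
back to zero while a 64-bit one is nowhere near wrapping.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical counters, not the kernel's struct cred. */
struct demo_ref32 { uint32_t usage; };
struct demo_ref64 { uint64_t usage; };

/* Take n references; unsigned 32-bit arithmetic wraps modulo 2^32. */
uint32_t demo_get32(struct demo_ref32 *r, uint32_t n)
{
	r->usage += n;
	return r->usage;
}

/* Same operation on a 64-bit count. */
uint64_t demo_get64(struct demo_ref64 *r, uint64_t n)
{
	r->usage += n;
	return r->usage;
}

/* Example: one existing reference plus UINT32_MAX new ones. */
void demo_ref_example(void)
{
	struct demo_ref32 r32 = { 1 };
	struct demo_ref64 r64 = { 1 };

	/* The 32-bit count wraps to 0 and looks like "no users". */
	assert(demo_get32(&r32, UINT32_MAX) == 0);
	/* The 64-bit count simply keeps growing. */
	assert(demo_get64(&r64, UINT32_MAX) == 0x100000000ULL);
}
```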
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bjorn Helgaas [Thu, 14 Dec 2023 15:08:56 +0000 (09:08 -0600)]
Revert "PCI: acpiphp: Reassign resources on bridge if necessary"
This reverts commit 40613da52b13fb21c5566f10b287e0ca8c12c4e9 and the
subsequent fix to it:
cc22522fd55e ("PCI: acpiphp: Use pci_assign_unassigned_bridge_resources() only for non-root bus")
40613da52b13 fixed a problem where hot-adding a device with large BARs
failed if the bridge windows programmed by firmware were not large enough.
cc22522fd55e ("PCI: acpiphp: Use pci_assign_unassigned_bridge_resources()
only for non-root bus") fixed a problem with 40613da52b13: an ACPI hot-add
of a device on a PCI root bus (common in the virt world) or firmware
sending ACPI Bus Check to non-existent Root Ports (e.g., on Dell Inspiron
7352/0W6WV0) caused a NULL pointer dereference and suspend/resume hangs.
Unfortunately the combination of 40613da52b13 and cc22522fd55e caused other
problems:
- Fiona reported that hot-add of SCSI disks in a QEMU virtual machine
sometimes fails.
- Dongli reported a similar problem with hot-add of SCSI disks.
- Jonathan reported a console freeze during boot on bare metal due to an
error in radeon GPU initialization.
Revert both patches to avoid adding these problems. This means we will
again see the problems with hot-adding devices with large BARs and the NULL
pointer dereferences and suspend/resume issues that 40613da52b13 and
cc22522fd55e were intended to fix.
Fixes: 40613da52b13 ("PCI: acpiphp: Reassign resources on bridge if necessary")
Fixes: cc22522fd55e ("PCI: acpiphp: Use pci_assign_unassigned_bridge_resources() only for non-root bus")
Reported-by: Fiona Ebner <f.ebner@proxmox.com>
Closes: https://lore.kernel.org/r/9eb669c0-d8f2-431d-a700-6da13053ae54@proxmox.com
Reported-by: Dongli Zhang <dongli.zhang@oracle.com>
Closes: https://lore.kernel.org/r/3c4a446a-b167-11b8-f36f-d3c1b49b42e9@oracle.com
Reported-by: Jonathan Woithe <jwoithe@just42.net>
Closes: https://lore.kernel.org/r/ZXpaNCLiDM+Kv38H@marvin.atrad.com.au
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Cc: <stable@vger.kernel.org>
Linus Torvalds [Fri, 15 Dec 2023 20:20:14 +0000 (12:20 -0800)]
Merge tag 'io_uring-6.7-2023-12-15' of git://git.kernel.dk/linux
Pull io_uring fixes from Jens Axboe:
"Just two minor fixes:
- Fix for the io_uring socket option commands using the wrong value
on some archs (Al)
- Tweak to the poll lazy wake enable (me)"
* tag 'io_uring-6.7-2023-12-15' of git://git.kernel.dk/linux:
io_uring/cmd: fix breakage in SOCKET_URING_OP_SIOC* implementation
io_uring/poll: don't enable lazy wake for POLLEXCLUSIVE
Linus Torvalds [Fri, 15 Dec 2023 20:00:54 +0000 (12:00 -0800)]
Merge tag 'mm-hotfixes-stable-2023-12-15-07-11' of git://git./linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"17 hotfixes. 8 are cc:stable and the other 9 pertain to post-6.6
issues"
* tag 'mm-hotfixes-stable-2023-12-15-07-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
mm/mglru: reclaim offlined memcgs harder
mm/mglru: respect min_ttl_ms with memcgs
mm/mglru: try to stop at high watermarks
mm/mglru: fix underprotected page cache
mm/shmem: fix race in shmem_undo_range w/THP
Revert "selftests: error out if kernel header files are not yet built"
crash_core: fix the check for whether crashkernel is from high memory
x86, kexec: fix the wrong ifdeffery CONFIG_KEXEC
sh, kexec: fix the incorrect ifdeffery and dependency of CONFIG_KEXEC
mips, kexec: fix the incorrect ifdeffery and dependency of CONFIG_KEXEC
m68k, kexec: fix the incorrect ifdeffery and build dependency of CONFIG_KEXEC
loongarch, kexec: change dependency of object files
mm/damon/core: make damon_start() waits until kdamond_fn() starts
selftests/mm: cow: print ksft header before printing anything else
mm: fix VMA heap bounds checking
riscv: fix VMALLOC_START definition
kexec: drop dependency on ARCH_SUPPORTS_KEXEC from CRASH_DUMP
Linus Torvalds [Fri, 15 Dec 2023 19:35:55 +0000 (11:35 -0800)]
Merge tag 'sound-6.7-rc6' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A collection of HD-audio quirks for TAS2781 codec and device-specific
workarounds"
* tag 'sound-6.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda/tas2781: reset the amp before component_add
ALSA: hda/tas2781: call cleanup functions only once
ALSA: hda/tas2781: handle missing EFI calibration data
ALSA: hda/tas2781: leave hda_component in usable state
ALSA: hda/realtek: Apply mute LED quirk for HP15-db
ALSA: hda/hdmi: add force-connect quirks for ASUSTeK Z170 variants
ALSA: hda/hdmi: add force-connect quirk for NUC5CPYB
Linus Torvalds [Fri, 15 Dec 2023 19:07:13 +0000 (11:07 -0800)]
Merge tag 'drm-fixes-2023-12-15' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"More regular fixes, amdgpu, i915, mediatek and nouveau are most of
them this week. Nothing too major, then a few misc bits and pieces in
core, panel and ivpu.
drm:
- fix uninit problems in crtc
- fix fd ownership check
- edid: add modes in fallback paths
panel:
- move LG panel into DSI yaml
- ltk050h3146w: set burst mode
mediatek:
- mtk_disp_gamma: Fix breakage due to merge issue
- fix kernel oops if no crtc is found
- Add spinlock for setting vblank event in atomic_begin
- Fix access violation in mtk_drm_crtc_dma_dev_get
i915:
- Fix selftest engine reset count storage for multi-tile
- Fix out-of-bounds reads for engine reset counts
- Fix ADL+ remapped stride with CCS
- Fix intel_atomic_setup_scalers() plane_state handling
- Fix ADL+ tiled plane stride when the POT stride is smaller than the original
- Fix eDP 1.4 rate select method link configuration
amdgpu:
- Fix suspend fix that got accidentally mangled last week
- Fix OD regression
- PSR fixes
- OLED Backlight regression fix
- JPEG 4.0.5 fix
- Misc display fixes
- SDMA 5.2 fix
- SDMA 2.4 regression fix
- GPUVM race fix
nouveau:
- fix gk20a instobj hierarchy
- fix headless iors inheritance regression
ivpu:
- fix WA initialisation"
* tag 'drm-fixes-2023-12-15' of git://anongit.freedesktop.org/drm/drm: (31 commits)
drm/nouveau/kms/nv50-: Don't allow inheritance of headless iors
drm/nouveau: Fixup gk20a instobj hierarchy
drm/amdgpu: warn when there are still mappings when a BO is destroyed v2
drm/amdgpu: fix tear down order in amdgpu_vm_pt_free
drm/amd: Fix a probing order problem on SDMA 2.4
drm/amdgpu/sdma5.2: add begin/end_use ring callbacks
drm/panel: ltk050h3146w: Set burst mode for ltk050h3148w
dt-bindings: panel-simple-dsi: move LG 5" HD TFT LCD panel into DSI yaml
drm/amd/display: Disable PSR-SU on Parade 0803 TCON again
drm/amd/display: Populate dtbclk from bounding box
drm/amd/display: Revert "Fix conversions between bytes and KB"
drm/amdgpu/jpeg: configure doorbell for each playback
drm/amd/display: Restore guard against default backlight value < 1 nit
drm/amd/display: fix hw rotated modes when PSR-SU is enabled
drm/amd/pm: fix pp_*clk_od typo
drm/amdgpu: fix buffer funcs setting order on suspend harder
drm/mediatek: Fix access violation in mtk_drm_crtc_dma_dev_get
drm/edid: also call add modes in EDID connector update fallback
drm/i915/edp: don't write to DP_LINK_BW_SET when using rate select
drm/i915: Fix ADL+ tiled plane stride when the POT stride is smaller than the original
...
NeilBrown [Fri, 15 Dec 2023 00:56:33 +0000 (11:56 +1100)]
nfsd: hold nfsd_mutex across entire netlink operation
Rather than using svc_get() and svc_put() to hold a stable reference to
the nfsd_svc for netlink lookups, simply hold the mutex for the entire
time.
The "entire" time isn't very long, and the mutex is not often contended.
This makes way for us to remove the refcounts of svc, which is more
confusing than useful.
Reported-by: Jeff Layton <jlayton@kernel.org>
Closes: https://lore.kernel.org/linux-nfs/5d9bbb599569ce29f16e4e0eef6b291eda0f375b.camel@kernel.org/T/#u
Fixes: bd9d6a3efa97 ("NFSD: add rpc_status netlink support")
Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
NeilBrown [Fri, 15 Dec 2023 00:56:31 +0000 (11:56 +1100)]
nfsd: call nfsd_last_thread() before final nfsd_put()
If write_ports_addfd or write_ports_addxprt fail, they call nfsd_put()
without calling nfsd_last_thread(). This leaves nn->nfsd_serv pointing
to a structure that has been freed.
So remove 'static' from nfsd_last_thread() and call it when the
nfsd_serv is about to be destroyed.
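A minimal sketch of the ordering described above, with made-up types
(demo_net, demo_serv) instead of the real nfsd structures. It assumes, as
the message implies, that the "last thread" step is what stops publishing
the server pointer, so it has to run before the final put frees it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the per-net state and the server object. */
struct demo_serv { int refs; };
struct demo_net { struct demo_serv *serv; };

/* Stand-in for the "last thread" step: stop publishing the server. */
void demo_last_thread(struct demo_net *nn)
{
	nn->serv = NULL;
}

/* Drop one reference; free the server when the last one goes away. */
void demo_put(struct demo_serv *serv)
{
	if (--serv->refs == 0)
		free(serv);
}

/* Error path of a hypothetical "add a port" operation. */
int demo_add_port_failed(struct demo_net *nn)
{
	struct demo_serv *serv = nn->serv;

	if (serv->refs == 1)
		demo_last_thread(nn); /* unpublish before the final put */
	demo_put(serv);
	return -1;
}
```

Without the demo_last_thread() call, nn->serv would keep pointing at the
freed object after the error path returns.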
Fixes: ec52361df99b ("SUNRPC: stop using ->sv_nrthreads as a refcount")
Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Steven Rostedt (Google) [Wed, 13 Dec 2023 22:54:03 +0000 (17:54 -0500)]
ring-buffer: Do not record in NMI if the arch does not support cmpxchg in NMI
As the ring buffer recording requires cmpxchg() to work, if the
architecture does not support cmpxchg in NMI, then do not do any recording
within an NMI.
Link: https://lore.kernel.org/linux-trace-kernel/20231213175403.6fc18540@gandalf.local.home
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt (Google) [Fri, 15 Dec 2023 13:41:14 +0000 (08:41 -0500)]
ring-buffer: Have rb_time_cmpxchg() set the msb counter too
The rb_time_cmpxchg() on 32-bit architectures requires setting three
32-bit words to represent the 64-bit timestamp, with some salt for
synchronization. Those are: msb, top, and bottom.
The issue is, the rb_time_cmpxchg() did not properly salt the msb portion,
and the msb that was written was stale.
Link: https://lore.kernel.org/linux-trace-kernel/20231215084114.20899342@rorschach.local.home
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Fixes: f03f2abce4f39 ("ring-buffer: Have 32 bit time stamps use all 64 bits")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Mathieu Desnoyers [Tue, 12 Dec 2023 19:30:49 +0000 (14:30 -0500)]
ring-buffer: Fix 32-bit rb_time_read() race with rb_time_cmpxchg()
The following race can cause rb_time_read() to observe a corrupted time
stamp:
rb_time_cmpxchg()
[...]
if (!rb_time_read_cmpxchg(&t->msb, msb, msb2))
return false;
if (!rb_time_read_cmpxchg(&t->top, top, top2))
return false;
<interrupted before updating bottom>
__rb_time_read()
[...]
do {
c = local_read(&t->cnt);
top = local_read(&t->top);
bottom = local_read(&t->bottom);
msb = local_read(&t->msb);
} while (c != local_read(&t->cnt));
*cnt = rb_time_cnt(top);
/* If top and msb counts don't match, this interrupted a write */
if (*cnt != rb_time_cnt(msb))
return false;
^ this check fails to catch that "bottom" is still not updated.
So the old "bottom" value is returned, which is wrong.
Fix this by checking that all three of msb, top, and bottom 2-bit cnt
values match.
The reason to favor checking all three fields over requiring a specific
update order for both rb_time_set() and rb_time_cmpxchg() is because
checking all three fields is more robust to handle partial failures of
rb_time_cmpxchg() when interrupted by nested rb_time_set().
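The three-way check can be sketched in plain userspace C as follows. This
is an illustration only: the demo_* names and the exact bit layout are
assumptions, the words are ordinary integers rather than local_t, and the
interrupted update is simulated by hand instead of by a real interrupt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: every 32-bit word holds a 2-bit update count in its
 * top bits and 30 bits of timestamp payload below that. */
#define DEMO_CNT_SHIFT 30
#define DEMO_VAL_MASK  ((1u << DEMO_CNT_SHIFT) - 1)

struct demo_time {
	uint32_t msb;    /* timestamp bits 60..63 */
	uint32_t top;    /* timestamp bits 30..59 */
	uint32_t bottom; /* timestamp bits 0..29 */
};

uint32_t demo_cnt(uint32_t word)
{
	return word >> DEMO_CNT_SHIFT;
}

uint32_t demo_pack(uint32_t val, uint32_t cnt)
{
	return (val & DEMO_VAL_MASK) | ((cnt & 3u) << DEMO_CNT_SHIFT);
}

void demo_time_set(struct demo_time *t, uint64_t val, uint32_t cnt)
{
	t->msb = demo_pack((uint32_t)(val >> 60), cnt);
	t->top = demo_pack((uint32_t)(val >> 30), cnt);
	t->bottom = demo_pack((uint32_t)val, cnt);
}

/* Fails unless msb, top and bottom all carry the same 2-bit count. */
bool demo_time_read(const struct demo_time *t, uint64_t *val)
{
	uint32_t cnt = demo_cnt(t->top);

	if (cnt != demo_cnt(t->msb) || cnt != demo_cnt(t->bottom))
		return false;

	*val = ((uint64_t)(t->msb & DEMO_VAL_MASK) << 60) |
	       ((uint64_t)(t->top & DEMO_VAL_MASK) << 30) |
	       (t->bottom & DEMO_VAL_MASK);
	return true;
}

/* Example: a half-finished update (msb and top rewritten, bottom not yet)
 * is rejected instead of returning a mixed value. */
void demo_time_example(void)
{
	struct demo_time t;
	uint64_t val = 0;

	demo_time_set(&t, 0x123456789ABCDEF0ULL, 1);
	assert(demo_time_read(&t, &val) && val == 0x123456789ABCDEF0ULL);

	t.msb = demo_pack(0, 2);
	t.top = demo_pack(0, 2);
	assert(!demo_time_read(&t, &val));
}
```

Comparing only top against msb would accept the half-finished state in the
example, because those two words already agree.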
Link: https://lore.kernel.org/lkml/20231211201324.652870-1-mathieu.desnoyers@efficios.com/
Link: https://lore.kernel.org/linux-trace-kernel/20231212193049.680122-1-mathieu.desnoyers@efficios.com
Fixes: f458a1453424e ("ring-buffer: Test last update in 32bit version of __rb_time_read()")
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt (Google) [Tue, 12 Dec 2023 16:53:01 +0000 (11:53 -0500)]
ring-buffer: Fix a race in rb_time_cmpxchg() for 32 bit archs
Mathieu Desnoyers pointed out an issue in the rb_time_cmpxchg() for 32 bit
architectures. That is:
static bool rb_time_cmpxchg(rb_time_t *t, u64 expect, u64 set)
{
unsigned long cnt, top, bottom, msb;
unsigned long cnt2, top2, bottom2, msb2;
u64 val;
/* The cmpxchg always fails if it interrupted an update */
if (!__rb_time_read(t, &val, &cnt2))
return false;
if (val != expect)
return false;
<<<< interrupted here!
cnt = local_read(&t->cnt);
The problem is that the synchronization counter in the rb_time_t is read
*after* the value of the timestamp is read. That means if an interrupt
were to come in between the value being read and the counter being read,
it can change the value and the counter and the interrupted process would
be clueless about it!
The counter needs to be read first and then the value. That way it is easy
to tell if the value is stale or not. If the counter hasn't been updated,
then the value is still good.
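A single-threaded sketch of that ordering, under loud assumptions: the
demo_stamp type and the irq hook are invented for illustration, the hook
stands in for an interrupt landing between the two reads, and the final
store is a plain assignment where the real code would use a cmpxchg.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct demo_stamp {
	unsigned long cnt; /* bumped on every update */
	uint64_t val;
};

/* Hook that stands in for an interrupt arriving between the two reads. */
typedef void (*demo_irq_fn)(struct demo_stamp *s);

bool demo_cmpxchg(struct demo_stamp *s, uint64_t expect, uint64_t set,
		  demo_irq_fn irq)
{
	unsigned long cnt = s->cnt; /* counter first ... */
	uint64_t val;

	if (irq)
		irq(s); /* simulated interrupt */

	val = s->val; /* ... then the value */
	if (val != expect)
		return false;
	if (s->cnt != cnt) /* an update slipped in: the value is stale */
		return false;

	s->val = set;
	s->cnt = cnt + 1;
	return true;
}

/* Interrupt that writes a different value. */
void demo_irq_new_value(struct demo_stamp *s)
{
	s->val = 999;
	s->cnt++;
}

/* Interrupt that rewrites the same value; only the counter reveals it. */
void demo_irq_same_value(struct demo_stamp *s)
{
	s->cnt++;
}

void demo_cmpxchg_example(void)
{
	struct demo_stamp s = { 0, 10 };

	assert(demo_cmpxchg(&s, 10, 20, NULL));
	assert(!demo_cmpxchg(&s, 20, 30, demo_irq_new_value));
	assert(!demo_cmpxchg(&s, 999, 30, demo_irq_same_value));
	assert(s.val == 999);
}
```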
Link: https://lore.kernel.org/linux-trace-kernel/20231211201324.652870-1-mathieu.desnoyers@efficios.com/
Link: https://lore.kernel.org/linux-trace-kernel/20231212115301.7a9c9a64@gandalf.local.home
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Fixes: 10464b4aa605e ("ring-buffer: Add rb_time_t 64 bit operations for speeding up 32 bit")
Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt (Google) [Fri, 15 Dec 2023 13:18:10 +0000 (08:18 -0500)]
ring-buffer: Remove useless update to write_stamp in rb_try_to_discard()
When filtering is enabled, a temporary buffer is created to place the
content of the trace event output so that the filter logic can decide
from the trace event output if the trace event should be filtered out or
not. If it is to be filtered out, the content in the temporary buffer is
simply discarded, otherwise it is written into the trace buffer.
But if an interrupt were to come in while a previous event was using that
temporary buffer, the event written by the interrupt would actually go
into the ring buffer itself to prevent corrupting the data on the
temporary buffer. If the event is to be filtered out, the event in the
ring buffer is discarded, or if it fails to discard because another event
has already come in, it is turned into padding.
The update to the write_stamp in the rb_try_to_discard() happens after a
fix was made to force the next event after the discard to use an absolute
timestamp by setting the before_stamp to zero so it does not match the
write_stamp (which causes an event to use the absolute timestamp).
But there's an effort in rb_try_to_discard() to put back the write_stamp
to what it was before the event was added. But this is useless and
wasteful because nothing is going to be using that write_stamp for
calculations as it still will not match the before_stamp.
Remove this useless update, and in doing so, we remove another
cmpxchg64()!
Also update the comments to reflect this change as well as remove some
extra white space in another comment.
Link: https://lore.kernel.org/linux-trace-kernel/20231215081810.1f4f38fe@rorschach.local.home
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Vincent Donnefort <vdonnefort@google.com>
Fixes: b2dd797543cf ("ring-buffer: Force absolute timestamp on discard of event")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt (Google) [Fri, 15 Dec 2023 03:29:21 +0000 (22:29 -0500)]
ring-buffer: Do not try to put back write_stamp
If an update to an event is interrupted by another event between the time
the initial event allocated its buffer and where it wrote to the
write_stamp, the code tries to reset the write stamp back to what it had
just overwritten. It knows that it was overwritten via checking the
before_stamp, and if it didn't match what it wrote to the before_stamp
before it allocated its space, it knows it was overwritten.
To put back the write_stamp, it uses the before_stamp it read. The problem
here is that by writing the before_stamp to the write_stamp it makes the
two equal again, which means that the write_stamp can be considered valid
as the last timestamp written to the ring buffer. But this is not
necessarily true. The event that interrupted this one could itself have
been interrupted, and can end up leaving with an invalid write_stamp. If
that happens and control returns to this context, which then uses the
before_stamp to update the write_stamp again, it can incorrectly make the
write_stamp valid, causing later events to have incorrect time stamps.
As it is OK to leave this function with an invalid write_stamp (one that
doesn't match the before_stamp), there's no reason to try to make it valid
again in this case. If this race happens, then just leave with the invalid
write_stamp and the next event to come along will just add an absolute
timestamp and validate everything again.
Bonus points: This gets rid of another cmpxchg64!
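The rule the text relies on (a delta is only trusted while before_stamp
and write_stamp agree, otherwise the next event carries an absolute
timestamp) can be sketched like this. It is a heavily simplified,
single-context model with invented names (demo_stamps, demo_record) and
none of the real reservation logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct demo_stamps {
	uint64_t before_stamp;
	uint64_t write_stamp;
};

/* Records one event at time "now". Returns what would be stored in the
 * event: a delta from the previous event when the two stamps agree, or the
 * full time when they do not. Either way the stamps agree afterwards. */
uint64_t demo_record(struct demo_stamps *s, uint64_t now, bool *absolute)
{
	uint64_t stored;

	*absolute = s->before_stamp != s->write_stamp;
	stored = *absolute ? now : now - s->write_stamp;

	s->before_stamp = now;
	s->write_stamp = now;
	return stored;
}

void demo_record_example(void)
{
	struct demo_stamps s = { 100, 100 };
	bool absolute;

	/* Stamps agree: a delta is enough. */
	assert(demo_record(&s, 130, &absolute) == 30 && !absolute);

	/* An interrupted writer left the stamps mismatched; nothing puts
	 * write_stamp back, the next event just goes absolute. */
	s.before_stamp = 150;
	assert(demo_record(&s, 160, &absolute) == 160 && absolute);
	assert(s.before_stamp == s.write_stamp);
}
```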
Link: https://lore.kernel.org/linux-trace-kernel/20231214222921.193037a7@gandalf.local.home
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Vincent Donnefort <vdonnefort@google.com>
Fixes: a389d86f7fd09 ("ring-buffer: Have nested events still record running time stamp")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Shubhrajyoti Datta [Fri, 15 Dec 2023 05:33:52 +0000 (11:03 +0530)]
EDAC/versal: Read num_csrows and num_chans using the correct bitfield macro
Fix the extraction of num_csrows and num_chans. The extraction of the
num_rows is wrong. Instead of extracting using the FIELD_GET it is
calling FIELD_PREP.
The issue was masked as the default design has the rows as 0.
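To illustrate the mix-up, here is a userspace sketch with stand-in helpers
that mirror what FIELD_GET() and FIELD_PREP() do. The function names and
the DEMO_ROWS_MASK layout are made up for this example and are not the
driver's real register definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 2-bit field at bits 8..9 of a 32-bit register. */
#define DEMO_ROWS_MASK 0x00000300u

/* Position of the lowest set bit of the mask. */
unsigned int demo_mask_shift(uint32_t mask)
{
	unsigned int shift = 0;

	while (mask && !(mask & 1u)) {
		mask >>= 1;
		shift++;
	}
	return shift;
}

/* Extract a field from a register value (what FIELD_GET() is for). */
uint32_t demo_field_get(uint32_t mask, uint32_t reg)
{
	return (reg & mask) >> demo_mask_shift(mask);
}

/* Place a field value into register position (what FIELD_PREP() is for). */
uint32_t demo_field_prep(uint32_t mask, uint32_t val)
{
	return (val << demo_mask_shift(mask)) & mask;
}

void demo_field_example(void)
{
	uint32_t reg = 0x00000200u; /* field holds the value 2 */

	assert(demo_field_get(DEMO_ROWS_MASK, reg) == 2);
	/* Using the "prep" direction on a register value loses the field. */
	assert(demo_field_prep(DEMO_ROWS_MASK, reg) == 0);
	/* With an all-zero register both agree, which hides the bug. */
	assert(demo_field_get(DEMO_ROWS_MASK, 0) ==
	       demo_field_prep(DEMO_ROWS_MASK, 0));
}
```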
Fixes: 6f15b178cd63 ("EDAC/versal: Add a Xilinx Versal memory controller driver")
Closes: https://lore.kernel.org/all/60ca157e-6eff-d12c-9dc0-8aeab125edda@linux-m68k.org/
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20231215053352.8740-1-shubhrajyoti.datta@amd.com
Mark Rutland [Fri, 15 Dec 2023 11:24:50 +0000 (11:24 +0000)]
perf: Fix perf_event_validate_size() lockdep splat
When lockdep is enabled, the for_each_sibling_event(sibling, event)
macro checks that event->ctx->mutex is held. When creating a new group
leader event, we call perf_event_validate_size() on a partially
initialized event where event->ctx is NULL, and so when
for_each_sibling_event() attempts to check event->ctx->mutex, we get a
splat, as reported by Lucas De Marchi:
WARNING: CPU: 8 PID: 1471 at kernel/events/core.c:1950 __do_sys_perf_event_open+0xf37/0x1080
This only happens for a new event which is its own group_leader, and in
this case there cannot be any sibling events. Thus it's safe to skip the
check for siblings, which avoids having to make invasive and ugly
changes to for_each_sibling_event().
Avoid the splat by bailing out early when the new event is its own
group_leader.
Fixes: 382c27f4ed28f803 ("perf: Fix perf_event_validate_size()")
Closes: https://lore.kernel.org/lkml/20231214000620.3081018-1-lucas.demarchi@intel.com/
Closes: https://lore.kernel.org/lkml/ZXpm6gQ%2Fd59jGsuW@xpf.sh.intel.com/
Reported-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reported-by: Pengfei Xu <pengfei.xu@intel.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20231215112450.3972309-1-mark.rutland@arm.com
Ira Weiny [Mon, 16 Oct 2023 23:25:05 +0000 (16:25 -0700)]
cxl/pmu: Ensure put_device on pmu devices
The following kmemleaks were detected when removing the cxl module
stack:
unreferenced object 0xffff88822616b800 (size 1024):
...
backtrace:
[<00000000bedc6f83>] kmalloc_trace+0x26/0x90
[<00000000448d1afc>] devm_cxl_pmu_add+0x3a/0x110 [cxl_core]
[<00000000ca3bfe16>] 0xffffffffa105213b
[<00000000ba7f78dc>] local_pci_probe+0x41/0x90
[<000000005bb027ac>] pci_device_probe+0xb0/0x1c0
...
unreferenced object 0xffff8882260abcc0 (size 16):
...
hex dump (first 16 bytes):
70 6d 75 5f 6d 65 6d 30 2e 30 00 26 82 88 ff ff pmu_mem0.0.&....
backtrace:
...
[<00000000152b5e98>] dev_set_name+0x43/0x50
[<00000000c228798b>] devm_cxl_pmu_add+0x102/0x110 [cxl_core]
[<00000000ca3bfe16>] 0xffffffffa105213b
[<00000000ba7f78dc>] local_pci_probe+0x41/0x90
[<000000005bb027ac>] pci_device_probe+0xb0/0x1c0
...
unreferenced object 0xffff8882272af200 (size 256):
...
backtrace:
[<00000000bedc6f83>] kmalloc_trace+0x26/0x90
[<00000000a14d1813>] device_add+0x4ea/0x890
[<00000000a3f07b47>] devm_cxl_pmu_add+0xbe/0x110 [cxl_core]
[<00000000ca3bfe16>] 0xffffffffa105213b
[<00000000ba7f78dc>] local_pci_probe+0x41/0x90
[<000000005bb027ac>] pci_device_probe+0xb0/0x1c0
...
devm_cxl_pmu_add() correctly registers a device remove function but it
only calls device_del() which is only part of device unregistration.
Properly call device_unregister() to free up the memory associated with
the device.
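As a toy model of the difference: in the driver core, device_unregister()
is device_del() followed by put_device(), and it is the final put that
lets the memory go. The demo_* types and functions below are invented for
illustration and collapse the real reference counting into a single
counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy device: one reference held on behalf of the registration. */
struct demo_device {
	int refcount;
	bool registered;
	bool released;
};

void demo_device_add(struct demo_device *dev)
{
	dev->refcount = 1;
	dev->registered = true;
	dev->released = false;
}

/* Only removes the device from view; the reference is still held. */
void demo_device_del(struct demo_device *dev)
{
	dev->registered = false;
}

/* Drops a reference; the last put releases the memory. */
void demo_put_device(struct demo_device *dev)
{
	if (--dev->refcount == 0)
		dev->released = true;
}

/* Full teardown: delete, then drop the reference. */
void demo_device_unregister(struct demo_device *dev)
{
	demo_device_del(dev);
	demo_put_device(dev);
}

void demo_device_example(void)
{
	struct demo_device leaked, freed;

	demo_device_add(&leaked);
	demo_device_del(&leaked);
	assert(!leaked.registered && !leaked.released); /* the leak */

	demo_device_add(&freed);
	demo_device_unregister(&freed);
	assert(!freed.registered && freed.released);
}
```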
Fixes: 1ad3f701c399 ("cxl/pci: Find and register CXL PMU devices")
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20231016-pmu-unregister-fix-v1-1-1e2eb2fa3c69@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Lyude Paul [Thu, 14 Dec 2023 00:43:57 +0000 (19:43 -0500)]
drm/nouveau/kms/nv50-: Don't allow inheritance of headless iors
Turns out we made a silly mistake when coming up with OR inheritance on
nouveau. On pre-DCB 4.1, iors are statically routed to output paths via the
DCB. On later generations iors are only routed to an output path if they're
actually being used. Unfortunately, it appears with NVIF_OUTP_INHERIT_V0 we
make the mistake of assuming the latter is true on all generations, which is
currently leading us to return bogus ior -> head assignments through nvif,
which causes WARN_ON().
So - fix this by verifying that we actually know that there's a head
assigned to an ior before allowing it to be inherited through nvif. This
-should- hopefully fix the WARN_ON on GT218 reported by Borislav.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Reported-by: Borislav Petkov (AMD) <bp@alien8.de>
Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231214004359.1028109-1-lyude@redhat.com
Thierry Reding [Fri, 8 Dec 2023 10:46:53 +0000 (11:46 +0100)]
drm/nouveau: Fixup gk20a instobj hierarchy
Commit 12c9b05da918 ("drm/nouveau/imem: support allocations not
preserved across suspend") uses container_of() to cast from struct
nvkm_memory to struct nvkm_instobj, assuming that all instance objects
are derived from struct nvkm_instobj. For the gk20a family that's not
the case and they are derived from struct nvkm_memory instead. This
causes some subtle data corruption (nvkm_instobj.preserve ends up
mapping to gk20a_instobj.vaddr) that causes a NULL pointer dereference
in gk20a_instobj_acquire_iommu() (and possibly elsewhere) and also
prevents suspend/resume from working.
Fix this by making struct gk20a_instobj derive from struct nvkm_instobj
instead.
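The shape of the fix can be sketched with stand-in structures. All demo_*
names are hypothetical, and demo_container_of() is a simplified userspace
version of the kernel macro. The cast from the innermost object to the
middle one is only valid when the outer object really embeds the middle
one, which is what the fix arranges.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace container_of(): step back from a member pointer to
 * the structure that embeds it. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct demo_memory { int id; };

/* Middle layer: embeds demo_memory and adds the "preserve" flag. */
struct demo_instobj {
	struct demo_memory memory;
	int preserve;
};

/* Fixed layout: the chip-specific object derives from demo_instobj, so a
 * cast from its demo_memory to demo_instobj lands on real fields. */
struct demo_gk20a_instobj {
	struct demo_instobj base;
	void *vaddr;
};

struct demo_instobj *demo_to_instobj(struct demo_memory *memory)
{
	return demo_container_of(memory, struct demo_instobj, memory);
}

void demo_instobj_example(void)
{
	struct demo_gk20a_instobj obj = {
		.base = { .memory = { .id = 7 }, .preserve = 1 },
		.vaddr = NULL,
	};
	struct demo_memory *memory = &obj.base.memory;

	assert(demo_to_instobj(memory) == &obj.base);
	assert(demo_to_instobj(memory)->preserve == 1);
}
```

Had demo_gk20a_instobj embedded demo_memory directly, the same cast would
have made "preserve" alias whatever field follows it, here vaddr.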
Fixes: 12c9b05da918 ("drm/nouveau/imem: support allocations not preserved across suspend")
Reported-by: Jonathan Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231208104653.1917055-1-thierry.reding@gmail.com
Linus Torvalds [Fri, 15 Dec 2023 03:57:42 +0000 (19:57 -0800)]
Merge tag '6.7-rc5-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:
"Address OOBs and NULL dereference found by Dr. Morris's recent
analysis and fuzzing.
All marked for stable as well"
* tag '6.7-rc5-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
smb: client: fix OOB in smb2_query_reparse_point()
smb: client: fix NULL deref in asn1_ber_decoder()
smb: client: fix potential OOBs in smb2_parse_contexts()
smb: client: fix OOB in receive_encrypted_standard()
Dave Airlie [Fri, 15 Dec 2023 02:47:11 +0000 (12:47 +1000)]
Merge tag 'drm-misc-fixes-2023-12-14' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
drm-misc-fixes for v6.7-rc6:
- Fix regression for checking if FD is master capable.
- Fix uninitialized variables in drm/crtc.
- Fix ivpu w/a.
- Refresh modes correctly when updating EDID.
- Small panel fixes.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/2d46b68f-c5a4-45e5-beb4-411569f4aac8@linux.intel.com
Dave Airlie [Fri, 15 Dec 2023 02:21:42 +0000 (12:21 +1000)]
Merge tag 'amd-drm-fixes-6.7-2023-12-13' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-6.7-2023-12-13:
amdgpu:
- Fix suspend fix that got accidentally mangled last week
- Fix OD regression
- PSR fixes
- OLED Backlight regression fix
- JPEG 4.0.5 fix
- Misc display fixes
- SDMA 5.2 fix
- SDMA 2.4 regression fix
- GPUVM race fix
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231213221122.4937-1-alexander.deucher@amd.com
Linus Torvalds [Fri, 15 Dec 2023 01:15:33 +0000 (17:15 -0800)]
Merge tag 'platform-drivers-x86-v6.7-4' of git://git./linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver fixes from Ilpo Järvinen:
- tablet-mode-switch events fix
- kernel-doc warning fixes
* tag 'platform-drivers-x86-v6.7-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
platform/x86: intel_ips: fix kernel-doc formatting
platform/x86: thinkpad_acpi: fix kernel-doc warnings
platform/x86: intel-vbtn: Fix missing tablet-mode-switch events
Dave Airlie [Fri, 15 Dec 2023 01:12:40 +0000 (11:12 +1000)]
Merge tag 'drm-intel-fixes-2023-12-13' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
drm/i915 fixes for v6.7-rc6:
- Fix selftest engine reset count storage for multi-tile
- Fix out-of-bounds reads for engine reset counts
- Fix ADL+ remapped stride with CCS
- Fix intel_atomic_setup_scalers() plane_state handling
- Fix ADL+ tiled plane stride when the POT stride is smaller than the original
- Fix eDP 1.4 rate select method link configuration
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/871qbqw4rw.fsf@intel.com
Al Viro [Thu, 14 Dec 2023 21:34:08 +0000 (21:34 +0000)]
io_uring/cmd: fix breakage in SOCKET_URING_OP_SIOC* implementation
In 8e9fad0e70b7 "io_uring: Add io_uring command support for sockets"
you've got an include of asm-generic/ioctls.h done in io_uring/uring_cmd.c.
That had been done for the sake of this chunk -
+ ret = prot->ioctl(sk, SIOCINQ, &arg);
+ if (ret)
+ return ret;
+ return arg;
+ case SOCKET_URING_OP_SIOCOUTQ:
+ ret = prot->ioctl(sk, SIOCOUTQ, &arg);
SIOC{IN,OUT}Q are defined to symbols (FIONREAD and TIOCOUTQ) that come from
ioctls.h, all right, but the values vary by the architecture.
FIONREAD is
0x467F on mips
0x4004667F on alpha, powerpc and sparc
0x8004667F on sh and xtensa
0x541B everywhere else
TIOCOUTQ is
0x7472 on mips
0x40047473 on alpha, powerpc and sparc
0x80047473 on sh and xtensa
0x5411 everywhere else
->ioctl() expects the same values it would've gotten from userland; all
places where we compare with SIOC{IN,OUT}Q are using asm/ioctls.h, so
they pick the correct values. io_uring_cmd_sock(), OTOH, ends up
passing the default ones.
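One way to read the 32-bit values in the tables above, offered as an
interpretation rather than something this commit states: they match
_IOR('f', 127, int) and _IOR('t', 115, int) encoded under two different
ioctl number layouts, one with a 2-bit direction field and one with a
3-bit direction field. The demo_* helpers below are illustrative and are
not the kernel macros.

```c
#include <assert.h>
#include <stdint.h>

/* Layout with dir:2 | size:14 | type:8 | nr:8, where the "read" direction
 * is encoded as 2 (the asm-generic style). */
uint32_t demo_ior_generic(uint32_t type, uint32_t nr, uint32_t size)
{
	return (2u << 30) | (size << 16) | (type << 8) | nr;
}

/* Layout with dir:3 | size:13 | type:8 | nr:8, "read" again encoded as 2,
 * which puts the direction bits one position lower. */
uint32_t demo_ior_dir3(uint32_t type, uint32_t nr, uint32_t size)
{
	return (2u << 29) | (size << 16) | (type << 8) | nr;
}

void demo_ioctl_example(void)
{
	/* FIONREAD values listed for sh/xtensa and alpha/powerpc/sparc. */
	assert(demo_ior_generic('f', 127, 4) == 0x8004667Fu);
	assert(demo_ior_dir3('f', 127, 4) == 0x4004667Fu);

	/* TIOCOUTQ values listed for the same two groups. */
	assert(demo_ior_generic('t', 115, 4) == 0x80047473u);
	assert(demo_ior_dir3('t', 115, 4) == 0x40047473u);
}
```

The 0x541B and 0x5411 values used "everywhere else" are plain fixed
numbers with no such encoding, so a caller built against the wrong header
cannot be corrected by arithmetic; it has to include the per-architecture
definitions.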
Fixes: 8e9fad0e70b7 ("io_uring: Add io_uring command support for sockets")
Cc: <stable@vger.kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Link: https://lore.kernel.org/r/20231214213408.GT1674809@ZenIV
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Linus Torvalds [Thu, 14 Dec 2023 21:11:49 +0000 (13:11 -0800)]
Merge tag 'net-6.7-rc6' of git://git./linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Current release - regressions:
- tcp: fix tcp_disordered_ack() vs usec TS resolution
Current release - new code bugs:
- dpll: sanitize possible null pointer dereference in
dpll_pin_parent_pin_set()
- eth: octeon_ep: initialise control mbox tasks before using APIs
Previous releases - regressions:
- io_uring/af_unix: disable sending io_uring over sockets
- eth: mlx5e:
- TC, don't offload post action rule if not supported
- fix possible deadlock on mlx5e_tx_timeout_work
- eth: iavf: fix iavf_shutdown to call iavf_remove instead of iavf_close
- eth: bnxt_en: fix skb recycling logic in bnxt_deliver_skb()
- eth: ena: fix DMA syncing in XDP path when SWIOTLB is on
- eth: team: fix use-after-free when an option instance allocation
fails
Previous releases - always broken:
- neighbour: don't let neigh_forced_gc() disable preemption for long
- net: prevent mss overflow in skb_segment()
- ipv6: support reporting otherwise unknown prefix flags in
RTM_NEWPREFIX
- tcp: remove acked SYN flag from packet in the transmit queue
correctly
- eth: octeontx2-af:
- fix a use-after-free in rvu_nix_register_reporters
- fix promisc mcam entry action
- eth: dwmac-loongson: make sure MDIO is initialized before use
- eth: atlantic: fix double free in ring reinit logic"
* tag 'net-6.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (62 commits)
net: atlantic: fix double free in ring reinit logic
appletalk: Fix Use-After-Free in atalk_ioctl
net: stmmac: Handle disabled MDIO busses from devicetree
net: stmmac: dwmac-qcom-ethqos: Fix drops in 10M SGMII RX
dpaa2-switch: do not ask for MDB, VLAN and FDB replay
dpaa2-switch: fix size of the dma_unmap
net: prevent mss overflow in skb_segment()
vsock/virtio: Fix unsigned integer wrap around in virtio_transport_has_space()
Revert "tcp: disable tcp_autocorking for socket when TCP_NODELAY flag is set"
MIPS: dts: loongson: drop incorrect dwmac fallback compatible
stmmac: dwmac-loongson: drop useless check for compatible fallback
stmmac: dwmac-loongson: Make sure MDIO is initialized before use
tcp: disable tcp_autocorking for socket when TCP_NODELAY flag is set
dpll: sanitize possible null pointer dereference in dpll_pin_parent_pin_set()
net: ena: Fix XDP redirection error
net: ena: Fix DMA syncing in XDP path when SWIOTLB is on
net: ena: Fix xdp drops handling due to multibuf packets
net: ena: Destroy correct number of xdp queues upon failure
net: Remove acked SYN flag from packet in the transmit queue correctly
qed: Fix a potential use-after-free in qed_cxt_tables_alloc
...
Daniel Hill [Tue, 5 Dec 2023 06:10:28 +0000 (19:10 +1300)]
bcachefs: improve modprobe support by providing softdeps
We need to help modprobe load architecture-specific modules so we don't
fall back to generic software implementations. This should help
performance when building as a module.
Signed-off-by: Daniel Hill <daniel@gluo.nz>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Thomas Bertschinger [Thu, 14 Dec 2023 19:06:41 +0000 (12:06 -0700)]
bcachefs: fix invalid memory access in bch2_fs_alloc() error path
When bch2_fs_alloc() gets an error before calling
bch2_fs_btree_iter_init(), bch2_fs_btree_iter_exit() makes an invalid
memory access because btree_trans_list is uninitialized.
Signed-off-by: Thomas Bertschinger <tahbertschinger@gmail.com>
Fixes: 6bd68ec266ad ("bcachefs: Heap allocate btree_trans")
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Linus Torvalds [Thu, 14 Dec 2023 19:53:00 +0000 (11:53 -0800)]
Merge tag 'for-6.7-rc5-tag' of git://git./linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
"Some fixes to quota accounting code, mostly around error handling and
correctness:
- free reserves on various error paths, after IO errors or
transaction abort
- don't clear reserved range at the folio release time, it'll be
properly cleared after final write
- fix integer overflow due to int used when passing around size of
freed reservations
- fix a regression in squota accounting that missed some cases with
delayed refs"
* tag 'for-6.7-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: ensure releasing squota reserve on head refs
btrfs: don't clear qgroup reserved bit in release_folio
btrfs: free qgroup pertrans reserve on transaction abort
btrfs: fix qgroup_free_reserved_data int overflow
btrfs: free qgroup reserve when ORDERED_IOERR is set
Igor Russkikh [Wed, 13 Dec 2023 09:40:44 +0000 (10:40 +0100)]
net: atlantic: fix double free in ring reinit logic
The driver has a logic flaw in ring data allocation/free:
a double free may happen in aq_ring_free if the system is under
stress while driver init/deinit is in progress.
This is more likely to occur during a suspend/resume cycle.
Verification was done simulating same conditions with
stress -m 2000 --vm-bytes 20M --vm-hang 10 --backoff 1000
while true; do sudo ifconfig enp1s0 down; sudo ifconfig enp1s0 up; done
Fixed by explicitly clearing pointers to NULL on deallocation
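The idea of clearing pointers on deallocation can be sketched in userspace C. This is an illustration only: `struct demo_ring` and `demo_ring_free()` are hypothetical names, not the driver's own structures or functions.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a ring with one heap-allocated buffer. */
struct demo_ring {
	void *buff_ring;
};

/* Returns 1 if memory was released, 0 if there was nothing to free. */
static int demo_ring_free(struct demo_ring *ring)
{
	if (!ring->buff_ring)
		return 0;

	free(ring->buff_ring);
	ring->buff_ring = NULL;	/* a repeated call now finds nothing to free */
	return 1;
}
```

Because the pointer is reset, a second teardown pass over the same ring cannot hand the stale address to the allocator again.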
Fixes: 018423e90bee ("net: ethernet: aquantia: Add ring support code")
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Closes: https://lore.kernel.org/netdev/CAHk-=wiZZi7FcvqVSUirHBjx0bBUZ4dFrMDVLc3+3HCrtq0rBA@mail.gmail.com/
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Link: https://lore.kernel.org/r/20231213094044.22988-1-irusskikh@marvell.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Gergo Koteles [Wed, 13 Dec 2023 23:49:20 +0000 (00:49 +0100)]
ALSA: hda/tas2781: reset the amp before component_add
Calling component_add starts loading the firmware, and the callback function
writes the program to the amplifiers. If the module resets the
amplifiers after component_add, one of the amplifiers may not work
because the reset and the program writing interleave.
Call tas2781_reset before component_add to ensure reliable
initialization.
Fixes: 5be27f1e3ec9 ("ALSA: hda/tas2781: Add tas2781 HDA driver")
CC: stable@vger.kernel.org
Signed-off-by: Gergo Koteles <soyer@irl.hu>
Link: https://lore.kernel.org/r/4d23bf58558e23ee8097de01f70f1eb8d9de2d15.1702511246.git.soyer@irl.hu
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Gergo Koteles [Wed, 13 Dec 2023 23:28:16 +0000 (00:28 +0100)]
ALSA: hda/tas2781: call cleanup functions only once
If the module can load the RCA but not the firmware binary, it will call
the cleanup functions. Unloading the module then causes a general
protection fault due to a double free.
Do not call the cleanup functions in tasdev_fw_ready.
general protection fault, probably for non-canonical address 0x6f2b8a2bff4c8fec: 0000 [#1] PREEMPT SMP NOPTI
Call Trace:
<TASK>
? die_addr+0x36/0x90
? exc_general_protection+0x1c5/0x430
? asm_exc_general_protection+0x26/0x30
? tasdevice_config_info_remove+0x6d/0xd0 [snd_soc_tas2781_fmwlib]
tas2781_hda_unbind+0xaa/0x100 [snd_hda_scodec_tas2781_i2c]
component_unbind+0x2e/0x50
component_unbind_all+0x92/0xa0
component_del+0xa8/0x140
tas2781_hda_remove.isra.0+0x32/0x60 [snd_hda_scodec_tas2781_i2c]
i2c_device_remove+0x26/0xb0
Fixes: 5be27f1e3ec9 ("ALSA: hda/tas2781: Add tas2781 HDA driver")
CC: stable@vger.kernel.org
Signed-off-by: Gergo Koteles <soyer@irl.hu>
Link: https://lore.kernel.org/r/1a0885c424bb21172702d254655882b59ef6477a.1702510018.git.soyer@irl.hu
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Hyunwoo Kim [Wed, 13 Dec 2023 04:10:56 +0000 (23:10 -0500)]
appletalk: Fix Use-After-Free in atalk_ioctl
Because atalk_ioctl() accesses sk->sk_receive_queue
without holding a sk->sk_receive_queue.lock, it can
cause a race with atalk_recvmsg().
A use-after-free for skb occurs with the following flow.
```
atalk_ioctl() -> skb_peek()
atalk_recvmsg() -> skb_recv_datagram() -> skb_free_datagram()
```
Add sk->sk_receive_queue.lock to atalk_ioctl() to fix this issue.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Hyunwoo Kim <v4bel@theori.io>
Link: https://lore.kernel.org/r/20231213041056.GA519680@v4bel-B760M-AORUS-ELITE-AX
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Andrew Halaney [Tue, 12 Dec 2023 22:18:33 +0000 (16:18 -0600)]
net: stmmac: Handle disabled MDIO busses from devicetree
Many hardware configurations have the MDIO bus disabled, and are instead
using some other MDIO bus to talk to the MAC's phy.
of_mdiobus_register() returns -ENODEV in this case. Let's handle it
gracefully instead of failing to probe the MAC.
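A minimal sketch of that handling, assuming the registration result is passed in as a plain integer. `demo_mdio_probe_result()` is a hypothetical wrapper used only for illustration:

```c
#include <errno.h>

/*
 * Hypothetical wrapper: turn the result of MDIO bus registration
 * into the result reported by probe.
 */
static int demo_mdio_probe_result(int register_ret)
{
	if (register_ret == -ENODEV)
		return 0;	/* bus disabled in devicetree: carry on without it */

	return register_ret;	/* success or a real error is passed through */
}
```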
Fixes: 47dd7a540b8a ("net: add support for STMicroelectronics Ethernet controllers.")
Signed-off-by: Andrew Halaney <ahalaney@redhat.com>
Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
Link: https://lore.kernel.org/r/20231212-b4-stmmac-handle-mdio-enodev-v2-1-600171acf79f@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Louis Chauvet [Mon, 4 Dec 2023 15:49:03 +0000 (16:49 +0100)]
spi: atmel: Fix clock issue when using devices with different polarities
The current Atmel SPI controller driver (v2) behaves incorrectly when
using two SPI devices with different clock polarities and GPIO CS.
When switching from one device to another, the controller driver first
enables the CS and then applies whatever configuration suits the targeted
device (typically, the polarities). The side effect of this ordering is the
appearance of a spurious clock edge after enabling the CS when the clock
polarity needs to be inverted wrt. the previous configuration of the
controller.
This parasitic clock edge is problematic when the SPI device uses that edge
for internal processing, which is perfectly legitimate given that its CS
was asserted. Indeed, devices such as HVS8080 driven by driver gpio-sr in
the kernel are shift registers and will process this first clock edge to
perform a first register shift. In this case, the first bit gets lost and
the whole data block that will later be read by the kernel is all shifted
by one.
Current behavior:
The actual switching of the clock polarity only occurs after the CS
when the controller sends the first message:
CLK ------------\   /-\ /-\
                |   | | | |  . . .
                \---/ \-/ \
CS  -----\
         |
         \------------------
         ^      ^   ^
         |      |   |
         |      |   Actual clock of the message sent
         |      |
         |      Change of clock polarity, which occurs with the first
         |      write to the bus. This edge occurs when the CS is
         |      already asserted, and can be interpreted as
         |      the first clock edge by the receiver.
         |
         GPIO CS toggle
This issue is specific to this controller because while the SPI core
performs the operations in the right order, the controller however does
not. In practice, the controller only applies the clock configuration right
before the first transmission.
So this is not a problem when using the controller's dedicated CS, as the
controller does things correctly, but it becomes a problem when you need to
change the clock polarity and use an external GPIO for the CS.
One possible approach to solve this problem is to send a dummy message
before actually activating the CS, so that the controller applies the clock
polarity beforehand.
New behavior:
CLK ------\      /-\     /-\      /-\     /-\
          |      | | ... | |      | | ... | |
          \------/ \-   -/ \------/ \-   -/ \------
CS  -\/-----------------------\
     ||                       |
     \/                       \---------------------
     ^    ^      ^            ^   ^
     |    |      |            |   |
     |    |      |            |   Expected clock cycles when
     |    |      |            |   sending the message
     |    |      |            |
     |    |      |            Actual GPIO CS activation, occurs inside
     |    |      |            the driver
     |    |      |
     |    |      Dummy message, to trigger clock polarity
     |    |      reconfiguration. This message is not received and
     |    |      processed by the device because CS is low.
     |    |
     |    Change of clock polarity, forced by the dummy message. This
     |    time, the edge is not detected by the receiver.
     |
     This small spike in CS activation is due to the fact that the
     spi-core activates the CS gpio before calling the driver's
     set_cs callback, which deactivates this gpio again until the
     clock polarity is correct.
To avoid having to systematically send a dummy packet, the driver keeps
track of the clock's current polarity. In this way, it only sends the dummy
packet when necessary, ensuring that the clock will have the correct
polarity when the CS is toggled.
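The polarity tracking described above can be sketched as follows. This is a simplified model under stated assumptions: `struct demo_spi_ctrl` and `demo_prepare_cs()` are hypothetical, and a counter stands in for actually sending the dummy message.

```c
#include <stdbool.h>

/* Hypothetical controller state for illustration. */
struct demo_spi_ctrl {
	bool last_cpol;			/* polarity the clock line currently has */
	unsigned int dummy_sent;	/* number of dummy transfers issued */
};

/* Called before the GPIO chip select of a device is asserted. */
static void demo_prepare_cs(struct demo_spi_ctrl *ctrl, bool dev_cpol)
{
	if (ctrl->last_cpol != dev_cpol) {
		ctrl->dummy_sent++;	/* stands in for the dummy message */
		ctrl->last_cpol = dev_cpol;
	}
}
```

Selecting the same device twice in a row, or two devices with the same polarity, costs no extra transfer in this model.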
There could be two hardware problems with this patch:
1- Maybe the small CS activation peak can confuse SPI devices
2- If on a design, a single wire is used to select two devices depending
on its state, the dummy message may disturb them.
Fixes: 5ee36c989831 ("spi: atmel_spi update chipselect handling")
Cc: <stable@vger.kernel.org>
Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
Link: https://msgid.link/r/20231204154903.11607-1-louis.chauvet@bootlin.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Sneh Shah [Tue, 12 Dec 2023 09:22:08 +0000 (14:52 +0530)]
net: stmmac: dwmac-qcom-ethqos: Fix drops in 10M SGMII RX
In 10M SGMII mode all the packets are being dropped due to wrong Rx clock.
SGMII 10MBPS mode needs RX clock divider programmed to avoid drops in Rx.
Update configure SGMII function with Rx clk divider programming.
Fixes: 463120c31c58 ("net: stmmac: dwmac-qcom-ethqos: add support for SGMII")
Tested-by: Andrew Halaney <ahalaney@redhat.com>
Signed-off-by: Sneh Shah <quic_snehshah@quicinc.com>
Reviewed-by: Bjorn Andersson <quic_bjorande@quicinc.com>
Link: https://lore.kernel.org/r/20231212092208.22393-1-quic_snehshah@quicinc.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jakub Kicinski [Thu, 14 Dec 2023 06:03:01 +0000 (22:03 -0800)]
Merge branch '40GbE' of git://git./linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2023-12-12 (iavf)
This series contains updates to iavf driver only.
Piotr reworks Flow Director states to deal with issues in restoring
filters.
Slawomir fixes shutdown processing as it was missing needed calls.
* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
iavf: Fix iavf_shutdown to call iavf_remove instead iavf_close
iavf: Handle ntuple on/off based on new state machines for flow director
iavf: Introduce new state machines for flow director
====================
Link: https://lore.kernel.org/r/20231212203613.513423-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Zheng Yejian [Thu, 14 Dec 2023 01:21:53 +0000 (09:21 +0800)]
tracing: Fix uaf issue when open the hist or hist_debug file
KASAN reports the following issue. The root cause is that when opening the
'hist' file of an instance and accessing 'trace_event_file' in hist_show(),
'trace_event_file' may already have been freed because the instance was
removed. The 'hist_debug' file has the same problem. To fix it, call
tracing_{open,release}_file_tr() in the file_operations callbacks to take
a reference and keep 'trace_event_file' from being freed.
BUG: KASAN: slab-use-after-free in hist_show+0x11e0/0x1278
Read of size 8 at addr ffff242541e336b8 by task head/190
CPU: 4 PID: 190 Comm: head Not tainted 6.7.0-rc5-g26aff849438c #133
Hardware name: linux,dummy-virt (DT)
Call trace:
dump_backtrace+0x98/0xf8
show_stack+0x1c/0x30
dump_stack_lvl+0x44/0x58
print_report+0xf0/0x5a0
kasan_report+0x80/0xc0
__asan_report_load8_noabort+0x1c/0x28
hist_show+0x11e0/0x1278
seq_read_iter+0x344/0xd78
seq_read+0x128/0x1c0
vfs_read+0x198/0x6c8
ksys_read+0xf4/0x1e0
__arm64_sys_read+0x70/0xa8
invoke_syscall+0x70/0x260
el0_svc_common.constprop.0+0xb0/0x280
do_el0_svc+0x44/0x60
el0_svc+0x34/0x68
el0t_64_sync_handler+0xb8/0xc0
el0t_64_sync+0x168/0x170
Allocated by task 188:
kasan_save_stack+0x28/0x50
kasan_set_track+0x28/0x38
kasan_save_alloc_info+0x20/0x30
__kasan_slab_alloc+0x6c/0x80
kmem_cache_alloc+0x15c/0x4a8
trace_create_new_event+0x84/0x348
__trace_add_new_event+0x18/0x88
event_trace_add_tracer+0xc4/0x1a0
trace_array_create_dir+0x6c/0x100
trace_array_create+0x2e8/0x568
instance_mkdir+0x48/0x80
tracefs_syscall_mkdir+0x90/0xe8
vfs_mkdir+0x3c4/0x610
do_mkdirat+0x144/0x200
__arm64_sys_mkdirat+0x8c/0xc0
invoke_syscall+0x70/0x260
el0_svc_common.constprop.0+0xb0/0x280
do_el0_svc+0x44/0x60
el0_svc+0x34/0x68
el0t_64_sync_handler+0xb8/0xc0
el0t_64_sync+0x168/0x170
Freed by task 191:
kasan_save_stack+0x28/0x50
kasan_set_track+0x28/0x38
kasan_save_free_info+0x34/0x58
__kasan_slab_free+0xe4/0x158
kmem_cache_free+0x19c/0x508
event_file_put+0xa0/0x120
remove_event_file_dir+0x180/0x320
event_trace_del_tracer+0xb0/0x180
__remove_instance+0x224/0x508
instance_rmdir+0x44/0x78
tracefs_syscall_rmdir+0xbc/0x140
vfs_rmdir+0x1cc/0x4c8
do_rmdir+0x220/0x2b8
__arm64_sys_unlinkat+0xc0/0x100
invoke_syscall+0x70/0x260
el0_svc_common.constprop.0+0xb0/0x280
do_el0_svc+0x44/0x60
el0_svc+0x34/0x68
el0t_64_sync_handler+0xb8/0xc0
el0t_64_sync+0x168/0x170
Link: https://lore.kernel.org/linux-trace-kernel/20231214012153.676155-1-zhengyejian1@huawei.com
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Pavel Kozlov [Wed, 13 Dec 2023 15:07:10 +0000 (19:07 +0400)]
ARC: add hugetlb definitions
Add hugetlb definitions if THP is enabled. ARC doesn't support
HugeTLB FS but it supports THP. Some kernel code such as pagemap
uses hugetlb definitions with THP.
This patch fixes an ARC build issue (HPAGE_SIZE undeclared error) with
TRANSPARENT_HUGEPAGE enabled.
Signed-off-by: Pavel Kozlov <pavel.kozlov@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@kernel.org>
Nitin Rawat [Fri, 8 Dec 2023 13:13:31 +0000 (18:43 +0530)]
scsi: ufs: core: Store min and max clk freq from OPP table
OPP support added by commit 72208ebe181e ("scsi: ufs: core: Add support for
parsing OPP") doesn't update the min_freq and max_freq of each clock in
'struct ufs_clk_info'.
But these values are used by the host drivers internally for controller
configuration. When the OPP support is enabled in devicetree, these values
will be 0, causing boot issues on the respective platforms.
So add support to parse the min_freq and max_freq of all clocks while
parsing the OPP table.
Fixes: 72208ebe181e ("scsi: ufs: core: Add support for parsing OPP")
Co-developed-by: Manish Pandey <quic_mapa@quicinc.com>
Signed-off-by: Manish Pandey <quic_mapa@quicinc.com>
Signed-off-by: Nitin Rawat <quic_nitirawa@quicinc.com>
Link: https://lore.kernel.org/r/20231208131331.12596-1-quic_nitirawa@quicinc.com
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Jakub Kicinski [Thu, 14 Dec 2023 02:38:56 +0000 (18:38 -0800)]
Merge branch 'dpaa2-switch-various-fixes'
Ioana Ciornei says:
====================
dpaa2-switch: various fixes
The first patch fixes the size passed to two dma_unmap_single() calls
which was wrongly put as the size of the pointer.
The second patch is new to this series and reverts the behavior of the
dpaa2-switch driver to not ask for object replay upon offloading so that
we avoid the errors encountered when a VLAN is installed multiple times
on the same port.
====================
Link: https://lore.kernel.org/r/20231212164326.2753457-1-ioana.ciornei@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ioana Ciornei [Tue, 12 Dec 2023 16:43:26 +0000 (18:43 +0200)]
dpaa2-switch: do not ask for MDB, VLAN and FDB replay
Starting with commit 4e51bf44a03a ("net: bridge: move the switchdev
object replay helpers to "push" mode") the switchdev_bridge_port_offload()
helper was extended with the intention to provide switchdev drivers easy
access to object addition and deletion replays. This works by calling
the replay helpers with non-NULL notifier blocks.
In the same commit, the dpaa2-switch driver was updated so that it
passes valid notifier blocks to the helper. At that moment, no
regression was identified through testing.
In the meantime, the blamed commit changed the behavior in terms of
which ports get hit by the replay. Before this commit, only the initial
port which identified itself as offloaded through
switchdev_bridge_port_offload() got a replay of all port objects and
FDBs. After this, the newly joining port will trigger a replay of
objects on all bridge ports and on the bridge itself.
This behavior leads to errors in dpaa2_switch_port_vlans_add() when a
VLAN gets installed on the same interface multiple times.
The intended mechanism to address this is to pass a non-NULL ctx to the
switchdev_bridge_port_offload() helper and then check it against the
port's private structure. But since the driver does not have any use for
the replayed port objects and FDBs until it gains support for LAG
offload, it's better to fix the issue by reverting the dpaa2-switch
driver to not ask for replay. The pointers will be added back when we
are prepared to ignore replays on unrelated ports.
Fixes: b28d580e2939 ("net: bridge: switchdev: replay all VLAN groups")
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Link: https://lore.kernel.org/r/20231212164326.2753457-3-ioana.ciornei@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ioana Ciornei [Tue, 12 Dec 2023 16:43:25 +0000 (18:43 +0200)]
dpaa2-switch: fix size of the dma_unmap
The size of the DMA unmap was wrongly put as a sizeof of a pointer.
Change the value of the DMA unmap to be the actual macro used for the
allocation and the DMA map.
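The difference between the size of a pointer and the size of the buffer it points to can be shown in plain C. In this sketch `DEMO_BUF_SIZE` and both helper functions are hypothetical and only model which length would be handed to the unmap call:

```c
#include <stddef.h>

#define DEMO_BUF_SIZE 256	/* hypothetical allocation size in bytes */

/* Length a buggy unmap would pass: the size of the pointer itself. */
static size_t demo_unmap_size_buggy(const unsigned char *buf)
{
	return sizeof(buf);
}

/* Length a correct unmap passes: the constant used for the mapping. */
static size_t demo_unmap_size_fixed(const unsigned char *buf)
{
	(void)buf;
	return DEMO_BUF_SIZE;
}
```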
Fixes: 1110318d83e8 ("dpaa2-switch: add tc flower hardware offload on ingress traffic")
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Link: https://lore.kernel.org/r/20231212164326.2753457-2-ioana.ciornei@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Tue, 12 Dec 2023 16:46:21 +0000 (16:46 +0000)]
net: prevent mss overflow in skb_segment()
Once again syzbot is able to crash the kernel in skb_segment() [1]
GSO_BY_FRAGS is a forbidden value, but unfortunately the following
computation in skb_segment() can reach it quite easily :
mss = mss * partial_segs;
65535 = 3 * 5 * 17 * 257, so many initial values of mss can lead to
a bad final result.
Make sure to limit segmentation so that the new mss value is smaller
than GSO_BY_FRAGS.
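A worked example of the arithmetic, as a sketch rather than the kernel code itself: GSO_BY_FRAGS is 0xFFFF, and the helper names below are hypothetical. Capping the length before dividing is one way to keep the product below the reserved value. Both helpers assume a non-zero mss.

```c
#include <stdint.h>

#define DEMO_GSO_BY_FRAGS 0xFFFFu	/* reserved marker value, 65535 */

/* mss * partial_segs with no limit on the number of segments. */
static uint32_t demo_mss_uncapped(uint32_t mss, uint32_t len)
{
	return mss * (len / mss);
}

/* Same computation, with len capped so the result stays below 65535. */
static uint32_t demo_mss_capped(uint32_t mss, uint32_t len)
{
	uint32_t limit = DEMO_GSO_BY_FRAGS - 1u;
	uint32_t capped_len = len < limit ? len : limit;

	return mss * (capped_len / mss);
}
```

For mss = 255 and len = 65600 the uncapped version yields 255 * 257 = 65535, exactly the forbidden value, while the capped version yields 255 * 256 = 65280.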
[1]
general protection fault, probably for non-canonical address 0xdffffc000000000e: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000070-0x0000000000000077]
CPU: 1 PID: 5079 Comm: syz-executor993 Not tainted 6.7.0-rc4-syzkaller-00141-g1ae4cd3cbdd0 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/10/2023
RIP: 0010:skb_segment+0x181d/0x3f30 net/core/skbuff.c:4551
Code: 83 e3 02 e9 fb ed ff ff e8 90 68 1c f9 48 8b 84 24 f8 00 00 00 48 8d 78 70 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 08 3c 03 0f 8e 8a 21 00 00 48 8b 84 24 f8 00
RSP: 0018:ffffc900043473d0 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: 0000000000010046 RCX: ffffffff886b1597
RDX: 000000000000000e RSI: ffffffff886b2520 RDI: 0000000000000070
RBP: ffffc90004347578 R08: 0000000000000005 R09: 000000000000ffff
R10: 000000000000ffff R11: 0000000000000002 R12: ffff888063202ac0
R13: 0000000000010000 R14: 000000000000ffff R15: 0000000000000046
FS: 0000555556e7e380(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020010000 CR3: 0000000027ee2000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
udp6_ufo_fragment+0xa0e/0xd00 net/ipv6/udp_offload.c:109
ipv6_gso_segment+0x534/0x17e0 net/ipv6/ip6_offload.c:120
skb_mac_gso_segment+0x290/0x610 net/core/gso.c:53
__skb_gso_segment+0x339/0x710 net/core/gso.c:124
skb_gso_segment include/net/gso.h:83 [inline]
validate_xmit_skb+0x36c/0xeb0 net/core/dev.c:3626
__dev_queue_xmit+0x6f3/0x3d60 net/core/dev.c:4338
dev_queue_xmit include/linux/netdevice.h:3134 [inline]
packet_xmit+0x257/0x380 net/packet/af_packet.c:276
packet_snd net/packet/af_packet.c:3087 [inline]
packet_sendmsg+0x24c6/0x5220 net/packet/af_packet.c:3119
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg+0xd5/0x180 net/socket.c:745
__sys_sendto+0x255/0x340 net/socket.c:2190
__do_sys_sendto net/socket.c:2202 [inline]
__se_sys_sendto net/socket.c:2198 [inline]
__x64_sys_sendto+0xe0/0x1b0 net/socket.c:2198
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0x40/0x110 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x63/0x6b
RIP: 0033:0x7f8692032aa9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 d1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fff8d685418 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f8692032aa9
RDX: 0000000000010048 RSI: 00000000200000c0 RDI: 0000000000000003
RBP: 00000000000f4240 R08: 0000000020000540 R09: 0000000000000014
R10: 0000000000000000 R11: 0000000000000246 R12: 00007fff8d685480
R13: 0000000000000001 R14: 00007fff8d685480 R15: 0000000000000003
</TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:skb_segment+0x181d/0x3f30 net/core/skbuff.c:4551
Code: 83 e3 02 e9 fb ed ff ff e8 90 68 1c f9 48 8b 84 24 f8 00 00 00 48 8d 78 70 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 08 3c 03 0f 8e 8a 21 00 00 48 8b 84 24 f8 00
RSP: 0018:ffffc900043473d0 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: 0000000000010046 RCX: ffffffff886b1597
RDX: 000000000000000e RSI: ffffffff886b2520 RDI: 0000000000000070
RBP: ffffc90004347578 R08: 0000000000000005 R09: 000000000000ffff
R10: 000000000000ffff R11: 0000000000000002 R12: ffff888063202ac0
R13: 0000000000010000 R14: 000000000000ffff R15: 0000000000000046
FS: 0000555556e7e380(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020010000 CR3: 0000000027ee2000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Fixes: 3953c46c3ac7 ("sk_buff: allow segmenting based on frag sizes")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20231212164621.4131800-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Nikolay Kuratov [Mon, 11 Dec 2023 16:23:17 +0000 (19:23 +0300)]
vsock/virtio: Fix unsigned integer wrap around in virtio_transport_has_space()
We need to do signed arithmetic if we expect condition
`if (bytes < 0)` to be possible
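The wrap-around can be reproduced in userspace. In this sketch the function and parameter names are hypothetical, and the three counters are modelled as plain 32-bit values:

```c
#include <stdint.h>

/* Unsigned arithmetic: a negative result wraps to a huge positive value. */
static uint32_t demo_space_unsigned(uint32_t buf_alloc, uint32_t tx_cnt,
				    uint32_t fwd_cnt)
{
	uint32_t bytes = buf_alloc - (tx_cnt - fwd_cnt);

	/* a "bytes < 0" test here could never be true */
	return bytes;
}

/* Signed arithmetic: the negative case is visible and can be clamped. */
static uint32_t demo_space_signed(uint32_t buf_alloc, uint32_t tx_cnt,
				  uint32_t fwd_cnt)
{
	int64_t bytes = (int64_t)buf_alloc - (tx_cnt - fwd_cnt);

	if (bytes < 0)
		bytes = 0;

	return (uint32_t)bytes;
}
```

With 10 bytes advertised and 20 bytes in flight, the unsigned version reports roughly 4 GiB of free space, while the signed version reports none.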
Found by Linux Verification Center (linuxtesting.org) with SVACE
Fixes: 06a8fc78367d ("VSOCK: Introduce virtio_vsock_common.ko")
Signed-off-by: Nikolay Kuratov <kniv@yandex-team.ru>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20231211162317.4116625-1-kniv@yandex-team.ru
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stephen Boyd [Wed, 13 Dec 2023 23:26:24 +0000 (15:26 -0800)]
Merge tag 'v6.7-rockchip-clkfixes1' of git://git./linux/kernel/git/mmind/linux-rockchip into clk-fixes
Pull Rockchip clk driver fixes for the merge window from Heiko Stuebner:
Fixes for a wrong clockname, a wrong clock-parent, a wrong clock-gate
and finally one new PLL rate for the rk3568 to fix display artifacts
on handheld devices based on that SoC.
* tag 'v6.7-rockchip-clkfixes1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip:
clk: rockchip: rk3128: Fix SCLK_SDMMC's clock name
clk: rockchip: rk3128: Fix aclk_peri_src's parent
clk: rockchip: rk3128: Fix HCLK_OTG gate register
clk: rockchip: rk3568: Add PLL rate for 292.5MHz
Christian König [Mon, 4 Dec 2023 14:51:50 +0000 (15:51 +0100)]
drm/amdgpu: warn when there are still mappings when a BO is destroyed v2
This can only happen when there is a reference counting bug.
v2: fix typo
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Christian König [Fri, 8 Dec 2023 12:43:09 +0000 (13:43 +0100)]
drm/amdgpu: fix tear down order in amdgpu_vm_pt_free
When freeing PD/PT with shadows it can happen that the shadow
destruction races with detaching the PD/PT from the VM causing a NULL
pointer dereference in the invalidation code.
Fix this by detaching the PD/PT from the VM first and then
freeing the shadow instead.
Signed-off-by: Christian König <christian.koenig@amd.com>
Fixes: https://gitlab.freedesktop.org/drm/amd/-/issues/2867
Cc: <stable@vger.kernel.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Mario Limonciello [Tue, 12 Dec 2023 07:09:16 +0000 (01:09 -0600)]
drm/amd: Fix a probing order problem on SDMA 2.4
commit 751e293f2c99 ("drm/amd: Move microcode init from sw_init to
early_init for SDMA v2.4") made a fateful mistake in that
`adev->sdma.num_instances` wasn't declared when sdma_v2_4_init_microcode()
was run. This caused probing to fail.
Move the declaration to right before sdma_v2_4_init_microcode().
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3043
Fixes: 751e293f2c99 ("drm/amd: Move microcode init from sw_init to early_init for SDMA v2.4")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Thu, 7 Dec 2023 15:14:41 +0000 (10:14 -0500)]
drm/amdgpu/sdma5.2: add begin/end_use ring callbacks
Add begin/end_use ring callbacks to disallow GFXOFF when
SDMA work is submitted and allow it again afterward.
This should avoid corner cases where GFXOFF is erroneously
entered when SDMA is still active. For now just allow/disallow
GFXOFF in the begin and end helpers until we root cause the
issue. This should not impact power as SDMA usage is pretty
minimal and GFXOFF should not be active when SDMA is active
anyway; this just makes it explicit.
v2: move everything into sdma5.2 code. No reason for this
to be generic at this point.
v3: Add comments in new code
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2220
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> (v1)
Tested-by: Mario Limonciello <mario.limonciello@amd.com> (v1)
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 5.15+
Yusong Gao [Wed, 13 Dec 2023 10:31:10 +0000 (10:31 +0000)]
sign-file: Fix incorrect return values check
Some return value checks in sign-file are wrong when calling the OpenSSL
API. The ERR() condition is wrong because the program only checks whether
the return value is < 0, which ignores a return value of 0. For example:
1. CMS_final() returns 1 for success or 0 for failure.
2. i2d_CMS_bio_stream() returns 1 for success or 0 for failure.
3. i2d_TYPEbio() returns 1 for success and 0 for failure.
4. BIO_free() returns 1 for success and 0 for failure.
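The effect of the two kinds of check can be compared without OpenSSL. The predicates below are hypothetical and only model the condition applied to a return value. Treating anything other than 1 as failure is one possible stricter check, shown here as an assumption rather than as the exact change made to sign-file:

```c
/* Old-style check: only negative values count as failure. */
static int demo_failed_old(int ret)
{
	return ret < 0;
}

/* Stricter check: anything other than the documented success value 1 fails. */
static int demo_failed_strict(int ret)
{
	return ret != 1;
}
```

A return value of 0 passes the old check unnoticed and is caught by the stricter one.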
Link: https://www.openssl.org/docs/manmaster/man3/
Fixes: e5a2e3c84782 ("scripts/sign-file.c: Add support for signing with a raw signature")
Signed-off-by: Yusong Gao <a869920004@gmail.com>
Reviewed-by: Juerg Haefliger <juerg.haefliger@canonical.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20231213024405.624692-1-a869920004@gmail.com/
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Wed, 13 Dec 2023 19:09:58 +0000 (11:09 -0800)]
Merge tag 'pull-fixes' of git://git./linux/kernel/git/viro/vfs
Pull ufs fix from Al Viro:
"ufs got broken this merge window on folio conversion - calling
conventions for filemap_lock_folio() are not the same as for
find_lock_page()"
* tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fix ufs_get_locked_folio() breakage
Jakub Kicinski [Wed, 13 Dec 2023 18:56:29 +0000 (10:56 -0800)]
Revert "tcp: disable tcp_autocorking for socket when TCP_NODELAY flag is set"
This reverts commit f3f32a356c0d2379d4431364e74f101f8f075ce3.
Paolo reports that the change disables autocorking even after
the userspace sets TCP_CORK.
Fixes: f3f32a356c0d ("tcp: disable tcp_autocorking for socket when TCP_NODELAY flag is set")
Link: https://lore.kernel.org/r/0d30d5a41d3ac990573016308aaeacb40a9dc79f.camel@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Linus Torvalds [Wed, 13 Dec 2023 18:54:50 +0000 (10:54 -0800)]
Merge tag 'efi-urgent-for-v6.7-2' of git://git./linux/kernel/git/efi/efi
Pull EFI fixes from Ard Biesheuvel:
- Deal with a regression in the recently refactored x86 EFI stub code
on older Dell systems by disabling randomization of the physical load
address
- Use the correct load address for relocatable Loongarch kernels
* tag 'efi-urgent-for-v6.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
efi/x86: Avoid physical KASLR on older Dell systems
efi/loongarch: Use load address to calculate kernel entry address
Jan Kara [Wed, 13 Dec 2023 16:51:04 +0000 (17:51 +0100)]
bcachefs: Fix determining required file handle length
The ->encode_fh method is responsible for setting the amount of space
required for storing the file handle if not enough space was provided.
bch2_encode_fh() was not setting the required length in that case, which
breaks e.g. fanotify. Fix it.
Reported-by: Petr Vorel <pvorel@suse.cz>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
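The contract described above can be sketched in userspace C. This is an illustration only, not the bcachefs code: every `sketch_`/`SKETCH_` name, the handle layout and the handle type value are assumptions made for the example. It follows the kernel convention that the length is counted in 32-bit words and that 0xff (FILEID_INVALID in the kernel) signals failure.

```c
#include <stdint.h>

/* Sentinel meaning "could not encode" (FILEID_INVALID in the kernel). */
#define SKETCH_FILEID_INVALID 0xff
/* Hypothetical handle type and handle size in 32-bit words. */
#define SKETCH_FILEID_TYPE    0x01
#define SKETCH_FH_WORDS       3

/*
 * Encode a hypothetical handle: 64-bit inode number plus 32-bit generation.
 * *max_len is the buffer capacity on entry and the required (or used)
 * size on return, both in 32-bit words.
 */
static int sketch_encode_fh(uint64_t ino, uint32_t gen,
                            uint32_t *fh, int *max_len)
{
	if (*max_len < SKETCH_FH_WORDS) {
		/* The step the commit adds: tell the caller how much is needed. */
		*max_len = SKETCH_FH_WORDS;
		return SKETCH_FILEID_INVALID;
	}

	fh[0] = (uint32_t)(ino & 0xffffffffu);
	fh[1] = (uint32_t)(ino >> 32);
	fh[2] = gen;
	*max_len = SKETCH_FH_WORDS;
	return SKETCH_FILEID_TYPE;
}
```

A caller can probe with a length of 0, read the required size back from `*max_len`, and retry with a large enough buffer. Without the assignment in the failure branch the caller has no way to learn that size.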
Farouk Bouabid [Wed, 13 Dec 2023 14:50:45 +0000 (15:50 +0100)]
drm/panel: ltk050h3146w: Set burst mode for ltk050h3148w
The ltk050h3148w variant expects the horizontal component lane byte clock
cycle (lbcc) to be calculated using lane_mbps (burst mode) instead of the
pixel clock.
Using the pixel clock rate by default for this calculation was introduced
in commit ac87d23694f4 ("drm/bridge: synopsys: dw-mipi-dsi: Use pixel clock
rate to calculate lbcc"), and starting from commit 93e82bb4de01
("drm/bridge: synopsys: dw-mipi-dsi: Fix hcomponent lbcc for burst mode")
only panels that support burst mode can keep using lane_mbps. So add
MIPI_DSI_MODE_VIDEO_BURST to the mode_flags for the DSI host.
Fixes: 93e82bb4de01 ("drm/bridge: synopsys: dw-mipi-dsi: Fix hcomponent lbcc for burst mode")
Signed-off-by: Farouk Bouabid <farouk.bouabid@theobroma-systems.com>
Reviewed-by: Jessica Zhang <quic_jesszhan@quicinc.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20231213145045.41020-1-farouk.bouabid@theobroma-systems.com
Al Viro [Wed, 13 Dec 2023 16:14:09 +0000 (11:14 -0500)]
fix ufs_get_locked_folio() breakage
filemap_lock_folio() returns ERR_PTR(-ENOENT) if the thing is not
in cache - not NULL like find_lock_page() used to.
Fixes: 5fb7bd50b351 ("ufs: add ufs_get_locked_folio and ufs_put_locked_folio")
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
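The difference in calling conventions can be illustrated with a small userspace sketch. The helpers below are stand-ins modelled on the kernel's ERR_PTR()/IS_ERR()/PTR_ERR(); the `sketch_` names, the folio struct and the lookup function are hypothetical and exist only for this example.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's error-pointer helpers. */
#define SKETCH_MAX_ERRNO 4095

static inline void *sketch_err_ptr(long err) { return (void *)err; }
static inline long sketch_ptr_err(const void *p) { return (long)p; }
static inline int sketch_is_err(const void *p)
{
	/* Error pointers occupy the top SKETCH_MAX_ERRNO addresses. */
	return (uintptr_t)p >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

struct sketch_folio { int index; };

/* Hypothetical cache lookup: a miss is an error pointer, never NULL. */
static struct sketch_folio *sketch_lock_folio(struct sketch_folio *cache,
                                              int n, int index)
{
	for (int i = 0; i < n; i++)
		if (cache[i].index == index)
			return &cache[i];
	return sketch_err_ptr(-ENOENT);
}

/* Check written for the old convention: only a NULL counts as a miss. */
static int sketch_found_old_style(const struct sketch_folio *f)
{
	return f != NULL;
}

/* Check written for the new convention. */
static int sketch_found_new_style(const struct sketch_folio *f)
{
	return !sketch_is_err(f);
}
```

With the old-style check, a miss is treated as a valid folio and the caller goes on to dereference an error pointer, which is the kind of breakage the fix addresses.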
Jens Axboe [Wed, 13 Dec 2023 15:58:15 +0000 (08:58 -0700)]
io_uring/poll: don't enable lazy wake for POLLEXCLUSIVE
There are a few quirks around using lazy wake for poll unconditionally,
and one of them is related to EPOLLEXCLUSIVE. Exclusive poll waits may
trigger exclusive wakeups, which wake a limited number of entries in the
wait queue. If that wake number is less than the number of entries someone
is waiting for (and that someone is also using DEFER_TASKRUN), then we can
get stuck waiting for more entries while we should be processing the ones
we already got.
If we're doing exclusive poll waits, flag the request as not being
compatible with lazy wakeups.
Reported-by: Pavel Begunkov <asml.silence@gmail.com>
Fixes: 6ce4a93dbb5b ("io_uring/poll: use IOU_F_TWQ_LAZY_WAKE for wakeups")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
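The idea of flagging a request as incompatible with lazy wakeups can be sketched as follows. This is a simplified model, not the io_uring code: the struct, the function names and the SKETCH_REQ_NO_LAZY_WAKE flag are hypothetical. The only value taken from a real header is the EPOLLEXCLUSIVE bit, which is (1u << 28) in <sys/epoll.h>.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_EPOLLIN          0x001u      /* same value as EPOLLIN */
#define SKETCH_EPOLLEXCLUSIVE   (1u << 28)  /* same bit as EPOLLEXCLUSIVE */
#define SKETCH_REQ_NO_LAZY_WAKE (1u << 0)   /* hypothetical request flag */

struct sketch_poll_req {
	uint32_t events;  /* poll mask requested by the submitter */
	uint32_t flags;   /* per-request state flags */
};

/* Arm a poll request; exclusive waits opt out of lazy wakeups. */
static void sketch_arm_poll(struct sketch_poll_req *req, uint32_t events)
{
	req->events = events;
	req->flags = 0;
	if (events & SKETCH_EPOLLEXCLUSIVE)
		req->flags |= SKETCH_REQ_NO_LAZY_WAKE;
}

/* Wakeup path: lazy wake is only allowed when the request permits it. */
static bool sketch_use_lazy_wake(const struct sketch_poll_req *req)
{
	return !(req->flags & SKETCH_REQ_NO_LAZY_WAKE);
}
```

Deciding at arm time keeps the wakeup path to a single flag test, and a waiter that may only receive a limited number of wakeups is never left batching for completions that will not arrive.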
Michael Ellerman [Tue, 5 Dec 2023 05:11:05 +0000 (16:11 +1100)]
MAINTAINERS: powerpc: Add Aneesh & Naveen
Aneesh and Naveen are helping out with some aspects of upstream
maintenance; add them as reviewers.
Acked-by: "Aneesh Kumar K.V (IBM)" <aneesh.kumar@kernel.org>
Acked-by: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231205051105.736470-1-mpe@ellerman.id.au
Haren Myneni [Sat, 25 Nov 2023 23:51:04 +0000 (15:51 -0800)]
powerpc/pseries/vas: Migration suspend waits for no in-progress open windows
The hypervisor returns migration failure unless all VAS windows are
closed. During the pre-migration stage, vas_migration_handler() sets
the migration_in_progress flag and closes all windows on the list.
The allocate VAS window routine checks the migration flag, sets up
the window and then adds it to the list. So there is a possibility of
the migration handler missing a window that is still in the
process of being set up.
t1: Allocate and open VAS              t2: Migration event
    window
lock vas_pseries_mutex
If migration_in_progress set
  unlock vas_pseries_mutex
  return
open window HCALL
unlock vas_pseries_mutex
Modify window HCALL                    lock vas_pseries_mutex
setup window                           migration_in_progress=true
                                       Closes all windows from the list
                                       // May miss windows that are
                                       // not in the list
                                       unlock vas_pseries_mutex
lock vas_pseries_mutex                 return
if nr_closed_windows == 0
  // No DLPAR CPU or migration
  add window to the list
  // Window will be added to the
  // list after the setup is completed
  unlock vas_pseries_mutex
  return
unlock vas_pseries_mutex
Close VAS window
// due to DLPAR CPU or migration
return -EBUSY
This patch resolves the issue with the following steps:
- Set the migration_in_progress flag without holding the mutex.
- Introduce the nr_open_wins_progress counter in the VAS capabilities
struct.
- This counter tracks the number of window opens that are still in
progress.
- The allocate/setup window thread closes the window if the migration
flag is set and decrements the nr_open_wins_progress counter.
- The migration handler waits until there are no in-progress open windows.
The code flow with the fix is as follows:
t1: Allocate and open VAS              t2: Migration event
    window
lock vas_pseries_mutex
If migration_in_progress set
  unlock vas_pseries_mutex
  return
open window HCALL
nr_open_wins_progress++
// Window opened, but not
// added to the list yet
unlock vas_pseries_mutex
Modify window HCALL                    migration_in_progress=true
setup window                           lock vas_pseries_mutex
                                       Closes all windows from the list
                                       While nr_open_wins_progress {
                                         unlock vas_pseries_mutex
lock vas_pseries_mutex                   sleep
if nr_closed_windows == 0                // Wait if any open window in
or migration is not started              // progress. The open window
  // No DLPAR CPU or migration           // thread closes the window without
  add window to the list                 // adding to the list and return if
  nr_open_wins_progress--                // the migration is in progress.
  unlock vas_pseries_mutex
  return
Close VAS window
nr_open_wins_progress--
unlock vas_pseries_mutex
return -EBUSY                            lock vas_pseries_mutex
                                       }
                                       unlock vas_pseries_mutex
                                       return
Fixes: 37e6764895ef ("powerpc/pseries/vas: Add VAS migration handler")
Signed-off-by: Haren Myneni <haren@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231125235104.3405008-1-haren@linux.ibm.com
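The bookkeeping in the fixed flow can be modelled with a short single-threaded C sketch. It is not the pseries VAS code: the struct and the `sketch_` functions are hypothetical, locking is reduced to comments marking where vas_pseries_mutex would be held, and the DLPAR (nr_closed_windows) condition is left out for brevity.

```c
#include <errno.h>
#include <stdbool.h>

struct sketch_vas_caps {
	bool migration_in_progress;
	int nr_open_wins_progress;  /* opened via HCALL, not yet on the list */
	int nr_listed_windows;      /* windows the migration handler can see */
};

/* t1, first critical section (mutex held): refuse or count the open. */
static int sketch_open_begin(struct sketch_vas_caps *c)
{
	if (c->migration_in_progress)
		return -EBUSY;
	c->nr_open_wins_progress++;  /* opened, but not listed yet */
	return 0;
}

/* t1, second critical section (mutex re-taken after window setup). */
static int sketch_open_finish(struct sketch_vas_caps *c)
{
	int ret = 0;

	if (!c->migration_in_progress)
		c->nr_listed_windows++;  /* publish the window */
	else
		ret = -EBUSY;            /* close it instead of listing it */
	c->nr_open_wins_progress--;
	return ret;
}

/*
 * t2: set the flag, close everything on the list, then report whether
 * the handler still has to sleep and re-check for in-progress opens.
 */
static bool sketch_migration_must_wait(struct sketch_vas_caps *c)
{
	c->migration_in_progress = true;  /* set without holding the mutex */
	c->nr_listed_windows = 0;         /* mutex held: close listed windows */
	return c->nr_open_wins_progress > 0;
}
```

In this model the handler keeps waiting while the counter is non-zero, so a window that was opened before the flag was set is either listed and closed by the handler, or closed by the opening thread itself. Neither path leaves an open window behind when migration proceeds.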
David S. Miller [Wed, 13 Dec 2023 10:57:01 +0000 (10:57 +0000)]
Merge branch 'stmmac-bug-fixes'
Yanteng Si says:
====================
stmmac: Some bug fixes
* Put Krzysztof's patch into my thread, pick Conor's Reviewed-by
tag and Jiaxun's Acked-by tag. (The previous version was an RFC patch.)
* I fixed an Oops related to mdio, mainly to ensure that
mdio is initialized before use, because it will be used
in a series of patches I am working on.
see <https://lore.kernel.org/loongarch/cover.1699533745.git.siyanteng@loongson.cn/T/#t>
====================
Signed-off-by: David S. Miller <davem@davemloft.net>