Rob Herring [Mon, 11 Mar 2024 22:26:04 +0000 (16:26 -0600)]
dt-bindings: i2c: qcom,i2c-cci: Fix OV7251 'data-lanes' entries
The OV7251 sensor only has a single data lane, so 2 entries is not valid.
Fix this to be 1 entry as the schema specifies.
The schema validation doesn't catch this currently due to some limitations
in handling of arrays vs. matrices, but a fix is being worked on.
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Chris Packham [Tue, 12 Mar 2024 20:14:00 +0000 (09:14 +1300)]
i2c: muxes: pca954x: Allow sharing reset GPIO
Some hardware designs with multiple PCA954x devices use a reset GPIO
connected to all the muxes. Support this configuration by making use of
the reset controller framework which can deal with the shared reset
GPIOs. Fall back to the old GPIO descriptor method if the reset
controller framework is not enabled.
Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Acked-by: Peter Rosin <peda@axentia.se>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Wolfram Sang [Wed, 20 Mar 2024 08:28:51 +0000 (09:28 +0100)]
Merge tag 'i2c-host-6.9-part2' of git://git./linux/kernel/git/andi.shyti/linux into i2c/for-mergewindow
Théo adds support for the Mobileye EyeQ5-I2C in the bindings.
This patch is followed by eight commits featuring improvements to
the Nomadik controller, such as simplification of the IRQ logic,
renaming of the private data structure, more efficient use of
FIELD_PREP/GET, GENMASK, etc., better time measurement with
ktime, and more.
Linus Torvalds [Wed, 20 Mar 2024 00:27:25 +0000 (17:27 -0700)]
Merge tag 'bcachefs-2024-03-19' of https://evilpiepirate.org/git/bcachefs
Pull bcachefs fixes from Kent Overstreet:
"Assorted bugfixes.
Most are fixes for simple assertion pops; the most significant fix is
for a deadlock in recovery when we have to rewrite large numbers of
btree nodes to fix errors. This was incorrectly running out of the
same workqueue as the core interior btree update path - we now give it
its own single threaded workqueue.
This was visible to users as "bch2_btree_update_start(): error:
BCH_ERR_journal_reclaim_would_deadlock" - and then recovery hanging"
* tag 'bcachefs-2024-03-19' of https://evilpiepirate.org/git/bcachefs:
bcachefs: Fix lost wakeup on journal shutdown
bcachefs; Fix deadlock in bch2_btree_update_start()
bcachefs: ratelimit errors from async_btree_node_rewrite
bcachefs: Run check_topology() first
bcachefs: Improve bch2_fatal_error()
bcachefs: Fix lost transaction restart error
bcachefs: Don't corrupt journal keys gap buffer when dropping alloc info
bcachefs: fix for building in userspace
bcachefs: bch2_snapshot_is_ancestor() now safe to call in early recovery
bcachefs: Fix nested transaction restart handling in bch2_bucket_gens_init()
bcachefs: Improve sysfs internal/btree_updates
bcachefs: Split out btree_node_rewrite_worker
bcachefs: Fix locking in bch2_alloc_write_key()
bcachefs: Avoid extent entry type assertions in .invalid()
bcachefs: Fix spurious -BCH_ERR_transaction_restart_nested
bcachefs: Fix check_key_has_snapshot() call
bcachefs: Change "accounting overran journal reservation" to a warning
Linus Torvalds [Tue, 19 Mar 2024 18:57:26 +0000 (11:57 -0700)]
Merge tag 'soc-late-6.9' of git://git./linux/kernel/git/soc/soc
Pull more ARM SoC updates from Arnd Bergmann:
"These are changes that for some reason ended up not making it into the
first four branches but that should still make it into 6.9:
- A rework of the omap clock support that touches both drivers and
device tree files
- The reset controller branch changes that had a dependency on late
bugfixes. Merging them here avoids a backmerge of 6.8-rc5 into the
drivers branch
- The RISC-V/starfive, RISC-V/microchip and ARM/Broadcom devicetree
changes that got delayed and needed some extra time in linux-next
for wider testing"
* tag 'soc-late-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (31 commits)
soc: fsl: dpio: fix kcalloc() argument order
bus: ts-nbus: Improve error reporting
bus: ts-nbus: Convert to atomic pwm API
riscv: dts: starfive: jh7110: Add camera subsystem nodes
ARM: bcm: stop selecing CONFIG_TICK_ONESHOT
ARM: dts: omap3: Update clksel clocks to use reg instead of ti,bit-shift
ARM: dts: am3: Update clksel clocks to use reg instead of ti,bit-shift
clk: ti: Improve clksel clock bit parsing for reg property
clk: ti: Handle possible address in the node name
dt-bindings: pwm: opencores: Add compatible for StarFive JH8100
dt-bindings: riscv: cpus: reg matches hart ID
reset: Instantiate reset GPIO controller for shared reset-gpios
reset: gpio: Add GPIO-based reset controller
cpufreq: do not open-code of_phandle_args_equal()
of: Add of_phandle_args_equal() helper
reset: simple: add support for Sophgo SG2042
dt-bindings: reset: sophgo: support SG2042
riscv: dts: microchip: add specific compatible for mpfs pdma
riscv: dts: microchip: add missing CAN bus clocks
ARM: brcmstb: Add debug UART entry for 74165
...
Linus Torvalds [Tue, 19 Mar 2024 18:38:27 +0000 (11:38 -0700)]
Merge tag 's390-6.9-2' of git://git./linux/kernel/git/s390/linux
Pull more s390 updates from Heiko Carstens:
- Various virtual vs physical address usage fixes
- Add new bitwise types and helper functions and use them in s390
specific drivers and code to make it easier to find virtual vs
physical address usage bugs.
Right now virtual and physical addresses are identical for s390,
except for module, vmalloc, and similar areas. This will be changed,
hopefully with the next merge window, so that e.g. the kernel image
and modules will be located close to each other, allowing for direct
branches and also for some other simplifications.
As a prerequisite this requires to fix all misuses of virtual and
physical addresses. As it turned out people are so used to the
concept that virtual and physical addresses are the same, that new
bugs got added to code which was already fixed. In order to avoid
that even more code gets merged which adds such bugs add and use new
bitwise types, so that sparse can be used to find such usage bugs.
Most likely the new types can go away again after some time
- Provide a simple ARCH_HAS_DEBUG_VIRTUAL implementation
- Fix kprobe branch handling: if an out-of-line single stepped relative
branch instruction has a target address within a certain address area
in the entry code, the program check handler may incorrectly execute
cleanup code as if KVM code was executed, leading to crashes
- Fix reference counting of zcrypt card objects
- Various other small fixes and cleanups
* tag 's390-6.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (41 commits)
s390/entry: compare gmap asce to determine guest/host fault
s390/entry: remove OUTSIDE macro
s390/entry: add CIF_SIE flag and remove sie64a() address check
s390/cio: use while (i--) pattern to clean up
s390/raw3270: make class3270 constant
s390/raw3270: improve raw3270_init() readability
s390/tape: make tape_class constant
s390/vmlogrdr: make vmlogrdr_class constant
s390/vmur: make vmur_class constant
s390/zcrypt: make zcrypt_class constant
s390/mm: provide simple ARCH_HAS_DEBUG_VIRTUAL support
s390/vfio_ccw_cp: use new address translation helpers
s390/iucv: use new address translation helpers
s390/ctcm: use new address translation helpers
s390/lcs: use new address translation helpers
s390/qeth: use new address translation helpers
s390/zfcp: use new address translation helpers
s390/tape: fix virtual vs physical address confusion
s390/3270: use new address translation helpers
s390/3215: use new address translation helpers
...
Steven Rostedt (Google) [Tue, 19 Mar 2024 17:39:59 +0000 (13:39 -0400)]
tracing: Just use strcmp() for testing __string() and __assign_str() match
As __assign_str() no longer uses its "src" parameter, there's a check to
make sure nothing depends on it being different than what was passed to
__string(). It originally just compared the pointer passed to __string()
with the pointer passed into __assign_str() via the "src" parameter. But
there's a couple of outliers that just pass in a quoted string constant,
where comparing the pointers is UB to the compiler, as the compiler is
free to create multiple copies of the same string constant.
Instead, just use strcmp(). It may slow down the trace event, but this
will eventually be removed.
Also, fix the issue of passing NULL to strcmp() by adding a WARN_ON() to
make sure that both "src" and the pointer saved in __string() are either
both NULL or have content, and then checking if "src" is not NULL before
performing the strcmp().
Link: https://lore.kernel.org/all/CAHk-=wjxX16kWd=uxG5wzqt=aXoYDf1BgWOKk+qVmAO0zh7sjA@mail.gmail.com/
Fixes: b1afefa62ca9 ("tracing: Use strcmp() in __assign_str() WARN_ON() check")
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Tue, 19 Mar 2024 18:19:36 +0000 (11:19 -0700)]
Merge tag 'pm-6.9-rc1-2' of git://git./linux/kernel/git/rafael/linux-pm
Pull more power management updates from Rafael Wysocki:
"These update the Energy Model to make it prevent errors due to power
unit mismatches, fix a typo in power management documentation, convert
one driver to using a platform remove callback returning void, address
two cpufreq issues (one in the core and one in the DT driver), and
enable boost support in the SCMI cpufreq driver.
Specifics:
- Modify the Energy Model code to bail out and complain if the unit
of power is not uW to prevent errors due to unit mismatches (Lukasz
Luba)
- Make the intel_rapl platform driver use a remove callback returning
void (Uwe Kleine-König)
- Fix typo in the suspend and interrupts document (Saravana Kannan)
- Make per-policy boost flags actually take effect on platforms using
cpufreq_boost_set_sw() (Sibi Sankar)
- Enable boost support in the SCMI cpufreq driver (Sibi Sankar)
- Make the DT cpufreq driver use zalloc_cpumask_var() for allocating
cpumasks to avoid using unitinialized memory (Marek Szyprowski)"
* tag 'pm-6.9-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: scmi: Enable boost support
firmware: arm_scmi: Add support for marking certain frequencies as turbo
cpufreq: dt: always allocate zeroed cpumask
cpufreq: Fix per-policy boost behavior on SoCs using cpufreq_boost_set_sw()
Documentation: power: Fix typo in suspend and interrupts doc
PM: EM: Force device drivers to provide power in uW
powercap: intel_rapl: Convert to platform remove callback returning void
Linus Torvalds [Tue, 19 Mar 2024 18:15:14 +0000 (11:15 -0700)]
Merge tag 'acpi-6.9-rc1-2' of git://git./linux/kernel/git/rafael/linux-pm
Pull more ACPI updates from Rafael Wysocki:
"These update ACPI documentation and kerneldoc comments.
Specifics:
- Add markup to generate links from footnotes in the ACPI enumeration
document (Chris Packham)
- Update the handle_eject_request() kerneldoc comment to document the
arguments of the function and improve kerneldoc comments for ACPI
suspend and hibernation functions (Yang Li)"
* tag 'acpi-6.9-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: PM: Improve kerneldoc comments for suspend and hibernation functions
ACPI: docs: enumeration: Make footnotes links
ACPI: Document handle_eject_request() arguments
Linus Torvalds [Tue, 19 Mar 2024 18:11:01 +0000 (11:11 -0700)]
Merge tag 'thermal-6.9-rc1-2' of git://git./linux/kernel/git/rafael/linux-pm
Pull more thermal control updates from Rafael Wysocki:
"These update thermal drivers for ARM platforms by adding new hardware
support (r8a779h0, H616 THS), addressing issues (Mediatek LVTS,
Mediatek MT7896, thermal-of) and cleaning up code.
Specifics:
- Fix memory leak in the error path at probe time in the Mediatek
LVTS driver (Christophe Jaillet)
- Fix control buffer enablement regression on Meditek MT7896 (Frank
Wunderlich)
- Drop spaces before TABs in different places: thermal-of, ST drivers
and Makefile (Geert Uytterhoeven)
- Adjust DT binding for NXP as fsl,tmu-range min/maxItems can vary
among several SoC versions (Fabio Estevam)
- Add support for the H616 THS controller on Sun8i platforms (Martin
Botka)
- Don't fail probe due to zone registration failure because there is
no trip points defined in the DT (Mark Brown)
- Support variable TMU array size for new platforms (Peng Fan)
- Adjust the DT binding for thermal-of and make the polling time not
required and assume it is zero when not found in the DT (Konrad
Dybcio)
- Add r8a779h0 support in both the DT and the rcar_gen3 driver (Geert
Uytterhoeven)"
* tag 'thermal-6.9-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
thermal/drivers/rcar_gen3: Add support for R-Car V4M
dt-bindings: thermal: rcar-gen3-thermal: Add r8a779h0 support
thermal/of: Assume polling-delay(-passive) 0 when absent
dt-bindings: thermal-zones: Don't require polling-delay(-passive)
thermal/drivers/qoriq: Fix getting tmu range
thermal/drivers/sun8i: Don't fail probe due to zone registration failure
thermal/drivers/sun8i: Add support for H616 THS controller
thermal/drivers/sun8i: Add SRAM register access code
thermal/drivers/sun8i: Extend H6 calibration to support 4 sensors
thermal/drivers/sun8i: Explain unknown H6 register value
dt-bindings: thermal: sun8i: Add H616 THS controller
soc: sunxi: sram: export register 0 for THS on H616
dt-bindings: thermal: qoriq-thermal: Adjust fsl,tmu-range min/maxItems
thermal: Drop spaces before TABs
thermal/drivers/mediatek: Fix control buffer enablement on MT7896
thermal/drivers/mediatek/lvts_thermal: Fix a memory leak in an error handling path
Linus Torvalds [Tue, 19 Mar 2024 18:05:34 +0000 (11:05 -0700)]
Merge tag 'ata-6.9-rc1-2' of git://git./linux/kernel/git/libata/linux
Pull ata fix from Niklas Cassel:
"A single fix for ASMedia HBAs.
These HBAs do not indicate that they support SATA Port Multipliers
CAP.SPM (Supports Port Multiplier) is not set.
Likewise, they do not allow you to probe the devices behind an
attached PMP, as defined according to the SATA-IO PMP specification.
Instead, they have decided to implement their own version of PMP,
and because of this, plugging in a PMP actually works, even if the
HBA claims that it does not support PMP.
Revert a recent quirk for these HBAs, as that breaks ASMedia's own
implementation of PMP.
Unfortunately, this will once again give some users of these HBAs
significantly increased boot time. However, a longer boot time for
some, is the lesser evil compared to some other users not being able
to detect their drives at all"
* tag 'ata-6.9-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
ahci: asm1064: asm1166: don't limit reported ports
Linus Torvalds [Tue, 19 Mar 2024 15:57:39 +0000 (08:57 -0700)]
Merge tag 'for_linus' of git://git./linux/kernel/git/mst/vhost
Pull virtio updates from Michael Tsirkin:
- Per vq sizes in vdpa
- Info query for block devices support in vdpa
- DMA sync callbacks in vduse
- Fixes, cleanups
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (35 commits)
virtio_net: rename free_old_xmit_skbs to free_old_xmit
virtio_net: unify the code for recycling the xmit ptr
virtio-net: add cond_resched() to the command waiting loop
virtio-net: convert rx mode setting to use workqueue
virtio: packed: fix unmap leak for indirect desc table
vDPA: report virtio-blk flush info to user space
vDPA: report virtio-block read-only info to user space
vDPA: report virtio-block write zeroes configuration to user space
vDPA: report virtio-block discarding configuration to user space
vDPA: report virtio-block topology info to user space
vDPA: report virtio-block MQ info to user space
vDPA: report virtio-block max segments in a request to user space
vDPA: report virtio-block block-size to user space
vDPA: report virtio-block max segment size to user space
vDPA: report virtio-block capacity to user space
virtio: make virtio_bus const
vdpa: make vdpa_bus const
vDPA/ifcvf: implement vdpa_config_ops.get_vq_num_min
vDPA/ifcvf: get_max_vq_size to return max size
virtio_vdpa: create vqs with the actual size
...
Linus Torvalds [Tue, 19 Mar 2024 15:48:09 +0000 (08:48 -0700)]
Merge tag 'for-linus-6.9-rc1-tag' of git://git./linux/kernel/git/xen/tip
Pull xen updates from Juergen Gross:
- Xen event channel handling fix for a regression with a rare kernel
config and some added hardening
- better support of running Xen dom0 in PVH mode
- a cleanup for the xen grant-dma-iommu driver
* tag 'for-linus-6.9-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/events: increment refcnt only if event channel is refcounted
xen/evtchn: avoid WARN() when unbinding an event channel
x86/xen: attempt to inflate the memory balloon on PVH
xen/grant-dma-iommu: Convert to platform remove callback returning void
Rafael J. Wysocki [Tue, 19 Mar 2024 12:25:49 +0000 (13:25 +0100)]
Merge branches 'pm-em', 'pm-powercap' and 'pm-sleep'
Merge additional updates related to the Energy Model, power capping
and system-wide power management for 6.9-rc1:
- Modify the Energy Model code to bail out and complain if the unit of
power is not uW to prevent errors due to unit mismatches (Lukasz
Luba).
- Make the intel_rapl platform driver use a remove callback returning
void (Uwe Kleine-König).
- Fix typo in the suspend and interrupts document (Saravana Kannan).
* pm-em:
PM: EM: Force device drivers to provide power in uW
* pm-powercap:
powercap: intel_rapl: Convert to platform remove callback returning void
* pm-sleep:
Documentation: power: Fix typo in suspend and interrupts doc
Rafael J. Wysocki [Tue, 19 Mar 2024 12:16:15 +0000 (13:16 +0100)]
Merge branch 'acpi-docs'
Merge an ACPI documentation update for 6.9-rc1 which adds markup to
generate links from footnotes in the enumeration document.
* acpi-docs:
ACPI: docs: enumeration: Make footnotes links
Conrad Kostecki [Wed, 13 Mar 2024 21:46:50 +0000 (22:46 +0100)]
ahci: asm1064: asm1166: don't limit reported ports
Previously, patches have been added to limit the reported count of SATA
ports for asm1064 and asm1166 SATA controllers, as those controllers do
report more ports than physically having.
While it is allowed to report more ports than physically having in CAP.NP,
it is not allowed to report more ports than physically having in the PI
(Ports Implemented) register, which is what these HBAs do.
(This is a AHCI spec violation.)
Unfortunately, it seems that the PMP implementation in these ASMedia HBAs
is also violating the AHCI and SATA-IO PMP specification.
What these HBAs do is that they do not report that they support PMP
(CAP.SPM (Supports Port Multiplier) is not set).
Instead, they have decided to add extra "virtual" ports in the PI register
that is used if a port multiplier is connected to any of the physical
ports of the HBA.
Enumerating the devices behind the PMP as specified in the AHCI and
SATA-IO specifications, by using PMP READ and PMP WRITE commands to the
physical ports of the HBA is not possible, you have to use the "virtual"
ports.
This is of course bad, because this gives us no way to detect the device
and vendor ID of the PMP actually connected to the HBA, which means that
we can not apply the proper PMP quirks for the PMP that is connected to
the HBA.
Limiting the port map will thus stop these controllers from working with
SATA Port Multipliers.
This patch reverts both patches for asm1064 and asm1166, so old behavior
is restored and SATA PMP will work again, but it will also reintroduce the
(minutes long) extra boot time for the ASMedia controllers that do not
have a PMP connected (either on the PCIe card itself, or an external PMP).
However, a longer boot time for some, is the lesser evil compared to some
other users not being able to detect their drives at all.
Fixes: 0077a504e1a4 ("ahci: asm1166: correct count of reported ports")
Fixes: 9815e3961754 ("ahci: asm1064: correct count of reported ports")
Cc: stable@vger.kernel.org
Reported-by: Matt <cryptearth@googlemail.com>
Signed-off-by: Conrad Kostecki <conikost@gentoo.org>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
[cassel: rewrote commit message]
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Xuan Zhuo [Thu, 29 Feb 2024 07:20:43 +0000 (15:20 +0800)]
virtio_net: rename free_old_xmit_skbs to free_old_xmit
Since free_old_xmit_skbs not only deals with skb, but also xdp frame and
subsequent added xsk, so change the name of this function to
free_old_xmit.
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <
20240229072044.77388-19-xuanzhuo@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Xuan Zhuo [Thu, 29 Feb 2024 07:20:42 +0000 (15:20 +0800)]
virtio_net: unify the code for recycling the xmit ptr
There are two completely similar and independent implementations. This
is inconvenient for the subsequent addition of new types. So extract a
function from this piece of code and call this function uniformly to
recover old xmit ptr.
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <
20240229072044.77388-18-xuanzhuo@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jason Wang [Thu, 20 Jul 2023 08:38:39 +0000 (04:38 -0400)]
virtio-net: add cond_resched() to the command waiting loop
Adding cond_resched() to the command waiting loop for a better
co-operation with the scheduler. This allows to give CPU a breath to
run other task(workqueue) instead of busy looping when preemption is
not allowed on a device whose CVQ might be slow.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <
20230720083839.481487-3-jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Jason Wang [Thu, 20 Jul 2023 08:38:38 +0000 (04:38 -0400)]
virtio-net: convert rx mode setting to use workqueue
This patch convert rx mode setting to be done in a workqueue, this is
a must for allow to sleep when waiting for the cvq command to
response since current code is executed under addr spin lock.
Note that we need to disable and flush the workqueue during freeze,
this means the rx mode setting is lost after resuming. This is not the
bug of this patch as we never try to restore rx mode setting during
resume.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <
20230720083839.481487-2-jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Xuan Zhuo [Fri, 23 Feb 2024 07:18:33 +0000 (15:18 +0800)]
virtio: packed: fix unmap leak for indirect desc table
When use_dma_api and premapped are true, then the do_unmap is false.
Because the do_unmap is false, vring_unmap_extra_packed is not called by
detach_buf_packed.
if (unlikely(vq->do_unmap)) {
curr = id;
for (i = 0; i < state->num; i++) {
vring_unmap_extra_packed(vq,
&vq->packed.desc_extra[curr]);
curr = vq->packed.desc_extra[curr].next;
}
}
So the indirect desc table is not unmapped. This causes the unmap leak.
So here, we check vq->use_dma_api instead. Synchronously, dma info is
updated based on use_dma_api judgment
This bug does not occur, because no driver use the premapped with
indirect.
Fixes: b319940f83c2 ("virtio_ring: skip unmap for premapped")
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Message-Id: <
20240223071833.26095-1-xuanzhuo@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Zhu Lingshan [Sun, 18 Feb 2024 18:56:06 +0000 (02:56 +0800)]
vDPA: report virtio-blk flush info to user space
This commit reports whether a virtio-blk device
support cache flush command to user space
Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Message-Id: <
20240218185606.13509-11-lingshan.zhu@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Zhu Lingshan [Sun, 18 Feb 2024 18:56:05 +0000 (02:56 +0800)]
vDPA: report virtio-block read-only info to user space
This commit report read-only information of
virtio-blk devices to user space.
Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Message-Id: <
20240218185606.13509-10-lingshan.zhu@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Zhu Lingshan [Sun, 18 Feb 2024 18:56:04 +0000 (02:56 +0800)]
vDPA: report virtio-block write zeroes configuration to user space
This commits reports write zeroes configuration of
virtio-block devices to user space, includes:
1)maximum write zeroes sectors size
2)maximum write zeroes segment number
Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Message-Id: <
20240218185606.13509-9-lingshan.zhu@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Zhu Lingshan [Sun, 18 Feb 2024 18:56:03 +0000 (02:56 +0800)]
vDPA: report virtio-block discarding configuration to user space
This commit reports virtio-blk discarding configuration
to user space,includes:
1) the maximum discard sectors
2) maximum number of discard segments for the block driver to use
3) the alignment for splitting a discarding request
Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Message-Id: <
20240218185606.13509-8-lingshan.zhu@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Zhu Lingshan [Sun, 18 Feb 2024 18:56:02 +0000 (02:56 +0800)]
vDPA: report virtio-block topology info to user space
This commit allows vDPA reporting topology information of
virtio-blk devices to user space, includes:
1) the number of logical blocks per physical block
2) offset of first aligned logical block
3) suggested minimum I/O size in blocks
4) optimal (suggested maximum) I/O size in blocks
Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Message-Id: <
20240218185606.13509-7-lingshan.zhu@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Zhu Lingshan [Sun, 18 Feb 2024 18:56:01 +0000 (02:56 +0800)]
vDPA: report virtio-block MQ info to user space
This commits allows vDPA reporting virtio-block multi-queue
configuration to user sapce.
Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Message-Id: <
20240218185606.13509-6-lingshan.zhu@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Zhu Lingshan [Sun, 18 Feb 2024 18:56:00 +0000 (02:56 +0800)]
vDPA: report virtio-block max segments in a request to user space
This commit allows vDPA reporting the maximum number of
segments in a request of virtio-block devices to
user space.
Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Message-Id: <
20240218185606.13509-5-lingshan.zhu@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Zhu Lingshan [Sun, 18 Feb 2024 18:55:59 +0000 (02:55 +0800)]
vDPA: report virtio-block block-size to user space
This commit allows reporting the block size of a
virtio-block device to user space.
Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Message-Id: <
20240218185606.13509-4-lingshan.zhu@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Zhu Lingshan [Sun, 18 Feb 2024 18:55:58 +0000 (02:55 +0800)]
vDPA: report virtio-block max segment size to user space
This commit allows reporting the max size of any
single segment of virtio-block devices to user space.
Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Message-Id: <
20240218185606.13509-3-lingshan.zhu@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Zhu Lingshan [Sun, 18 Feb 2024 18:55:57 +0000 (02:55 +0800)]
vDPA: report virtio-block capacity to user space
This commit allows userspace to query capacity of
a virtio-block device.
Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Message-Id: <
20240218185606.13509-2-lingshan.zhu@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Ricardo B. Marliere [Sun, 4 Feb 2024 20:52:51 +0000 (17:52 -0300)]
virtio: make virtio_bus const
Now that the driver core can properly handle constant struct bus_type,
move the virtio_bus variable to be a constant structure as well,
placing it into read-only memory which can not be modified at runtime.
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net>
Message-Id: <
20240204-bus_cleanup-virtio-v1-1-
3bcb2212aaa0@marliere.net>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Jason Wang <jasowang@redhat.com>
Ricardo B. Marliere [Sun, 4 Feb 2024 20:50:45 +0000 (17:50 -0300)]
vdpa: make vdpa_bus const
Now that the driver core can properly handle constant struct bus_type,
move the vdpa_bus variable to be a constant structure as well,
placing it into read-only memory which can not be modified at runtime.
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net>
Message-Id: <
20240204-bus_cleanup-vdpa-v1-1-
1745eccb0a5c@marliere.net>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Zhu Lingshan [Fri, 2 Feb 2024 16:39:05 +0000 (00:39 +0800)]
vDPA/ifcvf: implement vdpa_config_ops.get_vq_num_min
IFCVF HW supports operation with vq size less than the max size,
as the spec required.
This commit implements vdpa_config_ops.get_vq_num_min to report
the minimal size of the virtqueues, which gives vDPA framework
a chance to reduce the vring size.
We need at least one descriptor to be functional, but it is better
no less than 64 to meet ceratin performance requirements.
Actually the framework would allocate at least a PAGE for the vq.
Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Message-Id: <
20240202163905.8834-11-lingshan.zhu@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Zhu Lingshan [Fri, 2 Feb 2024 16:39:04 +0000 (00:39 +0800)]
vDPA/ifcvf: get_max_vq_size to return max size
Since we already implemented vdpa_config_ops.get_vq_size,
so get_max_vq_size can return the acutal max size of the
virtqueues other than the max allowed safe size.
Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Message-Id: <
20240202163905.8834-10-lingshan.zhu@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Zhu Lingshan [Fri, 2 Feb 2024 16:39:03 +0000 (00:39 +0800)]
virtio_vdpa: create vqs with the actual size
The size of a virtqueue is a per vq configuration,
this commit allows virtio_vdpa to create
virtqueues with the actual size of a specific
vq size that supported by the backend device.
Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Message-Id: <
20240202163905.8834-9-lingshan.zhu@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Zhu Lingshan [Fri, 2 Feb 2024 16:39:02 +0000 (00:39 +0800)]
vduse: implement vdpa_config_ops.get_vq_size for vduse
This commit implements get_vq_size for vdpa_config_ops. This
new interface is used to report per vq size.
Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Message-Id: <
20240202163905.8834-8-lingshan.zhu@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Zhu Lingshan [Fri, 2 Feb 2024 16:39:01 +0000 (00:39 +0800)]
vdpa_sim: implement vdpa_config_ops.get_vq_size for vDPA simulator
This commit implements vdpa_config_ops.get_vq_size for vDPA
simulator, this new interface can help report per vq size.
Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Message-Id: <
20240202163905.8834-7-lingshan.zhu@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Zhu Lingshan [Fri, 2 Feb 2024 16:39:00 +0000 (00:39 +0800)]
eni_vdpa: implement vdpa_config_ops.get_vq_size
This commit implements get_vq_size which report
per vq size in vdpa_config_ops
Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Message-Id: <
20240202163905.8834-6-lingshan.zhu@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Zhu Lingshan [Fri, 2 Feb 2024 16:38:59 +0000 (00:38 +0800)]
vp_vdpa: implement vdpa_config_ops.get_vq_size
This commit implements vdpa_config_ops.get_vq_size in
vp_vdpa, which reports per virtqueue size.
Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Message-Id: <
20240202163905.8834-5-lingshan.zhu@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Zhu Lingshan [Fri, 2 Feb 2024 16:38:58 +0000 (00:38 +0800)]
vDPA/ifcvf: implement vdpa_config_ops.get_vq_size
This commit implements vdpa_ops.get_vq_size to report
the size of a specific virtqueue.
Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Message-Id: <
20240202163905.8834-4-lingshan.zhu@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Zhu Lingshan [Fri, 2 Feb 2024 16:38:57 +0000 (00:38 +0800)]
vDPA: introduce get_vq_size to vdpa_config_ops
This commit introduces a new interface get_vq_size to
vDPA config ops, this new interface intends to report
the size of a specific virtqueue
Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Message-Id: <
20240202163905.8834-3-lingshan.zhu@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Zhu Lingshan [Fri, 2 Feb 2024 16:38:56 +0000 (00:38 +0800)]
vhost-vdpa: uapi to support reporting per vq size
The size of a virtqueue is a per vq configuration.
This commit introduce a new ioctl uAPI to support this flexibility.
Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Message-Id: <
20240202163905.8834-2-lingshan.zhu@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Shannon Nelson [Tue, 20 Feb 2024 01:10:50 +0000 (17:10 -0800)]
vdpa/pds: fixes for VF vdpa flr-aer handling
This addresses a couple of things found while testing the FLR and AER
handling with the VFs.
- release irqs before calling vp_modern_remove()
- make sure we have a valid struct pointer before using it to release irqs
- make sure the FW is alive before trying to add a new device
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Message-Id: <
20240220011050.30913-1-shannon.nelson@amd.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Maxime Coquelin [Mon, 19 Feb 2024 17:06:06 +0000 (18:06 +0100)]
vduse: implement DMA sync callbacks
Since commit
295525e29a5b ("virtio_net: merge dma
operations when filling mergeable buffers"), VDUSE device
require support for DMA's .sync_single_for_cpu() operation
as the memory is non-coherent between the device and CPU
because of the use of a bounce buffer.
This patch implements both .sync_single_for_cpu() and
.sync_single_for_device() callbacks, and also skip bounce
buffer copies during DMA map and unmap operations if the
DMA_ATTR_SKIP_CPU_SYNC attribute is set to avoid extra
copies of the same buffer.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Message-Id: <
20240219170606.587290-1-maxime.coquelin@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonah Palmer [Fri, 16 Feb 2024 14:25:02 +0000 (09:25 -0500)]
vdpa/mlx5: Allow CVQ size changes
The MLX driver was not updating its control virtqueue size at set_vq_num
and instead always initialized to MLX5_CVQ_MAX_ENT (16) at
setup_cvq_vring.
Qemu would try to set the size to 64 by default, however, because the
CVQ size always was initialized to 16, an error would be thrown when
sending >16 control messages (as used-ring entry 17 is initialized to 0).
For example, starting a guest with x-svq=on and then executing the
following command would produce the error below:
# for i in {1..20}; do ifconfig eth0 hw ether XX:xx:XX:xx:XX:XX; done
qemu-system-x86_64: Insufficient written data (0)
[ 435.331223] virtio_net virtio0: Failed to set mac address by vq command.
SIOCSIFHWADDR: Invalid argument
Acked-by: Dragos Tatulea <dtatulea@nvidia.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Message-Id: <
20240216142502.78095-1-jonah.palmer@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Fixes: 5262912ef3cf ("vdpa/mlx5: Add support for control VQ and MAC setting")
Steve Sistare [Tue, 13 Feb 2024 14:25:58 +0000 (06:25 -0800)]
vdpa: skip suspend/resume ops if not DRIVER_OK
If a vdpa device is not in state DRIVER_OK, then there is no driver state
to preserve, so no need to call the suspend and resume driver ops.
Suggested-by: Eugenio Perez Martin <eperezma@redhat.com>"
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Message-Id: <
1707834358-165470-1-git-send-email-steven.sistare@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
David Hildenbrand [Tue, 13 Feb 2024 13:54:25 +0000 (14:54 +0100)]
virtio: reenable config if freezing device failed
Currently, we don't reenable the config if freezing the device failed.
For example, virtio-mem currently doesn't support suspend+resume, and
trying to freeze the device will always fail. Afterwards, the device
will no longer respond to resize requests, because it won't get notified
about config changes.
Let's fix this by re-enabling the config if freezing fails.
Fixes: 22b7050a024d ("virtio: defer config changed notifications")
Cc: <stable@kernel.org>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <
20240213135425.795001-1-david@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Steve Sistare [Fri, 9 Feb 2024 22:30:07 +0000 (14:30 -0800)]
vdpa_sim: reset must not run
vdpasim_do_reset sets running to true, which is wrong, as it allows
vdpasim_kick_vq to post work requests before the device has been
configured. To fix, do not set running until VIRTIO_CONFIG_S_DRIVER_OK
is set.
Fixes: 0c89e2a3a9d0 ("vdpa_sim: Implement suspend vdpa op")
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <
1707517807-137331-1-git-send-email-steven.sistare@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Suzuki K Poulose [Thu, 25 Jan 2024 23:20:39 +0000 (23:20 +0000)]
virtio: uapi: Drop __packed attribute in linux/virtio_pci.h
Commit
92792ac752aa ("virtio-pci: Introduce admin command sending function")
added "__packed" structures to UAPI header linux/virtio_pci.h. This triggers
build failures in the consumer userspace applications without proper "definition"
of __packed (e.g., kvmtool build fails).
Moreover, the structures are already packed well, and doesn't need explicit
packing, similar to the rest of the structures in all virtio_* headers. Remove
the __packed attribute.
Fixes: 92792ac752aa ("virtio-pci: Introduce admin command sending function")
Cc: Feng Liu <feliu@nvidia.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Yishai Hadas <yishaih@nvidia.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Message-Id: <
20240125232039.913606-1-suzuki.poulose@arm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Andrew Melnychenko [Mon, 15 Jan 2024 19:48:40 +0000 (21:48 +0200)]
vhost: Added pad cleanup if vnet_hdr is not present.
When the Qemu launched with vhost but without tap vnet_hdr,
vhost tries to copy vnet_hdr from socket iter with size 0
to the page that may contain some trash.
That trash can be interpreted as unpredictable values for
vnet_hdr.
That leads to dropping some packets and in some cases to
stalling vhost routine when the vhost_net tries to process
packets and fails in a loop.
Qemu options:
-netdev tap,vhost=on,vnet_hdr=off,...
Signed-off-by: Andrew Melnychenko <andrew@daynix.com>
Message-Id: <
20240115194840.
1183077-1-andrew@daynix.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Kent Overstreet [Tue, 19 Mar 2024 02:18:02 +0000 (22:18 -0400)]
bcachefs: Fix lost wakeup on journal shutdown
We need to check for journal shutdown first in __journal_res_get() -
after the journal is shutdown, j->watermark won't be changing anymore.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 19 Mar 2024 01:36:08 +0000 (21:36 -0400)]
bcachefs; Fix deadlock in bch2_btree_update_start()
BCH_TRANS_COMMIT_journal_reclaim with watermark != BCH_WATERMARK_reclaim
means nonblocking, and we need the journal_res_get() in
btree_update_start() to respect that.
In a future refactoring we'll be deleting
BCH_TRANS_COMMIT_journal_reclaim and replacing it with an explicit
BCH_TRANS_COMMIT_nonblocking.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Linus Torvalds [Mon, 18 Mar 2024 22:39:48 +0000 (15:39 -0700)]
Merge tag 'dlm-6.9' of git://git./linux/kernel/git/teigland/linux-dlm
Pull dlm updates from David Teigland:
- Fix mistaken variable assignment that caused a refcounting problem
- Revert a recent change that began using atomic counters where they
were not needed (for lkb wait_count)
- Add comments around forced state reset for waiting lock operations
during recovery
* tag 'dlm-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
dlm: add comments about forced waiters reset
dlm: revert atomic_t lkb_wait_count
dlm: fix user space lkb refcounting
Linus Torvalds [Mon, 18 Mar 2024 22:34:03 +0000 (15:34 -0700)]
Merge tag 'for-linus' of git://git./linux/kernel/git/rdma/rdma
Pull rdma updates from Jason Gunthorpe:
"Very small update this cycle:
- Minor code improvements in fi, rxe, ipoib, mana, cxgb4, mlx5,
irdma, rxe, rtrs, mana
- Simplify the hns hem mechanism
- Fix EFA's MSI-X allocation in resource constrained configurations
- Fix a KASN splat in srpt
- Narrow hns's congestion control selection to QPs granularity and
allow userspace to select it
- Solve a parallel module loading race between the CM module and a
driver module
- Flexible array cleanup
- Dump hns's SCC Conext to 'rdma res' for debugging
- Make mana build page lists for HW objects that require a 0 offset
correctly
- Stuck CM ID debugging"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (29 commits)
RDMA/cm: add timeout to cm_destroy_id wait
RDMA/mana_ib: Use virtual address in dma regions for MRs
RDMA/mana_ib: Fix bug in creation of dma regions
RDMA/hns: Append SCC context to the raw dump of QPC
RDMA/uverbs: Avoid -Wflex-array-member-not-at-end warnings
RDMA/hns: Support userspace configuring congestion control algorithm with QP granularity
RDMA/rtrs-clt: Check strnlen return len in sysfs mpath_policy_store()
RDMA/uverbs: Remove flexible arrays from struct *_filter
RDMA/device: Fix a race between mad_client and cm_client init
RDMA/hns: Fix mis-modifying default congestion control algorithm
RDMA/rxe: Remove unused 'iova' parameter from rxe_mr_init_user
RDMA/srpt: Do not register event handler until srpt device is fully setup
RDMA/irdma: Remove duplicate assignment
RDMA/efa: Limit EQs to available MSI-X vectors
RDMA/mlx5: Delete unused mlx5_ib_copy_pas prototype
RDMA/cxgb4: Delete unused c4iw_ep_redirect prototype
RDMA/mana_ib: Introduce mana_ib_install_cq_cb helper function
RDMA/mana_ib: Introduce mana_ib_get_netdev helper function
RDMA/mana_ib: Introduce mdev_to_gc helper function
RDMA/hns: Simplify 'struct hns_roce_hem' allocation
...
Linus Torvalds [Mon, 18 Mar 2024 22:27:03 +0000 (15:27 -0700)]
Merge tag 'ktest-v6.9' of git://git./linux/kernel/git/rostedt/linux-ktest
Pull ktest updates from Steven Rostedt:
- Allow variables to contain variables. This makes the shell commands
have a bit more flexibility to reuse existing variables.
- Have make_warnings_file in build-only mode require limited variables
The make_warnings_file test will create a file with all existing
warnings (which can be used to compare against in builds with new
commits). Add it to the build-only list that doesn't require other
variables (like how to reset a machine), as the make_warnings_file
makes the most sense on build only tests.
* tag 'ktest-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest:
ktest: force $buildonly = 1 for 'make_warnings_file' test type
ktest.pl: Process variables within variables
Linus Torvalds [Mon, 18 Mar 2024 22:11:44 +0000 (15:11 -0700)]
Merge tag 'trace-v6.9-2' of git://git./linux/kernel/git/trace/linux-trace
Pull tracing updates from Steven Rostedt:
"Main user visible change:
- User events can now have "multi formats"
The current user events have a single format. If another event is
created with a different format, it will fail to be created. That
is, once an event name is used, it cannot be used again with a
different format. This can cause issues if a library is using an
event and updates its format. An application using the older format
will prevent an application using the new library from registering
its event.
A task could also DOS another application if it knows the event
names, and it creates events with different formats.
The multi-format event is in a different name space from the single
format. Both the event name and its format are the unique
identifier. This will allow two different applications to use the
same user event name but with different payloads.
- Added support to have ftrace_dump_on_oops dump out instances and
not just the main top level tracing buffer.
Other changes:
- Add eventfs_root_inode
Only the root inode has a dentry that is static (never goes away)
and stores it upon creation. There's no reason that the thousands
of other eventfs inodes should have a pointer that never gets set
in its descriptor. Create a eventfs_root_inode desciptor that has a
eventfs_inode descriptor and a dentry pointer, and only the root
inode will use this.
- Added WARN_ON()s in eventfs
There's some conditionals remaining in eventfs that should never be
hit, but instead of removing them, add WARN_ON() around them to
make sure that they are never hit.
- Have saved_cmdlines allocation also include the map_cmdline_to_pid
array
The saved_cmdlines structure allocates a large amount of data to
hold its mappings. Within it, it has three arrays. Two are already
apart of it: map_pid_to_cmdline[] and saved_cmdlines[]. More memory
can be saved by also including the map_cmdline_to_pid[] array as
well.
- Restructure __string() and __assign_str() macros used in
TRACE_EVENT()
Dynamic strings in TRACE_EVENT() are declared with:
__string(name, source)
And assigned with:
__assign_str(name, source)
In the tracepoint callback of the event, the __string() is used to
get the size needed to allocate on the ring buffer and
__assign_str() is used to copy the string into the ring buffer.
There's a helper structure that is created in the TRACE_EVENT()
macro logic that will hold the string length and its position in
the ring buffer which is created by __string().
There are several trace events that have a function to create the
string to save. This function is executed twice. Once for
__string() and again for __assign_str(). There's no reason for
this. The helper structure could also save the string it used in
__string() and simply copy that into __assign_str() (it also
already has its length).
By using the structure to store the source string for the
assignment, it means that the second argument to __assign_str() is
no longer needed.
It will be removed in the next merge window, but for now add a
warning if the source string given to __string() is different than
the source string given to __assign_str(), as the source to
__assign_str() isn't even used and will be going away.
- Added checks to make sure that the source of __string() is also the
source of __assign_str() so that it can be safely removed in the
next merge window.
Included fixes that the above check found.
- Other minor clean ups and fixes"
* tag 'trace-v6.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (34 commits)
tracing: Add __string_src() helper to help compilers not to get confused
tracing: Use strcmp() in __assign_str() WARN_ON() check
tracepoints: Use WARN() and not WARN_ON() for warnings
tracing: Use div64_u64() instead of do_div()
tracing: Support to dump instance traces by ftrace_dump_on_oops
tracing: Remove second parameter to __assign_rel_str()
tracing: Add warning if string in __assign_str() does not match __string()
tracing: Add __string_len() example
tracing: Remove __assign_str_len()
ftrace: Fix most kernel-doc warnings
tracing: Decrement the snapshot if the snapshot trigger fails to register
tracing: Fix snapshot counter going between two tracers that use it
tracing: Use EVENT_NULL_STR macro instead of open coding "(null)"
tracing: Use ? : shortcut in trace macros
tracing: Do not calculate strlen() twice for __string() fields
tracing: Rework __assign_str() and __string() to not duplicate getting the string
cxl/trace: Properly initialize cxl_poison region name
net: hns3: tracing: fix hclgevf trace event strings
drm/i915: Add missing ; to __assign_str() macros in tracepoint code
NFSD: Fix nfsd_clid_class use of __string_len() macro
...
Linus Torvalds [Mon, 18 Mar 2024 21:59:13 +0000 (14:59 -0700)]
Merge tag 'sysctl-6.9-rc1' of git://git./linux/kernel/git/sysctl/sysctl
Pull sysctl updates from Joel Granados:
"No functional changes - additional testing is required for the rest of
the pending changes.
- New shared repo for sysctl maintenance
- check-sysctl-docs adjustment for API changes by Thomas Weißschuh"
* tag 'sysctl-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl:
scripts: check-sysctl-docs: handle per-namespace sysctls
ipc: remove linebreaks from arguments of __register_sysctl_table
scripts: check-sysctl-docs: adapt to new API
MAINTAINERS: Update sysctl tree location
Linus Torvalds [Mon, 18 Mar 2024 19:15:19 +0000 (12:15 -0700)]
Merge tag 'for-linus-6.9-ofs1' of git://git./linux/kernel/git/hubcap/linux
Pull orangefs updates from Mike Marshall:
"One fix, one cleanup...
Fix: Julia Lawall pointed out a null pointer dereference.
Cleanup: Vlastimil Babka sent me a patch to remove some SLAB related
code"
* tag 'for-linus-6.9-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux:
Julia Lawall reported this null pointer dereference, this should fix it.
fs/orangefs: remove ORANGEFS_CACHE_CREATE_FLAGS
Linus Torvalds [Mon, 18 Mar 2024 18:26:00 +0000 (11:26 -0700)]
Merge tag 'f2fs-for-6.9-rc1' of git://git./linux/kernel/git/jaegeuk/f2fs
Pull f2fs update from Jaegeuk Kim:
"In this round, there are a number of updates on mainly two areas:
Zoned block device support and Per-file compression. For example,
we've found several issues to support Zoned block device especially
having large sections regarding to GC and file pinning used for
Android devices. In compression side, we've fixed many corner race
conditions that had broken the design assumption.
Enhancements:
- Support file pinning for Zoned block device having large section
- Enhance the data recovery after sudden power cut on Zoned block
device
- Add more error injection cases to easily detect the kernel panics
- add a proc entry show the entire disk layout
- Improve various error paths paniced by BUG_ON in block allocation
and GC
- support SEEK_DATA and SEEK_HOLE for compression files
Bug fixes:
- avoid use-after-free issue in f2fs_filemap_fault
- fix some race conditions to break the atomic write design
assumption
- fix to truncate meta inode pages forcely
- resolve various per-file compression issues wrt the space
management and compression policies
- fix some swap-related bugs
In addition, we removed deprecated codes such as io_bits and
heap_allocation, and also fixed minor error handling routines with
neat debugging messages"
* tag 'f2fs-for-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (60 commits)
f2fs: fix to avoid use-after-free issue in f2fs_filemap_fault
f2fs: truncate page cache before clearing flags when aborting atomic write
f2fs: mark inode dirty for FI_ATOMIC_COMMITTED flag
f2fs: prevent atomic write on pinned file
f2fs: fix to handle error paths of {new,change}_curseg()
f2fs: unify the error handling of f2fs_is_valid_blkaddr
f2fs: zone: fix to remove pow2 check condition for zoned block device
f2fs: fix to truncate meta inode pages forcely
f2fs: compress: fix reserve_cblocks counting error when out of space
f2fs: compress: relocate some judgments in f2fs_reserve_compress_blocks
f2fs: add a proc entry show disk layout
f2fs: introduce SEGS_TO_BLKS/BLKS_TO_SEGS for cleanup
f2fs: fix to check return value of f2fs_gc_range
f2fs: fix to check return value __allocate_new_segment
f2fs: fix to do sanity check in update_sit_entry
f2fs: fix to reset fields for unloaded curseg
f2fs: clean up new_curseg()
f2fs: relocate f2fs_precache_extents() in f2fs_swap_activate()
f2fs: fix blkofs_end correctly in f2fs_migrate_blocks()
f2fs: ro: don't start discard thread for readonly image
...
Linus Torvalds [Mon, 18 Mar 2024 18:15:58 +0000 (11:15 -0700)]
Merge tag 'ovl-fixes-6.9-rc1' of git://git./linux/kernel/git/overlayfs/vfs
Pull overlayfs fixes from Amir Goldstein:
"Only minor fixes:
- Fix uncalled for WARN_ON from v6.8-rc1
- Fix the overlayfs MAINTAINERS entry"
* tag 'ovl-fixes-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs:
ovl: relax WARN_ON in ovl_verify_area()
MAINTAINERS: update overlayfs git tree
Linus Torvalds [Mon, 18 Mar 2024 16:15:50 +0000 (09:15 -0700)]
Merge tag 'vfs-6.9-rc1.fixes' of git://git./linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:
"This contains a few small fixes for this merge window:
- Undo the hiding of silly-rename files in afs. If they're hidden
they can't be deleted by rm manually anymore causing regressions
- Avoid caching the preferred address for an afs server to avoid
accidently overriding an explicitly specified preferred server
address
- Fix bad stat() and rmdir() interaction in afs
- Take a passive reference on the superblock when opening a block
device so the holder is available to concurrent callers from the
block layer
- Clear private data pointer in fscache_begin_operation() to avoid it
being falsely treated as valid"
* tag 'vfs-6.9-rc1.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
fscache: Fix error handling in fscache_begin_operation()
fs,block: get holder during claim
afs: Fix occasional rmdir-then-VNOVNODE with generic/011
afs: Don't cache preferred address
afs: Revert "afs: Hide silly-rename files from userspace"
Linus Torvalds [Mon, 18 Mar 2024 16:10:44 +0000 (09:10 -0700)]
Merge tag 'irq-urgent-2024-03-17' of git://git./linux/kernel/git/tip/tip
Pull irq fix from Ingo Molnar:
"A RISC-V irqchip driver fix"
* tag 'irq-urgent-2024-03-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/riscv-intc: Fix use of AIA interrupts 32-63 on riscv32
Linus Torvalds [Mon, 18 Mar 2024 16:05:37 +0000 (09:05 -0700)]
Merge tag 'sound-fix-6.9-rc1' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Two regression fixes that had been introduced in this merge window,
additional HD-audio quirks, and a further enhancement for the new
kunit"
* tag 'sound-fix-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: core: add kunitconfig
ALSA: hda/realtek: add in quirk for Acer Swift Go 16 - SFG16-71
Revert "ALSA: usb-audio: Name feature ctl using output if input is PCM"
ALSA: timer: Fix missing irq-disable at closing
ALSA: hda/realtek: Add quirk for Lenovo Yoga 9 14IMH9
Steven Rostedt (Google) [Fri, 15 Mar 2024 03:27:54 +0000 (23:27 -0400)]
tracing: Add __string_src() helper to help compilers not to get confused
The __string() helper macro of the TRACE_EVENT() macro is used to
determine how much of the ring buffer needs to be allocated to fit the
given source string. Some trace events have a string that is dependent on
another variable that could be NULL, and in those cases the string is
passed in to be NULL.
The __string() macro can handle being passed in a NULL pointer for which
it will turn it into "(null)". It does that with:
strlen((src) ? (const char *)(src) : "(null)") + 1
But if src itself has the same conditional type it can confuse the
compiler. That is:
__string(r ? dev(r)->name : NULL)
Would turn into:
strlen((r ? dev(r)->name : NULL) ? (r ? dev(r)->name : NULL) : "(null)" + 1
For which the compiler thinks that NULL is being passed to strlen() and
gives this kind of warning:
./include/trace/stages/stage5_get_offsets.h:50:21: warning: argument 1 null where non-null expected [-Wnonnull]
50 | strlen((src) ? (const char *)(src) : "(null)") + 1)
Instead, create a static inline function that takes the src string and
will return the string if it is not NULL and will return "(null)" if it
is. This will then make the strlen() line:
strlen(__string_src(src)) + 1
Where the compiler can see that strlen() will not end up with NULL and
does not warn about it.
Note that this depends on commit
51270d573a8d ("tracing/net_sched: Fix
tracepoints that save qdisc_dev() as a string") being applied, as passing
the qdisc_dev() into __string_src() will give an error.
Link: https://lore.kernel.org/all/ZfNmfCmgCs4Nc+EH@aschofie-mobl2/
Link: https://lore.kernel.org/linux-trace-kernel/20240314232754.345cea82@rorschach.local.home
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reported-by: Alison Schofield <alison.schofield@intel.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt (Google) [Tue, 12 Mar 2024 15:30:02 +0000 (11:30 -0400)]
tracing: Use strcmp() in __assign_str() WARN_ON() check
The WARN_ON() check in __assign_str() to catch where the source variable
to the macro doesn't match the source variable to __string() gives an
error in clang:
>> include/trace/events/sunrpc.h:703:4: warning: result of comparison against a string literal is unspecified (use an explicit string comparison function instead) [-Wstring-compare]
670 | __assign_str(progname, "unknown");
That's because the __assign_str() macro has:
WARN_ON_ONCE((src) != __data_offsets.dst##_ptr_);
Where "src" is a string literal. Clang warns when comparing a string
literal directly as it is undefined to what the value of the literal is.
Since this is still to make sure the same string that goes to __string()
is the same as __assign_str(), for string literals do a test for that and
then use strcmp() in those cases
Note that this depends on commit
51270d573a8d ("tracing/net_sched: Fix
tracepoints that save qdisc_dev() as a string") being applied, as this was
what found that bug.
Link: https://lore.kernel.org/linux-trace-kernel/20240312113002.00031668@gandalf.local.home
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202402292111.KIdExylU-lkp@intel.com/
Fixes: 433e1d88a3be ("tracing: Add warning if string in __assign_str() does not match __string()")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt (Google) [Wed, 28 Feb 2024 18:31:12 +0000 (13:31 -0500)]
tracepoints: Use WARN() and not WARN_ON() for warnings
There are two WARN_ON*() warnings in tracepoint.h that deal with RCU
usage. But when they trigger, especially from using a TRACE_EVENT()
macro, the information is not very helpful and is confusing:
------------[ cut here ]------------
WARNING: CPU: 0 PID: 0 at include/trace/events/lock.h:24 lock_acquire+0x2b2/0x2d0
Where the above warning takes you to:
TRACE_EVENT(lock_acquire, <<<--- line 24 in lock.h
TP_PROTO(struct lockdep_map *lock, unsigned int subclass,
int trylock, int read, int check,
struct lockdep_map *next_lock, unsigned long ip),
[..]
Change the WARN_ON_ONCE() to WARN_ONCE() and add a string that allows
someone to search for exactly where the bug happened.
Link: https://lore.kernel.org/linux-trace-kernel/20240228133112.0d64fb1b@gandalf.local.home
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Reported-by: Borislav Petkov <bp@alien8.de>
Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Thorsten Blum [Sun, 25 Feb 2024 16:45:08 +0000 (17:45 +0100)]
tracing: Use div64_u64() instead of do_div()
Fixes Coccinelle/coccicheck warnings reported by do_div.cocci.
Compared to do_div(), div64_u64() does not implicitly cast the divisor and
does not unnecessarily calculate the remainder.
Link: https://lore.kernel.org/linux-trace-kernel/20240225164507.232942-2-thorsten.blum@toblux.com
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Huang Yiwei [Fri, 23 Feb 2024 08:31:26 +0000 (16:31 +0800)]
tracing: Support to dump instance traces by ftrace_dump_on_oops
Currently ftrace only dumps the global trace buffer on an OOPs. For
debugging a production usecase, instance trace will be helpful to
check specific problems since global trace buffer may be used for
other purposes.
This patch extend the ftrace_dump_on_oops parameter to dump a specific
or multiple trace instances:
- ftrace_dump_on_oops=0: as before -- don't dump
- ftrace_dump_on_oops[=1]: as before -- dump the global trace buffer
on all CPUs
- ftrace_dump_on_oops=2 or =orig_cpu: as before -- dump the global
trace buffer on CPU that triggered the oops
- ftrace_dump_on_oops=<instance_name>: new behavior -- dump the
tracing instance matching <instance_name>
- ftrace_dump_on_oops[=2/orig_cpu],<instance1_name>[=2/orig_cpu],
<instrance2_name>[=2/orig_cpu]: new behavior -- dump the global trace
buffer and multiple instance buffer on all CPUs, or only dump on CPU
that triggered the oops if =2 or =orig_cpu is given
Also, the sysctl node can handle the input accordingly.
Link: https://lore.kernel.org/linux-trace-kernel/20240223083126.1817731-1-quic_hyiwei@quicinc.com
Cc: Ross Zwisler <zwisler@google.com>
Cc: <mhiramat@kernel.org>
Cc: <mark.rutland@arm.com>
Cc: <mcgrof@kernel.org>
Cc: <keescook@chromium.org>
Cc: <j.granados@samsung.com>
Cc: <mathieu.desnoyers@efficios.com>
Cc: <corbet@lwn.net>
Signed-off-by: Huang Yiwei <quic_hyiwei@quicinc.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt (Google) [Fri, 23 Feb 2024 21:25:19 +0000 (16:25 -0500)]
tracing: Remove second parameter to __assign_rel_str()
The second parameter of __assign_rel_str() is no longer used. It can be removed.
Note, the only real users of rel_string is user events. This code is just
in the sample code for testing purposes.
This makes __assign_rel_str() different than __assign_str() but that's
fine. __assign_str() is used over 700 places and has a larger impact. That
change will come later.
Link: https://lore.kernel.org/linux-trace-kernel/20240223162519.2beb8112@gandalf.local.home
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt (Google) [Fri, 23 Feb 2024 21:13:56 +0000 (16:13 -0500)]
tracing: Add warning if string in __assign_str() does not match __string()
In preparation to remove the second parameter of __assign_str(), make sure
it is really a duplicate of __string() by adding a WARN_ON_ONCE().
Link: https://lore.kernel.org/linux-trace-kernel/20240223161356.63b72403@gandalf.local.home
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt (Google) [Fri, 23 Feb 2024 20:28:27 +0000 (15:28 -0500)]
tracing: Add __string_len() example
There's no example code that uses __string_len(), and since the sample
code is used for testing the event logic, add a use case.
Link: https://lore.kernel.org/linux-trace-kernel/20240223152827.5f9f78e2@gandalf.local.home
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt (Google) [Fri, 23 Feb 2024 20:22:06 +0000 (15:22 -0500)]
tracing: Remove __assign_str_len()
Now that __assign_str() gets the length from the __string() (and
__string_len()) macros, there's no reason to have a separate
__assign_str_len() macro as __assign_str() can get the length of the
string needed.
Also remove __assign_rel_str() although it had no users anyway.
Link: https://lore.kernel.org/linux-trace-kernel/20240223152206.0b650659@gandalf.local.home
Cc: Jeff Layton <jlayton@kernel.org>
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Randy Dunlap [Fri, 23 Feb 2024 05:48:33 +0000 (21:48 -0800)]
ftrace: Fix most kernel-doc warnings
Reduce the number of kernel-doc warnings from 52 down to 10, i.e.,
fix 42 kernel-doc warnings by (a) using the Returns: format for
function return values or (b) using "@var:" instead of "@var -"
for function parameter descriptions.
Fix one return values list so that it is formatted correctly when
rendered for output.
Spell "non-zero" with a hyphen in several places.
Link: https://lore.kernel.org/linux-trace-kernel/20240223054833.15471-1-rdunlap@infradead.org
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/oe-kbuild-all/202312180518.X6fRyDSN-lkp@intel.com/
Reported-by: kernel test robot <lkp@intel.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt (Google) [Fri, 23 Feb 2024 01:33:25 +0000 (20:33 -0500)]
tracing: Decrement the snapshot if the snapshot trigger fails to register
Running the ftrace selftests caused the ring buffer mapping test to fail.
Investigating, I found that the snapshot counter would be incremented
every time a snapshot trigger was added, even if that snapshot trigger
failed.
# cd /sys/kernel/tracing
# echo "snapshot" > events/sched/sched_process_fork/trigger
# echo "snapshot" > events/sched/sched_process_fork/trigger
-bash: echo: write error: File exists
That second one that fails increments the snapshot counter but doesn't
decrement it. It needs to be decremented when the snapshot fails.
Link: https://lore.kernel.org/linux-trace-kernel/20240223013344.729055907@goodmis.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Vincent Donnefort <vdonnefort@google.com>
Fixes: 16f7e48ffc53a ("tracing: Add snapshot refcount")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt (Google) [Fri, 23 Feb 2024 01:33:24 +0000 (20:33 -0500)]
tracing: Fix snapshot counter going between two tracers that use it
Running the ftrace selftests caused the ring buffer mapping test to fail.
Investigating, I found that the snapshot counter would be incremented
every time a tracer that uses the snapshot is enabled even if the snapshot
was used by the previous tracer.
That is:
# cd /sys/kernel/tracing
# echo wakeup_rt > current_tracer
# echo wakeup_dl > current_tracer
# echo nop > current_tracer
would leave the snapshot counter at 1 and not zero. That's because the
enabling of wakeup_dl would increment the counter again but the setting
the tracer to nop would only decrement it once.
Do not arm the snapshot for a tracer if the previous tracer already had it
armed.
Link: https://lore.kernel.org/linux-trace-kernel/20240223013344.570525723@goodmis.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Vincent Donnefort <vdonnefort@google.com>
Fixes: 16f7e48ffc53a ("tracing: Add snapshot refcount")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt (Google) [Thu, 22 Feb 2024 21:14:19 +0000 (16:14 -0500)]
tracing: Use EVENT_NULL_STR macro instead of open coding "(null)"
The TRACE_EVENT macros has some dependency if a __string() field is NULL,
where it will save "(null)" as the string. This string is also used by
__assign_str(). It's better to create a single macro instead of having
something that will not be caught by the compiler if there is an
unfortunate typo.
Link: https://lore.kernel.org/linux-trace-kernel/20240222211443.106216915@goodmis.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Chuck Lever <chuck.lever@oracle.com>
Suggested-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt (Google) [Thu, 22 Feb 2024 21:14:18 +0000 (16:14 -0500)]
tracing: Use ? : shortcut in trace macros
Instead of having:
#define __assign_str(dst, src) \
memcpy(__get_str(dst), __data_offsets.dst##_ptr_ ? \
__data_offsets.dst##_ptr_ : "(null)", \
__get_dynamic_array_len(dst))
Use the ? : shortcut and compact it down to:
#define __assign_str(dst, src) \
memcpy(__get_str(dst), __data_offsets.dst##_ptr_ ? : "(null)", \
__get_dynamic_array_len(dst))
Link: https://lore.kernel.org/linux-trace-kernel/20240222211442.949327725@goodmis.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Chuck Lever <chuck.lever@oracle.com>
Suggested-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt (Google) [Thu, 22 Feb 2024 21:14:17 +0000 (16:14 -0500)]
tracing: Do not calculate strlen() twice for __string() fields
The TRACE_EVENT() macro handles dynamic strings by having:
TP_PROTO(struct some_struct *s),
TP_ARGS(s),
TP_STRUCT__entry(
__string(my_string, s->string)
),
TP_fast_assign(
__assign_str(my_string, s->string);
)
TP_printk("%s", __get_str(my_string))
There's even some code that may call a function helper to find the
s->string value. The problem with the above is that the work to get the
s->string is done twice. Once at the __string() and again in the
__assign_str().
The length of the string is calculated via a strlen(), not once, but
twice. Once during the __string() macro and again in __assign_str(). But
the length is actually already recorded in the data location and here's no
reason to call strlen() again.
Just use the saved length that was saved in the __string() code for the
__assign_str() code.
Link: https://lore.kernel.org/linux-trace-kernel/20240222211442.793074999@goodmis.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt (Google) [Thu, 22 Feb 2024 21:14:16 +0000 (16:14 -0500)]
tracing: Rework __assign_str() and __string() to not duplicate getting the string
The TRACE_EVENT() macro handles dynamic strings by having:
TP_PROTO(struct some_struct *s),
TP_ARGS(s),
TP_STRUCT__entry(
__string(my_string, s->string)
),
TP_fast_assign(
__assign_str(my_string, s->string);
)
TP_printk("%s", __get_str(my_string))
There's even some code that may call a function helper to find the
s->string value. The problem with the above is that the work to get the
s->string is done twice. Once at the __string() and again in the
__assign_str().
But the __string() uses dynamic_array() which has a helper structure that
is created holding the offsets and length of the string fields. Instead of
finding the string twice, just save it off in another field from that
helper structure, and have __assign_str() use that instead.
Note, this also means that the second parameter of __assign_str() isn't
even used anymore, and may be removed in the future.
Link: https://lore.kernel.org/linux-trace-kernel/20240222211442.634192653@goodmis.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Alison Schofield [Thu, 14 Mar 2024 20:12:17 +0000 (13:12 -0700)]
cxl/trace: Properly initialize cxl_poison region name
The TP_STRUCT__entry that gets assigned the region name, or an
empty string if no region is present, is erroneously initialized
to the cxl_region pointer. It needs to be properly initialized
otherwise it's length is wrong and garbage chars can appear in
the kernel trace output: /sys/kernel/tracing/trace
The bad initialization was due in part to a naming conflict with
the parameter: struct cxl_region *region. The field 'region' is
already exposed externally as the region name, so changing that
to something logical, like 'region_name' is not an option. Instead
rename the internal only struct cxl_region to the commonly used
'cxlr'.
Impact is that tooling depending on that trace data can miss
picking up a valid event when searching by region name. The
TP_printk() output, if enabled, does emit the correct region
names in the dmesg log.
This was found during testing of the cxl-list option to report
media-errors for a region.
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: stable@vger.kernel.org
Fixes: ddf49d57b841 ("cxl/trace: Add TRACE support for CXL media-error records")
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt (Google) [Wed, 13 Mar 2024 13:34:54 +0000 (09:34 -0400)]
net: hns3: tracing: fix hclgevf trace event strings
The __string() and __assign_str() helper macros of the TRACE_EVENT() macro
are going through some optimizations where only the source string of
__string() will be used and the __assign_str() source will be ignored and
later removed.
To make sure that there's no issues, a new check is added between the
__string() src argument and the __assign_str() src argument that does a
strcmp() to make sure they are the same string.
The hclgevf trace events have:
__assign_str(devname, &hdev->nic.kinfo.netdev->name);
Which triggers the warning:
hclgevf_trace.h:34:39: error: passing argument 1 of ‘strcmp’ from incompatible pointer type [-Werror=incompatible-pointer-types]
34 | __assign_str(devname, &hdev->nic.kinfo.netdev->name);
[..]
arch/x86/include/asm/string_64.h:75:24: note: expected ‘const char *’ but argument is of type ‘char (*)[16]’
75 | int strcmp(const char *cs, const char *ct);
| ~~~~~~~~~~~~^~
Because __assign_str() now has:
WARN_ON_ONCE(__builtin_constant_p(src) ? \
strcmp((src), __data_offsets.dst##_ptr_) : \
(src) != __data_offsets.dst##_ptr_); \
The problem is the '&' on hdev->nic.kinfo.netdev->name. That's because
that name is:
char name[IFNAMSIZ]
Where passing an address '&' of a char array is not compatible with strcmp().
The '&' is not necessary, remove it.
Link: https://lore.kernel.org/linux-trace-kernel/20240313093454.3909afe7@gandalf.local.home
Cc: netdev <netdev@vger.kernel.org>
Cc: Yisen Zhuang <yisen.zhuang@huawei.com>
Cc: Salil Mehta <salil.mehta@huawei.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Yufeng Mo <moyufeng@huawei.com>
Cc: Huazhong Tan <tanhuazhong@huawei.com>
Cc: stable@vger.kernel.org
Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Jijie Shao <shaojijie@huawei.com>
Fixes: d8355240cf8fb ("net: hns3: add trace event support for PF/VF mailbox")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt (Google) [Thu, 22 Feb 2024 18:30:57 +0000 (13:30 -0500)]
drm/i915: Add missing ; to __assign_str() macros in tracepoint code
I'm working on improving the __assign_str() and __string() macros to be
more efficient, and removed some unneeded semicolons. This triggered a bug
in the build as some of the __assign_str() macros in intel_display_trace
was missing a terminating semicolon.
Link: https://lore.kernel.org/linux-trace-kernel/20240222133057.2af72a19@gandalf.local.home
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: David Airlie <airlied@gmail.com>
Cc: stable@vger.kernel.org
Fixes: 2ceea5d88048b ("drm/i915: Print plane name in fbc tracepoints")
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt (Google) [Thu, 22 Feb 2024 17:28:28 +0000 (12:28 -0500)]
NFSD: Fix nfsd_clid_class use of __string_len() macro
I'm working on restructuring the __string* macros so that it doesn't need
to recalculate the string twice. That is, it will save it off when
processing __string() and the __assign_str() will not need to do the work
again as it currently does.
Currently __string_len(item, src, len) doesn't actually use "src", but my
changes will require src to be correct as that is where the __assign_str()
will get its value from.
The event class nfsd_clid_class has:
__string_len(name, name, clp->cl_name.len)
But the second "name" does not exist and causes my changes to fail to
build. That second parameter should be: clp->cl_name.data.
Link: https://lore.kernel.org/linux-trace-kernel/20240222122828.3d8d213c@gandalf.local.home
Cc: Neil Brown <neilb@suse.de>
Cc: Olga Kornievskaia <kolga@netapp.com>
Cc: Dai Ngo <Dai.Ngo@oracle.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: stable@vger.kernel.org
Fixes: d27b74a8675ca ("NFSD: Use new __string_len C macros for nfsd_clid_class")
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Acked-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
John Garry [Thu, 22 Feb 2024 12:46:39 +0000 (12:46 +0000)]
tracing: Use init_utsname()->release
Instead of using UTS_RELEASE, use init_utsname()->release, which means that
we don't need to rebuild the code just for the git head commit changing.
Link: https://lore.kernel.org/linux-trace-kernel/20240222124639.65629-1-john.g.garry@oracle.com
Signed-off-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Beau Belgrave [Thu, 22 Feb 2024 00:18:07 +0000 (00:18 +0000)]
tracing/user_events: Document multi-format flag
User programs can now ask user_events to handle the synchronization of
multiple different formats for an event with the same name via the new
USER_EVENT_REG_MULTI_FORMAT flag.
Add a section for USER_EVENT_REG_MULTI_FORMAT that explains the intended
purpose and caveats of using it. Explain how deletion works in these
cases and how to use /sys/kernel/tracing/dynamic_events for per-version
deletion.
Link: https://lore.kernel.org/linux-trace-kernel/20240222001807.1463-5-beaub@linux.microsoft.com
Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Beau Belgrave [Thu, 22 Feb 2024 00:18:06 +0000 (00:18 +0000)]
selftests/user_events: Test multi-format events
User_events now has multi-format events which allow for the same
register name, but with different formats. When this occurs, different
tracepoints are created with unique names.
Add a new test that ensures the same name can be used for two different
formats. Ensure they are isolated from each other and that name and arg
matching still works if yet another register comes in with the same
format as one of the two.
Link: https://lore.kernel.org/linux-trace-kernel/20240222001807.1463-4-beaub@linux.microsoft.com
Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Beau Belgrave [Thu, 22 Feb 2024 00:18:05 +0000 (00:18 +0000)]
tracing/user_events: Introduce multi-format events
Currently user_events supports 1 event with the same name and must have
the exact same format when referenced by multiple programs. This opens
an opportunity for malicious or poorly thought through programs to
create events that others use with different formats. Another scenario
is user programs wishing to use the same event name but add more fields
later when the software updates. Various versions of a program may be
running side-by-side, which is prevented by the current single format
requirement.
Add a new register flag (USER_EVENT_REG_MULTI_FORMAT) which indicates
the user program wishes to use the same user_event name, but may have
several different formats of the event. When this flag is used, create
the underlying tracepoint backing the user_event with a unique name
per-version of the format. It's important that existing ABI users do
not get this logic automatically, even if one of the multi format
events matches the format. This ensures existing programs that create
events and assume the tracepoint name will match exactly continue to
work as expected. Add logic to only check multi-format events with
other multi-format events and single-format events to only check
single-format events during find.
Change system name of the multi-format event tracepoint to ensure that
multi-format events are isolated completely from single-format events.
This prevents single-format names from conflicting with multi-format
events if they end with the same suffix as the multi-format events.
Add a register_name (reg_name) to the user_event struct which allows for
split naming of events. We now have the name that was used to register
within user_events as well as the unique name for the tracepoint. Upon
registering events ensure matches based on first the reg_name, followed
by the fields and format of the event. This allows for multiple events
with the same registered name to have different formats. The underlying
tracepoint will have a unique name in the format of {reg_name}.{unique_id}.
For example, if both "test u32 value" and "test u64 value" are used with
the USER_EVENT_REG_MULTI_FORMAT the system would have 2 unique
tracepoints. The dynamic_events file would then show the following:
u:test u64 count
u:test u32 count
The actual tracepoint names look like this:
test.0
test.1
Both would be under the new user_events_multi system name to prevent the
older ABI from being used to squat on multi-formatted events and block
their use.
Deleting events via "!u:test u64 count" would only delete the first
tracepoint that matched that format. When the delete ABI is used all
events with the same name will be attempted to be deleted. If
per-version deletion is required, user programs should either not use
persistent events or delete them via dynamic_events.
Link: https://lore.kernel.org/linux-trace-kernel/20240222001807.1463-3-beaub@linux.microsoft.com
Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Beau Belgrave [Thu, 22 Feb 2024 00:18:04 +0000 (00:18 +0000)]
tracing/user_events: Prepare find/delete for same name events
The current code for finding and deleting events assumes that there will
never be cases when user_events are registered with the same name, but
different formats. Scenarios exist where programs want to use the same
name but have different formats. An example is multiple versions of a
program running side-by-side using the same event name, but with updated
formats in each version.
This change does not yet allow for multi-format events. If user_events
are registered with the same name but different arguments the programs
see the same return values as before. This change simply makes it
possible to easily accommodate for this.
Update find_user_event() to take in argument parameters and register
flags to accommodate future multi-format event scenarios. Have find
validate argument matching and return error pointers to cover when
an existing event has the same name but different format. Update
callers to handle error pointer logic.
Move delete_user_event() to use hash walking directly now that
find_user_event() has changed. Delete all events found that match the
register name, stop if an error occurs and report back to the user.
Update user_fields_match() to cover list_empty() scenarios now that
find_user_event() uses it directly. This makes the logic consistent
across several callsites.
Link: https://lore.kernel.org/linux-trace-kernel/20240222001807.1463-2-beaub@linux.microsoft.com
Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Vincent Donnefort [Tue, 20 Feb 2024 20:23:07 +0000 (20:23 +0000)]
tracing: Add snapshot refcount
When a ring-buffer is memory mapped by user-space, no trace or
ring-buffer swap is possible. This means the snapshot feature is
mutually exclusive with the memory mapping. Having a refcount on
snapshot users will help to know if a mapping is possible or not.
Instead of relying on the global trace_types_lock, a new spinlock is
introduced to serialize accesses to trace_array->snapshot. This intends
to allow access to that variable in a context where the mmap lock is
already held.
Link: https://lore.kernel.org/linux-trace-kernel/20240220202310.2489614-4-vdonnefort@google.com
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt (Google) [Fri, 15 Mar 2024 10:31:15 +0000 (06:31 -0400)]
ring-buffer: Make wake once of ring_buffer_wait() more robust
The default behavior of ring_buffer_wait() when passed a NULL "cond"
parameter is to exit the function the first time it is woken up. The
current implementation uses a counter that starts at zero and when it is
greater than one it exits the wait_event_interruptible().
But this relies on the internal working of wait_event_interruptible() as
that code basically has:
if (cond)
return;
prepare_to_wait();
if (!cond)
schedule();
finish_wait();
That is, cond is called twice before it sleeps. The default cond of
ring_buffer_wait() needs to account for that and wait for its counter to
increment twice before exiting.
Instead, use the seq/atomic_inc logic that is used by the tracing code
that calls this function. Add an atomic_t seq to rb_irq_work and when cond
is NULL, have the default callback take a descriptor as its data that
holds the rbwork and the value of the seq when it started.
The wakeups will now increment the rbwork->seq and the cond callback will
simply check if that number is different, and no longer have to rely on
the implementation of wait_event_interruptible().
Link: https://lore.kernel.org/linux-trace-kernel/20240315063115.6cb5d205@gandalf.local.home
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Fixes: 7af9ded0c2ca ("ring-buffer: Use wait_event_interruptible() in ring_buffer_wait()")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
David Howells [Fri, 15 Mar 2024 14:48:26 +0000 (14:48 +0000)]
fscache: Fix error handling in fscache_begin_operation()
Fix fscache_begin_operation() to clear cres->cache_priv on error, otherwise
fscache_resources_valid() will report it as being valid.
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/3933237.1710514106@warthog.procyon.org.uk
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reported-by: Marc Dionne <marc.dionne@auristor.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
Christian Brauner [Thu, 14 Mar 2024 14:24:13 +0000 (15:24 +0100)]
fs,block: get holder during claim
Now that we open block devices as files we need to deal with the
realities that closing is a deferred operation. An operation on the
block device such as e.g., freeze, thaw, or removal that runs
concurrently with umount, tries to acquire a stable reference on the
holder. The holder might already be gone though. Make that reliable by
grabbing a passive reference to the holder during bdev_open() and
releasing it during bdev_release().
Fixes: f3a608827d1f ("bdev: open block device as files") # mainline only
Reported-by: Christoph Hellwig <hch@infradead.org>
Link: https://lore.kernel.org/r/ZfEQQ9jZZVes0WCZ@infradead.org
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@infradead.org>
Tested-by: Yi Zhang <yi.zhang@redhat.com>
Reported-by: https://lore.kernel.org/r/CAHj4cs8tbDwKRwfS1=DmooP73ysM__xAb2PQc6XsAmWR+VuYmg@mail.gmail.com
Link: https://lore.kernel.org/r/20240315-freibad-annehmbar-ca68c375af91@brauner
Signed-off-by: Christian Brauner <brauner@kernel.org>
Kent Overstreet [Mon, 18 Mar 2024 01:55:13 +0000 (21:55 -0400)]
bcachefs: ratelimit errors from async_btree_node_rewrite
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 18 Mar 2024 01:30:20 +0000 (21:30 -0400)]
bcachefs: Run check_topology() first
check_topology() doesn't actually require alloc info - and running it
first means other passes don't have to catch btree read errors.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 18 Mar 2024 01:51:19 +0000 (21:51 -0400)]
bcachefs: Improve bch2_fatal_error()
error messages should always include __func__
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 18 Mar 2024 04:13:34 +0000 (00:13 -0400)]
bcachefs: Fix lost transaction restart error
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 17 Mar 2024 01:22:47 +0000 (21:22 -0400)]
bcachefs: Don't corrupt journal keys gap buffer when dropping alloc info
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 16 Mar 2024 23:36:11 +0000 (19:36 -0400)]
bcachefs: fix for building in userspace
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 18 Mar 2024 00:39:42 +0000 (20:39 -0400)]
bcachefs: bch2_snapshot_is_ancestor() now safe to call in early recovery
this fixes an assertion pop in
bch2_check_snapshot_trees() ->
check_snapshot_tree() ->
bch2_snapshot_tree_master_subvol() ->
bch2_snapshot_is_ancestor()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>