Ricardo B. Marliere [Tue, 5 Mar 2024 19:49:30 +0000 (16:49 -0300)]
mmc: core: make mmc_host_class constant
Since commit
43a7206b0963 ("driver core: class: make class_register() take
a const *"), the driver core allows for struct class to be in read-only
memory, so move the mmc_host_class structure to be declared at build time
placing it into read-only memory, instead of having to be dynamically
allocated at boot time.
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net>
Link: https://lore.kernel.org/r/20240305-class_cleanup-mmc-v1-1-4a66e7122ff3@marliere.net
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Ulf Hansson [Wed, 6 Mar 2024 22:34:51 +0000 (23:34 +0100)]
mmc: Merge branch fixes into next
Merge the mmc fixes for v6.8-rc[n] into the next branch, to allow them to
get tested together with the new mmc changes that are targeted for v6.9.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Dominique Martinet [Wed, 6 Mar 2024 01:44:38 +0000 (10:44 +0900)]
mmc: core: Fix switch on gp3 partition
Commit
e7794c14fd73 ("mmc: rpmb: fixes pause retune on all RPMB
partitions.") added a mask check for 'part_type', but the mask used was
wrong leading to the code intended for rpmb also being executed for GP3.
On some MMCs (but not all) this would make gp3 partition inaccessible:
armadillo:~# head -c 1 < /dev/mmcblk2gp3
head: standard input: I/O error
armadillo:~# dmesg -c
[ 422.976583] mmc2: running CQE recovery
[ 423.058182] mmc2: running CQE recovery
[ 423.137607] mmc2: running CQE recovery
[ 423.137802] blk_update_request: I/O error, dev mmcblk2gp3, sector 0 op 0x0:(READ) flags 0x80700 phys_seg 4 prio class 0
[ 423.237125] mmc2: running CQE recovery
[ 423.318206] mmc2: running CQE recovery
[ 423.397680] mmc2: running CQE recovery
[ 423.397837] blk_update_request: I/O error, dev mmcblk2gp3, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[ 423.408287] Buffer I/O error on dev mmcblk2gp3, logical block 0, async page read
the part_type values of interest here are defined as follow:
main 0
boot0 1
boot1 2
rpmb 3
gp0 4
gp1 5
gp2 6
gp3 7
so mask with EXT_CSD_PART_CONFIG_ACC_MASK (7) to correctly identify rpmb
Fixes: e7794c14fd73 ("mmc: rpmb: fixes pause retune on all RPMB partitions.")
Cc: stable@vger.kernel.org
Cc: Jorge Ramirez-Ortiz <jorge@foundries.io>
Signed-off-by: Dominique Martinet <dominique.martinet@atmark-techno.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20240306-mmc-partswitch-v1-1-bf116985d950@codewreck.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Wolfram Sang [Tue, 5 Mar 2024 10:56:24 +0000 (11:56 +0100)]
mmc: tmio: comment the ERR_PTR usage in this driver
It is not super obvious why the driver sometimes uses an ERR_PTR for the
current mrq. Explain why in comments.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Link: https://lore.kernel.org/r/20240305105623.3379-2-wsa+renesas@sang-engineering.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Andy Shevchenko [Mon, 4 Mar 2024 18:48:30 +0000 (20:48 +0200)]
mmc: mmc_spi: Don't mention DMA direction
Since driver doesn't handle any DMA requests, drop any use of DMA bits,
such as DMA direction. Instead, use MMC_DATA_WRITE flag directly.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20240304184830.1319526-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Andy Shevchenko [Mon, 4 Mar 2024 17:56:06 +0000 (19:56 +0200)]
mmc: dw_mmc: Remove unused of_gpio.h
of_gpio.h is deprecated and subject to remove.
The driver doesn't use it, simply remove the unused header.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Sam Protsenko <semen.protsenko@linaro.org>
Link: https://lore.kernel.org/r/20240304175606.1200076-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Yang Xiwen [Thu, 29 Feb 2024 01:36:23 +0000 (09:36 +0800)]
mmc: dw_mmc: add support for hi3798mv200
Add support for Hi3798MV200 specific extension.
Signed-off-by: Yang Xiwen <forbidden405@outlook.com>
Link: https://lore.kernel.org/r/20240229-b4-mmc-hi3798mv200-v7-5-10c03f316285@outlook.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Yang Xiwen [Thu, 29 Feb 2024 01:36:22 +0000 (09:36 +0800)]
dt-bindings: mmc: hisilicon,hi3798cv200-dw-mshc: add Hi3798MV200 binding
Add binding and an extra property for Hi3798MV200 DWMMC specific extension.
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Yang Xiwen <forbidden405@outlook.com>
Link: https://lore.kernel.org/r/20240229-b4-mmc-hi3798mv200-v7-4-10c03f316285@outlook.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Yang Xiwen [Thu, 29 Feb 2024 01:36:21 +0000 (09:36 +0800)]
dt-bindings: mmc: dw-mshc-hi3798cv200: convert to YAML
convert the legacy txt binding to modern YAML and rename to
hisilicon,hi3798cv200-dw-mshc.yaml. No semantic change.
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Yang Xiwen <forbidden405@outlook.com>
Link: https://lore.kernel.org/r/20240229-b4-mmc-hi3798mv200-v7-3-10c03f316285@outlook.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Yang Xiwen [Thu, 29 Feb 2024 01:36:20 +0000 (09:36 +0800)]
mmc: dw_mmc-hi3798cv200: remove MODULE_ALIAS()
The alias is not used and should be removed.
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Yang Xiwen <forbidden405@outlook.com>
Link: https://lore.kernel.org/r/20240229-b4-mmc-hi3798mv200-v7-2-10c03f316285@outlook.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Yang Xiwen [Thu, 29 Feb 2024 01:36:19 +0000 (09:36 +0800)]
mmc: core: Use a struct device* as in-param to mmc_of_parse_clk_phase()
Parsing dt usually happens very early, sometimes even before the struct
mmc_host has been allocated (e.g. dw_mci_probe() and dw_mci_parse_dt() in
dw_mmc.c). Looking at the source of mmc_of_parse_clk_phase(), it's actually
not needed to have an initialized mmc_host, let's therefore pass a struct
device* to it instead.
Also update the only current user, sdhci-of-aspeed.
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Acked-by: Andrew Jeffery <andrew@codeconstruct.com.au>
Signed-off-by: Yang Xiwen <forbidden405@outlook.com>
Link: https://lore.kernel.org/r/20240229-b4-mmc-hi3798mv200-v7-1-10c03f316285@outlook.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Christophe JAILLET [Mon, 26 Feb 2024 21:37:39 +0000 (22:37 +0100)]
mmc: wmt-sdmmc: remove an incorrect release_mem_region() call in the .remove function
This looks strange to call release_mem_region() in a remove function
without any request_mem_region() in the probe or "struct resource"
somewhere.
So remove the corresponding code.
Fixes: 3a96dff0f828 ("mmc: SD/MMC Host Controller for Wondermedia WM8505/WM8650")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/bb0bb1ed1e18de55e8c0547625bde271e64b8c31.1708983064.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Ulf Hansson [Tue, 5 Mar 2024 11:49:56 +0000 (12:49 +0100)]
mmc: Merge branch fixes into next
Merge the mmc fixes for v6.8-rc[n] into the next branch, to allow them to
get tested together with the new mmc changes that are targeted for v6.9.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Wolfram Sang [Tue, 5 Mar 2024 10:42:56 +0000 (11:42 +0100)]
mmc: tmio: avoid concurrent runs of mmc_request_done()
With the to-be-fixed commit, the reset_work handler cleared 'host->mrq'
outside of the spinlock protected critical section. That leaves a small
race window during execution of 'tmio_mmc_reset()' where the done_work
handler could grab a pointer to the now invalid 'host->mrq'. Both would
use it to call mmc_request_done() causing problems (see link below).
However, 'host->mrq' cannot simply be cleared earlier inside the
critical section. That would allow new mrqs to come in asynchronously
while the actual reset of the controller still needs to be done. So,
like 'tmio_mmc_set_ios()', an ERR_PTR is used to prevent new mrqs from
coming in but still avoiding concurrency between work handlers.
Reported-by: Dirk Behme <dirk.behme@de.bosch.com>
Closes: https://lore.kernel.org/all/20240220061356.3001761-1-dirk.behme@de.bosch.com/
Fixes: df3ef2d3c92c ("mmc: protect the tmio_mmc driver against a theoretical race")
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Tested-by: Dirk Behme <dirk.behme@de.bosch.com>
Reviewed-by: Dirk Behme <dirk.behme@de.bosch.com>
Cc: stable@vger.kernel.org # 3.0+
Link: https://lore.kernel.org/r/20240305104423.3177-2-wsa+renesas@sang-engineering.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Ulf Hansson [Wed, 28 Feb 2024 12:42:37 +0000 (13:42 +0100)]
mmc: Merge branch fixes into next
Merge the mmc fixes for v6.8-rc[n] into the next branch, to allow them to
get tested together with the new mmc changes that are targeted for v6.9.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Elad Nachman [Thu, 22 Feb 2024 19:17:14 +0000 (21:17 +0200)]
mmc: sdhci-xenon: add timeout for PHY init complete
AC5X spec says PHY init complete bit must be polled until zero.
We see cases in which timeout can take longer than the standard
calculation on AC5X, which is expected following the spec comment above.
According to the spec, we must wait as long as it takes for that bit to
toggle on AC5X.
Cap that with 100 delay loops so we won't get stuck forever.
Fixes: 06c8b667ff5b ("mmc: sdhci-xenon: Add support to PHYs of Marvell Xenon SDHC")
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Elad Nachman <enachman@marvell.com>
Link: https://lore.kernel.org/r/20240222191714.1216470-3-enachman@marvell.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Elad Nachman [Thu, 22 Feb 2024 20:09:30 +0000 (22:09 +0200)]
mmc: sdhci-xenon: fix PHY init clock stability
Each time SD/mmc phy is initialized, at times, in some of
the attempts, phy fails to completes its initialization
which results into timeout error. Per the HW spec, it is
a pre-requisite to ensure a stable SD clock before a phy
initialization is attempted.
Fixes: 06c8b667ff5b ("mmc: sdhci-xenon: Add support to PHYs of Marvell Xenon SDHC")
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Elad Nachman <enachman@marvell.com>
Link: https://lore.kernel.org/r/20240222200930.1277665-1-enachman@marvell.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Fabio Estevam [Thu, 22 Feb 2024 14:39:11 +0000 (11:39 -0300)]
dt-bindings: mmc: fsl-imx-mmc: Document the required clocks
The fsl-imx-mmc hardware needs two clocks to operate: ipg and per.
Document these required clocks.
This fixes the following schema warning:
imx27-apf27dev.dtb: mmc@
10014000: Unevaluated properties are not allowed ('clock-names', 'clocks' were unexpected)
from schema $id: http://devicetree.org/schemas/mmc/fsl-imx-mmc.yaml#
Signed-off-by: Fabio Estevam <festevam@denx.de>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20240222143911.979058-1-festevam@gmail.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Linus Walleij [Wed, 21 Feb 2024 21:23:01 +0000 (22:23 +0100)]
mmc: sh_mmcif: Advance sg_miter before reading blocks
The introduction of sg_miter was a bit sloppy as it didn't
exactly mimic the semantics of the old code on multiblock reads
and writes: these like you to:
- Advance to the first sglist entry *before* starting to read
any blocks from the card.
- Advance and check availability of the next entry *right after*
processing one block.
Not checking if we have more sglist entries right after
reading a block will lead to this not being checked until we
return to the callback to read out more blocks, i.e. until the
next interrupt arrives. Since the last block is the last one
(no more data will arrive) there will not be a next interrupt,
and we will be waiting forever resulting in a timeout for
command 18 when reading multiple blocks.
The same bug was fixed also in the writing of multiple blocks.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Fixes: 27b57277d9ba ("mmc: sh_mmcif: Use sg_miter for PIO")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20240221-fix-sh-mmcif-v2-2-5e521eb25ae4@linaro.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Linus Walleij [Wed, 21 Feb 2024 21:23:00 +0000 (22:23 +0100)]
mmc: sh_mmcif: sg_miter must not be atomic
All the sglist iterations happen in the *threaded* interrupt handler and
that context is not atomic, so don't request an atomic sglist miter. Using
an atomic miter results in "BUG: scheduling while atomic" splats.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Fixes: 27b57277d9ba ("mmc: sh_mmcif: Use sg_miter for PIO")
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20240221-fix-sh-mmcif-v2-1-5e521eb25ae4@linaro.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Linus Walleij [Wed, 28 Feb 2024 09:07:27 +0000 (10:07 +0100)]
mmc: sdhci-esdhc-mcf: Flag the sg_miter as atomic
The sg_miter used to loop over the returned sglist from a
transfer in the esdhc subdriver for SDHCI can be called
from atomic context so the miter needs to be atomic.
sdhci_request_done() is always called from process context,
either as a work or as part of the threaded interrupt handler,
but the one case when we are actually calling .request_done()
from an atomic context is in sdhci_irq().
Fix this by flagging the miter atomic so we always use
kmap_atomic().
Fixes: e8a167b84886 ("mmc: sdhci-esdhc-mcf: Use sg_miter for swapping")
Reported-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/20240228-fix-sdhci-esdhc-mcf-2-v2-1-4ebb3fd691ea@linaro.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Marco Felsch [Wed, 21 Feb 2024 18:46:02 +0000 (19:46 +0100)]
dt-bindings: mmc: fsl-imx-esdhc: add default and 100mhz state
Some devices support only the default and the 100MHz case, add the
support for this to the binding to avoid warnings.
Signed-off-by: Marco Felsch <m.felsch@pengutronix.de>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20240221184602.3639619-1-m.felsch@pengutronix.de
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Ricardo B. Marliere [Mon, 19 Feb 2024 12:43:57 +0000 (09:43 -0300)]
mmc: core: constify the struct device_type usage
Since commit
aed65af1cc2f ("drivers: make device_type const"), the driver
core can properly handle constant struct device_type. Move the sdio_type,
sd_type and mmc_type variables to be constant structures as well, placing
it into read-only memory which can not be modified at runtime.
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net>
Link: https://lore.kernel.org/r/20240219-device_cleanup-mmc-v1-1-1910e283cf5a@marliere.net
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Jisheng Zhang [Sat, 17 Feb 2024 14:42:02 +0000 (22:42 +0800)]
mmc: sdhci-of-dwcmshc: Add support for Sophgo CV1800B and SG2002
Add support for the mmc controller in the Sophgo CV1800B and SG2002
with corresponding new compatible strings. Implement custom sdhci_ops.
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Link: https://lore.kernel.org/r/20240217144202.3808-3-jszhang@kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Jisheng Zhang [Sat, 17 Feb 2024 14:42:01 +0000 (22:42 +0800)]
dt-bindings: mmc: sdhci-of-dwcmhsc: Add Sophgo CV1800B and SG2002 support
Add compatible value for the dwcmshc controller in Sophgo's CV1800B and
SG2002.
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20240217144202.3808-2-jszhang@kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Martin Blumenstingl [Sat, 17 Feb 2024 10:02:00 +0000 (11:02 +0100)]
mmc: meson-mx-sdhc: Remove .card_hw_reset callback
Commit
32f18e596141 ("mmc: improve API to make clear hw_reset callback
is for cards") made it clear that the hw_reset callback is intended for
resetting the card. Remove the .card_hw_reset callback from the
meson-mx-sdhc-mmc driver because it's purpose is to reset the SDHC
controller (FIFOs, PHY, DMA interface, ...).
While here also rename and change the argument of meson_mx_sdhc_hw_reset
so it cannot be called by accident as a replacement for card_hw_reset in
the future.
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Link: https://lore.kernel.org/r/20240217100200.1494980-3-martin.blumenstingl@googlemail.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Martin Blumenstingl [Sat, 17 Feb 2024 10:01:59 +0000 (11:01 +0100)]
mmc: meson-mx-sdhc: Use devm_clk_hw_get_clk() for clock retrieval
Now that devm_clk_hw_get_clk() has been available for a while we can
resolve an older TODO where this API did not exist yet. No functional
changes intended.
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Link: https://lore.kernel.org/r/20240217100200.1494980-2-martin.blumenstingl@googlemail.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Linus Walleij [Thu, 15 Feb 2024 13:28:18 +0000 (14:28 +0100)]
mmc: davinci_mmc: Drop dangling variable
The sg_miter conversion left a dangling unused variable.
Drop it.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202402142042.vg0lnLdb-lkp@intel.com/
Fixes: ed01d210fd91 ("mmc: davinci_mmc: Use sg_miter for PIO")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Link: https://lore.kernel.org/r/20240215-mmc-fix-davinci-v1-1-a593678ca7bf@linaro.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Geert Uytterhoeven [Wed, 14 Feb 2024 12:59:57 +0000 (13:59 +0100)]
dt-bindings: mmc: renesas,sdhi: Document R-Car V4M support
Document support for the SD Card/MMC Interface in the Renesas R-Car V4M
(R8A779H0) SoC.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/fffc5a0a73c4cc8e8d7c5d93679531cc24e006ca.1707915511.git.geert+renesas@glider.be
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Ulf Hansson [Wed, 14 Feb 2024 10:09:02 +0000 (11:09 +0100)]
mmc: Merge branch fixes into next
Merge the mmc fixes for v6.8-rc[n] into the next branch, to allow them to
get tested together with the new mmc changes that are targeted for v6.9.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Christophe Kerello [Wed, 7 Feb 2024 14:39:51 +0000 (15:39 +0100)]
mmc: mmci: stm32: fix DMA API overlapping mappings warning
Turning on CONFIG_DMA_API_DEBUG_SG results in the following warning:
DMA-API: mmci-pl18x
48220000.mmc: cacheline tracking EEXIST,
overlapping mappings aren't supported
WARNING: CPU: 1 PID: 51 at kernel/dma/debug.c:568
add_dma_entry+0x234/0x2f4
Modules linked in:
CPU: 1 PID: 51 Comm: kworker/1:2 Not tainted 6.1.28 #1
Hardware name: STMicroelectronics STM32MP257F-EV1 Evaluation Board (DT)
Workqueue: events_freezable mmc_rescan
Call trace:
add_dma_entry+0x234/0x2f4
debug_dma_map_sg+0x198/0x350
__dma_map_sg_attrs+0xa0/0x110
dma_map_sg_attrs+0x10/0x2c
sdmmc_idma_prep_data+0x80/0xc0
mmci_prep_data+0x38/0x84
mmci_start_data+0x108/0x2dc
mmci_request+0xe4/0x190
__mmc_start_request+0x68/0x140
mmc_start_request+0x94/0xc0
mmc_wait_for_req+0x70/0x100
mmc_send_tuning+0x108/0x1ac
sdmmc_execute_tuning+0x14c/0x210
mmc_execute_tuning+0x48/0xec
mmc_sd_init_uhs_card.part.0+0x208/0x464
mmc_sd_init_card+0x318/0x89c
mmc_attach_sd+0xe4/0x180
mmc_rescan+0x244/0x320
DMA API debug brings to light leaking dma-mappings as dma_map_sg and
dma_unmap_sg are not correctly balanced.
If an error occurs in mmci_cmd_irq function, only mmci_dma_error
function is called and as this API is not managed on stm32 variant,
dma_unmap_sg is never called in this error path.
Signed-off-by: Christophe Kerello <christophe.kerello@foss.st.com>
Fixes: 46b723dd867d ("mmc: mmci: add stm32 sdmmc variant")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240207143951.938144-1-christophe.kerello@foss.st.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Ulf Hansson [Tue, 13 Feb 2024 22:51:34 +0000 (23:51 +0100)]
mmc: Merge branch fixes into next
Merge the mmc fixes for v6.8-rc[n] into the next branch, to allow them to
get tested together with the new mmc changes that are targeted for v6.9.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Ivan Semenov [Tue, 6 Feb 2024 17:28:45 +0000 (19:28 +0200)]
mmc: core: Fix eMMC initialization with 1-bit bus connection
Initializing an eMMC that's connected via a 1-bit bus is current failing,
if the HW (DT) informs that 4-bit bus is supported. In fact this is a
regression, as we were earlier capable of falling back to 1-bit mode, when
switching to 4/8-bit bus failed. Therefore, let's restore the behaviour.
Log for Samsung eMMC 5.1 chip connected via 1bit bus (only D0 pin)
Before patch:
[134509.044225] mmc0: switch to bus width 4 failed
[134509.044509] mmc0: new high speed MMC card at address 0001
[134509.054594] mmcblk0: mmc0:0001 BGUF4R 29.1 GiB
[134509.281602] mmc0: switch to bus width 4 failed
[134509.282638] I/O error, dev mmcblk0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
[134509.282657] Buffer I/O error on dev mmcblk0, logical block 0, async page read
[134509.284598] I/O error, dev mmcblk0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
[134509.284602] Buffer I/O error on dev mmcblk0, logical block 0, async page read
[134509.284609] ldm_validate_partition_table(): Disk read failed.
[134509.286495] I/O error, dev mmcblk0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
[134509.286500] Buffer I/O error on dev mmcblk0, logical block 0, async page read
[134509.288303] I/O error, dev mmcblk0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
[134509.288308] Buffer I/O error on dev mmcblk0, logical block 0, async page read
[134509.289540] I/O error, dev mmcblk0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
[134509.289544] Buffer I/O error on dev mmcblk0, logical block 0, async page read
[134509.289553] mmcblk0: unable to read partition table
[134509.289728] mmcblk0boot0: mmc0:0001 BGUF4R 31.9 MiB
[134509.290283] mmcblk0boot1: mmc0:0001 BGUF4R 31.9 MiB
[134509.294577] I/O error, dev mmcblk0, sector 0 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 2
[134509.295835] I/O error, dev mmcblk0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
[134509.295841] Buffer I/O error on dev mmcblk0, logical block 0, async page read
After patch:
[134551.089613] mmc0: switch to bus width 4 failed
[134551.090377] mmc0: new high speed MMC card at address 0001
[134551.102271] mmcblk0: mmc0:0001 BGUF4R 29.1 GiB
[134551.113365] mmcblk0: p1 p2 p3 p4 p5 p6 p7 p8 p9 p10 p11 p12 p13 p14 p15 p16 p17 p18 p19 p20 p21
[134551.114262] mmcblk0boot0: mmc0:0001 BGUF4R 31.9 MiB
[134551.114925] mmcblk0boot1: mmc0:0001 BGUF4R 31.9 MiB
Fixes: 577fb13199b1 ("mmc: rework selection of bus speed mode")
Cc: stable@vger.kernel.org
Signed-off-by: Ivan Semenov <ivan@semenov.dev>
Link: https://lore.kernel.org/r/20240206172845.34316-1-ivan@semenov.dev
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Jeffrey Hugo [Fri, 9 Feb 2024 16:09:34 +0000 (09:09 -0700)]
MAINTAINERS: Update bouncing @codeaurora addresses for EMMC CMDQ
The @codeaurora email domain's servers have been decommissioned for a
long while now, and any emails addressed there will bounce.
Asutosh has an entry in .mailmap pointing to a new address, but
MAINTAINERS still lists an old @codeaurora address. Update MAINTAINERS
to match .mailmap for anyone reading the file directly.
Ritesh appears to have changed jobs, but looks to be still active in the
community. Update Ritesh's address to the one used in recient community
postings. Also Ritesh has indicated their entry should be changed from
Maintainer (M:) to Reviewer (R:) so make that update while we are making
changes to the entry.
Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Acked-by: Ritesh Harjani <ritesh.list@gmail.com>
Link: https://lore.kernel.org/r/20240209160934.3866475-1-quic_jhugo@quicinc.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Krzysztof Kozlowski [Thu, 8 Feb 2024 20:21:37 +0000 (21:21 +0100)]
mmc: renesas_sdhi: use typedef for dma_filter_fn
Use existing typedef for dma_filter_fn to avoid duplicating type
definition.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Link: https://lore.kernel.org/r/20240208202137.630281-1-krzysztof.kozlowski@linaro.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Colin Ian King [Mon, 5 Feb 2024 19:13:10 +0000 (19:13 +0000)]
mmc: wbsd: remove redundant assignment to variable id
The variable id is being initialized with a value that is never
read, it is being re-assigned later on. The initialization is
redundant and can be removed.
Cleans up clang scan build warning:
drivers/mmc/host/wbsd.c:1287:4: warning: Value stored to 'id'
is never read [deadcode.DeadStores]
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Link: https://lore.kernel.org/r/20240205191310.1848561-1-colin.i.king@gmail.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Linus Walleij [Sat, 27 Jan 2024 00:19:56 +0000 (01:19 +0100)]
mmc: sh_mmcif: Use sg_miter for PIO
Use sg_miter iterator instead of sg_virt() and custom code
to loop over the scatterlist. The memory iterator will do
bounce buffering if the page happens to be located in high memory,
which the driver may or may not be using.
Suggested-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/linux-mmc/20240122073423.GA25859@lst.de/
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20240127-mmc-proper-kmap-v2-9-d8e732aa97d1@linaro.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Linus Walleij [Sat, 27 Jan 2024 00:19:55 +0000 (01:19 +0100)]
mmc: sdhci-esdhc-mcf: Use sg_miter for swapping
Use sg_miter iterator instead of sg_virt() and custom code
to loop over the scatterlist. The memory iterator will do
bounce buffering if the page happens to be located in high memory,
which the driver may or may not be using.
Suggested-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/linux-mmc/20240122073423.GA25859@lst.de/
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20240127-mmc-proper-kmap-v2-8-d8e732aa97d1@linaro.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Linus Walleij [Sat, 27 Jan 2024 00:19:54 +0000 (01:19 +0100)]
mmc: omap: Use sg_miter for PIO
Use the scatterlist memory iterator instead of just
dereferencing virtual memory using sg_virt().
This make highmem references work properly.
Suggested-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/linux-mmc/20240122073423.GA25859@lst.de/
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20240127-mmc-proper-kmap-v2-7-d8e732aa97d1@linaro.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Linus Walleij [Sat, 27 Jan 2024 00:19:53 +0000 (01:19 +0100)]
mmc: mxcmmc: Use sg_miter for PIO
Use the scatterlist memory iterator instead of just
dereferencing virtual memory using sg_virt().
This make highmem references work properly.
Since this driver is using a worker, no atomic trickery
is needed.
Suggested-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/linux-mmc/20240122073423.GA25859@lst.de/
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20240127-mmc-proper-kmap-v2-6-d8e732aa97d1@linaro.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Linus Walleij [Sat, 27 Jan 2024 00:19:52 +0000 (01:19 +0100)]
mmc: mvsdio: Use sg_miter for PIO
Use the scatterlist memory iterator instead of just
dereferencing virtual memory using sg_virt().
This make highmem references work properly.
This driver also has a bug in the PIO sglist handling that
is fixed as part of the patch: it does not travers the
list of scatterbuffers: it will just process the first
item in the list. This is fixed by augmenting the logic
such that we do not process more than one sgitem
per IRQ instead of counting down potentially the whole
length of the request.
We can suspect that the PIO path is quite untested.
Suggested-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/linux-mmc/20240122073423.GA25859@lst.de/
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20240127-mmc-proper-kmap-v2-5-d8e732aa97d1@linaro.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Linus Walleij [Sat, 27 Jan 2024 00:19:51 +0000 (01:19 +0100)]
mmc: moxart-mmc: Use sg_miter for PIO
Use the scatterlist memory iterator instead of just
dereferencing virtual memory using sg_virt().
This make highmem references work properly.
Suggested-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/linux-mmc/20240122073423.GA25859@lst.de/
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20240127-mmc-proper-kmap-v2-4-d8e732aa97d1@linaro.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Linus Walleij [Sat, 27 Jan 2024 00:19:50 +0000 (01:19 +0100)]
mmc: moxart-mmc: Fix accounting in DMA transfer
The whole scatterlist chain is submitted to the DMA engine,
but the code is written to just account for the length of
the first sg entry.
When the DMA transfer is finished, all the data in the
request has been transferred, account for this instead.
This only works because the moxart_request() function isn't
checking that all data was transferred and will
unconditionally issue mmc_request_done() after returning
successfully from moxart_transfer_dma().
Keep the assignment of accounted bytes in .bytes_xfered
but move it after the completion where we know it has
actually happened.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20240127-mmc-proper-kmap-v2-3-d8e732aa97d1@linaro.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Linus Walleij [Sat, 27 Jan 2024 00:19:49 +0000 (01:19 +0100)]
mmc: moxart-mmc: Factor out moxart_use_dma() helper
The same code is in two places and we will add a third place.
Break this out into its own function.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20240127-mmc-proper-kmap-v2-2-d8e732aa97d1@linaro.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Linus Walleij [Sat, 27 Jan 2024 00:19:48 +0000 (01:19 +0100)]
mmc: davinci_mmc: Use sg_miter for PIO
Use the scatterlist memory iterator instead of just
dereferencing virtual memory using sg_virt().
This make highmem references work properly.
Suggested-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/linux-mmc/20240122073423.GA25859@lst.de/
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20240127-mmc-proper-kmap-v2-1-d8e732aa97d1@linaro.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Frank Li [Thu, 1 Feb 2024 20:22:41 +0000 (15:22 -0500)]
dt-bindings: mmc: fsl-imx-esdhc: add iommus property
iMX95 and iMX8QM have smmu. Add property "iommus".
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20240201-8qm_smmu-v2-1-3d12a80201a3@nxp.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Ricardo B. Marliere [Sun, 4 Feb 2024 20:05:58 +0000 (17:05 -0300)]
memstick: core: make memstick_bus_type const
Now that the driver core can properly handle constant struct bus_type,
move the memstick_bus_type variable to be a constant structure as well,
placing it into read-only memory which can not be modified at runtime.
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20240204-bus_cleanup-memstick-v1-1-14809d4405d8@marliere.net
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Ricardo B. Marliere [Sat, 3 Feb 2024 19:02:02 +0000 (16:02 -0300)]
mmc: core: make sdio_bus_type const
Now that the driver core can properly handle constant struct bus_type,
move the sdio_bus_type variable to be a constant structure as well,
placing it into read-only memory which can not be modified at runtime.
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20240203-bus_cleanup-mmc-v1-3-ad054dce8dc3@marliere.net
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Ricardo B. Marliere [Sat, 3 Feb 2024 19:02:01 +0000 (16:02 -0300)]
mmc: core: make mmc_bus_type const
Now that the driver core can properly handle constant struct bus_type,
move the mmc_bus_type variable to be a constant structure as well,
placing it into read-only memory which can not be modified at runtime.
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20240203-bus_cleanup-mmc-v1-2-ad054dce8dc3@marliere.net
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Ricardo B. Marliere [Sat, 3 Feb 2024 19:02:00 +0000 (16:02 -0300)]
mmc: core: make mmc_rpmb_bus_type const
Now that the driver core can properly handle constant struct bus_type,
move the mmc_rpmb_bus_type variable to be a constant structure as well,
placing it into read-only memory which can not be modified at runtime.
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20240203-bus_cleanup-mmc-v1-1-ad054dce8dc3@marliere.net
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Linus Walleij [Thu, 25 Jan 2024 08:50:23 +0000 (09:50 +0100)]
mmc: core Drop BLK_BOUNCE_HIGH
The MMC core sets BLK_BOUNCE_HIGH for devices where dma_mask
is unassigned.
For the majority of MMC hosts this path is never taken: the
OF core will unconditionally assign a 32-bit mask to any
OF device, and most MMC hosts are probed from device tree,
see drivers/of/platform.c:
of_platform_device_create_pdata()
dev->dev.coherent_dma_mask = DMA_BIT_MASK(32);
if (!dev->dev.dma_mask)
dev->dev.dma_mask = &dev->dev.coherent_dma_mask;
of_amba_device_create()
dev->dev.coherent_dma_mask = DMA_BIT_MASK(32);
dev->dev.dma_mask = &dev->dev.coherent_dma_mask;
MMC devices that are probed from ACPI or PCI will likewise
have a proper dma_mask assigned.
The only remaining devices that could have a blank dma_mask
are platform devices instantiated from board files.
These are mostly used on systems without CONFIG_HIGHMEM
enabled which means the block layer will not bounce, and in
the few cases where it is enabled it is not used anyway:
for example some OMAP2 systems such as Nokia n800/n810 will
create a platform_device and not assign a dma_mask, however
they do not have any highmem, so no bouncing will happen
anyway: the block core checks if max_low_pfn >= max_pfn
and this will always be false.
Should it turn out there is a platform_device with blank
DMA mask actually using CONFIG_HIGHMEM somewhere out there
we should set dma_mask for it, not do this trickery.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20240125-mmc-no-blk-bounce-high-v1-1-d0f92a30e085@linaro.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Peng Fan [Mon, 22 Jan 2024 09:16:23 +0000 (17:16 +0800)]
dt-bindings: mmc: fsl-imx-esdhc: add i.MX95 compatible string
Same as i.MX93, add i.MX95 SDHC which is compatible with i.MX8MM USDHC.
Signed-off-by: Peng Fan <peng.fan@nxp.com>
Link: https://lore.kernel.org/r/20240122091623.2078089-1-peng.fan@oss.nxp.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Christophe JAILLET [Sun, 14 Jan 2024 15:01:57 +0000 (16:01 +0100)]
mmc: core: Remove usage of the deprecated ida_simple_xx() API
ida_alloc() and ida_free() should be preferred to the deprecated
ida_simple_get() and ida_simple_remove().
Note that the upper limit of ida_simple_get() is exclusive, but the one of
ida_alloc_range()/ida_alloc_max() is inclusive. So a -1 has been added when
needed.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/583c57d0ae09f9d3a1e1a7b80c1e39ada17954b7.1705244502.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Linus Torvalds [Sun, 11 Feb 2024 20:18:13 +0000 (12:18 -0800)]
Linux 6.8-rc4
Linus Torvalds [Sun, 11 Feb 2024 19:44:14 +0000 (11:44 -0800)]
Merge tag 'timers_urgent_for_v6.8_rc4' of git://git./linux/kernel/git/tip/tip
Pull timer fix from Borislav Petkov:
- Make sure a warning is issued when a hrtimer gets queued after the
timers have been migrated on the CPU down path and thus said timer
will get ignored
* tag 'timers_urgent_for_v6.8_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
hrtimer: Report offline hrtimer enqueue
Linus Torvalds [Sun, 11 Feb 2024 19:41:51 +0000 (11:41 -0800)]
Merge tag 'x86_urgent_for_v6.8_rc4' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
- Correct the minimum CPU family for Transmeta Crusoe in Kconfig so
that such hw can boot again
- Do not take into accout XSTATE buffer size info supplied by userspace
when constructing a sigreturn frame
- Switch get_/put_user* to EX_TYPE_UACCESS exception handling when an
MCE is encountered so that it can be properly recovered from instead
of simply panicking
* tag 'x86_urgent_for_v6.8_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/Kconfig: Transmeta Crusoe is CPU family 5, not 6
x86/fpu: Stop relying on userspace for info to fault in xsave buffer
x86/lib: Revert to _ASM_EXTABLE_UA() for {get,put}_user() fixups
Linus Torvalds [Sat, 10 Feb 2024 23:28:07 +0000 (15:28 -0800)]
Merge tag 'mm-hotfixes-stable-2024-02-10-11-16' of git://git./linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"21 hotfixes. 12 are cc:stable and the remainder pertain to post-6.7
issues or aren't considered to be needed in earlier kernel versions"
* tag 'mm-hotfixes-stable-2024-02-10-11-16' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (21 commits)
nilfs2: fix potential bug in end_buffer_async_write
mm/damon/sysfs-schemes: fix wrong DAMOS tried regions update timeout setup
nilfs2: fix hang in nilfs_lookup_dirty_data_buffers()
MAINTAINERS: Leo Yan has moved
mm/zswap: don't return LRU_SKIP if we have dropped lru lock
fs,hugetlb: fix NULL pointer dereference in hugetlbs_fill_super
mailmap: switch email address for John Moon
mm: zswap: fix objcg use-after-free in entry destruction
mm/madvise: don't forget to leave lazy MMU mode in madvise_cold_or_pageout_pte_range()
arch/arm/mm: fix major fault accounting when retrying under per-VMA lock
selftests: core: include linux/close_range.h for CLOSE_RANGE_* macros
mm/memory-failure: fix crash in split_huge_page_to_list from soft_offline_page
mm: memcg: optimize parent iteration in memcg_rstat_updated()
nilfs2: fix data corruption in dsync block recovery for small block sizes
mm/userfaultfd: UFFDIO_MOVE implementation should use ptep_get()
exit: wait_task_zombie: kill the no longer necessary spin_lock_irq(siglock)
fs/proc: do_task_stat: use sig->stats_lock to gather the threads/children stats
fs/proc: do_task_stat: move thread_group_cputime_adjusted() outside of lock_task_sighand()
getrusage: use sig->stats_lock rather than lock_task_sighand()
getrusage: move thread_group_cputime_adjusted() outside of lock_task_sighand()
...
Linus Torvalds [Sat, 10 Feb 2024 16:02:48 +0000 (08:02 -0800)]
Merge tag 'block-6.8-2024-02-10' of git://git.kernel.dk/linux
Pull block fixes from Jens Axboe:
- NVMe pull request via Keith:
- Update a potentially stale firmware attribute (Maurizio)
- Fixes for the recent verbose error logging (Keith, Chaitanya)
- Protection information payload size fix for passthrough (Francis)
- Fix for a queue freezing issue in virtblk (Yi)
- blk-iocost underflow fix (Tejun)
- blk-wbt task detection fix (Jan)
* tag 'block-6.8-2024-02-10' of git://git.kernel.dk/linux:
virtio-blk: Ensure no requests in virtqueues before deleting vqs.
blk-iocost: Fix an UBSAN shift-out-of-bounds warning
nvme: use ns->head->pi_size instead of t10_pi_tuple structure size
nvme-core: fix comment to reflect right functions
nvme: move passthrough logging attribute to head
blk-wbt: Fix detection of dirty-throttled tasks
nvme-host: fix the updating of the firmware version
Linus Torvalds [Sat, 10 Feb 2024 15:56:39 +0000 (07:56 -0800)]
Merge tag 'firewire-fixes-6.8-rc4' of git://git./linux/kernel/git/ieee1394/linux1394
Pull firewire fix from Takashi Sakamoto:
"A change to accelerate the device detection step in some cases.
In the self-identification step after bus-reset, all nodes in the same
bus broadcast selfID packet including the value of gap count. The
value is related to the cable hops between nodes, and used to
calculate the subaction gap and the arbitration reset gap.
When each node has the different value of the gap count, the
asynchronous communication between them is unreliable, since an
asynchronous transaction could be interrupted by another asynchronous
transaction before completion. The gap count inconsistency can be
resolved by several ways; e.g. the transfer of PHY configuration
packet and generation of bus-reset.
The current implementation of firewire stack can correctly detect the
gap count inconsistency, however the recovery action from the
inconsistency tends to be delayed after reading configuration ROM of
root node. This results in the long time to probe devices in some
combinations of hardware.
Here the stack is changed to schedule the action as soon as possible"
* tag 'firewire-fixes-6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394:
firewire: core: send bus reset promptly on gap count error
Linus Torvalds [Sat, 10 Feb 2024 15:53:41 +0000 (07:53 -0800)]
Merge tag '6.8-rc3-ksmbd-server-fixes' of git://git.samba.org/ksmbd
Pull smb server fixes from Steve French:
"Two ksmbd server fixes:
- memory leak fix
- a minor kernel-doc fix"
* tag '6.8-rc3-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
ksmbd: free aux buffer if ksmbd_iov_pin_rsp_read fails
ksmbd: Add kernel-doc for ksmbd_extract_sharename() function
Linus Torvalds [Sat, 10 Feb 2024 01:15:26 +0000 (17:15 -0800)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Three small driver fixes and one core fix.
The core fix being a fixup to the one in the last pull request which
didn't entirely move checking of scsi_host_busy() out from under the
host lock"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: ufs: core: Remove the ufshcd_release() in ufshcd_err_handling_prepare()
scsi: ufs: core: Fix shift issue in ufshcd_clear_cmd()
scsi: lpfc: Use unsigned type for num_sge
scsi: core: Move scsi_host_busy() out of host lock if it is for per-command
Linus Torvalds [Sat, 10 Feb 2024 01:09:30 +0000 (17:09 -0800)]
Merge tag '6.8-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:
- reconnect fix
- multichannel channel selection fix
- minor mount warning fix
- reparse point fix
- null pointer check improvement
* tag '6.8-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
smb3: clarify mount warning
cifs: handle cases where multiple sessions share connection
cifs: change tcon status when need_reconnect is set on it
smb: client: set correct d_type for reparse points under DFS mounts
smb3: add missing null server pointer check
Linus Torvalds [Sat, 10 Feb 2024 01:05:02 +0000 (17:05 -0800)]
Merge tag 'ceph-for-6.8-rc4' of https://github.com/ceph/ceph-client
Pull ceph fixes from Ilya Dryomov:
"Some fscrypt-related fixups (sparse reads are used only for encrypted
files) and two cap handling fixes from Xiubo and Rishabh"
* tag 'ceph-for-6.8-rc4' of https://github.com/ceph/ceph-client:
ceph: always check dir caps asynchronously
ceph: prevent use-after-free in encode_cap_msg()
ceph: always set initial i_blkbits to CEPH_FSCRYPT_BLOCK_SHIFT
libceph: just wait for more data to be available on the socket
libceph: rename read_sparse_msg_*() to read_partial_sparse_msg_*()
libceph: fail sparse-read if the data length doesn't match
Linus Torvalds [Sat, 10 Feb 2024 00:59:49 +0000 (16:59 -0800)]
Merge tag 'ntfs3_for_6.8' of https://github.com/Paragon-Software-Group/linux-ntfs3
Pull ntfs3 fixes from Konstantin Komarov:
"Fixed:
- size update for compressed file
- some logic errors, overflows
- memory leak
- some code was refactored
Added:
- implement super_operations::shutdown
Improved:
- alternative boot processing
- reduced stack usage"
* tag 'ntfs3_for_6.8' of https://github.com/Paragon-Software-Group/linux-ntfs3: (28 commits)
fs/ntfs3: Slightly simplify ntfs_inode_printk()
fs/ntfs3: Add ioctl operation for directories (FITRIM)
fs/ntfs3: Fix oob in ntfs_listxattr
fs/ntfs3: Fix an NULL dereference bug
fs/ntfs3: Update inode->i_size after success write into compressed file
fs/ntfs3: Fixed overflow check in mi_enum_attr()
fs/ntfs3: Correct function is_rst_area_valid
fs/ntfs3: Use i_size_read and i_size_write
fs/ntfs3: Prevent generic message "attempt to access beyond end of device"
fs/ntfs3: use non-movable memory for ntfs3 MFT buffer cache
fs/ntfs3: Use kvfree to free memory allocated by kvmalloc
fs/ntfs3: Disable ATTR_LIST_ENTRY size check
fs/ntfs3: Fix c/mtime typo
fs/ntfs3: Add NULL ptr dereference checking at the end of attr_allocate_frame()
fs/ntfs3: Add and fix comments
fs/ntfs3: ntfs3_forced_shutdown use int instead of bool
fs/ntfs3: Implement super_operations::shutdown
fs/ntfs3: Drop suid and sgid bits as a part of fpunch
fs/ntfs3: Add file_modified
fs/ntfs3: Correct use bh_read
...
Linus Torvalds [Fri, 9 Feb 2024 20:39:31 +0000 (12:39 -0800)]
work around gcc bugs with 'asm goto' with outputs
We've had issues with gcc and 'asm goto' before, and we created a
'asm_volatile_goto()' macro for that in the past: see commits
3f0116c3238a ("compiler/gcc4: Add quirk for 'asm goto' miscompilation
bug") and
a9f180345f53 ("compiler/gcc4: Make quirk for
asm_volatile_goto() unconditional").
Then, much later, we ended up removing the workaround in commit
43c249ea0b1e ("compiler-gcc.h: remove ancient workaround for gcc PR
58670") because we no longer supported building the kernel with the
affected gcc versions, but we left the macro uses around.
Now, Sean Christopherson reports a new version of a very similar
problem, which is fixed by re-applying that ancient workaround. But the
problem in question is limited to only the 'asm goto with outputs'
cases, so instead of re-introducing the old workaround as-is, let's
rename and limit the workaround to just that much less common case.
It looks like there are at least two separate issues that all hit in
this area:
(a) some versions of gcc don't mark the asm goto as 'volatile' when it
has outputs:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98619
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110420
which is easy to work around by just adding the 'volatile' by hand.
(b) Internal compiler errors:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110422
which are worked around by adding the extra empty 'asm' as a
barrier, as in the original workaround.
but the problem Sean sees may be a third thing since it involves bad
code generation (not an ICE) even with the manually added 'volatile'.
but the same old workaround works for this case, even if this feels a
bit like voodoo programming and may only be hiding the issue.
Reported-and-tested-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/all/20240208220604.140859-1-seanjc@google.com/
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Uros Bizjak <ubizjak@gmail.com>
Cc: Jakub Jelinek <jakub@redhat.com>
Cc: Andrew Pinski <quic_apinski@quicinc.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Steve French [Wed, 7 Feb 2024 05:57:18 +0000 (23:57 -0600)]
smb3: clarify mount warning
When a user tries to use the "sec=krb5p" mount parameter to encrypt
data on connection to a server (when authenticating with Kerberos), we
indicate that it is not supported, but do not note the equivalent
recommended mount parameter ("sec=krb5,seal") which turns on encryption
for that mount (and uses Kerberos for auth). Update the warning message.
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Shyam Prasad N [Tue, 6 Feb 2024 15:00:47 +0000 (15:00 +0000)]
cifs: handle cases where multiple sessions share connection
Based on our implementation of multichannel, it is entirely
possible that a server struct may not be found in any channel
of an SMB session.
In such cases, we should be prepared to move on and search for
the server struct in the next session.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Shyam Prasad N [Tue, 6 Feb 2024 15:00:46 +0000 (15:00 +0000)]
cifs: change tcon status when need_reconnect is set on it
When a tcon is marked for need_reconnect, the intention
is to have it reconnected.
This change adjusts tcon->status in cifs_tree_connect
when need_reconnect is set. Also, this change has a minor
correction in resetting need_reconnect on success. It makes
sure that it is done with tc_lock held.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Linus Torvalds [Fri, 9 Feb 2024 19:19:36 +0000 (11:19 -0800)]
Merge tag 'riscv-for-linus-6.8-rc4' of git://git./linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:
- fix missing TLB flush during early boot on SPARSEMEM_VMEMMAP
configurations
- fixes to correctly implement the break-before-make behavior requried
by the ISA for NAPOT mappings
- fix a missing TLB flush on intermediate mapping changes
- fix build warning about a missing declaration of overflow_stack
- fix performace regression related to incorrect tracking of completed
batch TLB flushes
* tag 'riscv-for-linus-6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: Fix arch_tlbbatch_flush() by clearing the batch cpumask
riscv: declare overflow_stack as exported from traps.c
riscv: Fix arch_hugetlb_migration_supported() for NAPOT
riscv: Flush the tlb when a page directory is freed
riscv: Fix hugetlb_mask_last_page() when NAPOT is enabled
riscv: Fix set_huge_pte_at() for NAPOT mapping
riscv: mm: execute local TLB flush after populating vmemmap
Linus Torvalds [Fri, 9 Feb 2024 19:13:19 +0000 (11:13 -0800)]
Merge tag 'trace-v6.8-rc3' of git://git./linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:
- Fix broken direct trampolines being called when another callback is
attached the same function.
ARM 64 does not support FTRACE_WITH_REGS, and when it added direct
trampoline calls from ftrace, it removed the "WITH_REGS" flag from
the ftrace_ops for direct trampolines. This broke x86 as x86 requires
direct trampolines to have WITH_REGS.
This wasn't noticed because direct trampolines work as long as the
function it is attached to is not shared with other callbacks (like
the function tracer). When there are other callbacks, a helper
trampoline is called, to call all the non direct callbacks and when
it returns, the direct trampoline is called.
For x86, the direct trampoline sets a flag in the regs field to tell
the x86 specific code to call the direct trampoline. But this only
works if the ftrace_ops had WITH_REGS set. ARM does things
differently that does not require this. For now, set WITH_REGS if the
arch supports WITH_REGS (which ARM does not), and this makes it work
for both ARM64 and x86.
- Fix wasted memory in the saved_cmdlines logic.
The saved_cmdlines is a cache that maps PIDs to COMMs that tracing
can use. Most trace events only save the PID in the event. The
saved_cmdlines file lists PIDs to COMMs so that the tracing tools can
show an actual name and not just a PID for each event. There's an
array of PIDs that map to a small set of saved COMM strings. The
array is set to PID_MAX_DEFAULT which is usually set to 32768. When a
PID comes in, it will add itself to this array along with the index
into the COMM array (note if the system allows more than
PID_MAX_DEFAULT, this cache is similar to cache lines as an update of
a PID that has the same PID_MAX_DEFAULT bits set will flush out
another task with the same matching bits set).
A while ago, the size of this cache was changed to be dynamic and the
array was moved into a structure and created with kmalloc(). But this
new structure had the size of 131104 bytes, or 0x20020 in hex. As
kmalloc allocates in powers of two, it was actually allocating
0x40000 bytes (262144) leaving 131040 bytes of wasted memory. The
last element of this structure was a pointer to the COMM string array
which defaulted to just saving 128 COMMs.
By changing the last field of this structure to a variable length
string, and just having it round up to fill the allocated memory, the
default size of the saved COMM cache is now 8190. This not only uses
the wasted space, but actually saves space by removing the extra
allocation for the COMM names.
* tag 'trace-v6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing: Fix wasted memory in saved_cmdlines logic
ftrace: Fix DIRECT_CALLS to use SAVE_REGS by default
Linus Torvalds [Fri, 9 Feb 2024 19:04:26 +0000 (11:04 -0800)]
Merge tag 'probes-fixes-v6.8-rc3' of git://git./linux/kernel/git/trace/linux-trace
Pull probes fixes from Masami Hiramatsu:
- remove unnecessary initial values of kprobes local variables
- probe-events parser bug fixes:
- calculate the argument size and format string after setting type
information from BTF, because BTF can change the size and format
string.
- show $comm parse error correctly instead of failing silently.
* tag 'probes-fixes-v6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
kprobes: Remove unnecessary initial values of variables
tracing/probes: Fix to set arg size and fmt after setting type from BTF
tracing/probes: Fix to show a parse error for bad type for $comm
Linus Torvalds [Fri, 9 Feb 2024 18:40:50 +0000 (10:40 -0800)]
Merge tag 'efi-fixes-for-v6.8-1' of git://git./linux/kernel/git/efi/efi
Pull EFI fixes from Ard Biesheuvel:
"The only notable change here is the patch that changes the way we deal
with spurious errors from the EFI memory attribute protocol. This will
be backported to v6.6, and is intended to ensure that we will not
paint ourselves into a corner when we tighten this further in order to
comply with MS requirements on signed EFI code.
Note that this protocol does not currently exist in x86 production
systems in the field, only in Microsoft's fork of OVMF, but it will be
mandatory for Windows logo certification for x86 PCs in the future.
- Tighten ELF relocation checks on the RISC-V EFI stub
- Give up if the new EFI memory attributes protocol fails spuriously
on x86
- Take care not to place the kernel in the lowest 16 MB of DRAM on
x86
- Omit special purpose EFI memory from memblock
- Some fixes for the CXL CPER reporting code
- Make the PE/COFF layout of mixed-mode capable images comply with a
strict interpretation of the spec"
* tag 'efi-fixes-for-v6.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
x86/efistub: Use 1:1 file:memory mapping for PE/COFF .compat section
cxl/trace: Remove unnecessary memcpy's
cxl/cper: Fix errant CPER prints for CXL events
efi: Don't add memblocks for soft-reserved memory
efi: runtime: Fix potential overflow of soft-reserved region size
efi/libstub: Add one kernel-doc comment
x86/efistub: Avoid placing the kernel below LOAD_PHYSICAL_ADDR
x86/efistub: Give up if memory attribute protocol returns an error
riscv/efistub: Tighten ELF relocation check
riscv/efistub: Ensure GP-relative addressing is not used
Linus Torvalds [Fri, 9 Feb 2024 18:37:59 +0000 (10:37 -0800)]
Merge tag 'pci-v6.8-fixes-2' of git://git./linux/kernel/git/pci/pci
Pull pci fixes from Bjorn Helgaas:
- Fix an unintentional truncation of DWC MSI-X address to 32 bits and
update similar MSI code to match (Dan Carpenter)
* tag 'pci-v6.8-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
PCI: dwc: Clean up dw_pcie_ep_raise_msi_irq() alignment
PCI: dwc: Fix a 64bit bug in dw_pcie_ep_raise_msix_irq()
Linus Torvalds [Fri, 9 Feb 2024 18:35:39 +0000 (10:35 -0800)]
Merge tag 'hwmon-for-v6.8-rc4' of git://git./linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
- coretemp: Various fixes, and increase number of supported CPU cores
- aspeed-pwm-tacho: Add missing mutex protection
* tag 'hwmon-for-v6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (coretemp) Enlarge per package core count limit
hwmon: (coretemp) Fix bogus core_id to attr name mapping
hwmon: (coretemp) Fix out-of-bounds memory access
hwmon: (aspeed-pwm-tacho) mutex for tach reading
Linus Torvalds [Fri, 9 Feb 2024 18:33:54 +0000 (10:33 -0800)]
Merge tag 'mmc-v6.8-rc2' of git://git./linux/kernel/git/ulfh/mmc
Pull MMC fixes from Ulf Hansson:
"MMC core:
- Allow non-sleeping read-only slot-gpio
MMC host:
- sdhci-pci-o2micro: Fix a warm reboot BIOS issue"
* tag 'mmc-v6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: slot-gpio: Allow non-sleeping GPIO ro
mmc: sdhci-pci-o2micro: Fix a warm reboot issue that disk can't be detected by BIOS
Linus Torvalds [Fri, 9 Feb 2024 18:29:50 +0000 (10:29 -0800)]
Merge tag 'pmdomain-v6.8-rc1' of git://git./linux/kernel/git/ulfh/linux-pm
Pull pmdomain fixes from Ulf Hansson:
"Core:
- Move the unused cleanup to a _sync initcall
Providers:
- mediatek: Fix race conditions at probe/remove with genpd
- renesas: r8a77980-sysc: CR7 must be always on"
* tag 'pmdomain-v6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm:
pmdomain: mediatek: fix race conditions with genpd
pmdomain: renesas: r8a77980-sysc: CR7 must be always on
pmdomain: core: Move the unused cleanup to a _sync initcall
Linus Torvalds [Fri, 9 Feb 2024 18:27:56 +0000 (10:27 -0800)]
Merge tag 'gpio-fixes-for-v6.8-rc4' of git://git./linux/kernel/git/brgl/linux
Pull gpio fix from Bartosz Golaszewski:
- remove the new GPIO device from the global list unconditionally in
error path in core GPIOLIB
* tag 'gpio-fixes-for-v6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpio: remove GPIO device from the list unconditionally in error path
Linus Torvalds [Fri, 9 Feb 2024 17:57:12 +0000 (09:57 -0800)]
Merge tag 'drm-fixes-2024-02-09' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Regular weekly fixes, xe, amdgpu and msm are most of them, with some
misc in i915, ivpu and nouveau, scattered but nothing too intense at
this point.
i915:
- gvt: docs fix, uninit var, MAINTAINERS
ivpu:
- add aborted job status
- disable d3 hot delay
- mmu fixes
nouveau:
- fix gsp rpc size request
- fix dma buffer leaks
- use common code for gsp mem ctor
xe:
- Fix a loop in an error path
- Fix a missing dma-fence reference
- Fix a retry path on userptr REMAP
- Workaround for a false gcc warning
- Fix missing map of the usm batch buffer in the migrate vm.
- Fix a memory leak.
- Fix a bad assumption of used page size
- Fix hitting a BUG() due to zero pages to map.
- Remove some leftover async bind queue relics
amdgpu:
- Misc NULL/bounds check fixes
- ODM pipe policy fix
- Aborted suspend fixes
- JPEG 4.0.5 fix
- DCN 3.5 fixes
- PSP fix
- DP MST fix
- Phantom pipe fix
- VRAM vendor fix
- Clang fix
- SR-IOV fix
msm:
- DPU:
- fix for kernel doc warnings and smatch warnings in dpu_encoder
- fix for smatch warning in dpu_encoder
- fix the bus bandwidth value for SDM670
- DP:
- fixes to handle unknown bpc case correctly for DP
- fix for MISC0 programming
- GPU:
- dmabuf vmap fix
- a610 UBWC corruption fix (incorrect hbb)
- revert a commit that was making GPU recovery unreliable"
* tag 'drm-fixes-2024-02-09' of git://anongit.freedesktop.org/drm/drm: (43 commits)
drm/xe: Remove TEST_VM_ASYNC_OPS_ERROR
drm/xe/vm: don't ignore error when in_kthread
drm/xe: Assume large page size if VMA not yet bound
drm/xe/display: Fix memleak in display initialization
drm/xe: Map both mem.kernel_bb_pool and usm.bb_pool
drm/xe: circumvent bogus stringop-overflow warning
drm/xe: Pick correct userptr VMA to repin on REMAP op failure
drm/xe: Take a reference in xe_exec_queue_last_fence_get()
drm/xe: Fix loop in vm_bind_ioctl_ops_unwind
drm/amdgpu: Fix HDP flush for VFs on nbio v7.9
drm/amd/display: Implement bounds check for stream encoder creation in DCN301
drm/amd/display: Increase frame-larger-than for all display_mode_vba files
drm/amd/display: Clear phantom stream count and plane count
drm/amdgpu: Avoid fetching VRAM vendor info
drm/amd/display: Disable ODM by default for DCN35
drm/amd/display: Update phantom pipe enable / disable sequence
drm/amd/display: Fix MST Null Ptr for RV
drm/amdgpu: Fix shared buff copy to user
drm/amd/display: Increase eval/entry delay for DCN35
drm/amdgpu: remove asymmetrical irq disabling in jpeg 4.0.5 suspend
...
Aleksander Mazur [Tue, 23 Jan 2024 13:43:00 +0000 (14:43 +0100)]
x86/Kconfig: Transmeta Crusoe is CPU family 5, not 6
The kernel built with MCRUSOE is unbootable on Transmeta Crusoe. It shows
the following error message:
This kernel requires an i686 CPU, but only detected an i586 CPU.
Unable to boot - please use a kernel appropriate for your CPU.
Remove MCRUSOE from the condition introduced in commit in Fixes, effectively
changing X86_MINIMUM_CPU_FAMILY back to 5 on that machine, which matches the
CPU family given by CPUID.
[ bp: Massage commit message. ]
Fixes: 25d76ac88821 ("x86/Kconfig: Explicitly enumerate i686-class CPUs in Kconfig")
Signed-off-by: Aleksander Mazur <deweloper@wp.pl>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: <stable@kernel.org>
Link: https://lore.kernel.org/r/20240123134309.1117782-1-deweloper@wp.pl
Steven Rostedt (Google) [Fri, 9 Feb 2024 11:36:22 +0000 (06:36 -0500)]
tracing: Fix wasted memory in saved_cmdlines logic
While looking at improving the saved_cmdlines cache I found a huge amount
of wasted memory that should be used for the cmdlines.
The tracing data saves pids during the trace. At sched switch, if a trace
occurred, it will save the comm of the task that did the trace. This is
saved in a "cache" that maps pids to comms and exposed to user space via
the /sys/kernel/tracing/saved_cmdlines file. Currently it only caches by
default 128 comms.
The structure that uses this creates an array to store the pids using
PID_MAX_DEFAULT (which is usually set to 32768). This causes the structure
to be of the size of 131104 bytes on 64 bit machines.
In hex: 131104 = 0x20020, and since the kernel allocates generic memory in
powers of two, the kernel would allocate 0x40000 or 262144 bytes to store
this structure. That leaves 131040 bytes of wasted space.
Worse, the structure points to an allocated array to store the comm names,
which is 16 bytes times the amount of names to save (currently 128), which
is 2048 bytes. Instead of allocating a separate array, make the structure
end with a variable length string and use the extra space for that.
This is similar to a recommendation that Linus had made about eventfs_inode names:
https://lore.kernel.org/all/
20240130190355.11486-5-torvalds@linux-foundation.org/
Instead of allocating a separate string array to hold the saved comms,
have the structure end with: char saved_cmdlines[]; and round up to the
next power of two over sizeof(struct saved_cmdline_buffers) + num_cmdlines * TASK_COMM_LEN
It will use this extra space for the saved_cmdline portion.
Now, instead of saving only 128 comms by default, by using this wasted
space at the end of the structure it can save over 8000 comms and even
saves space by removing the need for allocating the other array.
Link: https://lore.kernel.org/linux-trace-kernel/20240209063622.1f7b6d5f@rorschach.local.home
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Vincent Donnefort <vdonnefort@google.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Mete Durlu <meted@linux.ibm.com>
Fixes: 939c7a4f04fcd ("tracing: Introduce saved_cmdlines_size file")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Masami Hiramatsu (Google) [Wed, 10 Jan 2024 00:13:06 +0000 (09:13 +0900)]
ftrace: Fix DIRECT_CALLS to use SAVE_REGS by default
The commit
60c8971899f3 ("ftrace: Make DIRECT_CALLS work WITH_ARGS
and !WITH_REGS") changed DIRECT_CALLS to use SAVE_ARGS when there
are multiple ftrace_ops at the same function, but since the x86 only
support to jump to direct_call from ftrace_regs_caller, when we set
the function tracer on the same target function on x86, ftrace-direct
does not work as below (this actually works on arm64.)
At first, insmod ftrace-direct.ko to put a direct_call on
'wake_up_process()'.
# insmod kernel/samples/ftrace/ftrace-direct.ko
# less trace
...
<idle>-0 [006] ..s1. 564.686958: my_direct_func: waking up rcu_preempt-17
<idle>-0 [007] ..s1. 564.687836: my_direct_func: waking up kcompactd0-63
<idle>-0 [006] ..s1. 564.690926: my_direct_func: waking up rcu_preempt-17
<idle>-0 [006] ..s1. 564.696872: my_direct_func: waking up rcu_preempt-17
<idle>-0 [007] ..s1. 565.191982: my_direct_func: waking up kcompactd0-63
Setup a function filter to the 'wake_up_process' too, and enable it.
# cd /sys/kernel/tracing/
# echo wake_up_process > set_ftrace_filter
# echo function > current_tracer
# less trace
...
<idle>-0 [006] ..s3. 686.180972: wake_up_process <-call_timer_fn
<idle>-0 [006] ..s3. 686.186919: wake_up_process <-call_timer_fn
<idle>-0 [002] ..s3. 686.264049: wake_up_process <-call_timer_fn
<idle>-0 [002] d.h6. 686.515216: wake_up_process <-kick_pool
<idle>-0 [002] d.h6. 686.691386: wake_up_process <-kick_pool
Then, only function tracer is shown on x86.
But if you enable 'kprobe on ftrace' event (which uses SAVE_REGS flag)
on the same function, it is shown again.
# echo 'p wake_up_process' >> dynamic_events
# echo 1 > events/kprobes/p_wake_up_process_0/enable
# echo > trace
# less trace
...
<idle>-0 [006] ..s2. 2710.345919: p_wake_up_process_0: (wake_up_process+0x4/0x20)
<idle>-0 [006] ..s3. 2710.345923: wake_up_process <-call_timer_fn
<idle>-0 [006] ..s1. 2710.345928: my_direct_func: waking up rcu_preempt-17
<idle>-0 [006] ..s2. 2710.349931: p_wake_up_process_0: (wake_up_process+0x4/0x20)
<idle>-0 [006] ..s3. 2710.349934: wake_up_process <-call_timer_fn
<idle>-0 [006] ..s1. 2710.349937: my_direct_func: waking up rcu_preempt-17
To fix this issue, use SAVE_REGS flag for multiple ftrace_ops flag of
direct_call by default.
Link: https://lore.kernel.org/linux-trace-kernel/170484558617.178953.1590516949390270842.stgit@devnote2
Fixes: 60c8971899f3 ("ftrace: Make DIRECT_CALLS work WITH_ARGS and !WITH_REGS")
Cc: stable@vger.kernel.org
Cc: Florent Revest <revest@chromium.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com> [arm64]
Acked-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Dave Airlie [Fri, 9 Feb 2024 01:32:38 +0000 (11:32 +1000)]
Merge tag 'drm-msm-fixes-2024-02-07' of https://gitlab.freedesktop.org/drm/msm into drm-fixes
Fixes for v6.8-rc4
DPU:
- fix for kernel doc warnings and smatch warnings in dpu_encoder
- fix for smatch warning in dpu_encoder
- fix the bus bandwidth value for SDM670
DP:
- fixes to handle unknown bpc case correctly for DP. The current code was
spilling over into other bits of DP configuration register, had to be
fixed to avoid the extra shifts which were causing the spill over
- fix for MISC0 programming in DP driver to program the correct
colorimetry value
GPU:
- dmabuf vmap fix
- a610 UBWC corruption fix (incorrect hbb)
- revert a commit that was making GPU recovery unreliable
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rob Clark <robdclark@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGv+tb1+_cp7ftxcMZbbxE9810rvxeaC50eL=msQ+zkm0g@mail.gmail.com
Dave Airlie [Fri, 9 Feb 2024 01:21:16 +0000 (11:21 +1000)]
Merge tag 'amd-drm-fixes-6.8-2024-02-08' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-6.8-2024-02-08:
amdgpu:
- Misc NULL/bounds check fixes
- ODM pipe policy fix
- Aborted suspend fixes
- JPEG 4.0.5 fix
- DCN 3.5 fixes
- PSP fix
- DP MST fix
- Phantom pipe fix
- VRAM vendor fix
- Clang fix
- SR-IOV fix
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240208165500.4887-1-alexander.deucher@amd.com
Dave Airlie [Fri, 9 Feb 2024 01:17:57 +0000 (11:17 +1000)]
Merge tag 'drm-intel-fixes-2024-02-08' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
- Just includes gvt-fixes-2024-02-05
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZcTETgXsejwVwat6@jlahtine-mobl.ger.corp.intel.com
Dave Airlie [Fri, 9 Feb 2024 01:12:01 +0000 (11:12 +1000)]
Merge tag 'drm-xe-fixes-2024-02-08' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes
Driver Changes:
- Fix a loop in an error path
- Fix a missing dma-fence reference
- Fix a retry path on userptr REMAP
- Workaround for a false gcc warning
- Fix missing map of the usm batch buffer
in the migrate vm.
- Fix a memory leak.
- Fix a bad assumption of used page size
- Fix hitting a BUG() due to zero pages to map.
- Remove some leftover async bind queue relics
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZcS2LllawGifubsk@fedora
Dave Airlie [Fri, 9 Feb 2024 00:52:43 +0000 (10:52 +1000)]
Merge tag 'drm-misc-fixes-2024-02-08' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
A null pointer dereference fix for v3d, a TTM pool initialization fix,
several fixes for nouveau around register size, DMA buffer leaks and API
consistency, a multiple fixes for ivpu around MMU setup, initialization
and firmware interactions.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/4wsi2i6kgkqdu7nzp4g7hxasbswnrmc5cakgf5zzvnix53u7lr@4rmp7hwblow3
Linus Torvalds [Thu, 8 Feb 2024 23:09:29 +0000 (15:09 -0800)]
Merge tag 'net-6.8-rc4' of git://git./linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from WiFi and netfilter.
Current release - regressions:
- nic: intel: fix old compiler regressions
- netfilter: ipset: missing gc cancellations fixed
Current release - new code bugs:
- netfilter: ctnetlink: fix filtering for zone 0
Previous releases - regressions:
- core: fix from address in memcpy_to_iter_csum()
- netfilter: nfnetlink_queue: un-break NF_REPEAT
- af_unix: fix memory leak for dead unix_(sk)->oob_skb in GC.
- devlink: avoid potential loop in devlink_rel_nested_in_notify_work()
- iwlwifi:
- mvm: fix a battery life regression
- fix double-free bug
- mac80211: fix waiting for beacons logic
- nic: nfp: flower: prevent re-adding mac index for bonded port
Previous releases - always broken:
- rxrpc: fix generation of serial numbers to skip zero
- tipc: check the bearer type before calling tipc_udp_nl_bearer_add()
- tunnels: fix out of bounds access when building IPv6 PMTU error
- nic: hv_netvsc: register VF in netvsc_probe if NET_DEVICE_REGISTER
missed
- nic: atlantic: fix DMA mapping for PTP hwts ring
Misc:
- selftests: more fixes to deal with very slow hosts"
* tag 'net-6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (80 commits)
netfilter: nft_set_pipapo: remove scratch_aligned pointer
netfilter: nft_set_pipapo: add helper to release pcpu scratch area
netfilter: nft_set_pipapo: store index in scratch maps
netfilter: nft_set_rbtree: skip end interval element from gc
netfilter: nfnetlink_queue: un-break NF_REPEAT
netfilter: nf_tables: use timestamp to check for set element timeout
netfilter: nft_ct: reject direction for ct id
netfilter: ctnetlink: fix filtering for zone 0
s390/qeth: Fix potential loss of L3-IP@ in case of network issues
netfilter: ipset: Missing gc cancellations fixed
octeontx2-af: Initialize maps.
net: ethernet: ti: cpsw: enable mac_managed_pm to fix mdio
net: ethernet: ti: cpsw_new: enable mac_managed_pm to fix mdio
netfilter: nft_set_pipapo: remove static in nft_pipapo_get()
netfilter: nft_compat: restrict match/target protocol to u16
netfilter: nft_compat: reject unused compat flag
netfilter: nft_compat: narrow down revision to unsigned 8-bits
net: intel: fix old compiler regressions
MAINTAINERS: Maintainer change for rds
selftests: cmsg_ipv6: repeat the exact packet
...
Linus Torvalds [Thu, 8 Feb 2024 23:07:06 +0000 (15:07 -0800)]
Merge tag 'pinctrl-v6.8-2' of git://git./linux/kernel/git/linusw/linux-pinctrl
Pull pinctrl fix from Linus Walleij:
"A single fix for the AMD driver which affects developer laptops, the
pinctrl/GPIO driver won't probe on some systems"
* tag 'pinctrl-v6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: amd: Add IRQF_ONESHOT to the interrupt request
Jens Axboe [Thu, 8 Feb 2024 22:05:18 +0000 (15:05 -0700)]
Merge tag 'nvme-6.8-2023-02-08' of git://git.infradead.org/nvme into block-6.8
Pull NVMe fixes from Keith:
"nvme fixes for Linux 6.8
- Update a potentially stale firmware attribute (Maurizio)
- Fixes for the recent verbose error logging (Keith, Chaitanya)
- Protection information payload size fix for passthrough (Francis)"
* tag 'nvme-6.8-2023-02-08' of git://git.infradead.org/nvme:
nvme: use ns->head->pi_size instead of t10_pi_tuple structure size
nvme-core: fix comment to reflect right functions
nvme: move passthrough logging attribute to head
nvme-host: fix the updating of the firmware version
Yi Sun [Mon, 29 Jan 2024 08:52:50 +0000 (16:52 +0800)]
virtio-blk: Ensure no requests in virtqueues before deleting vqs.
Ensure no remaining requests in virtqueues before resetting vdev and
deleting virtqueues. Otherwise these requests will never be completed.
It may cause the system to become unresponsive.
Function blk_mq_quiesce_queue() can ensure that requests have become
in_flight status, but it cannot guarantee that requests have been
processed by the device. Virtqueues should never be deleted before
all requests become complete status.
Function blk_mq_freeze_queue() ensure that all requests in virtqueues
become complete status. And no requests can enter in virtqueues.
Signed-off-by: Yi Sun <yi.sun@unisoc.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Link: https://lore.kernel.org/r/20240129085250.1550594-1-yi.sun@unisoc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Tejun Heo [Mon, 20 Nov 2023 22:25:56 +0000 (12:25 -1000)]
blk-iocost: Fix an UBSAN shift-out-of-bounds warning
When iocg_kick_delay() is called from a CPU different than the one which set
the delay, @now may be in the past of @iocg->delay_at leading to the
following warning:
UBSAN: shift-out-of-bounds in block/blk-iocost.c:1359:23
shift exponent
18446744073709 is too large for 64-bit type 'u64' (aka 'unsigned long long')
...
Call Trace:
<TASK>
dump_stack_lvl+0x79/0xc0
__ubsan_handle_shift_out_of_bounds+0x2ab/0x300
iocg_kick_delay+0x222/0x230
ioc_rqos_merge+0x1d7/0x2c0
__rq_qos_merge+0x2c/0x80
bio_attempt_back_merge+0x83/0x190
blk_attempt_plug_merge+0x101/0x150
blk_mq_submit_bio+0x2b1/0x720
submit_bio_noacct_nocheck+0x320/0x3e0
__swap_writepage+0x2ab/0x9d0
The underflow itself doesn't really affect the behavior in any meaningful
way; however, the past timestamp may exaggerate the delay amount calculated
later in the code, which shouldn't be a material problem given the nature of
the delay mechanism.
If @now is in the past, this CPU is racing another CPU which recently set up
the delay and there's nothing this CPU can contribute w.r.t. the delay.
Let's bail early from iocg_kick_delay() in such cases.
Reported-by: Breno Leitão <leitao@debian.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
Fixes: 5160a5a53c0c ("blk-iocost: implement delay adjustment hysteresis")
Link: https://lore.kernel.org/r/ZVvc9L_CYk5LO1fT@slm.duckdns.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Paulo Alcantara [Fri, 2 Feb 2024 15:38:24 +0000 (12:38 -0300)]
smb: client: set correct d_type for reparse points under DFS mounts
Send query dir requests with an info level of
SMB_FIND_FILE_FULL_DIRECTORY_INFO rather than
SMB_FIND_FILE_DIRECTORY_INFO when the client is generating its own
inode numbers (e.g. noserverino) so that reparse tags still
can be parsed directly from the responses, but server won't
send UniqueId (server inode number)
Signed-off-by: Paulo Alcantara <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Steve French [Mon, 5 Feb 2024 20:43:17 +0000 (14:43 -0600)]
smb3: add missing null server pointer check
Address static checker warning in cifs_ses_get_chan_index():
warn: variable dereferenced before check 'server'
To be consistent, and reduce risk, we should add another check
for null server pointer.
Fixes: 88675b22d34e ("cifs: do not search for channel if server is terminating")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Li zeming [Tue, 19 Sep 2023 01:28:23 +0000 (09:28 +0800)]
kprobes: Remove unnecessary initial values of variables
ri and sym is assigned first, so it does not need to initialize the
assignment.
Link: https://lore.kernel.org/all/20230919012823.7815-1-zeming@nfschina.com/
Signed-off-by: Li zeming <zeming@nfschina.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Masami Hiramatsu (Google) [Tue, 23 Jan 2024 15:03:02 +0000 (00:03 +0900)]
tracing/probes: Fix to set arg size and fmt after setting type from BTF
Since the BTF type setting updates probe_arg::type, the type size
calculation and setting print-fmt should be done after that.
Without this fix, the argument size and print-fmt can be wrong.
Link: https://lore.kernel.org/all/170602218196.215583.6417859469540955777.stgit@devnote2/
Fixes: b576e09701c7 ("tracing/probes: Support function parameters if BTF is available")
Cc: stable@vger.kernel.org
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Masami Hiramatsu (Google) [Tue, 23 Jan 2024 15:02:34 +0000 (00:02 +0900)]
tracing/probes: Fix to show a parse error for bad type for $comm
Fix to show a parse error for bad type (non-string) for $comm/$COMM and
immediate-string. With this fix, error_log file shows appropriate error
message as below.
/sys/kernel/tracing # echo 'p vfs_read $comm:u32' >> kprobe_events
sh: write error: Invalid argument
/sys/kernel/tracing # echo 'p vfs_read \"hoge":u32' >> kprobe_events
sh: write error: Invalid argument
/sys/kernel/tracing # cat error_log
[ 30.144183] trace_kprobe: error: $comm and immediate-string only accepts string type
Command: p vfs_read $comm:u32
^
[ 62.618500] trace_kprobe: error: $comm and immediate-string only accepts string type
Command: p vfs_read \"hoge":u32
^
Link: https://lore.kernel.org/all/170602215411.215583.2238016352271091852.stgit@devnote2/
Fixes: 3dd1f7f24f8c ("tracing: probeevent: Fix to make the type of $comm string")
Cc: stable@vger.kernel.org
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Paolo Abeni [Thu, 8 Feb 2024 11:56:39 +0000 (12:56 +0100)]
Merge tag 'nf-24-02-08' of git://git./linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net:
1) Narrow down target/match revision to u8 in nft_compat.
2) Bail out with unused flags in nft_compat.
3) Restrict layer 4 protocol to u16 in nft_compat.
4) Remove static in pipapo get command that slipped through when
reducing set memory footprint.
5) Follow up incremental fix for the ipset performance regression,
this includes the missing gc cancellation, from Jozsef Kadlecsik.
6) Allow to filter by zone 0 in ctnetlink, do not interpret zone 0
as no filtering, from Felix Huettner.
7) Reject direction for NFT_CT_ID.
8) Use timestamp to check for set element expiration while transaction
is handled to prevent garbage collection from removing set elements
that were just added by this transaction. Packet path and netlink
dump/get path still use current time to check for expiration.
9) Restore NF_REPEAT in nfnetlink_queue, from Florian Westphal.
10) map_index needs to be percpu and per-set, not just percpu.
At this time its possible for a pipapo set to fill the all-zero part
with ones and take the 'might have bits set' as 'start-from-zero' area.
From Florian Westphal. This includes three patches:
- Change scratchpad area to a structure that provides space for a
per-set-and-cpu toggle and uses it of the percpu one.
- Add a new free helper to prepare for the next patch.
- Remove the scratch_aligned pointer and makes AVX2 implementation
use the exact same memory addresses for read/store of the matching
state.
netfilter pull request 24-02-08
* tag 'nf-24-02-08' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: nft_set_pipapo: remove scratch_aligned pointer
netfilter: nft_set_pipapo: add helper to release pcpu scratch area
netfilter: nft_set_pipapo: store index in scratch maps
netfilter: nft_set_rbtree: skip end interval element from gc
netfilter: nfnetlink_queue: un-break NF_REPEAT
netfilter: nf_tables: use timestamp to check for set element timeout
netfilter: nft_ct: reject direction for ct id
netfilter: ctnetlink: fix filtering for zone 0
netfilter: ipset: Missing gc cancellations fixed
netfilter: nft_set_pipapo: remove static in nft_pipapo_get()
netfilter: nft_compat: restrict match/target protocol to u16
netfilter: nft_compat: reject unused compat flag
netfilter: nft_compat: narrow down revision to unsigned 8-bits
====================
Link: https://lore.kernel.org/r/20240208112834.1433-1-pablo@netfilter.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Florian Westphal [Thu, 8 Feb 2024 09:31:29 +0000 (10:31 +0100)]
netfilter: nft_set_pipapo: remove scratch_aligned pointer
use ->scratch for both avx2 and the generic implementation.
After previous change the scratch->map member is always aligned properly
for AVX2, so we can just use scratch->map in AVX2 too.
The alignoff delta is stored in the scratchpad so we can reconstruct
the correct address to free the area again.
Fixes: 7400b063969b ("nft_set_pipapo: Introduce AVX2-based lookup implementation")
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Wed, 7 Feb 2024 20:52:47 +0000 (21:52 +0100)]
netfilter: nft_set_pipapo: add helper to release pcpu scratch area
After next patch simple kfree() is not enough anymore, so add
a helper for it.
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Wed, 7 Feb 2024 20:52:46 +0000 (21:52 +0100)]
netfilter: nft_set_pipapo: store index in scratch maps
Pipapo needs a scratchpad area to keep state during matching.
This state can be large and thus cannot reside on stack.
Each set preallocates percpu areas for this.
On each match stage, one scratchpad half starts with all-zero and the other
is inited to all-ones.
At the end of each stage, the half that starts with all-ones is
always zero. Before next field is tested, pointers to the two halves
are swapped, i.e. resmap pointer turns into fill pointer and vice versa.
After the last field has been processed, pipapo stashes the
index toggle in a percpu variable, with assumption that next packet
will start with the all-zero half and sets all bits in the other to 1.
This isn't reliable.
There can be multiple sets and we can't be sure that the upper
and lower half of all set scratch map is always in sync (lookups
can be conditional), so one set might have swapped, but other might
not have been queried.
Thus we need to keep the index per-set-and-cpu, just like the
scratchpad.
Note that this bug fix is incomplete, there is a related issue.
avx2 and normal implementation might use slightly different areas of the
map array space due to the avx2 alignment requirements, so
m->scratch (generic/fallback implementation) and ->scratch_aligned
(avx) may partially overlap. scratch and scratch_aligned are not distinct
objects, the latter is just the aligned address of the former.
After this change, write to scratch_align->map_index may write to
scratch->map, so this issue becomes more prominent, we can set to 1
a bit in the supposedly-all-zero area of scratch->map[].
A followup patch will remove the scratch_aligned and makes generic and
avx code use the same (aligned) area.
Its done in a separate change to ease review.
Fixes: 3c4287f62044 ("nf_tables: Add set type for arbitrary concatenation of ranges")
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>