Mark Brown [Tue, 17 May 2022 15:59:05 +0000 (16:59 +0100)]
Merge remote-tracking branch 'regulator/for-5.19' into regulator-next
Miaoqian Lin [Mon, 16 May 2022 07:44:33 +0000 (11:44 +0400)]
regulator: scmi: Fix refcount leak in scmi_regulator_probe
of_find_node_by_name() returns a node pointer with refcount
incremented, we should use of_node_put() on it when done.
Add missing of_node_put() to avoid refcount leak.
Fixes: 0fbeae70ee7c ("regulator: add SCMI driver")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Link: https://lore.kernel.org/r/20220516074433.32433-1-linmq006@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Miaoqian Lin [Wed, 11 May 2022 11:35:05 +0000 (15:35 +0400)]
regulator: pfuze100: Fix refcount leak in pfuze_parse_regulators_dt
of_node_get() returns a node with refcount incremented.
Calling of_node_put() to drop the reference when not needed anymore.
Fixes: 3784b6d64dc5 ("regulator: pfuze100: add pfuze100 regulator driver")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Link: https://lore.kernel.org/r/20220511113506.45185-1-linmq006@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Konrad Dybcio [Sat, 30 Apr 2022 16:37:52 +0000 (18:37 +0200)]
regulator: qcom_smd: Fix up PM8950 regulator configuration
Following changes have been made:
- S5, L4, L18, L20 and L21 were removed (S5 is managed by
SPMI, whereas the rest seems not to exist [or at least it's blocked
by Sony Loire /MSM8956/ RPM firmware])
- Supply maps have were adjusted to reflect regulator changes.
Fixes: e44adca5fa25 ("regulator: qcom_smd: Add PM8950 regulators")
Signed-off-by: Konrad Dybcio <konrad.dybcio@somainline.org>
Link: https://lore.kernel.org/r/20220430163753.609909-1-konrad.dybcio@somainline.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Zev Weiss [Thu, 5 May 2022 04:31:52 +0000 (21:31 -0700)]
regulator: core: Fix enable_count imbalance with EXCLUSIVE_GET
Since the introduction of regulator->enable_count, a driver that did
an exclusive get on an already-enabled regulator would end up with
enable_count initialized to 0 but rdev->use_count initialized to 1.
With that starting point the regulator is effectively stuck enabled,
because if the driver attempted to disable it it would fail the
enable_count underflow check in _regulator_handle_consumer_disable().
The EXCLUSIVE_GET path in _regulator_get() now initializes
enable_count along with rdev->use_count so that the regulator can be
disabled without underflowing the former.
Signed-off-by: Zev Weiss <zev@bewilderbeest.net>
Fixes: 5451781dadf85 ("regulator: core: Only count load for enabled consumers")
Link: https://lore.kernel.org/r/20220505043152.12933-1-zev@bewilderbeest.net
Signed-off-by: Mark Brown <broonie@kernel.org>
Mark Brown [Wed, 4 May 2022 20:59:20 +0000 (21:59 +0100)]
regulator: dt-bindings: qcom,rpmh: minor cleanups and extend supplies
Merge series from Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>:
Extend the RPMH regulator bindings with minor fixes and adding narrow supply
matching.
Zev Weiss [Wed, 4 May 2022 06:52:49 +0000 (23:52 -0700)]
regulator: core: Add error flags to sysfs attributes
If a regulator provides a get_error_flags() operation, its sysfs
attributes will now include an entry for each defined
REGULATOR_ERROR_* flag.
Signed-off-by: Zev Weiss <zev@bewilderbeest.net>
Link: https://lore.kernel.org/r/20220504065252.6955-3-zev@bewilderbeest.net
Signed-off-by: Mark Brown <broonie@kernel.org>
Krzysztof Kozlowski [Tue, 26 Apr 2022 10:55:01 +0000 (12:55 +0200)]
regulator: dt-bindings: qcom,rpmh: document vdd-l7-bob-supply on PMR735A
The PMR735A comes with vdd-l7-bob-supply supply which was previously not
documented.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20220426105501.73200-4-krzysztof.kozlowski@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Krzysztof Kozlowski [Tue, 26 Apr 2022 10:55:00 +0000 (12:55 +0200)]
regulator: dt-bindings: qcom,rpmh: document supplies per variant
The RPMH regulator binding covers several devices with different
regulator supplies, so it uses patterns matching broad range of these
supplies. This works fine but is not specific and might miss actual
mistakes when a wrong supply property is used for given variant.
Describe the supplies depending on the compatible, using a defs-allOf
method.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20220426105501.73200-3-krzysztof.kozlowski@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Krzysztof Kozlowski [Tue, 26 Apr 2022 10:54:59 +0000 (12:54 +0200)]
regulator: dt-bindings: qcom,rpmh: update maintainers
David Collins' email bounces ("Recipient address rejected: undeliverable
address: No such user here").
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220426105501.73200-2-krzysztof.kozlowski@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Nícolas F. R. A. Prado [Fri, 29 Apr 2022 20:13:24 +0000 (16:13 -0400)]
regulator: mt6315: Enforce regulator-compatible, not name
The MT6315 PMIC dt-binding should enforce that one of the valid
regulator-compatible is set in each regulator node. However it was
mistakenly matching against regulator-name instead.
Fix the typo. This not only fixes the compatible verification, but also
lifts the regulator-name restriction, so that more meaningful names can
be set for each platform.
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Link: https://lore.kernel.org/r/20220429201325.2205799-1-nfraprado@collabora.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Rickard x Andersson [Fri, 29 Apr 2022 07:22:11 +0000 (09:22 +0200)]
regulator: pca9450: Enable DVS control via PMIC_STBY_REQ
When DVS is enabled via the devicetree properties
"nxp,dvs-run-voltage" and "nxp,dvs-standby-voltage" then
also the bit that enables DVS control via PMIC_STBY_REQ pin
should be set.
Signed-off-by: Rickard x Andersson <rickaran@axis.com>
Link: https://lore.kernel.org/r/20220429072211.24957-5-rickaran@axis.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Rickard x Andersson [Fri, 29 Apr 2022 07:22:10 +0000 (09:22 +0200)]
regulator: pca9450: Make warm reset on WDOG_B assertion
The default configuration of the PMIC behavior makes the PMIC
power cycle most regulators on WDOG_B assertion. This power
cycling causes the memory contents of OCRAM to be lost.
Some systems neeeds some memory that survives reset and
reboot, therefore this patch is created.
Signed-off-by: Rickard x Andersson <rickaran@axis.com>
Link: https://lore.kernel.org/r/20220429072211.24957-4-rickaran@axis.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Rickard x Andersson [Fri, 29 Apr 2022 07:22:09 +0000 (09:22 +0200)]
regulator: Add property for WDOG_B warm reset
Make it possible to do warm reset on WDOG_B assertion.
Signed-off-by: Rickard x Andersson <rickaran@axis.com>
Link: https://lore.kernel.org/r/20220429072211.24957-3-rickaran@axis.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Per-Daniel Olsson [Fri, 29 Apr 2022 07:22:08 +0000 (09:22 +0200)]
regulator: pca9450: Make I2C Level Translator configurable
Make the I2C Level Translator included in PCA9450 configurable from
devicetree. The reset state is off. By setting nxp,i2c-lt-enable, the
I2C Level Translator will be enabled while in STANDBY or RUN state.
Signed-off-by: Per-Daniel Olsson <perdo@axis.com>
Signed-off-by: Rickard x Andersson <rickaran@axis.com>
Link: https://lore.kernel.org/r/20220429072211.24957-2-rickaran@axis.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Per-Daniel Olsson [Fri, 29 Apr 2022 07:22:07 +0000 (09:22 +0200)]
regulator: Add property for I2C level shifter
By setting nxp,i2c-lt-enable the I2C level translator is
enabled.
Signed-off-by: Per-Daniel Olsson <perdo@axis.com>
Signed-off-by: Rickard x Andersson <rickaran@axis.com>
Link: https://lore.kernel.org/r/20220429072211.24957-1-rickaran@axis.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Markuss Broks [Fri, 29 Apr 2022 12:09:14 +0000 (15:09 +0300)]
regulator: sm5703: Correct reference to the common regulator schema
The correct file name is regulator.yaml, not regulators.yaml.
Signed-off-by: Markuss Broks <markuss.broks@gmail.com>
Link: https://lore.kernel.org/r/20220429120914.9928-1-markuss.broks@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Markuss Broks [Sat, 23 Apr 2022 08:53:18 +0000 (11:53 +0300)]
regulator: sm5703-regulator: Add regulators support for SM5703 MFD
Regulators block of SM5703 controls several voltage regulators which
are used to power various components. There are 3 LDO outputs ranging
from 1.5 to 3.3V, a buck regulator ranging from 1V to 3V, two fixed
voltage LDO regulators for powering the USB devices and one high-power
fixed voltage LDO line (actually two lines) meant to power high-power
USB devices.
Signed-off-by: Markuss Broks <markuss.broks@gmail.com>
Link: https://lore.kernel.org/r/20220423085319.483524-6-markuss.broks@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Markuss Broks [Sat, 23 Apr 2022 08:53:15 +0000 (11:53 +0300)]
dt-bindings: regulator: Add bindings for Silicon Mitus SM5703 regulators
This patch adds device-tree bindings for regulators on Silicon Mitus
SM5703 MFD.
Signed-off-by: Markuss Broks <markuss.broks@gmail.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20220423085319.483524-3-markuss.broks@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Krzysztof Kozlowski [Mon, 25 Apr 2022 07:24:55 +0000 (09:24 +0200)]
regulator: richtek,rt4801: parse GPIOs per regulator
Having one enable-gpios property for all regulators is discouraged and
instead, similarly to regulator core ena_gpiod feature, each GPIO should
be present in each regulator node. Add support for parsing such GPIOs,
keeping backwards compatibility.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Tested-by: ChiYuan Huang <cy_huang@richtek.com>
Link: https://lore.kernel.org/r/20220425072455.27356-3-krzysztof.kozlowski@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Krzysztof Kozlowski [Mon, 25 Apr 2022 07:24:54 +0000 (09:24 +0200)]
regulator: dt-bindings: richtek,rt4801: use existing ena_gpiod feature
The binding and driver duplicated regulator core feature of controlling
regulators with GPIOs (of_parse_cb + ena_gpiod) and created its own
enable-gpios property with multiple GPIOs.
This is a less preferred way, because enable-gpios should enable only one
element, not multiple. It also duplicates existing solution.
Deprecate the original 'enable-gpios' and add per-regulator property.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20220425072455.27356-2-krzysztof.kozlowski@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
ChiYuan Huang [Fri, 22 Apr 2022 06:50:55 +0000 (14:50 +0800)]
regulator: dt-bindings: Revise the rt5190a buck/ldo description
Revise the rt5190a bucks and ldo property description.
Signed-off-by: ChiYuan Huang <cy_huang@richtek.com>
Link: https://lore.kernel.org/r/1650610255-6180-1-git-send-email-u0084500@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Brian Norris [Wed, 20 Apr 2022 21:22:13 +0000 (14:22 -0700)]
regulator: core: Sleep (not delay) in set_voltage()
These delays can be relatively large (e.g., hundreds of microseconds to
several milliseconds on RK3399 Gru systems). Per
Documentation/timers/timers-howto.rst, that should usually use a
sleeping delay. Let's use the existing regulator delay helper to handle
both large and small delays appropriately. This avoids burning a bunch
of CPU time and hurting scheduling latencies when hitting regulators a
lot (e.g., during cpufreq).
The sleep vs. delay issue choice has been made differently over time --
early versions of RK3399 Gru PWM-regulator support used usleep_range()
in pwm-regulator.c. More of this got moved into the regulator core,
in commits like:
73e705bf81ce regulator: core: Add set_voltage_time op
At the same time, the sleep turned into a delay.
It's OK to sleep in _regulator_do_set_voltage(), as we aren't in an
atomic context. (All our callers grab various mutexes already.)
I avoid using fsleep() because it uses a usleep_range() of [N to N*2],
and usleep_range() very commonly biases to the high end of the range. We
don't want to double the expected delay, especially for long delays.
Signed-off-by: Brian Norris <briannorris@chromium.org>
Reviewed-by: Matthias Kaehlcke <mka@chromium.org>
Link: https://lore.kernel.org/r/20220420141511.v2.2.If0fc61a894f537b052ca41572aff098cf8e7e673@changeid
Signed-off-by: Mark Brown <broonie@kernel.org>
Brian Norris [Wed, 20 Apr 2022 21:22:12 +0000 (14:22 -0700)]
regulator: core: Rename _regulator_enable_delay()
I want to use it in other contexts besides _regulator_do_enable().
Signed-off-by: Brian Norris <briannorris@chromium.org>
Link: https://lore.kernel.org/r/20220420141511.v2.1.I31ef0014c9597d53722ab513890f839f357fdfb3@changeid
Signed-off-by: Mark Brown <broonie@kernel.org>
Wei Yongjun [Thu, 21 Apr 2022 09:03:35 +0000 (09:03 +0000)]
regulator: da9121: Fix uninit-value in da9121_assign_chip_model()
KASAN report slab-out-of-bounds in __regmap_init as follows:
BUG: KASAN: slab-out-of-bounds in __regmap_init drivers/base/regmap/regmap.c:841
Read of size 1 at addr
ffff88803678cdf1 by task xrun/9137
CPU: 0 PID: 9137 Comm: xrun Tainted: G W 5.18.0-rc2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0xe8/0x15a lib/dump_stack.c:88
print_report.cold+0xcd/0x69b mm/kasan/report.c:313
kasan_report+0x8e/0xc0 mm/kasan/report.c:491
__regmap_init+0x4540/0x4ba0 drivers/base/regmap/regmap.c:841
__devm_regmap_init+0x7a/0x100 drivers/base/regmap/regmap.c:1266
__devm_regmap_init_i2c+0x65/0x80 drivers/base/regmap/regmap-i2c.c:394
da9121_i2c_probe+0x386/0x6d1 drivers/regulator/da9121-regulator.c:1039
i2c_device_probe+0x959/0xac0 drivers/i2c/i2c-core-base.c:563
This happend when da9121 device is probe by da9121_i2c_id, but with
invalid dts. Thus, chip->subvariant_id is set to -EINVAL, and later
da9121_assign_chip_model() will access 'regmap' without init it.
Fix it by return -EINVAL from da9121_assign_chip_model() if
'chip->subvariant_id' is invalid.
Fixes: f3fbd5566f6a ("regulator: da9121: Add device variants")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Adam Ward <Adam.Ward.Opensource@diasemi.com>
Link: https://lore.kernel.org/r/20220421090335.1876149-1-weiyongjun1@huawei.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Minghao Chi [Tue, 12 Apr 2022 07:10:30 +0000 (07:10 +0000)]
regulator: stm32-vrefbuf: using pm_runtime_resume_and_get instead of pm_runtime_get_sync
Using pm_runtime_resume_and_get is more appropriate
for simplifing code
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn>
Link: https://lore.kernel.org/r/20220412071030.2532230-1-chi.minghao@zte.com.cn
Signed-off-by: Mark Brown <broonie@kernel.org>
Krzysztof Kozlowski [Mon, 11 Apr 2022 10:59:03 +0000 (12:59 +0200)]
regulator: dt-bindings: qcom,rpmh: document h and k ID
Document used PMIC IDs: 'h' (sm8450-hdk, sm8450-qrd) and 'k'
(sc7280-crd).
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20220411105903.230733-1-krzysztof.kozlowski@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Matti Vaittinen [Fri, 8 Apr 2022 08:32:00 +0000 (11:32 +0300)]
MAINTAINERS: Fix reviewer info for a few ROHM ICs
The email backend used by ROHM keeps labeling patches as spam.
Additionally, there have been reports of some emails been completely
dropped. Finally also the email list (or shared inbox)
linux-power@fi.rohmeurope.com inadvertly stopped working and has not
been reviwed during the past few weeks.
Remove no longer working list 'linux-power' list-entry and switch my
email to use the personal gmail account instead of the company account.
Signed-off-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com>
Link: https://lore.kernel.org/r/Yk/zAHusOdf4+h06@dc73szyh141qn5ck3nwqy-3.rev.dnainternet.fi
Signed-off-by: Mark Brown <broonie@kernel.org>
Kunihiko Hayashi [Tue, 5 Apr 2022 07:55:03 +0000 (16:55 +0900)]
regulator: uniphier: Use unevaluatedProperties
This refers common bindings, so this is preferred for
unevaluatedProperties instead of additionalProperties.
Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Acked-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/1649145303-30221-3-git-send-email-hayashi.kunihiko@socionext.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Kunihiko Hayashi [Tue, 5 Apr 2022 07:55:02 +0000 (16:55 +0900)]
regulator: uniphier: Clean up clocks, resets, and their names using compatible string
Instead of "oneOf:" choices, use "allOf:" and "if:" to define clocks,
resets, and their names that can be taken by the compatible string.
The order of clock-names and reset-names doesn't change here.
Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/1649145303-30221-2-git-send-email-hayashi.kunihiko@socionext.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Mark Brown [Thu, 7 Apr 2022 12:52:16 +0000 (13:52 +0100)]
regulator Add Richtek RT5759 buck converter support
Merge series from cy_huang <u0084500@gmail.com>:
This patch series add Richtek RT5759 buck converter support.
Andy Shevchenko [Fri, 25 Mar 2022 18:45:08 +0000 (20:45 +0200)]
regulator: rpi-panel-attiny: Get rid of duplicate of_node assignment
GPIO library does copy the of_node from the parent device of
the GPIO chip, there is no need to repeat this in the individual
drivers. Remove these assignment all at once.
For the details one may look into the of_gpio_dev_init() implementation.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220325184508.45670-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Mark Brown [Tue, 5 Apr 2022 09:25:07 +0000 (10:25 +0100)]
Add support for MediaTek PMIC MT6366
Merge series from Johnson Wang <johnson.wang@mediatek.com>:
This patchset adds support for MediaTek PMIC MT6366.
MT6366 is the primary PMIC for MT8186 and probably other SOCs.
Mark Brown [Tue, 5 Apr 2022 09:25:06 +0000 (10:25 +0100)]
regulator: Add support for MediaTek PMIC MT6366
Merge series from Johnson Wang <johnson.wang@mediatek.com>:
This patchset adds support for MediaTek PMIC MT6366.
MT6366 is the primary PMIC for MT8186 and probably other SOCs.
Johnson Wang [Fri, 1 Apr 2022 08:02:11 +0000 (16:02 +0800)]
regulator: mt6366: Add support for MT6366 regulator
The MT6366 is a regulator found on boards based on MediaTek MT8186 and
probably other SoCs. It is a so called pmic and connects as a slave to
SoC using SPI, wrapped inside the pmic-wrapper.
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Johnson Wang <johnson.wang@mediatek.com>
Link: https://lore.kernel.org/r/20220401080212.27383-2-johnson.wang@mediatek.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Johnson Wang [Fri, 1 Apr 2022 08:02:12 +0000 (16:02 +0800)]
regulator: Add BUCK and LDO document for MT6358 and MT6366
Add buck_vcore_sshub and ldo_vsram_others_sshub
regulators to binding document for MT6358 and MT6366.
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Johnson Wang <johnson.wang@mediatek.com>
Link: https://lore.kernel.org/r/20220401080212.27383-3-johnson.wang@mediatek.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Axel Lin [Sun, 3 Apr 2022 13:22:35 +0000 (21:22 +0800)]
regulator: atc260x: Fix missing active_discharge_on setting
Without active_discharge_on setting, the SWITCH1 discharge enable control
is always disabled. Fix it.
Fixes: 3b15ccac161a ("regulator: Add regulator driver for ATC260x PMICs")
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Link: https://lore.kernel.org/r/20220403132235.123727-1-axel.lin@ingics.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Mark Brown [Fri, 25 Mar 2022 14:46:37 +0000 (14:46 +0000)]
regulator: Flag uncontrollable regulators as always_on
While we currently assume that regulators with no control available are
just uncontionally enabled this isn't always as clearly displayed to
users as is desirable, for example the code for disabling unused
regulators will log that it is about to disable them. Clean this up a
bit by setting always_on during constraint evaluation if we have no
available mechanism for controlling the regualtor so things that check
the constraint will do the right thing.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220325144637.1543496-1-broonie@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Mark Brown [Thu, 24 Mar 2022 20:18:54 +0000 (20:18 +0000)]
regulator: fixed: Remove print on allocation failure
OOMs are very verbose, we don't need to print an additional error message
when we fail to allocate.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220324201854.3107077-1-broonie@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Krzysztof Kozlowski [Fri, 1 Apr 2022 15:37:11 +0000 (17:37 +0200)]
regulator: dt-bindings: richtek,rt4801: minor comments adjustments
Correct grammar in 'enable-gpios' description and remove useless comment
about regulator nodes, because these are obvious from patternProperties.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20220401153711.1057853-1-krzysztof.kozlowski@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
ChiYuan Huang [Sat, 26 Mar 2022 11:39:47 +0000 (19:39 +0800)]
regulator: Add binding for Richtek RT5759 DCDC converter
Add bindings for Richtek RT5759 high-performance DCDC converter.
Signed-off-by: ChiYuan Huang <cy_huang@richtek.com>
Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org>
Link: https://lore.kernel.org/r/1648294788-11758-2-git-send-email-u0084500@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
ChiYuan Huang [Sat, 26 Mar 2022 11:39:48 +0000 (19:39 +0800)]
regulator: rt5759: Add support for Richtek RT5759 DCDC converter
Add support for Richtek RT5759 high-performance DCDC converter.
Signed-off-by: ChiYuan Huang <cy_huang@richtek.com>
Link: https://lore.kernel.org/r/1648294788-11758-3-git-send-email-u0084500@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Johnson Wang [Thu, 17 Mar 2022 03:04:02 +0000 (11:04 +0800)]
regulator: Add BUCK and LDO document for MT6358 and MT6366
Add buck_vcore_sshub and ldo_vsram_others_sshub
regulators to binding document for MT6358 and MT6366.
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Johnson Wang <johnson.wang@mediatek.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://lore.kernel.org/r/20220317030402.24894-3-johnson.wang@mediatek.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Johnson Wang [Thu, 17 Mar 2022 03:04:01 +0000 (11:04 +0800)]
regulator: mt6366: Add support for MT6366 regulator
The MT6366 is a regulator found on boards based on MediaTek MT8186 and
probably other SoCs. It is a so called pmic and connects as a slave to
SoC using SPI, wrapped inside the pmic-wrapper.
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Johnson Wang <johnson.wang@mediatek.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://lore.kernel.org/r/20220317030402.24894-2-johnson.wang@mediatek.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Axel Lin [Mon, 4 Apr 2022 02:25:14 +0000 (10:25 +0800)]
regulator: rtq2134: Fix missing active_discharge_on setting
The active_discharge_on setting was missed, so output discharge resistor
is always disabled. Fix it.
Fixes: 0555d41497de ("regulator: rtq2134: Add support for Richtek RTQ2134 SubPMIC")
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Link: https://lore.kernel.org/r/20220404022514.449231-1-axel.lin@ingics.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Jonathan Bakker [Mon, 28 Mar 2022 01:01:54 +0000 (18:01 -0700)]
regulator: wm8994: Add an off-on delay for WM8994 variant
As per Table 130 of the wm8994 datasheet at [1], there is an off-on
delay for LDO1 and LDO2. In the wm8958 datasheet [2], I could not
find any reference to it. I could not find a wm1811 datasheet to
double-check there, but as no one has complained presumably it works
without it.
This solves the issue on Samsung Aries boards with a wm8994 where
register writes fail when the device is powered off and back-on
quickly.
[1] https://statics.cirrus.com/pubs/proDatasheet/WM8994_Rev4.6.pdf
[2] https://statics.cirrus.com/pubs/proDatasheet/WM8958_v3.5.pdf
Signed-off-by: Jonathan Bakker <xc-racer2@live.ca>
Acked-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Link: https://lore.kernel.org/r/CY4PR04MB056771CFB80DC447C30D5A31CB1D9@CY4PR04MB0567.namprd04.prod.outlook.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Linus Torvalds [Sun, 3 Apr 2022 21:08:21 +0000 (14:08 -0700)]
Linux 5.18-rc1
Linus Torvalds [Sun, 3 Apr 2022 19:26:01 +0000 (12:26 -0700)]
Merge tag 'trace-v5.18-2' of git://git./linux/kernel/git/rostedt/linux-trace
Pull more tracing updates from Steven Rostedt:
- Rename the staging files to give them some meaning. Just
stage1,stag2,etc, does not show what they are for
- Check for NULL from allocation in bootconfig
- Hold event mutex for dyn_event call in user events
- Mark user events to broken (to work on the API)
- Remove eBPF updates from user events
- Remove user events from uapi header to keep it from being installed.
- Move ftrace_graph_is_dead() into inline as it is called from hot
paths and also convert it into a static branch.
* tag 'trace-v5.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Move user_events.h temporarily out of include/uapi
ftrace: Make ftrace_graph_is_dead() a static branch
tracing: Set user_events to BROKEN
tracing/user_events: Remove eBPF interfaces
tracing/user_events: Hold event_mutex during dyn_event_add
proc: bootconfig: Add null pointer check
tracing: Rename the staging files for trace_events
Linus Torvalds [Sun, 3 Apr 2022 19:21:14 +0000 (12:21 -0700)]
Merge tag 'clk-for-linus' of git://git./linux/kernel/git/clk/linux
Pull clk fix from Stephen Boyd:
"A single revert to fix a boot regression seen when clk_put() started
dropping rate range requests. It's best to keep various systems
booting so we'll kick this out and try again next time"
* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
Revert "clk: Drop the rate range on clk_put()"
Linus Torvalds [Sun, 3 Apr 2022 19:15:47 +0000 (12:15 -0700)]
Merge tag 'x86-urgent-2022-04-03' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
"A set of x86 fixes and updates:
- Make the prctl() for enabling dynamic XSTATE components correct so
it adds the newly requested feature to the permission bitmap
instead of overwriting it. Add a selftest which validates that.
- Unroll string MMIO for encrypted SEV guests as the hypervisor
cannot emulate it.
- Handle supervisor states correctly in the FPU/XSTATE code so it
takes the feature set of the fpstate buffer into account. The
feature sets can differ between host and guest buffers. Guest
buffers do not contain supervisor states. So far this was not an
issue, but with enabling PASID it needs to be handled in the buffer
offset calculation and in the permission bitmaps.
- Avoid a gazillion of repeated CPUID invocations in by caching the
values early in the FPU/XSTATE code.
- Enable CONFIG_WERROR in x86 defconfig.
- Make the X86 defconfigs more useful by adapting them to Y2022
reality"
* tag 'x86-urgent-2022-04-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/fpu/xstate: Consolidate size calculations
x86/fpu/xstate: Handle supervisor states in XSTATE permissions
x86/fpu/xsave: Handle compacted offsets correctly with supervisor states
x86/fpu: Cache xfeature flags from CPUID
x86/fpu/xsave: Initialize offset/size cache early
x86/fpu: Remove unused supervisor only offsets
x86/fpu: Remove redundant XCOMP_BV initialization
x86/sev: Unroll string mmio with CC_ATTR_GUEST_UNROLL_STRING_IO
x86/config: Make the x86 defconfigs a bit more usable
x86/defconfig: Enable WERROR
selftests/x86/amx: Update the ARCH_REQ_XCOMP_PERM test
x86/fpu/xstate: Fix the ARCH_REQ_XCOMP_PERM implementation
Linus Torvalds [Sun, 3 Apr 2022 19:08:26 +0000 (12:08 -0700)]
Merge tag 'core-urgent-2022-04-03' of git://git./linux/kernel/git/tip/tip
Pull RT signal fix from Thomas Gleixner:
"Revert the RT related signal changes. They need to be reworked and
generalized"
* tag 'core-urgent-2022-04-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
Revert "signal, x86: Delay calling signals in atomic on RT enabled kernels"
Linus Torvalds [Sun, 3 Apr 2022 17:31:00 +0000 (10:31 -0700)]
Merge tag 'dma-mapping-5.18-1' of git://git.infradead.org/users/hch/dma-mapping
Pull more dma-mapping updates from Christoph Hellwig:
- fix a regression in dma remap handling vs AMD memory encryption (me)
- finally kill off the legacy PCI DMA API (Christophe JAILLET)
* tag 'dma-mapping-5.18-1' of git://git.infradead.org/users/hch/dma-mapping:
dma-mapping: move pgprot_decrypted out of dma_pgprot
PCI/doc: cleanup references to the legacy PCI DMA API
PCI: Remove the deprecated "pci-dma-compat.h" API
Linus Torvalds [Sun, 3 Apr 2022 17:17:48 +0000 (10:17 -0700)]
Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:
- avoid unnecessary rebuilds for library objects
- fix return value of __setup handlers
- fix invalid input check for "crashkernel=" kernel option
- silence KASAN warnings in unwind_frame
* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 9191/1: arm/stacktrace, kasan: Silence KASAN warnings in unwind_frame()
ARM: 9190/1: kdump: add invalid input check for 'crashkernel=0'
ARM: 9187/1: JIVE: fix return value of __setup handler
ARM: 9189/1: decompressor: fix unneeded rebuilds of library objects
Stephen Boyd [Sun, 3 Apr 2022 02:28:18 +0000 (19:28 -0700)]
Revert "clk: Drop the rate range on clk_put()"
This reverts commit
7dabfa2bc4803eed83d6f22bd6f045495f40636b. There are
multiple reports that this breaks boot on various systems. The common
theme is that orphan clks are having rates set on them when that isn't
expected. Let's revert it out for now so that -rc1 boots.
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reported-by: Tony Lindgren <tony@atomide.com>
Reported-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Link: https://lore.kernel.org/r/366a0232-bb4a-c357-6aa8-636e398e05eb@samsung.com
Cc: Maxime Ripard <maxime@cerno.tech>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Link: https://lore.kernel.org/r/20220403022818.39572-1-sboyd@kernel.org
Linus Torvalds [Sat, 2 Apr 2022 19:57:17 +0000 (12:57 -0700)]
Merge tag 'perf-tools-for-v5.18-2022-04-02' of git://git./linux/kernel/git/acme/linux
Pull more perf tools updates from Arnaldo Carvalho de Melo:
- Avoid SEGV if core.cpus isn't set in 'perf stat'.
- Stop depending on .git files for building PERF-VERSION-FILE, used in
'perf --version', fixing some perf tools build scenarios.
- Convert tracepoint.py example to python3.
- Update UAPI header copies from the kernel sources: socket,
mman-common, msr-index, KVM, i915 and cpufeatures.
- Update copy of libbpf's hashmap.c.
- Directly return instead of using local ret variable in
evlist__create_syswide_maps(), found by coccinelle.
* tag 'perf-tools-for-v5.18-2022-04-02' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
perf python: Convert tracepoint.py example to python3
perf evlist: Directly return instead of using local ret variable
perf cpumap: More cpu map reuse by merge.
perf cpumap: Add is_subset function
perf evlist: Rename cpus to user_requested_cpus
perf tools: Stop depending on .git files for building PERF-VERSION-FILE
tools headers cpufeatures: Sync with the kernel sources
tools headers UAPI: Sync drm/i915_drm.h with the kernel sources
tools headers UAPI: Sync linux/kvm.h with the kernel sources
tools kvm headers arm64: Update KVM headers from the kernel sources
tools arch x86: Sync the msr-index.h copy with the kernel sources
tools headers UAPI: Sync asm-generic/mman-common.h with the kernel
perf beauty: Update copy of linux/socket.h with the kernel sources
perf tools: Update copy of libbpf's hashmap.c
perf stat: Avoid SEGV if core.cpus isn't set
Linus Torvalds [Sat, 2 Apr 2022 19:33:31 +0000 (12:33 -0700)]
Merge tag 'kbuild-fixes-v5.18' of git://git./linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
- Fix empty $(PYTHON) expansion.
- Fix UML, which got broken by the attempt to suppress Clang warnings.
- Fix warning message in modpost.
* tag 'kbuild-fixes-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
modpost: restore the warning message for missing symbol versions
Revert "um: clang: Strip out -mno-global-merge from USER_CFLAGS"
kbuild: Remove '-mno-global-merge'
kbuild: fix empty ${PYTHON} in scripts/link-vmlinux.sh
kconfig: remove stale comment about removed kconfig_print_symbol()
Linus Torvalds [Sat, 2 Apr 2022 19:14:38 +0000 (12:14 -0700)]
Merge tag 'mips_5.18_1' of git://git./linux/kernel/git/mips/linux
Pull MIPS fixes from Thomas Bogendoerfer:
- build fix for gpio
- fix crc32 build problems
- check for failed memory allocations
* tag 'mips_5.18_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
MIPS: crypto: Fix CRC32 code
MIPS: rb532: move GPIOD definition into C-files
MIPS: lantiq: check the return value of kzalloc()
mips: sgi-ip22: add a check for the return of kzalloc()
Linus Torvalds [Sat, 2 Apr 2022 19:09:02 +0000 (12:09 -0700)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
- Only do MSR filtering for MSRs accessed by rdmsr/wrmsr
- Documentation improvements
- Prevent module exit until all VMs are freed
- PMU Virtualization fixes
- Fix for kvm_irq_delivery_to_apic_fast() NULL-pointer dereferences
- Other miscellaneous bugfixes
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (42 commits)
KVM: x86: fix sending PV IPI
KVM: x86/mmu: do compare-and-exchange of gPTE via the user address
KVM: x86: Remove redundant vm_entry_controls_clearbit() call
KVM: x86: cleanup enter_rmode()
KVM: x86: SVM: fix tsc scaling when the host doesn't support it
kvm: x86: SVM: remove unused defines
KVM: x86: SVM: move tsc ratio definitions to svm.h
KVM: x86: SVM: fix avic spec based definitions again
KVM: MIPS: remove reference to trap&emulate virtualization
KVM: x86: document limitations of MSR filtering
KVM: x86: Only do MSR filtering when access MSR by rdmsr/wrmsr
KVM: x86/emulator: Emulate RDPID only if it is enabled in guest
KVM: x86/pmu: Fix and isolate TSX-specific performance event logic
KVM: x86: mmu: trace kvm_mmu_set_spte after the new SPTE was set
KVM: x86/svm: Clear reserved bits written to PerfEvtSeln MSRs
KVM: x86: Trace all APICv inhibit changes and capture overall status
KVM: x86: Add wrappers for setting/clearing APICv inhibits
KVM: x86: Make APICv inhibit reasons an enum and cleanup naming
KVM: X86: Handle implicit supervisor access with SMAP
KVM: X86: Rename variable smap to not_smap in permission_fault()
...
Masahiro Yamada [Fri, 1 Apr 2022 15:56:10 +0000 (00:56 +0900)]
modpost: restore the warning message for missing symbol versions
This log message was accidentally chopped off.
I was wondering why this happened, but checking the ML log, Mark
precisely followed my suggestion [1].
I just used "..." because I was too lazy to type the sentence fully.
Sorry for the confusion.
[1]: https://lore.kernel.org/all/CAK7LNAR6bXXk9-ZzZYpTqzFqdYbQsZHmiWspu27rtsFxvfRuVA@mail.gmail.com/
Fixes: 4a6795933a89 ("kbuild: modpost: Explicitly warn about unprototyped symbols")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Linus Torvalds [Sat, 2 Apr 2022 18:03:03 +0000 (11:03 -0700)]
Merge tag 'for-5.18/drivers-2022-04-02' of git://git.kernel.dk/linux-block
Pull block driver fix from Jens Axboe:
"Got two reports on nbd spewing warnings on load now, which is a
regression from a commit that went into your tree yesterday.
Revert the problematic change for now"
* tag 'for-5.18/drivers-2022-04-02' of git://git.kernel.dk/linux-block:
Revert "nbd: fix possible overflow on 'first_minor' in nbd_dev_add()"
Linus Torvalds [Sat, 2 Apr 2022 17:54:52 +0000 (10:54 -0700)]
Merge tag 'pci-v5.18-changes-2' of git://git./linux/kernel/git/helgaas/pci
Pull pci fix from Bjorn Helgaas:
- Fix Hyper-V "defined but not used" build issue added during merge
window (YueHaibing)
* tag 'pci-v5.18-changes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI: hv: Remove unused hv_set_msi_entry_from_desc()
Linus Torvalds [Sat, 2 Apr 2022 17:44:18 +0000 (10:44 -0700)]
Merge tag 'tag-chrome-platform-for-v5.18' of git://git./linux/kernel/git/chrome-platform/linux
Pull chrome platform updates from Benson Leung:
"cros_ec_typec:
- Check for EC device - Fix a crash when using the cros_ec_typec
driver on older hardware not capable of typec commands
- Make try power role optional
- Mux configuration reorganization series from Prashant
cros_ec_debugfs:
- Fix use after free. Thanks Tzung-bi
sensorhub:
- cros_ec_sensorhub fixup - Split trace include file
misc:
- Add new mailing list for chrome-platform development:
chrome-platform@lists.linux.dev
Now with patchwork!"
* tag 'tag-chrome-platform-for-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux:
platform/chrome: cros_ec_debugfs: detach log reader wq from devm
platform: chrome: Split trace include file
platform/chrome: cros_ec_typec: Update mux flags during partner removal
platform/chrome: cros_ec_typec: Configure muxes at start of port update
platform/chrome: cros_ec_typec: Get mux state inside configure_mux
platform/chrome: cros_ec_typec: Move mux flag checks
platform/chrome: cros_ec_typec: Check for EC device
platform/chrome: cros_ec_typec: Make try power role optional
MAINTAINERS: platform-chrome: Add new chrome-platform@lists.linux.dev list
Jens Axboe [Sat, 2 Apr 2022 17:40:23 +0000 (11:40 -0600)]
Revert "nbd: fix possible overflow on 'first_minor' in nbd_dev_add()"
This reverts commit
6d35d04a9e18990040e87d2bbf72689252669d54.
Both Gabriel and Borislav report that this commit casues a regression
with nbd:
sysfs: cannot create duplicate filename '/dev/block/43:0'
Revert it before 5.18-rc1 and we'll investigage this separately in
due time.
Link: https://lore.kernel.org/all/YkiJTnFOt9bTv6A2@zn.tnic/
Reported-by: Gabriel L. Somlo <somlo@cmu.edu>
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Eric Dumazet [Mon, 28 Mar 2022 17:07:04 +0000 (18:07 +0100)]
watch_queue: Free the page array when watch_queue is dismantled
Commit
7ea1a0124b6d ("watch_queue: Free the alloc bitmap when the
watch_queue is torn down") took care of the bitmap, but not the page
array.
BUG: memory leak
unreferenced object 0xffff88810d9bc140 (size 32):
comm "syz-executor335", pid 3603, jiffies
4294946994 (age 12.840s)
hex dump (first 32 bytes):
40 a7 40 04 00 ea ff ff 00 00 00 00 00 00 00 00 @.@.............
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
kmalloc_array include/linux/slab.h:621 [inline]
kcalloc include/linux/slab.h:652 [inline]
watch_queue_set_size+0x12f/0x2e0 kernel/watch_queue.c:251
pipe_ioctl+0x82/0x140 fs/pipe.c:632
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:874 [inline]
__se_sys_ioctl fs/ioctl.c:860 [inline]
__x64_sys_ioctl+0xfc/0x140 fs/ioctl.c:860
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
Reported-by: syzbot+25ea042ae28f3888727a@syzkaller.appspotmail.com
Fixes: c73be61cede5 ("pipe: Add general notification queue support")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Jann Horn <jannh@google.com>
Link: https://lore.kernel.org/r/20220322004654.618274-1-eric.dumazet@gmail.com/
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Steven Rostedt (Google) [Fri, 1 Apr 2022 18:39:03 +0000 (14:39 -0400)]
tracing: mark user_events as BROKEN
After being merged, user_events become more visible to a wider audience
that have concerns with the current API.
It is too late to fix this for this release, but instead of a full
revert, just mark it as BROKEN (which prevents it from being selected in
make config). Then we can work finding a better API. If that fails,
then it will need to be completely reverted.
To not have the code silently bitrot, still allow building it with
COMPILE_TEST.
And to prevent the uapi header from being installed, then later changed,
and then have an old distro user space see the old version, move the
header file out of the uapi directory.
Surround the include with CONFIG_COMPILE_TEST to the current location,
but when the BROKEN tag is taken off, it will use the uapi directory,
and fail to compile. This is a good way to remind us to move the header
back.
Link: https://lore.kernel.org/all/20220330155835.5e1f6669@gandalf.local.home
Link: https://lkml.kernel.org/r/20220330201755.29319-1-mathieu.desnoyers@efficios.com
Suggested-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Steven Rostedt (Google) [Fri, 1 Apr 2022 18:39:03 +0000 (14:39 -0400)]
tracing: Move user_events.h temporarily out of include/uapi
While user_events API is under development and has been marked for broken
to not let the API become fixed, move the header file out of the uapi
directory. This is to prevent it from being installed, then later changed,
and then have an old distro user space update with a new kernel, where
applications see the user_events being available, but the old header is in
place, and then they get compiled incorrectly.
Also, surround the include with CONFIG_COMPILE_TEST to the current
location, but when the BROKEN tag is taken off, it will use the uapi
directory, and fail to compile. This is a good way to remind us to move
the header back.
Link: https://lore.kernel.org/all/20220330155835.5e1f6669@gandalf.local.home
Link: https://lkml.kernel.org/r/20220330201755.29319-1-mathieu.desnoyers@efficios.com
Link: https://lkml.kernel.org/r/20220401143903.188384f3@gandalf.local.home
Suggested-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Christophe Leroy [Wed, 30 Mar 2022 07:00:19 +0000 (09:00 +0200)]
ftrace: Make ftrace_graph_is_dead() a static branch
ftrace_graph_is_dead() is used on hot paths, it just reads a variable
in memory and is not worth suffering function call constraints.
For instance, at entry of prepare_ftrace_return(), inlining it avoids
saving prepare_ftrace_return() parameters to stack and restoring them
after calling ftrace_graph_is_dead().
While at it using a static branch is even more performant and is
rather well adapted considering that the returned value will almost
never change.
Inline ftrace_graph_is_dead() and replace 'kill_ftrace_graph' bool
by a static branch.
The performance improvement is noticeable.
Link: https://lkml.kernel.org/r/e0411a6a0ed3eafff0ad2bc9cd4b0e202b4617df.1648623570.git.christophe.leroy@csgroup.eu
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt (Google) [Wed, 30 Mar 2022 19:58:35 +0000 (15:58 -0400)]
tracing: Set user_events to BROKEN
After being merged, user_events become more visible to a wider audience
that have concerns with the current API. It is too late to fix this for
this release, but instead of a full revert, just mark it as BROKEN (which
prevents it from being selected in make config). Then we can work finding
a better API. If that fails, then it will need to be completely reverted.
Link: https://lore.kernel.org/all/2059213643.196683.1648499088753.JavaMail.zimbra@efficios.com/
Link: https://lkml.kernel.org/r/20220330155835.5e1f6669@gandalf.local.home
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Beau Belgrave [Tue, 29 Mar 2022 17:30:51 +0000 (10:30 -0700)]
tracing/user_events: Remove eBPF interfaces
Remove eBPF interfaces within user_events to ensure they are fully
reviewed.
Link: https://lore.kernel.org/all/20220329165718.GA10381@kbox/
Link: https://lkml.kernel.org/r/20220329173051.10087-1-beaub@linux.microsoft.com
Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Beau Belgrave [Mon, 28 Mar 2022 22:32:25 +0000 (15:32 -0700)]
tracing/user_events: Hold event_mutex during dyn_event_add
Make sure the event_mutex is properly held during dyn_event_add call.
This is required when adding dynamic events.
Link: https://lkml.kernel.org/r/20220328223225.1992-1-beaub@linux.microsoft.com
Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Lv Ruyi [Tue, 29 Mar 2022 10:40:04 +0000 (10:40 +0000)]
proc: bootconfig: Add null pointer check
kzalloc is a memory allocation function which can return NULL when some
internal memory errors happen. It is safer to add null pointer check.
Link: https://lkml.kernel.org/r/20220329104004.2376879-1-lv.ruyi@zte.com.cn
Cc: stable@vger.kernel.org
Fixes: c1a3c36017d4 ("proc: bootconfig: Add /proc/bootconfig to show boot config list")
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Lv Ruyi <lv.ruyi@zte.com.cn>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt (Google) [Tue, 29 Mar 2022 20:50:44 +0000 (16:50 -0400)]
tracing: Rename the staging files for trace_events
When looking for implementation of different phases of the creation of the
TRACE_EVENT() macro, it is pretty useless when all helper macro
redefinitions are in files labeled "stageX_defines.h". Rename them to
state which phase the files are for. For instance, when looking for the
defines that are used to create the event fields, seeing
"stage4_event_fields.h" gives the developer a good idea that the defines
are in that file.
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Li RongQing [Wed, 9 Mar 2022 08:35:44 +0000 (16:35 +0800)]
KVM: x86: fix sending PV IPI
If apic_id is less than min, and (max - apic_id) is greater than
KVM_IPI_CLUSTER_SIZE, then the third check condition is satisfied but
the new apic_id does not fit the bitmask. In this case __send_ipi_mask
should send the IPI.
This is mostly theoretical, but it can happen if the apic_ids on three
iterations of the loop are for example 1, KVM_IPI_CLUSTER_SIZE, 0.
Fixes: aaffcfd1e82 ("KVM: X86: Implement PV IPIs in linux guest")
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Message-Id: <
1646814944-51801-1-git-send-email-lirongqing@baidu.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Tue, 29 Mar 2022 16:56:24 +0000 (12:56 -0400)]
KVM: x86/mmu: do compare-and-exchange of gPTE via the user address
FNAME(cmpxchg_gpte) is an inefficient mess. It is at least decent if it
can go through get_user_pages_fast(), but if it cannot then it tries to
use memremap(); that is not just terribly slow, it is also wrong because
it assumes that the VM_PFNMAP VMA is contiguous.
The right way to do it would be to do the same thing as
hva_to_pfn_remapped() does since commit
add6a0cd1c5b ("KVM: MMU: try to
fix up page faults before giving up", 2016-07-05), using follow_pte()
and fixup_user_fault() to determine the correct address to use for
memremap(). To do this, one could for example extract hva_to_pfn()
for use outside virt/kvm/kvm_main.c. But really there is no reason to
do that either, because there is already a perfectly valid address to
do the cmpxchg() on, only it is a userspace address. That means doing
user_access_begin()/user_access_end() and writing the code in assembly
to handle exceptions correctly. Worse, the guest PTE can be 8-byte
even on i686 so there is the extra complication of using cmpxchg8b to
account for. But at least it is an efficient mess.
(Thanks to Linus for suggesting improvement on the inline assembly).
Reported-by: Qiuhao Li <qiuhao@sysec.org>
Reported-by: Gaoning Pan <pgn@zju.edu.cn>
Reported-by: Yongkang Jia <kangel@zju.edu.cn>
Reported-by: syzbot+6cde2282daa792c49ab8@syzkaller.appspotmail.com
Debugged-by: Tadeusz Struk <tadeusz.struk@linaro.org>
Tested-by: Maxim Levitsky <mlevitsk@redhat.com>
Cc: stable@vger.kernel.org
Fixes: bd53cb35a3e9 ("X86/KVM: Handle PFNs outside of kernel reach when touching GPTEs")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Zhenzhong Duan [Fri, 11 Mar 2022 10:26:43 +0000 (18:26 +0800)]
KVM: x86: Remove redundant vm_entry_controls_clearbit() call
When emulating exit from long mode, EFER_LMA is cleared with
vmx_set_efer(). This will already unset the VM_ENTRY_IA32E_MODE control
bit as requested by SDM, so there is no need to unset VM_ENTRY_IA32E_MODE
again in exit_lmode() explicitly. In case EFER isn't supported by
hardware, long mode isn't supported, so exit_lmode() cannot be reached.
Note that, thanks to the shadow controls mechanism, this change doesn't
eliminate vmread or vmwrite.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Message-Id: <
20220311102643.807507-3-zhenzhong.duan@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Zhenzhong Duan [Fri, 11 Mar 2022 10:26:42 +0000 (18:26 +0800)]
KVM: x86: cleanup enter_rmode()
vmx_set_efer() sets uret->data but, in fact if the value of uret->data
will be used vmx_setup_uret_msrs() will have rewritten it with the value
returned by update_transition_efer(). uret->data is consumed if and only
if uret->load_into_hardware is true, and vmx_setup_uret_msrs() takes care
of (a) updating uret->data before setting uret->load_into_hardware to true
(b) setting uret->load_into_hardware to false if uret->data isn't updated.
Opportunistically use "vmx" directly instead of redoing to_vmx().
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Message-Id: <
20220311102643.807507-2-zhenzhong.duan@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Maxim Levitsky [Tue, 22 Mar 2022 17:24:48 +0000 (19:24 +0200)]
KVM: x86: SVM: fix tsc scaling when the host doesn't support it
It was decided that when TSC scaling is not supported,
the virtual MSR_AMD64_TSC_RATIO should still have the default '1.0'
value.
However in this case kvm_max_tsc_scaling_ratio is not set,
which breaks various assumptions.
Fix this by always calculating kvm_max_tsc_scaling_ratio regardless of
host support. For consistency, do the same for VMX.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <
20220322172449.235575-8-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Maxim Levitsky [Tue, 22 Mar 2022 17:24:47 +0000 (19:24 +0200)]
kvm: x86: SVM: remove unused defines
Remove some unused #defines from svm.c
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <
20220322172449.235575-7-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Maxim Levitsky [Tue, 22 Mar 2022 17:24:46 +0000 (19:24 +0200)]
KVM: x86: SVM: move tsc ratio definitions to svm.h
Another piece of SVM spec which should be in the header file
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <
20220322172449.235575-6-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Maxim Levitsky [Tue, 22 Mar 2022 17:24:45 +0000 (19:24 +0200)]
KVM: x86: SVM: fix avic spec based definitions again
Due to wrong rebase, commit
4a204f7895878 ("KVM: SVM: Allow AVIC support on system w/ physical APIC ID > 255")
moved avic spec #defines back to avic.c.
Move them back, and while at it extend AVIC_DOORBELL_PHYSICAL_ID_MASK to 12
bits as well (it will be used in nested avic)
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <
20220322172449.235575-5-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Sun, 13 Mar 2022 14:05:22 +0000 (15:05 +0100)]
KVM: MIPS: remove reference to trap&emulate virtualization
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <
20220313140522.
1307751-1-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Tue, 15 Mar 2022 22:17:15 +0000 (18:17 -0400)]
KVM: x86: document limitations of MSR filtering
MSR filtering requires an exit to userspace that is hard to implement and
would be very slow in the case of nested VMX vmexit and vmentry MSR
accesses. Document the limitation.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Hou Wenlong [Mon, 7 Mar 2022 12:26:33 +0000 (20:26 +0800)]
KVM: x86: Only do MSR filtering when access MSR by rdmsr/wrmsr
If MSR access is rejected by MSR filtering,
kvm_set_msr()/kvm_get_msr() would return KVM_MSR_RET_FILTERED,
and the return value is only handled well for rdmsr/wrmsr.
However, some instruction emulation and state transition also
use kvm_set_msr()/kvm_get_msr() to do msr access but may trigger
some unexpected results if MSR access is rejected, E.g. RDPID
emulation would inject a #UD but RDPID wouldn't cause a exit
when RDPID is supported in hardware and ENABLE_RDTSCP is set.
And it would also cause failure when load MSR at nested entry/exit.
Since msr filtering is based on MSR bitmap, it is better to only
do MSR filtering for rdmsr/wrmsr.
Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
Message-Id: <
2b2774154f7532c96a6f04d71c82a8bec7d9e80b.
1646655860.git.houwenlong.hwl@antgroup.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Hou Wenlong [Wed, 2 Mar 2022 13:15:14 +0000 (21:15 +0800)]
KVM: x86/emulator: Emulate RDPID only if it is enabled in guest
When RDTSCP is supported but RDPID is not supported in host,
RDPID emulation is available. However, __kvm_get_msr() would
only fail when RDTSCP/RDPID both are disabled in guest, so
the emulator wouldn't inject a #UD when RDPID is disabled but
RDTSCP is enabled in guest.
Fixes: fb6d4d340e05 ("KVM: x86: emulate RDPID")
Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
Message-Id: <
1dfd46ae5b76d3ed87bde3154d51c64ea64c99c1.
1646226788.git.houwenlong.hwl@antgroup.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Like Xu [Wed, 9 Mar 2022 08:42:57 +0000 (16:42 +0800)]
KVM: x86/pmu: Fix and isolate TSX-specific performance event logic
HSW_IN_TX* bits are used in generic code which are not supported on
AMD. Worse, these bits overlap with AMD EventSelect[11:8] and hence
using HSW_IN_TX* bits unconditionally in generic code is resulting in
unintentional pmu behavior on AMD. For example, if EventSelect[11:8]
is 0x2, pmc_reprogram_counter() wrongly assumes that
HSW_IN_TX_CHECKPOINTED is set and thus forces sampling period to be 0.
Also per the SDM, both bits 32 and 33 "may only be set if the processor
supports HLE or RTM" and for "IN_TXCP (bit 33): this bit may only be set
for IA32_PERFEVTSEL2."
Opportunistically eliminate code redundancy, because if the HSW_IN_TX*
bit is set in pmc->eventsel, it is already set in attr.config.
Reported-by: Ravi Bangoria <ravi.bangoria@amd.com>
Reported-by: Jim Mattson <jmattson@google.com>
Fixes: 103af0a98788 ("perf, kvm: Support the in_tx/in_tx_cp modifiers in KVM arch perfmon emulation v5")
Co-developed-by: Ravi Bangoria <ravi.bangoria@amd.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Signed-off-by: Like Xu <likexu@tencent.com>
Message-Id: <
20220309084257.88931-1-likexu@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Maxim Levitsky [Wed, 2 Mar 2022 10:24:57 +0000 (12:24 +0200)]
KVM: x86: mmu: trace kvm_mmu_set_spte after the new SPTE was set
It makes more sense to print new SPTE value than the
old value.
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Message-Id: <
20220302102457.588450-1-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Jim Mattson [Sat, 26 Feb 2022 23:41:31 +0000 (15:41 -0800)]
KVM: x86/svm: Clear reserved bits written to PerfEvtSeln MSRs
AMD EPYC CPUs never raise a #GP for a WRMSR to a PerfEvtSeln MSR. Some
reserved bits are cleared, and some are not. Specifically, on
Zen3/Milan, bits 19 and 42 are not cleared.
When emulating such a WRMSR, KVM should not synthesize a #GP,
regardless of which bits are set. However, undocumented bits should
not be passed through to the hardware MSR. So, rather than checking
for reserved bits and synthesizing a #GP, just clear the reserved
bits.
This may seem pedantic, but since KVM currently does not support the
"Host/Guest Only" bits (41:40), it is necessary to clear these bits
rather than synthesizing #GP, because some popular guests (e.g Linux)
will set the "Host Only" bit even on CPUs that don't support
EFER.SVME, and they don't expect a #GP.
For example,
root@Ubuntu1804:~# perf stat -e r26 -a sleep 1
Performance counter stats for 'system wide':
0 r26
1.
001070977 seconds time elapsed
Feb 23 03:59:58 Ubuntu1804 kernel: [ 405.379957] unchecked MSR access error: WRMSR to 0xc0010200 (tried to write 0x0000020000130026) at rIP: 0xffffffff9b276a28 (native_write_msr+0x8/0x30)
Feb 23 03:59:58 Ubuntu1804 kernel: [ 405.379958] Call Trace:
Feb 23 03:59:58 Ubuntu1804 kernel: [ 405.379963] amd_pmu_disable_event+0x27/0x90
Fixes: ca724305a2b0 ("KVM: x86/vPMU: Implement AMD vPMU code for KVM")
Reported-by: Lotus Fenn <lotusf@google.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Like Xu <likexu@tencent.com>
Reviewed-by: David Dunn <daviddunn@google.com>
Message-Id: <
20220226234131.
2167175-1-jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Fri, 11 Mar 2022 04:35:17 +0000 (04:35 +0000)]
KVM: x86: Trace all APICv inhibit changes and capture overall status
Trace all APICv inhibit changes instead of just those that result in
APICv being (un)inhibited, and log the current state. Debugging why
APICv isn't working is frustrating as it's hard to see why APICv is still
inhibited, and logging only the first inhibition means unnecessary onion
peeling.
Opportunistically drop the export of the tracepoint, it is not and should
not be used by vendor code due to the need to serialize toggling via
apicv_update_lock.
Note, using the common flow means kvm_apicv_init() switched from atomic
to non-atomic bitwise operations. The VM is unreachable at init, so
non-atomic is perfectly ok.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <
20220311043517.17027-4-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Fri, 11 Mar 2022 04:35:16 +0000 (04:35 +0000)]
KVM: x86: Add wrappers for setting/clearing APICv inhibits
Add set/clear wrappers for toggling APICv inhibits to make the call sites
more readable, and opportunistically rename the inner helpers to align
with the new wrappers and to make them more readable as well. Invert the
flag from "activate" to "set"; activate is painfully ambiguous as it's
not obvious if the inhibit is being activated, or if APICv is being
activated, in which case the inhibit is being deactivated.
For the functions that take @set, swap the order of the inhibit reason
and @set so that the call sites are visually similar to those that bounce
through the wrapper.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <
20220311043517.17027-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Fri, 11 Mar 2022 04:35:15 +0000 (04:35 +0000)]
KVM: x86: Make APICv inhibit reasons an enum and cleanup naming
Use an enum for the APICv inhibit reasons, there is no meaning behind
their values and they most definitely are not "unsigned longs". Rename
the various params to "reason" for consistency and clarity (inhibit may
be confused as a command, i.e. inhibit APICv, instead of the reason that
is getting toggled/checked).
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <
20220311043517.17027-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Lai Jiangshan [Fri, 11 Mar 2022 07:03:44 +0000 (15:03 +0800)]
KVM: X86: Handle implicit supervisor access with SMAP
There are two kinds of implicit supervisor access
implicit supervisor access when CPL = 3
implicit supervisor access when CPL < 3
Current permission_fault() handles only the first kind for SMAP.
But if the access is implicit when SMAP is on, data may not be read
nor write from any user-mode address regardless the current CPL.
So the second kind should be also supported.
The first kind can be detect via CPL and access mode: if it is
supervisor access and CPL = 3, it must be implicit supervisor access.
But it is not possible to detect the second kind without extra
information, so this patch adds an artificial PFERR_EXPLICIT_ACCESS
into @access. This extra information also works for the first kind, so
the logic is changed to use this information for both cases.
The value of PFERR_EXPLICIT_ACCESS is deliberately chosen to be bit 48
which is in the most significant 16 bits of u64 and less likely to be
forced to change due to future hardware uses it.
This patch removes the call to ->get_cpl() for access mode is determined
by @access. Not only does it reduce a function call, but also remove
confusions when the permission is checked for nested TDP. The nested
TDP shouldn't have SMAP checking nor even the L2's CPL have any bearing
on it. The original code works just because it is always user walk for
NPT and SMAP fault is not set for EPT in update_permission_bitmask.
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
Message-Id: <
20220311070346.45023-5-jiangshanlai@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Lai Jiangshan [Fri, 11 Mar 2022 07:03:43 +0000 (15:03 +0800)]
KVM: X86: Rename variable smap to not_smap in permission_fault()
Comments above the variable says the bit is set when SMAP is overridden
or the same meaning in update_permission_bitmask(): it is not subjected
to SMAP restriction.
Renaming it to reflect the negative implication and make the code better
readability.
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
Message-Id: <
20220311070346.45023-4-jiangshanlai@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Lai Jiangshan [Fri, 11 Mar 2022 07:03:42 +0000 (15:03 +0800)]
KVM: X86: Fix comments in update_permission_bitmask
The commit
09f037aa48f3 ("KVM: MMU: speedup update_permission_bitmask")
refactored the code of update_permission_bitmask() and change the
comments. It added a condition into a list to match the new code,
so the number/order for conditions in the comments should be updated
too.
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
Message-Id: <
20220311070346.45023-3-jiangshanlai@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Lai Jiangshan [Fri, 11 Mar 2022 07:03:41 +0000 (15:03 +0800)]
KVM: X86: Change the type of access u32 to u64
Change the type of access u32 to u64 for FNAME(walk_addr) and
->gva_to_gpa().
The kinds of accesses are usually combinations of UWX, and VMX/SVM's
nested paging adds a new factor of access: is it an access for a guest
page table or for a final guest physical address.
And SMAP relies a factor for supervisor access: explicit or implicit.
So @access in FNAME(walk_addr) and ->gva_to_gpa() is better to include
all these information to do the walk.
Although @access(u32) has enough bits to encode all the kinds, this
patch extends it to u64:
o Extra bits will be in the higher 32 bits, so that we can
easily obtain the traditional access mode (UWX) by converting
it to u32.
o Reuse the value for the access kind defined by SVM's nested
paging (PFERR_GUEST_FINAL_MASK and PFERR_GUEST_PAGE_MASK) as
@error_code in kvm_handle_page_fault().
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
Message-Id: <
20220311070346.45023-2-jiangshanlai@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
David Woodhouse [Thu, 3 Mar 2022 15:41:12 +0000 (15:41 +0000)]
KVM: Remove dirty handling from gfn_to_pfn_cache completely
It isn't OK to cache the dirty status of a page in internal structures
for an indefinite period of time.
Any time a vCPU exits the run loop to userspace might be its last; the
VMM might do its final check of the dirty log, flush the last remaining
dirty pages to the destination and complete a live migration. If we
have internal 'dirty' state which doesn't get flushed until the vCPU
is finally destroyed on the source after migration is complete, then
we have lost data because that will escape the final copy.
This problem already exists with the use of kvm_vcpu_unmap() to mark
pages dirty in e.g. VMX nesting.
Note that the actual Linux MM already considers the page to be dirty
since we have a writeable mapping of it. This is just about the KVM
dirty logging.
For the nesting-style use cases (KVM_GUEST_USES_PFN) we will need to
track which gfn_to_pfn_caches have been used and explicitly mark the
corresponding pages dirty before returning to userspace. But we would
have needed external tracking of that anyway, rather than walking the
full list of GPCs to find those belonging to this vCPU which are dirty.
So let's rely *solely* on that external tracking, and keep it simple
rather than laying a tempting trap for callers to fall into.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <
20220303154127.202856-3-dwmw2@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Thu, 3 Mar 2022 15:41:11 +0000 (15:41 +0000)]
KVM: Use enum to track if cached PFN will be used in guest and/or host
Replace the guest_uses_pa and kernel_map booleans in the PFN cache code
with a unified enum/bitmask. Using explicit names makes it easier to
review and audit call sites.
Opportunistically add a WARN to prevent passing garbage; instantating a
cache without declaring its usage is either buggy or pointless.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <
20220303154127.202856-2-dwmw2@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Peter Gonda [Fri, 4 Mar 2022 16:10:32 +0000 (08:10 -0800)]
KVM: SVM: Fix kvm_cache_regs.h inclusions for is_guest_mode()
Include kvm_cache_regs.h to pick up the definition of is_guest_mode(),
which is referenced by nested_svm_virtualize_tpr() in svm.h. Remove
include from svm_onhpyerv.c which was done only because of lack of
include in svm.h.
Fixes: 883b0a91f41ab ("KVM: SVM: Move Nested SVM Implementation to nested.c")
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Peter Gonda <pgonda@google.com>
Message-Id: <
20220304161032.
2270688-1-pgonda@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Jim Mattson [Tue, 8 Mar 2022 01:24:52 +0000 (17:24 -0800)]
KVM: x86/pmu: Use different raw event masks for AMD and Intel
The third nybble of AMD's event select overlaps with Intel's IN_TX and
IN_TXCP bits. Therefore, we can't use AMD64_RAW_EVENT_MASK on Intel
platforms that support TSX.
Declare a raw_event_mask in the kvm_pmu structure, initialize it in
the vendor-specific pmu_refresh() functions, and use that mask for
PERF_TYPE_RAW configurations in reprogram_gp_counter().
Fixes: 710c47651431 ("KVM: x86/pmu: Use AMD64_RAW_EVENT_MASK for PERF_TYPE_RAW")
Signed-off-by: Jim Mattson <jmattson@google.com>
Message-Id: <
20220308012452.
3468611-1-jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Wed, 23 Feb 2022 16:53:02 +0000 (16:53 +0000)]
KVM: Don't actually set a request when evicting vCPUs for GFN cache invd
Don't actually set a request bit in vcpu->requests when making a request
purely to force a vCPU to exit the guest. Logging a request but not
actually consuming it would cause the vCPU to get stuck in an infinite
loop during KVM_RUN because KVM would see the pending request and bail
from VM-Enter to service the request.
Note, it's currently impossible for KVM to set KVM_REQ_GPC_INVALIDATE as
nothing in KVM is wired up to set guest_uses_pa=true. But, it'd be all
too easy for arch code to introduce use of kvm_gfn_to_pfn_cache_init()
without implementing handling of the request, especially since getting
test coverage of MMU notifier interaction with specific KVM features
usually requires a directed test.
Opportunistically rename gfn_to_pfn_cache_invalidate_start()'s wake_vcpus
to evict_vcpus. The purpose of the request is to get vCPUs out of guest
mode, it's supposed to _avoid_ waking vCPUs that are blocking.
Opportunistically rename KVM_REQ_GPC_INVALIDATE to be more specific as to
what it wants to accomplish, and to genericize the name so that it can
used for similar but unrelated scenarios, should they arise in the future.
Add a comment and documentation to explain why the "no action" request
exists.
Add compile-time assertions to help detect improper usage. Use the inner
assertless helper in the one s390 path that makes requests without a
hardcoded request.
Cc: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <
20220223165302.
3205276-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
David Woodhouse [Tue, 29 Mar 2022 17:11:47 +0000 (13:11 -0400)]
KVM: avoid double put_page with gfn-to-pfn cache
If the cache's user host virtual address becomes invalid, there
is still a path from kvm_gfn_to_pfn_cache_refresh() where __release_gpc()
could release the pfn but the gpc->pfn field has not been overwritten
with an error value. If this happens, kvm_gfn_to_pfn_cache_unmap will
call put_page again on the same page.
Cc: stable@vger.kernel.org
Fixes: 982ed0de4753 ("KVM: Reinstate gfn_to_pfn_cache with invalidation support")
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>