Daniel Lezcano [Tue, 9 Jan 2024 09:41:12 +0000 (10:41 +0100)]
thermal/debugfs: Add thermal debugfs information for mitigation episodes
The mitigation episodes are recorded. A mitigation episode starts
when the first trip point is crossed on the way up and ends when it is
crossed on the way down. Other trip points can also be crossed during
this episode and are accounted for in it. The interesting information
is the average temperature at the trip point, the undershoot and the
overshoot. The standard deviation of the mitigated temperature will be
added later.
The thermal debugfs directory structure tries to stay consistent with
the sysfs one but in a very simplified way:
thermal/
 `-- thermal_zones
     |-- 0
     |   `-- mitigations
     `-- 1
         `-- mitigations
The content of the mitigations file has the following format:
,-Mitigation at 349988258us, duration=130136ms
| trip |    type | temp(°mC) | hyst(°mC) | duration | avg(°mC) | min(°mC) | max(°mC) |
|    0 | passive |     65000 |      2000 |   130136 |    68227 |    62500 |    75625 |
|    1 | passive |     75000 |      2000 |   104209 |    74857 |    71666 |    77500 |
,-Mitigation at 272451637us, duration=75000ms
| trip |    type | temp(°mC) | hyst(°mC) | duration | avg(°mC) | min(°mC) | max(°mC) |
|    0 | passive |     65000 |      2000 |    75000 |    68561 |    62500 |    75000 |
|    1 | passive |     75000 |      2000 |    60714 |    74820 |    70555 |    77500 |
,-Mitigation at 238184119us, duration=27316ms
| trip |    type | temp(°mC) | hyst(°mC) | duration | avg(°mC) | min(°mC) | max(°mC) |
|    0 | passive |     65000 |      2000 |    27316 |    73377 |    62500 |    75000 |
|    1 | passive |     75000 |      2000 |    19468 |    75284 |    69444 |    77500 |
,-Mitigation at 39863713us, duration=136196ms
| trip |    type | temp(°mC) | hyst(°mC) | duration | avg(°mC) | min(°mC) | max(°mC) |
|    0 | passive |     65000 |      2000 |   136196 |    73922 |    62500 |    75000 |
|    1 | passive |     75000 |      2000 |    91721 |    74386 |    69444 |    78125 |
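The per-trip avg/min/max columns above could be accumulated sample by sample, as in the following userspace sketch. It is only an illustration under assumptions: the names trip_stats, trip_stats_init() and trip_stats_update() are hypothetical, and the running-average formula is one possible choice, not necessarily what the patch does.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical per-trip statistics for one mitigation episode. */
struct trip_stats {
	int min;	/* lowest temperature seen, in millidegrees Celsius */
	int max;	/* highest temperature seen */
	int avg;	/* running average */
	int count;	/* number of samples accounted so far */
};

static void trip_stats_init(struct trip_stats *s)
{
	s->min = INT_MAX;
	s->max = INT_MIN;
	s->avg = 0;
	s->count = 0;
}

/* Account one temperature sample taken while the trip is crossed. */
static void trip_stats_update(struct trip_stats *s, int temp)
{
	assert(s->count < INT_MAX);

	if (temp < s->min)
		s->min = temp;
	if (temp > s->max)
		s->max = temp;

	s->count++;
	/* Incremental mean: avg += (x - avg) / n, with integer truncation. */
	s->avg += (temp - s->avg) / s->count;
}
```

The incremental mean avoids storing every sample, at the cost of a small truncation error from the integer division.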
More information for a better understanding of the thermal behavior
will be added later. The idea is to give detailed statistics
about the undershoots and overshoots, the rate of temperature change,
etc. As all that information is too much for a single file, the idea
would be to create a directory named after the mitigation timestamp
where all the data could be added.
Please note this code is immune to trip ordering but not to
a trip temperature change while a mitigation is happening. However,
this situation should be extremely rare, perhaps never happening, and we
might ask ourselves whether something should be done in the core
framework for other components first.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
[ rjw: White space fixups, rebase ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Daniel Lezcano [Tue, 9 Jan 2024 09:41:11 +0000 (10:41 +0100)]
thermal/debugfs: Add thermal cooling device debugfs information
The thermal framework does not have any debug information except a
sysfs stat which is a bit controversial. This one allocates big chunks
of memory for every cooling device with a high number of states and
could amount, on some production systems, to several megabytes of
memory for just a portion of it. As sysfs output is limited to one
page, the output is not usable with large data arrays and gets
truncated.
The patch provides the same information as sysfs, except that the
transitions are dynamically allocated, so they won't show more
events than the ones which actually occurred. There is no longer a
size limitation, and it opens the field for more debugging information
in debugfs, which is designed for that purpose, unlike sysfs.
The thermal debugfs directory structure tries to stay consistent with
the sysfs one but in a very simplified way:
thermal/
 `-- cooling_devices
     |-- 0
     |   |-- clear
     |   |-- time_in_state_ms
     |   |-- total_trans
     |   `-- trans_table
     |-- 1
     |   |-- clear
     |   |-- time_in_state_ms
     |   |-- total_trans
     |   `-- trans_table
     |-- 2
     |   |-- clear
     |   |-- time_in_state_ms
     |   |-- total_trans
     |   `-- trans_table
     |-- 3
     |   |-- clear
     |   |-- time_in_state_ms
     |   |-- total_trans
     |   `-- trans_table
     `-- 4
         |-- clear
         |-- time_in_state_ms
         |-- total_trans
         `-- trans_table
The content of the files in the cooling devices directory is the same
as the sysfs one except for the trans_table which has the following
format:
Transition Hits
1->0 246
0->1 246
2->1 632
1->2 632
3->2 98
2->3 98
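The idea of allocating transition records only for transitions that actually occurred can be sketched in userspace as follows. This is an assumption-laden illustration, not the patch's code: struct transition, transition_account() and transition_hits() are hypothetical names, and a plain singly linked list stands in for whatever container the kernel code uses.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical record of one observed state transition. */
struct transition {
	int from;
	int to;
	unsigned long hits;
	struct transition *next;
};

/*
 * Count a transition, allocating a record only the first time it is seen.
 * Returns 0 on success or -1 if the allocation fails.
 */
static int transition_account(struct transition **head, int from, int to)
{
	struct transition *t;

	assert(head);

	for (t = *head; t; t = t->next) {
		if (t->from == from && t->to == to) {
			t->hits++;
			return 0;
		}
	}

	t = malloc(sizeof(*t));
	if (!t)
		return -1;

	t->from = from;
	t->to = to;
	t->hits = 1;
	t->next = *head;
	*head = t;
	return 0;
}

/* Return the hit count for a transition, 0 if it never occurred. */
static unsigned long transition_hits(const struct transition *head,
				     int from, int to)
{
	for (; head; head = head->next)
		if (head->from == from && head->to == to)
			return head->hits;
	return 0;
}
```

Memory use then grows with the number of distinct transitions observed rather than with the square of the number of states.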
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
[ rjw: White space fixups, rebase ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Rafael J. Wysocki [Wed, 3 Jan 2024 11:59:10 +0000 (12:59 +0100)]
thermal: netlink: Pass thermal zone pointer to notify routines
There are several routines in the thermal netlink API that take a
thermal zone ID or a thermal zone type as their arguments, but from
their callers' perspective it would be more convenient to pass a thermal
zone pointer to them and let them extract the necessary data from the
given thermal zone object by themselves.
Modify the code accordingly.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Rafael J. Wysocki [Fri, 15 Dec 2023 19:59:08 +0000 (20:59 +0100)]
thermal: netlink: Drop thermal_notify_tz_trip_add/delete()
Because thermal_notify_tz_trip_add/delete() are never used, drop them
entirely along with the related code.
The addition or removal of trip points is not supported by the thermal
core and is unlikely to be supported in the future, so it is also
unlikely that these functions will ever be needed.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Rafael J. Wysocki [Fri, 15 Dec 2023 19:57:50 +0000 (20:57 +0100)]
thermal: netlink: Pass pointers to thermal_notify_tz_trip_up/down()
Instead of requiring the callers of thermal_notify_tz_trip_up/down() to
provide specific values needed to populate struct param in them, make
them extract those values from objects passed by the callers via const
pointers.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Rafael J. Wysocki [Wed, 3 Jan 2024 11:49:57 +0000 (12:49 +0100)]
thermal: netlink: Pass pointers to thermal_notify_tz_trip_change()
Instead of requiring the caller of thermal_notify_tz_trip_change() to
provide specific values needed to populate struct param in it, make it
extract those values from objects passed to it by the caller via const
pointers.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Rafael J. Wysocki [Fri, 15 Dec 2023 19:53:52 +0000 (20:53 +0100)]
thermal: trip: Constify thermal zone argument of thermal_zone_trip_id()
Because thermal_zone_trip_id() does not update the thermal zone object
passed to it, its pointer argument representing the thermal zone can be
const, so adjust its definition accordingly.
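As a userspace illustration of why the argument can be const, the sketch below assumes the trip ID is simply the index of the trip in the zone's trips array, so the lookup only reads the zone. The types struct zone and struct trip and the function zone_trip_id() are simplified, hypothetical stand-ins for the kernel ones.

```c
#include <assert.h>
#include <stddef.h>

struct trip {
	int temperature;
};

struct zone {
	struct trip *trips;
	int num_trips;
};

/* Read-only lookup: the zone is never modified, so it can be const. */
static int zone_trip_id(const struct zone *tz, const struct trip *trip)
{
	/* The trip is assumed to live inside the zone's trips[] array. */
	assert(trip >= tz->trips && trip < tz->trips + tz->num_trips);

	return (int)(trip - tz->trips);
}
```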
No functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Rafael J. Wysocki [Tue, 2 Jan 2024 12:45:36 +0000 (13:45 +0100)]
Merge tag 'thermal-v6.8-rc1' of ssh://gitolite./linux/kernel/git/thermal/linux into thermal
Merge thermal control material for 6.8-rc1 from Daniel Lezcano:
"- Converted Mediatek Thermal to the json-schema (Rafał Miłecki)
- Fixed DT bindings issue on Loongson (Binbin Zhou)
- Fixed returning NULL instead of -ENODEV on Loongson (Binbin Zhou)
- Added the DT binding for the tsens on SM8650 platform (Neil Armstrong)
- Added a reboot on critical option feature (Fabio Estevam)
- Made usage of DEFINE_SIMPLE_DEV_PM_OPS on AmLogic (Uwe Kleine-König)
- Added the D1/T113s THS controller support on Sun8i (Maxim Kiselev)
- Fixed example in the DT binding for QCom SPMI (Johan Hovold)
- Fixed compilation warning for the tmon utility (Florian Eckert)
- Added interrupt based configuration on Exynos along with a set of
related cleanups (Mateusz Majewski)"
* tag 'thermal-v6.8-rc1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/thermal/linux: (24 commits)
thermal/drivers/exynos: Use set_trips ops
thermal/drivers/exynos: Use BIT wherever possible
thermal/drivers/exynos: Split initialization of TMU and the thermal zone
thermal/drivers/exynos: Stop using the threshold mechanism on Exynos 4210
thermal/drivers/exynos: Simplify regulator (de)initialization
thermal/drivers/exynos: Handle devm_regulator_get_optional return value correctly
thermal/drivers/exynos: Switch from workqueue-driven interrupt handling to threaded interrupts
thermal/drivers/exynos: Drop id field
thermal/drivers/exynos: Remove an unnecessary field description
tools/thermal/tmon: Fix compilation warning for wrong format
dt-bindings: thermal: qcom-spmi-adc-tm5/hc: Clean up examples
dt-bindings: thermal: qcom-spmi-adc-tm5/hc: Fix example node names
thermal/drivers/sun8i: Add D1/T113s THS controller support
dt-bindings: thermal: sun8i: Add binding for D1/T113s THS controller
thermal: amlogic: Use DEFINE_SIMPLE_DEV_PM_OPS for PM functions
thermal: amlogic: Make amlogic_thermal_disable() return void
thermal/thermal_of: Allow rebooting after critical temp
reboot: Introduce thermal_zone_device_critical_reboot()
thermal/core: Prepare for introduction of thermal reboot
dt-bindings: thermal-zones: Document critical-action
...
Mateusz Majewski [Fri, 1 Dec 2023 09:56:25 +0000 (10:56 +0100)]
thermal/drivers/exynos: Use set_trips ops
Currently, each trip point defined in the device tree corresponds to a
single hardware interrupt. This commit instead switches to using two
hardware interrupts, whose values are set dynamically using the
set_trips callback. Additionally, the critical temperature threshold is
handled specifically.
Setting interrupts in this way also fixes a long-standing lockdep
warning, which was caused by calling thermal_zone_get_trips with our
lock being held. Do note that this requires TMU initialization to be
split into two parts, as done by the parent commit: parts of the
initialization call into the thermal_zone_device structure and so must
be done after its registration, but the initialization is also
responsible for setting up calibration, which must be done before
thermal_zone_device registration, which will call set_trips for the
first time; if the calibration is not done in time, the interrupt values
will be silently wrong!
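A rough userspace sketch of the two-threshold idea follows: given the current temperature, pick the nearest threshold below it and the nearest one above it, which is what a set_trips callback would then program into the two interrupts. The names struct trip, struct window and trips_window() are hypothetical, and applying the hysteresis only to the lower bound is an assumption of this sketch.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

struct trip {
	int temperature;	/* millidegrees Celsius */
	int hysteresis;
};

struct window {
	int low;		/* INT_MIN when no threshold lies below */
	int high;		/* INT_MAX when no threshold lies above */
};

/* Find the closest thresholds surrounding the current temperature. */
static struct window trips_window(const struct trip *trips, size_t n, int temp)
{
	struct window w = { .low = INT_MIN, .high = INT_MAX };
	size_t i;

	assert(trips || n == 0);

	for (i = 0; i < n; i++) {
		int trip_low = trips[i].temperature - trips[i].hysteresis;

		if (trip_low < temp && trip_low > w.low)
			w.low = trip_low;
		if (trips[i].temperature > temp && trips[i].temperature < w.high)
			w.high = trips[i].temperature;
	}

	return w;
}
```

With trips at 65000 and 75000 (hysteresis 2000) and a temperature of 70000, the window is 63000 to 75000; only two interrupts are needed regardless of the number of trips.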
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Mateusz Majewski <m.majewski2@samsung.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20231201095625.301884-10-m.majewski2@samsung.com
Mateusz Majewski [Fri, 1 Dec 2023 09:56:24 +0000 (10:56 +0100)]
thermal/drivers/exynos: Use BIT wherever possible
The original driver did not use that macro and it allows us to make our
intentions slightly clearer.
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Mateusz Majewski <m.majewski2@samsung.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20231201095625.301884-9-m.majewski2@samsung.com
Mateusz Majewski [Fri, 1 Dec 2023 09:56:23 +0000 (10:56 +0100)]
thermal/drivers/exynos: Split initialization of TMU and the thermal zone
This will be needed in the future, as the thermal zone subsystem might
call our callbacks right after devm_thermal_of_zone_register. Currently
we just make get_temp return EAGAIN in such a case, but this will not be
possible with state-modifying callbacks, for instance set_trips.
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Mateusz Majewski <m.majewski2@samsung.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20231201095625.301884-8-m.majewski2@samsung.com
Mateusz Majewski [Fri, 1 Dec 2023 09:56:22 +0000 (10:56 +0100)]
thermal/drivers/exynos: Stop using the threshold mechanism on Exynos 4210
Exynos 4210 supports setting a base threshold value, which is added to
all trip points. This might be useful, but is not really necessary in
our use case, so we always set it to 0 to simplify the code a bit.
Additionally, this change makes it so that we convert the value to the
calibrated one in a slightly different place. This is more correct
in principle, though it makes no difference when single-point
calibration is being used (which is the case currently).
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Mateusz Majewski <m.majewski2@samsung.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20231201095625.301884-7-m.majewski2@samsung.com
Mateusz Majewski [Fri, 1 Dec 2023 09:56:21 +0000 (10:56 +0100)]
thermal/drivers/exynos: Simplify regulator (de)initialization
We rewrite the initialization to enable the regulator as part of devm,
which allows us to not handle the struct instance manually.
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Mateusz Majewski <m.majewski2@samsung.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20231201095625.301884-6-m.majewski2@samsung.com
Mateusz Majewski [Fri, 1 Dec 2023 09:56:20 +0000 (10:56 +0100)]
thermal/drivers/exynos: Handle devm_regulator_get_optional return value correctly
Currently, if a regulator is required by the SoC but
devm_regulator_get_optional() fails for whatever reason, execution
proceeds without propagating the error. Meanwhile, there is no
reason to print an error in the -ENODEV case.
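The intended policy can be summarized by the userspace sketch below. It does not reproduce the driver code, which the message does not show: optional_regulator_status() is a hypothetical helper that takes the error code obtained from an optional regulator lookup and returns what probe should do with it.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical helper: map the error from an optional regulator lookup
 * (0 on success, negative errno otherwise) to the value probe should return.
 */
static int optional_regulator_status(int err)
{
	assert(err <= 0);

	if (err == -ENODEV)
		return 0;	/* no regulator described: not an error, stay silent */

	return err;		/* success (0) or a real failure to propagate */
}
```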
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Mateusz Majewski <m.majewski2@samsung.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20231201095625.301884-5-m.majewski2@samsung.com
Mateusz Majewski [Fri, 1 Dec 2023 09:56:19 +0000 (10:56 +0100)]
thermal/drivers/exynos: Switch from workqueue-driven interrupt handling to threaded interrupts
The workqueue boilerplate is mostly one-to-one what the threaded
interrupts do.
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Mateusz Majewski <m.majewski2@samsung.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20231201095625.301884-4-m.majewski2@samsung.com
Mateusz Majewski [Fri, 1 Dec 2023 09:56:18 +0000 (10:56 +0100)]
thermal/drivers/exynos: Drop id field
We do not use the value, and only Exynos 7 defines this alias anyway.
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Mateusz Majewski <m.majewski2@samsung.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20231201095625.301884-3-m.majewski2@samsung.com
Mateusz Majewski [Fri, 1 Dec 2023 09:56:17 +0000 (10:56 +0100)]
thermal/drivers/exynos: Remove an unnecessary field description
It seems that the field has been removed in one of the previous commits,
but the description has been forgotten.
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Mateusz Majewski <m.majewski2@samsung.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20231201095625.301884-2-m.majewski2@samsung.com
Florian Eckert [Mon, 4 Dec 2023 14:13:35 +0000 (15:13 +0100)]
tools/thermal/tmon: Fix compilation warning for wrong format
The following warnings are shown during compilation:
tui.c: In function 'show_cooling_device':
tui.c:216:40: warning: format '%d' expects argument of type 'int', but
argument 7 has type 'long unsigned int' [-Wformat=]
  216 |                         "%02d %12.12s%6d %6d",
      |                                      ~~^
      |                                        |
      |                                        int
      |                                      %6ld
......
  219 |                         ptdata.cdi[j].cur_state,
      |                         ~~~~~~~~~~~~~~~~~~~~~~~
      |                                      |
      |                                      long unsigned int
tui.c:216:44: warning: format '%d' expects argument of type 'int', but
argument 8 has type 'long unsigned int' [-Wformat=]
  216 |                         "%02d %12.12s%6d %6d",
      |                                          ~~^
      |                                            |
      |                                            int
      |                                          %6ld
......
  220 |                         ptdata.cdi[j].max_state);
      |                         ~~~~~~~~~~~~~~~~~~~~~~~
      |                                      |
      |                                      long unsigned int
To fix this, the correct string format must be used for printing.
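For illustration, a standalone sketch of a matching format string is shown below. The helper format_cdev_line() is hypothetical and writes into a buffer instead of a curses window; it uses %lu because the arguments are unsigned long, whereas the compiler hint above suggests %6ld. Which of the two the patch picked is not stated in this message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Format one cooling device line; returns the snprintf() result. */
static int format_cdev_line(char *buf, size_t len, int index, const char *type,
			    unsigned long cur_state, unsigned long max_state)
{
	assert(buf && type);

	/* %lu matches unsigned long; %d would trigger -Wformat here. */
	return snprintf(buf, len, "%02d %12.12s%6lu %6lu",
			index, type, cur_state, max_state);
}
```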
Signed-off-by: Florian Eckert <fe@dev.tdt.de>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20231204141335.2798194-1-fe@dev.tdt.de
Johan Hovold [Thu, 30 Nov 2023 17:41:14 +0000 (18:41 +0100)]
dt-bindings: thermal: qcom-spmi-adc-tm5/hc: Clean up examples
Clean up the examples by adding newline separators, moving 'reg'
properties after 'compatible' and dropping unused labels.
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20231130174114.13122-3-johan+linaro@kernel.org
Johan Hovold [Thu, 30 Nov 2023 17:41:13 +0000 (18:41 +0100)]
dt-bindings: thermal: qcom-spmi-adc-tm5/hc: Fix example node names
The ADC Thermal Monitor is part of an SPMI PMIC, which in turn sits on
an SPMI bus.
Fixes: db03874b8543 ("dt-bindings: thermal: qcom: add HC variant of adc-thermal monitor bindings")
Fixes: e8ffd6c0756b ("dt-bindings: thermal: qcom: add adc-thermal monitor bindings")
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20231130174114.13122-2-johan+linaro@kernel.org
Maxim Kiselev [Sun, 17 Dec 2023 21:06:23 +0000 (00:06 +0300)]
thermal/drivers/sun8i: Add D1/T113s THS controller support
This patch adds thermal sensor controller support for the D1/T113s.
The controller is similar to the one on the H6, but has only one sensor
and different scale and offset values.
Signed-off-by: Maxim Kiselev <bigunclemax@gmail.com>
Acked-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20231217210629.131486-3-bigunclemax@gmail.com
Maxim Kiselev [Sun, 17 Dec 2023 21:06:22 +0000 (00:06 +0300)]
dt-bindings: thermal: sun8i: Add binding for D1/T113s THS controller
Add a binding for D1/T113s thermal sensor controller.
Signed-off-by: Maxim Kiselev <bigunclemax@gmail.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20231217210629.131486-2-bigunclemax@gmail.com
Uwe Kleine-König [Thu, 16 Nov 2023 11:26:36 +0000 (12:26 +0100)]
thermal: amlogic: Use DEFINE_SIMPLE_DEV_PM_OPS for PM functions
This macro has the advantage over SIMPLE_DEV_PM_OPS that we don't have to
care about when the functions are actually used, so the corresponding
__maybe_unused can be dropped.
Also make use of pm_ptr() to discard all PM related stuff if CONFIG_PM
isn't enabled.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20231116112633.668826-3-u.kleine-koenig@pengutronix.de
Uwe Kleine-König [Thu, 16 Nov 2023 11:26:35 +0000 (12:26 +0100)]
thermal: amlogic: Make amlogic_thermal_disable() return void
amlogic_thermal_disable() returned zero unconditionally and
amlogic_thermal_remove() already ignores the return value.
Make it return no value and modify amlogic_thermal_suspend to not check
the value.
This patch introduces no semantic changes, but makes it more obvious for
a human reader that amlogic_thermal_suspend() cannot fail.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20231116112633.668826-2-u.kleine-koenig@pengutronix.de
Fabio Estevam [Wed, 29 Nov 2023 12:43:30 +0000 (09:43 -0300)]
thermal/thermal_of: Allow rebooting after critical temp
Currently, the default mechanism is to trigger a shutdown after the
critical temperature is reached.
In some embedded cases, such behavior is not well suited, as the board may
be unattended in the field and rebooting may be a better approach.
The bootloader may also check the temperature and only allow the boot to
proceed when the temperature is below a certain threshold.
Introduce support for allowing a reboot to be triggered after the
critical temperature is reached.
If the "critical-action" devicetree property is not found, fall back to
the shutdown action to preserve the existing default behavior.
If a custom ops->critical exists, then it takes precedence over
critical-action.
Tested on an i.MX8MM board with the following devicetree changes:
	thermal-zones {
		cpu-thermal {
			critical-action = "reboot";
		};
	};
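The selection order described above (custom callback first, then the devicetree property, then the shutdown default) can be sketched in userspace as follows. The enum and pick_critical_handler() are hypothetical names, and treating any unrecognized property value like a missing one is an assumption of this sketch.

```c
#include <stdbool.h>
#include <string.h>

enum critical_handler {
	CRITICAL_CUSTOM,	/* driver-provided ops->critical */
	CRITICAL_SHUTDOWN,	/* existing default behavior */
	CRITICAL_REBOOT,	/* critical-action = "reboot" */
};

/*
 * critical_action is the property value, or NULL when the property is
 * absent from the devicetree.
 */
static enum critical_handler pick_critical_handler(bool has_custom_critical,
						   const char *critical_action)
{
	if (has_custom_critical)
		return CRITICAL_CUSTOM;

	if (critical_action && strcmp(critical_action, "reboot") == 0)
		return CRITICAL_REBOOT;

	/* Property missing (or, by assumption, unrecognized): shut down. */
	return CRITICAL_SHUTDOWN;
}
```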
Signed-off-by: Fabio Estevam <festevam@denx.de>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20231129124330.519423-4-festevam@gmail.com
Fabio Estevam [Wed, 29 Nov 2023 12:43:29 +0000 (09:43 -0300)]
reboot: Introduce thermal_zone_device_critical_reboot()
Introduce thermal_zone_device_critical_reboot() to trigger an
emergency reboot.
It is a counterpart of thermal_zone_device_critical() with the
difference that it will force a reboot instead of shutdown.
The motivation for doing this is to allow the thermal subsystem
to trigger a reboot when the temperature reaches the critical
temperature.
Signed-off-by: Fabio Estevam <festevam@denx.de>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20231129124330.519423-3-festevam@gmail.com
Fabio Estevam [Wed, 29 Nov 2023 12:43:28 +0000 (09:43 -0300)]
thermal/core: Prepare for introduction of thermal reboot
Add some helper functions to make it easier to introduce support
for thermal reboot.
No functional change.
Signed-off-by: Fabio Estevam <festevam@denx.de>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20231129124330.519423-2-festevam@gmail.com
Fabio Estevam [Wed, 29 Nov 2023 12:43:27 +0000 (09:43 -0300)]
dt-bindings: thermal-zones: Document critical-action
Document the critical-action property to describe the thermal action
the OS should perform after the critical temperature is reached.
The possible values are "shutdown" and "reboot".
The motivation for introducing the critical-action property is that
different systems may need different thermal actions when the critical
temperature is reached.
For example, in a desktop PC, it is desired that a shutdown happens
after the critical temperature is reached.
However, in some embedded cases, such behavior is not well suited,
as the board may be unattended in the field and rebooting may be a
better approach.
The bootloader may also benefit from this new property as it can check
the SoC temperature and in case the temperature is above the critical
point, it can trigger a shutdown or reboot accordingly.
Signed-off-by: Fabio Estevam <festevam@denx.de>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20231129124330.519423-1-festevam@gmail.com
Neil Armstrong [Tue, 28 Nov 2023 08:44:48 +0000 (09:44 +0100)]
dt-bindings: thermal: qcom-tsens: document the SM8650 Temperature Sensor
Document the Temperature Sensor (TSENS) on the SM8650 Platform.
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20231128-topic-sm8650-upstream-bindings-tsens-v3-1-54179e6646d3@linaro.org
Binbin Zhou [Fri, 24 Nov 2023 09:57:45 +0000 (17:57 +0800)]
drivers/thermal/loongson2_thermal: Fix incorrect PTR_ERR() judgment
PTR_ERR() returns -ENODEV when thermal-zones are undefined, and we need
-ENODEV as the right value for comparison.
Otherwise, tz->type is NULL when thermal-zones is undefined, resulting
in the following error:
[ 12.290030] CPU 1 Unable to handle kernel paging request at virtual address fffffffffffffff1, era == 900000000355f410, ra == 90000000031579b8
[ 12.302877] Oops[#1]:
[ 12.305190] CPU: 1 PID: 181 Comm: systemd-udevd Not tainted 6.6.0-rc7+ #5385
[ 12.312304] pc 900000000355f410 ra 90000000031579b8 tp 90000001069e8000 sp 90000001069eba10
[ 12.320739] a0 0000000000000000 a1 fffffffffffffff1 a2 0000000000000014 a3 0000000000000001
[ 12.329173] a4 90000001069eb990 a5 0000000000000001 a6 0000000000001001 a7 900000010003431c
[ 12.337606] t0 fffffffffffffff1 t1 54567fd5da9b4fd4 t2 900000010614ec40 t3 00000000000dc901
[ 12.346041] t4 0000000000000000 t5 0000000000000004 t6 900000010614ee20 t7 900000000d00b790
[ 12.354472] t8 00000000000dc901 u0 54567fd5da9b4fd4 s9 900000000402ae10 s0 900000010614ec40
[ 12.362916] s1 90000000039fced0 s2 ffffffffffffffed s3 ffffffffffffffed s4 9000000003acc000
[ 12.362931] s5 0000000000000004 s6 fffffffffffff000 s7 0000000000000490 s8 90000001028b2ec8
[ 12.362938] ra: 90000000031579b8 thermal_add_hwmon_sysfs+0x258/0x300
[ 12.386411] ERA: 900000000355f410 strscpy+0xf0/0x160
[ 12.391626] CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[ 12.397898] PRMD: 00000004 (PPLV0 +PIE -PWE)
[ 12.403678] EUEN: 00000000 (-FPE -SXE -ASXE -BTE)
[ 12.409859] ECFG: 00071c1c (LIE=2-4,10-12 VS=7)
[ 12.415882] ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0)
[ 12.415907] BADV: fffffffffffffff1
[ 12.415911] PRID: 0014a000 (Loongson-64bit, Loongson-2K1000)
[ 12.415917] Modules linked in: loongson2_thermal(+) vfat fat uio_pdrv_genirq uio fuse zram zsmalloc
[ 12.415950] Process systemd-udevd (pid: 181, threadinfo=00000000358b9718, task=00000000ace72fe3)
[ 12.415961] Stack : 0000000000000dc0 54567fd5da9b4fd4 900000000402ae10 9000000002df9358
[ 12.415982]         ffffffffffffffed 0000000000000004 9000000107a10aa8 90000001002a3410
[ 12.415999]         ffffffffffffffed ffffffffffffffed 9000000107a11268 9000000003157ab0
[ 12.416016]         9000000107a10aa8 ffffff80020fc0c8 90000001002a3410 ffffffffffffffed
[ 12.416032]         0000000000000024 ffffff80020cc1e8 900000000402b2a0 9000000003acc000
[ 12.416048]         90000001002a3410 0000000000000000 ffffff80020f4030 90000001002a3410
[ 12.416065]         0000000000000000 9000000002df6808 90000001002a3410 0000000000000000
[ 12.416081]         ffffff80020f4030 0000000000000000 90000001002a3410 9000000002df2ba8
[ 12.416097]         00000000000000b4 90000001002a34f4 90000001002a3410 0000000000000002
[ 12.416114]         ffffff80020f4030 fffffffffffffff0 90000001002a3410 9000000002df2f30
[ 12.416131]         ...
[ 12.416138] Call Trace:
[ 12.416142] [<900000000355f410>] strscpy+0xf0/0x160
[ 12.416167] [<90000000031579b8>] thermal_add_hwmon_sysfs+0x258/0x300
[ 12.416183] [<9000000003157ab0>] devm_thermal_add_hwmon_sysfs+0x50/0xe0
[ 12.416200] [<ffffff80020cc1e8>] loongson2_thermal_probe+0x128/0x200 [loongson2_thermal]
[ 12.416232] [<9000000002df6808>] platform_probe+0x68/0x140
[ 12.416249] [<9000000002df2ba8>] really_probe+0xc8/0x3c0
[ 12.416269] [<9000000002df2f30>] __driver_probe_device+0x90/0x180
[ 12.416286] [<9000000002df3058>] driver_probe_device+0x38/0x160
[ 12.416302] [<9000000002df33a8>] __driver_attach+0xa8/0x200
[ 12.416314] [<9000000002deffec>] bus_for_each_dev+0x8c/0x120
[ 12.416330] [<9000000002df198c>] bus_add_driver+0x10c/0x2a0
[ 12.416346] [<9000000002df46b4>] driver_register+0x74/0x160
[ 12.416358] [<90000000022201a4>] do_one_initcall+0x84/0x220
[ 12.416372] [<90000000022f3ab8>] do_init_module+0x58/0x2c0
[ 12.416386] [<90000000022f6538>] init_module_from_file+0x98/0x100
[ 12.416399] [<90000000022f67f0>] sys_finit_module+0x230/0x3c0
[ 12.416412] [<900000000358f7c8>] do_syscall+0x88/0xc0
[ 12.416431] [<900000000222137c>] handle_syscall+0xbc/0x158
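The sign issue behind this crash can be reproduced in userspace with simplified re-creations of the kernel's error-pointer helpers. The three helpers below only mimic the kernel macros of the same names, and is_missing_zone() is a hypothetical function added for the example. An error pointer carries a negative errno, so comparing PTR_ERR() against the positive constant ENODEV never matches.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Simplified userspace versions of the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline bool IS_ERR(const void *ptr)
{
	/* Error pointers occupy the last MAX_ERRNO addresses. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Correct check: compare against the negative errno value. */
static bool is_missing_zone(const void *tzd)
{
	return IS_ERR(tzd) && PTR_ERR(tzd) == -ENODEV;
}
```

Note that register s2 in the trace holds ffffffffffffffed, which is -19, i.e. -ENODEV on Linux: the error pointer was treated as a valid thermal zone and dereferenced.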
Fixes: e7e3a7c35791 ("thermal/drivers/loongson-2: Add thermal management support")
Cc: Yinbo Zhu <zhuyinbo@loongson.cn>
Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/343c14de98216636a47b43e8bfd47b70d0a8e068.1700817227.git.zhoubinbin@loongson.cn
Binbin Zhou [Fri, 24 Nov 2023 09:57:44 +0000 (17:57 +0800)]
dt-bindings: thermal: loongson,ls2k-thermal: Fix binding check issues
Add the missing '#thermal-sensor-cells' property, which is required for
every thermal sensor as it is used when referencing the sensor through
phandles. Also add the thermal-sensor.yaml reference.
The omission was a careless mistake made when submitting the driver,
and it prevented the driver from working properly. So the fix is
necessary, although it results in an ABI break.
Fixes: 72684d99a854 ("thermal: dt-bindings: add loongson-2 thermal")
Cc: Yinbo Zhu <zhuyinbo@loongson.cn>
Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/6d69362632271ab0af9a5fbfa3bc46a0894f1d54.1700817227.git.zhoubinbin@loongson.cn
Rafał Miłecki [Fri, 17 Nov 2023 05:22:14 +0000 (06:22 +0100)]
dt-bindings: thermal: convert Mediatek Thermal to the json-schema
This helps validate DTS files. Introduced changes:
1. Improved title
2. Simplified description (dropped "This describes the device tree...")
3. Dropped undocumented "reset-names" from example
Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20231117052214.24554-1-zajec5@gmail.com
Lukasz Luba [Wed, 20 Dec 2023 23:17:53 +0000 (23:17 +0000)]
thermal: gov_power_allocator: Support new update callback of weights
When the thermal instance's weight is updated from sysfs, the governor's
update_tz() callback is triggered. Implement a proper reaction to this
event in the IPA, which saves CPU cycles spent in throttle().
This speeds up the main throttle() IPA function and cleans it up
a bit.
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Lukasz Luba [Wed, 20 Dec 2023 23:17:52 +0000 (23:17 +0000)]
thermal/sysfs: Update governors when the 'weight' has changed
Support updating governors when the thermal instance's weight has changed.
This allows the governor to adjust its internal state.
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
[ rjw: Add two empty code lines around the locking ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Lukasz Luba [Wed, 20 Dec 2023 23:17:51 +0000 (23:17 +0000)]
thermal/sysfs: Update instance->weight under tz lock
User space can change the weight of a thermal instance via sysfs while the
.throttle() callback is running for a governor, because weight_store()
does not use the zone lock.
The IPA governor uses instance weight values for power calculations and
caches the sum of them as total_weight, so it gets confused when one of
them changes while its .throttle() callback is running.
To prevent that from happening, use thermal zone locking in
weight_store().
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Lukasz Luba [Wed, 20 Dec 2023 23:17:50 +0000 (23:17 +0000)]
thermal: gov_power_allocator: Simplify checks for valid power actor
There is a need to check whether a cooling device in the thermal zone
supports the IPA callbacks and is set for the control trip point.
Refactor the code which validates the power actor capabilities and
make it more consistent in all places.
No intentional functional impact.
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Lukasz Luba [Wed, 20 Dec 2023 23:17:49 +0000 (23:17 +0000)]
thermal: gov_power_allocator: Move memory allocation out of throttle()
The new thermal callback makes it possible to react to a change of
cooling instances in the thermal zone. Move the memory allocation to that
new callback and save CPU cycles in the throttle() code path.
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Lukasz Luba [Wed, 20 Dec 2023 23:17:48 +0000 (23:17 +0000)]
thermal: gov_power_allocator: Change trace functions
Change trace event trace_thermal_power_allocator() to not use dynamic
array for requested power and granted power for all power actors.
Instead, simplify the trace event and print other simple values.
Add new trace event to print power actor information of requested power
and granted power. That trace event would be called in a loop for each
power actor. The trace data would be easier to parse compared to the
dynamic array implementation.
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Lukasz Luba [Wed, 20 Dec 2023 23:17:47 +0000 (23:17 +0000)]
thermal: gov_power_allocator: Refactor checks in divvy_up_power()
Simplify the code and remove one extra 'if' block.
No intentional functional impact.
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Lukasz Luba [Wed, 20 Dec 2023 23:17:46 +0000 (23:17 +0000)]
thermal: gov_power_allocator: Refactor check_power_actors()
In preparation for a subsequent change, rearrange check_power_actors().
No intentional functional impact.
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Lukasz Luba [Wed, 20 Dec 2023 23:17:45 +0000 (23:17 +0000)]
thermal: core: Add governor callback for thermal zone change
Add a new callback to the struct thermal_governor. It can be used for
updating governors when there is a change in the thermal zone internals,
e.g. when a thermal cooling device is bound to the thermal zone.
That makes it possible to move some heavy operations like memory allocations
related to the number of cooling instances out of the throttle() callback.
Both callback code paths (throttle() and update_tz()) are protected with
the same thermal zone lock, which guarantees consistency.
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
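The effect of the new callback can be illustrated with a small userspace sketch. This is not the kernel code: struct zone, struct governor_ops, gov_update_tz() and gov_throttle() are hypothetical stand-ins, the "requested power" value is a placeholder, and the zone locking described above is left out. It only shows the buffer sized by the number of cooling instances being (re)allocated in an update_tz()-style hook, so that the throttle() path does no allocation.

```c
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct zone {
	int num_instances;	/* cooling instances bound to the zone */
	int *power;		/* per-instance scratch buffer */
	int buf_len;		/* number of entries in 'power' */
};

struct governor_ops {
	int (*throttle)(struct zone *tz);
	int (*update_tz)(struct zone *tz);	/* called when the zone changes */
};

/* Resize the scratch buffer when the set of instances has changed. */
static int gov_update_tz(struct zone *tz)
{
	int *buf;

	if (tz->num_instances == tz->buf_len)
		return 0;

	if (tz->num_instances == 0) {
		free(tz->power);
		tz->power = NULL;
		tz->buf_len = 0;
		return 0;
	}

	buf = realloc(tz->power, tz->num_instances * sizeof(*buf));
	if (!buf)
		return -1;

	tz->power = buf;
	tz->buf_len = tz->num_instances;
	return 0;
}

/* Hot path: no allocation, only the preallocated buffer is used. */
static int gov_throttle(struct zone *tz)
{
	int i, total = 0;

	for (i = 0; i < tz->buf_len; i++) {
		tz->power[i] = 100;	/* placeholder "requested power" */
		total += tz->power[i];
	}
	return total;
}

static const struct governor_ops gov_ops = {
	.throttle = gov_throttle,
	.update_tz = gov_update_tz,
};
```

In this sketch a caller would invoke gov_ops.update_tz() after binding or unbinding a cooling device and gov_ops.throttle() on every temperature update, both under the same lock in a real implementation.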
Stanislaw Gruszka [Thu, 28 Dec 2023 10:02:48 +0000 (11:02 +0100)]
thermal: netlink: Add thermal_group_has_listeners() helper
Add a helper function to check if there are listeners for
thermal_gnl_family multicast groups.
For now use it to avoid unnecessary allocations and sending
thermal genl messages when there are no recipients.
In the future, in conjunction with (not yet implemented) notification
of change in the netlink socket group membership, this helper can be
used to open/close hardware interfaces based on the presence of
user space subscribers.
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Stanislaw Gruszka [Thu, 28 Dec 2023 10:02:47 +0000 (11:02 +0100)]
thermal: netlink: Add enum for multicast groups indexes
Use enum instead of hard-coded numbers for indexing multicast groups.
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Rafael J. Wysocki [Mon, 18 Dec 2023 19:28:31 +0000 (20:28 +0100)]
thermal: core: Resume thermal zones asynchronously
The resume of thermal zones in thermal_pm_notify() is carried out
sequentially, which may be a problem if __thermal_zone_device_update()
takes a significant time to run for some thermal zones, because some
other thermal zones may need to wait for them to resume, and if
any other PM notifiers are going to be invoked after the thermal one,
they will need to wait for it as well.
To address this, make thermal_pm_notify() switch the poll_queue delayed
work over to a one-shot thermal_zone_device_resume() work function that
will restore the original one during the thermal zone resume and queue
up poll_queue without a delay for each thermal zone.
Link: https://lore.kernel.org/linux-pm/20231120234015.3273143-1-radusolea@google.com/
Reported-by: Radu Solea <radusolea@google.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Rafael J. Wysocki [Mon, 18 Dec 2023 19:26:47 +0000 (20:26 +0100)]
thermal: core: Initialize poll_queue in thermal_zone_device_init()
In preparation for a subsequent change, move the initialization of the
poll_queue delayed work from thermal_zone_device_register_with_trips()
to thermal_zone_device_init() which is called by the former.
However, because thermal_zone_device_init() is also called by
thermal_pm_notify(), make the latter call cancel_delayed_work() on
poll_queue before invoking the former, so as to allow the work
item to be re-initialized safely.
Also move thermal_zone_device_check() which needs to be defined
before thermal_zone_device_init(), so the latter can pass it to the
INIT_DELAYED_WORK() macro.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Rafael J. Wysocki [Mon, 18 Dec 2023 19:25:02 +0000 (20:25 +0100)]
thermal: core: Fix thermal zone suspend-resume synchronization
There are 3 synchronization issues with thermal zone suspend-resume
during system-wide transitions:
1. The resume code runs in a PM notifier which is invoked after user
space has been thawed, so it can run concurrently with user space
which can trigger a thermal zone device removal. If that happens,
the thermal zone resume code may use a stale pointer to the next
list element and crash, because it does not hold thermal_list_lock
while walking thermal_tz_list.
2. The thermal zone resume code calls thermal_zone_device_init()
outside the zone lock, so user space or an update triggered by
the platform firmware may see an inconsistent state of a
thermal zone leading to unexpected behavior.
3. Clearing the in_suspend global variable in thermal_pm_notify()
allows __thermal_zone_device_update() to continue for all thermal
zones and it may as well run before the thermal_tz_list walk (or
at any point during the list walk for that matter) and attempt to
operate on a thermal zone that has not been resumed yet. It may
also race destructively with thermal_zone_device_init().
To address these issues, add thermal_list_lock locking to
thermal_pm_notify(), especially around the thermal_tz_list walk,
make it call thermal_zone_device_init() back-to-back with
__thermal_zone_device_update() under the zone lock and replace
in_suspend with per-zone bool "suspend" indicators set and unset
under the given zone's lock.
Link: https://lore.kernel.org/linux-pm/20231218162348.69101-1-bo.ye@mediatek.com/
Reported-by: Bo Ye <bo.ye@mediatek.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Randy Dunlap [Thu, 21 Dec 2023 05:51:44 +0000 (21:51 -0800)]
thermal: cpuidle_cooling: fix kernel-doc warning and a spello
Correct one misuse of kernel-doc notation and one spelling error as
reported by codespell.
cpuidle_cooling.c:152: warning: cannot understand function prototype: 'struct thermal_cooling_device_ops cpuidle_cooling_ops = '
For the kernel-doc warning, don't use "/**" for a comment on data.
kernel-doc can be used for structure declarations but not definitions.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Rafael J. Wysocki [Thu, 14 Dec 2023 10:52:25 +0000 (11:52 +0100)]
thermal: core: Fix NULL pointer dereference in zone registration error path
If device_register() in thermal_zone_device_register_with_trips()
returns an error, the tz variable is set to NULL and subsequently
dereferenced in kfree(tz->tzp).
Commit
adc8749b150c ("thermal/drivers/core: Use put_device() if
device_register() fails") added the tz = NULL assignment in question to
avoid a possible double-free after dropping the reference to the zone
device. However, after commit
4649620d9404 ("thermal: core: Make
thermal_zone_device_unregister() return after freeing the zone"), that
assignment has become redundant, because dropping the reference to the
zone device does not cause the zone object to be freed any more.
Drop it to address the NULL pointer dereference.
Fixes: 3d439b1a2ad3 ("thermal/core: Alloc-copy-free the thermal zone parameters structure")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
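The error-path pattern discussed above can be modeled in a few lines of userspace C. This is only a sketch under stated assumptions: struct zone, struct params, zone_register() and fake_device_register() are hypothetical names, and malloc-family calls stand in for the kernel allocators. The point is that the zone pointer must stay valid until the parameters it owns have been freed.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the zone and its parameters structure. */
struct params { int sustainable_power; };
struct zone { struct params *tzp; };

/* Hypothetical registration step that can be made to fail. */
static int fake_device_register(int fail)
{
	return fail ? -1 : 0;
}

static struct zone *zone_register(int fail)
{
	struct zone *tz = calloc(1, sizeof(*tz));

	if (!tz)
		return NULL;

	tz->tzp = calloc(1, sizeof(*tz->tzp));
	if (!tz->tzp)
		goto free_tz;

	if (fake_device_register(fail))
		goto free_tzp;

	return tz;

free_tzp:
	/*
	 * Setting tz to NULL before this point would turn tz->tzp into a
	 * NULL pointer dereference, which is the bug pattern in question.
	 */
	free(tz->tzp);
free_tz:
	free(tz);
	return NULL;
}
```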
Daniel Lezcano [Wed, 13 Dec 2023 12:13:22 +0000 (13:13 +0100)]
thermal/core: Check get_temp ops is present when registering a tz
Initially the check against the get_temp ops in the
thermal_zone_device_update() was put in there in order to catch
drivers not providing this method.
Instead of checking again and again in the update function whether the
ops exists, do the check at registration time, so it is checked once
and for all.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
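A minimal sketch of the "validate once at registration" idea, assuming hypothetical names (struct zone_ops, zone_register(), zone_update(), fixed_temp()) rather than the real thermal core API: registration rejects an ops table without get_temp, so the update path can call it unconditionally.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, reduced ops table with only the callback of interest. */
struct zone_ops {
	int (*get_temp)(int *temp);
};

/* Reject zones that cannot report a temperature, once, up front. */
static int zone_register(const struct zone_ops *ops)
{
	if (!ops || !ops->get_temp)
		return -EINVAL;
	return 0;
}

/* The update path no longer needs to re-check the callback. */
static int zone_update(const struct zone_ops *ops, int *temp)
{
	return ops->get_temp(temp);
}

/* Example callback returning a fixed temperature in millidegrees. */
static int fixed_temp(int *temp)
{
	*temp = 42000;
	return 0;
}
```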
Rafael J. Wysocki [Tue, 5 Dec 2023 19:18:39 +0000 (20:18 +0100)]
thermal: trip: Send trip change notifications on all trip updates
The _store callbacks of the trip point temperature and hysteresis sysfs
attributes invoke thermal_notify_tz_trip_change() to send a notification
regarding the trip point change, but when trip points are updated by the
platform firmware, trip point change notifications are not sent.
To make the behavior after a trip point change more consistent,
modify all of the 3 places where trip point temperature is updated
to use a new function called thermal_zone_set_trip_temp() for this
purpose and make that function call thermal_notify_tz_trip_change().
Note that trip point hysteresis can only be updated via sysfs and
trip_point_hyst_store() calls thermal_notify_tz_trip_change() already,
so this code path need not be changed.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Rafael J. Wysocki [Mon, 4 Dec 2023 19:49:03 +0000 (20:49 +0100)]
thermal: netlink: Use for_each_trip() in thermal_genl_cmd_tz_get_trip()
Make thermal_genl_cmd_tz_get_trip() use for_each_trip() instead of an open-
coded loop over trip indices.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Rafael J. Wysocki [Mon, 4 Dec 2023 19:46:35 +0000 (20:46 +0100)]
thermal: helpers: Use for_each_trip() in __thermal_zone_get_temp()
Make __thermal_zone_get_temp() use for_each_trip() instead of an open-
coded loop over trip indices.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Rafael J. Wysocki [Mon, 4 Dec 2023 19:41:30 +0000 (20:41 +0100)]
thermal: trip: Use for_each_trip() in __thermal_zone_set_trips()
Make __thermal_zone_set_trips() use for_each_trip() instead of an open-
coded loop over trip indices.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
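For readers unfamiliar with the helper used in the three changes above, the following userspace sketch shows what replacing an open-coded index loop with a for_each_trip()-style iterator looks like. The macro here is modeled on the kernel helper but its exact form should be treated as an assumption, and struct trip, struct zone and lowest_trip_temp() are simplified, hypothetical stand-ins.

```c
#include <limits.h>

/* Simplified stand-ins for the trip and zone structures. */
struct trip { int temperature; };
struct zone { struct trip *trips; int num_trips; };

/* Walk every trip of a zone by pointer instead of by index. */
#define for_each_trip(tz, trip) \
	for ((trip) = (tz)->trips; (trip) - (tz)->trips < (tz)->num_trips; (trip)++)

/* Return the lowest trip temperature, or INT_MAX if there are no trips. */
static int lowest_trip_temp(struct zone *tz)
{
	struct trip *trip;
	int lowest = INT_MAX;

	for_each_trip(tz, trip) {
		if (trip->temperature < lowest)
			lowest = trip->temperature;
	}
	return lowest;
}
```

The iterator keeps the loop bounds in one place, so callers cannot get the index range wrong and do not need a separate lookup per index.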
Rafael J. Wysocki [Mon, 4 Dec 2023 19:36:14 +0000 (20:36 +0100)]
thermal: trip: Drop redundant __thermal_zone_get_trip() header
The __thermal_zone_get_trip() header in drivers/thermal/thermal_core.h
is redundant, because there is one already in thermal.h, so drop it.
No functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Rafael J. Wysocki [Fri, 8 Dec 2023 19:20:00 +0000 (20:20 +0100)]
thermal: core: Rework thermal zone availability check
In order to avoid running __thermal_zone_device_update() for thermal
zones going away, the thermal zone lock is held around device_del()
in thermal_zone_device_unregister() and thermal_zone_device_update()
passes the given thermal zone device to device_is_registered().
This allows thermal_zone_device_update() to skip the
__thermal_zone_device_update() if device_del() has already run for
the thermal zone at hand.
However, instead of looking at driver core internals, the thermal
subsystem may as well rely on its own data structures for this
purpose. Namely, if the thermal zone is not present in
thermal_tz_list, it can be regarded as unavailable, which in fact is
already the case in thermal_zone_device_unregister(). Accordingly,
the device_is_registered() check in thermal_zone_device_update() can
be replaced with checking whether or not the node list_head in struct
thermal_zone_device is empty, in which case it is not there in
thermal_tz_list.
To make this work, though, it is necessary to initialize tz->node
in thermal_zone_device_register_with_trips() before registering the
thermal zone device and it needs to be added to thermal_tz_list and
deleted from it under its zone lock.
After the above modifications, the zone lock does not need to be
held around device_del() in thermal_zone_device_unregister() any more.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-and-tested-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
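The availability check described above can be sketched in userspace with a minimal circular list. Everything here is a simplified assumption: the list helpers are reduced reimplementations with hypothetical names, struct zone carries only a node and a counter, and the zone lock that the real code holds around list changes is omitted. The sketch shows why the node has to be initialized before registration and re-initialized on removal: an empty node is what marks the zone as unavailable.

```c
#include <stdbool.h>

/* Minimal circular doubly linked list, in the style of the kernel's. */
struct list_head { struct list_head *next, *prev; };

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static bool list_is_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add_node(struct list_head *n, struct list_head *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/* Unlink the node and leave it empty so later checks see it as gone. */
static void list_del_init_node(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	init_list_head(n);
}

struct zone {
	struct list_head node;	/* membership in the global zone list */
	int updates;		/* number of updates actually carried out */
};

static void zone_update(struct zone *tz)
{
	/* An empty node means the zone is not in the zone list: skip it. */
	if (list_is_empty(&tz->node))
		return;
	tz->updates++;
}
```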
Rafael J. Wysocki [Fri, 8 Dec 2023 19:19:03 +0000 (20:19 +0100)]
thermal: Drop redundant and confusing device_is_registered() checks
Multiple places in the thermal subsystem (most importantly, sysfs
attribute callback functions) check if the given thermal zone device is
still registered in order to return early in case the device_del() in
thermal_zone_device_unregister() has run already.
However, after thermal_zone_device_unregister() has been made wait for
all of the zone-related activity to complete before returning, it is
not necessary to do that any more, because all of the code holding a
reference to the thermal zone device object will be waited for even if
it does not do anything special to enforce this.
Accordingly, drop all of the device_is_registered() checks that are now
redundant and get rid of the zone locking that is not necessary any more
after dropping them.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-and-tested-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Rafael J. Wysocki [Fri, 8 Dec 2023 19:13:44 +0000 (20:13 +0100)]
thermal: core: Make thermal_zone_device_unregister() return after freeing the zone
Make thermal_zone_device_unregister() wait until all of the references
to the given thermal zone object have been dropped and free it before
returning.
This guarantees that when thermal_zone_device_unregister() returns,
there is no leftover activity regarding the thermal zone in question
which is required by some of its callers (for instance, modular driver
code that wants to know when it is safe to let the module go away).
Subsequently, this will allow some confusing device_is_registered()
checks to be dropped from the thermal sysfs and core code.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-and-tested-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Rafael J. Wysocki [Tue, 5 Dec 2023 12:26:59 +0000 (13:26 +0100)]
thermal: sysfs: Rework the reading of trip point attributes
Rework the _show() callback functions for the trip point temperature,
hysteresis and type attributes to avoid copying the values of struct
thermal_trip fields that they do not use and make them carry out the
same validation checks as the corresponding _store() callback functions.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Rafael J. Wysocki [Tue, 5 Dec 2023 12:24:08 +0000 (13:24 +0100)]
thermal: sysfs: Rework the handling of trip point updates
Both trip_point_temp_store() and trip_point_hyst_store() use
thermal_zone_set_trip() to update a given trip point, but neither of them
actually needs to change more than one field in struct thermal_trip
representing it. However, each of them effectively calls
__thermal_zone_get_trip() twice in a row for the same trip index value,
once directly and once via thermal_zone_set_trip(), which is not
particularly efficient, and the way in which thermal_zone_set_trip()
carries out the update is not particularly straightforward.
Moreover, input processing need not be done under the thermal zone lock
in either of these functions.
Rework trip_point_temp_store() and trip_point_hyst_store() to address
the above, move the part of thermal_zone_set_trip() that is still
useful to a new function called thermal_zone_trip_updated() and drop
the rest of it.
While at it, make trip_point_hyst_store() reject negative hysteresis
values.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Rafael J. Wysocki [Wed, 29 Nov 2023 13:36:07 +0000 (14:36 +0100)]
thermal: trip: Drop a redundant check from thermal_zone_set_trip()
After recent changes in the thermal framework, a trip points array is
required for registering a thermal zone that is not tripless, so the
tz->trips pointer in thermal_zone_set_trip() is never NULL and the
check involving it is redundant. Drop that check.
No functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Lukasz Luba [Wed, 25 Oct 2023 19:22:25 +0000 (20:22 +0100)]
thermal: gov_power_allocator: Rearrange initialization of local variables
Rearrange the initialization of local variables in allocate_power() so
as to improve code clarity and the visibility of the initial values.
This change is not expected to alter the general functionality.
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Lukasz Luba [Wed, 25 Oct 2023 19:22:24 +0000 (20:22 +0100)]
thermal: gov_power_allocator: Remove excessive local variables
Local variable 'ret' in allocate_power() is only used in the return
statement, so drop it.
Local variable 'trip_max' in allocate_power() is only used for caching
the params->trip_max value which may as well be accessed directly as
needed, so drop it as well.
This change is not expected to alter the general functionality.
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Lukasz Luba [Wed, 25 Oct 2023 19:22:23 +0000 (20:22 +0100)]
thermal: gov_power_allocator: Use shorter paths to access data when possible
The 'cdev' pointer in allow_maximum_power() is valid, so there is no
need to use 'instance->cdev' instead of it.
This change is not expected to alter the general functionality.
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Lukasz Luba [Wed, 25 Oct 2023 19:22:22 +0000 (20:22 +0100)]
thermal: gov_power_allocator: Rearrange local variables
Rearrange the order of local variable definitions in multiple functions
so as to follow the kernel coding style in that respect.
Also, move local variable definitions located in nested code blocks to
the beginning of each function to improve the visibility of all local
variables in use.
This change is not expected to alter the general functionality.
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Lukasz Luba [Wed, 25 Oct 2023 19:22:21 +0000 (20:22 +0100)]
thermal: gov_power_allocator: Check the cooling devices only for trip_max
The throttling logic only cares about the last passive trip point and
the cooling devices attached to it.
Therefore, there is no need to bail out if other trip points have
cooling devices which are not supported by the IPA.
Check the cooling devices only for 'trip_max' during the binding.
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Lukasz Luba [Wed, 25 Oct 2023 19:22:20 +0000 (20:22 +0100)]
thermal: gov_power_allocator: Set up trip points earlier
Set up the trip points at the beginning of the binding function.
This simplifies the code a bit and allows for further cleanups.
Also add a check to fail the binding if the last passive trip point is
not found.
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Lukasz Luba [Wed, 25 Oct 2023 19:22:19 +0000 (20:22 +0100)]
thermal: gov_power_allocator: Rename trip_max_desired_temperature
Refactor the code and rename the last passive trip point field.
There is a comment describing the field properly. Use a shorter field name
to make the code clearer.
This change is not expected to alter the general functionality.
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Rafael J. Wysocki [Thu, 9 Nov 2023 15:01:48 +0000 (16:01 +0100)]
thermal: core: Add trip thresholds for trip crossing detection
The trip crossing detection in handle_thermal_trip() does not work
correctly in the cases when a trip point is crossed on the way up and
then the zone temperature stays above its low temperature (that is, its
temperature decreased by its hysteresis). The trip temperature may
be passed by the zone temperature subsequently in that case, even
multiple times, but that does not count as the trip crossing as long as
the zone temperature does not fall below the trip's low temperature or,
in other words, until the trip is crossed on the way down.
|-----------low--------high------------|
            |<--------->|
            |   hyst    |
            |           |
            |          -|--> crossed on the way up
            |
        <---|-- crossed on the way down
However, handle_thermal_trip() will invoke thermal_notify_tz_trip_up()
every time the trip temperature is passed by the zone temperature on
the way up regardless of whether or not the trip has been crossed on
the way down yet. Moreover, it will not call thermal_notify_tz_trip_down()
if the last zone temperature was between the trip's temperature and its
low temperature, so some "trip crossed on the way down" events may not
be reported.
To address this issue, introduce trip thresholds equal to either the
temperature of the given trip, or its low temperature, such that if
the trip's threshold is passed by the zone temperature on the way up,
its value will be set to the trip's low temperature and
thermal_notify_tz_trip_up() will be called, and if the trip's threshold
is passed by the zone temperature on the way down, its value will be set
to the trip's temperature (high) and thermal_notify_tz_trip_down() will
be called. Accordingly, if the threshold is passed on the way up, it
cannot be passed on the way up again until it is passed on the way down
and if it is passed on the way down, it cannot be passed on the way down
again until it is passed on the way up which guarantees correct
triggering of trip crossing notifications.
If the last temperature of the zone is invalid, the trip's threshold
will be set depending on the zone's current temperature: If that
temperature is above the trip's temperature, its threshold will be
set to its low temperature or otherwise its threshold will be set to
its (high) temperature. Because the zone temperature is initially
set to invalid and tz->last_temperature is only updated by
update_temperature(), this is sufficient to set the correct initial
threshold values for all trips.
Link: https://lore.kernel.org/all/20220718145038.1114379-4-daniel.lezcano@linaro.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
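The threshold scheme from the trip crossing change above can be sketched as a small userspace state machine. This is a simplified model, not the handle_thermal_trip() implementation: struct trip, enum trip_event, trip_check() and TEMP_INVALID are hypothetical names, and the choice to treat a reading exactly at the trip temperature as "at or above" during initialization is an assumption of this sketch. Each trip carries a threshold that flips between its temperature (high) and its low temperature whenever it is passed.

```c
#include <limits.h>

#define TEMP_INVALID INT_MIN	/* hypothetical marker for "no reading yet" */

enum trip_event { TRIP_NONE, TRIP_UP, TRIP_DOWN };

struct trip {
	int temperature;	/* "high" */
	int hysteresis;
	int threshold;		/* temperature or temperature - hysteresis */
};

/* Compare the previous and current zone temperature with the threshold. */
static enum trip_event trip_check(struct trip *trip, int last_temp, int temp)
{
	int low = trip->temperature - trip->hysteresis;

	if (last_temp == TEMP_INVALID) {
		/* Initial setup: pick the threshold from the current reading. */
		trip->threshold = temp >= trip->temperature ?
				  low : trip->temperature;
		return TRIP_NONE;
	}

	if (last_temp < trip->threshold && temp >= trip->threshold) {
		/* Passed on the way up: arm the low threshold. */
		trip->threshold = low;
		return TRIP_UP;
	}

	if (last_temp >= trip->threshold && temp < trip->threshold) {
		/* Passed on the way down: arm the high threshold again. */
		trip->threshold = trip->temperature;
		return TRIP_DOWN;
	}

	return TRIP_NONE;
}
```

With a trip at 65000 and a hysteresis of 2000, a sequence such as 60000, 66000, 64000, 66000, 62000 produces one "up" event at the first 66000 and one "down" event at 62000 in this model; the second pass over 65000 is not reported because the threshold is still at 63000.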
Linus Torvalds [Sun, 19 Nov 2023 23:02:14 +0000 (15:02 -0800)]
Linux 6.7-rc2
Linus Torvalds [Sun, 19 Nov 2023 21:54:28 +0000 (13:54 -0800)]
Merge tag 'kbuild-fixes-v6.7' of git://git./linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
- Fix section mismatch warning messages for riscv and loongarch
- Remove CONFIG_IA64 left-over from linux/export-internal.h
- Fix the location of the quotes for UIMAGE_NAME
- Fix a memory leak bug in Kconfig
* tag 'kbuild-fixes-v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kconfig: fix memory leak from range properties
kbuild: Move the single quotes for image name
linux/export: clean up the IA-64 KSYM_FUNC macro
modpost: fix section mismatch message for RELA
Linus Torvalds [Sun, 19 Nov 2023 21:49:32 +0000 (13:49 -0800)]
Merge tag 'irq_urgent_for_v6.7_rc2' of git://git./linux/kernel/git/tip/tip
Pull irq fix from Borislav Petkov:
- Flush the translation service tables to prevent unpredictable
behavior on non-coherent GIC devices
* tag 'irq_urgent_for_v6.7_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/gic-v3-its: Flush ITS tables correctly in non-coherent GIC designs
Linus Torvalds [Sun, 19 Nov 2023 21:46:17 +0000 (13:46 -0800)]
Merge tag 'x86_urgent_for_v6.7_rc2' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
- Ignore invalid x2APIC entries in order to not waste per-CPU data
- Fix a back-to-back signals handling scenario when shadow stack is in
use
- A documentation fix
- Add Kirill as TDX maintainer
* tag 'x86_urgent_for_v6.7_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/acpi: Ignore invalid x2APIC entries
x86/shstk: Delay signal entry SSP write until after user accesses
x86/Documentation: Indent 'note::' directive for protocol version number note
MAINTAINERS: Add Intel TDX entry
Linus Torvalds [Sun, 19 Nov 2023 21:35:07 +0000 (13:35 -0800)]
Merge tag 'timers_urgent_for_v6.7_rc2' of git://git./linux/kernel/git/tip/tip
Pull timer fix from Borislav Petkov:
- Do the push of pending hrtimers away from a CPU which is being
offlined earlier in the offlining process in order to prevent a
deadlock
* tag 'timers_urgent_for_v6.7_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
hrtimers: Push pending hrtimers away from outgoing CPU earlier
Linus Torvalds [Sun, 19 Nov 2023 21:32:00 +0000 (13:32 -0800)]
Merge tag 'sched_urgent_for_v6.7_rc2' of git://git./linux/kernel/git/tip/tip
Pull scheduler fixes from Borislav Petkov:
- Fix virtual runtime calculation when recomputing a sched entity's
weights
- Fix wrongly rejected unprivileged poll requests to the cgroup psi
pressure files
- Make sure the load balancing is done by only one CPU
* tag 'sched_urgent_for_v6.7_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/fair: Fix the decision for load balance
sched: psi: fix unprivileged polling against cgroups
sched/eevdf: Fix vruntime adjustment on reweight
Linus Torvalds [Sun, 19 Nov 2023 21:30:21 +0000 (13:30 -0800)]
Merge tag 'locking_urgent_for_v6.7_rc2' of git://git./linux/kernel/git/tip/tip
Pull locking fix from Borislav Petkov:
- Fix a hardcoded futex flags case which led to one robust futex test
failure
* tag 'locking_urgent_for_v6.7_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
futex: Fix hardcoded flags
Linus Torvalds [Sun, 19 Nov 2023 21:26:42 +0000 (13:26 -0800)]
Merge tag 'perf_urgent_for_v6.7_rc2' of git://git./linux/kernel/git/tip/tip
Pull perf fix from Borislav Petkov:
- Make sure the context refcount is transferred too when migrating perf
events
* tag 'perf_urgent_for_v6.7_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/core: Fix cpuctx refcounting
Linus Torvalds [Sat, 18 Nov 2023 23:20:58 +0000 (15:20 -0800)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Seven small fixes, six in drivers and one in sd.
The sd fix is so large because it changes a struct pointer to a struct
but otherwise is fairly simple"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: ufs: qcom-ufs: dt-bindings: Document the SM8650 UFS Controller
scsi: sd: Fix sshdr use in sd_suspend_common()
scsi: scsi_debug: Delete some bogus error checking
scsi: scsi_debug: Fix some bugs in sdebug_error_write()
scsi: ufs: core: Fix racing issue between ufshcd_mcq_abort() and ISR
scsi: ufs: core: Expand MCQ queue slot to DeviceQueueDepth + 1
scsi: qla2xxx: Fix system crash due to bad pointer access
Linus Torvalds [Sat, 18 Nov 2023 23:13:10 +0000 (15:13 -0800)]
Merge tag 'parisc-for-6.7-rc2' of git://git./linux/kernel/git/deller/parisc-linux
Pull parisc fixes from Helge Deller:
"On parisc we still sometimes need writeable stacks, e.g. if programs
aren't compiled with gcc-14. To avoid issues with the upcoming
systemd-254 we therefore have to disable prctl(PR_SET_MDWE) for now
(for parisc only).
The other two patches are minor: a bugfix for the soft power-off on
qemu with 64-bit kernel and prefer strscpy() over strlcpy():
- Fix power soft-off on qemu
- Disable prctl(PR_SET_MDWE) since parisc sometimes still needs
writeable stacks
- Use strscpy instead of strlcpy in show_cpuinfo()"
* tag 'parisc-for-6.7-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
prctl: Disable prctl(PR_SET_MDWE) on parisc
parisc/power: Fix power soft-off when running on qemu
parisc: Replace strlcpy() with strscpy()
Linus Torvalds [Sat, 18 Nov 2023 19:28:28 +0000 (11:28 -0800)]
Merge tag 'xfs-6.7-fixes-1' of git://git./fs/xfs/xfs-linux
Pull xfs fixes from Chandan Babu:
- Fix deadlock arising due to intent items in AIL not being cleared
when log recovery fails
- Fix stale data exposure bug when remapping COW fork extents to data
fork
- Fix deadlock when data device flush fails
- Fix AGFL minimum size calculation
- Select DEBUG_FS instead of XFS_DEBUG when XFS_ONLINE_SCRUB_STATS is
selected
- Fix corruption of log inode's extent count field when NREXT64 feature
is enabled
* tag 'xfs-6.7-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: recovery should not clear di_flushiter unconditionally
xfs: inode recovery does not validate the recovered inode
xfs: fix again select in kconfig XFS_ONLINE_SCRUB_STATS
xfs: fix internal error from AGFL exhaustion
xfs: up(ic_sema) if flushing data device fails
xfs: only remap the written blocks in xfs_reflink_end_cow_extent
XFS: Update MAINTAINERS to catch all XFS documentation
xfs: abort intent items when recovery intents fail
xfs: factor out xfs_defer_pending_abort
Linus Torvalds [Sat, 18 Nov 2023 19:23:32 +0000 (11:23 -0800)]
Merge tag 'nfsd-6.7-1' of git://git./linux/kernel/git/cel/linux
Pull nfsd fixes from Chuck Lever:
- Fix several long-standing bugs in the duplicate reply cache
- Fix a memory leak
* tag 'nfsd-6.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
NFSD: Fix checksum mismatches in the duplicate reply cache
NFSD: Fix "start of NFS reply" pointer passed to nfsd_cache_update()
NFSD: Update nfsd_cache_append() to use xdr_stream
nfsd: fix file memleak on client_opens_release
Linus Torvalds [Sat, 18 Nov 2023 19:18:46 +0000 (11:18 -0800)]
Merge tag '6.7-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:
- multichannel fixes (including a lock ordering fix and an important
refcounting fix)
- spnego fix
* tag '6.7-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: fix lock ordering while disabling multichannel
cifs: fix leak of iface for primary channel
cifs: fix check of rc in function generate_smb3signingkey
cifs: spnego: add ';' in HOST_KEY_LEN
Helge Deller [Sat, 18 Nov 2023 18:33:35 +0000 (19:33 +0100)]
prctl: Disable prctl(PR_SET_MDWE) on parisc
systemd-254 tries to use prctl(PR_SET_MDWE) for its MemoryDenyWriteExecute
functionality, but fails on parisc which still needs executable stacks in
certain combinations of gcc/glibc/kernel.
Disable prctl(PR_SET_MDWE) by returning -EINVAL for now on parisc, until
userspace has caught up.
Signed-off-by: Helge Deller <deller@gmx.de>
Co-developed-by: Linus Torvalds <torvalds@linux-foundation.org>
Reported-by: Sam James <sam@gentoo.org>
Closes: https://github.com/systemd/systemd/issues/29775
Tested-by: Sam James <sam@gentoo.org>
Link: https://lore.kernel.org/all/875y2jro9a.fsf@gentoo.org/
Cc: <stable@vger.kernel.org> # v6.3+
Linus Torvalds [Sat, 18 Nov 2023 18:02:16 +0000 (10:02 -0800)]
Merge tag 'for-6.7/dm-fixes' of git://git./linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
- Various fixes for the DM delay target to address regressions
introduced during the 6.7 merge window
- Fixes to both DM bufio and the verity target for no-sleep mode,
to address sleeping while atomic issues
- Update DM crypt target in response to the treewide change that
made MAX_ORDER inclusive
* tag 'for-6.7/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm-crypt: start allocating with MAX_ORDER
dm-verity: don't use blocking calls from tasklets
dm-bufio: fix no-sleep mode
dm-delay: avoid duplicate logic
dm-delay: fix bugs introduced by kthread mode
dm-delay: fix a race between delay_presuspend and delay_bio
Helge Deller [Fri, 17 Nov 2023 15:43:52 +0000 (16:43 +0100)]
parisc/power: Fix power soft-off when running on qemu
Firmware returns the physical address of the power switch,
so we need to use gsc_writel() instead of direct memory access.
Fixes: d0c219472980 ("parisc/power: Add power soft-off when running on qemu")
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # v6.0+
Kees Cook [Thu, 16 Nov 2023 19:13:40 +0000 (11:13 -0800)]
parisc: Replace strlcpy() with strscpy()
strlcpy() reads the entire source buffer first. This read may exceed
the destination size limit. This is both inefficient and can lead
to linear read overflows if a source string is not NUL-terminated[1].
Additionally, it returns the size of the source string, not the
resulting size of the destination string. In an effort to remove strlcpy()
completely[2], replace strlcpy() here with strscpy().
Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strlcpy
Link: https://github.com/KSPP/linux/issues/89
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Azeem Shaikh <azeemshaikh38@gmail.com>
Cc: linux-parisc@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Helge Deller <deller@gmx.de>
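The difference in return values and read behaviour described in this commit can be illustrated with a minimal userspace sketch. demo_strlcpy() and demo_strscpy() are hypothetical stand-ins written for this example, not the kernel implementations; they only mimic the documented semantics (source length returned versus bytes copied or -E2BIG on truncation).

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for strlcpy(): always returns strlen(src). */
static size_t demo_strlcpy(char *dst, const char *src, size_t size)
{
	size_t len = strlen(src);	/* walks the whole source, however long */

	if (size) {
		size_t n = len < size - 1 ? len : size - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return len;
}

/* Hypothetical stand-in for strscpy(): bytes copied, or -E2BIG if truncated. */
static long demo_strscpy(char *dst, const char *src, size_t size)
{
	size_t i;

	if (size == 0)
		return -E2BIG;
	/* Reads at most 'size' bytes of src, so an unterminated source is bounded. */
	for (i = 0; i < size - 1 && src[i] != '\0'; i++)
		dst[i] = src[i];
	dst[i] = '\0';
	return src[i] == '\0' ? (long)i : -E2BIG;
}
```

With a 4-byte destination and the source "parisc", the first helper reports 6 (the source length) while the second reports -E2BIG, so a caller can detect truncation without comparing lengths.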
Linus Torvalds [Sat, 18 Nov 2023 17:44:14 +0000 (09:44 -0800)]
Merge tag 'i2c-for-6.7-rc2' of git://git./linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"Revert a not-working conversion to generic recovery for PXA,
use proper IO accessors for designware, and use proper PM level
for ocores to allow accessing interrupt providers late"
* tag 'i2c-for-6.7-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: ocores: Move system PM hooks to the NOIRQ phase
i2c: designware: Fix corrupted memory seen in the ISR
Revert "i2c: pxa: move to generic GPIO recovery"
Linus Torvalds [Sat, 18 Nov 2023 17:09:17 +0000 (09:09 -0800)]
Merge tag 'turbostat-2023.11.07' of git://git./linux/kernel/git/lenb/linux
Pull turbostat updates from Len Brown:
- Turbostat features are now table-driven (Rui Zhang)
- Add support for some new platforms (Sumeet Pawnikar, Rui Zhang)
- Gracefully run in configs when CPUs are limited (Rui Zhang, Srinivas
Pandruvada)
- misc minor fixes
[ This came in during the merge window, but sorting out the signed tag
took a while, so thus the late merge - Linus ]
* tag 'turbostat-2023.11.07' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: (86 commits)
tools/power turbostat: version 2023.11.07
tools/power/turbostat: bugfix "--show IPC"
tools/power/turbostat: Add initial support for LunarLake
tools/power/turbostat: Add initial support for ArrowLake
tools/power/turbostat: Add initial support for GrandRidge
tools/power/turbostat: Add initial support for SierraForest
tools/power/turbostat: Add initial support for GraniteRapids
tools/power/turbostat: Add MSR_CORE_C1_RES support for spr_features
tools/power/turbostat: Move process to root cgroup
tools/power/turbostat: Handle cgroup v2 cpu limitation
tools/power/turbostat: Abstract function for parsing cpu string
tools/power/turbostat: Handle offlined CPUs in cpu_subset
tools/power/turbostat: Obey allowed CPUs for system summary
tools/power/turbostat: Obey allowed CPUs for primary thread/core detection
tools/power/turbostat: Abstract several functions
tools/power/turbostat: Obey allowed CPUs during startup
tools/power/turbostat: Obey allowed CPUs when accessing CPU counters
tools/power/turbostat: Introduce cpu_allowed_set
tools/power/turbostat: Remove PC7/PC9 support on ADL/RPL
tools/power/turbostat: Enable MSR_CORE_C1_RES on recent Intel client platforms
...
Linus Torvalds [Fri, 17 Nov 2023 22:36:58 +0000 (14:36 -0800)]
Merge tag 'bcachefs-2023-11-17' of https://evilpiepirate.org/git/bcachefs
Pull bcachefs fixes from Kent Overstreet:
"Lots of small fixes for minor nits and compiler warnings.
Bigger items:
- The six locks lost wakeup is finally fixed: six_read_trylock() was
checking for the waiting bit before decrementing the number of
readers - validated the fix with a torture test.
- Fix for a memory reclaim issue: when needing to reallocate a key
cache key, we now do our usual GFP_NOWAIT; unlock(); GFP_KERNEL
dance.
- Multiple deleted inodes btree fixes
- Fix an issue in fsck, where i_nlink would be recalculated
incorrectly for hardlinked files if a snapshot had ever been taken.
- Kill journal pre-reservations: This is a bigger patch than I would
normally send at this point, but it deletes code and it fixes some
of our tests that would sporadically die with the journal getting
stuck, and it's a performance improvement, too"
* tag 'bcachefs-2023-11-17' of https://evilpiepirate.org/git/bcachefs: (22 commits)
bcachefs: Fix missing locking for dentry->d_parent access
bcachefs: six locks: Fix lost wakeup
bcachefs: Fix no_data_io mode checksum check
bcachefs: Fix bch2_check_nlinks() for snapshots
bcachefs: Don't decrease BTREE_ITER_MAX when LOCKDEP=y
bcachefs: Disable debug log statements
bcachefs: Fix missing transaction commit
bcachefs: Fix error path in bch2_mount()
bcachefs: Fix potential sleeping during mount
bcachefs: Fix iterator leak in may_delete_deleted_inode()
bcachefs: Kill journal pre-reservations
bcachefs: Check for nonce offset inconsistency in data_update path
bcachefs: Make sure to drop/retake btree locks before reclaim
bcachefs: btree_trans->write_locked
bcachefs: Run btree key cache shrinker less aggressively
bcachefs: Split out btree_key_cache_types.h
bcachefs: Guard against insufficient devices to create stripes
bcachefs: Fix null ptr deref in bch2_backpointer_get_node()
bcachefs: Fix multiple -Warray-bounds warnings
bcachefs: Use DECLARE_FLEX_ARRAY() helper and fix multiple -Warray-bounds warnings
...
Linus Torvalds [Fri, 17 Nov 2023 22:19:46 +0000 (14:19 -0800)]
Merge tag 'mm-hotfixes-stable-2023-11-17-14-04' of git://git./linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"Thirteen hotfixes. Seven are cc:stable and the remainder pertain to
post-6.6 issues or aren't considered suitable for backporting"
* tag 'mm-hotfixes-stable-2023-11-17-14-04' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
mm: more ptep_get() conversion
parisc: fix mmap_base calculation when stack grows upwards
mm/damon/core.c: avoid unintentional filtering out of schemes
mm: kmem: drop __GFP_NOFAIL when allocating objcg vectors
mm/damon/sysfs-schemes: handle tried region directory allocation failure
mm/damon/sysfs-schemes: handle tried regions sysfs directory allocation failure
mm/damon/sysfs: check error from damon_sysfs_update_target()
mm: fix for negative counter: nr_file_hugepages
selftests/mm: add hugetlb_fault_after_madv to .gitignore
selftests/mm: restore number of hugepages
selftests: mm: fix some build warnings
selftests: mm: skip whole test instead of failure
mm/damon/sysfs: eliminate potential uninitialized variable warning
Linus Torvalds [Fri, 17 Nov 2023 22:08:14 +0000 (14:08 -0800)]
Merge tag 'block-6.7-2023-11-17' of git://git.kernel.dk/linux
Pull block fix from Jens Axboe:
"Just a single fix from Christoph/Ming, fixing a case where integrity
IO could be called without having an appropriate queue reference"
* tag 'block-6.7-2023-11-17' of git://git.kernel.dk/linux:
blk-mq: make sure active queue usage is held for bio_integrity_prep()
Linus Torvalds [Fri, 17 Nov 2023 22:03:18 +0000 (14:03 -0800)]
Merge tag 'io_uring-6.7-2023-11-17' of git://git.kernel.dk/linux
Pull io_uring fix from Jens Axboe:
"Just a single fixup for a change we made in this release, which caused
a regression in sometimes missing fdinfo output if the SQPOLL thread
had the lock held when fdinfo output was retrieved.
This brings us back on par with what we had before, where just the
main uring_lock will prevent that output. We'd love to get rid of that
too, but that is beyond the scope of this release and will have to
wait for 6.8"
* tag 'io_uring-6.7-2023-11-17' of git://git.kernel.dk/linux:
io_uring/fdinfo: remove need for sqpoll lock for thread/pid retrieval
Linus Torvalds [Fri, 17 Nov 2023 21:58:26 +0000 (13:58 -0800)]
Merge tag 'drm-fixes-2023-11-17' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Daniel Vetter:
"This is a 'blast from the past' fixes pull, because it contains a
bunch of AGP fixes for amdgpu. Otherwise nothing out of the ordinary.
Next week is back to Dave unless he's knocked out by some conference
bug.
- amdgpu: fixes all over, including a set of AGP fixes
- nouveau: GSP + other bugfixes
- ivpu build fix
- lenovo legion go panel orientation quirk"
* tag 'drm-fixes-2023-11-17' of git://anongit.freedesktop.org/drm/drm: (30 commits)
drm/amdgpu/gmc9: disable AGP aperture
drm/amdgpu/gmc10: disable AGP aperture
drm/amdgpu/gmc11: disable AGP aperture
drm/amdgpu: add a module parameter to control the AGP aperture
drm/amdgpu/gmc11: fix logic typo in AGP check
drm/amd/display: Fix encoder disable logic
drm/amd/display: Change the DMCUB mailbox memory location from FB to inbox
drm/amdgpu: add and populate the port num into xgmi topology info
drm/amd/display: Negate IPS allow and commit bits
drm/amd/pm: Don't send unload message for reset
drm/amdgpu: fix ras err_data null pointer issue in amdgpu_ras.c
drm/amd/display: Clear dpcd_sink_ext_caps if not set
drm/amd/display: Enable fast plane updates on DCN3.2 and above
drm/amd/display: fix NULL dereference
drm/amd/display: fix a NULL pointer dereference in amdgpu_dm_i2c_xfer()
drm/amd/display: Add null checks for 8K60 lightup
drm/amd/pm: Fill pcie error counters for gpu v1_4
drm/amd/pm: Update metric table for smu v13_0_6
drm/amdgpu: correct chunk_ptr to a pointer to chunk.
drm/amd/display: Fix DSC not Enabled on Direct MST Sink
...
Chuck Lever [Fri, 10 Nov 2023 16:28:45 +0000 (11:28 -0500)]
NFSD: Fix checksum mismatches in the duplicate reply cache
nfsd_cache_csum() currently assumes that the server's RPC layer has
been advancing rq_arg.head[0].iov_base as it decodes an incoming
request, because that's the way it used to work. On entry, it
expects that buf->head[0].iov_base points to the start of the NFS
header, and excludes the already-decoded RPC header.
These days however, head[0].iov_base now points to the start of the
RPC header during all processing. It no longer points at the NFS
Call header when execution arrives at nfsd_cache_csum().
In a retransmitted RPC the XID and the NFS header are supposed to
be the same as the original message, but the contents of the
retransmitted RPC header can be different. For example, for krb5,
the GSS sequence number will be different between the two. Thus if
the RPC header is always included in the DRC checksum computation,
the checksum of the retransmitted message might not match the
checksum of the original message, even though the NFS part of these
messages is identical.
The result is that, even if a matching XID is found in the DRC,
the checksum mismatch causes the server to execute the
retransmitted RPC transaction again.
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Tested-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
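Why including the RPC header breaks duplicate detection can be shown with a small userspace sketch. The message layout, the toy checksum, and all demo_* names are assumptions made for illustration; the commit text does not show the actual patch, and the kernel uses a different checksum.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical message: an RPC header followed by the NFS Call payload. */
struct demo_msg {
	const uint8_t *buf;
	size_t rpc_hdr_len;	/* bytes of RPC header at the front of buf */
	size_t total_len;	/* RPC header plus NFS payload */
};

/* Toy polynomial checksum, only for demonstration. */
static uint32_t demo_sum(const uint8_t *p, size_t len)
{
	uint32_t sum = 0;

	while (len--)
		sum = sum * 31 + *p++;
	return sum;
}

/* Checksum over everything, RPC header included. */
static uint32_t demo_csum_whole(const struct demo_msg *m)
{
	return demo_sum(m->buf, m->total_len);
}

/* Checksum that skips the RPC header and covers only the NFS part. */
static uint32_t demo_csum_nfs_only(const struct demo_msg *m)
{
	return demo_sum(m->buf + m->rpc_hdr_len, m->total_len - m->rpc_hdr_len);
}
```

If two messages carry the same NFS payload but differ in one header byte (standing in for a changed GSS sequence number), demo_csum_whole() yields different values while demo_csum_nfs_only() matches, which is the behaviour a duplicate reply cache needs.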
Chuck Lever [Fri, 10 Nov 2023 16:28:33 +0000 (11:28 -0500)]
NFSD: Fix "start of NFS reply" pointer passed to nfsd_cache_update()
The "statp + 1" pointer that is passed to nfsd_cache_update() is
supposed to point to the start of the egress NFS Reply header. In
fact, it does point there for AUTH_SYS and RPCSEC_GSS_KRB5 requests.
But both krb5i and krb5p add fields between the RPC header's
accept_stat field and the start of the NFS Reply header. In those
cases, "statp + 1" points at the extra fields instead of the Reply.
The result is that nfsd_cache_update() caches what looks to the
client like garbage.
A connection break can occur for a number of reasons, but the most
common reason when using krb5i/p is a GSS sequence number window
underrun. When an underrun is detected, the server is obliged to
drop the RPC and the connection to force a retransmit with a fresh
GSS sequence number. The client presents the same XID, it hits in
the server's DRC, and the server returns the garbage cache entry.
The "statp + 1" argument has been used since the oldest changeset
in the kernel history repo, so it has been in nfsd_dispatch()
literally since before history began. The problem arose only when
the server-side GSS implementation was added twenty years ago.
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Tested-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Chuck Lever [Fri, 10 Nov 2023 16:28:39 +0000 (11:28 -0500)]
NFSD: Update nfsd_cache_append() to use xdr_stream
When inserting a DRC-cached response into the reply buffer, ensure
that the reply buffer's xdr_stream is updated properly. Otherwise
the server will send a garbage response.
Cc: stable@vger.kernel.org # v6.3+
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Tested-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Mahmoud Adam [Fri, 10 Nov 2023 18:21:04 +0000 (19:21 +0100)]
nfsd: fix file memleak on client_opens_release
seq_release should be called to free the allocated seq_file
Cc: stable@vger.kernel.org # v5.3+
Signed-off-by: Mahmoud Adam <mngyadam@amazon.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Fixes: 78599c42ae3c ("nfsd4: add file to display list of client's opens")
Reviewed-by: NeilBrown <neilb@suse.de>
Tested-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Mikulas Patocka [Fri, 17 Nov 2023 17:38:33 +0000 (18:38 +0100)]
dm-crypt: start allocating with MAX_ORDER
Commit 23baf831a32c ("mm, treewide: redefine MAX_ORDER sanely")
changed the meaning of MAX_ORDER from exclusive to inclusive. So, we
can allocate compound pages with up to 1 << MAX_ORDER pages.
Reflect this change in dm-crypt and start trying to allocate compound
pages with MAX_ORDER.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
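The effect of an inclusive MAX_ORDER on a "try the largest order first" allocation loop can be sketched in userspace C. DEMO_MAX_ORDER, the callback type, and demo_fits_up_to_3() are hypothetical names chosen for this example; this is not the dm-crypt code.

```c
#define DEMO_MAX_ORDER 10	/* hypothetical value, treated as inclusive */

/* Number of pages in a compound page of the given order. */
static unsigned long demo_pages_for_order(unsigned int order)
{
	return 1UL << order;
}

/* Sample allocator callback: pretends only orders 0..3 can be satisfied. */
static int demo_fits_up_to_3(unsigned int order)
{
	return order <= 3;
}

/*
 * Try start_order first and fall back to smaller orders.
 * Returns the order that succeeded, or -1 if even order 0 failed.
 */
static int demo_alloc_descending(unsigned int start_order,
				 int (*try_alloc)(unsigned int order))
{
	unsigned int order = start_order;

	for (;;) {
		if (try_alloc(order))
			return (int)order;
		if (order == 0)
			return -1;
		order--;
	}
}
```

Under the inclusive convention the loop may start at DEMO_MAX_ORDER itself, i.e. at 1 << DEMO_MAX_ORDER pages; a loop that still started at DEMO_MAX_ORDER - 1 would never attempt the largest permitted size.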
Mikulas Patocka [Fri, 17 Nov 2023 17:37:25 +0000 (18:37 +0100)]
dm-verity: don't use blocking calls from tasklets
Commit 5721d4e5a9cd enhanced dm-verity so that it can verify blocks
from tasklets rather than from workqueues. This reportedly improves
performance significantly.
However, dm-verity was using the flag CRYPTO_TFM_REQ_MAY_SLEEP from
tasklets, which resulted in warnings about a sleeping function being called
from non-sleeping context.
BUG: sleeping function called from invalid context at crypto/internal.h:206
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 14, name: ksoftirqd/0
preempt_count: 100, expected: 0
RCU nest depth: 0, expected: 0
CPU: 0 PID: 14 Comm: ksoftirqd/0 Tainted: G W 6.7.0-rc1 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x32/0x50
__might_resched+0x110/0x160
crypto_hash_walk_done+0x54/0xb0
shash_ahash_update+0x51/0x60
verity_hash_update.isra.0+0x4a/0x130 [dm_verity]
verity_verify_io+0x165/0x550 [dm_verity]
? free_unref_page+0xdf/0x170
? psi_group_change+0x113/0x390
verity_tasklet+0xd/0x70 [dm_verity]
tasklet_action_common.isra.0+0xb3/0xc0
__do_softirq+0xaf/0x1ec
? smpboot_thread_fn+0x1d/0x200
? sort_range+0x20/0x20
run_ksoftirqd+0x15/0x30
smpboot_thread_fn+0xed/0x200
kthread+0xdc/0x110
? kthread_complete_and_exit+0x20/0x20
ret_from_fork+0x28/0x40
? kthread_complete_and_exit+0x20/0x20
ret_from_fork_asm+0x11/0x20
</TASK>
This commit fixes dm-verity so that it doesn't use the flags
CRYPTO_TFM_REQ_MAY_SLEEP and CRYPTO_TFM_REQ_MAY_BACKLOG from tasklets.
The crypto API then does a GFP_ATOMIC allocation instead, which can fail
with -ENOMEM; we catch -ENOMEM in verity_tasklet and requeue the request
to the workqueue.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org # v6.0+
Fixes: 5721d4e5a9cd ("dm verity: Add optional "try_verify_in_tasklet" feature")
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
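The fallback described above (attempt the work in atomic context, and on -ENOMEM hand it to a context that may sleep) can be sketched as plain C. Every demo_* name is hypothetical, and memory pressure is simulated with a flag; this is an illustration of the control flow only, not the dm-verity implementation.

```c
#include <errno.h>

enum demo_path {
	DEMO_DONE_IN_TASKLET,
	DEMO_REQUEUED_TO_WORKQUEUE,
};

/*
 * Hypothetical verifier. When may_sleep is 0 it must not block, so it
 * fails with -ENOMEM if no atomic memory is available. When may_sleep
 * is 1 it is allowed to wait for memory and therefore succeeds here.
 */
static int demo_verify(int may_sleep, int atomic_memory_available)
{
	if (!may_sleep && !atomic_memory_available)
		return -ENOMEM;
	return 0;
}

/* Tasklet-style entry point: non-sleeping attempt first, then fall back. */
static enum demo_path demo_tasklet(int atomic_memory_available, int *result)
{
	int r = demo_verify(0, atomic_memory_available);

	if (r == -ENOMEM) {
		/* Stand-in for requeueing the request to a workqueue. */
		*result = demo_verify(1, atomic_memory_available);
		return DEMO_REQUEUED_TO_WORKQUEUE;
	}
	*result = r;
	return DEMO_DONE_IN_TASKLET;
}
```

In this sketch the sleeping retry runs inline for simplicity; in a real driver the retry would run later from process context.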
Mikulas Patocka [Fri, 17 Nov 2023 17:36:34 +0000 (18:36 +0100)]
dm-bufio: fix no-sleep mode
dm-bufio has a no-sleep mode. When activated (with the
DM_BUFIO_CLIENT_NO_SLEEP flag), the bufio client is read-only and we
could call dm_bufio_get from tasklets. This is used by dm-verity.
Unfortunately, commit 450e8dee51aa ("dm bufio: improve concurrent IO
performance") broke this and the kernel would warn that cache_get()
was calling down_read() from a non-sleeping context. The bug can be
reproduced by using "veritysetup open" with the "--use-tasklets"
flag.
This commit fixes dm-bufio, so that the tasklet mode works again, by
expanding use of the 'no_sleep_enabled' static_key to conditionally
use either a rw_semaphore or rwlock_t (which are colocated in the
buffer_tree structure using a union).
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org # v6.4
Fixes: 450e8dee51aa ("dm bufio: improve concurrent IO performance")
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Mikulas Patocka [Fri, 17 Nov 2023 17:24:04 +0000 (18:24 +0100)]
dm-delay: avoid duplicate logic
This is a small refactoring of dm-delay - we avoid duplicate logic in
flush_delayed_bios and flush_delayed_bios_fast and join these two
functions into one.
We also add cond_resched() to flush_delayed_bios because the list may have
an unbounded number of entries.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>