Lukasz Luba [Wed, 20 Dec 2023 23:17:49 +0000 (23:17 +0000)]
thermal: gov_power_allocator: Move memory allocation out of throttle()
The new thermal callback allows to react to the change of cooling
instances in the thermal zone. Move the memory allocation to that new
callback and save CPU cycles in the throttle() code path.
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Lukasz Luba [Wed, 20 Dec 2023 23:17:48 +0000 (23:17 +0000)]
thermal: gov_power_allocator: Change trace functions
Change trace event trace_thermal_power_allocator() to not use dynamic
array for requested power and granted power for all power actors.
Instead, simplify the trace event and print other simple values.
Add new trace event to print power actor information of requested power
and granted power. That trace event would be called in a loop for each
power actor. The trace data would be easier to parse comparing to the
dynamic array implementation.
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Lukasz Luba [Wed, 20 Dec 2023 23:17:47 +0000 (23:17 +0000)]
thermal: gov_power_allocator: Refactor checks in divvy_up_power()
Simplify the code and remove one extra 'if' block.
No intentional functional impact.
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Lukasz Luba [Wed, 20 Dec 2023 23:17:46 +0000 (23:17 +0000)]
thermal: gov_power_allocator: Refactor check_power_actors()
In preparation for a subsequent change, rearrange check_power_actors().
No intentional functional impact.
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Lukasz Luba [Wed, 20 Dec 2023 23:17:45 +0000 (23:17 +0000)]
thermal: core: Add governor callback for thermal zone change
Add a new callback to the struct thermal_governor. It can be used for
updating governors when there is a change in the thermal zone internals,
e.g. thermal cooling device is bind to the thermal zone.
That makes possible to move some heavy operations like memory allocations
related to the number of cooling instances out of the throttle() callback.
Both callback code paths (throttle() and update_tz()) are protected with
the same thermal zone lock, which guaranties the consistency.
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Stanislaw Gruszka [Thu, 28 Dec 2023 10:02:48 +0000 (11:02 +0100)]
thermal: netlink: Add thermal_group_has_listeners() helper
Add a helper function to check if there are listeners for
thermal_gnl_family multicast groups.
For now use it to avoid unnecessary allocations and sending
thermal genl messages when there are no recipients.
In the future, in conjunction with (not yet implemented) notification
of change in the netlink socket group membership, this helper can be
used to open/close hardware interfaces based on the presence of
user space subscribers.
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Stanislaw Gruszka [Thu, 28 Dec 2023 10:02:47 +0000 (11:02 +0100)]
thermal: netlink: Add enum for mutlicast groups indexes
Use enum instead of hard-coded numbers for indexing multicast groups.
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Rafael J. Wysocki [Mon, 18 Dec 2023 19:28:31 +0000 (20:28 +0100)]
thermal: core: Resume thermal zones asynchronously
The resume of thermal zones in thermal_pm_notify() is carried out
sequentially, which may be a problem if __thermal_zone_device_update()
takes a significant time to run for some thermal zones, because some
other thermal zones may need to wait for them to resume then and if
any other PM notifiers are going to be invoked after the thermal one,
they will need to wait for it either.
To address this, make thermal_pm_notify() switch the poll_queue delayed
work over to a one-shot thermal_zone_device_resume() work function that
will restore the original one during the thermal zone resume and queue
up poll_queue without a delay for each thermal zone.
Link: https://lore.kernel.org/linux-pm/20231120234015.3273143-1-radusolea@google.com/
Reported-by: Radu Solea <radusolea@google.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Rafael J. Wysocki [Mon, 18 Dec 2023 19:26:47 +0000 (20:26 +0100)]
thermal: core: Initialize poll_queue in thermal_zone_device_init()
In preparation for a subsequent change, move the initialization of the
poll_queue delayed work from thermal_zone_device_register_with_trips()
to thermal_zone_device_init() which is called by the former.
However, because thermal_zone_device_init() is also called by
thermal_pm_notify(), make the latter call cancel_delayed_work() on
poll_queue before invoking the former, so as to allow the work
item to be re-initialized safely.
Also move thermal_zone_device_check() which needs to be defined
before thermal_zone_device_init(), so the latter can pass it to the
INIT_DELAYED_WORK() macro.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Rafael J. Wysocki [Mon, 18 Dec 2023 19:25:02 +0000 (20:25 +0100)]
thermal: core: Fix thermal zone suspend-resume synchronization
There are 3 synchronization issues with thermal zone suspend-resume
during system-wide transitions:
1. The resume code runs in a PM notifier which is invoked after user
space has been thawed, so it can run concurrently with user space
which can trigger a thermal zone device removal. If that happens,
the thermal zone resume code may use a stale pointer to the next
list element and crash, because it does not hold thermal_list_lock
while walking thermal_tz_list.
2. The thermal zone resume code calls thermal_zone_device_init()
outside the zone lock, so user space or an update triggered by
the platform firmware may see an inconsistent state of a
thermal zone leading to unexpected behavior.
3. Clearing the in_suspend global variable in thermal_pm_notify()
allows __thermal_zone_device_update() to continue for all thermal
zones and it may as well run before the thermal_tz_list walk (or
at any point during the list walk for that matter) and attempt to
operate on a thermal zone that has not been resumed yet. It may
also race destructively with thermal_zone_device_init().
To address these issues, add thermal_list_lock locking to
thermal_pm_notify(), especially arount the thermal_tz_list,
make it call thermal_zone_device_init() back-to-back with
__thermal_zone_device_update() under the zone lock and replace
in_suspend with per-zone bool "suspend" indicators set and unset
under the given zone's lock.
Link: https://lore.kernel.org/linux-pm/20231218162348.69101-1-bo.ye@mediatek.com/
Reported-by: Bo Ye <bo.ye@mediatek.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Randy Dunlap [Thu, 21 Dec 2023 05:51:44 +0000 (21:51 -0800)]
thermal: cpuidle_cooling: fix kernel-doc warning and a spello
Correct one misuse of kernel-doc notation and one spelling error as
reported by codespell.
cpuidle_cooling.c:152: warning: cannot understand function prototype: 'struct thermal_cooling_device_ops cpuidle_cooling_ops = '
For the kernel-doc warning, don't use "/**" for a comment on data.
kernel-doc can be used for structure declarations but not definitions.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Rafael J. Wysocki [Thu, 14 Dec 2023 10:52:25 +0000 (11:52 +0100)]
thermal: core: Fix NULL pointer dereference in zone registration error path
If device_register() in thermal_zone_device_register_with_trips()
returns an error, the tz variable is set to NULL and subsequently
dereferenced in kfree(tz->tzp).
Commit
adc8749b150c ("thermal/drivers/core: Use put_device() if
device_register() fails") added the tz = NULL assignment in question to
avoid a possible double-free after dropping the reference to the zone
device. However, after commit
4649620d9404 ("thermal: core: Make
thermal_zone_device_unregister() return after freeing the zone"), that
assignment has become redundant, because dropping the reference to the
zone device does not cause the zone object to be freed any more.
Drop it to address the NULL pointer dereference.
Fixes: 3d439b1a2ad3 ("thermal/core: Alloc-copy-free the thermal zone parameters structure")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Daniel Lezcano [Wed, 13 Dec 2023 12:13:22 +0000 (13:13 +0100)]
thermal/core: Check get_temp ops is present when registering a tz
Initially the check against the get_temp ops in the
thermal_zone_device_update() was put in there in order to catch
drivers not providing this method.
Instead of checking again and again the function if the ops exists in
the update function, let's do the check at registration time, so it is
checked one time and for all.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Rafael J. Wysocki [Tue, 5 Dec 2023 19:18:39 +0000 (20:18 +0100)]
thermal: trip: Send trip change notifications on all trip updates
The _store callbacks of the trip point temperature and hysteresis sysfs
attributes invoke thermal_notify_tz_trip_change() to send a notification
regarding the trip point change, but when trip points are updated by the
platform firmware, trip point change notifications are not sent.
To make the behavior after a trip point change more consistent,
modify all of the 3 places where trip point temperature is updated
to use a new function called thermal_zone_set_trip_temp() for this
purpose and make that function call thermal_notify_tz_trip_change().
Note that trip point hysteresis can only be updated via sysfs and
trip_point_hyst_store() calls thermal_notify_tz_trip_change() already,
so this code path need not be changed.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Rafael J. Wysocki [Mon, 4 Dec 2023 19:49:03 +0000 (20:49 +0100)]
thermal: netlink: Use for_each_trip() in thermal_genl_cmd_tz_get_trip()
Make thermal_genl_cmd_tz_get_trip() use for_each_trip() instead of an open-
coded loop over trip indices.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Rafael J. Wysocki [Mon, 4 Dec 2023 19:46:35 +0000 (20:46 +0100)]
thermal: helpers: Use for_each_trip() in __thermal_zone_get_temp()
Make __thermal_zone_get_temp() use for_each_trip() instead of an open-
coded loop over trip indices.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Rafael J. Wysocki [Mon, 4 Dec 2023 19:41:30 +0000 (20:41 +0100)]
thermal: trip: Use for_each_trip() in __thermal_zone_set_trips()
Make __thermal_zone_set_trips() use for_each_trip() instead of an open-
coded loop over trip indices.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Rafael J. Wysocki [Mon, 4 Dec 2023 19:36:14 +0000 (20:36 +0100)]
thermal: trip: Drop redundant __thermal_zone_get_trip() header
The __thermal_zone_get_trip() header in drivers/thermal/thermal_core.h
is redundant, because there is one already in thermal.h, so drop it.
No functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Rafael J. Wysocki [Fri, 8 Dec 2023 19:20:00 +0000 (20:20 +0100)]
thermal: core: Rework thermal zone availability check
In order to avoid running __thermal_zone_device_update() for thermal
zones going away, the thermal zone lock is held around device_del()
in thermal_zone_device_unregister() and thermal_zone_device_update()
passes the given thermal zone device to device_is_registered().
This allows thermal_zone_device_update() to skip the
__thermal_zone_device_update() if device_del() has already run for
the thermal zone at hand.
However, instead of looking at driver core internals, the thermal
subsystem may as well rely on its own data structures for this
purpose. Namely, if the thermal zone is not present in
thermal_tz_list, it can be regarded as unavailable, which in fact is
already the case in thermal_zone_device_unregister(). Accordingly,
the device_is_registered() check in thermal_zone_device_update() can
be replaced with checking whether or not the node list_head in struct
thermal_zone_device is empty, in which case it is not there in
thermal_tz_list.
To make this work, though, it is necessary to initialize tz->node
in thermal_zone_device_register_with_trips() before registering the
thermal zone device and it needs to be added to thermal_tz_list and
deleted from it under its zone lock.
After the above modifications, the zone lock does not need to be
held around device_del() in thermal_zone_device_unregister() any more.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-and-tested-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Rafael J. Wysocki [Fri, 8 Dec 2023 19:19:03 +0000 (20:19 +0100)]
thermal: Drop redundant and confusing device_is_registered() checks
Multiple places in the thermal subsystem (most importantly, sysfs
attribute callback functions) check if the given thermal zone device is
still registered in order to return early in case the device_del() in
thermal_zone_device_unregister() has run already.
However, after thermal_zone_device_unregister() has been made wait for
all of the zone-related activity to complete before returning, it is
not necessary to do that any more, because all of the code holding a
reference to the thermal zone device object will be waited for even if
it does not do anything special to enforce this.
Accordingly, drop all of the device_is_registered() checks that are now
redundant and get rid of the zone locking that is not necessary any more
after dropping them.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-and-tested-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Rafael J. Wysocki [Fri, 8 Dec 2023 19:13:44 +0000 (20:13 +0100)]
thermal: core: Make thermal_zone_device_unregister() return after freeing the zone
Make thermal_zone_device_unregister() wait until all of the references
to the given thermal zone object have been dropped and free it before
returning.
This guarantees that when thermal_zone_device_unregister() returns,
there is no leftover activity regarding the thermal zone in question
which is required by some of its callers (for instance, modular driver
code that wants to know when it is safe to let the module go away).
Subsequently, this will allow some confusing device_is_registered()
checks to be dropped from the thermal sysfs and core code.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-and-tested-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Rafael J. Wysocki [Tue, 5 Dec 2023 12:26:59 +0000 (13:26 +0100)]
thermal: sysfs: Rework the reading of trip point attributes
Rework the _show() callback functions for the trip point temperature,
hysteresis and type attributes to avoid copying the values of struct
thermal_trip fields that they do not use and make them carry out the
same validation checks as the corresponding _store() callback functions.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Rafael J. Wysocki [Tue, 5 Dec 2023 12:24:08 +0000 (13:24 +0100)]
thermal: sysfs: Rework the handling of trip point updates
Both trip_point_temp_store() and trip_point_hyst_store() use
thermal_zone_set_trip() to update a given trip point, but none of them
actually needs to change more than one field in struct thermal_trip
representing it. However, each of them effectively calls
__thermal_zone_get_trip() twice in a row for the same trip index value,
once directly and once via thermal_zone_set_trip(), which is not
particularly efficient, and the way in which thermal_zone_set_trip()
carries out the update is not particularly straightforward.
Moreover, input processing need not be done under the thermal zone lock
in any of these functions.
Rework trip_point_temp_store() and trip_point_hyst_store() to address
the above, move the part of thermal_zone_set_trip() that is still
useful to a new function called thermal_zone_trip_updated() and drop
the rest of it.
While at it, make trip_point_hyst_store() reject negative hysteresis
values.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Rafael J. Wysocki [Wed, 29 Nov 2023 13:36:07 +0000 (14:36 +0100)]
thermal: trip: Drop a redundant check from thermal_zone_set_trip()
After recent changes in the thermal framework, a trip points array is
required for registering a thermal zone that is not tripless, so the
tz->trips pointer in thermal_zone_set_trip() is never NULL and the
check involving it is redundant. Drop that check.
No functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Lukasz Luba [Wed, 25 Oct 2023 19:22:25 +0000 (20:22 +0100)]
thermal: gov_power_allocator: Rearrange initialization of local variables
Rearrange the initialization of local variables in allocate_power() so
as to improve code clarity and the visibility of the initial values.
This change is not expected to alter the general functionality.
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Lukasz Luba [Wed, 25 Oct 2023 19:22:24 +0000 (20:22 +0100)]
thermal: gov_power_allocator: Remove excessive local variables
Local variable 'ret' in allocate_power() is only used in the return
statement, so drop it.
Local variable 'trip_max' in allocate_power() is only used for caching
the params->trip_max value which may as well be accessed directly as
needed, so drop it either.
This change is not expected to alter the general functionality.
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Lukasz Luba [Wed, 25 Oct 2023 19:22:23 +0000 (20:22 +0100)]
thermal: gov_power_allocator: Use shorter paths to access data when possible
The 'cdev' pointer in allow_maximum_power() is valid, so there is no
need to use 'instance->cdev' instead of it.
This change is not expected to alter the general functionality.
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Lukasz Luba [Wed, 25 Oct 2023 19:22:22 +0000 (20:22 +0100)]
thermal: gov_power_allocator: Rearrange local variables
Rearrange the order of local variable definitions in multiple functions
so as to follow the kernel coding style in that respect.
Also, move local variable definitions located in nested code blocks to
the beginning of each function to improve the visibility of all local
variables in use.
This change is not expected to alter the general functionality.
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Lukasz Luba [Wed, 25 Oct 2023 19:22:21 +0000 (20:22 +0100)]
thermal: gov_power_allocator: Check the cooling devices only for trip_max
The throttling logic only cares about the last passive trip point and
the cooling devices attached to it.
Therefore, there is no need to bail out if other trip points have
cooling devices which are not a supported by the IPA.
Check the cooling devices only for 'trip_max' during the binding.
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Lukasz Luba [Wed, 25 Oct 2023 19:22:20 +0000 (20:22 +0100)]
thermal: gov_power_allocator: Set up trip points earlier
Set up the trip points at the beginning of the binding function.
This simplifies the code a bit and allows for further cleanups.
Also add a check to fail the binding if the last passive trip point is
not found.
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Lukasz Luba [Wed, 25 Oct 2023 19:22:19 +0000 (20:22 +0100)]
thermal: gov_power_allocator: Rename trip_max_desired_temperature
Refactor the code and rename the last passive trip point field.
There is a comment describing the field properly. Use shorter field name
so as to allow to clarify the code.
This change is not expected to alter the general functionality.
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Rafael J. Wysocki [Thu, 9 Nov 2023 15:01:48 +0000 (16:01 +0100)]
thermal: core: Add trip thresholds for trip crossing detection
The trip crossing detection in handle_thermal_trip() does not work
correctly in the cases when a trip point is crossed on the way up and
then the zone temperature stays above its low temperature (that is, its
temperature decreased by its hysteresis). The trip temperature may
be passed by the zone temperature subsequently in that case, even
multiple times, but that does not count as the trip crossing as long as
the zone temperature does not fall below the trip's low temperature or,
in other words, until the trip is crossed on the way down.
|-----------low--------high------------|
|<--------->|
| hyst |
| |
| -|--> crossed on the way up
|
<---|-- crossed on the way down
However, handle_thermal_trip() will invoke thermal_notify_tz_trip_up()
every time the trip temperature is passed by the zone temperature on
the way up regardless of whether or not the trip has been crossed on
the way down yet. Moreover, it will not call thermal_notify_tz_trip_down()
if the last zone temperature was between the trip's temperature and its
low temperature, so some "trip crossed on the way down" events may not
be reported.
To address this issue, introduce trip thresholds equal to either the
temperature of the given trip, or its low temperature, such that if
the trip's threshold is passed by the zone temperature on the way up,
its value will be set to the trip's low temperature and
thermal_notify_tz_trip_up() will be called, and if the trip's threshold
is passed by the zone temperature on the way down, its value will be set
to the trip's temperature (high) and thermal_notify_tz_trip_down() will
be called. Accordingly, if the threshold is passed on the way up, it
cannot be passed on the way up again until its passed on the way down
and if it is passed on the way down, it cannot be passed on the way down
again until it is passed on the way up which guarantees correct
triggering of trip crossing notifications.
If the last temperature of the zone is invalid, the trip's threshold
will be set depending of the zone's current temperature: If that
temperature is above the trip's temperature, its threshold will be
set to its low temperature or otherwise its threshold will be set to
its (high) temperature. Because the zone temperature is initially
set to invalid and tz->last_temperature is only updated by
update_temperature(), this is sufficient to set the correct initial
threshold values for all trips.
Link: https://lore.kernel.org/all/20220718145038.1114379-4-daniel.lezcano@linaro.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Linus Torvalds [Sun, 19 Nov 2023 23:02:14 +0000 (15:02 -0800)]
Linux 6.7-rc2
Linus Torvalds [Sun, 19 Nov 2023 21:54:28 +0000 (13:54 -0800)]
Merge tag 'kbuild-fixes-v6.7' of git://git./linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
- Fix section mismatch warning messages for riscv and loongarch
- Remove CONFIG_IA64 left-over from linux/export-internal.h
- Fix the location of the quotes for UIMAGE_NAME
- Fix a memory leak bug in Kconfig
* tag 'kbuild-fixes-v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kconfig: fix memory leak from range properties
kbuild: Move the single quotes for image name
linux/export: clean up the IA-64 KSYM_FUNC macro
modpost: fix section mismatch message for RELA
Linus Torvalds [Sun, 19 Nov 2023 21:49:32 +0000 (13:49 -0800)]
Merge tag 'irq_urgent_for_v6.7_rc2' of git://git./linux/kernel/git/tip/tip
Pull irq fix from Borislav Petkov:
- Flush the translation service tables to prevent unpredictable
behavior on non-coherent GIC devices
* tag 'irq_urgent_for_v6.7_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/gic-v3-its: Flush ITS tables correctly in non-coherent GIC designs
Linus Torvalds [Sun, 19 Nov 2023 21:46:17 +0000 (13:46 -0800)]
Merge tag 'x86_urgent_for_v6.7_rc2' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
- Ignore invalid x2APIC entries in order to not waste per-CPU data
- Fix a back-to-back signals handling scenario when shadow stack is in
use
- A documentation fix
- Add Kirill as TDX maintainer
* tag 'x86_urgent_for_v6.7_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/acpi: Ignore invalid x2APIC entries
x86/shstk: Delay signal entry SSP write until after user accesses
x86/Documentation: Indent 'note::' directive for protocol version number note
MAINTAINERS: Add Intel TDX entry
Linus Torvalds [Sun, 19 Nov 2023 21:35:07 +0000 (13:35 -0800)]
Merge tag 'timers_urgent_for_v6.7_rc2' of git://git./linux/kernel/git/tip/tip
Pull timer fix from Borislav Petkov:
- Do the push of pending hrtimers away from a CPU which is being
offlined earlier in the offlining process in order to prevent a
deadlock
* tag 'timers_urgent_for_v6.7_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
hrtimers: Push pending hrtimers away from outgoing CPU earlier
Linus Torvalds [Sun, 19 Nov 2023 21:32:00 +0000 (13:32 -0800)]
Merge tag 'sched_urgent_for_v6.7_rc2' of git://git./linux/kernel/git/tip/tip
Pull scheduler fixes from Borislav Petkov:
- Fix virtual runtime calculation when recomputing a sched entity's
weights
- Fix wrongly rejected unprivileged poll requests to the cgroup psi
pressure files
- Make sure the load balancing is done by only one CPU
* tag 'sched_urgent_for_v6.7_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/fair: Fix the decision for load balance
sched: psi: fix unprivileged polling against cgroups
sched/eevdf: Fix vruntime adjustment on reweight
Linus Torvalds [Sun, 19 Nov 2023 21:30:21 +0000 (13:30 -0800)]
Merge tag 'locking_urgent_for_v6.7_rc2' of git://git./linux/kernel/git/tip/tip
Pull locking fix from Borislav Petkov:
- Fix a hardcoded futex flags case which lead to one robust futex test
failure
* tag 'locking_urgent_for_v6.7_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
futex: Fix hardcoded flags
Linus Torvalds [Sun, 19 Nov 2023 21:26:42 +0000 (13:26 -0800)]
Merge tag 'perf_urgent_for_v6.7_rc2' of git://git./linux/kernel/git/tip/tip
Pull perf fix from Borislav Petkov:
- Make sure the context refcount is transferred too when migrating perf
events
* tag 'perf_urgent_for_v6.7_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/core: Fix cpuctx refcounting
Linus Torvalds [Sat, 18 Nov 2023 23:20:58 +0000 (15:20 -0800)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Seven small fixes, six in drivers and one in sd.
The sd fix is so large because it changes a struct pointer to a struct
but otherwise is fairly simple"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: ufs: qcom-ufs: dt-bindings: Document the SM8650 UFS Controller
scsi: sd: Fix sshdr use in sd_suspend_common()
scsi: scsi_debug: Delete some bogus error checking
scsi: scsi_debug: Fix some bugs in sdebug_error_write()
scsi: ufs: core: Fix racing issue between ufshcd_mcq_abort() and ISR
scsi: ufs: core: Expand MCQ queue slot to DeviceQueueDepth + 1
scsi: qla2xxx: Fix system crash due to bad pointer access
Linus Torvalds [Sat, 18 Nov 2023 23:13:10 +0000 (15:13 -0800)]
Merge tag 'parisc-for-6.7-rc2' of git://git./linux/kernel/git/deller/parisc-linux
Pull parisc fixes from Helge Deller:
"On parisc we still sometimes need writeable stacks, e.g. if programs
aren't compiled with gcc-14. To avoid issues with the upcoming
systemd-254 we therefore have to disable prctl(PR_SET_MDWE) for now
(for parisc only).
The other two patches are minor: a bugfix for the soft power-off on
qemu with 64-bit kernel and prefer strscpy() over strlcpy():
- Fix power soft-off on qemu
- Disable prctl(PR_SET_MDWE) since parisc sometimes still needs
writeable stacks
- Use strscpy instead of strlcpy in show_cpuinfo()"
* tag 'parisc-for-6.7-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
prctl: Disable prctl(PR_SET_MDWE) on parisc
parisc/power: Fix power soft-off when running on qemu
parisc: Replace strlcpy() with strscpy()
Linus Torvalds [Sat, 18 Nov 2023 19:28:28 +0000 (11:28 -0800)]
Merge tag 'xfs-6.7-fixes-1' of git://git./fs/xfs/xfs-linux
Pull xfs fixes from Chandan Babu:
- Fix deadlock arising due to intent items in AIL not being cleared
when log recovery fails
- Fix stale data exposure bug when remapping COW fork extents to data
fork
- Fix deadlock when data device flush fails
- Fix AGFL minimum size calculation
- Select DEBUG_FS instead of XFS_DEBUG when XFS_ONLINE_SCRUB_STATS is
selected
- Fix corruption of log inode's extent count field when NREXT64 feature
is enabled
* tag 'xfs-6.7-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: recovery should not clear di_flushiter unconditionally
xfs: inode recovery does not validate the recovered inode
xfs: fix again select in kconfig XFS_ONLINE_SCRUB_STATS
xfs: fix internal error from AGFL exhaustion
xfs: up(ic_sema) if flushing data device fails
xfs: only remap the written blocks in xfs_reflink_end_cow_extent
XFS: Update MAINTAINERS to catch all XFS documentation
xfs: abort intent items when recovery intents fail
xfs: factor out xfs_defer_pending_abort
Linus Torvalds [Sat, 18 Nov 2023 19:23:32 +0000 (11:23 -0800)]
Merge tag 'nfsd-6.7-1' of git://git./linux/kernel/git/cel/linux
Pull nfsd fixes from Chuck Lever:
- Fix several long-standing bugs in the duplicate reply cache
- Fix a memory leak
* tag 'nfsd-6.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
NFSD: Fix checksum mismatches in the duplicate reply cache
NFSD: Fix "start of NFS reply" pointer passed to nfsd_cache_update()
NFSD: Update nfsd_cache_append() to use xdr_stream
nfsd: fix file memleak on client_opens_release
Linus Torvalds [Sat, 18 Nov 2023 19:18:46 +0000 (11:18 -0800)]
Merge tag '6.7-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:
- multichannel fixes (including a lock ordering fix and an important
refcounting fix)
- spnego fix
* tag '6.7-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: fix lock ordering while disabling multichannel
cifs: fix leak of iface for primary channel
cifs: fix check of rc in function generate_smb3signingkey
cifs: spnego: add ';' in HOST_KEY_LEN
Helge Deller [Sat, 18 Nov 2023 18:33:35 +0000 (19:33 +0100)]
prctl: Disable prctl(PR_SET_MDWE) on parisc
systemd-254 tries to use prctl(PR_SET_MDWE) for it's MemoryDenyWriteExecute
functionality, but fails on parisc which still needs executable stacks in
certain combinations of gcc/glibc/kernel.
Disable prctl(PR_SET_MDWE) by returning -EINVAL for now on parisc, until
userspace has catched up.
Signed-off-by: Helge Deller <deller@gmx.de>
Co-developed-by: Linus Torvalds <torvalds@linux-foundation.org>
Reported-by: Sam James <sam@gentoo.org>
Closes: https://github.com/systemd/systemd/issues/29775
Tested-by: Sam James <sam@gentoo.org>
Link: https://lore.kernel.org/all/875y2jro9a.fsf@gentoo.org/
Cc: <stable@vger.kernel.org> # v6.3+
Linus Torvalds [Sat, 18 Nov 2023 18:02:16 +0000 (10:02 -0800)]
Merge tag 'for-6.7/dm-fixes' of git://git./linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
- Various fixes for the DM delay target to address regressions
introduced during the 6.7 merge window
- Fixes to both DM bufio and the verity target for no-sleep mode,
to address sleeping while atomic issues
- Update DM crypt target in response to the treewide change that
made MAX_ORDER inclusive
* tag 'for-6.7/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm-crypt: start allocating with MAX_ORDER
dm-verity: don't use blocking calls from tasklets
dm-bufio: fix no-sleep mode
dm-delay: avoid duplicate logic
dm-delay: fix bugs introduced by kthread mode
dm-delay: fix a race between delay_presuspend and delay_bio
Helge Deller [Fri, 17 Nov 2023 15:43:52 +0000 (16:43 +0100)]
parisc/power: Fix power soft-off when running on qemu
Firmware returns the physical address of the power switch,
so need to use gsc_writel() instead of direct memory access.
Fixes: d0c219472980 ("parisc/power: Add power soft-off when running on qemu")
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # v6.0+
Kees Cook [Thu, 16 Nov 2023 19:13:40 +0000 (11:13 -0800)]
parisc: Replace strlcpy() with strscpy()
strlcpy() reads the entire source buffer first. This read may exceed
the destination size limit. This is both inefficient and can lead
to linear read overflows if a source string is not NUL-terminated[1].
Additionally, it returns the size of the source string, not the
resulting size of the destination string. In an effort to remove strlcpy()
completely[2], replace strlcpy() here with strscpy().
Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strlcpy
Link: https://github.com/KSPP/linux/issues/89
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Azeem Shaikh <azeemshaikh38@gmail.com>
Cc: linux-parisc@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Helge Deller <deller@gmx.de>
Linus Torvalds [Sat, 18 Nov 2023 17:44:14 +0000 (09:44 -0800)]
Merge tag 'i2c-for-6.7-rc2' of git://git./linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"Revert a not-working conversion to generic recovery for PXA,
use proper IO accessors for designware, and use proper PM level
for ocores to allow accessing interrupt providers late"
* tag 'i2c-for-6.7-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: ocores: Move system PM hooks to the NOIRQ phase
i2c: designware: Fix corrupted memory seen in the ISR
Revert "i2c: pxa: move to generic GPIO recovery"
Linus Torvalds [Sat, 18 Nov 2023 17:09:17 +0000 (09:09 -0800)]
Merge tag 'turbostat-2023.11.07' of git://git./linux/kernel/git/lenb/linux
Pull turbostat updates from Len Brown:
- Turbostat features are now table-driven (Rui Zhang)
- Add support for some new platforms (Sumeet Pawnikar, Rui Zhang)
- Gracefully run in configs when CPUs are limited (Rui Zhang, Srinivas
Pandruvada)
- misc minor fixes
[ This came in during the merge window, but sorting out the signed tag
took a while, so thus the late merge - Linus ]
* tag 'turbostat-2023.11.07' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: (86 commits)
tools/power turbostat: version 2023.11.07
tools/power/turbostat: bugfix "--show IPC"
tools/power/turbostat: Add initial support for LunarLake
tools/power/turbostat: Add initial support for ArrowLake
tools/power/turbostat: Add initial support for GrandRidge
tools/power/turbostat: Add initial support for SierraForest
tools/power/turbostat: Add initial support for GraniteRapids
tools/power/turbostat: Add MSR_CORE_C1_RES support for spr_features
tools/power/turbostat: Move process to root cgroup
tools/power/turbostat: Handle cgroup v2 cpu limitation
tools/power/turbostat: Abstrct function for parsing cpu string
tools/power/turbostat: Handle offlined CPUs in cpu_subset
tools/power/turbostat: Obey allowed CPUs for system summary
tools/power/turbostat: Obey allowed CPUs for primary thread/core detection
tools/power/turbostat: Abstract several functions
tools/power/turbostat: Obey allowed CPUs during startup
tools/power/turbostat: Obey allowed CPUs when accessing CPU counters
tools/power/turbostat: Introduce cpu_allowed_set
tools/power/turbostat: Remove PC7/PC9 support on ADL/RPL
tools/power/turbostat: Enable MSR_CORE_C1_RES on recent Intel client platforms
...
Linus Torvalds [Fri, 17 Nov 2023 22:36:58 +0000 (14:36 -0800)]
Merge tag 'bcachefs-2023-11-17' of https://evilpiepirate.org/git/bcachefs
Pull bcachefs fixes from Kent Overstreet:
"Lots of small fixes for minor nits and compiler warnings.
Bigger items:
- The six locks lost wakeup is finally fixed: six_read_trylock() was
checking for the waiting bit before decrementing the number of
readers - validated the fix with a torture test.
- Fix for a memory reclaim issue: when needing to reallocate a key
cache key, we now do our usual GFP_NOWAIT; unlock(); GFP_KERNEL
dance.
- Multiple deleted inodes btree fixes
- Fix an issue in fsck, where i_nlink would be recalculated
incorrectly for hardlinked files if a snapshot had ever been taken.
- Kill journal pre-reservations: This is a bigger patch than I would
normally send at this point, but it deletes code and it fixes some
of our tests that would sporadically die with the journal getting
stuck, and it's a performance improvement, too"
* tag 'bcachefs-2023-11-17' of https://evilpiepirate.org/git/bcachefs: (22 commits)
bcachefs: Fix missing locking for dentry->d_parent access
bcachefs: six locks: Fix lost wakeup
bcachefs: Fix no_data_io mode checksum check
bcachefs: Fix bch2_check_nlinks() for snapshots
bcachefs: Don't decrease BTREE_ITER_MAX when LOCKDEP=y
bcachefs: Disable debug log statements
bcachefs: Fix missing transaction commit
bcachefs: Fix error path in bch2_mount()
bcachefs: Fix potential sleeping during mount
bcachefs: Fix iterator leak in may_delete_deleted_inode()
bcachefs: Kill journal pre-reservations
bcachefs: Check for nonce offset inconsistency in data_update path
bcachefs: Make sure to drop/retake btree locks before reclaim
bcachefs: btree_trans->write_locked
bcachefs: Run btree key cache shrinker less aggressively
bcachefs: Split out btree_key_cache_types.h
bcachefs: Guard against insufficient devices to create stripes
bcachefs: Fix null ptr deref in bch2_backpointer_get_node()
bcachefs: Fix multiple -Warray-bounds warnings
bcachefs: Use DECLARE_FLEX_ARRAY() helper and fix multiple -Warray-bounds warnings
...
Linus Torvalds [Fri, 17 Nov 2023 22:19:46 +0000 (14:19 -0800)]
Merge tag 'mm-hotfixes-stable-2023-11-17-14-04' of git://git./linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"Thirteen hotfixes. Seven are cc:stable and the remainder pertain to
post-6.6 issues or aren't considered suitable for backporting"
* tag 'mm-hotfixes-stable-2023-11-17-14-04' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
mm: more ptep_get() conversion
parisc: fix mmap_base calculation when stack grows upwards
mm/damon/core.c: avoid unintentional filtering out of schemes
mm: kmem: drop __GFP_NOFAIL when allocating objcg vectors
mm/damon/sysfs-schemes: handle tried region directory allocation failure
mm/damon/sysfs-schemes: handle tried regions sysfs directory allocation failure
mm/damon/sysfs: check error from damon_sysfs_update_target()
mm: fix for negative counter: nr_file_hugepages
selftests/mm: add hugetlb_fault_after_madv to .gitignore
selftests/mm: restore number of hugepages
selftests: mm: fix some build warnings
selftests: mm: skip whole test instead of failure
mm/damon/sysfs: eliminate potential uninitialized variable warning
Linus Torvalds [Fri, 17 Nov 2023 22:08:14 +0000 (14:08 -0800)]
Merge tag 'block-6.7-2023-11-17' of git://git.kernel.dk/linux
Pull block fix from Jens Axboe:
"Just a single fix from Christoph/Ming, fixing a case where integrity
IO could be called without having an appropriate queue reference"
* tag 'block-6.7-2023-11-17' of git://git.kernel.dk/linux:
blk-mq: make sure active queue usage is held for bio_integrity_prep()
Linus Torvalds [Fri, 17 Nov 2023 22:03:18 +0000 (14:03 -0800)]
Merge tag 'io_uring-6.7-2023-11-17' of git://git.kernel.dk/linux
Pull io_uring fix from Jens Axboe:
"Just a single fixup for a change we made in this release, which caused
a regression in sometimes missing fdinfo output if the SQPOLL thread
had the lock held when fdinfo output was retrieved.
This brings us back on par with what we had before, where just the
main uring_lock will prevent that output. We'd love to get rid of that
too, but that is beyond the scope of this release and will have to
wait for 6.8"
* tag 'io_uring-6.7-2023-11-17' of git://git.kernel.dk/linux:
io_uring/fdinfo: remove need for sqpoll lock for thread/pid retrieval
Linus Torvalds [Fri, 17 Nov 2023 21:58:26 +0000 (13:58 -0800)]
Merge tag 'drm-fixes-2023-11-17' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Daniel Vetter:
"This is a 'blast from the bast' fixes pull, because it contains a
bunch of AGP fixes for amdgpu. Otherwise nothing out of the ordinary.
Next week is back to Dave unless he's knocked out by some conference
bug.
- amdgpu: fixes all over, including a set of AGP fixes
- nouvea: GSP + other bugfixes
- ivpu build fix
- lenovo legion go panel orientation quirk"
* tag 'drm-fixes-2023-11-17' of git://anongit.freedesktop.org/drm/drm: (30 commits)
drm/amdgpu/gmc9: disable AGP aperture
drm/amdgpu/gmc10: disable AGP aperture
drm/amdgpu/gmc11: disable AGP aperture
drm/amdgpu: add a module parameter to control the AGP aperture
drm/amdgpu/gmc11: fix logic typo in AGP check
drm/amd/display: Fix encoder disable logic
drm/amd/display: Change the DMCUB mailbox memory location from FB to inbox
drm/amdgpu: add and populate the port num into xgmi topology info
drm/amd/display: Negate IPS allow and commit bits
drm/amd/pm: Don't send unload message for reset
drm/amdgpu: fix ras err_data null pointer issue in amdgpu_ras.c
drm/amd/display: Clear dpcd_sink_ext_caps if not set
drm/amd/display: Enable fast plane updates on DCN3.2 and above
drm/amd/display: fix NULL dereference
drm/amd/display: fix a NULL pointer dereference in amdgpu_dm_i2c_xfer()
drm/amd/display: Add null checks for 8K60 lightup
drm/amd/pm: Fill pcie error counters for gpu v1_4
drm/amd/pm: Update metric table for smu v13_0_6
drm/amdgpu: correct chunk_ptr to a pointer to chunk.
drm/amd/display: Fix DSC not Enabled on Direct MST Sink
...
Chuck Lever [Fri, 10 Nov 2023 16:28:45 +0000 (11:28 -0500)]
NFSD: Fix checksum mismatches in the duplicate reply cache
nfsd_cache_csum() currently assumes that the server's RPC layer has
been advancing rq_arg.head[0].iov_base as it decodes an incoming
request, because that's the way it used to work. On entry, it
expects that buf->head[0].iov_base points to the start of the NFS
header, and excludes the already-decoded RPC header.
These days however, head[0].iov_base now points to the start of the
RPC header during all processing. It no longer points at the NFS
Call header when execution arrives at nfsd_cache_csum().
In a retransmitted RPC the XID and the NFS header are supposed to
be the same as the original message, but the contents of the
retransmitted RPC header can be different. For example, for krb5,
the GSS sequence number will be different between the two. Thus if
the RPC header is always included in the DRC checksum computation,
the checksum of the retransmitted message might not match the
checksum of the original message, even though the NFS part of these
messages is identical.
The result is that, even if a matching XID is found in the DRC,
the checksum mismatch causes the server to execute the
retransmitted RPC transaction again.
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Tested-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Chuck Lever [Fri, 10 Nov 2023 16:28:33 +0000 (11:28 -0500)]
NFSD: Fix "start of NFS reply" pointer passed to nfsd_cache_update()
The "statp + 1" pointer that is passed to nfsd_cache_update() is
supposed to point to the start of the egress NFS Reply header. In
fact, it does point there for AUTH_SYS and RPCSEC_GSS_KRB5 requests.
But both krb5i and krb5p add fields between the RPC header's
accept_stat field and the start of the NFS Reply header. In those
cases, "statp + 1" points at the extra fields instead of the Reply.
The result is that nfsd_cache_update() caches what looks to the
client like garbage.
A connection break can occur for a number of reasons, but the most
common reason when using krb5i/p is a GSS sequence number window
underrun. When an underrun is detected, the server is obliged to
drop the RPC and the connection to force a retransmit with a fresh
GSS sequence number. The client presents the same XID, it hits in
the server's DRC, and the server returns the garbage cache entry.
The "statp + 1" argument has been used since the oldest changeset
in the kernel history repo, so it has been in nfsd_dispatch()
literally since before history began. The problem arose only when
the server-side GSS implementation was added twenty years ago.
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Tested-by: Jeff Layton <jlayton@kernel.org
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Chuck Lever [Fri, 10 Nov 2023 16:28:39 +0000 (11:28 -0500)]
NFSD: Update nfsd_cache_append() to use xdr_stream
When inserting a DRC-cached response into the reply buffer, ensure
that the reply buffer's xdr_stream is updated properly. Otherwise
the server will send a garbage response.
Cc: stable@vger.kernel.org # v6.3+
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Tested-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Mahmoud Adam [Fri, 10 Nov 2023 18:21:04 +0000 (19:21 +0100)]
nfsd: fix file memleak on client_opens_release
seq_release should be called to free the allocated seq_file
Cc: stable@vger.kernel.org # v5.3+
Signed-off-by: Mahmoud Adam <mngyadam@amazon.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Fixes: 78599c42ae3c ("nfsd4: add file to display list of client's opens")
Reviewed-by: NeilBrown <neilb@suse.de>
Tested-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Mikulas Patocka [Fri, 17 Nov 2023 17:38:33 +0000 (18:38 +0100)]
dm-crypt: start allocating with MAX_ORDER
Commit
23baf831a32c ("mm, treewide: redefine MAX_ORDER sanely")
changed the meaning of MAX_ORDER from exclusive to inclusive. So, we
can allocate compound pages with up to 1 << MAX_ORDER pages.
Reflect this change in dm-crypt and start trying to allocate compound
pages with MAX_ORDER.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Mikulas Patocka [Fri, 17 Nov 2023 17:37:25 +0000 (18:37 +0100)]
dm-verity: don't use blocking calls from tasklets
The commit
5721d4e5a9cd enhanced dm-verity, so that it can verify blocks
from tasklets rather than from workqueues. This reportedly improves
performance significantly.
However, dm-verity was using the flag CRYPTO_TFM_REQ_MAY_SLEEP from
tasklets which resulted in warnings about sleeping function being called
from non-sleeping context.
BUG: sleeping function called from invalid context at crypto/internal.h:206
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 14, name: ksoftirqd/0
preempt_count: 100, expected: 0
RCU nest depth: 0, expected: 0
CPU: 0 PID: 14 Comm: ksoftirqd/0 Tainted: G W 6.7.0-rc1 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x32/0x50
__might_resched+0x110/0x160
crypto_hash_walk_done+0x54/0xb0
shash_ahash_update+0x51/0x60
verity_hash_update.isra.0+0x4a/0x130 [dm_verity]
verity_verify_io+0x165/0x550 [dm_verity]
? free_unref_page+0xdf/0x170
? psi_group_change+0x113/0x390
verity_tasklet+0xd/0x70 [dm_verity]
tasklet_action_common.isra.0+0xb3/0xc0
__do_softirq+0xaf/0x1ec
? smpboot_thread_fn+0x1d/0x200
? sort_range+0x20/0x20
run_ksoftirqd+0x15/0x30
smpboot_thread_fn+0xed/0x200
kthread+0xdc/0x110
? kthread_complete_and_exit+0x20/0x20
ret_from_fork+0x28/0x40
? kthread_complete_and_exit+0x20/0x20
ret_from_fork_asm+0x11/0x20
</TASK>
This commit fixes dm-verity so that it doesn't use the flags
CRYPTO_TFM_REQ_MAY_SLEEP and CRYPTO_TFM_REQ_MAY_BACKLOG from tasklets. The
crypto API would do GFP_ATOMIC allocation instead, it could return -ENOMEM
and we catch -ENOMEM in verity_tasklet and requeue the request to the
workqueue.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org # v6.0+
Fixes: 5721d4e5a9cd ("dm verity: Add optional "try_verify_in_tasklet" feature")
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Mikulas Patocka [Fri, 17 Nov 2023 17:36:34 +0000 (18:36 +0100)]
dm-bufio: fix no-sleep mode
dm-bufio has a no-sleep mode. When activated (with the
DM_BUFIO_CLIENT_NO_SLEEP flag), the bufio client is read-only and we
could call dm_bufio_get from tasklets. This is used by dm-verity.
Unfortunately, commit
450e8dee51aa ("dm bufio: improve concurrent IO
performance") broke this and the kernel would warn that cache_get()
was calling down_read() from no-sleeping context. The bug can be
reproduced by using "veritysetup open" with the "--use-tasklets"
flag.
This commit fixes dm-bufio, so that the tasklet mode works again, by
expanding use of the 'no_sleep_enabled' static_key to conditionally
use either a rw_semaphore or rwlock_t (which are colocated in the
buffer_tree structure using a union).
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org # v6.4
Fixes: 450e8dee51aa ("dm bufio: improve concurrent IO performance")
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Mikulas Patocka [Fri, 17 Nov 2023 17:24:04 +0000 (18:24 +0100)]
dm-delay: avoid duplicate logic
This is small refactoring of dm-delay - we avoid duplicate logic in
flush_delayed_bios and flush_delayed_bios_fast and join these two
functions into one.
We also add cond_resched() to flush_delayed_bios because the list may have
unbounded number of entries.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Mikulas Patocka [Fri, 17 Nov 2023 17:22:47 +0000 (18:22 +0100)]
dm-delay: fix bugs introduced by kthread mode
This commit fixes the following bugs introduced by commit
70bbeb29fab0
("dm delay: for short delays, use kthread instead of timers and wq"):
* the function flush_worker_fn has no exit path - on unload, this
function will just loop and consume 100% CPU without any progress
* the wake-up mechanism in flush_worker_fn is racy - a wake up will be
missed if the process adds entries to the delayed_bios list just
before set_current_state(TASK_INTERRUPTIBLE)
* flush_delayed_bios_fast submits a bio while holding a global mutex;
this may deadlock if we have multiple stacked dm-delay devices and
the underlying device attempts to acquire the mutex too
* if the target constructor fails, it will call delay_dtr. delay_dtr
would attempt to free dc->timer_lock without it being initialized by
the constructor.
* if the target constructor's kthread allocation fails, delay_dtr
would crash trying to dereference dc->worker because it is non-NULL
due to ERR_PTR.
Fixes: 70bbeb29fab0 ("dm delay: for short delays, use kthread instead of timers and wq")
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Mikulas Patocka [Fri, 17 Nov 2023 17:21:14 +0000 (18:21 +0100)]
dm-delay: fix a race between delay_presuspend and delay_bio
In delay_presuspend, we set the atomic variable may_delay and then stop
the timer and flush pending bios. The intention here is to prevent the
delay target from re-arming the timer again.
However, this test is racy. Suppose that one thread goes to delay_bio,
sees that dc->may_delay is one and proceeds; now, another thread executes
delay_presuspend, it sets dc->may_delay to zero, deletes the timer and
flushes pending bios. Then, the first thread continues and adds the bio to
delayed->list despite the fact that dc->may_delay is false.
Fix this bug by changing may_delay's type from atomic_t to bool and
only access it while holding the delayed_bios_lock mutex. Note that we
don't have to grab the mutex in delay_resume because there are no bios
in flight at this point.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Linus Torvalds [Fri, 17 Nov 2023 14:18:48 +0000 (09:18 -0500)]
Merge tag 'ovl-fixes-6.7-rc2' of git://git./linux/kernel/git/overlayfs/vfs
Pull overlayfs fixes from Amir Goldstein:
"A fix to an overlayfs param parsing bug and a misformatted comment"
* tag 'ovl-fixes-6.7-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs:
ovl: fix memory leak in ovl_parse_param()
ovl: fix misformatted comment
Linus Torvalds [Fri, 17 Nov 2023 14:05:31 +0000 (09:05 -0500)]
Merge tag 'sound-6.7-rc2' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A collection of small fixes: including a regression fix in RC1 wrt
HD-audio / i915 component binding, while the rest are HD-audio
device-speific fixes / quirks"
* tag 'sound-6.7-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda/realtek: Add quirks for HP Laptops
ALSA: hda/realtek: Add quirks for ASUS 2024 Zenbooks
ALSA: hda: i915: Alays handle -EPROBE_DEFER
ALSA: hda/realtek: Enable Mute LED on HP 255 G10
ALSA: hda: cs35l56: Enable low-power hibernation mode on i2c
ALSA: hda/realtek - Enable internal speaker of ASUS K6500ZC
ALSA: hda/realtek: Enable Mute LED on HP 255 G8
ALSA: hda/realtek - Add Dell ALC295 to pin fall back table
Linus Torvalds [Fri, 17 Nov 2023 13:42:05 +0000 (08:42 -0500)]
Merge tag 'audit-pr-
20231116' of git://git./linux/kernel/git/pcmoore/audit
Pull audit fix from Paul Moore:
"One small audit patch to convert a WARN_ON_ONCE() into a normal
conditional to avoid scary looking console warnings when eBPF code
generates audit records from unexpected places"
* tag 'audit-pr-
20231116' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
audit: don't WARN_ON_ONCE(!current->mm) in audit_exe_compare()
Daniel Vetter [Fri, 17 Nov 2023 10:46:06 +0000 (11:46 +0100)]
Merge tag 'amd-drm-fixes-6.7-2023-11-17' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-6.7-2023-11-17:
amdgpu:
- DMCUB fixes
- SR-IOV fix
- GMC9 fix
- Documentation fix
- DSC MST fix
- CS chunk parsing fix
- SMU13.0.6 fixes
- 8K tiled display fix
- Fix potential NULL pointer dereferences
- Cursor lag fix
- Backlight fix
- DCN s0ix fix
- XGMI fix
- DCN encoder disable logic fix
- AGP aperture fixes
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231117063441.4883-1-alexander.deucher@amd.com
Daniel Vetter [Fri, 17 Nov 2023 10:04:52 +0000 (11:04 +0100)]
Merge tag 'drm-misc-fixes-2023-11-16' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
Assorted fixes for v6.7-rc2:
- Nouveau GSP fixes.
- Fix nouveau driver load without display.
- Use rwlock for nouveau's event lock to break a lockdep splat.
- Add orientation quirk for Lenovo Legion Go.
- Fix build failure in IVPU.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/98fc82d3-8714-45e7-bd12-c95ba8c6c35f@linux.intel.com
Alex Deucher [Thu, 9 Nov 2023 20:40:00 +0000 (15:40 -0500)]
drm/amdgpu/gmc9: disable AGP aperture
We've had misc reports of random IOMMU page faults when
this is used. It's just a rarely used optimization anyway, so
let's just disable it. It can still be toggled via the
module parameter for testing.
v2: leave it configurable via module parameter
Reviewed-by: Yang Wang <kevinyang.wang@amd.com> (v1)
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Tested-by: Mario Limonciello <mario.limonciello@amd.com> # PHX & Navi33
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Thu, 9 Nov 2023 20:38:54 +0000 (15:38 -0500)]
drm/amdgpu/gmc10: disable AGP aperture
We've had misc reports of random IOMMU page faults when
this is used. It's just a rarely used optimization anyway, so
let's just disable it. It can still be toggled via the
module parameter for testing.
v2: leave it configurable via module parameter
Reviewed-by: Yang Wang <kevinyang.wang@amd.com> (v1)
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Tested-by: Mario Limonciello <mario.limonciello@amd.com> # PHX & Navi33
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Thu, 9 Nov 2023 20:34:19 +0000 (15:34 -0500)]
drm/amdgpu/gmc11: disable AGP aperture
We've had misc reports of random IOMMU page faults when
this is used. It's just a rarely used optimization anyway, so
let's just disable it. It can still be toggled via the
module parameter for testing.
v2: leave it configurable via module parameter
Fixes: 67318cb84341 ("drm/amdgpu/gmc11: set gart placement GC11")
Reviewed-by: Yang Wang <kevinyang.wang@amd.com> (v1)
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Tested-by: Mario Limonciello <mario.limonciello@amd.com> # PHX & Navi33
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Thu, 9 Nov 2023 20:31:00 +0000 (15:31 -0500)]
drm/amdgpu: add a module parameter to control the AGP aperture
Add a module parameter to control the AGP aperture. The AGP
aperture is an aperture in the GPU's internal address space
which provides direct non-paged access to the platform address
space. This access is non-snooped so only uncached memory
can be accessed.
Add a knob so that we can toggle this for debugging.
Fixes: 67318cb84341 ("drm/amdgpu/gmc11: set gart placement GC11")
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Tested-by: Mario Limonciello <mario.limonciello@amd.com> # PHX & Navi33
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 14 Nov 2023 16:36:56 +0000 (11:36 -0500)]
drm/amdgpu/gmc11: fix logic typo in AGP check
Should be && rather than ||.
Fixes: b2e1cbe6281f ("drm/amdgpu/gmc11: disable AGP on GC 11.5")
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Tested-by: Mario Limonciello <mario.limonciello@amd.com> # PHX & Navi33
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Nicholas Susanto [Wed, 1 Nov 2023 19:30:10 +0000 (15:30 -0400)]
drm/amd/display: Fix encoder disable logic
[WHY]
DENTIST hangs when OTG is off and encoder is on. We were not
disabling the encoder properly when switching from extended mode to
external monitor only.
[HOW]
Disable the encoder using an existing enable/disable fifo helper instead
of enc35_stream_encoder_enable.
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Acked-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Nicholas Susanto <nicholas.susanto@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Lewis Huang [Thu, 19 Oct 2023 09:22:21 +0000 (17:22 +0800)]
drm/amd/display: Change the DMCUB mailbox memory location from FB to inbox
[WHY]
Flush command sent to DMCUB spends more time for execution on
a dGPU than on an APU. This causes cursor lag when using high
refresh rate mouses.
[HOW]
1. Change the DMCUB mailbox memory location from FB to inbox.
2. Only change windows memory to inbox.
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Acked-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Lewis Huang <lewis.huang@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Shiwu Zhang [Tue, 31 Oct 2023 03:02:49 +0000 (11:02 +0800)]
drm/amdgpu: add and populate the port num into xgmi topology info
The port num info is firstly introduced with 20.00.01.13 xgmi ta and
make them as part of topology info.
Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Duncan Ma [Wed, 25 Oct 2023 23:07:21 +0000 (19:07 -0400)]
drm/amd/display: Negate IPS allow and commit bits
[WHY]
On s0i3, IPS mask isn't saved and restored.
It is reset to zero on exit.
If it is cleared unexpectedly, driver will
proceed operations while DCN is in IPS2 and
cause a hang.
[HOW]
Negate the bit logic. Default value of
zero indicates it is still in IPS2. Driver
must poll for the bit to assert.
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Acked-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Duncan Ma <duncan.ma@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Lijo Lazar [Fri, 10 Nov 2023 07:45:39 +0000 (13:15 +0530)]
drm/amd/pm: Don't send unload message for reset
No need to notify about unload during reset. Also remove the FW version
check.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Yang Wang [Mon, 13 Nov 2023 06:34:55 +0000 (14:34 +0800)]
drm/amdgpu: fix ras err_data null pointer issue in amdgpu_ras.c
fix ras err_data null pointer issue in amdgpu_ras.c
Fixes: 8cc0f5669eb6 ("drm/amdgpu: Support multiple error query modes")
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Paul Hsieh [Wed, 25 Oct 2023 02:53:35 +0000 (10:53 +0800)]
drm/amd/display: Clear dpcd_sink_ext_caps if not set
[WHY]
Some eDP panels' ext caps don't set initial values
and the value of dpcd_addr (0x317) is random.
It means that sometimes the eDP can be OLED, miniLED and etc,
and cause incorrect backlight control interface.
[HOW]
Add remove_sink_ext_caps to remove sink ext caps (HDR, OLED and etc)
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Anthony Koo <anthony.koo@amd.com>
Acked-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Paul Hsieh <paul.hsieh@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Tianci Yin [Wed, 1 Nov 2023 01:47:13 +0000 (09:47 +0800)]
drm/amd/display: Enable fast plane updates on DCN3.2 and above
[WHY]
When cursor moves across screen boarder, lag cursor observed,
since subvp settings need to sync up with vblank that causes
cursor updates being delayed.
[HOW]
Enable fast plane updates on DCN3.2 to fix it.
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Acked-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Tianci Yin <tianci.yin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
José Pekkarinen [Tue, 14 Nov 2023 15:27:51 +0000 (17:27 +0200)]
drm/amd/display: fix NULL dereference
The following patch will fix a minor issue where a debug message is
referencing an struct that has just being checked whether is null or
not. This has been noticed by using coccinelle, in the following output:
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c:540:25-29: ERROR: aconnector is NULL but dereferenced.
Fixes: 5d72e247e58c ("drm/amd/display: switch DC over to the new DRM logging macros")
Signed-off-by: José Pekkarinen <jose.pekkarinen@foxhound.fi>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Mario Limonciello [Wed, 8 Nov 2023 19:31:57 +0000 (13:31 -0600)]
drm/amd/display: fix a NULL pointer dereference in amdgpu_dm_i2c_xfer()
When ddc_service_construct() is called, it explicitly checks both the
link type and whether there is something on the link which will
dictate whether the pin is marked as hw_supported.
If the pin isn't set or the link is not set (such as from
unloading/reloading amdgpu in an IGT test) then fail the
amdgpu_dm_i2c_xfer() call.
Cc: stable@vger.kernel.org
Fixes: 22676bc500c2 ("drm/amd/display: Fix dmub soft hang for PSR 1")
Link: https://github.com/fwupd/fwupd/issues/6327
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Muhammad Ahmed [Tue, 31 Oct 2023 20:03:21 +0000 (16:03 -0400)]
drm/amd/display: Add null checks for 8K60 lightup
[WHY & HOW]
Add some null checks to fix an issue where 8k60
tiled display fails to light up.
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Acked-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Muhammad Ahmed <ahmed.ahmed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Asad Kamal [Tue, 14 Nov 2023 08:17:17 +0000 (16:17 +0800)]
drm/amd/pm: Fill pcie error counters for gpu v1_4
Fill PCIE error counters & instantaneous bandwidth
in gpu metrics v1_4 for smu v_13_0_6
Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Asad Kamal [Mon, 30 Oct 2023 19:14:02 +0000 (03:14 +0800)]
drm/amd/pm: Update metric table for smu v13_0_6
Update pmfw metric table to include pcie
instantaneous bandwidth & pcie error counters
Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
YuanShang [Tue, 31 Oct 2023 02:32:37 +0000 (10:32 +0800)]
drm/amdgpu: correct chunk_ptr to a pointer to chunk.
The variable "chunk_ptr" should be a pointer pointing
to a struct drm_amdgpu_cs_chunk instead of to a pointer
of that.
Signed-off-by: YuanShang <YuanShang.Mao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fangzhi Zuo [Mon, 23 Oct 2023 17:57:32 +0000 (13:57 -0400)]
drm/amd/display: Fix DSC not Enabled on Direct MST Sink
[WHY & HOW]
For the scenario when a dsc capable MST sink device is directly
connected, it needs to use max dsc compression as the link bw constraint.
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Roman Li <roman.li@amd.com>
Acked-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Srinivasan Shanmugam [Sun, 12 Nov 2023 04:21:19 +0000 (09:51 +0530)]
drm/amdgpu: Address member 'ring' not described in 'amdgpu_ vce, uvd_entity_init()'
Fixes the following:
drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c:237: warning: Function parameter or member 'ring' not described in 'amdgpu_vce_entity_init'
drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c:405: warning: Function parameter or member 'ring' not described in 'amdgpu_uvd_entity_init'
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Le Ma [Mon, 13 Nov 2023 10:05:34 +0000 (18:05 +0800)]
drm/amdgpu: finalizing mem_partitions at the end of GMC v9 sw_fini
The valid num_mem_partitions is required during ttm pool fini,
thus move the cleanup at the end of the function.
Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Victor Lu [Wed, 4 Oct 2023 18:24:15 +0000 (14:24 -0400)]
drm/amdgpu: Do not program VF copy regs in mmhub v1.8 under SRIOV (v2)
MC_VM_AGP_* registers should not be programmed by guest driver.
v2: move early return outside of loop
Signed-off-by: Victor Lu <victorchengchi.lu@amd.com>
Reviewed-by: Samir Dhume <samir.dhume@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Nicholas Kazlauskas [Wed, 13 Sep 2023 20:18:44 +0000 (16:18 -0400)]
drm/amd/display: Guard against invalid RPTR/WPTR being set
[WHY]
HW can return invalid values on register read, guard against these being
set and causing us to access memory out of range and page fault.
[HOW]
Guard at sync_inbox1 and guard at pushing commands.
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Hansen Dsouza <hansen.dsouza@amd.com>
Acked-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Masahiro Yamada [Wed, 15 Nov 2023 04:16:53 +0000 (13:16 +0900)]
kconfig: fix memory leak from range properties
Currently, sym_validate_range() duplicates the range string using
xstrdup(), which is overwritten by a subsequent sym_calc_value() call.
It results in a memory leak.
Instead, only the pointer should be copied.
Below is a test case, with a summary from Valgrind.
[Test Kconfig]
config FOO
int "foo"
range 10 20
[Test .config]
CONFIG_FOO=0
[Before]
LEAK SUMMARY:
definitely lost: 3 bytes in 1 blocks
indirectly lost: 0 bytes in 0 blocks
possibly lost: 0 bytes in 0 blocks
still reachable: 17,465 bytes in 21 blocks
suppressed: 0 bytes in 0 blocks
[After]
LEAK SUMMARY:
definitely lost: 0 bytes in 0 blocks
indirectly lost: 0 bytes in 0 blocks
possibly lost: 0 bytes in 0 blocks
still reachable: 17,462 bytes in 20 blocks
suppressed: 0 bytes in 0 blocks
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Simon Glass [Sat, 11 Nov 2023 00:28:01 +0000 (17:28 -0700)]
kbuild: Move the single quotes for image name
Add quotes where UIMAGE_NAME is used, rather than where it is defined.
This allows the UIMAGE_NAME variable to be set by the user.
Signed-off-by: Simon Glass <sjg@chromium.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Lukas Bulwahn [Fri, 10 Nov 2023 12:07:22 +0000 (13:07 +0100)]
linux/export: clean up the IA-64 KSYM_FUNC macro
With commit
cf8e8658100d ("arch: Remove Itanium (IA-64) architecture"),
there is no need to keep the IA-64 definition of the KSYM_FUNC macro.
Clean up the IA-64 definition of the KSYM_FUNC macro.
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Kent Overstreet [Wed, 15 Nov 2023 00:11:04 +0000 (19:11 -0500)]
bcachefs: Fix missing locking for dentry->d_parent access
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Linus Torvalds [Thu, 16 Nov 2023 12:51:26 +0000 (07:51 -0500)]
Merge tag 'net-6.7-rc2' of git://git./linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from BPF and netfilter.
Current release - regressions:
- core: fix undefined behavior in netdev name allocation
- bpf: do not allocate percpu memory at init stage
- netfilter: nf_tables: split async and sync catchall in two
functions
- mptcp: fix possible NULL pointer dereference on close
Current release - new code bugs:
- eth: ice: dpll: fix initial lock status of dpll
Previous releases - regressions:
- bpf: fix precision backtracking instruction iteration
- af_unix: fix use-after-free in unix_stream_read_actor()
- tipc: fix kernel-infoleak due to uninitialized TLV value
- eth: bonding: stop the device in bond_setup_by_slave()
- eth: mlx5:
- fix double free of encap_header
- avoid referencing skb after free-ing in drop path
- eth: hns3: fix VF reset
- eth: mvneta: fix calls to page_pool_get_stats
Previous releases - always broken:
- core: set SOCK_RCU_FREE before inserting socket into hashtable
- bpf: fix control-flow graph checking in privileged mode
- eth: ppp: limit MRU to 64K
- eth: stmmac: avoid rx queue overrun
- eth: icssg-prueth: fix error cleanup on failing initialization
- eth: hns3: fix out-of-bounds access may occur when coalesce info is
read via debugfs
- eth: cortina: handle large frames
Misc:
- selftests: gso: support CONFIG_MAX_SKB_FRAGS up to 45"
* tag 'net-6.7-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (78 commits)
macvlan: Don't propagate promisc change to lower dev in passthru
net: sched: do not offload flows with a helper in act_ct
net/mlx5e: Check return value of snprintf writing to fw_version buffer for representors
net/mlx5e: Check return value of snprintf writing to fw_version buffer
net/mlx5e: Reduce the size of icosq_str
net/mlx5: Increase size of irq name buffer
net/mlx5e: Update doorbell for port timestamping CQ before the software counter
net/mlx5e: Track xmit submission to PTP WQ after populating metadata map
net/mlx5e: Avoid referencing skb after free-ing in drop path of mlx5e_sq_xmit_wqe
net/mlx5e: Don't modify the peer sent-to-vport rules for IPSec offload
net/mlx5e: Fix pedit endianness
net/mlx5e: fix double free of encap_header in update funcs
net/mlx5e: fix double free of encap_header
net/mlx5: Decouple PHC .adjtime and .adjphase implementations
net/mlx5: DR, Allow old devices to use multi destination FTE
net/mlx5: Free used cpus mask when an IRQ is released
Revert "net/mlx5: DR, Supporting inline WQE when possible"
bpf: Do not allocate percpu memory at init stage
net: Fix undefined behavior in netdev name allocation
dt-bindings: net: ethernet-controller: Fix formatting error
...