Dave Hansen [Wed, 9 Aug 2023 15:16:46 +0000 (08:16 -0700)]
x86/apic: Nuke ack_APIC_irq()
Yet another wrapper of a wrapper gone along with the outdated comment
that this compiles to a single instruction.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Wei Liu <wei.liu@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:04:15 +0000 (15:04 -0700)]
x86/apic: Remove pointless arguments from [native_]eoi_write()
Every callsite hands in the same constants which is a pointless exercise
and cannot be optimized by the compiler due to the indirect calls.
Use the constants in the eoi() callbacks and remove the arguments.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Wei Liu <wei.liu@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:04:14 +0000 (15:04 -0700)]
x86/apic/noop: Tidy up the code
First of all apic_noop can't be probed because it's not registered. So
there is no point for implementing a probe callback. The machine is
rightfully to die when that is invoked.
Remove the gunk and tidy up the other space consuming dummy callbacks.
This gunk should simply die. Nothing should ever invoke APIC callbacks once
this is installed, But that's a differrent story for another round of
cleanups. The comment on top of this file which was intentionally left in
place tells exactly why this is needed: voodoo programming.
In fact the kernel of today should just outright refuse to boot on a system
with no (functional) local APIC. That would spare tons of #ifdeffery and
other nonsense.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:04:13 +0000 (15:04 -0700)]
x86/apic: Remove pointless NULL initializations
Wasted space for no value.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:04:12 +0000 (15:04 -0700)]
x86/apic: Sanitize APIC ID range validation
Now that everything has apic::max_apic_id set and the eventual update for
the x2APIC case is in place, switch the apic_id_valid() helper to use
apic::max_apic_id and remove the apic::apic_id_valid() callback.
[ dhansen: Fix subject typo ]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:04:11 +0000 (15:04 -0700)]
x86/apic: Prepare x2APIC for using apic::max_apic_id
In order to remove the apic::apic_id_valid() callback and switch to
checking apic::max_apic_id, it is required to update apic::max_apic_id when
the APIC initialization code overrides it via x2apic_set_max_apicid().
Make the existing booleans a bitfield and add a flag which lets the update
function and the core code which switches the driver detect whether the
apic instance wants to have that update or not and apply it if required.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:04:10 +0000 (15:04 -0700)]
x86/apic: Simplify X2APIC ID validation
Currently, x2apic_max_apicid==0 means that there is no max APIC id limit.
But, this means that 0 needs to be special-cased.
Designate UINT_MAX to mean unlimited so that a plain old less than or equal
compare works and there is no special-casing. Replace the 0 initialization
with UINT_MAX.
[ dhansen: muck with changelog ]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:04:10 +0000 (15:04 -0700)]
x86/apic: Add max_apic_id member
There is really no point to have a callback which compares numbers.
Add a field which allows each APIC to store the maximum APIC ID supported
and fill it in for all APIC incarnations.
The next step will remove the callback.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:04:09 +0000 (15:04 -0700)]
x86/apic: Wrap APIC ID validation into an inline
Prepare for removing the callback and making this as simple comparison to
an upper limit, which is the obvious solution to do for limit checks...
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:04:08 +0000 (15:04 -0700)]
x86/apic/64: Uncopypaste probing
No need for the same thing twice. Also prepares for simplifying the APIC ID
validation checks.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:04:07 +0000 (15:04 -0700)]
x86/apic/x2apic: Share all common IPI functions
Yet more copy and pasta gone.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:04:07 +0000 (15:04 -0700)]
x86/apic/uv: Get rid of wrapper callbacks
Why on earth makes a wrapper around some common function sense? Just to be
able to slap some vendor name on it...
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:04:06 +0000 (15:04 -0700)]
x86/apic: Move safe wait_icr_idle() next to apic_mem_wait_icr_idle()
Move it next to apic_mem_wait_icr_idle(), rename it so that it's clear what
it does and rewrite it in readable form.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:04:05 +0000 (15:04 -0700)]
x86/apic: Allow apic::safe_wait_icr_idle() to be NULL
Remove tons of NOOP callbacks by making the invocation of
safe_wait_icr_idle() conditional in the inline wrapper.
Will be replaced by a static_call_cond() later.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:04:04 +0000 (15:04 -0700)]
x86/apic: Allow apic::wait_icr_idle() to be NULL
Nuke more NOOP callbacks and make the invocation conditional. Will be
replaced with a static call later.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:04:04 +0000 (15:04 -0700)]
x86/apic: Consolidate wait_icr_idle() implementations
Two copies and also needlessly public. Move it into ipi.c so it can be
inlined. Rename it to apic_mem_wait_icr_idle().
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:04:03 +0000 (15:04 -0700)]
x86/apic/ipi: Tidy up the code and fixup comments
Replace the undecodable comment on top of the function, replace the space
consuming zero content comments with useful ones and tidy up the
implementation to prevent further eye bleed.
Make __default_send_IPI_shortcut() static as it has no other users.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:04:02 +0000 (15:04 -0700)]
x86/apic: Mop up apic::apic_id_registered()
Really not a hotpath and again no reason for having a gazillion of empty
callbacks returning 1. Make it return bool and provide one shared
implementation for the remaining users.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:04:01 +0000 (15:04 -0700)]
x86/apic: Mop up *setup_apic_routing()
default_setup_apic_routing() is a complete misnomer. On 64bit it does the
actual APIC probing and on 32bit it is used to force select the bigsmp APIC
and to emit a redundant message in the apic::setup_apic_routing() callback.
Rename the 64bit and 32bit function so they reflect what they are doing and
remove the useless APIC callback.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:03:59 +0000 (15:03 -0700)]
x86/ioapic/32: Decrapify phys_id_present_map operation
The operation to set the IOAPIC ID in phys_id_present_map is as convoluted
as it can be.
1) Allocate a bitmap of 32byte size on the stack
2) Zero the bitmap and set the IOAPIC ID bit
3) Or the temporary bitmap over phys_id_present_map
The same functionality can be achieved by setting the IOAPIC ID bit
directly in the phys_id_present_map.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:03:59 +0000 (15:03 -0700)]
x86/apic: Nuke apic::apicid_to_cpu_present()
This is only used on 32bit and is a wrapper around
physid_set_mask_of_physid() in all 32bit APIC drivers.
Remove the callback and use physid_set_mask_of_physid() in the code
directly,
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:03:58 +0000 (15:03 -0700)]
x86/apic: Nuke empty init_apic_ldr() callbacks
apic::init_apic_ldr() is only invoked when the APIC is initialized. So
there is really no point in having:
- Default empty callbacks all over the place
- Two implementations of the actual LDR init function where one is
just unreadable gunk but does exactly the same as the other.
Make the apic::init_apic_ldr() invocation conditional, remove the empty
callbacks and consolidate the two implementation into one.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:03:56 +0000 (15:03 -0700)]
x86/apic/32: Remove bigsmp_cpu_present_to_apicid()
It's a copy of default_cpu_present_to_apicid() with the omission of the
actual check whether the CPU is present.
This APIC callback should die completely, but the XEN APIC implementation
does something different which needs to be addressed first.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:03:56 +0000 (15:03 -0700)]
x86/apic/32: Decrapify the def_bigsmp mechanism
If the system has more than 8 CPUs then XAPIC and the bigsmp APIC driver is
required. This is ensured via:
1) Enumerating all possible CPUs up to NR_CPUS
2) Checking at boot CPU APIC setup time whether the system has more than
8 CPUs and has an XAPIC.
If that's the case then it's attempted to install the bigsmp APIC
driver and a magic variable 'def_to_bigsmp' is set to one.
3) If that magic variable is set and CONFIG_X86_BIGSMP=n and the system
has more than 8 CPUs smp_sanity_check() removes all CPUs >= #8 from
the present and possible mask in the most convoluted way.
This logic is completely broken for the case where the bigsmp driver is
enabled, but not selected due to a command line option specifying the
default APIC. In that case the system boots with default APIC in logical
destination mode and fails to reduce the number of CPUs.
That aside the above which is sprinkled over 3 different places is yet
another piece of art.
It would have been too obvious to check the requirements upfront and limit
nr_cpu_ids _before_ enumerating tons of CPUs and then removing them again.
Implement exactly this. Check the bigsmp requirement when the boot APIC is
registered which happens _before_ ACPI/MPTABLE parsing and limit the number
of CPUs to 8 if it can't be used. Switch it over when the boot CPU apic is
set up if necessary.
[ dhansen: fix nr_cpu_ids off-by-one in default_setup_apic_routing() ]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:03:55 +0000 (15:03 -0700)]
x86/apic/32: Remove pointless default_acpi_madt_oem_check()
On 32bit there is no APIC implementing the acpi_madt_oem_check() except XEN
PV, but that does not matter at all.
generic_apic_probe() runs before ACPI tables are parsed. This selects the
XEN APIC if there is no command line override because the XEN APIC driver
is the first to be probed.
If there is a command line override then the XEN PV driver won't be
selected in the MADT OEM check either.
As there is no other MADT check implemented for 32bit APICs, this whole
excercise is a NOOP and can be removed.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:03:54 +0000 (15:03 -0700)]
x86/apic: Mop up early_per_cpu() abuse
UV X2APIC uses the per CPU variable from:
native_smp_prepare_cpus()
uv_system_init()
uv_system_init_hub()
which is long after the per CPU areas have been set up.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:03:54 +0000 (15:03 -0700)]
x86/apic/ipi: Code cleanup
Remove completely useless and mindlessly copied comments and tidy up the
code which causes eye bleed when looking at it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:03:53 +0000 (15:03 -0700)]
x86/apic/32: Remove x86_cpu_to_logical_apicid
This per CPU variable is just yet another form of voodoo programming. The
boot ordering is:
per_cpu(x86_cpu_to_logical_apicid, cpu) = 1U << cpu;
.....
setup_apic()
apic->init_apic_ldr()
default_init_apic_ldr()
apic_write(SET_APIC_LOGICAL_ID(1UL << smp_processor_id(), APIC_LDR);
id = GET_APIC_LOGICAL_ID(apic_read(APIC_LDR);
WARN_ON(id != per_cpu(x86_cpu_to_logical_apicid, cpu));
per_cpu(x86_cpu_to_logical_apicid, cpu) = id;
So first write the default into LDR and then validate it against the same default
which was set up during early boot APIC enumeration.
Brilliant, isn't it?
The comment above the per CPU variable declaration describes it well:
'Let's keep it ugly for now.'
Remove the useless gunk and use '1U << cpu' consistently all over the place.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:03:53 +0000 (15:03 -0700)]
x86/apic/32: Sanitize logical APIC ID handling
apic::x86_32_early_logical_apicid() is yet another historical joke.
It is used to preset the x86_cpu_to_logical_apicid per CPU variable during
APIC enumeration with:
- 1 shifted left by the CPU number
- the physical APIC ID in case of bigsmp
The latter is hillarious because bigsmp uses physical destination mode
which never can use the logical APIC ID.
It gets even worse. As bigsmp can be enforced late in the boot process the
probe function overwrites the per CPU variable which is never used for this
APIC type once again.
Remove that gunk and store 1 << cpunr unconditionally if and only if the
CPU number is less than 8, because the default logical destination mode
only allows up to 8 CPUs.
This is just an intermediate step before removing the per CPU insanity
completely. Stay tuned.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:03:52 +0000 (15:03 -0700)]
x86/apic: Get rid of apic_phys
No need for an extra variable to find out whether the APIC has been mapped
or is accessible (X2APIC mode).
Provide an inline for this and check apic_mmio_base which is only set when
the local APIC has been mapped.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:03:52 +0000 (15:03 -0700)]
x86/apic: Remove check_phys_apicid_present()
The only silly usage site is gone. Remove the gunk which was even outright
wrong in the bigsmp_32 case which returned true unconditionally.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:03:50 +0000 (15:03 -0700)]
x86/apic: Nuke another processor check
The boot CPUs local APIC is now always registered, so there is no point to
have another unreadable validatation for it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:03:50 +0000 (15:03 -0700)]
x86/apic: Sanitize num_processors handling
num_processors is 0 by default and only gets incremented when local APICs
are registered.
Make init_apic_mappings(), which tries to enable the local APIC in the case
that no SMP configuration was found set num_processors to 1.
This allows to remove yet another check for the local APIC and yet another
place which registers the boot CPUs local APIC ID.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:03:49 +0000 (15:03 -0700)]
x86/xen/pv: Pretend that it found SMP configuration
Unlike all other SMP configuration "parsers" XEN/PV does not set
smp_found_config which is inconsistent and prevents doing proper decision
logic based on this flag.
Make XEN/PV pretend that it found SMP configuration.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:03:49 +0000 (15:03 -0700)]
x86/apic: Sanitize APIC address setup
Convert places which just write mp_lapic_addr and let them register the
local APIC address directly instead of relying on magic other code to do
so.
Add a WARN_ON() into register_lapic_address() which is raised when
register_lapic_address() is invoked more than once during boot.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:03:48 +0000 (15:03 -0700)]
x86/apic: Split register_apic_address()
Split the fixmap setup out of register_lapic_address() and reuse it when
the X2APIC is disabled during setup.
This avoids registering the APIC ID (setting 'mp_lapic_addr') twice.
[ dhansen: changelog wording tweak ]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:03:48 +0000 (15:03 -0700)]
x86/apic: Make some APIC init functions bool
Quite some APIC init functions are pure boolean, but use the success = 0,
fail < 0 model. That's confusing as hell when reading through the code.
Convert them to boolean.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:03:47 +0000 (15:03 -0700)]
x86/of: Fix the APIC address registration
The device tree APIC parser tries to force-enable the local APIC when it is
not set in CPUID. apic_force_enable() registers the boot CPU apic on
success.
If that succeeds then dtb_lapic_setup() registers the local APIC again
eventually with a different address.
Rewrite the code so that it only registers it once.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Dave Hansen [Tue, 8 Aug 2023 22:03:47 +0000 (15:03 -0700)]
x86/apic: Remove mpparse 'apicid' variable
From: Dave Hansen <dave.hansen@linux.intel.com>
Some truly ancient code had different ways of calculating the 'apicid'
but it is long gone. Zap the unnecssary local variablee
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Thomas Gleixner [Tue, 8 Aug 2023 22:03:46 +0000 (15:03 -0700)]
x86/apic: Remove the pointless APIC version check
This historical leftover is really uninteresting today. Whatever MPTABLE or
MADT delivers we only trust the hardware anyway.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:03:46 +0000 (15:03 -0700)]
x86/apic: Register boot CPU APIC early
Register the boot CPU APIC right when the boot CPUs APIC is read from the
hardware. No point is doing this on random places and having wild
heuristics to save the boot CPU APIC ID slot and CPU number 0 reserved.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:03:45 +0000 (15:03 -0700)]
x86/apic: Consolidate boot_cpu_physical_apicid initialization sites
boot_cpu_physical_apicid is written in random places and in the last
consequence filled with the APIC ID read from the local APIC. That causes
it to have inconsistent state when the MPTABLE is broken. As a consequence
tons of moronic checks are sprinkled all over the place.
Consolidate the code and read it exactly once when either X2APIC mode is
detected early or when the APIC mapping is established.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:03:44 +0000 (15:03 -0700)]
x86/apic: Nuke unused apic::inquire_remote_apic()
Put it to the other historical leftovers.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:03:43 +0000 (15:03 -0700)]
x86/apic: Remove unused max_physical_apicid
max_physical_apicid is assigned but never read.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:03:43 +0000 (15:03 -0700)]
x86/apic: Get rid of hard_smp_processor_id()
No point in having a wrapper around read_apic_id().
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:03:42 +0000 (15:03 -0700)]
x86/apic: Remove pointless x86_bios_cpu_apicid
It's a useless copy of x86_cpu_to_apicid.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:03:41 +0000 (15:03 -0700)]
x86/apic/ioapic: Rename skip_ioapic_setup
Another variable name which is confusing at best. Convert to bool.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:03:40 +0000 (15:03 -0700)]
x86/apic: Rename disable_apic
It reflects a state and not a command. Make it bool while at it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:03:39 +0000 (15:03 -0700)]
x86/cpu: Remove unused physid_*() nonsense
Tons of silly unused bitmap wrappers...
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Thomas Gleixner [Tue, 8 Aug 2023 22:03:39 +0000 (15:03 -0700)]
x86/cpu: Make identify_boot_cpu() static
It's not longer used outside the source file.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
Xin Li [Wed, 21 Jun 2023 17:12:48 +0000 (10:12 -0700)]
tools: Get rid of IRQ_MOVE_CLEANUP_VECTOR from tools
IRQ_MOVE_CLEANUP_VECTOR is not longer in use. Remove the last traces.
Signed-off-by: Xin Li <xin3.li@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20230621171248.6805-4-xin3.li@intel.com
Thomas Gleixner [Wed, 21 Jun 2023 17:12:47 +0000 (10:12 -0700)]
x86/vector: Replace IRQ_MOVE_CLEANUP_VECTOR with a timer callback
The left overs of a moved interrupt are cleaned up once the interrupt is
raised on the new target CPU. Keeping the vector valid on the original
target CPU guarantees that there can't be an interrupt lost if the affinity
change races with an concurrent interrupt from the device.
This cleanup utilizes the lowest priority interrupt vector for this
cleanup, which makes sure that in the unlikely case when the to be cleaned
up interrupt is pending in the local APICs IRR the cleanup vector does not
live lock.
But there is no real reason to use an interrupt vector for cleaning up the
leftovers of a moved interrupt. It's not a high performance operation. The
only requirement is that it happens on the original target CPU.
Convert it to use a timer instead and adjust the code accordingly.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Xin Li <xin3.li@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20230621171248.6805-3-xin3.li@intel.com
Thomas Gleixner [Wed, 21 Jun 2023 17:12:46 +0000 (10:12 -0700)]
x86/vector: Rename send_cleanup_vector() to vector_schedule_cleanup()
Rename send_cleanup_vector() to vector_schedule_cleanup() to prepare for
replacing the vector cleanup IPI with a timer callback.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Xin Li <xin3.li@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Steve Wahl <steve.wahl@hpe.com>
Link: https://lore.kernel.org/r/20230621171248.6805-2-xin3.li@intel.com
Linus Torvalds [Sun, 30 Jul 2023 20:23:47 +0000 (13:23 -0700)]
Linux 6.5-rc4
Linus Torvalds [Sun, 30 Jul 2023 19:54:31 +0000 (12:54 -0700)]
Merge tag 'spi-fix-v6.5-rc3' of git://git./linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"A bunch of fixes for the Qualcomm QSPI driver, fixing multiple issues
with the newly added DMA mode - it had a number of issues exposed when
tested in a wider range of use cases, both race condition style issues
and issues with different inputs to those that had been used in test"
* tag 'spi-fix-v6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: spi-qcom-qspi: Add mem_ops to avoid PIO for badly sized reads
spi: spi-qcom-qspi: Fallback to PIO for xfers that aren't multiples of 4 bytes
spi: spi-qcom-qspi: Add DMA_CHAIN_DONE to ALL_IRQS
spi: spi-qcom-qspi: Call dma_wmb() after setting up descriptors
spi: spi-qcom-qspi: Use GFP_ATOMIC flag while allocating for descriptor
spi: spi-qcom-qspi: Ignore disabled interrupts' status in isr
Linus Torvalds [Sun, 30 Jul 2023 19:52:05 +0000 (12:52 -0700)]
Merge tag 'regulator-fix-v6.5-rc3' of git://git./linux/kernel/git/broonie/regulator
Pull regulator fixes from Mark Brown:
"A couple of small fixes for the the mt6358 driver, fixing error
reporting and a bootstrapping issue"
* tag 'regulator-fix-v6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: mt6358: Fix incorrect VCN33 sync error message
regulator: mt6358: Sync VCN33_* enable status after checking ID
Linus Torvalds [Sun, 30 Jul 2023 18:57:51 +0000 (11:57 -0700)]
Merge tag 'usb-6.5-rc4' of git://git./linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are a set of USB driver fixes for 6.5-rc4. Include in here are:
- new USB serial device ids
- dwc3 driver fixes for reported issues
- typec driver fixes for reported problems
- gadget driver fixes
- reverts of some problematic USB changes that went into -rc1
All of these have been in linux-next with no reported problems"
* tag 'usb-6.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (24 commits)
usb: misc: ehset: fix wrong if condition
usb: dwc3: pci: skip BYT GPIO lookup table for hardwired phy
usb: cdns3: fix incorrect calculation of ep_buf_size when more than one config
usb: gadget: call usb_gadget_check_config() to verify UDC capability
usb: typec: Use sysfs_emit_at when concatenating the string
usb: typec: Iterate pds array when showing the pd list
usb: typec: Set port->pd before adding device for typec_port
usb: typec: qcom: fix return value check in qcom_pmic_typec_probe()
Revert "usb: gadget: tegra-xudc: Fix error check in tegra_xudc_powerdomain_init()"
Revert "usb: xhci: tegra: Fix error check"
USB: gadget: Fix the memory leak in raw_gadget driver
usb: gadget: core: remove unbalanced mutex_unlock in usb_gadget_activate
Revert "usb: dwc3: core: Enable AutoRetry feature in the controller"
Revert "xhci: add quirk for host controllers that don't update endpoint DCS"
USB: quirks: add quirk for Focusrite Scarlett
usb: xhci-mtk: set the dma max_seg_size
MAINTAINERS: drop invalid usb/cdns3 Reviewer e-mail
usb: dwc3: don't reset device side if dwc3 was configured as host-only
usb: typec: ucsi: move typec_set_mode(TYPEC_STATE_SAFE) to ucsi_unregister_partner()
usb: ohci-at91: Fix the unhandle interrupt when resume
...
Linus Torvalds [Sun, 30 Jul 2023 18:51:36 +0000 (11:51 -0700)]
Merge tag 'tty-6.5-rc4' of git://git./linux/kernel/git/gregkh/tty
Pull tty/serial fixes from Greg KH:
"Here are some small TTY and serial driver fixes for 6.5-rc4 for some
reported problems. Included in here is:
- TIOCSTI fix for braille readers
- documentation fix for minor numbers
- MAINTAINERS update for new serial files in -rc1
- minor serial driver fixes for reported problems
All of these have been in linux-next with no reported problems"
* tag 'tty-6.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
serial: 8250_dw: Preserve original value of DLF register
tty: serial: sh-sci: Fix sleeping in atomic context
serial: sifive: Fix sifive_serial_console_setup() section
Documentation: devices.txt: reconcile serial/ucc_uart minor numers
MAINTAINERS: Update TTY layer for lists and recently added files
tty: n_gsm: fix UAF in gsm_cleanup_mux
TIOCSTI: always enable for CAP_SYS_ADMIN
Linus Torvalds [Sun, 30 Jul 2023 18:47:56 +0000 (11:47 -0700)]
Merge tag 'staging-6.5-rc4' of git://git./linux/kernel/git/gregkh/staging
Pull staging driver fixes from Greg KH:
"Here are three small staging driver fixes for 6.5-rc4 that resolve
some reported problems. These fixes are:
- fix for an old bug in the r8712 driver
- fbtft driver fix for a spi device
- potential overflow fix in the ks7010 driver
All of these have been in linux-next with no reported problems"
* tag 'staging-6.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: ks7010: potential buffer overflow in ks_wlan_set_encode_ext()
staging: fbtft: ili9341: use macro FBTFT_REGISTER_SPI_DRIVER
staging: r8712: Fix memory leak in _r8712_init_xmit_priv()
Linus Torvalds [Sun, 30 Jul 2023 18:44:00 +0000 (11:44 -0700)]
Merge tag 'char-misc-6.5-rc4' of git://git./linux/kernel/git/gregkh/char-misc
Pull char driver and Documentation fixes from Greg KH:
"Here is a char driver fix and some documentation updates for 6.5-rc4
that contain the following changes:
- sram/genalloc bugfix for reported problem
- security-bugs.rst update based on recent discussions
- embargoed-hardware-issues minor cleanups and then partial revert
for the project/company lists
All of these have been in linux-next for a while with no reported
problems, and the documentation updates have all been reviewed by the
relevant developers"
* tag 'char-misc-6.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
misc/genalloc: Name subpools by of_node_full_name()
Documentation: embargoed-hardware-issues.rst: add AMD to the list
Documentation: embargoed-hardware-issues.rst: clean out empty and unused entries
Documentation: security-bugs.rst: clarify CVE handling
Documentation: security-bugs.rst: update preferences when dealing with the linux-distros group
Linus Torvalds [Sun, 30 Jul 2023 18:27:22 +0000 (11:27 -0700)]
Merge tag 'probes-fixes-v6.5-rc3' of git://git./linux/kernel/git/trace/linux-trace
Pull probe fixes from Masami Hiramatsu:
- probe-events: add NULL check for some BTF API calls which can return
error code and NULL.
- ftrace selftests: check fprobe and kprobe event correctly. This fixes
a miss condition of the test command.
- kprobes: do not allow probing functions that start with "__cfi_" or
"__pfx_" since those are auto generated for kernel CFI and not
executed.
* tag 'probes-fixes-v6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
kprobes: Prohibit probing on CFI preamble symbol
selftests/ftrace: Fix to check fprobe event eneblement
tracing/probes: Fix to add NULL check for BTF APIs
Linus Torvalds [Sun, 30 Jul 2023 18:19:08 +0000 (11:19 -0700)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"x86:
- Do not register IRQ bypass consumer if posted interrupts not
supported
- Fix missed device interrupt due to non-atomic update of IRR
- Use GFP_KERNEL_ACCOUNT for pid_table in ipiv
- Make VMREAD error path play nice with noinstr
- x86: Acquire SRCU read lock when handling fastpath MSR writes
- Support linking rseq tests statically against glibc 2.35+
- Fix reference count for stats file descriptors
- Detect userspace setting invalid CR0
Non-KVM:
- Remove coccinelle script that has caused multiple confusion
("debugfs, coccinelle: check for obsolete DEFINE_SIMPLE_ATTRIBUTE()
usage", acked by Greg)"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (21 commits)
KVM: selftests: Expand x86's sregs test to cover illegal CR0 values
KVM: VMX: Don't fudge CR0 and CR4 for restricted L2 guest
KVM: x86: Disallow KVM_SET_SREGS{2} if incoming CR0 is invalid
Revert "debugfs, coccinelle: check for obsolete DEFINE_SIMPLE_ATTRIBUTE() usage"
KVM: selftests: Verify stats fd is usable after VM fd has been closed
KVM: selftests: Verify stats fd can be dup()'d and read
KVM: selftests: Verify userspace can create "redundant" binary stats files
KVM: selftests: Explicitly free vcpus array in binary stats test
KVM: selftests: Clean up stats fd in common stats_test() helper
KVM: selftests: Use pread() to read binary stats header
KVM: Grab a reference to KVM for VM and vCPU stats file descriptors
selftests/rseq: Play nice with binaries statically linked against glibc 2.35+
Revert "KVM: SVM: Skip WRMSR fastpath on VM-Exit if next RIP isn't valid"
KVM: x86: Acquire SRCU read lock when handling fastpath MSR writes
KVM: VMX: Use vmread_error() to report VM-Fail in "goto" path
KVM: VMX: Make VMREAD error path play nice with noinstr
KVM: x86/irq: Conditionally register IRQ bypass consumer again
KVM: X86: Use GFP_KERNEL_ACCOUNT for pid_table in ipiv
KVM: x86: check the kvm_cpu_get_interrupt result before using it
KVM: x86: VMX: set irr_pending in kvm_apic_update_irr
...
Linus Torvalds [Sun, 30 Jul 2023 18:12:32 +0000 (11:12 -0700)]
Merge tag 'locking_urgent_for_v6.5_rc4' of git://git./linux/kernel/git/tip/tip
Pull locking fix from Borislav Petkov:
- Fix a rtmutex race condition resulting from sharing of the sort key
between the lock waiters and the PI chain tree (->pi_waiters) of a
task by giving each tree their own sort key
* tag 'locking_urgent_for_v6.5_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/rtmutex: Fix task->pi_waiters integrity
Linus Torvalds [Sun, 30 Jul 2023 18:05:35 +0000 (11:05 -0700)]
Merge tag 'x86_urgent_for_v6.5_rc4' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
- AMD's automatic IBRS doesn't enable cross-thread branch target
injection protection (STIBP) for user processes. Enable STIBP on such
systems.
- Do not delete (but put the ref instead) of AMD MCE error thresholding
sysfs kobjects when destroying them in order not to delete the kernfs
pointer prematurely
- Restore annotation in ret_from_fork_asm() in order to fix kthread
stack unwinding from being marked as unreliable and thus breaking
livepatching
* tag 'x86_urgent_for_v6.5_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/cpu: Enable STIBP on AMD if Automatic IBRS is enabled
x86/MCE/AMD: Decrement threshold_bank refcount when removing threshold blocks
x86: Fix kthread unwind
Linus Torvalds [Sun, 30 Jul 2023 17:59:19 +0000 (10:59 -0700)]
Merge tag 'irq_urgent_for_v6.5_rc4' of git://git./linux/kernel/git/tip/tip
Pull irq fixes from Borislav Petkov:
- Work around an erratum on GIC700, where a race between a CPU handling
a wake-up interrupt, a change of affinity, and another CPU going to
sleep can result in a lack of wake-up event on the next interrupt
- Fix the locking required on a VPE for GICv4
- Enable Rockchip
3588001 erratum workaround for RK3588S
- Fix the irq-bcm6345-l1 assumtions of the boot CPU always be the first
CPU in the system
* tag 'irq_urgent_for_v6.5_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/gic-v3: Workaround for GIC-700 erratum
2941627
irqchip/gic-v3: Enable Rockchip
3588001 erratum workaround for RK3588S
irqchip/gic-v4.1: Properly lock VPEs when doing a directLPI invalidation
irq-bcm6345-l1: Do not assume a fixed block to cpu mapping
Linus Torvalds [Sun, 30 Jul 2023 03:49:13 +0000 (20:49 -0700)]
Merge tag '6.5-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:
"Four small SMB3 client fixes:
- two reconnect fixes (to address the case where non-default
iocharset gets incorrectly overridden at reconnect with the
default charset)
- fix for NTLMSSP_AUTH request setting a flag incorrectly)
- Add missing check for invalid tlink (tree connection) in ioctl"
* tag '6.5-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: add missing return value check for cifs_sb_tlink
smb3: do not set NTLMSSP_VERSION flag for negotiate not auth request
cifs: fix charset issue in reconnection
fs/nls: make load_nls() take a const parameter
Linus Torvalds [Sun, 30 Jul 2023 03:40:43 +0000 (20:40 -0700)]
Merge tag 'trace-v6.5-rc3' of git://git./linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:
- Fix to /sys/kernel/tracing/per_cpu/cpu*/stats read and entries.
If a resize shrinks the buffer it clears the read count to notify
readers that they need to reset. But the read count is also used for
accounting and this causes the numbers to be off. Instead, create a
separate variable to use to notify readers to reset.
- Fix the ref counts of the "soft disable" mode. The wrong value was
used for testing if soft disable mode should be enabled or disable,
but instead, just change the logic to do the enable and disable in
place when the SOFT_MODE is set or cleared.
- Several kernel-doc fixes
- Removal of unused external declarations
* tag 'trace-v6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing: Fix warning in trace_buffered_event_disable()
ftrace: Remove unused extern declarations
tracing: Fix kernel-doc warnings in trace_seq.c
tracing: Fix kernel-doc warnings in trace_events_trigger.c
tracing/synthetic: Fix kernel-doc warnings in trace_events_synth.c
ring-buffer: Fix kernel-doc warnings in ring_buffer.c
ring-buffer: Fix wrong stat of cpu_buffer->read
Sven Joachim [Thu, 27 Jul 2023 20:00:41 +0000 (22:00 +0200)]
arch/*/configs/*defconfig: Replace AUTOFS4_FS by AUTOFS_FS
Commit
a2225d931f75 ("autofs: remove left-over autofs4 stubs")
promised the removal of the fs/autofs/Kconfig fragment for AUTOFS4_FS
within a couple of releases, but five years later this still has not
happened yet, and AUTOFS4_FS is still enabled in 63 defconfigs.
Get rid of it mechanically:
git grep -l CONFIG_AUTOFS4_FS -- '*defconfig' |
xargs sed -i 's/AUTOFS4_FS/AUTOFS_FS/'
Also just remove the AUTOFS4_FS config option stub. Anybody who hasn't
regenerated their config file in the last five years will need to just
get the new name right when they do.
Signed-off-by: Sven Joachim <svenjoac@gmx.de>
Acked-by: Ian Kent <raven@themaw.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sat, 29 Jul 2023 15:59:25 +0000 (08:59 -0700)]
Merge tag 'loongarch-fixes-6.5-1' of git://git./linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch fixes from Huacai Chen:
"Some bug fixes for build system, builtin cmdline handling, bpf and
{copy, clear}_user, together with a trivial cleanup"
* tag 'loongarch-fixes-6.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
LoongArch: Cleanup __builtin_constant_p() checking for cpu_has_*
LoongArch: BPF: Fix check condition to call lu32id in move_imm()
LoongArch: BPF: Enable bpf_probe_read{, str}() on LoongArch
LoongArch: Fix return value underflow in exception path
LoongArch: Fix CMDLINE_EXTEND and CMDLINE_BOOTLOADER handling
LoongArch: Fix module relocation error with binutils 2.41
LoongArch: Only fiddle with CHECKFLAGS if `need-compiler'
Sean Christopherson [Tue, 13 Jun 2023 20:30:37 +0000 (13:30 -0700)]
KVM: selftests: Expand x86's sregs test to cover illegal CR0 values
Add coverage to x86's set_sregs_test to verify KVM rejects vendor-agnostic
illegal CR0 values, i.e. CR0 values whose legality doesn't depend on the
current VMX mode. KVM historically has neglected to reject bad CR0s from
userspace, i.e. would happily accept a completely bogus CR0 via
KVM_SET_SREGS{2}.
Punt VMX specific subtests to future work, as they would require quite a
bit more effort, and KVM gets coverage for CR0 checks in general through
other means, e.g. KVM-Unit-Tests.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <
20230613203037.
1968489-4-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Tue, 13 Jun 2023 20:30:36 +0000 (13:30 -0700)]
KVM: VMX: Don't fudge CR0 and CR4 for restricted L2 guest
Stuff CR0 and/or CR4 to be compliant with a restricted guest if and only
if KVM itself is not configured to utilize unrestricted guests, i.e. don't
stuff CR0/CR4 for a restricted L2 that is running as the guest of an
unrestricted L1. Any attempt to VM-Enter a restricted guest with invalid
CR0/CR4 values should fail, i.e. in a nested scenario, KVM (as L0) should
never observe a restricted L2 with incompatible CR0/CR4, since nested
VM-Enter from L1 should have failed.
And if KVM does observe an active, restricted L2 with incompatible state,
e.g. due to a KVM bug, fudging CR0/CR4 instead of letting VM-Enter fail
does more harm than good, as KVM will often neglect to undo the side
effects, e.g. won't clear rmode.vm86_active on nested VM-Exit, and thus
the damage can easily spill over to L1. On the other hand, letting
VM-Enter fail due to bad guest state is more likely to contain the damage
to L2 as KVM relies on hardware to perform most guest state consistency
checks, i.e. KVM needs to be able to reflect a failed nested VM-Enter into
L1 irrespective of (un)restricted guest behavior.
Cc: Jim Mattson <jmattson@google.com>
Cc: stable@vger.kernel.org
Fixes: bddd82d19e2e ("KVM: nVMX: KVM needs to unset "unrestricted guest" VM-execution control in vmcs02 if vmcs12 doesn't set it")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <
20230613203037.
1968489-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Tue, 13 Jun 2023 20:30:35 +0000 (13:30 -0700)]
KVM: x86: Disallow KVM_SET_SREGS{2} if incoming CR0 is invalid
Reject KVM_SET_SREGS{2} with -EINVAL if the incoming CR0 is invalid,
e.g. due to setting bits 63:32, illegal combinations, or to a value that
isn't allowed in VMX (non-)root mode. The VMX checks in particular are
"fun" as failure to disallow Real Mode for an L2 that is configured with
unrestricted guest disabled, when KVM itself has unrestricted guest
enabled, will result in KVM forcing VM86 mode to virtual Real Mode for
L2, but then fail to unwind the related metadata when synthesizing a
nested VM-Exit back to L1 (which has unrestricted guest enabled).
Opportunistically fix a benign typo in the prototype for is_valid_cr4().
Cc: stable@vger.kernel.org
Reported-by: syzbot+5feef0b9ee9c8e9e5689@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/000000000000f316b705fdf6e2b4@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <
20230613203037.
1968489-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Wed, 26 Jul 2023 20:29:20 +0000 (13:29 -0700)]
Revert "debugfs, coccinelle: check for obsolete DEFINE_SIMPLE_ATTRIBUTE() usage"
Remove coccinelle's recommendation to use DEFINE_DEBUGFS_ATTRIBUTE()
instead of DEFINE_SIMPLE_ATTRIBUTE(). Regardless of whether or not the
"significant overhead" incurred by debugfs_create_file() is actually
meaningful, warnings from the script have led to a rash of low-quality
patches that have sowed confusion and consumed maintainer time for little
to no benefit. There have been no less than four attempts to "fix" KVM,
and a quick search on lore shows that KVM is not alone.
This reverts commit
5103068eaca290f890a30aae70085fac44cecaf6.
Link: https://lore.kernel.org/all/87tu2nbnz3.fsf@mpe.ellerman.id.au
Link: https://lore.kernel.org/all/c0b98151-16b6-6d8f-1765-0f7d46682d60@redhat.com
Link: https://lkml.kernel.org/r/20230706072954.4881-1-duminjie%40vivo.com
Link: https://lore.kernel.org/all/Y2FsbufV00jbyF0B@google.com
Link: https://lore.kernel.org/all/Y2ENJJ1YiSg5oHiy@orome
Link: https://lore.kernel.org/all/7560b350e7b23786ce712118a9a504356ff1cca4.camel@kernel.org
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <
20230726202920.507756-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Tue, 11 Jul 2023 23:01:31 +0000 (16:01 -0700)]
KVM: selftests: Verify stats fd is usable after VM fd has been closed
Verify that VM and vCPU binary stats files are usable even after userspace
has put its last direct reference to the VM. This is a regression test
for a UAF bug where KVM didn't gift the stats files a reference to the VM.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <
20230711230131.648752-8-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Tue, 11 Jul 2023 23:01:30 +0000 (16:01 -0700)]
KVM: selftests: Verify stats fd can be dup()'d and read
Expand the binary stats test to verify that a stats fd can be dup()'d
and read, to (very) roughly simulate userspace passing around the file.
Adding the dup() test is primarily an intermediate step towards verifying
that userspace can read VM/vCPU stats before _and_ after userspace closes
its copy of the VM fd; the dup() test itself is only mildly interesting.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <
20230711230131.648752-7-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Tue, 11 Jul 2023 23:01:29 +0000 (16:01 -0700)]
KVM: selftests: Verify userspace can create "redundant" binary stats files
Verify that KVM doesn't artificially limit KVM_GET_STATS_FD to a single
file per VM/vCPU. There's no known use case for getting multiple stats
fds, but it should work, and more importantly creating multiple files will
make it easier to test that KVM correct manages VM refcounts for stats
files.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <
20230711230131.648752-6-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Tue, 11 Jul 2023 23:01:28 +0000 (16:01 -0700)]
KVM: selftests: Explicitly free vcpus array in binary stats test
Explicitly free the all-encompassing vcpus array in the binary stats test
so that the test is consistent with respect to freeing all dynamically
allocated resources (versus letting them be freed on exit).
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <
20230711230131.648752-5-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Tue, 11 Jul 2023 23:01:27 +0000 (16:01 -0700)]
KVM: selftests: Clean up stats fd in common stats_test() helper
Move the stats fd cleanup code into stats_test() and drop the
superfluous vm_stats_test() and vcpu_stats_test() helpers in order to
decouple creation of the stats file from consuming/testing the file
(deduping code is a bonus). This will make it easier to test various
edge cases related to stats, e.g. that userspace can dup() a stats fd,
that userspace can have multiple stats files for a singleVM/vCPU, etc.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <
20230711230131.648752-4-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Tue, 11 Jul 2023 23:01:26 +0000 (16:01 -0700)]
KVM: selftests: Use pread() to read binary stats header
Use pread() with an explicit offset when reading the header and the header
name for a binary stats fd so that the common helper and the binary stats
test don't subtly rely on the file effectively being untouched, e.g. to
allow multiple reads of the header, name, etc.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <
20230711230131.648752-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Tue, 11 Jul 2023 23:01:25 +0000 (16:01 -0700)]
KVM: Grab a reference to KVM for VM and vCPU stats file descriptors
Grab a reference to KVM prior to installing VM and vCPU stats file
descriptors to ensure the underlying VM and vCPU objects are not freed
until the last reference to any and all stats fds are dropped.
Note, the stats paths manually invoke fd_install() and so don't need to
grab a reference before creating the file.
Fixes: ce55c049459c ("KVM: stats: Support binary stats retrieval for a VCPU")
Fixes: fcfe1baeddbf ("KVM: stats: Support binary stats retrieval for a VM")
Reported-by: Zheng Zhang <zheng.zhang@email.ucr.edu>
Closes: https://lore.kernel.org/all/CAC_GQSr3xzZaeZt85k_RCBd5kfiOve8qXo7a81Cq53LuVQ5r=Q@mail.gmail.com
Cc: stable@vger.kernel.org
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Message-Id: <
20230711230131.648752-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Fri, 21 Jul 2023 22:33:52 +0000 (15:33 -0700)]
selftests/rseq: Play nice with binaries statically linked against glibc 2.35+
To allow running rseq and KVM's rseq selftests as statically linked
binaries, initialize the various "trampoline" pointers to point directly
at the expect glibc symbols, and skip the dlysm() lookups if the rseq
size is non-zero, i.e. the binary is statically linked *and* the libc
registered its own rseq.
Define weak versions of the symbols so as not to break linking against
libc versions that don't support rseq in any capacity.
The KVM selftests in particular are often statically linked so that they
can be run on targets with very limited runtime environments, i.e. test
machines.
Fixes: 233e667e1ae3 ("selftests/rseq: Uplift rseq selftests for compatibility with glibc-2.35")
Cc: Aaron Lewis <aaronlewis@google.com>
Cc: kvm@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <
20230721223352.
2333911-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Fri, 21 Jul 2023 22:43:37 +0000 (15:43 -0700)]
Revert "KVM: SVM: Skip WRMSR fastpath on VM-Exit if next RIP isn't valid"
Now that handle_fastpath_set_msr_irqoff() acquires kvm->srcu, i.e. allows
dereferencing memslots during WRMSR emulation, drop the requirement that
"next RIP" is valid. In hindsight, acquiring kvm->srcu would have been a
better fix than avoiding the pastpath, but at the time it was thought that
accessing SRCU-protected data in the fastpath was a one-off edge case.
This reverts commit
5c30e8101e8d5d020b1d7119117889756a6ed713.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <
20230721224337.
2335137-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Fri, 21 Jul 2023 22:43:36 +0000 (15:43 -0700)]
KVM: x86: Acquire SRCU read lock when handling fastpath MSR writes
Temporarily acquire kvm->srcu for read when potentially emulating WRMSR in
the VM-Exit fastpath handler, as several of the common helpers used during
emulation expect the caller to provide SRCU protection. E.g. if the guest
is counting instructions retired, KVM will query the PMU event filter when
stepping over the WRMSR.
dump_stack+0x85/0xdf
lockdep_rcu_suspicious+0x109/0x120
pmc_event_is_allowed+0x165/0x170
kvm_pmu_trigger_event+0xa5/0x190
handle_fastpath_set_msr_irqoff+0xca/0x1e0
svm_vcpu_run+0x5c3/0x7b0 [kvm_amd]
vcpu_enter_guest+0x2108/0x2580
Alternatively, check_pmu_event_filter() could acquire kvm->srcu, but this
isn't the first bug of this nature, e.g. see commit
5c30e8101e8d ("KVM:
SVM: Skip WRMSR fastpath on VM-Exit if next RIP isn't valid"). Providing
protection for the entirety of WRMSR emulation will allow reverting the
aforementioned commit, and will avoid having to play whack-a-mole when new
uses of SRCU-protected structures are inevitably added in common emulation
helpers.
Fixes: dfdeda67ea2d ("KVM: x86/pmu: Prevent the PMU from counting disallowed events")
Reported-by: Greg Thelen <gthelen@google.com>
Reported-by: Aaron Lewis <aaronlewis@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <
20230721224337.
2335137-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Fri, 21 Jul 2023 23:56:37 +0000 (16:56 -0700)]
KVM: VMX: Use vmread_error() to report VM-Fail in "goto" path
Use vmread_error() to report VM-Fail on VMREAD for the "asm goto" case,
now that trampoline case has yet another wrapper around vmread_error() to
play nice with instrumentation.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <
20230721235637.
2345403-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Fri, 21 Jul 2023 23:56:36 +0000 (16:56 -0700)]
KVM: VMX: Make VMREAD error path play nice with noinstr
Mark vmread_error_trampoline() as noinstr, and add a second trampoline
for the CONFIG_CC_HAS_ASM_GOTO_OUTPUT=n case to enable instrumentation
when handling VM-Fail on VMREAD. VMREAD is used in various noinstr
flows, e.g. immediately after VM-Exit, and objtool rightly complains that
the call to the error trampoline leaves a no-instrumentation section
without annotating that it's safe to do so.
vmlinux.o: warning: objtool: vmx_vcpu_enter_exit+0xc9:
call to vmread_error_trampoline() leaves .noinstr.text section
Note, strictly speaking, enabling instrumentation in the VM-Fail path
isn't exactly safe, but if VMREAD fails the kernel/system is likely hosed
anyways, and logging that there is a fatal error is more important than
*maybe* encountering slightly unsafe instrumentation.
Reported-by: Su Hui <suhui@nfschina.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <
20230721235637.
2345403-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Like Xu [Mon, 24 Jul 2023 11:12:36 +0000 (19:12 +0800)]
KVM: x86/irq: Conditionally register IRQ bypass consumer again
As was attempted commit
14717e203186 ("kvm: Conditionally register IRQ
bypass consumer"): "if we don't support a mechanism for bypassing IRQs,
don't register as a consumer. Initially this applied to AMD processors,
but when AVIC support was implemented for assigned devices,
kvm_arch_has_irq_bypass() was always returning true.
We can still skip registering the consumer where enable_apicv
or posted-interrupts capability is unsupported or globally disabled.
This eliminates meaningless dev_info()s when the connect fails
between producer and consumer", such as on Linux hosts where enable_apicv
or posted-interrupts capability is unsupported or globally disabled.
Cc: Alex Williamson <alex.williamson@redhat.com>
Reported-by: Yong He <alexyonghe@tencent.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217379
Signed-off-by: Like Xu <likexu@tencent.com>
Message-Id: <
20230724111236.76570-1-likexu@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Peng Hao [Fri, 28 Jul 2023 06:49:48 +0000 (14:49 +0800)]
KVM: X86: Use GFP_KERNEL_ACCOUNT for pid_table in ipiv
The pid_table of ipiv is the persistent memory allocated by
per-vcpu, which should be counted into the memory cgroup.
Signed-off-by: Peng Hao <flyingpeng@tencent.com>
Message-Id: <CAPm50aLxCQ3TQP2Lhc0PX3y00iTRg+mniLBqNDOC=t9CLxMwwA@mail.gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Maxim Levitsky [Wed, 26 Jul 2023 13:59:45 +0000 (16:59 +0300)]
KVM: x86: check the kvm_cpu_get_interrupt result before using it
The code was blindly assuming that kvm_cpu_get_interrupt never returns -1
when there is a pending interrupt.
While this should be true, a bug in KVM can still cause this.
If -1 is returned, the code before this patch was converting it to 0xFF,
and 0xFF interrupt was injected to the guest, which results in an issue
which was hard to debug.
Add WARN_ON_ONCE to catch this case and skip the injection
if this happens again.
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <
20230726135945.260841-4-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Maxim Levitsky [Wed, 26 Jul 2023 13:59:44 +0000 (16:59 +0300)]
KVM: x86: VMX: set irr_pending in kvm_apic_update_irr
When the APICv is inhibited, the irr_pending optimization is used.
Therefore, when kvm_apic_update_irr sets bits in the IRR,
it must set irr_pending to true as well.
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <
20230726135945.260841-3-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Maxim Levitsky [Wed, 26 Jul 2023 13:59:43 +0000 (16:59 +0300)]
KVM: x86: VMX: __kvm_apic_update_irr must update the IRR atomically
If APICv is inhibited, then IPIs from peer vCPUs are done by
atomically setting bits in IRR.
This means, that when __kvm_apic_update_irr copies PIR to IRR,
it has to modify IRR atomically as well.
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <
20230726135945.260841-2-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Masami Hiramatsu (Google) [Tue, 11 Jul 2023 01:50:47 +0000 (10:50 +0900)]
kprobes: Prohibit probing on CFI preamble symbol
Do not allow to probe on "__cfi_" or "__pfx_" started symbol, because those
are used for CFI and not executed. Probing it will break the CFI.
Link: https://lore.kernel.org/all/168904024679.116016.18089228029322008512.stgit@devnote2/
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Linus Torvalds [Sat, 29 Jul 2023 01:31:18 +0000 (18:31 -0700)]
Merge tag 'ata-6.5-rc4' of git://git./linux/kernel/git/dlemoal/libata
Pull ata fixes from Damien Le Moal:
- Fix error message output in the pata_arasan_cf driver (Minjie)
- Fix invalid error return in the pata_octeon_cf driver initialization
(Yingliang)
- Fix a compilation warning due to a missing static function
declaration in the pata_ns87415 driver (Arnd)
- Fix the condition evaluating when to fetch sense data for successful
completions, which should be done only when command duration limits
are being used (Niklas)
* tag 'ata-6.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
ata: libata-core: fix when to fetch sense data for successful commands
ata: pata_ns87415: mark ns87560_tf_read static
ata: pata_octeon_cf: fix error return code in octeon_cf_probe()
ata: pata_arasan_cf: Use dev_err_probe() instead dev_err() in data_xfer()
Zheng Yejian [Wed, 26 Jul 2023 09:58:04 +0000 (17:58 +0800)]
tracing: Fix warning in trace_buffered_event_disable()
Warning happened in trace_buffered_event_disable() at
WARN_ON_ONCE(!trace_buffered_event_ref)
Call Trace:
? __warn+0xa5/0x1b0
? trace_buffered_event_disable+0x189/0x1b0
__ftrace_event_enable_disable+0x19e/0x3e0
free_probe_data+0x3b/0xa0
unregister_ftrace_function_probe_func+0x6b8/0x800
event_enable_func+0x2f0/0x3d0
ftrace_process_regex.isra.0+0x12d/0x1b0
ftrace_filter_write+0xe6/0x140
vfs_write+0x1c9/0x6f0
[...]
The cause of the warning is in __ftrace_event_enable_disable(),
trace_buffered_event_enable() was called once while
trace_buffered_event_disable() was called twice.
Reproduction script show as below, for analysis, see the comments:
```
#!/bin/bash
cd /sys/kernel/tracing/
# 1. Register a 'disable_event' command, then:
# 1) SOFT_DISABLED_BIT was set;
# 2) trace_buffered_event_enable() was called first time;
echo 'cmdline_proc_show:disable_event:initcall:initcall_finish' > \
set_ftrace_filter
# 2. Enable the event registered, then:
# 1) SOFT_DISABLED_BIT was cleared;
# 2) trace_buffered_event_disable() was called first time;
echo 1 > events/initcall/initcall_finish/enable
# 3. Try to call into cmdline_proc_show(), then SOFT_DISABLED_BIT was
# set again!!!
cat /proc/cmdline
# 4. Unregister the 'disable_event' command, then:
# 1) SOFT_DISABLED_BIT was cleared again;
# 2) trace_buffered_event_disable() was called second time!!!
echo '!cmdline_proc_show:disable_event:initcall:initcall_finish' > \
set_ftrace_filter
```
To fix it, IIUC, we can change to call trace_buffered_event_enable() at
fist time soft-mode enabled, and call trace_buffered_event_disable() at
last time soft-mode disabled.
Link: https://lore.kernel.org/linux-trace-kernel/20230726095804.920457-1-zhengyejian1@huawei.com
Cc: <mhiramat@kernel.org>
Fixes: 0fc1b09ff1ff ("tracing: Use temp buffer when filtering events")
Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Linus Torvalds [Sat, 29 Jul 2023 00:19:52 +0000 (17:19 -0700)]
Merge tag 'mm-hotfixes-stable-2023-07-28-15-52' of git://git./linux/kernel/git/akpm/mm
Pull hotfixes from Andrew Morton:
"11 hotfixes. Five are cc:stable and the remainder address post-6.4
issues or aren't considered serious enough to justify backporting"
* tag 'mm-hotfixes-stable-2023-07-28-15-52' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
mm/memory-failure: fix hardware poison check in unpoison_memory()
proc/vmcore: fix signedness bug in read_from_oldmem()
mailmap: update remaining active codeaurora.org email addresses
mm: lock VMA in dup_anon_vma() before setting ->anon_vma
mm: fix memory ordering for mm_lock_seq and vm_lock_seq
scripts/spelling.txt: remove 'thead' as a typo
mm/pagewalk: fix EFI_PGT_DUMP of espfix area
shmem: minor fixes to splice-read implementation
tmpfs: fix Documentation of noswap and huge mount options
Revert "um: Use swap() to make code cleaner"
mm/damon/core-test: initialise context before test in damon_test_set_attrs()
Linus Torvalds [Sat, 29 Jul 2023 00:14:05 +0000 (17:14 -0700)]
Merge tag 'thermal-6.5-rc4' of git://git./linux/kernel/git/rafael/linux-pm
Pull thermal control fixes from Rafael Wysocki:
"Constify thermal_zone_device_register() parameters, which was omitted
by mistake, and fix a double free on thermal zone unregistration in
the generic DT thermal driver (Ahmad Fatoum)"
* tag 'thermal-6.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
thermal: of: fix double-free on unregistration
thermal: core: constify params in thermal_zone_device_register
Linus Torvalds [Sat, 29 Jul 2023 00:08:59 +0000 (17:08 -0700)]
Merge tag 'pm-6.5-rc4' of git://git./linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"Fix the arming of wakeup IRQs in the generic wakeup IRQ code
(wakeirq), drop unused functions from it and fix up a driver using it
and trying to work around the IRQ arming issue in a questionable way
(Johan Hovold)"
* tag 'pm-6.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
serial: qcom-geni: drop bogus runtime pm state update
PM: sleep: wakeirq: drop unused enable helpers
PM: sleep: wakeirq: fix wake irq arming
Linus Torvalds [Sat, 29 Jul 2023 00:02:11 +0000 (17:02 -0700)]
Merge tag 'hwmon-for-v6.5-rc4' of git://git./linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
- k10temp: Display negative temperatures for industrial processors
- pmbus core: Fix deadlock, NULL pointer dereference, and chip enable
detection
- nct7802: Do not display PECI1 temperature if disabled
- nct6775: Fix IN scaling factors and feature detection for
NCT6798/6799
- oxp-sensors: Fix race condition during device attribute creation
- aquacomputer_d5next: Fix incorrect PWM value readout
* tag 'hwmon-for-v6.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (k10temp) Enable AMD3255 Proc to show negative temperature
hwmon: (pmbus_core) Fix Deadlock in pmbus_regulator_get_status
hwmon: (pmbus_core) Fix NULL pointer dereference
hwmon: (pmbus_core) Fix pmbus_is_enabled()
hwmon: (nct7802) Fix for temp6 (PECI1) processed even if PECI1 disabled
hwmon: (nct6775) Fix IN scaling factors for 6798/6799
hwmon: (oxp-sensors) Move tt_toggle attribute to dev_groups
hwmon: (aquacomputer_d5next) Fix incorrect PWM value readout
hwmon: (nct6775) Fix register for nct6799
YueHaibing [Tue, 25 Jul 2023 13:48:08 +0000 (21:48 +0800)]
ftrace: Remove unused extern declarations
commit
6a9c981b1e96 ("ftrace: Remove unused function ftrace_arch_read_dyn_info()")
left ftrace_arch_read_dyn_info() extern declaration.
And commit
1d74f2a0f64b ("ftrace: remove ftrace_ip_converted()")
leave ftrace_ip_converted() declaration.
Link: https://lore.kernel.org/linux-trace-kernel/20230725134808.9716-1-yuehaibing@huawei.com
Cc: <mhiramat@kernel.org>
Cc: <mark.rutland@arm.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Gaosheng Cui [Mon, 24 Jul 2023 14:08:27 +0000 (22:08 +0800)]
tracing: Fix kernel-doc warnings in trace_seq.c
Fix kernel-doc warning:
kernel/trace/trace_seq.c:142: warning: Function parameter or member
'args' not described in 'trace_seq_vprintf'
Link: https://lkml.kernel.org/r/20230724140827.1023266-5-cuigaosheng1@huawei.com
Cc: <mhiramat@kernel.org>
Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Gaosheng Cui [Mon, 24 Jul 2023 14:08:26 +0000 (22:08 +0800)]
tracing: Fix kernel-doc warnings in trace_events_trigger.c
Fix kernel-doc warnings:
kernel/trace/trace_events_trigger.c:59: warning: Function parameter
or member 'buffer' not described in 'event_triggers_call'
kernel/trace/trace_events_trigger.c:59: warning: Function parameter
or member 'event' not described in 'event_triggers_call'
Link: https://lkml.kernel.org/r/20230724140827.1023266-4-cuigaosheng1@huawei.com
Cc: <mhiramat@kernel.org>
Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>