Tom Lendacky [Fri, 26 Jan 2024 04:11:24 +0000 (22:11 -0600)]
crypto: ccp: Add the SNP_COMMIT command
The SNP_COMMIT command is used to commit the currently installed version
of the SEV firmware. Once committed, the firmware cannot be replaced
with a previous firmware version (cannot be rolled back). This command
will also update the reported TCB to match that of the currently
installed firmware.
[ mdr: Note the reported TCB update in the documentation/commit. ]
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240126041126.1927228-25-michael.roth@amd.com
Brijesh Singh [Fri, 26 Jan 2024 04:11:23 +0000 (22:11 -0600)]
crypto: ccp: Add the SNP_PLATFORM_STATUS command
This command is used to query the SNP platform status. See the SEV-SNP
spec for more details.
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240126041126.1927228-24-michael.roth@amd.com
Michael Roth [Fri, 26 Jan 2024 04:11:22 +0000 (22:11 -0600)]
x86/cpufeatures: Enable/unmask SEV-SNP CPU feature
With all the required host changes in place, it should now be possible
to initialize SNP-related MSR bits, set up RMP table enforcement, and
initialize SNP support in firmware while maintaining legacy support for
SEV/SEV-ES guests. Go ahead and enable the SNP feature now.
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240126041126.1927228-23-michael.roth@amd.com
Brijesh Singh [Fri, 26 Jan 2024 04:11:21 +0000 (22:11 -0600)]
KVM: SEV: Make AVIC backing, VMSA and VMCB memory allocation SNP safe
Implement a workaround for an SNP erratum where the CPU will incorrectly
signal an RMP violation #PF if a hugepage (2MB or 1GB) collides with the
RMP entry of a VMCB, VMSA or AVIC backing page.
When SEV-SNP is globally enabled, the CPU marks the VMCB, VMSA, and AVIC
backing pages as "in-use" via a reserved bit in the corresponding RMP
entry after a successful VMRUN. This is done for _all_ VMs, not just
SNP-Active VMs.
If the hypervisor accesses an in-use page through a writable
translation, the CPU will throw an RMP violation #PF. On early SNP
hardware, if an in-use page is 2MB-aligned and software accesses any
part of the associated 2MB region with a hugepage, the CPU will
incorrectly treat the entire 2MB region as in-use and signal a an RMP
violation #PF.
To avoid this, the recommendation is to not use a 2MB-aligned page for
the VMCB, VMSA or AVIC pages. Add a generic allocator that will ensure
that the page returned is not 2MB-aligned and is safe to be used when
SEV-SNP is enabled. Also implement similar handling for the VMCB/VMSA
pages of nested guests.
[ mdr: Squash in nested guest handling from Ashish, commit msg fixups. ]
Reported-by: Alper Gun <alpergun@google.com> # for nested VMSA case
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Co-developed-by: Marc Orr <marcorr@google.com>
Signed-off-by: Marc Orr <marcorr@google.com>
Co-developed-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lore.kernel.org/r/20240126041126.1927228-22-michael.roth@amd.com
Ashish Kalra [Fri, 26 Jan 2024 04:11:20 +0000 (22:11 -0600)]
crypto: ccp: Add panic notifier for SEV/SNP firmware shutdown on kdump
Add a kdump safe version of sev_firmware_shutdown() and register it as a
crash_kexec_post_notifier so it will be invoked during panic/crash to do
SEV/SNP shutdown. This is required for transitioning all IOMMU pages to
reclaim/hypervisor state, otherwise re-init of IOMMU pages during
crashdump kernel boot fails and panics the crashdump kernel.
This panic notifier runs in atomic context, hence it ensures not to
acquire any locks/mutexes and polls for PSP command completion instead
of depending on PSP command completion interrupt.
[ mdr: Remove use of "we" in comments. ]
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240126041126.1927228-21-michael.roth@amd.com
Ashish Kalra [Fri, 26 Jan 2024 04:11:19 +0000 (22:11 -0600)]
iommu/amd: Clean up RMP entries for IOMMU pages during SNP shutdown
Add a new IOMMU API interface amd_iommu_snp_disable() to transition
IOMMU pages to Hypervisor state from Reclaim state after SNP_SHUTDOWN_EX
command. Invoke this API from the CCP driver after SNP_SHUTDOWN_EX
command.
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240126041126.1927228-20-michael.roth@amd.com
Brijesh Singh [Fri, 26 Jan 2024 04:11:18 +0000 (22:11 -0600)]
crypto: ccp: Handle legacy SEV commands when SNP is enabled
The behavior of legacy SEV commands is altered when the firmware is
initialized for SNP support. In that case, all command buffer memory
that may get written to by legacy SEV commands must be marked as
firmware-owned in the RMP table prior to issuing the command.
Additionally, when a command buffer contains a system physical address
that points to additional buffers that firmware may write to, special
handling is needed depending on whether:
1) the system physical address points to guest memory
2) the system physical address points to host memory
To handle case #1, the pages of these buffers are changed to
firmware-owned in the RMP table before issuing the command, and restored
to hypervisor-owned after the command completes.
For case #2, a bounce buffer is used instead of the original address.
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Co-developed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240126041126.1927228-19-michael.roth@amd.com
Tom Lendacky [Fri, 26 Jan 2024 04:11:17 +0000 (22:11 -0600)]
crypto: ccp: Handle non-volatile INIT_EX data when SNP is enabled
For SEV/SEV-ES, a buffer can be used to access non-volatile data so it
can be initialized from a file specified by the init_ex_path CCP module
parameter instead of relying on the SPI bus for NV storage, and
afterward the buffer can be read from to sync new data back to the file.
When SNP is enabled, the pages comprising this buffer need to be set to
firmware-owned in the RMP table before they can be accessed by firmware
for subsequent updates to the initial contents.
Implement that handling here.
[ bp: Carve out allocation into a helper. ]
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Co-developed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240126041126.1927228-18-michael.roth@amd.com
Brijesh Singh [Fri, 26 Jan 2024 04:11:16 +0000 (22:11 -0600)]
crypto: ccp: Handle the legacy TMR allocation when SNP is enabled
The behavior and requirement for the SEV-legacy command is altered when
the SNP firmware is in the INIT state. See SEV-SNP firmware ABI
specification for more details.
Allocate the Trusted Memory Region (TMR) as a 2MB-sized/aligned region
when SNP is enabled to satisfy new requirements for SNP. Continue
allocating a 1MB-sized region for !SNP configuration.
[ bp: Carve out TMR allocation into a helper. ]
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Co-developed-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240126041126.1927228-17-michael.roth@amd.com
Ashish Kalra [Fri, 26 Jan 2024 04:11:15 +0000 (22:11 -0600)]
x86/sev: Introduce an SNP leaked pages list
Pages are unsafe to be released back to the page-allocator if they
have been transitioned to firmware/guest state and can't be reclaimed
or transitioned back to hypervisor/shared state. In this case, add them
to an internal leaked pages list to ensure that they are not freed or
touched/accessed to cause fatal page faults.
[ mdr: Relocate to arch/x86/virt/svm/sev.c ]
Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Link: https://lore.kernel.org/r/20240126041126.1927228-16-michael.roth@amd.com
Brijesh Singh [Fri, 26 Jan 2024 04:11:14 +0000 (22:11 -0600)]
crypto: ccp: Provide an API to issue SEV and SNP commands
Export sev_do_cmd() as a generic API for the hypervisor to issue
commands to manage an SEV or an SNP guest. The commands for SEV and SNP
are defined in the SEV and SEV-SNP firmware specifications.
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240126041126.1927228-15-michael.roth@amd.com
Brijesh Singh [Fri, 26 Jan 2024 04:11:13 +0000 (22:11 -0600)]
crypto: ccp: Add support to initialize the AMD-SP for SEV-SNP
Before SNP VMs can be launched, the platform must be appropriately
configured and initialized via the SNP_INIT command.
During the execution of SNP_INIT command, the firmware configures
and enables SNP security policy enforcement in many system components.
Some system components write to regions of memory reserved by early
x86 firmware (e.g. UEFI). Other system components write to regions
provided by the operation system, hypervisor, or x86 firmware.
Such system components can only write to HV-fixed pages or Default
pages. They will error when attempting to write to pages in other page
states after SNP_INIT enables their SNP enforcement.
Starting in SNP firmware v1.52, the SNP_INIT_EX command takes a list of
system physical address ranges to convert into the HV-fixed page states
during the RMP initialization. If INIT_RMP is 1, hypervisors should
provide all system physical address ranges that the hypervisor will
never assign to a guest until the next RMP re-initialization.
For instance, the memory that UEFI reserves should be included in the
range list. This allows system components that occasionally write to
memory (e.g. logging to UEFI reserved regions) to not fail due to
RMP initialization and SNP enablement.
Note that SNP_INIT(_EX) must not be executed while non-SEV guests are
executing, otherwise it is possible that the system could reset or hang.
The psp_init_on_probe module parameter was added for SEV/SEV-ES support
and the init_ex_path module parameter to allow for time for the
necessary file system to be mounted/available.
SNP_INIT(_EX) does not use the file associated with init_ex_path. So, to
avoid running into issues where SNP_INIT(_EX) is called while there are
other running guests, issue it during module probe regardless of the
psp_init_on_probe setting, but maintain the previous deferrable handling
for SEV/SEV-ES initialization.
[ mdr: Squash in psp_init_on_probe changes from Tom, reduce
proliferation of 'probe' function parameter where possible.
bp: Fix 32-bit allmodconfig build. ]
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Co-developed-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Co-developed-by: Jarkko Sakkinen <jarkko@profian.com>
Signed-off-by: Jarkko Sakkinen <jarkko@profian.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240126041126.1927228-14-michael.roth@amd.com
Brijesh Singh [Fri, 26 Jan 2024 04:11:12 +0000 (22:11 -0600)]
crypto: ccp: Define the SEV-SNP commands
AMD introduced the next generation of SEV called SEV-SNP (Secure Nested
Paging). SEV-SNP builds upon existing SEV and SEV-ES functionality while
adding new hardware security protection.
Define the commands and structures used to communicate with the AMD-SP
when creating and managing the SEV-SNP guests. The SEV-SNP firmware spec
is available at developer.amd.com/sev.
[ mdr: update SNP command list and SNP status struct based on current
spec, use C99 flexible arrays, fix kernel-doc issues. ]
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Co-developed-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240126041126.1927228-13-michael.roth@amd.com
Michael Roth [Fri, 26 Jan 2024 04:11:11 +0000 (22:11 -0600)]
x86/sev: Adjust the directmap to avoid inadvertent RMP faults
If the kernel uses a 2MB or larger directmap mapping to write to an
address, and that mapping contains any 4KB pages that are set to private
in the RMP table, an RMP #PF will trigger and cause a host crash.
SNP-aware code that owns the private PFNs will never attempt such
a write, but other kernel tasks writing to other PFNs in the range may
trigger these checks inadvertently due to writing to those other PFNs
via a large directmap mapping that happens to also map a private PFN.
Prevent this by splitting any 2MB+ mappings that might end up containing
a mix of private/shared PFNs as a result of a subsequent RMPUPDATE for
the PFN/rmp_level passed in.
Another way to handle this would be to limit the directmap to 4K
mappings in the case of hosts that support SNP, but there is potential
risk for performance regressions of certain host workloads.
Handling it as-needed results in the directmap being slowly split over
time, which lessens the risk of a performance regression since the more
the directmap gets split as a result of running SNP guests, the more
likely the host is being used primarily to run SNP guests, where
a mostly-split directmap is actually beneficial since there is less
chance of TLB flushing and cpa_lock contention being needed to perform
these splits.
Cases where a host knows in advance it wants to primarily run SNP guests
and wishes to pre-split the directmap can be handled by adding
a tuneable in the future, but preliminary testing has shown this to not
provide a signficant benefit in the common case of guests that are
backed primarily by 2MB THPs, so it does not seem to be warranted
currently and can be added later if a need arises in the future.
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Link: https://lore.kernel.org/r/20240126041126.1927228-12-michael.roth@amd.com
Brijesh Singh [Fri, 26 Jan 2024 04:11:10 +0000 (22:11 -0600)]
x86/sev: Add helper functions for RMPUPDATE and PSMASH instruction
The RMPUPDATE instruction updates the access restrictions for a page via
its corresponding entry in the RMP Table. The hypervisor will use the
instruction to enforce various access restrictions on pages used for
confidential guests and other specialized functionality. See APM3 for
details on the instruction operations.
The PSMASH instruction expands a 2MB RMP entry in the RMP table into a
corresponding set of contiguous 4KB RMP entries while retaining the
state of the validated bit from the original 2MB RMP entry. The
hypervisor will use this instruction in cases where it needs to re-map a
page as 4K rather than 2MB in a guest's nested page table.
Add helpers to make use of these instructions.
[ mdr: add RMPUPDATE retry logic for transient FAIL_OVERLAP errors. ]
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
Link: https://lore.kernel.org/r/20240126041126.1927228-11-michael.roth@amd.com
Michael Roth [Fri, 26 Jan 2024 04:11:09 +0000 (22:11 -0600)]
x86/fault: Dump RMP table information when RMP page faults occur
RMP faults on kernel addresses are fatal and should never happen in
practice. They indicate a bug in the host kernel somewhere. Userspace
RMP faults shouldn't occur either, since even for VMs the memory used
for private pages is handled by guest_memfd and by design is not
mappable by userspace.
Dump RMP table information about the PFN corresponding to the faulting
HVA to help diagnose any issues of this sort when show_fault_oops() is
triggered by an RMP fault.
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240126041126.1927228-10-michael.roth@amd.com
Brijesh Singh [Fri, 26 Jan 2024 04:11:08 +0000 (22:11 -0600)]
x86/traps: Define RMP violation #PF error code
Bit 31 in the page fault-error bit will be set when processor encounters
an RMP violation.
While at it, use the BIT() macro.
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Link: https://lore.kernel.org/r/20240126041126.1927228-9-michael.roth@amd.com
Brijesh Singh [Fri, 26 Jan 2024 04:11:07 +0000 (22:11 -0600)]
x86/fault: Add helper for dumping RMP entries
This information will be useful for debugging things like page faults
due to RMP access violations and RMPUPDATE failures.
[ mdr: move helper to standalone patch, rework dump logic as suggested
by Boris. ]
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240126041126.1927228-8-michael.roth@amd.com
Brijesh Singh [Fri, 26 Jan 2024 04:11:06 +0000 (22:11 -0600)]
x86/sev: Add RMP entry lookup helpers
Add a helper that can be used to access information contained in the RMP
entry corresponding to a particular PFN. This will be needed to make
decisions on how to handle setting up mappings in the NPT in response to
guest page-faults and handling things like cleaning up pages and setting
them back to the default hypervisor-owned state when they are no longer
being used for private data.
[ mdr: separate 'assigned' indicator from return code, and simplify
function signatures for various helpers. ]
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Co-developed-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240126041126.1927228-7-michael.roth@amd.com
Ashish Kalra [Fri, 26 Jan 2024 04:11:05 +0000 (22:11 -0600)]
x86/mtrr: Don't print errors if MtrrFixDramModEn is set when SNP enabled
SNP enabled platforms require the MtrrFixDramModeEn bit to be set across
all CPUs when SNP is enabled. Therefore, don't print error messages when
MtrrFixDramModeEn is set when bringing CPUs online.
Closes: https://lore.kernel.org/kvm/68b2d6bf-bce7-47f9-bebb-2652cc923ff9@linux.microsoft.com/
Reported-by: Jeremi Piotrowski <jpiotrowski@linux.microsoft.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240126041126.1927228-6-michael.roth@amd.com
Brijesh Singh [Fri, 26 Jan 2024 04:11:04 +0000 (22:11 -0600)]
x86/sev: Add SEV-SNP host initialization support
The memory integrity guarantees of SEV-SNP are enforced through a new
structure called the Reverse Map Table (RMP). The RMP is a single data
structure shared across the system that contains one entry for every 4K
page of DRAM that may be used by SEV-SNP VMs. The APM Volume 2 section
on Secure Nested Paging (SEV-SNP) details a number of steps needed to
detect/enable SEV-SNP and RMP table support on the host:
- Detect SEV-SNP support based on CPUID bit
- Initialize the RMP table memory reported by the RMP base/end MSR
registers and configure IOMMU to be compatible with RMP access
restrictions
- Set the MtrrFixDramModEn bit in SYSCFG MSR
- Set the SecureNestedPagingEn and VMPLEn bits in the SYSCFG MSR
- Configure IOMMU
RMP table entry format is non-architectural and it can vary by
processor. It is defined by the PPR document for each respective CPU
family. Restrict SNP support to CPU models/families which are compatible
with the current RMP table entry format to guard against any undefined
behavior when running on other system types. Future models/support will
handle this through an architectural mechanism to allow for broader
compatibility.
SNP host code depends on CONFIG_KVM_AMD_SEV config flag which may be
enabled even when CONFIG_AMD_MEM_ENCRYPT isn't set, so update the
SNP-specific IOMMU helpers used here to rely on CONFIG_KVM_AMD_SEV
instead of CONFIG_AMD_MEM_ENCRYPT.
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Co-developed-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Co-developed-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Co-developed-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Co-developed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Link: https://lore.kernel.org/r/20240126041126.1927228-5-michael.roth@amd.com
Ashish Kalra [Fri, 26 Jan 2024 04:11:03 +0000 (22:11 -0600)]
iommu/amd: Don't rely on external callers to enable IOMMU SNP support
Currently, the expectation is that the kernel will call
amd_iommu_snp_enable() to perform various checks and set the
amd_iommu_snp_en flag that the IOMMU uses to adjust its setup routines
to account for additional requirements on hosts where SNP is enabled.
This is somewhat fragile as it relies on this call being done prior to
IOMMU setup. It is more robust to just do this automatically as part of
IOMMU initialization, so rework the code accordingly.
There is still a need to export information about whether or not the
IOMMU is configured in a manner compatible with SNP, so relocate the
existing amd_iommu_snp_en flag so it can be used to convey that
information in place of the return code that was previously provided by
calls to amd_iommu_snp_enable().
While here, also adjust the kernel messages related to IOMMU SNP
enablement for consistency/grammar/clarity.
Suggested-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Co-developed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20240126041126.1927228-4-michael.roth@amd.com
Kim Phillips [Fri, 26 Jan 2024 04:11:02 +0000 (22:11 -0600)]
x86/speculation: Do not enable Automatic IBRS if SEV-SNP is enabled
Without SEV-SNP, Automatic IBRS protects only the kernel. But when
SEV-SNP is enabled, the Automatic IBRS protection umbrella widens to all
host-side code, including userspace. This protection comes at a cost:
reduced userspace indirect branch performance.
To avoid this performance loss, don't use Automatic IBRS on SEV-SNP
hosts and all back to retpolines instead.
[ mdr: squash in changes from review discussion. ]
Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Link: https://lore.kernel.org/r/20240126041126.1927228-3-michael.roth@amd.com
Brijesh Singh [Fri, 26 Jan 2024 04:11:01 +0000 (22:11 -0600)]
x86/cpufeatures: Add SEV-SNP CPU feature
Add CPU feature detection for Secure Encrypted Virtualization with
Secure Nested Paging. This feature adds a strong memory integrity
protection to help prevent malicious hypervisor-based attacks like
data replay, memory re-mapping, and more.
Since enabling the SNP CPU feature imposes a number of additional
requirements on host initialization and handling legacy firmware APIs
for SEV/SEV-ES guests, only introduce the CPU feature bit so that the
relevant handling can be added, but leave it disabled via a
disabled-features mask.
Once all the necessary changes needed to maintain legacy SEV/SEV-ES
support are introduced in subsequent patches, the SNP feature bit will
be unmasked/enabled.
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Jarkko Sakkinen <jarkko@profian.com>
Signed-off-by: Ashish Kalra <Ashish.Kalra@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240126041126.1927228-2-michael.roth@amd.com
Ard Biesheuvel [Fri, 26 Jan 2024 16:39:19 +0000 (17:39 +0100)]
x86/sme: Fix memory encryption setting if enabled by default and not overridden
Commit
cbebd68f59f0 ("x86/mm: Fix use of uninitialized buffer in sme_enable()")
'fixed' an issue in sme_enable() detected by static analysis, and broke
the common case in the process.
cmdline_find_option() will return < 0 on an error, or when the command
line argument does not appear at all. In this particular case, the
latter is not an error condition, and so the early exit is wrong.
Instead, without mem_encrypt= on the command line, the compile time
default should be honoured, which could be to enable memory encryption,
and this is currently broken.
Fix it by setting sme_me_mask to a preliminary value based on the
compile time default, and only omitting the command line argument test
when cmdline_find_option() returns an error.
[ bp: Drop active_by_default while at it. ]
Fixes: cbebd68f59f0 ("x86/mm: Fix use of uninitialized buffer in sme_enable()")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Link: https://lore.kernel.org/r/20240126163918.2908990-2-ardb+git@google.com
Kirill A. Shutemov [Wed, 24 Jan 2024 14:02:16 +0000 (16:02 +0200)]
x86/mm: Fix memory encryption features advertisement
When memory encryption is enabled, the kernel prints the encryption
flavor that the system supports.
The check assumes that everything is AMD SME/SEV if it doesn't have
the TDX CPU feature set.
Hyper-V vTOM sets cc_vendor to CC_VENDOR_INTEL when it runs as L2 guest
on top of TDX, but not X86_FEATURE_TDX_GUEST. Hyper-V only needs memory
encryption enabled for I/O without the rest of CoCo enabling.
To avoid confusion, check the cc_vendor directly.
[ bp: Massage commit message. ]
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Jeremi Piotrowski <jpiotrowski@linux.microsoft.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Acked-by: Kai Huang <kai.huang@intel.com>
Link: https://lore.kernel.org/r/20240124140217.533748-1-kirill.shutemov@linux.intel.com
Borislav Petkov (AMD) [Fri, 5 Jan 2024 10:14:07 +0000 (11:14 +0100)]
x86/sev: Harden #VC instruction emulation somewhat
Compare the opcode bytes at rIP for each #VC exit reason to verify the
instruction which raised the #VC exception is actually the right one.
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Link: https://lore.kernel.org/r/20240105101407.11694-1-bp@alien8.de
Linus Torvalds [Mon, 29 Jan 2024 01:01:12 +0000 (17:01 -0800)]
Linux 6.8-rc2
Linus Torvalds [Sun, 28 Jan 2024 21:55:56 +0000 (13:55 -0800)]
Merge tag 'cxl-fixes-6.8-rc2' of git://git./linux/kernel/git/cxl/cxl
Pull cxl fixes from Dan Williams:
"A build regression fix, a device compatibility fix, and an original
bug preventing creation of large (16 device) interleave sets:
- Fix unit test build regression fallout from global
"missing-prototypes" change
- Fix compatibility with devices that do not support interrupts
- Fix overflow when calculating the capacity of large interleave sets"
* tag 'cxl-fixes-6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
cxl/region:Fix overflow issue in alloc_hpa()
cxl/pci: Skip irq features if MSI/MSI-X are not supported
tools/testing/nvdimm: Disable "missing prototypes / declarations" warnings
tools/testing/cxl: Disable "missing prototypes / declarations" warnings
Linus Torvalds [Sun, 28 Jan 2024 18:43:06 +0000 (10:43 -0800)]
Merge tag 'mips-fixes_6.8_1' of git://git./linux/kernel/git/mips/linux
Pull MIPS fixes from Thomas Bogendoerfer:
- fix boot issue on single core Lantiq Danube devices
- fix boot issue on Loongson64 platforms
- fix improper FPU setup
- fix missing prototypes issues
* tag 'mips-fixes_6.8_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
mips: Call lose_fpu(0) before initializing fcr31 in mips_set_personality_nan
MIPS: loongson64: set nid for reserved memblock region
Revert "MIPS: loongson64: set nid for reserved memblock region"
MIPS: lantiq: register smp_ops on non-smp platforms
MIPS: loongson64: set nid for reserved memblock region
MIPS: reserve exception vector space ONLY ONCE
MIPS: BCM63XX: Fix missing prototypes
MIPS: sgi-ip32: Fix missing prototypes
MIPS: sgi-ip30: Fix missing prototypes
MIPS: fw arc: Fix missing prototypes
MIPS: sgi-ip27: Fix missing prototypes
MIPS: Alchemy: Fix missing prototypes
MIPS: Cobalt: Fix missing prototypes
Linus Torvalds [Sun, 28 Jan 2024 18:38:16 +0000 (10:38 -0800)]
Merge tag 'locking_urgent_for_v6.8_rc2' of git://git./linux/kernel/git/tip/tip
Pull locking fix from Borislav Petkov:
- Prevent an inconsistent futex operation leading to stale state
exposure
* tag 'locking_urgent_for_v6.8_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
futex: Prevent the reuse of stale pi_state
Linus Torvalds [Sun, 28 Jan 2024 18:34:55 +0000 (10:34 -0800)]
Merge tag 'irq_urgent_for_v6.8_rc2' of git://git./linux/kernel/git/tip/tip
Pull irq fix from Borislav Petkov:
- Initialize the resend node of each IRQ descriptor, not only the first
one
* tag 'irq_urgent_for_v6.8_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq: Initialize resend_node hlist for all interrupt descriptors
Linus Torvalds [Sun, 28 Jan 2024 18:33:14 +0000 (10:33 -0800)]
Merge tag 'timers_urgent_for_v6.8_rc2' of git://git./linux/kernel/git/tip/tip
Pull timer fixes from Borislav Petkov:
- Preserve the number of idle calls and sleep entries across CPU
hotplug events in order to be able to compute correct averages
- Limit the duration of the clocksource watchdog checking interval as
too long intervals lead to wrongly marking the TSC as unstable
* tag 'timers_urgent_for_v6.8_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tick/sched: Preserve number of idle sleeps across CPU hotplug events
clocksource: Skip watchdog check for large watchdog intervals
Linus Torvalds [Sun, 28 Jan 2024 17:45:11 +0000 (09:45 -0800)]
Merge tag 'x86_urgent_for_v6.8_rc2' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
- Make sure 32-bit syscall registers are properly sign-extended
- Add detection for AMD's Zen5 generation CPUs and Intel's Clearwater
Forest CPU model number
- Make a stub function export non-GPL because it is part of the
paravirt alternatives and that can be used by non-GPL code
* tag 'x86_urgent_for_v6.8_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/CPU/AMD: Add more models to X86_FEATURE_ZEN5
x86/entry/ia32: Ensure s32 is sign extended to s64
x86/cpu: Add model number for Intel Clearwater Forest processor
x86/CPU/AMD: Add X86_FEATURE_ZEN5
x86/paravirt: Make BUG_func() usable by non-GPL modules
Linus Torvalds [Sun, 28 Jan 2024 17:41:39 +0000 (09:41 -0800)]
Merge tag 'fixes-2024-01-28' of git://git./linux/kernel/git/rppt/memblock
Pull memblock fix from Mike Rapoport:
"Fix crash when reserved memory is not added to memory.
When CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled, the initialization
of reserved pages may cause access of NODE_DATA() with invalid nid and
crash.
Add a fall back to early_pfn_to_nid() in memmap_init_reserved_pages()
to ensure a valid node id is always passed to init_reserved_page()"
* tag 'fixes-2024-01-28' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
memblock: fix crash when reserved memory is not added to memory
Linus Torvalds [Sat, 27 Jan 2024 17:48:55 +0000 (09:48 -0800)]
Merge tag 'platform-drivers-x86-v6.8-2' of git://git./linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver fixes from Hans de Goede:
- WMI bus driver fixes
- Second attempt (previously reverted) at P2SB PCI rescan deadlock fix
- AMD PMF driver improvements
- MAINTAINERS updates
- Misc other small fixes and hw-id additions
* tag 'platform-drivers-x86-v6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
platform/x86: touchscreen_dmi: Add info for the TECLAST X16 Plus tablet
platform/x86/intel/ifs: Call release_firmware() when handling errors.
platform/x86/amd/pmf: Fix memory leak in amd_pmf_get_pb_data()
platform/x86/amd/pmf: Get ambient light information from AMD SFH driver
platform/x86/amd/pmf: Get Human presence information from AMD SFH driver
platform/mellanox: mlxbf-pmc: Fix offset calculation for crspace events
platform/mellanox: mlxbf-tmfifo: Drop Tx network packet when Tx TmFIFO is full
MAINTAINERS: remove defunct acpi4asus project info from asus notebooks section
MAINTAINERS: add Luke Jones as maintainer for asus notebooks
MAINTAINERS: Remove Perry Yuan as DELL WMI HARDWARE PRIVACY SUPPORT maintainer
platform/x86: silicom-platform: Add missing "Description:" for power_cycle sysfs attr
platform/x86: intel-wmi-sbl-fw-update: Fix function name in error message
platform/x86: p2sb: Use pci_resource_n() in p2sb_read_bar0()
platform/x86: p2sb: Allow p2sb_bar() calls during PCI device probe
platform/x86: intel-uncore-freq: Fix types in sysfs callbacks
platform/x86: wmi: Fix wmi_dev_probe()
platform/x86: wmi: Fix notify callback locking
platform/x86: wmi: Decouple legacy WMI notify handlers from wmi_block_list
platform/x86: wmi: Return immediately if an suitable WMI event is found
platform/x86: wmi: Fix error handling in legacy WMI notify handler functions
Linus Torvalds [Sat, 27 Jan 2024 17:44:40 +0000 (09:44 -0800)]
Merge tag 'loongarch-fixes-6.8-1' of git://git./linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch fixes from Huacai Chen:
"Fix boot failure on machines with more than 8 nodes, and fix two build
errors about KVM"
* tag 'loongarch-fixes-6.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
LoongArch: KVM: Add returns to SIMD stubs
LoongArch: KVM: Fix build due to API changes
LoongArch/smp: Call rcutree_report_cpu_starting() at tlb_init()
Linus Torvalds [Sat, 27 Jan 2024 17:17:01 +0000 (09:17 -0800)]
Merge tag 'xfs-6.8-fixes-1' of git://git./fs/xfs/xfs-linux
Pull xfs fix from Chandan Babu:
- Fix read only mounts when using fsopen mount API
* tag 'xfs-6.8-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: read only mounts with fsopen mount API are busted
Linus Torvalds [Sat, 27 Jan 2024 17:11:52 +0000 (09:11 -0800)]
Merge tag 'bcachefs-2024-01-26' of https://evilpiepirate.org/git/bcachefs
Pull bcachefs fixes from Kent Overstreet:
- fix for REQ_OP_FLUSH usage; this fixes filesystems going read only
with -EOPNOTSUPP from the block layer.
(this really should have gone in with the block layer patch causing
the -EOPNOTSUPP, or should have gone in before).
- fix an allocation in non-sleepable context
- fix one source of srcu lock latency, on devices with terrible discard
latency
- fix a reattach_inode() issue in fsck
* tag 'bcachefs-2024-01-26' of https://evilpiepirate.org/git/bcachefs:
bcachefs: __lookup_dirent() works in snapshot, not subvol
bcachefs: discard path uses unlock_long()
bcachefs: fix incorrect usage of REQ_OP_FLUSH
bcachefs: Add gfp flags param to bch2_prt_task_backtrace()
Linus Torvalds [Sat, 27 Jan 2024 17:06:56 +0000 (09:06 -0800)]
Merge tag '6.8-rc2-smb3-server-fixes' of git://git.samba.org/ksmbd
Pull smb server fixes from Steve French:
- Fix netlink OOB
- Minor kernel doc fix
* tag '6.8-rc2-smb3-server-fixes' of git://git.samba.org/ksmbd:
ksmbd: fix global oob in ksmbd_nl_policy
smb: Fix some kernel-doc comments
Linus Torvalds [Sat, 27 Jan 2024 17:02:42 +0000 (09:02 -0800)]
Merge tag '6.8-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:
"Nine cifs/smb client fixes
- Four network error fixes (three relating to replays of requests
that need to be retried, and one fixing some places where we were
returning the wrong rc up the stack on network errors)
- Two multichannel fixes including locking fix and case where subset
of channels need reconnect
- netfs integration fixup: share remote i_size with netfslib
- Two small cleanups (one for addressing a clang warning)"
* tag '6.8-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: fix stray unlock in cifs_chan_skip_or_disable
cifs: set replay flag for retries of write command
cifs: commands that are retried should have replay flag set
cifs: helper function to check replayable error codes
cifs: translate network errors on send to -ECONNABORTED
cifs: cifs_pick_channel should try selecting active channels
cifs: Share server EOF pos with netfslib
smb: Work around Clang __bdos() type confusion
smb: client: delete "true", "false" defines
Xi Ruoyao [Fri, 26 Jan 2024 21:05:57 +0000 (05:05 +0800)]
mips: Call lose_fpu(0) before initializing fcr31 in mips_set_personality_nan
If we still own the FPU after initializing fcr31, when we are preempted
the dirty value in the FPU will be read out and stored into fcr31,
clobbering our setting. This can cause an improper floating-point
environment after execve(). For example:
zsh% cat measure.c
#include <fenv.h>
int main() { return fetestexcept(FE_INEXACT); }
zsh% cc measure.c -o measure -lm
zsh% echo $((1.0/3)) # raising FE_INEXACT
0.
33333333333333331
zsh% while ./measure; do ; done
(stopped in seconds)
Call lose_fpu(0) before setting fcr31 to prevent this.
Closes: https://lore.kernel.org/linux-mips/7a6aa1bbdbbe2e63ae96ff163fab0349f58f1b9e.camel@xry111.site/
Fixes: 9b26616c8d9d ("MIPS: Respect the ISA level in FCSR handling")
Cc: stable@vger.kernel.org
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Huang Pei [Sat, 27 Jan 2024 09:12:21 +0000 (17:12 +0800)]
MIPS: loongson64: set nid for reserved memblock region
Commit
61167ad5fecd("mm: pass nid to reserve_bootmem_region()") reveals
that reserved memblock regions have no valid node id set, just set it
right since loongson64 firmware makes it clear in memory layout info.
This works around booting failure on 3A1000+ since commit
61167ad5fecd
("mm: pass nid to reserve_bootmem_region()") under
CONFIG_DEFERRED_STRUCT_PAGE_INIT.
Signed-off-by: Huang Pei <huangpei@loongson.cn>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Thomas Bogendoerfer [Sat, 27 Jan 2024 10:07:49 +0000 (11:07 +0100)]
Revert "MIPS: loongson64: set nid for reserved memblock region"
This reverts commit
ce7b1b97776ec0b068c4dd6b6dbb48ae09a23519.
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Linus Torvalds [Fri, 26 Jan 2024 23:24:00 +0000 (15:24 -0800)]
Merge tag 'ata-6.8-rc2' of git://git./linux/kernel/git/libata/linux
Pull ata updates from Niklas Cassel:
- Fix an incorrect link_power_management_policy sysfs attribute value.
We were previously using the same attribute value for two different
LPM policies (me)
- Add a ASMedia ASM1166 quirk.
The SATA host controller always reports that it has 32 ports, even
though it only has six ports. Add a quirk that overrides the value
reported by the controller (Conrad)
- Add a ASMedia ASM1061 quirk.
The SATA host controller completely ignores the upper 21 bits of the
DMA address. This causes IOMMU error events when a (valid) DMA
address actually has any of the upper 21 bits set. Add a quirk that
limits the dma_mask to 43-bits (Lennert)
* tag 'ata-6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
ahci: add 43-bit DMA address quirk for ASMedia ASM1061 controllers
ahci: asm1166: correct count of reported ports
ata: libata-sata: improve sysfs description for ATA_LPM_UNKNOWN
Linus Torvalds [Fri, 26 Jan 2024 23:19:43 +0000 (15:19 -0800)]
Merge tag 'block-6.8-2024-01-26' of git://git.kernel.dk/linux
Pull block fixes from Jens Axboe:
- RCU warning fix for md (Mikulas)
- Fix for an aoe issue that lockdep rightfully complained about
(Maksim)
- Fix for an error code change in partitioning that caused a regression
with some tools (Li)
- Fix for a data direction warning with bi-direction commands
(Christian)
* tag 'block-6.8-2024-01-26' of git://git.kernel.dk/linux:
md: fix a suspicious RCU usage warning
aoe: avoid potential deadlock at set_capacity
block: Fix WARNING in _copy_from_iter
block: Move checking GENHD_FL_NO_PART to bdev_add_partition()
Linus Torvalds [Fri, 26 Jan 2024 23:17:42 +0000 (15:17 -0800)]
Merge tag 'io_uring-6.8-2024-01-26' of git://git.kernel.dk/linux
Pull io_uring fix from Jens Axboe:
"Just a single tweak to the newly added IORING_OP_FIXED_FD_INSTALL from
Paul, ensuring it goes via the audit path and playing it safe by
excluding it from using registered creds"
* tag 'io_uring-6.8-2024-01-26' of git://git.kernel.dk/linux:
io_uring: enable audit and restrict cred override for IORING_OP_FIXED_FD_INSTALL
Linus Torvalds [Fri, 26 Jan 2024 23:06:23 +0000 (15:06 -0800)]
Merge tag 'thermal-6.8-rc2' of git://git./linux/kernel/git/rafael/linux-pm
Pull thermal control update from Rafael Wysocki:
"Remove some dead code from the Intel powerclamp thermal control driver
(Srinivas Pandruvada)"
* tag 'thermal-6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
thermal: intel: powerclamp: Remove dead code for target mwait value
Linus Torvalds [Fri, 26 Jan 2024 22:53:28 +0000 (14:53 -0800)]
Merge tag 'pm-6.8-rc2' of git://git./linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These fix two cpufreq drivers and the cpupower utility.
Specifics:
- Fix the handling of scaling_max/min_freq sysfs attributes in the
AMD P-state cpufreq driver (Mario Limonciello)
- Make the intel_pstate cpufreq driver avoid unnecessary computation
of the HWP performance level corresponding to a given frequency in
the cases when it is known already, which also helps to avoid
reducing the maximum CPU capacity artificially on some systems
(Rafael J. Wysocki)
- Fix compilation of the cpupower utility when CFLAGS is passed as a
make argument for cpupower, but it does not take effect as expected
due to mishandling (Stanley Chan)"
* tag 'pm-6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq/amd-pstate: Fix setting scaling max/min freq values
cpufreq: intel_pstate: Refine computation of P-state for given frequency
tools cpupower bench: Override CFLAGS assignments
Linus Torvalds [Fri, 26 Jan 2024 22:51:41 +0000 (14:51 -0800)]
Merge tag 'docs-6.8-fixes' of git://git.lwn.net/linux
Pull documentation fixes from Jonathan Corbet:
"A handful of relatively boring documentation fixes"
* tag 'docs-6.8-fixes' of git://git.lwn.net/linux:
docs: admin-guide: remove obsolete advice related to SLAB allocator
doc: admin-guide/kernel-parameters: remove useless comment
docs/accel: correct links to mailing list archives
docs/sphinx: Fix TOC scroll hack for the home page
Linus Torvalds [Fri, 26 Jan 2024 21:52:18 +0000 (13:52 -0800)]
Merge tag 'drm-fixes-2024-01-27' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Lots going on for rc2, ivpu has a bunch of stabilisation and debugging
work, then amdgpu and xe are the main fixes. i915, exynos have a few,
then some misc panel and bridge fixes.
Worth mentioning are three regressions. One of the nouveau fixes in
6.7 for a serious deadlock had side effects, so I guess we will bring
back the deadlock until I can figure out what should be done properly.
There was a scheduler regression vs amdgpu which was reported in a few
places and is now fixed. There was an i915 vs simpledrm problem
resulting in black screens, that is reverted also.
I'll be working on a proper nouveau fix, it kinda looks like one of
those cases where someone tried to use an atomic where they should
have probably used a lock, but I'll see.
fb:
- fix simpledrm/i915 regression by reverting change
scheduler:
- fix regression affecting amdgpu users due to sched draining
nouveau:
- revert 6.7 deadlock fix as it has side effects
dp:
- fix documentation warning
ttm:
- fix dummy page read on some platforms
bridge:
- anx7625 suspend fix
- sii902x: fix probing and audio registration
- parade-ps8640: fix suspend of bridge, aux fixes
- samsung-dsim: avoid using FORCE_STOP_STATE
panel:
- simple add missing bus flags
- fix samsung-s6d7aa0 flags
amdgpu:
- AC/DC power supply tracking fix
- Don't show invalid vram vendor data
- SMU 13.0.x fixes
- GART fix for umr on systems without VRAM
- GFX 10/11 UNORD_DISPATCH fixes
- IPS display fixes (required for S0ix on some platforms)
- Misc fixes
i915:
- DSI sequence revert to fix GitLab #10071 and DP test-pattern fix
- Drop -Wstringop-overflow (broken on GCC11)
ivpu:
- fix recovery/reset support
- improve submit ioctl stability
- fix dev open/close races on unbind
- PLL disable reset fix
- deprecate context priority param
- improve debug buffer logging
- disable buffer sharing across VPU contexts
- free buffer sgt on unbind
- fix missing lock around shmem vmap
- add better boot diagnostics
- add more debug prints around mapping
- dump MMU events in case of timeout
v3d:
- NULL ptr dereference fix
exynos:
- fix stack usage
- fix incorrect type
- fix dt typo
- fix gsc runtime resume
xe:
- Make an ops struct static
- Fix an implicit 0 to NULL conversion
- A couple of 32-bit fixes
- A migration coherency fix for Lunar Lake.
- An error path vm id leak fix
- Remove PVC references in kunit tests"
* tag 'drm-fixes-2024-01-27' of git://anongit.freedesktop.org/drm/drm: (66 commits)
Revert "nouveau: push event block/allowing out of the fence context"
drm: bridge: samsung-dsim: Don't use FORCE_STOP_STATE
drm/sched: Drain all entities in DRM sched run job worker
drm/amd/display: "Enable IPS by default"
drm/amd: Add a DC debug mask for IPS
drm/amd/display: Disable ips before dc interrupt setting
drm/amd/display: Replay + IPS + ABM in Full Screen VPB
drm/amd/display: Add IPS checks before dcn register access
drm/amd/display: Add Replay IPS register for DMUB command table
drm/amd/display: Allow IPS2 during Replay
drm/amdgpu/gfx11: set UNORD_DISPATCH in compute MQDs
drm/amdgpu/gfx10: set UNORD_DISPATCH in compute MQDs
drm/amd/amdgpu: Assign GART pages to AMD device mapping
drm/amd/pm: Fetch current power limit from FW
drm/amdgpu: Fix null pointer dereference
drm/amdgpu: Show vram vendor only if available
drm/amd/pm: update the power cap setting
drm/amdgpu: Avoid fetching vram vendor information
drm/amdgpu/pm: Fix the power source flag error
drm/amd/display: Fix uninitialized variable usage in core_link_ 'read_dpcd() & write_dpcd()' functions
...
Linus Torvalds [Fri, 26 Jan 2024 21:22:59 +0000 (13:22 -0800)]
Merge tag 'asm-generic-6.8-2' of git://git./linux/kernel/git/arnd/asm-generic
Pull asm-generic update from Arnd Bergmann:
"Just one patch this time, adding Andreas Larsson as co-maintainer for
arch/sparc. He is volunteering to help since David Miller has become
much less active over the past few years.
In turn, I'm helping Andreas get set up as a new maintainer, starting
with the entry in the MAINTAINERS file"
* tag 'asm-generic-6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
MAINTAINERS: Add Andreas Larsson as co-maintainer for arch/sparc
Linus Torvalds [Fri, 26 Jan 2024 21:09:38 +0000 (13:09 -0800)]
Merge tag 'arm-fixes-6.8-1' of git://git./linux/kernel/git/soc/soc
Pull arm SoC fixes from Arnd Bergmann:
"There are a couple of devicetree fixes for samsung, riscv/sophgo, and
for TPM device nodes on a couple of platforms.
Both the Arm FF-A and the SCMI firmware drivers get a number of code
fixes, addressing minor implementation bugs and compatibility with
firmware implementations. Most of these bugs relate to the usage of
xarray and rwlock structures and are fixed by Cristian Marussi"
* tag 'arm-fixes-6.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
riscv: dts: sophgo: separate sg2042 mtime and mtimecmp to fit aclint format
arm64: dts: Fix TPM schema violations
ARM: dts: Fix TPM schema violations
ARM: dts: exynos4212-tab3: add samsung,invert-vclk flag to fimd
arm64: dts: exynos: gs101: comply with the new cmu_misc clock names
firmware: arm_ffa: Handle partitions setup failures
firmware: arm_ffa: Use xa_insert() and check for result
firmware: arm_ffa: Simplify ffa_partitions_cleanup()
firmware: arm_ffa: Check xa_load() return value
firmware: arm_ffa: Add missing rwlock_init() for the driver partition
firmware: arm_ffa: Add missing rwlock_init() in ffa_setup_partitions()
firmware: arm_scmi: Fix the clock protocol supported version
firmware: arm_scmi: Fix the clock protocol version for v3.2
firmware: arm_scmi: Use xa_insert() when saving raw queues
firmware: arm_scmi: Use xa_insert() to store opps
firmware: arm_scmi: Replace asm-generic/bug.h with linux/bug.h
firmware: arm_scmi: Check mailbox/SMT channel for consistency
Linus Torvalds [Fri, 26 Jan 2024 20:29:04 +0000 (12:29 -0800)]
Merge tag 'spi-fix-v6.8-rc1' of git://git./linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"As well as a few device IDs and the usual scattering of driver
specific fixes this contains a couple of core things.
One is a missed case in error handling, the other patch is a change
from me raising the number of chip selects allowed by the newly added
multi chip select support patches to resolve problems seen on several
systems that exceeded the limit.
This is not a real solution to the issue but rather just a change to
avoid disruption to users, one of the options I am considering is just
sending a revert of those changes if we can't come up with something
sensible"
* tag 'spi-fix-v6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: fix finalize message on error return
spi: cs42l43: Handle error from devm_pm_runtime_enable
spi: Raise limit on number of chip selects
spi: hisi-sfc-v3xx: Return IRQ_NONE if no interrupts were detected
spi: spi-cadence: Reverse the order of interleaved write and read operations
spi: spi-imx: Use dev_err_probe for failed DMA channel requests
spi: bcm-qspi: fix SFDP BFPT read by usig mspi read
spi: intel-pci: Add support for Arrow Lake SPI serial flash
spi: intel-pci: Remove Meteor Lake-S SoC PCI ID from the list
Linus Torvalds [Fri, 26 Jan 2024 20:26:02 +0000 (12:26 -0800)]
Merge tag 'gpio-fixes-for-v6.8-rc2' of git://git./linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
- add a quirk to GPIO ACPI handling to ignore touchpad wakeups on GPD
G1619-04
- clear interrupt status bits (that may have been set before enabling
the interrupts) after setting the interrupt type in gpio-eic-sprd
* tag 'gpio-fixes-for-v6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpio: eic-sprd: Clear interrupt after set the interrupt type
gpiolib: acpi: Ignore touchpad wakeup on GPD G1619-04
Linus Torvalds [Fri, 26 Jan 2024 20:11:49 +0000 (12:11 -0800)]
Merge tag 'media/v6.8-3' of git://git./linux/kernel/git/mchehab/linux-media
Pull media fixes from Mauro Carvalho Chehab:
- remove K3 DT prefix from wave5
- vb2 core: fix missing caps on VIDIO_CREATE_BUFS under certain
circumstances
- videobuf2: Stop direct calls to queue num_buffers field
* tag 'media/v6.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
media: vb2: refactor setting flags and caps, fix missing cap
media: media videobuf2: Stop direct calls to queue num_buffers field
media: chips-media: wave5: Remove K3 References
dt-bindings: media: Remove K3 Family Prefix from Compatible
Phoenix Chen [Fri, 26 Jan 2024 09:53:08 +0000 (17:53 +0800)]
platform/x86: touchscreen_dmi: Add info for the TECLAST X16 Plus tablet
Add touch screen info for TECLAST X16 Plus tablet.
Signed-off-by: Phoenix Chen <asbeltogf@gmail.com>
Link: https://lore.kernel.org/r/20240126095308.5042-1-asbeltogf@gmail.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Jithu Joseph [Thu, 25 Jan 2024 08:22:50 +0000 (00:22 -0800)]
platform/x86/intel/ifs: Call release_firmware() when handling errors.
Missing release_firmware() due to error handling blocked any future image
loading.
Fix the return code and release_fiwmare() to release the bad image.
Fixes: 25a76dbb36dd ("platform/x86/intel/ifs: Validate image size")
Reported-by: Pengfei Xu <pengfei.xu@intel.com>
Signed-off-by: Jithu Joseph <jithu.joseph@intel.com>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Tested-by: Pengfei Xu <pengfei.xu@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20240125082254.424859-2-ashok.raj@intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Cong Liu [Wed, 24 Jan 2024 01:29:38 +0000 (09:29 +0800)]
platform/x86/amd/pmf: Fix memory leak in amd_pmf_get_pb_data()
amd_pmf_get_pb_data() will allocate memory for the policy buffer,
but does not free it if copy_from_user() fails. This leads to a memory
leak.
Fixes: 10817f28e533 ("platform/x86/amd/pmf: Add capability to sideload of policy binary")
Reviewed-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Signed-off-by: Cong Liu <liucong2@kylinos.cn>
Link: https://lore.kernel.org/r/20240124012939.6550-1-liucong2@kylinos.cn
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Shyam Sundar S K [Tue, 23 Jan 2024 14:14:58 +0000 (19:44 +0530)]
platform/x86/amd/pmf: Get ambient light information from AMD SFH driver
AMD SFH driver has APIs defined to export the ambient light information;
use this within the PMF driver to send inputs to the PMF TA, so that PMF
driver can enact to the actions coming from the TA.
Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20240123141458.3715211-2-Shyam-sundar.S-k@amd.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Shyam Sundar S K [Tue, 23 Jan 2024 14:14:57 +0000 (19:44 +0530)]
platform/x86/amd/pmf: Get Human presence information from AMD SFH driver
AMD SFH driver has APIs defined to export the human presence information;
use this within the PMF driver to send inputs to the PMF TA, so that PMF
driver can enact to the actions coming from the TA.
Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20240123141458.3715211-1-Shyam-sundar.S-k@amd.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Rafael J. Wysocki [Fri, 26 Jan 2024 18:16:48 +0000 (19:16 +0100)]
Merge branch 'pm-cpufreq'
Merge cpufreq fixes for 6.8-rc2:
- Fix the handling of scaling_max/min_freq sysfs attributes in the AMD
P-state cpufreq driver (Mario Limonciello).
- Make the intel_pstate cpufreq driver avoid unnecessary computation of
the HWP performance level corresponding to a given frequency in the
cases when it is known already, which also helps to avoid reducing
the maximum CPU capacity artificially on some systems (Rafael J.
Wysocki).
* pm-cpufreq:
cpufreq/amd-pstate: Fix setting scaling max/min freq values
cpufreq: intel_pstate: Refine computation of P-state for given frequency
Dave Airlie [Fri, 26 Jan 2024 18:12:14 +0000 (04:12 +1000)]
Merge tag 'drm-misc-fixes-for-v6.8-rc2' of git://git./linux/kernel/git/daeinki/drm-exynos into drm-fixes
One regression fixup to samsung-dsim.c module
- The FORCE_STOP_STATE bit is ineffective for forcing DSI link into LP-11 mode,
causing timing issues and potential bridge failures.
This patch reverts previous commits and corrects this issue.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Inki Dae <inki.dae@samsung.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240126141130.15512-1-inki.dae@samsung.com
Dave Airlie [Fri, 26 Jan 2024 18:04:34 +0000 (04:04 +1000)]
Revert "nouveau: push event block/allowing out of the fence context"
This reverts commit
eacabb5462717a52fccbbbba458365a4f5e61f35.
This commit causes some regressions in desktop usage, this will
reintroduce the original deadlock in DRI_PRIME situations, I've
got an idea to fix it by offloading to a workqueue in a different
spot, however this code has a race condition where we sometimes
miss interrupts so I'd like to fix that as well.
Cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Fri, 26 Jan 2024 17:58:24 +0000 (03:58 +1000)]
Merge tag 'drm-intel-fixes-2024-01-26' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
- PSR fix for HSW
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZbPGBL9lj4DxxIW1@jlahtine-mobl.ger.corp.intel.com
Dave Airlie [Fri, 26 Jan 2024 17:56:02 +0000 (03:56 +1000)]
Merge tag 'drm-misc-fixes-2024-01-26' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
Plenty of ivpu fixes to improve the general stability and debugging, a
suspend fix for the anx7625 bridge, a revert to fix an initialization
order bug between i915 and simpledrm and a documentation warning fix for
dp_mst.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/tp77e5fokigup6cgmpq6mtg46kzdw2dpze6smpnwfoml4kmwpq@bo6mbkezpkle
Andreas Larsson [Mon, 15 Jan 2024 15:02:00 +0000 (16:02 +0100)]
MAINTAINERS: Add Andreas Larsson as co-maintainer for arch/sparc
Dave has not been very active on arch/sparc for the past two years.
I have been contributing to the SPARC32 port as well as maintaining
out-of-tree SPARC32 patches for LEON3/4/5 (SPARCv8 with CAS support)
since 2012. I am willing to step up as an arch/sparc (co-)maintainer.
For recent discussions on the matter, see [1] and [2].
[1] https://lore.kernel.org/r/
20230713075235.
2164609-1-u.kleine-koenig@pengutronix.de
[2] https://lore.kernel.org/r/
20231209105816.GA1085691@ravnborg.org/
Signed-off-by: Andreas Larsson <andreas@gaisler.com>
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Acked-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Acked-by: Jose E. Marchesi <jose.marchesi@oracle.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Michael Walle [Mon, 13 Nov 2023 16:43:44 +0000 (17:43 +0100)]
drm: bridge: samsung-dsim: Don't use FORCE_STOP_STATE
The FORCE_STOP_STATE bit is unsuitable to force the DSI link into LP-11
mode. It seems the bridge internally queues DSI packets and when the
FORCE_STOP_STATE bit is cleared, they are sent in close succession
without any useful timing (this also means that the DSI lanes won't go
into LP-11 mode). The length of this gibberish varies between 1ms and
5ms. This sometimes breaks an attached bridge (TI SN65DSI84 in this
case). In our case, the bridge will fail in about 1 per 500 reboots.
The FORCE_STOP_STATE handling was introduced to have the DSI lanes in
LP-11 state during the .pre_enable phase. But as it turns out, none of
this is needed at all. Between samsung_dsim_init() and
samsung_dsim_set_display_enable() the lanes are already in LP-11 mode.
The code as it was before commit
20c827683de0 ("drm: bridge:
samsung-dsim: Fix init during host transfer") and
0c14d3130654 ("drm:
bridge: samsung-dsim: Fix i.MX8M enable flow to meet spec") was correct
in this regard.
This patch basically reverts both commits. It was tested on an i.MX8M
SoC with an SN65DSI84 bridge. The signals were probed and the DSI
packets were decoded during initialization and link start-up. After this
patch the first DSI packet on the link is a VSYNC packet and the timing
is correct.
Command mode between .pre_enable and .enable was also briefly tested by
a quick hack. There was no DSI link partner which would have responded,
but it was made sure the DSI packet was send on the link. As a side
note, the command mode seems to just work in HS mode. I couldn't find
that the bridge will handle commands in LP mode.
Fixes: 20c827683de0 ("drm: bridge: samsung-dsim: Fix init during host transfer")
Fixes: 0c14d3130654 ("drm: bridge: samsung-dsim: Fix i.MX8M enable flow to meet spec")
Signed-off-by: Michael Walle <mwalle@kernel.org>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231113164344.1612602-1-mwalle@kernel.org
Inochi Amaoto [Fri, 26 Jan 2024 09:20:00 +0000 (17:20 +0800)]
riscv: dts: sophgo: separate sg2042 mtime and mtimecmp to fit aclint format
Change the timer layout in the dtb to fit the format that needed by
the SBI.
Fixes: 967a94a92aaa ("riscv: dts: add initial Sophgo SG2042 SoC device tree")
Reviewed-by: Chen Wang <unicorn_wang@outlook.com>
Reviewed-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Inochi Amaoto <inochiama@outlook.com>
Signed-off-by: Chen Wang <unicorn_wang@outlook.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Aleksander Jan Bajkowski [Mon, 22 Jan 2024 18:47:09 +0000 (19:47 +0100)]
MIPS: lantiq: register smp_ops on non-smp platforms
Lantiq uses a common kernel config for devices with 24Kc and 34Kc cores.
The changes made previously to add support for interrupts on all cores
work on 24Kc platforms with SMP disabled and 34Kc platforms with SMP
enabled. This patch fixes boot issues on Danube (single core 24Kc) with
SMP enabled.
Fixes: 730320fd770d ("MIPS: lantiq: enable all hardware interrupts on second VPE")
Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Huang Pei [Tue, 23 Jan 2024 01:47:58 +0000 (09:47 +0800)]
MIPS: loongson64: set nid for reserved memblock region
Commit
61167ad5fecd("mm: pass nid to reserve_bootmem_region()") reveals
that reserved memblock regions have no valid node id set, just set it
right since loongson64 firmware makes it clear in memory layout info.
This works around booting failure on 3A1000+ since commit
61167ad5fecd
("mm: pass nid to reserve_bootmem_region()") under
CONFIG_DEFERRED_STRUCT_PAGE_INIT.
Signed-off-by: Huang Pei <huangpei@loongson.cn>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Huang Pei [Tue, 23 Jan 2024 01:47:57 +0000 (09:47 +0800)]
MIPS: reserve exception vector space ONLY ONCE
"cpu_probe" is called both by BP and APs, but reserving exception vector
(like 0x0-0x1000) called by "cpu_probe" need once and calling on APs is
too late since memblock is unavailable at that time.
So, reserve exception vector ONLY by BP.
Suggested-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Huang Pei <huangpei@loongson.cn>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Florian Fainelli [Tue, 23 Jan 2024 17:46:54 +0000 (09:46 -0800)]
MIPS: BCM63XX: Fix missing prototypes
Most of the symbols for which we do not have a prototype can actually be
made static and for the few that cannot, there is already a declaration
in a header for it.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Randy Dunlap [Fri, 26 Jan 2024 08:22:07 +0000 (16:22 +0800)]
LoongArch: KVM: Add returns to SIMD stubs
The stubs for kvm_own/lsx()/kvm_own_lasx() when CONFIG_CPU_HAS_LSX or
CONFIG_CPU_HAS_LASX is not defined should have a return value since they
return an int, so add "return -EINVAL;" to the stubs.
Fixes the build error:
In file included from ../arch/loongarch/include/asm/kvm_csr.h:12,
from ../arch/loongarch/kvm/interrupt.c:8:
../arch/loongarch/include/asm/kvm_vcpu.h: In function 'kvm_own_lasx':
../arch/loongarch/include/asm/kvm_vcpu.h:73:39: error: no return statement in function returning non-void [-Werror=return-type]
73 | static inline int kvm_own_lasx(struct kvm_vcpu *vcpu) { }
Fixes: db1ecca22edf ("LoongArch: KVM: Add LSX (128bit SIMD) support")
Fixes: 118e10cd893d ("LoongArch: KVM: Add LASX (256bit SIMD) support")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Huacai Chen [Fri, 26 Jan 2024 08:22:07 +0000 (16:22 +0800)]
LoongArch: KVM: Fix build due to API changes
Commit
8569992d64b8f750e34b7858eac ("KVM: Use gfn instead of hva for
mmu_notifier_retry") replaces mmu_invalidate_retry_hva() usage with
mmu_invalidate_retry_gfn() for X86, LoongArch also need similar changes
to fix build.
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Huacai Chen [Fri, 26 Jan 2024 08:22:07 +0000 (16:22 +0800)]
LoongArch/smp: Call rcutree_report_cpu_starting() at tlb_init()
Machines which have more than 8 nodes fail to boot SMP after commit
a2ccf46333d7b2cf96 ("LoongArch/smp: Call rcutree_report_cpu_starting()
earlier"). Because such machines use tlb-based per-cpu base address
rather than dmw-based per-cpu base address, resulting per-cpu variables
can only be accessed after tlb_init(). But rcutree_report_cpu_starting()
is now called before tlb_init() and accesses per-cpu variables indeed.
Since the original patch want to avoid the lockdep warning caused by
page allocation in tlb_init(), we can move rcutree_report_cpu_starting()
to tlb_init() where after tlb exception configuration but before page
allocation.
Fixes: a2ccf46333d7b2cf96 ("LoongArch/smp: Call rcutree_report_cpu_starting() earlier")
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Matthew Brost [Wed, 24 Jan 2024 21:08:11 +0000 (13:08 -0800)]
drm/sched: Drain all entities in DRM sched run job worker
All entities must be drained in the DRM scheduler run job worker to
avoid the following case. An entity found that is ready, no job found
ready on entity, and run job worker goes idle with other entities + jobs
ready. Draining all ready entities (i.e. loop over all ready entities)
in the run job worker ensures all job that are ready will be scheduled.
Cc: Thorsten Leemhuis <regressions@leemhuis.info>
Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Closes: https://lore.kernel.org/all/CABXGCsM2VLs489CH-vF-1539-s3in37=bwuOWtoeeE+q26zE+Q@mail.gmail.com/
Reported-and-tested-by: Mario Limonciello <mario.limonciello@amd.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3124
Link: https://lore.kernel.org/all/20240123021155.2775-1-mario.limonciello@amd.com/
Reported-and-tested-by: Vlastimil Babka <vbabka@suse.cz>
Closes: https://lore.kernel.org/dri-devel/05ddb2da-b182-4791-8ef7-82179fd159a8@amd.com/T/#m0c31d4d1b9ae9995bb880974c4f1dbaddc33a48a
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240124210811.1639040-1-matthew.brost@intel.com
Dave Airlie [Fri, 26 Jan 2024 02:39:51 +0000 (12:39 +1000)]
Merge tag 'amd-drm-fixes-6.8-2024-01-25' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-6.8-2024-01-25:
amdgpu:
- AC/DC power supply tracking fix
- Don't show invalid vram vendor data
- SMU 13.0.x fixes
- GART fix for umr on systems without VRAM
- GFX 10/11 UNORD_DISPATCH fixes
- IPS display fixes (required for S0ix on some platforms)
- Misc fixes
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240125221503.5019-1-alexander.deucher@amd.com
Dave Airlie [Thu, 25 Jan 2024 20:09:13 +0000 (06:09 +1000)]
Merge tag 'drm-xe-fixes-2024-01-25' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes
Driver Changes:
- Make an ops struct static
- Fix an implicit 0 to NULL conversion
- A couple of 32-bit fixes
- A migration coherency fix for Lunar Lake.
- An error path vm id leak fix
- Remove PVC references in kunit tests
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZbIb7l0EhpVp5cXE@fedora
Kent Overstreet [Wed, 24 Jan 2024 22:26:33 +0000 (17:26 -0500)]
bcachefs: __lookup_dirent() works in snapshot, not subvol
Add a new helper, bch2_hash_lookup_in_snapshot(), for when we're not
operating in a subvolume and already have a snapshot ID, and then use it
in lookup_lostfound() -> __lookup_dirent().
This is a bugfix - lookup_lostfound() doesn't take a subvolume ID, we
were passing a nonsense subvolume ID before, and don't have one to pass
since we may be operating in an interior snapshot node that doesn't have
a subvolume ID.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Jens Axboe [Fri, 26 Jan 2024 00:03:54 +0000 (17:03 -0700)]
Merge tag 'md-6.8-
20240126' of https://git./linux/kernel/git/song/md into block-6.8
Pull MD fix from Song:
"This change fixes a RCU warning."
* tag 'md-6.8-
20240126' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md:
md: fix a suspicious RCU usage warning
David Lechner [Thu, 25 Jan 2024 20:53:09 +0000 (14:53 -0600)]
spi: fix finalize message on error return
In __spi_pump_transfer_message(), the message was not finalized in the
first error return as it is in the other error return paths. Not
finalizing the message could cause anything waiting on the message to
complete to hang forever.
This adds the missing call to spi_finalize_current_message().
Fixes: ae7d2346dc89 ("spi: Don't use the message queue if possible in spi_sync")
Signed-off-by: David Lechner <dlechner@baylibre.com>
Link: https://msgid.link/r/20240125205312.3458541-2-dlechner@baylibre.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Roman Li [Tue, 23 Jan 2024 20:18:24 +0000 (15:18 -0500)]
drm/amd/display: "Enable IPS by default"
[Why]
IPS was temporary disabled due to instability.
It was fixed in dmub firmware and with:
- "drm/amd/display: Add IPS checks before dcn register access"
- "drm/amd/display: Disable ips before dc interrupt setting"
[How]
Enable IPS by default.
Disable IPS if 0x800 bit set in amdgpu.dcdebugmask module params
Signed-off-by: Roman Li <Roman.Li@amd.com>
Tested-by: Mark Broadworth <mark.broadworth@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Roman Li [Tue, 23 Jan 2024 20:14:28 +0000 (15:14 -0500)]
drm/amd: Add a DC debug mask for IPS
For debugging IPS-related issues, expose a new debug mask
that allows to disable IPS.
Usage:
amdgpu.dcdebugmask=0x800
Signed-off-by: Roman Li <Roman.Li@amd.com>
Tested-by: Mark Broadworth <mark.broadworth@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Roman Li [Mon, 22 Jan 2024 22:45:41 +0000 (17:45 -0500)]
drm/amd/display: Disable ips before dc interrupt setting
[Why]
While in IPS2 an access to dcn registers is not allowed.
If interrupt results in dc call, we should disable IPS.
[How]
Safeguard register access in IPS2 by disabling idle optimization
before calling dc interrupt setting api.
Signed-off-by: Roman Li <Roman.Li@amd.com>
Tested-by: Mark Broadworth <mark.broadworth@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
ChunTao Tso [Mon, 8 Jan 2024 05:46:59 +0000 (13:46 +0800)]
drm/amd/display: Replay + IPS + ABM in Full Screen VPB
[Why]
Because ABM will wait VStart to start getting histogram data,
it will cause we can't enter IPS while full screnn video playing.
[How]
Modify the panel refresh rate to the maximun multiple of current
refresh rate.
Reviewed-by: Dennis Chan <dennis.chan@amd.com>
Acked-by: Roman Li <roman.li@amd.com>
Signed-off-by: ChunTao Tso <chuntao.tso@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Roman Li [Tue, 9 Jan 2024 22:31:33 +0000 (17:31 -0500)]
drm/amd/display: Add IPS checks before dcn register access
[Why]
With IPS enabled a system hangs once PSR is active.
PSR active triggers transition to IPS2 state.
While in IPS2 an access to dcn registers results in hard hang.
Existing check doesn't cover for PSR sequence.
[How]
Safeguard register access by disabling idle optimization in atomic commit
and crtc scanout. It will be re-enabled on next vblank.
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Acked-by: Roman Li <roman.li@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alvin Lee [Tue, 2 Jan 2024 17:02:39 +0000 (12:02 -0500)]
drm/amd/display: Add Replay IPS register for DMUB command table
- Introduce a new Replay mode for DMUB version 0.0.199.0
Reviewed-by: Martin Leung <martin.leung@amd.com>
Acked-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alvin Lee <alvin.lee2@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Nicholas Kazlauskas [Thu, 7 Dec 2023 19:12:03 +0000 (14:12 -0500)]
drm/amd/display: Allow IPS2 during Replay
[Why & How]
Add regkey to block video playback in IPS2 by default
Allow idle optimizations in the same spot we allow Replay for
video playback usecases.
Avoid sending it when there's an external display connected by
modifying the allow idle checks to check for active non-eDP screens.
Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Acked-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Fri, 19 Jan 2024 17:32:59 +0000 (12:32 -0500)]
drm/amdgpu/gfx11: set UNORD_DISPATCH in compute MQDs
This needs to be set to 1 to avoid a potential deadlock in
the GC 10.x and newer. On GC 9.x and older, this needs
to be set to 0. This can lead to hangs in some mixed
graphics and compute workloads. Updated firmware is also
required for AQL.
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Alex Deucher [Fri, 19 Jan 2024 17:23:55 +0000 (12:23 -0500)]
drm/amdgpu/gfx10: set UNORD_DISPATCH in compute MQDs
This needs to be set to 1 to avoid a potential deadlock in
the GC 10.x and newer. On GC 9.x and older, this needs
to be set to 0. This can lead to hangs in some mixed
graphics and compute workloads. Updated firmware is also
required for AQL.
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Tom St Denis [Wed, 17 Jan 2024 17:47:37 +0000 (12:47 -0500)]
drm/amd/amdgpu: Assign GART pages to AMD device mapping
This allows kernel mapped pages like the PDB and PTB to be
read via the iomem debugfs when there is no vram in the system.
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.7.x
Lijo Lazar [Thu, 18 Jan 2024 08:55:35 +0000 (14:25 +0530)]
drm/amd/pm: Fetch current power limit from FW
Power limit of SMUv13.0.6 SOCs can be updated by out-of-band ways. Fetch
the limit from firmware instead of using cached values.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.7.x
Hawking Zhang [Mon, 22 Jan 2024 09:38:23 +0000 (17:38 +0800)]
drm/amdgpu: Fix null pointer dereference
amdgpu_reg_state_sysfs_fini could be invoked at the
time when asic_func is even not initialized, i.e.,
amdgpu_discovery_init fails for some reason.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Lijo Lazar [Sat, 20 Jan 2024 08:18:09 +0000 (13:48 +0530)]
drm/amdgpu: Show vram vendor only if available
Ony if vram vendor info is available, show in sysfs.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.7.x
Kenneth Feng [Fri, 19 Jan 2024 08:12:00 +0000 (16:12 +0800)]
drm/amd/pm: update the power cap setting
update the power cap setting for smu_v13.0.0/smu_v13.0.7
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2356
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Lijo Lazar [Sat, 20 Jan 2024 08:02:51 +0000 (13:32 +0530)]
drm/amdgpu: Avoid fetching vram vendor information
For GFX 9.4.3 APUs, the current method of fetching vram vendor
information is not reliable. Avoid fetching the information.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.7.x
Ma Jun [Wed, 17 Jan 2024 06:35:29 +0000 (14:35 +0800)]
drm/amdgpu/pm: Fix the power source flag error
The power source flag should be updated when
[1] System receives an interrupt indicating that the power source
has changed.
[2] System resumes from suspend or runtime suspend
Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Srinivasan Shanmugam [Wed, 17 Jan 2024 03:11:52 +0000 (08:41 +0530)]
drm/amd/display: Fix uninitialized variable usage in core_link_ 'read_dpcd() & write_dpcd()' functions
The 'status' variable in 'core_link_read_dpcd()' &
'core_link_write_dpcd()' was uninitialized.
Thus, initializing 'status' variable to 'DC_ERROR_UNEXPECTED' by default.
Fixes the below:
drivers/gpu/drm/amd/amdgpu/../display/dc/link/protocols/link_dpcd.c:226 core_link_read_dpcd() error: uninitialized symbol 'status'.
drivers/gpu/drm/amd/amdgpu/../display/dc/link/protocols/link_dpcd.c:248 core_link_write_dpcd() error: uninitialized symbol 'status'.
Cc: stable@vger.kernel.org
Cc: Jerry Zuo <jerry.zuo@amd.com>
Cc: Jun Lei <Jun.Lei@amd.com>
Cc: Wayne Lin <Wayne.Lin@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Yang Wang [Fri, 19 Jan 2024 03:32:41 +0000 (11:32 +0800)]
drm/amd/pm: udpate smu v13.0.6 message permission
update smu v13.0.6 message to allow guest driver set gfx clock.
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>