Linus Torvalds [Thu, 18 Nov 2021 20:13:24 +0000 (12:13 -0800)]
Merge tag 'for-5.16/parisc-4' of git://git./linux/kernel/git/deller/parisc-linux
Pull parisc fixes from Helge Deller:
"parisc bug and warning fixes and wire up futex_waitv.
Fix some warnings which showed up with allmodconfig builds, a revert
of a change to the sigreturn trampoline which broke signal handling,
wire up futex_waitv and add CONFIG_PRINTK_TIME=y to 32bit defconfig"
* tag 'for-5.16/parisc-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Enable CONFIG_PRINTK_TIME=y in 32bit defconfig
Revert "parisc: Reduce sigreturn trampoline to 3 instructions"
parisc: Wrap assembler related defines inside __ASSEMBLY__
parisc: Wire up futex_waitv
parisc: Include stringify.h to avoid build error in crypto/api.c
parisc/sticon: fix reverse colors
Linus Torvalds [Thu, 18 Nov 2021 20:05:22 +0000 (12:05 -0800)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
"Selftest changes:
- Cleanups for the perf test infrastructure and mapping hugepages
- Avoid contention on mmap_sem when the guests start to run
- Add event channel upcall support to xen_shinfo_test
x86 changes:
- Fixes for Xen emulation
- Kill kvm_map_gfn() / kvm_unmap_gfn() and broken gfn_to_pfn_cache
- Fixes for migration of 32-bit nested guests on 64-bit hypervisor
- Compilation fixes
- More SEV cleanups
Generic:
- Cap the return value of KVM_CAP_NR_VCPUS to both KVM_CAP_MAX_VCPUS
and num_online_cpus(). Most architectures were only using one of
the two"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (42 commits)
KVM: x86: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS
KVM: s390: Cap KVM_CAP_NR_VCPUS by num_online_cpus()
KVM: RISC-V: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS
KVM: PPC: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS
KVM: MIPS: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS
KVM: arm64: Cap KVM_CAP_NR_VCPUS by kvm_arm_default_max_vcpus()
KVM: x86: Assume a 64-bit hypercall for guests with protected state
selftests: KVM: Add /x86_64/sev_migrate_tests to .gitignore
riscv: kvm: fix non-kernel-doc comment block
KVM: SEV: Fix typo in and tweak name of cmd_allowed_from_miror()
KVM: SEV: Drop a redundant setting of sev->asid during initialization
KVM: SEV: WARN if SEV-ES is marked active but SEV is not
KVM: SEV: Set sev_info.active after initial checks in sev_guest_init()
KVM: SEV: Disallow COPY_ENC_CONTEXT_FROM if target has created vCPUs
KVM: Kill kvm_map_gfn() / kvm_unmap_gfn() and gfn_to_pfn_cache
KVM: nVMX: Use a gfn_to_hva_cache for vmptrld
KVM: nVMX: Use kvm_read_guest_offset_cached() for nested VMCS check
KVM: x86/xen: Use sizeof_field() instead of open-coding it
KVM: nVMX: Use kvm_{read,write}_guest_cached() for shadow_vmcs12
KVM: x86/xen: Fix get_attr of KVM_XEN_ATTR_TYPE_SHARED_INFO
...
Linus Torvalds [Thu, 18 Nov 2021 19:01:06 +0000 (11:01 -0800)]
Merge tag 'docs-5.16-2' of git://git.lwn.net/linux
Pull documentation fixes from Jonathan Corbet:
"A handful of documentation fixes for 5.16"
* tag 'docs-5.16-2' of git://git.lwn.net/linux:
Documentation/process: fix a cross reference
Documentation: update vcpu-requests.rst reference
docs: accounting: update delay-accounting.rst reference
libbpf: update index.rst reference
docs: filesystems: Fix grammatical error "with" to "which"
doc/zh_CN: fix a translation error in management-style
docs: ftrace: fix the wrong path of tracefs
Documentation: arm: marvell: Fix link to armada_1000_pb.pdf document
Documentation: arm: marvell: Put Armada XP section between Armada 370 and 375
Documentation: arm: marvell: Add some links to homepage / product infos
docs: Update Sphinx requirements
Linus Torvalds [Thu, 18 Nov 2021 18:50:45 +0000 (10:50 -0800)]
Merge tag 'printk-for-5.16-fixup' of git://git./linux/kernel/git/printk/linux
Pull printk fixes from Petr Mladek:
- Try to flush backtraces from other CPUs also on the local one. This
was a regression caused by printk_safe buffers removal.
- Remove header dependency warning.
* tag 'printk-for-5.16-fixup' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
printk: Remove printk.h inclusion in percpu.h
printk: restore flushing of NMI buffers on remote CPUs after NMI backtraces
Petr Mladek [Thu, 18 Nov 2021 09:03:47 +0000 (10:03 +0100)]
Merge branch 'rework/printk_safe-removal' into for-linus
Helge Deller [Wed, 17 Nov 2021 14:48:45 +0000 (15:48 +0100)]
parisc: Enable CONFIG_PRINTK_TIME=y in 32bit defconfig
Signed-off-by: Helge Deller <deller@gmx.de>
Helge Deller [Wed, 17 Nov 2021 10:05:07 +0000 (11:05 +0100)]
Revert "parisc: Reduce sigreturn trampoline to 3 instructions"
This reverts commit
e4f2006f1287e7ea17660490569cff323772dac4.
This patch shows problems with signal handling. Revert it for now.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org> # v5.15
Helge Deller [Tue, 16 Nov 2021 12:12:21 +0000 (13:12 +0100)]
parisc: Wrap assembler related defines inside __ASSEMBLY__
Building allmodconfig shows errors in the gpu/drm/msm snapdragon drivers,
because a COND() define is used there which conflicts with the COND() for
PA-RISC assembly. Although the snapdragon driver isn't relevant for parisc, it
is nevertheless compiled when CONFIG_COMPILE_TEST is defined.
Move the COND() define and other PA-RISC mnemonics inside the #ifdef
__ASSEMBLY__ part to avoid this conflict.
Signed-off-by: Helge Deller <deller@gmx.de>
Reported-by: kernel test robot <lkp@intel.com>
Helge Deller [Tue, 16 Nov 2021 12:11:26 +0000 (13:11 +0100)]
parisc: Wire up futex_waitv
Signed-off-by: Helge Deller <deller@gmx.de>
Helge Deller [Mon, 15 Nov 2021 16:34:50 +0000 (17:34 +0100)]
parisc: Include stringify.h to avoid build error in crypto/api.c
Include stringify.h to avoid this build error:
arch/parisc/include/asm/jump_label.h: error: expected ':' before '__stringify'
arch/parisc/include/asm/jump_label.h: error: label 'l_yes' defined but not used [-Werror=unused-label]
Signed-off-by: Helge Deller <deller@gmx.de>
Reported-by: kernel test robot <lkp@intel.com>
Vitaly Kuznetsov [Tue, 16 Nov 2021 16:34:43 +0000 (17:34 +0100)]
KVM: x86: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS
It doesn't make sense to return the recommended maximum number of
vCPUs which exceeds the maximum possible number of vCPUs.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <
20211116163443.88707-7-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Vitaly Kuznetsov [Tue, 16 Nov 2021 16:34:42 +0000 (17:34 +0100)]
KVM: s390: Cap KVM_CAP_NR_VCPUS by num_online_cpus()
KVM_CAP_NR_VCPUS is a legacy advisory value which on other architectures
return num_online_cpus() caped by KVM_CAP_NR_VCPUS or something else
(ppc and arm64 are special cases). On s390, KVM_CAP_NR_VCPUS returns
the same as KVM_CAP_MAX_VCPUS and this may turn out to be a bad
'advice'. Switch s390 to returning caped num_online_cpus() too.
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Message-Id: <
20211116163443.88707-6-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Vitaly Kuznetsov [Tue, 16 Nov 2021 16:34:41 +0000 (17:34 +0100)]
KVM: RISC-V: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS
It doesn't make sense to return the recommended maximum number of
vCPUs which exceeds the maximum possible number of vCPUs.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Acked-by: Anup Patel <anup.patel@wdc.com>
Reviewed-by: Anup Patel <anup.patel@wdc.com>
Message-Id: <
20211116163443.88707-5-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Vitaly Kuznetsov [Tue, 16 Nov 2021 16:34:40 +0000 (17:34 +0100)]
KVM: PPC: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS
It doesn't make sense to return the recommended maximum number of
vCPUs which exceeds the maximum possible number of vCPUs.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <
20211116163443.88707-4-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Vitaly Kuznetsov [Tue, 16 Nov 2021 16:34:39 +0000 (17:34 +0100)]
KVM: MIPS: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS
It doesn't make sense to return the recommended maximum number of
vCPUs which exceeds the maximum possible number of vCPUs.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <
20211116163443.88707-3-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Vitaly Kuznetsov [Tue, 16 Nov 2021 16:34:38 +0000 (17:34 +0100)]
KVM: arm64: Cap KVM_CAP_NR_VCPUS by kvm_arm_default_max_vcpus()
Generally, it doesn't make sense to return the recommended maximum number
of vCPUs which exceeds the maximum possible number of vCPUs.
Note: ARM64 is special as the value returned by KVM_CAP_MAX_VCPUS differs
depending on whether it is a system-wide ioctl or a per-VM one. Previously,
KVM_CAP_NR_VCPUS didn't have this difference and it seems preferable to
keep the status quo. Cap KVM_CAP_NR_VCPUS by kvm_arm_default_max_vcpus()
which is what gets returned by system-wide KVM_CAP_MAX_VCPUS.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <
20211116163443.88707-2-vkuznets@redhat.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tom Lendacky [Mon, 24 May 2021 17:48:57 +0000 (12:48 -0500)]
KVM: x86: Assume a 64-bit hypercall for guests with protected state
When processing a hypercall for a guest with protected state, currently
SEV-ES guests, the guest CS segment register can't be checked to
determine if the guest is in 64-bit mode. For an SEV-ES guest, it is
expected that communication between the guest and the hypervisor is
performed to shared memory using the GHCB. In order to use the GHCB, the
guest must have been in long mode, otherwise writes by the guest to the
GHCB would be encrypted and not be able to be comprehended by the
hypervisor.
Create a new helper function, is_64_bit_hypercall(), that assumes the
guest is in 64-bit mode when the guest has protected state, and returns
true, otherwise invoking is_64_bit_mode() to determine the mode. Update
the hypercall related routines to use is_64_bit_hypercall() instead of
is_64_bit_mode().
Add a WARN_ON_ONCE() to is_64_bit_mode() to catch occurences of calls to
this helper function for a guest running with protected state.
Fixes: f1c6366e3043 ("KVM: SVM: Add required changes to support intercepts under SEV-ES")
Reported-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <
e0b20c770c9d0d1403f23d83e785385104211f74.
1621878537.git.thomas.lendacky@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Arnaldo Carvalho de Melo [Tue, 16 Nov 2021 15:03:25 +0000 (12:03 -0300)]
selftests: KVM: Add /x86_64/sev_migrate_tests to .gitignore
$ git status
nothing to commit, working tree clean
$
$ make -C tools/testing/selftests/kvm/ > /dev/null 2>&1
$ git status
Untracked files:
(use "git add <file>..." to include in what will be committed)
tools/testing/selftests/kvm/x86_64/sev_migrate_tests
nothing added to commit but untracked files present (use "git add" to track)
$
Fixes: 6a58150859fdec76 ("selftest: KVM: Add intra host migration tests")
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Marc Orr <marcorr@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Gonda <pgonda@google.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Message-Id: <YZPIPfvYgRDCZi/w@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Randy Dunlap [Sun, 7 Nov 2021 03:47:06 +0000 (20:47 -0700)]
riscv: kvm: fix non-kernel-doc comment block
Don't use "/**" to begin a comment block for a non-kernel-doc comment.
Prevents this docs build warning:
vcpu_sbi.c:3: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
* Copyright (c) 2019 Western Digital Corporation or its affiliates.
Fixes: dea8ee31a039 ("RISC-V: KVM: Add SBI v0.1 support")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Atish Patra <atish.patra@wdc.com>
Cc: Anup Patel <anup.patel@wdc.com>
Cc: kvm@vger.kernel.org
Cc: kvm-riscv@lists.infradead.org
Cc: linux-riscv@lists.infradead.org
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Message-Id: <
20211107034706.30672-1-rdunlap@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Tue, 16 Nov 2021 13:08:19 +0000 (08:08 -0500)]
Merge branch 'kvm-5.16-fixes' into kvm-master
* Fixes for Xen emulation
* Kill kvm_map_gfn() / kvm_unmap_gfn() and broken gfn_to_pfn_cache
* Fixes for migration of 32-bit nested guests on 64-bit hypervisor
* Compilation fixes
* More SEV cleanups
Sean Christopherson [Tue, 9 Nov 2021 21:51:01 +0000 (21:51 +0000)]
KVM: SEV: Fix typo in and tweak name of cmd_allowed_from_miror()
Rename cmd_allowed_from_miror() to is_cmd_allowed_from_mirror(), fixing
a typo and making it obvious that the result is a boolean where
false means "not allowed".
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <
20211109215101.
2211373-7-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Tue, 9 Nov 2021 21:51:00 +0000 (21:51 +0000)]
KVM: SEV: Drop a redundant setting of sev->asid during initialization
Remove a fully redundant write to sev->asid during SEV/SEV-ES guest
initialization. The ASID is set a few lines earlier prior to the call to
sev_platform_init(), which doesn't take "sev" as a param, i.e. can't
muck with the ASID barring some truly magical behind-the-scenes code.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <
20211109215101.
2211373-6-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Tue, 9 Nov 2021 21:50:59 +0000 (21:50 +0000)]
KVM: SEV: WARN if SEV-ES is marked active but SEV is not
WARN if the VM is tagged as SEV-ES but not SEV. KVM relies on SEV and
SEV-ES being set atomically, and guards common flows with "is SEV", i.e.
observing SEV-ES without SEV means KVM has a fatal bug.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <
20211109215101.
2211373-5-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Tue, 9 Nov 2021 21:50:58 +0000 (21:50 +0000)]
KVM: SEV: Set sev_info.active after initial checks in sev_guest_init()
Set sev_info.active during SEV/SEV-ES activation before calling any code
that can potentially consume sev_info.es_active, e.g. set "active" and
"es_active" as a pair immediately after the initial sanity checks. KVM
generally expects that es_active can be true if and only if active is
true, e.g. sev_asid_new() deliberately avoids sev_es_guest() so that it
doesn't get a false negative. This will allow WARNing in sev_es_guest()
if the VM is tagged as SEV-ES but not SEV.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <
20211109215101.
2211373-4-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Tue, 9 Nov 2021 21:50:56 +0000 (21:50 +0000)]
KVM: SEV: Disallow COPY_ENC_CONTEXT_FROM if target has created vCPUs
Reject COPY_ENC_CONTEXT_FROM if the destination VM has created vCPUs.
KVM relies on SEV activation to occur before vCPUs are created, e.g. to
set VMCB flags and intercepts correctly.
Fixes: 54526d1fd593 ("KVM: x86: Support KVM VMs sharing SEV context")
Cc: stable@vger.kernel.org
Cc: Peter Gonda <pgonda@google.com>
Cc: Marc Orr <marcorr@google.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Nathan Tempelman <natet@google.com>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <
20211109215101.
2211373-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
David Woodhouse [Mon, 15 Nov 2021 16:50:27 +0000 (16:50 +0000)]
KVM: Kill kvm_map_gfn() / kvm_unmap_gfn() and gfn_to_pfn_cache
In commit
7e2175ebd695 ("KVM: x86: Fix recording of guest steal time /
preempted status") I removed the only user of these functions because
it was basically impossible to use them safely.
There are two stages to the GFN->PFN mapping; first through the KVM
memslots to a userspace HVA and then through the page tables to
translate that HVA to an underlying PFN. Invalidations of the former
were being handled correctly, but no attempt was made to use the MMU
notifiers to invalidate the cache when the HVA->GFN mapping changed.
As a prelude to reinventing the gfn_to_pfn_cache with more usable
semantics, rip it out entirely and untangle the implementation of
the unsafe kvm_vcpu_map()/kvm_vcpu_unmap() functions from it.
All current users of kvm_vcpu_map() also look broken right now, and
will be dealt with separately. They broadly fall into two classes:
* Those which map, access the data and immediately unmap. This is
mostly gratuitous and could just as well use the existing user
HVA, and could probably benefit from a gfn_to_hva_cache as they
do so.
* Those which keep the mapping around for a longer time, perhaps
even using the PFN directly from the guest. These will need to
be converted to the new gfn_to_pfn_cache and then kvm_vcpu_map()
can be removed too.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <
20211115165030.7422-8-dwmw2@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
David Woodhouse [Mon, 15 Nov 2021 16:50:26 +0000 (16:50 +0000)]
KVM: nVMX: Use a gfn_to_hva_cache for vmptrld
And thus another call to kvm_vcpu_map() can die.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <
20211115165030.7422-7-dwmw2@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
David Woodhouse [Mon, 15 Nov 2021 16:50:25 +0000 (16:50 +0000)]
KVM: nVMX: Use kvm_read_guest_offset_cached() for nested VMCS check
Kill another mostly gratuitous kvm_vcpu_map() which could just use the
userspace HVA for it.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <
20211115165030.7422-6-dwmw2@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
David Woodhouse [Mon, 15 Nov 2021 16:50:23 +0000 (16:50 +0000)]
KVM: x86/xen: Use sizeof_field() instead of open-coding it
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <
20211115165030.7422-4-dwmw2@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
David Woodhouse [Mon, 15 Nov 2021 16:50:24 +0000 (16:50 +0000)]
KVM: nVMX: Use kvm_{read,write}_guest_cached() for shadow_vmcs12
Using kvm_vcpu_map() for reading from the guest is entirely gratuitous,
when all we do is a single memcpy and unmap it again. Fix it up to use
kvm_read_guest()... but in fact I couldn't bring myself to do that
without also making it use a gfn_to_hva_cache for both that *and* the
copy in the other direction.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <
20211115165030.7422-5-dwmw2@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
David Woodhouse [Mon, 15 Nov 2021 16:50:21 +0000 (16:50 +0000)]
KVM: x86/xen: Fix get_attr of KVM_XEN_ATTR_TYPE_SHARED_INFO
In commit
319afe68567b ("KVM: xen: do not use struct gfn_to_hva_cache") we
stopped storing this in-kernel as a GPA, and started storing it as a GFN.
Which means we probably should have stopped calling gpa_to_gfn() on it
when userspace asks for it back.
Cc: stable@vger.kernel.org
Fixes: 319afe68567b ("KVM: xen: do not use struct gfn_to_hva_cache")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <
20211115165030.7422-2-dwmw2@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Maxim Levitsky [Mon, 15 Nov 2021 13:18:37 +0000 (15:18 +0200)]
KVM: x86/mmu: include EFER.LMA in extended mmu role
Incorporate EFER.LMA into kvm_mmu_extended_role, as it used to compute the
guest root level and is not reflected in kvm_mmu_page_role.level when TDP
is in use. When simply running the guest, it is impossible for EFER.LMA
and kvm_mmu.root_level to get out of sync, as the guest cannot transition
from PAE paging to 64-bit paging without toggling CR0.PG, i.e. without
first bouncing through a different MMU context. And stuffing guest state
via KVM_SET_SREGS{,2} also ensures a full MMU context reset.
However, if KVM_SET_SREGS{,2} is followed by KVM_SET_NESTED_STATE, e.g. to
set guest state when migrating the VM while L2 is active, the vCPU state
will reflect L2, not L1. If L1 is using TDP for L2, then root_mmu will
have been configured using L2's state, despite not being used for L2. If
L2.EFER.LMA != L1.EFER.LMA, and L2 is using PAE paging, then root_mmu will
be configured for guest PAE paging, but will match the mmu_role for 64-bit
paging and cause KVM to not reconfigure root_mmu on the next nested VM-Exit.
Alternatively, the root_mmu's role could be invalidated after a successful
KVM_SET_NESTED_STATE that yields vcpu->arch.mmu != vcpu->arch.root_mmu,
i.e. that switches the active mmu to guest_mmu, but doing so is unnecessarily
tricky, and not even needed if L1 and L2 do have the same role (e.g., they
are both 64-bit guests and run with the same CR4).
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <
20211115131837.195527-3-mlevitsk@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Maxim Levitsky [Mon, 15 Nov 2021 13:18:36 +0000 (15:18 +0200)]
KVM: nVMX: don't use vcpu->arch.efer when checking host state on nested state load
When loading nested state, don't use check vcpu->arch.efer to get the
L1 host's 64-bit vs. 32-bit state and don't check it for consistency
with respect to VM_EXIT_HOST_ADDR_SPACE_SIZE, as register state in vCPU
may be stale when KVM_SET_NESTED_STATE is called---and architecturally
does not exist. When restoring L2 state in KVM, the CPU is placed in
non-root where nested VMX code has no snapshot of L1 host state: VMX
(conditionally) loads host state fields loaded on VM-exit, but they need
not correspond to the state before entry. A simple case occurs in KVM
itself, where the host RIP field points to vmx_vmexit rather than the
instruction following vmlaunch/vmresume.
However, for the particular case of L1 being in 32- or 64-bit mode
on entry, the exit controls can be treated instead as the source of
truth regarding the state of L1 on entry, and can be used to check
that vmcs12.VM_EXIT_HOST_ADDR_SPACE_SIZE matches vmcs12.HOST_EFER if
vmcs12.VM_EXIT_LOAD_IA32_EFER is set. The consistency check on CPU
EFER vs. vmcs12.VM_EXIT_HOST_ADDR_SPACE_SIZE, instead, happens only
on VM-Enter. That's because, again, there's conceptually no "current"
L1 EFER to check on KVM_SET_NESTED_STATE.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <
20211115131837.195527-2-mlevitsk@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
David Woodhouse [Sun, 14 Nov 2021 08:59:02 +0000 (08:59 +0000)]
KVM: Fix steal time asm constraints
In 64-bit mode, x86 instruction encoding allows us to use the low 8 bits
of any GPR as an 8-bit operand. In 32-bit mode, however, we can only use
the [abcd] registers. For which, GCC has the "q" constraint instead of
the less restrictive "r".
Also fix st->preempted, which is an input/output operand rather than an
input.
Fixes: 7e2175ebd695 ("KVM: x86: Fix recording of guest steal time / preempted status")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <
89bf72db1b859990355f9c40713a34e0d2d86c98.camel@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paul Durrant [Mon, 15 Nov 2021 14:41:31 +0000 (14:41 +0000)]
cpuid: kvm_find_kvm_cpuid_features() should be declared 'static'
The lack a static declaration currently results in:
arch/x86/kvm/cpuid.c:128:26: warning: no previous prototype for function 'kvm_find_kvm_cpuid_features'
when compiling with "W=1".
Reported-by: kernel test robot <lkp@intel.com>
Fixes: 760849b1476c ("KVM: x86: Make sure KVM_CPUID_FEATURES really are KVM_CPUID_FEATURES")
Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Message-Id: <
20211115144131.5943-1-pdurrant@amazon.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Linus Torvalds [Wed, 17 Nov 2021 23:55:07 +0000 (15:55 -0800)]
Merge tag 'gfs2-v5.16-rc2-fixes' of git://git./linux/kernel/git/gfs2/linux-gfs2
Pull gfs2 fixes from Andreas Gruenbacher:
- The current iomap_file_buffered_write behavior of failing the entire
write when part of the user buffer cannot be faulted in leads to an
endless loop in gfs2. Work around that in gfs2 for now.
- Various other bugs all over the place.
* tag 'gfs2-v5.16-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
gfs2: Prevent endless loops in gfs2_file_buffered_write
gfs2: Fix "Introduce flag for glock holder auto-demotion"
gfs2: Fix length of holes reported at end-of-file
gfs2: release iopen glock early in evict
gfs2: Fix atomic bug in gfs2_instantiate
gfs2: Only dereference i->iov when iter_is_iovec(i)
Linus Torvalds [Wed, 17 Nov 2021 23:12:50 +0000 (15:12 -0800)]
Merge tag 'mips-fixes_5.16_1' of git://git./linux/kernel/git/mips/linux
Pull MIPS fixes from Thomas Bogendoerfer:
- wire futex_waitv syscall
- build fixes for lantiq and bcm63xx configs
- yamon-dt bugfix
* tag 'mips-fixes_5.16_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
mips: lantiq: add support for clk_get_parent()
mips: bcm63xx: add support for clk_get_parent()
MIPS: generic/yamon-dt: fix uninitialized variable error
MIPS: syscalls: Wire up futex_waitv syscall
Linus Torvalds [Wed, 17 Nov 2021 16:46:15 +0000 (08:46 -0800)]
Merge tag 'hyperv-fixes-signed-
20211117' of git://git./linux/kernel/git/hyperv/linux
Pull hyperv fixes from Wei Liu:
- Fix ring size calculation for balloon driver (Boqun Feng)
- Fix issues in Hyper-V setup code (Sean Christopherson)
* tag 'hyperv-fixes-signed-
20211117' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
x86/hyperv: Move required MSRs check to initial platform probing
x86/hyperv: Fix NULL deref in set_hv_tscchange_cb() if Hyper-V setup fails
Drivers: hv: balloon: Use VMBUS_RING_SIZE() wrapper for dm_ring_size
Linus Torvalds [Wed, 17 Nov 2021 16:38:00 +0000 (08:38 -0800)]
Merge tag 'nfsd-5.16-1' of git://linux-nfs.org/~bfields/linux
Pull nfsd bugfix from Bruce Fields:
"This is just one bugfix for a buffer overflow in knfsd's xdr decoding"
* tag 'nfsd-5.16-1' of git://linux-nfs.org/~bfields/linux:
NFSD: Fix exposure in nfsd4_decode_bitmap()
Mauro Carvalho Chehab [Tue, 16 Nov 2021 12:11:23 +0000 (12:11 +0000)]
Documentation/process: fix a cross reference
The cross-reference for the handbooks section works. However, it is
meant to describe the path inside the Kernel's doc where the section
is, but there's an space instead of a dash, plus it lacks the .rst at
the end, which makes:
./scripts/documentation-file-ref-check
to complain.
Fixes: 604370e106cc ("Documentation/process: Add maintainer handbooks section")
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Mauro Carvalho Chehab [Tue, 16 Nov 2021 12:11:22 +0000 (12:11 +0000)]
Documentation: update vcpu-requests.rst reference
Changeset
2f5947dfcaec ("Documentation: move Documentation/virtual to Documentation/virt")
renamed: Documentation/virtual/kvm/vcpu-requests.rst
to: Documentation/virt/kvm/vcpu-requests.rst.
Update its cross-reference accordingly.
Fixes: 2f5947dfcaec ("Documentation: move Documentation/virtual to Documentation/virt")
Reviewed-by: Anup Patel <anup.patel@wdc.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Mauro Carvalho Chehab [Tue, 16 Nov 2021 12:11:21 +0000 (12:11 +0000)]
docs: accounting: update delay-accounting.rst reference
The file name: accounting/delay-accounting.rst
should be, instead: Documentation/accounting/delay-accounting.rst.
Also, there's no need to use doc:`foo`, as automarkup.py will
automatically handle plain text mentions to Documentation/
files.
So, update its cross-reference accordingly.
Fixes: fcb501704554 ("delayacct: Document task_delayacct sysctl")
Fixes: c3123552aad3 ("docs: accounting: convert to ReST")
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Mauro Carvalho Chehab [Tue, 16 Nov 2021 12:11:20 +0000 (12:11 +0000)]
libbpf: update index.rst reference
Changeset
d20b41115ad5 ("libbpf: Rename libbpf documentation index file")
renamed: Documentation/bpf/libbpf/libbpf.rst
to: Documentation/bpf/libbpf/index.rst.
Update its cross-reference accordingly.
Fixes: d20b41115ad5 ("libbpf: Rename libbpf documentation index file")
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Sven Schnelle [Sun, 14 Nov 2021 16:08:17 +0000 (17:08 +0100)]
parisc/sticon: fix reverse colors
sticon_build_attr() checked the reverse argument and flipped
background and foreground color, but returned the non-reverse
value afterwards. Fix this and also add two local variables
for foreground and background color to make the code easier
to read.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Helge Deller <deller@gmx.de>
Paolo Bonzini [Tue, 16 Nov 2021 12:44:13 +0000 (07:44 -0500)]
Merge branch 'kvm-selftest' into kvm-master
- Cleanups for the perf test infrastructure and mapping hugepages
- Avoid contention on mmap_sem when the guests start to run
- Add event channel upcall support to xen_shinfo_test
黄乐 [Mon, 15 Nov 2021 14:08:29 +0000 (14:08 +0000)]
KVM: x86: Fix uninitialized eoi_exit_bitmap usage in vcpu_load_eoi_exitmap()
In vcpu_load_eoi_exitmap(), currently the eoi_exit_bitmap[4] array is
initialized only when Hyper-V context is available, in other path it is
just passed to kvm_x86_ops.load_eoi_exitmap() directly from on the stack,
which would cause unexpected interrupt delivery/handling issues, e.g. an
*old* linux kernel that relies on PIT to do clock calibration on KVM might
randomly fail to boot.
Fix it by passing ioapic_handled_vectors to load_eoi_exitmap() when Hyper-V
context is not available.
Fixes: f2bc14b69c38 ("KVM: x86: hyper-v: Prepare to meet unallocated Hyper-V context")
Cc: stable@vger.kernel.org
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Huang Le <huangle1@jd.com>
Message-Id: <
62115b277dab49ea97da5633f8522daf@jd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
David Matlack [Thu, 11 Nov 2021 00:12:57 +0000 (00:12 +0000)]
KVM: selftests: Use perf_test_destroy_vm in memslot_modification_stress_test
Change memslot_modification_stress_test to use perf_test_destroy_vm
instead of manually calling ucall_uninit and kvm_vm_free.
No functional change intended.
Signed-off-by: David Matlack <dmatlack@google.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
Message-Id: <
20211111001257.
1446428-5-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
David Matlack [Thu, 11 Nov 2021 00:12:56 +0000 (00:12 +0000)]
KVM: selftests: Wait for all vCPU to be created before entering guest mode
Thread creation requires taking the mmap_sem in write mode, which causes
vCPU threads running in guest mode to block while they are populating
memory. Fix this by waiting for all vCPU threads to be created and start
running before entering guest mode on any one vCPU thread.
This substantially improves the "Populate memory time" when using 1GiB
pages since it allows all vCPUs to zero pages in parallel rather than
blocking because a writer is waiting (which is waiting for another vCPU
that is busy zeroing a 1GiB page).
Before:
$ ./dirty_log_perf_test -v256 -s anonymous_hugetlb_1gb
...
Populate memory time: 52.811184013s
After:
$ ./dirty_log_perf_test -v256 -s anonymous_hugetlb_1gb
...
Populate memory time: 10.204573342s
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <
20211111001257.
1446428-4-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
David Matlack [Thu, 11 Nov 2021 00:12:55 +0000 (00:12 +0000)]
KVM: selftests: Move vCPU thread creation and joining to common helpers
Move vCPU thread creation and joining to common helper functions. This
is in preparation for the next commit which ensures that all vCPU
threads are fully created before entering guest mode on any one
vCPU.
No functional change intended.
Signed-off-by: David Matlack <dmatlack@google.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
Message-Id: <
20211111001257.
1446428-3-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
David Matlack [Thu, 11 Nov 2021 00:12:54 +0000 (00:12 +0000)]
KVM: selftests: Start at iteration 0 instead of -1
Start at iteration 0 instead of -1 to avoid having to initialize
vcpu_last_completed_iteration when setting up vCPU threads. This
simplifies the next commit where we move vCPU thread initialization
out to a common helper.
No functional change intended.
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <
20211111001257.
1446428-2-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Thu, 11 Nov 2021 00:03:10 +0000 (00:03 +0000)]
KVM: selftests: Sync perf_test_args to guest during VM creation
Copy perf_test_args to the guest during VM creation instead of relying on
the caller to do so at their leisure. Ideally, tests wouldn't even be
able to modify perf_test_args, i.e. they would have no motivation to do
the sync, but enforcing that is arguably a net negative for readability.
No functional change intended.
[Set wr_fract=1 by default and add helper to override it since the new
access_tracking_perf_test needs to set it dynamically.]
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
Message-Id: <
20211111000310.
1435032-13-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Thu, 11 Nov 2021 00:03:09 +0000 (00:03 +0000)]
KVM: selftests: Fill per-vCPU struct during "perf_test" VM creation
Fill the per-vCPU args when creating the perf_test VM instead of having
the caller do so. This helps ensure that any adjustments to the number
of pages (and thus vcpu_memory_bytes) are reflected in the per-VM args.
Automatically filling the per-vCPU args will also allow a future patch
to do the sync to the guest during creation.
Signed-off-by: Sean Christopherson <seanjc@google.com>
[Updated access_tracking_perf_test as well.]
Signed-off-by: David Matlack <dmatlack@google.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
Message-Id: <
20211111000310.
1435032-12-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Thu, 11 Nov 2021 00:03:08 +0000 (00:03 +0000)]
KVM: selftests: Create VM with adjusted number of guest pages for perf tests
Use the already computed guest_num_pages when creating the so called
extra VM pages for a perf test, and add a comment explaining why the
pages are allocated as extra pages.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <
20211111000310.
1435032-11-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Thu, 11 Nov 2021 00:03:07 +0000 (00:03 +0000)]
KVM: selftests: Remove perf_test_args.host_page_size
Remove perf_test_args.host_page_size and instead use getpagesize() so
that it's somewhat obvious that, for tests that care about the host page
size, they care about the system page size, not the hardware page size,
e.g. that the logic is unchanged if hugepages are in play.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <
20211111000310.
1435032-10-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Thu, 11 Nov 2021 00:03:06 +0000 (00:03 +0000)]
KVM: selftests: Move per-VM GPA into perf_test_args
Move the per-VM GPA into perf_test_args instead of storing it as a
separate global variable. It's not obvious that guest_test_phys_mem
holds a GPA, nor that it's connected/coupled with per_vcpu->gpa.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <
20211111000310.
1435032-9-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Thu, 11 Nov 2021 00:03:05 +0000 (00:03 +0000)]
KVM: selftests: Use perf util's per-vCPU GPA/pages in demand paging test
Grab the per-vCPU GPA and number of pages from perf_util in the demand
paging test instead of duplicating perf_util's calculations.
Note, this may or may not result in a functional change. It's not clear
that the test's calculations are guaranteed to yield the same value as
perf_util, e.g. if guest_percpu_mem_size != vcpu_args->pages.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <
20211111000310.
1435032-8-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Thu, 11 Nov 2021 00:03:04 +0000 (00:03 +0000)]
KVM: selftests: Capture per-vCPU GPA in perf_test_vcpu_args
Capture the per-vCPU GPA in perf_test_vcpu_args so that tests can get
the GPA without having to calculate the GPA on their own.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <
20211111000310.
1435032-7-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Thu, 11 Nov 2021 00:03:03 +0000 (00:03 +0000)]
KVM: selftests: Use shorthand local var to access struct perf_tests_args
Use 'pta' as a local pointer to the global perf_tests_args in order to
shorten line lengths and make the code borderline readable.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <
20211111000310.
1435032-6-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Thu, 11 Nov 2021 00:03:02 +0000 (00:03 +0000)]
KVM: selftests: Require GPA to be aligned when backed by hugepages
Assert that the GPA for a memslot backed by a hugepage is aligned to
the hugepage size and fix perf_test_util accordingly. Lack of GPA
alignment prevents KVM from backing the guest with hugepages, e.g. x86's
write-protection of hugepages when dirty logging is activated is
otherwise not exercised.
Add a comment explaining that guest_page_size is for non-huge pages to
try and avoid confusion about what it actually tracks.
Cc: Ben Gardon <bgardon@google.com>
Cc: Yanan Wang <wangyanan55@huawei.com>
Cc: Andrew Jones <drjones@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Aaron Lewis <aaronlewis@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
[Used get_backing_src_pagesz() to determine alignment dynamically.]
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <
20211111000310.
1435032-5-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Thu, 11 Nov 2021 00:03:01 +0000 (00:03 +0000)]
KVM: selftests: Assert mmap HVA is aligned when using HugeTLB
Manually padding and aligning the mmap region is only needed when using
THP. When using HugeTLB, mmap will always return an address aligned to
the HugeTLB page size. Add a comment to clarify this and assert the mmap
behavior for HugeTLB.
[Removed requirement that HugeTLB mmaps must be padded per Yanan's
feedback and added assertion that mmap returns aligned addresses
when using HugeTLB.]
Cc: Ben Gardon <bgardon@google.com>
Cc: Yanan Wang <wangyanan55@huawei.com>
Cc: Andrew Jones <drjones@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Aaron Lewis <aaronlewis@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <
20211111000310.
1435032-4-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Thu, 11 Nov 2021 00:03:00 +0000 (00:03 +0000)]
KVM: selftests: Expose align() helpers to tests
Refactor align() to work with non-pointers and split into separate
helpers for aligning up vs. down. Add align_ptr_up() for use with
pointers. Expose all helpers so that they can be used by tests and/or
other utilities. The align_down() helper in particular will be used to
ensure gpa alignment for hugepages.
No functional change intended.
[Added sepearate up/down helpers and replaced open-coded alignment
bit math throughout the KVM selftests.]
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
Message-Id: <
20211111000310.
1435032-3-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Thu, 11 Nov 2021 00:02:59 +0000 (00:02 +0000)]
KVM: selftests: Explicitly state indicies for vm_guest_mode_params array
Explicitly state the indices when populating vm_guest_mode_params to
make it marginally easier to visualize what's going on.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
[Added indices for new guest modes.]
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <
20211111000310.
1435032-2-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
David Woodhouse [Mon, 15 Nov 2021 16:50:22 +0000 (16:50 +0000)]
KVM: selftests: Add event channel upcall support to xen_shinfo_test
When I first looked at this, there was no support for guest exception
handling in the KVM selftests. In fact it was merged into 5.10 before
the Xen support got merged in 5.11, and I could have used it from the
start.
Hook it up now, to exercise the Xen upcall delivery. I'm about to make
things a bit more interesting by handling the full 2level event channel
stuff in-kernel on top of the basic vector injection that we already
have, and I'll want to build more tests on top.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <
20211115165030.7422-3-dwmw2@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Randy Dunlap [Mon, 15 Nov 2021 01:20:51 +0000 (17:20 -0800)]
mips: lantiq: add support for clk_get_parent()
Provide a simple implementation of clk_get_parent() in the
lantiq subarch so that callers of it will build without errors.
Fixes this build error:
ERROR: modpost: "clk_get_parent" [drivers/iio/adc/ingenic-adc.ko] undefined!
Fixes: 171bb2f19ed6 ("MIPS: Lantiq: Add initial support for Lantiq SoCs")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Suggested-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Cc: linux-mips@vger.kernel.org
Cc: John Crispin <john@phrozen.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Jonathan Cameron <jic23@kernel.org>
Cc: linux-iio@vger.kernel.org
Cc: Russell King <linux@armlinux.org.uk>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: John Crispin <john@phrozen.org>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Randy Dunlap [Mon, 15 Nov 2021 00:42:18 +0000 (16:42 -0800)]
mips: bcm63xx: add support for clk_get_parent()
BCM63XX selects HAVE_LEGACY_CLK but does not provide/support
clk_get_parent(), so add a simple implementation of that
function so that callers of it will build without errors.
Fixes these build errors:
mips-linux-ld: drivers/iio/adc/ingenic-adc.o: in function `jz4770_adc_init_clk_div':
ingenic-adc.c:(.text+0xe4): undefined reference to `clk_get_parent'
mips-linux-ld: drivers/iio/adc/ingenic-adc.o: in function `jz4725b_adc_init_clk_div':
ingenic-adc.c:(.text+0x1b8): undefined reference to `clk_get_parent'
Fixes: e7300d04bd08 ("MIPS: BCM63xx: Add support for the Broadcom BCM63xx family of SOCs." )
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Suggested-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Cc: Artur Rojek <contact@artur-rojek.eu>
Cc: Paul Cercueil <paul@crapouillou.net>
Cc: linux-mips@vger.kernel.org
Cc: Jonathan Cameron <jic23@kernel.org>
Cc: Lars-Peter Clausen <lars@metafoo.de>
Cc: linux-iio@vger.kernel.org
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: bcm-kernel-feedback-list@broadcom.com
Cc: Jonas Gorski <jonas.gorski@gmail.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Colin Ian King [Wed, 10 Nov 2021 23:28:24 +0000 (23:28 +0000)]
MIPS: generic/yamon-dt: fix uninitialized variable error
In the case where fw_getenv returns an error when fetching values
for ememsizea and memsize then variable phys_memsize is not assigned
a variable and will be uninitialized on a zero check of phys_memsize.
Fix this by initializing phys_memsize to zero.
Cleans up cppcheck error:
arch/mips/generic/yamon-dt.c:100:7: error: Uninitialized variable: phys_memsize [uninitvar]
Fixes: f41d2430bbd6 ("MIPS: generic/yamon-dt: Support > 256MB of RAM")
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Wang Haojun [Wed, 3 Nov 2021 02:55:21 +0000 (10:55 +0800)]
MIPS: syscalls: Wire up futex_waitv syscall
Wire up the futex_waitv syscall.
Fix Build warning: #warning syscall futex_waitv not implemented [-Wcpp]
Signed-off-by: Wang Haojun <wanghaojun@loongson.cn>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Chuck Lever [Sun, 14 Nov 2021 20:16:04 +0000 (15:16 -0500)]
NFSD: Fix exposure in nfsd4_decode_bitmap()
rtm@csail.mit.edu reports:
> nfsd4_decode_bitmap4() will write beyond bmval[bmlen-1] if the RPC
> directs it to do so. This can cause nfsd4_decode_state_protect4_a()
> to write client-supplied data beyond the end of
> nfsd4_exchange_id.spo_must_allow[] when called by
> nfsd4_decode_exchange_id().
Rewrite the loops so nfsd4_decode_bitmap() cannot iterate beyond
@bmlen.
Reported by: rtm@csail.mit.edu
Fixes: d1c263a031e8 ("NFSD: Replace READ* macros in nfsd4_decode_fattr()")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Andy Shevchenko [Fri, 12 Nov 2021 14:07:49 +0000 (16:07 +0200)]
printk: Remove printk.h inclusion in percpu.h
After the commit
42a0bb3f7138 ("printk/nmi: generic solution for safe
printk in NMI") the printk.h is not needed anymore in percpu.h.
Moreover `make headerdep` complains (an excerpt)
In file included from linux/printk.h,
from linux/dynamic_debug.h:188
from linux/printk.h:559 <-- here
from linux/percpu.h:9
from linux/idr.h:17
include/net/9p/client.h:13: warning: recursive header inclusion
Yeah, it's not a root cause of this, but removing will help to reduce
the noise.
Fixes: 42a0bb3f7138 ("printk/nmi: generic solution for safe printk in NMI")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Dennis Zhou <dennis@kernel.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20211112140749.80042-1-andriy.shevchenko@linux.intel.com
Sean Christopherson [Thu, 4 Nov 2021 18:22:39 +0000 (18:22 +0000)]
x86/hyperv: Move required MSRs check to initial platform probing
Explicitly check for MSR_HYPERCALL and MSR_VP_INDEX support when probing
for running as a Hyper-V guest instead of waiting until hyperv_init() to
detect the bogus configuration. Add messages to give the admin a heads
up that they are likely running on a broken virtual machine setup.
At best, silently disabling Hyper-V is confusing and difficult to debug,
e.g. the kernel _says_ it's using all these fancy Hyper-V features, but
always falls back to the native versions. At worst, the half baked setup
will crash/hang the kernel.
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20211104182239.1302956-3-seanjc@google.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Sean Christopherson [Thu, 4 Nov 2021 18:22:38 +0000 (18:22 +0000)]
x86/hyperv: Fix NULL deref in set_hv_tscchange_cb() if Hyper-V setup fails
Check for a valid hv_vp_index array prior to derefencing hv_vp_index when
setting Hyper-V's TSC change callback. If Hyper-V setup failed in
hyperv_init(), the kernel will still report that it's running under
Hyper-V, but will have silently disabled nearly all functionality.
BUG: kernel NULL pointer dereference, address:
0000000000000010
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] SMP
CPU: 4 PID: 1 Comm: swapper/0 Not tainted 5.15.0-rc2+ #75
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
RIP: 0010:set_hv_tscchange_cb+0x15/0xa0
Code: <8b> 04 82 8b 15 12 17 85 01 48 c1 e0 20 48 0d ee 00 01 00 f6 c6 08
...
Call Trace:
kvm_arch_init+0x17c/0x280
kvm_init+0x31/0x330
vmx_init+0xba/0x13a
do_one_initcall+0x41/0x1c0
kernel_init_freeable+0x1f2/0x23b
kernel_init+0x16/0x120
ret_from_fork+0x22/0x30
Fixes: 93286261de1b ("x86/hyperv: Reenlightenment notifications support")
Cc: stable@vger.kernel.org
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Link: https://lore.kernel.org/r/20211104182239.1302956-2-seanjc@google.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Boqun Feng [Mon, 1 Nov 2021 15:00:26 +0000 (23:00 +0800)]
Drivers: hv: balloon: Use VMBUS_RING_SIZE() wrapper for dm_ring_size
Baihua reported an error when boot an ARM64 guest with PAGE_SIZE=64k and
BALLOON is enabled:
hv_vmbus: registering driver hv_balloon
hv_vmbus: probe failed for device
1eccfd72-4b41-45ef-b73a-
4a6e44c12924 (-22)
The cause of this is that the ringbuffer size for hv_balloon is not
adjusted with VMBUS_RING_SIZE(), which makes the size not large enough
for ringbuffers on guest with PAGE_SIZE=64k. Therefore use
VMBUS_RING_SIZE() to calculate the ringbuffer size. Note that the old
size (20 * 1024) counts a 4k header in the total size, while
VMBUS_RING_SIZE() expects the parameter as the payload size, so use
16 * 1024.
Cc: <stable@vger.kernel.org> # 5.15.x
Reported-by: Baihua Lu <baihua.lu@microsoft.com>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Tested-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Link: https://lore.kernel.org/r/20211101150026.736124-1-boqun.feng@gmail.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Wasin Thonkaew [Wed, 3 Nov 2021 19:35:04 +0000 (19:35 +0000)]
docs: filesystems: Fix grammatical error "with" to "which"
Signed-off-by: Wasin Thonkaew <wasin@wasin.io>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Alex Shi [Wed, 10 Nov 2021 12:02:13 +0000 (20:02 +0800)]
doc/zh_CN: fix a translation error in management-style
'The name of the game' means the most important part of an activity, so
we should translate it by the meaning instead of the words.
Suggested-by: Xinyong Wang <wang.xy.chn@gmail.com>
Signed-off-by: Alex Shi <alexs@kernel.org>
Reviewed-by: Yanteng Si <siyanteng@loongson.cn>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Zhaoyu Liu [Sat, 13 Nov 2021 13:37:34 +0000 (21:37 +0800)]
docs: ftrace: fix the wrong path of tracefs
Delete "tracing" due to it has been included in /proc/mounts.
Delete "echo nop > $tracefs/tracing/current_tracer", maybe
this command is redundant.
Signed-off-by: Zhaoyu Liu <zackary.liu.pro@gmail.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Pali Rohár [Fri, 8 Oct 2021 16:01:05 +0000 (18:01 +0200)]
Documentation: arm: marvell: Fix link to armada_1000_pb.pdf document
File armada_1000_pb.pdf is not available on Marvell website anymore.
So update link to webarchive where is backup copy.
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Pali Rohár [Fri, 8 Oct 2021 16:01:04 +0000 (18:01 +0200)]
Documentation: arm: marvell: Put Armada XP section between Armada 370 and 375
From evolution and feature point of view Armada XP belongs between Armada
370 and Armada 375 families.
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Pali Rohár [Fri, 8 Oct 2021 16:01:03 +0000 (18:01 +0200)]
Documentation: arm: marvell: Add some links to homepage / product infos
Webarchive contains some useful resources like product info or links to
other documents.
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Akira Yokosawa [Wed, 10 Nov 2021 09:16:48 +0000 (18:16 +0900)]
docs: Update Sphinx requirements
Commit
f546ff0c0c07 ("Move our minimum Sphinx version to 1.7") raised
the minimum version to 1.7.
For pdfdocs, sphinx_pre_install says:
note: If you want pdf, you need at least Sphinx 2.4.4.
, and current requirements.txt installs Sphinx 2.4.4.
Update Sphinx versions mentioned in docs and remove a note on earlier
Sphinx versions.
Update zh_CN and it_IT translations as well.
Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
Cc: Federico Vaga <federico.vaga@vaga.pv.it>
Cc: Alex Shi <alexs@kernel.org>
Reviewed-by: Alex Shi <alexs@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Linus Torvalds [Mon, 15 Nov 2021 03:07:19 +0000 (19:07 -0800)]
Merge tag 'trace-v5.16-5' of git://git./linux/kernel/git/rostedt/linux-trace
Pull tracing fix from Steven Rostedt:
"Update to tracing histogram variable string copy
A fix to only copy the size of the field to the histogram string did
not take into account that the size can be larger than the storage"
* tag 'trace-v5.16-5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Add length protection to histogram string copies
Gustavo A. R. Silva [Mon, 15 Nov 2021 02:48:44 +0000 (20:48 -0600)]
kbuild: Fix -Wimplicit-fallthrough=5 error for GCC 5.x and 6.x
-Wimplicit-fallthrough=5 was under cc-option because it was only
available in GCC 7.x and newer so the build is now broken for GCC 5.x
and 6.x:
gcc: error: unrecognized command line option '-Wimplicit-fallthrough=5';
did you mean '-Wno-fallthrough'?
Fix this by moving -Wimplicit-fallthrough=5 under cc-option.
Fixes: dee2b702bcf0 ("kconfig: Add support for -Wimplicit-fallthrough")
Reported-by: Nathan Chancellor <nathan@kernel.org>
Co-developed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Steven Rostedt (VMware) [Sun, 14 Nov 2021 18:28:34 +0000 (13:28 -0500)]
tracing: Add length protection to histogram string copies
The string copies to the histogram storage has a max size of 256 bytes
(defined by MAX_FILTER_STR_VAL). Only the string size of the event field
needs to be copied to the event storage, but no more than what is in the
event storage. Although nothing should be bigger than 256 bytes, there's
no protection against overwriting of the storage if one day there is.
Copy no more than the destination size, and enforce it.
Also had to turn MAX_FILTER_STR_VAL into an unsigned int, to keep the
min() comparison of the string sizes of comparable types.
Link: https://lore.kernel.org/all/CAHk-=wjREUihCGrtRBwfX47y_KrLCGjiq3t6QtoNJpmVrAEb1w@mail.gmail.com/
Link: https://lkml.kernel.org/r/20211114132834.183429a4@rorschach.local.home
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Tom Zanussi <zanussi@kernel.org>
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Fixes: 63f84ae6b82b ("tracing/histogram: Do not copy the fixed-size char array field over the field size")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Linus Torvalds [Sun, 14 Nov 2021 21:56:52 +0000 (13:56 -0800)]
Linux 5.16-rc1
Gustavo A. R. Silva [Sun, 14 Nov 2021 00:57:25 +0000 (18:57 -0600)]
kconfig: Add support for -Wimplicit-fallthrough
Add Kconfig support for -Wimplicit-fallthrough for both GCC and Clang.
The compiler option is under configuration CC_IMPLICIT_FALLTHROUGH,
which is enabled by default.
Special thanks to Nathan Chancellor who fixed the Clang bug[1][2]. This
bugfix only appears in Clang 14.0.0, so older versions still contain
the bug and -Wimplicit-fallthrough won't be enabled for them, for now.
This concludes a long journey and now we are finally getting rid
of the unintentional fallthrough bug-class in the kernel, entirely. :)
Link: https://github.com/llvm/llvm-project/commit/9ed4a94d6451046a51ef393cd62f00710820a7e8
Link: https://bugs.llvm.org/show_bug.cgi?id=51094
Link: https://github.com/KSPP/linux/issues/115
Link: https://github.com/ClangBuiltLinux/linux/issues/236
Co-developed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Co-developed-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sun, 14 Nov 2021 20:18:22 +0000 (12:18 -0800)]
Merge tag 'xfs-5.16-merge-5' of git://git./fs/xfs/xfs-linux
Pull xfs cleanups from Darrick Wong:
"The most 'exciting' aspect of this branch is that the xfsprogs
maintainer and I have worked through the last of the code
discrepancies between kernel and userspace libxfs such that there are
no code differences between the two except for #includes.
IOWs, diff suffices to demonstrate that the userspace tools behave the
same as the kernel, and kernel-only bits are clearly marked in the
/kernel/ source code instead of just the userspace source.
Summary:
- Clean up open-coded swap() calls.
- A little bit of #ifdef golf to complete the reunification of the
kernel and userspace libxfs source code"
* tag 'xfs-5.16-merge-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: sync xfs_btree_split macros with userspace libxfs
xfs: #ifdef out perag code for userspace
xfs: use swap() to make dabtree code cleaner
Linus Torvalds [Sun, 14 Nov 2021 19:53:59 +0000 (11:53 -0800)]
Merge tag 'for-5.16/parisc-3' of git://git./linux/kernel/git/deller/parisc-linux
Pull more parisc fixes from Helge Deller:
"Fix a build error in stracktrace.c, fix resolving of addresses to
function names in backtraces, fix single-stepping in assembly code and
flush userspace pte's when using set_pte_at()"
* tag 'for-5.16/parisc-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc/entry: fix trace test in syscall exit path
parisc: Flush kernel data mapping in set_pte_at() when installing pte for user page
parisc: Fix implicit declaration of function '__kernel_text_address'
parisc: Fix backtrace to always include init funtion names
Linus Torvalds [Sun, 14 Nov 2021 19:37:49 +0000 (11:37 -0800)]
Merge tag 'sh-for-5.16' of git://git.libc.org/linux-sh
Pull arch/sh updates from Rich Felker.
* tag 'sh-for-5.16' of git://git.libc.org/linux-sh:
sh: pgtable-3level: Fix cast to pointer from integer of different size
sh: fix READ/WRITE redefinition warnings
sh: define __BIG_ENDIAN for math-emu
sh: math-emu: drop unused functions
sh: fix kconfig unmet dependency warning for FRAME_POINTER
sh: Cleanup about SPARSE_IRQ
sh: kdump: add some attribute to function
maple: fix wrong return value of maple_bus_init().
sh: boot: avoid unneeded rebuilds under arch/sh/boot/compressed/
sh: boot: add intermediate vmlinux.bin* to targets instead of extra-y
sh: boards: Fix the cacography in irq.c
sh: check return code of request_irq
sh: fix trivial misannotations
Linus Torvalds [Sun, 14 Nov 2021 19:30:50 +0000 (11:30 -0800)]
Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:
- Fix early_iounmap
- Drop cc-option fallbacks for architecture selection
* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 9156/1: drop cc-option fallbacks for architecture selection
ARM: 9155/1: fix early early_iounmap()
Linus Torvalds [Sun, 14 Nov 2021 19:11:51 +0000 (11:11 -0800)]
Merge tag 'devicetree-fixes-for-5.16-1' of git://git./linux/kernel/git/robh/linux
Pull devicetree fixes from Rob Herring:
- Two fixes due to DT node name changes on Arm, Ltd. boards
- Treewide rename of Ingenic CGU headers
- Update ST email addresses
- Remove Netlogic DT bindings
- Dropping few more cases of redundant 'maxItems' in schemas
- Convert toshiba,tc358767 bridge binding to schema
* tag 'devicetree-fixes-for-5.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
dt-bindings: watchdog: sunxi: fix error in schema
bindings: media: venus: Drop redundant maxItems for power-domain-names
dt-bindings: Remove Netlogic bindings
clk: versatile: clk-icst: Ensure clock names are unique
of: Support using 'mask' in making device bus id
dt-bindings: treewide: Update @st.com email address to @foss.st.com
dt-bindings: media: Update maintainers for st,stm32-hwspinlock.yaml
dt-bindings: media: Update maintainers for st,stm32-cec.yaml
dt-bindings: mfd: timers: Update maintainers for st,stm32-timers
dt-bindings: timer: Update maintainers for st,stm32-timer
dt-bindings: i2c: imx: hardware do not restrict clock-frequency to only 100 and 400 kHz
dt-bindings: display: bridge: Convert toshiba,tc358767.txt to yaml
dt-bindings: Rename Ingenic CGU headers to ingenic,*.h
Linus Torvalds [Sun, 14 Nov 2021 18:43:38 +0000 (10:43 -0800)]
Merge tag 'timers-urgent-2021-11-14' of git://git./linux/kernel/git/tip/tip
Pull timer fix from Thomas Gleixner:
"A single fix for POSIX CPU timers to address a problem where POSIX CPU
timer delivery stops working for a new child task because
copy_process() copies state information which is only valid for the
parent task"
* tag 'timers-urgent-2021-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
posix-cpu-timers: Clear task::posix_cputimers_work in copy_process()
Linus Torvalds [Sun, 14 Nov 2021 18:38:27 +0000 (10:38 -0800)]
Merge tag 'irq-urgent-2021-11-14' of git://git./linux/kernel/git/tip/tip
Pull irq fixes from Thomas Gleixner:
"A set of fixes for the interrupt subsystem
Core code:
- A regression fix for the Open Firmware interrupt mapping code where
a interrupt controller property in a node caused a map property in
the same node to be ignored.
Interrupt chip drivers:
- Workaround a limitation in SiFive PLIC interrupt chip which
silently ignores an EOI when the interrupt line is masked.
- Provide the missing mask/unmask implementation for the CSKY MP
interrupt controller.
PCI/MSI:
- Prevent a use after free when PCI/MSI interrupts are released by
destroying the sysfs entries before freeing the memory which is
accessed in the sysfs show() function.
- Implement a mask quirk for the Nvidia ION AHCI chip which does not
advertise masking capability despite implementing it. Even worse
the chip comes out of reset with all MSI entries masked, which due
to the missing masking capability never get unmasked.
- Move the check which prevents accessing the MSI[X] masking for XEN
back into the low level accessors. The recent consolidation missed
that these accessors can be invoked from places which do not have
that check which broke XEN. Move them back to he original place
instead of sprinkling tons of these checks all over the code"
* tag 'irq-urgent-2021-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
of/irq: Don't ignore interrupt-controller when interrupt-map failed
irqchip/sifive-plic: Fixup EOI failed when masked
irqchip/csky-mpintc: Fixup mask/unmask implementation
PCI/MSI: Destroy sysfs before freeing entries
PCI: Add MSI masking quirk for Nvidia ION AHCI
PCI/MSI: Deal with devices lying about their MSI mask capability
PCI/MSI: Move non-mask check back into low level accessors
Linus Torvalds [Sun, 14 Nov 2021 18:30:17 +0000 (10:30 -0800)]
Merge tag 'locking-urgent-2021-11-14' of git://git./linux/kernel/git/tip/tip
Pull x86 static call update from Thomas Gleixner:
"A single fix for static calls to make the trampoline patching more
robust by placing explicit signature bytes after the call trampoline
to prevent patching random other jumps like the CFI jump table
entries"
* tag 'locking-urgent-2021-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
static_call,x86: Robustify trampoline patching
Linus Torvalds [Sun, 14 Nov 2021 17:39:03 +0000 (09:39 -0800)]
Merge tag 'sched_urgent_for_v5.16_rc1' of git://git./linux/kernel/git/tip/tip
Pull scheduler fixes from Borislav Petkov:
- Avoid touching ~100 config files in order to be able to select the
preemption model
- clear cluster CPU masks too, on the CPU unplug path
- prevent use-after-free in cfs
- Prevent a race condition when updating CPU cache domains
- Factor out common shared part of smp_prepare_cpus() into a common
helper which can be called by both baremetal and Xen, in order to fix
a booting of Xen PV guests
* tag 'sched_urgent_for_v5.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
preempt: Restore preemption model selection configs
arch_topology: Fix missing clear cluster_cpumask in remove_cpu_topology()
sched/fair: Prevent dead task groups from regaining cfs_rq's
sched/core: Mitigate race cpus_share_cache()/update_top_cache_domain()
x86/smp: Factor out parts of native_smp_prepare_cpus()
Linus Torvalds [Sun, 14 Nov 2021 17:33:12 +0000 (09:33 -0800)]
Merge tag 'perf_urgent_for_v5.16_rc1' of git://git./linux/kernel/git/tip/tip
Pull perf fixes from Borislav Petkov:
- Prevent unintentional page sharing by checking whether a page
reference to a PMU samples page has been acquired properly before
that
- Make sure the LBR_SELECT MSR is saved/restored too
- Reset the LBR_SELECT MSR when resetting the LBR PMU to clear any
residual data left
* tag 'perf_urgent_for_v5.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/core: Avoid put_page() when GUP fails
perf/x86/vlbr: Add c->flags to vlbr event constraints
perf/x86/lbr: Reset LBR_SELECT during vlbr reset
Linus Torvalds [Sun, 14 Nov 2021 17:29:03 +0000 (09:29 -0800)]
Merge tag 'x86_urgent_for_v5.16_rc1' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
- Add the model number of a new, Raptor Lake CPU, to intel-family.h
- Do not log spurious corrected MCEs on SKL too, due to an erratum
- Clarify the path of paravirt ops patches upstream
- Add an optimization to avoid writing out AMX components to sigframes
when former are in init state
* tag 'x86_urgent_for_v5.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/cpu: Add Raptor Lake to Intel family
x86/mce: Add errata workaround for Skylake SKX37
MAINTAINERS: Add some information to PARAVIRT_OPS entry
x86/fpu: Optimize out sigframe xfeatures when in init state
Linus Torvalds [Sun, 14 Nov 2021 17:25:01 +0000 (09:25 -0800)]
Merge tag 'perf-tools-for-v5.16-2021-11-13' of git://git./linux/kernel/git/acme/linux
Pull more perf tools updates from Arnaldo Carvalho de Melo:
"Hardware tracing:
- ARM:
* Print the size of the buffer size consistently in hexadecimal in
ARM Coresight.
* Add Coresight snapshot mode support.
* Update --switch-events docs in 'perf record'.
* Support hardware-based PID tracing.
* Track task context switch for cpu-mode events.
- Vendor events:
* Add metric events JSON file for power10 platform
perf test:
- Get 'perf test' unit tests closer to kunit.
- Topology tests improvements.
- Remove bashisms from some tests.
perf bench:
- Fix memory leak of perf_cpu_map__new() in the futex benchmarks.
libbpf:
- Add some more weak libbpf functions o allow building with the
libbpf versions, old ones, present in distros.
libbeauty:
- Translate [gs]setsockopt 'level' argument integer values to
strings.
tools headers UAPI:
- Sync futex_waitv, arch prctl, sound, i195_drm and msr-index files
with the kernel sources.
Documentation:
- Add documentation to 'struct symbol'.
- Synchronize the definition of enum perf_hw_id with code in
tools/perf/design.txt"
* tag 'perf-tools-for-v5.16-2021-11-13' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (67 commits)
perf tests: Remove bash constructs from stat_all_pmu.sh
perf tests: Remove bash construct from record+zstd_comp_decomp.sh
perf test: Remove bash construct from stat_bpf_counters.sh test
perf bench futex: Fix memory leak of perf_cpu_map__new()
tools arch x86: Sync the msr-index.h copy with the kernel sources
tools headers UAPI: Sync drm/i915_drm.h with the kernel sources
tools headers UAPI: Sync sound/asound.h with the kernel sources
tools headers UAPI: Sync linux/prctl.h with the kernel sources
tools headers UAPI: Sync arch prctl headers with the kernel sources
perf tools: Add more weak libbpf functions
perf bpf: Avoid memory leak from perf_env__insert_btf()
perf symbols: Factor out annotation init/exit
perf symbols: Bit pack to save a byte
perf symbols: Add documentation to 'struct symbol'
tools headers UAPI: Sync files changed by new futex_waitv syscall
perf test bpf: Use ARRAY_CHECK() instead of ad-hoc equivalent, addressing array_size.cocci warning
perf arm-spe: Support hardware-based PID tracing
perf arm-spe: Save context ID in record
perf arm-spe: Update --switch-events docs in 'perf record'
perf arm-spe: Track task context switch for cpu-mode events
...
Thomas Gleixner [Sun, 14 Nov 2021 12:59:05 +0000 (13:59 +0100)]
Merge tag 'irqchip-fixes-5.16-1' of git://git./linux/kernel/git/maz/arm-platforms into irq/urgent
Pull irqchip fixes from Marc Zyngier:
- Address an issue with the SiFive PLIC being unable to EOI
a masked interrupt
- Move the disable/enable methods in the CSky mpintc to
mask/unmask
- Fix a regression in the OF irq code where an interrupt-controller
property in the same node as an interrupt-map property would get
ignored
Link: https://lore.kernel.org/all/20211112173459.4015233-1-maz@kernel.org
Linus Torvalds [Sat, 13 Nov 2021 23:32:30 +0000 (15:32 -0800)]
Merge tag 'zstd-for-linus-v5.16' of git://github.com/terrelln/linux
Pull zstd update from Nick Terrell:
"Update to zstd-1.4.10.
Add myself as the maintainer of zstd and update the zstd version in
the kernel, which is now 4 years out of date, to a much more recent
zstd release. This includes bug fixes, much more extensive fuzzing,
and performance improvements. And generates the kernel zstd
automatically from upstream zstd, so it is easier to keep the zstd
verison up to date, and we don't fall so far out of date again.
This includes 5 commits that update the zstd library version:
- Adds a new kernel-style wrapper around zstd.
This wrapper API is functionally equivalent to the subset of the
current zstd API that is currently used. The wrapper API changes to
be kernel style so that the symbols don't collide with zstd's
symbols. The update to zstd-1.4.10 maintains the same API and
preserves the semantics, so that none of the callers need to be
updated. All callers are updated in the commit, because there are
zero functional changes.
- Adds an indirection for `lib/decompress_unzstd.c` so it doesn't
depend on the layout of `lib/zstd/` to include every source file.
This allows the next patch to be automatically generated.
- Imports the zstd-1.4.10 source code. This commit is automatically
generated from upstream zstd (https://github.com/facebook/zstd).
- Adds me (terrelln@fb.com) as the maintainer of `lib/zstd`.
- Fixes a newly added build warning for clang.
The discussion around this patchset has been pretty long, so I've
included a FAQ-style summary of the history of the patchset, and why
we are taking this approach.
Why do we need to update?
-------------------------
The zstd version in the kernel is based off of zstd-1.3.1, which is
was released August 20, 2017. Since then zstd has seen many bug fixes
and performance improvements. And, importantly, upstream zstd is
continuously fuzzed by OSS-Fuzz, and bug fixes aren't backported to
older versions. So the only way to sanely get these fixes is to keep
up to date with upstream zstd.
There are no known security issues that affect the kernel, but we need
to be able to update in case there are. And while there are no known
security issues, there are relevant bug fixes. For example the problem
with large kernel decompression has been fixed upstream for over 2
years [1]
Additionally the performance improvements for kernel use cases are
significant. Measured for x86_64 on my Intel i9-9900k @ 3.6 GHz:
- BtrFS zstd compression at levels 1 and 3 is 5% faster
- BtrFS zstd decompression+read is 15% faster
- SquashFS zstd decompression+read is 15% faster
- F2FS zstd compression+write at level 3 is 8% faster
- F2FS zstd decompression+read is 20% faster
- ZRAM decompression+read is 30% faster
- Kernel zstd decompression is 35% faster
- Initramfs zstd decompression+build is 5% faster
On top of this, there are significant performance improvements coming
down the line in the next zstd release, and the new automated update
patch generation will allow us to pull them easily.
How is the update patch generated?
----------------------------------
The first two patches are preparation for updating the zstd version.
Then the 3rd patch in the series imports upstream zstd into the
kernel. This patch is automatically generated from upstream. A script
makes the necessary changes and imports it into the kernel. The
changes are:
- Replace all libc dependencies with kernel replacements and rewrite
includes.
- Remove unncessary portability macros like: #if defined(_MSC_VER).
- Use the kernel xxhash instead of bundling it.
This automation gets tested every commit by upstream's continuous
integration. When we cut a new zstd release, we will submit a patch to
the kernel to update the zstd version in the kernel.
The automated process makes it easy to keep the kernel version of zstd
up to date. The current zstd in the kernel shares the guts of the
code, but has a lot of API and minor changes to work in the kernel.
This is because at the time upstream zstd was not ready to be used in
the kernel envrionment as-is. But, since then upstream zstd has
evolved to support being used in the kernel as-is.
Why are we updating in one big patch?
-------------------------------------
The 3rd patch in the series is very large. This is because it is
restructuring the code, so it both deletes the existing zstd, and
re-adds the new structure. Future updates will be directly
proportional to the changes in upstream zstd since the last import.
They will admittidly be large, as zstd is an actively developed
project, and has hundreds of commits between every release. However,
there is no other great alternative.
One option ruled out is to replay every upstream zstd commit. This is
not feasible for several reasons:
- There are over 3500 upstream commits since the zstd version in the
kernel.
- The automation to automatically generate the kernel update was only
added recently, so older commits cannot easily be imported.
- Not every upstream zstd commit builds.
- Only zstd releases are "supported", and individual commits may have
bugs that were fixed before a release.
Another option to reduce the patch size would be to first reorganize
to the new file structure, and then apply the patch. However, the
current kernel zstd is formatted with clang-format to be more
"kernel-like". But, the new method imports zstd as-is, without
additional formatting, to allow for closer correlation with upstream,
and easier debugging. So the patch wouldn't be any smaller.
It also doesn't make sense to import upstream zstd commit by commit
going forward. Upstream zstd doesn't support production use cases
running of the development branch. We have a lot of post-commit
fuzzing that catches many bugs, so indiviudal commits may be buggy,
but fixed before a release. So going forward, I intend to import every
(important) zstd release into the Kernel.
So, while it isn't ideal, updating in one big patch is the only patch
I see forward.
Who is responsible for this code?
---------------------------------
I am. This patchset adds me as the maintainer for zstd. Previously,
there was no tree for zstd patches. Because of that, there were
several patches that either got ignored, or took a long time to merge,
since it wasn't clear which tree should pick them up. I'm officially
stepping up as maintainer, and setting up my tree as the path through
which zstd patches get merged. I'll make sure that patches to the
kernel zstd get ported upstream, so they aren't erased when the next
version update happens.
How is this code tested?
------------------------
I tested every caller of zstd on x86_64 (BtrFS, ZRAM, SquashFS, F2FS,
Kernel, InitRAMFS). I also tested Kernel & InitRAMFS on i386 and
aarch64. I checked both performance and correctness.
Also, thanks to many people in the community who have tested these
patches locally.
Lastly, this code will bake in linux-next before being merged into
v5.16.
Why update to zstd-1.4.10 when zstd-1.5.0 has been released?
------------------------------------------------------------
This patchset has been outstanding since 2020, and zstd-1.4.10 was the
latest release when it was created. Since the update patch is
automatically generated from upstream, I could generate it from
zstd-1.5.0.
However, there were some large stack usage regressions in zstd-1.5.0,
and are only fixed in the latest development branch. And the latest
development branch contains some new code that needs to bake in the
fuzzer before I would feel comfortable releasing to the kernel.
Once this patchset has been merged, and we've released zstd-1.5.1, we
can update the kernel to zstd-1.5.1, and exercise the update process.
You may notice that zstd-1.4.10 doesn't exist upstream. This release
is an artifical release based off of zstd-1.4.9, with some fixes for
the kernel backported from the development branch. I will tag the
zstd-1.4.10 release after this patchset is merged, so the Linux Kernel
is running a known version of zstd that can be debugged upstream.
Why was a wrapper API added?
----------------------------
The first versions of this patchset migrated the kernel to the
upstream zstd API. It first added a shim API that supported the new
upstream API with the old code, then updated callers to use the new
shim API, then transitioned to the new code and deleted the shim API.
However, Cristoph Hellwig suggested that we transition to a kernel
style API, and hide zstd's upstream API behind that. This is because
zstd's upstream API is supports many other use cases, and does not
follow the kernel style guide, while the kernel API is focused on the
kernel's use cases, and follows the kernel style guide.
Where is the previous discussion?
---------------------------------
Links for the discussions of the previous versions of the patch set
below. The largest changes in the design of the patchset are driven by
the discussions in v11, v5, and v1. Sorry for the mix of links, I
couldn't find most of the the threads on lkml.org"
Link: https://lkml.org/lkml/2020/9/29/27
Link: https://www.spinics.net/lists/linux-crypto/msg58189.html
Link: https://lore.kernel.org/linux-btrfs/20210430013157.747152-1-nickrterrell@gmail.com/
Link: https://lore.kernel.org/lkml/20210426234621.870684-2-nickrterrell@gmail.com/
Link: https://lore.kernel.org/linux-btrfs/20210330225112.496213-1-nickrterrell@gmail.com/
Link: https://lore.kernel.org/linux-f2fs-devel/20210326191859.1542272-1-nickrterrell@gmail.com/
Link: https://lkml.org/lkml/2020/12/3/1195
Link: https://lkml.org/lkml/2020/12/2/1245
Link: https://lore.kernel.org/linux-btrfs/20200916034307.2092020-1-nickrterrell@gmail.com/
Link: https://www.spinics.net/lists/linux-btrfs/msg105783.html
Link: https://lkml.org/lkml/2020/9/23/1074
Link: https://www.spinics.net/lists/linux-btrfs/msg105505.html
Link: https://lore.kernel.org/linux-btrfs/20200916034307.2092020-1-nickrterrell@gmail.com/
Signed-off-by: Nick Terrell <terrelln@fb.com>
Tested By: Paul Jones <paul@pauljones.id.au>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM/Clang v13.0.0 on x86-64
Tested-by: Jean-Denis Girard <jd.girard@sysnux.pf>
* tag 'zstd-for-linus-v5.16' of git://github.com/terrelln/linux:
lib: zstd: Add cast to silence clang's -Wbitwise-instead-of-logical
MAINTAINERS: Add maintainer entry for zstd
lib: zstd: Upgrade to latest upstream zstd version 1.4.10
lib: zstd: Add decompress_sources.h for decompress_unzstd
lib: zstd: Add kernel-specific API
Linus Torvalds [Sat, 13 Nov 2021 21:14:05 +0000 (13:14 -0800)]
Merge tag 'virtio-mem-for-5.16' of git://github.com/davidhildenbrand/linux
Pull virtio-mem update from David Hildenbrand:
"Support the VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE feature in virtio-mem,
now that "accidential" access to logically unplugged memory inside
added Linux memory blocks is no longer possible, because we:
- Removed /dev/kmem in commit
bbcd53c96071 ("drivers/char: remove
/dev/kmem for good")
- Disallowed access to virtio-mem device memory via /dev/mem in
commit
2128f4e21aa ("virtio-mem: disallow mapping virtio-mem memory
via /dev/mem")
- Sanitized access to virtio-mem device memory via /proc/kcore in
commit
0daa322b8ff9 ("fs/proc/kcore: don't read offline sections,
logically offline pages and hwpoisoned pages")
- Sanitized access to virtio-mem device memory via /proc/vmcore in
commit
ce2814622e84 ("virtio-mem: kdump mode to sanitize
/proc/vmcore access")
The new VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE feature that will be
required by some hypervisors implementing virtio-mem in the near
future, so let's support it now that we safely can"
* tag 'virtio-mem-for-5.16' of git://github.com/davidhildenbrand/linux:
virtio-mem: support VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE
James Clark [Thu, 28 Oct 2021 13:48:27 +0000 (14:48 +0100)]
perf tests: Remove bash constructs from stat_all_pmu.sh
The tests were passing but without testing and were printing the
following:
$ ./perf test -v 90
90: perf all PMU test :
--- start ---
test child forked, pid 51650
Testing cpu/branch-instructions/
./tests/shell/stat_all_pmu.sh: 10: [:
Performance counter stats for 'true':
137,307 cpu/branch-instructions/
0.
001686672 seconds time elapsed
0.
001376000 seconds user
0.
000000000 seconds sys: unexpected operator
Changing the regexes to a grep works in sh and prints this:
$ ./perf test -v 90
90: perf all PMU test :
--- start ---
test child forked, pid 60186
[...]
Testing tlb_flush.stlb_any
test child finished with 0
---- end ----
perf all PMU test: Ok
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org
Link: https://lore.kernel.org/r/20211028134828.65774-4-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>