Will Deacon [Thu, 4 Jan 2024 12:28:00 +0000 (12:28 +0000)]
Merge branch 'for-next/perf' into for-next/core
* for-next/perf: (30 commits)
arm: perf: Fix ARCH=arm build with GCC
MAINTAINERS: add maintainers for DesignWare PCIe PMU driver
drivers/perf: add DesignWare PCIe PMU driver
PCI: Move pci_clear_and_set_dword() helper to PCI header
PCI: Add Alibaba Vendor ID to linux/pci_ids.h
docs: perf: Add description for Synopsys DesignWare PCIe PMU driver
Revert "perf/arm_dmc620: Remove duplicate format attribute #defines"
Documentation: arm64: Document the PMU event counting threshold feature
arm64: perf: Add support for event counting threshold
arm: pmu: Move error message and -EOPNOTSUPP to individual PMUs
KVM: selftests: aarch64: Update tools copy of arm_pmuv3.h
perf/arm_dmc620: Remove duplicate format attribute #defines
arm: pmu: Share user ABI format mechanism with SPE
arm64: perf: Include threshold control fields in PMEVTYPER mask
arm: perf: Convert remaining fields to use GENMASK
arm: perf: Use GENMASK for PMMIR fields
arm: perf/kvm: Use GENMASK for ARMV8_PMU_PMCR_N
arm: perf: Remove inlines from arm_pmuv3.c
drivers/perf: arm_dsu_pmu: Remove kerneldoc-style comment syntax
drivers/perf: Remove usage of the deprecated ida_simple_xx() API
...
Will Deacon [Thu, 4 Jan 2024 12:27:55 +0000 (12:27 +0000)]
Merge branch 'for-next/mm' into for-next/core
* for-next/mm:
arm64: irq: set the correct node for shadow call stack
arm64: irq: set the correct node for VMAP stack
Will Deacon [Thu, 4 Jan 2024 12:27:49 +0000 (12:27 +0000)]
Merge branch 'for-next/misc' into for-next/core
* for-next/misc:
arm64: memory: remove duplicated include
arm64: Delete the zero_za macro
Documentation/arch/arm64: Fix typo
Will Deacon [Thu, 4 Jan 2024 12:27:42 +0000 (12:27 +0000)]
Merge branch 'for-next/lpa2-prep' into for-next/core
* for-next/lpa2-prep:
arm64: mm: get rid of kimage_vaddr global variable
arm64: mm: Take potential load offset into account when KASLR is off
arm64: kernel: Disable latent_entropy GCC plugin in early C runtime
arm64: Add ARM64_HAS_LPA2 CPU capability
arm64/mm: Add FEAT_LPA2 specific ID_AA64MMFR0.TGRAN[2]
arm64/mm: Update tlb invalidation routines for FEAT_LPA2
arm64/mm: Add lpa2_is_enabled() kvm_lpa2_is_enabled() stubs
arm64/mm: Modify range-based tlbi to decrement scale
Will Deacon [Thu, 4 Jan 2024 12:27:35 +0000 (12:27 +0000)]
Merge branch 'for-next/kbuild' into for-next/core
* for-next/kbuild:
efi/libstub: zboot: do not use $(shell ...) in cmd_copy_and_pad
arm64: properly install vmlinuz.efi
arm64: replace <asm-generic/export.h> with <linux/export.h>
arm64: vdso32: rename 32-bit debug vdso to vdso32.so.dbg
Will Deacon [Thu, 4 Jan 2024 12:27:29 +0000 (12:27 +0000)]
Merge branch 'for-next/fpsimd' into for-next/core
* for-next/fpsimd:
arm64: fpsimd: Implement lazy restore for kernel mode FPSIMD
arm64: fpsimd: Preserve/restore kernel mode NEON at context switch
arm64: fpsimd: Drop unneeded 'busy' flag
Will Deacon [Thu, 4 Jan 2024 12:27:13 +0000 (12:27 +0000)]
Merge branch 'for-next/early-idreg-overrides' into for-next/core
* for-next/early-idreg-overrides:
arm64/kernel: Move 'nokaslr' parsing out of early idreg code
arm64: idreg-override: Avoid kstrtou64() to parse a single hex digit
arm64: idreg-override: Avoid sprintf() for simple string concatenation
arm64: idreg-override: avoid strlen() to check for empty strings
arm64: idreg-override: Avoid parameq() and parameqn()
arm64: idreg-override: Prepare for place relative reloc patching
arm64: idreg-override: Omit non-NULL checks for override pointer
Will Deacon [Thu, 4 Jan 2024 12:26:56 +0000 (12:26 +0000)]
Merge branch 'for-next/cpufeature' into for-next/core
* for-next/cpufeature:
arm64: Align boot cpucap handling with system cpucap handling
arm64: Cleanup system cpucap handling
arm64: Kconfig: drop KAISER reference from KPTI option description
arm64: mm: Only map KPTI trampoline if it is going to be used
arm64: Get rid of ARM64_HAS_NO_HW_PREFETCH
Masahiro Yamada [Mon, 18 Dec 2023 08:01:27 +0000 (17:01 +0900)]
efi/libstub: zboot: do not use $(shell ...) in cmd_copy_and_pad
You do not need to use $(shell ...) in recipe lines, as they are already
executed in a shell. An alternative solution is $$(...), which is an
escaped sequence of the shell's command substituion, $(...).
For this case, there is a reason to avoid $(shell ...).
Kbuild detects command changes by using the if_changed macro, which
compares the previous command recorded in .*.cmd with the current
command from Makefile. If they differ, Kbuild re-runs the build rule.
To diff the commands, Make must expand $(shell ...) first. It means that
hexdump is executed every time, even when nothing needs rebuilding. If
Kbuild determines that vmlinux.bin needs rebuilding, hexdump will be
executed again to evaluate the 'cmd' macro, one more time to really
build vmlinux.bin, and finally yet again to record the expanded command
into .*.cmd.
Replace $(shell ...) with $$(...) to avoid multiple, unnecessay shell
evaluations. Since Make is agnostic about the shell code, $(...), the
if_changed macro compares the string "$(hexdump -s16 -n4 ...)" verbatim,
so hexdump is run only for building vmlinux.bin.
For the same reason, $(shell ...) in EFI_ZBOOT_OBJCOPY_FLAGS should be
eliminated.
While I was here, I replaced '&&' with ';' because a command for
if_changed is executed with 'set -e'.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20231218080127.907460-1-masahiroy@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
Josef Bacik [Thu, 14 Dec 2023 16:18:50 +0000 (11:18 -0500)]
arm64: properly install vmlinuz.efi
If you select CONFIG_EFI_ZBOOT, we will generate vmlinuz.efi, and then
when we go to install the kernel we'll install the vmlinux instead
because install.sh only recognizes Image.gz as wanting the compressed
install image. With CONFIG_EFI_ZBOOT we don't get the proper kernel
installed, which means it doesn't boot, which makes for a very confused
and subsequently angry kernel developer.
Fix this by properly installing our compressed kernel if we've enabled
CONFIG_EFI_ZBOOT.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Cc: <stable@vger.kernel.org> # 6.1.x
Fixes: c37b830fef13 ("arm64: efi: enable generic EFI compressed boot")
Reviewed-by: Simon Glass <sjg@chromium.org>
Link: https://lore.kernel.org/r/6edb1402769c2c14c4fbef8f7eaedb3167558789.1702570674.git.josef@toxicpanda.com
Signed-off-by: Will Deacon <will@kernel.org>
Wang Jinchao [Fri, 15 Dec 2023 06:40:08 +0000 (14:40 +0800)]
arm64: memory: remove duplicated include
remove duplicated include
Signed-off-by: Wang Jinchao <wangjinchao@xfusion.com>
Link: https://lore.kernel.org/r/202312151439+0800-wangjinchao@xfusion.com
Signed-off-by: Will Deacon <will@kernel.org>
James Clark [Fri, 15 Dec 2023 17:56:48 +0000 (17:56 +0000)]
arm: perf: Fix ARCH=arm build with GCC
LLVM ignores everything inside the if statement and doesn't generate
errors, but GCC doesn't ignore it, resulting in the following error:
drivers/perf/arm_pmuv3.c: In function ‘armv8pmu_write_evtype’:
include/linux/bits.h:34:29: error: left shift count >= width of type [-Werror=shift-count-overflow]
34 | (((~UL(0)) - (UL(1) << (l)) + 1) & \
Fix it by using GENMASK_ULL which doesn't overflow on arm32 (even though
the value is never used there).
Fixes: 3115ee021bfb ("arm64: perf: Include threshold control fields in PMEVTYPER mask")
Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Closes: https://lore.kernel.org/linux-arm-kernel/20231215120817.h2f3akgv72zhrtqo@pengutronix.de/
Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20231215175648.3397170-2-james.clark@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Mark Rutland [Tue, 12 Dec 2023 17:09:10 +0000 (17:09 +0000)]
arm64: Align boot cpucap handling with system cpucap handling
Currently the detection+enablement of boot cpucaps is separate from the
patching of boot cpucap alternatives, which means there's a period where
cpus_have_cap($CAP) and alternative_has_cap($CAP) may be mismatched.
It would be preferable to manage the boot cpucaps in the same way as the
system cpucaps, both for clarity and to minimize the risk of accidental
usage of code relying upon an alternative which has not yet been
patched.
This patch aligns the handling of boot cpucaps with the handling of
system cpucaps:
* The existing setup_boot_cpu_capabilities() function is moved to be
closer to the setup_system_capabilities() and setup_system_features()
functions so that they're more clearly related and more likely to be
updated together in future.
* The patching of boot cpucap alternatives is moved into
setup_boot_cpu_capabilities(), immediately after boot cpucaps are
detected and enabled.
* A new setup_boot_cpu_features() function is added to mirror
setup_system_features(); this handles initialization of cpucap data
structures and calls setup_boot_cpu_capabilities(). This makes
init_cpu_features() a closer mirror to update_cpu_features(), and
makes smp_prepare_boot_cpu() a closer mirror to smp_cpus_done().
Importantly, while these changes alter the structure of the code, they
retain the existing order of calls to:
init_cpu_features(); // prefix initializing feature regs
init_cpucap_indirect_list();
detect_system_supports_pseudo_nmi();
update_cpu_capabilities(SCOPE_BOOT_CPU | SCOPE_LOCAL_CPU);
enable_cpu_capabilities(SCOPE_BOOT_CPU);
apply_boot_alternatives();
... and hence there should be no functional change as a result of this
patch; this is purely a structural cleanup.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20231212170910.3745497-3-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Mark Rutland [Tue, 12 Dec 2023 17:09:09 +0000 (17:09 +0000)]
arm64: Cleanup system cpucap handling
Recent changes to remove cpus_have_const_cap() introduced new users of
cpus_have_cap() in the period between detecting system cpucaps and
patching alternatives. It would be preferable to defer these until after
the relevant cpucaps have been patched so that these can use the usual
feature check helper functions, which is clearer and has less risk of
accidental usage of code relying upon an alternative which has not yet
been patched.
This patch reworks the system-wide cpucap detection and patching to
minimize this transient period:
* The detection, enablement, and patching of system cpucaps is moved
into a new setup_system_capabilities() function so that these can be
grouped together more clearly, with no other functions called in the
period between detection and patching. This is called from
setup_system_features() before the subsequent checks that depend on
the cpucaps.
The logging of TTBR0 PAN and cpucaps with a mask is also moved here to
keep these as close as possible to update_cpu_capabilities().
At the same time, comments are corrected and improved to make the
intent clearer.
* As hyp_mode_check() only tests system register values (not hwcaps) and
must be called prior to patching, the call to hyp_mode_check() is
moved before the call to setup_system_features().
* In setup_system_features(), the use of system_uses_ttbr0_pan() is
restored, now that this occurs after alternatives are patched. This is
a partial revert of commit:
53d62e995d9eaed1 ("arm64: Avoid cpus_have_const_cap() for ARM64_HAS_PAN")
* In sve_setup() and sme_setup(), the use of system_supports_sve() and
system_supports_sme() respectively are restored, now that these occur
after alternatives are patched. This is a partial revert of commit:
a76521d160284a1e ("arm64: Avoid cpus_have_const_cap() for ARM64_{SVE,SME,SME2,FA64}")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20231212170910.3745497-2-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Shuai Xue [Fri, 8 Dec 2023 02:56:52 +0000 (10:56 +0800)]
MAINTAINERS: add maintainers for DesignWare PCIe PMU driver
Add maintainers for Synopsys DesignWare PCIe PMU driver and driver
document.
Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Link: https://lore.kernel.org/r/20231208025652.87192-6-xueshuai@linux.alibaba.com
Signed-off-by: Will Deacon <will@kernel.org>
Shuai Xue [Fri, 8 Dec 2023 02:56:51 +0000 (10:56 +0800)]
drivers/perf: add DesignWare PCIe PMU driver
This commit adds the PCIe Performance Monitoring Unit (PMU) driver support
for T-Head Yitian SoC chip. Yitian is based on the Synopsys PCI Express
Core controller IP which provides statistics feature. The PMU is a PCIe
configuration space register block provided by each PCIe Root Port in a
Vendor-Specific Extended Capability named RAS D.E.S (Debug, Error
injection, and Statistics).
To facilitate collection of statistics the controller provides the
following two features for each Root Port:
- one 64-bit counter for Time Based Analysis (RX/TX data throughput and
time spent in each low-power LTSSM state) and
- one 32-bit counter for Event Counting (error and non-error events for
a specified lane)
Note: There is no interrupt for counter overflow.
This driver adds PMU devices for each PCIe Root Port. And the PMU device is
named based the BDF of Root Port. For example,
30:03.0 PCI bridge: Device 1ded:8000 (rev 01)
the PMU device name for this Root Port is dwc_rootport_3018.
Example usage of counting PCIe RX TLP data payload (Units of bytes)::
$# perf stat -a -e dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/
average RX bandwidth can be calculated like this:
PCIe TX Bandwidth = Rx_PCIe_TLP_Data_Payload / Measure_Time_Window
Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Yicong Yang <yangyicong@hisilicon.com>
Reviewed-and-tested-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Link: https://lore.kernel.org/r/20231208025652.87192-5-xueshuai@linux.alibaba.com
[will: Fix sparse error due to use of uninitialised 'vsec' symbol in
dwc_pcie_match_des_cap()]
Signed-off-by: Will Deacon <will@kernel.org>
Shuai Xue [Fri, 8 Dec 2023 02:56:50 +0000 (10:56 +0800)]
PCI: Move pci_clear_and_set_dword() helper to PCI header
The clear and set pattern is commonly used for accessing PCI config,
move the helper pci_clear_and_set_dword() from aspm.c into PCI header.
In addition, rename to pci_clear_and_set_config_dword() to retain the
"config" information and match the other accessors.
No functional change intended.
Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Link: https://lore.kernel.org/r/20231208025652.87192-4-xueshuai@linux.alibaba.com
Signed-off-by: Will Deacon <will@kernel.org>
Shuai Xue [Fri, 8 Dec 2023 02:56:49 +0000 (10:56 +0800)]
PCI: Add Alibaba Vendor ID to linux/pci_ids.h
The Alibaba Vendor ID (0x1ded) is now used by Alibaba elasticRDMA ("erdma")
and will be shared with the upcoming PCIe PMU ("dwc_pcie_pmu"). Move the
Vendor ID to linux/pci_ids.h so that it can shared by several drivers
later.
Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com> # pci_ids.h
Tested-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Link: https://lore.kernel.org/r/20231208025652.87192-3-xueshuai@linux.alibaba.com
Signed-off-by: Will Deacon <will@kernel.org>
Shuai Xue [Fri, 8 Dec 2023 02:56:48 +0000 (10:56 +0800)]
docs: perf: Add description for Synopsys DesignWare PCIe PMU driver
Alibaba's T-Head Yitan 710 SoC includes Synopsys' DesignWare Core PCIe
controller which implements PMU for performance and functional debugging to
facilitate system maintenance.
Document it to provide guidance on how to use it.
Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Yicong Yang <yangyicong@hisilicon.com>
Tested-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Link: https://lore.kernel.org/r/20231208025652.87192-2-xueshuai@linux.alibaba.com
Signed-off-by: Will Deacon <will@kernel.org>
Huang Shijie [Wed, 13 Dec 2023 01:20:46 +0000 (09:20 +0800)]
arm64: irq: set the correct node for shadow call stack
The init_irq_stacks() has been changed to use the correct node:
https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/commit/?id=
75b5e0bf90bf
The init_irq_scs() has the same issue with init_irq_stacks():
cpu_to_node() is not initialized yet, it does not work.
This patch uses early_cpu_to_node() to set the init_irq_scs()
with the correct node.
Signed-off-by: Huang Shijie <shijie@os.amperecomputing.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20231213012046.12014-1-shijie@os.amperecomputing.com
Signed-off-by: Will Deacon <will@kernel.org>
Will Deacon [Wed, 13 Dec 2023 09:47:52 +0000 (09:47 +0000)]
Revert "perf/arm_dmc620: Remove duplicate format attribute #defines"
This reverts commit
a5f4ca68f348ac059efd6a3d7ad4040aed1c0818.
Pulling in the Arm-specific 'linux/perf/arm_pmu.h' header breaks the
allmodconfig build for x86:
> In file included from drivers/perf/arm_dmc620_pmu.c:26:
> include/linux/perf/arm_pmu.h:15:10: fatal error: asm/cputype.h: No such file or directory
> 15 | #include <asm/cputype.h>
> | ^~~~~~~~~~~~~~~
Just put things back like they were so that the driver can continue to
be compile-tested on a variety of architectures.
Link: https://lore.kernel.org/r/20231213100931.12d9d85e@canb.auug.org.au
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Will Deacon <will@kernel.org>
Ard Biesheuvel [Fri, 8 Dec 2023 11:32:22 +0000 (12:32 +0100)]
arm64: fpsimd: Implement lazy restore for kernel mode FPSIMD
Now that kernel mode FPSIMD state is context switched along with other
task state, we can enable the existing logic that keeps track of which
task's FPSIMD state the CPU is holding in its registers. If it is the
context of the task that we are switching to, we can elide the reload of
the FPSIMD state from memory.
Note that we also need to check whether the FPSIMD state on this CPU is
the most recent: if a task gets migrated away and back again, the state
in memory may be more recent than the state in the CPU. So add another
CPU id field to task_struct to keep track of this. (We could reuse the
existing CPU id field used for user mode context, but that might result
in user state to be discarded unnecessarily, given that two distinct
CPUs could be holding the most recent user mode state and the most
recent kernel mode state)
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20231208113218.3001940-9-ardb@google.com
Signed-off-by: Will Deacon <will@kernel.org>
Ard Biesheuvel [Fri, 8 Dec 2023 11:32:21 +0000 (12:32 +0100)]
arm64: fpsimd: Preserve/restore kernel mode NEON at context switch
Currently, the FPSIMD register file is not preserved and restored along
with the general registers on exception entry/exit or context switch.
For this reason, we disable preemption when enabling FPSIMD for kernel
mode use in task context, and suspend the processing of softirqs so that
there are no concurrent uses in the kernel. (Kernel mode FPSIMD may not
be used at all in other contexts).
Disabling preemption while doing CPU intensive work on inputs of
potentially unbounded size is bad for real-time performance, which is
why we try and ensure that SIMD crypto code does not operate on more
than ~4k at a time, which is an arbitrary limit and requires assembler
code to implement efficiently.
We can avoid the need for disabling preemption if we can ensure that any
in-kernel users of the NEON will not lose the FPSIMD register state
across a context switch. And given that disabling softirqs implicitly
disables preemption as well, we will also have to ensure that a softirq
that runs code using FPSIMD can safely interrupt an in-kernel user.
So introduce a thread_info flag TIF_KERNEL_FPSTATE, and modify the
context switch hook for FPSIMD to preserve and restore the kernel mode
FPSIMD to/from struct thread_struct when it is set. This avoids any
scheduling blackouts due to prolonged use of FPSIMD in kernel mode,
without the need for manual yielding.
In order to support softirq processing while FPSIMD is being used in
kernel task context, use the same flag to decide whether the kernel mode
FPSIMD state needs to be preserved and restored before allowing FPSIMD
to be used in softirq context.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20231208113218.3001940-8-ardb@google.com
Signed-off-by: Will Deacon <will@kernel.org>
Ard Biesheuvel [Fri, 8 Dec 2023 11:32:20 +0000 (12:32 +0100)]
arm64: fpsimd: Drop unneeded 'busy' flag
Kernel mode NEON will preserve the user mode FPSIMD state by saving it
into the task struct before clobbering the registers. In order to avoid
the need for preserving kernel mode state too, we disallow nested use of
kernel mode NEON, i..e, use in softirq context while the interrupted
task context was using kernel mode NEON too.
Originally, this policy was implemented using a per-CPU flag which was
exposed via may_use_simd(), requiring the users of the kernel mode NEON
to deal with the possibility that it might return false, and having NEON
and non-NEON code paths. This policy was changed by commit
13150149aa6ded1 ("arm64: fpsimd: run kernel mode NEON with softirqs
disabled"), and now, softirq processing is disabled entirely instead,
and so may_use_simd() can never fail when called from task or softirq
context.
This means we can drop the fpsimd_context_busy flag entirely, and
instead, ensure that we disable softirq processing in places where we
formerly relied on the flag for preventing races in the FPSIMD preserve
routines.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20231208113218.3001940-7-ardb@google.com
[will: Folded in fix from CAMj1kXFhzbJRyWHELCivQW1yJaF=p07LLtbuyXYX3G1WtsdyQg@mail.gmail.com]
Signed-off-by: Will Deacon <will@kernel.org>
Ard Biesheuvel [Wed, 29 Nov 2023 11:16:16 +0000 (12:16 +0100)]
arm64/kernel: Move 'nokaslr' parsing out of early idreg code
Parsing and ignoring 'nokaslr' can be done from anywhere, except from
the code that runs very early and is therefore built with limitations on
the kind of relocations it is permitted to use.
So move it to a source file that is part of the ordinary kernel build.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20231129111555.3594833-63-ardb@google.com
Signed-off-by: Will Deacon <will@kernel.org>
Ard Biesheuvel [Wed, 29 Nov 2023 11:16:15 +0000 (12:16 +0100)]
arm64: idreg-override: Avoid kstrtou64() to parse a single hex digit
All ID register value overrides are =0 with the exception of the nokaslr
pseudo feature which uses =1. In order to remove the dependency on
kstrtou64(), which is part of the core kernel and no longer usable once
we move idreg-override into the early mini C runtime, let's just parse a
single hex digit (with optional leading 0x) and set the output value
accordingly.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20231129111555.3594833-62-ardb@google.com
Signed-off-by: Will Deacon <will@kernel.org>
Ard Biesheuvel [Wed, 29 Nov 2023 11:16:14 +0000 (12:16 +0100)]
arm64: idreg-override: Avoid sprintf() for simple string concatenation
Instead of using sprintf() with the "%s.%s=" format, where the first
string argument is always the same in the inner loop of match_options(),
use simple memcpy() for string concatenation, and move the first copy to
the outer loop. This removes the dependency on sprintf(), which will be
difficult to fulfil when we move this code into the early mini C
runtime.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20231129111555.3594833-61-ardb@google.com
Signed-off-by: Will Deacon <will@kernel.org>
Ard Biesheuvel [Wed, 29 Nov 2023 11:16:13 +0000 (12:16 +0100)]
arm64: idreg-override: avoid strlen() to check for empty strings
strlen() is a costly way to decide whether a string is empty, as in that
case, the first character will be NUL so we can check for that directly.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20231129111555.3594833-60-ardb@google.com
Signed-off-by: Will Deacon <will@kernel.org>
Ard Biesheuvel [Wed, 29 Nov 2023 11:16:12 +0000 (12:16 +0100)]
arm64: idreg-override: Avoid parameq() and parameqn()
The only way parameq() and parameqn() deviate from the ordinary string
and memory routines is that they ignore the difference between dashes
and underscores.
Since we copy each command line argument into a buffer before passing it
to parameq() and parameqn() numerous times, let's just convert all
dashes to underscores just once, and update the alias array accordingly.
This also helps reduce the dependency on kernel APIs that are no longer
available once we move this code into the early mini C runtime.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20231129111555.3594833-59-ardb@google.com
Signed-off-by: Will Deacon <will@kernel.org>
Ard Biesheuvel [Wed, 29 Nov 2023 11:16:11 +0000 (12:16 +0100)]
arm64: idreg-override: Prepare for place relative reloc patching
The ID reg override handling code uses a rather elaborate data structure
that relies on statically initialized absolute address values in pointer
fields. This means that this code cannot run until relocation fixups
have been applied, and this is unfortunate, because it means we cannot
discover overrides for KASLR or LVA/LPA without creating the kernel
mapping and performing the relocations first.
This can be solved by switching to place-relative relocations, which can
be applied by the linker at build time. This means some additional
arithmetic is required when dereferencing these pointers, as we can no
longer dereference the pointer members directly.
So let's implement this for idreg-override.c in a preliminary way, i.e.,
convert all the references in code to use a special accessor that
produces the correct absolute value at runtime.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20231129111555.3594833-58-ardb@google.com
Signed-off-by: Will Deacon <will@kernel.org>
Ard Biesheuvel [Wed, 29 Nov 2023 11:16:10 +0000 (12:16 +0100)]
arm64: idreg-override: Omit non-NULL checks for override pointer
Now that override pointers are always set, we can drop the various
non-NULL checks that we have in the code.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20231129111555.3594833-57-ardb@google.com
Signed-off-by: Will Deacon <will@kernel.org>
Ard Biesheuvel [Wed, 29 Nov 2023 11:15:59 +0000 (12:15 +0100)]
arm64: mm: get rid of kimage_vaddr global variable
We store the address of _text in kimage_vaddr, but since commit
09e3c22a86f6889d ("arm64: Use a variable to store non-global mappings
decision"), we no longer reference this variable from modules so we no
longer need to export it.
In fact, we don't need it at all so let's just get rid of it.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20231129111555.3594833-46-ardb@google.com
Signed-off-by: Will Deacon <will@kernel.org>
Ard Biesheuvel [Wed, 29 Nov 2023 11:15:58 +0000 (12:15 +0100)]
arm64: mm: Take potential load offset into account when KASLR is off
We enable CONFIG_RELOCATABLE even when CONFIG_RANDOMIZE_BASE is
disabled, and this permits the loader (i.e., EFI) to place the kernel
anywhere in physical memory as long as the base address is 64k aligned.
This means that the 'KASLR' case described in the header that defines
the size of the statically allocated page tables could take effect even
when CONFIG_RANDMIZE_BASE=n. So check for CONFIG_RELOCATABLE instead.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20231129111555.3594833-45-ardb@google.com
Signed-off-by: Will Deacon <will@kernel.org>
Ard Biesheuvel [Wed, 29 Nov 2023 11:15:57 +0000 (12:15 +0100)]
arm64: kernel: Disable latent_entropy GCC plugin in early C runtime
In subsequent patches, mark portions of the early C code will be marked
as __init. Unfortunarely, __init implies __latent_entropy, and this
would result in the early C code being instrumented in an unsafe manner.
Disable the latent entropy plugin for the early C code.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20231129111555.3594833-44-ardb@google.com
Signed-off-by: Will Deacon <will@kernel.org>
James Clark [Mon, 11 Dec 2023 16:13:23 +0000 (16:13 +0000)]
Documentation: arm64: Document the PMU event counting threshold feature
Add documentation for the new Perf event open parameters and
the threshold_max capability file.
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: James Clark <james.clark@arm.com>
Link: https://lore.kernel.org/r/20231211161331.1277825-12-james.clark@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
James Clark [Mon, 11 Dec 2023 16:13:22 +0000 (16:13 +0000)]
arm64: perf: Add support for event counting threshold
FEAT_PMUv3_TH (Armv8.8) permits a PMU counter to increment only on
events whose count meets a specified threshold condition. For example if
PMEVTYPERn.TC (Threshold Control) is set to 0b101 (Greater than or
equal, count), and the threshold is set to 2, then the PMU counter will
now only increment by 1 when an event would have previously incremented
the PMU counter by 2 or more on a single processor cycle.
Three new Perf event config fields, 'threshold', 'threshold_compare' and
'threshold_count' have been added to control the feature.
threshold_compare maps to the upper two bits of PMEVTYPERn.TC and
threshold_count maps to the first bit of TC. These separate attributes
have been picked rather than enumerating all the possible combinations
of the TC field as in the Arm ARM. The attributes would be used on a
Perf command line like this:
$ perf stat -e stall_slot/threshold=2,threshold_compare=2/
A new capability for reading out the maximum supported threshold value
has also been added:
$ cat /sys/bus/event_source/devices/armv8_pmuv3/caps/threshold_max
0x000000ff
If a threshold higher than threshold_max is provided, then an error is
generated. If FEAT_PMUv3_TH isn't implemented or a 32 bit kernel is
running, then threshold_max reads zero, and attempting to set a
threshold value will also result in an error.
The threshold is per PMU counter, and there are potentially different
threshold_max values per PMU type on heterogeneous systems.
Bits higher than 32 now need to be written into PMEVTYPER, so
armv8pmu_write_evtype() has to be updated to take an unsigned long value
rather than u32 which gives the correct behavior on both aarch32 and 64.
Signed-off-by: James Clark <james.clark@arm.com>
Link: https://lore.kernel.org/r/20231211161331.1277825-11-james.clark@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
James Clark [Mon, 11 Dec 2023 16:13:21 +0000 (16:13 +0000)]
arm: pmu: Move error message and -EOPNOTSUPP to individual PMUs
-EPERM or -EINVAL always get converted to -EOPNOTSUPP, so replace them.
This will allow __hw_perf_event_init() to return a different code or not
print that particular message for a different error in the next commit.
Signed-off-by: James Clark <james.clark@arm.com>
Link: https://lore.kernel.org/r/20231211161331.1277825-10-james.clark@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
James Clark [Mon, 11 Dec 2023 16:13:20 +0000 (16:13 +0000)]
KVM: selftests: aarch64: Update tools copy of arm_pmuv3.h
Now that ARMV8_PMU_PMCR_N is made with GENMASK, update usages to treat
it as a pre-shifted mask.
Signed-off-by: James Clark <james.clark@arm.com>
Link: https://lore.kernel.org/r/20231211161331.1277825-9-james.clark@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
James Clark [Mon, 11 Dec 2023 16:13:19 +0000 (16:13 +0000)]
perf/arm_dmc620: Remove duplicate format attribute #defines
These were copied from the SPE driver, but now they're in the arm_pmu.h
header so delete them and use the header instead.
Signed-off-by: James Clark <james.clark@arm.com>
Link: https://lore.kernel.org/r/20231211161331.1277825-8-james.clark@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
James Clark [Mon, 11 Dec 2023 16:13:18 +0000 (16:13 +0000)]
arm: pmu: Share user ABI format mechanism with SPE
This mechanism makes it much easier to define and read new attributes
so move it to the arm_pmu.h header so that it can be shared. At the same
time update the existing format attributes to use it.
GENMASK has to be changed to GENMASK_ULL because the config fields are
64 bits even on arm32 where this will also be used now.
Signed-off-by: James Clark <james.clark@arm.com>
Link: https://lore.kernel.org/r/20231211161331.1277825-7-james.clark@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
James Clark [Mon, 11 Dec 2023 16:13:17 +0000 (16:13 +0000)]
arm64: perf: Include threshold control fields in PMEVTYPER mask
FEAT_PMUv3_TH (Armv8.8) adds two new fields to PMEVTYPER, so include
them in the mask. These aren't writable on 32 bit kernels as they are in
the high part of the register, so only include them for arm64.
It would be difficult to do this statically in the asm header files for
each platform without resulting in circular includes or #ifdefs inline
in the code. For that reason the ARMV8_PMU_EVTYPE_MASK definition has
been removed and the mask is constructed programmatically.
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: James Clark <james.clark@arm.com>
Link: https://lore.kernel.org/r/20231211161331.1277825-6-james.clark@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
James Clark [Mon, 11 Dec 2023 16:13:16 +0000 (16:13 +0000)]
arm: perf: Convert remaining fields to use GENMASK
Convert the remaining fields to use either GENMASK or be built from
other fields. These all already started at bit 0 so don't need a code
change for the lack of _SHIFT.
Signed-off-by: James Clark <james.clark@arm.com>
Link: https://lore.kernel.org/r/20231211161331.1277825-5-james.clark@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
James Clark [Mon, 11 Dec 2023 16:13:15 +0000 (16:13 +0000)]
arm: perf: Use GENMASK for PMMIR fields
This is so that FIELD_GET and FIELD_PREP can be used and that the fields
are in a consistent format to arm64/tools/sysreg
Signed-off-by: James Clark <james.clark@arm.com>
Link: https://lore.kernel.org/r/20231211161331.1277825-4-james.clark@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
James Clark [Mon, 11 Dec 2023 16:13:14 +0000 (16:13 +0000)]
arm: perf/kvm: Use GENMASK for ARMV8_PMU_PMCR_N
This is so that FIELD_GET and FIELD_PREP can be used and that the fields
are in a consistent format to arm64/tools/sysreg
Signed-off-by: James Clark <james.clark@arm.com>
Link: https://lore.kernel.org/r/20231211161331.1277825-3-james.clark@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
James Clark [Mon, 11 Dec 2023 16:13:13 +0000 (16:13 +0000)]
arm: perf: Remove inlines from arm_pmuv3.c
These are all static and in one compilation unit so the inline has no
effect on the binary. Except if FTRACE is enabled, then 3 functions
which were already not inlined now get the nops added which allows them
to be traced.
Signed-off-by: James Clark <james.clark@arm.com>
Link: https://lore.kernel.org/r/20231211161331.1277825-2-james.clark@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Will Deacon [Tue, 12 Dec 2023 09:26:38 +0000 (09:26 +0000)]
drivers/perf: arm_dsu_pmu: Remove kerneldoc-style comment syntax
For some reason, the Arm DSU PMU driver uses kerneldoc-style comment
syntax (i.e. /** ) for non-kerneldoc comments. This makes the robots
very angry indeed, so just revert these to normal comments to stop
the noise.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202312092000.8ltwotjt-lkp@intel.com/
Signed-off-by: Will Deacon <will@kernel.org>
Christophe JAILLET [Sun, 10 Dec 2023 17:48:18 +0000 (18:48 +0100)]
drivers/perf: Remove usage of the deprecated ida_simple_xx() API
ida_alloc() and ida_free() should be preferred to the deprecated
ida_simple_get() and ida_simple_remove().
This is less verbose.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/85b0b73a1b2f743dd5db15d4765c7685100de27f.1702230488.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Will Deacon <will@kernel.org>
Zenghui Yu [Tue, 5 Dec 2023 16:01:40 +0000 (00:01 +0800)]
arm64: Delete the zero_za macro
zero_za was introduced in commit
ca8a4ebcff44 ("arm64/sme: Manually encode
SME instructions") but doesn't appear to have any in kernel user. Drop it.
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Acked-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20231205160140.1438-1-yuzenghui@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Ard Biesheuvel [Mon, 27 Nov 2023 12:00:53 +0000 (13:00 +0100)]
arm64: Kconfig: drop KAISER reference from KPTI option description
KAISER is a reference to the KASLR hardening technique that already
existed before Meltdown happened, and by now, it is sufficiently obscure
that mentioning it does not actually clarify anything. So remove this
reference, and replace it with KPTI.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20231127120049.2258650-8-ardb@google.com
Signed-off-by: Will Deacon <will@kernel.org>
Ard Biesheuvel [Mon, 27 Nov 2023 12:00:52 +0000 (13:00 +0100)]
arm64: mm: Only map KPTI trampoline if it is going to be used
Avoid creating the fixmap entries for the KPTI trampoline if KPTI is not
in use.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20231127120049.2258650-7-ardb@google.com
Signed-off-by: Will Deacon <will@kernel.org>
Tsung-Han Lin [Sun, 3 Dec 2023 01:18:04 +0000 (09:18 +0800)]
Documentation/arch/arm64: Fix typo
Should be 'if' here.
Signed-off-by: Tsung-Han Lin <tsunghan.tw@gmail.com>
Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
Link: https://lore.kernel.org/r/20231203011804.27694-1-tsunghan.tw@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
Huang Shijie [Fri, 24 Nov 2023 03:15:13 +0000 (11:15 +0800)]
arm64: irq: set the correct node for VMAP stack
In current code, init_irq_stacks() will call cpu_to_node().
The cpu_to_node() depends on percpu "numa_node" which is initialized in:
arch_call_rest_init() --> rest_init() -- kernel_init()
--> kernel_init_freeable() --> smp_prepare_cpus()
But init_irq_stacks() is called in init_IRQ() which is before
arch_call_rest_init().
So in init_irq_stacks(), the cpu_to_node() does not work, it
always return 0. In NUMA, it makes the node 1 cpu accesses the IRQ stack which
is in the node 0.
This patch fixes it by:
1.) export the early_cpu_to_node(), and use it in the init_irq_stacks().
2.) change init_irq_stacks() to __init function.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Huang Shijie <shijie@os.amperecomputing.com>
Link: https://lore.kernel.org/r/20231124031513.81548-1-shijie@os.amperecomputing.com
Signed-off-by: Will Deacon <will@kernel.org>
Masahiro Yamada [Sun, 26 Nov 2023 15:10:45 +0000 (00:10 +0900)]
arm64: replace <asm-generic/export.h> with <linux/export.h>
Commit
ddb5cdbafaaa ("kbuild: generate KSYMTAB entries by modpost")
deprecated <asm-generic/export.h>, which is now a wrapper of
<linux/export.h>.
Replace #include <asm-generic/export.h> with #include <linux/export.h>.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Link: https://lore.kernel.org/r/20231126151045.1556686-1-masahiroy@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
Xu Yang [Mon, 20 Nov 2023 09:33:16 +0000 (17:33 +0800)]
perf: fsl_imx8_ddr: Add driver support for i.MX8DXL DDR Perf
Add driver support for i.MX8DXL DDR Perf, which supports AXI ID PORT
CHANNEL filter.
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://lore.kernel.org/r/20231120093317.2652866-4-xu.yang_2@nxp.com
Signed-off-by: Will Deacon <will@kernel.org>
Xu Yang [Mon, 20 Nov 2023 09:33:15 +0000 (17:33 +0800)]
dt-bindings: perf: fsl-imx-ddr: Add i.MX8DXL compatible
Add a compatible for i.MX8DXL which is compatile with "fsl,imx8-ddr-pmu".
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Link: https://lore.kernel.org/r/20231120093317.2652866-3-xu.yang_2@nxp.com
Signed-off-by: Will Deacon <will@kernel.org>
Xu Yang [Mon, 20 Nov 2023 09:33:14 +0000 (17:33 +0800)]
docs/perf: Add explanation for DDR_CAP_AXI_ID_PORT_CHANNEL_FILTER quirk
Add explanation for DDR_CAP_AXI_ID_PORT_CHANNEL_FILTER quirk.
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Link: https://lore.kernel.org/r/20231120093317.2652866-2-xu.yang_2@nxp.com
Signed-off-by: Will Deacon <will@kernel.org>
Xu Yang [Mon, 20 Nov 2023 09:33:13 +0000 (17:33 +0800)]
perf: fsl_imx8_ddr: Add AXI ID PORT CHANNEL filter support
This is the extension of AXI ID filter.
Filter is defined with 2 configuration registers per counter 1-3 (counter
0 is not used for filtering and lacks these registers).
* Counter N MASK COMP register - AXI_ID and AXI_MASKING.
* Counter N MUX CNTL register - AXI CHANNEL and AXI PORT.
-- 0: address channel
-- 1: data channel
This filter is exposed to userspace as an additional (channel, port) pair.
The definition of axi_channel is inverted in userspace, and it will be
reverted in driver automatically.
AXI filter of Perf Monitor in DDR Subsystem, only a single port0 exist, so
axi_port is reserved which should be 0.
e.g.
perf stat -a -e imx8_ddr0/axid-read,axi_mask=0xMMMM,axi_id=0xDDDD,axi_channel=0xH/ cmd
perf stat -a -e imx8_ddr0/axid-write,axi_mask=0xMMMM,axi_id=0xDDDD,axi_channel=0xH/ cmd
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://lore.kernel.org/r/20231120093317.2652866-1-xu.yang_2@nxp.com
Signed-off-by: Will Deacon <will@kernel.org>
Anshuman Khandual [Wed, 15 Nov 2023 09:28:05 +0000 (14:58 +0530)]
drivers: perf: arm_pmu: Drop 'pmu_lock' element from 'struct pmu_hw_events'
As 'pmu_lock' element is not being used in any ARM PMU implementation, just
drop this from 'struct pmu_hw_events'.
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20231115092805.737822-3-anshuman.khandual@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Anshuman Khandual [Wed, 15 Nov 2023 09:28:04 +0000 (14:58 +0530)]
arm: perf: Remove PMU locking
Currently the 32-bit arm PMU drivers use the pmu_hw_events::lock spinlock in
their arm_pmu::{start,stop,enable,disable}() callbacks to protect hardware
state and event data.
This locking is not necessary as the perf core code already provides mutual
exclusion, disabling interrupts to serialize against the IRQ handler, and
using perf_event_context::lock to protect against concurrent modifications of
events cross-cpu.
The locking was removed from the arm64 (now PMUv3) PMU driver in commit:
2a0e2a02e4b7 ("arm64: perf: Remove PMU locking")
... and the same reasoning applies to all the 32-bit PMU drivers.
Remove the locking from the 32-bit PMU drivers.
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: linux-perf-users@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20231115092805.737822-2-anshuman.khandual@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Junhao He [Mon, 4 Dec 2023 11:04:25 +0000 (19:04 +0800)]
drivers/perf: hisi: Fix some event id for HiSilicon UC pmu
Some event id of HiSilicon uncore UC PMU driver is incorrect, fix them.
Fixes: 312eca95e28d ("drivers/perf: hisi: Add support for HiSilicon UC PMU driver")
Signed-off-by: Junhao He <hejunhao3@huawei.com>
Reviewed-by: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20231204110425.20354-1-hejunhao3@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Mark Rutland [Mon, 4 Dec 2023 11:58:47 +0000 (11:58 +0000)]
drivers/perf: pmuv3: don't expose SW_INCR event in sysfs
The SW_INCR event is somewhat unusual, and depends on the specific HW
counter that it is programmed into. When programmed into PMEVCNTR<n>,
SW_INCR will count any writes to PMSWINC_EL0 with bit n set, ignoring
writes to SW_INCR with bit n clear.
Event rotation means that there's no fixed relationship between
perf_events and HW counters, so this isn't all that useful.
Further, we program PMUSERENR.{SW,EN}=={0,0}, which causes EL0 writes to
PMSWINC_EL0 to be trapped and handled as UNDEFINED, resulting in a
SIGILL to userspace.
Given that, it's not a good idea to expose SW_INCR in sysfs. Hide it as
we did for CHAIN back in commit:
4ba2578fa7b55701 ("arm64: perf: don't expose CHAIN event in sysfs")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20231204115847.2993026-1-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Anshuman Khandual [Tue, 14 Nov 2023 06:16:56 +0000 (11:46 +0530)]
drivers: perf: arm_pmuv3: Add new macro PMUV3_INIT_MAP_EVENT()
This further compacts all remaining PMU init procedures requiring specific
map_event functions via a new macro PMUV3_INIT_MAP_EVENT(). While here, it
also changes generated init function names to match to those generated via
the other macro PMUV3_INIT_SIMPLE(). This does not cause functional change.
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20231114061656.337231-1-anshuman.khandual@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Robin Murphy [Thu, 23 Nov 2023 11:58:13 +0000 (11:58 +0000)]
perf/arm-cmn: Fix HN-F class_occup_id events
A subtle copy-paste error managed to slip through the reorganisation
of these patches in development, and not only give some HN-F events
the wrong type, but use that wrong type before the subsequent patch
defined it. Too late to fix history, but we can at least fix the bug.
Fixes: b1b7dc38e482 ("perf/arm-cmn: Refactor HN-F event selector macros")
Reported-by: Jing Zhang <renyu.zj@linux.alibaba.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/5a22439de84ff188ef76674798052448eb03a3e1.1700740693.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Marc Zyngier [Wed, 22 Nov 2023 13:37:54 +0000 (13:37 +0000)]
arm64: Get rid of ARM64_HAS_NO_HW_PREFETCH
Back in 2016, it was argued that implementations lacking a HW
prefetcher could be helped by sprinkling a number of PRFM
instructions in strategic locations.
In 2023, the one platform that presumably needed this hack is no
longer in active use (let alone maintained), and an quick
experiment shows dropping this hack only leads to a 0.4% drop
on a full kernel compilation (tested on a MT30-GS0 48 CPU system).
Given that this is pretty much in the noise department and that
it may give odd ideas to other implementers, drop the hack for
good.
Suggested-by: Will Deacon <will@kernel.org>
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20231122133754.1240687-1-maz@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
Masahiro Yamada [Fri, 17 Nov 2023 12:56:19 +0000 (21:56 +0900)]
arm64: vdso32: rename 32-bit debug vdso to vdso32.so.dbg
'make vdso_install' renames arch/arm64/kernel/vdso32/vdso.so.dbg to
vdso32.so during installation, which allows 64-bit and 32-bit vdso
files to be installed in the same directory.
However, arm64 is the only architecture that requires this renaming.
To simplify the vdso_install logic, rename the in-tree vdso file so
its base name matches the installed file name.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Link: https://lore.kernel.org/r/20231117125620.1058300-1-masahiroy@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
Ryan Roberts [Mon, 27 Nov 2023 11:17:30 +0000 (11:17 +0000)]
arm64: Add ARM64_HAS_LPA2 CPU capability
Expose FEAT_LPA2 as a capability so that we can take advantage of
alternatives patching in the hypervisor.
Although FEAT_LPA2 presence is advertised separately for stage1 and
stage2, the expectation is that in practice both stages will either
support or not support it. Therefore, we combine both into a single
capability, allowing us to simplify the implementation. KVM requires
support in both stages in order to use LPA2 since the same library is
used for hyp stage 1 and guest stage 2 pgtables.
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231127111737.1897081-6-ryan.roberts@arm.com
Anshuman Khandual [Mon, 27 Nov 2023 11:17:29 +0000 (11:17 +0000)]
arm64/mm: Add FEAT_LPA2 specific ID_AA64MMFR0.TGRAN[2]
PAGE_SIZE support is tested against possible minimum and maximum values for
its respective ID_AA64MMFR0.TGRAN field, depending on whether it is signed
or unsigned. But then FEAT_LPA2 implementation needs to be validated for 4K
and 16K page sizes via feature specific ID_AA64MMFR0.TGRAN values. Hence it
adds FEAT_LPA2 specific ID_AA64MMFR0.TGRAN[2] values per ARM ARM (0487G.A).
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231127111737.1897081-5-ryan.roberts@arm.com
Ryan Roberts [Mon, 27 Nov 2023 11:17:28 +0000 (11:17 +0000)]
arm64/mm: Update tlb invalidation routines for FEAT_LPA2
FEAT_LPA2 impacts tlb invalidation in 2 ways; Firstly, the TTL field in
the non-range tlbi instructions can now validly take a 0 value as a
level hint for the 4KB granule (this is due to the extra level of
translation) - previously TTL=0b0100 meant no hint and was treated as
0b0000. Secondly, The BADDR field of the range-based tlbi instructions
is specified in 64KB units when LPA2 is in use (TCR.DS=1), whereas it is
in page units otherwise. Changes are required for tlbi to continue to
operate correctly when LPA2 is in use.
Solve the first problem by always adding the level hint if the level is
between [0, 3] (previously anything other than 0 was hinted, which
breaks in the new level -1 case from kvm). When running on non-LPA2 HW,
0 is still safe to hint as the HW will fall back to non-hinted. While we
are at it, we replace the notion of 0 being the non-hinted sentinel with
a macro, TLBI_TTL_UNKNOWN. This means callers won't need updating
if/when translation depth increases in future.
The second issue is more complex: When LPA2 is in use, use the non-range
tlbi instructions to forward align to a 64KB boundary first, then use
range-based tlbi from there on, until we have either invalidated all
pages or we have a single page remaining. If the latter, that is done
with non-range tlbi. We determine whether LPA2 is in use based on
lpa2_is_enabled() (for kernel calls) or kvm_lpa2_is_enabled() (for kvm
calls).
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231127111737.1897081-4-ryan.roberts@arm.com
Ryan Roberts [Mon, 27 Nov 2023 11:17:27 +0000 (11:17 +0000)]
arm64/mm: Add lpa2_is_enabled() kvm_lpa2_is_enabled() stubs
Add stub functions which is initially always return false. These provide
the hooks that we need to update the range-based TLBI routines, whose
operands are encoded differently depending on whether lpa2 is enabled or
not.
The kernel and kvm will enable the use of lpa2 asynchronously in future,
and part of that enablement will involve fleshing out their respective
hook to advertise when it is using lpa2.
Since the kernel's decision to use lpa2 relies on more than just whether
the HW supports the feature, it can't just use the same static key as
kvm. This is another reason to use separate functions. lpa2_is_enabled()
is already implemented as part of Ard's kernel lpa2 series. Since kvm
will make its decision solely based on HW support, kvm_lpa2_is_enabled()
will be defined as system_supports_lpa2() once kvm starts using lpa2.
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231127111737.1897081-3-ryan.roberts@arm.com
Ryan Roberts [Mon, 27 Nov 2023 11:17:26 +0000 (11:17 +0000)]
arm64/mm: Modify range-based tlbi to decrement scale
In preparation for adding support for LPA2 to the tlb invalidation
routines, modify the algorithm used by range-based tlbi to start at the
highest 'scale' and decrement instead of starting at the lowest 'scale'
and incrementing. This new approach makes it possible to maintain 64K
alignment as we work through the range, until the last op (at scale=0).
This is required when LPA2 is enabled. (This part will be added in a
subsequent commit).
This change is separated into its own patch because it will also impact
non-LPA2 systems, and I want to make it easy to bisect in case it leads
to performance regression (see below for benchmarks that suggest this
should not be a problem).
The original commit (
d1d3aa98 "arm64: tlb: Use the TLBI RANGE feature in
arm64") stated this as the reason for _incrementing_ scale:
However, in most scenarios, the pages = 1 when flush_tlb_range() is
called. Start from scale = 3 or other proper value (such as scale
=ilog2(pages)), will incur extra overhead. So increase 'scale' from 0
to maximum.
But pages=1 is already special cased by the non-range invalidation path,
which will take care of it the first time through the loop (both in the
original commit and in my change), so I don't think switching to
decrement scale should have any extra performance impact after all.
Indeed benchmarking kernel compilation, a TLBI-heavy workload, suggests
that this new approach actually _improves_ performance slightly (using a
virtual machine on Apple M2):
Table shows time to execute kernel compilation workload with 8 jobs,
relative to baseline without this patch (more negative number is
bigger speedup). Repeated 9 times across 3 system reboots:
| counter | mean | stdev |
|:----------|-----------:|----------:|
| real-time | -0.6% | 0.0% |
| kern-time | -1.6% | 0.5% |
| user-time | -0.4% | 0.1% |
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231127111737.1897081-2-ryan.roberts@arm.com
Linus Torvalds [Mon, 27 Nov 2023 03:59:33 +0000 (19:59 -0800)]
Linux 6.7-rc3
Linus Torvalds [Mon, 27 Nov 2023 03:48:20 +0000 (19:48 -0800)]
Merge tag 'trace-v6.7-rc2' of git://git./linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt::
"Eventfs fixes:
- With the usage of simple_recursive_remove() recommended by Al Viro,
the code should not be calling "d_invalidate()" itself. Doing so is
causing crashes. The code was calling d_invalidate() on the race of
trying to look up a file while the parent was being deleted. This
was detected, and the added dentry was having d_invalidate() called
on it, but the deletion of the directory was also calling
d_invalidate() on that same dentry.
- A fix to not free the eventfs_inode (ei) until the last dput() was
called on its ei->dentry made the ei->dentry exist even after it
was marked for free by setting the ei->is_freed. But code elsewhere
still was checking if ei->dentry was NULL if ei->is_freed is set
and would trigger WARN_ON if that was the case. That's no longer
true and there should not be any warnings when it is true.
- Use GFP_NOFS for allocations done under eventfs_mutex. The
eventfs_mutex can be taken on file system reclaim, make sure that
allocations done under that mutex do not trigger file system
reclaim.
- Clean up code by moving the taking of inode_lock out of the helper
functions and into where they are needed, and not use the parameter
to know to take it or not. It must always be held but some callers
of the helper function have it taken when they were called.
- Warn if the inode_lock is not held in the helper functions.
- Warn if eventfs_start_creating() is called without a parent. As
eventfs is underneath tracefs, all files created will have a parent
(the top one will have a tracefs parent).
Tracing update:
- Add Mathieu Desnoyers as an official reviewer of the tracing subsystem"
* tag 'trace-v6.7-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
MAINTAINERS: TRACING: Add Mathieu Desnoyers as Reviewer
eventfs: Make sure that parent->d_inode is locked in creating files/dirs
eventfs: Do not allow NULL parent to eventfs_start_creating()
eventfs: Move taking of inode_lock into dcache_dir_open_wrapper()
eventfs: Use GFP_NOFS for allocation when eventfs_mutex is held
eventfs: Do not invalidate dentry in create_file/dir_dentry()
eventfs: Remove expectation that ei->is_freed means ei->dentry == NULL
Linus Torvalds [Sun, 26 Nov 2023 17:59:39 +0000 (09:59 -0800)]
Merge tag 'parisc-for-6.7-rc3' of git://git./linux/kernel/git/deller/parisc-linux
Pull parisc architecture fixes from Helge Deller:
"This patchset fixes and enforces correct section alignments for the
ex_table, altinstructions, parisc_unwind, jump_table and bug_table
which are created by inline assembly.
Due to not being correctly aligned at link & load time they can
trigger unnecessarily the kernel unaligned exception handler at
runtime. While at it, I switched the bug table to use relative
addresses which reduces the size of the table by half on 64-bit.
We still had the ENOSYM and EREMOTERELEASE errno symbols as left-overs
from HP-UX, which now trigger build-issues with glibc. We can simply
remove them.
Most of the patches are tagged for stable kernel series.
Summary:
- Drop HP-UX ENOSYM and EREMOTERELEASE return codes to avoid glibc
build issues
- Fix section alignments for ex_table, altinstructions, parisc unwind
table, jump_table and bug_table
- Reduce size of bug_table on 64-bit kernel by using relative
pointers"
* tag 'parisc-for-6.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Reduce size of the bug_table on 64-bit kernel by half
parisc: Drop the HP-UX ENOSYM and EREMOTERELEASE error codes
parisc: Use natural CPU alignment for bug_table
parisc: Ensure 32-bit alignment on parisc unwind section
parisc: Mark lock_aligned variables 16-byte aligned on SMP
parisc: Mark jump_table naturally aligned
parisc: Mark altinstructions read-only and 32-bit aligned
parisc: Mark ex_table entries 32-bit aligned in uaccess.h
parisc: Mark ex_table entries 32-bit aligned in assembly.h
Linus Torvalds [Sun, 26 Nov 2023 16:42:42 +0000 (08:42 -0800)]
Merge tag 'x86-urgent-2023-11-26' of git://git./linux/kernel/git/tip/tip
Pull x86 microcode fixes from Ingo Molnar:
"Fix/enhance x86 microcode version reporting: fix the bootup log spam,
and remove the driver version announcement to avoid version confusion
when distros backport fixes"
* tag 'x86-urgent-2023-11-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/microcode: Rework early revisions reporting
x86/microcode: Remove the driver announcement and version
Linus Torvalds [Sun, 26 Nov 2023 16:34:12 +0000 (08:34 -0800)]
Merge tag 'perf-urgent-2023-11-26' of git://git./linux/kernel/git/tip/tip
Pull x86 perf event fix from Ingo Molnar:
"Fix a bug in the Intel hybrid CPUs hardware-capabilities enumeration
code resulting in non-working events on those platforms"
* tag 'perf-urgent-2023-11-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel: Correct incorrect 'or' operation for PMU capabilities
Linus Torvalds [Sun, 26 Nov 2023 16:30:11 +0000 (08:30 -0800)]
Merge tag 'locking-urgent-2023-11-26' of git://git./linux/kernel/git/tip/tip
Pull locking fix from Ingo Molnar:
"Fix lockdep block chain corruption resulting in KASAN warnings"
* tag 'locking-urgent-2023-11-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
lockdep: Fix block chain corruption
Linus Torvalds [Sun, 26 Nov 2023 16:22:27 +0000 (08:22 -0800)]
Merge tag '6.7-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:
- use after free fix in releasing multichannel interfaces
- fixes for special file types (report char, block, FIFOs properly when
created e.g. by NFS to Windows)
- fixes for reporting various special file types and symlinks properly
when using SMB1
* tag '6.7-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
smb: client: introduce cifs_sfu_make_node()
smb: client: set correct file type from NFS reparse points
smb: client: introduce ->parse_reparse_point()
smb: client: implement ->query_reparse_point() for SMB1
cifs: fix use after free for iface while disabling secondary channels
Linus Torvalds [Sun, 26 Nov 2023 02:22:42 +0000 (18:22 -0800)]
Merge tag 'usb-6.7-rc3' of git://git./linux/kernel/git/gregkh/usb
Pull USB / PHY / Thunderbolt fixes from Greg KH:
"Here are a number of reverts, fixes, and new device ids for 6.7-rc3
for the USB, PHY, and Thunderbolt driver subsystems. Include in here
are:
- reverts of some PHY drivers that went into 6.7-rc1 that shouldn't
have been merged yet, the author is reworking them based on review
comments as they were using older apis that shouldn't be used
anymore for newer drivers
- small thunderbolt driver fixes for reported issues
- USB driver fixes for a variety of small issues in dwc3, typec,
xhci, and other smaller drivers.
- new device ids for usb-serial and onboard_usb_hub drivers.
All of these have been in linux-next with no reported issues"
* tag 'usb-6.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (33 commits)
USB: serial: option: add Luat Air72*U series products
USB: dwc3: qcom: fix ACPI platform device leak
USB: dwc3: qcom: fix software node leak on probe errors
USB: dwc3: qcom: fix resource leaks on probe deferral
USB: dwc3: qcom: simplify wakeup interrupt setup
USB: dwc3: qcom: fix wakeup after probe deferral
dt-bindings: usb: qcom,dwc3: fix example wakeup interrupt types
usb: misc: onboard-hub: add support for Microchip USB5744
dt-bindings: usb: microchip,usb5744: Add second supply
usb: misc: ljca: Fix enumeration error on Dell Latitude 9420
USB: serial: option: add Fibocom L7xx modules
USB: xhci-plat: fix legacy PHY double init
usb: typec: tipd: Supply also I2C driver data
usb: xhci-mtk: fix in-ep's start-split check failure
usb: dwc3: set the dma max_seg_size
usb: config: fix iteration issue in 'usb_get_bos_descriptor()'
usb: dwc3: add missing of_node_put and platform_device_put
USB: dwc2: write HCINT with INTMASK applied
usb: misc: ljca: Drop _ADR support to get ljca children devices
usb: cdnsp: Fix deadlock issue during using NCM gadget
...
Linus Torvalds [Sat, 25 Nov 2023 16:57:09 +0000 (08:57 -0800)]
Merge tag 'xfs-6.7-fixes-3' of git://git./fs/xfs/xfs-linux
Pull xfs fix from Chandan Babu:
- Validate quota records recovered from the log before writing them to
the disk.
* tag 'xfs-6.7-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: dquot recovery does not validate the recovered dquot
xfs: clean up dqblk extraction
Linus Torvalds [Sat, 25 Nov 2023 16:43:46 +0000 (08:43 -0800)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- Fix "rodata=on" not disabling "rodata=full" on arm64
- Add arm64 make dependency between vmlinuz.efi and Image, leading to
occasional build failures previously (with parallel building)
- Add newline to the output formatting of the za-fork kselftest
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: add dependency between vmlinuz.efi and Image
kselftest/arm64: Fix output formatting for za-fork
arm64: mm: Fix "rodata=on" when CONFIG_RODATA_FULL_DEFAULT_ENABLED=y
Linus Torvalds [Sat, 25 Nov 2023 16:32:44 +0000 (08:32 -0800)]
Merge tag 'for-linus-6.7a-rc3-tag' of git://git./linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
- A small cleanup patch for the Xen privcmd driver
- A fix for the swiotlb-xen driver which was missing the advertising of
the maximum mapping length
- A fix for Xen on Arm for a longstanding bug, which happened to occur
only recently: a structure in percpu memory crossed a page boundary,
which was rejected by the hypervisor
* tag 'for-linus-6.7a-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
arm/xen: fix xen_vcpu_info allocation alignment
xen: privcmd: Replace zero-length array with flex-array member and use __counted_by
swiotlb-xen: provide the "max_mapping_size" method
Helge Deller [Thu, 23 Nov 2023 20:57:19 +0000 (21:57 +0100)]
parisc: Reduce size of the bug_table on 64-bit kernel by half
Enable GENERIC_BUG_RELATIVE_POINTERS which will store 32-bit relative
offsets to the bug address and the source file name instead of 64-bit
absolute addresses. This effectively reduces the size of the
bug_table[] array by half on 64-bit kernels.
Signed-off-by: Helge Deller <deller@gmx.de>
Helge Deller [Thu, 23 Nov 2023 19:28:27 +0000 (20:28 +0100)]
parisc: Drop the HP-UX ENOSYM and EREMOTERELEASE error codes
Those return codes are only defined for the parisc architecture and
are leftovers from when we wanted to be HP-UX compatible.
They are not returned by any Linux kernel syscall but do trigger
problems with the glibc strerrorname_np() and strerror() functions as
reported in glibc issue #31080.
There is no need to keep them, so simply remove them.
Signed-off-by: Helge Deller <deller@gmx.de>
Reported-by: Bruno Haible <bruno@clisp.org>
Closes: https://sourceware.org/bugzilla/show_bug.cgi?id=31080
Cc: stable@vger.kernel.org
Helge Deller [Mon, 20 Nov 2023 22:30:49 +0000 (23:30 +0100)]
parisc: Use natural CPU alignment for bug_table
Make sure that the __bug_table section gets 32- or 64-bit aligned,
depending if a 32- or 64-bit kernel is being built.
Mark it non-writeable and use .blockz instead of the .org assembler
directive to pad the struct.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # v6.0+
Helge Deller [Sat, 25 Nov 2023 08:16:02 +0000 (09:16 +0100)]
parisc: Ensure 32-bit alignment on parisc unwind section
Make sure the .PARISC.unwind section will be 32-bit aligned.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # v6.0+
Helge Deller [Sat, 25 Nov 2023 08:11:56 +0000 (09:11 +0100)]
parisc: Mark lock_aligned variables 16-byte aligned on SMP
On parisc we need 16-byte alignment for variables which are used for
locking. Mark the __lock_aligned attribute acordingly so that the
.data..lock_aligned section will get that alignment in the generated
object files.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # v6.0+
Helge Deller [Mon, 20 Nov 2023 22:14:39 +0000 (23:14 +0100)]
parisc: Mark jump_table naturally aligned
The jump_table stores two 32-bit words and one 32- (on 32-bit kernel)
or one 64-bit word (on 64-bit kernel).
Ensure that the last word is always 64-bit aligned on a 64-bit kernel
by aligning the whole structure on sizeof(long).
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # v6.0+
Helge Deller [Mon, 20 Nov 2023 22:10:20 +0000 (23:10 +0100)]
parisc: Mark altinstructions read-only and 32-bit aligned
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # v6.0+
Helge Deller [Mon, 20 Nov 2023 14:39:03 +0000 (15:39 +0100)]
parisc: Mark ex_table entries 32-bit aligned in uaccess.h
Add an align statement to tell the linker that all ex_table entries and as
such the whole ex_table section should be 32-bit aligned in vmlinux and modules.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # v6.0+
Helge Deller [Mon, 20 Nov 2023 14:37:50 +0000 (15:37 +0100)]
parisc: Mark ex_table entries 32-bit aligned in assembly.h
Add an align statement to tell the linker that all ex_table entries and as
such the whole ex_table section should be 32-bit aligned in vmlinux and modules.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # v6.0+
Linus Torvalds [Fri, 24 Nov 2023 19:44:50 +0000 (11:44 -0800)]
Merge tag 's390-6.7-3' of git://git./linux/kernel/git/s390/linux
Pull s390 updates from Alexander Gordeev:
- Remove unnecessary assignment of the performance event last_tag.
- Create missing /sys/firmware/ipl/* attributes when kernel is booted
in dump mode using List-directed ECKD IPL.
- Remove odd comment.
- Fix s390-specific part of scripts/checkstack.pl script that only
matches three-digit numbers starting with 3 or any higher number and
skips any stack sizes smaller than 304 bytes.
* tag 's390-6.7-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
scripts/checkstack.pl: match all stack sizes for s390
s390: remove odd comment
s390/ipl: add missing IPL_TYPE_ECKD_DUMP case to ipl_init()
s390/pai: cleanup event initialization
Linus Torvalds [Fri, 24 Nov 2023 19:30:35 +0000 (11:30 -0800)]
Merge tag 'acpi-6.7-rc3' of git://git./linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
"These add an ACPI IRQ override quirk for ASUS ExpertBook B1402CVA and
fix an ACPI processor idle issue leading to triple-faults in Xen HVM
guests and an ACPI backlight driver issue that causes GPUs to
misbehave while their children power is being fixed up.
Specifics:
- Avoid powering up GPUs while attempting to fix up power for their
children (Hans de Goede)
- Use raw_safe_halt() instead of safe_halt() in acpi_idle_play_dead()
so as to avoid triple-falts during CPU online in Xen HVM guests due
to the setting of the hardirqs_enabled flag in safe_halt() (David
Woodhouse)
- Add an ACPI IRQ override quirk for ASUS ExpertBook B1402CVA (Hans
de Goede)"
* tag 'acpi-6.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: resource: Skip IRQ override on ASUS ExpertBook B1402CVA
ACPI: video: Use acpi_device_fix_up_power_children()
ACPI: PM: Add acpi_device_fix_up_power_children() function
ACPI: processor_idle: use raw_safe_halt() in acpi_idle_play_dead()
Linus Torvalds [Fri, 24 Nov 2023 19:26:00 +0000 (11:26 -0800)]
Merge tag 'pm-6.7-rc3' of git://git./linux/kernel/git/rafael/linux-pm
Pull power management fix from Rafael Wysocki:
"Fix a syntax error in the sleepgraph utility which causes it to exit
early on every invocation (David Woodhouse)"
* tag 'pm-6.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM: tools: Fix sleepgraph syntax error
Linus Torvalds [Fri, 24 Nov 2023 18:40:03 +0000 (10:40 -0800)]
Merge tag 'afs-fixes-
20231124' of git://git./linux/kernel/git/dhowells/linux-fs
Pull AFS fixes from David Howells:
- Fix the afs_server_list struct to be cleaned up with RCU
- Fix afs to translate a no-data result from a DNS lookup into ENOENT,
not EDESTADDRREQ for consistency with OpenAFS
- Fix afs to translate a negative DNS lookup result into ENOENT rather
than EDESTADDRREQ
- Fix file locking on R/O volumes to operate in local mode as the
server doesn't handle exclusive locks on such files
- Set SB_RDONLY on superblocks for RO and Backup volumes so that the
VFS can see that they're read only
* tag 'afs-fixes-
20231124' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
afs: Mark a superblock for an R/O or Backup volume as SB_RDONLY
afs: Fix file locking on R/O volumes to operate in local mode
afs: Return ENOENT if no cell DNS record can be found
afs: Make error on cell lookup failure consistent with OpenAFS
afs: Fix afs_server_list to be cleaned up with RCU
Rafael J. Wysocki [Fri, 24 Nov 2023 18:16:22 +0000 (19:16 +0100)]
Merge branches 'acpi-video' and 'acpi-processor' into acpi
Merge ACPI backlight driver fixes and an ACPI processor driver fix for
6.7-rc3:
- Avoid powering up GPUs while attempting to fix up power for their
children (Hans de Goede).
- Use raw_safe_halt() instead of safe_halt() in acpi_idle_play_dead()
so as to avoid triple-falts during CPU online in Xen HVM guests due
to the setting of the hardirqs_enabled flag in safe_halt() (David
Woodhouse).
* acpi-video:
ACPI: video: Use acpi_device_fix_up_power_children()
ACPI: PM: Add acpi_device_fix_up_power_children() function
* acpi-processor:
ACPI: processor_idle: use raw_safe_halt() in acpi_idle_play_dead()
Linus Torvalds [Fri, 24 Nov 2023 17:45:40 +0000 (09:45 -0800)]
Merge tag 'vfs-6.7-rc3.fixes' of git://git./linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:
- Avoid calling back into LSMs from vfs_getattr_nosec() calls.
IMA used to query inode properties accessing raw inode fields without
dedicated helpers. That was finally fixed a few releases ago by
forcing IMA to use vfs_getattr_nosec() helpers.
The goal of the vfs_getattr_nosec() helper is to query for attributes
without calling into the LSM layer which would be quite problematic
because incredibly IMA is called from __fput()...
__fput()
-> ima_file_free()
What it does is to call back into the filesystem to update the file's
IMA xattr. Querying the inode without using vfs_getattr_nosec() meant
that IMA didn't handle stacking filesystems such as overlayfs
correctly. So the switch to vfs_getattr_nosec() is quite correct. But
the switch to vfs_getattr_nosec() revealed another bug when used on
stacking filesystems:
__fput()
-> ima_file_free()
-> vfs_getattr_nosec()
-> i_op->getattr::ovl_getattr()
-> vfs_getattr()
-> i_op->getattr::$WHATEVER_UNDERLYING_FS_getattr()
-> security_inode_getattr() # calls back into LSMs
Now, if that __fput() happens from task_work_run() of an exiting task
current->fs and various other pointer could already be NULL. So
anything in the LSM layer relying on that not being NULL would be
quite surprised.
Fix that by passing the information that this is a security request
through to the stacking filesystem by adding a new internal
ATT_GETATTR_NOSEC flag. Now the callchain becomes:
__fput()
-> ima_file_free()
-> vfs_getattr_nosec()
-> i_op->getattr::ovl_getattr()
-> if (AT_GETATTR_NOSEC)
vfs_getattr_nosec()
else
vfs_getattr()
-> i_op->getattr::$WHATEVER_UNDERLYING_FS_getattr()
- Fix a bug introduced with the iov_iter rework from last cycle.
This broke /proc/kcore by copying too much and without the correct
offset.
- Add a missing NULL check when allocating the root inode in
autofs_fill_super().
- Fix stable writes for multi-device filesystems (xfs, btrfs etc) and
the block device pseudo filesystem.
Stable writes used to be a superblock flag only, making it a per
filesystem property. Add an additional AS_STABLE_WRITES mapping flag
to allow for fine-grained control.
- Ensure that offset_iterate_dir() returns 0 after reaching the end of
a directory so it adheres to getdents() convention.
* tag 'vfs-6.7-rc3.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
libfs: getdents() should return 0 after reaching EOD
xfs: respect the stable writes flag on the RT device
xfs: clean up FS_XFLAG_REALTIME handling in xfs_ioctl_setattr_xflags
block: update the stable_writes flag in bdev_add
filemap: add a per-mapping stable writes flag
autofs: add: new_inode check in autofs_fill_super()
iov_iter: fix copy_page_to_iter_nofault()
fs: Pass AT_GETATTR_NOSEC flag to getattr interface function
Linus Torvalds [Fri, 24 Nov 2023 17:36:33 +0000 (09:36 -0800)]
Merge tag 'drm-fixes-2023-11-24' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Back to regular scheduled fixes pull request, mainly a bunch of msm,
some i915 and otherwise a few scattered, one memory crasher in the
nouveau GSP paths is helping stabilise that work.
msm:
- Fix the VREG_CTRL_1 for 4nm CPHY to match downstream
- Remove duplicate call to drm_kms_helper_poll_init() in
msm_drm_init()
- Fix the safe_lut_tbl[] for sc8280xp to match downstream
- Don't attach the drm_dp_set_subconnector_property() for eDP
- Fix to attach drm_dp_set_subconnector_property() for DP. Otherwise
there is a bootup crash on multiple targets
- Remove unnecessary NULL check left behind during cleanup
i915:
- Fix race between DP MST connectore registration and setup
- Fix GT memory leak on probe error path
panel:
- Fixes for innolux and auo,b101uan08.3 panel.
- Fix Himax83102-j02 timings.
ivpu:
- Fix ivpu MMIO reset.
ast:
- AST fix on connetor disconnection.
nouveau:
- gsp memory corruption fix
rockchip:
- color fix"
* tag 'drm-fixes-2023-11-24' of git://anongit.freedesktop.org/drm/drm:
nouveau/gsp: allocate enough space for all channel ids.
drm/panel: boe-tv101wum-nl6: Fine tune Himax83102-j02 panel HFP and HBP
drm/ast: Disconnect BMC if physical connector is connected
accel/ivpu/37xx: Fix hangs related to MMIO reset
drm/rockchip: vop: Fix color for RGB888/BGR888 format on VOP full
drm/i915: do not clean GT table on error path
drm/i915/dp_mst: Fix race between connector registration and setup
drm/panel: simple: Fix Innolux G101ICE-L01 timings
drm/panel: simple: Fix Innolux G101ICE-L01 bus flags
drm/msm: remove unnecessary NULL check
drm/panel: auo,b101uan08.3: Fine tune the panel power sequence
drm/msm/dp: attach the DP subconnector property
drm/msm/dp: don't touch DP subconnector property in eDP case
drm/msm/dpu: Add missing safe_lut_tbl in sc8280xp catalog
drm/msm: remove exra drm_kms_helper_poll_init() call
drm/msm/dsi: use the correct VREG_CTRL_1 value for 4nm cphy
Greg Kroah-Hartman [Fri, 24 Nov 2023 16:30:38 +0000 (16:30 +0000)]
Merge tag 'usb-serial-6.7-rc3' of https://git./linux/kernel/git/johan/usb-serial into usb-linus
Johan writes:
USB-serial fixes for 6.7-rc3
Here are a couple of modem device entry fixes and some new modem device
ids.
All have been in linux-next with no reported issues.
* tag 'usb-serial-6.7-rc3' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial: (329 commits)
USB: serial: option: add Luat Air72*U series products
USB: serial: option: add Fibocom L7xx modules
USB: serial: option: fix FM101R-GL defines
USB: serial: option: don't claim interface 4 for ZTE MF290
Linux 6.7-rc2
prctl: Disable prctl(PR_SET_MDWE) on parisc
parisc/power: Fix power soft-off when running on qemu
parisc: Replace strlcpy() with strscpy()
NFSD: Fix checksum mismatches in the duplicate reply cache
NFSD: Fix "start of NFS reply" pointer passed to nfsd_cache_update()
NFSD: Update nfsd_cache_append() to use xdr_stream
nfsd: fix file memleak on client_opens_release
dm-crypt: start allocating with MAX_ORDER
dm-verity: don't use blocking calls from tasklets
dm-bufio: fix no-sleep mode
dm-delay: avoid duplicate logic
dm-delay: fix bugs introduced by kthread mode
dm-delay: fix a race between delay_presuspend and delay_bio
drm/amdgpu/gmc9: disable AGP aperture
drm/amdgpu/gmc10: disable AGP aperture
...
David Howells [Thu, 2 Nov 2023 16:24:00 +0000 (16:24 +0000)]
afs: Mark a superblock for an R/O or Backup volume as SB_RDONLY
Mark a superblock that is for for an R/O or Backup volume as SB_RDONLY when
mounting it.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
David Howells [Wed, 1 Nov 2023 22:03:28 +0000 (22:03 +0000)]
afs: Fix file locking on R/O volumes to operate in local mode
AFS doesn't really do locking on R/O volumes as fileservers don't maintain
state with each other and thus a lock on a R/O volume file on one
fileserver will not be be visible to someone looking at the same file on
another fileserver.
Further, the server may return an error if you try it.
Fix this by doing what other AFS clients do and handle filelocking on R/O
volume files entirely within the client and don't touch the server.
Fixes: 6c6c1d63c243 ("afs: Provide mount-time configurable byte-range file locking emulation")
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org